Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

SAMtools is a suite of commands for dealing with databases of mapped reads. You'll be using it quite a bit throughout the course.

Calling variants in reads mapped by bowtie

Right now, we'll be using it to call variants (find mutations) in the re-sequenced E. coli genome from the Introduction to mapping (bowtie, BWA). You will need the output SAM files from that tutorial to continue here. We assume that you are still in the main directory of introduction_to_mapping data that you copied to $SCRATCH.

...

Take a look at this file. It has a nice header explaining what the columns mean.

Exercise 1

VCF format has Allele Frequency tags denoted by AF1. Try the following command to see what values we have in our files.

...

Expand
Hint...
Hint...

What does the -v flag do in grep?

Code Block
grep -v *something*

Calling variants in reads mapped by BWA

Follow the same directions to call variants in the BWA-mapped reads.

...

You could also try running all of the commands from inside of the samtools_bwa directory, just for a change of pace.

Comparing the results of different mappers

Set up a new output directory and copy the respective VCF files to it.

...

Code Block
subtractBed -a bowtie.vcf -b common_bowtie_bwa.vcf > unique_bowtie.vcf
subtractBed -a bwa.vcf -b common_bowtie_bwa.vcf > unique_bwa.vcf

Exercises

  • Which mapper finds more variants?
  • Can you figure out how to filter the VCF files on various criteria, like coverage, quality, ... ?
  • How many high quality mutations are there in these E. coli samples relative to the reference genome?

Next up

We will get a chance to look at how the reads supporting these variants were aligned to the reference genome by Using the Integrative Genomics Viewer (IGV)

Old stuff

Output - Samtools

Output - Determining Differences Between Mappers

...