...
Right now, we'll be using it to call variants (find mutations) in the re-sequenced E. coli genome from the Mapping tutorial. You will need the output SAM files from that tutorial to continue here.
Warning | ||
---|---|---|
If you do not have the output from the Mapping tutorial, run these commands to copy over the output that would have been produced. Then, you can immediately start this tutorial!
|
We assume that you are still working in the main directory of called intro_to_mapping
data that you copied to created on $SCRATCH
.
Load SAMtools
Load the SAMtools module (if not already loaded).
...
Code Block |
---|
mkdir comparison cp samtools_bowtie/SRR030257.vcf comparison/bowtie.vcf cp samtools_bwa/SRR030257.vcf comparison/bwa.vcf cp samtools_bowtie2/SRR030257.vcf comparison/bowtie2.vcf cd comparison |
Use the subcommands bedtools intersect
and bedtools subtract
we can find equal and different predictions between mappers. Try to figure out how to to do this on your own first. Hint: Remember that adding > output.vcf
to the end of a command will pipe the output that is to the terminal into a file, so that you can save it.
...
- Which mapper finds more variants?
- Can you figure out how to filter the VCF files on various criteria, like coverage, quality, ... ?
- How many high quality mutations are there in these E. coli samples relative to the reference genome?
...
From here...
Look at how the reads supporting these variants were aligned to the reference genome in the Integrative Genomics Viewer (IGV) tutorial.