Setup output directory.
mkdir samtools_bowtie |
If you do not have an alignment file in the SAM format you may want to start with Introduction to mapping.
cp bowtie/REL606.5.sam samtools_bowtie/ cp bowtie/REL606.5.fasta samtools_bowtie/ |
Index the reference file.
samtools faidx samtools_bowtie/REL606.5.fasta |
Convert from SAM to BAM format.
samtools view -bS -o samtools_bowtie/REL606.5.bam bowtie/REL606.5.sam |borderStyle=solid} |
Sort the BAM file.
samtools sort samtools_bowtie/REL606.5.bam samtools_bowtie/sorted_REL606.5 |
Output VCF file.
samtools mpileup -uf samtools_bowtie/REL606.5.fasta samtools_bowtie/sorted_REL606.5.bam \|bcftools view -vcg - \> samtools_bowtie/output.vcf |
VCF format has Allele Frequency tags denoted by AF1. Try the following command to see what values we have in our files.
cat input.vcf | grep AF1 |
For the data we are dealing with, predictions with an allele frequency not equal to 1 are not really applicable. How can we remove these lines from the file and continue on?
What does the -v flag do in grep?
|
Filtering Allele Frequencies Output
Setup output directory and then change into it.
mkdir output cp samtools_bowtie/output.vcf output/bowtie.vcf cp samtools_bwa/output.vcf output/bwa.vcf cd output |
Bedtools is a suite of utility programs that work on a variety of file formats, one of which is conveniently VCF format. Using intersectBed and subtractBed we can find equal and different predictions between mappers.
Load Bedtools.
module load bedtools |
Finding alike mutations.
intersectBed -a bowtie.vcf -b bwa.vcf > intersect.vcf |
Finding unique mutations for each mapper.
subtractBed -a bowtie.vcf -b intersect.vcf > unique_bowtie.vcf subtractBed -a bwa.vcf -b intersect.vcf > unique_bwa.vcf |