You are viewing an old version of this page. View the current version.

Compare with Current View Page History

« Previous Version 8 Next »

Now that we have a bam file with only the reads we want included, we can do some more sophisticated analysis using bedtools.  Bedtools changes from version to version, and here we are using version 2.22, the newest version, and what is currently installed on stampede.

Converting a bam file to a fastq file

Sometimes, especially when working with external data, we need to go from a bam file back to a fastq file.  This can be useful for re-aligning reads using a different aligner, different settings on the original aligner used.  It can also be useful for extracting the sequence of interesting regions of the genome after you have manipulated your bam file.

For this exercise, you'll be using bamtofastq.  This function takes an aligned bam file as input and outputs a fastq format file.  You can use the options if you have paired end data to output R1 and R2 reads for your fastq file.  This type of function is especially useful if you need to to  analyze sequences after you've compared several bam or bed files.

bedtools bamtofastq [OPTIONS] -i input.bam -fq output.fastq

Exercise 1: convert bam to fastq and look at the quality scores

solution code
module load bedtools
bedtools bamtofastq [OPTIONS] -i input.bam -fq output.fastq
more output.fastq

Bed file format: converting a bam file into a bed file.

While it's useful to be able to look at the fastq file, many analyses will be easiest to perform in bed format.  Bed format is a simple tab delimited format that designates various properties about segments of the genome, defined by the chromosome, start coordinates and end coordinates.  Bedtools provides a simple utility to convert bam files over into bed files, termed bamtobed.

bedtools bamtobed [OPTIONS] -i input.bam > output.bed

Note that the output will be piped to standard out unless you redirect to a program (head, more, less) or a file (output.bed).

Exercise 2: Convert the filtered yeast paired end bam to bed using bamtobed and find the number of lines in the file

solution code
module load bedtools
bedtools bamtobed [OPTIONS] -i input.bam > output.bed
wc -l output.bed
more output.bed #to examine the bed file visually

 

Exercise 3: Find the coverage of your bed file over the SacCer3 genome

solution code
solution goes here

 

Exercise 4: Use bedtools merge to merge an experiment

solution code
solution goes here

 

Exercise 5: Intersect two experiments using intersect

solution code
solution goes here

 

 

  • No labels