Now that we have a bam file containing only the reads we want, we can do some more sophisticated analysis using bedtools.  Bedtools changes from version to version; here we are using version 2.22, the newest version and the one currently installed on stampede.

First, log in to stampede and make a directory called bedtools in your scratch folder.  Then copy your filtered bam file from the samtools section into this folder.

Code Block
languagebash
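# Log in to stampede (replace "user" with your own username)
ssh user@stampede.tacc.utexas.edu
# cds is the TACC shortcut for changing to your scratch directory
cds
mkdir bedtools
# Copy the filtered bam file into the new directory (adjust the source path if your file is elsewhere)
cp yeastpairedend.filtered.bam bedtools/
cd bedtools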

 

Converting a bam file to a fastq file

...

When we originally examined the bed files produced from our bam file, we saw many reads that overlap the same interval.  While this level of detail is useful, for some analyses we can collapse overlapping reads into a single line and indicate how many reads occurred over that genomic interval.  We can accomplish this using bedtools merge.

Code Block
languagebash
bedtools merge [OPTIONS] -i experiment.bed > experiment.merge.bed
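
To make concrete what merging overlapping intervals means, here is a minimal sketch in plain awk that mimics the idea on a tiny, made-up bed file. It is only an illustration of the concept, not how bedtools is implemented, and the file names (toy3.bed, toy3.merged.txt) and intervals are hypothetical. It assumes the input is already sorted by chromosome and then by start position, which bedtools merge also expects.

```shell
# Toy, already-sorted intervals (hypothetical data): chrom, start, stop.
printf 'chrI\t100\t200\nchrI\t150\t250\nchrI\t400\t500\nchrII\t50\t120\n' > toy3.bed

# Walk the sorted intervals. While the next interval is on the same chromosome
# and starts at or before the current end, extend the current interval and
# count the read; otherwise print the current interval and start a new one.
awk 'BEGIN { OFS = "\t" }
     NR > 1 && $1 == cur_chrom && $2 <= cur_end {
         if ($3 > cur_end) cur_end = $3
         n++
         next
     }
     NR > 1 { print cur_chrom, cur_start, cur_end, n }
     { cur_chrom = $1; cur_start = $2; cur_end = $3; n = 1 }
     END { if (NR > 0) print cur_chrom, cur_start, cur_end, n }' toy3.bed > toy3.merged.txt

cat toy3.merged.txt
```

The first two intervals on chrI overlap, so they collapse into a single line covering 100 to 250 with a count of 2, while the other two intervals are reported unchanged with a count of 1.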

Like the other bedtools commands, bedtools merge writes its output to standard out, so make sure to redirect the output to a file or pipe it to another program.  While we haven't discussed the options for each bedtools function in detail, here they are very important.  Two options work together: -c lists the input columns to summarize, and -o lists the operation to apply to each of those columns.  Together they define what type of operation to perform on each column and in what order the columns appear in the output.  Standard bed6 format is chrom, start, stop, name, score, strand, and controlling the column operations lets you control what goes into each column of the output.  The valid operations for the -o option are as follows:

 

 

Valid operations using the -o option
  • sum, min, max, absmin, absmax,
  • mean, median,
  • collapse (i.e., print a delimited list (duplicates allowed)),
  • distinct (i.e., print a delimited list (NO duplicates allowed)),
  • count
  • count_distinct (i.e., a count of the unique values in the column)
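
As a rough illustration of how -c and -o combine, the sketch below builds a small, made-up bed6 file, sorts it, and merges it while counting the reads in each merged interval (column 4) and averaging their scores (column 5). The file names and read values are hypothetical; substitute your own bed file. The sort step is included because bedtools merge expects input sorted by chromosome and then by start position.

```shell
# Hypothetical toy bed6 input: chrom, start, stop, name, score, strand (tab-delimited).
printf 'chrI\t400\t500\tread3\t20\t-\n'  > toy.bed
printf 'chrI\t100\t200\tread1\t10\t+\n' >> toy.bed
printf 'chrII\t50\t120\tread4\t40\t+\n' >> toy.bed
printf 'chrI\t150\t250\tread2\t30\t+\n' >> toy.bed

# Sort by chromosome, then numerically by start position.
sort -k1,1 -k2,2n toy.bed > toy.sorted.bed

if command -v bedtools >/dev/null 2>&1; then
    # -c 4,5       : operate on the name and score columns
    # -o count,mean: count the names and average the scores, in that order
    bedtools merge -i toy.sorted.bed -c 4,5 -o count,mean > toy.merge.bed
    cat toy.merge.bed
else
    echo "bedtools not found on PATH; load or install it before running merge" >&2
fi
```

Each output line holds the merged chrom, start, and stop, followed by one column per entry in -c, in the order given. Adding the -s option would restrict merging to reads on the same strand.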

 

Exercise 4: Use bedtools merge to merge an experiment

...