...

The resulting output will contain several additional columns which summarize this information:

 

bedtools coverage output:

After each interval in B, coverageBed will report:

  1. The number of features in A that overlapped (by at least one base pair) the B interval.
  2. The number of bases in B that had non-zero coverage from features in A.
  3. The length of the entry in B.
  4. The fraction of bases in B that had non-zero coverage from features in A.
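As a minimal sketch of how these four trailing columns can be used downstream, the snippet below keeps only the B intervals whose covered fraction is at least 0.5. The file names and the toy values are made up for illustration, and the input is assumed to be a BED3 file with the four coverage columns appended in the order listed above.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical coverage output: chrom, start, stop, then the four appended columns
# (feature count, covered bases, interval length, covered fraction).
printf 'chr1\t0\t100\t2\t60\t100\t0.6000000\n'    > toy_coverage.txt
printf 'chr1\t200\t300\t0\t0\t100\t0.0000000\n'  >> toy_coverage.txt
printf 'chr1\t400\t600\t1\t50\t200\t0.2500000\n' >> toy_coverage.txt

# Address the appended columns from the right so the filter also works
# when the B file carries more than three columns.
awk 'BEGIN { FS = OFS = "\t" }
     {
         n_features = $(NF - 3)   # 1. number of overlapping A features
         covered    = $(NF - 2)   # 2. bases in B with non-zero coverage
         len        = $(NF - 1)   # 3. length of the B entry
         frac       = $NF         # 4. covered / len, as reported
         if (frac + 0 >= 0.5)
             print $1, $2, $3, n_features, sprintf("%.2f", covered / len)
     }' toy_coverage.txt > toy_well_covered.txt

cat toy_well_covered.txt
```

The threshold of 0.5 is arbitrary; the point is only that the last column is the second-to-last pair divided out, so either can be used for filtering.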

...

When we originally examined the bed files produced from our bam file, we saw many reads that overlap the same interval.  While this level of detail is useful, for some analyses we can collapse the overlapping reads into a single line and indicate how many reads occurred over that genomic interval.  We can accomplish this using bedtools merge.

```bash
bedtools merge [OPTIONS] -i experiment.bed > experiment.merge.bed
```

bedtools merge writes its output to standard out, so make sure to redirect the output to a file or pipe it to another program.  While we haven't discussed the options for each bedtools function in detail, here they are very important.  The -c option lists which input columns to operate on, and the -o option lists the operation to perform on each of those columns; the results are written to the output in the order given to -c.  Standard bed6 format is chrom, start, stop, name, score, strand, so controlling the column operations lets you control what goes into each column of the output.  The valid operations for the -o option are as follows:

 

Valid operations using the -o option:
  • sum, min, max, absmin, absmax,
  • mean, median,
  • collapse (i.e., print a delimited list (duplicates allowed)),
  • distinct (i.e., print a delimited list (NO duplicates allowed)),
  • count
  • count_distinct (i.e., a count of the unique values in the column)

For this exercise, we'll sum the score column over each merged region to get a new score, use distinct to choose a name, and use distinct again to keep track of the strand.  For the -c option, list the columns to operate on in the order you want them in the output, and give -o the matching operations in the same order.  In this case, to keep the standard bed format, we'll use -c 4,5,6 and -o distinct,sum,distinct, which keeps the proper order of name, score, strand.
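To make the pairing of columns and operations concrete, here is a plain awk sketch of what a merge computes when column 4 (name) is paired with distinct, column 5 (score) with sum, and column 6 (strand) with distinct. It is not bedtools itself: the toy file, its values, and the order in which distinct values are listed (first appearance) are assumptions made for illustration, and bedtools may order a distinct list differently. Like bedtools merge, the sketch expects input sorted by chromosome and then start, and it joins intervals that overlap or touch.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical bed6 reads: chrom, start, stop, name, score, strand.
printf 'chr1\t100\t200\treadA\t1\t+\n'  > toy_reads.bed
printf 'chr1\t500\t600\treadC\t1\t-\n' >> toy_reads.bed
printf 'chr1\t150\t250\treadB\t1\t+\n' >> toy_reads.bed
printf 'chr1\t250\t300\treadA\t1\t+\n' >> toy_reads.bed

# Roughly what the following would report (given sorted input):
#   bedtools merge -i sorted.bed -c 4,5,6 -o distinct,sum,distinct
sort -k1,1 -k2,2n toy_reads.bed |
awk 'BEGIN { FS = OFS = "\t" }
     # Print the cluster collected so far, if there is one.
     function emit() {
         if (have) print chrom, start, stop, names, score, strands
     }
     # Append value to a comma-separated list unless it is already present.
     function add_distinct(list, value,    parts, n, i) {
         n = split(list, parts, ",")
         for (i = 1; i <= n; i++) if (parts[i] == value) return list
         return (list == "" ? value : list "," value)
     }
     {
         if (have && $1 == chrom && $2 + 0 <= stop) {
             # Overlapping or touching: extend the cluster and fold in columns 4-6.
             if ($3 + 0 > stop) stop = $3 + 0
             names   = add_distinct(names, $4)    # -o distinct on column 4
             score  += $5                         # -o sum on column 5
             strands = add_distinct(strands, $6)  # -o distinct on column 6
         } else {
             emit()
             chrom = $1; start = $2; stop = $3 + 0
             names = $4; score = $5; strands = $6
             have = 1
         }
     }
     END { emit() }' > toy_merged.bed

cat toy_merged.bed
```

Because every toy read has a score of 1, the summed score equals the number of reads in each merged region; with real scores the sum and the read count would differ, and count would be the operation to use for a read tally.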

...

One useful way to compare two experiments (especially biological replicates, or similar experiments in two yeast strains/cell lines/mouse strains) is to compare where reads in one experiment overlap with reads in another experiment.  Bedtools offers a simple way to do this using the intersect function.

 

```bash
bedtools intersect [OPTIONS] -a <FILE> \
                             -b <FILE1, FILE2, ..., FILEN>
```

...

-a and -b indicate which files to intersect.  With -b, you can specify one or several files to intersect with the file specified in -a.

bedtools intersect options:

-wa:   Write the original entry in A for each overlap.

-wb:   Write the original entry in B for each overlap. Useful for knowing what A overlaps. Restricted by -f and -r.

-loj:   Perform a “left outer join”. That is, for each feature in A report each overlap with B. If no overlaps are found, report a NULL feature for B.

-wo:   Write the original A and B entries plus the number of base pairs of overlap between the two features. Only A features with overlap are reported. Restricted by -f and -r.

-wao: Write the original A and B entries plus the number of base pairs of overlap between the two features. However, A features without overlap are also reported with a NULL B feature and overlap = 0. Restricted by -f and -r.

-f: Minimum overlap required as a fraction of A. Default is 1E-9 (i.e. 1bp).

-names: When using multiple databases (-b), provide an alias for each that will appear instead of a fileId when also printing the DB record.
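The -wao behaviour described above can be sketched in plain awk, which may help when reading real intersect output. This is an illustration only, not bedtools: the toy files are hypothetical, B is assumed to be a BED3 file, and the NULL B feature is written here as ".", -1, -1, which is an assumption about how the placeholder looks.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical inputs: A has a name column, B is plain BED3.
printf 'chr1\t100\t200\tpeak1\nchr1\t400\t500\tpeak2\n'      > toy_a.bed
printf 'chr1\t150\t250\nchr1\t180\t300\nchr2\t100\t200\n'    > toy_b.bed

# Load B into memory first, then stream A and compare every pair.
awk 'BEGIN { FS = OFS = "\t" }
     NR == FNR {
         b_chrom[NR] = $1; b_start[NR] = $2 + 0; b_stop[NR] = $3 + 0
         n_b = NR
         next
     }
     {
         hits = 0
         for (i = 1; i <= n_b; i++) {
             if (b_chrom[i] != $1) continue
             # Overlap of two half-open intervals: min(stops) - max(starts).
             lo = ($2 + 0 > b_start[i]) ? $2 + 0 : b_start[i]
             hi = ($3 + 0 < b_stop[i])  ? $3 + 0 : b_stop[i]
             if (hi > lo) {
                 print $0, b_chrom[i], b_start[i], b_stop[i], hi - lo
                 hits++
             }
         }
         # A features without any overlap still get one line, with a NULL B and 0 bp.
         if (hits == 0) print $0, ".", -1, -1, 0
     }' toy_b.bed toy_a.bed > toy_wao.txt

cat toy_wao.txt
```

Here peak1 is reported twice, once per overlapping B interval, and peak2 is reported once with an overlap of 0. The all-pairs loop is fine for toy data but far slower than bedtools on real files.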

In this section, we'll intersect our yeast data with some yeast ChIP-seq data.  First, copy the yeast ChIP-seq data over from the Iyer lab share on Corral:

cp corral-repl/iyer/etc/etc .

Exercise 5: Intersect two experiments using intersect

Solution:

```bash
bedtools intersect -wao -a <FILE> \
                   -b output.merge.bed > intersect.bed
```