Overview and Objectives

Once raw sequence files have been generated (in FASTQ format) and quality-checked, the next step in most NGS pipelines is mapping the reads to a reference genome. For individual sequences of interest, it is common to use a tool like BLAST to identify genes or species of origin. However, a typical NGS experiment produces millions of reads, and the reference is frequently billions of bases long, a scale that BLAST and similar tools are not designed to handle.

Thus, a large set of computational tools has been developed to align each read to its best location, if any, in a reference, quickly and with sufficient (but NOT absolute) accuracy. Even though many mapping tools exist, a few individual programs have a dominant "market share" of the NGS world. These programs vary widely in their design, inputs, outputs, and applications. In this section, we will primarily focus on two of the most versatile mappers: BWA and Bowtie2, the latter being part of the Tuxedo suite (which also includes Tophat2).
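To make "align each read to its best location, if any" concrete, here is a toy seed-and-extend mapper in Python. It is only a sketch for intuition. BWA and Bowtie2 do not work this way internally; both rely on a compressed Burrows-Wheeler/FM index rather than a plain k-mer table. The function names, the seed length k, and the mismatch threshold are all assumptions made up for this illustration.

```python
from collections import defaultdict


def build_kmer_index(reference, k):
    """Map every k-mer in the reference to the positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def map_read(read, reference, index, k, max_mismatches=2):
    """Return (position, mismatches) for the best hit, or None if unmapped.

    Hypothetical helper: every k-mer of the read is used as a seed, and each
    seed hit is extended to a full-length, ungapped comparison.
    """
    best = None
    for offset in range(len(read) - k + 1):
        seed = read[offset:offset + k]
        for hit in index.get(seed, []):
            start = hit - offset
            # Skip candidate placements that run off either end of the reference.
            if start < 0 or start + len(read) > len(reference):
                continue
            mismatches = hamming(read, reference[start:start + len(read)])
            if mismatches <= max_mismatches and (best is None or mismatches < best[1]):
                best = (start, mismatches)
    return best


# Made-up 30-base "reference" and three made-up reads.
reference = "ACGTTGCATGCCGATAGCTTAGGCTAACGT"
index = build_kmer_index(reference, k=4)

print(map_read("TGCCGATAGC", reference, index, k=4))  # exact match: (8, 0)
print(map_read("TGCCGTTAGC", reference, index, k=4))  # one mismatch: (8, 1)
print(map_read("GGGGGGGGGG", reference, index, k=4))  # no seed hits: None
```

The third read shows the "if any" case: a read with no acceptable placement is reported as unmapped rather than forced onto the reference. The mismatch threshold is one simple way to trade speed against accuracy, which is the sense in which mapping accuracy is sufficient but not absolute.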

Sample Datasets

You have already worked with one paired-end yeast ChIP-seq dataset, which we will continue to use here.  In order to demonstrate how to process 

Reference Genomes

 

BWA - The Most General Mapper

 

Bowtie2 - More Pain, More Gain

 

Future Directions

 
