After raw sequence files are generated (in FASTQ format), quality-checked, and pre-processed in some way, the next step in most NGS pipelines is mapping to a reference genome.
For individual sequences, it is common to use a tool like BLAST to identify genes or species of origin. However, a typical NGS dataset will have tens to hundreds of millions of sequences, which BLAST and similar tools are not designed to handle. Thus, a large set of computational tools has been developed to quickly align each read to its best location (if any) in a reference.
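To make "align each read to its best location (if any)" concrete, here is a toy sketch in Python. It is only an illustration under simplifying assumptions: the function name best_hit, the max_mismatches parameter, and the example sequences are all made up for this page, and the brute-force scan (no gaps, no index) is not how BWA or Bowtie2 work internally. Real aligners build an index of the reference first so they can handle millions of reads.

```python
from typing import Optional, Tuple


def best_hit(read: str, reference: str, max_mismatches: int = 1) -> Optional[Tuple[int, int]]:
    """Return (position, mismatches) for the best ungapped placement of
    `read` in `reference`, or None if every placement exceeds max_mismatches.

    Hypothetical helper for illustration only; ties keep the leftmost hit.
    """
    best = None
    # Slide the read across every possible start position in the reference.
    for pos in range(len(reference) - len(read) + 1):
        window = reference[pos:pos + len(read)]
        # Count mismatching bases between the read and this window.
        mismatches = sum(a != b for a, b in zip(read, window))
        if mismatches <= max_mismatches and (best is None or mismatches < best[1]):
            best = (pos, mismatches)
    return best


# Made-up reference and reads: an exact match, a one-mismatch match, and no match.
reference = "ACGTTGCATGCCGATAGGCT"
reads = ["TGCATG", "GATTGG", "TTTTTT"]

hits = {read: best_hit(read, reference) for read in reads}
for read, hit in hits.items():
    print(read, hit)
```

Positions are 0-based here. A read with no acceptable placement gets None, which corresponds to an unmapped read in a real alignment.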
Even though many mapping tools exist, a few individual programs have a dominant "market share" of the NGS world. In this section, we will primarily focus on two of the most versatile general-purpose aligners: BWA and Bowtie2 (the latter being part of the Tuxedo suite, which includes the transcriptome-aware RNA-seq aligner TopHat2 as well as other downstream quantification tools).
Stage the alignment data
First, connect to login5.ls5.tacc.utexas.edu. This should be second nature by now.