A sampling of the resources available, chosen specifically for this course - not a comprehensive catalog.
- SEQanswers forum - many NGS questions answered here
- UCSC Genome Browser - visualize and download NGS data (see more below)
- Galaxy website for online sequencing data analysis
- Broad Institute Integrative Genomics Viewer (IGV) - especially good for BAM files
Getting started with Linux and Perl
- Unix and Perl for Biologists website
- Cheat sheet of useful Unix commands
- A funny SEQanswers post about biologists starting to analyze NGS data
- Wikipedia FASTQ format page
- FastQC from Babraham Bioinformatics - produces a nice quality report for FASTQ files.
- Cutadapt - An excellent command line tool for adapter sequence removal.
- FASTX Toolkit - Command line tools for fastq analysis and manipulation
- Illumina library construction on GSAF user wiki - useful for contaminant detection or adapter removal.
- Comparison of different aligners
- File formats
- SAM (Sequence Alignment Map) format specification (pdf)
- sam/bam tools
- SAMstat - produces detailed graphical statistics for sam/bam files.
- BEDTools - region overlap, merge, coverage and much more, with BED, BAM, VCF and GFF support
- BEDTools user manual (pdf)
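To make the "region overlap" idea behind BEDTools concrete, here is a minimal Python sketch of overlap between two BED records. It relies on the BED convention that the start coordinate is 0-based and the end coordinate is exclusive. The function names and the example intervals are hypothetical, made up for illustration; this is not how BEDTools itself is implemented.

```python
def parse_bed_line(line):
    """Parse the first three BED columns: chrom, start (0-based), end (exclusive)."""
    fields = line.rstrip("\n").split("\t")
    return fields[0], int(fields[1]), int(fields[2])


def overlap_length(a, b):
    """Return the number of bases shared by two BED intervals (0 if none)."""
    chrom_a, start_a, end_a = a
    chrom_b, start_b, end_b = b
    if chrom_a != chrom_b:
        return 0  # intervals on different chromosomes never overlap
    return max(0, min(end_a, end_b) - max(start_a, start_b))


# Hypothetical example intervals, not taken from any real dataset.
first = parse_bed_line("chr1\t100\t200\tfeatureA")
second = parse_bed_line("chr1\t150\t250\tfeatureB")
print(overlap_length(first, second))
```

Because the end coordinate is exclusive, two intervals that merely touch (one ending at 200, the next starting at 200) share no bases under this definition.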
UCSC Genome Browser
- intro on this wiki
- Main UCSC Genome Browser web site
- Beta Test browser site - most up-to-date datasets and features; can be buggy
- File formats - BED format especially is widely used
- Table browser - Browse and download data in different formats
- The 1000 Genomes project - catalog of human genetic variants
- Broad Institute GATK - complex but powerful; used by 1000 Genomes
- File formats
- The Tuxedo pipeline: RNA-seq with TopHat/Cufflinks
Format converters and miscellaneous tools
- SRA (Sequence Read Archive) from NCBI
- Mason program for simulating second-generation sequencing reads.
De novo assembly
- <put something here>
Other courses with online tutorials
- 2012 Next-Gen Sequence Analysis Workshop (Michigan State University) - has tutorials similar to our course, and also includes introductions to Amazon EC2 (where you can "rent" Linux machines, useful if you don't have access to TACC), Python, R, ChIP-Seq, etc.