A healthy sampling of the resources available, chosen specifically for this course. It is not a comprehensive catalog.
Linux/TACC
- Linux fundamentals on this wiki
- Cloud recording of CBRS's Intro to Unix short course, taught by Benni Goetz
- Cloud recording of CBRS's Intro to TACC short course, taught by Benni Goetz
Online tutorials:
Community Resources
Sequencing Technologies
- Overviews
Technology intros
- Illumina (Solexa) – most common "short" (< 300 bp) read sequencing
- Newer single molecule sequencing
- Single cell sequencing
- Older technologies (less common now)
FASTQ analysis/manipulation/QC
Reference genomes
Basic alignment and aligners
- Comparison of different aligners
- by Heng Li, developer of bwa, samtools, and many other bioinformatics tools
- File formats
- input: FASTQ format
- output: the SAM (Sequence Alignment Map) format specification
- Aligners
- The BioITeam has some TACC-aware alignment scripts you might find useful:
- bwa alignment
/work2/projects/BioITeam/common/script/align_bwa_illumina.sh
- bowtie2 alignment
/work2/projects/BioITeam/common/script/align_bowtie2_illumina.sh
- merging sorted BAM files (read-group aware)
/work2/projects/BioITeam/common/script/merge_sorted_bams.sh
- kallisto pseudo-alignment to annotated transcripts
/work2/projects/BioITeam/common/script/run_kallisto.sh
- These scripts are also available on many BRCF pods under /mnt/bioi/script.
- Email or come talk to Anna if you have questions or problems.
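As a quick illustration of the FASTQ input format listed above, here is a minimal Python sketch that reads four-line FASTQ records and decodes their quality strings. It assumes Phred+33 quality encoding (the convention for current Illumina data) and unwrapped records, with the sequence and quality each on a single line. The function name `read_fastq` and the example reads are made up for illustration and are not part of any BioITeam script.

```python
from statistics import mean


def read_fastq(lines):
    """Yield (name, sequence, qualities) from an iterable of FASTQ lines.

    Assumes 4 lines per record and Phred+33 quality encoding.
    """
    it = iter(lines)
    for header in it:
        header = header.rstrip("\n")
        if not header:
            continue  # tolerate blank lines between records
        seq = next(it, None)
        plus = next(it, None)
        qual = next(it, None)
        if qual is None:
            raise ValueError(f"truncated FASTQ record: {header!r}")
        seq, plus, qual = (s.rstrip("\n") for s in (seq, plus, qual))
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed FASTQ record: {header!r}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch: {header!r}")
        # Phred+33: quality score = ASCII code of the character minus 33
        yield header[1:], seq, [ord(c) - 33 for c in qual]


# Two made-up reads, purely for illustration
example = [
    "@read1\n", "ACGT\n", "+\n", "IIII\n",
    "@read2\n", "GGCA\n", "+\n", "!5?I\n",
]

for name, seq, quals in read_fastq(example):
    print(name, seq, mean(quals))
```

To run this on a real file, pass an open file handle instead of the list, for example `read_fastq(open("sample.fastq"))`, or `gzip.open(path, "rt")` for compressed data.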
Transcriptome-aware aligners
Alignment analysis
File formats and conversion
- SAM format specification – http://samtools.github.io/hts-specs/SAMv1.pdf
- crucial for performing format conversions, of which ChIP-seq analysis can have many
- Genome browser file formats – http://genome.ucsc.edu/FAQ/FAQformat.html
- BED, bedGraph, narrowPeak and many more
- SRA (Sequence Read Archive) from NCBI
- UCSC file format conversion scripts - useful for converting WIG and BED files to and from their corresponding binary formats (bigWig and bigBed).
- Make sure you download the correct scripts for your operating system!
- Directories containing these tools can be found at TACC:
/work2/projects/BioITeam/common/opt/UCSC_utils.2019_09
/work2/projects/BioITeam/common/opt/UCSC_utils.2017_07
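To show why the format specifications matter for conversions, here is a minimal Python sketch that turns one SAM alignment line into a BED6 line. The key detail is the coordinate shift: SAM positions are 1-based, while BED intervals are 0-based with an exclusive end. The function name `sam_to_bed` and the example record are invented for illustration, and using MAPQ as the BED score is one possible choice, not a requirement of either format.

```python
import re

# One CIGAR operation: a length followed by an operation code
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that advance along the reference sequence
REF_CONSUMING = set("MDN=X")


def sam_to_bed(sam_line):
    """Return a BED6 line for a mapped SAM record, else None."""
    if sam_line.startswith("@"):
        return None  # SAM header line
    fields = sam_line.rstrip("\n").split("\t")
    qname, flag, rname, pos, mapq, cigar = fields[:6]
    flag = int(flag)
    if flag & 0x4 or cigar == "*":
        return None  # unmapped read, no interval to report
    ref_len = sum(
        int(n) for n, op in CIGAR_OP.findall(cigar) if op in REF_CONSUMING
    )
    start = int(pos) - 1   # SAM POS is 1-based; BED start is 0-based
    end = start + ref_len  # BED end is exclusive
    strand = "-" if flag & 0x10 else "+"
    return "\t".join([rname, str(start), str(end), qname, mapq, strand])


# Made-up reverse-strand alignment, purely for illustration
example = "read1\t16\tchr1\t101\t60\t5M2D3M1I2M\t*\t0\t0\tACGTACGTACG\tIIIIIIIIIII"
print(sam_to_bed(example))
```

The insertion (`1I`) does not extend the interval because it consumes read bases only, while the deletion (`2D`) does, so the example read covers 12 reference bases.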
UCSC Genome Browser
RNAseq/Transcriptome analysis
Variant calling
Genome Annotation