Oct 8, 2012
Location: ACES 2.402; 9 am - 2 pm (CDT)
Your Instructors
Name | Initials | Affiliation | |
---|---|---|---|
Matt Vaughn | MWV | Manager, TACC Life Sciences | |
John Fonner | JF | Research Associate, TACC Life Sciences | |
Scott Hunicke-Smith | | Director, GSAF | in absentia |
Jeff Barrick | | Asst. Prof. Biochemistry | in absentia |
Learning objectives
- Role of TACC for UT System researchers
- Logging into Lonestar and other TACC systems
- Batch vs interactive computing on Lonestar
- The TACC software environment
- File systems available on TACC Lonestar
- Parallelism strategies for increasing efficiency of NextGen analyses
- Use of idev interactive sessions on TACC systems
- Moving files via SFTP and SCP
Outline
Linux and Lonestar Refresher Course (9:00-10:00)
- Introduction to TACC
- Linux basics with Lonestar
- Logging in via SSH
- Command-line tricks (tab completion and the history)
- Essential Linux commands
- Wildcards and special file names
- Using options with Linux commands
- Getting help
- Extra: Printable cheat sheet of common Linux commands
- Lonestar Essentials
- The login (or head) nodes
- Acceptable uses for login nodes
- What not to use login nodes for
- Lonestar file systems
- Running compute jobs on Lonestar
- Batch (qsub)
- Interactive (idev)
- Editing files
- Using nano on the command line
- Demo: Using TextWrangler (Mac), Notepad++ (Windows), or gedit (Linux) from a desktop environment
- Finding and using software
- The module system
- avail, list, load, swap, unload, key
- Linux PATHs
- Extra: List of genomics modules available at TACC
- Extra: Installing your own Linux software
Break (10:00-10:15)
Delving into HPC-oriented NGS analysis (10:15-11:15)
- Tutorial: Read mapping with BWA and BOWTIE
- Supplemental material
- Presentation: Introduction to Read Mapping (Barrick)
Speeding up your analyses using parallel computing (11:15-11:50)
- Introduction: Parallelism Strategies (PDF)
- Tutorial: Using threads to speed up mapping on a single compute node
- Tutorial: Using the TACC Parametric Launcher to speed up mapping (or any other natively parallel task) across multiple nodes
- Bonus example: Using the Launcher to automate Rscript analyses
Set-up for Afternoon (11:50-12:00)
- Pre-flight instructions for Interactive Computing session
- Please complete before going to lunch!
Lunch (12:00-1:00)
- Please return promptly!
Variant Calling with SAMtools (1:00-1:40)
- Tutorial: Variant calling using SAMtools
- Calling SNPs and Indels
- Inspecting alignments supporting a variant using tview
- Filtering variants
- Finding summary statistics
- Supplemental material
Downloading and uploading files using SFTP (1:40-1:50)
- The sftp and scp commands
- Demo: Using Cyberduck on the desktop
- Supplemental material