Communication

Post-its

Green post-it – I'm good at the moment.

Pink post-it – I need a bit of help.

Conventions

If you see a block of text like this:

ls -h

it means: type the command ls -h into a terminal window, press Return, and see what happens.
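A note on the example command itself: on its own, -h usually changes nothing you can see, because it only affects how file sizes are printed, and sizes appear only with options such as -l. The sketch below shows the difference. It assumes a bash-like shell with mktemp and head available; the scratch directory and the file name example.dat are hypothetical, created only for this illustration.

```shell
# Hypothetical scratch directory and file, used only for this demonstration.
demo_dir=$(mktemp -d)
head -c 2048 /dev/zero > "$demo_dir/example.dat"   # create a 2048-byte file

# With -h alone, only the file names are listed, just as with plain ls.
names_only=$(ls -h "$demo_dir")
echo "$names_only"

# Combined with -l, -h prints sizes in human-readable units (K, M, G)
# instead of a raw byte count.
long_listing=$(ls -lh "$demo_dir")
echo "$long_listing"

# Clean up the scratch directory.
rm -r "$demo_dir"
```

Compare the two listings: the second shows the file size in kilobytes rather than bytes, which becomes far more useful once files reach the megabyte and gigabyte range.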

We intend this course to offer as much self-learning as possible. Consequently, you'll find many sections like this (click on the triangle to expand them):

Hint sections will give you some guidance on what to do next, but will not spell it out.

and some sections like this:

Solution sections will contain the commands so that you can copy and paste them if you have to. They represent one way of answering the question, but there are often many ways to skin a cat!

Your Instructors

About the Iyer Lab

http://iyerlab.org/

Dr. Vishy Iyer, PI

Main focus is functional genomics

    • large-scale transcriptional reprogramming
      in response to diverse stimuli
    • ENCODE consortium collaborator
    • work in human and yeast


Research methods include
  • microarrays (Dr. Iyer was co-inventor)

  • high-throughput sequencing (since 2007)
    • especially ChIP-seq
    • also RNA-seq, RIP-seq, MNase-seq ...
    • we now have nearly 2,000 NGS datasets

Course goals

NGS Challenges

Diverse skill set requirements

  • Analysis – making sense of raw data
    • one part bioinformatics and statistics
    • one part scripting / programming
      • Linux command line
      • High Performance Computing (TACC)
      • bash scripting (grep, awk, sed)
      • R, Python, Perl
  • Management – making order out of chaos
    • one part organization
    • one part data wrangling
  • Adoption of best practices is critical!

Large and growing datasets

NGS methods produce staggering amounts of data!

Typical dataset these days

The initial fastq files are big (100s of MB to GB) – and they're just the start.

Progression of Iyer Lab datasets over time: