Your Instructors

Most of us are members (or alumni) of the functional genomics lab of Vishwanath Iyer, UT Austin.

  • Anna Battenhouse, Associate Research Scientist, Iyer Lab, abattenhouse@utexas.edu
    • BA English literature, 1978
    • Commercial software development 1982 – 2005
    • Joined Iyer Lab 2007 (“retirement career”)
    • BS Biochemistry, 2013
  • Amelia Weber Hall, Graduate Student, Iyer Lab, ameliahall@utexas.edu
    • 5th year Microbiology graduate student
    • Laboratory Technician at UT 2007-2010
    • BS Molecular Genetics, 2007
  • Nathan Abell, Research Assistant, Xhemalce Lab, abell.nathan@gmail.com
    • Undergraduate researcher in Iyer Lab 2011-2013
    • BS Molecular Biology, UT, 2013
    • Research Assistant
  • Dakota Derryberry, Graduate Student, Wilke Lab, dakotaz@utexas.edu
    • ???

About the Iyer Lab

http://iyerlab.org/

  • Main focus is functional genomics
    • large-scale transcriptional reprogramming in response to diverse stimuli
    • ENCODE consortium collaborator
    • work in human and yeast
  • Research methods include
    • microarrays (Dr. Iyer was co-inventor)
    • high-throughput sequencing (since 2007)
      • especially ChIP-seq
      • also RNA-seq, RIP-seq, MNase-seq ...
      • we now have > 1,500 NGS datasets

Communication

Post-its

Green post-it – I'm good at the moment.

Pink post-it – I need a bit of help.

Conventions

Text that you find in courier font refers to a program or file name on a computer.

If you see a block of text like this:

Example code block
ls -h

it means, "type the command ls -h into a terminal window, hit return, and see what happens".

We intend this course to offer as much self-learning as possible. Consequently, you'll find many sections like this - click on the triangle to expand them:

Hint sections will provide you some guidance on what to do next, but will not spell it out.

and some sections like this:

Solution sections contain the exact commands, so you can copy and paste them if you have to.

Goals and challenges

Course goals

  • Hands-on, tutorial style – learn by doing
  • Cover the NGS tool basics – the first few things you'll do after receiving raw sequences
  • Get you comfortable with Linux and TACC – your best "frenemies"
  • Make you self-sufficient in 4 days, so that you can become an expert over time
  • Show some "best practices" for working with NGS data

Challenges

Large and growing datasets

NGS methods produce staggering amounts of data!

Typical dataset these days

  • yeast:  5 – 20 million reads
  • human:  20 – 100 million reads
  • paired end, length 75 – 100 bases

The initial fastq files are big (100s of MB to GB) – and they're just the start.

  • Organization and naming conventions are critical.
  • Your data can get out of hand very quickly!
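
To get a feel for how big a dataset is, it helps to check both the file size and the number of reads. The following is a minimal sketch, not a prescribed course command: the file name sample.fastq and its two made-up reads are assumptions for illustration only. It relies on the fact that each FASTQ record takes 4 lines.

```shell
# Create a tiny, made-up FASTQ file so the example is self-contained.
# (sample.fastq is a hypothetical name; real files are far larger.)
tmpdir=$(mktemp -d)
fastq="$tmpdir/sample.fastq"
printf '@read1\nACGTACGT\n+\nIIIIIIII\n@read2\nTTGGCCAA\n+\nIIIIIIII\n' > "$fastq"

# Each FASTQ record occupies 4 lines (name, bases, '+', qualities),
# so the number of reads is the line count divided by 4.
lines=$(wc -l < "$fastq")
reads=$((lines / 4))
echo "reads: $reads"

# Show the file size in human-readable units.
ls -lh "$fastq"
```

On a real dataset you would point the same commands at your own fastq file instead of creating one with printf.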

Progression of Iyer Lab ChIP-seq datasets over time

  • 2008 – Yeast heat shock remodeling of chromatin
    • 2 yeast datasets
    • less than 2 million reads
  • 2010 – Allelic bias in CTCF binding
    • 13 CTCF datasets from 3 GM cell lines
    • ~200 million reads
  • 2012 – Analysis of 3 TFs across 11 cell lines
    • 32 datasets gathered over 3 years
    • ~ 1 billion reads
  • 2014 – QTL analysis of CTCF binding
    • 52 very deeply sequenced CTCF datasets
    • ~ 8 billion reads
  • in progress – Functional analysis of glioblastoma tumors and cell lines
    • > 300 datasets so far
    • > 17 billion reads
