Differential expression with splice variant analysis at the same time: the Tophat/Cufflinks workflow

Objectives

In this lab, you will explore a fairly typical RNA-seq analysis workflow. Simulated RNA-seq data will be provided to you; the data consist of 75 bp paired-end reads that have been generated in silico to replicate real gene count data from Drosophila. The data simulate two biological groups with three biological replicates per group (six samples in total). This simulated data set has already been run through a basic RNA-seq analysis workflow. We will look at:

  1. How the workflow was run and what steps are involved.

...

  1. Which genes and isoforms are significantly differentially expressed.

Six raw data files were provided as the starting point:

  • c1_r1, c1_r2, c1_r3 from the first biological condition

...

  • c2_r1, c2_r2, and c2_r3 from the second biological condition
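
The sample layout listed above (two conditions, three replicates each) can be written down as a small lookup table, which is handy later when results need to be grouped by condition. This is only an illustrative sketch: the variable names `conditions`, `replicates`, `samples`, and `design` are made up for this example and are not part of the lab materials.

```python
# Build the six sample names from the design: 2 conditions x 3 replicates.
conditions = ["c1", "c2"]   # the two biological conditions
replicates = [1, 2, 3]      # three biological replicates per condition

# Sample names follow the pattern <condition>_r<replicate>, e.g. c1_r1.
samples = [f"{cond}_r{rep}" for cond in conditions for rep in replicates]

# Group the sample names by condition (hypothetical helper structure).
design = {
    cond: [name for name in samples if name.startswith(cond + "_")]
    for cond in conditions
}

print(samples)
print(design)
```

Keeping the design in one place like this makes it harder to mix up which replicate belongs to which condition when you start parsing output files.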

Due to the size of the data and the length of the run time, *most of the programs have already been run for this exercise*. The commands that were run are recorded in separate *.commands files. We will spend some time looking through these commands to understand them. You will then parse the output, find answers, and visualize the results.
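
As a starting point for reading through the *.commands files, the following sketch lists them and prints their non-empty lines. It assumes the files are plain text and sit in the directory you pass in (the current directory by default); the actual location in the lab environment is not stated here, and the function names `list_command_files` and `show_commands` are hypothetical.

```python
from pathlib import Path


def list_command_files(directory="."):
    """Return the *.commands files in `directory`, sorted by path."""
    return sorted(Path(directory).glob("*.commands"))


def show_commands(directory="."):
    """Print each *.commands file name followed by its non-empty lines."""
    for path in list_command_files(directory):
        print(f"== {path.name} ==")
        # errors="replace" avoids a crash on unexpected bytes (assumes text files).
        for line in path.read_text(errors="replace").splitlines():
            if line.strip():
                print("  " + line)


if __name__ == "__main__":
    # Assumption: the *.commands files are in the current working directory.
    show_commands(".")
```

Run it from the directory that holds the command files, or pass a different path to `show_commands`.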

...