...

Code Block
module load boost/1.45.0
module load bowtie
module load tophat
module load cufflinks

Exercise: If I wanted to provide a transcript annotation file (GTF file) to the cufflinks command, what would I add to the command in each of these cases?
i. I want cufflinks not to assemble any novel transcripts beyond those in the GTF file.
ii. I want cufflinks to assemble novel as well as annotated transcripts.

...

Code Block
cuffdiff [options] <merged.gtf> <sample1_rep1.bam,sample1_rep2.bam> <sample2_rep1.bam,sample2_rep2.bam>

Exercise: What does cuffdiff -b do?

...

For d), see the files "cuffdiff.sh" and "cuffdiff.sh.log" for the input, command line, and output. Then locate the cuffdiff output (either by reading cuffdiff.sh or by browsing the directory) and, by inspecting it and/or reading the documentation, find the isoforms and genes that are differentially expressed. Note that cuffdiff has performed a statistical test on the expression values between our two biological groups. It reports the FPKM expression level for each group, the log2 fold change between the groups (in the cuffdiff documentation, log2 of the group 2 FPKM over the group 1 FPKM), and a p-value for the test, among many other helpful data items.
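
As a minimal sketch of how such a table could be inspected from a script, the Python below reads a cuffdiff-style tab-delimited file and keeps only the rows flagged as significant. The column names (gene, value_1, value_2, log2(fold_change), p_value, significant) are assumed from the cuffdiff output format and may differ between versions. The function name and the example rows are made up for illustration and are not real results.

```python
import csv
import io


def significant_rows(handle):
    """Yield rows (as dicts) whose 'significant' column equals 'yes'.

    Assumes a tab-delimited file with a header line, as in gene_exp.diff
    or isoform_exp.diff (column names are an assumption, see above).
    """
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        if row["significant"] == "yes":
            yield row


# Tiny made-up example in the assumed cuffdiff layout (NOT real data).
EXAMPLE = (
    "test_id\tgene_id\tgene\tlocus\tsample_1\tsample_2\tstatus\t"
    "value_1\tvalue_2\tlog2(fold_change)\ttest_stat\tp_value\tq_value\tsignificant\n"
    "XLOC_000001\tXLOC_000001\tgeneA\t2L:100-200\tC1\tC2\tOK\t"
    "10.0\t40.0\t2.0\t3.1\t0.0001\t0.001\tyes\n"
    "XLOC_000002\tXLOC_000002\tgeneB\t2L:300-400\tC1\tC2\tOK\t"
    "5.0\t5.5\t0.1375\t0.2\t0.8\t0.9\tno\n"
)

if __name__ == "__main__":
    # Print group FPKMs, fold change, and p-value for each significant row.
    for row in significant_rows(io.StringIO(EXAMPLE)):
        print(row["gene"], row["value_1"], row["value_2"],
              row["log2(fold_change)"], row["p_value"])
```

To run this on real output, one would pass an open file handle (for example `open("gene_exp.diff")`) instead of the in-memory example.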

Exercise: Find the number of genes that are differentially expressed at the gene level and the number of genes that are differentially expressed at the isoform level. Then find the number of genes that are differentially expressed at one level (gene or isoform) but not at the other. This can be done with a Python or Perl script, or with a one-line Linux command.

...

Code Block
# Linux one-liner for sorting cuffdiff output by log2 fold-change values
cat isoform_exp.diff | awk '{print $10 "\t" $4}' | sort -n -r | head
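
The one-liner passes the header line through to `sort`, and `sort -n` may not order non-numeric entries such as inf or nan the way one would expect, if the cuffdiff version in use writes them. As a hedged alternative, the Python sketch below does the same ranking while skipping the header and dropping nan values. The column names `log2(fold_change)` and `locus` are assumptions about the file layout, and the function name and example rows are invented for illustration.

```python
import csv
import io
import math


def top_by_log2_fold_change(handle, n=10):
    """Return the n (log2 fold change, locus) pairs with the largest values.

    Rows whose fold change parses as nan are dropped; +/-inf are kept and
    sort to the extremes.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    pairs = []
    for row in reader:
        value = float(row["log2(fold_change)"])  # float() accepts inf/nan
        if math.isnan(value):
            continue
        pairs.append((value, row["locus"]))
    pairs.sort(key=lambda pair: pair[0], reverse=True)
    return pairs[:n]


# Made-up rows in the assumed isoform_exp.diff layout (NOT real data).
EXAMPLE_ISOFORMS = (
    "test_id\tgene_id\tgene\tlocus\tsample_1\tsample_2\tstatus\t"
    "value_1\tvalue_2\tlog2(fold_change)\ttest_stat\tp_value\tq_value\tsignificant\n"
    "TCONS_1\tXLOC_1\tgeneA\t2L:100-200\tC1\tC2\tOK\t10\t40\t2.0\t3.1\t0.0001\t0.001\tyes\n"
    "TCONS_2\tXLOC_2\tgeneB\t2L:300-400\tC1\tC2\tOK\t30\t10.6\t-1.5\t-2.0\t0.01\t0.04\tyes\n"
    "TCONS_3\tXLOC_3\tgeneC\t3R:10-90\tC1\tC2\tOK\t0\t12\tinf\t0\t0.2\t0.5\tno\n"
    "TCONS_4\tXLOC_4\tgeneD\tX:5-50\tC1\tC2\tNOTEST\t0\t0\tnan\t0\t1\t1\tno\n"
)

if __name__ == "__main__":
    for value, locus in top_by_log2_fold_change(io.StringIO(EXAMPLE_ISOFORMS)):
        print(value, locus, sep="\t")
```

Sorting on the parsed float rather than on text is the main design choice here; it keeps the ordering independent of how the numbers happen to be formatted.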

Using cummeRbund to inspect differential expression data.

Code Block

module load R
R
>source("http://bioconductor.org/biocLite.R")
>biocLite("cummeRbund")
>library(cummeRbund) 
>cuff_data <- readCufflinks('diff_out')

>csScatter(genes(cuff_data), 'C1', 'C2')

>gene_diff_data  <- diffData(genes(cuff_data)) 
>sig_gene_data <- subset(gene_diff_data, (significant == 'yes'))
>nrow(sig_gene_data) 


>isoform_diff_data <- diffData(isoforms(cuff_data), 'C1', 'C2')
>sig_isoform_data <- subset(isoform_diff_data, (significant == 'yes'))
>nrow(sig_isoform_data)

>mygene1 <- getGene(cuff_data, 'regucalcin')
>expressionBarplot(mygene1)
>expressionBarplot(isoforms(mygene1))

>mygene2 <- getGene(cuff_data, 'Rala')
>expressionBarplot(mygene2)
>expressionBarplot(isoforms(mygene2))