...

Exercise 1a: Providing a transcript annotation file

Which tophat option is used to provide a transcript annotation file (GTF file)?

Hints

Remember that you can type the command name (tophat) and press Enter to get help information about it.
Or search online for tophat to find its manual.

Solution

The -G <GTF filename> option tells tophat to use the annotated splice junctions in the supplied GTF file.

Exercise 1b: Using only annotated junctions
How would I tell tophat to use only a specified set of transcript annotations and not look for any novel junctions?

Solution

Specify -G <gtf filename> to have tophat use the annotated transcripts in the supplied GTF file.
Also add the --no-novel-juncs option to suppress de novo junction detection.
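
To make the two solutions concrete, here is a minimal sketch in Python that only assembles the tophat argument list; it does not run tophat. The helper name build_tophat_command, the index prefix dm3_index and the file names reads.fastq and genes.gtf are hypothetical placeholders, not paths from this course. It assumes the usual tophat call order: options first, then the bowtie index prefix, then the reads.

```python
import shlex


def build_tophat_command(index_prefix, reads, gtf=None, annotated_only=False):
    """Assemble a tophat command line as a list of arguments (hypothetical helper)."""
    cmd = ["tophat"]
    if gtf is not None:
        # -G supplies the transcript annotation (GTF) file
        cmd += ["-G", gtf]
    if annotated_only:
        # --no-novel-juncs only makes sense when an annotation is supplied
        if gtf is None:
            raise ValueError("annotated_only requires a GTF file")
        cmd.append("--no-novel-juncs")
    # Positional arguments: index prefix, then a comma-separated list of read files
    cmd += [index_prefix, ",".join(reads)]
    return cmd


if __name__ == "__main__":
    # Placeholder names, for illustration only
    cmd = build_tophat_command(
        "dm3_index", ["reads.fastq"], gtf="genes.gtf", annotated_only=True
    )
    print(shlex.join(cmd))
```

Building the command as a list rather than a single string keeps each option and its value together, which makes it easy to see that --no-novel-juncs is always paired with -G.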

...

The GTF file for our Drosophila genome (dm3) is in $BI/ngs_course/tophat_cufflinks/reference/genes.gtf. What does it look like?
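
As a rough guide to what to expect, a GTF line has nine tab-separated fields: sequence name, source, feature, start, end, score, strand, frame, and a semicolon-separated attribute list. The sketch below splits one such line in Python. The example line is made up for illustration and is not taken from genes.gtf; the function name parse_gtf_line is likewise hypothetical.

```python
GTF_COLUMNS = [
    "seqname", "source", "feature", "start", "end",
    "score", "strand", "frame", "attributes",
]


def parse_gtf_line(line):
    """Split one GTF line into a dict of its nine fields (hypothetical helper)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError("expected 9 tab-separated fields, got %d" % len(fields))
    record = dict(zip(GTF_COLUMNS, fields))
    # Coordinates are 1-based and inclusive
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    # Attributes look like: key "value"; key "value";
    attributes = {}
    for item in record["attributes"].split(";"):
        item = item.strip()
        if not item:
            continue
        key, _, value = item.partition(" ")
        attributes[key] = value.strip().strip('"')
    record["attributes"] = attributes
    return record


if __name__ == "__main__":
    # Made-up example line, NOT from the course's genes.gtf
    example = 'chrX\texample_source\texon\t100\t250\t.\t+\t.\tgene_id "geneA"; transcript_id "geneA.1";'
    print(parse_gtf_line(example))
```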

...

Hint

The 6th field of a BAM record is the CIGAR string, which tells you how your query sequence mapped to the reference.

Answer
Looking at the CIGAR string

The CIGAR string "58M76N17M" represents a spliced sequence. The codes mean:

  • 58M - the first 58 bases match the reference
  • 76N - there are then 76 bases on the reference with no corresponding bases in the sequence (an intron)
  • 17M - the last 17 bases match the reference
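
The breakdown above can be checked with a short Python sketch that splits a CIGAR string into its (length, operation) pairs and totals the bases on the read and on the reference. The function names parse_cigar and summarize_cigar are hypothetical, and this is only an illustration of the arithmetic, not how tophat or samtools handle CIGAR strings internally.

```python
import re

# One CIGAR element: a count followed by an operation code
CIGAR_ELEMENT = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar):
    """Return a list of (length, operation) tuples for a CIGAR string."""
    ops = [(int(length), op) for length, op in CIGAR_ELEMENT.findall(cigar)]
    # Reject strings containing anything the pattern did not consume
    if "".join("%d%s" % (length, op) for length, op in ops) != cigar:
        raise ValueError("malformed CIGAR string: %r" % cigar)
    return ops


def summarize_cigar(cigar):
    """Return (read bases, reference span, intron lengths) for a CIGAR string."""
    ops = parse_cigar(cigar)
    # M, I, S, = and X consume bases of the read
    read_bases = sum(length for length, op in ops if op in "MIS=X")
    # M, D, N, = and X consume bases of the reference
    reference_span = sum(length for length, op in ops if op in "MDN=X")
    # N marks a skipped reference region, i.e. an intron for spliced reads
    introns = [length for length, op in ops if op == "N"]
    return read_bases, reference_span, introns


if __name__ == "__main__":
    print(parse_cigar("58M76N17M"))
    print(summarize_cigar("58M76N17M"))
```

For "58M76N17M" this gives 75 read bases (58 + 17) spread over 151 reference bases (58 + 76 + 17), with a single 76-base intron.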

...