Page History

...

After the header lines, each feature in the genome is represented by a line that gives chromosome, start, stop, strand, and other information. Features are things like "mRNA," "CDS," and "EXON." As you would expect in a prokaryotic genome it is frequently the case that the gene, mRNA, CDS, and exon annotations are identical, meaning they share coordinate information. You could parse these files further using commands like grep and awk to extract, say, all exons from the full file or to remove the header lines that begin with "#".

Building the bowtie2 vibCho index

To build the reference index for alignment, we actually only need the FASTA file, since annotations are often not necessary for alignment. (This is not always true - extensively spliced transcriptomes requires splice junction annotations to align RNA-seq data properly, but for now we will only use the FASTA file.) We build the reference files exactly as we did with mirbase, only swapping out the FASTA files as follows:

...

Page tree

Versions Compared

Old Version 83

New Version 84

Key

Building the bowtie2 vibCho index