...
Before the alignment, of course, we've got to build a mirbase index using bowtie2-build (go ahead and check out its options). Unlike for the aligner itself, we only need to worry about a few things here:
Code Block |
---|
bowtie2-build <reference_in> <bt2_index_base> |
...
- reference_in file is just the FASTA file containing mirbase v20 sequences
...
- bt2_index_base is the prefix of where we want the files to go
...
Following what we did earlier for BWA indexing:
Code Block | ||||
---|---|---|---|---|
| ||||
mkdir -p $WORK/archive/references/bt2/mirbase.v20 |
cd $WORK/archive/references/bt2/mirbase.v20 |
ln -s -f ../../fasta/hairpin_cDNA_hsa.fa |
ls -la |
Now build the index with bowtie2-build:
Code Block | ||||
---|---|---|---|---|
| ||||
bowtie2-build hairpin_cDNA_hsa.fa hairpin_cDNA_hsa.fa |
That was very fast! That's because the mirbase reference is tiny compared to what programs like this are used to dealing with, which is the human genome (or bigger). Your $WORK/archive/references/bt2/mirbase.v20 directory should now contain the following files:
Code Block | ||
---|---|---|
| ||
hairpin_cDNA_hsa.fa hairpin_cDNA_hsa.fa.1.bt2 hairpin_cDNA_hsa.fa.2.bt2 hairpin_cDNA_hsa.fa.3.bt2 hairpin_cDNA_hsa.fa.4.bt2 hairpin_cDNA_hsa.fa.rev.1.bt2 hairpin_cDNA_hsa.fa.rev.2.bt2 |
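If you'd rather not eyeball the listing, the check can be scripted. The following is a minimal sketch, assuming the six-file layout shown above; `check_bt2_index` is a hypothetical helper written for this page, not something that ships with bowtie2.

```shell
# Hypothetical helper (not part of bowtie2): report any of the six
# index files listed above that are missing or empty for a given prefix.
check_bt2_index() {
    prefix="$1"
    missing=0
    for suffix in 1.bt2 2.bt2 3.bt2 4.bt2 rev.1.bt2 rev.2.bt2; do
        # -s is true only if the file exists and is not empty
        if [ ! -s "${prefix}.${suffix}" ]; then
            echo "missing or empty: ${prefix}.${suffix}" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example, run from the directory where the index was built:
# check_bt2_index hairpin_cDNA_hsa.fa && echo "index looks complete"
```

The prefix passed in is the same `bt2_index_base` that was given to bowtie2-build, which here happens to be the FASTA file name.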
Now, we're ready to actually try to do the alignment. Remember, unlike BWA, we actually need to set some options depending on what we're after. Some of the most important options for bowtie2 are:
Option | Effect |
---|---|
-N | Controls the number of mismatches allowable in the seed of each alignment (default = 0) |
-L | Controls the length of seed substrings generated from each read (default = 22) |
--end-to-end or --local | Controls whether the entire read must align to the reference, or whether soft-clipping the ends is allowed to find internal alignments. Default --end-to-end |
--ma | Controls the alignment score contribution of a matching base (0 for --end-to-end, 2 for --local) |
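To see how these options fit together on one command line, here is a rough sketch. `build_bowtie2_cmd` is a hypothetical helper that only prints a command and aligns nothing; `reads.fastq` and `reads.sam` are placeholder file names, and the `-N 1 -L 16` values are purely illustrative, not a recommendation. The `-x`, `-U` and `-S` flags are bowtie2's index prefix, unpaired input and SAM output options. The range checks reflect bowtie2's documented limits (-N must be 0 or 1; -L must be greater than 3 and less than 32).

```shell
# Hypothetical helper: print a bowtie2 command line for the options above.
# It only echoes the command; nothing is actually aligned.
build_bowtie2_cmd() {
    mode="$1"; seed_mm="$2"; seed_len="$3"; index="$4"; reads="$5"; out="$6"
    case "$mode" in
        --end-to-end|--local) ;;
        *) echo "mode must be --end-to-end or --local" >&2; return 1 ;;
    esac
    case "$seed_mm" in
        0|1) ;;
        *) echo "-N must be 0 or 1" >&2; return 1 ;;
    esac
    if [ "$seed_len" -le 3 ] || [ "$seed_len" -ge 32 ]; then
        echo "-L must be greater than 3 and less than 32" >&2; return 1
    fi
    echo "bowtie2 $mode -N $seed_mm -L $seed_len -x $index -U $reads -S $out"
}

# Placeholder file names; prints the command without running it.
build_bowtie2_cmd --local 1 16 hairpin_cDNA_hsa.fa reads.fastq reads.sam
```

Printing the command first makes it easy to sanity-check the settings before committing to a real run.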
To decide how we want to go about doing our alignment, check out the file we're aligning with less.
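Alongside paging through the file, a quick tally of read lengths may help when weighing --end-to-end against --local and picking a seed length. This is only a sketch: it assumes a standard FASTQ file with four lines per record, `read_length_counts` is a hypothetical helper, and `reads.fastq` is a placeholder name.

```shell
# Hypothetical helper: count how many reads there are of each length.
# Assumes four-line FASTQ records, so the sequence is line 2 of every 4.
read_length_counts() {
    awk 'NR % 4 == 2 { n[length($0)]++ }
         END { for (len in n) print len, n[len] }' "$1" | sort -n
}

# Example (placeholder file name); output columns are: length, count
# read_length_counts reads.fastq
```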
...