...
Because these are short reads, we do not have to adjust parameters such as the inter-seed distance (-i) or the minimum alignment score (--score-min), both of which are functions of read length. If we were processing longer reads, we might need parameters like these to force bowtie2 to "pretend" the read is a short, constant length:
-i C,1,0 --score-min C,32,0
Yes, that looks complicated, and it kind of is. It's basically saying "slide the seed down the read one base at a time" and "report alignments as long as they have a minimum alignment score of 32" (16 matching bases x 2 points per match, minimum). See the bowtie2 manual (after you have had a good stiff drink) for a full explanation.
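To make the function syntax a little less mysterious, here is a minimal Python sketch of how a bowtie2-style function string of the form type,constant,coefficient can be evaluated for a given read length. The helper name `bowtie2_function` is made up for illustration and is not part of bowtie2. The spelling `C,32,0` for a constant minimum score of 32 is an assumption based on the description above, and `S,1,0.75` is included only as a contrasting example of a function that grows with read length.

```python
import math


def bowtie2_function(spec: str, read_length: int) -> float:
    """Evaluate a function string such as 'C,1,0' or 'S,1,0.75'.

    Hypothetical helper for illustration only. The general shape is
    f(x) = constant + coefficient * g(x), where x is the read length and
    g depends on the type letter:
      C = constant, L = linear, S = square root, G = natural log.
    """
    parts = spec.split(",")
    kind = parts[0].upper()
    constant = float(parts[1])
    coefficient = float(parts[2]) if len(parts) > 2 else 0.0
    transforms = {
        "C": lambda x: 0.0,   # constant: read length has no effect
        "L": lambda x: float(x),
        "S": math.sqrt,
        "G": math.log,
    }
    return constant + coefficient * transforms[kind](read_length)


if __name__ == "__main__":
    for length in (36, 100, 250):
        seed_interval_const = bowtie2_function("C,1,0", length)
        seed_interval_sqrt = bowtie2_function("S,1,0.75", length)
        min_score_const = bowtie2_function("C,32,0", length)
        print(length, seed_interval_const, seed_interval_sqrt, min_score_const)
```

The constant forms return the same value for every read length, which is the sense in which bowtie2 is made to "pretend" every read is short, while the square-root form spreads seeds further apart as reads get longer.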
...
Now you should have a human_mirnaseq.sam file that you can examine using whatever commands you like. An example alignment looks like this (note this is one alignment record, although it has been broken up below for readability).
Code Block | ||
---|---|---|
| ||
TUPAC_0037_FC62EE7AAXX:2:1:2607:1430#0/1 0 hsa-mir-302b 50 22 3S20M13S * 0 0 TACGTGCTTCCATGTTTTANTAGAAAAAAAAAAAAG ZZFQV]Z[\IacaWc]RZIBVGSHL_b[XQQcXQcc AS:i:37 XN:i:0 XM:i:1 XO:i:0 XG:i:0 NM:i:1 MD:Z:16G3 YT:Z:UU |
Notes:
- Notice the CIGAR string is 3S20M13S, meaning that 3 bases were soft clipped from one end (3S), and 13 from the other (13S).
- The 20M part of the CIGAR string says there was a block of 20 read bases that mapped to the reference. Together with the clipped bases, that accounts for the whole 36-base read (3 + 20 + 13 = 36).
- If we did the same alignment using either bowtie2 --end-to-end mode, or using bwa aln as in Exercise #1, very little of this file would have aligned.
- If we had not lowered the bowtie2 seed length (-L) from its default (22 in end-to-end mode, 20 in local mode), we would have missed many alignments like this one, which matches for only 20 bases.
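If you would rather check the CIGAR arithmetic with a script than by eye, the following is a minimal Python sketch that pulls apart the example record above. The function names `parse_cigar` and `summarize_alignment` are made up for illustration. The record is rebuilt here with tab separators, as in a real SAM file, and the sets of read-consuming and reference-consuming operations follow the SAM specification.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
READ_OPS = set("MIS=X")  # operations that consume read bases
REF_OPS = set("MDN=X")   # operations that consume reference bases


def parse_cigar(cigar):
    """Split a CIGAR string into (length, operation) pairs."""
    return [(int(length), op) for length, op in CIGAR_OP.findall(cigar)]


def summarize_alignment(sam_line):
    """Report clipping, aligned length, and position for one SAM record."""
    fields = sam_line.rstrip("\n").split("\t")
    pos, cigar, seq = int(fields[3]), fields[5], fields[9]
    ops = parse_cigar(cigar)
    # Optional tags look like AS:i:37 (name:type:value).
    tags = {}
    for tag in fields[11:]:
        name, _type, value = tag.split(":", 2)
        tags[name] = value
    ref_span = sum(n for n, op in ops if op in REF_OPS)
    return {
        "reference": fields[2],
        "start": pos,
        "end": pos + ref_span - 1,
        "soft_clipped": sum(n for n, op in ops if op == "S"),
        "aligned_read_bases": sum(
            n for n, op in ops if op in READ_OPS and op != "S"
        ),
        "read_length_from_cigar": sum(n for n, op in ops if op in READ_OPS),
        "read_length_from_seq": len(seq),
        "alignment_score": int(tags["AS"]),
    }


# The example record from above, joined with tabs.
RECORD = "\t".join([
    "TUPAC_0037_FC62EE7AAXX:2:1:2607:1430#0/1", "0", "hsa-mir-302b", "50",
    "22", "3S20M13S", "*", "0", "0",
    "TACGTGCTTCCATGTTTTANTAGAAAAAAAAAAAAG",
    r"ZZFQV]Z[\IacaWc]RZIBVGSHL_b[XQQcXQcc",
    "AS:i:37", "XN:i:0", "XM:i:1", "XO:i:0", "XG:i:0", "NM:i:1",
    "MD:Z:16G3", "YT:Z:UU",
])

summary = summarize_alignment(RECORD)
for key, value in summary.items():
    print(key, value)
```

For this record the sketch reports 16 soft-clipped bases and 20 aligned bases, and the read length implied by the CIGAR string agrees with the length of the sequence field.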
...