...
Code Block | ||
---|---|---|
| ||
cds
cd my_rnaseq_course/day_1_partB
head bwa_exercise/bwa_mem_results_transcriptome/C1_R1.mem.sam
# Many of these are header lines, which start with @
grep -v '^@' bwa_exercise/bwa_mem_results_transcriptome/C1_R1.mem.sam | head |
Exercise 5b: Spliced sequences
...
Code Block | ||
---|---|---|
| ||
# The 6th field of a SAM/BAM record is the CIGAR string, which tells you how your query sequence mapped to the reference.
grep -v '^@' bwa_exercise/bwa_mem_results_transcriptome/C1_R1.mem.sam | head | cut -f 6 |
...
Examine the CIGAR strings of the HISAT2 results
Code Block | ||
---|---|---|
| ||
grep -v '^@' hisat_exercise/results/GSM794483_C1.sam | head | cut -f 6
# The CIGAR string "58M76N17M" represents a spliced sequence. The codes mean:
#   58M - the first 58 bases align to the reference (matches or mismatches)
#   76N - the next 76 bases on the reference have no corresponding bases in the sequence (an intron)
#   17M - the last 17 bases align to the reference (matches or mismatches) |
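To make the CIGAR breakdown concrete, here is a minimal sketch that splits a CIGAR string into its individual operations and adds up how many reference bases the alignment spans. The variable names (`cigar`, `ops`, `ref_span`) are ours, not part of the course material, and the example value is the CIGAR string discussed above.

```shell
# Example CIGAR string taken from the explanation above
cigar="58M76N17M"

# Split the CIGAR into one operation per line: a length followed by an operation code
ops=$(printf '%s\n' "$cigar" | grep -oE '[0-9]+[MIDNSHP=X]')
printf '%s\n' "$ops"

# Sum the lengths of the operations that consume reference bases (M, D, N, =, X)
ref_span=$(printf '%s\n' "$ops" | awk '/[MDN=X]$/ { sum += substr($0, 1, length($0) - 1) } END { print sum + 0 }')
echo "reference span: $ref_span"
```

For this read, the 75 aligned bases (58 + 17) are spread over 151 bases of the reference because of the 76-base intron.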
Exercise 4: Count spliced sequences
How many spliced sequences are there in the C1 HISAT2 alignment file (GSM794483_C1.sam)?
Code Block | ||
---|---|---|
| ||
grep -v '^@' hisat_exercise/results/GSM794483_C1.sam|cut -f 6|grep 'N'|wc -l |
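The same count can also be done in a single `awk` pass, which skips the header and tests the CIGAR field directly. The sketch below is only an illustration: it builds a tiny SAM file with made-up records so that it runs on its own, and the names `sam` and `spliced` are ours. To use it on the course data, point `awk` at the HISAT2 SAM file instead.

```shell
# Toy SAM file with made-up records, only so the sketch is self-contained
sam=$(mktemp)
printf '@HD\tVN:1.0\tSO:unsorted\n' > "$sam"
printf 'r1\t0\tchr1\t100\t60\t58M76N17M\t*\t0\t0\t*\t*\n' >> "$sam"
printf 'r2\t0\tchr1\t200\t60\t75M\t*\t0\t0\t*\t*\n' >> "$sam"
printf 'r3\t16\tchr2\t300\t60\t20M500N55M\t*\t0\t0\t*\t*\n' >> "$sam"

# Skip header lines (starting with @) and count records whose CIGAR (field 6) contains N
spliced=$(awk -F'\t' '!/^@/ && $6 ~ /N/ { n++ } END { print n + 0 }' "$sam")
echo "spliced alignments: $spliced"

rm -f "$sam"
```

Either approach counts alignment records rather than unique reads, so a read that is reported more than once with a spliced CIGAR is counted each time.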
...