Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Code Block
languagebash
titleStart an idev session
idev -p normal -m 180 -A UT-2015-05-18 -N 1 -n 68 --reservation=BIO_DATA_week_1

Next set up a directory for these exercises, and copy an indexed BAM file there. This is a renamed version of the yeast_pairedend.sort.bam file from our Alignment workflow exercises.

...

Code Block
languagebash
module load biocontainers  # takes a while
module load samtools

cd $SCRATCH/core_ngs/alignment/samtools
samtools view yeast_pe.sort.bam | head

...

Tip

awk also has a printf function, which can take the standard formatting commands (see https://en.wikipedia.org/wiki/Printf_format_string#Type_field).

Code Block
languagebash
awk 'BEGIN{ printf("%.2f %%\n", 100*6697/547664) }'

Notes:

  • The printf arguments are enclosed in parentheses since it is a true function.
  • The 1st argument is the format string, enclosed in double quotes.
    • The %.2f format specifier says to output a floating point number with 2 digits after the decimal place.
    • The %% format specifier is used to output a single, literal percent sign.
    • Unlike the standard print statement, the printf function does not automaticlly append a newline to the output, so \n is added to the format string here.
  • Remaining arguments to printf are the values to be substituted for the format specifiers.
    • Here just our percentage computation

...

Code Block
languagebash
cd $SCRATCH/core_ngs/alignment/samtools

# count the number of reads mapped to chromosome 2 (chrII)
samtools view -c -F 0x4 yeast_pe.sort.bam chrII

# count the number of reads mapped to chromosomes 1 or M (chrI, chrM)
samtools view -c -F 0x4 yeast_pe.sort.bam chrI chrM

# count the number of reads mapped to chromosomes 1 that overlap coordinates 1000-2000
samtools view -c -F 0x4 yeast_pe.sort.bam chrI:1000-2000

# since there are only 20 reads in the chrI:1000-2000 region, examine them individually
samtools view -F 0x4 yeast_pe.sort.bam chrI:1000-2000

# look at a subset of field for chrI:1000-2000 reads
#   2=flags, 3=contig, 4=start, 5=mapping quality, 6=CIGAR, 9=insert size
samtools view -F 0x4 yeast_pe.sort.bam chrI:1000-2000 | cut -f 2-6,9

...