Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Code Block
languagebash
titleCounting sequences in a FASTQ file
cd $SCRATCH/core_ngs/fastq_prep
zcat Sample_Yeast_L005_R1.cat.fastq.gz | echo "$((`wc -l` / 4))"

...

Warning
titlebash arithmetic is integer valued only

Note that arithmetic in the bash shell is integer valued only, so don't use it for anything that requires decimal places!

A better way to do math

Well, doing math in bash is pretty awful – there has to be something better. There is! It's called awk, which is a powerful scripting language that is easily invoked from the command line.

In the code below we pipe the output from wc -l (number of lines in the FASTQ file) to awk, which executes its body (the statements between the curly braces ( {  } ) for each line of input. Here the input is just one line, with one field – the line count. The awk body just divides the 1st input field ($1) by 4 and writes the result to standard output. (Read more about awk in Advanced commands: awk)

Code Block
languagebash
titleCounting FASTQ sequences with awk
cd $SCRATCH/core_ngs/fastq_prep
zcat Sample_Yeast_L005_R1.cat.fastq.gz | wc -l | awk '{print $1 / 4}'

Processing multiple compressed files

...