...
Code Block |
---|
title | FASTX_toolkit module description |
---|
|
module spider fastx
|
The FASTX toolkit is not available as a module on lonestar5, but it is installed in my /work/01184/daras/bin/ directory. Because that directory is in your PATH, you can run its programs without typing the full path to them.
Code Block |
---|
title | Check that fastx_trimmer is in your path |
---|
|
which fastx_trimmer
echo $PATH
fastx_trimmer -h
|
Let's run fastx_trimmer to trim all input sequences down to 90 bases:
...
- The -l 90 option says that base 90 should be the last base (i.e., trim down to 90 bases).
- The -Q 33 option specifies how base qualities on the 4th line of each fastq entry are encoded. The FASTX toolkit is an older program, written at a time when Illumina base qualities were encoded differently. These days Illumina base qualities follow the Sanger FASTQ standard (Phred score + 33 to make an ASCII character), so the offset of 33 has to be given explicitly.
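To make the Phred + 33 encoding concrete, here is a minimal sketch that converts a single quality character back into its Phred score. The function name phred33 is made up for this illustration and is not part of the FASTX toolkit; it assumes the Sanger (Phred + 33) encoding described above.

```shell
# Decode one FASTQ quality character into a Phred score (assumes Phred+33).
# phred33 is a hypothetical helper name, not a FASTX toolkit program.
phred33() {
    # POSIX printf: a leading quote makes %d print the character's ASCII code
    code=$(printf '%d' "'$1")
    echo $((code - 33))
}

phred33 'I'   # ASCII 73 - 33 = 40
phred33 '5'   # ASCII 53 - 33 = 20
phred33 '#'   # ASCII 35 - 33 = 2
```

If the offset were wrongly taken to be 64, the character 'I' would be read as a score of 9 instead of 40, which is why the encoding has to be stated correctly.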
Exercise: fastx toolkit programs
What other fastx manipulation programs are part of the fastx toolkit?
...
Type fastx_ then tab to see their names
See all the programs like this:
...
title | fastx toolkit programs |
---|
...
Exercise: What if you just want to get rid of reads that are too low in quality?
Code Block |
---|
title | fastq_quality_filter syntax |
---|
|
fastq_quality_filter -q <N> -p <N> -i <inputfile> -o <outputfile>
-q N: Minimum Base quality score
-p N: Minimum percent of bases that must have [-q] quality
|
Let's try it on our data: filter it to keep only reads in which at least 80% of the bases have a quality score of 20 or above.
Code Block |
---|
title | Run fastq_quality_filter |
---|
|
fastq_quality_filter -q 20 -p 80 -i data/Sample1_R1.fastq -Q 33 -o Sample1_R1.filtered.fastq |
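To see what the -q and -p options mean in practice, the rule can be sketched in plain awk. This is only an illustration of the filtering rule as described above, not the toolkit's own code: the function name quality_filter_sketch is hypothetical, and it assumes four-line fastq records with Phred + 33 qualities.

```shell
# Usage: quality_filter_sketch MIN_QUAL MIN_PERCENT < in.fastq > out.fastq
# Hypothetical helper; assumes 4-line records and Phred+33 qualities.
quality_filter_sketch() {
    awk -v q="$1" -v p="$2" '
    BEGIN {
        # awk has no ord(), so build a character to ASCII code lookup table
        for (i = 33; i < 127; i++) ord[sprintf("%c", i)] = i
    }
    { rec[NR % 4] = $0 }              # buffer the lines of the current record
    NR % 4 == 0 {                     # 4th line of a record: the quality string
        n = length($0); good = 0
        for (i = 1; i <= n; i++)
            if (ord[substr($0, i, 1)] - 33 >= q) good++
        # keep the read if at least p percent of its bases reach quality q
        if (n > 0 && good * 100 >= p * n)
            print rec[1] "\n" rec[2] "\n" rec[3] "\n" rec[0]
    }'
}
```

For example, `quality_filter_sketch 20 80 < data/Sample1_R1.fastq` applies the same thresholds as the command above. For real data, use fastq_quality_filter itself; the sketch is only meant to show the rule.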
Code Block |
---|
|
grep '^@HWI' Sample1_R1.trimmed.fastq |wc -l
grep '^@HWI' Sample1_R1.filtered.fastq |wc -l |
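The grep commands above rely on every read name starting with @HWI, which depends on the sequencing instrument. A quality line can also start with the @ character. As an alternative sketch, assuming each fastq record takes exactly four lines, the read count is the line count divided by 4. The function name count_reads is made up for this example.

```shell
# Count reads in a fastq file, assuming exactly 4 lines per record.
# count_reads is a hypothetical helper name used only for this example.
count_reads() {
    # wc -l < file prints only the number of lines, without the file name
    echo $(( $(wc -l < "$1") / 4 ))
}

# Usage:
#   count_reads Sample1_R1.trimmed.fastq
#   count_reads Sample1_R1.filtered.fastq
```

Both approaches should give the same number on these files. The line-count version does not depend on how the read names begin.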
...