...
It has a number of options (see fastqc --help | more), but the simplest invocation takes just a FASTQ file as its argument.
Expand |
---|
|
Code Block |
---|
| # Setup (if needed)
mkdir -p $SCRATCH/core_ngs/fastq_prep
cd $SCRATCH/core_ngs/fastq_prep
cp $CORENGS/misc/small.fq . |
|
Code Block |
---|
language | bash |
---|
title | Running fastqc on a FASTQ file |
---|
|
# make sure you're in your $SCRATCH/core_ngs/fastq_prep directory
cds
cd core_ngs/fastq_prep
fastqc small.fq |
...
Even though multiqc has many options, it is quite easy to create a basic report by just pointing it to the directory where individual reports are located:
Expand |
---|
|
mkdir -p $SCRATCH/core_ngs/multiqc/fqc.atacseq
cd $SCRATCH/core_ngs/multiqc/fqc.atacseq
cp $CORENGS/multiqc/fqc.atacseq/*.zip . |
|
Code Block |
---|
|
cd $SCRATCH/core_ngs/multiqc
multiqc fqc.atacseq |
Exercise: How many reports did multiqc find?
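One independent way to check the answer is to count the FastQC zip archives in the report directory. The sketch below is a minimal illustration that builds a throwaway directory (fqc.toy, with made-up file names) so it runs anywhere; it assumes one zip archive per FastQC report. To use it on the real data, point find at $SCRATCH/core_ngs/multiqc/fqc.atacseq instead.

```bash
# Build a throwaway directory with two fake FastQC archives and one
# unrelated file (all names here are hypothetical)
tmpdir=$(mktemp -d)
mkdir -p "$tmpdir/fqc.toy"
touch "$tmpdir/fqc.toy/sampleA_fastqc.zip" \
      "$tmpdir/fqc.toy/sampleB_fastqc.zip" \
      "$tmpdir/fqc.toy/notes.txt"

# Count only the .zip files directly inside the directory
n_zip=$(find "$tmpdir/fqc.toy" -maxdepth 1 -name '*.zip' | wc -l | tr -d ' ')
echo "zip archives found: $n_zip"
```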
...
The FASTX Toolkit provides a set of command line tools for manipulating both FASTA and FASTQ files. The available modules are described on their website. They include a fast fastx_trimmer utility for trimming FASTQ sequences (and quality score strings) before alignment.
Set up to process the yeast data if you haven't already.
Code Block |
---|
language | bash |
---|
title | Set up directory for working with FASTQs |
---|
|
# Create a $SCRATCH area to work on data for this course,
# with a sub-directory for pre-processing raw fastq files
mkdir -p $SCRATCH/core_ngs/fastq_prep
# Make symbolic links to the original yeast data:
cd $SCRATCH/core_ngs/fastq_prep
ln -s -f /work2/projects/BioITeam/projects/courses/Core_NGS_Tools/yeast_stuff/Sample_Yeast_L005_R1.cat.fastq.gz
ln -s -f /work2/projects/BioITeam/projects/courses/Core_NGS_Tools/yeast_stuff/Sample_Yeast_L005_R2.cat.fastq.gz |
Expand |
---|
title | Make sure you're in an idev session |
---|
|
Make sure you're in an idev session. In an idev session, the hostname command displays a compute node name like c455-021.ls6.tacc.utexas.edu; on a login node it displays something like login3.ls6.tacc.utexas.edu. If you're on a login node, start an idev session like this:
Code Block |
---|
language | bash |
---|
title | Start an idev session |
---|
| idev -m 120 -N 1 -A OTH21164 -r CoreNGSday3 |
|
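The hostname check described above can be wrapped in a small function. This is a minimal sketch: node_type is a hypothetical helper, and it assumes that login node names always start with "login", as in the example hostnames; anything else is treated as a compute node.

```bash
# Classify a hostname as "login" or "compute".
# Assumption: login node names start with "login"; every other
# name is treated as a compute node.
node_type() {
    case "$1" in
        login*) echo "login" ;;
        *)      echo "compute" ;;
    esac
}

# Report on the node this shell is running on
echo "This shell is on a $(node_type "$(hostname)") node"
```

If the function reports a login node, start an idev session before running any of the tools.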
FASTX Toolkit is available as a BioContainers module.
Code Block |
---|
|
module load biocontainers # takes a while
module spider fastx
module load fastxtools
|
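After loading the modules, it can help to confirm that the tool is actually on your PATH before building a pipeline around it. The sketch below uses the shell built-in command -v; have_tool is a hypothetical helper name, not something the module provides.

```bash
# have_tool NAME: succeed if NAME resolves to a command in this shell
# (hypothetical helper, defined here for illustration only)
have_tool() {
    command -v "$1" > /dev/null 2>&1
}

if have_tool fastx_trimmer; then
    echo "fastx_trimmer found at: $(command -v fastx_trimmer)"
else
    echo "fastx_trimmer not found; load the biocontainers and fastxtools modules first"
fi
```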
Here's an example of how to run fastx_trimmer to trim all input sequences down to 50 bases. By default the program reads its input data from standard input and writes trimmed sequences to standard output:
Expand |
---|
|
|
...
Set up directory for working with FASTQs |
| # Create a $SCRATCH area to work on data for this course,
# with a sub-directory for pre-processing raw fastq files
mkdir -p $SCRATCH/core_ngs/fastq_prep
# Make symbolic links to the original yeast data:
cd $SCRATCH/core_ngs/fastq_prep
ln -s -f $CORENGS/yeast_stuff/Sample_Yeast_L005_R1.cat.fastq.gz
ln -s -f $CORENGS/yeast_stuff/Sample_Yeast_L005_R2.cat.fastq.gz |
|
Code Block |
---|
language | bash |
---|
title | Trimming FASTQ sequences to 50 bases with fastx_trimmer |
---|
|
# make sure you're in your $SCRATCH/core_ngs/fastq_prep directory
cd $SCRATCH/core_ngs/fastq_prep
zcat Sample_Yeast_L005_R1.cat.fastq.gz | fastx_trimmer -l 50 -Q 33 > trim50_R1.fq
|
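A quick sanity check on trimmed output is to look at the read lengths. The sketch below is illustrative only: it builds a tiny made-up FASTQ (toy.fq) and uses awk, not fastx_trimmer, to mimic trimming to at most 50 bases, so it runs without the module loaded. The repeat_char helper and all file names are assumptions for the example. The same length check can be pointed at trim50_R1.fq.

```bash
# Work in a throwaway directory so nothing in fastq_prep is touched
tmpdir=$(mktemp -d)

# repeat_char CHAR N: print CHAR N times (hypothetical helper)
repeat_char() {
    awk -v c="$1" -v n="$2" 'BEGIN { for (i = 0; i < n; i++) printf "%s", c }'
}

# Toy FASTQ with one 60-base read and one 40-base read
{
    printf '@read1\n%s\n+\n%s\n' "$(repeat_char A 60)" "$(repeat_char I 60)"
    printf '@read2\n%s\n+\n%s\n' "$(repeat_char C 40)" "$(repeat_char I 40)"
} > "$tmpdir/toy.fq"

# awk stand-in for trimming: lines 2 and 4 of each 4-line record are the
# sequence and quality strings; keep at most their first 50 characters
awk 'NR % 4 == 2 || NR % 4 == 0 { print substr($0, 1, 50); next } { print }' \
    "$tmpdir/toy.fq" > "$tmpdir/toy.trim50.fq"

# Count records and find the longest remaining read
n_reads=$(awk 'END { print NR / 4 }' "$tmpdir/toy.trim50.fq")
max_len=$(awk 'NR % 4 == 2 && length($0) > max { max = length($0) } END { print max + 0 }' \
    "$tmpdir/toy.trim50.fq")
echo "reads: $n_reads, longest read: $max_len"
```

In this stand-in the 60-base read is cut to 50 bases and the 40-base read passes through unchanged, so the read count stays the same and only the maximum length changes.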
...
Now execute cutadapt like this:
Expand |
---|
|
Code Block |
---|
language | bash |
---|
title | Setup for cutadapt on miRNA FASTQ |
---|
| mkdir -p $SCRATCH/core_ngs/fastq_prep
cd $SCRATCH/core_ngs/fastq_prep
cp $CORENGS/human_stuff/miRNA_test.fq .
|
|
Code Block |
---|
language | bash |
---|
title | Cutadapt batch command for R1 FASTQ |
---|
|
cutadapt -m 20 -a AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC miRNA_test.fq 2> miRNA_test.cuta.log | gzip > miRNA_test.cutadapt.fq.gz |
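Because -m 20 discards reads shorter than 20 bases after adapter removal, comparing read counts before and after trimming is a useful check alongside the report in miRNA_test.cuta.log. The sketch below is a minimal illustration on a made-up three-read file (toy.fq) so it runs without cutadapt; count_reads is a hypothetical helper that assumes four lines per FASTQ record. On the real data, call it on miRNA_test.fq and miRNA_test.cutadapt.fq.gz.

```bash
# Build a three-record toy FASTQ and a gzipped copy of it
tmpdir=$(mktemp -d)
printf '@r1\nACGT\n+\nIIII\n@r2\nTTGA\n+\nIIII\n@r3\nGGCC\n+\nIIII\n' > "$tmpdir/toy.fq"
gzip -c "$tmpdir/toy.fq" > "$tmpdir/toy.fq.gz"

# count_reads FILE: number of FASTQ records in a plain or gzipped file,
# assuming exactly 4 lines per record (hypothetical helper)
count_reads() {
    case "$1" in
        *.gz) gzip -dc "$1" ;;   # same effect as zcat
        *)    cat "$1" ;;
    esac | awk 'END { print NR / 4 }'
}

echo "plain:   $(count_reads "$tmpdir/toy.fq") reads"
echo "gzipped: $(count_reads "$tmpdir/toy.fq.gz") reads"
```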
...