...

It has a number of options (see fastqc --help | more) but can be run very simply with just a FASTQ file as its argument.

Expand
titleSetup (if needed)


Code Block
languagebash
# Setup (if needed)
mkdir -p $SCRATCH/core_ngs/fastq_prep
cd $SCRATCH/core_ngs/fastq_prep
cp $CORENGS/misc/small.fq .



Code Block
languagebash
titleRunning fastqc on a FASTQ file
# make sure you're in your $SCRATCH/core_ngs/fastq_prep directory
cds
cd core_ngs/fastq_prep
fastqc small.fq

...

Even though multiqc has many options, it is quite easy to create a basic report by just pointing it to the directory where individual reports are located:

Expand
titleSetup (if needed)


Code Block
languagebash
mkdir -p $SCRATCH/core_ngs/multiqc/fqc.atacseq
cd $SCRATCH/core_ngs/multiqc/fqc.atacseq
cp $CORENGS/multiqc/fqc.atacseq/*.zip .



Code Block
languagebash
cd $SCRATCH/core_ngs/multiqc
multiqc fqc.atacseq

Exercise: How many reports did multiqc find?
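
As a cross-check on the number multiqc prints, you can count the FastQC zip archives in the directory it scanned. The sketch below is only an illustration: count_zips is a hypothetical helper name, and the demo runs on a throwaway directory of empty placeholder files rather than on real FastQC output.

```bash
# Count the .zip archives directly inside a directory.
count_zips() {
    find "$1" -maxdepth 1 -type f -name '*.zip' | wc -l | tr -d ' '
}

# Demo on a throwaway directory with three empty placeholder files
# (hypothetical names, standing in for real FastQC output).
demo=$(mktemp -d)
touch "$demo/a_fastqc.zip" "$demo/b_fastqc.zip" "$demo/c_fastqc.zip" "$demo/notes.txt"
count_zips "$demo"
```

On the course data you would call it as count_zips $SCRATCH/core_ngs/multiqc/fqc.atacseq and compare the result with what multiqc reports.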

...

The FASTX Toolkit provides a set of command line tools for manipulating both FASTA and FASTQ files. The available modules are described on their website. They include a fast fastx_trimmer utility for trimming FASTQ sequences (and quality score strings) before alignment.

Set up to process the yeast data if you haven't already.

Code Block
languagebash
titleSet up directory for working with FASTQs
# Create a $SCRATCH area to work on data for this course,
# with a sub-directory for pre-processing raw fastq files
mkdir -p $SCRATCH/core_ngs/fastq_prep

# Make symbolic links to the original yeast data:
cd $SCRATCH/core_ngs/fastq_prep
ln -s -f /work2/projects/BioITeam/projects/courses/Core_NGS_Tools/yeast_stuff/Sample_Yeast_L005_R1.cat.fastq.gz
ln -s -f /work2/projects/BioITeam/projects/courses/Core_NGS_Tools/yeast_stuff/Sample_Yeast_L005_R2.cat.fastq.gz

Expand
titleMake sure you're in an idev session

Make sure you're in an idev session. On a compute node (inside an idev session), the hostname command displays a name like c455-021.ls6.tacc.utexas.edu. On a login node, it displays something like login3.ls6.tacc.utexas.edu.

If you're on a login node, start an idev session like this:

Code Block
languagebash
titleStart an idev session
idev -m 120 -N 1 -A OTH21164 -r CoreNGSday3
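
The hostname check described above can be sketched as a small shell function. This is a minimal illustration, assuming that login nodes are always named login<N>.* as in the example; node_kind is a hypothetical helper name, not a TACC command.

```bash
# Classify a hostname as a login node or a compute node.
# Assumption: login node names start with "login", as in the example above;
# anything else is treated as a compute node.
node_kind() {
    case "$1" in
        login*) echo "login" ;;
        *)      echo "compute" ;;
    esac
}

kind=$(node_kind "$(hostname)")
echo "This is a $kind node"
if [ "$kind" = "login" ]; then
    echo "Start an idev session before running analysis commands"
fi
```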


FASTX Toolkit is available as a BioContainers module.

Code Block
languagebash
module load biocontainers  # takes a while
module spider fastx
module load fastxtools

Here's an example of how to run fastx_trimmer to trim all input sequences down to 50 bases. By default the program reads its input data from standard input and writes trimmed sequences to standard output:

Expand
titleSetup (if needed)


...

Set up directory for working with FASTQs
# Create a $SCRATCH area to work on data for this course,
# with a sub-directory for pre-processing raw fastq files
mkdir -p $SCRATCH/core_ngs/fastq_prep

# Make symbolic links to the original yeast data:
cd $SCRATCH/core_ngs/fastq_prep
ln -s -f $CORENGS/yeast_stuff/Sample_Yeast_L005_R1.cat.fastq.gz
ln -s -f $CORENGS/yeast_stuff/Sample_Yeast_L005_R2.cat.fastq.gz



Code Block
languagebash
titleTrimming FASTQ sequences to 50 bases with fastx_trimmer
# make sure you're in your $SCRATCH/core_ngs/fastq_prep directory
cd $SCRATCH/core_ngs/fastq_prep
zcat Sample_Yeast_L005_R1.cat.fastq.gz | fastx_trimmer -l 50 -Q 33 > trim50_R1.fq
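
To see what trimming to 50 bases does to a FASTQ record, here is a rough stand-in written with awk rather than fastx_trimmer itself. It is only a sketch: it assumes simple 4-line FASTQ records (no wrapped sequence lines), trim_fastq is a hypothetical helper name, and the one-read input is made-up data.

```bash
# Build a tiny one-read FASTQ (made-up data) with a 60-base sequence.
seq60=$(printf 'ACGT%.0s' 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15)
qual60=$(printf 'IIII%.0s' 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15)
tmp=$(mktemp -d)
printf '@read1\n%s\n+\n%s\n' "$seq60" "$qual60" > "$tmp/demo.fq"

# Keep the header (line 1) and separator (line 3) unchanged; cut the
# sequence (line 2) and quality string (line 4) to the first N characters.
trim_fastq() {
    awk -v len="$1" 'NR % 2 == 0 { print substr($0, 1, len); next } { print }'
}

trim_fastq 50 < "$tmp/demo.fq" > "$tmp/demo.trim50.fq"

# Print the sequence length before and after trimming.
awk 'NR % 4 == 2 { print length($0) }' "$tmp/demo.fq"
awk 'NR % 4 == 2 { print length($0) }' "$tmp/demo.trim50.fq"
```

The sequence and its quality string are cut to the same length, so each base keeps its matching quality score.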

...

Now execute cutadapt like this:

Expand
titleSetup (if needed)


Code Block
languagebash
titleSetup for cutadapt on miRNA FASTQ
mkdir -p $SCRATCH/core_ngs/fastq_prep
cd $SCRATCH/core_ngs/fastq_prep
cp $CORENGS/human_stuff/miRNA_test.fq .



Code Block
languagebash
titleCutadapt batch command for R1 FASTQ
cutadapt -m 20 -a AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC miRNA_test.fq 2> miRNA_test.cuta.log | gzip > miRNA_test.cutadapt.fq.gz
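
Because -m 20 discards reads shorter than 20 bases after adapter removal, the output file can hold fewer reads than the input. One way to check is to count reads in the compressed output. The sketch below assumes 4-line FASTQ records; count_reads is a hypothetical helper name, and the demo uses a made-up two-read file instead of the course data.

```bash
# Count reads in a gzipped FASTQ file (4 lines per read).
count_reads() {
    gzip -dc "$1" | awk 'END { print NR / 4 }'
}

# Demo on a made-up two-read file.
tmp=$(mktemp -d)
printf '@r1\nACGT\n+\nIIII\n@r2\nTTGA\n+\nIIII\n' | gzip > "$tmp/demo.fq.gz"
count_reads "$tmp/demo.fq.gz"
```

On the course data you would run count_reads miRNA_test.cutadapt.fq.gz and compare the result with the line count of miRNA_test.fq divided by 4. The summary cutadapt wrote to miRNA_test.cuta.log is another place to look.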

...