...

It has a number of options (see fastqc --help | more) but can be run very simply with just a FASTQ file as its argument.

Expand

title	Setup (if needed)

Code Block

language	bash

# Setup (if needed)
mkdir -p $SCRATCH/core_ngs/fastq_prep
cd $SCRATCH/core_ngs/fastq_prep
cp $CORENGS/misc/small.fq .

Code Block

language	bash
title	Running fastqc on a FASTQ file

# make sure you're in your $SCRATCH/core_ngs/fastq_prep directory
cds
cd core_ngs/fastq_prep
fastqc small.fq

...

Even though multiqc has many options, it is quite easy to create a basic report by just pointing it to the directory where individual reports are located:

Expand

title	Setup (if needed)

Code Block

language	bash

# Create the working directory and copy in the FastQC result archives
mkdir -p $SCRATCH/core_ngs/multiqc/fqc.atacseq
cd $SCRATCH/core_ngs/multiqc/fqc.atacseq
cp $CORENGS/multiqc/fqc.atacseq/*.zip .

Code Block

language	bash

cd $SCRATCH/core_ngs/multiqc
multiqc fqc.atacseq

Exercise: How many reports did multiqc find?
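
One way to cross-check the number multiqc reports is to count the FastQC result archives it was pointed at. This is a minimal sketch, assuming each sample contributed exactly one .zip file to the fqc.atacseq directory (as the setup step above suggests); the count_fastqc_zips helper is a hypothetical name, not part of multiqc.

```shell
# Hypothetical helper: count the .zip files directly inside a directory.
# Assumption: each FastQC report that multiqc reads is one .zip archive.
count_fastqc_zips() {
  find "$1" -maxdepth 1 -type f -name '*.zip' | wc -l | tr -d ' '
}

# Only run the count if the directory from the setup step is present
if [ -d fqc.atacseq ]; then
  echo "zip archives in fqc.atacseq: $(count_fastqc_zips fqc.atacseq)"
fi
```

Run it from $SCRATCH/core_ngs/multiqc and compare the result with the count multiqc prints while it searches the directory.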

...

The FASTX Toolkit provides a set of command line tools for manipulating both FASTA and FASTQ files. The available modules are described on their website. They include a fast fastx_trimmer utility for trimming FASTQ sequences (and quality score strings) before alignment.

Set up to process the yeast data if you haven't already.

Code Block

language	bash
title	Set up directory for working with FASTQs

# Create a $SCRATCH area to work on data for this course,
# with a sub-directory for pre-processing raw fastq files
mkdir -p $SCRATCH/core_ngs/fastq_prep

# Make symbolic links to the original yeast data:
cd $SCRATCH/core_ngs/fastq_prep
ln -s -f /work2/projects/BioITeam/projects/courses/Core_NGS_Tools/yeast_stuff/Sample_Yeast_L005_R1.cat.fastq.gz
ln -s -f /work2/projects/BioITeam/projects/courses/Core_NGS_Tools/yeast_stuff/Sample_Yeast_L005_R2.cat.fastq.gz

Expand

title	Make sure you're in an idev session

Make sure you're in an idev session. If you're in an idev session, the hostname command will display a name like c455-021.ls6.tacc.utexas.edu. But if you're on a login node the hostname will be something like login3.ls6.tacc.utexas.edu.

If you're on a login node, start an idev session like this:

Code Block

language	bash
title	Start an idev session

idev -m 120 -N 1 -A OTH21164 -r CoreNGSday3
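
The hostname check described above can also be scripted. This is a minimal sketch, assuming login nodes always have hostnames that begin with "login", as in the example names; the node_type helper is a hypothetical name, not a TACC tool.

```shell
# Hypothetical helper: classify a hostname by its prefix.
# Assumption: login nodes are named login<N>...; any other name is
# reported as "not-login" (for example a compute node such as c455-021).
node_type() {
  case "$1" in
    login*) echo "login" ;;
    *)      echo "not-login" ;;
  esac
}

# Classify the machine this shell is running on
node_type "$(hostname 2>/dev/null)"
```

If this prints "login", start an idev session before running the tools below.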

FASTX Toolkit is available as a BioContainers module.

Code Block

language	bash

module load biocontainers  # takes a while
module spider fastx
module load fastxtools

Here's an example of how to run fastx_trimmer to trim all input sequences down to 50 bases. By default the program reads its input data from standard input and writes trimmed sequences to standard output:

Expand

title	Setup (if needed)

...

Set up directory for working with FASTQs

# Create a $SCRATCH area to work on data for this course,
# with a sub-directory for pre-processing raw fastq files
mkdir -p $SCRATCH/core_ngs/fastq_prep

# Make symbolic links to the original yeast data:
cd $SCRATCH/core_ngs/fastq_prep
ln -s -f $CORENGS/yeast_stuff/Sample_Yeast_L005_R1.cat.fastq.gz
ln -s -f $CORENGS/yeast_stuff/Sample_Yeast_L005_R2.cat.fastq.gz

Code Block

language	bash
title	Trimming FASTQ sequences to 50 bases with fastx_trimmer

# make sure you're in your $SCRATCH/core_ngs/fastq_prep directory
cd $SCRATCH/core_ngs/fastq_prep
zcat Sample_Yeast_L005_R1.cat.fastq.gz | fastx_trimmer -l 50 -Q 33 > trim50_R1.fq
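
To confirm that the trimming behaved as expected, you can check the longest sequence in the output. This is a minimal sketch using only awk, assuming a standard 4-line-per-record FASTQ file; the max_read_len helper is a hypothetical name, not part of the FASTX Toolkit.

```shell
# Hypothetical helper: print the length of the longest sequence line in a
# FASTQ stream read from standard input.
# Assumption: 4 lines per record, with the sequence on line 2 of each record.
max_read_len() {
  awk 'NR % 4 == 2 { if (length($0) > max) max = length($0) }
       END { print max + 0 }'
}

# Only check the trimmed file if it exists in the current directory
if [ -f trim50_R1.fq ]; then
  echo "longest read in trim50_R1.fq: $(max_read_len < trim50_R1.fq)"
fi
```

After trimming with -l 50, the reported length should not exceed 50.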

...

Now execute cutadapt like this:

Expand

title	Setup (if needed)

Code Block

language	bash
title	Setup for cutadapt on miRNA FASTQ

mkdir -p $SCRATCH/core_ngs/fastq_prep
cd $SCRATCH/core_ngs/fastq_prep
cp $CORENGS/human_stuff/miRNA_test.fq .

Code Block

language	bash
title	Cutadapt batch command for R1 FASTQ

cutadapt -m 20 -a AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC miRNA_test.fq 2> miRNA_test.cuta.log | gzip > miRNA_test.cutadapt.fq.gz
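
Because -m 20 discards reads that are shorter than 20 bases after adapter removal, the output can contain fewer reads than the input. This is a minimal sketch for comparing the two counts, assuming 4-line FASTQ records; the count_reads helper is a hypothetical name, not part of cutadapt.

```shell
# Hypothetical helper: count FASTQ records in a plain or gzip-compressed file.
# Assumption: every record occupies exactly 4 lines.
count_reads() {
  case "$1" in
    *.gz) gzip -dc "$1" ;;
    *)    cat "$1" ;;
  esac | awk 'END { print NR / 4 }'
}

# Compare input and output only if both files from the steps above exist
if [ -f miRNA_test.fq ] && [ -f miRNA_test.cutadapt.fq.gz ]; then
  echo "input reads:  $(count_reads miRNA_test.fq)"
  echo "output reads: $(count_reads miRNA_test.cutadapt.fq.gz)"
fi
```

The summary that cutadapt wrote to miRNA_test.cuta.log reports similar totals, so the two can be compared.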

...
