MAINTENANCE OUTAGE: The University Wiki Service will undergo maintenance on Tuesday, September 25th, from 6:00 pm to 8:00 pm.
During this 2 hour time period may be unavailable.
Users are advised to save content locally that may be needed during this time and to otherwise save all edits as unsaved work may be lost.
Skip to end of metadata
Go to start of metadata

Demultiplexing ddRAD data with Stacks exercise:

This exercise should be done on TACC.

First copy over the exercise directory:

copy over directory
#start an idev session if you have not already

#go to coure directory on scratch
cd rad_intro/

#copy over the demultiplexing directory
cp -r /work/02260/grovesd/lonestar/intro_to_rad_2017/demultiplexing/process_ddRAD_stacks .

#go to the demultiplexing with stacks exercise directory
cd process_ddRAD_stacks

Check out the reads to make sure they make sense:

check out reads
#we have two fastq files: Lib01_R1.fastq and Lib01_R2.fastq
ls *.fastq
#These are paired end reads, two separate reads from either end of the same DNA fragment
#Double-check that the files have the same number of reads. (note that fastq genreally files have 4 lines per read)
expr $(cat Lib01_R1.fastq | wc -l) / 4
expr $(cat Lib01_R2.fastq | wc -l) / 4
#(Note these are for demo purposes and shorter than they should be)
#this ddRAD library was prepared so that forward reads were cut with the nlaIII restriction enzyme.
#Looking up nlaIII, we see it's cut site:
#     CATG'
#    'GTAC
#check if we see this cut site in the forward reads
grep CATG Lib01_R1.fastq | wc -l
#compare this with the total number of reads we got above
#do these numbers make sense?
#look at where in the reads the cut site is (may help to paste into a text editor and character search)
grep CATG Lib01_R1.fastq | head -n 30

process the reads using process_radtags (part of the Stacks package):

process tags
#make a directory to put the resulting sample fastq files into
mkdir sample_fastqs

#look at the documentation for process_radtags
./process_radtags -h
#execute the command to process the rad data
./process_radtags -i 'fastq' -1 Lib01_R1.fastq -2 Lib01_R2.fastq -o ./sample_fastqs/ -b barcodes_Lib1.tsv --inline_index -e 'nlaIII' -r --disable_rad_check

Check the results:

check results
#how many barcode combinations did we have?
#First look at the barcodes file:
cat barcodes_Lib1.tsv 
#then count the lines
cat barcodes_Lib1.tsv | wc -l
#how r1 and r2 fastq files did we output (ignore the 'rem' files)
ls sample_fastqs/*AGCGAC.1.fq
ls sample_fastqs/*AGCGAC.1.fq
ls sample_fastqs/*AGCGAC.1.fq | wc -l
ls sample_fastqs/*AGCGAC.1.fq | wc -l
#are the paired end files still the same length?
cat sample_fastqs/sample_CATAT-AGCGAC.1.fq | wc -l
cat sample_fastqs/sample_CATAT-AGCGAC.2.fq | wc -l
  • No labels