This exercise demonstrates de novo locus generation with the 2bRAD de novo pipeline.

For full documentation of the 2bRAD de novo pipeline, see its GitHub page.

The pipeline is very similar to the one implemented in Stacks (Catchen et al. 2011). The figure panels cited in the comments below (Fig1A-D) refer to that paper.

#start idev session if you need to
idev

#copy over the exercise directory
cds
cd rad_intro/
cp -r /work/02260/grovesd/lonestar/intro_to_rad_2017/mapping/denovo_2bRAD .
cd denovo_2bRAD/

#look at starting trimmed fastq files
ls *.trim

		sampleA.trim  sampleB.trim  sampleC.trim

#run uniquerOne.pl
#finds the unique RAD tags in each fastq file
#(this is analogous to building 'stacks' in Stacks; Fig1A, Catchen et al. 2011)

uniquerOne.pl sampleA.trim > sampleA.trim.uni
uniquerOne.pl sampleB.trim > sampleB.trim.uni
uniquerOne.pl sampleC.trim > sampleC.trim.uni
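
#what "uniquing" means, as a toy sketch: collapse identical reads and count them
#(illustration only, assuming plain 4-line fastq records; uniquerOne.pl itself may
#do more than this and has its own output format, so this is not a replacement;
#count_unique_reads and the demo reads are made up for the example)

```shell
# count_unique_reads FILE
# print "count<TAB>sequence" for each distinct read, most abundant first
count_unique_reads() {
    # line 2 of every 4-line record is the sequence; collapse identical ones,
    # count them, then sort by count (ties broken by sequence)
    awk 'NR % 4 == 2' "$1" | sort | uniq -c | sort -k1,1nr -k2,2 |
        awk '{ print $1 "\t" $2 }'
}

# toy input: three reads, two of them identical (hypothetical sequences)
demo_fq=$(mktemp)
printf '@r1\nACGT\n+\nIIII\n@r2\nACGT\n+\nIIII\n@r3\nTTGA\n+\nIIII\n' > "$demo_fq"

# prints ACGT with a count of 2, then TTGA with a count of 1
count_unique_reads "$demo_fq"
```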

#merge the uniqued files
#(Fig1B, Catchen et al. 2011)
mergeUniq.pl uni minDP=2 >mydataMerged.uniq

#generates a merged set of unique tags:
		mergedUniqTags.fasta

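#cluster the tags, allowing for up to 3 mismatches (-c 0.91); the most abundant sequence in each cluster becomes its reference
#(this is equivalent to calling loci; Fig1C-D, Catchen et al. 2011)
#-aL 1 -aS 1 require the alignment to cover the full length of both sequences
#-g 1 assigns each tag to its most similar cluster; -M 0 lifts the memory limit; -T 0 uses all available cores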
module load cd-hit
cd-hit-est -i mergedUniqTags.fasta -o cdh_alltags.fas -aL 1 -aS 1 -g 1 -c 0.91 -M 0 -T 0
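
#where the "3 mismatches" comes from, sketched as arithmetic
#(assumes 36-base tags, which you should check against your own read length,
#and identity = matching bases / tag length for a full-length alignment;
#max_mismatches is a hypothetical helper, not part of cd-hit or the pipeline)

```shell
# max_mismatches LENGTH IDENTITY
# print the largest number of mismatches that still meets the identity cutoff
max_mismatches() {
    awk -v len="$1" -v c="$2" 'BEGIN {
        k = 0
        # allow one more mismatch for as long as identity stays >= cutoff
        while (k < len && (len - k - 1) / len >= c) k++
        print k
    }'
}

# 33/36 = 0.917 passes the 0.91 cutoff, 32/36 = 0.889 does not
max_mismatches 36 0.91
```

#raising -c makes clustering stricter: under the same assumptions,
#max_mismatches 36 0.95 allows a single mismatch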

#now we have called de novo loci based on the tags
#assemble them into an artificial reference for re-mapping and genotyping
concatFasta.pl fasta=cdh_alltags.fas num=8
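
#a quick sanity check, sketched on a toy file: count the records in a fasta file
#(count_fasta_records and the demo sequences are made up for this example)

```shell
# count_fasta_records FILE
# print the number of sequences, i.e. lines starting with ">"
count_fasta_records() {
    grep -c '^>' "$1"
}

# toy fasta with three records (hypothetical tags)
demo_fa=$(mktemp)
printf '>tag1\nACGT\n>tag2\nTTGA\n>tag3\nGGCC\n' > "$demo_fa"

# prints 3
count_fasta_records "$demo_fa"
```

#on the real files, count_fasta_records cdh_alltags.fas gives the number of de novo loci;
#count_fasta_records cdh_alltags_cc.fasta should give the number of concatenated sequences
#(assumed here to be what num=8 controls; check the concatFasta.pl documentation)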

#index the artificial reference with bowtie2
module load bowtie
bowtie2-build cdh_alltags_cc.fasta cdh_alltags_cc.fasta

#now map the reads back to the artificial reference
bowtie2 --no-unal -x cdh_alltags_cc.fasta -U sampleC.trim -S sampleC.trim.bt2.sam
bowtie2 --no-unal -x cdh_alltags_cc.fasta -U sampleB.trim -S sampleB.trim.bt2.sam
bowtie2 --no-unal -x cdh_alltags_cc.fasta -U sampleA.trim -S sampleA.trim.bt2.sam
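
#with more samples, the mapping commands can be generated by a loop instead of typed out
#(a minimal sketch; print_map_cmds is a hypothetical helper, and it only prints
#the commands so they can be inspected before anything is run)

```shell
# print_map_cmds REFERENCE [DIR]
# print one bowtie2 command per *.trim file found in DIR (default: current directory)
print_map_cmds() {
    ref="$1"
    dir="${2:-.}"
    for f in "$dir"/*.trim; do
        [ -e "$f" ] || continue    # no *.trim files: skip the unexpanded pattern
        printf 'bowtie2 --no-unal -x %s -U %s -S %s.bt2.sam\n' "$ref" "$f" "$f"
    done
}

# toy directory with two empty placeholder files
demo_dir=$(mktemp -d)
touch "$demo_dir/sampleA.trim" "$demo_dir/sampleB.trim"

print_map_cmds cdh_alltags_cc.fasta "$demo_dir"
```

#once the printed commands look right, they can be run with:
#print_map_cmds cdh_alltags_cc.fasta . | sh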

#The alignment files can now be used for whichever genotyping method you prefer
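
#before genotyping, it can help to check how many reads mapped in each sample
#(a sketch on a toy SAM file; because --no-unal drops unaligned reads, every
#non-header line is an aligned read; count_sam_records and the demo records are made up)

```shell
# count_sam_records FILE
# print the number of alignment records, i.e. lines that are not "@" header lines
count_sam_records() {
    grep -vc '^@' "$1"
}

# toy SAM file: two header lines followed by two alignments (hypothetical reads)
demo_sam=$(mktemp)
printf '@HD\tVN:1.0\tSO:unsorted\n@SQ\tSN:chr1\tLN:100\n' > "$demo_sam"
printf 'r1\t0\tchr1\t5\t42\t4M\t*\t0\t0\tACGT\tIIII\n' >> "$demo_sam"
printf 'r2\t16\tchr1\t40\t42\t4M\t*\t0\t0\tTTGA\tIIII\n' >> "$demo_sam"

# prints 2
count_sam_records "$demo_sam"
```

#on the real output: count_sam_records sampleA.trim.bt2.sam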