You are viewing an old version of this page. View the current version.

Compare with Current View Page History

« Previous Version 5 Next »

Idev session and getting data


Setting up
ssh <username>@lonestar.tacc.utexas.edu
 
cds
cp -r /corral-repl/utexas/BioITeam/short_rnaseq_course/ my_short_rnaseq_course 

cd my_short_rnaseq_course/

cp -r /corral-repl/utexas/BioITeam/short_rnaseq_course/ my_short_rnaseq_course Step 1: Evaluate Raw Data

 

Count and Fastqc
head data/Sample1_R1.fastq 

wc -l data/Sample1_R1.fastq 
grep -c '^@HWI'  data/Sample1_R1.fastq 


module spider fastqc
module load fastqc
fastqc -h
fastqc data/Sample1_R1.fastq

 

Look at Fastqc results:

Ideal dataset: http://www.bioinformatics.babraham.ac.uk/projects/fastqc/good_sequence_short_fastqc.html

Our dataset: http://web.corral.tacc.utexas.edu/BioITeam/rnaseq_course/fastqc_exercise/Sample1_R1_fastqc/fastqc_report.html

 

manipulate fastq files- quality trimming and filtering
module spider fastx
module load fastx_toolkit
fastx_trimmer -i data/Sample1_R1.fastq -l 90 -Q 33 -o Sample1_R1.trimmed.fastq
fastq_quality_filter -q 20 -p 80 -i data/Sample1_R1.fastq -Q 33 -o Sample1_R1.filtered.fastq


grep -c '^@HWI' results/Sample1_R1.trimmed.fastq 
grep -c '^@HWI' results/Sample1_R1.filtered.fastq 
manipulate fastq files- adaptor trimming
fastx_clipper -h
/corral-repl/utexas/BioITeam/bin/cutadapt-1.3/bin/cutadapt -m 22 -a GATCGGAAGAGCACACGTCTGAACTCCAGTCAC Sample1_R1.fastq
/corral-repl/utexas/BioITeam/bin/cutadapt-1.3/bin/cutadapt -m 22 -a TGATCGTCGGACTGTAGAACTCTGAACGTGTAGA Sample1_R1.fastq 


Step 1: Map Raw Data to Reference, Assess results

Map with BWA
module spider bwa
module load bwa/0.7.7


bwa index -a bwtsw reference/genome.fa
bwa mem
bwa mem reference/genome.fa data/GSM794483_C1_R1_1.fq data/GSM794483_C1_R1_2.fq > C1_R1.mem.sam

Map with Tophat
module spider tophat
module load tophat/2.0.10

tophat -p 2 -G reference/genes.gtf -o C1_R1_thout reference/genome.fa data/GSM794483_C1_R1_1.fq data/GSM794483_C1_R1_2.fq

 

Assess Mapping Results
module load samtools

#SYNTAX:
samtools view -b -S samfile > bamfile
samtools sort bamfile sortedbamfile
samtools index sortedbamfile


#BWA RESULTS
samtools flagstat bwa_results/C1_R1.mem.bam
samtools idxstats bwa_results/C1_R1.mem.bam


#SAMTOOLS RESULTS
samtools flagstat tophat_results/C1_R1_thout/accepted_hits.bam
head tophat_results/C1_R1_thout/accepted_hits.sam
cut -f 6 tophat_results/C1_R1_thout/accepted_hits.sam|grep 'N'|head
#DONT RUN: cut -f 6 tophat_results/C1_R1_thout/accepted_hits.sam|grep 'N'|wc -l

 

 

  • No labels