Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Now that you've done everything the hard way, let's see how to do run an alignment pipeline using a BWA alignment script maintained by the BioITeam

First the setup, to work in a new directory:

Code Block
languagebash
mkdir -p $SCRATCH/core_ngs/alignment/bwa_script
cd $SCRATCH/core_ngs/alignment/bwa_script
ln -s -f ../../fastq

The BioITeam BWA alignment script is /work/projects/BioITeam/common/script/align_bwa_illumina.sh. Type in the script name to see its usage.

...

  1. aln_mode – whether to perform a global or local alignment (the 1st argument must be one of those words)
    • global mode uses the bwa aln workflow as we did above
    • local mode uses the bwa mem command
  2. FASTQ file – full or relative path to the FASTQ file (just the R1 fastq if paired end). Can be compressed (.gz)
  3. output prefix – prefix for all the output files produced by the script. Should relate back to what the data is.
  4. assembly – genome assembly to use.
    • there are pre-built indexes for some common eukaryotes (hg19, hg18, mm10, mm9, danRer7, sacCer3) that you can use
    • or provide a full path for a bwa reference index you have built somewhere
  5. paired flag – 0 means single end (the default); 1 means paired end
  6. hard trim length – if you want the FASTQ hard trimmed down to a specific length before alignment, supply that number here

We're going to run this script and a similar Bowtie2 alignment script, on the yeast data using the TACC batch system. Copy over the commands file and take a look at itand submit the batch job:


Code Block
languagebash
mkdir -p $SCRATCH/core_ngs/alignment/bwa_script
cd $SCRATCH/core_ngs/alignment/bwa_script
ln -s -f ../../fastq
cp $CORENGS/tacc/aln_script.cmds .
cat alnlauncher_scriptmaker.cmdspy


Code Block
languagebash
titleCopy the commands file
cd $SCRATCH/core_ngs/alignment/bwa_script
cp $CORENGS/tacc/aln_script.cmds .
cat aln_script.cmds

While we're waiting

Code Block
languagebash
titleCommands to run multiple alignments
/work/projects/BioITeam/common/script/alignalign_bwa_illumina.sh global     ./fastq/Sample_Yeast_L005_R1.cat.fastq.gz yeastbwa_chipglobal sacCer3 1 50
/work/projects/BioITeam/common/script/align_bwa_illumina.sh local      ./fastq/Sample_Yeast_L005_R1.cat.fastq.gz bwa_local sacCer3 1
/work/projects/BioITeam/common/script/align_bowtie2_illumina.sh global ./fastq/Sample_Yeast_L005_R1.cat.fastq.gz bt2_global sacCer3 1 50
/work/projects/BioITeam/common/script/align_bowtie2_illumina.sh local  ./fastq/Sample_Yeast_L005_R1.cat.fastq.gz bt2_local sacCer3 1

Notes:

  • The 1st command performs a paired-end BWA global alignment similar to what we did before, but asks that the 100 bp reads be trimmed to 50 first.
    • we refer to the pre-built index for yeast by name: sacCer3
      • this index is located in the /work/projects/BioITeam/ref_genome/bwa/bwtsw/sacCer3/ directory
    • we provide the name of the R1 FASTQ file
      • because we request a PE alignment (the 1 argument) the script will look for a similarly-named R2 file.
    • all output files associated with this command will be named with the prefix bwa_global.
  • The 2nd command performs a paired-end BWA local alignment.
    • all output files associated with this command will be named with the prefix bwa_local.
    • no trimming is requested because the local alignment should ignore 5' and 3' bases that don't match the reference genome
  • The 3rd command performs a paired-end Bowtie2 global alignment.
    • the Bowtie2 alignment script has the same first arguments as the BWA alignment script.
    • all output files associated with this command will be named with the prefix bt2_global.
    • again, we specify that reads should first be trimmed to 50 bp.
  • The 4th command performs a paired-end Bowtie2 local alignment.
    • all output files associated with this command will be named with the prefix bt2_local.
    • again, no trimming is requested for the local alignment

in both global and local alignment modes, and also ru

...