...
For a simple alignment like this, we can just go with the default alignment parameters.
Also note that bwa writes its (binary) output to standard output by default, so we need to redirect that to a .sai file.
For simplicity, we will just execute these commands directly, one at a time. (We can do this since we're on the special login8 head node!) Each command you execute should only take a minute or so.
```
bwa aln sacCer3/sacCer3.fa fastq/Sample_Yeast_L005_R1.cat.fastq.gz > yeast_R1.sai
bwa aln sacCer3/sacCer3.fa fastq/Sample_Yeast_L005_R2.cat.fastq.gz > yeast_R2.sai
```
When all is done you should have two .sai files.
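If you'd like a quick sanity check, here is a minimal sketch that confirms both output files exist and are non-empty. The helper name check_sai is made up for this example and is not part of bwa.

```shell
# check_sai is a hypothetical helper: report whether each file exists and is non-empty.
check_sai() {
  for f in "$@"; do
    if [ -s "$f" ]; then
      echo "$f: OK"
    else
      echo "$f: missing or empty"
    fi
  done
}

# The two files written by the bwa aln commands above.
check_sai yeast_R1.sai yeast_R2.sai
```

A file reported as missing or empty usually means the alignment is still running or the redirect went somewhere else.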
Here's how you would use the TACC batch system for these commands. For a simple alignment like this, we can just go with the default alignment parameters, with one exception: at TACC, we want to speed up the alignment by allocating more than one thread (-t) to it. We want to run 2 tasks and will use a minimum of one 16-core node, so we can assign 8 cores to each alignment by specifying -t 8.
...
Create an aln.cmds file (using nano) with the following lines. Here we redirect standard error to a log file, one per input file.
...
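As a rough sketch of what such a commands file could look like, the snippet below writes two bwa aln lines with -t 8 and a 2> redirect for standard error. It uses a here-document instead of nano only so the example is self-contained. The log file names aln.yeast_R1.log and aln.yeast_R2.log are assumptions chosen for illustration.

```shell
# Write one alignment command per line to aln.cmds.
# -t 8 gives each alignment 8 threads; 2> sends standard error to a log file.
# The log file names are illustrative assumptions.
cat > aln.cmds <<'EOF'
bwa aln -t 8 sacCer3/sacCer3.fa fastq/Sample_Yeast_L005_R1.cat.fastq.gz > yeast_R1.sai 2> aln.yeast_R1.log
bwa aln -t 8 sacCer3/sacCer3.fa fastq/Sample_Yeast_L005_R2.cat.fastq.gz > yeast_R2.sai 2> aln.yeast_R2.log
EOF

# Show the file so you can confirm it holds one command per line.
cat aln.cmds
```

Note that the alignment output still goes to the .sai file through >, while progress messages go to the log through 2>.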
...
Create the batch submission script specifying a wayness of 2 (2 tasks per node) on the normal queue and a time of 10 minutes, then submit the job and monitor the queue:
Since you have directed standard error to log files, you can use a neat trick to monitor the progress of the alignment: tail -f. The -f means "follow" the tail, so new lines at the end of the file are displayed as they are added to the file.
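For example, assuming one of your log files is named aln.yeast_R1.log (a made-up name for this sketch; use whatever name you gave your own log), you could follow it like this. The follow_log wrapper is also hypothetical and only guards against the log not existing yet.

```shell
# follow_log is a hypothetical wrapper: follow a log file if it exists.
follow_log() {
  if [ -f "$1" ]; then
    # -f keeps tail running and prints new lines as they are appended.
    # Press Ctrl-C to stop following.
    tail -f "$1"
  else
    echo "$1 not found yet; has the job started?"
  fi
}

# aln.yeast_R1.log is an assumed log file name.
follow_log aln.yeast_R1.log
```

Stopping tail with Ctrl-C does not affect the running job; it only stops the display.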
When it's done you should see two .sai files. Next we use the bwa sampe command to pair the reads and output SAM format data. Just type that command in with no arguments to see its usage.
For this command you provide the same reference index prefix as for bwa aln, along with the two .sai files and the two original FASTQ files.
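Putting that argument order together, a sketch of the call might look like the following. The output name yeast_pairedend.sam is an assumption for illustration, and the guard around the call is only there so the snippet does nothing harmful if bwa or the .sai files are not available.

```shell
ref=sacCer3/sacCer3.fa                        # same index prefix used for bwa aln
r1=fastq/Sample_Yeast_L005_R1.cat.fastq.gz    # original R1 FASTQ
r2=fastq/Sample_Yeast_L005_R2.cat.fastq.gz    # original R2 FASTQ
out=yeast_pairedend.sam                       # hypothetical output file name

# Argument order: index prefix, R1 .sai, R2 .sai, R1 FASTQ, R2 FASTQ.
# bwa sampe writes SAM to standard output, so we redirect it to a file.
if command -v bwa >/dev/null 2>&1 && [ -s yeast_R1.sai ] && [ -s yeast_R2.sai ]; then
  bwa sampe "$ref" yeast_R1.sai yeast_R2.sai "$r1" "$r2" > "$out"
else
  echo "bwa or the .sai files are not available; skipping bwa sampe"
fi
```

The .sai files and FASTQ files must be given in matching order (R1 before R2 in both pairs).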
...