
How to use

bfast-book.pdf

The manual is at /home/scott/Downloads/bfast-0.6.4d/manual/bfast-book.pdf. On Fourierseq, open it with Evince (the PDF reader), or scp it to your local computer.

First, convert SOLiD data to fastq format:

solid2fastq <csfastafile> <qual file>

A shortcut script that executes all these functions on <input.fasta> and <reads.csfasta> is: 

Create a reference genome (note the -A 1 option means colorspace, use -A 0 for base space)

bfast fasta2brg -f <fastafile>

Then make a colorspace one too:

bfast fasta2brg -f <fastafile> -A 1

Create indexes of the reference genome

For something like bacteria, these are some reasonable masks. Note that both base space and color space indexes are created with these commands:

bfast index -f <fastafile> -m 111111111111111111 -w 12 -i 1

bfast index -f <fastafile> -m 1111111110111111111 -w 12 -i 2

bfast index -f <fastafile> -m 111111011111101011111 -w 12 -i 3

bfast index -f <fastafile> -m 111111011001100111011111 -w 12 -i 4

bfast index -f <fastafile> -m 1111011101011111101111 -w 12 -i 5

bfast index -f <fastafile> -m 111111111111111111 -w 12 -i 1 -A 1

bfast index -f <fastafile> -m 1111111110111111111 -w 12 -i 2 -A 1

bfast index -f <fastafile> -m 111111011111101011111 -w 12 -i 3 -A 1

bfast index -f <fastafile> -m 111111011001100111011111 -w 12 -i 4 -A 1

bfast index -f <fastafile> -m 1111011101011111101111 -w 12 -i 5 -A 1
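The ten index commands above differ only in the mask, the index number, and the -A value, so they can be generated in a loop. The following is a minimal sketch, not part of the BFAST distribution: the function name print_index_cmds and the file name ref.fasta are hypothetical, and it assumes the reference path contains no spaces. It only prints the commands; pipe the output to sh to run them.

```shell
#!/bin/sh
# Print (not run) one "bfast index" command per mask, for base space and color space.
print_index_cmds() {
    fasta=$1          # reference FASTA passed to -f
    width=${2:-12}    # hash width passed to -w (12 in the commands above)
    for space in 0 1; do        # 0 = base space, 1 = color space
        i=1                     # index number passed to -i
        for mask in \
            111111111111111111 \
            1111111110111111111 \
            111111011111101011111 \
            111111011001100111011111 \
            1111011101011111101111
        do
            echo "bfast index -f $fasta -m $mask -w $width -i $i -A $space"
            i=$((i + 1))
        done
    done
}

# Dry run: list the ten commands for a hypothetical reference file.
print_index_cmds ref.fasta
```

To build the indexes rather than list them, run print_index_cmds <fastafile> | sh. Writing -A 0 explicitly for the base space indexes is a choice made here for symmetry; the original commands leave it out.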

Use the indexes and reference genome to find CALs (Candidate Alignment Locations) (again, note -A 1 is colorspace)

bfast match -f <fastafile> -A 1 -r <fastq> > bfast.matches.fasta.reads.bmf

Align each CAL using a local alignment algorithm (again, note -A 1 is colorspace)

bfast localalign -f <fastafile> -m bfast.matches.fasta.reads.bmf -A 1 > bfast.aligned.fasta.reads.baf

Filter/Prioritize alignments (again, note -A 1 is colorspace)

bfast postprocess -f <fastafile> -i bfast.aligned.fasta.reads.baf -A 1 > bfast.reported.fasta.reads.sam
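Each of the three stages above reads the file written by the previous one, so it can help to chain them and stop as soon as one fails. The following is a minimal sketch under stated assumptions: the function name run_bfast_align, the BFAST override variable, and the output prefix are hypothetical conveniences, while the bfast subcommands and flags are the ones used above. It assumes the reference has already been converted with fasta2brg and indexed.

```shell
#!/bin/sh
# Assumption: bfast is on PATH; set BFAST to point at a specific binary otherwise.
BFAST=${BFAST:-bfast}

# Run match -> localalign -> postprocess, stopping at the first failing stage.
run_bfast_align() {
    fasta=$1        # reference FASTA (same file given to fasta2brg and index)
    fastq=$2        # reads produced by solid2fastq
    prefix=$3       # hypothetical output prefix for the .bmf/.baf/.sam files
    space=${4:-1}   # 1 = color space (default here), 0 = base space
    "$BFAST" match -f "$fasta" -A "$space" -r "$fastq" > "$prefix.bmf" &&
    "$BFAST" localalign -f "$fasta" -m "$prefix.bmf" -A "$space" > "$prefix.baf" &&
    "$BFAST" postprocess -f "$fasta" -i "$prefix.baf" -A "$space" > "$prefix.sam"
}

# Example call (hypothetical file names):
#   run_bfast_align ref.fasta reads.fastq bfast.run1
```

The && chain means a failed match leaves no .baf or .sam behind, which makes it easier to see which stage to rerun.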

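Then convert the SAM file to BAM, sort it, and index the sorted file:

samtools view -S -b bfast.reported.fasta.reads.sam > bfast.reported.fasta.reads.bam

samtools sort bfast.reported.fasta.reads.bam bfast.reported.fasta.reads.sorted

samtools index bfast.reported.fasta.reads.sorted.bam

The sort line uses the samtools 0.1.x form, samtools sort <in.bam> <out.prefix>, which writes <out.prefix>.bam. With samtools 1.x, use samtools sort -o <out.bam> <in.bam> instead. Index the sorted file, not the unsorted one.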
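For a newer samtools, the same convert, sort, and index sequence can be wrapped in one function. This is a sketch that assumes samtools 1.x option syntax (view -b -o, sort -o), which differs from the older release the commands above were written for. The function name sam_to_sorted_bam, the SAMTOOLS override variable, and the prefix argument are hypothetical.

```shell
#!/bin/sh
# Assumption: samtools 1.x is on PATH; set SAMTOOLS to use a specific binary.
SAMTOOLS=${SAMTOOLS:-samtools}

# Convert <prefix>.sam to BAM, coordinate-sort it, and index the sorted file.
sam_to_sorted_bam() {
    prefix=$1   # expects $prefix.sam, e.g. the output of bfast postprocess
    "$SAMTOOLS" view -b -o "$prefix.bam" "$prefix.sam" &&
    "$SAMTOOLS" sort -o "$prefix.sorted.bam" "$prefix.bam" &&
    "$SAMTOOLS" index "$prefix.sorted.bam"
}

# Example call (hypothetical prefix):
#   sam_to_sorted_bam bfast.run1
```

The index step runs on the sorted BAM because samtools index requires coordinate-sorted input.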

 

The package is installed on Phylocluster under /share/apps.

References

  • Homer N, Merriman B, Nelson SF. BFAST: an alignment tool for large scale genome resequencing. PLoS One. 4(11):e7767 (2009 Nov 11).
  • Homer N, Merriman B, Nelson SF. Local alignment of two-base encoded DNA sequence. BMC Bioinformatics. 10:175 (2009 Jun 9).