GS De novo assembler
Summary
Performs assembly of reads and generates contigs. Current version: 2.5.3.
Full Roche manual.
Input options
- SFF files
- FASTA files
- Converted Sanger data: FASTA files and corresponding quality files
Output options
- Consensus sequence (contigs)
- Corresponding quality scores
- ACE files
- Assembly metrics files
- Pairwise alignments
- Read status file
- Alignment views - GUI only
- Flowgrams - GUI only
- For paired end data, scaffold files
Running GS De Novo assembler
GUI Assembler
- Can be accessed by typing gsAssembler
Command-line Assembler
- runAssembly -o /data/filename /data/R_/D_
- For paired end data, runAssembly -o /data/filename -p /data/R_/D_
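For scripted runs, the two command lines above can be built programmatically. The following is a minimal Python sketch, not part of the Roche tooling: the helper name `build_run_assembly_command` and its parameters are hypothetical, and it uses only the `-o` and `-p` flags shown on this page. It assumes that `-p` precedes each paired end input, as in the example above. The paths are the placeholder paths from this page.

```python
import shlex


def build_run_assembly_command(output_dir, inputs, paired_end_inputs=()):
    """Return the runAssembly argument list (hypothetical helper).

    output_dir        -- value passed to -o, as on this page
    inputs            -- input paths passed without a flag
    paired_end_inputs -- input paths each preceded by -p (assumption)
    """
    if not inputs and not paired_end_inputs:
        raise ValueError("at least one input path is required")
    command = ["runAssembly", "-o", output_dir]
    command.extend(inputs)
    for path in paired_end_inputs:
        command.extend(["-p", path])
    return command


if __name__ == "__main__":
    # Placeholder paths copied from this page; replace with real ones.
    cmd = build_run_assembly_command("/data/filename", ["/data/R_/D_"])
    print(shlex.join(cmd))
```

On a machine where runAssembly is installed, the returned list can be handed to `subprocess.run(cmd, check=True)`; building a list rather than a single string avoids shell quoting problems with unusual path names.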
Some options
- Incremental de novo assembly - will allow you to add more data to the assembly when needed.
- Large or complex genomes - for genomes larger than 15 Mb, use this option.
- Trimming database file - Provide a file with FASTA sequences that need to be removed (trimmed) from reads (like vectors).
- Screening database file - Provide a file containing contamination sequences for screening.
- cDNA assembly - use option -cdna
Things to remember
- Reads shorter than 50 bp are removed by default.
- The tool is more powerful and produces better assemblies when using SFF files than just FASTA files as input. The flowgrams are used when computing signals.
- It is a good idea to use RepeatMasker to handle repeats before assembly.
- The current assembler version uses 3 to 4 bytes of memory per base and runs on a single processor only. If there is not enough memory to do an assembly, try the incremental de novo assembly option.
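The length cutoff and the memory figure above lend themselves to a quick check before starting a run. The Python sketch below counts the bases in FASTA reads that meet the 50 bp default and multiplies the total by 3 and by 4 bytes. It assumes that "per base" refers to input read bases, which this page does not state explicitly. The function names are hypothetical, and the result is only a rough estimate.

```python
import io

MIN_READ_LENGTH = 50      # default cutoff stated on this page
BYTES_PER_BASE = (3, 4)   # memory range stated on this page


def iter_fasta_lengths(handle):
    """Yield the sequence length of each record in a FASTA stream."""
    length = None
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if length is not None:
                yield length
            length = 0
        elif line and length is not None:
            length += len(line)
    if length is not None:
        yield length


def estimate_memory_bytes(handle, min_length=MIN_READ_LENGTH):
    """Return (kept_reads, kept_bases, low_bytes, high_bytes).

    Reads shorter than min_length are skipped, mirroring the default
    filter described above. Assumption: memory scales with read bases.
    """
    kept_reads = 0
    kept_bases = 0
    for length in iter_fasta_lengths(handle):
        if length >= min_length:
            kept_reads += 1
            kept_bases += length
    low, high = (kept_bases * b for b in BYTES_PER_BASE)
    return kept_reads, kept_bases, low, high


if __name__ == "__main__":
    # Two toy reads: 60 bp (kept) and 40 bp (dropped by the cutoff).
    demo = io.StringIO(">r1\n" + "A" * 60 + "\n>r2\n" + "C" * 40 + "\n")
    print(estimate_memory_bytes(demo))  # (1, 60, 180, 240)
```

For a real data set, open the FASTA file with `open(path)` and pass the handle in place of the `io.StringIO` demo. If the upper estimate exceeds the available memory, that is a hint to consider the incremental de novo assembly option.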