Genome Variation

Genome Variation typically means variation of individual genomes within a species. Variation between species is the realm of phylogenetics and/or comparative genomics.

Variants commonly detected by NGS consist of:

  • single base changes (SNPs)
  • insertions and deletions, and
  • larger scale structural changes such as large deletions, large duplications (up to and including whole chromosomes) and translocations

"Larger scale" is usually defined relative to the capabilities of the technology; for example, a "small indel" usually means one that is detectable within a single sequence read. In 2009, sequence reads were about 50 bp, but by 2011 they were 100 bp.
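The categories above can be sketched in code. The following is a minimal illustration, assuming variants are described by VCF-style reference and alternate allele strings. The function name `classify_variant`, the `read_length` parameter, and the simple "shorter than one read" cutoff are all assumptions made for illustration; real variant callers use considerably more elaborate criteria.

```python
def classify_variant(ref: str, alt: str, read_length: int = 100) -> str:
    """Roughly classify a variant from VCF-style REF and ALT alleles.

    Hypothetical helper: the small/large boundary is simply whether the
    length change is shorter than one read, which is a simplification.
    """
    if len(ref) == len(alt):
        # Equal lengths: one changed base is a SNP, several are a
        # multi-nucleotide substitution (MNP).
        return "SNP" if len(ref) == 1 else "MNP"

    size = abs(len(ref) - len(alt))
    kind = "insertion" if len(alt) > len(ref) else "deletion"
    # "Small" here means the change could fit inside a single read.
    scale = "small" if size < read_length else "large"
    return f"{scale} {kind}"


# The same 60 bp deletion is "large" for 50 bp reads
# but "small" for 100 bp reads.
deletion_ref = "A" + "C" * 60
print(classify_variant("A", "G"))
print(classify_variant(deletion_ref, "A", read_length=50))
print(classify_variant(deletion_ref, "A", read_length=100))
```

The last two calls show why "larger scale" is a moving target: the variant is unchanged, but its category shifts as read length grows.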

More classic variants such as microsatellites, STRPs, and Alus can be somewhat more difficult to detect with NGS because their length is just slightly beyond typical NGS read lengths but shorter than what could be reliably detected with paired-end sequence data, given the variation in fragment lengths with NGS. Fortunately, these markers were usually proxies for the functional genetic polymorphisms, which can now be detected directly with NGS.

According to the Genome News Network, about 90% of genome variation occurs as single nucleotide polymorphisms. As of June 26, 2012, dbSNP contains 187,852,828 SNP submissions (ss #'s), which condense to 53,558,214 families (refSNPs, rs #'s).
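As a quick check on what "condense" means in practice, the two dbSNP counts quoted above give the average number of submissions merged into each refSNP. This is plain arithmetic on the figures in the text; the variable names are only illustrative.

```python
# Counts quoted above for dbSNP as of June 26, 2012.
ss_submissions = 187_852_828  # submitted SNPs (ss #'s)
rs_clusters = 53_558_214      # refSNP families (rs #'s)

# Average number of submissions that collapse into one refSNP.
submissions_per_refsnp = ss_submissions / rs_clusters
print(f"{submissions_per_refsnp:.2f} submissions per refSNP on average")
```

On average, then, each refSNP represents roughly three to four independent submissions.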

Yesterday we looked at the generalized workflow for finding variants with NGS. Here is an image displaying SNPs and indels in IGV:
