www.scienceboard.org The Science Advisory Board - Protocols, Product Reviews, Member Forum, and Science News
Perspectives

Genomic Structural Variation - the end of the array?
by Richard Wintle, Ph.D.

The human genome is a large and complex place, variable in its sequence content and organization in ways that were not contemplated when the first draft human genome sequences were published in 2001 [1,2]. Many types of genomic changes, including insertions, deletions, duplications, inversions and complex copy-number variations (CNVs), have since been detected, although rare examples had been described earlier, in some cases many years previously [3,4]. Indications that CNV events were both common and widespread came from the fortuitous observation of unusual intensity patterns in genome-wide microarray data [5]. Since then, studies of genomic structural variants have become common. In some cases new technologies have been created to enable these studies, for example clone-based comparative genomic hybridization (CGH) arrays and hybrid microarrays containing both single nucleotide polymorphism (SNP) and CNV probes. Existing technologies such as DNA sequencing and mapping of clone ends have also been applied to identify and catalogue these types of variation (e.g. [6]).


The power of whole-genome sequencing to discover and catalogue genetic variation has recently been illustrated by two studies: the sequencing of Dr. Craig Venter's genome by conventional Sanger capillary-electrophoresis sequencing [7] and of Dr. James Watson's genome using next-generation Roche/454 pyrosequencing technology [8]. Each of these studies identified millions of single-base changes relative to reference sequences, as well as a host of larger structural variants: insertions, deletions, CNVs and inversions. Importantly for studies of genetic predisposition to disease or other traits of interest, a complete sequence captures the entire SNP content of the genome, eliminating the need for whole-genome array-based, or locus-specific targeted, genotyping.


So, is this the end of the genome-wide DNA microarray? Even complete genomic sequencing cannot reliably detect all classes of genomic variation. In particular, direct duplications with perfect or near-perfect homology will likely be mis-assembled as single regions, although these might be revealed by counting the read depth of sequence coverage. Other types of events, particularly those in “hard to assemble” regions, might be similarly missed. Paired-end sequencing, in which sequence reads are derived from each end of a single DNA fragment, can help to overcome this obstacle by allowing identification of inversions (when the two ends map in an unexpected orientation) and of insertions and deletions (when they map nearer to, or farther away from, each other than expected). Even so, the sizes of events that can be detected by paired-end technologies are limited by the size of the fragments that are sequenced. These range from tens to hundreds of kilobases (for clones such as fosmids [9] or bacterial artificial chromosomes) down to a few kilobases or smaller (for fragment libraries made for analysis on next-generation instruments from Illumina, Applied Biosystems, 454/Roche, or others).
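The paired-end logic described above can be sketched in a few lines of Python. This is a minimal illustration, not the method used in any of the cited studies: the `ReadPair` type, the `classify_pair` function, and the `expected_span` and `tolerance` parameters are all hypothetical names, and the sketch assumes a library in which a concordant pair maps with the left end on the forward strand and the right end on the reverse strand.

```python
from dataclasses import dataclass


@dataclass
class ReadPair:
    """Two mapped ends of one sequenced fragment (hypothetical structure)."""
    pos1: int      # reference coordinate of the first end
    strand1: str   # "+" or "-"
    pos2: int      # reference coordinate of the second end
    strand2: str   # "+" or "-"


def classify_pair(pair, expected_span, tolerance):
    """Label a read pair by orientation and by the distance between its ends.

    Assumes concordant pairs map as (+, -) when ordered by coordinate.
    """
    # Order the two ends by reference coordinate.
    (left_pos, left_strand), (right_pos, right_strand) = sorted(
        [(pair.pos1, pair.strand1), (pair.pos2, pair.strand2)]
    )
    # Both ends on the same strand: one end appears flipped, as expected
    # when the fragment spans an inversion breakpoint.
    if left_strand == right_strand:
        return "inversion"
    # Ends facing away from each other: not interpreted in this sketch.
    if (left_strand, right_strand) != ("+", "-"):
        return "other"
    span = right_pos - left_pos
    # Ends farther apart on the reference than the fragment is long:
    # the sample lacks sequence that the reference has.
    if span > expected_span + tolerance:
        return "deletion"
    # Ends closer together than expected: the sample carries extra sequence.
    if span < expected_span - tolerance:
        return "insertion"
    return "concordant"


if __name__ == "__main__":
    pairs = [
        ReadPair(1000, "+", 4000, "-"),
        ReadPair(1000, "+", 9000, "-"),
        ReadPair(1000, "+", 2000, "-"),
        ReadPair(1000, "+", 4000, "+"),
    ]
    for p in pairs:
        print(p, classify_pair(p, expected_span=3000, tolerance=500))
```

The sketch also makes the size limitation concrete: an insertion longer than the fragment itself can never produce a pair that spans it, so the library's fragment size bounds the insertions that this approach can detect.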


By contrast, microarray-based approaches such as array-comparative genomic hybridization (array-CGH), or derivation of genomic copy number by examination of signal intensities on SNP or non-polymorphic oligonucleotide arrays, are relatively robust methods of detecting certain classes of structural variation [10]. These methods, typically based on arrays from Affymetrix, Agilent, Illumina, NimbleGen, and various vendors of clone-based CGH arrays, are widely used in both research and clinical settings. However, none of these methods can detect balanced events such as translocations or inversions, which require other approaches such as sequencing, karyotyping or fluorescence in situ hybridization (FISH). Further, extracting CNV data from SNP and CGH arrays relies on various computational algorithms, and the application and interpretation of these can be challenging [11,12].
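To make the idea of deriving copy number from signal intensities concrete, here is a deliberately naive Python sketch. It is not the algorithm of any vendor or cited study: the function names, the ±0.3 log2 thresholds and the three-probe minimum are all assumptions chosen for illustration, and real CNV-calling software uses considerably more sophisticated statistical segmentation.

```python
import math


def log2_ratios(test, reference):
    """Per-probe log2 ratio of test intensity to reference intensity."""
    return [math.log2(t / r) for t, r in zip(test, reference)]


def call_segments(ratios, gain=0.3, loss=-0.3, min_probes=3):
    """Report runs of consecutive probes whose ratio exceeds a threshold.

    Returns (first_probe_index, last_probe_index, "gain" or "loss") tuples.
    Thresholds and min_probes are illustrative assumptions only.
    """
    states = [
        "gain" if r > gain else "loss" if r < loss else "neutral"
        for r in ratios
    ]
    calls = []
    start, state = 0, "neutral"
    # A trailing sentinel (None) flushes a run that reaches the last probe.
    for i, s in enumerate(states + [None]):
        if s != state:
            if state in ("gain", "loss") and i - start >= min_probes:
                calls.append((start, i - 1, state))
            start, state = i, s
    return calls


if __name__ == "__main__":
    reference = [100.0] * 10
    test = [100.0, 100.0, 150.0, 150.0, 150.0, 100.0, 50.0, 50.0, 50.0, 50.0]
    print(call_segments(log2_ratios(test, reference)))
```

Because a balanced inversion or translocation leaves every probe at its normal copy number, all of its log2 ratios stay near zero and nothing is called, which is why such events are invisible to this class of method.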


As with sequence-based analysis, array-based approaches suffer from resolution limitations. Arrays can detect copy number events at a resolution determined by the technology used: hundreds of kb for BAC-CGH arrays, down to perhaps tens of kb for high-density SNP arrays. With the promise of even higher density arrays, including a 10-million locus single array now in development by Affymetrix [13], and the potential to custom-create very high density, multiple-array, genome-wide sets using Affymetrix, Agilent, NimbleGen or other technologies, there would seem to be no practical reason why even the smallest genomic events could not be detected using microarrays. Indeed, the premise of resequencing arrays is that single-base substitutions can be identified. At present, only cost, both of the sequencing itself and of the computational resources needed to analyze the resulting data, prevents whole-genome sequencing from becoming preferable and routine. Eventually, the expense of creating and using higher and higher density arrays will outstrip the ever-decreasing cost of whole-genome sequencing. Once this tipping point is reached, sequencing will become the method of choice for comprehensive assessment of genomic structural variation. In the meantime, microarrays will remain the tool of choice for genome-wide analysis.

  1. Lander ES, et al. Initial sequencing and analysis of the human genome. Nature. 2001 Feb 15;409(6822):860-921. PMID: 11237011

  2. Venter JC, et al. The sequence of the human genome. Science. 2001 Feb 16;291(5507):1304-51. PMID: 11181995

  3. Feuk L, et al. Structural variation in the human genome. Nat Rev Genet. 2006 Feb;7(2):85-97. PMID: 16418744

  4. Sharp AJ, et al. Structural variation of the human genome. Annu Rev Genomics Hum Genet. 2006;7:407-42. PMID: 16780417

  5. Iafrate AJ, et al. Detection of large-scale variation in the human genome. Nat Genet. 2004 Sep;36(9):949-51. PMID: 15286789

  6. Korbel JO, et al. Paired-end mapping reveals extensive structural variation in the human genome. Science. 2007 Oct 19;318(5849):420-6. PMID: 17901297

  7. Levy S, et al. The diploid genome sequence of an individual human. PLoS Biol. 2007 Sep 4;5(10):e254. PMID: 17803354

  8. Wheeler DA, et al. The complete genome of an individual by massively parallel DNA sequencing. Nature. 2008 Apr 17;452(7189):872-6. PMID: 18421352

  9. Tuzun E, et al. Fine-scale structural variation of the human genome. Nat Genet. 2005 Jul;37(7):727-32. PMID: 15895083

  10. Carter NP. Methods and strategies for analyzing copy number variation using DNA microarrays. Nat Genet. 2007 Jul;39(7 Suppl):S16-21. PMID: 17597776

  11. Pinto D, et al. Copy-number variation in control population cohorts. Hum Mol Genet. 2007 Oct 15;16 Spec No. 2:R168-73. PMID: 17911159

  12. Scherer SW, et al. Challenges and standards in integrating surveys of structural variation. Nat Genet. 2007 Jul;39(7 Suppl):S7-15. PMID: 17597783

  13. Affymetrix prep new genotyping tools. LabTechnologist.com (online at http://www.labtechnologist.com/news/ng.asp?n=82468-affymetrix-genotyping-sequence-reactions-microarrays-genetics). Accessed July 4, 2008.


