Blog @ Illumina
Real scientists. Real commentary.

A Synthesis of PAG Asia 2013: Part 1

Brian Fritz, Ph.D. | Mar 21, 2013

One summary of this year’s inaugural PAG Asia 2013 conference might be thus: a reference genome, while no doubt critical for plant or animal genomics and the result of a significant investment of time and resources, nevertheless remains a largely static document. From an initial reference one can generate some interesting statistics on N50 contigs, numbers of annotated genes, syntenic conservation between related species, and the like. However, a reference is really only the beginning, and it is the subsequent resequencing of diverse natural populations and the application of duly discovered molecular genetic variation via GWAS or related approaches which really unlocks the power of the reference genome. Through such further application does the static document become a living document, sustained over multiple generations of crop or animal improvement to understand the origins of domestication, identify loci crucial to sustaining food security in the face of a changing climate, abiotic stress, or disease, and map diversity of wild and domesticated stocks across the planet. As highlighted in Kahlil’s post, none of these technical improvements matter without also understanding the history, environment, and biology of the systems under study. Sustaining the natural and domesticated diversity of all species will ensure that the genomic revolution has the greatest amount of raw material to work with for future generations of plant and animal genomic improvement.

A brief history of tools for plant and animal genomics

The history of plant and animal selection, as highlighted by Max Rothschild and a number of other speakers at the PAG Asia 2013 meeting, extends back thousands of years to its origins as labour-intensive and largely observational selection and breeding programs.  Desirable traits were identified through sight, smell, or taste and then selected for further generations of planting or breeding.  Such programs persisted and were improved over thousands of years and have resulted in the domestication of a staggering diversity of plant and animal species.

With the onset of the age of molecular biology in the past century, a new opportunity arose to apply molecular genetic tools to plant and animal improvement programs. Molecular marker technologies, such as microsatellites and restriction fragment length polymorphisms, allowed the creation of genetic maps, which in turn enabled QTL analyses to begin to identify the molecular genetic basis of observed phenotypic variation.  The creation of BAC and cDNA libraries and the commoditization of capillary electrophoresis (CE)-based sequencing technology then enabled the physical sequencing of QTL and the direct linkage of specific polymorphic gene sequences to phenotypes. Drawing upon lessons learned from the human and other foundational eukaryotic genome sequencing programs, high-throughput shotgun sequencing strategies were employed for the identification of single nucleotide polymorphisms within plant and animal populations. These in turn fostered the development of high-throughput SNP genotyping technologies and subsequent GWAS to map loci underlying traits of interest with increasingly greater resolution. As such technologies improved in reliability, efficiency, and cost per sample, an increasing number of individual genomes were assayed in diverse plant and animal species, and the utilization of molecular analyses within crop or animal improvement programs began to proliferate.

Fast-forward to just after the turn of this century. The first “next-generation” sequencing technologies began to supplant CE-based methods to establish reference genome sequences and identify SNPs or other molecular genetic variants in an expanding constellation of different organisms. Cost per base of sequencing dropped by orders of magnitude within just a few years and, in addition to de novo assemblies of previously unknown reference genomes, high-throughput sequencing tools have since been applied to greatly improve such reference genome assemblies. Some improvements have been technical, such as increased read lengths or paired-end and mate-pair sequencing strategies to bridge gaps between contigs. Other improvements have come through decreased costs and the ability to sample ever larger numbers of individuals and different populations. The proliferation of reference genomes for different plant and animal species also affords the opportunity to leverage genomic synteny between related organisms and facilitates de novo assembly of new reference genome sequences from short-read-based shotgun sequencing approaches. One particularly noteworthy approach (mentioned in a number of talks in the IWGSC [wheat] session and by Dave Edwards for Brassica and chickpea) involves flow-sorting of individual plant chromosomes, followed by BAC-by-BAC or similarly targeted sequencing to reduce the complexity of sequencing and assembling highly complex genomes. Finally, it is important to note that none of these improvements in molecular biology and other “bench” tools would have made so great an impact were it not for the continuous development and refinement of the bioanalytical methods required to generate meaningful results from such an explosive growth in the number of molecular genetic datasets.