Blog @ Illumina
Real scientists. Real commentary.

Following the Herd: Report from the First Illumina Agrigenomics Summit

Scott Brouilette, Ph.D. | Dec 08, 2014

Given Illumina’s continued growth in the sequencing space, it is now commonplace to attend meetings where the vast majority of the audience is unaware that we were originally founded as a microarray company. One notable exception, where the reverse is true, is the agrigenomics community, where rapidly increasing numbers of cattle and other livestock are being genotyped as the industry moves away from traditional animal breeding (TAB) methods.

Last week saw the ICAR/Interbeef Genomics Workshop hosted in Dublin, Ireland. This three-day event covered topics such as performance testing of commercial beef cattle, with the aim of discussing the current state of genomics and how it can be applied to future breeding programs. The last day of the meeting featured Illumina’s first Agrigenomics Scientific Summit, providing opportunities to discuss the potential for sequence data to facilitate future genetic improvement programs.

Iain MacLaren-Lee, Illumina’s Agrigenomics & Food Market Development Manager, kicked the session off by reviewing the current state of sequencing with the now-obligatory graph highlighting the rapid decline in costs, and an overview of the key terminology. He then introduced Illumina’s sequencing-by-synthesis (SBS) chemistry and reviewed the steps in the analysis pipelines. Illumina’s Dr. André Eggen followed up to discuss agrigenomic applications of NGS, joking that it should stand for NOW-generation sequencing given its prevalence. André reminded the audience that Illumina technology enables the rapid genotyping of anywhere from tens of thousands to hundreds of thousands of SNPs using Illumina Whole-Genome BeadChips. While genotyping has played a fundamental role in the move from marker-assisted selection (MAS) towards genomic selection (GS), there are still certain limitations when relying solely on array-based technology. One use of imputation is to allow meta-analysis across different array types, but fundamentally imputation aims to increase study power. In this regard, the combination of array-based genotyping and whole-genome sequencing (WGS) offers an unprecedented opportunity to identify key variants.

André introduced a paper from Towers et al. that identified a causative mutation for the condition incontinentia pigmenti using WGS, calling SNPs and indels against the EquCab2 reference genome. This approach is very similar to that of a recent paper from the 1000 Bull Genomes Project, in which whole-genome sequencing of 234 bulls facilitated the mapping of both monogenic and complex traits in cattle. In addition to reporting observations such as the Holstein and Fleckvieh genomes sharing the greatest number of polymorphic variants while the Angus and Jersey genomes shared the lowest, the paper also illustrated that WGS of just “a few” affected bulls combined with the 1000 Bull Genomes data enables identification of dominant variants. Another important benefit of NGS is the ability to look at gene expression and regulation, even in non-model animals for which there are currently no (or only poor-quality) reference genomes.

Next, Dr. Karim Gharbi from Edinburgh Genomics presented on genotyping-by-sequencing (GBS), which offers the cost benefits of genotyping but can be applied at varying densities and to non-model organisms for which no genotyping arrays exist. Dr. Gharbi gave an elegant review of the current methods (see the excellent review by Davey et al.), before concluding with the bold statement “GBS will replace arrays; it’s not if, it’s when”.

Next up was Dr. Mick Watson, also from Edinburgh Genomics, on exome sequencing in pigs. Mick started off by introducing us to the concept of “Food Security”: that is, ensuring that everyone on the planet has enough food to eat. Challenges include population increases, consumer demand, governance, and globalisation. Dr. Watson beautifully illustrated man’s environmental impact using China’s Three Gorges Dam as an example, which is so large that it has changed the Earth’s spin and shape! It is estimated that over the next 50 years we need to produce more food than has been produced in the past 10,000 years combined. Dr. Watson moved on next to pigs, which in addition to being an important food, also serve as an excellent model for many human diseases. A key issue is that the pig exome is just 45-50 Mb compared to 65 Mb+ for mouse and human. Mick informed us that the real issue is one of poor annotation, highlighting the example of the IGF2 gene. IGF2 is central to many biological processes and diseases but is very poorly assembled and annotated in the pig genome. A quick overview of his current pipeline showed that they’re using BWA/GATK/Ensembl VEP, and currently have challenges with the elimination of false positives, something that he hopes will be improved for Pig Exome 2.0.

The last customer presentation of the day came from Dr. John Hickey from the Roslin Institute, presenting on sequencing for genomic selection. Dr. Hickey estimates that by the end of the decade, individual breeding programs with more than 1 million animals will be the norm, but pointed out that the target is the population rather than an individual. Dr. Hickey discussed the evolution of genomic selection: GS 0.0 relied on linkage disequilibrium (LD), GS 1.0 was linkage-based, whereas GS 2.0 is LD and quantitative trait nucleotide (QTN)-based. Falling sequencing costs will drive the transition from arrays to sequencing in all individuals, but current costs preclude this model. In the interim, it may be best to sequence haplotypes at high coverage. He used simulation data to compare two experiments:

 (1) Sequencing 25 sires at 40x and genotyping progeny with 200: imputation accuracy of 0.752

 (2) Sequencing alone of 1000 progeny at 1x: imputation accuracy > 0.9

Dr. Hickey concluded that high candidate numbers with low-pass coverage are arguably more informative than high coverage in smaller sample sizes.
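It is worth noting that the two designs represent the same raw sequencing effort if you simply multiply animals by depth. The short Python sketch below does that arithmetic. It assumes effort can be approximated as animals times fold-coverage and ignores the cost of array genotyping in the first design; the function name is purely illustrative and the calculation was not part of the talk.

```python
# Rough comparison of sequencing effort for the two simulated designs.
# Assumption: effort ~ number of animals x fold-coverage per animal.
# The array genotyping of progeny in design 1 is not counted.

def total_coverage(animals: int, depth: int) -> int:
    """Return the summed fold-coverage across all sequenced animals."""
    return animals * depth

design_1 = total_coverage(animals=25, depth=40)   # 25 sires at 40x
design_2 = total_coverage(animals=1000, depth=1)  # 1000 progeny at 1x

print(f"Design 1: {design_1}x in total")
print(f"Design 2: {design_2}x in total")
```

Both come out at 1000x in total, so under this simplification the difference in imputation accuracy reflects how the depth is spread across animals rather than how much sequencing is done.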

Overall, and despite an intense week, our audience enjoyed the session and it was clear that the debate about the transition from array-based genotyping to GBS or even WGS has begun in earnest! We’ll report more on this topic next month from the Plant and Animal Genome meeting (PAG) in San Diego.