Blog @ Illumina
Real scientists. Real commentary.

The Power of Sequencing + Arrays

Neil Ward | Jun 25, 2012

Sunday started for me with the session on statistical analysis of sequence data in complex disease. It's not an easy topic to digest first thing in the morning, but the three speakers, Ben Neale, Andrew Morris, and Yukihide Momozawa, did a great job of describing their methodologies.

Ben enthusiastically explained the analysis of some large-scale exome sequencing data and went on to describe how the data from 12,000 exomes had been used to design the hugely popular Infinium HumanExome BeadChips. Andrew also spoke about the utility of combining array and sequencing data, and illustrated the power of combining these data types in his reanalysis of the Wellcome Trust Case Control Consortium (WTCCC) data. Andrew estimated that 80% of the power to detect rare variant associations from resequencing can be attained via imputation and array-based GWAS data, but he also warned that the imputation is computationally far from trivial. Yukihide concluded the session with an interesting talk on looking for correlations between GWAS data and quantitative trait loci (QTLs).

The QTL theme recurred later in the day in the session on cellular models of human biology and genetics. It was great to see Illumina RNA-Seq data being used to assess allele-specific expression (ASE). Lude Franke brought home the importance of assessing the right tissues for your disease of interest with the example of ASE of rs2186366, which was consistent in blood, adipose and muscle, yet had the opposite profile in liver. These types of studies just wouldn’t have been possible with previous array-based gene expression data, and they show the value of the ultra-high-depth RNA-Seq data that can be easily obtained on the HiSeq platform. It sounds like many of the people in the session plan to do more ASE work in more tissues and diseases. During the discussion session, there seemed to be a strong desire to collect ASE data centrally in a suitable database so as to maximise the utility of these data. However, the structure of this database isn’t easy to conceptualise, and it would likely require significant funding to get something like this off the ground.

My main takeaway from these sessions, and from the many other talks that I managed to catch throughout the day, is that researchers are more routinely integrating multiple large-scale omics data types in order to gain a fuller understanding of biology. It's a complex bioinformatics challenge, but one that seems to be yielding big rewards.