With cops in full uniform guarding the entrances to the Islands Ballroom and nearly 500 people in the large conference room, Eric Green kicked off the meeting at 4 pm, stating that this was the largest meeting in the 14-year history of AGBT, with nearly 800 people signed up. The organizing committee struggled to keep the meeting small, but the high demand made it a challenge. Eric stated that the science is better than ever and that we can expect a spectacular 72 hours of science, with excellent weather that, unfortunately, we will not have a chance to enjoy.
In previous years, the first-day sessions were typically tutorials giving an overview of extant sequencing technologies, but today the sessions jumped directly into the research topics.
Eric Boerwinkle from the University of Texas Health presented an analysis associating gene variants with metabolite phenotypes using exome and whole-genome sequencing. To determine metabolite phenotypes, 602 serum metabolites were evaluated from 1,260 African-Americans from Jackson Field. Of these, 308 metabolites were selected for association analysis with exome sequencing. About 8,500 exomes (plus 5,500 still to be sequenced) and 3,000 whole genomes from the CHARGE (Cohorts for Heart and Aging Research in Genetic Epidemiology) Consortium were sequenced (6x depth for WGS) on Illumina platforms. Data for 962 individuals of European and African American ancestry were analyzed further. About 25 million variants were discovered; about half were singletons and doubletons, and 50% were novel. 58% of the variants were intergenic, 35% were intronic, and 7% fell in other categories such as UTRs and ncRNA. When genes reported to influence HDL levels were studied in this sample set, some of them showed variants, but many did not. For example, ABCA1 and APOA1 variants correlated with lower HDL than reported. They found several Mendelian SNVs that affected HDL-C and corrected several reported in the literature.
As an example, Eric focused on a specific metabolite, N-acetyl methylhistidine, a marker for chronic kidney disease. A NAT8 GWAS showed a significant effect on chronic kidney disease and on N-acetyl methylhistidine levels. Several loss-of-function variants were identified, and the gene FAM198B was implicated in decreasing 5-HETE levels by about 48%.
In addition to the need to improve sequencing technologies even further, there is a strong need to improve analysis methods and pipelines and to better integrate clinical information into research. A tool they have developed, Lachesis, can be used to survey the genomic landscape by overlaying common variants using sliding windows; together with annotation-based results, it gives a p-value for the phenotype overlaid on the annotated genome. This tool should be very useful for seeing phenotypes in the context of known variation. They are also beginning to see better integration of clinical and research data, with about 150 clinical cases per month at Baylor entering research discovery.
Leonid Moroz from the University of Florida gave a fascinating presentation on the changes in the epigenome and transcriptome that accompany the creation of a memory in neurons. He got the audience's attention with his first slide, which asked the question "how do we remember our first kiss for the rest of our life?" It turns out that studying the human brain, with its 100,000,000,000 small neurons, can be quite overwhelming. In contrast, Aplysia snails have only 10,000 giant neurons of nearly 1 mm diameter, making them a more feasible experimental model. Also, the genome and transcriptome of Aplysia have been completed.
Single neurons and neuronal compartments were isolated from Aplysia brains and subjected to transcriptome sequencing. Quite surprisingly, about 80% of the genome was expressed, with 45-50% of the transcripts being differentially expressed. This equates to 100k unique transcripts in the neuron. Classification based on gene expression did not overlap with the conventional classification of neurons.
Long-term memory leads to altered gene expression and new synaptic connections. The efficiency of neuronal connections also changes with memory. They identified 70 messenger molecules at the same synapse! To simplify the analysis, they reconstructed memory circuits in cell culture and performed RNA-seq on these cells. About 3,000 genes were identified, and 692 of these genes overlapped with mouse genes.
Facilitative neurotransmitters induce massive, rapid genome-wide demethylation, as detected by transcriptome and methylation sequencing from the same neuron. It is currently believed that active demethylation occurs via hydroxymethyl DNA. Interestingly, they showed demethylation occurring on one strand of the DNA! Although they have no idea how this happens, Leonid hinted at RNA cargo transport and active demethylation.
Jumping across organisms, the talk moved into brain evolution with the interesting statement that the brain has evolved independently at least 9 times and that memory of injury provides an evolutionarily important function. Leonid gave some cool examples of single-cell studies but did not get a chance to divulge methods. Maybe I can follow up with him later and post the notes from that discussion here.
Ross Hardison, from Penn State, gave an interesting presentation on the genomics and epigenomics of gene regulation. Cis-regulatory modules (CRMs) can be identified confidently in complex genomes by combining sequence conservation in non-coding regions with ChIP-Seq data on histone modifications and transcription factor binding to mark potential regulatory regions. For example, TAL1, GATA1, and H3K4me binding patterns correlate with fold change in activity. An erythroid enhancer predicted from ENCODE project data was confirmed in blood cells with GFP expression. Other examples were confirmed by lacZ enhancer expression in mice. Many GWAS hits are believed to be located in cis-regulatory modules. DNase sensitivity and transcription factor occupancy co-located with 35% of GWAS SNPs. For example, 302 phenotype SNP groups overlap with transcription factor binding data and DNase I-sensitive sites. In an additional example, a Crohn's disease SNP was mapped against ENCODE tracks and showed a strong coincidence with ENCODE peaks.
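To make the "co-located with 35% of GWAS SNPs" kind of figure concrete, here is a minimal sketch of how such an overlap fraction could be computed. It is not the method used in the talk or by ENCODE: the function names, the single-chromosome simplification, the half-open intervals, and all coordinates are assumptions made up for illustration.

```python
from bisect import bisect_right


def merge_regions(regions):
    """Merge overlapping (start, end) half-open intervals into a sorted list."""
    merged = []
    for start, end in sorted(regions):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def fraction_in_regions(snp_positions, regions):
    """Fraction of SNP positions that fall inside any region.

    Assumes everything lies on one chromosome and that regions are
    half-open intervals [start, end). Both are simplifying assumptions.
    """
    if not snp_positions:
        return 0.0
    merged = merge_regions(regions)
    starts = [s for s, _ in merged]
    hits = 0
    for pos in snp_positions:
        # Find the last merged region starting at or before this position.
        i = bisect_right(starts, pos) - 1
        if i >= 0 and pos < merged[i][1]:
            hits += 1
    return hits / len(snp_positions)


# Toy coordinates, invented for illustration only.
toy_regions = [(100, 200), (150, 300), (500, 600)]
toy_snps = [50, 120, 250, 300, 550]
toy_fraction = fraction_in_regions(toy_snps, toy_regions)
```

With real data one would repeat this per chromosome, and a genome-wide tool for interval arithmetic would be a more natural choice than hand-rolled code.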
The last talk of the evening session featured Barbara Methé from the J. Craig Venter Institute, who gave a report on the Human Microbiome Project (HMP), which includes 3,000 bacterial and viral reference genomes, including many from single-cell uncultured organisms. The metagenomic data sets consist of samples from 300 healthy adults, covering 15 body sites in men and 18 in women. Barbara spent a lot of time explaining different graphs that showed ubiquity (0-100%, the share of samples in which the organism is found) versus abundance (0.01% to 100%, the proportion of a sample made up by the organism) for different body locations. Compared with 16S rRNA sequencing, whole-genome sequencing revealed functional profiles for KEGG orthologs, which is useful for seeing differences in antibiotic resistance and pathogenesis. It was an interesting talk highlighting the usefulness of the HMP data set.
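The ubiquity and abundance axes from those graphs can be sketched in a few lines. This is only an illustration of the two definitions given above, not the HMP pipeline: the function name, the per-sample count dictionaries, the taxon labels, and the choice to average abundance only over samples where the organism is present are all assumptions.

```python
from collections import defaultdict


def ubiquity_and_abundance(samples):
    """Return {organism: (ubiquity_pct, mean_abundance_pct)}.

    samples: list of dicts mapping organism name -> read count.
    Ubiquity is the percentage of samples containing the organism.
    Abundance is its mean share of reads, averaged over the samples
    where it is present (an assumption; other conventions exist).
    """
    n_samples = len(samples)
    present = defaultdict(int)
    share_sum = defaultdict(float)
    for counts in samples:
        total = sum(counts.values())
        if total == 0:
            continue  # Empty sample contributes nothing.
        for organism, count in counts.items():
            if count > 0:
                present[organism] += 1
                share_sum[organism] += count / total
    return {
        organism: (
            100.0 * present[organism] / n_samples,
            100.0 * share_sum[organism] / present[organism],
        )
        for organism in present
    }


# Toy counts with hypothetical taxon labels, invented for illustration.
toy_samples = [
    {"taxon_A": 90, "taxon_B": 10},
    {"taxon_A": 50, "taxon_C": 50},
    {"taxon_A": 100},
]
toy_profile = ubiquity_and_abundance(toy_samples)
```

In this toy input, taxon_A appears in every sample and dominates most of them, while taxon_B and taxon_C each appear in only one sample, which is the kind of contrast the ubiquity-versus-abundance plots are meant to show.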