In this day and age, it’s safe to assume that just about every research biologist has read a paper or two using next-generation sequencing (NGS) technology. NGS has revolutionized our understanding of genetic information through highly automated and accurate improvements to the traditional chain termination technique for decoding DNA. Its accuracy, reliability, availability, and price point have put it within reach of nearly every researcher, making it an indispensable and universal tool. In some areas of research, such as human genetics, scientists were quick to adopt this technology. In others, such as microbiology or forensics, uptake has been slower. In this two-part blog post, I will outline the basic concepts behind NGS and the benefits of using it, and provide an overview of how this technology is used for microbiology-specific applications, including 16S rRNA sequencing and de novo assembly of microbial genomes.
To understand the popularity of this technique today, it’s important to look at the sequencing costs and output from the early years of the technology. Small genomes came first, with the draft sequence of Haemophilus influenzae completed with Sanger-based capillary electrophoresis (CE) methods at a cost of ~1 million US dollars in 1995. The draft human genome was produced largely with the same techniques in 2001 through Herculean effort and a matching cost. As reagents and methods became more efficient, sequencing costs began to come down. Then came the commercialization of massively parallel sequencing, and throughput had a lot of catching up to do. In 2007, a single sequencing run using Illumina’s Solexa technology, which later became the Genome Analyzer, could produce about 1 Gb of data. Today, Illumina’s HiSeq 2000 system offers a 1000× increase in throughput and a 10⁵-fold decrease in cost, enabling researchers to produce a terabase (Tb) of sequence in a single run, or sequence the entire E. coli genome for about 100 US dollars. In fact, a single HiSeq instrument run produces sequence equivalent to the output of 60,000 capillary electrophoresis instruments.
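To make the throughput figures concrete, here is a rough back-of-the-envelope sketch in Python. The 1 Gb baseline and the 1000× increase come from the paragraph above. The E. coli genome size (about 4.6 Mb) and the 30× target coverage are assumptions added purely for illustration, and the function names are hypothetical.

```python
# Figures taken from the text above.
RUN_2007_BASES = 10**9        # ~1 Gb per run in 2007
THROUGHPUT_FOLD = 1000        # stated 1000x increase in throughput

# Assumptions for illustration only (not from the text).
ECOLI_GENOME_BASES = 4_600_000   # approximate E. coli genome size
TARGET_COVERAGE = 30             # hypothetical average depth per genome


def bases_per_run(baseline_bases, fold_increase):
    """Scale a baseline run output by a fold increase."""
    return baseline_bases * fold_increase


def genomes_per_run(run_bases, genome_bases, coverage):
    """Whole genomes that fit in one run at the given average coverage."""
    return run_bases // (genome_bases * coverage)


run_bases = bases_per_run(RUN_2007_BASES, THROUGHPUT_FOLD)
print(f"Bases per run: {run_bases:,}")
print("E. coli genomes per run at 30x:",
      genomes_per_run(run_bases, ECOLI_GENOME_BASES, TARGET_COVERAGE))
```

This ignores real-world losses such as low-quality reads, uneven coverage, and barcode overhead, so treat the result as an upper bound rather than a planning number.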
In principle, the core concept behind NGS is similar to chain termination sequencing methods. Within small fragments of DNA, each base is sequentially identified from signals emitted as each fragment is resynthesized from the DNA template strand. Instead of serial reactions, NGS extends this process across millions of reactions in a massively parallel fashion, all automated in a single environment. The process of cluster generation uses clonal amplification to generate copies of each fragment. As a result, an ultra-high-throughput instrument like the HiSeq 2000 can perform ~6 billion reactions at a time. Both single- and paired-end sequencing (sequencing both ends of a fragment) libraries can be generated. Paired-end sequencing is important for sequence alignment and genome assembly, providing long-range information on structural variation.
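The idea of paired-end sequencing can be sketched in a few lines of Python. This is a toy model, not how an instrument or any particular software works: it assumes error-free reads, a made-up fragment sequence, and hypothetical function names. Read 1 is taken from the start of the fragment, and read 2 is read from the opposite strand, so it is the reverse complement of the fragment's far end.

```python
# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def paired_end_reads(fragment, read_length):
    """Return (read1, read2) for a single fragment.

    read1 is the first `read_length` bases of the fragment.
    read2 is the reverse complement of the last `read_length` bases,
    i.e. what is read from the opposite strand at the other end.
    """
    if read_length > len(fragment):
        raise ValueError("read_length cannot exceed fragment length")
    read1 = fragment[:read_length]
    read2 = reverse_complement(fragment[-read_length:])
    return read1, read2


# Made-up 24-base fragment, for illustration only.
fragment = "ACGTTGCAAGGCTTAACCGGATCG"
read1, read2 = paired_end_reads(fragment, 6)
inner_distance = len(fragment) - 2 * 6  # unsequenced bases between the reads

print(read1, read2, inner_distance)
```

Because the two reads are known to come from the same fragment, an aligner or assembler can use the expected distance between them as a constraint. Pairs that map much farther apart or closer together than expected hint at structural variation.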
The beauty and power behind NGS technology is that analysis isn’t limited to sequencing genomic DNA. NGS research applications are largely determined by how sequencing libraries are prepared and how the data is analyzed. A wide assortment of commercially available and lab-demonstrated library prep kits exists for examining whole genomes, small genomes, mRNA, ncRNA, targeted regions, custom-selected regions, protein-binding regions, and more. For example, using a library prep kit containing modified transposable elements that simultaneously fragment the DNA and add the adapters required by the sequencing chemistry, you can produce a high-quality sequencing library composed of hundreds of bacterial genomes within two hours.
For microbiologists, NGS is a workhorse for many types of experimentation. Just a few of these applications include analyzing the genome of unculturable organisms without prior information, or tracking microbial adaptation in the lab or in the environment with single-base resolution. In part two of this post, I’ll talk about microbiology-specific experiments that exploit the full benefits of NGS, many of which aren’t even possible using CE methods. If you want to jump ahead, read our Introduction to Next-Generation Sequencing for Microbiologists.