Published in JAMA: Next-Generation Sequencing of Infectious Pathogens.
Next-generation sequencing is a versatile technology, broadly applicable to viruses, bacteria, fungi, parasites, animal vectors, and human hosts. Choosing among available methods depends on sequencing objectives and involves tradeoffs in accuracy, efficiency, and cost.
For routine sequencing, most US clinical and public health microbiology laboratories use short-read sequencing platforms, which produce sequence fragments up to several hundred base pairs long. Although microbial genomes are generally smaller and less complex than human genomes, long-read sequencing technologies (such as single-molecule real-time sequencing) are useful for constructing complete, highly accurate genomes and sorting out plasmids, repeats, and other complex regions.
A different approach, nanopore sequencing, relies on threading individual DNA or RNA molecules through engineered protein nanopores and monitoring the electric current across each pore. The first such commercially available instrument offers relatively long sequence reads and allows data analysis to begin while sequencing is still in progress. Early limitations in throughput and accuracy have been mitigated by continued improvements in hardware and reagents. Because of device portability, fast sample preparation, flexibility, and relatively low cost, nanopore sequencing is becoming a feasible first-line strategy for pathogen sequencing in clinical and public health settings.
The transformation of raw sequence data into actionable information is complex and computationally intensive (Figure). The first step is typically to assemble shorter fragments into a complete sequence, either by mapping against a known reference genome or by assembling the sequence de novo using overlapping reads. Comparing the assembled genome with reference strains facilitates many different inferences, such as pathogen identification, high-resolution strain typing, and prediction of important phenotypic characteristics (eg, virulence, antimicrobial resistance). Well-curated and up-to-date reference databases are crucially important because microbial pathogens evolve rapidly and bacteria can exchange plasmids—often encoding virulence and antimicrobial resistance traits—across strains and species. Assembled genomes can be compared with others to look for phylogenetic clustering as evidence of transmission. Each step—assembly, strain typing, phenotyping, and clustering—requires different bioinformatics tools that must be harmonized into a consistent workflow.
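As a rough illustration of the final clustering step, the sketch below counts pairwise single-nucleotide differences between already-aligned genomes and groups isolates that fall within a distance threshold. It is a deliberately simplified stand-in for real phylogenetic analysis, not the method used by any particular laboratory or pipeline. The function names, the toy sequences, and the `max_snps` cutoff are all hypothetical, chosen only for illustration; appropriate thresholds depend on the pathogen and the epidemiologic context.

```python
from __future__ import annotations

from itertools import combinations


def snp_distance(a: str, b: str) -> int:
    """Count differing positions between two aligned sequences.

    Positions where either sequence has an ambiguous base (N) or a gap (-)
    are skipped rather than counted as differences.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    skip = {"N", "-"}
    return sum(
        1
        for x, y in zip(a.upper(), b.upper())
        if x != y and x not in skip and y not in skip
    )


def cluster_by_threshold(genomes: dict[str, str], max_snps: int) -> list[set[str]]:
    """Single-linkage clustering: link isolates that differ by <= max_snps."""
    # Union-find structure: each isolate starts as its own cluster.
    parent = {name: name for name in genomes}

    def find(name: str) -> str:
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path compression
            name = parent[name]
        return name

    # Merge the clusters of any pair of isolates within the threshold.
    for a, b in combinations(genomes, 2):
        if snp_distance(genomes[a], genomes[b]) <= max_snps:
            parent[find(a)] = find(b)

    clusters: dict[str, set[str]] = {}
    for name in genomes:
        clusters.setdefault(find(name), set()).add(name)
    # Sort for a stable, readable ordering of clusters.
    return sorted(clusters.values(), key=lambda members: sorted(members))


# Made-up aligned sequences, for illustration only.
isolates = {
    "isolate_A": "ACGTACGTAC",
    "isolate_B": "ACGTACGTAT",
    "isolate_C": "ACGAACGTAT",
    "isolate_D": "TTGTCCGAAC",
}

clusters = cluster_by_threshold(isolates, max_snps=2)
for members in clusters:
    print(sorted(members))
```

With these toy inputs, isolates A, B, and C differ from one another by at most 2 positions and form one cluster, while isolate D stands alone. In practice, genetic relatedness alone is only supporting evidence of transmission, and production workflows typically infer phylogenetic trees from whole-genome alignments rather than applying a fixed cutoff to raw distances.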