It’s hard to believe, but it’s been almost 15 years since we successfully completed the Human Genome Project, ahead of schedule and under budget. I was proud to stand with my international colleagues in a celebration at the Library of Congress on April 14, 2003 (which happens to be my birthday), to announce that we had stitched together the very first reference sequence of the human genome at a total cost of about $400 million. As remarkable as that achievement was, it was just the beginning of our ongoing effort to understand the human genome, and to use that understanding to improve human health.
That first reference human genome was sequenced using automated machines that were the size of small phone booths. Since then, breathtaking progress has been made in developing innovative technologies that have made DNA sequencing far easier, faster, and more affordable. Now, a report in Nature Biotechnology highlights the latest advance: the sequencing and assembly of a human genome using a pocket-sized device [1]. It was generated using several “nanopore” devices that can be purchased online with a “starter kit” for just $1,000. In fact, this new genome sequence, completed in a matter of weeks, includes some notoriously hard-to-sequence stretches of DNA, filling several key gaps in our original reference genome.
For most sequencing methods, DNA must be broken into smaller, more manageable fragments. That means all of the nucleotide “letters”— the As, Cs, Gs, and Ts—in the DNA code must be pieced back together in their correct order like a complex puzzle. While many methods are incredibly accurate at reassembling many parts of the puzzle, it’s much trickier to do this in highly repetitive stretches of DNA. When broken up, they produce puzzle pieces that are essentially identical.
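To make the puzzle analogy concrete, here is a minimal Python sketch of why repeats are hard for short fragments. It is only an illustration, not how any real assembler works: the toy sequence, the fragment lengths, and the helper names `reads_of_length` and `ambiguous_reads` are all made up for this example.

```python
from collections import Counter


def reads_of_length(sequence: str, length: int) -> list[str]:
    """Return every overlapping fragment ("read") of the given length."""
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]


def ambiguous_reads(sequence: str, length: int) -> int:
    """Count reads whose exact text occurs more than once in the sequence.

    Such reads cannot be placed at a single position from their content alone.
    """
    counts = Counter(reads_of_length(sequence, length))
    return sum(n for n in counts.values() if n > 1)


# Hypothetical toy sequence: unique flanks around ten copies of "CAG".
toy = "ATGGTC" + "CAG" * 10 + "TTACGA"

short = ambiguous_reads(toy, 6)    # fragments shorter than the repeat
long_ = ambiguous_reads(toy, 34)   # fragments longer than the 30-base repeat

print(f"6-base reads that look identical to another read: {short}")
print(f"34-base reads that look identical to another read: {long_}")
```

In this toy case, many of the 6-base fragments are indistinguishable from one another, while every 34-base fragment is unique because it is long enough to reach into the flanking sequence on at least one side of the repeat.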
To get around that problem, some newer sequencing technologies are able to read out much longer stretches of DNA. In this latest report, an international team including Nicholas Loman at the University of Birmingham in the United Kingdom (U.K.), Matthew Loose at the University of Nottingham, U.K., and Adam Phillippy at NIH’s National Human Genome Research Institute, Bethesda, MD, relied on one such device: the hand-held MinION nanopore sequencer, produced by Oxford Nanopore Technologies.
In fact, nanopore sequencing was named one of Science magazine’s “Breakthroughs of the Year” in 2016. The method involves threading single DNA strands through many tiny protein pores, i.e., nanopores, set in an electrically resistant polymer membrane. Inside the device, an ionic current is passed through the nanopore. When a single-stranded DNA molecule passes through the charged nanopore, it alters the current, and it does so in different ways depending on which of DNA’s four nucleotides, adenine (A), cytosine (C), guanine (G), or thymine (T), is passing through the pore. As a result, it’s possible to “read” off the DNA sequence, letter by letter!
The nanopore sequencer was initially used primarily for sequencing smaller microbial genomes. In fact, Loman was part of a team that used the portable nanopore device to track Ebola and Zika viruses during the recent outbreaks in Africa and Brazil [2, 3]. The nanopore sequencer was also used on the International Space Station to do the very first DNA sequencing in zero gravity [4].
The larger, more complex human genome represents a much stiffer challenge. But Loman and colleagues took on the challenge, betting that MinION was now up to the task based on recent improvements in its sequencing speed, computer software, and sample prep.
The team, which included five labs in three countries, sequenced the complete genome of a well-studied human cell line in a matter of weeks. The researchers generated 91.2 gigabases of DNA sequence data, enough to cover the genome 30 times over, which helps to put the pieces together accurately. Most notably, they also generated ultra-long “reads” of up to 882,000 bases of contiguous DNA sequence. The researchers report that they have since read individual DNA molecules over a million bases long! Though the final cost ran about $23,000 to sequence one human genome, further refinements should continue to drop the price.
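The “30 times over” figure can be checked with simple arithmetic: divide the amount of sequence generated by the size of the genome. The short Python sketch below does this under two assumptions that are not stated in the text above: that the 91.2 figure counts billions of sequenced bases, and that the human genome is roughly 3.1 billion bases long.

```python
# Rough sequencing-depth estimate: total bases sequenced / genome size.
TOTAL_BASES = 91.2e9   # assumption: 91.2 billion bases of sequence data
GENOME_SIZE = 3.1e9    # assumption: approximate size of one human genome copy


def coverage(total_bases: float, genome_size: float) -> float:
    """Average number of times each position in the genome was read."""
    return total_bases / genome_size


depth = coverage(TOTAL_BASES, GENOME_SIZE)
print(f"Approximate coverage: {depth:.1f}x")
```

With these assumed inputs the estimate comes out just under 30x, in line with the coverage described above. Reading each position many times matters because individual reads contain errors that can be outvoted when many reads overlap the same spot.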
The real trick to getting such long reads is to prepare the DNA in such a way that the molecules don’t get cut or otherwise broken into small fragments, which the team has learned to do well. In fact, the team reports that in principle there may be no limit to the read lengths that are possible using nanopore-based sequencing, including possibly entire chromosomes. The challenge will be getting the DNA molecules into the sequencing device without damaging them. Once a DNA molecule is threaded into a pore, there’s really no reason for it to stop until it’s passed all the way through.
Despite those longer, easier-to-assemble reads, the researchers still required some big computers, including the high-performance computational resources in NIH’s Biowulf system, to make sense of the data, correct for errors, and piece together portions of the genome that had been impossible to assemble previously. For example, they resolved several highly repetitive genomic regions, including the sequences of some essential genes in immunity. They were also able to accurately estimate the lengths of highly repetitive telomeres, which act like “caps” at the tips of chromosomes. Telomere lengths are of great research interest for their implications in aging and cancer.
Just as capabilities once only available through huge supercomputers can today be accessed through apps on smartphones, DNA sequencers continue to get better, smaller, and more portable. And as this study demonstrates, there’s no doubt that we’re pushing ever closer to a time when it may become both feasible and practical to sequence individual human genomes to bring greater precision to the delivery of health care for everyone.
[1] Nanopore sequencing and assembly of a human genome with ultra-long reads. Jain M, Koren S, Miga KH, Quick J, Rand AC, Sasani TA, Tyson JR, Beggs AD, Dilthey AT, Fiddes IT, Malla S, Marriott H, Nieto T, O’Grady J, Olsen HE, Pedersen BS, Rhie A, Richardson H, Quinlan AR, Snutch TP, Tee L, Paten B, Phillippy AM, Simpson JT, Loman NJ, Loose M. Nature Biotech. 2018 Jan 29. [Epub ahead of print]
[2] Real-time, portable genome sequencing for Ebola surveillance. Quick J, Loman NJ, Duraffour S, Simpson JT, Severi E, Cowley L, et al. Nature. 2016 Feb 11;530(7589):228-232.
[3] Establishment and cryptic transmission of Zika virus in Brazil and the Americas. Faria NR, Quick J, Claro IM, Thézé J, de Jesus JG, et al. Nature. 2017 Jun 15;546(7658):406-410.
[4] Nanopore DNA Sequencing and Genome Assembly on the International Space Station. Castro-Wallace SL, Chiu CY, John KK, Stahl SE, Rubins KH, McIntyre ABR, Dworkin JP, Lupisella ML, Smith DJ, Botkin DJ, Stephenson TA, Juul S, Turner DJ, Izquierdo F, Federman S, Stryke D, Somasekar S, Alexander N, Yu G, Mason CE, Burton AS. Sci Rep. 2017 Dec 21;7(1):18022.
DNA Sequencing (National Human Genome Research Institute/NIH)
Loman Lab (University of Birmingham, United Kingdom)
Matt Loose (University of Nottingham, U.K.)
Adam Phillippy (National Human Genome Research Institute/NIH)
MinION (Oxford Nanopore Technologies, U.K.)
NIH Support: National Human Genome Research Institute; National Cancer Institute