Posted on by Dr. Francis Collins
Back in April 2003, when the international Human Genome Project successfully completed the first reference sequence of the human DNA blueprint, we were thrilled to have achieved that feat in just 13 years. Sure, the U.S. contribution to that first human reference sequence cost an estimated $400 million, but we knew (or at least we hoped) that the costs would come down quickly, and the speed would accelerate. How far we’ve come since then! A new study shows that whole genome sequencing—combined with artificial intelligence (AI)—can now be used to diagnose genetic diseases in seriously ill babies in less than 24 hours.
Take a moment to absorb this. I would submit that there is no other technology in the history of planet Earth that has experienced this degree of progress in speed and affordability. And, at the same time, DNA sequence technology has achieved spectacularly high levels of accuracy. The time-honored adage that you can only get two out of three for “faster, better, and cheaper” has been broken—all three have been dramatically enhanced by the advances of the last 16 years.
Rapid diagnosis is critical for infants born with mysterious conditions because it enables them to receive potentially life-saving interventions as soon as possible after birth. In a study in Science Translational Medicine, NIH-funded researchers describe development of a highly automated, genome-sequencing pipeline that’s capable of routinely delivering a diagnosis to anxious parents and health-care professionals dramatically earlier than typically has been possible .
While the cost of rapid DNA sequencing continues to fall, challenges remain in utilizing this valuable tool to make quick diagnostic decisions. In most clinical settings, the wait for whole-genome sequencing results still runs more than two weeks. Attempts to obtain faster results also have been labor intensive, requiring dedicated teams of experts to sift through the data, one sample at a time.
In the new study, a research team led by Stephen Kingsmore, Rady Children’s Institute for Genomic Medicine, San Diego, CA, describes a streamlined approach that accelerates every step in the process, making it possible to obtain whole-genome test results in a median time of about 20 hours and with much less manual labor. They propose that the system could deliver answers for 30 patients per week using a single genome sequencing instrument.
Here’s how it works: Instead of manually preparing blood samples, his team used special microbeads to isolate DNA much more rapidly with very little labor. The approach reduced the time for sample preparation from 10 hours to less than three. Then, using a state-of-the-art DNA sequencer, they sequence those samples to obtain good quality whole genome data in just 15.5 hours.
The next potentially time-consuming challenge is making sense of all that data. To speed up the analysis, Kingsmore’s team took advantage of a machine-learning system called MOON. The automated platform sifts through all the data using artificial intelligence to search for potentially disease-causing variants.
The researchers paired MOON with a clinical language processing system, which allowed them to extract relevant information from the child’s electronic health records within seconds. Teaming that patient-specific information with data on more than 13,000 known genetic diseases in the scientific literature, the machine-learning system could pick out a likely disease-causing mutation out of 4.5 million potential variants in an impressive 5 minutes or less!
To put the system to the test, the researchers first evaluated its ability to reach a correct diagnosis in a sample of 101 children with 105 previously diagnosed genetic diseases. In nearly every case, the automated diagnosis matched the opinions reached previously via the more lengthy and laborious manual interpretation of experts.
Next, the researchers tested the automated system in assisting diagnosis of seven seriously ill infants in the intensive care unit, and three previously diagnosed infants. They showed that their automated system could reach a diagnosis in less than 20 hours. That’s compared to the fastest manual approach, which typically took about 48 hours. The automated system also required about 90 percent less manpower.
The system nailed a rapid diagnosis for 3 of 7 infants without returning any false-positive results. Those diagnoses were made with an average time savings of more than 22 hours. In each case, the early diagnosis immediately influenced the treatment those children received. That’s key given that, for young children suffering from serious and unexplained symptoms such as seizures, metabolic abnormalities, or immunodeficiencies, time is of the essence.
Of course, artificial intelligence may never replace doctors and other healthcare providers. Kingsmore notes that 106 years after the invention of the autopilot, two pilots are still required to fly a commercial aircraft. Likewise, health care decisions based on genome interpretation also will continue to require the expertise of skilled physicians.
Still, such a rapid automated system will prove incredibly useful. For instance, this system can provide immediate provisional diagnosis, allowing the experts to focus their attention on more difficult unsolved cases or other needs. It may also prove useful in re-evaluating the evidence in the many cases in which manual interpretation by experts fails to provide an answer.
The automated system may also be useful for periodically reanalyzing data in the many cases that remain unsolved. Keeping up with such reanalysis is a particular challenge considering that researchers continue to discover hundreds of disease-associated genes and thousands of variants each and every year. The hope is that in the years ahead, the combination of whole genome sequencing, artificial intelligence, and expert care will make all the difference in the lives of many more seriously ill babies and their families.
 Diagnosis of genetic diseases in seriously ill children by rapid whole-genome sequencing and automated phenotyping and interpretation. Clark MM, Hildreth A, Batalov S, Ding Y, Chowdhury S, Watkins K, Ellsworth K, Camp B, Kint CI, Yacoubian C, Farnaes L, Bainbridge MN, Beebe C, Braun JJA, Bray M, Carroll J, Cakici JA, Caylor SA, Clarke C, Creed MP, Friedman J, Frith A, Gain R, Gaughran M, George S, Gilmer S, Gleeson J, Gore J, Grunenwald H, Hovey RL, Janes ML, Lin K, McDonagh PD, McBride K, Mulrooney P, Nahas S, Oh D, Oriol A, Puckett L, Rady Z, Reese MG, Ryu J, Salz L, Sanford E, Stewart L, Sweeney N, Tokita M, Van Der Kraan L, White S, Wigby K, Williams B, Wong T, Wright MS, Yamada C, Schols P, Reynders J, Hall K, Dimmock D, Veeraraghavan N, Defay T, Kingsmore SF. Sci Transl Med. 2019 Apr 24;11(489).
DNA Sequencing Fact Sheet (National Human Genome Research Institute/NIH)
Genomics and Medicine (NHGRI/NIH)
Genetic and Rare Disease Information Center (National Center for Advancing Translational Sciences/NIH)
Stephen Kingsmore (Rady Children’s Institute for Genomic Medicine, San Diego, CA)
NIH Support: National Institute of Child Health and Human Development; National Human Genome Research Institute; National Center for Advancing Translational Sciences