
Human Genome Project

Whole-Genome Sequencing Plus AI Yields Same-Day Genetic Diagnoses

Posted by Dr. Francis Collins

Caption: Rapid whole-genome sequencing helped doctors diagnose Sebastiana Manuel with Ohtahara syndrome, a neurological condition that causes seizures. Her data are now being used as part of an effort to speed the diagnosis of other children born with unexplained illnesses. Credits: Getty Images (left); Jenny Siegwart (right).

Back in April 2003, when the international Human Genome Project successfully completed the first reference sequence of the human DNA blueprint, we were thrilled to have achieved that feat in just 13 years. Sure, the U.S. contribution to that first human reference sequence cost an estimated $400 million, but we knew (or at least we hoped) that the costs would come down quickly, and the speed would accelerate. How far we’ve come since then! A new study shows that whole genome sequencing—combined with artificial intelligence (AI)—can now be used to diagnose genetic diseases in seriously ill babies in less than 24 hours.

Take a moment to absorb this. I would submit that there is no other technology in the history of planet Earth that has experienced this degree of progress in speed and affordability. And, at the same time, DNA sequencing technology has achieved spectacularly high levels of accuracy. The time-honored adage that you can only get two out of three for “faster, better, and cheaper” has been broken: all three have been dramatically enhanced by the advances of the last 16 years.

Rapid diagnosis is critical for infants born with mysterious conditions because it enables them to receive potentially life-saving interventions as soon as possible after birth. In a study in Science Translational Medicine, NIH-funded researchers describe development of a highly automated, genome-sequencing pipeline that’s capable of routinely delivering a diagnosis to anxious parents and health-care professionals dramatically earlier than typically has been possible [1].

While the cost of rapid DNA sequencing continues to fall, challenges remain in utilizing this valuable tool to make quick diagnostic decisions. In most clinical settings, the wait for whole-genome sequencing results still runs more than two weeks. Attempts to obtain faster results also have been labor intensive, requiring dedicated teams of experts to sift through the data, one sample at a time.

In the new study, a research team led by Stephen Kingsmore, Rady Children’s Institute for Genomic Medicine, San Diego, CA, describes a streamlined approach that accelerates every step in the process, making it possible to obtain whole-genome test results in a median time of about 20 hours and with much less manual labor. They propose that the system could deliver answers for 30 patients per week using a single genome sequencing instrument.

Here’s how it works: Instead of manually preparing blood samples, Kingsmore’s team used special microbeads to isolate DNA much more rapidly with very little labor. The approach reduced the time for sample preparation from 10 hours to less than three. Then, using a state-of-the-art DNA sequencer, they sequenced those samples to obtain good-quality whole-genome data in just 15.5 hours.
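To see how these figures relate to the roughly 20-hour median turnaround mentioned earlier, here is a minimal arithmetic sketch in Python. It uses only the numbers quoted in this post; the variable names are illustrative, and treating "less than three" hours as an upper bound of 3.0 is an assumption made for the calculation.

```python
# Figures quoted in the post; the names are illustrative labels only.
MANUAL_PREP_HOURS = 10.0      # manual sample preparation
MICROBEAD_PREP_HOURS = 3.0    # "less than three" hours, taken here as an upper bound
SEQUENCING_HOURS = 15.5       # whole-genome sequencing run
MEDIAN_TOTAL_HOURS = 20.0     # "about 20 hours" median turnaround

# Time spent on sample preparation plus sequencing with the microbead method.
automated_hours = MICROBEAD_PREP_HOURS + SEQUENCING_HOURS

# Hours saved on sample preparation alone.
prep_hours_saved = MANUAL_PREP_HOURS - MICROBEAD_PREP_HOURS

# Time left in the median turnaround for every other step (analysis, review, reporting).
unaccounted_hours = MEDIAN_TOTAL_HOURS - automated_hours

print(f"Prep + sequencing: at most {automated_hours} h")
print(f"Saved on sample prep: at least {prep_hours_saved} h")
print(f"Left for all other steps: about {unaccounted_hours} h")
```

Under these assumptions, preparation and sequencing account for at most 18.5 of the roughly 20 hours, which is why the speed of the remaining analysis steps matters so much.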

The next potentially time-consuming challenge is making sense of all that data. To speed up the analysis, Kingsmore’s team took advantage of a machine-learning system called MOON. The automated platform sifts through all the data using artificial intelligence to search for potentially disease-causing variants.

The researchers paired MOON with a clinical language processing system, which allowed them to extract relevant information from the child’s electronic health records within seconds. Teaming that patient-specific information with data on more than 13,000 known genetic diseases in the scientific literature, the machine-learning system could pick a likely disease-causing mutation out of 4.5 million potential variants in an impressive 5 minutes or less!
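The general idea of matching a patient's symptoms against known disease descriptions can be illustrated with a toy Python sketch. This is not how MOON works internally, which this post does not describe. The gene names, phenotype terms, and the choice of a simple Jaccard overlap score are all hypothetical stand-ins chosen for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str   # gene the variant falls in (placeholder names, not real genes)
    label: str  # free-text identifier for the variant


# Hypothetical disease catalogue: gene -> phenotype terms reported for its disease.
DISEASE_PHENOTYPES = {
    "GENE_A": {"seizures", "developmental delay"},
    "GENE_B": {"metabolic acidosis", "lethargy"},
    "GENE_C": {"recurrent infections"},
}


def phenotype_score(patient_terms: set[str], disease_terms: set[str]) -> float:
    """Jaccard overlap between patient and disease phenotype terms (0.0 to 1.0)."""
    union = patient_terms | disease_terms
    if not union:
        return 0.0
    return len(patient_terms & disease_terms) / len(union)


def rank_variants(
    variants: list[Variant], patient_terms: set[str]
) -> list[tuple[float, Variant]]:
    """Score each variant by its gene's phenotype overlap; drop zero scores; best first."""
    scored = [
        (phenotype_score(patient_terms, DISEASE_PHENOTYPES.get(v.gene, set())), v)
        for v in variants
    ]
    matching = [(score, v) for score, v in scored if score > 0]
    return sorted(matching, key=lambda pair: pair[0], reverse=True)


# Terms a language-processing step might extract from a health record (made up).
patient_terms = {"seizures", "developmental delay", "lethargy"}
candidates = [
    Variant("GENE_B", "variant-2"),
    Variant("GENE_D", "variant-4"),  # gene absent from the catalogue
    Variant("GENE_A", "variant-1"),
    Variant("GENE_C", "variant-3"),
]

ranked = rank_variants(candidates, patient_terms)
for score, variant in ranked:
    print(f"{variant.label} in {variant.gene}: score {score:.2f}")
```

In this toy example, the variant in GENE_A ranks first because two of the three patient terms match its disease description, while variants with no phenotype overlap are dropped. A real system would weigh far more evidence, such as variant type, inheritance pattern, and population frequency.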

To put the system to the test, the researchers first evaluated its ability to reach a correct diagnosis in a sample of 101 children with 105 previously diagnosed genetic diseases. In nearly every case, the automated diagnosis matched the opinions reached previously via the more lengthy and laborious manual interpretation of experts.

Next, the researchers tested the automated system in assisting diagnosis of seven seriously ill infants in the intensive care unit, and three previously diagnosed infants. They showed that their automated system could reach a diagnosis in less than 20 hours. That’s compared to the fastest manual approach, which typically took about 48 hours. The automated system also required about 90 percent less manpower.

The system nailed a rapid diagnosis for 3 of 7 infants without returning any false-positive results. Those diagnoses were made with an average time savings of more than 22 hours. In each case, the early diagnosis immediately influenced the treatment those children received. That’s key given that, for young children suffering from serious and unexplained symptoms such as seizures, metabolic abnormalities, or immunodeficiencies, time is of the essence.

Of course, artificial intelligence may never replace doctors and other healthcare providers. Kingsmore notes that 106 years after the invention of the autopilot, two pilots are still required to fly a commercial aircraft. Likewise, health care decisions based on genome interpretation also will continue to require the expertise of skilled physicians.

Still, such a rapid automated system will prove incredibly useful. For instance, this system can provide immediate provisional diagnosis, allowing the experts to focus their attention on more difficult unsolved cases or other needs. It may also prove useful in re-evaluating the evidence in the many cases in which manual interpretation by experts fails to provide an answer.

The automated system could also make periodic reanalysis of those unsolved cases practical. Keeping up with such reanalysis is a particular challenge, considering that researchers continue to discover hundreds of disease-associated genes and thousands of variants every year. The hope is that in the years ahead, the combination of whole genome sequencing, artificial intelligence, and expert care will make all the difference in the lives of many more seriously ill babies and their families.


[1] Diagnosis of genetic diseases in seriously ill children by rapid whole-genome sequencing and automated phenotyping and interpretation. Clark MM, Hildreth A, Batalov S, Ding Y, Chowdhury S, Watkins K, Ellsworth K, Camp B, Kint CI, Yacoubian C, Farnaes L, Bainbridge MN, Beebe C, Braun JJA, Bray M, Carroll J, Cakici JA, Caylor SA, Clarke C, Creed MP, Friedman J, Frith A, Gain R, Gaughran M, George S, Gilmer S, Gleeson J, Gore J, Grunenwald H, Hovey RL, Janes ML, Lin K, McDonagh PD, McBride K, Mulrooney P, Nahas S, Oh D, Oriol A, Puckett L, Rady Z, Reese MG, Ryu J, Salz L, Sanford E, Stewart L, Sweeney N, Tokita M, Van Der Kraan L, White S, Wigby K, Williams B, Wong T, Wright MS, Yamada C, Schols P, Reynders J, Hall K, Dimmock D, Veeraraghavan N, Defay T, Kingsmore SF. Sci Transl Med. 2019 Apr 24;11(489).


DNA Sequencing Fact Sheet (National Human Genome Research Institute/NIH)

Genomics and Medicine (NHGRI/NIH)

Genetic and Rare Disease Information Center (National Center for Advancing Translational Sciences/NIH)

Stephen Kingsmore (Rady Children’s Institute for Genomic Medicine, San Diego, CA)

NIH Support: National Institute of Child Health and Human Development; National Human Genome Research Institute; National Center for Advancing Translational Sciences

Study Shows Genes Unique to Humans Tied to Bigger Brains

Posted by Dr. Francis Collins


Caption: Cortical organoid, showing radial glial stem cells (green) and cortical neurons (red).
Credit: Sofie Salama, University of California, Santa Cruz

In seeking the biological answer to the question of what it means to be human, the brain’s cerebral cortex is a good place to start. This densely folded, outer layer of grey matter, which is vastly larger in Homo sapiens than in other primates, plays an essential role in human consciousness, language, and reasoning.

Now, an NIH-funded team has pinpointed a key set of genes—found only in humans—that may help explain why our species possesses such a large cerebral cortex. Experimental evidence shows these genes prolong the development of stem cells that generate neurons in the cerebral cortex, which in turn enables the human brain to produce more mature cortical neurons and, thus, build a bigger cerebral cortex than our fellow primates.

That sounds like a great advantage for humans! But there’s a downside. Researchers found the same genomic changes that facilitated the expansion of the human cortex may also render our species more susceptible to certain rare neurodevelopmental disorders.

A Tribute to Two Amazing Scientists

Posted by Dr. Francis Collins


Caption: Sir John Sulston (left) and Stephen Hawking (right)
Credit: Jane Gitschier, PLoS; Paul Alers, NASA

Over the past couple of weeks, we’ve lost two legendary scientists who made major contributions to our world: Sir John Sulston and Stephen Hawking. Although they worked in very different areas of science—biology and physics—both have left us with an enduring legacy through their brilliant work that unlocked fundamental mysteries of life and the universe.

I had the privilege of working closely with John as part of the international Human Genome Project (HGP), a historic endeavor that successfully produced the first reference sequence of the human genetic blueprint nearly 15 years ago, in April 2003. As founding director of the Sanger Centre (now the Sanger Institute) in Cambridge, England, John oversaw the British contributions to this publicly funded effort. Throughout our many planning meetings and sometimes stormy weekly conference calls about progress of this intense and all-consuming enterprise, John stood out for his keen intellect and high ethical standards.

Sequencing Human Genome with Pocket-Sized “Nanopore” Device

Posted by Dr. Francis Collins


Caption: MinION sequencing device plugged into a laptop
Credit: Oxford Nanopore Technologies

It’s hard to believe, but it’s been almost 15 years since we successfully completed the Human Genome Project, ahead of schedule and under budget. I was proud to stand with my international colleagues in a celebration at the Library of Congress on April 14, 2003 (which happens to be my birthday), to announce that we had stitched together the very first reference sequence of the human genome at a total cost of about $400 million. As remarkable as that achievement was, it was just the beginning of our ongoing effort to understand the human genome, and to use that understanding to improve human health.

That first reference human genome was sequenced using automated machines that were the size of small phone booths. Since then, breathtaking progress has been made in developing innovative technologies that have made DNA sequencing far easier, faster, and more affordable. Now, a report in Nature Biotechnology highlights the latest advance: the sequencing and assembly of a human genome using a pocket-sized device [1]. It was generated using several “nanopore” devices that can be purchased online with a “starter kit” for just $1,000. In fact, this new genome sequence—completed in a matter of weeks—includes some notoriously hard-to-sequence stretches of DNA, filling several key gaps in our original reference genome.

Creative Minds: Bacteria, Gene Swaps, and Human Cancer

Posted by Dr. Francis Collins


Julie Dunning Hotopp

When Julie Dunning Hotopp was a post-doctoral fellow in the early 2000s, bacteria were known for swapping bits of their DNA with other bacteria, a strategy known as lateral gene transfer. But the offloading of genes from bacteria into multicellular organisms was thought to be rare, with limited evidence that a bacterial genus called Wolbachia, which invades the cells of other organisms and takes up permanent residence, had passed off some of its DNA onto a species of beetle and a parasitic worm. Dunning Hotopp wondered whether lateral gene transfer might be a more common phenomenon than the evidence showed.

She and her colleagues soon discovered that Wolbachia had engaged in widespread lateral gene transfer with eight species of insects and nematode worms, possibly passing on genes and traits to their invertebrate hosts [1]. This important discovery put Dunning Hotopp on a research trail that now has taken a sharp turn toward human cancer and earned her a 2015 NIH Director’s Transformative Research Award. This NIH award supports exceptionally innovative research projects that are inherently risky and untested but have the potential to change fundamental research paradigms in areas such as cancer and throughout the biomedical sciences.

Genome Exhibit Opens at Smithsonian

Posted by Dr. Francis Collins

Photo of one of the interactive touch-screen exhibits

Credit: Sasan Azami-Soheily, National Human Genome Research Institute, NIH

To celebrate the 10th anniversary of the completion of the Human Genome Project—a 13-year endeavor that I had the privilege of leading—the Smithsonian’s National Museum of Natural History in Washington, DC is launching an absolutely fantastic exhibit called “Genome: Unlocking Life’s Code.”

DNA’s Double Anniversary

Posted by Dr. Francis Collins

Images of the first publication of DNA's structure adjacent to the image on the cover of the published human genome

April 25 is a very special day. In 2003, Congress declared April 25th DNA Day to mark the date that James Watson and Francis Crick published their seminal one-page paper in Nature [1] describing the helical structure of DNA. That paper appeared 60 years ago, in 1953. In that single page, they revealed how organisms elegantly store biological information and pass it from generation to generation; they discovered the molecular basis of evolution; and they effectively launched the era of modern biology.

But that’s not all that’s special about this date. It was ten years ago this month that we celebrated the completion of all of the original goals of the Human Genome Project (HGP), which produced a reference sequence of the 3 billion DNA letters that make up the instruction book for building and maintaining a human being. The $3 billion, 13-year project involved more than 2,000 scientists from six countries. As the scientist tasked with leading that effort, I remain immensely proud of the team. They worked tirelessly and creatively to do something once thought impossible, never worrying about who got the credit, and giving all of the data away immediately so that anyone who had a good idea about how to use it for human benefit could proceed immediately. Biology will never be the same. Medical research will never be the same.