Posted by Dr. Francis Collins
At the close of every year, editors and writers at the journal Science review the progress that’s been made in all fields of science, from anthropology to zoology, to select the biggest advance of the past 12 months. In most cases, this Breakthrough of the Year is as tough to predict as the Oscar for Best Picture. Not in 2020. In a year filled with a multitude of challenges posed by the emergence of the deadly coronavirus disease 2019 (COVID-19), the breakthrough was the development of the first vaccines to protect against this pandemic that’s already claimed the lives of more than 360,000 Americans.
In keeping with its annual tradition, Science also selected nine runner-up breakthroughs. This impressive list includes at least three areas that involved efforts supported by NIH: therapeutic applications of gene editing, basic research understanding HIV, and scientists speaking up for diversity. Here’s a quick rundown of all the pioneering advances in biomedical research, both NIH and non-NIH funded:
Shots of Hope: A lot of things happened in 2020 that were unprecedented. At the top of the list was the rapid development of COVID-19 vaccines. In 10 months, public and private researchers accomplished what normally takes about 8 years, producing two vaccines for public use, with more on the way in 2021. In my more than 25 years at NIH, I’ve never encountered such a willingness among researchers to set aside their other concerns and gather around the same table to get the job done fast, safely, and efficiently for the world.
It’s also pretty amazing that the first two conditionally approved vaccines from Pfizer and Moderna were found to be more than 90 percent effective at protecting people from infection with SARS-CoV-2, the coronavirus that causes COVID-19. Both are innovative messenger RNA (mRNA) vaccines, a new approach to vaccination.
For this type of vaccine, the centerpiece is a small, non-infectious snippet of mRNA that encodes the instructions to make the spike protein that crowns the outer surface of SARS-CoV-2. When the mRNA is injected into a shoulder muscle, cells there will follow the encoded instructions and temporarily make copies of this signature viral protein. As the immune system detects these copies, it spurs the production of antibodies and helps the body remember how to fend off SARS-CoV-2 should the real thing be encountered.
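For readers who like to see the logic spelled out, here is a minimal Python sketch of what "following the encoded instructions" means: the cell reads mRNA three letters (one codon) at a time and adds the matching amino acid until it reaches a stop codon. The codon assignments come from the standard genetic code, but the table is deliberately partial, and the short mRNA string and the `translate` function are made up for illustration. This is not the sequence used in either vaccine.

```python
# A small slice of the standard genetic code (the full table has 64 codons).
CODON_TABLE = {
    "AUG": "M",   # methionine, also the usual start signal
    "UUU": "F",   # phenylalanine
    "GUU": "V",   # valine
    "CUG": "L",   # leucine
    "UAA": None,  # stop codon: translation ends here
}


def translate(mrna):
    """Read an mRNA string codon by codon and return one-letter amino acids."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon not in CODON_TABLE:
            raise ValueError(f"codon {codon} is not in this partial table")
        amino_acid = CODON_TABLE[codon]
        if amino_acid is None:  # stop codon reached
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical 18-letter message: AUG UUU GUU UUU CUG UAA
TOY_MRNA = "AUGUUUGUUUUUCUGUAA"
toy_protein = translate(TOY_MRNA)  # "MFVFL"
```

The real spike protein is more than a thousand amino acids long, so the actual message is far longer, but the reading rule is the same.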
It also can’t be overstated that both mRNA vaccines, one developed by Pfizer and the other by Moderna in conjunction with NIH’s National Institute of Allergy and Infectious Diseases, were rigorously evaluated in clinical trials. Detailed data were posted online and discussed in all-day meetings of an FDA Advisory Committee, open to the public. In fact, given the high stakes, the level of review probably was more scientifically rigorous than ever.
First CRISPR Cures: One of the most promising areas of research now underway involves gene editing. These tools, still relatively new, hold the potential to fix gene misspellings in, and potentially cure, a wide range of genetic diseases that were once thought to be out of reach. Much of the research focus has centered on CRISPR/Cas9. This highly precise gene-editing system relies on guide RNA molecules to direct a scissor-like Cas9 enzyme to just the right spot in the genome to cut out or correct a disease-causing misspelling.
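How does the guide find "just the right spot"? A rough way to picture it is as a text search with one extra condition. The commonly used Cas9 enzyme, from Streptococcus pyogenes, only cuts where the matching stretch of DNA is immediately followed by a short tag called a PAM, which reads NGG (any letter, then two Gs). The Python sketch below is a simplified illustration under several assumptions: the function name `find_cas9_sites` and both sequences are invented, only one DNA strand is searched, the guide is written in DNA letters, and mismatches are not tolerated, which real guides sometimes are.

```python
def find_cas9_sites(genome, guide):
    """Return 0-based positions where `guide` matches and an NGG PAM follows."""
    sites = []
    n = len(guide)
    # Stop early enough that the 3-letter PAM still fits after the match.
    for start in range(len(genome) - n - 2):
        target = genome[start:start + n]
        pam = genome[start + n:start + n + 3]
        if target == guide and pam[1:] == "GG":
            sites.append(start)
    return sites


# Hypothetical 20-letter guide and a toy "genome" containing it twice.
GUIDE = "GACCTAGGCTAAGCTTGCAT"
GENOME = (
    "TT" + GUIDE + "TGG"   # match followed by a valid PAM (TGG)
    + "ACGTACGT"
    + GUIDE + "TCA"        # same match, but no PAM, so no cut here
    + "AA"
)

sites = find_cas9_sites(GENOME, GUIDE)  # [2]
```

Only the first copy qualifies, even though the guide matches both. That second requirement is part of what keeps the enzyme from cutting everywhere a similar sequence happens to appear.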
In late 2020, a team of researchers in the United States and Europe succeeded for the first time in using CRISPR to treat 10 people with sickle cell disease and transfusion-dependent beta thalassemia. As published in the New England Journal of Medicine, several months after this non-heritable treatment, all patients no longer needed frequent blood transfusions and are living pain free.
The researchers tested a one-time treatment in which they removed bone marrow from each patient, modified the blood-forming hematopoietic stem cells outside the body using CRISPR, and then reinfused them into the body. To make room for the corrected cells, patients first received toxic bone marrow ablation therapy. The result: the modified stem cells were reprogrammed to switch back to making ample amounts of a healthy form of hemoglobin that their bodies produced in the womb. While the treatment is still risky, complex, and prohibitively expensive, this work is an impressive start for more breakthroughs to come using gene editing technologies. NIH, including its Somatic Cell Genome Editing program, continues to push the technology to accelerate progress and make gene editing cures for many disorders simpler and less toxic.
Scientists Speak Up for Diversity: The year 2020 will be remembered not only for COVID-19, but also for the very public and inescapable evidence of the persistence of racial discrimination in the United States. Triggered by the killing of George Floyd and other similar events, Americans were forced to come to grips with the fact that our society does not provide equal opportunity and justice for all. And that applies to the scientific community as well.
Science thrives in safe, diverse, and inclusive research environments. It suffers when racism and bigotry find a home to stifle diversity—and community for all—in the sciences. For the nation’s leading science institutions, there is a place and a calling to encourage diversity in the scientific workplace and provide the resources to let it flourish to everyone’s benefit.
For those of us at NIH, last year’s peaceful protests and hashtags were noticed and taken to heart. That’s one of the many reasons why we will continue to strengthen our commitment to building a culturally diverse, inclusive workplace. For example, we have established the NIH Equity Committee. It allows for the systematic tracking and evaluation of diversity and inclusion metrics for the intramural research program for each NIH institute and center. There is also the recently founded Distinguished Scholars Program, which aims to increase the diversity of tenure track investigators at NIH. Recently, NIH also announced that it will provide support to institutions to recruit diverse groups or “cohorts” of early-stage research faculty and prepare them to thrive as NIH-funded researchers.
AI Disentangles Protein Folding: Proteins, which are the workhorses of the cell, are made up of long, interconnected strings of amino acids that fold into a wide variety of 3D shapes. Understanding the precise shape of a protein facilitates efforts to figure out its function, its potential role in a disease, and even how to target it with therapies. To gain such understanding, researchers often try to predict a protein’s precise 3D chemical structure using basic principles of physics—including quantum mechanics. But while nature does this in real time zillions of times a day, computational approaches have not been able to do this—until now.
Of the roughly 170,000 proteins mapped so far, most have had their structures deciphered using powerful imaging techniques such as x-ray crystallography and cryo–electron microscopy (cryo-EM). But researchers estimate that there are at least 200 million proteins in nature, and, as amazing as these imaging techniques are, they are laborious, and it can take many months or years to solve the 3D structure of a single protein. So, a breakthrough certainly was needed!
In 2020, researchers with the company DeepMind, London, developed an artificial intelligence (AI) program that rapidly predicts most protein structures as accurately as x-ray crystallography and cryo-EM can map them. The AI program, called AlphaFold, predicts a protein’s structure by computationally modeling the amino acid interactions that govern its 3D shape.
Getting there wasn’t easy. While a complete de novo calculation of protein structure still seemed out of reach, investigators reasoned that they could kick-start the modeling if known structures were provided as a training set to the AI program. Utilizing a computer network built around 128 machine learning processors, the AlphaFold system was created by first focusing on the 170,000 proteins with known structures in an iterative process called deep learning. The process, which is inspired by the way neural networks in the human brain process information, enables computers to look for patterns in large collections of data. In this case, AlphaFold learned to predict the underlying physical structure of a protein within a matter of days. This breakthrough has the potential to accelerate the fields of structural biology and protein research, fueling progress throughout the sciences.
How Elite Controllers Keep HIV at Bay: The term “elite controller” might make some people think of video game whizzes. But here, it refers to the less than 1 percent of people living with human immunodeficiency virus (HIV) who’ve somehow stayed healthy for years without taking antiretroviral drugs. In 2020, a team of NIH-supported researchers figured out why this is so.
In a study of 64 elite controllers, published in the journal Nature, the team discovered a link between their good health and where the virus has inserted itself in their genomes. When a cell transcribes a gene where HIV has settled, this so-called “provirus” can produce more virus to infect other cells. But if it settles in a part of a chromosome that rarely gets transcribed, sometimes called a gene desert, the provirus is stuck with no way to replicate. Although this discovery won’t cure HIV/AIDS, it points to a new direction for developing better treatment strategies.
In closing, 2020 presented more than its share of personal and social challenges. Among those challenges was a flood of misinformation about COVID-19 that confused and divided many communities and even families. That’s why the editors and writers at Science singled out “a second pandemic of misinformation” as their Breakdown of the Year. This divisiveness should concern all of us greatly, as COVID-19 cases continue to soar around the country and our healthcare system is stretched to the breaking point. I hope and pray that we will all find a way to come together, both in science and in society, as we move forward in 2021.
 CRISPR-Cas9 gene editing for sickle cell disease and β-thalassemia. Frangoul H et al. N Engl J Med. 2020 Dec 5.
 ‘The game has changed.’ AI triumphs at protein folding. Service RF. Science. 04 Dec 2020.
 Distinct viral reservoirs in individuals with spontaneous control of HIV-1. Jiang C et al. Nature. 2020 Sep;585(7824):261-267.
COVID-19 Research (NIH)
2020 Science Breakthrough of the Year (American Association for the Advancement of Science, Washington, D.C.)
Posted by Dr. Francis Collins
In 1953, Francis Crick famously told the surprised customers at the Eagle pub in Cambridge that he and Jim Watson had discovered the secret of life. When NIH’s Marshall Nirenberg and his colleagues cracked the genetic code in 1961, it was called the solution to life’s greatest secret. Similarly, when the complete human genome sequence was revealed for the first time in 2003, commentators (including me) referred to this as the moment where the book of life for humans was revealed. But there are many more secrets of life that still need to be unlocked, including figuring out the biochemical rules of a protein shape-shifting phenomenon called allostery.
Among those taking on this ambitious challenge is a recipient of a 2018 NIH Director’s New Innovator Award, Srivatsan Raman of the University of Wisconsin-Madison. If successful, such efforts could revolutionize biology by helping us better understand how allosteric proteins reconfigure themselves in the right shapes at the right times to regulate cell signaling, metabolism, and many other important biological processes.
What exactly is an allosteric protein? Proteins have active, or orthosteric, sites that turn the proteins off or on when specific molecules bind to them. Some proteins also have less obvious regulatory, or allosteric, sites that indirectly affect the proteins’ activity when outside molecules bind to them. In many instances, allosteric binding triggers a change in the shape of the protein.
Allosteric proteins include oxygen-carrying hemoglobin and a variety of enzymes crucial to human health and development. In his work, Raman will start by studying a relatively simple bacterial protein, consisting of fewer than 200 amino acids, to understand the basics of how allostery works over time and space.
Raman, who is a synthetic biologist, got the idea for this project a few years ago while tinkering in the lab to modify an allosteric protein to bind new molecules. As part of the process, he and his team used a new technology called deep mutational scanning to study the functional consequences of removing individual amino acids from the protein.
The screen took them on a wild ride of unexpected functional changes, and a new research opportunity called out to him. He could combine this scanning technology with artificial intelligence and other cutting-edge imaging and computational tools to probe allosteric proteins more systematically in hopes of deciphering the basic molecular rules of allostery.
With the New Innovator Award, Raman’s group will first create a vast number of protein mutants to learn how best to determine the allosteric signaling pathway(s) within a protein. They want to dissect out the properties of each amino acid and determine which connect into a binding site and precisely how those linkages are formed. The researchers also want to know how the amino acids tend to configure into an inactive state and how that structure changes into an active state.
Based on these initial studies, the researchers will take the next step and use their dataset to predict where allosteric pathways are found in individual proteins. They will also try to figure out if allosteric signals are sent in one direction only or whether they can be bidirectional.
The experiments will be challenging, but Raman is confident that they will serve to build a more unified view of how allostery works. In fact, he hopes the data generated—and there will be a massive amount—will reveal novel sites to control or exploit allosteric signaling. Such information will not only expand fundamental biological understanding, but will accelerate efforts to discover new therapies for diseases, such as cancer, in which disruption of allosteric proteins plays a crucial role.
 Allostery: an illustrated definition for the ‘second secret of life.’ Fenton AW. Trends Biochem Sci. 2008 Sep;33(9):420-425.
 Engineering an allosteric transcription factor to respond to new ligands. Taylor ND, Garruss AS, Moretti R, Chan S, Arbing MA, Cascio D, Rogers JK, Isaacs FJ, Kosuri S, Baker D, Fields S, Church GM, Raman S. Nat Methods. 2016 Feb;13(2):177-183.
Drug hunters explore allostery’s advantages. Jarvis LM. Chemical & Engineering News. 2019 Mar 10.
Allostery: An Overview of Its History, Concepts, Methods, and Applications. Liu J, Nussinov R. PLoS Comput Biol. 2016 Jun 2;12(6):e1004966.
Srivatsan Raman (University of Wisconsin-Madison)
Raman Project Information (NIH RePORTER)
NIH Director’s New Innovator Award (Common Fund/NIH)
NIH Support: National Institute of General Medical Sciences; Common Fund
Posted by Dr. Francis Collins
Back in April 2003, when the international Human Genome Project successfully completed the first reference sequence of the human DNA blueprint, we were thrilled to have achieved that feat in just 13 years. Sure, the U.S. contribution to that first human reference sequence cost an estimated $400 million, but we knew (or at least we hoped) that the costs would come down quickly, and the speed would accelerate. How far we’ve come since then! A new study shows that whole genome sequencing—combined with artificial intelligence (AI)—can now be used to diagnose genetic diseases in seriously ill babies in less than 24 hours.
Take a moment to absorb this. I would submit that there is no other technology in the history of planet Earth that has experienced this degree of progress in speed and affordability. And, at the same time, DNA sequence technology has achieved spectacularly high levels of accuracy. The time-honored adage that you can only get two out of three for “faster, better, and cheaper” has been broken—all three have been dramatically enhanced by the advances of the last 16 years.
Rapid diagnosis is critical for infants born with mysterious conditions because it enables them to receive potentially life-saving interventions as soon as possible after birth. In a study in Science Translational Medicine, NIH-funded researchers describe development of a highly automated, genome-sequencing pipeline that’s capable of routinely delivering a diagnosis to anxious parents and health-care professionals dramatically earlier than typically has been possible.
While the cost of rapid DNA sequencing continues to fall, challenges remain in utilizing this valuable tool to make quick diagnostic decisions. In most clinical settings, the wait for whole-genome sequencing results still runs more than two weeks. Attempts to obtain faster results also have been labor intensive, requiring dedicated teams of experts to sift through the data, one sample at a time.
In the new study, a research team led by Stephen Kingsmore, Rady Children’s Institute for Genomic Medicine, San Diego, CA, describes a streamlined approach that accelerates every step in the process, making it possible to obtain whole-genome test results in a median time of about 20 hours and with much less manual labor. They propose that the system could deliver answers for 30 patients per week using a single genome sequencing instrument.
Here’s how it works: Instead of manually preparing blood samples, his team used special microbeads to isolate DNA much more rapidly with very little labor. The approach reduced the time for sample preparation from 10 hours to less than three. Then, using a state-of-the-art DNA sequencer, they sequenced those samples to obtain good quality whole genome data in just 15.5 hours.
The next potentially time-consuming challenge is making sense of all that data. To speed up the analysis, Kingsmore’s team took advantage of a machine-learning system called MOON. The automated platform sifts through all the data using artificial intelligence to search for potentially disease-causing variants.
The researchers paired MOON with a clinical language processing system, which allowed them to extract relevant information from the child’s electronic health records within seconds. Teaming that patient-specific information with data on more than 13,000 known genetic diseases in the scientific literature, the machine-learning system could pick out a likely disease-causing mutation from among 4.5 million potential variants in an impressive 5 minutes or less!
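To give a feel for the general idea of matching a patient’s symptoms against known diseases, here is a toy Python sketch. It is not how MOON works internally, which the study summary above does not spell out. It simply ranks genes that carry a variant by how much the patient’s symptom list overlaps the symptoms already associated with each gene. The gene names, symptom lists, and the `rank_candidates` function are all hypothetical, and the overlap score (shared terms divided by all terms) is an assumption chosen for simplicity.

```python
def rank_candidates(patient_terms, disease_terms_by_gene, variant_genes):
    """Rank genes carrying a variant by symptom overlap with the patient."""
    patient = set(patient_terms)
    scored = []
    for gene in variant_genes:
        known = disease_terms_by_gene.get(gene, set())
        union = patient | known
        # Overlap score: shared symptoms / all symptoms seen in either list.
        score = len(patient & known) / len(union) if union else 0.0
        scored.append((score, gene))
    # Highest score first; ties broken alphabetically for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(gene, round(score, 2)) for score, gene in scored]


# Hypothetical inputs: symptoms pulled from a health record, and a tiny
# stand-in for a catalog of known gene-disease associations.
PATIENT = ["seizures", "low muscle tone", "feeding difficulty"]
KNOWN = {
    "GENE_A": {"seizures", "low muscle tone", "developmental delay"},
    "GENE_B": {"hearing loss"},
    "GENE_C": {"seizures"},
}
VARIANT_GENES = ["GENE_B", "GENE_D", "GENE_C", "GENE_A"]

ranked = rank_candidates(PATIENT, KNOWN, VARIANT_GENES)
# [("GENE_A", 0.5), ("GENE_C", 0.33), ("GENE_B", 0.0), ("GENE_D", 0.0)]
```

A real system has to weigh much more than symptom overlap, such as how rare a variant is and how damaging it is predicted to be, but the sketch shows why having the child’s clinical features in machine-readable form narrows the search so quickly.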
To put the system to the test, the researchers first evaluated its ability to reach a correct diagnosis in a sample of 101 children with 105 previously diagnosed genetic diseases. In nearly every case, the automated diagnosis matched the opinions reached previously via the lengthier and more laborious manual interpretation of experts.
Next, the researchers tested the automated system in assisting diagnosis of seven seriously ill infants in the intensive care unit, and three previously diagnosed infants. They showed that their automated system could reach a diagnosis in less than 20 hours. That’s compared to the fastest manual approach, which typically took about 48 hours. The automated system also required about 90 percent less manpower.
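The timing figures quoted in this post can be added up as a quick sanity check. The short Python sketch below uses only numbers stated above, treating "less than three hours" and "5 minutes or less" as upper bounds. The variable names are mine, and the "unitemized" remainder is simply what is left over within the 20-hour figure, not a number reported by the researchers.

```python
# Step durations quoted in the post, in hours.
PREP_MANUAL = 10.0      # manual sample preparation
PREP_MICROBEADS = 3.0   # "less than three", treated as an upper bound
SEQUENCING = 15.5
ANALYSIS = 5 / 60       # "5 minutes or less", treated as an upper bound

REPORTED_AUTOMATED = 20.0  # "less than 20 hours"
REPORTED_MANUAL = 48.0     # fastest manual approach, "about 48 hours"

# Sum of the three steps the post itemizes: about 18.6 hours.
itemized = PREP_MICROBEADS + SEQUENCING + ANALYSIS

# Time left within the 20-hour figure for everything not itemized here.
unitemized = REPORTED_AUTOMATED - itemized

# Hours saved by the microbead preparation alone: at least 7.
prep_saved = PREP_MANUAL - PREP_MICROBEADS

# Automated versus fastest manual turnaround: 2.4 times faster.
speedup = REPORTED_MANUAL / REPORTED_AUTOMATED
```

Sequencing itself accounts for most of the clock time, which suggests that further gains will depend largely on faster instruments rather than faster interpretation.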
The system nailed a rapid diagnosis for 3 of 7 infants without returning any false-positive results. Those diagnoses were made with an average time savings of more than 22 hours. In each case, the early diagnosis immediately influenced the treatment those children received. That’s key given that, for young children suffering from serious and unexplained symptoms such as seizures, metabolic abnormalities, or immunodeficiencies, time is of the essence.
Of course, artificial intelligence may never replace doctors and other healthcare providers. Kingsmore notes that 106 years after the invention of the autopilot, two pilots are still required to fly a commercial aircraft. Likewise, health care decisions based on genome interpretation also will continue to require the expertise of skilled physicians.
Still, such a rapid automated system will prove incredibly useful. For instance, this system can provide immediate provisional diagnosis, allowing the experts to focus their attention on more difficult unsolved cases or other needs. It may also prove useful in re-evaluating the evidence in the many cases in which manual interpretation by experts fails to provide an answer.
The automated system may also be useful for periodically reanalyzing data in the many cases that remain unsolved. Keeping up with such reanalysis is a particular challenge considering that researchers continue to discover hundreds of disease-associated genes and thousands of variants each and every year. The hope is that in the years ahead, the combination of whole genome sequencing, artificial intelligence, and expert care will make all the difference in the lives of many more seriously ill babies and their families.
 Diagnosis of genetic diseases in seriously ill children by rapid whole-genome sequencing and automated phenotyping and interpretation. Clark MM, Hildreth A, Batalov S, Ding Y, Chowdhury S, Watkins K, Ellsworth K, Camp B, Kint CI, Yacoubian C, Farnaes L, Bainbridge MN, Beebe C, Braun JJA, Bray M, Carroll J, Cakici JA, Caylor SA, Clarke C, Creed MP, Friedman J, Frith A, Gain R, Gaughran M, George S, Gilmer S, Gleeson J, Gore J, Grunenwald H, Hovey RL, Janes ML, Lin K, McDonagh PD, McBride K, Mulrooney P, Nahas S, Oh D, Oriol A, Puckett L, Rady Z, Reese MG, Ryu J, Salz L, Sanford E, Stewart L, Sweeney N, Tokita M, Van Der Kraan L, White S, Wigby K, Williams B, Wong T, Wright MS, Yamada C, Schols P, Reynders J, Hall K, Dimmock D, Veeraraghavan N, Defay T, Kingsmore SF. Sci Transl Med. 2019 Apr 24;11(489).
DNA Sequencing Fact Sheet (National Human Genome Research Institute/NIH)
Genomics and Medicine (NHGRI/NIH)
Genetic and Rare Disease Information Center (National Center for Advancing Translational Sciences/NIH)
Stephen Kingsmore (Rady Children’s Institute for Genomic Medicine, San Diego, CA)
NIH Support: National Institute of Child Health and Human Development; National Human Genome Research Institute; National Center for Advancing Translational Sciences
Posted by Dr. Francis Collins
Credit: Adapted from Nima Mesgarani, Columbia University’s Zuckerman Institute, New York
Computers have learned to do some amazing things, from beating the world’s top-ranked chess masters to providing the equivalent of feeling in prosthetic limbs. Now, as heard in this brief audio clip counting from zero to nine, an NIH-supported team has combined innovative speech synthesis technology and artificial intelligence to teach a computer to read a person’s thoughts and translate them into intelligible speech.
Turning brain waves into speech isn’t just fascinating science. It might also prove life changing for people who have lost the ability to speak from conditions such as amyotrophic lateral sclerosis (ALS) or a debilitating stroke.
When people speak or even think about talking, their brains fire off distinctive, but previously poorly decoded, patterns of neural activity. Nima Mesgarani and his team at Columbia University’s Zuckerman Institute, New York, wanted to learn how to decode this neural activity.
Mesgarani and his team started out with a vocoder, a voice synthesizer that produces sounds based on an analysis of speech. It’s the very same technology used by Amazon’s Alexa, Apple’s Siri, or other similar devices to listen and respond appropriately to everyday commands.
As reported in Scientific Reports, the first task was to train a vocoder to produce synthesized sounds in response to brain waves instead of speech. To do it, Mesgarani teamed up with neurosurgeon Ashesh Mehta, Hofstra Northwell School of Medicine, Manhasset, NY, who frequently performs brain mapping in people with epilepsy to pinpoint the sources of seizures before performing surgery to remove them.
In five patients already undergoing brain mapping, the researchers monitored activity in the auditory cortex, where the brain processes sound. The patients listened to recordings of short stories read by four speakers. In the first test, eight different sentences were repeated multiple times. In the next test, participants heard four new speakers repeat numbers from zero to nine.
From these exercises, the researchers reconstructed the words that people heard from their brain activity alone. Then the researchers tried various methods to reproduce intelligible speech from the recorded brain activity. They found it worked best to combine the vocoder technology with a form of computer artificial intelligence known as deep learning.
Deep learning is inspired by how our own brain’s neural networks process information, learning to focus on some details but not others. In deep learning, computers look for patterns in data. As they begin to “see” complex relationships, some connections in the network are strengthened while others are weakened.
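The strengthening and weakening of connections can be seen even in the smallest possible "network": a single artificial neuron with two input connections. In the Python sketch below, which is a teaching toy and not the model used in the study, the neuron is shown four examples in which only the first input predicts the answer. Both connections start at the same strength, and repeated small corrections (gradient descent) make the useful one grow while the uninformative one fades. The data, starting weights, learning rate, and function names are all assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    """Squash any number into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, weights, learning_rate=0.5, epochs=200):
    """Adjust connection weights by gradient descent on prediction error."""
    w = list(weights)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for inputs, target in samples:
            prediction = sigmoid(sum(wi * xi for wi, xi in zip(w, inputs)))
            error = prediction - target
            for i, xi in enumerate(inputs):
                grad[i] += error * xi
        # Nudge each weight in the direction that reduces the error.
        w = [wi - learning_rate * gi for wi, gi in zip(w, grad)]
    return w


# Toy data: the answer follows the first input; the second is pure noise.
SAMPLES = [((1, 1), 1), ((1, -1), 1), ((-1, 1), 0), ((-1, -1), 0)]
START = [0.5, 0.5]  # both connections begin equally strong

trained = train(SAMPLES, START)
# trained[0] ends up well above 0.5; trained[1] shrinks toward 0.
```

Deep learning systems stack many layers of such units and adjust millions of connections at once, but each adjustment follows this same pattern of reinforcing what helps and fading what doesn’t.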
In this case, the researchers used the deep learning networks to interpret the sounds produced by the vocoder in response to the brain activity patterns. When the vocoder-produced sounds were processed and “cleaned up” by those neural networks, the reconstructed sounds became easier for a listener to understand as recognizable words, though this first attempt still sounds pretty robotic.
The researchers will continue testing their system with more complicated words and sentences. They also want to run the same tests on brain activity, comparing what happens when a person speaks or just imagines speaking. They ultimately envision an implant, similar to those already worn by some patients with epilepsy, that will translate a person’s thoughts into spoken words. That might open up all sorts of awkward moments if some of those thoughts weren’t intended for transmission!
Along with recently highlighted new ways to catch irregular heartbeats and cervical cancers, it’s yet another remarkable example of the many ways in which computers and artificial intelligence promise to transform the future of medicine.
 Towards reconstructing intelligible speech from the human auditory cortex. Akbari H, Khalighinejad B, Herrero JL, Mehta AD, Mesgarani N. Sci Rep. 2019 Jan 29;9(1):874.
Advances in Neuroprosthetic Learning and Control. Carmena JM. PLoS Biol. 2013;11(5):e1001561.
Nima Mesgarani (Columbia University, New York)
NIH Support: National Institute on Deafness and Other Communication Disorders; National Institute of Mental Health