Posted by Dr. Francis Collins
Researchers recently showed that a computer could “learn” from many examples of protein folding to predict the 3D structure of proteins with great speed and precision. Now a recent study in the journal Science shows that a computer can also predict the 3D shapes of RNA molecules. This includes the mRNA that codes for proteins and the non-coding RNA that performs a range of cellular functions.
This work marks an important basic science advance. RNA therapeutics, from COVID-19 vaccines to cancer drugs, have already benefited millions of people and will help many more in the future. Now, the ability to predict RNA shapes quickly and accurately on a computer will help to accelerate our understanding of these critical molecules and expand their healthcare uses.
Like proteins, the shapes of single-stranded RNA molecules are important for their ability to function properly inside cells. Yet far less is known about these RNA structures and the rules that determine their precise shapes. The RNA elements (bases) can form internal hydrogen-bonded pairs, but the number of possible combinations of pairings is almost astronomical for any RNA molecule with more than a few dozen bases.
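To get a feel for how quickly the pairing possibilities pile up, here is a minimal counting sketch. It is not taken from the study and rests on simplifying assumptions: any two bases are allowed to pair, pairs may not cross one another (no pseudoknots), and each pair must enclose at least `MIN_LOOP` unpaired bases. The function name and the value of `MIN_LOOP` are illustrative choices.

```python
from functools import lru_cache

MIN_LOOP = 3  # assumption: a pair must enclose at least 3 unpaired bases


@lru_cache(maxsize=None)
def count_structures(n: int) -> int:
    """Count non-crossing pairing patterns for a chain of n bases."""
    if n <= MIN_LOOP + 1:
        return 1  # too short to form any pair: only the fully unpaired chain
    # Case 1: the last base stays unpaired.
    total = count_structures(n - 1)
    # Case 2: the last base (position n) pairs with base k, leaving
    # k - 1 bases to its left and n - k - 1 bases enclosed by the pair.
    for k in range(1, n - MIN_LOOP):
        total += count_structures(k - 1) * count_structures(n - k - 1)
    return total


for length in (10, 20, 40, 80):
    print(length, count_structures(length))
```

Under these assumptions the count at least doubles with every added base, which is why exhaustively checking each possibility stops being practical well before a chain reaches 100 bases.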
In hopes of moving the field forward, a team led by Stephan Eismann and Raphael Townshend in the lab of Ron Dror, Stanford University, Palo Alto, CA, looked to a machine learning approach known as deep learning. It is inspired by how our own brain’s neural networks process information, learning to focus on some details but not others.
In deep learning, computers look for patterns in data. As they begin to “see” complex relationships, some connections in the network are strengthened while others are weakened.
One of the things that makes deep learning so powerful is that it doesn’t rely on any preconceived notions. It can also pick up on important features and patterns that humans can’t possibly detect. But, as successful as this approach has been in solving many different kinds of problems, it has primarily been applied to areas of biology, such as protein folding, in which lots of data were available for researchers to train the computers.
That’s not the case with RNA molecules. To work around this problem, Dror’s team designed a neural network they call ARES. (No, it’s not the Greek god of war. It’s short for Atomic Rotationally Equivariant Scorer.)
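The “rotationally equivariant” part of the name points to a symmetry: turning a molecule around in space should not change how its shape is judged. The sketch below is not the ARES network. It only illustrates that symmetry with two toy scores invented for this example: one built from interatomic distances, which is unaffected by rotation, and one built from raw coordinates, which is not. The atom positions are made-up numbers.

```python
import math


def rotate_about_z(atoms, angle):
    """Rotate every (x, y, z) atom position about the z-axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in atoms]


def distance_score(atoms):
    """Toy score: sum of all pairwise interatomic distances."""
    return sum(
        math.dist(atoms[i], atoms[j])
        for i in range(len(atoms))
        for j in range(i + 1, len(atoms))
    )


def coordinate_score(atoms):
    """Toy counter-example: sum of raw x-coordinates."""
    return sum(x for x, _, _ in atoms)


# Made-up positions for four atoms, for illustration only.
atoms = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.2, 0.3), (0.2, 2.0, 1.1)]
rotated = rotate_about_z(atoms, math.radians(40))

# The distance-based score is the same before and after rotation.
print(abs(distance_score(atoms) - distance_score(rotated)) < 1e-9)
# The coordinate-based score depends on how the molecule happens to be oriented.
print(abs(coordinate_score(atoms) - coordinate_score(rotated)) > 1e-3)
```

A scorer that respects this symmetry does not have to relearn the same shape in every orientation, which is one plausible reason such a design can cope with a very small training set.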
To start, the researchers trained ARES on just 18 small RNA molecules for which structures had been experimentally determined. They gave ARES these structural models specified only by their atomic structure and chemical elements.
The next test was to see if ARES could determine from this small training set the best structural model for RNA sequences it had never seen before. The researchers put it to the test with RNA molecules whose structures had been determined more recently.
ARES, however, doesn’t come up with the structures itself. Instead, the researchers give ARES a sequence and at least 1,500 possible 3D structures it might take, all generated using another computer program. Based on patterns in the training set, ARES scores each of the possible structures to find the one it predicts is closest to the actual structure. Remarkably, it does this without being provided any prior information about features important for determining RNA shapes, such as nucleotides, steric constraints, and hydrogen bonds.
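The score-then-select step described above can be sketched in a few lines. This is only an outline of the workflow, not the researchers' code: `pick_best` and `toy_scorer` are hypothetical names, the candidate coordinates are made up, and `toy_scorer` (which simply measures how compact a structure is) is a placeholder standing in for a trained network. Lower scores are treated as better here, on the assumption that the score estimates distance from the true structure.

```python
import math
from typing import Callable, List, Sequence, Tuple

Coords = List[Tuple[float, float, float]]


def pick_best(candidates: Sequence[Coords],
              scorer: Callable[[Coords], float]) -> Tuple[int, float]:
    """Score every candidate structure and return (index, score) of the lowest."""
    scores = [scorer(c) for c in candidates]
    best = min(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


def toy_scorer(coords: Coords) -> float:
    """Placeholder scorer: mean distance of the atoms from their centroid."""
    n = len(coords)
    centroid = tuple(sum(p[i] for p in coords) / n for i in range(3))
    return sum(math.dist(p, centroid) for p in coords) / n


# Three made-up candidate structures for the same three-atom "molecule".
candidates = [
    [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (8.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0)],
]

best_index, best_score = pick_best(candidates, toy_scorer)
print(best_index, round(best_score, 3))
```

Keeping the scorer separate from the program that generates candidates mirrors the division of labor in the text: one tool proposes shapes, another ranks them.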
It turns out that ARES consistently outperforms both humans and all previous methods. In fact, it outperformed at least nine other methods to come out on top in a community-wide RNA-Puzzles contest. It can also make predictions about RNA molecules that are significantly larger and more complex than those on which it was trained.
The success of ARES and this deep learning approach will help to elucidate RNA molecules with potentially important implications for health and disease. It’s another compelling example of how deep learning promises to solve many other problems in structural biology, chemistry, and materials science, even when very little is known at the outset.
 Geometric deep learning of RNA structure. Townshend RJL, Eismann S, Watkins AM, Rangan R, Karelina M, Das R, Dror RO. Science. 2021 Aug 27;373(6558):1047-1051.
Structural Biology (National Institute of General Medical Sciences/NIH)
The Structures of Life (National Institute of General Medical Sciences/NIH)
RNA Biology (NIH)
Dror Lab (Stanford University, Palo Alto, CA)
NIH Support: National Cancer Institute; National Institute of General Medical Sciences
Posted by Dr. Francis Collins
For many people who’ve had COVID-19, the infections were thankfully mild and relatively brief. But these individuals’ immune systems still hold onto enduring clues about how best to neutralize SARS-CoV-2, the coronavirus that causes COVID-19. Discovering these clues could point the way for researchers to design highly targeted treatments that could help to save the lives of folks with more severe infections.
An NIH-funded study, published recently in the journal Science, offers the most-detailed picture yet of the array of antibodies against SARS-CoV-2 found in people who’ve fully recovered from mild cases of COVID-19. This picture suggests that an effective neutralizing immune response targets a wider swath of the virus’ now-infamous spike protein than previously recognized.
To date, most studies of natural antibodies that block SARS-CoV-2 have zeroed in on those that target a specific portion of the spike protein known as the receptor-binding domain (RBD)—and with good reason. The RBD is the portion of the spike that attaches directly to human cells. As a result, antibodies specifically targeting the RBD were an excellent place to begin the search for antibodies capable of fending off SARS-CoV-2.
The new study, led by Gregory Ippolito and Jason Lavinder, The University of Texas at Austin, took a different approach. Rather than narrowing the search, Ippolito, Lavinder, and colleagues analyzed the complete repertoire of antibodies against the spike protein from four people soon after their recoveries from mild COVID-19.
What the researchers found was a bit of a surprise: the vast majority of antibodies—about 84 percent—targeted other portions of the spike protein than the RBD. This suggests a successful immune response doesn’t concentrate on the RBD. It involves production of antibodies capable of covering areas across the entire spike.
The researchers liken the spike protein to an umbrella, with the RBD at the tip of the “canopy.” While some antibodies do bind RBD at the tip, many others apparently target the protein’s canopy, known as the N-terminal domain (NTD).
Further study in cell culture showed that NTD-directed antibodies do indeed neutralize the virus. They also prevented a lethal mouse-adapted version of the coronavirus from infecting mice.
One reason these findings are particularly noteworthy is that the NTD is one part of the viral spike protein that has mutated frequently, especially in several emerging variants of concern, including the B.1.1.7 “U.K. variant” and the B.1.351 “South African variant.” This suggests that one reason these variants are so effective at evading our immune systems to cause breakthrough infections, or re-infections, is that they’ve mutated their way around some of the human antibodies that had been most successful in combating the original coronavirus variant.
Also noteworthy, about 40 percent of the circulating antibodies target yet another portion of the spike called the S2 subunit. This finding is especially encouraging because this portion of SARS-CoV-2 does not seem as mutable as the NTD segment, suggesting that S2-directed antibodies might offer a layer of protection against a wider array of variants. What’s more, the S2 subunit may make an ideal target for a possible pan-coronavirus vaccine since this portion of the spike is widely conserved in SARS-CoV-2 and related coronaviruses.
Taken together, these findings will prove useful for designing COVID-19 vaccine booster shots or future vaccines tailored to combat SARS-CoV-2 variants of concern. The findings also drive home the conclusion that the more we learn about SARS-CoV-2 and the immune system’s response to neutralize it, the better positioned we all will be to thwart this novel coronavirus and any others that might emerge in the future.
 Prevalent, protective, and convergent IgG recognition of SARS-CoV-2 non-RBD spike epitopes. Voss WN, Hou YJ, Johnson NV, Delidakis G, Kim JE, Javanmardi K, Horton AP, Bartzoka F, Paresi CJ, Tanno Y, Chou CW, Abbasi SA, Pickens W, George K, Boutz DR, Towers DM, McDaniel JR, Billick D, Goike J, Rowe L, Batra D, Pohl J, Lee J, Gangappa S, Sambhara S, Gadush M, Wang N, Person MD, Iverson BL, Gollihar JD, Dye J, Herbert A, Finkelstein IJ, Baric RS, McLellan JS, Georgiou G, Lavinder JJ, Ippolito GC. Science. 2021 May 4:eabg5268.
COVID-19 Research (NIH)
Gregory Ippolito (University of Texas at Austin)
NIH Support: National Institute of Allergy and Infectious Diseases; National Cancer Institute; National Institute of General Medical Sciences; National Center for Advancing Translational Sciences
Posted by Dr. Francis Collins
This striking portrait features the spike protein that crowns SARS-CoV-2, the coronavirus that causes COVID-19. This highly flexible protein has settled here into one of its many possible conformations during the process of docking onto a human cell before infecting it.
This portrait, however, isn’t painted on canvas. It was created on a computer screen from sophisticated 3D simulations of the spike protein in action. The aim was to map its many shape-shifting maneuvers accurately at the atomic level in hopes of detecting exploitable structural vulnerabilities to thwart the virus.
For example, notice the many chain-like structures (green) that adorn the protein’s surface (white). They are sugar molecules called glycans that are thought to shield the spike protein by sweeping away antibodies. Also notice areas (purple) that the simulation identified as the most-attractive targets for antibodies, based on their apparent lack of protection by those glycans.
This work, published recently in the journal PLoS Computational Biology, was performed by a German research team that included Mateusz Sikora, Max Planck Institute of Biophysics, Frankfurt. The researchers used a computational technique called molecular dynamics (MD) simulation to model the conformational changes in the spike protein on a time scale of a few microseconds. (A microsecond is 0.000001 second.)
The new simulations suggest that glycans act as a dynamic shield on the spike protein. They liken them to windshield wipers on a car. Rather than being fixed in space, those glycans sweep back and forth to protect more of the protein surface than initially meets the eye.
But just as wipers miss spots on a windshield that lie beyond their tips, glycans also miss spots of the protein just beyond their reach. It’s those spots that the researchers suggest might be especially promising targets for the design of future vaccines and therapeutic antibodies.
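The windshield-wiper picture can be turned into a toy calculation. This is a deliberately simple one-dimensional sketch, not the team's molecular dynamics simulation: the protein surface is a row of numbered sites, each glycan is a wiper anchored at one site that swings back and forth over a fixed reach, and a site counts as shielded in a given frame if it lies between an anchor and that wiper's tip. The function name, the site count, the anchor positions, and the reach are all made-up values for illustration.

```python
import math
from typing import List, Sequence


def shielding_fraction(n_sites: int, anchors: Sequence[int],
                       reach: float, n_frames: int = 100) -> List[float]:
    """Fraction of frames in which each surface site is covered by a glycan."""
    covered = [0] * n_sites
    for t in range(n_frames):
        # Every wiper swings sinusoidally between -reach and +reach.
        swing = reach * math.sin(2 * math.pi * t / n_frames)
        shielded = set()
        for anchor in anchors:
            lo, hi = sorted((anchor, anchor + swing))
            shielded.update(s for s in range(n_sites) if lo <= s <= hi)
        for s in shielded:
            covered[s] += 1
    return [c / n_frames for c in covered]


# Made-up example: 20 surface sites, glycans anchored at sites 5 and 14.
fractions = shielding_fraction(n_sites=20, anchors=[5, 14], reach=3.5)
exposed = [site for site, f in enumerate(fractions) if f == 0.0]
print(exposed)
```

Sites that are never swept in any frame are the analog of the unprotected patches described above. Sites near the edge of a wiper's reach are covered only part of the time, which is the sense in which a moving shield protects more surface than a snapshot would suggest.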
This same approach can now be applied to identifying weak spots in the coronavirus’s armor. It also may help researchers understand more fully the implications of newly emerging SARS-CoV-2 variants. The hope is that by capturing this devastating virus and its most critical proteins in action, we can continue to develop and improve upon vaccines and therapeutics.
 Computational epitope map of SARS-CoV-2 spike protein. Sikora M, von Bülow S, Blanc FEC, Gecht M, Covino R, Hummer G. PLoS Comput Biol. 2021 Apr 1;17(4):e1008790.
COVID-19 Research (NIH)
Mateusz Sikora (Max Planck Institute of Biophysics, Frankfurt, Germany)
The surprising properties of the coronavirus envelope (Interview with Mateusz Sikora), Scilog, November 16, 2020.