Posted on by Dr. Francis Collins
Researchers recently showed that a computer could “learn” from many examples of protein folding to predict the 3D structure of proteins with great speed and precision. Now a recent study in the journal Science shows that a computer also can predict the 3D shapes of RNA molecules. This includes the mRNA that codes for proteins and the non-coding RNA that performs a range of cellular functions.
This work marks an important basic science advance. RNA therapeutics, from COVID-19 vaccines to cancer drugs, have already benefited millions of people and will help many more in the future. Now, the ability to predict RNA shapes quickly and accurately on a computer will help to accelerate understanding of these critical molecules and expand their healthcare uses.
As with proteins, the shapes of single-stranded RNA molecules are important for their ability to function properly inside cells. Yet far less is known about these RNA structures and the rules that determine their precise shapes. The RNA elements (bases) can form internal hydrogen-bonded pairs, but the number of possible combinations of pairings is almost astronomical for any RNA molecule with more than a few dozen bases.
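To get a feel for how quickly those pairing combinations pile up, here is a rough counting sketch in Python. It rests on simplifying assumptions that are not from the study: any base may pair with any other, each base pairs at most once, and pairs are not allowed to cross. Real RNA follows stricter chemical and geometric rules and can also form crossing pairs, so treat the numbers as an illustration of growth, not as a count of realistic structures. The function name `count_pairings` is made up for this example.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_pairings(n: int) -> int:
    """Count non-crossing ways to pair up some, all, or none of n bases.

    Simplifying assumption: every base can pair with every other base.
    Under that assumption the counts are the Motzkin numbers.
    """
    if n < 2:
        return 1  # zero or one base: only the fully unpaired arrangement
    # Case 1: the last base stays unpaired.
    total = count_pairings(n - 1)
    # Case 2: the last base pairs with an earlier base, leaving k bases
    # to the left of that partner and n - 2 - k bases enclosed by the pair.
    for k in range(n - 1):
        total += count_pairings(k) * count_pairings(n - 2 - k)
    return total


if __name__ == "__main__":
    for length in (5, 10, 20, 40):
        print(length, count_pairings(length))
```

Even under these toy rules, the count roughly triples with each added base, which is why trying every arrangement stops being practical well before an RNA reaches typical lengths.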
In hopes of moving the field forward, a team led by Stephan Eismann and Raphael Townshend in the lab of Ron Dror, Stanford University, Palo Alto, CA, looked to a machine learning approach known as deep learning. It is inspired by how our own brain’s neural networks process information, learning to focus on some details but not others.
In deep learning, computers look for patterns in data. As they begin to “see” complex relationships, some connections in the network are strengthened while others are weakened.
One of the things that makes deep learning so powerful is that it doesn’t rely on any preconceived notions. It also can pick up on important features and patterns that humans can’t possibly detect. But, as successful as this approach has been in solving many different kinds of problems, it has primarily been applied to areas of biology, such as protein folding, in which lots of data were available for researchers to train the computers.
That’s not the case with RNA molecules. To work around this problem, Dror’s team designed a neural network they call ARES. (No, it’s not the Greek god of war. It’s short for Atomic Rotationally Equivariant Scorer.)
To start, the researchers trained ARES on just 18 small RNA molecules for which structures had been experimentally determined. They gave ARES these structural models specified only by their atomic structure and chemical elements.
The next test was to see if ARES could determine from this small training set the best structural model for RNA sequences it had never seen before. The researchers put it to the test with RNA molecules whose structures had been determined more recently.
ARES, however, doesn’t come up with the structures itself. Instead, the researchers give ARES a sequence and at least 1,500 possible 3D structures it might take, all generated using another computer program. Based on patterns in the training set, ARES scores each of the possible structures to find the one it predicts is closest to the actual structure. Remarkably, it does this without being provided any prior information about features important for determining RNA shapes, such as nucleotides, steric constraints, and hydrogen bonds.
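The score-and-select step described above can be sketched in a few lines of Python. This is only an outline of the idea, not the ARES code: the scoring function is a stand-in for the trained network, the assumption that a lower score means "predicted to be closer to the real structure" is ours, and the candidate names and numbers are invented for the example.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

Structure = TypeVar("Structure")


def rank_candidates(
    candidates: Sequence[Structure],
    predicted_error: Callable[[Structure], float],
) -> List[Tuple[float, Structure]]:
    """Score every candidate structure and sort from best to worst.

    Assumption: predicted_error returns an estimate of how far a candidate
    is from the true structure, so smaller values are better.
    """
    if not candidates:
        raise ValueError("need at least one candidate structure")
    scored = [(predicted_error(candidate), candidate) for candidate in candidates]
    return sorted(scored, key=lambda pair: pair[0])


# Toy stand-ins: three hypothetical candidates with invented scores.
toy_scores = {"model_a": 7.2, "model_b": 2.1, "model_c": 4.8}
ranking = rank_candidates(list(toy_scores), toy_scores.__getitem__)
best_score, best_model = ranking[0]

if __name__ == "__main__":
    print("best candidate:", best_model, "with predicted error", best_score)
```

In the study, the list would hold the 1,500 or more computer-generated structures for a single RNA sequence, and the scoring function would be the trained network itself.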
It turns out that ARES consistently outperforms humans and all other previous methods to produce the best results. In fact, it outperformed at least nine other methods to come out on top in a community-wide RNA-puzzles contest. It also can make predictions about RNA molecules that are significantly larger and more complex than those upon which it was trained.
The success of ARES and this deep learning approach will help to elucidate RNA molecules with potentially important implications for health and disease. It’s another compelling example of how deep learning promises to solve many other problems in structural biology, chemistry, and the material sciences when—at the outset—very little is known.
 Geometric deep learning of RNA structure. Townshend RJL, Eismann S, Watkins AM, Rangan R, Karelina M, Das R, Dror RO. Science. 2021 Aug 27;373(6558):1047-1051.
Structural Biology (National Institute of General Medical Sciences/NIH)
The Structures of Life (National Institute of General Medical Sciences/NIH)
RNA Biology (NIH)
Dror Lab (Stanford University, Palo Alto, CA)
NIH Support: National Cancer Institute; National Institute of General Medical Sciences
Posted on by Dr. Francis Collins
Tropical medicine has its share of wily microbes. Among the most clever is the mosquito-borne protozoan Plasmodium falciparum, which is the cause of the most common, and most lethal, form of malaria. For decades, doctors have used antimalarial drugs against P. falciparum. But just when malaria appeared to be well on its way to eradication, this parasitic protozoan mutated in ways that have enabled it to resist frontline antimalarial drugs. This resistance is a major reason that malaria, one of the world’s oldest diseases, still claims the lives of about 400,000 people each year.
This is a situation with which I have personal experience. Thirty years ago, before traveling to Nigeria, I followed directions and took chloroquine to prevent malaria. But resistance to the drug was already widespread, and I came down with malaria anyway. Fortunately, the parasite that a mosquito delivered to me was sensitive to another drug called Fansidar, which acts through another mechanism. I was pretty sick for a few days, but recovered without lasting consequences.
While new drugs are being developed to thwart P. falciparum, some researchers are busy developing tools to predict what mutations are likely to occur next in the parasite’s genome. And that’s what is so exciting about the image above. It presents the unprecedented, 3D atomic-resolution structure of a protein made by P. falciparum that’s been a major source of its resistance: the chloroquine-resistance transporter protein, or PfCRT.
In this cropped density map, you see part of the protein’s biochemical structure. The colorized area displays the long, winding chain of amino acids within the protein as helices in shades of green, blue and gold. These helices enclose a central cavity essential for the function of the protein, whose electrostatic properties are shown here as negative (red), positive (blue), and neutral (white). All this structural information was captured using cryo-electron microscopy (cryo-EM). The technique involves flash-freezing molecules in liquid nitrogen and bombarding them with electrons to capture their images with a special camera.
This groundbreaking work, published recently in Nature, comes from an NIH-supported multidisciplinary research team, led by David Fidock, Matthias Quick, and Filippo Mancia, Columbia University Irving Medical Center, New York. It marks a major feat for structural biology, because PfCRT is on the small side for standard cryo-EM and, as Mancia discovered, the protein is almost featureless.
These two strikes made Mancia and colleagues wonder at first whether they would swing and miss at their attempt to image the protein. With the help of coauthor Anthony Kossiakoff, a researcher at the University of Chicago, the team complexed PfCRT to a bulkier antibody fragment. That doubled the size of their subject, and the fragment helped to draw out PfCRT’s hidden features. One year and a lot of hard work later, they got their home run.
PfCRT is a transport protein embedded in the surface membrane of what passes for the gut of P. falciparum. Because the gene encoding it is highly mutable, the PfCRT protein modified its structure many years ago, enabling it to pump out and render ineffective several drugs in a major class of antimalarials called 4-aminoquinolines. That includes chloroquine.
Now, with the atomic structure in hand, researchers can map the locations of existing mutations and study how they work. This information will also allow them to model which regions of the protein to watch for the next adaptive mutations. The hope is this work will help to prolong the effectiveness of today’s antimalarial drugs.
For example, the drug piperaquine, a 4-aminoquinoline agent, is now used in combination with another antimalarial. The combination has proved quite effective. But recent reports show that P. falciparum has acquired resistance to piperaquine, driven by mutations in PfCRT that are spreading rapidly across Southeast Asia.
Interestingly, the researchers say they have already pinpointed single mutations that could confer piperaquine resistance to parasites from South America. They’ve also located where new mutations are likely to occur to compromise the drug’s action in Africa, where most malarial infections and deaths occur. So, this atomic structure is already being put to good use.
Researchers also hope that this model will allow drug designers to make structural adjustments to old, less effective malarial drugs and perhaps restore them to their former potency. Perhaps this could even be done by modifying chloroquine, introduced in the 1940s as the first effective antimalarial. It was used worldwide but was largely shelved a few decades later due to resistance—as I experienced three decades ago.
Malaria remains a constant health threat for millions of people living in subtropical areas of the world. Wouldn’t it be great to restore chloroquine to the status of a frontline antimalarial? The drug is inexpensive, taken orally, and safe. Through the power of science, its return is no longer out of the question.
 World malaria report 2019. World Health Organization, December 4, 2019
 Structure and drug resistance of the Plasmodium falciparum transporter PfCRT. Kim J, Tan YZ, Wicht KJ, Erramilli SK, Dhingra SK, Okombo J, Vendome J, Hagenah LM, Giacometti SI, Warren AL, Nosol K, Roepe PD, Potter CS, Carragher B, Kossiakoff AA, Quick M, Fidock DA, Mancia F. Nature. 2019 Dec;576(7786):315-320.
Determinants of dihydroartemisinin-piperaquine treatment failure in Plasmodium falciparum malaria in Cambodia, Thailand, and Vietnam: a prospective clinical, pharmacological, and genetic study. van der Pluijm RW, Imwong M, Chau NH, Hoa NT, et al. Lancet Infect Dis. 2019 Sep;19(9):952-961.
Malaria (National Institute of Allergy and Infectious Diseases/NIH)
Fidock Lab (Columbia University Irving Medical Center, New York)
Video: David Fidock on antimalarial drug resistance (BioMedCentral/YouTube)
Kossiakoff Lab (University of Chicago)
Mancia Lab (Columbia University Irving Medical Center)
Matthias Quick (Columbia University Irving Medical Center)
NIH Support: National Institute of Allergy and Infectious Diseases; National Institute of General Medical Sciences; National Heart, Lung, and Blood Institute
Posted on by Dr. Francis Collins
When Dmitry Lyumkis headed off to graduate school at The Scripps Research Institute, La Jolla, CA, he had thoughts of becoming a synthetic chemist. But he soon found his calling in a nearby lab that imaged proteins using a technique known as single-particle cryo-electron microscopy (EM). Lyumkis was amazed that the team could take a purified protein, flash-freeze it in liquid nitrogen, and then fire electrons at the protein, capturing the resulting image with a special camera. Also amazing was the sophisticated computer software that analyzed the raw 2D camera images, merging the data and reconstructing it into 3D representations of the protein.
The work was profoundly complex, but Lyumkis thrives on solving extremely difficult puzzles. He joined the Scripps lab to become a structural biologist and a few years later used single-particle cryo-EM to help determine the atomic structure of a key protein on the surface of the human immunodeficiency virus (HIV), the cause of AIDS. The protein had been considered one of the greatest challenges in structural biology and a critical target in developing an AIDS vaccine.
Now, Lyumkis has plans to take single-particle cryo-EM to a whole new level—literally. He wants to develop new methods that allow it to model the atomic structures of much smaller proteins. Right now, single-particle cryo-EM has worked with proteins as small as roughly 150 kilodaltons, a measure of a protein’s molecular weight (the approximate average mass of a protein is 53 kDa). Lyumkis plans to drop that number well below 100 kDa, noting that if his new methods work as he hopes, there should be very little, if any, lower size limit to get the technique to work. He envisions generating within a matter of days or weeks the precise structure of an average-sized protein involved in a disease, and then potentially handing it off as an atomic model for drug developers to target for more effective treatment.
Posted on by Dr. Francis Collins
The striking image you see above is an example of what can happen when scientists combine something old with something new. In this case, a researcher took the Rous sarcoma virus (RSV), a virus that’s been studied for more than a century because of its ability to cause cancer in chickens and the insights it provided on human oncogenes [1, 2], and used modern computational tools to generate a model of its atomic structure.
Here you see an immature RSV particle that’s just budded from an infected chicken cell and entered the avian bloodstream. A lattice of proteins (red) held together by short peptides (green) covers the outer shell of the immature virus, shielding other proteins (blue) that make up an inner shell.