A Global Look at Cancer Genomes

Posted by Dr. Francis Collins

Cancer Genomics

Cancer is a disease of the genome. It can be driven by many different types of DNA misspellings and rearrangements, which can cause cells to grow uncontrollably. While the first oncogenes with the potential to cause cancer were discovered more than 35 years ago, it’s been a long slog to catalog the universe of these potential DNA contributors to malignancy, let alone explore how they might inform diagnosis and treatment. So, I’m thrilled that an international team has completed the most comprehensive study to date of the entire genomes—the complete sets of DNA—of 38 different types of cancer.

Among the team’s most important discoveries is that the vast majority of tumors—about 95 percent—contained at least one identifiable spelling change in their genomes that appeared to drive the cancer [1]. That’s significantly higher than the level of “driver mutations” found in past studies that analyzed only a tumor’s exome, the small fraction of the genome that codes for proteins. Because many cancer drugs are designed to target specific proteins affected by driver mutations, the new findings indicate it may be worthwhile, perhaps even life-saving in many cases, to sequence the entire tumor genomes of a great many more people with cancer.

The latest findings, detailed in an impressive collection of 23 papers published in Nature and its affiliated journals, come from the international Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium. Known as the Pan-Cancer Project for short, it builds on earlier efforts to characterize the genomes of many cancer types, including NIH’s The Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC).

In these latest studies, a team including more than 1,300 researchers from around the world analyzed the complete genomes of more than 2,600 cancer samples. Those samples included tumors of the brain, skin, esophagus, liver, and more, along with matched healthy cells taken from the same individuals.

In each of the resulting new studies, teams of researchers dug deep into various aspects of the cancer DNA findings to make a series of important inferences and discoveries. Here are a few intriguing highlights:

• The average cancer genome was found to contain not just one driver mutation, but four or five.

• About 13 percent of those driver mutations were found in so-called non-coding DNA, portions of the genome that don’t code for proteins [2].

• The mutations arose within about 100 different molecular processes, as indicated by their unique patterns or “mutational signatures” [3,4].

• Some of those signatures are associated with known cancer causes, including aberrant DNA repair and exposure to known carcinogens, such as tobacco smoke or UV light. Interestingly, many others are as-yet unexplained, suggesting there’s more to learn with potentially important implications for cancer prevention and drug development.

• A comprehensive analysis of 47 million genetic changes pieced together the chronology of cancer-causing mutations. This work revealed that many driver mutations occur years, if not decades, prior to a cancer’s diagnosis, a discovery with potentially important implications for early cancer detection [5].

The findings represent a big step toward cataloging all the major cancer-causing mutations, with important implications for the future of precision cancer care. And yet, the fact that the drivers in 5 percent of cancers remain mysterious (though they do have RNA abnormalities) is a reminder that there’s still a lot more work to do. The challenging next steps include connecting the cancer genome data to treatments and building meaningful predictors of patient outcomes.

To help in these endeavors, the Pan-Cancer Project has made all of its data and analytic tools available to the research community. As researchers at NIH and around the world continue to detail the diverse genetic drivers of cancer and the molecular processes that contribute to them, there is hope that these findings and others will ultimately vanquish, or at least rein in, this Emperor of All Maladies.


[1] Pan-Cancer analysis of whole genomes. ICGC/TCGA Pan-Cancer Analysis of Whole Genomes Consortium. Nature. 2020 Feb;578(7793):82-93.

[2] Analyses of non-coding somatic drivers in 2,658 cancer whole genomes. Rheinbay E et al; PCAWG Consortium. Nature. 2020 Feb;578(7793):102-111.

[3] The repertoire of mutational signatures in human cancer. Alexandrov LB et al; PCAWG Consortium. Nature. 2020 Feb;578(7793):94-101.

[4] Patterns of somatic structural variation in human cancer genomes. Li Y et al; PCAWG Consortium. Nature. 2020 Feb;578(7793):112-121.

[5] The evolutionary history of 2,658 cancers. Gerstung M, Jolly C, Leshchiner I, Dentro SC et al; PCAWG Consortium. Nature. 2020 Feb;578(7793):122-128.


The Genetics of Cancer (National Cancer Institute/NIH)

Precision Medicine in Cancer Treatment (NCI)

ICGC/TCGA Pan-Cancer Project

The Cancer Genome Atlas Program (NIH)

NCI and the Precision Medicine Initiative (NCI)

NIH Support: National Cancer Institute, National Human Genome Research Institute

Skin Health: New Insights from a Rare Disease

Posted by Dr. Francis Collins

Forehead of study participant with rare form of ichthyosis

Courtesy of Keith Choate, Yale University School of Medicine, New Haven, CT

Skin is the largest organ in the human body, yet we often take for granted all of the wonderful things that it does to keep us healthy. That’s not the case for people who suffer from a group of rare, scale-forming skin disorders known as ichthyoses, which are named after “ichthys,” the Greek word for fish.

Each year, more than 16,000 babies around the world are born with ichthyoses [1], and researchers have so far identified more than 50 gene mutations responsible for various types and subtypes of the disease. Now, an NIH-funded research team has found yet another genetic cause, and this one has important implications for treatment. The new discovery implicates misspellings in a gene that codes for an enzyme playing a critical role in building ceramides, fatty molecules that help keep the skin moist. Without healthy ceramides, the skin develops dry, scale-like plaques that can leave people vulnerable to infections and other health problems.

Two patients with this newly characterized form of ichthyosis were treated with isotretinoin (Accutane), a common prescription acne medication, and their symptoms resolved almost entirely. Together, the findings suggest that isotretinoin works not only by encouraging the rapid turnover of skin cells but also by spurring patients’ skin to boost ceramide production, albeit through a different biological pathway.

International “Big Data” Study Offers Fresh Insights into T2D

Posted by Dr. Francis Collins

World map
Caption: This international “Big Data” study involved hundreds of researchers in 22 countries (red).

It’s estimated that about 10 percent of the world’s population either has type 2 diabetes (T2D) or will develop the disease during their lives [1]. Type 2 diabetes (formerly called “adult-onset”) happens when the body doesn’t produce or use insulin properly, causing glucose levels to rise. While diet and exercise are critical contributory factors to this potentially devastating disease, genetic factors are also important. In fact, over the last decade alone, studies have turned up more than 80 genetic regions that contribute to T2D risk, with much more still to be discovered.

Now, a major international effort, which includes work from my own NIH intramural research laboratory, has published new data that accelerate understanding of how a person’s genetic background contributes to T2D risk. The new study, reported in Nature and unprecedented in its investigative scale and scope, pulled together the largest-ever inventory of DNA sequence changes involved in T2D, and compared their distribution in people from around the world [2]. This “Big Data” strategy has already yielded important new insights into the biology underlying the disease, some of which may yield novel approaches to diabetes treatment and prevention.

The study, led by Michael Boehnke at the University of Michigan, Ann Arbor, Mark McCarthy at the University of Oxford, England, and David Altshuler, until recently at the Broad Institute, Cambridge, MA, involved more than 300 scientists in 22 countries.

Molecular Answers Found for a Mysterious Rare Immune Disorder

Posted by Dr. Francis Collins

Harry Hill and Patient Images

Caption: Helping to solve a medical mystery. Top left, University of Utah’s Harry Hill; Bottom, CVID patient Roma Jean Ockler; Right, Ockler showing the medication that helps to control her CVID.
Credit: Jeffrey Allred, Deseret News

When most of us come down with a bacterial infection, we generally bounce back with appropriate treatment in a matter of days. But that’s often not the case for people who suffer from common variable immunodeficiency (CVID), a group of rare disorders that increase the risk of life-threatening bacterial infections of the lungs, sinuses, and intestines. CVID symptoms typically arise in adulthood and often take many years to diagnose and treat, in part because its exact molecular causes are unknown in most individuals.

Now, by combining the latest in genomic technology with some good, old-fashioned medical detective work, NIH-funded researchers have pinpointed the genetic mutation responsible for an inherited subtype of CVID characterized by the loss of immune cells essential to the normal production of antibodies [1]. This discovery, reported recently in The New England Journal of Medicine, makes it possible at long last to provide a definitive diagnosis for people with this CVID subtype, paving the way for them to receive more precise medical treatment and care. More broadly, the new study demonstrates the power of precision medicine approaches to help the estimated 25 to 30 million Americans who live with rare diseases [2].

Study Shows DNA Sequencing Brings Greater Precision to Childhood Cancer

Posted by Dr. Francis Collins

Dr. Plon with a patient and her family

Caption: Baylor’s Sharon Plon consults with a family at the Texas Children’s Cancer Center in Houston.
Credit: Paul V. Kuntz/Texas Children’s Hospital

An impressive number of fundamental advances in our understanding of cancer have occurred over the past several decades. One of the most profound is the realization that cancer is a disease of the genome, driven by a wide array of changes in DNA—some in the germline and affecting all cells of the body, but most occurring in individual cells during life (so-called “somatic mutations”). As the technology for sequencing cancer genomes has advanced, we are learning that virtually all cancers carry a unique set of mutations. Most are DNA copying errors of no significance (we call those “passengers”), but a few of them occur in genes that regulate cell growth and contribute causatively to the cancer (we call those “drivers”). We are now learning that it may be far more important for treating cancer to figure out what driver mutations are present in a patient’s tumor than to identify in which organ it arose. And, as a new study shows, this approach even appears to have potential to help cancer’s littlest victims.

Using genomic technology to analyze both tumor and blood samples from a large number of children who’d been newly diagnosed with cancer, an NIH-funded research team uncovered genetic clues with the potential to refine diagnosis, identify inherited cancer susceptibility, or guide treatment for nearly 40 percent of the children [1]. The potential driver mutations spanned a broad spectrum of genes previously implicated not only in pediatric cancers, but also in adult cancers. While much more work remains to determine how genomic analyses can be used to devise precise, new strategies for treating kids with cancer, the study provides an excellent example of the kind of research that NIH hopes to accelerate under the nation’s new cancer “moonshot,” a research initiative recently announced by the President and being led by the Vice President.

Creative Minds: Interpreting Your Genome

Posted by Dr. Francis Collins

Artist's rendering of a doctor with a patient and a strand of DNA

Credit: Jane Ades, National Human Genome Research Institute, NIH

Just this year, we’ve reached the point where we can sequence an entire human genome for less than $1,000. That’s great news, and rather astounding, since the first human genome sequence (finished in 2003) cost an estimated $400,000,000! Does that mean we’ll be able to use each person’s unique genetic blueprint to guide his or her health care from cradle to grave? Maybe eventually, but it’s not quite as simple as it sounds.

Before we can use your genome to develop more personalized strategies for detecting, treating, and preventing disease, we need to be able to interpret the many variations that make your genome distinct from everybody else’s. While most of these variations are neither bad nor good, some raise the risk of particular diseases, and others serve to lower the risk. How do we figure out which is which?

Jay Shendure, an associate professor at the University of Washington in Seattle, has an audacious plan to figure this out, which is why he is among the 2013 recipients of the NIH Director’s Pioneer Award.

Exploring the Complex Genetics of Schizophrenia

Posted by Dr. Francis Collins

Illustration of a human head showing a brain and DNA

Credit: Jonathan Bailey, National Human Genome Research Institute, NIH

Schizophrenia is one of the most prevalent, tragic, and frustrating of all human illnesses, affecting about 1% of the human population, or 2.4 million Americans [1]. Decades of research have failed to provide a clear cause in most cases, but family clustering has suggested that inheritance must play some role. Over the last five years, multiple research projects known as genome-wide association studies (GWAS) have identified dozens of common variations in the human genome associated with increased risk of schizophrenia [2]. However, the individual effects of these variants are weak, and it has often been unclear which genes were actually affected by the variations. Now, advances in DNA sequencing technology have made it possible to move beyond these association studies to study the actual DNA sequence of the protein-coding region of the entire genome for thousands of individuals with schizophrenia. Reports just published have revealed a complex constellation of rare mutations that point to specific genes, at least in certain cases.