A Global Look at Cancer Genomes

Posted by Dr. Francis Collins

Cancer Genomics

Cancer is a disease of the genome. It can be driven by many different types of DNA misspellings and rearrangements, which can cause cells to grow uncontrollably. While the first oncogenes with the potential to cause cancer were discovered more than 35 years ago, it’s been a long slog to catalog the universe of these potential DNA contributors to malignancy, let alone explore how they might inform diagnosis and treatment. So, I’m thrilled that an international team has completed the most comprehensive study to date of the entire genomes—the complete sets of DNA—of 38 different types of cancer.

Among the team’s most important discoveries is that the vast majority of tumors—about 95 percent—contained at least one identifiable spelling change in their genomes that appeared to drive the cancer [1]. That’s significantly higher than the level of “driver mutations” found in past studies that analyzed only a tumor’s exome, the small fraction of the genome that codes for proteins. Because many cancer drugs are designed to target specific proteins affected by driver mutations, the new findings indicate it may be worthwhile, perhaps even life-saving in many cases, to sequence the entire tumor genomes of a great many more people with cancer.

The latest findings, detailed in an impressive collection of 23 papers published in Nature and its affiliated journals, come from the international Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium. Known as the Pan-Cancer Project for short, it builds on earlier efforts to characterize the genomes of many cancer types, including NIH’s The Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC).

In these latest studies, a team including more than 1,300 researchers from around the world analyzed the complete genomes of more than 2,600 cancer samples. Those samples included tumors of the brain, skin, esophagus, liver, and more, along with matched healthy cells taken from the same individuals.

In each of the resulting new studies, teams of researchers dug deep into various aspects of the cancer DNA findings to make a series of important inferences and discoveries. Here are a few intriguing highlights:

• The average cancer genome was found to contain not just one driver mutation, but four or five.

• About 13 percent of those driver mutations were found in so-called non-coding DNA, portions of the genome that don’t code for proteins [2].

• The mutations arose within about 100 different molecular processes, as indicated by their unique patterns or “mutational signatures” [3,4].

• Some of those signatures are associated with known cancer causes, including aberrant DNA repair and exposure to known carcinogens, such as tobacco smoke or UV light. Interestingly, many others are as-yet unexplained, suggesting there’s more to learn with potentially important implications for cancer prevention and drug development.

• A comprehensive analysis of 47 million genetic changes pieced together the chronology of cancer-causing mutations. This work revealed that many driver mutations occur years, if not decades, prior to a cancer’s diagnosis, a discovery with potentially important implications for early cancer detection [5].

The findings represent a big step toward cataloging all the major cancer-causing mutations, with important implications for the future of precision cancer care. And yet, the fact that the drivers in 5 percent of cancers remain mysterious (though they do have RNA abnormalities) comes as a reminder that there’s still a lot more work to do. The challenging next steps include connecting the cancer genome data to treatments and building meaningful predictors of patient outcomes.

To help in these endeavors, the Pan-Cancer Project has made all of its data and analytic tools available to the research community. As researchers at NIH and around the world continue to detail the diverse genetic drivers of cancer and the molecular processes that contribute to them, there is hope that these findings and others will ultimately vanquish, or at least rein in, this Emperor of All Maladies.


[1] Pan-Cancer analysis of whole genomes. ICGC/TCGA Pan-Cancer Analysis of Whole Genomes Consortium. Nature. 2020 Feb;578(7793):82-93.

[2] Analyses of non-coding somatic drivers in 2,658 cancer whole genomes. Rheinbay E et al; PCAWG Consortium. Nature. 2020 Feb;578(7793):102-111.

[3] The repertoire of mutational signatures in human cancer. Alexandrov LB et al; PCAWG Consortium. Nature. 2020 Feb;578(7793):94-101.

[4] Patterns of somatic structural variation in human cancer genomes. Li Y et al; PCAWG Consortium. Nature. 2020 Feb;578(7793):112-121.

[5] The evolutionary history of 2,658 cancers. Gerstung M, Jolly C, Leshchiner I, Dentro SC et al; PCAWG Consortium. Nature. 2020 Feb;578(7793):122-128.


The Genetics of Cancer (National Cancer Institute/NIH)

Precision Medicine in Cancer Treatment (NCI)

ICGC/TCGA Pan-Cancer Project

The Cancer Genome Atlas Program (NIH)

NCI and the Precision Medicine Initiative (NCI)

NIH Support: National Cancer Institute, National Human Genome Research Institute



    A brilliant Pan-Cancer Project in the ever-expanding Cancer-Health Disparities-Global Public Health field!

    The expert highlights by the NIH USA Director Dr. Collins propelled my inherent scientific curiosity for diligently deciphering the complex labyrinth of interrelated aberrant-physiological inflammatory metabolic flux in diverse human carcinomas ranging from colorectal-hepatocellular-cervical-breast-prostate-lung-glioblastoma, etc. Pharmacogenetics and tailor-made personalized gene therapy for cancer-cell type-mutations/genetic variants in susceptible cohorts of varying genetic landscapes are urgently warranted for developing patient-friendly cost-effective predictive and prognostic biomarkers in cancer treatment.

    With my proven scientific research excellence as evident in 46 first authorships as of February 2020, I am overwhelmed to gather innovative public health oriented novel scientific knowledge in my broad areas of translational/public health research, including carcinogenesis-epidemiology.

    Thank you for providing a stimulating clinical research crisp update with fascinating avenues for designing and implementing my future research grants/proposals in a competitive biomedical/lifesciences/public health research in 2020 and future years!

  • Kimberly CB says:

    Dr. Collins…you are a great writer! Thank you for keeping us updated on NIH research. We met way back in 2013…our son, Michael, was trying out several study trials for relapsed pre-B ALL. In fact, you gave Michael and me a ride to the Argyle Country Club for the charity tournament…Michael and a father/son group partnered up and they WON. Your beauty-full wife gave us a ride back to the Peds Clinical Center. I will never forget both of you and your kindness.

  • Marylou D. says:

    Thank you for keeping me informed with your important research. I had a daughter who died of breast cancer that metastasized to her brain. I have six granddaughters who need to be aware of the latest research.

  • Venil says:

    This is amazing! With so much more sequence from so many more cancers, do the ‘hallmarks of cancer’ need to be revised?

  • Martin Canizales says:

    Dr Collins, This has been a very very interesting read. Very accurate information and lots of data. Thank you for your effort providing this information. The data recently reported by MDACC with the clinical trial for CD19-NK cells is promising with 73% Response rate with leukemia and lymphoma for further more diverse kind of treatments but for solid tumors! Thank you Dr Collins! What about the Brain Initiative? Regards!

  • Dhurv Shah says:

    Amazingly written . . .

  • John says:

    Thanks for sharing this informative blog . . .
