Pan-Cancer Analysis of Whole Genomes Consortium

Study Reveals How Epstein-Barr Virus May Lead to Cancer

Posted by Lawrence Tabak, D.D.S., Ph.D.

Caption: Illustration shows in the foreground EBNA1 protein (blue) bound to a preferred stretch of DNA. In the background, larger amounts of the protein accumulate, breaking strands of DNA, and increasing a cell’s susceptibility to cancer. Credit: Donny Bliss, NIH

Chances are good that you’ve had an Epstein-Barr virus (EBV) infection, usually during childhood. More than 90 percent of us have, though we often don’t know it. That’s because most EBV infections are mild or produce no symptoms at all.

But in some people, EBV can lead to other health problems. The virus can cause infectious mononucleosis (“mono”), type 1 diabetes, and other ailments. It also can persist in our bodies for years and cause increased risk later in life for certain cancers, such as lymphoma, leukemia, and head and neck cancer. Now, an NIH-funded team has some of the best evidence yet to explain how this EBV that hangs around may lead to cancer [1].

The paper, published recently in the journal Nature, shows that a key viral protein readily binds to a particular spot on a particular human chromosome. Where the protein accumulates, the chromosome becomes more prone to breaking for reasons that aren’t yet fully known. What the study makes clearer is that the breakage produces latently infected cells that are more likely over time to become cancerous.

This discovery could pave the way for screening methods that identify people at particular risk of developing EBV-associated cancers. It may also fuel the development of promising new ways to prevent these cancers from arising in the first place.

The work comes from a team led by Don Cleveland and Julia Su Zhou Li, University of California San Diego’s Ludwig Cancer Research, La Jolla, CA. Over the years, it’s been established that EBV, a type of herpes virus, often is detected in certain cancers, particularly in people with a long-term latent infection. What interested the team is a viral protein, called EBNA1, which routinely turns up in those same EBV-related cancers.

The EBNA1 protein is especially interesting because it binds viral DNA in particular spots, which allows the virus to persist and make more copies of itself. This discovery raised the intriguing possibility that the protein may also bind similar sequences in human DNA. While it had been suggested previously that this interaction might play a role in EBV-associated cancers, the details had remained murky—until now.

In the new study, the researchers first made uninfected human cells produce the viral EBNA1 protein. They then peered inside them with a microscope to see where those proteins went. In both healthy and cancerous human cells, they watched as EBNA1 proteins built up at two distinct spots and confirmed that this accumulation was dependent on the protein’s ability to bind DNA.

Next, they mapped where exactly EBNA1 binds to human DNA. Interestingly, it was along a repetitive non-protein-coding stretch of DNA on human chromosome 11. This region includes more than 300 copies of an 18-letter sequence that looks quite similar to the EBNA1-binding sites in its own viral genome.

What’s more, the researchers noticed that the repetitive DNA there takes on a structure that’s known for being unstable. And these so-called fragile sites are inherently prone to breaking.

The team went on to uncover evidence that the buildup of EBNA1 at this already fragile site only makes matters worse. In EBV-infected cells, increasing the amount of EBNA1 protein led to more chromosome 11 breaks. Those breaks showed up within a single day in about 40 percent of cells.

For these cells, those breaks also may be a double whammy. That’s because the breaks are located next to neighboring genes with long recognized roles in regulating cell growth. When altered, these genes can contribute to turning a cell cancerous.

To further nail down the link to cancer, the researchers looked to whole-genome sequencing data for more than 2,400 cancers including 38 tumor types from the international Pan-Cancer Analysis of Whole Genomes consortium [2]. They found that tumors with detectable EBV also had an unusually high number of chromosome 11 abnormalities. In fact, that was true in every single case of head and neck cancer.

The findings suggest that people will vary in their susceptibility to EBNA1-induced DNA breaks along chromosome 11 based on the amount of EBNA1 protein in their latently infected cells. It also will depend on the number of EBV-like DNA repeats present in their DNA.

Given these new findings, it’s worth noting that the presence of EBV and the very same viral protein has also been implicated in the link between EBV and multiple sclerosis (MS) [3]. Together, these recent findings are a reminder of the value in pursuing an EBV vaccine that might thwart this infection and its associated conditions, including certain cancers and MS. And, we’re getting there. In fact, an early-stage clinical trial for an experimental EBV vaccine is now ongoing here at the NIH Clinical Center.


[1] Chromosomal fragile site breakage by EBV-encoded EBNA1 at clustered repeats. Li JSZ, Abbasi A, Kim DH, Lippman SM, Alexandrov LB, Cleveland DW. Nature. 2023 Apr 12.

[2] Pan-cancer analysis of whole genomes. ICGC/TCGA Pan-Cancer Analysis of Whole Genomes Consortium. Nature. 2020 Feb;578(7793):82-93.

[3] Clonally expanded B cells in multiple sclerosis bind EBV EBNA1 and GlialCAM. Lanz TV, Brewer RC, Steinman L, Robinson WH, et al. Nature. 2022 Mar;603(7900):321-327.


About Epstein-Barr Virus (Centers for Disease Control and Prevention, Atlanta)

Head and Neck Cancer (National Cancer Institute/NIH)

Multiple Sclerosis (National Institute of Neurological Disorders and Stroke/NIH)

Don W. Cleveland Lab (University of California San Diego, La Jolla, CA)

NIH Support: National Institute of General Medical Sciences; National Institute of Environmental Health Sciences; National Cancer Institute

A Global Look at Cancer Genomes

Posted by Dr. Francis Collins

Cancer is a disease of the genome. It can be driven by many different types of DNA misspellings and rearrangements, which can cause cells to grow uncontrollably. While the first oncogenes with the potential to cause cancer were discovered more than 35 years ago, it’s been a long slog to catalog the universe of these potential DNA contributors to malignancy, let alone explore how they might inform diagnosis and treatment. So, I’m thrilled that an international team has completed the most comprehensive study to date of the entire genomes—the complete sets of DNA—of 38 different types of cancer.

Among the team’s most important discoveries is that the vast majority of tumors—about 95 percent—contained at least one identifiable spelling change in their genomes that appeared to drive the cancer [1]. That’s significantly higher than the level of “driver mutations” found in past studies that analyzed only a tumor’s exome, the small fraction of the genome that codes for proteins. Because many cancer drugs are designed to target specific proteins affected by driver mutations, the new findings indicate it may be worthwhile, perhaps even life-saving in many cases, to sequence the entire tumor genomes of a great many more people with cancer.

The latest findings, detailed in an impressive collection of 23 papers published in Nature and its affiliated journals, come from the international Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium. Known as the Pan-Cancer Project for short, it builds on earlier efforts to characterize the genomes of many cancer types, including NIH’s The Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC).

In these latest studies, a team including more than 1,300 researchers from around the world analyzed the complete genomes of more than 2,600 cancer samples. Those samples included tumors of the brain, skin, esophagus, liver, and more, along with matched healthy cells taken from the same individuals.

In each of the resulting new studies, teams of researchers dug deep into various aspects of the cancer DNA findings to make a series of important inferences and discoveries. Here are a few intriguing highlights:

• The average cancer genome was found to contain not just one driver mutation, but four or five.

• About 13 percent of those driver mutations were found in so-called non-coding DNA, portions of the genome that don’t code for proteins [2].

• The mutations arose within about 100 different molecular processes, as indicated by their unique patterns or “mutational signatures” [3,4].

• Some of those signatures are associated with known cancer causes, including aberrant DNA repair and exposure to known carcinogens, such as tobacco smoke or UV light. Interestingly, many others are as-yet unexplained, suggesting there’s more to learn with potentially important implications for cancer prevention and drug development.

• A comprehensive analysis of 47 million genetic changes pieced together the chronology of cancer-causing mutations. This work revealed that many driver mutations occur years, if not decades, prior to a cancer’s diagnosis, a discovery with potentially important implications for early cancer detection [5].

The findings represent a big step toward cataloging all the major cancer-causing mutations, with important implications for the future of precision cancer care. And yet, the fact that the drivers in 5 percent of cancers remain mysterious (though they do have RNA abnormalities) comes as a reminder that there’s still a lot more work to do. The challenging next steps include connecting the cancer genome data to treatments and building meaningful predictors of patient outcomes.

To help in these endeavors, the Pan-Cancer Project has made all of its data and analytic tools available to the research community. As researchers at NIH and around the world continue to detail the diverse genetic drivers of cancer and the molecular processes that contribute to them, there is hope that these findings and others will ultimately vanquish, or at least rein in, this Emperor of All Maladies.


[1] Pan-cancer analysis of whole genomes. ICGC/TCGA Pan-Cancer Analysis of Whole Genomes Consortium. Nature. 2020 Feb;578(7793):82-93.

[2] Analyses of non-coding somatic drivers in 2,658 cancer whole genomes. Rheinbay E et al; PCAWG Consortium. Nature. 2020 Feb;578(7793):102-111.

[3] The repertoire of mutational signatures in human cancer. Alexandrov LB et al; PCAWG Consortium. Nature. 2020 Feb;578(7793):94-101.

[4] Patterns of somatic structural variation in human cancer genomes. Li Y et al; PCAWG Consortium. Nature. 2020 Feb;578(7793):112-121.

[5] The evolutionary history of 2,658 cancers. Gerstung M, Jolly C, Leshchiner I, Dentro SC et al; PCAWG Consortium. Nature. 2020 Feb;578(7793):122-128.


The Genetics of Cancer (National Cancer Institute/NIH)

Precision Medicine in Cancer Treatment (NCI)

ICGC/TCGA Pan-Cancer Project

The Cancer Genome Atlas Program (NIH)

NCI and the Precision Medicine Initiative (NCI)

NIH Support: National Cancer Institute, National Human Genome Research Institute