International “Big Data” Study Offers Fresh Insights into T2D

World map

Caption: This international “Big Data” study involved hundreds of researchers in 22 countries (red).

It’s estimated that about 10 percent of the world’s population either has type 2 diabetes (T2D) or will develop the disease during their lives [1]. Type 2 diabetes (formerly called “adult-onset”) happens when the body doesn’t produce or use insulin properly, causing glucose levels to rise. While diet and exercise are critical contributory factors to this potentially devastating disease, genetic factors are also important. In fact, over the last decade alone, studies have turned up more than 80 genetic regions that contribute to T2D risk, with much more still to be discovered.

Now, a major international effort, which includes work from my own NIH intramural research laboratory, has published new data that accelerate understanding of how a person’s genetic background contributes to T2D risk. The new study, reported in Nature and unprecedented in its investigative scale and scope, pulled together the largest-ever inventory of DNA sequence changes involved in T2D, and compared their distribution in people from around the world [2]. This “Big Data” strategy has already yielded important new insights into the biology underlying the disease, some of which may lead to novel approaches to diabetes treatment and prevention.

The study, led by Michael Boehnke at the University of Michigan, Ann Arbor, Mark McCarthy at the University of Oxford, England, and David Altshuler, until recently at the Broad Institute, Cambridge, MA, involved more than 300 scientists in 22 countries.

The results from two related studies were combined to produce this work. One is called the Genetics of Type 2 Diabetes (GoT2D), which receives substantial NIH funding and included people from the United Kingdom, Sweden, Finland, and Germany. It has produced comprehensive whole genome sequencing (WGS) data representing about 2,650 people, roughly half with diabetes. The other consortium is the NIH-funded Type 2 Diabetes Genetic Exploration by Next-generation sequencing in multi-Ethnic Samples (T2D-GENES). It has sequenced the exomes—the 1.5 percent of the genome that codes for proteins—of nearly 13,000 individuals, including people with and without T2D who trace their ancestry to Europe, South and East Asia, Latin America, and Africa. But the work reported by this international collaboration went even further: based on sequence data from these two initial projects, a combination of experimental genotyping and mathematical imputation produced highly relevant information from another 111,548 individuals.

The complex genetics of T2D, once referred to by human genetics pioneer James V. Neel as “the geneticist’s nightmare,” has been gradually yielding to new genomic methods. Over the past 10 years, multiple research projects known as genome-wide association studies (GWAS) have identified scores of common variations in the human genome associated with increased risk of T2D. But GWAS can detect only risk variants that occur in at least a few percent of the individuals studied. Furthermore, most of the variants found by GWAS fall outside the protein-coding part of the genome, making it hard to interpret their functional basis.
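One reason, among others, that rare variants are hard to study is simple arithmetic: in a modest-sized cohort, very few people carry them. The sketch below is illustrative only and is not the study's method. The cohort sizes and allele frequencies are hypothetical, and it assumes each person's two gene copies are inherited independently (Hardy-Weinberg proportions).

```python
def expected_carriers(n_people: int, allele_freq: float) -> float:
    """Expected number of people carrying at least one copy of a variant.

    Assumes Hardy-Weinberg proportions: each person has two independent
    copies, so the chance of carrying none is (1 - allele_freq) ** 2.
    """
    carrier_prob = 1.0 - (1.0 - allele_freq) ** 2
    return n_people * carrier_prob


# Hypothetical cohort sizes and allele frequencies, for illustration only.
for n_people in (2_000, 100_000):
    for allele_freq in (0.05, 0.001):
        carriers = expected_carriers(n_people, allele_freq)
        print(f"N={n_people:>7,}  freq={allele_freq:<6}  expected carriers ~ {carriers:,.0f}")
```

Under these assumptions, a variant at 5 percent frequency has about 195 expected carriers among 2,000 people, while a variant at 0.1 percent frequency has only about 4, too few to compare meaningfully between people with and without the disease.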

Could we have been missing the most significant, but rare, mutations in the coding region of genes? Those would be easier to understand functionally and would point more directly toward possible drug targets. By sequencing directly in GoT2D and T2D-GENES, researchers could search for rare changes in the coding regions of genes that had previously been invisible.

The sequencing data yielded some important insights. More than a dozen genes were found to harbor diabetes-related variants that change the structure and function of proteins. But nearly all of these discoveries fell in the same region as a GWAS variant. That suggests the responsible genes can act in at least two ways to increase T2D risk. One way is the presence of common variants that either increase or decrease gene expression, and the other is the presence of rare variants in different individuals that directly alter the gene’s protein product.

Several of these new coding-region discoveries offer intriguing clues about the mechanisms underlying T2D. For example, the data uncovered a single coding variant in a gene called PAX4 that was powerfully associated with T2D, but only in people from East Asian countries, including Korea, China and Singapore. The study also implicated another gene, called TM6SF2, already known for its role in the development of fatty liver disease, a chronic liver condition that is very common in people with T2D and often makes their diabetes harder to control.

The study findings address a debate among scientists that goes back more than a century. Some had argued that common variants provide most of the heritable risk of T2D and other common diseases. Others had reasoned that rarer familial variations are the real drivers. This extensive analysis clearly supports the first view: common variants provide the majority of the genetic risk for this disorder. Whether that will be the case for other common complex diseases like hypertension, asthma, or schizophrenia remains to be determined.

Taken together, the findings suggest that everyone carries an assortment of genetic variants related to T2D, including some that may offer protection and others that may place us at greater risk for disease. Most of these versions of the DNA code are widely shared within and between populations, but the precise combination of risk and protective variants carried by any given individual is likely to be unique. As scientists learn more, it may become possible to identify where people fall on that spectrum of risk and, should they develop T2D, learn what impact it may have on the best course of treatment or prognosis.
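The idea that each person carries a unique mix of risk and protective variants can be pictured as a simple additive score. This is a minimal sketch under stated assumptions, not the method used in the study. The variant names and effect sizes are hypothetical, and effects are assumed to add up on a log-odds scale.

```python
import math

# Hypothetical variants mapped to a log-odds effect per allele copy.
# Positive effects raise risk; negative effects are protective.
HYPOTHETICAL_EFFECTS = {
    "variant_A": 0.10,
    "variant_B": 0.05,
    "variant_C": -0.08,
}


def risk_score(genotype: dict[str, int]) -> float:
    """Sum effect * allele count (0, 1, or 2 copies) over all variants."""
    return sum(HYPOTHETICAL_EFFECTS[name] * copies for name, copies in genotype.items())


# Two hypothetical people with different combinations of the same variants.
person_1 = {"variant_A": 2, "variant_B": 0, "variant_C": 1}
person_2 = {"variant_A": 0, "variant_B": 1, "variant_C": 2}

for label, genotype in (("person_1", person_1), ("person_2", person_2)):
    score = risk_score(genotype)
    print(f"{label}: score={score:+.2f}, relative odds ~ {math.exp(score):.2f}")
```

In this toy example the first person ends up slightly above baseline risk and the second slightly below, even though both carry variants from the same shared pool.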

As the number of people with T2D continues to soar, these new findings provide much-needed stepping stones on the path to new treatment strategies for diabetes and its many health complications, which often affect the eyes, nerves, kidneys, and heart. For those diagnosed with T2D, currently available treatments can often keep blood glucose under control, but they rarely return a person’s metabolism completely back to normal. So new insights that could lead to more effective treatments are most welcome.

Another important point about this study: all of the data have been made immediately available through the Knowledge Portal of the Accelerating Medicines Partnership (AMP), a joint effort of NIH and several biopharmaceutical companies and non-profit organizations. The wide availability of this large data set should accelerate the search for the most promising biological targets for diagnostic and drug development.


[1] IDF Diabetes Atlas. International Diabetes Federation.

[2] The genetic architecture of type 2 diabetes. Fuchsberger C et al. Nature. 2016 July 11.


Diabetes A-Z (National Institute of Diabetes and Digestive and Kidney Diseases/NIH)

Diabetes (National Human Genome Research Institute/NIH)

T2D-GENES Consortium (University of Michigan, Ann Arbor)

GoT2D Consortium

Type 2 Diabetes Knowledge Portal (Accelerating Medicines Partnership)

NIH Support: National Institute of Diabetes and Digestive and Kidney Diseases; National Human Genome Research Institute; National Institute on Aging; National Cancer Institute; National Institute of General Medical Sciences; National Heart, Lung and Blood Institute; National Institute of Mental Health
