Big Data Study Reveals Possible Subtypes of Type 2 Diabetes

Posted by Dr. Francis Collins

Caption: Computational model showing study participants with type 2 diabetes grouped into three subtypes, based on similarities in data contained in their electronic health records. Such information included age, gender (red/orange/yellow indicates females; blue/green, males), health history, and a range of routine laboratory and medical tests.
Credit: Dudley Lab, Icahn School of Medicine at Mount Sinai, New York

In recent years, there’s been a lot of talk about how “Big Data” stands to revolutionize biomedical research. Indeed, we’ve already gained many new insights into health and disease thanks to the power of new technologies to generate astonishing amounts of molecular data—DNA sequences, epigenetic marks, and metabolic signatures, to name a few. But what’s often overlooked is the value of combining all that with a more mundane type of Big Data: the vast trove of clinical information contained in electronic health records (EHRs).

In a recent study in Science Translational Medicine [1], NIH-funded researchers demonstrated the tremendous potential of using EHRs, combined with genome-wide analysis, to learn more about a common, chronic disease: type 2 diabetes. Sifting through the EHR and genomic data of more than 11,000 volunteers, the researchers uncovered what appear to be three distinct subtypes of type 2 diabetes. Not only does this work have implications for efforts to reduce this leading cause of death and disability, it also provides a sneak peek at the kind of discoveries that will be made possible by the new Precision Medicine Initiative’s national research cohort, which will enroll 1 million or more volunteers who agree to share their EHRs and genomic information.