
Cardiometabolic Disease: Big Data Tackles a Big Health Problem

Posted by Dr. Francis Collins

Cardiometabolic risk loci

More and more studies are popping up that demonstrate the power of Big Data analyses to get at the underlying molecular pathology of some of our most common diseases. A great example, which may have flown a bit under the radar during the summer holidays, involves cardiometabolic disease. It’s an umbrella term for common vascular and metabolic conditions, including hypertension, impaired glucose and lipid metabolism, excess belly fat, and inflammation. All of these components of cardiometabolic disease can increase a person’s risk for a heart attack or stroke.

In the study [1], an international research team tapped into the power of genomic data to develop clearer pictures of the complex biocircuitry in seven types of vascular and metabolic tissue known to be affected by cardiometabolic disease: the liver, the heart’s aortic root, visceral abdominal fat, subcutaneous fat, internal mammary artery, skeletal muscle, and blood. The researchers found that while some circuits might regulate the level of gene expression in just one tissue, that’s often not the case. In fact, the researchers’ computational models show that such genetic circuitry can be organized into super networks that work together to influence how multiple tissues carry out fundamental life processes, such as metabolizing glucose or regulating lipid levels. These networks may be perturbed by inherited variants that affect gene expression, or by environmental influences such as a high-carb diet, a sedentary lifestyle, the aging process, or infectious disease. When that happens, the researchers’ modeling work suggests that multiple tissues can be affected, resulting in chronic, systemic disorders including cardiometabolic disease.

The work, published in the journal Science and partially supported by NIH, was initiated by Johan L.M. Björkegren, a scientist from the Karolinska Institute, Stockholm, who has teamed up with Big Data innovator Eric Schadt at the Icahn School of Medicine at Mount Sinai, New York. In 2007, Björkegren contacted heart surgeon Arno Ruusalepp at the University of Tartu in Estonia to launch a study called STARNET, in which they collected tissue samples from 600 human volunteers with cardiovascular disease undergoing coronary artery bypass surgery. The volunteers were white and mostly male, with diagnoses of diabetes (27 percent), hyperlipidemia (68 percent), and hypertension (77 percent). About 40 percent had survived a previous heart attack.

The researchers analyzed the RNA in each of the tissue samples, which enabled them to create a composite view of gene activity in each of the various tissue types. In addition, they analyzed the participants’ genomes, looking for DNA variations associated with risk of cardiometabolic disease. That combined search yielded 8 million regions of DNA, called expression quantitative trait loci (eQTLs), that uniquely affect gene activity and possibly influence progression of cardiometabolic disease. Next, the team integrated its analyses of these eQTLs with data found in NIH’s National Human Genome Research Institute (NHGRI) GWAS Catalog, which contains the full catalogue of more than 3,300 common human genetic variants associated with disease risk in various genome-wide association studies (GWAS).
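For readers curious about what an eQTL test looks like in its simplest form, here is a minimal sketch in Python. It is not the STARNET team's pipeline, which the post does not describe in detail. It assumes the most basic textbook setup: each person carries 0, 1, or 2 copies of a variant, and we ask whether a gene's expression level in one tissue rises or falls with that copy number. The function name `eqtl_slope` and the toy numbers are made up purely for illustration.

```python
from statistics import mean


def eqtl_slope(dosages, expression):
    """Least-squares slope of expression on genotype dosage.

    dosages: copies of the variant allele per person (0, 1, or 2).
    expression: that person's expression level for one gene in one tissue.
    A slope far from zero hints that the variant may act as an eQTL.
    """
    if len(dosages) != len(expression):
        raise ValueError("dosages and expression must have the same length")
    d_bar = mean(dosages)
    e_bar = mean(expression)
    # Spread of the genotypes around their mean.
    sxx = sum((d - d_bar) ** 2 for d in dosages)
    if sxx == 0:
        raise ValueError("no genotype variation; slope is undefined")
    # How genotype and expression vary together.
    sxy = sum((d - d_bar) * (e - e_bar) for d, e in zip(dosages, expression))
    return sxy / sxx


# Hypothetical toy data for six people (not from the study).
dosages = [0, 0, 1, 1, 2, 2]
expression = [1.0, 1.2, 2.1, 1.9, 3.0, 3.2]
slope = eqtl_slope(dosages, expression)
print(f"estimated change in expression per allele copy: {slope:.2f}")
```

A real analysis would also test whether the slope is statistically distinguishable from zero, adjust for covariates such as age and sex, and correct for the enormous number of variant-gene pairs tested across tissues.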

The NHGRI GWAS Catalog includes more than 150 markers of risk for coronary artery disease [2]. However, because most of these markers lie in regions of the genome that don’t code for proteins, it has been difficult to determine the mechanism by which they contribute to the disease process. Armed with their new data and an array of computational tools, the group took on the challenge: assigning at least one gene implicated in cardiometabolic disease, along with its target tissue, to 61 percent (or 2,047) of all GWAS findings to date!

The researchers then went on to use sophisticated algorithms to infer gene interactions near and far, developing models that provide an unprecedented view of the complex molecular networks that may be in play in cardiometabolic disease across the seven tissues studied. Such modeling has uncovered many intriguing, and in some cases unexpected, leads for future study. For example, the analyses point to a super network of hundreds of genes that may interact across tissues to regulate the risk of coronary artery disease. The modeling also indicates that blood lipid levels share the most regulatory genes among cardiometabolic conditions, suggesting that lipids may be a central factor and perhaps a key one to focus on in efforts to find new ways to treat and prevent this chronic disease.
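To make the idea of a cross-tissue network a bit more concrete, here is a toy sketch in Python. It is an illustration only and does not reproduce the algorithms used in the study. It assumes we already have a list of inferred links between genes, each gene labeled with the tissue in which it was measured, and it simply groups linked genes together and keeps the groups that span more than one tissue. The gene names (`GENE_A` and so on) and the links are hypothetical.

```python
from collections import defaultdict


def connected_components(edges):
    """Group nodes into sets that are linked directly or indirectly."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        # Depth-first walk over everything reachable from this node.
        stack = [start]
        component = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(adjacency[node] - seen)
        components.append(component)
    return components


def cross_tissue(components):
    """Keep only the groups whose genes come from more than one tissue."""
    return [c for c in components if len({tissue for tissue, _ in c}) > 1]


# Hypothetical inferred links; each node is a (tissue, gene) pair.
edges = [
    (("liver", "GENE_A"), ("liver", "GENE_B")),
    (("liver", "GENE_B"), ("visceral_fat", "GENE_C")),
    (("blood", "GENE_D"), ("blood", "GENE_E")),
]
networks = cross_tissue(connected_components(edges))
for network in networks:
    print(sorted(network))
```

In this toy example the liver and visceral fat genes end up in one shared group, while the two blood genes form a group confined to a single tissue and are filtered out.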

The researchers also showed how Big Data analyses could help to improve the development of new drugs. A prime example is the many lipid-lowering drugs now under development that target the PCSK9 protein. It had been thought that the PCSK9 gene, which makes a protein that helps to control plasma levels of low-density lipoprotein (LDL), or “bad cholesterol,” is expressed only in the liver. But the researchers discovered a different scenario. When they looked at gene expression levels in their STARNET volunteers’ tissues, they found that PCSK9 activity in visceral abdominal fat (the part of belly fat located within the peritoneum), not the liver, was associated with risk of early heart attacks. In a follow-up analysis, the researchers also confirmed that participants in the upper percentile of waist-to-hip ratio, a specific measure of belly fat, had higher circulating PCSK9 and LDL levels than those in the lower percentiles.

Björkegren and colleagues note their network models remain works in progress that will need further validation and refinement to bolster their reliability. But with more Big Data studies being published all the time and the Precision Medicine Initiative Cohort Program nearing its launch, our view of these molecular networks will only increase in resolution in the years ahead, helping to pave the way for a new generation of strategies for improving human health.


[1] Cardiometabolic risk loci share downstream cis- and trans-gene regulation across tissues and diseases. Franzén O, Ermel R, Cohain A, Akers NK, Di Narzo A, Talukdar HA, Foroughi-Asl H, Giambartolomei C, Fullard JF, Sukhavasi K, Köks S, Gan LM, Giannarelli C, Kovacic JC, Betsholtz C, Losic B, Michoel T, Hao K, Roussos P, Skogsberg J, Ruusalepp A, Schadt EE, Björkegren JL. Science. 2016 Aug 19;353(6301):827-830.

[2] The NHGRI GWAS Catalog, a curated resource of SNP-trait associations. Welter D, MacArthur J, Morales J, Burdett T, Hall P, Junkins H, Klemm A, Flicek P, Manolio T, Hindorff L, Parkinson H. Nucleic Acids Res. 2014 Jan;42(Database issue):D1001-D1006.


What is Metabolic Syndrome? (National Heart, Lung, and Blood Institute/NIH)

Johan Björkegren (Icahn School of Medicine at Mount Sinai, New York)

Eric Schadt (Icahn School of Medicine at Mount Sinai)

Precision Medicine Initiative Cohort Program (NIH)

NIH Support: National Heart, Lung, and Blood Institute; National Institute on Aging

