
Seeking Consensus on the Use of Population Descriptors in Genomics


Laptop with research article on Ethnicity and Race. Printer printing a page of cartoon faces
Credit: Ernesto del Aguila III, National Human Genome Research Institute, NIH

Cataloging and characterizing the thousands of genomic variants—differences in DNA sequences among individuals—across human populations is a foundational component of genomics. Scientists from various disciplinary fields compare the variation that occurs within and between the genomes of individuals and groups. Such efforts include attributing descriptors to population groups, which have historically included the use of social constructs such as race, ethnicity, ancestry, and political geographic location. Like any descriptors, these words do not fully account for the scope and diversity of the human species.

The use of race, ethnicity, and ancestry as descriptors of population groups in biomedical and genomics research has been a topic of consistent and rigorous debate within the scientific community. Human health, disease, and ancestry are all tied to how we define and explain human diversity. For centuries, scientists have incorrectly inferred that people of different races reflect discrete biological groups, which has led to deep-rooted health inequities and reinforced scientific racism.

In recent decades, genomics research has revealed the complexity of human genomic variation and the limitations of these socially derived population descriptors. The scientific community has long worked to move beyond the use of the social construct of race as a population descriptor and provide guidance about agreed-upon descriptors of human populations. Such a need has escalated with the growing numbers of large population-scale genomics studies being launched around the world, including in the United States.

To answer this call, NIH is sponsoring a National Academies of Sciences, Engineering, and Medicine (NASEM) study that aims to develop best practices in the use of race, ethnicity, and genetic ancestry in genomics research. The NASEM study is sponsored by 14 NIH institutes, centers, offices, and programs, and the resulting report will be released in February 2023.

Experts from various fields—including genomics, medicine, and social sciences—are conducting the study. Much of the effort will revolve around reviewing and assessing existing methodologies, benefits, and challenges in the use of race and ethnicity and other population descriptors in genomics research. The ad hoc committee will host three public meetings to obtain input. Look for more information regarding the committee’s next public session planned for April 2022 on the NASEM “Race, Ethnicity, and Ancestry as Population Descriptors in Genomics Research” website.

To further underscore the need for the NASEM study, an NIH study published in December 2021 revealed that the descriptors for human populations used in the genetics literature have evolved over the last 70 years [1]. For example, the use of the word “race” has substantially decreased, while the uses of “ancestry” and “ethnicity” have increased. The study provided additional evidence that population descriptors often reflect fluid, social constructs whose intention is to describe groups with common genetic ancestry. These findings reinforce the timeliness of the NASEM study, with the clear need for experts to provide guidance for establishing more stable and meaningful population descriptors for use in future genomics studies.

The full promise of genomics, including its application to medicine, depends on improving how we explain human genomic variation. The words that we use to describe participants in research studies and populations must be transparent, thoughtful, and consistent—in addition to avoiding the perpetuation of structural racism. The best and most fruitful genomics research demands a better approach.


[1] Evolving use of ancestry, ethnicity, and race in genetics research—A survey spanning seven decades. Byeon YJJ, Islamaj R, Yeganova L, Wilbur WJ, Lu Z, Brody LC, Bonham VL. Am J Hum Genet. 2021 Dec 2;108(12):2215-2223.

Use of Race, Ethnicity, and Ancestry as Population Descriptors in Genomics Research (National Academies of Sciences, Engineering, and Medicine)

“Language used by researchers to describe human populations has evolved over the last 70 years.” (National Human Genome Research Institute/NIH)

Genomic Variation Program (NHGRI)

[Note: Acting NIH Director Lawrence Tabak has asked the heads of NIH’s institutes and centers to contribute occasional guest posts to the blog as a way to highlight some of the cool science that they support and conduct. This is the third in the series of NIH institute and center guest posts that will run until a new permanent NIH director is in place.]


  • Steve White says:

    I wish to be clear I am a layman, so perhaps there is more to this effort than is apparent to me, but I do not believe the author made any case for the assertions he is making here.
    As a layman, it certainly appears there is no need to try to find new words for concepts which, despite the author’s claims, are not “social constructs”. Well, not PURELY social constructs – but are widely assumed to have biological meaning. Of course, there is some imprecision to words in common usage – if I say “black people” – I mean people with some African ancestry – but how much? And from what parts of Africa? Do I include all recent African immigrants? Do I include only descendants of slaves? How about someone who has an African father but was raised by white people? There could be disagreement, honest disagreement, over the term – but if I am talking about a finding in human biology or medicine – I probably mean people with mostly African ancestry, who I assume, on average, will have different medical problems, and different physiological responses, and the listener almost always knows, from the context, when this is the case. The author asserts that the wrong words have done terrible things to people, historically and even now, but he does not cite one example of it. In fact, for an OpEd, or for that matter, nearly anything, written by a PhD and MD, this piece is extremely vague . . .
    I would also point out – we already HAVE “descriptors” for everything. Now, they could require use of multiple words but we all end up knowing what the authors meant. If anything, it’s activists adding new terms which never existed which causes confusion – “Structural racism”, for example – which the author uses but does not define. And which I for one sincerely do not know the meaning of.

  • Dana W. Aswad says:

    If the suggestions set forward in this article are put into place, it will set science back a few hundred years.

  • DR. SAUMYA PANDEY, PH.D. says:

    Dissecting the genetic basis of complex multifactorial human diseases spanning from cancers to infertility warrants a precision-based, sophisticated pharmacogenetics/genomics approach with prospective and/or retrospective population cohort-based clinical research studies for eventual design of predictive biomarkers and drug development with emphasis on Covid-19 critical care; in this context, my American scientific contemporaries at NIH USA provide a crisp critical snapshot of the population-based genomics study-approach for significant reduction of morbidity/mortality trends by thorough analysis of large sample data-sets of genetically diverse poolants on a global scale!

  • Victoria says:

    Genetic identifiers in disease are a rather slippery slope when medical privacy cannot be maintained. This goes to things such as medical privacy among Royalty where hemophilia and porphyria are known entities. When that expands to conditions such as schizophrenia and other mental health issues there is very much a chance of abuse of information. It is not like precedent has not been set in the 20th century. Perhaps PhD programs across the globe in the field should start having more ethics classes as well as the one semester statistics that is required.
    Science progresses when the information can be of use and beneficial for humanity as a whole. When certain pockets of humanity choose to abuse science for other purposes, perhaps there is a need to take a pause and assess what is really being achieved? Future generations will probably thank those for the consideration.

  • Maya says:

    What happens when databases of direct-to-consumer services are sold for “data mining”? What protections do the “average” uninformed consumers have? Or better yet, their descendants? When databases of people with security clearances can get hacked, this is somewhat of a concern. Cyber attacks are not just limited to the tech world especially when there are reports of unexplained “Havana Syndrome”.
