Perpetuating Bias: The Dangers of Incomplete COVID-19 Data on Health Inequities - Annals of Internal Medicine: Fresh Look Blog


Wednesday, May 20, 2020


Amid the coronavirus disease 2019 (COVID-19) pandemic, leaders and policymakers are looking for data to inform decision making. For instance, Governors Gavin Newsom (D-CA), Kate Brown (D-OR), and Jay Inslee (D-WA) announced that the decisions to reopen their states’ economies will be guided by science and data. This declaration raises the question: What data are we collecting?

“Garbage in, garbage out,” the saying goes, and in the case of COVID-19, a lack of health equity in collection and application of data contributes to the garbage. In a recent Annals article, Rajkomar and colleagues highlight this point, describing how biases perpetuated in machine-learning models can exacerbate health care disparities (1). The sentiment aligns with the term weapons of math destruction coined by mathematician Cathy O’Neil to describe algorithms that can have potentially devastating effects on individuals as a result of being based on faulty data.

May is Asian American and Pacific Islander Heritage Month, providing an opportunity to use experience with research among Asian and Asian American populations to exemplify the importance of collecting data with health equity in mind and glean insights into the COVID-19 pandemic. Asians and Asian American Pacific Islanders, both socially and culturally very diverse groups, are often inadequately recruited and represented in research because of barriers such as concerns about providing informed consent, limited language access, and lack of time or financial resources.

As a result of this systematic underrepresentation in the data, clinical models are less likely to be applicable to these populations. The potential effect is highlighted by studies that do involve broad, diverse recruitment. For example, an Annals study examining metabolic abnormalities in normal-weight individuals showed that Chinese Americans and South Asians had significantly higher rates of metabolic abnormalities than white individuals despite a normal body mass index (2). Studies that do not ensure representation of these racial/ethnic groups may miss such differences.

What can these dynamics teach us amid efforts to use data to inform COVID-19 decision making? Information from the Centers for Disease Control and Prevention suggests that COVID-19 is disproportionately affecting minorities in the United States, and in particular African Americans.

However, more than half of the patient reports did not have race and ethnicity data. In addition, reports do not capture the higher rates of underlying health conditions and poverty or the lower access to regular medical care facing African Americans. This information is critical in the development and use of prediction and treatment algorithms based on COVID-19 data.

In an Ideas and Opinions article in Annals, Bibbins-Domingo stated that “this time must be different because the increasing diversity of the U.S. population and our essential workers … means that focusing on minority communities is essential both to relieve suffering in these communities and to effectively manage this crisis” (3). Without a concerted effort to ensure adequate representation of minorities in data collection, we risk introducing and perpetuating bias. Using terminology from Rajkomar and colleagues, data may not contain enough examples from a group to appropriately tailor predictions for that subpopulation (minority bias), the data set from vulnerable populations may have more missing features because of fragmented care (missing data bias), or marginalized groups may not be described with sufficient granularity (cohort bias) (1).
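To make two of these bias types concrete, here is a minimal sketch of a representation audit that could be run on a data set before any model is built. It is an illustration under stated assumptions, not a method taken from Rajkomar and colleagues: the function name `audit_representation`, the field names (`race_ethnicity`, `bmi`, `glucose`), and the example records are all hypothetical. For each group, it reports how many records there are (a small count hints at minority bias) and what fraction of the expected features are missing (a high rate hints at missing data bias). Records with no recorded group are counted separately as "unrecorded".

```python
from collections import Counter


def audit_representation(records, group_key="race_ethnicity", feature_keys=()):
    """Summarize group sizes and missing-feature rates.

    `records` is a list of dicts. `group_key` and `feature_keys` are
    hypothetical field names chosen for this sketch.
    """
    total = len(records)

    def group_of(record):
        # Treat a missing or empty group label as its own category.
        return record.get(group_key) or "unrecorded"

    counts = Counter(group_of(r) for r in records)
    summary = {}
    for group, n in counts.items():
        rows = [r for r in records if group_of(r) == group]
        # Count feature values that are absent or explicitly None.
        missing = sum(1 for r in rows for k in feature_keys if r.get(k) is None)
        cells = n * len(feature_keys)
        summary[group] = {
            "n": n,
            "share": n / total,
            "missing_feature_rate": missing / cells if cells else 0.0,
        }
    return summary


# Hypothetical example records, invented purely for illustration.
example_records = [
    {"race_ethnicity": "Group A", "bmi": 22.0, "glucose": 90},
    {"race_ethnicity": "Group A", "bmi": 25.0, "glucose": None},
    {"race_ethnicity": "Group B", "bmi": None, "glucose": None},
    {"race_ethnicity": None, "bmi": 24.0, "glucose": 95},
]

for group, stats in audit_representation(
    example_records, feature_keys=("bmi", "glucose")
).items():
    print(group, stats)
```

A check like this does not address cohort bias, which depends on how finely groups are defined in the first place; a single broad category can hide differences among the subgroups it contains.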

We all need to practice responsible data science at every stage of the data pipeline to combat health disparities. Now more than ever, this effort will involve advocating for minority representation in the data to facilitate equitable decision support for COVID-19.

  1. Rajkomar A, Hardt M, Howell MD, et al. Ensuring fairness in machine learning to advance health equity. Ann Intern Med. 2018;169:866-872. [PMID: 30508424] doi:10.7326/M18-1990
  2. Gujral UP, Vittinghoff E, Mongraw-Chaffin M, et al. Cardiometabolic abnormalities among normal-weight persons from five racial/ethnic groups in the United States. A cross-sectional analysis of two cohort studies. Ann Intern Med. 2017;166:628-636. [PMID: 28384781] doi:10.7326/M16-1895
  3. Bibbins-Domingo K. This time must be different: disparities during the COVID-19 pandemic. Ann Intern Med. 2020. [PMID: 32343767] doi:10.7326/M20-2247
