
There is a consensus in the literature that good-quality data is a cornerstone for understanding and effectively addressing long-standing ethnic inequalities in health (Sheikh, et al., 2023), (Kapadia, et al., 2022). For example, disaggregated ethnicity data can provide evidence of populations at risk of poor health(care) outcomes. In turn, this evidence can be used to tailor care and to inform the design of suitable and acceptable interventions and effective evidence-based policies for the ethnic populations with the greatest health needs. Similarly, data on the wider determinants of health (e.g. area deprivation, housing, education, employment, wealth) is crucial, given that inequalities in these domains are the key drivers of health inequalities (World Health Organization, 2024). Underlying these inequities are forms of oppression such as racism, sexism and other types of discrimination, which are entrenched in law, policies and practices and are (re)produced within societal organisations and institutions (Salway, et al., 2020), (Williams and Rucker, 2000). In this piece, I reflect on the current sources of data used by researchers at the intersection of health and ethnicity and discuss some of the ways in which they navigate data limitations to better understand ethnic health inequalities.
In the UK, healthcare records from hospitals, primary care, disease registers, and specialist clinics are an invaluable resource for researchers interested in examining ethnic inequalities in health, as they offer direct measures of health. While ethnicity recording in routine electronic healthcare records has improved over time, data quality problems relating to inconsistency and incompleteness have been found, and these disproportionately affect the records of minoritised ethnic people (Scobie, Spencer and Raleigh, 2021). Coverage of ethnicity recording also varies from sector to sector, with hospital records and primary care records having better coverage than social care records (Raleigh and Goldblatt, 2020). The lack of good-quality ethnicity data in some sectors means that we lack an understanding of ethnic inequalities in some domains of healthcare (e.g. social care). Further, ethnic inequalities in health may be underestimated, and populations at risk of poor health(care) outcomes may remain invisible and miss out on vital health improvement strategies.
To overcome the limitations of the current data sources, researchers often weigh the strengths and limitations of each data source for their analyses and then lean into their methodological imagination to address gaps. Strategies include using social surveys alongside health records, adopting innovative methodological approaches to address their research questions and/or following up their quantitative analyses with qualitative research, where possible, to better understand ethnic inequalities in health. For example, when examining ethnic inequalities in age-related patterns of multiple long-term conditions (MLTCs), my colleagues and I were cognisant of the bias that stems from clinician-reported ethnicity in health records, and of the fact that self-reported ethnicity, recorded using official classifications of ethnicity, is the gold standard (Hayanga, et al., 2024). Thus, in this analysis, we used both patient records and large-scale social survey data to overcome these data quality issues. In doing so, we offset the limitations of each data source and optimised our understanding of ethnic variation in MLTCs (Hayanga, et al., 2024). A key finding of this analysis was that ethnic inequalities in the prevalence of MLTCs emerge from mid-life and that, by later life, older Pakistani, Indian, Black Caribbean and Other ethnic people have an increased risk of MLTCs compared to white British people, even after adjusting for area-level deprivation (Hayanga, et al., 2024). Trends were similar across both datasets.
Stopforth and colleagues provide another example of methodological imagination using large-scale survey data. They set out to examine the prevalence and persistence of ethnic inequalities in health in later life, and to assess the role of socio-economic position and experienced racial discrimination in explaining health inequalities (Stopforth, et al., 2021). In the absence of longitudinal surveys with adequate numbers of older minoritised people, they harmonised six nationally representative social survey datasets that covered a 24-year period. Their analysis illuminates deeply alarming ethnic inequalities in limiting long-term illness and self-rated health over two decades (Stopforth, et al., 2021).
Evidently, existing data sources are imperfect, and their limitations impede our ability to comprehensively understand ethnic inequalities in health. However, the willingness to cultivate one’s methodological imagination means that researchers need not wait for the perfect dataset to explore ethnic inequalities in health. It is important to note that exercising methodological imagination may not always be possible, given the extra cognitive, logistical and even financial resources required to effectively execute novel, complex research methods. Further, the evidence may take longer to produce, thereby delaying its incorporation into policy and practice to address urgent public health issues. As such, methodological imagination must be accompanied by sustained efforts to collect and monitor good-quality disaggregated ethnicity data, together with data on the wider determinants of health (including racism and discrimination), which will allow researchers to produce evidence that can meaningfully inform evidence-based policies and interventions to address ethnic health inequalities.