Defining a High-Quality Myalgic Encephalomyelitis/Chronic Fatigue Syndrome cohort in UK Biobank.
Samms, Gemma L; Ponting, Chris P · NIHR Open Research · 2025
Quick Summary
This study looked at how ME/CFS is identified in UK Biobank, a large database of health information from hundreds of thousands of people. Researchers found that about 1 in 100 UK Biobank participants had some evidence of ME/CFS, but only about one-third had strong, consistent evidence across multiple records. When they accounted for people too ill to participate in the study, they estimated that around 410,000 people in the UK have ME/CFS.
Why It Matters
ME/CFS research has been hindered by small studies with unreplicated findings. This study shows how large population biobanks such as UK Biobank can accelerate discovery by providing standardized data on thousands of ME/CFS cases and controls at once. It also shows that accurate case identification requires multiple evidence sources, an important methodological point for future biobank-based ME/CFS research.
Observed Findings
5,354 UK Biobank individuals had at least one indicator of ME/CFS diagnosis (1.1% of participants)
Only 36% of these (1,922 individuals) had two or more independent lines of evidence supporting ME/CFS status
ICD-10 code G93.3 (Post-viral fatigue syndrome) had the highest consistency with other data types (72% concordance)
Pain Questionnaire responses had the lowest consistency with other evidence types (43% concordance)
Data missingness significantly contributed to lack of diagnostic consistency across evidence types
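The "two or more independent lines of evidence" rule in the findings above can be illustrated with a minimal sketch. It is not the authors' pipeline: the source names, the toy records, and the threshold parameter are hypothetical, and missing data is represented as None so that it never counts as supporting evidence. The last lines check that the reported 1,922 of 5,354 rounds to 36%.

```python
# Hypothetical evidence sources; illustrative names, not UK Biobank field names.
SOURCES = ("icd10_g93_3", "self_report", "pain_questionnaire")

# Toy records (invented for illustration only):
# True = indicator present, False = indicator absent, None = data missing.
participants = {
    "p1": {"icd10_g93_3": True, "self_report": True, "pain_questionnaire": None},
    "p2": {"icd10_g93_3": None, "self_report": True, "pain_questionnaire": False},
    "p3": {"icd10_g93_3": True, "self_report": True, "pain_questionnaire": True},
    "p4": {"icd10_g93_3": False, "self_report": False, "pain_questionnaire": False},
}


def evidence_count(record):
    """Count sources that positively indicate ME/CFS; missing values do not count."""
    return sum(1 for source in SOURCES if record.get(source) is True)


def classify(record, threshold=2):
    """Label a participant by how many independent sources support ME/CFS status."""
    n = evidence_count(record)
    if n == 0:
        return "no_evidence"
    return "corroborated" if n >= threshold else "single_source"


labels = {pid: classify(rec) for pid, rec in participants.items()}

# Share of participants with any evidence who are corroborated by 2+ sources.
with_evidence = [pid for pid, label in labels.items() if label != "no_evidence"]
corroborated = [pid for pid in with_evidence if labels[pid] == "corroborated"]
toy_share = len(corroborated) / len(with_evidence)

# Consistency check on the figures reported in the summary.
reported_percent = round(100 * 1922 / 5354)

print(labels)
print(f"toy corroborated share: {toy_share:.2f}")
print(f"reported share: {reported_percent}%")
```

In this toy data, participant p2 shows how missingness can lower consistency: one source is unavailable, so a single positive indicator cannot be corroborated.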
Inferred Conclusions
ME/CFS diagnosis in biobank settings is most reliably established using multiple, corroborating data sources rather than single indicators
Adjusting for those unable to participate due to severe illness substantially raises the estimated number of people with ME/CFS in the UK (to around 410,000)
Standardization of ME/CFS case definitions across multiple data types is necessary for large-scale biobank analyses
Population-scale biobanks represent a feasible and cost-effective alternative to traditional cohort recruitment for ME/CFS research
Remaining Questions
What are the clinical and demographic characteristics of individuals with inconsistent evidence across data types, and do they represent true ME/CFS cases or misclassification?
How can biobanks better capture data on the most severely affected patients (bed-/house-bound) who cannot participate in standard recruitment?
What is the optimal statistical weighting or prioritization of different evidence types (ICD-10 vs. questionnaire vs. hospital records) for ME/CFS case definition?
Do case definition strategies validated in UK Biobank generalize to ME/CFS cohorts defined using formal diagnostic criteria (e.g., ICC, CCC) or other populations?
What This Study Does Not Prove
This study does not identify causes or underlying mechanisms of ME/CFS, nor does it provide evidence for any specific diagnostic test or biomarker. It also does not compare treatment outcomes or track disease progression, and the prevalence estimate applies specifically to the UK context and may not generalize to other populations or healthcare systems.
Tags
Method Flag: PEM Not Defined, Weak Case Definition, Mixed Cohort