Unveiling sub-populations in critical care settings: a real-world data approach in COVID-19.
Anderson W., Gould R., Patil N., Mohr N., Dodd K., Boyce D., Dasher P., Guerin PJ., Khan R., Cheruku S., Kumar VK., Mathé E., Mehta AK., Michelson AP., Williams A., Heavner SF., Podichetty JT.
Background: Disease presentation and progression vary widely in heterogeneous diseases such as COVID-19, and patient outcomes differ even within the hospital setting. This variability underscores the need for treatment approaches tailored to distinct clinical subgroups.

Objectives: This study aimed to identify COVID-19 patient subgroups with unique clinical characteristics using real-world data (RWD) from electronic health records (EHRs) to inform individualized treatment plans.

Materials and methods: A Factor Analysis of Mixed Data (FAMD)-based agglomerative hierarchical clustering approach was applied to the RWD to identify distinct patient subgroups. Statistical tests evaluated differences between clusters, and machine learning models classified patients into the identified subgroups.

Results: Three clusters of COVID-19 patients with unique clinical characteristics were identified. Hospital stay durations and survival rates differed significantly among the clusters, with more severe clinical features correlating with worse prognoses. Machine learning classifiers achieved high accuracy in subgroup identification.

Conclusion: By leveraging RWD and advanced clustering techniques, the study provides insights into the heterogeneity of COVID-19 presentations. The findings support the development of classification models that can inform more individualized and effective treatment plans, which may improve patient outcomes in the future.
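
To make the approach named under Materials and methods more concrete, the following is a minimal sketch of FAMD-style scaling followed by Ward agglomerative clustering. It is not the authors' pipeline. The variables (age, respiratory rate, oxygen support), the nine synthetic records, the choice of Ward linkage, and the helper names `famd_scale` and `ward_clusters` are all assumptions made for illustration. The sketch also skips the dimension-reduction step of FAMD: it clusters directly on the FAMD-scaled matrix, which gives the same Euclidean distances as clustering on all FAMD components, whereas a real analysis would usually keep only the leading components.

```python
import math
from collections import Counter


def famd_scale(numeric, categorical):
    """Scale mixed data the way FAMD does before its PCA step.

    Numeric columns are z-scored. Each categorical level becomes an
    indicator column that is centered and divided by sqrt(level proportion).
    """
    n = len(numeric)
    rows = [[] for _ in range(n)]
    for j in range(len(numeric[0])):
        col = [r[j] for r in numeric]
        mean = sum(col) / n
        sd = math.sqrt(sum((x - mean) ** 2 for x in col) / n)
        for i, x in enumerate(col):
            rows[i].append((x - mean) / sd if sd > 0 else 0.0)
    for j in range(len(categorical[0])):
        col = [r[j] for r in categorical]
        counts = Counter(col)
        for level in sorted(counts):
            p = counts[level] / n
            for i, v in enumerate(col):
                indicator = 1.0 if v == level else 0.0
                rows[i].append((indicator - p) / math.sqrt(p))
    return rows


def ward_clusters(points, k):
    """Naive agglomerative clustering with Ward's merge criterion."""
    clusters = [[i] for i in range(len(points))]
    dim = len(points[0])

    def centroid(members):
        return [sum(points[i][d] for i in members) / len(members)
                for d in range(dim)]

    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            ca = centroid(clusters[a])
            for b in range(a + 1, len(clusters)):
                cb = centroid(clusters[b])
                d2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
                na, nb = len(clusters[a]), len(clusters[b])
                cost = na * nb / (na + nb) * d2  # increase in within-cluster variance
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]

    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Synthetic, made-up records: (age in years, respiratory rate per minute).
numeric = [(30, 14), (32, 15), (35, 16),
           (55, 22), (58, 23), (60, 24),
           (78, 32), (80, 34), (82, 35)]
# Synthetic oxygen-support category for the same nine records.
categorical = [("none",), ("none",), ("none",),
               ("nasal",), ("nasal",), ("nasal",),
               ("vent",), ("vent",), ("vent",)]

scaled = famd_scale(numeric, categorical)
labels = ward_clusters(scaled, k=3)
print(labels)
```

On real EHR data the number of clusters would be chosen from the dendrogram or a validity index rather than fixed in advance, and a dedicated library would replace the naive cubic-time merge loop used here.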