Protocol for using treeLFA to infer multimorbidity patterns in the form of disease topics from diagnosis data in biobanks.
Zhang Y., Jiang X., McVean G., Lunter G.
Research on multimorbidity patterns advances our understanding of the common pathological mechanisms that underlie co-occurring diseases. Here, we present a protocol for inferring multimorbidity clusters, in the form of disease topics, from large-scale diagnosis data using treeLFA, a topic model based on Bayesian binary non-negative matrix factorization. We describe steps for installing the software, preparing input data, and training the model. We then detail post-processing procedures for obtaining summarized results for downstream analyses. For complete details on the use and execution of this protocol, please refer to Zhang et al.¹