The availability of large genome datasets has changed the microbiology research landscape. Analyzing such data is computationally demanding, and new approaches have emerged from different data analysis philosophies. Machine learning and statistical inference overlap in their knowledge discovery aims and approaches. However, machine learning focuses on optimizing prediction, whereas statistical inference focuses on understanding the processes that relate variables. In this review, we outline the different aspirations, precepts, and resulting methodologies of the two fields, with examples from microbial genomics. Emphasizing their complementarity, we argue that combining and synthesizing machine learning and statistics has potential for pathogen research in the big data era.

More information Original publication

DOI

10.1186/s13059-025-03775-4

Type

Journal article

Publication Date

2025-09-01

Volume

26

Addresses

Ineos Oxford Institute for Antimicrobial Research, Department of Biology, University of Oxford, Oxford, United Kingdom.

Keywords

Sequence Analysis, DNA; Genomics; Drug Resistance, Microbial; Virulence; Genome-Wide Association Study; Microbiota; Genome, Microbial; Datasets as Topic; Machine Learning