Machine learning in microbiology

Machine learning (ML), a subset of artificial intelligence (AI), is used as a structured analytical tool to extract meaningful patterns from complex biological data, in close integration with experimental microbiology. ML strategies can advance understanding of host–pathogen infection models and antimicrobial resistance (AMR). They can also identify patterns within high-dimensional multiomics datasets and translate computational predictions into experimentally testable hypotheses.

Application areas

Multiomics integration combines data from a wide range of experimental setups, such as dual RNA-seq, single-cell RNA-seq, and quantitative proteomics, with data from phenotypic assays and validation experiments.

Host–pathogen interaction analysis uses network-based modelling of dual RNA-seq, single-cell transcriptomics, and pathway-level data to evaluate infection heterogeneity and immune modulation.
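One simple form of network-based modelling can be sketched as a bipartite co-expression network, in which a host gene and a pathogen gene are linked when their expression profiles are strongly correlated across samples. The gene names, expression values, and the 0.9 correlation threshold below are hypothetical and chosen only for illustration; real analyses typically use many more samples and correct for multiple testing.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant profile carries no correlation information
    return cov / sqrt(vx * vy)

def build_bipartite_network(host_expr, pathogen_expr, threshold=0.9):
    """Return edges (host_gene, pathogen_gene, r) where |r| >= threshold."""
    edges = []
    for h_gene, h_vals in host_expr.items():
        for p_gene, p_vals in pathogen_expr.items():
            r = pearson(h_vals, p_vals)
            if abs(r) >= threshold:
                edges.append((h_gene, p_gene, round(r, 3)))
    return edges

def degree(edges):
    """Count how many edges touch each gene (a simple hub measure)."""
    counts = {}
    for h_gene, p_gene, _ in edges:
        counts[h_gene] = counts.get(h_gene, 0) + 1
        counts[p_gene] = counts.get(p_gene, 0) + 1
    return counts

# Hypothetical expression values across five infection time points
host_expr = {
    "host_gene_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "host_gene_B": [5.0, 4.0, 3.0, 2.0, 1.0],
    "host_gene_C": [2.0, 2.1, 1.9, 2.0, 2.2],
}
pathogen_expr = {
    "pathogen_gene_X": [2.0, 4.1, 6.0, 8.2, 10.0],
    "pathogen_gene_Y": [3.0, 1.0, 4.0, 1.0, 5.0],
}

edges = build_bipartite_network(host_expr, pathogen_expr)
degrees = degree(edges)
for edge in edges:
    print(edge)
print(degrees)
```

In this toy dataset only pathogen_gene_X passes the threshold, with one positively and one negatively correlated host partner. A correlation edge indicates co-variation only, so it remains a hypothesis for experimental follow-up.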

Drug response can be modelled with ML using data from checkerboard assays, growth kinetics experiments, and transcriptional profiling under drug exposure. These approaches help identify and prioritize drug combinations and dosing strategies.
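Checkerboard data are often summarized before any modelling with the fractional inhibitory concentration index (FICI), the sum of each drug's inhibitory concentration in combination divided by its MIC alone. The sketch below computes the lowest FICI over the non-growing wells of a small checkerboard. The concentrations, MIC values, and growth pattern are made up, and the cut-offs used (0.5 or below for synergy, above 4 for antagonism) are one commonly used convention, not a universal rule.

```python
def fic_index(mic_a_alone, mic_b_alone, conc_a, conc_b):
    """FICI = conc_a / MIC_A(alone) + conc_b / MIC_B(alone)."""
    return conc_a / mic_a_alone + conc_b / mic_b_alone

def min_fic_from_checkerboard(growth, mic_a_alone, mic_b_alone):
    """Return (fici, conc_a, conc_b) for the inhibited well with the lowest FICI.

    `growth` maps (conc_a, conc_b) -> True if visible growth was observed.
    """
    best = None
    for (conc_a, conc_b), grew in growth.items():
        if grew:
            continue  # only wells without growth count as inhibitory
        fici = fic_index(mic_a_alone, mic_b_alone, conc_a, conc_b)
        if best is None or fici < best[0]:
            best = (fici, conc_a, conc_b)
    return best

def classify(fici):
    """Interpret a FICI value with commonly used cut-offs (an assumption here)."""
    if fici <= 0.5:
        return "synergy"
    if fici > 4.0:
        return "antagonism"
    return "no interaction"

# Hypothetical checkerboard: drug A MIC alone = 8, drug B MIC alone = 4
a_concs = [0, 1, 2, 4, 8]
b_concs = [0, 0.5, 1, 2, 4]
growth = {}
for a in a_concs:
    for b in b_concs:
        # Made-up inhibition pattern: either drug at its MIC, or A >= 2 with B >= 1
        inhibited = a >= 8 or b >= 4 or (a >= 2 and b >= 1)
        growth[(a, b)] = not inhibited

best = min_fic_from_checkerboard(growth, mic_a_alone=8, mic_b_alone=4)
print(best, classify(best[0]))
```

A summary value like this can serve as a training label or a sanity check for an ML model of drug response, but it discards the shape of the full dose-response surface.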

Computational protein and phage analysis is guided by AI-based protein modelling tools, including protein language models and structure prediction, which support the analysis of phage proteins, host receptors, and antimicrobial targets. Computational predictions are coupled with laboratory validation such as binding assays and infection experiments.

Methods commonly applied

Depending on the biological question and dataset, the following approaches can be applied:

  • Supervised learning (e.g., Random Forest, XGBoost) for phenotype prediction such as antimicrobial resistance

  • Unsupervised learning (PCA, clustering, latent factor models such as MOFA+) for multiomics integration

  • Network-based modelling to investigate host–pathogen interactions

  • Deep learning frameworks (e.g., PyTorch-based models)

  • Single-cell analysis tools (Scanpy, Seurat, scVI)
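As a rough illustration of the unsupervised approaches listed above, the following sketch standardizes two synthetic omics blocks measured on the same samples, concatenates them, and runs PCA through a singular value decomposition. It assumes NumPy is available. The block sizes, noise level, and function names are illustrative assumptions, and this feature-level concatenation is a simpler technique than latent factor models such as MOFA+, which model each block separately.

```python
import numpy as np

def zscore(block):
    """Standardize each feature (column) to zero mean and unit variance."""
    mean = block.mean(axis=0)
    std = block.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant features
    return (block - mean) / std

def integrate_and_project(blocks, n_components=2):
    """Concatenate standardized blocks; return PCA scores and explained variance ratios."""
    # Divide each block by sqrt(n_features) so that larger blocks do not dominate
    scaled = [zscore(b) / np.sqrt(b.shape[1]) for b in blocks]
    combined = np.hstack(scaled)  # columns are already centered by zscore
    u, s, _ = np.linalg.svd(combined, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, explained[:n_components]

# Synthetic data (assumption): 12 samples driven by one shared latent factor
rng = np.random.default_rng(0)
latent = rng.normal(size=(12, 1))
transcripts = latent @ rng.normal(size=(1, 50)) + 0.5 * rng.normal(size=(12, 50))
proteins = latent @ rng.normal(size=(1, 20)) + 0.5 * rng.normal(size=(12, 20))

scores, explained = integrate_and_project([transcripts, proteins])
print(scores.shape)
print(explained.round(2))
```

Because both blocks share one simulated factor, the first component should track that factor closely. With real data, the leading components may instead reflect batch effects or dominant technical variation, which is one reason dedicated integration models are often preferred.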

All models must be evaluated using appropriate biological validation experiments.
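Comparing model predictions against phenotypes measured in validation experiments can be sketched as a confusion-matrix summary. The example below assumes binary resistant ("R") and susceptible ("S") labels for ten hypothetical isolates. The labels, data, and function names are illustrative only.

```python
def confusion_counts(observed, predicted, positive="R"):
    """Return (tp, fp, tn, fn), treating `positive` as the class of interest."""
    tp = fp = tn = fn = 0
    for obs, pred in zip(observed, predicted):
        if pred == positive and obs == positive:
            tp += 1
        elif pred == positive:
            fp += 1  # predicted resistant, observed susceptible
        elif obs == positive:
            fn += 1  # predicted susceptible, observed resistant
        else:
            tn += 1
    return tp, fp, tn, fn

def summarize(observed, predicted, positive="R"):
    """Sensitivity, specificity, and balanced accuracy for binary labels."""
    tp, fp, tn, fn = confusion_counts(observed, predicted, positive)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }

# Hypothetical phenotypes from validation experiments vs. model predictions
observed = ["R", "R", "R", "R", "S", "S", "S", "S", "S", "S"]
predicted = ["R", "R", "R", "S", "S", "S", "S", "S", "R", "S"]

metrics = summarize(observed, predicted)
print(confusion_counts(observed, predicted))
print(metrics)
```

Reporting sensitivity and specificity separately is useful for AMR prediction because the two error types have different consequences: a resistant isolate predicted as susceptible is usually the more serious mistake.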
