AI breakthrough shows potential to accelerate cancer drug discovery

A group of researchers led by Christ Church’s Dr Tapabrata Rohan Chakraborty has developed a new artificial intelligence system capable of generating molecular information from cellular imaging data – a breakthrough that could accelerate the discovery of new cancer treatments and improve understanding of disease.

Developed in collaboration with researchers from The Alan Turing Institute and The Institute of Cancer Research, London, the new framework – known as ‘PhenoSeq’ – generates transcriptomic profiles from cell images, enabling scientists to uncover molecular insights from existing imaging experiments without relying solely on costly sequencing technologies. The study has been accepted for presentation at the International Conference on Machine Learning (ICML), one of the world's leading conferences for machine learning research, and was supported by the Turing–Roche strategic partnership between Roche Pharmaceuticals and The Alan Turing Institute, the UK's national institute for AI.

Modern drug discovery increasingly relies on large-scale imaging experiments that capture detailed pictures of cells after they have been exposed to different treatments. These experiments can be carried out rapidly and at scale, generating vast quantities of biological data. However, obtaining deeper molecular information about how cells respond to drugs typically requires specialised sequencing technologies that are significantly more expensive and time-consuming.

PhenoSeq bridges this gap by using AI to predict gene-expression patterns from cellular images alone. By learning the relationship between a cell’s appearance and its underlying molecular activity, the system can generate biologically meaningful molecular representations without requiring additional sequencing experiments.
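
As a rough illustration of what learning the relationship between appearance and molecular activity means, the sketch below fits a simple mapping from image-derived features to gene-expression values and then predicts expression for unseen cells. It is not the PhenoSeq method. The paper's title indicates a conditional diffusion model, and ridge regression is swapped in here only because it is short. The data are synthetic, and every name (`fit_ridge`, `predict_expression`, `image_features`, `expression`) is a hypothetical chosen for this example.

```python
import numpy as np

def fit_ridge(image_features, expression, alpha=1.0):
    """Fit a linear map W so that image_features @ W approximates expression.

    Ridge regression is a deliberately simple stand-in for the generative
    model described in the article; it is an assumption of this sketch.
    """
    n_features = image_features.shape[1]
    gram = image_features.T @ image_features + alpha * np.eye(n_features)
    return np.linalg.solve(gram, image_features.T @ expression)

def predict_expression(image_features, weights):
    """Predict expression profiles from image features alone."""
    return image_features @ weights

# Synthetic toy data (assumption): 200 cells, 16 image features, 5 genes.
rng = np.random.default_rng(0)
n_cells, n_image_features, n_genes = 200, 16, 5
true_map = rng.normal(size=(n_image_features, n_genes))
image_features = rng.normal(size=(n_cells, n_image_features))
noise = 0.1 * rng.normal(size=(n_cells, n_genes))
expression = image_features @ true_map + noise

# Train on cells with matched images and expression, then predict
# expression for held-out cells from their image features only.
train, test = slice(0, 150), slice(150, None)
weights = fit_ridge(image_features[train], expression[train], alpha=1.0)
predicted = predict_expression(image_features[test], weights)
```

The train and test split mirrors the setting described above: matched imaging and sequencing data are needed to learn the mapping, after which new images can be given predicted molecular profiles without further sequencing.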

‘Cell morphology and gene expression are fundamentally different measurements of the same underlying biology,’ said Dr Chakraborty. ‘Our goal was to determine whether information contained in large-scale imaging experiments could be translated into a molecular representation that is normally only accessible through costly sequencing technologies.’

Figure: PhenoSeq uses AI to translate cell images into detailed molecular information.

To develop and evaluate the system, the research team used a newly released dataset containing matched cellular imaging and transcriptomic measurements across a range of chemical treatments. Their results showed that the AI-generated molecular profiles captured biologically meaningful information and improved the ability to distinguish between different treatments when compared with imaging data alone.

The study builds on recent advances in multimodal AI and follows Dr Chakraborty’s earlier work, PathGen, which demonstrated that molecular information could be generated from digital pathology images and was published in Nature Communications earlier this year. While previous research focused on predicting molecular features from tissue images, the new study is among the first to show that AI can generate transcriptomic representations from high-content cellular imaging in the context of phenotypic drug discovery.

The researchers believe the approach could help scientists extract additional biological insight from existing imaging datasets without the need for extensive molecular profiling. In the future, such methods could support more efficient drug-screening pipelines, improve understanding of how experimental treatments work, and accelerate the search for new therapies.

The study highlights the growing potential of generative AI to integrate different forms of biological data and uncover information that would otherwise remain hidden within routine laboratory experiments.

The paper, ‘Cell Painting Generates Single-Cell Transcriptomics via Conditional Diffusion’, is available to read online.