11.05.2026 Generative Artificial Intelligence Can Significantly Reduce the Number of Animal Experiments
Between 30 and 50 percent fewer mice needed in pharmacological research experiments
Researchers at Goethe University Frankfurt and Marburg University, in collaboration with the Fraunhofer Institute for Translational Medicine and Pharmacology ITMP, have developed a new form of artificial intelligence designed to reduce the need for animal testing. The AI, called genESOM, was trained to “learn” the structure of small datasets. It then uses this learned structure to generate new data points. These synthetic data points reproduce the properties of experimentally collected data so accurately that they appear as if they had been obtained in laboratory experiments. In the future, genESOM could reduce the number of animals required in drug testing by 30 to 50 percent.
In the early stages of drug development, new compounds are tested in animals alongside many other experimental methods. Researchers face a dilemma: on the one hand, ethical considerations require minimizing the number of animals used in experiments. On the other hand, animal studies must include enough subjects to produce reliable and representative results, for example to determine whether a drug candidate has a specific effect.
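The sample-size dilemma described above can be made concrete with a small simulation. The sketch below (illustrative only, not part of the researchers' method; all names and parameters are my own) estimates how often a two-group comparison detects a real effect as the number of animals per group grows, using a Welch t statistic with |t| > 2 as a rough 5-percent significance threshold:

```python
import numpy as np

rng = np.random.default_rng(0)

def power_estimate(n_per_group, effect=1.0, sims=2000, t_threshold=2.0):
    """Fraction of simulated two-group experiments in which a Welch t
    statistic exceeds ~2 (roughly alpha = 0.05 for moderate n)."""
    hits = 0
    for _ in range(sims):
        a = rng.normal(0.0, 1.0, n_per_group)      # control group
        b = rng.normal(effect, 1.0, n_per_group)   # treated group
        se = np.sqrt(a.var(ddof=1) / n_per_group + b.var(ddof=1) / n_per_group)
        t = (b.mean() - a.mean()) / se
        hits += abs(t) > t_threshold
    return hits / sims

# statistical power rises with group size, which is exactly why
# researchers cannot simply use fewer animals
for n in (6, 9, 12):
    print(n, round(power_estimate(n), 2))
```

With only six animals per group, a substantial effect is frequently missed; larger groups detect it far more reliably, which is the pressure that generative augmentation aims to relieve.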
Network of thousands of artificial neurons
Prof. Jörn Lötsch, a data scientist and clinical pharmacologist at Goethe University Frankfurt, together with computer scientist Prof. Alfred Ultsch from Marburg University—neither of whom conducts animal experiments themselves—developed a generative AI called genESOM. It is based on a network of thousands of artificial neurons that “learns” the internal structure of a dataset. This enables the system to expand experimentally obtained data and simulate a situation in which more animals had been used than was actually the case.
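genESOM's exact architecture is not described in this article, but the underlying idea of a self-organizing neural network that learns a dataset's structure and then samples new points from it can be sketched generically. The toy implementation below (a minimal sketch under my own assumptions, not the authors' code; grid size, learning rate, and jitter scale are arbitrary) trains a small self-organizing map and draws synthetic points from its learned nodes:

```python
import numpy as np

rng = np.random.default_rng(1)

def train_som(data, grid=(10, 10), epochs=30, lr0=0.5, sigma0=3.0):
    """Fit a small self-organizing map: each grid node holds a weight
    vector that is pulled toward nearby data points during training."""
    h, w = grid
    weights = rng.normal(size=(h, w, data.shape[1]))
    coords = np.stack(np.meshgrid(np.arange(h), np.arange(w), indexing="ij"), axis=-1)
    for epoch in range(epochs):
        lr = lr0 * (1 - epoch / epochs)
        sigma = sigma0 * (1 - epoch / epochs) + 0.5
        for x in rng.permutation(data):
            # best-matching unit: grid node whose weights are closest to x
            d = np.linalg.norm(weights - x, axis=-1)
            bmu = np.unravel_index(d.argmin(), d.shape)
            # pull the BMU and its grid neighbours toward x
            dist2 = ((coords - np.array(bmu)) ** 2).sum(axis=-1)
            influence = np.exp(-dist2 / (2 * sigma**2))
            weights += lr * influence[..., None] * (x - weights)
    return weights

def generate(weights, n_points, noise=0.05):
    """Sample synthetic points: pick trained nodes at random and add
    small jitter, so new points lie on the learned data structure."""
    flat = weights.reshape(-1, weights.shape[-1])
    idx = rng.integers(0, len(flat), size=n_points)
    return flat[idx] + rng.normal(scale=noise, size=(n_points, flat.shape[-1]))

# toy dataset: two clusters standing in for two treatment groups
data = np.vstack([rng.normal(0, 0.3, (20, 4)), rng.normal(2, 0.3, (20, 4))])
som = train_som(data)
synthetic = generate(som, 15)
print(synthetic.shape)  # (15, 4)
```

Because the synthetic points are drawn from nodes fitted to the real data, they follow the learned structure rather than being arbitrary noise, which is the property the press release attributes to genESOM's output.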
Integrated error monitoring
To train the AI, the researchers used existing data from a published study conducted on mice at the Fraunhofer ITMP. Two key innovations were achieved:
First, the AI was trained to generate new data points based on the study data in a way that integrates seamlessly into the learned data structure, as if they had been collected in real experiments.
Second, an error-monitoring system was integrated directly into the data generation process. Generative AI methods risk amplifying not only relevant signals but also noise and random variation. This issue, known as error inflation, can lead to non-significant variables being incorrectly identified as treatment-relevant (so-called false positives).
By separating the learning phase from the synthesis phase, the researchers introduced a controlled artificial error signal into the process and precisely measured its propagation. This enabled a data-driven stopping criterion that halts data generation before scientific validity is compromised.
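The idea of injecting a controlled error signal and halting augmentation before it inflates into a false positive can be illustrated schematically. The sketch below is my own simplified stand-in, not the paper's actual criterion: a known offset is planted in a pure-noise variable, naive resampling-with-jitter stands in for the generator, and augmentation stops as soon as the watched error channel approaches apparent significance:

```python
import numpy as np

rng = np.random.default_rng(2)

# Two small groups (n = 6, two variables). Column 0 carries a real
# treatment effect; column 1 carries a planted "artificial error signal"
# whose propagation we watch as synthetic rows accumulate.
n = 6
control = np.column_stack([rng.normal(0.0, 1, n), rng.normal(0.0, 1, n)])
treated = np.column_stack([rng.normal(1.5, 1, n), rng.normal(0.8, 1, n)])

def welch_t(a, b):
    """Welch t statistic; |t| > ~2 roughly marks 5% significance."""
    se = np.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
    return (b.mean() - a.mean()) / se

def augment(group, k, noise=0.3):
    """Naive augmentation stand-in: resample rows and add jitter."""
    idx = rng.integers(0, len(group), size=k)
    extra = group[idx] + rng.normal(scale=noise, size=(k, group.shape[1]))
    return np.vstack([group, extra])

aug_c, aug_t = control, treated
history, stop_step = [], None
for step in range(20):
    aug_c, aug_t = augment(aug_c, 3), augment(aug_t, 3)
    t_err = abs(welch_t(aug_c[:, 1], aug_t[:, 1]))  # watched error channel
    history.append(t_err)
    if t_err > 2.0:  # stopping criterion: error is about to look "real"
        stop_step = step
        break
print("stopped at step:", stop_step)
```

The mechanism it demonstrates is error inflation: as synthetic rows shrink the standard error, a fixed chance difference in the noise variable drifts toward apparent significance, so generation must stop before that threshold is crossed.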
AI training with published study data
genESOM was tested using data from a preclinical multiple sclerosis model study. In the original experiment, 26 mice were divided into three treatment groups to evaluate the effects of an experimental drug. Lötsch and Ultsch reduced the dataset to 18 animals (six per group) to simulate a smaller study. In this reduced dataset, all previously observed treatment effects disappeared: statistical tests showed no significance, and machine learning methods could not distinguish between treatment groups.
After enriching the reduced dataset with genESOM-generated data points, all effects of the original experiment reappeared at the same level of statistical significance, without introducing relevant false positives. Alternative AI approaches, including complex deep learning neural networks tested by the researchers, failed in this task.
Lötsch explains: “We have now tested several datasets in a similar way and can say today: with genESOM, the number of animals used in exploratory research questions can be reduced by 30 to 50 percent while maintaining scientific validity.”
However, he emphasizes that genESOM can only learn from data obtained in real animal experiments. The number of animals cannot be reduced arbitrarily: “If too few animals are included in the experiment and the data are then simply supplemented by generative AI, the experiment could quickly become scientifically meaningless due to amplified random effects.”
Nevertheless, Lötsch is confident: “With genESOM, we can make an important contribution to reducing the number of animal experiments in many areas of preclinical research.”
The project was funded by the German Research Foundation (DFG) under the title: “Generative artificial intelligence-based algorithm to increase the predictivity of preclinical studies while keeping sample sizes small.”
Publications
Jörn Lötsch, Benjamin Mayer, Natasja de Bruin, Alfred Ultsch: Self-organizing neural network-based generative AI with embedded error inflation control enhances effective knowledge extraction from preclinical studies with reduced sample size. Pharmacological Research (2026) https://doi.org/10.1016/j.phrs.2026.108159
Jörn Lötsch, André Himmelspach, Dario Kringel: Dimensionality-modulated generative AI for safe biomedical dataset augmentation. iScience (2026) https://doi.org/10.1016/j.isci.2025.114321
Alfred Ultsch, Jörn Lötsch: Augmenting small biomedical datasets using generative AI methods based on self-organizing neural networks. Briefings in Bioinformatics (2024) https://doi.org/10.1093/bib/bbae640