From data deluge to diagnostic insight: RAMSES supercomputer powers next-generation AI pathology at Cologne

A new study highlights a pivotal shift in biomedical research: breakthroughs now depend as much on powerful computational tools as on laboratory instruments. Driving this transformation is the RAMSES supercomputer at the University of Cologne's IT Center in Germany, empowering researchers to process and analyze enormous digital pathology datasets at a scale previously unattainable.
 
The work, published recently in Nature Medicine, presents SPARK, an AI-driven framework described as “agentic” because it autonomously generates, tests, and validates hypotheses in cancer pathology. The conceptual innovation is noteworthy, but it is RAMSES’s computational power that makes such a system practical.

Scaling digital pathology beyond human limits

Digital pathology operates on whole-slide images (WSIs), each of which can reach gigapixel resolution. When multiplied across thousands of patient samples and multiple cancer types, the resulting data volume quickly becomes prohibitive for conventional computing.
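
To make that scale concrete, the arithmetic for a single slide can be sketched in a few lines of Python. The slide dimensions, pixel format, and tile size below are illustrative assumptions, not figures from the study.

```python
# Back-of-the-envelope size of one uncompressed whole-slide image (WSI).
# All constants are illustrative assumptions, not values reported by the study.
WIDTH_PX = 100_000
HEIGHT_PX = 100_000
BYTES_PER_PIXEL = 3   # 8-bit RGB
TILE_PX = 512         # square tile edge used by a hypothetical analysis pipeline


def wsi_stats(width, height, bytes_per_pixel=BYTES_PER_PIXEL, tile=TILE_PX):
    """Return pixel count, raw size, and tile count for one slide."""
    pixels = width * height
    tiles_x = -(-width // tile)    # ceiling division
    tiles_y = -(-height // tile)
    return {
        "gigapixels": pixels / 1e9,
        "raw_gb": pixels * bytes_per_pixel / 1e9,
        "tiles": tiles_x * tiles_y,
    }


stats = wsi_stats(WIDTH_PX, HEIGHT_PX)
print(stats)
```

Under these assumptions a single slide holds 10 gigapixels, about 30 GB of raw pixel data, split into tens of thousands of tiles. Real slides are stored compressed, so on-disk sizes are smaller, but analysis typically operates on decoded pixels.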
 
To address this, researchers deployed a hybrid computational architecture in which high-throughput workloads were executed on RAMSES, a high-performance computing (HPC) system designed for large-scale modeling and simulation. The system integrates advanced GPU resources, including multiple NVIDIA H100 accelerators, optimized for parallel processing of AI and image analysis pipelines.
 
Within this environment, each pathology case required substantial dedicated resources (up to 120 GB of memory and 12 CPU cores per sample), underscoring the intensity of the computational workload.
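
A short calculation shows how quickly such per-case requirements fill a machine. The per-case figures come from the text above; the node specification and node count are hypothetical and do not describe the actual RAMSES hardware.

```python
from dataclasses import dataclass

PER_CASE_MEM_GB = 120   # upper bound quoted in the article
PER_CASE_CORES = 12     # quoted in the article


@dataclass(frozen=True)
class Node:
    """A compute node; the values used below are hypothetical."""
    mem_gb: int
    cores: int


def cases_per_node(node):
    """Concurrent cases one node can host, limited by memory or cores."""
    return min(node.mem_gb // PER_CASE_MEM_GB, node.cores // PER_CASE_CORES)


def waves_needed(n_cases, node, n_nodes):
    """Number of sequential batches needed to process n_cases."""
    slots = cases_per_node(node) * n_nodes
    if slots == 0:
        raise ValueError("node too small for a single case")
    return -(-n_cases // slots)   # ceiling division


hypothetical_node = Node(mem_gb=1024, cores=96)
print(cases_per_node(hypothetical_node))          # concurrent cases per node
print(waves_needed(5000, hypothetical_node, 10))  # batches for 5,000 cases on 10 nodes
```

On this assumed node, memory and cores both cap concurrency at eight cases, so a 5,000-case cohort on ten such nodes would need 63 sequential batches.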

The SPARK framework: AI at scale

The SPARK system represents a shift from static machine learning models to dynamic, reasoning-based AI workflows. Rather than being trained solely on labeled data, SPARK generates its own analytical “ideas,” translates them into executable code, and evaluates their predictive value across large datasets.
 
This process unfolds in several stages:
  • Idea generation using large language models (LLMs)
  • Automated code creation and validation
  • High-throughput parameter extraction from WSIs
  • Statistical modeling and prognostic evaluation
Early-stage development and prototyping could be carried out on smaller systems, but full-scale execution, particularly across cohorts exceeding 5,000 patients, required the parallel processing capabilities of RAMSES.
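
The four stages listed above can be illustrated with a toy pipeline. This is a minimal sketch under loud assumptions, not the SPARK implementation: a fixed list stands in for the LLM, a lookup table stands in for generated code, the slides are synthetic, and a plain Pearson correlation stands in for the prognostic modeling. Every function name is hypothetical.

```python
import math
import random
import statistics


def generate_ideas():
    """Stage 1: idea generation (a fixed list stands in for an LLM)."""
    return ["mean_density", "max_density", "density_spread", "broken_idea"]


def build_extractor(idea):
    """Stage 2a: code creation (a lookup table stands in for generated code)."""
    table = {
        "mean_density": lambda tiles: statistics.fmean(tiles),
        "max_density": lambda tiles: max(tiles),
        "density_spread": lambda tiles: statistics.pstdev(tiles),
        "broken_idea": lambda tiles: tiles["missing"],  # deliberately invalid
    }
    return table[idea]


def is_valid(extractor, sample_slide):
    """Stage 2b: validation by executing the code on one sample slide."""
    try:
        return isinstance(extractor(sample_slide), float)
    except Exception:
        return False


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0


def make_cohort(n_slides=50, tiles_per_slide=20, seed=0):
    """Synthetic cohort: per-tile densities and an outcome tied to the mean."""
    rng = random.Random(seed)
    slides, outcomes = [], []
    for _ in range(n_slides):
        base = rng.uniform(0.0, 1.0)
        slides.append([base + rng.gauss(0.0, 0.1) for _ in range(tiles_per_slide)])
        outcomes.append(10.0 * base + rng.gauss(0.0, 1.0))
    return slides, outcomes


def run_pipeline():
    slides, outcomes = make_cohort()
    results = {}
    for idea in generate_ideas():
        extractor = build_extractor(idea)
        if not is_valid(extractor, slides[0]):
            continue  # discard ideas whose code fails validation
        values = [extractor(s) for s in slides]      # Stage 3: extraction
        results[idea] = pearson(values, outcomes)    # Stage 4: evaluation
    return results


results = run_pipeline()
for idea, r in sorted(results.items(), key=lambda kv: -abs(kv[1])):
    print(f"{idea}: r = {r:+.2f}")
```

The deliberately broken idea is filtered out at the validation stage, while the remaining parameters are ranked by how strongly they track the synthetic outcome. In a real setting the extraction loop is the part that would be distributed across an HPC system.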

High-performance computing meets oncology

The integration of HPC into this workflow enabled several key advances:
1. Massively Parallel Image Analysis
RAMSES allowed simultaneous processing of thousands of WSIs, performing segmentation, cell classification, and spatial mapping across seven distinct cell types.
2. Large-Scale Parameter Exploration
The system generated and evaluated thousands of candidate biomarkers (over 2,400 validated parameters in some analyses), each representing a potential predictor of cancer progression.
3. Predictive Modeling at Population Scale
Using HPC resources, the team conducted multivariable statistical modeling across diverse cancer cohorts, identifying features with independent prognostic value beyond traditional clinical metrics.
4. Temporal Reconstruction of Tumor Evolution
By analyzing spatial patterns within tumors, the system inferred evolutionary sequences of disease progression, an inherently data-intensive task requiring both computational power and algorithmic sophistication.

RAMSES: Infrastructure as an enabler of discovery

The RAMSES system, formally known as the Research Accelerator for Modeling and Simulation with Enhanced Security, played a central role in enabling these analyses. Hosted at the University of Cologne's IT Center and supported by national and European funding initiatives, it provides a secure, scalable environment for data-intensive biomedical research.
 
Crucially, RAMSES is not merely a computing resource but an integrated platform supporting:
  • GPU-accelerated AI workloads
  • High-memory nodes for large dataset handling
  • Parallelized pipelines for image and statistical analysis
  • Secure processing of sensitive clinical data
Without such infrastructure, the SPARK framework would be constrained to small-scale experiments rather than clinically relevant population studies.

Toward autonomous scientific discovery

The implications of this work extend beyond pathology. By combining agent-based AI systems with supercomputing infrastructure, researchers are moving toward autonomous scientific discovery pipelines, systems that can generate hypotheses, test them, and refine their own analytical strategies.
 
In oncology, this approach could accelerate the identification of novel biomarkers, improve patient stratification, and ultimately inform more personalized treatment strategies. More broadly, it signals a shift in how science is conducted: from manually driven analysis to computational ecosystems capable of operating at scale.

The supercomputing imperative in modern medicine

The study reinforces a central theme in contemporary research: data alone is not enough. The ability to extract meaning from complex, high-dimensional datasets depends critically on access to advanced computational infrastructure.
 
In this case, the RAMSES supercomputer transformed a conceptual AI framework into a practical, high-impact tool, demonstrating that in the era of digital medicine, supercomputing is not an accessory but a necessity.
 
As biomedical datasets continue to expand in size and complexity, systems like RAMSES will increasingly define the boundary between theoretical possibility and real-world application.

Hidden order, revealed at scale: Supercomputing, electron ptychography uncover the inner workings of relaxor ferroelectrics

A recent study led by researchers at the Massachusetts Institute of Technology has shed new light on one of materials science’s most persistent puzzles: the elusive structural organization inside relaxor ferroelectrics. Although these materials are foundational to technologies such as precision actuators and advanced sensors, the atomic-level disorder inherent to relaxor ferroelectrics has, until now, masked the origins of their exceptional electromechanical behavior.
 
The breakthrough, highlighted in MIT News, goes beyond experimental advances; it is fundamentally computational. Central to this progress is the integration of high-resolution electron ptychography with large-scale simulation workflows powered by high-performance computing (HPC), bridging the gap between experiment and theory across various length scales.

A computational lens into atomic disorder

Relaxor ferroelectrics such as lead magnesium niobate–lead titanate (PMN-PT) exhibit what researchers describe as a “polar slush,” a complex, fluctuating arrangement of nanoscale polarization domains. Capturing this structure requires more than imaging; it demands reconstruction, simulation, and statistical interpretation of vast multidimensional datasets.
 
The MIT-led team employed multislice electron ptychography to generate 4D scanning transmission electron microscopy (4D-STEM) datasets. Each dataset consists of diffraction patterns collected across a real-space grid, yielding an immense volume of information that requires iterative reconstruction algorithms. These reconstructions rely on computational frameworks such as PtychoShelves and custom multislice solvers, tools that are computationally intensive and inherently suited to supercomputing environments.
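
The structure of such a dataset, one diffraction pattern per probe position, maps naturally onto a four-dimensional array. The sketch below uses NumPy with random numbers in place of measured data; the scan and detector sizes are assumptions chosen to keep the example small, and the virtual bright-field image is a generic 4D-STEM operation rather than a step attributed to the study.

```python
import numpy as np


def dataset_gib(scan_y, scan_x, det_y, det_x, bytes_per_value=4):
    """Size in GiB of a 4D-STEM dataset stored as one value per detector pixel."""
    n_values = scan_y * scan_x * det_y * det_x
    return n_values * bytes_per_value / 2**30


# Small synthetic dataset: axes are (scan row, scan column, detector row, detector column).
SCAN_Y, SCAN_X = 32, 32   # probe positions in real space (assumed)
DET_Y, DET_X = 64, 64     # detector pixels per diffraction pattern (assumed)
rng = np.random.default_rng(0)
data = rng.random((SCAN_Y, SCAN_X, DET_Y, DET_X), dtype=np.float32)

# Virtual bright-field image: integrate each pattern inside a central disk.
ky, kx = np.indices((DET_Y, DET_X))
radius = np.hypot(ky - DET_Y / 2, kx - DET_X / 2)
mask = radius <= 8
bright_field = (data * mask).sum(axis=(2, 3))

print(data.shape, f"{data.nbytes / 2**20:.0f} MiB")
print(bright_field.shape)
print(dataset_gib(256, 256, 256, 256), "GiB for a 256 x 256 scan on a 256 x 256 detector")
```

Even this toy array occupies 16 MiB, and a 256 × 256 scan recorded on a 256 × 256 detector would reach 16 GiB in single precision before any reconstruction begins.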
 
Critically, the reconstruction process overcomes multiple scattering effects and retrieves depth-resolved structural information at near-atomic resolution. This allows researchers to visualize polarization variations through the thickness of the material, something unattainable with conventional microscopy techniques.

Supercomputing the physics of polarization

Beyond imaging, the study’s true computational depth emerges in its integration with molecular dynamics (MD) simulations. These simulations model supercells as large as 72 × 72 × 72 unit cells under varying strain conditions, tracking atomic displacements and polarization vectors over nanosecond timescales.
 
Such simulations are not trivial. They require:
  • Parallelized computation of interatomic forces using bond-valence models
  • Thermodynamic control via Nose–Hoover thermostats and Parrinello–Rahman barostats
  • Statistical averaging across billions of atomic interactions
The resulting datasets enable direct comparison with experimental reconstructions, effectively validating observed polar structures and revealing their dependence on strain and chemical ordering.
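
The size of the largest supercell can be worked out directly. The 72 × 72 × 72 figure comes from the text above, and five atoms per unit cell follows from the ABO3 perovskite structure of PMN-PT. The 1 ns duration and 1 fs timestep are assumptions made only for this estimate.

```python
# Rough workload estimate for the largest supercell mentioned in the article.
CELLS_PER_EDGE = 72        # from the article
ATOMS_PER_CELL = 5         # ABO3 perovskite: one A-site, one B-site, three oxygen
SIM_TIME_FS = 1_000_000    # 1 ns, assumed ("nanosecond timescales")
TIMESTEP_FS = 1            # assumed MD timestep

n_cells = CELLS_PER_EDGE ** 3
n_atoms = n_cells * ATOMS_PER_CELL
n_steps = SIM_TIME_FS // TIMESTEP_FS
atom_updates = n_atoms * n_steps          # per-atom position updates over the run
position_mb = n_atoms * 3 * 8 / 1e6       # one float64 xyz snapshot, in MB

print(f"{n_cells:,} unit cells, {n_atoms:,} atoms")
print(f"{atom_updates:,} per-atom updates over {n_steps:,} steps")
print(f"{position_mb:.1f} MB per position snapshot")
```

Under these assumptions the supercell contains roughly 1.9 million atoms, and a single nanosecond of simulated time involves more than a trillion per-atom updates, each requiring force evaluations over neighboring atoms.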
 
Moreover, multislice simulations of electron scattering, used to replicate experimental conditions, incorporate frozen phonon approximations with dozens of configurations to ensure convergence. These calculations, which simulate electron propagation through matter at atomic resolution, are computationally demanding and benefit significantly from HPC acceleration.

Data-driven discovery at the nanoscale

To interpret the immense data volumes, the researchers deployed advanced statistical and machine learning techniques. Principal component analysis (PCA) was applied to local polarization environments, reducing high-dimensional datasets into dominant “polar motifs” that describe recurring structural patterns.
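
How PCA distills recurring motifs from many local environments can be shown on synthetic data. In this sketch each sample is a flattened 3 × 3 neighborhood of two-component polarization vectors (18 features), built from two hidden patterns plus noise. The data, sizes, and the `pca` helper are all illustrative assumptions; the study's actual feature construction is not described here.

```python
import numpy as np


def pca(X, n_components):
    """PCA via SVD of the mean-centered data matrix."""
    Xc = X - X.mean(axis=0)
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    variance = S**2 / (X.shape[0] - 1)
    ratio = variance / variance.sum()
    components = Vt[:n_components]
    scores = Xc @ components.T
    return components, ratio[:n_components], scores


rng = np.random.default_rng(0)
N_SAMPLES, N_FEATURES = 500, 18   # 3 x 3 neighborhood of 2D vectors, flattened

# Two hidden orthonormal "motifs" that the analysis should recover.
q, _ = np.linalg.qr(rng.normal(size=(N_FEATURES, 2)))
motif_a, motif_b = q[:, 0], q[:, 1]

# Each environment is a noisy mixture of the two motifs.
weights_a = rng.normal(scale=3.0, size=N_SAMPLES)
weights_b = rng.normal(scale=1.0, size=N_SAMPLES)
noise = rng.normal(scale=0.1, size=(N_SAMPLES, N_FEATURES))
X = np.outer(weights_a, motif_a) + np.outer(weights_b, motif_b) + noise

components, ratio, scores = pca(X, n_components=2)
print("explained variance ratio:", np.round(ratio, 3))
print("overlap of first component with motif A:", round(abs(components[0] @ motif_a), 3))
```

The leading components align with the hidden motifs and account for nearly all of the variance, which is the sense in which PCA reduces a high-dimensional description to a handful of dominant patterns.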
 
Additionally, clustering algorithms were used to identify contiguous polarization domains, while pair-correlation functions quantified spatial relationships between dipoles. These analyses revealed that:
  • Polarization is strongly influenced by local chemical heterogeneity, particularly the distribution of Nb⁵⁺ and Mg²⁺ ions.
  • Short-range chemically ordered regions significantly enhance long-range polar correlations.
  • Strain drives a transition toward more ordered, ferroelectric-like behavior without eliminating intrinsic disorder.
Such findings would be inaccessible without the combination of high-resolution experimental input and large-scale computational analysis.
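
Of the analyses above, the dipole pair-correlation function is the simplest to sketch. The version below averages the dot product of polarization vectors separated by r lattice sites on a 2D grid. Periodic boundaries, the grid size, and the synthetic fields are assumptions for illustration, and the function name is hypothetical.

```python
import numpy as np


def dipole_correlation(P, max_r):
    """C(r) = <P(x) . P(x + r)>, averaged over sites and over both grid axes.

    P has shape (ny, nx, 2); periodic boundaries are assumed.
    """
    values = []
    for r in range(max_r + 1):
        along_x = (P * np.roll(P, -r, axis=1)).sum(axis=-1).mean()
        along_y = (P * np.roll(P, -r, axis=0)).sum(axis=-1).mean()
        values.append(0.5 * (along_x + along_y))
    return np.array(values)


rng = np.random.default_rng(0)
N = 64

# Fully ordered field: every dipole points along +x.
ordered = np.zeros((N, N, 2))
ordered[..., 0] = 1.0

# Fully disordered field: unit dipoles with independent random angles.
theta = rng.uniform(0.0, 2.0 * np.pi, size=(N, N))
disordered = np.stack([np.cos(theta), np.sin(theta)], axis=-1)

c_ordered = dipole_correlation(ordered, max_r=8)
c_disordered = dipole_correlation(disordered, max_r=8)
print(np.round(c_ordered, 3))
print(np.round(c_disordered, 3))
```

The ordered field stays correlated at every separation, while the random field decorrelates after a single lattice site. A relaxor would be expected to fall between these two limits, with the decay length indicating the typical size of correlated polar regions.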

Resolving the limits of measurement

One of the study’s notable achievements is quantifying the resolution limits of ptychographic reconstruction. Through simulation, the team demonstrated that polar domains as small as about 1 nm can be resolved under optimal conditions, even though inherent blurring effects limit the depth resolution to about 3.2 nm.
 
This calibration, achieved through synthetic datasets and reconstruction pipelines, underscores the importance of computational modeling in interpreting experimental data. It also highlights a broader trend in materials science: measurement is no longer purely observational but deeply intertwined with simulation.

Toward predictive materials design

By bridging atomistic simulations with experimental imaging, the MIT team has effectively created a multiscale framework for understanding relaxor ferroelectrics. The implications extend beyond academic curiosity.
 
With HPC-enabled workflows, researchers can now:
  • Predict how nanoscale chemical ordering influences macroscopic properties.
  • Optimize strain conditions for enhanced electromechanical performance.
  • Design next-generation materials with tailored polarization behavior.
This convergence of supercomputing and microscopy signals a shift toward predictive materials engineering, where computation does not merely support experiments but guides them.

The supercomputing imperative

The study exemplifies how modern materials science is inseparable from high-performance computing. From reconstructing terabyte-scale microscopy datasets to simulating billions of atomic interactions, every stage of the workflow depends on computational power.
 
As datasets grow richer and models more sophisticated, the role of supercomputers will only expand, transforming hidden atomic disorder into actionable scientific insight.
 
In the case of relaxor ferroelectrics, what was once considered noise is now recognized as structure, and it is supercomputing that has made it visible.

Modeling life at the microscopic scale: A computational breakthrough in oxygen transport

Within the human body, the delivery of oxygen occurs at the microscale, where the disciplines of physics, chemistry, and biology intersect. Elucidating the mechanisms by which oxygen is transported through the bloodstream, diffuses out of erythrocytes, and is utilized by surrounding tissues has remained a formidable challenge, primarily due to the inherent complexity of these processes.
 
Recently, a study published in the International Journal of Heat and Mass Transfer introduced a significant advancement: a fully three-dimensional computational model that simultaneously simulates oxygen transport alongside the motion and deformation of individual red blood cells (RBCs). This development constitutes an important step toward addressing one of the most complex multiphysics challenges in biomedical science.

A problem too complex to see directly

At the scale of capillaries, oxygen transport is governed by a delicate interplay of mechanisms:
  • Fluid flow through narrow vessels
  • Diffusion across multiple regions (cells, plasma, tissue)
  • Chemical reactions involving hemoglobin
  • Continuous deformation and interaction of red blood cells
Traditional models simplified this system, often ignoring individual cells or treating vessels as static tubes. But such approximations fall short of capturing how oxygen is actually delivered in living tissue.
 
The new study breaks from that tradition by embracing the full complexity.

A unified multiphysics framework

The researchers developed a diffuse interface model that unifies multiple physical processes into a single computational framework. Instead of treating boundaries, like the surface of a red blood cell, as sharp discontinuities, the method smooths them into a continuous transition region. This allows the governing equations to be solved seamlessly across the entire domain.
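
The idea of a smoothed boundary can be illustrated in one dimension. In this sketch a hyperbolic-tangent profile marks the inside of a cell with values near 1 and the outside with values near 0, and a material property is blended across the transition. The tanh form, the interface width, and the two diffusivity values are common textbook choices assumed here for illustration, not parameters taken from the study.

```python
import numpy as np

D_RBC = 0.5      # hypothetical oxygen diffusivity inside the cell (arbitrary units)
D_PLASMA = 1.0   # hypothetical oxygen diffusivity in plasma (arbitrary units)


def phase_field(signed_distance, eps):
    """Smooth indicator: about 1 inside (negative distance), about 0 outside."""
    return 0.5 * (1.0 - np.tanh(signed_distance / (2.0 * eps)))


def mixture(phi, inside_value, outside_value):
    """Blend a material property across the diffuse interface."""
    return phi * inside_value + (1.0 - phi) * outside_value


x = np.linspace(-1.0, 1.0, 201)
signed_distance = np.abs(x) - 0.5          # a "cell" occupying |x| < 0.5
phi = phase_field(signed_distance, eps=0.05)
diffusivity = mixture(phi, D_RBC, D_PLASMA)

print(f"phi at center: {phi[100]:.4f}, phi at edge of domain: {phi[0]:.4f}")
print(f"diffusivity range: {diffusivity.min():.3f} to {diffusivity.max():.3f}")
```

Because the blended property is defined everywhere on the grid, a single set of equations can be solved across the whole domain without tracking the cell surface explicitly.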
 
At its core, the model simultaneously solves:
  • The incompressible Navier–Stokes equations for blood flow
  • Advection–diffusion–reaction equations for oxygen transport
  • Fluid–structure interaction governing deformable red blood cells
Red blood cells are modeled as elastic membranes interacting with fluid using an immersed boundary method, enabling them to move, deform, and respond dynamically to their environment.
 
The result is a fully coupled 3D simulation where flow, chemistry, and cellular mechanics evolve together.
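
A one-dimensional toy version of the oxygen transport equation shows how advection, diffusion, and reaction combine in a single update. This sketch assumes a uniform velocity, constant diffusivity, a linear consumption term in place of hemoglobin kinetics, periodic boundaries, and a first-order explicit scheme chosen for brevity. None of these choices are taken from the study.

```python
import numpy as np


def adr_step(c, u, D, k, dx, dt):
    """One explicit Euler step of dc/dt + u dc/dx = D d2c/dx2 - k c.

    Periodic boundaries; first-order upwind advection (requires u >= 0).
    """
    advection = -u * (c - np.roll(c, 1)) / dx
    diffusion = D * (np.roll(c, -1) - 2.0 * c + np.roll(c, 1)) / dx**2
    reaction = -k * c
    return c + dt * (advection + diffusion + reaction)


N = 100
dx = 0.01
u, D = 1.0, 1e-3
dt = 0.004   # below both the advective (dx/u) and diffusive (dx^2 / 2D) limits

x = np.arange(N) * dx
c0 = np.exp(-((x - 0.3) ** 2) / (2 * 0.05**2))   # initial oxygen bump

# Pure transport (k = 0): the bump moves and spreads, total amount is conserved.
c = c0.copy()
for _ in range(50):
    c = adr_step(c, u, D, k=0.0, dx=dx, dt=dt)

# With consumption (k > 0): one step removes a fraction k * dt of the total.
c_consumed = adr_step(c0, u, D, k=2.0, dx=dx, dt=dt)

print(f"total before: {c0.sum():.6f}, after transport: {c.sum():.6f}")
print(f"total after one consuming step: {c_consumed.sum():.6f}")
```

In the full model the velocity comes from the flow solver and the coefficients vary in space, so the three terms are coupled to the motion of the cells rather than fixed in advance.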

Capturing the behavior of living blood

One of the most striking outcomes of the study is the ability to simulate how red blood cells actively regulate oxygen delivery.
 
Rather than acting as passive carriers, the simulations suggest that RBCs:
  • Adjust oxygen release based on local tissue demand.
  • Interact with one another in ways that influence flow distribution.
  • Contribute to maintaining relatively uniform oxygenation across tissue.
This emergent behavior, arising purely from physics and chemistry, offers new insight into how the body maintains balance at the microscale.

The computational challenge beneath the surface

While the study does not explicitly reference supercomputers or high-performance computing (HPC) systems, the scale and sophistication of the model place it firmly within the realm of HPC-class workloads.
 
The simulation involves:
  • Three-dimensional, time-dependent PDEs
  • Moving and deforming interfaces
  • High-order numerical schemes (including fifth-order advection methods)
  • Coupled nonlinear physics across multiple domains
These are precisely the kinds of problems that increasingly drive demand for advanced computing infrastructure.
 
Interestingly, rather than relying solely on brute-force computational power, the researchers focused on algorithmic efficiency:
  • A mixture formulation eliminates the need for complex interface reconstruction.
  • Fixed Cartesian grids simplify geometry handling.
  • Carefully chosen numerical schemes balance accuracy and cost.
This approach reflects a broader trend in computational science: pairing smarter algorithms with scalable hardware to tackle previously intractable problems.

A glimpse of scalable biomedical simulation

The implications extend beyond this specific study. By demonstrating a practical way to simulate oxygen transport with deformable cells in 3D, the work lays a foundation for:
  • Patient-specific microcirculation modeling
  • Disease studies involving impaired oxygen delivery
  • Integration with larger-scale physiological simulations
As these models grow in size and realism, they are likely to transition naturally onto parallel and high-performance computing platforms, where their full potential can be realized.

Looking ahead

This research highlights a subtle but important shift. The frontier of biomedical modeling is no longer defined solely by biological insight, but increasingly by computational capability.
 
Even when supercomputers are not explicitly named, they linger in the background, implicit in the complexity of the equations, the dimensionality of the models, and the ambition of the questions being asked.
 
In that sense, this study is not just about oxygen transport. It is a preview of a future where understanding life at its smallest scales depends as much on computational innovation as it does on biology itself.