Marina Sirota

AI agents open new frontiers in predicting preterm birth

In a compelling example of artificial intelligence (AI) and high-performance computing (HPC) revolutionizing medical research, scientists at the University of California, San Francisco (UCSF) have created advanced AI tools capable of precisely analyzing vast healthcare datasets to predict preterm birth, a major contributor to infant mortality and long-term health issues globally. Their findings, recently published in Cell Reports Medicine, offer fresh optimism for early intervention and underscore the transformative potential of supercomputing-powered data science in addressing complex biological challenges.
 
Preterm birth, defined as delivery before 37 weeks of gestation, impacts about one in ten pregnancies worldwide and carries a heightened risk of complications, including respiratory distress, neurodevelopmental disorders, and chronic long-term illnesses. Despite years of research, accurately pinpointing which pregnancies are most at risk has proven difficult, primarily because of the complex mix of genetic, environmental, clinical, and lifestyle factors influencing gestational outcomes.
 
The UCSF team, led by Marina Sirota, PhD, professor of Pediatrics and interim director of the Bakar Computational Health Sciences Institute, approached the problem not by narrowing the dataset, but by embracing its scale.
 
To do so, the team trained machine learning algorithms on a large multi-institutional dataset encompassing millions of electronic health records (EHRs), biomarker measurements, and demographic information. To manage such a voluminous and heterogeneous dataset and extract meaningful patterns from it, the researchers relied on supercomputing infrastructure that could process and analyze the data in parallel, an essential capability when training and validating predictive AI models.
 
Their model integrates clinical features such as maternal age, blood test results, and previous obstetric outcomes with lifestyle information. Through iterative training on diverse cases, the model learned to distinguish subtle signals predictive of preterm birth, achieving significantly higher accuracy than traditional risk scoring systems. The findings reported in Cell Reports Medicine indicate that AI models trained on robust, high-dimensional data can discern patterns that may elude even experienced clinicians.
 
Crucially, the supercomputing element of this research was not merely about speed but about scale and integration. Handling millions of records, each with potentially hundreds of variables, demands computational resources capable of running the large matrix operations, optimization routines, and cross-validation loops that help ensure a model generalizes. Standard computing environments struggle with datasets of this magnitude, but HPC systems equipped with parallel processing and optimized data pipelines enabled the researchers to train, test, and refine models within feasible time frames.
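To make the idea of a cross-validation loop concrete, here is a minimal sketch in Python. It is not the UCSF pipeline: the data are synthetic, the "model" is a toy threshold on a single made-up risk score, and every function name is hypothetical. It only shows the general pattern of holding out one fold at a time and scoring the model on records it has not seen.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the indices 0..n_samples-1 and deal them into k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_threshold(xs, ys):
    """Toy stand-in for model training: midpoint of the two class means."""
    positives = [x for x, y in zip(xs, ys) if y == 1]
    negatives = [x for x, y in zip(xs, ys) if y == 0]
    return (mean(positives) + mean(negatives)) / 2


def accuracy(threshold, xs, ys):
    """Fraction of samples whose thresholded score matches the label."""
    hits = [int((x > threshold) == (y == 1)) for x, y in zip(xs, ys)]
    return sum(hits) / len(hits)


def cross_validate(xs, ys, k=5, seed=0):
    """Return one held-out accuracy per fold."""
    scores = []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        threshold = fit_threshold([xs[i] for i in train], [ys[i] for i in train])
        scores.append(accuracy(threshold,
                               [xs[i] for i in held_out],
                               [ys[i] for i in held_out]))
    return scores


# Synthetic data only: one made-up risk score per record, no real patients.
rng = random.Random(42)
labels = [i % 2 for i in range(200)]
risk_scores = [rng.gauss(1.5 if y else -1.5, 1.0) for y in labels]
fold_scores = cross_validate(risk_scores, labels, k=5)
print("per-fold accuracy:", [round(s, 2) for s in fold_scores])
print("mean accuracy:", round(mean(fold_scores), 2))
```

Each fold is evaluated independently of the others, which is one reason this kind of workload parallelizes well: with millions of records and a far more expensive model, the folds can be trained on separate nodes at the same time.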
 
According to the study, this approach represents a paradigm shift in obstetric research: applying AI to large-scale datasets makes it possible to identify risk profiles long before symptoms manifest. That opens the door to earlier, more personalized interventions that could improve outcomes for mothers and infants alike.
 
The implications are profound. Early prediction of preterm birth could allow clinicians to tailor monitoring schedules, recommend targeted therapies, and provide proactive support to high-risk patients, ultimately reducing the incidence of complications and associated healthcare costs. In regions with limited access to specialized care, AI-driven models could empower frontline providers with actionable insights based on data patterns derived from large cohorts.
 
For the supercomputing community, the model illustrates the expanding role of HPC beyond traditional domains like physics, climate modeling, and astrophysics. In the era of digital medicine, vast datasets generated by electronic health records, genomic sequencing, and wearable sensors present both a challenge and an opportunity: how to turn data into life-saving knowledge. Supercomputers, with their ability to orchestrate trillions of calculations across distributed architectures, are becoming essential partners in this transformation.
 
Moreover, the model's success underscores the importance of ethical, transparent, and clinically grounded AI development. The UCSF researchers emphasize that predictive models must be rigorously validated across diverse populations to ensure fairness and avoid perpetuating healthcare disparities. Supercomputing resources make such comprehensive validation feasible, enabling researchers to test model performance across subgroups defined by race, socioeconomic status, and geographic region.
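As a rough illustration of what subgroup validation can look like in practice, the Python sketch below computes sensitivity and false positive rate separately for each subgroup and reports the largest sensitivity gap between groups. The function names, the group labels "A" and "B", and the records are all hypothetical, and the metrics chosen here are an assumption; the article does not say which measures the UCSF team used.

```python
from collections import defaultdict


def subgroup_rates(records):
    """Summarize binary predictions per subgroup.

    `records` is an iterable of (subgroup, true_label, predicted_label)
    tuples with 0/1 labels.
    """
    counts = defaultdict(lambda: {"tp": 0, "fn": 0, "fp": 0, "tn": 0})
    for group, y_true, y_pred in records:
        if y_true:
            key = "tp" if y_pred else "fn"
        else:
            key = "fp" if y_pred else "tn"
        counts[group][key] += 1

    report = {}
    for group, c in counts.items():
        positives = c["tp"] + c["fn"]
        negatives = c["fp"] + c["tn"]
        report[group] = {
            "n": positives + negatives,
            # None marks a rate that is undefined for this subgroup.
            "sensitivity": c["tp"] / positives if positives else None,
            "false_positive_rate": c["fp"] / negatives if negatives else None,
        }
    return report


def sensitivity_gap(report):
    """Largest difference in sensitivity between any two subgroups."""
    values = [r["sensitivity"] for r in report.values()
              if r["sensitivity"] is not None]
    return max(values) - min(values) if values else None


# Made-up records for two hypothetical subgroups, for illustration only.
example = [
    ("A", 1, 1), ("A", 1, 0), ("A", 0, 0), ("A", 0, 0),
    ("B", 1, 1), ("B", 1, 1), ("B", 0, 1), ("B", 0, 0),
]
example_report = subgroup_rates(example)
for group, stats in sorted(example_report.items()):
    print(group, stats)
print("sensitivity gap:", sensitivity_gap(example_report))
```

A large gap between subgroups would be a signal to investigate further before any clinical use, even when the overall accuracy looks strong.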
 
As AI continues to mature alongside advances in supercomputing, the pace of medical discovery is poised to accelerate. From predicting preterm birth to personalized cancer therapies and beyond, computational models trained on big data are charting new frontiers in health science, turning complexity into clarity and raw data into actionable insight. 
 
As Sirota and her colleagues demonstrate, when scientific AI meets scalable computing, the result is more than faster analysis. It is the possibility of foresight, the ability to identify risk before crisis emerges.

In maternal health, that foresight could mean healthier pregnancies, stronger newborns, and lives changed by the power of computation.

The low-surface-brightness galaxy CDG-2, within the dashed red circle at right, is dominated by dark matter and contains only a sparse scattering of stars. The full image from NASA’s Hubble Space Telescope is at left. NASA, ESA, Dayi Li (UToronto); Image Processing: Joseph DePasquale (STScI)

Peering into cosmic darkness: Supercomputers illuminate one of the faintest galaxies ever seen

Astronomers have made a discovery that redefines how we think of galaxies, which are often pictured as shining collections of stars. By leveraging cutting-edge space telescopes and state-of-the-art data analysis, scientists have pinpointed one of the dimmest galaxies ever observed, a nearly invisible structure where just a few scattered stars hint at a vast, hidden mass. This achievement, made possible by advanced computational methods, demonstrates how supercomputing and data science are pushing the boundaries of our ability to detect the universe’s faintest and most mysterious objects.
 
Named Candidate Dark Galaxy-2 (CDG-2), this galaxy lies about 300 million light-years from Earth in the Perseus cluster. Instead of shining with billions of stars like most galaxies, CDG-2 emits barely any light, with a visible brightness equivalent to just six million suns. Even more astonishing, over 99 percent of its mass seems to be made of dark matter, the enigmatic, unseen substance that dominates the universe’s mass but does not emit or absorb light.
 
What makes this discovery especially groundbreaking is how the galaxy was found. Rather than detecting the galaxy directly by its stars, researchers used globular clusters (densely packed, gravitationally bound spheres of old stars) as cosmic beacons. These compact clusters were identified as unusually tight groupings in survey data, hinting that they might be orbiting an unseen underlying galaxy. Follow-up imaging with NASA’s Hubble Space Telescope confirmed the clusters’ presence, while data from ESA’s Euclid mission and the Subaru Telescope in Hawaii revealed an ultra-faint diffuse glow around them, the first direct evidence of the dark galaxy itself.

This detection would have been impossible without high-performance computing and sophisticated statistical models, which are capable of sifting through vast datasets and isolating subtle signals. Modern astrophysical research increasingly relies on supercomputer-assisted analyses to combine multi-telescope observations, model faint features buried in noise, and test competing interpretations of the observed data. In essence, HPC enables astronomers to digitally construct cosmic systems too faint or distant to examine through direct observation alone.
 
CDG-2 stands apart from most known systems not just for its dimness, but for what it may reveal about the role of dark matter in galaxy formation and evolution. The prevailing view in cosmology holds that dark matter provides the gravitational scaffolding around which normal matter (gas and stars) accumulates to form galaxies. Yet the extreme case of CDG-2 suggests scenarios in which star formation was suppressed, or star-forming material was stripped away, leaving behind a halo rich in dark matter but poor in visible stars. Such galaxies are thought to be exceedingly rare, making this one a crucial testbed for theories of cosmic structure formation.
 
The supercomputing community should take particular pride in this discovery, as the identification and analysis of CDG-2 depended on algorithms and models developed to handle petabyte-scale datasets from ongoing and upcoming sky surveys. As observatories like the Vera C. Rubin Observatory and the Nancy Grace Roman Space Telescope begin mapping the sky with unprecedented depth and breadth, the role of HPC will only grow, not just in storing and processing data, but in helping astronomers ask new questions about the dark universe and find answers hidden within noise.
 
Moreover, the methods used to detect CDG-2, in which computational analysis effectively preceded direct detection, open a new frontier in observational astronomy. In future surveys, machine learning and other supercomputer-powered techniques may routinely uncover objects too faint or too exotic to be picked out directly in telescope images, blurring the line between observation and inference.
 
While CDG-2 may be one of the darkest galaxies yet discovered, its detection casts an inspiring light on the future of astrophysics. It reminds us that the universe still holds countless hidden wonders and that with the synergy of powerful telescopes and supercomputing, we are just beginning to uncover them.

Supercomputers tackle a stellar puzzle, but have we really solved it?

Astrophysicists have long puzzled over a key mystery in the life cycle of red giant stars, the swollen, aging stars whose ranks our own Sun will eventually join. For more than fifty years, scientists have documented changes in the surface chemistry of these stars as they evolve, yet the process responsible for these changes has remained unclear. Now, researchers at the University of Victoria’s Astronomy Research Centre report that advanced supercomputer simulations have finally cracked the case: stellar rotation intensifies internal mixing, carrying elements from deep inside red giants up to their surfaces.
 
These results are grounded in sophisticated three-dimensional hydrodynamical simulations. Such simulations are feasible only thanks to the immense computational power of modern high-performance computing facilities, including the Texas Advanced Computing Center and the Trillium supercomputer at SciNet in Canada.
 
According to lead researcher Simon Blouin, rotation dramatically increases the efficiency with which internal waves move material through the stable barrier layer between the core and the outer convection zone. In practical terms, this means that elements like carbon and nitrogen can be transported outward in ways that align with what telescopes have observed for decades, particularly changes in isotopic ratios like carbon-12 to carbon-13 that had until now lacked a convincing cause.
 
But before declaring this celestial riddle fully solved, especially for a scientifically literate audience like that of SC Online, it’s worth digging into what this “solution” really entails, and where skepticism might still be warranted.

Simulation Success, But What About Reality?

The crux of the new work lies in computational hydrodynamics: solving the fluid motion of stellar interiors under the influence of rotation, gravity, turbulence, and thermal gradients. These simulations are not simple; even with hundreds of processors working in parallel, individual runs can consume millions of CPU hours. Their scope and resolution reflect the kind of computational scale once reserved for meteorology and climate models, “big science” simulations where raw computational power often dictates what questions can be asked as much as what answers are found.
 
While the results reproduce observed surface anomalies under specific rotation regimes, there are critical caveats inherent to any model of such complexity:
  • Parameter Dependence: The simulations assume particular rotation rates and internal structural profiles. Whether those parameters accurately represent all red giants, especially those with different masses or histories, is not firmly established.
  • Resolution Limits: Even top-tier HPC clusters must balance between resolution and computational cost. Fine details of mixing processes can be sensitive to grid size and physics approximations, meaning that what appears as a “solution” at one scale might shift at higher fidelity.
  • Model Uncertainties: Stellar interiors host a vast array of poorly constrained physical processes, from magnetism to subtle wave interactions, some of which may not be fully captured in current models.
In other words, while the simulations are impressive and represent a significant step forward in computational astrophysics, there remains ample room for cautious interpretation.
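The resolution trade-off can be made concrete with a standard back-of-the-envelope scaling argument, sketched below in Python. The assumptions are ours, not the study's: a uniform three-dimensional grid, and explicit time stepping in which the time step shrinks in proportion to the cell size, so that refining the grid by some factor also multiplies the number of time steps by that factor. The function name and the baseline figure are hypothetical.

```python
def relative_cost(refinement, dims=3, explicit_time_stepping=True):
    """Cost multiplier when every grid dimension is refined by `refinement`.

    Assumes a uniform grid. With explicit time stepping, the number of
    time steps is assumed to grow linearly with the refinement factor.
    """
    cells = refinement ** dims
    steps = refinement if explicit_time_stepping else 1
    return cells * steps


# Hypothetical baseline of one million CPU hours, for illustration only.
baseline_cpu_hours = 1_000_000
for factor in (1, 2, 4):
    hours = baseline_cpu_hours * relative_cost(factor)
    print(f"{factor}x finer grid -> about {hours:,} CPU hours")
```

Under these assumptions, doubling the resolution costs 16 times as much and quadrupling it costs 256 times as much, which is why a convergence check at much higher fidelity is often out of reach even on leading systems.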

The Computational Science Perspective

For the supercomputing community, the UVic work is a testament to both the power and the limitations of HPC. Without large-scale simulations, with computations spread across hundreds or thousands of processors, the question of how rotating convection and internal gravity waves interact inside a star would remain purely theoretical. Supercomputers act here as numerical laboratories, where hypotheses about internal stellar dynamics can be tested in silico, complementing telescope observations with insights that would otherwise be out of reach.
 
At the same time, this breakthrough highlights that solving a scientific problem rarely equates to closure. Computational results often raise as many questions as they answer: How universal is the rotational mixing mechanism across the diversity of red giant stars? Could different physical processes dominate in other evolutionary phases? And how might uncertainties in initial conditions or physics assumptions influence model outcomes?
 
These are issues that only further HPC-driven research, informed by both observation and theoretical refinement, can address. In that sense, the latest simulations are less a final answer and more a checkpoint in a long, iterative process of scientific inquiry.

A Future Written in Code

As supercomputing power continues to grow and astrophysical models become ever more detailed, simulations like these will increasingly serve as indispensable tools for unraveling cosmic mysteries. Whether the task is modeling mixing in red giants or simulating galaxy formation at cosmological scales, HPC remains at the frontier of our capacity to think computationally about the universe.
 
Still, caution is warranted. Matching known observations with a model marks significant progress, but it does not equate to a final answer. In astronomy and other computational sciences, findings are only as robust as the underlying assumptions, and verifying those assumptions across the universe’s full complexity is a task that extends far beyond any individual study.
 
At present, supercomputer-generated star models present a compelling narrative for how rotation affects red giant surfaces. Whether this narrative endures further examination, evolves with new data, or is ultimately rewritten remains to be seen.