Cratered clues: How supercomputers are reconstructing the violent history of asteroid Psyche

In the distant reaches of the asteroid belt between Mars and Jupiter, a metallic world named 16 Psyche preserves vital clues to planetary formation. Once thought to be the exposed core of an incomplete planet, Psyche is now at the center of groundbreaking research led by scientists from the University of Arizona. Using supercomputer simulations, they are re-examining the asteroid’s surface to unravel secrets about the early solar system.
 
Central to this research are the vast impact craters that pockmark Psyche’s exterior. These craters are not mere remnants of collisions; they hold essential information about the asteroid’s internal makeup, composition, and origins. Unlocking these secrets requires more than careful observation; it demands large-scale computational reconstruction.

From Telescope Data to Computational Models

Asteroid Psyche, roughly 220 kilometers in diameter, is one of the most massive metal-rich bodies in the asteroid belt.
 
Yet its composition remains debated. While once believed to be a solid iron-nickel core, more recent evidence suggests a mixed metal–silicate structure, complicating assumptions about its formation.
 
To resolve this uncertainty, researchers are turning to large-scale numerical impact simulations, using supercomputers to model how craters form under different material conditions. By comparing simulated crater morphologies with observational data, scientists can infer what lies beneath Psyche’s surface.
 
This approach effectively transforms crater analysis into an inverse problem, one where the observed geometry must be matched to a forward model of high-energy impacts governed by nonlinear physics.
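To make the inverse-problem framing concrete, here is a minimal Python sketch. It is not the study's method: the forward model is a deliberately toy power law standing in for a hydrocode run, and every function name, coefficient, and parameter value is a hypothetical chosen for illustration. The sketch generates a synthetic "observed" crater and then searches a parameter grid for the combination whose predicted geometry matches it best.

```python
def toy_forward_model(impactor_km, velocity_km_s, target_density):
    """Hypothetical stand-in for a hydrocode run.

    Returns (crater_diameter_km, crater_depth_km). The power law and the
    depth rule are invented for illustration, not taken from the study.
    """
    energy_proxy = impactor_km ** 3 * velocity_km_s ** 2
    diameter = 8.0 * (energy_proxy / target_density) ** 0.28
    # Assumption of this toy only: denser targets give shallower craters.
    depth = diameter * (0.25 - 0.015 * target_density)
    return diameter, depth


def misfit(predicted, observed):
    """Sum of squared relative errors between predicted and observed geometry."""
    return sum(((p - o) / o) ** 2 for p, o in zip(predicted, observed))


def invert(observed, impactor_sizes, velocities, densities):
    """Brute-force grid search: return (best_misfit, best_parameters)."""
    best = None
    for size in impactor_sizes:
        for velocity in velocities:
            for density in densities:
                score = misfit(toy_forward_model(size, velocity, density), observed)
                if best is None or score < best[0]:
                    best = (score, {
                        "impactor_km": size,
                        "velocity_km_s": velocity,
                        "target_density": density,
                    })
    return best


# Synthetic "observation" produced with known parameters, then recovered.
observed = toy_forward_model(6.0, 5.0, 4.0)
score, params = invert(
    observed,
    impactor_sizes=[2.0, 4.0, 6.0, 8.0, 10.0],
    velocities=[3.0, 5.0, 7.0],
    densities=[3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
)
print(params, score)
```

In a real workflow each call to the forward model would be an expensive simulation, which is why the search over parameters is the part that needs a supercomputer.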

HPC at the Core of Planetary Reconstruction

The study, published in the Journal of Geophysical Research: Planets, leverages hydrocode simulations, a class of numerical methods used to model shock physics, material deformation, and high-velocity impacts. These simulations solve coupled partial differential equations describing:
  • Momentum conservation under extreme pressures
  • Energy transfer during hypervelocity collisions
  • Phase transitions in metal and silicate materials
  • Fragmentation and ejecta dynamics
Such models are computationally intensive. Each simulation must resolve fine spatial and temporal scales while exploring a large parameter space, including:
  • Impactor size and velocity
  • Target composition (metal-rich vs. mixed material)
  • Porosity and internal layering
  • Gravity regime of the asteroid
Running these scenarios across multiple configurations requires massively parallel HPC systems, often executing thousands of simulations to converge on statistically robust interpretations.
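A rough sense of how quickly such a parameter space grows can be had from a short Python sketch. The grid values below are illustrative placeholders rather than the study's actual settings, the gravity parameter is left out for brevity, and the helper names are hypothetical.

```python
from itertools import product

# Hypothetical parameter grid; values are illustrative, not from the study.
PARAMETER_GRID = {
    "impactor_diameter_km": [2, 5, 10, 20],
    "impact_velocity_km_s": [3, 5, 7],
    "target_composition": ["metal-rich", "mixed"],
    "porosity": [0.0, 0.2, 0.4],
    "layered_interior": [False, True],
}


def enumerate_runs(grid):
    """Yield one configuration dict per point in the full factorial grid."""
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


def partition(runs, n_workers):
    """Assign runs to workers round-robin so the load stays balanced."""
    buckets = [[] for _ in range(n_workers)]
    for index, run in enumerate(runs):
        buckets[index % n_workers].append(run)
    return buckets


runs = list(enumerate_runs(PARAMETER_GRID))
print(len(runs))  # 4 * 3 * 2 * 3 * 2 = 144 configurations
buckets = partition(runs, n_workers=16)
print([len(bucket) for bucket in buckets])
```

Even this coarse grid yields 144 independent runs; refining any one axis multiplies the total, which is why such sweeps are typically spread across many nodes.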

Craters as Probes of Internal Structure

One of the key insights from the study is that crater size alone is not sufficient to infer surface composition. Instead, the shape, depth, and ejecta distribution of craters vary significantly depending on whether the target material behaves like solid metal, fractured rock, or a porous composite.
 
Supercomputer simulations revealed that some of Psyche’s largest craters are more consistent with impacts into a lower-density or heterogeneous, rather than purely metallic, body. This finding aligns with recent observational and spectral data suggesting Psyche is not a simple exposed core, but a more complex, differentiated object.
 
In practical terms, this suggests the asteroid’s history likely includes a sequence of complex processes: partial differentiation followed by structural disruption, subsequent re-accumulation of mixed materials, and repeated high-energy impact events.
 
Each of these scenarios leaves distinct signatures in crater morphology, signatures that only become interpretable through computational modeling.

A Digital Twin Ahead of NASA’s Arrival

The timing of this work is particularly significant. NASA’s Psyche mission, launched in 2023, is expected to arrive at the asteroid in 2029.
 
By the time the spacecraft begins transmitting high-resolution imagery and gravity data, researchers aim to have a computational framework already in place, a kind of digital twin of Psyche that can rapidly assimilate new observations.
 
For HPC users, this represents a familiar paradigm:
  • Build large ensembles of forward simulations.
  • Precompute parameter sensitivities.
  • Use observational data to constrain the model space in real time.
In planetary science, this workflow is becoming increasingly central as datasets grow and missions demand faster scientific interpretation.
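The third step of that workflow, constraining a precomputed ensemble with a new observation, can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the ensemble entries, the choice of depth-to-diameter ratio as the observable, the Gaussian error model, and the equal prior weight on every member are all hypothetical, and the numbers are invented.

```python
import math
from collections import defaultdict

# Hypothetical precomputed ensemble: one predicted observable per simulation.
ENSEMBLE = [
    {"composition": "metal-rich", "depth_to_diameter": 0.10},
    {"composition": "metal-rich", "depth_to_diameter": 0.12},
    {"composition": "mixed", "depth_to_diameter": 0.16},
    {"composition": "mixed", "depth_to_diameter": 0.18},
    {"composition": "porous", "depth_to_diameter": 0.22},
    {"composition": "porous", "depth_to_diameter": 0.24},
]


def posterior_by_composition(ensemble, observed, sigma):
    """Weight each member by a Gaussian likelihood and sum per composition.

    Assumes every ensemble member starts with equal prior weight.
    """
    weights = defaultdict(float)
    for member in ensemble:
        residual = (member["depth_to_diameter"] - observed) / sigma
        weights[member["composition"]] += math.exp(-0.5 * residual ** 2)
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


# An invented measurement of 0.17 with an uncertainty of 0.02.
posterior = posterior_by_composition(ENSEMBLE, observed=0.17, sigma=0.02)
for name, probability in sorted(posterior.items(), key=lambda item: -item[1]):
    print(f"{name}: {probability:.3f}")
```

Because the expensive simulations are already done, the update itself is cheap and can be repeated whenever a new measurement arrives.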
 
"Large impact basins or craters excavate deep into the asteroid, which gives clues about what its interior is made of," said Namya Baijal, a doctoral candidate at the LPL and first author of the paper. "By simulating the formation of one of its largest craters, we were able to make testable predictions for Psyche's overall composition when the spacecraft arrives."

Inspiration for the Supercomputing Community

For supercomputing engineers, Psyche offers a compelling example of how HPC extends beyond traditional domains into planetary-scale inference problems.
 
The work illustrates a broader shift: modern space science is no longer limited by data collection, but by our ability to simulate, compare, and interpret complex physical systems.
 
Craters, once viewed as static geological features, are now dynamic datasets, decoded through parallel computation and advanced modeling.
 
And in those impact scars, billions of years old, supercomputers are helping scientists read a story that was once thought unreachable: the formation of worlds, written in metal and stone, reconstructed in code.
Larissa Verona measures greenhouse gas emissions from the soil using the LI-COR instrument. Photo: Juliana Di Beo

Machine learning meets the Cerrado: Mapping the hidden carbon power of Brazil’s wetlands

The Brazilian Cerrado, often overshadowed by the Amazon rainforest, is emerging as a new frontier for computational climate science. According to researchers at the Cary Institute of Ecosystem Studies, wetlands scattered across this vast tropical savanna may act as unexpectedly powerful carbon reservoirs, yet quantifying their role in the global carbon cycle is proving to be a complex data problem increasingly addressed with machine learning and large-scale environmental modeling.
 
For machine learning professionals working with environmental data, the research highlights a fascinating challenge: detecting and modeling carbon storage in ecosystems that are spatially heterogeneous, seasonally dynamic, and poorly mapped.

The Cerrado’s Hidden Carbon System

The Cerrado biome covers roughly two million square kilometers across central Brazil and is widely recognized as one of the most biodiverse savanna ecosystems on Earth. But ecologically, its most important features may lie underground.
 
Researchers often describe the Cerrado as an “underground forest”, where plants store a significant portion of their biomass in deep root networks rather than aboveground trunks and canopies.
 
Seasonal wetlands within this landscape, such as veredas, peatlands, and marshy valley systems, play an outsized role in carbon storage. These ecosystems accumulate organic carbon in waterlogged soils where decomposition occurs slowly, allowing carbon to build up over centuries.
 
Some estimates suggest that Cerrado peatlands may hold around 13% of the region’s soil carbon while covering less than 1% of its surface area, illustrating the concentration of carbon within these specialized environments.
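The arithmetic behind that concentration is worth spelling out. The short Python sketch below takes the two figures quoted above (13% of soil carbon, 1% of area as an upper bound) and compares carbon per unit area inside peatlands with carbon per unit area everywhere else; the function name is a hypothetical helper.

```python
def relative_carbon_density(carbon_share, area_share):
    """Carbon per unit area inside a sub-region, relative to the rest."""
    inside = carbon_share / area_share
    outside = (1.0 - carbon_share) / (1.0 - area_share)
    return inside / outside


ratio = relative_carbon_density(carbon_share=0.13, area_share=0.01)
print(round(ratio, 1))  # 14.8
```

Since the peatlands cover less than 1% of the surface, roughly 15 times the carbon density of the surrounding landscape is a lower bound; a smaller area share pushes the ratio higher.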
 
Yet despite their importance, the spatial distribution and total carbon stocks of these wetlands remain poorly constrained.

A Data Problem Well Suited to Machine Learning

This is where computational methods come in.
 
To understand how Cerrado wetlands influence regional and global carbon cycles, researchers must integrate several challenging datasets simultaneously:
  • Satellite imagery capturing seasonal hydrology and vegetation structure
  • Soil carbon measurements from sparse field sampling campaigns
  • Topographic and hydrological models predicting water flow and wetland formation
  • Climate data describing temperature, rainfall, and evapotranspiration dynamics
Machine learning models, particularly ensemble regression and geospatial deep learning frameworks, are increasingly used to interpolate carbon density across unsampled regions and to identify wetland systems that conventional maps miss.
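As a rough illustration of ensemble-based interpolation, the Python sketch below predicts carbon density at an unsampled location from sparse field samples. It uses a bagged nearest-neighbour regressor as a simple stand-in for the ensemble regression models mentioned above, not the method any particular research group uses. The feature names, sample values, and units are invented for the example.

```python
import random
import statistics

# Hypothetical field samples: ((wetness_index, normalized_elevation), kg C per m^2).
SAMPLES = [
    ((0.90, 0.10), 40.0), ((0.80, 0.20), 35.0), ((0.85, 0.15), 38.0),
    ((0.20, 0.70), 8.0), ((0.10, 0.80), 6.0), ((0.15, 0.90), 5.0),
    ((0.50, 0.50), 15.0), ((0.60, 0.40), 20.0),
]


def knn_predict(train, query, k=3):
    """Average the targets of the k samples closest to the query in feature space."""
    def squared_distance(sample):
        return sum((a - b) ** 2 for a, b in zip(sample[0], query))
    nearest = sorted(train, key=squared_distance)[:k]
    return statistics.fmean([target for _, target in nearest])


def bagged_predict(train, query, n_models=50, k=3, seed=0):
    """Bootstrap-aggregated prediction: returns (ensemble mean, ensemble spread)."""
    rng = random.Random(seed)  # fixed seed keeps the example reproducible
    predictions = []
    for _ in range(n_models):
        resample = [rng.choice(train) for _ in train]  # sample with replacement
        predictions.append(knn_predict(resample, query, k))
    return statistics.fmean(predictions), statistics.pstdev(predictions)


wet_mean, wet_spread = bagged_predict(SAMPLES, query=(0.88, 0.12))
dry_mean, dry_spread = bagged_predict(SAMPLES, query=(0.12, 0.85))
print(f"wet valley cell: {wet_mean:.1f} +/- {wet_spread:.1f}")
print(f"dry upland cell: {dry_mean:.1f} +/- {dry_spread:.1f}")
```

The spread across ensemble members gives a crude uncertainty estimate, which matters when field samples are as sparse as they are in the Cerrado.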
 
Such models often operate on multi-terabyte remote-sensing datasets, requiring HPC pipelines capable of processing satellite imagery, generating spatial features, and training predictive models across millions of grid cells.
 
For ML engineers, this workflow closely resembles large-scale geospatial modeling tasks seen in climate simulation or Earth-observation analytics.

Mato Grosso do Sul: A Case Study in Rapid Landscape Change

The state of Mato Grosso do Sul provides a particularly revealing example of the computational challenge.
 
Cerrado landscapes dominate much of the state, covering more than 60% of its territory, and include a mosaic of savannas, grasslands, forests, and wetland fields that feed major river basins connected to the Pantanal.
 
However, the region has undergone rapid land-use change in recent decades. Between 1985 and 2022, more than 4.6 million hectares of native vegetation were lost, replaced largely by cattle pasture and soybean agriculture.
 
For environmental modelers, these changes introduce a moving target. Carbon storage potential must be estimated not just for intact ecosystems but also for landscapes undergoing continuous transformation.
 
Machine learning models, therefore, need to account for temporal dynamics, incorporating satellite time-series data and land-use classification models that track vegetation shifts over decades.
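A minimal Python sketch shows what tracking such shifts can look like once a land-cover classifier has labeled each pixel for each year. The class names, the years, the pixel series, and the 0.09-hectare pixel area (a 30-meter pixel) are assumptions made for illustration, and the helper functions are hypothetical.

```python
# Classes treated as native cover in this sketch (an assumption).
NATIVE = {"savanna", "grassland", "forest", "wetland"}


def first_conversion_year(years, classes):
    """Return the first year a pixel leaves native cover, or None if it never does."""
    was_native = False
    for year, label in zip(years, classes):
        if label in NATIVE:
            was_native = True
        elif was_native:
            return year
    return None


def converted_area_by_year(years, pixel_series, pixel_area_ha):
    """Sum the area of pixels by the year they were first converted."""
    totals = {}
    for classes in pixel_series:
        year = first_conversion_year(years, classes)
        if year is not None:
            totals[year] = totals.get(year, 0.0) + pixel_area_ha
    return dict(sorted(totals.items()))


YEARS = [1985, 1995, 2005, 2015, 2022]
PIXELS = [
    ["savanna", "savanna", "pasture", "pasture", "soy"],
    ["wetland", "wetland", "wetland", "wetland", "wetland"],
    ["forest", "pasture", "pasture", "soy", "soy"],
    ["pasture", "pasture", "pasture", "pasture", "pasture"],
    ["grassland", "grassland", "grassland", "grassland", "soy"],
]

print(converted_area_by_year(YEARS, PIXELS, pixel_area_ha=0.09))
```

Pixels that were already pasture at the start of the record are not counted as conversions here, a reminder that the chosen baseline year shapes any estimate of loss.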

Building the Next Generation of Ecological Models

Researchers associated with the Cary Institute of Ecosystem Studies, including ecologist Amy Zanne, are exploring how plant traits, microbial processes, and wetland hydrology influence carbon storage and greenhouse gas fluxes across the Cerrado.
 
For the machine learning community, these questions translate into a broader computational challenge:
 
How can models capture interactions among vegetation traits, soil microbiology, hydrology, and climate across continental-scale landscapes?
 
Traditional ecological models struggle with the dimensionality of these systems. Data-driven approaches, combining remote sensing, statistical inference, and ML, offer a pathway toward scalable predictions.

Curiosity for the ML Community

From an algorithmic standpoint, the Cerrado wetlands project illustrates an emerging domain sometimes called computational ecosystem science.
 
It sits at the intersection of:
  • Geospatial machine learning
  • Earth-system modeling
  • Large-scale environmental data assimilation
For machine learning engineers, the appeal is clear. Few real-world datasets are as complex, or as consequential, as those describing Earth’s carbon cycle.
 
And in the Cerrado’s wetlands, the stakes may be surprisingly high. Beneath the grasses and shrubs of Brazil’s savanna lies a vast, partially hidden carbon reservoir whose behavior could influence climate models for decades to come.
 
Understanding it will require more than field biology alone.
 
It will require algorithms capable of learning from the landscape itself.

Palantir, NVIDIA propose a ‘sovereign AI operating system,’ a new blueprint for AI supercomputing infrastructure

With the rapid expansion of large-scale AI infrastructure, Palantir Technologies and NVIDIA have launched a joint initiative that is attracting significant interest from the high-performance computing sector. Their new Sovereign AI Operating System Reference Architecture is a comprehensive blueprint designed to help organizations create production-ready AI data centers that can operate advanced models while preserving stringent control over data and infrastructure.
 
At first glance, this approach mirrors familiar high-performance computing (HPC) reference architectures, offering a validated stack that brings together compute, networking, storage, orchestration, and application frameworks. However, the system aims to go further by establishing what its developers call a true AI infrastructure operating system, one that unifies the stack from GPU hardware all the way to model deployment and enterprise workflows.
 
For supercomputing engineers accustomed to designing clusters for scientific simulation or AI training, the announcement raises an intriguing question: are we witnessing the emergence of an “AI operating system” layer for entire data centers?

A Turnkey AI Datacenter Stack

The new architecture, referred to as AIOS-RA, is designed as a turnkey platform that encompasses everything from hardware procurement to the development of production AI applications. It builds on NVIDIA’s enterprise reference architectures and has been validated to run Palantir’s full software ecosystem, including its data-integration and AI platforms.
 
Key components of the stack include:
  • GPU-accelerated compute nodes based on NVIDIA’s Blackwell-class systems
  • High-bandwidth networking, including Spectrum-X Ethernet fabrics
  • CUDA-X libraries and NVIDIA AI Enterprise software for optimized AI workloads
  • Palantir’s AIP, Foundry, Apollo, Rubix, and AIP Hub platforms for data integration, orchestration, and AI deployment
At the software layer, the system runs on a Kubernetes-based orchestration substrate, coordinating distributed services and enabling AI models to interact directly with enterprise data sources.
 
From an HPC perspective, the architecture resembles a hybrid of traditional supercomputing clusters and modern cloud platforms, combining tightly coupled GPU resources with containerized service orchestration and model-driven applications.

Why “Sovereign” AI?

The most distinctive feature of the architecture is its emphasis on data sovereignty.

Organizations deploying large-scale AI increasingly face regulatory and security constraints that require data and models to remain within specific jurisdictions or controlled infrastructure. The proposed platform allows enterprises or governments to deploy AI systems on domestic or on-premises infrastructure while maintaining full control over data, models, and applications.
 
This requirement has become especially prominent in sectors such as defense, healthcare, and finance, where data residency and regulatory compliance often prohibit the use of global public-cloud AI services.
 
In this sense, the architecture reflects a broader industry shift: AI workloads are no longer just software pipelines; they are strategic infrastructure assets.

HPC Convergence With Enterprise AI

For HPC practitioners, the proposed architecture highlights a growing convergence between AI factories and traditional supercomputing systems.
 
Several design principles familiar to HPC engineers appear throughout the architecture:
  • GPU-dense compute nodes optimized for AI training and inference
  • High-bandwidth networking fabrics designed to minimize latency across distributed workloads
  • Parallel data pipelines capable of feeding large models efficiently
  • Unified orchestration layers that coordinate heterogeneous workloads across clusters
However, unlike many scientific HPC environments, the stack is designed to support continuous operational AI workloads rather than batch simulation jobs.
 
In other words, the architecture treats the data center not as a machine that occasionally runs AI jobs, but as a persistent AI system operating at production scale.

Curiosity for the Supercomputing Community

The idea of an “AI operating system” for infrastructure invites both curiosity and debate among HPC engineers.
 
Traditional supercomputing environments already integrate complex software layers: schedulers, parallel file systems, MPI stacks, container runtimes, and resource managers. The new architecture attempts to unify many of these concepts within a platform designed specifically for AI-native workloads and enterprise data integration.
 
Whether this approach represents a genuine architectural shift or simply a rebranding of established HPC design patterns adapted for AI remains an open question.
 
What is clear, however, is that AI workloads are pushing infrastructure design toward tighter integration across hardware, orchestration, and application layers. As models grow larger and data pipelines more complex, the boundaries between cloud architecture, enterprise software, and supercomputing are rapidly dissolving.
 
For HPC practitioners observing the transformation of AI infrastructure, the partnership between Palantir and NVIDIA represents more than just a new product. It signals a larger shift, an exploration of how supercomputing architectures might become the standard foundation for production-scale AI systems.