Yuan Yao Assistant Professor of Industrial Ecology and Sustainable Systems

Yale professor Yuan Yao investigates the use of machine learning for the sustainable development of biomass

Biomass is widely considered a renewable alternative to fossil fuels, and many experts say it can play a critical role in combating climate change. Biomass stores carbon and can be turned into bio-based products and energy, with applications ranging from improving soil and treating wastewater to supplying renewable feedstocks.

Yet large-scale production has been limited by economic constraints and by challenges in optimizing and controlling biomass conversion.

A new study led by Yale School of the Environment’s Yuan Yao, assistant professor of industrial ecology and sustainable systems, and doctoral student Hannah Szu-Han Wang analyzed current machine learning applications for biomass and biomass-derived materials (BDM) to determine whether machine learning is advancing the research and development of biomass products. The authors found that machine learning has not yet been applied across the entire life cycle of BDM, limiting its potential to advance the field.

Yao’s research investigates how emerging technologies and industrial development will affect the environment with a focus on bio-economy and sustainable production. Wang worked in the production of biomaterials during her master’s research. The two researchers said they were interested in pursuing this study to find out if machine learning could help with best practices for creating BDM, a chief component of a bio-based economy, as well as predicting their performance as sustainable materials.

“There are so many combinations of biomass feedstock, conversion technologies, and BDM applications. If we want to try each combination using the traditional trial-and-error experimental approach, this will take a lot of time, labor, effort, and energy. We already generate a lot of data from these past experiments, so we are asking, can we apply machine learning to help us to figure out how we can better design BDM?" Yao explains.
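The combinatorial search Yao describes is exactly where surrogate models are commonly applied: train a regression model on past experiments, then rank untested combinations cheaply. The sketch below is purely illustrative (not the study’s model; the features and the yield relationship are invented stand-ins):

```python
# Illustrative surrogate-model sketch: the feedstock/process features and
# the "yield" relationship below are invented, not from the study.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in data: each row is one past experiment with
# [feedstock lignin fraction, conversion temperature (C), residence time (s)]
X = rng.uniform([0.1, 300, 1], [0.4, 700, 60], size=(200, 3))
# Hypothetical product yield with noise (purely invented relationship)
y = 0.5 * X[:, 0] + 0.001 * X[:, 1] - 0.002 * X[:, 2] + rng.normal(0, 0.01, 200)

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# Screen a grid of untested combinations and rank the most promising one,
# instead of running every combination as a physical experiment
candidates = rng.uniform([0.1, 300, 1], [0.4, 700, 60], size=(1000, 3))
pred = model.predict(candidates)
best = candidates[np.argmax(pred)]
print("Most promising candidate:", best)
```

In practice such a model would be retrained as new experiments confirm or refute its top-ranked candidates, which is what makes the approach cheaper than exhaustive trial and error.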

For the study, published in Resources, Conservation and Recycling, Yao and Wang reviewed more than 50 papers published since 2008 to understand the capabilities, current limitations, and future potential of machine learning in supporting the sustainable development and application of BDM. They found that while a few studies applied machine learning to address data challenges in life cycle assessment, most applied it only to predict and optimize the technical performance of biomass conversion and applications. None applied machine learning across the entire life cycle, from biomass cultivation to BDM production and end-use applications.

“Most studies are applying machine learning to just a very small part of the entire lifecycle of BDM,” Yao says. “We argue that if you want to incorporate sustainability into the development of this material, we need to consider the entire lifecycle of the materials, from how they are generated to their potential environmental impact. We believe machine learning has the potential to support sustainability-informed design for biomass-derived materials.”

Wang said the study has led to further research on data gaps in machine learning on biomass-derived materials.

“We found a future direction that people have not yet explored regarding sustainability assessments for BDM. There needs to be a full pathway prediction to enhance our understanding of how various factors regarding BDM interact and contribute to sustainability,” she says.

Zooming Through a Simulated Universe

This video begins by showing the most distant galaxies in the simulated deep field image in red. As it zooms out, layers of nearer (yellow and white) galaxies are added to the frame. By studying different cosmic epochs, Roman will be able to trace the universe's expansion history, study how galaxies developed over time, and much more. Credit: Caltech-IPAC/R. Hurt and M. Troxel


This simulated Roman deep field image, containing hundreds of thousands of galaxies, represents just 1.3 percent of the synthetic survey, which is itself just one percent of Roman's planned survey. The full simulation is available here. The galaxies are color coded – redder ones are farther away and whiter ones are nearer. The simulation showcases Roman’s power to conduct large, deep surveys and study the universe statistically in ways that aren’t possible with current telescopes. Credits: M. Troxel and Caltech-IPAC/R. Hurt

NASA Goddard scientists comb through the new simulated Roman data to get a taste of the bonus science

Scientists have created a gargantuan synthetic survey that shows what we can expect from the Nancy Grace Roman Space Telescope’s future observations. Though it represents just a small chunk of the real future survey, this simulated version contains a staggering number of galaxies – 33 million of them, along with 200,000 foreground stars in our home galaxy. This animation shows the type of science that astronomers will be able to do with future Roman deep field observations. The gravity of intervening galaxy clusters and dark matter can lens the light from farther objects, warping their appearance as shown in the animation. By studying the distorted light, astronomers can study elusive dark matter, which can only be measured indirectly through its gravitational effects on visible matter. As a bonus, this lensing also makes it easier to see the most distant galaxies whose light they magnify.

The simulation will help scientists plan the best observing strategies, test different ways to mine the mission's vast quantities of data and explore what we can learn from tandem observations with other telescopes.

"The volume of data Roman will return is unprecedented for a space telescope,” said Michael Troxel, an assistant professor of physics at Duke University in Durham, North Carolina. “Our simulation is a testing ground we can use to make sure we will get the most out of the mission’s observations.”

The team drew data from a mock universe originally developed to support science planning with the Vera C. Rubin Observatory, which is located in Chile and set to begin full operations in 2024. Because the Roman and Rubin simulations use the same source, astronomers can compare them and see what they can expect to learn from pairing the telescopes’ observations once they’re both actively scanning the universe.

Cosmic Construction

Roman’s High Latitude Wide Area Survey will consist of both imaging – the focus of the new simulation – and spectroscopy across the same enormous swath of the universe. Spectroscopy involves measuring the intensity of light from cosmic objects at different wavelengths, while Roman’s imaging will reveal precise positions and shapes of hundreds of millions of faint galaxies that will be used to map dark matter. Although this mysterious substance is invisible, astronomers can infer its presence by observing its effects on regular matter. 

Anything with mass warps the fabric of space-time. The bigger the mass, the greater the warp. This creates an effect called gravitational lensing, which happens when light from a distant source becomes distorted as it travels past intervening objects. When those lensing objects are massive galaxies or galaxy clusters, background sources can be smeared or appear as multiple images.

Less massive objects can create more subtle effects called weak lensing. Roman will be sensitive enough to use weak lensing to see how clumps of dark matter warp the appearance of distant galaxies. By observing these lensing effects, scientists will be able to fill in more of the gaps in our understanding of dark matter.
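The core statistical idea behind weak lensing can be sketched in a few lines: intrinsic galaxy shapes are random and average toward zero, so a small coherent shear imprinted on many galaxies survives the averaging. The 2% shear and the ellipticity scatter below are illustrative numbers, not Roman specifications:

```python
# Toy weak-lensing estimate: recover a small coherent shear by averaging
# noisy galaxy ellipticities. All numbers here are illustrative.
import numpy as np

rng = np.random.default_rng(1)
true_shear = 0.02                        # assumed 2% shear, for illustration
intrinsic = rng.normal(0, 0.3, 100_000)  # intrinsic ellipticity scatter
observed = intrinsic + true_shear        # linear (weak-lensing) approximation

estimate = observed.mean()               # intrinsic shapes average toward zero
print(f"recovered shear ~ {estimate:.3f}")
```

This is also why survey area matters: the noise on the average falls with the number of galaxies, so a wider, deeper survey like Roman’s pins down the shear signal more precisely.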

“Theories of cosmic structure formation make predictions for how the seed fluctuations in the early universe grow into the distribution of matter that can be seen through gravitational lensing,” said Chris Hirata, a physics professor at Ohio State University in Columbus, and a co-author of the paper. “But the predictions are statistical in nature, so we test them by observing vast regions of the cosmos. Roman, with its wide field of view, will be optimized to efficiently survey the sky, complementing observatories such as the James Webb Space Telescope that are designed for deeper investigation of individual objects.”

Ground and Space: This graphic compares the relative sizes of the synthetic image (inset, outlined in orange), the whole area astronomers simulated (the square in the upper-middle outlined in green), and the size of the complete future survey astronomers will conduct (the large square in the lower-left outlined in blue). The background, from the Digitized Sky Survey, illustrates how much sky area each region covers. The synthetic image covers about as much sky as a full moon, and the future Roman survey will cover much more area than the Big Dipper. While it would take the Hubble Space Telescope or James Webb Space Telescope around a thousand years to image an area as large as the future survey, Roman will do it in just over seven months. Credits: NASA’s Goddard Space Flight Center and M. Troxel

The synthetic Roman survey covers 20 square degrees of the sky, which is roughly equivalent to 95 full moons. The actual survey will be 100 times larger, unveiling more than a billion galaxies. Rubin will scan an even greater area – 18,000 square degrees, nearly half of the entire sky – but with lower resolution since it will have to peer through Earth’s turbulent atmosphere.
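A quick back-of-the-envelope check of those area figures, assuming an approximate apparent radius of 0.26 degrees for the full moon:

```python
# Sanity check on the survey-area comparison (approximate figures only)
import math

moon_radius_deg = 0.26                       # approximate apparent radius of the full moon
moon_area = math.pi * moon_radius_deg ** 2   # about 0.21 square degrees

moons_in_survey = 20 / moon_area             # about 94, close to the quoted 95
full_survey_sq_deg = 20 * 100                # the real survey: 2,000 square degrees
print(round(moons_in_survey), full_survey_sq_deg)
```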

Pairing the Roman and Rubin simulations offers the first opportunity for scientists to try to detect the same objects in both sets of images. That’s important because ground-based observations aren’t always sharp enough to distinguish multiple, close sources as separate objects. Sometimes they blur together, which affects weak lensing measurements. Now, scientists can determine the difficulties and benefits of “deblending” such objects in Rubin's images by comparing them with Roman ones. 


With Roman’s colossal cosmic view, astronomers will be able to accomplish far more than the survey's primary goals, which are to study the structure and evolution of the universe, map dark matter, and discern between the leading theories that attempt to explain why the expansion of the universe is speeding up. Scientists can comb through the new simulated Roman data to get a taste of the bonus science that will come from seeing so much of the universe in such exquisite detail.

“With Roman’s gigantic field of view, we anticipate many different scientific opportunities, but we will also have to learn to expect the unexpected,” said Julie McEnery, the senior project scientist for the Roman mission at NASA’s Goddard Space Flight Center in Greenbelt, Maryland. “The mission will help answer critical questions in cosmology while potentially revealing brand new mysteries for us to solve.”

The Nancy Grace Roman Space Telescope is managed at NASA’s Goddard Space Flight Center in Greenbelt, Maryland, with participation by NASA's Jet Propulsion Laboratory and Caltech/IPAC in Southern California, the Space Telescope Science Institute in Baltimore, and a science team comprising scientists from various research institutions. The primary industrial partners are Ball Aerospace and Technologies Corporation in Boulder, Colorado; L3Harris Technologies in Melbourne, Florida; and Teledyne Scientific & Imaging in Thousand Oaks, California.

Orbit-to-Ground study of biosignatures in the terrestrial Mars analog study site Salar de Pajonales, Chile. (b) Drone view of the site with macroscale geologic features (domes, aeolian cover, ridge networks, and patterned ground) in false color. (c) 3-D rendering of dome macrohabitats from drone imagery. (d) Orange and green bands of pigments from the photosynthetic microbial communities living in Ca-sulfate micro-habitats. These biosignatures are a feature of NASA’s Ladder of Life Detection and are detectable by eye and by instruments such as Raman (e) and Visible Short-Wave Infrared spectroscopy. Image credit: N. Cabrol, M. Phillips, K. Warren-Rhodes, J. Bishop and D. Wettergreen.

SETI Institute’s NAI team paves the way for machine learning to assist scientists in the search for biosignatures in the Universe

Wouldn’t finding life on other worlds be easier if we knew exactly where to look? Researchers have limited opportunities to collect samples on Mars or elsewhere, or to access remote sensing instruments, when hunting for life beyond Earth. An interdisciplinary study led by SETI Institute Senior Research Scientist Kim Warren-Rhodes mapped the sparse life hidden in salt domes, rocks, and crystals at Salar de Pajonales, at the boundary of the Chilean Atacama Desert and Altiplano. Warren-Rhodes then worked with co-investigators Michael Phillips (Johns Hopkins Applied Physics Lab) and Freddie Kalaitzis (University of Oxford) to train a machine learning model to recognize the patterns and rules associated with those distributions, so it could learn to predict and find the same distributions in data on which it was not trained. By combining statistical ecology with AI/ML, the scientists could locate and detect biosignatures up to 87.5% of the time (versus ≤10% by random search) and reduce the area needed for search by up to 97%.

Biosignature probability maps from CNN models and statistical ecology data. The colors in (a) indicate the probability of biosignature detection. In (b), a visible image of a gypsum dome geologic feature (left) is shown with biosignature probability maps for various microhabitats (e.g., sand versus alabaster) within it. Figure credit: M. Phillips, F. Kalaitzis, K. Warren-Rhodes.

“Our framework allows us to combine the power of statistical ecology with machine learning to discover and predict the patterns and rules by which nature survives and distributes itself in the harshest landscapes on Earth,” said Warren-Rhodes. “We hope other astrobiology teams adapt our approach to mapping other habitable environments and biosignatures. With these models, we can design tailor-made roadmaps and algorithms to guide rovers to places with the highest probability of harboring past or present life—no matter how hidden or rare.”

Ultimately, similar algorithms and machine learning models for many different types of habitable environments and biosignatures could be automated onboard planetary robots to efficiently guide mission planners to areas at any scale with the highest probability of containing life.

Warren-Rhodes and the SETI Institute NASA Astrobiology Institute (NAI) team used Salar de Pajonales as a Mars analog. Pajonales is a high-altitude (3,541 m), high-UV, hyperarid dry salt lakebed, considered inhospitable to many life forms but still habitable.


During the NAI project’s field campaigns, the team collected 7,765 images and 1,154 samples and tested instruments to detect photosynthetic microbes living within the salt domes, rocks, and alabaster crystals. These microbes exude pigments that represent one possible biosignature on NASA’s Ladder of Life Detection.

At Pajonales, drone flight imagery connected simulated orbital (HiRISE) data to ground sampling and 3D topographical mapping to extract spatial patterns. The study’s findings confirm (statistically) that microbial life at the Pajonales terrestrial analog site is not distributed randomly but concentrated in patchy biological hotspots strongly linked to water availability at km to cm scales.

Next, the team trained convolutional neural networks (CNNs) to recognize and predict macro-scale geologic features at Pajonales—some of which, like the patterned ground or polygonal networks, are also found on Mars—and micro-scale substrates (or ‘micro-habitats’) most likely to contain biosignatures.
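The building block of those CNNs can be illustrated with a toy per-pixel probability map: convolve an image tile with a filter and squash the result through a sigmoid, the operation a real CNN stacks and learns many times over. Everything below (the scene, the kernel, the bias) is an invented stand-in, not the study’s trained network:

```python
# Toy illustration of a per-pixel "biosignature probability map".
# The scene, kernel, and bias are invented; a trained CNN learns many
# such filters and stacks them in layers.
import numpy as np

def probability_map(image, kernel, bias=-1.0):
    """Valid 2-D convolution followed by a sigmoid, yielding values in (0, 1)."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel) + bias
    return 1.0 / (1.0 + np.exp(-out))

rng = np.random.default_rng(2)
scene = rng.random((16, 16))        # stand-in for a drone image tile
scene[4:8, 4:8] += 2.0              # a bright patch mimicking a pigment band
kernel = np.full((3, 3), 1 / 9)     # simple 3x3 averaging filter

probs = probability_map(scene, kernel)
print(probs.shape)                  # (14, 14)
```

In the toy map, the bright patch comes out with probabilities above 0.5 and the background below it, which is the same kind of output the study’s probability maps encode for each microhabitat.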

Like the Perseverance team on Mars, the researchers tested how to effectively integrate a UAV/drone with ground-based rovers, drills, and instruments (e.g., VISIR on ‘MastCam-Z’ and Raman on ‘SuperCam’ on the Mars 2020 Perseverance rover).

The team’s next research objective at Pajonales is to test the CNNs’ ability to predict the location and distribution of ancient stromatolite fossils and halite microbiomes, to learn whether similar rules and models apply to other similar yet slightly different natural systems. From there, entirely new ecosystems, such as hot springs, permafrost soils, and rocks in the Dry Valleys, will be explored and mapped. As more evidence accrues, hypotheses about the convergence of life’s means of surviving in extreme environments will be iteratively tested, and biosignature probability blueprints for Earth’s key analog ecosystems and biomes will be inventoried.

“While the high rate of biosignature detection is a central result of this study, no less important is that it successfully integrated datasets at vastly different resolutions from orbit to the ground, and finally tied regional orbital data with microbial habitats,” said Nathalie A. Cabrol, the PI of the SETI Institute NAI team. “With it, our team demonstrated a pathway that enables the transition from the scales and resolutions required to characterize habitability to those that can help us find life. In that strategy, drones were essential, but so was the implementation of microbial ecology field investigations that require extended periods (up to weeks) of in situ (and in place) mapping in small areas, a strategy that was critical to characterize local environmental patterns favorable to life niches.”

This study led by the SETI Institute’s NAI team has paved the way for machine learning to assist scientists in the search for biosignatures in the universe. Their paper “Orbit-to-Ground Framework to Decode and Predict Biosignature Patterns in Terrestrial Analogues” is the culmination of five years of the NASA-funded NAI project, and a cooperative astrobiology research effort with over 50 team members from 17 institutions. In addition to Johns Hopkins Applied Physics Lab and the University of Oxford, the Universidad Católica del Norte, Antofagasta, Chile supported this research.

The SETI NAI team project entitled “Changing Planetary Environments and the Fingerprints of Life” was funded by the NASA Astrobiology Program (Mary Voytek, Director) under grant No. NNA15BB01A.

Can AI help find life on Mars or Icy Worlds?

Video showing the major concepts of integrating datasets from orbit to the ground. The first frames zoom in from a global view to an orbital image of Salar de Pajonales. The salar is then overlain with an interpretation of its compositional variability derived from ASTER multispectral data. The next sequence of frames transitions to drone-derived images of the field site within Salar de Pajonales....
