Using machine learning, researchers at MIT and Dana-Farber Cancer Institute created a computational model that can analyze the sequence of about 400 genes and use that information to predict where a given tumor originated in the body. Credits: Image: iStock, MIT News

MIT builds AI model that can help determine where a patient's cancer arose

Predictions from the OncoNPC model could enable doctors to choose targeted treatments for difficult-to-treat tumors.

For a small percentage of cancer patients, doctors are unable to determine where their cancer originated. This makes it much more difficult to choose a treatment for those patients, because many cancer drugs are typically developed for specific cancer types.

A new approach developed by researchers at MIT and Dana-Farber Cancer Institute may make it easier to identify the sites of origin for those enigmatic cancers. Using machine learning, the researchers created a computational model that can analyze the sequence of about 400 genes and use that information to predict where a given tumor originated in the body.

Using this model, the researchers showed that they could accurately classify at least 40 percent of tumors of unknown origin with high confidence, in a dataset of about 900 patients. This approach enabled a 2.2-fold increase in the number of patients who could have been eligible for a genomically guided, targeted treatment, based on where their cancer originated.

“That was the most important finding in our paper, that this model could be potentially used to aid treatment decisions, guiding doctors toward personalized treatments for patients with cancers of unknown primary origin,” says Intae Moon, an MIT graduate student in electrical engineering and computer science who is the lead author of the new study.

Alexander Gusev, an associate professor of medicine at Harvard Medical School and Dana-Farber Cancer Institute, is the senior author of the paper.

Mysterious origins

In 3 to 5 percent of cancer patients, particularly in cases where tumors have metastasized throughout the body, oncologists don’t have an easy way to determine where the cancer originated. These tumors are classified as cancers of unknown primary (CUP).

This lack of knowledge often prevents doctors from being able to give patients “precision” drugs, which are typically approved for specific cancer types where they are known to work. These targeted treatments tend to be more effective and have fewer side effects than treatments that are used for a broad spectrum of cancers, which are commonly prescribed to CUP patients.

“A sizeable number of individuals develop these cancers of unknown primary every year, and because most therapies are approved in a site-specific way, where you have to know the primary site to deploy them, they have very limited treatment options,” Gusev says.

Moon, an affiliate of the Computer Science and Artificial Intelligence Laboratory who is co-advised by Gusev, decided to analyze genetic data that is routinely collected at Dana-Farber to see if it could be used to predict cancer type. The data consist of genetic sequences for about 400 genes that are often mutated in cancer. The researchers trained a machine-learning model on data from nearly 30,000 patients who had been diagnosed with one of 22 known cancer types. That set of data included patients from Memorial Sloan Kettering Cancer Center and Vanderbilt-Ingram Cancer Center, as well as Dana-Farber.

The researchers then tested the resulting model on about 7,000 tumors that it hadn’t seen before, but whose site of origin was known. The model, which the researchers named OncoNPC, was able to predict their origins with about 80 percent accuracy. For tumors with high-confidence predictions, which constituted about 65 percent of the total, its accuracy rose to roughly 95 percent.
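The idea of reporting accuracy separately for high-confidence predictions can be illustrated with a minimal sketch. This is not the OncoNPC implementation: the function names, the toy probabilities, and the 0.5 cutoff are all assumptions made for illustration, and the article does not state which confidence threshold the researchers used.

```python
from typing import List, Optional, Sequence, Tuple


def split_by_confidence(
    probs: Sequence[Sequence[float]], threshold: float
) -> Tuple[List[int], List[int]]:
    """Return (predicted class per sample, indices of high-confidence samples).

    A sample counts as high-confidence when its top class probability
    is at least `threshold` (a hypothetical cutoff, not from the study).
    """
    preds: List[int] = []
    confident: List[int] = []
    for i, row in enumerate(probs):
        best = max(range(len(row)), key=row.__getitem__)
        preds.append(best)
        if row[best] >= threshold:
            confident.append(i)
    return preds, confident


def accuracy(
    preds: Sequence[int],
    labels: Sequence[int],
    indices: Optional[Sequence[int]] = None,
) -> float:
    """Fraction of correct predictions, optionally restricted to `indices`."""
    idx = list(range(len(labels))) if indices is None else list(indices)
    if not idx:
        return float("nan")
    return sum(1 for i in idx if preds[i] == labels[i]) / len(idx)


# Toy class probabilities for five tumors and three made-up cancer types.
toy_probs = [
    [0.90, 0.05, 0.05],
    [0.40, 0.35, 0.25],
    [0.10, 0.85, 0.05],
    [0.34, 0.33, 0.33],
    [0.05, 0.15, 0.80],
]
toy_labels = [0, 1, 1, 2, 2]  # known sites of origin (invented)

preds, confident = split_by_confidence(toy_probs, threshold=0.5)
overall = accuracy(preds, toy_labels)
high_conf = accuracy(preds, toy_labels, confident)
coverage = len(confident) / len(toy_labels)

print(f"overall accuracy:         {overall:.2f}")
print(f"high-confidence coverage: {coverage:.2f}")
print(f"high-confidence accuracy: {high_conf:.2f}")
```

In this toy data the uncertain predictions are also the wrong ones, so restricting to the confident subset raises accuracy while covering fewer tumors, which is the same trade-off the reported figures describe.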

After those encouraging results, the researchers used the model to analyze a set of about 900 tumors from patients with CUP, which were all from Dana-Farber. They found that for 40 percent of these tumors, the model was able to make high-confidence predictions.

The researchers then compared the model’s predictions with an analysis of the germline, or inherited, mutations in a subset of tumors with available data, which can reveal whether the patients have a genetic predisposition to develop a particular type of cancer. The researchers found that the model’s predictions were much more likely to match the type of cancer most strongly predicted by the germline mutations than any other type of cancer.

Guiding drug decisions

To further validate the model’s predictions, the researchers compared data on the CUP patients’ survival time with the typical prognosis for the type of cancer that the model predicted. They found that CUP patients who were predicted to have cancer with a poor prognosis, such as pancreatic cancer, showed correspondingly shorter survival times. Meanwhile, CUP patients who were predicted to have cancers that typically have better prognoses, such as neuroendocrine tumors, had longer survival times.

Another indication that the model’s predictions could be useful came from looking at the types of treatments that CUP patients analyzed in the study had received. About 10 percent of these patients had received a targeted treatment, based on their oncologists’ best guess about where their cancer had originated. Among those patients, those who received a treatment consistent with the type of cancer that the model predicted for them fared better than patients who received a treatment typically given for a different type of cancer than what the model predicted for them.

Using this model, the researchers also identified an additional 15 percent of patients (2.2-fold increase) who could have received an existing targeted treatment, if their cancer type had been known. Instead, those patients ended up receiving more general chemotherapy drugs.

“That potentially makes these findings more clinically actionable because we’re not requiring a new drug to be approved. What we’re saying is that this population can now be eligible for precision treatments that already exist,” Gusev says.

The researchers now hope to expand their model to include other types of data, such as pathology images and radiology images, to provide a more comprehensive prediction using multiple data modalities. This would also provide the model with a comprehensive perspective of tumors, enabling it to predict not just the type of tumor and patient outcome, but potentially even the optimal treatment.

The research was funded by the National Institutes of Health, the Louis B. Mayer Foundation, the Doris Duke Charitable Foundation, the Phi Beta Psi Sorority, and the Emerson Collective.

Gas distribution around the trinary protostar system IRAS 04239+2436: (left) ALMA observations of SO emissions, and (right) as reproduced by the numerical simulation on the supercomputer ATERUI. In the left panel, protostars A and B are shown in blue, which indicates the radio waves from the dust around the protostars. Within protostar A, two unresolved protostars are thought to exist. In the right panel, the locations of the three protostars are shown by the blue crosses. (Credit: ALMA (ESO/NAOJ/NRAO), J.-E. Lee et al.)

Japanese-built supercomputers help show how gas streamers feed triple baby stars

Recent observations and supercomputer simulations of three gas spiral arms feeding three protostars in a trinary system have helped to clarify the formation of multi-star systems. 

Most stars with a mass similar to the Sun form in multi-star systems together with other stars, so an understanding of multi-star system formation is important to an overall theory of star formation. However, the complexity of these systems and the lack of high-resolution, high-sensitivity data have left astronomers uncertain about the formation scenario. In particular, recent observations of protostars have often reported structures called "streamers," gas flows toward the protostars, but it has been unclear how these streamers form.

An international team led by Jeong-Eun Lee, a professor at Seoul National University, used the Atacama Large Millimeter/submillimeter Array (ALMA) to observe the trinary protostar system IRAS 04239+2436 located 460 light-years away in the constellation Taurus. The team found that emissions from sulfur monoxide (SO) molecules trace three spiral arms around the three protostars forming in the system.

A comparison with simulations led by Tomoaki Matsumoto, a professor at Hosei University, using the supercomputers “ATERUI” and “ATERUI II” in the Center for Computational Astrophysics at the National Astronomical Observatory of Japan (NAOJ), indicates that the three spiral arms are streamers feeding material to the three protostars. The combination of observations and simulations revealed, for the first time, how the streamers are created and contribute to the growth of the protostars at the center.

The discovery of three baby stars being fed by three spiral gas streamers is a remarkable feat of astronomy. It is a testament to the power of modern supercomputing technology and the dedication of the research team that such a complex and intricate system could be observed and reproduced in simulation. This discovery provides a unique insight into the formation of multi-star systems and could lead to further breakthroughs in our understanding of the universe. It is a reminder that the universe is full of wonders and that with enough dedication and hard work, we can uncover its secrets.

Circulation of the subpolar North Atlantic: The image shows a snapshot of the surface velocity in the high-resolution VIKING20X model, showing the meandering flow of the North Atlantic Current and the narrow boundary current that develops south of the Denmark Strait along the eastern continental shelf of Greenland. Shaded in grey is the area where convection exceeded 1800 m depth during the winters of 1990-1994.

Germany's GEOMAR supercomputing shows winter storms over Labrador Sea influence Gulf Stream

The Gulf Stream system plays a vital role in the climate, and its decline over the past two decades has raised concerns and sparked debates. While the cause of this weakening is uncertain, some simulations suggest that human-induced climate change could be a significant factor in the future. However, a recent study conducted by the GEOMAR Helmholtz Centre for Ocean Research in Kiel, Germany suggests that the observed weakening may be due to natural fluctuations caused by extremely cold winters in the Labrador Sea during the 1990s. 

The new supercomputer simulations show that fluctuations in the Labrador Sea can have a significant influence on the strength of sinking processes east of Greenland. An important link is a little-noticed system of deep currents that ensures the rapid spread of Labrador Sea water into the deep-sea basin between Greenland and Iceland.

Schematic of surface (red) and deep (blue) currents in the Atlantic Ocean. Circles indicate regions where currents are strongly influenced by oceanic eddies. The dashed area between Canada and Greenland outlines the area in the Labrador Sea where strong winter cooling causes vertical mixing of the water column. Graphic: Böning / Scheinert (GEOMAR)

"We oceanographers have long had our eyes on the Labrador Sea between Canada and Greenland," says Professor Dr Claus Böning, who led the study. "Winter storms with icy air cool the ocean temperatures to such an extent that the surface water becomes heavier than the water below. The result is deep winter mixing of the water column, whereby the volume and density of the resulting water mass can vary greatly from year to year."

In the model simulations of the past 60 years, the years 1990 to 1994 have stood out, when the Labrador Sea cooled particularly strongly. "The huge volume of very dense Labrador Sea water that formed following extremely harsh winters led to significantly increased sinking between Greenland and Iceland in the following years," explains Claus Böning. As a result, the model simulations calculated an increase in Atlantic overturning transport of more than 20%, peaking in the late 1990s. The measurements of the circulation in the North Atlantic, which have only been carried out continuously since 2004, would then fall precisely in the decay phase of the simulated transport maximum.


"According to our model results, the observed weakening of the Atlantic circulation during this period can therefore be interpreted, at least in part, as an aftereffect of the extreme Labrador Sea winters of the 1990s," summarises Professor Dr. Arne Biastoch, head of the Ocean Dynamics Research Unit at GEOMAR and co-author of the study. However, he clarifies: "Although we cannot yet say whether a longer-term weakening of the overturning is already occurring, all climate models predict a weakening as a result of human-induced climate change as 'very likely' for the future."

Ongoing observing programs and further development of supercomputer simulations are crucial for a better understanding of the key climate-relevant processes and, of course, for future projections of the Gulf Stream system under climate change.

The research conducted by GEOMAR has revealed a potential link between winter storms over the Labrador Sea and the Gulf Stream system. While the findings are certainly intriguing, further research is needed to confirm the validity of these results. Until then, it is important to remain skeptical.