Penn State develops deep learning model that helps doctors choose better lung cancer treatments

Doctors and healthcare workers may one day use a machine learning model, called deep learning, to guide their treatment decisions for lung cancer patients, according to a team of Penn State Great Valley researchers.

In a study, the researchers report that they developed a deep learning model that, in certain conditions, was more than 71 percent accurate in predicting survival expectancy of lung cancer patients. That is significantly better than the roughly 61 percent accuracy of the traditional machine learning models the team tested.
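
As a rough illustration of what an accuracy figure like this means, the sketch below computes accuracy as the share of predictions that match the true outcomes. The survival categories and the seven toy patients are invented for illustration; they are not data from the study, and the study's actual evaluation setup may differ.

```python
def accuracy(true_labels, predicted_labels):
    """Return the fraction of predictions that match the true labels."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("need at least one label")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Hypothetical survival categories for seven made-up patients (not study data).
actual = ["short", "medium", "long", "short", "long", "medium", "short"]
predicted = ["short", "medium", "short", "short", "long", "long", "short"]

# Five of the seven toy predictions match, so accuracy is 5/7.
toy_accuracy = accuracy(actual, predicted)
print(f"{toy_accuracy:.0%} of toy predictions were correct")
```

Accuracy alone does not say which kinds of patients a model gets wrong, which is one reason such a tool is described as an aid to doctors and not a replacement for them.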

Information on a patient's survival expectancy could help guide doctors and caregivers in making better decisions on using medicines, allocating resources, and determining the intensity of care for patients, according to Youakim Badr, associate professor of data analytics.

"This is a high-performance system that is highly accurate and is aimed at helping doctors make these important decisions about providing care to their patients," said Badr. "Of course, this tool can't be used as a substitute for a doctor in making decisions on lung cancer treatments."

According to Robin G. Qiu, professor of information science and engineering and an affiliate of the Institute for Computational and Data Sciences, the model can analyze a large amount of data, typically called features in machine learning, that describe the patients and the disease to understand how a combination of factors affects lung cancer survival periods. Features can include information such as types of cancer, size of tumors, the speed of tumor growth, and demographic data.

Deep learning may be uniquely suited to tackle lung cancer prognosis because the model can provide the robust analysis necessary in cancer research, according to the researchers, who report their findings in the International Journal of Medical Informatics. Deep learning is a type of machine learning that is based on artificial neural networks, which are generally modeled on how the human brain's own neural network functions.

In deep learning, however, developers apply a sophisticated structure of multiple layers of these artificial neurons, which is why the model is referred to as "deep." The learning aspect of deep learning comes from how the system learns from connections between data and labels, said Badr.
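
To make the idea of stacked layers concrete, here is a minimal sketch of a forward pass through a small network with two hidden layers. It is not the researchers' model: the layer sizes, the four input features, the three output classes, and the random, untrained weights are all assumptions chosen for illustration. A real model would learn its weights from labeled data.

```python
import math
import random


def dense_layer(inputs, weights, biases):
    """One layer of artificial neurons: weighted sum plus bias, then ReLU."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(max(0.0, total))  # ReLU activation
    return outputs


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def random_layer(rng, n_inputs, n_neurons):
    """Untrained random weights and zero biases (illustration only)."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_inputs)]
               for _ in range(n_neurons)]
    biases = [0.0] * n_neurons
    return weights, biases


def forward(features, layers):
    """Pass features through each hidden layer, then a linear output layer and softmax."""
    hidden, (out_w, out_b) = layers[:-1], layers[-1]
    activations = features
    for weights, biases in hidden:
        activations = dense_layer(activations, weights, biases)
    scores = [sum(w * a for w, a in zip(ws, activations)) + b
              for ws, b in zip(out_w, out_b)]
    return softmax(scores)


rng = random.Random(0)
# Hypothetical shape: 4 input features, two hidden layers of 8 neurons, 3 classes.
layer_sizes = [4, 8, 8, 3]
layers = [random_layer(rng, n_in, n_out)
          for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]

patient_features = [0.2, 0.7, 0.1, 0.5]  # made-up, already-scaled feature values
probabilities = forward(patient_features, layers)
print(probabilities)
```

Adding more entries to `layer_sizes` makes the network deeper. Because the weights here are random, the printed probabilities carry no medical meaning.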

"Deep learning is a machine-learning algorithm that makes associations between the data, itself, and the labels that we use to describe the data examples," said Badr. "By making these associations, it learns from the data."

Qiu added that deep learning's structure offers several advantages for many data science tasks, especially when confronted with data sets that have a large number of records -- in this case, patients -- as well as a large number of features.

"It improves performance tremendously," said Qiu. "In deep learning, we can go deeper, which is why they call it that. In traditional machine learning, you have a simple structure of layers of neural networks. In each layer, you have a group of cells. In deep learning, there are many layers of these cells that can be architected into a sophisticated structure to perform better feature transformation and extraction, which gives you the ability to further improve the accuracy of any model."

In the future, the researchers would like to improve the model and test its ability to analyze other types of cancers and medical conditions.

"The accuracy rate is good so far -- but it's not perfect, so part of our future work is to improve the model," said Qiu.

To further improve their deep learning model, the researchers would also need to work with domain experts, people with specialized knowledge of a field. In this case, the researchers would like to connect with experts on specific cancers and medical conditions.

"In a lot of cases, we might not know a lot of features that should go into the model," said Qiu. "But, by collaborating with domain experts, they could help us collect important features about patients that we might not be aware of and that would further improve the model."

The researchers analyzed data from the Surveillance, Epidemiology, and End Results (SEER) program. The SEER dataset is one of the biggest and most comprehensive databases on the early diagnosis information for cancer patients in the United States, according to Shreyesh Doppalapudi, a graduate student research assistant and first author of the paper. The program's cancer registries cover almost 35 percent of U.S. cancer patients.

"One of the really good things about this data is that it covers a large section of the population and it's really diverse," said Doppalapudi. "Another good thing is that it covers a lot of different features, which you can use for many different purposes. This becomes very valuable, especially when using machine learning approaches."

Doppalapudi added that the team compared several deep learning approaches, including artificial neural networks, convolutional neural networks, and recurrent neural networks, to traditional machine learning models. The deep learning approaches performed much better than the traditional machine learning methods, he said.

Deep learning architecture is better suited to processing large, diverse datasets such as SEER, according to Doppalapudi. Working on these types of datasets requires robust computational capacity. In this study, the researchers relied on the Roar supercomputer of the Institute for Computational and Data Sciences (ICDS).

With about 800,000 to 900,000 entries in the SEER dataset, the researchers said that finding these associations manually would be extremely difficult without assistance from machine learning, even for an entire team of medical researchers.

"If it were just three fields, I would say it would be impossible, but, we had about 150 fields," said Doppalapudi. "Understanding all of those different fields and then reading and learning from that information, would just be near impossible."

Toshiba launches 18TB HDDs for pennies per GB

3rd-generation 9-disk Helium-sealed design and innovations in energy-assisted recording help customers achieve new levels of storage density and power efficiency

Toshiba has launched the 18TB MG09 Series HDD, Toshiba’s first HDD models with energy-assisted magnetic recording. The MG09 Series features Toshiba’s third-generation, 9-disk Helium-sealed design and Toshiba’s innovative Flux Control – Microwave-Assisted Magnetic Recording (FC-MAMR) technology, to advance Conventional Magnetic Recording (CMR) density to 2TB per disk, achieving a total capacity of 18TB. Sample shipments of 18TB MG09 Series HDD to customers are expected to start sequentially at the end of March 2021.

With 12.5% more capacity than prior 16TB models, 18TB MG09 CMR drives are compatible with the widest range of applications and operating systems. The MG09 is adapted to mixed random and sequential read and write workloads in both cloud-scale and traditional data center use cases. The MG09 features 7,200rpm performance, a 550TB per year workload rating, and a choice of SATA and SAS interfaces, all in a power-efficient, Helium-sealed, industry-standard 3.5-inch form factor.
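
The capacity figures in the announcement can be checked with simple arithmetic. The sketch below uses only numbers stated in the text (9 disks, 2TB per disk, 16TB prior models, 550TB per year workload rating). The last value, the rated yearly workload expressed as a multiple of drive capacity, is a derived figure that does not appear in the announcement.

```python
DISKS_PER_DRIVE = 9          # from the announcement
TB_PER_DISK = 2              # from the announcement
PRIOR_MODEL_TB = 16          # from the announcement
WORKLOAD_TB_PER_YEAR = 550   # from the announcement

# Total capacity is the per-disk capacity times the number of disks.
total_tb = DISKS_PER_DRIVE * TB_PER_DISK

# Relative gain over the prior 16TB models.
capacity_gain = (total_tb - PRIOR_MODEL_TB) / PRIOR_MODEL_TB

# Derived, not from the announcement: yearly workload rating as a
# multiple of the drive's own capacity.
full_drive_equivalents = WORKLOAD_TB_PER_YEAR / total_tb

print(f"Total capacity: {total_tb} TB")
print(f"Gain over 16TB model: {capacity_gain:.1%}")
print(f"Rated yearly workload: about {full_drive_equivalents:.1f}x drive capacity")
```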

The MG09 Series further illustrates Toshiba’s commitment to advancing HDD design to meet the evolving needs for storage devices in cloud-scale servers and Object and File storage infrastructure. With its improved power efficiency and 18TB capacity, the MG09 Series helps cloud-scale infrastructure advance storage density to reduce CAPEX and improve TCO (total cost of ownership). As data growth continues at an explosive pace, advanced 18TB MG09 with FC-MAMR technology will help cloud-scale service providers and storage solution designers achieve higher storage densities for cloud, hybrid-cloud, and on-premises rack-scale storage.

“Toshiba’s new 18TB MG09 Series delivers new levels of storage density and power efficiency to our cost-conscious cloud-scale and storage solutions customers. Our HDD technology can achieve our customers’ critical TCO objectives at a cost of pennies per GB,” said Shuji Takaoka, General Manager of the Storage Products Sales & Marketing Division at Toshiba Electronic Devices & Storage Corporation. “Our 3rd generation 9-disk Helium-sealed design provides a field-tested foundation for achieving a massive 18TB capacity. The addition of Toshiba’s innovative FC-MAMR technology advances CMR capacity to 18TB, delivering compatibility with the widest range of applications and operating environments.”

For more information on the new products, please visit: https://toshiba.semicon-storage.com/ap-en/storage/product/data-center-enterprise/cloud-scale-capacity/articles/mg09-series.html

Research from CANELa models reaction to improve fuel, lubricant additive production

Polyisobutenyl succinic anhydrides (PIBSAs) are important for the auto industry because of their wide use in lubricant and fuel formulations. Their synthesis, however, requires high temperatures, which drives up cost.

Adding a Lewis acid--a substance that can accept a pair of electrons--as a catalyst makes the PIBSA formation more efficient. But which Lewis acid? Despite the importance of PIBSAs in the industrial space, an easy way to screen these catalysts and predict their performance hasn't yet been developed.

New research led by the Computer-Aided Nano and Energy Lab (CANELa) at the University of Pittsburgh Swanson School of Engineering, in collaboration with the Lubrizol Corporation, addresses this problem by revealing the detailed mechanism of the Lewis acid-catalyzed reaction using computational modeling. The work, recently featured on the cover of the journal Industrial & Engineering Chemistry Research, builds a deeper understanding of the catalytic activity and creates a foundation for computationally screening catalysts in the future.

"PIBSAs are commonly synthesized through the reaction between maleic anhydride and polyisobutene. Adding Lewis acids makes the reaction faster and reduces the energy input required for PIBSA formation," explained Giannis Mpourmpakis, the Bicentennial Alumni Faculty Fellow and associate professor of chemical and petroleum engineering at Pitt. "But the reaction mechanism has not been well understood, and there are not many examples of this reaction in the literature. Our work helps to explain the way the reaction happens and identifies Lewis acids that will work best."

This new foundational information will aid in the discovery of Lewis acid catalysts for industrial chemical production at a faster rate and reduced cost.

"The alliance between the University of Pittsburgh and Lubrizol has been instrumental in demonstrating how Academia and the Chemical Process Industry can work together to produce commercially relevant results," said Glenn Cormack, Global Process Innovation Manager at The Lubrizol Corporation. "Combining the knowledge and expertise of the Swanson School of Engineering and The Lubrizol Corporation allows both parties access to some of the best available computational and experimental techniques when exploring new challenges."

The research is one of many collaborations between Pitt and the Lubrizol Corporation, an Ohio-based specialty chemical provider for transportation, industrial and consumer markets. The alliance with Lubrizol, now in its seventh year, provides students with hands-on opportunities to experience how the knowledge and skills they're developing are used in the chemical industry. At the same time, students gain world-ready knowledge of how Pitt's research helps improve Lubrizol's processes and products.

"Over the last few years, our partnership with Lubrizol has led to new, innovative ways for Lubrizol to make products and rethink their manufacturing processes," said Steven Little, William Kepler Whiteford Endowed Professor and chair of the Department of Chemical and Petroleum Engineering. "We learn a tremendous amount from them as well, and all of these publications are evidence of an alliance that continues to grow."

The paper, "Computational Screening of Lewis Acid Catalysts for the Ene Reaction between Maleic Anhydride and Polyisobutylene," (DOI: 10.1021/acs.iecr.0c04860) was published in the ACS journal I&EC Research. It was authored by Cristian Morales-Rivera and Giannis Mpourmpakis at Pitt and Nico Proust and James Burrington at the Lubrizol Corporation.