Duke computer scientist wins $1 million artificial intelligence prize

Duke professor becomes the second recipient of AAAI Squirrel AI Award for pioneering socially responsible AI

Whether preventing explosions on electrical grids, spotting patterns among past crimes, or optimizing resources in the care of critically ill patients, Duke University computer scientist Cynthia Rudin wants artificial intelligence (AI) to show its work, especially when it's making decisions that deeply affect people's lives.

Cynthia Rudin, professor of electrical and computer engineering and computer science at Duke University. Credit: Les Todd

While many scholars in the developing field of machine learning were focused on improving algorithms, Rudin instead wanted to use AI’s power to help society. She chose to pursue opportunities to apply machine learning techniques to important societal problems, and in the process, realized that AI’s potential is best unlocked when humans can peer inside and understand what it is doing. 

Now, after 15 years of advocating for and developing “interpretable” machine learning algorithms that allow humans to see inside AI, Rudin’s contributions to the field have earned her the $1 million Squirrel AI Award for Artificial Intelligence for the Benefit of Humanity from the Association for the Advancement of Artificial Intelligence (AAAI). Founded in 1979, AAAI serves as the prominent international scientific society serving AI researchers, practitioners, and educators.

Rudin, a professor of computer science and engineering at Duke, is the second recipient of the new annual award, funded by the online education company Squirrel AI to recognize achievements in artificial intelligence in a manner comparable to top prizes in more traditional fields.

She is being cited for “pioneering scientific work in the area of interpretable and transparent AI systems in real-world deployments, the advocacy for these features in highly sensitive areas such as social justice and medical diagnosis, and serving as a role model for researchers and practitioners.”

“Only world-renowned recognitions, such as the Nobel Prize and the A.M. Turing Award from the Association for Computing Machinery, carry monetary rewards at the million-dollar level,” said AAAI awards committee chair and past president Yolanda Gil. “Professor Rudin's work highlights the importance of transparency for AI systems in high-risk domains. Her courage in tackling controversial issues calls out the importance of research to address critical challenges in the responsible and ethical use of AI."

Rudin’s first applied project was a collaboration with Con Edison, the energy company responsible for powering New York City. Her assignment was to use machine learning to predict which manholes were at risk of exploding due to degrading and overloaded electrical circuitry. But she soon discovered that no matter how many newly published academic bells and whistles she added to her code, performance barely improved on data that included handwritten notes from dispatchers and accounting records dating back to the time of Thomas Edison.

“We were getting more accuracy from simple classical statistics techniques and a better understanding of the data as we continued to work with it,” Rudin said. “If we could understand what information the predictive models were using, we could ask the Con Edison engineers for useful feedback that improved our whole process. It was the interpretability in the process that helped improve accuracy in our predictions, not any bigger or fancier machine learning model. That’s what I decided to work on, and it is the foundation upon which my lab is built.”

Over the next decade, Rudin developed techniques for interpretable machine learning: predictive models that explain themselves in ways that humans can understand. While the code for designing these formulas is complex and sophisticated, the formulas themselves can be small enough to be written in a few lines on an index card.
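To give a flavor of what such an index-card model looks like, here is a minimal sketch of a point-based scoring system. The features, point values, and risk table below are invented for illustration; they are not from any of Rudin's published models:

```python
# Hypothetical point-based risk score: each present feature contributes a
# small integer number of points, and the point total maps to a risk
# estimate. Feature names and values below are made up for illustration.

SCORECARD = {
    "prior_event": 2,       # e.g., a prior seizure was observed
    "abnormal_reading": 1,  # e.g., a specific EEG pattern is present
    "age_over_65": 1,
}

# Hypothetical lookup from total points to predicted risk.
RISK_BY_POINTS = {0: 0.05, 1: 0.12, 2: 0.27, 3: 0.50, 4: 0.70}

def score(patient: dict) -> tuple[int, float]:
    """Sum the points for the features the patient has, then look up risk."""
    points = sum(pts for feat, pts in SCORECARD.items() if patient.get(feat))
    return points, RISK_BY_POINTS[points]

pts, risk = score({"prior_event": True, "age_over_65": True})
print(pts, risk)  # 3 points -> 0.50 risk in this toy table
```

The appeal of this form is that a clinician can check every step by hand: each factor, its weight, and the final lookup are all visible, which is exactly the transparency a black-box model cannot offer.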

Rudin has applied her brand of interpretable machine learning to numerous impactful projects. With collaborators Brandon Westover and Aaron Struck at Massachusetts General Hospital and her former student Berk Ustun, she designed a simple point-based system that can predict which patients are most at risk of having destructive seizures after a stroke or other brain injury. And with her former MIT student Tong Wang and the Cambridge Police Department, she developed a model that helps discover commonalities between crimes to determine whether they might be part of a series committed by the same criminals. That open-source program eventually became the basis of the New York Police Department’s Patternizr algorithm, a powerful piece of code that determines whether a new crime committed in the city is related to past crimes.

“Cynthia’s commitment to solving important real-world problems, desire to work closely with domain experts, and ability to distill and explain complex models is unparalleled,” said Daniel Wagner, deputy superintendent of the Cambridge Police Department. “Her research resulted in significant contributions to the field of crime analysis and policing. More impressively, she is a strong critic of potentially unjust ‘black box’ models in criminal justice and other high-stakes fields, and an intense advocate for transparent interpretable models where accurate, just, and bias-free results are essential.”

Black-box models are the opposite of Rudin’s transparent code. The methods applied in these AI algorithms make it impossible for humans to understand what factors the models depend on, which data they focus on, and how they use it. While this may not be a problem for trivial tasks such as distinguishing a dog from a cat, it could be a huge problem for high-stakes decisions that change people’s lives.

“Cynthia is changing the landscape of how AI is used in societal applications by redirecting efforts away from black-box models and toward interpretable models by showing that the conventional wisdom—that black boxes are typically more accurate—is very often false,” said Jun Yang, chair of the computer science department at Duke. “This makes it harder to justify subjecting individuals (such as defendants) to black-box models in high-stakes situations. The interpretability of Cynthia's models has been crucial in getting them adopted in practice, since they enable human decision-makers, rather than replace them.”

One impactful example involves COMPAS, an AI algorithm used across multiple states to inform bail and parole decisions, which a ProPublica investigation accused of partially using race as a factor in its calculations. The accusation is difficult to prove, however, as the details of the algorithm are proprietary, and some important aspects of ProPublica's analysis are questionable. Rudin's team has demonstrated that a simple interpretable model that reveals exactly which factors it’s taking into consideration is just as good at predicting whether or not a person will commit another crime. This raises the question, Rudin says, of why black-box models need to be used at all for these types of high-stakes decisions.

"We've been systematically showing that for high-stakes applications, there's no loss in accuracy to gain interpretability, as long as we optimize our models carefully,” Rudin said. “We've seen this for criminal justice decisions, numerous healthcare decisions including medical imaging, power grid maintenance decisions, financial loan decisions, and more. Knowing that this is possible changes the way we think about AI as incapable of explaining itself."

Throughout her career, Rudin has not only been creating these interpretable AI models but developing and publishing techniques to help others do the same. That hasn’t always been easy. When she first began publishing her work, the terms “data science” and "interpretable machine learning" did not exist, and there were no categories into which her research fit neatly, which meant that editors and reviewers didn't know what to do with it. Rudin found that if a paper wasn’t proving theorems and claiming its algorithms to be more accurate, it was—and often still is—more difficult to publish.

As Rudin continues to help people and publish her interpretable designs—and as more concerns continue to crop up with black-box code—her influence is finally beginning to turn the ship. There are now entire categories in machine learning journals and conferences devoted to interpretable and applied work. Colleagues across the field and their collaborators are vocal about how important interpretability is for designing trustworthy AI systems.

“I have had enormous admiration for Cynthia from very early on, for her spirit of independence, her determination, and her relentless pursuit of true understanding of anything new she encountered in classes and papers,” said Ingrid Daubechies, the James B. Duke Distinguished Professor of Mathematics and Electrical and Computer Engineering, one of the world’s preeminent researchers in signal processing, and one of Rudin’s Ph.D. advisors at Princeton University. “Even as a graduate student, she was a community builder, standing up for others in her cohort. She got me into machine learning, as it was not an area in which I had any expertise at all before she gently but very persistently nudged me into it. I am so very glad for this wonderful and very-deserved recognition for her!”

“I could not be more thrilled to see Cynthia’s work honored in this way,” added Rudin’s second Ph.D. advisor, Microsoft Research partner Robert Schapire, whose work on “boosting” helped lay the foundations for modern machine learning. “For her inspiring and insightful research, her independent thinking that has led her in directions very different from the mainstream, and for her longstanding attention to issues and problems of practical, societal importance.”

Rudin earned undergraduate degrees in mathematical physics and music theory from the University at Buffalo before completing her Ph.D. in applied and computational mathematics at Princeton. She then worked as a National Science Foundation postdoctoral research fellow at New York University, and as an associate research scientist at Columbia University. She became an associate professor of statistics at the Massachusetts Institute of Technology before joining Duke’s faculty in 2017, where she holds appointments in computer science, electrical and computer engineering, biostatistics and bioinformatics, and statistical science.

She is a three-time recipient of the INFORMS Innovative Applications in Analytics Award, which recognizes creative and unique applications of analytical techniques, and is a Fellow of the American Statistical Association and the Institute of Mathematical Statistics.

“I want to thank AAAI and Squirrel AI for creating this award that I know will be a game-changer for the field,” Rudin said. “To have a ‘Nobel Prize’ for AI that helps society makes it finally clear, without a doubt, that this topic—AI for the benefit of society—is important.”

Explore the most detailed 3D map of the universe with virtual reality

You’re floating in space, just above the Earth. The International Space Station is an arm’s length away. You twist your head around only to see the moon, a tiny circle, far off in the distance. You can’t help but think that this is probably what an astronaut would see during a spacewalk.

This is the beginning of a journey into outer space, in a virtual environment developed by EPFL scientists.

Now, for the very first time, you can enter the most comprehensive virtual universe based on the latest astrophysical and cosmological data, thanks to powerful, open-source software developed at EPFL’s Laboratory of Astrophysics (LASTRO). The software is called VIRUP, for Virtual Reality Universe Project, and the first beta version is being released today.

“You can navigate through the most detailed map of the universe from the comfort of your own home,” explains Jean-Paul Kneib, director of LASTRO. “It’s the chance to travel through space, through time, and discover the universe.”

From left to right: Florian Cabot, Sarah Kenderdine, Yves Revaz, Jean-Paul Kneib. Credit: Alain Herzog / EPFL

The VIRUP challenge: visualizing terabytes of data at once

Astronomers and astrophysicists are collecting data about billions of celestial objects in the night sky with the help of telescopes here on Earth and in space. There are already decades of observational data. Even greater amounts of data are expected soon.

To get visual representations of the vast amounts of data, like a movie, it’s standard practice to pre-render specific sequences. But what about creating a visual representation of the data in real time, as if you were there, an observer at an arbitrary point in space and time? This is what astrophysicist Yves Revaz of LASTRO set out to do with VIRUP, with the help of LASTRO software engineer Florian Cabot, and it meant rendering terabytes of data at 90 frames per second—a constraint imposed by the virtual reality environment for a fully immersive and smooth experience.
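A standard trick for keeping frame rates high while browsing datasets far larger than GPU memory is level-of-detail selection: regions far from the camera are drawn from coarse, pre-aggregated samples, and full-resolution points are streamed only nearby. The sketch below is a generic illustration of that idea, not VIRUP's actual engine code; the chunk structure and distance thresholds are assumptions:

```python
import math
from dataclasses import dataclass

@dataclass
class Chunk:
    """A spatial cell of the dataset with several pre-computed detail levels."""
    center: tuple[float, float, float]
    # points_per_level[0] is the coarsest sample; higher indices are denser.
    points_per_level: list[int]

def pick_level(chunk: Chunk, camera: tuple[float, float, float],
               base_distance: float = 10.0) -> int:
    """Choose a detail level: each doubling of distance drops one level."""
    d = math.dist(chunk.center, camera)
    max_level = len(chunk.points_per_level) - 1
    level = max_level - int(math.log2(max(d / base_distance, 1.0)))
    return max(level, 0)

# A nearby chunk gets the full-resolution sample; a distant one gets the
# coarse sample, keeping the number of points drawn per frame bounded.
near = Chunk((0, 0, 5), [1_000, 100_000, 10_000_000])
far = Chunk((0, 0, 500), [1_000, 100_000, 10_000_000])
cam = (0.0, 0.0, 0.0)
print(pick_level(near, cam), pick_level(far, cam))  # full detail vs. coarse
```

Schemes like this (often organized as an octree) let the renderer budget a roughly constant number of points per frame regardless of how large the underlying dataset is.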

“Visualization of astrophysical data is much more accessible than showing graphs and figures, it helps to develop an intuition of complex phenomena,” explains Revaz. “VIRUP is precisely a way of making all of our astrophysical data accessible to everyone, and this will become even more important as we build bigger telescopes like the Square Kilometer Array that will generate tremendous amounts of data.”

Astrophysical, cosmological data and supercomputer simulations

For the moment, VIRUP can already visualize data from more than eight databases bundled together. The Sloan Digital Sky Survey consists of over 50 million galaxies and 300 million objects in general. The Gaia data of the Milky Way Galaxy consists of 1.5 billion light sources. The Planck mission involves a satellite that measures the universe’s first light after the Big Bang, called the cosmic microwave background radiation. There’s also the Open Exoplanet Catalogue, which aggregates various sources of exoplanet data. Other databases include a repertoire of over 3,000 satellites orbiting the Earth, as well as various skins and textures to render the objects.

VIRUP also renders data sets of contemporary and scientifically robust simulations based on research.  You can watch the Milky Way Galaxy and its future collision with the Andromeda galaxy, our galactic neighbor also known as M31. You can also see huge portions of the cosmic web – the filamentary large-scale structures that extend across the universe – coming into existence over billions of years, based on simulations from a data set called IllustrisTNG which consists of 30 billion simulated particles. A major challenge is ensuring a smooth transition from one database to the next.

“We considered using existing graphics engines for visualizing the data, but in the end, I developed one specifically for the project. It’s flexible, we can add more data as it becomes available, and it’s tailored to astronomy,” explains Cabot. “For this first release of VIRUP, I have focused on rendering static data, so interacting with the data is still a bit rough and the rendering of simulations can’t yet happen in real-time for example.”

Of course, it’s only possible to navigate through the data and simulations imported into VIRUP. You can visit the 4,500 exoplanets discovered so far, for instance, but the way they look is based on artist impressions inferred from observations. You can also navigate through the 50 million galaxies measured so far by the Sloan Digital Sky Survey, but the actual data has limited resolution, and this limits how much detail can be shown in its virtual representation. That being said, there is still a tremendous amount of data that can be explored with the help of VIRUP. Some of the next steps could be to include databases of objects in our solar system, like all of the asteroids, or various other objects in the galaxy, like nebulae and pulsars.

A flexible immersive virtual environment

For the fully immersive, 3D, 360° experience, you’d need a pair of VR glasses, a computer to run the VIRUP engine, and storage space for a selection of astrophysical and cosmological data.

VIRUP is also capable of building a virtual universe in other VR environments, such as domes (especially useful for venues like planetariums), panoramas, caves, and half-caves. The open software’s transition from the rather personal and isolated experience of VR goggles to the collective, theatrical experience offered by domes and caves became possible thanks to a collaboration between LASTRO scientists and researchers at EPFL’s Laboratory for Experimental Museology (eM+), funded by EPFL seed funding for fostering interdisciplinary projects.

“It’s about data discovery. The immersive system means that you can embody the data, which has a profound effect on how you actually perceive the data,” says artist Sarah Kenderdine, who leads eM+.

A journey through the universe – a short movie

With the release of VIRUP comes a short movie entitled “Archaeology of Light”, one possible journey through the virtual universe made possible thanks to the open software.

The 20-minute movie starts from Earth, and charts out a voyage throughout the various scales of the universe, from our solar system to the Milky Way, all the way to the cosmic web and the relic light of the Big Bang.

If you’re impatient to see the movie, you can watch it in 2D, in 360°, and in stereo 180° on YouTube. If you have access to a VR environment, you can immerse yourself in “Archaeology of Light”.

For the dome experience, the movie will be showcased at EPFL’s next exhibit, Cosmos Archaeology: Explorations in Space and Time, which opens on 21 April 2022 at EPFL Pavilions. A preliminary version of the movie was shown at the Synra Dome of the Science Museum of Tokyo in September, thanks to support from the Swiss Embassy in Tokyo. VIRUP will be presented this month at an exhibit in Dubai as part of EPFL’s Virtual Space Tour.

Europe launches research project that aims to improve treatment for patients with cancer through artificial intelligence

In Arnhem, the Netherlands, the Innovative Medicines Initiative (IMI), a joint undertaking of the European Union and the European Federation of Pharmaceutical Industries and Associations (EFPIA), has announced the launch of OPTIMA (Optimal Treatment for Patients with Solid Tumours in Europe Through Artificial intelligence), a €21.3 million public-private research program that will seek to use artificial intelligence (AI) to improve care for patients with prostate, breast, and lung cancer. OPTIMA’s goal is to design, develop, and deliver the first interoperable, GDPR-compliant real-world oncology data and evidence generation platform in Europe, to potentially advance treatment for patients with solid tumors in these three cancer types.

To achieve this ambitious goal, OPTIMA has brought together 36 partners from across 13 countries to:

  • Establish a secure, large-scale evidence data platform for prostate, breast, and lung cancer that includes real-world data from more than 200 million people. With a focus on patient privacy, the platform will be GDPR-compliant. The interoperable platform will host datasets, data analysis tools, federated learning tools, AI algorithms, and electronic decision support tools.
     
  • Drive new knowledge generation by developing advanced analytics and AI models to identify, prioritize and fill the main knowledge gaps in prostate, breast, and lung cancer – and propose improved clinical guideline recommendations.
     
  • Develop AI-based decision support tools that can be employed in electronic health records (EHRs). These tools will help clinicians make care decisions based on the leading clinical practice guidelines for prostate, breast, and lung cancer.
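The "federated learning tools" mentioned in the first goal above refer to a family of techniques in which models are trained where the data lives, so that only model parameters—never patient records—leave each hospital. Below is a minimal sketch of federated averaging, one common such technique; it is a generic illustration under assumed inputs, not the OPTIMA platform's code:

```python
# Federated averaging, minimally: each site computes a model update on its
# own data, and only the numeric parameters are shared centrally, weighted
# by the number of local samples. All values here are illustrative.

def federated_average(site_updates: list[tuple[list[float], int]]) -> list[float]:
    """Combine per-site parameter vectors, weighted by local sample counts."""
    total = sum(n for _, n in site_updates)
    dims = len(site_updates[0][0])
    return [
        sum(params[i] * n for params, n in site_updates) / total
        for i in range(dims)
    ]

# Three hypothetical hospitals with different amounts of local data.
updates = [
    ([0.10, 0.50], 1000),  # site A's locally trained parameters
    ([0.20, 0.40], 3000),  # site B
    ([0.30, 0.60], 1000),  # site C
]
print(federated_average(updates))  # pulled toward site B's larger cohort
```

The privacy benefit is structural: the central aggregator sees only the averaged parameter vectors, which is what makes approaches like this attractive for a GDPR-compliant platform spanning many institutions.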

These new tools and models could allow for the processing of high-dimensional data across sources and the use of deep learning to identify factors that enable individualized, real-time care decisions – ultimately informing personalized treatment for patients with solid tumors.

The OPTIMA consortium consists of 36 multidisciplinary private and public stakeholders in the clinical, academic, patient, regulatory, data sciences, legal and ethical, and pharmaceutical fields – and is being jointly led by Prof. Dr. James N’Dow from the European Association of Urology and Academic Urology Unit at the University of Aberdeen and Dr. Hagen Krüger, Medical Director Oncology, Pfizer Germany.

Prof. N’Dow said, “OPTIMA’s main objective is to harness the potential of AI to enable healthcare professionals to provide optimal personalized care for each individual patient living with prostate, breast, and lung cancer and their families. This is an ambitious goal and one that the entire OPTIMA consortium is dedicated to delivering, building on the diverse knowledge base and expertise of our consortium members. By working together, we hope to deliver meaningful improvements in cancer care.”

Dr. Krüger said, “While healthcare has begun to take advantage of AI to improve treatment for patients with cancer, there is still immense untapped potential to integrate these next-generation tools into care models and decision-making. We hope that OPTIMA will be a key driver in the development of personalized treatments that recognize each patient’s individual needs.”

Dr. Pierre Meulien, Executive Director IMI, “The OPTIMA project brings together experts from a wide range of disciplines and organizations. It is therefore well placed to potentially deliver results that could fast-track the use of artificial intelligence in the care of people with cancer.”

With its diverse multidisciplinary membership, the OPTIMA consortium is uniquely positioned to develop healthcare evidence-generation practices for the incorporation of real-world evidence into clinical practice guidelines (CPGs). If successful, OPTIMA may also help to establish best practice procedures for CPG development that incorporate analytics and evidence-informed by AI models.

OPTIMA is at the forefront of healthcare innovation in Europe, building on other IMI projects (such as EHDEN, PIONEER, and Harmony) that are supporting the European Health Data Space (EHDS) – a European Commission initiative to promote better exchange and access to different types of health data to support healthcare delivery and health research and policy. If it is successful, OPTIMA could not only contribute knowledge and data to the EHDS but may also inform European policy regarding the clinical deployment of AI algorithms in healthcare.