MIT neuroscientists build AI that sheds light on how the brain processes language

Neuroscientists find the internal workings of next-word prediction models resemble those of language-processing centers in the brain.

In the past few years, artificial intelligence models of language have become very good at certain tasks. Most notably, they excel at predicting the next word in a string of text; this technology helps search engines and texting apps predict the next word you are going to type.

The most recent generation of predictive language models also appears to learn something about the underlying meaning of language. These models can not only predict the word that comes next, but also perform tasks that seem to require some degree of genuine understanding, such as question answering, document summarization, and story completion. 

Such models were designed to optimize performance for the specific function of predicting text, without attempting to mimic anything about how the human brain performs this task or understands language. But a new study from MIT neuroscientists suggests the underlying function of these models resembles the function of language-processing centers in the human brain.

Computer models that perform well on other types of language tasks do not show this similarity to the human brain, offering evidence that the human brain may use next-word prediction to drive language processing.

“The better the model is at predicting the next word, the more closely it fits the human brain,” says Nancy Kanwisher, the Walter A. Rosenblith Professor of Cognitive Neuroscience, a member of MIT’s McGovern Institute for Brain Research and Center for Brains, Minds, and Machines (CBMM), and an author of the new study. “It’s amazing that the models fit so well, and it very indirectly suggests that maybe what the human language system is doing is predicting what’s going to happen next.”

Joshua Tenenbaum, a professor of computational cognitive science at MIT and a member of CBMM and MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL); and Evelina Fedorenko, the Frederick A. and Carole J. Middleton Career Development Associate Professor of Neuroscience and a member of the McGovern Institute, are the senior authors of the study. Martin Schrimpf, an MIT graduate student who works in CBMM, is the first author of the paper.

Making predictions

The new, high-performing next-word prediction models belong to a class of models called deep neural networks. These networks contain computational “nodes” that form connections of varying strength, and layers that pass information between each other in prescribed ways.

Over the past decade, scientists have used deep neural networks to create models of vision that can recognize objects as well as the primate brain does. Research at MIT has also shown that the underlying function of visual object recognition models matches the organization of the primate visual cortex, even though those computer models were not specifically designed to mimic the brain.

In the new study, the MIT team used a similar approach to compare language-processing centers in the human brain with language-processing models. The researchers analyzed 43 different language models, including several that are optimized for next-word prediction. These include a model called GPT-3 (Generative Pre-trained Transformer 3), which, given a prompt, can generate text similar to what a human would produce. Other models were designed to perform different language tasks, such as filling in a blank in a sentence.

As each model was presented with a string of words, the researchers measured the activity of the nodes that make up the network. They then compared these patterns to activity in the human brain, measured in subjects performing three language tasks: listening to stories, reading sentences one at a time, and reading sentences in which one word is revealed at a time. These human datasets included functional magnetic resonance imaging (fMRI) data and intracranial electrocorticographic measurements taken in people undergoing brain surgery for epilepsy.
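As a rough illustration of what comparing node activity with brain activity can mean, the sketch below correlates one node's per-word activity with a per-word brain recording and ranks models by that score. It is a toy under stated assumptions, not the study's procedure: the article does not say which statistic was used, and the function names, model names, and numbers here are all hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def rank_models(node_activity_by_model, brain_activity):
    """Return (model_name, score) pairs, best match to the brain signal first.

    node_activity_by_model maps a model name to one node's activity per word;
    brain_activity holds one measured value per word, in the same word order.
    """
    scores = {
        name: pearson(activity, brain_activity)
        for name, activity in node_activity_by_model.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical numbers for a six-word sentence (not data from the study).
brain = [1.0, 1.9, 1.6, 3.1, 2.4, 3.3]
models = {
    "model_a": [0.1, 0.4, 0.35, 0.8, 0.6, 0.9],  # tracks the brain signal closely
    "model_b": [0.9, 0.2, 0.7, 0.3, 0.8, 0.1],   # does not
}
for name, score in rank_models(models, brain):
    print(name, round(score, 3))
```

A real comparison would involve many nodes, many recording sites, and held-out data, so a single correlation like this only conveys the basic idea of scoring a model by how well its internal activity lines up with a measured brain signal.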

They found that the best-performing next-word prediction models had activity patterns that very closely resembled those seen in the human brain. Activity in those same models was also highly correlated with human behavioral measures, such as how fast people were able to read the text.

“We found that the models that predict the neural responses well also tend to best predict human behavior responses, in the form of reading times. And then both of these are explained by the model performance on next-word prediction. This triangle really connects everything together,” Schrimpf says.

Game changer

One of the key computational features of predictive models such as GPT-3 is an element known as a forward one-way predictive transformer. This kind of transformer can make predictions of what is going to come next, based on previous sequences. A significant feature of this transformer is that it can make predictions based on a very long prior context (hundreds of words), not just the last few words.
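The "one-way" property can be sketched with a causal mask, the standard device in GPT-style transformers for ensuring that each word position draws only on itself and earlier positions. The article does not describe the mechanism at this level of detail, so the code below is an assumed, minimal illustration: the function names are hypothetical, and the raw scores stand in for whatever relevance values a trained model would compute.

```python
from math import exp


def causal_mask(n):
    """mask[i][j] is True when position i may look at position j (j <= i)."""
    return [[j <= i for j in range(n)] for i in range(n)]


def masked_attention_weights(scores):
    """Row-wise softmax over scores, restricted to current and earlier positions.

    scores[i][j] is a raw relevance score of word j for word i (square matrix).
    Later positions (j > i) always receive weight 0.0.
    """
    n = len(scores)
    if any(len(row) != n for row in scores):
        raise ValueError("scores must be a square matrix")
    mask = causal_mask(n)
    weights = []
    for i in range(n):
        visible = [scores[i][j] for j in range(n) if mask[i][j]]
        top = max(visible)  # subtract the max for numerical stability
        exps = [exp(s - top) for s in visible]
        total = sum(exps)
        # Visible positions are exactly the first i + 1, so pad the rest with zeros.
        weights.append([e / total for e in exps] + [0.0] * (n - len(visible)))
    return weights


# With equal scores, each word spreads its weight evenly over itself and its past.
for row in masked_attention_weights([[0.0] * 4 for _ in range(4)]):
    print([round(w, 2) for w in row])
```

Because the mask only ever hides later positions, the amount of prior context a position can draw on grows with its index, which is how such a model can condition on a long preceding context rather than just the last few words.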

Scientists have not found any brain circuits or learning mechanisms that correspond to this type of processing, Tenenbaum says. However, the new findings are consistent with hypotheses that have been previously proposed that prediction is one of the key functions in language processing, he says.

“One of the challenges of language processing is the real-time aspect of it,” he says. “Language comes in, and you have to keep up with it and be able to make sense of it in real-time.”

The researchers now plan to build variants of these language-processing models to see how small changes in their architecture affect their performance and their ability to fit human neural data.

“For me, this result has been a game-changer,” Fedorenko says. “It’s totally transforming my research program because I would not have predicted that in my lifetime we would get to these computationally explicit models that capture enough about the brain so that we can actually leverage them in understanding how the brain works.”

The researchers also plan to try to combine these high-performing language models with some computer models Tenenbaum’s lab has previously developed that can perform other kinds of tasks such as constructing perceptual representations of the physical world.

“If we’re able to understand what these language models do and how they can connect to models which do things that are more like perceiving and thinking, then that can give us more integrative models of how things work in the brain,” Tenenbaum says. “This could take us toward better artificial intelligence models, as well as giving us better models of how more of the brain works and how general intelligence emerges than we’ve had in the past.”

IRB Barcelona identifies the genes that cause resistance to treatment of the pathogenic fungus Candida

It is estimated that 80% of women will suffer from vaginal candidiasis at least once in their lives. In addition to superficial infections, which can be oral or vaginal and do not usually have a serious prognosis, fungi of the Candida genus can cause systemic diseases in immunocompromised individuals, and these are fatal in 40% of cases. Drugs are available to treat these conditions, but doctors are increasingly encountering varieties of fungi that have developed resistance to treatments, making Candida infection a serious global health problem.

Scientists led by Dr. Toni Gabaldón, ICREA researcher and group leader at the Institute for Research in Biomedicine (IRB Barcelona) and the Barcelona Supercomputing Center (BSC), have studied the resistance mechanisms developed by the species Candida glabrata upon exposure to various drugs and have identified eight genes that, when mutated, are responsible for allowing the fungus to adapt and survive treatment. Before this study, only half of these genes were known candidates for conferring drug resistance.

“The interesting thing about this work is that the identification of these eight genes allows us to use a genetic test to diagnose potential drug resistance present in the infection of a specific patient and, therefore, help choose the best treatment,” says Dr. Gabaldón, head of the Comparative Genomics lab at IRB Barcelona.

The evolutionary process underlying the incorporation of resistance mechanisms

To perform this study, the researchers cultured independent populations of the fungus Candida glabrata and administered a variety of drugs available on the market that have different mechanisms of action. They then analyzed the resistance developed and the genomes of the distinct populations to correlate the mechanisms with the genetic differences.

The strains that have been generated in this work, which combine resistance to several drugs, can serve as a study model in the search for new treatments.

Cross-resistance phenomena

In addition to resistance to the treatment administered, the researchers observed that exposure to one particular drug (fluconazole) also caused resistance to another type of drug (an echinocandin) in 50% of the cases, although these populations had never been exposed to the second drug.

“This phenomenon is known as cross-resistance and, in this regard, our discoveries should lead to an adaptation of treatment guidelines to avoid favoring the appearance of multiresistant strains,” says Dr. Gabaldón.

The laboratory headed by Dr. Gabaldón has received support from the “la Caixa” Foundation to start a project related to these findings. The project seeks to improve the diagnosis of candidiasis and to design new treatments by searching for patterns of infection and drug adaptation in the different Candida species.

The work is a collaboration with Dr. Christoph Schüller, from BOKU University in Vienna (Austria), and it has been funded by the Spanish Ministry of Science and Innovation and the “la Caixa” Foundation.

University of Surrey researcher discovers links to Bernard Williams' 40-year-old "slosh" hypothesis

Syringomyelia, first described over 400 years ago, is a spinal cord disease characterized by fluid-filled cavities within the spinal cord tissue. However, the mechanism by which these cavities form is still not fully understood. In 1980, neurosurgeon Bernard Williams hypothesized that pressure changes due to coughing, sneezing, and straining cause fluid in the cavity to “slosh,” generating stress in the spinal cord tissue and allowing the cavity to slowly expand over time.

Syringomyelia is common in brachycephalic (flat-faced) toy breed dogs. In humans, the disease can be painful and disabling, often seen alongside Chiari malformation, a condition where the lower part of the brain pushes down and extends into the spinal canal. In some cases, the malformation can be a direct result of serious spinal cord trauma.

Dr. Srdjan Cirovic, Lecturer in Biomedical Engineering, and Professor Clare Rusbridge, Professor in Veterinary Neurology, have worked together to develop a supercomputer model based on the MRI from a Cavalier King Charles spaniel with syringomyelia, showing that Bernard Williams' hypothesis from 1980 is likely correct.

Using this model, the pair ran various simulations showing that the fluid “slosh” caused a small cavity to gradually expand down the spinal cord. However, when the syrinx became large, there was less focal stress, which may explain why syringomyelia can develop rapidly but then remain unchanged in shape over time.

Dr. Srdjan Cirovic and Professor Clare Rusbridge plan to further develop the model to improve understanding of why syringomyelia develops in both dogs and humans, and also use it as an opportunity to model potential surgeries to better establish means of reversing the syrinx filling in all species.

Clare Rusbridge, Professor in Veterinary Neurology at the University of Surrey, said: “The results for the simulations of an expanding syrinx are broadly consistent with the ‘slosh’ hypothesis; however, this study specifically addresses syringomyelia in dogs, more specifically in a Cavalier King Charles Spaniel. Since there are many similarities in syringomyelia in both humans and animals, it is likely the theory should hold for humans too. However, more analysis needs to be done to understand this in further detail.”

Dr. Srdjan Cirovic, Lecturer in Biomedical Engineering at the University of Surrey, said: “It has been both fascinating and challenging to work on the problem of syringomyelia over the last decade. With this breakthrough, we are one step closer to understanding this puzzling neurological condition. In the future, we are looking towards using these findings to inform the improved medical treatment of syringomyelia in humans as well as animals.”

Dr. Helen Williams, General Practitioner and daughter of Bernard Williams, said: “This is a significant and important piece of work, and thanks to the hard work of two researchers, I am delighted to hear that my late father’s 40-year-old hypothesis is now much closer to being proven. This is key to further understanding more about this debilitating disease and how it can be treated.”