The 2023 Warren Alpert Foundation Prize honors computational biology pioneer David J. Lipman

The 2023 Warren Alpert Foundation Prize celebrates a frontiersperson in computational biology

Scientist’s vision transformed the way researchers analyze data, access biomedical information

The 2023 Warren Alpert Foundation Prize has been awarded to scientist David J. Lipman for his visionary work in the conception, design, and implementation of computational tools, databases, and infrastructure that transformed the way biological information is analyzed and accessed freely and rapidly around the world. 

The Warren Alpert Foundation bestows the $500,000 award in recognition of work that has improved the understanding, prevention, treatment, or cure of human disease. The prize is administered by Harvard Medical School. 

Lipman will be honored at a scientific symposium on Oct. 11, 2023, hosted by HMS. For further information, visit the Warren Alpert Foundation Prize symposium website.

Lipman, who is currently a senior science adviser for bioinformatics and genomics for the Food and Drug Administration, is receiving the award for work he did in the 1980s and 1990s before and after becoming the founding director of the National Center for Biotechnology Information (NCBI), a position he held until 2017.

Lipman led the development of a powerful computational program called BLAST for the analysis and comparison of newly identified DNA and protein sequences against all known DNA and protein sequences. The tool transformed researchers’ ability to access and interpret DNA, RNA, and protein sequence data and propelled the fields of computational biology and molecular biology. While at the NCBI, Lipman also conceptualized and then oversaw the design and implementation of PubMed, the web-based database for biomedical literature used daily by millions of scientists, physicians, students, teachers, and the public. Today, NCBI houses multiple biotechnology databases and resources that, over the years, have reshaped biology, medicine, and other fields of science. 

“At a time when computation was deemed an exotic pursuit by most biomedical researchers, David was prescient because he understood the potential of computation to propel biomedical science forward,” said George Q. Daley, dean of HMS and chair of the Warren Alpert Foundation Prize scientific advisory board. “His vision, creativity, and rigor have transformed how scientists analyze and share data and, indeed, how we do science.” 

Lipman’s pioneering achievements not only democratized access to scientific information but also helped catalyze critical discoveries by enabling vital exchanges and collaborations among scientists in multiple fields of biomedicine and beyond. 

“The foundational work of David Lipman in the field of computational biology and the tools that he envisioned and created have an impact that is nearly impossible to measure on the fields of biology, medicine, and beyond,” said David M. Hirsch, director and chairman of the board of The Warren Alpert Foundation. “His contributions exemplify the mission and vision of the Warren Alpert Foundation.”

Significance of the work

Over the past 40 years, advances in DNA sequencing, computation, and the internet have transformed biomedical research, public health, and the practice of medicine. Lipman developed many of the most important computational tools and infrastructure for making discoveries with these technologies. 

In the 1980s, as the understanding of DNA and genes accelerated, elucidating the evolutionary relationships among genes and proteins, both within and between species, became a major focus of Lipman’s scientific curiosity and research efforts. Such relationships provide essential clues about the function of genes and proteins.

Early on, Lipman realized that the rapid emergence of new genetic sequencing data would require powerful and efficient computer programs to compare one DNA or protein sequence against all known sequences.

In a series of papers published between 1983 and 1990, Lipman pioneered the design of multiple methods for comparative sequence analysis. This work culminated in the development of an algorithm called BLAST, described in a now seminal 1990 paper. Today, BLAST and its successors, Gapped BLAST and PSI-BLAST, remain among the most widely used tools in biology and medicine and are deemed among the most significant achievements in computational biology of the past 40 years.
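At its core, BLAST gains its speed by indexing short “words” of sequence, finding exact word matches between a query and a database, and extending each match outward only as long as the alignment score keeps improving, rather than exhaustively aligning every pair of sequences. The Python sketch below is a deliberately simplified, illustrative version of that seed-and-extend idea, not the actual BLAST implementation; the scoring values and word size are placeholders.

```python
# Minimal seed-and-extend sketch (illustrative only, not the real BLAST algorithm).
# It indexes k-mer "words" of a database sequence, finds exact word hits in a
# query, and extends each hit left and right while the match score keeps improving.
from collections import defaultdict

MATCH, MISMATCH = 2, -1  # toy scoring scheme
WORD_SIZE = 4            # BLAST typically uses ~11 for DNA, 3 for proteins

def index_words(seq, k=WORD_SIZE):
    """Map every k-mer in `seq` to the positions where it occurs."""
    table = defaultdict(list)
    for i in range(len(seq) - k + 1):
        table[seq[i:i + k]].append(i)
    return table

def extend_hit(query, db, q0, d0, k=WORD_SIZE, drop_off=5):
    """Extend an exact word hit in both directions until the score drops off."""
    best = score = k * MATCH
    left = right = 0
    i = 0
    while q0 + k + i < len(query) and d0 + k + i < len(db):  # extend right
        score += MATCH if query[q0 + k + i] == db[d0 + k + i] else MISMATCH
        i += 1
        if score > best:
            best, right = score, i
        elif best - score > drop_off:
            break
    score, i = best, 0
    while q0 - i - 1 >= 0 and d0 - i - 1 >= 0:               # extend left
        score += MATCH if query[q0 - i - 1] == db[d0 - i - 1] else MISMATCH
        i += 1
        if score > best:
            best, left = score, i
        elif best - score > drop_off:
            break
    return best, query[q0 - left:q0 + k + right]

def blast_like_search(query, db, k=WORD_SIZE):
    """Return the best-scoring local matches between a query and a database sequence."""
    table = index_words(db, k)
    hits = []
    for q0 in range(len(query) - k + 1):
        for d0 in table.get(query[q0:q0 + k], []):
            hits.append(extend_hit(query, db, q0, d0, k))
    return sorted(set(hits), reverse=True)

if __name__ == "__main__":
    database = "ATGCGTACGTTAGCCGATCGATCGGGCTTAACGT"
    query = "CGATCGATCGGG"
    for score, segment in blast_like_search(query, database)[:3]:
        print(score, segment)
```

Real BLAST adds statistical significance estimates (E-values), substitution matrices for proteins, and, in the later gapped versions, insertions and deletions, but the word-hit-then-extend strategy sketched here is what made searching an entire sequence database practical.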

BLAST enabled understanding of the interplay between genes, biology, physiology, and the environment across organisms and has led to important discoveries in nearly all areas of biomedical research, from the molecular basis of cancer to identifying the sources of foodborne outbreaks.

Furthermore, Lipman became one of the most ardent supporters of and key figures in the move toward open-access science. He was instrumental in the design of PubMed, the open-access scientific publication resource of the NCBI and the largest and most widely used resource for scientific research in the world. 

As director of NCBI, Lipman oversaw GenBank, the world’s largest DNA and protein sequence repository, an international collaboration among the United States, Japan, and Europe. Under his direction, NCBI brought GenBank into the era of genomics and the internet, vastly augmenting its capabilities.

“Through the creation of computational tools and information systems, my goal and that of the wonderful collaborators I’ve had the honor to work with has been to enable biomedical researchers to make discoveries. The scientists involved in the nomination and selection process have a deep understanding of the field and have themselves made some of the most important biomedical discoveries. So, this honor holds a special significance to me,” commented David J. Lipman.

Korean professor Min predicts 10-year countdown to sea-ice-free Arctic

A research team led by Professor Seung-Ki Min at POSTECH predicts the Arctic will be without sea ice by the end of the 2030s if greenhouse gas emissions keep rising at their current rate.

If the world keeps increasing greenhouse gas emissions at its current speed, all sea ice in the Arctic will disappear in the 2030s, an event that could at best be postponed until the 2050s should emissions be somehow reduced. The prediction is a decade earlier than what the Intergovernmental Panel on Climate Change (IPCC) has projected: an ice-free Arctic by the 2040s.

Professor Seung-Ki Min and Research Professor Yeon-Hee Kim from the Division of Environmental Science and Engineering at Pohang University of Science and Technology (POSTECH), together with researchers from Environment and Climate Change Canada and Universität Hamburg in Germany, projected a possible ice-free Arctic in the 2030s to 2050s regardless of humanity’s efforts to reduce greenhouse gas emissions.

The term global warming has been a household phrase since a NASA climate scientist brought it to broad public attention in 1988. The Earth has seen a rapid decline in Arctic sea ice area as its temperature has increased over the past several decades. This reduction in Arctic sea ice has accelerated Arctic warming, which in turn is thought to contribute to the increased frequency of extreme weather events in mid-latitude regions.

To predict the timing of Arctic sea ice depletion, the research team analyzed 41 years of data, from 1979 to 2019. By comparing the results of multiple model simulations with three satellite observational datasets, the team confirmed that the decline is primarily attributable to man-made greenhouse gas emissions. Emissions from fossil fuel combustion and deforestation have been the dominant drivers of Arctic sea ice loss over those 41 years, while the influence of aerosols and of solar and volcanic activity has been minimal. A month-by-month analysis found that increased greenhouse gas emissions were reducing Arctic sea ice year-round, regardless of season or timing, although September exhibited the smallest extent of sea ice reduction.

Furthermore, the team found that the climate models used in previous IPCC predictions generally underestimated the declining trend in sea ice area, and this bias was taken into account when adjusting the simulated values for future projections. The adjusted results showed accelerated decline rates across all scenarios and, most importantly, confirmed that Arctic sea ice could completely disappear by the 2050s even with reductions in greenhouse gas emissions. This finding shows for the first time that the disappearance of Arctic sea ice is possible irrespective of achieving carbon neutrality.
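The adjustment described above is a form of observational scaling: the simulated historical trend is rescaled so that it matches the satellite record, and the same factor is then applied to the projected decline. The short sketch below illustrates that idea only; every number in it is invented for demonstration and none comes from the study.

```python
# Illustrative sketch of observational scaling of model projections.
# All numbers below are made up for demonstration; they are not the study's data.
import numpy as np

# Hypothetical historical September sea-ice-area trends (million km^2 per decade)
observed_trend = -0.82                                   # e.g., from the satellite record
model_trends = np.array([-0.55, -0.62, -0.48, -0.70])    # individual climate models

# Scaling factor: how much the model ensemble underestimates the observed decline
scale = observed_trend / model_trends.mean()
print(f"scaling factor: {scale:.2f}")                    # > 1 means models decline too slowly

# Hypothetical projected sea-ice area pathway under one scenario (million km^2)
years = np.arange(2020, 2061, 10)
projected_area = np.array([3.9, 3.0, 2.2, 1.5, 0.9])

# Apply the scaling to the projected decline relative to the start of the pathway
adjusted_area = projected_area[0] + scale * (projected_area - projected_area[0])
adjusted_area = np.clip(adjusted_area, 0.0, None)        # area cannot go below zero

for y, a in zip(years, adjusted_area):
    print(y, round(float(a), 2))
```

Because the scaling factor is greater than one when models lose ice too slowly, the adjusted pathway reaches near-zero ice earlier than the raw model output, which is the qualitative effect the study reports.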

The accelerated decline of Arctic sea ice, faster than previously anticipated, is expected to have significant impacts not only on the Arctic region but also on human societies and ecosystems worldwide. The reduction of sea ice can result in more frequent occurrences of extreme weather events such as severe cold waves, heat waves, and heavy rainfalls all across the globe, with the thawing of the Siberian permafrost in the Arctic region possibly intensifying global warming further. We may witness terrifying scenarios, which we have seen only in disaster movies, unfold right before our eyes.

Professor Seung-Ki Min, who led the study, explained, “We have confirmed an even faster timing of Arctic sea ice depletion than previous IPCC predictions after scaling model simulations based on observational data.” He added, “We need to be vigilant about the potential disappearance of Arctic sea ice, regardless of carbon neutrality policies.” He also expressed the importance of “evaluating the various climate change impacts resulting from the disappearance of Arctic sea ice and developing adaptation measures alongside carbon emission reduction policies.”

The study was funded by the National Research Foundation of Korea (Mid-Career Researcher program).

Response differences in brain circuits when broadcasting (top row) or receiving signals (bottom row). Left: healthy controls; Middle: unresponsive wakefulness; Right: minimally conscious

Spanish researchers use brain modeling to identify necessary circuits of consciousness

Currently, patients with disorders of consciousness (DoC) are classified into several categories (coma, unresponsive wakefulness syndrome, and minimally conscious state) that describe their overall consciousness and awareness. “The diagnosis is mainly response-based: the doctor sits down with the patient and assesses their response to stimuli,” explains Jitka Annen from the University of Liège. “However, this may not correspond to their underlying brain activity; patients with high activity may still be unable to react. It’s a heterogeneous group. We wanted to go a step beyond assessment and classification based on neuroimaging and instead look at the flow of information in their brains, to find common patterns associated with consciousness.”

The researchers focused on two DoC groups - unresponsive wakefulness syndrome (previously known as the vegetative state) and the minimally conscious state. After collecting fMRI data from each patient during resting state (i.e., patients were awake but no particular task was provided), they looked at the spontaneous and model-based perturbation of brain activity captured by the blood flow, such as signals and peaks. “Based on spontaneous peaks of activity, we evaluated the personalized connectivity of each patient’s brain, which can tell us how likely a signal is to travel from one point to the other,” says Gorka Zamora-López from Pompeu Fabra. “After we constructed a patient-specific computational model of their propagation patterns, we then trigger a signal in the model and see how the brain reacts. In particular, we look for which areas are more likely to respond to a signal; which areas are more likely to propagate it. Basically, we look at whether an area acts as an influencer or influenced.”
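One simple way to picture the “influencer versus influenced” analysis is a linear propagation model on a connectivity matrix: inject a perturbation into each region, let it spread along the weighted connections, and total up how strongly each region broadcasts signals to the rest of the network versus how strongly it responds to signals arriving from elsewhere. The sketch below follows that toy formulation and is not the authors’ actual whole-brain model; the connectivity matrix is random placeholder data standing in for patient-specific estimates.

```python
# Toy propagation model on a brain connectivity matrix (illustrative only).
# A perturbation injected into each region spreads along weighted connections;
# accumulated responses give "broadcasting" (outgoing) and "receiving" (incoming) profiles.
import numpy as np

rng = np.random.default_rng(0)
n_regions = 8

# Placeholder weighted connectivity matrix (in practice, estimated from resting-state fMRI)
C = rng.random((n_regions, n_regions))
np.fill_diagonal(C, 0.0)
C /= np.linalg.eigvals(C).real.max() * 1.1   # normalize so propagation decays

# Total influence of region j on region i over all walk lengths: R = (I - C)^-1 - I
R = np.linalg.inv(np.eye(n_regions) - C) - np.eye(n_regions)

broadcasting = R.sum(axis=0)   # how much each region's perturbation spreads to the others
receiving = R.sum(axis=1)      # how much each region responds to the others' perturbations

for j in range(n_regions):
    print(f"region {j}: broadcasts {broadcasting[j]:.2f}, receives {receiving[j]:.2f}")
```

In this picture, a region with a high broadcasting score acts as an “influencer,” while one with a high receiving score is mainly “influenced,” which is the kind of asymmetry the study looked for in the thalamic-frontotemporal and posterior cortical circuits.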

A marked distinction arises between the unresponsive wakefulness syndrome group and the minimally conscious state group, with the former not displaying activity in identifiable circuits. “The key difference is that in patients with unresponsive wakefulness syndrome, no region of the brain seems embedded in a functional network, they all display equally low activation. Meanwhile, distinct regions and circuits pop out in the brain models of people in the minimally conscious state: the thalamic-frontotemporal region when broadcasting signals, and the posterior cortical region when receiving them,” adds Rajanikant Panda of the University of Liège. 

These findings shed new light on disorders of consciousness and could lead to a more precise understanding of the underlying mechanisms based on brain activity rather than behavioral responses. “I believe these results can potentially guide clinicians to better understand what is going wrong in the information exchange and thus look for ways to reactivate those circuits,” concludes Zamora-López.

BIDMC researchers test AI-powered chatbot's medical diagnostic ability

Generative artificial intelligence could serve as a promising tool to assist medical professionals

In a recent experiment published in JAMA, physician-researchers at Beth Israel Deaconess Medical Center (BIDMC) tested a well-known publicly available chatbot’s ability to make accurate diagnoses in challenging medical cases. The team found that the generative AI, ChatGPT-4, selected the correct diagnosis as its top diagnosis nearly 40 percent of the time and included the correct diagnosis in its list of potential diagnoses in two-thirds of the challenging cases.

Generative AI refers to a type of artificial intelligence that uses patterns and information it has been trained on to create new content, rather than simply processing and analyzing existing data. Some of the best-known examples of generative AI are chatbots, which use a branch of artificial intelligence called natural language processing (NLP) that allows computers to understand, interpret, and generate human-like language. Generative AI chatbots are powerful tools poised to revolutionize creative industries, education, customer service, and more. However, little is known about how they might perform in clinical settings, such as in complex diagnostic reasoning.

“Recent advances in artificial intelligence have led to generative AI models that are capable of detailed text-based responses that score highly in standardized medical examinations,” said Adam Rodman, MD, MPH, co-director of the Innovations in Media and Education Delivery (iMED) Initiative at BIDMC and an instructor in medicine at Harvard Medical School. “We wanted to know if such a generative model could ‘think’ like a doctor, so we asked one to solve standardized complex diagnostic cases used for educational purposes. It did really, really well.”

To assess the chatbot’s diagnostic skills, Rodman and colleagues used clinicopathological case conferences (CPCs), a series of complex and challenging patient cases including relevant clinical and laboratory data, imaging studies, and histopathological findings published in the New England Journal of Medicine for educational purposes.

Across the 70 CPC cases evaluated, the artificial intelligence exactly matched the final CPC diagnosis in 27 cases (39 percent). In 64 percent of the cases, the final CPC diagnosis was included in the AI’s differential, a list of possible conditions that could account for a patient’s symptoms, medical history, clinical findings, and laboratory or imaging results.
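Scoring in this kind of benchmark reduces to two numbers per case: whether the model’s single top guess matched the final published diagnosis, and whether the final diagnosis appeared anywhere in its differential. The sketch below shows how those two rates might be tabulated; the cases are invented placeholders, and exact string matching stands in for the physician adjudication used in the actual study.

```python
# Sketch of scoring a diagnostic benchmark: top-diagnosis accuracy and
# "in-differential" accuracy. The cases below are invented placeholders.
from dataclasses import dataclass

@dataclass
class Case:
    final_diagnosis: str      # the published CPC answer
    differential: list[str]   # model's ranked list; index 0 is its top diagnosis

def score(cases: list[Case]) -> tuple[float, float]:
    """Return (fraction with correct top diagnosis, fraction with answer anywhere in list)."""
    top_hits = sum(c.differential[0] == c.final_diagnosis for c in cases)
    any_hits = sum(c.final_diagnosis in c.differential for c in cases)
    n = len(cases)
    return top_hits / n, any_hits / n

cases = [
    Case("sarcoidosis", ["sarcoidosis", "tuberculosis", "lymphoma"]),
    Case("Whipple disease", ["celiac disease", "Whipple disease"]),
    Case("amyloidosis", ["multiple myeloma", "chronic kidney disease"]),
]
top1, in_list = score(cases)
print(f"top diagnosis correct: {top1:.0%}, in differential: {in_list:.0%}")
```

In the study itself, the corresponding figures were 27 of 70 cases (39 percent) for the top diagnosis and 64 percent for inclusion anywhere in the differential.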

“While Chatbots cannot replace the expertise and knowledge of a trained medical professional, generative AI is a promising potential adjunct to human cognition in diagnosis,” said first author Zahir Kanjee, MD, MPH, a hospitalist at BIDMC and assistant professor of medicine at Harvard Medical School. “It has the potential to help physicians make sense of complex medical data and broaden or refine our diagnostic thinking. We need more research on the optimal uses, benefits, and limits of this technology, and a lot of privacy issues need sorting out, but these are exciting findings for the future of diagnosis and patient care.”

“Our study adds to a growing body of literature demonstrating the promising capabilities of AI technology,” said co-author Byron Crowe, MD, an internal medicine physician at BIDMC and an instructor in medicine at Harvard Medical School. “Further investigation will help us better understand how these new AI models might transform health care delivery.”

This work did not receive separate funding or sponsorship. Kanjee reports royalties for books edited and membership of a paid advisory board for medical education products not related to artificial intelligence from Wolters Kluwer, as well as honoraria for CME delivered from Oakstone Publishing. Crowe reports employment by Solera Health outside the submitted work. Rodman reports no conflicts of interest.  

Assistant Professor of Biomolecular Engineering Ali Shariati. (photo by Carolyn Lagattuta)

UCSC prof Shariati develops deep-learning software that detects, tracks individual cells with high performance

Cell growth and division are two of the most fundamental and essential features of life, and closely monitoring cell changes over time can give scientists key insights into the dynamics of these biological processes. Time-lapse microscopy allows scientists to detect and track cells but produces huge amounts of data that are nearly impossible to sort through manually.

Now, however, powerful data processing capabilities of modern deep learning models offer techniques to sort through so much imaging data. Assistant Professor of Biomolecular Engineering Ali Shariati and doctoral student Abolfazl Zarageri together with several student researchers in the Shariati lab have developed and released a new deep learning model called “DeepSea,” one of the only tools with the ability to segment cells, track them and detect their division to follow lineages of cells. DeepSea, which is detailed in a new paper in Cell Reports Methods, is one of the highest-accuracy tools of its kind. 

DeepSea’s model training dataset, user-friendly software, and open-source code are available for use on the DeepSea website, and Shariati and his team of researchers have already used it to make discoveries about stem cell growth and division. 

“The model is more efficient, has fewer parameters, and both segmentation and tracking are integrated into a user-friendly software,” Shariati said. “The software allows you to train the model for any cell type of interest, paving the way for future discoveries.”

Time-lapse microscopy, which captures a series of images from a microscope over time, allows researchers to monitor single cells throughout an experiment to track phenomena such as differentiation — when stem cells become a specific type of cell — or change in shape and size over time. This can allow scientists to make new biological discoveries by measuring the dynamics of cell biological phenomena at the single-cell level.

Once the scientists have gathered images, they need to carry out two main tasks: segmentation, or distinguishing the borders of individual cells from each other and from the background; and tracking, or following a cell from one frame to the next. From that point, the researchers can further investigate characteristics such as size, shape, texture, movement, and more.
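A deliberately simple version of those two steps can be written with off-the-shelf tools: intensity thresholding to segment and label cell regions in each frame, then nearest-centroid matching to link each cell to itself in the next frame. The sketch below, using scikit-image and SciPy, is meant only to illustrate that segmentation-and-tracking pipeline; it is not DeepSea’s deep-learning approach, and the synthetic “cells” are just bright squares added to noise.

```python
# Illustrative segmentation + tracking pipeline (not DeepSea's method).
# Requires: numpy, scipy, scikit-image.
import numpy as np
from scipy.spatial.distance import cdist
from skimage.filters import threshold_otsu
from skimage.measure import label, regionprops

def segment(frame: np.ndarray) -> list:
    """Threshold a grayscale frame and return per-cell region properties."""
    mask = frame > threshold_otsu(frame)
    return regionprops(label(mask))

def track(prev_regions, next_regions, max_dist=20.0):
    """Link cells across frames by nearest centroid within a distance cutoff."""
    if not prev_regions or not next_regions:
        return []
    prev_c = np.array([r.centroid for r in prev_regions])
    next_c = np.array([r.centroid for r in next_regions])
    dist = cdist(prev_c, next_c)
    links = []
    for i in range(len(prev_regions)):
        j = int(dist[i].argmin())
        if dist[i, j] <= max_dist:
            links.append((i, j))
    return links

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Two synthetic frames with bright square "cells" on a noisy background
    frame0 = rng.normal(0.1, 0.02, (128, 128))
    frame1 = frame0.copy()
    frame0[30:40, 30:40] += 1.0
    frame0[80:90, 80:90] += 1.0
    frame1[32:42, 33:43] += 1.0   # same cells, slightly shifted in the next frame
    frame1[80:90, 78:88] += 1.0
    r0, r1 = segment(frame0), segment(frame1)
    print("links (prev index -> next index):", track(r0, r1))
```

A division event would show up in this framework as one previous-frame cell whose neighborhood contains two new regions in the next frame, which hints at why mitosis detection is the hard part DeepSea was built to handle.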

Manually sorting through microscopy images is tedious, time-consuming, and ultimately a task better suited to automated computation, which is where DeepSea comes in. This efficient deep-learning model can perform segmentation in less than a second and track cells with 98 percent accuracy.

Enabling the software to detect cell division was a particularly unique and challenging aspect of this project, as there are few if any other situations in which artificial intelligence and computer vision must track one object transforming into two. 

“This is a very unusual problem for object tracking,” Shariati said. “If you want to track a car or something, the car will be moving around and you can use machine learning and computer vision to follow them as they move. But for cells, all of a sudden one object becomes two, and that's a fundamentally new problem that we needed to solve, and we were able to do so.” 

DeepSea is a generalizable model, meaning it can be used to track a variety of cell types. It uses a modified version of a popular model, the 2D U-Net, with significantly fewer parameters to achieve both fast speeds and high accuracy.

“We compared our model with some of the best cell segmentation models, and ours now shows the best results in terms of precision and speed, especially for these cell types,” said Zarageri, an electrical and computer engineering Ph.D. student in Shariati’s lab who led the creation of the software.

The researchers trained DeepSea using a dataset of images of cells manually segmented from their backgrounds, a time-intensive process as the images are often low-contrast and the cell bodies hard to make out. To aid in this process, the team developed another software tool to help crop, label, and edit the microscopy images of cells, which is also available at DeepSeas.org.

The training dataset included images of lung, muscle, and stem cells, meaning DeepSea achieves high precision across different cell types. More cell types can be added to future versions of the model. 

The researchers used DeepSea to study the size regulation of embryonic stem cells, which are the foundation of multicellular life and can differentiate into every other cell type. They came away with the discovery that embryonic stem cells, which are known to divide unusually fast, regulate their size so that smaller cells spend a longer time growing before producing the next generation of cells. 

“We found that if an embryonic stem cell is born small, they kind of know that they are small, so they spend more time growing before they go on and divide again,” Shariati said. “We do not know why and how exactly this happens, but at least that phenomenon is there.” 

In the future, the researchers plan to apply their existing software to gather data to study spatial relationships between cells, and how the cellular features are organized in 3-D patterns to form structures. 

The researchers also aim to resolve bottlenecks they have noticed in using their deep learning models, such as the lack of labeled images of cells that are used to train the models. They plan to use a class of machine learning frameworks called Generative Adversarial Networks (GANs) to create new synthetic data and images of cells that are already annotated to cut down on the time it takes to create labels. The researchers would then have large libraries of datasets of any cell type of interest with minimal human involvement.
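A GAN pairs a generator network, which synthesizes images from random noise, against a discriminator network, which tries to tell real images from generated ones; training the two adversarially pushes the generator toward increasingly realistic output. The bare-bones PyTorch sketch below shows that training loop on small, flattened grayscale patches; the architectures, sizes, and the stand-in “real image” loader are placeholders for illustration and are not drawn from the paper or the researchers’ plans.

```python
# Minimal GAN training loop sketch (PyTorch). Architectures and sizes are
# placeholders for illustration; this is not the setup used by the researchers.
import torch
import torch.nn as nn

LATENT_DIM, IMG_SIZE, BATCH = 64, 32, 16

generator = nn.Sequential(
    nn.Linear(LATENT_DIM, 256), nn.ReLU(),
    nn.Linear(256, IMG_SIZE * IMG_SIZE), nn.Tanh(),   # pixel values in [-1, 1]
)
discriminator = nn.Sequential(
    nn.Linear(IMG_SIZE * IMG_SIZE, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 1),                                # real/fake logit
)

opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def real_batch():
    """Stand-in for a loader of real (flattened, normalized) cell image patches."""
    return torch.rand(BATCH, IMG_SIZE * IMG_SIZE) * 2 - 1

for step in range(200):
    # Discriminator step: push real patches toward label 1, generated patches toward 0
    real = real_batch()
    fake = generator(torch.randn(BATCH, LATENT_DIM)).detach()
    loss_d = bce(discriminator(real), torch.ones(BATCH, 1)) + \
             bce(discriminator(fake), torch.zeros(BATCH, 1))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Generator step: try to make the discriminator label its output as real
    fake = generator(torch.randn(BATCH, LATENT_DIM))
    loss_g = bce(discriminator(fake), torch.ones(BATCH, 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()

    if step % 50 == 0:
        print(f"step {step}: loss_d={loss_d.item():.3f}, loss_g={loss_g.item():.3f}")
```

For synthetic training data, the appeal is that a generator trained (or conditioned) on annotated examples can emit new image-and-label pairs cheaply, reducing the amount of manual segmentation needed to expand the training set.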