Harvard's BE-Hive predicts which base editor performs best to repair disease-causing mutations

Gene-editing technology is getting better and growing faster than ever before. New and improved base editors--an especially efficient and precise kind of genetic corrector--inch the tech closer to treating genetic diseases in humans. But the base editor boom comes with a new challenge: Like a massive key ring with no guide, scientists can sink huge amounts of time into searching for the best tool to solve genetic malfunctions like those that cause sickle cell anemia or progeria (a rapid aging disease). For patients, time is too important to waste.

"New base editors come out seemingly every week," said David Liu, Thomas Dudley Cabot Professor of the Natural Sciences and a core institute member of the Broad Institute and the Howard Hughes Medical Institute (HHMI). "The progress is terrific, but it leaves researchers with a bewildering array of choices for what base editor to use."

Liu invented base editors. Fittingly, he and his research team have now invented a way to identify which are most likely to achieve desired edits, as published yesterday in Cell. Using experimental data from editing more than 38,000 target sites in human and mouse cells with 11 of the most popular base editors (BEs), they created a machine learning model that accurately predicts base editing outcomes, Liu said. The library, called BE-Hive, is available for public use. But the effort produced more than a neat catalog of BEs; the machine learning model discovered new editor properties and capabilities that humans failed to notice.

"If you set out to use base editing to correct a single disease-causing mutation," said Mandana Arbab, a postdoctoral fellow in the Liu lab and co-first author on the study, "you're left with a mountain of possible ways to do it, and it is difficult to know which ones are most likely to work."

Base editors may be more precise than other forms of gene editing, but they can still cause unwanted, often unpredictable, edits outside the intended genetic target. Each editor has its own eccentricities. Different types operate within smaller or larger editing "windows," stretches of DNA about two to five letters wide. Some editors might overshoot or undershoot their targets; others might change just one of two As in a given window.

"If the sequence within the window is GACA," Liu said, "and you're using an adenine base editor to change one of those As, will one be preferentially edited over the other?"

The answer depends on the base editor, its paired guide RNA--the chaperone that ferries the editor to the appropriate DNA work site--and the surrounding DNA sequence. To corral all these complicating factors, the team first collected a massive amount of data. Over about a year, Arbab said, they equipped cells with over 38,000 DNA target sites and then treated them with the 11 most popular base editors, paired with guide RNAs. After the treatment, they sequenced the DNA of the cells to collect billions of data points on how each base editor impacted each cell.

To analyze this bounty, Max Shen, a Ph.D. student in the Massachusetts Institute of Technology's Computational and Systems Biology program, a member of the Broad Institute, and co-first author, designed and trained a machine learning model to predict each base editor's particular eccentricities. In a previous groundbreaking study, Shen and his lab mates trained a different machine learning model to analyze data from another common gene-editing tool, CRISPR, and dispelled a popular misconception that the tool yields unpredictable and generally useless insertions and deletions, Shen said. Instead, they showed that even if humans can't predict where those insertions and deletions occur, machine learning could.

Now, researchers can put a target DNA sequence into BE-Hive, Shen's beefed-up machine learning model, and see predicted outcomes of using each of the 11 base editors on that target. "BE-Hive predicts, down to the individual DNA sequence level, what will be the distribution of products that results from each of those base editors acting on that target site," said Liu.
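In spirit, the query works something like the toy sketch below. This is not BE-Hive's actual code or API, and the per-position editing propensities here are invented for illustration: given a target window and an editor's learned position preferences, return a predicted distribution over edited products.

```python
# Toy sketch (NOT BE-Hive's real model): predict a distribution of edited
# products for a cytosine base editor (C -> T), weighting each editable C
# by a hypothetical per-position editing propensity.

def predict_products(window, pos_weight):
    """Return {edited_sequence: probability} for single C->T edits."""
    scores = {}
    for i, base in enumerate(window):
        if base == "C":
            edited = window[:i] + "T" + window[i + 1:]
            scores[edited] = pos_weight.get(i, 0.0)
    total = sum(scores.values())
    if total == 0:
        return {window: 1.0}           # no editable C: unedited product
    return {seq: s / total for seq, s in scores.items()}

# Hypothetical editor that strongly prefers the second C in the window:
weights = {2: 0.2, 3: 0.8}
print(predict_products("GACCA", weights))
```

The real model conditions on far more than position--surrounding sequence context, the editor, and the guide RNA--which is exactly why a trained model outperforms hand-written rules like this one.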

Some of BE-Hive's predictions were surprising, even to the inventor of base editors. "Sometimes," Liu said, "for reasons that our primate brains aren't sufficiently sophisticated to predict, the model could accurately tell us that even though there are two Cs right in the editing window, this particular editor will only edit the second one, for example."

BE-Hive also learned when base editors can make so-called transversion edits: Instead of changing a C to a T, some base editors changed a C to a G or an A, rare and abnormal but potentially valuable quirks. The researchers then used BE-Hive to correct 174 disease-causing transversion mutations with minimal byproducts. And, they used BE-Hive to discover unknown base editor properties, which they used to design novel tools with new capabilities, adding a few more genetic keys to the ever-growing ring.

University of Maryland's simulations reveal interplay between scent marking, disease spread

Accounting for individual animal movement could boost understanding of emerging infectious diseases

In a new mathematical model that bridges animal movement and disease spread, territorial behaviors decreased the severity of potential disease outbreaks--but at the cost of increased disease persistence. Lauren White of the University of Maryland's National Socio-Environmental Synthesis Center, Annapolis, MD, and colleagues present these findings in PLOS Computational Biology.

Disease research often addresses direct social contact without considering individual animals' movement. Individual movement can be shaped by indirect social cues; for instance, a puma might mark its territory with a scent. While territorial behaviors could, in theory, inhibit diseases that require direct transmission, pathogens able to persist in the environment could still spread. [Image caption: Scent marking plays a key role in social communication for many species. Cheetahs, for example, rely on scent mark signals to establish territories and commonly scent mark elevated places on the landscape like termite mounds or trees. Credit: Martyn Smith, Flickr]

To better understand the interplay between indirect communication and disease spread, White and colleagues developed a mathematical model in which infected animals can indirectly infect others by leaving behind pathogens whenever they deposit scent marks. The researchers used the model to simulate the territorial movement of animals over a landscape, as well as the resulting disease spread.
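The core mechanism--infected animals seeding the environment with pathogen wherever they scent mark--can be sketched in a few lines. The sketch below is a deliberately minimal stand-in, not the authors' published model: animals walk randomly on a ring of sites, infected animals contaminate their current site, and the contamination decays after a fixed number of steps. All parameter values are illustrative assumptions.

```python
# Minimal sketch of indirect (environmental) transmission via scent marks.
# Not the published model: a 1-D random walk with SIR states, where
# infection passes only through contaminated sites, never by direct contact.
import random

def simulate(n_animals=30, n_sites=50, p_pickup=0.3, p_recover=0.05,
             decay=10, steps=200, seed=1):
    rng = random.Random(seed)
    pos = [rng.randrange(n_sites) for _ in range(n_animals)]
    state = ["S"] * n_animals
    state[0] = "I"                     # index case
    mark_age = {}                      # site -> steps since contamination
    peak_infected = state.count("I")
    for _ in range(steps):
        # age out old scent marks: pathogen decays in the environment
        mark_age = {s: a + 1 for s, a in mark_age.items() if a + 1 < decay}
        for i in range(n_animals):
            pos[i] = (pos[i] + rng.choice([-1, 0, 1])) % n_sites
            if state[i] == "I":
                mark_age[pos[i]] = 0   # deposit a contaminated scent mark
                if rng.random() < p_recover:
                    state[i] = "R"
            elif state[i] == "S" and pos[i] in mark_age:
                if rng.random() < p_pickup:
                    state[i] = "I"
        peak_infected = max(peak_infected, state.count("I"))
    return peak_infected, state.count("I") + state.count("R")

peak, total_ever_infected = simulate()
```

Sweeping parameters like `decay` (environmental persistence) or animal density in such a model is what lets the authors compare outbreak severity against outbreak duration.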

In simulated outbreak-prone conditions with high animal density and slow disease-recovery rates, territorial movement decreased the number of animals infected, but at the cost of longer disease persistence within the population. These results suggest that indirect communication could play a more important role in disease transmission than previously thought.

"It was exciting to be able to incorporate a movement-ecology perspective into a disease-modeling framework," White says. "Our findings support the possibility that pathogens could evolve to co-opt indirect communication systems to overcome social barriers in territorial species."

This study demonstrates that accounting for movement behavior in disease models could improve understanding of how infectious diseases spread. Moving forward, the researchers hope to strengthen their models with additional dynamics, such as varying habitat quality and prey kill sites.

Pitt ECE professor wins $300K NSF Award to develop 2D synapse for deep neural networks

The world runs on data. Self-driving cars, security, healthcare, and automated manufacturing all are part of a "big data revolution," which has created a critical need for a way to more efficiently sift through vast datasets and extract valuable insights.

When it comes to the level of efficiency needed for these tasks, however, the human brain is unparalleled. Taking inspiration from the brain, Feng Xiong, assistant professor of electrical and computer engineering at the University of Pittsburgh's Swanson School of Engineering, is collaborating with Duke University's Yiran Chen to develop a two-dimensional synaptic array that will allow computers to do this work with less power and greater speed. Xiong has received a $300,000 award from the National Science Foundation for this project.

"Deep neural networks (DNN) work by training a neural network with massive datasets for applications like pattern recognition, image reconstruction or video and language processing," said Xiong. "For example, if airport security wanted to create a program that could identify firearms, they would need to input thousands of pictures of different firearms in different situations to teach the program what it should look for. It's not unlike how we as humans learn to identify different objects."

To do this, supercomputing systems constantly transfer data back and forth between the computation and memory units, making DNNs computationally intensive and power-hungry. Their inefficiency makes it impractical to scale them up to the level of complexity needed for true artificial intelligence (AI). In contrast, the human brain handles computation and memory with a network of neurons and synapses that are closely and densely connected, resulting in the brain's extremely low power consumption of about 20 W.

"The way our brains learn is gradual. For example, say you're learning what an apple is. Each time you see the apple, it might be in a different context: on the ground, on a table, in a hand. Your brain learns to recognize that it's still an apple," said Xiong. "Each time you see it, the neural connection changes a bit. In computing we want this high-precision synapse to mimic that so that over time, the connections strengthen. The finer the adjustments we can make, the more powerful the program can be, and the more memory it can have."

With existing consumer electronic devices, the kind of gradual, slight adjustment needed is difficult to attain because they rely on binary states, essentially limited to on or off, yes or no. The artificial synapse will instead offer roughly 1,000 states, with precise control in moving between them.
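A quick sketch shows why the number of states matters for learning. The class below quantizes a weight to N levels; a tiny training nudge that cannot move a 2-state device can steadily shift a 1,000-state one. The numbers are illustrative assumptions, not measurements of the proposed device.

```python
# Sketch: a synaptic weight quantized to n_states conductance levels.
# With 2 states (binary), small updates round away to nothing; with
# ~1,000 states the weight can be nudged gradually, as training expects.

class QuantizedSynapse:
    def __init__(self, n_states):
        self.n_states = n_states
        self.level = n_states // 2     # start mid-range

    @property
    def weight(self):                  # map level to [0, 1] conductance
        return self.level / (self.n_states - 1)

    def update(self, delta):
        """Apply a small training update; quantization limits resolution."""
        target = self.weight + delta
        self.level = min(self.n_states - 1,
                         max(0, round(target * (self.n_states - 1))))

binary = QuantizedSynapse(2)
analog = QuantizedSynapse(1000)
for _ in range(10):
    binary.update(0.001)   # nudge too small to flip a binary state
    analog.update(0.001)   # same nudge moves the analog level each step
# binary.weight is stuck; analog.weight has crept upward
```

The finer the step between adjacent states, the closer the hardware gets to the continuous weight updates that learning algorithms assume.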

Additionally, smaller devices, like sensors and other embedded systems, need to communicate their data to a larger computer to process it. The proposed device's small size, flexibility, and low power usage could make it able to run those calculations in much smaller devices, allowing sensors to process information on-site.

"What we're proposing is that, theoretically, we could lower the energy needed to run these algorithms, hopefully by 1,000 times or more. This way, it makes the power requirement more reasonable, so a flexible or wearable electronic device could run it with a very small power supply," said Xiong.

The project, titled "Collaborative Research: Two-dimensional Synaptic Array for Advanced Hardware Acceleration of Deep Neural Networks," is expected to last three years, beginning on Sept. 1, 2020.