Risks of disease from microbial pathogens in food can be predicted more quickly

The 'Food Safety Knowledge Markup Language (FSK-ML)' format allows us to uniformly document mathematical models and model-based simulation results, and to make these available to other researchers for supercomputer-based forecasts or further optimization of models. With FSK-ML, even models that were developed in different programming languages can be exchanged in a harmonized format. For the first time, it is possible to integrate suitable models from other scientists into in-house calculations, simulations, and assessments at the push of a button. Simulation results are also transparent to others: the software code used and all model parameters are visible to everyone, so results can be recalculated.

The FSK-ML information exchange format, which was extended and tested by the BfR under the AGINFRA+ project (2017-2019), allows us to assess human health risks better and more quickly in the future. This means that previously developed predictive models can now quickly be run with different simulation scenarios and adapted to fit the issue at hand - whether it concerns the risk of salmonella in fresh eggs or the possible transmission of Campylobacter germs from raw chicken breast fillet to green salad in the kitchen.

The new FSK-ML data standard also makes it easier for researchers to make their results available in accordance with the FAIR data principles (findability, accessibility, interoperability, and reusability). In particular, support for the FAIR data principles means that data and information can be found, accessed, and used by different software solutions over the long term.

With the development of the FSK-ML information exchange format, the BfR provides the basis for the future digitalization of risk assessment. With FSK-ML, software developers in the food safety domain can now easily expand their current and future tools to include new functions for importing and exporting models. FSK-ML also forms the basis for the development of web-based model databases, where researchers from different disciplines can search for established models or share their own. One example of such a model database is the 'RAKIP_portal' (https://aginfra.d4science.org/web/rakip_portal/catalogue), developed in the AGINFRA+ project. Models made available and downloaded via this online platform can then be used in different software tools on in-house computers or on other online platforms.

FSK-ML models can be used on one's own computer with, for example, the open-source software "FSK-Lab" (https://foodrisklabs.bfr.bund.de/fsk-lab/), which was also developed by the BfR. In-house and external models can be imported, exported, edited, joined, and even run with this intuitive software. In this way, each user can set up their own predictions or simulation calculations. There is also an extension named "FSK2R" for the open-source scripting language R, which was presented at an international conference (esa.ipb.pt/icpmf11/welcome) in 2019.

Moreover, there are already scientific journals, such as the Food Modelling Journal (FMJ) (https://fmj.pensoft.net/), that allow FSK-ML compliant models to be imported with all relevant metadata. In this way, for example, an 'executable model paper' can be generated automatically in the FMJ. The presented model can then not only be downloaded but also run online with user-defined input parameters. Such innovative digital solutions make a significant contribution to increasing the transparency and reproducibility of scientific work, as the results presented in an article can be tested effectively, for example during the review process. Moreover, the models contain all relevant metadata, such as the range of applicability.

Cleveland Clinic researchers build model online to predict risk of COVID-19, disease outcomes

Cleveland Clinic researchers have developed the world's first risk prediction model for healthcare providers to forecast an individual patient's likelihood of testing positive for COVID-19 as well as their outcomes from the disease.

According to a new study published in CHEST, the risk prediction model (called a nomogram) shows the relevance of age, race, gender, socioeconomic status, vaccination history, and current medications in COVID-19 risk. The risk calculator is a new tool for healthcare providers to aid them in predicting patient risk and tailoring decision-making about care. It provides a more scientific approach to testing, which is important for a healthcare community that has faced increased demand for testing and limited resources.

"The ability to accurately predict whether or not a patient is likely to test positive for COVID-19, as well as potential outcomes including disease severity and hospitalization, will be paramount in effectively managing our resources and triaging care," said Lara Jehi, M.D., Cleveland Clinic's Chief Research Information Officer and corresponding author on the study. "As we continue to battle this pandemic and prepare for a potential second wave, understanding a person's risk is the first step in potential care and treatment planning."

The nomogram, which has been deployed as a freely available online risk calculator at https://riskcalc.org/COVID19/ , was developed using data from nearly 12,000 patients enrolled in Cleveland Clinic's COVID-19 Registry, which includes all individuals tested at Cleveland Clinic for the disease, not just those who test positive.

Data scientists, including co-author of the study Michael Kattan, Ph.D., Chair of Lerner Research Institute's Department of Quantitative Health Sciences, used statistical algorithms to transform data from registry patients' electronic medical records into the first-of-its-kind nomogram.

This study revealed several novel insights into disease risk, including:

  • Patients who have received the pneumococcal polysaccharide vaccine (PPSV23) and flu vaccine are less likely to test positive for COVID-19 than those who have not received the vaccinations.
  • Patients actively taking melatonin (over-the-counter sleep aid), carvedilol (high blood pressure and heart failure treatment) or paroxetine (anti-depressant) are less likely to test positive than patients not taking the drugs.
  • Patients of low socioeconomic status (as measured in this study by zip code) are more likely to test positive than patients of greater economic means.
  • Patients of Asian descent are less likely than Caucasian patients to test positive.

"Our findings corroborated several risk factors already reported in the existing literature - including that being male and of advancing age both increase the likelihood of testing positive for COVID-19 - but we also put forth some new associations," said Dr. Jehi. "Further validation and research are needed into these initial insights but these correlations are extremely intriguing."

In a previous network medicine study led by Lerner Research Institute scientists, 16 drugs (including melatonin, carvedilol, and paroxetine) and three drug combinations were identified as candidates for repurposing as potential COVID-19 treatments. While these findings suggest an association between taking these medications and reduced risk of testing positive for COVID-19, additional studies are needed to assess how these drugs may affect disease progression.

"The data suggest some interesting correlations but do not confer cause and effect," said Kattan. "For example, our data do not prove that melatonin reduces your risk of testing positive for COVID-19. There may be something else about patients who take melatonin that is indeed responsible for their apparent reduced risk, and we don't know what that is. Consumers should not change anything about their behavior based on our findings."

The nomogram, developed using data from patients tested at Cleveland Clinic for COVID-19 before April 2, 2020, showed good performance and reliability when used in a different geographic region (Florida) and over time (patients tested after April 2, 2020). This suggests that the patterns and predictors identified in the model are consistent across regions and communities and can potentially be adapted for clinical practice in healthcare systems across the country.

"This nomogram will bring precision medicine to the COVID-19 pandemic, helping to enable researchers and physicians to predict an individual's risk of testing positive," said Kattan. "Additionally, while testing solutions continue to be needed, it is so important to make sure we are responsibly and optimally dispatching our resources - including clinical personnel, personal protective equipment, and hospital beds. Our risk prediction model stands to greatly assist hospital systems in this planning."

Harvard's BE-Hive predicts which base editor performs best to repair disease-causing mutations

Gene-editing technology is getting better and growing faster than ever before. New and improved base editors--an especially efficient and precise kind of genetic corrector--inch the tech closer to treating genetic diseases in humans. But, the base editor boom comes with a new challenge: Like a massive key ring with no guide, scientists can sink huge amounts of time into searching for the best tool to solve genetic malfunctions like those that cause sickle cell anemia or progeria (a rapid aging disease). For patients, time is too important to waste.

"New base editors come out seemingly every week," said David Liu, Thomas Dudley Cabot Professor of the Natural Sciences and a core institute member of the Broad Institute and the Howard Hughes Medical Institute (HHMI). "The progress is terrific, but it leaves researchers with a bewildering array of choices for what base editor to use."

Liu invented base editors. Fittingly, he and his research team have now invented a way to identify which are most likely to achieve desired edits, as published yesterday in Cell. Using experimental data from editing more than 38,000 target sites in human and mouse cells with 11 of the most popular base editors (BEs), they created a machine learning model that accurately predicts base editing outcomes, Liu said. The library, called BE-Hive, is available for public use. But the effort produced more than a neat catalog of BEs; the machine learning model discovered new editor properties and capabilities that humans failed to notice.

"If you set out to use base editing to correct a single disease-causing mutation," said Mandana Arbab, a postdoctoral fellow in the Liu lab and co-first author on the study, "you're left with a mountain of possible ways to do it and it is difficult to know which ones are most likely to work."

Base editors may be more precise than other forms of gene editing, but they can still cause unwanted, often unpredictable, edits outside the intended genetic target. Each editor has its own eccentricities. Different types operate within smaller or larger editing "windows," stretches of DNA about two to five letters wide. Some editors might overshoot or undershoot their targets; others might change just one of two As in a given window.

"If the sequence within the window is GACA," Liu said, "and you're using an adenine base editor to change one of those As, will one be preferentially edited over the other?"

The answer depends on the base editor, its paired guide RNA--the chaperone that ferries the editor to the appropriate DNA work site--and the surrounding DNA sequence. To corral all these complicating factors, the team first collected a massive amount of data. Over about a year, Arbab said, they equipped cells with over 38,000 DNA target sites and then treated them with the 11 most popular base editors, paired with guide RNAs. After the treatment, they sequenced the DNA of the cells to collect billions of data points on how each base editor impacted each cell.

To analyze this bounty, Max Shen, a Ph.D. student at the Massachusetts Institute of Technology's Computational and Systems Biology program, member of the Broad Institute, and co-first author, designed and trained a machine learning model to predict each base editor's particular eccentricities. In a previous groundbreaking study, Shen and his lab mates trained a different machine learning model to analyze data from another common gene-editing tool, CRISPR, and dispelled a popular misconception that the tool yields unpredictable and generally useless insertions and deletions, Shen said. Instead, they showed that even if humans can't predict where those insertions and deletions occur, machine learning could.

Now, researchers can put a target DNA sequence into BE-Hive, Shen's beefed-up machine learning model, and see predicted outcomes of using each of the 11 base editors on that target. "BE-Hive predicts, down to the individual DNA sequence level, what will be the distribution of products that results from each of those base editors acting on that target site," said Liu.
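
To make the idea of a "distribution of products" concrete, the short Python sketch below lists every sequence an adenine base editor could in principle produce from the GACA window Liu mentions, assuming only that such an editor converts A to G. It is an illustration of the outcome space, not BE-Hive's code or interface: the function name and parameters are hypothetical, and no probabilities are assigned, since estimating how likely each product is would be the job of a trained model.

```python
from itertools import product
from typing import List


def candidate_products(window: str, source: str = "A", target: str = "G") -> List[str]:
    """List every sequence reachable by converting any subset of
    `source` bases in `window` to `target` (hypothetical helper)."""
    # Each position keeps its base; a source base may also be converted.
    options = [(base, target) if base == source else (base,) for base in window]
    # The Cartesian product over positions enumerates all combinations.
    return ["".join(combo) for combo in product(*options)]


# The two As in GACA give four candidates: unedited, either A edited, or both.
outcomes = candidate_products("GACA")
print(outcomes)
```

With two editable positions there are already four candidate products, and the count doubles with every additional editable base in the window, which is one reason predicting the actual mix by hand is impractical.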

Some of BE-Hive's predictions were surprising, even to the inventor of base editors. "Sometimes," Liu said, "for reasons that our primate brains aren't sufficiently sophisticated to predict, the model could accurately tell us that even though there are two Cs right in the editing window, this particular editor will only edit the second one, for example."

BE-Hive also learned when base editors can make so-called transversion edits: Instead of changing a C to a T, some base editors changed a C to a G or an A, rare and abnormal but potentially valuable quirks. The researchers then used BE-Hive to correct 174 disease-causing transversion mutations with minimal byproducts. And, they used BE-Hive to discover unknown base editor properties, which they used to design novel tools with new capabilities, adding a few more genetic keys to the ever-growing ring.
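
The distinction between the expected C-to-T change and a transversion can be stated as a simple rule: a substitution that stays within the purines (A, G) or within the pyrimidines (C, T) is a transition, and one that crosses between the two groups is a transversion. The Python sketch below applies that standard definition; the function name is hypothetical and the code is unrelated to BE-Hive itself.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_class(ref: str, alt: str) -> str:
    """Classify a single-base DNA substitution (hypothetical helper)."""
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases:
        raise ValueError("ref and alt must each be one of A, C, G, T")
    if ref == alt:
        raise ValueError("ref and alt are identical, so nothing was substituted")
    # Both bases in the same chemical class means a transition.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


for alt in "TGA":
    print("C ->", alt, "is a", substitution_class("C", alt))
```

Under this rule, C to T is a transition, while C to G and C to A are both transversions, matching the edits described above.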