Unprecedented data sharing driving new rare disease diagnoses in Europe

Results are just the 'tip of the iceberg', according to researchers

Rare disease experts detail the first results of an unprecedented collaboration to diagnose people living with unsolved cases of rare diseases across Europe. The findings are published today in a series of six papers in the European Journal of Human Genetics.

In the main publication, an international consortium, known as Solve-RD, explains how the periodic reanalysis of genomic and phenotypic information from people living with a rare disease can boost the chance of diagnosis when combined with data sharing across European borders on a massive scale. Using this new approach, a preliminary reanalysis of data from 8,393 individuals resulted in 255 new diagnoses, some with atypical manifestations of known diseases.

Sergi Beltran and Leslie Matalonga pictured in front of a supercomputer and servers that host the RD-Connect GPAP platform. The platform is located at the CNAG-CRG facilities in the Parc Científic de Barcelona.

A complementary study describes the method in more detail and four accompanying case studies showcase the advantages of the approach. In one case study, researchers used the method to identify a new genetic form of pontocerebellar hypoplasia type 1 (PCH1), a genetic disease that affects the development of the brain. PCH1 is normally linked to mutations in four known genes. The researchers used the method to identify a new variant in a fifth gene.

In another case study, researchers used the method on an individual with a complex neurodevelopmental disorder and found the disease was caused by a new genetic variant in mitochondrial DNA. The variant had previously gone undetected because the patient did not present typical symptoms of a mitochondrial disorder. The diagnosis will help tailor treatment for the individual and inform their family members about the possibility of passing the variant on to future generations.

Key to the reanalysis of unsolved cases is the RD-Connect Genome-Phenome Analysis Platform (GPAP), which is developed, hosted, and coordinated by the Centro Nacional de Análisis Genómico (CNAG-CRG), part of the Centre for Genomic Regulation (CRG), based in Barcelona.

Officially recognized by the International Rare Diseases Research Consortium and funded by the EU, Spanish, and Catalan governments, the RD-Connect GPAP provides authorized clinicians and researchers with secure and controlled access to pseudonymized genomic data and clinical information from patients with rare diseases. The platform enables the secure, fast, and cost-effective automated reanalysis of data from the thousands of undiagnosed patients and relatives entering the Solve-RD project.

According to Sergi Beltran, co-leader of Solve-RD data analysis and Head of the Bioinformatics Unit at CNAG-CRG, "Solve-RD has shown that it is possible to securely share large amounts of genomics data internationally for the benefit of the patients. The work we are publishing today is just the tip of the iceberg since many more patients are being diagnosed thanks to the innovative methods developed and applied within Solve-RD."

An estimated 30 million people in Europe are affected by a rare disease during their lifetime. More than 70% of rare diseases have a genetic cause. However, around 50% of patients with a rare disease remain undiagnosed even in advanced expert clinical settings that use techniques such as genome sequencing.

At the same time, scientists around the world are finding an average of 250 new gene-disease associations and 9,200 variant-disease associations per year. As scientific understanding expands, reanalyzing data periodically can help people receive a diagnosis.
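The idea behind periodic reanalysis can be sketched in a few lines of Python. This is a toy illustration only, not the Solve-RD or RD-Connect GPAP pipeline: the case identifiers, the placeholder gene names (GENE_A and so on), and the `reanalyze` helper are all hypothetical. It shows how a stored case that matched nothing during the first analysis can gain a candidate later, once a new gene-disease association enters the knowledge base.

```python
# Toy sketch of periodic reanalysis; all names and data are hypothetical.

def reanalyze(unsolved_cases, known_disease_genes):
    """Return {case_id: matching genes} for cases that now have a candidate."""
    newly_flagged = {}
    for case_id, variant_genes in unsolved_cases.items():
        hits = sorted(set(variant_genes) & set(known_disease_genes))
        if hits:
            newly_flagged[case_id] = hits
    return newly_flagged

# Genes carrying rare variants in each stored, still-unsolved case.
unsolved_cases = {
    "case_001": ["GENE_A", "GENE_B"],
    "case_002": ["GENE_C"],
    "case_003": ["GENE_D"],
}

# Knowledge base at the time of the first analysis, and after an update.
knowledge_first_pass = {"GENE_X", "GENE_Y"}
knowledge_updated = knowledge_first_pass | {"GENE_C"}  # new association published

first_pass = reanalyze(unsolved_cases, knowledge_first_pass)
second_pass = reanalyze(unsolved_cases, knowledge_updated)

print(first_pass)   # no case matches the first knowledge base
print(second_pass)  # case_002 now has a candidate gene
```

The stored data never change between the two passes; only the knowledge base does, which is why rerunning the same comparison at intervals can surface new candidates.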

The consortium, which consists of more than 300 researchers and clinicians in fifteen countries and collectively sees more than 270,000 rare disease patients each year, aims to eventually diagnose more than 19,000 unsolved cases of rare diseases with an unknown molecular cause. Its preliminary findings are an important first step toward developing a Europe-wide system to facilitate the diagnosis of rare diseases, which can be a long and arduous process.

A fiery past sheds new light on the future of global climate change

Ice core samples reveal significant smoke aerosols in the pre-industrial Southern Hemisphere

Centuries-old smoke particles preserved in the ice reveal a fiery past in the Southern Hemisphere and shed new light on the future impacts of global climate change, according to new research published in Science Advances.

"Up till now, the magnitude of past fire activity, and thus the amount of smoke in the preindustrial atmosphere, has not been well characterized," said Pengfei Liu, a graduate student and postdoctoral fellow at the Harvard John A. Paulson School of Engineering and Applied Sciences (SEAS) and first author of the paper. "These results have importance for understanding the evolution of climate change from the 1750s until today, and for predicting future climate."

One of the biggest uncertainties when it comes to predicting the future impacts of climate change is how fast surface temperatures will rise in response to increases in greenhouse gases. Predicting these temperatures is complicated since it involves the calculation of competing warming and cooling effects in the atmosphere. Greenhouse gases trap heat and warm the planet's surface while aerosol particles in the atmosphere from volcanoes, fires, and other combustion cool the planet by blocking sunlight or seeding cloud cover. Understanding how sensitive surface temperature is to each of these effects and how they interact is critical to predicting the future impact of climate change.

Many of today's climate models rely on past levels of greenhouse gases and aerosols to validate their predictions for the future. But there's a problem: while pre-industrial levels of greenhouse gases are well documented, the amount of smoke aerosols in the pre-industrial atmosphere is not.

To model smoke in the pre-industrial Southern Hemisphere, the research team looked to Antarctica, where the ice trapped smoke particles emitted from fires in Australia, Africa, and South America. Ice core scientists Joseph McConnell and Nathan Chellman of the Desert Research Institute in Nevada, both co-authors of the study, measured soot, a key component of smoke, deposited in an array of 14 ice cores from across the continent, many provided by international collaborators.

"Soot deposited in glacier ice directly reflects past atmospheric concentrations so well-dated ice cores provide the most reliable long-term records," said McConnell.

What they found was unexpected.

"While most studies have assumed less fire took place in the preindustrial era, the ice cores suggested a much fierier past, at least in the Southern Hemisphere," said Loretta Mickley, Senior Research Fellow in Chemistry-Climate Interactions at SEAS and senior author of the paper.

To explain these levels of smoke, the researchers ran supercomputer simulations that account for both wildfires and the burning practices of indigenous people.

"The computer simulations of fire show that the Southern Hemisphere atmosphere could have been very smoky in the century before the Industrial Revolution. Soot concentrations in the atmosphere were up to four times greater than previous studies suggested. Most of this was caused by widespread and regular burning practiced by indigenous peoples in the pre-colonial period," said Jed Kaplan, Associate Professor at the University of Hong Kong and co-author of the study.

This result agrees with the ice core records that show that soot was abundant before the start of the industrial era and remained relatively constant through the 20th century. The modeling suggests that as land-use changes decreased fire activity, emissions from industry increased.

What does this finding mean for future surface temperatures?

By underestimating the cooling effect of smoke particles in the pre-industrial world, climate models might have overestimated the warming effect of carbon dioxide and other greenhouse gases in order to account for the observed increases in surface temperatures.
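This reasoning can be made concrete with a zero-dimensional energy-balance sketch, in which the observed warming equals a sensitivity parameter multiplied by the net change in forcing since pre-industrial times. The numbers below are illustrative assumptions, not values from the study, and `implied_sensitivity` is a hypothetical helper. If the pre-industrial atmosphere was already smoky, the aerosol cooling added since then is smaller, the net forcing is larger, and a lower sensitivity is enough to explain the same observed warming.

```python
# Zero-dimensional energy-balance sketch; all numbers are illustrative assumptions.

def implied_sensitivity(observed_warming_k, ghg_forcing, aerosol_forcing_change):
    """Sensitivity (K per W/m^2) needed to match the observed warming.

    aerosol_forcing_change is the change in aerosol forcing since the
    pre-industrial baseline (negative means added cooling).
    """
    net_forcing = ghg_forcing + aerosol_forcing_change
    return observed_warming_k / net_forcing

observed_warming_k = 1.0   # assumed warming since pre-industrial, in K
ghg_forcing = 3.0          # assumed greenhouse gas forcing, in W/m^2

# Clean pre-industrial baseline: today's aerosols look like a large added cooling.
clean_baseline = implied_sensitivity(observed_warming_k, ghg_forcing, -1.0)

# Smoky pre-industrial baseline: the added aerosol cooling since then is smaller.
smoky_baseline = implied_sensitivity(observed_warming_k, ghg_forcing, -0.5)

print(f"clean baseline: {clean_baseline:.2f} K per W/m^2")
print(f"smoky baseline: {smoky_baseline:.2f} K per W/m^2")
```

With these assumed inputs, the smoky baseline yields the lower implied sensitivity, which is the direction of the correction described in the article; the size of the real effect depends on quantities this sketch does not model.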

"Climate scientists have known that the most recent generation of climate models have been over-estimating surface temperature sensitivity to greenhouse gasses, but we haven't known why or by how much," said Liu. "This research offers a possible explanation."

"Clearly the world is warming but the key question is how fast will it warm as greenhouse gas emissions continue to rise. This research allows us to refine our predictions moving forward," said Mickley.

Key early steps in gene expression captured in real time by CSU researchers

Capturing how RNA polymerase enzymes kick off transcription

On scales too small for our eyes to see, the business of life happens through the making of proteins, which impart to our cells both structure and function. Cellular proteins get their marching orders from genetic instructions encoded in DNA, whose sequences are first copied and made into RNA in a multi-step process called transcription.

A research collaboration at Colorado State University specializes in high-resolution fluorescence microscopy and computational modeling to visualize and describe such stuff-of-life processes in exquisite detail, in real time, at the level of single genes. Now, scientists led by postdoctoral researcher Linda Forero-Quintero have, for the first time, observed early RNA transcription dynamics by recording where, when, and how RNA polymerase enzymes kick off transcription by binding to a DNA sequence.

The breakthrough technology has countless possible offshoots, ranging from sharpening understanding of basic biological processes to unveiling the genetic underpinnings of certain diseases.

"This is the first time someone has looked at RNA polymerase phosphorylation dynamics in a single-copy gene," said Forero, who is a postdoctoral researcher co-advised by Tim Stasevich, Monfort Professor and associate professor in biochemistry, and Brian Munsky, associate professor in chemical and biological engineering. In the past, such early transcription activity could only be visualized using gene arrays, which are artificial structures composed of hundreds of copies of a gene and not commonly found in the cell nucleus.

Stasevich and Munsky lead a collaboration funded by the W.M. Keck Foundation and the National Institute of General Medical Sciences (through two Maximizing Investigators' Research Awards) that's seeking to unveil and quantify real-time genetic expression in living, single cells. Forero, who works in both labs under the auspices of the collaboration, had previously studied proteins and transporters in cell membranes associated with neurological conditions.

Early transcription activity

Forero et al. designed a method using an established mammalian cell line, engineered fluorescent antibody fragments, and a custom super-resolution microscope to capture the process of early transcription in vivid colors: blue, green, and red. More specifically, they observed the start of the transcription cycle, which happens when the RNA polymerase II (RNAP2) transcription enzyme becomes phosphorylated, or decorated with phosphate groups, on its amino acid tail.

"The interdisciplinary science here is a fantastic blending of new experimental capabilities and a new approach for mechanistic computational modeling of single-cell dynamics, both of which are very novel in their respective fields," said Munsky, who supervises the computational aspects of the collaboration.

In the lab, the researchers loaded their antibody fragments into an established mammalian cell line containing a reporter gene that, when transcribed, is lit up by a fluorescently tagged protein. The antibody fragments, which Stasevich helped develop several years ago, are tagged with fluorescent molecules that light up their specific targets in the RNAP2 tail. Using these tagging technologies together, the researchers could distinguish three distinct steps in the transcription cycle, marked by different colors. The images obtained with this system translate into fluctuations in fluorescence intensity. The researchers then used those signals to interpret the spatiotemporal organization of RNAP2 phosphorylation throughout the transcription cycle at a single-copy gene.

New information via a computational model

Munsky's team, led by graduate student William Raymond, took Forero and Stasevich's microscopy data and translated them into a computational model based on stochastic differential equations. After fitting this statistical model to reproduce all the experimental results, the computational team extended its analyses to glean new mechanistic and quantitative information about the different molecules and their states through the transcription process.

For example, they estimated how many individual RNA polymerase molecules collect to form transient clusters in the region of the DNA's promoter, how long these clusters persist, and how, when, and where the polymerases distribute themselves along the DNA. They found that each burst of transcription activity produces a cluster of between five and 40 RNA polymerases around the promoter region of the gene, of which 46% eventually succeed in transcribing RNA. They also found that each RNA takes approximately five minutes to be fully transcribed and processed prior to release.
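As a back-of-the-envelope check on these figures, the sketch below works out how many completed transcripts a single burst would yield if each polymerase in a cluster succeeded independently with probability 0.46. Independence and the `expected_transcripts` helper are simplifying assumptions made here for illustration; this is not the stochastic model fitted by the CSU team.

```python
# Back-of-the-envelope arithmetic from the reported figures.
# Assumption: each polymerase in a cluster succeeds independently.

SUCCESS_FRACTION = 0.46           # reported fraction that go on to transcribe RNA
MIN_CLUSTER, MAX_CLUSTER = 5, 40  # reported range of cluster sizes

def expected_transcripts(cluster_size, success_fraction=SUCCESS_FRACTION):
    """Expected number of polymerases in a cluster that complete an RNA."""
    return cluster_size * success_fraction

low = expected_transcripts(MIN_CLUSTER)
high = expected_transcripts(MAX_CLUSTER)

print(f"smallest cluster: about {low:.1f} transcripts per burst")
print(f"largest cluster: about {high:.1f} transcripts per burst")
```

Under that assumption, a burst would be expected to yield roughly 2 to 18 completed RNAs, depending on cluster size.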

Forero says the technology has far-reaching potential, especially when combined with newer technologies like CRISPR, in which specific genes can be singled out and manipulated. Choosing a certain gene of interest, say one implicated in a disease, and applying the CSU researchers' real-time readout of the transcription cycle could then allow researchers to watch disease processes happening at the activity level of single genes.

"The ability to resolve the spatial and temporal dynamics of the transcription cycle, in one gene, is the most exciting aspect of this work," Forero said.