NOAA scientists unravel positional, structural errors in numerical weather forecast models

Due to the chaotic nature of the atmosphere, weather forecasts, even with ever-improving numerical weather prediction models, eventually lose all skill. Meteorologists want to understand this loss of skill better so they can trace forecast error back to observational gaps and find ways to improve their models.

Root mean square (rms) error, or its square, the variance distance, is often used to measure differences between simulated and observed fields. In this case, scientists measured the distance between a gridded model forecast field and the verifying analysis field that represents all real-world observations. However, atmospheric features like fronts and pressure systems are three-dimensional structures that forecast models both displace and structurally distort as the forecast moves further from its initialization time. Variance or rms error metrics quantify neither the displacement nor the distortion of weather systems.
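The variance distance and rms error described above can be computed directly on gridded fields. A minimal sketch in Python, where the two fields are synthetic stand-ins for a forecast and its verifying analysis:

```python
import numpy as np

# Synthetic stand-ins for a gridded forecast and its verifying analysis
# (e.g. mean sea level pressure on a lat-lon grid); values are illustrative.
rng = np.random.default_rng(0)
analysis = rng.normal(1013.0, 8.0, size=(90, 180))
forecast = analysis + rng.normal(0.0, 2.0, size=analysis.shape)  # forecast with ~2 hPa error

diff = forecast - analysis
variance_distance = np.mean(diff ** 2)  # mean squared difference between the fields
rms_error = np.sqrt(variance_distance)  # root mean square error
```

As the article notes, a single number like `rms_error` says nothing about whether the mismatch comes from features being in the wrong place or having the wrong shape.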

In a recently published paper in Advances in Atmospheric Sciences, a team of scientists with the National Oceanic and Atmospheric Administration (NOAA), the Massachusetts Institute of Technology (MIT), and the University of Connecticut set out to find a general approach to assess the positional and structural components of the total difference between two fields. Essentially, meteorologists want to assess the accuracy of many different weather features within a model forecast compared to a verifying analysis based on real-world observations.

Sai Ravela from MIT, a co-author of this study, previously developed a Field Alignment method. Here, his approach smoothly aligns the model forecast field with the observationally based analysis so that their difference is minimized (Step 1 in the schematic diagram; see also the example map). Next, small-scale errors of uncertain origin are removed from all three fields (the original and aligned forecasts as well as the verifying analysis, the proxy for observations) through a process called spatial filtering, or smoothing (Step 2). The total variance distance, or difference, is then partitioned into three unique components (Step 3): positional error, the variance distance between the smoothed original forecast and the smoothed aligned forecast; structural error, the variance distance between the smoothed aligned forecast and the smoothed verifying analysis; and fine-scale noise, the uncertain small-scale errors removed from the original forecast and verifying analysis fields. Positional and structural error form the two sides of the right-angle triangle in Fig. 1, while the fine-scale noise is indicated by the smoothing arrows orthogonal to the triangle.

Fig. 1. Schematic for total forecast error reduction: (1) spatially align a forecast with the verifying analysis field; (2) smooth the original and aligned forecasts and the analysis to remove unpredictable scales; (3) decompose total error into orthogonal (right-angle) components of (i) large-scale positional error, (ii) large-scale structural error, and (iii) small-scale noise. Credit: Isidora Jankov
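The three-step decomposition can be sketched numerically. The sketch below is a toy illustration only: the real Field Alignment step is replaced by undoing a known shift, and the spatial filter is a crude moving average, so it mimics the structure of the method rather than reproducing it:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def smooth(field, k=5):
    # Step 2 stand-in: a crude k x k moving average that removes
    # small, largely unpredictable scales from a 2-D field.
    pad = k // 2
    padded = np.pad(field, pad, mode="edge")
    return sliding_window_view(padded, (k, k)).mean(axis=(-2, -1))

def var_dist(a, b):
    # "Variance distance": mean squared difference between two fields.
    return np.mean((a - b) ** 2)

# Toy fields: the forecast misplaces the analysis by a known 3-column
# shift, and `aligned` simply undoes that shift, standing in for the
# output of the Field Alignment method (Step 1).
rng = np.random.default_rng(1)
analysis = np.cumsum(rng.normal(size=(60, 60)), axis=1)   # large-scale "truth"
forecast = np.roll(analysis, 3, axis=1) + rng.normal(0, 0.2, (60, 60))
aligned = np.roll(forecast, -3, axis=1)                   # Step 1 stand-in

# Step 3: partition the difference into the three components.
positional = var_dist(smooth(forecast), smooth(aligned))   # displacement part
structural = var_dist(smooth(aligned), smooth(analysis))   # distortion part
noise = var_dist(forecast, smooth(forecast)) + var_dist(analysis, smooth(analysis))
total = var_dist(forecast, analysis)
```

In the paper's framework the three components are orthogonal and sum to the total variance distance; in this crude sketch the partition is only approximate, but the displacement term dominates by construction.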

This method outputs the three orthogonal error components as scalar fields, along with a vector field (Fig. 2) indicating the large-scale displacement of the forecast relative to the observational analysis. Interestingly, across all regions and lead times the team studied, more than half of the total error variance is associated with the misplacement of weather features. Displacement is therefore more pronounced than distortion in forecast fields: only about 25% of the error variance is associated with structural inaccuracies of the partially predictable features, such as fronts and low-pressure systems. The rest of the error variance remains unexplained, unpredictable variability, or noise.

Fig. 2. A 3.5-day forecast (black contours) and verifying analysis (color shading) of mean sea level pressure for Hurricane Katia, valid at 12 UTC 6 September 2011. Moving the forecast along the blue arrows aligns it with the observational analysis. Credit: Isidora Jankov

"How noise grows in error variance as a function of forecast lead time, and whether a positional-structural-noise decomposition of the spread among an ensemble of perturbed forecasts captures forecast error components is the subject of ongoing studies," said Dr. Jankov from NOAA, the lead author of the study.

Raytheon BBN Technologies harnesses quantum’s ‘noise problem’

Scientists at Raytheon BBN Technologies have developed a new way to detect a single photon, or particle of light, a development with big applications for sensors, communications, and exponentially more powerful quantum computer processors.

The team has published its work, which centers on the use of a component called a Josephson junction, in the academic journal Science. The discovery builds on the same team’s previous research into a microwave radiation detector 100,000 times more sensitive than existing systems.

“A Josephson junction in quantum computing is analogous to a transistor for modern electronics, so they are super important,” said Kin Chung Fong, a quantum information processing scientist at Raytheon BBN Technologies and a research associate at Harvard University. “Our new device enables this basic unit in quantum computing to communicate through as little as one photon. It will improve the speed of communication and can make quantum networking and sensing possible.”

This illustration depicts a newly developed component, known as a Josephson junction, that can detect a single photon of light. The research, led by Raytheon Intelligence & Space, has potential applications for sensors, communications and quantum computers.

Researchers and labs around the world have started building larger quantum computers, seeking to unlock the promise of faster processing.

“In theory, quantum computers can take over where traditional computers would run out of processing power,” said Brad Tousley, president of Raytheon BBN Technologies. “Quantum computers are particularly good at solving critical optimization problems. One example would be the computer-aided design of a large system like an aircraft. Quantum computing allows for a finer analysis of something like a wing shape than ever before. Fundamental everyday optimization of this kind is the first problem we’d like to tackle with quantum computing.”

The technical limitation has been the background noise that causes qubits to lose memory, creating errors in the processing. While other researchers see the noise as a problem, Fong and his team see opportunity.

Their method works a little like a highway, where superconducting charges play the role of cars. In principle, they can move very fast without bumping into each other. Background noise is like a broken-down car in the center lane – it breaks the flow of traffic.

“The interruption could destroy the data in quantum computing applications,” Fong said. “However, we can utilize this same phenomenon to detect a single photon, allowing the traffic to continue to speed along.”

The discovery is part of a research effort at Raytheon BBN Technologies, a subsidiary of Raytheon Intelligence & Space. Raytheon BBN has been providing advanced technology research and development for more than 70 years, often serving as a crucial link between the military and researchers at universities. As an example, it was one of the first nodes in the ARPANET, the precursor of the internet funded by the Defense Advanced Research Projects Agency, or DARPA. Scientists at Raytheon BBN work across broad-reaching portfolios, and quantum engineering and supercomputing continue to show promise for next-generation capabilities.

“This discovery is going to open up quantum processors to be connected like never before,” Tousley said. “The next step is characterizing performance and scaling up to more than one device in parallel or linking multiple devices.”

The Raytheon BBN team believes they have the systems engineering expertise to take this basic research to more practical applications.

“We’ve filled a technological void with the first Josephson junction to detect a single photon,” said Fong. “It’s an enabling technology for networking, communication, and computation. We are really just scratching the surface.”

Cambridge researchers shine a light on how federated learning evolves towards being environmentally friendly

Training the artificial intelligence models that underpin web search engines, smart assistants, and driverless cars consumes megawatts of energy and generates worrying carbon dioxide emissions. But new ways of training these models are proving to be greener.

Artificial intelligence models are used increasingly widely in today's world. Many carry out natural language processing tasks - such as language translation, predictive text, and email spam filters. They are also used to empower smart assistants such as Siri and Alexa to 'talk' to us, and to operate driverless cars.

But to function well these models have to be trained on large sets of data, a process that includes carrying out many mathematical operations for every piece of data they are fed. And the data sets they are being trained on are getting ever larger: one recent natural language processing model was trained on a data set of 40 billion words.

As a result, the energy consumed by the training process is soaring. Most AI models are trained on specialized hardware in large data centers. According to a recent paper in the journal Science, the total amount of energy consumed by data centers made up about 1% of global energy use over the past decade - roughly the consumption of 18 million US homes. And in 2019, a group of researchers at the University of Massachusetts estimated that training one large AI model used in natural language processing could generate around the same amount of CO2 emissions as five cars would generate over their total lifetime.

Concerned by this, researchers at the University of Cambridge set out to investigate more energy-efficient approaches to training AI models. Working with collaborators at the University of Oxford, University College London, and Avignon Université, they explored the environmental impact of a different form of training - called federated learning - in which models are trained across a large number of individual machines instead of in data centers. They found that this can lead to lower carbon emissions than traditional learning.

Dr. Nic Lane, a Senior Lecturer, explains how it works when training is performed not inside large data centers but across thousands of mobile devices - such as smartphones - where the data is usually collected by the phone users themselves.

"An example of an application currently using federated learning is the next-word prediction in mobile phones," he says. "Each smartphone trains a local model to predict which word the user will type next, based on their previous text messages. Once trained, these local models are then sent to a server. There, they are aggregated into a final model that will then be sent back to all users."

And this method has important privacy benefits as well as environmental benefits, points out Dr. Pedro Porto Buarque De Gusmao, a postdoctoral researcher working with Dr. Lane.

"Users might not want to share the content of their texts with a third party," he explains. "In federated learning, we can keep data local and use the collective power of millions of mobile devices together to train AI models without users' raw data ever leaving the phone."

"And besides these privacy-related gains," says Dr. Lane, "in our recent research, we have shown that federated learning can also have a positive impact in reducing carbon emissions.

"Although smartphones have much less processing power than the hardware accelerators used in data centers, they don't require as much cooling power as the accelerators do. That's the benefit of distributing the training of models across a wide pool of devices."

The researchers recently co-authored a paper on this called 'Can Federated Learning save the planet?' and will be discussing their findings at an international research conference, the Flower Summit 2021, on 11 May.

In their paper, they offer the first-ever systematic study of the carbon footprint of federated learning. They measured the carbon footprint of a federated learning setup by training two models - one in image classification, the other in speech recognition - using a server and two chipsets popular in the simple devices targeted by federated methods. They recorded the energy consumption during training, and how it might vary depending on where in the world the chipsets and server were located.

They found that while CO2 emission factors differ among countries, federated learning under many common application settings was reliably 'cleaner' than centralized training.

Training a model to classify images in a large image dataset, they found that any federated learning setup in France emitted less CO2 than any centralized setup in either China or the US. And in training the speech recognition model, federated learning was more efficient than centralized training in every country.

Such results are further supported by an expanded set of experiments in a follow-up study ('A first look into the carbon footprint of federated learning') by the same lab, which explores an even wider variety of data sets and AI models. That research also provides the beginnings of the formalism and algorithmic foundations needed to achieve even lower carbon emissions from federated learning in the future.

Based on their research, the researchers have made available a first-of-its-kind 'Federated Learning Carbon Calculator' so that the public and other researchers can estimate how much CO2 is produced by any given pool of devices. It allows users to detail the number and type of devices they are using, which country they are in, which datasets and upload/download speeds they are using and the number of times each device will train on its own data before sending its model for aggregation.
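A back-of-the-envelope version of such a calculator multiplies the energy used per training round by a country-specific grid emission factor. The sketch below uses assumed, illustrative emission factors and device energy figures; it is not the Cambridge tool and takes no values from it:

```python
# Assumed grid emission factors in kg CO2 per kWh (illustrative only).
EMISSION_FACTOR_KG_PER_KWH = {"France": 0.056, "USA": 0.38, "China": 0.55}

def federated_co2_kg(n_devices, device_kwh_per_round, rounds, country,
                     comm_kwh_per_round=0.0):
    # Total energy = per-round on-device training energy plus any
    # communication overhead, converted to kg CO2 with the country's
    # grid emission factor.
    energy_kwh = rounds * (n_devices * device_kwh_per_round + comm_kwh_per_round)
    return energy_kwh * EMISSION_FACTOR_KG_PER_KWH[country]

# 1,000 phones, 100 training rounds, each round costing 2 Wh per phone.
co2 = federated_co2_kg(n_devices=1000, device_kwh_per_round=0.002,
                       rounds=100, country="France")
```

The country term is why the paper's France-versus-US comparisons matter: the same training workload maps to very different emissions depending on the local grid.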

They also offer a similar calculator for estimating the carbon emissions of centralized machine learning.

"The development and usage of AI are playing an increasing role in the tragedy that is climate change," says Dr. Lane, "and this problem will only worsen as this technology continues to proliferate through society. We urgently need to address this which is why we are keen to share our findings showing that federated learning methods can produce less CO2 than data centers under important application scenarios.

"But even more importantly, our research also shines a light as to how federated learning should evolve towards being even more broadly environmentally friendly. Decentralized methods like this will be key in the invention of future sustainable forms of AI in the years ahead."