Data assimilation methods

Introduction

Data assimilation exploits our knowledge of the uncertainties in both the forecast model and the observations. We seek an adjusted forecast that gives the best fit to observations, spanning the past six hours for the global forecast and the past three hours for the UK forecast, while also respecting the laws of physics.

An iterative process keeps adjusting the forecast so that the fit continues to improve until a convergence criterion has been met. Operational forecast models use about a billion variables but typically assimilate only about ten million observations. Four-dimensional variational data assimilation (4DVar), so called because it involves three spatial dimensions and time, uses covariance matrices to weight the previous forecast, which contains information about past observations, against recent observations.
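
As a minimal sketch of this iterative fitting, the toy Python example below minimises a quadratic cost function that penalises departures from the previous forecast (weighted by a covariance matrix B) and from the observations (weighted by a covariance matrix R). The two-variable state, the matrices, the steepest-descent minimiser and every name in the code are illustrative assumptions, not the operational scheme. The time dimension is omitted, so this is closer to a three-dimensional than a four-dimensional variational analysis.

```python
import numpy as np

def minimise_cost(xb, B, y, H, R, tol=1e-10, max_iter=1000):
    """Steepest descent on the quadratic cost
    J(x) = 0.5 (x - xb)^T B^-1 (x - xb) + 0.5 (y - H x)^T R^-1 (y - H x).
    Returns the analysis and the number of iterations performed."""
    B_inv = np.linalg.inv(B)
    R_inv = np.linalg.inv(R)
    hessian = B_inv + H.T @ R_inv @ H  # constant because J is quadratic
    x = xb.copy()
    n_iter = 0
    for n_iter in range(max_iter):
        grad = B_inv @ (x - xb) - H.T @ R_inv @ (y - H @ x)
        if np.linalg.norm(grad) < tol:  # convergence criterion
            break
        step = (grad @ grad) / (grad @ hessian @ grad)  # exact line search
        x = x - step * grad
    return x, n_iter

# Hypothetical toy problem: two state variables, one observation of the first.
xb = np.array([10.0, 5.0])              # previous forecast
B = np.array([[1.0, 0.5], [0.5, 1.0]])  # forecast error covariance
H = np.array([[1.0, 0.0]])              # observation operator
y = np.array([11.0])                    # observation
R = np.array([[0.25]])                  # observation error covariance

analysis, n_iter = minimise_cost(xb, B, y, H, R)
print(analysis, n_iter)
```

Because the forecast errors of the two variables are correlated in B, the second variable is adjusted even though only the first is observed.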

Covariance matrices describe how the uncertainties in different quantities are correlated, allowing us to give greater weight to more-accurate data and also to spread observational information between different atmospheric variables. For example, a temperature observation can also be used to adjust our estimate of the wind. Operational forecast models involve too many pieces of information for all the inter-relationships to be accounted for explicitly. Instead, 4DVar models the correlations using physics principles in the form of an unchanging covariance matrix.
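
The weighting and spreading described above can be illustrated with a small Python sketch. It assumes a two-variable state (temperature and wind), a single temperature observation, and the standard linear minimum-variance update; the numbers and the function name analysis_increment are hypothetical and chosen only for illustration.

```python
import numpy as np

def analysis_increment(B, H, R, innovation):
    """Increment B H^T (H B H^T + R)^-1 d for innovation d = y - H xb."""
    gain = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)
    return gain @ innovation

# Assumed forecast error covariance for [temperature, wind]:
# the off-diagonal term links temperature errors to wind errors.
B = np.array([[1.0, 0.6],
              [0.6, 2.0]])
H = np.array([[1.0, 0.0]])    # only temperature is observed
innovation = np.array([1.5])  # observation minus forecast temperature

# A less accurate observation (error variance 0.5) ...
inc_coarse = analysis_increment(B, H, np.array([[0.5]]), innovation)
# ... and a more accurate one (error variance 0.1).
inc_precise = analysis_increment(B, H, np.array([[0.1]]), innovation)

print(inc_coarse)   # approximately [1.0, 0.6]
print(inc_precise)  # larger adjustments to both temperature and wind
```

In both cases the wind is adjusted although no wind observation is used, and the more accurate observation pulls the analysis further from the forecast.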

Hybrid 4DVar

A drawback of 4DVar is that the forecast covariance matrix does not take account of the day-to-day weather characteristics. For example, we would expect correlations to be stretched along a weather front. One way to include this flow-dependence is to use ensemble forecasting to represent the forecast uncertainty. We randomly perturb the previous forecast several times to produce a set of possible initial atmospheric states, each of which is evolved in time using the forecast model. The ensemble is then used to construct a flow-dependent covariance matrix. In practice, the ensemble size is limited by our supercomputing capacity, so a hybrid approach is used that blends the unchanging covariance matrix of traditional 4DVar with the ensemble covariance matrix. Hybrid 4DVar is used operationally for global forecasting at the Met Office.
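
The steps in this paragraph can be sketched in Python as follows. The toy forecast model, the ensemble size of 20, the equal blending weight and all function names are assumptions made for illustration; operational systems also apply refinements, such as localisation of the ensemble covariance, that are omitted here.

```python
import numpy as np

def ensemble_covariance(members):
    """Sample covariance; rows are ensemble members, columns are variables."""
    perturbations = members - members.mean(axis=0)
    return perturbations.T @ perturbations / (members.shape[0] - 1)

def hybrid_covariance(B_static, P_ens, ens_weight):
    """Weighted blend of the unchanging and the ensemble covariance."""
    return (1.0 - ens_weight) * B_static + ens_weight * P_ens

def toy_model(x):
    """Hypothetical stand-in for the forecast model."""
    return np.array([x[0] + 0.1 * x[1], 0.9 * x[1]])

rng = np.random.default_rng(0)
previous_forecast = np.array([10.0, 5.0])
B_static = np.eye(2)  # assumed unchanging covariance matrix

# Randomly perturb the previous forecast to create the initial states.
initial_states = previous_forecast + rng.multivariate_normal(
    np.zeros(2), B_static, size=20)
# Evolve each member in time with the (toy) forecast model.
evolved = np.array([toy_model(x) for x in initial_states])

P_ens = ensemble_covariance(evolved)
B_hybrid = hybrid_covariance(B_static, P_ens, ens_weight=0.5)
print(B_hybrid)
```

With only 20 members the sample covariance is noisy, which is one reason for blending it with the unchanging matrix rather than using it alone.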

Ensemble 4DVar

Another feature of 4DVar is its use of a linear approximation to the nonlinear forecast model to evolve the initial forecast uncertainty over the time window. You can think of the nonlinear model as being like the curvature of the Earth. On a calm day, a region of the ocean appears flat to someone on a boat. Another boat sailing towards the horizon will nevertheless eventually disappear due to the Earth's curvature, showing the limitations of the flat-Earth approximation. Likewise, the linear approximation reasonably describes the difference between the adjusted and original forecasts for a given period of time, but eventually it becomes less accurate.
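
To make the limits of the linear approximation concrete, the Python sketch below compares a nonlinear toy model with its linearisation about a reference trajectory. The logistic map is used purely as a stand-in for a nonlinear forecast model; the parameter value, starting point and perturbation size are arbitrary assumptions.

```python
import numpy as np

R_GROWTH = 3.9  # hypothetical parameter of the toy logistic map

def nonlinear_step(x):
    """One step of the nonlinear toy model."""
    return R_GROWTH * x * (1.0 - x)

def tangent_linear_step(x_ref, dx):
    """Evolve a perturbation dx with the derivative of the model at x_ref."""
    return R_GROWTH * (1.0 - 2.0 * x_ref) * dx

def linearisation_error(x0, dx0, n_steps):
    """Absolute gap between the true (nonlinear) difference of two runs
    and the linear estimate of that difference, after each step."""
    x_ref, x_pert, dx = x0, x0 + dx0, dx0
    errors = []
    for _ in range(n_steps):
        dx = tangent_linear_step(x_ref, dx)  # linearised about the reference
        x_ref = nonlinear_step(x_ref)
        x_pert = nonlinear_step(x_pert)
        errors.append(abs((x_pert - x_ref) - dx))
    return np.array(errors)

errors = linearisation_error(x0=0.3, dx0=0.01, n_steps=10)
for step, err in enumerate(errors, start=1):
    print(step, err)
```

After one step the gap is only the second-order term, R_GROWTH * dx0**2, so the linear estimate is accurate; printing the later steps shows how the gap develops as the window lengthens.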

Four-dimensional ensemble variational data assimilation (4DEnVar) computes the evolving uncertainty directly from the ensemble, replacing the linear approximation used in 4DVar. This has the benefit of reducing the maintenance costs of the data assimilation software while also being more efficient to run on supercomputers with increasing numbers of processors.

Each ensemble member can itself be updated by assimilating observations, which have also been randomly perturbed. This system is referred to as En-4DEnVar.
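
The idea of updating every member with its own randomly perturbed observations can be sketched in Python as below. For brevity the sketch swaps in a simple linear analysis update for the 4DEnVar minimisation, and the covariance matrices, ensemble size and names are all assumptions rather than features of the operational system.

```python
import numpy as np

def update_members(members, B, H, R, y, rng):
    """Update each member using observations perturbed with noise drawn
    from the observation error covariance R."""
    gain = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)
    obs_noise = rng.multivariate_normal(
        np.zeros(len(y)), R, size=members.shape[0])
    analyses = np.empty_like(members)
    for i, member in enumerate(members):
        perturbed_obs = y + obs_noise[i]  # a different draw for each member
        analyses[i] = member + gain @ (perturbed_obs - H @ member)
    return analyses

rng = np.random.default_rng(1)
B = np.array([[1.0, 0.5], [0.5, 1.0]])  # assumed forecast error covariance
H = np.array([[1.0, 0.0]])              # only the first variable is observed
R = np.array([[0.25]])                  # assumed observation error covariance
y = np.array([12.0])                    # the unperturbed observation

# Hypothetical 50-member forecast ensemble centred on [10, 5].
members = np.array([10.0, 5.0]) + rng.multivariate_normal(
    np.zeros(2), B, size=50)
analyses = update_members(members, B, H, R, y, rng)

print(members.mean(axis=0), analyses.mean(axis=0))
```

Perturbing the observations keeps the spread of the updated ensemble consistent with the observation uncertainty, rather than collapsing every member towards the same observed value.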

Current Projects

  • Developing En-4DEnVar as a candidate operational ensemble initialisation system for the Met Office Global and Regional Prediction System (MOGREPS).
  • Exploring the benefits of more-frequent cycling of global data assimilation.
  • Improving our stationary covariance modelling system.
  • Improving hybrid 4DVar.