Topic 5: Machine learning in model parametrizations and transfer functions


Machine learning (ML) workflows and the development of AI methods carry great potential to optimize the parametrization, post-processing, analysis and validation of Earth system models. Topic 5 specifically explores the potential of AI approaches to enable or improve the representation of computationally demanding model subsystems. This includes: (1) multiscale model parametrization by seamlessly linking physiographical input data with model parameters in parameter transfer models (Samaniego et al., 2010, 2018); (2) the replacement of numerically expensive stratospheric chemistry ODE solvers with ML techniques, thus expanding the existing ML prototype SWIFT (Kreyling et al., 2017). We will also develop the framework for the longer-term perspective of using the ML approach for a number of model subsystems, including atmospheric chemistry, clouds and convection, land surface processes and other complex subsystems for which working but slow process models exist.
Topic 5 will exploit synergies with the Helmholtz Analytics Framework (HAF) project and, in particular, investigate how to apply massively parallel numerical methods from HAF's HeAT framework (Helmholtz Analytics Toolkit). DLR, FZJ and KIT are involved in the HAF developments. Other synergies are foreseen with doctoral projects from the HDS-LEE graduate school (Helmholtz School for Data Science in Life, Earth and Energy). These collaborations will enable the exploitation of optimized tensor kernel libraries as basic building blocks for AI methods in advanced analysis and validation concepts, such as multi-dimensional/multi-variate correlation analysis, principal component analysis (PCA), empirical orthogonal functions (EOFs), vine copulas and Granger causality.

Multiscale model parametrization

In this work package, we investigate a variety of machine learning (ML) techniques to modify pedo-transfer functions (PTFs) so that simulations of land surface models become scale-independent and flux-matching. PTFs map soil texture properties to the parameters of the underlying land surface model (LSM). We employ the Multiscale Parameter Regionalization (MPR) technique [Samaniego et al.], which takes soil texture information as input and uses PTFs to calculate the LSM parameters. Our aim is to modify these PTFs through a variety of ML-based techniques such as Function Space Optimization (FSO) [Feigl et al.], neural networks and random forests. We use the Helmholtz Analytics Toolkit (HeAT) [Goetz et al.] to implement these ML techniques. To demonstrate their viability, the setup (the combination of ML techniques and MPR) is tested with the land surface model HTESSEL, which ECMWF uses operationally.
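The MPR idea of applying a transfer function at the resolution of the input data and then aggregating to the model grid can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the linear form of `ptf_porosity`, its coefficients, the synthetic sand and clay fields, and the choice of a harmonic mean as the upscaling operator. It uses plain NumPy rather than HeAT and is not the project's implementation.

```python
import numpy as np

def ptf_porosity(sand, clay, coeffs):
    """Hypothetical linear pedo-transfer function (illustrative form only)."""
    a, b, c = coeffs
    return a + b * sand + c * clay

def upscale_harmonic(field, factor):
    """Aggregate a fine 2-D field to a coarser grid using the harmonic
    mean over non-overlapping factor x factor blocks."""
    ny, nx = field.shape
    blocks = field.reshape(ny // factor, factor, nx // factor, factor)
    return 1.0 / np.mean(1.0 / blocks, axis=(1, 3))

# Synthetic fine-resolution soil texture fractions (not real data)
rng = np.random.default_rng(0)
sand = rng.uniform(0.1, 0.8, size=(8, 8))
clay = rng.uniform(0.05, 0.2, size=(8, 8))

# Illustrative coefficients; in an ML-based setup these (or the
# functional form itself) would be the quantities being optimized
coeffs = (0.5, -0.1, 0.2)

fine = ptf_porosity(sand, clay, coeffs)   # parameter field at input resolution
coarse_2 = upscale_harmonic(fine, 2)      # 4 x 4 model grid
coarse_4 = upscale_harmonic(fine, 4)      # 2 x 2 model grid

print(coarse_2.shape, coarse_4.shape)
```

Because the transfer function is evaluated before aggregation, the same coefficients yield parameter fields for any target resolution, which is what makes the coefficients themselves transferable across scales in this sketch.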


With SWIFT-AI, we explore ML methods and apply them to stratospheric ozone chemistry. Our model is intended to replace the detailed chemistry schemes of chemistry and transport models (CTMs), or to be coupled to general circulation models (GCMs). Modelling the Earth system is a complex task, and models usually contain a large number of submodules and parameterizations, for example for the atmosphere, hydrosphere, solid Earth and ice sheets. Atmospheric chemistry in particular usually involves dozens of species and hundreds of reactions with a wide range of concentrations and lifetimes. SWIFT-AI is a data-driven ML approach that predicts stratospheric ozone tendencies. The training data are taken from stratospheric chemistry simulations with the Lagrangian CTM ATLAS.
The ozone tendencies and 55 parameters are stored at each model point and time step, so supervised learning methods can be applied to learn the highly nonlinear relationship between them. A previous version of the SWIFT model employed a polynomial approximation approach. In SWIFT-AI, we exploit the capabilities of neural networks to improve the approximation and develop an uncertainty estimation algorithm to increase the robustness of the model.
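The supervised-learning setup described above, in which a network maps stored input parameters to a tendency, can be sketched with a very small example. This is a generic one-hidden-layer regression network trained by full-batch gradient descent in NumPy, not the SWIFT-AI architecture. The four synthetic inputs (standing in for the 55 parameters), the made-up target function, the layer size and the learning rate are all assumptions, and the uncertainty estimation step is not covered.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic stand-in for the training data: 4 inputs instead of 55
# and an invented nonlinear target instead of ozone tendencies
X = rng.uniform(-1.0, 1.0, size=(512, 4))
y = (np.sin(2.0 * X[:, 0]) * X[:, 1] + 0.5 * X[:, 2] ** 2 - X[:, 3])[:, None]

# One hidden tanh layer with a linear output (sizes are assumptions)
n_hidden = 16
W1 = rng.normal(0.0, 0.5, size=(4, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(0.0, 0.5, size=(n_hidden, 1))
b2 = np.zeros(1)

def forward(inputs):
    """Return hidden activations and network prediction."""
    h = np.tanh(inputs @ W1 + b1)
    return h, h @ W2 + b2

def mse(pred, target):
    return float(np.mean((pred - target) ** 2))

initial_loss = mse(forward(X)[1], y)

lr = 0.05
for _ in range(2000):
    h, pred = forward(X)
    grad_pred = 2.0 * (pred - y) / len(X)      # d(MSE)/d(pred)
    grad_W2 = h.T @ grad_pred
    grad_b2 = grad_pred.sum(axis=0)
    grad_z = (grad_pred @ W2.T) * (1.0 - h ** 2)  # backprop through tanh
    grad_W1 = X.T @ grad_z
    grad_b1 = grad_z.sum(axis=0)
    W1 -= lr * grad_W1
    b1 -= lr * grad_b1
    W2 -= lr * grad_W2
    b2 -= lr * grad_b2

final_loss = mse(forward(X)[1], y)
print(f"MSE before training: {initial_loss:.4f}, after: {final_loss:.4f}")
```

Once trained, evaluating such a network is a handful of matrix products per model point, which is the source of the speed-up over integrating a full chemistry scheme.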

The figure depicts the monthly mean ozone tendency in an altitude-latitude cross-section and compares, first, the CTM ATLAS ground truth data used to train the surrogate models, second, the previous polynomial approach of SWIFT, and third, the new neural network approach, SWIFT-AI. Comparing the results with ATLAS (right column) reveals the errors. SWIFT-AI significantly lowers the deviations in the boundary regions and improves the overall RMSE by approximately one order of magnitude.
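For reference, the RMSE used in such a comparison is the square root of the mean squared difference between a surrogate field and the reference field. The sketch below uses invented constant fields on a tiny grid purely to show the calculation; the array shapes and values are assumptions and do not reproduce the results in the figure.

```python
import numpy as np

def rmse(pred, ref):
    """Root-mean-square error between a predicted and a reference field."""
    diff = np.asarray(pred, dtype=float) - np.asarray(ref, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

# Invented altitude x latitude fields (illustrative only)
reference = np.zeros((3, 4))
surrogate_a = np.full((3, 4), 2.0)   # constant offset of 2.0
surrogate_b = np.full((3, 4), 0.2)   # constant offset of 0.2

rmse_a = rmse(surrogate_a, reference)
rmse_b = rmse(surrogate_b, reference)
print(rmse_a, rmse_b, rmse_a / rmse_b)
```

In this toy case the second surrogate has an RMSE ten times smaller than the first, which is what an improvement of one order of magnitude means.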

The overall advantage of our surrogate models is a much lower computation time (minutes instead of days) compared to the full chemistry model ATLAS, while still achieving a high level of accuracy.