Pilot Lab Exascale Earth System Modelling

From 2019 to 2022, the Pilot Lab Exascale Earth System Modelling (PL-ExaESM) was a preparatory project for the Joint Lab, exploring specific concepts to enable exascale readiness of Earth system models and associated workflows in Earth system science.

Earth system models simulate the Earth's climate, including physical and biogeochemical processes, and they are important tools for assessing the magnitude and impacts of future climate change. Such models have to run at ever finer resolutions in order to capture extreme events, which, in turn, are often responsible for the majority of weather-related and other environmental impacts. We foresee that next-generation Earth system models will run at resolutions of 1-3 km, which means that around 200 numerical equations must be solved in each of about 100 billion model grid boxes at each model time step (i.e. every minute). With such simulations, local-scale phenomena can be reproduced accurately and extreme events will be simulated much more accurately than at present. These models therefore require ever-increasing computing power. The next generation of models is expected to run only on supercomputers capable of at least 10^18 floating point operations per second, i.e. exascale systems.
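To give a feeling for the workload implied by these figures, the short Python sketch below multiplies them out. It uses only the numbers quoted above (about 200 equations, about 100 billion grid boxes, one time step per simulated minute); the function and constant names are illustrative. It counts equation solves, not floating point operations, because the cost of solving a single equation is not stated here.

```python
# Back-of-envelope workload estimate from the figures quoted in the text.
# All values are approximate; the names are illustrative only.

EQUATIONS_PER_GRID_BOX = 200          # "around 200 numerical equations"
GRID_BOXES = 100_000_000_000          # "about 100 billion model grid boxes"
TIME_STEP_SECONDS = 60                # one model time step per simulated minute
SECONDS_PER_DAY = 86_400


def solves_per_time_step() -> int:
    """Equation solves needed for a single model time step."""
    return EQUATIONS_PER_GRID_BOX * GRID_BOXES


def time_steps_per_simulated_day() -> int:
    """Number of model time steps in one simulated day."""
    return SECONDS_PER_DAY // TIME_STEP_SECONDS


def solves_per_simulated_day() -> int:
    """Equation solves needed to advance the model by one simulated day."""
    return solves_per_time_step() * time_steps_per_simulated_day()


if __name__ == "__main__":
    print(f"solves per time step:     {solves_per_time_step():.2e}")
    print(f"time steps per model day: {time_steps_per_simulated_day()}")
    print(f"solves per model day:     {solves_per_simulated_day():.2e}")
```

Under these assumptions, a single time step requires about 2 x 10^13 equation solves, and one simulated day (1440 time steps) requires about 2.9 x 10^16.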

Supercomputer technology is undergoing rapid and fundamental changes: in recent years, processor development has reached physical size limits, and new paradigms for computing processors therefore have to be found. Typically, next-generation processors combine many thousands of cores in one processing unit. To use such devices efficiently, new programming concepts must be developed and implemented in Earth system models. Furthermore, these models generate huge amounts of data, and storage technology is also evolving. This implies that new ways of handling Earth system model output and new modelling workflows have to be developed. Enabling such gigantic simulations on such gigantic machines requires close collaboration between Earth system modellers and computer scientists. The PL-ExaESM offers a platform where these two communities can meet and interact to build the next generation of Earth system models.

In the PL-ExaESM, scientists from nine Helmholtz institutions worked together to address five specific problems of exascale Earth system modelling: (i) scalability: models are being ported to next-generation GPU processor technology and the codes are being modularized so that computer scientists can better help to optimize the models on new hardware (Topic 1), (ii) load balancing: asynchronous workflows are being developed to allow for more efficient orchestration of the increasing model output while preserving the flexibility needed to control the simulation output according to scientific needs (Topic 2), (iii) data staging: emerging dense memory technologies allow new ways of optimizing the I/O operations of data-intensive applications running on HPC clusters and future exascale systems (Topic 3), (iv) system design: the results of dedicated performance tests of Earth system models and Earth system data workflows are analysed with a view to potential improvements in the design of future exascale supercomputer systems (Topic 4), and (v) machine learning: modern machine learning approaches are tested for their suitability to replace computationally expensive model calculations, speed up model simulations, or make better use of available observation data (Topic 5).