Machine learning (ML) is a rapidly growing field that is starting to touch all aspects of our lives, and science is no exception. Recent work in scientific ML, i.e., combining ML with conventional scientific problems, is producing breakthroughs on notoriously hard problems that seemed out of reach only a few years ago. One such age-old problem is that of turbulence closures in fluid flows. This closure, or parameterization, problem is particularly relevant for environmental fluids, which span scales from the size of the planet down to millimeters, and it remains a major obstacle to improving weather forecasts and climate projections.
The climate system is composed of many interacting components (e.g., ocean, atmosphere, ice) and is described by complex nonlinear equations. To simulate, understand, and predict climate, these equations are solved numerically under a number of simplifications, which introduces errors. These errors arise from the numerics used to solve the equations and from the lack of appropriate representations of processes occurring below the resolution of the climate model grid (i.e., sub-grid processes).
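To make the closure problem concrete, here is a schematic (not the specific notation used later in the book): write the true dynamics as $dx/dt = f(x)$ and let an overbar denote a coarse-graining onto the model grid, assumed here to commute with the time derivative. The equation for the resolved state then contains a residual that depends on the unresolved scales and must be approximated, i.e., parameterized, in terms of the resolved state alone.

$$
\frac{d\bar{x}}{dt} \;=\; \overline{f(x)} \;=\; \underbrace{f(\bar{x})}_{\text{resolved tendency}} \;+\; \underbrace{\left(\overline{f(x)} - f(\bar{x})\right)}_{\text{sub-grid term to be parameterized}}
$$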
This book aims to conceptualize the problems associated with climate models within a simple and computationally accessible framework, and to show how some basic ML methods can be used to approach these problems. We will introduce readers to climate modeling using a simple tool, the Lorenz-96 (L96) model.
The material in this Jupyter book is presented over five sections. The first section, Lorenz 96 and General Circulation Models, describes the Lorenz-96 model and how it can act as a simple analog to the much more complex general circulation models used for simulating ocean and atmosphere dynamics. This section also introduces the essence of the parameterization, or closure, problem. In the second section, Neural Networks with Lorenz-96, we introduce the basics of ML, show how fully connected neural networks can be used to approach the parameterization task, and discuss how these neural networks can be optimized and interpreted. No model, even a well-parameterized one, is perfect, and the way we keep computer models close to reality is by guiding them with the help of observational data. This task is referred to as data assimilation, and it is introduced in the third section, Data Assimilation with Lorenz-96. Here, we use the L96 model to introduce the key concepts of data assimilation and show how ML can be used to learn data-assimilation increments that help reduce model biases. While neural networks can be great function approximators, they are usually quite opaque, and it is hard to figure out exactly what they have learned. Equation discovery is a class of ML techniques that estimates the underlying function as an equation rather than as a set of neural network weights. This approach produces results that are far more interpretable and can potentially even help discover novel physics. These techniques are presented in the fourth section, Equation Discovery with Lorenz-96. Finally, in section five, Other ML approaches for Lorenz-96, we describe a few additional ML approaches, with the acknowledgment that the fast-growing ML and scientific ML literature contains many more techniques and we make no attempt at a comprehensive summary of the field.
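To give a flavor of the second section, a fully connected network can be trained to map the resolved state to the sub-grid tendency it is missing. The sketch below uses PyTorch purely for illustration, with made-up layer sizes, variable names, and random placeholder data; it only shows the shape of the approach, not the notebooks' actual architecture or training setup.

```python
import torch
import torch.nn as nn

K = 8  # hypothetical number of resolved (slow) variables

# A small fully connected network mapping the resolved state X to an
# estimate of the sub-grid forcing that the coarse model is missing.
net = nn.Sequential(
    nn.Linear(K, 32),
    nn.ReLU(),
    nn.Linear(32, 32),
    nn.ReLU(),
    nn.Linear(32, K),
)

optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# x_train: resolved states; u_train: "true" sub-grid tendencies diagnosed from
# a two-scale simulation. Random placeholders here, for shape only.
x_train = torch.randn(1024, K)
u_train = torch.randn(1024, K)

for epoch in range(100):
    optimizer.zero_grad()
    loss = loss_fn(net(x_train), u_train)  # fit the missing tendency
    loss.backward()
    optimizer.step()
```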
This book was created by, and as part of, M2LInES, an international collaboration supported by Schmidt Futures to improve climate models with scientific ML. The original goal of these notebooks was for our team to work together and learn from each other; in particular, to get up to speed on the key scientific aspects of our collaboration (parameterizations, ML, data assimilation, uncertainty quantification) and to develop new ideas. This was done as a series of tutorials, each led by a few team members, which took place roughly once every two weeks for about six to seven months. This Jupyter book is a collection of the notebooks used during these tutorials, which have been only lightly edited for continuity and clarity. Ultimately, we are happy to share these resources with the scientific community to introduce our research ideas and to foster the use of ML techniques for tackling climate science problems.
Parameterization of sub-grid processes is a major challenge in climate modeling, and the details of this problem can be very context dependent (Christensen & Zanna, 2022).
As described above, these notebooks were originally created to introduce non-domain experts to the parameterization aspects of climate modeling and to how ML could potentially be used to address them. They have since been adapted to serve as a pedagogical tool for self-learning, as a reference manual, or for teaching some modules of an introductory class on ML. The book is organized into sections that are relatively independent, with the exception that the first section provides a general overview of the parameterization problem in climate models. Each notebook covers material that can be discussed in roughly an hour-long lecture, and sections can be mixed and matched or reordered as needed depending on the overall learning objectives.
This work is supported by the generosity of Eric and Wendy Schmidt by recommendation of Schmidt Futures, as part of its Virtual Earth System Research Institute (VESRI). MAB acknowledges support from the National Science Foundation’s AGS-PRF Fellowship Award (AGS2218197).