Machine Learning Climate Simulation Dataset Paper Wins Award at Prominent AI Conference
An interdisciplinary group of researchers from multiple institutions, including climate and data scientists in UC Irvine’s School of Physical Sciences and Donald Bren School of Information and Computer Sciences, won a best paper award at the 37th Conference on Neural Information Processing Systems, held recently in New Orleans. The publication, titled “ClimSim: A large multi-scale dataset for hybrid physics-ML climate emulation,” took the top prize in the dataset category. ClimSim, the climate modeling tool introduced in the paper, is the largest-ever data compendium designed for machine learning-enhanced physics research.
“Hybrid methods that combine physics with machine learning have introduced a new generation of high-fidelity climate simulators to the research community,” said co-author Michael Pritchard, UCI professor of Earth system science. “However, this hybrid machine-learning approach requires domain-specific treatment that has been inaccessible to users because of a lack of training data and relevant, easy-to-use workflows. ClimSim is one step toward alleviating that problem.”
The dataset consists of 5.7 billion pairs of input and output vectors that isolate the influence of local physical characteristics on a host climate simulator’s large-scale physical state. The ClimSim dataset spans the globe and multiple years at a high sampling frequency.
Lead author Sungduk Yu, UCI assistant project scientist in the Department of Earth System Science, presented the team’s project at the conference. “ClimSim is the result of a collaboration between 56 scientists from 20 different institutions. It’s a great example of the progress that can be made when the climate science and AI/machine learning communities come together,” he said.
Co-author Stephan Mandt, UCI associate professor of computer science, said, “This prestigious award, issued among thousands of papers, shows that the machine learning community increasingly cares about applications in climate science. I congratulate Sungduk and the team on this important effort of making their data accessible to many thousands of AI researchers worldwide.”
The ClimSim dataset has been released on an open-source basis for the benefit of science and society, Pritchard said.