Talk by Hugo Kyo Lee, Jet Propulsion Laboratory, Caltech: Application of topological data analysis to multi-resolution matching and anomaly detection

Topic: Data Science Application, Climate Science

noon to 1 p.m., Oct. 25, 2023


Seminar Format

Available remotely via Zoom. Contact the department to subscribe to the email list; the Zoom link is provided in the announcement.


Topology is the study of shapes. Topological data analysis (TDA) is an emerging set of tools at the interface of algebraic topology, machine learning (ML), and statistics. TDA has proven useful in a diverse range of applications, from social studies to digital health care to power systems. While geometric methods such as TDA continue to gain popularity in the statistical sciences and ML, from causal inference to deep learning on manifolds, their utility for assessing the spatial characteristics of Earth science datasets remains largely untapped. Topological information on the inherent shape of the data can provide invaluable insights into the latent data structure and organization, and can play a leading role in understanding spatiotemporal dynamic patterns in observations and climate models.

Here, I studied the latent shape of temperature maps over the contiguous United States in February, June, and July 2021. The cold wave in February 2021 was an extreme weather event that brought record-breaking low temperatures to North America and caused multiple days of massive blackouts in Texas. From late June through mid-July, an extreme heat wave associated with a strong ridge occurred over western North America. The main objective is to build a robust and reliable methodology that compares spatial patterns from different sources and detects anomalous spatial patterns during extreme temperature events. Specifically, I assessed two temperature datasets: the Modern-Era Retrospective Analysis for Research and Applications, version 2 (MERRA-2) reanalysis and the Atmospheric Infrared Sounder (AIRS). By applying cubical complexes, I summarized the shape of each temperature dataset as a persistence diagram (PD) and calculated the Wasserstein distance between the two PDs. My previous work (Ofori-Boateng et al., 2021) shows that the Wasserstein distance represents differences in spatial patterns and can replace conventional metrics such as bias and root-mean-square deviation (RMSD).
To the best of my knowledge, there is no other quantitative metric for measuring the difference in spatial patterns between Earth science datasets at different spatial resolutions. In my work, PDs summarized temperature maps during extreme cold and heat waves. Applying TDA to observational and model datasets has enormous potential because we can also analyze key spatial structures in the three-dimensional data from sounders and compare them with climate models.
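
For readers new to the pipeline described above (gridded field, persistence diagram, Wasserstein distance), the following is a minimal sketch in pure Python. It is not the speaker's implementation. It rests on several assumptions made only for illustration: the two small fields are invented numbers rather than MERRA-2 or AIRS data, grid points are treated as vertices joined by 4-neighbor edges, only 0-dimensional features (connected components of sublevel sets) are tracked, and the distance is the order-1 Wasserstein distance with an L-infinity ground metric, found by brute force. The function names `sublevel_h0_persistence` and `wasserstein_1` are hypothetical. A real analysis would use a TDA library and would also include higher-dimensional features.

```python
from itertools import permutations


def sublevel_h0_persistence(grid):
    """Return sorted (birth, death) pairs of 0-dim sublevel-set persistence.

    Assumption: grid points are vertices linked to their 4 neighbors.
    The one component that never dies is omitted, as are zero-length pairs.
    """
    rows, cols = len(grid), len(grid[0])
    order = sorted((grid[r][c], r, c) for r in range(rows) for c in range(cols))
    parent, birth = {}, {}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    pairs = []
    for value, r, c in order:  # sweep the threshold upward
        parent[(r, c)] = (r, c)
        birth[(r, c)] = value
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nb = (r + dr, c + dc)
            if nb not in parent:
                continue  # neighbor is outside the grid or not yet added
            a, b = find((r, c)), find(nb)
            if a == b:
                continue
            # Elder rule: the component born later dies at this value.
            young, old = (a, b) if birth[a] > birth[b] else (b, a)
            if birth[young] < value:
                pairs.append((birth[young], value))
            parent[young] = old
    return sorted(pairs)


def wasserstein_1(pd_a, pd_b):
    """Order-1 Wasserstein distance between two small persistence diagrams.

    Assumptions: L-infinity ground metric; unmatched points go to the
    diagonal. Brute force over all matchings, so only for tiny diagrams.
    """
    n, m = len(pd_a), len(pd_b)
    size = n + m  # each diagram is padded with diagonal slots
    if size == 0:
        return 0.0

    def cost(i, j):
        a_real, b_real = i < n, j < m
        if a_real and b_real:
            return max(abs(pd_a[i][0] - pd_b[j][0]), abs(pd_a[i][1] - pd_b[j][1]))
        if a_real:  # point of A matched to the diagonal
            return (pd_a[i][1] - pd_a[i][0]) / 2
        if b_real:  # point of B matched to the diagonal
            return (pd_b[j][1] - pd_b[j][0]) / 2
        return 0.0  # diagonal matched to diagonal

    return min(
        sum(cost(i, p[i]) for i in range(size))
        for p in permutations(range(size))
    )


# Two made-up "temperature" fields on grids of different resolution.
field_a = [
    [1, 5, 2],
    [6, 7, 6],
    [3, 5, 0],
]
field_b = [
    [1, 6, 2],
    [5, 7, 0],
]

pd_a = sublevel_h0_persistence(field_a)
pd_b = sublevel_h0_persistence(field_b)
print("PD of field_a:", pd_a)
print("PD of field_b:", pd_b)
print("Wasserstein distance:", wasserstein_1(pd_a, pd_b))
```

The two toy grids have different shapes (3x3 and 2x3), yet their diagrams can still be compared directly. This is the property that makes a PD-based distance attractive for datasets at different spatial resolutions, whereas bias and RMSD require regridding both fields to a common grid first.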

Dr. Hugo Kyo Lee of NASA's Jet Propulsion Laboratory (JPL) is an atmospheric scientist with a background in research and data analysis, specializing in climate science and remote sensing. As the Principal Investigator of NASA's Advanced Information Systems and Technology (AIST) project and JPL's Regional Climate Model Evaluation System (RCMES), Dr. Lee has spent the last 11 years at JPL helping to advance our understanding of the Earth's climate system. Dr. Lee received a Ph.D. in Atmospheric Sciences from the University of Illinois at Urbana-Champaign in 2012 and a B.S. in Atmospheric Sciences from Seoul National University, Korea, in 2002.
