Long-term projections of soil moisture using deep learning and SMAP data with aleatoric and epistemic uncertainty estimates

2019 Spring Cyberseminar Series

Presenter(s):
Chaopeng Shen / Pennsylvania State University

Talk Description

Recently, recurrent deep networks have shown promise in harnessing newly available satellite-sensed data for long-term soil moisture projections. However, to be useful in forecasting, deep networks must also provide uncertainty estimates, which were not previously available for time series deep learning (DL). Here we adapt Monte Carlo dropout with an aleatoric term (MCD+A), an efficient uncertainty estimation framework developed in computer vision, for hydrologic time series predictions. MCD+A simultaneously estimates a heteroscedastic aleatoric uncertainty (attributable to observational noise and predictable using inputs) and an epistemic uncertainty (attributable to insufficiently constrained model parameters). Although MCD+A has appealing features, its derivation relies on many heuristic approximations, and both a rigorous evaluation of its uncertainty quality and evidence for its asserted ability to detect dissimilarity have been lacking. We show that for reproducing soil moisture dynamics recorded by the Soil Moisture Active Passive (SMAP) mission, MCD+A indeed gave a good estimate of our predictive error, provided we tune a hyperparameter and use a representative training dataset. The aleatoric term responded strongly to observational noise, and the epistemic term clearly acted as a detector for physiographic dissimilarity from the training data. Both terms behaved as intended, but they were also correlated. However, when the training and test data were characteristically different, the aleatoric term could be misled, undermining its reliability. Nevertheless, the uncertainty quality varied with the epistemic:aleatoric uncertainty ratio, and this trend could potentially be exploited to anticipate the reliability of the aleatoric term. Finally, a more informative prior for the aleatoric term improved uncertainty quality, suggesting that expert knowledge should be incorporated.
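As a rough illustration of how the two uncertainty terms described above are commonly separated, the sketch below follows the widely used formulation from the computer vision literature: the network predicts a mean and a log-variance, it is run many times with dropout left on, the spread of the predicted means is taken as the epistemic variance, and the average predicted variance is taken as the aleatoric variance. This is not the presenter's implementation. The names `toy_stochastic_model`, `gaussian_nll`, and `mc_dropout_uncertainty`, the toy weights, and the noise model are all hypothetical stand-ins chosen so the example runs with the Python standard library alone.

```python
import math
import random


def gaussian_nll(y_obs, y_pred, log_var):
    """Heteroscedastic Gaussian negative log-likelihood (constant term dropped).

    Training with this loss lets a network learn an input-dependent
    (aleatoric) variance alongside its prediction.
    """
    return 0.5 * math.exp(-log_var) * (y_obs - y_pred) ** 2 + 0.5 * log_var


def toy_stochastic_model(x, rng, drop_rate=0.2):
    """Hypothetical stand-in for one forward pass with dropout left on.

    Returns (predicted mean, predicted log-variance).
    """
    weights = [0.5, 0.3, 0.2]  # assumed toy parameters, not from the talk
    # Randomly zero out weights and rescale the survivors (inverted dropout).
    kept = [
        w / (1.0 - drop_rate) if rng.random() >= drop_rate else 0.0
        for w in weights
    ]
    mean = sum(kept) * x
    # Assumed noise model: observational noise grows with the input magnitude.
    log_var = math.log(0.01 + 0.05 * abs(x))
    return mean, log_var


def mc_dropout_uncertainty(x, n_passes=200, seed=0):
    """Combine repeated stochastic passes into the two uncertainty terms."""
    rng = random.Random(seed)
    passes = [toy_stochastic_model(x, rng) for _ in range(n_passes)]
    means = [m for m, _ in passes]
    variances = [math.exp(s) for _, s in passes]

    pred_mean = sum(means) / n_passes
    # Epistemic: spread of the predicted means across dropout samples.
    epistemic = sum((m - pred_mean) ** 2 for m in means) / n_passes
    # Aleatoric: average of the variances the model itself predicts.
    aleatoric = sum(variances) / n_passes
    return {
        "mean": pred_mean,
        "epistemic": epistemic,
        "aleatoric": aleatoric,
        "total": epistemic + aleatoric,
        "ratio": epistemic / aleatoric,  # epistemic:aleatoric ratio
    }
```

Calling `mc_dropout_uncertainty(1.0)` returns a dictionary with the predictive mean, the two variance terms, their sum, and the epistemic:aleatoric ratio, the quantity the abstract suggests could help anticipate how reliable the aleatoric term is.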

2019 Spring Cyberseminar Series: Recent advances in big data machine learning in Hydrology

Hosted by Chaopeng Shen, Pennsylvania State University

Recently, big data machine learning has led to substantial changes across many areas of study. In Hydrology, the introduction of big data and machine learning methods has substantially improved our ability to address existing challenges and encouraged novel perspectives and new applications. These advances present new opportunities for methods that aid scientific discovery, data discovery, and predictive modeling. This series covers new techniques and findings that have emerged in Hydrology during the previous year, with a focus on catchment and land surface hydrology.

Consider attending the 2019 CUAHSI Conference on Hydroinformatics: Hydroinformatics for scientific knowledge, informed policy, and effective response!

July 29 - 31, 2019 at Brigham Young University in Provo, UT

The CUAHSI Conference on Hydroinformatics is uniquely focused on data science and technology for water resources and hydrology. This conference will include keynote speakers and oral, poster, and hands-on sessions. Start planning now to be a part of this important meeting.

We are pleased to announce the following Keynote Speakers:

  • Ni-Bin Chang, University of Central Florida
  • Tyler Erickson, Google Earth Engine and Google Earth Outreach
  • Sara Larsen, Western States Water Council Water Data Exchange
  • Manish Parashar, National Science Foundation
  • Gene Shawcroft, Central Utah Water Conservancy District
  • Chaopeng Shen, Pennsylvania State University

Register by June 15 (Early Bird) | July 15 (Regular).

A limited number of $750 travel grants are available to students, post-docs, and early career faculty affiliated with U.S. universities.
