Long-term projections of soil moisture using deep learning and SMAP data with aleatoric and epistemic uncertainty estimates

2019 Spring Cyberseminar Series

Chaopeng Shen / Pennsylvania State University

Talk Description

Recently, recurrent deep networks have shown promise in harnessing newly available satellite-sensed data for long-term soil moisture projections. However, to be useful in forecasting, deep networks must also provide uncertainty estimates, which were previously unavailable for time series deep learning (DL). Here we adapt Monte Carlo dropout with an aleatoric term (MCD+A), an efficient uncertainty estimation framework developed in computer vision, for hydrologic time series predictions. MCD+A simultaneously estimates a heteroscedastic aleatoric uncertainty (attributable to observational noise and predictable using inputs) and an epistemic uncertainty (attributable to insufficiently constrained model parameters). Although MCD+A has appealing features, many heuristic approximations were employed in its derivation, and it has lacked both rigorous quality evaluation and evidence for its asserted capability to detect dissimilarity. We show that for reproducing soil moisture dynamics recorded by the Soil Moisture Active Passive (SMAP) mission, MCD+A indeed gave a good estimate of our predictive error, provided we tuned a hyperparameter and used a representative training dataset. The aleatoric term responded strongly to observational noise, and the epistemic term clearly acted as a detector for physiographic dissimilarity from the training data. Both terms behaved as intended, but they were also correlated. However, when the training and test data were characteristically different, the aleatoric term could be misled, undermining its reliability. Nevertheless, the uncertainty quality varied with the epistemic:aleatoric uncertainty ratio, and this trend could potentially be exploited to anticipate the reliability of the aleatoric term. Finally, a more informative prior for the aleatoric term improved uncertainty quality, suggesting that expert knowledge should be incorporated.
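To make the two uncertainty terms concrete, here is a minimal NumPy sketch of how Monte Carlo dropout outputs are commonly combined with a learned aleatoric term in the computer vision literature. It is an illustration under stated assumptions, not the implementation used in the talk: the function names (`gaussian_nll`, `combine_mc_samples`), the array shapes, and the choice to have the network output a log-variance are all hypothetical. The sketch assumes the network has already been run several times with dropout left active, each pass returning a predicted mean and a predicted log-variance per time step.

```python
import numpy as np


def gaussian_nll(y, mu, log_var):
    """Heteroscedastic Gaussian negative log-likelihood (constant term dropped).

    Assumption: the network predicts log(sigma^2) rather than sigma^2,
    which keeps the predicted variance positive.
    """
    y, mu, log_var = (np.asarray(a, dtype=float) for a in (y, mu, log_var))
    return float(np.mean(0.5 * np.exp(-log_var) * (y - mu) ** 2 + 0.5 * log_var))


def combine_mc_samples(mu_samples, log_var_samples):
    """Combine T stochastic forward passes into uncertainty estimates.

    mu_samples, log_var_samples: arrays of shape (T, n_steps), one row per
    forward pass with dropout active (hypothetical layout).
    """
    mu_samples = np.asarray(mu_samples, dtype=float)
    log_var_samples = np.asarray(log_var_samples, dtype=float)

    # Predictive mean: average of the per-pass means.
    mean = mu_samples.mean(axis=0)
    # Epistemic variance: spread of the per-pass means.
    epistemic_var = mu_samples.var(axis=0)
    # Aleatoric variance: average of the per-pass predicted variances.
    aleatoric_var = np.exp(log_var_samples).mean(axis=0)
    # Total predictive standard deviation combines both terms.
    total_std = np.sqrt(epistemic_var + aleatoric_var)
    return mean, epistemic_var, aleatoric_var, total_std


if __name__ == "__main__":
    # Synthetic stand-in for dropout passes: 50 passes over 4 time steps.
    rng = np.random.default_rng(0)
    mu = 0.25 + 0.02 * rng.standard_normal((50, 4))
    log_var = np.full((50, 4), np.log(0.03 ** 2))
    mean, epi, ale, std = combine_mc_samples(mu, log_var)
    print("epistemic:aleatoric ratio:", np.sqrt(epi / ale))
```

Under these assumptions, the epistemic:aleatoric ratio mentioned in the abstract could be computed per site or per time step from the two variance terms returned above.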

2019 Spring Cyberseminar Series: Recent advances in big data machine learning in Hydrology

Hosted by Chaopeng Shen, Pennsylvania State University

Recently, big data machine learning has led to substantial changes across many areas of study. In Hydrology, the introduction of big data and machine learning methods has substantially improved our ability to address existing challenges and encouraged novel perspectives and new applications. These advances present new opportunities for methods that aid scientific discovery, data discovery, and predictive modeling. This series covers new techniques and findings that have emerged in Hydrology during the previous year, with a focus on catchment and land surface hydrology.