This series has passed.

CUAHSI's 2019 Spring Cyberseminar Series:

Recent advances in big data machine learning in Hydrology

Host: Chaopeng Shen, Pennsylvania State University

Recently, big data machine learning has led to substantial changes across many areas of study. In Hydrology, the introduction of big data and machine learning methods has substantially improved our ability to address existing challenges and encouraged novel perspectives and new applications. These advances present new opportunities for methods that aid scientific discovery, data discovery, and predictive modeling. This series covers new techniques and findings that have emerged in Hydrology during the previous year, with a focus on catchment and land surface hydrology.

All talks take place on Fridays at 1:00 p.m. ET.

Dates, Speakers, and Topics:

  • March 29, 2019: Machine Learning & Information Theory for Land Model Benchmarking & Process Diagnostics | Grey Nearing, University of Alabama
  • April 5, 2019: Long Short-Term Memory (LSTM) networks for rainfall-runoff modeling | Frederik Kratzert, Johannes Kepler University
  • April 12, 2019: Use deep convolutional neural nets to learn patterns of mismatch between a land surface model and GRACE satellite | Alex Sun, University of Texas at Austin
  • April 19, 2019: Long-term projections of soil moisture using deep learning and SMAP data with aleatoric and epistemic uncertainty estimates | Chaopeng Shen, Pennsylvania State University
  • April 26, 2019: Exploring deep neural networks to retrieve rain and snow in high latitudes using multi-sensor and reanalysis data | Guoqiang Tang, Tsinghua University
  • May 3, 2019: Process-guided deep learning: Improving water resource predictions with advanced hybrid models | Jordan S. Read, USGS and Vipin Kumar, University of Minnesota
  • May 10, 2019: Remote sensing precipitation using artificial neural networks and machine learning methods | Kuolin Hsu, University of California, Irvine

Presentation Abstracts and Recordings

  • March 29, 2019: Grey Nearing, University of Alabama

    Machine Learning & Information Theory for Land Model Benchmarking & Process Diagnostics

    I would like to propose that there is significant room in the Hydrological Sciences for developing better methods for integrating machine learning and physical modeling.

    This presentation will start by reviewing some recent results that compare machine learning and process-based Hydrology and Hydrometeorology models through benchmarking and process diagnostics. We will use information theory and dynamic process networks to look at the internal structure and functioning of complex systems models, and try to understand causes of missing information in process-based models.

    The talk will conclude by outlining one particular strategy for combining machine learning with process modeling that involves adding a machine learning kernel to the numerical integration of a dynamical systems model. I’ll present results from applying this method to both rainfall-runoff modeling and soil moisture modeling.

    View recording.

  • April 5, 2019: Frederik Kratzert, Johannes Kepler University, Linz, Austria

    Long Short-Term Memory (LSTM) networks for rainfall-runoff modeling

    Over the last couple of years, deep learning methods have revolutionized many fields, such as computer vision and natural language processing. Recently, several studies have reviewed potential applications for such deep learning methods in the field of hydrology, or earth sciences in general. However, one criticism often raised, despite their unquestionable predictive power, is the black-box-like nature of these models.

    In this session we concentrate on a special (recurrent) neural network architecture that played an important role in the recent rise of deep learning methods, the Long Short-Term Memory network (LSTM). We look at both the predictive power and ways to interpret the network internals in the domain of rainfall-runoff modeling. For example, we will see that an LSTM trained solely to predict discharge from time series of meteorological variables learns by itself to model snow in its internal memory.

    View recording.

  • April 12, 2019: Alex Sun, University of Texas at Austin

    Use deep convolutional neural nets to learn patterns of mismatch between a land surface model and GRACE satellite

    Global hydrological models are increasingly being used to simulate spatial and temporal patterns of total water storage. Missing physical processes and/or uncertain parameterization in these models may introduce significant uncertainty in model predictions. The GRACE satellite mission senses total water storage at the regional and continental scales. In this study, we applied deep convolutional neural nets to learn the spatial and temporal patterns of mismatch between model simulations and GRACE observations. This physically based learning approach leverages the strengths of both data science and hypothesis-driven process-level models. We show, through three different types of convolutional neural network-based deep learning models, that deep learning is a viable approach for improving the model-GRACE match. After the deep learning model is trained, GRACE data are no longer required. As a result, the method can also be used to fill in data gaps between GRACE missions or even before the GRACE mission.

    View recording.

  • April 19, 2019: Chaopeng Shen, Pennsylvania State University

    Long-term projections of soil moisture using deep learning and SMAP data with aleatoric and epistemic uncertainty estimates

    Recently, recurrent deep networks have shown promise in harnessing newly available satellite-sensed data for long-term soil moisture projections. However, to be useful in forecasting, deep networks must also provide uncertainty estimates, which were not previously available for time series deep learning (DL). Here we adapt Monte Carlo dropout with an aleatoric term (MCD+A), an efficient uncertainty estimation framework developed in computer vision, for hydrologic time series predictions. MCD+A simultaneously estimates a heteroscedastic aleatoric uncertainty (attributable to observational noise and predictable using inputs) and an epistemic uncertainty (attributable to insufficiently constrained model parameters). Although MCD+A has appealing features, many heuristic approximations were employed during its derivation, and both rigorous quality evaluation and evidence of its asserted capability to detect dissimilarity were lacking. We show that for reproducing soil moisture dynamics recorded by the Soil Moisture Active Passive (SMAP) mission, MCD+A indeed gave a good estimate of our predictive error, provided we tune a hyperparameter and use a representative training dataset. The aleatoric term responded strongly to observational noise and the epistemic term clearly acted as a detector for physiographic dissimilarity from the training data. Both terms behaved as intended, but they are also correlated. However, when the training and test data are characteristically different, the aleatoric term could be misled, undermining its reliability. Nevertheless, the uncertainty quality varied with the epistemic:aleatoric uncertainty ratio, and this trend could potentially be exploited to anticipate the reliability of the aleatoric term. Finally, a more informative prior for the aleatoric term improves uncertainty quality, suggesting expert knowledge should be incorporated.

    View recording.

  • April 26, 2019: Guoqiang Tang, Tsinghua University

    Exploring deep neural networks to retrieve rain and snow in high latitudes using multi-sensor and reanalysis data

    Satellite remote sensing is able to provide information on global rain and snow, but challenges remain in accurate estimation of precipitation rates, particularly in snow retrieval. In this work, a deep neural network (DNN) is applied to estimate rain and snow rates in high latitudes. The reference data for DNN training are provided by two spaceborne radars onboard the GPM Core Observatory and CloudSat. Passive microwave data from the GPM Microwave Imager (GMI), infrared (IR) data from MODIS, and environmental data from ECMWF are used as inputs to train the DNN against the reference precipitation. The DNN estimates are compared to data from the Goddard Profiling Algorithm (GPROF), which is used to retrieve passive microwave precipitation for the Global Precipitation Measurement (GPM) mission. First, the DNN-based retrieval method performs well in both training and testing periods. Second, the DNN can reveal the advantages and disadvantages of different channels of GMI and MODIS. Additionally, IR and environmental data can improve precipitation estimation of the DNN, particularly for snowfall. Finally, based on the optimized DNN, precipitation is estimated in 2017 from orbital GMI brightness temperatures and compared to ERA-Interim and MERRA2 reanalysis data. Evaluation results show that: (1) the DNN can largely mitigate the underestimation of precipitation rates in high latitudes by GPROF; (2) the DNN-based snowfall estimates largely outperform those of GPROF; and (3) the spatial distributions of DNN-based precipitation are closer to reanalysis data. The method and assessment presented in this study could potentially contribute to the substantial improvement of satellite precipitation products in high latitudes.

    View recording.

  • May 3, 2019: Jordan S. Read, USGS and Vipin Kumar, University of Minnesota

    Process-guided deep learning: Improving water resource predictions with advanced hybrid models

    Data growth and computational advances have created new opportunities to improve water resources modeling. Deep learning (DL) tools deliver improved prediction accuracy by resolving complex relationships in large quantities of data. Additionally, process knowledge has continued to grow, yielding finer resolution models that capture more complex interactions and can be applied at broader scales. Both modeling approaches have drawbacks that can impede scientific discovery, including the data needs of DL models and the often rigid structures of process-based models.

    We will discuss hybrid modeling approaches, called "process-guided deep learning" (PGDL), which have the potential to offset these limitations by integrating process understanding into advanced machine learning modeling techniques. We show how these hybrid modeling frameworks can better leverage the strengths of both model types. Examples of PGDL predictions (including water temperature and surface water extent dynamics) will be presented.

    View recording.

  • May 10, 2019: Kuolin Hsu, University of California, Irvine

    Remote sensing precipitation using artificial neural networks and machine learning methods

    Satellite remote sensing techniques provide a unique way to monitor precipitation at a global scale, especially for regions where ground measurements are limited. Recent developments in computational intelligence have shown excellent progress in learning from large amounts of in situ and remote sensing data to improve the quality of precipitation measurement. In this presentation, the integration of multi-satellite sensors for precipitation estimation using deep neural networks (DNNs) and recent machine learning methods developed at the Center for Hydrometeorology and Remote Sensing (CHRS) will be presented. Case studies will be demonstrated for monitoring of precipitation from extreme storm events.

    View recording.
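
Illustrative sketch for the April 19 abstract: the split into aleatoric and epistemic uncertainty can be made concrete with a small NumPy example. This is a minimal sketch of the commonly used Monte Carlo dropout decomposition, not the speaker's implementation. It assumes a network that, on each stochastic forward pass with dropout left on, outputs a predicted mean and a predicted noise variance per time step. The function name `decompose_uncertainty` and the toy numbers are hypothetical, and the forward passes are simulated with random draws rather than produced by a trained model.

```python
import numpy as np


def decompose_uncertainty(means, variances):
    """Combine stochastic forward passes into uncertainty terms.

    means, variances: arrays of shape (n_passes, n_timesteps), holding the
    predicted mean and predicted noise variance from each dropout pass.
    """
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    predictive_mean = means.mean(axis=0)
    # Epistemic: how much the predicted means disagree across dropout passes.
    epistemic_var = means.var(axis=0)
    # Aleatoric: the average noise variance the network itself predicts.
    aleatoric_var = variances.mean(axis=0)
    total_std = np.sqrt(epistemic_var + aleatoric_var)
    return predictive_mean, epistemic_var, aleatoric_var, total_std


# Simulated stand-in for 50 dropout passes over 4 time steps (hypothetical values).
rng = np.random.default_rng(0)
n_passes, n_steps = 50, 4
signal = np.array([0.20, 0.25, 0.30, 0.28])  # made-up soil moisture values
sim_means = signal + rng.normal(0.0, 0.01, size=(n_passes, n_steps))
sim_variances = np.full((n_passes, n_steps), 0.02 ** 2)

mean, epistemic, aleatoric, total_std = decompose_uncertainty(sim_means, sim_variances)
ratio = epistemic / aleatoric  # the epistemic:aleatoric ratio mentioned in the abstract
```

In this sketch the two variance terms are simply added to form the total predictive variance. The abstract notes that the terms can be correlated in practice, so the additive form should be read as a simplifying assumption.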