This series has passed.

CUAHSI's 2019 Cyberseminar Series:

Waterhackweek

Host: CUAHSI and University of Washington

Studies of water and environmental systems are becoming increasingly complex and require the integration of knowledge across multiple domains. At the same time, technological advances have enabled the collection of massive quantities of data for studying earth system changes. Fully leveraging these datasets and software tools requires fundamentally new approaches in the way researchers store, access, and process data. Waterhackweek, supported by the National Science Foundation Cybertraining program, serves the national interest by motivating a culture shift within the hydrologic and, more broadly, earth science communities toward open and reproducible software practices that will enhance interdisciplinary collaboration and increase capacity for addressing complex science challenges around the availability, risks, and use of water. This cyberseminar series consists of presentations from the Cybertraining investigators and research software developers, each focusing on a specific water-related use case, tool, or library. Topics will include both introductory and advanced concepts that are relevant to a wide range of water and informatics use cases, e.g., publishing large datasets, running numerical models, organizing collaborative research projects, and meeting journal requirements by following open data standards. The goal of the 2019 series is to prepare the incoming Waterhackweek (March 25-29, 2019) participants for the in-person capstone event, in which their skills and creativity will be used to address natural hazards; however, these topics and technologies are also relevant to the broader water science community. We welcome all undergraduate, graduate, and early career scientists to join us in this public cyberseminar series.

All talks take place on Thursdays at 1:00 p.m. ET.

Dates, Speakers, and Topics:

  • January 17, 2019: HydroShare and community data sharing tools | Daniella Tijerina, CUAHSI
  • January 24, 2019: Jupyter notebooks and workflows on HydroShare | Tony Castronova, CUAHSI
  • January 31, 2019: Visualization of water datasets | Anthony Cannistra, University of Washington
  • February 7, 2019: Data access and time-series statistics | Emilio Mayorga and Yifan Cheng, University of Washington
  • February 14, 2019: Workflows for gridded climate datasets | Bart Nijssen and Diana Gergel, University of Washington
  • February 21, 2019: Introduction to Version Control with Git & GitHub | Valentina Staneva, eScience Institute and University of Washington
  • February 28, 2019: Landlab modeling framework and use cases | Sai S. Nudurupati, Amanda Manaster, Christina Bandaragoda, and Erkan Istanbulluoglu, University of Washington
  • March 7, 2019: Tools for building Apps: Tethys | Rohit Khattar, Brigham Young University and Nathan Swain, Aquaveo

Presentation Abstracts and Recordings

  • January 17, 2019: Daniella Tijerina, CUAHSI

    HydroShare and community data sharing tools

    Data management, sharing, and publication are integral parts of a robust data management plan, a core requirement of all NSF-funded research grants and of many other funding agencies. This seminar will discuss some common challenges and present solutions for managing and sharing data using CUAHSI tools, specifically HydroShare. HydroShare is an online repository system for water data and models that aims to advance hydrologic science by enabling users to manage, share, and publish products resulting from their research and data collection. We will introduce attendees to approaches for managing current and archived data, collaborating within a research group, documenting metadata, and publishing. The webinar will center on tools and techniques within HydroShare that facilitate these activities, employing both discussions and demos.

    View recording.

  • January 24, 2019: Tony Castronova, CUAHSI

    Jupyter notebooks and workflows on HydroShare

    The water science community continually develops and adopts technologies to improve our ability to openly collaborate and share workflows. Ultimately, this will have a transformative impact on how we address the challenges associated with collaborative and reproducible scientific research. One solution to these problems is to use Jupyter notebooks, an open-source platform for creating metadata-rich toolchains for modeling and data analysis applications. Combining this technology with publicly available datasets from agencies such as USGS, NASA, and EPA enables researchers to easily prototype and execute data-intensive toolchains. CUAHSI has invested in this technology by establishing a free and open-source web platform for scientists to (1) conduct data-intensive and computationally intensive collaborative research, (2) utilize high-performance libraries, models, and routines within a pre-configured cloud environment, and (3) disseminate research products. This seminar will discuss CUAHSI’s investment in JupyterHub for supporting water science research, training, and education. Participants can expect a primer on JupyterHub and the cyberinfrastructure that has been designed to support these workflows, as well as detailed demonstrations of common educational and research use cases. A basic understanding of HydroShare.org and the Python programming language is helpful but not required for participation in the live demonstrations.

    View recording.

  • January 31, 2019: Anthony Cannistra, University of Washington

    Visualization of water datasets

    Geospatial data, especially those in hydrology, are uniquely suited to compelling and practical visualization. Maps, in particular, are not only tools for developing an initial understanding of a new set of data but are also used widely to disseminate analytical results in a native manner. This seminar will develop both a high-level understanding of the practice of visualizing geospatial data and practical skills in Python for easily creating geospatial visualizations. In particular, we will discuss the importance of (and historical precedent for) creating a visual narrative for the dissemination of information, address concerns regarding cartographic projections, give a brief overview of common geospatial data types, and provide live demonstrations of common open-source geospatial data visualization packages in Python using hydrologic datasets.

    View recording.

  • February 7, 2019: Emilio Mayorga and Yifan Cheng, University of Washington

    Data access and time-series statistics

    Data about water are found in many formats, distributed by many different sources, and depict different spatial representations such as points, polygons, and grids. How do we find and explore the data we need for our specific research or application? This seminar will present common challenges and strategies for finding and accessing relevant datasets, focusing on time series data from sites commonly represented as fixed geographical points. This type of data may come from automated monitoring stations such as river gauges and weather stations, from repeated in-person field observations and samples, or from model output and processed data products. We will present and explore useful data catalogs, including the CUAHSI HIS catalog accessible via HydroClient, CUAHSI HydroShare, the EarthCube Data Discovery Studio, Google Dataset Search, and agency-specific catalogs. We will also discuss programmatic data access approaches and tools in Python, particularly the ulmo data access package, touching on the role of community standards for data formats and data access protocols. Once we have accessed the datasets we are interested in, the next steps are typically exploratory, focusing on visualization and statistical summaries. This seminar will illustrate useful approaches and Python libraries for processing and exploring time series data, with an emphasis on the distinctive needs posed by temporal data. Core Python packages used include Pandas, GeoPandas, Matplotlib, and the geospatial visualization tools introduced in the previous seminar. The approaches presented can be applied to other data types that can be summarized as single time series, such as averages over a watershed or data extracts from a single cell in a gridded dataset – the topic for the next seminar.
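    As a rough illustration of the kind of exploratory time-series summaries described above, the sketch below uses Pandas on a synthetic daily discharge record. It is not taken from the seminar materials: the values, the series name `discharge_cms`, and the simulated sensor outage are all assumptions made up for the example, and no real data service is queried.

```python
import numpy as np
import pandas as pd

# Synthetic daily discharge record (hypothetical values, not real gauge data):
# a seasonal cycle plus random noise.
rng = np.random.default_rng(seed=0)
dates = pd.date_range("2018-01-01", "2018-12-31", freq="D")
day_of_year = dates.dayofyear.to_numpy()
seasonal = 50 + 30 * np.sin(2 * np.pi * day_of_year / 365)
flow = pd.Series(
    seasonal + rng.normal(0, 5, len(dates)),
    index=dates,
    name="discharge_cms",  # hypothetical variable name
)

# Simulate a five-day sensor outage to show how gaps are handled.
flow.iloc[100:105] = np.nan

# Statistical summary of the whole record (missing values are skipped).
summary = flow.describe()

# Monthly mean: group the daily values by calendar month (1..12).
monthly_mean = flow.groupby(flow.index.month).mean()

# 7-day rolling mean smooths day-to-day noise; windows with fewer than
# five valid observations yield NaN instead of a misleading average.
rolling_7d = flow.rolling(window=7, min_periods=5).mean()

# Fill the short gap by linear interpolation weighted by elapsed time.
filled = flow.interpolate(method="time")

print(summary)
print(monthly_mean.round(1))
```

    Grouping by `flow.index.month` is only one way to aggregate; for multi-year records, a resampling step that keeps the year would usually be the better fit.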

    View recording.

  • February 14, 2019: Bart Nijssen and Diana Gergel, University of Washington

    Workflows for gridded climate datasets

    Climate change, forecasting, satellite datasets, large model ensembles ... Large gridded datasets are everywhere in hydrology and earth science. While accessing and analyzing these datasets required some serious programming skills not so long ago, a number of toolkits are now available that let you easily access, ingest, analyze, and display gridded climate datasets. In this webinar we’ll discuss one of the most common file formats used in our field for large datasets, the Network Common Data Form (NetCDF), and step through a Jupyter notebook to showcase Python packages, such as xarray and cartopy, that can be used to examine them. No prior experience is required, although we will build on some of the skills you have acquired in earlier webinars in the series.

    View recording.

  • February 21, 2019: Valentina Staneva, eScience Institute and University of Washington

    Introduction to Version Control with Git & GitHub

    Managing and sharing scientific code is an invaluable skill for researchers today. However, the version control systems that facilitate this are sometimes hard to master. In this training, you will learn how to use Git & GitHub to:

    • put your code under version control and publish it online
    • track changes in your code, and retrieve old versions
    • collaborate with others on the same project

    To set up Git on your computer, follow these instructions.

    View recording.

  • February 28, 2019: Erkan Istanbulluoglu, Christina Bandaragoda, Amanda Manaster, and Sai Siddhartha Nudurupati, University of Washington

    Landlab modeling framework and use cases

    Landlab is an open-source Python toolkit that provides a modeling environment for building 2D numerical Earth surface models. This toolkit is designed to accelerate the development of new process-based models by providing tools to create a grid, data structures for storing and managing data on the grid, and easy-to-use plotting and visualization functions. In this cyberseminar, we will walk through the basics of Landlab using HydroShare, including a simple example of an Earth surface model. In addition, we will provide two more water-specific examples (an Overland Flow example and an Ecohydrology example) for further independent exploration. We will discuss the two additional examples in detail during Waterhackweek. No prior experience is required, but a basic understanding of HydroShare and Python is helpful.

    View recording.

  • March 7, 2019: Rohit Khattar, Brigham Young University and Nathan Swain, Aquaveo

    Tools for building Apps: Tethys

    Tethys is a Python-based web platform that can be used to develop and host environmental applications. It is free and open source and includes commonly used web tools such as PostGIS, GeoServer, mapping gizmos built on OpenLayers and Google Maps, and HTCondor for distributed computing. Tethys Platform is powered by the Django Python web framework, giving it a solid web foundation with excellent security and performance. Tethys aims to lower the technical barrier for scientists and engineers to develop environmental web applications and share their work with the world. In this cyberseminar, we will give a brief introduction to Tethys and walk through the Beginner Concepts section of the Tethys Tutorial. We will discuss Tethys in detail during Waterhackweek, when we will employ some of the more advanced concepts to develop our projects. Prior experience in Python and Django will be helpful.

    Docs: http://docs.tethysplatform.org/en/stable/tutorials/getting_started.html

    View recording.