Sharing Data Helps Puerto Ricans Rebound After Hurricane Maria

By Julia Hart, Christina Bandaragoda, and Graciela Ramirez-Toro

[Original article from EOS Project Update.]

On 20 September 2017, Hurricane Maria made landfall in Puerto Rico as a category 4 hurricane. At the time, Maria was the fifth-largest storm to hit the United States and the largest to hit Puerto Rico in over 80 years [Cortés, 2018]. Bisecting the island with sustained winds of 155 miles per hour (250 kilometers per hour), Maria left a trail of devastation in its path [Cortés, 2018] and would go on to claim nearly 3,000 lives [Santos-Burgoa et al., 2018]. Heavy winds and flash flooding razed homes, businesses, and power lines, plunging Puerto Rico’s nearly 3.4 million people into darkness. The devastation underscored concerns about how we address vulnerability and adaptation planning, and it highlighted opportunities for transformative change [Eakin et al., 2018].

In the weeks that followed Maria, a water crisis ensued. Without electricity, water could not be treated or distributed to people’s homes; residents had no drinking water or water with which to bathe or flush a toilet. As a result, residents turned to potentially contaminated streams, rivers, and creeks, risking exposure to disease-causing bacteria like Leptospira. A month following the storm, several confirmed cases of leptospirosis, which may be fatal, were reported to the Centers for Disease Control and Prevention [Rodríguez-Díaz, 2017].

After Maria, widespread disruption of drinking water treatment and distribution systems, as well as a lack of information regarding water quality, posed a significant health risk in Puerto Rico. Thus, the hurricane demonstrated a need to strategically archive and disseminate data relevant to water quality and public health to both researchers and community members.

An interdisciplinary team of researchers sought to fill this need, with support from the National Science Foundation. The team included researchers at the University of Washington, Virginia Tech, University of Pennsylvania, Utah State University, Interamerican University of Puerto Rico, and the Consortium of Universities for the Advancement of Hydrologic Science, Inc. (CUAHSI). These researchers developed an open-source research software infrastructure to support scientific investigation and data-driven decision-making following natural disasters, with a pilot project focused on drinking water and Hurricane Maria data. The team maintains that the scientific community can do more to reduce the cost and human impact of destructive hurricanes.

Getting Started

The first objective of this project was to collect water quality data from across Puerto Rico. The team collected water samples from drinking water, surface water, and wastewater systems in collaboration with public water supply utilities in early 2018. They analyzed samples for microbial, chemical, and biological water quality parameters, including a comprehensive panel of opportunistic, waterborne pathogens like Leptospira. Researchers also collated publicly available, spatially explicit data about property and infrastructure damage, landslide disturbance, power outages, and the availability of medical aid, food, and water from other databases.

The team’s second objective was to build a cyberinfrastructure capable of integrating and disseminating these diverse data sets. Cyberinfrastructure simultaneously provides an accessible online platform, connects stakeholder communities, and houses the software tools necessary to advance computing and data analysis research needs. In other words, cyberinfrastructure provides the bridges, roads, and highways for the storage, analysis, and sharing of water data and all forms of digital information. Users (e.g., researchers, community stakeholders, and public utility managers) may navigate cyber highways to discover information about water quality, major public health concerns, or hydrologic modeling tools.

The bulk of field data collection and cyberinfrastructure development for this project is now complete, but the researchers will continue to add new data sources and time series data as they become available. Ongoing cyberinfrastructure improvements (e.g., regular software maintenance) also ensure continued refinement of this data platform.

Managing the Data

The researchers used HydroShare, an online, collaborative platform, as a centralized cyberinfrastructure for all data relevant to Hurricane Maria water quality and recovery efforts. First launched in 2015, HydroShare is an online data repository operated by CUAHSI that already boasts more than 3,000 users. It currently enables water researchers from around the world to upload and manage a wide variety of hydrologic data types, models, and code and to make this information available in a citable, shareable, and discoverable manner [Horsburgh et al., 2016; Yi et al., 2018].

The main advantages of this platform are sharing controls and accessibility. Anyone can become a HydroShare member and gain access to dozens of unique public data sets related to Hurricane Maria by joining the Puerto Rico Water Studies group. The team ensures high data quality by using data model templates for quality assurance prior to data publication. Data incorporated from other sources or databases (collected independently of this project) rely on the quality assurance protocols of those sources to ensure high quality. Regardless of their source, the data are reported in a consistent, well-documented format, according to findable, accessible, interoperable, and reusable (FAIR) data principles [Wilkinson et al., 2016]. These principles are designed to address some of the biggest challenges facing data-intensive science, including transparency, reproducibility, and reusability.
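
As a rough illustration of what a template-based quality check can look like, the Python sketch below tests one sample record against a small template before it would be published. The article does not describe the team's actual templates, so the field names, the `TEMPLATE` dictionary, and the `validate_record` function are hypothetical. The only ranges enforced are physical ones (pH between 0 and 14, nonnegative turbidity, liquid-water temperatures), not regulatory limits.

```python
# Hypothetical template: field name -> (accepted types, (low, high) bounds or None).
# A bound of None on one side means that side is unchecked.
TEMPLATE = {
    "site_id": ((str,), None),
    "sample_date": ((str,), None),
    "ph": ((int, float), (0.0, 14.0)),
    "turbidity_ntu": ((int, float), (0.0, None)),
    "water_temp_c": ((int, float), (0.0, 100.0)),
}


def validate_record(record, template=TEMPLATE):
    """Return a list of problems found in one record (empty list means it passes)."""
    problems = []
    for name, (accepted_types, bounds) in template.items():
        if name not in record:
            problems.append(f"{name}: missing")
            continue
        value = record[name]
        if not isinstance(value, accepted_types):
            problems.append(f"{name}: unexpected type {type(value).__name__}")
            continue
        if bounds is not None:
            low, high = bounds
            if low is not None and value < low:
                problems.append(f"{name}: {value} is below {low}")
            if high is not None and value > high:
                problems.append(f"{name}: {value} is above {high}")
    return problems


# Made-up example records, for illustration only.
good = {"site_id": "S1", "sample_date": "2018-02-01",
        "ph": 7.2, "turbidity_ntu": 0.8, "water_temp_c": 24.0}
bad = {"site_id": "S1", "sample_date": "2018-02-01",
       "ph": 15.2, "water_temp_c": 24.0}

print(validate_record(good))
print(validate_record(bad))
```

A record would be held back from publication until its list of problems is empty. A real template would likely also cover units, methods, and other metadata.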

Although data transparency was a desired outcome of this project, this level of transparency could not violate the privacy of data contributors, many of whom were directly affected by Hurricane Maria. As such, water quality scientists and public health researchers alike contributed and published deidentified data at a spatial scale that protects individuals’ privacy. For example, the team used a staged publication process in which users privately uploaded individual household or treatment plant records but published data only at county- or municipality-level spatial resolution for planning, disaster response, and population health research purposes.
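
The spatial aggregation described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the team's procedure. The record fields, the `aggregate_by_municipality` function, and the sample values are hypothetical. The `min_records` threshold, which withholds areas with too few records, is an added assumption that the article does not mention.

```python
from collections import defaultdict
from statistics import mean


def aggregate_by_municipality(records, value_field, min_records=1):
    """Summarize private records by municipality, dropping household identifiers.

    min_records is an assumed safeguard: municipalities with fewer records
    than this are left out of the published summary.
    """
    groups = defaultdict(list)
    for rec in records:
        groups[rec["municipality"]].append(rec[value_field])

    summary = {}
    for municipality, values in groups.items():
        if len(values) < min_records:
            continue  # too few records to publish without risking identification
        summary[municipality] = {"n": len(values), "mean": round(mean(values), 2)}
    return summary


# Made-up private records, for illustration only.
private_records = [
    {"household_id": "H-001", "municipality": "Municipality A", "turbidity_ntu": 1.0},
    {"household_id": "H-002", "municipality": "Municipality A", "turbidity_ntu": 3.0},
    {"household_id": "H-003", "municipality": "Municipality B", "turbidity_ntu": 2.0},
]

published = aggregate_by_municipality(private_records, "turbidity_ntu", min_records=2)
print(published)
```

Only the summary would be made public; the household-level records stay in the contributor's private resource.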

Potential Benefits

With so many diverse data sets available in one place, the HydroShare cyberinfrastructure creates an unprecedented opportunity for interdisciplinary research and new applications of data. In the future, population health researchers may use published, geospatially anonymous clinical records in conjunction with environmental data (e.g., surface, ground, and drinking water quality data) to identify populations that have experienced health-related impacts of Hurricane Maria. Meteorologists may compare the physical characteristics of the hurricane with pictures of the destruction using an interactive, digital story map. Hydrologists may explore how the island has responded to the hurricane ecologically (e.g., windblown deforestation) while simultaneously identifying areas newly susceptible to landslides. In this way, a centralized cyberinfrastructure provides a large, temporally and spatially explicit, and accessible platform for organizing and disseminating a “hurricane of data.”

Cyberinfrastructure in Action

Scientists and other professionals aren’t the only people to contribute to and benefit from this new resource. For example, Porfirio Fraticelli is a volunteer water system operator who has assumed the responsibility of providing safe drinking water to his small, rural community. The research team is currently training Porfirio and other community members in rural southeastern Puerto Rico to upload data from his potable water system directly to HydroShare. Communication lines and Internet access are still limited following Hurricane Maria, so Porfirio will travel to the nearest town to contribute and manage his own private data in HydroShare. He will be able to choose when to make his data publicly available using a digital object identifier (DOI). Until that time, only members of his private HydroShare group, made up of community stakeholders, may view the data.

Publishing his data on HydroShare benefits Porfirio’s community in two ways. First, it communicates the status of his community’s water quality—water temperature, pH, turbidity, alkalinity, and bacterial concentrations—to the government, researchers, and aid organizations. These variables help to answer the critical question, Does this community have safe drinking water? Second, uploading his data to HydroShare connects Porfirio’s small, rural community to a network of communities and municipalities across Puerto Rico who are also contributing their own data. The resultant network of spatially explicit, high-quality data allows interdisciplinary researchers to investigate aquatic ecosystem, physical infrastructure, and public health recovery at broad scales and to deliver research products that may increase community resilience to future hurricanes.

An Unprecedented Opportunity

The data transparency, accessibility, and reproducibility provided by the cyber roads and highways of HydroShare create an unprecedented opportunity for interdisciplinary, collaborative research on the “new normal” of intensified hurricane seasons. For example, HydroShare also houses digital information about Hurricanes Harvey and Irma, which illustrates its usefulness for storing, analyzing, and sharing data after many kinds of extreme events. Centralized cyberinfrastructure introduces a new system of data-driven decision-making in the weeks and months following natural disasters. This system is capable of tackling uncertainty and increasing community resilience to future extreme events.

Addressing real-time communications limitations or water quality concerns during a hurricane is beyond the scope of this project. However, future research, informed by cyberinfrastructure resources, will explore the development of low-cost wireless technology or mesh networks for real-time communication about drinking water and other resources.

To access data collected in this project, create a free HydroShare account online and search for the “Puerto Rico Water Studies” group under the “Collaborate” header. Public web pages linking to data resources are also available at the CUAHSI projects website (see Building Infrastructure to Prevent Disasters and the Hurricanes 2017 Data Archive).


Acknowledgments

We thank the National Science Foundation (NSF) for financial support. We are grateful to Amanda Manaster and David Tarboton for providing feedback on early versions of this article. We are also very grateful to our NSF Rapid Response Research (RAPID) project team: Melitza Crespo Medina, Miguel Leon, Jimmy Phuong, Kelsey Pieper, William Rhoads, Marc Edwards, Amy Pruden, Virginia Riquelme, Ishi Keenum, Ben Davis, Matthew Blair, Greg House, Sean Mooney, Kari Stephens, Erkan Istanbulluoglu, Jerad Bales, Emily Clark, Liza Brazil, William McDowell, Jeffery Horsburgh, David Tarboton, Amber Spackman Jones, Eric Hutton, Gregory Tucker, Lynn McCready, Scott Dale Peckham, W. Christopher Lenhardt, Ray Idaszak, and Tim Ferguson-Sauder.