Automating water savings measurements customized to the unique conditions at your local water utility

In a previous post we laid out how CaDC analytics implements a vision for water efficiency as an integral part of water resource management. In this post we will zero in on the question at the core of demand management: how much water does a given water efficiency intervention save? In particular, this post will provide a history of CaDC research answering that question and explain the value of new, automated and regularly updated water savings estimates that are customized to the unique conditions of a local utility.

A History of CaDC Water Savings Estimation

The California Data Collaborative grew out of computational social science research on the water savings of turf rebate programs conducted at the Center for Urban Science and Progress at New York University. In the first phase of the California Data Collaborative, the focus was on the rapidly expanding turf removal rebates because of their relevance during the height of the drought. Building on what we learned during various attempts to quantify the water savings of turf removal, we have since expanded our work to estimate the water savings of rebate programs more broadly. Our methods have been successively refined to the point where they are now automated and built into our Strategic California UrBan water Analytics (“SCUBA”) data infrastructure.

Our first work on this topic began before the CaDC had officially been established, when our project manager Patrick Atwater and consulting statistician Eric Schmitt published a conference paper at the 2015 Bloomberg Data for Good Exchange analyzing the turf rebate program within Moulton Niguel Water District. That approach used quantile regression to estimate the treatment effect of receiving a turf rebate on water use. This is an econometric approach that explicitly controls for confounding factors like household size, turf area, weather, and seasonality while estimating the effect that each square foot of turf removal has on different quantiles (e.g. the median or the 25th or 75th percentile) of monthly household water use.

The next study we performed looked at a larger data set of three Southern California water agencies and used a very different method to estimate water savings. This improved on the previous approach by directly addressing the core question of any water savings analysis: what would the water usage have been absent the intervention? The core of the approach relied on a software package called Causal Impact that was developed inside of Google to estimate the causal effect of marketing initiatives at the regional scale. We repurposed the approach so that, instead of comparing the impact of advertising on sales across regions, we compared water use in a household that participated in a turf rebate against water use in control households that did not participate. To determine which households to use as controls, we performed a matching procedure to select the six households whose water use patterns were most similar to the participant’s in the time period before the rebate.
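To make the matching step concrete, here is a minimal sketch in Python. It assumes similarity is measured as the Euclidean distance between monthly pre-rebate usage series, which is one plausible choice and not necessarily the metric used in the study. The function names, household ids, and usage numbers are all hypothetical.

```python
import math


def pre_period_distance(a, b):
    """Euclidean distance between two equal-length monthly usage series."""
    if len(a) != len(b):
        raise ValueError("usage series must cover the same months")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k_controls(participant_use, candidate_use, k=6):
    """Return the ids of the k candidate households closest to the participant.

    candidate_use maps a household id to its pre-rebate monthly usage.
    The distance metric is an assumption made for illustration.
    """
    ranked = sorted(
        candidate_use,
        key=lambda hid: pre_period_distance(participant_use, candidate_use[hid]),
    )
    return ranked[:k]


# Hypothetical pre-rebate monthly use for one participant and four
# non-participating households (units are arbitrary but consistent).
participant = [12, 14, 20, 25]
candidates = {
    "A": [12, 15, 19, 24],
    "B": [30, 32, 40, 45],
    "C": [11, 13, 21, 26],
    "D": [5, 5, 6, 6],
}

# k=2 here only because the toy data set is small; the study used six.
controls = top_k_controls(participant, candidates, k=2)
print(controls)
```

With real billing data the candidate pool would be far larger, and usage would likely be normalized before computing distances so that households of different sizes can be compared.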

 Conceptual illustration of the “difference-in-differences” method that uses change in use relative to a control to estimate effect size.

After matching, the participating and non-participating households’ water use is fed into a Bayesian Structural Time Series model. This model takes the water use of the control households as input and predicts what the participant’s water use would have been if they had not taken a rebate (a counterfactual). The difference between this counterfactual prediction and the water that was actually used is then attributed to the rebate. In this way we were able to calculate a savings estimate for each individual household, and these estimates were then combined using a meta-regression model. This allowed us to obtain an overall estimate of average savings per square foot (24.6 gallons per square foot per year) as well as insight into which characteristics of a household might lead to higher savings.

These results were initially published at the 2016 Knowledge Discovery and Data Mining (KDD) conference at a special workshop on Data Science in Food, Energy and Water. The results were later refined and generalized into a method that applies to conservation programs broadly, and that work was accepted for publication in the Annals of Applied Statistics. This latest round hinted at what is possible when analyzing how water savings vary among a population, rather than assuming an average number. For example, the paper showed increased water savings in inefficient and low-income households. A turf removal rebate in a neighborhood with $30k median household income saves one more gallon per square foot per month on average than a rebate in a neighborhood with $80k median household income, a nearly 50% increase on what a typical customer saves. Similarly, a rebate in a household that is usually at 150% of its budget before the rebate will save 1.5 gallons per square foot per month more than a rebate in a household that is at 66% of its budget on average before the rebate.
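As a quick sanity check on these magnitudes, the arithmetic can be worked through in a few lines of Python. This assumes the 24.6 gallons per square foot per year from the earlier study stands in for what a typical customer saves; the refined paper may use a slightly different baseline.

```python
# Average savings reported above, in gallons per square foot per year.
annual_savings = 24.6

# Convert to a monthly rate so it can be compared with the per-month effects.
monthly_savings = annual_savings / 12

# Additional monthly savings quoted in the text (gallons per square foot).
extra_low_income = 1.0
extra_over_budget = 1.5

# Relative increase over the assumed typical monthly savings.
low_income_increase = extra_low_income / monthly_savings
over_budget_increase = extra_over_budget / monthly_savings

print(f"typical monthly savings: {monthly_savings:.2f} gal/sq ft")
print(f"low-income effect: +{low_income_increase:.0%}")
print(f"over-budget effect: +{over_budget_increase:.0%}")
```

Under that assumption, one extra gallon per square foot per month is a little under half of typical monthly savings, which is consistent with the "nearly 50%" figure.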

 Example of difference-in-differences approach in practice from our 2016 KDD paper. The vertical dotted line indicates the date of the post-installation inspection. Notice how observed use flattens out relative to what would be expected if the turf removal had not taken place.

The latest stage in this evolution has been to build a simplified version of this difference-in-differences approach into our data pipelines. In this way, each time we receive new data from a partner utility we can rapidly estimate the water savings at a monthly level for each household. This applies not only to turf removal but also to smart timers, toilets, and even site inspections. This automation work is still in its early stages, but over the next year we plan to build these water savings estimates into the fabric of each tool that we deploy for water managers to help inform decisions around cost effectiveness, outreach targeting, and even program design.
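The difference-in-differences idea can be sketched in a few lines of Python. This is an illustration of the general technique, not the code in the CaDC pipeline. The function name and usage numbers are hypothetical.

```python
from statistics import mean


def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Classic difference-in-differences estimate of a treatment effect.

    Each argument is a list of monthly water use values. The change in the
    control group is subtracted from the change in the treated household,
    so that shared influences such as weather or drought messaging cancel.
    A negative result means the treated household saved water.
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical monthly use before and after a rebate.
treated_pre = [20, 22, 24]
treated_post = [15, 16, 17]
control_pre = [21, 21, 24]
control_post = [20, 20, 23]

effect = diff_in_diff(treated_pre, treated_post, control_pre, control_post)
print(effect)
```

In this toy example the participant's use drops by 6 units while the controls drop by 1, so 5 units per month are attributed to the rebate. The result is only credible if the matched controls would have followed the same trend as the participant without the rebate.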

Localized Savings Measurements for Improved Program Management

California has a diverse array of communities with different socioeconomic, climate, and other local conditions that affect how customers respond to water efficiency programs. Estimates of water savings that are derived from an area that is closer in space, climate, and demographics are more likely to reflect the unique conditions of a water utility’s service area and are therefore more relevant to that water utility’s water management needs. What better way to get estimates like this than to estimate the savings in a water utility’s own service area?

Turf removal again provides an illustrative example of this effect. In the last several years there has been a surge of interest in quantifying the effects of these programs, but prior to this surge there were very few resources to turn to. One of these was an assessment done by the Southern Nevada Water Authority that found the removal of turf and replacement with a xeric landscape could save 55 gallons per square foot per year. In contrast, our findings (and those of others in California) tend to yield results closer to 25-30 gallons per square foot per year. This difference may seem obvious in retrospect due to differences in climates and landscapes, but it highlights the dangers of relying on estimates that aren’t tailored to local context.

Another important factor is the frequency of measurements. The value of automated water savings statistics is that those measurements can be reviewed every couple of months rather than waiting years for a research study to be completed. This frequency means that these empirical water savings can be used as part of annual budget setting processes and the statistics can be used to adaptively manage water efficiency programs.

Customer-level granularity in water savings can open up entirely new possibilities for adaptive management of water efficiency investments compared to broad averages. For example, with savings estimates for each customer it is possible to monitor the actual savings obtained from a landscape transformation against what might be expected. If water use were to increase rather than decrease, and not return to normal even after an establishment period has passed, it might indicate a need for outreach to that customer to provide education on irrigation best practices. That could entail a custom mailer with tips on drought tolerant landscaping, text messages reminding a customer to adjust their smart controller, or a phone call from a customer service representative asking about their water usage patterns.
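A rule like this could be expressed as a simple check on each customer's post-installation use. The sketch below is a hypothetical illustration. The three-month establishment period, the single baseline value per customer, and all names and numbers are assumptions. A production version would more likely compare against a seasonally adjusted counterfactual.

```python
from statistics import mean


def needs_outreach(baseline_use, post_install_use, establishment_months=3):
    """Return True if use is still above baseline after the establishment period.

    baseline_use: typical monthly use before the landscape change.
    post_install_use: monthly use after installation, oldest month first.
    establishment_months: months to ignore while new plants establish
    (an assumed value for illustration).
    """
    settled = post_install_use[establishment_months:]
    if not settled:
        return False  # not enough post-establishment data to judge yet
    return mean(settled) > baseline_use


# Hypothetical customers with a pre-installation baseline and monthly use since.
customers = {
    "cust_1": {"baseline": 20.0, "post": [26, 25, 24, 15, 14, 14]},
    "cust_2": {"baseline": 20.0, "post": [26, 27, 25, 24, 25, 23]},
    "cust_3": {"baseline": 18.0, "post": [22, 21]},
}

flagged = [
    cid
    for cid, rec in customers.items()
    if needs_outreach(rec["baseline"], rec["post"])
]
print(flagged)
```

Here only the second customer is flagged. The first settles well below baseline once the plants are established, and the third has not yet passed the establishment period.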

Lastly, unanticipated elements of human behavior may skew observed savings compared to what might be expected based on an engineering model. An excellent example of this was provided by one water manager who stopped funding for a program that provided free high-efficiency sprinkler heads. An inspection of the empirical water savings showed essentially no savings, prompting the manager to dig deeper and discover that many recipients did not even install the sprinkler heads they received, while others installed them but then increased their irrigation times to compensate for lower flows.

Transparency and Open Source at the Center

The California Data Collaborative utilizes open source analytics wherever possible. The CaDC data team documents key methodology decisions to make those transparent to water managers using CaDC analytics. The methodology for these automated water savings statistics can be found here. This methodological transparency also means that water managers participating in the CaDC benefit from the best and brightest minds across the globe. In addition, this means that the statistical methods utilized to measure water savings can be subject to the highest standards of rigor. For example, the aforementioned statistical work benefited from collaboration with Eric Schmitt, who works as the Head of Research and Development at Protix Biosystems and served as Advising Statistician for the CaDC. Reflecting the growing importance of these measurements, the CaDC will be seeking a new advising statistician in the coming months and developing an advisory board of academic and water industry researchers to guide the development of these automated statistics.


Alliance for Water Efficiency. AWE Tracking Tool Version 3.0, User Inputs.

Alliance for Water Efficiency. 2018 Market Analysis and Recommendations.

Alliance for Water Efficiency. Landscape Transformation Study: 2018 Analytics Report.

Aspen Institute (2017). INTERNET OF WATER: Sharing and Integrating Water Data for Sustainability. Accessed at

California Data Collaborative (2017). Statewide Efficiency Explorer Methodology, Version 1.1. Accessed at

California Department of Water Resources (2015). 2015 UWMP Guidebook for Urban Water Suppliers.

Chestnutt, T., Erbeznik, M., & Pekelney, D. Case Studies of Market Transformation as a Means for Delivering Regional Conservation Results. Prepared for the Metropolitan Water District of Southern California.

Mayer, P. W., DeOreo, W. B., Opitz, E. M., Kiefer, J. C., Davis, W. Y., Dziegielewski, B., & Nelson, J. O. (1999). Residential end uses of water.

Mini, C., Hogue, T. S., & Pincetl, S. (2014). Estimation of residential outdoor water use in Los Angeles, California. Landscape and Urban Planning, 127, 124-135.

Pittenger, D., & Hodel, D. UC Riverside. The California Drought and Landscape Water Use.

Seapy, B. California Urban Water Conservation Council (March 2015). Turf Removal & Replacement: Lessons Learned.

Schmitt, E., Tull, C., & Atwater, P. (2018). Extending Bayesian structural time-series estimates of causal impact to many-household conservation initiatives. Annals of Applied Statistics (pending publication).

Sovocool, K. A., & Morgan, M. (2005). Xeriscape Conversion Study: Final Report. Southern Nevada Water Authority. Accessed at on 29 August, 2018.

Tull, C., Schmitt, E., & Atwater, P. (2016). How Much Water Does Turf Removal Save? Applying Bayesian Structural Time-Series to California Residential Water Demand. California Data Collaborative.