CaDC Statewide Efficiency Explorer Methodology v 1.1

CaDC staff is proud to announce the completion of its first rapid assessment of the prospective residential component of the statewide efficiency goals described in the implementation framework for Governor Brown’s Executive Order B-37-16. This assessment integrates publicly available evapotranspiration, land use, service area boundary, aerial imagery, population and water production data to estimate residential water efficiency goals for 404 of California's 409 major urban water retailers reporting in the latest supplier report. The assessment offers water suppliers a first look at water use compared to residential efficiency goals and illustrates the need for enhanced data sources and additional information.

Supporting the CA water community in planning for the future

This assessment is a marked improvement over the previous CaDC parcel-based methodology, which was until now the best statewide approximation publicly available online. The assessment's calculations are shown via an interactive tool in which water users can input and analyze various policy scenarios. This tool was developed for planning and education purposes as a public service to support the water community in navigating the rapidly evolving statewide policy discussions.

As described in the original grant agreement with the Water Foundation, "This interactive planning tool empowers the California water community to analyze the impact of those prospective efficiency standards under user selected scenarios with varying indoor or outdoor efficiency standards." The CaDC partnership does not take water policy positions, as described in depth in the CaDC principles here. The tool also illustrates the need for more accurate landscape, population, land use, and weather data as part of an integrated approach to improving these estimated goals in version 2.0.

The open source CaDC efficiency explorer tool is described in greater detail in the statewide efficiency section here. The underlying open source code is available here. The interactive tool leverages ARGO nonprofit public data infrastructure, which makes it possible to iteratively improve this initial rapid assessment.

CaDC staff would like to thank the Water Foundation for generously funding this work, Claremont Graduate University for developing landscape area data, the CaDC local utility technical working group for their invaluable insight, and CaDC academic partners for their review. The water policy-neutral methodology, developed in collaboration with those partners and used to estimate residential efficiency goals, is available below. (UPDATED: see here to download a PDF of the version 1.1 methodology that is also shown below. Also, please see here for a one-page statement summarizing the uses of the tool.)

This methodology documentation makes a great effort to highlight future opportunities for improvement, as this endeavor is a rapid first assessment, not a final definitive result. CaDC staff has statistically quantified the expected utility-level error and included error bars on the residential goals shown.

That statistical error calculation is available in depth here. Furthermore, CaDC staff is qualitatively analyzing the unique local circumstances that can lead to data quality challenges for all 404 agencies. Those factors include CIMIS station proximity, administrative area boundary issues, rural residential parcels, and the prevalence of local factors that make remote sensing difficult.

In addition, CaDC staff is collaborating with the state, the CaDC coalition of local water utilities and the CaDC network of academic, technology and nonprofit partners on improving the underlying land use, service area boundary, landscape area and evapotranspiration data used for this initial assessment. The CaDC welcomes the participation of other water suppliers in the coalition to help improve data accuracy, this tool and other CaDC analytics. Get in touch here to join and stay tuned for future updates!

In the interim, feedback and suggestions for improvements in future iterations are appreciated! Please leave your questions and ideas in the comments section below.

UPDATE 6-5-17: Based on CaDC technical working group feedback, the following section has been added to the efficiency explorer tool to provide important data quality considerations. There are two distinct senses in which efficiency goal calculations can deviate from ground truth: precision and accuracy.

Parameter data used to calculate goals can be imprecise. Imprecision reflects deviations around a true value. The Efficiency Explorer's graphs include gray confidence bands around each agency's calculated goal to indicate the imprecision resulting from the compounded statistical error for all parameter data sources. Analogous to the relationship between the darts and the bullseye in figure (a) above, one should expect the ground truth efficiency goal values to lie somewhere within the confidence bands (for agencies not flagged as showing evidence of systematic bias away from accuracy). Imprecise goal calculations are good initial estimates of ground truth, though ones that can be further refined.

As alluded to above, in certain situations parameter data used to calculate goals can be not only imprecise, but also inaccurate. Inaccuracy reflects a more systematic bias away from ground truth. Figure (b) above graphically illustrates this type of error. Non-random inaccuracies can arise from situations such as the prevalence of large rural residential parcels in certain districts, which would result in systematic overestimation of goal calculations in those districts. The prevalence of brown lawns in other districts would result in systematic underestimation of goal calculations in those districts. These types of data quality uncertainties will be elaborated on in an upcoming CaDC blog post. Goals flagged as systematically biased away from ground truth have been grayed out on the map and should be interpreted as being potentially inaccurate.
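
To make the distinction between imprecision and inaccuracy concrete, here is a minimal simulation sketch in Python. All numbers (a true goal of 100, a random spread of 10 and a bias of 20) are hypothetical and chosen only for illustration; they are not CaDC data and this is not the CaDC's error model.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

TRUE_GOAL = 100.0  # hypothetical ground-truth efficiency goal
NOISE_SD = 10.0    # hypothetical random error (imprecision)
BIAS = 20.0        # hypothetical systematic error (inaccuracy)
N = 10_000         # number of simulated estimates

# Imprecise but accurate: estimates scatter around the true value.
imprecise = [TRUE_GOAL + rng.gauss(0, NOISE_SD) for _ in range(N)]

# Imprecise and inaccurate: the same kind of scatter, shifted away from the truth.
biased = [TRUE_GOAL + BIAS + rng.gauss(0, NOISE_SD) for _ in range(N)]

imprecise_offset = statistics.mean(imprecise) - TRUE_GOAL
biased_offset = statistics.mean(biased) - TRUE_GOAL

print(f"imprecise only: mean offset {imprecise_offset:+.2f}, "
      f"spread {statistics.stdev(imprecise):.2f}")
print(f"biased as well: mean offset {biased_offset:+.2f}, "
      f"spread {statistics.stdev(biased):.2f}")
```

In this sketch both sets of estimates have a similar spread, but only the first is centered on the true value. Refining noisy parameter data narrows the spread, whereas a systematic offset stays in place until the underlying data source itself is corrected.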

The efficiency explorer methodology linked above has been updated (now version 1.1) to reflect this nuance.

UPDATE 6-23-17: The CaDC efficiency explorer tool has added the following qualifying statement in bold on the landing page for the tool, in addition to a splash screen that gives additional data quality context before the tool is used:

"The Efficiency Explorer Tool was developed with publicly available data to offer water managers a first glance at water use compared to potential water efficiency goals. It is for educational and illustrative purposes only. The Efficiency Explorer Tool was not intended and is not able to calculate water agency budgets at a level of accuracy appropriate for establishing policy. Several areas for improvement were identified as this tool was developed and the CaDC is dedicated to working with members and stakeholders to improve the accuracy and precision of this tool."

Please see here for a one-page statement summarizing the uses of the tool. The CaDC has also detailed additional data quality considerations for statewide efficiency goal setting.

UPDATE 8-10-17: We have identified a methodological improvement to account for edge cases in our assignment of residential parcels to utilities.

Our process for determining residential landscaped area joins two separate parcel datasets, one with landscaped area measurements and one with a land use classification. Before joining, we filter out duplicate records in both datasets to avoid any possible administrative data quality issues. To do this, one can filter on distinct combinations of APN (assessor's parcel number) and county, or on distinct combinations of APN and a unique supplier identifier; in most cases the results will be identical.

However, there are cases where a parcel is associated with two suppliers due to boundary overlaps. We have addressed this administrative data issue by recognizing that in most cases it results from wholesaler boundaries strictly subsuming retailer boundaries, and in turn assigning conflicting parcels to the supplier with the smaller overall area. While this handles most cases, filtering on APN and county, rather than on APN and a unique supplier identifier, can still remove the parcel records associated with the smaller supplier before the join. Switching to the latter filter avoids this unwanted pre-join filtering and is therefore a methodological improvement.
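
The edge case can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the CaDC's actual code: the field names (`apn`, `county`, `supplier_id`), the helper functions and the sample records are all hypothetical.

```python
def distinct(records, keys):
    """Keep the first record seen for each distinct combination of `keys`."""
    seen = set()
    kept = []
    for rec in records:
        combo = tuple(rec[k] for k in keys)
        if combo not in seen:
            seen.add(combo)
            kept.append(rec)
    return kept


def assign_to_smaller_supplier(records, supplier_area):
    """Give each parcel to the claiming supplier with the smaller overall area."""
    best = {}
    for rec in records:
        parcel = (rec["apn"], rec["county"])
        current = best.get(parcel)
        if (current is None
                or supplier_area[rec["supplier_id"]] < supplier_area[current["supplier_id"]]):
            best[parcel] = rec
    return list(best.values())


# Hypothetical data: parcel 001 lies inside both a wholesaler (W) and a
# retailer (R) boundary; parcel 002 is a plain duplicate record.
records = [
    {"apn": "001", "county": "A", "supplier_id": "W"},
    {"apn": "001", "county": "A", "supplier_id": "R"},
    {"apn": "002", "county": "A", "supplier_id": "R"},
    {"apn": "002", "county": "A", "supplier_id": "R"},
]
supplier_area = {"W": 500.0, "R": 40.0}  # hypothetical overall areas

# Filtering on APN and county drops the retailer's record for parcel 001
# before the overlap can be resolved.
by_county = distinct(records, ("apn", "county"))
old_result = assign_to_smaller_supplier(by_county, supplier_area)

# Filtering on APN and supplier keeps both claims, so the smaller supplier wins.
by_supplier = distinct(records, ("apn", "supplier_id"))
new_result = assign_to_smaller_supplier(by_supplier, supplier_area)

print([r["supplier_id"] for r in old_result])  # parcel 001 stays with W
print([r["supplier_id"] for r in new_result])  # parcel 001 goes to R
```

In this toy data the wholesaler's record happens to come first, so the county-based filter keeps it and discards the retailer's claim. Which record survives in practice depends on the order of the input, which is why the supplier-based filter is the safer choice.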

Most importantly, we include this update only for scientific transparency. These edge cases were already included in our +/- 40 percent error estimates and data quality consideration flags. The aggregated landscaped area measurement has changed outside of the original error bounds for only one supplier not already flagged with data quality considerations (“Shasta Lake City of”).
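
As a rough illustration of what "outside of the original error bounds" means, the check can be sketched as below. It assumes the +/- 40 percent band is symmetric around the original measurement; the function name and the sample values are hypothetical and are not actual supplier figures.

```python
def outside_error_bounds(original, revised, relative_error=0.40):
    """Return True if `revised` falls outside original +/- relative_error."""
    lower = original * (1 - relative_error)
    upper = original * (1 + relative_error)
    return not (lower <= revised <= upper)


# Hypothetical aggregated landscaped areas (arbitrary units).
print(outside_error_bounds(1000.0, 1300.0))  # a 30 percent change stays inside the band
print(outside_error_bounds(1000.0, 1500.0))  # a 50 percent change falls outside it
```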