A methodology to assess vulnerability in small communities drinking water systems

22, 2019
ABSTRACT
In many countries, small community systems play an important role as suppliers of drinking water for a large part of the population. These systems must be evaluated with respect to their capacity to produce and deliver safe drinking water. As there are thousands of small systems to be evaluated in any given region, it is necessary to develop a procedure for selecting a representative sample, as well as to use indicators that can provide information about the state of the systems. The objective of this study was to propose and apply a methodology to evaluate the vulnerability of small community drinking water systems. The methodology includes the application of a statistical method to select a representative sample of systems in a region. It also proposes vulnerability indicators, ratings and an index. As a case study, the methodology was applied to determine the vulnerability of small community drinking water systems in the state of Rio Grande do Sul, Brazil. Data collected with the proposed methodology indicated that 67% of the surveyed systems were classified as having intermediate or high levels of vulnerability, hence at risk of distributing water that is not safe


INTRODUCTION
In 2015, it was estimated that 2.5 billion people lacked access to safely managed drinking water services, defined as those that are located on home grounds, available when needed and free from fecal and chemical contamination (WHO, 2017). Improvements in drinking water and sanitation yield health and economic benefits. Contaminated water is related to the occurrence of diarrhea, cholera, schistosomiasis, and many other diseases (GERBA; NICHOLS, 2015). It is estimated that contaminated drinking water results in more than half a million deaths each year. Combined with inadequate sanitation, contaminated water causes about 1.5 percent of all deaths (PRÜSS-USTÜN et al., 2014).
A large percentage of the population relies on small drinking water systems (SDWS) for their basic needs. Drinking water systems in small communities have been the subject of many publications in the past (WAGNER; LANOIX, 1959; SAUNDERS; WARFORD, 1976; HOFKES, 1983), and continue to be so to this date (CAMERON et al., 2011; OXENFORD; BARRETT, 2016; IWA, 2017). Small systems differ from those of larger cities with respect to management, infrastructure, technology and operation. They frequently lack trained personnel and have limited financial resources to maintain the systems' sustainability (WHO, 2012b). The topic is not limited to the developing world. In Canada, in 2015, about 2,000 small communities were under boil water advisories because of failures in drinking water systems (LUI, 2015). In the European Union, one in every ten people gets their drinking water from small systems (HULSMANN, 2005). In the United States, 97% of the systems provide drinking water to 10,000 or fewer people (USEPA, 2015). In the Nordic countries of Europe, microbial non-compliance in small supply systems is eleven times higher than in large supplies (GUNNARSDOTTIR et al., 2017). Recognizing the importance of the topic, the World Health Organization published a manual for the development and implementation of water safety plans for small community drinking water systems (WHO, 2012a).
This article describes a methodology that can be used to assess the vulnerability of small community drinking water systems. The method uses statistical techniques to choose a representative sample in a region. Ten indicators, grouped into four dimensions, were selected to characterize the systems. A vulnerability index (VI) was proposed using the values given for each indicator. Statistical analysis of the VI provides useful information on the conditions of the drinking water systems in the chosen region.
The methodology was applied in the state of Rio Grande do Sul, Brazil, uncovering the main causes of vulnerability in the water supply systems of the selected region. Results from the assessment can be used by public agencies to guide priorities and investments.

Vulnerability dimensions and indicators
Ten vulnerability indicators were proposed in four different dimensions, focusing on the assessment of small community drinking water systems. The indicators were chosen after a review of other studies on the subject (ALESSA et al., 2008;SULLIVAN, 2011;HURLEY;SADIQ;MAZUMDER, 2012;PLUMMER;LOË;ARMITAGE, 2012;WWAP, 2012;WHO, 2012a). Table 1 presents the proposed dimensions and their indicators.
Each indicator was evaluated on a scale ranging from zero to one. "Zero" indicates a system that is extremely vulnerable in that characteristic, and "one" a system that is not vulnerable in that aspect. Each indicator therefore has different levels of vulnerability. Table 2 shows, for each indicator, the rating values, the associated vulnerability descriptions and the reasoning behind them.

Vulnerability index
The proposed vulnerability index (VI) is the weighted average of the indicators' ratings by dimension (Equation 1).
$$VI = \sum_{k} W_k \, \frac{I_k}{n_k} \qquad (1)$$
where VI is the vulnerability index; $I_k$ is the sum of the ratings of the vulnerability indicators in dimension k; $W_k$ is the weight assigned to dimension k ($\sum_k W_k = 1$); and $n_k$ is the number of indicators analyzed in dimension k.

Table 1. Proposed vulnerability dimensions and indicators (rows recovered from the source). Dimension; indicator; description.
Water resources; Water distribution network; Occurrence of problems in the water distribution network.
Economic; Economic sustainability; Economic sustainability for the system O&M and investments.
Human resources; Technical capacity; Technical capacity of the system personnel.
Institutional; External support; Support from health agencies, local government, and/or non-governmental organizations.
Institutional; Surveillance and control: water quality monitoring; Surveillance and control monitoring of the drinking water quality in compliance with health regulations.

The weights assigned to the dimensions water resources, economic, human resources and institutional were, respectively, 0.6, 0.1, 0.1 and 0.2 ($\sum_k W_k = 0.6 + 0.1 + 0.1 + 0.2 = 1$), according to the number of indicators in each dimension (6, 1, 1 and 2). Therefore, each indicator had the same weight. Stakeholders did not participate in the assignment of weights because of the timeframe of this study, but their participation can be considered in future applications of the methodology.
Three vulnerability levels (low, intermediate and high) were considered. The ranges of values assigned to each level were: low (0.65 < VI ≤ 1.00), intermediate (0.40 ≤ VI ≤ 0.65) and high (0.00 ≤ VI < 0.40).
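The index calculation and the level thresholds described above can be sketched in a few lines of Python. This is only an illustration: the function names, the dictionary keys and the ratings of the example system are hypothetical, and the index is assumed to be the weighted average of the mean rating per dimension, as the definitions of $W_k$, $I_k$ and $n_k$ suggest.

```python
# Dimension weights as stated in the text (0.6, 0.1, 0.1 and 0.2).
WEIGHTS = {
    "water_resources": 0.6,
    "economic": 0.1,
    "human_resources": 0.1,
    "institutional": 0.2,
}


def vulnerability_index(ratings):
    """Weighted average of the mean indicator rating of each dimension.

    `ratings` maps a dimension name to the list of its indicator ratings,
    each between 0 (extremely vulnerable) and 1 (not vulnerable).
    """
    return sum(
        WEIGHTS[dimension] * sum(values) / len(values)
        for dimension, values in ratings.items()
    )


def vulnerability_level(vi):
    """Classify a VI value with the ranges given in the text."""
    if vi > 0.65:
        return "low"
    if vi >= 0.40:
        return "intermediate"
    return "high"


# Hypothetical ratings for a single system (not taken from the survey).
example_system = {
    "water_resources": [1.0, 0.5, 0.5, 0.5, 0.5, 0.0],
    "economic": [0.5],
    "human_resources": [0.5],
    "institutional": [0.0, 1.0],
}

example_vi = vulnerability_index(example_system)
example_level = vulnerability_level(example_vi)
print(round(example_vi, 2), example_level)
```

Because each weight equals one tenth of the number of indicators in its dimension, every indicator contributes 0.1 to the index, which matches the statement that all indicators had the same weight.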
As an example, if the values of indicators water resources availability, water source, level of water treatment, drinking water quality, operation and maintenance, and water distribution network were, respectively, 0.

Sampling
A probabilistic sampling design was used to determine the sample size that would be representative of the whole set of drinking water systems in the region. Four sampling techniques were considered, as explained below.
1) Simple random sample (SRS): the simplest sampling method, in which every unit of the sampled population has an equal probability of being selected. The sample size using SRS is calculated by Equation 3.
$$n = \frac{N \, Z_{\alpha/2}^{2} \, p \,(1 - p)}{E^{2} (N - 1) + Z_{\alpha/2}^{2} \, p \,(1 - p)} \qquad (3)$$
where n is the sample size; N is the population size (in this case the total number of systems in a given region); p is the proportion of the population with the desired characteristic (i.e. the proportion of systems in high vulnerability levels); $Z_{\alpha/2}$ is the standard normal critical value for the chosen confidence level; and E is the absolute error or precision.
The sample size calculated by Equation 3 is applicable to estimate the proportion of the population with the desired characteristic (p). However, the true proportion required by Equation 3 is unknown prior to the study. In this case, Lemeshow et al. (1990) suggest conducting a preliminary survey or using p = 0.5, which provides a sample size with enough observations for most cases. For large studies, SRS of the population is usually neither practical nor economically feasible (COCHRAN, 1977);
2) Probability proportional to size: a sampling technique that applies weights to the population units (in this study, a unit is a drinking water system). The probability of inclusion of a population unit is a function of its weight, the sample size and the total number of systems in the universe;
3) Cluster sampling: usually applied when the units in a population may be divided into heterogeneous groups (clusters), and a sample of the clusters is selected. In a survey, clusters could be municipalities, groups of municipalities, neighborhoods, among others;
4) Stratified sample: the assembling of the units in homogeneous subgroups. The subgroups, or strata, should be mutually exclusive and collectively exhaustive (all water supply systems in the given region must be in one of the subgroups), and at least one system should be sampled in each stratum. Examples of stratification in a study would be the division of the sample universe into subgroups of similar economic base, age, etc.
The efficiency of a sample using these techniques in comparison to one using SRS can be estimated by the design effect. Kish (1965) described the design effect as the ratio between the variance of the parameter estimates in the survey and the variance assuming a simple random sample (Equation 4). A design effect below 1.0 denotes gains in sample efficiency, as the variance estimates are lower than those in an SRS sample of the same size, while values above 1.0 indicate a loss of precision for the same confidence level.
$$deff = \frac{V_{design}}{V_{SRS}} \qquad (4)$$
where deff is the design effect; $V_{design}$ is the variance of the estimate under the sampling design used in the survey; and $V_{SRS}$ is the variance assuming SRS.
For example, cluster-based studies usually result in a design effect greater than 1.0 because the results within clusters tend to be more alike. This is indicated by the presence of a positive intracluster correlation coefficient (ICC). The ICC ranges from -1 to 1, with negative values or values close to 0 suggesting that the results within a cluster are not very similar. On the other hand, positive values of ICC indicate that results within a cluster are likely to be more similar than results in a different cluster. The ANOVA estimator of ICC is defined by Equation 5.
$$\rho = \frac{\mathrm{MSB} - \mathrm{MSW}}{\mathrm{MSB} + (m - 1)\,\mathrm{MSW}} \qquad (5)$$
where ρ is the intracluster correlation coefficient (ICC); MSB is the mean square between clusters; MSW is the mean square within clusters; and m is the cluster size.
Even if the technique used for the sampling design results in a loss of precision in comparison to SRS (deff > 1.0), it has several advantages: it does not require a comprehensive list of the population units to be sampled; it is more practical and economical, as it provides similar results with lower costs; and it ensures a representative sample of the population and subpopulation groups of interest (COCHRAN, 1977). Because of the expected loss of precision, the sample size calculated from Equation 3 should be multiplied by the estimated design effect (Equation 6). Suggested values vary; for instance, the 2013 Brazilian National Health Survey used estimated deff values between 1.4 and 10.4 to calculate the sample size (SOUZA JÚNIOR et al., 2015).
$$n_{est} = deff_{est} \times n \qquad (6)$$
where $n_{est}$ is the estimated sample size; $deff_{est}$ is the estimated design effect; and n is the sample size calculated by Equation 3.
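As an illustration of the ANOVA estimator of the ICC, the sketch below computes MSB, MSW and ρ for clusters of equal size m, assuming the usual one-way ANOVA mean squares and the standard form ρ = (MSB − MSW) / (MSB + (m − 1) MSW). The function name and the index values are hypothetical and are not survey data.

```python
def anova_icc(clusters):
    """ANOVA estimator of the intracluster correlation coefficient.

    `clusters` is a list of equally sized lists of observations
    (for example, the VI values of the systems sampled in each cluster).
    """
    k = len(clusters)      # number of clusters
    m = len(clusters[0])   # cluster size (assumed equal for all clusters)
    grand_mean = sum(sum(c) for c in clusters) / (k * m)
    cluster_means = [sum(c) / m for c in clusters]

    # Mean square between clusters.
    msb = m * sum((mean - grand_mean) ** 2 for mean in cluster_means) / (k - 1)
    # Mean square within clusters.
    msw = sum(
        (x - mean) ** 2
        for c, mean in zip(clusters, cluster_means)
        for x in c
    ) / (k * (m - 1))

    return (msb - msw) / (msb + (m - 1) * msw)


# Hypothetical VI values: three clusters with three systems each.
hypothetical_clusters = [
    [0.30, 0.35, 0.40],
    [0.55, 0.60, 0.50],
    [0.75, 0.80, 0.70],
]
rho = anova_icc(hypothetical_clusters)
print(round(rho, 2))
```

In this made-up example the systems of a cluster resemble each other much more than systems of other clusters, so the estimate is close to 1; identical cluster means would instead push it below zero.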
The sampling design follows this procedure:
1) Estimate a representative sample size for the study using Equation 3;
2) Multiply the sample size by the estimated design effect (Equation 6);
3) Arrange the study region in strata (e.g. geographic regions) and clusters (e.g. groups of municipalities, neighborhoods);
4) Select clusters in each stratum with probability proportional to size (first stage of the sample);
5) Sample a number of systems from each selected cluster with SRS (second stage of the sample).
The number of selected clusters is chosen by the analyst and is a function of the estimated sample size. For example, for an estimated sample size of 100 systems, the number of selected clusters could be 25 (divided proportionally among the strata), with four systems selected in each cluster. One application of this sampling procedure in a region is presented in the case study.
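The two sampling stages of the procedure can be sketched as follows for a single stratum. This is a simplified illustration, not the routine used in the study: clusters are drawn one at a time without replacement, with each draw proportional to the number of systems still available, which only approximates a formal probability-proportional-to-size design. All names and data are hypothetical.

```python
import random


def select_clusters_pps(cluster_sizes, n_clusters, rng):
    """First stage: draw clusters without replacement, each draw
    proportional to the size of the clusters not yet selected."""
    remaining = dict(cluster_sizes)
    chosen = []
    for _ in range(n_clusters):
        names = list(remaining)
        weights = [remaining[name] for name in names]
        pick = rng.choices(names, weights=weights, k=1)[0]
        chosen.append(pick)
        del remaining[pick]
    return chosen


def two_stage_sample(systems_by_cluster, n_clusters, per_cluster, seed=0):
    """Select clusters by size, then take a simple random sample of
    systems inside each selected cluster (second stage)."""
    rng = random.Random(seed)
    sizes = {name: len(systems) for name, systems in systems_by_cluster.items()}
    clusters = select_clusters_pps(sizes, n_clusters, rng)
    return {
        name: rng.sample(systems_by_cluster[name], min(per_cluster, sizes[name]))
        for name in clusters
    }


# Hypothetical stratum with four clusters of different sizes.
stratum = {
    "cluster_A": [f"A{i}" for i in range(12)],
    "cluster_B": [f"B{i}" for i in range(30)],
    "cluster_C": [f"C{i}" for i in range(5)],
    "cluster_D": [f"D{i}" for i in range(18)],
}
sample = two_stage_sample(stratum, n_clusters=2, per_cluster=3, seed=42)
for cluster_name, systems in sample.items():
    print(cluster_name, systems)
```

In a full design this function would be called once per stratum, with the number of clusters per stratum allocated proportionally as described above.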

Statistical analysis
The statistical analysis of the survey provided information that could be extended to the whole region. The analysis considered the sampling design characteristics (probability proportional to size, clustering, stratification and SRS) to produce statistically sound results.
The statistical mean vulnerability of the drinking water systems surveyed in each region was the weighted average of the vulnerability indexes (Equation 7). The same procedure was used to calculate the results in terms of proportion, for example, the proportion of surveyed systems in the assigned vulnerability levels (low, intermediate and high), and for different subpopulations within the sample (e.g. strata).
$$\overline{VI} = \frac{\sum_{j} w_j \, VI_j}{\sum_{j} w_j} \qquad (7)$$
where $\overline{VI}$ is the mean vulnerability index of the surveyed systems; $w_j$ is the sample weight of unit j; and $VI_j$ is the vulnerability index value of the sampled unit j.
Besides sample means and proportions, the statistical analysis also provided confidence intervals for the survey results in each region. They were calculated using Student's t-distribution with the study's defined significance level. Equation 8 shows the calculation of the confidence interval of the vulnerability index mean.
$$CI = \overline{VI} \pm t_{\alpha/2,\,d} \, \sqrt{V(\overline{VI})} \qquad (8)$$
where $\overline{VI}$ is the vulnerability index mean; $t_{\alpha/2,\,d}$ is the critical value of Student's t-distribution with d degrees of freedom for the chosen significance level; and $V(\overline{VI})$ is the variance of the vulnerability index mean in the sample.
For proportions, the result should be converted to a proportional format (0 to 1). The variance of the sample must also consider the sample design techniques used. The variance estimation method used was linearization by Taylor series.
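A minimal sketch of the weighted estimates described in this section is given below. It covers only the weighted mean, a weighted proportion and the interval around a mean; the design-based variance (Taylor series linearization) and the t critical value are not computed here and are passed in as inputs. The function names and all numbers are hypothetical.

```python
import math


def weighted_mean(values, weights):
    """Weighted average of the sampled values using their sample weights."""
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)


def weighted_proportion(values, weights, condition):
    """Weighted share of units whose value satisfies `condition`,
    expressed in proportional format (0 to 1)."""
    flags = [1.0 if condition(v) else 0.0 for v in values]
    return weighted_mean(flags, weights)


def confidence_interval(mean, variance, t_critical):
    """Interval mean ± t * sqrt(variance); the variance and the t value
    must come from a design-based analysis done elsewhere."""
    half_width = t_critical * math.sqrt(variance)
    return mean - half_width, mean + half_width


# Hypothetical VI values and sample weights for four surveyed systems.
vi_values = [0.30, 0.50, 0.70, 0.90]
sample_weights = [100.0, 150.0, 150.0, 100.0]

mean_vi = weighted_mean(vi_values, sample_weights)
share_at_risk = weighted_proportion(vi_values, sample_weights, lambda vi: vi <= 0.65)
lower, upper = confidence_interval(mean_vi, variance=0.0025, t_critical=1.4)
print(round(mean_vi, 2), round(share_at_risk, 2), round(lower, 2), round(upper, 2))
```

A survey package that supports stratification, clustering and weights (the study used Stata) would normally supply the variance and the degrees of freedom that this sketch takes as given.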

Case study
The proposed sampling methodology and vulnerability indicators were applied, as a case study, in the state of Rio Grande do Sul, Brazil. The existing small drinking water systems and water quality surveillance information were collected from the Brazilian Drinking Water Quality System database considering the period from January to December 2014 (BRASIL, 2015). In addition, a field survey on the selected drinking water systems was conducted to collect complementary on-site data for the indicators shown in Table 1.
In Brazil, the national drinking water regulation classifies systems into Water Supply System (WSS), Alternative Collective Solution (ACS), and Alternative Individual Solution (AIS) (BRASIL, 2017). A WSS is usually operated by a municipal or state-owned company that provides drinking water through complete infrastructure, while an ACS is generally a community-based supplier on a smaller scale than a WSS. An AIS is an individual or household drinking water solution.
In this case study, two criteria were applied to select the sample universe: 1) Water supplies registered as ACS; 2) Water supplies registered as WSS serving up to 2,000 people. The target population that met these criteria was 588,513 inhabitants served by ACS and 64,384 served by WSS. The total number of systems was 6,276, of which 6,095 were ACS and 181 were WSS, in 373 municipalities.
The case study aimed to evaluate the application of the sampling methodology and vulnerability indicators to a real case situation. Data was statistically analyzed to allow inferences on the conditions of the small drinking water supplies in the state of Rio Grande do Sul. Statistical analysis was performed using Stata (StataCorp, 2013).

RESULTS
The sample size of drinking water systems was calculated with a proportion of 50% (p=0.5), following the suggestion of Lemeshow et al. (1990) for cases in which the proportion is unknown prior to the study, an absolute error of ± 10% (E=0.1), a population size of 6,276 (N=6,276) and an 80% confidence level. The result indicated a sample size (n) of 41 water supplies using Equation 3, or 61 ($n_{est}$) using Equation 6 with an estimated design effect ($deff_{est}$) of 1.5.
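The sample size figures above can be checked with a short calculation. The sketch assumes that the sample size follows the standard finite-population formula for estimating a proportion, n = N Z² p(1 − p) / (E²(N − 1) + Z² p(1 − p)), and that the 80% level is a two-sided confidence level; the function name is hypothetical.

```python
from statistics import NormalDist


def srs_sample_size(population, proportion, error, confidence):
    """Sample size for estimating a proportion under SRS with a
    finite population (assumed standard formula)."""
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    spread = z ** 2 * proportion * (1 - proportion)
    return population * spread / (error ** 2 * (population - 1) + spread)


# Inputs reported for the case study.
n = srs_sample_size(population=6276, proportion=0.5, error=0.1, confidence=0.80)
n_est = 1.5 * n  # inflate by the estimated design effect

print(round(n), round(n_est))
```

Rounded to the nearest integer, these inputs give the 41 and 61 systems reported in the text.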
The State of Rio Grande do Sul was arranged in clusters and strata. The clusters were municipalities or groups of municipalities, while the strata were the geographical regions of the State. The 373 municipalities of the sample universe were distributed in 343 clusters and allocated to their corresponding stratum. The number of selected clusters in each stratum was based on the population served by small drinking water supplies in each geographical region. At the first stage, clusters were selected with probability proportional to size (number of systems in a particular cluster). At the second stage, the sample was selected among the small water supply systems inside the cluster using SRS. Three systems were selected in each of the twenty clusters sampled at the first stage, resulting in a total of sixty systems to be studied on-site. Among the sixty systems, three were classified as WSS and 57 as ACS. Table 3 summarizes the sampling results, while Figure 1 shows the approximate location of the twenty-one municipalities surveyed.
The sixty selected water supply systems were visited on-site, and each indicator evaluated following the ratings of vulnerability shown in Table 2. This data is presented in Figure 2 with the absolute number of observations for each vulnerability value assigned (0 to 1).
The vulnerability index (VI) was calculated according to Equation 1 and statistically analyzed with Equations 7 and 8. Figure 3 presents the index results for each surveyed system per cluster. Figure 4 shows the statistical mean and confidence intervals considering the state of Rio Grande do Sul and three subpopulation groups (region, population, and the drinking water level of treatment). The mean vulnerability index for the small drinking water systems in the state of Rio Grande do Sul was 0.52, with an 80% confidence interval of 0.47 to 0.58. It is represented by the number zero in the abscissa in Figure 4. It indicates an overall vulnerability at intermediate level.
Geographical regions Central West, Southeast and Southwest were merged into one stratum to produce a sample of at least two clusters.

DISCUSSION
The sampling technique used in the case study resulted in the selection of 60 small drinking water systems from a total universe of 6,276 systems. Twenty clusters were selected, with three water systems per cluster. The clusters were not uniformly spaced within the area of the State, as some regions had more small systems and population (Table 3). Each of the sixty systems was visited on-site and had quantitative values for the 10 indicators shown in Table 1.
A large variety of vulnerability conditions was observed among the small drinking water systems surveyed, as shown in the vulnerability indicator results (Figure 2). Some indicators had a predominance of high (water distribution network), medium (operation and maintenance, and economic sustainability) and low (water resources availability) vulnerability levels, while others had a combination of conditions (water source, level of treatment, drinking water quality, technical capacity, external support, surveillance and control). For instance, the main cause of vulnerability in the indicator "drinking water quality" came from systems that failed to maintain residual chlorine in compliance with regulation. In addition, the indicator "surveillance and control" showed that several systems did not comply with health regulations related to residual chlorine monitoring.
Among the 60 surveyed systems, the number of observations in medium to high vulnerability was large in most indicators. For example, in the indicator "water distribution network", 52 out of 60 systems had ratings from 0.0 to 0.5 (Figure 2). For the indicator "drinking water quality", 47 out of 60 systems had ratings up to 0.5. Only the indicator "water resources availability" had a predominance of low vulnerability. Water quantity was not a major problem for most systems surveyed, except during extended drought events.
Since most surveyed systems inside the clusters belonged to the same municipality, the vulnerability index values were more similar within a cluster (mostly comprised of one municipality) than among different clusters (Figure 3). The estimated intracluster correlation coefficient for this case (Equation 5) was 0.42, which is a consequence of a lower within-cluster component of variance (MSW) with respect to the between-cluster component (MSB). The sampling design resulted in deff (Equation 4) values ranging from 2.0 to 2.5 in all analyses, higher than the estimated 1.5. The main cause of this loss of precision in comparison to SRS was clustering. In order to increase the sample efficiency, an alternative design would consider the division of systems in clusters that include more than one municipality, or micro regions. The sampling technique accomplished the case study objectives with confidence intervals close to those estimated in the sampling design (absolute error of ± 12% at the 80% confidence level, as opposed to the estimated ± 10%).
In terms of the statistical analysis of the vulnerability index, according to Figure 4, the VI was not significantly different among regions and population groups. However, there was a significant difference between systems with proper drinking water treatment and those without (numbers 9 and 10 in Figure 4). A system that has an appropriate treatment technology, trained personnel and monitoring usually provides safe drinking water. In contrast, systems without proper treatment have recurring water quality non-conformities. Figure 5 indicates that there is a substantial number of small water supply systems at risk in the state of Rio Grande do Sul. In the universe of 6,276 systems, it is estimated that between 3,428 and 4,880 are classified in the intermediate and high vulnerability levels (VI ≤ 0.65) at the 80% confidence level. The results indicate that these systems are predominantly those without an appropriate level of treatment. Among the 23 systems with proper treatment technology (case 10 from Figure 4), only five were not assigned a low vulnerability level (VI > 0.65).
Although not statistically verified, it was noted that efficient management, qualified personnel, maintenance of the physical structure, proper treatment technology, water quality monitoring and charging a proper fee for operation and maintenance of the system are variables that foster low vulnerability. From the case study, it was also observed that presence of external support promotes the quality of the small drinking water systems. In only 20 of the surveyed locations there was some support from municipal or state institutions to qualify the systems.
During on-site visits and consultations with sanitary regulation agencies and drinking water companies, a lack of interest of companies in providing drinking water to small systems was frequently reported. The distance from the urban center and economic feasibility were some of the main obstacles mentioned. In addition, the community would have to pay the higher rates usually charged by companies. Therefore, sparse population and perceived high fees are some of the main obstacles to promoting investments in these locations.

CONCLUSIONS
The suggested vulnerability index and sampling technique used in the small community drinking water system case study can be extended to other regions. The method provides guidance on how to evaluate the vulnerability indicators and to calculate the corresponding index. The sampling design accomplished the proposed case study objectives, although with a loss of precision in comparison to SRS. On the other hand, the use of SRS would require higher costs for data collection.
The results of the case study indicated a wide range of situations within the analyzed indicators of vulnerability. The mean vulnerability index of small community water supply systems in the state of Rio Grande do Sul was 0.52, a vulnerability at intermediate level. In terms of proportion, 36% of the surveyed systems were classified as highly vulnerable, while 31% were in the intermediate level. Together they comprised almost seventy percent of the surveyed systems that are at risk of providing drinking water that is not safe for consumption. This could be valuable information to plan policies and actions to decrease the vulnerability of the systems and to reduce the risk of acquiring diseases associated with contaminated water. It would also be valuable to set policy priorities, such as the development and implementation of water safety plans. For instance, in the case study region, the adequacy of the treatment technology to the water source and the compliance with the water quality monitoring regulations could be the priorities for public investments. In addition,