Environmental Sanitation Conditions and Health Impact: a Case-control Study

This epidemiological investigation examines the impact of several environmental sanitation conditions and hygiene practices on diarrhea occurrence among children under five years of age living in an urban area. The case-control design was employed; 997 cases and 999 controls were included in the investigation. Cases were defined as children with diarrhea and controls were randomly selected among children under five years of age. After logistic regression adjustment, the following variables were found to be significantly associated with diarrhea: washing and purifying fruit and vegetables; presence of wastewater in the street; refuse storage, collection and disposal; domestic water reservoir conditions; feces disposal from swaddles; presence of vectors in the house and flooding in the lot. The estimates of the relative risks reached values up to 2.87. The present study revealed the feasibility of developing and implementing an adequate model to establish intervention priorities in the field of environmental sanitation.

Although the World Bank 46 had discouraged the development of investigations involving environmental sanitation conditions and health impact from the mid 1970's, studies concerning these associations have been receiving increased attention since the beginning of the 80's.
In 1983, Blum & Feachem 6 stated that most of the studies published until then had methodological limitations. These constraints ranged from one to more than eight methodological flaws per study and, in several of the 44 studies reviewed, the results obtained could not be considered unbiased.
In the same year, a workshop on this subject was held in Bangladesh 8 and, as a conclusion of the discussions, the implementation of epidemiological studies on water supply and sanitation exposure was again recommended, provided that some important methodological care was observed. In order to increase the applicability of those studies, the workshop suggested the case-control design as the most adequate epidemiological method and child diarrhea morbidity as the health variable to be measured.
More than 250 studies have been carried out to investigate the probable association between environmental sanitation and health conditions. The following main features were observed in an analysis of 256 epidemiological studies published in the literature 23 : a) fifty-seven percent (146) of the studies were conducted in Asian 31 or African 32 settings, and this tendency has not changed over the last decades; b) seventy-seven percent (198) of the studies investigated exposure related to water supply 34 and forty-two percent (107) aspects related to domestic wastewater disposal 22 . Few investigations analyzed other environmental sanitation conditions, such as refuse disposal 18 (2%, 4), hygiene habits 5 (17%, 44) or drainage, vector presence 42 and other forms of exposure (5%, 12). In the majority of cases, only rural areas were investigated 34 ; c) forty-one percent (105) of the studies adopted diarrhea morbidity 9 as the health variable; d) case-control designs began to be employed in the last decade. In the universe of studies analyzed, prospective 2 (25%, 64) and cross-sectional 30 (21%, 53) designs predominated.
The present paper describes an epidemiological case-control study that explores some insufficiently investigated aspects of epidemiological methodology which could be applied to environmental sanitation exposure, such as: (a) the applicability of the case-cohort or inclusive design 38 , in which controls are chosen as a random sample of the population from which the corresponding cases are identified; (b) an investigation conducted in an urban area outside the African or Asian environment; (c) the inclusion of a large set of environmental conditions as multicategorical variables; and (d) a test of sample size adequacy, suitable for generalizing to other similar sanitary and environmental situations.

Studied area.
The study was conducted in the urban area of Betim, a city with about 160,000 inhabitants. Betim is an industrial city, located in the Metropolitan Region of Belo Horizonte, the capital of Minas Gerais State, southeast Brazil, with a population of nearly 3.5 million inhabitants. A public concessionaire is responsible for the water supply and sanitation services. Other environmental sanitation services, for instance refuse collection and disposal, urban drainage and vector control, are directly provided by the municipality.
Sample size. Considering the methods of sample size determination for independent case-control designs 40 and for multicategorical exposures 7 , a sample of about 1,000 cases and an equal number of controls was considered adequate, assuming: 1) probability of type I error (alpha value) = 0.05; 2) power of the test = 0.90 (probability of type II error (beta value) = 0.10); 3) a prevalence of the exposure factors among the controls (p0) equal to 0.30, the lowest among the various factors being analyzed; and 4) a minimum significant difference (Δ) between the prevalence of exposure factors among cases (p1) and controls (p0) equal to 0.10.
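As an illustration of how these assumptions translate into a sample size, the sketch below applies the standard two-proportion formula for an unmatched case-control study with one control per case. It is not necessarily the method of references 40 and 7: the two-sided test, the function name `cases_needed` and the restriction to a single dichotomous exposure are assumptions made here. Under them the result is below 500 cases; the figure of about 1,000 adopted in the study also allows for multicategorical exposures, which this sketch does not attempt to reproduce.

```python
from math import ceil, sqrt
from statistics import NormalDist


def cases_needed(p0, p1, alpha=0.05, power=0.90):
    """Number of cases (with an equal number of controls) for a dichotomous exposure.

    p0 -- exposure prevalence among controls
    p1 -- exposure prevalence among cases
    Assumes a two-sided test at level alpha (an assumption of this sketch).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p0 + p1) / 2
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p0 * (1 - p0) + p1 * (1 - p1))
    ) ** 2
    return ceil(numerator / (p1 - p0) ** 2)


# Values stated in the text: p0 = 0.30, difference = 0.10, alpha = 0.05, power = 0.90
n_dichotomous = cases_needed(p0=0.30, p1=0.40)
print(n_dichotomous)
```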
Questionnaires. A standardized protocol was developed with the technical assistance of Universidade Federal de Minas Gerais faculty members, specialists in sanitary engineering, epidemiology, biostatistics and sociology. The questionnaire was based on information derived from other published investigations. In addition, a large spectrum of variables related to environmental sanitation was included. After a pilot test, the final version of the protocol was defined, including 80 closed questions organized into the following sections: 1) informed consent; 2) identification of residents in the selected houses; 3) participant identification; 4) socioeconomic status of selected families; 5) household characteristics; 6) water supply and individual hygiene habits; 7) wastewater disposal and existence of nearby streams; 8) domestic refuse storage and disposal; 9) rainwater flooding and ponding; 10) vector presence; and 11) validation of collected information (in loco observation).
Case selection. A case was defined as a child under five years of age, resident in the urban area of Betim, attended at a local health institution, whether public or private, with a report of diarrhea. The attending physician's diagnosis of diarrhea was taken as the case definition. All local health institutions, including 15 public and 14 private health centers, were investigated. All cases diagnosed between December 20, 1993 and April 4, 1994 were included in the study, comprising a final sample of 997 cases.
Control selection. Consistent with the definition of the case-cohort or inclusive design, a control was defined as a child under five years of age, randomly chosen from the resident population of the urban area of Betim. Control selection was based on a random allocation of houses, taken from a register used by the municipality for housing taxes. For the allocation, algorithms for selecting register pages and lines were employed, using random numbers generated by a TurboBasic compiler.
While conducting the study, when the assigned house did not have a resident child under five years of age, displacement to the house on the left was adopted. For other situations in which the selection was not possible, other standardized criteria were established. For instance, when 1) the assigned house was selected twice from the allocation lists; 2) the assigned house was located outside of the studied area; or 3) no houses in the selected city block had a child under five years even after adoption of the left displacement criterion, another address was randomly chosen from the original register list.
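The selection rules described above can be sketched as a simple redraw loop. This is a hypothetical illustration in Python, not the original TurboBasic routine: the register dimensions, the callables `in_study_area` and `eligible_house`, and the `max_draws` guard are all assumptions introduced for the example.

```python
import random


def select_control_house(rng, n_pages, lines_per_page, used,
                         in_study_area, eligible_house, max_draws=10_000):
    """Draw register entries until one yields a house with a child under five.

    used           -- set of (page, line) entries already drawn (updated in place)
    in_study_area  -- callable: entry -> bool
    eligible_house -- callable: entry -> house id after applying the leftward
                      displacement within the block, or None if no house in
                      the block has a child under five
    """
    for _ in range(max_draws):
        entry = (rng.randint(1, n_pages), rng.randint(1, lines_per_page))
        if entry in used:              # rule 1: house drawn twice, redraw
            continue
        used.add(entry)
        if not in_study_area(entry):   # rule 2: outside the studied area, redraw
            continue
        house = eligible_house(entry)
        if house is not None:          # rule 3: no eligible child in block, redraw
            return entry, house
    raise RuntimeError("no eligible house found within max_draws")


# Toy register: 100 pages of 40 lines; the two rules below are invented for the demo.
rng = random.Random(1993)
used = set()
inside = lambda entry: entry[0] <= 90
house_of = lambda entry: None if entry[1] % 7 == 0 else f"house-{entry[0]}-{entry[1]}"

controls = [select_control_house(rng, 100, 40, used, inside, house_of) for _ in range(5)]
```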
The 999 selected controls were interviewed from November 23, 1993 to April 22, 1994; the majority of the data was collected before December 18, 1993.
Interviews. The interviews were carried out by a team of ten trained interviewers, recruited among local residents familiar with this kind of activity. Interviewers were not assigned to the region in which they resided. Double-masked interviews were planned, but in some situations the participant status was obvious to the respondent. Information from the questionnaires was coded and entered into a database developed with the software MS-Access for Windows 26 . All data were double-entered.
Reliability test. In a sub-sample (10% of the original sample), reliability tests were performed through re-interviews. Four groups were defined: a) Group 1, cases/same interviewer; b) Group 2, cases/other interviewer; c) Group 3, controls/same interviewer; and d) Group 4, controls/other interviewer. The statistical analysis was based on the values of the kappa statistic 20 .
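For readers unfamiliar with the kappa statistic, a minimal sketch of its calculation from an agreement table (first interview in rows, re-interview in columns) is given here. The counts are hypothetical and the function name `cohen_kappa` is chosen for this example only.

```python
def cohen_kappa(table):
    """Cohen's kappa for a square agreement table (rows: interview 1, columns: interview 2)."""
    n = sum(sum(row) for row in table)
    k = len(table)
    observed = sum(table[i][i] for i in range(k)) / n
    row_share = [sum(row) / n for row in table]
    col_share = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
    # Agreement expected by chance if the two interviews were independent
    expected = sum(r * c for r, c in zip(row_share, col_share))
    return (observed - expected) / (1 - expected)


# Hypothetical yes/no question answered twice by 50 respondents
example = [[20, 5],
           [10, 15]]
kappa = cohen_kappa(example)
print(round(kappa, 2))
```

In this invented table the observed agreement is 0.70 and the agreement expected by chance is 0.50, giving a kappa of about 0.40.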
Data analysis. The data analysis was developed through a sequence of steps, in which several associations and confounding factors were progressively identified. The data set was organized using the software MS-Excel 4 for Windows 41 and statistically analyzed with the software SYSTAT 21 , EPIINFO 22 and MULTLR 23 . The statistical analysis followed these steps: 1) frequency distribution; 2) univariate analysis, including: a) point estimate and confidence interval for the relative risk (Cornfield method 40 ), b) trend analysis (Mantel method 40 ) and c) point estimate and confidence interval for the attributable risk 40 ; 3) bivariate analysis, with inspection of potential confounders and effect modifiers (Mantel-Haenszel method 33 ); 4) multivariate analysis, using the logistic regression model 25 in the following sequence: a) preliminary selection of variables from the univariate analysis (p<0.25) 36 ; b) construction of intermediate logistic models, using 8 different homogeneous subgroups (family structure, socioeconomic variables, hygiene practices, water supply, sanitation, urban refuse disposal, drainage, vector presence). Variables attaining a significance level of p<0.15 were kept in these models; c) construction of the final model, maintaining only those variables reaching a significance of p<0.05; and d) effect modification analysis, under the multiplicative model. Variables known to be associated with diarrhea were kept in the model throughout the analysis, even when they did not reach the established significance levels.
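Steps 2 and 3 can be illustrated with a short sketch, assuming hypothetical counts. The crude estimate is the cross-product ratio of the 2x2 table, which under the inclusive design is read as a relative risk. The confidence interval below uses Woolf's logarithmic approximation as a simpler stand-in for the Cornfield method named in the text, and the stratified estimate is the Mantel-Haenszel summary ratio. All function names are illustrative.

```python
from math import exp, sqrt


def cross_product_ratio(a, b, c, d):
    """a, b: exposed and unexposed cases; c, d: exposed and unexposed controls."""
    return (a * d) / (b * c)


def woolf_interval(a, b, c, d, z=1.96):
    """Approximate 95% interval on the log scale (Woolf), not the Cornfield method."""
    estimate = cross_product_ratio(a, b, c, d)
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return estimate * exp(-z * se_log), estimate * exp(z * se_log)


def mantel_haenszel(strata):
    """Summary ratio over strata, each given as a tuple (a, b, c, d)."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Hypothetical data: one exposure, stratified by a potential confounder
strata = [(150, 250, 100, 300), (250, 350, 200, 400)]
pooled = tuple(sum(values) for values in zip(*strata))  # (400, 600, 300, 700)

crude = cross_product_ratio(*pooled)
low, high = woolf_interval(*pooled)
adjusted = mantel_haenszel(strata)
print(f"crude {crude:.2f} ({low:.2f}-{high:.2f}), adjusted {adjusted:.2f}")
```

A marked difference between the crude and the adjusted estimate would point to confounding by the stratification variable, while clearly different stratum-specific ratios would suggest effect modification.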

RESULTS
Approximately 29% of identified cases were lost for interviewing; the main reason was the impossibility of locating the address given at the participating clinics. The temporal distribution of cases was not found to be associated with any meteorological variable, such as air temperature or daily precipitation. Controls were found to be uniformly spread throughout the sixty-six Betim metropolitan regions; the proportion of controls per occupied house was also found to be evenly distributed in Betim.
Table 1 shows the frequency distribution, as well as the results of the univariate analysis for the qualitative socioeconomic and family structure variables, with the respective relative risk (RR) and its 95% confidence interval. Trend analysis results for the polychotomous variables are also presented. Except for gender and for the person who takes care of the child, all variables analyzed, reflecting a lower socioeconomic family status or a disrupted family structure, were statistically associated with diarrhea. All polychotomous variables showed a strong linear increase in the risk of disease with increasing levels of exposure.
Table 2 shows the comparison of quantitative variables between cases and controls. Student's t-test demonstrated differences between these groups, except for the variable duration of breast feeding. Younger mothers and children, and variables reflecting lower socioeconomic status, were found to be associated with diarrhea occurrence.
In Table 3, crude RRs for the main exposures (converted to dichotomous variables), as determined in the univariate analysis, are presented, together with the respective 95% confidence interval.
After the multivariate adjustment, several of the variables significantly associated with diarrhea on the basis of the crude RR lost their significant effect. The remaining variables, in general, showed a smaller point estimate of that risk, as can be seen in Table 4.
Twenty-eight controls were later selected as cases. This identification allowed a verification of the differences between the estimates of the relative odds (RO), obtained by simulating a traditional case-control study, and the corresponding RR. The simulation was done by excluding these 28 children from the control group. The results showed a close similarity between both risk measures.
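The comparison can be illustrated with invented exposure counts; only the totals of 997 cases, 999 controls and 28 overlapping children come from the text. The inclusive estimate keeps all controls, while the traditional estimate drops the controls who later became cases. Variable names are chosen for this sketch only.

```python
def cross_product_ratio(exposed_cases, unexposed_cases,
                        exposed_controls, unexposed_controls):
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)


# Hypothetical split of exposure within each group
exposed_cases, unexposed_cases = 400, 597          # 997 cases
exposed_controls, unexposed_controls = 300, 699    # 999 controls
overlap_exposed, overlap_unexposed = 11, 17        # the 28 controls who became cases

# Inclusive (case-cohort) design: controls sampled from the whole population
rr_inclusive = cross_product_ratio(
    exposed_cases, unexposed_cases, exposed_controls, unexposed_controls)

# Simulated traditional design: controls restricted to children who did not become cases
ro_traditional = cross_product_ratio(
    exposed_cases, unexposed_cases,
    exposed_controls - overlap_exposed, unexposed_controls - overlap_unexposed)

print(f"RR (inclusive) {rr_inclusive:.2f}, RO (traditional) {ro_traditional:.2f}")
```

With so few controls removed, the two estimates differ only slightly in this example, which is the pattern the study reports.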
Finally, the results of the reliability test indicated that 46% of the questions presented an almost perfect or substantial concordance; the remaining questions had regular, poor, or no concordance at all, according to the Landis and Koch criteria for interpreting the kappa index 29 . In general, questions related to personal habits or daily observations had a worse index of reliability than those regarding house and family descriptions.
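As a reference for the categories mentioned above, the sketch below maps a kappa value to the Landis and Koch bands, using the English labels of the original scale. How these labels correspond to the wording "regular" used in this paper is not stated in the text and is not assumed here.

```python
def landis_koch(kappa):
    """Return the Landis and Koch agreement label for a kappa value."""
    if kappa < 0:
        return "poor"
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
    ]
    for upper_limit, label in bands:
        if kappa <= upper_limit:
            return label
    return "almost perfect"


labels = [landis_koch(value) for value in (-0.10, 0.15, 0.40, 0.55, 0.70, 0.85)]
print(labels)
```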

DISCUSSION
Although the case-control design was used, the risk measure used throughout the study was the RR instead of the RO. This is conceptually supported by the sampling scheme employed for control identification, which characterizes the case-cohort or inclusive variant of the case-control method 38 27 .
Cases and controls were identified from the same population. Residence in the urban area of Betim was an inclusion criterion for cases, in order to allow a house visit. Cases identified in participating clinics whose domicile was not located by the interviewers were excluded from the investigation. In consequence, it is unlikely that a squatter or a child living in an unregistered house would be included in the studied sample. Selection criteria for controls required a permanent address in Betim. Among the several exposures and confounding factors studied, after the multiple adjustment by logistic regression, only 16 dichotomous comparisons showed significant values for the relative risk, reaching a magnitude of up to 2.87 (point estimate), as presented in Table 4. This fact suggests a strong collinearity between the environmental sanitation and the hygiene variables and the presence of several confounding factors.
It should be noted that the effect modification term is included only in a model that has both of the corresponding main effects. This is because these terms can be interpreted as effect modifiers only when the corresponding main effect terms are contained in the model. This is the general rule in model building: higher-order terms are included in a model only when the corresponding lower-order terms are present 14 .
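This hierarchy rule can be expressed as a small check on a list of model terms. The sketch assumes that interaction terms are written as main-effect names joined by a colon; the variable names are hypothetical and are not the coded variables of the study.

```python
def missing_main_effects(terms):
    """Return the interaction terms whose main effects are not all in the model."""
    present = set(terms)
    return [
        term for term in terms
        if ":" in term and not all(part in present for part in term.split(":"))
    ]


# Hypothetical model specifications
well_formed = ["wastewater_in_street", "vectors_in_house",
               "wastewater_in_street:vectors_in_house"]
ill_formed = ["wastewater_in_street",
              "wastewater_in_street:vectors_in_house"]

print(missing_main_effects(well_formed))
print(missing_main_effects(ill_formed))
```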
Some of the results are in accordance with the literature. The presence of wastewater on the street surface as a risk factor for diarrhea can be seen as analogous to the findings of studies concerning the lack of latrines 2 32 22 16 35 . Inadequate management of domestic refuse showed an odds ratio of 2.48 for infantile diarrhea in Nigeria 18 and a similar result was also observed in Brazil 21 . Vector presence, mainly flies, was associated with diarrhea in studies carried out in Thailand 43 and in Myanmar 42 . A relationship between hygiene practices and infantile health was identified in several investigations, such as those conducted in Bangladesh 46 13 24 , the USA 28 , Brazil 5 and the Philippines 4 . Moreover, inadequate feces disposal from swaddles was found to be significantly associated with infantile diarrhea in studies conducted in the Philippines 3 and Bangladesh 43 . There are several descriptions in the literature of an association between water supply and health 11 12 15 44 , while other studies do not show any association, for example an investigation in Panama 39 . The importance of the quantity of water consumed for health conditions has also been demonstrated 44 19 . In this study, the lack of association between several aspects of water supply and health can be explained by: 1) the very low population exposure to the absence of public water supply (1.6%), due to the high population coverage; and 2) the practice, among Betim inhabitants, of clandestine connection to the public network, observed in the study. This situation reveals an effective nonexistence of exposure.
Although references to the health importance of domestic water storage and recommendations for its improvement can be found in the literature 37 , studies that quantify these effects were not identified. Similarly, previous references regarding a health effect of a nearby stream or of rainwater flooding in the lot were not identified.
The possible limitations of this study's findings include: 1) the fact that 29% of the total cases identified were excluded from interview; however, there is no evidence of any relationship between these exclusions and the exposures studied. A chi-square test comparing the proportion of cases at the participating health institutions showed that the proportion of exclusions was statistically different for only two of them. Both were small institutions, responsible for a low proportion of cases (1.8% and 4.1%). Indeed, this limitation would result in an underestimation of the established risks; 2) the source of controls, the register used by the municipality for housing taxes, could exclude the informal city, which is supposedly more exposed to the lack of environmental sanitation measures. This effect was minimized by the updated municipal file used in this study and by the strategy of displacement to the house on the left when the assigned house did not have a child under five years old. This displacement was very frequent and allowed the inclusion of unregistered houses, since slums are closely integrated into the urban design of the formal city in Betim. This possible limitation implies an overestimation of the risks; and 3) the lag of about three months between case and control interviews. As control selection did not presume disease definition and the environmental and behavioral exposures studied have a long-duration pattern, this time difference probably did not bias the disease or exposure information. Moreover, both case and control interviews were conducted in the rainy season.
According to the results of the reliability test, variables related to public environmental sanitation conditions and house characterization, such as reservoir existence and conditions, are more reliable, since direct observation for validation of the answers was carried out. In contrast, information related to personal and domestic habits was less reliable.
Generalization of the study results seems to be possible for similar urban areas, analogous in size, socioeconomic conditions and public services. It also appears that setting intervention priorities on the basis of the adopted design can be a feasible approach. From this point of view, generalization of the present method, adjusted to a specific situation, reveals an important issue: the epidemiological design used, the inclusive case-control or case-cohort design, proves to be valid, since some potential biases in control group selection, frequent in traditional case-control studies, can be avoided. However, some simplifications, such as a smaller sample size, the investigation of fewer confounding variables and the dichotomization of variables in the analysis phase, can be adopted.
The main conclusion of this investigation suggests that an important impact on health status of Betim's children can be achieved by implementation of environmental sanitation measures and hygiene education programs.
Finally, this study also supports the conclusion that infantile diarrhea has multiple and complex determinants. Environmental factors associated with the lack of appropriate public urban services, poor hygiene practices and social determinants play an important role in the transmission of this disease.

Table 1 -
Frequency distribution, crude RR and χ² test of socioeconomic and family structure (qualitative variables).

Table 2 -
Frequency distribution and mean differences (Student t-test) of socioeconomic and familial structure (quantitative variables).
* "Did not know the answer" and refusals, when comprising less than 10% of answers, were excluded from the Table.
a (1) category with proportion of cases significantly higher; (2) category with proportion of cases significantly lower.
b [ ] attributed score, trend analysis.
c (3) p value, trend analysis.
d one unit added to the score when the family owned another house.
Revista da Sociedade Brasileira de Medicina Tropical 36:41-50, jan-fev, 2003

Table 3 -
Crude RRs for main exposures, converted to dichotomous variables.

Table 4 -
Variables remaining in the logistic model: RR and respective confidence interval, without and with effect modification.