A predictive multimetric index based on macroinvetebrates for Atlantic Forest wadeable streams assessment

Multimetric Indices (MMIs) have been widely applied for ecological assessment in freshwater ecosystems. Most MMIs face difficulties when scaling up from small spatial scales because larger scales usually encompass great environmental variability. Covariance of anthropogenic pressures with natural environmental gradients can be a confounding factor in assessing biologic responses to anthropogenic pressures. This study presents the development and validation of a predictive multimetric index to assess the ecological condition of Atlantic Forest wadeable streams using benthic macroinvertebrates. To do so, we sampled 158 sites for the index development. We adjusted each biological metric to natural variation through multiple regression analyses (stepwise-forward) and considered that the residual distribution describes the metric variation in the absence of natural environmental influence. For metric selection we considered normal distribution, variation explained by the models, redundancy between metrics and sensitivity to differentiate reference from impaired sites. We selected five metrics to the final index: total richness, %MOLD, %Coleoptera, EPT richness and Chironomidae abundance. The residuals were transformed into probabilities and the final index was obtained through the mean of these probabilities. This index performed well in discriminating the impairment gradient and it showed a high correlation (r = 0.85, p <0.001) with a specific index developed for a particular basin indicating a similar sensitivity. This index can be used to assess wadeable streams ecological condition in Atlantic Forest biome, so we believe that this type of approach represents an important step towards the application of biomonitoring tools in Brazil.


Introduction
The conservation and management of aquatic ecosystems is a great challenge worldwide as these ecosystems are highly threatened by several human activities. The knowledge and the accurate measurement of these impacts is a fundamental part for the decision process in conservation programs. Besides being based in ecological concepts, the protocols and tools for carrying out biological monitoring must be efficient, fast and consistently applied in different regions (Pont et al 2006). Also, the use of aquatic communities as biological indicators requires a detailed knowledge of the composition patterns and distribution of communities in water bodies in natural conditions and of the natural gradients that explain these patterns (Oberdorff et al. 2002). This is essential to evaluate biologic responses to anthropogenic pressures without the influence of natural environmental gradients.
Some studies in Brazil have shown the existence of a relationship between macroinvertebrate communities and natural gradients such as altitude (Henriques-Oliveira & Nessimian 2010) and river order (Baptista et al. 2001, Melo 2009). For the development of predictive indices, it is not enough to know the influence of environmental variables in the community as a whole, but it is important that direct relationships between these variables and metrics are studied.
At the moment the most used tool for assessment of biological conditions in aquatic ecosystems are the multimetric indices (Buss et al. 2015) and predictive models (Feio et al. 2007, Feio & Poquet 2011. A multimetric index (MMI) considers the effects of multiple impacts and aggregates individual biological, ecological and functional measures in a unique value that can be used to measure the general condition of the ecosystem (Karr 1981, Karr et al. 1986, Hering et al. 2006, Jun et al. 2012. The metrics based on multiple biological measures (diversity, composition, tolerance and trophic) are a characteristic or a measurable process of a biological system that alters in value along a gradient of human influence (Karr & Chu 2000). The strength of the MMI relies on its ability to integrate information of several aspects of a community, in order to provide a general classification of the ecosystem, without losing information from the individual metrics (Karr 1981, Karr et al. 1986, Hering et al. 2006). However, a great difficulty in the identification of the metrics sensitivity to anthropogenic impact is that they also can be affected by natural gradients such as altitude, precipitation and temperature, among others. Therefore, MMIs are generally developed for ecoregions or river typologies with the same natural conditions (Kosnicki et al. 2016, Munne & Prat 2009). Factors such as slope, altitude and type of substrate, for example, which may vary internally in previously defined ecoregions, may also influence the macroinvertebrate community and may even be more critical than the larger scale factors (Hawkins & Vinson 2000, Munn et al. 2009).
Covariance of anthropogenic pressures with natural environmental gradients can be a confounding factor in the evaluation of biologic responses to anthropogenic pressures (Stoddard et al. 2008, Hawkins et al. 2010, Moya et al. 2011. Aiming to overcome this methodological bottleneck, predictive multimetric indices are being developed in a way the metrics are adjusted to the variables that describe the natural gradients (Oberdorff et al. 2001, 2002, Pont et al. 2006, Tejerina-Garro et al. 2006, Moya et al. 2007, Chen et al. 2014, Macedo et al. 2016, Pereira et al. 2016, Chen et al. 2017. These indices take into account the influence of different natural gradients and allow that the metric value, in the absence of human interference, be predicted by the environmental characteristics of a specific locality. Besides the environmental variables, spatial variables should be included to avoid autocorrelation, which represents a problem for statistical inference and ecological patterns (Suriano et al. 2011).
Developing biomonitoring tools for large spatial scales is far more challenging than defining local tools for small regions, but this approach has been developed and studied in the USA , Klemm et al. 2003, Blocksom & Johnson 2009, Flotemersch et al. 2014, Kosnicki et al. 2016) and in the EU (Böhmer et al. 2004, Gabriels et al. 2010, Mondy et al. 2012. The development of a national biomonitoring program as systematic management is under study in Brazil (Buss et al. 2015). The aim is to assess what would be the most appropriate tool to be deployed at the national level and applied to all natural biomes of Brazil. These studies have focused on developing mainly multimetric indices using macroinvertebrates. We highlight representative works for some Brazilian natural biomes: Amazon (Couceiro et al. 2012, Chen et al. 2017, Savanna (Ferreira et al. 2011, Macedo et al. 2016, Silva et al. 2017, Atlantic Forest (Baptista et al. 2007, Oliveira et al. 2011a, Jiménez-Valencia et al. 2014, Pereira et al. 2016.
In this study, we developed and tested a MMI based in the macroinvertebrate community that can discriminate natural from anthropogenic variability in wadeable streams in the Atlantic Forest in southeastern Brazil. This was the first predictive multimetric index developed in Brazil, and it is an important step towards the implementation of biomonitoring programs in the country.

Study area
We sampled 158 sites in streams from 1st to 4th order (according to Strahler classification using 1:50,000 scale maps) that were divided into three categories: 64 minimally impaired areas, considered as reference sites, 50 severally altered by human activities sites; and 44 sites moderately affected by human influence in the Rio de Janeiro State. The sampling campaigns were carried out between 2006 and 2011 (during the dry season) using the same protocol and under the supervision of the same team. All these sites were located in watersheds between 20 m and 1900 m above sea level (Figure 1).
The area corresponding to all Rio de Janeiro state is 43,696 km². According to Alvares et al. (2013), 44% of Rio de Janeiro state's mid-lower portions are classified as tropical with a summer rainy season, with the most mountainous regions and plateaus classified as humid subtropical with hot summer, without dry season or with a dry winter. Temperatures oscillate between 15ºC and 28ºC, and annual rainfall is around 1000 -1500 mm. The Atlantic Forest biome, which originally covered virtually the entire region, now represents less than 12% of its original extent and is mostly spread in the higher parts of the mountains and remnants interspersed with agriculture and pasture (Ribeiro et al. 2011).

Sampled sites
We classified the impairment classes of sampling sites using physicochemical and environmental parameters. We classified sampling sites in the field using the visual-based habitat assessment protocol (HAP; Barbour et al. 1999). The HAP evaluates ten environmental parameters (e.g., sediment deposition, margin stability, and riparian vegetation) which the score ranges from 0 to 20. We used de mean score to obtain the site classification, as follows: 0-5 "Poor"; 5.1-9.9 "Marginal", 10-14.9 "Suboptimal" and 15-20 an "Optimal" environmental condition (Barbour et al., 1999). We classified as references sites if dissolved oxygen> 6.0 mg/L, an "Optimal" or "Suboptimal" environmental condition according to the HAP, no sign of channelization locally or upstream and if <25% area upstream land-use were urban (based on recent satellite images). We classified as impaired sites if they had "Poor" condition according to HAP and if recent satellite images showed >40% of the upstream area was affected by urban areas and agriculture. Intermediate sites had characteristics between these two classes.
The sampling sites were divided into groups for the development and test of the index according to Table 1. The sites of the group REF-CAL (reference calibration sites) were chosen in order to encompass all the watersheds sampled and represent the different natural gradients presented in the study area.
All these datasets were used in different steps of the process for developing the index as explained further in Figure 2.

Sampling, screening, and identification of organisms
For sampling macroinvertebrates we followed multi-habitat sampling procedure according to the availability of substrate in the reach using a Kick net with mesh size of 500 µm, using the protocol of the RBP III (Barbour et al. 1999). Twenty samples (around 20 m 2 ) were taken proportional to the substrates availability in each site. The percentage of available habitats was estimated by visual inspection. Substrates with less than 5% of the site area were not sampled. Samples were composited and conserved in the field in Figure 1. Map of the study area, showing the sampling sites in the Rio de Janeiro State. Squares indicate reference sites, and triangles and circles indicate moderately and severely impaired sites, respectively. Numbers represent river groups: 1-Ilha Grande Bay; 2-Central part Atlantic; 3-Central part Continental; 4-Northern region; 5-Northern Fluminense; 6-Coastal lowland; 7-Itatiaia. The equipment was tested for its efficiency, and the number of subsamples was defined for the metrics stability. After homogenization, six quadrats were randomly selected for sub-sampling. More two subsamples were taken in case of the minimum number of 200 specimens were not reached (Oliveira et al. 2011b). The organisms were identified to the genus level, except Diptera, Hemiptera and Lepidoptera to the family level and Annelida to class level using the available taxonomic keys (Froehlich 1984, De Marmels 1990, Dominguez et al. 1992, Angrisano 1995, Merrit & Cummins 1996, Wiggins 1996, Nieser & de Melo 1997, Dorville & Froehlich 1999). For standardization, the level of taxonomic identification is treated here as operational taxonomic units (OTU).

Environmental variables
In each sampling site were taken measurements of environmental variables such as altitude (m), water temperature (ºC), precipitation (mm/year) and physical and chemical variables such as dissolved oxygen (mg/l), pH, conductivity (µS/cm), total dissolved solids (TDS). Dissolved oxygen (DO) was determined by using a YSI 550A analyzer, pH with a MPA 210p (LabConte) and conductivity and TDS by using MCA 150p (LabConte). Water samples were taken in sterile plastic bags (whirl-pak) according to APHA (2000). However, as recommended by Pont et al. (2009), only the variables that are not easily influenced by the anthropogenic impact such as channel slope and mean annual air temperature were used in the models, as we cannot guarantee that the reference sites do not suffer any kind of impact from human activities. Moreover, two variables were also measured: average width of the river (trough the measuring of 6 transversal sections in the reach sampled, considering only the wet width) and average depth.
The altitude was obtained using a digital model of ground elevation based in SRTM, using the software ESRI ArcGIS 9.0®. The area of the watershed was measured through the delimitation of polygons generated by this digital model. The average precipitation in each site was obtained from the map of total annual isoiets (period of 1968-1995) developed by CPRM (Centro de Pesquisa dos Recursos Minerais -Brasil). The temperature was obtained from the study of Araújo et al. (2010) of monthly spatial distribution of the temperature in the State of Rio de Janeiro, based in linear models and SRTM.
A Shapiro-Wilk test was performed to test the normality of the data. The variables that had non-normal distributions were transformed to reach a normal distribution. For verification of redundancy between the environmental variables, a Pearson correlation test was performed. Variables with a correlation coefficient higher than 0.8 were considered redundant, and the most difficult to measure was eliminated.

Identification of river groups determined by the fauna
Some predictive multimetric indices use a variable that represents the river groups determined by their own fauna Pont et al. (2006). Analyses used in the initial stages of the predictive model development are performed to identify biologically relevant units. Other studies have grouped rivers as ecoregions or according to river typologies, using geographic, geologic and climatic characteristics (Moya et al. 2011, Oberdorff et al. 2001). In the present study, we used a methodology similar to Pont et al. (2006) for defining a variable called "river group". A Grouping Analysis (GA) was performed with macroinvertebrates abundances (transformed in logx+1) for 55 reference sites using the similarity of Bray-Curtis. After that, a Similarity Analyses (ANOSIM) was performed to verify the significance of the groups found in the grouping analyses. The ANOSIM calculates R-value that reflects the similarity between groups according to some factor, in this case, the fauna. An R-value close to zero indicates low difference and close to one a highly different fauna composition between river groups using the Bray-Curtis index.

Metrics calculation, selection, and modeling
We show a flow diagram with the general overview of this process and the datasets used in each step ( Figure 2). Details regarding the description of analysis used are given further in the text.
Initially, 35 metrics chosen were calculated following previous studies of multimetric indices development , Baptista et al. 2007, Oliveira et al. 2011b). All metrics were tested to verify the normality in the group REF-CAL. The metrics that even after transformations did not present normal distribution were eliminated. The metrics that had a small range were also eliminated (Klemm et al. 2003). Using the REF-CAL data, we generated models for each one of the metrics (dependent variables), using the environmental variables that represent the local factors (e.g., width and depth) and also the variable "river group" representing the spatial variation as predictive factors through multiple regression analyses (forward stepwise). The squares of the environmental variables were also considered for the identification of possible non-linear relationships. Each one of the river groups identified was used as variables and codified between zero and one. Only the river groups with determinant coefficients equal or higher than 0.3 were considered as adequate models. Regression analyses were performed in the software Statistica 7.0.
We also performed a regression analysis using the Akaike Information Criteria (AIC) for the selection of the best model. This analysis was carried out for confirming the selection of the model using the minimum square method, verifying if the selected model presented the lower values of AIC.
The residuals of each one of the metrics represent the metrics variation in the absence of natural gradients interference. The residuals were standardized subtracting the average of the residuals from the REF-CAL group and dividing by the standard deviation of that same data set, even when computed for the other groups of data (SI-MET, REF-VAL; SI-VAL; MI-VAL). The residual expected value in a reference site is equal to zero. For the metrics that decrease with the increase of anthropogenic impact (especially in severely impaired sites) negative residuals are expected and for metrics that increase with impact positive residuals are expected.
Assuming that residuals in the reference sites present normal distribution, with zero mean and standard deviation equal to one, it is possible to compute the probability of finding a lower residual value than the observed (for decreasing metrics) and higher than observed (for increasing metrics). The probability represents the chance that a specific site belongs to the distribution of the reference sites, that is, the lower the value the higher is the impairment in that site. The transformation of the residuals in probabilities makes the metrics comparable. All probabilities will vary between 0 and 1 and decrease with the impact. The expected probability distributions for a group of reference sites is a uniform distribution with average 0.5.
Residuals normality and variance homogeneity were tested in the reference sites. Metrics with non-normal residuals were eliminated. The models obtained were applied to the SI-MET data set and the residuals standardized. The residuals of the groups REF-CAL and SI-MET were compared by a paired t-test for verifying if the metrics are capable of differentiating minimally impaired from severely impaired sites.
A Pearson Correlation Analysis was performed to verify the redundancy of the metrics. If two metrics were highly correlated (r >0.8 or <-0.8), the metrics with the lower p in the sensitivity test or the metrics which facilitates the application of the index (i.e., metrics which require identification only at the level of family or order) was retained.

Index development and test
The final index was obtained by the average of probabilities found for the selected metrics. Thus the index varied between 0 and 1 and was divided into five classes of anthropogenic impact. The index was denominated Predictive Multimetric Index for Rio de Janeiro State (PREMIER).
The index was validated in 3 independent data sets: REF-VAL, SI-VAL, and MI-VAL. If the index is working properly two conditions must be satisfied: (a) the average of the probabilities obtained for the group REF-VAL cannot be different from 0.5, and (b) the mean value of the group REF-VAL is higher than the mean value of the group MI-VAL which is, in turn, higher than the mean value of the group SI-VAL. Those conditions were tested by a unilateral t-test.
The MMIs developed for a specific drainage basin are very sensitive to small disturbances in rivers of the same basin. However, they may not work well when applied in other basins with different characteristics. A predictive index based on a model developed for areas with distinct characteristics could present an inferior performance compared to the ones developed for specific areas when applied for the drainage basin for which it was developed.
For verifying if the PREMIER presented a lower performance than GMMI, both indices were tested in 10 sites in the Guapiaçu-Macacu drainage basin with different levels of anthropogenic impact, which were not used for the development of either index. The results were compared by a correlation test between the final values of the two indices.

Results
In this study, more than 110,000 benthic macroinvertebrate specimens were identified. In the reference sites that were used for the definition of the "river groups" by the fauna, more than 40,000 specimens were identified. The grouping analysis showed that the fauna is very similar in the following groups: Guapiaçu-Macacu, Itatiaia, Paquequer e Bocaina. Some sites located in lower areas (as São João river drainage basin) also demonstrated higher similarity among them. As the streams located in the northern Rio de Janeiro State are poorly studied and have different climatic conditions from other places it was decided to keep this group of rivers as a separate group. Criteria adopted here were both based on the analysis performed and in the knowledge of several prior works done in the region.
The results of ANOSIM considering the "river groups" show that, in most cases, the rivers did not differ significantly. The region that covers the area of Itatiaia National Park, which has specific features of a high degree of conservation of the aquatic ecosystems, was the most differentiated from other rivers groups. According to the ANOSIM, the following river groups were considered: coastal lowland (Maricá and São João rivers), northern Fluminense region (Macaé river), northern region (Desengano State Park), central part of Serra do Mar mountain region -continental side (drainage basin of the rivers Paquequer, Preto and Piabanha), central part of Serra do Mar mountain region -Atlantic side (drainage basin of the rivers Guapimirim, Guapiaçu and Macacu); "Ilha Grande" bay region (Bocaina National Park and Mambucaba river), Itatiaia (Itatiaia National Park, Serra da Mantiqueira). The R value obtained in the analysis was 0.48 (p<0.001), and all river groups differed significantly among them, except for the northern region of the State about the lowland coastal region ( Table 2).
Characterizations of the sampling sites in the different determined river groups are presented. The higher sites are located on the continental side of the Serra do Mar, mainly in the Paquequer river basin. In this same region, the lower average temperature was found. The largest drainage basins are in the Macaé river basin, as well as the larger average width. The highest average depth was found in the rivers of the Itatiaia region. The region with the highest average precipitation was the Atlantic side of the central part of Serra do Mar (Table 3).

Metric selection and index development
From the 35 metrics calculated, nine were eliminated for not reaching normal distribution after transformations procedures. From the remaining 26 metrics, 16 presented more than 30% of its variation explained by the model. All 16 metrics with significant models presented residuals with normal distribution after the Shapiro-Wilk test (p>0.05). It was also performed a graphical analysis of residuals for verification of the homogeneity of variances. The results indicate that all metrics presented residuals randomly distributed around the mean. The sensitivity test, which compared the values of residuals of the 16 metrics in the group of reference sites with the ones of the severely impaired sites, showed that all were capable of significantly differentiate the groups (t-test; p<0.01).
The metric %Trichoptera had a slight increase in intermediate disturbance conditions and after that a decrease in severely impaired sites. To avoid assessment errors in the final index the metric %Trichoptera was eliminated (Table 4).
Several metrics presented a high correlation between each other. Therefore, in the face of a redundant metric, the one with the best result in the sensitivity test was chosen (t-test). In this step, five metrics were selected: richness (Family), %MOLD, %Coleoptera, EPT richness (OTU), Chironomidae abundance.
Using the dataset MI-VAL, a graphical evaluation was performed for verifying the relation of the metrics with the increase of anthropogenic impact, identifying if they were positively or negatively related with the disturbance gradient and also how it behaves in intermediate disturbance scenarios. All metrics presented a linear response to impairment with %MOLD and Chironomidae abundance presenting a positive relationship, and richness (Family), EPT richness (OTU) and %Coleoptera showed a negative relationship with the impact gradient. Finally, five metrics were kept for the index development (Figure 3). The models of the five selected metrics were confirmed by regression using the Akaike Information Criteria (AIC). The generated models presented the lower values of AIC. Table 5 presents the models of the five metrics retained to the final index.
The final value of the index was obtained by the average of the probability values of the five metrics. As expected, the average values obtained for the metrics of the group REF-CAL were not significantly different from 0.5 (Figure 4; Figure 5) and were not also different from the average obtained for the group REF-VAL (p=0.33). That means that the index is adequately calibrated with reference areas values mostly scoring over 0.5 and impaired sites scoring lower values (Figure 4). So, it can adequately asses the level of impairment of wadeable streams in Atlantic Forest.

Index test
The index was able to differentiate the classes of an impact since the values in the REF-VAL group were significantly higher than the MI-VAL group (P = 0.02) and the values MI-VAL were significantly higher than in SI-VAL (p = 0.03) ( Figure 5).

Comparison of the effectiveness of the predictive index with a specific index
The GMMI and PREMIER indices were applied to 10 sampling sites with distinct natural characteristics and degrees of anthropogenic impact. It is noteworthy that the GMMI ranges from 0 to 100, divided into five classes: Very good (80-100), Good (60-80), Regular (40-60), Impaired (20-40) and Severely Impaired (0-20). The PREMIER ranges from 0 to 1, divided into five classes: Very Good (≥0.5); Good (0.4-0.5); Regular (0.2-0.4); Impaired (0.1-0.2); Severely impaired (≤ 0.1). The correlation between the results of two indices was high (Pearson r = 0.85, p <0.001), indicating that the PREMIER showed a similar sensitivity to the index developed specifically for the basin. Regarding the differences in ratings, it is important to note that when the classification was different, the PREMIER provided an inferior class than the GMMI. However, there has been no reverse case (Table 6).

Discussion
The natural condition of a river is a combined product of factors operating on different scales. On large scales, geological and climatic factors are the major determinants of the characteristics of streams, and the biota often responds to these factors along spatially distinct regions (Mykrä et al. 2004). In this work, environmental variables that represent local factors were used (width and depth, for example) and also the variable "river group" representing the spatial variation in wider geographical range. As in the study of Pont et al. (2006) the variable "river group" additively participates in the model, that is, they are only regional adjustments not changing the relationship between the metrics with the other variables in the model.
A metric positively related to altitude in a region will be positively related to altitude in the other region as well, requiring only the adjustment coefficient generated by "river group" variable. This variation among different river groups (the six regions defined in this case) may be related to biogeographical and historical factors and the natural gradients not measured by the variables used in the models (Pont et al. 2006). Thus, it will take more studies to identify the factors that are determining these regional differences and, preferably, turn them into quantitative variables that can be included in the models.
The PREMIER was able to assess most of the anthropogenic impacts on the macroinvertebrate community, regardless of various natural gradients in streams of biome of Atlantic Forest in southeastern Brazil. The metrics used in the index are known to be able to distinguish between different impairment levels and have been used in other indices (Baptista et al. 2007, Moya et al. 2007, Oliveira et al. 2011a, Lakew & Moog 2015, Melo et al. 2015. The richness metrics (richness and EPT richness) are considered useful due to their response to structural changes impaired gradients in macroinvertebrate assemblages (Suriano et al., 2013). The metrics that increase with impact (%MOLD, Chironomidae abundance) are associated organic pollution caused by untreated wastewater effluents in urban areas and to the increase of suspended organic particles in rural areas (Baptista et al. 2011).
The %Coleoptera metric is often associated with the increase in primary production due to its function food web (Baptista et al. 2007). Metrics based on trophic functional groups were not included in the index. The metrics %predators and %collectors were not included because it was not possible to normalize the data. The metric %shredders did not have the R-squared of its model larger than 0.3. The metric %scrapers had a high correlation with the metric %Coleoptera. Therefore it is indirectly represented in the index. The metric %Coleoptera was

% Predators
preferred because other studies found that functional metrics do not always respond to the impact in tropical environments (Moya et al. 2011, Melo et al. 2015, Pereira et al. 2016. Moreover, there is much debate about the feeding habits of some macroinvertebrate groups in the tropics and classifications have been discussed (Tomanova et al. 2006, Miserendino et al. 2007, Moya et al. 2007, Chará-Serna et al. 2012.
The metrics selected to compose the index require only that the groups Ephemeroptera, Plecoptera and Trichoptera are identified to the lowest taxonomic level. In the Diptera group, only Chironomidae should be identified to the family level. The other groups may be identified only to the order level or, in the case of Mollusca, to the phylum level. This is an important aspect for facilitating the use of the index for actual monitoring programs since it reduces the level of expertise and the time needed for applying it.
The results found in this study are very similar to other predictive multimetric indices already developed (Pont et al. 2006, 2009, Moya et al. 2011. It is noteworthy that, even using different groups of bioindicators, the final results of the indices are very similar. In the REF-CAL group the index ranged between 0.2 and 0.8, with an average of 0.502 (Figure 4). In the study of Moya et al. (2011) the index, also    based on macroinvertebrates, varied in the calibration reference group between 0.3 and 0.8, with mean 0.5. In the study of Pont et al. (2006) the index based on fish community ranged from 0.1 to 0.8, averaging 0.5 in the reference sites. It is worth mentioning that in this study the index was developed to the whole Europe and the calibration group had 1000 reference sites distributed in several different countries. In the study of Pont et al. (2009) the index for aquatic vertebrates varied between 0.1 and 0.8 in the reference sites, having an average of 0.5. It is noticeable that the results are quite similar, especially that of Moya et al. (2011) that also developed using benthic macroinvertebrates.
For the development of PREMIER only variables that are not affected by disturbances were used. The altitude was kept in almost all models. The altitude gradient in small streams of mountainous regions tend to reflect other gradients such as water temperature (Pearson's correlation tested for the reference sites of 0.95) and slope. Both the temperature and slope gradients are known factors that affect the benthic community (Vinson & Hawkins 1998, Mykrä et al. 2008. Rivers groups were also retained in most of the models since the fauna actually appear to be different at these locations. Moya et al. (2011) also found that regions are important to determine the fauna. The test showed that the PREMIER is able to differentiate between different levels of impact and is sensitive to different types of disturbances. Ideally, it would be important to have a group of independent reference sites in the group used to calibrate the model, which presented different environmental characteristics, to verify if the index remains unchanged along the changes of natural gradients. However, priority was given to assemble a group of robust calibration and to ensure the spatial representation of the index. The test shows that even if there are still unmeasured natural gradients affecting the macroinvertebrate community, the index can assess the impact.
The PREMIER had its performance compared to a specific index developed for a particular basin (GMMI). The two indices agreed in most of the ratings and showed a high correlation, showing that the PREMIER has good sensitivity to different sources of disturbance. This shows that the PREMIER is a robust index and had the same sensitivity to disturbance as an index developed specially for the region.
One issue to be considered in the predictive multimetric approach is that the variance of the metrics not explained by the model must be the result of other unmeasured variables and also from human degradation gradients (Pont et al. 2006). That is, there must be natural gradients that covary with gradients of impact, or there are impact gradients, even in reference sites, influencing the variation of metrics. This brings us to another crucial aspect in the development of multimetric indices which is the definition of the reference condition. Even though the criteria are strict and equal for all sites, the ecological integrity of reference sites in certain regions may be higher than others. In the case of the State of Rio de Janeiro, it is much more difficult to find reference sites in the lower regions that meet the minimum established criteria. It is possible that in higher areas, due to the greater difficulty of occupation, the ecological integrity of reference sites is higher than the lower sites. A similar conclusion was also pointed out by Pinto et al. (2009).
It is noteworthy that in this study the temporal variation of the benthic community was not analyzed. As in Pont et al. (2006), samples collected at different times were used to increase the spatial representativeness of the index. However, Baptista et al. (2007) tested the stability of the selected metrics in Atlantic Forest streams to form a Multimetric index in three seasons and found that all had relative temporal stability.
Protecting the biotic integrity of our ecosystems depends on our ability to identify, measure and predict the effects that human activities have on them. This depends primarily on our ability to distinguish between natural variations and variations induced by anthropogenic disturbances (Oberdorff et al. 2002). The indices adjusted for natural variation has been shown to be more responsive and sensitive than standard indices (Chen et al. 2014, Macedo et al. 2016, Pereira et al. 2016). Using the methodology presented here is possible to assess the ecological status of Atlantic Forest wadeable streams. This allows that the monitoring can be performed on a large scale generating comparable data to facilitate management.
Brazil´s National Water Agency launched in 2013 the National Water Quality Monitoring Network which will be operated in a decentralized manner with partnerships established with states agencies (ANA 2014). At the moment, there is no goal within this network to include biological indicators. However, it may represent an important forum for discussion on the progress necessary for monitoring in the country, considering that some states already have biomonitoring in their programs. Furthermore, the use of cost-effective analysis (sampling, sorting, and identification) and the selection of a cost-benefit indicator group would be a great step forward for the development of practical biomonitoring programs. In this study, the protocols combined simple and inexpensive field equipment with optimized techniques for processing samples in the laboratory. The cost regarding time to process each sample generally is high, but the subsampling represented a saving of money, human resources and time (approximately 11 h in processing each sample) (see Oliveira et al. 2011b).
In Brazil, different multimetric indices were developed for the study area (Baptista et al. 2007, Oliveira et al. 2011a). However, these indices were developed specifically for each basin, considering specific characteristics. The PREMIER is an improvement over multimetric indices by considering the natural variability (e.g., topographic and climatic) and its application on wider geographical scale. The PREMIER index represents a robust methodology that can be easily reproduced in different areas and can easily be tested and applied for wadeable streams in Atlantic Forest. It can be an important tool for conservation and managing restoration areas in this important and severely damaged biome. Also, adequate water quality assessment is key to resolve, prevent and anticipate water conflicts in Brazil and can also be an important guide to financial investment in sanitation and conservation policies. We hope that the initiative of this study to produce a standardized protocol and an index that can be used on a broader scale contribute to improve ecological assessment in our country.

Supplementary material
The following online material is available for this article: Appendix 1