A predictive index based on environmental filters for the bioassessment of river basins without reference areas in Atlantic Forest biome, Brazil

Biological assessments that use the reference condition approach are based on the concept of comparing a site’s observed biology to that of sites where disturbance is minimal or absent. However, in many regions of the world, such areas are scarce or nonexistent. In this study, an alternative approach proposed by Chessman and Royal for bioassessment without reference areas, based on environmental filters, was tested in Brazil. This approach assumes that key environmental features act in the selection of potential colonists, from a regional pool of taxa, based on the ecological traits (tolerances) possessed by each taxon. We developed the approach by: 1) determining the regional pool, based on a large Atlantic Forest biome database; 2) selecting environmental filters (elevation, original vegetation and soil type); and 3) including information on the tolerances and preferences of aquatic insects with respect to these filters. With this information we were able to determine the taxa expected under natural conditions and compare them with the observed taxa, developing a predictive index (Observed/Expected). Although the model was intended to predict the fauna in regions without reference sites, we included reference areas to test the model's responsiveness, precision and sensitivity. Our results indicated that the index was able to discriminate impairment classes (F=56.9; p<0.001), had high precision, given the low standard deviation of scores across reference sites (SD=0.098), and had high sensitivity, given its correlation with environmental variables that are sensitive to human alteration (r=0.74, p<0.01). It was also strongly correlated with multimetric indices developed for multiple watersheds in the state, showing agreement between the methods in relation to ecological quality classification. Even though the predictive index performed well in our study, we make some considerations that may help to improve the sensitivity of this and similar methods that are being tested using the environmental filters approach.


Introduction
Biological assessments are generally based on measurements of attributes of biological assemblages, which are often characterized and expressed as indices. Different types of indices have been developed (Hawkins et al. 2010, Feio & Poquet 2011, Herman & Nejadhashemi 2015), but most of these methods share the need for a benchmark (i.e. a reference condition) against which the measured biological condition is compared. The impairment of a site is defined by how much its biological attributes differ from those found at the benchmark condition.
Ideally, reference benchmark conditions represent historical, pristine conditions (Hawkins et al. 2010). However, in many regions of the world, due to a long history of anthropogenic alteration of aquatic ecosystems, those conditions are scarce or absent. In such cases, where "minimally disturbed" or "historical condition" sites (sensu Stoddard et al. 2006) are missing, some protocols relax the reference criteria and use "best attainable" or "least disturbed" sites as reference (Stoddard et al. 2006). The misuse of this practice can generate a series of problems (Chessman 2006, Labay et al. 2015, Elias et al. 2015). First, these definitions are often arbitrary and inconsistent, and thus impractical to apply at broad spatial scales (Cao & Hawkins 2011). Second, these sites have human-altered hydrological, physical and chemical conditions that also affect their biological attributes. Thus, an index based on those sites may differ from one developed using natural benchmarks. This practice should be applied with great prudence; otherwise it can bias indices by incorporating increasing degradation into the modelling.
The need to implement bioassessment programs in regions where reference sites are absent or scarce has fostered the development and testing of alternative methods (Chessman & Royal 2004, Carter & Fend 2005, Stranko et al. 2005, Chessman 2006, Blocksom & Johnson 2009, Hawkins et al. 2010, Birk et al. 2012, Schoolmaster et al. 2013, Labay et al. 2015, Milošević et al. 2016, Elias et al. 2016). Some researchers point out that an approach that does not require the use of reference sites should be explored (Olden et al. 2006, Feio et al. 2009, Feio & Poquet 2011, Elias et al. 2015). Recent literature reviews on bioassessment (Dolédec & Statzner 2010, Hawkins et al. 2010) classified Chessman and Royal's (2004) approach, the Observed Proportion of Potential (OPP), as promising but lacking sufficient validation. This approach is based on the environmental filters concept (Poff 1997), which assumes that key environmental features act in the selection of potential colonists, from a regional pool of taxa, based on the ecological traits (tolerances) possessed by each taxon. The premise of the approach is ecologically intuitive: a taxon from the regional pool (i.e. known to possibly occur in a given area) that possesses environmental tolerances that fit the 'natural' environmental conditions would occur at a site if those 'natural' conditions are found. If the environmental conditions change so that they exceed a taxon's tolerance, they act as filters excluding the taxon from that site. The use of multiple key environmental filters would allow predicting the composition of the community potentially occurring at a site (Poff 1997, Chessman & Royal 2004, Stranko et al. 2005, Chessman 2006). Anthropogenic impacts can be viewed as either modifying the natural filters (allowing more, fewer, or different taxa to pass) or creating additional filters (Chessman & Royal 2004). Thus, the comparison of the observed taxa with the expected taxa provides a measure of the impairment level of a site. It results in an Observed/Expected index (O/E index), similar to the type used in RIVPACS and AUSRIVAS (Feio & Poquet 2011), but Chessman and Royal's approach identifies the taxa expected to occur in each typology based on life history information, rather than through reference-based predictive modeling. This can be considered a typological approach (sensu Hawkins et al. 2010), because the expected taxa are identical for all streams within each type.
In this study, we test the potential of the environmental filters approach to provide a basis for bioassessment of regions without reference sites. The environmental filters predicted the natural potential distribution of aquatic insect families, which was used as the benchmark for comparisons with observed assemblages. The predictive index was tested for its ability to determine the impairment condition of streams sampled in southeast Brazil. Specifically, to assess index performance we tested its (1) responsiveness, i.e. its ability to discriminate impairment classes; (2) precision, i.e. the variability of scores across reference sites; and (3) sensitivity to a stressor gradient, by evaluating the relationship between the OPP and environmental variables that are highly sensitive to human alteration. Then, we compared the OPP with existing multimetric indices in order to verify the agreement between methods in relation to ecological quality classification.

Study Area
This study was carried out with data collected from seven of the nine main river basins of Rio de Janeiro state, southeast Brazil (Figure 1). The geomorphology of the state is composed of coastal plains separated by hills and two mountain chains that run parallel to the ocean (Serra do Mar, ranging in elevation from 0 to 2,000 m.a.s.l., and Serra da Mantiqueira, ranging from 800 to 2,500 m.a.s.l.). The state's main river, rio Paraiba do Sul, runs in the valley formed between the two mountain chains at an elevation of about 800 m.a.s.l. According to a recent review of Köppen's climate classification for Brazil, most of Rio de Janeiro state's mid-to-lowland portions (44%) are classified as tropical with a summer rainy season (Aw type), and the mountainous regions and plateaus are classified as humid subtropical zones with hot summers, without a dry season (Cfa type) or with a dry winter (Cwa type) (Alvares et al. 2013). The temperature oscillates between 15 °C and 28 °C and the mean annual rainfall is around 1000-1500 mm. The Atlantic Forest, which originally covered virtually the entire region, now represents less than 12% of its original extent, and is mostly found in the higher parts of the mountains and in remnants interspersed with agriculture and pasture (Ribeiro et al. 2011).

Development of the predictive model
A model to predict the natural distribution of macroinvertebrates based on environmental filters was built using three main types of information: 1) the regional pool of macroinvertebrate taxa; 2) the environmental filters; 3) the preference and tolerance of each taxon to the three filters. The 'regional pool' of aquatic insects was based on a database of both published and unpublished information. This database consisted of over 400,000 individuals representing 100 macroinvertebrate families from 370 stream sites distributed in five neighboring states (Rio Grande do Sul, Paraná, Mato Grosso do Sul, São Paulo and Rio de Janeiro) in the south and southeast Atlantic Forest region. To define the regional pool, we identified the set of families that occur in the region using the database. All families within the database were included as part of the regional pool, and the list of families was confirmed with local experts. In the next step, we excluded the rare families, i.e. those that occurred in less than 10% of the total sites in the database. Families that occur at low frequencies could reduce model performance, because their preferences and environmental tolerances cannot be defined accurately and because their absence from a site is more likely due to rarity than to anthropogenic degradation.
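The rare-family exclusion described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `regional_pool`, the dictionary layout (site identifier mapped to the set of families recorded there) and the toy data are hypothetical, and only the 10% cutoff comes from the text.

```python
def regional_pool(site_records, min_fraction=0.10):
    """Return the families found in at least `min_fraction` of all sites.

    `site_records` maps a site id to the set of families recorded there
    (hypothetical layout, not the authors' database schema).
    """
    n_sites = len(site_records)
    counts = {}
    for families in site_records.values():
        for family in set(families):
            counts[family] = counts.get(family, 0) + 1
    # Families occurring in less than 10% of the sites are treated as rare.
    return {f for f, n in counts.items() if n / n_sites >= min_fraction}


# Toy example with 20 sites (invented data).
sites = {f"site{i}": {"Baetidae"} for i in range(20)}
sites["site0"] = {"Baetidae", "Perlidae", "Blephariceridae"}
sites["site1"] = {"Baetidae", "Perlidae"}
print(sorted(regional_pool(sites)))
```

In this toy data set Perlidae occurs at exactly 10% of the sites and is retained, while a family found at a single site falls below the cutoff.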
The environmental filters were used to generate a potential natural macroinvertebrate assemblage for each site by selecting from the regional pool the taxa whose environmental tolerances allowed them to occupy each test site. To obtain the local pool, we used a series of binary filters. Thus, a taxon was considered a potential colonizer of a given site if it was listed in the regional pool and the site's characteristics were compatible with the environmental range described for the taxon. If a given taxon was filtered (excluded) by a single environmental attribute, it was not considered to be potentially present in that location ("one-out-all-out" principle). We obtained a list of families potentially present in each combination of environmental filters (Appendix 1).
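A minimal sketch of the binary, "one-out-all-out" filtering step follows. The data structures and the tolerance values are invented for illustration; the actual tolerances are those listed in Appendix 1, which is not reproduced here.

```python
def potential_colonizers(regional_pool, tolerances, site):
    """Return the families from the regional pool that pass every filter.

    `tolerances[family][filter_name]` is the set of categories the family
    tolerates; `site[filter_name]` is the category found at the site.
    Both layouts are hypothetical.
    """
    expected = set()
    for family in regional_pool:
        tolerated = tolerances.get(family, {})
        # One-out-all-out: a single incompatible filter excludes the family.
        if all(category in tolerated.get(name, set())
               for name, category in site.items()):
            expected.add(family)
    return expected


# Invented tolerances for two families.
tolerances = {
    "Chironomidae": {
        "elevation": {"low", "medium", "high"},
        "vegetation": {"ombrophilous dense", "semideciduous"},
        "soil": {"Cambisols", "Ferralsols", "Podzols"},
    },
    "Perlidae": {
        "elevation": {"medium", "high"},
        "vegetation": {"ombrophilous dense"},
        "soil": {"Cambisols"},
    },
}
site = {"elevation": "low", "vegetation": "ombrophilous dense", "soil": "Cambisols"}
print(potential_colonizers({"Chironomidae", "Perlidae"}, tolerances, site))
```

In this invented case Perlidae is excluded by the elevation filter alone, even though it passes the vegetation and soil filters.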
Ideally, the environmental filters should be relevant to the biological assemblages, must have enough available information regarding their biological tolerance ranges, and must not be strongly affected by stressors (Chessman & Royal 2004, Chessman 2006, Walsh 2006). In our study, the environmental variables that held these characteristics were elevation, original vegetation and soil type. These three filters are not directly influenced by human activities, they have been reported to influence aquatic insect assemblage distributions (Walsh 2006, Dudgeon 2012, Olson & Hawkins 2012), and there was enough information to map them and to calculate the tolerance range values for aquatic insects. The elevation range was divided into low (0-200 m.a.s.l.), medium (>200-800 m.a.s.l.) and high (>800 m.a.s.l.), because these categories were reported as holding different aquatic insect assemblages in our studied region (Baptista et al. 2001, Henriques-Oliveira & Nessimian 2010). Original vegetation (Atlantic ombrophilous dense and semideciduous forest) and soil type (Cambisols, Ferralsols and Podzols) followed IBGE (http://mapas.ibge.gov.br).
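As a small illustration, the elevation categories above can be written as a classifier. The function name is hypothetical; the class limits are those given in the text, with sites at exactly 200 m and 800 m falling in the lower class.

```python
def elevation_class(elevation_m):
    """Map an elevation in m.a.s.l. to the elevation filter category."""
    if elevation_m < 0:
        raise ValueError("elevation below sea level is outside the studied range")
    if elevation_m <= 200:
        return "low"      # 0-200 m.a.s.l.
    if elevation_m <= 800:
        return "medium"   # >200-800 m.a.s.l.
    return "high"         # >800 m.a.s.l.


for elevation in (15, 200, 201, 800, 1700):
    print(elevation, elevation_class(elevation))
```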
The preferences and environmental tolerances of each taxon regarding the elevation range, vegetation and soil type were based on information obtained from the database and numerous publications. We used the database to define the occurrence frequency of taxa in each category within the environmental filters. Families occurring in 10% or more of the streams in a category were considered potentially present in that category of the environmental filter (Appendix 1). Thus, occurrence frequencies were transformed to binary presence (1) or absence (0). We used the literature to avoid errors and the ones containing more information were
All taxa were identified to the lowest taxonomic level possible (mostly to genus level), but data were aggregated to family level for the modelling. The family level was chosen due to the lack of information about the preferences and environmental tolerances (ecological traits) at genus or lower levels. Also, this taxonomic level was shown to present similar responses to impairment as lower levels in Brazil (Buss & Vitorino 2010) and, for practical reasons, family level is recommended as a starting point for bioassessments in regions with taxonomic and resource constraints, such as Brazil (Buss et al. 2015). Other studies developing predictive modelling based on environmental filters (Chessman & Royal 2004, Stranko et al. 2005, Chessman 2006, Walsh et al. 2010, Davies et al. 2012) have found that those approaches may be applied successfully at family level.
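The conversion of occurrence frequencies into binary tolerances (the 10% rule described earlier in this section) can be sketched as follows. The function name and the counts are invented for illustration and do not come from the database.

```python
def binary_tolerance(occurrences, sites_per_category, threshold=0.10):
    """Return 1/0 per category: 1 if the family occurs in >= 10% of its streams.

    `occurrences[cat]` is the number of streams of category `cat` where the
    family was recorded; `sites_per_category[cat]` is the number of streams
    of that category in the database (hypothetical layout).
    """
    return {
        cat: int(occurrences.get(cat, 0) / n >= threshold)
        for cat, n in sites_per_category.items()
        if n > 0
    }


# Invented counts for one family across the three elevation categories.
streams = {"low": 100, "medium": 50, "high": 20}
records = {"low": 12, "medium": 3}
print(binary_tolerance(records, streams))
```

With these invented counts the family is recorded in 12% of low-elevation streams (presence), 6% of medium-elevation streams (absence) and no high-elevation streams (absence).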

Database used to apply and test the model
The dataset used to apply and test the model consisted of 106,088 individuals representing 67 families and 10 orders of insects, from 146 sites sampled in Rio de Janeiro state. The records from these 146 sites were excluded from model development so that they could be independently used to assess the model performance. Aquatic insects were sampled in streams of 1st to 4th order (less than 15% of the streams were of 5th to 6th order), representing elevations from sea level to 1,700 m.a.s.l. and under different land uses (reference, pasture, agriculture or urban). The great majority of the samples were taken outside the wet season (thus, from April to October) between the years 2005 and 2010. All samples were taken and processed by the same research team. In the field, twenty samples (around 20 m²) were collected proportionally to the microhabitats (substrates) available in each stream reach using a kick sampler (30 x 30 cm; 500 µm mesh size), following the multi-habitat method (Barbour et al. 1999). The percentage of available habitats was previously estimated by visual inspection, and substrates covering less than 5% of the site area were not sampled. Samples were obtained from a reach length of approximately 20 times the channel width. Samples were preserved in the field in 80% ethanol and taken to the laboratory for further inspection. In the laboratory, samples were washed to remove coarse organic matter, such as leaves and twigs, and the remaining material was placed in a sub-sampler measuring 64 x 36 cm, divided into 24 quadrats, each measuring 10.5 x 8.5 cm, with an area of 89.25 cm² (European patent number 2572576). Eight quadrats were chosen at random and processed entirely, following the procedures described in Oliveira et al. (2011a).
In the field, water was analyzed for pH (MPA 210p LabConte) and dissolved oxygen (mg/L; YSI 550A). Cooled samples were taken to the lab for further analysis. The parameters ammonia (mg/L NH3), nitrate (mg/L NO3), and total phosphorus (mg/L P-total) were analyzed using a spectrophotometer (HACH DR2500), following Standard Methods protocols (APHA 2000). Sampling sites were also classified in the field using the visual-based habitat assessment protocol (HAP; Barbour et al. 1999). The HAP analyzes ten environmental parameters, such as substrate availability for colonization by benthic fauna, water velocity and embeddedness, channel condition, sediment deposition, margin stability and riparian vegetation extent and condition. For each parameter a score between 0 and 20 is assigned. Sites are classified according to the mean score obtained in the HAP, as follows: 0-5 "Poor", 5.1-9.9 "Marginal", 10-14.9 "Suboptimal" and 15-20 "Optimal" environmental condition (Barbour et al. 1999). Landscape variables (elevation, original vegetation and soil types) were obtained for each site using the ArcGIS 10.3 software and their corresponding digitized 1:500,000 maps (IBGE http://mapas.ibge.gov.br).
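The HAP classification can be illustrated with a short function. This is a sketch, not the protocol's official implementation: the function name is hypothetical, and because the published bands leave small gaps (for example between 5 and 5.1), the cut-offs below assume that a mean of up to 5 is "Poor", below 10 is "Marginal" and below 15 is "Suboptimal".

```python
def hap_class(parameter_scores):
    """Classify a site from its ten HAP parameter scores (each 0-20)."""
    if len(parameter_scores) != 10:
        raise ValueError("the HAP uses ten parameters")
    if any(not 0 <= s <= 20 for s in parameter_scores):
        raise ValueError("each parameter is scored between 0 and 20")
    mean_score = sum(parameter_scores) / len(parameter_scores)
    # Band limits follow the text; handling of the gaps is an assumption.
    if mean_score <= 5:
        return "Poor"
    if mean_score < 10:
        return "Marginal"
    if mean_score < 15:
        return "Suboptimal"
    return "Optimal"


print(hap_class([18, 17, 16, 19, 15, 16, 18, 17, 20, 16]))
```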
Although our aim in this study was to develop a model to predict the fauna in regions lacking proper reference sites, we included reference areas to test the model. We hypothesized that if the model is robust enough it should yield higher scores for reference sites than for other impairment conditions. Impairment classes were assigned based on physical, chemical and environmental parameters, the latter following Barbour et al. (1999). Sites were classified as reference if the water had dissolved oxygen >6.0 mg/L, the site had an "Optimal" or "Good" environmental condition according to the HAP, there was no sign of channelization locally or upstream, and <25% of the upstream area was under urban land use (based on recent satellite images). Most reference sites were within protected areas or their buffer zones and were thus classified as "minimally disturbed" (sensu Stoddard et al. 2006). Sites were classified as impaired if they had a "Poor" condition according to the HAP and if recent satellite images showed that >40% of the upstream area was affected by urban areas and/or agriculture. Intermediate sites had characteristics between these two classes. In the field, we noticed that some intermediate sites had been reforested or were in the process of recovery. Those sites were classified as "best attainable conditions" (sensu Stoddard et al. 2006), and we used this subset of intermediate sites to refine the model testing.
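The classification rules above can be expressed as a small decision function. This is a sketch under assumptions: the function and parameter names are hypothetical, and the "Good" condition named in the text is not one of the four HAP classes listed earlier, so the code accepts both "Good" and "Suboptimal" as a guess at what was meant. The "best attainable" subset was identified from field observations and is therefore not derived here.

```python
def impairment_class(dissolved_oxygen, hap, channelized,
                     pct_urban_upstream, pct_urban_or_agri_upstream):
    """Assign reference / impaired / intermediate from the stated criteria."""
    # Assumption: "Good" in the text corresponds to the HAP "Suboptimal" class.
    good_habitat = hap in {"Optimal", "Good", "Suboptimal"}
    if (dissolved_oxygen > 6.0 and good_habitat and not channelized
            and pct_urban_upstream < 25):
        return "reference"
    if hap == "Poor" and pct_urban_or_agri_upstream > 40:
        return "impaired"
    # Everything between the two extremes.
    return "intermediate"


print(impairment_class(7.8, "Optimal", False, 2, 10))
print(impairment_class(3.1, "Poor", True, 35, 70))
print(impairment_class(6.5, "Marginal", False, 10, 30))
```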

Comparison of observed and expected taxa
The suite of families observed at each site was compared with the suite that was attributed as potential colonizers of that site. This ratio (number of observed within the expected/total expected) is termed the "observed proportion of potential" (OPP; Chessman & Royal 2004). The OPP scores range from 0 to 1, where scores close to zero indicate that few expected families were observed and scores close to one indicate the opposite. The lower the OPP score, the higher the impairment level of a site.
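The ratio defined above is simple enough to state in code. The sketch below uses hypothetical names and invented family sets; note that families observed but not expected do not enter the score, since only the observed families within the expected set are counted.

```python
def opp(observed, expected):
    """Observed proportion of potential: |observed AND expected| / |expected|."""
    if not expected:
        raise ValueError("no family is expected at this site")
    return len(set(observed) & set(expected)) / len(set(expected))


# Invented example: four families expected, two of them observed.
expected = {"Baetidae", "Perlidae", "Elmidae", "Leptoceridae"}
observed = {"Baetidae", "Elmidae", "Chironomidae"}
print(opp(observed, expected))  # 0.5
```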

Testing the model
The testing of the OPP aimed to verify: (1) responsiveness, i.e. whether the OPP scores statistically discriminate sites of the four impairment classes (minimally disturbed reference, best attainable conditions, intermediate and impaired), using an ANOVA followed by a Tukey post-hoc test; (2) precision, using the standard deviation of scores across reference sites; (3) sensitivity, i.e. whether OPP scores correlated with environmental variables that are highly sensitive to human alteration and thus define a stressor gradient; and (4) agreement between the methods in relation to ecological quality classification, i.e. whether OPP scores correlated with multimetric indices previously developed for the river basins where those sites were located. To verify whether the OPP scores were correlated with environmental variables, a PCA was first calculated.

Results
Based on the criteria to determine impairment conditions, the 146 sampled sites were classified as follows: 35 reference sites, 20 with best attainable conditions, 55 intermediate and 36 impaired. Physical and chemical parameters suggest a gradient of impairment related to organic pollution (as the values of ammonia and P-total indicate) and to non-point source pollution and/or environmental degradation (based on the HAP index; Table 1). The HAP index classified reference sites as "optimal" or "good", while impaired sites were classified as "regular" or "poor" environmental condition.

Environmental filters
Of the 146 stream sites analyzed in this study, Cambisols had the highest number of sites (104) among soil types, followed by Podzols and Ferralsols (26 and 16 sites, respectively). The great majority of sites (136) belonged to the ombrophilous dense forest domain, with the ten remaining sites belonging to the Atlantic semi-deciduous forest domain. The elevation ranges separated sites more evenly: 51 sites at 0-200 m.a.s.l., 60 sites in the 200-800 m.a.s.l. range and 35 sites at >800 m.a.s.l. It is important to note that the distribution of streams does not follow the proportions of each environmental condition in the state, being an artefact of the database. The three environmental filters were combined to determine the potential of occurrence of aquatic insects.
Sixty-seven families were identified as having the natural potential to occur at one or more sites. The combination with the highest number of families with potential to occur was Cambisols covered by ombrophilous dense forest in the 200-800 m.a.s.l. elevational range (53 families; Table 2). The combinations with the lowest number of families with potential to occur were those at low elevation (0-200 m.a.s.l.) covered by Atlantic semi-deciduous forest, on both Ferralsols and Podzols (35 families; Table 2). Sampled sites represented most possible filter combinations. All combinations had at least one site sampled, with the exception of sites >800 m.a.s.l. in Atlantic semi-deciduous forest on Podzols and Ferralsols, and in Atlantic ombrophilous dense forest on Ferralsols (Table 2).
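For orientation, the filter combinations can be enumerated directly. The short sketch below (category labels abbreviated, variable names hypothetical) lists the 3 x 2 x 3 = 18 possible combinations and removes the three that the text reports as having no sampled site.

```python
from itertools import product

SOILS = ("Cambisols", "Ferralsols", "Podzols")
VEGETATION = ("ombrophilous dense", "semi-deciduous")
ELEVATION = ("0-200", "200-800", ">800")

# Every possible combination of the three environmental filters.
combinations = list(product(SOILS, VEGETATION, ELEVATION))

# Combinations reported in the text as having no sampled site.
unsampled = {
    ("Podzols", "semi-deciduous", ">800"),
    ("Ferralsols", "semi-deciduous", ">800"),
    ("Ferralsols", "ombrophilous dense", ">800"),
}
sampled = [c for c in combinations if c not in unsampled]

print(len(combinations), len(sampled))  # 18 15
```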

Testing the model
The OPP scores for the 146 sites ranged from 0.02 to 0.75. Reference sites had a high number of occasions where the expected families were also observed (E+O+, Table 3). This contributed to the highest OPP scores being obtained by sites of this class (percentiles 25%-75% = 0.46 and 0.60, respectively; Figure 2). BAC site scores were statistically similar to those of reference sites (ANOVA Tukey post-hoc test p=0.11), although 75% of BAC scores were lower than the median score of reference sites (Figure 2). On the other hand, in impaired sites, a high number of expected families were not observed (E+O-, Table 3). The lower number of E+O+ occasions resulted in lower OPP scores for impaired sites (percentiles 25%-75% = 0.12 and 0.32, respectively). Intermediate sites scored between the two extremes (ANOVA Tukey post-hoc test p<0.01 for all those pairs of data; Figure 2). The scores of reference sites had a low standard deviation (0.54, SD = 0.098), which indicates high precision of the filters approach. In general, sensitive taxa (Corydalidae, Gripopterygidae, Perlidae, Psephenidae and Pyralidae) and the shredders, a group particularly vulnerable to riparian deforestation (e.g., Calamoceratidae, Leptoceridae and Tipulidae), were observed with higher frequency in reference sites (Table 3). Those taxa, among others, had distinct presence patterns between reference and impaired sites. For example, Gripopterygidae was observed in 34 of the 35 reference sites where it had a natural potential of occurring, but in only two of the 36 impaired sites where it was expected. Some very abundant tolerant or moderately tolerant taxa were observed in most sites where they were expected to occur (e.g., Elmidae, Chironomidae, Simuliidae, Baetidae, Leptohyphidae and Hydropsychidae), regardless of the impairment condition (reference, BAC, intermediate or impaired; Table 3). On very few occasions, some families not assigned by the filters as occurring at a site were observed (E-O+, Table 3).
Axis 1 of the environmental PCA accounted for 54.04% of the variance and was the only significant axis according to the broken-stick model. PCA axis 1 discriminated reference and BAC sites from intermediate and impaired conditions (Mann-Whitney test using PCA axis 1 scores, p<0.01 for all pairs except Ref x BAC, p = 0.56). The parameters ammonia and P-total had the highest loadings (>0.50) on this axis. The OPP scores responded to this gradient of impairment conditions and were significantly correlated with the environmental PCA axis 1 values (r=0.74, p<0.01). The OPP scores were also highly correlated with all subsets of biological data, expressed by the multimetric indices calculated for each basin (Table 4).
Table 3. Number of occasions of aquatic insect families with/without a natural potential of occurrence at a site based on environmental filters (E+, E-), and observed/not observed (O+, O-), at reference (bold letters) and impaired sites (italic letters). "E+O+" (number of occasions the family with potential was collected); "E+O-" (number of occasions the family with potential was not collected); "E-O+" (number of occasions the family with no potential was collected); "E-O-" (number of occasions the family with no potential was not collected).
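The four kinds of occasions tallied in Table 3 (E+O+, E+O-, E-O+, E-O-) can be counted for a given family as sketched below. The function name, the data layout (one pair of expected and observed family sets per site) and the example sites are all hypothetical.

```python
def tally_occasions(family, sites):
    """Count E+O+, E+O-, E-O+ and E-O- occasions for one family.

    `sites` is an iterable of (expected_families, observed_families) pairs,
    one pair per site (hypothetical layout).
    """
    counts = {"E+O+": 0, "E+O-": 0, "E-O+": 0, "E-O-": 0}
    for expected, observed in sites:
        key = ("E+" if family in expected else "E-") + \
              ("O+" if family in observed else "O-")
        counts[key] += 1
    return counts


# Three invented sites.
sites = [
    ({"Perlidae", "Baetidae"}, {"Perlidae", "Baetidae"}),
    ({"Perlidae", "Baetidae"}, {"Baetidae"}),
    ({"Baetidae"}, {"Baetidae", "Perlidae"}),
]
print(tally_occasions("Perlidae", sites))
```

For Perlidae in this invented example, each of E+O+, E+O- and E-O+ is counted once and E-O- is zero.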

Discussion
The OPP index developed in this study was able to discriminate a gradient of impairment conditions (reference, best attainable, intermediate and impaired sites; Figure 2). The OPP scores were also correlated with a set of environmental variables, and were strongly correlated with all multimetric indices developed for multiple watersheds in the state (Table 4). This latter result was unexpected, since those multimetric indices were developed separately for each watershed, incorporating their particular environmental and biological conditions. Chessman and Royal (2004) found that the OPP was significantly correlated with different disturbance measures and with another biological index (SIGNAL2; Stream Invertebrate Grade Number Average Level). Chessman et al. (2006) reported that the OPP had higher sensitivity to distinguish the impairment gradient than the AUSRIVAS O/E index (Australian River Assessment System Observed over Expected). Even though the OPP performed very well in our study, we believe it could be improved if some of the following aspects are implemented: using a taxonomic level lower than family; using more refined environmental filters; and incorporating more information about the autecology of aquatic insect taxa.

Considerations on taxonomic level
The taxonomic level to be used for bioassessment and monitoring purposes is a topic of ongoing debate (e.g., Buss & Vitorino 2010, Mueller et al. 2013). Some researchers argue that the identification of aquatic insects to species level has a higher sensitivity to detect small differences among sites (Heino 2014), while others argue that in biomonitoring programs datasets are summarized in indices, which do not necessarily require species data and are often robust to taxonomic aggregation (Buss & Vitorino 2010, Whittier & Van Sickle 2010, Mueller et al. 2013). The original OPP index in Australia was also developed at family level (Chessman & Royal 2004). Those authors argued that this level was chosen because of the lack of taxonomic keys, for practical reasons (lower cost and skill requirements), and because most bioassessments in the region have operated at family level. In Brazil and other parts of Latin America the situation is the same. Identification to species level is not always possible due to the limited taxonomic knowledge for many insect groups, and although taxonomic keys are being developed and becoming available (e.g., Domínguez & Fernandez 2009, Mugnai et al. 2010, Hamada et al. 2014), there is still a lack of taxonomic keys for most regions and an insufficient and decreasing number of experts in taxonomy. Still, if these limitations can be overcome, the use of lower taxonomic levels (e.g., genus) could hypothetically increase the precision of the model. Some Neotropical families are composed of genera that differ considerably regarding their tolerance to stressors and ecological preferences (e.g., Chironomidae, Roque et al. 2010; Baetidae, Buss & Salles 2007). In such cases, a family may have been assigned as having the "potential of occurrence" based on the distribution and preferences of a given genus 'A', while the genus collected in one location was 'B', incorrectly increasing the OPP score for that site. Higher taxonomic levels (e.g., family) could be used more consistently for families composed of few genera and/or where the genera share similar ecological traits. Since the tolerances and biological traits of many species or genera are still unknown (see discussion below), an interesting starting point could be using mixed taxonomic levels, applying the criteria above for the decision, instead of losing information by using data aggregated to family level.

Considerations on environmental filters
In our study, OPP scores for some reference areas were relatively low (OPP = 0.34, the lowest score for this condition), overlapping with some impaired condition scores (OPP = 0.46, the highest score for this condition). For the OPP, we used environmental filters at a "landscape" or "regional" scale (elevation, original vegetation and soil types). In part, we chose these filters based on Omernik's (1987) ecoregional approach, which has already been shown to be well reflected by the macroinvertebrate (Verdonschot & Nijboer 2004) and fish fauna (Pinto et al. 2009). Also, these features are not subject to human disturbance, which is an ideal condition for the use of the filters approach (Poff 1997, Walsh 2006). On the other hand, some researchers argue that macroinvertebrate faunal predictions based on filters should be based on multiple spatial scales, mixing both "regional" and more "local" features (Poff 1997, Olden et al. 2006). This is supported by research based on diversity partitioning analyses, which indicates that both scales help to explain macroinvertebrate assemblage distribution in Brazil (Ligeiro et al. 2010, Macedo et al. 2014). However, the use of local environmental filters has some limitations. Extracting patterns from large databases becomes more difficult, since local information is not widely available and cannot be extracted by GIS procedures (rather, it must be gathered on site), and, most importantly, local features are strongly affected by anthropogenic impacts, even relatively mild ones. The problem of using predictor variables that are affected by human disturbance is that the modelling will incorporate part of the disturbance, and it may misclassify sites with some degradation as being undegraded (Walsh 2006). To correct this, simulation models of human influences may make it possible to predict natural features, which could be used instead of the observed features (Chessman et al. 2006). The use of GIS-based information coupled with self-organizing map (SOM) techniques may further aid the prediction of local-level features based on larger-scale data (e.g., Davies et al. 2000, Snelder et al. 2011).

Considerations on taxa preferences and tolerances
The virtual lack of information on biological traits hinders the application of the environmental filters approach in many regions (Helm et al. 2015). In recent years, there has been a greater focus on this subject worldwide (Van den Brink et al. 2011, Culp et al. 2011, Mueller et al. 2013). Still, more data on autecological characteristics, ecological preferences and biological traits are necessary, especially for the Neotropical benthic fauna. The way forward includes the compilation of bio/ecological traits in large databases, such as those developed for Europe (Bonada & Dolédec 2011, Schmidt-Kloiber & Hering 2015), North America (Vieira et al. 2006), South America (Tomanova & Usseglio-Polatera 2007), and New Zealand (Dolédec et al. 2006). Several researchers highlight the potential of using biological traits as metrics for bioassessments, since they can potentially reveal additional information concerning ecosystem properties beyond taxonomic composition (Poff et al. 2006, Dolédec & Statzner 2008, Culp et al. 2011).
In our study, we found few occasions where a family not expected by the environmental filters analysis was, in fact, observed (E-O+, Table 3). Chessman & Royal (2004) argue that such cases may occur because the model was based on inappropriate filters or incorrect information on biological traits; according to Helm et al. (2015), they can also indicate invasion by non-native species from different geographic regions or by opportunistic species that historically did not occupy this particular habitat. Very little information is available on the historical distributions of the aquatic insect assemblages included in this study. However, this should be a concern when using this approach for whole macroinvertebrate assemblages, especially given the many cases of invasion by exotic mollusk and mosquito species in South America (e.g. Thomaz et al. 2014).

Considerations on reference sites
Based on our judgment, the "best attainable areas" (BAC) in this study would be classified as intermediate condition, or reference "on the fringe". The OPP approach was sensitive enough to detect this subtle change: although BAC sites had OPP scores statistically similar to reference sites, their scores were lower than most reference sites' scores. Also, both conditions were statistically different from intermediate and impaired conditions (Figure 2). These results indicate that best management practices (BMPs) in these areas, such as maintaining or recovering the riparian zone and the natural stream habitat features, were effective in partially sustaining the aquatic insect fauna. We hope this can stimulate and guide managers toward BMPs, even though recovery may be slow (Meals et al. 2010) and limited (Harding et al. 1998).

Further developments on environmental filters approach
Some alternative approaches were developed for the assessment of streams without reference areas. In general, those methods select from the population of sites the "least disturbed" ones, using physical, chemical and/or biological data, and use them as benchmarks. This can be achieved by using a pre-defined percentile of sites (Blocksom & Johnson 2009) or by multivariate analyses with abiotic data to

Figure 1. Locations of the macroinvertebrate sampling sites.
Biota Neotrop., 19(1): e20180601, 2019 http://dx.doi.org/10.1590/1676-0611-BN-2018-0601 http://www.scielo.br/bn
The PCA was calculated using environmental variables (ammonia, nitrate, P-total, and the HAP index) for 110 sites where all this information was available. Prior to analysis, data were standardized by subtracting the mean from each value and dividing the result by the standard deviation, to reduce the effects of the different scales used in the variables. Second, Pearson correlations were calculated between the OPP score for each site and the PCA first-axis values (the environmental variables considered should be positively correlated with the PCA first axis). To verify whether the OPP scores were correlated with multimetric indices, Pearson correlations were calculated between the OPP score for each site and the indices: GMMI (Guapiaçu-Macacu Multimetric Index; Oliveira et al. 2011b), for 38 sites sampled in this basin; PPPMI (Paquequer-Piabanha-Preto Multimetric Index; Baptista et al. 2011), for 22 sites; MISB (Serra da Bocaina Multimetric Index; Baptista et al. 2013), for 16 sites; IMMM (Macaé Multimetric Index; unpublished data), for 29 sites; and ECMI (East Coast Multimetric Index; Pereira et al. 2016), for 20 sites. Twenty-one sites were sampled in basins without previously developed indices and were not included in this analysis.
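The standardization and Pearson correlation steps can be sketched with the Python standard library alone. The function names and the example values are invented; the use of the sample standard deviation is an assumption, since the text does not say which estimator was used, and the PCA itself is not reproduced here.

```python
from statistics import mean, stdev


def standardize(values):
    """Z-scores: subtract the mean, divide by the (sample) standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    mx, my = mean(x), mean(y)
    numerator = sum((a - mx) * (b - my) for a, b in zip(x, y))
    denominator = (sum((a - mx) ** 2 for a in x)
                   * sum((b - my) ** 2 for b in y)) ** 0.5
    return numerator / denominator


# Invented values standing in for OPP scores and PCA first-axis scores.
opp_scores = [0.10, 0.25, 0.40, 0.55, 0.60]
axis1_scores = [-1.8, -0.9, 0.2, 0.9, 1.6]
print(standardize([1.0, 2.0, 3.0]))
print(round(pearson(opp_scores, axis1_scores), 3))
```

Standardizing the inputs does not change the Pearson coefficient; it matters for the PCA, where variables measured on different scales would otherwise contribute unequally.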
Bold letters: Reference sites; Italic letters: Impaired sites.

Table 1. Mean values (and standard deviation) of the physical, chemical and environmental parameters measured for stream sites classified as reference, best attainable condition (BAC), intermediate and impaired.

Table 2. Number of sampled sites and number of families with potential of occurring, according to environmental filters (soil types, original vegetation cover and elevation ranges).

Table 4. Pearson correlations (R and p-level) between the OPP scores and the multimetric indices for each stream basin.