Introduction
Successful fisheries studies require an appropriate sampling design that produces unbiased data, because biased data do not allow valid conclusions to be made about the statistical population (^{Zar, 2010}). Given that it is rarely possible to obtain data from all the sample units of a statistical population (^{Pagano, Gauvreau, 2000}; ^{Hansen et al., 2007}), a sampling design must be probabilistic so that each sample unit has the same chance of being sampled (^{Pagano, Gauvreau, 2000}).
A sampling design stipulates how samples are organized in space and/or time in an observational study (^{Gotelli, Ellison, 2004}), also called a mensurative experiment (^{Hurlbert, 1984}), in which the researcher generally does not control any of the variables being quantified. The general sampling designs most frequently used are simple random, systematic, stratified and clustered, which are all probabilistic (^{Wilde, Fisher, 1996}; ^{Pagano, Gauvreau, 2000}; ^{Hansen et al., 2007}; ^{Noble et al., 2007}). However, among these general sampling designs it is still possible to establish more specific sampling designs since there is more than one option for determining the location of fish sampling sites. Specific sampling designs may be fixed (i.e., the same sampling sites are sampled successively in time) or variable (i.e., new sampling sites are defined for each sampling trip) (^{Bonar et al., 2009}). The general sampling design used for fisheries studies is already well established (e.g., ^{Wilde, Fisher, 1996}; ^{Hansen et al., 2007}; ^{Noble et al., 2007}), but the specific sampling design is not.
Although probabilistic sampling is an old statistical recommendation, non-probabilistic sampling is dominant among fisheries studies in Brazil. For example, non-probabilistic sampling was used in 97% of the 35 scientific studies published from 1993 to 2014 (personal data) in reservoirs in the upper Paraná River basin. In North America, non-probabilistic sampling is still in use (e.g., ^{McClelland, Sass, 2012}; ^{Patterson, 2014}), but probabilistic sampling, which had been used for decades (^{King et al., 1981}), seems to be predominant.
Reasons for using non-probabilistic sampling include ease of access to sampling sites, time and resource availability, and environmental constraints versus capture technique efficiency, among others (^{Wilde, Fisher, 1996}; ^{Hansen et al., 2007}; ^{Noble et al., 2007}; ^{Bonar et al., 2009}). Probabilistic sampling is generally more expensive (^{Osburn, 1988}), and randomization may select sampling sites that are difficult to access, making sampling more time-consuming and costly. However, probabilistic sampling improves the ability of managers to interpret spatial patterns of populations and to predict their responses to habitat management and manipulations (^{Wilde, Fisher, 1996}). Non-probabilistic sampling may be less expensive, but it limits statistical inferences and restricts the use of data (^{Wilde, Fisher, 1996}; ^{Noble et al., 2007}).
Studies in the fisheries literature have compared probabilistic versus non-probabilistic sampling designs (usually called fixed sampling design). However, studies comparing two or more general probabilistic sampling designs are rare, and even rarer are those that compare specific sampling designs. For most rivers, assessing differences between probabilistic sampling designs is hampered by flow, which limits the number of sampling sites where fishing gear, such as gillnets, can be used. Reservoirs, in turn, make it possible to compare data from different probabilistic sampling designs, both general and specific, since they impose few restrictions on the choice of sampling sites. Our objective was to determine if two specific sampling designs (fixed vs. variable sampling sites) produce differences in eight population metrics of curimba, Prochilodus lineatus (Valenciennes, 1837), in two reservoirs in the upper Paraná River basin, Southeast Brazil.
Material and Methods
Study area. The reservoirs of Volta Grande (VGR) and Jaguara (JR) are in the middle Grande River (Fig. 1). This river originates in the Mantiqueira Mountain Range, and travels for 1,300 km until it joins the Paranaíba River, thus forming the Paraná River (^{Paiva, 1982}). For its first 615 km, the Grande River delimits the border between the states of Minas Gerais and São Paulo.
The VGR is the fifth reservoir, downstream to upstream, of 13 reservoirs on the Grande River. It is a run-of-the-river reservoir, with an area of 205.0 km^{2} (^{CEMIG, 2006}). It is 80 km long, of which approximately 60 km are lentic and 20 km lotic. Sugar cane cultivation is the predominant land use surrounding the reservoir. The VGR is mostly oligomesotrophic (^{Lopes, 2013}) with high water transparency. The approximately 90-km long Carmo River is its main tributary and possesses spawning and nursery grounds for curimba (^{Ribeiro, 2013}).
The JR is the seventh reservoir of the Grande River. It is slightly more than one-sixth the size of VGR, with an area of 34.6 km^{2} (^{CEMIG, 2006}). The JR is 25 km long, of which about 17 km are lentic and 8 km lotic. It is also a run-of-the-river reservoir. Pastures, forests and houses predominate in its surrounding landscape. It is an ultra-oligotrophic reservoir (^{Brandt Meio Ambiente, 2013}), with highly transparent water, and with only small tributaries, the largest of which is about 28 km long.
Study fish. Curimba is an iliophagous South American characiform of the family Prochilodontidae whose distribution is limited to the watersheds of La Plata and Paraíba do Sul rivers (^{Agostinho et al., 2003}; ^{Eschmeyer et al., 2016}). It is widely distributed in these basins and contributes significantly to fishing (^{Sverlij et al., 1993}). The maximum reported size is 78 cm, and it reaches first maturation between 24 and 28 cm (^{Vazzoler, Menezes, 1992}; ^{Agostinho et al., 2003}). Curimba performs spawning migrations, and spawning is associated with floods. It provides no parental care, and its larvae and juveniles develop in floodplain lakes (^{Agostinho et al., 2003}). Curimba is commonly used in stocking programs in Brazil (^{Agostinho et al., 2007}). The species represented 1.0% of the catch in VGR fish samplings and ranked 15^{th} in abundance, while in JR it represented 1.3% and ranked 11^{th} (personal data).
Fish sampling. We conducted 25 fish sampling trips at VGR and 22 at JR. At VGR, the sampling trips were monthly from July 2012 to June 2013 and bimonthly from August 2013 to October 2015. At JR, the sampling trips were bimonthly from July 2011 to October 2014.
During each sampling trip, we sampled fish with gillnets at nine fixed sampling sites and at nine variable sampling sites, all in the lentic habitat of the reservoirs. The locations of the fixed sampling sites were the same for all sampling trips, while the locations of the variable sampling sites changed for each sampling trip. We used systematic sampling to determine the fixed sampling sites at VGR and stratified sampling for those at JR. To determine the variable sampling sites of the two reservoirs we used stratified sampling. The methodology for choosing both fixed and variable sampling sites is described below.
Establishment of sampling sites for Volta Grande Reservoir. We delineated the perimeter of the lentic habitat using the line option of the “ruler tool” of Google Earth (GE), excluding 9 km upstream of the lentic habitat, due to an excess of macrophytes that prevented the use of gillnets. The line generated with GE was 299.5 km long and had 3,631 waypoints. The first sampling site was randomly drawn from all waypoints. We then defined the second sampling site as the closest waypoint within 500 m of the first sampling site in a counterclockwise direction. We used this same criterion to define the third sampling site in relation to the second, and so on for the entire perimeter of the reservoir. We excluded sampling sites located in front of countryside houses, in the urban area and in the security area of the dam. After exclusions, 465 sampling sites remained, representing about 78% of the initial number.
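The systematic placement of candidate sites along the shoreline can be sketched in code. This is a minimal illustration, not the procedure actually run in Google Earth: the function names, the representation of waypoints as along-shore positions in metres, and the reading of "closest waypoint within 500 m" as "the waypoint nearest to each 500-m mark after a random start" are all assumptions made for the example.

```python
import random

def circular_gap(a, b, perimeter):
    """Shortest along-shore distance between two positions on a closed line."""
    d = abs(a - b) % perimeter
    return min(d, perimeter - d)

def systematic_sites(positions, perimeter, spacing=500.0, seed=None):
    """Pick waypoint indices roughly every `spacing` metres from a random start.

    positions: along-shore position (m) of each waypoint (hypothetical input).
    perimeter: total length (m) of the closed shoreline line.
    """
    rng = random.Random(seed)
    start = rng.randrange(len(positions))      # random first site
    sites = [start]
    target = positions[start]
    for _ in range(int(perimeter // spacing) - 1):
        target = (target + spacing) % perimeter
        # Waypoint closest to the next mark along the shoreline.
        nearest = min(range(len(positions)),
                      key=lambda i: circular_gap(positions[i], target, perimeter))
        if nearest not in sites:
            sites.append(nearest)
    return sites

# Hypothetical shoreline: one waypoint every 100 m along a 3,000-m perimeter.
demo_positions = [100.0 * i for i in range(30)]
demo_sites = systematic_sites(demo_positions, perimeter=3000.0, seed=42)
```

Sites falling in excluded areas (houses, urban area, dam security area) would then be dropped from the returned list before any drawing takes place.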
We then divided VGR into three zones of equal margin lengths: A (downstream), B (intermediate) and C (upstream). For each zone, we drew three fixed and three variable sampling sites from among the 465 sampling sites. For each sampling trip, we used the same fixed sampling sites, but performed a new drawing, without replacing previously sampled sampling sites, for the selection of variable sampling sites. We drew the first fixed sampling site from the set of 465 sampling sites. The second fixed sampling site was the sampling site located 29.9 km away from the first fixed sampling site, and the third fixed sampling site was 29.5 km from the second, and so on. Since the fixed sampling sites were equidistantly spaced, and the VGR zones had the same margin lengths, each zone had three fixed sampling sites. For the variable sampling sites, we used stratified sampling by randomly drawing three variable sampling sites per reservoir zone per sampling trip.
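The per-trip stratified drawing of variable sites can be sketched as follows. The pool sizes (155 candidate sites per zone), site labels and function name are hypothetical; the text states only that the zones had equal margin lengths and that previously sampled sites were not returned to the pool at VGR.

```python
import random

def draw_variable_sites(sites_by_zone, n_trips, per_zone=3, seed=None):
    """Draw `per_zone` sites from each zone for every trip, without
    returning already-sampled sites to the pool."""
    rng = random.Random(seed)
    remaining = {zone: list(sites) for zone, sites in sites_by_zone.items()}
    schedule = []
    for _ in range(n_trips):
        trip = {}
        for zone, pool in remaining.items():
            chosen = rng.sample(pool, per_zone)   # random draw within the stratum
            for site in chosen:
                pool.remove(site)                 # no replacement across trips
            trip[zone] = chosen
        schedule.append(trip)
    return schedule

# Hypothetical pools: 155 candidate sites in each of zones A, B and C.
zones = {z: [f"{z}{i:03d}" for i in range(155)] for z in "ABC"}
schedule = draw_variable_sites(zones, n_trips=25, seed=1)
```

Each element of `schedule` holds the nine variable sites for one trip, three per zone.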
Establishment of sampling sites for JR. We used the same methodology used for VGR to establish fixed and variable sampling sites for JR, with the differences described below. The GE ruler tool line generated 18,888 waypoints and a perimeter of 101.2 km. After the exclusion of sampling sites in front of countryside houses, in the urban area and in the security area of the dam, 128 sampling sites remained, representing 63% of the initial number.
We chose to use stratified sampling to establish fixed sampling sites in JR, in contrast with the method adopted in VGR, because the exclusion of many sampling sites prevented the use of systematic sampling. For variable sampling sites, we carried out a new drawing for each sampling trip with replacement of sampling sites that were already sampled because of their limited number.
Capturing fish. We used 180 gillnets per sampling trip; 90 at the fixed sampling sites and 90 at the variable sampling sites. At each sampling site, we installed a set of 10 gillnets (each 20-m long and approximately 1.7-m high) with stretched mesh from 3 to 16 cm in the following order: 3, 8, 4, 10, 5, 12, 6, 14, 7 and 16. We installed the set of gillnets in the littoral zone of the reservoirs, parallel to the shore, late in the afternoon and removed it the next morning.
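As a rough guide to scale, the nominal net area implied by these dimensions can be worked out directly. The figures below use the approximate net height of 1.7 m and assume no net losses, so they are nominal values only; the variable names are introduced here for illustration.

```python
NETS_PER_SET = 10        # gillnets installed at each sampling site
NET_LENGTH_M = 20.0      # length of each gillnet
NET_HEIGHT_M = 1.7       # approximate height of each gillnet
SITES_PER_DESIGN = 9     # fixed (or variable) sites per sampling trip

area_per_net_m2 = NET_LENGTH_M * NET_HEIGHT_M              # about 34 m^2
area_per_set_m2 = NETS_PER_SET * area_per_net_m2           # about 340 m^2 per site
area_per_trip_m2 = SITES_PER_DESIGN * area_per_set_m2      # about 3,060 m^2 per design per trip
```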
At VGR, we used a fishing effort of 76,775.4 m^{2} at the fixed sampling sites and 77,470.2 m^{2} at the variable sampling sites. At JR, the fishing effort was 55,427.3 m^{2} at the fixed sampling sites and 57,430.7 m^{2} at the variable sampling sites. The differences in fishing effort between fixed and variable sampling sites were caused by the loss of nets.
For each captured curimba, we recorded the sampling site and mesh size of the gillnet. We fixed the specimens in 10% formaldehyde and, in the laboratory, determined standard length (SL, cm), body weight (BW, g) and sex macroscopically after dissection. We deposited the voucher specimens in the fish collection of the Universidade Federal de Minas Gerais under the numbers ICT-UFMG 2897 (VGR) and ICT-UFMG 2896 (JR).
Metrics and analyses. For each reservoir, we analyzed the influence of the sampling design on eight metrics: catch per unit effort (CPUE), catch constancy (C), SL, SL-capture distance relationship, Fulton condition factor (K), weight-length relationship, sex ratio and proportion of individuals of each sex. We performed 70 analyses of these metrics (35 per reservoir), comparing the results obtained by the two sampling designs.
We calculated catch per unit effort (CPUE) according to ^{Gulland (1969}) as modified by ^{Alves et al. (1998}): for each mesh size, we divided the number of captured curimba by the fishing effort (in 1,000 m^{2}), and then summed the quotients across all mesh sizes. We calculated CPUE for each sampling design, as well as per reservoir zone, sex and sampling trip separately for each sampling design. We determined the influence of sampling design, reservoir zone and sex on CPUE using the effect size for χ^{2} according to ^{Cohen (1988}). Thus, we used g effect size to determine the influence of sampling design alone and w effect size to determine the influence of sampling design associated with reservoir zone and sex.
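The CPUE calculation described above (catch divided by effort in units of 1,000 m^{2} for each mesh size, then summed over mesh sizes) can be sketched as below. The catches and the per-mesh effort are hypothetical numbers chosen only to show the arithmetic, and the function name is an assumption.

```python
def cpue(catch_by_mesh, effort_m2_by_mesh):
    """Sum over mesh sizes of (number caught) / (effort in 1,000 m^2)."""
    total = 0.0
    for mesh, n_fish in catch_by_mesh.items():
        effort_k = effort_m2_by_mesh[mesh] / 1000.0   # effort in 1,000 m^2
        total += n_fish / effort_k
    return total

# Hypothetical example: catches in three mesh sizes (cm), each fished with
# 7,650 m^2 of net over the study. Meshes with no catch contribute zero.
example_catch = {8: 2, 10: 5, 12: 3}
example_effort = {8: 7650.0, 10: 7650.0, 12: 7650.0}
example_cpue = cpue(example_catch, example_effort)    # fish per 1,000 m^2
```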
We determined C per sampling design according to ^{Dajoz (1978}), and classified curimba as: (i) constant, when present in more than 50% of the sampling trips; (ii) accessory, when present in between 25 and 50% of the sampling trips; and (iii) rare, when in less than 25% of the sampling trips.
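The constancy classification can be expressed compactly. This is a sketch of the rule as stated above; the function names are introduced here, and the count of 10 trips out of 25 is an illustrative input rather than a reported figure.

```python
def constancy(trips_present, trips_total):
    """Percentage of sampling trips in which the species was caught."""
    return 100.0 * trips_present / trips_total

def classify_constancy(c_percent):
    """Classes as described in the text: constant, accessory or rare."""
    if c_percent > 50:
        return "constant"
    if c_percent >= 25:
        return "accessory"
    return "rare"

# Illustrative input: a species caught in 10 of 25 sampling trips.
c_example = constancy(10, 25)                     # 40.0
class_example = classify_constancy(c_example)     # "accessory"
```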
We established the frequency distribution of curimba by SL class, with the number of classes being determined according to ^{Sturges (1926}). We tested for differences in the frequency distribution of curimba by SL class between sampling designs with the Kolmogorov-Smirnov test. We determined the influence of sampling design, sex and their interaction on SL using the effect size with a two-way ANOVA (f), also according to ^{Cohen (1988}).
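For readers unfamiliar with these steps, the sketch below shows Sturges' rule for the number of classes and the two-sample Kolmogorov-Smirnov statistic D. Several points are assumptions: that the rule is applied to the total number of curimba per reservoir (84 and 115, as reported in the Results), that the result is rounded up, and that D is computed on raw lengths rather than on binned classes. The P-value calculation is not shown.

```python
import math
from bisect import bisect_right

def sturges_classes(n):
    """Number of classes by Sturges' rule, k = 1 + log2(n), rounded up."""
    return math.ceil(1 + math.log2(n))

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic D = max |F_a(x) - F_b(x)|."""
    a, b = sorted(a), sorted(b)
    d = 0.0
    for x in a + b:
        f_a = bisect_right(a, x) / len(a)   # empirical CDF of sample a at x
        f_b = bisect_right(b, x) / len(b)   # empirical CDF of sample b at x
        d = max(d, abs(f_a - f_b))
    return d

k_vgr = sturges_classes(84)
k_jr = sturges_classes(115)
```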
Using the GE ruler tool, we measured the capture distance (i.e., the distance from the dam to the sampling site where the fish was captured) for each curimba by tracing a line from the dam to the sampling site, passing through the center of the reservoir. We determined the regression between SL and capture distance for each sampling design separately and tested for differences between regressions using ANCOVA.
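The per-design regression of SL on capture distance amounts to an ordinary least-squares fit, sketched below with the standard library only. The function name and the data are hypothetical. The ANCOVA comparison of slopes between designs is not reproduced here; the sketch only shows how the slope b and r^{2} for one design are obtained.

```python
def linear_fit(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx                  # slope
    a = mean_y - b * mean_x        # intercept
    r_squared = sxy ** 2 / (sxx * syy)
    return a, b, r_squared

# Hypothetical data: capture distance (km) and standard length (cm).
distance_km = [5.0, 12.0, 20.0, 31.0, 44.0]
sl_cm = [26.0, 27.5, 33.0, 34.0, 41.0]
intercept, slope, r2 = linear_fit(distance_km, sl_cm)
```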
We calculated K for each curimba according to ^{Fulton (1904}) and multiplied it by 100. We evaluated the influence of sampling design, sex and their interaction on K with effect size f.
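As a worked illustration, Fulton's condition factor is body weight divided by the cube of length, here multiplied by 100 as described above. The function name and the example fish are hypothetical, and using SL (cm) and BW (g) as the inputs is assumed from the measurements listed earlier.

```python
def fulton_k(bw_g, sl_cm):
    """Fulton condition factor: 100 * BW / SL^3 (BW in g, SL in cm)."""
    return 100.0 * bw_g / sl_cm ** 3

# Hypothetical fish: 810 g at 30 cm standard length.
k_example = fulton_k(810.0, 30.0)   # 100 * 810 / 27,000 = 3.0
```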
We used ANCOVA to test for differences in the weight-length relationship between sampling designs. We analyzed the influence of sampling design on sex ratio with w effect size and on proportion of individuals per sex with g effect size.
We used G*Power 3.1, SAS University Edition and the equations in ^{Cohen (1988}) to calculate effect size. We classified effect size into four classes (null, small, medium and large) following ^{Cohen (1988}). We performed ANCOVA using Past 1.28 and the Kolmogorov-Smirnov test using SAS University Edition, both at the significance level of 0.05. For ANCOVA, we verified normality using the Shapiro-Wilk test and, when necessary, homogeneity of variances with Levene’s test.
Results
We captured 84 curimba in VGR and 115 in JR. The sampling design had an effect size near null on CPUE in both reservoirs (VGR: g = 0.04 and JR: g < 0.01; Tab. 1). The effect size of sampling design on CPUE of the three reservoir zones was small in VGR (w = 0.15) and medium in JR (w = 0.26; Fig. 2). Temporal variation in CPUE, both intra- and inter-annual, was similar between the sampling designs in the two reservoirs, except for a few months (Fig. 3). The effect size of sampling design was small on CPUE by sex in VGR (w = 0.15) and JR (w = 0.08).
Reservoir | Sampling design | Curimba N | Curimba CPUE
---|---|---|---
Volta Grande | Fixed | 38 | 4.7
Volta Grande | Variable | 46 | 5.6
Jaguara | Fixed | 55 | 10.3
Jaguara | Variable | 60 | 10.6
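The g values reported for CPUE can be checked against the CPUE figures in the table above. The sketch assumes that g was computed as Cohen's g = |P - 0.5|, with P taken as the share of the summed CPUE contributed by one sampling design; this reading is an assumption, although it does reproduce the reported values.

```python
def cohen_g(value_a, value_b):
    """Cohen's g: absolute departure of a proportion from 0.5."""
    p = value_a / (value_a + value_b)
    return abs(p - 0.5)

# CPUE at fixed vs. variable sampling sites, taken from the table above.
g_vgr = cohen_g(4.7, 5.6)      # about 0.04
g_jr = cohen_g(10.3, 10.6)     # below 0.01
```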
In VGR, the curimba were accessory at the fixed sampling sites (C = 40%) and constant at the variable sampling sites (C = 72%), but constant at the fixed (C = 73%) and variable (C = 64%) sampling sites of JR. On average, we captured 3.4 curimba per sampling trip in VGR and 5.2 curimba per sampling trip in JR. We caught no curimba during five sampling trips in VGR, while curimba were present in all the sampling trips in JR.
The frequency distribution per SL class of curimba did not differ between fixed and variable sampling sites in the two reservoirs (Kolmogorov-Smirnov test; VGR: P = 0.49 and JR: P = 0.15; Fig. 4). Sampling design, sex and their interaction had a small to null effect size on SL in VGR (f < 0.03) and JR (f < 0.04).
The regression coefficient (b) of the SL-capture distance relationship was significantly different from zero at the fixed sampling sites of both reservoirs (Tab. 2 and Fig. 5). At the variable sampling sites, however, b did not differ, or only marginally so, from zero. Moreover, the r^{2} was higher at fixed sampling sites than at the variable sampling sites. The influence of the interaction between sampling design and capture distance on SL was significant (ANCOVA; VGR: P = 0.03 and JR: P = 0.02). The slope was greater at the fixed sampling sites in both reservoirs (Fig. 5).
Reservoir | Sampling design | b | P* | r^{2}
---|---|---|---|---
Volta Grande | Fixed | 0.39 | <0.01 | 0.54
Volta Grande | Variable | 0.16 | 0.04 | 0.09
Jaguara | Fixed | 0.92 | <0.01 | 0.14
Jaguara | Variable | -0.19 | 0.61 | 0.00
*P-value for the null hypothesis of b = 0.
Sampling design, sex and their interaction had near null effect sizes on K in both reservoirs (VGR: f < 0.03 and JR: f < 0.02). There were also no significant differences in the weight-length relationship due to sampling design in either reservoir (ANCOVA; VGR: P = 0.49 and JR: P = 0.60).
The effect size of sampling design on sex ratio was small in both reservoirs (VGR: w = 0.11 and JR: w = 0.08). Sampling design had a small to null effect size on the proportion of individuals both for males and females in the two reservoirs (Tab. 3).
Discussion
We performed 70 analyses (35 per reservoir) of 8 population metrics of curimba between two probabilistic sampling designs. Sampling design had an influence (i.e., had an effect size greater than small or showed a statistically significant difference) in only five of the analyses, involving three metrics: catch constancy (VGR only), CPUE (JR only) and the SL-capture distance relationship (both reservoirs, three analyses). We did not find an explanation for the medium effect size of sampling design on the CPUE of JR. Explanations for the differences in the other metrics are given below.
Differences between sampling designs. Differences in C between sampling designs in VGR may have been caused by random zeros (i.e., no capture of curimba due to sampling variability), not by the type of sampling. Random zeros also may explain differences in C between the fixed systematic sampling design used in VGR and the fixed stratified sampling design applied in JR. The mean number of curimba captured per sampling trip was lower in VGR than in JR. Moreover, curimba were absent in five sampling trips in VGR, while they were present in all the sampling trips in JR. It seems that the smaller the number of curimba captured per sampling trip, the greater the chances of not catching any in one of them (greater chance of the occurrence of a random zero). Thus, the C of the curimba may have been influenced more by the quantity captured than by the sampling design. We obtained more evidence supporting this hypothesis by analyzing the C of the other species captured in our samplings (31 in VGR and 26 in JR). All of the 11 most abundant species of each reservoir were constant in both types of sampling designs. Differences in C between sampling designs occurred only for the less abundant species, whose average number of individuals per sampling site was less than 5.2 fish, as occurred with curimba in VGR.
The most notable differences between sampling designs occurred in the metrics associated with the SL-capture distance relationship, apparently due to the bias of the fixed sampling sites. The SL of curimba captured at these sites increased (or increased more sharply) with increasing capture distance compared to those captured at the variable sampling sites. Some evidence suggests that the type of sampling design, and not biology, was responsible for such a difference. In sampling designs with continuous predictor variables (as was capture distance in our study), it is necessary that samples are sufficiently distributed along the amplitude of the variable (^{Gotelli, Ellison, 2004}), otherwise sample units (sampling sites, in the present case) may not be sampled, even if the population of the response variable (curimba) makes use of these units (^{Hansen et al., 2007}). Samples of the fixed sampling sites were, by definition, always made at the same capture distance, and thus this sampling design did not sample units (sampling sites) available at other distances. These sample units were only sampled by variable sampling sites. Thus, the metric SL-capture distance obtained at fixed sampling sites may reflect the type of sampling, not the population (^{Hansen et al., 2007}), which could bias the results. If we sampled the reservoirs with only fixed sampling sites, we would conclude that there is a trend for larger sized curimba further from the dam. This tendency was not observed with the variable sampling sites. That is, curimba of different sizes are distributed among all sample units of the study area, with no relation to capture distance. ^{Magnusson et al. (2015}) gave hypothetical examples of how sampling design can influence results and, consequently, decision making. Inadequate or insufficient sampling can generate biased results that lead to an erroneous statistical hypothesis (^{Gotelli, Ellison, 2004}; ^{Hansen et al., 2007}). In our study, for example, sampling curimba in JR with only fixed sampling sites would generate a type I error (i.e., rejection of a true null hypothesis).
Spatial heterogeneity and differences between sampling designs. We suspect that the lower spatial heterogeneity of the studied reservoirs may be the reason that sampling design influenced only a few metrics in a low percentage of the analyses. ^{Smith et al. (2016}) compared fish community metrics of fixed and variable sampling sites, both probabilistic, in lakes in the state of South Dakota, USA. They also found no consistent differences between the sampling designs, and attributed the similarity between them to the low spatial heterogeneity of the lakes. Reservoirs have less spatial heterogeneity (^{Wills et al., 2004}; ^{Santos et al., 2008}) and most fish occupy the littoral zone (^{Agostinho et al., 2007}). Our sampling sites, both fixed and variable, were located in the littoral zone with lower spatial heterogeneity. At VGR, for example, most of the sampling sites (71%) were occupied by Egeria, a rooted submerged macrophyte (personal data), while at JR, macrophytes were not abundant at any sampling site.
Conversely, then, the chances of more metrics being influenced by sampling design would be greater where there is greater spatial heterogeneity. The spatial structuring of the size of curimba in rivers with floodplains, an environment with greater spatial heterogeneity, supports this hypothesis. In this environment, curimba exhibit spatial differences in size since juveniles prefer floodplains while adults prefer flowing water (^{Sverlij et al., 1993}; ^{Agostinho, Zalewski, 1995}; ^{Gomes, Agostinho, 1997}). When metrics are compared between distinct habitats, differences may appear even in environments with lower spatial heterogeneity, such as reservoirs (^{Bodine et al., 2011}).
It appears, then, that the level of spatial heterogeneity may determine differences in fish metrics between sampling designs. However, it also seems that there have been no studies evaluating this hypothesis for fish or any other organism.
Choice of sampling design. Choosing between fixed and variable sampling sites for probabilistic sampling seems to depend not only on the spatial heterogeneity of the study area, but also on the type of variation (temporal or spatial) that one wishes to detect. In environments with lower spatial heterogeneity, such as reservoirs and lakes, the recommendation is to use fixed sampling sites (^{Smith et al., 2016}), particularly if they are more economical or viable than variable sampling sites (^{King et al., 1981}). In environments with greater spatial heterogeneity, such as headwater streams, fixed sampling sites are better at detecting temporal variation while variable sampling sites are better for detecting spatial variation (^{Quist et al., 2006}). Moreover, in environments with greater spatial heterogeneity, stratified sampling is potentially more appropriate than simple random sampling when variation within strata is less than variation among strata (^{Hansen et al., 2007}).
Metrics of fish at variable sampling sites are less subject to the confounding effects of localized environmental degradation than are the metrics of fish at fixed sampling sites. ^{McClelland, Sass (2012}) suggested that lower fish abundance at one of their (non-probabilistic) fixed sampling sites may have been caused by some degree of habitat change. If habitat change occurs during a study, the result may be increased variation in fixed sampling site metrics (probabilistic or not), and lower power of the statistical tests. By sampling different sampling sites during each sampling trip, variable sampling sites do not suffer from this issue.
Our study generated results both in agreement and in disagreement with the observation of ^{Quist et al. (2006}) that variable sampling sites may be better for detecting spatial variation while fixed sampling sites may be better for detecting temporal variation. In agreement were the differences in the SL-capture distance relationship of curimba between variable and fixed sampling sites. Variable sampling sites seem to have been the most appropriate sampling design for evaluating this spatial metric. In disagreement was the lack of temporal differences in CPUE between variable and fixed sampling sites. The absence of these differences may have been caused by the low spatial heterogeneity of the sampled reservoirs.
The present study and the literature suggest that the choice of the most appropriate sampling design will generally depend on the availability of time, financial resources and spatial heterogeneity. Fixed sampling sites are operationally simpler, may have lower execution costs depending on the ease of access to the sampling sites and may be the most appropriate design for temporal analysis in environments with greater spatial heterogeneity. However, fixed sampling sites may generate biased metrics and can be more susceptible to localized environmental degradation. Variable sampling sites, on the other hand, are operationally more complex and can be more expensive. Nevertheless, they are likely more appropriate for spatial analysis in environments with less spatial heterogeneity, and their metrics are less subject to localized environmental degradation. These recommendations are based on a very limited number of studies comparing probabilistic sampling designs and some suggestions from studies with non-probabilistic designs. Further studies with probabilistic sampling designs are therefore required to determine whether our recommendations are consistent.