Spatial dependence and experimental precision in snap bean (Phaseolus vulgaris L.) trials related to the number of plants and harvests

The productive variability in horticultural crops affects the planning and quality of experiments and can lead to wrong conclusions. The objectives of this study were to verify the spatial dependence of the fresh biomass of snap beans and to determine the number of plants and harvests necessary to improve experimental accuracy in trials. Fresh biomass data of snap beans were obtained from uniformity trials carried out in a greenhouse and in the field, and semivariograms were built from the data transformed into indicators. The data were combined into scenarios of plot size and harvest grouping, and the semivariograms were fitted to the spherical, exponential and Gaussian models. A response surface was also fitted, with the variation coefficient as the dependent variable and the numbers of plants per plot and harvests as independent variables. The parameter estimates of the semivariogram models indicated weak spatial dependence. The average fresh biomass of snap beans is distributed randomly in the trials, and this distribution is not influenced by the number of plants per plot or by the number of grouped harvests. The combinations of number of plants per plot and harvests that give the smallest variation coefficients are plots of 24 plants for the plastic greenhouse and the field, and 28 plants for the plastic tunnel, in autumn-winter, combined with the grouping of all harvests. In spring-summer, the number of plants per plot was 30 for the plastic tunnel and the field, also combined with the grouping of all harvests.


INTRODUCTION
Snap beans (Phaseolus vulgaris L.) are the most important Fabaceae in the horticultural group. They differ from common beans because they are harvested while still immature, and they are consumed both industrialized and "in natura" (Filgueira, 2008). Growing them is a good alternative for the off-season of other horticultural crops and a way to diversify production, both in protected environments and in the field. This is because they make use of existing staking structures and residual fertilization, and they also help to break the cycle of some diseases.
In 2011 the production of vegetable crops in Brazil was 19.4 million tons on 537,215 ha, considering the 17 principal vegetables (Associação Brasileira do Comércio de Sementes e Mudas-ABCSEM, 2011). The Southeast and South regions held three quarters of the production volume, while the Northeast and Midwest regions produced 25% of the total (Melo; Vilela, 2007). The increase in the production and productivity of horticultural crops came with the generation of new technology, which was made possible in part by experiments conducted with consistent techniques. In those experiments, controlling the residual variability improves accuracy, experimental quality, and the reliability of inferences.
In greenhouse horticultural crops, factors such as the position of the crop row in relation to the lateral doors and the presence or absence of fruits ready for harvest are causes of variability that must be controlled during the execution of experiments. According to Lúcio et al. (2006), the injuries to which plants are subjected during crop management and fruit harvesting, as well as environmental variations, alter the individual production of plants throughout the crop cycle and therefore become sources of variability as well.
Plots with no production are frequent, either because there are no fruits to harvest or because the fruits do not have appropriate characteristics for harvest or commercialization. The occurrence of many plots with null values generates overdispersion in the data. To reduce this overdispersion, Couto et al. (2009) suggest using plots with more than one plant, combined with a grouping of harvests. In this sense, applying geostatistical techniques to describe the spatial dependence in experimental environments is useful for obtaining higher precision and reliability of the results.
In general, according to Yamamoto and Landim (2013), geostatistical methodology makes it possible to extract from apparently random data the probabilistic structural characteristics of the regionalized phenomenon, that is, a correlation between values located in a certain neighborhood and direction of the sampling space. Fagioli, Zimback and Landim (2012) reported that geostatistics assumes that the data are spatially related, with closer points being more similar than distant ones. For this reason, the location in space of the variable under study must be known in order to verify the existence and the level of spatial dependence in a given situation.
Many factors are responsible for the occurrence of spatial dependence, and it is not always possible to extrapolate results obtained in one experimental environment to others, since each has specific characteristics (Yamamoto; Landim, 2013). However, within some limitations, the information can be useful for improving the experimental techniques used.
The definitions of plot size and shape, number of replications, sample size and experimental design are influenced by the variability that exists in the experiment (Steel; Torrie; Dickey, 1997). This variability also interferes with the statistical analysis by inflating the experimental error, leading the researcher to interpretations and conclusions with low experimental accuracy and reliability.
Authors such as Lopes et al. (1998), Lorentz et al. (2005), Lúcio et al. (2008), Carpes et al. (2008) and Couto et al. (2009) pointed out that there is variability among crop rows and among multiple harvests. They also stated that this variability alters the estimates of sampling intensity, plot size and shape, experimental design, and the number of harvests sufficient to discriminate between the treatments studied.
One alternative for evaluating the variability in the experimental area is the use of uniformity trials, in which the area is cultivated under identical cultural practices, without applying treatments. The area is then divided into basic units (BUs), and the variable is measured separately in each BU, so that the values observed in the BUs can be summed to simulate plots of different sizes and shapes (Storck et al., 2011). The results of these trials are used to investigate how variability behaves among plots and harvests.
Several papers define the number of plants per plot in experiments with multiple-harvest horticultural crops (Mello et al., 2004; Lorentz et al., 2005; Carpes et al., 2010; Lúcio et al., 2010; Santos et al., 2012). However, few papers associate this plot size with the number of harvests needed to achieve smaller variability in the data and greater experimental accuracy in the conclusions. If the lowest variability is associated with a smaller number of harvests, the time needed to evaluate the treatments can be reduced. It would then not be necessary to wait until the end of the crop cycle, saving time and resources and avoiding greater variation in the observed data.
Therefore, the objectives of this study were to verify the spatial dependence of the fresh biomass of snap beans and to determine the number of plants and harvests necessary to improve experimental accuracy in trials with snap beans (Phaseolus vulgaris L.).

MATERIAL AND METHODS
The data used were the fresh biomass of snap beans (Phaseolus vulgaris L.) of the "macarrão" cultivar, obtained in uniformity trials carried out in the experimental area of the Federal University of Santa Maria, at coordinates 29° 43' 23'' S and 53° 43' 15'' W and an altitude of 95 m. The climate of the region is classified as Cfa, humid subtropical, without a dry season and with hot summers, according to the Köppen classification (Moreno, 1961), and the soil is classified as Paleudalf (Empresa Brasileira de Pesquisa Agropecuária-Embrapa, 1999).
The trials were carried out in autumn-winter in three environments (plastic greenhouse, plastic tunnel and field) and in spring-summer in two environments (plastic tunnel and field). The trial in the plastic greenhouse was composed of six crop rows of 72 plants, while in the plastic tunnels and in the field there were six rows of 84 plants, with a spacing of 0.2 m between plants and 1 m between rows. The basic units (BUs) were composed of two plants, totaling 36 BUs per row in the plastic greenhouse and 42 BUs per row in the plastic tunnel and in the field. Four harvests were performed in each environment during the autumn-winter season and three harvests during the spring-summer season.
In all trials, the BUs were identified by the number of the crop row and numbered according to their position inside the row. Several plot sizes were built from the fresh biomass data by summing adjacent BUs within the crop rows (1, 2, 3, 4, 6, 9, 12 and 18 BUs in the plastic greenhouse and 1, 2, 3, 6, 7, 14 and 21 BUs in the plastic tunnel and in the field), under two forms of harvest grouping. The first form was the sum of consecutive harvests: 1st, 1st + 2nd, 1st + 2nd + 3rd, and 1st + 2nd + 3rd + 4th. The second form comprised the individual harvests, the groups 1st + 2nd and 3rd + 4th, and 1st + 2nd + 3rd + 4th in the autumn-winter season; and the individual harvests, the groups 1st + 2nd and 2nd + 3rd, and 1st + 2nd + 3rd in the spring-summer season. For each plot size and harvest grouping, a variation coefficient (%) was estimated within each crop row.
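The plot-size and harvest-grouping scenarios described above can be sketched as follows. This is a minimal illustration and not the authors' code: the biomass values are randomly generated, the function names are hypothetical, and the variation coefficient is assumed to be 100 times the sample standard deviation divided by the mean of the plots in a row.

```python
import numpy as np

def group_basic_units(row_values, plot_size):
    """Sum adjacent basic units (BUs) of one crop row into plots of `plot_size` BUs."""
    row_values = np.asarray(row_values, dtype=float)
    if row_values.size % plot_size != 0:
        raise ValueError("plot size must divide the number of BUs in the row")
    return row_values.reshape(-1, plot_size).sum(axis=1)

def variation_coefficient(plots):
    """Variation coefficient (%) among plots: 100 * sample SD / mean (assumed definition)."""
    plots = np.asarray(plots, dtype=float)
    return 100.0 * plots.std(ddof=1) / plots.mean()

# Hypothetical data: 4 harvests x 36 BUs of one greenhouse row (grams per BU).
rng = np.random.default_rng(0)
harvests = rng.gamma(shape=2.0, scale=50.0, size=(4, 36))

# First form of grouping: cumulative sums of consecutive harvests
# (1st, 1st + 2nd, 1st + 2nd + 3rd, 1st + 2nd + 3rd + 4th).
cumulative = harvests.cumsum(axis=0)

# One variation coefficient per (plot size in BUs, number of grouped harvests).
cv_table = {}
for n_harvests in range(1, 5):
    for plot_size in (1, 2, 3, 4, 6, 9, 12, 18):
        plots = group_basic_units(cumulative[n_harvests - 1], plot_size)
        cv_table[(plot_size, n_harvests)] = variation_coefficient(plots)
```

Each entry of `cv_table` corresponds to one scenario for a single row; in the study this estimate was obtained for every crop row of every trial.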
For the geostatistical analysis, the fresh biomass data for each plot size and number of harvests were georeferenced in UTM coordinates as a function of the distances (in meters), generating a grid of points within each crop row. The largest plot size for each trial was the one that still left at least 30 points for the analysis, as recommended by Landim (2006). The original fresh biomass data were transformed into indicators using the general average as the cutoff level, according to the criterion proposed by Yamamoto and Landim (2013), where: vt = 1 if vj ≤ vc and vt = 0 if vj > vc, in which vt = transformed value, vc = cutoff level (average) and vj = observed value of the variable.
A semivariogram was built as described by Vieira et al. (1983), using the equation $\hat{\gamma}(h) = \frac{1}{2N(h)} \sum_{i=1}^{N(h)} [Z(x_i) - Z(x_i + h)]^2$, in which N(h) is the number of pairs of values Z(x_i) and Z(x_i + h) separated by the distance h. The plot of $\hat{\gamma}(h)$ versus the corresponding values of h is the semivariogram, to which the spherical, exponential and Gaussian theoretical models were fitted. In fitting the theoretical models to the experimental semivariograms, the nugget effect (Co), the contribution (C1), the sill (Co + C1) and the range (R) were calculated.
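The indicator transform and the experimental semivariogram can be illustrated for a single crop row. This is a sketch under stated assumptions, not the ArcGIS procedure used in the study: the data are randomly generated, the function names are hypothetical, points are taken as equally spaced along the row, and the 0.4 m spacing assumes BUs of two plants set 0.2 m apart.

```python
import numpy as np

def indicator_transform(values, cutoff=None):
    """Return 1 where value <= cutoff and 0 otherwise; cutoff defaults to the mean."""
    values = np.asarray(values, dtype=float)
    if cutoff is None:
        cutoff = values.mean()
    return (values <= cutoff).astype(float)

def experimental_semivariogram(z, spacing, max_lag):
    """Classical estimator: gamma(h) = sum((z_i - z_{i+h})^2) / (2 N(h)), 1-D regular spacing."""
    z = np.asarray(z, dtype=float)
    lags, gammas, counts = [], [], []
    for k in range(1, max_lag + 1):
        diffs = z[k:] - z[:-k]          # all pairs separated by k steps
        if diffs.size == 0:
            break
        lags.append(k * spacing)
        gammas.append((diffs ** 2).sum() / (2 * diffs.size))
        counts.append(diffs.size)
    return np.array(lags), np.array(gammas), np.array(counts)

# Hypothetical fresh biomass of 36 BUs along one row (grams per BU).
rng = np.random.default_rng(1)
row = rng.gamma(shape=2.0, scale=50.0, size=36)

indicators = indicator_transform(row)
lags, gammas, counts = experimental_semivariogram(indicators, spacing=0.4, max_lag=18)
```

For spatially random data such as these, the estimated semivariance fluctuates around a constant level instead of rising with distance, which is the pure nugget pattern discussed in the results.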
For the analysis of the spatial dependence index (SDI), the ratio SDI = C1/(C1 + Co)*100 was used, together with the intervals described by Souza et al. (2008), who consider the spatial dependence weak (SDI < 25%), moderate (25% ≤ SDI < 75%) or strong (SDI ≥ 75%). In cases where SDI ≥ 25%, maps with classes of probability of the plots producing above the average were built through indicator kriging, and the variability behavior was studied within each of the probability classes in the uniformity trial.
In cases where SDI < 25%, the variability behavior study was carried out in the whole trial.
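The SDI calculation and its classification can be written compactly. The sketch below uses hypothetical function names and made-up nugget and contribution values; only the formula and the class limits come from the text.

```python
def spatial_dependence_index(nugget, contribution):
    """SDI (%) = C1 / (C1 + Co) * 100, with Co the nugget and C1 the contribution."""
    sill = nugget + contribution
    if sill <= 0:
        raise ValueError("the sill (Co + C1) must be positive")
    return 100.0 * contribution / sill

def classify_sdi(sdi):
    """Classes from the text: weak (< 25%), moderate (25% to < 75%), strong (>= 75%)."""
    if sdi < 25.0:
        return "weak"
    if sdi < 75.0:
        return "moderate"
    return "strong"

# Made-up parameter values for illustration only.
example_sdi = spatial_dependence_index(nugget=0.20, contribution=0.05)
example_class = classify_sdi(example_sdi)

# A pure nugget effect (C1 = 0) gives SDI = 0 and is therefore classified as weak.
pure_nugget_class = classify_sdi(spatial_dependence_index(nugget=0.25, contribution=0.0))
```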
In order to determine the number of plants and harvests necessary to reduce the variability of the experiments, a second-order polynomial regression was used, described by Neter and Wasserman (1996) as $Y = b_0 + b_1 X_1 + b_{11} X_1^2 + b_2 X_2 + b_{22} X_2^2 + b_{12} X_1 X_2$, in which Y is the variation coefficient among the plots within the crop row, X1 is the plot size and X2 is the number of harvests. The model was rewritten in matrix notation as $\hat{Y} = \hat{b}_0 + X'\hat{a} + X'\hat{A}X$, in which $X = [X_1 \; X_2]'$ contains the values of the pair for which the response estimate is desired; $\hat{a} = [\hat{b}_1 \; \hat{b}_2]'$ is formed by the linear coefficients of the equation; and $\hat{A}$, with diagonal elements $\hat{b}_{11}$ and $\hat{b}_{22}$ and off-diagonal elements $\hat{b}_{12}/2$, is composed of the quadratic coefficients and the linear interaction of the model. The critical point of the estimated response surface and its nature (maximum or minimum) were obtained from $\hat{a}$ and $\hat{A}$. The regression equations were estimated with the Genes application (Cruz, 2013) and the geostatistical procedures were carried out with ArcGIS 10.1 (Environmental Systems Research Institute-Esri, 2012).
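A second-order response surface of this kind can be fitted by ordinary least squares. The sketch below is not the Genes procedure used by the authors: it assumes the model VC = b0 + b1 X1 + b2 X2 + b11 X1² + b22 X2² + b12 X1 X2 (coefficient names are assumptions), uses hypothetical function names, and generates the VC values from made-up coefficients only to show that the fit recovers them.

```python
import numpy as np

def design_matrix(x1, x2):
    """Columns: 1, X1, X2, X1^2, X2^2, X1*X2."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    return np.column_stack([np.ones_like(x1), x1, x2, x1 ** 2, x2 ** 2, x1 * x2])

def fit_response_surface(x1, x2, y):
    """Least squares estimates in the order (b0, b1, b2, b11, b22, b12)."""
    coef, *_ = np.linalg.lstsq(design_matrix(x1, x2), np.asarray(y, dtype=float), rcond=None)
    return coef

# Scenario grid: plot sizes in BUs (greenhouse values from the text) x number of harvests.
plot_sizes = np.array([1, 2, 3, 4, 6, 9, 12, 18], dtype=float)
n_harvests = np.array([1, 2, 3, 4], dtype=float)
x1, x2 = [grid.ravel() for grid in np.meshgrid(plot_sizes, n_harvests)]

# Made-up coefficients, used only to generate illustrative VC values without noise.
true_coef = np.array([60.0, -4.0, -12.0, 0.15, 1.5, 0.1])
y = design_matrix(x1, x2) @ true_coef

estimated = fit_response_surface(x1, x2, y)
```

With real data, `y` would hold the variation coefficients estimated for each combination of plot size and harvest grouping.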

RESULTS AND DISCUSSION
The spatial dependence indexes (SDI) obtained in the scenarios of plot size and harvest grouping were generally low (Tables 1, 2 and 3). Thus, the variations in the fresh biomass of the beans have no structure that depends on the distance between sampling points. Under this condition, indicator kriging is not recommended for defining probability classes, and the area of each trial environment was studied as a whole.
Almost all the fitted semivariogram models presented weak or moderate spatial dependence (SDI < 75%). These results indicate that the average fresh biomass of beans is distributed randomly within the experimental area grown with snap beans, for any plot size and form of grouping. They agree with those obtained by Benz, Lúcio and Lopes (2015), who found a random distribution of zucchini production under different plot sizes and with the grouping of the first three harvests. However, in a greenhouse crop of tomatoes and in melon crops in two commercial production areas with different soils, hybrids and cultural practices (Miranda et al., 2005), the variability and spatial dependence observed ranged from moderate to strong in all production components and crop systems.
In most fitted models, the value of the variogram function at the origin, called the nugget effect, was far from zero, and several models presented a pure nugget effect (Tables 1, 2 and 3). The most evident differences in spatial dependence were between the experiments carried out in the field and those in protected environments. During the autumn-winter season, the trial in the plastic greenhouse presented 33.3% of models with spatial dependence classified as moderate or strong, while the values for the trials in the plastic tunnel and in the field were 25% and 94.44%, respectively. In the spring-summer season, the trial in the plastic tunnel presented 25.93% of the models with moderate or strong SDI, while in the field the value was 85.88%. The SDI values were mostly higher in the field, which also showed no combination with a pure nugget effect in either season. In the trial in the plastic greenhouse, the lowest SDI or the pure nugget effect was observed with the use of four classes of grouped harvests, regardless of the plot size and the semivariogram model (Table 1).
In the theoretical semivariogram models, in general, the higher the nugget effect, the smaller the contribution (difference between the sill and the nugget effect), the higher the root mean square error (RMSE) and the smaller the spatial dependence found (Tables 1, 2 and 3). The RMSE values were also similar among the semivariogram models used (Tables 1, 2 and 3). This shows little difference among the models, with the RMSE predominantly slightly smaller in the first harvest for any plot size. In the situations where the fitted theoretical model indicated strong spatial dependence, the model did not represent the variability (Tables 1, 2 and 3). In these cases, the lack of structure was visible in the semivariogram: the expected pattern would be an increase in semivariance with distance until the range is reached, but what occurred was a random fluctuation of the semivariance. According to Landim (2006), in a regionalized variable the value at each point is related, somehow, to values obtained at points located at a certain distance, and it is reasonable to infer that the influence is greater when the distance between the points is smaller.
The fresh biomass of snap beans presents variability influenced mainly by the weather conditions, the type of management used and the harvest time. According to Andriolo (2002), field crops are subject to environmental variations because of the lower control of temperature, humidity and wind, which inflates production variability. These conditions were duly controlled in the study, which may have contributed to the low spatial dependence or, otherwise, to the fitted semivariogram models not representing the variability. According to Andriotti (2002), the nugget effect represents the sampling error or the natural variability of the phenomenon studied. When the semivariogram presents a pure nugget effect, the structure of the variable, if it exists, cannot be seen at the scale used, and there is no advantage in adopting the geostatistical method for its study. Another factor that may have contributed to the absence of distance-dependent structure in the average variations of fresh biomass of snap beans is the way the uniformity trials were structured. For the plot size scenarios, neighboring plants were grouped inside the crop row, which generated greater uniformity in the production values by combining plants from a similar region. This methodology has been adopted by several authors (Lúcio et al., 2008; Carpes et al., 2010; Santos et al., 2014; Benz et al., 2015) because of the variability among crop rows.
The fitted polynomial equations were very similar for the two forms of harvest grouping. Among the three environments used, the trial carried out in the field resulted in the highest VC values, followed by those in the greenhouse and in the plastic tunnel (Figures 1, 2, 3, 4 and 5). It was possible to estimate the critical points of minimum variation coefficient (VC) in all cases studied, because the estimates of the eigenvalues of the matrix Â were positive. In the autumn-winter season, the plot size with the minimum VC was 24 plants (X1 = 12 BU) for the trials carried out in the plastic greenhouse and in the field, and 28 plants (X1 = 14 BU) for the trial carried out in the plastic tunnel. For the spring-summer season, the plot size with the minimum VC was 30 plants (X1 = 15 BU) for the trials in the plastic tunnel and in the field. In all uniformity trials, the best harvest grouping was the use of the total produced, with X2 near the maximum (Figures 1, 2, 3, 4 and 5).
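The critical-point calculation can be checked numerically with the one fitted equation that is legible in this text, VC% = 64.638 - 5.001 X1 + 0.151 X1² - 14.106 X2 + 1.448 X2² + 0.164 X1 X2, reported with critical point X1 = 14.34 and X2 = 4.05. The sketch below assumes the usual response surface convention, in which the matrix of quadratic terms carries half of the interaction coefficient off the diagonal and the critical point is minus one half of the inverse of that matrix times the vector of linear coefficients; variable names are illustrative.

```python
import numpy as np

# Coefficients copied from the fitted equation quoted in the text.
b0 = 64.638
a = np.array([-5.001, -14.106])          # linear coefficients of X1 and X2
A = np.array([[0.151, 0.164 / 2],        # quadratic coefficients on the diagonal,
              [0.164 / 2, 1.448]])       # half of the interaction off the diagonal

# Critical point: solve A x0 = -a / 2, i.e. x0 = -(1/2) A^{-1} a.
x0 = -0.5 * np.linalg.solve(A, a)

# Nature of the critical point from the eigenvalues of the symmetric matrix A.
eigenvalues = np.linalg.eigvalsh(A)
if np.all(eigenvalues > 0):
    kind = "minimum"
elif np.all(eigenvalues < 0):
    kind = "maximum"
else:
    kind = "saddle"

plants_per_plot = 2 * round(x0[0])       # BUs of two plants each
```

The computed point is approximately X1 = 14.36 and X2 = 4.06, close to the reported 14.34 and 4.05; the small gap is plausibly due to rounding of the printed coefficients. Both eigenvalues are positive, so the point is a minimum, and rounding X1 to 14 BUs gives the 28 plants per plot mentioned in the text.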
In experiments with snap beans, Santos et al. (2012) verified that the estimates of the variation coefficients for the different plot sizes were higher when the harvests were assessed individually than when they were obtained with the harvest groupings. The same authors concluded that analyzing production per harvest, instead of total production, reduces the accuracy of experiments with snap beans. Also, Haesbaert et al. (2011) observed that grouping the production of all harvests allows the use of smaller sample sizes than individual harvests or harvests grouped two by two, because it reduces the variation coefficient values in most crop rows.
[Figure panel residue: fitted response surface $VC\% = 64.638 - 5.001X_1 + 0.151X_1^2 - 14.106X_2 + 1.448X_2^2 + 0.164X_1X_2$; critical point X1 = 14.34, X2 = 4.05; R² = 0.92.]
An experimental area with random variability (no spatial dependence) can be used to conduct experiments analyzed with statistical models that assume a random distribution of errors. However, when variability is spatially correlated, geostatistical techniques are useful for reducing the standard error of the treatment means, improving the discrimination among treatments and increasing the power of statistical tests (Duarte; Vencovsky, 2005). The smallest variation coefficients are observed in plots of 24 plants for the trials carried out in the plastic greenhouse and in the field, and 28 plants for the trials in the plastic tunnel, in the autumn-winter season, combined with the grouping of all harvests. In the spring-summer season, the plot size is 30 plants for the trials in the plastic tunnel and in the field, also combined with the grouping of all harvests.

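Ciência e Agrotecnologia 40(2):184-197, Mar/Apr. 2016

For the estimated response surface, the critical point was estimated by $X_0 = -\frac{1}{2}\hat{A}^{-1}\hat{a}$, in which $\hat{a}$ is the vector of linear coefficients and $\hat{A}$ is the matrix of quadratic and interaction coefficients of the fitted model. The maximum or minimum nature of the critical point was identified by the sign of the eigenvalues associated with the matrix $\hat{A}$, that is, the values $\lambda$ for which $|\hat{A} - \lambda I| = 0$ (I = identity matrix).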

Figure 1: Response surface of the variation coefficient (%) for fresh biomass of snap beans as a function of the plot sizes (X1) and harvest groups (X2), determination coefficient (R²) and critical point, in the trial with snap beans in the plastic greenhouse during the autumn-winter season, for the first (a) and second (b) forms of harvest grouping.

Figure 2: Response surface of the variation coefficient (%) for fresh biomass of snap beans as a function of the plot sizes (X1) and harvest groups (X2), determination coefficient (R²) and critical point, in the trial with snap beans in the plastic tunnel during the autumn-winter season, for the first (a) and second (b) forms of harvest grouping.

Figure 4: Response surface of the variation coefficient (%) for fresh biomass of snap beans as a function of the plot sizes (X1) and harvest groups (X2), determination coefficient (R²) and critical point, in the trial with snap beans in the plastic tunnel during the spring-summer season, for the first (a) and second (b) forms of harvest grouping.

Table 1: Geostatistical analysis for the snap bean trial in the plastic greenhouse in the autumn-winter season.

Table 2: Geostatistical analysis for the snap bean trials in the plastic tunnel and in the field in the autumn-winter season.

Table 3: Geostatistical analysis for the snap bean trials in the plastic tunnel and in the field in the spring-summer season.