INTRODUCTION

In every experiment, one of the main objectives is to reduce the experimental error. In general, the experimental unit should be chosen to minimize this error, which measures the variation among observations of experimental units treated alike throughout the experiment (Steel et al., 1997; Storck et al., 2011).

Although larger plots are generally assumed to give a lower experimental error and, consequently, greater experimental precision, this relationship is not linear (Smith, 1938; Paranaíba et al., 2009; Barbin, 2013). Increasing the plot size initially reduces the experimental error, but beyond a certain size the gain in precision becomes very small (Paranaíba et al., 2009; Storck et al., 2011).

An adequate experimental design involves determining the plot size, which in turn depends on the crop, the number of treatments, and the environmental conditions of each experiment (Federer, 1977; Storck et al., 2011).

Several methods for estimating plot size have been reported in the literature. The most commonly used is the modified maximum curvature method of Meier & Lessman (1971). Another method widely used in recent years is the maximum curvature of the coefficient of variation (Paranaíba et al., 2009), whose great advantage over previous methods is that it reduces the calculations needed to determine the optimum plot size. Even so, a blank trial is still needed, and in that trial the plants must be set in rows and evaluated in the exact sequence in which they are found, so that the first-order spatial autocorrelation coefficient can be estimated.

Recently, Santos et al. (2012) and Storck et al. (2014) incorporated bootstrap simulation into the method of maximum curvature of the coefficient of variation proposed by Paranaíba et al. (2009), and Brito et al. (2014) incorporated it into the linear response plateau method. However, no report incorporating simulation into the Meier & Lessman (1971) method was found in the literature.

With simulation, the optimum plot size is expected to be determined in a simpler manner, since the clusters would be formed from the simulations. Flaws in the final stand (Brum et al., 2016), which are common in experiments involving seedlings, could also be circumvented by using bootstrap simulation.

In field experiments with papaya, several useful plot sizes are found, set arbitrarily because no studies report which plot size should be used. Reports range from a single plant per plot (Pratissoli et al., 2007; Melo et al., 2009) to 20 plants per plot (Vivas et al., 2011). In trials on the production of papaya seedlings in nurseries, useful plot sizes are equally arbitrary, also owing to the lack of studies on plot size. Experiments with plots of four (Melo et al., 2007), six (Sá et al., 2013), 10 (Paixão et al., 2012; Mengarda et al., 2014) and 12 (Serrano et al., 2010) seedlings per plot have been reported.

The objective of this study was to determine and compare the optimum plot size for the evaluation of papaya seedlings by the maximum curvature method of Meier & Lessman (1971), by the method of maximum curvature of the coefficient of variation of Paranaíba et al. (2009), and by a new method that incorporates bootstrap simulation into the method of Meier & Lessman (1971).

MATERIAL AND METHODS

The data used in this study were obtained in a greenhouse at the Experimental Farm of CEUNES/UFES in São Mateus, state of Espírito Santo, at 18°40'19.6" South latitude and 39°51'23.7" West longitude. The climate, according to the Köppen classification, is Aw (tropical humid), with rainy summers and dry winters.

The optimum plot size was determined using papaya (*Carica papaya* L.) seedlings of cv. Golden Pecíolo Curto, whose seeds were obtained from the company Caliman Agrícola S. A. The blank trial was carried out using three black polyethylene trays, each containing 10x14 tubes of 50 cm^{3}. The trays were placed side by side to form 14 rows of 30 tubes, totaling 420 tubes. All 420 tubes were sown in summer with a single seed each, but only the seedlings in the eight central rows, corresponding to 240 seedlings, were evaluated. The tubes were filled with Bioplant® substrate, to which the slow-release fertilizer Basacot mini 3M® was added at a dose of 10 g dm^{-3} of substrate (Paixão et al., 2012).

The characters evaluated 30 days after sowing were as follows: SH, seedling height, measured with a ruler graduated in centimeters from the base of the stem to the apex of the last leaf; SD, stem diameter, measured with a digital caliper (mm) in the middle region of the stem; NL, number of leaves, obtained by counting the fully developed leaves; PL, petiole length, measured with a ruler graduated in centimeters from the point of attachment to the plant to the point of insertion on the leaf; and LLR, length of the longest root, measured with a ruler graduated in centimeters from the base of the seedling to the end of the root.

Using the five characters, the optimum plot size was determined by the following methods: the maximum curvature method of Meier & Lessman (1971); the method of maximum curvature of the coefficient of variation of Paranaíba et al. (2009); and the maximum curvature method of Meier & Lessman (1971) with bootstrap simulation, which is the proposal made in this work.

To determine the optimum plot size by the maximum curvature method of Meier & Lessman (1971), the 240 seedlings from the blank trial were structured in basic experimental units (BEU), each BEU consisting of one seedling. The BEU were grouped into plot sizes that are exact divisors of the total number of seedlings in the blank trial, ranging from 1 BEU to 60 BEU, which gave 12 plot sizes. For each plot size, all possible cluster compositions were evaluated, characterizing different compositions (Table 1).

For each plot size of $X_i$ BEU, the following were calculated: $M(X_i)$, the mean of the plots with $X_i$ BEU in size; $V(X_i)$, the variance among plots with $X_i$ BEU in size; $CV(X_i)$, the coefficient of variation among plots with $X_i$ BEU in size; and $VU(X_i) = V(X_i)/X_i^{2}$, the variance per BEU among plots of $X_i$ BEU in size. From the set of 12 pairs of $X_i$ and $CV(X_i)$, the constant $\hat{a}$ and the regression coefficient $\hat{b}$ were estimated by log transformation of the function $CV(X_i) = \hat{a}/X_i^{\hat{b}}$, weighted by the degrees of freedom associated with the number of applicable plots of $X_i$ BEU for each plot size designed in the uniformity trial (Steel et al., 1997). Similarly, the heterogeneity index (b) of Smith (1938) was estimated from the relationship between $VU(X_i)$ and $X_i$. Using the values of $\hat{a}$ and $\hat{b}$, the optimum plot size was calculated as $X_0 = \left[\hat{a}^{2}\hat{b}^{2}(2\hat{b}+1)/(\hat{b}+2)\right]^{1/(2+2\hat{b})}$.
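The calculation described above can be sketched in R roughly as follows. This is a minimal sketch under stated assumptions, not the authors' script: the vector `y` is simulated stand-in data for one character of the 240 seedlings, plots are formed from contiguous BEU in measurement order (only one of the possible compositions), the helper names `plot_stats` and `optimum_size` are hypothetical, and the closed-form expression for the optimum is the one usually given for the Meier & Lessman (1971) method.

```r
set.seed(1)
y <- rnorm(240, mean = 10, sd = 1.5)  # simulated stand-in for one character (assumption)
sizes <- c(1, 2, 3, 4, 5, 6, 10, 12, 15, 20, 30, 60)  # designed plot sizes (BEU)

# Statistics among plots made of `size` contiguous BEU (hypothetical helper)
plot_stats <- function(size, y) {
  totals <- colSums(matrix(y, nrow = size))  # one total per plot
  c(X  = size,
    n  = length(totals),                     # number of plots
    M  = mean(totals),                       # mean of the plots
    V  = var(totals),                        # variance among plots
    CV = 100 * sd(totals) / mean(totals),    # coefficient of variation
    VU = var(totals) / size^2)               # variance per BEU
}
stats <- as.data.frame(t(sapply(sizes, plot_stats, y = y)))

# Log-log fit of CV = a / X^b, weighted by degrees of freedom (plots - 1)
fit_cv <- lm(log(CV) ~ log(X), data = stats, weights = n - 1)
a_hat <- exp(coef(fit_cv)[[1]])
b_hat <- -coef(fit_cv)[[2]]

# Smith's heterogeneity index from VU = V1 / X^b, same weighting
fit_vu <- lm(log(VU) ~ log(X), data = stats, weights = n - 1)
b_smith <- -coef(fit_vu)[[2]]

# Point of maximum curvature of CV = a / X^b
optimum_size <- function(a, b) {
  (a^2 * b^2 * (2 * b + 1) / (b + 2))^(1 / (2 + 2 * b))
}
x0 <- optimum_size(a_hat, b_hat)
```

Because the simulated values are independent, this sketch should return a regression coefficient near 0.5; real blank-trial data with spatial dependence would generally behave differently.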

To calculate the optimum plot size by the method of maximum curvature of the coefficient of variation of Paranaíba et al. (2009), the 240 seedlings of the blank trial were numbered sequentially from 1 to 240, and the five characters were measured in these identified seedlings. From the values of each character, the following were determined: the sample mean ($m$), the sample variance ($s^{2}$) and the estimate of the first-order spatial autocorrelation coefficient ($\hat{\rho}$), in which $m = \sum_{i=1}^{n} x_i/n$, $s^{2} = \sum_{i=1}^{n}(x_i - m)^{2}/(n-1)$ and $\hat{\rho} = \sum_{i=2}^{n}(x_i - m)(x_{i-1} - m)/\sum_{i=1}^{n}(x_i - m)^{2}$, where $x_i$ is the value observed in plant $i$ and $n = 240$. The coefficient of variation was also determined, given by $CV(X_i) = 100\sqrt{(1-\hat{\rho}^{2})\,s^{2}/m^{2}}/\sqrt{X_i}$, where $X_i$ indicates the number of BEU. Finally, the optimum plot size was determined, given by $X_0 = 10\sqrt[3]{2(1-\hat{\rho}^{2})\,s^{2}\,m}/m$.
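A minimal R sketch of this calculation is given here, assuming the expressions usually given for the Paranaíba et al. (2009) method. The vector `y` is simulated stand-in data in measurement order, and the function name `cv_paranaiba` is hypothetical.

```r
set.seed(2)
y <- rnorm(240, mean = 10, sd = 1.5)  # simulated stand-in, in sequence (assumption)

m  <- mean(y)  # sample mean
s2 <- var(y)   # sample variance
d  <- y - m
# First-order spatial autocorrelation between neighbouring seedlings
rho <- sum(d[-1] * d[-length(d)]) / sum(d^2)

# Coefficient of variation for plots of X BEU (hypothetical helper)
cv_paranaiba <- function(X) 100 * sqrt((1 - rho^2) * s2 / m^2) / sqrt(X)

# Optimum plot size, then rounded up to whole seedlings
x0 <- 10 * (2 * (1 - rho^2) * s2 * m)^(1 / 3) / m
x0_plants <- ceiling(x0)
```

Under these assumptions the optimum equals the maximum curvature point of a power curve with exponent fixed at 0.5, which is why no clustering of BEU is needed.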

The proposed method is based on the maximum curvature method of Meier & Lessman (1971), with the modification that the clustering of the $X_i$ BEU (Table 1) is done by bootstrap simulation with replacement (Efron, 1979; Martinez & Louzada Neto, 2001).

For the simulations, 12 sample sizes were designed (1, 2, 3, 4, 5, 6, 10, 12, 15, 20, 30 and 60 BEU) for each character. Then, for each designed sample size of each character, 2,000 simulations were performed by resampling with replacement. For each simulated sample, the mean was estimated. Thus, for each sample size of each character, 2,000 mean estimates were obtained (Ferreira, 2009), and from these a coefficient of variation was obtained for each designed sample size, referred to here as the bootstrap coefficient of variation.

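From the set of 12 pairs of $X_i$ and the corresponding bootstrap coefficient of variation, the constant $\hat{a}$ and the regression coefficient $\hat{b}$ were estimated by log transformation of the function $CV(X_i) = \hat{a}/X_i^{\hat{b}}$. Using the values of $\hat{a}$ and $\hat{b}$, the optimum plot size was calculated as $X_0 = \left[\hat{a}^{2}\hat{b}^{2}(2\hat{b}+1)/(\hat{b}+2)\right]^{1/(2+2\hat{b})}$. The heterogeneity index (b) was calculated by log transformation of the function $VU(X_i) = V_1/X_i^{b}$, where $V_1$ is the variance among plots of one BEU, according to Smith (1938).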
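The fitting step can be illustrated with a short R sketch. The coefficients of variation here are not the paper's data: they are generated from a known power curve (constant 15, exponent 0.5, mean 10, all assumed values) so that the log-log regression can be seen to recover the parameters. The closed-form optimum is the one usually given for the Meier & Lessman (1971) method.

```r
sizes <- c(1, 2, 3, 4, 5, 6, 10, 12, 15, 20, 30, 60)  # designed plot sizes (BEU)

# Illustrative values only (assumption): CV = 15 / X^0.5 and overall mean 10
m  <- 10
cv <- 15 / sizes^0.5
vu <- (cv * m / 100)^2  # variance per BEU implied by cv and m

# Unweighted log-log fit of CV = a / X^b
fit_cv <- lm(log(cv) ~ log(sizes))
a_hat <- exp(coef(fit_cv)[[1]])
b_hat <- -coef(fit_cv)[[2]]

# Smith's heterogeneity index from VU = V1 / X^b
b_smith <- -coef(lm(log(vu) ~ log(sizes)))[[2]]

# Optimum plot size, rounded up to whole seedlings
x0 <- (a_hat^2 * b_hat^2 * (2 * b_hat + 1) / (b_hat + 2))^(1 / (2 + 2 * b_hat))
x0_plants <- ceiling(x0)
```

With these assumed inputs the fit returns the generating constant and exponent, and Smith's index comes out as twice the exponent.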

The performance of each of the three methods was shown graphically through the relationship between the coefficients of variation and the number of BEU, together with the optimum plot size. Data were analyzed with the R software (R Development Core Team, 2014). Because the optimum plot size is a discrete variable, it was presented as an integer, rounded up to the next whole number.

The procedure for determining the coefficient of variation and the variance per BEU by bootstrap simulation in R is given in the script below for designed plots of one and two seedlings, using the "golden pecíolo curto" file and the character seedling height (SH). For the other designed plot sizes (3; 4; 5; 6; 10; 12; 15; 20; 30; 60), the procedure is similar.

X <- read.table("e:\\data\\golden peciolo curto.txt", header = TRUE) # data import (column AP is used for seedling height, SH)

R <- 2000 # number of resamplings, plots of one seedling
boot.means <- numeric(R)
for (i in 1:R) {
  boot.sample <- sample(X$AP, 1, replace = TRUE)
  boot.means[i] <- mean(boot.sample)
}
m1 <- mean(boot.means) # mean
d1 <- sd(boot.means) # standard deviation
cv1 <- (d1 * 100) / m1 # coefficient of variation
v1 <- (1 * d1)^2 # variance
vu1 <- v1 / 1^2 # variance per BEU

R <- 2000 # number of resamplings, plots of two seedlings
boot.means <- numeric(R)
for (i in 1:R) {
  boot.sample <- sample(X$AP, 2, replace = TRUE)
  boot.means[i] <- mean(boot.sample)
}
m2 <- mean(boot.means) # mean
d2 <- sd(boot.means) # standard deviation
cv2 <- (d2 * 100) / m2 # coefficient of variation
v2 <- (2 * d2)^2 # variance
vu2 <- v2 / 2^2 # variance per BEU
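Since the blocks for one and two seedlings differ only in the sample size, the same steps can be wrapped in a function and applied to all 12 designed plot sizes. This is a sketch, not part of the original script: the helper name `boot_stats` is hypothetical, and the vector `sh` is simulated stand-in data used in place of `X$AP` so that the example runs without the data file.

```r
set.seed(123)
sh <- rnorm(240, mean = 10, sd = 1.5)  # simulated stand-in for X$AP (assumption)
sizes <- c(1, 2, 3, 4, 5, 6, 10, 12, 15, 20, 30, 60)  # designed plot sizes (BEU)
R <- 2000                                              # number of resamplings

# Bootstrap statistics for one designed plot size (hypothetical helper)
boot_stats <- function(size, y, R) {
  # R bootstrap means, each from `size` seedlings drawn with replacement
  boot.means <- replicate(R, mean(sample(y, size, replace = TRUE)))
  d <- sd(boot.means)
  c(size = size,
    mean = mean(boot.means),
    cv   = 100 * d / mean(boot.means),  # coefficient of variation
    vu   = (size * d)^2 / size^2)       # variance per BEU
}

res <- as.data.frame(t(sapply(sizes, boot_stats, y = sh, R = R)))
```

Each row of `res` then corresponds to one designed plot size, and the `cv` column can be used in the log-transformed regression against `size`.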

RESULTS AND DISCUSSION

The coefficients of variation as a function of plot size, measured as the number of basic experimental units (BEU) from 240 seedlings, for height, stem diameter, number of leaves, petiole length and length of the longest root in papaya (*Carica papaya* L.) seedlings of cv. Golden Pecíolo Curto are shown in Figures 1 to 5, respectively. For all 15 fitted curves (three methods x five characters), the coefficient of variation decreased as the plot size increased, a result expected from the statistical point of view (Barbin, 2013). It is noteworthy that, although the method of maximum curvature of the coefficient of variation of Paranaíba et al. (2009) does not require BEU clustering to estimate the optimum plot size, it does allow the coefficient of variation to be determined for the different plot sizes, and therefore it also provides a graphic representation and the fit of a nonlinear power regression model.

The estimated optimum plot size was four seedlings per plot for seedling height (Figure 1) and stem diameter (Figure 2), and five seedlings per plot for number of leaves (Figure 3), petiole length (Figure 4) and length of the longest root (Figure 5). Thus, five seedlings per plot is indicated as the optimum size for 'Golden Pecíolo Curto'. Different estimates of sample sizes for different characters of the same plants were also found in the production of seedlings of coffee "Catuaí Amarelo" (Firmino et al., 2012) and coffee "Rubi" (Cipriano et al., 2012). In papaya, given the lack of scientific results on optimum plot size for seedling production, plot sizes vary widely, from experiments with four seedlings per plot (Melo et al., 2007) to experiments with 12 seedlings per plot (Serrano et al., 2010). The use of an appropriate plot size is crucial for reducing experimental error and consequently increasing experimental precision (Catapatti et al., 2008). Therefore, a researcher who uses more plants per plot than recommended may be spending more technical, physical or financial resources than necessary.

When the three methods for determining the plot size are compared, the values of the coefficients $\hat{a}$ and $\hat{b}$ and the optimum plot size are similar between the method of maximum curvature of the coefficient of variation and the maximum curvature method with bootstrap simulation for SH (Figures 1B, 1C), SD (Figures 2B, 2C), NL (Figures 3B, 3C), PL (Figures 4B, 4C) and LLR (Figures 5B, 5C), and the values of the coefficient $\hat{b}$ are close to 0.5. Thus, bootstrap simulation with replacement leads to results similar to those of the method of maximum curvature of the coefficient of variation presented by Paranaíba et al. (2009), with the advantage that the sequence of plots in the uniformity trial does not need to be identified, since in the bootstrap simulation the draw is random.

The means of the characters are presented in Table 2, where the mean of the 2,000 bootstrap estimates is very close to the actual values presented for the methods of Meier & Lessman (1971) and Paranaíba et al. (2009). This is because the resampling is done thousands of times and the bootstrap technique with replacement gives every value in the sample the same probability of being drawn (Ferreira, 2009).

(1) Statistics obtained from a total of 240 seedlings individually assessed.

(2) b was obtained by the method of Smith (1938).

(3) Method: ML - maximum curvature according to Meier & Lessman (1971); P - maximum curvature of the coefficient of variation according to Paranaíba et al. (2009); MLboot - maximum curvature method according to Meier & Lessman (1971) using bootstrap simulation.

The values of the coefficient of variation for the evaluated characters are also similar among the three methodologies. It is noteworthy that, for the calculation of this statistic by the method proposed by Paranaíba et al. (2009), the numerator of the equation contains the first-order spatial autocorrelation coefficient ($\hat{\rho}$), which ranges from -1 to +1. Algebraically, when $\hat{\rho}$ is close to zero, the CV of the method proposed by Paranaíba et al. (2009) will have a value close to the CV of the method of Meier & Lessman (1971), as can be observed for the five characters (Table 2). An autocorrelation close to zero indicates a random distribution among the seedlings, which is what happened for the five evaluated characters and can be explained by the fact that each seedling is contained in a different tube.
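The algebraic statement can be made explicit as follows, assuming the form usually given for the coefficient of variation of Paranaíba et al. (2009); the subscript $P$ is a label introduced here only for illustration.

```latex
\[
CV_{P}(X) \;=\; \frac{100\sqrt{(1-\hat{\rho}^{2})\,s^{2}/m^{2}}}{\sqrt{X}}
\;\xrightarrow{\;\hat{\rho}\,\to\,0\;}\;
\frac{100\,s}{m\sqrt{X}}
\]
```

The limit on the right is the coefficient of variation of the mean of $X$ uncorrelated seedlings. The correction factor $\sqrt{1-\hat{\rho}^{2}}$ is about 0.995 for $|\hat{\rho}| = 0.1$, so small autocorrelations change the CV very little.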

Table 2 shows that the heterogeneity index (b) of Smith (1938) is twice the value of the coefficient $\hat{b}$ (Figures 1A, 2A, 3A, 4A, 5A) estimated in the equation that determines the optimum plot size by the modified maximum curvature method (Meier & Lessman, 1971). This relationship was also reported by Lorentz et al. (2012) and can also be verified in the proposed method, in which, however, the $\hat{b}$ values are close to 0.500 and the b values are close to 1.000. Considering that the b values of Smith (1938) range from zero to one and that values closer to one indicate heterogeneity in the crop environment, the method proposed with bootstrap simulation assumes the maximum heterogeneity. Santos et al. (2012) report that when heterogeneity is large, the plots are less related to each other and, in this case, should be larger to obtain the same degree of experimental precision. Thus, the proposed method is expected to give an optimum plot size equal to or greater than that of the modified maximum curvature method of Meier & Lessman (1971), which is interesting from a practical point of view, since that method sometimes gives optimum plot sizes smaller than one plant (Leite et al., 2006), which is a criticism of the method.

Taking the tray model used in this experiment (10x14 = 140 tubes) as an example for further studies, one tray would be enough to accommodate 28 plots of five seedlings.
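The factor of two between the two coefficients follows from the definitions, as the short derivation here shows. It assumes that the variance per BEU is the variance among plot totals divided by $X^{2}$ and that the mean of the plot totals is $X$ times the overall mean $m$; the hat on $\hat{b}$ marks the regression coefficient of the CV curve, a notation chosen here to keep it apart from Smith's $b$.

```latex
\begin{align*}
CV(X) &= \frac{100\sqrt{V(X)}}{M(X)}
       = \frac{100\sqrt{X^{2}\,VU(X)}}{X\,m}
       = \frac{100\sqrt{VU(X)}}{m} \\[4pt]
VU(X) &= \frac{V_{1}}{X^{b}}
\;\Longrightarrow\;
CV(X) = \frac{100\sqrt{V_{1}}}{m}\,X^{-b/2}
\;\Longrightarrow\;
\hat{a} = \frac{100\sqrt{V_{1}}}{m}, \qquad \hat{b} = \frac{b}{2}
\end{align*}
```

Under resampling with replacement the seedlings in a simulated plot are drawn independently, so the variance of the plot mean is $V_{1}/X$. This corresponds to $b = 1$ and $\hat{b} = 0.5$, in line with the values close to 1.000 and 0.500 reported for the proposed method.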

CONCLUSIONS

The optimum number of plants per plot for evaluation of 'Golden Pecíolo Curto' papaya seedlings is five.

The proposed method, with bootstrap simulation with replacement, gives optimum plot sizes equal to or larger than those of the maximum curvature method. It also gives the same plot size as the method of maximum curvature of the coefficient of variation.