Sampling sufficiency for estimating the mean of wheat traits

Abstract: The objective of this work was to determine the sample size necessary for estimating the means of wheat (Triticum aestivum) traits obtained through measurement, counting, and weighing. Seventeen uniformity trials were performed with 1,790 randomly harvested plants, for which the following traits were evaluated: lengths of the main stem and main stem ear (measurement); number of leaves, stems, and ears (counting); and mass of fresh and dry matter of leaves, stems, and ears (weighing). The Bartlett and Kolmogorov-Smirnov tests and Welch's analysis of variance were performed. Skewness, central tendency, and variability were determined, and sample size was calculated to estimate the means of the 13 evaluated traits, considering estimation errors (semi-amplitudes of the 95% confidence interval) equal to 5, 10, 15, and 20% of the mean. The sample size needed to estimate the means of wheat traits decreases from traits obtained by weighing to those obtained by counting and by measuring, in this order. In an experiment to estimate the mean of wheat traits obtained by weighing, counting, and measuring with a maximum error of 10% of the mean at a 95% confidence level, 117, 76, and 9 plants per treatment are needed, respectively.


Introduction
Wheat (Triticum aestivum L.) belongs to the Poaceae family and is one of the main food crops, cultivated in various environments and geographical regions and relevant in the diet because of the quantity and quality of its protein and the diversity of derived products (Borém & Scheeren, 2015). Because of the importance of wheat, experiments with this crop should be planned properly, prioritizing high experimental precision (low coefficient of variation) and, consequently, reliability in the inferences on the evaluated treatments. In experiments conducted in the field, several traits can be evaluated by measuring, counting, and weighing. When evaluating a trait, it is common to observe variation between plants, even between those subjected to the same treatment. Therefore, it is essential to size the number of plants under evaluation to obtain reliable information on the trait to be estimated (Storck et al., 2016).
Sample sizing can be performed from data obtained in uniformity trials (experiments without treatments). It is important to perform uniformity trials in various scenarios, such as combinations of agricultural years, sowing dates, and cultivars. These scenarios enable plants to develop under different environmental conditions, expanding their variability. With these databases, it is possible to determine a sample size that is representative of scenarios with wide variability.
For the wheat crop, studies on sampling sufficiency have shown that sample size varies between regions of adaptation and Brazilian states when estimating the mean gluten strength (Castro et al., 2016), between yellow spot severity and the area under the disease progress curve (Sari et al., 2020), and between instruments used to determine the hectoliter mass (Martin et al., 2022).
Sample sizing studies are expected to add important information to support the planning of experiments with better precision and, consequently, with greater reliability in the results.
The objective of this work was to determine the sample size necessary for estimating the means of wheat traits, obtained through measurement, counting, and weighing.

Materials and Methods
Seventeen uniformity trials (experiments without treatments) with wheat were conducted in an experimental area (29°42'S, 53°49'W, at 95 m altitude). At this site, the climate is Cfa (humid subtropical), according to the Köppen-Geiger classification, and the soil is an Argissolo Vermelho distrófico arênico (Ultisol) (Santos et al., 2018).
The trials were formed by the combination of agricultural years, sowing dates, and cultivars (Table 1). In all uniformity trials, which measured 20×8 m (160 m²), wheat was sown mechanically in rows spaced 0.20 m apart, using 420 seeds m⁻². Basal fertilization consisted of 9 kg ha⁻¹ N, 36 kg ha⁻¹ P₂O₅, and 36 kg ha⁻¹ K₂O; subsequently, two top-dressing fertilizations with 41 kg ha⁻¹ N were carried out at the development stages V3 (three expanded leaves) and V6 (six expanded leaves). Other cultural management practices were performed evenly throughout the experimental area.
The evaluations were performed when the crop was at the dough grain development stage (reproductive stage). For that, samples of 100 and 110 plants were established in each trial conducted in the 2018 and 2019 crop seasons, respectively. These 1,790 plants were randomly harvested and separated into three parts (leaf, stem, and ear). The plants were cut near the soil surface and, immediately after cutting, fresh matter was determined with a digital scale to obtain the mass value (grams per plant).
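As a quick consistency check on the sampling scheme described above, the split of the 17 trials between the two seasons can be recovered from the reported totals. The sketch below assumes only the figures stated in the text (17 trials, 100 or 110 plants per trial, 1,790 plants in total); the names used are illustrative.

```python
TRIALS = 17          # uniformity trials in total
TOTAL_PLANTS = 1790  # plants harvested across all trials
PER_TRIAL = {2018: 100, 2019: 110}  # plants sampled per trial in each season


def trials_per_season():
    """Find the number of trials per season that reproduces the reported total."""
    for n_2018 in range(TRIALS + 1):
        n_2019 = TRIALS - n_2018
        plants = n_2018 * PER_TRIAL[2018] + n_2019 * PER_TRIAL[2019]
        if plants == TOTAL_PLANTS:
            return {2018: n_2018, 2019: n_2019}
    return None  # no split matches the reported figures


print(trials_per_season())
```

Under these figures, the only consistent split is eight trials in 2018 and nine in 2019 (8 × 100 + 9 × 110 = 1,790).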
In each plant, traits were evaluated as follows: measuring (cm) of the main stem length (SL, obtained by the distance between the base of the plant and the flag leaf insertion node) and of the main stem ear length (EL); counting of the number of leaves (NL), number of stems (NS), and number of ears (NE); and weighing (g per plant) of the fresh matter of leaves (FML), fresh matter of stems (FMS), fresh matter of ears (FME), fresh matter of shoots (FMSH = FML+FMS+FME), dry matter of leaves (DML), dry matter of stems (DMS), dry matter of ears (DME) and dry matter of shoots (DMSH = DML+DMS+DME).
The following statistics were calculated: p-value of the Kolmogorov-Smirnov normality test, skewness, mean, median, minimum and maximum values, variance, and coefficient of variation. Bartlett's test was performed to check the homogeneity of variances between the uniformity trials. Welch's analysis of variance was employed to check whether the means of traits differed between the uniformity trials, because it remains an appropriate procedure when variances are heterogeneous.
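To make the role of Welch's analysis of variance concrete, the sketch below computes its F statistic and degrees of freedom from raw groups, weighting each trial mean by n/s² so that unequal variances are accommodated. It is an illustration using the standard textbook formulas, not the authors' code; the function name and the toy data are assumptions, and no p-value is computed.

```python
from statistics import mean, variance


def welch_anova(groups):
    """Return (F, df1, df2) for Welch's one-way analysis of variance."""
    k = len(groups)
    sizes = [len(g) for g in groups]
    means = [mean(g) for g in groups]
    # Weight of each group: n_i / s_i^2 (sample variance, n - 1 denominator).
    weights = [n / variance(g) for n, g in zip(sizes, groups)]
    w_sum = sum(weights)
    # Variance-weighted grand mean.
    grand = sum(w * m for w, m in zip(weights, means)) / w_sum
    between = sum(w * (m - grand) ** 2 for w, m in zip(weights, means)) / (k - 1)
    lam = sum((1 - w / w_sum) ** 2 / (n - 1) for w, n in zip(weights, sizes))
    f_stat = between / (1 + 2 * (k - 2) * lam / (k ** 2 - 1))
    df2 = (k ** 2 - 1) / (3 * lam)
    return f_stat, k - 1, df2


# Toy data: two groups with different variances (hypothetical values).
f_stat, df1, df2 = welch_anova([[1, 2, 3], [2, 4, 6]])
print(f_stat, df1, df2)
```

With two groups, the statistic equals the square of Welch's t statistic, which gives a simple way to check the sketch by hand. A p-value would additionally need the upper tail of the F distribution (for instance `scipy.stats.f.sf`), which is left out here to keep the example free of dependencies.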
For each trial and trait, based on the pilot sample, the sample size (n) was determined for estimation errors (e) fixed at 5, 10, 15, and 20% of the mean (m), that is, 0.05×m (higher precision), 0.10×m, 0.15×m, and 0.20×m (lower precision), with a confidence level (1-α) of 95%. The estimation error corresponds to the semi-amplitude of the confidence interval. The sample size was determined by the expression $n = \left[\frac{t_{\alpha/2}\, s}{e}\right]^2$ (Bussab & Morettin, 2017), in which $t_{\alpha/2}$ is the critical value of the Student's t-distribution whose area on the right is equal to α/2, that is, the value of t such that $P(t > t_{\alpha/2}) = \alpha/2$, with α = 5% probability of error and degrees of freedom equal to the pilot sample size minus one (pilot samples of 100 and 110 plants in the 2018 and 2019 trials, respectively, in the present study), and s is the standard deviation estimate.
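A minimal sketch of the sample size expression above, written in Python. Because the error is set as a fraction of the mean (e = p × m) and the standard deviation can be written as s = CV × m / 100, the mean cancels and n depends only on the coefficient of variation and the chosen error percentage. The function name, the hard-coded critical values (rounded to four decimals), and the rounding up to whole plants are assumptions made for illustration, not the authors' code.

```python
import math

# Two-sided 95% critical values of Student's t (upper-tail area 0.025),
# rounded to four decimals, for the two pilot sample sizes in the text:
# 99 degrees of freedom (100 plants) and 109 degrees of freedom (110 plants).
T_CRIT_95 = {99: 1.9842, 109: 1.9820}


def sample_size(cv_percent, error_percent, pilot_n=100):
    """Plants needed so the 95% CI half-width is error_percent of the mean.

    Implements n = (t * s / e)^2 with s and e both expressed as a
    percentage of the mean, so s / e = cv_percent / error_percent.
    Only pilot sizes of 100 and 110 plants are supported in this sketch.
    """
    t = T_CRIT_95[pilot_n - 1]
    n = (t * cv_percent / error_percent) ** 2
    return math.ceil(n)  # round up to whole plants (assumed convention)


print(sample_size(54.37, 10))  # highest CV reported in the text, 10% error
print(sample_size(54.37, 5))   # same CV, 5% error
print(sample_size(7.85, 5))    # lowest CV reported in the text, 5% error
```

With the extreme coefficients of variation reported later in the text (54.37% and 7.85%), the sketch gives 117 and 466 plants for errors of 10% and 5%, and 10 plants for an error of 5%, in line with the sample sizes stated there. Halving the error multiplies the required sample by four, since n grows with the square of CV/e.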
Statistical analyses were performed using Microsoft Office Excel, Genes (Cruz, 2016), and R (R Core Team, 2022).

Results and Discussion
The Kolmogorov-Smirnov test for the 221 cases (17 trials × 13 traits per trial) shows p-values between <0.001 and 0.996, with an average of 0.313 (Table 2). The higher the p-value, the greater the adherence of the data to the normal distribution curve. Thus, assuming a 5% significance level, normality was met in 71% of the cases. A lower adherence to the normal distribution was observed for the traits obtained by counting (NL, NS, and NE). Skewness coefficients close to zero (-0.56 ≤ skewness ≤ 1.75) and the proximity of the mean to the median (Table 3) indicate that the data showed a good fit to, or only slight departures from, the normal distribution curve. Therefore, this data set is suitable for the study of sample sizing based on the Student's t-distribution.
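The descriptive checks mentioned above (skewness close to zero, mean close to the median) can be sketched as follows. The text does not state which skewness estimator was used, so the moment coefficient m3 / m2^1.5 is an assumption here, as are the function name and the toy data.

```python
import math
from statistics import mean, median, variance


def describe(values):
    """Mean, median, variance, CV (%) and moment skewness of one trait."""
    n = len(values)
    m = mean(values)
    var = variance(values)  # sample variance (n - 1 denominator)
    # Central moments with n in the denominator, used for skewness only.
    m2 = sum((x - m) ** 2 for x in values) / n
    m3 = sum((x - m) ** 3 for x in values) / n
    return {
        "mean": m,
        "median": median(values),
        "variance": var,
        "cv_percent": 100 * math.sqrt(var) / m,
        "skewness": m3 / m2 ** 1.5,
    }


# Hypothetical symmetric data: skewness is zero and mean equals median.
print(describe([1, 2, 3, 4, 5]))
```

A symmetric sample gives zero skewness and identical mean and median, while a long right tail pushes the skewness above zero and the mean above the median, which is the pattern the text checks for before relying on the t-distribution.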
A wide variation was found between plants within the uniformity trials and between uniformity trials for all traits, on the basis of the minimum and maximum values (Table 4) and of the variance and the coefficient of variation (Table 5). Such wide variation, promoted by 17 trials involving six sowing dates, five cultivars, and evaluations performed between 77 and 119 days after sowing (Table 1), is important for studies on sample sizing, as it covers plants of different sizes (small, medium, and large), which are common in field experiments.
The coefficient of variation (CV) of the 13 traits ranged between 7.85%, for EL in trial 7, and 54.37%, for DME in trial 1, with an average of 31.08% (Table 5). In the average of the 17 uniformity trials, the coefficients of variation of the traits FML, FMS, FME, FMSH, DML, DMS, DME, and DMSH (34.01 ≤ CV ≤ 40.76%, with an average of 36.30%) were higher than those obtained for the traits NL, NS, and NE (29.25 ≤ CV ≤ 32.78%, with an average of 30.78%) and for the traits SL and EL (10.39 ≤ CV ≤ 10.90%, with an average of 10.64%). Thus, for the same precision, a larger sample size is expected to estimate the means of the traits obtained by weighing (FML, FMS, FME, FMSH, DML, DMS, DME, and DMSH) than of those obtained by counting (NL, NS, and NE) and by measurement (SL and EL), in this order. Conversely, means estimated from a single sample size would be increasingly precise for the traits obtained by weighing, counting, and measuring, in that order.
The sample sizes for estimating the mean with an estimation error (semi-amplitude of the 95% confidence interval) equal to 5% of the estimate of the mean (m), that is, 0.05×m (higher precision, in the present study), ranged between 10 plants for EL in trial 7 and 466 plants for DME (Table 6). Therefore, in relation to ear length, it can be inferred, with 95% confidence, that the confidence interval of the mean obtained with 10 plants is m±0.05m, that is, m±0.44 cm, because the mean ear length of the 100 plants sampled was 8.71 cm (Table 3). At the other extreme, the precision of m±0.05m is obtained with 466 plants for the dry matter of ears and, in this case, the interval would be m±0.04 g, because the mean dry matter of ears of the 100 plants sampled was 0.77 g.
For the estimation error of 10%, a larger sample size was observed for the traits FML, FMS, FME, FMSH, DML, DMS, DME, and DMSH (16 ≤ n ≤ 117), with an average of 55 plants, than for NL, NS, and NE (16 ≤ n ≤ 76), with an average of 40 plants, and for SL and EL (3 ≤ n ≤ 9), with an average of 5 plants (Table 6). As expected, these results are due to the higher coefficient of variation of the traits obtained by weighing (FML, FMS, FME, FMSH, DML, DMS, DME, and DMSH) than of those obtained by counting (NL, NS, and NE) and by measurement (SL and EL), in this order. A similar order was observed in black oat, for which a larger sample size was necessary to estimate fresh matter and dry matter obtained by weighing than to estimate the number of leaves and tillers obtained by counting and plant height obtained by measurement (Cargnelutti Filho et al., 2015). In maize, Toebe et al. (2014) also determined a larger sample size for traits obtained by weighing than for those obtained by counting and measurement, in this order.
If the researcher accepts an estimation error of 20%, that is, 0.20×m (lower precision, in the present study), at a 95% confidence level, the number of plants to be sampled ranges from 1 to 30 (Table 7). Such a low number of plants (≤ 30) would be easily evaluated in an experiment. However, it would lead to low precision in estimating the means of the traits.
In practice, based on the results of the present study, the researcher can choose the sample size to estimate the means of these traits with the desired precision. For instance, if a maximum estimation error of 10% is accepted, that is, 0.10×m, 117 plants would be sufficient to estimate the means of these 13 traits, under the conditions of the 17 uniformity trials (Table 6). Thus, when planning a field experiment in a completely randomized design to estimate the mean of each treatment with a maximum error of 10% of the mean, 117 plants per treatment should be evaluated. If the experiment is planned with four replicates per treatment, 30 plants per replicate (117/4 = 29.25, rounded up) should be sampled, that is, 30 plants per plot. Furthermore, if five treatments were evaluated in the experiment, 600 plants (120 per treatment) should be sampled.
It is worth pointing out that the traits obtained by counting (NL, NS, and NE) and by measurement (SL and EL) require individual evaluation of the plants, which demands more labor and time than FML, FMS, FME, FMSH, DML, DMS, DME, and DMSH, since, for these traits, plants can be weighed together. Optionally, the researcher can guarantee the maximum estimation error of 10% by sampling 117, 76, and 9 plants for the traits obtained by weighing, counting, and measuring, respectively.

Conclusions
1. The sample size for estimating the means of wheat (Triticum aestivum) traits obtained by weighing, counting, and measuring decreases in this order.
2. To estimate the means of wheat traits with a maximum error of 10% of the mean and a confidence level of 95%, the sufficient sample size is 117 plants for fresh and dry matter of leaves, stems, ears, and shoots; 76 plants for the number of leaves, stems, and ears; and nine plants for the lengths of the main stem and of the main stem ear.

Table 1. Composition of uniformity trials with wheat (Triticum aestivum) cultivars in the 2018 and 2019 crop seasons. (1) DAS, number of days after sowing on the evaluation date.
Table 2. P-value of the Kolmogorov-Smirnov normality test and skewness coefficient of the traits evaluated in wheat plants (Triticum aestivum) cultivated in 17 uniformity trials (1).
Table 3. Mean and median of the traits evaluated in wheat plants (Triticum aestivum) cultivated in 17 uniformity trials (1).
Table 4. Minimum and maximum values of the traits evaluated in wheat plants (Triticum aestivum) cultivated in 17 uniformity trials (1).
Table 5. Variance and coefficient of variation of the traits evaluated in wheat plants (Triticum aestivum) cultivated in 17 uniformity trials (1).
Table 6. Sample size (number of plants) for estimating the means of traits evaluated in wheat plants (Triticum aestivum) cultivated in 17 uniformity trials (1), for estimation errors (semi-amplitudes of the confidence interval) equal to 5% and 10% of the mean (m), that is, 0.05×m (higher precision) and 0.10×m (lower precision), with confidence level (1-α) of 95%.
Table 7. Sample size (number of plants) for estimating the means of traits evaluated in wheat plants (Triticum aestivum) cultivated in 17 uniformity trials (1), for estimation errors (semi-amplitudes of the confidence interval) equal to 15% and 20% of the mean (m), that is, 0.15×m (higher precision) and 0.20×m (lower precision), with confidence level (1-α) of 95%.
(1) In Tables 2 to 7: uniformity trials defined in Table 1. SL, main stem length (cm); EL, main stem ear length (cm); NL, number of leaves; NS, number of stems; NE, number of ears; FML, fresh matter of leaves (g per plant); FMS, fresh matter of stems (g per plant); FME, fresh matter of ears (g per plant); FMSH, fresh matter of shoots (FML+FMS+FME) (g per plant); DML, dry matter of leaves (g per plant); DMS, dry matter of stems (g per plant); DME, dry matter of ears (g per plant); and DMSH, dry matter of shoots (DML+DMS+DME) (g per plant).