Optimizing Bartlett test: a grain yield analysis in soybean

ABSTRACT: This study analyzed the response of the Bartlett test as a function of sample size and defined the optimal sample size for the test with soybean grain yield data. Six experiments were conducted in a randomized block design with 20 or 30 cultivars and three repetitions. Grain yield was determined per plant, totaling 9,000 sampled plants. Next, sample scenarios of 1, 2, ..., 100 plants were simulated and the optimal sample size was defined via maximum curvature points. Increasing the number of sampled plants per experimental unit improves the precision of the Bartlett test. Sampling 17 to 20 plants per experimental unit is enough to maintain the accuracy of the test.

Gaussian inferences are subject to mathematical assumptions that, if violated, may reduce the reliability of results (WELHAM et al., 2015; BUTLER, 2021). The analysis of variance, in particular, which is used for summarizing scientific data, is subject to four assumptions: additivity of the model, error independence, error normality, and homogeneity of variances (BUTLER, 2021). The latter two are normally the hardest to meet, and although BLANCA et al. (2017) pointed out that the analysis of variance is robust to deviations from normality, such robustness does not extend to cases with heterogeneous variances (WELHAM et al., 2015). This is because FISHER (1925), when developing the analysis, considered the variances of the treatments to be similar or at least close. If the variation around the mean of each treatment is similar, a pooled error can be calculated (BUTLER, 2021); otherwise, the inference loses reliability.
Many statistical tests can be used to evaluate variance homoscedasticity, the Bartlett test being one of the most common (BARTLETT, 1937). However, in cases where variance homoscedasticity is violated, the accuracy of the test used to assess the homogeneity of variances is an important factor to verify. The Bartlett test itself is susceptible to deviations from normality (BARTLETT, 1937; WELHAM et al., 2015); however, this may not be the only factor that interferes with its estimates. Little is known about the quantitative response of this test as a function of sample size, and sampling for soybean yield traits is often defined empirically, as in SOUZA et al. (2021) and SODRÉ FILHO et al. (2022), who evaluated 20 and 5 plants per experimental unit, respectively. Therefore, in order to optimize the accuracy of the test and identify how sample size interferes with Bartlett's estimates, this study analyzed the response of the Bartlett test as a function of sample size and defined the optimal sample size for soybean grain yield data.
For the data analysis, specific routines written in R software were used (R DEVELOPMENT CORE TEAM, 2022). Initially, the database was subdivided per experimental unit for all experiments (E1, E2, E3, E4, E5, and E6). Next, 31 sampling scenarios (n = 1, 2, ..., 20, 25, ..., 50, 60, ..., 100 plants per experimental unit) were simulated with replacement and 10,000 resamplings (EFRON, 1979) for each experiment, using the sample() function. Once the values of each experimental unit in the resamplings per sampling scenario were obtained, the analysis of variance was performed with the aov() function, according to the following mathematical model: $Y_{ir} = m + G_i + \beta_r + \varepsilon_{ir}$, where $Y_{ir}$ is the value observed in the response variable in plot ir, $m$ is the overall mean, $G_i$ is the fixed effect of level i of the genotype factor, with i = 1, 2, ..., 30 for E1, E2 and E3 and i = 1, 2, ..., 20 for E4, E5 and E6, $\beta_r$ is the random effect of level r (r = 1, 2, 3) of the block, and $\varepsilon_{ir}$ is the effect of the experimental error. The estimates of the error were extracted from the fitted model and the Bartlett test was applied at 5% error probability using the bartlett.test() function. Bartlett's statistic ($K^2$) was obtained 1,860,000 times (31 sample sizes per experimental unit × 10,000 resamplings × 6 reference experiments).
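As an illustration only, the resampling step described above can be sketched in Python using the standard library (the study itself used R routines built on sample(), aov(), and bartlett.test()). Several details here are assumptions not stated in the text: the plot value is taken as the mean of the n resampled plants, the residuals are grouped by genotype before computing $K^2$, and the data are synthetic. The function names are hypothetical.

```python
import math
import random
from statistics import mean, variance


def bartlett_k2(groups):
    """Bartlett's K^2 statistic for a list of samples (each needs >= 2 values).

    Raises ValueError if any sample has zero variance (log of zero).
    """
    k = len(groups)
    dfs = [len(g) - 1 for g in groups]
    variances = [variance(g) for g in groups]
    df_error = sum(dfs)  # N - k
    pooled = sum(df * v for df, v in zip(dfs, variances)) / df_error
    numerator = df_error * math.log(pooled) - sum(
        df * math.log(v) for df, v in zip(dfs, variances)
    )
    correction = 1 + (sum(1 / df for df in dfs) - 1 / df_error) / (3 * (k - 1))
    return numerator / correction


def rbd_residuals(plot_values):
    """Residuals of Y_ir = m + G_i + beta_r + e_ir for a balanced block design.

    plot_values[i][r] is the plot value of genotype i in block r.
    """
    n_gen, n_blk = len(plot_values), len(plot_values[0])
    grand = mean(v for row in plot_values for v in row)
    gen_means = [mean(row) for row in plot_values]
    blk_means = [mean(plot_values[i][r] for i in range(n_gen)) for r in range(n_blk)]
    return [
        [plot_values[i][r] - gen_means[i] - blk_means[r] + grand for r in range(n_blk)]
        for i in range(n_gen)
    ]


def resample_k2(plants, n, n_resamples, rng):
    """K^2 for each resampling; plants[i][r] lists per-plant yields of one unit."""
    stats = []
    for _ in range(n_resamples):
        # Assumption: draw n plants with replacement and average them per unit.
        plot_values = [[mean(rng.choices(unit, k=n)) for unit in row] for row in plants]
        # Assumption: residuals are grouped by genotype (one group per row).
        stats.append(bartlett_k2(rbd_residuals(plot_values)))
    return stats


# Synthetic example: 5 genotypes x 3 blocks x 50 plants per unit (not study data).
rng = random.Random(2023)
plants = [[[rng.gauss(10.0, 3.0) for _ in range(50)] for _ in range(3)] for _ in range(5)]
k2_values = resample_k2(plants, n=5, n_resamples=200, rng=rng)
print(min(k2_values), mean(k2_values), max(k2_values))
```

Repeating the call to resample_k2 for each value of n gives one distribution of $K^2$ per sampling scenario, which is the input for the descriptive analysis of each scenario.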
Finally, each planned scenario was subjected to a descriptive analysis, calculating the minimum, 2.5th percentile, mean, 97.5th percentile, and maximum values. The width of the 95% confidence interval ($CI_{95\%}$) was obtained as the difference between the 97.5th and 2.5th percentiles. Then, the $CI_{95\%}$ estimates were fitted through the nls() function with the following power model: $CI_{95\%} = \alpha \times n^{\beta} + \varepsilon$, where $\alpha$ is the intercept coefficient, $n$ is the sample size, $\beta$ is the exponential rate of decay, and $\varepsilon$ is the random error. Subsequently, four maximum curvature point methods were used (general, perpendicular distances, linear plateau response, and spline), as described by SILVA & LIMA (2017), using the maxcurv() function from the soilphysics package. The point obtained was taken as a sufficiently representative sample size.
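The chain from percentiles to a maximum curvature point can be sketched as follows, again in standard-library Python rather than the R functions used in the study. Two substitutions are made and should be read as assumptions: the power model is fitted by ordinary least squares on the log-log scale instead of nonlinear least squares (nls()), and the maximum curvature point uses a simplified reading of the perpendicular distances idea (the sample size whose point on the fitted curve lies farthest from the straight line joining the curve's endpoints), not the maxcurv() implementation. All function names are hypothetical and the widths are synthetic.

```python
import math


def percentile(values, p):
    """Percentile p (0-100) with linear interpolation between order statistics."""
    xs = sorted(values)
    pos = (len(xs) - 1) * p / 100
    lo = math.floor(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def ci95_width(values):
    """Width of the interval between the 2.5th and 97.5th percentiles."""
    return percentile(values, 97.5) - percentile(values, 2.5)


def fit_power(ns, widths):
    """Fit width = alpha * n**beta by least squares on log(width) vs log(n)."""
    lx = [math.log(n) for n in ns]
    ly = [math.log(w) for w in widths]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    beta = sum((x - mx) * (y - my) for x, y in zip(lx, ly)) / sum(
        (x - mx) ** 2 for x in lx
    )
    alpha = math.exp(my - beta * mx)
    return alpha, beta


def max_curvature_pd(alpha, beta, n_min, n_max):
    """Integer n in [n_min, n_max] farthest from the chord joining the endpoints.

    Note: perpendicular distance depends on the scale of both axes.
    """
    def f(n):
        return alpha * n ** beta

    y0 = f(n_min)
    slope = (f(n_max) - y0) / (n_max - n_min)

    def distance(n):
        return abs(slope * (n - n_min) - (f(n) - y0)) / math.sqrt(slope ** 2 + 1)

    return max(range(n_min, n_max + 1), key=distance)


# The 31 sampling scenarios: 1..20, then 25..50 by 5, then 60..100 by 10.
sizes = list(range(1, 21)) + list(range(25, 51, 5)) + list(range(60, 101, 10))

# Synthetic interval widths following an exact power law (not study data).
widths = [8.0 * n ** -0.5 for n in sizes]
alpha, beta = fit_power(sizes, widths)
n_star = max_curvature_pd(alpha, beta, min(sizes), max(sizes))
print(alpha, beta, n_star)
```

In practice, ci95_width would be applied to the resampled $K^2$ values of each scenario to produce the widths, and a negative fitted $\beta$ corresponds to an interval that narrows as more plants are sampled.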
As expected, sample size directly affects the estimates of the Bartlett test (Figure 1) when analyzing soybean grain yield per plant. Observing the mean behavior of the six trials, an exponentially decreasing response is identified, which also holds for the $CI_{95\%}$. This type of response has already been described for other statistics when analyzing $CI_{95\%}$ (TOEBE et al., 2018; PIÑERA-CHAVEZ et al., 2020). Such indicators show that increasing the sample size gives higher precision to the test's estimates (TOEBE et al., 2018). The sensitivity of the Bartlett test to sample size is most evident in small sampling scenarios, such as ≤ 5 plants per experimental unit. In those cases, there is a greater tendency to overestimate the values of the test. However, as observed in Figures 1a, 1c, 1e, 1g, 1i, and 1k, an underestimation bias is also possible.
Ciência Rural, v.53, n.6, 2023.
Moreover, four methods to estimate sample size were applied and compared based on the previously fitted power models (Table 1 and Figure 1). The power models performed satisfactorily in the six trials when assessed with fitting indicators such as the coefficient of determination ($R^2$), root mean square error (RMSE), and Willmott's agreement index (d). This allows a posteriori inferences, such as the use of maximum curvature points, to be made efficiently (SILVA & LIMA, 2017). Nevertheless, contrasting sample size values were identified, ranging from 4 to 41 plants per experimental unit. This large variation is evidently due to the method used, since only slight differences are seen when comparing sample sizes obtained through the same method between trials. For example, the optimal sample size for the Bartlett test obtained with the general method fluctuates only from 4 to 9 plants per experimental unit between trials.
Equally, with the linear plateau response method, the variation is small, ranging from 28 to 41 plants per experimental unit. The same is observed for the perpendicular distances and spline methods.
Based on the $CI_{95\%}$, small sample sizes, such as the ones obtained through the general method (≤ 9 plants), may lead to biased estimates. Although the slightly larger sizes suggested by the spline method (≤ 15 plants) might reduce the bias of the test, such values are still far from optimizing it; that is, the $CI_{95\%}$ is still decreasing, meaning the curve has not yet stabilized at those points. Only from the sample sizes obtained through the perpendicular distances and linear plateau response methods onward does the $CI_{95\%}$ curve begin to stabilize, which suggests that the values reached with those methods are sufficiently representative sample sizes. Interestingly, the perpendicular distances method recommended sampling at most 20 plants per experimental unit, whereas the linear plateau response method reached a maximum of 41 plants. When analyzing the $CI_{95\%}$, however, the precision gain obtained with the linear plateau response method is too small compared with that of the perpendicular distances method to justify choosing the former over the latter. Thus, although both methods are capable of providing sufficiently reliable sample size estimates to optimize the Bartlett test, we encourage the sampling of 17 to 20 plants per experimental unit, so that the test's estimates are accurate and allow verification of whether the homogeneity of variances assumption is met or violated in an analysis of variance performed for the soybean crop.

Table 1 - Coefficient of determination ($R^2$), root mean square error (RMSE), and d index of the power models, and maximum curvature points and sample sizes for the Bartlett test.