Analysis of variance in augmented block design and Scott-Knott’s test in hybrid corn selection studies

Abstract The objective of this work was to present a methodological alternative for studies of the characterization and selection of corn (Zea mays) genotypes, through the joint analysis of variance of an augmented block design, using Scott-Knott’s test, and to present the hybrids selected from the adopted strategy, to show its efficiency. For the application of the methodology, a case study was used: the selection of superior corn hybrids for the Brazilian Cerrado. In four locations, 70 experimental hybrids were evaluated in an augmented block design without replicates, with three controls replicated once in each block. The analysis of experimental groups applied to the augmented block design, followed by genotype classifications by Scott-Knott’s multiple comparison test, is a viable alternative for studies with a low number of replicates and a large number of genotypes. Based on the tested methodology, the following experimental hybrids are selected for grain yield: HT007, HT008, HT018, HT004, HT024, HT005, and HT071.


Introduction
The augmented block design, proposed by Walter T. Federer in 1955, sought to meet the needs of breeding programs in the preliminary experimentation phases, whose basic purpose is to screen promising treatments for more accurate future tests (Duarte, 2000). A strong justification for recommending this methodology is the limited availability of propagation material for each new genotype to be evaluated, which forces breeders not only to reduce the number of replicates, but also to adopt plots with only one or two rows of plants, without borders.
The analysis of variance method is attributed to Fisher (1919). Its classical version, historically the most widely applied, is employed for balanced data sets, in fixed or completely random models. This method produces estimators with desirable statistical properties: they are unbiased and have minimum variance (Barbin, 2019). The augmented block design allows two types of analysis: the intrablock analysis (fixed effects for blocks and treatments) and the analysis with recovery of interblock information (one effect fixed and the other random) (Cochran & Cox, 1992).
The analysis of the results of a group of experiments (called joint analysis) allows more reliable conclusions and shows how broadly a recommendation applies (Pimentel Gomes & Guimarães, 1958).
To perform the joint analysis of variance for Federer's augmented block design, the experiments must be suitable for grouping: the residual mean squares obtained in the individual analyses should not differ greatly from each other, that is, the experimental errors should have homogeneous variances (Pimentel-Gomes, 2000).
Scott-Knott's test is an alternative for post-hoc analysis following the analysis of variance, but it rests on a different concept from traditional multiple comparison tests such as those of Tukey, Duncan, SNK, and Dunnett. Scott-Knott's method separates the treatment means into homogeneous, non-overlapping groups by minimizing the sum of squares within groups and maximizing it between groups. The method assesses whether dividing the j treatments into homogeneous groups is significant, based on a likelihood ratio, which implies maximizing the sum of squares between the generated groups (Ramalho et al., 2000).
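The partition step of Scott-Knott's method can be sketched as follows. This is a minimal illustration, not the SAS macro used in this study: it assumes the usual formulation of the test, in which the sorted means are split at the point that maximizes the between-group sum of squares (B0) and the statistic lambda is compared with a chi-square distribution with k/(pi - 2) degrees of freedom. The function name, its arguments, and the example means are hypothetical.

```python
import math

def scott_knott_split(means, var_of_mean, error_df):
    """One Scott-Knott partition step (illustrative sketch).

    means       : treatment means to be split into two groups
    var_of_mean : estimated variance of a treatment mean (MSE / r)
    error_df    : degrees of freedom of the residual
    Returns (low_group, high_group, b0, lambda_stat, chi2_df).
    """
    y = sorted(means)
    k = len(y)
    total = sum(y)
    grand_mean = total / k

    # Try every cut of the ordered means and keep the one that
    # maximizes the between-group sum of squares B0.
    best_b0, best_cut, running = -1.0, None, 0.0
    for cut in range(1, k):
        running += y[cut - 1]
        t1, t2 = running, total - running
        b0 = t1 ** 2 / cut + t2 ** 2 / (k - cut) - total ** 2 / k
        if b0 > best_b0:
            best_b0, best_cut = b0, cut

    # Maximum likelihood estimator of the variance under the null
    # hypothesis that all means belong to a single group.
    ss_means = sum((v - grand_mean) ** 2 for v in y)
    sigma0_sq = (ss_means + error_df * var_of_mean) / (k + error_df)

    lambda_stat = math.pi / (2 * (math.pi - 2)) * best_b0 / sigma0_sq
    chi2_df = k / (math.pi - 2)
    return y[:best_cut], y[best_cut:], best_b0, lambda_stat, chi2_df

# Hypothetical means, for illustration only (not data from this study).
low, high, b0, lam, df = scott_knott_split(
    [10.0, 11.0, 12.0, 20.0, 21.0], var_of_mean=1.0, error_df=18
)
print(low, high, round(b0, 2))
```

If lambda exceeds the chi-square critical value, the two groups are declared different and the same step is repeated within each group; otherwise the means are kept together.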
Regarding the widespread use of Scott-Knott's test, it should be understood that the method was proposed only as an alternative that avoids ambiguity, in order to facilitate the interpretation of research results, especially in studies with a large number of treatments (Ferreira et al., 1999). However, ambiguity should not be regarded as a flaw of multiple comparison tests, because it is one of the possible outcomes of such tests when detecting differences.
Corn is of great importance among the cultivated grains, and Brazil is one of its main producers in the world (Klein & Luna, 2022). The Midwest region of the country shows the highest rates of planted area and yield. In the 2020/2021 harvest, 21,584,400 ha were planted, which represents the second largest grain-planted area, and 114,691,300 Mg of corn were harvested in Brazil (Conab, 2022).
The objective of this work was to present a methodological alternative for studies of the characterization and selection of corn genotypes, through the joint analysis of variance of an augmented block design, using Scott-Knott's test, and to present the hybrids selected from the adopted strategy, to show its efficiency.

Materials and Methods
The use of the suggested methodology is exemplified with data from a case study: an experiment installed at five sites that are highly representative of second-crop (safrinha) corn production in Brazil. Of these five sites, two are located in a low-altitude region (<500 m) and three in a high-altitude region (>500 m); they are described as follows.
Site 1: located in the Fazenda Santo Antônio (12º52'S, 55º49'W, at 398 m altitude), in the municipality of Sorriso, in the state of Mato Grosso (MT), Brazil. In all locations, the experiments were installed in rainfed (non-irrigated), third-party areas and were carried out alongside the commercial corn area of the farm, after the soybean harvest. The experiment at site 5 was discarded because of phytotoxicity problems in one of the pesticide applications, and the other four sites remained for the evaluation of the experimental hybrids.
An augmented block design was used to evaluate 73 corn genotypes (triple hybrids): 70 treatments without replicates (experimental hybrids) and 3 controls (20A55, MG699, and 2B633) replicated in each of the blocks and orthogonally arranged, so that each control was present in all rows and columns of the experiment (Figure 1). There were 10 blocks with 10 plots each, totaling 100 plots.
The triple hybrids used as controls are commercial hybrids of known and proven performance, which were acquired with own resources. The 70 new genotypes under test came from the breeding program of a company located in the municipality of Cristalina, GO, and they were obtained from a partial diallel, in which inbred lines were crossed with experimental simple hybrids, both from the germplasm bank of the company.
Seeds of all treatments were treated with two insecticides to ensure initial pest protection and good stand formation in the plots. The two insecticides belonged to different classes, one a neonicotinoid and the other a diamide, and were applied at the doses recommended by the manufacturer for the control of initial pests such as Hemiptera (bugs) and Lepidoptera (caterpillars), respectively.
The plot size was 4 rows at 0.5 m spacing by 5 m length (10 m²); the useful area comprised the two central rows by 5 m length (5 m²). For the evaluation of corn for grain yield, Sturion et al. (1994) determined that plots of 3.64 m², with approximately 20 plants, were sufficient to obtain good precision in the evaluations.
The planting of the experiments was done manually with the aid of a planter, after the area had been mechanically furrowed by the planter. Approximately 15 days after plant emergence, thinning was performed to standardize the number of plants per plot. After thinning, the density was equivalent to 60,000 plants per hectare.
The fertilization and the cultural treatments followed the technical recommendations indicated for corn cultivation (Fancelli, 2015). Fungicides were not applied at any of the locations due to the low incidence of foliar diseases.
At harvest, the grain weight per plot and the moisture content of the grain mass were determined. The weight per plot was converted into kilograms per hectare (yield) and corrected to 13% water content.
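The conversion described above can be sketched as follows. The formula is the usual moisture correction on a wet basis and is an assumption here, since the text does not state the expression used; the function name and the example weights are hypothetical. The default useful area of 5 m² and the 13% target come from the text.

```python
def yield_kg_ha(plot_weight_kg, moisture_pct, plot_area_m2=5.0,
                target_moisture_pct=13.0):
    """Grain yield in kg/ha, corrected to the target water content.

    Assumes the usual wet-basis correction:
    corrected weight = weight * (100 - measured) / (100 - target).
    """
    moisture_factor = (100.0 - moisture_pct) / (100.0 - target_moisture_pct)
    area_factor = 10_000.0 / plot_area_m2  # m² per hectare / useful plot area
    return plot_weight_kg * moisture_factor * area_factor

# Hypothetical plot: 3.0 kg of grain harvested at 20% moisture.
print(round(yield_kg_ha(3.0, 20.0), 1))
```

A plot harvested exactly at 13% moisture needs no correction, so 3.0 kg on 5 m² corresponds to 6,000 kg/ha.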
Analysis of variance was used to evaluate the genotypes in the augmented block design, in which the letter B refers to blocks, T refers to the control treatments (controls), and L refers to the treatments under test (hybrids) (Table 1).
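As a worked illustration, the degrees of freedom of each individual experiment can be partitioned as below. This sketch assumes the usual partition for an augmented block design; the symbols b (blocks), c (controls), n (hybrids), and N (plots) are introduced here for illustration, with b = 10, c = 3, and n = 70 taken from the design described above.

```latex
\begin{aligned}
N &= bc + n = 10 \times 3 + 70 = 100 \\
\text{Total} &: N - 1 = 99 \\
\text{Blocks (B, ignoring treatments)} &: b - 1 = 9 \\
\text{Treatments (adjusted)} &: c + n - 1 = 72 \\
\quad \text{Controls (T)} &: c - 1 = 2 \\
\quad \text{Hybrids (L)} &: n - 1 = 69 \\
\quad \text{Controls vs. hybrids} &: 1 \\
\text{Error} &: (b - 1)(c - 1) = 18
\end{aligned}
```

The residual is estimated only from the replicated controls, which is why it has 18 degrees of freedom regardless of the number of hybrids under test.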
For this model, the total sum of squares (SS_total) and the block sum of squares (SS_blocks, ignoring treatments) are calculated in the usual way, where Y_ij are the observations made in each plot; Y_j are the block totals, with j running over the blocks and i over the treatments; and CF is the correction factor.
To calculate the sums of squares of adjusted treatments and their partitions, the values obtained for the hybrids under test were adjusted. The difference between the mean of the controls in the block where the hybrid is located (MC_j) and the general mean of the controls in the whole experiment (MC) was added to the observed value of each hybrid. The sum of squares of the treatments (adjusted) is obtained from these adjusted values.
An individual statistical analysis was then performed for each of the sites, in order to determine the mean square error (MSE) and to know whether it would be possible to perform the joint analysis of the experiments. With the local MSEs, the homogeneity among the different locations was tested with Hartley's maximum F-test before proceeding with the joint analysis (Hartley, 1950). To perform the joint analysis of variance for Federer's augmented block design, the residual mean squares obtained in the individual analyses should not be very different from each other, that is, the experimental errors should have homogeneous variances.
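The sums of squares and the adjustment of the hybrids described above can be sketched as follows. This is an illustration under stated assumptions, not the routine used in this study: CF is taken as the squared grand total divided by the number of plots, SS_blocks as the sum of squared block totals divided by the plots per block minus CF, and the adjustment as the observed value plus (MC - MC_j), which is the sign convention assumed here. Function names and example values are hypothetical.

```python
from statistics import mean

def sums_of_squares(plots):
    """plots: list of (block, value) pairs for every plot in the trial.

    Returns (CF, SS_total, SS_blocks), with blocks ignoring treatments.
    """
    values = [v for _, v in plots]
    cf = sum(values) ** 2 / len(values)           # correction factor
    ss_total = sum(v * v for v in values) - cf
    by_block = {}
    for block, v in plots:
        by_block.setdefault(block, []).append(v)
    ss_blocks = sum(sum(vs) ** 2 / len(vs) for vs in by_block.values()) - cf
    return cf, ss_total, ss_blocks

def adjust_hybrids(hybrids, controls):
    """hybrids: list of (block, value); controls: {block: [control values]}.

    Assumed convention: adjusted = observed + (MC - MC_j), so a hybrid grown
    in a block where the controls did better than average is adjusted down.
    """
    mc_j = {b: mean(vs) for b, vs in controls.items()}
    mc = mean(v for vs in controls.values() for v in vs)
    return [(b, y + (mc - mc_j[b])) for b, y in hybrids]

# Hypothetical two-block example, for illustration only.
controls = {1: [10.0, 12.0], 2: [14.0, 16.0]}
hybrids = [(1, 9.0), (2, 17.0)]
plots = [(b, v) for b, vs in controls.items() for v in vs] + hybrids
print(sums_of_squares(plots))
print(adjust_hybrids(hybrids, controls))
```

In this toy case the control means are 11 and 15 around a general mean of 13, so the hybrid in block 1 moves from 9 to 11 and the hybrid in block 2 from 17 to 15.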
In the present study, the intrablock analysis model was adopted. This model assumes fixed effects for blocks and treatments, corresponding to the within-block analysis. The effects µ (population mean), β_j (effect of block j), τ_i (effect of treatment i), and ε_ij (random error) were assumed to be independent of each other, and the only variance component is associated with the experimental error, in accordance with the assumption ε_ij ~ N(0, σ_e²).
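Written out, the intrablock model with the effects listed above takes the usual additive form. The equation below is a sketch assembled from those definitions, with Y_ij denoting the observation of treatment i in block j.

```latex
Y_{ij} = \mu + \tau_i + \beta_j + \varepsilon_{ij},
\qquad \varepsilon_{ij} \sim N(0, \sigma_e^2)
```

Because µ, τ_i, and β_j are all treated as fixed, σ_e² is the only variance component to be estimated.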
The statistical analyses were performed using the GLM procedure (Proc GLM) of the SAS software (SAS Institute Inc., 1997), with each of the four locations analyzed individually. With the MSE of each location, the joint analysis was then carried out. The output values were subjected to Scott-Knott's test, implemented as a macro adapted for the SAS software, for each of the locations individually and for the joint analysis.

Results and Discussion
The analysis of variance for each location yielded the degrees of freedom (DF), mean squares (MS), and significance levels (F test) for the sources of variation of the experiments (Table 2).
For the blocks source of variation (ignoring treatments), a significant difference was observed at 5% probability in Canarana (MT) and Sorriso (MT). In Primavera do Leste (MT) and Rio Verde (GO), the effect of blocks was not significant.
For the adjusted treatments, there was a significant difference at 5 and 1% probability in Primavera do Leste (MT) and Sorriso (MT), respectively. In Canarana (MT) and Rio Verde (GO), however, no significant difference was found for the treatments.
In all locations, there was a significant difference for the controls x hybrids interaction at 1% probability, except in Primavera do Leste (MT), where the interaction was not significant.
The MSE values of the experiments were close, allowing the joint analysis of the experiments. The obtained MSE values were: 700,980.74 in Canarana (MT), 851,792.87 in Primavera do Leste (MT), 1,412,143.67 in Rio Verde (GO), and 260,992.39 in Sorriso (MT). The maximum F was 5.25, calculated by dividing the highest MSE value by the lowest of the individual analyses. Since this value was below 7, it was possible to perform a joint analysis of the experiments (Hartley, 1950).
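The homogeneity check described above reduces to a single ratio, sketched below with the MSE values listed in the text. The variable and function names are hypothetical, and the limit of 7 is simply the threshold quoted above. With the values as printed, the ratio evaluates to about 5.41 rather than 5.25; either figure is below the limit, so the decision to pool the experiments is unchanged.

```python
# Residual mean squares (MSE) of the individual analyses, as listed in the text.
mse_by_site = {
    "Canarana": 700_980.74,
    "Primavera do Leste": 851_792.87,
    "Rio Verde": 1_412_143.67,
    "Sorriso": 260_992.39,
}

def hartley_fmax(mse_values):
    """Ratio of the largest to the smallest residual mean square."""
    values = list(mse_values)
    return max(values) / min(values)

F_MAX_LIMIT = 7.0  # threshold quoted in the text

f_max = hartley_fmax(mse_by_site.values())
can_pool = f_max < F_MAX_LIMIT
print(f"F max = {f_max:.2f}; joint analysis allowed: {can_pool}")
```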
In the joint analysis of variance (Table 3), there was a significant difference at 1% probability for location, adjusted treatments, and the controls x hybrids interaction. The significance for location shows that there are large differences between one location or region of evaluation and another. This confirms that the chosen locations represent well the main producing regions of second-crop corn in the Brazilian Cerrado, which differ in climate and geography and can be more or less affected by the rainfall regime in one or more locations, in one or more years, without a consistent trend. As to the treatments, significant differences between them were expected, which allows the selection of the best genotypes for evaluation in more locations and with more replicates in the following stages of the breeding program.
There was a significant difference, at 5% probability, for blocks (ignoring treatments) and hybrids. For the controls, there was no significant difference, which was expected since this behavior had already been observed in other genotype-selection trials. This nonsignificance increases confidence in the conduct of the experiments and in the quality of the collected data. It also indicates that the statistical method used in this work responded well for the evaluation and selection of hybrids in the joint analysis of the augmented block design.
Another important result of the individual analyses of variance of the four trials is the coefficient of variation (CV) obtained in each.
According to Pimentel Gomes & Guimarães (1958), the CV gives an idea of the precision of the experiment. The highest and lowest CVs were 22.58% and 12.22% in Primavera do Leste (MT) and Sorriso (MT), respectively (Table 4).
When studying the CVs of several agricultural trials, Pimentel Gomes & Guimarães (1958) proposed the following CV classification: low, <10%; medium, between 10 and 20%; high, between 20 and 30%; and very high, >30%. This classification is inversely related to the precision of the experiment, i.e., the higher the CV, the lower the experimental precision. However, since CVs can be misinterpreted (Doring & Reckling, 2018), their ranges can vary and should be specific for each analyzed variable and for each studied crop (Fritsche-Neto et al., 2012; Couto et al., 2013).
Analyzing the CVs of 66 studies on corn, Scapim (1995) found values between 2.24 and 38.14% for the grain weight characteristic. The same author classified the CVs as: low, up to 10%; medium, from 10 to 22%; high, from 22 to 28%; and very high, above 28%. Therefore, there is a slight divergence between the CV ranking values proposed by Pimentel-Gomes and Scapim (Fritsche-Neto et al., 2012; Lopes et al., 2021).
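The two classification scales can be compared with a short sketch. The CV is computed here with the usual expression, 100 times the square root of the MSE divided by the general mean, which is an assumption since the text does not state the formula. Values falling exactly on a class limit are assigned to the lower class, which is also an assumption. All names are hypothetical.

```python
import math

# Upper limits (%) of the low, medium, and high classes, as quoted in the text.
PIMENTEL_GOMES_LIMITS = (10.0, 20.0, 30.0)
SCAPIM_LIMITS = (10.0, 22.0, 28.0)

def cv_percent(mse, general_mean):
    """Coefficient of variation (%), assuming CV = 100 * sqrt(MSE) / mean."""
    return 100.0 * math.sqrt(mse) / general_mean

def classify_cv(cv, limits):
    """Class of a CV under a scale given by its three upper limits.

    Assumption: a value exactly on a limit falls in the lower class.
    """
    low, medium, high = limits
    if cv <= low:
        return "low"
    if cv <= medium:
        return "medium"
    if cv <= high:
        return "high"
    return "very high"

# CVs reported in the text: highest, lowest, and joint analysis.
for cv in (22.58, 12.22, 18.78):
    print(cv, classify_cv(cv, PIMENTEL_GOMES_LIMITS), classify_cv(cv, SCAPIM_LIMITS))
```

For the three CVs reported in this study the two scales agree; they diverge only for values between 20 and 22% or between 28 and 30%.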
In the joint analysis, the CV was 18.78%, which is classified as medium for grain weight data, indicating that the experiments were well conducted in the field and, consequently, that the data obtained are reliable for proceeding with the selection of hybrids.
The grain yields of the genotypes for each location and in the joint analysis are presented in Table 4, together with the results of Scott-Knott's mean grouping test for all locations and the joint analysis. In the joint analysis, which considers all locations, seven experimental hybrids performed equal to the three controls: HT007, HT008, HT018, HT004, HT024, HT005, and HT071.
The performance of these hybrids remained very stable across the four locations, where they performed equal to or better than one or more controls, depending on the location evaluated. This gives great confidence in the grouping results of the joint analysis.
Another fact lending a high degree of reliability to the joint analysis method is that the controls are in the group of higher grain yield. This was expected and is frequent in preliminary trials because the controls are genotypes whose performance has already been validated in several research trials before entering the agricultural seed market. Regarding the genotypes evaluated in these preliminary trials, it is worth noting that they are not completely homogeneous because the purification of the parental lines is not yet complete or has not reached advanced stages of homozygosity. Generally, in parallel with the preliminary trials, the parental lines of these genotypes undergo characterization and purification seeking greater genotypic and phenotypic homogeneity, to obtain more homogeneous hybrids for testing in network trials in subsequent years, if selected for intermediate and/or VCU trials. This homogeneity in the advanced hybrids increases grain yield because it improves the pattern and uniformity of the plants that compose the evaluated plot, reducing the presence of atypical and/or dominated plants, which in practice reduce the grain yield observed in the preliminary phases.
The experimental hybrids HT007, HT008, HT018, HT004, HT024, HT005, and HT071 are the most promising for grain yield, showing performance equal to that of the controls by Scott-Knott's test at 5% probability, through the joint analysis in augmented block design.

Conclusions
1. The experimental corn (Zea mays) hybrids HT007, HT008, HT018, HT004, HT024, HT005, and HT071 show the same yield performance as the controls by Scott-Knott's test, at 5% probability, and by the joint analysis of four locations in augmented block design.
2. The analysis of experimental groups applied with augmented block design, followed by the classification of the genotypes performed by Scott-Knott's multiple comparison test, is a viable alternative to supply the following needs: a low number of replicates and a large number of genotypes; Scott-Knott's test provides a more objective way for differentiating the best treatments in the experiments, thus, among the

Table 1. Statistical analysis of variance.

Table 2. Analysis of F variance for the experimental sites in the municipalities located .

Table 3. Analysis of variance for the joint analysis of the four municipalities.