Yield related key traits in the selection of super sweetcorn hybrids

Super sweet corn breeding must develop hybrids that fully meet the expectations of the market. In this sense, the determination of key traits for the selection of sweet corn genotypes is a fundamental condition for breeding success. The objective of this work was to identify key traits for the selection of promising and contrasting genotypes of super sweet corn. The experiments were carried out in Guarapuava (PR), on two sowing dates. Seventeen traits related to the yield and quality of the ear were evaluated. To perform the multivariate analysis, data were subjected to diagnosis of multicollinearity, analysis of canonical variables, genetic divergence, hierarchical clusters, factor analysis, and canonical correspondence analysis. Grain yield, yield of dehusked ears, and number of commercial ears were considered yield-related key traits in the identification of promising super sweet corn genotypes. Hybrid D3-20 × D5-41 presented higher averages than the others, considering the yield-related key traits.


INTRODUCTION
Super sweetcorn (Zea mays var. saccharata) is exclusively used for human consumption, in natura or processed, due to its high concentration of sugars in the endosperm. The presence of one or more genes differentiates sweetcorn from common corn, conferring changes in quality (flavor, aroma, texture, and chemical composition), in the aspect of the plant and ear, and in seed viability, such as germination and vigor. Sweetcorn grains are used in the immature stage of development because of the high content of sugars in the endosperm (Alburquerque et al. 2008). The higher sugar content in the grains is due to the lower activity or absence of activity of the enzymes that turn the synthesized polysaccharides into starch and, as a consequence, the percentage of starch and the content of dry matter decrease (Tracy 2001). Among the several known mutant recessive genes that, separately or jointly, affect endosperm sugar content, eight have been employed in the development of cultivars: Amylose Extender (ae), Brittle (bt), Brittle2 (bt2), Dull (du), Shrunken2 (sh2), Sugary (su), Sugary Enhancer (se), and Waxy (wx) (Hallauer et al. 1988; Hossain et al. 2013). Even though there are several maize breeding programs in Brazil, which are responsible for the production of new hybrids that increase productivity and profitability in this sector, options of super sweetcorn hybrids are still scarce. In sweet corn breeding programs, a large number of traits are evaluated; however, in many cases, they are misinterpreted due to uncertainty about which traits are key in the selection of promising genotypes (Ayodeji et al. 2019; Taylor and Whelan 2010).
https://doi.org/10.1590/1678-4499.20200484

PLANT BREEDING Article
Because of the great demand and wide distribution of super sweetcorn cultivation, associated with the low dissemination of technical knowledge and the availability of only a few hybrids specifically for this purpose, the development of new studies, both for identification of key traits and for selection of promising and contrasting genotypes, is very important for the advancement of super sweetcorn genetic breeding (Hossain et al. 2013). Solomon et al. (2011) performed similar studies involving agronomic and quality traits of sweet corn and reaffirmed the need for further studies involving traits directly related to grain yield of this type of corn, in order to provide a faster and more efficient selection based on pre-determined key traits. Accordingly, the present work aims to fill this gap in the literature and to support, in scientific terms, selection based on key traits through differentiated and appropriate analyses, allowing a more assertive selection in sweet corn breeding programs.
The objective of this work was to identify key traits for the selection of promising and contrasting genotypes of super sweet corn.
The experiments were sown on two sowing dates (October 15th and December 17th, 2017), in a randomized complete block design with three replications. Each plot was made up of two rows 5 m long, spaced 0.8 m apart. The experiments were carried out in Guarapuava (PR, Brazil) (1100 m altitude, 25°21'S latitude, and 51°30'W longitude). The soil is classified as typical Dystrophic Bruno Latosol, with a very clayey texture (Nitsche et al. 2019).
The evaluated traits were: 1) days to male flowering (MF, days), the number of days between sowing and pollen release in 50% of the plants in the plot; 2) plant height (PH, cm), measured with a tape measure from the soil to the flag leaf on a sample of six plants per plot; 3) ear height (EH, cm), measured from the soil to the ear on a sample of six plants per plot; 4) yield of ears with husk (YWH, kg·ha⁻¹), for which all ears of the plot were weighed with husk and the weight was extrapolated to kg·ha⁻¹; 5) total number of ears (TNE); 6) number of commercial ears (NCE), counted as ears longer than 15 cm and more than 3 cm in diameter; 7) number of damaged ears (NDE), damaged by insects or pathogens; 8) yield of dehusked ears (YDE, kg·ha⁻¹), for which all ears of the plot were weighed without husk and the weight was extrapolated to kg·ha⁻¹; 9) yield of commercial ears (YCE, kg·ha⁻¹); 10) grain yield of commercial ears (GY, kg·ha⁻¹), considering only ears classified as commercial; 11) cob diameter (CD), measured on six representative samples with a digital caliper (model MA871); 12) ear length (EL, cm); 13) ear diameter (ED, mm); 14) grain length (GL, mm), measured with a digital caliper (model MA871) on a grain sample taken from the middle third of six randomly collected ears; 15) grain width (GW, mm); 16) grain dry matter content (DM, %), obtained from two 20.0 g samples that were weighed on a precision scale, dried in a forced-air circulation oven at 55 °C for 72 h, and weighed again, with DM calculated from the initial and final weights of each sample; and 17) soluble solids content of the grains (SS, °Brix), measured with a digital refractometer on grain samples taken at harvest.
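The extrapolation of plot weights to kg·ha⁻¹ and the dry matter calculation can be illustrated with a minimal sketch. It assumes that the plot area equals rows × row length × row spacing (8 m² for the plots described here) and that DM is the final weight expressed as a percentage of the initial weight; neither formula is spelled out in the text, and the function names and sample values are hypothetical.

```python
# Plot geometry from the text: two rows, 5 m long, spaced 0.8 m apart.
ROWS_PER_PLOT = 2
ROW_LENGTH_M = 5.0
ROW_SPACING_M = 0.8
# Assumption: plot area = rows x length x spacing (8 m^2 here).
PLOT_AREA_M2 = ROWS_PER_PLOT * ROW_LENGTH_M * ROW_SPACING_M


def plot_to_kg_per_ha(plot_weight_kg, plot_area_m2=PLOT_AREA_M2):
    """Scale a plot-level weight (kg) to kg per hectare (1 ha = 10,000 m^2)."""
    return plot_weight_kg * 10_000.0 / plot_area_m2


def dry_matter_percent(initial_g, final_g):
    """Assumed definition: final (dried) weight as a percentage of initial weight."""
    return 100.0 * final_g / initial_g


# Hypothetical values, for illustration only.
example_yield = plot_to_kg_per_ha(12.0)      # 12 kg of ears harvested in one plot
example_dm = dry_matter_percent(20.0, 5.0)   # 20.0 g sample weighing 5.0 g after drying
print(example_yield, example_dm)
```

With two samples per plot, the two DM values would presumably be averaged before analysis.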
The assumptions of normality and homogeneity of variances were evaluated by the Shapiro-Wilk and Levene tests, respectively. The data were subjected to multivariate analysis of variance (MANOVA), using the Genes statistical software (Cruz 2016) integrated with the R software (https://www.r-project.org).
The diagnosis of multicollinearity among the predictive variables was made using the Variance Inflation Factor (VIF) and the Condition Index (CI), aiming at eliminating their inter-relations and redundancy (Montgomery and Peck 1981; Prunier et al. 2015).
The genetic divergence among the genotypes was evaluated with the Unweighted Pair Group Method with Arithmetic Mean (UPGMA), based on the Mahalanobis generalized distance (Solomon et al. 2011). The groups in the dendrogram were established according to Mojena (1977). The canonical variables analysis was carried out with clustering by the Tocher method. The factor analysis was carried out in accordance with the following model: $X_j = I_{j1}F_1 + I_{j2}F_2 + \dots + I_{jm}F_m + \varepsilon_j$, where $X_j$ is the variable estimated in each plot, with $j = 1, 2, \dots, v$; $I_{jk}$ is the factor loading for the $j$-th variable associated with the $k$-th factor, with $k = 1, 2, \dots, m$; $F_k$ is the $k$-th common factor; and $\varepsilon_j$ is the specific factor associated with the $j$-th variable. The canonical correspondence analysis was carried out according to Ter Braak (1986). The means were subjected to Hotelling's T² test at 5% probability, in order to associate the results with those of the other analyses (Friendly and Sigal 2017). The analyses were carried out using the statistical software Genes (Cruz 2016) and SAS (SAS Institute 2011).

RESULTS AND DISCUSSION
Normality of errors and homogeneity of variances were observed for all the evaluated traits. The MANOVA showed significant differences among the vectors of genotype means for all the evaluated traits (p < 0.01), indicating the existence of genetic variability. There was no significant effect of the genotype × crop season interaction.
Multicollinearity among the traits was evaluated by the Variance Inflation Factor (VIF) and Condition Index (CI) criteria. VIF values above 10 and CI values above 100 were taken as evidence of substantial multicollinearity among variables, making it necessary to remove the corresponding predictors (Montgomery and Peck 1981; O'Brien 2007; Prunier et al. 2015). When working with a large number of variables, it is necessary to check for strong correlations among them, because highly correlated variables carry redundant information and, if kept in the analysis for the identification of key traits, can compromise the final result (Montgomery and Peck 1981). Five traits that presented high VIF and CI values were eliminated: cob diameter (CD), yield of ears with husk (YWH), total number of ears (TNE), yield of commercial ears (YCE), and number of damaged ears (NDE). On the other hand, the traits days to male flowering (MF), plant height (PH), ear height (EH), grain length (GL), grain width (GW), ear length (EL), ear diameter (ED), yield of dehusked ears (YDE), number of commercial ears (NCE), grain yield (GY), content of soluble solids (SS), and grain dry matter (DM) presented VIF and CI values below 10 and 100, respectively, and were kept for subsequent analyses (Table 1). Taylor and Whelan (2010) and Gonçalves et al. (2018) also used multivariate techniques for traits related to the selection of sweet corn genotypes. As in the present study, the multivariate analysis allowed the number of traits to be reduced, increasing the chances of a more effective selection, since the use of fewer traits reduces the effect of their inter-relations and avoids redundancy and misinterpretation in the selection of promising genotypes (Aaliya et al. 2016).
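A minimal sketch of this kind of screening is given here, on synthetic data. It assumes that the VIF of each trait is the corresponding diagonal element of the inverse correlation matrix and that the condition index is the ratio between the largest and smallest eigenvalues of that matrix, a definition that is compatible with the threshold of 100 but is not stated in the text. The function name and the data are hypothetical, and the code is not the Genes implementation.

```python
import numpy as np


def vif_and_condition_number(X):
    """Return per-trait VIFs and the condition number of the trait correlation matrix."""
    X = np.asarray(X, dtype=float)
    R = np.corrcoef(X, rowvar=False)      # trait-by-trait correlation matrix
    vif = np.diag(np.linalg.inv(R))       # VIF_j = j-th diagonal element of R^-1
    eigvals = np.linalg.eigvalsh(R)       # eigenvalues of the symmetric matrix R
    condition_number = eigvals.max() / eigvals.min()  # assumed CI definition
    return vif, condition_number


# Synthetic traits: x3 is almost the sum of x1 and x2, so it is redundant.
rng = np.random.default_rng(42)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
x3 = x1 + x2 + rng.normal(scale=0.05, size=200)

vif_bad, cn_bad = vif_and_condition_number(np.column_stack([x1, x2, x3]))
vif_ok, cn_ok = vif_and_condition_number(np.column_stack([x1, x2]))

print("with redundant trait:", vif_bad.round(1), round(cn_bad, 1))
print("after removing it:   ", vif_ok.round(2), round(cn_ok, 2))
```

Dropping the redundant column brings both diagnostics back below the thresholds, which mirrors the removal of the five traits described above.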
In order to identify key traits, it is necessary to have variability among genotypes in relation to the traits previously selected by the multicollinearity analysis. One of the main ways to identify variability is the construction of dendrograms, which allow the visualization of possible distinct groups (Friendly and Sigal 2017; Morales et al. 2011). It is recommended that at least three groups be formed so that one can proceed with the analysis of key traits without bias in the results.
The relationships among the 64 super sweetcorn genotypes can be observed in the dendrogram based on the Mahalanobis generalized distance, clustered by the UPGMA method (Fig. 1). The high value of the cophenetic correlation coefficient (CCC = 0.75) indicates a good fit between the original data and the dissimilarity matrix, providing good accuracy in the observed results. The dendrogram shows four main groups, which reinforces the presence of variability among the evaluated traits, since the dendrogram is based on phenotypic traits (Friendly and Sigal 2017; Morales et al. 2011).
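The clustering steps described here (Mahalanobis distance, UPGMA, and the cophenetic correlation coefficient) can be sketched with SciPy as follows. The genotype means and the residual covariance matrix are synthetic placeholders, not the data of this study, and SciPy returns the Mahalanobis distance D rather than the squared form D² that some packages report; the cut into three groups is likewise only illustrative.

```python
import numpy as np
from scipy.cluster.hierarchy import cophenet, fcluster, linkage
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)

# Synthetic genotype means: 12 genotypes x 3 traits, built as three shifted groups.
genotype_means = np.vstack(
    [rng.normal(loc=center, scale=0.5, size=(4, 3)) for center in (0.0, 4.0, 8.0)]
)

# Assumption: a residual covariance matrix would normally come from the MANOVA;
# a simple diagonal matrix stands in for it here.
residual_cov = 0.25 * np.eye(3)

# Pairwise Mahalanobis distances (condensed form) between genotype means.
distances = pdist(genotype_means, metric="mahalanobis", VI=np.linalg.inv(residual_cov))

# UPGMA corresponds to average linkage.
tree = linkage(distances, method="average")

# Cophenetic correlation: agreement between the dendrogram and the original distances.
ccc, _ = cophenet(tree, distances)

# Cut the tree into (at most) three groups, for illustration.
groups = fcluster(tree, t=3, criterion="maxclust")

print(round(float(ccc), 3), groups)
```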
The result presented by the dendrogram was confirmed and supported by the analysis of canonical variables clustered by the Tocher method (Fig. 2), which shows different groups. The Tocher optimization method is based on the formation of groups whose distances within the groups are shorter than the distances among groups (Cruz et al. 2014), being efficient for the determination of different groups based on different traits when associated with the techniques of dissimilarity and canonical variables analyses (Hossain et al. 2013; Taylor and Whelan 2010).
The study and identification of different groups help both in determining future crosses and in verifying existing genetic variability, a fundamental criterion for the determination of key characters and, consequently, for the selection of promising genotypes (Gonçalves et al. 2018;Hallauer et al. 2010;Solomon et al. 2011).
The first two canonical variables explained 83.92% of the total variation among the 64 genotypes, considering the twelve selected traits. The graphic dispersion of the scores of the first two canonical variables matches the previous clusters and expresses the variability between genotypes, which consolidates the observed results and supports the use of the chosen traits for the selection of promising genotypes of super sweet corn. The first canonical variable (CV1) explained 57.65% of the total variation, and the second (CV2) was responsible for 26.27% (Fig. 2). The Tocher clustering method, based on the analysis of canonical variables, once more pointed out hybrid D3-20 × D5-41. A second group was formed by hybrids D3-30 × D5-43 and D3-20 × D5-43. However, hybrids D3-33 × D5-53 and D3-20 × D5-55, which previously were in the same group as the hybrids mentioned before, became a separate group (Fig. 2).
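How the share of variation explained by each canonical variable arises can be sketched as follows: the shares are the eigenvalues of the product of the inverse within-genotype (residual) matrix and the between-genotype matrix, each divided by their sum. This is a simplified one-way layout on synthetic data, ignoring blocks and sowing dates, and the function name is hypothetical; it is not the procedure implemented in Genes.

```python
import numpy as np


def canonical_variable_shares(X, groups):
    """Proportion of between-group variation captured by each canonical variable."""
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)
    grand_mean = X.mean(axis=0)
    p = X.shape[1]
    W = np.zeros((p, p))   # within-group sums of squares and products
    B = np.zeros((p, p))   # between-group sums of squares and products
    for g in np.unique(groups):
        Xg = X[groups == g]
        mean_g = Xg.mean(axis=0)
        W += (Xg - mean_g).T @ (Xg - mean_g)
        d = (mean_g - grand_mean)[:, None]
        B += len(Xg) * (d @ d.T)
    # Eigenvalues of W^-1 B; tiny negative or imaginary parts are numerical noise.
    eigvals = np.linalg.eigvals(np.linalg.solve(W, B)).real
    eigvals = np.sort(np.clip(eigvals, 0.0, None))[::-1]
    return eigvals / eigvals.sum()


# Synthetic example: 4 genotypes x 3 replications x 3 traits.
rng = np.random.default_rng(1)
genotype_effects = np.array(
    [[0.0, 0.0, 0.0], [2.0, 1.0, 0.0], [4.0, 0.0, 1.0], [6.0, 2.0, 0.5]]
)
groups = np.repeat(np.arange(4), 3)
X = genotype_effects[groups] + rng.normal(scale=0.5, size=(12, 3))

shares = canonical_variable_shares(X, groups)
print(shares.round(4), shares[:2].sum().round(4))
```

The sum of the first two shares plays the role of the 83.92% reported above (57.65% + 26.27%).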
The factor analysis is a multivariate statistical method which has been recently applied in agronomic studies (Keith and Reynolds 2018). This analysis aims at explaining the relationships observed among traits by removing possible redundancies or the duplication of a set of correlated phenotypical data (Cattell 1965). This method allows selecting important traits to explore their relationships and variations, in addition to generating important information between factors and genotypes (Vile et al. 2012).
In the factor analysis, the traits are replaced by a smaller number of latent characters, which are clustered so that there is little variance within the groups, as well as maximum variation among the groups (Cruz et al. 2014). This analysis, associated with the canonical variables and the canonical correspondence analyses, makes it possible to efficiently select the traits that best discriminate genotypes for a certain purpose. In this work, these analyses can help in the selection of key traits and of superior super sweet corn hybrids.
The traits MF, PH, EH, GL, GW, EL, ED, YDE, NCE, GY, SS, and DM were used to carry out the factor analysis, which expressed the proportion of the existing variation in the dependent traits. Most of these traits were also studied by other authors, being considered as determinants for super sweet corn breeding (Ayodeji et al. 2019; Solomon et al. 2011). The variation can be explained by two predictors (Factors 1 and 2) (Fig. 3), making the factor analysis adequate to select the key traits related to the selection of promising super sweet corn hybrids (O'Brien 2007).
The communality values indicate the relationship between the evaluated traits and the objective of the study, since they are directly related to the contribution of each trait to the variation, and they are decisive in the selection of key traits (Cattell 1965). Communality values above 0.80 have been accepted as optimal for satisfactory final results, since they are equivalent to a correlation of about 0.90 or higher between the standardized variable ($X_j$) and the common part that explains this variable ($Z_j$) (Keith and Reynolds 2018).
The communality values ranged from 0.1828 (GW) to 0.8601 (GY). High values of communality are related to traits with a great contribution to the variability and to the determination of superior hybrids based on these traits, reaffirming that the selected traits can be described as key traits (Table 2). Solomon et al. (2011) used the same criteria in the identification of key traits for the selection of superior genotypes and observed that the methodology was efficient in the objective of the work.
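The link between communality and correlation can be checked with a short calculation. Under an orthogonal factor model, the communality of a trait is the sum of its squared loadings, and its square root is the correlation between the standardized trait and its common part. The loadings used here are hypothetical; only the two communalities (0.1828 and 0.8601) come from the text.

```python
import math


def communality(loadings):
    """Sum of squared factor loadings of one trait (orthogonal factors assumed)."""
    return sum(value * value for value in loadings)


# Hypothetical loadings of one trait on Factors 1 and 2.
h2_example = communality([0.90, 0.20])

# Correlation between the standardized trait and its common part: sqrt(communality).
corr_threshold = math.sqrt(0.80)   # the 0.80 cut-off used in the text
corr_gy = math.sqrt(0.8601)        # GY, highest communality reported
corr_gw = math.sqrt(0.1828)        # GW, lowest communality reported

print(round(h2_example, 4), round(corr_threshold, 3), round(corr_gy, 3), round(corr_gw, 3))
```

The cut-off of 0.80 corresponds to a correlation of roughly 0.89, which is the "about 0.90" equivalence mentioned above.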
According to the factor analysis, the first factor was decisive for the definition of key traits YDE, NCE, and GY (Table 2), considering estimates above 0.80 for Factor 1 and low estimates for Factor 2 (Fig. 3), which highlights the greater representativeness of Factor 1 for these traits. Consequently, these traits contributed the most to the expression of genetic variance and the formation of different groups, in addition to the possibility of identifying superior genotypes (Keith and Reynolds 2018). The other traits presented intermediate estimates between both factors, representing a certain level of influence on the selection of the hybrids, but they were not as significant as YDE, NCE, and GY.
Research related to key traits has also been carried out by other authors, and its results indicate that the selection of promising genotypes based on these traits can be considered efficient, in addition to reducing costs and labor. However, most studies of this kind have been conducted, so far, on common corn with different purposes (Badu-Apraku et al. 2020;Harrison et al. 2014). There are still few works that present possible key traits related to yield in super sweet corn (Solomon et al. 2011).
According to the factor analysis, the selected traits can be decisive in the selection of promising hybrids. Therefore, the canonical correspondence analysis was carried out. It is an exploratory technique for simplifying the structure of multivariate data variability, in which the variables are organized in contingency tables, taking into account correspondence measures between the rows and the columns of the data matrix. According to Ter Braak (1987), the canonical correspondence analysis is a method for determining a system of association among the elements of two or more sets, trying to explain the association structure of the factors in question. Thus, graphs are built with the principal components of the rows and columns, allowing the visualization of the relationship between the sets, where the proximity of row and column points indicates an association, whereas distance indicates repulsion.
The canonical correspondence analysis reveals relationships that would not be noticed if the analysis were done on pairs of variables. It is also highly flexible in the treatment of the data, because no theoretical probability distribution model needs to be adopted: a rectangular matrix containing non-negative data is sufficient, which makes it possible to relate the influence of different traits to specific genotypes (Nyfjäll 2002).
The canonical correspondence analysis explained 96.87% of the total variation between the genotypes and the respective evaluated traits. The first canonical correspondence axis (CCA1) explained 93.70% of the total variation, and the second (CCA2) explained 3.17% (Fig. 3). It is possible to verify that the greatest total variation was already explained in the first CCA, which is desirable, since it increases the accuracy between the cluster and the estimated scores (Nyfjäll 2002;Ter Braak 1987).
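The way axis percentages of this kind are obtained can be sketched with a plain (unconstrained) correspondence analysis of a non-negative rows × columns matrix, computed through a singular value decomposition. This is a simplification: it is not the constrained method of Ter Braak nor the routine used in this study, and the matrix, the function name, and the values are hypothetical.

```python
import numpy as np


def correspondence_analysis(N):
    """Simple correspondence analysis of a non-negative matrix via SVD."""
    N = np.asarray(N, dtype=float)
    P = N / N.sum()                         # correspondence matrix
    r = P.sum(axis=1)                       # row masses
    c = P.sum(axis=0)                       # column masses
    expected = np.outer(r, c)
    S = (P - expected) / np.sqrt(expected)  # standardized residuals
    U, s, Vt = np.linalg.svd(S, full_matrices=False)
    inertia = s ** 2                        # inertia (variation) of each axis
    shares = inertia / inertia.sum()
    row_coords = (U * s) / np.sqrt(r)[:, None]      # principal coordinates of rows
    col_coords = (Vt.T * s) / np.sqrt(c)[:, None]   # principal coordinates of columns
    return shares, row_coords, col_coords, r


# Hypothetical matrix: 4 genotypes (rows) x 3 traits (columns), non-negative values.
table = np.array(
    [
        [30.0, 12.0, 5.0],
        [22.0, 15.0, 9.0],
        [10.0, 20.0, 14.0],
        [6.0, 18.0, 25.0],
    ]
)

shares, row_coords, col_coords, row_masses = correspondence_analysis(table)
print(shares.round(4))
```

Row and column points that fall close together in the first two coordinates are the ones read as associated in the biplot.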
According to the biplot of the canonical correspondence analysis, traits YDE, GY, and NCE were decisive in the selection of superior super sweet corn hybrids in the set of the evaluated genotypes (Fig. 4), because they are close to the intersection of the axes, making it possible to state that these are the main yield-related traits, considering their greater influence on all genotypes. The traits DM, SS, MF, GW, EL, GL, ED, PH, and EH presented little influence on the discrimination of the hybrids, which indicates that it is possible to reduce the number of traits to be evaluated in future studies (Nyfjäll 2002; Ter Braak 1987), although this does not mean that traits related to ear aspects and eating quality should be disregarded in the selection.
It can be stated that key traits YDE, GY, and NCE were decisive in discriminating promising hybrids, based on the adopted analyses, and may allow a more assertive selection of super sweet corn genotypes.
After determining the key traits, a means comparison test was carried out in order to select the most promising genotypes. According to Fig. 5, it can be seen that hybrid D3-20 × D5-41 was superior to the others, even surpassing the performance of the commercial hybrids used as checks.