Development and selection of super-sweet corn genotypes (sh2) through multivariate approaches

The aim of the present study was to investigate the relations among ten traits in super-sweet corn genotypes by means of simple correlation, path and canonical variable analyses, and to determine the relative importance of these traits to the super-sweet corn breeding program developed at Darcy Ribeiro Northern Fluminense State University, in order to develop strategies that improve the efficiency of selecting superior genotypes. To this end, trials comprising a 3 × 6 partial diallel of super-sweet (sh2) corn were carried out in a randomized block design (RBD) with four repetitions, in two environments in Northern Rio de Janeiro State, Brazil (Itaocara and Campos dos Goytacazes counties). The correlation study showed that ear diameter and useful ear length contributed the most to increasing ear yield (without husk); ear diameter stood out for having a stronger direct effect on ear yield and for presenting high heritability (0.95). The trait number of grains per ear row contributed the most to the variation between hybrids, whereas useful ear length contributed the least. The canonical variables showed that the genetic backgrounds of the sh2-gene donor populations still affected the recurrent populations after five backcrossing cycles, resulting in the formation of two divergent groups.


INTRODUCTION
Corn (Zea mays L.) is one of the most widely cultivated species in the world. It accounts for 38.8% of the total cereal, legume and oilseed production in Brazil and its annual grain production reaches 82.7 million tons (IBGE 2016). In addition, it has enormous potential for sweet corn production; however, such potential is yet to be explored. It can be used as canned or frozen cobs or grains, as well as dehydrated or as baby corn, depending on the harvest point (Tracy 2001; Melo et al. 2014).
Sweet corn is among the most profitable vegetables (ABCSEM 2014); however, nowadays, it has limited productivity due to low supply of quality cultivars to several regions in the country. The main cultivar traded in Brazil is Tropical Plus (sh2), which was registered in 2005 by the multinational company Syngenta and is recommended for cultivation from Northern to Southern Brazil (MAPA 2016).
The super-sweet group shows total sugar content higher than that of the sweet group, and up to 15 times higher than that of field corn grains (Tosello 1987; Azanza et al. 1994; Tracy 2001). In addition, according to Goldman and Tracy (1994), the protein content in sh2-gene carriers is up to 30% higher than that found in su1 carriers with the same genetic background.
Sweet corn breeding can follow two approaches: subjecting a sweet germplasm to a routine breeding program (Parentoni et al. 1990), or introducing the sweet (monogenic recessive) trait from any genetic source into a normal-endosperm germplasm developed through classical breeding methods (Santos et al. 2014). However, the second option (introducing the gene through backcrossing) requires new selection stages because, unlike field corn, whose main product is the grain, the commercial product of sweet corn is the whole cob (Tracy 2001). Therefore, it requires special attention in breeding programs.
Studies on the direct and indirect relations between yield and several plant and ear traits (Entringer et al. 2014; Cabral et al. 2016), as well as on the relative importance of each trait to genotype variability, are needed to increase the efficiency of breeding programs. For instance, ear yield without husk (EY) has a high direct effect on fresh grain yield in sweet corn crops (Ilker 2011). Thus, EY is a more practical measure that may reflect the interest of the industry. Accordingly, the aim of the present study was to investigate the relationship between traits, as estimated through simple correlation and path analyses, as well as to investigate the relative importance of such traits to the super-sweet corn breeding program developed at Darcy Ribeiro Northern Fluminense State University.

MATERIALS AND METHODS
Based on 10 × 10 partial diallel trials conducted in two different environments, nine inbred lines were selected, according to their general and specific combining ability, to be used in a more advanced stage of the Super-Sweet Corn Breeding Program developed at Darcy Ribeiro Northern Fluminense State University. These nine inbred lines were used to perform a new 3 × 6 partial diallel trial, which resulted in 18 single-cross hybrids in total. The lines used here were derived from the base populations CIMMYT8 and Piranão8, in the 8th cycle of reciprocal recurrent selection. Both populations were backcrossed five times with two shrunken-2 (sh2) mutant gene populations (SH2 and SH28HS), which resulted in the following super-sweet populations: SH2-CIMMYT8 (CSH), SH2-8HS-CIMMYT (C8HS), SH2-Piranão (PSH) and SH2-8HS-Piranão (P8HS) (Fig. 1). Three of the nine inbred lines were derived from the CSH population, three from the PSH population and three from the P8HS population. Nine single-cross hybrids were formed through the inbred line cross CSH × PSH (SC1, SC2, SC3, SC7, SC8, SC9, SC13, SC14, and SC15) and nine through the inbred line cross CSH × P8HS (SC4, SC5, SC6, SC10, SC11, SC12, SC16, SC17, and SC18). A super-sweet corn (sh2) control from Syngenta (Tropical Plus) was used as reference.
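The crossing scheme can be laid out explicitly. The minimal sketch below assumes that each of the three CSH lines was crossed with the six Piranão-derived lines (three PSH followed by three P8HS) and that the hybrids were numbered row by row; under that assumption it reproduces the SC labels listed above. The line labels (`CSH-1`, `PSH-1`, and so on) are hypothetical placeholders, not names used in the study.

```python
from itertools import product

# Hypothetical line labels; only the population names come from the text.
csh_lines = [f"CSH-{i}" for i in range(1, 4)]
piranao_lines = [f"PSH-{i}" for i in range(1, 4)] + [f"P8HS-{i}" for i in range(1, 4)]

# Assumed numbering: hybrids are counted row by row, one row per CSH line.
hybrids = {
    f"SC{n}": (csh, pir)
    for n, (csh, pir) in enumerate(product(csh_lines, piranao_lines), start=1)
}


def hybrids_from(population):
    """Return the SC numbers whose Piranão-side parent belongs to `population`."""
    return [
        int(name[2:])
        for name, (_, pir) in hybrids.items()
        if pir.startswith(population + "-")
    ]


print(len(hybrids))          # number of single-cross hybrids in the 3 x 6 diallel
print(hybrids_from("PSH"))   # hybrids from CSH x PSH
print(hybrids_from("P8HS"))  # hybrids from CSH x P8HS
```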
The traits assessed in the experiments were: ear yield (without husk) (EY); grain filling (GF), scored from 1 to 10, where 1 is for cobs with few grains and 10 for cobs filled with grains; ear diameter (ED); number of rows per ear (RE); number of grains per ear row (GR); grain length × thickness product (LT); useful ear length (EL); ear insertion height (EH); plant height (PH); and silking days (SD).
Phenotypic (Eq. 1), genotypic (Eq. 2) and environmental (Eq. 3) correlation estimates were calculated based on variance and joint covariance analyses. The genetic statistical parameters were set through the estimators (Eqs. 4 to 11), including:
$$\sigma^2_f = \frac{QMc}{r\,l} \quad \text{(phenotypic variance)} \tag{6}$$
$$\phi_g = \frac{QMc - QMe}{r\,l} \quad \text{(genotypic quadratic component)} \tag{7}$$
$$VC_e\,(\%) = \frac{100\sqrt{QMe}}{\mu} \quad \text{(experimental variation coefficient)} \tag{8}$$
$$VC_g\,(\%) = \frac{100\sqrt{\phi_g}}{\mu} \quad \text{(genotypic variation coefficient)} \tag{9}$$
$$VI = \frac{VC_g}{VC_e} \quad \text{(variation index)} \tag{10}$$
$$H^2 = \frac{\phi_g}{\sigma^2_f} \quad \text{(genotypic determination coefficient)} \tag{11}$$
where: QMc corresponds to the mean square of crosses; QMe corresponds to the mean square of the error; r is the number of repetitions; l is the number of locations; μ is the trait mean; and q is the tabulated value of the total studentized range (least significant difference at p < 0.05).
The path analysis was conducted by solving the equation X'Xβ = X'Y, wherein X'X is the non-singular matrix of correlations between the explanatory variables; β is the column vector of path coefficients; and X'Y is the column vector of correlations between the explanatory variables and the dependent one. The correlation matrix was based on genotypic correlations; EY was taken as the basic variable and the other variables (GF, ED, RE, GR, LT, EL, EH, PH, and SD) as explanatory.
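The estimators for the phenotypic variance, the genotypic quadratic component, the variation coefficients, the variation index and the genotypic determination coefficient can be sketched in a few lines. The function name, the dictionary keys and the input values below are hypothetical and serve only to illustrate the arithmetic; they are not data from the trials.

```python
import math


def genetic_parameters(qmc, qme, r, l, mean):
    """Estimate genetic parameters from the mean squares of a joint analysis.

    qmc: mean square of crosses; qme: mean square of the error;
    r: number of repetitions; l: number of locations; mean: overall trait mean.
    """
    sigma2_f = qmc / (r * l)            # phenotypic variance
    phi_g = (qmc - qme) / (r * l)       # genotypic quadratic component
    vc_e = 100 * math.sqrt(qme) / mean  # experimental variation coefficient (%)
    # A negative phi_g estimate is truncated at zero before taking the root.
    vc_g = 100 * math.sqrt(max(phi_g, 0.0)) / mean  # genotypic variation coefficient (%)
    return {
        "sigma2_f": sigma2_f,
        "phi_g": phi_g,
        "VCe": vc_e,
        "VCg": vc_g,
        "VI": vc_g / vc_e,              # variation index
        "H2": phi_g / sigma2_f,         # genotypic determination coefficient
    }


# Made-up mean squares for one trait, four repetitions and two locations.
params = genetic_parameters(qmc=100.0, qme=20.0, r=4, l=2, mean=10.0)
for name, value in params.items():
    print(f"{name}: {value:.4f}")
```

Because H² reduces to (QMc − QMe)/QMc, it depends only on the ratio between the two mean squares, while VI reduces to the square root of фg/QMe.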
Multivariate analyses were conducted by using canonical variables to assess genetic divergence between hybrids, whereas the Mahalanobis D² was used as dissimilarity measure to assess the relative importance of the traits. Statistical analyses were performed in the GENES software (Cruz 2013).

RESULTS
The analysis of variance (Table 1) showed significant differences among crosses (p < 0.01) in all the studied traits, which indicates variability among the investigated genotypes. All the variables in the Piranão group showed significant general combining ability (GCA), except for ear yield (EY), whereas useful ear length (EL) was the only variable that did not show significant GCA in the CIMMYT group. All the variables showed significant specific combining ability (SCA), except for grain filling (GF) and grain length × thickness product (LT). EL and plant height (PH) were the only variables among the ten assessed that did not show interaction with the environment in any of the effects. The mean EY of the hybrids exceeded that of the control in absolute values, although not significantly. Nevertheless, the fact that the 18 genotypes show this trend on average highlights the genetic potential of the crosses and allows superior genotypes to be identified. The mean values of number of grains per ear row (GR), LT and EL in the hybrids were not significantly different from those of the control. As for the other traits, the control showed better GF, larger ear diameter (ED) and more rows per ear (RE), as well as lower ear insertion height (EH), PH and silking days (SD), than the mean of the hybrids.
VCe values ranged from 1.92% (SD) to 15.04% (EY), which supports the reliability of the results. For traits other than EY, VCe estimates were below 7.40%; ED (2.64%), RE (3.66%) and SD (1.92%) showed the lowest values.
The genetic variation coefficient (VCg) ranged from 2.62 (EL) to 13.22 (EY). The highest estimates were found in EY, followed by RE and LT; among the remaining traits, EL (2.62), PH (2.95) and SD (2.93) had low VCg estimates, which reduces the likelihood of progress through selection.
RE, ED and SD presented the highest variation index (VI) values (2.61, 1.56 and 1.53, respectively). Among these traits, only RE showed a high VCg value, whereas ED and SD presented low VCe, which is why they showed high variation indices. EL and PH presented the lowest VIs because of their low VCg values. Overall, the assessed traits showed high H² estimates; it is worth highlighting that RE, ED and SD showed values above 0.95, whereas ED, GF, GR, LT, and EH showed estimates above 0.86.
Table 2 shows the phenotypic, genetic and environmental correlation estimates of the traits investigated in the current study. Trait pairs EY and ED (0.60), ED and GR (0.60), and LT and SD (0.78) presented the highest phenotypic correlation values. On the other hand, trait pairs RE and SD (-0.85), RE and LT (-0.67), and ED and SD (-0.52) showed the highest negative phenotypic correlations. Trait pairs EY and EL (0.74), ED and GR (0.62), and ED and EH (0.65) showed the highest positive genetic correlations. The genotypic correlations were predominantly higher than the phenotypic ones and had the same sign, which indicates little environmental influence on trait expression. EY showed positive and significant genetic correlation with three explanatory variables, namely RE, ED, and EL (0.38, 0.65, and 0.74, respectively); on the other hand, it was negatively correlated with three other traits, with magnitudes of -0.19 (SD and EY) and -0.25 (ED and EY). GR, LT and EH presented non-significant genotypic, phenotypic and environmental correlations with the basic variable EY.
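The three kinds of correlation reported in Table 2 can be illustrated from the mean squares and mean cross-products of a joint analysis of two traits. The sketch below uses a common textbook form of these estimators, which may differ in detail from the estimators used in the study; the function name, the argument names (`pmc_xy` and `pme_xy` for the mean cross-products of crosses and error) and the input values are hypothetical.

```python
import math


def correlations(qmc_x, qmc_y, pmc_xy, qme_x, qme_y, pme_xy):
    """Phenotypic, genotypic and environmental correlations between traits x and y.

    qmc_*: mean squares of crosses; qme_*: mean squares of the error;
    pmc_xy, pme_xy: mean cross-products of crosses and of the error.
    The divisor r*l of the genotypic components cancels out in the ratio.
    """
    r_f = pmc_xy / math.sqrt(qmc_x * qmc_y)
    r_g = (pmc_xy - pme_xy) / math.sqrt((qmc_x - qme_x) * (qmc_y - qme_y))
    r_e = pme_xy / math.sqrt(qme_x * qme_y)
    return r_f, r_g, r_e


# Made-up mean squares and cross-products for two traits.
r_f, r_g, r_e = correlations(qmc_x=10.0, qmc_y=40.0, pmc_xy=12.0,
                             qme_x=1.0, qme_y=4.0, pme_xy=1.0)
print(f"phenotypic={r_f:.3f} genotypic={r_g:.3f} environmental={r_e:.3f}")
```

In this made-up case the genotypic correlation is slightly higher than the phenotypic one and has the same sign, which is the pattern described for most trait pairs.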
Most trait pairs showed low environmental correlation estimates; the highest values were 0.66 (EH and PH) and 0.56 (GR and EL). It is worth emphasizing that the environmental correlations between EY and the other traits were of low magnitude: they ranged from 0.01 to 0.23 and most of them were not significant. The multicollinearity diagnosis applied to the genetic correlation matrix showed weak collinearity (21.47); therefore, it was not necessary to transform the data to improve relations between variables or to exclude variables in order to conduct the path analysis. The variables used in the analysis were able to explain 87.46% (R²) of the variation in EY.
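A multicollinearity diagnosis of this kind is often based on the condition number of the correlation matrix, that is, the ratio between its largest and smallest eigenvalues. The sketch below assumes that this is the statistic behind the reported value of 21.47 and applies the commonly cited thresholds (below 100: weak; 100 to 1000: moderate to strong; above 1000: severe). Both the thresholds and the small example matrix are assumptions for illustration.

```python
import numpy as np


def condition_number(corr):
    """Ratio between the largest and smallest eigenvalues of a symmetric matrix."""
    eigvals = np.linalg.eigvalsh(np.asarray(corr, dtype=float))  # ascending order
    return float(eigvals[-1] / eigvals[0])


def classify_collinearity(cn):
    """Classify a condition number with the assumed thresholds (100 and 1000)."""
    if cn < 100:
        return "weak"
    if cn <= 1000:
        return "moderate to strong"
    return "severe"


# Made-up 2 x 2 correlation matrix between two explanatory variables.
example = [[1.0, 0.5],
           [0.5, 1.0]]
cn = condition_number(example)
print(cn, classify_collinearity(cn))
print(classify_collinearity(21.47))  # value reported for the genetic correlation matrix
```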
According to the path analysis (Table 3), ED and EL had high direct effects on the basic variable EY and showed moderate and significant total correlation with it. RE, PH and SD presented high negative direct effects on EY but showed weak total correlation with it, as did EH, which had a high positive direct effect on EY.
Table 3. Decomposition of the genotypic correlation coefficients, by path analysis, into the effects of nine explanatory traits on the basic trait ear yield (without husk), assessed in 18 diallel single-cross hybrids of super-sweet corn.
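The decomposition into direct and indirect effects follows from the normal equations X'Xβ = X'Y given in the methods. The sketch below is a minimal illustration with two made-up explanatory variables; the function name and the correlation values are hypothetical, and the residual effect is computed with the usual expression √(1 − R²).

```python
import numpy as np


def path_analysis(r_xx, r_xy):
    """Split correlations with the basic variable into direct and indirect effects.

    r_xx: correlation matrix between the explanatory variables (X'X);
    r_xy: correlations between each explanatory variable and the basic one (X'Y).
    """
    r_xx = np.asarray(r_xx, dtype=float)
    r_xy = np.asarray(r_xy, dtype=float)
    direct = np.linalg.solve(r_xx, r_xy)   # path coefficients (beta)
    # effects[i, j] = r_ij * beta_j: the diagonal holds the direct effects and the
    # off-diagonal entries hold the indirect effect of variable i via variable j.
    effects = r_xx * direct
    r2 = float(direct @ r_xy)              # coefficient of determination
    residual = float(np.sqrt(max(1.0 - r2, 0.0)))
    return direct, effects, r2, residual


# Made-up correlations for two explanatory variables.
r_xx = [[1.0, 0.5],
        [0.5, 1.0]]
r_xy = [0.6, 0.5]
direct, effects, r2, residual = path_analysis(r_xx, r_xy)
print("direct effects:", direct)
print("row sums (total correlations):", effects.sum(axis=1))
print("R2:", r2, "residual effect:", residual)
```

Each row of `effects` adds up to the total correlation of that variable with the basic variable, which is why a trait can combine a negative direct effect with a positive total correlation when its indirect effects dominate.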

The relative importance of the ten investigated variables to the genetic divergence of the hybrids was measured based on the Mahalanobis D² estimate, according to the method by Singh (1981). RE contributed the most to the variation (36.9%), whereas EL and PH presented the lowest contributions (1.8% and 2.7%, respectively), which indicates the possibility of discarding these variables in divergence studies comprising these genotypes (Fig. 2).
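The relative contribution of each trait to D² can be sketched as follows. The code assumes the usual description of Singh's method, in which the D² between two genotypes, d'E⁻¹d, is split into one term per trait and these terms are summed over all pairs of genotypes; the function name, the matrix of means and the residual covariance matrix are made up for illustration.

```python
from itertools import combinations

import numpy as np


def singh_contribution(means, resid_cov):
    """Percentage contribution of each trait to the total Mahalanobis D2.

    means: genotypes x traits matrix of means;
    resid_cov: residual covariance matrix between traits (E).
    """
    means = np.asarray(means, dtype=float)
    omega = np.linalg.inv(np.asarray(resid_cov, dtype=float))
    s = np.zeros(means.shape[1])
    for i, k in combinations(range(means.shape[0]), 2):
        d = means[i] - means[k]
        # d @ omega @ d is the D2 of this pair; d * (omega @ d) holds its per-trait terms.
        s += d * (omega @ d)
    return 100 * s / s.sum()


# Made-up means of three genotypes for two uncorrelated traits with unit variance.
means = [[0.0, 0.0],
         [1.0, 0.0],
         [0.0, 2.0]]
print(singh_contribution(means, np.eye(2)))
```

With correlated traits an individual term can be negative, so the percentages are best read as a ranking of the traits.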
The canonical variable dispersion chart explained 73.23% of the variation in the sum of canonical variables (CV) 1 and 2. As most of the dispersion lies on CV1, it is possible to see the grouping of hybrids derived from lines belonging to the PSH population (SC1, SC2, SC3, SC7, SC8, SC9, SC13, SC14 and SC15); these hybrids are separated from those derived from lines belonging to the P8HS population (SC4, SC5, SC6, SC10, SC11, SC12, SC16, SC17 and SC18); the CSH lines are able to discriminate the two groups. The Tropical Plus control presented great similarity with hybrids derived from P8HS × CSH lines (Fig. 3).
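The share of variation retained by the first canonical variables can be sketched as follows, assuming that the canonical variables are obtained from the generalized eigenproblem Ta = λEa, where T is the covariance matrix between genotype means and E is the residual covariance matrix. The function name and both matrices in the example are made up; the study's own matrices are not available here.

```python
import numpy as np


def canonical_variables(T, E):
    """Solve T a = lambda E a and return eigenvalues, coefficients and cumulative shares."""
    T = np.asarray(T, dtype=float)
    E = np.asarray(E, dtype=float)
    L = np.linalg.cholesky(E)              # E = L L'
    L_inv = np.linalg.inv(L)
    A = L_inv @ T @ L_inv.T                # symmetric matrix with the same eigenvalues
    eigvals, eigvecs = np.linalg.eigh(A)   # ascending order
    order = np.argsort(eigvals)[::-1]      # largest eigenvalue first
    eigvals = eigvals[order]
    coefs = L_inv.T @ eigvecs[:, order]    # columns are the coefficient vectors a
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    return eigvals, coefs, cumulative


# Made-up matrices for two traits.
T = [[3.0, 0.0],
     [0.0, 1.0]]
E = np.eye(2)
eigvals, coefs, cumulative = canonical_variables(T, E)
print(eigvals)
print(cumulative)  # cumulative share of variation explained
```

Scores for a dispersion chart are obtained by multiplying the matrix of genotype means by the first two columns of `coefs`; the second entry of `cumulative` corresponds to the share reported for CV1 and CV2 together.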

DISCUSSION
The mean EY estimate was higher than those found by Souza et al. (2013), Oliboni et al. (2013), and Oliveira Jr. et al. (2006). This information is very important because it highlights the potential of the tested genotypes and makes it possible to use them in the final tests of the breeding program in question.
The low experimental variation coefficients observed here indicate good control of the experimental conditions. EY recorded the highest VCe, which was expected given the polygenic nature of this trait; this result is corroborated by other studies (Oliveira Jr. et al. 2006; Souza et al. 2013; Oliboni et al. 2013).
Correlations may have genetic or environmental causes; genetic and phenotypic correlations are the most important ones in plant breeding (Hallauer et al. 2010). Three aspects must be considered when interpreting correlations, namely: magnitude, direction, and significance. Positive correlation coefficient estimates indicate that a given variable tends to increase when another one increases; negative correlations indicate that a given variable tends to increase when the other one decreases (Nogueira et al. 2012). According to the results of the current study, the complex relations among the yield components that contribute to EY make the selection of super-sweet corn genotypes difficult. It is therefore necessary to partition the correlations into direct and indirect effects in order to assess the importance of each explanatory variable relative to the main or basic variable (Santos et al. 2014).
Results showed that ED and EL contributed significantly to increasing EY (Table 3). In addition, ED stood out for showing high heritability (0.95). Therefore, it is possible to infer that ED can be used in indirect selection aimed at increasing EY. According to Vencovsky and Barriga (1992), when the correlation coefficient and the direct effect show similar magnitude and sign, the direct correlation explains the true association between the variables. It shows that, apparently, this variable is more independent than the others.
EH and LT presented high direct positive and negative effects, respectively; however, the correlation was not significant due to indirect effects. Accordingly, Vencovsky and Barriga (1992) suggested using restricted selection to eliminate undesirable indirect effects and to take advantage of the existing direct effect.
RE presented positive and significant correlation coefficient in the path analysis and such coefficient was higher than that of the residual variable; however, the direct effect of RE was highly negative. According to Lorentz et al. (2006), when the linear correlation coefficient is positive but the direct effect is negative or negligible, the correlation is caused by the indirect effects.
GF, PH and SD presented negative direct effects that, although of small magnitude, contributed to EY reduction. Thus, it is possible to infer that shorter and earlier-flowering plants contribute to EY increase. However, due to the low magnitude of the total correlation, it is recommended to restrict the use of these variables in the selection index.
Although both the correlation and path analyses indicated that ED and EL were the main variables influencing EY, only the path analysis showed which traits had the highest direct and indirect effects, which gives greater confidence when choosing the traits to be improved in order to reach the final goal (Montardo et al. 2003), in the present case, a higher EY.
Since EL presented low relative importance, low direct effect on the basic variable, as well as low VI, H² and VCg, one could suggest ignoring it in the analysis. On the other hand, since EL is important to the assessment of ear quality and cannot be replaced, it must not be ignored. Similarly, since PH is considered of significant importance in corn breeding, whether of the sweet, field or popcorn type (Kleinpaul et al. 2014; Cabral et al. 2016), it should not be ignored in the assessments.
The canonical variables presented groupings that reflected the genetic background of the sh2-gene donor populations, which directly influenced the performance of the inbred lines used here. In addition, it was possible to identify a subgrouping trend between hybrids composing half-sib families within the same genetic background. EY was the only trait used in the analysis that did not present significant GCA in the Piranão group, which suggests that the divergence between the PSH and P8HS inbred lines is not due to yield. The lack of significance for EY in the Piranão group may be due to the fact that the lines used in the final assessment stage were selected by truncation, based on the general and specific combining abilities for EY, which reduced its variability.

CONCLUSION
Ear diameter was the variable that showed the greatest explanatory power for ear yield, as well as sufficient variability for the selection process.
Plants showing smaller height and higher flowering precocity contributed to ear yield increase due to the use of selection indices.
The genetic backgrounds of the SDSH and SDS8HS donor populations strongly influenced the recurrent population, even after five backcrossing cycles.