Heritability and combined parental information to define the number of crosses in circulant diallels

In many agricultural crops, the number of parents used in cross breeding makes it difficult to obtain all possible hybrids in diallels. Therefore, using maize as a model plant and based on traits with high and low heritabilities, the aim of the present study was to quantify the correlation estimates of the general combining ability (GCA) and specific combining ability (SCA) between a complete diallel and a circulant diallel, with and without the inclusion of parents. For the high heritability trait, the GCA estimates can be obtained with low s values, whereas for the SCA estimates, s values close to half the number of parents should be used. For the low heritability trait, information from parents must be used to obtain the SCA estimates. For the GCA estimates, considering the stabilization of r above 0.70 for s = 7, s values greater than half the number of parents must be used in the circulant models.


INTRODUCTION
Following Sprague and Tatum (1942), general combining ability (GCA) is understood as the average performance of a parent i when crossed with a set of other parents and is associated with additive gene effects. Specific combining ability (SCA), on the other hand, refers to a specific combination between two parents, expressed by their allelic complementarity, and is associated with non-additive effects (dominance variance and, if epistasis is present, the epistatic interaction components). These include additive × dominance and dominance × dominance interactions (Fasahat et al. 2016). The concepts of GCA and SCA are extensively used in the breeding of temperate and tropical crops and, according to Hallauer et al. (2010), are particularly significant for the diallel mating design.
In this context, diallel mating designs are widely used for the evaluation of genotypes in many agricultural crops (Hill et al. 2001, Souza et al. 2012, Mendes et al. 2015, Bolson et al. 2016), since they provide a better genetic understanding of the crosses and allow prediction of the best crosses between parents (Laude and Carena 2014). Such analyses have been developed for parents that range from inbred lines to varieties with a broad genetic base (Hallauer et al. 2010).

G Inocente et al.
However, when using complete diallel models, it must be considered that the number of crosses increases rapidly with the number of parents (p(p − 1)/2 crosses for p parents when reciprocals are not made), regardless of the species and the objective of the breeding program. The method is therefore limited by the operational capacity to obtain the hybrids in the field and to evaluate them in subsequent experiments. Thus, using maize as a model for allogamous plants, several derivations were created to test the heterotic effects of two distinct groups of inbred lines, such as partial diallel crosses or even circulant chains, in which only a sample of all possible crosses between the available parents is made, with each parent entering s crosses, so that the number of crosses is smaller than in the complete diallel.
Since the first plant diallel study of Jinks and Hayman (1953) was published 67 years ago, few papers have evaluated the efficiency of the circulant model (which involves only a sample of the crosses) relative to complete or partial diallels. Furthermore, the conclusions of these studies are divergent, especially with respect to the minimum number of crosses per parent in the circulant models that maximizes their efficiency in relation to the complete models. It is also noteworthy that, even though heritability is considered in the discussions of these studies, the presence or absence of parents is treated as a factor that does not interfere in the correlation estimates. Dhillon and Singh (1978) evaluated a circulant partial diallel for s = 3, 5, 7, 11, and 15 compared to a complete diallel (s = 19) in four environments and eight traits. They found that the s = 5 circulant diallel was as good as the complete s = 19 diallel in detecting differences among GCA effects, and that s = 3 gave adequate information in the case of traits with high heritability. However, the authors indicated that circulant analysis was less efficient in detecting differences due to SCA effects. The results varied according to the environments, and traits with low heritability were more prone to misinterpretation. The GCA effects showed fluctuations in the circulant analysis that were more pronounced for s = 5 and s = 3, particularly for traits with low heritability. The authors concluded that smaller circulant diallels gave erratic heritability estimates, particularly for traits with low heritability. Veiga et al. (2000) simulated data to evaluate the efficiency of a circulant diallel in relation to complete diallels based on GCA and SCA.
For both parent classification and the magnitude of the combining abilities, the circulant diallels provided reliable estimates in relation to the complete diallels. Although the authors recommended evaluating the value of s for low heritability traits, they found that s does not need to exceed 50% of the number of parents. These findings contradict the conclusions of Murty et al. (1967), who found that circulant diallels may be adequate for screening parents for their GCA effects with s = n/2, and that the average standard error of the estimates increased as s decreased for the six traits evaluated.
Considering the divergent results in the literature, we used maize as a model plant with the aim of quantifying the correlation estimates of the general combining ability (GCA) and specific combining ability (SCA) between a complete diallel model and a circulant one based on traits with intrinsically high and low heritabilities, with and without the inclusion of parents in the analysis. These hybrids were evaluated together with the parents during the 2015/2016 crop year to obtain GCA and SCA estimates through the adaptation of Morais et al. (1991), based on the Gardner and Eberhart (1966) complete diallel model for populations in Hardy-Weinberg equilibrium. Circulant diallels were then simulated with the same data set, as described by Ferreira et al. (2004), considering different numbers of crosses per parent (s), according to Kempthorne and Curnow (1961). To minimize environmental noise in the genetic parameters, the interpopulational hybrids were resampled with a parallel algorithm developed in the R language (R Development Core Team 2019), which samples and sorts new codings of the parents. In this way, it was possible to generate two thousand different circulant diallels for each s. For the crosses in which each parent participates to be correctly defined, s must be odd when the number of parents is even, and even when the number of parents is odd. As the number of parents in the present study was even, it was possible to simulate circulant diallels with s = 3, 5, 7, and 9.
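The resampling step described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' algorithm: the study used a parallel routine written in R, whereas this sketch is serial Python, and the function names (`circulant_pairs`, `resample_circulant`), the parent labels, and the seed are hypothetical. It assumes that each resample is obtained by randomly recoding the parents and then applying the Kempthorne and Curnow (1961) crossing rule to the new codes.

```python
import random


def circulant_pairs(p, s):
    """Return the unordered crosses (i, j), 0-based, of a circulant diallel."""
    # k = (p + 1 - s) / 2 must be an integer, so p and s need opposite parity.
    if not (2 <= s < p - 1) or (p + 1 - s) % 2 != 0:
        raise ValueError("need 2 <= s < p - 1 and p, s of opposite parity")
    k = (p + 1 - s) // 2
    pairs = set()
    for i in range(p):
        for t in range(s):
            j = (i + k + t) % p  # partner index, wrapping around the circle
            pairs.add((min(i, j), max(i, j)))
    return sorted(pairs)


def resample_circulant(parents, s, n_resamples, seed=0):
    """Draw circulant diallels by randomly recoding the parents (assumed scheme)."""
    rng = random.Random(seed)
    pattern = circulant_pairs(len(parents), s)
    samples = []
    for _ in range(n_resamples):
        coding = list(parents)
        rng.shuffle(coding)  # new codification of the parents
        crosses = sorted(tuple(sorted((coding[i], coding[j]))) for i, j in pattern)
        samples.append(crosses)
    return samples


parents = [f"P{n}" for n in range(1, 13)]  # hypothetical labels for 12 parents
samples = resample_circulant(parents, s=3, n_resamples=2000)
print(len(samples), len(samples[0]))
```

Each resample keeps the design balanced (every parent enters exactly s crosses) while changing which hybrids of the complete diallel are sampled. Resamples drawn this way are not guaranteed to be distinct, so a check for duplicates would be needed to reproduce "two thousand different" diallels.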

Experimental evaluation
Experiments were carried out in three environments in southern Brazil. The main characteristics of each environment are described in Table 1. The experimental design was randomized blocks with three replications per site, and the experimental plot consisted of one 5-m-long row, with 0.80 m between rows and 5 plants per linear meter after thinning.
Two traits were evaluated: female flowering (FF), initially considered a high heritability trait, measured in days from emergence until the style-stigma appeared in 50% of the experimental plot; and grain yield (GY), considered a low heritability trait, adjusted to kg ha⁻¹ and corrected for moisture to a standard 14.5% by weighing the grain from all ears harvested in the plot (4 m²).
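The grain yield adjustment can be illustrated with a short sketch. The text does not state the formula used, so the conventional dry-matter moisture correction is assumed here; the function name and the example plot weights are hypothetical, while the 4 m² plot area and the 14.5% standard moisture come from the description above.

```python
def grain_yield_kg_ha(plot_weight_kg, moisture_pct,
                      plot_area_m2=4.0, standard_moisture_pct=14.5):
    """Convert plot grain weight to kg/ha at a standard moisture content."""
    # Assumed correction: keep dry matter constant, rescale to standard moisture.
    corrected_kg = plot_weight_kg * (100.0 - moisture_pct) / (100.0 - standard_moisture_pct)
    # Scale from the plot area to one hectare (10,000 m^2).
    return corrected_kg * 10000.0 / plot_area_m2


# Hypothetical plot: 3.42 kg of grain harvested at 20% moisture.
print(round(grain_yield_kg_ha(3.42, 20.0), 1))
```

With a 4 m² plot, each kilogram of grain at standard moisture corresponds to 2500 kg ha⁻¹.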

Effects of genetic model and statistical model
Individual and combined analyses of variance of the three sites were performed using R software (R Development Core Team 2019). The complete diallel model was analyzed using GENES software (Cruz 2013) through the model proposed by Gardner and Eberhart (1966) for parents and F1, adapted by Morais et al. (1991), in different environments, aiming to evaluate the potential of varieties per se and in hybrid combinations through estimates of varietal effects and heterosis manifested in the hybrid.
The model proposed by Kempthorne and Curnow (1961), evaluated in different environments, was used for the circulant model. Each of the p = 12 populations was involved in s crosses, where $s < p - 1$ and $s \geq 2$. Because p (the number of parents) is even, s must be odd, which gives s = 3, 5, 7, and 9. Population i was crossed with populations $k + i, k + i + 1, \ldots, k + i - 1 + s$ (indices taken modulo p), where k is the integer given by $k = (p + 1 - s)/2$. In this genetically balanced diallel analysis, each parent participates the same number of times in the crosses, so the total number of crosses is $ps/2$.
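As a worked illustration of the crossing rule, take the study's p = 12 and the smallest value s = 3 (parent 1 is chosen only as an example):

```latex
\begin{align*}
k &= \frac{p + 1 - s}{2} = \frac{12 + 1 - 3}{2} = 5, \\
\text{parent } 1 &\times \{k + 1,\; k + 2,\; k + 3\} = \{6,\; 7,\; 8\}, \\
\text{total crosses} &= \frac{ps}{2} = \frac{12 \times 3}{2} = 18 .
\end{align*}
```

By the same formula, s = 5, 7, and 9 give 30, 42, and 54 crosses, against 12 × 11/2 = 66 hybrids in the complete diallel (s = 11).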
The Kempthorne and Curnow (1961) model presumes the extraction of a sample of lines from the same population of origin, which is a complication in modern breeding programs that seek to identify parents from different origins for the extraction of lines for the purpose of maximizing the heterotic response. This problem of the original model motivated development of the model by Miranda Filho and Vencovsky (1999) for circulant diallels at the interpopulational level. In this case, the model considers not one, but two base populations or heterotic patterns of origin. However, in the present case, the aim was to represent different populations belonging to a common genetic basis, since the Gardner and Eberhart (1966) model considers that the allele frequencies from each population obey the Hardy-Weinberg equilibrium, whereas the Kempthorne and Curnow (1961) model does not restrict the use of populations but considers a common origin as the basis.
The estimates of the effects of the model and the procedures for calculating the sums of squares are based on matrix operations, whose sequence is given by Kempthorne and Curnow (1961). From the GCA estimates, the mean of a hybrid that was not evaluated is predicted by $Y_{ij} = m + g_i + g_j$, where $Y_{ij}$ is the predicted mean of the non-evaluated hybrid, $m$ is the overall mean of the diallel, $g_i$ is the GCA effect of parent i, and $g_j$ is the GCA effect of parent j. For the hybrids evaluated for each value of s, the SCA values are estimated by $\hat{s}_{ij} = Y_{ij} - \hat{\mu} - \hat{g}_i - \hat{g}_j$, where $Y_{ij}$ is the mean of the evaluated hybrid, $\hat{\mu}$ is the overall mean of the evaluated hybrids, $\hat{g}_i$ is the GCA effect of parent i, and $\hat{g}_j$ is the GCA effect of parent j.
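The two formulas above can be applied as in the sketch below. The overall mean, the GCA effects, and the hybrid means are hypothetical values chosen only to illustrate the arithmetic, and the function names are not from the study; estimating the GCA effects themselves requires the matrix procedure of Kempthorne and Curnow (1961), which is not reproduced here.

```python
def predict_hybrid(mean, gca, i, j):
    """Predicted mean of a non-evaluated hybrid: Y_ij = m + g_i + g_j."""
    return mean + gca[i] + gca[j]


def estimate_sca(observed, mean, gca):
    """SCA of each evaluated hybrid: s_ij = Y_ij - mu - g_i - g_j."""
    return {(i, j): y - mean - gca[i] - gca[j] for (i, j), y in observed.items()}


# Hypothetical estimates for three parents (days to female flowering).
overall_mean = 60.0
gca = {"A": 1.5, "B": -0.5, "C": -1.0}
observed = {("A", "B"): 62.0, ("A", "C"): 60.0}  # evaluated hybrids only

print(estimate_sca(observed, overall_mean, gca))
print(predict_hybrid(overall_mean, gca, "B", "C"))  # hybrid B x C was not evaluated
```

SCA can only be computed for hybrids that were actually evaluated, whereas the additive prediction is available for any pair of parents with GCA estimates.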
The efficiencies in the two diallel models for different numbers of crosses per parent were evaluated by the Spearman rank-order correlation (5% probability) between the combining abilities (GCA and SCA) obtained in the complete diallel (through approximation of the $v_i$ and $h_i$ effects) and the simulated circulant diallel.
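A minimal sketch of the Spearman rank-order correlation follows, computed as the Pearson correlation of average ranks so that ties are handled. The GCA values are hypothetical, and the significance test at 5% probability is not included.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda idx: values[idx])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for idx in order[pos:end + 1]:
            ranks[idx] = mean_rank
        pos = end + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation applied to the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical GCA estimates for six parents in the two diallel models.
gca_complete = [1.2, -0.5, 0.3, -1.0, 0.8, -0.8]
gca_circulant = [0.9, 0.4, -0.2, -1.3, 0.5, -0.3]
print(round(spearman(gca_complete, gca_circulant), 3))
```

Because the statistic depends only on ranks, it measures whether the circulant diallel orders the parents (or hybrids) in the same way as the complete diallel, regardless of the magnitude of the estimates.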

RESULTS AND DISCUSSION
Hartley's maximum F test indicated homogeneity of variance among the residual mean squares, so no environment had to be excluded from the analysis. Combined diallel analysis (Table 2) shows low CV% for the FF and GY variables, indicating high experimental accuracy in the evaluation of these traits. Significant differences were also observed by the F test (P < 0.1) for the base set of interpopulation hybrids evaluated by means of the heterosis source of variation ($h_{ij}$), as well as among the populations used as parents to obtain crosses by the source varieties ($v_i$). Thus, the results indicate that there is variability in the mean values for the two variables among populations, among hybrids, and when the overall set is evaluated by the source of variation (hybrids + populations).
The significance of the environmental source of variation (E) indicates that the overall means obtained were statistically different, from which we can infer that the set of selected sites allowed differential expression of the set of evaluated genotypes. However, the interaction effects were only significant for the GY variable, which was expected, since the estimates of heritability or the component of genotypic determination (H²) for this trait are commonly reported in the literature as medium to low; that is, there is strong environmental influence on the expression of this trait, since many genes are directly or indirectly involved in its expression. The same did not occur for the FF variable, since the high H² (0.95) indicates a lower response to environmental variation.
Based on the responses of the interactions for the different sources of variation with the source E and their decompositions, the analyses for the FF trait were based on the overall mean, while for the GY trait, responses were obtained within each environment evaluated.
Correlation estimates were obtained under two scenarios: with the parents included in the circulant model, and without the parents, in which case only the hybrids defined by the model and by the number of crosses per parent were evaluated. Gardner and Eberhart (1966) considered the parental means to obtain estimates of the effects of varieties ($v_i$), but the circulant model may or may not consider parental inclusion in the analysis. For the FF trait, the correlations between the GCA estimates obtained in the two diallel models were higher than 0.80, even with low circulant s, when parents were included in the analyses (Figure 1a). In contrast, in the model that did not consider parents in the analyses (Figure 1b), the values were higher than 0.67, indicating adequate concordance in the ranking of the estimates (and parental populations) even when only 18 hybrids (s = 3 crosses per parent) were obtained in the circulant model for this trait with high heritability. Omitting the parents therefore reduced the correlation estimates. From a breeding standpoint, however, the decision whether or not to include the parents in the evaluation should be justified by the weight of the information they add to decision making. In this case, if the reduction in the correlation estimate is compensated by the fact that a smaller number of crosses is evaluated without significant loss of information, given the level of heritability, the breeder may opt for smaller values of s. Similar results were reported by Veiga et al. (2000), with a correlation higher than 0.9 for s = 3 in the variables that exhibited heritability higher than 50%.
For correlations above 0.80 in the SCA estimates (Figure 1c and 1d), low values of s can be used both in the absence and in the presence of the parents. However, with the use of the parents (Figure 1c), the values of r start at 0.87 for the lowest value of s and stabilize at 0.90 for larger numbers of crosses per parent.
Without the use of parents (Figure 1d), the r values started at 0.58, but for s = 5 and above, there was a substantial increase, with correlations higher than 0.85. For values of s > 5, the correlation estimates started to show stability.
This stabilization is in agreement with that observed by Veiga et al. (2000) for variables with values of h² above 50% in models that consider or do not consider dominance effects in the expression of the trait. The heritability (0.95), which tends to facilitate obtaining hybrids with values higher than desired, is due to the low predominance of non-additive effects associated with the trait. Gordon (1980) found that when a trait has a large additive genetic component in its variance, the GCA estimate may be used for predicting the performance of crosses. Lima et al. (2008), studying genetic control of the thermal requirements for the beginning of flowering in maize, found that by the components of variance, a model involving the additive variance, dominance variance, and environmental variance was sufficient to explain genetic control of flowering in maize, showing that the thermal sum as a measure of earliness in maize is oligogenic, with the presence of dominance. However, the authors also found that the possibility of success in mass selection for the flowering trait could be inferred from a high estimate of the heritability coefficient. The present study indicates that obtaining earlier interpopulational hybrids may be facilitated and that selection within populations may be advantageous, when the objective is reciprocal recurrent selection.
For the GY variable, the correlations between the GCA estimates obtained in the two models ranged from 0.40 to 0.66 for the lower values of s (Figure 2a and 2b) in the Londrina (LD) environment, both in the presence (Figure 2a) and in the absence (Figure 2b) of the parents. For the other environments, the r values were close to or higher than 0.70 for s = 5 and above under the two conditions evaluated, with and without parents in the analyses. Murty et al. (1967) and Dhillon and Singh (1978) observed high variations in traits with low heritability when using small values of s. Veiga et al. (2000) simulated different scenarios with different values of heritability relative to GCA and found mean correlation values higher than 0.5 with low h² (0.10) for s = 3 in models with and without dominance. Ferreira et al. (2004) worked with ear yield, a low heritability trait, in different diallel models, with parents included in the analyses. They concluded that the parents involved in a diallel may require different s values for each parent to obtain acceptable GCA estimates, and that using a minimum s value for a simulated circulant diallel may result in biased GCA estimates compared to the estimates obtained in the complete diallel.
In our study, a noteworthy point is that the correlations tended to stabilize above 0.70 for s = 7 and greater, that is, more than half the number of crosses per parent in the complete model (s = 11). This confirms the conclusions of Murty and Anand (1966) and Murty et al. (1967) and contradicts Veiga et al. (2000), who emphasized that it is not necessary to exceed 50% of the number of parents, even for low heritability traits. It should be emphasized that the heritability obtained in the present study was 54% for GY, which is 5.4 times the value obtained by Veiga et al. (2000); and even with this higher heritability value, a larger number of crosses per parent was necessary for the correlations to stabilize. As Bray (1971) concluded, the use of a partial circulant diallel of any size entails a considerable risk in that only part of the potentially available data is being sampled, and the extent and likelihood of any such error depend upon both the number of crosses sampled and the nature of the trait under study.
However, comparison of the efficiencies of the two circulant models (with and without parental presence) in relation to the complete model showed lower correlation values in the circulant model with parents (Figure 2a). In this case, for the identification of testers with high GCA, the breeder may choose not to include parents in the analysis, without significant loss of information that would compromise definition of promising populations, especially for low values of s (s < 5).
To improve the identification of promising populations through hybrid combinations and to obtain GCA estimates for each population, it is necessary to know the degree of complementarity of each interpopulation hybrid based on specific heterosis, an estimate that is similar to SCA. Rinaldi et al. (2007) highlighted that the greater the divergence of the parents, the better the hybrid combinations. In this case, the number of heterozygous loci in a given hybrid combination may correspond to higher SCA effects or even greater effects of mean heterosis (relative to the mean of the parents involved in the cross) or heterobeltiosis (relative to the mean of the best parent).
The Spearman r estimates of SCA calculated for the complete and circulant models (Figure 2c and 2d) were obtained in a manner similar to the GCA estimates, but only the information of the hybrids that were actually evaluated in the complete diallel model was used to follow the simulations of the different circulant models tested.
For the GY variable, the correlation estimates in the circulant diallel model with s = 3 and parents evaluated were higher in all environments compared to the non-parental results. However, for s = 5 and above, the values of the correlations obtained in the Guarapuava (GUA) and Ponta Grossa (PG) environments for non-parental simulations were higher than the estimates in the parental scenarios. Only the LD values were close in both scenarios.
The G × E interaction, as expected, contributed to some divergences in the r responses, especially in the LD environment when we simulated a small number of crosses per parent (s = 3 and s = 5). For the GUA and PG environments, for s = 5 and above, the correlation values were close to or greater than 0.8, both in the scenarios that involved the evaluation of parents and in those without parents. In this case, it can be inferred that, although the interaction affects the estimations of the correlations, especially for low s (s = 3), their magnitudes may be such that the breeder judges their use viable in the case of scarce resources in order to evaluate a larger number of hybrids.
However, it should be emphasized that an evaluation involving parents and conducted in more than one environment can contribute to minimization of selection errors, as well as recognition of a possible adaptability pattern of the hybrids and/or parental populations evaluated. In cases where s is close to half the number of parents involved in the analysis and the heritability of the trait is greater than 50%, the use of parents can even be disregarded, as long as the evaluation is maintained in different environments. Thus, the prediction of hybrids by the circulant model was considered efficient in detailing the response patterns of interpopulational hybrids, depending on whether or not parents were used. Prediction of the minimum number of parents for traits with different levels of heritability was also considered efficient, without the need to perform a complete diallel study.
When the breeder focuses on the development or improvement of populations, the approach should be to seek close adaptability patterns (or a low G × E interaction), so that the subsequent selection process progresses in the direction of maintaining this pattern and transmits regions of the genome directly or indirectly associated with this pattern to progenies or hybrids. Gage et al. (2017) evaluated the effect of artificial selection on phenotypic plasticity in temperate maize and detected selected genomic regions that explain the low variability in G × E for grain yield compared to non-selected regions in modern breeding programs. According to the authors, this indicates that breeding may have contributed to the reduction in the G × E interaction in modern temperate maize cultivars.
Another factor that may contribute to the optimization of hybrid selection in diallel crosses is the use or non-use of reciprocals in evaluations. Fan et al. (2014) analyzed a dataset on reciprocal and non-reciprocal diallels in Southwest China and found that the use of reciprocals can affect grain yield estimates and estimates of SCA and may even affect the classification of heterotic groups in maize. However, in the present study, the correlation estimates for SCA remained high in circulant models without the presence of parents in most of the environments tested and with numbers of crosses of less than half the number of parents. These results are in agreement with Zare et al. (2011) and Barata et al. (2019), who found that it is possible to disregard reciprocal crosses, since their effects are not significant for GY.
In this case, it is understood that the use of reciprocals would contribute little to an improvement in selection efficiency, in addition to the fact that it would double the number of hybrids and increase the risk of not obtaining all hybrid and reciprocal combinations. Fan et al. (2014) pointed out that one of the difficulties in evaluating reciprocal diallels is that reciprocal crosses sometimes fail to produce sufficient seeds, and that the decision to include reciprocal or non-reciprocal crosses may depend on limitations of space and other resources.

CONCLUSIONS
For a high heritability trait (female flowering), GCA estimates can be obtained with low values of s, while for SCA estimates, values of s close to half the number of parents should be used.
For a low heritability trait (grain yield), information from parents must be used to obtain the SCA estimates. For GCA estimates, considering the stabilization of r above 0.70 for s = 7, values of s above half the number of parents must be used in the circulant models.