Uni and multivariate approaches for diallel analysis in early generation trials for soybean tolerance to rust

: Due to the high impact of Asian soybean rust (SBR) in Brazilian croplands, several studies have been conducted in order to maintain or increase the grain yield gain over years in the presence of the pathogen. The aim of this study was to define a breeding strategy applying uni and multivariate approaches for diallel analyzes in early generation trials and contrasting disease conditions. Thus, assessing genetic parameters to identify traits related to greater tolerance to soybean rust. Deploying a North Carolina design II scheme (4 elite commercial cultivars × 10 rust-tolerant experimental lines), we obtained 40 F 2 crosses that were evaluated in a randomized complete block design with 4 replicates. The crosses were conducted in two environments, contrasting only for the fungicide management (with and without rust control), enabling the estimation of the rust impact on 11 traits, divided into two groups: adaptive and reproductive. A multi-environment diallel model was applied assuming all effects as fixed in order to estimate the general (GCA) and the specific (SCA) combining abilities. The multivariate diallel analysis was performed in each rust management using MANOVA to test the genetic effects. The rust conditions were the most important source of variation for almost all evaluated traits. Using the contrasting disease conditions, we were able to identify crosses with desirable SCA and parents with GCA, which implies that tolerance can be used as a strategy to deal with SBR. The implementation and the use of multivariate diallel analysis aiming to increase the efficiency of genotype selection for rust tolerance is discussed.


INTRODUCTION
Soybean (Glycine max) is the major oil crop in the world and the fourth crop in terms of volume of production (USDA 2017). Due to its importance, several companies and research institutions have been conducting breeding programs and releasing new cultivars in all producing countries. In Brazil, the second largest soybean producer in the world, yields across the country has experienced negative impacts due to Asian soybean rust (SBR), the currently most important disease. SBR is caused by the biotrophic fungus Phakopsora pachyrhizi. As a consequence, the use of fungicides at multiple times per grown cycle has led to expenses as high as US$ 2 billion per year (Godoy et al. 2016).
Several strategies can be applied to reduce the pathogen population and restrict the damage for soybean production. Some strategies include the use of fungicides (Godoy et al. 2016), resistant cultivars carrying one or multiples Rpp (resistance to P. pachyrhizi) genes (Yamanaka et al. 2015), tolerance (Jarvie and Shanahan 2009;Oloka et al. 2009;Maphosa 2014), soybean-free period (Godoy et al. 2016), and early sowing dates. Others noncommercial strategies include transgenic events (Kawashima et al. 2016), RNA interference (Koch and Kogel 2014;Langenbach et al. 2016) and nonhost resistance (Langenbach et al. 2013). When the strategy covers genetic aspects, the use of classical breeding is essential to develop a high-performance cultivar. In this sense, the breeder must plan the crosses followed by the segregating populations assessments. A great part of the success in obtaining elite cultivars depends on how well the information available in the breeding program will be collected and used to assist in the decision-making process.
One of the most important steps in a plant breeding program is the parental selection. Regarding this, breeders have to consider several traits, since grain yield is a result of the association of multiple traits, e.g. seed weight, number of seeds per plant, and harvest index. Several research studies have reported the relevance of parental selection on the breeding success (Carpentieri-Pípolo et al. 2000;Bertan et al. 2007;Casassola et al. 2013), and many tools are available to help the selection, e.g. divergence analyses based on phenotypic, genetic, and pedigree information.
After parental selection, the crosses are made, and these can be schematized in complete diallel or North Carolina design II. Frequently, segregating populations are analyzed following a diallel design using univariate approaches (Ribeiro et al. 2007;Gavioli et al. 2008;Oliveira et al. 2014), which implies failure in considering the correlations between desirable traits. To overcome this restriction, a multivariate diallel approach can be performed, enabling the study of two or more traits simultaneously (Ledo et al. 2003). Research studies applying multivariate diallel analysis are available for several important crops, including wheat (Benin et al. 2009), sweet passionfruit (Jung et al. 2007), popcorn (Gonçalves et al. 2014), Jatropha (Teodoro et al. 2017) sweet pepper (Nascimento et al. 2010) andsoybean (Amaral et al. 2004).
Tolerance can be defined as the capacity of a cultivar resulting in less yield or quality loss relative to the same disease severity when compared to other cultivars or crops (Schafer 1971). According to some researchers, tolerance against plant diseases may have several benefits and can yield valuable results (Ney et al. 2013;Tukamuhabwa and Maphosa 2014). Besides grain yield, the use of tolerance also impacts the plant × pathogen interaction complex (Kover and Schaal 2002). Genes conferring complete resistance (R-genes) tend to be surmounted by the pathogen. On the other hand, tolerance implies the absence of selection pressure over the pathogen, which contrasts with resistance and fungicide usage (Roy et al. 2000). The use of tolerance to the development process of improved plant materials is reported in the literature in various crops (Jarvie and Shanahan 2009;Roux et al. 2010;Pierre et al. 2015).
The reduction of grain yield by SBR occurs mainly through leaf loss, reduction in leaf area, decrease in dry matter accumulation and reduction in harvest index (Kumudini et al. 2010). In tolerant genotypes, the ability to maintain productivity may occur at three levels, organ, plant and crop. Some traits are influenced by source-sink relationships, an increase of assimilation in healthy tissues and by crop architecture (Ney et al. 2013). Once the losses are a result of several traits that can be evaluated in a plant-based phenotyping, the use of multivariate diallel and multivariate approaches may be an alternative as a breeding strategy to obtain tolerant genotypes to SBR. The identification of promising crosses and parents with high general combining ability and traits related with tolerance to SBR are important for plant breeding, since in this way the breeder can define the best breeding strategy.
Besides the importance of this disease, multivariate approach for soybean tolerance to rust still have not been applied. In this context, the aim of this study was to define a R. M. Yassue et al. breeding strategy applying uni and multivariate approaches for diallel analyzes in early generation trials and contrasting disease conditions. Thus, assessing important genetic parameters for plant breeding to identify traits related to greater tolerance to soybean rust.

Study location and genetic material
The experiments were carried out in Piracicaba-SP (22º 41' S; 47º 39' W; 546 m alt). The meteorological data are described in Supplementary File 1. The evaluations of rust symptoms were performed in all plots during the reproductive period of the crop, according to the canopy severity ratings, ranging from 1 to 5, 1 indicating a dark green canopy and 5 indicating noticeable yellowing throughout the canopy (Monteros et al. 2007;Harris et al. 2015).
The crosses evaluated in the F 2 generation were originated from a North Carolina design II scheme. The first group of parents consisted of 4 elite commercial cultivars and the second was 10 experimental lines selected for rust tolerance, composing the second cycle of a recurrent selection program established in 2007 (see Supplementary File 2).

Rust tolerance assessment
The experimental design was a randomized complete block with 4 replications. Each block was filled with the 40 single crosses. The sowing density was 40 seeds per single row plot (2 m × 0.5 m). The date of sowing and fungicide applications are available on the Supplementary File 1.
In order to evaluate the tolerance to SBR, the experiments were replicated in two fungicide management. In the first (RC) we controlled the rust and the late season leaf diseases, with three preventive applications of fluxapiroxade and pyraclostrobin at a dose of 58.45 + 116.55 g a.i..ha -1 , respectively. For the second management, we controlled for late season leaf diseases, except Asian rust (NRC), with three preventive applications of carbendazim at a dose of 300 g a.i..ha -1 , respectively. According to previous results, this last fungicide has no apparent effect on SBR at field conditions (Jarvie and Shanahan 2009;Araujo and Vello 2010).

Phenotyping of morphoagronomic traits
Using the scale proposed by Fehr and Caviness (1977), we estimated, in days, the period of emergence to beginning of seed development (R5), period of grain filling (beginning of seed development to beginning of maturity, GF), and the cycle (sowing to full maturity, R8). During the reproductive period, at 108 days after sowing, we evaluated the following traits by means of scores: plant architecture (1-less favorable to 5-ideal plant architecture, the ideotype considered was to be the one with the highest number of branches and with lower angulation, PA) and leaflet area estimation (1 -small to 5 -large leaflet area, LA).
At the physiological maturity stage, the following traits were measured: plant height (cm, PH); lodging (1 -all plants upright to 5 -all plants prostrate, Lo), and agronomic value (1 -less adequate to 5 -suitable plants, AV). This index is a comparative scale that gathers a series of subjective traits, such as plant architecture, plant height, vigor and number of pods. The biomass of each plot was collected, packed in bags and stored in the laboratory with air circulation until drying. Later, the material was weighed and threshed. Seed yield (kg.ha -1 , SY) was estimated from the harvested material and the apparent harvest index (HI) was calculated as the ratio of seed yield over total biomass. Subsequently, 100 seeds weight (g, HSW) was determined. To attend the assumption analysis of variance, the Lo, PA and LA data were transformed to (x) 0.5 .

Univariate diallel analysis
Genetic analyses were performed to estimate general combining ability (GCA), specific combining ability (SCA) and their interaction effect with fungicide management. The methodology was proposed by Griffing (1956), Model IV, adapted for partial diallel (Dhillon 1978;Vencovsky and Barriga 1992), according to the following model: 8

Univariate diallel analysis
Genetic analyses were performed to estimate general combining ability (GCA), specific combining ability (SCA) and their interaction effect with fungicide management.
The methodology was proposed by Griffing (1956), Model IV, adapted for partial diallel (Dhillon 1978;Vencovsky and Barriga 1992), according to the following model: Y ijkl = μ+ b ij +g I k + g II l + S kl + f j + gI k f j +gII l f j +S kl F j +e ijkl In this model, Yijkl is the observed value relative of the klth hybrid at the ith block of the jth experiment; μ is the constant; bij is the effect of the block (i = 4); gIk is the GCA effect for the parents of group I (k = 4), e gIIl is the GCA for the parents of group II (l = 10), Skl is the SCA effect for F2 hybrid, fi is the effect of fungicide management (j = 2) and eijkl is the experimental error, with mean 0, normally distributed and variance σ². All the effects were considered fixed, except for the error.
The quadratic components for GCA and SCA were calculated using the methods In this model, Y ijkl is the observed value relative of the kl th hybrid at the i th block of the j th experiment; μ is the constant; b ij is the effect of the block (i = 4); gI k is the GCA effect for the parents of group I (k = 4), e gII l is the GCA for the parents of group II (l = 10), S kl is the SCA effect for F 2 hybrid, f i is the effect of fungicide management (j = 2) and e ijkl is the experimental error, with mean 0, normally distributed and variance σ². All the effects were considered fixed, except for the error.
The quadratic components for GCA and SCA were calculated using the methods of moments, according to Torres and Geraldi (2007), and are shown in the follow equations: values of each genotype in each fungicide management, we performed a Principal Component Analysis (PCA), determining a minimum 70% of representation. Pearson's correlation coefficient was estimated between each principal component and the traits. Finally, the effects of GCA I, GCA II and SCA for the scores from each principal component and genotypic were estimated. The statistical analyses were performed using the software Genes (Cruz 2013) and R language (R Core Team 2016).

Univariate diallel analysis
No significant effects of fungicide management were observed on R5, PA, Lo, and AV, meaning that they were not influenced by the presence of the disease (Fig. 1). On the other hand, LA, PH, R8, GF, HI, HSW and SY showed significant effects for fungicide management. The highest reductions (losses) caused by rust occurred for SY (37.26%), HSW (30.85%), GF (14.80%) and HI (8.78%). The effect of GCA I and II were significant for almost all evaluated traits and significant SCA was only observed for R8, GF, HI, and HSW. The interactions between fungicide management and GCA I and II were significant for R8 and HSW, and the interaction of SCA with management was significant only for R8 (Fig. 1).
Analyses of the quadratic components for GCA and SCA revealed Baker Ratio higher than 0.75, showing a predominance of additive effects for SY, PH, PA, HSW, and R5 (Table 1). Non-additive effects were important mainly for R8, GF, and HI, with Baker Ratio of 0.495, 0.540, and 0.606, respectively.

Correlations between traits
Correlations between the 11 traits in the management of each fungicide are shown in Fig. 2. There was a differentiated response in each fungicide management. For RC, there were no significant correlations between HSW and the other traits. However, in NRC management, HSW presented high negative correlations with R5 and R8 (-0.45 and -0.42, respectively) and positive correlations with AV (0.45). For SY in RC management, there was a significant correlation only with AV (0.50). However, when in contact with rust, SY presented significant correlations with HI, R8 and AV In this model, Yijkl is the observed value relative of the klth hybrid at the ith block of the jth experiment; μ is the constant; bij is the effect of the block (i = 4); gIk is the GCA effect for the parents of group I (k = 4), e gIIl is the GCA for the parents of group II (l = 10), Skl is the SCA effect for F2 hybrid, fi is the effect of fungicide management (j = 2) and eijkl is the experimental error, with mean 0, normally distributed and variance σ². All the effects were considered fixed, except for the error.
The quadratic components for GCA and SCA were calculated using the methods of moments, according to Torres and Geraldi (2007), and are shown in the follow equations: Baker Ratio (Baker 1978) that reveals an estimative of the additive effects regarding the genetic effects: Baker Ratio= ΦGCA gI +ΦGCA gII ΦGCA gI +ΦGCA gII + ΦSCA Skl

Correlation between traits
Traits were combined in pairs for the estimation of the Pearson's correlation coefficient in each of the fungicides management. With the correlation matrix, we calculated the variance inflation factors (VIF) and traits with VIF higher than 10 were excluded for the path analysis. Seed Yield (SY) was used as dependent variable and R8, R5, Lo, PA, LA, AV, HSW, HI and GF as explanatory.

Multivariate diallel analysis
For the multivariate diallel analysis (Ledo et al. 2003

Correlation between traits
Traits were combined in pairs for the estimation of the Pearson's correlation coefficient in each of the fungicides management. With the correlation matrix, we calculated the variance inflation factors (VIF) and traits with VIF higher than 10 were excluded for the path analysis. Seed Yield (SY) was used as dependent variable and R8, R5, Lo, PA, LA, AV, HSW, HI and GF as explanatory.

Multivariate diallel analysis
For the multivariate diallel analysis (Ledo et al. 2003), traits were divided into two groups, adaptive and reproductive. The adaptive group consisted of R8, R5, Lo, PA, LA, AV and PH and the reproductive group by SY, HSW, HI, and GF. To simplify, we considered separate managements. The multivariate analysis of variance (MANOVA) test was utilized to verify the significance of the genetic effects, using Wilks test and the F approximation (Harris 2001  The total observed phenotypic variation is decomposed into the following elements: GCA I: general combining ability (genotypes from group one); GCA II: general combining ability (genotypes from group two); F: fungicide management, i.e. with and without rust presence; B:F: blocks within fungicide; SCA: specific combining ability; the interactions GCA I*F, GCA II*F, SCA*F; and the error term. C.V.: coefficient of variation. **, *: Significant at 1% and 5% of error probability by F-test. The evaluated traits were: number of days from VE to R8 (R8); days to R5 (R5); period of grain filling (GF), plant architecture (PA), leaflet area estimation (LA), plant height (PH); lodging (Lo), agronomic value (AV), seed yield (SY), 100 seeds weight (HSW) and harvest index (HI). Correlation values above 0.30 and below -0.30 are significant by the t-test. Table 1. Quadratic components estimates for general (ϕGCAgI and ϕGCAgII) and specific (ϕSCA) combining ability for number of days from VE to R8 (R8); days to R5 (R5); period of grain filling (GF), plant architecture (PA), leaflet area estimation (LA), plant height (PH); lodging (Lo), agronomic value (AV), seed yield (SY), 100 seeds weight (HSW) and harvest index (HI), measured in F 2 crosses from a North Carolina design II. The direct effects of each trait on seed yield can be observed through path analysis (Fig. 3). The coefficient of determination of each path analysis was 0.414 and 0.536 for management of the RC and NRC, respectively. Differential response between managements was observed. For RC management, the greatest direct effects occurred for AV and Lo (0.606 and 0.332, respectively). However, for NRC management, effects occur for AV, Lo, PH, R8 and PA (0.697,0.584,respectively)

Multivariate diallel analysis
Multivariate diallel variance analyses are presented in Table 2 for management RC and NRC, individually. For the management RC, additive (GCA I and GCA II) and non-additive (SCA) eff ects were signifi cant for adaptive and reproductive traits. For NRC, the additive effects (GCA I and GCA II) for the adaptive and only GCA II for the reproductive traits were signifi cant.
For management RC, the first principal component (PCA1a) was associated with height and lodging values, while PCA2a was associated with negative values of R5 and PCA 3a with low levels of agronomic value and plant architecture (Table 3). Considering this, the desirable ideotype would be genotypes with negative values of PCA1a and PCA3a and positive values of PCA2a. Th e best-performing parents in Table 3. Pearson correlations between traits and principal components (PCA) for management with control for rust and late season leaf diseases (RC) and management for late season leaf diseases, except Asian rust (NRC). The adaptive traits were: number of days from VE to R8 (R8), days to R5 (R5), plant architecture (PA), leafl et area estimation (LA), plant height (PH), lodged (Lo) and agronomic value (AV). The reproductive traits were: period of grain fi lling (GF), seed yield (SY), 100 seeds weight (HSW) and harvest index (HI).  the first group were L6, L8, L10 and C01 and the crosses C01 × L01e C01 × L03 (Fig. 4).
In the management NRC and the adaptive traits, PCA1a' was associated with late cycle and low agronomic value, PCA2a' with short plants and low leaf area score and PCA3a' with upright plants and with favorable architecture. Based on this, the desirable ideotype would correspond to genotypes with low PCA1a' and high PCA2a' and PCA3a' , which correspond to the parents L10, L07 and C04 with the highest contributions to GCA.
For the reproductive traits in management RC, PCA1r was associated with genotypes with a short grain filling period, PCA2r with low productivity and higher PCS and PCA3r with low SY and shorter period grain filling. Consequently, the desirable ideotype would be plants with high PCA1r and low values of PCA2r and PCA3r. The best-performing GCA parents were C03, L08 and L10 and, for SCA, the crossings L03 × C01, L04 × C02, and L08 × C03.
For the reproductive traits, PCA1r' was negatively associated with all traits, PCA2r' with low grain yield and a long period of grain filling and PCA3r' with a heavy weight of one hundred seeds. Based on this, the desirable ideotype would be genotypes with low PCA1r and PCA2r and high PCA3r values. Considering only the GCA II, the C03 was the parent that contributed the most with favorable traits.
In RC, considering short plants and favorable architecture, with early cycle and high agronomic value, the best parents were USP  and M-SOY 7908 RR and crosses M-SOY 7908 RR × USP 14-22.001 and M-SOY 7908 RR × USP 14-22,003. However, for the selection of plants with high harvest index, productive, with heavier hundred seeds weight (HSW) and longer period grain filling, the best performing parents were V Max RR, USP 14-22.008 and USP 14-22.010 and the crosses USP 14-22.003 × M-SOY 7908 RR, USP 14-22.004 × AS 7307 RR and USP 14-22.008 × V Max RR. For NRC, only the effects of GCA were significant. Therefore, the parents USP 14-22.010, USP 14-22.007, and BMX Potência RR provided earlier, shorter and upright plants with favorable architecture. However, the parent V MAX RR provided additive effects for high yield genotypes, with a longer period of grain filling and heavier seeds.

DISCUSSION
The conditions of temperature and precipitation were satisfactory for the crop development (Supplementary File 1). It was possible to observe the progress of the disease as the relative humidity of the air increased, revealing that the environmental conditions were ideal for the pathogen (Godoy et al. 2016).
The mechanisms involved in the reduction of seed yield was also similar to that found in the literature, in consequence of reductions in size and seed mass and harvest index (Kumudini et al. 2008). The nonsignificant effect of fungicide management for R5 is explicated by the fact that the disease just progresses after this period. For PA and Lo, these traits are also defined before the infection period occurs. The use of VA was able to identify differences between treatments within each fungicide management, however, because it was a comparative scale, this was not significant for fungicide management.
Few traits had a significant interaction effect between GCA and SCA and fungicide management. This may be explained due to the origin of the parents from group I. This group originates from the recurrent selection program for rust tolerance, where genotypes are selected for high stability when in contact with the disease. These results were similar to those obtained by Jarvie and Shanahan (2009), who observed low interaction between rust tolerant genotypes and fungicide management.
In self-pollinated crops such as soybeans, it is expected to find a predominance of additive effects for most traits in early generations (Spehar 1999;Mebrahtu and Devine 2008). Tolerance to rust is expected to have a polygenic inheritance and a predominance of additive effects (Ribeiro et al. 2008), which is similar to those found in the present study. Dominant effects were important mainly for cycle-related traits, since few genes with major effect govern these traits, generally with dominant alleles for precocity (Suh et al. 2000;Watanabe et al. 2012).
For both correlation and path analysis, a differentiated response was observed in each management. This indicates that, for the evaluated genotypes, they behaved differently in the presence or absence of the disease. Considering NRC management, negative correlations were observed between Harvest Index and plant height (-0.56) and cycle-related traits (-0.61), implying that high and late plants contribute negatively to the biomass conversion in productivity, suggesting an unbalance in the source and drain relation. Besides that, plants with short period of grain filling tendes to be lower (0.42) and convert more dry matter into grains (-0.36).
Low coefficients of determination in path analysis may be associated with other traits not evaluated in this research, which could have contributed significantly to SY (Carvalho et al. 2002). These traits could be number of nodes on the main stem, pods per unit area and seeds per pod (Pandey and Torrie 1973;Bizeti et al. 2004). In this way, the results should be analyzed cautiously and future studies should be performed to confirm the results.
Our results indicated that agronomic value and seed yield had direct effects and correlation in the same direction and magnitude, implying that this correlation explains the true association between traits (Vencovsky and Barriga 1992). These results indicate that, even if agronomic value is a subjective visual scale, which considers the researcher's experience, indirect selection for rust tolerance was possible. It is important to emphasize that AV did not present significant differences for fungicide management in the analysis of variance but showed significant effects for genotypes (Fig. 1). In this way, AV was efficient in identifying the most productive crosses within each fungicide management, but not between experiments.
The identification of a tolerant genotype is difficult, since its phenotype is very similar to a sensitive one (Newton 2016). However, our results suggest that it is possible to identify tolerant crosses through the use of agronomic value scale. Nogueira et al. (2015) reported that the use of the number of pods is highly associated with grain yield when in contact with rust. Despite its accuracy, the assessment of number of pods is timeconsuming and not practical in a breeding pipeline.
Low correlation between seed yield and cycle were observed. Similar results to those obtained by Melo et al. (2015), that reported the existence of significant levels of tolerance in late-maturity genotypes under long exposure to the pathogen were found. However, Kawuki et al. (2004), associated early maturing line to tolerant genotypes, since late maturing lines had greater disease severity.
The use of the multivariate diallel was efficient for a more detailed study of the GCA and SCA combinations, together with the possibility of parental selection with the intention of improving certain traits simultaneously. Similar results were obtained in the literature by several authors, proving the efficiency of this type of analysis (Jung et al. 2007;Benin et al. 2009;Teodoro et al. 2017).

CONCLUSIONS
Our research was the first insights in the study in early generation to obtain tolerant genotypes to Asian rust using several approaches. The univariate analysis was able to identify the contributions of the additive and non-additives from the several traits. The use of multivariate analysis allows a better understanding of grain yield components and other traits associated with rust tolerance. The use of visual selection via agronomic value is a useful practice to identify tolerant genotypes. The multivariate diallel analysis allows the identification of the best parents and crosses with ideal GCA and SCA, respectively, considering the ideotype.