Improvement of vegetable soybean: genetic diversity and correlations of traits between immature and mature plants

Human consumption of vegetable soybeans is increasing; consequently, breeding programs need to be encouraged and optimized. Therefore, there is a need to understand the relationship between the traits evaluated at the R6 (full seed) and R8 (full maturity) stages and to estimate genetic divergence. At two sites and in two crop seasons, we evaluated 42 progenies (F6 generation) from crosses between vegetable soybean genotypes, together with three commercial cultivars as checks. In general, we found significant correlations that coincided across environments. Positive and significant correlations between pod width (PWR6) and hundred-seed weight (HSW) and between PWR6 and hundred-pod weight (HPW) suggest the possibility of replacing the labor-intensive process of obtaining HSW and HPW by estimating PWR6, which is faster and easier to apply. The unweighted pair-group method with arithmetic mean (UPGMA) analysis allocated the soybean genotypes into six major clusters. Information from the present work may guide practical actions in breeding programs.


INTRODUCTION
Soybean is one of the most important crops and has enormous potential for improving the dietary quality of people worldwide. Soybeans can be consumed as a vegetable crop or processed into various products (Hartman et al. 2011). Vegetable soybean, also known as "edamame" in Japan, "maodou" in China, and "poot kong" in Korea (Kumar et al. 2011), belongs to the same species as grain or "commodity" soybean [Glycine max (L.) Merrill]. However, the beans are consumed when they are immature or unripe (Konovsky et al. 2020); that is, the pods are harvested at the R6 stage (Fehr and Caviness 1977), when the seeds are already fully developed inside the pods. The short growth duration allows edamame to fit into narrow windows during crop rotation.
Although edamame is often sold in the pod, only the beans are edible. Vegetable soybeans have gained special attention from breeders, non-breeder researchers, growers, and consumers, and new edamame varieties are currently being developed. Soybean is a popular crop in Asian countries, and the consumption of edamame is also increasing in the United States (Carneiro et al. 2020), Australia (Figueira et al. 2019), Europe (Hong and Gruda 2020), and sub-Saharan Africa (Djanta et al. 2020). In comparison to grain soybean, the seeds of vegetable soybean cultivars are larger (>30 g per 100 dry seeds) and occupy 80% to 90% of the pods, in addition to being better in flavor and texture. The physicochemical properties of edamame vary among growth stages. According to Yu et al. (2022), pod/bean weight and pod thickness peaked at the R6 stage. Moreover, sugar, starch, alanine, and glycine levels also peaked at R6 and then declined. The seeds also have a lower percentage of the starch that causes flatulence. However, as in grain soybeans, vegetable soybeans also have anti-nutritional factors such as tannins, protease inhibitors, and phytic acid (Gondim-Tomaz et al. 2022).
Soybeans are one of the main crops in Brazil and are of social and economic importance (Sentelhas et al. 2015). Vegetable soybean can be an important source of high-quality, low-cost protein and other nutrients for Brazilians (Keatinge et al. 2011, Rizzo and Baroni 2018). In Brazil, the consumption of soybeans in the human diet remains limited (Juhász et al. 2017) but shows an increasing trend as a result of the dissemination of the benefits of soybeans for human health and the growing market supply of better-quality soybean-based products (Carrão-Panizzi et al. 2009).
Vegetable soybeans possess antioxidant properties and exert inhibitory effects on inflammatory mediators, suggesting their potential use as dietary supplements (Lin and Wu 2021). The market demand for edamame has grown dramatically in recent decades owing to increased knowledge of its nutritional properties compared with those of mature soybeans (Islam et al. 2019) and to changes in lifestyle, with diets shifting toward healthier food (Zhang et al. 2017). Thus, the objectives of the present study were as follows: i) to estimate the correlation between traits at the R6 and R8 stages of vegetable soybean progenies in advanced stages of inbreeding; and ii) to study the genetic diversity among progenies for use in a breeding program for vegetable soybeans.

MATERIAL AND METHODS
Plant material
The genotypes used in this study were 42 soybean lines (F6) from crosses between inbred lines from the soybean breeding program of the Department of Genetics of the University of São Paulo (ESALQ-USP). The genealogy of the crosses included the following vegetable soybean genotypes: HAKUCHO, IAC PL-1, MAJÓS, P.I. 80.441, P.I. 165.672, P.I. 229.320, P.I. 230.970 F7-4, STWART, TAMBA, TMV, and TN#4. Moreover, three checks were used in all experiments: two soybean cultivars developed specifically for human consumption (BRS 257 and BRS 267) and one cultivar with insect resistance (IAC 100).

Experimental conditions
The experiments were conducted during two crop seasons at two locations in the city of Piracicaba (lat 22° 42′ 30″ S, long 47° 39′ 00″ W, alt 540 m asl), São Paulo: Esalq (Site 1) and Areão (Site 2). The Esalq site is the headquarters of the soybean breeding program of the Luiz de Queiroz School of Agriculture, University of São Paulo, and has a ferric red nitosol soil (Santos et al. 2018) with a clayey texture and undulating relief. The Areão site has a dystrophic red-yellow argisol soil with a medium-clay texture and undulating relief. The experiments were carried out in a randomized complete block design with three replicates. The experimental units were single-row plots (2 m × 0.5 m). Planting was performed in a commercial production setting, with a planting density of 40 seeds per plot. Fertilization, supplemental irrigation, and pesticide application followed the technical recommendations for soybeans in this region.

Phenotyping
When the plants reached the R6 stage (full seed), two plants per plot were harvested, and the traits evaluated were pod yield (PY, g plant⁻¹), hundred-pod weight (HPW, g), number of days from planting to the R6 stage (NDR6), pod width (PWR6, mm), and number of pods per plant (NP). At the R8 stage (full maturity), the plots were evaluated for grain yield (GY, g plot⁻¹), hundred-seed weight (HSW, g), number of days from planting to maturity (NDM), agronomic value (AV), and plant height at maturity (PHM, cm).

Data analysis
All analyses were performed using the Genes software in integration with the R software (Cruz 2016). The correlation network was built with the "qgraph" package (Epskamp et al. 2012). The relative importance of the traits for the genetic diversity among the genotypes was assessed according to Singh (1981). The generalized Mahalanobis distance was adopted as the dissimilarity measure between the soybean genotypes, and clustering was performed using the unweighted pair group method with arithmetic mean (UPGMA). In the path analysis, the coefficients of phenotypic correlation between PY and the other traits from R6 (HPW, NDR6, PWR6, and NP) were partitioned into direct and indirect effects. The same was done for GY and the other traits from R8.
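The partition of correlations into direct and indirect effects can be illustrated with a short example. This is only a minimal sketch, not the Genes implementation: it assumes the usual path-analysis formulation in which the direct effects solve the linear system formed by the correlations among the explanatory traits and their correlations with the response. The function name `path_analysis` and all correlation values below are hypothetical and were chosen only for illustration.

```python
import numpy as np

def path_analysis(r_xx, r_xy):
    """Partition correlations with a response into direct and indirect effects.

    r_xx: (k, k) correlation matrix among the explanatory traits.
    r_xy: (k,) correlations between each explanatory trait and the response.
    """
    r_xx = np.asarray(r_xx, dtype=float)
    r_xy = np.asarray(r_xy, dtype=float)
    # Direct effects (path coefficients) solve r_xx @ direct = r_xy.
    direct = np.linalg.solve(r_xx, r_xy)
    # indirect[i, j] is the effect of trait i on the response via trait j.
    indirect = r_xx * direct[np.newaxis, :]
    np.fill_diagonal(indirect, 0.0)
    # Residual effect: sqrt(1 - R^2), with R^2 = sum(direct * r_xy).
    residual = float(np.sqrt(max(0.0, 1.0 - direct @ r_xy)))
    return direct, indirect, residual

# Hypothetical correlations (NOT the study estimates); response = PY.
traits = ["HPW", "NDR6", "PWR6", "NP"]
r_xx = np.array([
    [1.0, -0.5, 0.6, -0.3],
    [-0.5, 1.0, -0.2, 0.1],
    [0.6, -0.2, 1.0, -0.5],
    [-0.3, 0.1, -0.5, 1.0],
])
r_xy = np.array([0.4, -0.1, 0.1, 0.7])

direct, indirect, residual = path_analysis(r_xx, r_xy)
for name, d, ind in zip(traits, direct, indirect.sum(axis=1)):
    print(f"{name}: direct = {d:.3f}, total indirect = {ind:.3f}")
```

For each trait, the direct effect plus the sum of its indirect effects returns the original correlation with the response, which is a convenient check on the decomposition.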

RESULTS AND DISCUSSION
In the present work, we phenotyped soybean lines at the R6 and R8 stages and found great variation between environments and genotypes, which provides important and necessary information for breeding programs. Analysis of variance revealed significant differences among the genotypes for all traits, except for PY and AV. The interaction between genotypes and environments was not significant for PHM. Moreover, the environment contributed more to trait variation than the other main effects. Du et al. (2020) observed that drought stress significantly affected seed weight at the soybean R6 stage. Among genotypes, the trait means across the four environments ranged from 111.9 to 198.4 g plant⁻¹ (PY), 80.9 to 137.2 g (HPW), 9.4 to 12 mm (PWR6), 52.7 to 106.5 (NP), 492.8 to 861.8 g plot⁻¹ (GY), 13.4 to 23.7 g (HSW), 128.4 to 143.9 d (NDM), 2.5 to 3.3 (AV), and 72.9 to 133.3 cm (PHM) (Table 1). In the present study, NDR6 ranged from 109.8 to 117.3 d; these values were higher than those reported by Kumar et al. (2006), which ranged from 60 to 93 d. Interestingly, these authors evaluated mostly exotic genotypes not adapted to the local environment, which shortened the cycle owing to differences in photoperiod sensitivity. By contrast, Rao et al. (2002) found values of NDR6 ranging from 99 to 134 d. Assessment of NDR6 is relevant for vegetable soybean genotypes because it allows the identification of early and late genotypes that can be planted at different dates while maintaining high quality (Silva e Souza et al. 2020).
For all traits, we found progeny means superior to those of the commercial cultivars. In a multi-year study of edamame, Jiang et al. (2018) reported significant trait variation between years, including changes in PY and PHM, suggesting that environmental variation is an important factor in the development of edamame lines. Yu et al. (2021) concluded that both genotype and planting location should be considered when breeding better edamame genotypes.
Establishing the relationship between characteristics is important for reducing the number of evaluated traits and optimizing the selection steps in a breeding program. In our study, we evaluated the correlation between traits and the genetic diversity, both of which are useful for improving vegetable soybeans. Among the R6 traits, the significant positive correlations that coincided in all environments, with their respective mean Pearson's correlation estimates, were PY × HPW (0.363), PY × NP (0.692), and HPW × PWR6 (0.649). A negative coincident correlation occurred only for HPW × NDR6 (-0.541). Between traits from R6 and R8, positive and significant correlations were found for HPW × HSW (0.696), NDR6 × NDM (0.497), and PWR6 × HSW (0.657). Among the traits from the R8 stage, only the correlation between AV and PHM was significant in all environments, but it was negative in the first year and positive in the second. The correlation network (Figure 1) helped visualize the association between groups (R6 and R8) and graphically demonstrated the importance of all traits. In all environments, we observed strong correlations among the R6 traits and a strong relationship between HSW and these traits. The width of each line is proportional to the intensity of the correlation. These analyses help to detect complex statistical patterns that are difficult to extract using other approaches (Silva et al. 2016).
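The idea of a coincident correlation, that is, one that is significant and of the same sign in every environment, can be sketched as follows. This is an illustrative assumption about how such a check could be coded, not the procedure run in Genes; the function name `coincident_correlation`, the environment labels, and the simulated values are all hypothetical and are not the study data.

```python
import numpy as np
from scipy.stats import pearsonr

def coincident_correlation(pairs_by_env, alpha=0.05):
    """Check whether a trait pair is significantly correlated, with the
    same sign, in every environment, and return the mean Pearson r."""
    per_env = {}
    for env, (x, y) in pairs_by_env.items():
        r, p = pearsonr(x, y)
        per_env[env] = (float(r), float(p))
    rs = np.array([r for r, _ in per_env.values()])
    all_significant = all(p < alpha for _, p in per_env.values())
    same_sign = bool(np.all(rs > 0) or np.all(rs < 0))
    return all_significant and same_sign, float(rs.mean()), per_env

# Simulated genotype means (hypothetical, not the study data):
# 45 entries (42 progenies + 3 checks) in each of four environments.
rng = np.random.default_rng(0)
data = {}
for env in ["site1_season1", "site1_season2", "site2_season1", "site2_season2"]:
    pwr6 = rng.normal(10.5, 0.6, size=45)              # pod width, mm
    hsw = 2.0 * pwr6 + rng.normal(0.0, 0.8, size=45)   # hundred-seed weight, g
    data[env] = (pwr6, hsw)

coincident, mean_r, per_env = coincident_correlation(data)
print(coincident, round(mean_r, 3))
```

Averaging the per-environment coefficients mirrors the mean estimates reported in the text; other ways of pooling correlations across environments are possible and could give slightly different values.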
Significant correlations, independent of the environment, suggest a strong genetic association between traits. The positive and significant phenotypic correlations between PWR6 and HSW and between PWR6 and HPW suggest the possibility of an alternative method for obtaining HSW and HPW. Similar results have been reported by Yokomizo et al. (2000) and Silva e Souza et al. (2020) for PWR6 and HSW, with high positive and significant phenotypic correlations of 0.892 (P < 0.01) and 0.830 (P < 0.05), respectively. HSW is an important trait, especially in soybean food products such as edamame (Kaler and Purcell 2021). HSW and HPW quantification is quite laborious and can be replaced with PWR6 estimation, which is faster and easier to apply. Moreover, given the mean correlation of approximately 0.66 between PWR6 and HSW, selecting plants with higher PWR6 is expected to increase HSW and, consequently, may increase GY. These results are consistent with the findings of Li et al. (2019). Fresh seeds of vegetable soybeans have a high moisture content of approximately 57-79%, which is significantly and positively correlated with yield. Thus, moisture content is an important yield-related trait in vegetable soybeans.

NE Casas-Leal et al.
The correlation between PWR6 and NP was moderately negative (-0.529, P < 0.01). Although significance was observed in only three of the four environments, this indicates that the selection of plants with larger pods may cause a reduction in NP. The occurrence of compensation among traits related to productivity is well recognized in this species, which may explain these results (Vaz Bisneta et al. 2015). However, pod appearance and large pod size are important criteria for the selection of quality vegetable soybeans (Manninen et al. 2015).
The positive and significant estimate between HPW and HSW is noteworthy: these characteristics are highly correlated (0.696), so selecting plants with higher HPW is expected to increase HSW. We observed that the estimate of the correlation coefficient between NDM and NDR6 was moderate and positive; that is, plants that reached the R6 stage earlier also reached maturity faster. Indirect selection through mature soybeans may benefit edamame by allowing improvement in most amino acids (Jiang et al. 2018). The strong to moderate significant correlations between the R6 and R8 physiological stages, without changes in the genotype rankings for yield, texture, protein, sucrose, calcium, and iron content, suggest that breeders could evaluate vegetable soybean lines at the R8 stage and indirectly select for other traits at the R6 stage (Mozzoni and Chen 2019). Moreover, Jiang and Katuuramu (2021) found significant correlations between the fatty acids and amino acids of edamame and those of mature soybeans. Taware et al. (1997) also found nonsignificant correlation values between GY and HSW and reported that soybean often promotes compensation between GY and HSW, increasing or decreasing the size of seeds as a function of the number of pods and seeds in development. When limiting environmental factors cause intense competition between plants, intense competition also occurs between different parts of the plant for nutrients and metabolites. This competition is particularly pronounced during the formation of reproductive structures, which results in compensatory variation between the primary components of production (Santos et al. 2013). According to the path analysis results, the trait that contributed the most to PY was the number of pods, which showed the highest direct effect on PY. In terms of GY, PHM had the greatest direct effect.
The associations found in the path analysis were similar to those reported in other studies (Gomes and Almeida Lopes 2005, Machado et al. 2017, Ferrari et al. 2018). To obtain hybrids, it is essential to identify contrasting genotypes with high degrees of genetic divergence. Genetic variability is the raw material for breeders, providing genetic material to generate significant genetic gains in the characteristics of interest (Govindaraj et al. 2015). To leverage genetic diversity in edamame breeding, the major challenge lies in phenotyping (Dhakal et al. 2021). The cophenetic correlation coefficient was 0.6, significant by the t-test at 1% probability, which showed an adequate relationship between the distance matrix and the generated dendrogram.
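The relationship between the distance matrix and the dendrogram can be illustrated with SciPy. The sketch below is an assumption-laden example rather than the analysis run in Genes: the genotype means and the residual covariance matrix are simulated (hypothetical), the squared Mahalanobis distance is used as the dissimilarity, and cutting the tree into at most six groups simply echoes the number of major clusters reported here.

```python
import numpy as np
from scipy.cluster.hierarchy import cophenet, fcluster, linkage
from scipy.spatial.distance import pdist

rng = np.random.default_rng(1)
n_genotypes, n_traits = 45, 10

# Hypothetical inputs (NOT the study data): genotype means for 10 traits
# and a symmetric positive-definite residual covariance matrix.
means = rng.normal(size=(n_genotypes, n_traits))
a = rng.normal(size=(n_traits, n_traits))
resid_cov = a @ a.T / n_traits + np.eye(n_traits)

# Squared Mahalanobis distance (D2) for every pair of genotypes,
# returned in condensed form.
d2 = pdist(means, metric="mahalanobis", VI=np.linalg.inv(resid_cov)) ** 2

# UPGMA corresponds to average linkage on the dissimilarity matrix.
tree = linkage(d2, method="average")

# Cophenetic correlation: agreement between D2 and dendrogram distances.
ccc, _ = cophenet(tree, d2)

# Cut the dendrogram into at most six clusters.
clusters = fcluster(tree, t=6, criterion="maxclust")
print(round(float(ccc), 3), np.bincount(clusters)[1:])
```

A coefficient close to 1 means that the dendrogram preserves the pairwise distances well. The significance test of the coefficient is not reproduced in this sketch.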
We observed some clusters composed of only one or two genotypes (Figure 3), and cultivar IAC 100 proved to be the most divergent among all evaluated genotypes. Other studies have also found genetic divergence between soybean genotypes. Shilpashree et al. (2021) identified eight clusters based on morphological and quality traits. When the intention is to make crosses that result in progenies superior for the traits of interest, the selection of divergent parents should be based on the magnitude of the genetic divergence among genotypes (Cantelli et al. 2016).
In the present study, the generalized Mahalanobis distance allowed us to quantify the relative importance of traits for genetic diversity by evaluating the contribution of each characteristic to the values of D². In general, the traits that contributed the most to genetic divergence were HSW, NDM, PWR6, and NDR6 (Figure 2). Removal of the least important variable from each group of traits evaluated in R6 (PY) and R8 (AV) caused no change in the grouping of genotypes. This indicates that there is little genetic diversity in these traits. However, producing large numbers of fresh pods is a major goal of edamame breeding (Dhakal et al. 2021); thus, we identified nine genotypes with means numerically superior to those of the two vegetable soybean checks (43 and 44), of which six belonged to cluster 3, the cluster with the highest number of genotypes (n = 17).
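The contribution of each trait to D² can be sketched in a few lines. The example assumes the formulation commonly described for the Singh (1981) criterion, in which D² for a pair of genotypes is written as a double sum over traits and the terms belonging to each trait are accumulated over all pairs; it is not taken from the Genes source code. The function name `singh_contributions`, the genotype means, and the residual covariance matrix are hypothetical.

```python
from itertools import combinations

import numpy as np

def singh_contributions(means, resid_cov):
    """Percentage contribution of each trait to the total D2.

    means: (n_genotypes, n_traits) matrix of genotype means.
    resid_cov: (n_traits, n_traits) residual covariance matrix.
    """
    means = np.asarray(means, dtype=float)
    omega = np.linalg.inv(np.asarray(resid_cov, dtype=float))
    s = np.zeros(means.shape[1])
    for i, k in combinations(range(means.shape[0]), 2):
        d = means[i] - means[k]
        # Element j holds sum over j' of omega[j, j'] * d[j] * d[j'];
        # the elements add up to D2 for this pair of genotypes.
        s += d * (omega @ d)
    return 100.0 * s / s.sum()

# Hypothetical data (NOT the study data): 6 genotypes, 4 traits.
traits = ["HSW", "NDM", "PWR6", "NDR6"]
rng = np.random.default_rng(2)
means = rng.normal(size=(6, 4))
resid_cov = np.array([
    [1.0, 0.3, 0.2, 0.0],
    [0.3, 1.0, 0.1, 0.2],
    [0.2, 0.1, 1.0, 0.1],
    [0.0, 0.2, 0.1, 1.0],
])

pct = singh_contributions(means, resid_cov)
for name, value in zip(traits, pct):
    print(f"{name}: {value:.1f}%")
```

Traits with small percentages add little to the discrimination of genotypes, which is the rationale for testing whether their removal changes the clustering.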
In conclusion, the path analysis determined that NP has the greatest favorable effect on PY, and PHM on GY. Thus, these traits are useful for indirect selection in vegetable soybean genotypes. The present work suggests optimization of the phenotyping process in the two stages of soybean development. The positive and significant estimate between HPW and HSW suggests that one of the variables can be disregarded. According to the path analysis, breeders should maintain HPW because it has a direct effect on PY. In contrast, HSW stands out as the most important trait for genetic divergence. Therefore, phenotyping should consider the objectives and stages of the breeding program. Based on the present work, we conclude that there is sufficient genetic variability in the vegetable soybean lines, which have the potential to be used for hybridization in a breeding program for vegetable soybeans.