Phenotypic characteristics and genetic diversity of new broccoli (Brassica oleracea var. italica Plenck) varieties in China

ABSTRACT Due to its high nutritional value, broccoli (Brassica oleracea var. italica Plenck) is one of the most popular vegetables worldwide. To understand the phenotypic diversity of new broccoli varieties in China and to speed up the breeding of varieties with distinctive advantages, this study assessed 36 phenotypic characteristics of 111 broccoli varieties, comprising 108 new varieties and three varieties of common knowledge. The genetic diversity, the principal components, and the clustering of the phenotypic characteristics were further investigated. The coefficients of variation of the 36 characteristics ranged between 11.18 % and 94.99 %, and their diversity indices ranged between 0.26 and 1.82. The 111 broccoli varieties were classified into eight groups, primarily on the basis of differences in curd weight, main stem thickness, degree of plant development, plant height, and anthocyanin coloration. The cumulative contribution rate of the first five principal components reached 81.186 %, corresponding to 12 representative phenotypic traits. The analysis indicated that the phenotypic characteristics of broccoli were rich in diversity, especially for several characteristics valued by the market, such as curd weight, curd firmness, and anthocyanin coloration. This study provides basic information on the genetic diversity of new broccoli varieties in China from 2017 to 2019 and suggests potential breeding strategies for broccoli to meet diverse market demands.


Introduction
Broccoli (Brassica oleracea var. italica Plenck) has become an essential vegetable in China due to its edibility in autumn and winter. Broccoli is rich in vitamins, minerals, and other bioactive compounds, such as glucoraphanin and dietary antioxidants (Eberhardt et al., 2005; Velasco et al., 2007; Soares et al., 2017). Vegetable quality is a comprehensive trait mainly affected by phenotype and multiple internal nutritional elements. As the curd and tender stems are the main edible parts of broccoli, several phenotypic characteristics, such as color, weight, shape, curd size, and stem thickness, are essential factors affecting broccoli quality.
A rich genetic diversity of broccoli germplasm resources is the foundation of germplasm innovation and genetic improvement and provides an excellent gene reserve for cultivating new varieties. The analysis of genetic diversity and the evaluation of broccoli germplasm resources are mainly based on morphology, chemical compounds, or molecular markers (Hu and Quiros, 1991; Louarn et al., 2007; Li et al., 2019; Liu et al., 2022b). Conventional phenotypic diversity is the basis and essential content for revealing biodiversity. Phenotypic characteristics can reflect the level of genetic variation and diversity of a plant in the short term and provide reliable phenotypic data for related studies. Therefore, the description and the evaluation of germplasm resources still rely mainly on phenotypic characteristics. Previous studies found that broccoli cultivars, which have been widespread in China since 1980, had close relationships and similar genetic diversity (Li et al., 2019). A set of 165 cauliflower (Brassica oleracea var. botrytis) inbred lines (primarily from southeast China) presented relatively narrow genetic diversity when assessed with 43 SSR markers (Zhu et al., 2018). However, the level of genetic diversity of the new broccoli varieties released in recent years has yet to be well documented.
In this study, we collected 108 new varieties of broccoli from 2017 to 2019, representing the breeding advances of broccoli varieties in China. Among them, 63 new broccoli varieties were obtained from the national consortium for seed research, which aims to improve the breeding speed of new varieties with advantages and characteristics. According to the new varieties test guideline, we selected 36 phenotypic characteristics of broccoli and analyzed the phenotypic genetic diversity and the principal components. The genetic characteristics of these new broccoli varieties were obtained from the analysis results, which could provide helpful information for further research to improve new broccoli varieties.

Plant materials and field experiments
Seeds of 111 broccoli varieties were collected from 2017 to 2019, composed of 108 new varieties from seven provinces of the Chinese mainland and three varieties of common knowledge from Japan and Chinese Taiwan, including hybrid, the three-sex sterile line, and male-sterile hybrid. Field experiments were conducted in Shanghai, China (30°98' N, 121°36' E, altitude 2.8 m) in 2019. The experiment was conducted in a randomized complete block design with three replicates. Each variety was sown with at least 120 plants in August, and each experimental unit measured 18 m by 2.31 m with three rows. The row spacing was 55 cm, while the plant spacing was 50 cm. The soils at the experiment site were paddy soils. All trials were managed following standard agricultural practices. The basal fertilizer was 52,500 kg ha⁻¹ of organic fertilizer and 375 kg ha⁻¹ of compound fertilizer. The plots were irrigated 2-3 times after field planting. Urea was applied at 225 kg ha⁻¹ 15 days after planting, and compound fertilizer was applied at 450 kg ha⁻¹ about 45 days after planting.

Measurement of phenotypic traits and data collection
Thirty-six characteristics were selected from "Guidelines for the conduct of tests for distinctness, uniformity and stability - Broccoli (Brassica oleracea var. italica Plenck)" (Jian et al., 2015; UPOV, 2016; UPOV, 2018). We assigned criteria for 26 qualitative and pseudo-qualitative characteristics (Table 1). In addition, ten quantitative characteristics, including the degree of plant development, plant height, leaf number, leaf length, leaf width, petiole length, main stem thickness, flower stem length, curd weight, and curd thickness, were measured from 20 spikes chosen randomly in each plot. The numerical data for the ten quantitative characteristics were transferred to ordinal scales of 1-9. The specific measurements were as follows: degree of plant development (cm) was the widest plant part; plant height (cm) was measured from ground level to the leaf tip; leaf number was the number of fully developed leaves on each plant; leaf length (cm) and width (cm) were the longest and widest parts of a leaf; petiole length (cm) was measured from one petiole end to the other; main stem thickness (mm) was the greatest thickness of the main stem, measured with a vernier caliper; flower stem length (cm) was measured as the longest branch of the curd with a vernier caliper; curd weight (g) was measured with an electronic balance; curd thickness (cm) was measured as the distance across the widest part of the curd.
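The transfer of a measured value to a 1-9 ordinal scale can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name `to_ordinal`, the class boundaries (a first class below 25 cm, then 5 cm steps, with an open-ended ninth class), and the sample heights are all assumptions chosen for the example.

```python
def to_ordinal(value, first_upper=25.0, width=5.0, n_classes=9):
    """Map a measurement to an ordinal class 1..n_classes.

    Class 1 covers values below `first_upper`; each following class spans
    `width` units; the last class is open-ended. The defaults are
    illustrative assumptions, not values prescribed by the test guideline.
    """
    if value < first_upper:
        return 1
    # Count the full class widths above the first boundary, starting at class 2.
    k = int((value - first_upper) // width) + 2
    return min(k, n_classes)


# Hypothetical plant heights in cm (made-up values for illustration only).
heights_cm = [22.4, 25.0, 38.6, 47.1, 63.0]
print([to_ordinal(h) for h in heights_cm])  # → [1, 2, 4, 6, 9]
```

Encoding quantitative traits this way puts them on the same ordinal footing as the qualitative and pseudo-qualitative characteristics, so all 36 characteristics can enter the same frequency and diversity calculations.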

Statistical analysis
The mean, standard deviation, coefficient of variation, minimum (min), maximum (max), and range for the 36 phenotypic characteristics were calculated. The distribution frequency of the assigned states of the 36 characteristics was analyzed. The numerical values of the ten quantitative characteristics were divided into nine intervals, and the distribution frequency of every quantitative characteristic was calculated. Pearson's correlation analysis (Fisher, 1924) was conducted to calculate the relationships among the ten quantitative characteristics using SPSS Statistics 23.0 software. The Shannon-Weaver diversity index (H') was used to evaluate genetic diversity, calculated as $H' = -\sum_{i=1}^{n} P_i \ln P_i$, where $P_i$ represents the frequency of a specific characteristic in the i-th level (Shannon and Weaver, 1949). The Euclidean distance was used as the distance between germplasms, and the UPGMA (Unweighted Pair Group Method with Arithmetic mean) method was used for the cluster analysis (Nei and Li, 1979). The principal component analysis (PCA) (Sneath and Sokal, 1973) was performed on the 36 phenotypic characteristics using SPSS Statistics 23.0 software.
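The two descriptive statistics used most heavily in the results, the coefficient of variation and the Shannon-Weaver index, can be sketched in a few lines. The function names and the input data are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not state which form SPSS was configured to report.

```python
import math
from collections import Counter
from statistics import mean, stdev


def shannon_index(states):
    """H' = -sum(P_i * ln P_i), with P_i the frequency of the i-th state."""
    counts = Counter(states)
    n = len(states)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def coefficient_of_variation(values):
    """CV in percent; sample standard deviation is assumed here."""
    return stdev(values) / mean(values) * 100.0


# Hypothetical two-state trait scored 1 (absent) or 9 (present) in a 1:1 ratio.
two_state = [1, 9] * 50
print(round(shannon_index(two_state), 3))  # → 0.693
print(round(coefficient_of_variation(two_state), 2))
```

For a characteristic with k expression states, H' is bounded by ln k, so a two-state trait cannot exceed about 0.693 while a nine-state trait can reach about 2.197. This is worth keeping in mind when comparing indices across characteristics with different numbers of states.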

Frequency distributions of phenotypic characteristics
The frequency distributions of the 36 phenotypic characteristics from 111 broccoli varieties are shown in Table 2. Different phenotypic characteristics had different frequency distributions. Anthocyanin coloration of the hypocotyl (ACH), plant collateral curd (PCC), leaf anthocyanin coloration (LAC), and male sterility (MS) had only two states of expression, absent or present. The ratio of absent to present states for ACH, PCC, and LAC was approximately 1:1, while that for MS was about 1:3. Leaf shape was mainly narrow oval and medium oval, accounting for 54.1 % and 45.0 %, respectively. Over half of the broccoli varieties (62.2 %) had no curd anthocyanins. Curd buds were uniform in 74.8 % of the broccoli varieties. Curd bulb color was only light green or medium green, in a ratio of approximately 1:1. Maturity was mainly distributed in 115-145 days and 160-190 days, accounting for 53.1 % and 43.2 %, respectively. The degree of plant development, leaf length, leaf width, and curd thickness were distributed within a narrower range than the other quantitative characteristics. The curd thickness of 89.2 % of varieties was distributed in 60.0-79.9 cm and the degree of plant development of 85.6 % was distributed in 12.0-15.9 cm. The leaf length of 82.9 % of varieties was distributed in 30.0-39.9 cm and the leaf width of 90.1 % was distributed in 15.0-19.9 cm.

Diversity of phenotypic characteristics
The mean, standard deviation, coefficient of variation, and diversity index (H') of the 36 phenotypic characteristics are shown in Table 3. The coefficient of variation reflects the dispersion of the data: a larger coefficient of variation indicates a greater degree of variation. Curd anthocyanin coloration, leaf anthocyanin coloration, and anthocyanin coloration of hypocotyl showed large variation, with coefficients of 94.99 %, 90.50 %, and 89.28 %, respectively. The leaf characteristics generally had a small degree of variation, including leaf foliar wax powder, leaf apex shape, and degree of plant development, with coefficients of 11.18 %, 14.08 %, and 13.85 %, respectively. The yield-related characteristics showed a wide range of variability. For instance, curd weight and diameter ranged from 73.72 g to 524.57 g and 7.63 cm to 25.83 cm, respectively. Considerable variability was also observed for flower stem length, ranging from 1.69 cm to 6.62 cm. The Shannon-Weaver diversity index reflects the richness of expression states and the evenness with which individuals are distributed among them. A higher diversity index value corresponds to a more uniform distribution of the varieties among the states of a characteristic. The diversity indices of the degree of occurrence of collateral curd, curd bud size, curd firmness, leaf number, main stem thickness, and curd weight were greater than 1.5, at 1.82, 1.70, 1.75, 1.67, 1.75, and 1.66, respectively. Curd covered by inner leaves and curd dome shape showed low diversity indices, at 0.26 and 0.46, respectively. The diversity index of the 16 characteristics related to curd varied from 0.26 to 1.75, with an average value of 1.11. The diversity index of the 13 characteristics related to leaf varied from 0.46 to 1.67, with an average value of 0.93. The genetic diversity of the curd parts was greater than that of the leaf parts.
[Table 2 fragment: class intervals for plant height, PH (cm): 0-24.9, 25.0-29.9, 30.0-34.9, 35.0-39.9, 40.0-44.9, 45.0-49.9, 50.0-54.9, 55.0-59.9, 60.0-; frequencies (%) for curd diameter, CD (cm), across the nine classes: 0.9, -, 7.2, 51.4, 37.8, 0.9, 0.9, -, 0.9.]

Correlation of phenotypic characteristics
Phenotypic diversity of broccoli. Sci. Agric. v.80, e20220223, 2023
Correlation coefficients among the 36 phenotypic characteristics were analyzed (data not shown). Five characteristics were significantly correlated with more than 20 characteristics, including curd firmness, petiole length, main stem thickness, flower stem length, and curd weight. However, the undulation of leaf margin, leaf apex shape, number of leaf lobes, and curd covered by inner leaves had significant correlations with fewer than five characteristics. Furthermore, the strongest positive correlations were found among the anthocyanin colorations of hypocotyl, leaf, and curd. Flower stem length was significantly negatively correlated with 17 characteristics in this study. As the flower stem length increases, the shape of the curd tends to become transverse narrow elliptic, significantly affecting yield. Maturity and male sterility were both significantly positively correlated with the degree of plant development, main stem thickness, and curd weight. Curd weight had significant correlation coefficients with most quantitative characteristics except leaf number. The results suggested that plant growth affects yield-related characteristics.
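As an illustration of how such pairwise correlations are obtained, the sketch below computes Pearson's r and the associated t statistic in plain Python. The study itself used SPSS; the function names and the five data pairs are hypothetical and serve only to show the calculation.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t value for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Made-up main stem thickness (mm) and curd weight (g) for five plants.
stem_mm = [28.0, 31.5, 35.2, 40.1, 44.8]
curd_g = [150.0, 210.0, 260.0, 330.0, 410.0]

r = pearson_r(stem_mm, curd_g)
print(round(r, 3), round(t_statistic(r, len(stem_mm)), 2))
```

With 111 varieties the test has 109 degrees of freedom, so even moderate coefficients can reach significance. This is one reason a single characteristic may be significantly correlated with 20 or more others.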

Cluster analysis
Based on the data of 36 phenotypic characteristics, the 111 varieties of broccoli were classified into eight groups (Figure 1). Pictures of typical broccoli varieties in the eight groups are shown in Figure 2. Group I included three varieties (BO24, BO79, and BO83), whose degree of plant development, leaf length, main stem thickness, flower stem length, and curd weight and diameter were significantly smaller than those of the other varieties. For example, the curd weight of the three varieties was 200-250 g. Group II contained 19 varieties with no collateral curd, anthocyanin coloration of hypocotyl, leaf, and curd, and male sterility. Group III contained seven varieties (BO49, BO104, BO23, BO54, etc.), whose quantitative characteristics were generally below average. Group IV, composed of 16 varieties (BO12, BO68, BO90, BO110, etc.), had collateral curd and anthocyanin coloration of hypocotyl, leaf, and curd. Group V contained four varieties (BO2, BO4, BO40, BO84) with collateral curd and anthocyanin coloration of the leaf. Group VI had only one variety (BO57), whose quantitative characteristics were greater than the overall average. Group VII included 34 varieties with an intermediate level of quantitative characteristics (BO1, BO34, BO51, BO89, etc.). Group VIII had no anthocyanin coloration of hypocotyl, leaf, curd, and no collateral curd covered by inner leaves.
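The grouping step, UPGMA clustering on Euclidean distances, can be sketched in plain Python as below. This is a simplified illustration rather than the software procedure used in the study: the variety labels V1-V5 and their three-trait ordinal profiles are invented, and the function stops at a requested number of groups instead of building the full dendrogram.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two trait profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def upgma(profiles, n_groups):
    """Merge the two closest clusters until n_groups remain.

    The distance between clusters is the unweighted mean of all pairwise
    member distances, which is the UPGMA (average linkage) criterion.
    """
    clusters = [[name] for name in profiles]

    def avg_dist(c1, c2):
        total = sum(euclidean(profiles[a], profiles[b]) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_groups:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda p: avg_dist(clusters[p[0]], clusters[p[1]]))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters


# Hypothetical ordinal scores (1-9) for three traits of five varieties.
profiles = {
    "V1": [1, 1, 3], "V2": [1, 1, 4],
    "V3": [9, 9, 5], "V4": [9, 9, 6],
    "V5": [5, 1, 9],
}
groups = upgma(profiles, n_groups=3)
print(sorted(sorted(g) for g in groups))  # → [['V1', 'V2'], ['V3', 'V4'], ['V5']]
```

A single-member group, like V5 here, arises whenever one profile stays far from every other cluster until the cut. Group VI, which contains only BO57, is the analogous case in the dendrogram.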

Principal component analysis
Principal component analysis (PCA) condenses multi-factor data into a small number of components through dimension reduction while retaining most of the information. Within each principal component, the two largest absolute values in the eigenvector were selected to reflect the contribution of the corresponding traits. The results showed that the eigenvalues of the first five principal components were higher than 1, and their cumulative contribution rate reached 81.186 %, which could reflect the basic information of the 36 phenotypic traits of broccoli (Table 4). The first principal component (PC1) accounted for 34.362 % of the total variance and was primarily attributed to the anthocyanin coloration of hypocotyl, leaf, and curd. The second principal component (PC2) accounted for 22.873 % and was attributed primarily to collateral curd and its degree of occurrence. The third principal component (PC3) was represented by male sterility and was responsible for an additional 13.672 % of the total variance. The fourth principal component (PC4) accounted for 5.875 % of the total variance and was attributed primarily to maturity, main stem thickness, curd firmness, and curd weight. The fifth principal component (PC5) mainly reflected curd bud size, leaf number, and anthocyanin coloration of hypocotyl and leaf, accounting for 4.404 %.
Projecting the accessions onto the first two PCs showed apparent separations corresponding well to the eight clustered groups (Figure 3). The varieties of Group VII were mainly distributed on the upper left of the origin, corresponding to the absence of anthocyanin coloration in hypocotyl, leaf, and curd. Group V and Group VI were distributed above the origin, and the main stem thickness of their varieties was greater than that of the other groups. Group III and Group IV were mainly distributed on the upper right of the origin, reflecting the lightest curd weight. Group VIII was mainly distributed on the lower left of the origin, reflecting heavy curd weight. Group I was mainly distributed below the origin, with lower plant height. The germplasm resources of Group II were mainly distributed on the lower right of the origin and had anthocyanin coloration on hypocotyl, leaf, and curd.
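The contribution rates reported for the five components can be checked with simple arithmetic. The sketch below accumulates the reported percentages. It also back-calculates approximate eigenvalues under the assumption, not stated in the text, that the PCA was run on the correlation matrix of all 36 standardized characteristics, in which case each eigenvalue equals the contribution rate multiplied by 36.

```python
from itertools import accumulate

# Contribution rates (%) of PC1-PC5 as reported in the text.
rates = [34.362, 22.873, 13.672, 5.875, 4.404]

cumulative = [round(c, 3) for c in accumulate(rates)]
print(cumulative)  # → [34.362, 57.235, 70.907, 76.782, 81.186]

# Assumption: correlation-matrix PCA on 36 variables, so total variance = 36.
n_traits = 36
eigenvalues = [rate / 100.0 * n_traits for rate in rates]
print([round(e, 2) for e in eigenvalues])
```

Under that assumption every back-calculated eigenvalue exceeds 1, which is consistent with the retention criterion described above, and the running total reproduces the cumulative rate of 81.186 %.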

Discussion
Information on many new varieties can reflect the situation of variety breeding in a region. Therefore, it is necessary to assess the new indigenous varieties. However, no reports are yet available on the phenotypic diversity of new broccoli varieties based on large samples in China. The 111 experimental materials used in this study contained much information regarding the germplasm resources. Abundant germplasm resources can broaden the choice of the public and enrich the market. The commercial value of broccoli depends mainly on the quality of its curd, including the curd weight (CW), curd firmness (CF), curd covered by inner leaves (CCIL), curd anthocyanin coloration (CAC), and maturity (M). In this study, the diversity index of CCIL was only 0.26. Besides, only 7.2 % of broccoli varieties had curd covered by inner leaves, indicating that most breeders used mature breeding technology to avoid breeding varieties with CCIL, as this characteristic could significantly affect the quality of the curd. In contrast, the Shannon-Weaver diversity indices of CW, CF, CAC, and M were 1.66, 1.75, 1.27, and 1.49, respectively. The results indicated that these quality-related characteristics had rich variation. There is still room to optimize these four characteristics in subsequent breeding.
The CW is directly related to the broccoli yield. In this study, the CW positively correlated with the degree of plant development, plant height, leaf length, and leaf width. Thus, these characteristics, indicative of high-quality varieties, could be critical targets for future breeding.
The purple color of broccoli is due to the accumulation of anthocyanins (Liu et al., 2020). Anthocyanin pigments are synthesized in different parts of broccoli plants, such as hypocotyl, leaf, and curd. Anthocyanin coloration is a valuable trait for characterizing broccoli genetic resources, since it positively correlates with tolerance to abiotic stresses (Chalker-Scott, 1999). Some broccoli varieties with green curd produce anthocyanins in flower buds under low winter temperatures (Rahim et al., 2019).
In our study, among the 36 characteristics of 111 broccoli varieties, the coefficient of variation of anthocyanin coloration was the highest, indicating large variation in anthocyanin coloration in the materials tested. Richness in anthocyanins has attracted more and more attention as an essential trait in purple broccoli breeding (Liu et al., 2022a). Anthocyanin-rich varieties of broccoli could be used as the primary material for anthocyanin research in the future. Moreover, the quality and nutrition of broccoli had a close genetic relationship with maturity, and the maturity of the same cultivar usually differed among regions (Coupe et al., 2003; Dallaire et al., 2006; Tian et al., 2018; Lu et al., 2009). In this study, the diversity in maturity was considerable among the broccoli varieties bred in China in recent years. In previous studies, maturity was generally considered the most reliable indicator for categorizing broccoli varieties (Quiros et al., 2011). Our study showed that maturity was significantly correlated with ten characteristics of leaf and curd, meaning that the growth of a large volume of vegetative organs took time to accumulate, resulting in later maturity. In addition, by comparing the maturity and geographical origin of each variety, we found that most varieties bred in North China (e.g., Beijing and Tianjin) have an early maturing period. In contrast, the varieties bred in South China (e.g., Fujian) were mainly late-maturing. The varieties bred in East and Central China (e.g., Shanghai and Hubei) were either early-maturing or late-maturing. However, Zhu et al. (2018) found that most broccoli inbred lines were not tightly clustered according to maturity. The results above suggested that a certain degree of introgression occurs in the gene pool of broccoli varieties of different maturity.
Li et al. (2019) pointed out that 95 broccoli cultivars have been separated into five primary subgroups since 1980 in China, including primary cultivars and innovative resources. Zhu et al. (2018) found that the genetic diversity of the inbred lines derived from the primary areas for cauliflower breeding in China (Fujian, Zhejiang, and Taiwan) was relatively low. In the current study, the 108 new broccoli varieties were mainly bred in Northeast, East, and Southeast China. The cluster analysis showed that the varieties tested were not clustered strictly according to region but were dispersed among the groups. It was also found that the two broccoli resources from Japan, BO17 and BO18, were clustered into Group VIII and Group II, respectively. This result may be related to the cross-utilization of germplasm resources and to germplasm derived from the same regions having different genetic backgrounds. The lack of an obvious correlation between materials and regions indicated that plant phenotypic traits were jointly regulated by internal genetic factors and the external environment.
Several studies used cluster and principal component analysis methods to comprehend broccoli genetic diversity and relationships. Desirable hybridization combinations of plant materials were obtained through the analysis method above (Sun et al., 2001; Adeyemo et al., 2011; Zhao et al., 2014). In our study, some phenotypic characteristics affected the taxonomy of broccoli germplasm from the PCA, mainly including collateral curd, male sterility, maturity, curd weight, curd firmness, and anthocyanin coloration of hypocotyl, leaf, and curd. Curd firmness is an essential characteristic that breeders usually consider in broccoli breeding. We also found that most broccoli varieties were not tightly clustered according to curd firmness or curd weight. A similar result was also found in cauliflower germplasms (Zhu et al., 2018).
Phenotypic studies are an essential foundation of biological research. Without the support of phenotypic data, it is impossible to rely solely on bioinformatics data, such as those from the genome and transcriptome, to understand the breeding characteristics and mechanisms (Zhang et al., 2022). Therefore, our study provided phenotypic information on new broccoli varieties, which could help screen genotypes in broccoli breeding and advance breeding development.

Conclusion
This study analyzed the genetic diversity of 108 new broccoli varieties and three varieties of common knowledge on the basis of 36 phenotypic characteristics. The coefficient of variation of the 36 characteristics ranged from 11.18 % to 94.99 %, indicating apparent differences among the phenotypic characteristics of broccoli varieties. Higher diversity indices were present mainly in the curd characteristics, suggesting that the curd-related characteristics were the main factors in the phenotypic differences of broccoli. Moreover, the broccoli varieties were clustered into eight groups, primarily attributed to phenotypic differences in anthocyanin coloration, degree of plant development, plant collateral curd, main stem thickness, flower stem length, and curd weight. Considering the commercial value of broccoli, several curd-related traits that primarily affect its quality, such as curd weight, firmness, anthocyanin coloration, and maturity, were rich in diversity. Curd weight was significantly positively correlated with main stem thickness, degree of plant development, and leaf length. Thus, these characteristics could serve as target traits for breeding high-yield broccoli varieties in the future.

Figure 1 - Cluster dendrogram based on 36 phenotypic characteristics from 111 broccoli varieties. The different colors of the cluster dendrogram represent different groups, and the color dots on the left indicate the source of the corresponding resource.

Figure 3 - Principal coordinate two-dimensional plot of 111 broccoli varieties. Different colors correspond to the different groups in Figure 1. PC1 = The first principal component; PC2 = The second principal component.

Figure 2 - Typical broccoli varieties of eight cluster groups.

Table 2 - The distribution frequency of the different expression states of the 36 phenotypic characteristics assessed in 111 broccoli varieties.

Table 3 - Statistical analysis of the 36 phenotypic characteristics diversity of 111 broccoli varieties.
of hypocotyl, leaf, and curd.Group V contained four varieties (BO2, BO4, BO40, BO84) with collateral curd and anthocyanin coloration of the leaf.Group VI only

Table 4 - Principal component analysis of phenotypic characteristics of 111 broccoli varieties.