Population structure and phylogenetic relationships in Brassica rapa L. subspecies by using isozyme markers

figure) Abstract The present study aimed to assess population structure and phylogenetic relationships of nine subspecies of Brassica rapa L. represented with thirty-five accessions cover a wide range of species distribution area using isozyme analysis in order to select more diverse accessions as supplementary resources that can be utilized for improvement of B. napus . Enzyme analysis resulted in detecting 14 putative polymorphic loci with 27 alleles. Mean allele frequency 0.04 (rare alleles) was observed in Cat4A and Cat4B in sub species Oleifera accession CR 2204/79 and in subspecies trilocularis accessions CR 2215/88 and CR 2244/88. The highest genetic diversity measures were observed in subspecies dichotoma, accession CR 1585/96 (the highest average of observed ( H 0 ) and expected heterozygosity ( He ), and number of alleles per locus ( Ae )). These observations make this accession valuable genetic resource to be included in breeding programs for the improvement of oilseed B. napus . The average fixation index (F) is significantly higher than zero for the analysis accessions indicating a significant deficiency of heteozygosity. The divergence among subspecies indicated very great genetic differentiation ( F ST = 0.8972) which means that about 90% of genetic diversity is distributed among subspecies, while 10% of the diversity is distributed within subspecies. This coincides with low value of gene flow (Nm = 0.0287). B. rapa ssp. oleifera (turnip rape) and B. rapa ssp. trilocularis (sarson) were grouped under one cluster which coincides with the morphological significativa de heterozigose. A divergência entre as subespécies indicou uma grande diferenciação genética (FST = 0,8972), o que significa que cerca de 90% da diversidade genética é distribuída entre as subespécies, enquanto 10% da diversidade é distribuída nas subespécies. Isso coincide com o baixo valor do fluxo gênico (Nm = 0,0287). B. rapa ssp. oleifera (nabo) e B. rapa ssp. trilocularis (sarson) foram agrupados conforme a classificação

It is thought that B. rapa L. is originated in the mountainous areas near the Mediterranean Sea (Tsunoda, 1980). However, the time of domestication is unknown. There are two main centres of origin for B. rapa L.: one is the Mediterranean and the other is the Afghanistan-Pakistan region (Sinskaia, 1928).
B. rapa, with the exception of the Indian yellow sarson form (subspecies trilocularis (Roxb.) Hanelt), is an obligate outcrosser due to the presence of self-incompatible genes (Kimber and McGregor, 1995).
Subspecies of B. rapa L. species are widely cultivated as leaf and root vegetables, fodder and oilseed crops. In addition, they can be a weed of cultivated land and disturbed sites (Shahzadi et al., 2015). Some of the plants of B. rapa L. subspecies includes both annual and biennials forms, e.g. subsp. Chinensis, subsp. oleifera and B. rapa subsp. rapa L.; others are biennials e.g. subsp. pekinensis.
Genetic diversity is essential for long-term survival in time and in place as it supplies the species by the plasticity to cope with the new conditions brought about by the environment. It has a leading role in competition, symbiosis, parasitism, and the impact of climate and the effect of deficiencies (Tanhuanpää et al., 2016). The knowledge of the genetic diversity of plant genetic resources is essential for saving the present genetic diversity and for subsequent utilization in crop improvement. In Brassica genetic diversity study is major requirement for success in plant breeding and crop improvement (Shahzadi et al., 2015). Knowledge of genetic diversity in Brassica gene pool may be a valuable genetic source to overcome obstacles in genetic improvement (Bird et al., 2017). Brassica rapa L is one of the valuable crop species in Brassicaceae family. It is known by its common names field mustard, turnip, and/or Chinese cabbage. It is characterized by its broader global distribution than most of other Brassica species (OECD, 2016). The assessment of genetic distances within B. rapa L. may help to broaden the genetic diversity in B. napus since it suffered a series of bottlenecks during its development and has reduced genetic diversity (Becker et al., 1995;Cowling, 2007). Moreover, landraces of B. rapa L. are adapted to a broad range of environments, e.g. cold or high temperatures, across a very wide geographic area (Dixon, 2007) which means that they contain valuable genetic traits which can be used for B. napus improvement (Annisa et al., 2011).
The aim of the present study was to assess of genetic diversity and population structure of Brassica rapa L. subspecies in order to select more diverse accessions as Supplementary Material resources that can be utilized for improvement of B. napus.

Biological Material
Thirty-five accessions of B. rapa L. were obtained as donation from IPK gene bank Gatersleben Germany. The accessions belong to nine subspecies of B. rapa L. The represented subspecies are: chinensis, pekinensis, rapa, oleifera f. annua (annual turnip rape, summer turnip rape), Brassica rapa subsp. campestris (L.), oleifera f. biennis ((biennial turnip rape, winter turnip rape), dichotoma, oleifera, trilocularis. The origin and the accession number of these accessions are recorded as shown in Table 1.

Isozymes extraction, electrophoresis and staining
The seeds of the studied accessions were surface sterilized in 70% v/v ethanol for 1 min before germination at 25ºC in sterilized Petri dishes with three moist filter papers. Three-day-old seedlings of each accession were macerated in 5 ml saline solution containing 0.8% NaCl and 0.2% NaNO 3 , then centrifuged at 12000 rpm for 15 minutes. Supernatants were collected in pre-chilled tubes and stored at -20ºC until use for electrophoretic separation of isozymes.
15 µL clear supernatants were mixed with equal volumes of loading buffer (50% glycerol containing 1% bromophenol blue), then applied directly on 7% vertical PAGE at 4°C in a Mini Protean III unit (BioRad, USA) according to the method of Manchenko (1994). Electrophoresis was carried out at 15 mA/gel for 60 min.
The electrophoresed gels were stained for acid phosphatase, catalase, α-esterase and peroxidase (Pasteur et al., 1988). The gels of acid phosphatase were incubated in 100 ml solution of 0.1 M sodium phosphate buffer, pH 5.1, at 37 0 C for 3 to 5 h, then stained in solution formed of 10 mM I 2 mixed with 14 mM KI developing white bands on a dark blue background. The chromatic or light brown bands appeared at the bottom of the gels were amylase bands. The gels of catalase were stained by immersing in 1:1 mixture of solutions 2% potassium ferricyanide and 2% ferric chloride after incubation in a solution of 3% H 2 O 2 for about 15 min. The gels were then washed and gently agitated for a few minutes in water. Yellow bands of catalase activity appeared on a blue-green background. The gels of α-esterases were incubated at 37°C for 15 min in 100 ml staining solution consisted of 0.05 M phosphate buffer, pH 7.2, containing 1% α-naphthyl acetate and 50 mg Fast Blue RR until brown colored bands appeared. The stained gels were photographed as quickly as possible and stored in 3% acetic acid. The gels of peroxidase were incubated in 100 ml 0.05 M acetate buffer (pH 5.0 containing 65 mg benzidine dissolved in 1ml of ethanol. 2 ml of 0.1 M CaCl 2 were added as co-enzyme. Finally, 2 ml of H 2 O 2 were added as a substrate and incubated in refrigerator until dark brown bands appeared. Stained gels were washed with distilled water and fix in 50% glycerol (Soltis et al., 1983). At least 5 and generally 10 plants per accession were examined for isozyme patterns.

Data analysis
The data of isozymes were analyzed using POPGENE version 1.31 Microsoft Window-based Freeware for Population Genetic Analysis. The construction of a dendrogram was made based on Nei's genetic distance using UPGMA (Nei, 1978). The banding patterns were first encoded using Microsoft Excel for easier editing before being transformed into a POPGENE data file.

Loci and alleles scored
Enzyme electrophoresis resulted in detecting 14 putative polymorphic loci with 27 alleles. The mean allele frequency ranged 0.04 in loci Cat4A and Cat4B to 0.62 in Per1B.
The alleles having frequency 0.04 were termed rare alleles (as shown in Supplementary Table 1). These alleles were observed for Cat4A and Cat4B in sub species Oleifera accession CR 2204/79 and in subspecies trilocularis accessions CR 2215/88 and CR 2244/88. It also showed that the total number of alleles in each accession for the 14 loci ranged from 4 in subspecies chinensis accession RA 1637/94"K89 and subspecies rapa accessions BRA 337/79, BRA 1116/99 and BRA 1224/87 to 17 in subspecies chinensis accession BRA 116/ 80 with a mean of 6.8 (total = 27). The percent of polymorphic loci varied from 0.29 in accession RA 1637/94"K89 of subspecies chinensis and accessions BRA 337/79, BRA 1116/99 and BRA 1224/87 of subspecies rapa L. to 1.21 in accession BRA 116/ 80 of subspecies chinensis (as shown in Supplementary Table 1).

Genetic diversity and accession-level homozygosity
The mean number of alleles per locus (A) and the effective number of alleles per locus (Ae) varied respectively from 1.13 in subspecies chinensis, accession BRA 1014/85 to 1.63 in subspecies rapa, accession BRA 1321/91 with a mean of 1.35 and from 1.13 in subspecies chinensis, accession BRA 1014/85 to 1.75 in subspecies dichotoma, accession CR 1585/96 with a mean of 1.281 (as shown in Table 2).
The average fixation index (F) is significantly higher than zero for the analysis accessions (as shown in Table 2) indicating a significant deficiency of heteozygosity. In general, it was found that the number of loci with significant deficiency of heterozygosity for all accessions was extremely higher than the number of loci with excessive heterozygosity. The non significant inbreeding coefficients (NS) were ranged between 1.00 in subspecies dichotoma accession CR 1585/96 to 7.00 in sub species oleifera biennis accessions CR 289/84 & CR 1586/87and subspecies Oleifera accession CR 2203/79 (as shown in Table 2).

Genetic structure and gene flow
The mean breeding index was significantly higher than zero (F IT = 0.8228), which means that there is excess of average heterozygotes in the studied accessions. A high and significant value was obtained for F ST (The coefficient of genetic differentiation) with mean = (0.8972) suggesting the occurrence of random mating system for the studied accessions (as shown in Table 3). The mean value of F IS all over the analyzed loci was -0.7235 confirming the self-incompatibility of B. rapa. The estimate of gene flow based on Wright (1951)

Genetic distance and dendrogram
Dendrogram constructed based on isozyme showed four main clusters at Nei's genetic distance 0.60 (see Figure 1). Accessions of subspecies oleifera and trilocularis were collected with each other in cluster (C2). The accessions of subspecies oleifera biennis and oleifera annua were separated in clusters (C1) and (C2) respectively. The accessions of other subspecies (chinensis, pekinensis, rapa, dichotomy, campestris L.) were clustered in two clusters each.
The matrix of eigenvectors and values of the principle components (PCs) resulting from the interaction of the isozyme data influenced 54.93% of the variability accumulated up to the first four components of the PCA (as shown Table 4). The accessions of subspecies rapa, dichotoma, oleifera and trilocularis were separated on PC1 and the accessions of oleifera biennis on PC2. The other accessions were separated on more than one access.

Discussion
The surprising phenotypic diversity of B.rapa has made it a precious multi-use crop for food, fodder, and oil. Although the complex domestication history and intense selective breeding of B. rapa had an effective role in generating and shaping this diversity, they highly complicated the elucidation of population structure and evolutionary relationships in the species. With using isozyme data, we provide a more detailed analysis of population structure and the relationships of B. rapa subspecies.

Loci and alleles scored
The alleles with low mean allele frequency (0.04) or what were known as rare alleles were observed in Cat4A and Cat4B in sub species oleifera accession CR 2204/79 and in subspecies trilocularis accessions CR 2215/88 and CR 2244/88. The presence of this allele could be due to deleterious mutations or may be due to evolutionary relics (Sammour et al., 2019). The detection of rare allele in combination with low allelic frequency of other loci leads to the conclusion that the studied accessions had narrow genetic differentiation.
Although the studied accessions except the accessions of subspecies trilocularis are an obligate outcrosser due to the presence of self-incompatible genes, the mean allele frequency in general was low. This was attributed to breakdown and mismatch of the differently adapted gene complexes or what is known outbreeding depression. The great variation in allele frequency in studied subspecies could be attributed to these subspecies include many varieties or forms or cultivars. The selected forms of some subspecies e.g. chinensis exhibited significantly different morphological traits which made early botanists classified them as separate species (OECD, 2016).

Genetic diversity and accession-level homozygosity
There was wide variation in mean number of allele per locus (A) and effective number of allele per locus (Ae). Such wide variation was reported by Karam et al. (2014). However, the subspecies which showed low and high number of allele per locus (A) and effective number of allele per locus in Karam et al. (2014) were different from that observed in the present study. This might be attributed to that Karam et al. (2014) carried out their study at population level and we carried out study at accessions level.
Low average of observed heterozygosity (Ho) and expected heterozygosity (He) which observed in sub species chinensis (accession K 9420/95), subspecies rapa (accessions BRA 1116/99) and subspecies campestris (L.) (accession CR 1578/93) might be attributed to these accessions at the edge of the subspecies distribution centre; the position at which population sizes gradually decrease as does genetic diversity or the origin of these accession could be associated with some sort of bottleneck event followed by the absence of inter-population gene flow (Meeus et al., 2012;Surina et al., 2014;Radosavljevic et al., 2015).
The highest average of observed and expected heterozygosity, and the highest effective number of alleles per locus (Ae) were observed in subspecies dichotoma, accession CR 1585/96 from Canada. The variation in genetic structure between subspecies dichotoma, accession CR 1585/96 from Canada and other accessions of subspecies dichotoma from Asia could be due to Asian accessions were subjected to long term selection which narrow their genetic variation or/and the variation of geographical position of the accessions within subspecies distribution range or severe effects of small population sizes (Radosavljevic et al., 2015). The highest genetic diversity measures were observed in subspecies dichotoma, accession CR 1585/96 (the highest average of observed and expected heterozygosity, and number of alleles per locus (Ae)), These observations make subspecies dichotoma, accession CR 1585/96 valuable genetic resources to be included in breeding programs for the improvement of oilseed B. napus.

The population structure and gene flow
The population structure and gene flow were analyzed in the term of F statistic. Genetic divergence was quantified by computing F statistic as an indicator for genetic diversity and gene flow among subspecies. The inbreeding coefficient of the individuals within each subspecies was relatively low (F IS = -0.7235) which agreed with the self-incompatibility of B. rapa.
The coefficient of genetic differentiation (F ST ) was 0.8972 which was considered very great based on the guidelines suggested by Wright (1978). This guideline considered F ST ranges 0.0 to 0.05, 0.05 to 0.15, 0.15 to 0.25 and above 0.25 as indicator for little, moderate, great and very great genetic differentiation respectively. The divergence among subspecies indicated very great genetic differentiation (F ST = 0.8972) which means that about 90% of genetic diversity is distributed among subspecies, while 10% of the diversity is distributed within subspecies. This was coincided with low value of gene flow (Nm = 0.0287).
Although Brassica rapa was classified as obligate self-incompatible (Koch and Al-Shehbaz, 2009) the inbreeding coefficient of the individuals in the entire studied populations (within all subspecies) was relatively high (F IT = 0.8228). This can be attributed to the geographic isolation of the individuals of the studied subspecies. As a result of ongoing breeding depending on local preferences in different parts of the world, B. rapa has been undergone selection that increased genetic variation within the species (Gomez Campo, 1999;Koch and Al-Shehbaz, 2009). Variation in genetic structure was observed among accessions of the same subspecies and was attributed to severe selection, domestication and low gene flow (Snowdon et al., 2007).

The relationships among B. rapa subspecies
The subspecies were differentiated at Nei s genetic distance 0.60 into four main clusters, each cluster contained more than a subspecies confirming that Brassica rapa had a polyphyletic origin (Tanhuanpää et al., 2016). The accessions of each of oleifera, oleifera annua, oleifera biennis, trilocularis were collected with each other in a specific cluster. This could be due to the accessions of each of these subspecies collected from same sub-geographic region, which did not have a high fluctuation in environmental and ecological factors: the factors that may cause change/modifications in isozymes markers (Horáček et al., 2009). Conversely, the accessions of other subspecies were distributed between two clusters exhibiting a wide variation in genetic structure, e.g. subspecies chinensis, pekinensis, rapa, dichotomy, campestris L. The variation in the genetic structure of these subspecies based on isozyme analysis coincides the distinct variation of the morphotypes of these subspecies (OECD, 2016).
It was reported B. rapa L. classified into two main groups based on morphological and restriction fragment length polymorphism (RFLP) markers: one group consists of ssp. rapa and ssp. oleifera in Europe and another is the group of leafy vegetables, such as ssp. pekinensis, ssp. chinensis and ssp. nipposinica in East Asia (Takuno et al., 2007). However, our results based on isozyme analysis did not support this classification. The discrepancy between our results and Takuno et al. (2007) conclusion was attributed to Takuno et al. (2007) did not include ssp. rapa and ssp. oleifera from East Asia in their studies (Tsunoda, 1980). B. rapa ssp. oleifera (turnip rape) and B. rapa ssp. trilocularis (sarson) were grouped in one cluster. Similar relationship was observed by Song et al. (1991) based on RFLP and Karam et al. (2014) based on isozyme. The grouping of these two subspecies coincides with the morphological classification suggested by Inaba and Nishio (2002) that sarson had been derived from turnip rape and was selected and developed in India.
The separation of dichotoma and trilocularis with each other on PC1 confirmed the suggestion of Bird et al. (2017) for collecting of these two subspecies in a separate subpopulation by using a high-throughput GBS method that leverages next-generation sequencing and multiplexing of RRLs. However, the suggestion of Bird et al. (2017) to assign each of subspecies chinensis and pekinensis to a subpopulation, since these two species had many morphotypes, great variation in genetic structure based on isozyme (Karam et al., 2014) and RFLP analysis (Takuno et al., 2007).
These two subspecies and other subspecies of B. rapa need more study on a big collection of accessions covering their distribution area and using more than one molecular markers covering majority of their genome to assess genetic diversity and relationships within and among them.

Conclusion
In conclusion, enzyme electrophoresis resulted in detecting 14 putative polymorphic loci with 27 alleles. The rare alleles, alleles with mean frequency 0.04 were observed in Cat4A and Cat4B in subspecies Oleifera accession CR 2204/79 and in subspecies trilocularis accessions CR 2215/88 and CR 2244/88. The average fixation index (F) is significantly higher than zero for the analysis accessions indicating a significant deficiency of heteozygosity. The highest genetic diversity measures were observed in subspecies dichotoma, accession CR 1585/96 (the highest average of observed (H 0 ) and expected heterozygosity (He), and number of alleles per locus (Ae)) which made this accessions valuable genetic resources to be included in breeding programs for the improvement of oilseed B. napus. Low heterozygosity in some accessions might be due to these accessions at the edge of the subspecies distribution centre; the position at which population sizes gradually decrease as does genetic diversity or origin of these as these accession could be associated with some sort of bottleneck event followed by the absence of inter-population gene flow. B. rapa ssp. oleifera (turnip rape) and B. rapa ssp. trilocularis (sarson) were grouped in one cluster which coincides with their morphological classification. These two subspecies and other subspecies of B. rapa need more study on a big collection of accessions covering their distribution area and using more than one molecular markers covering majority of their genome to assess genetic diversity and relationships within and among them.