Intra-genotypic variability in elite parent lines of papaya

Abstract This study aimed to characterize papaya lines with microsatellite markers and to select genotypes based on the fixation index, in order to promote the genetic purification of the parent lines of important commercial hybrids. Overall, 400 genotypes from three parental lines (JS-12, SS-72/12, and Sekati) were genotyped. Expected (HE) and observed (HO) heterozygosity and the fixation index (F) were estimated. Genetic distances were estimated using an unweighted index and presented graphically via UPGMA cluster analysis and PCoA. Intra-genotypic variability was detected in the JS-12 and Sekati lines but not in SS-72/12. Exploiting this variability may help adjust the ‘UENF/Caliman 01’ and ‘UC-10’ hybrids for traits of commercial interest such as fruit size and weight. Regarding the fixation index, 293 genotypes showed the maximum value (F=1), which facilitated the selection process. In the population analysis, the lines of the ‘Formosa’ heterotic group were close to each other and distant from the ‘Solo’ group line, which enables systematic exploitation of this material. The maximum fixation index enabled the selection of 80 genotypes, thereby contributing to the genetic purification of the parents, since the selected genotypes will be used in future hybridization steps to generate hybrids that meet the traits of commercial interest.


INTRODUCTION
Papaya (Carica papaya L.) is a tropical fruit tree that is economically important to national and international agribusiness. Accordingly, crop breeding programs have intensified their efforts to develop hybrid cultivars that meet market demands (Pereira et al. 2019a) by exploiting the genetic variability available in elite lines. This breeding strategy is viable only because papaya shows no inbreeding depression (Manshardt & Drew 1998), which enables hybridization programs focused on heterosis and hybrid vigor.
To generate papaya hybrids, the use of pure lines is recommended to avoid segregation in the F1, a phenomenon that may lead to undesirable agronomic variability in the hybrids (Allard 1971). Genetic instability in hybrids can be overcome by purifying the parental lines, which minimizes unwanted effects in the hybrid such as crop heterogeneity. In such cases, molecular characterization of the lines and selection within them are therefore fundamental.
Papaya lines belonging to the UENF/Caliman Active Germplasm Bank are well characterized by both agronomic (Marin et al. 2006a, b, Pereira et al. 2019b) and molecular descriptors (Pinto et al. 2013, Vivas et al. 2018, Pirovani et al. 2021). The use of these lines in crossbreeding programs led to the development of 21 hybrid papaya cultivars (Pereira et al. 2019a). However, agronomic variation in fruit size and weight was observed in two hybrids with high potential for the development of the crop in the country, 'UENF/Caliman 01' and 'UC-10'. This created the need to recharacterize JS-12, Sekati, and SS-72/12, the parental lines of these hybrids, since such variation may indicate intra-genotypic variability in these lines.
Microsatellite markers (simple sequence repeats, SSR) are powerful tools for molecular characterization because they are multiallelic and, especially, codominant (Turchetto-Zolet et al. 2017), which enables the selection of genotypes based on their level of genetic variability and allele fixation index. Currently, many microsatellite primers designed for C. papaya are available (Santos et al. 2003, Eustice et al. 2008) and have been used to develop linkage maps and identify QTLs associated with morpho-agronomic traits (Chen et al. 2007, Blas et al. 2012, Nantawan et al. 2019).
In this context, this study aimed to characterize papaya lines using microsatellite markers and to select genotypes within the JS-12, SS-72/12, and Sekati lines of the UENF/Caliman germplasm bank based on intra-genotypic variability and fixation index. The selection is intended to contribute to the genetic purification of these lines and to the adjustment of fruit size and weight, reducing fruit size in 'UC-10' and increasing it in 'UENF/Caliman 01' according to market demand, as well as increasing uniformity during the production process.

Plant material
Young leaves were collected from 400 hermaphrodite genotypes of C. papaya belonging to three lines, SS-72/12 ('Solo' group) and Sekati and JS-12 ('Formosa' group), maintained at the UENF/Caliman germplasm bank, Universidade Estadual do Norte Fluminense. The genotypes were chosen by mass selection based on fruit size and fruit yield. In total, 100 genotypes each were sampled from the SS-72/12 and Sekati lines, and 200 genotypes from JS-12. For the JS-12 line, two classes were considered because of variability in fruit weight: the 100 genotypes with large fruits (average weight 1.350 kg) were classified as JS-12 L, and the 100 genotypes with small fruits (average weight 950 g) as JS-12 S.

Isolation of genomic DNA
DNA was extracted from young leaves following the CTAB method (Doyle & Doyle 1990). The samples were then quantified on a 1% MetaPhor agarose gel using the High DNA Mass Ladder marker (Invitrogen, USA) and diluted to a working concentration of 5 ng/μL. The gel was stained with a GelRed™/Blue Juice solution (1:1), and the images were captured with the MiniBis Pro photo documentation system (Bio-Imaging Systems).

Molecular analysis via PCR
The 56 microsatellite primer pairs used in this study were designed and described by Eustice et al. (2008). They were selected according to their genomic location, with primers chosen on all chromosomes so as to cover the species genome widely (Table I). For this purpose, the genetic map developed by Chen et al. (2007) was used.
Table I. List of 56 microsatellite markers selected for analysis of papaya lines: locus, linkage group (LG), forward and reverse primer sequences, and annealing temperature (Ta, °C).

PCR amplification reactions were performed in a final volume of 13 µL containing 10 ng of DNA, 1X Tris-Base, 0.2 mM dNTPs, 1.9 mM MgCl2, 0.19 μM of each primer, and 0.75 U of Taq DNA polymerase. Amplifications were carried out in an Eppendorf® Gradient Thermocycler with the following program: initial denaturation at 94°C for 5 min; 35 cycles of denaturation at 94°C for 1 min, primer annealing for 1 min, and extension at 72°C for 3 min; and a final extension at 72°C for 7 min.
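As a quick arithmetic check on the reaction set-up and cycling program described above, the following minimal sketch computes the template volume per reaction and the total programmed hold time. The variable and function names are hypothetical labels introduced here; ramp times between temperatures are ignored, so an actual run would take longer.

```python
# Values taken from the text; names are illustrative only.
DNA_PER_REACTION_NG = 10          # template DNA per 13 µL reaction
WORKING_CONC_NG_PER_UL = 5        # working DNA concentration after dilution

INITIAL_DENATURATION_MIN = 5
CYCLES = 35
CYCLE_STEPS_MIN = {"denaturation_94C": 1, "annealing": 1, "extension_72C": 3}
FINAL_EXTENSION_MIN = 7


def template_volume_ul():
    """Volume of working DNA solution that delivers the stated DNA mass."""
    return DNA_PER_REACTION_NG / WORKING_CONC_NG_PER_UL


def programmed_minutes():
    """Sum of the programmed hold times (ramping not included)."""
    per_cycle = sum(CYCLE_STEPS_MIN.values())
    return INITIAL_DENATURATION_MIN + CYCLES * per_cycle + FINAL_EXTENSION_MIN


print(template_volume_ul())   # 2.0 (µL of template per reaction)
print(programmed_minutes())   # 187 (minutes of programmed holds)
```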
The amplified products were stained with a GelRed/Blue Juice solution (1:1) and separated by electrophoresis in a 4% MetaPhor agarose gel in 1X TAE running buffer at a constant 80 V and 0.20 A. The gels were then visualized under ultraviolet light, and the images were captured with the MiniBis Pro photo documentation system.

Statistical analysis
The amplification results for the 56 loci were converted into a numerical matrix, as described by Ramos et al. (2014). Based on this matrix, the genetic distance was estimated using the unweighted index in Genes version 2018.23 (Cruz 2013).
The genetic dissimilarity matrix was exported to MEGA X (Kumar et al. 2018), in which cluster analyses were carried out using the hierarchical UPGMA method (Unweighted Pair Group Method with Arithmetic Mean). The graphical dispersion of the genetic distances was obtained by principal coordinates analysis (PCoA) in GenAlEx 6.3 (Peakall & Smouse 2006, 2012).
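The distance and clustering steps can be sketched as follows. This is an illustration only, not the Genes or MEGA X implementation: it assumes the unweighted index takes the common form D = 1 - (1/2L) Σ c_j, where c_j is the number of alleles two diploid genotypes share at locus j and L is the number of loci, and it uses a plain average-linkage (UPGMA) loop. The toy genotypes and all names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def unweighted_distance(g1, g2):
    """1 - (shared alleles) / (2 * loci) for two diploid multilocus genotypes."""
    shared = sum(sum((Counter(a) & Counter(b)).values()) for a, b in zip(g1, g2))
    return 1 - shared / (2 * len(g1))


def upgma(names, dist):
    """Average-linkage clustering; returns [(cluster_a, cluster_b, height), ...]."""
    def avg(a, b):
        # Mean of all pairwise distances between members of the two clusters.
        return sum(dist[frozenset((x, y))] for x in a for y in b) / (len(a) * len(b))

    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda pair: avg(*pair))
        merges.append((a, b, avg(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Hypothetical toy data: four genotypes scored at three SSR loci.
genotypes = {
    "G1": [("A", "A"), ("B", "B"), ("C", "C")],
    "G2": [("A", "A"), ("B", "B"), ("C", "C")],
    "G3": [("A", "A"), ("B", "D"), ("C", "C")],
    "G4": [("E", "E"), ("D", "D"), ("F", "F")],
}
dist = {
    frozenset((x, y)): unweighted_distance(genotypes[x], genotypes[y])
    for x, y in combinations(genotypes, 2)
}
merges = upgma(list(genotypes), dist)
for a, b, height in merges:
    print(a, b, round(height, 4))
```

In this toy set, the two identical genotypes join at height 0, which is the situation described for a line without detectable variability.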
The numerical matrix was also submitted to PowerMarker 3.5 (Liu & Muse 2005), in which the following diversity parameters were estimated: observed number of alleles per locus (Na), expected (HE) and observed (HO) heterozygosity, and fixation index (F).
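For readers unfamiliar with these parameters, the sketch below shows how they relate at a single locus. It assumes the textbook definitions (HE = 1 - Σ p_i², HO = proportion of heterozygous individuals, F = 1 - HO/HE) without any sample-size correction, so values may differ slightly from those reported by PowerMarker. The function name and example data are hypothetical.

```python
from collections import Counter


def locus_diversity(genotypes):
    """Return (HE, HO, F) for one locus from a list of diploid genotypes."""
    n = len(genotypes)
    allele_counts = Counter(allele for g in genotypes for allele in g)
    # Expected heterozygosity from allele frequencies (2n allele copies).
    he = 1 - sum((count / (2 * n)) ** 2 for count in allele_counts.values())
    # Observed heterozygosity: share of individuals carrying two different alleles.
    ho = sum(1 for a, b in genotypes if a != b) / n
    # F is undefined for a monomorphic locus (HE = 0); NaN is returned in that case.
    f = 1 - ho / he if he > 0 else float("nan")
    return he, ho, f


# A line in which two alleles are each fixed in different plants:
# every individual is homozygous, yet the line is variable as a whole.
mixed_homozygotes = [("A", "A")] * 5 + [("B", "B")] * 5
print(locus_diversity(mixed_homozygotes))  # (0.5, 0.0, 1.0)
```

The example illustrates why a fixation index of 1 and variability within a line are compatible: each plant is fully homozygous, but different plants carry different alleles.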

Molecular analysis of the intra-genotypic variability
The three lines studied presented low average HO values, indicating that the alleles are almost entirely fixed at the analyzed loci (Table II). A favorable result was observed for allele fixation: 73.25% (293) of the genotypes presented the maximum value (F=1), specifically 65 JS-12 L, 69 JS-12 S, 59 Sekati, and 100 SS-72/12 genotypes. In the JS-12 L and JS-12 S lines, F ranged from 0.724 to 1.00, and in the Sekati line from 0.853 to 1.00 (Table II). This result is of utmost importance for the study, since the main aim is to select fully fixed genotypes.
Cluster analysis of each line revealed intra-genotypic variability in JS-12 L, JS-12 S, and Sekati, shown by the formation of groups and subgroups at the cut-off point proposed by Mojena (1977). This genetic variability, although unwanted, was expected and is likely related to the morpho-agronomic variability observed in these lines. Because the SS-72/12 line showed no genetic variability, no dendrogram was generated for it. Three groups were formed in each of the JS-12 L and JS-12 S lines (Figures 1 and 2), compared with four groups in Sekati (Figure 3). This outcome corroborates the results of the genetic variability analysis (Table II). The genetic variation detected in these lines may be associated with the variability in the 'UENF/Caliman 01' and 'UC-10' hybrids, which raises the need for genotype selection to obtain genetically uniform hybrids.
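The Mojena (1977) cut-off used to define groups can be illustrated with a short sketch. It assumes the usual form of the rule, in which the dendrogram is cut at the mean fusion height plus k sample standard deviations, with k = 1.25 as is often adopted in the literature; the value of k used in this study is not stated, and the fusion heights below are hypothetical.

```python
from statistics import mean, stdev


def mojena_cutoff(heights, k=1.25):
    """Cut-off = mean fusion height + k * sample standard deviation."""
    return mean(heights) + k * stdev(heights)


def n_groups(heights, k=1.25):
    """Each fusion above the cut-off separates one additional group."""
    cut = mojena_cutoff(heights, k)
    return 1 + sum(h > cut for h in heights)


# Hypothetical fusion heights read from a UPGMA dendrogram.
heights = [0.02, 0.03, 0.04, 0.05, 0.05, 0.06, 0.07, 0.08, 0.5, 0.6]
print(round(mojena_cutoff(heights), 3))
print(n_groups(heights))
```

With these example heights, only the last two fusions exceed the cut-off, so the dendrogram would be split into three groups.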
The results were also displayed in a two-dimensional plane via principal coordinate analysis. The coordinates explained 50.83% of the variation in the JS-12 L line (Figure 4a), 47.64% in JS-12 S (Figure 4b), and 67.1% in Sekati (Figure 4c). In all three PCoA graphs, the genotypes were widely distributed across the four quadrants. The Sekati genotypes showed the greatest dispersion, indicating higher genetic variability among them.
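For reference, the percentage of variation attributed to each coordinate can be written as follows. This is a sketch assuming classical (Gower) principal coordinate analysis on the matrix of squared distances; GenAlEx offers several variants, and the option used here is not stated. The symbols D^{(2)} (element-wise squared distance matrix among the n genotypes), J (centering matrix), and \lambda_k (eigenvalues) are introduced only for this illustration.

```latex
B = -\tfrac{1}{2}\, J\, D^{(2)} J,
\qquad
J = I_n - \tfrac{1}{n}\,\mathbf{1}\mathbf{1}^{\top},
\qquad
B = V \Lambda V^{\top}

\text{explained}_k \;=\; 100 \times \frac{\lambda_k}{\sum_{i:\,\lambda_i > 0} \lambda_i}\ \%
```

Under this formulation, the percentage shown for a two-dimensional plot is the sum of the values for the first two coordinates.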

Molecular analysis for the population
A total of 56 alleles were obtained from the 26 polymorphic SSR loci, an average of 2.12 alleles per locus. Only three loci showed observed heterozygosity (HO) higher than zero, namely P6K128CC (0.123), ctg-365A5 (0.65), and P3K7344CC (0.104); the average HO was 0.011, lower than the average expected heterozygosity (HE) of 0.423. The PIC value estimates the discriminatory power of a locus by considering not only the number of alleles per locus but also their frequencies in the studied population (Botstein et al. 1980). In this study, PIC values ranged from 0.093 (P3K7344CC) to 0.5 (P6K900CC), with an average of 0.333, so the loci were classified as slightly, moderately, or highly informative (Table III).
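The PIC calculation can be sketched as below, using the formula of Botstein et al. (1980), PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j², together with the class limits commonly attributed to the same authors (above 0.5, 0.25 to 0.5, below 0.25). The function names and allele frequencies are hypothetical and are not taken from Table III.

```python
from itertools import combinations


def pic(freqs):
    """Polymorphism information content from the allele frequencies of one locus."""
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1 - homozygosity - cross_term


def informativeness(value):
    """Classify a PIC value into the three usual categories."""
    if value > 0.5:
        return "highly informative"
    if value >= 0.25:
        return "moderately informative"
    return "slightly informative"


for freqs in ([0.5, 0.5], [0.95, 0.05], [0.4, 0.3, 0.3]):
    value = pic(freqs)
    print(freqs, round(value, 4), informativeness(value))
```

Because a locus with two alleles cannot exceed a PIC of 0.375 (reached at equal frequencies), higher values require at least three alleles.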
In the UPGMA cluster analysis (Figure 5a), three groups were formed at the cut-off point established by Mojena (1977). The lines of the 'Formosa' heterotic group, namely JS-12 S and JS-12 L (group I) and Sekati (group II), were close to each other and genetically distant from SS-72/12 (group III), which belongs to the 'Solo' group. It is also worth noting that the molecular analysis was not able to separate the genotypes of the JS-12 line into the initially defined classes (large fruit and small fruit).
Genotypes were also analyzed for graphical dispersion by principal coordinate analysis (PCoA) (Figure 5b). The two coordinates together explained 97.72% of the total variability, of which 82.72% was explained by coordinate 1 and 15% by coordinate 2. The analysis showed a clear distinction between the lines, as seen in the dendrogram. Quadrants I, II, and IV contained only genotypes from the JS-12 L, JS-12 S, and Sekati lines, while quadrant III grouped the SS-72/12 line at a single point on the graph, demonstrating the absence of genetic variability in this material.
cultivated by both national and international farmers, and 'UC-10', which holds great potential for commercial purposes. However, adjustments in fruit weight and size are still needed to increase the productive potential of these hybrids and to facilitate packing and transport.
The intra-genotypic variability in the JS-12 and Sekati lines probably arose during generation advance via self-pollination, in which different alleles of the same locus were fixed in these lines. Some genotypes also had heterozygous loci, which may be associated with the higher cross-fertilization rate that probably occurs in 'Formosa' group genotypes. This variability can be controlled and avoided by adopting more careful practices for generation advance via self-fertilization, germplasm rejuvenation, and seed production, including adequate flower bud protection.
The cluster analysis also revealed genetic variability in the JS-12 L, JS-12 S, and Sekati lines, both in the dendrograms and in the two-dimensional plots, corroborating the variability estimates. In the population-level clustering, the groups were formed according to heterotic group. Such organization enables systematic exploitation of the lines, making them promising for crop breeding programs, and corroborates results reported elsewhere (Pirovani et al. 2021, Vivas et al. 2018).

DISCUSSION
The molecular analysis detected the highest intra-genotypic variability in Sekati, followed by the JS-12 L and JS-12 S lines, all belonging to the 'Formosa' heterotic group. This result may be associated with the reproductive behavior of this group, which has a higher cross-fertilization rate than the 'Solo' heterotic group, represented here by the SS-72/12 line. This line also reproduces by self-fertilization but displays a lower cross-fertilization rate (Damasceno Junior et al. 2009) and showed observed heterozygosity equal to zero.
The JS-12, Sekati, and SS-72/12 genotypes have already been explored in hybridization assays, both for their morpho-agronomic traits and for their combining ability (Marin et al. 2006a, b, Pereira et al. 2019b). Based on these previous studies, two hybrids with great potential for exploitation in Brazil were developed, such as 'UENF/Caliman 01', already
hybridization step to adjust the 'UENF/Caliman 01' hybrid, and a top-cross to adjust the 'UC-10' hybrid, owing to the absence of genetic variability in the SS-72/12 parent. Once the best combination of parents for the commercial traits of interest (fruit size and weight) has been identified, these parents will be selected and self-fertilized to develop their next generations.
With the selected genotypes and the continuation of this work, we expect to obtain purified lines that help maintain the commercial standard of the developed hybrids. The process aims to obtain the 'UC-10' hybrid with fruits of at most 2.0 kg and the 'Calimosa' hybrid with fruits from 1.2 to 1.4 kg and higher yield.

Figure 1. Dendrogram generated by the UPGMA hierarchical method based on analysis of 100 genotypes from the JS-12 L line (cophenetic correlation coefficient = 0.77).

Figure 2. Dendrogram generated by the UPGMA hierarchical method based on analysis of 100 genotypes from the JS-12 S line (cophenetic correlation coefficient = 0.84).

Figure 3. Dendrogram generated by the UPGMA hierarchical method based on analysis of 100 genotypes from the Sekati line (cophenetic correlation coefficient = 0.86).

Figure 4. Principal coordinates analysis based on the genetic distance obtained with the unweighted index (Cruz 2013) for the lines a) JS-12 L, b) JS-12 S, and c) Sekati.

Figure 5. Clusters based on the analysis of 400 papaya genotypes, generated from the genetic distance obtained with the unweighted index (Cruz 2013). a) UPGMA (cophenetic correlation coefficient = 0.92). b) Principal coordinates analysis.

It is worth highlighting the SS-72/12 line, which presented HO equal to zero (Table II) and appears to have fully fixed alleles; it is therefore considered the best line with regard to allelic fixation and is suitable as a parent for hybrid seed production in future breeding programs. The other lines, JS-12 and Sekati, showed HO values ranging from zero to 0.077.

Table II. Means of genetic variability parameters obtained for each papaya line.
Na: observed number of alleles; HE: expected heterozygosity; HO: observed heterozygosity; F: fixation index.

Table III. Descriptive analysis of the genetic variability at the 26 loci analyzed in the 400 genotypes of C. papaya.
LG: linkage group; N: sampled individuals; Na: observed number of alleles; HE: expected heterozygosity; HO: observed heterozygosity; PIC: polymorphism information content.