Genetic structuring of segregating populations of Psidium spp resistant to the southern root-knot nematode by Bayesian approach as basis for the guava breeding program

There are no guava cultivars resistant to Meloidogyne enterolobii; for this reason, breeding has been performed by introgressing genes into the current cultivars through interspecific hybridization. We used 33 microsatellite markers for the genetic-molecular characterization of segregating populations of Psidium resistant to M. enterolobii, aiming at selection within and between populations for generation advancement in the guava breeding program. The average number of alleles per locus ranged from 1.60 to 2.09. Populations 1 (P. guineense × P. cattleyanum) and 5 (P. guajava × P. cattleyanum) showed the greatest genetic diversity, as confirmed by their higher observed-heterozygosity values (0.422 and 0.312, respectively). Bayesian analysis showed that the populations were subdivided into three groups, agreeing with the number of groups observed with Nei's genetic distance. The population obtained from the P. guineense × P. cattleyanum cross differed from the others with a clear structuring, whereas the P. guajava × P. cattleyanum and P. cattleyanum × P. guineense populations were the most similar to each other. The SSR markers were efficient in discriminating the populations, and individual 80 may be employed in future crosses with guava, allowing generation advancement in the guava breeding program aimed at resistance to M. enterolobii.


INTRODUCTION
The genus Psidium, of the family Myrtaceae, comprises 183 species, among which the guava (P. guajava L.) and the araçá fruits (P. cattleyanum and P. guineense) stand out, originating from tropical regions of America (Flora do Brasil 2020).
India, Pakistan, Brazil, Colombia, and Mexico are the biggest guava producers in the world. This crop is among the 19 most produced fruits in Brazil (Pommer & Murakami 2014). The annual production of red guava in Brazil is approximately enterolobii, responsible for the appearance of galls in large quantities and yellowing of the leaf edges, ultimately leading to total defoliation of the shoots. A matter of even greater concern is the synergistic association between M. enterolobii and the fungus Fusarium solani, since root rots occur only when the two pathogens are involved in the process. This disease complex is known as 'guava decline', which eradicates numerous orchards of this fruit. This results in huge economic impacts for producers, since the losses are estimated to reach over $70 million.
Practices for the control of this nematode have limited or zero effectiveness. There are no reports of genetic resistance to M. enterolobii identified in P. guajava, and no guava cultivar resistant to the nematode has been registered thus far. Several studies have instead identified the araçás of the species P. cattleyanum (Miranda et al. 2011, Biazatti et al. 2016) and P. guineense (Costa et al. 2012) as sources of resistance to this pathogen.
However, the araçás show limited or full incompatibility when used as rootstock for guava (Robaina et al. 2012). For this reason, hybrids between guava and araçá have been generated as an alternative to prevent losses and the devastation of guava orchards across Brazil. Interspecific Psidium hybrids have already been obtained and assessed for resistance to the nematode (Gomes et al. 2017, Costa et al. 2016). Gomes et al. (2017) performed crosses between P. cattleyanum (resistant araçá) and P. guajava (susceptible guava) and selected 30 hybrids, including immune and resistant types, that will be backcrossed with guava to recover the desired agronomic traits for the later release of a new cultivar. Gomes et al. (2017) and Costa et al. (2016) obtained diverging results for the inheritance of resistance to M. enterolobii. Costa et al. (2016) evaluated 22 hybrids from the P. guajava × P. guineense cross and found that 226 plants were immune (Reproduction Factor [RF] = 0) and 16 were susceptible (RF between 0.003 and 0.322), suggesting simple inheritance with full dominance. Gomes et al. (2017), by contrast, evaluated 367 hybrids obtained from the cross between P. guajava and P. cattleyanum and identified 18 immune, 46 resistant, and 303 susceptible genotypes, discarding the hypothesis of monogenic inheritance. However, it should be stressed that the two studies used different resistant araçá species, which may explain the conflicting results. For monogenic inheritance to occur, as suggested by Costa et al. (2017), P. guineense must be homozygous for the alleles that confer resistance to M. enterolobii, which is not common in allogamous species like the araçás.
Results obtained by Gomes et al. (2017) led to the generation of many progenies through crosses between P. guineense × P. cattleyanum, P. guajava × P. cattleyanum, and P. cattleyanum × P. guineense, which were evaluated for both resistance to M. enterolobii and genetic diversity through morphological traits (Almeida 2017). However, phenotypic expression is influenced by several external factors, such as environmental conditions and plant age. Molecular methods, in turn, are less subject to environmental effects and have been increasingly used in breeding programs since, in addition to providing direct information on the genome of each individual, they improve the efficiency of genotype selection and guide breeders in the choice of crosses.
To save time and accelerate the response of the programs, breeders make use of auxiliary tools such as molecular markers. Among the many classes of molecular markers available today, microsatellites stand out for their informative power and wide distribution across the genome, allowing for good sampling in genetic studies (Oliveira et al. 2010). Other features of these markers are codominant inheritance, multi-allelic nature, and high reproducibility. Microsatellites are among the most promising markers for broad use in breeding programs. Of their several applications, noteworthy cases are studies of genetic diversity (Tuler et al. 2015 Campomanesia, Myrciaria, and Syzygium (Nogueira et al. 2015). However, there are no reports involving the use of SSR markers in interspecific Psidium hybrids resistant to M. enterolobii to support breeding programs.
The combination of breeding methods, statistical methodologies, and molecular technologies brings new prospects for genetic knowledge and for the acceleration of breeding programs. In this regard, the objectives of the present study were: i) to undertake genetic characterization, estimating genotypic indices for the quantification and structuring of the genetic variability of the segregating Psidium populations studied; and ii) to select genotypes genetically closer to P. guajava for generation advancement in the guava breeding program aiming at resistance to M. enterolobii.

MATERIALS AND METHODS
Genetic material
Ninety-four (94) individuals originating from five segregating populations of Psidium resistant to the nematode M. enterolobii and their respective parents were evaluated (Table I). The populations were obtained from an interspecific hybridization of P. cattleyanum (resistant genotype), P. guajava, and P. guineense (susceptible genotypes) and evaluated for resistance to M. enterolobii (Gomes et al. 2017).

Genomic DNA extraction and quantification
Total genomic DNA was extracted from young leaves collected individually from each hybrid plant and parents, using the CTAB method with modifications (Doyle & Doyle 1990).
Next, the DNA was quantified on a 1% agarose gel with 1X TAE buffer (Tris, sodium acetate, EDTA, pH 8.0) by comparing band intensities with the 100-bp lambda (λ) marker (100 ng μL⁻¹) (Invitrogen, USA). For this procedure, samples were stained using a mixture of Gel Red™ and Blue Juice (1:1), and the image was captured with the Mini Bis Pro imaging system (Bio-Imaging Systems). Subsequently, DNA samples were diluted to the working concentration of 10 ng μL⁻¹.
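The dilution to the working concentration follows the usual C1·V1 = C2·V2 relation. The sketch below is only an illustration: the function name, the 50 ng μL⁻¹ stock concentration, and the 100 μL final volume are assumptions, not values reported in this study.

```python
def dilution_volumes(stock_ng_per_ul, target_ng_per_ul=10.0, final_volume_ul=100.0):
    """Return (stock volume, diluent volume) in microlitres using C1*V1 = C2*V2."""
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is already below the target concentration")
    stock_volume = target_ng_per_ul * final_volume_ul / stock_ng_per_ul
    return stock_volume, final_volume_ul - stock_volume


# Hypothetical sample quantified at 50 ng/uL, diluted to the 10 ng/uL working concentration.
stock_ul, diluent_ul = dilution_volumes(50.0)
```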

Primer screening
The DNA of the parental genotypes was initially used for a screening of 100 microsatellite primers developed for P. guajava (Risterucci et al. 2005, GuavaMap 2008), aiming to identify SSR loci capable of differentiating the parents. After screening, a set of 33 polymorphic primers was selected for the amplification reactions (Table II).
The PCR products were diluted at a ratio of 4 μL of sample to 20 μL of buffer E from the DNF 900 kit and subjected to a capillary electrophoresis system (Fragment Analyzer, AATI), in which amplified fragments of 35 to 50 bp are separated with a resolution of approximately 2 bp. Each run lasted 2 h 20 min at a voltage of 8 kV.

Estimation of genetic diversity of SSR markers
The data obtained from the amplification of the SSR primers were converted to a numeric code for each allele per locus. This numerical matrix was built by assigning values from 1 to the maximum number of alleles at the locus, as follows: a locus presenting three alleles was represented by 11, 22, and 33 for the homozygous forms A1A1, A2A2, and A3A3, and by 12, 13, and 23 for the heterozygous forms A1A2, A1A3, and A2A3. Based on this numerical matrix, we calculated the genetic distance between the studied genotypes using the GENES software 2009). The optimal number of markers was estimated with the GENES software (Cruz 2016). From the molecular variables, we estimated the genetic diversity between the Psidium spp. populations with the Genalex 6.3 software (Peakall & Smouse 2009). The following parameters were estimated: number of alleles per locus (NA), genetic variance (Vg), genotypic coefficient of determination (H²), polymorphic information content (PIC), probability of identity (PI), observed heterozygosity (Ho), expected heterozygosity (He), and fixation index (f).
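The coding scheme and some of the per-locus parameters listed above can be sketched as follows. This is a minimal illustration, not the GENES or Genalex implementation: the function name and the example genotypes are hypothetical, He is taken as 1 − Σp², f as 1 − Ho/He, and PIC follows the commonly used formula 1 − Σp² − ΣΣ2p_i²p_j². The two-digit code assumes fewer than ten alleles per locus.

```python
from collections import Counter


def locus_stats(genotypes):
    """Summarise one SSR locus from two-digit genotype codes such as 11, 12, 23."""
    # Split each code into its two allele numbers, e.g. 12 -> (1, 2).
    pairs = [divmod(code, 10) for code in genotypes]
    n = len(pairs)
    counts = Counter(allele for pair in pairs for allele in pair)
    freqs = [count / (2 * n) for count in counts.values()]

    ho = sum(a != b for a, b in pairs) / n           # observed heterozygosity
    he = 1.0 - sum(p ** 2 for p in freqs)            # expected heterozygosity
    f = 1.0 - ho / he if he > 0 else float("nan")    # fixation index

    # PIC = He minus the correction term summed over all allele pairs i < j.
    correction = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return {"Na": len(freqs), "Ho": ho, "He": he, "f": f, "PIC": he - correction}


# Hypothetical genotypes for eight individuals at a three-allele locus.
stats = locus_stats([11, 12, 12, 22, 13, 23, 12, 11])
```

In this made-up example Ho exceeds He, so the fixation index comes out negative.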

Analysis of the structure and clustering of Psidium spp. populations
Based on the information obtained from the polymorphic primers, an analysis of molecular variance (AMOVA) was performed to evaluate the differences between and within the populations by Wright's F statistics, in which the genetic structure of a population is characterized by three distinct parameters: FIS (inbreeding coefficient of individuals relative to their subpopulation), FST (degree of genetic differentiation among the subpopulations), and FIT (inbreeding coefficient of individuals relative to the total population), using the Genalex 6.3 software (Peakall & Smouse 2009).
Analyses of population structure were performed with the Bayesian method using the STRUCTURE software, version 2.3.4 (Pritchard et al. 2000). Because the present study used populations composed of plants from controlled crosses, we used the no-admixture model with correlated allele frequencies in each population. The burn-in period and the number of MCMC replications were set to 10,000 and 50,000, respectively, for each run. The number of groups (K) was systematically varied from 1 to 10, and 20 simulations were performed for each K. We used the ΔK ad hoc method described by Evanno et al. (2005) and implemented in the online tool Structure Harvester (Earl & vonHoldt 2012) to estimate the most likely K in each group of Psidium. A membership probability threshold of 0.60 was adopted: based on the posterior probability of membership (q) of a given accession in a given group, individuals with q > 0.60 were assigned to that cluster, whereas accessions with q ≤ 0.60 for all clusters were classified as admixed (Cerqueira-Silva et al. 2014).
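For a single locus, the three F statistics can be written in terms of heterozygosities as FIS = 1 − HI/HS, FST = 1 − HS/HT, and FIT = 1 − HI/HT. The sketch below illustrates that textbook decomposition only; it is not the Genalex procedure, and the function names, input layout, and example values are assumptions.

```python
def expected_het(freqs):
    """Expected heterozygosity 1 - sum(p^2) for one locus."""
    return 1.0 - sum(p * p for p in freqs)


def f_statistics(subpops):
    """Wright's F statistics for one locus.

    subpops: list of dicts with 'n' (individuals), 'ho' (observed
    heterozygosity) and 'freqs' (allele frequencies, same allele order).
    """
    total_n = sum(s["n"] for s in subpops)
    # HI: mean observed heterozygosity; HS: mean expected heterozygosity within subpopulations.
    h_i = sum(s["n"] * s["ho"] for s in subpops) / total_n
    h_s = sum(s["n"] * expected_het(s["freqs"]) for s in subpops) / total_n
    # HT: expected heterozygosity from allele frequencies pooled over all subpopulations.
    n_alleles = len(subpops[0]["freqs"])
    pooled = [sum(s["n"] * s["freqs"][a] for s in subpops) / total_n for a in range(n_alleles)]
    h_t = expected_het(pooled)
    return {"FIS": 1 - h_i / h_s, "FST": 1 - h_s / h_t, "FIT": 1 - h_i / h_t}


# Hypothetical example: two strongly differentiated subpopulations.
example = [
    {"n": 10, "ho": 0.2, "freqs": [0.9, 0.1]},
    {"n": 10, "ho": 0.2, "freqs": [0.1, 0.9]},
]
result = f_statistics(example)
```

The three values satisfy (1 − FIT) = (1 − FIS)(1 − FST), which is a convenient sanity check on any implementation.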
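The ΔK criterion and the q threshold described above can be sketched as follows. This is an illustration under stated assumptions, not the Structure Harvester code: ΔK is computed here as |L(K+1) − 2L(K) + L(K−1)| / sd[L(K)] using the mean ln P(D) over replicate runs, and the function names and ln P(D) values are made up.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp_runs):
    """lnp_runs maps K -> list of ln P(D) values from replicate STRUCTURE runs."""
    means = {k: mean(values) for k, values in lnp_runs.items()}
    sds = {k: stdev(values) for k, values in lnp_runs.items()}
    delta_k = {}
    for k in sorted(lnp_runs):
        # Delta K is undefined for the smallest and largest K tested.
        if k - 1 in means and k + 1 in means and sds[k] > 0:
            second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            delta_k[k] = second_diff / sds[k]
    return delta_k


def assign_cluster(q_values, threshold=0.60):
    """Return the 1-based cluster with q above the threshold, else 'admixed'."""
    best = max(range(len(q_values)), key=lambda i: q_values[i])
    return best + 1 if q_values[best] > threshold else "admixed"


# Hypothetical ln P(D) values for three replicate runs per K.
runs = {
    1: [-3000, -3002, -2998],
    2: [-2500, -2503, -2497],
    3: [-2100, -2101, -2099],
    4: [-2080, -2090, -2070],
    5: [-2075, -2095, -2055],
}
delta_k = evanno_delta_k(runs)
best_k = max(delta_k, key=delta_k.get)
```

With these invented values the largest ΔK falls at K = 3, and an individual with q = (0.05, 0.90, 0.05) would be assigned to cluster 2.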

RESULTS
Genetic-molecular characterization of Psidium segregating populations by SSR markers
A total of 100 microsatellite primers were used to distinguish the segregating populations of Psidium spp. resistant to M. enterolobii at the molecular level. Of these, 70 produced amplified DNA fragments, 33 of which were polymorphic and thus used in this study (Table II). These SSR loci generated 89 alleles; the number of alleles per locus ranged from two to four, averaging 2.70 alleles per locus, considering the entire dataset (Table III).
For most of the analyzed loci, Ho was higher than He and ranged from 0.000 to 0.871, averaging 0.288. Expected heterozygosity, in turn, ranged from 0.033 to 0.508, averaging 0.216. As for the fixation index, the variation between the loci was -0.082 to 1. According to the F statistics, perhaps the most disseminated statistics in population genetics, FIS values ranged from -0.049 to 1, while FIT values went from -0.100 to 1 and FST from 0.003 to 0.932 (Table III).
The parameters Na, Ho, He, f, and PIC were also estimated at population level (Table IV). Na ranged from 1.60 to 2.09, with a total average of 1.86 alleles between populations. Expected for heterozygosity (Table IV). The fixation index, which estimates the inbreeding coefficient of the populations, had its lowest estimated value in population 1 (-0.29), which was expected, since this population exhibited the highest values for PIC and He. The highest f value, in turn, was observed in population 2 (P. guajava × P. cattleyanum) (0.27) (Table IV).
Of the 33 loci analyzed, just 15 would be sufficient to estimate the genetic diversity of the populations (Figure 1). The ranking of PI (probability of identity) values obtained allowed us to highlight the most informative loci, i.e. those that most contributed to the differentiation of the populations (Table V). Most of these loci are associated with high Vg and H² values, which ranged from 0.1432 to 156.70265 and from 24.19284 to 99.80875, respectively (Table V).
The individual analysis clearly showed the formation of three large groups (Figure 2). The hybrids originating from the P. guineense × P. cattleyanum cross and their parents formed a group highlighted in blue that differed from the others. Hybrids whose female parent was P. guajava, in turn, were genetically closer to each other and remained in the same group highlighted in red. The hybrids resulting from the P. cattleyanum × P. guineense cross, which have one parent in common, displayed greater similarity with each other and also formed a single group highlighted in green (Figure 2). However, greater emphasis should be placed on the red group, which comprises the parents and hybrids of the P. guajava × P. cattleyanum cross, in which the genotypes genetically closer to P. guajava can be used as parents in future crosses with guava (Figure 2).
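A ranking of loci by probability of identity can be sketched as below. The sketch assumes the usual unrelated-individuals formula PI = Σp_i⁴ + ΣΣ(2p_i p_j)², with lower PI meaning a more informative locus; the locus names and allele frequencies are hypothetical and not taken from Table V.

```python
from math import prod


def probability_of_identity(freqs):
    """PI for one locus: chance that two unrelated individuals share a genotype."""
    homozygous = sum(p ** 4 for p in freqs)
    heterozygous = sum(
        (2 * freqs[i] * freqs[j]) ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return homozygous + heterozygous


# Hypothetical allele frequencies for three loci.
loci = {
    "locus_A": [0.5, 0.5],
    "locus_B": [0.9, 0.1],
    "locus_C": [0.4, 0.3, 0.3],
}
# Most informative loci (lowest PI) first.
ranked = sorted(loci, key=lambda name: probability_of_identity(loci[name]))
# Combined PI over independent loci is the product of the per-locus values.
combined_pi = prod(probability_of_identity(f) for f in loci.values())
```

Adding loci in ranked order and watching the combined PI shrink is one simple way to see why a subset of markers can be sufficient.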
Genetic structure and diversity among populations of Psidium spp.
The Bayesian approach indicated that the Psidium populations were clustered into three genetic groups. According to the criterion of Evanno et al. (2005), the optimal ΔK was obtained when K = 3, suggesting that maximum structuring was observed when the sample was divided into three groups (Figure 3). With high membership probability, the evaluated populations were grouped as follows: group 1 consisted of the genotypes originating from the P. guineense × P. cattleyanum cross, which corresponds to population 1; group 2 was formed by the hybrids from the P. cattleyanum × P. guineense cross, representing populations 3 and 4; and group 3 comprised the individuals resulting from the P. guajava × P. cattleyanum cross, corresponding to populations 2 and 5 (Figure 3). The Bayesian clustering agrees with the number of groups observed in the dendrogram based on Nei's distance (Figure 3) and in the principal coordinate analysis (PCoA) (Figure 4). The first two coordinates explained 99.98% of the total variation (PC1: 91.25%; PC2: 8.72%), and the largest genetic distance was observed between populations 1 and 2 (Figure 4). In addition, AMOVA indicated that approximately 80% of the variance originated from differentiation between the populations and only 20% of the total variation was observed within the populations (Table VI).
[Figure caption fragment] ... (2), Guava 13 4 (5), and P. guajava (13.4II) × P. cattleyanum (P33) and P. guajava (13.4II) × P. cattleyanum (P51) hybrids; green group: genitors Ara P11, Ara CV8, Ara CV1, Ara CV11 (3), CV11 (4), Ara P33 and Ara P53, and P. cattleyanum (CV8) × P. guineense (CV11) and P. cattleyanum (CV1) × P. guineense (CV11) hybrids. Cophenetic correlation = 0.96.
[Table note] The loci that most contributed to the differentiation of populations are in bold.
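The Nei's distance behind the dendrogram mentioned in this section can be sketched from population allele frequencies. The text does not say which variant was used, so the sketch assumes Nei's standard (1972) distance, D = −ln(Jxy / √(Jx·Jy)); the function name and the frequencies are hypothetical.

```python
from math import log, sqrt


def nei_distance(pop_x, pop_y):
    """Nei's standard genetic distance between two populations.

    Each population is a list of loci; each locus is a list of allele
    frequencies given in the same allele order for both populations.
    """
    jx = jy = jxy = 0.0
    for freqs_x, freqs_y in zip(pop_x, pop_y):
        jx += sum(p * p for p in freqs_x)
        jy += sum(q * q for q in freqs_y)
        jxy += sum(p * q for p, q in zip(freqs_x, freqs_y))
    # Averaging over loci cancels out in the ratio, so sums are used directly.
    identity = jxy / sqrt(jx * jy)
    return -log(identity)


# Hypothetical two-locus allele frequencies for two populations.
pop_a = [[0.5, 0.5], [1.0, 0.0]]
pop_b = [[0.1, 0.9], [0.0, 1.0]]
distance_ab = nei_distance(pop_a, pop_b)
```

A pairwise matrix of such distances is the typical input for the clustering and principal coordinate analyses referred to above.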

DISCUSSION
Genetic diversity is essential to sustain the productivity of a crop, since it provides 'new genes' for yield, adaptation, and disease resistance. The disease known as 'guava decline' has had a strong economic impact on the guava crop, with losses estimated at over 90 million dollars. Therefore, identifying resistant genotypes is of paramount importance to increase yields. Interspecific crosses involving guava and araçá have been made as an alternative, since no variability exists for resistance to the nematode in the commercial species P. guajava (Gomes et al. 2017, Costa et al. 2017). However, the breeding of perennial species is a long process, and thus markers have been used to reduce the time and resources employed throughout the program until the release of a cultivar.

Genetic-molecular characterization of Psidium segregating populations by SSR markers
The microsatellites developed for P. guajava were efficiently transferred to the guava and araçá segregating populations. This is corroborated by the work of Tuler et al. (2015), who investigated the transferability of microsatellite markers developed for P. guajava to another 12 Psidium species, including P. cattleyanum and P. guineense, and detected a high percentage of transferability, which evidences the close phylogenetic relationship between these species, considering that transferability depends on the conservation of microsatellites and their anchoring regions. The maximum number of alleles per locus observed in the Psidium populations was four. This result would be expected if the parents involved in the crosses were diploid and heterozygous for different alleles. However, the araçás are known polyploids, which would allow more than four alleles per locus to be found; this was not observed in the present study. Costa & Santos (2013) found seven to 22 alleles per locus in the 13 SSR loci analyzed for 61 genotyped Psidium accessions. Aranguren et al. (2010) evaluated 31 accessions of native Venezuelan guavas and detected a total of 111 alleles, with three to 11 alleles per locus in the 16 SSR loci analyzed. Tuler et al. (2015) found one to six alleles per locus in 31 SSR markers used for the identification of 13 indigenous Psidium species of the Atlantic Forest. The higher number of alleles found in these studies is due to the genotypes belonging to germplasm banks of guava and araçá, which have greater genetic variability.
Many descriptive measures have been employed in the search for the quantification of the genetic diversity between populations, making it possible to infer about the structure of the population in addition to the informative and discriminatory ability of the many classes of molecular markers in the process of genotypic identification and diversity analysis (Tuler et al. 2015, Pena et al. 2016). To this end, most research studies have adopted diversity parameters such as observed and expected heterozygosity and PIC.
In this study, the observed (Ho) and expected (He) heterozygosity values varied widely across the loci, indicating that the microsatellites were efficient in detecting variability between populations. Mean values for Ho (0.288) and He (0.216) were close, suggesting that the studied populations have medium genetic diversity, inasmuch as the observed frequency of heterozygotes (Ho) is near the expected level. Although the mean Ho and He values obtained in the current study were lower than expected for allogamous populations, especially those obtained from interspecific crosses, these findings are similar to those of Tuler et al. (2015), who observed an average Ho of 0.15 using 23 SSR primers in 13 Psidium species. By contrast, P.G.O. Pessanha et al. (unpublished data) evaluated intraspecific hybrids of P. guajava using 10 SSR primers and obtained higher mean He (0.46) and Ho (0.58) values than those obtained here. The reduced estimates for heterozygosity found in this study may be a result of the presence of null alleles, a problem inherent to codominant markers (Ramos et al. 2014). This problem can be overcome by excluding these markers, which would not compromise the analysis, considering that, of the 33 loci used in this study, 15 would be sufficient to estimate the genetic diversity of the populations. Most markers with a high level of information for the differentiation of the populations showed the highest genetic-variance values and were also those with the highest H² values, which correspond to broad-sense heritability.
Seventeen of the 33 loci showed negative fixation coefficients (f). This indicates that the alleles at these loci are not being fixed by either inbreeding or any other factor that might lead the population to depart from Hardy-Weinberg equilibrium, as expected for allogamous populations under random crosses. Negative estimates for the inbreeding coefficient are common when observed heterozygosity is greater than expected heterozygosity, as observed in this study, suggesting an excess of heterozygous loci (Cruz et al. 2011). However, in four loci, the estimated Ho was 0 and f was consequently 1, indicating that, for these loci, there is a greater deficit of heterozygotes than expected under equilibrium, similarly to the reports of Tuler et al. (2015) for different Psidium species.
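The two cases described above follow directly from the usual definition of the fixation index, assuming it is computed per locus from observed and expected heterozygosity:

```latex
f = 1 - \frac{H_o}{H_e}, \qquad
H_o > H_e \;\Rightarrow\; f < 0, \qquad
H_o = 0,\ H_e > 0 \;\Rightarrow\; f = 1 .
```

A locus with more heterozygotes than expected therefore yields a negative f, and a polymorphic locus with no observed heterozygotes yields f = 1.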
The analysis of diversity by Wright's F statistics is useful to determine how much each locus influences the diversity of the studied populations (Cruz et al. 2011). As stated by Hernan (2009), FST values can range from 0 (no differentiation) to 1 (complete differentiation) between a specific group and the subgroups derived from it. The high level of differentiation between the populations was therefore confirmed by the high mean FST (0.55), contrasting with the low mean FIT (-0.08), which indicates an excess of heterozygotes in the studied populations. These findings are expected, considering the intense gene flow taking place in cross-pollinating species like guava and araçá. Similar results were found by Rajesh et al. (2008), who observed an elevated level of genetic differentiation (FST = 0.42) in coconut landraces using 14 SSR markers. At the population level, Ho and He were very close, and the fixation coefficients were low and mostly negative. According to Loiola et al. (2016), close or non-significantly different Ho and He values suggest a predominance of panmixia, i.e. the occurrence of random crosses in the populations. The populations would thus be in Hardy-Weinberg equilibrium (HWE). This cannot be assumed in the current study, though, since the evaluated populations were obtained from directed crosses between unrelated heterozygous parents. Populations 1 and 5 showed the greatest genetic diversity, as corroborated by their higher PIC and He values. This result suggests the presence of more heterozygotes in these populations than expected under HWE. By contrast, populations 3 and 4 had the lowest PIC and He values, indicating that the observed level of polymorphism, and hence the variability within them, was low.
Though formed exclusively based on the molecular profiles, the three groups showed some peculiar traits in relation to resistance to M. enterolobii. Considering all evaluated populations, the P. cattleyanum (CV1) × P. guineense (CV11) cross showed the lowest values for the reproduction factor, making it the population with the highest level of resistance and P. guajava (13.4 II) × P. cattleyanum (P51) the cross with the highest values (Gomes et al. 2017). However, with respect to the guava × araçá crosses, population P. guajava (13.4 II) × P. cattleyanum (P33) presented the lowest amplitude for reproduction factor (0 to 3.60) (Gomes et al. 2017). In this way, genotypes like 24, 29, and 33, with a reproduction factor of 0, can be selected to be used as parents in future crosses with guava.
Genetic structure and diversity among populations of Psidium spp.
The Bayesian approach based on the criterion of Evanno et al. (2005), as well as the approach based on Nei's distance, suggested the formation of three well-structured groups. Populations 3 and 4, which shared the same female parent P. cattleyanum, formed a single group, and so did populations 2 and 5, which shared the same female parent P. guajava. The latter were the most distant from populations 1, 3, and 4, which originate from araçá × araçá crosses. A population derived from an interspecific cross is normally expected to have greater polymorphism; however, if the parents share most of the analyzed genomic regions, this does not occur. The loci analyzed in this study may be in well-preserved regions of the parents' genome, with a low mutation rate (Tuler et al. 2015, Paiva et al. 2014), which may also explain the low number of polymorphic markers and the fact that populations 2 and 5 (P. guajava × P. cattleyanum) and 3 and 4 (P. cattleyanum × P. guineense) were grouped with each other. This approach has been used in fruit trees to analyze the genetic structuring of 26 varieties of Vitis (Santana et al. 2012); for the genetic-molecular characterization of papaya genotypes obtained from three backcross generations; and to check the genetic relationships between coconut accessions collected in Brazil and in different geographic regions of the world (Loiola et al. 2016). This is the first study to characterize Psidium populations using the Bayesian approach.
The SSR markers were efficient in discriminating the populations, which may help in new stages of the guava breeding program aimed at resistance to M. enterolobii. Hybrid 33, obtained from the P. guajava × P. cattleyanum cross, can be used as a parent in future crosses with guava for the generation of the first backcross (RC1). Because this genotype is genetically closer to P. guajava, it shares a larger number of alleles with this parent, which is expected to promote a faster recovery of the guava genome and the release of a cultivar resistant to the nematode.
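The idea of choosing hybrids that share more alleles with the guava parent can be illustrated with a simple allele-sharing proportion. This is only a hypothetical proxy for genetic closeness, not the distance measure used in this study; the function name, locus names, and genotype codes (in the two-digit coding described in the methods) are invented for the example.

```python
def shared_allele_proportion(hybrid, parent):
    """Proportion of a hybrid's alleles that are also present in the parent.

    Both arguments map locus name -> two-digit genotype code (e.g. 12).
    """
    shared = total = 0
    for locus, code in hybrid.items():
        if locus not in parent:
            continue  # skip loci not genotyped in the parent
        parent_alleles = set(divmod(parent[locus], 10))
        shared += sum(allele in parent_alleles for allele in divmod(code, 10))
        total += 2
    return shared / total if total else float("nan")


# Hypothetical genotypes at three loci.
guava_parent = {"L1": 11, "L2": 12, "L3": 22}
hybrids = {
    "hybrid_a": {"L1": 11, "L2": 12, "L3": 23},
    "hybrid_b": {"L1": 13, "L2": 33, "L3": 23},
}
# Hybrids sharing more alleles with the guava parent come first.
ranking = sorted(
    hybrids,
    key=lambda name: shared_allele_proportion(hybrids[name], guava_parent),
    reverse=True,
)
```

In a backcross scheme, hybrids at the top of such a ranking would be the ones expected to recover the recurrent-parent genome in fewer generations.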