Crop Breeding and Applied Biotechnology - 20(1): e256520112, 2020 Guava breeding via full-sib family selection: conducting selection cycle and divergence between parents and families

Abstract: The major hindrance to guava (Psidium guajava L.) growing is the low availability of cultivars for Brazilian producers, who usually rely on few genotype options to establish the crop. In the present study, 11 full-sib families were evaluated in a randomized-block experimental design with three replicates in order to select superior genotypes more efficiently. Genetic parameters were estimated and the best genotypes were selected based on genetic value by applying the REML/BLUP statistical procedure. Additionally, genetic divergence was estimated based on the mean Euclidean distance between the individuals selected via BLUP. Based on genetic divergence, the best genotypes were selected for use as parents in new crosses aiming at continuity of the guava breeding program. The mean values of the traits of selected individuals surpassed the mean values of their parents, confirming that the strategy of obtaining full-sib families is effective in generating considerable gains.


INTRODUCTION
Guava (Psidium guajava L.) is enjoyed in Brazil and in many other countries across the globe. The fruit has a unique flavor and gives rise to a number of by-products, such as jams, compotes, juices, syrups, sweets, cereal bars, and ice creams. Brazilian guava production is approximately 578.5 thousand t per year from a planted area of 21.5 thousand hectares, with a mean yield of 26.91 t ha⁻¹. This represents an estimated return of BRL 794.916 million (IBGE 2018). Nevertheless, the greatest obstacle faced by Brazilian producers is the low number of cultivars available and adapted to the producing regions. Thus, the current challenges for breeders are obtaining, developing, and commercially releasing new cultivars whose fruit has physical and chemical traits that are more attractive to the consumer.
In this scenario, guava breeding focused on the selection of elite genotypes via full-sib families has proven to be an efficient selection strategy, especially when low-heritability traits are involved. In this case, selection consists of ranking the individuals with high genotypic values within the full-sib families by using mixed models (Quintal et al. 2017). Thus, selection of families based on quantitative production traits, such as fruit weight (FW), placenta weight (PLW), pulp yield (PY), and pulp weight (PW), will allow identification of promising progeny and/or families with better chances of higher yield.

CM Bezerra et al.
Intraspecific crosses made to obtain full-sib families generate segregating populations, which have high genetic variability. This variability, in turn, is useful in breeding programs, since it increases the selection power in those generations. However, successful genotype selection not only depends on genetic variability, but also on the accuracy of the analytical methods employed. This is especially true in unbalanced experiments, a common situation in experimentation with fruit species, in which analysis of variance leads to imprecise estimates of variance components. Given this situation, it is essential to use methods that precisely estimate the variance components and allow for prediction of the individual genetic values of candidates for selection (Santos et al. 2015).
To overcome those limitations, mixed-model methodologies can be used as an optimal procedure for selection, allowing for a more accurate selection process. In this approach, variance components are estimated by the Restricted Maximum Likelihood (REML) method and genetic values are predicted by the Best Linear Unbiased Prediction (BLUP) method (Resende 2016). The REML/BLUP methodology allows estimation of variance components even in situations of unbalanced experiments. This possibility becomes extremely important in the context of guava breeding programs, considering the relevance of the selection unit as the individual rather than the selection unit as the mean of groups of individuals, a fact which requires the prediction of breeding values (additive and non-additive) for selection purposes. Furthermore, the survival rate of the plants in the experiments decreases over time. These facts, along with the overlap of generations, tend to generate unbalanced data for use in estimation of genetic parameters and in the prediction of individual genetic values (Viana and Resende 2014).
The present study aimed to estimate the genetic parameters in a guava full-sib progeny test, using the REML/BLUP methodology to select the best individuals within the families, to carry out a comparative analysis between the selected individuals and their parents, and to verify genetic divergence based on all the morpho-agronomic traits in order to predict possible future crosses.

History of the genetic material
The 11 full-sib families evaluated in this experiment originated from populations developed by Pessanha et al. (2011) based on information on existing genetic diversity. The guava breeding program at the Universidade Estadual do Norte Fluminense through selection of full-sib families was started from collections of 20 accessions made by Pessanha et al. (2011) in the cities of São João da Barra and São Francisco do Itabapoana in the state of Rio de Janeiro, as well as selection by RAPD markers of the 7 most genetically contrasting genotypes. This selection was followed by crosses between these individuals to obtain 17 segregating full-sib families.
The parents used in the crosses exhibited a considerable degree of genetic divergence, since the plants were selected in orchards formed from seedlings. The seeds obtained from those crosses generated a segregating population of wide genetic variability that was subsequently evaluated and selected via REML/BLUP by Quintal et al. (2017). The highest yielding progeny were crossed again to give rise to the 11 full-sib families that were evaluated in the present experiment.

Implementation of the experiment
The experiment was conducted in the experimental area of the Universidade Estadual do Norte Fluminense, in Campos dos Goytacazes, in the north of the state of Rio de Janeiro, Brazil, at the Antônio Sarlo Agricultural School (lat 21º 08' 02'' S, long 41º 40' 47'' W, alt 14 m asl). The climate of the region is a sub-humid and dry tropical type, with mean annual temperature from 22 to 25 ºC and mean annual rainfall from 1200 to 1300 mm.
After the seedlings were obtained, the experiment was established as an unbalanced randomized-block design with three replicates. Each replicate consisted of eight plants from each family. Plant spacing was 1.20 m in the row and row spacing was 3.50 m. Fertilizer was applied on different occasions based on the results of soil analysis made after pruning and before harvest. Fertilization consisted of three doses of 123 g urea, 315 g single super phosphate, and 87 g potassium chloride, which were applied per plant every 30 days.
The full-sib family seedlings were planted in September 2014 and evaluated from May to July 2018 in the experimental area located approximately 150 meters from where the parents were planted and under the same conditions of drip irrigation, use of pesticides, production pruning, and fertilization provided to the families.

Traits evaluated
Ten fruit samples were taken from each individual harvested. The fruit was harvested at maturation, i.e., when the peel was yellowish-green (color angle between 112 and 108 °h). The following traits were evaluated: fruit weight (FW), calculated as the mean of the total weight of the fruit from each genotype, obtained using a semi-analytical scale and expressed in g; fruit length (FL), measured in the longitudinal region of the fruit with a digital caliper and expressed in mm; transverse fruit diameter (FD), determined in the equatorial region of the fruit with a digital caliper and expressed in mm; length/diameter ratio (L/D), the ratio between fruit length and diameter; placenta weight (PLW), the weight of the region where the seeds are concentrated, measured on a semi-analytical scale; pulp yield (PY), calculated by dividing pulp weight by fruit weight; endocarp thickness (ET), the thickness of the region where the seeds are concentrated, measured in the placenta region with a digital caliper and expressed in mm; mesocarp thickness (MT), the thickness of the region from the fruit peel to the beginning of the pulp, measured with a digital caliper and expressed in mm; pulp weight (PW), the weight of the mesocarp region of the fruit, determined by subtracting the placenta weight from the total fruit weight; total soluble solids content (TSS), determined using an Atago no.1 digital refractometer, with results expressed in ºBrix; and pH, measured in an aqueous solution containing 5 g of pulp per 50 mL of water, using a W3B pH meter.
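The derived traits in this list follow from simple arithmetic on the measured ones. The sketch below is only an illustration: the function names are hypothetical, pulp yield is assumed to be pulp weight expressed as a percentage of fruit weight, and the example measurements are invented, not data from the experiment.

```python
def derived_fruit_traits(fruit_weight_g, placenta_weight_g, length_mm, diameter_mm):
    """Return (PW in g, PY in %, L/D) for one fruit sample."""
    if fruit_weight_g <= 0 or diameter_mm <= 0:
        raise ValueError("fruit weight and diameter must be positive")
    # PW: total fruit weight minus placenta weight
    pulp_weight_g = fruit_weight_g - placenta_weight_g
    # PY: assumed here to be pulp weight as a percentage of fruit weight
    pulp_yield_pct = 100.0 * pulp_weight_g / fruit_weight_g
    # L/D: fruit length divided by transverse diameter
    length_diameter_ratio = length_mm / diameter_mm
    return pulp_weight_g, pulp_yield_pct, length_diameter_ratio


def genotype_mean(sample_values):
    """Mean of the fruit samples taken from one genotype (ten per plant in the text)."""
    if not sample_values:
        raise ValueError("at least one sample is required")
    return sum(sample_values) / len(sample_values)


# Hypothetical single-fruit measurements, for illustration only
pw, py, ld = derived_fruit_traits(240.0, 48.0, 84.0, 70.0)
```

Keeping the per-fruit derivation separate from the per-genotype mean makes it easy to average either raw or derived values, depending on how the trait is defined.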

Mixed models for evaluation and selection of plants and estimate of genetic parameters
According to the model described in Viana and Resende (2014), deviance analysis was performed to test the effects of the model. The deviance was obtained as $D = -2\ln(L)$, with $\ln(L) = -\frac{1}{2}\ln|X'V^{-1}X| - \frac{1}{2}\ln|V| - \frac{1}{2}(y - Xm)'V^{-1}(y - Xm)$, where $\ln(L)$ is the maximum point of the logarithm of the restricted maximum likelihood (REML) function; $y$ is the vector of the variable analyzed; $m$ is the vector of the effects of observations, assumed as fixed; $X$ is the incidence matrix of fixed effects; and $V$ is the variance-covariance matrix of $y$.
The likelihood ratio test (LRT) was applied to test the significance of the effects, as follows: $\mathrm{LRT} = \left|-2\ln(L_{we}) + 2\ln(L_{fm})\right|$, where $L_{we}$ is the maximum point of the maximum likelihood function for the reduced model (without the effect under test), and $L_{fm}$ is the maximum point of the maximum likelihood function for the full model.
The variables were analyzed using Selegen-REML/BLUP software (Resende 2016), in which the variance components were obtained by the REML method, and the individual genotypic values were obtained by the BLUP approach. The matrix representation of the statistical model is y = Xr + Zg + Wp + e, where y is the data vector; r is the vector of replicate effects (fixed), added to the overall mean; g is the vector of individual genotypic effects (random); p is the vector of plot effects (random); and e is the vector of errors or residuals (random). Uppercase letters represent the incidence matrices for the effects mentioned. Employing the REML/BLUP procedure, all the individuals were ranked according to the genotypic values found for each trait. Based on these values, the 30 best individuals for each trait were selected.
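The ranking step can be illustrated with a deliberately simplified sketch. It is not the Selegen-REML/BLUP implementation: it drops the replicate and plot effects, assumes a one-way random model y_ij = mu + g_i + e_ij, and treats the variance components as already known (in practice they would be REML estimates). The function names, variance values, and records are hypothetical.

```python
from collections import defaultdict


def blup_one_way(records, var_g, var_e):
    """Predict genotypic effects under y_ij = mu + g_i + e_ij.

    records: iterable of (genotype_id, observed_value); unbalanced data are allowed.
    var_g, var_e: genotypic and residual variances, assumed known.
    Returns (GLS estimate of mu, {genotype_id: predicted effect}).
    """
    groups = defaultdict(list)
    for genotype, value in records:
        groups[genotype].append(value)

    means, weights = {}, {}
    for genotype, values in groups.items():
        n = len(values)
        means[genotype] = sum(values) / n
        # GLS weight of a genotype mean: n / (var_e + n * var_g)
        weights[genotype] = n / (var_e + n * var_g)

    mu = sum(weights[g] * means[g] for g in groups) / sum(weights.values())
    # Shrinkage factor n * var_g / (var_e + n * var_g) equals var_g * weight
    effects = {g: var_g * weights[g] * (means[g] - mu) for g in groups}
    return mu, effects


def select_top(effects, k):
    """Return the k genotype ids with the highest predicted effects."""
    return sorted(effects, key=effects.get, reverse=True)[:k]


# Hypothetical fruit-weight records (g) with unequal numbers of observations
records = [("A", 300.0), ("A", 320.0), ("B", 240.0),
           ("C", 200.0), ("C", 210.0), ("C", 190.0)]
mu, effects = blup_one_way(records, var_g=1500.0, var_e=250.0)
best_two = select_top(effects, 2)
```

The shrinkage factor pulls the deviation of each genotype mean toward zero, more strongly for genotypes with few observations, which is why predicted values remain comparable under unbalanced data.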

Multivariate analysis
After the information on agronomic traits was obtained, for multivariate analysis, the data from the 30 selected individuals were used, in addition to the parental data. Data from all traits evaluated were considered to obtain the genetic distance matrix, construction of which was based on Euclidean distance. From the distance matrix, the dendrogram was designed by the UPGMA method (Unweighted Pair-Group Method with Arithmetic Mean). All analyses were performed using R software <http://www.r-project.org>.
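The study ran this analysis in R; the following Python sketch only illustrates the two steps described, a pairwise distance matrix followed by UPGMA (average-linkage) clustering. Standardizing each trait before computing the mean Euclidean distance is an assumption, since the text does not say whether the data were scaled, and the labels and trait values are hypothetical.

```python
import math
from itertools import combinations


def standardize(rows):
    """Scale each trait (column) to zero mean and unit standard deviation."""
    n = len(rows)
    scaled_columns = []
    for column in zip(*rows):
        mean = sum(column) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in column) / (n - 1))
        scaled_columns.append([(v - mean) / sd if sd > 0 else 0.0 for v in column])
    return [list(row) for row in zip(*scaled_columns)]


def mean_euclidean(a, b):
    """Square root of the mean squared difference across traits."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def upgma(labels, rows):
    """Average-linkage clustering; returns merges as (members_a, members_b, distance)."""
    data = standardize(rows)
    n = len(labels)
    base = {(i, j): mean_euclidean(data[i], data[j])
            for i, j in combinations(range(n), 2)}
    clusters = [[i] for i in range(n)]
    merges = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            # UPGMA distance: arithmetic mean over all between-cluster pairs
            pairs = [(min(i, j), max(i, j)) for i in clusters[a] for j in clusters[b]]
            dist = sum(base[p] for p in pairs) / len(pairs)
            if best is None or dist < best[0]:
                best = (dist, a, b)
        dist, a, b = best
        merges.append((sorted(labels[i] for i in clusters[a]),
                       sorted(labels[i] for i in clusters[b]), dist))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return merges


# Hypothetical genotypes and one parent described by two traits (FW in g, FL in mm)
labels = ["G1", "G2", "G3", "P1"]
rows = [[300.0, 90.0], [305.0, 91.0], [200.0, 70.0], [205.0, 72.0]]
merges = upgma(labels, rows)
```

The merge list carries the same information as a dendrogram: each entry records which groups were joined and at what height. This brute-force version recomputes averages at every step, which is adequate for a few dozen genotypes.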

Genetic evaluation of the families
The results indicate the existence of significant differences based on the chi-squared test at the 5% probability level for FW, FL, FD, L/D, PLW, ET, MT, PW, and pH (Table 1). This suggests the existence of genetic variability among the genotypes evaluated, allowing successful selection of superior genotypes based on those traits. For placenta weight and endocarp thickness, significant differences were found for the plot effect, indicating that there is significant variability among plots for those traits. This result is due to the fact that these traits are highly affected by the environment; that is, a small environmental variation affects their behavior within the blocks.
The highest genotypic variances relative to phenotypic variance were estimated for the traits FW, FL, FD, PLW, ET, MT, and PW, with values of 1517.16, 37.67, 17.67, 51.84, 7.16, 1.04, and 1034.92, respectively (Table 2). This shows the existence of high genetic variance that can be exploited via selection using those traits. When positive and different from zero, genetic variance estimates (σ²g) indicate variability among the progeny due to genotype and, consequently, the possibility of selecting superior individuals for each trait evaluated. The lowest genotypic variance values were found for the traits L/D, PY, TSS, and pH, ranging from 0.0016 (pH) to 0.64 (PY). These low values reflect the low genetic variability of these traits. Thus, the genetic gains obtained from selection on those traits are also expected to be low, since most of the variance found for them is environmental, resulting in low heritability values (Table 2). The highest estimated values for phenotypic variance were found for FW, [...] (Table 2). The lower phenotypic variance values found for the L/D ratio (0.0087) and pH (0.08) may indicate that the environmental effect on them is low and depends, in this case, on the heritability and selection accuracy estimated for them.
[Table 2 caption, partially recovered: "[...] accuracy of family selection (Acprog), and overall mean of the traits obtained by the REML procedure for 11 traits evaluated in progenies of 11 full-sib families"]
When aiming at selection of individuals, it is very important that phenotypic variation be composed mostly of variation derived from the genotype of the selection candidates, thus contributing to higher heritability estimates for the trait. Regarding those estimates, individual narrow-sense heritability ranged from 0.003 to 0.861 (Table 2). The fact that the highest estimates of individual heritability in this study were obtained for the traits FW, FD, FL, PW, and PLW (0.861, 0.85, 0.84, 0.82, and 0.72, respectively) indicates that individual selection based on those traits may be successful.
Estimated heritability values on a family mean basis (h²mp) were higher than the narrow-sense individual heritability values (Table 2). Moreover, pH and MT showed some of the highest heritability estimates on a family mean basis (0.90 and 0.91, respectively), despite their medium-magnitude individual heritability values (0.53 and 0.57). High estimates of both individual heritability and heritability on a family mean basis indicate that individual selection and selection between families are both applicable to those traits. For traits whose individual heritabilities are of medium to high magnitude, individual selection would be the most appropriate, as these traits are under strong genetic control, which ensures significant genetic gains and good selection possibilities at the individual and progeny levels; individual selection is more efficient in predicting the individual genetic value for such traits (Cunha et al. 2004). However, because of the low individual heritability values for L/D, PY, and TSS, only selection among families is recommended for them. The estimated heritability values provide information on the correlation between phenotypic and genotypic values: the higher the heritability, the more closely the differences measured between individuals reflect true genetic differences, which supports the success of the selection strategy adopted (Leite et al. 2016). This shows the importance of performing genotype selection based on the predicted genotypic value rather than on the measured phenotype alone. Paiva et al. (2016) worked with the guava population from which the best genotypes were selected to obtain the full-sib families of the present study. They examined and compared different selection criteria in the guava crop, estimating genetic parameters and comparing the genetic gains predicted by different selection indices between and within full-sib families.
The heritability coefficients for traits such as MT, PW, and L/D ratio were also higher in their study when estimated on a progeny mean basis than on an individual basis; however, their individual heritability values for those traits were below the values estimated in the current study, ranging from 0.073 for MT to 0.2635 for L/D. Thus, despite the selection made in the base population, genetic variability was not lost, since the heritability values were high in the present experiment. This confirms that genetic improvement via full-sib families is advantageous and brings significant gains for traits of agronomic interest.
The lower estimated values for individual heritability and heritability on a family mean basis for the TSS and PY traits are related to their low genetic variance combined with relatively high phenotypic variance. Those traits exhibited low values mainly for individual heritability. In general, the use of selection procedures based on mixed models is justified even for low-heritability traits, because favorable genetic gains were predicted and the genotypes showed potential for selection.
In this experiment, selection accuracy values ranged from low to very high magnitude (0.19 to 0.97), according to the classification of Resende and Duarte (2007) (Table 2). The highest estimates were obtained for FD, PW, FL, and FW (0.97 for all), whereas the lowest estimates were found for TSS and PY (0.19 and 0.78, respectively). The high selection accuracy estimated for FD, PW, FL, and FW implies that the predicted genetic values were highly precise and highly correlated with the true genetic values of the selection candidates. High accuracy values were also obtained by Quintal et al. (2017), who evaluated 17 segregating full-sib families of guava and obtained accuracy values of 0.82 for FW, 0.84 for PY, 0.75 for PW, 0.86 for TSS, and 0.89 for vitamin C. For their part, Santos et al. (2015) state that variables with high accuracy values also indicate good genetic control in expression of a trait. Accuracy values greater than 0.70 are sufficiently high to make precise inferences regarding the genetic value of the progeny, whereas low accuracy values are usually related to low heritability estimates of a trait (Santin et al. 2019). It is noteworthy that, in the present experiment, only TSS showed an accuracy value lower than 0.70. Resende and Duarte (2007) reported that this reliability factor is a function of the genotypic determination coefficient associated with the trait in question, which corresponds to the heritability coefficient in an intrapopulation selection process.
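The link between accuracy and heritability can be made concrete with a small sketch. It assumes that the accuracy of family selection equals the square root of the heritability on a family mean basis, a relationship commonly used in progeny analyses; the text does not state the formula, so this is an assumption, and the function names are hypothetical. The 0.70 threshold is the one cited above.

```python
import math

ACCURACY_THRESHOLD = 0.70  # threshold cited in the text for precise inference


def accuracy_from_family_mean_heritability(h2_mp):
    """Assumed relationship: accuracy = sqrt(heritability on a family mean basis)."""
    if not 0.0 <= h2_mp <= 1.0:
        raise ValueError("heritability must lie between 0 and 1")
    return math.sqrt(h2_mp)


def is_sufficiently_accurate(accuracy, threshold=ACCURACY_THRESHOLD):
    """True when the accuracy reaches the chosen threshold."""
    return accuracy >= threshold


# Family-mean heritabilities reported in the text for pH (0.90) and MT (0.91)
acc_ph = accuracy_from_family_mean_heritability(0.90)
acc_mt = accuracy_from_family_mean_heritability(0.91)
```

Under this assumption, a family-mean heritability of about 0.49 is the point at which accuracy reaches 0.70, which shows why low-heritability traits fall below the threshold.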
For selection purposes, the 30 best individuals were selected for each trait analyzed separately. The genetic gains were predicted and the new means were estimated in relation to the overall mean of the traits (Table 3). The genetic gain estimated via BLUP is equivalent to the mean of the genetic values predicted for the selected genotypes, and the new mean is the overall mean of the trait plus the gain. As a result, a higher population mean is obtained for that trait (Santos et al. 2015). In a comparison between the mean of the 30 selected individuals and the overall mean of the population, the highest average genetic gains were obtained for PW (40.65%) and FW (39.90%). Other traits with positive predicted gains were FL (17.21%), MT (15.39%), and FD (13.14%). The other traits evaluated in this experiment showed smaller predicted genetic gains, which ranged from -0.31% for PLW to 5.69% for L/D.
The aim of the present study was to identify genotypes that are promising for fruit quality traits by selection within families, prioritizing gains in the traits that achieved the highest genetic variance and heritability values and, consequently, optimum accuracy for precise selection of individuals and generation of higher gains, as occurred for FW and PW. The selection of individuals with a higher FW is advantageous and may allow for gains in production, since FW is directly and positively correlated with fruit size and, consequently, with production (Fachi et al. 2018).
Table 3. New predicted means (NM) and genetic gain estimated via REML/BLUP from 168 individuals from 11 full-sib families for 11 morpho-agronomic traits [table body not recovered; surviving column headers: Ord., Fruit weight (g), Fruit length (mm)]
Genotype 165 obtained the best means for FW (371.47 g) and PW (302.89 g) simultaneously, representing additions of 131.85 g and 110.46 g (increases of 55.02% and 57.40%, respectively) to the overall means of those traits (Table 3). FW and PW are the two most important traits for guava breeding, because they are the traits that most positively affect the yield of each genotype. Genotype 165 also showed excellent gains for other traits: FL (18.98%), FD (16.04%), PY (1.16%), and MT (16.58%). A noteworthy aspect of the selection for FW and PW was that all 30 individuals selected for PW are also present in the group of individuals selected for FW, although not in the same order. This ensures that simultaneous gains are obtained for both traits in this selection.
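The figures reported for genotype 165 can be checked with the relationship between gain, overall mean, and new mean described earlier (new mean = overall mean + gain). In the sketch below the function names are hypothetical; the numeric inputs are the new means and gains quoted in the text, and the overall means are derived from them rather than read from Table 3.

```python
def overall_mean_from(new_mean, gain):
    """Recover the overall trait mean from a new mean and its gain."""
    return new_mean - gain


def gain_percent(gain, overall_mean):
    """Genetic gain expressed as a percentage of the overall mean."""
    if overall_mean == 0:
        raise ValueError("overall mean must be non-zero")
    return 100.0 * gain / overall_mean


# Genotype 165, values quoted in the text (g)
fw_overall = overall_mean_from(371.47, 131.85)   # derived overall mean for FW
pw_overall = overall_mean_from(302.89, 110.46)   # derived overall mean for PW
fw_gain_pct = gain_percent(131.85, fw_overall)   # about 55.02 %
pw_gain_pct = gain_percent(110.46, pw_overall)   # about 57.40 %
```

Both percentages reproduce the values quoted for this genotype, which confirms that the gains are expressed relative to the overall mean and not to the new mean.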
Considering TSS, the mean ºBrix found in this study was higher than the maximum value of 9.50 ºBrix reported by Quintal et al. (2017). In the present experiment, TSS showed practically no genetic variance (0.1), and its mean remained at 15.9 for the 30 genotypes selected. It should be emphasized that, although selection is performed based on the FW and PW traits, the TSS and pH traits are still of paramount importance for fruit quality. Among the 30 genotypes selected based on FW and PW, ºBrix ranged from 11.64 to 18.10, which is important for fruit quality and commercial acceptance, since the guava industry requires pulp with around 13 ºBrix. For the 30 genotypes selected, the pH values obtained in this experiment are within the range mentioned by Yusof (1990) for several guava varieties, a range that is ideal for fresh consumption. If pH values are higher than 3.5, edible organic acids must be added during fruit processing to confer higher quality to the industrialized end product.

Divergence between parents and families
The means of the genotypes evaluated in the present experiment were compared with those of their parents, which were evaluated over eight crop seasons in another experiment. The means per se were used for that evaluation (Figure 1). Three genotypes among the 30 were selected from Family 1. These three had mean FW higher than the mean of their parents, but they had lower performance than the best parent, whose mean FW was 341.51 g (Figure 1).
Genotype 76 was selected from Family 2, with a mean FW of 370.11 g, which was much higher than the means of both parents, P1 and P2. This was the only individual selected from this family. In Family 3, P2 showed a high mean FW (341.51 g), surpassing the selected individuals. This family contributed five individuals, three of which were superior to P1. Among these three, genotype 83 stood out, with a mean value very close to that of P2 (341.40 g). Family 5, for its part, contributed seven individuals among the 30 genotypes selected, whose means were higher than those of the two parents. In this family, genotype 147 stood out, averaging 401.84 g. Only one individual was selected from Family 8: genotype 45, which showed a higher mean than either of the parents. In Family 9, only two genotypes (117 and 165) were selected, both showing a higher mean than the two parents. Genotype 165 stood out with the highest FW (429.57 g) and PW means in this study. Family 10 stood out for contributing the most individuals among the 30 best. Nine genotypes were selected from this family, all of which were superior to P2 and three of which (60, 122, and 124) were superior to both parents. In Family 11, two individuals were selected, 69 and 127, averaging 281.30 g and 346.04 g, respectively.
In conclusion, 16 individuals (53%) had a mean FW higher than the FW of their two parents. This result confirms that the selection strategy based on full-sib families obtained from crossing individuals selected in the previous generations is efficient in producing gains from selection in the guava breeding program. Families 4, 6, and 7 did not have any genotype selected for FW. Thus, they will be discarded in the next generation of recombination for formation of new full-sib families.
In regard to genetic divergence, five distinct groups were formed, considering the 30 genotypes selected for mean FW and including the parents, as shown in Figure 2. Group I contained only genotype 165, which was the most genetically divergent from the others, based on the 11 traits evaluated. It also had the best performance for important traits such as FW and PW. Groups II and III contained only parents of full-sib families: P11 and P5 in Group II; and P2, P8, P10, P13, P3, P4, and P6 in Group III. The divergence shown in this analysis between the full-sib families generated and evaluated in this experiment and the previous parental generation indicates possible superiority of the families for the traits analyzed, since the genetic distance between them was of great magnitude and the mean of the families was higher. Group IV was the largest group, with 20 genotypes. Only two of the genotypes were parents (P7 and P12); the other 18 were progenies (15, 57, 61, 5, 6, 18, 29, 4, 56, 69, 121, 98, 17, 120, 28, 94, 45, and 96). These genotypes had the lowest means among the genotypes selected for the traits FW, FL, FD, and PW, which are the traits most correlated with yield. Group V, for its part, contained only one of the parents (P9) and progenies 147, 117, 60, 76, 19, 59, 83, 124, 122, 127, and 148. In this group, the individuals had high means of FW, FL, FD, and PW.
Therefore, we suggest hybridizing genetically distant individuals, because doing so generates segregating populations with greater genetic variability, which increases the possibility of obtaining superior genotypes.

CONCLUSIONS
1) The REML/BLUP statistical procedure was effective in selection of superior genotypes and prediction of genetic parameters in the population under study;
2) The mean of most of the individuals selected exceeded the mean of their parents, confirming that the strategy of obtaining full-sib families was effective in generating gains in the guava breeding process;
3) Families 1, 2, 3, 5, 8, 9, 10, and 11 were considered superior because they include many genotypes selected for FW; thus, they will be represented in possible new crosses for the formation of new full-sib families; and
4) The most divergent guava genotypes, based on the UPGMA method, should be recommended for future crosses in order to obtain segregating populations and continue the guava breeding program with the aim of obtaining superior genotypes.