Selection of Jatropha genotypes for bioenergy purpose: an approach with multitrait, multiharvest and effective population size

Jatropha curcas L. is a perennial plant with great potential for biodiesel production. The aim of this study was to select J. curcas genotypes for bioenergy purposes considering multitrait, multiharvest and effective population size information. To this end, a data set of 70 J. curcas families, obtained through controlled crosses and evaluated in a randomized block design with six replications and three plants per plot, spaced at 4 × 2 m, was used. The following traits were evaluated: plant height in the 2014 and 2015 crop seasons; canopy projection in the row (2015 crop season); canopy projection between rows (2015 crop season); and grain yield (2015, 2016, and 2017 harvests). The mixed model methodology was used to estimate the variance components and to predict the genetic values. Superior genotypes were selected through the multitrait index based on factor analysis and genotype-ideotype distance (FAI-BLUP index), considering the effective population size. According to this index, 98% of all additive genetic variability was summarized in three factors. By adopting the selection intensity that maintains an effective population size of 30 (i.e., selection of 214 individuals), the predicted gains were 15.56, 21.75, 4.16, and 5.37% for grain yield in the 2015 and 2016 harvests, canopy projection in the row, and canopy projection between rows in the 2015 crop season, respectively. Thus, the results suggest that the FAI-BLUP index can be successfully used in J. curcas breeding.


INTRODUCTION
Jatropha curcas L. (Euphorbiaceae) is a perennial plant native to Mexico and Central America that is widely found in Latin America, Africa, India, and Southeast Asia (Pandey et al. 2012). This species can be used in soil recovery (Reubens et al. 2011), medicine (Pereira et al. 2018) and, since the identification of nontoxic varieties, animal feeding (Makkar 2016). Jatropha curcas also stands out for its great potential in biofuel production (Kumar et al. 2016; Laviola et al. 2017) due to its high yield and oil quality (Laviola et al. 2014). However, some technical-scientific improvements are required, especially regarding its genetics, before this crop can be used as a raw material for renewable energy.
Genetic variability in the population is fundamental for the selection of genotypes with high productive potential. Also, aggregating a set of traits of interest in the same genotype is a challenge to be overcome. Besides the high yield,

Experimental data
The experiment was carried out at the Embrapa Cerrados experimental area, in Planaltina, DF (lat. 15°35'30'' S, long. 47°42'30'' W, at 1007 m asl). The climate of the region is classified as Aw type (tropical with dry winter and rainy summer), according to the Köppen classification, with an average annual temperature of 22 °C, relative humidity of 73%, and average rainfall of 1100 mm. The soil is predominantly classified as red latosol, with high clay content.
Controlled crosses were performed between 42 J. curcas genotypes from the Embrapa Agroenergia Active Germplasm Bank in an unbalanced diallel (3 × 3). Only 70 segregating families were obtained (Table 1), since some crosses did not result in viable seeds. These families were evaluated in a randomized block design with six replications and three plants per plot, spaced at 4 × 2 m, totaling 1206 individuals. The following traits were evaluated: plant height in the 2014 and 2015 crop seasons; canopy projection in the row (2015 crop season); canopy projection between rows (2015 crop season); and grain yield (2015, 2016, and 2017 harvests).

Statistical model
The REML/BLUP procedure was used to estimate the variance components and to predict the genetic values, according to Patterson and Thompson (1971) and Henderson (1975). The statistical model associated with the evaluation of individuals of full-sib families is given by Eq. 1:
$$y = Xr + Za + Wp + Tf + e \qquad (1)$$
where $y$ is the phenotypic data vector; $r$ is the vector of the replication effects (assumed as fixed) added to the overall mean; $a$ is the vector of the individual additive genetic effects (assumed as random); $p$ is the vector of the plot effects (assumed as random); $f$ is the vector of dominance effects of full-sib families (assumed as random); and $e$ is the vector of errors (random). $X$, $Z$, $W$, and $T$ represent the incidence matrices for $r$, $a$, $p$ and $f$, respectively. The significance of the random effects of the statistical model was evaluated by the likelihood ratio test (LRT) (Wilks 1938), using the $\chi^2$ statistic with one degree of freedom and a 0.05 significance level. The narrow-sense heritability ($h_a^2$) and the broad-sense heritability ($h_g^2$) were obtained, respectively, by Eq. 2 and 3 (Resende 2016):
$$h_a^2 = \frac{\sigma_a^2}{\sigma_{Phen}^2} \qquad (2)$$
$$h_g^2 = \frac{\sigma_a^2 + \sigma_d^2}{\sigma_{Phen}^2} \qquad (3)$$
where $\sigma_a^2$ is the estimate of the additive genetic variance, $\sigma_{Phen}^2$ is the estimate of the phenotypic variance and $\sigma_d^2$ is the estimate of the dominance genetic variance.
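The likelihood ratio test and the two heritability ratios described above can be illustrated with a minimal sketch. It assumes that the restricted log-likelihoods of the full and reduced models are already available from a REML fit, and that the heritabilities are simple variance ratios. The function names and the numerical inputs are hypothetical and are not taken from the Selegen software.

```python
import math

def lrt_pvalue(loglik_full, loglik_reduced):
    """Likelihood ratio test for one random effect (chi-square, 1 df).

    Both arguments are log-likelihoods of nested models, assumed to come
    from a REML fit with and without the effect being tested.
    """
    stat = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    # Survival function of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

def heritabilities(var_additive, var_dominance, var_phenotypic):
    """Narrow- and broad-sense heritability as variance ratios (assumed forms)."""
    h2_narrow = var_additive / var_phenotypic
    h2_broad = (var_additive + var_dominance) / var_phenotypic
    return h2_narrow, h2_broad

# Hypothetical inputs, for illustration only.
stat, p_value = lrt_pvalue(loglik_full=-1520.3, loglik_reduced=-1523.1)
h2a, h2g = heritabilities(var_additive=2.0, var_dominance=1.0, var_phenotypic=10.0)
print(f"LRT = {stat:.2f}, p = {p_value:.4f}, h2a = {h2a:.2f}, h2g = {h2g:.2f}")
```

An effect would be declared significant at the 0.05 level when the returned p-value is below 0.05, which corresponds to an LRT statistic above roughly 3.84.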
The solutions for the genetic values of the nonparent individuals were based on Eq. 4 (Resende et al. 2014), where $\hat{a}_{ijk}$ is the estimate of the additive genetic value of individual $i$ of family $j$ in block $k$; $\mu$ is the overall mean of individuals; $\hat{a}_p$ and $\hat{a}_m$ are the predicted additive genetic values for parents $p$ and $m$, respectively; $\sigma_e^2$ is the estimate of residual variance; $y_{ijk}$ is the phenotypic observation of individual $i$ of family $j$ in block $k$; $b_k$ is the estimate of the effect of block $k$; $f_j$ is the predicted dominance genetic value for family $j$; and $p_{ik}$ is the predicted plot effect (individual $i$ in block $k$). For individuals without a phenotypic observation (not available), the additive genetic value was given by the mean of the additive effects of the parents, added to the overall mean of individuals.

Effective population size
The effective population size ($N_e$) is the size of an ideal population that would generate the same amount of inbreeding or variance in allele frequencies as the study population (Kimura and Crow 1963). The $N_e$ estimator is given by Eq. 5 (Resende 2015), where $N_f$ is the number of families selected; $k_f$ is the mean number of individuals selected per family; and $\sigma_f^2$ is the variance of the number of individuals selected per family (Resende 2015).
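As an illustration of how an estimator built from $N_f$, $k_f$ and $\sigma_f^2$ can be computed, the sketch below uses a form often quoted for full-sib families, $N_e = 2 N_f k_f / (k_f + 1 + \sigma_f^2 / k_f)$. Whether Eq. 5 has exactly this form is an assumption, as is the use of the sample variance. The function name and the family counts are hypothetical.

```python
from statistics import mean, variance

def effective_population_size(selected_per_family):
    """Effective population size from per-family selection counts.

    Assumed full-sib form: Ne = 2*Nf*kf / (kf + 1 + var_kf / kf).
    `selected_per_family` lists the number of individuals selected in each
    family that contributed at least one individual.
    """
    n_families = len(selected_per_family)
    k_mean = mean(selected_per_family)
    # Sample variance (n - 1 denominator) is an assumption of this sketch.
    k_var = variance(selected_per_family) if n_families > 1 else 0.0
    return (2.0 * n_families * k_mean) / (k_mean + 1.0 + k_var / k_mean)

# Hypothetical counts: five families contributing unequal numbers of plants.
counts = [12, 8, 5, 3, 2]
print(round(effective_population_size(counts), 2))
```

Under this form, unequal family contributions inflate the variance term and reduce $N_e$, so a balanced selection across families preserves more effective size for the same number of selected individuals.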

Effective family size
The estimator of the effective number of families selected ($N_{ef}$) is given by Eq. 6 (Resende 2015):
$$N_{ef} = \frac{\left(\sum_j k_{fj}\right)^2}{\sum_j k_{fj}^2} \qquad (6)$$
where $k_{fj}$ is the number of individuals selected from family $j$ (Robertson 1961).

Genetic diversity of selected families
The genetic diversity among the families selected was calculated based on Eq. 7 (Wei and Lindgren 1996):
$$D = \frac{N_{ef}}{N_{f0}} \qquad (7)$$
where $N_{f0}$ is the original number of families.
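The effective number of families and the diversity coefficient can be computed from the same per-family counts. The sketch below assumes the Robertson-type form $N_{ef} = (\sum_j k_{fj})^2 / \sum_j k_{fj}^2$ and the ratio $D = N_{ef} / N_{f0}$. The function names and the counts are hypothetical.

```python
def effective_family_number(selected_per_family):
    """Effective number of families: (sum of k)^2 / (sum of k^2), assumed form."""
    total = sum(selected_per_family)
    return total ** 2 / sum(k * k for k in selected_per_family)

def genetic_diversity(selected_per_family, n_families_original):
    """Diversity coefficient D = Nef / Nf0, assumed form."""
    return effective_family_number(selected_per_family) / n_families_original

# Hypothetical counts: four families selected out of eight evaluated.
balanced = [5, 5, 5, 5]
unbalanced = [17, 1, 1, 1]
print(effective_family_number(balanced), genetic_diversity(balanced, 8))
print(round(effective_family_number(unbalanced), 2))
```

With equal contributions, the effective number equals the actual number of selected families. It falls toward one as a single family comes to dominate the selected set.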

Genetic selection based on multitrait and multiharvest
Superior genotypes were selected through the multitrait index based on factor analysis and genotype-ideotype distance (FAI-BLUP index) (Rocha et al. 2018), which is based on the theoretical foundations of structural equation models. This index combines exploratory (based on the data) and confirmatory (based on the ideotype) factor analysis techniques. It also considers the correlation structures obtained from the data and directs the selection toward genotypes similar to those idealized by the breeders.
The ideotype used in the FAI-BLUP index for plant height in the 2014 and 2015 crop seasons was defined as the lowest value among the predicted genetic values for these crop seasons. Conversely, for canopy projection in the row and canopy projection between rows, evaluated in the 2015 crop season, and grain yield, assessed in the 2015, 2016, and 2017 harvests, the highest value among the predicted genetic values for the respective traits and years was considered. Only the traits with a significant additive genetic effect (GCA) were considered for the index composition. The selection intensity was determined based on the effective population size ($N_{ep}$) suitable for the formation of the breeding population ($N_{ep}$ = 30), according to Resende (2015). The variance components, genetic and nongenetic parameters and genetic values were obtained using the Selegen REML/BLUP software (Resende 2016). The genetic selection was made using the FAI-BLUP index routine, made available by Rocha et al. (2018), in the R software (R Development Core Team 2019).
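The ideotype definition and the distance-based ranking can be illustrated with a simplified sketch. It is not the FAI-BLUP routine of Rocha et al. (2018): the factor analysis step is omitted, and the Euclidean distance is computed directly on standardized predicted genetic values rather than on factor scores. The function name, genotype labels and values are hypothetical.

```python
import math
from statistics import mean, pstdev

def rank_by_ideotype_distance(blups, directions):
    """Rank genotypes by Euclidean distance to an ideotype.

    blups: dict mapping genotype -> list of predicted genetic values (one per trait).
    directions: list with "max" or "min" per trait, i.e. the desired extreme.
    Simplification: distances use standardized trait values, not factor scores.
    """
    names = list(blups)
    n_traits = len(directions)
    z_columns = []
    for t in range(n_traits):
        column = [blups[g][t] for g in names]
        m, s = mean(column), pstdev(column)
        z_columns.append([(v - m) / s if s > 0 else 0.0 for v in column])
    # The ideotype takes the best standardized value observed for each trait.
    ideotype = [max(z) if d == "max" else min(z)
                for z, d in zip(z_columns, directions)]
    distance = {
        g: math.sqrt(sum((z_columns[t][i] - ideotype[t]) ** 2
                         for t in range(n_traits)))
        for i, g in enumerate(names)
    }
    return sorted(names, key=distance.get), distance

# Hypothetical predicted values: [plant height, grain yield].
blups = {"G1": [2.0, 10.0], "G2": [1.8, 12.0], "G3": [2.4, 8.0]}
ranking, distance = rank_by_ideotype_distance(blups, ["min", "max"])
print(ranking)
```

Genotypes at the top of the ranking are those closest to the ideotype. The selection intensity then determines how many of them are retained.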

Additive and dominance genetic effects
The significance of the random effects of the statistical model was evaluated using the likelihood ratio test (LRT), based on Resende (2016). This test indicated a nonsignificant effect of the general combining ability (GCA) for traits HGT14, HGT15, and GY17 (Table 2). A nonsignificant effect was also observed for the specific combining ability (SCA) for traits HGT15 and CPBR15 and for the plot effect for trait GY17. For the other effects, the likelihood ratio test revealed a significant effect (p < 0.05) (Table 1). Since no additive genetic variance was observed for traits HGT14, HGT15, and GY17, they were not used in the FAI-BLUP index composition.
Additive genetic variance estimates ($\sigma_a^2$) for traits HGT14, HGT15, and GY17 do not differ statistically from zero (Table 1). Consequently, the corresponding narrow-sense heritability values ($h_a^2$) do not differ from zero. The estimates for the other traits are higher than zero, indicating the existence of genetic variability, which allows the selection of superior genotypes. The estimate of dominance genetic variance ($\sigma_d^2$) for trait HGT15 was the only one that did not differ from zero (Fig. 1).
Narrow-sense ($h_a^2$) and broad-sense ($h_g^2$) heritability estimates for GY increased from the 2015 harvest to the 2016 harvest (by 132.21 and 14.85%, respectively). The $h_a^2$ estimate of CPBR15 was higher than that of CPIR15, whereas the opposite occurred for the $h_g^2$ estimates. Finally, the overall mean for GY increased over the harvests.

Effective population and family sizes and genetic diversity of selected families
The 214 individuals selected belong to 28 out of the 70 families evaluated (Table S1). The families selected belong to a gene pool that includes 22 out of 33 parents evaluated. The effective number of families selected ($N_{ef}$) was 16.38. Regarding the genetic diversity of these families, the estimated coefficient of genetic diversity ($D$) was 0.23.
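As a quick consistency check on the values reported above, and assuming that the diversity coefficient is the ratio of the effective number of families selected to the original number of families, the arithmetic can be written as:

```latex
\bar{k}_f = \frac{214}{28} \approx 7.64
\qquad\text{and}\qquad
D = \frac{N_{ef}}{N_{f0}} = \frac{16.38}{70} \approx 0.23
```

Here $\bar{k}_f$ denotes the mean number of individuals selected per family. An effective number of families (16.38) well below the 28 families actually represented indicates that the families contributed unequal numbers of selected individuals.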

Individual genetic selection based on multitrait and multiharvest information (FAI-BLUP index)
The FAI-BLUP index was used to select superior genotypes for bioenergy purposes, considering multitrait and multiharvest information simultaneously. The ranking of the 1206 individuals (green and black dashes) is shown in Fig. 2, in which the 214 best genotypes (green dashes) are the closest to the desirable ideotype. According to the FAI-BLUP index, 98% of all additive genetic variability was summarized in three factors. Factor 1 clustered the traits canopy projection in the row and canopy projection between rows in the 2015 crop season, explaining 54% of the variability. Factor 2 comprised grain yield in the 2015 harvest, explaining 26% of the total variability. Factor 3 comprised grain yield in the 2016 harvest, explaining 18% of the total variability. These results indicate a high correlation between the traits of the first factor and the presence of genotypes × harvests interaction for grain yield.
By adopting the selection intensity that maintains $N_{ep}$ = 30 (i.e., selection of 214 individuals), the predicted gains were 15.56, 21.75, 4.16, and 5.37% for grain yield in the 2015 and 2016 harvests, canopy projection in the row, and canopy projection between rows in the 2015 crop season, respectively.
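A predicted gain expressed as a percentage can be sketched as follows. The sketch assumes that the gain for a trait is the mean predicted additive genetic effect of the selected individuals (expressed as a deviation from the overall mean) divided by the overall mean of that trait. This is a common convention and is not confirmed by the text as the exact calculation used here. The function name, identifiers and values are hypothetical.

```python
from statistics import mean

def predicted_gain_percent(additive_effects, selected_ids, overall_mean):
    """Predicted selection gain as a percentage of the trait's overall mean.

    additive_effects: dict mapping individual id -> predicted additive effect,
    expressed as a deviation from the overall mean (assumption of this sketch).
    """
    gain = mean(additive_effects[i] for i in selected_ids)
    return 100.0 * gain / overall_mean

# Hypothetical predicted effects for three individuals and a trait mean of 10.
effects = {"a": 2.0, "b": 1.0, "c": -3.0}
print(predicted_gain_percent(effects, ["a", "b"], overall_mean=10.0))
```

Selecting more individuals to satisfy an effective population size restriction generally lowers this value, because individuals with smaller predicted effects enter the selected set.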

DISCUSSION
The GCA is fundamental to identifying and selecting superior genotypes and to quantifying the breeding potential of the studied population. Consequently, it allows direct selection, which aims at increasing the frequency of favorable alleles (Resende 2015). The GCA effect prevailed over the SCA effect only for traits GY16 and CPBR15 (Table 1). These results indicate that these traits can be improved by applying simple selection techniques, and that the strategy to be used is intrapopulation recurrent selection, which exploits the additive genetic effects.
Conversely, dominance effects (SCA) were prevalent in the genetic control of HGT14, GY15, GY17, and CPIR15, suggesting differences in the genetic compositions of the parents (Table 2). Biabani et al. (2012) observed higher SCA than GCA for plant height in diallel crosses with J. curcas. These results suggest that, to improve the population for these traits, the methods must prioritize the capitalization of heterosis (genetic dominance and divergence), and the breeding strategy should be recurrent interpopulation selection to improve the best hybrids.
Narrow-sense individual heritability ($h_a^2$) quantifies the relative contribution of additive genes to the expression of the trait (Laviola et al. 2012). All estimates obtained can be considered low, according to the classification proposed by Resende (2002). Values of similar magnitude were reported by Laviola et al. (2010) for grain yield measured at 12 months in 110 Jatropha accessions. Spinelli et al. (2015) also found lower $h_a^2$ estimates for this trait in 16 Jatropha half-sib families in the second, third, and fourth years after planting. In general, the increase in the $h_a^2$ and $h_g^2$ estimates from the 2015 harvest to the 2016 harvest indicates that GY16 is the most suitable trait for selecting the most productive Jatropha genotypes.
The FAI-BLUP index was established based on the traits that showed GCA values statistically higher than zero, i.e., those for which there was genetic variability. The first two factors explained 80% of the total variability among individuals (Fig. 2). The clustering of the CPBR15 and CPIR15 traits in factor 1 indicates a high correlation between them (Fig. 2), as already reported by Laviola et al. (2010). The fact that the traits GY15 and GY16 are allocated to different factors demonstrates the genotypes × harvests interaction for grain yield, i.e., the genotypes showed different yields in these years. These results are similar to those obtained by Laviola et al. (2013), Teodoro et al. (2016), and Alves et al. (2018), who verified low repeatability of the performance of Jatropha genotypes for grain yield in the early years.
The clustering of GY15 in factor 2 (26% of the total variability) can be explained by the fact that, in the early years, some Jatropha genotypes produce very few grains or none at all, generating great phenotypic variability for this trait. However, the phenotypic variability of this trait was predominantly composed of plot variance (Fig. 1). In the 2016 harvest, most of the genotypes produced grains, reducing plot variance and increasing genetic variance (Table 1).
The selection index used in this study allowed positive and high gains with selection for the traits GY15, GY16, CPBR15, and CPIR15. According to Rocha et al. (2018), the FAI-BLUP index has several advantages over the classic index proposed by Smith (1936) and Hazel (1943), since the former is based on the theory of structural equation models. Moreover, the FAI-BLUP index does not assign weights and is free from multicollinearity. The estimates of selection gains obtained in this study are aligned with those of other works (Spinelli et al. 2015) that used indices to select J. curcas genotypes to produce bioenergy.
It should be noted that the FAI-BLUP index has been successfully used to assist the selection of sorghum hybrids that simultaneously combine favorable traits for the production of second-generation bioethanol. Oliveira et al. (2019) selected five sorghum hybrids with higher potential for energy cogeneration using the FAI-BLUP index. Woyann et al. (2019) used the FAI-BLUP index in recombinant inbred soybean lines to identify those that best matched the ideotype for biodiesel production.
It should be highlighted that the gains predicted with selection in this study were obtained with a restriction on the effective population size ($N_{ep}$ = 30) and, for that, 214 individuals were selected. An adequate effective population size is of paramount importance for the success of new cycles of recombination and selection in any breeding program, because the effective population size must be maintained at a reliable level in each generation to avoid losing favorable alleles (Resende 2002).

CONCLUSION
This study applied the mixed model methodology and the FAI-BLUP index for genetic selection, with restriction in the effective population size, of J. curcas genotypes. The results suggest that this procedure can be successfully used in J. curcas breeding.