INTRODUCTION:
Rice (Oryza sativa) is one of the most important crops in the world. Increased rice production has played a key role in food security, especially in developing countries in Asia and Africa (^{CHEN, 2017}). Currently, the production of this crop is approximately 12,327.8 thousand tons (^{CONAB, 2018}). Although current production supplies the world’s population, it is estimated that by 2050 world rice production must increase by 60 to 110% to meet demand (^{GODFRAY et al., 2010}; ^{TILMAN et al., 2011}; ^{RAY et al., 2013}). Thus, new lines with improved yield over existing varieties need to be developed. According to ^{SPINDEL et al. (2015)}, conventional breeding and selection methods are extremely time-consuming, taking ten years on average for elite varieties to be developed and identified.
The continued evolution of sequencing and genotyping technologies has led to a breakthrough in molecular genetics. These advances allow DNA information to be used directly to identify genetically superior individuals, thereby shortening the selection cycle to the benefit of plant breeding programs. For this purpose, ^{MEUWISSEN et al. (2001)} devised genome-wide selection (GWS), which consists of the analysis of a large number of single nucleotide polymorphisms (SNPs) distributed across the genome, capturing the genes that affect the quantitative trait of interest. According to ^{MEUWISSEN et al. (2001)}, it can be assumed that some of these molecular markers are in linkage disequilibrium (LD) with quantitative trait loci (QTL), allowing their direct use in the prediction of genomic estimated breeding values (GEBVs) of the individuals subject to selection.
Several methods, such as the Bayesian methods Bayesian Least Absolute Shrinkage and Selection Operator (BLASSO) and BayesCpi, and the mixed-model method Genomic Best Linear Unbiased Predictor (GBLUP), have been extensively applied to GWS and are recommended for genomic prediction (^{DE LOS CAMPOS et al., 2012}, ^{AZEVEDO et al., 2015}). However, new methodologies have been proposed, such as the Deltap method (^{RESENDE, 2015}; ^{LIMA et al., 2019}), which does not demand an iterative computational method and, consequently, does not require an evaluation of the convergence of results. This methodology divides the estimation population into two subpopulations, one associated with higher phenotypic values and the other with lower phenotypic values. The effects of the markers are estimated nonparametrically using the difference between the allelic frequencies and the genetic gain associated with these two subpopulations. With the goal of combining the good properties of different methodologies, ^{LIMA et al. (2019)} proposed a genomic index, called the Deltap/GBLUP index, which combines the estimated genomic values obtained by Deltap and GBLUP. The Deltap/GBLUP index was more accurate than GBLUP in genomic prediction. The genomic index can, however, combine predictions from other statistical methodologies indicated in the literature, such as BLASSO and BayesCpi. Using other methodologies to compose the index can be of interest because it exploits the specific properties of each genomic selection method with respect to the genetic architecture of the evaluated traits. In addition, the Bayesian approach has been used successfully in other areas (^{MACEDO et al., 2014}, ^{GARNERO et al., 2014}).
Consequently, the goals of the present study were to evaluate the Deltap/BLASSO and Deltap/BayesCpi genomic indexes and to compare them with the Deltap/GBLUP index in terms of efficiency in predicting the additive genomic values of individuals for traits related to photosynthetic yield, grain quality, yield, and blast resistance in rice.
MATERIALS AND METHODS:
Description of the database
The database used in this study was composed of nine traits measured on 352 rice accessions (Oryza sativa), which were genotyped for 44,100 SNP markers. The dataset is part of two projects, the OryzaSNP Project and the OMAP Project (^{AMMIRAJU et al., 2006}), and is publicly available at https://ricediversity.org/data/.
Plantings were supervised throughout the growing period, from May to October of 2006 and 2007. A complete block design with two replications was used, in which the planting rows were 5 m long. Plants were spaced 25 cm apart, with 0.50 m between rows. Further details are reported in ^{ZHAO et al. (2011)}. Quality control of the markers considered a call rate of 70% and a minor allele frequency (MAF) threshold of 1%. After quality control of the genomic database, 36,901 SNP markers remained.
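The quality control step can be illustrated with a minimal sketch. It assumes that a marker is discarded when its call rate is below 70% or its MAF is below 1%, which is the usual reading of such thresholds but is not spelled out in the text. The function name and the 0/1/2 genotype coding (with None for a missing call) are hypothetical choices made for this illustration.

```python
def marker_passes_qc(genotypes, min_call_rate=0.70, min_maf=0.01):
    """Return True if one marker passes the call-rate and MAF filters.

    genotypes: allele counts (0, 1 or 2) for one marker across individuals,
    with None marking a missing call (hypothetical coding).
    ASSUMPTION: markers below either threshold are removed.
    """
    if not genotypes:
        return False
    called = [g for g in genotypes if g is not None]
    call_rate = len(called) / len(genotypes)
    if call_rate < min_call_rate:
        return False
    # Frequency of the counted allele among the called genotypes.
    freq = sum(called) / (2 * len(called))
    maf = min(freq, 1 - freq)
    return maf >= min_maf


# Keep only the markers (columns) that pass both filters.
markers = {
    "snp_a": [0, 1, 2, None],        # call rate 0.75, MAF 0.5
    "snp_b": [0, None, None, 1],     # call rate 0.50
    "snp_c": [0, 0, 0, 0],           # monomorphic, MAF 0
}
kept = [name for name, calls in markers.items() if marker_passes_qc(calls)]
```

In this toy input only the first marker is kept; the other two fail the call-rate and MAF filters, respectively.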
The nine traits analyzed were considered to have relevant effects on the improvement of rice. They were flag leaf length (FLL), flag leaf width (FLW), amylose content (AC), protein content (PC), panicles number per plant (PNPP), seed length (SL), seed width (SW), primary panicle branch number (PPBN), and blast resistance (BR). The first two traits (FLL and FLW) are associated with the photosynthetic yield of the plant and traits AC and PC are associated with grain quality. Traits PNPP, SL, SW, and PPBN are associated with the production of the plant, whereas BR is associated with the main rice disease.
Deltap method
The Deltap method proposed by ^{RESENDE (2015)} and ^{LIMA et al. (2019)} is based on the concept of changes in allele frequency caused by selection and on genetic gain theory (the contrast between the averages of two subpopulations). The method consists of the following steps:
i) Division of the training population into two subpopulations (subpopulation one and subpopulation two), according to the phenotype of the trait corrected for systematic effects;
ii) Calculation of the difference between the allelic frequencies of the subpopulations;
iii) Calculation of the average difference between the allelic frequencies of the subpopulations;
iv) Calculation of the average allelic substitution effect;
v) Calculation of the allelic substitution effect of the ith marker;
vi) Calculation of the additive genomic value of the jth individual (j = 1, 2,..., N, with N being the total number of individuals in the validation population).
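The six steps can be sketched in code as follows. This is only one plausible reading of the procedure, not the authors' implementation: the exact formulas are not reproduced in this text, so the median split, the use of absolute differences in step iii, and the scaling of the average effect in step iv are all assumptions, and the function names are hypothetical.

```python
from statistics import mean


def deltap_effects(genotypes, phenotypes):
    """Estimate marker effects following steps i) to v) (illustrative only).

    genotypes: one row per individual, each a list of 0/1/2 allele counts.
    phenotypes: corrected phenotypes, in the same order as the rows.
    """
    n_markers = len(genotypes[0])

    # Step i: split the training set into low and high phenotype groups
    # (ASSUMPTION: two halves defined by phenotype rank).
    order = sorted(range(len(phenotypes)), key=lambda j: phenotypes[j])
    half = len(order) // 2
    low, high = order[:half], order[half:]

    def allele_freq(group, i):
        return sum(genotypes[j][i] for j in group) / (2 * len(group))

    # Step ii: difference in allele frequency between the two groups.
    delta = [allele_freq(high, i) - allele_freq(low, i) for i in range(n_markers)]

    # Step iii: average difference (ASSUMPTION: mean of absolute values).
    mean_delta = mean(abs(d) for d in delta)
    if mean_delta == 0:
        return [0.0] * n_markers

    # Step iv: average allelic substitution effect (ASSUMPTION: phenotypic
    # contrast between groups spread over 2 * n_markers allele copies).
    gain = mean(phenotypes[j] for j in high) - mean(phenotypes[j] for j in low)
    mean_effect = gain / (2 * n_markers * mean_delta)

    # Step v: effect of each marker, proportional to its frequency change.
    return [mean_effect * d / mean_delta for d in delta]


def genomic_values(genotypes, effects):
    """Step vi: genomic value as the sum of allele counts times effects."""
    return [sum(w * a for w, a in zip(row, effects)) for row in genotypes]


train_geno = [[0, 2], [0, 1], [2, 1], [2, 0]]
train_pheno = [1.0, 2.0, 3.0, 4.0]
effects = deltap_effects(train_geno, train_pheno)
predicted = genomic_values([[2, 0], [0, 2]], effects)
```

No iteration or convergence check is involved, which is the property of Deltap highlighted in the Introduction.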
GBLUP method
The Genomic Best Linear Unbiased Predictor (GBLUP) method is based on the following linear mixed model:
Equations of mixed models for prediction via the GBLUP method are equivalent to:
with the components of variance,
where p _{ i } and q _{ i } are the allelic frequencies of the ith marker, and W is the incidence matrix for the markers in the training population.
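For orientation, the textbook form of the GBLUP model is sketched below. It is the standard formulation found in the genomic selection literature and is not necessarily term-by-term identical to the equations of the original article. The symbols Z (incidence matrix of individuals), a (vector of additive genomic values), e (residual vector), G (genomic relationship matrix), and the variance components are assumptions introduced for this illustration. W is usually centered by the allele frequencies before G is computed.

```latex
% Linear mixed model
\mathbf{y} = \mathbf{1}\mu + \mathbf{Z}\mathbf{a} + \mathbf{e},
\qquad \mathbf{a} \sim N(\mathbf{0}, \mathbf{G}\sigma^2_a),
\qquad \mathbf{e} \sim N(\mathbf{0}, \mathbf{I}\sigma^2_e)

% Mixed model equations
\begin{bmatrix}
\mathbf{1}'\mathbf{1} & \mathbf{1}'\mathbf{Z} \\
\mathbf{Z}'\mathbf{1} & \mathbf{Z}'\mathbf{Z} + \mathbf{G}^{-1}\dfrac{\sigma^2_e}{\sigma^2_a}
\end{bmatrix}
\begin{bmatrix} \hat{\mu} \\ \hat{\mathbf{a}} \end{bmatrix}
=
\begin{bmatrix} \mathbf{1}'\mathbf{y} \\ \mathbf{Z}'\mathbf{y} \end{bmatrix}

% Genomic relationship matrix
\mathbf{G} = \frac{\mathbf{W}\mathbf{W}'}{2\sum_{i} p_i q_i}
```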
BLASSO method
The Bayesian version of LASSO regression (BLASSO, Bayesian Least Absolute Shrinkage and Selection Operator) for genomic selection was proposed by ^{DE LOS CAMPOS et al. (2009)}. BLASSO includes a common variance term for the genetic and residual effects of markers. The basic linear model for predicting the effects of markers is presented below:
where y is the vector of phenotypes of the training population, µ is the general mean, 1 is a vector of ones with the same dimension as y, α is the vector of allelic substitution effects of the markers with incidence matrix W, and e is the residual vector. The prior distributions of the parameters, in terms of an augmented hierarchical model, are presented below:
in which MNV represents the multivariate normal distribution,
The additive genetic variance of each marker is given by
BayesCpi method
The BayesCpi method was proposed by ^{HABIER et al. (2011)} to allow variable selection and Bayesian learning from the data. The prior distributions assumed for the parameters in model (1) under this method are:
in which the indicator variable I _{a} = (I _{a1} ... I _{an}) follows a binomial distribution with probability p. The mixing probability p is, in turn, assigned a beta prior distribution. The additive genetic variance is given by
In this study, the MCMC algorithms for the Bayesian methods were run for 300,000 iterations, of which the first 20,000 were discarded as burn-in to allow the chain to warm up, and one in every 10 iterations was retained (thinning). Convergence was assessed using the criterion proposed by ^{GEWEKE (1992)}.
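As a quick check of what these settings imply, the short sketch below counts the posterior samples that remain after burn-in and thinning. The function name is hypothetical, and the sketch assumes that thinning is applied to the iterations that follow the burn-in.

```python
def retained_samples(n_iter, burn_in, thin):
    """Number of MCMC draws kept after discarding burn-in and thinning."""
    return (n_iter - burn_in) // thin


# Settings reported in the text: 300,000 iterations, 20,000 burn-in, thin 10.
n_kept = retained_samples(n_iter=300_000, burn_in=20_000, thin=10)
print(n_kept)  # 28000
```

Under these assumptions, each posterior summary is based on 28,000 retained draws.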
Genomic index
The genomic index is defined as (^{RESENDE, 2015}; ^{LIMA et al., 2019}):
The index combines the genomic values predicted through GBLUP, BLASSO, or BayesCpi (â _{1}) and through the Deltap method (â _{2}), weighted by coefficients b _{1} and b _{2}, respectively. The weights b _{1} and b _{2} are given, respectively, by:
in which
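The structure of the index, a weighted sum of two sets of predicted genomic values, can be sketched as below. The formulas for the weights are not reproduced in this text, so the sketch takes b1 and b2 as inputs; the function name and the numeric values are hypothetical and serve only to show the computation.

```python
def genomic_index(a1, a2, b1, b2):
    """Combine two vectors of predicted genomic values into one index.

    a1: values from GBLUP, BLASSO or BayesCpi; a2: values from Deltap.
    b1, b2: index weights (ASSUMPTION: supplied by the caller, because the
    weight formulas are not available here).
    """
    if len(a1) != len(a2):
        raise ValueError("a1 and a2 must have the same length")
    return [b1 * x + b2 * y for x, y in zip(a1, a2)]


# Hypothetical predictions for three individuals and hypothetical weights.
index_values = genomic_index([1.0, 2.0, 0.5], [3.0, 4.0, 1.5], b1=0.5, b2=0.5)
```

Because the index is linear in both inputs, it adds essentially no computational cost once the two sets of predictions are available.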
Cross-validation and comparison between methods
The validation procedure chosen was k-fold cross-validation with k = 4. The phenotypic dataset of 352 individuals was divided into 4 groups of 88 individuals each. In each replicate of the analysis, three groups were taken as the training population and used to estimate the effects of the SNP markers. The remaining group was taken as the validation population, and its additive genomic values were predicted from the marker effects estimated in the training population. The efficiency measures were then calculated, as described below. The process was repeated so that each of the four groups served once as the validation population. At the end of the validation process, the arithmetic averages and standard deviations of the efficiency measures over the four folds were computed and reported as the general results. The efficiency measures used are described below: (i) the molecular heritability was given by
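The fold construction and two of the efficiency measures can be sketched as follows. The sketch assumes that predictive ability is the Pearson correlation between predicted genomic values and phenotypes, and that the regression coefficient is the slope of the phenotype regressed on the predicted value. The random assignment of individuals to folds and all function names are assumptions made for illustration.

```python
import random
from statistics import mean


def k_fold_indices(n, k=4, seed=0):
    """Randomly partition individuals 0..n-1 into k validation folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[f::k] for f in range(k)]


def predictive_ability(predicted, observed):
    """Pearson correlation between predicted values and phenotypes."""
    mp, mo = mean(predicted), mean(observed)
    sxy = sum((p - mp) * (o - mo) for p, o in zip(predicted, observed))
    sxx = sum((p - mp) ** 2 for p in predicted)
    syy = sum((o - mo) ** 2 for o in observed)
    return sxy / (sxx * syy) ** 0.5


def regression_coefficient(predicted, observed):
    """Slope of the phenotype regressed on the predicted genomic value."""
    mp, mo = mean(predicted), mean(observed)
    sxy = sum((p - mp) * (o - mo) for p, o in zip(predicted, observed))
    sxx = sum((p - mp) ** 2 for p in predicted)
    return sxy / sxx


folds = k_fold_indices(352, k=4)
# Each fold serves once as the validation set; the rest form the training set.
splits = [
    (sorted(i for g in folds if g is not fold for i in g), sorted(fold))
    for fold in folds
]
```

With 352 individuals and k = 4, every validation fold holds 88 individuals and every training set 264.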
Computational resource
All the computational routines of the proposed methods were implemented in R software (^{R Development Core Team, 2018}). For the GBLUP, the sommer package and the mmer function were used; for the BLASSO method and BayesCpi, the BGLR package and the BGLR function were used. Algorithms used for the development of the Deltap and Deltap/GBLUP index methods were implemented by ^{LIMA et al. (2019}) and are available at https://licaeufv.wordpress.com/pesquisasresearch/.
RESULTS AND DISCUSSION:
Table 1 shows the averages and estimated standard deviations of the molecular heritability, the predictive ability, and the regression coefficient between genomic value and phenotypic value for the Deltap and GBLUP methods, together with the predictive ability and the regression coefficient between the genomic value estimated through the Deltap/GBLUP index and the phenotypic value.
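Method  Trait  Molecular heritability  Predictive ability  Regression coefficient  Predictive ability (index)  Regression coefficient (index)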
Deltap  FLL  0.34±0.08  0.21±0.07  0.39±0.18  
FLW  0.25±0.01  0.55±0.08  1.14±0.25  
PNPP  0.13±0.02  0.69±0.07  1.93±0.31  
PPBN  0.20±0.03  0.36±0.06  0.79±0.14  
SL  0.43±0.02  0.41±0.18  0.67±0.32  
SW  0.16±0.01  0.50±0.06  1.26±0.26  
AC  0.10±0.02  0.61±0.04  1.98±0.16  
PC  0.13±0.02  0.31±0.06  0.86±0.15  
BR  0.25±0.04  0.42±0.19  0.84±0.37  
GBLUP  FLL  0.22±0.28  0.47±0.03  1.00±0.14  0.63±0.14  0.28±0.10 
FLW  0.50±0.05  0.77±0.04  1.09±0.14  0.80±0.04  0.85±0.14  
PNPP  0.66±0.03  0.83±0.03  1.02±0.09  0.83±0.03  0.92±0.10  
PPBN  0.36±0.08  0.63±0.05  1.06±0.17  0.70±0.08  0.60±0.08  
SL  0.56±0.08  0.75±0.09  1.01±0.10  0.77±0.09  0.76±0.19  
SW  0.65±0.04  0.84±0.03  1.04±0.08  0.84±0.03  0.93±0.09  
AC  0.59±0.03  0.80±0.09  1.04±0.09  0.80±0.09  0.93±0.13  
PC  0.25±0.05  0.46±0.05  0.92±0.08  0.59±0.06  0.46±0.06  
BR  0.44±0.08  0.67±0.08  1.02±0.11  0.72±0.06  0.69±0.05 
Flag leaf length (FLL); Flag leaf width (FLW); Amylose content (AC); Panicles number per plant (PNPP); Primary panicle branch number (PPBN); Seed length (SL); Seed width (SW); Protein content (PC); Blast resistance (BR).
Table 2 shows the averages and standard deviations of the molecular heritability, the predictive ability, and the regression coefficient between genomic value and phenotypic value for the Bayesian methods (BLASSO and BayesCpi), together with the predictive ability and the regression coefficient between the genomic value estimated through the indices (Deltap/BLASSO and Deltap/BayesCpi) and the phenotypic value.
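Method  Trait  Molecular heritability  Predictive ability  Regression coefficient  Predictive ability (index)  Regression coefficient (index)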
BLASSO  FLL  0.48±0.15  0.48±0.09  1.11±0.29  0.53±0.09  0.57±0.19 
FLW  0.70±0.10  0.76±0.03  1.11±0.09  0.78±0.03  0.91±0.09  
PNPP  0.77±0.02  0.81±0.04  1.01±0.11  0.82±0.04  0.91±0.11  
PPBN  0.49±0.03  0.63±0.02  1.11±0.09  0.65±0.02  0.80±0.07  
SL  0.83±0.18  0.74±0.08  1.07±0.17  0.75±0.08  0.82±0.23  
SW  0.84±0.03  0.83±0.04  1.08±0.12  0.83±0.03  0.97±0.15  
AC  0.81±0.02  0.81±0.01  1.12±0.06  0.82±0.01  1.00±0.04  
PC  0.41±0.01  0.48±0.06  1.00±0.04  0.52±0.03  0.70±0.05  
BR  0.65±0.05  0.68±0.07  1.06±0.12  0.71±0.09  0.77±0.14  
BayesCpi  FLL  0.47±0.17  0.50±0.09  1.17±0.32  0.54±0.09  0.60±0.20 
FLW  0.71±0.09  0.76±0.03  1.10±0.09  0.78±0.03  0.90±0.08  
PNPP  0.78±0.02  0.81±0.04  1.01±0.11  0.82±0.04  0.90±0.11  
PPBN  0.51±0.01  0.64±0.02  1.11±0.10  0.66±0.02  0.79±0.07  
SL  0.83±0.18  0.74±0.08  1.07±0.16  0.76±0.08  0.82±0.23  
SW  0.84±0.03  0.83±0.04  1.08±0.12  0.83±0.04  0.97±0.16  
AC  0.82±0.01  0.82±0.02  1.14±0.06  0.83±0.02  1.02±0.05  
PC  0.42±0.03  0.48±0.06  1.00±0.05  0.52±0.04  0.69±0.06  
BR  0.62±0.04  0.69±0.08  1.09±0.13  0.71±0.09  0.80±0.17 
Flag leaf length (FLL); Flag leaf width (FLW); Amylose content (AC); Panicles number per plant (PNPP); Primary panicle branch number (PPBN); Seed length (SL); Seed width (SW); Protein content (PC); Blast resistance (BR).
Predictive ability
Results showed that, for all traits, the Deltap/GBLUP, Deltap/BayesCpi, and Deltap/BLASSO indices presented predictive abilities equal to or higher than those of the GBLUP, BayesCpi, and BLASSO methods, respectively. The ratio between the predictive abilities of each index and its corresponding method shows that the Deltap/GBLUP, Deltap/BLASSO, and Deltap/BayesCpi indices were, on average, 9.7%, 3.6%, and 3.3% more efficient in genomic prediction than the traditionally applied methods GBLUP, BLASSO, and BayesCpi, respectively. A substantial advantage is that these gains in predictive ability come at no additional computational cost. Moreover, according to ^{RESENDE et al. (2015)}, gains of 5% in predictive ability and accuracy are already significant in plant breeding, often equivalent to the gain obtained in a complete cycle of genetic improvement. Thus, under genomic selection performed over short cycles, these gains are cumulative and grow rapidly. The indices therefore improved the prediction of the GEBVs, because they provided predictive abilities at least as high as those of the individual methods.
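The average gains quoted above can be checked against the mean predictive abilities in Tables 1 and 2. The sketch below assumes that the gain is the mean, over the nine traits, of the ratio between the predictive ability of the index and that of the corresponding method. Under that assumption the tabulated means give the reported 9.7%, 3.6%, and 3.3%. Variable and function names are hypothetical.

```python
from statistics import mean

# Predictive abilities in the trait order of the tables:
# FLL, FLW, PNPP, PPBN, SL, SW, AC, PC, BR.
gblup = [0.47, 0.77, 0.83, 0.63, 0.75, 0.84, 0.80, 0.46, 0.67]
deltap_gblup = [0.63, 0.80, 0.83, 0.70, 0.77, 0.84, 0.80, 0.59, 0.72]

blasso = [0.48, 0.76, 0.81, 0.63, 0.74, 0.83, 0.81, 0.48, 0.68]
deltap_blasso = [0.53, 0.78, 0.82, 0.65, 0.75, 0.83, 0.82, 0.52, 0.71]

bayescpi = [0.50, 0.76, 0.81, 0.64, 0.74, 0.83, 0.82, 0.48, 0.69]
deltap_bayescpi = [0.54, 0.78, 0.82, 0.66, 0.76, 0.83, 0.83, 0.52, 0.71]


def mean_gain_percent(index_values, method_values):
    """Average relative gain (%) of the index over the method across traits."""
    ratios = [i / m for i, m in zip(index_values, method_values)]
    return (mean(ratios) - 1) * 100


gain_gblup = mean_gain_percent(deltap_gblup, gblup)
gain_blasso = mean_gain_percent(deltap_blasso, blasso)
gain_bayescpi = mean_gain_percent(deltap_bayescpi, bayescpi)
print(round(gain_gblup, 1), round(gain_blasso, 1), round(gain_bayescpi, 1))
```

The same tabulated values show that the gain is concentrated in the traits with the lowest predictive ability (FLL and PC), while PNPP, SW, and AC under GBLUP show no change.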
For all traits, the Deltap method presented lower predictive abilities than the GBLUP, BLASSO, and BayesCpi methods, because the methods use different genetic information. According to ^{LIMA et al. (2019)}, the Deltap method uses only linkage disequilibrium information, whereas the other methods, such as GBLUP and the Bayesian methods, also use the relationship information between individuals. In addition, ^{AZEVEDO et al. (2016)} reported that when genomic prediction considers only linkage disequilibrium, the predictive ability should be less than or equal to that of genomic prediction that also considers the relationship between individuals, which agrees with the results of our study. Nevertheless, combining Deltap with GBLUP in the index appears to capture additional genetic information that benefits genomic prediction.
In addition, the GBLUP, BLASSO, and BayesCpi methods presented similar predictive abilities, in agreement with results reported in the literature (^{AZEVEDO et al., 2015}, ^{GIANOLA, 2013}, ^{DE LOS CAMPOS et al., 2012}), which point to the similarity of several methods in predicting genomic values. ^{GUO et al. (2014)}, using GBLUP, also reported similar predictive abilities for the same traits.
Regression coefficient
In GWS, the regression coefficient between the phenotype and the estimated genomic value should ideally be close to one, indicating that the predictions are unbiased. Regression coefficients below one indicate that the genomic values are overestimated, and coefficients above one indicate that they are underestimated. Thus, according to the results, the genomic values estimated by the three indices were overestimated, except for amylose content, for which the Deltap/BLASSO index obtained a regression coefficient equal to one and the Deltap/BayesCpi index underestimated the values. In addition, GBLUP obtained regression coefficients closer to one than did the Deltap method and the Deltap/GBLUP index. Likewise, the BayesCpi method exhibited values closer to one than did the Deltap/BayesCpi index. The lower regression coefficients observed for these indices may be attributable to the Deltap method because, as previously reported, this method generates regression coefficients greater than one. In turn, for flag leaf width, seed width, and amylose content, the Deltap/BLASSO index obtained regression coefficients closer to one than did the BLASSO method.
Heritabilities
The BLASSO and BayesCpi methods presented similar heritability estimates, which agrees with the results obtained by ^{AZEVEDO et al. (2015)}, who also found Bayesian methods to give similar estimates of genomic heritability. The heritabilities estimated by GBLUP were similar to those reported by ^{GUO et al. (2014)} for the same dataset. The Deltap and GBLUP methods resulted in smaller heritability values than the Bayesian methods. According to ^{XING & ZHANG (2010)} and ^{VALLURU et al. (2014)}, quantitative traits, such as those used in this study, are generally known to have low heritability and to be difficult to investigate.
Pedigree-based heritability estimates for some of the traits, such as panicles number per plant, flag leaf width, flag leaf length, and seed length, are reported in the literature (^{XU et al., 2018}; ^{SUMANTH et al., 2017}; ^{AKINWALE et al., 2011}; ^{SEYOUM et al., 2012}; ^{SINGH et al., 2011}; ^{OLADOSU et al., 2014}). However, according to ^{DE LOS CAMPOS & SORENSEN (2013)} and ^{DE LOS CAMPOS et al. (2015)}, these heritability values are always higher than the genomic or molecular heritability. This occurs because molecular heritability is the fraction of the pedigree-based heritability that is captured by the markers.
CONCLUSION:
In general, the Deltap/GBLUP index has higher predictive ability for genomic values than the traditional methods (GBLUP, BLASSO, and BayesCpi) and the Bayesian indexes, besides being easy to implement and requiring little computational cost. Conversely, the genomic indexes presented greater bias in the predictions of the individual genomic values. The results indicate that the indexes are better suited to ranking and selecting genetically superior individuals than to making exact inferences about how much they will produce when commercially planted.