Effective population size and genetic gain expected in a population of Coffea canephora

This work aimed to study the effective population size and genetic gain in a population of robusta coffee (Coffea canephora Pierre) and to verify the possibility of using recurrent selection. The experiment comprised 25 treatments, consisting of 21 C. canephora progenies and four C. arabica cultivars grown in Brazil. The experimental design was a 5x5 quadruple balanced lattice with 24 replications and one plant per plot. Six harvests were performed on each plant. Statistical analysis was carried out using the mixed model methodology. The analysis showed high additive genetic variability, and the magnitude of the additive components prevailed over that of the dominance components. These facts revealed that the population is suitable for recurrent selection, for which the expected genetic gains were high. The results suggest that the effective population size and the degree of inbreeding should be monitored throughout the recurrent selection cycles. Because of the small number of progenies, cloning with weak selection is required during the selection cycles.


INTRODUCTION
Coffee's economic and social importance is indisputable. Its cultivation generates nearly 80 billion dollars per year, second only to petroleum, and involves 125 million people worldwide in its production chain. Coffee is a universally popular drink: about 600 billion cups are consumed every year, making it the second most consumed beverage worldwide, after water (Bliska et al. 2007, ICO 2017).
The crop is cultivated in almost 80 countries in Africa, Asia, and Latin America, and comprises two commercial species, Coffea arabica Linnaeus (Arabica coffee) and Coffea canephora Pierre ex A. Froehner (Robusta coffee). The growth in production and trade of Robusta coffee is evident. Robusta has been used as feedstock for instant and ground coffee over the years. In the 1960s, the crop represented approximately 18% of the worldwide market; it currently represents 40%. The world's largest Robusta coffee growers are Vietnam, Brazil, Indonesia, Ivory Coast, and India.
C. canephora is an allogamous diploid species (2n = 2x = 22 chromosomes) with gametophytic self-incompatibility, which means that Robusta cultivars must comprise at least two different and mutually compatible genotypes. C. canephora is divided into two groups, the Guinean and the Congolese, from Western and Central Africa, respectively. The latter is the most grown worldwide. Cultivars may be made available to farmers via either sexual or asexual reproduction; vegetative propagation is the most recommended because of the phenotypic uniformity of the plants in the field, which facilitates handling and harvesting. In Brazil, C. canephora is produced in mixed plantations of clones in alternating rows (Ferrão et al. 2007).
Clonal selection of C. canephora is the major practice in Brazil, promoting high genetic gains, uniformity, and a higher frequency of favorable alleles. This process, however, reduces genetic variability. Therefore, recurrent selection is recommended to mitigate this adverse effect, since it recovers genetic variability via successive cycles of selection and recombination, increasing the frequency of favorable alleles for the trait to be improved. In this way, new superior genotypes can be obtained continuously, supplying the system and ensuring the continuity of the program in the long term (Hallauer and Miranda Filho 1981, Gallais 1989, Bernardo 2010).
The selection intensity adopted by breeders can be used as a strategy to improve the effectiveness of recurrent selection. This intensity will, a priori, depend on the needs of the program, i.e., short- or long-term results. Nevertheless, both high and low selection intensities affect the program. High selection intensity reduces the effective population size and exposes the population to genetic drift. Conversely, low selection intensity maintains a larger population, which may result in slight gains during the first cycles; however, gains will increase over time (Vencovsky et al. 2012).
In the case of perennial plants, in which generations overlap, the plants that generated the selected progenies can be used for recombination. In Robusta coffee (self-incompatible), another advantage is that emasculation is not needed, which facilitates the operational work in a recurrent selection scheme. Several recurrent selection studies have been performed with plant species. In the last few years, some studies have worked with C. canephora. The existing results for this species, although few and old, are promising. For instance, recurrent selection performed in a coffee population in Ivory Coast obtained genetic gains of up to 60% for grain yield (Leroy et al. 1994).
Statistical analyses are different for perennial and annual species. In the former, data for the same individual are recorded over several years (Cilas et al. 2011, Liu et al. 2012), which leads to correlated measurements and heterogeneous variances (Collins 2006). These issues may generate erroneous information on the efficiency of superior plant selection (Piepho et al. 2004, Pinto et al. 2013). Apiolaza et al. (2000) and Mariguele et al. (2011) have addressed these topics. Modeling the random part with an appropriate variance-covariance matrix may be important for the validity of inferences about means (Littell et al. 2006). Unbalanced data resulting from plant death and the presence of fixed and random effects within the same statistical model are other particularities of perennial plant studies (Resende 2007).
The mixed model methodology has been the most appropriate way to analyze this sort of data, since it circumvents these problems (Resende 2002). This method uses the REML/BLUP procedure at the individual level [REML: restricted maximum likelihood (Patterson and Thompson 1971); BLUP: best linear unbiased prediction (Henderson 1975)], in which REML estimates the variance components and BLUP predicts the genetic values (Resende 2002). This study aimed to estimate the expected genetic gain for production traits and the effective population size in a recurrent selection program of Robusta coffee (Coffea canephora Pierre).

MATERIAL AND METHODS
Phenotypic selection was carried out among individuals of an open-pollinated C. canephora (Robusta coffee) population at the Centro Agronómico Tropical de Investigación y Enseñanza, CATIE (Center for Tropical Agriculture Research and Education), in Turrialba, Costa Rica. From this population, 21 plants were selected, 19 belonging to the Congolese group and two to the Guinean group, from which seeds were harvested for seedling production (Table 1).
Progenies were planted in 1974, in an experiment at the Polo Regional do Nordeste Paulista (Northeastern São Paulo State Research Center), which belongs to APTA (São Paulo State Agribusiness Technology Agency, associated with the State Agriculture Department), in the municipality of Mococa, São Paulo, Brazil. The data analyzed refer to the period from 1986 to 1992, after the plants were pruned.
The experiment consisted of a 5x5 quadruple balanced lattice design, with 24 replications, containing 21 C. canephora seed progenies and four C. arabica cultivars (an autogamous, tetraploid, and self-compatible species), with one plant per plot. The spacing was 4 m between rows and 3 m between plants within each row. Six harvests were performed on each plant. Grain production (composed of a mixture of green, ripe, and dry fruits) was weighed and recorded in kg of coffee per plant.
Analyses were performed by the mixed model methodology (REML/BLUP procedure), using the Selegen-REML/BLUP software (Resende 2016), in which variance components are estimated by restricted maximum likelihood (REML) and the additive genetic value is predicted by best linear unbiased prediction (BLUP), using a genetic relationship matrix. This method was adopted because the models used have both fixed and random effects. In addition, the same individuals (plants) were evaluated in several years (repetition over time), and the results revealed a correlation between these measurements. The mixed model methodology enables efficient estimation and prediction under these circumstances, which would have been impossible to obtain by the least squares method (Resende 2002, Bueno Filho and Gilmour 2003, Piepho et al. 2003, Gilmour et al. 2004, Mariguele et al. 2011). Data analyses were performed for the C. canephora progenies.
The following statistical model was adopted: $y = Xm + Za + Wp + Sb + Ti + e$, in which: y is the phenotypic data vector, with dimension n x 1 (where n is the number of observations); m is the vector of the effects of the measurement-replication combinations plus the overall mean (fixed effect); a is the vector of individual additive genetic effects, assumed as random, where $a \sim N(0, G)$ with $G = A\sigma^2_a$ (G is the genetic variance and covariance matrix among individuals, A is the genetic relationship matrix, and $\sigma^2_a$ is the additive genetic variance among individuals); p is the vector of permanent effects, which accounts for correlated repeated measures over time, assumed as random; b is the vector of block effects, assumed as random; i is the vector of genotype x harvest interaction effects (random effect); and e is the vector of errors (random effect). X, Z, W, S, and T are the incidence matrices for m, a, p, b, and i, respectively.
The variance components of the model are: $\sigma^2_e$, the residual variance; $\sigma^2_a$, the additive genetic variance; $\sigma^2_p$, the individual permanent variance; $\sigma^2_b$, the block variance within replication; and $\sigma^2_i$, the variance of the genotype x harvest interaction.
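For reference, the BLUP solutions of a model of this form are usually obtained from Henderson's mixed model equations. The block below is a sketch of those equations written with the symbols defined above; it assumes (this is not stated explicitly in the text) that p, b, i, and e are mutually independent with variances $I\sigma^2_p$, $I\sigma^2_b$, $I\sigma^2_i$, and $I\sigma^2_e$, and the variance ratios $\lambda$ are notation introduced here for illustration only.

```latex
% Requires amsmath. Sketch of Henderson's mixed model equations for
% y = Xm + Za + Wp + Sb + Ti + e, assuming independent random effects
% with Var(a) = A\sigma^2_a and identity structures for p, b, i, and e.
\begin{equation*}
\begin{bmatrix}
X'X & X'Z & X'W & X'S & X'T \\
Z'X & Z'Z + A^{-1}\lambda_a & Z'W & Z'S & Z'T \\
W'X & W'Z & W'W + I\lambda_p & W'S & W'T \\
S'X & S'Z & S'W & S'S + I\lambda_b & S'T \\
T'X & T'Z & T'W & T'S & T'T + I\lambda_i
\end{bmatrix}
\begin{bmatrix} \hat{m} \\ \hat{a} \\ \hat{p} \\ \hat{b} \\ \hat{i} \end{bmatrix}
=
\begin{bmatrix} X'y \\ Z'y \\ W'y \\ S'y \\ T'y \end{bmatrix},
\qquad
\lambda_a = \frac{\sigma^2_e}{\sigma^2_a},\quad
\lambda_p = \frac{\sigma^2_e}{\sigma^2_p},\quad
\lambda_b = \frac{\sigma^2_e}{\sigma^2_b},\quad
\lambda_i = \frac{\sigma^2_e}{\sigma^2_i}.
\end{equation*}
```

In practice the variance components entering the ratios are replaced by their REML estimates, which is what the REML/BLUP designation refers to.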
The Arabica coffee cultivars were not considered in the statistical analyses. The effective population size was estimated according to Vencovsky et al. (2012).
JC Mistro et al.

RESULTS AND DISCUSSION
The likelihood ratio test (LRT) showed highly significant effects for progenies (p ≤ 0.01) via the additive genetic variance ($V_a$) values. These results indicate genetic variability in the target population, which opens the possibility of improving this material and consequently developing promising cultivars. Moreover, they allow the use of recurrent selection in these plants, which will only be successful if significant additive effects occur in the genetic material. Genetic variability was initially expected to be compromised by the small number of progenies, but this was not observed in this study.
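The LRT mentioned above compares the deviance of the model fitted without a given random effect with that of the full model and refers the difference to a chi-square distribution with 1 degree of freedom. A minimal sketch follows; the function names and the deviance values are hypothetical and are not taken from the experiment.

```python
import math


def lrt_one_df(deviance_reduced, deviance_full):
    """Return the LRT statistic and its p-value for 1 degree of freedom.

    deviance_reduced: deviance of the model without the tested effect.
    deviance_full: deviance of the complete model.
    """
    lrt = deviance_reduced - deviance_full
    # For a chi-square variable with 1 df, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(max(lrt, 0.0) / 2.0))
    return lrt, p_value


def significance_flag(p_value):
    """Flag significance with the usual * (5%) and ** (1%) convention."""
    if p_value <= 0.01:
        return "**"
    if p_value <= 0.05:
        return "*"
    return "ns"


# Hypothetical deviances, for illustration only.
lrt, p = lrt_one_df(deviance_reduced=10512.4, deviance_full=10498.1)
print(f"LRT = {lrt:.2f}, p = {p:.5f} {significance_flag(p)}")
```

The closed-form p-value avoids any dependency outside the standard library; it holds only for 1 degree of freedom.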
Several factors probably make C. canephora an advantageous species for breeding programs, including its wide distribution within its center of origin, which promotes the diversity needed to respond to different environmental conditions, as well as its allogamy and genetic self-incompatibility. These factors generate highly heterozygous individuals with high potential for diversity. Some authors have reported the wide and advantageous genetic variability of Robusta (Conagin and Mendes 1961, Charrier and Berthaud 1988, Ferrão et al. 2008). The literature also shows plant breeding projects in which base populations were composed of a small number of progenies without compromising their genetic variability. This is the case in the studies of Gouvea et al. (2013) and Oliveira et al. (2014), who studied initial populations consisting of 22 and 30 progenies of rubber tree, respectively, and of Fogaça et al. (2012), in which the base population consisted of 24 progenies of daylily (Hemerocallis x hybrida Hort.). It should be remembered that one of the factors limiting the number of progenies in experiments with perennial crops is the cost of maintaining the population year-round, because these crops occupy larger areas than annual crops such as corn (Table 2).
Resende (2007) discussed some determination coefficients that assist in the interpretation and discussion of results from perennial species experiments. Among them, the determination coefficients of block effects ($c^2_{blocks}$) and of permanent effects of the individual within the progeny ($c^2_{perm}$) stand out. The first coefficient indicates the environmental heterogeneity among blocks. According to Resende (2002), the ideal situation for perennial plant breeding is when block effects are not significant and $c^2_{blocks}$ is lower than 10%. In the present experiment, the significance of the block effect and the low values of $c^2_{blocks}$ (between 2 and 5%) indicated the effectiveness of the experimental design. The $c^2_{perm}$ coefficient captures permanent environmental variation from one year to another, i.e., the environmental correlation among observations within plots over time. The analysis of deviance confirmed the significant variance of permanent environment effects throughout the six harvests, with a $c^2_{perm}$ close to 0.30. This result shows that measurements on a single plant were phenotypically correlated over time. Thus, data analyses must be performed by mixed models, since they model this source of variation (Littell et al. 2006). The genotype x harvest interaction was significant ($c^2_{gm}$ = 12%); however, the genetic correlation across harvests was high (approximately 0.75). This result reveals good genetic consistency of production over time. Some studies using C. canephora reported coefficients of experimental variation ($CV_e$) higher than 50% for production. For instance, Leroy et al. (1997) recorded $CV_e$ ranging from 30 to 105%, and Cilas et al. (2011) observed $CV_e$ varying from 32.02 to 218.40%. The estimated $CV_e$ for the present experiment was 67.56%. However, this value will not hinder the selection process, since the genetic variation among progenies was significant at 1% probability.
In breeding programs of perennial species, both seed and vegetative propagation are viable. In the specific case of C. canephora, these two possibilities should be conducted concurrently whenever possible. The selection and recombination of superior plants in successive selection cycles, aiming to increase the frequency of favorable alleles, should supply new genotypes to the system, from which promising individuals are constantly extracted for the development of clonal cultivars. The adoption of a recurrent selection scheme maintains genetic variability at a suitable level, improves the population mean, and provides a sustainable program in the medium and long terms.
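The determination coefficients discussed in this section can be read as the share of the phenotypic variance attributable to each source of variation. The sketch below assumes that the phenotypic variance is the sum of the five variance components listed in the Table 2 footnote; the numerical values are hypothetical, chosen only to resemble the magnitudes mentioned in the text, and are not the estimates of the experiment.

```python
def determination_coefficients(variance_components):
    """Express each variance component as a fraction of their sum.

    Assumption: the phenotypic variance equals the sum of all components.
    """
    phenotypic_variance = sum(variance_components.values())
    return {
        name: value / phenotypic_variance
        for name, value in variance_components.items()
    }


# Hypothetical variance components (not the values of Table 2).
components = {
    "additive": 9.0,
    "blocks": 1.0,
    "genotype_x_measurement": 3.0,
    "permanent": 7.5,
    "residual": 4.5,
}

coefficients = determination_coefficients(components)
for name, coefficient in coefficients.items():
    print(f"{name:<24s} {100 * coefficient:5.1f}%")
```

Under this reading, a blocks coefficient below 10% and a permanent-effects coefficient near 0.30 correspond to the situations described above.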
Initially, the additive components (a) were higher than the corresponding dominance components, indicating that the investigated population has the potential for a recurrent selection program (Table 3).
The plant classification, based on the $\hat{a}$ value, revealed that progenies 1, 2, 3, 5, 7, 8, 9, 11, 13, 14, 16, and 21 each had at least one individual among the top 25 plants. Moreover, progeny 14 contributed the most (five individuals), which were within the first five positions. The mean phenotypic value ($\hat{f}$) of these 25 plants was 20.42 kg of harvested production, and the corresponding mean additive component ($\hat{a}$) was 8.62 kg. Higher values of $\hat{f}$ do not always mean better rankings, as shown by the $\hat{f}$ and $\hat{a}$ values of plants 129, 83, 72, and 128, among others. The overall mean of the dominance components ($\bar{d}$) was 1.71, i.e., lower than the corresponding mean of the additive components ($\bar{a}$), which was 3.18. This result favors recurrent selection, as already mentioned, but it does not exclude the possibility of selecting superior heterozygous plants for cloning.
selection cycle. For both sexes, this selection is expected to promote a genetic gain of 116.59% if only 3% of the population is selected. In this case, the improved mean expected ($\bar{X}_m$) after one selection cycle would be 17.89 kg, which is more than twice that of the initial population. Leroy et al. (1997) obtained similar results for grain yield and observed a genetic gain of 65% when applying strong selection intensity (5%) to one sex only.
The expected gains originating only from the female parent are half of the gains for both sexes. Individual selection at high intensities can achieve rapid progress within a short period; however, it is a risky strategy in a breeding program, since the reduction of the effective population size can lead to inbreeding and loss of genetic vigor (Vencovsky and Barriga 1992). For the 15 selected individuals, the effective population size at the individual level ($N_e$) was 10.23, i.e., they are equivalent to approximately 10 unrelated individuals. The effective number (10) is smaller than the physical number (15) because several of these individuals belong to the same progeny. For instance, progeny 14 contributed five plants, progeny 3 contributed three plants, and the other progenies contributed one. Considering the top 15 individuals, the effective family size ($N_{ef}$ = 5.49) also differs from the physical number of progenies (8.00), for the same reason. If the "top 14" were selected, the $N_{ef}$ value would be higher than when selecting the "top 15", because progeny 14, which already appeared in previous positions, would be excluded.
If the breeder wants to use a higher $N_e$ to decrease the risk of inbreeding, the best 50 plants may be chosen (10% intensity), resulting in $N_e$ = 27.16 and F = 1.84%. In this case, the improved mean would be reduced to 15.45 kg, which is still higher than the original mean (8.26 kg); that is, a higher $N_e$ would still generate an expected gain of 87.05%. When using 150 plants, the selection gain would drop to 58.47%, with a mean of 13.09 kg and an F coefficient close to 1%. Maintaining high effective sizes in coffee selection programs is very difficult because of the large experimental areas required. The breeder should also consider the choice of recombination strategy, a decision that can maximize the expected genetic gain of populations under recurrent selection. As previously stated, one of the alternatives is to prune the non-selected plants, leaving only the selected ones. In this case, the genetic gain will reach the values shown in Table 2. Otherwise, this gain might be reduced by half, since selection would be performed only on the female parent. The advantage of the latter procedure is the time saved, since recombination would already have occurred by the following year. Conversely, when only field-selected plants are used, recombination occurs two or three years after the plants are pruned.
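The percentage gains quoted above are consistent with expressing the difference between the improved mean and the original mean (8.26 kg) as a percentage of the original mean. The sketch below reproduces that arithmetic for the three selection scenarios mentioned in the text (15, 50, and 150 plants); the function and variable names are illustrative only.

```python
def expected_gain_percent(improved_mean, original_mean):
    """Expected genetic gain as a percentage of the original mean."""
    return 100.0 * (improved_mean - original_mean) / original_mean


ORIGINAL_MEAN_KG = 8.26  # original population mean reported in the text

# Number of selected plants -> improved mean (kg), as reported in the text.
improved_means_kg = {15: 17.89, 50: 15.45, 150: 13.09}

gains = {
    n_selected: expected_gain_percent(mean_kg, ORIGINAL_MEAN_KG)
    for n_selected, mean_kg in improved_means_kg.items()
}

for n_selected, gain in gains.items():
    # Selection on the female parent only is expected to halve the gain.
    print(
        f"{n_selected:>3d} plants: both sexes {gain:6.2f}%, "
        f"female parent only {gain / 2:6.2f}%"
    )
```

The trade-off is visible directly: moving from 15 to 50 selected plants gives up part of the gain in exchange for a larger effective population size.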
For asexual selection, $N_e$ and F are known to be less important than they are for sexual selection. In a recurrent selection program, in which new populations are formed after each selection cycle, $N_e$ and F directly affect the magnitude of the additive variance that remains in subsequent generations. In recurrent selection, both the genetic gain and the effective population size must be observed. A very intense selection, i.e., high genetic gains obtained from a low number of individuals for the next generation, may impair further gains. Thus, a weaker selection with lower initial gains is recommended. In this case, the newly improved population will be composed of a larger number of individuals, resulting in less inbreeding and a higher remaining additive variance. This approach ensures the indirect supply and sustainability of long-term breeding programs with clonal cultivars. In this study, the selection of at least 50 individuals would be advisable to obtain an effective population size and an inbreeding coefficient close to 25 and 2%, respectively. If the program can handle a higher number of individuals, its sustainability will be further ensured.
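The pairs of values reported in the text ($N_e$ = 27.16 with F = 1.84%, and $N_e$ close to 25 with F close to 2%) are consistent with the classical approximation F = 1/(2$N_e$) for one generation. Whether this is exactly the expression applied from Vencovsky et al. (2012) is an assumption here; the sketch below only illustrates the relationship, with hypothetical function names.

```python
def inbreeding_from_ne(effective_size):
    """Approximate inbreeding coefficient after one generation: F = 1 / (2 Ne)."""
    if effective_size <= 0:
        raise ValueError("effective population size must be positive")
    return 1.0 / (2.0 * effective_size)


def ne_for_target_inbreeding(target_f):
    """Effective size needed to keep F at or below target_f (same assumption)."""
    if target_f <= 0:
        raise ValueError("target inbreeding coefficient must be positive")
    return 1.0 / (2.0 * target_f)


# Effective sizes reported in the text for 15 and 50 selected plants.
for effective_size in (10.23, 27.16):
    f_percent = 100.0 * inbreeding_from_ne(effective_size)
    print(f"Ne = {effective_size:5.2f} -> F = {f_percent:.2f}%")

print(f"Ne needed for F = 2%: {ne_for_target_inbreeding(0.02):.1f}")
```

Read this way, halving the acceptable inbreeding coefficient requires doubling the effective population size, which is why weaker selection is favored for long-term programs.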
Despite the relatively low number of progenies, high additive genetic variance and additive component magnitude were detected, which are essential conditions for successful recurrent selection. The effective population size and inbreeding degree over cycles must be monitored, mainly because of the small size of the initial population. Selecting more individuals with slight genetic gains over time will promote more sustainability in Coffea canephora breeding programs.

Table 1. Origin of the progenies of Coffea canephora selected in Turrialba, Costa Rica, evaluated in the experiment installed in Mococa, SP
Treatments 22 to 25 refer to the four cultivars of Arabica coffee (T22 = Catuaí IAC 144, T23 = Catuaí IAC 44, T24 = Mundo Novo IAC 388, and T25 = Mundo Novo IAC 374).

Table 2. Analyses of deviances (ANADEV), likelihood ratio test (LRT) results, variance components, and determination coefficients of the joint analysis of the harvests of C. canephora progenies
1 +: deviance of the adjusted model without the corresponding effects.
2 LRT: likelihood ratio test, with a $\chi^2$ distribution with 1 degree of freedom; * and ** indicate $\chi^2$ significance at 5% and 1%, respectively.
3 $V_a$: additive genetic variance; $V_{blocks}$: environmental variance between blocks; $V_{gm}$: variance of the genotype x measurement interaction; $V_{perm}$: variance of permanent effects; $V_e$: residual variance.
4 $c^2_{blocks}$: determination coefficient of block effects; $c^2_{gm}$: determination coefficient of the genotype x measurement interaction; $c^2_{perm}$: determination coefficient of permanent effects; $c^2_{res}$: determination coefficient of residual effects.

Table 3. Plant classification according to additive genetic component estimates (a) within the selected plants of the C. canephora progeny experiment.
f: phenotypic value; a: additive genetic value; u + a: additive genetic value plus overall mean; $\bar{X}_m$: improved mean expected for the selected plants; Gs: genetic gain expected for the selection; $N_e$: effective population size at the individual level; $N_{ef}$: effective family size at the selected progeny level; F: inbreeding coefficient of the selected plants.