Caffeine inheritance in interspecific hybrids of Coffea arabica x Coffea canephora (Gentianales, Rubiaceae)

Caffeine inheritance was investigated in F2 and BC1F1 generations between Coffea arabica var. Bourbon Vermelho (BV) and Coffea canephora var. Robusta 4x (R4x). The caffeine content of seeds and leaves was determined during 2004 and 2005. Microsatellite loci-markers were used to deduce the meiotic pattern of chromosome pairing of tetraploid interspecific hybrids. Genetic analysis indicated that caffeine content in seeds was quantitatively inherited and controlled by genes with additive effects. The estimates of broad-sense heritability (BSH) of caffeine content in seeds were high for both generations. In coffee leaves, the caffeine content from the same populations showed transgressive segregants with enhanced levels and high BSH. Segregation of loci-markers in BC1F1 populations showed that the gamete genotype ratios did not differ significantly from those expected assuming random associations and tetrasomic inheritance. The results confirm the existence of distinct mechanisms controlling the caffeine content in seeds and leaves, the gene exchange between the C. arabica BV and C. canephora R4x genomes and favorable conditions for improving caffeine content in this coffee population.


Introduction
The stimulant effect of coffee that makes it one of the most popular beverages in the world is due to the presence of caffeine (1,3,7-trimethylxanthine), an alkaloid synthesized from purines and found in the seeds. However, in spite of its importance, the literature on caffeine inheritance in coffee is scarce and inconclusive. Some genetic studies with coffee cultivars have revealed that the caffeine content of the seed is genotypically defined in a quantitative and polygenic manner and is only slightly influenced by exogenous factors (Charrier and Berthaud, 1975; Ravohitrarivo, 1985; Le Pierres, 1988). The mean caffeine content in Coffea hybrids is close to the arithmetical mean of the parents. Moreover, work with intra- and inter-specific hybrids has shown that variability in the progeny depends on the degree of parental heterozygosity, which is more evident in out-breeding species. For example, Charrier and Berthaud (1975) reported that in a collection of cultivated Coffea trees the caffeine level was higher in Coffea canephora, an allogamous species, than in the autogamous species Coffea arabica. On average, the dry weight caffeine content of seeds from C. canephora varieties is 2.5%, more than twice as high as the 1.2% found in C. arabica varieties. Therefore, breeding programs with C. canephora include intra- and inter-specific hybridization and selection of suitable genotypes with the aim of lowering its caffeine content (Leroy et al., 1993).
Transfer of desirable genes from C. canephora (2n = 2x = 22, allogamous) to C. arabica (2n = 4x = 44, autogamous) varieties through wide crosses is one of the breeding strategies used for coffee improvement. However, the ability to transfer useful traits from one species to a related species by conventional methods depends on the crossability between the two species. Although C. arabica has a diploid-like meiotic behavior (Krug and Mendes, 1940; Lashermes et al., 2000), tetraploid interspecific hybrids resulting from the hybridization between C. arabica and an auto-tetraploid C. canephora obtained following colchicine treatment have been particularly favorable to intergenomic recombination and gene introgressions. Segregation analyses of restriction fragment length polymorphism (RFLP) loci-markers have indicated tetrasomic inheritance resulting from the pairing of homologous chromosomes in meiosis (Lashermes et al., 2000). In two BC1F1 populations, (C. arabica x C. canephora 4x) x C. arabica, segregations and co-segregation of RFLP and microsatellite loci-markers conformed to the expected ratio assuming random chromosome segregation and the absence of selection (Herrera et al., 2002).
Seeds of these hybrids have a considerable range of caffeine content (Mazzafera et al., 1992; Mazzafera and Carvalho, 1992; Bertrand et al., 2003), suggesting that this trait is under polygenic control. Montagnon et al. (1998) used intraspecific hybrids of C. canephora to evaluate genetic parameters related to several biochemical compounds, and found that, while additive genetic effects were preponderant for most of the characters, narrow-sense heritability was high (0.80) for caffeine content. The low caffeine level (0.6-0.7%) found in the seeds of C. arabica var. Laurina, a spontaneous mutation of C. arabica var. Bourbon, has been suggested to be a qualitative trait controlled by a recessive pair of lr lr alleles, with a strong pleiotropic effect on morphological characteristics (Carvalho et al., 1965). Barre et al. (1998) investigated caffeine inheritance in the first and second generations of an interspecific cross between Coffea liberica var. Dewevrei and the caffeine-free wild species Coffea pseudozanguebariae and proposed that the absence of caffeine was controlled by cc alleles. Interestingly, the caffeine content of coffee seeds and leaves seems to be independently controlled (Mazzafera and Magalhães, 1991), with, in general, the caffeine content of leaves being lower than that of seeds even in coffee species with duplicated chromosome numbers (Silvarolla et al., 1999).
In this paper we describe the genetic variability and the inheritance pattern of the caffeine content of the seeds and leaves of hybrids between C. arabica and C. canephora 4x. Our results are also discussed in relation to the possibility of gene exchange between the homologous genomes in interspecific hybrids.

Plant material
The F1 tetraploid hybrid between Coffea arabica L. var. Bourbon Vermelho (BV, parent 1) and Coffea canephora var. Robusta 4x (R4x, parent 2), an artificial tetraploid obtained by Mendes (1947), has been advanced to F2 since 1996 by selfing three F1 clones from the same plant. Some F1 flowers were also backcrossed to the BV parent to develop the backcross generation BC1F1, (BV x R4x) x BV.
In the investigation described in this paper we used leaves and seeds collected from F2 and BC1F1 plants in 2004 and 2005. All the segregating populations were grown in a field trial at a site near the municipality of Mococa (latitude 21°28' S, longitude 47°01' W and altitude 665 m) in the Brazilian state of São Paulo and received inorganic fertilizer, weed and pest control and all other treatments recommended for growing coffee under Brazilian conditions (Thomaziello et al., 1996).

Sample preparation and caffeine extraction
During 2004 and 2005 the total number of plants analyzed was 150 F2 plus 88 BC1F1 for the caffeine content of their leaves and 71 F2 plus 24 BC1F1 for the caffeine content of their seeds. In January and July of each year we collected, respectively, leaves (third and fourth leaf pairs from different sides of the tree canopy) and red cherries (mature fruits) from F2 and BC1F1 plants. The fruits were manually processed to remove the seeds, which were dried at 70 °C for two weeks and then finely ground with a blade grinder. Caffeine was extracted by adding 5 mL of 80% methanol to 100 mg samples of the ground material contained in test tubes, the mixture being maintained for 60 min in a water bath at 70 °C with occasional agitation. The same procedure was adopted for the leaves, except that a pestle and mortar was used to grind the leaves. The extracts were centrifuged and the supernatant was analyzed by reversed-phase high performance liquid chromatography (Shimadzu HPLC system), using a C18 column (Supelco, 5 μm, 4 mm x 250 mm) with 0.5% (v/v) acetic acid in 50% (v/v) aqueous methanol as solvent, at a flow rate of 0.8 mL min⁻¹. Caffeine was detected with a diode array detector operating at 280 nm and quantification as a percentage of sample dry mass was carried out by comparing the sample data with known amounts of pure caffeine (Sigma, St Louis, USA). Every sample was analyzed twice and when the values differed by more than 5% a third analysis was made, hence each data point represents the mean of at least two determinations.
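The replicate rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not say what the 5% difference is measured against, so the sketch assumes it is relative to the mean of the two determinations. The function names and the numeric values are hypothetical.

```python
def needs_third_analysis(first, second, tolerance=0.05):
    """Return True when two determinations differ by more than `tolerance`.

    Assumption (not stated in the text): the difference is expressed
    relative to the mean of the two determinations.
    """
    mean = (first + second) / 2
    if mean == 0:
        raise ValueError("mean of the two determinations is zero")
    return abs(first - second) / mean > tolerance


def sample_mean(determinations):
    """Mean of all determinations available for one sample (two or three)."""
    return sum(determinations) / len(determinations)


# Hypothetical caffeine contents (% dry mass), for illustration only.
runs = [1.00, 1.10]
if needs_third_analysis(*runs):
    runs.append(1.04)  # hypothetical third determination
data_point = sample_mean(runs)
```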

Microsatellite marker assay
In January 2005 we collected approximately 10 g of young leaves from 73 BC1F1 plants and isolated total genomic DNA from freeze-dried leaves as described by Ky et al. (2000). Eight microsatellites (32-2CTG, C2-2CTC, E8-3-CTG, M20, M27, M32, EST-SSR1, EST-SSR2) that had shown clear polymorphisms between BV and R4x in a preliminary screening (data not shown) were selected for use in this study. Six of these microsatellites, or simple sequence repeats (SSR), were identified in DNA clones derived from genomic libraries and obtained from Combes et al. (2000) and Rovelli et al. (2000). Two expressed sequence tag (EST) microsatellites were developed from cluster consensus sequences derived from the Coffee Genome Project (Vieira et al., 2006) using the TROLL software (Castelo et al., 2002). Every forward primer was labeled with either a blue (FAM) or a green (JOE) fluorescent tag (Invitrogen, São Paulo, Brazil). The EST-SSR sequences used were EST-SSR1 (forward 5'GAATACATCACTCCAGAGACG3', reverse 5'CCTTAGCCAACTCCTGAAC3') and EST-SSR2 (forward 5'CATAGCAACTTCAAACACGC3', reverse 5'TCGACTATGAGAAGCTGAAGG3'). Polymerase chain reactions (PCR) were performed in a 15 μL final reaction volume containing 60 ng DNA, 0.2 μM of each forward and reverse primer, 200 μM of each dNTP, 2.0 mM MgCl2, 10 mM Tris-HCl, 50 mM KCl, and 1.5 units of Taq DNA polymerase (Invitrogen, Brazil). Reactions were amplified on a PTC-200 thermocycler (MJ Research, USA) as follows: 95 °C for 5 min, followed by 30 cycles consisting of heating at 95 °C for 1 min, annealing for 1 min at a temperature specific for each primer (55 °C to 60 °C) and extension at 72 °C for 1 min, with a final elongation step at 72 °C for 10 min. Amplification products were separated on 5% (w/v) polyacrylamide gel (PAGE) in an ABI 377 DNA automatic sequencer (Applied Biosystems, USA). The data were collected automatically based on the differential fluorescence of the products and analyzed with the GeneScan/Genotyper programs (Applied Biosystems, USA). The amplification products representing the microsatellite loci were identified and interpreted as specific markers of either C. arabica or C. canephora by comparison with the amplification products produced by the parental accessions (Table 1).

Statistical analysis
To analyze the effect of years, the frequency and normal distributions of the caffeine content in seeds and leaves within each segregating population were compared using the Kolmogorov-Smirnov two-sample test (Snedecor and Cochran, 1989). Data for each trait were analyzed using the pooled means of both years (2004 and 2005). The caffeine content of seeds and leaves was plotted as frequency histograms with a class interval (i) according to the formula i = A/k, where A is the amplitude of the variation between the maximum and minimum value observed in the dataset and k is the square root of the number of observations per dataset. The Shapiro-Wilk test was used to test for normality in the distributions.
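As a minimal sketch of the class-interval formula i = A/k, the following Python function computes the interval width for a list of observations. The function name and the sample values are hypothetical, and k is left unrounded because the text does not say whether the square root was rounded to a whole number of classes.

```python
import math


def class_interval(values):
    """Return (i, k) where i = A / k is the histogram class interval.

    A is the amplitude (maximum minus minimum) and k is the square root
    of the number of observations, following the formula in the text.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    amplitude = max(values) - min(values)  # A
    k = math.sqrt(n)                       # k, left unrounded (assumption)
    return amplitude / k, k


# Hypothetical caffeine contents (% dry mass), for illustration only.
sample = [1.0, 1.2, 1.5, 1.7, 1.9, 2.1, 2.4, 2.6, 3.0]
width, k = class_interval(sample)
```

With nine observations spanning 1.0 to 3.0, this gives k = 3 and a class width of about 0.67 percentage points.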
The mean differences between generations were compared using the F-test with unequal variance as implemented in the PROC GLM routine of the SAS 8.0 program (SAS Institute, USA).Comparisons between means were carried out using the Tukey test at p = 0.05.
Broad-sense heritability (BSH) of each F2 population was estimated as BSH_F2 = (V_F2 - V_E)/V_F2 and of each BC1F1 population as BSH_BC1F1 = (V_BC1F1 - V_E)/V_BC1F1, where V_F2, V_BC1F1 and V_E are the F2, BC1F1 and environmental variances, respectively (Falconer, 1960). The environmental variance (V_E) was estimated as V_E = (V_P1 + V_F1)/2, where V_P1 and V_F1 are the variances of the parent and the F1 hybrid, respectively (Falconer, 1960). The BSH error was estimated according to Vello and Vencovsky (1974).
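The two heritability formulas can be illustrated with a short Python sketch. It assumes sample variances (n - 1 denominator), which the text does not specify. The function names and data are hypothetical and do not reproduce the study's measurements.

```python
from statistics import variance  # sample variance, n - 1 denominator (assumption)


def environmental_variance(parent_values, f1_values):
    """V_E = (V_P1 + V_F1) / 2, from the non-segregating generations."""
    return (variance(parent_values) + variance(f1_values)) / 2


def broad_sense_heritability(segregating_values, v_e):
    """BSH = (V_seg - V_E) / V_seg for an F2 or BC1F1 population."""
    v_seg = variance(segregating_values)
    return (v_seg - v_e) / v_seg


# Hypothetical caffeine contents (% dry mass), for illustration only.
p1 = [1.0, 1.2]
f1 = [2.0, 2.2]
f2 = [1.0, 2.0, 3.0]

v_e = environmental_variance(p1, f1)
bsh_f2 = broad_sense_heritability(f2, v_e)
```

The same `broad_sense_heritability` call applies to a BC1F1 sample, since both formulas differ only in which segregating variance is used.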
Statistical analysis of the microsatellite data in the BC1F1 progeny compared observed genotypic frequencies to the expected frequencies predicted for tetrasomic inheritance models. The chi-squared test (χ²) was used to check the goodness-of-fit between the observed and expected frequencies.
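A goodness-of-fit test of this kind can be sketched in plain Python. The example assumes two genotype classes tested against the 1/4 : 3/4 ratio reported in the Results, so the test has one degree of freedom. The counts are hypothetical, and the p-value function is valid only for one degree of freedom.

```python
import math


def chi_square_gof(observed, expected_ratios):
    """Return (chi2, expected counts) for observed counts vs. expected ratios."""
    total = sum(observed)
    ratio_sum = sum(expected_ratios)
    expected = [total * r / ratio_sum for r in expected_ratios]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, expected


def p_value_df1(chi2):
    """Upper-tail p-value of the chi-squared distribution with 1 degree of freedom.

    For 1 df, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    """
    return math.erfc(math.sqrt(chi2 / 2))


# Hypothetical counts for one locus: BV-like profile vs. other profiles.
observed = [20, 53]
chi2, expected = chi_square_gof(observed, [1, 3])
p = p_value_df1(chi2)
fits_expected_ratio = p > 0.05  # no significant deviation at the 5% level
```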

Results
In both years the Kolmogorov-Smirnov two-sample test detected no heterogeneity in the frequency distributions of caffeine content for each segregating population (Figure 1). However, there was a greater tendency for the leaf caffeine content means to be homogeneous than there was for the seed caffeine content means, probably because the leaves were sampled from more plants (150 F2 and 88 BC1F1 plants for each year) than the seeds (71 F2 and 24 BC1F1 plants for each year). The absence of a detectable year effect allowed the 2004 and 2005 results to be pooled.
The distributions of the caffeine means for 2004 and 2005 are shown in Figure 2. Caffeine content in seeds of all segregating populations was distributed in a continuous manner and could not be classified into discrete classes. Segregation followed a normal distribution in the F2 (Shapiro-Wilk test, p = 0.3318) but not in the BC1F1 (Shapiro-Wilk test, p = 0.0395). No transgressive segregation for low or high caffeine content was observed in any of the F2 and BC1F1 populations (Figure 2A-B). The data for caffeine content in leaves for the F2 and BC1F1 populations were also distributed continuously and could not be classified into discrete high or low caffeine content classes. However, neither population followed a normal distribution (p = 0.0021 for F2 and p = 0.0325 for BC1F1) and transgressive segregants with enhanced levels of caffeine content were observed (Figure 2C-D).
Significant differences occurred between the generation means for the caffeine content of seeds but not for leaves, the mean seed caffeine content being high in the R4x parent, low in the BV parent and intermediate in the F2 and BC1F1 generations (Table 2). However, while the two parents were significantly different (p = 0.05) from each other and from the F2 and BC1F1 generations in regard to mean seed caffeine content, the F2 and BC1F1 generations were not significantly different from each other (Table 2). For leaf caffeine content there were no statistically significant differences between any of the generations (Table 2).
The F2 broad-sense heritability (BSH) was 0.91 for seeds and 0.83 for leaves, while the BC1F1 BSH was lower at 0.81 for seeds and 0.79 for leaves. The error associated with the BSH estimate was low, 0.12 for seeds and 0.11 for leaves.
The pattern of allele segregation in the BV x R4x hybrid was inferred from the analysis of the backcross progeny for eight microsatellite loci (Table 3). For all loci except M27 we identified two types of BC1F1 genotypes, one with PAGE gel segregation patterns similar to the three bands seen in the F1 hybrid and another similar to the two bands occurring in the BV parent (see Table 1). The observed frequencies of each genotype were compared with the expected frequencies (Table 3). The genotype ratio tested was 1/4 of the BV profile and 3/4 of other profiles (other banding patterns in the tetraploid BC1F1 progeny), assuming random chromosome segregation, absence of preferential chromosome pairing and tetrasomic inheritance. Of the eight loci tested, only three (C2-2CTC, M32 and M27) exhibited a significant deviation (p < 0.05) between observed and expected values.


Discussion
Both major and minor genes with additive gene effects appear to be involved in the variation of caffeine content in the seeds and leaves of the segregating populations studied, as shown by the data presented in Figure 2. The inability to classify caffeine content into discrete classes of high and low levels in the segregating populations studied is an indication that caffeine content in seeds and leaves is a quantitative trait.
The parent plants we used to obtain the F2 and BC1F1 populations had contrasting seed caffeine contents (BV = 1.07%, R4x = 3.5%), which generated enough phenotypic variation to study the inheritance of caffeine in coffee. The seeds of the BV plants have been reported to contain 1.15% caffeine (Mazzafera et al., 1992), similar to the value we determined, but the caffeine content of the tetraploid R4x has not been previously reported and was found here to be higher (3.5%) than in diploid plants of C. canephora. In fact, it is known that there is considerable variation in the caffeine content of C. canephora seeds, ranging from the 0.81% to 3.26% reported by Mazzafera et al. (1997) to the 1.16% to 3.27% reported in the earlier study by Charrier and Berthaud (1975). In seeds, intraspecific studies with C. arabica (Charrier and Berthaud, 1975), C. canephora (Ravohitrarivo, 1985; Le Pierres, 1988) and interspecific hybrids from several Coffea species (Mazzafera et al., 1992) have also presented a large range of distribution in the descendants, supporting the quantitative nature of this inheritance. Barre et al. (1998) studied caffeine inheritance in interspecific hybrids from a species with (Coffea liberica) and without (C. pseudozanguebariae) caffeine in the seeds and also came to the conclusion that the caffeine content appeared to be under polygenic control with a strong genetic effect. Caffeine content has often been described as an additive trait in intra- and inter-specific hybrids involving coffee species with caffeine, irrespective of the ploidy level (Capot, 1972; Charrier and Berthaud, 1975; Le Pierres, 1988).
The inheritance of caffeine in leaves was different from that in seeds and showed intermediate values as compared to the parents, although there were no statistically significant differences between the different generations (Table 2). However, the highest caffeine content in leaves occurred in the BV parent and not in the R4x parent, as was the case for seed caffeine content. Previous studies have shown similar values for caffeine in the leaves of C. arabica used as P1 (i.e., 1%: Mazzafera and Magalhães, 1991) and C. canephora R4x used as P2 (i.e., 0.21%: Silvarolla et al., 1999). In our study, the occurrence of plants transgressive for caffeine in leaves indicates that the parents probably have major and minor genes for caffeine production. Moreover, the observed segregation seems to be a consequence of allogamy (heterozygosity) of the R4x parental line and suggests that this parent contributed a greater number of alleles to the low caffeine content in leaves than did the other parental line.
The high BSH seen in our study indicates that caffeine levels are mainly controlled by genetic factors and that progress in these populations by selection should be feasible, even considering that BSH is an estimate that takes into account additive, dominant and epistatic effects in the model. Le Pierres (1988) reported high BSH (0.76) and narrow-sense heritability (NSH = 0.33) in C. canephora varieties, while Montagnon et al. (1998) found a higher NSH (0.80) for caffeine content in a factorial crossing scheme of C. canephora and attributed this value to the analytical method used and the genetic origin of the parents.
Although interspecific crosses have inherent limitations such as hybrid instability, infertility, non-Mendelian segregation and low levels of intergenomic crossing-over (Stebbins, 1958), C. arabica x C. canephora R4x hybrids have been reasonably fertile (Berthaud, 1978; Owuor and Van der Vossen, 1981). In our study we were able to determine the caffeine in the seeds of 71 out of 150 F2 plants and 24 out of 88 BC1F1 plants, indicating a certain degree of fertility. Additionally, our segregation analyses of eight microsatellite loci-markers showed genetic recombination, since five of these showed no differences between the observed and the expected frequencies. A similar conclusion was reported for C. arabica x C. canephora R4x hybrids by Lashermes et al. (2000) using RFLP loci and Herrera et al. (2002) using RFLP-associated microsatellite loci.
Our results indicate that breeding for low, or high, caffeine content in seeds could be very efficient and might be accomplished starting from the F2 or BC1F1 generations. We also found that the genetic control of the caffeine content of leaves is different from that for seeds and that the C.

Figure 1 - Distribution of mean caffeine content (% dry mass) in F2 and backcross (BC1F1) generations of Coffea arabica var. Bourbon Vermelho x Coffea canephora var. Robusta 4x seeds (Figure 1A) and leaves (Figure 1B) collected in 2004 and 2005. All the p-values from the Kolmogorov-Smirnov two-sample test were non-significant.

Table 1 - Specific microsatellite marker bands detected in Coffea arabica var. Bourbon Vermelho (BV) and Coffea canephora var. Robusta 4x (R4x) and the tetraploid BC1F1 progeny resulting from the backcross between the BV parent and the BV x R4x hybrid.

Table 2 - Comparison of the caffeine content in the seeds and leaves of the parent plants Coffea arabica var. Bourbon Vermelho (BV, parent P1) and Coffea canephora var. Robusta 4x (R4x, P2), the BV x R4x F1 and F2 hybrid progeny and the BC1F1 backcross between the BV parent and the BV x R4x hybrid. The caffeine means for the different generations were compared using the F-test at p = 0.05.
N: number of plants analyzed. M: mean caffeine content (%). 1 Means followed by different letters are different by the Tukey test at p = 0.05. * Significant by the F-test at p = 0.05. ns: not significant by the F-test at p = 0.05.