Genetic and Environmental Influence on Essential Oil Composition of Eugenia dysenterica

The essential oil composition of Eugenia dysenterica from wild populations of Senador Canedo (SC) and Campo Alegre de Goiás (CA), and from cultivated plants grown adjacently from seeds of the two sampling sites, indicated the presence of two groups of oils related to sample origin. Group I included SC samples, either from the cultivated population (subgroup IA), with high percentages of α-pinene (5.9-13%), β-pinene (6.6-14%), and (Z)-β-ocimene (0-13%), or from the wild population (subgroup IB), with high percentages of γ-cadinene (21-34%), limonene (1.3-28%), and caryophyllene oxide (1.5-14%). Group II included cultivated and wild CA samples, with β-caryophyllene (15-44%), δ-cadinene (6.4-21%), and α-copaene (4.4-14%) as major constituents. Canonical correlation revealed that limonene, γ-cadinene, caryophyllene oxide, Zn, Cu, Fe, Mn, temperature, and mean monthly precipitation correlated with the wild SC samples, whereas (Z)-β-ocimene, α-copaene, β-caryophyllene, α-humulene, δ-cadinene, and P correlated with the wild CA samples and with all cultivated samples, regardless of seed origin. The variations in the oils appear to be genetically determined, in addition to an environmental influence on the SC samples.


Introduction
Eugenia dysenterica DC. is a shrubby tree with edible cherry-like fruits, popularly known in Brazil as 'cagaiteira'. It is well known in the traditional medicine of the Brazilian Cerrado, where its leaves are part of preparations used to treat diarrhoea and dysentery. 1 Antimicrobial activities have been reported for essential oils and expressed juice of the Eugenia genus, including activity against dermatophytes, 2 bacteria, and systemic fungi such as Paracoccidioides brasiliensis, 3,4 Cryptococcus neoformans var. neoformans, and C. neoformans var. gattii isolated from HIV-infected individuals with paracoccidioidomycosis or cryptococcal meningitis. 5 Its fruits are appreciated for their taste and are consumed in natura or processed into jams and ice creams. Moreover, they are harvested by extractive and predatory methods. 6 Studies of genetic diversity have shown a complex pattern of genetic variation in the geographic space of E. dysenterica wild populations, 7,8 which may be useful for conservation programs or for establishing sampling strategies.
Previous investigations of E. dysenterica essential oils have mainly revealed sesquiterpenes in the leaves of this species. 5,9 A seasonal influence on oil chemovariation has been described in cultivated individuals raised from seeds obtained from two different sites. 9 In addition, the dynamics of terpene variation during fruit ripening showed that the monoterpene concentration was high up to the semi-ripe stage and decreased afterwards. Sesquiterpenes, on the other hand, were intensively synthesized later in the ripening process, whereas ester occurrence was negligible. 10 Despite the great potential and the growing regional market for E. dysenterica fruits, leaf and fruit essential oils are unknown to cosmetic industries in Brazil. Moreover, the genetic and environmental influence on the chemical variability of different wild populations and their cultivated samples has not yet been investigated.
As part of our ongoing work on the characterization of essential oils of medicinal aromatic plants growing wild in the central Brazilian Cerrado, 11 we now report the results obtained for the essential oil variability of E. dysenterica, collected from two geographically separated wild populations and from adjacently grown cultivated populations raised from seeds obtained from the two natural sites. For this purpose, leaf essential oils were analyzed by GC-MS.
To study chemical variability, the chemical constituents were submitted to principal component, cluster, and canonical discriminant analyses. Our aim was to detect the distribution pattern of the samples and to identify which constituents may distinguish between these groups of individuals. In addition, environmental factors affecting essential oil variability were studied via canonical correlation analysis between the oil constituent data set and an edapho-climatic data matrix with 19 variables for each sampling site.

Results and Discussion
According to Barazani et al., 12 chemotypic differentiation may not be concluded from data based solely on wild populations or on cultivated plants. Chemotypic characterization can instead be established when representatives of two wild populations, grown adjacently, exhibit the same chemical differences as seen in nature. In the present work, E. dysenterica oils were obtained from two wild populations geographically separated by the Corumbá River basin, which forms two distinct sampling sites in the cities of Senador Canedo (SC) and Campo Alegre de Goiás (CA). Cultivated plants were 12-year-old individuals raised by seed propagation from each indigenous population in a single experimental field, located 30 and 200 km from the natural SC and CA populations, respectively (see the map of sampling sites in the Supplementary Information, SI, file).
Results obtained from PCA and from nearest neighbour complete linkage cluster analysis using Ward's technique (31 samples × 14 variables = 434 data; see Table 1) revealed high chemical variability within E. dysenterica essential oils (see SI file). The first PC accounts for ca. 38% of the total variance and distinguishes, well above the 99% confidence level, the sesquiterpene-rich CA samples from the monoterpene-rich SC samples, regardless of population (wild or cultivated). Moreover, the second PC (16% of the total variance) separates wild samples from cultivated samples of SC origin (see the PC scatterplot in the SI file).
Therefore, two main types of essential oils were found according to sampling origin: cluster I included SC wild and cultivated samples originating from SC seeds, and cluster II included all CA wild and cultivated samples. A dendrogram showing similarities between samples in terms of Euclidean distances (derived from the cluster analysis via PC scores and percentages of oil constituents in clustered samples) may be seen in the SI file.

Table 1 notes: some constituents were arcsine- or rank-transformed in the two-way ANOVA (see Experimental section). Percentages followed by the same capital letter in the columns and by the same small letter in the rows do not differ significantly at 5% probability by Tukey's test.
The canonical discriminant analysis (CDA) confirmed this clustering as a priori groupings, and the axial system produced by this analysis distinguished, well above the 99% confidence level, the different types of oils based on the contents of (Z)-β-ocimene (5), γ-cadinene (24), and δ-cadinene (26) as predictor variables (Figure 1). The first discriminant function (F1) accounts for 95.4% of the total variability and separates wild SC samples from CA samples regardless of population (F-test value = 28.8; degrees of freedom, DF = 6 and 52) owing to the high negative and positive scores of 24 (subcluster IB) and 26 (cluster II), respectively. The second discriminant function (F2), on the other hand, distinguishes cultivated samples originating from SC seeds (F = 7.3; DF = 2 and 27) as a result of the high scores of (Z)-β-ocimene (5) (subcluster IA). A cross-validation approach yielded 98% correct classification into the original clusters. 13 The only misclassification was an SC cultivated sample in subcluster IA, which had been originally classified as a CA sample. This misclassification may have been caused by a lower level of δ-cadinene (26) in the sample, which is a feature of cultivated plants from SC seeds.
All these findings may also be related to factors other than the genetic determination that separates cluster I from cluster II, namely biotic pressures that could modulate the volatiles of SC wild samples and of cultivated samples originating from SC seeds (subclusters IA/IB), such as the influence of pollinators, pathogens, and herbivores, or differences in environmental conditions. 14,15 Several studies have reported on the effects of nutrients on essential oil biosynthesis, including the influence of fertilizer applications on the variation of different oil constituents. 16 Micronutrient fertilizers (Cu, Zn, Mn, and Fe) have also shown significant effects on the oil yields and contents of marjoram, mint, geranium, rosemary, and cumin. 15,17,18 Therefore, oil constituent data (set 1) and edapho-climatic factor data (set 2) were jointly analysed via canonical correlation analysis (CCA), a multivariate treatment that describes correlations between two data sets (Table 2). 19 The method derives new variables, called canonical variates (CVs), that exhibit the highest correlations that may be found between the two data groups. As in PCA, CVs bear no correlation with each other, whereas eigenvalues are approximately equal to the squares of the canonical correlations and reflect the proportion of variance explained by each canonical correlation relating the two variable sets. The correlations of the variables with the CVs, known as canonical loadings or structure correlation coefficients, are used to explain with which original variables a canonical correlation is mainly associated.
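In the usual textbook formulation, the first pair of canonical variates and the canonical loadings described above can be written as follows. The notation is ours, not the authors': x denotes the standardized oil constituents, y the standardized edapho-climatic variables, and R the corresponding correlation matrices. This is a sketch of the general method and does not reproduce the SAS output.

```latex
\[
V_1 = \mathbf{a}_1^{\top}\mathbf{x}, \qquad
W_1 = \mathbf{b}_1^{\top}\mathbf{y}, \qquad
(\mathbf{a}_1, \mathbf{b}_1) = \arg\max_{\mathbf{a},\,\mathbf{b}}
\operatorname{corr}\!\left(\mathbf{a}^{\top}\mathbf{x},\; \mathbf{b}^{\top}\mathbf{y}\right)
\]
\[
\mathbf{R}_{xx}^{-1}\mathbf{R}_{xy}\mathbf{R}_{yy}^{-1}\mathbf{R}_{yx}\,\mathbf{a}_k
= \rho_k^{2}\,\mathbf{a}_k,
\qquad
\mathbf{a}_k^{\top}\mathbf{R}_{xx}\mathbf{a}_k = 1
\]
\[
\operatorname{corr}(x_j, V_k) = \left(\mathbf{R}_{xx}\,\mathbf{a}_k\right)_j
\]
```

Here the squared canonical correlations appear as eigenvalues of the matrix product in the second line, and the last line gives the structure correlation (loading) of constituent j on the k-th canonical variate of set 1; the loadings of set 2 follow by exchanging the roles of x and y.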
The canonical correlation analysis results (Table 2) showed that the first axis of the oil constituent data (set 1) was highly correlated with the first axis of the edapho-climatic factors (set 2). The first pair of canonical variates (V1 and W1) had a canonical correlation coefficient of 0.9433 and accounted for 89% of the variance. Since the p-value of the first pair of CVs was lower than 0.05, the data sets were statistically correlated at the 95% confidence level by the multivariate Wilks' lambda test, and this pair may aid in interpreting the relationship between the variables.
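As a quick arithmetic check, and assuming (our reading, not stated explicitly in the text) that the 89% figure is the squared first canonical correlation, the two reported numbers are consistent:

```latex
\[
\rho_1^{2} = 0.9433^{2} \approx 0.8898 \approx 89\%
\]
```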
In Table 2, which shows the signs and magnitudes of the structure correlation coefficients, an increase in the value of the first CV is linked with an increase in δ-cadinene (26), α-copaene (9), β-caryophyllene (10), and (Z)-β-ocimene (5) from the first set and in P from the second set. On the other hand, an increase in the first CV is also highly associated with a reduction in γ-cadinene (24), limonene (4), and caryophyllene oxide (27) from the first set, and in Zn, Cu, Fe, Mn, and the climatic factors (precipitation and temperature) from the second set. Thus, the first CV shows sesquiterpene variation in leaves in response to environmental pressure.
The correlation analysis regarding populations and soils revealed that γ-cadinene (24), limonene (4), and caryophyllene oxide (27) have a strong relationship with the micronutrient balance in soils (Zn, Cu, Fe, Mn) and with the hottest and most humid habitats, as well as with SC wild samples (subcluster IB). In addition, δ-cadinene (26), α-copaene (9), β-caryophyllene (10), (Z)-β-ocimene (5), and P are related to the cultivated samples from SC seeds (subcluster IA) and to CA samples regardless of population (cluster II). The canonical correlation plot shows the sample scores for each of the two CVs of the first canonical correlation (Figure 2). When the canonical correlation is high, the points form two clusters at different positions along the regression line. In Figure 2, the SC wild population from the hottest and most humid site (subcluster IB) was located at the left of the regression line (negative CV values), whereas cultivated samples from SC seeds (subcluster IA) and CA samples regardless of population (cluster II) were located at the right (positive CV values).
The positive and negative correlations of caryophyllene oxide (27) and β-caryophyllene (10), respectively, with metal ions are in agreement with the effects of foliar application of micronutrient fertilizers containing Zn and Mn on cumin oils. 18 These micronutrient effects may be associated with the strict requirement of sesquiterpene synthases for a divalent metal ion as cofactor, which also influences the number of by-products obtained from these reactions. 20 The formation of γ-humulene is promoted by Mn²⁺ ions, whereas the amounts of all other by-products are reduced. In peppermint, the only by-product (δ-cadinene) produced by (E)-β-farnesene synthase in the presence of Mg²⁺ was entirely absent in the presence of Mn²⁺ ions. 21 Similar negative effects of Mn²⁺ on δ-cadinene (26) are in agreement with the negative correlation shown in Table 2.
As regards the relationship between P and oil constituents, it has been reported that reduced P availability causes increased production of different in vitro secondary metabolites under greenhouse conditions. 22 In contrast, terpenoid accumulation has been related to high soil P content or to culture media supplemented with increased P concentrations. 23 The observed correlation may be related, at least partially, to the collection of the wild samples, which took place in August at the end of the dry season. During this time the peak of leafing activities, senescence, and the emission of new leaves occur, 24 thus requiring large amounts of carbon and macronutrients, particularly N and P for proteins and RNA, which are markedly increased in young leaves with a high capacity for essential oil biosynthesis. Leaf volatiles may provide a constitutive defense, by deterring potential herbivores, or an induced response to herbivore damage, by attracting predators or parasites. 25 Based on currently available data, the chemical variability of oil composition in SC and CA wild samples may be explained as a result of localized inbreeding effects associated with a low gene migration rate within the populations. The Corumbá River basin separates the wild CA population from the SC site (cluster I from cluster II) through a depression formed by the river and its tributaries. This spatial barrier could contribute, at least partially, to ecological isolation, a prerequisite for speciation and chemovariation between the two sampling sites. Thus, the observed chemical polymorphism is likely to be genetically determined rather than environmentally controlled, a fact that has been observed in several plant species. 26 The existence of chemotypic differentiation between the two populations is supported by the fact that cultivated plants grown adjacently in the same environment exhibited the typical composition of their wild populations. 12
Furthermore, the influence of edapho-climatic factors on SC samples, but not on CA samples, is strong enough to induce the high chemical variability recorded in the leaf oil of SC wild samples and of cultivated samples originating from SC seeds. It might be speculated that the chemical phenotypic plasticity of SC samples (subclusters IA/IB) is the result of various evolutionary pressures acting as a selective force for a specialized phenotype that is better adapted to local environments (ecotypes).
The population structure based on oil variability is in accordance with the results on genetic structure in E. dysenterica populations obtained using morphological and isoenzymatic traits, 7,27 as well as SSR and RAPD markers. 8 Although most of the genetic variance was found within natural populations, a highly significant portion was found among populations, indicating restricted gene flow between them. The high correlation coefficient between the genetic and geographic distance matrices suggested a spatial pattern of genetic variability among the populations, with gene flow decreasing as distance increased. 8 On the other hand, the edaphic features of the regions exerted a strong influence on the phenotypic differentiation of the populations in terms of morphological and demographic characters. 27 Thus, variation patterns in essential oils may reflect a genetic basis of oil composition (SC and CA chemotypes) or indicate that chemical variations may be caused by selective pressures in different ecological and geographical environments (SC ecotypes) of E. dysenterica.

Conclusions
Essential oil variability of E. dysenterica determined by GC-MS and by multivariate statistical analysis of wild and adjacently-grown cultivated populations originated from seeds of two sampling sites revealed high polymorphism, which could be influenced by genetic and edapho-climatic factors.

Experimental
Plant material
E. dysenterica leaves were collected in their natural habitat in August 2006, in the cities of Campo Alegre de Goiás (CA: 17° 36´ 13´´ S, 47° 43´ 13´´ W, 831 m) and Senador Canedo (SC: 16° 37´ 7´´ S, 49° 4´ 26´´ W, 904 m), Goiás State, Brazil; they were identified by one of the authors (R. R. N.). For the cultivated samples, leaves were collected in July 2006 from 12-year-old individuals raised by seed propagation from the same wild plants. The cultivated individuals were grown adjacently in a randomized block design with three replications in a single experimental field (16° 35´ 39´´ S, 49° 17´ 23´´ W, 716 m) belonging to the School of Agronomy and Food Engineering of Universidade Federal de Goiás, Goiânia, Goiás State, Brazil. The cultivated habitat was located 30 and 200 km from the SC and CA natural populations, respectively. Voucher specimens are deposited at the herbarium of Universidade Federal de Goiás (UFG40611 and UFG40612).
To assess oil chemical composition, leaf samples were collected from 11 different trees of the wild populations (6 trees at the CA site and 5 at the SC site) and from 20 different trees of the cultivated populations (11 trees originating from CA seeds and 9 from SC seeds), all of which were dried for 7 days at 30 °C until constant weight. After being powdered, the dried phytomass (50 g) was submitted to hydrodistillation (3 h) in a modified Clevenger-type apparatus. At the end of each distillation the oils were collected, dried with anhydrous Na₂SO₄, transferred to glass flasks, and kept at -18 °C. Oil yields (%) were based on the dry weight of the plant samples. All experiments were conducted in duplicate and the results are shown as mean values.

Vol. 21, No. 8, 2010

Soil analyses
Soil samples were collected at 0-20 and 20-40 cm depths in each locality. They were subsequently air-dried, thoroughly mixed, and sieved (2 mm). The portion finer than 2 mm was kept for physical and chemical analysis. 28 The pH was determined in a 1:1 soil-water volume ratio. Ca, Mg, and Al were extracted with 1 mol L⁻¹ KCl, whereas P, K, Zn, Cu, Fe, Mn, and Mo were extracted with Mehlich's solution. Organic matter, cation exchange capacity (CEC), potential acidity (H+Al), base saturation, Al saturation, and soil texture were determined by the usual methods. 28 Mean monthly temperature and precipitation values were obtained from climatological stations at UFG (cultivated samples) and Instituto Nacional de Meteorologia, INMET (wild samples). Environmental factor data from these climatological records and the average of the soil analyses of both depths were ordered in an edapho-climatic matrix with 19 variables for each sampling site. The canonical correlation procedure was applied to both data sets, concerning essential oil constituents and edapho-climatic features (discriminant edapho-climatic variables in clustered samples are shown in the supplementary information file).
In geographical terms, the cultivated field has a loam soil texture, whereas the natural habitats mainly have a sandy loam texture; both are characterized by acidic, nutrient-poor soils and by scleromorphic vegetation. Mean annual rainfall, temperature, and relative humidity values are similar.

Chemical analyses
Oil sample analyses were performed on a Shimadzu QP5050A GC-MS instrument under the following conditions: a CBP-5 (Shimadzu) fused silica capillary column (30 m × 0.25 mm i.d., 0.25 µm film thickness) connected to a quadrupole detector operating in EI mode at 70 eV, with a scan mass range of 40-400 m/z at a sampling rate of 1.0 scan s⁻¹; carrier gas: He (1 mL min⁻¹); injector and interface temperatures of 220 °C and 240 °C, respectively, with a split ratio of 1:20. The injection volume was 0.4 µL (ca. 20% in hexane) and the oven temperature was raised from 60 to 246 °C at 3 °C min⁻¹, then at 10 °C min⁻¹ to 270 °C, holding the final temperature for 5 min. Individual components were identified by comparing their linear retention indices (RI), 29 determined by co-injection with a C8-C32 n-alkane series, 30 and their mass spectra with those of the literature 29 and of a computerized MS database using NIST libraries. 29
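A linear retention index for a temperature-programmed run is commonly computed by interpolating the retention time of a compound between the two n-alkanes that bracket it. The minimal sketch below assumes this common definition; the function name and the retention times in the usage note are hypothetical and are not taken from the paper.

```python
from bisect import bisect_right


def linear_retention_index(t_x, alkane_times):
    """Linear retention index by interpolation between bracketing n-alkanes.

    t_x          -- retention time of the compound (min)
    alkane_times -- dict mapping alkane carbon number to retention time (min);
                    times are assumed to increase with carbon number
    """
    carbons = sorted(alkane_times)
    times = [alkane_times[c] for c in carbons]
    if not times[0] <= t_x <= times[-1]:
        raise ValueError("retention time lies outside the n-alkane series")
    # Index of the last alkane eluting at or before the compound,
    # capped so that a following alkane always exists.
    i = min(bisect_right(times, t_x) - 1, len(times) - 2)
    n, n_next = carbons[i], carbons[i + 1]
    fraction = (t_x - times[i]) / (times[i + 1] - times[i])
    return 100.0 * (n + (n_next - n) * fraction)
```

For example, with hypothetical alkane retention times `{9: 10.0, 10: 14.0, 11: 18.0}`, a compound eluting at 12.0 min lies halfway between C9 and C10 and receives an index of 950.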

Chemical variability
Univariate multiple comparisons of means for the oil constituent data were established by two-way ANOVA (wild/cultivated populations and SC/CA sites as factors) using SAS GLM analyses (Statistical Analysis System, SAS Institute Inc., Cary, NC, 1996). All data were checked for homoscedasticity using Hartley's test. This test revealed significant deviations from this basic assumption for oil constituents 9, 10, 24, 27-29, and monoterpene hydrocarbons, which were arcsine-transformed, and for 3, 4, 12, and 26, which were rank-transformed (Table 1). A post-hoc Tukey test was performed whenever a difference was established. P-values below 0.05 were regarded as significant.
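The two transformations mentioned above can be sketched as follows. This is an illustration only: the text does not give the exact form of the arcsine transformation, so the widely used arcsine square-root form for percentages is assumed here, and ties in the rank transformation are assumed to receive average ranks. Function names are hypothetical.

```python
import math


def arcsine_sqrt(percentages):
    """Arcsine square-root transform of percentages (0-100), in radians.

    Assumption: the common asin(sqrt(p / 100)) form is used.
    """
    return [math.asin(math.sqrt(p / 100.0)) for p in percentages]


def rank_transform(values):
    """Replace each value by its rank (1 = smallest); ties get the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value is tied with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = average_rank
        start = end + 1
    return ranks
```

The transformed values would then replace the raw percentages in the ANOVA for the constituents that failed the homoscedasticity check.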
In multivariate analyses, each datum was standardized according to $z_{ij} = (x_{ij} - \bar{x}_j)/s_j$, where $\bar{x}_j$ and $s_j$ are the average and the standard deviation of variable $j$. Principal component analysis (PCA) was applied to explore the interrelationships between populations and their chemical constituents, using Système Portable d'Analyse des Données Numériques (SPAD), version 5.5, Centre International de Statistique et d'Informatique Appliquées, France (2001). Cluster analysis was also applied to investigate possible natural groupings among samples characterized by the set of oil constituents. The nearest neighbour complete linkage technique by the Benzécri algorithm was used as a similarity index, and hierarchical clustering was performed according to Ward's variance-minimizing method. 31 For variable selection, the threshold of residual eigenvalues (≤ 0.70) in the original data matrix (31 samples × 29 variables) was used to establish the maximum number of variables that could be removed (19 variables). 32 The 15 variables effectively eliminated expressed the highest loadings in the lowest residual eigenvalues and also contributed ≤ 2% to the chemical profiles.
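The standardization step and the share of variance carried by the first principal component can be sketched in plain Python. This is not the SPAD implementation used by the authors; it assumes the sample standard deviation (n - 1 denominator), which the text does not specify, and uses simple power iteration on the correlation matrix. All function names are hypothetical.

```python
import math


def standardize(rows):
    """Column-wise z-scores: z_ij = (x_ij - mean_j) / sd_j.

    Assumption: sd_j is the sample standard deviation (n - 1 denominator).
    Columns with zero variance are not handled in this sketch.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(p)]
    return [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]


def correlation_matrix(z):
    """Correlation matrix of already standardized data."""
    n, p = len(z), len(z[0])
    return [[sum(r[a] * r[b] for r in z) / (n - 1) for b in range(p)]
            for a in range(p)]


def first_pc_share(rows, iterations=200):
    """Fraction of total variance explained by the first principal component."""
    corr = correlation_matrix(standardize(rows))
    p = len(corr)
    v = [1.0 / (j + 1) for j in range(p)]  # arbitrary non-uniform start vector
    for _ in range(iterations):
        w = [sum(corr[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the leading eigenvalue for the unit vector v.
    eigenvalue = sum(v[a] * sum(corr[a][b] * v[b] for b in range(p))
                     for a in range(p))
    return eigenvalue / p  # total variance of standardized data equals p
```

Because standardized variables each have unit variance, the eigenvalues of the correlation matrix sum to the number of variables, which is why the leading eigenvalue is divided by p to obtain a proportion.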
Canonical discriminant analysis via the SAS CANDISC and SAS DISCRIM procedures was used to differentiate populations and clusters on the basis of oil composition. The predictive ability of the canonical discriminant functions was evaluated by a leave-one-out cross-validation approach as implemented in SAS.
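The leave-one-out idea can be illustrated with a much simpler classifier than the one used in the paper. The sketch below swaps in a nearest-centroid rule in place of the canonical discriminant functions, purely to show how each sample is held out in turn and predicted from the remaining ones; the function names and any data passed to them are hypothetical.

```python
def nearest_centroid_predict(train_x, train_y, x):
    """Predict the label whose class centroid is closest to x (Euclidean)."""
    sums, counts = {}, {}
    for row, label in zip(train_x, train_y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    best_label, best_dist = None, float("inf")
    for label, acc in sums.items():
        centroid = [s / counts[label] for s in acc]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def leave_one_out_accuracy(xs, ys):
    """Hold out each sample once, classify it from the rest, return hit rate."""
    hits = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if nearest_centroid_predict(train_x, train_y, xs[i]) == ys[i]:
            hits += 1
    return hits / len(xs)
```

A sample that sits among the members of another group is misclassified when it is held out, which lowers the cross-validated rate below 100%.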
Oil variability and edapho-climatic factor relationships were obtained by canonical correlation analysis via the SAS CANCORR procedure. The magnitude of the structure correlation coefficients (canonical loadings) was used to explain the canonical variates. The predictive ability was evaluated by canonical redundancy analysis with standardized variance coefficients.

Figure 1. Scatterplot of the canonical discriminant functions of E. dysenterica wild samples (circle symbols) and adjacently cultivated individuals (square symbols) raised from seeds from Senador Canedo (SC; unshaded symbols) and Campo Alegre de Goiás (CA; shaded symbols), showing the subclusters IA/IB and cluster II to which they belong. a: axes refer to scores from the samples. b: axes refer to loadings from the predictor oil variables, represented as long arrows from the origin. Short arrows show a misclassified individual detected by CDA. Crosses represent cluster centroids and values in parentheses refer to the explained variance on each discriminant axis.

Table 2. Canonical correlation structure (loadings) of oil constituents and edapho-climatic factors with their canonical variates