Peach palm core collection in Brazilian Amazonia

The Peach palm Active Germplasm Bank has abundant genetic diversity in its holdings. Because it is a live collection, maintenance, characterization and evaluation are expensive, restricting its use. One way to promote more efficient use is to create a Core Collection, a set of accessions with at least 70% of the genetic diversity of the full collection with minimal repetition. The available geographic, molecular marker (RAPD) and morphometric information was systematized and the populations were stratified into wild and domesticated. The Core Collection consists of 10% of the entire collection: 31 accessions of landraces, 5 accessions of non-designated populations and 4 accessions of wild populations. The Core has 92% of the molecular polymorphism and 95% of the heterozygosity of the full collection, with minimal divergence between them by molecular variance. The Core is already receiving priority maintenance, which will contribute to long-term conservation.


INTRODUCTION
The peach palm (Bactris gasipaes Kunth) is a food crop (fruit and heart-of-palm) domesticated by Native Americans. Over millennia, these people selected types that most interested them, resulting in a complex of landraces with distinct morphological, chemical and production characteristics (Clement 1995, Mora-Urpí et al. 1997). Different landraces have different demands in the modern market. For example, the Pampa Hermosa landrace has strong demand for heart-of-palm; the Pará landrace has strong demand for cooked fruit in Manaus and Belém, Brazil (Clement 2008, Graefe et al. 2013). A recent review organized B. gasipaes into two varieties: var. gasipaes, which includes all domesticated populations with large fruits; var. chichagui, which includes all wild populations with small fruits (Henderson 2000). The domesticated populations are further classified into landraces: "microcarpa" (Pará and Juruá landraces) have small fruit (means < 20 g), "mesocarpa" (Pampa Hermosa, Pastaza, Cauca, Utilis) have fruits of intermediate size (means between 20 and 70 g), and "macrocarpa" (Putumayo and Vaupés) have large fruits (means > 70 g) (Mora-Urpí and Clement 1988, Mora-Urpí et al. 1997). Within var. chichagui, Henderson (2000) suggested the existence of three wild types (Figure 1), two with very small fruit (means 1 g) and the third with fruit weighing 2-15 g. The National Research Institute for Amazonia (INPA) created and maintains Brazil's largest ex situ collection of B. gasipaes, the Peach palm Active Germplasm Bank (BAG), which contained samples of all the landraces mentioned above and two of the wild types. This BAG contributed to the expansion of the cultivated peach palm agribusiness, but was less successful with expansion of demand for fruit (Clement et al. 2004). Studies in the BAG have contributed to understanding the domestication of peach palm and the distribution of its landraces (Clement et al. 2012).
Molecular techniques have been used to investigate the validity of landraces and populations of peach palm and their genetic relationships. Sousa et al. (2001) used RAPD and Clement et al. (2002) used AFLP markers, and concluded that the Solimões landrace belongs to the Putumayo landrace, while the Pará landrace is distinct. Rodrigues et al. (2004) used RAPD markers to validate seven landraces, and concluded that there is only one landrace in Central America (the Guatuso and Tuira landraces are part of the Utilis landrace). Importantly, Rodrigues et al. (2004) also showed that there is a significant negative relationship between gene flow and geographic distance among landraces, that is, isolation by distance. Silva et al. (2003) used the same RAPD markers on a different set of accessions and confirmed the validations of Rodrigues et al. (2004). Cristo-Araújo et al. (2010) also used the same RAPD markers on another set, and found results similar to those of Silva et al. (2003) and Rodrigues et al. (2004). Santos et al. (2011) assessed the genetic variability of hybrid populations with the same markers and concluded that they are not different from the landraces surrounding them. This analytical process took a decade because the peach palm is not an economically important crop and therefore is not a national priority (Clement et al. 2004), which also has implications for the conservation of its genetic resources.
Maintenance costs of living collections, such as the Peach palm Active Germplasm Bank (BAG), are high and the means to maintain, characterize and evaluate are increasingly scarce (Clement et al. 2004). One way to increase the effectiveness of the available human and financial resources to ensure continuity and expand the usefulness of the BAG is to create a core collection within it, which will receive priority in future investments. A core collection consists of a limited set of accessions chosen to represent at least 70% of the genetic variation of the entire collection with minimal redundancy (Brown 1989, Brown and Spillane 1999, Odong et al. 2012). Escribano et al. (2008) and Odong et al. (2012) review criteria for selecting and evaluating the accessions included in the core and the core's relationship with the total collection; some of these evaluation methods are used here. Several computer programs now exist to aid core collection design (see Escribano et al. 2008, Odong et al. 2012), but none handle disjointed data sets with dominant markers, like that of the Peach palm BAG. The aim of this study was to design and evaluate a core collection within the Peach palm BAG to support the better management of these genetic resources.

Material
The Peach palm BAG is located at km 38 of the BR-174 highway, Manaus, Brazil, and currently has 371 accessions maintained in the field, with representatives of eight landraces, four hybrid populations (Belém, Iquitos, Manaus, Yurimaguas), some populations that have not been designated to landraces (middle Ucayali and upper Madeira Rivers), and some populations of var. chichagui (types 1 and 3). Accessions are composed of nine (or fewer) plants derived from a single open-pollinated bunch obtained from a palm in the property of a traditional farmer; all Latin American peach palm germplasm banks established in the 1970s and 1980s use the same definition of accession (Mora-Urpí et al. 1997). All accessions have reasonably complete passports and nearly all have at least one plant with RAPD molecular marker data.

Design of the Core Collection
All information available from the BAG was systematized for the formation of the Core Collection. Morphometric information should be the first criterion for the creation of a core collection (van Hintum 1999), because it is the most useful information for the breeder, followed by genetic information and geographical information. However, this order of priority is the opposite of the order of information availability in the BAG. In fact, most germplasm collections have reasonably good passport information, so it is common practice to use this in core collections (Johnson and Hodgkin 1999). The fact that there is a significant negative correlation (Mantel's r = -0.83; Rodrigues et al. 2004) between gene flow and geographic distance increases the significance of geographical information as a criterion, as this confirms significant genetic structure across geographic space. Therefore, we used two criteria for the formation of the Peach palm core: geographical distribution and molecular markers. The general procedure for the selection of a core collection has four steps (van Hintum 1999):
1. Determine the size of the core collection. This is a management decision and in this case is 40 accessions (about 10% of the BAG).

2. Divide the material into distinct groups. A stratified sampling strategy in two levels was used in this case. At the first level, the samples were classified based on the degree of domestication: wild and domesticated. At the second level, the domesticated populations were subdivided into landraces, non-designated populations and hybrid populations, although after the study of Santos et al. (2011), which showed that hybrid populations are not different from their surrounding landraces, these samples were incorporated into their landraces. Because there is a significant relationship between gene flow and geographic distance (Rodrigues et al. 2004), this stratum also represents the eco-geographical distribution of peach palm.
3. Decide the number of accessions per group. Landraces and populations with little representation in the BAG were sampled in proportion to their number, while well represented landraces and populations contributed logarithmically (log n) (van Hintum et al. 2000). Minor changes in these strategies were justified for economic reasons, i.e., market demand, or to include material potentially important for understanding the origin and racial hierarchy of peach palm. Odong et al. (2012) point out that these decisions should result in better uniformity of representation of the categories and that this better uniformity can be tested by comparing the Shannon diversity index of the full collection with that of the core collection.
4. Final selection using information generated by RAPD molecular markers. Comparisons were made of dendrograms of published papers to identify accessions with greater genetic divergence, as this divergence is probably due to a greater number of rare alleles (Marita et al. 2000).
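The difference between proportional and logarithmic allocation in step 3 can be illustrated with a minimal Python sketch. It is not the procedure used to build the Core, which also involved the manual adjustments described above. The function names, the use of log(n + 1) so that single-accession groups keep a non-zero weight, and the largest-remainder rounding are assumptions made for the illustration. The group sizes for Pará (60) and Pampa Hermosa (92) come from the text; the other two groups and the core size of 10 are hypothetical.

```python
import math


def _round_to_total(shares, total):
    """Largest-remainder rounding so the integer counts sum to `total`."""
    counts = {group: math.floor(share) for group, share in shares.items()}
    leftover = total - sum(counts.values())
    # Hand the remaining slots to the groups with the largest fractional parts.
    by_remainder = sorted(shares, key=lambda g: shares[g] - counts[g], reverse=True)
    for group in by_remainder[:leftover]:
        counts[group] += 1
    return counts


def allocate(sizes, core_size, strategy="proportional"):
    """Split `core_size` accessions among groups.

    proportional: weight = n
    logarithmic:  weight = log(n + 1)  (the +1 is an assumption of this sketch)
    """
    if strategy == "proportional":
        weights = {g: float(n) for g, n in sizes.items()}
    elif strategy == "logarithmic":
        weights = {g: math.log(n + 1) for g, n in sizes.items()}
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    total_weight = sum(weights.values())
    shares = {g: core_size * w / total_weight for g, w in weights.items()}
    return _round_to_total(shares, core_size)


# Para and Pampa Hermosa sizes are quoted in the text; groups C and D are hypothetical.
sizes = {"Para": 60, "Pampa Hermosa": 92, "group C": 12, "group D": 2}
proportional = allocate(sizes, 10, "proportional")
logarithmic = allocate(sizes, 10, "logarithmic")
print(proportional)
print(logarithmic)
```

Under the logarithmic weights the largest group gives up places to the smallest ones, which is the effect sought when well represented landraces would otherwise dominate the core.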

Genetic analyses and evaluation of the core collection
We were unable to combine the four binary matrices (Silva et al. 2003, Rodrigues et al. 2004, Cristo-Araújo et al. 2010, Santos et al. 2011) due to many differences in the generation of the RAPDs, including the people involved, reagents, thermal cyclers and other equipment throughout the decade. This is a recognized limitation of RAPD technology (Ferreira and Grattapaglia 1998), but RAPD continues to be used because most of the fragments are identical at the intra-specific level, and the estimates of genetic diversity within and between populations are very similar when compared to other dominant markers (AFLP and ISSR) (Nybom 2004). Therefore, the matrices were analyzed separately to estimate the percentage of polymorphism (99%) and expected heterozygosity using GenAlEx v.6.4 (Peakall and Smouse 2006), and genetic diversity (Ht, Hs) and divergence (θ-I and Gst-B) using Hickory v.1.1, which uses a Bayesian approach (Holsinger and Lewis 2003). The comparison of the genetic variability between the BAG and Core Collection was done by analysis of molecular variance (AMOVA) and visualized with Principal Coordinate Analysis (PCoA), using R, v.3.0 (R Core Team 2013).
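As a rough illustration of the two simplest parameters, the Python sketch below computes the percentage of polymorphic loci and the expected heterozygosity from a binary band matrix, then expresses the core values as fractions of the full-collection values. It is not a reimplementation of GenAlEx or Hickory. It assumes Hardy-Weinberg equilibrium with band absence as the recessive homozygote (q = square root of the band-absent frequency, He = 2pq per locus), counts a locus as polymorphic when both states are observed (which ignores any frequency criterion), and uses an invented toy matrix and invented function names.

```python
import math


def percent_polymorphic(matrix):
    """Percentage of loci (columns) showing both band presence and absence."""
    n_plants = len(matrix)
    n_loci = len(matrix[0])
    polymorphic = 0
    for j in range(n_loci):
        present = sum(row[j] for row in matrix)
        if 0 < present < n_plants:
            polymorphic += 1
    return 100.0 * polymorphic / n_loci


def expected_heterozygosity(matrix):
    """Mean He over loci, assuming Hardy-Weinberg and a recessive null allele."""
    n_plants = len(matrix)
    n_loci = len(matrix[0])
    total = 0.0
    for j in range(n_loci):
        absent = sum(1 for row in matrix if row[j] == 0)
        q = math.sqrt(absent / n_plants)  # frequency of the band-absent allele
        total += 2.0 * (1.0 - q) * q
    return total / n_loci


def retention(full, core_rows):
    """Core/full ratios for polymorphism and expected heterozygosity."""
    core = [full[i] for i in core_rows]
    p_ratio = percent_polymorphic(core) / percent_polymorphic(full)
    he_ratio = expected_heterozygosity(core) / expected_heterozygosity(full)
    return p_ratio, he_ratio


# Toy matrix (hypothetical): 6 plants x 4 RAPD loci, 1 = band present.
full = [
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 1, 1, 1],
    [1, 0, 0, 1],
    [0, 1, 1, 1],
    [1, 1, 0, 1],
]
p_ratio, he_ratio = retention(full, core_rows=[0, 2, 3])
print(f"polymorphism retained: {p_ratio:.2f}, heterozygosity retained: {he_ratio:.2f}")
```

Because the four matrices could not be combined, a calculation of this kind would be run once per matrix and the ratios averaged afterwards.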

The Core Collection
The Core Collection consists of 40 accessions (Table 1), which represent slightly more than 10% of the BAG that currently has 371 accessions. Of these 40 accessions, 36 represent domesticated populations (var. gasipaes) and four represent wild populations (var. chichagui). The BAG has a Shannon diversity index of 0.67 due to the unevenness of landrace representation, while the Core has an index of 0.99 due to its greater evenness (Figure 2). Greater evenness is one indication that a Core represents its full collection (Odong et al. 2012).
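The comparison of evenness can be sketched in Python as follows. The sketch computes the Shannon index with natural logarithms and a normalized evenness (the index divided by its maximum for the number of groups); whether the values reported above were obtained with this base or normalization is not stated here, so that choice is an assumption. The counts are those quoted in the text for five groups only (Pampa Hermosa, Pará, Cauca, Vaupés, Pastaza), so the results illustrate the direction of the effect and will not reproduce 0.67 and 0.99.

```python
import math


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over non-empty groups."""
    total = sum(counts)
    return -sum((n / total) * math.log(n / total) for n in counts if n > 0)


def shannon_evenness(counts):
    """H' divided by its maximum, ln(k), for k non-empty groups (0 if k < 2)."""
    k = sum(1 for n in counts if n > 0)
    if k < 2:
        return 0.0
    return shannon_index(counts) / math.log(k)


# Accessions per group as quoted in the text (five groups only, not the whole BAG):
# Pampa Hermosa, Para, Cauca, Vaupes, Pastaza.
bag_counts = [92, 60, 2, 2, 1]
core_counts = [4, 7, 2, 2, 1]

bag_evenness = shannon_evenness(bag_counts)
core_evenness = shannon_evenness(core_counts)
print(f"BAG evenness:  {bag_evenness:.2f}")
print(f"Core evenness: {core_evenness:.2f}")
```

Even on this subset the core is more even than the full collection, because the two large landraces contribute far fewer accessions relative to the small ones.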
The microcarpa landraces are sources of oilier fruits and were over-represented in the Core to meet market demands in Brazilian Amazonia (Clement and Santos 2002).
Among the landraces described to date, Pará is the only landrace of the eastern dispersal from the southwestern center of domestication (Rodrigues et al. 2004) and is the most widely distributed of all landraces (Figure 1). Due to its wide geographic distribution, an accession from each of six populations was chosen to represent the 60 accessions in the BAG, and an accession of the putative hybrid population in Manaus was included in this group. Within these populations, accessions were chosen for their genetic divergence, with accession 50-P the most divergent (Cristo-Araújo et al. 2010).
The Juruá landrace is part of the western dispersal (Rodrigues et al. 2004), and is the most primitive landrace of Western Amazonia. Although it has a very small geographical distribution (collected only near Cruzeiro do Sul, Acre), its importance as a source of oily fruit explains its over-representation in the Core. The two accessions chosen were the most divergent in Silva et al.'s (2003) analysis.

a Hybrid populations - Allocated to their respective landraces based on Santos et al. (2011); b Guatuso and Tuira - Genetically parts of the Utilis landrace (Rodrigues et al. 2004); c Solimões - Genetically part of the Putumayo landrace (Rodrigues et al. 2004).
The mesocarpa landraces make up the largest and most variable group, with two over-represented landraces and the others under-represented in the BAG (Figure 2). As the only landrace of the Pacific side of northwestern South America, the Cauca landrace was over-represented in the Core by the two existing accessions. The Pastaza landrace was represented by the single accession in the BAG. Pampa Hermosa was represented logarithmically because its fruits are not preferred by consumers of Manaus and Belém, but also because there are many working collections that conserve its genetic resources for use in heart-of-palm breeding (Clement et al. 2004, Clement et al. 2012). Among the 92 accessions, four were selected from the Pampa Hermosa, Santa Maria, Rio Paranapura and Lorenza populations by divergence in Silva et al.'s (2003) analysis. Rodrigues et al. (2004) showed that the Guatuso and Tuira landraces are parts of the Utilis landrace, even though Guatuso presents a high proportion of spineless plants (Clement and Manshardt 2000). The number of accessions chosen is proportional to their representation in the BAG, although with different amounts for each population. The Guatuso population is important as a source of alleles for spinelessness and therefore was slightly over-represented by the two most divergent accessions in Rodrigues et al.'s (2004) analysis. The Tuira population was represented by the most divergent accession in Rodrigues et al.'s (2004) analysis. The Utilis landrace was under-represented, including the two most divergent accessions in Rodrigues et al.'s (2004) and Silva et al.'s (2003) analyses.
The non-designated upper Madeira River population was over-represented, because the fruits are oilier and likely to meet consumer preferences in Manaus and Belém, and also because it is important to understand the origin and domestication of peach palm, mainly in Bolivia. Three accessions were selected, from Plácido de Castro and Puerto Maldonado (2), as the most divergent accessions in Cristo-Araújo et al.'s (2010) analysis. The non-designated population of the middle Ucayali River was over-represented with two accessions, because it is important to understand the origin and domestication of peach palm, mainly in Peru. Accessions of Pucallpa and Contamana were the most divergent in Cristo-Araújo et al.'s (2010) analysis.
The two macrocarpa landraces contain large amounts of starch and are potentially important for the preparation of flour and for fermentation (Clement 2008), but do not get much attention from entrepreneurs. In the Core, the Putumayo landrace was under-represented and Vaupés was over-represented. The Putumayo landrace accessions were selected for their high divergence in Silva et al.'s (2003) and Rodrigues et al.'s (2004) analyses, and two accessions of the reputed hybrid population of Iquitos were included. The Solimões landrace, although originally belonging to the mesocarpa class, was grouped here because this landrace is genetically part of the Putumayo landrace (Sousa et al. 2001, Clement et al. 2002, Rodrigues et al. 2004). Three accessions were chosen for their divergence in Rodrigues et al.'s (2004) and Silva et al.'s (2003) analyses. The Vaupés landrace was over-represented in the Core with the only two accessions of this landrace, which has fruits that are flattened (much wider than tall) and very large (mean of 138 g).
The two types of var. chichagui (1 and 3) in the BAG were represented in proportion to their numbers. Type 1 is only represented by a single population near Rio Branco, Acre, and the two most divergent accessions in Rodrigues et al.'s (2004) analysis were chosen. Type 3 is represented by one accession from Pucallpa and one from Contamana that were analyzed by Santos et al. (2011).

Evaluation of the Core Collection
A Core Collection should represent at least 70% of the diversity of the collection, but in practice this varies between 70 and 90% (Brown and Spillane 1999). As it was not viable to perform a joint analysis using RAPD markers, the genetic variability of the samples represented in the Peach palm Core Collection was evaluated on the basis of the four individual data matrices analyzed separately. The Principal Coordinate Analysis (PCoA) of each matrix clearly demonstrates that the Core is well distributed within the BAG (Figure 3).
AMOVAs were performed to compare the Core with the BAG for the plants in each data matrix, as a way of quantifying the variance captured by the Core, and found small fractions of the variance between the BAG and the Core (Table 2). The slightly higher percentage of variance between BAG and Core found in the Cristo-Araújo et al. (2010) matrix is visible in the Principal Coordinate Analysis (Figure 3C). The BAG had a higher percentage of polymorphism and heterozygosity than the Core (Table 2), as expected given the different numbers of accessions and plants. On average, the Core had 92% of the polymorphism present in the BAG, and 95% of the heterozygosity. The Hs estimated with Bayesian methods differs much less between BAG and Core, and often the Core had the same heterozygosity as the BAG, because these are simulations of heterozygosity rather than direct measurements (Holsinger and Lewis 2002). The estimates of divergence obtained with Bayesian methods show that the Core is almost the same as the BAG, similar to the AMOVA results.
Based on these graphical and numerical comparisons, the Core Collection represents very well the molecular genetic variability maintained in the Peach palm Active Germplasm Bank, as do most Core Collections created to date (Odong et al. 2012), including those created for other fruit species (Escribano et al. 2008). Also, the Core Collection can be considered a suitable sample of the BAG to try to identify the origin and dispersal of peach palm in Amazonia and in the Neotropics in general.

Table 2. Genetic parameters of the Peach palm Active Germplasm Bank (BAG) and Core Collection (CC) estimated from four data matrices [Rodrigues et al. (2004), Silva et al. (2003), Cristo-Araújo et al. (2010), Santos et al. (2011)] that could not be combined for a single analysis, with summary of the AMOVA. P = polymorphism, He = expected heterozygosity (assuming absence as recessive), Hs = average panmictic heterozygosity, Ht = panmictic heterozygosity based on allele frequency, θ-I (Theta-I) = divergence (similar to Fst); Gst-B = Bayesian version of Nei's Gst.

Figure 2. Representativeness of the Core Collection within the Peach palm Active Germplasm Bank in terms of number of accessions per landrace and population. Populations are arranged from east (Pará) to west (Utilis) (see Figure 1).

Figure 3. Evaluation of the distribution of the Core Collection in the multivariate space of the Peach palm BAG by Principal Coordinate Analysis of the four matrices generated with RAPD markers that could not be combined for a single analysis: (A) Rodrigues et al. (2004), (B) Silva et al. (2003), (C) Cristo-Araújo et al. (2010), and (D) Santos et al. (2011). CC = black points; BAG = gray points.

Table 1. Accessions of domesticated and wild populations included in the Core Collection within the Peach palm Active Germplasm Bank. Class - domesticated vs wild; Race/pop - designation of landrace or other taxonomic unit; BAG - number of accessions currently in the BAG; CC - number of accessions designated to the Core Collection; geographical location of the accessions in Brazil (BR), Colombia (CO), Costa Rica (CR), Ecuador (EC), Panama (PA), Peru (PE); No. id. - INPA identification codes