Structure of the phenotypic variability of fruit and seeds of Dipteryx alata Vogel (Fabaceae)

Abstract
Dipteryx alata Vogel (“baru tree”) is a fruit species native to the Brazilian Cerrado with a multiplicity of uses, which makes it a priority species for domestication and cultivation studies. The objective of the current study was to characterize fruit and seeds of D. alata from several regions of the Brazilian Cerrado biome to support strategies for ex situ conservation and utilization of the genetic variability of the species. Fruits were collected from 25 subpopulations, sampling six mother plants per subpopulation, with collection of at least 25 fruits per plant. The physical trait data of the fruits and seeds were submitted to descriptive analysis, analysis of variance and analysis of correlation among traits. There was significant variation for all traits among subpopulations and among individuals within subpopulations. The highest proportion of variability was observed among individuals within subpopulations. The phenotypic differentiation among subpopulations was higher for fruit traits than for seed traits. The correlation analysis showed significant correlations for most of the trait pairs at the different hierarchical levels. It was concluded that D. alata presents sufficient phenotypic variability to support germplasm collection programs and the formation of base populations for breeding programs, and that sampling in several locations is recommended to ensure an adequate representativeness of the phenotypic variability.


Introduction
The Brazilian Cerrado is the second largest biome in South America, occupying 23% of the Brazilian territory and surpassed in area only by the Amazon biome. Its vegetation comprises several phytophysiognomies, and it has the richest flora among the savannas of the world (MYERS et al., 2000; KLINK and MACHADO, 2005; MENDONÇA et al., 2008). It is considered one of the 35 hotspots in the world for conservation purposes, due to its biological diversity, endemism and threat from human occupation (MYERS et al., 2000; MITTERMEIER et al., 2011). Studies indicate that about 61% of this biome is still covered by native vegetation in a relatively intact state, although the distribution is highly asymmetric: the northern part retains 90% of its natural physiognomies and the southern part only 15% (SANO et al., 2010a).
The native fruit trees of the Cerrado, from different genera and families, constitute an important source of food for animals and produce fruits of interest both for fresh consumption and for industrial processing. There is a potential and growing market for native fruit, but it is little explored by farmers (MING et al., 2000). At the same time, Brazilian socio-biodiversity product chains have been presented as solutions to threats to Brazilian biomes (DINIZ and CERDAN, 2017). Native fruits have been used in school feeding programs, food acquisition programs and in a policy of guaranteeing minimum prices for products, in recognition of their socio-environmental and nutritional importance (MESSIAS and CAMARGO, 2016; BRASIL, 2018). In this perspective, interest in these fruits has reached several sectors of society, notably farmers, housewives, industries, traders, research and technical assistance institutions, cooperatives, university centers, and health and education agencies, among others (PEREIRA and PASQUALETO, 2011). The concentration of efforts on the genetic conservation of fruit species must be established based on the genetic variation among and within populations, seeking to collect and preserve the maximum genetic variability of the species (PAIVA et al., 2003).
Dipteryx alata Vogel (Fabaceae), the baru tree, is a leguminous tree that occurs in several Cerrado phytophysiognomies (LORENZI, 1998). Its wide geographical distribution suggests that the species may have high levels of genetic diversity, conferring the capacity to occupy different habitats (KAGEYAMA et al., 2003). This species has a multiplicity of uses, constituting a key species for studies of domestication and cultivation. It is used as food by local human populations and by several groups of animals, such as bats, monkeys, rodents, birds, ants and domesticated cattle. The pulp (mesocarp) and seed are edible and have a pleasant taste. The toasted seed (baru nut) has high nutritional value and is the main product, achieving high prices in the specialized market. The oil extracted from the seed is used in the food and pharmaceutical industries. In pastures, the tree is beneficial as a shelter for cattle, for the energy and nutritional value of its fruits and for the maintenance of forage quality. The wood is dense and compact, with high durability and high resistance to rot, and is indicated for piles and civil construction; the tree is also indicated for landscaping and recovery of degraded areas (FERREIRA, 1980; ALMEIDA, 1998; SANO et al., 2004).
It is important to sample fruits and seeds from different geographical locations to characterize a fruit species morphologically, since this allows the phenotypic differences determined by genetic and environmental variation to be verified. Even within a single species, the seeds at each location are subject to environmental variations that highlight certain aspects of their genetic composition; thus, by studying different origins, one can capture the various expressions of the genotype allowed by the environmental conditions (BOTEZELLI et al., 2000).
Although still limited, studies have been carried out to characterize the variability of quantitative traits in D. alata, such as those of Melhem (1974), who worked on the characterization and physiology of the baru tree; Sano et al. (1999), who sampled fruits of 57 mother plants from the states of Goiás and Minas Gerais; Corrêa et al. (2000), who sampled fruits of 150 plants from three regions of the state of Goiás; and Corrêa et al. (2008), who studied fruit morphological traits of 36 mother plants from different municipalities in the state of Goiás, evaluating 20 fruits per plant. In general, these studies were based on samples from a small number of populations in restricted geographical areas. Some studies with molecular markers show a pattern of genetic variability with low polymorphism compared to other species from the Cerrado biome, moderate to high diversity among subpopulations, a decreasing pattern of intrapopulational variability from the southwest to the rest of the biome, a possible effect of landscape changes on the structure of genetic variability and a certain tolerance to future climate changes (SOARES et al., 2008a; SOARES et al., 2008b; MELO et al., 2011; DINIZ-FILHO et al., 2012a; DINIZ-FILHO et al., 2012b; COLLEVATTI et al., 2013; TELLES et al., 2014; SOARES et al., 2015; GUIMARÃES et al., 2019a; GUIMARÃES et al., 2019b).
The current study aimed to obtain information on the patterns of phenotypic variability of D. alata fruit and seed traits from a wide sampling across the Cerrado biome, in order to support strategies for the conservation and use of the genetic variability of the species.

Material and methods
The material for this study was collected in 25 subpopulations in the following Brazilian states: Goiás (GO), Tocantins (TO), Mato Grosso (MT), Mato Grosso do Sul (MS), Minas Gerais (MG) and São Paulo (SP) (Figure 1). In each subpopulation, six mother plants were randomly sampled from among those with sufficient fruit production, and at least 25 undamaged fruits were collected per plant. A random sample of five fruits per plant was used for physical characterization. After the fruit data were recorded, the fruits were opened for evaluation of the seed traits, maintaining the identification of each individual seed.
Fruits were collected at physiological maturity, that is, when they detached easily from the branches or had already fallen to the ground under the canopy. Once harvested, the fruits were packaged in plastic mesh bags, labeled with the subpopulation and plant numbers, and then transported to the laboratory at the Federal University of Goiás, in Goiânia. Each mother plant was geo-referenced using a GPS receiver and each subpopulation was plotted on the map using its central geographic coordinates (Figure 1). The evaluated traits were: fruit mass (FM), fruit length (FL), fruit width (FW), fruit thickness (FT), length-to-width ratio of the fruit (FL/FW), length-to-thickness ratio of the fruit (FL/FT), seed mass (SM), seed length (SL), seed width (SW), seed thickness (ST), length-to-width ratio of the seed (SL/SW), length-to-thickness ratio of the seed (SL/ST) and seed yield (SY). Mass was measured on a semi-analytical digital scale and expressed in grams (g). Dimensions were measured with a digital caliper and expressed in millimeters (mm). Seed yield was obtained as the ratio between seed mass and fruit mass (SY = SM/FM).
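The derived traits listed above follow directly from the measured ones. The sketch below shows that arithmetic for a single fruit and its seed; the function name and the numerical inputs are hypothetical illustrations, not data from this study.

```python
def derived_traits(fm, fl, fw, ft, sm, sl, sw, st):
    """Ratio traits and seed yield for one fruit and its seed.

    Masses (fm, sm) in grams; dimensions in millimeters.
    """
    return {
        "FL/FW": fl / fw,  # length-to-width ratio of the fruit
        "FL/FT": fl / ft,  # length-to-thickness ratio of the fruit
        "SL/SW": sl / sw,  # length-to-width ratio of the seed
        "SL/ST": sl / st,  # length-to-thickness ratio of the seed
        "SY": sm / fm,     # seed yield, SY = SM/FM
    }


# Hypothetical measurements, chosen only to illustrate the calculation.
example = derived_traits(fm=28.0, fl=52.0, fw=40.0, ft=29.0,
                         sm=1.26, sl=25.0, sw=10.0, st=8.0)
print({name: round(value, 3) for name, value in example.items()})
```

Seed yield is returned as a proportion; multiplying by 100 gives the percentage scale used in the Results.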
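The data were submitted to descriptive statistics and, subsequently, to analysis of variance based on a hierarchical model with the effects of subpopulations, plants within subpopulations and fruits within plants. The model used was $Y_{ijk} = \mu + s_i + p_{j(i)} + f_{k(ij)}$, where $Y_{ijk}$ is the observation on fruit $k$ of plant $j$ in subpopulation $i$; $\mu$ is the overall mean; $s_i$ is the effect of subpopulation $i$ ($i = 1, 2, \ldots, 25$); $p_{j(i)}$ is the effect of plant $j$ within subpopulation $i$ ($j = 1, 2, \ldots, m_i$); and $f_{k(ij)}$ is the effect of fruit $k$ within plant $j$ of subpopulation $i$ ($k = 1, 2, \ldots, f_j$). In this case, the number of mother plants per subpopulation ($m_i = 6$) and of fruits per mother plant ($f_j = 5$) was constant, so the data set was balanced. The scheme for the analysis of variance and the expectations of the mean squares (Table 1) were elaborated according to the adopted statistical model.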
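For a balanced three-level design such as this one, the sums of squares and the variance components can be obtained in a few lines. The sketch below is a minimal illustration, not the procedure of the Genes software: it assumes that all effects are random, so that the expected mean squares take the usual nested form, and it runs on simulated values (the effect sizes passed to the random generator are arbitrary).

```python
import random
from statistics import fmean


def nested_anova(y):
    """Balanced nested ANOVA; y[i][j][k] is fruit k of plant j in subpopulation i."""
    s, m, f = len(y), len(y[0]), len(y[0][0])
    plant_means = [[fmean(plant) for plant in subpop] for subpop in y]
    subpop_means = [fmean(row) for row in plant_means]
    grand_mean = fmean(subpop_means)  # valid because the design is balanced

    ss = {
        "subpopulations": m * f * sum((mu - grand_mean) ** 2 for mu in subpop_means),
        "plants": f * sum((plant_means[i][j] - subpop_means[i]) ** 2
                          for i in range(s) for j in range(m)),
        "fruits": sum((y[i][j][k] - plant_means[i][j]) ** 2
                      for i in range(s) for j in range(m) for k in range(f)),
    }
    df = {"subpopulations": s - 1, "plants": s * (m - 1), "fruits": s * m * (f - 1)}
    ms = {source: ss[source] / df[source] for source in ss}

    # Method-of-moments estimates from the expected mean squares (random effects):
    # E(MS_fruits) = var_f
    # E(MS_plants) = var_f + f * var_p
    # E(MS_subpopulations) = var_f + f * var_p + m * f * var_s
    var = {
        "subpopulations": (ms["subpopulations"] - ms["plants"]) / (m * f),
        "plants": (ms["plants"] - ms["fruits"]) / f,
        "fruits": ms["fruits"],
    }
    f_tests = {
        "subpopulations": ms["subpopulations"] / ms["plants"],
        "plants": ms["plants"] / ms["fruits"],
    }
    return {"SS": ss, "df": df, "MS": ms, "F": f_tests, "var": var}


# Simulated data with the dimensions of this study (25 x 6 x 5).
rng = random.Random(1)
data = []
for _ in range(25):
    s_eff = rng.gauss(0, 3)
    subpop = []
    for _ in range(6):
        p_eff = rng.gauss(0, 6)
        subpop.append([28 + s_eff + p_eff + rng.gauss(0, 3) for _ in range(5)])
    data.append(subpop)

result = nested_anova(data)
print(result["var"])
```

Under this model the subpopulation mean square is tested against the mean square of plants within subpopulations, and the latter against the mean square of fruits within plants. Method-of-moments estimates can come out negative when a mean square is smaller than the one below it; such estimates are usually interpreted as zero.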
The phenotypic correlation coefficients between traits were also estimated, considering the different hierarchical levels. The analyses were performed based on the genetic-statistical procedure of the Genes software (CRUZ, 2013). The components of variance associated with the effects of the model and their proportions in relation to the total phenotypic variation were estimated according to the expectations of the ANOVA mean squares, as follows: proportion of variation among subpopulations ($P_S$), among plants within subpopulations ($P_{P/S}$) and among fruits or seeds within plants ($P_{F/P}$). Using the components of variance, the parameter $P_{ST}$ was also estimated, which measures the quantitative phenotypic divergence among subpopulations. The estimators for the parameters are:
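The estimator expressions are not legible in this copy of the text. The three proportions follow from their definition as shares of the total phenotypic variance. For $P_{ST}$, the sketch below assumes the commonly used form in which the among-subpopulation component is divided by itself plus twice the within-subpopulation (among-plant) component; this form is an assumption and may differ from the estimator actually used by the authors. Function and argument names are hypothetical.

```python
def variance_proportions(var_s, var_p, var_f):
    """Proportions of total phenotypic variance and an assumed P_ST.

    var_s: component among subpopulations
    var_p: component among plants within subpopulations
    var_f: component among fruits (or seeds) within plants
    """
    total = var_s + var_p + var_f
    return {
        "P_S": var_s / total,
        "P_P/S": var_p / total,
        "P_F/P": var_f / total,
        # Assumed form: among / (among + 2 * within), with plants as the within level.
        "P_ST": var_s / (var_s + 2.0 * var_p),
    }


# Hypothetical variance components, for illustration only.
p = variance_proportions(1.0, 2.0, 1.0)
print(p)
```

Because the three proportions share the same denominator, they always sum to one, whereas $P_{ST}$ ignores the fruit-within-plant component altogether.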

Results and discussion
The descriptive analysis of the 13 evaluated traits, based on the values of the coefficient of phenotypic variation (CV), showed greater variability for fruit mass (FM), seed mass (SM) and seed yield (SY) and lower variability for seed width (SW), fruit width (FW) and length-to-width ratio of the fruit (FL/FW) (Table 2).
The mass of the whole fruit ranged from 9.75 g to 72.97 g (Table 2), with an average of 28.13 g. The results reported for fruit mass by Sano et al. (1999), Corrêa et al. (2000) and Corrêa et al. (2008) are within the range found in the current study. An exception is the lower values found by Melhem (1974), with a range from 10 g to 28 g and an average of 18 g. FM presented the highest relative phenotypic variation among the evaluated fruit traits, with a phenotypic coefficient of variation of 33.09% (Table 2). Corrêa et al. (2000) also observed a greater variation for this variable. Subpopulations 11 (Estrela do Norte, GO) and 25 (Cáceres, MT) had the highest means for FM (Table 2). The greater amplitudes of variation found here were expected due to the large geographical area sampled, covering greater environmental variation and, probably, greater genetic variation.
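As a reading aid for the CV values, the sketch below shows the calculation and back-calculates the standard deviation implied by the reported CV and mean of fruit mass. It assumes the CV is the sample standard deviation expressed as a percentage of the mean; the helper names are hypothetical.

```python
from statistics import fmean, stdev


def coefficient_of_variation(values):
    """Phenotypic CV (%): sample standard deviation relative to the mean."""
    return 100.0 * stdev(values) / fmean(values)


def implied_sd(cv_percent, mean):
    """Standard deviation implied by a reported CV (%) and mean."""
    return cv_percent / 100.0 * mean


# Reported for fruit mass: CV = 33.09%, mean = 28.13 g.
fm_sd = implied_sd(33.09, 28.13)
print(round(fm_sd, 2))  # about 9.31 g
```

A standard deviation of roughly 9.3 g around a mean of 28.13 g is consistent with the wide observed range of 9.75 g to 72.97 g.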
The fruit length (FL) varied from 36.20 mm to 77.42 mm with a mean of 52.27 mm (Table 2). This wide amplitude of variation agrees with the study of Corrêa et al. (2000). On the other hand, Silva et al. (1994), Ferreira et al. (1998), and Corrêa et al. (2008) found a smaller amplitude of variation, but within the range of this study. The subpopulations with the highest means for this trait were 11 (Estrela do Norte, GO) and 3 (Pirenópolis, GO) (Table 2). The values of fruit width and fruit thickness (Table 2) are also in accordance with data from the literature (SILVA et al., 1994; FERREIRA et al., 1998; CORRÊA et al., 2000; CORRÊA et al., 2008), with a trend toward larger variation intervals, as expected due to the larger sample size.
The length-to-width ratio of the fruit (FL/FW) and the length-to-thickness ratio of the fruit (FL/FT) presented means of 1.31 and 1.80, respectively, characterizing ovate and flat fruits. FL/FW presented the lowest coefficient of variation among the evaluated traits (7.96%), much smaller than the highest value, which was 33.09% for FM. The variation interval for this trait was relatively small, which demonstrates a similarity in the shape of the fruit among subpopulations.
The physical characterization of the seeds presented the following averages (Table 2) Sano et al. (1999; 2010b) and with a similar mean to that found by Corrêa et al. (2008). Here the mean of SM was slightly lower than that found by Silva et al. (1994), which was 1.50 g. These differences relative to other studies can be explained by differences in sampling procedures and extent. The amplitudes of variation observed for the fruit and seed traits are generally higher than those of the cited studies, probably due to the greater representativeness of the sample of this study. The mean values of the traits associated with the seed dimensions (SL, SW and ST) and their ratios characterize long seeds with an oval cross-section. Among the sampled subpopulations, 11 (Estrela do Norte, GO) presented the highest mean values for the fruit traits, with the highest averages for fruit mass (FM) and fruit length (FL). Also noteworthy are subpopulations 25 (Cáceres, MT), 18 (Indiara, GO) and 3 (Pirenópolis, GO), which had a good average performance. The fruits of these subpopulations are heavier and larger, making them more attractive to the fresh market.
The subpopulations that stood out in average performance for seed variables were 25 (Cáceres, MT) and 11 (Estrela do Norte, GO), followed by subpopulations 19 (Barra do Garças, MT), 21 (Jandaia, GO) and 4 (Sonora, MS). Notably, the two subpopulations with the highest average values for fruit traits were also the highest for seed traits, showing a positive association between these variables. These subpopulations presented morphologically well-developed and heavier seeds. It should be noted that the greatest economic appeal of D. alata lies in the nutritional value of its seed, which has high levels of proteins and lipids and a high concentration of unsaturated fatty acids and some minerals (VERA et al., 2009). It is used in food in different ways, including sweets, jellies, liqueurs, ice cream, pies, chocolates, cereal bars, flour and oil, among others. However, the product with the greatest economic appeal is the roasted nut, which reaches high prices in the specialized market. Current demand is usually supplied by community associations and family farmers (CANDIL et al., 2007), with a still incipient production chain. However, some large-scale planting initiatives already exist, requiring the development of cultivation techniques and selected populations. Examples are the plantings on farms such as Grupo Tropical, in São Luiz do Norte, GO; Santa Julieta Bio, in Santa Cruz da Conceição, SP; and Agropecuária Kehler, in Brejinho de Nazaré, TO. There are also several associations of farmers and extractivists in different regions of the Cerrado biome that exploit baru tree products in addition to other native fruit species.
The seed yield (SY), measured by the SM/FM ratio, showed a mean value of 4.50%; therefore, on average, 95.50% of the fruit mass corresponds to the endocarp and pulp. This low yield, combined with the difficulty in extracting the seed from the fruit, constitutes a major challenge for the use of the baru nut. Subpopulations 1 (Cocalinho, MT), 4 (Sonora, MS), 15 (Paraíso, MS), 22 (Natividade, TO) and 23 (Arraias, TO) had the highest values for seed yield, despite low means for fruit and seed mass. An exception is subpopulation 4 (Sonora, MS), which presented adequate values for seed traits combined with a high value of SY, even with low values for FM.
These results demonstrate that there is wide phenotypic variation among localities for some of the traits under study, mainly the mass variables, which is relevant for the purposes of conservation and genetic improvement. The large variation in the results, when compared with those of other authors, supports the statement by Sano and Simon (2008) that the production of baru fruits is, in general, extremely variable among years. These authors also mention that baru fruits have larger dimensions in years of smaller production and vice versa. Thus, because the subpopulations of this study were sampled in a year of greater production, the dimensions found here may increase in years of smaller production.
The hierarchical analysis of variance showed significant variation among subpopulations for all traits except seed width (SW), seed thickness (ST) and the ratios of seed length (SL) to these two variables (Table 3), which indicates some uniformity among subpopulations in the shape of the seed. Assuming that part of the significant variation is determined by genetic causes, these results reinforce the possibility of selecting provenances with desirable characteristics, including seed mass and length. The variation among plants within subpopulations was highly significant for all traits. Considering only the mass variables, the partition of the variation among plants within each subpopulation (data not shown) showed significant variation for fruit mass (FM) within all subpopulations and for seed mass (SM) in 23 out of 25 subpopulations. This shows the feasibility of selecting superior mother plants from different provenances.
Of all the variation observed, most is due to the phenotypic variation among plants within subpopulations ($P_{P/S}$), as can be seen in the estimated proportions of variation for each trait (Table 3), which ranged from 50.47% for the length-to-width ratio of the fruit (FL/FW) to 69.58% for the length-to-thickness ratio of the fruit (FL/FT). This greater proportion of variation among plants within subpopulations in relation to the variation among subpopulations is expected for allogamous species and agrees with results already reported for D. alata (SANO et al., 1996; SANO et al., 1999; CORRÊA et al., 2000) and for other fruit species from the Cerrado biome (SILVA et al., 2001; VERA et al., 2005; TRINDADE and CHAVES, 2005; GANGA et al., 2009; MOURA et al., 2013; NOVAES et al., 2018). The phenotypic variation among subpopulations and among mother plants within subpopulations is influenced by uncontrolled environmental factors, such as soil, climate, anthropic interference, plant age and competition, and also by the genetic differences among individuals.
For the variation among fruits within plants, the proportions differ between fruit and seed traits (Table 3). For all fruit traits, the proportion of variation within plants ($P_{F/P}$) was below 20% and was lower than the proportion among subpopulations, showing a certain uniformity of fruits within each plant. For seed traits, this proportion varied from 24.25% to 37.90%, values intermediate between the proportions among subpopulations and among plants within subpopulations. Assuming that the heritability of the traits is similar among subpopulations and among plants, the proportions of the variance components allow predicting greater efficiency of selection among progenies than of selection among provenances.
The $P_{ST}$ values for the fruit traits ranged from 0.099 to 0.234. With the exception of the FL/FT ratio, the values were greater than 0.150. For the seed traits, the values ranged from 0.034 to 0.137. Thus, the estimated $P_{ST}$ values for the fruit traits were higher than those for the seed traits. The $P_{ST}$ parameter is a measure of the quantitative phenotypic differentiation among subpopulations and is a surrogate for the $Q_{ST}$ parameter, which measures the quantitative genetic differentiation among subpopulations (BROMMER, 2011). These parameters can be used as indicators of the selection forces that shaped the current structure of population variability. To make inferences about evolutionary processes, the $Q_{ST}$ parameter must be compared with the non-adaptive variation, which can be assessed by neutral molecular markers (WHITLOCK and GUILLAUME, 2009; NOVAES et al., 2018). The difference in $P_{ST}$ values between fruit and seed traits allows the hypothesis that different evolutionary forces act on these traits, which must be evaluated in a specific study.
The significant variability among subpopulations, together with the large amplitude of variation found for most of the evaluated traits of the baru tree, emphasizes that the species performs differently among the collection localities. Thus, subpopulations, or collection sites, with better performance for traits of interest to the market can be recommended. Based on the means for fruit and seed traits, subpopulations 25 (Cáceres, MT) and 11 (Estrela do Norte, GO) stand out, although with below-average seed yield. Subpopulation 4 (Sonora, MS) is noteworthy for having the second highest seed mass (SM) and the third highest seed yield, the latter owing to its below-average fruit mass. In addition to the average performance, the indication of the best localities for collection also depends on the internal variability of each subpopulation. Based on seed mass (SM), the variable with the highest economic value, subpopulation 4 (Sonora, MS) also stands out for its internal variability. Subpopulation 11 is among the five with the greatest variability, and subpopulation 25 (Cáceres) has intermediate variability. The variability of each subpopulation in this study was estimated from a sample of only six plants, so it must be interpreted with caution.
The coefficients of phenotypic correlation among the variables at the different hierarchical levels (among subpopulations, among mother plants within subpopulations and among fruits within plants) showed a positive and significant correlation between most pairs of the evaluated traits (Table 4). The correlation at the level of fruits within plants is due only to environmental (non-genetic) effects; at the other levels, the genetic and environmental effects are confounded. In these cases, when the phenotypic correlation coefficient is of high magnitude in absolute value, it means that both the genetic
Rev. Bras. Frutic., Jaboticabal, 2020, v. 42, n. 5: (e-003)
Based on the total phenotypic correlation and on the correlation among subpopulation means, fruit mass (FM) was highly correlated with the dimensional traits of fruits and seeds and negatively correlated with SY, as expected (Table 4). Corrêa et al. (2000) observed the same trend of correlation between fruit mass and the other traits. The high and positive correlations of fruit length (FL) with seed length (SL) and of SL with seed mass (SM) should be highlighted, as these variables are of interest to the market. In the analysis of correlation among plants within subpopulations, as well as among fruits within mother plants, FM was also highly correlated with the other fruit traits and showed the same negative correlation with SY. The high and positive correlation estimates between FL and SL and between SM and SL were confirmed at these levels, as was observed for the correlation among subpopulations.
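The correlations at the three hierarchical levels can be illustrated by correlating, for two traits, the subpopulation means, the deviations of plant means from their subpopulation mean, and the deviations of individual fruits from their plant mean. The sketch below is a minimal illustration under that assumption (which, for balanced data, is equivalent to using the sums of squares and products of each source of variation); it is not the routine of the Genes software, and the toy data are hypothetical.

```python
from math import sqrt
from statistics import fmean


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = fmean(a), fmean(b)
    sp = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    ss_a = sum((u - ma) ** 2 for u in a)
    ss_b = sum((v - mb) ** 2 for v in b)
    return sp / sqrt(ss_a * ss_b)


def split_levels(t):
    """Split nested data t[i][j][k] into means and deviations for each level."""
    plant_means = [[fmean(plant) for plant in subpop] for subpop in t]
    subpop_means = [fmean(row) for row in plant_means]
    plant_dev = [pm - subpop_means[i]
                 for i, row in enumerate(plant_means) for pm in row]
    fruit_dev = [v - plant_means[i][j]
                 for i, subpop in enumerate(t)
                 for j, plant in enumerate(subpop) for v in plant]
    return subpop_means, plant_dev, fruit_dev


def level_correlations(x, y):
    """Phenotypic correlation between traits x and y at each hierarchical level."""
    sx, px, fx = split_levels(x)
    sy, py, fy = split_levels(y)
    return {
        "among subpopulations": pearson(sx, sy),
        "among plants within subpopulations": pearson(px, py),
        "among fruits within plants": pearson(fx, fy),
    }


# Toy data: 2 subpopulations x 2 plants x 2 fruits; trait_y is a linear
# function of trait_x, so the correlation equals 1 at every level.
trait_x = [[[1.0, 2.0], [4.0, 6.0]], [[10.0, 13.0], [20.0, 24.0]]]
trait_y = [[[2 * v + 1 for v in plant] for plant in subpop] for subpop in trait_x]
r = level_correlations(trait_x, trait_y)
print(r)
```

With real data the three coefficients can differ in magnitude and even in sign, which is why the text reports them separately for each level.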
The negative correlation of all fruit variables with seed yield (SY), although expected from the mathematical relation of SY with fruit mass, represents a difficulty for selecting plants with both larger seeds and high yield. On the other hand, the close relation of these fruit traits with the mass and dimensions of the seed allows the field selection of plants with larger fruits, resulting in seeds with superior characteristics, since it would be impractical to evaluate seed traits in the field. The best alternative for using baru genetic resources could then be to select plants with larger fruits and seeds, even with lower yield, and to direct efforts toward using the pulp and even the endocarp as by-products.

Conclusions
The results of this study allow us to affirm that there is high phenotypic variability for most of the physical traits of D. alata fruits and seeds at all tested hierarchical levels, except among subpopulations for seed width, seed thickness, and the length-to-width and length-to-thickness ratios of the seed. The largest proportion of phenotypic variability is among mother plants within subpopulations. The phenotypic divergence among subpopulations measured by the $P_{ST}$ parameter is greater for fruit traits than for seed traits. In order to support a conservation and breeding program for the species, it is recommended to sample a large number of subpopulations to ensure an adequate representativeness of the observed variability. The size of the fruit (mass and dimensions) is highly correlated with the size of the seed and can be used for mass selection of mother plants in the field in a collection program for purposes of genetic improvement. Based on the general performance of the subpopulations, the priority collection localities that can be recommended for a breeding program are subpopulations 4 (Sonora, MS), 11 (Estrela do Norte, GO) and 25 (Cáceres, MT). It would be advisable to expand the collection areas in these regions, focusing on traits of interest to the market.