UNIVARIATE AND MULTIVARIATE APPROACHES IN THE CHARACTERIZATION OF LIMA BEAN GENOTYPES

To estimate the genetic diversity among the genotypes of a germplasm collection, a combined analysis of qualitative and quantitative variables was performed. The objective was to carry out the morpho-agronomic characterization and estimate the genetic diversity of lima bean genotypes belonging to the Phaseolus Germplasm Bank of UFPI through univariate and multivariate approaches. The experiment was conducted in a screen house between January and September 2016, using a completely randomized design with four replications, with each plot consisting of one pot containing one plant. An analysis of variance of the quantitative characteristics was performed, followed by a comparison of means. The combined analysis of the quantitative and qualitative variables was based on the Gower distance. Subsequently, the genotypes were grouped by the UPGMA method, from which five groups were formed. The lima bean genotypes showed wide genetic variability in relation to morpho-agronomic characteristics.


INTRODUCTION
Lima bean (Phaseolus lunatus L.) is also known as broad bean. It has high genetic diversity and production potential. In addition, this legume is highly adapted to the edaphoclimatic conditions of the semiarid region, with considerable social and economic importance (BARBOSA; ARRIEL, 2018; NOBRE et al., 2012; OLIVEIRA et al., 2011).
Due to the low use of technology by family farmers, productivity is low and lima bean production fluctuates widely (OLIVEIRA et al., 2014). Studies on the cultivation of lima beans, mainly genetic and breeding studies, are still incipient in Brazil. This low level of research has led to limited knowledge of the agronomic characteristics and potential of lima beans. Thus, its cultivation is limited.
The maintenance and conservation of lima bean varieties in germplasm banks are very useful for genetic breeding, since they allow the identification of possible duplicates and provide parameters for selecting and obtaining superior genotypes in segregating generations (SUDRÉ et al., 2005). Thus, the Phaseolus Germplasm Bank at Universidade Federal do Piauí (BGP-UFPI) was created to preserve the genetic diversity of lima beans. The germplasm bank has more than 1050 accessions of lima beans from the Northeast, Southeast, and Midwest regions, as well as from the Germplasm Resources Information Network (GRIN) and the International Center for Tropical Agriculture (CIAT).
A morpho-agronomic characterization study was carried out with genotypes available in the germplasm bank. The study aimed to support the development of genetic breeding programs and contribute to solving cultivation challenges (FERRAZ et al., 2016). It was expected to provide phenotypic information, which is essential for determining the genetic divergence between the studied genotypes.
Multivariate analyses have been used to estimate genetic diversity in various crops, such as soybean (SANTOS et al., 2013; SANTOS et al., 2011) and beans (SCHMIT et al., 2016). This information is particularly useful in the selection of genotypes, since it is based on numerous characteristics for the evaluation of germplasm (SUDRÉ et al., 2005; DALLASTRA et al., 2014). Thus, this type of analysis has driven studies of genetic divergence between germplasm bank genotypes (ROCHA et al., 2010).
A large volume of data (qualitative and quantitative) can be one of the factors that hinder the analysis and interpretation of the results of characterization and evaluation of the germplasm, resulting in an incomplete distinction between the genotypes (ROCHA et al., 2010). Thus, multivariate analysis can provide a better indication of genetic variability in germplasm banks. A technique that allows the simultaneous analysis of quantitative and qualitative data was proposed by Gower (1971). The tool is an algorithm that estimates the similarity between two individuals by using continuous and discrete distribution data.
Thus, this study aimed to perform morphoagronomic characterization and estimate the genetic diversity of lima bean genotypes from the Phaseolus Germplasm Bank at UFPI using univariate and multivariate approaches.

MATERIAL AND METHODS
The experiment was conducted from January to September 2016 in a covered area (50% shading rate) located in the Department of Plant Science, at the Agricultural Sciences Center of the Universidade Federal do Piauí, Teresina, PI (72.7 m altitude, 05°05'05'' S latitude and 42°05' W longitude).
The 22 genotypes of lima beans used belonged to the Phaseolus Germplasm Bank at the Universidade Federal do Piauí (BGP-UFPI), located in the Laboratory of Genetic Resources and Plant Breeding (Table 1). The experiment was carried out using a completely randomized design, with four replications. A pot (25 L) with fertilized vegetable soil in a 3:1 ratio was used as the experimental plot, with one plant per pot. Irrigation, using a microsprinkler, and phytosanitary treatments for pest control were carried out throughout the crop cycle. The plants were staked with laths. The accessions used in the experiment showed indeterminate growth.
Quantitative data were grouped using the test proposed by Scott and Knott at a 5% probability level (SCOTT; KNOTT, 1974). The genetic distance matrix was estimated from the quantitative and qualitative variables jointly, based on the Gower algorithm (1971), which is expressed as follows:
$$S_{ij} = \frac{\sum_{k=1}^{p} W_{ijk} S_{ijk}}{\sum_{k=1}^{p} W_{ijk}}$$
where k indexes the variables (k = 1, 2, ..., p); i and j = two individuals that represent accessions; W_{ijk} = weight attributed to the ijk comparison, which takes a value of 1 for valid comparisons and a value of 0 for invalid comparisons (when the value of the variable is absent in one or both individuals); S_{ijk} = contribution of variable k to the similarity between individuals i and j, with values between 0 and 1. For a qualitative (nominal) variable, if the value of variable k is the same for both individuals, i and j, then S_{ijk} = 1; otherwise, it is equal to 0. For a quantitative (continuous) variable,
$$S_{ijk} = 1 - \frac{|x_{ik} - x_{jk}|}{R_k}$$
in which x_{ik} and x_{jk} are the values of variable k for individuals i and j, respectively, and R_k is the range (maximum value minus minimum value) of variable k in the sample. The division by R_k eliminates the differences between the scales of the variables, yielding a value within the range [0, 1] and equal weights.
The genotype groupings were obtained using the unweighted pair-group method using an arithmetic average (UPGMA), Ward's, and Group nearest neighbor (GNN) methods. The clusters were validated by the cophenetic correlation coefficient (CCC) (SOKAL; ROHLF, 1962). Data were analyzed using the R (R DEVELOPMENT CORE TEAM, 2018) and GENES (CRUZ, 2013) programs.
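As an illustration of the Gower index described above, the following is a minimal Python sketch, not the R or GENES routine used in the study. The three accessions, their values, and all function names are hypothetical. Two points are assumptions made here rather than statements from the original text: the conversion of similarity to distance as 1 - S_ij, and scoring a quantitative variable with zero range as fully similar.

```python
from typing import List, Optional, Sequence, Union

Value = Optional[Union[float, str]]  # None marks a missing observation


def variable_ranges(data: Sequence[Sequence[Value]], kinds: Sequence[str]) -> List[float]:
    """Range R_k (max minus min) of each quantitative variable in the sample."""
    ranges = []
    for k, kind in enumerate(kinds):
        if kind == "num":
            vals = [row[k] for row in data if row[k] is not None]
            ranges.append(max(vals) - min(vals))
        else:
            ranges.append(0.0)  # not used for nominal variables
    return ranges


def gower_similarity(a: Sequence[Value], b: Sequence[Value],
                     kinds: Sequence[str], ranges: Sequence[float]) -> float:
    """Gower similarity S_ij = sum(W_ijk * S_ijk) / sum(W_ijk)."""
    num = 0.0
    den = 0.0
    for x, y, kind, r in zip(a, b, kinds, ranges):
        if x is None or y is None:
            continue  # invalid comparison: W_ijk = 0
        if kind == "cat":
            s = 1.0 if x == y else 0.0
        elif r > 0:
            s = 1.0 - abs(x - y) / r
        else:
            s = 1.0  # assumption: a constant variable counts as fully similar
        num += s   # W_ijk = 1 for every valid comparison
        den += 1.0
    if den == 0:
        raise ValueError("no valid comparisons between the two individuals")
    return num / den


def gower_distance_matrix(data: Sequence[Sequence[Value]],
                          kinds: Sequence[str]) -> List[List[float]]:
    """Pairwise distances, taken here as 1 - S_ij (an assumption of this sketch)."""
    ranges = variable_ranges(data, kinds)
    n = len(data)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = 1.0 - gower_similarity(data[i], data[j], kinds, ranges)
            dist[i][j] = dist[j][i] = d
    return dist


# Hypothetical data: days to flowering, 100-seed weight (g), flower wing color
KINDS = ("num", "num", "cat")
ACCESSIONS = [
    (50.0, 40.0, "white"),
    (150.0, 100.0, "white"),
    (100.0, None, "pink"),  # missing seed weight
]

if __name__ == "__main__":
    for row in gower_distance_matrix(ACCESSIONS, KINDS):
        print([round(d, 3) for d in row])
```

In this toy sample, the third accession lacks a seed weight, so its comparisons are averaged over the two remaining variables only, which is how the weights W_ijk handle missing data.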

RESULTS AND DISCUSSION
The analysis of the means for quantitative traits demonstrated variability between the descriptors. This indicates that there is sufficient genetic variability among the genotypes, which can be used for the selection of the most promising ones. However, the descriptor "number of seeds per pod" could not be used in this process, as all genotypes showed two seeds per pod (Tables 2 and 3).
The descriptor "pod maturation period" (PMP) showed the greatest variability among the genotypes, which were organized into fourteen distinct groups. In addition, eleven groups were formed for each of the descriptors "start flowering period" (SFP), "flowering duration" (FD), and "weight of 100 seeds" (100SW). The descriptors "bunch length" (BL) and "seed length" (SL) allowed the formation of nine distinct groups. According to Mello (2011), in terms of economic importance, the descriptors that stand out are the number of days to flowering, the pod descriptors (length, width, and number of seeds), and the seed descriptors (thickness, length, width, and weight). These descriptors contribute to production and increase the commercial value of lima beans. In addition, the "start flowering period" is an important descriptor because it involves precocity.
Regarding the descriptor BL, values ranged from 4.96 cm (UFPI 862) to 24.38 cm (UFPI 220). For the descriptor SFP, values ranged from 45 days (UFPI 866) to 154 days (UFPI 856). Comparable results were reported by Oliveira et al. (2011), who observed the beginning of flowering at 55 days in the earliest genotypes and at 107 days in the latest genotype. Santos et al. (2002) studied eight varieties and found that the earliest variety started flowering at 49 days and the latest at 71 days.
*GEN = genotypes; NLP = number of locules per pod; NSP = number of seeds per pod; SL = seed length; SW = seed width; ST = seed thickness; LWR = length/width ratio; 100SW = weight of one hundred seeds. *The means identified by the same letter do not differ by the Scott-Knott test (P > 0.05).
Genotypes with earlier flowering are preferable in the climate of the Northeast region, where temperature increases progressively over the months. Thus, genotypes that take longer to bloom are more likely to suffer damage, such as abortion. Precocity is an important plant characteristic, as it represents the possibility of up to three crops per year, including rainfed and irrigated crops. This characteristic allows increasing and/or stabilizing production in regions with long periods of drought (FREIRE FILHO et al., 2005). The UFPI 866 genotype showed the best result for the descriptor "start flowering period". For the descriptor "flowering duration", the UFPI 220 genotype was superior to the other genotypes, with a flowering period of 113 days.
The pod descriptors, in general, showed results that could be used to differentiate the genotypes. For the descriptor "maturation period", the UFPI 220 genotype showed the earliest maturation, at 101 days. The "average length of the pod" varied from 4.37 cm (UFPI 836) to 10.07 cm (UFPI 852), which was in accordance with the results obtained by Santos et al. (2002), who found a variation from 6.2 to 8.9 cm. The descriptor "pod width" varied from 1.18 cm (UFPI 220) to 2.13 cm (UFPI 842). This range was wider than that reported by Oliveira et al. (2011), who observed a variation from 1.43 cm to 1.77 cm between the genotypes studied. For "pod weight", the UFPI 852 genotype showed the highest average, with 3.17 g. This characteristic is important, considering that it is linked to productivity. Regarding the number of locules per pod, the UFPI 861 genotype showed the highest average, with 3.6 locules per pod. The number of seeds per pod did not allow for differentiation of the genotypes, since all genotypes showed statistically similar results, with two seeds per pod (Table 3). Comparable results were observed by Oliveira et al. (2011), who also found a constant number of seeds per pod between the genotypes, with two seeds.
Regarding the seed descriptors, the UFPI 852 genotype showed seeds with the greatest length and width (22.32 mm and 13.42 mm, respectively). The weight of 100 seeds ranged from 34.56 g (UFPI 220) to 154.04 g (UFPI 858). These results partially corroborate those found by Santos et al. (2002), who obtained a variation from 32.6 to 79.5 g in the weight of 100 seeds of lima beans. In contrast, Guimarães et al. (2007) found a greater variation, ranging from 15.0 to 88.9 g. According to Silva and Freitas (1996), the weight of 100 seeds is one of the most important descriptors for grain yield in lima beans.
Genotype UFPI 852 showed the highest average pod length (10.04 cm), pod weight (3.17 g), seed length (22.32 mm), and seed width (13.42 mm). Guimarães et al. (2007) characterized 14 genotypes of lima beans and obtained the following seed measurements: average length (16.90 mm), width (11.70 mm), and thickness (6.10 mm). These results were lower than those reported in this study.
Considering the qualitative and quantitative descriptors of the 22 genotypes from the BGP-UFPI, groups were formed using the UPGMA hierarchical method based on the Gower distance (1971). The resulting dendrogram is shown in Figure 1. The cophenetic correlation between the genetic distance matrix and the cluster matrix was high (r = 0.85), which indicated good agreement between the dendrogram and the original genetic distances.
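The UPGMA grouping and its cophenetic validation can be sketched as follows. This is a minimal example assuming SciPy and NumPy are available; it is not the R or GENES workflow used in the study. The 5 x 5 distance matrix is hypothetical and merely stands in for the Gower matrix of the 22 genotypes, and the request for three groups is an arbitrary choice for the toy data, unrelated to the five groups reported here.

```python
import numpy as np
from scipy.cluster.hierarchy import cophenet, fcluster, linkage
from scipy.spatial.distance import squareform

# Hypothetical symmetric distance matrix for five accessions
# (a stand-in for a Gower distance matrix; values are made up).
D = np.array([
    [0.00, 0.10, 0.60, 0.65, 0.70],
    [0.10, 0.00, 0.55, 0.60, 0.75],
    [0.60, 0.55, 0.00, 0.15, 0.50],
    [0.65, 0.60, 0.15, 0.00, 0.45],
    [0.70, 0.75, 0.50, 0.45, 0.00],
])

# SciPy's linkage expects the condensed (upper-triangle) form.
condensed = squareform(D)

# method="average" is average linkage, i.e. UPGMA.
Z = linkage(condensed, method="average")

# Cophenetic correlation coefficient: Pearson correlation between the
# original distances and the distances implied by the dendrogram.
ccc, cophenetic_distances = cophenet(Z, condensed)

# Cut the tree into at most three groups (arbitrary for this toy example).
groups = fcluster(Z, t=3, criterion="maxclust")

if __name__ == "__main__":
    print("cophenetic correlation:", round(float(ccc), 3))
    print("group labels:", groups.tolist())
```

A coefficient close to 1 indicates that the dendrogram distorts the original distances very little. In practice the cutoff is often chosen by inspecting the dendrogram for an abrupt change in fusion level rather than by fixing the number of groups in advance.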
From the UPGMA grouping, with a cutoff point corresponding to an abrupt change in the graph, five groups were formed. The descriptors growth pattern and pod color did not contribute to the formation of these groups, as all studied genotypes had an indeterminate growth pattern and brown pods. Group I was composed of the following two genotypes: UFPI 220 and UFPI 866. The main characteristics of the plants from this group included oval leaflets, slightly pubescent leaves, branches oriented with a main stem and short lateral branches, colorless stem, white flower wings, pods with a short apex, pods randomly distributed in the bunches, and non-dehiscent, glabrous pods. Regarding the quantitative traits, these genotypes showed the lowest values for SFP, 59 days (UFPI 220) and 45 days (UFPI 866), and higher FD values, 113 days (UFPI 220) and 108 days (UFPI 866). The two genotypes were statistically equal for the descriptors PL, PWI, PWE, NLP, and 100SW. This group contained the genotypes with the lowest values for the quantitative seed and pod descriptors, except for seed thickness.
Group II was composed of three genotypes (UFPI 836, UFPI 865, and UFPI 872). The plants from this group showed oval, slightly pubescent leaflets, branches oriented with a main stem and short side branches, colorless stem, white flower wings, and pods with a long (UFPI 836), medium (UFPI 865), or thick (UFPI 872) apex; the pods are dehiscent, pubescent, and distributed in the central part of the bunches. The quantitative characteristics revealed variability. The genotypes UFPI 865 and UFPI 872 were statistically similar for the descriptors SFP, FD, PL, and SW. The accessions UFPI 836 and UFPI 865 were similar for the descriptor LWR. Finally, UFPI 836 and UFPI 872 were statistically similar for the descriptors NLP and ST.
Group III was composed of the largest number of genotypes. The genotype UFPI 879 showed an oval-lanceolate leaflet shape, while the others showed an oval leaflet shape. Regarding the hairiness of the leaflet, one genotype showed a leaflet with moderate hairiness (UFPI 863), while the others were slightly pubescent. Regarding the orientation of the branches, three genotypes (UFPI 864, UFPI 857, and UFPI 879) showed a rare main stem and lateral branches starting from the first nodes, and the other genotypes showed short lateral branches. Two genotypes (UFPI 860 and UFPI 868) showed stems with generalized pigmentation, and the other genotypes showed stems without pigmentation. Most of the genotypes in this group showed white flowers, but the genotypes UFPI 860 and UFPI 879 showed flowers with light pink wings.
The morphology of the pods from group III also varied: the genotypes UFPI 842, UFPI 868, UFPI 869, and UFPI 864 showed pods with a short apex, UFPI 859 showed pods with a medium apex, and UFPI 879 showed pods with a long apex, while the other genotypes showed pods with a thick apex. Regarding the arrangement of pods in the cluster, UFPI 860 showed pods randomly positioned in the cluster. Five genotypes showed pods disposed at the base of the cluster (UFPI 842, UFPI 854, UFPI 856, UFPI 857, and UFPI 863), and the others showed pods disposed at the top of the cluster. Only one genotype was dehiscent (UFPI 857), while the others showed no dehiscence. In addition, one genotype showed pods without hairiness (UFPI 842), while the others were pubescent. Regarding the quantitative characteristics, this group showed variability in all descriptors, mainly those related to bunch length (values ranging from 7.0 to 21.63 cm), days to flowering start (93-154 days), and flowering duration (35-85 days).
Group IV was composed of only two genotypes (UFPI 852 and UFPI 867). This group has plants with oval, moderately pubescent leaflets, branches arranged with a main stem and short side branches, colorless stem, white flower wings, pods with a short apex, and pods positioned at the base of the bunches. One genotype showed glabrous pods (UFPI 852), and the other showed pubescent pods (UFPI 867). Neither genotype showed dehiscence in the pods. For the quantitative traits, the genotypes showed statistically similar values for the FD, SW, and LWR traits.
Group V was composed of two genotypes (UFPI 858 and UFPI 862). This group has plants with oval leaflet shape, a moderately pubescent genotype (UFPI 862), and a slightly pubescent genotype (UFPI 858), both with a colorless stem. Regarding the branch orientation, the genotypes showed a main stem with short, rare, or nonexistent side branches. The flowers of both genotypes had white wings. The pods were disposed at the base of the bunches. The pods were dehiscent and had a thick apex. For quantitative traits, the genotypes showed statistically similar values for PL, PWE, SW, LWR, and 100SW traits.
The UPGMA results confirmed the significant contribution of the quantitative and qualitative variables to the explanation of the groups formed based on the morpho-agronomic descriptors. Thus, from the joint analysis of the qualitative and quantitative data, it was possible to group the genotypes in a single dendrogram. According to Rocha et al. (2010), a dendrogram allows better analysis and use of qualitative data that, in general, are analyzed only by descriptive statistics. Descriptive statistics alone do not allow drawing conclusions regarding the genetic divergence between genotypes, which often hinders the subsequent use of these genotypes, for example, in breeding programs.
The distribution of the BGP-UFPI genotypes in the dendrogram did not reflect their collection areas, since many genotypes from different regions were grouped together. For genotypes originating from geographically remote regions, one explanation for their allocation to the same group may be related to the genetic variability that exists between the genotypes. For genotypes originating from nearby regions, the genetic variability may be attributed to another factor, namely the probable exchange of seeds between producers. In general, this evaluative study can be reinforced using molecular markers. Guimarães et al. (2007) state that the association of quantitative and qualitative studies with genetic markers may allow an analysis of the genetic distance between genotypes. Thus, the results from these studies may be used in breeding programs, for example in the choice of contrasting parental lines.

CONCLUSION
Regarding the quantitative characteristics, the descriptor "pod maturation period" was the most efficient in discriminating the evaluated genotypes, while the "seed width" descriptor contributed the least. These results indicated that there was sufficient genetic variability between the genotypes, which allows the selection of the most promising genotypes.
Gower's genetic distance was efficient in discriminating groups and allowed the simultaneous analysis of qualitative and quantitative data between genotypes of lima beans from germplasm banks. This method helped in characterizing the genotypes.
Five groups were formed using the UPGMA grouping method. These groups indicated the genetic divergence between the lima bean genotypes, which may be used in breeding programs in the future.