MULTIVARIATE ANALYSIS OF PEANUT MECHANIZED HARVESTING

Mechanized peanut harvesting is affected by the physical characteristics of the soil, and losses may increase because the pods are produced below the soil surface. The objective of this experiment was to identify clusters, through multivariate exploratory approaches, based on similarity among six soil textures (very clayey, clayey, silty clayey loam, clayey loam, sandy loam and sandy) in the state of São Paulo, Brazil, and to determine the agronomic variables that most influenced the cluster division, in order to assist decision-making in mechanized peanut harvesting. The data were analyzed by multivariate exploratory techniques, which simplify the description of a set of interrelated variables, using yield, maturity, soil and pod moisture content, windrow width and height, visible and invisible digging losses, and gathering losses as agronomic indicators of quality. Soils with low and high clay content were grouped into clusters I and III, respectively, according to the agronomic traits of the peanut crop. The principal component analysis (PCA) allowed a single distribution of accesses, since only two eigenvalues were higher than one: the highest eigenvalues, 4.51 and 1.79, resulted in a biplot that explained 70% of the original variability, 50.11% in PC1 and 19.89% in PC2. The multivariate analysis indicated that high peanut yields in soils with low clay content are correlated with losses during the mechanized harvesting operation.


INTRODUCTION
Peanuts (Arachis hypogaea L.) are an important oilseed in the Brazilian market, and even more so in the state of São Paulo. The state produced 98% of the 418,300 tons of peanuts produced in Brazil in the 2016/2017 harvest, and peanuts are currently used as a rotational crop in sugarcane areas and even pastures (CONAB, 2018).
The peanut production process is at a critical time, in which it seeks mechanized operational excellence through new technologies to overcome the challenges and seize the opportunities of foreign markets in a sustainable way, by increasing production volume and productivity and reducing production costs (Grotta et al., 2008), because losses are inevitable in this operation (Barbosa et al., 2014).
Most of the losses in the peanut harvest occur in the digging operation and can reach high levels when the operation is not carefully managed; pod losses range from 3.1 to 47.1% of yield (Santos et al., 2013; Zerbato et al., 2014).
Monitoring the losses allows errors that occur during the process to be detected and corrected, so that losses can be minimized and yield reductions avoided (Bertonha et al., 2014; Câmara et al., 2007).
Multivariate analysis can be defined as an exploratory statistical method that simultaneously analyzes multiple measurements on the experimental unit. The random variables must be interrelated, so that their effects cannot be meaningfully interpreted separately. Clustering Analysis (CA) uses the Ward (hierarchical) method to extract statistical properties of a dataset, clustering similar vectors into classes (Hair, 2005).
Principal Component Analysis (PCA) is performed to simplify the description of a set of interrelated variables. The orthogonal axes it creates are linear combinations of the original variables, obtained from the eigenvalues of the covariance and/or correlation matrix of the studied variables; the two largest eigenvalues generate the first two principal components, which explain more variability than any other component (Hair et al., 2005).
The use of multivariate exploratory techniques to control mechanized farming operations is a reality, for example as an innovative tool for estimating the operating costs of agricultural and forestry machinery (Guerrieri et al., 2017). Studies of the impact of mechanization through soil tillage on the behavior of weeds were efficient based on the similarity (CA) of the weeds (Boscardin et al., 2016; Nagahama et al., 2014). Azevedo et al. (2015) and Silva & Lima (2012) investigated, respectively, the selection of lettuce cultivars and the nutritional status and productivity of coffee plants, and showed the efficiency of multivariate analysis, because the effects of multiple (random and interrelated) variables could not be interpreted separately.
Engenharia Agrícola, Jaboticabal, v.38, n.2, p.244-250, mar./apr. 2018
Multivariate exploratory analysis is used in sectors of agricultural production because of the high complexity of the information produced. Lampkowski & Biaggioni (2013) and Paredes Junior et al. (2015) reported that it is used to better interpret, understand, manage and assist the decision-making process in the sugar-energy sector.
It is assumed that mechanized peanut harvesting needs tools to assist in the effective control of loss variability and to understand the behavior of multiple agronomic traits in relation to soil textural classes.
This study aimed to identify clusters through multivariate exploratory approaches from similar soil textures and to determine the variables that most influenced the cluster division, in order to assist the decision-making process in mechanized agricultural operations.

MATERIAL AND METHODS
The experiment was conducted on six farms in the Ribeirão Preto region, SP (Table 1). The 120-ha assessed area was planted with peanut of the Runner IAC 886 variety, sown in October 2015 in rows spaced 0.90 m apart. The crop was harvested in all evaluated areas in February 2016, 130 days after sowing. The farms were located between 20º58' and 21º10'S and 47º51' and 48º13'W, at an average altitude of 593 m. The soil in the areas contained between 7.0 and 66.8% clay (Table 1).
Digging was performed by a 680 HD Massey Ferguson tractor with maximum engine power of 127 kW (173 hp) at 2000 rpm, coupled with an EIA-2 Santal digger-shaker-inverter, 2 x 1 (two dug rows forming one windrow). However, the tractor worked at 1,500 rpm to meet the digger-shaker-inverter manufacturer's recommendation of 340 rpm at the PTO. Although these conditions were unsuitable from the mechanical viewpoint, this rotation was used because it represented the real conditions of the equipment in the field, since the equipment has no reduction mechanism able to provide the indicated rotation.
Mechanized peanut harvesting was evaluated for six soil textural classes: very clayey (VCL), clayey (CLA), silty clayey loam (SCL), clayey loam (CLL), sandy loam (SAL), and sandy (SAN), with 10 repetitions per soil textural class, totaling 60 plots in a randomized design; each plot formed a regular grid of 25 x 50 m.
The evaluated variables were yield, maturity, soil and pod moisture content, height and width of the windrow after digging, digging losses (visible and invisible), and gathering losses. The windrow formed after the passage of the digger was carefully removed to determine the visible and invisible digging losses. For this purpose, a metal frame of approximately 2 m² (1.11 x 1.80 m) was placed across the windrow and the material was manually collected to a depth of 0.25 m.
The frame width corresponded to the working width of the digger-shaker-inverter. After collection, the pods were placed in paper bags, tagged, and sent to the laboratory, where the samples were washed to remove the dirt from the exocarp. The pods were then weighed on a digital scale with 0.01 g precision and oven dried at 105 ± 3°C for 24 hours. After drying, they were weighed again to determine the losses, which were extrapolated to kg ha⁻¹ and adjusted to 8% moisture. The losses were calculated in kg ha⁻¹ and expressed as a percentage.
The previously described frame of approximately 2 m² was placed on the windrows at all sampling points to determine yield. All pods within the frame area were collected, and yield was calculated at 8% moisture, the standard value for peanut storage in the processing companies. Subsequently, gross crop yield was calculated by adding the total digging losses (the sum of invisible and visible digging losses).
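The scaling from a frame sample to a per-hectare value can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the adjustment to 8% moisture is assumed to be on a wet basis, and the sample values in the usage lines are invented for the example rather than taken from the experiment.

```python
# Illustrative sketch only; names and sample values are hypothetical.
FRAME_AREA_M2 = 1.11 * 1.80   # sampling frame described in the text (about 2 m^2)
STORAGE_MOISTURE = 0.08       # 8% moisture, the storage standard cited in the text


def to_kg_per_ha(dry_mass_g, area_m2=FRAME_AREA_M2, moisture=STORAGE_MOISTURE):
    """Scale an oven-dry sample mass (g) to kg/ha at the target moisture.

    Assumption: moisture is expressed on a wet basis, so the mass at the
    target moisture is dry_mass / (1 - moisture).
    """
    mass_at_moisture_g = dry_mass_g / (1.0 - moisture)
    return (mass_at_moisture_g / 1000.0) * (10000.0 / area_m2)


def gross_yield(windrow_yield, visible_loss, invisible_loss):
    """Gross yield = yield collected from the windrow + total digging losses."""
    return windrow_yield + visible_loss + invisible_loss


def loss_percent(loss, gross):
    """Express a loss (kg/ha) as a percentage of gross yield (kg/ha)."""
    return 100.0 * loss / gross


# Hypothetical values, for illustration only (kg/ha).
gross = gross_yield(windrow_yield=4000.0, visible_loss=300.0, invisible_loss=200.0)
print(gross, loss_percent(300.0 + 200.0, gross))
```

Expressing losses relative to gross yield is one reasonable reading of "in relation to yield"; if losses are instead referred to the windrow yield, only the denominator in `loss_percent` changes.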
Pod moisture content (calculated on a wet basis) was obtained by the oven method for samples collected after the passage of the digger-shaker-inverter, with subsequent correction to the 8% water content used for peanut storage in hulling (Martins & Lago, 2008). Soil moisture was determined in samples collected with a Dutch auger in the 0.0 to 0.2 m layer. The soil samples were placed in aluminum containers, sent to the laboratory, and oven dried at 105°C for 24 hours. Soil moisture was obtained on a dry basis, according to the methodology recommended by EMBRAPA (2006). The 0.0 to 0.2 m layer was chosen for soil sampling because it concentrates most of the peanut pods. One soil sample was collected per plot, totaling 60 samples.
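Because pod moisture is reported on a wet basis and soil moisture on a dry basis, the two conventions are easy to confuse. The sketch below uses the standard gravimetric definitions; the function names are hypothetical and the masses are invented for illustration.

```python
# Illustrative sketch of the two gravimetric moisture conventions.
def moisture_wet_basis(wet_mass, dry_mass):
    """Water mass as a percentage of the wet (initial) sample mass."""
    return 100.0 * (wet_mass - dry_mass) / wet_mass


def moisture_dry_basis(wet_mass, dry_mass):
    """Water mass as a percentage of the oven-dry sample mass."""
    return 100.0 * (wet_mass - dry_mass) / dry_mass


def wet_to_dry_basis(wet_basis_percent):
    """Convert a wet-basis percentage to the equivalent dry-basis percentage."""
    return 100.0 * wet_basis_percent / (100.0 - wet_basis_percent)


# Hypothetical sample: 100 g before drying, 60 g after drying.
print(moisture_wet_basis(100.0, 60.0))   # 40.0
print(moisture_dry_basis(100.0, 60.0))
```

The same sample gives a larger number on a dry basis than on a wet basis, so the two moisture variables are not directly comparable without conversion.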
Maturity (hull scrape method) was determined by scraping the exocarp of 100 random pods at each sampling point, exposing the color of the mesocarp. The pods were classified by color according to the Peanut Maturity Table developed by the University of Georgia, United States (Williams & Drexler, 1981).
The windrow width and height were measured with a graduated ruler to indicate the quality of windrow inversion, since they can affect the drying and mechanized gathering processes.
The data were standardized (zero mean and unit variance) prior to the multivariate analysis, and the variables did not present collinearity. Exploratory statistics were performed in the Statistica software to analyze the hierarchical clusters, calculating the Euclidean distance between accesses and applying the Ward algorithm to group similar accesses, which were then represented graphically by a dendrogram (clustering the accesses), and by k-means (minimizing the access variance within each cluster).
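The standardization and Ward steps can be sketched in plain Python as below. This is not the Statistica procedure used in the study; it is a minimal, naive illustration that assumes the sample standard deviation for scaling and merges, at each step, the pair of clusters whose union gives the smallest increase in within-cluster sum of squares (the Ward criterion on squared Euclidean distances). The data rows are invented, and k-means is not covered.

```python
# Minimal sketch of z-score standardization followed by naive Ward clustering.
from statistics import mean, stdev


def standardize(rows):
    """Scale each column to zero mean and unit (sample) variance."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [stdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]


def centroid(points):
    return [mean(c) for c in zip(*points)]


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    d2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * d2


def ward_clusters(points, k):
    """Merge clusters agglomeratively until k remain; returns index lists."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                cost = ward_cost([points[p] for p in clusters[i]],
                                 [points[p] for p in clusters[j]])
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical plots described by two traits (values invented for illustration).
rows = [[1.0, 10.0], [1.2, 11.0], [0.9, 10.5],
        [5.0, 30.0], [5.2, 31.0], [4.9, 29.0]]
z = standardize(rows)
print(ward_clusters(z, 2))
```

Recording the merge cost at each step would give the heights needed to draw a dendrogram; the sketch only returns the final grouping.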
Discriminant Analysis (DA) is the oldest of the three classification methods. It was originally developed for multivariate normally distributed data. The data as a whole need not be normally distributed, but within each class the data should be normally distributed. This means that if the data could be plotted, each class would form an ellipsoid, but the means would differ. The Mahalanobis distance between x and the center c_i of class i is the S-weighted distance, where S is the estimated variance-covariance matrix of the class.
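For reference, the S-weighted distance described above is usually written in its squared form as follows. This is the standard textbook definition, using the symbols x, c_i and S from the paragraph; it is given here as a reading aid rather than as the exact expression computed by the software.

```latex
D^2(x, c_i) = (x - c_i)^{\mathsf{T}} \, S^{-1} \, (x - c_i)
```

When S is the identity matrix, this reduces to the squared Euclidean distance between x and c_i.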
After forming the clusters by the Ward clustering method, the dendrogram branches were coded for the application of principal components (Hair et al., 2005), using the same traits. The objective was to visualize the soil textural classes in the two-dimensional plane formed by the principal components and to interpret the discriminatory power of the variables in each principal component through their correlations with that component. The eigenvectors (PC1, PC2, ..., PCh) were determined from the eigenvalues of the covariance and/or correlation matrix of the branch traits, in descending order. Thus, PC1 is the component that explains the most variability in the original dataset, while the last component explains the least.
The variance in each principal component can be calculated as follows:

$$\mathrm{PC}_h = \frac{\lambda_h}{\operatorname{trace}(C)} \times 100 \qquad (2)$$

where $\mathrm{PC}_h$: principal component h; $\lambda_h$: eigenvalue h; $C$: covariance and/or correlation matrix; $\operatorname{trace}(C)$: $\lambda_1 + \lambda_2 + \dots + \lambda_h$.
The principal component analysis was performed based on the diagonalization of a symmetric correlation matrix, after analyzing the population variance, to identify new numerical variables that explained most of the variability (Hair, 2005); components were retained by the Kaiser method, with eigenvalues higher than 1.
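The share of variance attributed to a component and the Kaiser rule can be illustrated with the two eigenvalues reported in this paper (4.51 and 1.79). The sketch assumes that the trace of the correlation matrix equals 9, the number of standardized variables listed in the study; that value is an inference, not a figure stated in the text, and the function names are hypothetical.

```python
# Illustrative sketch: explained variance and the Kaiser retention rule.
def explained_percent(eigenvalue, trace):
    """Percentage of total variance carried by one principal component."""
    return 100.0 * eigenvalue / trace


def kaiser_retained(eigenvalues):
    """Keep only components whose eigenvalue is greater than 1."""
    return [ev for ev in eigenvalues if ev > 1.0]


# Assumption: nine standardized variables, so trace(C) = 9 for a correlation matrix.
N_VARIABLES = 9
reported = [4.51, 1.79]   # eigenvalues reported in the text

shares = [round(explained_percent(ev, N_VARIABLES), 2) for ev in reported]
print(shares)                   # [50.11, 19.89]
print(round(sum(shares), 2))    # 70.0
print(kaiser_retained(reported))
```

Under this assumption the computed shares agree with the 50.11%, 19.89% and 70% quoted for PC1, PC2 and their sum.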

RESULTS AND DISCUSSION
The dendrogram resulting from the clustering analysis is presented in Figure 1. The results corroborate Lacerda et al. (2016), who, in evaluating the discrimination of soil texture, considered similar soil textural groups to differentiate soil managements and production potential. The processes were divided into three groups (I, II, and III), which were then submitted to discriminant analysis (Figure 2), and the non-significant variables (IDL and VDL) were excluded. The Classification and Discriminant Function Analysis Summary is presented for seven variables and three groups (Tables 2 and 3). The classification matrix of the groups (G1: 33.8%, G2: 16.7% and G3: 50.0%) is presented as observed versus predicted classifications (Table 4). Groups I and III corresponded to opposing soil classes, with low and high clay content, respectively; where higher yield was achieved, a more advanced maturity stage caused major losses in the mechanized digging operation due to increased peduncle fragility. In addition, mechanized gathering (GAL) became more difficult due to lower windrow height and lower soil and pod moisture content, corroborating Santos et al. (2013) and Zerbato et al. (2014), who reported pod moisture contents of 35 to 45% at digging time. GAL was the variable that most influenced the discriminant division to assist the decision-making process in mechanized agricultural operations. Table 4 shows the observed and predicted classifications for the three groups used in the model with the agronomic variables. The point of interest is the percentage of cases correctly classified by the DA. In fact, all soils in the estimation groups were correctly classified (100%). Thus, there is no over-fitting: the analysis works well for the base model (Tables 2 and 3) and is considered appropriate for predictions (Lucadamo & Leone, 2015); using all the explanatory and response variables, the DA works perfectly for the basic soil textural classes in terms of the percentage correctly classified, denoting better classification rates for the new model.
The principal component analysis allowed a single distribution of accesses (PC1 and PC2), since only two eigenvalues were greater than one, 4.51 and 1.79, respectively. The first two principal components together enabled a two-dimensional ordering of accesses and variables, producing a biplot (Figure 3). The distribution of soil textures and agronomic traits of the peanut crop showed that these components explained 70% of the variability, 50.11% in PC1 and 19.89% in PC2.

The biplot (Figure 3) shows the distribution of soil classes in the peanut areas on the plane formed by the first two principal components (PC1 and PC2), coded according to the clusters determined in the dendrogram. The x-axis (PC1) contrasts the six soil classes: three textural classes (VCL, CLA, and CLL) lie to the right and two (SAN and SAL) to the left, while SCL shows intermediate, centralized behavior. The y-axis (PC2) shows a high direct correlation between visible losses and windrow height, explained by the adjustment of the alignment rolls and branch inverters which, together with the plant mass being processed, directly interferes with the windrow dimensions in the digging process (Zerbato et al., 2017).
Table 5 shows the variables with the highest discriminatory power in the first principal component, which had direct correlations among YLD, MAT, VDL, IDL, and GAL, enabling an efficient multivariate approach to mechanized peanut harvesting. Likewise, Silva et al. (2010) and Santos et al. (2016) successfully used agronomic traits of great economic importance in the agricultural production system.
Principal component 1 (PC1); principal component 2 (PC2); yield (YLD); maturity (MAT); windrow width (WIW) and height (HEW); soil (SMC) and pod (PMC) moisture content; visible (VDL) and invisible (IDL) digging losses; and gathering losses (GAL).
According to the correlations (Table 5) between the peanut harvest variables and the first two principal components by soil textural class, PC1 has high discriminatory power for the following peanut harvest variables: YLD (-0.88), MAT (-0.64), VDL (-0.69), IDL (-0.86), and GAL (-0.85). These negative associations project in the direction in which the SAN and SAL soil textural classes cluster (Figure 3). In turn, the positive associations of SMC (0.66) and WPC (0.86) correlate with the VCL, CLA and CLL soil textural classes. Thus, PC1 can be taken to represent the peanut digging operation. It is noteworthy that the high impact of losses on the gathering of pods directly affected the final yield of the peanut crop (Ferezin et al., 2015).
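Correlations of this kind follow from a standard PCA identity: the correlation between a variable and a component equals the variable's eigenvector coefficient times the square root of the component's eigenvalue, divided by the variable's standard deviation (which is 1 for standardized data). The sketch below uses hypothetical function names; the coefficient it derives for YLD is back-calculated from the reported correlation (-0.88) and eigenvalue (4.51) purely for illustration and is not a value reported in the paper.

```python
# Illustrative sketch of the variable-to-component correlation identity.
from math import sqrt


def loading_correlation(coefficient, eigenvalue, std_dev=1.0):
    """Correlation between a variable and a principal component."""
    return coefficient * sqrt(eigenvalue) / std_dev


def implied_coefficient(correlation, eigenvalue, std_dev=1.0):
    """Eigenvector coefficient implied by a given correlation (inverse identity)."""
    return correlation * std_dev / sqrt(eigenvalue)


# Back-calculated, for illustration only: YLD correlation with PC1 is -0.88 and
# the PC1 eigenvalue is 4.51 (both from the text); std_dev = 1 assumes
# standardized variables.
a_yld = implied_coefficient(-0.88, 4.51)
print(round(loading_correlation(a_yld, 4.51), 2))   # -0.88
```

A large absolute correlation therefore reflects both a large coefficient and a component that carries a large share of the total variance.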
Yield is the variable correlated with maturity that can directly influence mechanized peanut harvesting, in addition to the moisture content of the pods. Dry peduncles, with lower moisture content, break easily and cause higher losses in the gathering operation (Zerbato et al., 2014).
PC2 showed high discriminatory power for the HEW and VDL variables (Figure 3), with negative associations of -0.79 and -0.61, respectively (Table 5). Windrow height influenced the gathering process: losses were higher for lower windrow heights because the gathering operation (represented by PC2) becomes more difficult closer to the ground.
Together, the three multivariate statistical techniques provided an efficient method of discriminating soil textural classes.
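The correlation of each variable with each principal component was calculated as:

$$r_{x_j(\mathrm{PC}_h)} = \frac{a_{jh}\sqrt{\lambda_h}}{s_j} \qquad (1)$$

where $s_j$: standard deviation of variable j; $a_{jh}$: coefficient of variable j in the h-th principal component; $\lambda_h$: eigenvalue h; $r_{x_j(\mathrm{PC}_h)}$: correlation of the variable $x_j$ with the h-th principal component.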

FIGURE 2. Standardized coefficients for agronomic traits of the peanut crop and soil texture for each group, according to the discriminant analysis. Yield (YLD), maturity (MAT), windrow width (WIW) and height (HEW), soil (SMC) and pod (PMC) moisture content, visible (VDL) and invisible (IDL) digging losses, and gathering losses (GAL).

TABLE 1. Particle size analysis and soil textural classes.

TABLE 3. Squared Mahalanobis distances (upper half of the table) between the centroids of the group distributions (G1, G2 and G3) of the soil textural classes and respective F-values (lower half of the table).