MULTIVARIATE ANALYSIS APPLIED TO THE STUDY OF THE RELATIONSHIP BETWEEN SOIL AND PLANT PROPERTIES IN A PEACH ORCHARD

Santos Silva Terra, Viviane; Valgas, Ricardo Alexandre; Reisser Júnior, Carlos; Timm, Luis Carlos; Martins Pereira, José Francisco; Carpena Carvalho, Flávio Luiz; Oldoni, Henrique. Revista Brasileira de Ciência do Solo, vol. 38, no. 3, 2014, pp. 755-764. Sociedade Brasileira de Ciência do Solo, Viçosa, Brasil.


SUMMARY
In the State of Rio Grande do Sul, the municipality of Pelotas is responsible for 90 % of peach production due to its suitable climate and soil conditions. However, new studies aimed at improving fruit quality and increasing yield are needed. The aim of this study was to evaluate the relationship between soil physical properties and peach plant properties in the years 2010 and 2011 by canonical correlation analysis, a multivariate technique. The experiment was conducted in a peach orchard located in the municipality of Morro Redondo, RS, Brazil, where an experimental grid of 101 plants was established. In a trench dug beside each of the 101 plants, soil samples were collected to determine silt, clay, and sand contents, soil bulk density, total porosity, macroporosity, microporosity, and volumetric water content in the 0.00-0.10 and 0.10-0.20 m layers, as well as the depth of the A horizon. In each plant and in each year, the following properties were assessed: trunk diameter, fruit size and number of fruits per plant, average weight of the fruit per plant, fruit pulp firmness, Brix content, and yield from the orchard. Exploratory analysis of the data was undertaken by descriptive statistics, and the relationships between the physical properties of the soil and the properties of the plant were assessed by canonical correlation analysis. The results showed that the clay and microporosity variables were those that exhibited the highest coefficients of canonical cross-loading with the plant properties in the soil layers assessed, and

INTRODUCTION
Temperate-climate fruit growing in the southern half of the State of Rio Grande do Sul is one of the agricultural activities that has gained prominence in recent years due to its high profitability in small areas, and it is present on many of the family farms of this region. Among the temperate fruit crops, the peach has been an option for crop diversification on these rural properties, as well as an option for generating and maintaining jobs in the rural area. According to Herter et al. (2003), this is because the region has favorable climate and soil conditions for peach production. Timm et al. (2007) mention that there is a lack of studies that seek to evaluate the soil-plant interactions of the peach tree, since peach fruit quality is the result of the interaction of various factors, including topography of the site, the soil, water, climate, and the management practices adopted in production (Herter et al., 2003). Thus, one of the main factors that should be taken into consideration in planning the establishment of an orchard is the initial condition of the soil, which may be checked through analysis of its physical and chemical properties. From the physical point of view, the structural quality of soils has been associated with conditions favorable to growth of the root system, aeration, and water infiltration and movement in the soil profile, conditions which do not limit root penetration or water and nutrient uptake and, consequently, do not restrict crop development (Coelho Filho et al., 2001; Li et al., 2002).
Canonical correlation analysis is a multivariate statistical technique that allows examination of the interrelation that exists between two groups of variables (X, Y), i.e., it verifies the existence and the intensity of association between groups through linear combinations of the variables that make up the groups (Amarante et al., 2006). In this respect, the technique may be appropriate to assess the relations between primary and secondary production traits and/or physiological and agronomic traits of a given crop (Santos et al., 1994; Coimbra et al., 2004; Silva et al., 2007). Tavares et al. (1999) used canonical correlation to study the relationships between the main production factors in green pepper (weight and number of fruits) and the traits of the fruit. Coimbra et al. (2000) studied the relationships between the primary and secondary components of grain yield in common bean. In papaya, Schmildt et al. (2011) used canonical correlation analysis to study the relationship between plant characteristics and the capacity for formation of sprouts after pruning.
Seeking to test the hypothesis that soil structural conditions affect the growth and yield properties of the peach orchard, as well as fruit quality, the aim of the present study was to evaluate, by canonical correlation analysis, the relationships that exist between soil physical properties and peach plant properties in the years 2010 and 2011 in an orchard located in the municipality of Morro Redondo, RS, Brazil.

MATERIAL AND METHODS
The present study was carried out in the years 2010 and 2011 in a peach orchard located in the municipality of Morro Redondo, RS. The geographical coordinates of the experimental area are 31° 31' 55.30" latitude South and 52° 35' 37.87" longitude West. Climate in the region, according to the Köppen climate classification, is the "Cfa" type, i.e., humid subtropical with hot summers. The region has an annual mean temperature of 18 °C, annual mean rainfall of 1,509.2 mm, and relative humidity of 78.8 %. The soil was classified according to Santos et al. (2006) as Argissolo Bruno-Acinzentado and as Aquertic Hapludalf based on the US Soil Taxonomy system (NRCS, 2009).
The peach orchard evaluated was composed of the Esmeralda cultivar, with trees three years of age at the beginning of the study, and an experimental grid of 101 plants was established out of a total of 1,450. The crop practices (soil fertilization, weed management, plant health management, pruning, and thinning) were performed according to the recommendations of Medeiros & Raseira (1998). In July 2010, disturbed and undisturbed soil samples were collected from the 0.00-0.10 and 0.10-0.20 m layers in a trench opened beside each of the 101 plants. The samples were identified, packaged, and taken to the Soil Physics Laboratory of Embrapa Clima Temperado (Pelotas, RS) to determine the following soil physical properties in each layer: sand, silt, and clay percentages by the pipette method (Gee & Or, 2002); soil bulk density (BD) by the volumetric ring method (Grossman & Reinsch, 2002); and total porosity (TP), macroporosity (MA), and microporosity (MI) (Embrapa, 1997). At each experimental point and in each layer, volumetric moisture content was determined at the time of collection. The depth of the "A" horizon was demarcated at each point with the aid of a soil auger.
The R statistical software (R Core Team, 2012) was used for data analysis. The coefficient of variation (CV) was classified according to Wilding & Drees (1983) as CV ≤ 15 %, 15 % < CV ≤ 35 %, and CV > 35 %, representing low, moderate, and high variability, respectively. To test the hypothesis of normality of the data distribution, the test of Shapiro & Wilk (1965) was performed at the 5 and 1 % significance levels.
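The CV classification described above can be sketched in a few lines. The analysis in this study was run in R; the Python below is only an illustrative sketch, and the function names `coefficient_of_variation` and `classify_cv`, as well as the sample values, are hypothetical and not taken from the study.

```python
import statistics


def coefficient_of_variation(values):
    """Return the CV (%) as the sample standard deviation over the mean."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


def classify_cv(cv_percent):
    """Classify a CV (%) following the limits of Wilding & Drees (1983)."""
    if cv_percent <= 15.0:
        return "low"
    if cv_percent <= 35.0:
        return "moderate"
    return "high"


# Hypothetical trunk diameters (cm); these are NOT data from the study.
trunk_diameter = [7.9, 8.4, 8.1, 9.0, 8.6, 7.7]
cv = coefficient_of_variation(trunk_diameter)
print(f"CV = {cv:.1f} % -> {classify_cv(cv)}")
```

Applied to the CV values reported for 2010, `classify_cv(54.6)` and `classify_cv(55.7)` both fall in the "high" class.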
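In canonical correlation analysis, the linear combinations $X_1 = a_1x_1 + \dots + a_px_p$ and $Y_1 = b_1y_1 + \dots + b_qy_q$ are formed from the variables of the two groups, where $a' = [a_1 \dots a_p]$ and $b' = [b_1 \dots b_q]$ are the vectors of the weights of the characteristics corresponding to groups 1 and 2, respectively. The first canonical correlation corresponds to the following equation:

$$r_1 = \frac{\mathrm{cov}(X_1, Y_1)}{\sqrt{\mathrm{var}(X_1)\,\mathrm{var}(Y_1)}}$$

which maximizes the relationship between the functions $X_1$ and $Y_1$ that represent the first canonical pair, with: $\mathrm{cov}(X_1, Y_1) = a'S_{12}b$; $\mathrm{var}(X_1) = a'S_{11}a$; $\mathrm{var}(Y_1) = b'S_{22}b$; $S_{11}$ = p × p matrix of covariances of the traits of group 1; $S_{22}$ = q × q matrix of covariances of the traits of group 2; and $S_{12}$ = p × q matrix of covariances between the traits of groups 1 and 2.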
With $R_{11}$, $R_{22}$, and $R_{12}$ being the sample correlation matrices (and $R_{21} = R_{12}'$), the first canonical correlation ($r_1$), corresponding to the first canonical pair, is the square root of the first eigenvalue ($\lambda_1$), the solution of the equation

$$\left| R_{11}^{-1} R_{12} R_{22}^{-1} R_{21} - \lambda I \right| = 0$$

The weighted coefficients of the canonical pairs are known as eigenvectors and are associated with the respective eigenvalues. The significance test applied for each canonical correlation was the chi-square (χ²):

$$\chi^2 = -\left[ n - 0.5\,(p + q + 3) \right] \ln \prod_{i=k}^{s} \left( 1 - r_i^2 \right)$$

where n = number of observations; p = number of variables of the X group (independent); q = number of variables of the Y group (dependent); $r_i^2$ = square of the canonical correlation of the equation to be tested; and the product runs from the canonical pair being tested (k) to the last pair (s). Canonical loadings and canonical cross-loadings were also evaluated, the latter measure being more widely used for analysis of the functions and, moreover, the measure adopted by the main statistical packages. These loadings are the correlations between each variable (dependent or independent) and the index of the respective group.
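The eigenvalue formulation above can be illustrated numerically. The following is a minimal Python sketch (the study itself used R), assuming NumPy is available and assuming Bartlett's form of the chi-square approximation; the function names `canonical_correlations` and `bartlett_chi2` and the synthetic data are hypothetical and serve only as an illustration.

```python
import numpy as np


def canonical_correlations(x, y):
    """Canonical correlations between column groups x (n x p) and y (n x q)."""
    p = x.shape[1]
    # Correlation matrix of all variables, partitioned into four blocks.
    r = np.corrcoef(np.hstack([x, y]), rowvar=False)
    r11, r12 = r[:p, :p], r[:p, p:]
    r21, r22 = r[p:, :p], r[p:, p:]
    # Eigenvalues of R11^-1 R12 R22^-1 R21 are the squared correlations.
    m = np.linalg.solve(r11, r12) @ np.linalg.solve(r22, r21)
    eigvals = np.sort(np.linalg.eigvals(m).real)[::-1]
    s = min(p, y.shape[1])
    return np.sqrt(np.clip(eigvals[:s], 0.0, 1.0))


def bartlett_chi2(r, n, p, q, k=0):
    """Chi-square statistic and degrees of freedom for pairs k, k+1, ... (0-based)."""
    stat = -(n - 0.5 * (p + q + 3)) * np.sum(np.log(1.0 - r[k:] ** 2))
    return stat, (p - k) * (q - k)


# Synthetic data only: 101 "plants", 3 soil variables and 2 plant variables.
rng = np.random.default_rng(0)
soil = rng.normal(size=(101, 3))
plant = np.column_stack([
    soil[:, 0] + 0.05 * rng.normal(size=101),  # strongly tied to one soil variable
    rng.normal(size=101),                      # unrelated noise
])
r = canonical_correlations(soil, plant)
chi2, df = bartlett_chi2(r, n=101, p=3, q=2)
print(r, chi2, df)
```

The number of canonical pairs is min(p, q), which is why 17 soil variables and seven plant variables yield seven pairs. A p-value would additionally require the chi-square distribution function, which is omitted here.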

RESULTS AND DISCUSSION
Table 2 shows the descriptive statistics for the plant properties in the years 2010 and 2011. For both years, the mean values of mean fruit weight per plant (MFWP), trunk diameter (TD), fruit size per plant (FSP), firmness (F), and Brix (B) are similar. Nevertheless, this behavior is not seen for the number of fruits per plant (NFP) and yield (Y) variables.
The data spread around the mean, expressed by the coefficient of variation (CV), was low (CV ≤ 15 %) for MFWP, TD, FSP, F, and B in the years 2010 and 2011 (Table 2), according to the classification proposed by Wilding & Drees (1983). The number of fruits per plant (NFP) and yield (Y) variables exhibited a spread classified as moderate (15 % < CV ≤ 35 %) in the year 2011. The CV values calculated were high (CV > 35 %) for NFP (CV = 54.6 %) and Y (CV = 55.7 %) in the year 2010. The variability of the NFP and Y data may be attributed to the occurrence of high wind (72.4 km h⁻¹) in the experimental area on October 31, 2010, which caused a great deal of windfall and thus heterogeneity in the distribution of the number of fruits per plant throughout the orchard and, consequently, a reduction in peach yield. It may also be seen in table 2 that the distributions of NFP and Y did not follow the tendency toward normality in the year 2010 by the Shapiro & Wilk test at the 5 % significance level; nevertheless, for the year 2011, they showed a tendency toward normality. For the other variables, the tendency of the distribution was the same in the two years.
Table 3 shows the descriptive statistics for the soil physical properties determined in the 0.00-0.10 and 0.10-0.20 m layers. It may be observed that the mean values of BD and TP are similar in the two layers. The data spread around the mean value was low (CV ≤ 15 %) for BD, MI, TP, and volumetric moisture (VM) in the two layers, and for sand (SAN) in the 0.10-0.20 m layer. Nevertheless, the spread of clay (CLA), silt (SIL), and MA (in both layers), SAN (0.00-0.10 m layer), and depth of the "A" horizon (DAH) was classified as moderate (Table 3) (Wilding & Drees, 1983). The distributions of BD, MA, and TP in both layers, and of VM (0.00-0.10 m), followed the trend toward normality by the Shapiro & Wilk test at the 1 % significance level (Table 3), as did the distributions of CLA, SIL, and MI in the 0.10-0.20 m layer and of DAH. In contrast, the distributions of CLA, SIL, and MI in the 0.00-0.10 m layer and of VM (0.10-0.20 m) did not show a tendency toward normality by the Shapiro & Wilk test at the 1 % significance level (Table 3). The same result was found for the distributions of sand in both layers.
In assessment of canonical correlations, some analyses are not recommended in the presence of multicollinearity between the variables because the results obtained are considered unreliable and may lead to mistaken conclusions (Cruz et al., 2003; Rigão et al., 2009). Such a situation was observed among the variables MA, MI, and TP. To correctly assess this case, two methods of analysis were considered: the first taking the coefficients of the canonical pairs into account (Tables 4 and 5) and the second assessing the canonical loadings and cross-loadings (Table 6). Table 4 shows the canonical correlation coefficients between the group of soil physical variables (group 1) and the group of plant variables (group 2) in the year 2010. Group 1 has 17 variables and group 2 is composed of seven variables, thus allowing the formation of seven canonical pairs in all. It may be observed in table 4 that the first canonical pair showed a correlation coefficient r equal to 0.613 and was significant by the chi-square test (p-value = 0.000014). The second canonical pair also showed significant correlation (r = 0.585), with a p-value of 0.025. It may further be seen in table 4 that, in both pairs, the variables MA, MI, and TP are directly correlated among themselves, since MI was obtained from the difference between TP and MA in both layers. In the face of this situation, the interpretation and analysis of the coefficients of the first and second canonical pairs may lead to a mistaken understanding because the variables that are really important might not be easily identified. Thus, it is necessary to analyze the canonical loadings, which allow one to clearly and objectively identify the contribution of each variable, both to its own group and to the other group.
The canonical correlation coefficients between the group of soil variables and the group of plant variables in the year 2011 are shown in table 5. It may be observed that only the first canonical pair was significant at 5 % by the chi-square test (p-value = 0.0078), showing a correlation (r = 0.615) slightly greater than that of the year 2010 (r = 0.613). The problem of multicollinearity observed in the year 2010 among the MA, MI, and TP variables was also observed in analysis of the 2011 data, as seen by the high values of the coefficients of the first canonical pair in relation to the others. Thus, we chose to adopt the same criterion of analysis of canonical loadings for the year 2011.
Abbreviations: SAN: sand (g kg⁻¹); SIL: silt (g kg⁻¹); CLA: clay (g kg⁻¹); BD: bulk density (kg dm⁻³); MA: macroporosity (%); MI: microporosity (%); TP: total porosity (%); VM: volumetric moisture (%); 1: depth of 0.00-0.10 m; 2: depth of 0.10-0.20 m; DAH: depth of A horizon (cm); MFWP: mean fruit weight per plant (kg); TD: trunk diameter (cm); NFP: number of fruits per plant; FSP: fruit size per plant (cm); F: pulp firmness (Lb); B: Brix content (%); Y: yield (kg ha⁻¹); -: no value; 10: year of 2010; χ²: chi-square calculated; *: significant at 5 % by the chi-square test.
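The multicollinearity among MA, MI, and TP follows directly from the fact that total porosity is the sum of macroporosity and microporosity, which makes the correlation matrix of the three variables singular. The following Python sketch illustrates this with synthetic values; it assumes NumPy is available, and the means and standard deviations used are hypothetical, not data from the study.

```python
import numpy as np

# Synthetic porosity values (%); these are NOT data from the study.
rng = np.random.default_rng(1)
ma = rng.normal(12.0, 3.0, size=101)  # macroporosity
mi = rng.normal(30.0, 2.0, size=101)  # microporosity
tp = ma + mi                          # total porosity is their sum

# Correlation matrix of the three linearly dependent variables.
r = np.corrcoef(np.column_stack([ma, mi, tp]), rowvar=False)

# A near-zero smallest eigenvalue (huge condition number) signals that
# the matrix cannot be reliably inverted, as canonical weights require.
smallest_eigenvalue = np.linalg.eigvalsh(r)[0]
condition_number = np.linalg.cond(r)
print(smallest_eigenvalue, condition_number)
```

Because the canonical weights depend on the inverse of this matrix, they become unstable under such dependence, whereas loadings and cross-loadings are simple correlations and remain interpretable.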
In canonical correlation analysis, we sought to verify the relationships that exist between the group of soil variables, in each of the layers, and the group of plant variables for the years 2010 and 2011. Due to the occurrence of high wind in the experimental area in the year 2010, as cited above, we believe that the correlations between the variables of the two groups do not represent what would be expected for that year. Thus, we chose to analyze and discuss the correlations between the soil physical properties, in each of the layers, and the plant properties only for the year 2011. The values of the coefficients of canonical cross-loading between the group of soil physical variables, in both layers, and the group of plant variables for the year 2011 are shown in table 6.
Considering the relation of each variable to its own group, it may be seen that in the soil group, the CLA variable, related to soil texture, showed the highest values of the correlation coefficient (-0.627 in the 0.00-0.10 m layer and -0.651 in the 0.10-0.20 m layer) with the other variables belonging to this group. It may also be seen that this variable, in both layers, has the opposite sign of the coefficient of canonical loading in relation to SAN, BD, and MA, which was expected. In contrast, the signs of the coefficients of canonical loading of VM, TP, and MI are equal to that of the CLA variable, corroborating the expected result. Among the variables linked to soil structure (BD, MA, MI, and TP), MI exhibited the greatest values of the correlation coefficient (-0.713 and -0.565 in the 0.00-0.10 and 0.10-0.20 m layers, respectively) with the other variables belonging to the soil group (Table 6). In the plant group, mean fruit weight per plant (MFWP) showed the greatest coefficient of canonical loading (-0.909) for the year 2011. It may also be observed that the MFWP variable is correlated in the same direction as the TD, NFP, FSP, and Y variables, indicating that MFWP has a strong correlation with the other variables, i.e., in a general way, vigorous plants exhibit a larger TD and tend to produce large fruit (FSP); however, for the NFP variable, this same relationship is not in evidence. From canonical analysis, it may also be observed that the B and F variables are related in the direction opposite to the other plant variables since, normally, larger fruits (MFWP and FSP) have greater cell size and, consequently, a lower concentration of solutes and less firmness, especially if this cell growth is a consequence of better moisture conditions. The greater value of the coefficient of canonical loading of the Y variable in the year 2011 (-0.382) reflects, within the group of plant variables, the greater values of the other coefficients of the variables linked to peach production (MFWP, TD, and FSP).
Among the variables related to soil texture, CLA exhibited the greatest coefficients of canonical cross-loading (-0.386 and -0.401 in the 0.00-0.10 and 0.10-0.20 m layers, respectively) with the plant group variables. As mentioned above, this variable also stood out in the correlation with its own group. Among the variables linked to soil structure, MI exhibited the greatest values of the coefficient of canonical cross-loading (-0.439 in the 0.00-0.10 m layer and -0.347 in the 0.10-0.20 m layer) with those belonging to the plant group, suggesting that soil MI, which is directly related to water storage in the soil (Reichardt & Timm, 2012), affected the behavior of the plant variables in this study. The values of the coefficients of canonical cross-loading between the VM variable (-0.454 in the 0.00-0.10 m layer and -0.369 in the 0.10-0.20 m layer) and the group of plant variables provide evidence that soil VM is a variable that integrates factors related to soil texture (CLA) and structure (MI). This result is corroborated when the correlation of the plant variables with the soil group variables (Table 6) is analyzed, because the signs of the coefficients of canonical cross-loading are the same (negative), indicating that the variables that represent FSP and TD are directly related to soil water availability (Simões, 2007).
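The loadings and cross-loadings quoted above are linked by a general property of canonical variates: for a given canonical pair, the cross-loading of a variable equals its canonical loading multiplied by the canonical correlation of that pair. The Python sketch below checks this against the values reported in the text, assuming they all refer to the first canonical pair of 2011 (r = 0.615); the dictionary name `reported` is hypothetical.

```python
# First canonical correlation for 2011, as reported in the text.
r1 = 0.615

# (canonical loading, canonical cross-loading) pairs quoted in the text.
reported = {
    "CLA, 0.00-0.10 m": (-0.627, -0.386),
    "CLA, 0.10-0.20 m": (-0.651, -0.401),
    "MI, 0.00-0.10 m": (-0.713, -0.439),
    "MI, 0.10-0.20 m": (-0.565, -0.347),
}

for name, (loading, cross_loading) in reported.items():
    implied = loading * r1  # cross-loading implied by the identity
    print(f"{name}: implied {implied:.3f}, reported {cross_loading:.3f}")
```

The implied and reported values agree to within rounding, which is consistent with cross-loadings being the loadings scaled by a canonical correlation below 1, and therefore always smaller in magnitude.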

CONCLUSIONS
1. The clay and microporosity variables are those that show the greatest coefficients of canonical cross-loading with the plant properties in the soil layers evaluated.
2. The mean fruit weight per plant and trunk diameter variables are those that show the greatest correlation values within the plant group for the two years evaluated.
3. The mean fruit weight per plant and trunk diameter variables are directly related to the volumetric moisture of the soil, which is a variable that integrates factors related to texture and soil structure.

Table 2. Parameters of the descriptive statistics for plant properties in the years 2010 and 2011
MFWP: mean fruit weight per plant; TD: trunk diameter; NFP: number of fruits per plant; FSP: fruit size per plant; F: pulp firmness; B: Brix content; Y: yield; 10: year of 2010; 11: year of 2011; (-): does not have a unit of measure; SD: standard deviation; CV: coefficient of variation; C_s: coefficient of asymmetry; C_k: coefficient of kurtosis; SW: Shapiro and Wilk test; *: property does not follow normal distribution at 5 %; nd: property follows normal distribution at 5 %.

Table 4. Coefficients of canonical correlations (r) and canonical pairs between the group of soil physical variables (group 1) and the group of plant variables (group 2) in the year 2010

Table 5. Coefficients of canonical correlations (r) and canonical pairs between the group of soil physical variables (group 1) and the group of plant variables (group 2) in the year 2011
*: significant at 5 % by the chi-square test; n.s.: not significant at 5 % by the chi-square test.

Table 6. Analysis of the group indexes through the canonical loadings and canonical cross-loadings for the years 2010 and 2011