USING NUMERICAL CLASSIFICATION OF PROFILES BASED ON VIS-NIR SPECTRA TO DISTINGUISH SOILS FROM THE PIRACICABA REGION, BRAZIL

Considering that information from soil reflectance spectra is underutilized in soil classification, this paper aimed to evaluate the relationship between soil physical and chemical properties and their spectra, to identify spectral patterns for soil classes, and to evaluate the use of numerical classification of profiles combined with spectral data for soil classification. We studied 20 soil profiles from the municipality of Piracicaba, State of São Paulo, Brazil, which were morphologically described and classified up to the 3rd category level of the Brazilian Soil Classification System (SiBCS). Subsequently, soil samples were collected from pedogenetic horizons and subjected to soil particle size and chemical analyses. Their Vis-NIR spectra were measured, followed by principal component analysis. Pearson’s linear correlation coefficients were determined among the first four principal components and the following soil properties: pH, organic matter, P, K, Ca, Mg, Al, CEC, base saturation, and Al saturation. We also interpreted the first three principal components and their relationships with the soil classes defined by SiBCS. In addition, numerical classification of the profiles based on the OSACA algorithm was performed using spectral data as a basis. We determined the Normalized Mutual Information (NMI) and Uncertainty Coefficient (U). These coefficients represent the similarity between the numerical classification and the soil classes from SiBCS. Pearson’s correlation coefficients were significant for the principal components when compared to sand, clay, Al content and soil color. Visual analysis of the principal component scores showed differences in the spectral behavior of the soil classes, mainly between Argissolos and the other soils. The NMI and U similarity coefficients showed values of 0.74 and 0.64, respectively, suggesting good similarity between the numerical and SiBCS classes. For example, numerical


INTRODUCTION
Agriculture is considered one of the foundations of the Brazilian economy, providing employment and increasing foreign exchange reserves. However, in recent decades, authorities and an increasingly aware public have required that production increases be based on sustainable practices. Therefore, implementation of proper management procedures and prior knowledge of the agricultural environment have become essential.
Soil is one of the most important constituents of the environment. It is not only a support for plants but also supplies water and nutrients. Thus, knowing the soil properties and their spatial variability is essential for implementation of any management technique (Bhatti et al., 1991). Soil maps are one of the most used sources of information for evaluating soil spatial variability and, when available at an appropriate scale, they allow the user to identify physical, chemical and morphological properties. Therefore, these maps aid in the decision-making process for agricultural planning, e.g., indicating sites with periodic flooding or variations in soil depth.
According to Mendonça-Santos & Santos (2006), approximately 35 % of Brazilian territory, which represents 17 of 26 States, has soil maps at intermediate scales (1:100,000 to 1:600,000). However, soil maps covering the entire national territory are found only at the exploratory and schematic scales.
These are called organizational maps and were mainly developed by the Brazilian Agricultural Research Organization (Embrapa) and the Agronomic Institute of Campinas (IAC). They are usually offered free of charge and made available in digital databases. Currently, to respond to the demand for more detailed maps, several private agricultural enterprises have hired professionals to develop these maps. However, this information is not available in the public domain. In addition, there is a lack of interest from governmental institutions in producing detailed maps, due to the complexity, cost and, especially, time requirements (Ben-Dor et al., 2008).
According to Brown et al. (2006), the cost of conventional soil characterization, calculated by the U.S. National Soil Survey Center, is about US$ 2,500 per pedon, and the work requires 6-12 months to be accomplished. Giasson et al. (2006) estimated a cost of approximately US$ 2.21 ha⁻¹ (scale of 1:50,000) and US$ 0.817 ha⁻¹ (scale of 1:100,000) for mapping the soil in two municipalities in the state of Rio Grande do Sul, Brazil. To improve the mapping process, researchers have developed equipment to provide soil information in real time at lower costs (Viscarra Rossel et al., 2009). Since the 1980s, numerous initiatives have evaluated the potential of spectra in obtaining quantitative soil data and, in many cases, the results were promising (Rivero et al., 2007; Ben-Dor et al., 2008; Viscarra Rossel et al., 2009). Several researchers (Moran et al., 1997; Ben-Dor et al., 1999) have recommended the development of methods using optical sensors, which may lend support to both laboratory processes and routine fieldwork. Some studies attempted to use the spectrum through visual interpretation to describe the soil class (Ben-Dor et al., 2008; Bellinaso et al., 2010). However, this process requires knowledge of spectral patterns for each soil class, as well as good references for comparisons (complete spectral libraries), a condition that is not always met. In this regard, numerical classification of profiles could be useful as a technique to support soil classification based on spectral data.
According to Campbell et al. (1970), numerical classification of profiles allows application of a numerical procedure to determine similarity between soil profiles and subsequently identify the most representative groups for the dataset. Various methods are described in the literature, suggesting the relevance of this technique and its applicability (Rayner, 1966; King & Girard, 1988; Carré & McBratney, 2005). More recently, Carré & Jacobson (2009) proposed a method of numerical classification of soil profiles called the Outil Statistique d'Aide à la Cartogénèse Automatique (OSACA). To date, all studies carried out have considered soil quantitative information obtained from laboratory analyses or qualitative properties obtained from field observations. However, no study has been carried out taking into account soil spectral behavior as input data for numerical classification.
Thus, this study tested numerical classification using only Vis-NIR spectra as input data to distinguish soil profiles, assessing the potential and limitations of this information to classify soils. In addition, we aimed to assess the correlations among principal component scores of soil spectra and soil properties, and to conduct a visual evaluation of soil profile spectral patterns and their relationship to soil classes in the Brazilian Soil Classification System (SiBCS).

Study area and data sampling
The area of study is located between latitudes 22° 42' 49" and 23° 0' 15" S, and longitudes 47° 57' 3" and 47° 30' 15" W, corresponding to the municipality of Piracicaba, São Paulo, Brazil. Soil samples from 20 profiles, previously collected and studied by Bellinaso (2009), were used in the analyses. The soil profiles were described and sampled according to Santos et al. (2005). The samples were analyzed for soil particle size using a densimeter with hexametaphosphate as a dispersing agent (Camargo et al., 1986). The resin method was used to determine pH (in H2O and KCl), organic matter content (OM), and exchangeable P, K, Ca, Mg, and Al contents. Cation exchange capacity (CEC), base saturation and Al saturation were calculated (Raij et al., 2001). The soil profiles were classified up to the 3rd category level according to Embrapa (2013) based on the chemical and soil particle size data, as well as soil morphological description and field observations. The profiles were also classified according to the World Reference Base for Soil Resources (WRB) (IUSS Working Group WRB, 2007) and are presented along with the Brazilian classification. In this work, the profiles classified as Latossolos (LV and LVA) (SiBCS) corresponded to Ferralsols (WRB), the Argissolos (PA, PVA, PV) to Acrisols, the Nitossolos (NV) to Nitisols, and the Cambissolos (CX) to Cambisols and Alisols.
Soil color was determined using a Minolta CR-300 colorimeter adjusted for the Munsell color system (Campos et al., 2003). The values were then converted to the RGB color system by the COLOSOL software to use the color as a numerical variable in statistical analysis (Viscarra Rossel et al., 2006a).
After that, the spectra of the soil samples were obtained using the FieldSpec Pro spectrometer (Analytical Spectral Devices, Boulder, Colo.), which has a spectral resolution of 1 nm and takes readings in the spectral range from 350 to 2,500 nm. The acquisition geometry was based on positioning the sensor perpendicular to the sample at a distance of 27 cm. The light source was positioned 61 cm from the sample, forming a zenith angle of 20°. The absolute reference standard was a white Spectralon plate. To cover the soil surface analyzed, the samples were measured in triplicate, and the three readings were then averaged.

Principal Component Analysis of spectra
Due to multicollinearity in high spectral resolution data (Chang et al., 2001) and the long time required for computational processing, transformation of the spectra was necessary. We used Principal Component Analysis (PCA) (Wold, 1982), carried out with the Parles 3.1 software (Viscarra Rossel, 2008), to summarize the soil spectral data. This software determines the principal components (PCs), which are given by the linear combination of the variables $X_1, X_2, \dots, X_j$ (in this case, the spectral wavelengths):
$$PC_i = a_{i1} X_1 + a_{i2} X_2 + \dots + a_{ij} X_j$$
in which $a_{ij}$ corresponds to the load of the variable j in the calculation of the principal component i.
The Principal Component (PC) is calculated to represent the greatest possible variance in the dataset, respecting the restriction applied to the load calculations:
$$a_{i1}^2 + a_{i2}^2 + \dots + a_{ij}^2 = 1$$
The process is repeated until the total number of principal components calculated is equal to the original number of variables or reaches a pre-established number of components. At the end of the analysis, the PCA provides the eigenvectors, eigenvalues and the principal component scores. The elements of an eigenvector are the loads $a_{ij}$ described above, also known as loadings (Viscarra Rossel, 2008). In this case, a set of loadings is associated with each principal component and each loading is associated with an original variable.
The loading value indicates the contribution of each variable (in this case, the reflectance at each spectral wavelength) to each principal component. This information is interpreted by observing how close the loading value is to 0, i.e., the closer the loading value is to 0, the lower the contribution. Similarly, the more positive or more negative the loading value is, the greater the contribution of the variable to the principal component.
The eigenvalues represent the variance explained by each principal component, which decreases from the first to the last principal component and is generally expressed as a percentage. Finally, the scores are one of the most important pieces of data, allowing the construction of ordination diagrams of samples (scatter plots). Visual analysis of these scatter plots allows the similarity between samples to be identified or even outliers to be located.
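The quantities described above (loadings, eigenvalues and scores) can be illustrated with a minimal sketch. It is not the Parles 3.1 implementation: it assumes mean-centring of each wavelength before decomposition, uses NumPy's singular value decomposition, and runs on synthetic spectra generated only for illustration. The function name `pca_spectra` and all data below are hypothetical.

```python
import numpy as np

def pca_spectra(spectra, n_components=4):
    """PCA of a (samples x wavelengths) reflectance matrix via SVD."""
    X = np.asarray(spectra, dtype=float)
    Xc = X - X.mean(axis=0)                  # ASSUMPTION: mean-centre each wavelength
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    loadings = Vt[:n_components]             # row i holds the loadings a_i1 ... a_ij of PC i
    scores = Xc @ loadings.T                 # coordinates of each sample on each PC
    eigenvalues = s ** 2 / (X.shape[0] - 1)  # variance carried by each PC
    explained = eigenvalues[:n_components] / eigenvalues.sum()
    return scores, loadings, explained

# Synthetic stand-in for soil spectra: 30 samples x 50 bands (NOT data from the study).
rng = np.random.default_rng(0)
bands = np.linspace(350, 2500, 50)
albedo = rng.uniform(0.1, 0.5, size=(30, 1))   # overall brightness of each sample
slope = rng.normal(0.0, 0.05, size=(30, 1))    # spectral tilt of each sample
noise = rng.normal(0.0, 0.002, size=(30, 50))
spectra = albedo + slope * (bands - 350) / 2150 + noise

scores, loadings, explained = pca_spectra(spectra)
print(explained.round(3))  # fraction of variance explained by PC1..PC4
```

Each row of `loadings` has unit length, which is the restriction on the loads mentioned above, and plotting one column of `scores` against another gives the ordination diagrams used later in the paper.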

Numerical classification of profiles
To perform numerical classification of the soil profiles, the first four PCs, which together explained more than 98 % of the data variability, were used as variables.
The OSACA algorithm (Carré & Jacobson, 2009) was used for numerical classification. This algorithm aims to classify the soil profiles using k-means clustering (Diday, 1971) by comparing soil characteristics, taking the position of their respective horizons into account. First, the profiles are compared two by two, i.e., profile "A" is compared with "B", then "A" with "C", and so forth. In these comparisons, the 1st horizon in profile "A" is compared to the 1st horizon in profile "B", the 2nd horizon in "A" to the 2nd horizon in "B", and so forth (Figure 1). The values of these comparisons are given by a distance metric, in this case, the Euclidean distance (Gauch, 1982). The average Euclidean distance between each two profiles, called pedological distance (Dped), is passed to the k-means clustering algorithm to classify the profiles. In the definition of the pedological distance, $S_a$ and $S_b$ are profiles A and B; $h_{i,j}$ is horizon j of profile i; $D_h(h_{a,j}, h_{b,j})$ is the Euclidean distance between the j-th horizons of profiles A and B; and $M_a$ and $M_b$ are the numbers of horizons of profiles A and B, respectively, considering that profile B has the higher number of horizons.
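The horizon-by-horizon comparison can be sketched as follows. This is an illustration under stated assumptions, not the OSACA code: the text does not say how the surplus horizons of the deeper profile are treated, so the sketch assumes they are compared with the last horizon of the shallower profile. The function name and the toy profiles are hypothetical.

```python
import math

def pedological_distance(profile_a, profile_b):
    """Mean Euclidean distance between horizons at matching positions.

    Each profile is a list of horizons, top to bottom; each horizon is a
    list of variables (for example, its first four PC scores).
    """
    shorter, longer = sorted((profile_a, profile_b), key=len)
    # ASSUMPTION (not from the source): the surplus horizons of the deeper
    # profile are compared with the last horizon of the shallower profile.
    padded = list(shorter) + [shorter[-1]] * (len(longer) - len(shorter))
    return sum(math.dist(x, y) for x, y in zip(padded, longer)) / len(longer)

# Toy profiles with two variables per horizon (illustrative values only).
profiles = {
    "A": [[0.0, 0.0], [3.0, 4.0]],
    "B": [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]],
    "C": [[1.0, 0.0], [3.0, 4.0]],
}
names = sorted(profiles)
dped = {(p, q): pedological_distance(profiles[p], profiles[q])
        for p in names for q in names}
print(dped[("A", "B")])
```

The resulting pairwise table is the kind of input a clustering step would then consume. The clustering itself is left out, because the source gives no detail on how OSACA combines Dped with k-means.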

Comparison between numerical and SiBCS classifications
The Brazilian Soil Classification System is used to cluster soil profiles with similar properties and provides important information for soil management and conservation procedures. Therefore, we checked for agreement between the OSACA and SiBCS classifications. Considering that SiBCS takes the diagnostic horizons into account, while the numerical method uses the pedogenetic horizons, we do not aim to validate the numerical method, but rather to observe, against an established system, whether the groups generated by OSACA are consistent. In this case, soil classes up to the 3rd categorical level, as well as information on soil texture and chemistry, were used as a basis for comparison. Although the SiBCS classification does not directly take this information into account, it is important in soil management and conservation procedures. In addition, it is generally indicated next to the soil classes in maps.
The agreement between the SiBCS and the numerical classification was determined by the Normalized Mutual Information (NMI) and the Uncertainty Coefficient (U) indices. The NMI is used to describe the degree of dependence between original classes and numerical groups (Santos, 2009). In other words, this index is a measure of the agreement between the groups obtained and the classes indicated (Liu & Navath, 2006). Therefore, random groups would have mutual information of zero, while high values of this measurement would indicate high similarity. One of the possible formulations for NMI is defined as (Shi & Ghosh, 2003):

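$$NMI = \frac{\sum_{i=1}^{g} \sum_{j=1}^{l} T_{ij} \log\left(\frac{n\, T_{ij}}{|P_i|\, |L_j|}\right)}{\sqrt{\left(\sum_{i=1}^{g} |P_i| \log\frac{|P_i|}{n}\right)\left(\sum_{j=1}^{l} |L_j| \log\frac{|L_j|}{n}\right)}}$$
in which $P_i$ represents one of the g groups; $L_j$ is one of the l classes; $|P_i|$ and $|L_j|$ are the numbers of profiles in that group and that class; $T_{ij}$ is the element of the contingency matrix that represents the number of profiles attributed to the $P_i$ group, which are also members of the $L_j$ class; and n is the total number of profiles. In this case, the NMI value ranges from 0 to 1, and the higher this value, the greater the relationship between the original classes and the groups.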
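As a worked illustration, the index can be computed directly from a contingency table whose rows are numerical groups and whose columns are reference classes. The sketch assumes the common geometric-mean normalization (mutual information divided by the square root of the product of the two entropies); other normalizations exist and give different values. The function name and the small tables are hypothetical and are not Table 4 of this study.

```python
import math

def normalized_mutual_information(table):
    """NMI of a contingency table (rows: groups P_i, columns: classes L_j)."""
    n = sum(sum(row) for row in table)
    group_sizes = [sum(row) for row in table]
    class_sizes = [sum(col) for col in zip(*table)]
    # Mutual information times n; empty cells contribute nothing.
    mutual = sum(
        t * math.log(n * t / (group_sizes[i] * class_sizes[j]))
        for i, row in enumerate(table)
        for j, t in enumerate(row)
        if t > 0
    )
    # Entropies times n, so the factor n cancels in the ratio below.
    h_groups = -sum(s * math.log(s / n) for s in group_sizes if s > 0)
    h_classes = -sum(s * math.log(s / n) for s in class_sizes if s > 0)
    if h_groups == 0 or h_classes == 0:
        return 0.0  # a single group or a single class carries no information
    return mutual / math.sqrt(h_groups * h_classes)

perfect = [[5, 0], [0, 5]]      # every group matches exactly one class
unrelated = [[2, 2], [2, 2]]    # groups are independent of classes
print(normalized_mutual_information(perfect))
print(normalized_mutual_information(unrelated))
```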
The U indicates whether the variation in the reference classification can be explained by the groups generated. Among the positive aspects of this coefficient, it does not penalize the algorithm if it generates groups that subdivide a class. Thus, if profiles belonging to a specific SiBCS class are subdivided into two groups, U treats the division as correct. The index is defined as (Shannon & Weaver, 1963):
$$U = \frac{H(I) - H(I|J)}{H(I)}$$
in which H(I) corresponds to the entropy of the distribution of the reference classification, and H(I|J) refers to the conditional entropy of the reference classification given the groups. By applying the index, we obtain a value between 0 and 1, where higher values suggest more closely associated classes and groups.
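The property that subdividing a class is not penalized can be checked numerically. The sketch below computes U as the reduction in the entropy of the reference classes once the group is known, divided by the entropy of the reference classes, from a contingency table with groups as rows and classes as columns. The function names and the example tables are hypothetical.

```python
import math

def entropy(counts):
    """Shannon entropy (natural log) of a list of counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def uncertainty_coefficient(table):
    """U = (H(I) - H(I|J)) / H(I); rows are groups J, columns are classes I."""
    n = sum(sum(row) for row in table)
    h_classes = entropy([sum(col) for col in zip(*table)])
    # Conditional entropy: class entropy inside each group, weighted by group size.
    h_conditional = sum((sum(row) / n) * entropy(row) for row in table if sum(row) > 0)
    if h_classes == 0:
        return 0.0  # only one reference class: nothing to explain
    return (h_classes - h_conditional) / h_classes

split_class = [[3, 0], [2, 0], [0, 5]]  # class 1 split over two groups, class 2 intact
merged_classes = [[5, 5]]               # one group mixing both classes
print(uncertainty_coefficient(split_class))
print(uncertainty_coefficient(merged_classes))
```

In the first table every group is pure, so the coefficient reaches its maximum even though one class is divided; in the second, the single group says nothing about class membership and the coefficient is zero.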

Soil profiles
The profiles were located in different landscape positions, with elevation ranging from 438 to 781 m (Figure 2). Those located on the east side of the study area, in the highest places, are derived from basaltic rocks. These soils showed high clay and iron oxide contents and, in some cases, high base saturation. An example was profile P52 (Table 1), which corresponded to a Latossolo Vermelho (LV) férrico with a heavy clay texture. There were also soil profiles classified as Nitossolo Vermelho (NV), with clay or heavy clay texture and, in some cases, high iron oxide content (18 % < ferric < 36 %). The profiles located in the hillside areas with intermediate elevation (Figure 2) correspond to Argissolo Vermelho and Argissolo Vermelho-Amarelo, with texture ranging from clay to loam. These soils were derived from claystones and siltstones of the Corumbataí and Iratí formations (Mezzalira, 1966) and are represented by profiles P26 and P54 (Table 1).
The profiles in the central and western areas corresponded to the loam-textured soils, with color ranging from Yellow-Red to Yellow. Profiles located in flat areas were mostly Latossolos, while those located in gently sloping areas corresponded to Argissolos. Soils on steeper slopes at lower altitudes (Figure 2) were classified as Cambissolo Háplico. The parent materials of these soils are claystones and sandstones from the same geological formations described above (Mezzalira, 1966). The representative profiles of these soils were identified as P32, P37, P40 and P49, which are shown in Table 1.

Relationship between soil spectral data and physico-chemical properties
The relationship between soil properties and spectral data was evaluated based on Pearson correlation analysis (Table 2). In this analysis, soil properties determined in the laboratory were compared to the PCA scores of the spectral data. In addition, the PCA loadings were plotted and examined to indicate which spectral range affected each PC score (Figure 3). By evaluating the PC1 loadings (Figure 3a), we observed that the scores were affected by the overall spectral reflectance intensity (albedo) at wavelengths beyond 600 nm. Thus, variability in PC1 was related to the spectral albedo, corroborating results found by Galvão et al. (2001) and Bellinaso et al. (2010). The correlation between clay content and PC1 scores showed a value of -0.74 (Table 2). Considering that PC1 was related to spectral intensity, this result is in accordance with those obtained by Bowers & Hanks (1965), Stoner (1979) and Demattê et al. (2004), which indicated that variations in soil particle size caused changes in soil reflectance. Sand contents are also related to PC1, showing a correlation of 0.78 (Table 2). In this case, a reduction in the clay content of the soil samples leads to a proportional increase in the percentage of other fractions, e.g., sand.
The loadings showed that PC2 scores were primarily related to the 350-600 nm range (Figure 3b). Reflectance variations in the visible region (400-700 nm) result from the presence of Fe oxides (Demattê & Garcia, 1999), which are directly related to variations in soil color (Fernandes et al., 2004; Chicatti, 2011). This is in agreement with the results observed in Table 2, where the color represented by the values of R, G and B showed significant correlations with PC2 of 0.78 (G) and 0.71 (B). In addition to soil color, we also observed a significant correlation between exchangeable Al and PC2 (r value of 0.73). According to Stenberg et al. (2010), cations in the soil do not have a direct relationship to the spectrum in the visible and near infrared range. According to them, there is no specific spectral feature in the Vis-NIR for elements retained in CEC. Moreover, the authors attribute the good results in some studies (Chang et al., 2001; Groenigen et al., 2003; Pereira et al., 2004; Nanni & Demattê, 2006) to the existence of local covariation between spectrally active soil properties and the cations evaluated. In our study, soil color is considered a spectrally active property, and Al3+ showed significant correlations with the R, G and B values of 0.74, 0.82 and 0.80, respectively (Table 2). Therefore, there was an indirect relationship between oxides and Al contents.
The most important wavelengths related to PC3 (Figure 3c) were around 1,400, 1,900 and 2,200 nm. The features at 1,400 and 1,900 nm are related to molecular vibration of the hydroxyl group (OH) in hygroscopic water and 2:1 clay mineral structures (Lindberg & Snyder, 1972). In contrast, variations at 2,200 nm are attributed to kaolinite (Goetz et al., 2009). Other ranges that had less influence on PC3 scores were from 350 to 600 nm and 850 to 1,100 nm, both related to the presence of hematite and goethite in soil samples (Leone & Sommer, 2000). The significant correlation between PC3 and R and G values, 0.79 and 0.77, respectively (Table 2), supported the statement that variations in PC3 were related to Fe oxides and hydroxides and, consequently, soil color. According to Viscarra Rossel et al. (2006b), this spectral range is also affected by OM, although in our study there was no significant correlation between the scores and that property. The lack of correlation between OM contents and reflectance data (Table 2) is in disagreement with several studies. Viscarra Rossel et al. (2010) evaluated the correlation between spectral PCs and OM in the municipality of Rafard, São Paulo, Brazil, and found significant values for PC3. Karmanov (1968) suggested that the interactions between OM and soil spectral properties are due to the accumulation of humic substances, which reduce soil spectral reflectance.
The PC4 was affected mainly by the 550-700 nm and 850-1,400 nm bands (Figure 3d), which were related to the presence of Fe oxides and hydroxides in the soil. Given that PC4 showed significant correlation to sand (r = 0.66) and clay (r = -0.59) contents (Table 2), the iron oxides in the clay fraction were probably responsible for the correlation between this property and PC4.
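The correlation analysis in this subsection pairs each soil property with the scores of one principal component. A minimal sketch of Pearson's r for one such pair is given below; the score and property values are invented for illustration and are not taken from Table 2, and significance testing is left out.

```python
import math

def pearson_r(x, y):
    """Pearson's linear correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical PC1 scores and particle-size data (g/kg) for four samples.
pc1_scores = [-1.2, -0.4, 0.3, 1.3]
clay = [620, 480, 350, 150]
sand = [200, 380, 500, 760]

r_clay = pearson_r(pc1_scores, clay)
r_sand = pearson_r(pc1_scores, sand)
print(round(r_clay, 2), round(r_sand, 2))
```

With these invented values the signs mirror the pattern reported for PC1: brighter spectra (higher scores) go with less clay and more sand.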

Spectral analysis of soil classes based on PCA
The determination of dissimilarity of soil classes, based on spectral reflectance, should be performed considering the qualitative analysis of spectra from both the surface horizon (HA) and the diagnostic subsurface horizons (HB). Despite good results (Demattê et al., 2004; Ben-Dor et al., 2008), this method requires knowledge of soil spectra and their characteristic properties for each soil class, which has not been established yet. Therefore, we evaluated the differences in spectral behavior of soil classes based on the scatter plots of the PCA scores (Figure 4), which were more easily interpreted. In this case, we considered the surface and subsurface diagnostic horizons.
Knowing the relationship between soil properties and the reflectance spectrum, it is possible to draw inferences about clay content, color and mineralogy from visual interpretation analysis. In addition, assessment of soil profile spectra and surface and subsurface horizons allows identification of the degree of weathering and environmental conditions to which soils have been subjected (Demattê et al., 2004).
In this context, the scatter plots of the scores from Argissolo profiles (Figure 4a,b,c) showed distinct values for HA in relation to HB. By analyzing these graphs, we observed that the values are greater in samples from horizon A for PC1 and PC2. This difference was more pronounced in Figure 4c, where the plot represented the values of PC2 and PC3. In this case, samples from HA of Argissolos were at the top left of the figure, while the samples from HB were found at the bottom right (Figure 4c). The PCA loadings (Figure 3) showed that the second component was related to the spectral range from 350 to 600 nm and, therefore, to Fe oxides (Demattê & Garcia, 1999). Thus, the sample distribution on that scatter plot indicated variation in Fe oxide contents between the subsurface horizons and their respective surface horizons. This is in agreement with Lepsch et al. (1977), who evaluated soils in a toposequence in the western plateau of São Paulo, Brazil, and found that Argissolos had a significant variation of Fe2O3 with depth. According to the authors, this characteristic is related to iron accumulation in the subsurface horizons caused by clay degradation in the HA of the profiles. The PC3 scores (Figure 4c) were higher in samples from HB than from HA. These scores were influenced by the features at 1,400, 1,900 and 2,200 nm, as evidenced in the loadings, indicating the presence of 1:1 and 2:1 clay minerals (Lindberg & Snyder, 1972), and therefore variation in the clay content with increasing depth in the profile.
The three profiles of Cambissolos showed spectra with distinct patterns, with high values for PC1, PC2 and PC3 (Figure 4d,e,f). For that reason, these profiles had a distinct position in the PC graph when compared to the soils of other classes (Figure 4). These soils presented a lower degree of weathering and tended to have higher contents of 1:1 and 2:1 clay minerals, which were identified in the spectra and produced higher PC3 values. The loam-textured soils with predominantly yellowish color (Table 1) generated greater values of PC1 and PC2. This was confirmed by the significant correlation between clay content and PC1, as well as between soil color and PC2 (Table 2). Moreover, in Figure 4e and 4f, we observed that surface and subsurface horizons may be distinguished because of the more intense degree of weathering in the soil surface layer, which led to variations in soil properties, such as clay type and content, as well as the presence of Fe oxides (Lepsch et al., 1977). Although there was a visible spectral distinction between HA and HB, it was not as pronounced as in Argissolos, thus excluding the possibility of confusion between these soil classes when assessed only by their spectral reflectance.
In regard to the Latossolos and Nitossolos, the PC scores of both soil classes showed similar spectral patterns, in which both PC1 and PC3 showed medium values, while the PC2 values were generally lower (Figure 4g,h,i,j,k,l). In addition, the samples from HA and HB did not show significant differences in regard to the scores. That similarity indicated homogeneity in the characteristics of samples at different depths, which is characteristic of these two soil classes. In the scatter plots, we observed close positioning of these soil samples at the left (Figure 4g,j) or closer to the origin of the graph (Figure 4h,i,k,l). Thus, these characteristics did not allow these soil classes to be distinguished from one another. However, both classes could be differentiated from the other soils evaluated in this study. Bellinaso et al. (2010) evaluated the spectra of 233 soil profiles from southeastern and Midwestern Brazil and found that Latossolos and Nitossolos had similar spectral behavior. The authors also indicated that the differentiation between these soils should be carried out through the presence of gibbsite in Latossolos, which provides a typical and more pronounced spectral feature at 2,265 nm. The presence of this mineral is due to a more advanced weathering process in LVs, providing a higher content of Fe and Al oxides and a greater loss of SiO2 (Buol et al., 1997). Thus, the inability to distinguish LVs from NVs was related, in part, to the fact that the PCA was not able to represent spectral variations at 2,265 nm, as suggested by the loadings (Figure 3). In this case, it is necessary to observe the spectral behavior of these soil classes to distinguish them. However, even in studies with more detailed information, there is still great confusion between LVs and NVs (Cezar et al., 2013).
Based on numerical classification, seven groups were identified within the dataset evaluated (Table 3). The method is an unsupervised classification, i.e., no previous rules or conditions were established for the OSACA algorithm. The clustering of soil profiles was conducted based only on spectral behavior, taking into account their horizons. This method was used to investigate whether the spectra by themselves are able to distinguish different soil clusters. In general, the comparison between the soil groups obtained from the numerical method and the soil classes from SiBCS showed similarities (Table 4). This indicates that spectra are relevant to soil analysis and classification.
The SiBCS classification at the first categorical level generated four soil classes. At the second categorical level, this number increased to seven classes. Adding information regarding texture to the suborder level, 13 different soil classes were obtained (Table 3). By evaluating the NMI index, we observed that the relationship between the clusters generated by OSACA and the classes of the Brazilian System of Soil Classification corresponded to 0.42 at the 1st level (Table 3). When these clusters were compared to the 2nd level + texture, this value increased to 0.74. The NMI index increased to 0.78 when the comparison was performed with the classification at the 2nd level plus texture and chemical properties (Table 3). The U index presented its greatest value when considering the 2nd level + texture, with a value of 0.64 (Table 3). For soil classification at the 2nd level + texture + chemistry, the U index showed a slight reduction, with a value of 0.62 (Table 3). Considering both indices (NMI and U), the best agreement between conventional classification and numerical clustering was found at the 2nd level + texture (Table 3).
By evaluating the differentiation carried out by numerical classification, we observed a trend toward grouping soil profiles with similar clay content (Figure 5a) and color (Figure 5b). In general, soils with higher clay content and red hue were allocated to group 1, while sandy soils with yellow hue were classified in group 7 (Figure 5). Groups 2, 3, 4 and 6, however, ranged from loam-textured to clay soils, with hues from 5YR to 7.5YR (Figure 5b). In contrast, group 5 consisted of one profile with 10YR, which was distinct from the others (Figure 5b). Considering that the parameters used by OSACA were the PC scores, and that PC1 is correlated with clay and the second and third components with soil color, we observed that this technique was able to cluster soil profiles with similarities in these properties.
The contingency table (Table 4) compares the classes obtained in the SiBCS with the OSACA groups. Overall, we observed that group 1 corresponded to Latossolos Vermelhos and Nitossolos Vermelhos. Therefore, the numerical technique was not able to distinguish soils belonging to these classes. In part, this was due to the fact that distinction between these two soil classes by spectral reflectance is carried out mainly by the absorption feature at 2,265 nm. This feature is related to gibbsite, which is more pronounced in LVs than in NVs (Bellinaso et al., 2010). In addition, given that these soils have high clay and iron oxide contents (Table 1) and, in some cases, the presence of magnetite, the spectral aspects able to distinguish these two classes might have been masked by these properties (Formaggio et al., 1996). Groups 2, 3 and 5 consisted of Argissolo profiles (Table 4). These soil profiles were divided into several groups, probably because of differences in clay content and color, as well as the distinct variation patterns of these properties with depth. This may be exemplified by the comparison between groups 3 and 5 (Table 4). In this case, group 3 had one soil profile with loam texture and Yellow-Red Munsell color, classified as Argissolo, and soil profiles with clay texture and Yellow-Red color, also classified as Argissolo. In contrast, group 5 consisted of one profile with clay texture and Yellow color, also classified as Argissolo (Table 4).
Group 6 consisted of profiles classified as Latossolo Vermelho Amarelo and Argissolo Vermelho Amarelo, both loam-textured (Table 4). These profiles were clustered in the same group because both were highly weathered, with a predominance of 1:1 clay minerals and Fe oxides. In this case, the main factor for distinguishing both classes is the variation in intensity of reflectance between surface and subsurface layers due to the texture gradient in Argissolos, as well as the feature at 2,265 nm, due to the higher concentrations of gibbsite in Latossolos (Bellinaso et al., 2010). However, the texture gradient was probably not pronounced enough to be detected by the spectra. Finally, group 7 was associated with soil profiles identified as Cambissolo Háplico with loam texture (Table 4). When Cambissolos are compared to highly weathered soils, such as Argissolos, Latossolos and Nitossolos, great differences are observed (Bellinaso et al., 2010). That was reinforced in this study, where group 7 was exclusively associated with Cambissolo profiles (Table 4).

CONCLUSIONS
1. The soil spectra do not have features directly related to soil chemical properties; consequently, no significant correlations are expected between those two sets of information.
2. Descriptive analysis of soil spectral data, based on the diagnostic surface and subsurface horizons, allows different soil classes to be distinguished. It is observed that variations in PC scores between the surface and subsurface horizons of Argissolos enable a satisfactory level of distinction of this soil as compared to other classes. In this case, graphic representation of PC2 vs PC3 allows better discrimination of Argissolos.
3. Latossolos and Nitossolos have similar spectra and had PCs with great similarity in the surface and subsurface horizons. Consequently, distinction between those classes requires further information, such as morphological features.
4. Spectral reflectance data associated with the OSACA algorithm is an efficient methodology for soil classification.The results show a consistent clustering of soil profiles, as well as a good relationship to the SiBCS classes.
5. The similarity between numerical classification and the Brazilian system was higher when considering the 2nd categorical level + texture. When considering the 2nd categorical level + texture + chemical properties, the similarity decreased. This is indicative of spectral limitations in the classification process. Therefore, in situations where some chemical properties are required to distinguish soil classes correctly, soil spectra might not be the best option.

Figure 2. Profile location and landscape elevation in the survey area.

Figure 4. Graphs representing the principal components 1, 2 and 3 of the spectra from the profiles of Argissolos, Cambissolos, Latossolos and Nitossolos in the surface ( ) and subsurface ( ) layers.

Figure 5. Average clay content and standard deviation for each OSACA group (a); and frequency of soil profile hues in the different OSACA groups (b).

Table 1. Chemical and particle size analysis of soil profile samples

Table 3. Normalized Mutual Information (NMI) and Uncertainty Coefficient (U) for the categorical levels of the Brazilian Soil Classification System (SiBCS; Embrapa, 2013)

Table 4. Contingency table of SiBCS classes by OSACA groups