Is It Possible to Classify Topsoil Texture Using a Sensor Located 800 km Away from the Surface?

It is often difficult for pedologists to “see” topsoils indicating differences in properties such as soil particle size. Satellite images are important for obtaining quick information for large areas. However, mapping extensive areas of bare soil using a single image is difficult since most areas are usually covered by vegetation. Thus, the aim of this study was to develop a strategy to determine bare soil areas by fusing multi-temporal satellite images and classifying them according to soil textures. Three different areas located in two states in Brazil, with a total of 65,000 ha, were evaluated. Landsat images of a specific dry month (September) over five consecutive years were collected, processed, and subjected to atmospheric correction (values in surface reflectance). Non-vegetated areas were discriminated from vegetated ones using the Linear Spectral Mixture Model (LSMM) and Normalized Difference Vegetation Index (NDVI). Thus, we were able to fuse images with only bare soil. Field samples were taken from bare soil pixel areas. Pixels of soils with different textures (soil texture classifications) were used for supervised classification in which all areas with exposed soil were classified. Single images reached an average of 36 % bare soil, where the mapper could only “see” these points. After using the proposed methodology, we reached a maximum of 85 % in bare areas; therefore, a pedologist would have proper conditions for generating a continuous map of spatial variations in soil properties. In addition, we mapped soil textural classes with accuracy up to 86.7 % for clayey soils. Overall accuracy was 63.8 %. The method was tested in an unknown area to validate the accuracy of our classification method. Our strategy allowed us to discriminate and categorize different soil textures in the field with 90 % accuracy using images. 
This method can assist several professionals in soil science, from pedologists to mappers of soil properties, in soil management activities.


INTRODUCTION
Orbital remote sensing in a country of continental dimensions, such as Brazil, is an indispensable tool for understanding and monitoring natural resources (Lima et al., 2001). Soils play a very important role in plant development and global food production, but soils have typically been used without proper knowledge, characterization, and studies. Improper land use results in soil degradation, low crop yield, and high costs from unsustainable production.
One of the most important soil properties is texture, due to its relationship with other properties, such as structure, porosity, permeability, fertility, chemistry, and moisture content (Brady and Weil, 2007). Soil texture is obtained from soil particle size distribution, which is mainly analyzed and mapped using traditional approaches in which pedologists make boreholes in the field or collect samples from soil profiles and analyze their findings in a laboratory. This procedure is costly and time-consuming. However, areas with high intensity agriculture need this information. Thus, there is the need for a more effective method of mapping soil texture.
Remote sensing is an important tool for soil surveying. In particular, many studies have found that texture can be quantified by spectral reflectance under laboratory conditions (Nanni and Demattê, 2006). Nanni et al. (2012) have also observed the importance of orbital images for clay estimates. These studies have shown that it is possible to determine soil particle distribution (clay, silt, and sand percentages) by spectral data. Identification of bare soils by satellite imaging is not new (e.g., Demattê et al., 2009; Ghaemi et al., 2013; Masoud, 2014), but continuous information on bare soils in an area is still an important difficulty for soil scientists, especially pedologists.
Spectral data from orbital levels usually show vegetated areas where it is not possible to detect soil information. Therefore, how can a pedologist map an area if only certain spots of bare soil are shown in a single image? This problem has been observed in field studies; if we had one image with all the information from bare soil, the survey would certainly be easier. Methodologies that perform well have been restricted to spots of bare soil. Brazil and African countries, in tropical regions, have extensive areas of agriculture, and information on soil texture from orbital data can assist soil surveying and mapping.
In this context, the aim of the present study was to develop a strategy to identify continuous areas of bare soil by fusing multi-temporal orbital images of the same location and by mapping the topsoils according to soil textural classification. The hypothesis of our research is that changes in soil texture affect spectral data, which can be detected and mapped by satellite images. In addition, the strategy we propose is based on field observation of conventional agricultural areas with bare soils in different periods of the year, where the fusion of images from different years can provide a complete "picture" of bare soil.

Description of study sites
Three sites were chosen: two in the state of São Paulo and one in the state of Goiás (Figure 1). The study sites covered a total of 30,000 ha in São Paulo (15,000 ha each) and 35,000 ha in Goiás. All sites are at altitudes ranging from 450 to 900 m; the climate is temperate with dry winters (Aw, Köppen classification system), with average annual rainfall ranging from 1,000 to 1,800 mm and average temperature of 20 °C. The lithology is mainly represented by the Serra Geral, Botucatu, and Pirambóia (São Bento Group) Formations, and Serra de Santana covers (Taubaté Group). The rocks from the Serra Geral Formation are volcanic of basaltic origin; the rocks from the Botucatu Formation are eolian sandstones; and the rocks from the Pirambóia Formation are composed of sandstones from river deposits and floodplains (Bistrichi et al., 1981). The soils at the sites are mainly classified as Arenosols and Ferralsols (WRB, 2014), which correspond to Neossolos Quartzarênicos and Latossolos, respectively (Santos et al., 2013).

Database
This study was divided into three steps: (1) to determine a methodology for measuring exposed continuous soil areas with no vegetation in Area 1 from the state of São Paulo; (2) to perform a theoretical validation in Area 2 from the state of São Paulo; and (3) to perform a practical validation in an unknown area from a different region (Area 3 in the state of Goiás) to check the potential of the method. First, we needed to understand the textural condition of the target areas at the terrestrial level before analyzing it at the satellite level. For that purpose, topsoil samples were obtained along toposequences (transect method). We collected 300 soil samples at a depth of 0.00-0.20 m from 20 toposequences (Figure 2) and their georeferenced images. After that, we analyzed soil particle distribution to determine the contents of coarse and fine sands, silt, and clay in the soil samples (Camargo et al., 1987). Clay contents were grouped into five textural classes, according to Santos et al. (2013): sandy (<150 g kg⁻¹ of clay), sandy loamy (150 to 250 g kg⁻¹ of clay), clayey loamy (250 to 350 g kg⁻¹ of clay), clayey (350 to 600 g kg⁻¹ of clay), and heavy clayey (>600 g kg⁻¹ of clay). Soils in these areas were mostly classified as Arenosols and Ferralsols (WRB, 2014). Then, the 300 sampling points were positioned in pixels corresponding to bare soils.
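The grouping of clay contents into textural classes can be illustrated with a minimal sketch. The function name `classify_texture` is hypothetical, and the handling of values that fall exactly on a class limit (lower limit inclusive, 600 g kg⁻¹ assigned to clayey) is an assumption, since the text only gives the ranges.

```python
def classify_texture(clay_g_per_kg: float) -> str:
    """Return the textural class for a clay content given in g/kg.

    Class limits follow the ranges quoted in the text; how exact
    boundary values are assigned is an assumption of this sketch.
    """
    if not 0 <= clay_g_per_kg <= 1000:
        raise ValueError("clay content must be between 0 and 1000 g/kg")
    if clay_g_per_kg < 150:
        return "sandy"
    if clay_g_per_kg < 250:
        return "sandy loamy"
    if clay_g_per_kg < 350:
        return "clayey loamy"
    if clay_g_per_kg <= 600:
        return "clayey"
    return "heavy clayey"


if __name__ == "__main__":
    # Illustrative values only, not data from the study.
    for clay in (80, 200, 300, 450, 700):
        print(clay, classify_texture(clay))
```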

Satellite data acquisition and processing -fusion preparation
Five TM/Landsat-5 images were used in order to obtain consecutive years in the same season. September was chosen since the soil is usually dry in this month. In large agricultural areas cultivated with sugarcane, traditional tillage resulting in bare soils was practiced in different areas from one year to the next. This is a five-year cycle, which means that an area tilled in a given year will only be tilled again five years later. Within a five-year period, all parts of a given area should have bare soil. Thus, our strategy was to collect images from different years to detect spots of bare soils and construct a map like a "puzzle", where each piece is related to bare soil from a specific year.
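The images were georeferenced using ground control points obtained from GPS, and the nearest neighbor was used as an interpolation method (RSI, 2006) to maintain the original value of each pixel. After that, atmospheric correction was performed by the 6S program (Second Simulation of the Satellite Signal in the Solar Spectrum) to convert digital numbers into surface reflectance values (Vermote et al., 1997). The 6S program allows geometric configuration of specific satellites, such as Landsat 5, to be selected.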
For each satellite image, the Linear Spectral Mixture Model (LSMM) was used to discriminate vegetated areas from images of bare soil (Demattê et al., 2009), reducing pixel mixture and quantifying proportions of pure elements that constitute the pixel mixture (Shimabukuro and Smith, 1991). After using the LSMM, the following image-processing procedures were performed to indicate the true nature of the pixel: A) determination of NDVI images (Equation 1), B) evaluation of the pixel position at the soil line, C) display of the false-color band combination of bands 5 (1.55 to 1.75 µm), 4 (0.76 to 0.90 µm), and 3 (0.63 to 0.69 µm) as red, green, and blue, respectively, D) display of the true-color band combination of bands 3, 2 (0.520 to 0.600 µm), and 1 (0.450 to 0.520 µm) as red, green, and blue, respectively, and E) determination of the pixel spectral manifestation (spectral shape), as described by Demattê et al. (2009). Thus, only when all these procedures simultaneously indicated a pixel as bare soil was the pixel used for subsequent analysis. Pixels were rejected if at least one of these procedures, such as NDVI, indicated any possibility of contribution from vegetation in their spectral reflectance (Demattê et al., 2009). Only pixels that met all the requirements were used to establish the library for a certain soil class. With the information on bare soil positions for each image, a mask was made excluding all vegetated areas using the following equation:
$$\mathrm{NDVI} = \frac{IV - VIS}{IV + VIS} \qquad \text{(Eq. 1)}$$
where IV is the spectral pixel response in the near infrared (TM 4) and VIS is the spectral pixel response in the visible (TM 3).
The images corresponding to different seasons were individually classified using the Gaussian Maximum Likelihood algorithm (supervised classification), where bands 1, 2, 3, 4, 5, and 7 were used in the classification. The vector file containing the 300 sampling points with information related to soil textural classes was overlaid with the surface reflectance images. The reflectance of pixels at each sampling point was obtained, and five regions of interest (ROIs) were designated, corresponding to each soil texture class. All pixels that were not identified by supervised classification as belonging to any specific textural class were defined as "NoData" and were reclassified to zero value. The images displayed only pixels related to classes of interest. After that, a fusion of all five classified images with their respective portions of bare soils was obtained by overlapping them. We suggest the name fused image (FI) for this final product.
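As an illustration of the NDVI criterion used in the masking step, the following minimal sketch computes NDVI from near-infrared (TM 4) and red (TM 3) surface reflectance and flags low-NDVI pixels. The helper names and the threshold of 0.25 are assumptions made for the example; the text does not report a threshold, and in the study NDVI was only one of five criteria that all had to agree before a pixel was accepted as bare soil.

```python
from typing import List

Grid = List[List[float]]


def ndvi(nir: float, red: float) -> float:
    """Normalized Difference Vegetation Index for one pixel."""
    total = nir + red
    if total == 0:
        return 0.0  # avoid division by zero on empty pixels
    return (nir - red) / total


def low_ndvi_mask(nir_band: Grid, red_band: Grid,
                  threshold: float = 0.25) -> List[List[bool]]:
    """True where NDVI is below the (assumed) bare-soil threshold."""
    return [
        [ndvi(n, r) < threshold for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]


if __name__ == "__main__":
    # Illustrative reflectance values, not data from the study.
    nir = [[0.50, 0.30]]
    red = [[0.10, 0.20]]
    print(low_ndvi_mask(nir, red))
```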
A mosaic with five supervised classification images (SCI) was designed to show the largest possible area covered by the supervised classification in a single image. Thus, a mosaic of exposed soils was generated and soil texture was classified.
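The overlapping of classified images into a single mosaic can be sketched as follows. This is a simplified illustration under stated assumptions: each classified image is a grid of integer class codes with 0 meaning "NoData", and when several years classify the same pixel, the first image in the list is kept. The text does not state which year takes priority, so that rule and the function names are hypothetical.

```python
from typing import List, Sequence

ClassGrid = List[List[int]]


def fuse_classified(images: Sequence[ClassGrid]) -> ClassGrid:
    """Overlay classified images; 0 (NoData) is filled from later images."""
    rows, cols = len(images[0]), len(images[0][0])
    for img in images:
        if len(img) != rows or any(len(row) != cols for row in img):
            raise ValueError("all images must have the same shape")
    fused = [[0] * cols for _ in range(rows)]
    for img in images:
        for r in range(rows):
            for c in range(cols):
                # Keep the first classified value found for each pixel.
                if fused[r][c] == 0 and img[r][c] != 0:
                    fused[r][c] = img[r][c]
    return fused


def bare_fraction(img: ClassGrid) -> float:
    """Fraction of pixels carrying a texture class (non-zero code)."""
    total = sum(len(row) for row in img)
    classified = sum(1 for row in img for value in row if value != 0)
    return classified / total


if __name__ == "__main__":
    # Three toy 2 x 2 "years"; values are illustrative class codes.
    year_a = [[1, 0], [0, 0]]
    year_b = [[2, 3], [0, 0]]
    year_c = [[0, 0], [4, 0]]
    fused = fuse_classified([year_a, year_b, year_c])
    print(fused, bare_fraction(fused))
```

In this toy case each single image classifies at most half of the pixels, while the fused grid classifies three of the four, which mirrors the gain in bare soil coverage the method aims for.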
In the validation stage (Step 2), data from Area 1 were tested in Area 2. Area 2 was subdivided into smaller continuous areas (Figure 2). Both SCI and the fused image were cut off based on these sub-areas and were subjected to the "Zonal Geometry" routine of the ArcGIS 9.2 program to calculate the percentage of bare soil obtained by supervised classification. We positioned 204 other sampling points over this unknown area. These soil samples were collected, analyzed (as described earlier), and classified according to textural classes. We considered these data points as "reference data" and compared them with pixels obtained from the fused image. From this comparison, a contingency table (error or confusion matrix) was obtained. In addition, we performed correlation between the test results and determined the percentages of correct answers for each class, as well as overall accuracy and the Kappa index (Cohen, 1960) (Equation 2).
$$K = \frac{N \sum_{i=1}^{n} m_{ii} - \sum_{i=1}^{n} \left( \sum_{j=1}^{n} m_{ij} \right) \left( \sum_{j=1}^{n} m_{ji} \right)}{N^{2} - \sum_{i=1}^{n} \left( \sum_{j=1}^{n} m_{ij} \right) \left( \sum_{j=1}^{n} m_{ji} \right)} \qquad \text{(Eq. 2)}$$
where K is the Kappa coefficient, n is the number of columns and rows of the confusion matrix, m_ij is the element (i,j) of the confusion matrix, and N is the total number of observations.
Finally, an in situ field validation was performed. For this validation, we analyzed a third area (Area 3) located in the state of Goiás, quite distant from the other sites (Figure 1). This validation was performed as follows: a) the previously described fusion method was used to obtain a single image of bare soil distribution; b) 200 pixels inside this area (35,000 ha) were chosen to obtain different spectra; c) we went to the field and identified soil texture using field methodology (by feel) of each topsoil at the spots (pixels); and d) correlations between spectra from fused images and topsoil texture were performed.
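The overall accuracy and Kappa index obtained from a confusion matrix can be sketched as below, using the standard form of Cohen's Kappa. The function name and the 2 x 2 example matrix are hypothetical and are not the study's data.

```python
from typing import List, Tuple


def accuracy_and_kappa(matrix: List[List[int]]) -> Tuple[float, float]:
    """Return (overall accuracy, Cohen's Kappa) for a square confusion matrix."""
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        raise ValueError("confusion matrix must be square")
    total = sum(sum(row) for row in matrix)                # N
    diagonal = sum(matrix[i][i] for i in range(n))         # correct answers
    row_totals = [sum(matrix[i]) for i in range(n)]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    # Sum of (row total x column total): the chance-agreement term.
    chance = sum(row_totals[i] * col_totals[i] for i in range(n))
    if total == 0 or total ** 2 == chance:
        raise ValueError("Kappa is undefined for this matrix")
    overall = diagonal / total
    kappa = (total * diagonal - chance) / (total ** 2 - chance)
    return overall, kappa


if __name__ == "__main__":
    # Toy matrix: rows and columns are two classes, values are pixel counts.
    example = [[20, 5],
               [10, 15]]
    print(accuracy_and_kappa(example))
```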

Fused image of bare soils
The sandy class showed the largest number of observations (83) and the highest coefficient of variation (Table 1). The contents of organic matter (OM) analyzed in the sites had low variation, from 0.5 to 0.7 %. Chemical analysis also indicated very low fertility and cation exchange capacity (CEC) (Table 1). However, particle size distribution was the soil property that most influenced the spectra. Considering all textures within an area of 15,000 ha, we found high and similar numbers of samples. The only exception was for heavy clayey. In fact, heavy clayey is a very specific class and does not occur at this site.
The SCI obtained from Landsat images and classified according to the four soil texture classes covered only areas of bare soil, which was evident when we simultaneously observed satellite images, NDVI images, and the SCI (Figure 3). The satellite image was visualized with the following composition: R (band 3), G (band 2), and B (band 1), whereas the NDVI image represented exposed soil by a dark color and vegetation by a light color. The fact that SCI involved only areas of exposed soil is very important because this exposure is necessary to perform analysis of soil texture by satellite imaging.
Table 1 notes: (1) n: number of samples; (2) SE: standard error; (3) CV: coefficient of variation.
When we analyzed images for each year, areas that were classified as N.Class (Not classified) were larger than the useful area to which we applied the supervised classification (Table 2, Figure 4). N.Class areas were not classified (in the respective season) because they had vegetation cover. We determined that, on average, 63 % of the total area of each sub-area showed N.Class values compared with only 37 % of the total area with bare soils. When we used a single image, we had an average of 36.2 % considered as bare soils. On the other hand, using the fused image, we obtained an average of 75.2 %, a minimum of 59 %, and a maximum of 85 % considered as bare soil. The study sites were cultivated; therefore, they have some type of plant cover through most of the year, hindering analyses by satellite imaging. Such information highlights the problem of working with images from only one season. For example, in the case of an image dated from Aug. 17, 2003, sub-area 7 showed a percentage of useful area of 72 %, while sub-area 8 showed a percentage of only 1 % (Table 2). Thus, it is impossible to use images of Area 3 for soils using this date (Figure 5). On the other hand, the importance of evaluating images in different years can be observed in sub-area 8. From 2002 to 2007 (with the exception of 2003), the bare soil rate was 1, 56, 42, 63, and 3 %, respectively. In practice, a pedologist could obtain a maximum area with 63 % bare soil analyzing only one image. When we used the proposed method, we obtained 75 % bare soil (Table 2, Figure 5).
Table 2 notes: (1) Non-classified area, not to be considered as exposed soil; (2) Area classified as exposed soil.
Rev Bras Cienc Solo 2016;40:e0150335
values were higher than those found by Demattê et al. (2005) for a similar area. Some confusion was observed: 33.3 % of the clayey class was mixed with the clayey loamy class, and the sandy loamy class, with 56.6 % agreement, had 28.3 % misclassified as the clayey loamy class. The confusion in supervised classification mostly involved classes adjacent to the correct texture class.
Topsoils of clayey texture were discriminated from soils of the sandy class in 100 % of the classification, that is, none of the clayey soils were classified as sandy texture. Okin et al. (2001) had considerable success in discriminating clayey from sandy soils, with approximately 90 % accuracy. However, the authors conducted the experiment with the high spectral resolution Airborne Visible and Infrared Imaging Spectrometer (AVIRIS) using the MESMA method. This sensor has 224 spectral bands and is considerably more powerful than the Landsat, which explains their better results. Coleman et al. (1993) obtained an R² of 0.4 for clay using Landsat. Nanni and Demattê (2006) reached 67 % accuracy for the clayey class when they evaluated an area with full exposure of non-vegetated areas (bare soils). Demattê et al. (2009) reached 90 % accuracy evaluating 224 pixels in a 300,000 ha area, but in this case, it was all done in a single image. The fact is that the present methodology reached a classification accuracy ranging from 60 to 86 % for some soil texture classes using multi-spectral Landsat images (only six bands) and fused images from several seasons. Moreover, the methodology used in this study generated continuous areas with bare soil that can provide more sustainable information to pedologists. Still, we have to keep in mind that the soil sample was collected in a single spot that corresponds to 0.20 m² (the borehole), whereas the pixel of the image covers 30 m × 30 m. Moreover, the sensor is located at a distance of 800 km from the target.
The overall accuracy (63.8 %) (Table 4) was lower than the minimum acceptable value (85 %) according to Guptill and Morrison (1995).Considering the classes individually, we observed that clayey texture was the only class whose value was higher than the minimum acceptable value, and the Kappa index obtained (0.52) was considered appropriate by the Landis and Koch (1977) classification, suggesting that the method can be used to indicate soil texture on the surface.

In situ evaluation
In areas located in the state of Goiás, we created a fused image of exposed soils, as mentioned in the methodology. We used a 543 RGB color composition based on the image and an unsupervised classification. We then went to the field using GPS, and with the image on a palmtop, we went to spots already expecting to see the soil texture. This in situ approach is illustrated in Figure 6. For each validation spot, we went to the field and determined its texture (at 200 points). We determined that the different spectral information detected in the image enabled differentiation of soil texture. The soils of these areas are mostly Arenosols and Ferralsols, which have no texture variation with depth. Thus, determining surface texture represents most of the soil. Since these are very large areas, determining soil texture as "continuous" would be highly cost-effective. The OM content did not influence the results, because it was similar in all areas, ranging from 0.3 to 0.5 %. In many situations, we observed light colors in the 543 RGB image and expected to "see" sandy soils. In fact, upon reaching the spot, we made the borehole, and a sandy texture was detected. We observed that 90 % of the soil samples in situ were in the same texture classification as indicated by the image. The image also indicated spatial variation (Figure 6). This information can assist farmers in making a texture map to assist agriculture practices, such as the use of herbicides.
Soil texture is one of the most important features in production of many crops as it is related to water retention, soil aggregation, and soil exchange capacity. Our method shows that it is possible to have excellent, detailed information on soil texture in these areas.
The use of hyperspectral images is clearly the best choice because of the great number of bands. However, these data are still under development, and they are difficult to use and somewhat costly. AVIRIS, with its 224 bands, is a private sensor and is not available to users. Hyperion has low quality (high noise information) and does not have high temporal information. Thus, we need an easy and inexpensive method to assist users. In this context, Landsat has high temporal information and free access, achieving fairly good results, as presented in this study.

CONCLUSIONS
Using the fused image (FI) methodology of multi-temporal images of the same region, the bare soil area was increased from 36 to 75 %, representing a gain of more than 100 %. There is wide variation in the amount of bare soil in satellite images at the same site in different seasons, ranging from 1 to 65 %. This range indicates the importance of evaluating images in different years (time series). In our methodology, the use of images from different periods enabled us to map more than 75 % of the total area studied, an increase of up to 60 % when compared with the use of a single image.
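As a quick check of the reported gain, the relative increase in bare soil coverage from the single-image average to the fused-image average can be worked out from the rounded figures given above (36 and 75 %):

```latex
\[
\text{relative gain} = \frac{75 - 36}{36} \times 100\,\% \approx 108\,\%
\]
```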

Figure 2. Distribution scheme of sampling points in Area 1 (calibration phase) as well as in Area 2 (validation phase), and determination of eight sub-areas for the validation phase.

Figure 6. Illustration of field work: (a) Landsat image indicating areas with bare soils, sandy and clayey; (b) soil variability and sampling in the field; (c) soil variability in an area with bare soil; (d) soil variability in an area with vegetation cover; and (e) Landsat image indicating variations in soil texture.

Table 2. Quantitative statistics of exposed soil areas in each study sub-area and in different years