THE USE OF ARTIFICIAL INTELLIGENCE FOR ESTIMATING SOIL RESISTANCE TO PENETRATION

The aim of this study was to present and evaluate methodologies for estimating soil resistance to penetration (RP) using artificial intelligence prediction techniques. To that end, a database of physical-water characteristics of soils available in the literature was used, and the performance of Artificial Neural Networks (ANN) and Support Vector Machines (SVM) was evaluated. The ANN models were implemented as multilayer perceptrons with the backpropagation algorithm in Matlab software, varying the number of neurons in the input and intermediate layers. The SVM models were built in RapidMiner software, varying the input variables, the kernel function and the coefficients of these functions. The efficiency of the techniques was analyzed by the 1:1 ratio and then compared with the Busscher non-linear model (Busscher, 1990). The results showed that the artificial intelligence models (ANN and SVM) are efficient and have predictive capacity superior to the Busscher model, both for soils of different textural classes and managements and for soils of similar ones, with higher performance index values for soils of the same textural class exposed to the same management.


INTRODUCTION
The sustainable use of natural resources, especially soil and water, has become a topic of increasing relevance and the focus of numerous studies. Among these (Tavares Filho et al., 2012; Cortez et al., 2017; Ribon & Tavares Filho, 2008), several have addressed the soil compression process, mentioning the use of models or functions that identify the physical quality of the soil (Fernandes et al., 2016). Understanding this process is essential to estimate the changes that can occur in the soil structure when it is subjected to a given external pressure.
Soil compaction increases soil density and mechanical resistance and reduces total porosity by altering the pore size distribution (Ajayi et al., 2009; 2010; Chioderoli et al., 2012). Compacted soil also has increased load bearing capacity (Dias Junior et al., 2007; Deperon Júnior et al., 2016), and compaction limits nutrient absorption, water infiltration and redistribution, gas exchange, seedling emergence and the development of the root system, which results in decreased production (Arvidsson, 2001; Dauda & Samari, 2002; Gubiani et al., 2013; Toigo et al., 2015). These conditions increase erosion as well as the power required by the equipment used in soil preparation (Canillas & Salokhe, 2002). Martins (2012) states that, due to the technological development of mechanized harvest and its potential to promote soil compaction, researchers have used various physical and mechanical properties to quantify the effect of compaction on soil structure, with resistance to penetration being the most frequently used (Cortez et al., 2017; Dias Junior et al., 2008; Molina Junior et al., 2013).
In this context, knowledge of resistance to penetration can be used to develop management strategies that minimize the risks of reduced productivity and additional soil compaction caused by field operations, mainly motorized mechanized ones.
Several methodologies exist to estimate soil resistance to penetration, including empirical, analytical and numerical methods. Their disadvantages are that they either do not consider some important parameters, such as mineralogy, which limits them to soils with similar behavior, or rely on algorithms with a high computational cost. However, studies carried out by Soares et al. (2014), estimating water retention curves in agricultural soils, showed that the results obtained by pedotransfer functions generated by artificial neural networks are often better than those obtained by traditional methods.
Artificial neural networks (ANN), which encompass several techniques that can emulate human behavior, stand out in the field of Artificial Intelligence (AI). In addition to ANN, Support Vector Machines (SVM) are a technique based on Statistical Learning Theory (Takahashi, 2012) that has received great attention in recent years. Studies related to soil load bearing capacity, such as Martins & Miranda (2012), showed that support vector machines led to a better estimation of the safety factor and a better evaluation of slope stability in earth dams.
Based on the above, there is a need for research applying Artificial Intelligence to the area of soil physics that deals with load bearing capacity. This study therefore aimed to evaluate the estimation of resistance to penetration obtained by artificial neural networks and support vector machines.

MATERIAL AND METHODS
For the development and training of the models (ANN and SVM), a database was built from the physical-water characteristics of soils gathered from the national and international literature on soil load bearing capacity.
The following were selected as input variables for the formulation of the models: clay (Cla), sand (San) and silt (Sil) contents, soil density (Sd), particle density (Pd) and soil volumetric moisture (θ). Six input arrangements were tested: the first comprised all the input variables, and the others were obtained by removing some of them (Table 1).

TABLE 1. Independent input variables in the computational models of resistance to penetration (RP) estimation.

The ANN models were implemented as multilayer perceptrons with the backpropagation algorithm and Levenberg-Marquardt optimization, using the Neural Network Toolbox of Matlab® 2008b software. The network architecture consisted of an input layer, a hidden layer and an output layer. For the hidden layer, 10, 20, 30, 40 and 50 neurons were tested, according to Braga (2014), in order to verify which topology generated the best results.
Because the free parameters are generated at random at the beginning of training, and according to Soares et al. (2014) these initial values may influence the final training result, each combination of variables in the ANN was trained 3, 4, 5, 6, 7, 8, 9, 10, 13, 15, 17, 19, 25 and 30 times. After training, the best result was taken as the one with the highest coefficient of determination (R²) between the literature data reserved for validation and the values estimated by the artificial neural network simulation.
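The selection procedure described above (several hidden-layer sizes, several random restarts, best validation R² wins) can be sketched as follows. This is only an illustrative sketch under stated assumptions: the tiny network is trained by plain stochastic gradient descent as a stand-in for the Levenberg-Marquardt training of the Matlab toolbox, R² is taken as the squared Pearson correlation, the data are synthetic, and all function names (`train_mlp`, `search_topologies`, `r_squared`) are hypothetical.

```python
import math
import random


def r_squared(observed, estimated):
    """Squared Pearson correlation between observed and estimated values."""
    n = len(observed)
    mo = sum(observed) / n
    me = sum(estimated) / n
    soo = sum((o - mo) ** 2 for o in observed)
    see = sum((e - me) ** 2 for e in estimated)
    soe = sum((o - mo) * (e - me) for o, e in zip(observed, estimated))
    if soo == 0.0 or see == 0.0:
        return 0.0  # a constant series carries no linear relation
    return soe ** 2 / (soo * see)


def train_mlp(xs, ys, n_hidden, rng, epochs=30, lr=0.01):
    """Fit a one-hidden-layer tanh network by plain SGD.

    Assumption: SGD replaces the Levenberg-Marquardt optimizer of the paper.
    Returns a predict(x) function.
    """
    n_in = len(xs[0])
    # Random initial weights; the last entry of each row is the bias.
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(x):
        h = [math.tanh(sum(w * v for w, v in zip(row, x)) + row[-1]) for row in w1]
        return h, sum(w * v for w, v in zip(w2, h)) + w2[-1]

    for _ in range(epochs):
        for x, y in zip(xs, ys):
            h, out = forward(x)
            err = out - y
            for j in range(n_hidden):
                grad_h = err * w2[j] * (1.0 - h[j] ** 2)  # backpropagated error
                w2[j] -= lr * err * h[j]
                for i in range(n_in):
                    w1[j][i] -= lr * grad_h * x[i]
                w1[j][-1] -= lr * grad_h
            w2[-1] -= lr * err

    return lambda x: forward(x)[1]


def search_topologies(train, valid, hidden_sizes, n_restarts, seed=0):
    """Train every (hidden size, restart) pair and score it on validation data."""
    rng = random.Random(seed)
    results = []
    for n_hidden in hidden_sizes:
        for restart in range(n_restarts):
            predict = train_mlp(train[0], train[1], n_hidden, rng)
            score = r_squared(valid[1], [predict(x) for x in valid[0]])
            results.append((n_hidden, restart, score))
    best = max(results, key=lambda item: item[2])
    return best, results


# Synthetic one-input data (not soil data), split 75% / 25%.
data_rng = random.Random(42)
xs = [[i / 31] for i in range(32)]
ys = [0.5 + 0.4 * math.sin(3.0 * x[0]) for x in xs]
pairs = list(zip(xs, ys))
data_rng.shuffle(pairs)
train = ([p[0] for p in pairs[:24]], [p[1] for p in pairs[:24]])
valid = ([p[0] for p in pairs[24:]], [p[1] for p in pairs[24:]])

best, results = search_topologies(train, valid, (10, 20, 30, 40, 50), n_restarts=3)
print("best (hidden neurons, restart, validation R2):", best)
```

Only three restarts are used here to keep the run short; the study itself used between 3 and 30 trainings per combination.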
The SVM training was carried out in the RapidMiner5 software, selecting the epsilon-SVR option for the SVM type and varying the inputs (Table 1), the type of kernel function (Table 2) and the γ and C parameters. Each input model was trained with the following kernel functions: linear, polynomial (2nd and 3rd degree), Radial Basis Function (RBF) and sigmoidal. For each kernel function, the C parameter was tested with values of 0, 50, 100 and 150. The γ parameter was also tested for each variation of C and of the kernel function, except for the linear kernel, which does not include γ in its formulation.
Based on the estimated penetration resistance values, linear regression equations were fitted and analyzed graphically by the 1:1 ratio. The dependent variables were the respective RP values obtained from the database, and the independent variables were the values estimated by the methods under study (ANN and SVM). We also evaluated the RP values estimated by the nonlinear model of Busscher (1990), since it is widely cited for such estimation (Blainski et al., 2008; Ribon & Tavares Filho, 2008; Gubiani, 2012; Suzuki et al., 2008; Roboredo et al., 2010), comparing them with the methods under study for input data related to soils of several textural classes, sandy loam soil, and very clayey soil.
To evaluate the results obtained by each method, the performance index (id) was used, calculated as the product of the correlation coefficient (r) and the concordance index (c), following the methodology described by Braga et al. (2014). The performance classes were: great (id above 0.85), very good (id from 0.76 to 0.85), good (id from 0.66 to 0.75), regular (id from 0.61 to 0.65), weak (id from 0.51 to 0.60), very weak (id from 0.41 to 0.50) and poor (id of less than 0.41).
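The 1:1 analysis described above can be illustrated with a minimal sketch: observed RP is regressed on estimated RP, and a slope near 1, an intercept near 0 and a high R² indicate points lying close to the 1:1 line. The function name `fit_one_to_one` and the RP values are hypothetical, chosen only for illustration.

```python
def fit_one_to_one(estimated, observed):
    """Least-squares line observed = intercept + slope * estimated, plus R2."""
    n = len(estimated)
    mean_e = sum(estimated) / n
    mean_o = sum(observed) / n
    see = sum((e - mean_e) ** 2 for e in estimated)
    soo = sum((o - mean_o) ** 2 for o in observed)
    seo = sum((e - mean_e) * (o - mean_o) for e, o in zip(estimated, observed))
    slope = seo / see
    intercept = mean_o - slope * mean_e
    r2 = seo ** 2 / (see * soo)  # coefficient of determination of the fit
    return slope, intercept, r2


# Hypothetical RP values in MPa (not taken from the study's database).
estimated = [1.2, 1.8, 2.5, 3.1, 3.9]
observed = [1.0, 2.0, 2.4, 3.3, 4.0]

slope, intercept, r2 = fit_one_to_one(estimated, observed)
print(f"observed = {intercept:.3f} + {slope:.3f} * estimated (R2 = {r2:.4f})")
```

Note that a high R² alone does not guarantee agreement with the 1:1 line, since a strongly biased estimator can still be perfectly correlated with the observations; this is why slope and intercept are inspected as well.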
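A minimal sketch of the performance index and its classes is given below. It rests on two assumptions that the text does not state: the concordance index c is taken to be Willmott's index of agreement, and id is rounded to two decimals before classification so that the class limits (for example 0.75 and 0.76) leave no gaps. All function names are hypothetical.

```python
import math


def pearson_r(observed, estimated):
    """Pearson correlation coefficient r."""
    n = len(observed)
    mo = sum(observed) / n
    me = sum(estimated) / n
    soe = sum((o - mo) * (e - me) for o, e in zip(observed, estimated))
    soo = sum((o - mo) ** 2 for o in observed)
    see = sum((e - me) ** 2 for e in estimated)
    return soe / math.sqrt(soo * see)


def agreement_index(observed, estimated):
    """Willmott's index of agreement (assumed to be the paper's c)."""
    mo = sum(observed) / len(observed)
    num = sum((e - o) ** 2 for o, e in zip(observed, estimated))
    den = sum((abs(e - mo) + abs(o - mo)) ** 2 for o, e in zip(observed, estimated))
    return 1.0 - num / den


def performance_index(observed, estimated):
    """id = r * c, as defined in the text."""
    return pearson_r(observed, estimated) * agreement_index(observed, estimated)


def performance_class(idx):
    """Map id to the qualitative classes listed in the text."""
    value = round(idx, 2)  # assumption: class limits apply to two decimals
    if value > 0.85:
        return "great"
    if value >= 0.76:
        return "very good"
    if value >= 0.66:
        return "good"
    if value >= 0.61:
        return "regular"
    if value >= 0.51:
        return "weak"
    if value >= 0.41:
        return "very weak"
    return "poor"


# Hypothetical RP values in MPa, for illustration only.
observed = [1.0, 2.0, 2.4, 3.3, 4.0]
estimated = [1.2, 1.8, 2.5, 3.1, 3.9]
idx = performance_index(observed, estimated)
print(f"id = {idx:.4f} ({performance_class(idx)})")
```

Applied to the id values reported in this study, `performance_class` returns the same labels as the text, for example "very good" for 0.7779 and "poor" for 0.0389.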

RESULTS AND DISCUSSION
Figures 1a, 1b and 1c show the graphical relations between the values of soil resistance to penetration (RP) obtained in the literature and the values estimated by artificial neural networks, support vector machines and the Busscher model, respectively, for soils of diverse textural classes.
In the ANN1 model, the estimation of soil resistance to penetration (RP) obtained its best result, with an R² of 0.7262 (Figure 1a), when using the topology with 50 neurons in the intermediate layer and 5 repetitions of the network training. In SVM1, the topology with the RBF kernel function, C equal to 100 and γ equal to 10 was the one that best fitted the data, with an R² of 0.7520 (Figure 1b). Both artificial intelligence methodologies presented good results, since Andrade et al. (2013) considered a coefficient of determination of 0.62 satisfactory when estimating RP. Unlike the ANN and SVM models, the Busscher model did not obtain a good result, presenting a coefficient of determination of 0.0221 (Figure 1c).
In ANN2, the topology with the highest predictive capacity had 50 neurons in the hidden layer and 10 trainings, presenting an R² of 0.6517 (Figure 2a), lower than that previously obtained (ANN1). This suggests that the variable Pd exerts a strong influence on the estimation of RP through artificial neural networks. Stefanoski et al. (2013) point out that the values of particle density (Pd) are strongly influenced by the organic matter content, which in turn contributes effectively to the estimation of soil resistance to penetration (Ribon & Tavares Filho, 2008).
The behavior of the support vector machines differed from that of the neural networks: even with fewer input variables, the SVM2 model presented an R² of 0.7634 (Figure 2b), higher than that obtained with SVM1, using the same topology. The ANN3 and SVM3 models differ from the previous ones in the Silt + Clay variable, the sum of the two variables presented separately in the ANN2 and SVM2 models, which is easier to obtain. The ANN3 model presented a coefficient of determination (R²) of 0.6107 (Figure 3a), lower than that obtained with ANN2, while SVM3 obtained an R² of 0.7136 (Figure 3b), lower than SVM2 but higher than ANN2, even with simpler input variables. The SVM3 model obtained these results using C equal to 100 and γ equal to 15 in the training process.
The ANN4 and SVM4 models use the variables soil density (Sd) and volumetric moisture (θ), the same ones used in the non-linear model developed by Busscher (1990). The RP values obtained in the literature, when compared with those estimated by both ANN4 (Figure 4a) and SVM4 (Figure 4b), showed great dispersion, with coefficients of determination (R²) of 0.1502 and 0.1432, respectively. Although the R² values obtained by the artificial intelligence models (ANN4 and SVM4) were low, they were much higher than the 0.0136 obtained by the Busscher model.

Table 3 shows that, when all the data were used without discriminating soil type or management type, the best performance of the artificial neural networks for estimating soil resistance to penetration was that of ANN1, classified as "very good", as were the support vector machines. However, the performance index (id) of ANN1 (0.7779) was slightly lower than that of SVM1 (0.8021). Both models were far superior to the Busscher model, classified as "poor", with a performance index of 0.0389.

TABLE 3. Values of the correlation coefficient (r), coefficient of determination (R²), agreement index (c), performance index (id) and the qualitative performance class of the different soil resistance to penetration (RP) estimation models.
equations are required based on different granulometry and density conditions.
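For reference, the Busscher (1990) model is commonly cited as a power function of the same two inputs used in ANN4 and SVM4. The form below is that commonly cited one, written with the symbols of this paper; the coefficient names $a$, $b_1$ and $b_2$ are a naming choice made here and do not come from the text.

```latex
\mathrm{RP} = a \cdot \mathrm{Sd}^{\,b_1} \cdot \theta^{\,b_2}
```

Because $a$, $b_1$ and $b_2$ are empirical coefficients fitted to a given data set, a single fitted equation is not expected to transfer across soils with different granulometry and density.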
Table 4 shows the performance of the models with the same input architecture as the previous model (soil density and volumetric moisture), but with input data from soils of the same textural class collected at different locations (sandy loam from different localities). In this condition, although all the models were classified as "poor", the lower predictive capacity of the Busscher model is evident, with an "id" of 0.04. The ANN was the estimation method with the best results (id = 0.39) for soils of the same textural class, regardless of the sampling locality.
The methodologies were also tested on very clayey (VC) soil data from the same property and exposed to the same management. Table 5 shows that, under these specific conditions, all methods presented performance indexes classified as "good". As with the previous architectures, the artificial intelligence models stood out, with performance indexes of 0.71 (ANN) and 0.69 (SVM). However, they did not differ markedly from the Busscher method, which obtained an "id" of 0.68.

TABLE 5. Values of the correlation coefficient (r), coefficient of determination (R²), agreement index (c), performance index (id) and the qualitative performance class of the different models of soil resistance to penetration (RP) estimation for very clayey soil (VC).

CONCLUSIONS
The models based on artificial intelligence (ANN and SVM) presented a performance index superior to that of the non-linear model of Busscher in all the studied scenarios and can be used to estimate soil resistance to penetration.
The artificial neural networks (ANN) presented higher predictive capacity than the other methods for estimating soil resistance to penetration from soil data of the same textural class.
The support vector machines (SVM) presented higher predictive capacity than the other models for estimating soil resistance to penetration from soil data with different textural classes and managements.

FIGURE 1. Graphical representation of the RP values observed in the literature in relation to those estimated by ANN1 (a), SVM1 (b) and Busscher (c), for soils of diverse textural classes.

FIGURE 2. Graphical representation of the RP values observed in the literature in relation to those estimated by ANN2 (a), SVM2 (b) and Busscher (c), for soils of diverse textural classes.

FIGURE 3. Graphical representation of the RP values observed in the literature in relation to those estimated by ANN3 (a), SVM3 (b) and Busscher (c), for soils of diverse textural classes.

FIGURE 4. Graphical representation of the RP values observed in the literature in relation to those estimated by ANN4 (a), SVM4 (b) and Busscher (c), for soils of diverse textural classes.

TABLE 2. Kernel functions used for different SVM architectures.
Kernel functions listed: polynomial, RBF, sigmoidal.

For each ANN and SVM architecture, the data were randomly divided by the software into 75% for training and 25% for validation, following the recommendation of Nagaoka et al. (2005).
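For reference, the kernel families named in Table 2 and in the methods are commonly written in the LIBSVM-style forms below. This is a sketch under the assumption that Table 2 follows these standard forms: $x_i$ and $x_j$ are input vectors, $\gamma$ is the kernel coefficient tested in this study, $d$ is the polynomial degree (2 or 3 here), and the offset $r$ is a symbol introduced here that does not appear in the text.

```latex
\begin{aligned}
K_{\text{linear}}(x_i, x_j)  &= x_i^{\top} x_j \\
K_{\text{poly}}(x_i, x_j)    &= \left(\gamma\, x_i^{\top} x_j + r\right)^{d}, \quad d \in \{2, 3\} \\
K_{\text{RBF}}(x_i, x_j)     &= \exp\!\left(-\gamma\, \lVert x_i - x_j \rVert^{2}\right) \\
K_{\text{sigmoid}}(x_i, x_j) &= \tanh\!\left(\gamma\, x_i^{\top} x_j + r\right)
\end{aligned}
```

In these forms the linear kernel contains no $\gamma$, which agrees with the statement in the methods that γ was not tested for it.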

TABLE 4. Values of the correlation coefficient (r), coefficient of determination (R²), agreement index (c), performance index (id) and the qualitative performance class of the different models of soil resistance to penetration (RP) estimation for sandy loam soil (SL).