IN BATHYMETRIC SURFACES : IDW OR KRIGING ?

The representation of the submerged relief is very importance in diverse areas of knowledge such as Projects to build or reassess port dimensions, installation of moles, ducts, marinas, bridges, tunnels, mineral prospecting, waterways, dredging, silting control of river and lakes, and others. The depths of the aquatic bodies, indispensable for the representation of those, are obtained through the bathymetric surveys. However, the result of a bathymetric sampling is a grid of points that, for itself, it is not capable of generating directly the Digital Model of Depth (DMD), being necessary the use of interpolators. Currently, there are more than 40 available scientific methods of interpolation, each one with its particularities and characteristics. This study has the objective to analise, comparing, the efficiency of Universal Kriging (UK) and of the Inverse Distance Weighted (IDW) in the computational representation of bathymetric surfaces, varying in a decreasing way the quantity of sample points. Through the results, we can be stated the superiority of the interpolator Universal Kriging in efficiency in creating DMD with basis in the bathymetric surveys data.


1.Introduction
Brazil has an extensive coast and the largest hydrographic net of the world, with rivers that stand out in depth, width and extension.This affirmation alone would justify any study related to the submerged floor.
Since the early 19 th century, navigators have tried to better understand the seafloor.At present, this study is necessary in portuary works, both in the construction and dredging of new ports; leasing of gas pipelines and transoceanic telephone cables; exploration of oil and other mineral resources; environmental preservation; research activities; follow-up of erosion or silting-up processes, and, especially, in navigation (Iho 2005;Sánchez 2010).
For analysis, preparation and introduction procedures of these studies, the use of Digital Elevation Models (DEMs) is essential.These models consist of a computational mathematical representation of the distribution of a space phenomenon occurring within a region of the land surface (Felgueiras 1998).
DEMs allow from a simple three-dimensional visualization of the floor to more complex analyses, like volume calculations and generation of slope maps (Felgueiras 1998).
The depths of water bodies, essential in the construction of DEMs of seafloor, are obtained through bathymetric surveys.In spite of the growing technological evolution, single-beam bathymetric survey is still the most used technique in the whole world (Iho 2005).This technique is carried out on board of vessels using single-beam echosounders for depth measurements at high sampling rate and GPS (Global Positioning System) receptors for differential planimetric positioning (Ferreira 2015).The outcome is a mesh of three-dimensional points that, by itself, is not able to directly generate the imaged floor surface.To build a DEM that represents such morphology, it is necessary to employ interpolation techniques to estimate the depth value of non-sampled places (Camargo 1998;Silveira 2014).
Interpolators are mathematical functions used in the construction of continuous surfaces from a set of collected points.They are used for densification of a sample that does not cover the whole interest area.Interpolation techniques are based, more frequently, on the basic geography principle that near objects tend to be more correlated than distant ones (Ferreira 2015).
Many are the interpolation methods found in the literature, each one with their peculiarities and characteristics.They are basically divided into deterministic and probabilistic interpolators (Santos 2010).Both methods are based on the similarity of near points to create a spatially continuous surface.Deterministic models make estimations from mathematical functions.Probabilistic In bathymetric... Bull.Geod.Sci, Articles section, Curitiba, v. 23, n°3, p.493 -508, Jul -Sept, 2017.models, besides mathematical functions, apply statistical methods, allowing besides creating spatially continuous surfaces, estimating the uncertainty of predictions (Ferreira 2015).
Among the available interpolators, the Inverse Distance Weighted (IDW) deterministic interpolator and the kriging probabilistic interpolator stand out (Carvalho e Assad 2005;Silva et al. 2008).
According Azpurua and Dos Ramos (2010), Meng et al. (2013) and Merwade et al. (2006) showed that Inverse Distance Weighting (IDW) to produce better results than geostatistical methods.Conversely, Bello-Pineda and Hernández-Stefanoni (2007) showed that the kriging method was better than IDW for mapping the bathymetry of the Yucatan submerged platform (Curtarelli et al. 2015).
In inverse distance weighted, maximum and minimum interpolated values are within the range of the sampling points (Ferreira 2015).This method determines values for not sampled points using a weighted linear combination of a set of sampled points.The weight is a function of the inverse of the distance raised to any mathematical exponent (Landim 2000;Watson 1985).This method is easy to implement, with few decisions to take regarding the model parameters.However, this interpolator is sensitive to clusters and to the presence of outliers (Ferreira 2015).
Kriging is an interpolator that can be exact or smoothed depending on the model, associated to prediction error analysis.To apply kriging assumptions must be followed (Ferreira 2015).
The main difference between kriging and other interpolation methods is in the way that weighting is attributed to the different samples.In kriging, weights are determined from a space analysis based on semivariogram.In addition, kriging normally provides unbiased and minimal variance of estimates.A great advantage of kriging over inverse distance weighted and other deterministic methods is easy production of prediction maps, like prediction errors and probabilities, in other words, kriging supplies the precision associated to each estimate (Vieira 2000).A disadvantage is the necessity of a series of decision making on data modulation, tendencies, adjustment of semivariograms and choice of neighborhood size.Thus, prior to interpolation, kriging requires a detailed geostatistical analysis of the studied phenomenon.
In spite of the vast use of these interpolators, many divergences exist on their choice and use.Studies by Kravchenco and Bullock (2003), demonstrated that kriging performs a more precise description of the spatial structure of the studied phenomenon.However, the inverse distance weighted interpolator is simpler to apply and demands less time.
Better results for kriging, when compared with the inverse distance weighted method were also noted by Tabios and Salas (1985), Laslett et al. (1987) and Warrick et al. (1988).In contrast, Kanegae Júnior et al. (2006), Wollenhaupt et al. (1994) and Gotway and Hartford (1996) demonstrated that inverse distance weighted is more efficient than kriging.Silva et al. (2008) and Souza et al. (2010) did not find great differences while comparing these methods.
Such divergences can be directly associated with the amount of sampling points.In agreement with Burrough apud Camargo (1998) when data are abundant, most interpolation methods produce basically identical results.Conversely, when data is scattered as in topobathymetric surveys, deterministic methods have limitations in the representation of spatial variability.
Therefore, the aim of the present study was to compare the efficiency of kriging and inverse distance weighted in the computational representation of bathymetric surfaces, decreasingly varying the amount of sampling points.
The purpose of this article is to make a comparison on the efficiency of Universal Kriging (KU) and the Weighted Inverse of Distance (IPD) in the computational representation of bathymetric surfaces.

Inverse Distance Weighted (IDW)
The inverse distance weighted determines the values for not sampled places using a weighted linear combination of a set of sampled points.The weight is a function of inverse distance raised to any mathematical exponent (Landim 2000;Watson 1985).As a result, as distance increases the weights decrease; the decrease gets more intense, with higher exponents.The exponent value can be chosen by minimizing root mean square deviation (RMSD), obtained from cross validation (Ferreira 2015).Inverse distance weighted is calculated by the following Equation 1 described by Ferreira (2015): where: Ẑ is the estimated value for place l0; n is the number of measured values, () i Zl , involved in the prediction; (  ) = 1    is the weight attributed to observation i (inverse distance function); and Pot is the mathematical exponent.Souza (2003) affirmed that the algorithm of the inverse distance weighted is what better represents the floor surface for generation of the digital elevation model (DEM), as it has the characteristic of softening the study surface.
Another important characteristic of this method is that it allows the handling of dimension parameters of the search neighborhood, the number of neighbors to be processed in the calculation and the exponent to be employed in the distance weighted.
According to Landim (2000), with this method, the results are variable, from highly biased to in favor of points nearest to results where weight is practically the same for all near points.According to this same author, the exponent has the following effects on the estimated results: low exponents (0-2) emphasize local anomalies; whereas high exponents (3-5) soften local anomalies.Higher or equal exponents to 10 result in even estimates.

Kriging
Geostatistics is based on the theory of regionalized variables.Such theory assumes that the studied phenomenon is stationary (Vieira 2000;Santos 2010).Geostatistical inference is based on the assumption of three hypotheses of stationarity: first and second order stationarity and semivariogram.First-order stationarity, according to Babak and Deutsch (2009) is that in the mean is constant in every area studied.According to Banerjee et al. (2015) second order stationarity is a less restrictive condition and exists if the mean and variance of the stochastic process are independent of location and covariance exists and is dependent only on distance h.The intrinsic hypothesis is the most used because it is less restrictive (Chilès & Delfiner 2012, Siqueira et al. 2010;Lark 2012), this means that it only requires the existence and stationarity of the semivariogram without any restriction regarding the existence of variance Finite (Vieira 2000).For geostatistical studies, second order stationarity is required (Guimarães 2004).
Intrinsic hypothesis assumes that Z (l) exists and does not depend on location l, and that for every Δd, the difference variance [Z (l + Δd) -Z (l)] exists and does not depend on location l, where Z (l) corresponds to an occurrence of the studied phenomenon at point l and Δd is the distance between the successive occurrences (Guimarães 2004;Santos 2010).
The semivariogram is the most used tool in Geostatistics because it requires that only the intrinsic hypothesis is satisfied (Guimarães 2004).The experimental semivariogram was obtained from the calculation of semivariances ˆ() Δd  according to Equation 2: where N(Δd) is the number of pairs of Z(li) and Z(li+ Δd) values, separated by a distance Δd.It is expected that differences {[Z(li) -Z(li + Δd)]} decrease as Δd decreases, in other words, it is expected that nearest spatial observations have a more similar behavior between each other than more distant ones.Thus, it is expected that ˆ() Δd  increases with distance (Camargo 1998).
As it can be Analyzed in equation ( 2), in the construction of the semivariogram, all possible pairs of data are examined.If distance Δd between two points is null, the semivariance will also be.
When distance Δd is small, the points to be compared are very similar and, very correlated, soon the semivariance value is reduced (Ferreira 2015).The semivariogram graphic representation is shown according Figure 1, where the following parameters are identified: Range: distance within which samples present themselves spatially correlated; Sill: semivariogram value corresponding to its reach.From this point, it is considered that there is no more space dependence between samples; and, Nugget: ideally, γ(0) = 0.However, for most of the studied phenomena there is a discontinuity of the semivariogram for smaller distances than the least distance between samples, then, γ(0) ≠ 0. In agreement with Camargo (1998) a part of this discontinuity can be attributed to measurement errors, but it is impossible to quantify whether the largest contribution comes from measurement errors or from small-scale variation unnoticed by the sampling.Source: Adapted from Silveira (2014).When the semivariogram presents identical behavior for all directions it is isotropic; otherwise, it is anisotropic.If anisotropy is detected, it must be corrected through linear transformations; since it prevents the existence of stationarity, condition necessary for accuracy in analysis and estimates for the area in study (Vieira 2000;Santos 2011).
After obtaining the experimental semivariogram, it is possible to adjust it through theoretical models (Santos 2011).It is important that the adjusted model represents the trend of ˆ() Δd  in relation to Δd.Thus, estimates obtained from kriging will be more accurate and, consequently more reliable (Camargo 1998).The adjustment of the theoretical semivariogram is a very important phase, and must not be carried out automatically, since all the necessary parameters to apply kriging depend on the adjusted semivariogram model.
In the literature it is possible to find several isotropic models; these contemplate semivariograms with and without sill.Among the models without sill, the exponent model is quoted; while, among those with sill (the most common), the exponential, spherical and Gaussian models can be mentioned (Vieira 2000;Santos 2011).
Models with sill are known in Geostatistics as transitive models.Some of these models reach the sill asymptotically.For such models, the reach is randomly defined as the corresponding distance at 95% of the sill.Models without sill, as the name suggests, do not reach the sill; in other words, they keep on increasing as the distance increases.These models are used to represent phenomena that have infinite scattering capacity (Camargo 1998;Vieira 2000).
The main difference between kriging and other interpolation methods is in the way that weights (pi) are attributed to different samples.In kriging, weights are determined from a spatial analysis based on the experimental semivariogram.In addition, kriging supplies, on average, unbiased estimates and minimum variance.Another interesting characteristic of kriging is that it allows the calculation of the estimate variance; in other words, kriging supplies accuracy associated to each prediction (Camargo 1998;Vieira 2000).
According to Santos (2011) if trend is detected in the data, it is necessary to use universal kriging.In this method, trend removal is done by an adjustment of low degree polynomials.Then, the remaining analytical procedure becomes an analysis of residues.Universal kriging was proposed by Journel and Matheron to resolve a problem presented by the French National Institute of Geographic (IGN), related to the mapping of an underwater surface of evident inclination (Landim et al. 2002).

2.Materials and methods
The data used in the present study were collected in December of 2010 in a bathymetric survey of one of the main dammings of the Sao Bartolomeu stream located at the Federal University of Viçosa (UFV) in Viçosa city (MG, Brazil).The studied area has approximately 8800 m², 150 m length and 66 m width.
At collection, a single-beam echobathymeter and a couple of RTK (Real Team Kinematic) GPS receptors were used.After collection, data were processed in the Hypack 2010 software producing a file with 1414 points containing planimetric coordinates and respective depths according Figure 2.After statistical analysis of data, as recommended by Ferreira (2010), it was concluded that the survey accuracy in question is in agreement with the quality standards stipulated by DHN (Office of Hydrography and Navigation) and with the IHO (International Hydrographic Organization).In order to reach the goals, the original file, containing 1414 points, here denominated GRID1, was randomly divided into two other files, GRID2 (Figure 2, center), containing 706 points, and GRID3 (Figure 2, right), containing 359 points.Kriging and Inverse distance weighted were applied to GRIDS 1, 2 and 3, aiming to compare the efficiency of both interpolators in presence of many and few sampling points.
Firstly, supervised Kriging was applied, thus, depth data were submitted to an exploratory analysis.Basically, this type of analysis is based on construction and graphic interpretation, calculations and statistical interpretation.Such analysis is a very important procedure, as it allows detecting the existence of outliers and/or trends that may affect interpolation (Guimarães, 2004;Vilela, 2004).
In this study, exploratory analysis consisted in obtaining trend graphs, mean, variance, standard deviation, variation coefficient (CV), maximum value, minimum value, asymmetry, kurtosis estimation and outliers detection tests.
Subsequently, geostatistical analysis was carried out to verify the existence and, in this case, to quantify the degree of spatial dependence of the attribute in study, from the adjustment of the theoretical models to experimental semivariograms.
Semivariograms were also built for directions: N-S (0°), E-W (90°), NE-SW (45°) and SE-NW (135º).After estimation of ˆ() Δd  , the obtained spatial structures were analyzed; theoretical semivariogram models which better conformed to the experimental semivariograms were built from these structures and from knowledge of the phenomenon in study.
When presence of spatial dependence was noted between data, inferences were carried out for kriging for not sampled places from the measured points, according to equation 3 (Camargo 1998;Vieira 2000;Ferreira 2015).
where: Ẑ is the depth value estimated for location l0; n is the number of measured values, () i Zl , involved in the estimate; i p are weights associated to the measured values.For interpolation using inverse distance weighted, exponents were tested adopting the one which presented better results.A minimum of 10 and maximum of 15 sampled nearest neighboring points were used.For kriging, only points within the range of reach of the spatial dependence obtained by each attribute were considered.
In this study, evaluation of the performance of inverse distance weighting and kriging interpolators was carried out by cross validation, considering the estimates of the root-mean-square deviation (RMSD), mean error (ME), coefficient of determination ( 2R ) and simple linear regression parameters between observed and predicted values, angular (a) and linear coefficient (b).
According to Santos (2011), RMSD reduces if the model adopted for the theoretical semivariogram is well chosen.In this case RMSD tends to be the same as the square root of the kriging variance.Likewise, a mean discrepancy close to zero is expected, indicating accuracy in the estimation. 2 R will be best when it is the same as the unit, the same occurs for the angular coefficient (a).However, the linear coefficient (b) will be best when null.In agreement with Morillo Barragán et al. (2002) the RMSD between predicted and observed depths, in addition to the ME are given, respectively, by the following Equations 4 and 5. where: and OBS i Z respectively correspond to predicted and observed depths; n corresponds to the number of observed values and predicted correspondences.

Results and discussion
Results of the exploratory data analysis can be verified according Table 1.It can be noticed that data present a mean variability, considering variance and sampling standard deviation values.Such variability is confirmed by the variation coefficient measurement, based on limits proposed by Warrick and Nielsen (1980), who consider: low (CV ≤ 12%); average (12% < CV < 60%) and high variability (CV ≥ 60%).It is observed, by the variation coefficient, standard deviation and variance, a higher sampling variation in GRID3, comparatively to GRIDS 1 and 2, affecting prediction.
Moreover, it is emphasized that right asymmetry presented by mean estimates, median and by the coefficient of asymmetry, highlight the form of damming of the Sao Bartolomeu Stream.The data showed the presence of some values distant from mean is noticed.They may be possible outliers; however, zero value depths correspond to the margin of a damming, not being, thus, atypical values.
Based on the exploratory analysis, the trend graph was built, using Arcgis 10 software according Figure 3.In all GRIDS, the presence of second order trend is noticed in depth data, seen in parabolas exposed in vertical plans, which is in agreement with authenticity, as it is a reservoir with intense inclination.
When trend presence is noticed proper geostatistical interpolation is applied.In view of this data characteristic, universal kriging (UK) is chosen.According to Santos (2011), UK applies an adjustment of low-degree polynomials for trend removal, allowing working with residues.
Aiming at verifying the existence of anisotropy, semivariograms were calculated for directions: N-S (0°), E-W (90°), SW-NE (45°) and NW-SE (135º), according Figure 4.It is worth mentioning that as universal kriging was chosen, the built semivariograms here correspond to residual semivariograms.By analyzing the semivariograms of GRID1 shown in Figure 4, it can be noticed that the depth variable presents practically identical spatial dependence standards up to range, that is, it presents the same spatial variability in all directions.Thus, it is concluded that the phenomenon is isotropic.Hence, a single semivariogram representing all directions can be used, entitled omnidirectional semivariogram.
Therefore, semivariogram adjustment was carried out using Arcgis 10 software.An omnidirectional semivariogram was obtained that represents the trend of ˆ() Δd  in relation to Δd.The same analysis was carried out for GRIDS 2 and 3, where anisotropy presence was not detected.
The chosen theoretical models for each GRID are summarized in Table 2.
The theoretical model which better adjusted to an experimental GRID1semivariogram was the stable.In this model, it is necessary to define a parameter, which varies from 0 to 2, where the null value makes the stable model identical to the exponential model.If the parameter is defined as 2, the model becomes Gaussian.The parameter value of the stable model defined in this study was 1.432227.The omnidirectional experimental semivariogram and the adjusted model can be seen according Figure 5 for three grids.After obtaining the semivariogram, universal kriging interpolation can be applied.Prior to interpolation, cross validation is carried out, allowing evaluating the performance of interpolators; however the outcomes will be presented throughout the text.
As previously mentioned, universal kriging was used to estimate points in not sampled locations.
For interpolation using inverse distance weighted, the number of neighbors to be used in the interpolation was set firstly.A minimum of 10 and maximum of 15 sampled nearest points was adopted.Subsequently, a study was carried out to define the value of the exponent used as weight.This value was chosen by analyzing several factors, such as the area characteristics and the RMSD value obtained in the cross validation, as suggested by Ferreira (2015).Exponents were tested varying from 1 to 5. Results are shown according Figure 6.By analyzing Figure 6, the choice of higher exponents for both GRIDS becomes obvious; however it is necessary to be careful with such choice.As reported by Landim (2000), the exponent choice has the following effects on estimated results: low exponents point out local anomalies, whereas high exponents soften local anomalies.In other words, the exponent controls the importance of points around the estimated value, that is, higher exponents result in fewer distant point influences.
It was noticed that lower exponents, besides pointing out local anomalies, provide a smoother surface, this fact is explained by higher weight given to most distant points.The highest exponent, value 5, despite providing a lower RMSD value, around 0.290, for GRID1, 0.428 for GRID2 and 0.535 for the GRID 3, provides a more detailed surface; in other words, less soft.Such fact is due to higher emphasis given to the nearest points.
In view of that, exponent value 2 was chosen, mainly due to the studied area characteristics along with the vast use of this exponent in the literature (Morillo Barragán et al., 2002;Silva et al., 2008;Souza et al., 2010).It is worth pointing out that when exponent value 2 is chosen, inverse distance weighted is then called inverse squared-distance weighted (ISDW).
In this study the evaluation of the performance of ISDW and UK interpolators was carried out by cross validation.Results are shown according Table 3.When analyzing Table 3, GRID by GRID, it can be noticed, through all adopted decision parameters, that UK favored higher accuracy, in both GRIDS, when compared to ISDW, fact justified by RMSD and ME values.In addition, according to Vieira (2000), simple linear regression between observed and predicted values must present 2 R quite near the unit, as well as the regression coefficient "a" and intercept "b" quite near zero.For UK all three parameters were better estimated than in ISDW, which is an important result for the aim of the present work.
Another important result is that UK carried out for GRID3 (fewer sampling points) compared with ISDW for GRID1 (higher number of sampling points) showed higher accuracy, fact justified by RMSD and ME values, showing that Kriging, in computational modeling of bathymetric surfaces, is more accurate than ISDW even in unfavorable situations.
In practical terms, one of the reasons for building digital elevation models of water bodies is to subsequently calculate the volumes.Thus, volume calculation of the surveyed reservoir was carried out aiming at verifying the occurrence of significant differences.Results are shown according Table 4.  Difference of 2,371 m 3 is noticed for GRID1, the difference is even higher for GRID2, around 3,000 m 3 , yet for GRID3 the difference was approximately 2,800 m 3 .If we consider UK for GRID1 as the most accurate interpolation, fact justified in Table 3, while applying UK in GRID2 and GRID3 the following discrepancies are found in the volume calculation, respectively: 163 m 3 (0.5%) and 800 m 3 (2.6%).Whereas while applying ISDW in GRID2 and GRID3, still considering UK is for GRID1 as the most accurate interpolation, the following discrepancies are found in the volume calculation: respectively, 2,885 m 3 (9.5%)and 3,615 m 3 (11.9%).This is another fact that transmits the control of Geostatistics towards the deterministic method studied here.
Since GRID1 presents more accuracy for both interpolators, the DEMs (Digital Elevation Model) produced from GRID1 for interpolation carried out by UK (a) and ISDW (b) are shown according Figure 7. without a bias and with minimum variance.It is possible to notice that this interpolator models both regional trends and local anomalies, and is not sensitive to irregularly sampled or grouped data.A disadvantage of this interpolator is the mathematical complexity of its algorithm and necessary time for modeling the semivariogram.
The DEM interpolated by ISDW for both GRIDS presented higher variability when compared with UK.It could also be noticed that ISDW suffers great influence from anomalous local values, besides being sensitive to grouped data, and in their presence, estimates biased values.

Conclusions
The construction of bathymetric surfaces is an important component in several studies.
This study allowed verifying that the kriging interpolator presented better results in comparison to the inverse distance interpolator for this set of data; in scattered and abundant sample GRIDS.It was also verified, as it is standard in Surveying Engineering, that the volume calculation of the damming in study was more accurate when UK was applied in a scattered sample GRID, comparatively to the ISDW applied in abundant data.Another reason for the use of kriging is the possibility of generating DEM of uncertainties of the interpolation.
In view of the present results, the application of Geostatistics is recommended in the modeling of bathymetric surfaces, with either scattered or abundant data.
Considering kriging as superior in the construction of bathymetric surfaces, further studies are recommended to define the best sample GRID, in terms of cost and benefit using UK.

Figure 6 :
Figure 6: Graphic representation of Exponent x RMSD analysis.

Figure 7 :
Figure 7: Bathymetric depth DEM based on universal kriging using GRID1 (left map) and Depth bathymetric DEM based on ISDW using GRID1 (right map).

Table 1 :
Estimates of descriptive statistics on Depth of São Bartolomeu stream

Table 2 :
Estimates of the variogram analysis

Table 3 :
Presentation of main cross validation measures

Table 4 :
Calculated volumes for each interpolator and sampling grid