1. Introduction

The necessary investment to start a mine is in the order of tens to hundreds of millions of dollars. For the investment to be profitable, the potential product in the subsurface must be present in adequate quantity and quality to justify a decision to invest.

All technological and financial decisions are built on the knowledge about the mineral deposit. Thus, the estimation of grade and location of material in the ground (*in situ* resources) must be known with an acceptable degree of confidence. A small difference between planned (estimated) and realized production have a large impact on mine profitability (^{Sinclair and Blackwell, 2002}).

The resource evaluation of a mining deposit is composed of two steps (^{Chilès et al., 2004}): Delimitation of the boundaries of the units corresponding to the various geological formations and estimation or simulation of grades within each unit.

Therefore, prior to every geostatistical estimation or simulation, there is a need for delimiting the geologic units, implying a stationarity decision for geological and statistical homogeneity within chosen domains (^{McLennan, 2007}). Generating accurate boundary models is clearly necessary, as the quality of the models will influence various downstream mine practices, including estimations/simulations and mining planning (^{Cowan et al., 2003}). These boundaries, separating different stationary domains, are traditionally modeled explicitly by manual digitizing from sample data.

First, two-dimensional poly-lines are manually drawn on cross sections honoring the sample data, and then linked by tie lines, these tied poly-lines are triangulated to create a wire-frame that represents a geologic unit, this process has some disadvantages (^{Cowan et al., 2003}): its time consuming and requires an experienced geomodeler to construct complex geometries; the model produced is unique to each individual geomodeler, and cannot be replicated, making auditing by external personnel a hard task; it is inflexible, since modifying the model as new data becomes available is time consuming and laborious.

For most mines, only a single working model is maintained because of the time constraint. Rarely is there an opportunity to model alternative interpretations and compare resource estimations based on the alternative models; thus, the assessment of mining risks is avoided, even though it is inherent in geological modeling (^{Cowan et al., 2003}).

Mining software packages have provided computational tools to display drillhole data and to speed up the manual digitization of cross sections. Despite these advances, explicit modeling still suffers from the disadvantages presented. Recent novel techniques referred to as implicit modeling provide algorithms that reduce the level of subjectivity by replacing the digitization process with some form of automatic procedure (^{Silva, 2015}).

The aim of this article is to investigate the applicability of the signed distance function methodology (^{Deutsch and Wilde, 2013}; ^{Silva, 2015}) as a substitute or ancillary method for traditional geologic modeling. The algorithm was applied in a real gold deposit, results discussed and the implicit compared to an explicit model created by a geomodeler, in order to check if interpreted geological structures were reproduced by the algorithm.

2. Geologic modeling

There are two groups of geologic modeling techniques available:

Deterministic techniques, which comprise explicit methods where the boundary that separates the geological domains is defined explicitly by the geomodeler, in vertical and horizontal sections, from the drillhole data and implicit methods which replace the manual digitization of the explicit methods by an automatic or semi-automatic form of domain delimitation. Implicit methods are straightforward and computationally fast. Some well-established implicit methodologies are reviewed in this paper: The

*Leapfrog*^{©}methodology (^{Cowan et al., 2003}), potential field (^{Chilès et al., 2004}) and signed distance function technique (^{Deutsch and Wilde, 2013};^{Silva, 2015}). For the latter, a special plug-in was developed to run at*SGeMS*. A real case study is presented herein to illustrate the methodology.Stochastic techniques require more computational effort, and are not so straightforward. The established algorithms are: sequential indicator simulation (

^{Journel, 1989b}), truncated gaussian simulation (^{Journel and Isaaks, 1984}), plurigaussian simulation (^{Galli et al., 1994}) and multi-point based methods (^{Guardiano and Srivastava, 1993}).

3. Implicit methods

3.1 *Leapfrog*^{©} methodology

The subject was introduced in the geology field by ^{Cowan et al., (2003)}, based on the work of ^{Savchenko et al., (1995)} for modeling objects interpolating volume functions. The definition of a volume function is attached to the notion of distance to an interface where the interface is defined as the surface separating two distinct domains. Distance is measured to the nearest interface, and it can be negative or positive depending on whether the location is inside or outside of the domain. The bounding interface of interest is the surface corresponding to a particular iso-value of the volume function, usually the iso-surface zero (^{McLennan, 2007}). The volume function must be interpolated in order to define the boundary interface, by implementing a fast scattered interpolator method know as radial basis function (RBF) (^{Hardy, 1990}). The interpolator is represented as the linear combination of basic functions similar to dual kriging (^{Journel, 1989a}).

*Leapfrog*^{©} provides one of the first implicit boundary modeling implementations within a commercial software package. There are five major steps in the methodology (^{McLennan, 2007}): (1) data validation and composting; (2) interpolation and meshing; (3) incorporating geological morphology; (4) interpolating the geological morphology; (5) morphologically constrained interpolation.

RBF functions do not derive the covariance functions from the data; they correspond to a simple isotropic linear covariance function. Therefore, there is no possibility of incorporating anisotropy into the boundaries through RBF interpolation. Instead it is injected manually in the form of deterministic morphological constraints (^{McLennan and Deutsch, 2006}). Besides this, dealing with multiple domains is not straightforward (^{McLennan, 2007}).

3.2 Potential field

Presented by ^{Chiles et al., (2004)}, this is an implicit 3D scalar field from which a geological interface is extracted as a particular iso-surface. There are five major steps to the methodology (^{McLennan and Deutsch, 2006}): (1) collect surface intersection and structural orientation data; (2) determine the form of the locally varying drift; (3) infer the potential field covariance function; (4) interpolate the potential field with the universal cokriging approach; (5) visualize the uncertainty in the boundary surface placement.

A key feature of the potential field method is the use of universal cokriging to optimally account for both intersection and structural dip data. The procedure is not simple, the covariance of the potential field is particularly difficult to infer, since there are no hard potential field data available (^{McLennan and Deutsch, 2006}).

3.3 Distance function modeling

Signed distance function methodology for multiple categories (^{Silva, 2015}) is a deterministic implicit modeling technique where the implicit function is constructed by interpolating a distance measure based on conditioning data. This method is based on the volume function technique, however, some interesting modifications were made (^{Silva, 2015}).

For each sample, an anisotropic (or isotropic) distance between itself and the nearest sample belonging to an opposite domain is computed and assigned. Negative values represent the distance to the boundary for samples inside the domain. The interface that separates the regions in space is determined by the sign of the estimated signed distance values. The algorithm works as follows.

The set of data z(u_{α}),α = 1,...,n is coded in indicators to specify the samples that belong inside or outside of the domains (Equation 1).

A simple two-dimensional example (Figure 1) proposed by ^{Silva (2015)} is presented to illustrate the process.

Signed distance values are then calculated for each sample using Equation 2. If the sample is inside the domain, the distance is negative; if the sample is outside the domain, the distance is positive. The euclidean norm is used to measure the distance.

The location u_{β} corresponds to the closest sample of the opposite domain to u_{α}.

The distances can also be anisotropic, in this case, the set of data coordinates *x, y* and *z* must be rotated and dilated (or contracted), using the Equation 3 transformation, to *x'', y'', z'',* and then, the euclidean distances are calculated normally. Anisotropic distances should be used when it is known that the geologic body extends more in a specific direction than in the others, as in the case of a tabular body.

Where α, β and ϕ are a|zimuth, dip and rake respectively, and a_{max}, a_{min} and a_{vert} are the anisotropy ranges.

The signed distance implicit function is then interpolated to all locations of interest (Equation 4). Ordinary kriging will be used due to its ability to account for directions of continuity and the spatial configuration of data. Others techniques, such as inverse distance can also be used.

^{Deutsch and Wilde, (2013)} recommend the use of global kriging (^{Neufeld and Wilde, 2005}), a smooth interpolator, to avoid artifacts in the implicit model. However the use of global kriging is restricted to the number of samples due to computational efficiency. Instead of global kriging, ordinary kriging can be used considering a large neighborhood and a large amount of data. Kriging allows us to control the run-time by changing the number of samples retained in the estimates, while methods based on RBF always use all samples.

The signed distances have a non-stationary behavior, namely, the variograms do not have a sill. Also, the linearity of the distance makes the origin behave close to a quadratic form. Thus, the Gaussian model is a well-suited structure for modeling this type of variogram. Known geological trends can be incorporated into the model by the variogram, whilst RBF based methods are not able to incorporate anisotropy. An increment in the nugget effect disconnects the domains and variations in the range influence the shape and spatial extension of the geologic domains.

At the end, the domains are classified at the unsampled location *u* in function of the sign of the signed distance estimates (Equation 5). If the estimate is negative, the location is classified as inside the boundary; otherwise, the location is considered as outside. Models can be created in any resolution at the cost of computer resources and time, the resolution is controlled by modifying the dimensions of the grid blocks.

3.3.1 Distance function modeling for multiple geologic domains

The described signed distance implicit geologic modeling methodology may only be applied for binary cases. In that which follows, the algorithm developed by ^{Silva (2015)} to handle multiple domains will be presented.

Suppose there exists *K* multiple domains in the deposit. For all sample locations z(u_{α}),α = 1,...,n, an indicator vector of *K* elements is coded according to Equation 6:

Thus the *k* element of the vector is one while the remainder *K*-1 elements are set to code 0.

In the same manner as in Equation 6, the signed distance value to the closest opposite domain is computed individually for each k element of the vector (Equation 7). The signs of the distances remain equal as in the binary case.

Interpolation is performed individually for each k. Ordinary kriging is applied multiple times (Equation 8).

Then the final rock type model is determined by the following equation.

The estimated distance provides a measure of proximity to the closest opposite domain. In this sense, the minimum estimated signed distance value may be seen as the most probable domain to be found at an unsampled location (^{Silva, 2015}).

3.3.2 Measure of uncertainty (softmax transformation)

The proposed algorithm does not characterize uncertainty. Therefore, a heuristic measure of uncertainty was proposed by ^{Silva (2015)}, this measure of uncertainty is not based on multiple realizations drawn from a random function as in stochastic methods; Instead, it is based on a post-processing transformation method, a widely used approach for multiple class classification (^{McCullagh and Nelder, 1989}).

The central idea is to transform the estimated signed distance into values that can be interpreted as posterior probabilities (Equation 10). The transformed values lie between 0 and 1 and sum to 1 for all elements k .

*P(i(u))=k* represents the probability at location *u* to belong to category *k*, d*_{k} (*u*) is the estimated distance for category *k* and ω is a parameter that controls the uncertainty bandwidth for *K* categories, a large parameter leads to greater uncertainties.

To illustrate how the transformation works Figure 2 shows, on the left, estimated distances for five different rock types on a particular block and on the right, the probabilities of this block to belong to each of the five categories obtained from the distances by the softmax transformation.

The shorter the estimated distance for a given category *K* at an unknown location *u*, the greater is the probability of that location to belong to category *k*.

3.3.3 Reproducing target proportions (servo system)

To avoid bias introduced by preferential sampling in geologic models, ^{Silva (2015)} proposed the use of some sample declustering techniques and the application of a servo system (^{Strebelle, 2002}), to generate geologic models that match the representative proportions on the declustered data.

The algorithm is based on the probabilities obtained after the distance transformation (Equation 10). The core of the algorithm (Equation 11) consists of updating the probabilities of the domains based on the difference between the target and current marginal distributions. The nodes visited must follow a random path to avoid the introduction of artifacts.

*P(i(u)=k)*^{update} represents the updated probability at location *u* to belong to category *k*, *P(i(u)=k)* is the probability obtained by the softmax transformation, *µ* is the servo system correction parameter defined as *µ*=λ/(1-λ) , *p(k)* and *p ^{c}_{k} (u)* are respectively, the target proportion for category

*k*and the marginal proportion for the category

*k*calculated from the blocks already visited. There is more of a correction when λ is closer to one, namely, the proportions gets closer to the target.

The blocks need to be reclassified, now, based on the updated probabilities. The classifier becomes Equation 12.

Sometimes, when using a high µ factor, the servo system generates structures that do not make physical sense to reach the target proportions. Hence the maps must be corrected by a moving window system, similar to resources classification methodology (^{Deutsch et al., 1998}).

The determination of uncertainty and use of the servo system are optional. After all of the post processing and corrections, the algorithm assigns conditioning data to the corresponding grid node.

4. Case study

This case study was conducted on a major gold mine operation. The dataset presents 9140 data representing 5 different lithologies. The area of the deposit is approximately 10km^{2} with 1300m dip.

First, signed distances were calculated for all 5 lithologies, in practice users must only input the categorical dataset. Then the distances calculated were variographed. It may seem a laborious process, but the extremely continuous behavior of the distances makes the variograms easy to model, and are similar to each other. Furthermore, in cases where we do not know any geologic anisotropic pattern a *priori*, the same variogram model can be used for all categories. Figure 3 shows the experimental variograms and their respective fitted models for all five calculated signed distances. The distances of the experimental variograms never show the nugget effect but it can be arbitrarily added by the user to control the connection between the lithologies. In this case a 10% nugget effect was added on all variogram models. All models have the same range (1500m) and the same proportions between structures contribution (10% for nugget effect and 90% for a Gaussian structure). It can be seen in Figure 3 that the non-stationarity behavior of the distances makes the modelled range somehow arbitrary.

As the absolute value of each variogram structure contribution does not affect kriging results, only the proportion between the contributions, the same variogram model (Equation 13) could be assumed for all five signed distances.

Now, distances calculated for each category must be interpolated to each grid node, and the category responsible for the most negative estimated distance retained and assigned. Users must input kriging parameters and variogram models for each lithology. Resolution is the same as the explicit model provided. Grid properties are in Table 2 and kriging parameters are in Table 1; the same kriging parameters must be used to interpolate all categories. A maximum of 40 samples per estimation makes the run-time a few minutes, the output model differs a little compared to models created using 100 or 200 samples.

A categorical scatterplot, pairing each block category defined by the geomodeler, on the X axis, and by the algorithm, on the Y axis, was constructed (Figure 4) showing a linear correlation coefficient 0.93, namely, 95% of the estimated blocks are consistent in both models; these are the points that fall in the 45º line.

A high linear correlation coefficient is not indicative of a good model. More simplistic methods such as nearest neighbor also produce highly correlated scattterplots similar to Figure 4; however, produce unrealistic models. In Figure 5, both of the model's sections were compared in order to check if interpreted geological structures were reproduced by the algorithm. Slices are vertical along X and Y directions, the block model has 70 sections along the X direction and 60 sections in the Y direction. Four sections along each axis were chosen.

Some complex structures interpreted by geomodeler were reproduced satisfactorily by the algorithm as can be seen highlighted by green lines (Figure 5). Conversely, the model mathematically generated misses or does not accurately reproduce some other structures, as seen highlighted by red lines (Figure 5), usually, because there were not enough samples in that region, in those cases the geomodeler counted with his/her experience and/or additional information for creating the explicit model. Discordant blocks are shown in red and are located at lithological transitions. Scattered blocks belonging to lithology 5, displayed at the upper part of implicit model, are the blocks assigned conditioning to data locations, where probably the geomodeler arbitrarily increased its influence in the explicit model. This feature is not expected to be reproduced by the algorithm, as there are few samples from lithology 5 which are surrounded by many lithology 3 and 4 samples.

A trained professional will never be replaced by an algorithm and implicitly created models will rarely be final models, as they must be refined manually. The higher the number of samples, the more the implicit model approximates the explicit model, and in this sense, there is no need for much interference. In the case where samples are scarce, the implicit model serves as a proto model, saving time and effort in the early stages of modeling.

Uncertainty was calculated by softmax transformation, as the probability of each block belonging to each category in each Figure 5 section. The plug-in default value for the factor that regulates the interrelation among the probabilities is ω=175. Lower w values create more conservative models.

The Servo system and the magnitude of *µ* factor are dependent on how much you trust your declustered data proportions. In this case study, applying the servo system makes the model more discordant with the model provided for comparison (not necessarily wrong). Border effects often occur when there is extrapolation in estimates, extending structures exaggeratedly beyond the limits of the samples. The servo system can also be used to control the extent of this kind of structure.

The implicit geologic modeling algorithm formulation, description, and implementation should be relatively straightforward. Too complex algorithms will be difficult to implement, explain, and justify in practical settings. ^{McLennan (2007)} proposed six different criteria to evaluate implicit modeling algorithms: (1) *Simplicity*: The signed distance algorithm is extremely simple, where the only tough step is variogram modeling that can be simplified using the same isotropic variogram for all categories. *Speed*: It is quite fast. For the case study example where for a dataset with 9140 samples, the algorithm took a few minutes to run. (3) *Subjectivity*: Using the same parameters, the model is perfectly replicated. (4) *Flexibility*: Incorporating incremental geological data is easy and fast. (5) *Uncertainty:* The transformation of distances in probabilities creates a heuristic measure of uncertainty. (6) *Realistic:* The algorithm generates geologically plausible boundary models.

5. Conclusions

As could be observed in the case study, the algorithm does not replace a trained professional, but its simplicity and speed justify its usage, especially in the early stages of modeling.

As disadvantages of the method, the non-stationarity of the distances can be mentioned, making the variogram modeling arbitrary and questionable. Moreover the method is based on kriging, so only linear relationships between domains are modeled, unless a large amount of data is available.

For practical purposes, the heuristic measure of uncertainty is satisfactory, but it is not an assessment of uncertainty that result from multiple realizations of a stochastic model. Knowing this, future work may be focusrd on combining some kind of boundary simulation with the straightforwardness of the deterministic method.