Sensitivity analysis of ordinary kriging to sampling and positional errors and applications in quality control

Data quality control programs used in the mineral industry normally define tolerance limits based on values considered good practice or previously applied to similar deposits, although the precision and accuracy of estimates depend on a combination of geological characteristics, estimation parameters, sample spacing and data quality. This study investigates how sample quality limits affect the estimation results. The proposed methodology uses a series of metrics to compare the impact on the estimates, using a synthetic database with an increasing amount of error added to the original sample grades or positions to emulate different levels of precision. The resulting tolerance limits for the grades are similar to those recommended in the literature. The influence of positional uncertainty on the model estimates is minimal because current surveying methods have deviations on the order of millimeters, so its impact can be considered negligible.


Introduction
Since the geometry and geological properties of a mineral deposit are known exactly only after its complete extraction and processing, models and estimates are needed throughout the lifetime of a project for proper planning. The quality of the available data strongly affects these estimates, which has led the mining industry to adopt controls and procedures to measure and ensure data quality. These sampling and analytical controls typically establish error tolerances based on intervals suggested by the literature and good practice, e.g., Abzalov (2008), which have no mathematical relationship with the precision and accuracy required for the grade model, mine planning and scheduling. In this context, this paper proposes a methodology that uses sensitivity analysis to measure how analytical and/or location errors affect the estimates.
Despite the diversity of estimation techniques, none can completely correct the impact of a database containing large analytical errors, data obtained with an improper sampling protocol, or inaccurate spatial positions. Various systematic controls are employed to determine data accuracy and precision and thus ensure the quality of this information. At the same time, established quality assurance/quality control (QA/QC) programs ensure representative sampling and preparation.
The tolerance limits are typically based on average values suggested for different types of deposits (Abzalov, 2008). The main problem with adopting values used in other mines or recommended in the literature is that the estimated model will not necessarily reach the accuracy achieved at the mine from which the values were obtained, because the sensitivity of a model to data quality is a complex relationship between data accuracy, estimation parameters and sample spacing.
Based on the need to measure the real impact of information on the estimation accuracy, this paper proposes a methodology to measure the impact of analytical and location errors on grade models, enabling a better definition of tolerance limits and assuring that the defined values achieve the planned accuracy for block estimates.

Materials and methods
Based on the original data, perturbed databases were generated by adding errors to the initial sample grades and locations using Monte-Carlo simulation. Ten uncertainty scenarios were created, with the relative error randomly drawn from a normal distribution with increasing standard deviations (Miguel, 2015).
The sensitivity analyses were performed using the same blocks, methods and estimation parameters, changing only the databases. This produced a series of estimates, and their relationship to the initial values was recorded for each level of perturbation.
The relationship between each uncertainty scenario and the error-free estimates was measured by the average deviation and the coefficient of a linear regression on matching pairs of block values. In each pair, the first block value is estimated using the error-free dataset and the second using an error-added dataset.
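The two block-to-block metrics can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the average deviation is the mean absolute relative difference and that the regression coefficient is the least-squares slope; the function names and block grades are hypothetical.

```python
from statistics import mean

def mean_relative_deviation(reference, perturbed):
    """Mean of |perturbed - reference| / reference over paired blocks (assumed definition)."""
    return mean(abs(p - r) / r for r, p in zip(reference, perturbed))

def regression_slope(reference, perturbed):
    """Least-squares slope of the perturbed estimates regressed on the reference estimates."""
    mr, mp = mean(reference), mean(perturbed)
    sxy = sum((r - mr) * (p - mp) for r, p in zip(reference, perturbed))
    sxx = sum((r - mr) ** 2 for r in reference)
    return sxy / sxx

# Hypothetical block grades (ppm), for illustration only
reference = [200.0, 400.0, 500.0, 800.0]   # estimated from the error-free dataset
perturbed = [210.0, 380.0, 525.0, 800.0]   # estimated from an error-added dataset

deviation = mean_relative_deviation(reference, perturbed)
slope = regression_slope(reference, perturbed)
print(f"mean relative deviation: {deviation:.4f}, slope: {slope:.4f}")
```

A slope close to 1 together with a small deviation indicates that the added error has little effect on the block estimates.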
The results were evaluated as a function of the uncertainty added to the data. The proportion of blocks incorrectly classified as ore or waste is also a measure of the impact on the estimates; by measuring the ability of the model to classify each block correctly, it enables an assessment of the financial impact caused by data error. Two types of misclassification exist:
• Dilution, in which a block of poor economic value (at Walker Lake, V grade lower than 450 ppm) is classified as ore, reducing the average grade of the mined material;
• Loss, in which an economically viable block of ore (at Walker Lake, V grade greater than 450 ppm) is classified as waste, reducing the total ore tonnage of the deposit.
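The dilution and loss definitions above translate into a simple count over paired block estimates. The sketch below is illustrative only: the function name and grades are hypothetical, and treating a block exactly at the cut-off as ore is an assumption, since the text does not specify that case.

```python
CUTOFF_PPM = 450.0  # V cut-off quoted for Walker Lake

def misclassification(reference, perturbed, cutoff=CUTOFF_PPM):
    """Return (dilution, loss) as fractions of all blocks.

    reference: block grades from the error-free dataset (accepted as "true")
    perturbed: block grades from an error-added dataset
    Assumption: grade >= cutoff is ore, grade < cutoff is waste.
    """
    pairs = list(zip(reference, perturbed))
    dilution = sum(1 for r, p in pairs if r < cutoff <= p)  # waste mined as ore
    loss = sum(1 for r, p in pairs if p < cutoff <= r)      # ore sent to waste
    return dilution / len(pairs), loss / len(pairs)

# Hypothetical block grades (ppm)
reference = [300.0, 440.0, 460.0, 700.0]
perturbed = [320.0, 455.0, 445.0, 690.0]
dilution, loss = misclassification(reference, perturbed)
print(dilution, loss, dilution + loss)
```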
The proposed method was applied to an analysis of the sensitivity of ordinary kriging (Matheron, 1963) to uncertainty in sampled grades and locations. The proposed approach can be replicated with any geostatistical method; ordinary kriging was selected because it is probably the best known and most commonly used geostatistical method in the mining industry.
The analysis, including the definition of quality limits, was carried out on the Walker Lake dataset (Isaaks and Srivastava, 1989), a public database composed of 78,000 two-dimensional data points derived from the topography of the Walker Lake area in the state of Nevada, USA.

Initial database and its estimation
The study used a re-scaled version of Walker Lake with V grade (ppm) at 195 locations in a pseudo-regular grid of 20 x 20 m in an area of 78,000 m² (280 x 300 m).
The variogram showed anisotropy, with the longest and shortest continuity in the N157.5° and N67.5° directions, respectively. The modeled variogram parameters were: (1)
After modeling the spatial structure, the V grades were estimated by ordinary kriging in blocks of 10 x 10 m, with a search ellipse using a minimum of 3 and a maximum of 12 samples.
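For readers unfamiliar with the estimator, the ordinary kriging system can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the configuration used in the study: it kriges a point rather than a 10 x 10 m block, uses an isotropic spherical variogram with hypothetical sill and range (the modeled anisotropic parameters are not reproduced here), and omits the search ellipse and sample limits.

```python
import math

def spherical(h, sill=1.0, rng=50.0):
    """Isotropic spherical variogram without nugget; sill and range are hypothetical."""
    if h >= rng:
        return sill
    ratio = h / rng
    return sill * (1.5 * ratio - 0.5 * ratio ** 3)

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def ordinary_kriging(points, values, target, variogram=spherical):
    """Return (estimate, weights); the weights are constrained to sum to 1."""
    n = len(points)
    # Variogram matrix bordered by the unbiasedness constraint (Lagrange multiplier)
    a = [[variogram(math.dist(points[i], points[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [variogram(math.dist(p, target)) for p in points] + [1.0]
    weights = solve(a, b)[:n]
    return sum(w * v for w, v in zip(weights, values)), weights

# Hypothetical samples on a 20 m grid (V grades in ppm)
points = [(0.0, 0.0), (20.0, 0.0), (0.0, 20.0), (20.0, 20.0)]
values = [300.0, 500.0, 400.0, 600.0]
estimate, weights = ordinary_kriging(points, values, (10.0, 10.0))
print(estimate, weights)
```

Because the target sits at the center of the four samples, symmetry gives each sample the same weight, and the estimate is their plain average.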

Generating perturbed scenarios
Based on the grades and positions of the 195 samples, perturbed scenarios were obtained by adding errors to the initial values. Deviations were randomly drawn by Monte-Carlo simulation (Lehmer, 1951) from a Gaussian distribution with zero mean and relative standard deviations of 2% to 40% (measured as the deviation between the perturbed and the initial sample values). The zero mean ensures unbiasedness, with symmetric scattering around the original mean.
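The grade perturbation can be sketched as below. This is a minimal illustration under stated assumptions: the error is applied multiplicatively as grade x (1 + e) with e drawn from N(0, relative SD), the function names, seed and sample grades are hypothetical, and negative perturbed grades (possible at high relative SD) are left unhandled because the text does not say how they were treated.

```python
import random

def perturb_grades(grades, rel_sd, rng):
    """Multiply each grade by (1 + e), with e ~ N(0, rel_sd)."""
    return [g * (1.0 + rng.gauss(0.0, rel_sd)) for g in grades]

def make_scenario(grades, rel_sd, n_realizations=25, seed=0):
    """Generate independent perturbed datasets for one uncertainty level."""
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    return [perturb_grades(grades, rel_sd, rng) for _ in range(n_realizations)]

grades = [120.0, 450.0, 830.0]                 # hypothetical V grades (ppm)
scenario = make_scenario(grades, rel_sd=0.10)  # one level within the 2%-40% range
print(len(scenario), scenario[0])
```

Each of the realizations would then be kriged with the unchanged error-free parameters, so that only the database differs between runs.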
For each level of uncertainty, 25 datasets were generated and used individually to estimate the block model (Table 2), ensuring sufficient values for the sensitivity curves. Table 1 shows the statistics averaged over the datasets of each uncertainty scenario. For the positional sensitivity analysis, the initial positions were displaced by adding Gaussian noise with zero mean and standard deviations from 0.04 m to 10 m, drawn independently for the X and Y coordinates, generating displacements ranging from 0.02 to 12.9 m from the initial position.
The locational uncertainty generated average deviations of 0.05, 0.20, 0.84, 3.34 and 12.9 m from the original positions. The variation among realizations for the same standard deviation had little impact on the location sensitivity curve, which eliminated the need for multiple realizations. Figure 2 shows the original locations as circles and the perturbed positions for scenarios SDL-2.56 and SDL-10.24 as triangles.
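The positional perturbation and the resulting mean displacement can be sketched as follows; the function names, seed and grid are hypothetical. For independent zero-mean Gaussian noise of standard deviation s on X and Y, the displacement distance follows a Rayleigh distribution with mean s·sqrt(π/2) ≈ 1.25 s, which is in line with the average deviations quoted above (for example, about 12.8 m for s = 10.24 m, assuming the SDL suffix is the standard deviation in meters).

```python
import math
import random

def perturb_positions(points, sd_m, rng):
    """Add independent N(0, sd_m) noise to the X and Y coordinates of each point."""
    return [(x + rng.gauss(0.0, sd_m), y + rng.gauss(0.0, sd_m)) for x, y in points]

def mean_displacement(original, perturbed):
    """Average Euclidean distance between original and perturbed positions."""
    return sum(math.dist(o, p) for o, p in zip(original, perturbed)) / len(original)

rng = random.Random(7)  # fixed seed only to make the sketch reproducible
# Hypothetical regular grid standing in for the sample locations
original = [(20.0 * i, 20.0 * j) for i in range(14) for j in range(15)]
moved = perturb_positions(original, 2.56, rng)
print(f"mean displacement: {mean_displacement(original, moved):.2f} m")
print(f"Rayleigh expectation: {2.56 * math.sqrt(math.pi / 2):.2f} m")
```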

Checking uncertainty influence on estimates
For each scenario dataset, the sensitivity to analytical and locational errors was quantified by comparing its block estimates with those obtained from the error-free dataset, which is accepted as "true". Since kriging and variogram parameters have a significant effect on estimates, the values defined from the error-free database were applied to all perturbed scenarios. The impacts of uncertainty in grades (Table 1) and in locational data (Table 2) were measured by the proportion of blocks incorrectly classified as ore or waste using a V cut-off of 450 ppm. The impacts were also measured by the deviation from the original mean, the correlation coefficient and the mean absolute deviation for pairs of block values estimated from the perturbed and the original samples.

Modeling of the sensitivity of estimates to uncertainty
The kriging sensitivity, measured at ten grade and five locational levels of added uncertainty, was modeled by linear functions (Figure 3) that relate dataset errors to their impact on relative block-to-block errors and on block misclassification.
An absolute increase of 5% in the standard deviation of the uncertainty related to the sample grade increased the average deviation between all estimated blocks and their reference values by 6.2% (Figure 3a-I, y = 1.2405*5%), while the number of blocks misclassified as ore or waste increased by 1.12% (Figure 3b-I, y = 0.2247*5%).
For positional uncertainty, each meter of mean location error increased the average deviation between the block grade and the reference value by 3.16% (Figure 3a-II, y = 0.2247*1 m). For the loss and dilution rate, each meter of positional uncertainty in the database increased the misclassification rate by 0.76% (Figure 3b-II, y = 0.0076*1 m).
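A sensitivity line of this kind can be fitted and evaluated as sketched below. The sketch assumes a least-squares line forced through the origin, which matches the intercept-free form of the quoted equations but is not confirmed as the authors' fitting procedure. The data points are synthetic, placed exactly on the reported grade line y = 1.2405x, and the function name is hypothetical.

```python
def fit_through_origin(x, y):
    """Least-squares slope of y = slope * x (no intercept)."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

# Synthetic points on the reported grade sensitivity line (Figure 3a-I)
added_sd = [0.02, 0.10, 0.20, 0.40]          # hypothetical levels within 2%-40%
block_dev = [1.2405 * s for s in added_sd]   # not measured data

slope = fit_through_origin(added_sd, block_dev)
predicted = slope * 0.05                     # impact of a 5% sample error
print(f"slope = {slope:.4f}, impact at 5% = {predicted:.2%}")
```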

Definition of quality limits
The sensitivity relationship between sampling and estimate uncertainty can accommodate different criteria for defining sampling error limits, e.g., the minimum accuracy required for a grade model, mine planning and scheduling. For the sake of brevity, the study assumed free access to all mining blocks (each block can be mined as ore or waste independently) and treated the Walker Lake block model as a bench to be mined in one month.
As a reference, good practice in QA/QC suggests replicate sample errors between 5% and 10%, with a maximum of 20% being acceptable (Abzalov, 2008; 2011a). For resource classification, values accepted by the mining industry as good practice state that a measured resource should have an accuracy between 8% (Abzalov, 2011b) and 15% (Parker, 2014) over a monthly or quarterly production, both references using the usual 90% confidence interval.
First, we propose using the sensitivity equation to define quality limits that ensure the Walker Lake resource is classified as measured. Considering a maximum acceptable impact of sample uncertainty on the estimates of 10%, equation a-I (Figure 3) requires that 90% of the sample errors be below 8.1% (x = 10%/1.2405).
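Defining a quality limit amounts to inverting the linear sensitivity equation y = slope·x for x. A minimal sketch, using the slope of equation a-I quoted in the text (the function name is hypothetical):

```python
def max_input_error(target_impact, slope):
    """Largest data error x such that slope * x stays within target_impact."""
    if slope <= 0:
        raise ValueError("slope must be positive")
    return target_impact / slope

SLOPE_A_I = 1.2405  # grade sensitivity slope, Figure 3a-I

limit = max_input_error(0.10, SLOPE_A_I)
print(f"maximum sample error: {limit:.1%}")
```

The same function applies to any of the fitted lines, provided the target impact and the slope refer to the same metric.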
Quality limits may also be defined to keep the impact on misclassification below a chosen threshold. Assuming an arbitrary maximum misclassification of 3%, equation b-I (Figure 3), y = 0.2247x, gives a maximum sampling error of 13.4% (x = 3%/0.2247). For the same misclassification caused by locational errors, equation b-II (Figure 3), y = 0.0076x, gives a maximum average deviation of 3.9 m (x = 0.03/0.0076).

Conclusion
As discussed, in the mining industry there is a gap between data quality programs and the commonly used geostatistical methods, in which sampled grades and their locations are assumed to be error-free. This paper proposes an approach that measures the link between sampling quality control and estimated values through sensitivity analysis.
The general procedure is applicable to any deposit and geostatistical method. For the Walker Lake case, the blocks were estimated by ordinary kriging. The maximum error limits defined by the proposed approach, and their expected impact on resource confidence, were very close to well-established references: the results of the sensitivity equations are consistent with good practice in sampling error control and resource classification. Moreover, the approach applied to positional uncertainty indicated that, given the accuracy of modern topographic surveys, the sensitivity of kriging to location error can be considered negligible.
In real cases, where the initial sample grades and positions carry a measured associated error, the regression line can be extrapolated from the measured error down to the origin, which is assumed to be error-free. Applied to locational uncertainty, this methodology allows the impact of poorly located data on the estimated values to be assessed.
A financial appraisal of the benefit of the proposed method can be carried out following the same approach, using real sampling protocol costs and measuring their impact on profits due to metal loss or dilution caused by misclassified blocks. Thus, the approach can be used to maximize profit, balancing the money spent on sampling quality control against its impact on mining profitability. This type of optimization cannot be reached using industry benchmark values for the maximum acceptable error.
In future work, stochastic simulations will be investigated in order to incorporate other sources of uncertainty associated with the estimation process. Simulations also make it possible to define where additional samples should be located and how they should be prepared and analyzed, taking into account the methods available for each position, their costs, the expected impact on mine profitability and whether the cost of the additional information at that position exceeds its financial benefit.

Figure 1
Kriged V values using an error-free database for V grades: a) estimated blocks and b) histogram and statistics for the estimates.

Figure 2
Location map of the original (circles) and perturbed (triangles) positions of the sampled coordinates for scenarios SDL-2.56 and SDL-10.24 m.

Figure 3
Grade and positional sensitivity curves relating the standard deviation added to the sample error population (X) to its measured impact on relative block-to-block errors (a) and on the misclassification of blocks as ore or waste (b).

Table 1
Descriptive statistics of the original and perturbed databases, grouped by level of uncertainty.¹
¹ Columns: mean, median, 1st and 3rd quartiles refer to scenario mean values; the correlation coefficient compares the initial and scenario values sample-to-sample.

Table 2
² Columns: the mean deviation measures how many meters the coordinates are from the actual sampled position; the mean refers to scenario block values; the correlation coefficient, mean deviation and loss+dilution compare the initial and scenario values block-to-block.