DESIGN OF GEODETIC NETWORKS BASED ON OUTLIER IDENTIFICATION CRITERIA: AN EXAMPLE APPLIED TO THE LEVELING NETWORK

: We present a numerical simulation method for designing geodetic networks. The quality criterion considered is based on the power of the test of data snooping testing procedure. This criterion expresses the probability of the data snooping to identify correctly an outlier. In general, the power of the test is defined theoretically. However, with the advent of the fast computers and large data storage systems, it can be estimated using numerical simulation. Here, the number of experiments in which the data snooping procedure identifies the outlier correctly is counted using Monte Carlos simulations. If the network configuration does not meet the reliability criterion at some part, then it can be improved by adding required observation to the surveying plan. The method does not use real observations. Thus, it depends on the geometrical configuration of the network; the uncertainty of the observations; and the size of outlier. The proposed method is demonstrated by practical application of one simulated leveling network. Results showed the needs of five additional observations between adjacent stations. The addition of these new observations improved the internal reliability of approximately 18%. Therefore, the final designed network must be able to identify and resist against the undetectable outliers – according to the probability levels.


Introduction
Geodetic networks are essential for the most geodetic engineering projects, such as monitoring the position and deformation of man-made structures (bridges, dams, power plants, tunnels, ports, etc.), to monitor the crustal deformation of the Earth, to implement an urban and rural cadastre, to establish and maintain a geospatial reference frame.Before the installation of geodetic marks and gathering of survey data, geodetic networks should be designed according to the pre-established quality criteria, such as precision, reliability and costs.The precision is related to the covariance matrix of estimated parameters (i.e.point's coordinates or displacement of points).The ability of the measurements scheme to identify outliers in the observations as well as their effects on estimated parameters are associated with the measures of reliability.Finally, the cost is related to the effort required to implement the design and related expenses (i.e. it expressed in terms of the observation schedule).Therefore, the design problem can be formulated as: "finding the optimum solution that increases the precision and reliability of the network and, at the same time, it has the lowest possible cost".In other words, a network should be designed in such a way that: Φp, Φr and Φc are weight coefficients for precision, reliability and cost, respectively.Additional criteria such as sensitivity to detect displacements or deformation parameters of certain magnitude can also be considered (see e.g.Vaníček et al., 1990;Kuang, 1991;Even-Tzur, 2002;Erdogan and Hekimoglu, 2014).The design problem have been widely developed and investigated since the pioneering work of Helmert (1868).Thenceforth, a series of papers have been published on the development of the new algorithms by simulated examples and real applications (see e.g.Baarda, 1973;Grafarend, 1974;Baarda, 1977;Koch, 1982;Knight et al., 2010;Hekimoglu et al., 2011;Hekimoglu and Erdogan, 2012;Klein et al., 2012;Klein, 2014).
Although the design of geodetic networks is a widely investigated topic, there are open issues requiring further research.
However, most of these studies are based on theoretical criteria, such as criteria matrices (Baarda, 1973), error ellipses or ellipsoids (Vaníček and Krakiwsky, 1986) and redundancy numbers (Amiri-Simkooei, 2001a, 2001b).Unlike these, in this paper, we purpose an alternative method based on Monte Carlo simulation technique (MCS) to design a geodetic network in the sense of high reliability.In the context of reliability, we highlight that the vast majority of the methods consider the redundancy numbers of the observations as reliability criterion (e.g.: Baarda, 1968;Amiri-Simkooei et al. 2012;Yetkin, 2013).Here, on the other hand, the reliability criterion is based on the power of the test of iterative data snooping testing procedure.
We also highlight that the proposed method discards the use of the observation vector of Gauss-Markov model.In fact, the only needs are the geometrical network configuration (given by Jacobian matrix); the uncertainty of the observations (given by nominal standard deviation of the equipment); and the magnitude intervals of the outliers.The random errors are generated artificially from the normal statistical distribution, while the magnitude of outliers is selected using standard uniform distribution.
The paper is organized as follows: -Section 2: contains the preliminary concepts.
-Section 3: contains the background regarding the theory within our scope.
-Section 4: the method proposed here for designing geodetic networks is presented.
-Section 5: a numerical example of the proposed method is given for a leveling network.

Preliminary Concepts
Here, the term outlier is defined according to Lehmann (2013): "an outlier is an observation that is so probably caused by a gross error that it is better not used or not used it is".One of the wellknown methods for outlier identification in geodetic data analysis is Baarda's testing procedure.This method was proposed by Baarda (1968) and consists of three steps (see e.g.Teunissen 2006): detection (also known as overall model test), identification (also known as data snooping) and adaption (a corrective action, such as elimination of identified observation as an outlier).
At each iteration, only a single observation can be identified in the data snooping procedure.
Once an identified observation is excluded, the least squares estimation (LSE) adjustment is restarted without the rejected observation and again the identification step (data snooping) is applied.Of course, if redundancy permits, this procedure is repeated until none identification.This procedure is called "iterative data snooping" (e.g.Teunissen, 2006).In this paper we are exclusively concerned with iterative data snooping procedure (DS).
Since DS is based on a statistical hypothesis testing with an alternative hypothesis for each observation, it may lead to a false decision as follows: • Type I error or false alert (probability level α) -Probability of identifying an outlier when there is none; • Type II error or missed detection (probability level β) -Probability of non-identifying an outlier when there is at least one; and • Type III error or wrong exclusion (probability level κ) -Probability of misidentification a nonoutlying observation as an outlier, instead of the outlying one.This type of error decision was introduced by Hawkins (1980) and Förstner (1983).
The rate of type I decision error in a binary hypothesis test (i.e., with a single alternative hypothesis) can be selected by the user.The rate of type II decision error cannot.Lehmann and Voß-Böhme (2017) also point out that a test statistic with a low rate of type II is said to be powerful in the binary hypothesis case, when only a single alternative hypothesis is considered.However, in case of multiple alternative hypotheses (i.e., DS case), without considering the Type III error, there is a high risk of over-estimating the successful identification probability (see e.g.Yang et al. 2013).On the other hand, the confidence level is the probability that a non-outlying observation is correctly ignored; the power of the test is the probability that an outlier is correctly identified.Therefore, the confidence level and the power of the test are the probabilities of the test result leading to correct decisions, as opposed to the occurrence of type I, II and III errors (see, for example, Förstner, 1983;Teunissen, 2006;Klein et al., 2015b).In addition to these probabilities, the DS can identify more observations than real number of outliers (Rofatto et al., 2017).
Many of the relevant probabilities in this contribution are multivariate integrals over complex regions (Teunissen, 2017).They therefore need to be computed by means of numerical simulation such as MCS.The basic idea is to approximate probability distributions by frequency distributions of computer random experiments performed using pseudo random numbers.Therefore, as pointed out Lehmann (2012), MCS methods are used whenever the functional relationships are analytically not tractable, as is the case for DS.This simulation technique based on the pioneering idea of Hekimoglu and Koch (1999) has been recently applied in geodesy.For example, Ryan and Lachapelle (2001) used simulations to obtain the minimal detectable bias polygon for the case of two outliers; Lehmann and Scheffler (2011) used MCS method to solve the problem how to determine the optimal levels of Type I error probabilities for global and local tests in DS; Lehmann (2012) used MCS method to improve the critical values of the test statistics; Niemeier and Tengen (2017) extended the classical concept of geodetic network adjustment by combining the uncertainty modeling and MCS.
3. Data Snooping Procedure: a particular case of statistical testing procedures for identification of outliers based on maximum likelihood ratio The mathematical model generally adopted in geodetic data analysis is the linear(ised) Gauss-Markov model, given by (Koch, 1999): where e is the n x 1 random error vector, A is the n x u design (or Jacobian) matrix, x is the u x 1 unknown parameters vector and y is the n x 1 observations vector.
The most employed solution for a redundant system of equations ( nu  and full column rank) is the weighted least squares estimator (WLSE) for the vector of unknowns ( x ): in which W is the n x n weight matrix of the observations, taken as W = σ0 2 Σy, where σ0 2 is the variance factor and Σy is the covariance matrix of the observations; if Σy is diagonal, one speaks of weighted LSE (WLSE); if it is full, generalized LSE (GLSE).Teunissen (2003) demonstrates the geometric interpretation of the LSE.More details about LSE estimation can be seen in Ghilani (2010).
If there are only random errors in the observations, the LSE is the best linear unbiased estimator (BLUE) for the unknown parameters; if the observational errors follow the multivariate normal distribution with mean μ = 0 and covariance matrix Σy, the LSE coincides with the maximum likelihood estimator (Teunissen, 2003).However, the LSE is no longer optimal in the presence of systematic and/or outliers in the observations.In other words, despite optimal properties for LSE, they lack robustness or insensitivity to outliers in observations (Huber, 1964;Rousseeuw and Leroy, 1987;Lehmann, 2013).Therefore, statistical testing procedures for detection and identification of outliers have been developed.
Quality control to identify outliers in geodetic measurements has been widely investigated since the pioneering work of Baarda (1968).In the sense of LSE, statistical testing procedures for detection and identification of outliers are based on maximum likelihood ratio.Consider a null hypothesis H0for the parameters of the population probability distribution of an observation vector y.Consider further an alternative hypothesis HA for these parameters, constructed in a way that H0 is a subset of HA.Thus, in the general case, the maximum likelihood ratio "λ(y)" between H0and HA is given by (Larson, 1974): where max p(y|H0) is the maximum of the probability density function (pdf) of y under H0 and max p(y|HA) is the maximum of the pdf of y under HA.As the null hypothesis is defined so that its sample space is contained in the sample space of the alternative hypothesis, the ratio in Equation4 lies in the interval of 0 ≤ λ(y) ≤ 1 (Teunissen, 2006).
The test criterion for the maximum likelihood ratio is given by (Larson, 1974): where c > 0 is the critical value for the test according to the significance level α stipulated (for more details, see Larson, 1974;Teunissen, 2006).
Assuming normally distributed observation errors, a general case of hypothesis testing in linear models is formulated as Teunissen(2006): where H0 is the null hypothesis (namely, absence of outliers in the observations) and HA is an alternative hypothesis (presence of "q" outlying observations in at certain known locations).The quantity Cy defines the non-random error model (in this context called outlier model) with dimension n x q and  is the corresponding vector of q outliers.The dimension of  should be comprised between1 q n u   − .For example, if n=5 and q=2, then a possible outlier model is For more details about error models, see e.g.Lehmann and Lösler (2016).The performance of the statistic test Tq for multiple outliers can be found, for example, in Klein et al. (2015a) and Klein et al. (2017).
DS procedure is a particular case of maximum likelihood ratio test when only one outlier (q = 1) is present in the data set at a time (see e.g.Baarda, 1968;Pope, 1976;Berber and Hekimoglu, 2003;Lehmann, 2012).Therefore, the outlier model for q=1 is a n x 1 unit vector with 1 in its i th entry and zeros in the remaining (e.g. ), and  is a scalar value with the outlier at i th observation being tested.If we consider one outlying observation in at certain known locations (q = 1), then the maximum likelihood ratio test for DS (Tq= 1) is given by (Teunissen, 2006): Under H0, observation errors are zero-mean (multivariate) normally distributed.The null hypothesis is rejected if the following test statistic (Tq = 1) of the i th observation being tested exceeds a given critical value Κα.Κα is the critical value for the test according to the significance level α.Under the null hypothesis, the test statistic Tq follows the central chi-squared distribution with one degree of freedom; under the alternative hypothesis, the test statistic Tq follows the non-central chi-squared distribution with one degree of freedom and non-centrality parameter δ.The non-centrality parameter δ expresses the separation between the null and alternative hypotheses, that is, (see Figure 1): In the Figure 1, the 0 is the expectation of the H0 hypothesis and the  is the expectation of the HA hypothesis.It should be noted that the test statistic Tq in equation ( 8) is a particular case of generalized test statistic, when q=1. Important to mention also that the critical value follows from a chi-squared distribution with one degree freedom at a significance level of α in a onetailed test.Baarda (1968) and Teunissen (2006) demonstrate that if q = 1, then the test statistics (equation 7) can also be formulated based on a standard normal distribution in a two-tailed test (so-called w-test).Both the chi-squared and normal distribution tests are equivalent.Usually in geodesy, the value of α is set between 0.1% and 1% (Kavouras, 1982;Aydin and Demirel, 2004;Lehmann, 2013).Furthermore, DS contains multiple alternative hypotheses, as each observation is individually tested.Therefore, the only observation considered contaminated by outlier is the one whose test statistic satisfies the inequalities Tq=1>Κα.After having identified the observation most suspected of being an outlier (at given α), it is excluded usually from the model, and the WLSE and DS are applied iteratively until there are no further outliers identified in the observations (Berber and Hekimoglu, 2003).
However, three types of incorrect decisions may occur into DS and its occurrence rates are associated with probability levels: the significance level α is the probability of a non-outlying observation be misidentified as an outlier (type I error or false positive); β is the probability that an outlying observation not be identified as outlier (type II error or false negative); finally, a nonoutlying observation is misidentified as an outlier, instead of the outlying one (type III error given by κ).On other hand, the confidence level (CL) is the probability that a non-outlying observation is correctly ignored, therefore, CL = 1 -α; the power of the test (γ) is the probability that an outlier is correctly identified as outlier, i.e. γ = 1 -(β + κ).Therefore, the CL and the γ are the probabilities of the test result leading to correct decisions, as opposed to the occurrence of type I, II and III errors (see, for example, Förstner, 1983;Teunissen, 2006;Klein et al., 2015b).Here, we denoted α0, β0 and κ0 as pre-established probability levels.For example, the Figure 1 shows these relationships in the DS procedure for CL=0.999,γ0 = 0.8, so α0 = 0.001 and β0 = 0.20, leading to a pre-set non-centrality parameter δ0 = 17.075 and a pre-set critical value of Κα = 10.83.Baarda (1968) provides the monograms for those interested in obtaining δ0 values as a function of α0 and γ0 (for a given degree of freedom).Alternatively, Aydin and Demirel (2004) presented a procedure to obtain the same through approximations of the non-central chisquared distribution.The necessity of obtaining the non-centrality parameter is widespread in Geodesy (Baarda, 1968;Kavouras, 1982;Teunissen, 2006;Knight et al., 2010).

Method Proposal to Design Geodetic Networks
The method proposed here focuses on designing of the geodetic networks in terms of high reliability.Under the present proposal, the quality criterion to be considered during the design stage is based on the number of outliers correctly identified in the DS.In other words, the key to our method is to estimate the power of the test on DS using MCS.The method does not depend on the real measurements values but only on the model design, i.e. the network geometry and covariance matrix.Therefore, the computations can be performed before the observations are carried out as follows: 1. Defining the magnitude intervals of outliers based on standard deviation of observation (e.g.|3σ to 9σ|, being σ the standard deviation of observation), the significance level α0 and the desired power of the test γ0.Often, a value deviating from the sample mean by more than three times (3σ) the sample standard deviation σ is assumed to be an outlier (e.g.: Hekimoglu and Koch, 2000).Here, we do not discuss how to choose the outlier magnitude, because it depends on the purpose of the geodetic project.The α0 is associated to type I error.Typically a value of the level α0 = 0.001 is adopted (see, e.g.Baarda, 1968).However, if this level is chosen too low then we get too large critical values and many outliers remain unidentified.If on the contrary this level is chosen too high then we get too small critical values and it is likely that good observations are eliminated.Lehmann (2012) shows how to strike a balance between these impairments of parameter estimation by using MCS methods.The γ0 represents the efficiency level of the DS to identify correctly an outlying observation.The γ0 is equivalent to the success rate considering the probability of type II and type III errors for all network observations (Hekimoglu and Koch, 2000;Yang et al., 2013;Klein et al., 2015b).
2. Defining a priori geodetic network configuration (preliminary design matrix) as well as the observations uncertainties (preliminary of covariance matrix of the observations, including random components and the uncertainty associated with the systematic effects).The latter follows from the instrument precision, measurement techniques and field condition (Grafarend and Sanso, 1985).It is important to mention that the design matrix defined initially must have a minimum configuration to avoid rank deficiency as well as being able to detect at least one outlier as mentioned by Xu (2005) that 'in order to identify outliers, one also has to further assume that for each model parameter, there must, at least, exist two good data that contain the information on such a parameter'.For example, consider the one unknown height into a leveling network (one-dimensional -1D).Two observations would lead to different solutions and allow the detection of an inconsistency between them.Three observations would lead to different solutions and the identification of one outlying observation, and so on.Thus, in a general case, the value for 'q' equal to the minimal number of redundant observations across each and every point, minus one, For more details on the choosing the number of outliers to be considered, see Klein et al. (2017).
3. Generating synthetically a sequence of "m" random errors vector eK, k = 1,...,m of a desired statistical distribution.The "m" is known as the number of Monte Carlo experiments.Usually, one assumes that the random errors of the good measurements are normally distributed with expectation zero.Thus, we generate the random errors using multivariate normal distribution, since the assumed stochastic model for random errors is based on pre-defined matrix covariance of the observations, i.e. e ~N(0,σ0 2 Σy).We apply the well-known Box-Muller method (Box and Muller, 1958) for generation of normal pseudo random numbers (Practically we use MATLAB's function mvnrnd here).On the other hand, an outlier (q=1) is selected based on pre-established magnitude intervals of the outliers (see item 1) for each m Monte Carlo experiments.Here, we use the standard uniform distribution to select the outlier magnitude (Practically we use MATLAB's function unifrnd here).The uniform distribution is a rectangular distribution with constant probability and implies the fact that each range of values that has the same length on the distributions support has equal probability of occurrence (see e.g.Lehmann and Scheffler, 2011).For example, for 10,000 Monte Carlo experiments, if the user choices a magnitude interval of the outliers of |3σ to 9σ|, the probability of a 3σ error occurring is virtually the same as -3σ, and so on.Random and outliers are assumed to be independent (by definition) and both are combined to the total error as follow (see e.g.Kavouras, 1982): , 0 where  is the n x 1 total error vector, e is n x 1 random errors and cy is outlier model for q=1(see expression 6), and  is a scalar value with the outlier at i th observation being tested.We assume that e+cy  > +3σ and e+cy  < -3σ.Before computing statistical test Tq=1 it is necessary to relate the random error vector e and total error vector , since this statistical test depends on the estimated random error vector 0 ê .In the sense of LSE, this relationship is given by (see Kavouras, 1982):

) TT R I A A WA A W
in which R is the n x n redundancy matrix and I is the n x n identity matrix.
In the equation 11 the reader should be note that the multiplication of the redundancy matrix R and the total error ε provides the estimated random error vector 0 ê .Now, the 0 ê is not only composed by random errors, but also it has one of its elements contaminated by an outlier.Now it becomes possible to compute the test statistic Tq=1 considering the relation given by equation10.
4. Quantifying the success rate (fraction of outliers correctly identified, i.e. the power of the test computed by MCS, see e.g.Hekimoglu and Koch, 1999;2000) and the misidentification rates for one outlier after having ran m Monte Carlo experiments for each observation.The misidentifications are divided in two types of classes are counted in the simulations: number of experiments where the procedure yielded none observation identification (type II error -β); number of experiments in which the procedure identified a single observation but wrong localization (type III error, denoted here as κ).
5. Checking if the lowest power of the test based on MCS (γM) is greater than or equal to the desired power of the test (γ0) for each observation.For example, in the case of γ0 = 0.80 and one outlier, the outlier must be identified in at least 80% of cases.Otherwise, the network must be improved until the power of the test based on MCS is greater than or equal to the pre-stipulated power of test γ0.Here, a new observation is added in the one which presents the lowest power of the test based on MCS.The design stage is finished when γM is greater than or equal to γ0.The adding of new observations in the one less resistant to outlier in the network has been used by some authors.For example, Yetkin and Beber (2012) added the GPS (Global Positioning System) baselines to meet the robustness criterion in the design of GPS networks.The proposed method is summarized as a flowchart in Figure 2.

Numerical Examples
In order to demonstrate the design method in practice, in this section we apply it to a simulated closed leveling network.The goal is to illustrate the design method; further considerations about levelling networks are outside the scope of this study.
We consider a closed levelling network, with one control station (benchmark), and 4 points with unknown heights (A, B, C and D), totaling four minimally constrained points as shown in Figure 3  For each unknown points, there are four height difference measurements.Thus, there are n = 10 observations, u = 4 unknowns, and n -u= 6 degrees of freedom in this simulation.The design matrix (A) has dimension 10 x 4 and the covariance matrix of observations has dimension 1 0 x 1 0. Each station is involved in four height differences, so there are three redundant observations for the determination of each unknown.
Here, we define a threshold power of the test of DS of γ0 = 0.80.The significance level is taken as α0 = 0.001 (the typically adopted value, see, e.g.Baarda, 1968), so the critical value of the χ² distribution for the statistic Tq=1 is 0 K  = 10.83.Here, 15,000 Monte Carlo experiments were defined for each observation.In each experiment are performed considering positive and negative outliers between 3σ and 9σ magnitude intervals (σ being the respective standard deviation of the observation).Table 1 shows the power of the test based on MCS for each observation without considering the addition of new observations:  1 shows that the observation ∆h1 has the lowest power of the test (66.9%)for the simulated leveling network.In other words, among the 15,000 experiments performed using MCS for this observation, 10035 outliers were correctly identified.Regarding the three classes of misidentification rates, 4485 observations were not identified (29.9%); 405 outlying observations is misidentified (2.7%); and there were 75 cases of over-identification (0.5%). Figure 4 shows the power of the DS for the priori geodetic network configuration.It can be noted that the lowest power of test based on MCS (γM (∆h1) = 66.9) does not achieve the reliability criterion.Therefore, the leveling network must be improved.It is important to mention that the power of the test defined here as quality criteria was of 0.80.The user can choose another level of this probability.However, if the user increases the level of the power of the test, in general, it also increases the cost, since more observations must be needed or other equipment of better precision must be acquired to meet the quality criteria.Furthermore, the magnitudes of the outlier and the significance level have a direct relationship.
The smaller the size of the outlier to be identified, the higher the level of significance should be.However, the user must be careful not to set a significance level and a magnitude of the outlier (first step of the method) that makes the design geodetic network unfeasible from the point of view of cost.Thus, we suggest to the user to analyze the best scenario with different levels of significance (e.g.Rofatto et al., 2017).The analysis of the significance level is outside the scope of this paper and it will be addressed in a future research.

CONCLUSION
The conclusions are highlighted in the following: • The proposed method to design a geodetic network is based on numerical simulation.The method discards the use of the observation vector of Gauss-Markov model.In fact, the only needs are the geometrical network configuration (given by Jacobian matrix); the uncertainty of the observations (given by nominal standard deviation of the equipment); and the magnitude intervals of the outliers.The proposed method seeks to maximize the internal reliability by adding new observations in the one which presents the lowest power of test based on MCS.This process is repeated until a sufficiently robust network is obtained.Therefore, it can be applied for any kind of geodetic network.• Here, the proposed method was applied to a closed levelling network.It was defined a threshold power of the test of DS of 0.80 (80%) and the significance level was taken as 0.001 (0.1%).15,000 Monte Carlo experiments were defined for each observation.In each experiment are performed considering positive and negative outliers between 3σ and 9σ magnitude intervals.Results showed the needs of five additional observations between adjacent stations.The addition of these new observations improved the internal reliability of approximately 18%, i.e. it increased the probability of the ability of the measurements scheme to identify an outlier in the set of the observations.
outlier in each observation y1 and y3).
estimated random error vector and a posteriori covariance matrix of the estimated random error computed by LSE into H0, respectively.

Figure 1 :
Figure 1: Probability levels related to testing hypotheses of DS.

Figure 2 :
Figure 2: Flowchart of the proposed method.
. The benchmark is fixed, and the distances of the adjacent and non-adjacent stations are approximately 240 m and 400 m, respectively.The equipment used is a spirit level with nominal standard deviation for a single staff reading of ±0.02 mm/m.Lines of sight distances are kept at 40 m.Thus, each total height difference Δhi between adjacent or non-adjacent stations is made of, respectively, three or five partial height differences (p).Each partial height difference, in turn, involves one instrument setup and two sightings: forward and back.The standard deviation for each Δhi equals to σi = ±√2p × 40 × 0.02 mm/m, i.e. σi = ±√2p × 0.8 mm, where p is 3 or 5.The readings are assumed uncorrelated and σ0 2 = 1.

Figure 4 :
Figure 4: Power of the DS for the initial geodetic network configuration.

Table 1 :
Power of the iterative DS and misidentification rates (%) for each observation in a priori network configuration Observation (∆hi) Power of iterative DS (%) Type II Error Type III Error Over-identification