USING A LEAST SQUARES SUPPORT VECTOR MACHINE TO ESTIMATE A LOCAL GEOMETRIC GEOID MODEL

Usando o método máquina de vetor de suporte por mínimos quadrados para estimar um modelo geoidal local

SZU-PYNG KAO

In this study, global positioning system (GPS) control points with known first-order orthometric heights in a test region were surveyed using the real-time kinematic GPS measurement method to obtain their plane coordinates and ellipsoidal heights. Plane-fitting, second-order curve-surface fitting, back-propagation (BP) neural network, and least squares support vector machine (LS-SVM) calculation methods were employed. The study includes a discussion on data integrity and localization, and on changing the number and distribution of reference points to obtain an optimal solution. Furthermore, the LS-SVM was combined with local geoidal-undulation models that were established by investigating three kernel functions. The results indicated that the overall precision of the local geometric geoidal-undulation values calculated using the radial basis function (RBF) and third-order polynomial kernel functions was optimal, with a root mean square error (RMSE) of approximately ±1.5 cm. These findings demonstrate that the LS-SVM provides a rapid and practical method for determining orthometric heights and should serve as a valuable academic reference regarding local geoid models.

Bol. Ciênc. Geod., sec. Artigos, Curitiba, v. 20, no 2, p. 427-443, abr-jun, 2014.


INTRODUCTION
The gravimetric method is the technique most commonly used to precisely determine the local geoid. Recently, a gravimetric geoid model covering Taiwan was generated using gravity survey data, which are relatively difficult and time-consuming to measure and often yield results that do not fit well with the local terrain. Gravimetric geoid results fit large areas better than small areas. Because gravity data are lacking in the mountainous regions of Taiwan, this study was conducted to generate a regional geoid model that yields adequate accuracy for GPS-based leveling but does not require a high-resolution gravimetric geoid model to evaluate a small area. Local geoid determination is possible by using various geometric methods, such as the LS-LSM method previously proposed by the authors. After the local geometric geoid heights at the points of interest are calculated, combining the GPS-derived ellipsoid heights with an accurate geoid model provides an alternative method for determining orthometric heights. Hence, a stand-alone geometrically derived geoid model must be constructed to transform the ellipsoid height h from the GPS into the orthometric height H in research or real-time situations.
If engineers can leverage GPS survey techniques to establish a precise local geometric-geoidal model and a system for converting between ellipsoidal and orthometric heights, the traditional leveling survey model can be modified, improving measurement operations. The current geoid establishment methods comprise the astrogeodetic and astrogravimetric leveling, gravitational field model, and mathematical function-fitting methods (ABDALLA et al., 2011; AKCIN and CELIK, 2013; FEATHERSTONE et al., 2004; KAO and SHEN, 2011; KAO et al., 2010; KAO, 2006; KAO and BETHEL, 1992a, 1992b; LIN, 2007; SHEN, 2011; TRANES et al., 2007; USTUN and DEMIREL, 2006; WANG, 2005; YOU, 2006; YANG, 1999). If the local geoid varies smoothly within a certain range, then mathematical functions can be used to reflect its spatial distribution. In the mathematical function-fitting method, GPS technology is employed to measure sufficient control points in a survey area, and the ellipsoidal heights are derived based on local ellipsoids. The orthometric height is determined using direct leveling surveys. If the influence of the vertical deflection is disregarded, local geometric-geoidal undulations can be derived by subtracting the orthometric height from the ellipsoidal height of a location, enabling a partial simulation of the geometric geoidal undulations between the local geoid and the ellipsoid by using a mathematical function-fitting method.

In this study, GPS control points with known first-order orthometric heights in a test region and the GPS real-time kinematic (GPS RTK) measurement method were used to obtain the plane coordinates (x, y) and ellipsoidal height (h) of each survey point. Various fitting calculation methods and numbers of fitting points were used to determine the optimal solution of each fitting method. Integrity and localization were discussed, and an accurate local geometric-geoid model was used to perform local geometric-geoid calculations. The study was executed in the Taichung Metropolitan area of Taiwan.
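The height relation described above, N = h - H with the deflection of the vertical disregarded, can be illustrated with a minimal Python sketch. The function names and the numerical values are hypothetical and serve only as an illustration; they are not data from the test region.

```python
def geoid_undulation(h_ellipsoidal, H_orthometric):
    """Geometric geoid undulation N = h - H (all heights in meters)."""
    return h_ellipsoidal - H_orthometric


def orthometric_height(h_ellipsoidal, N_model):
    """Orthometric height H = h - N, using an undulation taken from a geoid model."""
    return h_ellipsoidal - N_model


# Hypothetical control point (values invented for illustration only).
h = 120.734  # ellipsoidal height from GPS, in meters
H = 100.912  # orthometric height from leveling, in meters

N = geoid_undulation(h, H)
print(round(N, 3))
```

Once a local model supplies N at a new point, `orthometric_height` converts a GPS-derived ellipsoidal height into an orthometric height without a leveling run.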
The ellipsoid heights of the check points were derived from the RTK GPS survey, and the orthometric heights of these check points were observed using a first-order geodetic leveling survey, which had a forward and backward section misclosure of ±3.0√K mm, where K is the length of the leveling line in kilometers. The root mean square errors (RMSEs) listed in Table 2 reflect random and systematic errors and data inconsistencies. Thus, the determined surface models absorb the systematic errors and spatial distortions hidden in the known orthometric and ellipsoid heights of the control points, primarily based on the ellipsoid heights determined using the RTK GPS survey. The accuracy of the orthometric height determined using the GPS RTK survey depends on the accuracy of the ellipsoid and geometric geoid heights. Increasing the observation lengths of GPS RTK field surveys can enhance the quality of the orthometric heights determined using GPS leveling.
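As a small worked example of the ±3.0√K mm expression, the following Python sketch evaluates it for a given line length. Reading the expression as an allowable misclosure is an interpretation made here, and the function name is hypothetical.

```python
import math


def misclosure_limit_mm(distance_km, coefficient_mm=3.0):
    """Evaluate coefficient * sqrt(K) in millimeters, with K in kilometers."""
    return coefficient_mm * math.sqrt(distance_km)


# For a 4 km leveling line the expression gives 3.0 * sqrt(4) mm.
limit = misclosure_limit_mm(4.0)
print(limit)
```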
The primary goal of this study was to combine the rapid data acquisition of the GPS-RTK method with direct leveling survey results and the least squares support vector machine (LS-SVM) to construct a local geoid model that uses three kernel functions to enhance precision and practical value. This would reduce the operating time required to conduct spirit-leveling surveys and improve the efficiency of practical engineering applications.

THE LEAST SQUARES SUPPORT VECTOR MACHINE
Numerous researchers (ABDALLA et al., 2011; KAO et al., 2011; LIN, 2007; YOU, 2006) have investigated combined methods for improving local geoids by using GPS and geodetic leveling data. These scholars have presented multiple useful tools and interpolation methods. In this paper, the LS-SVM is applied to compute local geometric geoids. Vapnik (1995a, 1995b) proposed the support vector machine (SVM), a machine learning method based on the principle of structural risk minimization (SRM). According to statistical learning theory (VAPNIK, 1995a, 1995b), if data are subject to a certain (fixed but unknown) distribution, the machine should follow the SRM principle to minimize the deviation between the actual and desired outputs. This differs from the empirical risk minimization principle. In brief, the machine must minimize the upper bound of the error probability; the SVM is the realization of this theory. The SVM is simpler than traditional artificial neural networks (ANNs; AKCIN and CELIK, 2013; LIN, 2007); however, many approaches based on statistical decision methods have difficulty generalizing to the desired results when only limited samples are available. Thus, Vapnik (1995a, 1995b) proposed the Vapnik-Chervonenkis dimension (VC dimension), defined as follows: if a set of functions can separate N samples in all $2^N$ possible ways, the VC dimension is the largest number of samples N that can be separated in this manner. A large VC dimension indicates poor function generalization ability, and a small VC dimension indicates strong function generalization ability (VAPNIK, 1995a, 1995b). The SVM constraint is an inequality that involves an ε-insensitive loss function; thus, SVM calculation is complex. Therefore, Suykens and Vandewalle (1999) used the least squares quadratic loss function to replace the insensitive loss function of the SVM, changing the constraints into equations to construct the LS-SVM. This method involves linear and nonlinear problems as follows.

Linear Problems
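The objective function is determined as follows (SUYKENS and VANDEWALLE, 1999; SHEN, 2011):

$$\min_{w,b,e} J(w, e) = \frac{1}{2} w^{T} w + \frac{\gamma}{2} \sum_{i=1}^{N} e_i^{2} \qquad (1)$$

subject to $y_i = w^{T} x_i + b + e_i$, $i = 1, \ldots, N$, where $w$ is the weight vector, $b$ is the bias, $e_i$ is the error variable of the $i$-th sample, and $\gamma$ is the regularization parameter used to adjust the balance between classification boundary maximization and error minimization. Lagrange multipliers $\alpha_i$ are used to change function $J$ into the following function:

$$L(w, b, e, \alpha) = J(w, e) - \sum_{i=1}^{N} \alpha_i \left( w^{T} x_i + b + e_i - y_i \right) \qquad (2)$$

To determine the optimal solution for function $J$, the partial derivatives with respect to the parameters $(w, b, e_i, \alpha_i)$ are set to zero:

$$\frac{\partial L}{\partial w} = 0 \Rightarrow w = \sum_{i=1}^{N} \alpha_i x_i, \quad \frac{\partial L}{\partial b} = 0 \Rightarrow \sum_{i=1}^{N} \alpha_i = 0, \quad \frac{\partial L}{\partial e_i} = 0 \Rightarrow \alpha_i = \gamma e_i, \quad \frac{\partial L}{\partial \alpha_i} = 0 \Rightarrow w^{T} x_i + b + e_i - y_i = 0 \qquad (3)$$

The following is derived after $w$ and $e_i$ are eliminated:

$$\begin{bmatrix} 0 & \mathbf{1}^{T} \\ \mathbf{1} & \Omega + \gamma^{-1} I \end{bmatrix} \begin{bmatrix} b \\ \alpha \end{bmatrix} = \begin{bmatrix} 0 \\ y \end{bmatrix} \qquad (4)$$

where $y = [y_1, \ldots, y_N]^{T}$, $\mathbf{1} = [1, \ldots, 1]^{T}$, $\alpha = [\alpha_1, \ldots, \alpha_N]^{T}$, $\Omega_{ij} = x_i^{T} x_j$, and $I$ is the unit matrix.

Variables $\alpha$ and $b$ are obtained by solving Equation (4), yielding the following linear LS-SVM regression:

$$y(x) = \sum_{i=1}^{N} \alpha_i x_i^{T} x + b \qquad (5)$$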
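The linear case can be sketched in a few lines of pure Python. The sketch assumes the standard LS-SVM regression system [[0, 1ᵀ], [1, Ω + I/γ]] [b; α] = [0; y] with Ω_ij = x_i · x_j, which is the form given by Suykens and Vandewalle (1999). The function names, the synthetic control points, and the value of γ are assumptions for illustration; this is not the Matlab implementation used in the study.

```python
def solve_linear_system(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [list(row) + [r] for row, r in zip(A, rhs)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def train_linear_lssvm(X, y, gamma):
    """Build and solve the bordered system; returns (b, alpha)."""
    n = len(X)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [dot(X[i], X[j]) + (1.0 / gamma if i == j else 0.0) for j in range(n)]
        A.append([1.0] + row)
    solution = solve_linear_system(A, [0.0] + list(y))
    return solution[0], solution[1:]


def predict_linear_lssvm(X, alpha, b, x_new):
    """Evaluate y(x) = sum_i alpha_i * (x_i . x) + b."""
    return sum(a * dot(xi, x_new) for a, xi in zip(alpha, X)) + b


# Synthetic control points (plane coordinates in km, undulation in m).
X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
y = [20.0 + 0.10 * px - 0.05 * py for px, py in X]
b, alpha = train_linear_lssvm(X, y, gamma=1.0e4)
pred = predict_linear_lssvm(X, alpha, b, (0.5, 0.5))
print(pred)
```

Because the synthetic undulations lie exactly on a plane and γ is large, the prediction stays close to the plane value at the query point. With real data, γ controls how closely the surface follows the control points.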

Nonlinear Problems
For nonlinear data, the data are mapped into a feature space (a higher dimensional space) by using a mapping function $\varphi$, and the problem is converted into a linear problem to determine the solution (SUYKENS and VANDEWALLE, 1999; SHEN, 2011).
Suppose the training sample is $\{(x_i, y_i)\}$, $i = 1, \ldots, N$; by using the mapping function $\varphi$, the sample is converted into $\{(\varphi(x_i), y_i)\}$, $i = 1, \ldots, N$. According to the Mercer condition, the inner product in the feature space is expanded into the following:

$$K(x_i, x_j) = \varphi(x_i)^{T} \varphi(x_j) \qquad (6)$$

where $K$ is the kernel function.
After the inner product is replaced with a kernel function, the nonlinear support vector regression function is as follows:

$$y(x) = \sum_{i=1}^{N} \alpha_i K(x, x_i) + b \qquad (7)$$

Currently, the most commonly used kernel functions are the linear kernel function $K(x, x_i) = x_i^{T} x$; the polynomial kernel function $K(x, x_i) = (x_i^{T} x + 1)^{d}$, where $d$ is the order of the polynomial; and the radial basis function (RBF) $K(x, x_i) = \exp\left(-\lVert x - x_i \rVert^{2} / \sigma^{2}\right)$, where $\sigma$ is the kernel function bandwidth.
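The three kernel functions named above can be sketched in Python as follows. The exact forms are assumptions taken from the conventions of Suykens and Vandewalle (1999): an offset of 1 in the polynomial kernel and σ² (rather than 2σ²) in the RBF denominator. Other conventions exist, and the function names are hypothetical.

```python
import math


def linear_kernel(x, z):
    """K(x, z) = x . z"""
    return sum(a * b for a, b in zip(x, z))


def polynomial_kernel(x, z, d=3):
    """K(x, z) = (x . z + 1)^d; d = 3 is the third-order case."""
    return (linear_kernel(x, z) + 1.0) ** d


def rbf_kernel(x, z, sigma=1.0):
    """K(x, z) = exp(-||x - z||^2 / sigma^2); sigma is the bandwidth."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-squared_distance / sigma ** 2)


def kernel_predict(kernel, X_train, alpha, b, x_new):
    """Evaluate y(x) = sum_i alpha_i * K(x_i, x) + b for trained alpha and b."""
    return sum(a * kernel(xi, x_new) for a, xi in zip(alpha, X_train)) + b


print(linear_kernel((1.0, 2.0), (3.0, 4.0)))
print(polynomial_kernel((1.0, 2.0), (3.0, 4.0), d=3))
print(rbf_kernel((0.0, 0.0), (1.0, 0.0), sigma=1.0))
```

Replacing the inner product in the linear system with any of these kernels yields the corresponding nonlinear LS-SVM. Only the kernel matrix changes, not the solver.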

Parameter Selection
Researchers must determine model parameters during model training; however, no standard approach exists for selecting parameters. The polynomial kernel function has strong predictive power, and a lower order generally corresponds to a stronger predictive ability. Experiments have shown that orders 1 to 7 constitute a suitable range, and the polynomial recognition rates for the third and fourth orders are superior. If the results are similar, the lower order is typically used. When the order is excessively large, the kernel matrix element values either approach infinity or become infinitesimal (ZHOU, 2009; FU, 2010). For the RBF kernel function employed in this study, grid search was combined with cross-validation (HSU et al., 2003) to form an optimal parameter selection method; this approach is commonly applied and simple to implement. The grid search method yields an optimal parameter set, and cross-validation prevents the model from overfitting.
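The combination of grid search and k-fold cross-validation can be sketched generically in Python. To keep the example short, the model being tuned is a stand-in Gaussian-weighted average rather than the LS-SVM itself. The fold layout, the candidate bandwidths, the synthetic data, and all function names are assumptions for illustration only.

```python
import itertools
import math


def kfold_indices(n, k):
    """Split indices 0..n-1 into k interleaved folds."""
    return [list(range(i, n, k)) for i in range(k)]


def cv_rmse(fit_predict, X, y, params, k=5):
    """Cross-validated RMSE of fit_predict(train_X, train_y, test_X, **params)."""
    squared_error, count = 0.0, 0
    for fold in kfold_indices(len(X), k):
        held_out = set(fold)
        train_X = [X[i] for i in range(len(X)) if i not in held_out]
        train_y = [y[i] for i in range(len(X)) if i not in held_out]
        preds = fit_predict(train_X, train_y, [X[i] for i in fold], **params)
        squared_error += sum((p - y[i]) ** 2 for p, i in zip(preds, fold))
        count += len(fold)
    return math.sqrt(squared_error / count)


def grid_search(fit_predict, X, y, grid, k=5):
    """Try every parameter combination; return (best_score, best_params)."""
    names = sorted(grid)
    best = None
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = cv_rmse(fit_predict, X, y, params, k)
        if best is None or score < best[0]:
            best = (score, params)
    return best


def smoother_fit_predict(train_X, train_y, test_X, sigma):
    """Stand-in model (NOT the LS-SVM): Gaussian-weighted mean of training values."""
    preds = []
    for x in test_X:
        weights = [math.exp(-sum((a - b) ** 2 for a, b in zip(x, xi)) / sigma ** 2)
                   for xi in train_X]
        preds.append(sum(w * v for w, v in zip(weights, train_y)) / sum(weights))
    return preds


# Synthetic smooth surface on a 5 x 4 grid of points.
X = [(float(i), float(j)) for i in range(5) for j in range(4)]
y = [20.0 + 0.10 * px - 0.05 * py for px, py in X]
grid = {"sigma": [0.5, 1.0, 2.0, 4.0]}
best_score, best_params = grid_search(smoother_fit_predict, X, y, grid, k=5)
print(best_params, best_score)
```

To tune an LS-SVM with this scaffold, `smoother_fit_predict` would be replaced by a function that trains on the fold and predicts the held-out points, and the grid would gain a second axis for γ.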
Figure 1 shows a flow chart presenting the primary aspects and functionality of the LS-SVM used in this study.
Figure 1 - A flow chart presenting the primary aspects and functionality of the LS-SVM used in this study.

Research Data and Collection
Figure 2 shows the study area and all known control points and orthometric heights (H) selected in this study. The triangle points represent the control points where spirit leveling and GPS RTK data were captured. The GPS-RTK method was first employed to measure the plane coordinates (x, y) and ellipsoidal heights (h) of the 78 points that met the precision requirements. The GPS-RTK observation data were received once per second. The instrument began to record data when the initial plane precision was within ±1.5 cm and the height precision was within ±2.0 cm. The orthometric heights of the known control points met the precision of first-order Class-2 leveling surveys. The geometric geoid heights of these 78 control stations (Fig. 2) vary from 19.2 m in the west to 20.7 m in the east. Figures 2 and 3 show contour and 3D maps of the constructed geometric undulations of the 78 control stations in the study area.

Research Methods
The orthometric height (H) obtained from the spirit leveling information was subtracted from the ellipsoidal height (h) measured using the GPS RTK. The results were used to derive the local geometric geoid undulations (N) that serve as the known values. In this study, Matlab was used as the LS-SVM calculation platform. The plane coordinates of the points (N, E) were entered in Matlab to calculate the geometric geoid undulation of the fitting region. The results were compared with the known geoid undulation to derive the residuals ΔN. The precision of the local geoid values calculated using the various methods was compared with that of the fitting methods used in previous studies (WANG, 2005; CHUNG, 2008; KAO and SHEN, 2011) that used the same experimental conditions. The various LS-SVM model kernel functions were used to establish the optimal fitting points of the study area. The determined geometric geoid was verified by evaluating the orthometric heights at selected benchmarks. The primary focus of this study was increasing the accuracy of the local geometric geoid, ensuring a simple and rapid transformation of the ellipsoid height into the orthometric height in the test area. Interpolation is a crucial step when determining a local geometric geoid. Substantial computation time may be required if numerous terms of a function are used and a large data set is required to form the model. Surveyors should occupy the minimum number of GPS and leveling points needed, to conserve working hours and to limit spirit leveling in difficult terrain (e.g., mountainous areas), while ensuring the level of accuracy required in practical engineering surveys.
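The residual and RMSE computation used for this comparison can be sketched in Python as follows. The sign convention (fitted minus known) and the numerical values are assumptions made for illustration; they are not results from the study.

```python
import math


def residuals(predicted, known):
    """Residuals dN = fitted undulation - known undulation."""
    return [p - k for p, k in zip(predicted, known)]


def rmse(values):
    """Root mean square of a list of residuals."""
    return math.sqrt(sum(v * v for v in values) / len(values))


# Invented checkpoint undulations in meters (illustration only).
N_known = [19.850, 20.120, 20.430]
N_fitted = [19.860, 20.100, 20.430]

dN = residuals(N_fitted, N_known)
print(round(rmse(dN) * 100.0, 2), "cm")
```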

Research Procedures
The steps of this study are summarized as follows:
(1) Define the input as the plane coordinates of the points (N, E) and the output as the geoid undulation (N).
(2) Select the model parameters: use the third-order polynomial for the polynomial kernel function, and use a grid search to select the optimal parameters of the RBF kernel function.

STUDY RESULTS AND ANALYSIS
First, the prediction outcomes obtained in this study were compared with those of previous studies that used the quadratic surface fitting, back-propagation (BP) neural network (WANG, 2005), and multi-surface function (CHUNG, 2008) methods in the same test area. This comparison was conducted to explore the applicability of the LS-SVM. The same fitting points were tested to compare the accuracy of the derived geometric geoid with the results of Wang (2005) and Chung (2008). Therefore, 35 of the 78 GPS stations used by Wang (2005) were selected as checkpoints and the remaining 43 stations were used to construct the geometric geoid. The differences between the predicted and known values were used to measure the accuracy of the local geometric geoid. According to the results of Wang (2005), the RMSEs could attain ±2.00 cm and ±1.89 cm by adopting curve surface fitting and BP ANNs, respectively, to build the geometric geoid model. According to Chung (2008), the RMSEs could attain ±2.62 cm, ±3.52 cm, and ±2.62 cm by adopting the hyperbolic curve, third power of distance, and squared distance methods, respectively. This precision was sufficient for use in engineering surveys.

Therefore, in this study, the RMSE was calculated for the same test region and fitting point conditions, using checkpoints that were not employed in model training. The results were used to conduct a comparative analysis. The 43 fitting points and 35 checkpoints used in the BP neural network study of Wang (2005) and the 40 fitting points and 38 checkpoints used in the multi-surface function study of Chung (2008) were evaluated. The study area comprised 78 points (fitting points and checkpoints), and Tables 1 and 2 display the relevant results. Because the real geoid surface was modeled as smooth in the test area, any interpolation method produces similar results (Tables 1 and 2). Thus, the various methods used to derive the local geoid model are all suitable and similar.

In this study, a method was presented that uses the LS-SVM to approximate the regional geoid surface. Although the fitting performance levels of the various models yield small RMSE differences of a few millimeters, the proposed LS-SVM method yielded the smallest RMSE, indicating that it can refine the approximate geoid surface in the test area. The test results indicated that the accuracy of the geometric undulation interpolation estimated using the LS-SVM was on the order of 1.5 cm. Based on these results, the estimated undulation accuracy attained using the LS-SVM was superior to that attained using the curve fitting and BP ANN methods (LIN, 2007) by an order of 2-4 cm in the same test area.
The results shown in Tables 1 and 2 indicate that although the geometric geoid undulations calculated using the LS-SVM with the linear kernel function were less precise than those calculated using the second-order curve surface, BP neural network, and multi-surface function methods, the precision obtained using the RBF and polynomial kernel functions was superior to that obtained using the other methods. Therefore, the LS-SVM was used to calculate local geometric geoid undulations.

Optimization Tests
In this section, the optimal local geometric geoid model is determined for the study area. The 2-point increment method was adopted within the study area to calculate the RMSE values of the three kernel functions for various numbers of fitting points. As the number of points increased, the selected points were distributed throughout the study area as uniformly as possible, and the minimum RMSE was used as the basis to determine the optimal number of fitting points. Table 3 shows the analysis results.
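The 2-point increment procedure can be sketched in Python as a loop over candidate numbers of fitting points, keeping the number that gives the smallest checkpoint RMSE. The interpolator here is a stand-in (inverse-distance weighting, not the LS-SVM). Evenly spaced indices are used as a simple proxy for a spatially uniform selection, and the data are synthetic. All names are hypothetical.

```python
import math


def idw_predict(fit_X, fit_y, x, power=2):
    """Stand-in interpolator (NOT the LS-SVM): inverse-distance weighting."""
    numerator = denominator = 0.0
    for xi, yi in zip(fit_X, fit_y):
        d2 = sum((a - b) ** 2 for a, b in zip(x, xi))
        if d2 == 0.0:
            return yi  # query coincides with a fitting point
        weight = 1.0 / d2 ** (power / 2)
        numerator += weight * yi
        denominator += weight
    return numerator / denominator


def spread_indices(n_total, n_pick):
    """Pick n_pick indices spaced as evenly as possible over 0..n_total-1."""
    return [round(i * (n_total - 1) / (n_pick - 1)) for i in range(n_pick)]


def checkpoint_rmse(X, y, fit_idx):
    """RMSE at the points that were not used for fitting."""
    fit_set = set(fit_idx)
    fit_X = [X[i] for i in fit_idx]
    fit_y = [y[i] for i in fit_idx]
    errors = [idw_predict(fit_X, fit_y, X[i]) - y[i]
              for i in range(len(X)) if i not in fit_set]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def best_fitting_count(X, y, start, stop, step=2):
    """Increase the number of fitting points by `step`; keep the minimum-RMSE count."""
    scores = {n: checkpoint_rmse(X, y, spread_indices(len(X), n))
              for n in range(start, stop + 1, step)}
    return min(scores, key=scores.get), scores


# Synthetic smooth surface sampled at 30 points.
X = [(float(i), float(j)) for i in range(6) for j in range(5)]
y = [20.0 + 0.10 * px - 0.05 * py + 0.01 * px * py for px, py in X]
best_n, scores = best_fitting_count(X, y, start=6, stop=20)
print(best_n, scores[best_n])
```

The checkpoint set shrinks as the fitting set grows, so RMSE values for different counts are computed over different checkpoints, as in any such split.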

The results in Table 3 indicate that when 30 fitting points were used, the ΔN RMSE values derived using the RBF and polynomial kernel functions were minimal. This number also met the sample-size criterion of about 30, at which errors can be assumed to be normally distributed. Therefore, 30 points were used to perform model training and the remaining 48 points were used as checkpoints. Figure 4 shows the distribution of the points.
The graphed results presented in Figures 5 and 6 indicate that the large residual values of the linear kernel function primarily occurred at the outer corners of the test region; the points in the eastern and western portions of the test region and in parts of the southern region present fitted values that are smaller than the known values. This underestimation might occur because certain areas lack surrounding fitting points. The points in the central and northern regions present fitted values that are higher than the known values, indicating overprediction in these regions. The residuals of the RBF and polynomial kernel functions showed no significant trends. The results shown in Table 4 indicate that the local geometric geoid undulation fitted using the RBF and polynomial kernel functions yielded an overall precision of approximately ±1.47 cm compared with the known geometric geoid undulation. The results obtained using the linear kernel function exceeded the limit for indirect elevation observations and inverse elevation calculations, which should be less than 5 cm. The results presented in Table 5 indicate that, compared with the local geometric geoid undulation models of the test area constructed using the second-order curve surface and BP neural network (WANG, 2005) and the multi-surface function (CHUNG, 2008) methods, the proposed LS-SVM method requires only 30 fitting points. This is 10 or more points fewer than the number required by the second-order curve surface, BP neural network, or multi-surface function methods, yet the proposed method yields a comparable level of precision.

CONCLUSION
In this study, the LS-SVM was applied to construct a local geoid undulation model. Experimental testing indicated that the proposed method effectively predicted the geometric geoid undulation and met the precision requirements. Various kernel functions can be used to construct distinct LS-SVM models. The results indicated that using the RBF and polynomial kernel functions to evaluate the study area produced a superior fitting precision of approximately ±1.5 cm. The fit obtained using the linear kernel function was less satisfactory than those obtained using these two kernel functions.
No single optimal procedure exists for selecting LS-SVM parameters. The current experiment demonstrated that the grid search method can feasibly be used to select the parameters through a systematic and complete search. In addition to model parameter selection, the fitting results of the proposed LS-SVM were influenced by the density and distribution of the known points. In this study, after the number and distribution of the points were changed, the three kernel functions yielded differing fitting results, and the number of fitting points required was reduced by 10 or more. The checkpoint errors of the RBF and polynomial kernel functions were controlled within ±5 cm, meeting the elevation precision requirements of engineering surveys. This level of precision cannot be achieved using the linear kernel function.
$e_i$: the error variable used to measure the classification error.

Figure 2 - Distribution of the 78 control stations and contour map of the original geometric undulations of the study area.

Figure 3 - 3D contour map of the original geometric undulations of the 78 control points of the study area (unit: m; undulation is exaggerated in the up direction).
(3) Perform model initialization and use the fitting points (training samples) combined with the linear, polynomial, and RBF kernel functions to establish the LS-SVM model.
(4) Input the checkpoints (test samples) into the trained LS-SVM model to yield the local geometric geoid undulation forecasts.

Figure 4 - Study area fitting-point and checkpoint distribution chart.

Figure 5 - Residual chart of the three kernel functions for the region relative to the known geometric geoid undulation.

Figure 6 - Differences in local geoid undulation at 48 global positioning satellite leveling control points for the three kernel functions of the LS-SVM.

Table 1 - Statistical results for the geoid undulation differences at 35 checkpoints predicted by three methods for the study area.

Table 2 - Statistical results for the geoid undulation differences at 38 checkpoints predicted by three methods for the study area.

Table 3 - The ΔN RMSE results (cm) fitted using various numbers of LS-SVM fitting points.

Table 4 - The ΔN statistical results for the 48-point checkpoint fitting of the study area.

Table 5 - Precision comparison of the local geometric geoid undulation for the test area derived using different fitting methods.