STOCHASTIC MODEL OF THE BRAZILIAN GPS NETWORK COORDINATES TIME SERIES

It is well known that daily estimates of GPS coordinates are highly temporally correlated and that the knowledge and understanding of this correlation allows to establish more realistic uncertainties of the parameters estimated from the data. Despite this, there are currently no studies related to the analysis and calculation of the noise sources in geodetic time series in Brazil. In this context, this paper focuses on the investigation of the stochastic properties of a total of 486 coordinates time series from 159 GPS stations belonging to the Brazilian Network for Continuous Monitoring of GNSS (RBMC) using the maximum likelihood estimation approach. To reliably describe the GPS time series, we evaluate 4 possible stochastic models as models of each time series: 3 models with integer spectral indices (white noise, flicker plus white noise and randomwalk plus white noise model) and 1 with fractional spectral index (fractional power-law plus white noise). By comparing the calculated noise content values for each model, it is possible to demonstrate a stepwise increase of the noise content, being the combination of a fractional power-law process and white noise process, the model with smaller values and the combination of random walk process with white noise process, the model with greater values. The analysis of the spatial distribution of the noise values of the processes allow demonstrate that the GPS sites with the highest accumulated noise values, coincide with sites located in coastal zones and river basins and that their stochastic properties can be aliased by the occurrence of different physical signals typical of this type of zones, as the case of the hydrological loading effect.


Introduction
It is well known that the noise in continuous GPS observations, as with many geophysical phenomena, can be described as a power-law process (Mandelbrot and Van Ness, 1968;Agnew, 1992).That is, a one-dimensional stochastic process whose time-domain behaviour is such that its power spectrum has the form where 0 P and 0 f are normalizing constants, f is spatial or temporal frequency and k is the spectral index.In most cases, the spectral index k lies within the range 31 k −   and it can be subdivided into "fractional Brownian motions" with 31 k −   − and "fractional Gaussian noise" with 11 k −   .Special cases of these processes also include the classical Brownian motion (called "random-walk noise") with 2 k =− , the "flicker noise" process with 1 k =− and the uncorrelated "white noise" process with The importance of knowing and understanding the noise content of GPS data is because it allows establishing more realistic uncertainties of the parameters estimated from them.Johnson & Agnew (1995), for instance, demonstrated that neglecting the effect of long-range time-Bulletin of Geodetic Sciences, 24(4): 545-563, Oct-Dec, 2018 dependent correlations noise, makes the uncertainty in the estimated velocities much too small, and if the correlated and independent noise sources have a similar magnitude, the expected improvement in uncertainty from having more measurements is minimum.Zhang et al. (1997), in turn, showed that if the noise of data is purely white, it generates an underestimation of site rate uncertainties 3 or 5 times greater than when using a white plus flicker noise process, a more suitable stochastic model.Likewise, Langbein and Johnson (1997), found that the correlated nature of random-walk noise can cause time-dependent fluctuations in the time series which could be misinterpreted as tectonic signals if these were not previously eliminated.Finally, Williams (2003), proved that the uncertainties of sites rates could be estimated on the basis of the error model ( ) xt t  assumed for the data.This can be expressed as a linear combination of independent unit-variance random variables ( ) and temporally correlated random variables where a and k b are the variance of white noise and coloured noise (power-law processes other than classical white noise) of spectral index k respectively.
Currently, there are many different methods for assessing the noise content and their elements in time series (Johnson and Agnew, 1995;Langbein and Johnson, 1997;Wdowinski et al., 1997;Zhang et al., 1997;Mao et al., 1999), however, the most robust is the Maximum Likelihood Estimator (MLE) (Bos et al., 2008(Bos et al., , 2013;;Williams, 2008).It estimates the noise components and the other parameters of the stochastic model, finding the set of values of the model that maximize their likelihood function (Langbein and Johnson, 1997).In general, the MLE can be found the set of values as an explicit function of the observed data, or through global numerical optimization (Bos et al., 2008(Bos et al., , 2013)).Based on the aforementioned, and because currently there are no studies related to the analysis and calculation of the noise sources in GPS time series in Brazil, this study seeks to establish the basis for the description of GPS time series uncertainties at the national level.In this regard, this paper focuses on the analysis of the stochastic properties of the time series of weekly position estimates for 159 sites of the Brazilian Network for Continuous Monitoring of GPS using the modified MLE algorithm established by Bos et al., (2013).Finally, we examined 4 different possible stochastic models.They are a classic white noise (WN), a flicker plus white noise (FL+WN), a random-walk plus white noise RW+WN) and a fractional power-law noise model.

Stochastic model estimation 2.1 Maximum Likelihood Estimator
According to Langbein & Johnson (1997), estimating the noise components and the parameters from the linear function using the maximum likelihood estimator (MLE), requires maximizing the probability function by adjusting the data covariance.Thus, given a vector of observations x , the likelihood function for a covariance matrix C (which represents the assumed noise in the data) is where N is the number of data (epochs), det is the determinant of a matrix and v are the postfit residuals to the linear function using weighted least squares with the same covariance matrix C .For greater numerical stability and since the maximum is unaffected by monotonic transformation, it is often convenient to work with the logarithm form of (3).In this case, the loglikelihood function is given by According to Williams (2008), sometimes it is also necessary to estimate other parameters of the stochastic model than the noise values (common examples of these are the spectral index k of power-law noise models and the cross-over parameter  in first-order Gauss-Markov noise) (Langbein, 2004).This can be done if we include the parameters as extra dimensions in a general uphill simplex together with the set of noise values, however, this process is not recommended because at most iterations the uphill simplex may choose a new value for one of these parameters and therefore create a new unit covariance matrix.The alternative to this is to split the maximization process into two parts.Initially, we can estimate the noise values for fixed noise model parameters (inner maximization) and after that, we can use an outer maximization to estimate the other parameters.The result of this process is an adequate treatment of the routines that deal with covariance matrices (creation, inversion and determinant estimation) and probably to a more stable process.In this sense, the problem of calculating the noise values is reduced to the correct deduction of the covariance matrix.

Covariance matrix of stochastic models
The data covariance matrix can usually be represented through the combination of different stochastic models.That is where m represents a total number of stochastic models, PL C are the unit covariance matrices for temporally correlated noise (power law noise with spectral index k ) and 2 i  the variance of Geodetic Sciences, 24(4): 545-563, Oct-Dec, 2018 the stochastic models.For instance, if we take into consideration the 3 main processes cited in section 1 (random-walk noise, flicker noise and classical white noise), the matrix C takes the form

Bulletin of
where I is the NN  identity matrix (unit covariance matrix for white noise -no crosscorrelation) and RW C and FL C are the unit covariance matrices for random-walk and flicker noise.It is important to note that if the covariance matrix consists of only one noise source, then it takes the form and therefore, the log-likelihood function ( 4) can be re-written as In this case, the residuals v and the estimated parameters can be considered invariant to a scale change in the covariance matrix C (least squares adjustment) (Williams, 2008) and therefore, it is possible to differentiate (7) respect to  and obtain its true value explicitly by Meanwhile, if the covariance matrix C depends on more than one noise source, it is necessary to transform the noise variances to a new common that allows an only one-dimensional numerical maximization (Bos et al., 2008;Williams, 2008).For instance, if we have two noise variances 1  and 2  it is possible to transform these to two alternative variables namely an angle  and a common variance  (driving noise) such that  Geodetic Sciences, 24(4): 545-563, Oct-Dec, 2018 ( ) where C − are the unit covariance matrices for temporally correlated noises with variances 1  and 2  respectively.A generalization of this model was obtained by Bos et al. (2008), In this case, all angles  vary only between 0 and 1.The result of this new model is a Toeplitz covariance matrix C , where only the first column i  determines all its properties (descending diagonal from left to right is constant).In this regard, for instance, the first column i  of the covariance matrix C for any power-law noise model is (Bos et al., 2008): where d is 12 − time the spectral index k .Another important and classical noise model used in the analysis of geodetic times series and which can be determined from (12), is the Generalized Gauss Markov noise.For this, Langbein (2004) modified the classical first-order Gauss Markov model and include the spectral index d in the equation to create a power-law noise with a slope of 2d in the power density spectrum, which flattens to white noise at the very low and very high frequencies (Bos et al., 2008).The new equation for the autocovariance vector A more robust description of the covariance matrices for each stochastic model using in geodetic studies can be found in Williams (2003), Bos et al. (2008), Williams (2008) and Langbein & Johnson (1997).For the present study, we used the research of Williams (2003) as the standard research to follow.

Bayesian Information Criterion
Once the parameter values of the stochastic model have been obtained, it is important to compare the Maximum Likelihood values considering the complexities of the different models (number of estimated parameters both in the linear and the stochastic model) and determine which model is the best adapted to the GPS data.For this, one of the most used tests is the Bayesian Information Criteria (BIC), which is defined as ( ) where MLE is a maximum likelihood estimation value, p is the number of estimated parameters, and N is the number of data.The best model will be the one with the lowest BIC value.
Other criteria used to evaluate the quality of the chosen noise models are the Akaike Information Criteria (AIC), the Deviance Information Criterion (DIC) and the Focused Information Criterion (FIC).A detailed analysis of the selection criteria can be found in Kadane and Lazar (2004).

Data and Methods
In this study, we analyze the time series of weekly position estimates for 159 sites of the Brazilian Network for Continuous Monitoring of GNSS, RBMC, obtained in the last 10 years (2007 -2017).The lengths of the time series range between 1 and 10 years (To see the individual time span of the time series for all GPS station used in the experiments see http://www.sirgas.org/es/sirgas-con-network/stations/station-list/#).The station coordinates were calculated by the data processing centres of the Geocentric Reference System for the Americas (SIRGAS) and available at the ftp://ftp.sirgas.org/pub/gps/SIRGAS.The details concerning processing of GNSS observations can be found in Sánchez et al., (2015) and the RBMC official site https://ww2.ibge.gov.br/home/geociencias/geodesia/rbmc/analise.shtm.

Maximum Likelihood Estimation
Once the residuals have been obtained, we apply the MLE equations to these (section 2.1) to evaluate their stochastic properties.In this study, we evaluate 4 possible stochastic models.They are 3 classical models with integer spectral indices (WN -White noise; FL + WN -flicker plus white noise and RW + WN -random-walk plus white noise model) and 1 model with fractional spectral index (PF + WN -Fractional power-law plus white noise).The respective unit covariance C for each of temporally correlated noises (WN, FL, RW noises) was obtained using (13) and included in the estimation as fixed parameters (inner estimation) together with other parameters of interest such as variance and phase of seasonal signals and offsets.Maximum likelihood estimates of the values of noise sources for all 159 RBMC time series are presented in Appendix 5.1.

Results and Discussion
Figure 1 and Table 1 provide the distribution of the optimal noise models (based on BIC criteria) for the RBMC data.The results show that roughly 70% to 80% of the best noise models are a combination of flicker noise plus white noise.The combination of random-walk plus white noise characterizes between 5% and 16% of the time series and the remaining 5% of the time series are split between the classical white noise model and a fractional power-law noise process.Figure 1: Distribution of noise models for the RBMC data An important aspect to consider is that most of the existing studies only use a power-law noise process to calculate the stochastic model of the time series (Bos et al., 2013;Williams, 2015;Klos et al., 2016;Klos, Bogusz and Moreaux, 2017).In this regard, if we apply a fractional power-law model to all RBMC time series, the spectral index k of the sites lies between -1 or 1 (fractional Gaussian noise) and, in most cases, this tends to be -1 (Figure 2).Initially, this behaviour agrees with the results presented in Table 2 (Flicker nose with 1 k =− ), however, the fractional powerlaw model is not able to detect the presence of random-walk noise in the time series and therefore, it has a limitation in the modelling compared to the approach used in this study.
Bulletin of Geodetic Sciences, 24(4): 545-563, Oct-Dec, 2018 (The symbol "-" indicates that the estimation made did not obtain a noise value for the specified model).It is important to note, the fact that the mean variance of the white noise in this model is almost zero and therefore, it does not represent even one percent of the total accumulated noise value.The next models are the classical white noise model with accumulated noise variance in order of 0,7 mm for horizontal components and 2 mm for vertical component, the Flicker noise plus white noise, with accumulated noise variances of 3 mm for horizontal components and 10 mm for the vertical component and the Random-walk plus white noise with accumulated noise variances in order of 6 mm for horizontal components and 15 mm for vertical component.An interesting detail to emphasize is the growing tendency of the accumulated noise variances between combined models (white noise plus power-law noise) (i.e. the noise grows by a factor of 3).For instance, the variance of PF + WN model is 3 times lower than the variance of FW + WN model, and this is 3 times lower than the variance of RW + WN.The same applies to the components of the sites.In this case, the horizontal components are also less noisy than the vertical component by a factor of 3, which coincides with results obtained in other studies (Zhang et al., 1997;Mao et al., 1999;Langbein, 2004)  On the other hand, it is important to note the high magnitudes of the noise of the RW + WN model when compared to the others.In this regard, Table 3 shows the 10 sites with the highest accumulated noise values.The clear majority of them are just an RW + WN process.According to Langbein and Johnson, (1997), this marked variability of the RW+WN noise model, is often due to the incidence of two principal processes: the precision of the instrument and the motion of the geodetic monument.To analyze this precept, Figure 3 illustrates the location of all GPS sites together with the spatial distribution of the vertical accumulated noise.As can be seen from the Figure 3, it is evident that most GPS sites, whose stochastic model is an RW + WN process, coincide with sites located in coastal zones and river basins.Initially, it may imply that the behaviour of the data of these stations is aliased by the prevalence of different physical signals typical of this type of zones, such as the hydrological, non-tidal oceanic loading or atmospheric pressure loading.A particular example of a GPS site whose motion is aliased by external physical effects is the NAUS station, whose vertical displacements can vary around 7 cm (variance of seasonal signal) (Figure 4).
It is interesting to see that in the present research, the Manaus station is precisely the station with the highest level of noise.It suggests that the claim that the high variances of RW + WN processes are produced by the motion of the site is correct, however, there is not enough evidence to ensure this.Finally, a detailed description of the variances of the noises for each station is presented in appendix 5.1.

Conclusion
We have analyzed the stochastic properties of GPS position time series of the 159 sites of the Brazilian Network for Continuous Monitoring of GNSS, RBMC, obtained in the last 10 years (2007 -2017) and determine that more than 90% of these stations have as stochastic model a linear combination of white noise process and either a flicker noise or random-walk noise process.
Comparison of the variances of models allows demonstrating a stepwise increase of the noise variances, being the combination of a fractional power-law process and white noise process, the model with smaller variances and the combination of random walk process with white noise process, the model with greater variances.The analysis of the spatial distribution of the noise variances of the processes allow demonstrate that the GPS sites with the highest accumulated noise variances, coincide with sites located in coastal zones and river basins and that their stochastic properties can be aliased by the occurrence of different physical signals typical of this type of zones, though we cannot prove that this is the case.
Bulletin of Geodetic Sciences, 24(4): 545-563, Oct-Dec, 2018 Appendix and attachments 1. Noise variances of selected stochastic models this case, we can compute a unit covariance matrix C by Bulletin of

Figure 2 :
Figure 2: Spatial distribution of noise values of GPS sites

Figure 3 :
Figure 3: Spatial distribution of noise values of GPS sites.Up component

Table 1 :
Distribution of noise models for the RBMC data

Table 2 :
Noise values and selected stochastic models of GPS sites * WN: White noise; FL: Flicker noise; RW: Random-walk; PF: Power-Law (fractional) noise; PL: Power-Law noise

Table 3 :
Sites with the highest accumulated noise values * WN: White noise; FL: Flicker noise; RW: Random-walk; PL: Power-Law noise

Table A :
Noise variances and selected stochastic models of GPS sites

Table A :
Noise variances and selected stochastic models of GPS sites

Table A :
Noise variances and selected stochastic models of GPS sites