POINT TO POINT: AN ALTERNATIVE METHOD FOR EXTRACTING “HOMOLOGOUS POINTS” IN BATHYMETRIC DATA COLLECTED WITH A MULTIBEAM SYSTEM

Owing to the efficiency of waterway transport, especially for commercial trade, the use of sea and river routes has grown in recent years, which highlights the importance of producing high-quality nautical charts. A nautical chart is the final product of a hydrographic survey, and its accuracy depends on the quality of the collected data, primarily the vertical (depth) component. Despite the theoretical and practical difficulty of obtaining homologous points in hydrographic surveys, even when check lines are run, bathymetric data must always be delivered with a statistically proven confidence level. This study therefore has two main objectives: i) to propose a method, called Point to Point (P2P), for obtaining “homologous points” in hydrographic surveys carried out with multibeam systems, without resorting to mathematical and/or statistical interpolation; ii) to quantify the magnitude of the difference between the statistical evaluation using check lines (CL) and the evaluation obtained by overlapping successive sounding lines (SL), applying the P2P method. The results showed that P2P is easy to apply, demands little computational effort, and is robust and consistent. In addition, it was possible to use successive regular lines to validate the hydrographic survey.


Introduction
The improvement in water transportation (waterway and shipping) efficiency contributes directly to the increased use of sea and river routes (IHO, 2005). Regions such as northern Brazil depend directly on maritime transport, whether for the movement of people or the transit of goods, which drives the growth of waterway use (VEN, 2019). Given the hydrographic network and the Brazilian coast, authors such as Oliva (2008) and Pompermayer et al. (2014) concluded that Brazil is highly suited to navigation: it has a coastline of about 9,000 kilometers, a jurisdictional area larger than 4.5 million km², and more than 40,000 kilometers of inland waterways.
Maritime transport is one of the most important modes for both industry and logistics in Brazil: 75% of Brazilian international trade is carried by this means. However, even though it is the most economical and cleanest mode, it remains underdeveloped in Brazil, except in certain regions of the territory that, lacking highways and railways in sufficient quantity and quality, depend directly on shipping (OLIVA, 2008; POMPERMAYER et al., 2014).
Thus, given the value of navigation to various sectors, mainly commerce, the importance of producing quality nautical charts stands out. The nautical chart is the final product of a hydrographic survey; its accuracy therefore depends on the quality of the data acquired, primarily the quality of the vertical component. Although the main interest of these surveys is navigation, the collected data serve several other purposes, such as sediment volume calculations, dredging of deposited material, pipeline installation, and port projects (IHO, 2005; DHN, 2017).
The main objective of a nautical chart is to represent the underwater bottom, exposing its physical characteristics, such as geological formations, and providing the information necessary for navigation. Nautical charts also report features that pose dangers to navigation (sand banks, submerged rocks, overturned hulls, etc.), anchorage areas, conspicuous altitudes and points, the coastline, islands, tidal elements, and aids to navigation (lighthouses, beacons, buoys, landmarks, alignment lights, radio beacons), in addition to other indications necessary for safe navigation (MIGUENS, 1996).
Currently, hydrographic surveying aimed at cartographic production and updating relies on multibeam echo sounders (beamformers) to obtain depths. Compared with single-beam echo sounders, these systems offer better resolution, greater accuracy, higher precision, and a much larger volume of data, which allows significant improvements in object detection and in the definition of the submerged relief (CRUZ et al., 2014; MALEIKA, 2015). The main problem is the statistical evaluation of the vertical quality of these data (FERREIRA et al., 2019c), which is the focus of this work.
Traditionally, the assessment of depths collected by single-beam echo sounders consists of crossing check lines (CL) perpendicularly with the sounding lines (SL), followed by statistical analysis of the discrepancies resulting from the comparison of the depths of the “homologous points”. In this work, “homologous points” are identical or corresponding spatial points, usually obtained at different times; that is, points that have, or should have, the same three-dimensional coordinates. Because of the large amount of data generated during a multibeam survey, however, theoretical and practical adaptations are needed to obtain the discrepancy samples.
One of the most common ways to generate discrepancy samples is to create digital depth models of the tracks surveyed by sounding lines and by check lines, followed by a pixel-by-pixel comparison between these models (SUSAN & WELLS, 2000; SOUZA & KRUEGER, 2009; EEG, 2010). However, the uncertainties arising from mathematical and/or statistical interpolation can affect the correct judgment of the vertical quality of the hydrographic survey (SOUZA & KRUEGER, 2009). Souza and Krueger (2009) present a study in which the sample uncertainty is estimated by comparing bathymetric models. Discrepancies, in the context of this work, are the differences between the altimetric coordinates of the “homologous points”.
In that work, a multibeam bathymetry system with an expected (theoretical) vertical uncertainty of ± 0.24 meters was used, while the smallest sample vertical uncertainty interval of the hydrographic survey, at the 95% confidence level, was ± 0.305 meters, suggesting the existence of unquantified uncertainty components in the evaluation process, possibly related to the interpolation. Another, more common, approach is to compare the depths from the check lines with the bathymetric surface generated from the sounding lines. In this case, the uncertainties arising from mathematical and/or statistical modeling are alleviated, since the interpolation is performed only on the regular lines (HYPACK, 2020; QPS, 2020; TELEDYNE CARIS, 2020).

Based on NORMAM-25 (DHN, 2017), hydrographic surveys performed with a multibeam system and used for the construction or updating of nautical charts and publications must adopt, as spacing between the sounding lines, half the swath width, generating 200% sea-bottom coverage. This overlap is recommended for the correct processing of the collected data and for the visualization of random effects that degrade the quality of a hydrographic survey; it also allows judgments about anomalous depths (spikes) (FERREIRA et al., 2019a). This information can likewise be used to assess survey quality, either by verifying the vertical and horizontal fit of the profiles generated from successive swaths or by generating discrepancy samples.
According to Ferreira et al. (2019a), verification of the vertical and horizontal fit has proven effective in assessing survey quality and is applied in the quality control of data collected by swath systems (multibeam echo sounders, bathymetric LiDAR, etc.). However, using these data to estimate sample uncertainties through sample discrepancies is not so common. In fact, the existing literature offers no evidence of the performance of this technique for the statistical evaluation of collected bathymetric data, considering the complexity of its practical application.
Thus, this study has two main objectives. The first is to propose a method, called Point to Point (P2P), for obtaining “homologous points” in hydrographic surveys carried out with multibeam sounding systems, without the need to resort to mathematical and/or statistical interpolation. Although it is unlikely that the same submerged feature will be resampled, the method allows homologous points to be checked and extracted from a survey performed with a multibeam system, with the aim of eliminating the need for check lines. The second is to quantify the magnitude of the difference between the statistical evaluation through check lines and the estimation of the sample uncertainty by overlapping successive swaths, applying the P2P method.

Dataset
The database used in this study consists of a hydrographic survey executed on May 18, 2017 in the port region of Rio de Janeiro, between Ilha das Enxadas and Ilha das Cobras. The data were collected with a multibeam (beamforming) system, model Sonic 2022 from R2 Sonic, integrated with an inertial system, model I2NS (Integrated Inertial Navigation System), produced by Applanix. Planning, execution, and data processing followed the standards recommended by S-44 (IHO, 2008) and NORMAM-25 (DHN, 2017), aiming at classification in Special Order and Category A, respectively.
In this research, three adjacent sounding lines and four check lines were selected. The swath lines were pre-processed using the Hysweep software (HYPACK, 2020); Figure 2 shows the processing steps. During the development of the tool, the vessel was assumed to behave stably, since the data had already been processed and compensated for attitude and heave effects, the vessel had undergone a careful DIMCON process, and the system was properly calibrated (patch test). Moreover, in theory, such problems would not prevent the algorithm from obtaining the discrepancies; they would instead appear as outliers in the statistical analysis.
According to NORMAM-25, a hydrographic survey classified in Category A must comply with the uncertainty requirements laid down by S-44 (IHO, 2008). For each survey order, S-44 stipulates maximum allowed values for the Total Vertical Uncertainty (TVU). Table 1 shows the maximum allowed TVU values, at the 95% confidence level, for each order. The coefficients a and b presented in Table 1 are used in Equation 1 to estimate, at the 95% confidence level, the maximum vertical uncertainty:

TVU(P) = ±√(a² + (b · P)²)    (1)

where P is the depth obtained for each point.
Based on Equation 1, each depth has a vertical uncertainty estimate. Therefore, the sample vertical uncertainty interval at the 95% confidence level, obtained from the use of “redundant information” (check lines), must be equal to or smaller than the estimated tolerances.
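For reference, Equation 1 can be evaluated directly. The sketch below uses, as an example, the S-44 Special Order coefficients (a = 0.25 m, b = 0.0075):

```python
import math

def tvu_max(depth, a=0.25, b=0.0075):
    """Maximum allowable Total Vertical Uncertainty at the 95% confidence
    level, per S-44 Equation 1: sqrt(a^2 + (b * depth)^2).
    Defaults are the Special Order coefficients (a = 0.25 m, b = 0.0075)."""
    return math.sqrt(a ** 2 + (b * depth) ** 2)

# At 20 m of depth, the Special Order tolerance is about +/- 0.29 m
print(round(tvu_max(20.0), 3))  # 0.292
```

Note how, in shallow water, the constant term a dominates the tolerance, while b scales it with depth.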
Surveys performed with a multibeam system produce a large amount of data, which requires an approach to obtaining discrepancy samples different from that employed with single-beam data. In this context, the P2P method proposed in this work offers advantages over the other existing methods for obtaining these samples. Notably, no mathematical and/or geostatistical interpolation is needed to obtain the data, which reduces the uncertainties in the analysis. Another benefit lies in the possibility of identifying and using homologous points.
When different bathymetric lines are surveyed at the same location, the probability of resampling and identifying the same submerged feature is low, owing to the equipment configuration and the collection and processing methods employed. Even with overlapping swaths, the spacing between beams and the interval between successive transmissions of the sound pulse make true redundancy minimal (CALDER & MAYER, 2003; HUGHES CLARKE, 2014).

Methodology
The P2P method, developed and implemented in the R software (R CORE TEAM, 2020), is a robust, innovative, and easy-to-apply algorithm whose methodological steps are illustrated in Figure 3. To start, the data collected in the bathymetric survey undergo pre-processing based on the SODA (Spatial Outliers Detection Algorithm) algorithm, in which the depth data are identified and filtered (or cleaned) (FERREIRA et al., 2019a; FERREIRA et al., 2019b). SODA, developed by Ferreira (2018), is a methodology that aids in detecting spikes in bathymetric point clouds, specifically data from multibeam echo sounders, interferometric sonars, and airborne laser bathymetry systems. Although its focus is bathymetric data, SODA can easily be modified to identify outliers in any set of georeferenced data (FERREIRA, 2018). Its theoretical development is based on classical and geostatistical theorems, which makes it supposedly robust and efficient. Figure 4 shows the diagram describing the steps of the SODA logic.

With the subsamples in place, SODA applies Method δ for outlier detection. This method was inspired, in part, by the technique proposed by Lu et al. (2010), which applies spike-detection thresholds based on the global and local sample variances, where the local variance refers to the subsample variance. If the global variance is greater than the local variance, the cutoff value is set equal to the global variance; otherwise, the threshold is defined as 0.5 · (δ²Global + δ²Local), where δ² denotes the variance. Any observation whose residual exceeds the cutoff value in absolute terms is considered a spike (FERREIRA, 2018).

The next step is data entry into the software, where the sounding-line and check-line swaths are represented as point structures containing the coordinates and the respective depths.
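The Method δ cutoff rule can be sketched as follows (a minimal illustration; the function names are ours, not from the original R implementation):

```python
def spike_cutoff(var_global, var_local):
    # If the global variance exceeds the local (subsample) variance,
    # the cutoff is the global variance; otherwise it is the mean of the two,
    # i.e. 0.5 * (var_global + var_local).
    if var_global > var_local:
        return var_global
    return 0.5 * (var_global + var_local)

def is_spike(residual, var_global, var_local):
    # An observation is flagged as a spike when its residual,
    # in absolute value, exceeds the cutoff.
    return abs(residual) > spike_cutoff(var_global, var_local)
```

For example, with a global variance of 4.0 and a local variance of 2.0, the cutoff is 4.0, so a residual of -5.0 is flagged while a residual of 3.0 is not.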
The algorithm can import text data and convert it into the native format of the implemented code; for this, the user must specify the cartographic projection system used. The input files, regardless of format, must contain positional coordinates (local, projected, or geodetic) as well as reduced depths (corrected for tidal effects). At this step, a priori, only two files are read at a time, the sounding-line file and the check-line file: the regular sounding swaths are provided in a single data set and the check lines are analyzed separately. When the user provides two sounding lines, one of them must be adopted as the verification line.

After reading the data from the sounding lines, the algorithm identifies the intersection area between them, with or without the inclusion of a Buffer. The intersection area is given by the overlap between a sounding line and a check line in a transverse position (Figure 5a), or between successive swath lines in a longitudinal position (Figure 5b).
The Buffer is defined as an extrapolation of the limits of the intersection area between the swaths, enlarging the search area for points located at the extremes, which can help identify the nearest neighbor within the pre-defined limit distance (Figure 6b). The default Buffer value is zero, but it can be set at the user's discretion.
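As an illustration of the intersection-plus-Buffer step, the sketch below approximates each swath by its axis-aligned bounding box. The real footprints are irregular polygons, so this is a simplification of ours, not the paper's implementation:

```python
def bbox(points):
    # points: iterable of (x, y, z) tuples
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return min(xs), min(ys), max(xs), max(ys)

def intersection_area(pts_a, pts_b, buffer=0.0):
    # Overlap of the two bounding boxes, enlarged by the Buffer value.
    ax0, ay0, ax1, ay1 = bbox(pts_a)
    bx0, by0, bx1, by1 = bbox(pts_b)
    x0 = max(ax0, bx0) - buffer
    y0 = max(ay0, by0) - buffer
    x1 = min(ax1, bx1) + buffer
    y1 = min(ay1, by1) + buffer
    if x0 > x1 or y0 > y1:
        return None  # no overlap between the swaths
    return x0, y0, x1, y1

def points_inside(points, area):
    # Extract the points that fall within the (buffered) intersection.
    x0, y0, x1, y1 = area
    return [p for p in points if x0 <= p[0] <= x1 and y0 <= p[1] <= y1]
```

With a Buffer of zero the search is limited to the strict overlap; a positive value widens it, helping points near the swath edges find a neighbor.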
After identifying the intersection area and extracting the points present in the swaths, the algorithm uses a limit distance to search for the nearest neighbor. The Limit Distance consists of applying a search radius around each point located on the check line (Figure 6a), so that a probable homologous point on the sounding line must lie within this distance. If the limit distance is set to zero, points are considered homologous when their planimetric coordinates differ by less than 1×10⁻⁶ meters. In the P2P method, the limit distance can be defined based on four criteria:

a) User-defined value: based on knowledge of the database density, the user defines the limit distance;

b) Square root of the diagonal of the intersection between swaths: the algorithm identifies the intersection area between the two swaths and uses the diagonal of this area as the limit distance;

c) Value equivalent to 25% of the shortest distances: for each point on the sounding line, the closest point on the check line and its distance are identified; the first quartile of the resulting set of distances is adopted as the limit distance;

d) Based on the footprint: the algorithm calculates the footprint size from the average depth, beamwidth, and beam angle in both directions, across track and along track, and adopts the smallest footprint as the limit distance. The average depth is determined from the points within the intersection area between the sounding lines. The beam angle (β) is selected at the user's discretion, and it is recommended to use the maximum firing angle adopted in the survey. The beamwidth (θ) is provided by the echo sounder catalog. Figure 7 illustrates the footprint-based calculations.
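Criterion (d) can be sketched as below. The flat-seafloor approximations used here (slant-range widening of the footprint with the beam angle β) are a common textbook formulation and an assumption on our part, not necessarily the exact formulas of the P2P code:

```python
import math

def footprint_limit_distance(mean_depth, bw_along_deg, bw_across_deg,
                             beam_angle_deg):
    # Flat-seafloor approximation: a beam steered at angle beta with
    # beamwidth theta covers roughly depth * theta / cos(beta) along
    # track and depth * theta / cos(beta)**2 across track.
    beta = math.radians(beam_angle_deg)
    along = mean_depth * math.radians(bw_along_deg) / math.cos(beta)
    across = mean_depth * math.radians(bw_across_deg) / math.cos(beta) ** 2
    # The smallest footprint is adopted as the limit distance.
    return min(along, across)
```

For a 1° × 1° beam at 10 m mean depth and a 60° steering angle, this gives a limit distance of about 0.35 m, the along-track footprint.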

Figure 7: Footprint along-track and across-track illustration, adapted from Galway (2000)
After identifying the homologous points between the check lines and the sounding lines, the P2P method calculates the discrepancies between them according to Equation 2.

dp_i = Z_i^CL − Z_i^SL    (2)

where Z_i^CL and Z_i^SL represent the depths collected on the check line and on the regular sounding swath, respectively.
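The nearest-neighbor search and Equation 2 can be combined in a brute-force sketch (a spatial index would be preferable at real multibeam point densities; this is our illustration, not the published R code):

```python
import math

def p2p_discrepancies(check_pts, sounding_pts, limit_distance=0.0):
    """check_pts, sounding_pts: lists of (x, y, z) tuples of reduced depths.
    Returns dp_i = Z_i(CL) - Z_i(SL) for every matched homologous pair."""
    # With a null limit distance, points are taken as homologous only when
    # their planimetric coordinates differ by less than 1e-6 m (as in P2P).
    tol = 1e-6 if limit_distance == 0 else limit_distance
    discrepancies = []
    for xc, yc, zc in check_pts:
        best_z, best_d = None, float("inf")
        for xs, ys, zs in sounding_pts:
            d = math.hypot(xc - xs, yc - ys)
            if d < best_d:
                best_z, best_d = zs, d
        if best_z is not None and best_d <= tol:
            discrepancies.append(zc - best_z)
    return discrepancies
```

A check-line point with no sounding-line neighbor inside the limit distance simply produces no discrepancy, which is what makes the method interpolation-free.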
With the discrepancy files generated, the necessary analyses of the vertical quality of the hydrographic survey can be carried out. The discrepancies are analyzed statistically to estimate the 95% confidence interval of the vertical sample uncertainty of the bathymetric survey. In this step, MAIB (Methodology for the Evaluation of the Uncertainty of Bathymetric data), developed by Ferreira (2018), was used. In all phases that demanded an investigation of outliers, SODA was applied (FERREIRA et al., 2019a; FERREIRA et al., 2019b).
It is noteworthy that, in the data-input phase, instead of providing only two files at a time (the sounding-line file and the check-line file), the user can create a file for batch entry, which allows all survey lines to be processed with a single data input, by iterating with the script, while also varying the processing parameters. At the end, three products are generated: the discrepancy file in text and shapefile formats, a report containing all the parameters adopted in the processing, and the full statistical analysis of the sample uncertainty.
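For the statistical step, the RMSE is standard; the exact formulation of Φ Robust is given in Ferreira et al. (2019c) and is not reproduced here, so the sketch below substitutes a classical outlier-resistant spread estimate (the scaled MAD) purely as an illustrative stand-in:

```python
import math
import statistics

def rmse(discrepancies):
    # Root Mean Square Error of the dp_i sample.
    return math.sqrt(sum(d * d for d in discrepancies) / len(discrepancies))

def robust_scale(discrepancies):
    # Illustrative stand-in for Phi Robust: the median absolute deviation
    # scaled by 1.4826 for consistency with the standard deviation under
    # normality. This is NOT the estimator of Ferreira et al. (2019c).
    med = statistics.median(discrepancies)
    mad = statistics.median([abs(d - med) for d in discrepancies])
    return 1.4826 * mad
```

The contrast between the two is the point: a single gross outlier inflates the RMSE but barely moves the MAD-based estimate, which mirrors the paper's observation that robust estimators change little after SODA filtering.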

Results and Discussion
At first, the survey lines were subjected to pre-processing, as shown in Figure 2. In the filtering phase, the SODA algorithm (AEDO, Algoritmo Espacial de Detecção de Outliers, in Portuguese) was used. Subsequently, the regular survey swaths were combined into a single point file (XYZ), simply called RSL, which was used in the later analyses. The search radius was set to 3 times the minimum distance value. The following table summarizes the general information.
More information on the SODA algorithm, and on one of the methods proposed in it, can be found in Ferreira et al. (2019a) and Ferreira et al. (2019b). Here, C indicates the type of submerged morphology; in this case, C = 3 indicates that the study area is flat and has low variability.
To analyze the performance of the proposed methodology, combinations of the sounding lines and the check lines obtained throughout the study area were applied. Owing to prior knowledge of the database, the Buffer and Limit Distance parameters were configured with unitary and null values, respectively. For each of the nine combinations analyzed, a discrepancy file was obtained.
Although the same database as Ferreira (2018) was used, some metrics and improvements were added to the script to improve performance, yielding an 80% reduction in processing time. The discrepancy files were then submitted to the SODA algorithm; for this, the methodology was modified to investigate the discrepancy variable instead of the reduced depth. In the presence of spatial autocorrelation, geostatistical analyses were carried out to generate Standardized Residuals (SRs), and in these cases the search for anomalous discrepancies was performed on the SRs, as provided for in the SODA algorithm. In all cases, the search radius was adopted as 3 times the minimum distance value. The results of this stage are summarized in Table 3. In general terms, the discrepancy files showed a low occurrence of outliers; the dp7 sample had the highest proportion, about 32,400 anomalous discrepancies according to the SODA criteria.

After removing the discrepancies not representative of the area, the statistical examination of the vertical quality of the hydrographic survey proceeded using the MAIB methodology, as discussed previously. Table 4 summarizes the results of the statistical analysis of the vertical sample quality, based on the tolerances defined by S-44 for the survey in question. Also shown are the sample vertical uncertainty values computed with the RMSE (Root Mean Square Error) estimator and with Φ Robust (Robust Uncertainty), a point estimator for the vertical uncertainty of hydrographic surveys proposed and validated in Ferreira et al. (2019c). All statistics were applied to the data with outliers, as is common practice. Evaluating samples dp1 to dp4 in Table 4, the survey would clearly be classified in the Special Order, since all the discrepancies generated are smaller than the stipulated tolerance.
On average, an RMSE of 0.043 meters and a Φ Robust of 0.038 meters were obtained. Samples dp5 and dp6 were established for the evaluation and, later, validation of the sample uncertainty estimates through the overlapping of sounding lines.
In short, the results achieved are significant and quite promising. The dp1, dp2, dp3, dp4, and dp6 samples presented, on average, 100% of the discrepancies below the tolerance stipulated in S-44, while the dp5, dp7, dp8, and dp9 samples reached 99.96%. Examining the samples in isolation leads to the same conclusions about their classification. In terms of point estimates of the sample uncertainty, the RMSE values produced by samples dp5 and dp6 were both 0.045 meters, while the Φ Robust values were 0.036 and 0.049 meters, respectively. Notably, sample dp6 had a Φ Robust estimate greater than its RMSE; although uncommon, this is possible owing mainly to the nature of the frequency distribution of the data set.
For samples dp7 to dp9, mathematically and statistically identical results were obtained, leading to the conclusion that, for the analyzed database, the definition of the limit distance does not influence the final result. It does, however, influence factors such as the sample size and the processing time, when the null value for the limit distance is adopted. Finally, the magnitude of the difference between the estimated uncertainties is quite subtle, attesting to the quality both of the data collected and of the P2P method. Although the point estimates of the sample uncertainty lie within the 95% tolerance for the Special Order, the reliability of the estimates cannot be guaranteed, since confidence intervals were not constructed.
A more careful analysis can be performed with the MAIB methodology, described in detail in Ferreira (2018); Table 5 summarizes the results of applying it. As Tables 4 and 5 show, the Φ Robust estimator produced approximately the same point results with the outlier-contaminated database and with the database treated by SODA; only the dp2 sample showed a difference, of 0.001 meters, a non-significant value given the survey characteristics. With the RMSE estimator, on the other hand, larger differences were observed, especially in the cases where the Central Limit Theorem (CLT) was used, as suggested by Santos (2015) and Ferreira (2018). Differences in the mean, minimum, and maximum of around 0.003, -0.007, and 0.012 meters, respectively, were obtained.
Samples dp1, dp2, dp7, dp8, and dp9 yielded RMSE values (Table 5) mathematically higher than those computed by the traditional method (Table 4), even after the elimination of outliers. This is due to the essence of the applied methodology, that is, the CLT: its application slightly overestimates the point estimate of the sample uncertainty. In contrast, the confidence interval estimates are quite consistent and reliable. However, given the analytical and computational complexity of applying the CLT, especially to large data sets, combined with the Φ Robust values obtained (Tables 4 and 5), the robust approach is recommended in these cases.
For the dp3 and dp4 samples, the results obtained by the proposed method were equivalent to those computed by the traditional method, except for the confidence intervals, which are extremely important in a statistical context but are generated only by the proposed methodology. The fact that some RMSE values in Tables 4 and 5 are equal and others approximate could indicate failures in the outlier detection step; however, the Φ Robust values make clear that these similarities stem from the high quality of the data combined with the robustness of the P2P method. As for classification under the standards, the hydrographic survey, analyzed with the methodologies proposed and developed in this work, fits into Special Order/Category A, the same result achieved through the traditional analysis.
Finally, with regard to the use of successive sounding lines to estimate the vertical uncertainty, the results obtained were satisfactory and statistically identical to those obtained with the check lines, leading to the conclusion that this alternative is viable, both economically and for defining the vertical uncertainty. As in the traditional analysis, samples dp5 to dp7 show that the choice of the limit distance is indifferent for this database.

Conclusion
This work proposed a new method, called “Point to Point” (P2P), for extracting homologous points from hydrographic surveys carried out with swath sounding systems, without the need to resort to mathematical and/or statistical interpolation. The results showed a high level of accuracy and consistency in the application of the method. Other advantages are its simplicity of application and low computational effort, which make it an excellent option for generating sample discrepancies.
Another important result concerns the validation of the analysis of the vertical quality of the hydrographic survey using discrepancies arising from the overlap of successive regular lines. It was thus possible to conclude that the production of check lines is, in general, dispensable. This finding held both in the traditional analysis and in the examination performed with the uncertainty interval assessment methodology applied in this study. The results therefore allow us to conclude that P2P is a robust method, recommended for the analysis of the vertical quality of hydrographic surveys, whether using check lines or successive regular lines.
As these are novel proposals, improvements are still needed. For future work, it is suggested that the algorithms be refined to further reduce processing time. Further studies on the use of overlapping successive regular lines to assess the vertical quality of surveys are also recommended.