COMPARISON BETWEEN THE DOUBLE BUFFER METHOD AND THE EQUIVALENT RECTANGLE METHOD FOR THE QUANTIFICATION OF DISCREPANCIES BETWEEN LINEAR FEATURES

In Brazil, there is currently no standard procedure for assessing the positional accuracy of non-point features (lines and polygons). This work compares the results of two methodologies that determine the average value of the discrepancies between linear features. The first, the Equivalent Rectangle Method, determines the discrepancy by considering an equivalent rectangle for the polygon formed by the two homologous lines. The second, the Double Buffer Method, applies a buffer to both lines and obtains the average discrepancy from the ratio of the areas of the generated polygons. The methods were compared in two stages. First, an experiment was performed with features of known measurements, in which the displacement of the homologous lines was controlled in azimuth and distance. This stage showed that the shape of the feature and the direction of the displacement affect the results of both methods when compared with the traditional procedure of measuring discrepancies between homologous points. In the second stage, we evaluated the OpenStreetMap vector data (roads class) against a more accurate vector dataset produced for the Mapping of the State of Bahia. For the 1:25,000, 1:50,000, 1:100,000 and 1:250,000 scales, the PEC-PCD classes obtained were, respectively, "C", "B", "A" and "A" with the Equivalent Rectangle Method and "R", "C", "B" and "A" with the Double Buffer Method, where "R" means that the minimum PEC-PCD classification was not achieved.


Introduction
In recent years, cartography has received voluntary and non-commercial contributions from several institutions for global mapping, a phenomenon known as Volunteered Geographic Information (VGI) (Goodchild, 2007). In this context, OpenStreetMap (OSM) stands out (Sehra et al., 2014). At the same time, discussions about the quality of these data have increased, considering the following quality elements: completeness, logical consistency, positional accuracy, temporal accuracy and thematic accuracy (Ming et al., 2013).
For paper topographic charts (known as analogical products) produced officially in Brazil for systematic mapping, the positional accuracy follows the Cartographic Accuracy Standard (Padrão de Exatidão Cartográfica - PEC), whose parameter values were established in Decree No. 89.817 of June 20, 1984 (Brazil, 1984) and are reproduced in Table 1. With the recent evolution of digital cartography and geotechnology resources, new products require other quality parameters, including a refinement of the positional accuracy assessment (Ariza-López et al., 2007) and computational programs to perform it (Nero et al., 2017).
The Brazilian Geographic Service Bureau (Diretoria de Serviço Geográfico - DSG, 2016) established the quality assessment parameters described in the Technical Specifications for Geospatial Set Products (ET-PCDG) and Geospatial Data Quality Control (ET-CQDG).

Positional Accuracy of Linear Features
Even in the most recent standards, such as ET-CQDG, the quality assessment procedures for positional accuracy leave some uncertainties about how to carry out the evaluation for line and polygon feature types (Santos et al., 2015). The adopted procedure remains the determination of quantitative percentages in relation to the maximum permissible error, as well as the comparison of the mean square error with the standard error between the observations related to a point feature obtained in the field and its cartographic representation (DSG, 2016a).
Bulletin of Geodetic Sciences, 24(3): 300-317, Jul-Sept, 2018
According to Ferreira and Cintra (1999), the choice of the points to be evaluated is a subjective task that, when conducted by different professionals, can produce disparate results and distort the assessment. One of the reasons is the difficulty in determining many homologous points (Figure 1). Even if points are taken at regular intervals, the results may differ depending on the number of intervals and the initial marking point, as illustrated in Figure 2. The calculation of discrepancies through distance measurements between homologous points is the traditional method and is still widely used. For example, Helbich et al. (2012) use road junctions (crossings between highways of a German city) to perform the statistical evaluation through a two-dimensional regression model, in addition to studying the pattern of spatial association through the G distribution function (Getis and Ord, 1992).
Similarly, Brovelli et al. (2016) evaluated the positional accuracy of OSM buildings in the city of Milan (Italy) through a semi-automatic method of detecting homologous points in relation to a set of reference data.
To automate and standardize the evaluation, other methods of assessing positional accuracy have been developed. Santos et al. (2015) presented the following methods for evaluating planimetric positional accuracy: Epsilon Band (Area Method), Single Buffer, Double Buffer, Hausdorff Distance and Vertex Influence. Among the studied methods, they concluded that the Double Buffer would be the most recommended, because it makes it possible to verify outliers and trends in the data and reaches results closer to the traditional method, both in the classification of the positional accuracy of the features and in the descriptive statistics of errors.
In Brazil, Fonseca Neto et al. (2017) used the Double Buffer Method to determine the planimetric positional accuracy of linear features when evaluating the quality of orthoimages generated from a sensor embedded in a drone platform. Santos et al. (2016) used the same method to evaluate the planimetric positional accuracy of digital surface models (DSM) from linear features extracted from the models (ridges and drainage lines).
Cruz and Santos (2016) also applied the Double Buffer Method to evaluate the OSM road system of the central Brazilian region of Viçosa-MG, obtaining for that situation PEC-PCD class C for the 1:10,000 scale.
The measurement of the mean discrepancy of each linear feature in relation to its homologous reference feature (which can be measured in the field or obtained from a more accurate dataset) allows the determination of the root mean square error of the sample and of the percentage of absolute discrepancies lower than a specific tolerance and, thus, the PEC-PCD classification.
In this context, this work compares the mean discrepancies and the PEC-PCD classification obtained by the Equivalent Rectangle Method (ERM) with the results calculated using the Double Buffer Method (DBM). The research was performed in two stages:
1) Experimentation with features of distinct shapes, with controlled azimuth and distance for the displacements between pairs of homologous lines.
2) A case study with the roads from the OSM vector data in relation to their equivalents in the CDGV from the mapping of the Brazilian State of Bahia.

Equivalent Rectangle Method (ERM)
This method is an alternative to the traditional method of measuring discrepancies from homologous points. It proposes to perform this task by measuring the area between two representations of the same feature, replacing the resulting geometry with an equivalent rectangle with the same area and the same perimeter (Ferreira da Silva and Cintra, 1999).
Figure 3 shows the representations A and B for a Feature F and its respective equivalent rectangle of x1 and x2 sides, whose area (S) and perimeter (2p) are obtained by Equations 1 and 2, respectively.
Figure 3: Equivalent Rectangle. Source: Ferreira da Silva and Cintra (1999)

$$S = x_1 \cdot x_2 \tag{1}$$

$$2p = 2\,(x_1 + x_2) \tag{2}$$

From (1) and (2), the quadratic equation (Equation 3) is obtained:

$$x_1^2 - p\,x_1 + S = 0 \tag{3}$$

The discrepancy value x1 (smaller side of the rectangle) can be calculated from Equation 3 through Bhaskara's formula:

$$x_1 = \frac{p - \sqrt{p^2 - 4S}}{2} \tag{4}$$

According to Ferreira and Cintra (1999), because the rectangle is adopted as a geometric model, lines with many salient angles tend to generate underestimated values of x1; that is, the difference from the real value is proportional to the number and size of salient angles in the polygon, a configuration that rarely occurs in practice.
Some precautions should be taken when applying the method:
▪ If the feature's representations cross each other, the lines must be broken at the points of intersection and the original polygon must be split into smaller polygons;
▪ Both representations must be in the same Coordinate Reference System (CRS).
The main advantages of this method are the great speed in obtaining the results, the reduction of bias and the uniformity of the evaluation.
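The calculation described in this section can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' QGIS tool: it assumes that the area and perimeter relations are S = x1·x2 and 2p = 2(x1 + x2), that both representations are given as lists of (x, y) vertices in the same projected CRS, and that they do not cross each other. The function names are hypothetical.

```python
import math


def polygon_area_perimeter(ring):
    """Return (area, perimeter) of a closed ring given as (x, y) vertices."""
    twice_area = 0.0
    perimeter = 0.0
    n = len(ring)
    for i in range(n):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % n]  # wrap around to close the ring
        twice_area += x0 * y1 - x1 * y0  # shoelace term
        perimeter += math.hypot(x1 - x0, y1 - y0)
    return abs(twice_area) / 2.0, perimeter


def erm_discrepancy(line_a, line_b):
    """Smaller side x1 of the rectangle with the same area and perimeter
    as the polygon enclosed by two non-crossing representations."""
    # Walk along A, then back along B, to enclose the area between them.
    ring = list(line_a) + list(reversed(line_b))
    area, perimeter = polygon_area_perimeter(ring)
    p = perimeter / 2.0  # semi-perimeter
    discriminant = p * p - 4.0 * area
    if discriminant < 0:
        # Can happen for very compact polygons: no rectangle matches both.
        raise ValueError("no equivalent rectangle exists for this polygon")
    return (p - math.sqrt(discriminant)) / 2.0


# Two parallel 500 m lines separated by 20 m.
line_a = [(0.0, 0.0), (0.0, 500.0)]
line_b = [(20.0, 0.0), (20.0, 500.0)]
print(erm_discrepancy(line_a, line_b))
```

For two parallel lines the enclosed polygon is already a rectangle, so the sketch returns the separation itself. Crossing lines would first have to be split at their intersections, as required by the precautions listed above.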

Double Buffer Method (DBM)
The DBM has been suggested by Santos et al. (2016), Fonseca Neto et al. (2017) and Cruz and Santos (2016) for the evaluation of the positional accuracy of linear features due to the best results presented in studies developed by Santos et al. (2015).
This method was originally proposed by Tveite and Langaas (1999) under the name Buffer Overlay Statistics. It is based on the application of an uncertainty value to the two lines, the Reference Line (RL) and the Test Line (TL); that is, a buffer is created for each line. Then, the difference of the polygon generated by the RL in relation to the polygon generated by the TL is determined. The area of this difference allows the calculation of the mean discrepancy between the features, which is necessary for the evaluation of the positional accuracy.
The value of the mean discrepancy dm is obtained by Equation 5, where x is the buffer distance, ΣAdiff is the sum of the areas of the difference between the polygons, i.e., the areas of the RL buffer that do not intersect the TL buffer, and ABTL is the area of the TL's buffer.

$$d_m = \pi \cdot x \cdot \frac{\sum A_{diff}}{AB_{TL}} \tag{5}$$

Figure 4 illustrates a practical example of the method for a buffer distance x, highlighting the hatched area as the difference of the RL buffer over the TL buffer.
The value x of the buffer used by Santos et al. (2015) corresponds to the EM of the PEC-PCD according to the scale and class to be evaluated.
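As a minimal sketch of the calculation, assuming that Equation 5 has the Buffer Overlay Statistics form dm = π · x · ΣAdiff / ABTL and that the buffer areas have already been measured in a GIS, the mean discrepancy reduces to one line of arithmetic. The function name and the sample areas are hypothetical.

```python
import math


def dbm_mean_discrepancy(x, diff_areas, tl_buffer_area):
    """Mean discrepancy from the buffer distance x, the areas of the RL
    buffer lying outside the TL buffer, and the TL buffer area."""
    if tl_buffer_area <= 0:
        raise ValueError("the TL buffer area must be positive")
    return math.pi * x * sum(diff_areas) / tl_buffer_area


# Limiting cases (areas in square meters, chosen only for illustration):
# identical buffers leave no difference area, so the discrepancy is zero.
print(dbm_mean_discrepancy(10.0, [0.0], 5000.0))
# Disjoint buffers of equal area give pi * x, whatever the true offset is.
print(dbm_mean_discrepancy(10.0, [5000.0], 5000.0))
```

The second case shows why the choice of x matters: once the two buffers stop overlapping, the result depends only on the buffer distance.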

Controlled Experiment
The objective of this procedure is to investigate the results of the ERM and DBM applied to features with different shapes and different types of displacements, given a certain distance, for homologous pairs of lines.
In this experiment, three features of different configurations were built using the QGIS Advanced Digitizing Tools and the CADDigitize plugin. These features and their respective measurements are represented in Figure 5, where the first is a straight line, the second a line with a 90° inflection at the midpoint and the third a semicircle.
New identical lines were also created, with their position shifted along a given azimuth. In all situations a translation distance of 20 meters was used; that is, under the traditional method, the discrepancy between all homologous points is exactly 20 meters, the distance of the translation. To obtain different configurations of the pairs of homologous lines while eliminating some symmetrical relative positions, the azimuths shown in Figure 6 were considered, where the test lines are dashed in red and the reference lines are in black. For each of the nine situations presented, the ERM and DBM were applied. For the ERM, when the lines intersect each other, they were segmented at the points of intersection, creating two polygons, and the discrepancy was then calculated from the sums of the areas and semi-perimeters of those polygons.
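The controlled displacement can be reproduced with a short sketch. It assumes planar (projected) coordinates and an azimuth measured clockwise from grid north, which is the usual surveying convention but is an assumption here. The function name and the sample azimuths are illustrative and are not necessarily those of Figure 6.

```python
import math


def translate_line(line, azimuth_deg, distance):
    """Translate every vertex of a line by `distance` along `azimuth_deg`
    (degrees, clockwise from north, in a projected CRS)."""
    azimuth = math.radians(azimuth_deg)
    dx = distance * math.sin(azimuth)  # easting component
    dy = distance * math.cos(azimuth)  # northing component
    return [(x + dx, y + dy) for x, y in line]


reference = [(0.0, 0.0), (0.0, 500.0)]  # straight 500 m line running north
for azimuth_deg in (0.0, 45.0, 90.0):
    test = translate_line(reference, azimuth_deg, 20.0)
    # Every homologous vertex pair is exactly 20 m apart by construction.
    print(azimuth_deg, [(round(x, 3), round(y, 3)) for x, y in test])
```

Because every vertex receives the same shift, the homologous-point discrepancy is 20 meters for any azimuth, which is the reference value against which the ERM and DBM results are compared.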

Case Study
The area of study corresponds to Map Index (MI) 1721-4-NO with nomenclature SC-23-Z-B-V-4-NO referring to the Brazilian Systematic Mapping's frame for the 1:25,000 scale.
The evaluated data comprise the linear features of the Roads class of the OSM (Ramm, 2017), corresponding to all types of highways, paths and accesses. These data can be obtained free of charge through the Geofabrik server (2016). In this work, the OSM highways were called Test Lines (TL).
The data adopted as reference correspond to the classes Trecho Rodoviário and Arruamento, belonging to the Transportation Systems category of ET-EDGV 2.1.3 (DSG, 2010). These data are available in the Geographic Database of the Brazilian Army (Banco de Dados Geográfico do Exército - BDGEx) through the following website: http://www.geoportal.eb.mil.br/mediador/. In this work, these two classes were combined into a single class called Reference Lines (RL).
The reference features were obtained by heads-up vectorization on an orthoimage whose planimetric positional accuracy was evaluated for the Bahia Mapping Project (Penha et al., 2012). The acquisition scale was 1:2,000 on an orthoimage with a spatial resolution of 60 cm.
The work area, as well as the TL and RL, is shown in Figure 7. The mapped region covers the city of Xique-xique in the state of Bahia. Before applying the QGIS tools for the ERM and DBM, it is necessary to perform the geoprocessing steps described in the workflow (Figure 8).
The TL and RL were brought to the same Coordinate Reference System (CRS), SIRGAS 2000/UTM 23S, so that all measurements are in metric units, making the PEC-PCD classification possible. The tools "merge lines with equal direction" and "cut lines in intersections" allow the automatic identification of a greater number of homologous features. In addition, the ERM requires intersecting lines to be segmented at the intersections. Figure 9a shows an example of the application of the "merge lines with equal direction" tool, and Figure 9b illustrates the result of the "cut lines in intersections" tool. After pre-processing, the data are ready to be evaluated. The evaluation was performed using the following QGIS tools: "MBD - Método do Buffer Duplo" and "MRE - Método dos Retângulos Equivalentes".
The identification of homologous features is performed automatically according to the following criterion: "Two lines are homologous when one is completely inside the relationship buffer of the other and vice versa". If more than one line lies within the relationship buffer of another, the nearest line is considered.
It is worth noting that this device was implemented to identify homologous lines automatically and is not suggested by any of the studied methods. However, it enables the systematic evaluation of several homologous pairs in a set of lines.
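The matching criterion can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation of the QGIS tools: containment in the relationship buffer is approximated by checking only the vertices of each line (exact when the other line is a single straight segment, otherwise the lines would need to be densified first), and "nearest" is taken to mean the candidate with the smallest maximum vertex distance, since the text does not define the distance used. All function names are hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    squared_length = dx * dx + dy * dy
    if squared_length == 0.0:
        return math.hypot(px - ax, py - ay)
    # Projection parameter clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / squared_length
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def max_vertex_distance(line, other):
    """Largest distance from any vertex of `line` to the polyline `other`."""
    return max(
        min(point_segment_distance(p, a, b) for a, b in zip(other, other[1:]))
        for p in line
    )


def find_homologous(test_line, reference_lines, radius):
    """Index of the nearest reference line that is mutually inside the
    relationship buffer of `test_line`, or None if there is no candidate."""
    best_index, best_score = None, None
    for index, reference in enumerate(reference_lines):
        forward = max_vertex_distance(test_line, reference)
        backward = max_vertex_distance(reference, test_line)
        if forward <= radius and backward <= radius:  # "and vice versa"
            score = max(forward, backward)
            if best_score is None or score < best_score:
                best_index, best_score = index, score
    return best_index


test_line = [(0.0, 0.0), (0.0, 100.0)]
references = [
    [(25.0, 0.0), (25.0, 100.0)],
    [(10.0, 0.0), (10.0, 100.0)],
    [(50.0, 0.0), (50.0, 100.0)],
]
print(find_homologous(test_line, references, radius=30.0))
```

In this example two reference lines satisfy the mutual-containment test with a 30-meter relationship buffer, and the closer one is selected.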
To classify the PEC-PCD of linear features, the discrepancies were weighted by the respective feature lengths, by analogy with what ET-CQDG recommends for pairs of homologous points. Two criteria were followed.
1st criterion: A sample fits into a cartographic class (PEC-PCD) when the sum of the lengths of the features with discrepancies lower than the EM is greater than 90% of the total length. Table 4 presents an example of evaluation applying this criterion. In this case, considering the 1:25,000 scale and class B, the EM is 12.5 meters (Table 2). The sum of the lengths where the mean discrepancy is less than the EM corresponds to 80.0 meters, or 80% of the total length. Therefore, this sample is not classified in class B for the 1:25,000 scale.
The length value may refer to the length of the RL or TL, as well as the sum of both.In this work, the length of the RL was used.
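The first criterion can be expressed as a short check. The sketch below assumes that each feature is represented by a (mean discrepancy, length) pair. The three sample values are hypothetical, chosen only so that 80% of the length falls below the EM as in the example discussed above; they are not the contents of Table 4.

```python
def length_share_below_em(sample, em):
    """Fraction of the total length whose mean discrepancy is below EM.
    `sample` is a list of (mean_discrepancy, length) pairs."""
    total_length = sum(length for _, length in sample)
    if total_length <= 0:
        raise ValueError("the sample must have a positive total length")
    length_below = sum(length for d, length in sample if abs(d) < em)
    return length_below / total_length


def passes_first_criterion(sample, em, threshold=0.90):
    """True when more than 90% of the length has discrepancies below EM."""
    return length_share_below_em(sample, em) > threshold


# Hypothetical sample: (mean discrepancy in m, RL length in m).
sample = [(5.0, 50.0), (10.0, 30.0), (15.0, 20.0)]
em_class_b_25k = 12.5  # EM for class B at 1:25,000, as stated in the text
print(length_share_below_em(sample, em_class_b_25k))
print(passes_first_criterion(sample, em_class_b_25k))
```

With these values only 80 of the 100 meters fall below the EM, so the sample does not reach class B at the 1:25,000 scale.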
2nd criterion: The value of the Weighted Root Mean Square Error (RMSEW) is compared with the EP of the PEC-PCD table. If the value is lower, the sample belongs to the evaluated class. If it is higher, the comparison proceeds through the PEC-PCD table until a class is found for which the RMSEW is lower; if none is found, the sample is considered "not compliant" with any of the classes and is assigned to class "R".
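A possible sketch of the second criterion is given below. It assumes that the RMSEW is the square root of the length-weighted mean of the squared mean discrepancies, which is the usual definition of a weighted RMSE but is an assumption here. The EP values in the example are also assumed (taken as 0.17, 0.30, 0.50 and 0.60 mm at map scale, converted to 1:25,000) and should be replaced by the values of Table 2. The sample is hypothetical.

```python
import math


def weighted_rmse(sample):
    """Length-weighted RMSE of (mean_discrepancy, length) pairs."""
    total_length = sum(length for _, length in sample)
    if total_length <= 0:
        raise ValueError("the sample must have a positive total length")
    weighted_squares = sum(d * d * length for d, length in sample)
    return math.sqrt(weighted_squares / total_length)


def classify_by_rmse(sample, ep_by_class):
    """Walk the classes from strictest to loosest and return the first one
    whose EP exceeds the weighted RMSE, or "R" when none does."""
    rmse_w = weighted_rmse(sample)
    for class_name, ep in ep_by_class:
        if rmse_w < ep:
            return class_name, rmse_w
    return "R", rmse_w


# Assumed EP values in meters for the 1:25,000 scale (check against Table 2).
ep_25k = [("A", 4.25), ("B", 7.5), ("C", 12.5), ("D", 15.0)]
# Hypothetical sample: (mean discrepancy in m, length in m).
sample = [(5.0, 50.0), (10.0, 30.0), (15.0, 20.0)]
print(classify_by_rmse(sample, ep_25k))
```

The table is passed in as an ordered list so that the same function can be reused for any scale without hard-coding the standard.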
The RMSEW is calculated by Equation 6, where dm and l correspond, respectively, to the value of the mean discrepancy and the length of each feature of the sample.

$$RMSE_W = \sqrt{\frac{\sum \left(d_m^2 \cdot l\right)}{\sum l}} \tag{6}$$

Results and Discussion

Controlled Experiment's Results
Initially, it was found that the DBM may yield different discrepancy values for the same pair of homologous features. This is shown by the simplest case, two parallel lines 500 meters long separated by 20 meters with a 90° azimuth (Figure 6), for which several discrepancy values were obtained (Table 5).
As in Santos et al. (2015), the EM was used as the buffer distance. Table 5 reveals that the discrepancy is influenced by the buffer distance, which is different for each scale and class of the PEC-PCD (Tables 2 and 3). The calculated discrepancy varied more for the larger scales and less for the small scales. In the latter case, the average obtained was 32.15 meters.
The main reason for the large variation in the calculated discrepancies is the lack of intersection between the polygons when a small buffer value was applied (in the cases where EM is less than 10 meters).As there is no intersection, the discrepancy of the DBM becomes directly proportional to the value x of the buffer (Equation 5).
Thus, the DBM makes sense in cases where there is an intersection between the polygons generated from the lines' buffers. In the situation of two parallel lines separated by 20 meters, this only happens for small scales, except for the case of PEC-PCD "A" at the 1:25,000 scale, where the EM is 7.0 meters (Table 2).
Therefore, Table 5 shows that, for small scales, the DBM overestimates the value of the discrepancies: a displacement of 20 meters was imposed, yet the method yielded 32.15 meters for one of the situations and values greater than 20 meters in most cases.
Applying the ERM to the two parallel lines, the exact value of 20 meters of discrepancy was obtained, which is the expected value for the situation. In this method, the calculated value is independent of the scale; that is, the discrepancy is the same for all situations.
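The behavior described above can be checked analytically for two parallel lines, without a GIS. The sketch below assumes round-capped buffers and the form dm = π · x · ΣAdiff / ABTL for Equation 5. The overlap of the two buffers is then a strip of width 2x minus the separation, plus the lens formed by the two end caps. The buffer distances 7.0 m and 12.5 m are the EM values quoted in the text for classes A and B at 1:25,000, while 25.0 m is a hypothetical larger value. The sketch illustrates the trend only and is not meant to reproduce Table 5.

```python
import math


def lens_area(radius, separation):
    """Overlap area of two equal circles whose centers are `separation` apart."""
    if separation >= 2.0 * radius:
        return 0.0
    return (2.0 * radius * radius * math.acos(separation / (2.0 * radius))
            - 0.5 * separation * math.sqrt(4.0 * radius * radius
                                           - separation * separation))


def dbm_parallel_lines(length, separation, x):
    """DBM mean discrepancy for two equal, parallel, aligned segments."""
    buffer_area = 2.0 * x * length + math.pi * x * x  # round-capped buffer
    if separation >= 2.0 * x:
        overlap = 0.0  # buffers do not touch
    else:
        overlap = (2.0 * x - separation) * length + lens_area(x, separation)
    diff_area = buffer_area - overlap  # RL buffer outside the TL buffer
    return math.pi * x * diff_area / buffer_area


for x in (7.0, 12.5, 25.0):
    print(x, round(dbm_parallel_lines(500.0, 20.0, x), 2))
```

With x = 7.0 m the buffers do not overlap and the result equals π · x regardless of the true 20-meter offset. With the larger buffers the result stays above the imposed 20 meters, which is consistent with the overestimation reported here.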
The shape of the linear features and the displacement azimuth between the homologous pairs present different results for the different situations presented in Figure 6 for both the ERM and the DBM.Table 6 presents the results of the discrepancies calculated for the nine situations presented in Figure 6.In Table 6, the discrepancy value of the DBM corresponds to the mean of the calculated values for small scales, so the discrepancies for large scales were not considered.
Considering the shape of the features, Feature 1 (straight segment) was the one with the greatest variation in discrepancy values in both methods. On the other hand, Feature 3 (semicircle) showed the smallest variation in the discrepancies.
In both methods, in situations where part of the homologous pair is coincident (Feature 1 and Feature 2 with 0° azimuth), the values of the discrepancies were smaller. A discrepancy of zero means that the lines were parallel and overlapping (collinear), so there is no area between them.
However, of the 500 meters of the features, only 40 meters were not coincident. In this situation, asserting that the discrepancy is 20 meters (as calculated by the traditional method) can give a distorted picture, given that the features coincided over most of their extent. Nevertheless, considering the discrepancy to be null, as the ERM does, is unfair because it disregards the fact that a discrepancy exists.
Discrepancies obtained by the DBM generally overestimated the 20-meter discrepancy imposed on the feature points. On the other hand, in none of the cases did the ERM exceed the imposed value.

Case Study's Results
Within the study area, initially, the amount of RL from BDGEx was 1552 features.The number of TL related to OSM was 285.After the pre-processing described in Figure 8, the amounts were changed to 3568 and 3219 for the BDGEx and OSM layers, respectively.
The criterion for identifying pairs of homologous lines was the same for both methods, applying a relationship buffer of 30 meters.With this parameter, the number of pairs of homologous lines automatically identified was 2122.
Figure 10 shows pairs of homologous lines (the result of the DBM tool) and Figure 11 allows visualization of the polygons generated by these lines (the result of the ERM tool).
Both QGIS tools, developed by the authors, generate a shapefile layer. The ERM tool creates a layer of polygons whose attributes include the average discrepancy for each feature. Similarly, the DBM tool creates a multiline layer for pairs of homologous lines. In this case, the attributes include the discrepancy for each scale and PEC-PCD class, because the buffer value is different for each of these combinations.
Because the DBM yields different discrepancy values, the average of the discrepancies calculated for small scales was considered for each homologous pair, allowing a comparison with the ERM.
Table 7 presents the mean, weighted mean, RMSE, weighted RMSE, minimum and maximum of the discrepancies obtained with each method for the total number of homologous lines.
In both methods, the values weighted by feature length were higher than the unweighted ones, and this difference was more evident for the ERM.
In this case study, the pairs of features were obtained by applying a 30-meter relationship buffer. That is, for any of the pairs, the maximum discrepancy should be 30 meters, which was not the case for the DBM, which yielded discrepancies of up to 69.7 meters (Table 7). Table 8 presents the results of the PEC-PCD for the OSM roads data, qualifying the results of the mean discrepancies of homologous lines for each method.
This assessment was made based on the two weighted calculation criteria for comparison with EM and EP values of Table 2.

Conclusion
The studied methods allowed the determination of the mean discrepancy between pairs of homologous lines. The results of these methods were analyzed through controlled experimentation with distinct features, varying the azimuth at a fixed distance. A case study with the OSM roads data was also carried out in relation to those available in the BDGEx.
In the DBM, for the same pair of homologous lines, different values of discrepancy were obtained (Table 5).This variation happened due to the adoption of the EM values as buffer distance which is different for each scale and PEC-PCD class.In the case of the ERM, there is only one discrepancy value.
Three cases were presented to prove that the feature configuration and the displacement azimuth interfere in the determination of the discrepancy in both methods.In cases where the lines are parallel with coincident parts, the ERM underestimates the value of the discrepancy by assigning the value 0 (zero).
Both the controlled experiment (for small scales) and the case study showed that the DBM overestimates the value of the discrepancy between linear features (Tables 6 and 7).
By visual inspection, it was verified that the ERM discrepancies were very close to the values calculated with the QGIS distance measuring tool. On the other hand, the DBM discrepancies were almost double those obtained by the other method for most pairs of homologous lines. This is shown by the scatter diagram (Figure 12).
Fonseca Neto et al. (2017), Cruz and Santos (2016), Santos et al. (2016) and Santos et al. (2015) adopted the EM as the buffer value x, which in their works is simply called PEC.
The choice of buffer distance requires detailed investigation. A fixed value in millimeters (mm) should be adopted for the buffer instead of the EM. The fixed value would be converted into meters according to the scale for calculating the discrepancy.
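The conversion suggested above is simple arithmetic: a distance in millimeters at map scale is multiplied by the scale denominator and divided by 1000 to obtain meters on the ground. A minimal sketch, with a hypothetical function name and a hypothetical fixed value of 0.5 mm chosen only for illustration:

```python
def buffer_distance_m(value_mm, scale_denominator):
    """Ground distance in meters for a map distance in millimeters."""
    return value_mm * scale_denominator / 1000.0


# A hypothetical fixed buffer of 0.5 mm at the four scales evaluated.
for denominator in (25_000, 50_000, 100_000, 250_000):
    print(f"1:{denominator:,}", buffer_distance_m(0.5, denominator), "m")
```

With this approach the buffer still grows with the scale denominator, but it no longer changes with the PEC-PCD class being tested.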
However, it is clear the DBM is more rigorous than the ERM in the assessment of Positional Accuracy.Any product approved in DBM will inevitably be approved in the ERM and, most likely, in the Traditional Method by homologous points.
The Traditional Method differs from both studied methods in its advantage of detecting systematic errors; nevertheless, the ERM and DBM determine the precision of linear features automatically and efficiently.
In terms of innovation, it is worth noting the use of "adapted" criteria for the evaluation of the PEC-PCD for linear features, which consider a weighting based on the length of the features. That is, the average discrepancy values of longer features have greater weight than those of shorter features. Although this innovation is not standardized, the proposal deserves to be studied and applied.
In the classification criteria of the PEC-PCD for linear features, not weighting by feature length may distort the results, since it indiscriminately assigns the same weight to lines with different lengths.
As the calculated discrepancy value in both methods is a mean estimate, then a probability distribution based on the linear characteristic "length" as weight can be considered to evaluate the sample (Montgomery and Runger, 2002).
Regarding the quality of the OSM highways, the results were worse than those found by Cruz and Santos (2016). A possible reason is that the region is a rural locality, a type of area pointed out by Sehra et al. (2014) and Helbich et al. (2012) as having low contribution and data validation.
The variety of open data resources unquestionably contributes to savings in mapping work. However, it is worth emphasizing the care needed in their use regarding the quality of the products to be developed, so that they meet their purpose, which is reflected mainly in the scale at which the product will be applied.

Figure 5: Shape and size of features for the controlled experiment.

Figure 6: Homologous pairs used in the controlled experiment.

Figure 9: a) merge lines with equal direction; b) cut lines in intersections.

Figure 12 corresponds to the scatter diagram, where each point represents the discrepancies obtained by each method for a pair of homologous features. The Pearson correlation coefficient calculated was 0.432, which highlights the low correlation between the results. However, inspection of the graph shows that the correlation tends to increase as the ERM discrepancy values get higher.

Figure 12: Scatter diagram between the values obtained for each method.

Table 4: Example of PEC-PCD evaluation considering the EM.

Table 5: Different values (in meters) of discrepancies for the same situation applying the DBM.

Table 6: Discrepancies in meters for the 3 features of the controlled experiment.

Table 7: Descriptive results of discrepancies calculated for the two methods.

Table 8: Result of PEC-PCD for each method.