Automatic Building Extraction Using Lidar and Aerial Photographs

This paper presents an automatic building extraction approach using LiDAR data and aerial photographs from a multi-sensor system mounted on a single platform. The approach consists of segmentation, analysis and classification steps based on object-based image analysis. The chessboard, contrast split and multi-resolution segmentation methods were used in the segmentation step. The object primitives determined in segmentation, such as scale parameter, shape, completeness, brightness, and statistical parameters, were used to determine threshold values for classification in the analysis step. The rule-based classification was carried out with decision rules based on the determined object primitives and fuzzy rules. A hierarchical classification was adopted in this study: the vegetation and ground classes were generated first, and the building class was then extracted. The NDVI, slope and Hough images were generated and used to avoid confusing the building class with other classes. The intensity images generated from the LiDAR data and morphological operations were used to improve the accuracy of the building class. The proposed approach achieved an overall accuracy of approximately 93% for the target class in a suburban neighborhood, which was the study area. Moreover, completeness (96.73%) and correctness (95.02%) analyses were performed by comparing the automatically extracted buildings with reference data.


INTRODUCTION
Healthy and sustainable urban development is an important factor in human life. Therefore, current spatial data are needed to improve urban management and quality of life. The acquisition of information regarding man-made objects in a fast and accurate manner plays an important role in making critical decisions for city planning and urban development. The automatic extraction of buildings is useful for many applications, such as project planning in various infrastructures, analyzing population mobility and tracking and preventing illegal housing in urban areas. Particularly in cities located in seismic belts, the regular tracking of buildings is critical for emergency disaster planning after earthquakes and for guiding rescue operations.
The extraction of objects (e.g., buildings, roads, and vegetation) has become an important topic of photogrammetry and computer vision research. The goal of object extraction is to meaningfully organize, group, and properly represent the points, edges, and areas of objects (VOSSELMAN et al., 2004). Traditional approaches to automatic object extraction include pixel-based image processing and classification methods. The main problem encountered in this classical approach is confusion of the target object with other classes. In addition, the visual quality, shadow and contrast features of aerial photographs directly affect the quality of object extraction. In the pixel-based classification method, which is the traditional approach, only the spectral value of the pixel is used. However, the success of this approach is limited when different objects have similar spectral information (GAO, 2003). This situation has a negative effect on the accuracy of the classification process. The extraction of objects such as buildings, roads and vegetation is a highly complex classification problem. In the object-based classification method, other parameters are used in addition to spectral information, such as the textures, shapes and neighboring relationships of the object.
The most common problem encountered in building extraction in pixel-based and object-based classification is confusion of the building class with other classes. Different approaches and methods have been proposed to solve this problem. The Hough transform (HOUGH, 1962; TARSHA-KURDI et al., 2007), slope analysis (ZEVENBERGEN and THORNE, 1987), and the Normalized Difference Vegetation Index (NDVI) (ROTTENSTEINER et al., 2007; DEMIR et al., 2009; AWRANGJEB et al., 2010) are examples of such methods. In addition, different classification methods, such as ISO data classification (RICHARDS, 1999; HAALA and BRENNER, 1999), have been used to improve classification results for building extraction. A short summary of the recent state of the art in building extraction methods can be found in Yong and Huayi (2008), Matikainen (2009), Kabolizade et al. (2010), Blaschke (2010) and Pakzad et al. (2011).
In recent years, studies have been performed using automatic object extraction with multi-band images (HAALA and BRENNER, 1999), point clouds obtained using the Light Detection and Ranging (LiDAR) sensing system (MAAS and VOSSELMAN, 1999; SITHOLE, 2005), and a combination of these data (ROTTENSTEINER et al., 2007; ELBERINK, 2010; ELBERINK and VOSSELMAN, 2011). Intensity images from the LiDAR system are used as additional data to improve classification accuracy. The pixel- and object-based image classification techniques for LiDAR intensity data have been compared and tested by El-Ashmawy et al. (2011). A similar comparison, but for data from airborne laser scanning for building extraction, can be found in Rutzinger et al. (2009). Mao et al. (2009) have even used aerial images to reduce the difficulty of identifying building outlines. Rottensteiner et al. (2005b) developed methods to extract buildings from aerial imagery and laser range data based on the Dempster-Shafer theory. Beger et al. (2011) used the object-oriented image analysis method with a fusion of high-resolution aerial imagery and LiDAR data for automated railroad center line reconstruction. Awrangjeb et al. (2010) reported completeness results for object-based (97%) and pixel-based (78%) methods using LiDAR data and multispectral imagery. Vosselman (2000), Zhang et al. (2003), Sithole (2005), Zeng (2008), and Sampath and Shan (2010) applied different slope-based filters to separate non-ground objects using LiDAR point clouds. Rottensteiner et al. (2005a), Rottensteiner and Clode (2009) and Khoshelham et al. (2011) used a Normalized Digital Surface Model (nDSM), obtained by subtracting the Digital Terrain Model (DTM) from the Digital Surface Model (DSM), to detect non-ground objects.
The data fusion method has certain advantages over the single method, but it also poses problems when the data are obtained at different times and resolutions. To resolve these problems, LiDAR point cloud and intensity data can be used along with aerial photographs obtained from LiDAR, GPS/IMU and a digital camera on the same platform for automatic building extraction.
Building extraction is not only a research topic but also a requirement in urban management, control and decision-making processes, which require accurate spatial and spectral data on buildings. Automatic building extraction methods should be fast, accurate, and easy to implement in a large study area for applications in urban areas. To address the above-mentioned problems in automatic building extraction methods, we aimed to create new rule sets that contribute to solving complex building extraction problems. In this research, an efficient workflow is proposed for automatic building extraction with LiDAR data and aerial images based on object-based image analysis with a multi-sensor system. Rule sets were developed for the vegetation, ground and building target classes using the proposed approach, and automatic building extraction was performed.
The organization of this paper is as follows. In Section 2, the technical approach is explained in detail, including the workflow, segmentation and classification steps. In Section 3, the properties of the study area, the data set obtained by the multi-sensor system and the results of the experiment are given. Finally, concluding remarks are given in Section 4.

METHODOLOGY
The proposed technique for automatic building extraction includes segmentation, analysis and classification steps using the data set from a multi-sensor system. To solve the misclassification problem, the object-oriented image analysis method was utilized for the extraction of the building class as well as the vegetation and ground classes. The proposed method was performed using defined rules that were organized to improve the building class.

Overview
Figure 1 shows a diagram of the proposed building extraction strategy. The input information consists of a DSM, an intensity image and a color infrared orthoimage of the study area generated with the data set obtained from LiDAR, GPS/IMU and a digital camera on the same platform. The NDVI, slope and Hough images were used in the segmentation and classification steps. Detailed information about the data is given in Section 3. The primary objective of this research is to extract the building class automatically, but the extraction of the vegetation and ground classes is also of major importance because it affects the accuracy of the extracted building class.
The object-oriented image analysis method used in this research has two major steps: segmentation and classification. The segmentation procedure starts with a one-pixel object and merges similar neighboring objects together. Then, the image is separated into homogeneous object clusters according to certain features, including scale parameter, shape, completeness, brightness, contrast difference, and statistical parameter values (BENZ et al., 2004; NAVULUR, 2007); segments at different levels were produced using these features (DASH et al., 2004). In this research, the chessboard, contrast split and multi-resolution segmentation methods were utilized in the segmentation steps of the automatic extraction of the building, vegetation and ground classes with the proposed approach.
The chessboard segmentation algorithm splits the image into square image objects, and each object is cut along these gridlines for more detailed analyses. The size of the square grid in pixels defines the object size. The contrast split segmentation splits the image into dark and bright regions based on a threshold that maximizes the contrast between the resulting bright objects and dark objects. The optimal threshold is evaluated separately by an algorithm for each image object in the image object domain. The contrast split segmentation algorithm first executes chessboard segmentation and then performs the split for each square if the pixel level is selected in the image object domain (TRIMBLE DEFINIENS, 2010).
The multi-resolution segmentation algorithm locally minimizes the average heterogeneity of image objects. The algorithm consecutively merges pixels or existing image objects and can be defined as a bottom-up segmentation algorithm based on a pairwise region merging technique. This segmentation method is an optimization procedure that minimizes the average heterogeneity and maximizes the respective homogeneity of the segments. The segmentation procedure begins with single-image objects of one pixel and repeatedly merges them as long as an upper threshold of homogeneity is not exceeded locally. This homogeneity criterion is defined as a combination of both spectral and shape homogeneity, which are influenced by the scale parameter. Higher scale parameter values result in larger image objects and smaller values in smaller image objects. Image layers can be weighted to determine their importance or suitability for the segmentation result. The compactness criterion is used to optimize the compactness of image objects (TRIMBLE DEFINIENS, 2010; BEGER et al., 2011). Different image layers, such as the Hough and intensity layers, can be included in multi-resolution segmentation, which we adopted in our approach to improve the segmentation quality and more accurately represent the newly created image object. Detailed information about the parameters used for the different segmentation methods is given in the following section for the vegetation, ground and building classes using the proposed approach.
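The chessboard and contrast split ideas described above can be illustrated with a minimal Python sketch. It is not the eCognition implementation: the function names, the candidate threshold list and the use of the difference between the bright and dark mean values as the contrast measure are all assumptions made for illustration.

```python
def chessboard_labels(n_rows, n_cols, tile):
    """Assign each pixel the id of the square tile that contains it."""
    tiles_per_row = -(-n_cols // tile)  # ceiling division
    return [[(r // tile) * tiles_per_row + (c // tile) for c in range(n_cols)]
            for r in range(n_rows)]


def contrast_split_threshold(values, candidates):
    """Return the candidate threshold that maximizes the difference between
    the mean of the bright pixels and the mean of the dark pixels.
    (Assumed contrast measure; other definitions are possible.)"""
    best_threshold, best_contrast = None, float("-inf")
    for t in candidates:
        dark = [v for v in values if v < t]
        bright = [v for v in values if v >= t]
        if not dark or not bright:
            continue  # a split must produce both a dark and a bright part
        contrast = sum(bright) / len(bright) - sum(dark) / len(dark)
        if contrast > best_contrast:
            best_threshold, best_contrast = t, contrast
    return best_threshold


# Hypothetical example: a 4 x 4 image cut into 2 x 2 tiles,
# and one tile whose pixel values are split into dark and bright parts.
labels = chessboard_labels(4, 4, 2)
threshold = contrast_split_threshold([10, 12, 11, 200, 210, 205], [11, 100, 206])
```

In this sketch the threshold search is run on one list of pixel values at a time, which mirrors the statement that the optimal threshold is evaluated separately for each image object.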
The object primitives determined in the segmentation (such as spectral characteristics, scale parameter, shape, completeness, brightness, contrast difference and statistical parameters) were altered repeatedly in the analysis and classification steps until the objects matched the target object class. In the analysis stage, the object primitives were used to distinguish objects into different types by classification. The segmentation, analysis and classification steps of the proposed object-oriented image analysis method included subsequent steps allowing the refinement or improvement of the segmentation locally for a specific class, such as building, ground or vegetation. Thus, the entire proposed building extraction strategy, as shown in Figure 1, alternates iteratively between local segmentation modifications on the one hand and local object analysis and classification on the other hand.
The rule-based classification was carried out because it offers the possibility to automate the entire classification process with decision rules based on determined object primitives combined with fuzzy logic operators at different levels of analysis. The fuzzy rules (membership functions) were defined to include information about the overall reliability, stability and class combination of all of the potential classes. In fuzzy classification, a complete fuzzy system was defined, including the fuzzification of object primitives that will form fuzzy sets, fuzzy logic combinations of these fuzzy sets for defining a class, and defuzzification of the fuzzy classification results to obtain the common crisp classification for thematic classification (BENZ et al., 2004). The entire object-oriented image analysis for automatic building extraction was performed in Definiens eCognition Developer 8.64 with defined rule sets.
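The three stages of a fuzzy system named above (fuzzification, fuzzy logic combination and defuzzification) can be sketched in a few lines of Python. This is only an illustration: the linear ramp membership function, the breakpoints, the use of the minimum as the fuzzy AND and the feature names `ndvi` and `height` are assumptions, not the rule set used in the study.

```python
def ramp_up(x, low, high):
    """Fuzzification: linear membership rising from 0 at `low` to 1 at `high`."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)


def class_memberships(ndvi, height):
    """Fuzzy logic combination with hypothetical breakpoints.
    The fuzzy AND is taken as the minimum and NOT as 1 - membership."""
    vegetation = ramp_up(ndvi, 0.25, 0.75)
    elevated = ramp_up(height, 1.0, 3.0)
    building = min(elevated, 1.0 - vegetation)
    return {"vegetation": vegetation, "building": building}


def defuzzify(memberships):
    """Defuzzification: pick the class with the highest membership value."""
    return max(memberships, key=memberships.get)


# Hypothetical image object: low NDVI, 4 m above the ground.
label = defuzzify(class_memberships(ndvi=0.1, height=4.0))
```

Taking the maximum membership yields the crisp class label, while the membership values themselves remain available as a measure of how reliable each assignment is.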

Generation of classes
The hierarchical classification method was developed for the proposed automatic building extraction. Instead of focusing on building extraction at an early stage of the classification steps, a classification of the data into vegetation, ground and building classes was first performed. Building regions were then derived from the classification results. Figure 2 shows the applied hierarchical classification scheme for automatic building extraction. The Normalized Difference Vegetation Index (NDVI) was employed to classify vegetation and is computed as follows:

$$\mathrm{NDVI} = \frac{\mathrm{NIR} - \mathrm{Red}}{\mathrm{NIR} + \mathrm{Red}} \qquad (1)$$

Before classification, the contrast split segmentation was performed with NDVI threshold values determined from histogram analyses of the NDVI image for vegetation extraction. The classification was performed with defined fuzzy rules, and the results were improved with morphological operations such as opening and closing to represent the vegetation class accurately.
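Equation (1) and the thresholding that follows it can be sketched per pixel as below. The band values are invented, the images are plain nested lists for self-containment (an array library would normally be used), and the interval 0.35 to 0.55 is the one reported for this data set in the results section; it should not be read as a general vegetation threshold.

```python
def ndvi(nir, red):
    """Equation (1) for a single pixel; returns 0.0 where NIR + Red is zero."""
    total = nir + red
    return 0.0 if total == 0 else (nir - red) / total


def ndvi_image(nir_band, red_band):
    """Apply Equation (1) to two co-registered bands given as nested lists."""
    return [[ndvi(n, r) for n, r in zip(nir_row, red_row)]
            for nir_row, red_row in zip(nir_band, red_band)]


def threshold_mask(image, low, high):
    """Mark pixels whose value lies in the closed interval [low, high]."""
    return [[low <= value <= high for value in row] for row in image]


# Hypothetical 2 x 2 digital numbers for the near-infrared and red bands.
nir_band = [[150, 60], [100, 50]]
red_band = [[50, 60], [100, 150]]
index = ndvi_image(nir_band, red_band)
vegetation = threshold_mask(index, 0.35, 0.55)
```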
The ground and non-ground classes were differentiated using the DSM of the study area, which contained the height information of the buildings and other objects elevated from the bare ground. The slope image generated from the DSM (ZEVENBERGEN and THORNE, 1987; ROTTENSTEINER et al., 2005a; BEGER et al., 2011) was used to identify these objects. Object contours having the same slope value in the slope image were used in contrast split segmentation, and the non-ground class was obtained with fuzzy classification using the determined threshold values. Because the first and last returns of the LiDAR pulse were unavailable, quantile statistical analyses of the DSM were performed. Then, the ground class was generated using the threshold value from the quantile statistical analysis.
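A minimal sketch of the two quantities used here, a slope image derived from the DSM and a height quantile, is given below. It assumes a central-difference slope on a regular grid expressed in degrees and a simple nearest-rank quantile; the exact slope formulation, its units and the quantile level used in the study are not stated in this section, so these choices are illustrative only.

```python
import math


def slope_degrees(dsm, cell_size):
    """Slope of each interior DSM cell from central differences, in degrees.
    Border cells are left at 0.0 because they lack a full neighborhood."""
    rows, cols = len(dsm), len(dsm[0])
    slope = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            dz_dx = (dsm[r][c + 1] - dsm[r][c - 1]) / (2 * cell_size)
            dz_dy = (dsm[r - 1][c] - dsm[r + 1][c]) / (2 * cell_size)
            slope[r][c] = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
    return slope


def quantile(values, q):
    """Nearest-rank quantile of a list of heights, with 0 <= q <= 1."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, int(q * len(ordered)))
    return ordered[index]


# Hypothetical 3 x 3 DSM: a plane rising 1 m per 1 m cell in the x direction.
dsm = [[0.0, 1.0, 2.0], [0.0, 1.0, 2.0], [0.0, 1.0, 2.0]]
center_slope = slope_degrees(dsm, cell_size=1.0)[1][1]
height_threshold = quantile([z for row in dsm for z in row], 0.5)
```

Cells whose height falls below the quantile threshold and whose slope is low would then be candidates for the ground class.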
The proposed technique for building class extraction has five major steps. First, an nDSM was created to exclude the influence of topography using the difference between the DSM of the non-ground class and the average point heights in the ground class. Before nDSM generation, ground and vegetation masks were generated using the classification results and the DSM. The ground mask and vegetation mask were extracted from the DSM, and then the DSM of the non-ground class was generated. Second, objects in the nDSM with heights above 1 m were classified as initial building 1. Third, the multi-resolution segmentation was performed using the initial building 1 class, the orthoimage and the generated Hough image (HOUGH, 1962). The parameters of scale, shape, compactness, smoothness and color were taken into consideration in multi-resolution segmentation. The initial building 2 and non-building classes were generated as a result of the classification with a determined threshold value for the building and non-building classes. Fourth, the contrast split segmentation was utilized using the intensity image generated from the LiDAR data, and the initial building 3 class was generated as a result of the classification with the determined building intensity threshold. Finally, the building class was acquired after morphological operations (opening and closing), and the non-building class was renamed as the other class.
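The first two steps can be sketched as follows, assuming the DSM and the ground and vegetation masks are co-registered nested lists. The function names and the toy heights are hypothetical; only the use of the mean ground height as the reference level and the 1 m height threshold come from the text.

```python
def normalized_dsm(dsm, ground_mask):
    """Subtract the average height of the ground pixels from every DSM cell."""
    ground_heights = [z for dsm_row, mask_row in zip(dsm, ground_mask)
                      for z, is_ground in zip(dsm_row, mask_row) if is_ground]
    mean_ground = sum(ground_heights) / len(ground_heights)
    return [[z - mean_ground for z in row] for row in dsm]


def initial_building_1(ndsm, ground_mask, vegetation_mask, min_height=1.0):
    """Keep cells above `min_height` that are neither ground nor vegetation."""
    return [[height > min_height and not is_ground and not is_vegetation
             for height, is_ground, is_vegetation in zip(h_row, g_row, v_row)]
            for h_row, g_row, v_row in zip(ndsm, ground_mask, vegetation_mask)]


# Hypothetical 2 x 3 scene: flat ground at 100 m, a tree and a 6 m high roof.
dsm = [[100.0, 100.0, 106.0], [100.0, 101.0, 106.0]]
ground = [[True, True, False], [True, False, False]]
vegetation = [[False, False, False], [False, True, False]]
ndsm = normalized_dsm(dsm, ground)
candidates = initial_building_1(ndsm, ground, vegetation)
```

Using a single mean ground height is only reasonable for terrain that is nearly flat within the processed area; a sloping site would need a local ground reference instead.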

Accuracy assessment
A statistical analysis of the results and a comparison of the results with the reference data set were used to evaluate the performance of the proposed automatic building extraction approach. The Error Matrix method is a preferred method for statistical analyses because it provides not only the overall accuracy but also the opportunity to evaluate the producer accuracy, user accuracy and Kappa analysis results (CONGALTON and GREEN, 2009). The second approach used in the accuracy assessment was completeness and correctness analyses. The main approach in this method is as follows (ROTTENSTEINER et al., 2007):

$$\mathrm{Completeness} = \frac{TP}{TP + FN} \qquad (2)$$

$$\mathrm{Correctness} = \frac{TP}{TP + FP} \qquad (3)$$

In these formulas, TP represents true positives, and FP and FN are the false positives and false negatives, respectively. TP is the number of true positive entities classified as buildings in both data sets (the result of the automatic classification and the reference data set), FN is the number of false negative entities identified as buildings in the reference data that were not classified as buildings in the automatic classification, and FP is the number of false positive entities that were classified as buildings in the automatic classification but were not classified as buildings in the reference data (RUTZINGER et al., 2009).
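Equations (2) and (3) translate directly into code. The sketch below assumes that each entity carries one boolean flag per data set (building or not); the flags in the example are invented and are unrelated to the results reported in this paper.

```python
def count_matches(reference, extracted):
    """Count TP, FN and FP from two equally long lists of building flags."""
    pairs = list(zip(reference, extracted))
    tp = sum(1 for ref, ext in pairs if ref and ext)
    fn = sum(1 for ref, ext in pairs if ref and not ext)
    fp = sum(1 for ref, ext in pairs if ext and not ref)
    return tp, fn, fp


def completeness(tp, fn):
    """Equation (2): share of reference buildings that were extracted."""
    return tp / (tp + fn)


def correctness(tp, fp):
    """Equation (3): share of extracted buildings that are in the reference."""
    return tp / (tp + fp)


# Hypothetical flags for four entities.
reference = [True, True, False, True]
extracted = [True, False, True, True]
tp, fn, fp = count_matches(reference, extracted)
scores = (completeness(tp, fn), correctness(tp, fp))
```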

1. Study area and data
The study area, used for testing the proposed approach for automatic building extraction, is a suburban neighborhood located in the northwest of the city of San Bernardino, California, United States of America. The data set from a project named "B4", conducted through the cooperation of Ohio State University and the U.S. Geological Survey, was used (CSANYI and TOTH, 2006; TOTH et al., 2007). The data set was obtained simultaneously with the multi-sensor system, which included LiDAR, GPS/IMU and a digital camera (color infrared) positioned on the same platform. Detailed information about the multi-sensor system and collected data is given in Table 1. The study area was chosen because the data set was collected simultaneously with the multi-sensor system onboard the same airplane and contained different types of land cover within a small area, including buildings, roads, vegetation, shadows, trees, grass and one of America's largest amphitheaters (the Hyundai Pavilion in Glen Helen), with a capacity of more than 65,000 people. The relatively small size of the study area allowed reference digitizing for an accuracy assessment of the extracted objects.
A gridded DSM with a resolution of 0.2 m, an intensity image with a ground sampling distance (GSD) of 0.2 m and an orthoimage with a GSD of 0.2 m were produced using the data set from the multi-sensor system (Figure 3). The NDVI image was generated using the NDVI method described in Section 2.1. A slope image was generated by slope analysis of the DSM, and a Hough image was generated by the Hough transformation of the orthoimage.

2. Results
The first step of the proposed automatic building extraction approach is segmentation, which can be seen in Figure 1. Because we proposed the hierarchical classification method in the classification step, different segmentation methods were utilized for each target class. The contrast split segmentation was performed using the NDVI image to generate the vegetation class as a first step of the hierarchical classification scheme for automatic building extraction (Figure 2). Contrast split segmentation was applied to generate the ground class using the slope image generated with the DSM of the study area (ZEVENBERGEN and THORNE, 1987). The multi-resolution segmentation was carried out for the automatic extraction of buildings using the orthoimage, the Hough image and the other classes generated from the previous step of the hierarchical classification method. Figure 4 shows the generated NDVI, slope and Hough images of the study area that were used in the segmentation steps. The contrast split segmentation was repeated with the intensity image (Figure 3b) to improve the building class.
The segmentation step is the first and most important step of the proposed automatic building extraction based on the object-oriented image analysis technique. The object primitives determined in segmentation were used to distinguish objects into different types by classification. Therefore, analyses of the parameters used in segmentation are of vital importance to the success of the proposed building extraction strategy. The scale parameter used in segmentation defines the maximum segment or object size. To determine the most appropriate scale parameter for multi-resolution segmentation, the parameters for shape and compactness were both set to 0.5, and then segmentation was performed for scale parameters of 25, 50 and 75. Figure 5 shows the results of the multi-resolution segmentation using the orthoimage of the study area with scale parameters of 25, 50 and 75. As expected, the scale parameter of 75 produced the maximum segment size and fewer objects in multi-resolution segmentation. Therefore, we preferred the scale parameter of 25 for multi-resolution segmentation to include the maximum number of objects in classification steps for the rest of the segmentation process. A similar analysis was performed for shape and compactness. As a result of these analyses, the scale, shape and compactness parameters were set to 25, 0.4 and 0.6, respectively, for multi-resolution segmentation. Similar analyses were conducted for the other segmentation methods to create the most appropriate segments before classification.
The analysis steps were continued before classification to determine the threshold value for classification in our proposed approach. The classification was performed with the fuzzy rules (membership functions) for different classes. As mentioned in the previous section, the proposed approach in this study is based on hierarchical classification (Figure 2). First, the vegetation and ground classes were generated, and then the building class was obtained. The determined parameters for the segmentation and threshold values for classification were refined iteratively until the target class was obtained correctly.
A histogram analysis of the NDVI image (Figure 4a) was utilized to determine a threshold value for the vegetation and non-vegetation classes. In the classification steps, the thresholds were defined as NDVI values less than or equal to 0.55 and greater than or equal to 0.35. The vegetation class was generated using the fuzzy rules with defined threshold values and morphological operations. The result of the contrast split segmentation on the NDVI image and fuzzy classification for the generation of the vegetation class is given in Figure 6.
Contrast split segmentation was utilized using the slope image to identify elevated objects from bare ground (Figure 7a). The thresholds were defined as a slope value less than or equal to 200 and greater than or equal to 120. The ground and non-ground classes were extracted using fuzzy classification with the threshold value from the slope value (slope value less than or equal to 90) and the quantile statistical analysis (height in the DSM less than the quantile threshold).
Building class generation followed the five major steps described in the previous section. To avoid confusing the building class with the other classes, three initial building classes were generated. The initial building class 1 was created with a defined threshold value greater than 1 m with fuzzy classification using the generated nDSM (Figure 7b). Then, multi-resolution segmentation was performed using the orthoimage and the Hough image. Different weights were applied in multi-resolution segmentation. To improve building edge detection, the orthoimage was weighted as 0.3, but the Hough image was weighted as 1 (Figure 7c). The initial building class 2 was created using the result of the multi-resolution segmentation and building class 1 (Figure 7d). To eliminate non-building objects classified as buildings, contrast split segmentation was utilized using the intensity image. Then, initial building class 3 was generated as a result of the classification with the determined building intensity threshold value. The thresholds for building intensity were defined as an intensity value less than or equal to 127 and greater than or equal to 44. Man-made objects in the study area that were not buildings (e.g., removable containers and security shelters in small lots) were eliminated with defined rules such as the area threshold value and were classified as 'other'. The final building class was obtained after improvements with morphological operations such as opening and closing. Figure 8 shows the extracted vegetation, ground, building and other classes performed by defined rules with the proposed approach.
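The morphological opening and closing used to clean the final building class can be sketched on a binary mask as below. The 3 x 3 square structuring element and the border handling (only in-image neighbors are considered) are assumptions; the structuring element used in the study is not reported.

```python
def _filter_3x3(mask, combine):
    """Apply `combine` (all or any) to the in-image 3 x 3 neighborhood."""
    rows, cols = len(mask), len(mask[0])
    result = []
    for r in range(rows):
        result_row = []
        for c in range(cols):
            window = [mask[rr][cc]
                      for rr in range(max(0, r - 1), min(rows, r + 2))
                      for cc in range(max(0, c - 1), min(cols, c + 2))]
            result_row.append(combine(window))
        result.append(result_row)
    return result


def erode(mask):
    """A pixel survives only if its whole neighborhood is set."""
    return _filter_3x3(mask, all)


def dilate(mask):
    """A pixel is set if any pixel in its neighborhood is set."""
    return _filter_3x3(mask, any)


def opening(mask):
    """Erosion followed by dilation: removes small isolated specks."""
    return dilate(erode(mask))


def closing(mask):
    """Dilation followed by erosion: fills small holes and gaps."""
    return erode(dilate(mask))


# Hypothetical 5 x 5 masks: an isolated speck and a roof with a one-pixel hole.
speck = [[r == 2 and c == 2 for c in range(5)] for r in range(5)]
roof_with_hole = [[not (r == 2 and c == 2) for c in range(5)] for r in range(5)]
cleaned = opening(speck)
filled = closing(roof_with_hole)
```

Opening removes the single-pixel speck, and closing fills the single-pixel hole, which is the kind of cleanup applied to the vegetation and building classes.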

Performance evaluation
The performance evaluation of the proposed building extraction strategy based on object-oriented image analysis techniques was carried out using two different methods.
The "Error Matrix based on a TTA (training or test area) Mask" method was used first. In this method, the overall accuracy, producer accuracy, user accuracy and Kappa analysis results were computed by comparing the reference segments from each class and the results of the automatically extracted class with the proposed approach. The obtained performance evaluation results with the Error Matrix based on a TTA Mask are given in Table 2. As seen, an overall accuracy of 93% and a Kappa value of 88% were obtained as a result of the performance evaluation.
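The quantities derived from an error matrix can be computed as in the following sketch, which assumes that rows hold the classified labels and columns hold the reference labels and uses the standard Cohen's Kappa formula. The 2 x 2 matrix is invented for illustration and is unrelated to Table 2.

```python
def accuracy_report(matrix):
    """Overall, user and producer accuracy plus Kappa from a square error
    matrix (rows: classified labels, columns: reference labels)."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    diagonal = [matrix[i][i] for i in range(n)]
    row_totals = [sum(matrix[i]) for i in range(n)]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]

    overall = sum(diagonal) / total
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (overall - expected) / (1 - expected)
    return {
        "overall": overall,
        "kappa": kappa,
        "user": [d / t for d, t in zip(diagonal, row_totals)],
        "producer": [d / t for d, t in zip(diagonal, col_totals)],
    }


# Hypothetical two-class error matrix with 100 samples.
report = accuracy_report([[45, 5], [5, 45]])
```

The sketch assumes every class has at least one classified and one reference sample; an empty row or column would need separate handling.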
The second method used in the performance evaluation was the completeness and correctness analyses of the automatically extracted buildings. The reference data set for this method was generated by digitizing the target buildings over the orthoimage. The achieved results were 96.73% completeness and 95.02% correctness. The accuracy levels obtained from the two different performance evaluations confirm the success of the proposed automatic building extraction technique with LiDAR and aerial photographs.

CONCLUSIONS
This paper presented an efficient workflow for automatic building extraction using LiDAR data and aerial photographs. The proposed method, based on object-based image analysis, includes segmentation, analysis and classification steps using the data set from a multi-sensor system positioned on the same platform. The chessboard, contrast split and multi-resolution segmentation methods were utilized in the segmentation steps of the automatic extraction of the building, vegetation and ground classes with the proposed approach. Rule-based classification was carried out because it offers the ability to automate entire classification processes with defined decision rules based on determined object primitives and fuzzy rules (membership functions). To avoid confusing the building class with other classes, the NDVI, slope and Hough images were generated using the data set from the multi-sensor system and were used in the segmentation, analysis and classification steps of the proposed method. The intensity image generated from the LiDAR data and morphological operations (opening and closing) were used to improve the accuracy of the building class.
Two different methods were utilized for performance evaluation of the proposed automatic building extraction approach. The overall accuracy was 93%, and the Kappa value was 88%, according to the Error Matrix based on a TTA Mask method for the extracted vegetation, ground, other and building classes. The proposed automatic building extraction method achieved a detection rate of 96.73% for completeness and 95.02% for correctness. The accuracy levels obtained from the two different performance evaluations confirm the success of our approach for automatic building extraction. Although the rule set was developed over the relatively small size of a suburban neighborhood as a study area, modifications can easily be made for dense urban areas. The proposed automatic building extraction method can be applied for urban development, tracking and prevention of illegal housing, emergency disaster planning after earthquakes and the guidance of rescue operations.

REFERENCES
ZHANG K.; CHEN S.; WHITMAN D. A progressive morphological filter for removing nonground measurements from airborne LIDAR data. IEEE Transactions on Geoscience and Remote Sensing, 41, (4), 872-882, 2003.

(Received in February 2013. Accepted in March 2013).

Figure 1 - Flow diagram of the proposed building extraction strategy.

Figure 2 - Hierarchical classification scheme for automatic building extraction.

Figure 3 - The produced DSM (a), intensity image (b) and color infrared orthoimage (c) of the study area.

Figure 4 - The NDVI image (a), slope image (b) and Hough image (c) of the study area.

Figure 6 - The contrast split segmentation results using the NDVI image (a) and the vegetation class produced by the defined fuzzy rules (b).

Figure 7 - The results of the contrast split segmentation using slope images (a), initial building class 1 (b), multi-resolution segmentation using an orthoimage and a Hough image (c) and initial building class 2 (d).

Table 1 - Multi-sensor system and data specifications.

Table 2 - Accuracy assessment of the classification results using LiDAR data and aerial photographs. Error Matrix based on the TTA (Training or Test Areas) Mask.