Article
SEMANTIC DESCRIPTION FOR THE TAXONOMY OF THE GEOSPATIAL SERVICES
Descrição semântica da taxonomia dos serviços geoespaciais

With advances in the World Wide Web and Geographic Information Systems, geospatial services have progressively developed to provide geospatial data and processing functions online. To discover and manage the large number of geospatial services efficiently, these services are registered with semantic descriptions and categorized into classes according to certain taxonomies. Most taxonomies for geospatial services are provided only in human-readable form. The lack of semantic descriptions for taxonomies limits the semantic-based discovery of geospatial services. The objectives of this paper are to propose an approach for semantically describing the taxonomy of geospatial services and to use the semantic description of the taxonomy to improve the discovery of geospatial services. A semantic description framework is introduced for geospatial service taxonomies that describes not only the hierarchical structure of classes but also the definitions of all classes. The semantic description of a taxonomy based on this framework is further used to simplify the semantic description and registration of geospatial services and to enhance the semantic-based service matching method.


Introduction
Geospatial services are modularized programs developed to facilitate the discovery, access, and composition of geospatial information and processing functions over the Internet (Di et al. 2005, De Oliveira, De Oliveira, and Davis 2010). With the advances in the World Wide Web and Geographic Information Systems (GIS), a large number of online geospatial services now provide various geospatial data and functions. Categorizing these services into classes according to a taxonomy is an effective way to manage and discover them. Descriptions of geospatial services and of the taxonomy are the basis for making these services manageable and findable.
The traditional descriptions of geospatial services are human-oriented and difficult for computers to read. Semantic web technology provides a promising way to solve this problem. Semantic markup languages have been used in several studies to describe geospatial services and to improve the efficiency and accuracy of automatic service discovery (Di et al. 2007, Di et al. 2006, Klien, Lutz, and Kuhn 2006, Lutz 2007, Smits and Friis-Christensen 2007). Semantic descriptions of services cover many aspects, such as function, input/output, and quality. During service discovery, the correspondence between service descriptions and application requirements is checked according to certain rules, and the services that fulfill the application requirements are returned as results. This process is called service matching. Because of the great quantity of services, matching services one by one is a time-consuming process for computer programs. Providing complete and correct semantic descriptions for services is also a challenging task in practice, especially for geospatial services, which have various attributes and constraints.
To improve the efficiency of service matching, classification-based service matching methods have been proposed and discussed in several studies (Yue et al. 2007, Bai, Di, and Wei 2009, Luo, Wang, and Chen 2010, Zeng et al. 2013). In classification-based matching, services are categorized into classes and these classes are used to find services. Only services of the specified class are matched; all others are excluded. Classification-based matching is usually the first step of the matching process. Further matching steps, such as matching by inputs and outputs (I/O), can then refine the result. The overall efficiency of the matching process improves greatly when a large number of service instances are excluded by classification-based matching (Wang, Li, and Luo 2012).
Information about service classes and the relations among them is important for classification-based service matching. Ontologies are used to describe the sub-class relationships in service taxonomies (Di et al. 2005, Bai, Di, and Wei 2009). In addition to a class hierarchy, most taxonomies provide textual definitions for all classes. These definitions also carry important semantic information about service classes. So far, the semantic description of geospatial service taxonomies has been limited to the sub-class relationship, and no further research has addressed how to semantically describe the common attributes of each class in a taxonomy.
The incomplete descriptions of service taxonomy limit the application of classification-based matching (Zhang et al. 2010).
The objectives of this paper are to propose an approach for semantically describing the taxonomy of geospatial services and to use the semantic description of the taxonomy to improve the discovery of geospatial services. A semantic description framework is introduced for geospatial service taxonomies that describes not only the hierarchical structure of classes but also the definitions of all classes. The semantic description of a taxonomy based on this framework is further used to simplify the semantic description and registration of geospatial service instances and to enhance the efficiency of service matching.
Because of the large number of concepts involved in the following discussion, this paper begins with a background section describing the semantic description framework for web services, the geospatial service taxonomy, and the basic concepts involved in service matching. In Section 3, the approach to semantically describing the taxonomy of geospatial services is described in detail. In Section 4, experiments for service description and matching are conducted to illustrate how the proposed approach works. In Section 5, the effectiveness and efficiency of the proposed approach are discussed. In Section 6, the research outcomes are summarized and an overview of open issues and interesting topics for further research is presented.

Semantic description for web services
The Semantic Web is an extension of the existing World Wide Web. It provides meaning to data and services, enabling them to be understood and used properly (Berners-Lee, Hendler, and Lassila 2001). Ontologies are the structural frameworks for organizing information in the Semantic Web. An ontology includes a set of concepts and the relationships between them (Gruber 1993). The Web Ontology Language (OWL) is a semantic markup language for publishing and sharing ontologies on the web, and it is recommended by the World Wide Web Consortium (W3C). OWL provides a framework to describe classes of concepts involved in Web applications and the relationships between these concepts (McGuinness and van Harmelen 2004). To describe semantics in the web service domain, an ontology called OWL-S (OWL for Services) was proposed (Martin et al. 2004). OWL-S supplies web service providers with a core set of concepts for describing the properties and capabilities of their Web services in unambiguous, computer-interpretable form (Martin et al. 2007).
OWL-S defines a general class "Service", which serves as an organizational point for describing a service. The "Service" class has three properties, "presents", "describedBy", and "supports", which link it to three description classes: "ServiceProfile", "ServiceModel", and "ServiceGrounding", respectively (Martin et al. 2004). Each of the three classes is a part of the aggregated class "Service". The "ServiceProfile" describes what the service does, including the function of the service, the application scope of the service, the rank of service quality, and the requirements for using the service. The "ServiceModel" describes how to use the service, including what input the service requires and what output or change the service will produce. The "ServiceGrounding" describes how a computer program invokes the service, including the communication protocol for accessing the service, the message formats for making requests, and the means of data exchange (Martin et al. 2004).
Bol. Ciênc. Geod., sec. Artigos, Curitiba, v. 21, nº 3, p. 515-530, jul-set, 2015.
As defined in OWL-S, a service profile is used to characterize a service for the purposes of advertising, discovery, and selection. Service developers publish service profiles in service registries. Service consumers check a service profile to find out whether the service can fulfill their requirements. However, manually checking all service profiles in a registry is not effective. To find a service in the registry effectively, OWL-S suggests constructing a hierarchy of subclasses of the "ServiceProfile" class according to the application domain. Additional properties can be defined in the constructed service profile classes, and these properties are inherited by their subclasses. The construction of a subclass hierarchy thus provides a formal approach to defining a service taxonomy.

Geospatial Services Taxonomy
A taxonomy is usually provided as a tree structure of classes along with descriptions for each class. The tree structure of a taxonomy indicates the relationships between classes. In a taxonomy, a super-class has one or more sub-classes, while a sub-class has only one super-class. The description of a class mainly concerns the function and input/output interface. The description of a sub-class is more specific than that of its super-class. Services providing similar functions are categorized into the same class. Services providing distinct functions are categorized into different classes. Conversely, the class of a service gives a common understanding of what the service can do and how it works.
A taxonomy for geospatial services is provided by the International Organization for Standardization (ISO) in the ISO 19119 standard Geographic Information-Services (ISO 2005). This taxonomy categorizes geospatial services into a three-level hierarchy with six main classes at the top level. Geographic processing services, one of the six classes, are further subdivided into four subclasses. If a service is named after any of these ISO service classes, it must provide the functionality defined in ISO 19119. For example, a service named coordinate transformation service should perform coordinate transformations as defined in the ISO standard. ISO 19119 gives a non-exhaustive list of services in each class and sub-class, along with the names and descriptions of the classes.
The taxonomy for geospatial services can effectively help identify candidate geospatial services. However, the definitions and descriptions of service taxonomies, such as those in ISO 19119, are provided in human-readable documents, which computers cannot readily use for service discovery (Bai, Di, and Wei 2009). Some attempts have been made to describe the sub-class relationships between different types of geospatial services (Bai, Di, and Wei 2009, Di et al. 2005, Yue et al. 2007). The definitions of classes, such as the common attributes and restrictions of the services in a class, have not yet been semantically described. In other words, the semantic descriptions of taxonomies in these studies are incomplete.

Service Matching
Service matching is the process of identifying the correspondence between a service consumer's requirements and the descriptions of service instances, according to various similarity measurements (Talantikite, Aissani, and Boudjlida 2009). In a service-oriented architecture, service matching plays an important role in fulfilling the demands of service requesters with suitable services published by service providers. An efficient and effective matching method will boost the application of geospatial services. The efficiency of a service matching method is usually measured by the time required to process a service request, and its effectiveness is mainly evaluated by the recall rate and the precision of the matching result (Lutz 2007). Recall rate measures how many relevant services are found in the result compared to the number of actually relevant services presented for matching. Precision measures how many services in the matching result are actually relevant.
In the field of geospatial services, several matching frameworks have been proposed to improve the efficiency and effectiveness of service matching. The inputs and outputs of geospatial services were used for service matching in some early studies (Paolucci et al. 2002, Klien, Lutz, and Kuhn 2006). The inputs and outputs of all candidate services are exhaustively compared with the inputs and outputs of the service request to find matching services. Although this method has high precision, it is time-consuming because a large number of geospatial services are usually available for matching and the data types of their inputs and outputs vary (Luo, Wang, and Chen 2010). In other studies, the classification of geospatial services is suggested as an additional criterion for service matching to shorten the matching time (Bai, Di, and Wei 2009, Luo, Wang, and Chen 2010, Wang, Li, and Luo 2012). In these studies, the service class specified in the service request and the classes of service instances are compared to find matching services. Classification-based matching is usually applied at the beginning of service matching to quickly exclude large numbers of unrelated services, and other matching methods, such as I/O-based matching, can be applied in the following steps to further refine the result (Luo, Wang, and Chen 2010).
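As a small illustration of the two effectiveness measures, the following sketch computes recall rate and precision from sets of service identifiers. The function name and the identifiers are hypothetical and serve only to make the definitions concrete.

```python
def recall_precision(returned, relevant):
    """Return (recall, precision) for one matching result.

    returned: set of service ids in the matching result
    relevant: set of service ids that are actually relevant
    """
    hits = returned & relevant  # relevant services that were found
    recall = len(hits) / len(relevant) if relevant else 0.0
    precision = len(hits) / len(returned) if returned else 0.0
    return recall, precision


# Hypothetical case: four relevant services exist, three services are
# returned, and two of the returned services are relevant.
recall, precision = recall_precision({"s1", "s2", "s9"}, {"s1", "s2", "s3", "s4"})
```

In this hypothetical case half of the relevant services are found, and two of the three returned services are relevant.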

Approach
In this paper, an approach to semantically describing the taxonomy of geospatial services is proposed. A semantic description framework for the taxonomy of geospatial services is designed in this approach. In this framework, the service classes in the geospatial service taxonomy are organized into a profile-based hierarchy, and the definitions of all service classes are described as ontology classes. Based on the semantic description of the taxonomy, methods to improve the semantic registration and the matching of geospatial services are provided. Figure 1 illustrates the overall view of this approach. The details are described in the following subsections.

The hierarchical representation of service classes
In the proposed semantic description framework for the taxonomy of geospatial services, profile-based hierarchies are constructed to describe the categorization of services. As briefly introduced in the background section, the OWL-S ontology provides an OWL class named "ServiceProfile" as the top class for the profiles of all web services. In the framework for the taxonomy of geospatial services, sub-classes of the "ServiceProfile" class are defined to represent geospatial service classes. For example, the top class of the geospatial service taxonomy is represented as an OWL class "GeographicService", which inherits from the class "ServiceProfile".
The sub-class relationship in the taxonomy is represented by the property "subClassOf" in OWL. For a specified class in the geospatial taxonomy, a corresponding profile class is created and connected to its super class with the "subClassOf" property. All service profile classes are organized into a hierarchy through the "subClassOf" property. Figure 2 illustrates the OWL code segment that describes the hierarchy of geospatial service classes. For example, in the last three lines of the code segment, the class "OverlayAnalysisService" is defined as a subclass of the class "Spatial Proximity Analysis Service".
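The hierarchy built from "subClassOf" statements can be pictured with a short Python sketch. It is not the OWL code of the paper; it only mirrors the child-to-parent links in a dictionary. The class names come from the text, the direct placement of "SpatialProximityAnalysisService" under "GeographicService" is an assumption made for illustration, and the helper functions are hypothetical.

```python
# Child -> parent links, mirroring OWL "subClassOf" statements.
SUB_CLASS_OF = {
    "GeographicService": "ServiceProfile",
    "SpatialProximityAnalysisService": "GeographicService",  # placement assumed
    "OverlayAnalysisService": "SpatialProximityAnalysisService",
}


def super_classes(cls):
    """Return all ancestors of cls, nearest first."""
    chain = []
    while cls in SUB_CLASS_OF:
        cls = SUB_CLASS_OF[cls]
        chain.append(cls)
    return chain


def is_sub_class_of(cls, ancestor):
    """True if cls equals ancestor or inherits from it transitively."""
    return cls == ancestor or ancestor in super_classes(cls)
```

Because each sub-class has exactly one super-class in a taxonomy, a plain dictionary is enough here; an ontology reasoner would derive the same transitive relation from the OWL description.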

The semantic definition of service classes
In the proposed semantic description framework for geospatial service taxonomies, each service class is defined as a service profile class in an ontology. The definition of a service class specifies the super class of this class, the property description schema, and the property restrictions. The super class indicates the position of the service class in the profile-based hierarchy, which was introduced in Section 3.1. The property description schema defines which properties should be described for the instances of the class. The property restrictions define the rules for values and cardinality that all instances of this class must comply with.
The property description schema is provided in the form of OWL properties. The definition of an OWL property has two parts, domain and range. The domain of a property limits the individuals to which the property can be applied, and the range of a property limits the individuals that the property may have as its value (McGuinness and van Harmelen 2004). When a property for a specified service class is defined, the domain is set to the profile class of that service class, and the range is set to the data type or class of the property value. The OWL properties defined for a profile class are inherited by all its sub-classes. In OWL-S, some basic properties, such as "hasInput" and "hasOutput", are defined for the class "ServiceProfile". For geospatial services, more properties are required in the description of profile classes, and different kinds of geospatial services require different properties. For example, to describe the data formats supported by geospatial services, an OWL property "supportFormat" is defined with its domain set to the class "GeographicService" and its range set to the class "DataFormat", as shown in Figure 3.
A property restriction is provided in a special form that defines an anonymous class including all individuals that satisfy the restriction. Two kinds of property restrictions are defined in OWL: cardinality constraints and value constraints. A cardinality constraint constrains the number of values a property can take. For example, the service class "VectorOverlay" has a cardinality constraint on the property "suportOverlayOperation", which defines that a service in this class must support at least one kind of overlay operation. The corresponding code segment is shown in Figure 4. A value constraint constrains the range of the property. For example, the service class "Intersect" constrains the value of the property "suportOverlayOperation" to "intersect", which means that a service in this class must support the overlay intersect operation. The corresponding code segment is shown in Figure 5.
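The two kinds of restriction can be sketched in Python as follows. This is a simplified stand-in for the OWL restrictions, not the code of Figures 4 and 5: the ProfileClass type and the violations function are hypothetical, domain and range checks are left out, and treating "Intersect" as a sub-class of "VectorOverlay" is an assumption based on the discussion later in the paper.

```python
from dataclasses import dataclass, field


@dataclass
class ProfileClass:
    name: str
    parent: "ProfileClass | None" = None
    min_cardinality: dict = field(default_factory=dict)  # property -> minimum count
    has_value: dict = field(default_factory=dict)        # property -> required value

    def restrictions(self):
        """Collect restrictions from this class and all its super classes."""
        mins, values = {}, {}
        cls = self
        while cls is not None:
            for prop, n in cls.min_cardinality.items():
                mins[prop] = max(n, mins.get(prop, 0))
            for prop, v in cls.has_value.items():
                values.setdefault(prop, v)  # the most specific class wins
            cls = cls.parent
        return mins, values


def violations(cls, description):
    """List restriction violations; description maps property -> list of values."""
    mins, values = cls.restrictions()
    problems = []
    for prop, n in mins.items():
        if len(description.get(prop, [])) < n:
            problems.append(f"{prop}: needs at least {n} value(s)")
    for prop, v in values.items():
        if v not in description.get(prop, []):
            problems.append(f"{prop}: must include {v!r}")
    return problems


# Cardinality constraint on VectorOverlay, value constraint on Intersect.
vector_overlay = ProfileClass("VectorOverlay",
                              min_cardinality={"suportOverlayOperation": 1})
intersect = ProfileClass("Intersect", parent=vector_overlay,
                         has_value={"suportOverlayOperation": "intersect"})
```

A description of an "Intersect" service that lists only a union operation would be reported as violating the value constraint, while an empty description of a "VectorOverlay" service would violate the cardinality constraint.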

The class-specified description for service instances
In the proposed approach, the semantic descriptions of service classes in the geospatial service taxonomy are used to form a class-specified schema for the description of service instances. In OWL-S, services are described as OWL individuals of the class "Service". A service instance is connected to a profile instance by the property "presents". Based on the definitions of service profiles for the different classes in the taxonomy, instances of service profiles are created and connected to the service instances. For example, to describe a service providing a vector overlay function, a new OWL individual of the class "VectorOverlay" is created and the associated properties are described, as shown in Figure 6. The definitions of profile classes simplify and normalize the description of service instances. The profile class clearly presents the properties that need to be described for a service instance. Some properties may already have a value according to the defined constraints, so only the properties without values need to be filled in. The values of properties can be validated against the property description schema and the constraints defined in the profile classes. Thus, mistakes in the description of service instances can be prevented.

The improved matching method for geospatial services
Based on the semantic description of the service taxonomy, the classification-based matching method is improved. In the improved method, the profile-based hierarchy and additional constraints on properties are used along with service classes to find a proper service, as shown in Figure 7. The profile-based hierarchy is used to determine whether a service is an instance of the specified class. The profile definition for the specified class indicates which properties are included in the description of the service instances. Constraints on these properties are then used as filters for matching service instances. This method has two steps: matching by class and matching by property description. In the first step, the profile instances that are members of the service profile class defined for the requested class are discovered. For example, if the class "Buffer Service" is requested, all instances of the profile class "Buffer Service Profile" are found during the first step. In the second step, the service profile instances are checked to determine whether their properties satisfy the property constraints set by the service requester. For example, if the property "inputGeometryType" of "Buffer Service Profile" is constrained to "Polygon", only the profile instances with the property "inputGeometryType" set to "Polygon" are kept in the matching results.
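The two steps can be sketched in Python as follows. The registry content, the identifier "BufferServiceProfile" (written without spaces), the placement of the classes under "GeographicService", and the function names are assumptions made for illustration; the sketch does not reproduce the prototype described in the next section.

```python
# Assumed sub-class links (child -> parent).
SUB_CLASS_OF = {
    "BufferServiceProfile": "GeographicService",
    "VectorOverlay": "GeographicService",
}

# Hypothetical registered profile instances.
PROFILES = [
    {"id": "buffer-a", "cls": "BufferServiceProfile", "inputGeometryType": "Polygon"},
    {"id": "buffer-b", "cls": "BufferServiceProfile", "inputGeometryType": "Point"},
    {"id": "overlay-a", "cls": "VectorOverlay", "inputGeometryType": "Polygon"},
]


def is_member(cls, target):
    """True if cls is target or a transitive sub-class of target."""
    while cls is not None:
        if cls == target:
            return True
        cls = SUB_CLASS_OF.get(cls)
    return False


def match(target_class, constraints):
    """Two-step matching: by class, then by property description."""
    # Step 1: keep profile instances that are members of the requested class.
    candidates = [p for p in PROFILES if is_member(p["cls"], target_class)]
    # Step 2: keep candidates whose properties satisfy every constraint.
    return [p["id"] for p in candidates
            if all(p.get(prop) == value for prop, value in constraints.items())]
```

Requesting "BufferServiceProfile" with "inputGeometryType" constrained to "Polygon" keeps only the first buffer profile in this hypothetical registry.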

Experiment
A prototype system was designed and implemented to illustrate the significance of the proposed method. The prototype system includes a service registry and a service matcher. The service registry enables the semantic registration and retrieval of geospatial services. The service matcher is a component that processes service requests. When a service request is submitted to the registry, the service matcher finds the appropriate services using the matching method proposed in this paper. A series of experiments is presented in the following three sub-sections, covering the description and registration of the geospatial service taxonomy, the registration of geospatial services based on service class, and the semantic matching of geospatial services based on service class.

Description and registration of geospatial service taxonomies
In this experiment, the ISO 19119 taxonomy is described using the method proposed in Section 3. The open-source ontology editor Protégé is used to create the semantic description of the service taxonomy. The semantic description is then submitted to the registry. The taxonomy is displayed as a tree view in the interface of the service registry. Figure 8 shows the interface for registering a geospatial service taxonomy.

The semantic registration of geospatial services
The service registry provides a function to register geospatial services by class. In the prototype system developed for this research, the geospatial service taxonomy is displayed on the geospatial service registration page. As shown in Figure 9, when a service provider registers a geospatial service, the provider first chooses a class for the service from the taxonomy displayed on the left, and a form is then displayed to guide the provider through the description of the service profile. The class-specified profile defined in the semantic description of the taxonomy is used to generate this form. In this form, only the properties associated with that class are listed. Properties with predefined values are filled in automatically, and users can refine these values as they wish. All values that users enter are checked against the property restrictions. This form simplifies the description of a service profile. When a user finishes filling in the blanks and submits the form, the service profile description is complete. Additional descriptions, such as the service process and the service grounding, are required to finish the semantic registration of a geospatial service.

The semantic matching of geospatial services
Semantic matching of geospatial services is performed in this experiment based on service classifications. The matching process begins after a service request is submitted to the service registry. The service request consists of a service class and other descriptions, such as inputs, outputs, preconditions, and effects. The service matching process in the registry consists of two steps. The first step is to search for the services that semantically belong to the target class. The second step is to compute the semantic similarity between the inputs, outputs, preconditions, and effects (IOPE) of the requested service and the IOPE of the services returned by the first step, and to choose the services with a semantic similarity above a predetermined threshold as the matching result for the requested service.
A service request with the class "VectorOverlay" is used as a case study. The required service should provide functions for overlaying two vector layers. As described previously, the service request is sent to the registry, and the registry performs a two-step service matching. The result of the first step is shown in Figure 10. The numbers in brackets are the numbers of services that were returned. The result contains all services that belong to the class "VectorOverlay", its super-classes, and its sub-classes. The returned services take part in the computation of semantic similarity in the second step.
A set of experiments is conducted to compare the improved classification-based matching with other service matching methods. There are 100 service instances registered in the test service registry, all with complete semantic descriptions. Three service matching methods are tested. The first is direct matching by semantic similarity. The second is classification-based two-step matching, which uses the name of the class in the first step. The third is two-step matching with the improved classification-based matching in the first step. Ten different service requests are processed by all three methods. The average matching time, recall rate, and precision of each method are recorded in Table 1. The precision of the matching results is high for all three methods. The two-step matching method based on class names consumes less time but yields a lower recall rate than the direct semantic similarity matching method. The use of the semantic description in classification-based matching greatly improves the recall rate, although slightly more time is consumed.
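The second step can be sketched as a threshold filter over the candidates returned by the first step. The paper does not specify how semantic similarity is computed, so the sketch below substitutes a simple set-overlap (Jaccard) measure as a placeholder; the measure, the equal weighting of the four IOPE parts, the threshold of 0.5, and all names are assumptions.

```python
def jaccard(a, b):
    """Set-overlap similarity in [0, 1]; a placeholder for semantic similarity.

    Two empty sets are treated as identical (similarity 1.0).
    """
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if (a | b) else 1.0


def iope_similarity(request, service):
    """Average the similarity over inputs, outputs, preconditions, and effects."""
    parts = ("inputs", "outputs", "preconditions", "effects")
    total = sum(jaccard(request.get(p, ()), service.get(p, ())) for p in parts)
    return total / len(parts)


def second_step(request, candidates, threshold=0.5):
    """Keep first-step candidates whose IOPE similarity is above the threshold."""
    return [s["id"] for s in candidates if iope_similarity(request, s) > threshold]


# Hypothetical request and first-step candidates.
request = {"inputs": ["VectorLayer"], "outputs": ["VectorLayer"]}
candidates = [
    {"id": "s1", "inputs": ["VectorLayer"], "outputs": ["VectorLayer"]},
    {"id": "s2", "inputs": ["RasterLayer"], "outputs": ["RasterLayer"]},
]
result = second_step(request, candidates)
```

In a real registry the placeholder measure would be replaced by an ontology-based similarity between the concepts that describe each IOPE part.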

Discussion
The discovery of geospatial services requires formalized descriptions of all services and efficient matching methods. Although several geospatial service taxonomies and classification-based matching methods have been proposed, their application in service discovery is limited. The proposed approach to the semantic description of geospatial service taxonomies enhances service discovery by simplifying the description of service instances and by improving the efficiency and effectiveness of the matching method.
The semantic description and registration of geospatial service instances are simplified by the semantic description of the service taxonomy. The semantic description of a geospatial service instance is usually complex because of the large number of class-specified properties. Without a clear description schema for each class, service providers cannot provide complete and correct semantic descriptions of service instances. Because of this complexity and inefficiency, only a few users provide semantic descriptions for geospatial service instances. The proposed approach improves this situation. The complete description of a geospatial service taxonomy includes a profile class for each service class, in which all aspects that need to be described for a service of that class are predefined. Service providers only need to fill in proper values for these properties. All values can be validated by the rules defined for each service class. Complete and correct descriptions make service instances much easier to discover.
The matching time is greatly shortened by the two-step matching method. The complete and correct descriptions of service instances and the taxonomy make it possible to use the service class as a criterion in service matching. The two-step service matching method is more efficient than the traditional semantic-based service matching method. With the traditional method, the registry computes the semantic similarity between the IOPE of a requested service and that of every candidate service in the registry, and chooses the services with a semantic similarity higher than a predetermined threshold as matching results. If there are a large number of candidate services in the registry, this computation is time-consuming. With the semantic description of geospatial service classes and their hierarchical structure, the service matching process can be divided into two steps. A large number of services that belong to unrelated classes are filtered out in the first step, so the total matching computation is reduced and the service matching process becomes more efficient.
The recall rate of the matching result is improved by using the semantic description of the taxonomy in the two-step matching method. The hierarchical structure of classes and the definitions of classes provide more explicit criteria for matching, so more qualified services are found using the improved method. For example, if the class specified in the service request is "Intersect", a service that computes the intersection of two vector layers is needed. With the class name-based method, only services of the class "Intersect" are returned. With the improved method, some services of the class "Vector Overlay" that satisfy the user's request are also delivered, such as services that support not only the intersect operation but also several other overlay operations, for image data as well as vector data. With the semantic description of taxonomies, services that have the required properties but belong to super classes or sub classes of the target class are also found to satisfy the user's requirement.
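The difference between the two first-step strategies in the "Intersect" example can be sketched as follows. The service records and function names are hypothetical, only direct super and sub classes are considered, and the value constraint of "Intersect" on "suportOverlayOperation" is taken from the class definition described in Section 3.

```python
SUPER_CLASS = {"Intersect": "VectorOverlay"}  # assumed sub-class link
VALUE_CONSTRAINT = {"Intersect": ("suportOverlayOperation", "intersect")}

# Hypothetical registered services.
SERVICES = [
    {"id": "svc-intersect", "cls": "Intersect",
     "suportOverlayOperation": {"intersect"}},
    {"id": "svc-overlay", "cls": "VectorOverlay",
     "suportOverlayOperation": {"intersect", "union"}},
    {"id": "svc-union", "cls": "VectorOverlay",
     "suportOverlayOperation": {"union"}},
]


def match_by_name(target):
    """Class name-based matching: the class must be exactly the target."""
    return [s["id"] for s in SERVICES if s["cls"] == target]


def match_with_taxonomy(target):
    """Also accept related classes whose instances satisfy the target's constraint."""
    prop, value = VALUE_CONSTRAINT[target]
    related = {target, SUPER_CLASS.get(target)}
    related |= {c for c, parent in SUPER_CLASS.items() if parent == target}
    return [s["id"] for s in SERVICES
            if s["cls"] in related and value in s.get(prop, set())]
```

In this hypothetical registry the name-based method returns only the "Intersect" service, whereas the taxonomy-aware method also returns the general overlay service that supports the intersect operation, which is the effect on recall described above.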

Conclusions
Making the semantic description of geospatial services easier and making service matching more efficient and effective are pressing demands in the application of geospatial services. In this paper, an approach is proposed to semantically describe the taxonomy of geospatial services. Using this approach, the hierarchical structure and the definitions of classes in the service taxonomy are formally and explicitly described in a computer-readable format. The semantic description of the taxonomy is used to improve the discovery of geospatial services. The semantic description and registration of geospatial service instances are simplified by the description schema derived from the class definitions in the taxonomy. The classes of service instances and the hierarchical structure of classes in the semantic description of the taxonomy enable the classification-based matching method, which greatly reduces the matching time. The hierarchical structure of classes and the definitions of classes provide more explicit criteria for matching, which enhances the effectiveness of the matching method.
Several issues require further attention to make the proposed method more comprehensive. Additional taxonomies for geospatial services should be considered. Domain knowledge related to geospatial service classification should also be linked to the semantic description schema to make it friendlier for service consumers.

Figure 1: The approach for semantic description of geospatial services.

Figure 2: An example of hierarchical description in service taxonomy.

Figure 3: An example of property definition.

Figure 4: An example of cardinality constraint on property.

Figure 5: An example of value constraint on property.

Figure 6: An example of profile description for service instance.

Figure 7: The process of the improved method for classification-based matching.

Figure 8: The interface of registering geospatial service taxonomy.

Figure 9: The registration of geospatial services based on service class.

Figure 10: The result page of the semantic retrieval of geospatial services.

Table 1: Comparison of the improved method to other matching methods.