Testbed-12 Semantic Enablement Engineering Report

Publication Date: 2017-05-12

Approval Date: 2016-12-07

Posted Date: 2016-11-03

Reference number of this document: OGC 16-046r1

Reference URL for this document: http://www.opengis.net/doc/PER/t12-A006

Category: Public Engineering Report

Editor: Martin Klopfer

Title: Testbed-12 Semantic Enablement Engineering Report

OGC Engineering Report

COPYRIGHT

WARNING

This document is an OGC Public Engineering Report created as a deliverable of an initiative from the OGC Innovation Program (formerly OGC Interoperability Program). It is not an OGC standard and not an official position of the OGC membership. It is distributed for review and comment. It is subject to change without notice and may not be referred to as an OGC Standard. Further, any OGC Engineering Report should not be referenced as required or mandatory technology in procurements. However, the discussions in this document could very well lead to the definition of an OGC Standard.

LICENSE AGREEMENT

Permission is hereby granted by the Open Geospatial Consortium, ("Licensor"), free of charge and subject to the terms set forth below, to any person obtaining a copy of this Intellectual Property and any associated documentation, to deal in the Intellectual Property without restriction (except as set forth below), including without limitation the rights to implement, use, copy, modify, merge, publish, distribute, and/or sublicense copies of the Intellectual Property, and to permit persons to whom the Intellectual Property is furnished to do so, provided that all copyright notices on the intellectual property are retained intact and that each person to whom the Intellectual Property is furnished agrees to the terms of this Agreement.

If you modify the Intellectual Property, all copies of the modified Intellectual Property must include, in addition to the above copyright notice, a notice that the Intellectual Property includes modifications that have not been approved or adopted by LICENSOR.

THIS LICENSE IS A COPYRIGHT LICENSE ONLY, AND DOES NOT CONVEY ANY RIGHTS UNDER ANY PATENTS THAT MAY BE IN FORCE ANYWHERE IN THE WORLD. THE INTELLECTUAL PROPERTY IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, AND NONINFRINGEMENT OF THIRD PARTY RIGHTS. THE COPYRIGHT HOLDER OR HOLDERS INCLUDED IN THIS NOTICE DO NOT WARRANT THAT THE FUNCTIONS CONTAINED IN THE INTELLECTUAL PROPERTY WILL MEET YOUR REQUIREMENTS OR THAT THE OPERATION OF THE INTELLECTUAL PROPERTY WILL BE UNINTERRUPTED OR ERROR FREE. ANY USE OF THE INTELLECTUAL PROPERTY SHALL BE MADE ENTIRELY AT THE USER’S OWN RISK. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR ANY CONTRIBUTOR OF INTELLECTUAL PROPERTY RIGHTS TO THE INTELLECTUAL PROPERTY BE LIABLE FOR ANY CLAIM, OR ANY DIRECT, SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES, OR ANY DAMAGES WHATSOEVER RESULTING FROM ANY ALLEGED INFRINGEMENT OR ANY LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR UNDER ANY OTHER LEGAL THEORY, ARISING OUT OF OR IN CONNECTION WITH THE IMPLEMENTATION, USE, COMMERCIALIZATION OR PERFORMANCE OF THIS INTELLECTUAL PROPERTY.

This license is effective until terminated. You may terminate it at any time by destroying the Intellectual Property together with all copies in any form. The license will also terminate if you fail to comply with any term or condition of this Agreement. Except as provided in the following sentence, no such termination of this license shall require the termination of any third party end-user sublicense to the Intellectual Property which is in force as of the date of notice of such termination. In addition, should the Intellectual Property, or the operation of the Intellectual Property, infringe, or in LICENSOR’s sole opinion be likely to infringe, any patent, copyright, trademark or other right of a third party, you agree that LICENSOR, in its sole discretion, may terminate this license without any compensation or liability to you, your licensees or any other party. You agree upon termination of any kind to destroy or cause to be destroyed the Intellectual Property together with all copies in any form, whether held by you or by any third party.

Except as contained in this notice, the name of LICENSOR or of any other holder of a copyright in all or part of the Intellectual Property shall not be used in advertising or otherwise to promote the sale, use or other dealings in this Intellectual Property without prior written authorization of LICENSOR or such copyright holder. LICENSOR is and shall at all times be the sole entity that may authorize you or any third party to use certification marks, trademarks or other special designations to indicate compliance with any LICENSOR standards or specifications.

This Agreement is governed by the laws of the Commonwealth of Massachusetts. The application to this Agreement of the United Nations Convention on Contracts for the International Sale of Goods is hereby expressly excluded. In the event any provision of this Agreement shall be deemed unenforceable, void or invalid, such provision shall be modified so as to make it valid and enforceable, and as so modified the entire Agreement shall remain in full force and effect. No decision, action or inaction by LICENSOR shall be construed to be a waiver of any rights or remedies available to it.

None of the Intellectual Property or underlying information or technology may be downloaded or otherwise exported or reexported in violation of U.S. export laws and regulations. In addition, you are responsible for complying with any local laws in your jurisdiction which may impact your right to import, export or use the Intellectual Property, and you represent that you have complied with any regulations or registration procedures required by applicable law to make this license enforceable.

Table of Contents

1. Introduction
2. References
3. Terms and definitions
4. Overview
5. OGC Semantic Enablement
- 5.1. Geospatial Semantics
- 5.2. Semantic Web Technologies
6. Previous Work
7. Semantics and Links in REST APIs and the General Feature Model
- 7.1. Links Between Resources
- 7.2. REST API Description Languages
8. Semantic Enablement in Testbed-12
9. Future Work
Appendix A: Revision History
Appendix B: Bibliography

Abstract

The requirement for capabilities supporting semantic understanding and reasoning in geospatial intelligence (GEOINT) is an all-encompassing paradigm shift from the past. Standards play a critical role in ensuring this is accomplished in a consistent and repeatable manner. Semantic standards and services supporting semantic capabilities are at a relatively early stage of development. Interoperability between semantic standards for encoding relationships and Web based services for discovery, access, retrieval and visualization of those relationships requires more testing and evaluation. This engineering report (ER) highlights the key findings and discussions from Testbed-12 that enable semantic interoperability, including semantic mediation, schema registries, and SPARQL endpoints. It references key findings from the Semantic Portrayal ER and helps to understand the current OGC discussion on semantics in general.

Business Value

With the opening of previously closed environments, where locally defined semantics using arbitrary approaches have been sufficient, and the increasing ad-hoc re-use of externally provided data and processing services, standardized provision of semantics has become an essential element of interoperability. Only standardized approaches enable efficient automated integration of geospatial information and ensure correct application of externally provided processes, functions, or operations. Standardized explicit semantics enable provenance of data and can improve visualizations and querying of geospatial data. Each system that uses or produces data or offers services across limited local contexts gains in business value if the meaning of data and services can be uniquely obtained in a cost-efficient automated fashion.

What does this ER mean for the Working Group and OGC in general

This ER summarizes the work performed in Testbed-12 and provides an outlook on possible future activities. It serves as a starting point for the OGC community in general and the Geosemantics Domain Working Group (DWG) in particular to understand some of the latest discussions on semantics in geospatial contexts. It provides a number of references to more detailed material to facilitate more in-depth research and analysis.

How does this ER relate to the work of the Working Group

This ER does not reflect on the work of the Geosemantics DWG. It concentrates on summarizing those activities performed in Testbed-12 that are relevant to the working group.

Keywords

ogcdocs, testbed-12, semantics, RDF, OWL, SHACL

Proposed OGC Working Group for Review and Approval

Geosemantics DWG

1. Introduction

1.1. Scope

This engineering report (ER) summarizes the work performed in Testbed-12 on modeling and serialization of geospatial semantics in the context of heterogeneous distributed geospatial information processing systems. It serves as a starting point for the OGC community in general and the Geosemantics Domain Working Group (DWG) in particular to understand some of the latest discussions on semantics in geospatial contexts. It provides a number of references to more detailed material to facilitate more in-depth research and analysis. It also provides an outlook on possible future activities.

1.2. Document contributor contact points

All questions regarding this document should be directed to the editor or the contributors:

Table 1. Contacts
Name	Organization
Martin Klopfer	Frisia IT

1.3. Future Work

No future work on this document is planned, but a number of work items and recommendations have been identified that should be addressed in future OGC interoperability program initiatives (see Future Work).

1.4. Foreword

Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. The Open Geospatial Consortium shall not be held responsible for identifying any or all such patent rights.

Recipients of this document are requested to submit, with their comments, notification of any relevant patent claims or other intellectual property rights of which they may be aware that might be infringed by any implementation of the standard set forth in this document, and to provide supporting documentation.

2. References

The following documents are referenced in this document. For dated references, subsequent amendments to, or revisions of, any of these publications do not apply. For undated references, the latest edition of the normative document referred to applies.

OGC 16-059, Testbed-12 Semantic Portrayal, Registry, Mediation Services ER
OGC 16-062, Testbed-12 Catalogue, SPARQL ER
OGC 16-020, Testbed-12 UGAS ShapeChange ER
OGC 16-051, Testbed-12 Javascript, JSON, JSON-LD ER
OGC 16-039, Testbed-12 Aviation Semantics ER

3. Terms and definitions

For the purposes of this report, the definitions specified in Clause 4 of the OWS Common Implementation Standard [OGC 06-121r9] shall apply. In addition, the following terms and definitions apply.

3.1. Semantics

The meaning of expressions

3.2. Syntax

Way of expressing the meaning

3.3. Abbreviated terms

API Application Programming Interface
LDP Linked Data Platform
NLP Natural Language Processing
LOV Linked Open Vocabularies
OWL Web Ontology Language
OWL-S OWL-S: Semantic Markup for Web Services
POD Project Open Data
RDF Resource Description Framework
RDF-QB RDF Data Cube
RIF Rule Interchange Format
SHACL Shapes Constraint Language
SKOS Simple Knowledge Organization System
SPARQL SPARQL Protocol and RDF Query Language
SVG Scalable Vector Graphics
SWRL Semantic Web Rule Language
VoID Vocabulary of Interlinked Datasets

4. Overview

This ER serves as an entry point to Testbed-12 semantics. It integrates discussions from all threads that addressed work items directly relevant to semantics. These threads are Aviation (AVI) and Linked Data and Advanced Semantics for Data Discovery and Dynamic Integration (LDS).

The main document starts with a short overview of semantics for geospatial systems, followed by an overview of previous work from Testbed-10 and Testbed-11. It discusses the current situation in OGC and outlines areas that would benefit most from Semantic Enablement. Chapter Semantic Enablement in Testbed-12 summarizes the semantic enablement work performed in Testbed-12 and helps to understand these activities in the broader OGC semantic enablement context. In detail, it describes the main outcomes from discussions on semantics as part of semantic portrayal, catalogs, UML to RDF mapping, JSON-LD, and the aviation thread. Chapter Future Work provides an outlook on future activities that require more research to further improve OGC semantic enablement.

The following figure provides an overview of the various work items addressed herein.

Figure 1. Overview of Semantic Work Items

Within the LDS thread, the REST discussion addressed association types; the GFM (General Feature Model) discussion addressed associations as first class objects that can have properties and associations to other objects; the Catalog sub-thread experimented with enhanced models to support semantic search on traditional catalog services; and semantic mappings were explored as part of the Portrayal and schema registry experiments. The overall goal was to enhance the OGC architecture towards higher levels of semantic interoperability without revolutionary changes that would void all existing technologies, products, and operational systems. Where applicable, this report highlights where this journey may lead.

5. OGC Semantic Enablement

The goal of this engineering report is to understand how OGC technology can adopt technologies and best practices from the Semantic Web in order to improve the users’ experience when working with spatiotemporal data. The term Semantic Web is used here rather broadly and includes aspects that others may categorize as Linked Data.

5.1. Geospatial Semantics

Semantic issues in spatial data sharing and service interoperability have been recognized in the literature for a long time. Bishr summarized interoperability issues in 1998 under the terms semantic heterogeneity, schematic heterogeneity, and syntactic heterogeneity. Though the latter two have been addressed fairly successfully with GML and OGC Web service interface standards, semantic heterogeneity still causes several problems. These include:

discovery of data sets and services based on keywords,
rigid metadata structures,
missing semantics on technical terms,
missing matching capabilities for equivalent or related terms or symbols

The Spatial Data on the Web Working Group is investigating some of these aspects, so it is worth consulting their website for additional information.

5.2. Semantic Web Technologies

As mentioned before, the OGC Testbed-10 Cross Community Interoperability (CCI) Ontology Engineering Report and the OGC Testbed-11 Implementing Linked Data and Semantically Enabling OGC Services Engineering Report already provide a good overview of existing Semantic Web technologies. For that reason, we will use a slightly different approach here and forgo a detailed general introduction to Semantic Web concepts. Instead, we will reflect on the latest OGC Testbed discussions relevant to semantics, put these in context, and reference or explain underlying concepts as necessary.

As discussed above, the overall goal is to move on from syntactic and schematic interoperability to semantic interoperability for geospatial data sharing over the Web, and to search and access geospatial Web services by content rather than just by keywords in metadata. This should be achieved using a number of technologies, such as ontologies, ontology-based semantic descriptions of geographic information, ontology-based catalogue services, and Web service composition. Testbed-12 addressed semantics with a focus on the aspects illustrated in the figure below:

Figure 2. Overview of Semantic Work Items with Testbed-12 work (green) and future strategy (orange)

6. Previous Work

Previous testbeds have addressed a number of semantics aspects already. Major contributions have been made in particular in Testbed-10 and Testbed-11.

During the Testbed-10 effort, Image Matters identified, designed and formalized a set of modular geospatial core and cross-domain ontologies in OWL (mereo-topology, spatial relations, locations, features, temporal ontologies, geometries, CRS, events, measures, etc.). These “ontology components” provide a core ontological foundation for geospatial information that is universally applicable to any domain. These core ontologies leverage existing standard abstract models (ISO 19xxx), but are modularized and adapted to better leverage the expressiveness of OWL and favor reusability. The resulting Geospatial Ontology can be used as a starting point for building the OGC ontological foundation for all common geospatial information that could be used across various domains (E&DM, Law Enforcement and Public Safety, Gazetteer, Hydrology, Aviation, etc). Unfortunately, this work has not attracted the necessary attention and needs to be addressed in future testbed activities.

OGC document 15-054, which resulted from Testbed-11, provides a detailed overview of Semantic Web and Linked Data. Interested readers will find discussions of the following topics in 15-054:

Semantic Web
Linked Data
Vocabularies
Linked Data Platform API
SPARQL and GeoSPARQL
VoID
DCAT
SHACL
SKOS
RDF Data Cube
PROV
Ontologies

Despite the research efforts in Testbeds 10 and 11, there is more work ahead in order to move from the traditionally hierarchical OGC data models towards graph based models and to fully implement the power and flexibility of these graph based models in OGC service and resource contexts.

7. Semantics and Links in REST APIs and the General Feature Model

One of the most active threads in Testbed-12 was the development of the REST user guide and REST Architecture Engineering Report. Though the discussion circled around the syntactical interoperability level at the beginning, it addressed more and more semantic aspects as it matured. Here, we will discuss the general issue when it comes to semantics and REST APIs. A number of aspects discussed in the following paragraphs are not restricted to REST APIs, but apply similarly to W*S implementations. Nevertheless, we use the REST API discussion here to introduce the general situation around hypermedia formats, REST API description languages, the goal of declarative programming, and the crux with custom media types.

7.1. Links Between Resources

To introduce the topic, let’s quickly recap the principles of declarative programming. Declarative programming separates logic from control flow: programmers describe the intended behavior and leave it to the software to render it correctly. The best example is a static Web page written in HTML. The browser reads the declarative HTML and translates it into commands that the browser software understands in order to render the page correctly. If the HTML code contains links to images or other resources, the simple page becomes a hypermedia application, consisting of a set of representations that conform to standard media types and are interconnected via hyperlinks. The user can navigate from one page to another, and the set of pages forms a finite state machine: each page represents an allowed state and each link specifies an allowed transition. If the content of a page is changed, the browser can still render the new page without any changes to the browser code. Translated to geospatial data, this means that data content can be changed without the need to change the client that renders or processes the data; clients become generic and support all data that complies with a specific set of rules and constraints.
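The state-machine view of hypermedia can be illustrated with a minimal Python sketch. The page URIs, the relation names, and the `follow` helper are hypothetical and not taken from any Testbed-12 component; the sketch only shows that a generic client needs no code changes when the linked content changes.

```python
# Hypothetical in-memory "web": each resource is a state, each link a transition.
PAGES = {
    "/home": {"title": "Home", "links": {"map": "/map", "about": "/about"}},
    "/map": {"title": "Map", "links": {"home": "/home"}},
    "/about": {"title": "About", "links": {"home": "/home"}},
}


def follow(pages, start, relations):
    """Follow a sequence of link relations and return the visited URIs.

    The client knows nothing about individual pages; it only knows how to
    read the generic "links" member of a representation.
    """
    current = start
    visited = [current]
    for rel in relations:
        links = pages[current]["links"]
        if rel not in links:
            # The current state does not offer this transition.
            raise ValueError(f"transition {rel!r} not allowed from {current}")
        current = links[rel]
        visited.append(current)
    return visited


path = follow(PAGES, "/home", ["map", "home", "about"])
print(path)
```

Adding a page or a link to `PAGES` changes which transitions are possible, while `follow` itself stays untouched.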

Back to the REST APIs, the goal is to allow exploring geospatial data the same way as exploring the Web. A client requests a resource from a server and starts following links to other resources. It is the underlying HTTP infrastructure that makes this work, which consists of three key aspects [3]:

resolving of URIs to access other resources
resources that know how to process requests via a uniform interface
clients that know how to render representations that conform to standard Internet Media Types

The challenge now is to move from manual link-following (as a human) to automated link-following (as a machine), with potential intermediate levels in between. From a syntactic interoperability perspective, it is necessary to be able to read the represented linked resource. From a semantic perspective, it is necessary to understand its content and its relationship to the resource it was linked from. The first aspect can be handled by standardized media types. The latter requires two things: custom-made media types, and annotated hyperlinks that define the type of relationship between resources. The following figure illustrates this aspect. Two resources, a mall and a river, are associated in the form that the mall is located north of the river. Syntactic interoperability is ensured by standardized link encodings and Internet media types. To ensure semantic interoperability, it is further necessary to have custom-made association types that explain isNorthOf sufficiently (to a machine). The content itself is either a standard media type, e.g. an image/png of the river, or a custom-made media type, e.g. application/geojson or application/gml+xml; again sufficiently semantically annotated.

Figure 3. Associations between resources

Let’s start with the hypermedia links. Hypermedia formats are particularly important when APIs need to support associations between resources. The ISO 19109 General Feature Model (for a detailed discussion of the GFM, see Testbed-12 engineering report OGC 16-047) defines the metaclass GF_AssociationType to define associations between the principal elements of the model, the GF_FeatureTypes. Because GF_AssociationType is a subclass of GF_FeatureType, associations can be modeled as first class objects. Alternatively, they can be implemented as simple UML associations between two classes that represent different feature types. Ideally, this flexible association mechanism is fully supported by hypermedia formats, which means that associations can be either typed or implemented as objects with properties.
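The two ways of handling associations can be contrasted in a small Python sketch. The class names, URIs, and the `distance_m` property are hypothetical illustrations; they are not part of ISO 19109 or of any Testbed-12 encoding.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Feature:
    """A feature instance identified by a URI (hypothetical minimal model)."""
    uri: str
    name: str


@dataclass
class TypedLink:
    """Simple association: a link annotated only with its relation type."""
    source: Feature
    relation: str
    target: Feature


@dataclass
class AssociationFeature(Feature):
    """Association as a first class object: it is itself a feature, so it
    can carry its own properties and be the target of further links."""
    relation: str = ""
    source: Optional[Feature] = None
    target: Optional[Feature] = None
    properties: dict = field(default_factory=dict)


mall = Feature("http://example.org/mall/1", "Mall")
river = Feature("http://example.org/river/7", "River")

# Variant 1: a typed link, nothing more than (source, relation, target).
simple = TypedLink(mall, "isNorthOf", river)

# Variant 2: the association has its own identity and properties.
rich = AssociationFeature(
    uri="http://example.org/assoc/42",
    name="mall-north-of-river",
    relation="isNorthOf",
    source=mall,
    target=river,
    properties={"distance_m": 250},  # illustrative value, not from the report
)
```

In this sketch only the second variant can be addressed by its own URI and linked to from other resources, which mirrors the subclass relationship between GF_AssociationType and GF_FeatureType.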

Reflecting the current discussion on REST in OGC, hypermedia formats are particularly important for API design, though hypermedia plays an important role in other contexts as well. According to Zazueta [2] and adopted herein, an API supports hypermedia when it does the following:

Each resource points to related resources and available actions using the method native to its protocol (i.e. using URIs over HTTP).
Each resource clearly defines its media type so a client knows what it’s parsing (i.e. by providing a MIME type in the “Content-Type” response header) and responds appropriately when requested to return a specific media type (i.e. when a client lists it specifically in the “Accept” request header).
Each resource points to a machine-readable description of how to parse its media type (i.e. through one of the many emerging API descriptor languages, such as I/O Docs, Swagger (now Open API Initiative (OAI)), RAML and Blueprint).
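The second criterion, returning a representation that matches the media types requested by the client, can be sketched as follows. This is a simplified illustration in Python, not a complete implementation of HTTP content negotiation: media type parameters other than `q` are ignored, and the function names `parse_accept` and `negotiate` are hypothetical.

```python
def parse_accept(header):
    """Parse an Accept header value into (media_type, q) pairs, best first.

    Simplified: parameters other than q are ignored.
    """
    entries = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            key, _, value = param.partition("=")
            if key.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q value as "not acceptable"
        entries.append((media_type, q))
    # sorted() is stable, so entries with equal q keep the client's order.
    return sorted(entries, key=lambda e: e[1], reverse=True)


def negotiate(accept_header, available):
    """Return the first available media type the client accepts, or None."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        for candidate in available:
            main_type = candidate.split("/")[0]
            if media_type in (candidate, "*/*", main_type + "/*"):
                return candidate
    return None


offered = ["application/gml+xml", "application/json"]
chosen = negotiate("application/gml+xml;q=0.8, application/json", offered)
print(chosen)
```

A server built along these lines would put the chosen value into the Content-Type response header, and would typically answer with an error status when `negotiate` returns `None`.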

What does it mean in the context of OGC? In any OGC service response, regardless of whether it comes from a traditional W*S or a REST(ful) implementation, hypermedia formats play the crucial role of encoding links to other resources (resources being data or other services). Following these links confronts the user with the problem of handling different Internet media types, as links may point to other services, video streams, audio files, or XML or JSON encoded feature collections. In any case, the client application needs to be prepared to handle these types. In the case of exploring the Web, browsers commonly have built-in support for a variety of Internet media types, such as text (e.g. text/plain), image files (e.g. image/png), audio and video. If an Internet media type is not supported, plug-ins are often available, and browsers know how to discover and install appropriate plug-ins. Though this takes a few more clicks, it is still a smooth user experience. In addition, browsers can adapt to new types by either adding internal type handling or by loading another plug-in. The same smooth user experience needs to be generated for spatiotemporal data exploration and usage. Links from one resource to any other resource need to carry sufficient information to allow understanding the Internet media type of the linked resource and the link association type. Otherwise, a link "liegt südöstlich von" (German for "is located southeast of") or "ist Mittelpunkt von" (German for "is the center of") stops further state transitions if the link does not make sense to the user. Thus, it is essential to provide an ontology with typical link types. This ontology needs to be made permanently available online under the supervision of the OGC Naming Authority and should include a wide set of geospatially relevant link types. The ontology needs to be aligned with existing link type registries, such as the IANA link relations or Dublin Core (see also the Future Work section).
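How a client might decide which links it can follow is sketched below in Python. The relation names in `IANA_RELATIONS` are a small subset of registered IANA link relations; the URIs in `GEO_RELATIONS` are hypothetical placeholders, since no OGC link type ontology is assumed to exist yet, and `classify_links` is an illustrative helper.

```python
# A few registered IANA link relation names (not the full registry).
IANA_RELATIONS = {"self", "alternate", "collection", "item", "describedby", "related"}

# Hypothetical geospatial relation URIs; placeholders for a future ontology.
GEO_RELATIONS = {
    "http://example.org/def/rel/isNorthOf",
    "http://example.org/def/rel/isPartOfCollection",
}

# Media types this imaginary client can process.
SUPPORTED_TYPES = {"application/json", "application/gml+xml", "image/png"}


def classify_links(links, known_relations, supported_types):
    """Split links into those the client understands and dead ends.

    A link is followable only if both its relation type and the media
    type of its target are known to the client.
    """
    followable, dead_ends = [], []
    for link in links:
        understood = (
            link.get("rel") in known_relations
            and link.get("type") in supported_types
        )
        (followable if understood else dead_ends).append(link["href"])
    return followable, dead_ends


links = [
    {"href": "/rivers/7", "rel": "http://example.org/def/rel/isNorthOf",
     "type": "application/gml+xml"},
    {"href": "/rivers/7.png", "rel": "alternate", "type": "image/png"},
    {"href": "/towns/3", "rel": "liegt südöstlich von", "type": "application/json"},
    {"href": "/video/9", "rel": "related", "type": "video/mp4"},
]

followable, dead_ends = classify_links(
    links, IANA_RELATIONS | GEO_RELATIONS, SUPPORTED_TYPES
)
print(followable, dead_ends)
```

The third link is a dead end because its relation is an unregistered free-text label, and the fourth because the client cannot process the media type, even though the relation is known.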

Exact semantics of link types are just a first step towards semantic interoperability. Link types allow understanding aspects such as that a resource isVisualizedIn a map or isPartOfCollection, but full semantic interoperability is only achieved if the linked resource is sufficiently annotated. Providing the Internet media type of the resource is an important first step that allows processing linked resources, but Internet media types such as text/xml or application/json only describe the serialization. This remains at the syntactical level, i.e. the client application can read the resource, but does not understand it. As an example, displaying a coverage on a map requires understanding the internal structure of the coverage serialization, which is all on the syntactical level. Once it comes to making sense of the displayed coverage, understanding the value type "Mittlere Jahreswindgeschwindigkeit" (German for "mean annual wind speed") is essential.

In summary, a link to a spatiotemporal resource needs to carry additional information in order to allow full understanding of the linked resource(s). The ultimate goal is the development of a semantically sound declarative language for geospatial data. This language would allow clients to render all geospatial data in a way perfectly aligned with users’ expectations without the need to change anything programmatically. To achieve this goal, a number of ontologies need to be made available. In this context, the OGC should concentrate on the semantics of geospatial link types.

7.2. REST API Description Languages

With the Open API Initiative (OAI), Hypertext Application Language (HAL), JSON-LD, Hypermedia-Driven Web APIs (Hydra), and Siren, a number of hypermedia formats exist that allow linking resources. Future IP initiatives should investigate these formats and develop recommendations on which of them to integrate into the OGC meta architecture, and how.

8. Semantic Enablement in Testbed-12

This chapter summarizes the activities performed in Testbed-12 that are relevant to OGC Semantic Enablement and provides guidance for future activities required to explore the applicability of Testbed-12 results to real world situations. Semantic activities were the subject of two threads: Linked Data and Advanced Semantics for Data Discovery and Dynamic Integration (LDS) and Aviation (AVI). Whereas the first thread addressed semantics in a domain-neutral manner, the latter focused on specific requirements and aspects applicable to the aviation community.

As with all experiments featuring semantic mediation, future testbeds need to focus more on actual implementations of the developed ontologies, service interfaces, and data exchange models and formats. The current work provides a great start, but only future experiments will allow proper evaluation of the results produced in Testbed-12. It is recommended to develop an entire semantic exploration thread with a number of actual components implementing the ontologies developed herein, providing mappings between ontologies, and supporting the developed APIs. These future activities should not develop yet another stack of technologies, but implement what has been reported in Testbed-12.

In addition, what is currently missing is complete guidance on "geospatial - the semantic way". To realize automatic search and discovery of geospatial feature data at the semantic level, various challenges have to be met, depending on the maturity of the system or data at hand. If new data is provided, the challenge is how to match geospatial features to a predefined geospatial ontology. If the data is already available and organized according to a domain ontology, the challenge is how to map between feature types from various ontologies that are not perfect matches. Though the subject of substantial research, automated schema mapping between feature types is still a huge challenge and currently not mature enough for operational systems. The approaches taken in Testbed-12 feature manual mappings to mitigate the problem. Nevertheless, in order to better understand where OGC currently stands in terms of operationalization of its semantic research work, we recommend extensive tests with real world data and scenarios.

The following paragraphs use material from other Testbed-12 Engineering Reports. As re-used text has often been modified, quotations are not highlighted. This summary is based on the following underlying reports:

OGC 16-020 Testbed-12 ShapeChange Engineering Report
OGC 16-039 Testbed-12 Aviation Semantics Engineering Report
OGC 16-051 Testbed-12 Javascript-JSON-JSON-LD ER
OGC 16-059 Testbed-12 Semantic Portrayal, Registry and Mediation Engineering Report

8.1. Semantic Models and Integration of Semantic Services

Testbed-12 work outside of the aviation thread focused on the integration of Semantic Web technologies into Spatial Data Infrastructure data models and services with the goal of improving OGC data discovery, exchange, representation and visualization. The following diagram illustrates these concepts.

Figure 4. Overview of semantic services

8.1.1. Semantic Registry Information Model and Semantic Registry Service

The optimal provisioning of metadata is still subject to debate. A number of standards are available to express metadata, all of them focusing on particular requirements, serialization models, or encodings. The ultimate goal is the development of a metadata structure that allows the discovery of all other objects, services, and metadata that are associated with any single object. This approach requires a graph structure with associations between individual objects that are themselves first class citizens, i.e. objects with properties and potentially further links to other objects (see the previous chapter on links between resources and the Testbed-12 report on the General Feature Model).

Traditionally, metadata standards in the geospatial domain have focused on either data or service descriptions, but rarely addressed links between objects or associations to particular object portrayal or other processing services. To overcome these shortcomings, Testbed-12 analyzed a number of metadata standards, including the W3C standard DCAT, DCAT-AP, GeoDCAT-AP, ADMS, Project Open Data 1.1, Dublin Core, ISO 19115, and ISO 19119, to identify the common and relevant metadata information needed for search and discovery. Testbed-12 also identified gaps that prevent existing standards from providing a complete dataset description, including service, portrayal information, schema and schema mapping. These efforts resulted in the development of the Semantic Registry Information Model (SRIM) and the Semantic Registry Service. SRIM is defined as a superset of DCAT and its existing application profiles (DCAT-AP, GeoDCAT-AP, ADMS) and introduces a superclass of dcat:Dataset called srim:Item. The reason not to use DCAT exclusively is the strict focus of DCAT on data, which excludes other elements such as services or map layers. SRIM enables the integration of different metadata providers (CSW, CKAN, POD, WMS, WCS) by providing a common core vocabulary to describe resources (data, services, vocabularies, maps, layers, schemas, etc.) and accommodates the specificities of each source by leveraging the built-in extensibility mechanism in OWL. SRIM has been defined in UML and serialized in RDF and JSON. The JSON serialization has been closely aligned with the Project Open Data metadata schema 1.1 standard. However, it has been extended and partly modified to accommodate some of the requirements of the Semantic Registry Service.
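The effect of introducing srim:Item as a superclass of dcat:Dataset can be sketched with plain triples in Python. This is an illustration only: the SRIM namespace URI, the `Service` class name, and the `ex:` instances are assumptions made for this example, and the two helper functions are a hand-written stand-in for what an RDFS reasoner would provide.

```python
DCAT = "http://www.w3.org/ns/dcat#"  # W3C DCAT namespace
SRIM = "http://example.org/srim#"    # hypothetical; not stated in this report
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

triples = {
    (DCAT + "Dataset", SUBCLASS_OF, SRIM + "Item"),
    (SRIM + "Service", SUBCLASS_OF, SRIM + "Item"),  # hypothetical class name
    ("ex:rivers", RDF_TYPE, DCAT + "Dataset"),
    ("ex:wms1", RDF_TYPE, SRIM + "Service"),
}


def superclasses(cls, triples):
    """Return cls together with all of its transitive superclasses."""
    found = {cls}
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if s == current and p == SUBCLASS_OF and o not in found:
                found.add(o)
                frontier.append(o)
    return found


def instances_of(cls, triples):
    """Return all resources typed with cls or with one of its subclasses."""
    return sorted(
        s for s, p, o in triples
        if p == RDF_TYPE and cls in superclasses(o, triples)
    )


print(instances_of(SRIM + "Item", triples))
```

A query for items returns both the dataset and the service, whereas a query for `dcat:Dataset` alone would miss the service; this is the kind of gap the common superclass is meant to close.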

The Semantic Registry Service implements the Semantic Registry Information Model and can act as a proxy between a client and any number of catalog services. Passing clients' search requests on to OGC CSW instances requires dynamic query rewriting to bridge between the various service metadata encoding models and formats and the Semantic Registry Service JSON encoding supported by the service’s REST API. This aspect needs to be implemented and further explored in future testbeds. The Testbed-12 implementation focused on a harvesting approach, in which metadata elements from CSW services were harvested and converted locally to fit the Semantic Registry Information Model.
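The harvesting step can be sketched as follows. This is an illustration only, not the Testbed-12 implementation: the sample record is invented, and the output keys (identifier, title, description, modified, keyword) follow Project Open Data 1.1 field names on the assumption that the SRIM JSON encoding keeps them. The CSW 2.0.2 and Dublin Core namespaces are the standard ones.

```python
import xml.etree.ElementTree as ET

NS = {
    "csw": "http://www.opengis.net/cat/csw/2.0.2",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dct": "http://purl.org/dc/terms/",
}

# Invented sample record, shaped like a csw:Record returned by GetRecords.
SAMPLE_RECORD = """
<csw:Record xmlns:csw="http://www.opengis.net/cat/csw/2.0.2"
            xmlns:dc="http://purl.org/dc/elements/1.1/"
            xmlns:dct="http://purl.org/dc/terms/">
  <dc:identifier>urn:example:rivers</dc:identifier>
  <dc:title>Major rivers</dc:title>
  <dc:subject>hydrography</dc:subject>
  <dc:subject>rivers</dc:subject>
  <dct:abstract>Centre lines of major rivers.</dct:abstract>
  <dct:modified>2016-06-30</dct:modified>
</csw:Record>
"""


def _text(record, path):
    """Return the stripped text of the first matching child, or None."""
    node = record.find(path, NS)
    return node.text.strip() if node is not None and node.text else None


def harvest_record(xml_string):
    """Convert one csw:Record into a POD-style JSON-ready dictionary."""
    record = ET.fromstring(xml_string.strip())
    item = {
        "identifier": _text(record, "dc:identifier"),
        "title": _text(record, "dc:title"),
        "description": _text(record, "dct:abstract"),
        "modified": _text(record, "dct:modified"),
        "keyword": [s.text.strip() for s in record.findall("dc:subject", NS) if s.text],
    }
    # Drop empty fields so that sparse records stay compact.
    return {key: value for key, value in item.items() if value}


ITEM = harvest_record(SAMPLE_RECORD)
```

Because the conversion happens once at harvest time, client searches run against the local store and never reach the CSW instances, which is the trade-off compared with dynamic query rewriting.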

8.1.2. Semantic Portrayal

The initial implementation of the Semantic Portrayal Service during OGC Testbed-11 focused on defining styles, portrayal rules, point-based symbols, and graphics to enable Web Processing Services to produce SLD documents. Testbed-12 broadened the focus to include, in particular, symbology styles for lines, areas, and text based on existing standards such as Symbology Encoding, Styled Layer Descriptor, SVG, ISO 19117, and KML. All portrayal information is captured in a set of microtheories, including a style ontology, a symbol ontology, a graphic ontology, and a portrayal catalog ontology. All semantic portrayal information (i.e., styles, rulesets, symbol-sets, and symbols) has been made available through a hypermedia-driven REST-based API.

8.1.3. Semantic Mediation

The semantic mediation work in Testbed-12 was closely related to the semantic portrayal work described above and built on the achievements from Testbed-11. OGC 15-058, the Symbology Mediation Engineering Report from Testbed-11, describes the basic principles of semantic mediation using the example of two portrayal ontologies that need to be aligned in an ad hoc manner. Testbed-12 focused on the use of a schema registry to store information about schemas and schema mappings in order to support ad hoc transformations between source and target schemas (see the chapter on the semantic registry). Schema mappings can be considered a simple form of semantic mediation, but they do without an explicit formalization of the underlying semantic knowledge required to map from one schema to another. For that reason, the idea was to design a Semantic Mediation Service REST API and integrate it with the Semantic Registry and the CSW ebRIM profile for Schema Registry.
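A schema mapping of this simple kind can be sketched as a list of attribute-level rules that a mediation component applies to each feature. The mapping structure, the schema identifiers, and the attribute names below are hypothetical and are not taken from the Testbed-12 schema registry.

```python
# Hypothetical mapping entry, as it might be stored in a schema registry.
MAPPING = {
    "source_schema": "urn:example:schema:source",
    "target_schema": "urn:example:schema:target",
    "rules": [
        {"source": "NAME", "target": "name"},
        # A rule may carry a value conversion, here kilometres to metres.
        {"source": "LEN_KM", "target": "lengthMetres", "convert": lambda km: km * 1000},
    ],
}


def transform(feature, mapping):
    """Apply the mapping rules to one feature given as a flat dictionary."""
    result = {}
    for rule in mapping["rules"]:
        if rule["source"] not in feature:
            continue  # attributes without a value are simply skipped
        value = feature[rule["source"]]
        convert = rule.get("convert")
        result[rule["target"]] = convert(value) if convert else value
    return result


TARGET_FEATURE = transform({"NAME": "River A", "LEN_KM": 12, "UNMAPPED": 1}, MAPPING)
```

The sketch also shows the limitation noted above: the rules say which attribute becomes which, but not why, so attributes without a rule (UNMAPPED here) are silently lost.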

8.1.4. Semantic Catalogs

The OGC approach to support data and service discovery is the Catalog Service for the Web (CSW). The current versions of CSW support syntactic and schematic interoperability. CSW supports searches by temporal and spatial dimensions, keywords, and well-defined terms organized in the service taxonomy (e.g. search for a specific service type), an approach that is insufficient for automatic service discovery based on data contents. Service brokers that mediate between a client and catalogs or other services can only partly mitigate that problem, even when loaded with additional knowledge (which is a very complex and cumbersome task).

In particular the keyword-based search approach, which uses a lexical comparison between search and target terms, often leads to poor discovery results because the keywords in the query may be semantically similar but syntactically different, or syntactically similar but semantically different from the terms in a Web service description. Thus, traditional keyword-based search approaches are inherently restricted by the ambiguities of natural languages.
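The difference between a purely lexical comparison and a vocabulary-assisted one can be shown with a small sketch. The thesaurus below is invented for illustration and stands in for a controlled vocabulary with preferred and alternative labels.

```python
# Hypothetical mini-thesaurus: each entry groups labels for one concept.
THESAURUS = {
    "river": {"river", "stream", "watercourse"},
    "road": {"road", "street", "highway"},
}


def lexical_match(query, keywords):
    """Classic keyword search: the query must equal a keyword (ignoring case)."""
    return query.lower() in {k.lower() for k in keywords}


def expanded_match(query, keywords):
    """Expand the query with all labels of the concepts it belongs to."""
    query = query.lower()
    terms = {query}
    for labels in THESAURUS.values():
        if query in labels:
            terms |= labels
    return any(k.lower() in terms for k in keywords)


SERVICE_KEYWORDS = ["Watercourse", "Hydrography"]
```

With these inputs, a search for "river" fails lexically but succeeds after expansion. The second problem named above, syntactically similar but semantically different terms, is not addressed by label expansion and needs concept identifiers rather than strings.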

8.2. UML to OWL/RDF Mapping

Testbed-12 experimented with deriving an ontology representation of an application schema (using RDF(S)/SKOS/OWL) - to support Semantic Web / Linked Data implementations. Application schemas are a key enabler of interoperable information exchange. They define the structure and semantics of geographic features for a specific domain, community, or application. Numerous application schemas exist, for example in the defense and intelligence as well as aviation domains.

Traditionally, XML Schemas have been derived from application schemas, based upon the encoding rules of the Geography Markup Language (GML). These schemas are used for exchanging XML encoded geographic information in an interoperable way. OGC 16-020 defines rules for converting an application schema into an OWL ontology. The design is based upon the conversion rules defined by ISO 19150-2. A number of configuration options as well as additional conversion rules provide a higher level of control and flexibility when deriving an ontology compared to the conversion rules defined by ISO 19150-2. Converting an application schema into an ontology results in a key component that can be used by web applications. The ontology defines the concepts for encoding geographic information in machine-processable representation languages (RDF/OWL/SKOS). RDF data published on the web supports linking between different datasets. The ontology makes conceptual knowledge available for automated reasoning over RDF data. Combined, this can unlock new information.
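The general idea of such a conversion can be sketched for a single feature type. This is a strongly simplified illustration and not the rule set of OGC 16-020 or ISO 19150-2: the input structure, the ex: prefix, the property naming pattern, and the type table are assumptions, and prefix declarations are omitted from the generated Turtle.

```python
# Simplified, hypothetical description of one UML feature type.
UML_CLASS = {
    "name": "River",
    "superclass": "Watercourse",
    "attributes": [
        {"name": "name", "type": "CharacterString"},
        {"name": "length", "type": "Real"},
    ],
}

# Assumed mapping of a few UML primitive types to XSD datatypes.
XSD_TYPES = {
    "CharacterString": "xsd:string",
    "Real": "xsd:double",
    "Integer": "xsd:integer",
}


def uml_class_to_turtle(uml, prefix="ex"):
    """Render one class and its attributes as Turtle statements."""
    cls = f"{prefix}:{uml['name']}"
    lines = [f"{cls} a owl:Class ;"]
    if uml.get("superclass"):
        lines.append(f"    rdfs:subClassOf {prefix}:{uml['superclass']} ;")
    lines.append('    rdfs:label "{}" .'.format(uml["name"]))
    for attr in uml["attributes"]:
        # Illustrative naming pattern: class name, dot, attribute name.
        prop = f"{cls}.{attr['name']}"
        lines += [
            "",
            f"{prop} a owl:DatatypeProperty ;",
            f"    rdfs:domain {cls} ;",
            f"    rdfs:range {XSD_TYPES[attr['type']]} .",
        ]
    return "\n".join(lines)


TURTLE = uml_class_to_turtle(UML_CLASS)
```

Configuration options of the kind mentioned above would typically decide points that this sketch hard-codes, such as whether a domain is declared for each property or how properties are named.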

8.3. JSON-LD

The experiments in Testbed-12 on JSON-LD focused on the UML to JSON-LD mapping without using XML as an intermediate step. The work primarily addressed technical details of the conversion process and the use of JSON Schema for further validation processes. From a semantic perspective, these research activities play an important role when it comes to integration efforts between various models and approaches. In this context, future work needs to address real-world situations where data serialized in JSON-LD, RDF, and other formats need to be integrated, mapped, or aligned.

8.4. Aviation Semantics

Aviation semantics explores the use of the FAA Web Service Description Ontological Model (WSDOM) to improve service discovery within Spatial Data Infrastructures. The results of this work are documented in OGC 16-039r1, Testbed-12 Aviation Semantics Engineering Report. That report starts with a short introduction to the concept of semantic service description and discovery using the WSDOM [6] ontology, while considering the characteristics of OWS and the specific needs of aviation traffic management. An overall goal is to integrate OGC technologies with aviation semantic web technologies, in particular those related to service and data discovery. The Testbed-12 focus was on semantic service descriptions for OWS-compatible aviation services, which shall be interoperable with service description approaches based on OGC GetCapabilities responses. At the same time, the power and expressiveness of query languages such as SPARQL and GeoSPARQL should be leveraged.

Image source: OGC 16-018

WSDOM is a formal ontological model based on OWL-S and FAA service taxonomies that is being developed by the FAA. It can be used to describe and discover service metadata instances using semantic technology. While it provides a good basis for extensions, the WSDOM ontology needs additional metadata to fully classify services, describe their characteristics, and express their geospatial properties. Furthermore, it neither includes rules and axioms for reasoning nor supports mapping between WSDOM and OWS GetCapabilities documents. The current WSDOM specification also lacks descriptions of use case scenarios, service discovery query examples, and a reference demonstration/test environment.

The usefulness of ontology-based service metadata descriptions for geospatial service discovery tasks was explained, and several extensions to the WSDOM ontology were proposed. The extensions not only enhance the service classification taxonomy of the WSDOM ontology but also utilize the results of the OGC Geosemantics DWG in the area of geospatial semantics, proposing GeoSPARQL as the language of choice for service discovery. Much attention has also been given to the interoperability between the WSDOM ontological service representations and the equivalent OGC OWS-compatible metadata descriptions.
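A GeoSPARQL-based discovery request of the kind proposed here might look like the query produced by the sketch below. The geo: and geof: terms are standard GeoSPARQL vocabulary. The ex: namespace, the class ex:Service, and the property ex:coverageArea are placeholders assumed for illustration and are not WSDOM terms.

```python
def bbox_to_wkt(min_lon, min_lat, max_lon, max_lat):
    """Return a closed WKT polygon (longitude latitude order) for a bounding box."""
    ring = [
        (min_lon, min_lat),
        (max_lon, min_lat),
        (max_lon, max_lat),
        (min_lon, max_lat),
        (min_lon, min_lat),
    ]
    coords = ", ".join(f"{x} {y}" for x, y in ring)
    return f"POLYGON(({coords}))"


def build_discovery_query(bbox):
    """Build a query for services whose coverage intersects the bounding box."""
    wkt = bbox_to_wkt(*bbox)
    return f"""PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>
PREFIX ex: <http://example.org/service#>

SELECT ?service WHERE {{
  ?service a ex:Service ;
           ex:coverageArea ?area .
  ?area geo:hasGeometry/geo:asWKT ?wkt .
  FILTER(geof:sfIntersects(?wkt, "{wkt}"^^geo:wktLiteral))
}}"""


QUERY = build_discovery_query((21, 57, 29, 60))
```

The query string would be sent to a GeoSPARQL-capable endpoint. Running it is outside the scope of this sketch.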

8.4.1. OGC Web Services and OWL-S

One of the main shortcomings in automated service discovery and invocation of OGC compliant Web services is the lack of shared semantics about any specific service. A WMS provides maps with any number of layers, but each layer is described very briefly (if at all) using free text. If a user cannot interpret the layer named jõed or the short description suurimaid jõgesid Eestis, it is not possible to derive any further knowledge from that service. Whereas maps consist of a number of layers and might still serve some meaningful purpose even though semantically weak, other services such as WPS that interface literally any type of geospatial processing are not usable without clear semantics.

Semantic annotation was a subject of research roughly 10 years ago, with Roman et al (2007), Tanasescu et al (2007), and Lutz (2007) suggesting ontologies to describe OGC Web services. All these approaches forwent using information provided directly by the Capabilities document. Others, such as Maué (2008), have addressed the aspect of missing semantics by adding links to the Capabilities document. These links annotated data objects with semantics from a domain ontology and functional aspects with semantics from Web service process ontologies. In contrast to these approaches, Stock et al, whose approach was reused in the Aviation Semantics work, suggest storing and augmenting the Capabilities content in a Web service ontology to assist discovery, execution, and orchestration of these services.

OWL-S, published as a W3C Member Submission, is used to describe Web services with strong semantics that allow discovery and reasoning. OWL-S specifies ontologies to generically describe any Web service, but requires extensions to cover domain-specific aspects, e.g. the geospatial characteristics of OGC Web service interfaces. This is complementary to the existing OGC approach, which uses Capabilities documents to describe technical aspects and limited semantics of Web services without much room for semantic reasoning.

The Aviation Semantics Engineering Report builds on previous work done by Stock et al and suggests the representation of OGC-compliant geospatial Web services using ontologies. The general idea is to populate the ontology with data from the Capabilities document provided by OGC Web services.
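Populating an ontology from a Capabilities document can be sketched as a small extraction step. The Capabilities fragment below is invented and heavily abbreviated, and the ex: class and property names are placeholders rather than OWL-S or WSDOM terms. Only the WMS 1.3.0 namespace and element names are standard.

```python
import xml.etree.ElementTree as ET

WMS_NS = {"wms": "http://www.opengis.net/wms"}

# Invented, abbreviated WMS 1.3.0 Capabilities document.
CAPABILITIES = """
<WMS_Capabilities xmlns="http://www.opengis.net/wms" version="1.3.0">
  <Service><Name>WMS</Name><Title>Example map service</Title></Service>
  <Capability>
    <Layer>
      <Title>Root</Title>
      <Layer><Name>rivers</Name><Title>Rivers</Title></Layer>
      <Layer><Name>roads</Name><Title>Roads</Title></Layer>
    </Layer>
  </Capability>
</WMS_Capabilities>
"""


def capabilities_to_triples(xml_string, service_iri):
    """Extract service and layer statements as (subject, predicate, object)."""
    root = ET.fromstring(xml_string.strip())
    triples = [(service_iri, "rdf:type", "ex:WebMapService")]
    title = root.findtext("wms:Service/wms:Title", namespaces=WMS_NS)
    if title:
        triples.append((service_iri, "rdfs:label", title))
    for layer in root.iter("{http://www.opengis.net/wms}Layer"):
        name = layer.findtext("wms:Name", namespaces=WMS_NS)
        if name is None:
            continue  # layers without a Name only group other layers
        layer_iri = f"{service_iri}/layer/{name}"
        label = layer.findtext("wms:Title", default=name, namespaces=WMS_NS)
        triples.append((service_iri, "ex:offersLayer", layer_iri))
        triples.append((layer_iri, "rdfs:label", label))
    return triples


TRIPLES = capabilities_to_triples(CAPABILITIES, "http://example.org/wms")
```

The extracted statements carry no more meaning than the free-text titles they were built from. The semantic gain comes only once layers are additionally linked to concepts in a domain ontology.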

It has to be noted that the previous research referenced above describes OWL-S as a cumbersome way to describe Web services, with significant repetition. The OWL-S model, with its three main parts (the service profile, the process model, and the grounding), has proven onerous in practice, which is particularly true for the input and output parameter descriptions. The suggested solution limits their full description to the process model, with references in other places. This far more practical solution is technically not correct, as it violates the OWL-S formal model. Further issues stem from the nature of OWL and its limited means to express cardinality without a massive proliferation of properties, which is particularly relevant to non-atomic parameters, e.g. Query. All these experiences lead to a mixed picture of the suggested approach. It might make more sense to restrict ontology-based concepts to the discovery phase and use alternative methods to describe parameter details.

8.4.2. Extension of Existing Ontologies

The aviation thread had good experiences extending an existing ontology with GeoSPARQL classes and properties to allow for reasoning on geospatial characteristics. The goal was to evaluate whether OGC-specific elements such as GetCapabilities elements can be added to existing ontologies. The thread successfully developed a method for the semantic representation of OGC-compliant geospatial web services using an equivalent OWL ontology as a potential extension to the WSDOM ontology. WSDOM is a formal Web service description ontological model that is being developed by the FAA for use in building applications that process and exchange web service information. WSDOM was designed based on the OWL-S service ontology and the service taxonomies used by the FAA. The WSDOM 1.1 release consists of six OWL files and three RDF files. For further details see OGC 16-039.

8.4.3. OWL Extension to CSW

Catalogue services support the ability to publish and search collections of descriptive information (metadata) for data, services, and related information objects. Metadata in catalogues represent resource characteristics that can be queried and presented for evaluation and further processing by both humans and software. Catalogue services are required to support the discovery of and binding to registered information resources within an information community. However, the current OGC catalogue service specification only supports taxonomies, which means it is not suited for semantics-based service discovery.

In order to understand the needs for service discovery in the domain of aviation, the aviation thread developed a number of use cases that include aviation services to be described and discovered using semantic technologies. The use cases are related to common capabilities provided by traditional aeronautical information systems. The major difference is a new assumption regarding System Wide Information Management (SWIM): inside SWIM, numerous aviation services might be provided by various authorized service providers. Such diversity requires the services to be properly described in order to be successfully discovered and consumed. With a highly flexible SWIM anticipated, aeronautical data retrieval becomes more complex. Many aviation information services will be available in SWIM, and the complete knowledge about the characteristics and capabilities they provide (operational, data types, spatial coverage), as well as the implementation details, will not always be explicitly available. In other words, the availability of service metadata consolidated in a centralized SWIM catalogue/registry for the purpose of service discovery is an important precondition for successful service consumption. The aviation thread settled for use case development only. Future activities need to address this topic in more detail.

9. Future Work

As already discussed in section Semantic Enablement in Testbed-12, there is some uncertainty as to what extent the developed models and approaches can be applied to real-world scenarios. For that reason, future testbeds need to focus more on actual implementations of the developed ontologies, service interfaces, and data exchange models and formats. The current work provides a great start, but only future experiments will allow proper evaluation of the results produced in Testbed-12. It is recommended to develop an entire semantic exploration thread with a number of actual components implementing the ontologies developed herein, providing mappings between ontologies, and supporting the developed APIs. These future activities should not develop yet another stack of technologies, but implement what has been reported in Testbed-12.

In addition, what is currently missing is complete guidance on "geospatial - the semantic way". To realize automatic search and discovery of geospatial feature data at the semantic level, various challenges have to be met, depending on the maturity of the system or data at hand. If new data is to be provided, the challenge is how to match geospatial features to a predefined geospatial ontology. If the data is already available and organized according to a domain ontology, the challenge is how to map between different feature types from various ontologies that are not perfect matches. Though the subject of substantial research, automated schema mapping between feature types is still a huge challenge and is currently not ripe for operational systems. The approaches taken in Testbed-12 feature manual mappings to mitigate the problem. Nevertheless, in order to better understand where OGC currently stands in terms of operationalization of its semantic research work, we recommend extensive tests with real-world data and scenarios.

In addition to this heuristic approach, a number of detailed aspects should be further researched in future testbeds (ideally in parallel to the heuristic approach described above). These aspects are described in the following paragraphs. It is emphasized that the following elements are aggregated from a number of Testbed-12 reports. Nevertheless, the ERs identified in the previous section need to be consulted for additional details and extended scope.

9.1. Semantic Concepts Applied to Real World Situations

9.1.1. Experiment With and Build Real World Ontologies

With the deployment of the Semantic Web and Linked Open Data we have not only multiplied the data sources but also the machine-processable controlled vocabularies that structure and constrain the interpretations of these data. These controlled vocabularies can be ontologies (RDF Schema, OWL), codelists, taxonomies, thesauri (SKOS) sometimes augmented with additional rules (SPARQL/SPIN rules, SWRL, RIF), and constraints (SHACL). Controlled vocabularies provide a better way to organize knowledge for subsequent retrieval.

Image source: semanco-tools.eu

Vocabulary directories now exist (e.g. Linked Open Vocabularies, LOV), but there is an ever-increasing demand for environments that simplify searching, editing and collaborative contributions to the vocabularies by non-experts of the Semantic Web. This creates a tension between state-of-the-art, very rich formalisms and methods for modeling vocabularies and a need to democratize and decentralize participation in the life cycle and usage of controlled vocabularies.

Vocabularies are most likely to be adopted and shared if they are made available easily. Nevertheless, despite successes in the use of SKOS for encoding vocabularies, current standards provide only low-level interfaces to vocabulary data. For example, many vocabularies are published as an RDF document for download. However, if the vocabulary is large, then the download will be commensurately large; if the user only wants to retrieve a single vocabulary term or select a few terms, this option requires processing on the client side. Alternatively, access to vocabularies is often provided at a SPARQL endpoint. SPARQL is the generic RDF query language. While it is powerful, it is also considered a low-level language similar to the relational database query language SQL that is normally only used by database administrators.

Some SKOS vocabularies are published via other HTTP interfaces. However, each implementation uses different protocols and supports a varied set of features (e.g. content-negotiation provided by the GEMET REST interface and NERC Data Grid’s Vocabulary Server SOAP interface). In some cases, one or both of human-readable formats and machine-readable formats are not available. Thus, discovery and access across vocabulary endpoints becomes challenging and ad-hoc.

There is a clear opportunity here to design an API to match the SKOS and OWL vocabularies, taking advantage of the fact that most modern vocabulary content is structured using SKOS and OWL classes and predicates. This API can then be used as the basis for various higher-level vocabulary applications (NLP applications, Concept Recommender, Semantic Enricher, etc.) that can be used to enrich for example ISO 19115 metadata and other OGC services using controlled vocabularies in their metadata.
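What a higher-level vocabulary interface could offer on top of raw downloads or SPARQL can be sketched with an in-memory model. The class and method names, the concept URIs, and the sample concepts are hypothetical. Only the modeled properties (preferred label, alternative label, broader) follow SKOS.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    uri: str
    pref_label: dict                                  # language tag -> skos:prefLabel
    alt_labels: list = field(default_factory=list)    # skos:altLabel values
    broader: list = field(default_factory=list)       # URIs of skos:broader concepts


class VocabularyAPI:
    """Term-level access, so clients need not download the whole vocabulary."""

    def __init__(self, concepts):
        self._by_uri = {c.uri: c for c in concepts}

    def get(self, uri):
        """Return a single concept, or None if the URI is unknown."""
        return self._by_uri.get(uri)

    def search(self, text, lang="en"):
        """Return URIs of concepts whose labels contain the text."""
        text = text.lower()
        hits = []
        for concept in self._by_uri.values():
            labels = [concept.pref_label.get(lang, "")] + concept.alt_labels
            if any(text in label.lower() for label in labels):
                hits.append(concept.uri)
        return sorted(hits)

    def narrower(self, uri):
        """Return URIs of concepts that name the given concept as broader."""
        return sorted(c.uri for c in self._by_uri.values() if uri in c.broader)


API = VocabularyAPI([
    Concept("ex:waterbody", {"en": "water body"}),
    Concept("ex:river", {"en": "river", "fi": "joki"}, ["stream"], ["ex:waterbody"]),
    Concept("ex:lake", {"en": "lake"}, [], ["ex:waterbody"]),
])
```

In a deployed service, each method would correspond to one HTTP resource, and the same operations could serve applications such as concept recommendation or metadata enrichment.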

As stated above, experiments with real world models and ontologies need to be expanded in future testbeds. Tests run as part of the ShapeChange UML-RDF/OWL/SKOS work produced very promising results using the NATO Geospatial Real World Object Index.

Furthermore, current experiments with tools such as ShapeChange have revealed a number of issues concerning UML to ontology mapping that need further research. These include:

a required specification in OWL of properties re-used within an application schema, and
the creation of multiple ontologies that support different levels of complexity as a potential handle for full and reduced-complexity ontologies.

9.1.2. Multilingual Support

Image source: eazysafe.com

Multilingual environments often struggle with common semantics due to different usage of terms in the various languages. Though technologies such as RDF already support multilingual properties, the feature is currently not used very often. In particular, infrastructures such as the Arctic Spatial Data Infrastructure, where many data sets are available at the national scale in the eight adjacent nations, represent ideal test cases to further experiment with multilingualism and semantic mapping.
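Language-tagged labels in RDF can be pictured as pairs of text and language tag attached to the same concept. The sketch below selects a label according to a client's language preferences. The labels and the fallback rule are illustrative assumptions.

```python
# Labels of one concept as (text, language tag) pairs, as RDF literals carry them.
LABELS = [("river", "en"), ("joki", "fi"), ("elv", "nb")]


def pick_label(labels, preferred, fallback="en"):
    """Return the first label matching the preferred languages, else the fallback."""
    by_lang = {lang: text for text, lang in labels}
    for lang in (*preferred, fallback):
        if lang in by_lang:
            return by_lang[lang]
    return None
```

For example, a client preferring Swedish, then Finnish, receives "joki", while a client asking only for a language that is not available receives the English fallback. Because all labels hang off one concept identifier, mapping between national data sets can operate on the identifier rather than on the strings.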

9.1.3. OWL and OCL Constraints

Constraint languages such as OCL allow constraining the mapping from one model representation or serialization to another. During the mapping process from UML to OWL, OCL could, for example, be used to constrain OWL constructs. These technologies should be added to UML converter tools such as ShapeChange to improve interoperability and reasoning capabilities. This applies to constrained property types, values, geometries, or times.

9.1.4. Data Models in OWL and RDF, JSON-LD, or other formats

Experiments have been started with RDF and JSON-LD, though no full comparison of both approaches has been performed yet. JSON-LD is a lightweight syntax to serialize Linked Data in JSON. As the name suggests, it supports linking in and between datasets, much like XLinks in GML data.

A key aspect that JSON-LD adds to JSON is semantic tagging of data elements. Like in XML, each element can be assigned to a namespace. This allows clients to identify the exact meaning of an element, especially if there are multiple elements that happen to have the same name. JSON-LD context documents are used to provide the necessary information. A context document typically references terms from one or more vocabularies or ontologies. These terms then provide the semantics of elements in the actual JSON data. In this context, a number of questions arise that need additional research and experimentation, such as:

How does such an approach compare with semantics defined by OWL ontologies and RDF data serializations?
How can JSON-LD be verified upfront to test compliance with a standardized model? What roles do Frames and JSON Schema play?
Does it make sense to bring JSON-LD into a specific layout, e.g. RDF, and then use SHACL to validate the data?
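The role of a context document described above can be illustrated with a small example. The document and its identifiers are invented. The function below handles only flat term-to-IRI contexts and is far simpler than the expansion algorithm of the JSON-LD specification, which a real implementation should obtain from a JSON-LD library.

```python
import json

# Invented JSON-LD document with an inline context.
DOCUMENT = json.loads("""
{
  "@context": {
    "name": "http://schema.org/name",
    "geometry": "http://www.opengis.net/ont/geosparql#hasGeometry"
  },
  "@id": "http://example.org/feature/1",
  "name": "Example river",
  "geometry": {"@id": "http://example.org/geometry/1"}
}
""")


def expand_terms(doc):
    """Replace short keys by the IRIs given in a flat @context."""
    context = doc.get("@context", {})
    expanded = {}
    for key, value in doc.items():
        if key == "@context":
            continue
        # Keywords such as @id and unmapped keys are kept unchanged.
        expanded[context.get(key, key)] = value
    return expanded


EXPANDED = expand_terms(DOCUMENT)
```

After expansion, two documents that both use a key called "name" but map it to different IRIs can be told apart, which is the disambiguation that plain JSON cannot provide.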

9.2. Experiments With New Technologies

9.2.1. Hypermedia Formats

A number of hypermedia formats allow linking resources, including Hypertext Application Language (HAL), JSON-LD, Hydra (Hypermedia-Driven Web APIs), and Siren (a hypermedia specification for representing entities). Future IP initiatives should investigate these formats and develop recommendations on how to integrate which type into the OGC meta architecture.
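As one example of these formats, HAL places links in a reserved _links property keyed by relation type, where each value is a link object or an array of link objects. The resource below and its paths are invented for illustration.

```python
# Invented HAL representation of a portrayal style resource.
HAL_DOCUMENT = {
    "_links": {
        "self": {"href": "/styles/1"},
        "collection": {"href": "/styles"},
        "related": [{"href": "/symbols/7"}, {"href": "/symbols/9"}],
    },
    "title": "Example style",
}


def link_targets(resource, rel):
    """Return all hrefs for a relation type in a HAL resource."""
    links = resource.get("_links", {}).get(rel, [])
    if isinstance(links, dict):  # a single link object instead of an array
        links = [links]
    return [link["href"] for link in links]
```

A client written against link_targets navigates by relation type only, so a server can change its URL layout without breaking the client. The other formats listed above differ mainly in how links and their semantics are encoded.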

9.2.2. Hypermedia Link Relations

Precise semantics of association types between resources are essential for any setup of resource servers that use links between resources. Testbed-12 partly addressed this topic as part of the REST User Guide and REST Architecture Engineering Report. To establish any semantically sound data serving infrastructure, a reliably available and clearly defined set of link types that is aligned with other initiatives such as IANA link relations is absolutely essential. The OGC, under the supervision of the OGC Naming Authority, should establish a registry of typical link types for geospatial applications.
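How a client might check link types against such a registry can be sketched as follows. The set below is a small subset of the IANA link relation registry. The extension relation URI in the usage note is a made-up placeholder and not an OGC-registered link type.

```python
from urllib.parse import urlparse

# Subset of relation types registered with IANA.
REGISTERED = frozenset({
    "self", "alternate", "collection", "item", "describedby",
    "next", "prev", "up", "related", "license",
})


def classify_rel(rel):
    """Classify a link relation as registered, extension (URI), or unknown."""
    if rel.lower() in REGISTERED:
        return "registered"
    parsed = urlparse(rel)
    if parsed.scheme and parsed.netloc:
        return "extension"  # extension relation types are identified by URIs
    return "unknown"
```

With this function, "describedby" is classified as registered, "http://example.org/rel/portrayal" as an extension, and a bare "portrayal" as unknown. A geospatial link type registry would move commonly needed relations from the third category into a well-defined one.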

9.2.3. RDF-QB, VoID, Linked Data, Shared Vocabularies, and OGC Web Service Interfaces

Much groundwork has been done by the joint OGC-W3C Spatial Data on the Web Working Group. Future testbeds should identify a number of key elements and progress the work of the working group with implementations and further research. The important elements include the RDF Data Cube vocabulary to define thematic, spatial, and temporal dimensions; the Vocabulary of Interlinked Datasets (VoID) as an RDF-based schema to describe linked datasets; and their integration with shared vocabularies and the various OGC Web service interfaces as discussed above.

9.2.4. Validation of RDF Data

Ontologies describe semantics. Reasoning on RDF data can be performed based upon the information found in ontologies. This can lead to new information and knowledge. Pure validation of RDF data is another use case. Validation can verify that a given dataset is compliant to a specification. This is of interest whenever a data publisher and consumer have agreed to exchange information compliant to a certain specification. This is even more important when there are many publishers and consumers (think of information exchange on and between the communal, regional, national, and international level).

SHACL, the Shapes Constraint Language, defines the shape of the graph and allows integrity checks of graph-based structures. SHACL has been identified in Testbed-12 as a key candidate to support the missing validity and integrity checks for OWL/RDF-based serializations of information models.
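The kind of check that a shape expresses can be imitated in a few lines of plain Python. This is not a SHACL engine and does not use SHACL syntax: the shape dictionary only mimics the idea of a target class with minimum count, maximum count, and datatype constraints, and all ex: names and data values are invented.

```python
# Shape-like description: nodes of a class and constraints on their properties.
SHAPE = {
    "target_class": "ex:River",
    "properties": [
        {"path": "ex:name", "min_count": 1, "datatype": str},
        {"path": "ex:length", "max_count": 1, "datatype": float},
    ],
}

DATA = [
    ("ex:r1", "rdf:type", "ex:River"),
    ("ex:r1", "ex:name", "Example river"),
    ("ex:r1", "ex:length", 12.5),
    ("ex:r2", "rdf:type", "ex:River"),
    ("ex:r2", "ex:length", "long"),  # wrong datatype, and ex:name is missing
]


def validate(data, shape):
    """Return a list of (node, path, violated constraint) tuples."""
    violations = []
    targets = sorted(
        s for s, p, o in data if p == "rdf:type" and o == shape["target_class"]
    )
    for node in targets:
        for prop in shape["properties"]:
            values = [o for s, p, o in data if s == node and p == prop["path"]]
            if len(values) < prop.get("min_count", 0):
                violations.append((node, prop["path"], "minCount"))
            if len(values) > prop.get("max_count", float("inf")):
                violations.append((node, prop["path"], "maxCount"))
            for value in values:
                if not isinstance(value, prop["datatype"]):
                    violations.append((node, prop["path"], "datatype"))
    return violations


REPORT = validate(DATA, SHAPE)
```

Unlike OWL reasoning, which would infer new facts from incomplete data, this closed-world style of checking reports the missing name of ex:r2 as an error, which is the behavior a data exchange agreement needs.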

9.3. Modifications to Existing Standards or Integration of Other Standards

9.3.1. ISO 19115 to SRIM best practices

ISO 19115 uses a hierarchical data model that is not perfectly suitable for graph-based technologies. Future activities should investigate how ISO 19115 could be better aligned with the current best practices for Linked Data publication in general and the SRIM in particular.

9.3.2. Semantic Registry and Dynamic Query Rewriting

The Testbed-12 Semantic Registry Service implements the Semantic Registry Information Model and can act as a proxy between a client and any number of catalog services. Passing clients' search requests on to OGC CSW instances requires dynamic query rewriting to bridge between the various service metadata encoding models and formats and the Semantic Registry Service JSON encoding supported by the service’s REST API. This aspect needs to be implemented and further explored in future testbeds. The Testbed-12 implementation focused on a harvesting approach, in which metadata elements from CSW services were harvested and converted locally to fit the Semantic Registry Information Model. In the future, support for arbitrary OGC services (providing access to e.g. feature collections, maps, map layers, coverages, and other objects) should be added and tested. This requires extensions to the Semantic Registry Information Model, see below.

9.3.3. SRIM Layer and Map Profile

Investigate a profile for Layer and Map that extends the RegisteredItem and relates to Datasets, Services and Portrayal Information developed for the Semantic Registry and Semantic Portrayal Service.

9.3.4. PubSub and federation of Registry

For this testbed, the Semantic Registry harvested information from a federation of CSW services as the focus was to exercise the Semantic Registry Information Model (SRIM) and the REST API. For the next testbed, efficiency could be improved by investigating the publish/subscribe protocol and versioning management of the register items in the Semantic Registry. This approach is complementary to the dynamic query rewriting discussed above. Both approaches have their advantages and disadvantages that need to be explored in further detail.
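The combination of publish/subscribe and version management can be sketched with an in-memory register. All class and method names are hypothetical. A real deployment would use a messaging protocol between the catalog and the registry rather than in-process callbacks.

```python
class Registry:
    """In-memory register that versions items and notifies subscribers."""

    def __init__(self):
        self._items = {}        # identifier -> list of versions, oldest first
        self._subscribers = []

    def subscribe(self, callback):
        """Register a callable taking (identifier, version, metadata)."""
        self._subscribers.append(callback)

    def publish(self, identifier, metadata):
        """Store a new version if the metadata changed; return the version number."""
        versions = self._items.setdefault(identifier, [])
        if versions and versions[-1] == metadata:
            return len(versions)  # unchanged: no new version, no notification
        versions.append(dict(metadata))
        version = len(versions)
        for callback in self._subscribers:
            callback(identifier, version, metadata)
        return version

    def latest(self, identifier):
        """Return the most recent version of an item."""
        return self._items[identifier][-1]


EVENTS = []
REGISTRY = Registry()
REGISTRY.subscribe(lambda identifier, version, metadata: EVENTS.append((identifier, version)))
REGISTRY.publish("urn:example:rivers", {"title": "Major rivers"})
REGISTRY.publish("urn:example:rivers", {"title": "Major rivers"})        # no change
REGISTRY.publish("urn:example:rivers", {"title": "Major rivers (v2)"})
```

Only changed items generate traffic in this pattern, whereas periodic harvesting re-reads every record regardless of whether it changed.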

Appendix A: Revision History

Table 2. Revision History
Date	Release	Editor	Primary clauses modified	Descriptions
June 30, 2016	0.1	M. Klopfer	all	draft engineering report
September 30, 2016	0.2	M. Klopfer	all	draft engineering report capturing key results
October 18, 2016	0.3	M. Klopfer	all	all sections revised
November 16, 2016	1.0	M. Klopfer	all	feedback from sponsor incorporated

Appendix B: Bibliography

[1] Bishr, Y. (1998) Overcoming the semantic and other barriers to GIS interoperability. International Journal of Geographical Information Science, 12/4:299–314

[2] Zazueta, Rob (2014) Stop Talking About Hypermedia and REST - Start Building Adaptable APIs. Blog entry at http://www.mashery.com/blog/stop-talking-about-hypermedia-and-rest-start-building-adaptable-apis

[3] Bloomberg, Jason (2014) Automating HATEOAS Declaratively: REST’s Missing Link. Website: http://www.devx.com/blog/agile/automating-hateoas-declaratively-rests-missing-link.html

[4] Simonis, Ingo & S. Fellah (2014) OGC Testbed 10 Cross Community Interoperability (CCI) Ontology Engineering Report, OGC 14-049, https://portal.opengeospatial.org/files/?artifact_id=58974&version=2

[5] Fellah, Stephane (2015) Testbed-11 Implementing Linked Data and Semantically Enabling OGC Services Engineering Report, OGC 15-054, https://portal.opengeospatial.org/files/?artifact_id=64405&version=2

[6] Stock, Kristin, Anne Robertson, and Mark Small. 2011. “Representing OGC Geospatial Web Services in OWL-S Web Service Ontologies.” http://www.nottingham.ac.uk/~lgzwww/contacts/staffPages/kristinstock/documents/RepresentingOGCGeospatialWebServicesinOWLv0.2_000.pdf

[7] Roman D and Klien E. (2007). SWING – A Semantic Framework for Geospatial Services. In The Geospatial Web: How Geobrowsers, Social Software and the Web 2.0 are Shaping the Network Society. Arno Scharl, Klaus Tochtermann (eds.). London, Springer

[8] Tanasescu V, Gugliotta A, Domingue J, Gutiérrez Villarías L, Davies R, Rowlatt M, Richardson M and Stincic S. (2007). Geospatial Data Integration with Semantic Web Services: the eMerges Approach. In The Geospatial Web: How Geobrowsers, Social Software and the Web 2.0 are Shaping the Network Society. Arno Scharl, Klaus Tochtermann (eds.). London, Springer

[9] Lutz, M. (2007). Ontology-based descriptions for semantic discovery and composition of geoprocessing services. Geoinformatica 11