Publication Date: 2019-02-11
Approval Date: 2018-12-13
Posted Date: 2018-11-21
Reference number of this document: OGC 18-097
Reference URL for this document: http://www.opengis.net/doc/PER/elfie-er
Category: Public Engineering Report
Editor: David Blodgett, Byron Cochrane, Rob Atkinson, Sylvain Grellet, Abdelfettah Feliachi, Alistair Ritchie
Title: OGC Environmental Linked Features Interoperability Experiment Engineering Report
Copyright © 2019 Open Geospatial Consortium. To obtain additional rights of use, visit http://www.opengeospatial.org/
This document is not an OGC Standard. This document is an OGC Public Engineering Report created as a deliverable in an OGC Interoperability Initiative and is not an official position of the OGC membership. It is distributed for review and comment. It is subject to change without notice and may not be referred to as an OGC Standard. Further, any OGC Engineering Report should not be referenced as required or mandatory technology in procurements. However, the discussions in this document could very well lead to the definition of an OGC Standard.
Permission is hereby granted by the Open Geospatial Consortium, ("Licensor"), free of charge and subject to the terms set forth below, to any person obtaining a copy of this Intellectual Property and any associated documentation, to deal in the Intellectual Property without restriction (except as set forth below), including without limitation the rights to implement, use, copy, modify, merge, publish, distribute, and/or sublicense copies of the Intellectual Property, and to permit persons to whom the Intellectual Property is furnished to do so, provided that all copyright notices on the intellectual property are retained intact and that each person to whom the Intellectual Property is furnished agrees to the terms of this Agreement.
If you modify the Intellectual Property, all copies of the modified Intellectual Property must include, in addition to the above copyright notice, a notice that the Intellectual Property includes modifications that have not been approved or adopted by LICENSOR.
THIS LICENSE IS A COPYRIGHT LICENSE ONLY, AND DOES NOT CONVEY ANY RIGHTS UNDER ANY PATENTS THAT MAY BE IN FORCE ANYWHERE IN THE WORLD. THE INTELLECTUAL PROPERTY IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, AND NONINFRINGEMENT OF THIRD PARTY RIGHTS. THE COPYRIGHT HOLDER OR HOLDERS INCLUDED IN THIS NOTICE DO NOT WARRANT THAT THE FUNCTIONS CONTAINED IN THE INTELLECTUAL PROPERTY WILL MEET YOUR REQUIREMENTS OR THAT THE OPERATION OF THE INTELLECTUAL PROPERTY WILL BE UNINTERRUPTED OR ERROR FREE. ANY USE OF THE INTELLECTUAL PROPERTY SHALL BE MADE ENTIRELY AT THE USER’S OWN RISK. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR ANY CONTRIBUTOR OF INTELLECTUAL PROPERTY RIGHTS TO THE INTELLECTUAL PROPERTY BE LIABLE FOR ANY CLAIM, OR ANY DIRECT, SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES, OR ANY DAMAGES WHATSOEVER RESULTING FROM ANY ALLEGED INFRINGEMENT OR ANY LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR UNDER ANY OTHER LEGAL THEORY, ARISING OUT OF OR IN CONNECTION WITH THE IMPLEMENTATION, USE, COMMERCIALIZATION OR PERFORMANCE OF THIS INTELLECTUAL PROPERTY.
This license is effective until terminated. You may terminate it at any time by destroying the Intellectual Property together with all copies in any form. The license will also terminate if you fail to comply with any term or condition of this Agreement. Except as provided in the following sentence, no such termination of this license shall require the termination of any third party end-user sublicense to the Intellectual Property which is in force as of the date of notice of such termination. In addition, should the Intellectual Property, or the operation of the Intellectual Property, infringe, or in LICENSOR’s sole opinion be likely to infringe, any patent, copyright, trademark or other right of a third party, you agree that LICENSOR, in its sole discretion, may terminate this license without any compensation or liability to you, your licensees or any other party. You agree upon termination of any kind to destroy or cause to be destroyed the Intellectual Property together with all copies in any form, whether held by you or by any third party.
Except as contained in this notice, the name of LICENSOR or of any other holder of a copyright in all or part of the Intellectual Property shall not be used in advertising or otherwise to promote the sale, use or other dealings in this Intellectual Property without prior written authorization of LICENSOR or such copyright holder. LICENSOR is and shall at all times be the sole entity that may authorize you or any third party to use certification marks, trademarks or other special designations to indicate compliance with any LICENSOR standards or specifications.
This Agreement is governed by the laws of the Commonwealth of Massachusetts. The application to this Agreement of the United Nations Convention on Contracts for the International Sale of Goods is hereby expressly excluded. In the event any provision of this Agreement shall be deemed unenforceable, void or invalid, such provision shall be modified so as to make it valid and enforceable, and as so modified the entire Agreement shall remain in full force and effect. No decision, action or inaction by LICENSOR shall be construed to be a waiver of any rights or remedies available to it.
None of the Intellectual Property or underlying information or technology may be downloaded or otherwise exported or reexported in violation of U.S. export laws and regulations. In addition, you are responsible for complying with any local laws in your jurisdiction which may impact your right to import, export or use the Intellectual Property, and you represent that you have complied with any regulations or registration procedures required by applicable law to make this license enforceable.
- 1. Summary
- 2. References
- 3. Terms and definitions
- 4. Abbreviated terms
- 5. Overview
- 6. Objectives
- 7. Use Case Summary
- 8. Applicable Best Practices and Standards
- 9. Relations and JSON-LD Contexts
- 10. Summary of Experiment and Analysis of Outcomes
- 11. Issues and Recommendations for Future Work
- Appendix A: Strategies for Publishing Domain Ontologies as Linked Data from OGC domain standards
- A.1. Introduction
- A.2. Problem statement
- A.3. Background
- A.4. Proposed strategy
- A.5. Issues identified
- A.5.1. Naming Policy
- A.5.2. Weaknesses or issues with ISO 19150-2 rules
- A.5.3. Property names and definitions
- A.5.4. Alignment documents (UML → OWL)
- A.5.5. Meta-model issues (expressivity mismatches between OWL and UML)
- A.5.6. Bugs and limitations in software (or things too hard to configure)
- A.5.7. Annotation practices
- A.5.8. Proposed behavior when external classes are specified as property values
- A.5.9. Standing issues
- A.6. Support material
- Appendix B: Default Response for Multiple Representations and Relations
- Appendix C: Use Cases Considered in Design Process
- Appendix D: Revision History
- Appendix E: Bibliography
Services and linked data for distributed spatio-temporal information on the internet hold the promise to increase interoperability and expressivity while decreasing duplication and data maintenance overhead. To achieve this potential, web service APIs and payloads need to work seamlessly with linked open data ontologies and hypermedia. Current OGC services, while flexible and capable, do not directly allow exposure of features in a RESTful way or provide traversable hypermedia describing available methods, data, or interfaces to related (linked) content. Linked open data has the potential to turn the normal OGC service pattern (GetCapabilities request followed by introspection to understand contents) inside out by encoding associations between features as linked data predicates, but specific implementation conventions and best practices are yet to be established.
Two core functional goals motivated the engineering described in this report: 1) The need to encode relationships between and among environmental features, and 2) the need to link observational data to the features they describe. Given that relationships in the absence of context are of little value, a third goal was the need for an ultimately general preview of a feature to provide just enough context in a ubiquitous form. These three functional goals coupled with a non-functional goal of using highly-adoptable technologies provided more than enough problem space for the interoperability experiment to take on.
Prior to initiating ELFIE, significant work had been completed on data models for numerous domains and types of data including hydrology, hydrogeology, soils, observations, and timeseries, but these models were almost never used together. The OGC services baseline was well implemented across the community, but practices for use of OGC services, primarily those based on the Web Feature Service (WFS) standard, were extremely varied and not fully interoperable. ELFIE sought to provide implementation conventions as a target for 1) future data model encodings to integrate data across domains, 2) design and use of web services in a linked data context, and 3) design of future services and software to fill an important gap in open distributed environmental data systems.
At the conclusion of the project, technical best practices for encoding links between and among features and observations are in sight. A general approach to using the power of OGC web service APIs to expose features in the context of rich domain-feature-model-based linked data while following W3C best practices has been described. The solution requires additional work to establish and publish ontologies and points to a number of opportunities to evolve data and service standards to meet the needs of environmental features and observations. This work has begun or will be proposed under the OGC Naming Authority, relevant OGC and W3C Standards and Domain Working Groups, and a follow-on Second ELFIE (SELFIE) focused on URI dereferencing and default content is being planned.
The ELFIE provides recommendations and conventions for work in working groups across OGC and W3C. The work of the WFS Standards Working Group (SWG) on WFS 3.0 appears to be well-suited to ELFIE goals, but the use of HTTP URIs to identify and query for features should be clearly supported by the future incarnation of WFS. Various OGC Working Groups need to establish ontologies and work through the OGC Naming Authority to publish their feature types and associations for broad use in linked data. Similarly, all data providers should pursue enterprise or community capabilities to mint and publish URIs and available representations of features and observations they own. Without this critical foundation, which can and should be recognized as a rudimentary starting point, linked data applications cannot move forward. Finally, the scope of ELFIE was purposefully limited to encoding of links in linked data documents. A future IE should look at an expanded scope to include URI dereferencing and default response content for URIs that identify real-world features and content that links to various information resources that represent or are linked to the real-world non-information resource.
All questions regarding this document should be directed to the editor or the contributors:
U.S. Geological Survey
Land Information New Zealand
Open Geospatial Consortium
Bureau de Recherches Géologiques et Minières
Bureau de Recherches Géologiques et Minières
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. The Open Geospatial Consortium shall not be held responsible for identifying any or all such patent rights.
Recipients of this document are requested to submit, with their comments, notification of any relevant patent claims or other intellectual property rights of which they may be aware that might be infringed by any implementation of the standard set forth in this document, and to provide supporting documentation.
The following normative documents are referenced in this document.
For the purposes of this report, the definitions specified in Clause 4 of the OWS Common Implementation Standard OGC 06-121r9 shall apply. In addition, the following terms and definitions apply.
Facet or attribute of an object referenced by a name [ISO 19143:2010, definition 4.21]
Act of measuring or otherwise determining the value of a property [ISO 19156:2011, definition 4.11]
specific (simplified and useful) subset of a rich (complete but complicated) linked data graph
A URI for a non-information resource, which is a web resource that cannot be transmitted electronically. A URI is defined as a non-information URI when an HTTP GET request does not return a 2xx “Success” response when dereferenced. No data are returned because the URI represents something other than a web information resource, i.e. a person or object.
A physiographic unit defined by a common hydrologic outlet and potentially defined input(s). May have several representations per OGC 14-111r6 HY_Features. Note that “watershed” translations may be problematic (e.g. translating the term into German leads to two valid but different terms: “Wasserscheide”, which is the actual drainage divide, and “Einzugsgebiet”, which is the area). The definition here encompasses both.
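The non-information URI definition above amounts to a test on the HTTP status code returned when a URI is dereferenced. A minimal sketch of that test follows. The function name and the sample status codes are illustrative assumptions, not part of the report, and a real client would also need to handle redirects and network errors.

```python
def is_non_information_uri(status_code: int) -> bool:
    """Apply the report's definition: anything other than a 2xx
    "Success" response marks the URI as a non-information URI."""
    return not (200 <= status_code < 300)


# Hypothetical status codes a client might receive when dereferencing.
for code in (200, 303, 404):
    print(code, is_non_information_uri(code))
```

Under this definition a redirect such as 303 counts as a non-information URI, because the response itself carries no representation of the identified thing.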
API - Application Programming Interface
ELFIE - Environmental Linked Features Interoperability Experiment
GeoSciML - Geoscience Markup Language
GeoSPARQL - Geospatial SPARQL Protocol and RDF Query Language (recursive acronym)
GWML2 - WaterML2 Part 4: Groundwater Markup Language
HTTP - Hypertext Transfer Protocol
HY_Features - WaterML2 Part 3: Surface Water Hydrologic Features
IE - Interoperability Experiment
O&M - Observations and Measurements
OWL - Web Ontology Language
OWS - OGC Web Services Common
RDF - Resource Description Framework
REST - Representational State Transfer
SHACL - Shapes Constraint Language
SOS - Sensor Observation Service
SOSA - Sensor, Observation, Sample, and Actuator
URI - Uniform Resource Identifier
URL - Uniform Resource Locator
WaterML2 - Suite of Data Standards for the Hydrology Domain
WFS - Web Feature Service
UML - Unified Modeling Language
W3C - World Wide Web Consortium
WKT - Well Known Text
XML - eXtensible Markup Language
The Environmental Linked Features Interoperability Experiment (ELFIE) has explored existing OGC and W3C standards with the goal of establishing a best practice for exposing cross-domain links between environmental domain and sampling features. The IE focused on two high-level functional goals:
- The need to encode topological and domain-specific relationships between cross-domain features.
- The need to link available observations data to sampled domain features.
Current OGC services, while flexible and capable, do not directly allow exposure of features in a RESTful way or provide traversable hypermedia describing available methods, data, or interfaces to related (linked) content. The encodings demonstrated in the ELFIE turn the normal OGC service pattern (GetCapabilities request followed by introspection to understand contents) inside out by encoding associations between features as linked data predicates. This approach uses Resource Description Framework (RDF) triples (subject-predicate-object) to describe resource-relation-resource associations where the resources are unambiguous feature identifiers, but may resolve (dereference) to OGC web service requests as illustrated in the W3C/OGC Spatial Data on the Web Best Practices (Best Practice 1). This leverages the power of services such as WFS to provide feature instances, but expresses the structure of complex features or cross-service associated features explicitly.
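The subject-predicate-object pattern described above can be sketched with plain tuples. The feature URIs are hypothetical, and the two predicates (one from SOSA, one from GeoSPARQL) are chosen only for illustration. This is not the set of relations the IE settled on.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical feature identifiers (assumptions for illustration).
WELL = "https://example.org/id/well/1"
AQUIFER = "https://example.org/id/aquifer/7"
WATERSHED = "https://example.org/id/watershed/3"

# Predicates from published vocabularies, used here only as examples.
IS_SAMPLE_OF = "http://www.w3.org/ns/sosa/isSampleOf"
SF_WITHIN = "http://www.opengis.net/ont/geosparql#sfWithin"

TRIPLES = [
    Triple(WELL, IS_SAMPLE_OF, AQUIFER),
    Triple(WELL, SF_WITHIN, WATERSHED),
]


def related(subject: str, predicate: str, triples: List[Triple]) -> List[str]:
    """Return the objects linked to `subject` through `predicate`."""
    return [t.obj for t in triples
            if t.subject == subject and t.predicate == predicate]


print(related(WELL, IS_SAMPLE_OF, TRIPLES))
```

Because every resource is an identifier rather than an embedded record, a client can follow the link first and decide separately how to retrieve a representation of the target.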
The ELFIE has chosen to use JSON-LD as its encoding of RDF data. This was done in alignment and collaboration with the concurrent OGC work on JSON Best Practices. The ubiquity of JSON in Web development also contributed to the choice, since it eases implementation of the RDF model by this community. The IE has specified a number of JSON-LD contexts that describe useful RDF views. JSON-LD documents that conform to these contexts can be delivered as the payload returned from a REST API endpoint. In this case, they are self-describing through inclusion of the context document reference. They may also be negotiated, for instance using the view pattern described by the Linked Data API. (The DXWG is currently developing a formal standard for “content negotiation by profile” which confirms the relevance of this pattern.) Using this approach, the recommendation takes advantage of the power and flexibility of Linked Data architecture while providing just enough guidance to foster interoperability within a simplified implementation regime based on JSON. By describing a partially resolved useful view of a linked data graph, it also allows resources that are important but not yet available using Semantic Web formalisms to be incorporated into a linked data system. This mechanism should provide a gateway to inclusion in and understanding the value of the Linked Open Data Cloud.
The scope of this work could easily have gotten out of control, since the Linked Data and Semantic Web community has never fully come to grips with the issue of pragmatic limits to graph shapes, non-unique naming, and content negotiation, nor has the Web Services world addressed linking objects and granularity of responses in a distributed environment. To focus the work, the IE attempted to avoid issues not directly related to RDF encodings as would be passed from server to client via HTTP to support the two high-level abstract use cases above.
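To illustrate how a context makes a plain JSON payload self-describing, the following sketch pairs a document with a small context and resolves its keys to full IRIs. The context URL, the feature URI, and the context contents are hypothetical, and `expand_term` is a deliberately simplified stand-in for a real JSON-LD processor, which handles many more cases.

```python
import json

# Hypothetical context location and contents (assumptions).
CONTEXT_URL = "https://example.org/contexts/preview.jsonld"
CONTEXT = {
    "schema": "https://schema.org/",
    "name": "schema:name",
    "description": "schema:description",
}

# A payload that references its context by URL, as a REST endpoint might return.
document = {
    "@context": CONTEXT_URL,
    "@id": "https://example.org/id/watershed/3",
    "name": "Example watershed",
    "description": "A watershed used only for illustration.",
}


def expand_term(term: str, context: dict) -> str:
    """Resolve a JSON key to a full IRI using a flat context (simplified)."""
    value = context.get(term, term)
    prefix, sep, suffix = value.partition(":")
    if sep and prefix in context:
        return context[prefix] + suffix
    return value


expanded = {expand_term(key, CONTEXT): value
            for key, value in document.items() if not key.startswith("@")}
print(json.dumps(expanded, indent=2))
```

A developer who ignores the `@context` key still sees ordinary JSON, which is the adoptability argument made above. A linked data client can use the same context to recover the RDF predicates.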
The ELFIE tested the assumption that broadly adopted web technologies for linked data could be applied in the environmental domain to satisfy the needs of a number of compelling use cases. The Objectives section describes engineering design goals and details the assumptions and scope that were used to constrain the magnitude of the IE. Much of the IE involved design discussions amongst participants where candidate use cases were considered and potential technologies and implementations were explored. The Use Case Summary section describes these use cases and the select set that were more thoroughly tested to evolve the technologies that were agreed upon during design discussions.
The ELFIE technology selection criteria were that technologies be adoptable, common, and suitable to the needs of the ELFIE use cases. In some cases, new and thus lightly tested domain feature models were applied with the chosen, well-established technologies. The Applicable Best Practices and Standards section is a brief discussion of the technologies and standards tested in the ELFIE. The technology chosen to provide a precise but not overly prescriptive description of the RDF views is the JSON-LD context. Details of the contexts defined in the ELFIE are discussed in the Relations and JSON-LD Contexts section.
The outcomes of the ELFIE provide guidance for creating (graph) views of environmental features and observations. A simple view intended to be a preview, suitable for basic indexing, is specified. Views intended to provide topological as well as observational and domain feature model relationships were also explored. The Preview view was largely successful, but there was no one satisfactory solution for expression of a preview geometry because GeoJSON is incompatible with JSON-LD, WKT is thought to be difficult to work with in typical web environments, and the schema.org geometries were found to have limited utility. The experimentation with domain feature models led to the conclusion that the underlying linked data technologies are mature enough, but progress is still needed on domain feature ontologies and encoding geometry. These issues are discussed in more detail in the Summary of Experiment and Analysis of Outcomes and Issues and Recommendations for Future Work sections.
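For readers unfamiliar with the three geometry options named above, the following sketch encodes the same point in each form. The coordinates are hypothetical and the helper is illustrative only. It does not reproduce the preview geometry the IE adopted.

```python
import json


def point_encodings(lon: float, lat: float) -> dict:
    """Encode one point as GeoJSON, WKT, and schema.org GeoCoordinates."""
    return {
        # GeoJSON: coordinates are an array ordered longitude, latitude.
        "geojson": {"type": "Point", "coordinates": [lon, lat]},
        # WKT: a string the client must parse before using the geometry.
        "wkt": f"POINT ({lon} {lat})",
        # schema.org: named latitude and longitude properties.
        "schema_org": {"@type": "GeoCoordinates",
                       "latitude": lat, "longitude": lon},
    }


encodings = point_encodings(-89.5, 43.0)  # hypothetical location
print(json.dumps(encodings, indent=2))
```

One commonly cited reason for the GeoJSON mismatch, offered here as background rather than as a finding of this report, is that JSON-LD 1.0 does not allow lists of lists, which GeoJSON relies on for line and polygon coordinates.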
The ELFIE explored a number of interesting but tangential topics. Appendix A: Strategies for Publishing Domain Ontologies as Linked Data from OGC domain standards is a summary of one important topic discussed: how to generate a feature ontology from a UML model. Appendix B: Default Response for Multiple Representations and Relations is a brief discussion of issues related to what the default response should be when resolving a URI for a non-information resource that may have multiple representations.
The ELFIE objectives were to create and demonstrate a reusable framework that met the following functional and operational goals. The IE participants took a positive view, assuming that the goals below are achievable, and sought to test this assumption through design and implementation of compelling test cases.
Functional goals of the ELFIE focused on providing:
- the ability to describe and use links between features that adhere to domain-specific feature models, and
- the ability to link the above features to observations and measurements data from samples collected on these features.
Within this, the ELFIE aimed to leverage existing standards and best practices while maintaining a high degree of simplicity. This led IE participants to seek operational solutions that:
- are adoptable by developers and users with a wide array of technical skill levels,
- are ubiquitous and easily adoptable in any programming language, and
- adhere, to the extent possible, to standard taxonomies and ontologies.
It is possible and desirable to implement a common approach to encoding links between environmental domain features which allow cross-domain and cross-system sharing and interoperability of such linked information.
Existing and pending OGC standards for the encoding of environmental observation data in an integrated dataset of features can be linked according to RESTful and Linked Data principles.
The ability to encode documents containing links between monitoring sites and environmental domain features in a common way will enable automation and lines of inquiry that are not possible without manual intervention today.
A reusable approach to encode documents that use these information models in cross-disciplinary applications is achievable even though it is not currently apparent given existing encodings, implementation standards, and best practices.
To focus this IE, only environmental domain models concerning landscape interactions with the hydrologic cycle were considered. This included surface water, groundwater, well/borehole, and soil moisture (WaterML2, GroundwaterML 2, GeoSciML 4, and SoilML).
Focus was on:
- linked data documents that contain collections of linked features and related observations, and
- feature or observation representation documents that contain links to related features or observations.
The IE constrained itself to using existing ontologies and vocabularies wherever possible.
Problems regarding network architecture for resolving links and systems design and governance for applications that store and retrieve links or concept relationships were deemed out of scope.
Issues related to default behavior when dereferencing an identifier, while of great importance to the ELFIE goals, were also declared out of scope of this IE.
Discovery of available views and negotiation of choice between client and server were deemed out of scope, although the ability to define a view is seen as a key stepping stone to such options in the future.
To test the assumptions described above, a number of example test cases (described below) were used to guide the design and testing of potential technical solutions. The use cases were hypothetical, clearly of value to the domain, and technically difficult using existing technologies and practices. Discussion of use cases was predicated on the assumption that these would be implemented as test cases under the IE. While time and resources did not allow all these use cases to be implemented for the ELFIE, the use cases provided invaluable context and example systems to guide discussion and vetting of design alternatives.
The specifics of technologies and practices the IE tested are described below. With a clear direction, participants implemented test data and code to exercise and demonstrate the IE outcomes. Application of the use cases considered in the design phase led to some refinement of the design, but the design selected largely satisfied the goals of the IE, with some clear next steps and further work needed to formalize data models, encodings, and best practices. The experiment succeeded in identifying shortcomings of some existing practices and gaps in the standards baseline that need to be filled. Setting aside this further and new work, the IE found no significant or fundamental problems with the best practices and standards tested.
The design phase of the ELFIE focused on identification of a number of use cases with extensive need for linked environmental data. These use cases focused on known difficulties of the systems the ELFIE participants maintain or depend on as users. Many of these use cases were used for the discovery and vetting of ideas only. Others were further developed with implementation of example encoded linked data and client code to produce desired results of the use case.
Test cases that were implemented as demonstrations of the design outcomes are described below in full detail to include notional diagrams of how datasets involved link to each other, a thorough description of the use case, and an example application that speaks to and/or satisfies the needs of the use case.
The full suite of use cases considered during the design phase of the project is described in the Use Cases Considered in Design Process appendix. Each use case takes the form of a simple description of who or what system has an interest in what related datasets. Specific datasets are listed to clarify the scope and provide a starting point for future technical design work.
Water budget summary: This use case provides a person interested in a basic water budget summary for a given watershed with information about a collection of watersheds and their water budget data. It links together various hydrographic representations of each watershed as well as observational water budget data and related web resources.
As shown in Figure 1, the watershed feature (black) is linked to various representations of it or information characterizing it (white). This follows the HY_Features HY_Catchment concept (Watershed in Figure 1) as an unrealized feature that is related to various realization features; there is no canonical representation of the watershed itself.
Additional details about the water budget summary use case implementation are available on the demonstration ELFIE web page.
Flood risks and impacts: This use case provides a decision maker needing to respond to flooded transportation infrastructure the information they need to understand the impacted assets and flooded roads for a forecast flood. Under this use case, a flood forecasting system would be able to discover vulnerable infrastructure and assets published by local jurisdictions as linked data and publish flood forecasts that include potentially impacted features in forecast information products.
Additional details about the flood risk use case implementation are available on the demonstration ELFIE web page.
Ground water level monitoring: This use case is meant to demonstrate how, from a given well URI, any user (domain expert, machine) can traverse to the monitoring strategy deployed (piezometer information etc.) and then access ground water level time series and/or information about the monitored aquifer.
Additional details about the ground water level monitoring use case implementation are available on the demonstration ELFIE web page.
Surface-ground water networks interaction: This use case is meant to demonstrate how, from a given Piezometer URI, any user (domain expert, machine) can traverse to the ground water monitoring strategy (see Ground water level monitoring Use Case) but also to the associated surface water monitoring one. Provided each surface/groundwater feature is properly linked together (River network, Aquifer system), it is then feasible to discover information about the full, comprehensive water system. This use case can be seen as a flagship demonstration of the usefulness of linked data in the environmental/cross-domain context.
Additional details about the surface-ground water networks interaction use case implementation are available on the demonstration ELFIE web page.
Watershed data index: This use case is meant to demonstrate the use of HY_Features to link a catchment to the data representing it as well as the monitoring network associated with it. It serves as a general demonstration that could be used for a wide array of linked watershed information use cases.
Additional details about the watershed data index use case implementation are available on the demonstration ELFIE web page.
This use case is introduced in more detail than those above and its technical details are presented below. Technical details of other use cases can be found at the ELFIE demonstration web page.
The watershed data index use case is focused on a single HY_Catchment feature with an identifier of "070900020601" from the U.S. watershed boundary dataset. Given that HY_Catchment is an unrealized feature type, the document describing "070900020601" links to realizations of "070900020601". Three catchment realizations are included:
A more complete implementation could include multiple versions of any of these feature types as well as additional realization types such as a network of channels, or a network of sub-catchments. The boundary polygon and hydrographic network are geospatial features that only link back to the catchment they realize. The hydrometric network is a more complex feature that aggregates a set of network stations, each of type HY_HydrometricFeature.
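The link structure described above can be sketched as a JSON-LD-like document built in Python. Only the identifier "070900020601" and the type names HY_Catchment and HY_HydrometricFeature come from the text. The URI base, the keys `realizedBy`, `realizes`, and `stations`, and the two station identifiers are hypothetical placeholders, not the IE's actual encoding.

```python
CATCHMENT_ID = "070900020601"    # identifier quoted in the text
BASE = "https://example.org/id"  # hypothetical URI base

catchment_uri = f"{BASE}/catchment/{CATCHMENT_ID}"


def realization(kind: str, **extra) -> dict:
    """Build a realization that links back to the catchment it realizes."""
    return {"@id": f"{BASE}/{kind}/{CATCHMENT_ID}",
            "realizes": catchment_uri, **extra}


# The hydrometric network aggregates stations of type HY_HydrometricFeature.
stations = [{"@id": f"{BASE}/station/{n}", "@type": "HY_HydrometricFeature"}
            for n in (1, 2)]

catchment = {
    "@id": catchment_uri,
    "@type": "HY_Catchment",
    "realizedBy": [
        realization("boundary-polygon"),
        realization("hydrographic-network"),
        realization("hydrometric-network", stations=stations),
    ],
}


def links_back(doc: dict) -> bool:
    """Check that every realization points back to the catchment."""
    return all(r["realizes"] == doc["@id"] for r in doc["realizedBy"])


print(links_back(catchment))
```

The catchment document carries no geometry of its own, consistent with HY_Catchment being an unrealized feature. Geometry and stations live in the realizations it links to.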