Publication Date: 2017-05-12

Approval Date: 2017-02-20

Posted Date: 2016-11-09

Reference number of this document: OGC 16-021r1

Reference URL for this document: http://www.opengis.net/doc/PER/t12-U001

Category: Public Engineering Report

Editor: Benjamin Pross

Title: Testbed-12 Low Bandwidth & Generalization Engineering Report


Testbed-12 Low Bandwidth & Generalization Engineering Report (OGC 16-021r1)

COPYRIGHT

Copyright © 2017 Open Geospatial Consortium. To obtain additional rights of use, visit http://www.opengeospatial.org/

WARNING

This document is an OGC Public Engineering Report created as a deliverable of an initiative from the OGC Innovation Program (formerly OGC Interoperability Program). It is not an OGC standard and not an official position of the OGC membership. It is distributed for review and comment. It is subject to change without notice and may not be referred to as an OGC Standard. Further, any OGC Engineering Report should not be referenced as required or mandatory technology in procurements. However, the discussions in this document could very well lead to the definition of an OGC Standard.

LICENSE AGREEMENT

Permission is hereby granted by the Open Geospatial Consortium, ("Licensor"), free of charge and subject to the terms set forth below, to any person obtaining a copy of this Intellectual Property and any associated documentation, to deal in the Intellectual Property without restriction (except as set forth below), including without limitation the rights to implement, use, copy, modify, merge, publish, distribute, and/or sublicense copies of the Intellectual Property, and to permit persons to whom the Intellectual Property is furnished to do so, provided that all copyright notices on the intellectual property are retained intact and that each person to whom the Intellectual Property is furnished agrees to the terms of this Agreement.

If you modify the Intellectual Property, all copies of the modified Intellectual Property must include, in addition to the above copyright notice, a notice that the Intellectual Property includes modifications that have not been approved or adopted by LICENSOR.

THIS LICENSE IS A COPYRIGHT LICENSE ONLY, AND DOES NOT CONVEY ANY RIGHTS UNDER ANY PATENTS THAT MAY BE IN FORCE ANYWHERE IN THE WORLD. THE INTELLECTUAL PROPERTY IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, AND NONINFRINGEMENT OF THIRD PARTY RIGHTS. THE COPYRIGHT HOLDER OR HOLDERS INCLUDED IN THIS NOTICE DO NOT WARRANT THAT THE FUNCTIONS CONTAINED IN THE INTELLECTUAL PROPERTY WILL MEET YOUR REQUIREMENTS OR THAT THE OPERATION OF THE INTELLECTUAL PROPERTY WILL BE UNINTERRUPTED OR ERROR FREE. ANY USE OF THE INTELLECTUAL PROPERTY SHALL BE MADE ENTIRELY AT THE USER’S OWN RISK. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR ANY CONTRIBUTOR OF INTELLECTUAL PROPERTY RIGHTS TO THE INTELLECTUAL PROPERTY BE LIABLE FOR ANY CLAIM, OR ANY DIRECT, SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES, OR ANY DAMAGES WHATSOEVER RESULTING FROM ANY ALLEGED INFRINGEMENT OR ANY LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR UNDER ANY OTHER LEGAL THEORY, ARISING OUT OF OR IN CONNECTION WITH THE IMPLEMENTATION, USE, COMMERCIALIZATION OR PERFORMANCE OF THIS INTELLECTUAL PROPERTY.

This license is effective until terminated. You may terminate it at any time by destroying the Intellectual Property together with all copies in any form. The license will also terminate if you fail to comply with any term or condition of this Agreement. Except as provided in the following sentence, no such termination of this license shall require the termination of any third party end-user sublicense to the Intellectual Property which is in force as of the date of notice of such termination. In addition, should the Intellectual Property, or the operation of the Intellectual Property, infringe, or in LICENSOR’s sole opinion be likely to infringe, any patent, copyright, trademark or other right of a third party, you agree that LICENSOR, in its sole discretion, may terminate this license without any compensation or liability to you, your licensees or any other party. You agree upon termination of any kind to destroy or cause to be destroyed the Intellectual Property together with all copies in any form, whether held by you or by any third party.

Except as contained in this notice, the name of LICENSOR or of any other holder of a copyright in all or part of the Intellectual Property shall not be used in advertising or otherwise to promote the sale, use or other dealings in this Intellectual Property without prior written authorization of LICENSOR or such copyright holder. LICENSOR is and shall at all times be the sole entity that may authorize you or any third party to use certification marks, trademarks or other special designations to indicate compliance with any LICENSOR standards or specifications.

This Agreement is governed by the laws of the Commonwealth of Massachusetts. The application to this Agreement of the United Nations Convention on Contracts for the International Sale of Goods is hereby expressly excluded. In the event any provision of this Agreement shall be deemed unenforceable, void or invalid, such provision shall be modified so as to make it valid and enforceable, and as so modified the entire Agreement shall remain in full force and effect. No decision, action or inaction by LICENSOR shall be construed to be a waiver of any rights or remedies available to it.

None of the Intellectual Property or underlying information or technology may be downloaded or otherwise exported or reexported in violation of U.S. export laws and regulations. In addition, you are responsible for complying with any local laws in your jurisdiction which may impact your right to import, export or use the Intellectual Property, and you represent that you have complied with any regulations or registration procedures required by applicable law to make this license enforceable.

Abstract

To deliver data offered by OGC services over (very) low bandwidth, two options may be considered. On the one hand, the geospatial features remain the same, but compression techniques are used to reduce the size of the data that needs to be transferred. On the other hand, generalization techniques may be used to reduce the detail of geometries and/or attributes and thereby the amount of data. The aim of this ER is to summarize the results of implementing sample services using compression techniques for DGIWG WFS (U002) and providing generalization processes using WPS (U003). The ER compares the results of the different approaches and infers recommendations and best practices for supporting data delivery of standard data and complex 3D data from OGC services over low and very low bandwidth.

Business Value

The evaluated compression techniques for WFS could lead to further documents like a profile or a best practice document. The WPS Generalization Implementation will be a use case for web based processing. Furthermore, the findings summarized in this ER could lead to a WPS 2.0 profile for generalization.

Technology Value

The Generalization Profile that is described in this ER serves as proof of concept for the WPS 2.0 profiling approach. With the generalization processes a use case for web-based processing is given.

Keywords

ogcdocs, testbed-12, low bandwidth, generalization, wfs, wps

Proposed OGC Working Group for Review and Approval

WPS 2.0 SWG, WFS/FES SWG

1. Introduction

1.1. Scope

This ER analyzes the prototype implementations, approaches, test architectures and performance aspects of the geospatial data generalization and compression techniques explored in OGC Testbed 12, and reports the findings. OGC Testbed 12 investigated extending WFS with Efficient XML Interchange (EXI) output formats as a method of supporting WFS data delivery in low bandwidth environments. It also investigated how generalization methods can be encapsulated in WPS processes and how WPS profiles for generalization can be specified.

1.2. Document contributor contact points

All questions regarding this document should be directed to the editor or the contributors:

Table 1. Contacts
Name - Organization

  • Benjamin Pross - 52°North GmbH

  • Jeff Harrison - The Carbon Project

1.3. Future Work

See section 10.

1.4. Foreword

Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. The Open Geospatial Consortium shall not be held responsible for identifying any or all such patent rights.

Recipients of this document are requested to submit, with their comments, notification of any relevant patent claims or other intellectual property rights of which they may be aware that might be infringed by any implementation of the standard set forth in this document, and to provide supporting documentation.

2. References

The following documents are referenced in this document. For dated references, subsequent amendments to, or revisions of, any of these publications do not apply. For undated references, the latest edition of the normative document referred to applies.

  • OGC 06-121r9, OGC® Web Services Common Standard

  • OGC 09-025r2, OGC® Web Feature Service 2.0 Interface Standard

  • OGC 14-065, OGC® WPS 2.0 Interface Standard

3. Terms and definitions

For the purposes of this report, the definitions specified in Clause 4 of the OWS Common Implementation Standard [OGC 06-121r9] shall apply. In addition, the following terms and definitions apply.

3.1. Compression

Process of reducing the size of data.

3.2. Generalization

Process of reducing the detail in data.

4. Conventions

4.1. Abbreviated terms

  • API - Application Program Interface

  • AOI - Area-of-Interest

  • DGIWG - Defense Geospatial Information Working Group

  • EXI - Efficient XML Interchange

  • FTP - File Transfer Protocol

  • GML - Geography Markup Language

  • JSON - JavaScript Object Notation

  • OGC - Open Geospatial Consortium

  • W3C - World Wide Web Consortium

  • WFS - Web Feature Service

  • WPS - Web Processing Service

  • XML - Extensible Markup Language

5. Overview

This ER analyzes the prototype implementations, approaches, test architectures and performance aspects of the geospatial data generalization and compression techniques explored in OGC Testbed 12, and reports the findings. OGC Testbed 12 investigated extending WFS with Efficient XML Interchange (EXI) output formats as a method of supporting WFS data delivery in low bandwidth environments. It also investigated how generalization methods can be encapsulated in WPS processes and how WPS profiles for generalization can be specified.

First, some background information will be provided, followed by a description of the Low Bandwidth WFS and the Generalization WPS. In the final section, recommendations for future work are given.

6. Background

In this chapter, the DGIWG profile as well as compression and generalization techniques are introduced.

6.1. DGIWG - Web Feature Service 2.0 Profile

The DGIWG - Web Feature Service 2.0 Profile document provides recommended implementation profiles for the ISO 19142:2010 Web Feature Service / Open Geospatial Consortium Web Feature Service Interface Standard (WFS) 2.0 – With Corrigendum. The WFS standard provides an interface allowing requests for geospatial features across the web using platform-independent mechanisms and is independent of the underlying data store. One can think of geospatial features as the "source code" behind a map. Whereas the OGC Web Map Service (WMS) interface or online mapping portals return only an image, which end-users cannot edit or spatially analyze, the WFS provides XML-based Geography Markup Language (GML) as the default payload-encoding for transporting geospatial features. In other words, rather than sharing geographic information at the file level using File Transfer Protocol (FTP), for example, the WFS offers direct fine-grained access to geographic information at the feature and feature property level.

The WFS standard specifies discovery operations, query operations, locking operations, transaction operations and operations to manage stored parameterized query expressions. The WFS interface permits users to access and manipulate geospatial feature information from distributed network sources. Technical specifications sometimes have optional features, such that two conforming implementations may not interoperate completely because they support different sets of optional features. Even when no formal optional features exist within a standard, there is still a risk that vendors will not implement the functionality that is most important to the military community. Also, some standards contain vague or ambiguous wording; developing and using profiles can enforce a single interpretation.

To limit the number of interpretations by implementers and improve interoperability it is possible to define profiles. In standardization, a profile consists of an agreed-upon subset and specific interpretation of a specification. The intention of the DGIWG WFS 2.0 profiles is to minimize such interoperability issues with a specific view to a military context and to mandate a minimum set of service requirements necessary to ensure usability in an operational coalition environment. These profiles are designed to both increase interoperability between WFS servers and to improve the ease of implementation of the WFS standard.

A survey of DGIWG nations was conducted to determine implementation requirements for WFS. These profiles are in response to those survey results. Nations were asked to identify specific requirements for the type of WFS required (Simple, Basic, Transactional, Transactional with Locking, Manage Stored Queries). Based on the results of this survey the profiles define requirements for both a Basic WFS and a Transactional with Locking WFS. The survey also asked respondents to identify requirements for query filters, bindings, bandwidth constraints, output formats and quality of service.

One of the areas for profile assessment was using WFS in low bandwidth environments. This is because results from previous OGC activities and operational deployments indicate that transferring large volumes of geodata from a WFS over a network with poor or very low bandwidth can take a significant amount of time and network capacity.
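The relationship between payload size, bandwidth and transfer time can be made concrete with a small Python sketch. The function name and the example numbers are hypothetical and are not measurements from the testbed; protocol overhead and latency are ignored.

```python
def transfer_seconds(payload_bytes: int, bandwidth_bits_per_second: float) -> float:
    """Idealized transfer time: payload size in bits divided by link bandwidth."""
    return payload_bytes * 8 / bandwidth_bits_per_second


# Hypothetical example: the same response before and after a size reduction,
# sent over an assumed 9600 bit/s link.
for label, size in [("uncompressed", 1_200_000), ("compressed", 120_000)]:
    print(f"{label}: {transfer_seconds(size, 9600):.0f} s")
```

Under these assumptions the transfer time scales linearly with payload size, which is why both compression and generalization are candidates for low bandwidth delivery.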

To help meet this challenge, OGC Testbed 12 developed prototype implementations and conducted Technology Integration Experiments to assess optimizing data transfer under bandwidth-constrained conditions. This document discusses geospatial data size reduction and compression techniques relevant to JSON over GML, zipped XML, EXI, etc. Development and testing in Testbed 12 focused on enhancing WFS for EXI compression, with a focus on capabilities that may be considered for possible DGIWG WFS profiling.

6.2. Compression Techniques and Software

The W3C Recommendation Efficient XML Interchange (EXI) Format 1.0 (Second Edition) is a representation for the Extensible Markup Language (XML) Information Set. EXI is intended to optimize performance and the utilization of computational resources. From a practical viewpoint, EXI is designed to reduce the size of XML data exchanged between computers.

EXI uses a grammar-driven approach designed to achieve 'efficient encodings representations'. Consequently, EXI processors are described by the W3C as ‘relatively simple’ and ‘can be implemented on devices with limited capacity.’ An EXI processor is used by application programs to encode their structured data into EXI streams and/or to decode EXI streams to make the data accessible.

EXI is schema ‘informed’, meaning that it can use available schema information to improve compactness and performance. However, the W3C indicates that EXI does not depend on accurate, complete or current schemas to work – a statement which must be considered carefully when using EXI for geographic feature data.

6.2.1. W3C EXI Documents

Despite W3C statements that EXI processors are ‘relatively simple’, Efficient XML Interchange (EXI) is a very complex topic. The reader is encouraged to review the following W3C documents for a complete background -

6.3. Schema-informed vs Schema-less Compression

Review of the references above indicates the W3C describes EXI as not 'dependent on schemas'. However, prior investigations have assessed that EXI may compress XML more efficiently if schemas exist describing the format of the expected XML. As background, it is important to understand there are two main ways in which EXI encodes XML documents -

  • Schema-less - In the schema-less mode, EXI encodes an XML document whether or not a schema is available to the encoder.

  • Schema-informed - In the schema-informed mode, EXI encoding can utilize available schema information to improve compactness and performance, but does not depend on accurate, complete or current schemas to work.

EXI uses a set of built-in grammars to encode XML documents and XML fragments when no schema information is available.

The two modes, schema-less and schema-informed, are important for GML and WFS compression. Prior reports noted that coordinates in GML 2 may be defined in a schema as string ('text'). Since schema optimization cannot improve on 'text', this may result in poor compression when GML data described by a GML 2 schema contains many coordinates. The same reports noted, however, that coordinates in GML 3 may be defined in a schema as float. Schema optimization can improve on float, resulting in good compression for data with many coordinates. The key point is that compression performance may depend on the design of the schema as well as on the XML data itself.
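The effect of typing coordinates as text versus as numbers can be illustrated with a minimal Python sketch. It is not an EXI encoder: fixed 8-byte IEEE 754 doubles are used only as a stand-in for a typed encoding (EXI defines its own float representation), and the coordinate values are hypothetical.

```python
import struct

# Hypothetical lon/lat pairs, for illustration only.
coords = [(-122.419416, 37.774929), (-122.418330, 37.775000), (-122.417001, 37.776120)]


def as_text(points):
    """Carry the coordinates as a character string, as a text-typed schema would."""
    return " ".join(f"{x:.6f},{y:.6f}" for x, y in points).encode("utf-8")


def as_doubles(points):
    """Carry each ordinate as an 8-byte double (stand-in for a typed encoding)."""
    flat = [value for point in points for value in point]
    return struct.pack(f"<{len(flat)}d", *flat)


text_bytes = as_text(coords)
typed_bytes = as_doubles(coords)
print(len(text_bytes), len(typed_bytes))
```

With these sample values the typed form is shorter than the text form, and the gap grows with the number of digits written per ordinate.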

6.3.1. EXI Streams

EXI represents the contents of an XML document as an EXI stream. An EXI stream consists of an EXI header followed by an EXI body.

The EXI header conveys format version information and may also include the set of options that were used during encoding. If these options are omitted, it is assumed that the decoder has access to them out of band.

The EXI body comprises an event sequence describing the document (or document fragment) that is encoded.
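The start of the header can be inspected with a short Python sketch. It assumes the bit layout given in the W3C EXI 1.0 specification (an optional "$EXI" cookie, the two distinguishing bits "10", one presence bit for the options, then the format version); the class and function names are hypothetical, and the options document itself is not decoded.

```python
class BitReader:
    """Reads bits most-significant-bit first from a bytes object."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0

    def read(self, count: int) -> int:
        value = 0
        for _ in range(count):
            byte = self.data[self.pos // 8]
            value = (value << 1) | ((byte >> (7 - self.pos % 8)) & 1)
            self.pos += 1
        return value


def parse_exi_header_prefix(stream: bytes) -> dict:
    """Decode cookie, distinguishing bits, options presence bit and version."""
    has_cookie = stream[:4] == b"$EXI"
    bits = BitReader(stream[4:] if has_cookie else stream)
    if bits.read(2) != 0b10:
        raise ValueError("distinguishing bits '10' not found")
    options_present = bits.read(1) == 1
    preview = bits.read(1) == 1
    # Version = 1 + sum of 4-bit groups; a group below 15 ends the sequence.
    version = 1
    while True:
        group = bits.read(4)
        version += group
        if group != 15:
            break
    return {
        "cookie": has_cookie,
        "options_present": options_present,
        "preview": preview,
        "version": version,
    }


print(parse_exi_header_prefix(b"\x80"))
print(parse_exi_header_prefix(b"$EXI\xa0"))
```

When `options_present` is false, a decoder has to know the encoding options by other means, which matches the out-of-band case described above.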

6.3.2. EXI Option Values

In addition to the different compression performance that may be obtained with or without schemas, different types of option values may be used while encoding XML documents in an EXI stream. Option values are part of the EXI header and provide a way to specify the options used to encode the body of an EXI stream. There are many option values outlined in the W3C EXI specification and the reader is again encouraged to review them as needed.

The most significant option values for Testbed 12 WFS Compression testing are presented and defined in the following table.

Table 2. EXI Options Values
Option Value - Description

  • COMPRESSION - This compression option is used to increase compactness using additional computational resources (via DEFLATE algorithm).

  • BIT_PACKED - If the alignment option value is bit packed, that indicates that event codes and associated content are packed in bits without any padding in-between.

  • BYTE_PACKED - The alignment option value byte-alignment indicates that the event codes and associated content are aligned on byte boundaries. While byte-alignment generally results in EXI streams of larger sizes compared with their bit-packed equivalents, byte-alignment may provide a help in some use cases that involve frequent copying of large arrays of scalar data directly out of the stream. It can also make it possible to work with data in-place and can make it easier to debug encoded data by allowing items on aligned boundaries to be easily located in the stream.

  • PRE_COMPRESSION - This alignment option value indicates that all steps involved in compression are to be done with the exception of the final step of applying the DEFLATE algorithm.
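The size difference between bit-packed and byte-aligned output can be illustrated with a minimal Python sketch. It is not an EXI encoder: the field values and widths are hypothetical, and the sketch only shows how padding each field to a byte boundary inflates the stream.

```python
def pack_bits(fields):
    """Concatenate (value, width) fields with no padding between them;
    only the end of the stream is padded to a whole byte."""
    accumulator, bit_count = 0, 0
    for value, width in fields:
        accumulator = (accumulator << width) | (value & ((1 << width) - 1))
        bit_count += width
    padding = (-bit_count) % 8
    return (accumulator << padding).to_bytes((bit_count + padding) // 8, "big")


def pack_bytes(fields):
    """Write every field in the smallest whole number of bytes that holds its width."""
    out = bytearray()
    for value, width in fields:
        out += value.to_bytes((width + 7) // 8, "big")
    return bytes(out)


# Hypothetical event codes and values: (value, width in bits).
fields = [(2, 2), (1, 1), (5, 3), (200, 8), (3, 2)]
print(len(pack_bits(fields)), len(pack_bytes(fields)))
```

For these sample fields the bit-packed form needs 2 bytes and the byte-aligned form 5 bytes; in exchange, the byte-aligned form keeps every field on an addressable boundary.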

6.3.3. Compression Software for WFS

In OGC Testbed 12 EXI participants extended WFS with software capable of producing an output format in EXI. The software tested in WFS implementations included the packages listed in the following table.

Table 3. EXI Compression Software
EXI Software - Description

  • OSS - OSS is an implementation of the EXI specification, available both for Java and .Net platforms. http://www.oss.com/xml/products/exi-c/exi-c.html

  • Nagasena - Nagasena is an implementation of the EXI specification, available both for Java and .Net platforms. http://openexi.sourceforge.net/

  • EXIficient - EXIficient is a set of implementations of the EXI format specification available for Java, Javascript, C/C++. http://exificient.github.io/

6.4. Generalization Techniques

Generalization in GIS is used to reduce the detail in data. Reducing the detail also reduces the amount of data, which improves usability in low bandwidth environments. [1] lists the following twelve categories of operators for cartographic generalization:

Figure 1. Twelve categories of generalization operators. (source [1])

Examples for the twelve operators are shown in the following image:

Figure 2. Sample spatial and attribute transformations of cartographic generalization. (source [1])

Generalization techniques are investigated in this ER for three types of geometry: point, line and polygon. Several techniques can be applied to each of the geometry types. We chose to investigate the following operations further:

  • Simplification: A digitized representation of a map feature should be accurate in its representation of the feature (shape, location, and character), yet also efficient in terms of retaining the least number of data points necessary to represent the character. A profligate density of coordinates captured in the digitization stage should be reduced by selecting a subset of the original coordinate pairs, while retaining those points considered to be most representative of the line (Jenks, 1981). Glitches should also be removed. Simplification operators will select the characteristic, or shape-describing, points to retain, or will reject the redundant points considered to be unnecessary to display the line’s character. Simplification operators produce a reduction in the number of derived data points which are unchanged in their x,y coordinate positions. Some practical considerations of simplification include reduced plotting time, increased line crispness due to higher plotting speeds, reduced storage, fewer problems in attaining plotter resolution due to scale change, and quicker vector to raster conversion (see [2]).

  • Aggregation/Merging: There are many instances when the number or density of like point features within a region prohibits each from being portrayed and symbolized individually within the graphic. This notwithstanding, from the perspective of the map’s purpose, the importance of those features requires that they still be portrayed. To accomplish that goal, the point features must be aggregated into a higher order class feature (areas) and symbolized as such. For example, if the intervening spaces between houses are smaller than the physical extent of the buildings themselves, the buildings can be aggregated and re-symbolized as built-up areas (see [3]).

  • Refinement/Elimination: In many cases, where like features are either too numerous or too small to show to scale, no attempt should be made to show all the features. Instead, a selective number and pattern of the symbols are depicted. Generally, this is accomplished by leaving out the smallest features, or those which add little to the general impression of the distribution. Though the overall initial features are thinned out, the general pattern of the features is maintained with those features that are chosen by showing them in their correct locations. Excellent examples of this can be found in [4]. This refinement process retains the general characteristics of the features at a greatly reduced complexity.

The implementation of the three operations is described in section 8.
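As an illustration of the simplification operation, the following Python sketch uses the Douglas-Peucker algorithm. This algorithm is chosen here only as a well-known example of point-selecting line simplification; it is an assumption of this sketch and not necessarily the method implemented in the testbed. The function names, sample line and tolerance are hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Parameter of the projection of p onto the segment, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, tolerance):
    """Keep only the points that deviate more than `tolerance` from the
    line through the retained neighbours; kept points are not moved."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, max_distance = 0, 0.0
    for i in range(1, len(points) - 1):
        distance = point_segment_distance(points[i], first, last)
        if distance > max_distance:
            index, max_distance = i, distance
    if max_distance <= tolerance:
        return [first, last]
    left = simplify(points[: index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right  # drop the duplicated split point


line = [(0, 0), (1, 0.05), (2, -0.05), (3, 0), (4, 2), (5, 0)]
print(simplify(line, 0.1))
```

The retained vertices keep their original x,y positions, which matches the description of simplification operators above; only the number of vertices to transfer is reduced.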

7. Low Bandwidth WFS

This section describes a WFS fitted for low bandwidth environments.

7.1. WFS Compression Experiments

In OGC Testbed 12 EXI participants investigated compression techniques for geospatial data sets delivered by WFS Servers and Clients by augmenting WFS with software capable of producing an output format in EXI (described above).

The testing architecture for WFS Compression in OGC Testbed 12 was configured using a combination of the following components -

  • EXI Pre-Processors and Processors - Software program modules used by application programs to encode their structured data into EXI streams and/or to decode EXI streams to make the structured data accessible.

  • Compression WFS - WFS augmented with EXI Pre-Processors and Processors and loaded with test data. Provides the ability to request test data as GML, GeoJSON, GZIP and EXI (among other output formats).

  • Compression WFS Clients - Application clients with the ability to request EXI encoded data from a Compression WFS, with a performance recording module to gather metrics on time taken to perform the encoding and, most importantly, size of the resulting EXI stream. Includes the ability to decode EXI streams from Compression WFS.

These components were configured for testing Compression WFS and EXI as described in the following sequence diagram –

Figure 3. Sequence diagram for testing Compression with WFS and EXI
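The client side of this architecture can be sketched in Python as follows. The sketch builds a WFS 2.0 GetFeature request in KVP encoding and records the two metrics named above, encoding time and encoded size. The endpoint URL, the feature type name and the outputFormat value "application/exi" are assumptions made for illustration, and zlib is used only as a stand-in for an EXI processor.

```python
import time
import zlib
from urllib.parse import urlencode


def build_getfeature_url(base_url, type_name, output_format):
    """Build a WFS 2.0 GetFeature request URL (KVP encoding)."""
    params = {
        "service": "WFS",
        "version": "2.0.0",
        "request": "GetFeature",
        "typeNames": type_name,
        "outputFormat": output_format,  # hypothetical value for EXI output
    }
    return f"{base_url}?{urlencode(params)}"


def record_metrics(label, encode, payload):
    """Time an encoder and report input and output sizes in bytes."""
    start = time.perf_counter()
    encoded = encode(payload)
    elapsed = time.perf_counter() - start
    return {
        "label": label,
        "input_bytes": len(payload),
        "output_bytes": len(encoded),
        "seconds": elapsed,
    }


url = build_getfeature_url("http://example.org/wfs", "streets", "application/exi")
metrics = record_metrics("zlib stand-in", zlib.compress, b"<member/>" * 100)
print(url)
print(metrics["input_bytes"], metrics["output_bytes"])
```

Passing the encoder in as a callable keeps the recording function independent of any particular compression library, so the same measurement can be repeated per output format.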

7.1.1. Test Suite 1

For compression testing, participants implemented Compression WFS servers, Compression clients, EXI Pre-Processors and EXI Processors in the following architecture -

Figure 4. Compression Tests Architecture and Sequence Diagram

The Compression WFS in Test Suite 1 was based on WFS, extended with OSS for .NET on the server and client, and EXIficient using Java and Javascript wrappers on the server side as needed. Compression software was employed for encoding plain-text based GML and GeoJSON content into binary EXI files on the server side. The data preparation process is: features → XML-based data stream (GML) → Encoding into .exi file → Transfer to client application for decoding and rendering.

Additional compression methods, including GZIP and LZMA (.7z), were implemented on the server for comparison.
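The data preparation steps and the comparison methods can be sketched with the Python standard library. The sketch serializes hypothetical point features into a simplified, GML-like XML stream (not schema-valid GML), compresses it with GZIP and LZMA, and decodes it again. EXI encoding is omitted because no EXI library is assumed here, and Python's lzma module writes the .xz container rather than .7z.

```python
import gzip
import lzma
import xml.etree.ElementTree as ET


def features_to_xml(features):
    """Serialize (name, (x, y)) features into a simplified GML-like document."""
    root = ET.Element("FeatureCollection")
    for name, (x, y) in features:
        member = ET.SubElement(root, "member")
        ET.SubElement(member, "name").text = name
        ET.SubElement(member, "pos").text = f"{x:.6f} {y:.6f}"
    return ET.tostring(root, encoding="utf-8")


# Hypothetical test features; not the San Francisco data used in the testbed.
features = [(f"school_{i}", (-122.4 + i * 0.001, 37.7 + i * 0.001)) for i in range(200)]

xml_bytes = features_to_xml(features)          # features -> XML stream
gzip_bytes = gzip.compress(xml_bytes)          # server-side encoding (GZIP)
lzma_bytes = lzma.compress(xml_bytes)          # server-side encoding (LZMA, .xz)
decoded = gzip.decompress(gzip_bytes)          # client-side decoding

sizes = {"xml": len(xml_bytes), "gzip": len(gzip_bytes), "lzma": len(lzma_bytes)}
print(sizes)
```

Both general-purpose compressors preserve the features unchanged, so the decoded stream is byte-identical to the original; only the transferred size differs.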

For compression testing, the processors implemented both schema-less and schema-informed modes.

Feature data over San Francisco representing points (schools_public_pt.shp), lines (stclines_streets.shp) and polygons (schools_public.shp) formed the test baseline. Other data sets were assessed as well.

Performance recording modules were implemented for comparison of EXI performance. Three formats are supported: GeoJSON-LZMA, GML3.1.1-EXI and GML3.1.1-LZMA. Browser-side data decompression functions were implemented for the formats GeoJSON-ZIP, GeoJSON-LZMA, GML3.1.1-ZIP and GML3.1.1-LZMA.

Participants attempted to integrate Nagasena into Compression WFS but did not continue the TIE due to performance issues.

Using the performance recording module, information about the different compression methods and datasets was gathered. Initial test results for Compression WFS on Test Suite 1 are presented in the figure below.

Figure 5. ASU Final Results

7.2. Low Bandwidth WFS Test Scenario

The prototype implementations, approaches, test architectures and performance aspects of the geospatial data compression techniques explored in OGC Testbed 12 were assessed in a simulated disaster response scenario. This scenario and the relevant aspects of WFS in low bandwidth environments are described in the following graphics.