Publication Date: 2019-03-07

Approval Date: 2019-02-28

Submission Date: 2019-02-06

Reference number of this document: OGC 18-025

Reference URL for this document:

Category: Public Engineering Report

Editor: Jérôme Jacovella-St-Louis

Title: OGC Testbed-14: CityGML and AR Engineering Report

OGC Engineering Report


Copyright (c) 2019 Open Geospatial Consortium. To obtain additional rights of use, visit


This document is not an OGC Standard. This document is an OGC Public Engineering Report created as a deliverable in an OGC Interoperability Initiative and is not an official position of the OGC membership. It is distributed for review and comment. It is subject to change without notice and may not be referred to as an OGC Standard. Further, any OGC Engineering Report should not be referenced as required or mandatory technology in procurements. However, the discussions in this document could very well lead to the definition of an OGC Standard.


Permission is hereby granted by the Open Geospatial Consortium, ("Licensor"), free of charge and subject to the terms set forth below, to any person obtaining a copy of this Intellectual Property and any associated documentation, to deal in the Intellectual Property without restriction (except as set forth below), including without limitation the rights to implement, use, copy, modify, merge, publish, distribute, and/or sublicense copies of the Intellectual Property, and to permit persons to whom the Intellectual Property is furnished to do so, provided that all copyright notices on the intellectual property are retained intact and that each person to whom the Intellectual Property is furnished agrees to the terms of this Agreement.

If you modify the Intellectual Property, all copies of the modified Intellectual Property must include, in addition to the above copyright notice, a notice that the Intellectual Property includes modifications that have not been approved or adopted by LICENSOR.


This license is effective until terminated. You may terminate it at any time by destroying the Intellectual Property together with all copies in any form. The license will also terminate if you fail to comply with any term or condition of this Agreement. Except as provided in the following sentence, no such termination of this license shall require the termination of any third party end-user sublicense to the Intellectual Property which is in force as of the date of notice of such termination. In addition, should the Intellectual Property, or the operation of the Intellectual Property, infringe, or in LICENSOR’s sole opinion be likely to infringe, any patent, copyright, trademark or other right of a third party, you agree that LICENSOR, in its sole discretion, may terminate this license without any compensation or liability to you, your licensees or any other party. You agree upon termination of any kind to destroy or cause to be destroyed the Intellectual Property together with all copies in any form, whether held by you or by any third party.

Except as contained in this notice, the name of LICENSOR or of any other holder of a copyright in all or part of the Intellectual Property shall not be used in advertising or otherwise to promote the sale, use or other dealings in this Intellectual Property without prior written authorization of LICENSOR or such copyright holder. LICENSOR is and shall at all times be the sole entity that may authorize you or any third party to use certification marks, trademarks or other special designations to indicate compliance with any LICENSOR standards or specifications.

This Agreement is governed by the laws of the Commonwealth of Massachusetts. The application to this Agreement of the United Nations Convention on Contracts for the International Sale of Goods is hereby expressly excluded. In the event any provision of this Agreement shall be deemed unenforceable, void or invalid, such provision shall be modified so as to make it valid and enforceable, and as so modified the entire Agreement shall remain in full force and effect. No decision, action or inaction by LICENSOR shall be construed to be a waiver of any rights or remedies available to it.

None of the Intellectual Property or underlying information or technology may be downloaded or otherwise exported or reexported in violation of U.S. export laws and regulations. In addition, you are responsible for complying with any local laws in your jurisdiction which may impact your right to import, export or use the Intellectual Property, and you represent that you have complied with any regulations or registration procedures required by applicable law to make this license enforceable.

Table of Contents

1. Summary

This OGC Testbed-14 Engineering Report (ER) describes the results of the Augmented Reality (AR) work performed in the Testbed-14 CityGML and Augmented Reality work package which was part of the Next Generation Services thread.

By integrating information available from urban models within a view of the real world through a mobile device, this testbed activity explored the possibilities offered by AR in a geospatial context. The ER additionally discusses the approach used to bring these urban models in from various data sources. The experiments also covered, to some extent, Virtual Reality (VR), in which any city can be explored freely from a computer display or potentially within a VR headset.

A continuation of these experiments would have looked at a combination of Augmented and Virtual Reality (Mixed Reality). The portrayal of AR and three-dimensional (3D) content, by extending a common conceptual model for styling classic geospatial features (as explored in the Testbed-14 Portrayal work), is also touched upon. The efficient transmission of 3D content is also a subject of this document, through the use of a simple 3D transmission format developed during the initiative.

This ER provides many insights that showcase what is now made possible by the combination of AR, VR and integrated urban models.

The testbed work sheds light on the benefits of applying a common portrayal approach to AR, bridging the gap between AR applications and traditional Geographic Information Systems and services.

The ER introduces a new, simple approach and conceptual model for transmitting 3D geospatial content, which could be the basis for defining simple profiles of the I3S and 3D Tiles community standards. It could also inform enhancements to the 3D Portrayal Service (3DPS) and/or next generation services (e.g., WFS 3.0) for delivering 3D content in a format-agnostic manner.

Finally, the ER covers methods to bring in different types of geospatial content from various sources for integration into AR applications.

During Testbed-14, the participants demonstrated AR experiences with geospatial datasets providing integrated views of urban spaces. Two clients and two services were shown to be interoperable, streaming AR content through a simple 3D transmission format, leveraging either GeoJSON or GNOSIS Map Tiles, as well as the E3D 3D model specification.

The feasibility of extending a classic portrayal conceptual model for AR was also shown. Geospatial data sets of various types and formats were successfully imported for consumption by the services, in order to serve them to the clients in the supported transmission formats.

1.1. Requirements & Research Motivation

Figure 1. Diagram of CityGML and AR initiative deliverables

1.1.1. Requirements

As stated in the Call for Participation, the requirements for the Testbed-14 CityGML and Augmented Reality work package were:

  • To evaluate the possible methods for integrating AR content into CityGML content. The effort shall include an analysis of whether Augmented Reality content should be integrated into the CityGML data or whether the data content (features) should be linked for visualization purposes.

  • To demonstrate client-side support for visualization of streamed CityGML with AR content. Bidders shall include suggestions for suitable AR data in their proposals and shall investigate the role and applicability of OGC Augmented Reality Markup Language 2.0 (ARML 2.0), the Community Standard 3D Tiles, and OGC Community Standard I3S.

The goal was to provide highly detailed visualization of fully integrated urban spaces, as modeled in CityGML. This would notably be very valuable for urban warfighters or first responders, in the planning and execution of modern missions.

This integrated view brings together:

  • Geography

  • Topography

  • Infrastructure & utilities

  • Indoor spaces

  • Detailed building models

The stakeholders clarified that the AR content would be a presentation of these elements, either as 3D geometry, classic 2D geospatial features, or annotations sourced from the associated attribute information.

Since the primary objective of this initiative was to highlight the possibilities of AR, the requirement to stream contents specifically as 3D Tiles and/or I3S was relaxed to become optional. This allowed the participants to more quickly focus efforts on the AR aspects.

Furthermore, because abundant amounts of relevant CityGML data were not readily available, and because the AR clients did not deal directly with CityGML-encoded content, the stakeholders decided that the services could additionally import other source data types to be served in the transmission format directly through the best suited mechanism.

1.1.2. Objectives

Two primary objectives corresponding to different scenarios were identified: the first focused on AR whereas the second is better described as VR. The experiments done with the clients and services for these different scenarios are covered in different sections of this report. Additional objectives pertained to the particularities of portraying geospatial features to augment reality, and to efficient transmission of content.

The Augmented Reality Scenario

In the AR scenario, a real view of the surroundings from a mobile device camera is augmented with geospatial data overlays. The real world is the foundation of the user experience, and the integration of the camera and sensors (gyroscope, magnetometer, accelerometer) synchronizes the 'augmented' elements with the real world. The 3D geometry data, such as that originating from a CityGML package, makes up the AR content. Because this scenario is tied to the device location, and given the difficulty of finding relevant data for the areas where the participants were located, it posed logistical challenges for the initiative.

In this scenario, a sub-set of the urban model or extra 3D elements were overlaid on top; some potentially with translucency so as not to completely obscure the camera view of the real world.
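As a rough illustration of the registration such a scenario requires (a sketch only, not the implementation used by the testbed participants; all function names and values below are hypothetical), the horizontal screen position of a geolocated feature can be derived from the device's GPS position and compass heading:

```python
import math

def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from the viewer (lat1, lon1)
    to a feature (lat2, lon2), in degrees clockwise from north."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = math.cos(phi1) * math.sin(phi2) \
        - math.sin(phi1) * math.cos(phi2) * math.cos(dlon)
    return math.degrees(math.atan2(y, x)) % 360.0

def screen_x(feature_bearing, heading, fov_deg, screen_width):
    """Horizontal pixel position at which a feature should be overlaid,
    given the device compass heading and the camera's horizontal field
    of view; None if the feature is outside the current view."""
    # Signed angular offset from the view center, in (-180, 180]
    delta = (feature_bearing - heading + 180.0) % 360.0 - 180.0
    if abs(delta) > fov_deg / 2.0:
        return None
    return (delta / fov_deg + 0.5) * screen_width
```

A full implementation would additionally use the accelerometer and gyroscope to derive the vertical placement and compensate for device tilt.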

The stakeholders decided that the focus would be on an outdoor scenario for a number of reasons:

  • Outside scenarios benefit from easier geolocalization and synchronization with the real world on a mobile device, from GPS sensors and the cellular signal.

  • Outside scenarios are likely to involve traveling a large distance and would benefit greatly from streaming 3D contents as tiles.

  • Emergency response has been cited as a potential application of such an outside scenario.

An indoor scenario, however, would have presented a different set of challenges:

  • It would have to rely on recognition of the environment (e.g. pattern recognition) from the camera, due to the smaller scale of features and limited GPS/cellular connectivity. This challenge is an order of magnitude more difficult, which was not well suited to the scope of the initiative.

  • CityGML is often used to represent large integrated environments, and would not have been the best source of data for indoor information. However, some elements such as ADEs providing Building Information Modeling (BIM) might still be relevant.

  • IndoorGML is another standard which may provide more relevant data for such a scenario.

The Virtual Reality Scenario

Virtual Reality is the second scenario, where an integrated urban view can be freely explored virtually. In this scenario, most of the 3D geometry makes up the base scene (i.e. the 'virtual' reality). The 3D models cover an entire view, with no need for integration of the real-world camera view. This scenario is not dependent on the physical location of the user (e.g., data from Taiwan could be visualized in Washington, DC).

In this scenario, some of the 3D elements could have different styles applied to highlight these features, and the visibility of the different elements could be toggled. Given more time, this could have been demonstrated through a VR headset such as the Oculus Rift.

Mixed Reality

Another interesting experiment would be to combine AR and VR (Mixed Reality), where a VR headset is used together with stereoscopic cameras such as the Ovrvision Pro or ZED Mini by Stereolabs.

A new generation of AR glasses with stereoscopic vision and binocular AR capabilities, such as the ODG R-9 (although the availability of these glasses is now doubtful given recent developments [1]) or the Magic Leap, also offers a glimpse of new Mixed Reality possibilities in the near future. The mixed reality devices market, however, is still very volatile, as illustrated by the often overwhelming challenges faced by many companies in the space despite large amounts of funding, as well as the limited availability of stereoscopic AR glasses, particularly outside of the US market.

The OGC is currently executing the Mixed Reality to the Edge Concept Development Study. Findings and recommendations from this study could feed directly into a mixed reality scenario.

Portrayal of AR content

All of these scenarios were to feature annotations in the form of labels, markers and other visual elements (e.g. simple 2D and/or 3D primitives) tied to certain geographic locations / features. The annotations were sourced from the non-geometry attributes of the source data (e.g. CityGML), and transmitted to the client through a mechanism defined in the transmission format.
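As a minimal illustration of this mechanism (the feature below and its attribute names, "name" and "height_m", are hypothetical, not taken from the testbed datasets), a client can derive annotation text directly from the non-geometry attributes of a transmitted GeoJSON feature:

```python
import json

# Hypothetical GeoJSON feature as might be served to an AR client
feature = json.loads("""
{
  "type": "Feature",
  "geometry": { "type": "Point", "coordinates": [120.302, 22.639] },
  "properties": { "name": "Pipeline valve 42", "height_m": 28.5 }
}
""")

def annotation_text(feature, attribute="name"):
    """Derive the text of an AR annotation from a non-geometry attribute,
    returning None when the attribute is absent."""
    return feature.get("properties", {}).get(attribute)

label = annotation_text(feature)  # text to anchor at the feature's location
```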

The stakeholders had to decide the best way to describe which additional elements to display (e.g., extra 3D geometry such as utility ADE vs. textual annotations), when to display them, where to overlay them, and how to style them.

How to 'link' this AR content to the base visual elements (either the camera feed, or the rendering of a virtual 3D view of the integrated urban model represented by the CityGML) also had to be determined.

Efficient transmission of AR content

An efficient transmission of AR content between clients and services had to be implemented. This is covered in detail in Chapter 8.

1.2. Prior-After Comparison

1.2.1. A review of ARML capabilities for 3D visualization and AR

The OGC® Augmented Reality Markup Language 2.0 (ARML 2.0) standard allows users to describe virtual objects in an augmented reality (AR) scene, with their appearances and anchors (a broader concept of a location) related to the real world as depicted in visual camera input. ARML 2.0 also defines ECMAScript bindings to dynamically modify the AR scene based on user behavior and input. Part of the motivation for ARML 2.0 was to enable several AR browser vendors (Layar, Metaio, and Wikitude) to create a common format for better interoperability among their browsers, at a time when the mainstream mobile browser market was still immature. ARML 2.0 also used the Web IDL specification.

There are significant differences between the approach to describing augmented reality contents used in this initiative and that of ARML 2.0, both in assumed use case scenarios and in technical details. For example, the latter contemplated browser-based clients using XML-based COLLADA files for 3D data exchange. By contrast, the approach used in this testbed initiative was intended for native apps running on low-capacity clients operating in Denied, Degraded, Intermittent, or Limited Bandwidth (DDIL) environments, as might be found in urban warfighting or emergency response.

Another major difference is that while ARML combined both geospatial data and styling rules in a single payload, this initiative opted to keep data and portrayal options separate. By using a styling conceptual model also applicable to typical Geographic Information System (GIS) data sources, this approach helped bridge the gap between Augmented Reality and GIS. The ability to readily integrate the vast amount of available geospatial data into AR applications can help minimize application development costs while maximizing investments in geospatial data collection. Further, much of this data is already encoded or transferred according to OGC standards.

Conversely, the concepts of portraying and labeling features in a 3D view are not specific to Augmented Reality. They also apply to a regular 3D GIS application where views are purely virtual. For this reason, it makes more sense that such capabilities (which were the ones this initiative was mostly concerned with in its usage scenarios) be integrated in a way which applies to both classic and 3D GIS applications, than with a language targeting Augmented Reality specifically.

To compare the capabilities offered by ARML with the work done in this testbed: the latter focused strictly on geospatial Augmented Reality. Computer Vision-based augmented reality was not looked into, being better suited for smaller-scale or indoor scenarios. For geospatial AR, the core concepts in ARML include Features, Anchors and Visual Assets.

The Features from ARML, defined as the real world objects themselves, are closely related to the geospatial data used in this testbed. At various stages this data was defined in CDB, Shapefile, CityGML, OpenStreetMap, GNOSIS Map Tiles, GeoJSON and E3D model formats. As entire cities were covered, the geospatial data sets were very large. As such, a text-based format (e.g. XML) is not well suited to representing these data sets in an AR environment. Furthermore, it was desirable to transfer only the small subset of a much larger dataset relevant to a given view. For this reason, partial requests and tiles were used.

ARML also defines the concept of an Anchor, which in geospatial AR can simply be a single geospatial point. For the testbed, the portrayal language could be thought of as what anchors visual assets to specific points or feature sets, but the actual geospatial coordinates would be in the data itself. In the case of lines or areas, labeling rules can control exactly where visual assets should appear.
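For an area feature, one simple way to derive such an anchor point (a sketch only, not the actual labeling rules used in the testbed) is the area-weighted centroid of the feature's outer ring:

```python
def ring_centroid(ring):
    """Area-weighted centroid of a polygon ring given as [(x, y), ...]
    with each vertex listed once (no repeated closing vertex).
    A label or visual asset could be anchored at the returned point."""
    a = cx = cy = 0.0
    # Shoelace formula: accumulate signed area and centroid terms per edge
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        cross = x1 * y2 - x2 * y1
        a += cross
        cx += (x1 + x2) * cross
        cy += (y1 + y2) * cross
    a *= 0.5
    return (cx / (6.0 * a), cy / (6.0 * a))
```

Note that for strongly concave rings the centroid may fall outside the polygon, which is why real labeling engines use more elaborate placement rules.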

ARML then has the concept of a Visual Asset, either two-dimensional (2D) or 3D, which is drawn into the scene, thus "augmenting" the view. This is also the case in the approach used for this testbed, where these visual assets are defined or referenced using the portrayal language. This approach allows referencing attributes associated with the geospatial data to display textual information, or to control various visual properties such as colors. It could also reference symbology, for example in the form of Portable Network Graphics (PNG) or Scalable Vector Graphics (SVG) images, or even 3D models to be displayed at the anchored location.

Most of the efforts of the initiative, however, focused on superimposing virtual 3D building geometry at its proper location in the view. In this case, the 3D building geometry could be thought of as being a Feature, Anchor and Visual Asset all at once.

1.2.2. ARML capabilities not yet covered to be considered for future work

In spite of the differences between ARML 2.0 and the approach used to describe Augmented Reality contents in this initiative, ARML can still be a source of inspiration for important capabilities to consider re-integrating into styling extensions specific to augmented reality. The most obvious example of such capabilities is the concept of a computer vision-based anchor (Trackable). This could be considered for future work, tied more closely to computer vision and potentially better suited for indoor localization challenges.

1.3. Intended audience

This ER provides findings and recommendations which may impact a number of OGC Working Groups that deal with standards for working with 3D geospatial data.

1.4. Recommendations for Future Work

  • Conducting more Augmented and Mixed Reality initiatives would be highly beneficial. The work done during this testbed barely scratched the surface of many different topics.

    • A new initiative simply continuing the work done in relation to the scenario selected for this Testbed would be very valuable. Such an initiative could:

      • Perform more field tests;

      • Ensure better registration between virtual objects and the real world;

      • Aim to achieve better client performance;

      • Support additional transfer formats and mechanisms (e.g. glTF, 3D Tiles, I3S);

      • Test the efficiency of tiled content delivery while moving with the AR device over large distances;

      • Develop better batching of 3D data;

      • Perform experiments with additional geospatial data sets;

      • Integrate more annotations and interaction capabilities into applications;

      • Experiment with more powerful devices such as those supporting Android’s ARCore and 6 Degrees of Freedom sensors or Apple’s ARKit;

      • Investigate the use of hardware enabling Mixed Reality experiments.

    • Ideally, requirements for any new AR/MR Innovation Program initiative should focus on clear functional objectives of limited scope. The current initiative requirements specified many 3D data standards that should be used for the experiments, but did not present a detailed picture of what needed to be achieved. The initiative ended up dealing simultaneously with multiple 3D data format conversions, transmitting 3D data between services and clients, and rendering 3D objects and annotations as AR content.

    • When describing AR applications and related initiatives, a distinction should be made between different types of AR content such as text and symbol annotations, overlaid 3D features (e.g. to compensate for poor visibility), anchored 3D models, etc.

    • In general, a requirement for using a specific data standard (e.g. CityGML) should ensure an appropriate data set for such standard is readily available before the initiative begins. This would avoid diverting efforts to generate an intermediate dataset for the sole purpose of satisfying a potentially non-essential requirement, when a more direct path may be possible.

    • For AR, data sets should be available covering the locations where potential participants are based, to properly test AR functionality.

    • A new initiative should look into the specifics of AR based on Computer Vision, such as ways to anchor visual assets based on pattern recognition as was the case in ARML, or recognizing geospatial features to increase location accuracy and better support indoor scenarios.

    • A new initiative could leverage a highly detailed dataset of a small area, possibly from IndoorGML, point clouds, or other source.

    • Mixed Reality initiatives should consider the use of specialized hardware such as VR headsets equipped with stereoscopic cameras and/or new generation AR glasses such as the Magic Leap (or the ODG R-9 if it ever becomes available).

  • The WFS extensions for querying 3D models and textures, as well as the concept of harmonizing services (so that these can be retrieved, in addition to vector data, imagery and coverages, from a single service and endpoint), should be considered as part of future initiatives and the ongoing development of the next iteration of OGC service standards. This approach enabled the efficient serving of full CDB datasets in this initiative. An overview of client/server data exchange is presented in Chapter 8, and a full description of the services in Chapter 9.

  • The E3D format (described in Appendix A), with its ability to represent 3D models in a very compact manner, requiring very little processing before uploading to a GPU for hardware-accelerated rendering, should be considered for inclusion in future Innovation Program initiatives.

  • Contributing support for loading and writing the E3D model format to the Open Asset Importer Library would be beneficial, as it would result in that format being a lot more interoperable due to the widespread use of that library in modeling and 3D visualization tools.

  • GNOSIS Map Tiles (described in Appendix B), with their potential for describing compact tiled vector data, imagery, coverages, and now also referenced and embedded 3D models as well as point clouds, should be considered for inclusion in future Innovation Program initiatives, as well as a potential OGC standard.

  • AR applications leveraging geospatial data should adopt a classic GIS styling conceptual model which can scale to 3D views (such as described in Appendix C), with extensions specific to AR. This approach allows linking source attributes with the presentation of different types of AR content (e.g. annotations, styled 3D geometry). The conceptual model used in this initiative leverages work done in the Testbed-14 Portrayal Task.

1.5. Document contributor contact points

All questions regarding this document should be directed to the editor or the contributors:


Name                                 Organization

Jérôme Jacovella-St-Louis (editor)   Ecere Corporation
振宇 Hao, Chen-Yu (How)               GIS Research Center, Feng Chia University
柏淳 Wen, Bo-Chun (Cherry)            GIS Research Center, Feng Chia University
Marcus Alzona
Carl Reed                            Carl Reed & Associates
Scott Serich                         Open Geospatial Consortium

1.6. Foreword

Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. The Open Geospatial Consortium shall not be held responsible for identifying any or all such patent rights.

Recipients of this document are requested to submit, with their comments, notification of any relevant patent claims or other intellectual property rights of which they may be aware that might be infringed by any implementation of the standard set forth in this document, and to provide supporting documentation.

2. References

3. Terms and definitions

For the purposes of this report, the definitions specified in Clause 4 of the OWS Common Implementation Standard OGC 06-121r9 shall apply. In addition, the following terms and definitions apply.

Augmented Reality (AR)

The integration of visual objects (e.g. labels, virtual 3D objects) with a viewer’s actual surroundings.

Virtual Reality (VR)

Virtual environment experienced by a user in a system (e.g. a head-mounted display) integrating a stereoscopic display and sensors, allowing the view to be re-oriented with the head's rotation (3DoF), and optionally its position as well (6DoF).

Mixed Reality (MR)

The combination of Augmented and Virtual Reality in an HMD or smart glasses combining a stereoscopic display, motion sensors and (ideally stereoscopic) cameras.

3.1. Abbreviated terms


3DPS     3D Portrayal Service

ADE      Application Domain Extension

API      Application Programming Interface

AR       Augmented Reality

ARML     Augmented Reality Markup Language

Assimp   Open Asset Import Library

CDB      OGC CDB (a datastore standard for 3D environments, formerly Common Database standard)

CMSS     (GNOSIS) Cartographic Map Style Sheets

COLLADA  Collaborative Design Activity (3D interchange data format)

CPU      Central Processing Unit

CV       Computer Vision

DAE      Digital Asset Exchange (COLLADA)

DoF      Degrees of Freedom

DWG      Domain Working Group

E3D      Ecere 3D Model Format

ER       Engineering Report

ETC2     Ericsson Texture Compression, version 2

FOV      Field of View

glTF     GL Transmission Format

GIS      Geographic Information System

GIS-FCU  GIS Research Center, Feng Chia University

GML      Geography Markup Language

GMT      GNOSIS Map Tiles

GPS      Global Positioning System

GPU      Graphical Processing Unit

HMD      Head-Mounted Display

HTTP     Hypertext Transfer Protocol

IBO      Index Buffer Object

IFC      Industry Foundation Classes

MR       Mixed Reality

NDK      Native Development Kit (Android)

OGC      Open Geospatial Consortium

PBR      Physically Based Rendering

SDK      Software Development Kit

SE       Symbology Encoding

SLD      Styled Layer Descriptor

SWG      Standard Working Group

UMS      Unified Map Service

VBO      Vertex Buffer Object

VR       Virtual Reality

WFS      Web Feature Service

WMTS     Web Map Tile Service

XML      eXtensible Markup Language

4. Overview

Chapter 5 explores the possibilities of Augmented Reality in a geospatial context, as they apply to the various features of interest for the experiments. It delves into the portrayal aspect of geospatial data to augment reality, and the similarities and particularities of styling geospatial features for AR, for 3D views, and in general, proposing a common conceptual model. It describes how the presentations can link annotations to attributes of the source data using classic GIS approaches. It also skims over the challenges of indoor Augmented Reality.

Chapter 6 provides a description of the various sensors found in mobile devices that make Augmented Reality possible, and explains how they can be used to determine the position and orientation of the viewer in the real world.

Chapter 7 describes the two mobile Augmented Reality clients that were developed in this testbed. The first is an Android client, built by Ecere, using GNOSIS and OpenGL and accessing the Android sensors and Camera 2 API directly. The second was built by GIS-FCU, using ARKit, and runs on iOS.

Chapter 8 details the challenges of efficiently transmitting large geospatial contents, including 3D contents, from extended Web Feature Services. An overview of the extended WFS requests used for exchanging data is presented. The rationale behind opting for the E3D model format to deliver 3D meshes to the AR clients in this initiative is explained, and the topic of texture compression is briefly covered. Because performance depends on how 3D data is organized and transmitted by the server to the client, 3D performance considerations are also discussed. Tiling and caching strategies, as well as support for clients working completely offline, are explored. An attempt is made to describe a potential relationship between the work of this initiative and the 3D Portrayal Service.

Chapter 9 describes the two geospatial services that were built to support these Augmented Reality experiments, by Ecere and GIS-FCU.

Chapter 10 describes in detail the datasets used for these experiments and their formats, as well as how they were processed and consumed by the services before being served to the mobile AR clients.

Appendix A details the specifications of the E3D model format used by the experiments.

Appendix B describes the latest specifications of the GNOSIS Map Tiles with support for referencing or embedding 3D models.

Appendix C presents an overview of a conceptual model and GNOSIS Cascading Map Style Sheets language for styling geospatial features, including describing AR annotations.

5. Augmented Reality

In order to provide a useful Augmented Reality experience, choosing the right type of data to present, as well as how best to present that content, is crucial. This initiative focused on presenting geospatial data, and factors such as the limited availability of data and specific pre-established requirements (but no pre-identified data sets) posed some challenges in identifying relevant data to present. Many aspects of AR work best with highly detailed data, which is typically more difficult to obtain.

5.1. Augmenting reality with geospatial data

As the use of CityGML was an important objective of this initiative, attempts were made to identify useable data sets, as well as CityGML Application Domain Extensions (ADEs) which may provide information more particularly useful for Augmented Reality experiments. Existing CityGML ADEs identified included:

Also relevant are the open source BIM collective [4], Indoor Spatial Data Model Using CityGML ADE [5], application of large 3D CityGML models on smartphones [6] and Augmented Reality Games with CityGML [7].

However, as it was already difficult to find a basic usable CityGML data set in an area of interest, let alone one making use of any of these ADEs, no such data sets were identified.

Being able to see "hidden" features is a very useful aspect of AR; displaying underground utilities such as pipelines is a good use case scenario. A data set of a pipeline in Kaohsiung, Taiwan, available in the COLLADA format, was used for some experiments in the Testbed. The location of the content, however, proved problematic, as no testbed participant was normally located in Kaohsiung.

The Testbed sponsors originally specified requirements for supporting 3D Tiles and/or I3S. They also hoped for a continuity of the work performed in Testbed-13 on 3D client performance. As a result, the focus shifted to the presentation of 3D buildings in an AR application. One problem with the presentation of 3D buildings in AR is that the actual buildings are normally directly visible, without the need for augmentations. As such, a 3D model would simply obstruct a real, better view of the actual building. However, having the 3D model could potentially be useful in an emergency response scenario where thick fog or smoke could severely restrict visibility. Alternatively, presenting the buildings with a reduced opacity setting may be useful to verify the accuracy of a data set against the real world, and/or for mapping one’s surroundings. The vast majority of the experiments performed in this initiative dealt with this 3D buildings scenario.

Unfortunately, the data sets used had little attribute information available. However, the approach implemented for transforming and transmitting buildings data was designed to preserve attributes when better-attributed data becomes available.

In order to experiment with the capability to display actual annotations apart from 3D models, separate and combined experiments were done using OpenStreetMap data to present street maps, including building and street names.

Although desktop experiments were done with terrain elevation and aerial imagery, the limited capabilities of mobile hardware, as well as the desire not to hide the majority of the camera feed, made it difficult to find a proper use case for these data in a mobile Augmented Reality scenario. In one case, however, information from the elevation model data was incorporated within the buildings' position information.

See Chapter 10 for a detailed description of the data sets used for all experiments, as well as how they were processed and integrated for use by the services and Augmented Reality clients.

In this initiative, generic geospatial datasets, with no special consideration for Augmented Reality, were used so that AR application developers could readily make use of any geospatial datasets, such as those defined or transmitted using OGC standards. Similarly, current GIS application developers can leverage their existing tools, data and expertise to start deploying Augmented Reality solutions.

5.2. Similarities and particularities of styling for 3D views

Many of the same concepts for styling geospatial features apply whether dealing with a cartographic or 3D view (whether doing Augmented Reality or not). As an example, any flat feature can be simply draped on the ground, and the same stroke or fill symbolizers can be applied to the feature(s). Labels and markers (annotations) typically still face the user as billboards, even if their 3D position serves as a basis for projecting to a screen position.

Some additional capabilities are possible, however, such as defining styles specific to solid shapes / volumes, or using 3D models as markers. There is also the concept of object depth, which makes it possible to sort objects back to front using actual depth values.

With this in mind, rather than defining something entirely distinct, it makes sense to extend the classic GIS styling model with 3D capabilities.

5.3. A common flexible geographic features styling conceptual model

A great advantage of unifying styling representations is to be able to easily integrate any styled GIS content within an AR application.

This is done through the same mechanism as specifying cartographic styling for vector features in a typical 2D cartographic view: a symbolizer concept associated with a feature, described in a generic manner which accommodates 3D views just as well as 2D views.

As part of the ongoing next iteration of defining a standard conceptual model for styling capabilities, these possibilities were also considered within the Testbed-14 portrayal thread activities.

This conceptual model for styling features, as well as an encoding for it, is detailed in Appendix C.

5.4. Linking to attributes from source data in annotations

A great level of flexibility is gained by configuring, entirely on the client, how the Augmented Reality content is displayed, whether over the camera feed or in a rendered virtual 3D view of the integrated urban model represented by the CityGML data. This linking can be established by specifying styling rules referencing available attributes associated with the geospatial data. In the approach used in this Testbed, the description of annotation elements was not hardcoded in either the source CityGML or the transmission format.

5.5. Extended capabilities applying specifically to AR

Capabilities very specific to Augmented Reality could be extensions to a conceptual model for styling geospatial data. This would provide a mechanism by which to re-integrate ARML capabilities within this conceptual model.

5.6. Indoor localization challenges: pattern matching, computer vision

At very close range, GPS precision is not sufficient to accurately register the camera with the surroundings. Indoors, the GPS signal is blocked and devices cannot communicate with satellites. Virtual sensors, such as the Android fusion sensor providing six degrees of freedom by integrating the gyroscope, accelerometer and magnetometer in addition to input from the camera, may address some of these challenges. Recognizing features from a pre-existing data set and/or leveraging more advanced computer vision techniques on the camera feed could also help.

Access to detailed indoor data sets is key for looking at these challenges in the context of Augmented Reality. Perhaps insights from the Indoor Mapping and Navigation Pilot would prove useful. Additionally, hardware with stereoscopic cameras might offer better ways to more accurately register the view. It is also possible to apply pattern recognition, placing markers in the real world at specific locations where augmentations should appear.

Finally, augmentation of reality at a smaller scale is generally more interesting and useful, making it possible to display more relevant content that is part of the user’s immediate surroundings. However, the heavy reliance on the complex topic of computer vision implies more involved processes, and for this reason this was reserved for a future initiative.

6. Situating reality in a geospatial context

Augmented Reality applications leverage many embedded mobile device capabilities to merge the user’s real surroundings together with virtual elements (augmentations). These include the camera, the Global Positioning System (GPS) chip, as well as sensors such as the gyroscope, accelerometer, magnetometer and even the barometer (as an altimeter).

In the context of a geospatial Augmented Reality application, the user’s location is also related to an absolute global frame of reference, typically through the location information provided by GPS coordinates.

This chapter provides an overview of these capabilities and how they each play their role in making AR possible on a mobile device. This chapter is written from the perspective of the Ecere Android client making direct use of the camera and sensors, but the same fundamental concepts are utilized by toolkits such as ARKit on iOS.

6.1. The camera

The (rear-facing) camera of the mobile device provides a live feed of the user’s surroundings. It supplies an application with the backdrop in which augmentations can be integrated so that the user feels as though these are part of their actual surroundings. In computer vision-based Augmented Reality, the camera feed is not merely a backdrop but is analyzed in real time to identify anchoring patterns or to precisely locate features. Through the use of virtual fusion sensors such as Android’s six degrees of freedom sensor (TYPE_POSE_6DOF), the camera can also play an integral part in maintaining accurate information about the position and orientation of the device.

The Ecere client accessed the Android camera directly using the Android Camera 2 API. The Ecere client is built using the native eC programming language rather than Android’s Java development language. Further, the NDK API for Camera 2 was only added in Android API Level 24 (corresponding to Android 7.0 — Nougat). As there was a desire to support devices with an earlier version of Android, the Java Native Interface (JNI) had to be used. This decision required writing extra code to make use of the Camera API, thus complicating the task of implementing camera support. By using the Camera 2 API through JNI, it was possible to support Android devices starting from API Level 21 (corresponding to Android 5.0 — Lollipop). This code will likely be published under an open-source license as part of the Ecere SDK, and may be reusable in other projects (including projects written in other native languages such as C or C++, with minor modifications).

That camera code uses a capture session set up with repeating requests to continuously capture images. Using a lower resolution than the camera is capable of capturing can provide smoother performance. The performance of the current setup is still being investigated, as it is believed it can be further improved, potentially by taking a more direct path between capturing and displaying the image on the device.

6.2. The GPS (and other means of determining locations)

Obtaining GPS coordinates is the most accurate means of retrieving the device’s location. However, the on-board GPS chip only works outdoors and requires significant power. The network location mechanism can use cell tower and Wi-Fi triangulation, but these computed locations are fairly inaccurate.[1]

Ideally, a geospatial AR application should use the GPS unit to provide coordinates. This could prove a real challenge for indoor Augmented Reality applications (possibly, the location could first be obtained from outside, as close as possible to the building). Using the GPS provides the absolute frame of reference for the device’s position in relation to the Earth.

Because the Android NDK does not provide an API for obtaining the GPS coordinates and location updates, the JNI also had to be used to set up location updates and access the GPS chip in the Ecere Android client, through the LocationManager class.

In addition to latitude and longitude, the on-board GPS chip can provide altitude information. The altitude returned by Location.getAltitude() is the altitude above the WGS84 reference ellipsoid, not above mean sea level. The latter can be obtained by taking into account the difference between the EGM96 geoid and the WGS84 reference ellipsoid, which varies between roughly -100 and 80 meters.
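The correction described above amounts to a simple subtraction; the following is a minimal sketch (the method name and the undulation value in the example are hypothetical):

```java
public class GpsAltitude {
    // Location.getAltitude() returns height above the WGS84 ellipsoid.
    // Height above mean sea level is obtained by subtracting the local
    // geoid undulation N (the EGM96 geoid height relative to the
    // ellipsoid, roughly between -100 m and +80 m worldwide).
    static double meanSeaLevelAltitude(double ellipsoidalAltitude, double geoidUndulation) {
        return ellipsoidalAltitude - geoidUndulation;
    }

    public static void main(String[] args) {
        // Hypothetical GPS fix: 120 m above the ellipsoid at a location
        // where the geoid undulation is -34 m
        System.out.println(meanSeaLevelAltitude(120.0, -34.0)); // 154.0 m above mean sea level
    }
}
```

The undulation N for a given latitude and longitude would in practice come from an EGM96 lookup table or library, which is outside the scope of this sketch.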

This altitude information from the GPS has not yet been used in the Ecere client.

6.3. Motion sensors

A guide to the Android motion sensors is available in the Android developer documentation.

6.3.1. The magnetometer

A magnetometer senses the Earth’s magnetic field to establish the direction of magnetic north, much like a classic compass. The magnetometer is the sensor most likely to suffer from distortion errors, either from not being properly calibrated initially, or from being de-calibrated by nearby magnetic materials whose field can be much stronger than the Earth’s. It is possible to re-calibrate the magnetometer by waving the device in a figure-8 pattern over one’s head.

Remember that even when properly calibrated, the magnetometer indicates magnetic north, not geographic north. The local angular difference between the two, the magnetic declination, varies from one place to another on the globe, as well as over time as the magnetic pole moves (due to magnetic changes in the Earth’s core). This may have partially contributed to the difficulty of registering augmentations with the real world during the Testbed experimentation. For example, the magnetic declination is only approximately 4 degrees in Taiwan, but around 13.5 degrees in Eastern North America. Natural Resources Canada provides a convenient online calculator for the magnetic declination based on a given date, latitude and longitude.
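The declination correction itself is simple arithmetic on the heading; the sketch below illustrates it with hypothetical values (on Android, the declination for a given location and date could be obtained from the GeomagneticField class):

```java
public class TrueNorth {
    // Correct a magnetometer-derived heading (degrees clockwise from
    // magnetic north) to a true heading, given the local magnetic
    // declination (positive when magnetic north is east of true north).
    static double trueHeading(double magneticHeading, double declination) {
        double h = (magneticHeading + declination) % 360.0;
        return h < 0 ? h + 360.0 : h; // wrap back into [0, 360)
    }

    public static void main(String[] args) {
        // Hypothetical readings: 350° magnetic with +13.5° declination,
        // and 5° magnetic with -13.5° declination
        System.out.println(trueHeading(350.0, 13.5)); // 3.5
        System.out.println(trueHeading(5.0, -13.5));  // 351.5
    }
}
```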

The magnetometer, together with the gyroscope and accelerometer, allows establishing the 3D orientation of the device.

6.3.2. The gyroscope

The gyroscope sensor reports angular velocity. Together with the magnetometer, and with the accelerometer sensing the Earth’s gravity (used to establish the ground plane), an absolute 3D orientation can be calculated.

6.3.3. The accelerometer

An accelerometer senses linear acceleration as the device is moved. The sensor perceives the acceleration due to gravity at the Earth’s surface (g, roughly 9.8 m/s^2), which must be accounted for, but which also provides the horizontal frame of reference from which the conceptual electronic compass will interpret the magnetometer input.

6.3.4. The barometer (as an altimeter)

Mobile devices featuring a barometer, which measures atmospheric pressure, can use this information to establish a much more accurate altitude (at least in relative terms), based on a reference sea-level pressure. The challenge is to obtain an accurate reference sea-level pressure in order to derive an accurate absolute altitude.

6.3.5. 3D orientation

By fusing the information measured by the magnetometer, gyroscope, and accelerometer, an absolute 3D orientation of the device can be determined.

Because the 6 degrees of freedom pose sensor was not available on the Android devices used for the experiments, the TYPE_ROTATION_VECTOR sensor was used. This Android sensor type fuses information from the magnetometer, gyroscope and accelerometer to provide a rotation vector in the form of a quaternion.[2] The fourth component (w) of the quaternion is optionally provided on some devices (always provided from SDK Level 18 onwards), but it can otherwise be derived from the other three components, as the quaternion is normalized. During the experiments, it was found that special care must be taken to remap the coordinate system based on the default rotation of a device, to which the sensor information is relative. Most mobile devices have a portrait default rotation, but some tablets (such as the Nexus 10) default to a landscape orientation.
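Recovering the implied w component from the unit-norm constraint can be sketched as follows (the method name is hypothetical; Android performs an equivalent computation internally when building a rotation matrix from the vector):

```java
public class RotationVector {
    // TYPE_ROTATION_VECTOR reports the x, y, z components of a unit
    // quaternion (the rotation axis scaled by sin(θ/2)); on devices that
    // omit the fourth component, it can be recovered from the unit-norm
    // constraint: w = sqrt(1 - x² - y² - z²), i.e. cos(θ/2).
    static double impliedW(double x, double y, double z) {
        double t = 1.0 - (x * x + y * y + z * z);
        return t > 0.0 ? Math.sqrt(t) : 0.0; // clamp against rounding error
    }

    public static void main(String[] args) {
        // A 30° rotation about the vertical axis: z = sin(15°), w = cos(15°)
        double z = Math.sin(Math.toRadians(15.0));
        System.out.println(impliedW(0.0, 0.0, z)); // ≈ cos(15°) ≈ 0.9659
    }
}
```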

In the Ecere client, a further transformation had to be done to convert this orientation to the GNOSIS coordinate system, requiring a 90-degree pitch rotation followed by a negation of the roll component of the orientation. Finally, the magnetic declination has to be taken into account for the yaw component.

6.3.6. 3D position

By performing a first integration of linear acceleration, velocity can be determined, and by a second integration, a position. Such a position would not be very useful in a geographic context on its own, as it would only be relative to the user’s original position. However, by relating this position to the precise time at which a location update from the GPS was received, the relative movement provides the necessary intermediate position updates until the next location update is received. More involved calculations can smooth these estimates over time based on past discrepancies and predictions. A virtual fusion sensor could also leverage computer vision using a device’s camera, as an Android 6 degrees of freedom pose fusion sensor may do on recent devices [8].
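The two integration steps can be sketched as a naive forward-Euler accumulation; this is a minimal illustration of the principle only, as a real implementation would filter sensor noise (drift grows quadratically with time under double integration) and reset against each GPS fix:

```java
public class DeadReckoning {
    // Naive forward-Euler double integration of linear acceleration
    // samples (m/s²) taken at a fixed interval dt (seconds): the first
    // integration yields velocity, the second a position relative to
    // the starting point.
    static double integratePosition(double[] accel, double dt) {
        double v = 0.0, x = 0.0;
        for (double a : accel) {
            v += a * dt; // first integration: velocity
            x += v * dt; // second integration: relative position
        }
        return x;
    }

    public static void main(String[] args) {
        // Constant 1 m/s² acceleration sampled at 1 kHz for one second
        double[] accel = new double[1000];
        java.util.Arrays.fill(accel, 1.0);
        System.out.println(integratePosition(accel, 0.001)); // ≈ 0.5 m
    }
}
```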

On Android devices, it would seem that this type of comprehensive fused position and orientation information is only readily available through this particular sensor. As no device supporting it was available for the experiments, and as performing these calculations manually is quite involved, there was no time during the initiative to implement this intermediate position aspect for the Ecere client (positions currently only get updated when location updates are received).

6.4. Size, weight and power usage considerations

The use of location updates, sensors and camera capabilities is particularly costly in terms of power. Turning off such capabilities when not required is essential to minimize this impact. This is in addition to potentially heavy use of the Graphics Processing Unit (GPU) for rendering a large number of items and/or 3D geometry. This makes Augmented Reality applications rather power hungry, as they can quickly drain a device’s battery. Correspondingly, they also produce a lot of heat, and the size of a device can be related to both its ability to dissipate this heat and the capacity of its battery. As purely anecdotal information, the Testbed-14 participants saw two laptops and one tablet perish during this initiative!

These are things to consider for field operations where devices would be required to be used for long periods of time without the ability to recharge, especially if operations increasingly rely on such Augmented Reality applications. It was beyond the scope of this testbed initiative to perform an analysis of factors such as power usage, heat and battery duration of devices. The small team was kept busy for the entire duration of the testbed by the already significant workload of the primary objectives. These objectives included processing and serving relevant data sets, achieving interoperability between the different services and clients, as well as displaying augmentations properly registered with the real world. This could be investigated in future AR initiatives. However, this would require strictly determining many variables and evaluation criteria so as to produce useful results.

7. Augmented Reality Clients

Two mobile clients were built by Ecere and GIS-FCU during this Testbed-14 Augmented Reality initiative.

See also Chapter 6 for more details about how the sensors were used to situate the view in the clients, and Chapter 8 for information about how the 3D content was streamed from the services.

7.1. GNOSIS / Android mobile client by Ecere

Ecere built a client targeting Android mobile devices, leveraging their cross-platform Ecere and GNOSIS SDKs and making use of OpenGL ES, the Android sensors, and the Android Camera 2 API directly. The client would automatically connect to both services and request 3D content in a tiled manner based on the current view position. It provided buttons to jump to fixed locations for the datasets (Washington, D.C. and New York City) or to the last known GPS position, as well as to constantly reposition the camera based on GPS location updates.

The client currently requires an Android 5.0+ device, as well as support for OpenGL ES 3.0. It will be possible to lower this requirement to OpenGL ES 2.0 after some fallbacks are properly implemented in the Ecere graphics engine. Application packages (APKs) were built for ARM CPUs, both 32-bit (armeabi v7a) and 64-bit (arm64-v8a), and tested on multiple devices.

There are four buttons in the application:

  • Follow: Toggling this on (it is off by default) automatically repositions the camera (latitude, longitude) based on GPS updates. The altitude is currently maintained at 2 meters above sea level.

  • Here: Clicking this button repositions the camera at the last received GPS position.

  • NYC: Clicking this button moves the camera right above the Statue of Liberty.

  • DC: Clicking this button moves the camera inside the Capitol.

If Follow mode is not toggled on, the user can:

  • Pinch the screen with two fingers to zoom in and out (also changing altitude and position if not looking straight down).

  • Slide with one finger to move the camera around.

  • Double-click to smoothly transition the camera closer to where the clicks were performed.

Whether Follow is on or not, the user can simply rotate the device around freely to update the camera view orientation (yaw/pitch/roll).

These experiments focused on rendering 3D buildings, as streamed by the services.

nyc office1
Figure 2. View of Statue of Liberty, with Manhattan in the background, retrieved as E3D from Ecere service, over camera (faking location)
nyc office4
Figure 3. View of New York City theater, retrieved as E3D from Ecere service, over camera (faking location)