I. Abstract
This OGC Testbed 17 Engineering Report (ER) documents the work completed in the “Attracting Developers: Lowering the entry hurdle for OGC Web API experiments” task.
OGC Web API Standards are being developed to make it easy to provide geospatial data over the web. These standards provide a certain level of formality to ensure high levels of interoperability. They rigorously define requirements and rules to reduce room for error during interpretation. This rigor sometimes makes the standard documents difficult to read and hence implement. Rather than direct examination of a standard, the majority of developers often prefer to start with implementation guidelines, sample code, and best practice documentation and then refer to the standards document for guidance and clarity.
The Testbed-17 (TB-17) API task served as a foundation for further development and exploration and delivers knowledge necessary for agile development, deployment, and executing OGC Standards-based applications following a “How-To” philosophy with hands-on experiments, examples, and instructions.
II. Executive Summary
The Testbed-17 API work began with design and conduction of an online Clause 4 and then focused on the results derived from the answers given by the participants: The developers were mostly from the OGC community. The outcome of this survey is that a typical profile of a geospatial developer is generally comparable with the results of similar surveys, such as StackOverflow Developer Survey 2021, which have been done having considered much larger groups of developers with broader interests. The conclusion is that a typical developer from the OGC/geospatial community is an experienced person focused on stable, well-established technologies. Therefore, this ER focuses on newer technologies and approaches to fill the gap between “stable maturity” and “contemporary/emerging trends”.
The value of this ER to interoperability is the exploration of different options for OGC APIs in the context of contemporary development and deployment practices. For interoperability, cloud native services are often based on open source and standards based technology. This helps reduce vendor lock-in and results in increased portability. The ER establishes the repository of How To-examples (GitHub) enabling faster and more efficient geospatial data provision and consumption based on OGC APIs and modern/emerging technology stacks. The API experiments further advance location-based technologies by investigating the compatibility to OGC API Standards with, and the applicability to, technologies such as source code generation and cloud to OGC API Standards. The final result is more efficient development processes, which make building new tools cheaper, more efficient, and less stressful, as well as a broader developer audience attracted to OGC APIs.
The requirements addressed by the work documented in this ER were:
-
Provide easy to understand documentation as a starting point for Web developers to explore OGC APIs standards.
-
Establish and demonstrate a culture of continuous learning and experimentation in the OGC APIs context utilizing DevOps practices.
-
Provide instructions/guidelines to get started with OGC APIs within cloud environments.
-
Expose data sets from cloud storages for consumption using selected technology stacks and compatible OGC APIs.
-
Explore the compatibility of client/server code generation approach in conjunction with OGC APIs.
An overview of recommendations on how to further proceed with the achievements documented in this ER is given in sections Clause 9 and Clause 10.
III. Keywords
The following are keywords to be used by search engines and document catalogues.
ogcdoc, OGC document, API
IV. Preface
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. The Open Geospatial Consortium shall not be held responsible for identifying any or all such patent rights.
Recipients of this document are requested to submit, with their comments, notification of any relevant patent claims or other intellectual property rights of which they may be aware that might be infringed by any implementation of the standard set forth in this document, and to provide supporting documentation.
V. Security considerations
No security considerations have been made for this document.
VI. Submitting Organizations
The following organizations submitted this Document to the Open Geospatial Consortium (OGC):
- m-click.aero GmbH
VII. Submitters
All questions regarding this document should be directed to the editor or the contributors:
| Name | Organization | Role |
|---|---|---|
| Aleksandar Balaban | m-click.aero GmbH | Editor |
| Arne Vogt | 52°North GmbH | Contributor |
| Ignacio Correas | Skymantics Europe SL | Contributor |
| Sam Lavender | Pixalytics Ltd | Contributor |
| Karri Ojala | Solenix GmbH | Contributor |
| Alexander Lais | Solenix GmbH | Contributor |
OGC Testbed-17: Attracting Developers: Lowering the entry barrier for implementing OGC Web APIs
1. Scope
The scope of this OGC Testbed 17 ER covers the following topics:
-
Clause 3 introduces the problem that appears when starting with OGC APIs development. The overview describes the current situation and discusses the requirements set by the testbed Sponsors.
-
Clause 4 presents the results of developer survey.
-
Clause 5 illustrates the scenarios and experiments of this task.
-
Clause 6 describes the task architecture and the services/components.
-
Clause 8 explains the concept of code generation for OGC API.
-
Clause 9 summarizes the results of this API experiments.
-
Clause 10 provides conclusions and gives recommendations for future Work.
-
Clause 11 covers the Technology Integration Experiments between the DCS Server and DCS Client.
-
Annex A lists the questions used in the developer survey.
2. Terms, definitions and abbreviated terms
This document uses the terms defined in OGC Policy Directive 49, which is based on the ISO/IEC Directives, Part 2, Rules for the structure and drafting of International Standards. In particular, the word “shall” (not “must”) is the verb form used to indicate a requirement to be strictly followed to conform to this document and OGC documents do not use the equivalent phrases in the ISO/IEC Directives, Part 2.
This document also uses terms defined in the OGC Standard for Modular specifications (OGC 08-131r3), also known as the ‘ModSpec’. The definitions of terms such as standard, specification, requirement, and conformance test are provided in the ModSpec.
For the purposes of this document, the following additional terms and definitions apply.
2.1. Terms and definitions
2.1.1. Application Programming Interface
An Application Programming Interface (API) is a standard set of documented and supported functions and procedures that expose the capabilities or data of an operating system, application or service to other applications (adapted from ISO/IEC TR 13066-2:2016).
2.1.2. CI/CD
Continuous Integration, Continuous Delivery is a method to frequently deliver apps to customers by introducing automation into the stages of app development. (source: RedHat)
2.1.3. DevOps
a set of practices that combines software development (Dev) and IT operations (Ops).
2.1.4. Cloud
a network of computing resources, such as storage, servers, applications and services, that can be used on-demand anywhere and anytime via the Internet NIST.
2.1.5. Cloud-native
The concept of building and running applications to take advantage of the distributed computing offered by the cloud delivery model. Cloud native apps are designed and built to exploit the scale, elasticity, resiliency, and flexibility the cloud provides, Oracle.
2.1.6. Code Generation
Source code generation is the process of creating programming code from a model such as one of OGC APIs data schemas.
2.1.7. Containerization
the packaging of software code with just the operating system (OS) libraries and dependencies required to run the code to create a single lightweight executable—called a container—that runs consistently on any infrastructure, IBM.
2.1.8. CRUD
In computer programming, create, read (aka retrieve), update, and delete are the four basic functions of persistent storage.
2.1.9. OGC APIs
Family of OGC standards developed to make it easy for anyone to provide geospatial data to the web.
2.1.10. Technology Stack
A technology stack or tech stack refers to a set of technologies, software, and tools that are used in the development of applications.
2.2. Abbreviated terms
API
-
Application Programming Interface
CI/CD
-
Continuous Integration / Continuous Delivery
CNCF
-
Cloud Native Computing Foundation
COTS
-
Commercial Off The Shelf
DevOps
-
Software development (Dev) and IT operations (Ops)
EDR
-
Environmental Data Retrieval
IDL
-
Interface Definition Language
JSON
-
JavaScript Object Notation
OGC
-
Open Geospatial Consortium
OWS
-
OGC Web Services
STAC
-
Spatio temporal Asset Catalog
3. Introduction
This OGC Testbed 17 ER documents the experiments performed with OGC API Standards for lowering the entry barrier for developers implementing OGC API Standards. The ER describes all developed services and components, the results of Technology Integration Experiments (TIE), provides an executive summary, and describes recommended future work.
4. Developer Survey
A pivotal activity within the TB-17 API experiments task was to design and conduct an online survey. The target survey universe was developers from the geospatial community. The survey objective was to collect information about their profile and behaviors, as well as their needs and their satisfaction with current and emerging OGC API Standards. The survey received responses from 125 participants. Answers from the survey helped the testbed participants understand the current situation, design the API experiments, recommend improvements for OGC APIs documentation, and provide useful tutorials with code examples for the community.
The answers were evaluated graphically and are available in Annex A. The following sections provide a detailed summary of survey’s findings:
4.1. Who are our developers?
More than 80% of the respondents reside in Europe and the United States of America; and many of the respondents were experienced developers.
Figure 1
-
Almost 75% of developers are older than 35!
-
More than 70% of the respondents have Masters degrees or/and doctorates.
-
Almost 80% are full-time employed, 20% work for public agencies, 30% for small businesses.
-
53% of the respondents currently reside in Europe and almost 29% in North America
-
60% of them stated that they spend 21 hours or more per week on tasks related to geospatial technologies (Q19).
-
71% of the respondents have been coding for more than 8 years.
4.2. Technology Stack
As expected, Python and JavaScript are the most popular programming languages.
Figure 2
-
Regarding programming languages preferences (Q11), developers are most familiar with Python (about 80%) followed by JavaScript (58%). Many of them are also fluent in more traditional C/C++ programming languages (about 38%). Java is still well presented in their technology stacks.
-
Developers are very familiar with the mainstream open-source geospatial projects. The most popular is the QGIS framework (for 90.40% of survey participants) followed by PostGIS 82.40%, GeoServer 65.60%, and MapServer 48.80%. This also reflects the fact that features and maps are the most significant and prominent geospatial data formats.
Figure 3
4.3. Learning Habits
-
The preferable way to learn a technology seems to be to “follow a documented how-to/tutorial” (30%) and “to read a tutorial” (20%).
-
When learning about a new topic from the internet, “full written descriptions” are the most preferable sources.
-
Portal Stack Overflow (Q29) seems to be the most important tool for getting support from the community (80%).
4.4. Familiarity with OGC (API) Standards
-
About 20% of the respondents have contributed to the development of OGC API Standards on GitHub. Still, these standards are not yet widely used in the community because, although 75% of respondents have either used an OGC Standard or read one of the documentations (Q24), a high percentage (30%) answered that they don’t have any experience involving OGC API Standards (Q25).
Figure 4
-
Features, Maps and Tiles are the most important geospatial standards and therefore the corresponding OGC APIs are of equal importance (Q27).
-
Some of the respondents believe OGC API Standards are not easy to learn. On the question of whether OGC API Standards are easy to learn (Q30) almost 30% answered that “Mastering the first API is hard, but then it becomes easier to learn” while 40% thinks that “Some are very easy and others are extremely difficult to learn”.
-
About 57% believes that there is a lack of best-practice guides that help differentiate the dos and don’ts”, as well as 38% that technical documentation is hard to read or understand (Q32)!
4.5. Summary
The survey results suggest that a typical developer from the OGC/geospatial domain is an experienced, well-educated individual focused on stable, well-established technologies. Traditional and new technologies are equally used. For example they follow the recent trends regarding scripting programming languages as they are predominant today.
Considering the results, the API experiments focused on developing examples and how-to documentation as well as newer technologies and approaches to fill the gap between “stable maturity” and “contemporary/emerging trends”.
Annex A lists the questions used to conduct developers survey.
5. API Experiments Scenarios Overview
The following figure illustrates the work items and deliverables of this task.
Figure 5 — Deliverables and Packages
5.1. Participants
To test all components and guidance material, the work in this task was shared by participants acting as:
-
Client developers
-
Server developers
-
Data providers
Client developers: Organization O1 implemented and delivered a client-side software library in Python for both OGC API Features and OGC API Environmental Data Retrieval (EDR). Organization O2 delivered a similar library in JavaScript and TypeScript. Both organizations demonstrated their library in a client application.
Server developers: Organization O3 implemented and delivered a server-side software library in Python for both OGC API — Features and OGC API — EDR. The library supported access and exploration of both OGC API — Feature and OGC API — EDR resources and supported data hosted on cloud object storage, cloud databases, and OGC Web Feature Service (WFS) instances (with constraint functionality). O3 provided a deployable instance together with deployment and execution documentation that allows third parties to test the deliverable(s) as a microservice in different cloud environments. Organization O4 delivered a similar library in JavaScript (alternatively TypeScript).
Data providers: Organization O5 deployed a microservice based on deliverables provided by O3 and O4 in at least two different cloud environments (to the extent possible, as some data backends are cloud specific, others work across cloud environments or are fully cloud independent). In addition, O5 provided data in a database, cloud object storage, and as a Web Feature Service instance to test data access (reuse of existing WFS and/or Web Coverage Service (WCS) instances, even outside of the target cloud environment, is admitted). All of the data sources were available to all organizations in this task and are freely accessible (i.e. object storage and WFS/WCS endpoints are fully accessible, together with cloud native databases as much as possible). Organization O6 performed similarly to O5.
Participants:
-
Data providers:
-
Skymantics: D167 Data Backend and Deployment 1
-
Pixalytics: D168 Data Backend and Deployment 2
-
-
Services (server developers):
-
52N: D165 API Experiments Server (Python)
-
GMU: D166 API Experiments Server (JavaScript)
-
-
Client developers:
-
Skymantics: D175 API Experiments Client (Python)
-
Solenix: D176 API Experiments Client (TypeScript)
-
5.2. API experiment scenarios
The figure below illustrates a generic Testbed-17 OGC APIs experiment component diagram with interactions. A client with a dedicated OGC API software library (1) interacts with a data service that implements one of the OGC APIs (2) at the front-end and includes a cloud native data storage software library (3) at the backend that interacts with cloud native storage, such as cloud databases, cloud object storage, or existing OGC Web Service instances such as WFS.
Figure 6 — Generic API experiments scenario
6. Architecture and Components
This chapter describes the experiment architecture, focusing on addressing the concepts of cloud-native architecture in conjunction with OGC APIs. Component diagrams are primarily used to describe the structure, while process, deployment and other diagrams were used on demand. Another chapter, on TIE experiments, lists all interactions among components which were used to validate the approach.
6.1. Key Features of a Cloud-Native Application Architecture
“Cloud native” is a design paradigm governed by the Cloud Native Computing Foundation (CNCF). Cloud native applications can be defined as applications running in a containerized environment, capable of moving in and out of the cloud, and that scale horizontally based on varying workloads. They are deployed as microservices, abstracted to underlying infrastructure, and delivered through a DevOps pipeline. Often, cloud-native applications are orchestrated by an orchestration infrastructure. Following are the characteristics as defined by CNCF:
Containerization
-
Cloud native applications are infrastructure agnostic and use containers. Containers provide the application with a lightweight runtime, libraries, and dependencies that allow the application to run as a stand-alone environment able to move in and out of the cloud—independent of the nuances of a certain cloud provider’s virtual servers or computer instances. With this, containers are able to increase mobility between different environments.
API Driven
-
The term means to ensure compatibility across platforms and simplify complexity by using APIs. The most common API pattern used in the industry is RESTful API. RESTful APIs are a component of loosely coupled architecture. REST systems interact through standard operations on resources, and do not rely on the implementation of interfaces.
Microservices
-
The term “microservice” is a design paradigm used in cloud-native applications to break down large components into multiple stand-alone deployable parts. The component is divided and deployed into smaller functional units. This way, maintenance in a module does not affect the other functionalities, since they can independently work on their own.
Horizontal Scaling
-
A key property of cloud-native applications is the ability to horizontally scale in and out based on the size of their payload.
Service Mesh
-
A service mesh is a dedicated infrastructure layer for handling service-to-service communication. The service mesh is responsible for the reliable delivery of requests through the potentially complex network topology of services. In practice, the service mesh is implemented as an array of lightweight network proxies that are deployed alongside application code, without the application needing to be aware.
Delivery Pipeline
-
The state-of-the art software delivery pipeline is supported by the concepts of Continuous Integration/Continuous Delivery (CI/CD). CI/CD has the following stages/components:
-
Source code is stored in a Repository
-
Each time new code is pushed to a Repository, CI is automatically triggered to create a build (image).
-
The images are then stored in a Registry or Storage Bucket.
-
Through CD, images are pulled from the registry for deployment.
-
CD then pushes the image to the target environment (e.g., the Kubernetes cluster).
-
6.2. Component Description Template
This section provides a template for formal description of components involved in the API experiments. The template is structured based on the following characteristics:
-
Functional Description
-
Implemented, supported or required OGC APIs
-
Used data sources
-
Technology Stack
-
Components, libraries, configurations
-
Development paradigm/process/approach:
-
Model Drives Software Development (MDSD), Domain Specific Language (DSL), Code/API generation?
-
Microservices? Service mesh?
-
Cloud native design/implementation?
-
-
CI/CD:
-
Pipeline
-
Containerization
-
Container orchestration
-
Cloud deployment
-
7. Component Design and Implementation Details
7.1. D165 API Experiments Server (Python)
The API Experiments Server implementation comprised separate implementations of OGC API — Features (Features) and OGC API — Environmental Data Retrieval (EDR). Both server implementations were developed in the Python programming language and around the Flask web framework. Flask provides the core capabilities to implement web interfaces. The basic implementation of both OGC API Standards is supported by the use of code generators. The model classes and controller classes (server stubs) are automatically generated from the respective OpenAPI specifications of Features and EDR by using the OpenAPI Generator project. OpenAPI Generator offers a specific code generator that produces basic Flask-based executable server applications.
The source code, corresponding documentation and deployment instructions are available on Github.
7.1.1. OGC API — Features Server
The architecture of the Features implementation is structured into modules. The modular approach enables different types of data backends to be easily integrated in a similar way. Within this task, connectivity’s for OGC Web Feature Service (WFS) and Elasticsearch backends were implemented.
7.1.1.1. Core Architecture
-
Controller: Controller classes accept API requests, choose the correct backend to handle a request and return the result to the client. The controller classes (empty stubs) are generated by the OpenAPI code generator along with the model classes.
-
Request transformer: A request transformer is specific to the backend type. Request transformers convert a request to the OGC API — Features into a corresponding request to the specific data backend. For example a request to the /items endpoint is converted into a WFS GetFeature request.
-
Query transformer: A query transformer is a backend specific implementation that translates query parameters into the domain specific language of the backend. For example a query transformer translates the datetime filter parameter of the /items endpoint into a WFS filter expression.
-
Format transformer: A format transformer is optionally interposed if a data backend has no support for the GeoJSON format.
-
Backend: A backend stores the configuration for an instance of a backend type (e.g URL of a WFS instance) and determines which type of request transformer has to be used for that specific backend type.
Figure 7 — schematic representation of the architecture of the OGC API — Features implementation with two data backends
7.1.1.2. Data Backends
7.1.1.2.1. OGC Web Feature Service Backend
The WFS backend can be used to create an OGC API — Features wrapper for an arbitrary WFS (2.x) instance. A request to the /items endpoint is converted to a GetFeature WFS request. The metadata for the collection endpoints of Features is automatically obtained and parsed from the WFS capabilities document.
7.1.1.2.2. Elasticsearch Backend
The Elasticsearch backend implementation is mainly meant to be used with the OGC API — Records GeoJSON features that were generated and indexed in the D168 Data Backend and Deployment task. With the Elasticsearch backend the Features implementation can be used as an OGC API — Records implementation at the same time since the API endpoints are identical. The Elasticsearch backend is able to connect to Amazon Web Service (AWS) Elasticsearch Service/OpenSearch Service instances. Therefore the Elasticsearch Python package version 7.13.4 must be used because newer versions of the package do not connect to the AWS flavor of Elasticsearch.
In general the Elasticsearch backend is able to serve generic GeoJSON features indexed in an Elasticsearch index. An Elasticsearch index corresponds to a collection. The assumption is made that the GeoJSON features match the general structure of the OGC API — Records definitions in order to implement the filter capabilities (e.g. datetime filter) of OGC API — Features.
7.1.2. OGI API — EDR (Environmental Data Retrieval) Server
In the same way as the Features implementation, the EDR implementation has a modular structure in order to enable seamless integration of different data backend types. Within this task the capability to perform data queries on NetCDF files was implemented. The EDR server application supports the position, radius and area data query endpoints of the EDR Standard. Though the OGC API — Features and — EDR implementations are separate applications the overall architecture is roughly the same.
When using the code generator there were originally many errors in the generated code for the EDR OpenAPI specification. The main reasons for the errors are the frequent use of the anyOf and oneOf keywords in the OpenAPI specification and the fact that the EDR specification is more comprehensive and complex compared to the Features specification. In order to mitigate these issues a custom, lean and simplified version of the OpenAPI EDR Standard was created. This custom version supports only two-dimensional, temporal data and only serves netCDF data as response. All data query endpoints that were not planned to be implemented — that means all except position, radius and area — were removed from the implementation to keep it as lean as possible. This resulted in fewer errors in the generated code which could be fixed manually afterwards.
7.1.2.1. Core Architecture
-
Controller: Controller classes accept API requests, choose the correct backend to handle a request and return the result to the client. The controller classes (empty stubs) are generated by the OpenAPI code generator along with the model classes.
-
Request transformer: A request transformer is specific to the backend type. A Request transformer converts a request to the OGC API — EDR interface into a corresponding request to the specific data backend. A request transformer delegates the execution of data query requests to a data query transformer.
-
Data query transformer: A query transformer is a backend specific implementation that executes data query requests for data served by a data backend.
-
Format transformer: A format transformer is optionally interposed if a data backend has no support for the NetCDF format. Since the OGC API — EDR implementation only serves netCDF files and has only a NetCDF backend there are no format transformers used.
-
Backend: A backend stores the configuration for an instance of a backend type (e.g location of a NetCDF file) and determines which type of request transformer has to be used for the specific backend type.
Figure 8 — schematic representation of the architecture of the OGC API — EDR implementation with a single data backend
7.1.2.2. Data Backends
7.1.2.2.1. netCDF backend
The NetCDF backend parses NetCDF files. Each file corresponds to an instance of a collection. The backend is adapted to the use of NetCDF land cover classification data provided by the D168 Data Backend and Deployment task. The general assumptions about the internal structure of these files is that each file contains a single data array, which has the dimensions x, y and time.
To implement data queries the NetCDF backend implementation mainly uses the Python packages Rasterio, xarray and Shapely. In addition, a package called rioxarray is used. Rasterio is used to clip raster data to a specific geometry and xarray is used to implement filter capabilities like the optional datetime filter. The internal data structure of xarray resembles the structure of a NetCDF file. rioxarray combines the functionalities of Rasterio and xarray which supports convenient use without transformation between (internal) data formats. Shapely is used to parse WKT geometries that are passed to the data query endpoints as query parameters.
7.1.3. Overall Design Decisions
7.1.3.1. Separating Implementations of OGC API — Features and OGC API — EDR
The implementation of OGC API — Features and OGC API — EDR was split into two separate services that do not share any common code base. Integrating Features and EDR into a single API instance was considered but rejected.
Another approach that was also rejected was the creation of a Python package that contained the (common) data model classes and then imported by both applications. The decision for separate applications was based on the difficulties that arose with code generation and that both OGC API Standards contain a few deviations or different concepts despite many similarities in detail.
Regarding code generation, the results for OGC API — Features and especially OGC API — EDR already showed that code generation becomes more error prone with increasing complexity of the underlying OpenAPI specification. The generated code for OGC API — EDR contained more errors compared to the code for the (simpler) Features specification. This led to the necessity to simplify the OpenAPI definition document provided with the OGC API — EDR. An OpenAPI specification that integrates Features and OGC API — EDR would have been even more complex and therefore the generated code would have been flawed and hard to rectify.
From the conceptual point of view, there are some concepts that have similar names but the meaning is slightly different. For example both APIs specify the /collections/{collectionID} endpoint to retrieve metadata for collection of data. In Features a collection is always a set of features while in OGC API — EDR Standard a collection can also be a coverage. In addition, in OGC API — EDR a collection can have multiple instances such as distinguishing multiple measurement series. These conceptual differences lead to slight differences in the data model. In OGC API — EDR a collection has additional attributes compared to the OGC API — Features Standard. As described, in OGC API — EDR a collection has multiple instances (a list of collections) and furthermore has supplemental attributes to describe measurement parameters.
A common data model for both implementations would also have had to meet the requirements of the OGC API — EDR Standard although some attributes are not required in the OGC API — Features Standard. This means that the same class of data model is used differently in different contexts. This can be seen as an anti-pattern. The OGC API Standards are not yet fully aligned with OGC API — Common, partly because OGC API – Common was still a candidate standard at the time of this testbed. As such, the data models have a few (minor) deviations. Since the first version of the OGC API — EDR Standard was only recently published and adjustments to the specification and data model are to be expected; this could also make a common code base hard to maintain.
7.1.3.2. Development and Deployment
Development, testing and deployment of both server applications was performed using a container-based platform for building applications called Docker. Docker simplifies service dependency management as the developer/user does not need to install dependencies on a local system (except the Docker engine itself). However, deploying and running the applications without Docker is still possible if all dependencies are installed manually. The decision to primarily use Docker stems from the need to have other software libraries installed in addition to simple Python packages. In particular the EDR implementation (respectively the used Python packages like Rasterio) require an installation of GDAL. While the Python package can easily be installed in a virtual environment with simple tools like Venv, the installation of GDAL needs more advanced tools like Conda. The assumption is that Docker is more known and better understood by most developers and users. In addition, with Docker the applications are completely separated from the environment and operation system.
In order to support development and production-ready deployment, the code repository contains Docker files and Docker Compose definitions with development and production profiles. The development profile uses the built-in Flask development server (which is not meant to be used in a production environment) and installs additional debugging dependencies. This allows developers to attach a remote debugger to the running Docker container and debug the application in the development environment of their choice (e.g. Visual Studio Code). The production profile does not contain any debugging dependencies and uses nginx and uWSGI instead of the development server and in order to serve requests fast and securely.
7.2. D166 API Experiments Server (JavaScript)
This section describes the server component as developed, documented and delivered by task D166 (JavaScript) and communicates all experiences and lessons learned back to the server provider to help optimizing documentation and scripts.
7.2.1. Development Environment
This implementation uses JavaScript to develop servers of OGC API-Features and API-Environmental Data Retrieval (EDR). The base system is as follows.
-
Debian-based Linux systems (Debian 10 or higher or Ubuntu 18.04 or higher)
-
NodeJS >= 10.6
-
NPM >= 6.10.0
-
GDAL 2 or higher
-
PostgreSQL 12+/Proj 6+/PostGIS 3+(optional)
7.2.2. Architecture
Figure 9 — Overall architecture
7.2.3. Deployment and Test Services
The demonstration service was deployed at https://aws4ogc17.webmapengine.com/ . This deployment is based on an Amazon Machine Image (AMI).
7.2.3.1. Deployment Through Source Code
This section contains information on how to deploy the JavaScript server implementation through cloning the source code with configuration.
7.2.3.1.1. Requirements of Virtual Machine
The following are tested and recommended system configurations:
-
Ubuntu 18.04 or higher or Debian 10 (Buster) or higher
-
NodeJS (>= 10.6) and NPM (>= 6.10.0)
-
git
-
GDAL 2 or higher
-
PM2 Process Management
-
HTTP server (e.g. Apache or nginx) for proxy
7.2.3.1.2. Cloning of Source
Use git to clone the source into a local work directory.
git clone https://github.com/opengeospatial/T17-API-D166.git
7.2.3.1.3. OGC API — Features
Change working directory to “T17-API-D166/features”. The following command will pre-install all required libraries.
npm install
Start the Service
Run the following command to start the server as a service.
sudo pm2 start index.js
Run the following command to stop the server.
sudo pm2 stop index.js
Configurations
Service Endpoints
Edit config.js to configure the running port and the external service endpoint. The external service endpoint can be different from the local server endpoint with port if the proxy is set up to redirect the service. If both services (API-Features and EDR) run on the same virtual machine, different ports should be configured.
Configuring Back-end PostGIS Database
Edit the file ‘data/collections.json’ to add a new back-end PostGIS database. The new element should be added under node ‘resources’. The following is one example -
"tl_2020_us_county": {
"type": "collection",
"title": "Tiger Line US Counties",
"description": "Counties, US Census, TIGER/Line",
"keywords": [
"counties",
"US",
"TIGER/Line"
],
"links": [
{
"type": "text/html",
"rel": "canonical",
"title": "information",
"href": "https://www.census.gov/geographies/mapping-files/time-series/geo/tiger-line-file.html",
"hreflang": "en-US"
}
],
"extents": {
"spatial": {
"bbox": [
-179.231086,
-14.601813,
179.859681,
71.439786
],
"crs": "http://www.opengis.net/def/crs/OGC/1.3/CRS84"
}
},
"providers": [
{
"type": "feature",
"name": "PostGIS",
"data": "postgresql://tluser:tl2020@localhost:5432/tl",
"id_field": "ogc_fid",
"title_field": "name",
"table": "tl_2020_us_county",
"geometry":{
"geom_field":"wkb_geometry",
"geom_format":"ewkb"
}
}
]
},
Configuring Back-end WFS Service Endpoints
Edit the file ‘data/collections.json’ to add new back-end WFS Service Endpoints. The new element should be added under node ‘resources’. The following is one example to add one single feature collection -
"DC_Building_Footprints": {
"type": "collection",
"title": "DC Building Footprints",
"description": "DC building footprints.",
"keywords": [
"DC",
"US",
"building",
"footprint"
],
"links": [
{
"type": "text/html",
"rel": "canonical",
"title": "information",
"href": "https://cubewerx.pvretano.com/cubewerx/cubeserv/default/ogcapi/usbuildingfootprints/collections/DC_Building_Footprints",
"hreflang": "en-US"
}
],
"extents": {
"spatial": {
"bbox": [
-77.115085,
38.810444,
-76.909707,
38.99561
],
"crs": "http://www.opengis.net/def/crs/OGC/1.3/CRS84"
}
},
"providers": [
{
"type": "feature",
"name": "WFS202",
"id_field": "gml_id",
"typename": "DC_Building_Footprints",
"data": "https://www.pvretano.com/cubewerx/cubeserv?datastore=usbuildingfootprints&"
}
]
},
All feature collections may be added as the back-end service. The following resource configuration shows an example using WFS capabilities.
"daraa": {
"type": "collections",
"title": "CubeSERV WFS - Daraa",
"description": "CubeSERV WFS - Daraa service, support version 2.0.2.",
"keywords": [
"WFS",
"feature",
"capabilities"
],
"links": [
{
"type": "text/html",
"rel": "canonical",
"title": "information",
"href": "https://www.pvretano.com/cubewerx/cubeserv?DATASTORE=daraa&SERVICE=WFS",
"hreflang": "en-US"
}
],
"providers": [
{
"type": "collection",
"name": "WFS202Capabilities",
"data": "https://test.cubewerx.com/cubewerx/cubeserv/demo?DATASTORE=Daraa&SERVICE=WFS&REQUEST=GetCapabilities&AcceptVersions=2.0.2&AcceptFormats=text/xml",
"removeprefix":"cw"
}
]
},
7.2.3.1.4. OGC API — Environmental Data Retrieval
Change working directory to “T17-API-D166/edr”. The following command will pre-install all required libraries.
npm install
Start the Service
Run the following command to start the server as a service.
sudo pm2 start index.js
Run the following command to stop the server.
sudo pm2 stop index.js
Configurations
Service Endpoints
Edit config.js to configure the running port and the external service endpoint. The external service endpoint can be different from the local server endpoint with port if the proxy is set up to redirect the service.
Configuring Back-end PostGIS Database
Edit the file ‘data/collections.json’ to add new back-end PostGIS database. The new element should be added under node ‘resources’. The following is one example -
"noaa_global_hourly_surface": {
"type": "collection",
"title": "The Integrated Surface Dataset (global, hourly)",
"description": "The Integrated Surface Dataset (ISD) is composed of worldwide surface weather observations from over 35,000 stations, though the best spatial coverage is evident in North America, Europe, Australia, and parts of Asia. Parameters included are: air quality, atmospheric pressure, atmospheric temperature/dew point, atmospheric winds, clouds, precipitation, ocean waves, tides and more. ISD refers to the data contained within the digital database as well as the format in which the hourly, synoptic (3-hourly), and daily weather observations are stored. The format conforms to Federal Information Processing Standards (FIPS). ISD provides hourly data that can be used in a wide range of climatological applications. For some stations, data may go as far back as 1901, though most data show a substantial increase in volume in the 1940s and again in the early 1970s. Currently, there are over 14,000 'active' stations updated daily in the database. For user convenience, a subset of just the hourly data is available to users for download. It is referred to as Integrated Surface Global Hourly data, see associated download links for access to this subset.",
"keywords": [
"Integrated Surface Dataset",
"Global",
"NOAA"
],
"links": [
{
"type": "text/html",
"rel": "canonical",
"title": "information",
"href": "https://www.ncdc.noaa.gov/isd",
"hreflang": "en-US"
}
],
"extents": {
"spatial": {
"bbox": [
-180.00,
-90.00,
180.00,
90.00
],
"crs": "http://www.opengis.net/def/crs/OGC/1.3/CRS84"
},
"temporal": {
"interval": [{
"begin":"1972-01-01T00:00:00Z",
"end":"1972-12-31T23:59:59Z"
}],
"trs": "TIMECRS[\"DateTime\",TDATUM[\"Gregorian Calendar\"],CS[TemporalDateTime,1],AXIS[\"Time (T)\",future]"
}
},
"data_queries" : {
"position": {
"link": {
"href": "/collections/noaa_global_hourly_surface/position?coords={coords}",
"hreflang": "en",
"rel": "data",
"templated": true,
"variables": {
"title": "Position query",
"description": "Position query",
"query_type": "position",
"coords" :{
"description": "Well Known Text POINT value i.e. POINT(-120, 55)"
},
"output_formats": [
"CoverageJSON",
"GeoJSON",
"IWXXM"
],
"default_output_format": "IWXXM",
"crs_details": [
{
"crs": "CRS84",
"wkt": "GEOGCS[\"WGS 84\",DATUM[\"WGS_1984\",SPHEROID[\"WGS 84\",6378137,298.257223563,AUTHORITY[\"EPSG\",\"7030\"]],AUTHORITY[\"EPSG\",\"6326\"]],PRIMEM[\"Greenwich\",0,AUTHORITY[\"EPSG\",\"8901\"]],UNIT[\"degree\",0.01745329251994328,AUTHORITY[\"EPSG\",\"9122\"]],AUTHORITY[\"EPSG\",\"4326\"]]"
}
]
}
},
"linkrel":"relative"
},
"radius": {
"link": {
"href": "/collections/noaa_global_hourly_surface/radius?coords={coords}",
"hreflang": "en",
"rel": "data",
"templated": true,
"variables": {
"title": "Radius query",
"description": "Radius query",
"query_type": "radius",
"coords" :{
"description": "Well Known Text POINT value i.e. POINT(-120, 55)"
},
"output_formats": [
"CoverageJSON",
"GeoJSON",
"IWXXM"
],
"default_output_format": "GeoJSON",
"within_units": [
"km",
"miles"
],
"crs_details": [
{
"crs": "CRS84",
"wkt": "GEOGCS[\"WGS 84\",DATUM[\"WGS_1984\",SPHEROID[\"WGS 84\",6378137,298.257223563,AUTHORITY[\"EPSG\",\"7030\"]],AUTHORITY[\"EPSG\",\"6326\"]],PRIMEM[\"Greenwich\",0,AUTHORITY[\"EPSG\",\"8901\"]],UNIT[\"degree\",0.01745329251994328,AUTHORITY[\"EPSG\",\"9122\"]],AUTHORITY[\"EPSG\",\"4326\"]]"
}
]
}
},
"linkrel":"relative"
},
"area": {
"link": {
"href": "http://www.example.org/edr/collections/hrly_obs/area?coords={coords}",
"hreflang": "en",
"rel": "data",
"templated": true,
"variables": {
"title": "Area query",
"description": "Area query",
"query_type": "area",
"coords" :{
"description": "Well Known Text POLYGON value i.e. POLYGON((-79 40,-79 38,-75 38,-75 41,-79 40))"
},
"output_formats": [
"CoverageJSON",
"GeoJSON",
"BUFR",
"IWXXM"
],
"default_output_format": "CoverageJSON",
"crs_details": [
{
"crs": "CRS84",
"wkt": "GEOGCS[\"WGS 84\",DATUM[\"WGS_1984\",SPHEROID[\"WGS 84\",6378137,298.257223563,AUTHORITY[\"EPSG\",\"7030\"]],AUTHORITY[\"EPSG\",\"6326\"]],PRIMEM[\"Greenwich\",0,AUTHORITY[\"EPSG\",\"8901\"]],UNIT[\"degree\",0.01745329251994328,AUTHORITY[\"EPSG\",\"9122\"]],AUTHORITY[\"EPSG\",\"4326\"]]"
}
]
}
}
},
"locations": {
"link": {
"href": "/collections/noaa_global_hourly_surface/locations",
"hreflang": "en",
"rel": "data",
"templated": false,
"variables": {
"title": "Location query",
"description": "Location query",
"query_type": "locations",
"output_formats": [
"application%2Fgeo%2Bjson",
"text%2Fhtml"
],
"default_output_format": "application%2Fgeo%2Bjson",
"crs_details": [
{
"crs": "CRS84",
"wkt": "GEOGCS[\"WGS 84\",DATUM[\"WGS_1984\",SPHEROID[\"WGS 84\",6378137,298.257223563,AUTHORITY[\"EPSG\",\"7030\"]],AUTHORITY[\"EPSG\",\"6326\"]],PRIMEM[\"Greenwich\",0,AUTHORITY[\"EPSG\",\"8901\"]],UNIT[\"degree\",0.01745329251994328,AUTHORITY[\"EPSG\",\"9122\"]],AUTHORITY[\"EPSG\",\"4326\"]]"
}
]
}
},
"linkrel":"relative"
}
},
"crs": [
"CRS84"
],
"output_formats": [
"application/geo+json",
"text/html"
],
"parameter_names": {
},
"providers": [
{
"type": "feature",
"name": "PostGIS",
"data": "postgresql://noaauser:noaa2020@localhost:5432/noaa",
"id_field": "ogc_id",
"locid_field": "station",
"title_field": "station",
"table": "noaa_global_surface",
"geometry":{
"geom_field":"the_geom",
"geom_format":"ewkb"
},
"time_field":{
"time_format":"datetime",
"datetime":"date"
}
}
]
},
Configuring Back-end WFS Service Endpoints
Service instances can be added as back-end. This is especially useful for those with “instances” path.
7.2.4. Running Through Docker
Docker images have to be created for a quick-start of the servers.
7.2.4.1. Running OGC API — Features Through Docker
The following command runs the service directly using Docker:
docker run -p 8080:8080 -d eugenegmu/ogc-api-features-javascript
To test the server, you may browse to http://localhost:8080 to test the results. This image does not have the PostGIS set up locally. The first three collections do not work properly. Configuration needs to be done if a proper PostGIS database is set up with populated data. If a port is redirected to a different port other than 8080, the configuration needs to be updated. To get into the image, the following command may be used.
docker exec -it <container id or name> /bin/bash
The container ID or name can be found by running “docker container ls”.
7.2.4.2. Running OGC API — EDR Through Docker
The following command runs the service directly using Docker:
docker run -p 8080:8080 -d eugenegmu/ogc-api-edr-javascript
To test the server, browse to http://localhost:8080/edr/ to test the results. This image does not have the PostGIS set up locally. Again, configuration needs to be done with a proper PostGIS database set up with populated data. If a port is redirected to a different port other than 8080, the configuration needs to be updated. To get into the image, the following command may be used.
docker exec -it <container id or name> /bin/bash
The container ID or name can be found by running “docker container ls”.
7.2.5. Test Services
The demonstration deployments can be found at Demonstration Services for OGC Testbed 17.
7.2.6. OGC API — Features Test Service
The test service is deployed at API-Features.
Landing Page
The implementation of the OGC API — Features Standard supports two media content types by default: application/json and text/html. Encoding negotiation is supported. For browsers, “text/html” would be the matching format. The following figure shows one example landing page.
Figure 10 — Landing page of the demonstration OGC API — Features server
Testing API using SwaggerUI: The SwaggerUI is enabled for users to test each API path. This is the response of an API document in text/html.
Features in a collection: The response for collection data supports encodings in both application/geo+json and text/html. The text/html shows a linked map and table for browsing items by pages. The following shows one example of a collection in the browser.
Figure 11 — Collection item page of the demonstration OGC API — Features server
7.2.7. Test Service for OGC API — EDR
The test service is deployed at API-EDR.
Landing Page
The implementation of the API — EDR supports two formats by default: application/json and text/html. Encoding negotiation is supported. For browsers, “text/html” would be the matching format. The following figure shows one example landing page.
Figure 12 — Landing page of the demonstration OGC API — EDR server
Testing API using SwaggerUI: The SwaggerUI is enabled for users to test each API path. This is the response of an API document in text/html.
Data query in a collection: Several data queries are supported for each collection in the API-EDR service. The following figure shows one example collection which supports queries of position, radius, area, and location.
Figure 13 — The Collections page of the demonstration OGC API — EDR server
The response for collection data supports encodings in both application/geo+json and text/html. The text/html shows a linked map and table for browsing features by pages. The following shows one example of a collection in the browser. The request is https://aws4ogc17.webmapengine.com/edr/collections/noaa_global_hourly_surface/items?f=text%2Fhtml&datetime=1972-07-25T00:00:00.000Z/1972-07-25T23:59:59.000Z
Figure 14 — Collection item page of the demonstration OGC API — EDR server
7.2.8. Extensions to the Library
The implementation of the library for both API-Features and API-Environmental Data Retrieval supports extension through the following mechanisms: One is the extension to support different back-end data sources and another is the extension to support different response formats.
7.2.8.1. Add a new backend data source
The Model extension manages the backend data models. The Data provider extension needs to support query interfaces as defined. An example provider can be seen in PostGIS.js.
7.2.8.2. Support new format
The Writer extension manages the supported formats. Each writer needs to support both collection items and individual features. TextHtml.js is an example writer.
7.2.9. Lessons learned
The following cover discussions about the implementation of API -Features and API — EDR services using JavaScript.
-
Usability of OpenAPI-Generator: The stubs generated by the OpenAPI-Generator provide a solid base for implementing the service. Minor modifications are necessary to work with specific data types.
-
Limitations of supporting libraries in pure JavaScript: There are fewer libraries available in pure JavaScript. Many require a system call to libraries from modules or programs implemented in other languages, such as C/C++ and Python. This increases the complexity in deployment and running environment.
-
Working with different backend data sources: Different backend data have different levels of support on realizing the interface functions. File-based dataset may require specific programs to support the spatiotemporal queries. Databases with spatial modules have full spatiotemporal query support with proper indexing. Legacy services, such as old versions of WFS, have various capabilities on supporting queries depending on their implementations.
-
Extensibility of API libraries: Response format supports media type negotiation. Backend services are pluggables with different modules. Extensions are supported in two aspects — backend data source provider and output data.
-
Performances: Certain profiled query-functions of API-EDR may not be easy to achieve ideal response time when a large dataset is in process. In the AWS cloud, one approach may be to set up a high-performance data service to serve data. For example, AWS RDS database service can be used to serve data. The PostGIS module can be configured to connect with AWS RDS PostgreSQL/PostGIS. The capability of handling big data can be achieved through a similar cloud-based database as the service.
7.3. D167 Data Backend and Deployment
The D167 Data Backend and Deployment consists of three different components:
-
A tiles server generator, generating the tiles from OpenStreetMap data and publishing them through an Apache Web Server
-
An OpenStreetMap dataset generator, storing the data in PostGIS datasets (in AWS, CloudFerro or CloudSigma) or as cloud objects in AWS S3 buckets
-
A system to publish array-oriented datasets in a tiled manner, through a STAC catalog and storing the resulting NetCDF files in AWS S3 buckets
Figure 15 — Component diagram
The focus for all the components was in automation and ease of execution, following a DevOps approach and lowering the learning curve for new developers to deploy advanced environments. For example, the script for the tiles server generator covers the following actions:
-
Installs/compiles all the software dependencies.
-
Installs a PostgreSQL/PostGIS database and sets it up.
-
Downloads and configures a stylesheet.
-
Downloads OSM data for the specific country and inserts it into the database.
-
Downloads additional shapefiles and fonts.
-
Sets up Apache HTTP Server with rendered.
-
Launches Apache Server.
The process for deploying a tiles server using a Docker container was also tested and documented. In addition, a simple HTML/JavaScript client to test the tiles server was developed and added to the repository.
The idea of tiling NetCDF files was to experiment with solutions for data providers to offer array oriented datasets to service providers that can be consumed on demand or to select the data of interest based on the bounding boxes extension. As the concept of tiling is well understood in the geospatial world, instead of publishing a large NetCDF covering the whole world, the content can be tiled by levels, chopping the content into smaller chunks that are more easily transferable through the network. These smaller NetCDF files can be published through a STAC catalog that documents the extent of each file, allowing service providers to automate the search for the data of interest.
Figure 16 — NetCDF tiling concept
The source code and detailed documentation on how to deploy these components are available at https://github.com/opengeospatial/T17-API-D167. The datasets are deployed in the following endpoints:
Database Access
-
AWS:
-
IP: 18.189.112.137
-
Port: 5432
-
User: apipostgres
-
Pwd: @P$_2021!
-
Database: dbapi4
-
Tables: geojson_places, geojson_waterways, geojson_polygons
-
-
CloudSigma:
-
IP: 162.213.36.126
-
Port: 5432
-
User: dbasigma
-
Pwd: Db@_S$gma2021!
-
Database: gis
-
Tables: geojson_places, geojson_waterways, geojson_polygons
-
-
CloudFerro:
-
IP: 185.178.85.176
-
Port: 5432
-
User: apipostgres
-
Pwd: Db@_F$rro2021!
-
Database: dbferro01
-
Tables: geojson_places, geojson_waterways, geojson_polygons
-
Example: psql -W -U apipostgres -p 5432 -h 185.178.85.176 dbferro01
-
AWS S3 buckets
-
Polygons (2883 elements):
-
Points (1776 elements):
Tiles server
-
Area: Northeast Spain
7.3.1. Challenges and lessons learned
The scripting and automation approach used in completing this deliverable was to deploy datasets and services in the cloud. This approach provides numerous benefits such as:
-
The deployed datasets and their environments are similar, easing the maintenance effort.
-
Human errors are minimized.
-
The time required for deployment is greatly reduced.
-
Deep technical knowledge is not needed, lowering the learning curve and allowing for new developers to quickly become productive.
However, developing these scripts required considerably more effort than a single deployment and they also require more technical knowledge. Scripts need to be structured and be reusable, as well as handle errors or generic configurations.
Besides, configuring cloud services and storage needs to be carefully set up and tested, because an error or untidiness in a deployment might result in security risks or unnecessary high costs. These risks are multiplied when developing a script that will be reused, especially if users are different and have a lower technical knowledge than the original developer.
The databases were deployed in three different clouds: AWS, CloudSigma and CloudFerro. There were not substantial differences among them from the functional point of view. AWS seemed more usable but eventually all three offered a similar set of features.
The extraction and handling of OpenStreetMap data was more troublesome than initially expected, with complex SQL queries required to extract specific datasets, including coordinate reference system transformation. These queries are not an important challenge for a seasoned geospatial developer, but they might be a major blocker for newcomers.
The NetCDF tiling concept resulted in a relatively easy-to-understand concept and its implementation was simple enough. However, testing with 0 tiling levels, it required 10 times more storage space as the same data needed to be packed for all the levels, and each file had considerable overhead. Moreover, the structure of the STAC catalog became more complex. Instead of the original NetCDF file being an item in a STAC catalog, the tiling structure required a sub-catalog containing a collection for each tiling level which then would publish each tiled file as an item.
During the TIEs, the server API implementations were deployed in AWS following the instructions documented in the repositories. The technical level required for these deployments was adequate for an experienced IT professional, but it was too high for newcomers. As part of future work, documentation in the repositories should be improved to ease access for less experienced professionals.
The experience with Docker when deploying D165 and D166 server APIs was positive: The process was simple and easily encapsulated. However, in real-life environments the risks of over-architecting should be considered among different alternatives before making a decision on how to deploy a server API.
7.4. D168 Data Backend and Deployment
The aim of D168 was to be the provider of data and deploy a server component as developed, documented and delivered: Communicating all experience and lessons learned back to the server provider to help with optimizing documentation and scripts.
The testing involved storing the data in specific cloud environments, primarily AWS with some parallel testing of CreoDIAS, with a specific focus on cloud object storage. The API Experiments Server tested was D165.
7.4.1. Development & Experiments
7.4.1.1. Experiment 1: Data catalogs stored within AWS S3 bucket & OpenSearch
Pixalytics has an Earth Observation (EO) dataset that is currently stored in a directory/file structure on an AWS Expandable File Storage (EFS).
Copernicus Sentinel-1 and Sentinel-2 data were used to generate a land cover classification at 10-meter resolution to investigate sand dams as part of a project that focuses on sustainable sand extraction in Kenya. In addition, the datasets include other products derived from Sentinel-2 (vegetation related) and products derived from other satellite missions at coarser spatial resolutions.The dataset can be viewed online at https://voila.notebook.eo4sas.com/ as CaseStudy1 (one of three Jupyter Notebooks running as interactive webpages).
Within the framework of this activity, the dataset was transitioned to an accessible S3 bucket with the catalogue defined through the two options of STAC (https://stacspec.org/) and OGC API — Records. The image files were originally in GeoTIFF format but transitioned to Cloud Optimized GeoTIFF (COG) and NetCDF for the experiments.
This experiment involved five steps, with the code stored as shown in Figure 1:
-
Convert GeoTIFF files to COG and NetCDF format using convert_gtiff.py in utils folder
-
Create OGC Record or STACcatalogs using create_catalog.py in build_catalog folder of D168
-
Manual push of catalogs to AWS S3 buckets
-
Push catalogs to Elasticsearch using upload_esearch.py in deploy_catalog folder of D168
-
Deployment of D165 server providing API access to the catalogs with the specification in JSON files
Figure 17 — Representation of the D168 GitHub repository with the code folders (light blue) and code components (dark blue)
7.4.1.1.1. Catalogs within AWS S3 bucket
The catalogs were created using Python code shared via the T17-API-D168 GitHub repository, under the build_catalog subfolder. The code uses yaml files (test-configuration.yaml and eo4sas-record.yml) to store configuration information, and rely on existing open-source code for STAC (pystac) and OGC API — Records (modified version of pygeometa to create the main catalog file) catalog creation. The GitHub ReadMe describes how to install the used anaconda environment and run the catalog creatin code.
Once created, the static catalogs are manually uploaded to the specifically setup S3 bucket that has public read access.
Examples of what is currently available via the S3 bucket include:
-
STAC catalog v0-8 created using pystac for GeoTiFFs:
-
STAC catalog v0-8 created using pystac for NetCDFs:
-
OGC API — Records catalog v0-8 created using pygeometa for GeoTiFFs:
-
OGC API — Records catalog v0-8 created using pygeometa for NetCDFs:
7.4.1.1.2. Catalogs deployed via Amazon OpenSearch (previously called Elasticsearch) Service with D165 Server
To support querying the catalogs, they were also deployed to Elasticsearch using Python code shared via the T17-API-D168 GitHub repository, under the deploy_catalog subfolder. The code uses a YAML file (es_upload_conf.yaml) for configuration of the deployment and then upload to Elasticsearch.
At the start of the Testbed 17 activity, OpenSearch was called Elasticsearch but since then the two have diverged. To maintain compatibility with what was developed the AWS deployment to Elasticsearch 7.10 was used. Also, version 7.13.4 of the Python Elasticsearch library needs to be used as more recent versions produce critical error messages.
-
Elasticsearch endpoint on AWS [needs an IAM user account with authentication for access]: https://search-ogc-t17-d168-yhvlgzft2zhuvdssiaejkyq5lq.eu-west-2.es.amazonaws.com
-
STAC Catalog indices (for version 0-8): STAC-index (GeoTIFFs) STAC-index-nc (NetCDFs)
-
OGC Records indices (for version 0-8): records-index (GeoTIFFs) records-index-nc (NetCDFs)
-
7.4.1.1.3. Deployment using D165 server
The deployment as an API using the D165 server is available at:
-
OGC API — Features server with three catalogs (CubeWerx’s alongside Elasticsearch versions of Records and STAC GeoTiFF catalogs): http://ogcapi.pixalytics.com:8080/
-
OGI API — EDR implementation with a single multi time-step NetCDF: http://ogcapiedr.pixalytics.com:8080/
7.4.1.2. Experiment 2: Catalogs created and deployed using the CloudFerro CreoDIAS platform
As a second test, the D168 code was also run on the CreoDIAS platform. The data was still stored in the AWS S3 bucket, but pulled to CreoDIAS and the catalogs generated and deployed using the D165 server. These deployments have not been kept active so cannot be accessed.
7.4.2. Lessons learned
A summary of lessons learned is:
-
Existing GitHub repositories for STAC (pystac) and OGC API Records (pygeometa) were helpful to both understand the catalog structure and implement the Python code to generate the catalogs.
-
To create the Records catalog, API weekly discussions alongside the documentation in ogcapi-records repository were very helpful.
-
For AWS, a combination of the online documentation and virtual training/events was used to get the necessary knowledge to setup the required IAM permissions and Amazon OpenSearch configuration.
-
For CreoDIAS, an introductory webinar was attended and then the Virtual Machine setup following the provided guidance and what was already understood from using AWS.
-
At the start of the activity Pixalytics had not deployed catalogs. From what was learnt during the API Experiments, the participants feel confident about future deployment of catalogs using the approaches tested.
7.5. D175 API Experiments Client — Python
The D175 API Experiments Client — Python implements three different OGC APIs, each available in its own GitHub repository:
-
OGC API — Features, available at https://github.com/opengeospatial/T17-API-D175-Features
-
OGC API — EDR, available at https://github.com/opengeospatial/T17-API-D175-EDR
-
OGC API — Routes, available at https://github.com/opengeospatial/T17-API-D175-Routes
All of the implemented APIs share a common structure and base code, as well as an automated and encapsulated deployment process that pulls the code from the respective repository. These implementations were conceived as data-centric clients, focusing the efforts in making them easy and quick to deploy and providing immediate access to the data.
Experiments were carried out in the auto-generation of client libraries for OGC API — Features and — EDR, based on the definition of a specific server API. The software used to auto generate the client libraries was OpenAPI Generator, in particular the python client (experimental). One of these experiments, auto generated against CubeWerx’s Daraa API, has been registered in its own repository and is available for deployment and tests at https://github.com/opengeospatial/T17-API-D175-Features_autogenerated.
Each client implementation consists of two different components:
-
A backend, built in Python 3 on top of Flask, a popular micro web framework. The backend implements the OGC API client calls to interact with the API server and fetch the data.
-
A frontend, built in HTML/JavaScript, receiving the interactions from the user through a web browser and translating those to the client backend, as well as visually displaying the data fetched from the server.
Figure 18 — Client architecture
This two-layered architecture enables separating the browser-client interface from the client-server interface, providing a greater flexibility in the design of the user interface. Clients can be deployed locally, but also remotely, at a different location than the API server or the user’s browser.
Clients use two common design patterns with this framework: The application dispatching pattern and the application as a package. This allows for a clean directory structure and readable code, making the different clients easy to maintain and deploy. Each client is launched in its own Python virtual environment, installing its own version of the required libraries without messing with the existing installation in the operative system.
Once the client is launched, it automatically reaches the server landing page and fetches all the important information, which is offered to the user via buttons, texts and drop-down menus. The goal is for the user to explore the API capabilities in a graphical and convenient way.
The User Interface for all clients have similar designs, with a main map to display results, a list of buttons to easily explore the API and a series of buttons and fields to make the most common queries. This user interface is stored in HTML, CSS and JavaScript files with a very simplistic design, but there is no limit as to the level of sophistication they can attain.
7.5.1. Challenges and lessons learned
7.5.1.1. Client auto-generation
Although in theory code auto-generation seems like a very interesting idea, the reality is that it is an error-prone process that requires a well-defined API and might not be worth the effort. This is particularly true considering that implementing a REST API client is not particularly difficult. The main challenges encountered in the client auto-generation can be summarized in the following points:
-
The OpenAPI Generator is not really stable. The latest versions are too buggy and can crash in many ways. For our case, OpenAPI Generator v4.0.0 was considered stable enough and could be useful but it was yet not perfect. There were still some minor bugs, which could be mitigated by editing the API definition.
-
Server API definition must comply with OpenAPI v3. The generator is particularly picky with numeric data types.
-
OpenAPI Generator has extensive parameterization and it requires a lot of documentation reading before being able to use it.
-
Generated libraries can be used only with the target API.
-
Changes in the target API can easily break the compatibility with auto generated clients.
-
Reality check: OpenAPI Generator is a good idea in theory but in reality, there are plenty of APIs with incomplete definitions, typos in OpenAPI docs, inconsistent definitions or published data are inconsistent with OpenAPI definitions.
On the other hand, client auto-generation did provide important benefits:
-
It is an API-oriented approach, the focus shifts away from code.
-
It moves towards standardization of the API structure and methods across programming languages.
-
It is easy to expand to other OGC APIs.
-
It is easy to maintain/update existing APIs.
-
Documentation and unit tests can be auto generated along with the source code.
7.5.1.2. Data-centric clients
A second important set of lessons learned came from the experiments with data-centric clients. Such clients are more focused on quick access to the data than on the feature completion.
The main challenges in the development of these clients were to set up an installation in a properly isolated environment, as well as the implementation of the invisible automation that allows the immediate visualization of data. The use of standard OGC API building blocks is instrumental in solving the latter.
However, there was an aspect that showed some limitations in the current definition of OGC API standards. The approach was for the client to automatically connect to a server API, explore it and populate the user interface with all the options and capabilities offered through the API. Having a standard and easy way to request from the server API landing page which methods and encodings are accepted before making the first request would be very useful. This could be done using the HTTP method OPTIONS and could facilitate the automation of API exploration.
When carrying out the integration experiments, some approaches to implementing the server APIs added extra complexity in the client side. For example, although type is not a property strictly required for link objects, it saves considerable implementation time when developing a client that automates the API exploration. A significant amount of effort had to be spent in adding the code necessary to handle link objects that did not include a type property that could lead to requests for non-GeoJSON formats. An interesting improvement would be to require the property type in all link objects, or at least to recommend its use in the standard documentation.
Alternatively, the client could have added the format in the request with the parameter f=json to ensure the format of the response. However, there is no specification for the value of this parameter in the standard and, in fact, other participants used different values in their server implementations, such as f=application/json and f=application/geo+json. The proper way would be to access the API definition, if it exists, and find the possible values for parameter f, but the automation for this is cumbersome.
Overall, data-centric clients have proven an interesting approach, providing the following benefits:
-
All deployed environments are similar, even for different types of APIs and clients, which facilitates maintenance.
-
Human error factors are minimized during the environment configuration and deployment.
-
The time required to complete a deployment is minimal, which makes it useful for test environments or pilots.
-
Users do not need advanced technical knowledge to set up the environment or deploy a client, lowering the entry barriers, and facilitating the entry-level for new developers to OGC APIs.
-
Clients are light-weight and performant and allow for an automated exploration of APIs, which can become a handy tool for experienced OGC server developers and deployers to test their deployments or visually explore the data they publish.
7.6. D176 API Experiments Client — TypeScript
The TypeScript API Experiment client consists of three main components:
-
The API clients that are generated based on the corresponding OpenAPI specifications. (https://github.com/opengeospatial/T17-API-D176)
-
A connector for Web WorldWind that ties in the API responses into visual layers on WorldWind. (https://github.com/karriojala/WebWorldWind-OGCTB17)
-
A demonstrator that shows the API client’s capabilities in a real-world use case integrating with WorldWind, which bridges to plain JavaScript usage. (https://github.com/opengeospatial/T17-API-D176-dev)
The demonstrator is available to try here: https://ogctb17-apis-breithorn.solenix.ch/#/earth
Figure 19 — Demonstrator application using NASA WebWorldWind
The most interesting aspects about this experiment are:
-
Execution in a Browser
Browsers exhibit particular challenges for the execution and communication with remote services as they are executed in the restricted JavaScript sandbox in a browser. Considerations such as HTTP vs. HTTPS access, Cross-Origin Resource Sharing (CORS) and limitations in processing of data are to be considered. -
Exploration of the API Client in a real world scenario, using Web WorldWind.
-
TypeScript with strong typing that significantly improves the available tooling for developers.
TypeScript is suitable for NodeJS environments and browsers, which spans a wide range of use cases.
The demonstration and discussed use cases focus on front-end development and visualizations in browsers, and less on complex processing or workflows of the results. Consequently, the developed use cases demonstrate the integration and common data request patterns for such web-based applications.
In addition to the OGC APIs Client source code, the documentation accompanying the source code provides examples for common use cases and explains the utility of specific end points and collections.
An important aspect of getting started with the OGC APIs is to get the initial bearings and identify where to find specific data, to learn the names and concepts used, and to learn how to explore available data sets. The documentation touches on these points as well, guiding a new user in getting started with the OGC APIs, data providers and data sets alike.
7.6.1. API Clients and Code Generation
This component focused on exploration and experimentation with OGC API — Features and OGC API — EDR.
Following the findings for generating code based on OpenAPI specifications for OGC APIs and considering the goal of creating a generic client that follows the standards as closely as possible, having an independent client API for each of the supported OGC API Standards was the final decision.
The back-end used was the typescript-angular generator, which a strongly typed client library that uses Angular’s HttpClient and various reactive patterns in the backend. This backend was chosen to demonstrate the generic use of code generators and as excellent fit for the angular based demonstrator.
The Web WorldWind API is plain JavaScript with basic (require.js based) dependency management and no awareness of TypeScript or Angular. The extension to WorldWind was thus made in a way that allows consuming resources retrieved via the APIs, independent of the specific client. Consequently, a developer could also generate a pure JavaScript or non-Angular TypeScript client that would be able to provide those same responses, which could be rendered appropriately by WorldWind.
Finally, the OpenAPI specification provided by the Daraa-data server that is hosted by CubeWerx was used to exercise a compound API client, which can be found as api-daraa in the API clients repository.
7.6.2. OGC Code Sprint and Discrete Global Grid Systems
During October 2021 OGC API Code Sprint the client was extended with a demonstration for retrieving and visualizing DGGS (Discrete Global Grid Systems) zones. For this purpose, auto-generated TypeScript client was used in the same way as with OGC API — EDR and OGC API — Features.