OGC Engineering Report

OGC Testbed 20 GeoDataCube (GDC) API Profile Report

Jonas Eberle, Editor
Document number: 24-035
Document type: OGC Engineering Report
Document subtype:
Document stage: Published
Document language: English

License Agreement

Use of this document is subject to the license agreement at https://www.ogc.org/license



I.  Keywords

The following are keywords to be used by search engines and document catalogues.

ogc, testbed, geodatacube, api, web service, OpenEO, Processes, Coverage, earth observation, weather, profile

II.  Contributors

All questions regarding this document should be directed to the editor or the contributors:

Table — Contributors

Name | Organization | Role
Jonas Eberle | German Aerospace Center | Editor
Matthias Mohr | Matthias Mohr — Softwareentwicklung | Contributor
Jérôme Jacovella-St-Louis | Ecere Corporation | Contributor
Patrick Dion | Ecere Corporation | Contributor
Michele Claus | Eurac Research | Contributor
Francis Charette Migneault | CRIM | Contributor

III.  Overview

The geospatial community has invested significant resources in defining and developing Geospatial Data Cubes (GDCs), demonstrating a commitment to creating an infrastructure that enables the organized storage and use of multidimensional geospatial data. While progress has been made, the state of the art falls short of fully interoperable GDCs capable of meeting specific organizational requirements. Establishing reference implementations to ensure GDCs are both interoperable and exploitable—particularly in Earth Observation (EO) contexts—remains a priority.

Based on the collective efforts of the OGC Testbed-19 GDC Activity and the GeoDataCube Standards Working Group (SWG), the definition of the draft OGC API — GeoDataCube Standard strives to unify the disparate threads of geospatial data cube technology. By integrating elements from the openEO API, OGC API — Processes, and the SpatioTemporal Asset Catalog (STAC), the GDC draft API presents a holistic approach to accessing, managing, and processing Earth Observation data.

A GeoDataCube API combines data access and analysis functionality. A GDC API implementation would support access to datasets in the form of data cubes with the basic capabilities of filtering, subsetting, and aggregating. Further advanced processing is enabled through the integration of existing processing API endpoints (e.g., OGC API – Processes Standard and the openEO API Specification).

This Testbed 20 Report focuses on profiling existing capabilities defined by other OGC-approved and draft standards, such as the OGC API — Processes Standard and OGC API — Coverages Standard, as well as community-driven standards such as the openEO API.

NOTE:  Hereafter in this document, a profile of a GeoDataCube API integrating support for one or more approved or candidate OGC Standards or Community Standards will be referred to as a GDC API.

IV.  Future Outlook

Vector Data Cubes and OGC API — EDR Conformance: While the current focus is on geospatial raster data as part of GDC, the integration of geospatial vector data into GDC remains an open area of investigation. Additionally, aligning a GDC API Standard with the OGC API — Environmental Data Retrieval (EDR) Standard will require deeper exploration to ensure seamless compatibility.

Expanding STAC API Utilization: Further evaluation of the SpatioTemporal Asset Catalog (STAC) API as a /collections endpoint is necessary. This will enable the handling of non-coverage data, such as geospatial data cubes containing vector data, broadening the applicability of the API for diverse geospatial datasets.

Standardized Processing and Workflow Execution: The establishment of well-known processes to support structured processing languages is crucial for interoperability. Identifying and standardizing the representation of GDCs as process inputs will facilitate more efficient and consistent data processing across different platforms.

Harmonization of Workflow Execution: A standardized API endpoint for submitting and executing workflows (e.g., openEO process graphs, Common Workflow Language [CWL]) is needed to unify execution across various workflow languages. Ongoing discussions in Testbed-20 lay the groundwork for achieving a harmonized approach to workflow orchestration.

By addressing these areas, GDC API profiles—such as those implemented in Testbed-20—can evolve into more robust and flexible solutions, supporting a wider range of geospatial data types and processing capabilities. Future efforts will focus on refining these aspects to enhance usability, scalability, and interoperability within the geospatial community.

V.  Value Proposition

Unified Data Access and Transformation

A GDC API is designed to enable seamless integration of data access and analysis by allowing users to interact with data in the form of data cubes, which can be filtered, subsetted, and reshaped to fit specific analytical needs. This combination optimizes workflows by eliminating redundant steps and ensuring efficient data handling.

The following key features show the value of a GDC API:

  1. Chaining Data Access and Reshaping:

    • Without a GDC API:

      • Collections retrieved from an implementation of the OGC API — Coverages Standard must be processed independently.

      • A separate download step is required between reshaping and analysis, increasing workflow complexity.

    • With a GDC API:

      • Collections from an implementation of the OGC API — Coverages Standard can be reshaped and analyzed in a single uninterrupted workflow, demonstrating significant efficiency improvements.

  2. Accessing Data from Multiple Infrastructures:

    • Utilize a common API definition to access data sets hosted on different infrastructures/platforms, enabling integration across diverse sources.

    • Example workflows:

      • Combine data from the Copernicus Data Space Ecosystem and the Climate Data Store in a single pipeline (though OGC API — Coverages and STAC collections are not yet interoperable).

      • Define workflows using collections from different backends that support the implementation of a GDC API, demonstrating full compatibility and streamlined data processing.

Enabling FAIR Data Principles for Derived Data Sets

Data products generated through GDC API enabled workflows are not only reusable but also adhere to the FAIR principles (Findable, Accessible, Interoperable, Reusable):

  • Chained Workflows:

    • Data sets produced from chained processes (e.g., reshaping, subsetting, filtering) can be accessed as new collections through an implementation instance of a GDC API.

    • These derived collections can be further integrated into new workflows, enabling continuous data transformation and analysis.

By leveraging a GDC API, users can achieve an efficient, traceable, and FAIR-compliant approach to data access, transformation, and analysis, fostering reproducibility and scalability across various domains and infrastructures.

1.  Introduction

Datacubes are multi-dimensional arrays that include additional information about their dimensionality. They offer a clean and organized interface for spatiotemporal data, as well as for the operations that may be performed on them. While arrays are similar to raster data, datacubes can also contain vector data.

In Testbed 20, the focus was on datacubes containing raster data. GeoDataCubes (GDCs) are a specific type of datacube that includes one or more spatial dimensions, such as x and y. In contrast to other OGC Standards such as OGC API — Processes, working with datacubes involves the description, discovery, access, and processing of multi-dimensional data.

In Testbed 19, the emphasis was on comparing existing standards for the discovery (e.g., OGC API — Records and STAC) and processing (e.g., openEO and OGC API — Processes) of GDCs. In contrast, Testbed 20 aimed to define the functionality provided by a GDC and explore how existing standards can meet those requirements.

This work included the following standardization tasks:

  • Define the functionality of a GeoDataCube;

  • Discuss the applicability of existing standards to provide these functions;

  • Define profiles of a GDC API, each providing a different set of functionality via a GDC API endpoint;

  • Discuss the harmonization of different processing standards, with the aim of a standardized API endpoint to execute both a process and an openEO process graph.

Although the definition of GeoDataCubes is an important step, it was not the goal of Testbed 20 to reach consensus agreement on each specific aspect and component of the term GeoDataCube. As there can be many variants of a GeoDataCube, such agreement was out of scope.

1.1.  Aims

The integration of GeoDataCubes into geospatial workflows is essential for advancing data-driven solutions across various domains. The Testbed 20 GDC Task focused on establishing a standardized approach to the use of GeoDataCubes by:

  • Defining key terminology and concepts relevant for the execution of Testbed 20.

  • Creating interoperability between openEO, CWL, and OGC API — Processes through a language-independent framework for GeoDataCube utilization.

  • Providing GDC API endpoints that offer data and functionality aligned with specific use cases.

  • Conducting Technology Integration Experiments (TIEs) to assess and validate the interoperability of GDC API endpoints.

These efforts will foster seamless data access, processing, and integration across heterogeneous platforms, enhancing the scalability and efficiency of geospatial workflows.

1.2.  Objectives

The primary objective of the GDC task was to develop a language-independent extension for the GeoDataCube (GDC) draft API, as defined in Testbed 19. This API combines components from approved and candidate OGC standards, enabling the creation of workflows that remain consistent across different workflow description languages. The task involved participants from various platforms integrating this extension using technologies such as openEO, CWL, and OGC API-Processes. The extension was designed to promote interoperability among different implementations and streamline the workflow creation process, drawing on existing frameworks established in both approved and candidate OGC standards, as well as community standards for GeoDataCubes.

2.  Topics

2.1.  Definition of GeoDataCube

GeoDataCubes (GDCs) are a special case of datacubes in that they have one or multiple spatial dimensions, e.g., x and y. GeoDataCubes for raster data often consist of the dimensions x, y, time, and bands. Sometimes they also have multiple temporal dimensions. GeoDataCubes for vector data often consist of geometries, time, and a variable (e.g., temperature). Generally, datacubes can consist of any combination of dimensions — the dimensions are unrestricted. The spatial dimensions of a GeoDataCube may be removed during processing.

The following additional information is usually available for datacubes:

  • the dimensions (see below)

  • a grid cell type / sampling method (area or point)

  • a unit for the values

This additional information could be provided upfront via metadata.

2.1.1.  Dimensions

A dimension refers to a certain axis of a datacube. This includes all variables (e.g. bands), which are represented as dimensions. An example raster datacube could have the spatial dimensions x and y, and the temporal dimension t. Furthermore, it could have a bands dimension, extending into the realm of what kind of information is contained in the cube.

The following properties are usually available for dimensions:

  • A name.

  • A type. Potential types include spatial (raster or vector data), temporal, and other variables such as bands.

  • Labels, usually exposed in the metadata as nominal values and/or extents, through textual or numerical representations.

  • A reference system and/or unit of measure for the labels. A unit may implicitly be defined through the reference system.

  • A resolution / step size.

  • Other information specific to the dimension type such as the geometry types for a dimension containing geometries.

Specific implementations of datacubes may prescribe details such as sorting orders or representations of labels. For example, some implementations may always sort temporal labels in their inherent order and encode them in an ISO8601 compliant way.

Datacubes contain scalar values (e.g., strings, numbers, or Boolean values), with all other associated attributes stored in dimensions (e.g., coordinates or timestamps). Attributes such as the Coordinate Reference System (CRS) or the sensor can also be turned into dimensions. Be advised that in such a case, the uniqueness of pixel coordinates may be affected: while (x, y) usually refers to a unique location, this changes to (x, y, CRS) when (x, y) values are reused in other coordinate reference systems (e.g., two neighboring UTM zones).

2.1.2.  Common operations

Below are some operations that are commonly applied to datacubes:

  • subset — Restrict the extent of dimensions, such as remove all temporal information not in the year 2021

  • apply (map) — Compute values from operations on single values, e.g. multiply all values by 10

  • reduce — Reduce a dimension by computing a single value for all values along a dimension, such as compute the maximum value along the temporal dimension

  • resample — The layout of a certain dimension is changed into another layout, most likely also changing the resolution of that dimension, such as downscaling from daily to monthly values

Every operation that returns the complete datacube or a subset of it is datacube access.

Every operation that computes new values is datacube processing.
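
As a hedged illustration of these four operations, the following sketch applies them to a small in-memory cube using the Python xarray library (the dimension names and random data are assumptions for illustration only, not tied to any GDC API implementation):

import numpy as np
import pandas as pd
import xarray as xr

# An illustrative datacube with dimensions time, y, x
cube = xr.DataArray(
    np.random.rand(365, 100, 100),
    coords={"time": pd.date_range("2021-01-01", periods=365),
            "y": np.arange(100), "x": np.arange(100)},
    dims=("time", "y", "x"))

subset = cube.sel(time=slice("2021-06-01", "2021-06-30"))  # subset: restrict the temporal extent
applied = cube * 10                                        # apply (map): operate on every single value
reduced = cube.max(dim="time")                             # reduce: collapse the temporal dimension
resampled = cube.resample(time="MS").mean()                # resample: change daily to monthly layout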

2.2.  Capabilities

The main objective of implementing APIs for GeoDataCubes is to make the handling of multi-dimensional data easier. This includes the following capabilities:

  • Description of data cubes (e.g., available dimensions).

  • Discovery of data within a data cube (e.g., filter by specific parameters).

  • Accessing raw data contained in a data cube (e.g., static or dynamic provision of data, server-side subsetting).

  • Processing of data cubes (e.g., changing values of data within a data cube).

  • Visualization of data cubes.

For all those capabilities there are existing standards and specifications, which might be reused by an implementation of a GDC API.

2.3.  Applicable OGC standards

There are multiple approved or draft OGC Standards, including Community Standards, that define such standardized interfaces.

Each standard or specification covers different aspects of GeoDataCube capabilities as described above:

  • Description and discovery of GeoDataCubes (collections / coverages).

  • Discovery of components of a single GeoDataCube.

  • Facilitating access to data with server-side computation (custom client requests, where the server merges different pieces and returns data at the requested resolution).

  • Performing server-side computations (beyond accessing part / resampling of the data).

  • Encoding of data.

There is some overlap in terms of addressing these capabilities between multiple standards.

Data description and discovery

Implementation of the following API standards provide data description and data discovery capabilities:

  • OGC API — Common — Part 2: Geospatial Data (Collections and Uniform Multidimensional Collections)

  • OGC API — Records

  • SpatioTemporal Asset Catalog (STAC)

  • OGC API — Coverages — Part 3

Note: OGC API — Coverages and OGC API — EDR also provide this functionality by depending on OGC API — Common — Part 2.

Implementations of OGC API — Common — Part 2, the STAC API and OGC API — Records “Searchable Catalog Deployment” all use the /collections endpoint to provide and describe a list of available collections.

Implementations of OGC API — Coverages — Part 3: Scenes describe individual scenes within a single collection at /collections/{collectionId}/scenes.

Data access (partial pieces based on server-side computations)

Implementations of the following API standards provide server-side data access capabilities:

  • OGC API — Coverages (e.g., /collections/{collectionId}/coverage)

  • OGC API — EDR (e.g., /collections/{collectionId}/position)

Implementations of both standards provide endpoints that support subsetting query parameters and merge data as needed when the backing data store consists of separate components. While the OGC API — Coverages — Part 1 standard defines a requirement class for requesting data at a specific resolution (resampling), the OGC API — EDR — Part 1 standard does not offer this functionality.

Although data access can also be implemented as processes within the openEO or OGC API — Processes definitions, these implementations do not provide a dedicated endpoint for data access. Additionally, data access can be achieved by referencing static files over HTTPS (e.g., an HTTPS URL pointing to a file hosted in a cloud-optimized data format).

Data processing (server-side computations)

Implementations of the following API standards provide server-side data processing capabilities:

  • OGC API — Processes — Part 1: Core (+ upcoming parts) (e.g., /processes/{processId})

  • OGC Web Coverage Processing Service (WCPS) (e.g., request=ProcessCoverages)

  • openEO (e.g., /result or /jobs)

  • OGC API — Coverages — Part 2: Filtering, Deriving and Aggregating fields (e.g., /collections/{collectionId}/coverage with query parameters specifying OGC Common Query Language (CQL2) expressions)

Implementations of these standards provide endpoints to start a previously defined process. With WCPS and openEO, the processing can be defined by the user as part of the request. With OGC API — Processes, only pre-defined processes (e.g., the calculation of a specific algorithm) can be executed by users with their individual input parameters.

Data visualization

Implementations of the following API standards provide data visualization capabilities:

  • OGC API — Maps

  • OGC API — Tiles (using map tiles)

Data visualization was not part of the Testbed 20 activity, thus no discussion took place. However, some example requests and responses can be seen in the implementation annexes for server-side visualization of the Vegetation Health Index (VHI) workflow.

Data formats

The following formats were used in this initiative.

  • OGC GeoTIFF (including Cloud Optimized GeoTIFF)

  • OGC netCDF

  • Zarr Storage Specification (Community Standard)

The analysis of data formats was not a focus of the Testbed 20 activity, but some basic capabilities and limitations of these encodings are described below to paint a more complete picture of the GDC ecosystem.

An advantage of Cloud Optimized GeoTIFF is support for overviews and tiles, allowing clients to perform HTTP range requests to retrieve only the portions of interest. A disadvantage of GeoTIFF is that, as of version 1.0, it is primarily limited to two-dimensional data and does not define metadata for describing fields (bands). While Zarr has chunks providing functionality similar to tiles, extensions for overviews were not yet finalized at the time of writing this report (e.g., https://github.com/zarr-developers/zarr-specs/issues/125).
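
As a hedged sketch of such a partial read (the file URL is hypothetical), an HTTP range request retrieves only the first bytes of a Cloud Optimized GeoTIFF, where the header and tile/overview index typically reside:

import requests

url = "https://example.com/data/cog.tif"  # hypothetical COG location
# Request only the first 16 KiB instead of the full file
resp = requests.get(url, headers={"Range": "bytes=0-16383"})
print(resp.status_code)  # 206 Partial Content if the server supports range requests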

2.3.1.  Standards defining capabilities which can be integrated within a GDC API

As demonstrated in the previous section, there is some overlap and interoperability among the various standards when it comes to addressing these capabilities. The table below aims to illustrate and clarify this overlap by highlighting the specific capabilities defined within each standard, as well as how these standards can be integrated within a GDC API implementation.

  • A cell value in bold text means that this capability is a key feature defined within the standard itself (see note below);

  • A cell value of N/A means that this capability is not provided by this particular standard, but may be achieved in combination with other standards as indicated.

IMPORTANT

This table is not intended as a comparison of the value of different standards, but as a quick overview of the GDC standards landscape, making it easy to look up which capabilities are provided by which standards, as well as how all of these standards can be integrated together within a GDC API implementation. The fact that a particular standard itself specifies a particular capability (indicated in bold) is not to be interpreted as an advantage of that standard compared to leveraging that capability from another standard.

Table 1 — Comparison of existing standards for GDC APIs

Standard | GeoDataCube Discovery & Description | Data Pieces Discovery | Single request to retrieve data for specific area and resolution | Server-side computations | Combinations
OGC WCS with scaling and range subsetting extensions | Yes (discovery limited to listing all coverages) | No | Yes | N/A | OGC WCPS
OGC API — Common — Part 2: Geospatial Data, Part 3?: Schemas, Part 4?: Discovery in numerous collections | Yes | N/A | N/A | N/A | Basis for most OGC APIs, Records — Local (Collection) Resource Catalog
OGC API — Records | Yes (description not datacube-specific) | Yes | N/A | N/A | Links to files and services
STAC | Datacube extension | Yes | N/A | N/A | Links to files and services
OGC API — Coverages | Common | Part 3: Scenes | Yes | Part 2 w/ CQL2 (1), Processes | CQL2, Processes, STAC items (for scenes)
OGC API — EDR | Common | N/A | Subsetting (no resizing in Part 1) | N/A | Processes
OGC API — Tiles | Common | 2DTMS | N/A | CQL2 extension, w/ Processes | Processes, Coverages, CQL2
OGC API — DGGS | Common | DGGRS | Resampling with zone-depth for single DGGRS zone | Zone Queries, CQL2 extension, w/ Processes | CQL2, Processes
OGC API — Processes — Part 1: Core, Part 2: Deploy, Replace, Undeploy, Part 3: Workflows | N/A | N/A | Using custom processes | Yes (2) | CQL2, CWL, WCPS, STAC, Coverages, EDR, Tiles, DGGS, Maps
openEO | STAC | STAC | Using pre-defined processes | Yes | STAC, Tiles, Maps, Processes, CWL, Coverages
1 CQL2 defines arithmetic and relational operators/functions, while OGC API — Coverages — Part 2 will define additional functions for aggregation.
2 There is a need for defining well-known processes (similar to openEO processes), and declaring that a specific OGC API — Processes process implements a particular well-known process. An OGC Naming Authority register could be used for this purpose, with a URI property of the process description which could point to a well-known process. See also related discussion for well-known (CQL2) functions.

2.4.  Discussion on profiles of a GDC API

Based on the outcomes of the Testbed 19 GeoDataCube task, profiles for a GDC API needed to be discussed and defined in Testbed 20. In this context, a profile is a defined set of minimal capabilities that a system must implement to conform to specific functionality. A profile defines a structured way to integrate OGC (and other) standards without redefining data retrieval methods.

The following profiles were discussed for the main GDC capabilities described in the previous section:

  • Core

  • Partial Access

  • Resampled Access

  • Data Processing with sub-profiles conforming to different workflow languages (e.g., openEO API, OGC API — Processes)

Figure 1

2.4.1.  GDC Core (Full Data Access)

Defines how GeoDataCubes are described through metadata and how they can be downloaded “as-is” (e.g. a netCDF formatted package downloaded from object storage).

The initial discussion by the Testbed GDC task participants was to follow OGC API — Coverages endpoints:

  • GET /

  • GET /conformance

  • GET /collections

  • GET /collections/{collectionId}

  • GET /collections/{collectionId}/schema

  • GET /collections/{collectionId}/coverage

However, the GDC API implementations in Testbed 20 differ in whether the /collections endpoint is provided via the STAC API or via OGC API — Coverages.

Note that in a static context the endpoints may be hosted at any URL path, but the responses must follow the same schemas (comparable with STAC and the STAC API). This allows the geospatial data to be hosted on any cloud storage infrastructure without the need for an API (comparable to static STAC catalogs).

A GDC instance is thus described through the collection metadata. The data contained in a GDC can be found through the link relation type http://www.opengis.net/def/rel/ogc/1.0/coverage (comparable with STAC assets).

An API implementation of the OGC API — Coverages Standard is always compliant with the requirements for this GDC Core profile.
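
As a minimal sketch (the server and collection identifier are hypothetical), a client can retrieve the collection metadata describing a GDC and follow the coverage link relation to the data:

import requests

api = "https://gdc.example.com/ogcapi"  # hypothetical GDC API deployment
collection = requests.get(f"{api}/collections/example-gdc",
                          headers={"Accept": "application/json"}).json()

# Locate the raw data via the coverage link relation type
coverage_links = [link["href"] for link in collection.get("links", [])
                  if link.get("rel") == "http://www.opengis.net/def/rel/ogc/1.0/coverage"]
print(coverage_links)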

2.4.2.  GDC Partial Access

  • Requires: GDC Core

This profile is a full implementation of the OGC API — Coverages — Part 1 Standard. Additional parts may be implemented (i.e. the landing page). A partial access capability also requires the following (an example request follows the list):

  • Domain subsetting: This is filtering for a given time/area of interest, or additional dimensions not considered part of the “range”. In other words the output of the coverage function for a given direct position in its domain — subset=, datetime=, bbox=.

  • Field Selection: This is filtering on values returned for a given position, such as requesting specific band(s) or climate variable(s) — properties=.
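
A partial access request combining both capabilities could look as follows (a hedged sketch; the server and collection name are hypothetical, the parameters are those defined by OGC API — Coverages — Part 1):

GET /collections/example-gdc/coverage?
   subset=Lat(45.42:46.86),Lon(13.36:16.52)&
   datetime=2020-07-01/2020-07-31&
   properties=B04,B08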

2.4.3.  GDC Resampled Access

  • Requires: GDC Partial Access

This profile adds scaling capabilities to the GDC (Down/Up sampling — scale-axes=, scale-factor=, scale-size=, width=, height=).
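
For example (a hedged sketch; the server and collection are hypothetical, the scaling parameters are those defined by OGC API — Coverages — Part 1), a request for a coverage downsampled by a factor of four on each spatial axis could be:

GET /collections/example-gdc/coverage?
   subset=Lat(45.42:46.86),Lon(13.36:16.52)&
   scale-factor=4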

2.4.4.  Authentication

  • Requires: GDC Core

The authentication profile is adopted as defined in openEO, so implementations should implement at least one of the following endpoints; clients then send the obtained tokens with subsequent requests to the actual endpoints (a sketch follows the list).

  • GET /credentials/oidc

  • GET /credentials/basic
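
A minimal sketch of the basic authentication flow as defined by the openEO API follows (the host is hypothetical; openEO defines the basic// prefix for Bearer tokens obtained this way):

GET /credentials/basic HTTP/1.1
Host: gdc.example.com
Authorization: Basic <base64(username:password)>

Response body: { "access_token": "<token>" }

GET /collections HTTP/1.1
Host: gdc.example.com
Authorization: Bearer basic//<token>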

2.4.5.  GDC Data Processing

  • Requires: GDC Core

The initial Testbed 20 discussion was to submit a workflow definition to a single endpoint for advanced data processing, such as POST /tasks. However, the implementations conducted by the Testbed 20 participants differ with respect to the processing standards used.

An implementation of this profile may process data in different modes: Synchronously, asynchronously, or on-demand (i.e. create a new GDC API deployment or virtual collection). Which mode to choose is specified through a mode query parameter, which can be either sync, async, collection, or api.

A workflow definition could be anything (for example, an openEO process graph or a CWL document); the body of the request is the workflow document.
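
As a hedged sketch of this initially discussed (but not uniformly adopted) endpoint, submitting an openEO process graph for synchronous execution might look as follows; the endpoint name and mode parameter reflect the discussion above, not a finalized specification:

POST /tasks?mode=sync HTTP/1.1
Content-Type: application/json

{
  "process_graph": { "...": "an openEO process graph, CWL document, or other workflow definition" }
}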

2.4.5.1.  GDC Data Processing via openEO

This profile requires an implementation of the openEO API conforming to the API profile L1 (minimal) and the Processes profile L1 (minimal). The implementation of an openEO API can be integrated within a GDC API implementation or be separate (see note A). Additional extensions may be implemented.

A link with relation type openeo points to the openEO instance (path of the well-known document without /.well-known/openeo, e.g. https://openeo.cloud or https://earthengine.openeo.org).

See https://openeo.org, https://api.openeo.org, and https://processes.openeo.org for details.

2.4.5.2.  GDC Data Processing via OGC API — Processes

This profile is an implementation of the OGC API — Processes — Part 1 Standard. An implementation of the OGC API — Processes Standard can be integrated within a GDC API implementation or be separate (see note A). Additional parts may be implemented (i.e. the landing page).

A link with relation type service points to an OGC API — Processes instance.

See https://docs.ogc.org/is/18-062r2/18-062r2.html for details.


Note A: At the time of the Testbed 20 initiative, there were still some potential conflicts combining OGC API — Processes — Part 1 and openEO under the same API tree within a single GDC API deployment. One opinion was that if implementors want to offer both openEO and OGC API — Processes, one of the APIs needs to be implemented separately.

The potential conflicts identified relate to the different responses to /processes, as well as the response for retrieving the list of running jobs at /jobs.

The /processes conflict could be addressed by the more specific JSON content profile negotiation being considered for OGC API — Processes version 2.0 (issue https://github.com/opengeospatial/ogcapi-processes/issues/481). The /jobs conflict is mitigated by each job in OGC API — Processes (but not openEO) including a type property identifying a particular type of asynchronous job for different APIs (such as OGC API — Coverages, OGC API — Processes, or openEO), as well as by the fact that clients could be restricted to only see the jobs they created themselves. Another participant's opinion was that, for interoperability, it is desirable and should be possible going forward to develop implementations able to deploy both capabilities at the same endpoint, making such endpoints compatible with both types of clients.

2.4.5.3.  GDC Data Processing via OGC API — Coverages — Part 2

An implementation of the proposed OGC API — Coverages — Part 2: Filtering, deriving and aggregating fields extends the OGC API — Coverages — Part 1: Core capabilities with server-side processing using expressions defined in the OGC Common Query Language (CQL2).

2.5.  Use Case: Vegetation Health Index

For the Testbed 20 GDC implementation and testing work, the Vegetation Health Index (VHI) was chosen as a use case. This datacube use case combines vegetation and temperature data.

The Vegetation Health Index implemented in the Alpine Drought Observatory (ADO) is used for detecting vegetation stress conditions, which arise when there is limited availability of soil moisture to plants. The VHI allows the identification of drought impacts on vegetation, which correspond to a combination of thermal stress, detected as an increase in Land Surface Temperature (LST), and a decrease in vegetation greenness, identified by lower-than-average values of the Normalized Difference Vegetation Index (NDVI). In the ADO project, VHI is computed on an 8-day basis and considers the reference period 2000–2020 for calculating extreme values of NDVI and LST.

Source: https://raw.githubusercontent.com/Eurac-Research/ado-data/main/factsheets/VHI_4.pdf

2.5.1.  Algorithm

The Vegetation Health Index (VHI) is a composite index used to detect vegetation stress, particularly agricultural drought. The VHI combines two sub-indices:

1. Vegetation Condition Index (VCI)

$VCI = 100 \cdot \frac{NDVI - NDVI_{min}}{NDVI_{max} - NDVI_{min}}$   (1)

where:

  • NDVI = Normalized Difference Vegetation Index (smoothed over a certain period)

  • NDVI_min and NDVI_max = Minimum and maximum NDVI values over the reference period.

2. Thermal Condition Index (TCI)

$TCI = 100 \cdot \frac{LST_{max} - LST}{LST_{max} - LST_{min}}$   (2)

where:

  • LST = Land Surface Temperature

  • LST_min and LST_max = Minimum and maximum LST values over the reference period.

VHI Formula

$VHI = \alpha \cdot VCI + (1 - \alpha) \cdot TCI$   (3)

where α is the weight assigned to VCI, typically set to 0.5 for an equal contribution from VCI and TCI.
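
The three equations translate directly into array arithmetic. The following minimal Python sketch (illustrative only; the inputs are assumed to be NumPy arrays prepared beforehand) computes the VHI:

import numpy as np

def vhi(ndvi, ndvi_min, ndvi_max, lst, lst_min, lst_max, alpha=0.5):
    """Compute the Vegetation Health Index from NDVI and LST arrays, following equations (1)-(3)."""
    vci = 100 * (ndvi - ndvi_min) / (ndvi_max - ndvi_min)  # Equation (1)
    tci = 100 * (lst_max - lst) / (lst_max - lst_min)      # Equation (2)
    return alpha * vci + (1 - alpha) * tci                 # Equation (3)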

2.5.2.  Interpretation of VHI Values

Table 2 — Interpretation of VHI values

VHI [%] | Drought Intensity
0–25 | Extreme drought
25–35 | Severe drought
35–42 | Mild drought
>42 | No drought
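
Applied in code, the thresholds of Table 2 map onto drought classes as in the following sketch (the boundary handling is an assumption, as the table does not specify which class the boundary values belong to):

def drought_intensity(vhi_percent: float) -> str:
    # Map a VHI value [%] to the drought intensity classes of Table 2
    if vhi_percent <= 25:
        return "Extreme drought"
    if vhi_percent <= 35:
        return "Severe drought"
    if vhi_percent <= 42:
        return "Mild drought"
    return "No drought"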

2.5.3.  Data

When using Sentinel-2 for NDVI in the Vegetation Health Index (VHI), the data comes from high-resolution multispectral imagery, offering detailed vegetation greenness indicators. Sentinel-2 provides NDVI at a 10–20 m spatial resolution, ideal for monitoring agricultural and forested areas.

For Land Surface Temperature (LST), datasets like ECMWF ERA5-Land or CMIP6 provide modeled temperature estimates. ERA5-Land offers hourly global data at approximately 9 km resolution, while CMIP6 provides climate projections with coarser spatial resolution, useful for long-term trend analysis and drought assessment. These sources complement VHI by offering thermal stress data.

2.5.4.  Area and time of interest

Geospatial extent

The specified geospatial extent defines a rectangular bounding box covering the majority of Slovenia:

  • Southwest Corner:

    • Latitude: 45.4236367

    • Longitude: 13.3652612

  • Northeast Corner:

    • Latitude: 46.8639623

    • Longitude: 16.5153015

This region encompasses Slovenia’s diverse landscapes, from the Julian Alps in the northwest to the Pannonian Plain in the east, including key urban areas like Ljubljana, Maribor, and Koper.

Time extent

The defined time period used spans September 1, 2018, to September 1, 2021, covering three full years. This time period allows for temporal analysis of geospatial phenomena, such as:

  • Seasonal patterns in vegetation and land surface temperatures.

  • Climate variability and trends, particularly related to drought or extreme weather.

  • Impact assessment of significant environmental events during this timeframe.

This setup is ideal for studying environmental dynamics and changes within Slovenia using geospatial data from sources like Sentinel-2 for vegetation and ECMWF ERA5 for climate data.

2.6.  Overview of implementations

The GDC API Profiles were implemented using existing standards:

Table 3 — Overview of GDC API Profile implementations

Profile / Standard | CRIM | Ecere | Ellipsis Drive | Eurac | MMS
Core
OGC API — Common — Part 2* | - | Yes | Yes | - | partially
OGC API — Coverages — Part 1* | - | Yes | Yes | Yes | Yes
STAC API | Yes | - | Yes | Yes | Yes
Partial Access
OGC API — Coverages — Part 1* (subsetting, field selection) | - | Yes | No | Yes | partially
Resampled Access
OGC API — Coverages — Part 1* (scaling) | - | Yes | Yes | - | -
Processing
openEO API | - | - | - | Yes | Yes
OGC API — Coverages — Part 2* | - | Yes | - | - | -
OGC API — Processes — Part 1 | Yes | Yes | Yes | - | -
OGC API — Processes — Part 2* | Yes | - | ? | - | -
OGC API — Processes — Part 3* | Yes | Yes | - | - | -

*Draft standards

The Vegetation Health Index use case described above was implemented with the following profiles and standards in this testbed:

2.6.1.  Processing: OGC API — Coverages — Part 2 (using CQL2) implemented by Ecere

At the time of the Testbed 20 initiative, Ecere’s implementation of the proposed Part 2: Filtering, Deriving and Aggregating fields supported requests such as the following, enabling the computation of OGC Common Query Language (CQL2) expressions referencing coverage fields.

https://maps.gnosis.earth/ogcapi/collections/T20-VHI:MonthlyRef_2017_2020:NDVI_ref_and_monthly_2020/coverage?
   subset=Lat(45.4236367:46.8639623),Lon(13.3652612:16.5153015),time("2020-07")&
   width=1024&
   f=image/tiff;application=geotiff&
   properties=100 * (NDVI_monthly - NDVI_min) / (NDVI_max - NDVI_min)

Listing 1 — HTTPS GET request to /coverage with embedded processing computing the Vegetation Condition Index (VCI), as implemented by Ecere at the time of the initiative

Suggested changes to the proposed Part 2 using an alias query parameter (not yet implemented) would support referencing newly derived fields and leave the properties query parameter strictly for field selection. Using this approach, the above request would become the following.

/collections/T20-VHI:MonthlyRef_2017_2020:NDVI_ref_and_monthly_2020/coverage?
   subset=Lat(45.4236367:46.8639623),Lon(13.3652612:16.5153015),time("2020-07")&
   width=1024&
   f=image/tiff;application=geotiff&
   alias[VCI]=100 * (NDVI_monthly - NDVI_min) / (NDVI_max - NDVI_min)&
   properties=VCI

Listing 2 — Request to /coverage with embedded processing computing the Vegetation Condition Index (VCI) with the suggested changes

With a proposed joinCollections parameter, cross-collection queries such as the full VHI workflow could be implemented as follows.

/collections/sentinel2-l2a/coverage?
   subset=Lat(45.4236367:46.8639623),Lon(13.3652612:16.5153015),time("2020-07")&
   joinCollections=https://planetarycomputer.microsoft.com/api/stac/v1/collections/nasa-nex-gddp-cmip6&
   filter=SCL = 4 and cmip6:model = 'GFDL-ESM4' and cmip6:scenario = 'ssp585'&
   alias[NDVI_monthly]=AggregateMulti((B08 - B04)/(B08 + B04), Max, ('time'), ('P1M'))&
   alias[NDVI_max]=AggregateMulti(NDVI_monthly, Max, ('time'), ('P1M'), ('R4/2017-01-01/P1Y'), ('2020-01-01'))&
   alias[NDVI_min]=AggregateMulti(NDVI_monthly, Min, ('time'), ('P1M'), ('R4/2017-01-01/P1Y'), ('2020-01-01'))&
   alias[VCI]=100 * (NDVI_monthly - NDVI_min) / (NDVI_max - NDVI_min)&
   alias[LST_monthly]=AggregateMulti(tas, Avg, ('time'), ('P1M'))&
   alias[LST_max]=AggregateMulti(LST_monthly, Max, ('time'), ('P1M'), ('R4/2017-01-01/P1Y'), ('2020-01-01'))&
   alias[LST_min]=AggregateMulti(LST_monthly, Min, ('time'), ('P1M'), ('R4/2017-01-01/P1Y'), ('2020-01-01'))&
   alias[TCI]=100 * (LST_max - LST_monthly) / (LST_max - LST_min)&
   alias[VHI]=0.5 * VCI + 0.5 * TCI&
   properties=VHI

Listing 3 — Request to /coverage with embedded processing computing the Vegetation Health Index (VHI) with cross-collection queries

2.6.2.  Processing: OpenEO API implemented by Eurac and Matthias Mohr (openEO for Google Earth Engine)

Processing on a data cube can be performed using the openEO API. An openEO process graph (see below) can be created with client libraries and applications, such as the openEO Python Client or the openEO Web Editor.

spatial_extent = {
    "west": 13.3652612,
    "east": 16.5153015,
    "south": 45.4236367,
    "north": 46.8639623
}

temporal_extent = ["2020-06-01", "2020-07-01"]
bands = ["red", "nir", "scl"]

# Load Sentinel-2 L2A data from the STAC collection and resample to a regular lat/lon grid
s2_cube = conn.load_stac(url="https://earth-search.aws.element84.com/v1/collections/sentinel-2-l2a",
                         spatial_extent=spatial_extent,
                         temporal_extent=temporal_extent,
                         bands=bands).resample_spatial(resolution=0.25, projection="EPSG:4326")

# Add dimension metadata manually, as it is not auto-extracted when using load_stac
s2_cube.metadata = s2_cube.metadata.add_dimension("time", label=None, type="temporal")
s2_cube.metadata = s2_cube.metadata.add_dimension("latitude", label=None, type="spatial")
s2_cube.metadata = s2_cube.metadata.add_dimension("longitude", label=None, type="spatial")
s2_cube = s2_cube.rename_dimension(source="latitude", target="lat").rename_dimension(source="longitude", target="lon")

# Build a vegetation mask from the scene classification band (SCL class 4 = vegetation)
scl = s2_cube.band("scl")
vegetation_mask = (scl == 4)
s2_cube_masked = s2_cube.filter_bands(["red", "nir"]).mask(vegetation_mask)

# Compute the NDVI from the red and near-infrared bands
B04 = s2_cube_masked.band("red")
B08 = s2_cube_masked.band("nir")
ndvi = (B08 - B04) / (B08 + B04)

# Minimum and maximum NDVI along the temporal dimension
ndvi_max = ndvi.reduce_dimension(dimension="time", reducer="max").add_dimension(name="bands", type="bands", label="NDVI_MAX")
ndvi_min = ndvi.reduce_dimension(dimension="time", reducer="min").add_dimension(name="bands", type="bands", label="NDVI_MIN")
ndvi_min_max = ndvi_max.merge_cubes(ndvi_min)

# Monthly NDVI maxima and the extracted extremes
aggregated_ndvi = ndvi.aggregate_temporal_period("month", reducer="max")
ndvi_min = ndvi_min_max.band("NDVI_MIN")
ndvi_max = ndvi_min_max.band("NDVI_MAX")

diff = ndvi_max - ndvi_min

# VCI: subtract the minimum and divide by the NDVI range, via merge_cubes overlap resolvers
VCI = aggregated_ndvi.merge_cubes(ndvi_min, overlap_resolver="subtract").merge_cubes(diff, overlap_resolver="divide").add_dimension(name="bands", type="bands", label="value")

Listing 4 — Building an openEO process graph with Python

{
  "process_graph": {
    "loadcollection1": {
      "process_id": "load_collection",
      "arguments": {
        "bands": ["red", "nir", "scl"],
        "id": "SENTINEL2_L2A",
        "spatial_extent": {
          "west": 13.3652612,
          "east": 16.5153015,
          "south": 45.4236367,
          "north": 46.8639623
        },
        "temporal_extent": ["2020-01-01", "2020-12-31"]
      }
    },
    "resamplespatial1": {
      "process_id": "resample_spatial",
      "arguments": {
        "align": "upper-left",
        "data": {"from_node": "loadcollection1"},
        "method": "near",
        "projection": "EPSG:4326",
        "resolution": 0.125
      }
    },
    "filterbands1": {
      "process_id": "filter_bands",
      "arguments": {
        "bands": ["red", "nir"],
        "data": {"from_node": "resamplespatial1"}
      }
    },
    "reducedimension1": {
      "process_id": "reduce_dimension",
      "arguments": {
        "data": {"from_node": "resamplespatial1"},
        "dimension": "bands",
        "reducer": {
          "process_graph": {
            "arrayelement1": {
              "process_id": "array_element",
              "arguments": {
                "data": {"from_parameter": "data"},
                "index": 2
              }
            },
            "eq1": {
              "process_id": "eq",
              "arguments": {
                "x": {"from_node": "arrayelement1"},
                "y": 4
              },
              "result": true
            }
          }
        }
      }
    },
    "mask1": {
      "process_id": "mask",
      "arguments": {
        "data": {"from_node": "filterbands1"},
        "mask": {"from_node": "reducedimension1"}
      }
    },
    "reducedimension2": {
      "process_id": "reduce_dimension",
      "arguments": {
        "data": {"from_node": "mask1"},
        "dimension": "bands",
        "reducer": {
          "process_graph": {
            "arrayelement2": {
              "process_id": "array_element",
              "arguments": {
                "data": {"from_parameter": "data"},
                "index": 1
              }
            },
            "arrayelement3": {
              "process_id": "array_element",
              "arguments": {
                "data": {"from_parameter": "data"},
                "index": 0
              }
            },
            "subtract1": {
              "process_id": "subtract",
              "arguments": {
                "x": {"from_node": "arrayelement2"},
                "y": {"from_node": "arrayelement3"}
              }
            },
            "add1": {
              "process_id": "add",
              "arguments": {
                "x": {"from_node": "arrayelement2"},
                "y": {"from_node": "arrayelement3"}
              }
            },
            "divide1": {
              "process_id": "divide",
              "arguments": {
                "x": {"from_node": "subtract1"},
                "y": {"from_node": "add1"}
              },
              "result": true
            }
          }
        }
      }
    },
    "aggregatetemporalperiod1": {
      "process_id": "aggregate_temporal_period",
      "arguments": {
        "data": {"from_node": "reducedimension2"},
        "period": "month",
        "reducer": {
          "process_graph": {
            "median1": {
              "process_id": "median",
              "arguments": {
                "data": {"from_parameter": "data"}
              },
              "result": true
            }
          }
        }
      }
    },
    "reducedimension3": {
      "process_id": "reduce_dimension",
      "arguments": {
        "data": {"from_node": "aggregatetemporalperiod1"},
        "dimension": "time",
        "reducer": {
          "process_graph": {
            "max1": {
              "process_id": "max",
              "arguments": {
                "data": {"from_parameter": "data"}
              },
              "result": true
            }
          }
        }
      }
    },
    "adddimension1": {
      "process_id": "add_dimension",
      "arguments": {
        "data": {"from_node": "reducedimension3"},
        "label": "NDVI_MAX",
        "name": "bands",
        "type": "bands"
      }
    },
    "reducedimension4": {
      "process_id": "reduce_dimension",
      "arguments": {
        "data": {"from_node": "aggregatetemporalperiod1"},
        "dimension": "time",
        "reducer": {
          "process_graph": {
            "min1": {
              "process_id": "min",
              "arguments": {
                "data": {"from_parameter": "data"}
              },
              "result": true
            }
          }
        }
      }
    },
    "adddimension2": {
      "process_id": "add_dimension",
      "arguments": {
        "data": {"from_node": "reducedimension4"},
        "label": "NDVI_MIN",
        "name": "bands",
        "type": "bands"
      }
    },
    "mergecubes1": {
      "process_id": "merge_cubes",
      "arguments": {
        "cube1": {"from_node": "adddimension1"},
        "cube2": {"from_node": "adddimension2"}
      }
    },
    "renamedimension1": {
      "process_id": "rename_dimension",
      "arguments": {
        "data": {"from_node": "mergecubes1"},
        "source": "latitude",
        "target": "y"
      }
    },
    "renamedimension2": {
      "process_id": "rename_dimension",
      "arguments": {
        "data": {"from_node": "renamedimension1"},
        "source": "longitude",
        "target": "x"
      }
    },
    "saveresult1": {
      "process_id": "save_result",
      "arguments": {
        "data": {"from_node": "renamedimension2"},
        "format": "GTiff",
        "options": {}
      },
      "result": true
    }
  }
}

Listing 5 — openEO process graph encoded as JSON

2.6.3.  Processing: Draft OGC API — Processes — Part 1: Core and Part 2: Deploy, Replace, Undeploy implemented by CRIM

Processes can be deployed to provide functions necessary for data cube operations, such as data discovery, filtering, math operations, and so forth. Processes that merge multiple operations together can be deployed using EO Application Packages. This approach can be combined with OGC API — Processes — Part 3: Workflows & Chaining.
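
As a hedged illustration (not taken from a Testbed deliverable), a deployment request following the draft OGC API — Processes — Part 2 application package encoding might look as follows; the host, the CWL package URL, and the execution unit media type are hypothetical:

POST /processes HTTP/1.1
Content-Type: application/ogcapppkg+json

{
  "processDescription": { "id": "ndvi-batch", "version": "1.0.0" },
  "executionUnit": [
    { "href": "https://example.com/packages/ndvi-batch.cwl", "type": "application/cwl" }
  ]
}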

The following example is an execution body for the ndvi-batch process previously registered as CWL.

{
  "process": "https://hirondelle.crim.ca/weaver/processes/ndvi-batch",
  "inputs": {
    "stac_items": {
      "collection": "https://earth-search.aws.element84.com/v1/collections/sentinel-2-l2a",
      "datetime": "2020-01-01T00:00:00Z/2021-01-01T00:00:00Z",
      "bbox": [13.3652612, 45.4236367, 16.5153015, 46.8639623],
      "format": "stac-items",
      "type": "application/geo+json"
    }
  }
}

Listing 6 — Chaining of Processes for NDVI batch from STAC Collection

2.6.4.  Processing: Draft OGC API — Processes — Part 3: Workflows implemented by Ecere and CRIM

As implemented by Ecere during Testbed 20, the following JSON workflow definition uses “Nested Processes” and “Input Field Modifiers”. Based on the JSON workflow definition, the final step of the VHI can be performed using OGC Common Query Language (CQL2) expressions. Within such an expression, input collections (“Collection Input”) are referenced. Finally, this workflow definition can be posted to the /processes/{processId}/execution?response=collection endpoint to create a new virtual collection (“Collection Output”). Afterwards, OGC API — Coverages, OGC API — Tiles, or OGC API — DGGS requests can be used to retrieve data from this resulting virtual collection, triggering on-demand processing for a specific area, time, and resolution of interest. Server-side visualization can similarly be performed using implementations of the OGC API — Maps or OGC API — Tiles (for map tiles) Standards.

{
  "process": "https://maps.gnosis.earth/ogcapi/processes/PassThrough",
  "inputs": {
    "data": [
      {
        "process": "https://maps.gnosis.earth/ogcapi/processes/PassThrough",
        "inputs": {
          "data": [
            { "collection": "https://maps.gnosis.earth/ogcapi/collections/T20-VHI:MonthlyRef_2017_2020:VCI_2020" },
            { "collection": "https://maps.gnosis.earth/ogcapi/collections/T20-VHI:MonthlyRef_2017_2020:TCI_2020" }
          ]
        },
        "properties": { "VHI": "0.8 * VCI + 0.2 * TCI" }
      }
    ]
  }
}

Listing 7 — Example workflow definition from OGC API — Processes — Part 3: Workflows as implemented by Ecere during the initiative

The execution endpoint for virtual “Collection Output” was proposed to be changed to /collections instead. However, this change was not yet implemented by the end of the initiative. Also not yet working by the end of the initiative, a complete VHI workflow directly referencing the source STAC metadata and making use of an AggregateMulti() function, an “Input Field Modifiers” filter, as well as “Output Field Modifiers”, could be defined as follows.

{
  "process": "https://maps.gnosis.earth/ogcapi/processes/PassThrough",
  "inputs": {
    "data": [
      {
        "collection": "https://earth-search.aws.element84.com/v1/collections/sentinel-2-l2a",
        "filter": "SCL = 4",
        "properties": {
          "NDVI_monthly": "AggregateMulti((B08 - B04)/(B08 + B04), Max, ('time'), ('P1M'))",
          "NDVI_max": "AggregateMulti(NDVI_monthly, Max, ('time'), ('P1M'), ('R4/2017-01-01/P1Y'), ('2020-01-01'))",
          "NDVI_min": "AggregateMulti(NDVI_monthly, Min, ('time'), ('P1M'), ('R4/2017-01-01/P1Y'), ('2020-01-01'))",
          "VCI": "100 * (NDVI_monthly - NDVI_min) / (NDVI_max - NDVI_min)"
        }
      },
      {
        "collection": "https://planetarycomputer.microsoft.com/api/stac/v1/collections/nasa-nex-gddp-cmip6",
        "filter": "cmip6:model = 'GFDL-ESM4' and cmip6:scenario = 'ssp585'",
        "properties": {
          "LST_monthly": "AggregateMulti(tas, Avg, ('time'), ('P1M'))",
          "LST_max": "AggregateMulti(LST_monthly, Max, ('time'), ('P1M'), ('R4/2017-01-01/P1Y'), ('2020-01-01'))",
          "LST_min": "AggregateMulti(LST_monthly, Min, ('time'), ('P1M'), ('R4/2017-01-01/P1Y'), ('2020-01-01'))",
          "TCI": "100 * (LST_max - LST_monthly) / (LST_max - LST_min)"
        }
      }
    ]
  },
  "properties": { "VHI": "0.5 * VCI + 0.5 * TCI" }
}

Listing 8 — Workflow for generating a Vegetation Health Index (VHI) using the PassThrough process, CQL2 and AggregateMulti() function, defined as a single execution request and directly referencing the source STAC metadata

CRIM implemented support for the “Nested Processes”, “Collection Input” and “Remote Collections” requirement classes of Processes — Part 3: Workflows.

3.  Conclusion & Outlook

Testbed 20 work demonstrated that chained processing can be used with requirements classes from various standards, such as OGC API — Coverages (derived fields), OGC API — Processes (workflows and chaining), and openEO. However, some of these standards are still in draft stages. In particular, the additional parts, such as OGC API — Processes — Part 3: Workflows & Chaining and OGC API — Coverages — Part 2: Derived Fields, do not yet have multiple implementations or sufficient experience and feedback from users. As shown in the implementation overview, this leads to three different ways users can define processing steps and execute those steps. Also, from a data provider’s point of view, there are three different standards to choose from to make data available through this kind of GeoDataCube API. Therefore, Testbed 20 concluded with differing implementations that users could not easily interact with in an interoperable way.

However, Testbed 20 work did demonstrate which standards exist to conduct processing on geospatial (raster) data cubes and led to stronger collaboration between the developers of those standards. Although GDC API profiles were defined, how to proceed and whether an API definition is the best way forward is still being discussed.

There are some topics for further alignment between the above-mentioned standards that have started and can be further discussed after Testbed 20 completion:

  • Common process names between openEO, CQL2, and OGC API — Processes. openEO already has a pre-defined list of process names, which can be further adapted in other standards (e.g., in CQL2).

  • Common API endpoint to post workflows (e.g., /tasks or /jobs).

4.  Security, Privacy and Ethical Considerations

During the course of this project, a thorough review was conducted to identify any potential security, privacy, and ethical concerns. After careful evaluation, it was determined that none of these considerations were relevant to the scope and nature of this project. Therefore, no specific measures or actions were required in these areas.


Bibliography

[1]  Pedro Gonçalves, editor (2021). ‘OGC Best Practice for Earth Observation Application Package’, http://www.opengis.net/doc/BP/eoap/1.0.

[2]  Benjamin Pross, Panagiotis A. Vretanos, editors (2021). ‘OGC 18-062r2: OGC API — Processes — Part 1: Core’, http://www.opengis.net/doc/IS/ogcapi-processes-1/1.0.

[3]  Matthias Schramm, Edzer Pebesma, Milutin Milenković, Luca Foresta, Jeroen Dries, Alexander Jacob, Wolfgang Wagner, Matthias Mohr, Markus Neteler, Miha Kadunc, and et al. 2021. “The openEO API–Harmonising the Use of Earth Observation Cloud Services Using Virtual Data Cube Functionalities” Remote Sensing 13, no. 6: 1125. https://doi.org/10.3390/rs13061125

[4]  Eurac Research (2023). Vegetation Health Index Factsheet of the Interreg Alpine Space Alpine Drought Observatory. Retrieved from https://raw.githubusercontent.com/Eurac-Research/ado-data/main/factsheets/VHI_4.pdf.

[5]  Alexander Jacob, editor (2024). ‘OGC 23-047: OGC Testbed-19 GeoDataCubes Engineering Report’. http://www.opengis.net/doc/PER/T19-D011.

[6]  Gérald Fenoy, editor (2025). ‘DRAFT OGC API — Processes — Part 4: Job Management’, https://docs.ogc.org/DRAFTS/24-051.html.

[7]  Charles Heazel, Jérôme Jacovella-St-Louis, editors (2025). ‘DRAFT: OGC API — Coverages — Part 1: Core’, https://docs.ogc.org/DRAFTS/19-087.html.

[8]  Jérôme Jacovella-St-Louis, Panagiotis A. Vretanos, editors (2025). ‘DRAFT OGC API — Processes — Part 3: Workflows and Chaining’, https://docs.ogc.org/DRAFTS/21-009.html.


Annex A
(informative)
Abbreviations/Acronyms

API

Application Programming Interface

CQL

Common Query Language

CRS

Coordinate Reference System

CWL

Common Workflow Language

ECMWF

European Centre for Medium-Range Weather Forecasts

EDR

OGC Environmental Data Retrieval

ERA5

ECMWF Reanalysis v5

GDC

Geo Data Cube

LST

Land Surface Temperature

NDVI

Normalized Difference Vegetation Index

STAC

SpatioTemporal Asset Catalog

VHI

Vegetation Health Index


Annex B
(informative)
Eurac Research GDC API

B.1.  GDC API Components

The Eurac Research GDC API implementation was further developed starting from the work done during the 2023 OGC Testbed 19. The Java openEO API implementation project called openeo-spring-driver (https://github.com/Open-EO/openeo-spring-driver) was the starting point. This is the user-facing component handling the API calls and redirecting them to the other backend services. The public openEO API endpoint address is https://dev.openeo.eurac.edu/.

The second essential architecture component is the openeo-odc-driver (https://github.com/Open-EO/openeo_odc_driver). This software is a Python processing engine that converts openEO process graphs to executable Python code, computes the result, and returns the results to the openeo-spring-driver. This component is based on two main open-source projects:

  1. openeo-processes-dask (https://github.com/Open-EO/openeo-processes-dask): A Python implementation of most openEO processes, written using the Xarray and Dask libraries.

  2. openeo-pg-parser-networkx (https://github.com/Open-EO/openeo-pg-parser-networkx/): A library to parse the JSON process graphs into executable functions based on openeo-processes-dask.

The third essential component of the GDC API implementation developed by Eurac Research is the STAC API. The deployment is based on the stac-fastapi project (https://github.com/stac-utils/stac-fastapi) with a PostgreSQL back-end (https://github.com/stac-utils/stac-fastapi-pgstac). This component contains a STAC Catalog, which indexes all the openEO batch job results as STAC Collections. The public STAC API is available at https://stac.openeo.eurac.edu/api/v1/pgstac/.

B.2.  OGC API – Coverages

A basic implementation of a /coverage endpoint is available in the GDC API instance of Eurac Research, which internally translates the request to an openEO process graph. The endpoint is supported only by the specific collections created for Testbed 20 (SENTINEL2_L2A, ERA5_FORECAST, ERA5_REANALYSIS) and requires a basic authentication token to be passed in the headers.

The following request parameters are available at the endpoint:

  • bbox=west,south,east,north for spatial filters

  • datetime=from[,to] parameter for time trimming (or slicing in case a single timestamp is provided); input timestamp shall be in the format YYYY-HH-MMThh:mm:ssTZ (e.g., 2017-10-31T10:00:00Z); ’..’ wildcards are supported for open-ended intervals

  • properties=p1[,p2]* for band sub-selection; band names are listed in the cube:dimensions/bands section of the STAC collection document

  • f=mime for specifying the coverage output format (e.g., application/x-netcdf, image/tiff)

IMPORTANT

The following exceptions apply to the current implementation.

  • The subset (and subset-crs) parameters are not accepted; subsetting therefore needs to be requested through the bbox and datetime parameters.

  • bbox-crs is not accepted; lat/lon WGS84 decimal-degree coordinates are assumed in the bbox parameter.

  • scale is not accepted; only the original resolution of the underlying data cube can be used.

Further Notes:

  • At least a bbox or a datetime filter is required, to prevent the download of very large amounts of data (when the result is requested in GeoTIFF or NetCDF format).

  • Authentication tokens are required in the HTTP request to retrieve a coverage.

B.2.1.  Data cubes (collections) of interest

Three data cubes were made available via the GDC API deployment, covering the needs of the sample workflow (VHI) and the Provenance Demonstration.

Unfortunately, the ERA5 data offered by Microsoft has not been updated recently, and the latest samples are from late 2020. An official ERA5 STAC Collection was recently released within the Destination Earth project (https://earthdatahub.destine.eu/api/stac/v1/collections/era5). The Testbed participants did not have time to replace the Microsoft dataset with this data source, but it would be the new reference ERA5 collection going forward.

B.2.2.  Example use of implementation

A sample GET /coverage request to the Eurac Research server instance is the following:

https://dev.openeo.eurac.edu/collections/SENTINEL2_L2A/coverage?properties=blue,green,red&datetime=2023-01-08T00:00:00Z/2023-01-08T23:59:00Z&bbox=11.38,46.45,11.40,46.47&f=geotiff

The request specifies an area of interest over the city of Bolzano, where Eurac Research is located. The request selects only the blue, green, and red bands using the properties filter and stores the result as a GeoTIFF.
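Since the endpoint requires an authentication token, the request above cannot be issued anonymously. The following is a minimal sketch of the same request issued from Python, assuming HTTP Basic credentials for the Eurac instance; the username, password, and output file name are placeholders.

# Minimal sketch: retrieving the coverage above with HTTP Basic authentication.
# The credentials and output file name are placeholders, not real values.
import requests

url = ("https://dev.openeo.eurac.edu/collections/SENTINEL2_L2A/coverage"
       "?properties=blue,green,red"
       "&datetime=2023-01-08T00:00:00Z/2023-01-08T23:59:00Z"
       "&bbox=11.38,46.45,11.40,46.47&f=geotiff")

response = requests.get(url, auth=("<username>", "<password>"), timeout=300)
response.raise_for_status()

# Store the returned GeoTIFF on disk.
with open("bolzano_rgb.tif", "wb") as f:
    f.write(response.content)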

B.2.3.  Workflow Demonstration of Vegetation Health Index (VHI) Use Case

The workflow common to all the GDC API implementations (early 2025) involves an index that combines satellite optical data and atmospheric data. The Copernicus Sentinel-2 mission provides the red (B04) and near-infrared (B08) spectral bands necessary for computing the Normalized Difference Vegetation Index (NDVI). To calculate the Vegetation Health Index (VHI), the Vegetation Condition Index (VCI) must first be determined: the minimum and maximum NDVI values over an extended period are computed and then used to normalize the current NDVI value being analyzed.
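Expressed as a formula, this normalization (implemented at the end of Listing B.1 below via the subtract and divide overlap resolvers) is:

\[
\mathrm{VCI} = 100 \cdot \frac{\mathrm{NDVI} - \mathrm{NDVI}_{\min}}{\mathrm{NDVI}_{\max} - \mathrm{NDVI}_{\min}}
\]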

VCI workflow using the openEO API of Eurac Research

 

import openeo

conn = openeo.connect("https://dev.openeo.eurac.edu").authenticate_basic(username="", password="")
spatial_extent = {
          "west": 13.3652612,
          "east": 16.5153015,
          "south": 45.4236367,
          "north": 46.8639623
        }

temporal_extent = ["2020-01-01","2020-12-31"]
bands = ["red","nir","scl"]
s2_cube = conn.load_collection("SENTINEL2_L2A",
                              spatial_extent=spatial_extent,
                              temporal_extent=temporal_extent,
                              bands=bands).resample_spatial(resolution=0.125, projection="EPSG:4326")

## the next metadata steps are necessary due to missing auto extraction of metadata when using load_stac
s2_cube.metadata = s2_cube.metadata.rename_dimension("y","latitude")
s2_cube.metadata = s2_cube.metadata.rename_dimension("x","longitude")
scl = s2_cube.band("scl")
vegetation_mask = (scl == 4)
s2_cube_masked = s2_cube.filter_bands(["red","nir"]).mask(vegetation_mask)

B04 = s2_cube_masked.band("red")
B08 = s2_cube_masked.band("nir")
ndvi = (B08 - B04) / (B08 + B04)

ndvi_monthly_median = ndvi.aggregate_temporal_period("month",reducer="median")
ndvi_max = ndvi_monthly_median.max_time().add_dimension(name="bands",type="bands",label="NDVI_MAX")
ndvi_min = ndvi_monthly_median.min_time().add_dimension(name="bands",type="bands",label="NDVI_MIN")
ndvi_min_max = ndvi_max.merge_cubes(ndvi_min)
ndvi_min_max = ndvi_min_max.rename_dimension(source="latitude",target="y").rename_dimension(source="longitude",target="x")

temporal_extent = ["2020-07-01","2020-08-01"]

s2_cube = conn.load_stac(url="https://earth-search.aws.element84.com/v1/collections/sentinel-2-l2a",
                              spatial_extent=spatial_extent,
                              temporal_extent=temporal_extent,
                              bands=bands).resample_spatial(resolution=0.125, projection="EPSG:4326")

s2_cube.metadata = s2_cube.metadata.add_dimension("time", label=None, type="temporal")
scl = s2_cube.band("scl")
vegetation_mask = (scl == 4)
s2_cube_masked = s2_cube.filter_bands(["red","nir"]).mask(vegetation_mask)

B04 = s2_cube_masked.band("red")
B08 = s2_cube_masked.band("nir")
ndvi = (B08 - B04) / (B08 + B04)
ndvi_monthly_median = ndvi.aggregate_temporal_period("month",reducer="median")

ndvi_min = ndvi_min_max.band("NDVI_MIN")
ndvi_max = ndvi_min_max.band("NDVI_MAX")

diff = ndvi_max - ndvi_min

VCI = ndvi_monthly_median.merge_cubes(ndvi_min,overlap_resolver="subtract").merge_cubes(diff,overlap_resolver="divide")
VCI = VCI * 100

VCI = VCI.add_dimension(type="bands",name="bands",label="VCI")

Listing B.1

The next step is to calculate the second component of the VHI. This step takes the atmospheric conditions of the area being studied into account through the Thermal Condition Index (TCI). The temperature data for this calculation is sourced from the ERA5 Reanalysis.
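For reference, the TCI computed in Listing B.2 below corresponds to the following normalization, where T is the monthly mean air temperature at 2 meters:

\[
\mathrm{TCI} = 100 \cdot \frac{T_{\max} - T}{T_{\max} - T_{\min}}
\]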

TCI workflow using the openEO API of Eurac Research

 

spatial_extent = {
          "west": 13.3652612,
          "east": 16.5153015,
          "south": 45.4236367,
          "north": 46.8639623
        }

temporal_extent = ["2020-01-01","2020-12-31"]
bands = ["air_temperature_at_2_metres"]
era5 = conn.load_collection("ERA5_REANALYSIS",
                              spatial_extent=spatial_extent,
                              temporal_extent=temporal_extent,
                              bands=bands)

t_max = era5.max_time().rename_labels(dimension="bands",target=["T_MAX"])
t_min = era5.min_time().rename_labels(dimension="bands",target=["T_MIN"])
t_min_max = t_max.merge_cubes(t_min)

temporal_extent = ["2020-07-01","2020-08-01"]

era5 = conn.load_collection("ERA5_REANALYSIS",
                              spatial_extent=spatial_extent,
                              temporal_extent=temporal_extent,
                              bands=bands).drop_dimension("bands").aggregate_temporal_period("month",reducer="mean")

t_min = t_min_max.band("T_MIN")
t_max = t_min_max.band("T_MAX")

TCI = (((era5 * -1) + t_max) / (t_max - t_min)) * 100

Listing B.2

Finally, the two indexes can be combined into the VHI. The VCI and TCI data cubes have different resolutions; therefore, the projection, resolution, and pixel centers need to be aligned, which is possible using the resample_cube_spatial openEO process.

VHI workflow using the openEO API of Eurac Research

 

VCI_aligned = VCI.resample_cube_spatial(target=TCI,method="average")

alpha = 0.5

VHI = alpha * VCI_aligned + (1 - alpha) * TCI
VHI = VHI.add_dimension(type="bands",name="bands",label="VHI")

Listing B.3

The complete example is available as an interactive Python notebook (.ipynb) on GitHub at https://github.com/clausmichele/OGC-T20-D144/blob/main/VHI_openeo_eurac_research.ipynb.


Annex C
(informative)
Ellipsis GDC API

This annex was an optional deliverable for the participants. For more details, please contact Ellipsis Drive directly.


Annex D
(informative)
Ecere GDC API — GNOSIS Map Server

For the OGC Testbed 20 D140 GeoDataCube API Profile deliverable, Ecere provided an implementation of various OGC API standards as part of the Ecere GNOSIS Map Server. During Testbed 20, several enhancements were made to the GNOSIS Map Server and the underlying GNOSIS Software Development Kit (SDK). These enhancements supported temporal aggregations, as well as calculations and integration across disparate collections of raster data. Support for the OGC Common Query Language (CQL2) was also completed, with conformance to all requirements classes except for the array functions.

This included:

  • Completing the support for the Well-Known Text (WKT) encoding for geometry.

  • Completing the development of an in-house spatial query engine built around the Dimensionally Extended Nine-Intersection Model (DE-9IM) to support spatial queries.

  • Support for temporal queries and for the CQL2-JSON encoding.

  • Support for accessing data and results of processing through the candidate OGC API — Discrete Global Grid Systems (DGGS) Standard was also improved during the Testbed. This included improved support for the ISEA3H Discrete Global Grid Reference System (DGGRS), such as correctly converting between geodetic and authalic latitudes, and an implementation of the DGGS-optimized DGGS-JSON format.

The DGGS-JSON format relies on a canonical deterministic ordering of sub-zones (zones of a finer refinement level at least partially contained within a parent zone of a coarser refinement level) defined by the DGGRS.

D.1.  Workflow Demonstration of Vegetation Health Index (VHI) Use Case

Ecere provided an implementation of the Vegetation Health Index (VHI) workflow combining a Vegetation Condition Index (VCI) computed from satellite imagery with a Temperature Condition Index (TCI) computed from near-surface air temperature. The workflow is defined using OGC API — Processes execution requests extended with capabilities defined in the candidate OGC API — Processes — Part 3: Workflows Standard. These capabilities include collection inputs, as well as input field modifiers specifying OGC Common Query Language (CQL2) expressions that derive new values from the fields of the input(s). Such execution requests can be submitted as the payload of a POST request to create virtual collections. For the purpose of the VHI workflow demonstration, the submission of execution requests for creating virtual collections was still done at /processes/{processId}/execution?response=collection, as defined by OGC API — Processes — Part 1: Core and the latest draft of the Part 3: Workflows “Collection Output” requirements class. However, during the initiative it was discussed that virtual collections might best be set up by posting workflow definitions to /collections instead, since the new virtual collections are created as new resources at /collections/{collectionId}. This would also allow instantiating virtual collections using workflow definition languages other than OGC API — Processes execution requests.
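As an illustration, the following is a minimal sketch of how such a virtual collection might be created. The process identifier (PassThrough), the input structure, and the derived ndvi field are modeled on draft Part 3 examples, but the exact payload accepted by a given server is an assumption here.

# Minimal sketch: creating a virtual collection from a Part 3 execution request.
# The process id, input structure, and derived "ndvi" field are illustrative
# assumptions modeled on draft OGC API — Processes — Part 3 examples.
import requests

execution_request = {
    "process": "https://maps.gnosis.earth/ogcapi/processes/PassThrough",
    "inputs": {
        "data": {
            # Collection input: the server negotiates access to this collection.
            "collection": "https://maps.gnosis.earth/ogcapi/collections/sentinel2-l2a",
            # Input field modifier: a CQL2 expression deriving a new NDVI field.
            "properties": {"ndvi": "(B08 - B04) / (B08 + B04)"},
        }
    },
}

response = requests.post(
    "https://maps.gnosis.earth/ogcapi/processes/PassThrough/execution",
    params={"response": "collection"},
    json=execution_request,
)
response.raise_for_status()
# The Location header points at the newly created virtual collection.
print(response.headers.get("Location"))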

Once virtual collections are set up, clients can request data from them using implementations of OGC API Standards that support data access mechanisms, such as OGC API — Coverages, OGC API — Discrete Global Grid Systems (DGGS), and OGC API — Tiles (tiled coverage data output). As requests are made for a given area, time, and resolution of interest, the necessary computations are performed on the fly to fulfill the request. In addition to the final workflow, the intermediate steps of the workflow are also provided as virtual collections. Whereas reference monthly minimum and maximum values based on the years 2017-2020 were used, a separate workflow was also set up using a yearly maximum for the year 2020 instead, for the purpose of comparison with workflows implemented this way by other Testbed participants. The workflow was tested and pre-processed for the entirety of the Slovenia area of interest for the year 2020, at the native 10-meter resolution of the source Sentinel-2 data.
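For example, once the VHI virtual collection is set up, a spatiotemporal subset of the computed index could be retrieved with a coverage request of the following form, where {virtualCollectionId} is a placeholder for the identifier assigned when the virtual collection was created:

https://maps.gnosis.earth/ogcapi/collections/{virtualCollectionId}/coverage?
   bbox=13.3652612,45.4236367,16.5153015,46.8639623&
   datetime=2020-07-01T00:00:00Z/2020-07-31T23:59:59Z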

D.1.1.  First input: Sentinel-2 Level-2A satellite imagery

The Sentinel-2 Level-2A collection, used in previous OGC Testbeds and Pilots, was used again in this initiative as the first input for computing the vegetation component of the VHI index. The data is sourced from the Cloud Optimized GeoTIFFs (COGs) hosted on Amazon Web Services (AWS) and managed by Element 84. The data cube is built around a local relational database of scene metadata, initialized from around twenty million STAC items downloaded using the AWS S3 tool. The Sentinel-2 catalog has since roughly doubled in size to almost 40 million granules, which have not been fully indexed on the GNOSIS demonstration server. Initial attempts to use the STAC API deployment (https://earth-search.aws.element84.com/v1/collections/sentinel-2-l2a) in 2022, during the previous Climate Resilience Pilot initiative, ran into issues relating to the paging of items, which are described further in a dedicated section below. The number of STAC items matched (8952) by the latest STAC API endpoint for the Slovenia area of interest and the years 2017-2020:

 

https://earth-search.aws.element84.com/v1/collections/sentinel-2-l2a/items?
   bbox=13.3652612,45.4236367,16.5153015,46.8639623&
   datetime=2017-01-01T00:00:00Z/2020-12-31T23:59:59Z

Listing D.1

differs significantly from the number of scenes returned for the same area and time of interest by the local STAC item database query (5174). This may be explained by additional scenes having been added for this time interval since the items were fetched in 2022 for past initiatives. This may result in a significant discrepancy when comparing the output of Ecere’s implementation of the VHI workflow with implementations from other participants, who used the STAC API deployment directly to identify granules of interest.

Data is requested from the COGs as needed, using HTTP range requests to retrieve the overviews and tiles corresponding to the area, time, resolution, and bands (fields) of interest required to satisfy requests to the data cube.
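The following is a minimal sketch of this access pattern, assuming the rasterio library (which issues the HTTP range requests under the hood via GDAL); the COG URL and window are illustrative placeholders.

# Minimal sketch: reading a small window from a remote COG.
# rasterio/GDAL fetch only the HTTP byte ranges needed for the request;
# the URL below is an illustrative placeholder, not a real asset.
import rasterio
from rasterio.windows import Window

cog_url = "https://sentinel-cogs.s3.us-west-2.amazonaws.com/.../B04.tif"

with rasterio.open(cog_url) as src:
    # Read a 512x512 pixel window; only the internal tiles overlapping
    # the window (plus header metadata) are downloaded.
    window = Window(col_off=1024, row_off=1024, width=512, height=512)
    data = src.read(1, window=window)
    print(data.shape, data.dtype)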

This collection is available at https://maps.gnosis.earth/ogcapi/collections/sentinel2-l2a through all of the OGC API data access mechanisms supported by the GNOSIS Map Server, including OGC API — Coverages, OGC API — Tiles (coverage tiles, supporting several registered 2D Tile Matrix Sets), and OGC API — DGGS (supporting three DGGRSs: the GNOSIS Global Grid, ISEA3H, and ISEA9R). In addition, the data can be visualized directly using implementations of the OGC API — Maps or OGC API — Tiles (map tiles) Standards, or the collections can be referenced as input to processing workflows using the candidate OGC API — Processes — Part 3: Workflows Standard.