Published

OGC Engineering Report

Earth Observation Cloud Platform Concept Development Study Report
Editors: Johannes Echterhoff, Julia Wagemann, Josh Lieberman

Document number: 21-023
Document type: OGC Engineering Report
Document stage: Published
Document language: English

License Agreement

Permission is hereby granted by the Open Geospatial Consortium, (“Licensor”), free of charge and subject to the terms set forth below, to any person obtaining a copy of this Intellectual Property and any associated documentation, to deal in the Intellectual Property without restriction (except as set forth below), including without limitation the rights to implement, use, copy, modify, merge, publish, distribute, and/or sublicense copies of the Intellectual Property, and to permit persons to whom the Intellectual Property is furnished to do so, provided that all copyright notices on the intellectual property are retained intact and that each person to whom the Intellectual Property is furnished agrees to the terms of this Agreement.

If you modify the Intellectual Property, all copies of the modified Intellectual Property must include, in addition to the above copyright notice, a notice that the Intellectual Property includes modifications that have not been approved or adopted by LICENSOR.

THIS LICENSE IS A COPYRIGHT LICENSE ONLY, AND DOES NOT CONVEY ANY RIGHTS UNDER ANY PATENTS THAT MAY BE IN FORCE ANYWHERE IN THE WORLD. THE INTELLECTUAL PROPERTY IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, AND NONINFRINGEMENT OF THIRD PARTY RIGHTS. THE COPYRIGHT HOLDER OR HOLDERS INCLUDED IN THIS NOTICE DO NOT WARRANT THAT THE FUNCTIONS CONTAINED IN THE INTELLECTUAL PROPERTY WILL MEET YOUR REQUIREMENTS OR THAT THE OPERATION OF THE INTELLECTUAL PROPERTY WILL BE UNINTERRUPTED OR ERROR FREE. ANY USE OF THE INTELLECTUAL PROPERTY SHALL BE MADE ENTIRELY AT THE USER’S OWN RISK. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR ANY CONTRIBUTOR OF INTELLECTUAL PROPERTY RIGHTS TO THE INTELLECTUAL PROPERTY BE LIABLE FOR ANY CLAIM, OR ANY DIRECT, SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES, OR ANY DAMAGES WHATSOEVER RESULTING FROM ANY ALLEGED INFRINGEMENT OR ANY LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR UNDER ANY OTHER LEGAL THEORY, ARISING OUT OF OR IN CONNECTION WITH THE IMPLEMENTATION, USE, COMMERCIALIZATION OR PERFORMANCE OF THIS INTELLECTUAL PROPERTY.

This license is effective until terminated. You may terminate it at any time by destroying the Intellectual Property together with all copies in any form. The license will also terminate if you fail to comply with any term or condition of this Agreement. Except as provided in the following sentence, no such termination of this license shall require the termination of any third party end-user sublicense to the Intellectual Property which is in force as of the date of notice of such termination. In addition, should the Intellectual Property, or the operation of the Intellectual Property, infringe, or in LICENSOR’s sole opinion be likely to infringe, any patent, copyright, trademark or other right of a third party, you agree that LICENSOR, in its sole discretion, may terminate this license without any compensation or liability to you, your licensees or any other party. You agree upon termination of any kind to destroy or cause to be destroyed the Intellectual Property together with all copies in any form, whether held by you or by any third party.

Except as contained in this notice, the name of LICENSOR or of any other holder of a copyright in all or part of the Intellectual Property shall not be used in advertising or otherwise to promote the sale, use or other dealings in this Intellectual Property without prior written authorization of LICENSOR or such copyright holder. LICENSOR is and shall at all times be the sole entity that may authorize you or any third party to use certification marks, trademarks or other special designations to indicate compliance with any LICENSOR standards or specifications. This Agreement is governed by the laws of the Commonwealth of Massachusetts. The application to this Agreement of the United Nations Convention on Contracts for the International Sale of Goods is hereby expressly excluded. In the event any provision of this Agreement shall be deemed unenforceable, void or invalid, such provision shall be modified so as to make it valid and enforceable, and as so modified the entire Agreement shall remain in full force and effect. No decision, action or inaction by LICENSOR shall be construed to be a waiver of any rights or remedies available to it.

None of the Intellectual Property or underlying information or technology may be downloaded or otherwise exported or reexported in violation of U.S. export laws and regulations. In addition, you are responsible for complying with any local laws in your jurisdiction which may impact your right to import, export or use the Intellectual Property, and you represent that you have complied with any regulations or registration procedures required by applicable law to make this license enforceable.



I.  Abstract

The Earth Observation Cloud Platform Concept Development Study (CDS) evaluates the readiness of satellite data providers and cloud service providers, as well as the maturity of their current systems, with regard to real-world deployment of the new “Applications-to-the-Data” paradigm, using cloud environments for EO data storage, processing, and retrieval.

II.  Executive Summary

Earth Observation (EO) data includes data from satellites as well as in-situ and model-based data. The volume of EO data has increased drastically in recent years. An increasing number of satellites and the improved capabilities (better spatial and temporal resolution) of new imaging sensors and of weather and climate models have led to exponential growth in the daily EO data stream. EO data providers and EO data users alike face the challenge of managing, processing and handling this continued increase. The growing data volume, combined with the variety of EO data and the velocity at which the data is made available, calls for fundamental changes in how EO data is disseminated and how users process and analyse it. EO data at this volume can no longer be downloaded and processed on local machines. New EO data analysis workflows focus on ‘bringing applications to the data’, where EO data is accessed and processed in a cloud-based environment. Cloud-based services provide a highly scalable and flexible computing environment, where storage and computing resources can be acquired as needed, and where the overall IT costs of working with EO data can be reduced significantly.

Cloud-based systems to store, process, analyze, and make EO data accessible are a paradigm change and disrupt the traditional EO data dissemination and analysis workflow.

The Earth Observation Cloud Platform Concept Development Study (CDS) evaluates the readiness of satellite data providers and cloud service providers, as well as the maturity of their current systems, with regard to real-world deployment of the new “Applications-to-the-Data” paradigm, using cloud environments for EO data storage, processing, and retrieval. The study highlights the results of three activities: (i) a dedicated “EO Technologies Show and Tell” workshop in December 2020, (ii) online meetings with a number of stakeholders in January and February 2021, and (iii) a literature review.

The study report documents the evolution of EO system architectures and covers common aspects of these architectures and platforms — data coverage and transmission, storage, discovery, access, as well as security — and the current status of systems from satellite data as well as cloud service providers. A number of topics were identified that satellite data and cloud service providers intend to address in the near future, such as improving interoperability, continuing the migration into the cloud, and expanding the range of available toolsets for analyzing EO data in the cloud. A number of challenges and recommendations have been identified, as well as lessons learned. They include, but are not limited to: the need for more interoperability in cloud-based applications, the lack of policies for data sharing in case of disasters, and the need for training and capacity building to develop the skills necessary for working with EO data in a cloud-based environment.

The study reveals that satellite data providers are moving towards cloud computing, and implementing the applications-to-the-data paradigm. Right now, the major focus for many providers is to make EO data accessible in the cloud. Others already process, analyze and disseminate EO data in the cloud.

Processing in the cloud removes the need to download large volumes of EO data, which reduces the total time it takes to analyze the data. Furthermore, the cloud provides the computing capabilities needed for processing such huge EO datasets, which local computing environments rarely support. The costs of using the cloud can be hard to specify up-front. However, gradually migrating a system into the cloud and working through well-defined projects can help build up the necessary hands-on experience.

Cloud-based EO systems are a way forward to better manage, provision and process vast amounts of EO data. However, some problems that are typical for any IT system, such as technical and information interoperability, require additional work. The study shows that bringing applications to the data in the cloud is the way to process large amounts of EO data in the future.

Cloud-based EO system architectures will make it possible to monitor Planet Earth and its ecosystems, and to study the impact of climate initiatives and programs. They will enable simulations and forecasts on a global scale, as well as the timely provisioning of EO data in case of disasters, such as forest fires and flooding. Cloud-based EO systems will play a vital role in supporting the U.S. engagement to address climate change (as outlined in the Executive Order on tackling the climate crisis, e.g. Sec. 211 d) and Sec. 222 b) (ii)) and the goals of the European strategy for data, including the Destination Earth initiative.

III.  Keywords

The following are keywords to be used by search engines and document catalogues.

ogcdoc, OGC document, EO Cloud Platform, CDS, ER


IV.  Preface

Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. The Open Geospatial Consortium shall not be held responsible for identifying any or all such patent rights.

Recipients of this document are requested to submit, with their comments, notification of any relevant patent claims or other intellectual property rights of which they may be aware that might be infringed by any implementation of the standard set forth in this document, and to provide supporting documentation.

V.  Security considerations

No security considerations have been made for this document.

VI.  Submitting Organizations

The following organizations submitted this Document to the Open Geospatial Consortium (OGC):

VII.  Submitters

All questions regarding this document should be directed to the editor or the contributors:

Name | Organization | Role
Johannes Echterhoff | Interactive Instruments | Editor/Contributor
Julia Wagemann | Consultant | Editor/Contributor
Josh Lieberman | OGC | Editor

VIII.  Acknowledgements

We would like to thank the following companies and organizations who provided inputs for this study.

Company / Organization | Contact
Amazon Web Services (AWS) | Mark Korver
European Space Agency (ESA) | Albrecht Schmidt, Cristiano Lopes
European Organisation for the Exploitation of Meteorological Satellites (EUMETSAT) | Joana Miguens, Peter Albert
CERN | Joao Fernandes
Fisheries and Oceans Canada (DFO) | Tobias Spears
Maxar Technologies | Kumar Navulur
Mercator Ocean International | Alain Arnaud
National Aeronautics and Space Administration (NASA) | Chris Lynnes
Natural Resources Canada (NRCan) | Brian Low, Ryan Ahola, Will Mackkinnon
Planet Labs | Chris Holmes, Quinn Scripter

1.  Normative references

The following documents are referred to in the text in such a way that some or all of their content constitutes requirements of this document. For dated references, only the edition cited applies. For undated references, the latest edition of the referenced document (including any amendments) applies.

ISO: ISO 19115-1:2014, Geographic information — Metadata — Part 1: Fundamentals. International Organization for Standardization, Geneva (2014). https://www.iso.org/standard/53798.html

ISO: ISO 19115-2:2019, Geographic information — Metadata — Part 2: Extensions for acquisition and processing. International Organization for Standardization, Geneva (2019). https://www.iso.org/standard/67039.html

ISO: ISO 19157:2013, Geographic information  — Data quality. International Organization for Standardization, Geneva (2013). https://www.iso.org/standard/32575.html

ISO: ISO 19165-1:2018, Geographic information — Preservation of digital data and metadata — Part 1: Fundamentals. International Organization for Standardization, Geneva (2018). https://www.iso.org/standard/67325.html

ISO: ISO 19165-2:2020, Geographic information — Preservation of digital data and metadata — Part 2: Content specifications for Earth observation data and derived digital products. International Organization for Standardization, Geneva (2020). https://www.iso.org/standard/73810.html

ISO/IEC: ISO/IEC 19941:2017, Information technology — Cloud computing — Interoperability and portability. International Organization for Standardization and International Electrotechnical Commission, Geneva (2017). https://www.iso.org/standard/66639.html

Graham Vowles: OGC 06-004r4, Topic 18 — Geospatial Digital Rights Management Reference Model (GeoDRM RM). Open Geospatial Consortium (2007). https://portal.ogc.org/files/?artifact_id=17802

2.  Terms, definitions and abbreviated terms

This document uses the terms defined in OGC Policy Directive 49, which is based on the ISO/IEC Directives, Part 2, Rules for the structure and drafting of International Standards. In particular, the word “shall” (not “must”) is the verb form used to indicate a requirement to be strictly followed to conform to this document and OGC documents do not use the equivalent phrases in the ISO/IEC Directives, Part 2.

This document also uses terms defined in the OGC Standard for Modular specifications (OGC 08-131r3), also known as the ‘ModSpec’. The definitions of terms such as standard, specification, requirement, and conformance test are provided in the ModSpec.

For the purposes of this document, the following additional terms and definitions apply.

2.1.  Terms and definitions

2.1.1. Cloud Computing

Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction. This cloud model is composed of five essential characteristics (on-demand self-service, broad network access, resource pooling, rapid elasticity, measured service), three service models (Software as a Service, Platform as a Service, Infrastructure as a Service), and four deployment models (private, community, public, and hybrid cloud). Source: The NIST Definition of Cloud Computing.

2.1.2. Earth Observation

Data and information collected about our planet, whether atmospheric, oceanic or terrestrial. This includes space-based or remotely-sensed data, as well as ground-based or in situ data, and computer model based data such as weather forecasts. Based on https://earthobservations.org/geo_wwd.php (accessed on 2021-01-14), extended to include model-based data.

2.2.  Abbreviated terms

ADES: Application Development and Execution Service
AIS: Automatic Identification System
API: Application Programming Interface
ARD: Analysis Ready Data
AWS: Amazon Web Services
CDS: Concept Development Study
CEOS: Committee on Earth Observation Satellites
CFRA: Cloud Federation Reference Architecture
CMR: Common Metadata Repository
COG: Cloud-optimized GeoTIFF
CRR: Cross Region Replication
CWL: Common Workflow Language
DaaS: Data-as-a-Service
DCS: Data Centric Security
DIAS: Data and Information Access Services
DRM: Digital Rights Management
DWG: Domain Working Group
EC: European Commission
ECMWF: European Centre for Medium-Range Weather Forecasts
EMS: Execution Management Service
EO: Earth Observation
EODC: Earth Observation Data Centre
EODMS: EO Data Management System
EOSC: European Open Science Cloud
EOSDIS: Earth Observing System Data and Information System
ESA: European Space Agency
EU: European Union
EUMETSAT: European Organisation for the Exploitation of Meteorological Satellites
EWC: European Weather Cloud
GCMD: Global Change Master Directory
HDA: Harmonized Data Access
HPC: High Performance Computing
IaaS: Infrastructure-as-a-Service
IDN: International Directory Network
JP2: JPEG 2000
MAAP: Multi-Mission Algorithm and Analysis Platform
NASA: National Aeronautics and Space Administration
NISAR: NASA-Indian Space Research Organisation Synthetic Aperture Radar
NIST: National Institute of Standards and Technology
NITF: National Imagery Transmission Format
ODIS: Ocean Data and Information Section
PaaS: Platform-as-a-Service
RCM: RADARSAT Constellation Mission
SaaS: Software-as-a-Service
SNAP: Sentinel Application Platform
STAC: SpatioTemporal Asset Catalog
STEP: Science Toolbox Exploitation Platform
SWOT: Surface Water and Ocean Topography
TB: Terabyte
TEP: Thematic Exploitation Platform
UMM: Unified Metadata Model
USGS: United States Geological Survey
WPS: Web Processing Service

3.  Overview

The report is structured as follows:

4.  Introduction

Satellites collect vast amounts of Earth Observation (EO) data every day. Increasing numbers of satellites and the improved capabilities of new imaging sensors have led to exponential growth in the daily EO data stream. In 2019, for example, over 18 TB of EO data from Copernicus, the European Union’s EO program, were published every day under a full, free and open data license (European Space Agency 2020). Given that EO data includes not only space-based data, but also remotely sensed, in-situ, and computer model based data, EO data providers and EO data users alike face the challenge of managing, processing and making use of this data flood. The growing data volume, combined with the variety of EO data and the velocity at which the data is made available, calls for fundamental changes in how EO data is disseminated and how users process and analyze it. EO data at this volume can no longer be downloaded and processed on local machines. New EO data analysis workflows focus on ‘bringing applications to the data’, where EO data is accessed and processed in a cloud environment.
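The claim that such volumes can no longer be downloaded can be illustrated with a rough calculation. The following is a minimal sketch, assuming decimal terabytes and a single link that sustains its nominal speed; the link speeds are hypothetical values chosen only for illustration and do not come from the study.

```python
# Back-of-the-envelope estimate: how long does it take to download one day
# of Copernicus output (18 TB, the figure quoted above) over a single link?

TB = 10**12  # assumption: decimal terabyte, in bytes


def transfer_days(volume_bytes: float, link_mbit_per_s: float) -> float:
    """Days needed to move volume_bytes over a link of the given nominal speed."""
    bits = volume_bytes * 8
    seconds = bits / (link_mbit_per_s * 10**6)
    return seconds / 86_400  # seconds per day


daily_volume = 18 * TB

# Hypothetical link speeds, for illustration only.
for mbit in (100, 1_000, 10_000):
    days = transfer_days(daily_volume, mbit)
    print(f"{mbit:>6} Mbit/s: {days:.2f} days to download one day of data")
```

Under these assumptions, a 1 Gbit/s link needs more than a day to download a single day of data, so a local copy of the full stream would fall further behind every day.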

Cloud-based services as a means to store, process, analyze, and make EO data accessible are a paradigm change and disrupt the traditional EO data dissemination and analysis workflow.

Cloud services vary widely in their capabilities, protocols, business models, and legal policies. Some services are offered as Infrastructure-as-a-Service (IaaS) and/or Platform-as-a-Service (PaaS) by commercial cloud vendors, such as Amazon Web Services or Google Cloud, and by publicly-funded bodies, such as the Copernicus Data and Information Access Services (DIAS) or the European Open Science Cloud (EOSC). Other cloud services offer Data-as-a-Service (DaaS) or Software-as-a-Service (SaaS) capabilities, such as the Google Earth Engine or the Copernicus Climate/Atmosphere Data Stores. IaaS/PaaS services have a low level of specialization, which leaves wide flexibility for different applications but requires system architecture knowledge. SaaS/DaaS services are more specialized towards specific application areas or data and require more subject-matter expertise.

The Earth Observation Cloud Platform Concept Development Study (CDS) evaluates the readiness of satellite data providers and cloud service providers, as well as the maturity of their current systems, with regard to real-world deployment of the new “Applications-to-the-Data” paradigm, using cloud environments for EO data storage, processing, and retrieval. The study comprised a dedicated “EO Technologies Show and Tell” workshop in December 2020, online meetings with a number of stakeholders in January and February 2021, and a literature review.

NOTE  The CDS intentionally does not cover mission planning and tasking (e.g. of satellites); the focus is on data storage, access, and processing.

5.  EO System Architecture — Stakeholders and Evolution

The growing volumes of EO data require new approaches to disseminate, access and process the data. The traditional workflow for disseminating, processing, and analyzing EO data is currently undergoing significant change. This section first presents the stakeholder groups of the EO system architecture and then the different stages of its evolution.

5.1.  Stakeholders

The Committee on Earth Observation Satellites (CEOS) identifies three key stakeholder groups of the EO data value chain in their ARD Strategy paper: (i) EO data providers (public and private), (ii) big data hosts and aggregators and (iii) data users. These stakeholder groups are mainly defined based on the value chain for EO data from satellites. For the purposes of this study, EO data encompasses in-situ as well as model-based EO data, in addition to satellite data. Furthermore, with the advent of cloud-based services and platforms for EO data management and analysis, the EO system architecture landscape has diversified, and the three stakeholder groups defined by CEOS only partially reflect the EO data value chain. For this reason, rather than combining big data hosts and aggregators, we believe that the ‘aggregators’, as CEOS defines them, should be considered data users. We therefore propose to differentiate between two groups of data users: intermediate users and end users. This differentiation is also reflected in the Copernicus Market Report 2019.

The following sections describe four stakeholder groups: (i) EO data providers, (ii) cloud service / platform providers, (iii) intermediate users and (iv) end users.

5.1.1.  EO Data Providers

EO data providers are public and private sector organizations that operate satellites or run models (e.g. for weather or climate prediction) and are in charge of disseminating the data. Public sector organizations can operate at a national level, e.g. the National Aeronautics and Space Administration (NASA), an independent agency of the U.S. federal government, or be intergovernmental, like the European Space Agency (ESA), which coordinates the space programs of 22 member states, or the European Centre for Medium-Range Weather Forecasts (ECMWF). There are also private sector companies that operate their own fleets of satellites, such as Maxar (previously DigitalGlobe), Planet Labs and EU Space Imaging.

5.1.2.  Cloud Service or Platform Providers

Cloud Service or Platform Providers can be publicly-funded organizations or commercial companies offering different types of cloud services or platforms.

Amazon Web Services, Microsoft Azure, and Google Cloud Platform are examples of popular commercial cloud vendors. The European Open Science Cloud and the Copernicus Data and Information Access Services (DIAS) are examples of publicly-funded cloud services.

Platforms for big EO Data Management and Analysis are defined by Gomes et al. (2020) as ‘computational solution that provide functionalities for big EO data management, storage and access; that allow the processing on server side without having to download big amounts of EO data sets; and that provide a certain level of data and processing abstractions for EO community users and researchers’. Examples of commercially developed EO platforms that provide access to (value-added) EO data and processing are the Euro Data Cube and Ellip.

The distinction between EO data providers and cloud service or platform providers is not always clear-cut. An EO data provider can also be a cloud service provider, as is the case for the European Organisation for the Exploitation of Meteorological Satellites (EUMETSAT) and the European Centre for Medium-Range Weather Forecasts (ECMWF). Both organizations provide large volumes of satellite- and model-based data and are at the same time involved in operating the Copernicus DIAS service WEkEO. Additionally, both organizations are developing the European Weather Cloud (EWC), a dedicated cloud for national weather organisations and researchers of the member states of ECMWF and EUMETSAT.

EO data providers have also developed EO platforms, which makes them platform providers at the same time. Examples are the Thematic Exploitation Platforms (TEP) developed by ESA, the Joint ESA-NASA Multi-Mission Algorithm and Analysis Platform (MAAP) (Albinet et al. [1]), and the Climate Data Store Toolbox developed by the ECMWF.

On the other hand, cloud service or platform providers can also become EO data providers. Both Amazon and Google, for example, run Public Dataset Programs, where they provide access to open datasets, e.g. Copernicus satellite data. The data originator remains the original EO data provider, but the organisation that provides ‘access’ to the data in this case is the cloud service provider. The same applies to third party platform providers, such as Google Earth Engine or the providers of the Open Data Cube. By developing and managing the platform, they become an EO data provider for the platform users.

5.1.3.  Intermediate Users

Intermediate users are technical experts working in private companies or in publicly funded organizations, for example universities or research organizations, building the bridge between end users on one side, and EO data providers, cloud service and platform providers on the other side. Intermediate users are subject matter experts and have the required technical skills to access, handle, process and analyze EO data in order to retrieve the required information for end users.

5.1.4.  End Users

End users are policy and decision makers who need information gained from the analysis of EO data in the form of a map, graph, or number, summarized in a report. End users do not have the required technical skills to be able to process and analyze EO data, but have expert knowledge in a specific application domain.

5.2.  Evolution of EO System Architectures

The continued exponential growth of big EO data, combined with technological advancement, leads to fundamental changes in how and where EO data is stored and how users access, process, and use the data. The traditional ‘data-centric’ approach limits the full uptake and use of open EO data, as large volumes of EO data are copied to and processed on local machines (see Figure 1). To address the evolving need to minimize data duplication and offer scalable processing capabilities, the ‘moving code’ paradigm has emerged, which promotes storing large volumes of EO data as cloud objects and accessing, processing and analyzing the data on (cloud-based) servers. In general, cloud systems offer effective and scalable processing capabilities, but require advanced technical understanding from users. For this reason, more ‘user-centric’ approaches have been developed, which aim to provide advanced access to and processing of EO data while hiding technical complexities. Gomes et al. (2020) call these solutions ‘Platforms for big EO Data Management and Analysis’.

The current landscape of EO system architectures offers services at different stages of this evolution, from traditional EO system architectures, to ‘user centric’ platforms, to highly advanced cloud services that allow entire applications to be executed on data stored in the cloud, and federations of multiple cloud-based systems. Federations of cloud-based systems, however, are currently a vision towards which EO system architectures are expected to evolve. The role of the end user remains constant in all approaches; for this reason, we do not elaborate on end users explicitly.
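The difference between the ‘data-centric’ and the ‘moving code’ approach can be made concrete with a toy model. The following is a minimal sketch, not a description of any real platform: the in-memory archive, the scene names, the 8-byte value size and both function names are hypothetical, and the size of the submitted code itself is ignored.

```python
from typing import Callable, Dict, List, Tuple

BYTES_PER_VALUE = 8  # assumption: each pixel value is a 64-bit float

# Toy archive held by a data provider: scene id -> pixel values.
ARCHIVE: Dict[str, List[float]] = {
    f"scene-{i}": [float(i)] * 1_000 for i in range(10)
}

Analysis = Callable[[List[float]], float]


def data_centric(analysis: Analysis) -> Tuple[Dict[str, float], int]:
    """Copy every scene to the 'local machine', then analyze it there."""
    results: Dict[str, float] = {}
    transferred = 0
    for scene_id, pixels in ARCHIVE.items():
        local_copy = list(pixels)  # simulates the download
        transferred += len(local_copy) * BYTES_PER_VALUE
        results[scene_id] = analysis(local_copy)
    return results, transferred


def moving_code(analysis: Analysis) -> Tuple[Dict[str, float], int]:
    """Run the analysis next to the data; only the results are transferred."""
    results = {scene_id: analysis(pixels) for scene_id, pixels in ARCHIVE.items()}
    transferred = len(results) * BYTES_PER_VALUE
    return results, transferred


def mean(values: List[float]) -> float:
    return sum(values) / len(values)


local_results, local_bytes = data_centric(mean)
remote_results, remote_bytes = moving_code(mean)
print(f"data-centric: {local_bytes} bytes moved")
print(f"moving code:  {remote_bytes} bytes moved")
```

Both functions return the same results; in this toy setup the only difference is how many bytes leave the archive.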

5.2.1.  Traditional ‘Data-Centric’ approach

The majority of data users still follow the traditional approach, where large volumes of data are downloaded and processed on local machines. In this scenario, different EO data providers (e.g. of satellite, model-based and in-situ data) are responsible for managing the data and disseminating it via a download service. Intermediate users, e.g. researchers or commercial companies, make a copy of the data and pre-process and analyze it on their local machines. With the growing volumes of EO data, this approach has become increasingly cumbersome and limits the full uptake and use of EO data.

Figure 1 — Traditional data-centric approach, where large volumes of EO data are copied to and processed on local machines

5.2.2.  Traditional Approach With Cloud Storage

The growing volumes of EO data increase the need for data providers to manage EO data more effectively and to offer programmatic data access, e.g. through an Application Programming Interface (API). Many EO data providers store their data archives on a cloud service (either their own cloud implementation or services from commercial cloud vendors), which data users can access via an API. How intermediate users access the data does not change substantially in this scenario: they still follow the traditional approach of ‘bulk-downloading’ large volumes of EO data to their local machines. The only change is the location where the data is stored, of which the intermediate user may or may not be aware. Examples of this approach are the Copernicus Climate Data Store implemented by the European Centre for Medium-Range Weather Forecasts, as well as the public dataset programs Google Cloud Public Datasets and Earth on AWS.

Figure 2 — Traditional approach with cloud storage

5.2.3.  Innovative ‘User Centric’ Approach — EO Data Management and Analysis Platforms

Platforms for EO data management and analysis have been developed based on the need to offer advanced access to and processing of EO data, while hiding technical complexities under a layer of abstraction. Different platforms and approaches have been developed. Examples of EO platforms are the Thematic Exploitation Platforms (TEP) developed by ESA, the Joint ESA-NASA Multi-Mission Algorithm and Analysis Platform (MAAP) (Albinet et al. [1]), the Open Data Cube (Killough, 2018), as well as commercial platforms such as Google Earth Engine (Gorelick et al. 2018). An example from the climate community is the Climate Data Store Toolbox.

Figure 3 — User-centric approach with dedicated platforms for EO data management and analysis

These platforms represent a new ‘user centric’ approach, where users and applications are brought to the data in order to process and analyze it. Platforms differ in the underlying technology they use, level of openness, the level of abstraction, and the data and functionalities they offer. According to the charter of the OGC EO Exploitation Platform DWG, the platforms share a common set of functionalities:

  • Cataloguing and searching;

  • Storage and access;

  • Visualization;

  • Data processing and analysis; and

  • User authentication, authorization, and accounting.

However, since current platforms for EO data management and analysis have been developed independently by public organizations and commercial companies, these platforms are not interoperable. In other words, they do not use a common set of interfaces and data formats for implementing the aforementioned functionalities. Additionally, the layer of abstraction reduces flexibility, and users may be constrained to the data and functionalities these platforms offer.

Regarding the EO data value chain, these platforms introduce platform providers as an additional stakeholder. A platform provider may be the EO data provider itself, but is more likely a different team within the same organization or agency, or a third-party organization or company. The responsibility for managing, maintaining and provisioning the platform shifts from the EO data provider to the platform provider. On the backend, such platforms may be operated in a cloud service, but may also be managed on a company server.

5.2.4.  OGC Activities Driving the Development of Interoperable EO Exploitation Platforms

The OGC has been driving the development of an interoperable Applications-to-the-Data EO platform architecture through multiple OGC Innovation Program initiatives (see Figure 4). Further initiatives — such as OGC Testbed-17 and the OGC Disasters Pilot 2021 — as well as the OGC EO Exploitation Platform Domain Working Group (DWG) will continue to improve and define the architecture in more detail.

Figure 4 — OGC Innovation Program initiatives that drove the development towards an interoperable Applications-to-the-Data EO platform architecture

At present, it is unclear which set of standards and specifications the OGC defines or recommends for realizing an interoperable EO platform. Defining this set is among the key activities of the OGC EO Exploitation Platform DWG. The working group has started writing an OGC Best Practice document for Earth Observation Application Package implementation (OGC 20-089). The document will focus on the application package concept, but will also address the deployment viewpoint, i.e. how to deploy a package within a platform.

Relevant standards, specifications, and technologies appear to include, but are not limited to:

  • Common Workflow Language (CWL) — To describe an application package (input and output parameters, invocation) as well as to define complex application workflows.

  • SpatioTemporal Asset Catalog (STAC) — Used as a data manifest for the metadata of application inputs and outputs.

  • Docker — A Docker container encapsulates the implementation of an application and can be executed in different cloud environments. A container registry (Docker Hub is one example) provides a common means to store and download container images. A container image can potentially also be built ad hoc, in order to reduce security concerns when downloading and executing pre-built third-party applications.

  • OGC API — Processes and OGC Web Processing Service (WPS) — Provide a standard interface for deploying and executing an application. Deployment requires the transactional extension of the API / service. The interface is used by both EMS and ADES, which may require further profiling of the generic processing interface to fully specify the specific interactions and functions of the two services.

  • OGC OpenSearch, with Geo & Time extensions, with EO extensions — For discovery and cataloguing.
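
As a rough illustration of how the technologies listed above could fit together, the following Python sketch builds three JSON documents: a simplified CWL tool description that points to a container image, a minimal STAC Item acting as the data manifest for an input, and a request body in the style of an OGC API Processes execute request. This is a sketch under stated assumptions, not the application package defined by OGC 20-089. The image name, entry point, URLs, identifiers, and the input name input_item are all hypothetical, and a real application package would be considerably more detailed.

```python
import json


def make_cwl_tool(docker_image: str) -> dict:
    """Simplified CWL CommandLineTool that runs inside a container."""
    return {
        "cwlVersion": "v1.2",
        "class": "CommandLineTool",
        "baseCommand": ["python", "/app/run.py"],  # hypothetical entry point
        "hints": {"DockerRequirement": {"dockerPull": docker_image}},
        "inputs": {
            # Hypothetical input: URL of a STAC Item describing the input data.
            "input_item": {"type": "string", "inputBinding": {"position": 1}},
        },
        "outputs": {
            "results": {"type": "Directory", "outputBinding": {"glob": "."}},
        },
    }


def make_stac_item(item_id: str, bbox: tuple, datetime_utc: str, asset_href: str) -> dict:
    """Minimal STAC Item (a GeoJSON Feature) used as a data manifest."""
    west, south, east, north = bbox
    return {
        "type": "Feature",
        "stac_version": "1.0.0",
        "id": item_id,
        "bbox": [west, south, east, north],
        "geometry": {
            "type": "Polygon",
            # Closed ring derived from the bounding box.
            "coordinates": [[
                [west, south], [east, south], [east, north],
                [west, north], [west, south],
            ]],
        },
        "properties": {"datetime": datetime_utc},
        "links": [],
        "assets": {
            "data": {"href": asset_href, "type": "image/tiff; application=geotiff"},
        },
    }


def make_execute_request(stac_item_url: str) -> dict:
    """Request body that passes the STAC Item URL as a process input."""
    return {"inputs": {"input_item": stac_item_url}}


if __name__ == "__main__":
    # All names and URLs below are placeholders for illustration only.
    tool = make_cwl_tool("registry.example.org/eo/ndvi:1.0")
    item = make_stac_item(
        "scene-1", (10.0, 45.0, 11.0, 46.0),
        "2021-06-01T10:00:00Z", "https://data.example.org/scene-1.tif",
    )
    request = make_execute_request("https://data.example.org/stac/scene-1.json")
    for document in (tool, item, request):
        print(json.dumps(document, indent=2))
```

In such a setup, the request body would typically be sent with HTTP POST to the execution endpoint of a deployed process, while the CWL document would be supplied when deploying the application through the transactional extension. The exact resource paths and payloads depend on the version of the standard and the profile a platform implements.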

5.2.5.  Cloud-Based Data Access and Processing

Compared to EO data management and analysis platforms, cloud-based systems represent a more flexible approach to the ‘moving-code paradigm’. They provide scalable and effective access to and processing of EO data in the cloud and eliminate the need to download massive amounts of data. Intermediate users get more functionality than on EO data platforms, although the data selection is often still pre-defined. On the other hand, cloud-based systems demand deeper technical knowledge of the system configuration, which poses a challenge to many EO data users.

Many large data organizations are currently either moving their data holdings to commercial cloud providers (e.g. NASA), or are in the process of implementing their own cloud service (e.g. ECMWF). In the latter scenario, data providers become cloud service providers at the same time. Cloud-based services nevertheless differ in their functionalities and specialization. There are general cloud services from commercial providers, e.g. Amazon Web Services or Google Cloud Platform, which offer high flexibility but demand considerable technical knowledge from users. With the paradigm change in EO, more specialized cloud services tailored to EO access and management are being implemented, such as the Copernicus Data and Information Access Services (DIAS) or the European Weather Cloud implemented by ECMWF / EUMETSAT. They offer the same flexibility as more general cloud services in terms of virtual machine (VM) specification, but additionally offer easier access to EO data holdings as well as community-specific tools that facilitate EO data processing.

Instead of downloading EO data and processing it on local machines, intermediate users specify a virtual machine in the cloud, use it to access and process the data, and transfer only intermediate or end results to their local machine. This represents a true paradigm change in how users access and process EO data. However, it also entails challenges on different levels, including insufficient expertise in cloud-based systems, general skepticism about several aspects of cloud security, and limited transparency about how processing costs may evolve. For these reasons, strong efforts have to be undertaken to strengthen capacities in cloud-based services in general, while building overall trust in the security of cloud services and a good understanding of emerging processing costs.
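
The difference between the two access patterns can be made concrete with a small back-of-the-envelope calculation. The Python sketch below compares the data volume that reaches the user's local machine when all scenes are downloaded with the volume when processing happens on a cloud VM and only the result is transferred. All numbers (500 scenes, 800 MB per scene, a 50 MB result) are hypothetical values chosen for illustration and are not taken from this report.

```python
MB = 10**6  # bytes per megabyte (decimal)
GB = 10**9  # bytes per gigabyte (decimal)


def transferred_bytes(n_scenes: int, scene_bytes: int,
                      result_bytes: int, process_in_cloud: bool) -> int:
    """Bytes that cross the network to the user's local machine."""
    if process_in_cloud:
        # Only the intermediate or end result leaves the cloud.
        return result_bytes
    # Traditional approach: every scene is downloaded before processing.
    return n_scenes * scene_bytes


# Hypothetical workload, for illustration only.
n_scenes = 500
scene_bytes = 800 * MB
result_bytes = 50 * MB

local = transferred_bytes(n_scenes, scene_bytes, result_bytes, process_in_cloud=False)
cloud = transferred_bytes(n_scenes, scene_bytes, result_bytes, process_in_cloud=True)

print(f"download and process locally: {local / GB:.1f} GB")  # 400.0 GB
print(f"process in the cloud:         {cloud / GB:.2f} GB")  # 0.05 GB
print(f"reduction factor:             {local // cloud}")     # 8000
```

The sketch only covers transfer volume. It says nothing about processing or egress costs, which depend on the pricing model of the cloud provider.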

Figure 5 — Cloud-based data access and processing

5.2.6.  Future: Federations and Interoperability of Cloud-Based Systems

A natural evolution of cloud-based EO systems is that interactions between them will increase, with systems sharing data and processing in order to support the information needs of users. Cloud federations will thus emerge, providing the ability to flexibly share data, applications and processing resources between multiple cloud-based systems.