I. Abstract
The Earth Observation Cloud Platform Concept Development Study (CDS) evaluates the readiness of satellite data providers and cloud service providers, as well as the maturity of their current systems, with regard to real-world deployment of the new “Applications-to-the-Data” paradigm, using cloud environments for EO data storage, processing, and retrieval.
II. Executive Summary
Earth Observation (EO) data include data from satellites, but also in-situ as well as model-based data. The volume of EO data has drastically increased in recent years. An increasing number of satellites and improved capabilities (better spatial and temporal resolution) of new imaging sensors and weather and climate models have led to an exponential growth in the daily EO data stream. EO data providers as well as EO data users are faced with the challenge of managing, processing and handling the continued increase in EO data. The growing data volume, combined with the variety of EO data and the velocity at which the data is being made available, calls for fundamental changes in the traditional modality of how EO data is disseminated and how users process and analyze the data. At this volume, EO data can no longer be downloaded and processed on local machines. New EO data analysis workflows focus on ‘bringing applications to the data’, where EO data is accessed and processed in a cloud-based environment. Cloud-based services provide a highly scalable and flexible computing environment, where storage and computing resources can be acquired as needed, and where overall IT costs for working with EO data can be significantly reduced.
Cloud-based systems to store, process, analyze, and make EO data accessible are a paradigm change and disrupt the traditional EO data dissemination and analysis workflow.
The Earth Observation Cloud Platform Concept Development Study (CDS) evaluates the readiness of satellite data providers and cloud service providers, as well as the maturity of their current systems, with regard to real-world deployment of the new “Applications-to-the-Data” paradigm, using cloud environments for EO data storage, processing, and retrieval. The study highlights the results of three activities: (i) a dedicated “EO Technologies Show and Tell” workshop in December 2020, (ii) online meetings with a number of stakeholders in January and February 2021, and (iii) a literature review.
The study report documents the evolution of EO system architectures and covers common aspects of these architectures and platforms — data coverage and transmission, storage, discovery, access, as well as security — and the current status of systems from satellite data as well as cloud service providers. A number of topics were identified that satellite data and cloud service providers intend to address in the near future, such as improving interoperability, continuing the migration into the cloud, and expanding the range of available toolsets for analyzing EO data in the cloud. A number of challenges and recommendations have been identified, as well as lessons learned. They include, but are not limited to: the need for more interoperability in cloud-based applications, the lack of policies for data sharing in case of disasters, and the need for training and capacity building to develop the skills necessary for working with EO data in a cloud-based environment.
The study reveals that satellite data providers are moving towards cloud computing, and implementing the applications-to-the-data paradigm. Right now, the major focus for many providers is to make EO data accessible in the cloud. Others already process, analyze and disseminate EO data in the cloud.
Processing in the cloud removes the need to download large volumes of EO data, which decreases the total time it takes to analyze the data. Furthermore, the cloud provides the computing capabilities needed for processing such huge EO datasets, which local computing environments rarely support. Although the costs of using the cloud can be hard to specify up-front, gradually migrating a system into the cloud and gaining hands-on experience through well-defined projects can help build up the necessary expertise.
Cloud-based EO systems are a way forward to better manage, provision and process vast amounts of EO data. However, some problems that are typical for any IT system, such as technical and information interoperability, require additional work. The study shows that bringing applications to the data in the cloud is the way to process large amounts of EO data in the future.
Cloud-based EO system architectures will enable the monitoring of Planet Earth and its ecosystems, as well as the study of the impact of climate initiatives and programs. They will make simulations and forecasts on a global scale possible, as well as the timely provisioning of EO data in case of disasters, such as forest fires and flooding. Cloud-based EO systems will play a vital role in supporting the U.S. engagement to address climate change (as outlined in the Executive Order on tackling the climate crisis, e.g. Sec. 211 d) and Sec. 222 b) (ii)) and the goals of the European strategy for data, including the Destination Earth initiative.
III. Keywords
The following are keywords to be used by search engines and document catalogues.
ogcdoc, OGC document, EO Cloud Platform, CDS, ER
IV. Preface
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. The Open Geospatial Consortium shall not be held responsible for identifying any or all such patent rights.
Recipients of this document are requested to submit, with their comments, notification of any relevant patent claims or other intellectual property rights of which they may be aware that might be infringed by any implementation of the standard set forth in this document, and to provide supporting documentation.
V. Security considerations
No security considerations have been made for this document.
VI. Submitting Organizations
The following organizations submitted this Document to the Open Geospatial Consortium (OGC):
- interactive instruments GmbH
VII. Submitters
All questions regarding this document should be directed to the editor or the contributors:
Name | Organization | Role |
---|---|---|
Johannes Echterhoff | Interactive Instruments | Editor/Contributor |
Julia Wagemann | Consultant | Editor/Contributor |
Josh Lieberman | OGC | Editor |
VIII. Acknowledgements
We would like to thank the following companies and organizations who provided inputs for this study.
Company / Organization | Contact |
---|---|
Amazon Web Services (AWS) | Mark Korver |
European Space Agency (ESA) | Albrecht Schmidt, Cristiano Lopes |
European Organisation for the Exploitation of Meteorological Satellites (EUMETSAT) | Joana Miguens, Peter Albert |
CERN | Joao Fernandes |
Fisheries and Oceans Canada (DFO) | Tobias Spears |
Maxar Technologies | Kumar Navulur |
Mercator Ocean International | Alain Arnaud |
National Aeronautics and Space Administration (NASA) | Chris Lynnes |
Natural Resources Canada (NRCan) | Brian Low, Ryan Ahola, Will Mackkinnon |
Planet Labs | Chris Holmes, Quinn Scripter |
Earth Observation Cloud Platform Concept Development Study Report
1. Normative references
The following documents are referred to in the text in such a way that some or all of their content constitutes requirements of this document. For dated references, only the edition cited applies. For undated references, the latest edition of the referenced document (including any amendments) applies.
ISO: ISO 19115-1:2014, Geographic information — Metadata — Part 1: Fundamentals. International Organization for Standardization, Geneva (2014). https://www.iso.org/standard/53798.html
ISO: ISO 19115-2:2019, Geographic information — Metadata — Part 2: Extensions for acquisition and processing. International Organization for Standardization, Geneva (2019). https://www.iso.org/standard/67039.html
ISO: ISO 19157:2013, Geographic information — Data quality. International Organization for Standardization, Geneva (2013). https://www.iso.org/standard/32575.html
ISO: ISO 19165-1:2018, Geographic information — Preservation of digital data and metadata — Part 1: Fundamentals. International Organization for Standardization, Geneva (2018). https://www.iso.org/standard/67325.html
ISO: ISO 19165-2:2020, Geographic information — Preservation of digital data and metadata — Part 2: Content specifications for Earth observation data and derived digital products. International Organization for Standardization, Geneva (2020). https://www.iso.org/standard/73810.html
ISO/IEC: ISO/IEC 19941:2017, Information technology — Cloud computing — Interoperability and portability. International Organization for Standardization and International Electrotechnical Commission, Geneva (2017). https://www.iso.org/standard/66639.html
Graham Vowles: OGC 06-004r4, Topic 18 — Geospatial Digital Rights Management Reference Model (GeoDRM RM). Open Geospatial Consortium (2007). https://portal.ogc.org/files/?artifact_id=17802
2. Terms, definitions and abbreviated terms
This document uses the terms defined in OGC Policy Directive 49, which is based on the ISO/IEC Directives, Part 2, Rules for the structure and drafting of International Standards. In particular, the word “shall” (not “must”) is the verb form used to indicate a requirement to be strictly followed to conform to this document and OGC documents do not use the equivalent phrases in the ISO/IEC Directives, Part 2.
This document also uses terms defined in the OGC Standard for Modular specifications (OGC 08-131r3), also known as the ‘ModSpec’. The definitions of terms such as standard, specification, requirement, and conformance test are provided in the ModSpec.
For the purposes of this document, the following additional terms and definitions apply.
2.1. Terms and definitions
2.1.1. Cloud Computing
Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction. This cloud model is composed of five essential characteristics (on-demand self-service, broad network access, resource pooling, rapid elasticity, measured service), three service models (Software as a Service, Platform as a Service, Infrastructure as a Service), and four deployment models (private, community, public, and hybrid cloud). Source: The NIST Definition of Cloud Computing.
2.1.2. Earth Observation
Data and information collected about our planet, whether atmospheric, oceanic or terrestrial. This includes space-based or remotely-sensed data, as well as ground-based or in situ data, and computer model based data such as weather forecasts. Based on https://earthobservations.org/geo_wwd.php (accessed on 2021-01-14), extended to include model-based data.
2.2. Abbreviated terms
Abbreviation | Meaning |
---|---|
ADES | Application Development and Execution Service |
AIS | Automatic Identification System |
API | Application Programming Interface |
ARD | Analysis Ready Data |
AWS | Amazon Web Services |
CDS | Concept Development Study |
CEOS | Committee on Earth Observation Satellites |
CFRA | Cloud Federation Reference Architecture |
CMR | Common Metadata Repository |
COG | Cloud-optimized GeoTIFF |
CRR | Cross Region Replication |
CWL | Common Workflow Language |
DaaS | Data-as-a-Service |
DCS | Data Centric Security |
DIAS | Data and Information Access Services |
DRM | Digital Rights Management |
DWG | Domain Working Group |
EC | European Commission |
ECMWF | European Centre for Medium-Range Weather Forecasts |
EMS | Execution Management Service |
EO | Earth Observation |
EODC | Earth Observation Data Centre |
EODMS | EO Data Management System |
EOSC | European Open Science Cloud |
EOSDIS | Earth Observing System Data and Information System |
ESA | European Space Agency |
EU | European Union |
EUMETSAT | European Organisation for the Exploitation of Meteorological Satellites |
EWC | European Weather Cloud |
GCMD | Global Change Master Directory |
HDA | Harmonized Data Access |
HPC | High Performance Computing |
IaaS | Infrastructure-as-a-Service |
IDN | International Data Network |
JP2 | JPEG 2000 |
MAAP | Multi-Mission Algorithm and Analysis Platform |
NASA | National Aeronautics and Space Administration |
NISAR | NASA-Indian Space Research Organisation Synthetic Aperture Radar |
NIST | National Institute of Standards and Technology |
NITF | National Imagery Transmission Format |
ODIS | Ocean Data and Information Section |
PaaS | Platform-as-a-Service |
RCM | RADARSAT Constellation Mission |
SaaS | Software-as-a-Service |
SNAP | Sentinel Application Platform |
STAC | Spatio Temporal Asset Catalog |
STEP | Science Toolbox Exploitation Platform |
SWOT | Surface Water Ocean Topography |
TB | Terabyte |
TEP | Thematic Exploitation Platform |
UMM | Unified Metadata Model |
USGS | United States Geological Survey |
WPS | Web Processing Service |
3. Overview
The report is structured as follows:
- The EO system architectures chapter includes a discussion of relevant stakeholder groups. It covers common aspects of architectures and platforms: data coverage and transmission, storage, discovery, access, as well as security.
- The current status of systems chapter concerns satellite data as well as cloud service providers.
- The future evolution chapter documents which topics satellite data and cloud service providers intend to address in the near future.
- The recommendations chapter documents a number of challenges, recommendations and lessons learned in the course of the study.
- The conclusions chapter considers the readiness and maturity of current EO systems regarding the adoption of the “Applications-to-the-Data” paradigm.
4. Introduction
Satellites collect vast amounts of Earth Observation (EO) data every day. Increasing numbers of satellites and improved capabilities of new imaging sensors have led to an exponential growth in the daily EO data stream. In 2019, for example, over 18 TB of EO data from Copernicus, the European Union’s EO program, were published on a daily basis under a full, free and open data license (European Space Agency 2020). Taking into account that EO data includes not only space-based data, but also remotely sensed, in-situ, as well as computer model based data, EO data providers as well as EO data users are faced with the challenge of managing, processing and making use of this data flood. The growing data volume, combined with the variety of EO data and the velocity at which the data is being made available, calls for fundamental changes in the traditional modality of how EO data is disseminated and how users process and analyze the data. EO data at these volumes can no longer be downloaded and processed on local machines. New EO data analysis workflows focus on ‘bringing applications to the data’, where EO data is accessed and processed in a cloud environment.
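To make the scale of the download problem concrete, the following minimal sketch estimates how long it would take to transfer one day of Copernicus data (the roughly 18 TB per day quoted above) over a single network link. The link speeds are illustrative assumptions, not measurements, and the estimate assumes the link is fully and continuously used.

```python
# Back-of-the-envelope estimate of the time needed to download one day of
# Copernicus data. Only the 18 TB figure comes from the text; the link
# speeds are hypothetical values chosen for illustration.

DAILY_VOLUME_TB = 18       # daily Copernicus volume in 2019, as quoted above
BYTES_PER_TB = 10**12      # decimal terabyte
SECONDS_PER_DAY = 86_400


def download_days(volume_tb: float, link_mbit_per_s: float) -> float:
    """Days needed to transfer volume_tb over a link of the given speed,
    assuming the link is saturated for the whole transfer."""
    bits = volume_tb * BYTES_PER_TB * 8
    seconds = bits / (link_mbit_per_s * 10**6)
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    for speed in (100, 1_000, 10_000):  # assumed link speeds in Mbit/s
        days = download_days(DAILY_VOLUME_TB, speed)
        print(f"{speed:>6} Mbit/s: {days:6.2f} days per day of data")
```

Under these assumptions, a 100 Mbit/s link needs more than 16 days to fetch a single day of data, and even a 1 Gbit/s link falls behind, needing about 1.7 days per day of data.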
Cloud-based services as a means to store, process, analyze, and make EO data accessible are a paradigm change and disrupt the traditional EO data dissemination and analysis workflow.
Cloud services vary widely in their capabilities, protocols, business models, and legal policies. There are services offered as Infrastructure (IaaS)- and/or Platform-as-a-Service (PaaS) from commercial cloud vendors, such as Amazon Web Services or Google Cloud, and from publicly-funded bodies, such as the Copernicus Data and Information Access Services (DIAS) or the European Open Science Cloud (EOSC). Other cloud services offer more Data (DaaS)- or Software-as-a-Service (SaaS) capabilities, such as the Google Earth Engine or the Copernicus Climate/Atmosphere Data Stores. The level of specialization of IaaS/PaaS services is low, which provides wide flexibility for different applications, but requires system architecture knowledge. SaaS / DaaS services are more specialized to specific application areas or data and require more subject-matter expert knowledge.
The Earth Observation Cloud Platform Concept Development Study (CDS) evaluates the readiness of satellite data providers and cloud service providers, as well as the maturity of their current systems, with regard to real-world deployment of the new “Applications-to-the-Data” paradigm, using cloud environments for EO data storage, processing, and retrieval. The study comprised a dedicated “EO Technologies Show and Tell” workshop in December 2020, online meetings with a number of stakeholders in January and February 2021, and a literature review.
NOTE The CDS intentionally does not cover mission planning and tasking (e.g. of satellites); the focus is on data storage, access, and processing.
5. EO System Architecture — Stakeholders and Evolution
The growing volumes of EO data require new approaches to disseminate, access and process the data. The traditional workflow of how EO data is disseminated, processed, and analyzed is currently undergoing significant change. In the following, we first present the stakeholder groups of the EO system architecture and then the different stages of the EO system architecture evolution.
5.1. Stakeholders
The Committee on Earth Observation Satellites (CEOS) identifies three key stakeholder groups of the EO data value chain in their ARD Strategy paper: (i) EO data providers (public and private), (ii) big data hosts and aggregators and (iii) data users. These stakeholder groups are mainly defined based on the value chain for EO data from satellites. For the purposes of this study, EO data encompasses in-situ as well as model-based EO data, in addition to satellite data. Furthermore, with the advent of cloud-based services and platforms for EO data management and analysis, the EO system architecture landscape has diversified, and the three stakeholder groups defined by CEOS only partially reflect the EO data value chain. For this reason, rather than combining big data hosts and aggregators, we believe that the ‘aggregators’, as CEOS defines them, should be considered data users. We therefore propose to differentiate between two groups of data users: intermediate users and end users. This differentiation is also reflected in the Copernicus Market Report 2019.
Below, we describe the following four stakeholder groups: (i) EO data providers, (ii) cloud service / platform providers, (iii) intermediate users and (iv) end users.
5.1.1. EO Data Providers
EO data providers are public and private sector organizations that operate satellites or run models (e.g. for weather or climate prediction) and are in charge of disseminating the data. Public sector organizations can operate on a national level, e.g. the National Aeronautics and Space Administration (NASA), an independent agency of the U.S. federal government, or be intergovernmental, like the European Space Agency (ESA), which coordinates the space programs of 22 member states, or the European Centre for Medium-Range Weather Forecasts (ECMWF). There are also private sector companies that operate their own fleets of satellites, such as MAXAR (previously DigitalGlobe), Planet Labs and EU Space Imaging.
5.1.2. Cloud Service or Platform Providers
Cloud Service or Platform Providers can be publicly-funded organizations or commercial companies offering different types of cloud services or platforms.
Amazon Web Services, Microsoft Azure, and Google Cloud Platform are examples of popular commercial cloud vendors. The European Open Science Cloud and the Copernicus Data and Information Access Services (DIAS) are examples of publicly-funded cloud services.
Platforms for big EO Data Management and Analysis are defined by Gomes et al. (2020) as ‘computational solution that provide functionalities for big EO data management, storage and access; that allow the processing on server side without having to download big amounts of EO data sets; and that provide a certain level of data and processing abstractions for EO community users and researchers’. Examples of commercially developed EO platforms that provide access to (value-added) EO data and processing are the Euro Data Cube as well as Ellip.
The differentiation between EO data providers and cloud service or platform providers is not always distinct. An EO data provider can also be a cloud service provider, as it is in the case of the European Organisation for the Exploitation of Meteorological Satellites (EUMETSAT) or the European Centre for Medium-Range Weather Forecasts (ECMWF). Both organizations are data providers of large volumes of satellite- and model-based data and at the same time involved in operating the Copernicus DIAS service WEkEO. Additionally, both organizations are also developing the European Weather Cloud (EWC), a dedicated cloud for national weather organisations and researchers of ECMWF and EUMETSAT’s member states.
EO data providers have also developed EO platforms, which makes them platform providers at the same time, e.g. the Thematic Exploitation Platforms (TEP) developed by ESA, the Joint ESA-NASA Multi-Mission Algorithm and Analysis Platform (MAAP) (Albinet et al. [1]), and the Climate Data Store Toolbox developed by the ECMWF.
On the other hand, cloud service or platform providers can also become EO data providers. Both Amazon and Google, for example, run Public Dataset Programs, where they provide access to open datasets, e.g. Copernicus satellite data. The data originator remains the original EO data provider, but the organisation that provides ‘access’ to the data in this case is the cloud service provider. The same applies to third party platform providers, such as Google Earth Engine or the providers of the Open Data Cube. By developing and managing the platform, they become an EO data provider for the platform users.
5.1.3. Intermediate Users
Intermediate users are technical experts working in private companies or in publicly funded organizations, for example universities or research organizations, building the bridge between end users on one side, and EO data providers, cloud service and platform providers on the other side. Intermediate users are subject matter experts and have the required technical skills to access, handle, process and analyze EO data in order to retrieve the required information for end users.
5.1.4. End Users
End users are policy and decision makers who need information gained from the analysis of EO data in the form of a map, graph, or number, summarized in a report. End users do not have the required technical skills to be able to process and analyze EO data, but have expert knowledge in a specific application domain.
5.2. Evolution of EO System Architectures
The continued exponential growth of big EO data, combined with technological advancement, leads to fundamental changes in how and where EO data is stored and how users access, process, and use the data. The traditional ‘data-centric’ approach limits the full uptake and use of open EO data, as large volumes of EO data are copied to and processed on local machines (see Figure 1). In order to address the evolving need to minimize data duplication and offer scalable processing capabilities, the ‘moving code’ paradigm has emerged, which promotes the storage of large volumes of EO data as cloud objects and the access, processing and analysis of EO data on (cloud-based) servers. In general, cloud systems offer effective and scalable processing capabilities, but require advanced technical understanding from users. For this reason, more ‘user-centric’ approaches have been developed that aim to provide advanced access to and processing of EO data while hiding technical complexities. Gomes et al. (2020) call these solutions ‘Platforms for big EO Data Management and Analysis’.
The current landscape of EO system architectures offers services at different stages of evolution, from traditional EO system architectures to ‘user centric’ platforms to highly advanced cloud services that allow entire applications to be executed on data stored in the cloud, and federations of multiple cloud-based systems. Federations of cloud-based systems, however, are currently a visionary concept towards which EO system architectures will evolve in the future. The role of the end user remains constant in all approaches; for this reason, we do not explicitly elaborate on end users.
5.2.1. Traditional ‘Data-Centric’ approach
The majority of data users still tend to follow the traditional approach, where large volumes of data are downloaded and processed on local machines. In this scenario, different EO data providers (e.g. of satellite, model-based and in-situ data) are responsible for managing the data and disseminating it via a download service. Intermediate users are e.g. researchers or commercial companies who make a copy of the data and pre-process and analyze it on their local machines. With the growing volumes of EO data, this approach has become increasingly cumbersome and limits the full uptake and use of EO data.
Figure 1 — Traditional data-centric approach, where large volumes of EO data are copied to and processed on local machines
5.2.2. Traditional Approach With Cloud Storage
The growing volumes of EO data increase the need for data providers to manage EO data more effectively and to offer programmatic data access, e.g. through an Application Programming Interface (API). Many EO data providers store their data archives on a cloud service (either their own cloud implementation or through services from commercial cloud vendors), which is accessible via an API for data users. The modality of how intermediate users access data does not change substantially in this scenario. Intermediate users still follow the traditional approach by ‘bulk-downloading’ large volumes of EO data to their local machines. The only change is the location where the data is stored, of which the intermediate user might or might not be aware. Examples of this approach are the Copernicus Climate Data Store implemented by the European Centre for Medium-Range Weather Forecasts, as well as the public dataset programs Google Cloud Public Datasets and Earth on AWS.
Figure 2 — User-centric approach with dedicated platforms for EO data management and analysis
5.2.3. Innovative ‘User Centric’ Approach — EO Data Management and Analysis Platforms
Platforms for EO data management and analysis have been developed based on the need to offer advanced access to and processing of EO data, while hiding technical complexities under a layer of abstraction. Different platforms and approaches have been developed. Examples of EO platforms are the Thematic Exploitation Platforms (TEP) developed by ESA, the Joint ESA-NASA Multi-Mission Algorithm and Analysis Platform (MAAP) (Albinet et al. [1]), the Open Data Cube (Killough, 2018), as well as commercial platforms such as Google Earth Engine (Gorelick et al. 2018). An example from the climate community is the Climate Data Store Toolbox.
Figure 3 — User-centric approach with dedicated platforms for EO data management and analysis
These platforms represent a new ‘user centric’ approach, where users and applications are brought to the data in order to process and analyze it. Platforms differ in the underlying technology they use, level of openness, the level of abstraction, and the data and functionalities they offer. According to the charter of the OGC EO Exploitation Platform DWG, the platforms share a common set of functionalities:
- Cataloguing and searching;
- Storage and access;
- Visualization;
- Data processing and analysis; and
- User authentication, authorization, and accounting.
However, since current platforms for EO data management and analysis have been independently developed by public organizations and commercial companies, these platforms are not interoperable. In other words, they do not use a common set of interfaces and data formats for implementing the aforementioned functionalities. Additionally, the layer of abstraction reduces flexibility, and users might be constrained to the data or functionalities these platforms offer.
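What a common set of interfaces could mean in practice can be illustrated with a small sketch. The abstract class below is purely hypothetical: its name, methods and signatures are assumptions made for illustration and are not taken from any OGC specification or existing platform. It maps each of the five shared functionalities to one method and shows client code that is written once against the interface.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class EOPlatform(ABC):
    """Hypothetical common interface; not taken from any OGC specification."""

    @abstractmethod
    def authenticate(self, user: str, token: str) -> bool:
        """User authentication, authorization, and accounting."""

    @abstractmethod
    def search(self, keyword: str) -> List[str]:
        """Cataloguing and searching: return matching dataset identifiers."""

    @abstractmethod
    def access(self, dataset_id: str) -> List[float]:
        """Storage and access: return the values of one dataset."""

    @abstractmethod
    def process(self, dataset_id: str, operation: str) -> float:
        """Data processing and analysis, executed on the platform side."""

    @abstractmethod
    def visualize(self, dataset_id: str) -> str:
        """Visualization: return a (here: textual) rendering."""


class InMemoryPlatform(EOPlatform):
    """Toy implementation backed by a dictionary, for illustration only."""

    def __init__(self, datasets: Dict[str, List[float]]) -> None:
        self._datasets = datasets

    def authenticate(self, user: str, token: str) -> bool:
        return bool(user and token)  # placeholder check, not real security

    def search(self, keyword: str) -> List[str]:
        return sorted(d for d in self._datasets if keyword in d)

    def access(self, dataset_id: str) -> List[float]:
        return list(self._datasets[dataset_id])

    def process(self, dataset_id: str, operation: str) -> float:
        operations = {"min": min, "max": max}
        return operations[operation](self._datasets[dataset_id])

    def visualize(self, dataset_id: str) -> str:
        values = self._datasets[dataset_id]
        return f"{dataset_id}: {len(values)} values, range {min(values)}..{max(values)}"


def highest_value(platform: EOPlatform, keyword: str) -> float:
    """Client code written once against the interface, usable on any platform."""
    dataset_ids = platform.search(keyword)
    return max(platform.process(d, "max") for d in dataset_ids)
```

If two platforms implemented the same interface, `highest_value` would run unchanged on both. Without such an agreement, equivalent client code has to be rewritten for every platform, which is the interoperability gap described above.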
Regarding the EO data value chain, these platforms introduce platform providers as an additional stakeholder. Platform providers may be EO data providers, but most likely a different team in the organization or agency, or a third-party organization or company. The responsibility of managing, maintaining and provisioning the platform is shifted from the EO data provider to the platform provider. In the backend, such platforms may be operated in a cloud service, but may also be managed on a company server.
5.2.4. OGC Activities Driving the Development of Interoperable EO Exploitation Platforms
The OGC has been driving the development of an interoperable Applications-to-the-Data EO platform architecture through multiple OGC Innovation Program initiatives (see Figure 4). Further initiatives — such as OGC Testbed-17 and the OGC Disasters Pilot 2021 — as well as the OGC EO Exploitation Platform Domain Working Group (DWG) will continue to improve and define the architecture in more detail.
Figure 4 — OGC Innovation Program initiatives that drove the development towards an interoperable Applications-to-the-Data EO platform architecture
At present, it is unclear which set of standards and specifications the OGC defines or recommends for realizing an interoperable EO platform. Defining this set of standards and specifications is amongst the key activities of the OGC EO Exploitation Platform DWG. The working group started writing an OGC Best Practice document for Earth Observation Application Package implementation (OGC 20-089). The document will focus on the application package (concept), but will also address the deployment viewpoint, i.e. how to deploy a package within a platform.
Relevant standards, specifications, and technologies appear to be, but are not limited to:
- Common Workflow Language (CWL): To describe an application package (input and output parameters, invocation) as well as to define complex application workflows.
- Spatio Temporal Asset Catalog (STAC): Used as data manifest for the metadata of application inputs and outputs.
- Docker: A Docker container encapsulates the implementation of an application, and can be executed in different cloud environments. A Docker container registry (such as Docker Hub) provides a common means to store and download Docker container images. A Docker container image can potentially also be built ad-hoc, in order to lower security concerns when downloading and executing pre-built third-party applications.
- OGC API - Processes and OGC Web Processing Service (WPS): Provide a standard interface for deploying and executing an application. Deployment requires the transactional extension of the API / service. The interface is used by both EMS and ADES, which may require further profiling of the generic processing interface to fully specify the specific interactions and functions of the two services.
- OGC OpenSearch, with Geo & Time extensions, with EO extensions: For discovery and cataloguing.
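How two of these building blocks could fit together is sketched below: a STAC Item acts as the data manifest for one input scene, and an execute request in the style of OGC API - Processes passes that manifest to a deployed application by reference. This is a minimal sketch, not a definitive implementation. The field layout reflects our reading of the STAC 1.0.0 Item and the OGC API - Processes Part 1 execute request, while the identifiers, the input name `input_manifest` and all URLs are hypothetical. No network call is made; the code only builds the JSON documents.

```python
import json
from typing import Dict, Tuple


def make_stac_item(item_id: str, bbox: Tuple[float, float, float, float],
                   datetime_utc: str, asset_href: str) -> Dict:
    """Build a minimal STAC Item (data manifest) for one scene.
    All values passed in are illustrative."""
    west, south, east, north = bbox
    return {
        "type": "Feature",
        "stac_version": "1.0.0",
        "id": item_id,
        "bbox": [west, south, east, north],
        "geometry": {
            "type": "Polygon",
            # Closed ring: the first and last positions are identical.
            "coordinates": [[[west, south], [east, south], [east, north],
                             [west, north], [west, south]]],
        },
        "properties": {"datetime": datetime_utc},
        "links": [],
        "assets": {
            "data": {
                "href": asset_href,
                "type": "image/tiff; application=geotiff; profile=cloud-optimized",
            }
        },
    }


def make_execute_request(stac_item_href: str) -> Dict:
    """Build an execute request body that hands the manifest to the
    application by reference. The input name is a hypothetical example;
    a real application package defines its own inputs."""
    return {
        "inputs": {
            "input_manifest": {"href": stac_item_href, "type": "application/json"}
        },
        "response": "document",
    }


if __name__ == "__main__":
    item = make_stac_item("scene-001", (5.0, 50.0, 6.0, 51.0),
                          "2021-01-14T10:00:00Z",
                          "https://example.com/data/scene-001.tif")
    request = make_execute_request("https://example.com/stac/scene-001.json")
    print(json.dumps(item, indent=2))
    print(json.dumps(request, indent=2))
```

Passing the manifest by reference keeps the data in the cloud: only the small JSON request travels to the processing service, which then resolves the asset location itself.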
5.2.5. Cloud-Based Data Access and Processing
Compared to EO data management and analysis platforms, cloud-based systems represent a more flexible implementation of the ‘moving code’ paradigm. They provide scalable and effective access to and processing of EO data in the cloud and eliminate the need to download massive amounts of data. While intermediate users are offered more functionality than on EO data platforms, pre-defined data selections often remain. On the other hand, cloud-based systems demand deeper technical knowledge of the system configuration, which poses a challenge to many EO data users.
Many large data organizations are currently either moving their data holdings to commercial cloud providers (e.g. NASA), or are in the process of implementing their own cloud service (e.g. ECMWF). In this latter scenario, data providers turn into cloud service providers at the same time. Nevertheless, cloud-based services also differ in their functionalities and specialization. There are general cloud services from commercial providers, e.g. Amazon Web Services or Google Cloud Platform, offering high flexibility but demanding high technical knowledge from users. With the paradigm change in EO, more specialized cloud services tailored for EO access and management are being implemented, such as the Copernicus Data and Information Access Services (DIAS) or the European Weather Cloud implemented by ECMWF / EUMETSAT. They offer the same flexibility as more general cloud services in terms of VM specification, but offer facilitated access to EO data holdings as well as community specific tools facilitating EO data processing.
Instead of downloading EO data and processing it on local machines, intermediate users configure a virtual machine in the cloud, use it to access and process the data, and transfer only intermediate or end results to their local machine. This represents a true paradigm change in how users access and process EO data. However, it also entails challenges on different levels, including insufficient expertise in cloud-based systems, general skepticism about several aspects of cloud security, and limited transparency regarding the costs that processing may incur. For these reasons, strong efforts have to be undertaken to strengthen capacities in cloud-based services in general, while building up overall trust in the security of cloud services and a good understanding of the processing costs involved.
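The difference between the two workflows can be illustrated with a toy model that counts how many bytes leave the archive in each case. The `ToyArchive` class, its method names and the assumption of 8 bytes per value are all hypothetical; the sketch models no real cloud service and only contrasts copying the data with sending the computation to it.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

BYTES_PER_VALUE = 8  # assumption: every value is stored as a 64-bit float


@dataclass
class ToyArchive:
    """Stand-in for a cloud-hosted EO archive that counts outgoing bytes."""
    scenes: Dict[str, List[float]]
    bytes_transferred: int = 0

    def download(self, scene_id: str) -> List[float]:
        """Data-centric workflow: the whole scene is copied to the user."""
        data = self.scenes[scene_id]
        self.bytes_transferred += len(data) * BYTES_PER_VALUE
        return list(data)

    def run(self, func: Callable[[List[float]], float], scene_id: str) -> float:
        """Applications-to-the-data workflow: the function runs next to the
        data and only the single result value is sent back."""
        result = func(self.scenes[scene_id])
        self.bytes_transferred += BYTES_PER_VALUE
        return result


def scene_mean(values: List[float]) -> float:
    """Example analysis: the mean of all values in a scene."""
    return sum(values) / len(values)


def compare_workflows(n_values: int = 100_000):
    """Run the same analysis both ways and report results and traffic."""
    scenes = {"scene-a": [0.1 * i for i in range(n_values)]}

    local = ToyArchive(scenes)
    local_result = scene_mean(local.download("scene-a"))

    cloud = ToyArchive(scenes)
    cloud_result = cloud.run(scene_mean, "scene-a")

    return local_result, local.bytes_transferred, cloud_result, cloud.bytes_transferred


if __name__ == "__main__":
    local_result, local_bytes, cloud_result, cloud_bytes = compare_workflows()
    print(f"download first: result={local_result:.2f}, bytes moved={local_bytes}")
    print(f"run in cloud:   result={cloud_result:.2f}, bytes moved={cloud_bytes}")
```

Both workflows return the same result, but the traffic differs by the size of the scene: with the default settings, 800,000 bytes are moved in the download case and 8 bytes when the function runs next to the data.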
Figure 5 — Cloud-based data access and processing
5.2.6. Future: Federations and Interoperability of Cloud-Based Systems
A natural future evolution of EO cloud-based systems will be that interactions between these systems will increase, sharing data and processing in order to support the information needs of users. Thus, cloud federations will emerge, providing the ability to flexibly share data, applications and processing resources between multiple cloud-based systems.