Publication Date: 2019-02-04
Approval Date: 2018-12-13
Submission Date: 2018-06-14
Reference number of this document: OGC 18-038r2
Reference URL for this document: http://www.opengis.net/doc/PER/t14-D030
Category: Public Engineering Report
Editor: Tom Landry
Title: OGC Testbed-14: Machine Learning Engineering Report
COPYRIGHT
Copyright (c) 2019 Open Geospatial Consortium. To obtain additional rights of use, visit http://www.opengeospatial.org/
WARNING
This document is not an OGC Standard. This document is an OGC Public Engineering Report created as a deliverable in an OGC Interoperability Initiative and is not an official position of the OGC membership. It is distributed for review and comment. It is subject to change without notice and may not be referred to as an OGC Standard. Further, any OGC Engineering Report should not be referenced as required or mandatory technology in procurements. However, the discussions in this document could very well lead to the definition of an OGC Standard.
LICENSE AGREEMENT
Permission is hereby granted by the Open Geospatial Consortium, ("Licensor"), free of charge and subject to the terms set forth below, to any person obtaining a copy of this Intellectual Property and any associated documentation, to deal in the Intellectual Property without restriction (except as set forth below), including without limitation the rights to implement, use, copy, modify, merge, publish, distribute, and/or sublicense copies of the Intellectual Property, and to permit persons to whom the Intellectual Property is furnished to do so, provided that all copyright notices on the intellectual property are retained intact and that each person to whom the Intellectual Property is furnished agrees to the terms of this Agreement.
If you modify the Intellectual Property, all copies of the modified Intellectual Property must include, in addition to the above copyright notice, a notice that the Intellectual Property includes modifications that have not been approved or adopted by LICENSOR.
THIS LICENSE IS A COPYRIGHT LICENSE ONLY, AND DOES NOT CONVEY ANY RIGHTS UNDER ANY PATENTS THAT MAY BE IN FORCE ANYWHERE IN THE WORLD. THE INTELLECTUAL PROPERTY IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, AND NONINFRINGEMENT OF THIRD PARTY RIGHTS. THE COPYRIGHT HOLDER OR HOLDERS INCLUDED IN THIS NOTICE DO NOT WARRANT THAT THE FUNCTIONS CONTAINED IN THE INTELLECTUAL PROPERTY WILL MEET YOUR REQUIREMENTS OR THAT THE OPERATION OF THE INTELLECTUAL PROPERTY WILL BE UNINTERRUPTED OR ERROR FREE. ANY USE OF THE INTELLECTUAL PROPERTY SHALL BE MADE ENTIRELY AT THE USER’S OWN RISK. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR ANY CONTRIBUTOR OF INTELLECTUAL PROPERTY RIGHTS TO THE INTELLECTUAL PROPERTY BE LIABLE FOR ANY CLAIM, OR ANY DIRECT, SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES, OR ANY DAMAGES WHATSOEVER RESULTING FROM ANY ALLEGED INFRINGEMENT OR ANY LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR UNDER ANY OTHER LEGAL THEORY, ARISING OUT OF OR IN CONNECTION WITH THE IMPLEMENTATION, USE, COMMERCIALIZATION OR PERFORMANCE OF THIS INTELLECTUAL PROPERTY.
This license is effective until terminated. You may terminate it at any time by destroying the Intellectual Property together with all copies in any form. The license will also terminate if you fail to comply with any term or condition of this Agreement. Except as provided in the following sentence, no such termination of this license shall require the termination of any third party end-user sublicense to the Intellectual Property which is in force as of the date of notice of such termination. In addition, should the Intellectual Property, or the operation of the Intellectual Property, infringe, or in LICENSOR’s sole opinion be likely to infringe, any patent, copyright, trademark or other right of a third party, you agree that LICENSOR, in its sole discretion, may terminate this license without any compensation or liability to you, your licensees or any other party. You agree upon termination of any kind to destroy or cause to be destroyed the Intellectual Property together with all copies in any form, whether held by you or by any third party.
Except as contained in this notice, the name of LICENSOR or of any other holder of a copyright in all or part of the Intellectual Property shall not be used in advertising or otherwise to promote the sale, use or other dealings in this Intellectual Property without prior written authorization of LICENSOR or such copyright holder. LICENSOR is and shall at all times be the sole entity that may authorize you or any third party to use certification marks, trademarks or other special designations to indicate compliance with any LICENSOR standards or specifications.
This Agreement is governed by the laws of the Commonwealth of Massachusetts. The application to this Agreement of the United Nations Convention on Contracts for the International Sale of Goods is hereby expressly excluded. In the event any provision of this Agreement shall be deemed unenforceable, void or invalid, such provision shall be modified so as to make it valid and enforceable, and as so modified the entire Agreement shall remain in full force and effect. No decision, action or inaction by LICENSOR shall be construed to be a waiver of any rights or remedies available to it.
None of the Intellectual Property or underlying information or technology may be downloaded or otherwise exported or reexported in violation of U.S. export laws and regulations. In addition, you are responsible for complying with any local laws in your jurisdiction which may impact your right to import, export or use the Intellectual Property, and you represent that you have complied with any regulations or registration procedures required by applicable law to make this license enforceable.
- 1. Summary
- 2. References
- 3. Terms and definitions
- 4. Overview
- 5. AI, ML and DL landscape
- 6. Proof-of-concept
- 7. Approaches and good practices
- 8. Demonstration
- 9. Discussion and Open Issues
- Appendix A: ML System: Process description of ExecuteML
- Appendix B: ML System: Process description of RetrainML
- Appendix C: ML System: Process description of TrainML
- Appendix D: CVM: Default JSON response
- Appendix E: CVM: Full JSON response
- Appendix F: TB14 MoPoQ ML Task components
- Appendix G: Sample Pleiades Imagery
- Appendix H: Revision History
- Appendix I: Bibliography
1. Summary
This OGC Engineering Report (ER) describes the application and use of OGC Web Services (OWS) for integrating Machine Learning (ML), Deep Learning (DL) and Artificial Intelligence (AI) in the OGC Testbed-14 Modeling, Portrayal, and Quality of Service (MoPoQ) Thread. This report is intended to present a holistic approach on how to support and integrate emerging AI and ML tools using OWS, as well as publishing their input and outputs. This approach should seek efficiency and effectiveness of knowledge sharing.
This engineering report will describe: experiences, lessons learned, best practices for workflows, service interaction patterns, application schemas, and use of controlled vocabularies. It is expected that the description of workflows for geospatial feature extraction will be more complex than the implementations found in the deliverables.
1.1. Requirements & Research Motivation
The AI landscape is rapidly evolving, leading to new possibilities for geospatial analytics on ever larger Big Data. Current standards are being modernized to take into account these novel methods and the new data available in federated infrastructures. The goal of this work is to develop a holistic understanding of AI, ML, and DL in the geospatial context. This work is expected to advance or derive best practices for integrating ML, DL, and AI tools and principles in the context of OWS. The following figure presents an overview of the research motivations to be addressed.
Throughout their experiments, the participants of this task needed to answer the following questions:
- What are the main interface requirements?
- What are the main interactions between components?
- Does the proof of concept allow for support of all geospatial datatypes, including vector information?
- Are there interfaces or requirements specific to gridded imagery to introduce?
- Is the proof-of-concept independent of algorithm type?
Several approaches were proposed for consideration in this work:
- Consider training data handling and performance information of the ML tools
- Compare "move-the-algorithm-to-the-data" and "move-the-data-to-the-algorithm"
- Describe transparency and accountability principles
- Describe testing of ML algorithms on fully-scaled infrastructure
1.2. Prior-After Comparison
1.2.1. Main findings
In previous work, OGC prototyped a Web Image Classification Service (WICS) interface (OGC 05-017) [1] that defined the GetClassification, TrainClassifier and DescribeClassifier operations. WICS provided support for unsupervised classification, supervised classification using trained classifiers, and supervised classification using classifiers trained on client-supplied training data. In this work, the D165 Machine Learning (ML) system describes three operations: TrainML, RetrainML and ExecuteML. The ML Knowledge Base (KB) describes the GetModels, GetImages, GetFeatures, and GetMetadata operations, as well as their Store counterparts. The Controlled Vocabulary Manager (CVM) offered by the D166 Semantic Enablement of ML task enables searching of terms by specifying namespace, prefix, and other parameters using facet searches.
The work described in this ER improves on WICS [1] by presenting additional findings on one ML classifier and one Deep Learning (DL) classifier. Additionally, this work presents findings from another DL classifier mainly used to transfer learning from EO workflows onto pretrained DL models. Details on these additional ML systems can be found in the Approaches and good practices section. The use of three different ML systems in this task therefore allowed experimentation that further enabled definition of interoperable standards supporting ML, DL, and, by extension, AI.
The disaster concept considered in the Proof of Concept (POC) section has potential impacts and implications for numerous information communities such as Geospatial Intelligence (GeoINT) and Earth systems science. This work is therefore expected to support the OGC Disasters Interoperability Concept Study conducted simultaneously with Testbed-14. Initial concepts discussed very early in the ML task did consider the use of Internet-of-Things (IoT) sensor data, but this use was dismissed from implementations due to complexity. The POC demonstrates the use of both Very High Resolution (VHR) and Synthetic Aperture Radar (SAR) imagery, offering distinct spatiotemporal characteristics and enabling different applications. Section 5 offers a quick survey of the AI, ML, and DL landscape and briefly presents applications and challenges for GeoINT and Earth systems science. The reader can also refer to the bibliography for an overview of the literature considered.
1.2.2. How to best support ML and AI using OWS?
This ER notes the following uses of OWS that enabled support of ML in the testbed:
- Implementation of a Representational State Transfer (REST) and JavaScript Object Notation (JSON) binding for a transactional Web Processing Service (WPS) to improve interoperability with Exploitation Platform experiments in the Earth Observation & Clouds (EOC) thread
- Development of ML systems following Application Packaging best practices as advanced in EOC and as found in [2]
- Development of an experimental WPS, Web Map Service (WMS), and Web Feature Service (WFS) for model interoperability and transparency, as described in the discussion in Section 9.6.
- Use of a Web Map Tile Service (WMTS) and OGC 17-041 Vector Tiles as input for ML annotation, training, and inference operations, as described in the discussion in Section 9.8.
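As an illustration of the REST and JSON binding idea in the first item above, the following minimal sketch assembles an execute request for the ExecuteML process. It is not the binding defined in the testbed: the base URL, the `/processes/{id}/jobs` path layout, the body keys, and the input names `model_id` and `image_url` are all hypothetical and chosen only for illustration. No network call is made.

```python
import json

# Hypothetical base URL; the report does not name a concrete endpoint.
BASE_URL = "https://example.org/wps"


def build_execute_request(process_id, inputs, base_url=BASE_URL):
    """Return (url, json_body) for a REST/JSON style execute call.

    The path layout and the body keys are illustrative assumptions,
    not the binding specified in the testbed.
    """
    url = f"{base_url}/processes/{process_id}/jobs"
    body = {
        # One entry per process input, identified by name.
        "inputs": [{"id": name, "value": value} for name, value in inputs.items()],
        # Long-running ML jobs are assumed to be submitted asynchronously.
        "mode": "async",
    }
    return url, json.dumps(body, sort_keys=True)


# Usage: input names and values are placeholders.
url, body = build_execute_request(
    "ExecuteML",
    {"model_id": "model-001", "image_url": "https://example.org/data/scene.tif"},
)
print(url)
print(body)
```

A client would send the body to the URL with an HTTP POST and then poll the returned job resource, but those steps depend on the actual service and are left out here.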
1.2.3. How to best publish input to and outputs from ML and AI using OWS?
This ER notes the following uses of OWS to better publish input and outputs of ML:
- Use of an OGC Catalogue Service for the Web (CSW) as an interface for the Knowledge Base (KB), as described in the discussion in Section 9.2.
- Use of an implementation of the OGC ISO Application Profile of CSW to handle information in KB
- Use of CSW for data exchange and as an interface of the Controlled Vocabulary Manager (CVM), as described in the discussion in Section 9.7.
- Consider use of OGC WFS 3.0 to manage annotated and output features, and more generally as a reference service interaction pattern
- Consider use of ISO 19115 and/or OGC CSW-ebRIM Application Profile
- Use of the OGC® Open Modelling Interface (OpenMI) Interface Standard for advanced description of models
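To make the CSW-as-interface idea above more concrete, the sketch below builds a key-value-pair GetRecords request of the kind a client might send to a catalogue fronting the Knowledge Base. It assumes a CSW 2.0.2 endpoint that accepts CQL text constraints; the endpoint URL and the search keyword are hypothetical, and the report does not prescribe this particular query.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical Knowledge Base catalogue endpoint (not given in the report).
KB_CSW_URL = "https://example.org/kb/csw"


def build_getrecords_url(keyword, max_records=10, endpoint=KB_CSW_URL):
    """Build a CSW 2.0.2 GetRecords KVP URL searching all text fields.

    The keyword is inserted into the CQL filter as-is; a real client
    would need to escape quotes and wildcard characters.
    """
    params = {
        "service": "CSW",
        "version": "2.0.2",
        "request": "GetRecords",
        "typeNames": "csw:Record",
        "resultType": "results",
        "elementSetName": "summary",
        "constraintLanguage": "CQL_TEXT",
        "constraint_language_version": "1.1.0",
        "constraint": f"AnyText LIKE '%{keyword}%'",
        "maxRecords": str(max_records),
    }
    # urlencode percent-encodes the filter so the URL is safe to send.
    return f"{endpoint}?{urlencode(params)}"


# Usage: build the URL, then decode it again to inspect the parameters.
request_url = build_getrecords_url("flood")
decoded = parse_qs(urlsplit(request_url).query)
print(request_url)
print(decoded["constraint"][0])
```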
1.3. Recommendations for Future Work
A goal of this task and its analysis was also to suggest potential future activity where these results could be investigated through recommended future tasks, deliverables, components, and Engineering Reports. This section presents recommended potential future tasks and deliverables that can support advancement of requirements and expand research motivations. The reader can also refer to Section 9 of this Engineering Report for discussion and open issues.
1.3.1. Recommended Future Tasks
- Advance temporal enablement of ML through experiments with a variety of temporal data at largely different scales, such as IoT timeseries from sensors, satellite imagery, weather forecasts, climate projections, etc. While selection of data would aim for temporal variety, it would be most likely accompanied by a large spatial variety. Related to Testbed-15 idea: Predictive geospatial analytics through Artificial Intelligence
- Advance Semantic enablement of ML through experiments including data search, data annotation, model training, model interoperability, ML output publishing, metadata, and transfer learning. One goal of this task would be to identify standardization needs and future work for transforming data into knowledge, thus supporting advanced AI use cases.
- Advance Computational enablement of ML through experiments with a variety of processing methods, infrastructures, and execution environments, including secured workflows, containerization, cloud, and In-Memory MapReduce. One goal of this task would be to ensure that other tasks could design or operate in fully-scaled infrastructures.
- Provide an OGC Catalogue Service (CSW) interface as a standardised interface into the Knowledge Base and experiment with Next Generation OGC Web services, as described in the discussion on knowledge base.
- Integrate the CVM into the existing OGC Testbed-14 AI/ML architecture, as described in the discussion on semantic interoperability.
1.3.2. Recommended Future Deliverables
Recommended Future Components
The following components are suggested to be deployed for testing, demonstration, and integration purposes. The resulting functionalities of these components could support the recommended future tasks.
- ML-enabled EOC application packages and workflows, where a packaged ML system is used in conjunction with EO pre-processing steps, in coherence with Testbed-15 idea: Workflows - Motivating use case
- ML systems that train models including point clouds as inputs. Related to Testbed-15 idea: Generalizing point cloud data services using OGC standards
- ML systems supporting interfaces for interoperable DL models, where trained models from one ML system can be loaded into a different one to be used for inference. For example, we note ONNX for Deep Learning open models, and more generally OGC® Open Modelling Interface (OpenMI) Interface Standard for process simulations
- Testbed-15 idea: Filter Encoding Standard extension for Imagery
- Transfer learning demonstrators, where a model trained from scratch by one ML system is reused and adapted in a second ML system, possibly on a different computational environment
- Advanced data store that allows efficient and flexible mapping between tiles or other forms of efficient spatiotemporal subsetting and the input layers of Deep Learning architectures
- Advanced knowledge base and search capabilities that allow better transparency of ML models and support for validation of large numbers of training runs
- Clients that can support supervised learning at scale such as Testbed-15 idea: Large scale geospatial annotation campaign for Deep Learning applications
- Advanced clients providing tools to interact with all previously mentioned future components, as well as managing user feedback as described by the Geospatial User Feedback (GUF) standard
- Clients and systems that implement security mechanisms such as OAuth 2.0
Recommended Future Engineering Reports (ER)
The following Engineering Reports are suggested to support and document experiments conducted inside previously mentioned recommended future tasks and components.
- Geospatial ML systems best practices
- AI for EO Application Packaging and Workflows best practices
- ML for disaster interoperability
1.4. Document contributor contact points
All questions regarding this document should be directed to the editor or the contributors:
Contacts
Name | Organization |
---|---|
Tom Landry | CRIM |
Cameron Brown | Envitia |
Neil Kirk | Envitia |
Chih-Wei Kuan | Feng Chia University |
Benjamin Pross | 52 North |
Cullen Rombach | Image Matters |
Martin Sotir | CRIM |
1.5. Foreword
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. The Open Geospatial Consortium shall not be held responsible for identifying any or all such patent rights.
Recipients of this document are requested to submit, with their comments, notification of any relevant patent claims or other intellectual property rights of which they may be aware that might be infringed by any implementation of the standard set forth in this document, and to provide supporting documentation.
2. References
The following normative documents are referenced in this document.
- OGC: OGC 13-084r2, OGC® I15 (ISO19115 Metadata) Extension Package of CS-W ebRIM Profile 1.0, 2014
- OGC: OGC 07-110r4, OGC® CSW-ebRIM Registry Service - Part 1: ebRIM profile of CSW, 2009
- OGC: OGC 11-014r3, OGC® Open Modelling Interface (OpenMI) Interface Standard, 2014
- W3C: A JSON-based Serialization for Linked Data, 2018
3. Terms and definitions
For the purposes of this report, the definitions specified in Clause 4 of the OWS Common Implementation Standard OGC 06-121r9 shall apply. In addition, the following terms and definitions apply.
-
Active learning
Active learning is a special case of semi-supervised machine learning in which a learning algorithm is able to interactively query the user to obtain the desired outputs at new data points. There are situations in which unlabeled data is abundant but manually labeling is expensive. In such a scenario, learning algorithms can actively query the user/teacher for labels.
-
Algorithm
An unambiguous specification of how to solve a class of problems. Algorithms can perform calculation, data processing and automated reasoning tasks.
-
Annotation
Manual image annotation is the process of manually defining regions in an image and creating a textual description of those regions. Automatic image annotation (also known as automatic image tagging or linguistic indexing) is the process by which a computer system automatically assigns metadata in the form of captioning or keywords to a digital image.
-
Application schema
Conceptual schema for data required by one or more applications [SOURCE: ISO 19101‑1:2014]
-
Artificial intelligence
Artificial Intelligence (AI) is the ability of a computer program or a machine to think and learn. An ideal (perfect) intelligent machine is a flexible agent which perceives its environment and takes actions to maximize its chance of success at some goal. An extreme goal of AI research is to create computer programs that can learn, solve problems, and think logically.
-
Artificial neural network
An artificial neural network is an interconnected group of nodes, akin to the vast network of neurons in a brain. The signal at a connection between artificial neurons is a real number, and the output of each artificial neuron is computed by some non-linear function of the sum of its inputs. Artificial neurons typically have a weight that adjusts as learning proceeds. See Neural Network.
-
Batch
One of the subsets obtained by subdividing a dataset into a number of batches. Batches are required when the entire dataset is too large to be passed to a neural network at once.
-
Batch size
Total number of training examples present in a single batch.
-
Bundle
To bundle software means to sell it together with a computer, or with other hardware or software, as part of a set.
-
Class
A class is the category for a classifier which is given by the target.
-
Classification
Classification is the problem of identifying to which of a set of categories (sub-populations) a new observation belongs, on the basis of a training set of data containing observations (or instances) whose category membership is known.
-
Classification map
A visual representation of the output of classification, often viewed as dense, gridded imagery. See Classifier.
-
Classifier
An algorithm or a method that performs classification of an input. See Classification.
-
Cluster
Generally, a group of data objects. Typical cluster models include connectivity models, distribution models, density models and neural models.
-
Clustering
Clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense) to each other than to those in other groups (clusters). A "clustering" is essentially a set of such clusters, usually containing all objects in the data set. Additionally, it may specify the relationship of the clusters to each other.
-
Controlled vocabulary
Controlled vocabularies provide a way to organize knowledge for subsequent retrieval. They are used in subject indexing schemes, subject headings, thesauri, taxonomies and other forms of knowledge organization systems.
-
Convolutional neural network
A convolutional neural network (CNN) uses convolutions to extract features from local regions of an input. CNNs have gained popularity particularly through their excellent performance on visual recognition tasks.
-
Dataset
Identifiable collection of data.
-
Deep learning
Deep Learning is a class of machine learning algorithms that: use a cascade of multiple layers of nonlinear processing units for feature extraction and transformation; learn in supervised and/or unsupervised manners; learn multiple levels of representations that correspond to different levels of abstraction.
-
Deep neural network
A deep neural network (DNN) is an artificial neural network (ANN) with multiple layers between the input and output layers. See Artificial Neural Network.
-
Detection
Detection includes methods for computing abstractions of image information and making local decisions at every image point whether there is an image feature of a given type at that point or not. The resulting features are subsets of the image domain, often in the form of isolated points, continuous curves or connected regions.
-
Entity
A thing with distinct and independent existence. An entity is something that exists as itself, as a subject or as an object, actually or potentially, concretely or abstractly, physically or not. An entity-relationship (ER) model describes interrelated things of interest in a specific domain of knowledge. A basic ER model is composed of entity types (which classify the things of interest) and specifies relationships that can exist between instances of those entity types.
-
Epoch
Learning iteration on a training dataset.
-
Extraction
In machine learning, feature extraction starts from an initial set of measured data and builds derived values (features) intended to be informative and non-redundant, facilitating the subsequent learning and generalization steps, and in some cases leading to better human interpretations. Feature extraction is related to dimensionality reduction. See Feature.
-
Feature
Abstraction of real-world phenomena. A feature can occur as a type or an instance. Feature type or feature instance should be used when only one is meant [SOURCE: ISO 19101‑1:2014].
-
Feature engineering
Feature engineering is the process of using domain knowledge of the data to create features that make machine learning algorithms work. The need for manual feature engineering can be obviated by automated feature learning. See Feature Learning.
-
Feature learning
Feature learning is a set of techniques that allows a system to automatically discover the representations needed for feature detection or classification from raw data. This replaces manual feature engineering and allows a machine to both learn the features and use them to perform a specific task. See Feature.
-
Fine tuning
Fine tuning refers to the technique of initializing a network with parameters from another task, and then updating these parameters based on the task at hand. A common approach is to first train the network on a larger data set from a related domain. Once the network parameters have converged, an additional training step is performed using the in-domain data to fine-tune the network weights. This allows convolutional networks to be successfully applied to problems with small training sets.
-
Generative adversarial networks
Generative adversarial networks (GANs) are a class of artificial intelligence algorithms used in unsupervised machine learning, implemented by a system of two neural networks contesting with each other in a zero-sum game framework. One network generates candidates (generative) and the other evaluates them (discriminative). Training the discriminator involves presenting it with samples from the dataset, until it reaches some level of accuracy. See Sample.
-
Ground truth
In machine learning, the term "ground truth" refers to the accuracy of the training set’s classification for supervised learning techniques.
-
Hyperparameter
In machine learning, a hyperparameter is a parameter whose value is set before the learning process begins. By contrast, the values of other parameters are derived via training. Given these hyperparameters, the training algorithm learns the parameters from the data. The time required to train and test a model can depend upon the choice of its hyperparameters. A hyperparameter is usually of continuous or integer type, leading to mixed-type optimization problems.
-
Iteration
In machine learning systems, the number of batches needed to complete one epoch.
-
Knowledge base
A knowledge base (KB) is a technology used to store complex structured and unstructured information used by a computer system. A knowledge-based system consists of a knowledge-base that represents facts about the world and an inference engine that can reason about those facts and use rules and other forms of logic to deduce new facts or highlight inconsistencies.
-
Hidden layer
In traditional feed-forward neural networks, a hidden layer neuron is a neuron whose output is connected to the inputs of other neurons and is therefore not visible as a network output. See Layer.
-
Label
See Class.
-
Layer
Basic unit of geographic information that may be requested as a map from a server. In the context of neural networks, a layer is an ensemble of neurons in a network that processes a set of inputs and their results.
-
Machine learning
Machine learning is a subset of artificial intelligence in the field of computer science that often uses statistical techniques to give computers the ability to progressively improve performance on a specific task with data, without being explicitly programmed.
-
Metadata
Data that provides information about other data.
-
MLlib
MLlib is Apache Spark’s scalable machine learning library. It provides tools such as ML Algorithms, Featurization, Pipeline, Persistence and Utilities.
-
Model
A statistical model is a mathematical model that embodies a set of statistical assumptions concerning the generation of some sample data and similar data from a larger population. A statistical model represents, often in considerably idealized form, the data-generating process.
-
Neural network
A computing system made up of a number of simple, highly interconnected processing elements, which process information by their dynamic state response to external inputs.
-
OpenAPI
The OpenAPI Specification is an Application Programming Interface (API) description format for REST APIs.
-
Parameter
In computer programming, a parameter is a special kind of variable, used in a subroutine to refer to one of the pieces of data provided as input to the subroutine. In convolutional layers, the layer’s parameters consist of a set of learnable filters or kernels.
-
Profile
Set of one or more base standards and - where applicable - the identification of chosen clauses, classes, subsets, options and parameters of those base standards that are necessary for accomplishing a particular function [ISO 19101, ISO 19106].
-
PyTorch
PyTorch is an open source machine learning library for Python, based on Torch, used for applications such as natural language processing or image processing. PyTorch provides GPU-accelerated Tensor computation and Deep Neural Networks.
-
Remote sensing
Remote sensing is the acquisition of information about an object or phenomenon without making physical contact with the object and thus in contrast to on-site observation. It generally refers to the use of satellite- or aircraft-based sensor technologies to detect and classify objects on Earth, including on the surface and in the atmosphere and oceans, based on propagated signals.
-
Segmentation
In computer vision, image segmentation is the process of partitioning a digital image into multiple segments.
-
Semantics
A conceptualization of the implied meaning of information that requires words and/or symbols within a usage context.
-
Semantic class
A semantic class contains words that share a semantic feature. According to the nature of the noun, they are categorized into different semantic classes. Semantic classes may intersect. See Semantic Feature.
-
Semantic feature
Semantic features represent the basic conceptual components of meaning for any lexical item.
-
Semantic gap
Separation between the visual content in a digital image and semantic descriptions.
-
Semantic interoperability
Semantic interoperability assures that the content is understood in the same way in both systems, including by those humans interacting with the systems in a given context.
-
Semantic segmentation
Semantic segmentation describes the process of associating each pixel of an image with a class label.
-
Supervised Learning
Supervised learning is the machine learning task of learning a function that maps an input to an output based on example input-output pairs. It infers a function from labeled training data consisting of a set of training examples.
-
Synthetic aperture radar
Synthetic aperture radar (SAR) is a form of active remote sensing – the antenna transmits radiation that is reflected from the image area, as opposed to passive sensing, where the reflection is detected from ambient illumination. SAR image acquisition is therefore independent of natural illumination and images can be taken at night. See Remote Sensing.
-
Target
Detection of targets is a specific field of study within the general scope of image processing and image understanding. From a sequence of images (usually within the visual or infrared spectral bands), it is desired to recognize a target such as a car.
-
Taxonomy
A system or controlled list of values by which to categorize or classify objects.
-
Test dataset
A test dataset is a dataset that is independent of the training dataset, but that follows the same probability distribution as the training dataset. The test dataset is a dataset used to provide an unbiased evaluation of a final model fit on the training dataset. A test set is therefore a set of examples used only to assess the performance (i.e. generalization) of a fully specified classifier.
-
Tile
A rectangular representation of geographic data, often part of a set of such elements, covering a spatially contiguous extent which can be uniquely defined by a pair of indices for the column and row along with an identifier for the tile matrix [source: OGC 07-057r7].
-
Training dataset
A training dataset is a dataset of examples used for learning. A model is initially fit on a training dataset, that is a set of examples used to fit the parameters of the model. The model is trained on the training dataset using a supervised learning method.
-
Tuning
See Fine Tuning.
-
Uncertainty
Two types of uncertainty can be identified: epistemic and aleatoric. Epistemic uncertainty captures ignorance about which model generated the collected data. Aleatoric uncertainty relates to information which collected data cannot explain.
-
Unsupervised Learning
Unsupervised machine learning is the machine learning task of inferring a function that describes the structure of "unlabeled" data.
-
Validation dataset
A validation dataset is a set of examples used to tune the hyperparameters of a classifier. Fitted models are used to predict the responses for the observations in the validation dataset. The validation dataset provides an unbiased evaluation of a model fit on the training dataset while tuning the model’s hyperparameters.
-
Vocabulary
A language user’s knowledge of words. See Controlled Vocabulary.
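The training, validation, and test dataset entries above describe three disjoint roles for examples. The following is a minimal sketch of such a split, assuming a simple random shuffle; the function name `split_dataset`, the fractions, and the seed are illustrative assumptions and are not taken from the Testbed-14 implementations.

```python
import random


def split_dataset(examples, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle examples and split them into disjoint train/validation/test lists.

    The fractions and the seed are hypothetical defaults chosen for illustration.
    """
    if not 0 < train_frac < 1 or not 0 <= val_frac < 1 or train_frac + val_frac >= 1:
        raise ValueError("fractions must leave a non-empty share for the test set")
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]                      # used to fit model parameters
    validation = shuffled[n_train:n_train + n_val]  # used to tune hyperparameters
    test = shuffled[n_train + n_val:]               # used only for final evaluation
    return train, validation, test


train, validation, test = split_dataset(range(100))
```

A random split like this only approximates the requirement that the test dataset follows the same probability distribution as the training dataset. Spatially correlated imagery may call for splitting by scene or region instead.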
3.1. Abbreviated terms
-
AOI Area of Interest
-
AI Artificial Intelligence
-
API Application Programming Interface
-
CFP Call For Participation
-
CRIM Computer Research Institute of Montreal
-
CNN Convolutional Neural Network
-
COG Cloud Optimized GeoTIFF
-
CWL Common Workflow Language
-
DGGS Discrete Global Grid Systems
-
DL Deep Learning
-
DSTL Defense Science and Technology Laboratory
-
EOC Earth Observation & Clouds
-
ER Engineering Report
-
FCU Feng Chia University
-
GAN Generative Adversarial Network
-
GPU Graphics Processing Unit
-
JSON JavaScript Object Notation
-
ML Machine Learning
-
MGCP Multinational Geospatial Coproduction Program
-
MLP Multi Layer Perceptron
-
MoPoQ Modeling, Portrayal, and Quality of Service
-
NAS NSG Application Schema
-
NDAAS NSG Data Analytic Architecture Service
-
NGA National Geospatial-Intelligence Agency
-
NLP Natural Language Processing
-
NLU Natural Language Understanding
-
NN Neural Network
-
NIEM National Information Exchange Model
-
NSG National System for Geospatial Intelligence
-
OGC Open Geospatial Consortium
-
OSM OpenStreetMap
-
OWS OGC Web Services
-
RNN Recurrent Neural Network
-
SAR Synthetic Aperture Radar
-
SatCen European Union Satellite Centre
-
SVM Support Vector Machines
-
URL Uniform Resource Locator
-
USGIF United States Geospatial Intelligence Foundation
-
VHR Very High Resolution
-
W3C World Wide Web Consortium
-
WCS Web Coverage Service
-
WFS Web Feature Service
-
WMS Web Map Service
-
XML Extensible Markup Language
4. Overview
Section 5 provides background information found in the state-of-the-art of AI, ML, and DL that is relevant to the ML task.
Section 6 describes the proof of concept developed by the task participants. The concept, scenarios, and use cases are presented. The design of each component, deliverable by deliverable, is described. Implemented systems, processes, and workflows are reported. Metadata and application schemas are described.
Section 7 reports various approaches and good practices as identified by the participants through their own implementations or by references of authoritative material. The section presents best practices of workflows, service interaction patterns, and application schemas.
Section 8 demonstrates the proof of concept. The demonstration scenarios and relevant material are presented.
Section 9 lists the recommendations from the task participants to OGC. Recommendations addressing the creation or extension of OGC standards to support ML and AI geospatial algorithms are presented, as well as recommendations of good practices for the use of tiles or grid structures for the analysis, including DGGS.
Annex A provides an XML WPS 2.0 process description of ExecuteML as presented by the ML system.
Annex B provides an XML WPS 2.0 process description of RetrainML as presented by the ML system.
Annex C provides an XML WPS 2.0 process description of TrainML as presented by the ML system.
Annex D provides a JSON file of the default response of the Controlled Vocabulary Manager.
Annex E provides a JSON file of the full response of the Controlled Vocabulary Manager.
Annex F provides a components table for the ML Task of the MoPoQ thread of Testbed-14.
Annex G illustrates a sample Pleiades VHR image of Paris used to train models and infer classes.
5. AI, ML and DL landscape
This section provides background information drawn from the state of the art in AI, ML, and DL that is relevant to the ML task. This landscape study puts into context key elements from the CFP, supports future work recommendations found in the summary, and lays the groundwork for discussion.
For additional background, the CFP also lists the following sources:
-
Big Data DWG: Simple Features for Big Data
-
Human-Agent Collectives
-
Data Quality work of Testbed-12
-
Gal’s Thesis on Uncertainty in Deep Learning
-
Principles for Algorithmic Transparency and Accountability
-
NSG Data Analytic Architecture Service (NDAAS)
-
Multinational Geospatial Coproduction Program (MGCP)
-
NGA 2020 analysis technology plan
5.1. Background information on ML
Several examples of supervised, unsupervised, and semi-supervised ML applications can be found in [3] and [4]. Typical problem classes are illustrated by [5] in the following figure.
Major performance metrics of ML systems include precision and recall, training time and execution time, and metaparameters and analyst feedback. Similarly, the CFP lists the following metadata as key elements to consider in the ML task:
-
trust levels and confidence
-
associated body of evidence
-
associated imagery
-
optimized learning capabilities
-
applied processes
-
quality parameters
-
links to original images and/or snippets
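Of the performance metrics mentioned above, precision and recall have simple closed-form definitions. The sketch below computes both for a binary detection task; the function name `precision_recall` and the sample labels are hypothetical and serve only as an illustration.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for the given positive class label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical ground truth and predictions for eight image tiles.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]
precision, recall = precision_recall(y_true, y_pred)
```

Precision is the share of predicted positives that are correct, while recall is the share of actual positives that were found.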
5.2. GeoINT applications
This section briefly presents key GeoINT challenges coherent with this work’s scope, with respect to the role of the analyst, to the processes and workflows, and to scenarios and applications. Further reading material can be found in [6], [7], [8], [9] and [10]. Below is a definition of GeoInt as found on Wikipedia.
GEOINT (GEOspatial INTelligence) is intelligence about the human activity on earth derived from the exploitation and analysis of imagery and geospatial information that describes, assesses, and visually depicts physical features and geographically referenced activities on the Earth.
5.2.1. Processes and workflows
Several sources mention the use of processes such as metadata tagging, data preparation and ingestion, data search, and filtering. Advanced use cases drawing on Natural Language Processing (NLP) [11] are also mentioned, with systems containing query interpreters and query conductors in federated subsystems. Data challenges presented include hypothesis formulation, confidence tracking, maintaining links to the original data, and traversal of workflows. Similarly, the OGC Testbed-14 CFP lists the following processes and workflows as references for the ML task participants:
-
Conversion into a format suitable for feature extraction and classification
-
Filtering of images to remove very low quality samples that might skew the ML model
-
Provision of metadata to describe the provenance and configuration of the model
-
Export of outputs to be re-ingested as inputs for further learning, in a closed feedback loop
-
WPS, WFS, and OWS Context Profiles to support development of DL models for geospatial feature extraction
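Two of the processes listed above, filtering of very low quality samples and provision of provenance metadata, can be sketched together as follows. This is a minimal illustration under stated assumptions: the `ImageSample` class, the `cloud_fraction` quality indicator, the threshold, and the provenance keys are all hypothetical and do not come from the CFP or from any OGC schema.

```python
from dataclasses import dataclass


@dataclass
class ImageSample:
    image_id: str
    cloud_fraction: float  # hypothetical quality indicator in [0, 1]


def filter_low_quality(samples, max_cloud_fraction=0.5):
    """Keep samples at or below the threshold and report what was dropped."""
    kept = [s for s in samples if s.cloud_fraction <= max_cloud_fraction]
    rejected_ids = [s.image_id for s in samples if s.cloud_fraction > max_cloud_fraction]
    # Record the filter configuration so the step can be described as provenance.
    provenance = {
        "process": "filter_low_quality",
        "max_cloud_fraction": max_cloud_fraction,
        "rejected": rejected_ids,
    }
    return kept, provenance


samples = [
    ImageSample("img-001", 0.05),
    ImageSample("img-002", 0.80),
    ImageSample("img-003", 0.30),
]
kept, provenance = filter_low_quality(samples)
```

Returning the provenance record alongside the filtered samples keeps the configuration of the step attached to its output, which is one possible way to preserve links back to the original data.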
5.2.2. Scenarios and use cases
Several scenarios and use cases of ML can be found in the literature. Usual applications involving detection of objects and events, as found in [9], [12] and [13], include:
-
Tracking of human migration
-
Finding high value targets such as terrorists
-
Locating waste piles that enable breeding of mosquitos
-
Monitoring of land cover and determination of land use
Other GeoInt use cases involving forecasting, simulation, and prediction can be found in [8]. In this broad category, we note the following applications:
-
Inventory management based on weather patterns
-
Regional climate response
-
Watershed evaluation
-
Agricultural forecasting and food and water security
Finally, in accordance with the disaster scenario described in Section 6, more information can be found in [14], [15], [16] and [9]. For this scenario, we also note the following applications:
-
Detecting trends in air pollution
-
Flood and inundation models for storm surge prediction
-
Simulation of human behavior in catastrophic urban scenarios
5.3. Earth Systems applications
This section briefly presents key Earth Systems challenges coherent with this work’s scope with respect to the data characteristics as well as scenarios and applications. Here this engineering report notes a large overlap with GeoInt applications presented above, as Earth Systems studies natural phenomena that impact life.
5.3.1. Data
Data sources often used in Earth Systems include aerial [17], in-situ, model outputs, and remote sensing. From the latter, this engineering report notes the possible estimation of variables [18] such as methane in air, forest cover, global surface water, and land cover change (unsupervised). A good overview [19] of the inherent characteristics of gridded data, vector data, and swaths includes:
-
Boundaries and spatial dimensionality
-
Temporal characteristics and presence of various cycles
-
Multiresolution
-
Sample and ground truth sizes
Across these data sources, the term "features" is used to describe a transformation or reduction of the input space, for instance in [20]. Rules, heuristics, and ad hoc models are often used as handcrafted features [12] [4]. In the context of DL, "deep features" describes the outmost layers of a learned NN, as found in [21] and [22]. Here, this engineering report notes the relevance of work conducted in the Spatial Data on the Web Best Practices WG [OGC 15-107].
5.3.2. Scenarios and use cases
Several scenarios and use cases of ML can be found in the literature. Usual applications involving detection, tracking, and forecasting of events and phenomena can be found in [18]. In accordance with the disaster scenario described in Section 6, the following applications are also noted:
-
hurricanes, tornados, and fires
-
weather fronts and atmospheric rivers
5.4. Annotation datasets
In our scenario, an analyst who must annotate imagery in order to train or retrain models has several modalities available. For example, annotation clients and methods found in [4], [17] and [23] use points, patches, bounding boxes, tiles, contours, and regions. The figure below illustrates the use of polygons to compactly delineate objects of interest and may be considered as state-of-the-art.
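To make the difference between polygon and bounding box annotations concrete, the following sketch compares the area of a polygon with the area of its axis-aligned bounding box. The L-shaped footprint and the function names are hypothetical; the polygon area uses the standard shoelace formula.

```python
def polygon_area(vertices):
    """Area of a simple polygon given as (x, y) vertices, via the shoelace formula."""
    twice_area = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the ring
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def bounding_box(vertices):
    """Axis-aligned bounding box as (min_x, min_y, max_x, max_y)."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    return min(xs), min(ys), max(xs), max(ys)


# Hypothetical L-shaped building footprint in pixel coordinates.
footprint = [(0, 0), (4, 0), (4, 1), (1, 1), (1, 3), (0, 3)]

min_x, min_y, max_x, max_y = bounding_box(footprint)
box_area = (max_x - min_x) * (max_y - min_y)
fill_ratio = polygon_area(footprint) / box_area  # share of the box covered by the object
```

For this footprint, half of the bounding box covers background rather than the object, which illustrates why a polygon can describe an object of interest more tightly than a box.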