Publication Date: 2021-01-13

Approval Date: 2020-12-14

Submission Date: 2020-11-19

Reference number of this document: OGC 20-018

Reference URL for this document:

Category: OGC Public Engineering Report

Editor: Guy Schumann

Title: OGC Testbed-16: Machine Learning Training Data ER

OGC Public Engineering Report


Copyright © 2021 Open Geospatial Consortium. To obtain additional rights of use, visit


This document is not an OGC Standard. This document is an OGC Public Engineering Report created as a deliverable in an OGC Interoperability Initiative and is not an official position of the OGC membership. It is distributed for review and comment. It is subject to change without notice and may not be referred to as an OGC Standard. Further, any OGC Public Engineering Report should not be referenced as required or mandatory technology in procurements. However, the discussions in this document could very well lead to the definition of an OGC Standard.


Permission is hereby granted by the Open Geospatial Consortium, ("Licensor"), free of charge and subject to the terms set forth below, to any person obtaining a copy of this Intellectual Property and any associated documentation, to deal in the Intellectual Property without restriction (except as set forth below), including without limitation the rights to implement, use, copy, modify, merge, publish, distribute, and/or sublicense copies of the Intellectual Property, and to permit persons to whom the Intellectual Property is furnished to do so, provided that all copyright notices on the intellectual property are retained intact and that each person to whom the Intellectual Property is furnished agrees to the terms of this Agreement.

If you modify the Intellectual Property, all copies of the modified Intellectual Property must include, in addition to the above copyright notice, a notice that the Intellectual Property includes modifications that have not been approved or adopted by LICENSOR.


This license is effective until terminated. You may terminate it at any time by destroying the Intellectual Property together with all copies in any form. The license will also terminate if you fail to comply with any term or condition of this Agreement. Except as provided in the following sentence, no such termination of this license shall require the termination of any third party end-user sublicense to the Intellectual Property which is in force as of the date of notice of such termination. In addition, should the Intellectual Property, or the operation of the Intellectual Property, infringe, or in LICENSOR’s sole opinion be likely to infringe, any patent, copyright, trademark or other right of a third party, you agree that LICENSOR, in its sole discretion, may terminate this license without any compensation or liability to you, your licensees or any other party. You agree upon termination of any kind to destroy or cause to be destroyed the Intellectual Property together with all copies in any form, whether held by you or by any third party.

Except as contained in this notice, the name of LICENSOR or of any other holder of a copyright in all or part of the Intellectual Property shall not be used in advertising or otherwise to promote the sale, use or other dealings in this Intellectual Property without prior written authorization of LICENSOR or such copyright holder. LICENSOR is and shall at all times be the sole entity that may authorize you or any third party to use certification marks, trademarks or other special designations to indicate compliance with any LICENSOR standards or specifications.

This Agreement is governed by the laws of the Commonwealth of Massachusetts. The application to this Agreement of the United Nations Convention on Contracts for the International Sale of Goods is hereby expressly excluded. In the event any provision of this Agreement shall be deemed unenforceable, void or invalid, such provision shall be modified so as to make it valid and enforceable, and as so modified the entire Agreement shall remain in full force and effect. No decision, action or inaction by LICENSOR shall be construed to be a waiver of any rights or remedies available to it.

None of the Intellectual Property or underlying information or technology may be downloaded or otherwise exported or reexported in violation of U.S. export laws and regulations. In addition, you are responsible for complying with any local laws in your jurisdiction which may impact your right to import, export or use the Intellectual Property, and you represent that you have complied with any regulations or registration procedures required by applicable law to make this license enforceable.


1. Subject

The OGC Testbed-16 Machine Learning (ML) Training Data Engineering Report (ER) describes training data used for developing a Wildfire Response application. Within the context of the application, this ER discusses the challenges and makes a set of recommendations. The two scenarios for the wildfire use case include fuel load estimation and water body identification. The ML training data described in this ER are based on these two scenarios. Suggestions are also made for future work on a model for ML training dataset metadata, which is intended to provide vital information on the data and therefore facilitate the uptake of training data by the ML community. Additionally, this ER summarizes the discussions and issues about ML training data among the Testbed-16 ML thread participants and draws conclusions and recommendations for future work on the subject. Finally, this ER also links to current Analysis Ready Data (ARD) principles and efforts, in particular in the Earth Observation (EO) community.

2. Executive Summary

2.1. General Purpose of the ML thread and this Engineering Report

The OGC Testbed-16 Machine Learning task focused on understanding the potential of existing and emerging OGC standards for supporting Machine Learning (ML) applications in the context of wildland fire safety and response. In this context, the integration of ML models into standards-based data infrastructures, the handling of ML training data, and the integrated visualization of ML data with other source data were explored. Emphasis was on the integration of data from the Canadian Geospatial Data Infrastructure (CGDI), the handling of externally provided training data, and the provisioning of results to end-users without specialized software.

2.2. Requirements of the ML training data set components in particular

The Testbed-16 ML participants explored how to leverage ML technologies for dynamic wildland fire response. An objective was to also provide insight into how OGC standards can support wildland fire response activities in a dynamic context. Any identified limitations of existing OGC standards can be used to plan improvements to these frameworks.

Though this task uses a wildland fire scenario, the emphasis is not on the quality of the modeled results but on the integration of externally provided source and training data, the deployment of the ML model on remote clouds through a standardized interface, and the visualization of model output.

In summary, Testbed-16 addressed the following three challenges:

  • Discovery and reusability of training data sets

  • Integration of ML models and training data into standards-based data infrastructures

  • Cost-effective visualization and data exploration technologies based on Map Markup Language (MapML)

This Machine Learning Training Data ER focuses explicitly on the first point. For ML models and their integration, readers are referred to the OGC Machine Learning Engineering Report [OGC 20-015].

2.3. Training Datasets

The Earth Observation (EO) user and developer community currently can access unprecedented capabilities. To combine these capabilities with the major advances in Artificial Intelligence (AI) in general and Machine Learning (ML) in particular, the community needs to close the gap between ML on one side and Earth observation data on the other. In this context, two aspects need to be addressed.

  • The extremely limited discoverability and availability of training and test datasets.

  • The interoperability challenges to enable ML systems to work with available data sources and live data feeds coming from a variety of systems and APIs

In this context, training datasets are pairs of labelled examples (the dependent, or target, variable) and the corresponding EO data (the independent variables). Together, these two types are used to train an ML model that is then used to make predictions of the target variable based on previously unseen EO data. Test data is a set of observations used to evaluate the performance of the model using some performance metric. In addition to the training and test data, a third set of observations, called a validation or hold-out set, is sometimes required. The validation set is used to tune variables called hyperparameters, which control how the model learns. In this ER, the set of training data, test data, and validation data together are referred to simply as training datasets.

To address the general lack of training data discoverability, accessibility, and reusability, Testbed-16 participants developed solutions that describe how training data sets can be generated, structured, described, made available, and curated.

2.4. (Research) Questions that this ER tries to address

As indicated in the Testbed-16 Call for Participation (CFP), this ER tries to address the following:

  • Where do trained datasets go and how can they be re-used?

  • How can we ensure the authenticity of trained datasets?

  • Is it necessary to have analysis ready data (ARD) for ML? Can ML help ARD development?

  • What metamodel structure should be used for ML TDS?

  • What is the value of datacubes for ML?

2.5. About the Canadian Wildland Fire Information System (CWFIS)

The Canadian Wildland Fire Information System (CWFIS) creates daily fire weather and fire behavior maps year-round and hot spot maps throughout the forest fire season, generally between May and September. The CWFIS monitors fire danger conditions and fire occurrence across Canada. Daily weather conditions are collected from across Canada and used to produce fire weather and fire behavior maps. In addition, satellites are used to detect fires, and reported fire locations are collected from fire management agencies.

The CWFIS comprises various systems. The Canadian Forest Fire Danger Rating System (CFFDRS) is a national system for rating the risk of forest fires in Canada. Forest fire danger is a general term used to express a variety of factors in the fire environment, such as ease of ignition and difficulty of control. Fire danger rating systems produce qualitative and/or numeric indices of fire potential, which are used as guides in a wide variety of fire management activities.

The CFFDRS has been under development since 1968. Currently, two subsystems – the Canadian Forest Fire Weather Index (FWI) System and the Canadian Forest Fire Behavior Prediction (FBP) System – are being used extensively in Canada and internationally.

These are all available as training datasets for Testbed 16.

2.6. What does this ER mean for the EDM Working Group and the OGC

The purpose of the Emergency & Disaster Management (EDM) DWG is to promote and support the establishment of requirements and best practices for web service interfaces, models and schemas that enable the discovery, access, sharing, analysis, visualization and processing of information for the forecasting of, prevention of, response to and recovery from emergency and disaster situations. The mission lies in improving interoperability of geospatial products and other information consumables that can be shared across these communities. Two main objectives of the work described in this ER are:

  • Identify interoperability standards gaps and opportunities to support improved EDM information sharing, collaboration and decision making.

  • Propose or encourage initiation of Interoperability Program studies, experiments, pilot initiatives, testbed threads or demonstrations to address technical, institutional and policy related interoperability challenges, and identify and engage the interest of potential sponsors for these activities.

The former OGC Law Enforcement And Public Safety (LEAPS) DWG promoted and supported the establishment of local, national, regional and international requirements and best practices for web service interfaces, data models and schemas for enabling the discovery, access, sharing, analysis, visualization and processing of information. This geospatial and temporal information is used comprehensively to address crime, terrorist activities and public safety incidents in an operationally effective way.

Given that the objectives and general purpose were very similar and overlapping for many applications, especially during disaster situations, these two groups were combined into the EDM/LEAPS DWG.

The objectives of this DWG are synergistic with the requirements and deliverables of the Testbed-16 Machine Learning (ML) thread, namely to facilitate access to, and agree on formats for, data used in ML model training, validation and testing, for the purpose of better managing emergencies such as wildland fires.

This situation is very common in the emergency response and disaster management sector, where more and more ML models are being developed and used. Hence the importance of this ER to the EDM/LEAPS DWG.

2.7. Document contributor contact points

All questions regarding this document should be directed to the editor or the contributors:


Name, Organization, Role

Guy Schumann, RSS Hydro

Albert Kettner, Contractor to RSS Hydro

Ignacio Correas

2.8. Foreword

Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. The Open Geospatial Consortium shall not be held responsible for identifying any or all such patent rights.

Recipients of this document are requested to submit, with their comments, notification of any relevant patent claims or other intellectual property rights of which they may be aware that might be infringed by any implementation of the standard set forth in this document, and to provide supporting documentation.

3. References

4. Terms and definitions

For the purposes of this report, the definitions specified in Clause 4 of the OWS Common Implementation Standard OGC 06-121r9 shall apply. In addition, the following terms and definitions apply.

4.1. Abbreviated terms

CWFIS Canadian Wildland Fire Information System

EDM Emergency and Disaster Management

FWI Fire Weather Index

ML Machine Learning

MSE Mean Squared Error

NFIS National Forest Information System

NRCan Natural Resources Canada

SLD Styled Layer Descriptor

SSIM Structural Similarity Index

WMS Web Map Service

5. Overview

Chapter 1 introduces the subject matter of this Testbed 16 OGC Engineering Report.

Chapter 2 provides an Executive Summary for the Testbed-16 ML Training Data activity.

Chapter 3 provides a reference list of normative documents.

Chapter 4 gives a list of the abbreviated terms and the symbols necessary for understanding this document.

Chapter 5 lists the content of each chapter (this chapter).

Chapters 6 to 9 contain the main technical details and recommendations of this ER. These chapters provide a high-level outline of the use case scenarios, followed by an in-depth description of the work performed and the challenges encountered, raising issues and discussing possible solutions.

Chapter 10 summarizes recommendations and suggests top-priority items for future work.

Annex A includes a history table of changes made to this document.

6. Use Case Scenarios

6.1. Fuel Load Estimation (D136)

Problem statement: Explore interoperability challenges of training data for the wildfire use case, specifically for fuel load estimation. In addition, solutions need to be developed that allow the wildland fire training data, test data, and validation data to be structured, described, generated, discovered, accessed, and curated within data infrastructures.

6.2. Water Body Identification (D135)

Problem statement: Investigate how existing standards related to water resources, in conjunction with ML, can be used to locate potential water sources for wildland fire event response.

7. Fuel Load Estimation: Data and Technical Details

7.1. Data analysis tools

7.1.1. Structural Similarity Index (SSIM)

The first challenge was data analysis, specifically how to compare two different variables, represented by their respective maps, to estimate their correlation. This is a basic step in feature selection, the process of selecting those features that contribute most to the prediction variable or output of a Machine Learning algorithm. Feature selection helps find relevant features and leaves irrelevant features out of the training data set.

One of the most common methodologies used in data analysis is Mean Squared Error (MSE). This error estimate measures the average of the squares of the errors between two variables. However, as this error estimate only considers local values and ignores shapes of isolines, which is one of the main characteristics in maps, this method is unsuited for map comparison. For example, two maps showing very similar isolines but different absolute values will be considered unrelated by MSE methodology.
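For reference, the standard definition of MSE over two maps x and y with N pixels is given below, together with what it implies in a hypothetical case (used here purely for illustration) where one map equals the other plus a constant offset c:

```latex
\mathrm{MSE}(x, y) = \frac{1}{N} \sum_{i=1}^{N} (x_i - y_i)^2,
\qquad
y_i = x_i + c \;\Rightarrow\; \mathrm{MSE}(x, y) = c^2
```

In that case the two maps share every isoline, yet the MSE grows with the square of the offset and carries no information about the shared shapes.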

In order to overcome this challenge, the participants researched methodologies from the field of image processing. They determined that the Structural Similarity Index (SSIM) was suitable for the task. The SSIM index is a method for measuring the similarity between two images. It can be viewed as a quality measure of one of the images being compared, provided that the other image is regarded as being of perfect quality. SSIM was designed for predicting the perceived quality of digital television and cinematic pictures, as well as other kinds of digital images and videos. However, it can also be applied to find common patterns and contours between two images and estimate their degree of similarity.
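As a minimal sketch of why SSIM behaves differently from MSE, the following Python computes both metrics for a small synthetic image. It is a simplification: SSIM is normally evaluated over local sliding windows and averaged, whereas the hypothetical `global_ssim` function applies the formula once over the whole image. The constants K1 = 0.01 and K2 = 0.03 are the commonly used defaults. The 8x8 test image, the +30 brightness shift and the alternating +/-30 "noise" are illustrative assumptions, not Testbed-16 data.

```python
def mse(a, b):
    """Mean squared error between two equally sized pixel sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def global_ssim(a, b, data_range=255.0, k1=0.01, k2=0.03):
    """SSIM evaluated once over the whole image (no sliding window)."""
    n = len(a)
    c1 = (k1 * data_range) ** 2  # stabilizes the luminance term
    c2 = (k2 * data_range) ** 2  # stabilizes the contrast/structure term
    mu_a, mu_b = sum(a) / n, sum(b) / n
    var_a = sum((x - mu_a) ** 2 for x in a) / n
    var_b = sum((y - mu_b) ** 2 for y in b) / n
    cov = sum((x - mu_a) * (y - mu_b) for x, y in zip(a, b)) / n
    luminance = (2 * mu_a * mu_b + c1) / (mu_a ** 2 + mu_b ** 2 + c1)
    structure = (2 * cov + c2) / (var_a + var_b + c2)
    return luminance * structure


# Synthetic 8x8 image stored row by row: each row is 5 gray levels brighter.
base = [100 + 5 * (i // 8) for i in range(64)]
# Same shapes, every value shifted by a constant.
brighter = [v + 30 for v in base]
# Same error magnitude per pixel, but the sign alternates and disturbs the shapes.
noisy = [v + (30 if i % 2 == 0 else -30) for i, v in enumerate(base)]

for name, other in (("brighter", brighter), ("noisy", noisy)):
    print(f"{name}: MSE={mse(base, other):.1f} SSIM={global_ssim(base, other):.3f}")
```

Both variants differ from the base image by exactly 30 gray levels per pixel, so MSE cannot tell them apart. In this sketch SSIM stays above 0.9 for the brightness shift and falls below 0.5 for the noisy image.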

As the example in Figure 1 shows, when SSIM is applied to compare two images, the resulting value indicates how similar the images are. When applying SSIM to maps, the hypothesis is that SSIM will be a better indicator than MSE-related metrics of variable correlation or of the accuracy of ML predictions.

Figure 1. Example of MSE and SSIM methodologies when comparing images. SSIM clearly indicates high similarity when image lighting was adjusted and low similarity when an image was noisy. MSE provided the same metrics in both cases. (Source:

7.1.2. SSIM applied to Petawawa data

The SSIM method was tested with the Petawawa data, which is accessible through a Web Map Service (WMS). The Petawawa forest dataset provides a series of maps displaying the values of several measurements, such as volume, height or biomass. The dataset covers a relatively small and compact area but offers up to 26 different measurements with a spatial resolution of 30 meters. The quality of the data is assumed to be very high, that is, close to the real value distribution. Further, the maps are served in grayscale and can be analyzed directly.

Both methods (MSE and SSIM) were run for all maps. SSIM is an index capped at 1.0, which indicates identical maps. Values over 0.9 indicate high similarity (90th percentile), and values lower than 0.5 indicate very low similarity (10th percentile). In contrast, MSE is an absolute value, with 0 indicating identical images. Values under 500 indicate very few differences (10th percentile), and values over 5000 indicate substantial differences (90th percentile).

The interest in using SSIM becomes obvious when analyzing maps that show related variables, such as Lorey’s height, the weighted mean height whereby individual trees are weighted in proportion to their basal area (RF_PRF_LOREYSHT), in relation to Co-dominant - dominant height (RF_PRF_CD_HT) and Top height, the average of the largest 100 trees per ha (RF_PRF_TOPHT). These variables all measure forest heights and should show a strong correlation.
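To make the definition of Lorey's height concrete, the following sketch contrasts it with a plain arithmetic mean. The two-tree plot is hypothetical and the function names are illustrative. The only thing taken from the text is that each tree is weighted in proportion to its basal area.

```python
def arithmetic_mean_height(trees):
    """Unweighted mean of tree heights."""
    return sum(height for height, _ in trees) / len(trees)


def loreys_height(trees):
    """Basal-area-weighted mean height: sum(g_i * h_i) / sum(g_i)."""
    total_basal_area = sum(basal_area for _, basal_area in trees)
    return sum(height * basal_area for height, basal_area in trees) / total_basal_area


# Hypothetical plot: (height in m, basal area in m^2) for each tree.
plot = [(10.0, 0.01), (20.0, 0.03)]

print(arithmetic_mean_height(plot))  # each tree counts equally
print(loreys_height(plot))           # the larger tree dominates the result
```

Because the taller tree also has the larger basal area, the weighted value (about 17.5 m) lies above the arithmetic mean (15 m). This is one reason why height-related layers can share shapes while differing in absolute values.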

When comparing Lorey’s height in relation to Co-dominant - dominant height, both methods indicate high similarity, with SSIM value at 0.93 (very close to the maximum 1.0) and MSE at 161 (well below the 10th percentile value of 500).


Please note in the following figures that each SSIM value and each MSE value represents a comparison of the two images (rather than the SSIM value being applied to one image and the MSE value being applied to the other).

Figure 2. Lorey’s height vs. Co-dominant - dominant height: both SSIM and MSE indicate strongly similar images

However, when comparing Lorey’s height in relation to Top height, only SSIM, with a value of 0.92, indicates strong correlation. MSE gives a value of 1036, indicating only that there might be some correlation. This discrepancy arises because, although both maps show similar shapes, the absolute values of the variables differ significantly.

Figure 3. Lorey’s height vs. Top height: SSIM indicates strongly similar images, but MSE indicates dissimilarity

Similar issues arise when comparing unrelated variables. For example, when comparing Top height (RF_PRF_TOPHT) in relation to Basal area in the Pole tree size class (10-24cm) (RF_PRF_BAPOLES), both methods indicate very low correlation or none at all, with values of 6746 for MSE and 0.39 for SSIM.

Figure 4. Top height vs. Basal area: both MSE and SSIM indicate dissimilarity

However, when comparing Basal area in relation to Quadratic mean diameter for all trees >9.1cm (RF_PRF_DBHQMERCH_MASKED), only the SSIM result, with a value of 0.37, clearly indicates no correlation. The MSE value of 1666, although not pointing towards a clear correlation, does not reject the relationship either.

Figure 5. Basal area vs. Quadratic mean diameter: SSIM indicates dissimilarity while MSE is borderline
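The four Petawawa comparisons above can be summarized by applying the thresholds stated earlier in this section (SSIM above 0.9 or below 0.5, MSE below 500 or above 5000). The sketch below is only a restatement of those numbers. The function names and the "inconclusive" label for values between the thresholds are assumptions made for illustration.

```python
def ssim_verdict(value):
    """Interpret an SSIM score using the 0.9 / 0.5 thresholds from the text."""
    if value > 0.9:
        return "similar"
    if value < 0.5:
        return "dissimilar"
    return "inconclusive"


def mse_verdict(value):
    """Interpret an MSE score using the 500 / 5000 thresholds from the text."""
    if value < 500:
        return "similar"
    if value > 5000:
        return "dissimilar"
    return "inconclusive"


# (comparison, SSIM, MSE) as reported for the Petawawa layers.
COMPARISONS = [
    ("Lorey's height vs Co-dominant - dominant height", 0.93, 161),
    ("Lorey's height vs Top height", 0.92, 1036),
    ("Top height vs Basal area (poles)", 0.39, 6746),
    ("Basal area (poles) vs Quadratic mean diameter", 0.37, 1666),
]

for label, ssim_value, mse_value in COMPARISONS:
    print(f"{label}: SSIM -> {ssim_verdict(ssim_value)}, MSE -> {mse_verdict(mse_value)}")
```

In the two cases where the methods disagree, SSIM returns a clear verdict while MSE falls between its thresholds.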

7.1.3. SSIM applied to Canadian Wildland Fire Information System (CWFIS)

All the previously described tests were carried out using the WMS accessible Petawawa dataset. This is high quality data with one continuous variable per layer and with maps served in grayscale. However, this dataset’s geographic extent is very limited and conclusions therefore do not necessarily extrapolate to countrywide maps.

The CWFIS offers an alternative data source with which to showcase the Structural Similarity Index. The CWFIS maps are countrywide, and many represent discrete variables with their own color code, in which each color stands for a range of the real-world variable. Moreover, many of these maps are related, such as those of the Fire Weather Index System or the Fire Behavior Prediction System, and offer a good testbed to analyze correlations.

Figure 6. Fire Weather Index (FWI) structure and relations between variables/maps. (Source: NRCan)

The first challenge in analyzing CWFIS maps is to convert them to a scale that is numerically significant and comparable by an algorithm. Maps in CWFIS use color codes that need to be transformed to a grayscale in which the highest value according to the color code corresponds to white and the lowest to black, or vice versa. The CWFIS WMS offers a GetLegendGraphic operation, which returns a "png" image of the map legend that associates each color with a value range. This option is useful for visual inspection of the data. Alternatively, a WMS instance could offer a Styled Layer Descriptor (SLD) document. This is an XML encoding of the legend information in a machine-readable format, accessible using a GetStyles operation. The SLD document can describe rules for rendering, and these rules can contain MinScaleDenominator and MaxScaleDenominator elements specifying the numerical values of the scale ranges. This is a valid method to communicate scale values in a machine-readable format using OGC standards. Both the GetLegendGraphic and GetStyles operations are specified in the OpenGIS SLD Specification, which is an optional extension to the WMS standard.
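As a rough illustration of the two operations, the sketch below builds key-value-pair request URLs for GetLegendGraphic and GetStyles. The endpoint `https://example.org/wms` and the layer name `example_layer` are placeholders, not NRCan values. The parameter names follow the SLD extension for WMS 1.1.1 as commonly implemented. Individual servers may expect additional parameters, so treat this as a starting point, not a definitive client.

```python
from urllib.parse import urlencode


def sld_request_url(endpoint, request, layer, version="1.1.1"):
    """Build a KVP URL for one of the two SLD-extension operations."""
    params = {"SERVICE": "WMS", "VERSION": version, "REQUEST": request}
    if request == "GetLegendGraphic":
        params["LAYER"] = layer        # legend is requested for a single layer
        params["FORMAT"] = "image/png" # human-readable legend image
    elif request == "GetStyles":
        params["LAYERS"] = layer       # response is expected to be an SLD XML document
    else:
        raise ValueError(f"unsupported request: {request}")
    return f"{endpoint}?{urlencode(params)}"


# Placeholder endpoint and layer name, for illustration only.
ENDPOINT = "https://example.org/wms"
legend_url = sld_request_url(ENDPOINT, "GetLegendGraphic", "example_layer")
styles_url = sld_request_url(ENDPOINT, "GetStyles", "example_layer")
print(legend_url)
print(styles_url)
```

A client would fetch the first URL for visual inspection and parse the response to the second URL to obtain the legend in machine-readable form, where the server supports it.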

Unfortunately, this extension does not seem to be fully implemented in NRCan’s services. The GetLegendGraphic operation works properly and provides a descriptive "png" image that a user can visually interpret. However, when the GetStyles operation is used to retrieve an SLD document, the retrieved document is empty. This means that an ML algorithm does not have a numerical reference to interpret the colors and shades in a map and cannot make proper inferences and regressions. In order to overcome this limitation, the color value is considered the value of the variable.

In the following examples, the lowest value is set to black and the highest to white, while the remaining gray gradation is assigned evenly through the different ranges. The following maps show the gray-scaled maps for Wind Speed, Fine Fuel Moisture Code and Initial Spread Index.

Figure 7. CWFIS Wind Speed.

Figure 8. CWFIS Fine Fuel Moisture Code.

Figure 9. CWFIS Initial Spread Index.
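The gray-scale assignment described above (lowest class to black, highest to white, the remaining classes spaced evenly) can be sketched as follows. The five legend colors and the 2x2 tile are hypothetical, and the function names are illustrative. In practice the ordered color list would have to be read from the legend by hand, since no machine-readable legend was available.

```python
def gray_lookup(legend_colors):
    """Map legend colors, ordered from lowest to highest class, to evenly spaced gray levels."""
    steps = len(legend_colors) - 1
    if steps < 1:
        raise ValueError("need at least two legend classes")
    return {color: round(255 * i / steps) for i, color in enumerate(legend_colors)}


def to_grayscale(pixels, lookup, nodata=None):
    """Replace each RGB pixel by its gray level; colors not in the legend become nodata."""
    return [[lookup.get(pixel, nodata) for pixel in row] for row in pixels]


# Hypothetical five-class legend, ordered from the lowest to the highest class.
LEGEND = [(0, 0, 255), (0, 255, 0), (255, 255, 0), (255, 128, 0), (255, 0, 0)]
lookup = gray_lookup(LEGEND)

# Hypothetical 2x2 map tile; the last pixel uses a color that is not in the legend.
tile = [[(0, 0, 255), (255, 0, 0)],
        [(255, 255, 0), (12, 34, 56)]]
gray = to_grayscale(tile, lookup)
print(gray)
```

Note that the resulting gray level encodes only the rank of the class, not the numeric range behind it, which is the limitation discussed above.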

According to the Fire Weather Index structure graph, the combination of the Wind Speed (WS) and Fine Fuel Moisture Code (FFMC) maps generates the Initial Spread Index (ISI) map. This derived map, when combined with the Buildup Index (BUI), generates the Fire Weather Index (FWI). Applying both MSE and SSIM methods indicates a strong correlation between FFMC and ISI (0.96 according to SSIM and 114 according to MSE) and between ISI and FWI (0.97 according to SSIM and 70 according to MSE). It is even possible to apply a simple regression and reverse-engineer the generation of ISI, as depicted in the following map. The result is close to the original ISI map, scoring 0.96 according to SSIM and 80 according to MSE.

Figure 10. CWFIS reverse-engineered Initial Spread Index.
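The report does not state which regression was used to reverse-engineer ISI. The sketch below assumes the simplest option, an ordinary least-squares fit of a linear model `isi ≈ b0 + b1 * ffmc + b2 * ws` on per-pixel values, solved through the normal equations. The five sample pixels and the coefficients used to generate them are synthetic and serve only to show that the fit recovers a known relationship.

```python
def fit_linear(features, target):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y."""
    rows = [[1.0, *f] for f in features]  # prepend an intercept column
    n = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * y for r, y in zip(rows, target))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * p for x, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def predict(coeffs, feature):
    """Evaluate the fitted linear model for one pixel."""
    return coeffs[0] + sum(c * x for c, x in zip(coeffs[1:], feature))


# Synthetic (FFMC, WS) pixel values and an ISI generated from made-up coefficients.
pixels = [(80, 10), (85, 20), (90, 5), (70, 15), (60, 25)]
isi = [2.0 + 0.1 * ffmc + 0.5 * ws for ffmc, ws in pixels]

coeffs = fit_linear(pixels, isi)
print([round(c, 6) for c in coeffs])
print(round(predict(coeffs, (75, 12)), 6))
```

Applied to real layers, the predicted map would then be compared with the published ISI map using SSIM and MSE, as done for Figure 10.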

The MSE and SSIM methods applied to all CWFIS maps yield consistent correlations, grouping maps into families according to their structural similarity. For example, Head Fire Intensity (HFI) is closely related to Rate of Spread (RoS) and Total Fuel Consumption (TFC), but these three are only loosely related to FFMC, ISI or WS. This makes sense, as HFI, RoS and TFC belong to the Canadian Forest Fire Behavior Prediction (FBP) System, whereas FFMC, ISI and WS belong to the Canadian Forest Fire Weather Index (FWI) System.

7.1.4. Conclusions

  • Feature selection requires reliable methods to compare variables and estimate correlations. SSIM, a method from the field of image processing, was tested against traditional methods such as MSE.

  • MSE seems to perform well on simple, discrete-variable maps but is less accurate at finding correlations in continuous-variable maps. SSIM is better at finding common shapes even when absolute values differ, but it is sensitive to noise.

  • Although a graphical legend is available for visually interpreting discrete-variable maps, machines require access to machine-readable formats through WMS operations, such as a descriptive SLD document retrieved through the GetStyles operation. However, these operations are not always implemented and could not be tested in Testbed-16. Continuous-variable maps present a similar challenge, as a machine needs to interpret the values of the different shades of grey but no machine-readable format is available. The underlying issue seems to be that the GetStyles operation is specified in an optional extension to WMS and is therefore only weakly supported in the OGC web services interface standards.

  • Discrete-variable maps are often just a simplification for humans, grouping values in ranges so that our eyes are able to find patterns in the isolines. However, machines are able to process more detailed maps, combine several sources and find more sophisticated patterns, making continuous-variable maps more interesting for machine learning applications.

  • Existing OGC Web Services were successfully used to retrieve different maps and evaluate their level of correlation, with the aforementioned limitations. This evaluation was done programmatically and was performant, requiring only a few seconds to compare all the layers in the Canadian Wildland Fire Information System with each other. If required, it should be feasible to run this evaluation automatically prior to training a Machine Learning algorithm.

7.1.5. Recommendations

  • Continue research on image processing methods applied to map analysis, such as SSIM, that seem to provide more accurate results than other methods, such as MSE. These methods can eventually facilitate feature selection and automate part of the process of building new training datasets.

  • Review the WMS implementations at NRCan and add an SLD document describing scale ranges, retrievable through the GetStyles operation, to allow machine learning algorithms to properly retrieve and interpret the data.

  • Evaluate the possibility of creating new sources of data by turning the discrete-variable maps into continuous-variable ones, more suitable for machine learning applications.

7.2. Ground truth selection for fuel load estimation

Training a Machine Learning algorithm requires a ground truth dataset to serve as a reference. For the challenge of Fuel Load estimation, there are two available sources in the NRCan datasets: Petawawa (RF_PRF_BIOMASS_KG_DRY_MASKED layer) and the National Forest Information System (NFIS) (tot_bio_r layer).

The Petawawa dataset is both accurate and of high resolution. However, the geographic area of coverage is small and too limited in tree species to provide a general training base. Therefore, extrapolating to a countrywide area is not possible. Visualization of the Petawawa area shows smooth gradients and consistency, as depicted in the image below.