Publication Date: 2017-05-12

Approval Date: 2017-03-23

Posted Date: 2016-12-12

Reference number of this document: OGC 16-050

Reference URL for this document: http://www.opengis.net/doc/PER/t12-DG003

Category: Public Engineering Report

Editor: Joan Masó and Alaitz Zabala

Title: Testbed-12 Imagery Quality and Accuracy Engineering Report


Testbed-12 Imagery Quality and Accuracy Engineering Report (16-050)

COPYRIGHT

Copyright © 2017 Open Geospatial Consortium. To obtain additional rights of use, visit http://www.opengeospatial.org/

WARNING

This document is an OGC Public Engineering Report created as a deliverable of an initiative from the OGC Innovation Program (formerly OGC Interoperability Program). It is not an OGC standard and not an official position of the OGC membership. It is distributed for review and comment. It is subject to change without notice and may not be referred to as an OGC Standard. Further, any OGC Engineering Report should not be referenced as required or mandatory technology in procurements. However, the discussions in this document could very well lead to the definition of an OGC Standard.

LICENSE AGREEMENT

Permission is hereby granted by the Open Geospatial Consortium, ("Licensor"), free of charge and subject to the terms set forth below, to any person obtaining a copy of this Intellectual Property and any associated documentation, to deal in the Intellectual Property without restriction (except as set forth below), including without limitation the rights to implement, use, copy, modify, merge, publish, distribute, and/or sublicense copies of the Intellectual Property, and to permit persons to whom the Intellectual Property is furnished to do so, provided that all copyright notices on the intellectual property are retained intact and that each person to whom the Intellectual Property is furnished agrees to the terms of this Agreement.

If you modify the Intellectual Property, all copies of the modified Intellectual Property must include, in addition to the above copyright notice, a notice that the Intellectual Property includes modifications that have not been approved or adopted by LICENSOR.

THIS LICENSE IS A COPYRIGHT LICENSE ONLY, AND DOES NOT CONVEY ANY RIGHTS UNDER ANY PATENTS THAT MAY BE IN FORCE ANYWHERE IN THE WORLD. THE INTELLECTUAL PROPERTY IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, AND NONINFRINGEMENT OF THIRD PARTY RIGHTS. THE COPYRIGHT HOLDER OR HOLDERS INCLUDED IN THIS NOTICE DO NOT WARRANT THAT THE FUNCTIONS CONTAINED IN THE INTELLECTUAL PROPERTY WILL MEET YOUR REQUIREMENTS OR THAT THE OPERATION OF THE INTELLECTUAL PROPERTY WILL BE UNINTERRUPTED OR ERROR FREE. ANY USE OF THE INTELLECTUAL PROPERTY SHALL BE MADE ENTIRELY AT THE USER’S OWN RISK. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR ANY CONTRIBUTOR OF INTELLECTUAL PROPERTY RIGHTS TO THE INTELLECTUAL PROPERTY BE LIABLE FOR ANY CLAIM, OR ANY DIRECT, SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES, OR ANY DAMAGES WHATSOEVER RESULTING FROM ANY ALLEGED INFRINGEMENT OR ANY LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR UNDER ANY OTHER LEGAL THEORY, ARISING OUT OF OR IN CONNECTION WITH THE IMPLEMENTATION, USE, COMMERCIALIZATION OR PERFORMANCE OF THIS INTELLECTUAL PROPERTY.

This license is effective until terminated. You may terminate it at any time by destroying the Intellectual Property together with all copies in any form. The license will also terminate if you fail to comply with any term or condition of this Agreement. Except as provided in the following sentence, no such termination of this license shall require the termination of any third party end-user sublicense to the Intellectual Property which is in force as of the date of notice of such termination. In addition, should the Intellectual Property, or the operation of the Intellectual Property, infringe, or in LICENSOR’s sole opinion be likely to infringe, any patent, copyright, trademark or other right of a third party, you agree that LICENSOR, in its sole discretion, may terminate this license without any compensation or liability to you, your licensees or any other party. You agree upon termination of any kind to destroy or cause to be destroyed the Intellectual Property together with all copies in any form, whether held by you or by any third party.

Except as contained in this notice, the name of LICENSOR or of any other holder of a copyright in all or part of the Intellectual Property shall not be used in advertising or otherwise to promote the sale, use or other dealings in this Intellectual Property without prior written authorization of LICENSOR or such copyright holder. LICENSOR is and shall at all times be the sole entity that may authorize you or any third party to use certification marks, trademarks or other special designations to indicate compliance with any LICENSOR standards or specifications.

This Agreement is governed by the laws of the Commonwealth of Massachusetts. The application to this Agreement of the United Nations Convention on Contracts for the International Sale of Goods is hereby expressly excluded. In the event any provision of this Agreement shall be deemed unenforceable, void or invalid, such provision shall be modified so as to make it valid and enforceable, and as so modified the entire Agreement shall remain in full force and effect. No decision, action or inaction by LICENSOR shall be construed to be a waiver of any rights or remedies available to it.

None of the Intellectual Property or underlying information or technology may be downloaded or otherwise exported or reexported in violation of U.S. export laws and regulations. In addition, you are responsible for complying with any local laws in your jurisdiction which may impact your right to import, export or use the Intellectual Property, and you represent that you have complied with any regulations or registration procedures required by applicable law to make this license enforceable.

Abstract

Rapidly growing geodata catalogues require tools that help users choose among products. Populating the quality fields in metadata allows users to rank and then select the best fit-for-purpose products. For example, decision-makers can find quality and uncertainty measures to make the best decisions and to compare datasets. In addition, it allows other components (such as visualization, discovery, or comparison tools) to be quality-aware and interoperable.

This ER deals with completeness, logical consistency, positional accuracy, temporal accuracy and thematic accuracy issues to improve the description of quality in imagery metadata. Based on the standardized measures of ISO 19157, UncertML and QualityML, this ER describes how to encode quality measures so that datasets can be compared. A description of pixel-level quality measures is also included. Finally, alternatives for communicating tile-level quality and the quality of mosaic products are proposed.

Business Value

This ER aims to facilitate the fit-for-purpose assessment of imagery based on quality measures. This will make determining the right data more agile. In addition, machine-readable quality reporting will allow processing tools such as WPS and WCPS to propagate quality indicators to their results.

Technology Value

This ER contributes to the Data Quality DWG activities by exploring quality documentation for imagery, which complements previous domain-specific work such as Data Quality for Citizen Science.

How this ER relates to the work of the Working Group

This ER is important for the Data Quality DWG because it continues the work initiated with the UncertML discussion paper.

Keywords

ogcdocs, testbed-12, Data quality, metadata, accuracy, uncertainty

Proposed OGC Working Group for Review and Approval

DataQuality DWG

1. Introduction

1.1. Scope

This OGC® document proposes an encoding of data quality for imagery, using metadata for dataset-level and pixel-level quality. It also deals with separating spatial resolution from spatial accuracy and uncertainty, as well as temporal extent from temporal accuracy.

This OGC® document is applicable to ISO 19157 implementations and to data processing tools such as WPS, WCPS and WMTS.

1.2. Document contributor contact points

All questions regarding this document should be directed to the editor or the contributors:

Table 1. Contacts
Name | Organization
Joan Masó | UAB-CREAF
Alaitz Zabala | UAB-CREAF

1.3. Background

The research leading to the first version of QualityML was carried out in the GeoViQua project that has received funding from the European Union Seventh Framework Programme (FP7/2010-2013) under grant agreement no. 265178.

Testbed 12 has contributed considerably to improving the QualityML catalogue, in particular the quality measures concerning imagery.

1.4. Future Work

Implement and test the suggested "last time" parameter. Note that this may not be the best approach to the time parameter, and other alternatives (such as TIME=current.land) need to be explored, as described in WMTS Time Accuracy of this document.

The results of this work have been partially applied to time accuracy in a WMTS server. Extend the proposed service to the whole A3C (Accuracy, Currency, Completeness and Consistency) framework and apply it to the WCS by means of a CIS quality extension.

The QualityML vocabulary can be proposed as an OGC discussion paper in the Data Quality DWG, in the same way that UncertML was proposed in the past.

1.5. Foreword

Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. The Open Geospatial Consortium shall not be held responsible for identifying any or all such patent rights.

Recipients of this document are requested to submit, with their comments, notification of any relevant patent claims or other intellectual property rights of which they may be aware that might be infringed by any implementation of the standard set forth in this document, and to provide supporting documentation.

2. References

The following documents are referenced in this document. For dated references, subsequent amendments to, or revisions of, any of these publications do not apply. For undated references, the latest edition of the normative document referred to applies.

3. Terms and definitions

For the purposes of this report, the following terms and definitions apply.

3.1. accuracy

closeness of agreement between a test result or measurement result and the true value.
[SOURCE: ISO 3534-2:2006, 3.3.1]

3.2. conformance

fulfillment of specified requirements.
[SOURCE: ISO 19105:2000, 3.8]

3.3. data quality basic measure

generic data quality (4.21) measure used as a basis for the creation of specific data quality measures.
[SOURCE: ISO 19157:2013, 4.7]

3.4. quality

degree to which a set of inherent characteristics fulfills requirements.
[SOURCE: ISO 9000:2005, 3.1.1]

3.5. tile

a rectangular pictorial representation of geographic data, often part of a set of such elements, covering a spatially contiguous extent and sharing similar information content and graphical styling, which can be uniquely defined by a pair of indices for the column and row along with an identifier for the tile matrix.
[SOURCE: OGC 07-057r7, 4.11]

3.6. tile matrix

a collection of tiles for a fixed scale
[SOURCE: OGC 07-057r7, 4.12]

3.7. uncertainty

parameter, associated with the result of measurement, that characterizes the dispersion of values that could reasonably be attributed to the measurand.
[SOURCE: JCGM 100:2008]

4. Conventions

4.1. UML notation

Most diagrams that appear in this report are presented using the Unified Modeling Language (UML) static structure diagram, as described in Subclause 5.2 of [OGC 06-121r9].

4.2. Data quality measures dictionary tables

The data quality measures dictionary is specified herein in a series of tables. The contents of the columns in these tables are described in the next Table.

ISO 19157 provides a long list of quality measures. To define these metrics in a formal way, we are using a data structure proposed in annex D of ISO 19157 with some modifications:

Table 2. Data Structure to define quality measures used in this section
Line | Component | Observation
1 | Name | Name is the name of the measure.
2 | Alias | Alias is another recognized name for the same data quality measure. It may be a different commonly used name, or an abbreviation or a short name. More than one alias may be provided.
3 | Element name | Element name is the name of the data quality element to which a measure applies. More than one element name may be provided.
4 | Basic measure | If a measure is based on one of the basic measures, it shall be described by its name, definition and value type. Basic measures are identified by their names. A variety of measures are based on counting of erroneous items. There are also several measures dealing with the uncertainty of numerical values. In order to avoid repetition, the most common methods of constructing count-related measures as well as general statistical measures for one- and two-dimensional random variables should be defined in terms of basic measures. The basic measures should also be used for creating new measures if applicable, for example to report non-closed surface patches or other application-dependent measures.
5 | Definition | If the measure is derived from a basic measure, the definition is based on the basic measure definition and specialized for this measure.
6 | Description | Description is the description of the measure including methods of calculation, with all formulas and/or illustrations needed to establish the result of applying the measure. If the measure uses the concept of errors, it should be stated how an item is classified as incorrect. This is the case when the quality can only be reported as correct or incorrect.
7 | Parameter | Parameter is an auxiliary variable used by the measure. It shall include name, definition and value type. More than one measure parameter may be provided.
8 | Value type | Not used
9 | Value structure | Not used
10 | Source reference | Source reference is the citation of the documentation of the measure. When a measure, for which additional information is provided in an external source, is added to the list of standardized measures, a reference to that source may be provided here.
11 | Example | Example is an example of applying the measure or the result obtained for the measure. More than one example may be provided.
12 | Identifier | Identifier is a value uniquely identifying a measure within a namespace.
13 | Domain | Replaces Value type and Value structure in ISO 19157
14 | Metrics | Replaces Value type and Value structure in ISO 19157

The contents of these data dictionary tables, including any table footnotes, are normative and have been transposed to QualityML.

5. Overview

Rapidly growing geodata catalogues require tools that help users choose among products. Populating the quality fields in metadata allows users to rank and then select the best fit-for-purpose products. For example, decision-makers can find quality and uncertainty measures to make the best decisions and to compare datasets. In addition, it allows other components (such as visualization, discovery, or comparison tools) to be quality-aware and interoperable.

This ER covers the following aspects of data quality: completeness, logical consistency, positional accuracy, temporal accuracy and thematic accuracy, in order to improve the description of quality for imagery in the metadata.

In particular, the ER addresses the following issues:

  • Quality parameters offer a way to compare datasets and determine which data is fit for purpose. Standard vocabularies and taxonomies are paramount to describe data quality and allow comparison. GeoViQua, a 3-year project in the 7th Framework Programme of the European Commission (coordinated by UAB-CREAF), worked on different aspects of data quality and data visualization. One of the outcomes of the project was the QualityML vocabulary. This vocabulary is an extension of UncertML (version 1 of this community standard is an OGC discussion paper). It provides a common solution for all quality indicators described in the former ISO 19138, now replaced by ISO 19157. It also proposes a clear encoding in XML metadata documents (see http://www.QualityML.org/). QualityML has been reviewed and extended for imagery in this activity. The A3C (Accuracy, Currency, Completeness and Consistency) quality framework, an instrument to compare imagery from multiple sources and determine which one best fits the purpose, has also been integrated. These topics, and the relations among them, are discussed in A3C Framework, ISO Quality Framework, A3C mapped to ISO and QualityML vocabulary encoding.

  • A list of quality measures for raw imagery and higher level raster derived data that follows the A3C model (and uses ISO 19157 and QualityML) is conceptually described in Quality Measures Raster.

  • Traditionally, quality reports were associated with product specifications, resulting in quality metrics common to all the elements of the dataset. The XML encoding of quality metadata at the dataset level, using QualityML to give semantics to quality measures, is discussed in Dataset Levels.

  • Recently, per-pixel or per-object quality is also being included (e.g. MODIS and Sentinel quality bands). Intermediate levels can also be defined, including those that arise because users visualize data at their preferred level of resolution (scale denominator). An example is a tiling system, where tiles can be considered visualization units with an associated quality description. Therefore, a general way of expressing quality at levels below the dataset is needed. These topics are discussed in Pixel Level and WMTS Time Accuracy.

  • Data aggregation and data fusion are examples of simple processes where the resulting dataset needs a complex definition of quality that integrates the quality indicators of each individual data source. This topic is discussed in Quality Big Mosaic.

5.1. A3C quality framework

The A3C (Accuracy, Currency, Completeness and Consistency) quality framework was introduced by DigitalGlobe as an instrument to compare imagery from multiple sources and determine which one best fits the purpose. This model was initially suggested in Annex B of the Testbed 12 Request for Quotations, and the original text is included in Appendix A3C for convenience.

The final aim is to generate a standard way of defining quality measurements so that images can be compared on the basis of these measures.

5.1.1. A3C description

The A3C framework defines four quality classes (Accuracy, Currency, Completeness, and Consistency) as follows:

  • Accuracy refers to spatial accuracy of a location derived from the pixel in X,Y dimensions and potentially in Z dimension.

  • Currency refers to the temporal extent of the imagery and products used to cover the associated area, since multiple dates of collections are typically required to cover a large area.

  • Completeness of imagery and products refers to quality metrics including cloud cover, sensor specs on collection geometry, temporal range of the data, other spectral bands if any, radiometric depth of the pixels, etc.

  • Consistency metric describes the consistency of colors, relative accuracy over time and over different sensors, spectral and spatial error propagation from collection to production, etc.

The A3C model is designed with fitness for purpose in mind. It includes elements that ISO 19157 considers data quality, complemented by other aspects found in other classes of the more general ISO 19115-1 metadata describing the datasets.

This ER links the A3C framework to ISO 19157 and QualityML vocabularies, and proposes encodings compatible with common metadata standards. In addition to this ER, implementations of this combined framework have been developed in Testbed 12 such as a WMTS server that implements the time accuracy part and a WCS that provides support to the complete framework.

5.1.2. Accuracy

Different sources of information come with different location accuracy. This section focuses on horizontal accuracy.

One common indicator of this accuracy is CE90. It is defined in ISO 19157 in Table D.46, "Circular map accuracy standard".

Table 3. Circular map accuracy standard as defined by ISO 19157 and with QualityML identifiers
Line | Component | Description
1 | Name | circular map accuracy
2 | Alias | circular map accuracy standard (CMAS)¹
3 | Element name | absolute or external accuracy
4 | Basic measure | CE90¹
5 | Definition | radius describing a circle, in which the true point location lies with a certain probability
6 | Description | [Equation image: CE90] The equation expresses the CE90¹ measure that, for some particular probabilities, can be calculated depending on the standard deviations σx and σy.² The one-dimensional standard deviation σ for each dimension (represented by a Z) can be estimated by: [Equation image: LE68.3]
7 | Parameter | "level" for level of confidence
10 | Source reference | ISO 19157 Id. 42 (39.4%), Id. 43 (50%), Id. 44 (90%), Id. 45 (95%), Id. 46 (99.8%), unified in QualityML
11 | Example | 20 m
12 | Identifier | http://www.qualityml.org/1.0/measure/CircularMapAccuracy
13 | Domain | http://www.qualityml.org/1.0/domain/DifferentialErrors2D
14 | Metrics | http://www.qualityml.org/1.0/metrics/Half-lengthConfidenceInterval

¹ only when confidence interval is 90%
² the precise definition was extracted from annex G.3.3 "Two-dimensional random variable Χ and Υ"
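As an illustration of how a circular error at a given "level" can be derived from σx and σy, the following minimal sketch uses the closed form for a circular normal distribution. This is an assumption made here for illustration (independent, unbiased errors with similar spread in both axes); the normative formulas are those in ISO 19157. The function names and the check-point errors are hypothetical.

```python
import math

def sigma_from_errors(errors):
    """Estimate a one-dimensional standard deviation from signed errors
    (measured minus reference), assuming the mean error is zero (no bias)."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def circular_error(sigma_x, sigma_y, level=0.90):
    """Radius of the circle containing the true position with probability
    `level`, under the circular normal approximation (an assumption)."""
    if not 0.0 < level < 1.0:
        raise ValueError("level must be strictly between 0 and 1")
    # Circular standard deviation combining both axes.
    sigma_c = math.sqrt((sigma_x ** 2 + sigma_y ** 2) / 2.0)
    # For a circular normal distribution, P(r <= R) = 1 - exp(-R^2 / (2 sigma_c^2)).
    return sigma_c * math.sqrt(-2.0 * math.log(1.0 - level))

# Hypothetical check-point errors in metres (illustrative values only).
dx = [3.0, -4.0, 5.0, -2.0]
dy = [-1.0, 6.0, -3.0, 4.0]
ce90 = circular_error(sigma_from_errors(dx), sigma_from_errors(dy), level=0.90)
print(f"CE90 = {ce90:.1f} m")
```

With σx = σy = 1 the multipliers obtained for the 39.4%, 50%, 90% and 95% levels are approximately 1.000, 1.177, 2.146 and 2.448, which is why a single measure with a "level" parameter can unify the separate ISO 19157 measures.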

For example, in the DigitalGlobe constellation, the older satellites (Quickbird and Ikonos) have a positional accuracy of 20 m CE90. The new generation has a positional accuracy of 3 m, and in the future 1 m could be reached for satellites orbiting at 400 miles.

To estimate this positional accuracy, several factors have to be taken into account, which leads to different approaches for determining it.

  • Sensor accuracy: Some satellites use a push-broom sensor (which sweeps the ground like a broom). In this case, there can be column errors arising from the stability of the DCA (detector chip array) and row errors arising from the stability of the satellite while it sweeps the Earth (obtained through ephemerides and validation sites on Earth). Using this information, a rough estimate of image positional accuracy can be made.

  • Product accuracy (x,y,z): the accuracy of the processed pixel location on the ground. Further errors are introduced when the final product is elaborated, e.g. by the orthorectification process using a DEM (Digital Elevation Model) and GCPs (Ground Control Points). The accuracy and resolution of the DEM can add further uncertainty to the positional accuracy.

It is important to clarify that the same quality measure would be used for any of these approaches; it is therefore also important to provide information on the evaluation method used for a particular metric.

5.1.3. Currency

The immediate aspect of currency is how recently the scene was taken. This can be found in the time stamp when the image was collected (initial and end time). If the image is being used as the "current" one, this introduces uncertainties in the features we are seeing in the images, if they are not recent enough (and some elements may have changed or moved).

A more elaborate use case with currency issues appears with the temporal uncertainty of a large image created by mosaicking multiple paths and scenes. For example, temporal accuracy can be provided as "90% of the mosaic is as current as 3 days or less". The temporal uncertainty can be bigger in mosaics over large territories (e.g. the African continent). It can also be described as an overall measure (aggregation: "no pixel is older than a value t"). This can be provided at the pixel level as well.
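The two kinds of statement quoted above can be derived from per-pixel acquisition dates. The following minimal sketch assumes one acquisition date per pixel (or per equal-area cell) and a reference date against which age is measured; the function name and the sample dates are hypothetical.

```python
from datetime import date

def currency_summary(acquisition_dates, reference_date, max_age_days):
    """Return (share of pixels no older than max_age_days, oldest age in days)."""
    ages = [(reference_date - d).days for d in acquisition_dates]
    within = sum(1 for a in ages if a <= max_age_days)
    return within / len(ages), max(ages)

# Hypothetical mosaic of ten cells built from three acquisition dates.
pixels = [date(2016, 6, 9)] * 6 + [date(2016, 6, 8)] * 3 + [date(2016, 5, 30)]
share, oldest = currency_summary(pixels, date(2016, 6, 10), max_age_days=3)
print(f"{share:.0%} of the mosaic is 3 days old or less; "
      f"no pixel is older than {oldest} days")
```

The first value corresponds to a percentile-style statement, and the second to the overall aggregation ("no pixel is older than a value t").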

This is in line with the WMTS service developed in Testbed 12 and explained in WMTS Time Accuracy. In this case, each tile can be tagged with a time accuracy estimation. For low scale tiles there will be a time accuracy estimation for a small area. For higher scales there will be more aggregated values and, in the extreme, a scale is reached where the whole extent fits in one tile; in this case the tagged quality will be the global time accuracy of the composed mosaic.
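The aggregation from tiles covering a small area up to a single tile covering the whole extent can be sketched as follows. This is only an illustration and not the implementation used in the Testbed: it assumes a quad-tree style pyramid in which each parent tile covers a 2x2 block of child tiles (WMTS itself does not require this), and it summarizes currency as the oldest and newest acquisition dates inside a tile. All names and dates are hypothetical.

```python
from datetime import date

def aggregate_level(tiles):
    """Merge each 2x2 block of child tiles into one parent tile.

    `tiles` maps (row, col) to (oldest_date, newest_date) of the imagery
    inside that tile; the parent keeps the overall oldest and newest dates."""
    parents = {}
    for (row, col), (oldest, newest) in tiles.items():
        key = (row // 2, col // 2)
        if key in parents:
            p_old, p_new = parents[key]
            parents[key] = (min(p_old, oldest), max(p_new, newest))
        else:
            parents[key] = (oldest, newest)
    return parents

# Hypothetical 2x2 tile matrix at the most detailed level.
detailed = {
    (0, 0): (date(2016, 6, 1), date(2016, 6, 1)),
    (0, 1): (date(2016, 6, 3), date(2016, 6, 4)),
    (1, 0): (date(2016, 5, 20), date(2016, 5, 20)),
    (1, 1): (date(2016, 6, 2), date(2016, 6, 2)),
}
whole_extent = aggregate_level(detailed)
print(whole_extent[(0, 0)])
```

Applying the function repeatedly eventually yields a single tile whose tag describes the whole mosaic.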

This is particularly important for the disaster management use case, where currency is the most important factor.

5.1.4. Completeness

This is also related to mosaics of large areas (e.g. a mosaic of the whole African continent). For a large extent it may not be possible to generate a completely cloud-free image. In this case the percentage of the data tagged as NoData can be one of these metrics (e.g. 0.5 % of the area is missing). We can go further and define the percentage of the mosaic covered by clouds. The same could be done for cloud shadows, but this is much more difficult and the industry is not yet able to report it.

The presence of cloud information can also be exposed by a WMTS, which could make use of this information. Sometimes products have a "cloud" raster layer; at other times the clouds are stored as polygons in a database.
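The two percentages mentioned above can be computed from a classified mask. The following minimal sketch assumes that every cell covers the same ground area and that NoData and cloud cells have already been identified; the mask encoding and the function name are hypothetical.

```python
def completeness_report(cells, nodata="N", cloud="C"):
    """Percentages of NoData and cloud-covered cells in a classified mask."""
    flat = [c for row in cells for c in row]
    total = len(flat)
    return {
        "nodata_percent": 100.0 * flat.count(nodata) / total,
        "cloud_percent": 100.0 * flat.count(cloud) / total,
    }

# Hypothetical 4x5 mask: "." valid, "N" NoData, "C" cloud.
mask = [
    "....C",
    "...CC",
    "N....",
    ".....",
]
report = completeness_report(mask)
print(report)
```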

5.1.5. Consistency

Consistency in a GIS vector context is easily understood and is measured through a conformity test against the data model specification. In a remote sensing context there is no theoretical pattern to compare to, and usually only a small number of field measurements (e.g. field radiometry records) are available to correct image radiometry and make it consistent over time.

The weather and the state of the atmosphere change over time. Important factors that influence image quality are the presence of aerosols and water vapor. These can differ even between areas of a single scene, changing the quality of portions of the image.

There are different levels of radiometric corrections involved:

  • Raw data is expressed in digital numbers (DN). In this case, the quality of the image is conditioned by the atmosphere but also by the reflectivity of the atmosphere due to the angle of the sun.

  • The first level of correction is to obtain the Top of Atmosphere (TOA) reflectance, where the reflectivity of the atmosphere due to the angle of the sun is eliminated but the atmospheric effects are still present. This can be achieved by calibrating the sensor (cal-val operations).

  • A second level of correction is to calculate the Surface Reflectance (SR) using atmospheric models. In this case we need to know whether the atmospheric data was captured at the scene level or at a coarser level (e.g. an aerosol pixel image). With modern techniques, aerosols can be estimated at a 2 m resolution.

  • In addition the Surface Reflectance can be obtained by introducing information about pseudo invariant areas (PIAs).
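The first correction level in the list above can be sketched with the widely used relation between calibrated radiance and TOA reflectance. This relation is not taken from this report, and the calibration coefficients, solar irradiance value and scene geometry below are hypothetical; real values come from the sensor calibration and the scene metadata.

```python
import math

def dn_to_radiance(dn, gain, offset):
    """Linear sensor calibration: at-sensor radiance from a digital number."""
    return gain * dn + offset

def toa_reflectance(radiance, esun, sun_elevation_deg, earth_sun_au=1.0):
    """TOA reflectance, normalising for solar irradiance and sun angle."""
    zenith = math.radians(90.0 - sun_elevation_deg)
    return math.pi * radiance * earth_sun_au ** 2 / (esun * math.cos(zenith))

# Hypothetical calibration coefficients and scene geometry (illustrative only).
radiance = dn_to_radiance(dn=500, gain=0.1, offset=1.0)
rho = toa_reflectance(radiance, esun=1500.0, sun_elevation_deg=30.0)
print(f"radiance = {radiance:.1f}, TOA reflectance = {rho:.3f}")
```

Surface Reflectance requires an atmospheric model on top of this step and is deliberately left out of the sketch.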

Consistency may be understood as applying within a single image, the whole product series (the entire historical archive of the instrument), the complete constellation (e.g. among Worldview satellites) or even across different constellations (e.g. Landsat 8 and Sentinel 2).

5.2. ISO quality framework

The ISO standards coming from TC 211 form a collection of standard documents dealing with different aspects of Geographic Information and Geomatics.

Current ISO standards dealing with data quality are:

  • ISO 19157 Data quality unifies almost all aspects of data quality in a single document: what data quality is, quantitative/conformance measures and metadata encoding

  • ISO 19115-1 Metadata specifies only lineage/usage/purpose

  • ISO 19115-2 Metadata for imagery specifies some extensions to 19115. In particular, it introduces the idea of a coverage to specify pixel-level quality. ISO 19115-2 is in the revision period and contributions are welcome to improve it.

  • ISO 19115-3 XML schema implementation for Metadata specifies XML encoding rules for ISO 19xxx standards

These replace some previous well-known standards:

  • ISO 19113 Data quality principles specifies what is data quality and what is not

  • ISO 19138 Data quality measures specifies a long list of quantitative/conformance measures for data quality.

  • ISO 19115 Metadata specifies how to encode quantitative/conformance measures in ISO metadata.

The following subclauses introduce the main concepts relevant for data quality:

  • Quality classes and structure of the quality measures

  • Other quality-related elements

  • Other metadata elements indirectly related to quality

5.2.1. Quality classes and structure of the quality measures

According to ISO 19157 there are seven quality elements (or classes) describing certain aspects of the quality of a geographic dataset:

  • completeness: presence and absence of features, their attributes and relationships;

  • logical consistency: degree of adherence to logical rules of data structure, attribution and relationships (data structure can be conceptual, logical or physical);

  • positional accuracy: accuracy of the position of features within a spatial reference system;

  • thematic accuracy: accuracy of quantitative attributes and the correctness of non-quantitative attributes and of the classifications of features and their relationships;

  • temporal quality: accuracy of the temporal attributes and temporal relationships of features;

  • usability: Usability is based on user requirements. All quality elements may be used to evaluate usability;

  • metaquality: a set of quantitative and qualitative statements about a quality evaluation and its result. The knowledge about the quality and the suitability of the evaluation method, the measure applied and the given result may be of the same importance as the result itself.

These quality classes (or elements) can be broken down into several quality indicators (or sub-elements).

Table 4. Data quality classes and subclasses
data quality element | data quality subelement | definition
completeness | commission | excess data present in a dataset
completeness | omission | data absent from a dataset
logical consistency | conceptual consistency | adherence to rules of the conceptual schema
logical consistency | domain consistency | adherence of values to the value domains
logical consistency | format consistency | degree to which data is stored in accordance with the physical structure of the dataset
logical consistency | topological consistency | correctness of the explicitly encoded topological characteristics of a dataset
positional accuracy | absolute or external accuracy | closeness of reported coordinate values to values accepted as or being true
positional accuracy | relative or internal accuracy | closeness of the relative positions of features in a dataset to their respective relative positions accepted as or being true
positional accuracy | gridded data positional accuracy | closeness of gridded data position values to values accepted as or being true
thematic accuracy | classification correctness | comparison of the classes assigned to features or their attributes to a universe of discourse (e.g. ground truth or reference dataset)
thematic accuracy | non-quantitative attribute correctness | correctness of non-quantitative attribute
thematic accuracy | quantitative attribute accuracy | accuracy of quantitative attributes
temporal quality | accuracy of a time measurement | correctness of the temporal references of an item (reporting of error in time measurement)
temporal quality | temporal consistency | correctness of ordered events or sequences, if reported
temporal quality | temporal validity | validity of data with respect to time
usability | all quality elements | based on user requirements
metaquality | confidence | trustworthiness of a data quality result
metaquality | representativity | degree to which the sample used has produced a result which is representative of the data within the data quality scope
metaquality | homogeneity | expected or tested uniformity of the results obtained for a data quality evaluation

The definition of DQ_Element and its classes and sub-classes, as described in ISO 19157, is shown in the next figure:

Figure 1. DQ_Element as defined in ISO 19157

When describing the quality of geographic data, different quality elements and different subsets of the data may be considered. In order to describe these, data quality units are used. A data quality unit is the combination of a scope and data quality elements.

The scope of the data quality unit(s) specifies the extent, spatial and/or temporal, and/or common characteristic(s) that identify the data on which data quality is to be evaluated. One data quality scope shall be specified for each data quality unit.

Moreover, an evaluation of a data quality element is described by the following:

  • measure – the type of evaluation

  • evaluation method – the procedure used to evaluate the measure

  • result – the output of the evaluation (includes value type, unit and date)

Figure 2. Data quality units and data quality element descriptors as defined in ISO 19157

Thus, seven items should be provided to describe a quality subelement:

  • data quality scope;

  • data quality measure;

  • data quality evaluation method;

  • data quality result;

  • data quality value type;

  • data quality value unit;

  • data quality date.
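As a minimal sketch, the seven items can be held together in a single record. The class name, field names and example values below are illustrative assumptions; they are not identifiers or data taken from ISO 19157 or its XML encoding.

```python
from dataclasses import dataclass, fields
from datetime import date


@dataclass(frozen=True)
class QualitySubelementReport:
    """Hypothetical container for the seven items listed above."""
    scope: str              # data on which quality is evaluated
    measure: str            # type of evaluation
    evaluation_method: str  # procedure used to evaluate the measure
    result: float           # output of the evaluation
    value_type: str         # type of the result value
    value_unit: str         # unit of the result value
    result_date: date       # date associated with the result


# Hypothetical values, for illustration only.
report = QualitySubelementReport(
    scope="dataset",
    measure="mean value of positional uncertainties",
    evaluation_method="comparison against ground control points",
    result=5.0,
    value_type="Real",
    value_unit="m",
    result_date=date(2016, 12, 12),
)

item_names = [f.name for f in fields(report)]
print(len(item_names), item_names)
```

Freezing the record is a design choice of this sketch: a quality report describes one completed evaluation, so it should not change after it is created.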

Results can be:

  • Quantitative results (may be a single value or multiple values, depending on the values of attributes valueType and valueStructure defined in the description of the measure applied)

  • Conformance results (in this case the "value" is a Boolean "yes" or "no")

  • Descriptive results (in this case the "value" is a free text "statement")

  • Coverage results (in this case the "value" is an image link)

Figure 3. Alternatives for the DQ_Result

Several non-quantitative metadata elements also carry information related to data quality:

  • purpose: describes the rationale for creating a dataset and contains information about its intended use.

  • lineage: describes the history of a dataset and, inasmuch as is known, recounts the life cycle of a dataset from collection and acquisition through compilation and derivation to its current form.

  • usage: describes the application(s) for which a dataset has been used or uses of the dataset by the data producer or by other, distinct, data users.

In FP7 GeoViQua (http://www.geoviqua.org) the usage concept was extended to geospatial user feedback (GUF). GUF is a form of metadata derived from the experience that users gain by using the data. The GUF SWG developed a standard in two separate documents:

  • 15-097r1 Geospatial User Feedback Standard. Conceptual model

  • 15-098r1 Geospatial User Feedback Standard. XML Encoding extension

Recently, the OGC TC approved these two documents as international standards. This work is probably out of scope for this ER, but it is worth mentioning for completeness.

There are some elements that can be used for "fit for purpose" assessment but are not considered parts of the data quality in ISO:

  • spatial resolution: this information is commonly confused with the spatial accuracy. The spatial resolution is related to the pixel size chosen to encode the data in a raster/coverage format, while the spatial accuracy refers to the deviation of the geographic position of the pixel from its real ground position. The two are often related, but they are not the same. The encoding of spatial resolution falls outside data quality; ISO 19115 explains how to encode it.

  • temporal extent: a similar confusion arises between the temporal extent and the temporal accuracy. The temporal extent indicates the interval of time (e.g. as a start time and an end time) covered by the data in the image. For a remote sensing image, it is the date on which the image was acquired; for a photo-interpreted map, it is the acquisition date of the underlying image, not the date on which the photo-interpretation was carried out. The temporal accuracy, in contrast, refers to the uncertainty of an individual time measurement. In ISO 19115 this is related to the extent of the data (in ISO 19115, extent is a multidimensional cube that gives information on BBOX, temporal extent and elevation interval).

5.3. Mapping the A3C model to ISO TC211 model

5.3.1. Accuracy

A3C accuracy can be expressed using the DQ_PositionalAccuracy quality element from ISO 19157 (including AbsoluteExternalPositionalAccuracy, RelativeInternalPositionalAccuracy or GriddedDataPositionalAccuracy). This is valid for X and Y. If the CRS is a 3D CRS, the Z dimension can also be included. If Z is considered an attribute, ISO 19157 includes DQ_QuantitativeAttributeAccuracy, which could also be used to describe, for example, the elevation accuracy in a Digital Elevation Model (DEM).

In the original definition of the A3C model in the Annex B of the Request for Quotation, there is no mention of thematic accuracy. This could be an important component for vector data but also for imagery representing quantitative variables (e.g. temperatures) and categorical variables (e.g. Land cover classes).

Therefore, the proposed measures will include the thematic accuracy classes: DQ_QuantitativeAttributeAccuracy, DQ_ThematicClassificationCorrectness and DQ_NonQuantitativeAttributeCorrectness.

5.3.2. Currency

Currency refers to the temporal extent of the imagery products used to cover the associated area, since multiple dates of images are typically required to cover a large area. This is covered by ISO 19115-1 in EX_Extent and the specialization EX_TemporalExtent.

In addition, other elements of the ISO 19115-1 model could be used to reflect the frequency with which the producer plans to update the dataset. This is not directly related to the currency of the data, but it can serve as a proxy when the EX_TemporalExtent is not indicated. If the producer intends to update the dataset every 15 days, the probability of having a current dataset is clearly higher than if maintenance is planned every 10 years. Please note that using these elements to infer currency is not perfect: even if the dataset is updated frequently, the updated version could be another dataset and not the one at hand.

In the figure below, resourceMaintenance points to a MD_MaintenanceInformation with:

  • maintenanceAndUpdateFrequency: frequency with which changes and additions are made to the resource after the initial resource is completed (e.g. daily, weekly, …​)

  • userDefinedMaintenanceFrequency: Frequency maintenance period other than those defined in MD_MaintenanceAndUpdateFrequencyCode

  • maintenanceDate: date information associated with maintenance of resource

Figure 4. Maintenance information

Data quality classes can also be used in combination with EX_TemporalExtent. This is recorded using DQ_TemporalQuality. Temporal quality can represent different aspects concerning the temporal tags of the data, exemplified in the following use cases:

  • To assess the quality of the temporal measurements one can use DQ_AccuracyOfATimeMeasurement.

  • When combining data covering different EX_TemporalExtent’s to produce a new product, the result can increase the time uncertainty and this can be recorded in DQ_AccuracyOfATimeMeasurement.

  • When the time is uncertain, the sequence of events can be compromised (not being able to identify the most current image); this is reflected in DQ_TemporalConsistency (correctness of ordered events or sequences).

  • DQ_TemporalValidity is defined as the validity of time measurements with respect to the expected time domain (expressed in EX_TemporalExtent), and can be used to describe a rate of items older than a specific date.

5.3.3. Completeness

ISO defines completeness in terms of commission (DQ_CompletenessCommission) and omission (DQ_CompletenessOmission).

In imagery products, cloud cover is usually documented as quality flags stored in an additional quality band. A quality band can be added either by defining it in MD_Band (new in ISO 19115-1) or by adding a quality measure with a QE_CoverageResult (ISO 19115-2); the alternatives are described in Pixel level Quality.

In some cases, the sensor specifications force the collection of data under geometrical restrictions that leave gaps between scenes or paths, creating data omissions. For similar reasons, satellites in near-polar orbits revisit zones near the equator less frequently, leaving gaps for some time periods. Omissions can also be caused by missing bands due to sensor failure. In some situations, measurements saturate the dynamic range of the sensor or of its digital representation, resulting in omission of information.

5.3.4. Consistency

The radiometric processing performed on the image should be documented in lineage (statement or processStep). This metadata element can help in understanding how consistent each image is with the rest of the time series.

Moreover, DQ_ConceptualConsistency and DQ_DomainConsistency can be used to record inconsistencies over time that originate in poor sensor calibration and bias. These produce inconsistent colors and relative inaccuracy over time. Time series can be built by combining old and new platforms and sensors, resulting in non-homogeneous results. Spectral and spatial error propagation from collection to production can also affect consistency (in particular if processing chains change over time).

5.4. QualityML quality vocabulary and encoding

GeoViQua, a 3-year project of the 7th Framework Programme of the European Commission coordinated by UAB-CREAF, worked on different aspects of data quality and data visualization. One of the outcomes of the project was the Quality Mark-up Language (QualityML). QualityML is both a vocabulary and an encoding for data quality. QualityML is an extension of UncertML (version 1 of this community standard is an OGC discussion paper). The vocabulary provides a common solution for all quality indicators described in the former ISO 19138, which has been superseded by ISO 19157. It also proposes a clear encoding of quality elements (using standardized quality measures) in XML metadata documents (see www.qualityML.org). QualityML has been reviewed and extended in this activity.

5.4.1. UncertML

In 2009, OGC announced the release of UncertML as a discussion paper (http://www.opengeospatial.org/pressroom/pressreleases/1002). "UncertML is a conceptual model and XML encoding designed for encapsulating probabilistic uncertainties and may be used to quantify and exchange complex uncertainties in data. Most data contains uncertainty arising from sources such as measurement error, observation operator error, processing/modeling errors, or corruption. Processing uncertain data propagates and often increases uncertainty. Thus there is a need for a standard way of characterizing uncertainty that is readily interpreted by software systems. UncertML is based on a number of ISO and OGC standards, such as ISO 19138 Data Quality Measures, and addresses the ISO/IEC guide to the expression of uncertainty in measurement (GUM). UncertML utilizes the OGC Geography Markup Language (GML) Standard and the OGC Sensor Web Enablement Common (SWE) Standard."

Later, UncertML was adopted by the UncertWeb project and a version 2.0 of it emerged as a well documented proposal maintained by a small community: http://www.uncertml.org.

5.4.2. Why QualityML

The problem with UncertML is that it only deals with one way of measuring the data quality: uncertainty. This works well with some measurements of accuracy but does not cover other quality measures like consistency, completeness etc. We needed a way to extend the UncertML approach to all other quality indicators and to harmonize them with the measurement definitions that ISO 19157 is providing and many more that are used in other domains.

5.4.3. How to understand QualityML

QualityML (http://www.qualityML.org) is a data model, a vocabulary and an encoding for data quality.

To report quantitative or conformance quality in QualityML, you need to combine the following concepts:

  • quality class (called quality elements in ISO 19157): the seven elements described in ISO 19157 are considered

  • quality indicator (called quality sub-elements in ISO 19157): the eighteen elements described in ISO 19157 are considered

  • quality measure (called quality measures in ISO 19157 and described in its Annex D): the quality measure used to describe the quality element

  • quality domain (or measurement field): the field to which the metric is applied. Examples are: "deltas" between the 1D expected value and the real value, 3D "deltas", individual wrong measurements, predicted or actual values,…​

  • quality metric (which includes a metric name, metric description, metric parameter, its values and units of measure): the mathematical expression that is applied to the "measurement field" and results in a value that can be a count, a floating point average or a Boolean yes/no.

Table 5. Example of conceptual encoding using QualityML: an omission of 15% of the items in a dataset
| Concept | Value | Comments |
| --- | --- | --- |
| quality indicator | DQ_CompletenessOmission | |
| quality measure | Missing items | More information in QualityML definition |
| quality domain | Non conformance | More information in QualityML definition |
| domain parameters | Rule: Forest | needed if the omission of a single category is reported |
| quality metric | Items count | Can be a choice of: indicator (boolean); count (int); rate (real, attribute: max (real)). More information in QualityML definition |
| metric parameters | Rate: 66 | needed if rate is selected to describe Items count |
| metric parameters | Maximum: 100 | needed if rate is selected to describe Items count |
| metric units of measure | none | |

5.4.4. From ISO 19157 to QualityML

The main idea behind QualityML is that it can be used to describe quality elements in metadata in a standardized way that also carries semantics (as it points to the included definitions), allowing easy comparison of products.

On the one hand, QualityML is a profile of the ISO geospatial metadata standards (e.g. ISO 19157) that provides a set of rules, structured in 6 levels, for precisely documenting quality indicator parameters. On the other hand, QualityML includes semantics and vocabularies for the quality concepts.

Whenever possible, QualityML uses statistical expressions from the UncertML dictionary (http://www.uncertml.org) encoding. However, it also extends UncertML to provide a list of alternative metrics that are commonly used to quantify quality beyond the uncertainty concept, for example the ones used in ISO 19157.

Quality metrics are used to compute the result of each quality measure value when applied to a certain domain. QualityML provides a matrix of the commonly used combinations of indicators, measures, domains and metrics.

The main idea behind this structure is to unlink measures, domains and metrics description, in order to maximize generalization of descriptions and increase coherence among several measures using the same metrics (even with different domains), or several quality indicators using the same measures.

Unifying different ISO Quality Basic Measures into a single QualityML Quality Metrics (with parameters)

ISO 19157, in its Annex D, introduces the concept of data quality basic measure to avoid the repetitive definition of the same concept. There are data quality measures that have certain commonalities. Two principal categories of data quality basic measures are listed in the Annex. The uncertainty-related data quality basic measures are based on the concept of modeling the uncertainty of measurements with statistical methods. The measured quantity can be embedded in different dimensions. Depending on the dimension of the measured quantity, different types of data quality basic measures are used to construct data quality measures. The counting-related data quality basic measures are based on the concept of counting errors or correct items.

Uncertainty-related data quality basic measures

Several uncertainty-related measures are described in ISO linked to a common Basic measure, for example:

Table 6. Measures and Basic measures in ISO 19157
| Measure name | Element name | Basic measure | Identifier |
| --- | --- | --- | --- |
| linear error probable | absolute or external accuracy | LE50 or LE50(r) | 33 |
| time accuracy at 50% significance level | accuracy of a time measurement | LE50 or LE50(r) | 55 |
| attribute value uncertainty at 50% significance level | quantitative attribute accuracy | LE50 or LE50(r) | 69 |

QualityML goes one step beyond this generalization effort and also groups basic measures that describe the same metric with different parameters. For example, all the measures regarding "half length of the interval" are grouped in a single general metric called Half-lengthConfidenceInterval, which includes a parameter to describe the confidence level (or probability) of the true value lying between the lower and the upper limit. The level has to be in the range [0,1]. This QualityML metric includes several ISO 19157 basic measures.

Table 7. ISO 19157 basic measures and QualityML metrics
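| Basic measure | Element name | Id. | Measure name | QualityML metric | Metric parameter |
| --- | --- | --- | --- | --- | --- |
| LE50 | absolute or external accuracy | 33 | linear error probable | Half-length Confidence Interval (QualityML definition for more details) | level = 0.5 |
| LE50 | accuracy of a time measurement | 55 | time accuracy at 50% significance level | Half-length Confidence Interval | level = 0.5 |
| LE50 | quantitative attribute accuracy | 69 | attribute value uncertainty at 50% significance level | Half-length Confidence Interval | level = 0.5 |
| LE68.3 | absolute or external accuracy | 34 | standard linear error | Half-length Confidence Interval | level = 0.683 |
| LE68.3 | accuracy of a time measurement | 54 | time accuracy at 68.3% significance level | Half-length Confidence Interval | level = 0.683 |
| LE68.3 | quantitative attribute accuracy | 68 | attribute value uncertainty at 68.3% significance level | Half-length Confidence Interval | level = 0.683 |
| LE90 | absolute or external accuracy | 35 | linear map accuracy at 90% significance level | Half-length Confidence Interval | level = 0.90 |
| LE90 | accuracy of a time measurement | 56 | time accuracy at 90% significance level | Half-length Confidence Interval | level = 0.90 |
| LE90 | quantitative attribute accuracy | 70 | attribute value uncertainty at 90% significance level | Half-length Confidence Interval | level = 0.90 |
| LE95 | absolute or external accuracy | 36 | linear map accuracy at 95% significance level | Half-length Confidence Interval | level = 0.95 |
| LE95 | accuracy of a time measurement | 57 | time accuracy at 95% significance level | Half-length Confidence Interval | level = 0.95 |
| LE95 | quantitative attribute accuracy | 71 | attribute value uncertainty at 95% significance level | Half-length Confidence Interval | level = 0.95 |
| LE99 | absolute or external accuracy | 37 | linear map accuracy at 99% significance level | Half-length Confidence Interval | level = 0.99 |
| LE99 | accuracy of a time measurement | 58 | time accuracy at 99% significance level | Half-length Confidence Interval | level = 0.99 |
| LE99 | quantitative attribute accuracy | 72 | attribute value uncertainty at 99% significance level | Half-length Confidence Interval | level = 0.99 |
| LE99.8 | absolute or external accuracy | 38 | near certainty linear error | Half-length Confidence Interval | level = 0.998 |
| LE99.8 | accuracy of a time measurement | 59 | time accuracy at 99.8% significance level | Half-length Confidence Interval | level = 0.998 |
| LE99.8 | quantitative attribute accuracy | 73 | attribute value uncertainty at 99.8% significance level | Half-length Confidence Interval | level = 0.998 |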

The advantage of this generalization is not only the increased coherence of the quality measure and metric descriptions, but also the possibility of describing any other confidence level interval in a standardized way.
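To illustrate how a single parameterized metric can cover every confidence level, the sketch below computes the half-length of a symmetric confidence interval from a standard deviation and a level. It assumes zero-mean, normally distributed errors; the function name and the sigma value are hypothetical and are not taken from QualityML or ISO 19157.

```python
from statistics import NormalDist


def half_length_confidence_interval(sigma: float, level: float) -> float:
    """Half-length of the symmetric interval that contains the true value
    with probability `level`, assuming zero-mean normal errors (assumption)."""
    if not 0.0 <= level < 1.0:
        raise ValueError("level must be in the range [0, 1)")
    # Two-sided interval: leave (1 - level) / 2 of probability in each tail.
    z = NormalDist().inv_cdf((1.0 + level) / 2.0)
    return z * sigma


# Hypothetical standard deviation of 1 m, evaluated at the levels of Table 7
# plus one level (0.80) that has no dedicated ISO 19157 basic measure.
sigma_m = 1.0
for level in (0.5, 0.683, 0.80, 0.90, 0.95, 0.99, 0.998):
    radius = half_length_confidence_interval(sigma_m, level)
    print(f"level = {level}: {radius:.3f} m")
```

The 0.80 entry shows the point made above: a level without its own ISO basic measure is described with the same metric and a different parameter value.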

This is done in QualityML not only for one dimensional random variables (Z, using "Half-length Confidence Interval" metrics), but also for two dimensional variables, including in a single definition several ISO metrics (confidence ellipse and uncertainty ellipse). In fact, the ISO description of confidence ellipse is general in the same way, as it has a parameter to describe the confidence (or significance) level.

Counting-related data quality basic measures

The basic measures described in ISO 19157 related to counting are shown in the next table:

Table 8. Data quality basic measures for counting-related data quality measures as described in ISO 19157
| Basic measure name | Basic measure definition | Example | Data quality value type |
| --- | --- | --- | --- |
| Error indicator | Indicator that an item is in error | False | Boolean (if the value is true the item is not correct) |
| Correctness indicator | Indicator that an item is correct | True | Boolean (if the value is true the item is correct) |
| Error count | Total number of items that are subject to an error of a specified type | 11 | Integer |
| Correct items count | Total number of items that are free of errors of a specified type | 571 | Integer |
| Error rate | Number of the erroneous items with respect to the total number of items that should have been present | 0.0189 | Real |
| Correct items rate | Number of the correct items with respect to the total number of items that should have been present | 0.9811 | Real |

The first grouping that QualityML defines beyond these ISO basic measures is related to the concept of counting items. All the measures related to the same quality measure are grouped and use a metric called Items, whose result can be expressed as a boolean, a count or a rate.

In fact, ISO 19157 itself hints at several options for the rate elements when it recognizes that "[Error rate / Correct items rate] can either be presented as percentage or as a ratio. The value unit in the quantitative result (see 7.5.4.2) can be used to specify that the result is presented in percentage or as a ratio." To standardize these options for the rate, and to combine them with the other two options (boolean and count), QualityML describes the Items metric as a choice among "indicator" (for boolean), "count" or "rate". For the last one, a parameter gives the maximum value of the rate. Thus, a value of 100 in this attribute expresses that the value is a percentage. The default value for this attribute is 1, representing a pure ratio.
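The three choices of the Items metric can be sketched as one function. The function name and signature are hypothetical, and interpreting the dataset-level indicator as "at least one item falls in the domain" is an assumption of this sketch. The input values are the Table 8 examples, which are consistent with 11 erroneous and 571 correct items out of 582.

```python
from typing import Union


def items_metric(n_in_domain: int, n_total: int, choice: str,
                 maximum: float = 1.0) -> Union[bool, int, float]:
    """Express a count of items as an indicator, a count or a rate.

    `maximum` is the value that represents "all items": 1 gives a pure
    ratio (the default) and 100 gives a percentage.
    """
    if choice == "indicator":
        # Assumption: the indicator is true when any item is in the domain.
        return n_in_domain > 0
    if choice == "count":
        return n_in_domain
    if choice == "rate":
        if n_total <= 0:
            raise ValueError("n_total must be positive to compute a rate")
        return maximum * n_in_domain / n_total
    raise ValueError(f"unknown choice: {choice!r}")


errors, correct = 11, 571
total = errors + correct

print(items_metric(errors, total, "indicator"))
print(items_metric(errors, total, "count"))
print(round(items_metric(errors, total, "rate"), 4))           # ratio
print(round(items_metric(errors, total, "rate", 100.0), 2))    # percentage
print(round(items_metric(correct, total, "rate"), 4))
```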

Moreover, ISO 19157 usually describes measures based on errors alongside measures based on correct items. Both definitions are identical except for which elements the measure counts. In QualityML this is described by the Domain of the quality measure, which allows a higher-level grouping scheme, as shown in the table below (only for measures related to conceptual consistency):

Table 9. Data quality basic measures for counting-related data quality measures as described in ISO 19157 and as grouped in QualityML (describing conceptual consistency)
| Id. | Measure name | Basic measure | QualityML measure | QualityML metric | Metric choice (parameter) | QualityML domain |
| --- | --- | --- | --- | --- | --- | --- |
| 8 | conceptual schema non-compliance | error indicator | Conceptual Schema (QualityML definition for more details) | Items (QualityML definition for more details) | indicator | Non-Conformance (QualityML definition for more details) |
| 10 | number of items not compliant with the rules of the conceptual schema | error count | Conceptual Schema | Items | count | Non-Conformance |
| 12 | non-compliance rate with respect to the rules of the conceptual schema | error rate | Conceptual Schema | Items | rate (parameter max) | Non-Conformance |
| 9 | conceptual schema compliance | correctness indicator | Conceptual Schema | Items | indicator | Conformance (QualityML definition for more details) |
| - | not defined in ISO | correct items count | Conceptual Schema | Items | count | Conformance |
| 13 | compliance rate with the rules of the conceptual schema | correct items rate | Conceptual Schema | Items | rate (parameter max) | Conformance |

QualityML domain (with parameters)

QualityML describes several domains to which the quality measures can be applied. Sometimes restrictions on the domain need to be described, and thus the domain requires some parameters.

For example, when describing domain conformance, the Items metric can be used to describe the rate of elements that are not conformant with the domain. If a requirement is described, it should be encoded as a domain parameter. See the following table for an example.

Table 10. 'Temporal validity' element and 'Value domain' measure using several domains, with or without domain requirements
Quality measure Domain Metrics Example Elements to be encoded

Value domain

Non Conformance

items

30% of the items are not conformant with their value domain (e.g. invalid dates codified in a DBF table)

measure used: measure/ValueDomain
result value type: Items
result value record:
- items/rate: 30 (max="100")
- NonConformance/rule: Invalid dates codified in a DBF table

30% of the pixels are more than 24 months older than the time intended for the data (i.e. older than 2014-05-01)

measure used: measure/ValueDomain
result value type: Items
result value record:
- items/rate: 30 (max="100")
- NonConformance/min: 2014-05-01
- NonConformance/rule: Pixels are more than 24 months older than the time intended for the data (i.e. older than 2014-05-01)

Conformance

65%¹ of the items are conformant with their value domain (e.g. valid dates codified in a DBF table)

measure used: measure/ValueDomain
result value type: Items
result value record:
- items/rate: 65 (max="100")
- Conformance/rule: Valid dates codified in a DBF table

65%¹ of the pixels are within the 24 months before the time intended for the data (i.e. later than 2014-05-01)

measure used: measure/ValueDomain
result value type: Items
result value record:
- items/rate: 65 (max="100")
- Conformance/min: 2014-05-01
- Conformance/rule: Pixels are within the 24 months before the time intended for the data (i.e. later than 2014-05-01)

¹ In this example we consider that 5% of the DBF records or of the pixels have an empty date, and thus the conformant and non-conformant percentages do not sum to 100%.
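The pixel example of Table 10 can be reproduced with a short sketch. The function name, the dictionary keys and the synthetic list of 20 pixel dates are assumptions made for illustration; treating a date equal to the minimum as conformant is also an assumption, since the text only says "older than" and "after".

```python
from datetime import date
from typing import Dict, Optional, Sequence


def value_domain_rates(dates: Sequence[Optional[date]], minimum: date,
                       rate_max: float = 100.0) -> Dict[str, float]:
    """Rates of items conformant and non-conformant with a minimum date.

    Items with an empty date (None) count towards the total but fall in
    neither domain, so the two rates need not sum to `rate_max`.
    """
    total = len(dates)
    if total == 0:
        raise ValueError("at least one item is required")
    conformant = sum(1 for d in dates if d is not None and d >= minimum)
    non_conformant = sum(1 for d in dates if d is not None and d < minimum)
    return {
        "Conformance": rate_max * conformant / total,
        "NonConformance": rate_max * non_conformant / total,
    }


# Synthetic data: 13 recent pixels, 6 old pixels and 1 pixel without a date.
pixel_dates = [date(2015, 3, 1)] * 13 + [date(2013, 7, 1)] * 6 + [None]
rates = value_domain_rates(pixel_dates, minimum=date(2014, 5, 1))
print(rates)
```

With this synthetic input the two rates are 65 and 30 with max = 100, and the remaining 5% corresponds to the empty date described in the footnote.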

More details about QualityML are available at http://www.QualityML.org.

5.5. Levels of granularity in quality measures

Quality evaluation is carried out at pixel or feature level, as those are the elements evaluated and compared with those accepted as or being true. This comparison is then statistically assessed, resulting in what is called the measure. Measures can be delivered at different levels of granularity, depending on the domain selected for the statistics. This section enumerates the most commonly used levels of granularity.

5.5.1. Product level

Traditionally, quality reports are associated with product specifications, resulting in quality metrics that are common to all scenes of the same product.

5.5.2. Dataset level

Each individual scene (associated here with the concept of dataset) can be validated against what is considered ground truth and can have its own quality metric reports, e.g. about the geospatial accuracy of the positions of the individual pixels (a process that is commonly done using ground control points, GCP). This results in a statistical dataset quality.

5.5.3. Pixel level

Recently, per-pixel or per-object quality information is also being included (e.g. MODIS and Sentinel). In this case, an uncertainty model is associated with a scene, which allows individual uncertainties to be computed for each pixel. These uncertainties can be of any kind; for example, a set of GCPs can be used to model the uncertainty distribution of the scene, or an illumination model can be used to model the radiometric uncertainty of each pixel. This error modeling is particularly interesting when uncertainty needs to be propagated because the same scene is used to derive higher-level products.

Another source of per pixel uncertainty is introduced when several scenes are combined to create a mosaic product. In this case the pixels of the mosaic have different uncertainty depending on the source of the original data (depending on the original scene).
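As a minimal sketch of the mosaic case, the per-pixel uncertainty can be derived from a raster that records which source scene contributed each pixel. The simplification of one uncertainty value per scene, the function name and all values are assumptions made for illustration.

```python
from typing import Dict, List


def mosaic_uncertainty(source_index: List[List[int]],
                       scene_uncertainty: Dict[int, float]) -> List[List[float]]:
    """Build a per-pixel uncertainty raster for a mosaic.

    `source_index[row][col]` identifies the scene that supplied the pixel;
    each pixel inherits the uncertainty of that scene (simplifying assumption).
    """
    return [[scene_uncertainty[scene] for scene in row] for row in source_index]


# Hypothetical 2 x 3 mosaic built from two scenes with uncertainties in metres.
source_index = [
    [0, 0, 1],
    [0, 1, 1],
]
scene_uncertainty = {0: 3.0, 1: 5.0}

uncertainty = mosaic_uncertainty(source_index, scene_uncertainty)
for row in uncertainty:
    print(row)
```

The resulting raster has the same shape as the mosaic and could be delivered alongside it as a quality band.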

5.5.4. Tile level uncertainty

To improve the performance of visualization portals, data is structured in map tiles that are served by a tile service (e.g. based on the OGC WMTS). Tile layer services are expected to be easy to use, providing access to a product that is a spatial and temporal mosaic of a long time series. The quality of each tile depends on the quality of the source scenes, so each tile can potentially have different quality information. More details are discussed in WMTS Time Accuracy.

6. Quality Measures for Raster

ISO 19157 provides a long list of quality measures. These quality measures mainly target vector data (features). This section proposes a list of quality measures at the conceptual level for raw imagery and higher-level raster derived data that follows the A3C model (which has been mapped to the ISO 19157 structure in a previous section). In some cases, ISO 19157 measures can be used directly, only requiring some imagery-specific details to be added to the description. In other cases, a complete reformulation of the measures is needed. Some common imagery measurements have no match with any ISO 19157 element, and completely new measure proposals are enumerated.

To define these metrics in a formal way, we use a data structure described in a table with the columns defined in the Conventions. If an existing measure is already available, only a link to its description (in QualityML and/or ISO 19157) is given.

These measures are also synchronized and stored in the QualityML vocabulary, and a general XML encoding proposal (giving semantics to quality measures) is given there. Moreover, the XML encoding is described in detail in Dataset Level and Pixel Level.

6.1. Accuracy (positional)

Most ISO 19157 quality measures can easily be applied to imagery, and thus can describe the uncertainty of pixel positions. ISO 19157 describes three quality elements for positional accuracy (see the list in Data Quality Classes), and DQ_GriddedDataPositionalAccuracy is specifically described for imagery.

The accuracy of gridded data may be described using the same data quality measures as for the horizontal positional uncertainty, that is, identifiers 42 to 51 (as specified in Annex D.4.1.3) of ISO 19157, as well as the corresponding QualityML measures. All these measures are computed using 2D positions. Moreover, 1D position errors could also be used if positional uncertainties are expressed in each dimension separately (or even 3D if a Digital Elevation Model or a 3D raster data cube is described).

6.1.1. Sensor Accuracy: positional uncertainty of satellite position

The metadata of each image can describe the positional uncertainty of the satellite position using ISO 19157 D.29 (Mean value of positional uncertainties). This measure describes the whole dataset series over a certain period of time (as it uses measurements from validation stations collected during that period), and thus the period needs to be specified using a specific EX_TemporalExtent in the scope of the quality measure.

In QualityML, this measure is called Mean Absolute Error (MAE), and the domain and metrics differ depending on whether a 1D, 2D or 3D approach is needed.

Table 11. Positional uncertainty of satellite position
| Line | Component | Description |
| --- | --- | --- |
| 1 | Name | Mean Absolute Error |
| 2 | Alias | MAE |
| 3 | Element name | gridded data positional accuracy |
| 4 | Basic measure | - |
| 5 | Definition | Mean Absolute Error (MAE) of imagery related to satellite position within a certain time period using data coming from validation sites distributed around the world with precise GPS locations |
| 6 | Description | - |
| 7 | Parameter | - |
| 10 | Source reference | ISO 19157 Id. 28, Id. 29, unified in QualityML |
| 11 | Example | A positional uncertainty of 5 m is related to satellite position accuracy within a certain time period |
| 12 | Identifier | http://www.qualityml.org/1.0/measure/MeanAbsoluteError |
| 13 | Domain | MAE can be used over several differential error domains. In this case the domain is: http://www.qualityml.org/1.0/domain/DifferentialErrors2D |
| 14 | Metrics | metrics: http://www.qualityml.org/1.0/metrics/MeanAbsolute2D; value: 5 m |
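A minimal sketch of the 2D computation follows, assuming that the differential error of each point is the Euclidean distance between its measured and reference positions. The function name and the three coordinate pairs are hypothetical; they were chosen so that the result matches the 5 m value used in the example above.

```python
from math import hypot
from typing import Sequence, Tuple

Point = Tuple[float, float]


def mean_absolute_error_2d(measured: Sequence[Point],
                           reference: Sequence[Point]) -> float:
    """Mean of the 2D distances between measured and reference positions."""
    if len(measured) != len(reference) or not measured:
        raise ValueError("both sequences must be non-empty and equally long")
    distances = [hypot(mx - rx, my - ry)
                 for (mx, my), (rx, ry) in zip(measured, reference)]
    return sum(distances) / len(distances)


# Hypothetical validation sites, coordinates in metres in a projected CRS.
reference = [(0.0, 0.0), (100.0, 100.0), (200.0, 50.0)]
measured = [(3.0, 4.0), (100.0, 100.0), (206.0, 58.0)]

mae = mean_absolute_error_2d(measured, reference)
print(f"MAE = {mae} m")
```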

6.1.2. Ground location accuracy (x,y,z): Pixel location accuracy on the ground

The metadata of each image can describe the ground location accuracy using ISO 19157 D.44 to D.49 (Circular error at different significance levels). This measure describes each dataset, and is usually computed from the satellite ephemerides.

A common measure of positional accuracy is the circular error. The circular error gives the radius of a circular area in which the true value lies with a given probability, assuming a two-dimensional normal density function. For the same set of values, the confidence radius can be reported at different probabilities.

ISO 19157 treats each probability as a different measure. As explained in QualityML quality vocabulary and encoding, some ISO measures are generalized in QualityML as a single measure with a parameter. In this case the measure is called Circular Map Accuracy, and the parameter "level" describes the confidence.
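The relation between the confidence level and the radius can be sketched in a few lines of Python. This is a minimal sketch that assumes unbiased, uncorrelated, normally distributed errors and approximates the error distribution as circular with standard deviation sqrt((σx² + σy²)/2); the function name is hypothetical. Under those assumptions the radius is the circular standard deviation multiplied by sqrt(-2 ln(1 - P)).

```python
import math

def circular_error(sigma_x, sigma_y, level):
    """Radius containing the true position with probability `level`.

    Assumes a circular normal error model with
    sigma_c = sqrt((sigma_x**2 + sigma_y**2) / 2).
    """
    if not 0.0 < level < 1.0:
        raise ValueError("level must be strictly between 0 and 1")
    sigma_c = math.sqrt((sigma_x ** 2 + sigma_y ** 2) / 2.0)
    return sigma_c * math.sqrt(-2.0 * math.log(1.0 - level))

# Multipliers of the circular standard deviation for common confidence levels.
for level in (0.394, 0.50, 0.90, 0.95, 0.998):
    print(level, round(circular_error(1.0, 1.0, level), 4))
```

A single function with a confidence parameter mirrors the QualityML approach of one measure with a parameter instead of one measure per probability.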

Table 12. Confidence level and quality measures for Circular Map Accuracy (source: ISO 19157, Table G.5)
Probability P (level of confidence) | Data quality basic measure | Name | ISO 19157 Identifier | QualityML measure
39.4 % | [formula image: CE39.4] | CE39.4 | 42 | Circular Map Accuracy
50 % | [formula image: CE50] | CE50 | 43 | Circular Map Accuracy
90 % | [formula image: CE90] | CE90 | 44 | Circular Map Accuracy
95 % | [formula image: CE95] | CE95 | 45 | Circular Map Accuracy
99.8 % | [formula image: CE99.8] | CE99.8 | 46 | Circular Map Accuracy

Thus, the description of ground location accuracy using QualityML would use:

Table 13. Pixel location accuracy on the ground
Line Component Description

1

Name

Circular Map Accuracy

2

Alias

 — 

3

Element name

gridded data positional accuracy

4

Basic measure

 — 

5

Definition

Radius describing a circle, in which the true point location lies with a certain probability, computed using satellite ephemerides

6

Description

 — 

7

Parameter

"level" value to indicate the confidence level (in the range [0,1])

10

Source reference

ISO 19157 Id. 42, Id.43, Id. 44, Id. 45, Id. 46, unified in QualityML

11

Example

3.2 m is the radius of the circle in which the true pixel location lies with a 95% probability

12

Identifier

http://www.qualityml.org/1.0/measure/CircularMapAccuracy

13

Domain

http://www.qualityml.org/1.0/domain/DifferentialErrors2D

14

Metrics

In this case the metrics and its parameters are:
- metrics: http://www.qualityml.org/1.0/metrics/CircularError
- level: 0.95
- value: 3.2 m

6.1.3. Product accuracy (x,y,z): Processed pixel location accuracy on the ground

Metadata of the imagery can also describe the product accuracy, meaning the positional accuracy of the final product at dataset level. The positional accuracy of gridded data is usually evaluated against ground control points (with known true, or accepted as true, positions).

In this case, the most common measure of positional accuracy is the root mean square error of planimetry (RMSEP), which is equivalent to a two-dimensional standard deviation (σ):

[equation image: RMSEP]

(which, assuming a circular normal distribution of errors, corresponds to a probability of about 63.2%).

In QualityML, this measure is called Root Mean Square Error and, for planimetry, is applied to a 2D Differential Error Measure domain, and uses Root Mean Square Error 2D metrics.
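A minimal Python sketch of this computation follows. It assumes that RMSEP is the square root of the mean squared planimetric distance between measured and control positions; the function name and the sample coordinates are hypothetical. The last line evaluates the probability that an error falls within a radius equal to RMSEP under a circular normal model, which is 1 - exp(-1).

```python
import math

def rmsep(measured, reference):
    """Root mean square error of planimetry from 2D control points."""
    if len(measured) != len(reference) or not measured:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(mx - rx) ** 2 + (my - ry) ** 2
               for (mx, my), (rx, ry) in zip(measured, reference)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical ground control points (metres).
measured = [(103.0, 204.0), (50.0, 50.0)]
reference = [(100.0, 200.0), (50.0, 50.0)]
error = rmsep(measured, reference)

# For a circular normal model with per-axis sigma, RMSEP**2 = 2 * sigma**2,
# so P(r <= RMSEP) = 1 - exp(-RMSEP**2 / (2 * sigma**2)) = 1 - exp(-1).
probability_within_rmsep = 1.0 - math.exp(-1.0)
print(error, probability_within_rmsep)
```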

Table 14. Processed pixel location accuracy on the ground
Line Component Description

1

Name

Root Mean Square

2

Alias

RMS

3

Element name

gridded data positional accuracy

4

Basic measure

 — 

5

Definition

Measure of the differences between 2D values predicted by a model or an estimator and the values actually observed in a set of ground control points

6

Description

 — 

7

Parameter

10

Source reference

ISO 19157 Id. 39

11

Example

2.9 m

12

Identifier

http://www.qualityml.org/1.0/measure/RootMeanSquare

13

Domain

http://www.qualityml.org/1.0/domain/DifferentialErrors2D

14

Metrics

- metrics: http://www.qualityml.org/1.0/metrics/RootMeanSquareError2D
- value: 2.9 m

6.2. Accuracy (thematic)

Thematic accuracy is an important issue for imagery, for example to describe the accuracy of the original digital numbers, of the radiances or of the reflectances. It is also important for higher-level products such as vegetation indexes or biophysical parameters derived from the imagery.

Most ISO 19157 thematic accuracy measures can be easily applied to imagery, and can therefore describe the uncertainty of pixel values (identifiers 60 to 63, as specified in Annex D.6.3, and the corresponding QualityML measures). ISO 19157 describes three quality elements for thematic accuracy (see the list in Data quality classes and subclasses), and all of them can be applied to imagery, depending on whether a raw image or a high-level product is described.

6.2.1. Thematic accuracy: DEM accuracy (dataset-level)

A Digital Elevation Model is a useful product, and describing the accuracy of its height values amounts to describing thematic accuracy. In this case a DQ_QuantitativeAttributeAccuracy quality element is the appropriate choice, and the proposal is to use the UncertML Normal Distribution description to describe the mean and the variance of the uncertainty on this element:

Table 15. Thematic accuracy: DEM accuracy (dataset-level)
Line Component Description

1

Name

Mean Absolute Error

2

Alias

MAE

3

Element name

quantitative attribute accuracy

4

Basic measure

 — 

5

Definition

Mean Absolute Error (MAE) of DEM values and expressed as mean and variance

6

Description

 — 

7

Parameter

 — 

10

Source reference

QualityML proposal (based on UncertML)

11

Example

Mean absolute error for the DEM is 1.21 m with a variance of 3.06 m²

12

Identifier

http://www.qualityml.org/1.0/measure/MeanAbsoluteError

13

Domain

MAE can be used over several differential errors domain. In this case the domain is: http://www.qualityml.org/1.0/domain/DifferentialErrors1D

14

Metrics

- metrics: http://www.uncertml.org/distributions/normal
- value: mean: 1.21 m
- value: variance: 3.06 m²
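The two values reported with the normal distribution metrics can be derived from check points as in the following minimal sketch. The function name and the sample heights are hypothetical, and the use of the sample variance (rather than the population variance) is an assumption, since the choice is not stated here.

```python
import statistics

def dem_error_summary(dem_heights, reference_heights):
    """Mean and variance of absolute height errors at check points.

    Returns the mean in the height unit (e.g. m) and the variance in the
    squared unit (e.g. m^2). Needs at least two check points.
    """
    if len(dem_heights) != len(reference_heights) or len(dem_heights) < 2:
        raise ValueError("need two sequences of equal length with >= 2 points")
    abs_errors = [abs(d - r) for d, r in zip(dem_heights, reference_heights)]
    return {
        "mean": statistics.fmean(abs_errors),
        "variance": statistics.variance(abs_errors),  # sample variance (assumption)
    }

# Hypothetical DEM heights vs. surveyed reference heights (metres).
summary = dem_error_summary([101.0, 98.0, 103.0], [100.0, 100.0, 100.0])
print(summary)
```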

6.2.2. Thematic accuracy: NDVI accuracy (pixel-level)

A Normalized Difference Vegetation Index (NDVI), for example, can be derived from imagery, and documenting its thematic accuracy is important for subsequent modeling based on this biophysical variable.

In this case a DQ_QuantitativeAttributeAccuracy quality element is the appropriate choice, and measures such as Root Mean Square Error, Mean Absolute Error or Quantitative attribute correctness can be used.

Table 16. Thematic accuracy: NDVI accuracy (pixel-level)
Line Component Description

1

Name

Mean Absolute Error

2

Alias

MAE

3

Element name

quantitative attribute accuracy

4

Basic measure

 — 

5

Definition

Mean Absolute Error (MAE) of NDVI values computed using original bands uncertainty and error propagation

6

Description

 — 

7

Parameter

 — 

10

Source reference

QualityML proposal (based on ISO 19157 Id. 28, Id. 29 but applied to quantitative attribute accuracy)

11

Example

A coverage describing the mean absolute error for each pixel in an NDVI product

12

Identifier

http://www.qualityml.org/1.0/measure/MeanAbsoluteError

13

Domain

MAE can be used over several differential errors domain. In this case the domain is: http://www.qualityml.org/1.0/domain/DifferentialErrors1D

14

Metrics

- metrics: http://www.qualityml.org/1.0/metrics/MeanAbsolute
- value: pixel-level MAE result
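How a pixel-level uncertainty can be obtained by error propagation is sketched below. This is a minimal sketch, not the procedure used in the testbed: it assumes NDVI = (NIR - Red) / (NIR + Red), uncorrelated band errors and first-order (linear) propagation, and it yields a standard uncertainty per pixel. Converting that value into a mean absolute error would require a further assumption about the error distribution. All names and sample values are hypothetical.

```python
import math

def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel."""
    return (nir - red) / (nir + red)

def ndvi_uncertainty(nir, red, sigma_nir, sigma_red):
    """First-order propagated standard uncertainty of NDVI for one pixel."""
    total = nir + red
    d_nir = 2.0 * red / total ** 2    # partial derivative with respect to NIR
    d_red = -2.0 * nir / total ** 2   # partial derivative with respect to Red
    return math.sqrt((d_nir * sigma_nir) ** 2 + (d_red * sigma_red) ** 2)

# Hypothetical 1 x 2 reflectance grids with a uniform band uncertainty of 0.01.
nir_band = [[0.5, 0.4]]
red_band = [[0.1, 0.2]]
uncertainty_coverage = [
    [ndvi_uncertainty(n, r, 0.01, 0.01) for n, r in zip(nir_row, red_row)]
    for nir_row, red_row in zip(nir_band, red_band)
]
print(uncertainty_coverage)
```

The result has the same shape as the input bands, which is what allows it to be delivered as a coverage alongside the NDVI product.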

6.3. Currency

6.3.1. General purpose metadata: Image acquisition date

In generic terms, currency is defined by the time elapsed between the date the data was captured and the current time. In ISO 19115 and ISO 19115-1, the acquisition date can be documented as the temporal extent of the resource.

Figure 5. EX_TemporalExtent as a way to document the capture date

The currency can then be estimated as the difference between the temporal extent and the current date.

Note that a temporal extent can be expressed not only as a single date but also as a range. A range is useful for a single scene that is captured by a push-broom sensor. Sometimes a dataset is composed of a juxtaposition or mosaic of scenes with a variety of acquisition times. In this case, the temporal extent of the result can be quite wide and, depending on the circumstances, this can diminish the quality of the data.
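A currency estimate for a temporal extent given as a range can be sketched as follows. The function name and the dates are hypothetical; the sketch simply reports the age of the newest and of the oldest data in the extent, so that a wide mosaic extent shows up as a large gap between the two values.

```python
from datetime import date

def currency_days(extent_begin, extent_end, today):
    """Age in days of the newest and of the oldest data in a temporal extent."""
    if extent_end < extent_begin:
        raise ValueError("extent_end must not precede extent_begin")
    return (today - extent_end).days, (today - extent_begin).days

# Hypothetical extent and evaluation date.
newest, oldest = currency_days(date(2016, 1, 1), date(2016, 6, 30), date(2016, 12, 12))
print(newest, oldest)
```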

6.3.2. Currency: temporal validity with a domain requirement

If a currency requirement is defined for a certain environment, or in a certain product specification, a Temporal validity indicator can be used to state the degree of agreement between the image temporal extent and the requirement.

For example, if the product specification has a requirement that considers pixels older than a limit (e.g. 24 months) as not current enough, the temporal validity quality indicator can report this, describing the requirement as a parameter in the domain of the quality measure, as explained (with the same example) in QualityML domain (with parameters).

Table 17. Items that do not conform to a temporal validity requirement
Line Component Description

1

Name

Items

2

Alias

 — 

3

Element name

Temporal validity

4

Basic measure

Value domain

5

Definition

Indication of if an item is conformant or not with a rule. The conformance or non-conformance can be expressed as a boolean, count or rate.

6

Description

 — 

7

Parameter

"max" for a maximum ratio if rate is used

10

Source reference

ISO 19157 Id. 14, Id.15 (boolean), Id.16 (count), Id.17 + Id.18 (rate), unified in QualityML

11

Example

30 % of the pixels are more than 24 months older than the time intended for the data (i.e. older than 2014-05-01)

12

Identifier

http://www.qualityml.org/1.0/measure/ValueDomain

13

Domain

Items metrics can be used with Conformance or Non-Conformance domains. In both cases the domain can describe a "range" ("min" and/or "max") and/or a "rule" parameters to define the requirement, as needed. In this case the domain is:
- domain: http://www.qualityml.org/1.0/domain/NonConformance
- type: range
- min: 2014-05-01
- rule: Pixels are more than 24 months older than the time intended for the data (i.e. older than 2014-05-01)

14

Metrics

In this case the metrics and its parameter are:
- metrics: http://www.qualityml.org/1.0/metrics/items
- type: rate
- max: 100
- value: 30
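The rate reported in this table can be computed from per-pixel acquisition dates as in the following minimal sketch. The function name, the `scale` argument (standing in for the "max" parameter of the rate) and the sample dates are hypothetical.

```python
from datetime import date

def non_conformance_rate(acquisition_dates, oldest_allowed, scale=100):
    """Rate (per `scale`) of items acquired before `oldest_allowed`."""
    if not acquisition_dates:
        raise ValueError("no items to evaluate")
    failing = sum(1 for d in acquisition_dates if d < oldest_allowed)
    return scale * failing / len(acquisition_dates)

# Hypothetical mosaic: 3 pixels acquired in 2013 and 7 pixels acquired in 2015.
pixel_dates = [date(2013, 7, 1)] * 3 + [date(2015, 7, 1)] * 7
rate = non_conformance_rate(pixel_dates, date(2014, 5, 1))
print(rate)
```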

6.4. Completeness

In a dataset that has been produced by classifying imagery (or by a similar technique), some expected classes can be completely missing even though there is evidence that the class is present in the area.

Several quality measures can be used to describe omission:

  • Missing class

  • Nodata values

  • Flag areas

Table 18. Missing class
Line Component Description

1

Name

missing class

2

Alias

missing category

3

Element name

omission

4

Basic measure

error indicator

5

Definition

indication showing that a specific class is missing in the data. It can be expressed as a boolean, count or rate.

6

Description

 — 

7

Parameter

 — 

10

Source reference

ISO 19157 Id. 5, Id. 6, Id. 7, unified in QualityML

11

Example

a class "boreal forest" is not present in the classified image. In situ data (ground truth) suggest that boreal forest is present in the scene extent.

12

Identifier

http://www.qualityml.org/1.0/measure/MissingClass

13

Domain

In this case the domain is:
- domain: http://www.qualityml.org/1.0/domain/NonConformance
- rule: Boreal class missing

14

Metrics

http://www.qualityml.org/1.0/metrics/items

Table 19. Nodata values
Line Component Description

1

Name

Nodata areas

2

Alias

3

Element name

omission

4

Basic measure

error indicator

5

Definition

This quality measure indicates whether data are absent because, for some reason, it was impossible to obtain data in the area. When the measure is provided as pixel-level quality, the result is presented in the form of a data mask

6

Description

Typically used in imagery to mark pixels that have a value that does not represent a correct measurement. It can be reported as a boolean per pixel or as a percentage of "nodata" pixels in a scene. This metric is particularly relevant when building big mosaics of several scenes where, despite the efforts of the producer, some parts of the mosaic area are not covered by any of the scenes and result in nodata values.

7

Parameter

 — 

10

Source reference

 — 

11

Example

12

Identifier

http://www.qualityml.org/1.0/measure/NodataAreas

13

Domain

In this case the domain is:
- domain: http://www.qualityml.org/1.0/domain/NonConformance
- rule: Nodata areas

14

Metrics

http://www.qualityml.org/1.0/metrics/items

Table 20. Flag values
Line Component Description

1

Name

Flag areas

2

Alias

3

Element name

omission

4

Basic measure

error indicator

5

Definition

This quality measure indicates whether the data contain something detected as anomalous, such as snow or cloud. When the measure is provided as pixel-level quality, the result is presented in the form of a data mask.

6

Description

Typically used in imagery to mark pixels that show snow instead of the "land" beneath it, that are covered by clouds, that lie in the shadow of a cloud, etc. It can be reported as a boolean per pixel or as a percentage of "flagged" pixels in a scene. This metric is particularly relevant when building big mosaics of several scenes where, despite the efforts of the producer, some parts of the mosaic area cannot be covered with better scenes and still contain snow, cloud, shadows or other artifacts.

7

Parameter

 — 

10

Source reference

 — 

11

Example

12

Identifier

http://www.qualityml.org/1.0/measure/FlagAreas

13

Domain

In this case the domain is:
- domain: http://www.qualityml.org/1.0/domain/NonConformance
- rule: Cloud areas

14

Metrics

http://www.qualityml.org/1.0/metrics/items

Regarding cloud cover, if a single value is reported for the whole image, there are two options:

  • general purpose metadata: cloudCoverPercentage in MD_ContentInformation | MD_ImageDescription can be used if a single percentage is reported. This general metadata element is described in ISO 19115

  • quality measure: FlagAreas (QualityML) with the domain set to "cloud areas" can be used. This is mandatory if a coverage indicating whether each pixel is cloud or not is provided, but optional if a single percentage for the whole image is given

6.4.1. Completeness: cloud cover (percentage result) and snow cover (pixel-level result)

Note that metrics/items allows a boolean indicator, a number of pixels or a percentage. Thus, when a pixel-level result is used, items is usually a boolean indicating, for each pixel, whether a certain characteristic is present (e.g. is this pixel covered by snow?). When a single quantitative result is given, rates (or numbers of pixels) are usually used (e.g. 26% of the image is covered by clouds).
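The three forms of the items metrics can all be derived from the same pixel-level mask, as in the following minimal sketch. The function name, the dictionary keys and the sample mask are hypothetical; the mask itself is the pixel-level boolean result, while the count and the rate are dataset-level summaries.

```python
def items_results(mask):
    """Summarize a 2D boolean mask as a boolean, a count and a rate (percent)."""
    flat = [bool(value) for row in mask for value in row]
    if not flat:
        raise ValueError("mask is empty")
    count = sum(flat)
    return {
        "boolean": count > 0,               # is any pixel flagged?
        "count": count,                     # number of flagged pixels
        "rate": 100.0 * count / len(flat),  # percentage of flagged pixels
    }

# Hypothetical 2 x 2 snow mask: one snow-covered pixel out of four.
snow_mask = [[True, False], [False, False]]
result = items_results(snow_mask)
print(result)
```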

In the next sections QualityML Flag areas measure is used to describe snow cover, including pixel-level encoding examples. On the other hand, cloud cover is provided using the general purpose metadata cloudCoverPercentage:

Table 21. 'Flag areas' element and 'Items' metrics for snow cover pixel-level indicator
Quality measure Domain Metrics Example Elements to be encoded

Flag areas

Non Conformance

items

Flag areas indicate in this report the snow cover of the image

measure/FlagAreas
result value type: Items
result value record:
- items/rate: pixel-level boolean indicator
- NonConformance/rule: a coverage result is used including for each pixel a boolean indication whether the pixel is covered by snow

6.5. Consistency

Consistency in imagery is mainly related to homogeneity among several images, and even among sensors, in order to have common measures over time and to produce radiometrically consistent time series.

One way to document part of this information is to use general purpose metadata to describe the processing level of the image. Satellite optical imagery is commonly classified into raw data (where the content of the data is related to the quantity of light measured) and high-level products (where the content of the data is related to a biophysical or biochemical variable of the terrain, either quantitative or categorical). But not all raw data is equivalent, and there are several levels of processing to remove atmospheric and topographical effects. In general this is characterized by a level number, but agencies are not completely consistent in the numbering of the levels.

ISO 19115-1 introduces a new element in MD_CoverageDescription called processingLevelCode, which is an MD_Identifier.

Figure 6. MD_CoverageDescription showing the processingLevelCode
Figure 7. MD_Identifier that is used as processingLevelCode data type

This data type allows for the different agencies to define their own identifiers and reference them here.

Our proposal is to define the levels not as sequential numbers but as codes that can be better understood by users.

These are the proposed MD_Identifiers, which can also be maintained in the QualityML vocabulary:

Table 22. processingLevel identifiers defined in QualityML
code | codeSpace | version | description
DigitalNumbers | http://www.qualityml.org/processLevel | 1.0 | Digital Numbers
TopOfAtmosphereReflectance | http://www.qualityml.org/processLevel | 1.0 | Top of the Atmosphere Reflectance
SurfaceReflectanceAtmosphericModel | http://www.qualityml.org/processLevel | 1.0 | Surface Reflectance based on atmospheric modeling
SurfaceReflectancePseudoInvariantAreas | http://www.qualityml.org/processLevel | 1.0 | Surface Reflectance based on pseudo invariant areas

Moreover, quality measures related to consistency can be provided such as:

  • logical consistency - domain consistency - value domain: it can be used to describe domain inconsistencies on the image values (e.g. reflectances or normalized vegetation index values outside theoretical range)

  • logical consistency - format consistency - physical structure conflicts: problems due to format consistency (e.g. incorrect file format, value type selection or encoding)

6.5.1. Consistency: surface reflectance and rate of domain inconsistencies

The next sections show an example of image processing level documentation that describes the values of the image according to a general codeSpace (linked to a QualityML dictionary), and that includes a quality measure describing the rate of pixel values that do not conform to their value domain because their reflectance values are higher than the maximum (e.g. due to sensor saturation).

Table 23. Items that do not conform to the value domain
Line Component Description

1

Name

Items

2

Alias

 — 

3

Element name

Domain consistency

4

Basic measure

Value domain

5

Definition

Pixel values that do not conform to their value domain because their reflectance values are higher than the maximum (e.g. due to sensor saturation)

6

Description

 — 

7

Parameter

"max" for a maximum ratio if rate is used

10

Source reference

ISO 19157 Id. 14, Id.15 (boolean), Id.16 (count), Id.17 + Id.18 (rate), unified in QualityML

11

Example

5% of the pixels have reflectance values higher than the theoretical maximum, due to sensor saturation

12

Identifier

http://www.qualityml.org/1.0/measure/ValueDomain

13

Domain

Items metrics can be used with Conformance or Non-Conformance domains. In both cases the domain can describe a "range" ("min" and/or "max") and/or a "rule" parameters to define the requirement, as needed. In this case the domain is:
- domain: http://www.qualityml.org/1.0/domain/NonConformance
- rule: Pixel reflectances with values higher than the theoretical maximum, due to sensor saturation

14

Metrics

In this case the metrics and its parameter are:
- metrics: http://www.qualityml.org/1.0/metrics/items
- type: rate
- max: 100
- value: 5

7. Dataset level quality

This ER assumes that dataset level quality and pixel level quality will be documented in an ISO 19115-1/19157 metadata document following the ISO 19115-3 XML encoding. This chapter discusses the different encodings for dataset level quality.

Since ISO 19115-1 was initially designed for dataset level metadata, it is logical to assume that dataset level quality will be more straightforward to encode. This is true in simple cases, but it can also be more complex than initially expected, in particular if the dataset is a composite of several layers and each layer can have quality indicators that require more than one parameter to be described.

Note
The ISO 19115 and ISO 19115-1 schemas are not identical because:
- new elements have been added to ISO 19115-1 (e.g. codeSpace for MD_Identifier, DQ_CoverageResult for DQ_Result, …)
- namespaces have changed dramatically in ISO 19115-1.

In this ER, only the ISO 19115 encoding is shown when the two encodings differ only in namespace notation. When the two schemas differ in other ways, both encodings are provided.

7.1. Basic encoding

ISO 19157 describes the DQ_DataQuality package, used in ISO 19115-1 to describe the overall quality assessment of a resource (through the "dataQualityInfo" association). As described in previous clauses, describing a quality element requires describing several aspects.

Figure 8. Data quality units and data quality element descriptors as defined in ISO 19157

See also Data Quality Classes and DQ Element as 19157 for a list and description of the data quality elements, which are components describing a certain aspect of the quality of geographic data and have been organized into different categories.

7.2. Scope encoding

When describing the quality of geographic data, different quality elements and different subsets of the data may be considered. In order to describe these, data quality units are used. A data quality unit is the combination of a scope and data quality elements. The scope of the data quality unit(s) specifies the extent, spatial and/or temporal, and/or common characteristic(s) that identify the data on which data quality is to be evaluated.

The following are examples of what defines a data quality scope (see also MD_Scope in ISO 19115-1):

  • a dataset series;

  • a dataset;

  • a subset of data defined by one or more of the following characteristics:

    • types of items (sets of feature types, feature attributes, feature operations or feature relationships);

    • specific items (sets of feature instances, attribute values or instances of feature relationships);

    • geographic extent;

    • temporal extent (the time frame of reference and accuracy of the time frame).

One of the most typical scopes for a data quality unit is the dataset, but others are sometimes used. Moreover, the scope may sometimes include a specific extent (spatial, temporal or both) to provide more information.

This is the case of the quality measure conceptually described in QM_1 Accuracy Satellite Position QMR. The full XML encoding for this quality measure can be found in Appendix QM Raster, in QM_1 Accuracy Satellite Position old 19115 XML and QM_1 Accuracy Satellite Position 19115_1 XML.

The encoding of a temporal extent for the dataset series scope following ISO 19115 is described here for convenience:

	<gmd:scope>
		<gmd:DQ_Scope>
			<gmd:level>
				<gmd:MD_ScopeCode codeListValue="series" codeList="MD_ScopeCode"/>
			</gmd:level>
			<gmd:extent>
				<gmd:EX_Extent>
					<gmd:temporalElement>
						<gmd:EX_TemporalExtent>
							<!-- temporal extent used to compute "positional uncertainty of satellite position" should be stated here  -->
							<gmd:extent>
								<gml:TimePeriod gml:id="SensorAccuracy_Scope_TempExt">
									<gml:beginPosition>2016-01-01T00:00:00.000+01:00</gml:beginPosition>
									<gml:endPosition>2016-06-30T00:00:00.000+01:00</gml:endPosition>
								</gml:TimePeriod>
							</gmd:extent>
						</gmd:EX_TemporalExtent>
					</gmd:temporalElement>
				</gmd:EX_Extent>
			</gmd:extent>
		</gmd:DQ_Scope>
	</gmd:scope>

7.3. Give semantics to Quality Measures

According to ISO 19157, a data quality element should refer to one measure only, by means of a measure reference, providing an identifier of a measure fully described elsewhere and/or providing the name and a short description of the measure.

To facilitate dataset comparisons, it is necessary that the results in the data quality reports are expressed in a comparable way and that there is a common understanding of the data quality measures that have been used. In order to make evaluations and data quality reports (metadata or a standalone quality report) from different sources comparable, standardized data quality measures described in ISO 19157, Annex D, shall be used when possible.

This ER in its Quality Measures Raster chapter describes quality measures, sometimes directly coming from the standard, or sometimes described in QualityML.

Encoding of quality measures in the metadata shall directly link to catalogues of data quality measures to fully describe the measures referenced in the data quality report of the data evaluated. Our proposal is to link to QualityML that includes ISO 19157 measures and extend it with new definitions.

7.3.1. ISO 19157 XML encoding of 'Measure reference'

ISO 19157 gives examples of how DQ_MeasureReference should be described in metadata (e.g. in section E.4.1.2 Reporting commission, table E.12):

  • nameOfMeasure/CharacterString: Number of excess item

  • measureIdentification/MD_Identifier/code/CharacterString: 2

  • measureDescription/CharacterString: number of items within the dataset that should not have been in the dataset

In this example, the namespace (in the 'codeSpace' element of the MD_Identifier) is not defined. Moreover, the identifier is not directly linked to a catalogue defining the measure.

Encoded in XML following ISO 19115 this example looks like:

     <gmd:nameOfMeasure>
          <gco:CharacterString>Number of excess item</gco:CharacterString>
     </gmd:nameOfMeasure>
     <gmd:measureIdentification>
          <gmd:MD_Identifier>
               <gmd:code>
                    <gco:CharacterString>2</gco:CharacterString>
               </gmd:code>
          </gmd:MD_Identifier>
     </gmd:measureIdentification>
     <gmd:measureDescription>
          <gco:CharacterString>Number of items within the dataset that should not have been in the dataset</gco:CharacterString>
     </gmd:measureDescription>

According to ISO 19139, Anchor is substitutable for CharacterString, so it can be used when instantiating the web environment extensions for properties of type CharacterString in ISO 19115-1. The main advantage of the Anchor element is that it carries an "xlink:href" attribute, which can link to the catalogue definition of the code or the codeSpace when describing a measure.
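To illustrate the substitution, the following minimal Python sketch builds a measureIdentification element whose code is an Anchor rather than a CharacterString, using only the standard library. The function name is hypothetical, and the namespace URIs are the ones commonly used with ISO 19139 (gmd, gmx) and XLink; they are stated here as assumptions and should be checked against the schemas actually in use.

```python
import xml.etree.ElementTree as ET

# Namespace URIs assumed for ISO 19139 (gmd, gmx) and XLink.
GMD = "http://www.isotc211.org/2005/gmd"
GMX = "http://www.isotc211.org/2005/gmx"
XLINK = "http://www.w3.org/1999/xlink"
for prefix, uri in (("gmd", GMD), ("gmx", GMX), ("xlink", XLINK)):
    ET.register_namespace(prefix, uri)

def measure_identification(code, href):
    """Build gmd:measureIdentification with a gmx:Anchor as the code."""
    ident = ET.Element(f"{{{GMD}}}measureIdentification")
    md_id = ET.SubElement(ident, f"{{{GMD}}}MD_Identifier")
    code_el = ET.SubElement(md_id, f"{{{GMD}}}code")
    anchor = ET.SubElement(code_el, f"{{{GMX}}}Anchor", {f"{{{XLINK}}}href": href})
    anchor.text = code
    return ident

element = measure_identification(
    "RootMeanSquare", "http://www.qualityml.org/1.0/measure/RootMeanSquare"
)
xml_text = ET.tostring(element, encoding="unicode")
print(xml_text)
```

A client that understands Anchor can follow the xlink:href to the catalogue entry, while a client that only expects text can still read the element content as the code.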

7.3.3. Encoding proposal including semantics

Our general proposal for giving semantics to a measure reference (by linking it directly to a catalogue) is to use the Anchor element in:

  • MD_Identifier/code (ISO 19115 and ISO 19115-1 schemas)

  • MD_Identifier/codeSpace (only in ISO 19115-1 schemas)

to include direct links to QualityML measures as a recommended option (but any other dictionary that a user would like to use is also possible).

The following examples cover the quality measure conceptually described in QM_3 Accuracy Processed pixel location QMR. The full XML encoding for this quality measure can be found in Appendix QM Raster, in QM_3 Accuracy Processed pixel location old 19115 XML and QM_3 Accuracy Processed pixel location 19115_1 XML.

The encodings of nameOfMeasure, measureIdentification and measureDescription for this quality measure are reproduced here for convenience (as an illustration of how the Anchor element is used):

First option: 19115 schema
	<gmd:DQ_GriddedDataPositionalAccuracy>
		<gmd:nameOfMeasure>
			<gco:CharacterString>Root Mean Square (RMS)</gco:CharacterString>
		</gmd:nameOfMeasure>
		<gmd:measureIdentification>
			<gmd:MD_Identifier>
				<gmd:code>
					<gmx:Anchor xlink:href="http://www.qualityml.org/1.0/measure/RootMeanSquare">RootMeanSquare</gmx:Anchor>
				</gmd:code>
			</gmd:MD_Identifier>
		</gmd:measureIdentification>
		<gmd:measureDescription>
			<gco:CharacterString>Measure of the differences between 2D values predicted by a model or an estimator and the values actually observed in a set of ground control points.</gco:CharacterString>
		</gmd:measureDescription>
		<gmd:result>
			[...]
		</gmd:result>
	</gmd:DQ_GriddedDataPositionalAccuracy>
Second option: 19115-1 schema
	<mdq:DQ_GriddedDataPositionalAccuracy>
		<mdq:measure>
			<mdq:DQ_MeasureReference>
				<mdq:measureIdentification>
					<mcc:MD_Identifier>
						<mcc:code>
							<gcx:Anchor xlink:href="http://www.qualityml.org/1.0/measure/RootMeanSquare">RootMeanSquare</gcx:Anchor>
						</mcc:code>
						<mcc:codeSpace>
							<gcx:Anchor xlink:href="http://www.qualityml.org">QualityML</gcx:Anchor>
						</mcc:codeSpace>
						<mcc:version>
							<gco:CharacterString>1.0</gco:CharacterString>
						</mcc:version>
					</mcc:MD_Identifier>
				</mdq:measureIdentification>
				<mdq:nameOfMeasure>
					<gco:CharacterString>Root Mean Square (RMS)</gco:CharacterString>
				</mdq:nameOfMeasure>
				<mdq:measureDescription>
					<gco:CharacterString>Measure of the differences between 2D values predicted by a model or an estimator and the values actually observed in a set of ground control points.</gco:CharacterString>
				</mdq:measureDescription>
			</mdq:DQ_MeasureReference>
		</mdq:measure>
		<mdq:result>
		    [...]
		</mdq:result>
	</mdq:DQ_GriddedDataPositionalAccuracy>

7.4. Give semantics to Quality Metrics

7.4.1. Basic encoding

Quality Metrics are described in QualityML as the elements used to describe the result for a certain Quality Measure. A single quality measure can be expressed by different metrics, and this is the reason for the split between the two concepts. More details can be found in QualityML quality vocabulary and encoding or in QualityML.

Quantitative result may be a single value or multiple values, depending on the values of attributes valueType and valueStructure defined in the description of the measure applied. The attribute valueRecordType is used to describe how the valueType and valueStructure defined in the measure are implemented to provide the value of the quantitative result.

QualityML metrics schema (available at http://qualityml.geoviqua.org/schemas/qualityml/1.0/qualityml.xsd) describes the metrics defined in QualityML that can be used in an XML encoding within the gmd:value/gco:Record.

When describing a quantitative result, semantics can be added as follows:

  • DQ_QuantitativeResult/valueType/RecordType: include an xlink:href attribute to QualityML description of the metrics used

  • DQ_QuantitativeResult/value/Record: include the QualityML element for the selected RecordType.

  • DQ_QuantitativeResult/valueUnit: include an xlink:href attribute to a units dictionary

The following examples again cover the quality measure conceptually described in Product accuracy (x,y,z): Processed pixel location accuracy on the ground. The full XML encoding for this quality measure can be found in XML examples of encoding for Quality Measures for Raster, in ISO 19115 encoding and ISO 19115-1 encoding.

The encodings of RecordType, Record and valueUnit for this quality measure result are reproduced here for convenience (as an illustration of how the QualityML metrics elements are used):

First option: 19115 schema
	<gmd:DQ_GriddedDataPositionalAccuracy>
		[...]
		<gmd:result>
			<gmd:DQ_QuantitativeResult>
				<gmd:valueType>
					<gco:RecordType xlink:href="http://www.qualityml.org/1.0/metrics/RootMeanSquareError2D">Root mean square error 2D</gco:RecordType>
				</gmd:valueType>
				<gmd:valueUnit xlink:href="urn:ogc:def:uom:OGC:1.0:metre"/>
				<gmd:value>
					<gco:Record>
						<qml:RootMeanSquare>
							<qml:values>2.9</qml:values>
						</qml:RootMeanSquare>
					</gco:Record>
				</gmd:value>
			</gmd:DQ_QuantitativeResult>
		</gmd:result>
	</gmd:DQ_GriddedDataPositionalAccuracy>
Second option: 19115-1 schema
	<mdq:DQ_GriddedDataPositionalAccuracy>
		<mdq:measure>
    		[...]
		</mdq:measure>
		<mdq:result>
			<mdq:DQ_QuantitativeResult>
				<mdq:value>
					<gco:Record>
						<qml:RootMeanSquare>
							<qml:values>2.9</qml:values>
						</qml:RootMeanSquare>
					</gco:Record>
				</mdq:value>
				<mdq:valueUnit xlink:href="urn:ogc:def:uom:OGC:1.0:metre"/>
				<mdq:valueRecordType>
					<gco:RecordType xlink:href="http://www.qualityml.org/1.0/metrics/RootMeanSquareError2D">Root mean square error 2D</gco:RecordType>
				</mdq:valueRecordType>
			</mdq:DQ_QuantitativeResult>
		</mdq:result>
	</mdq:DQ_GriddedDataPositionalAccuracy>
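
As an illustration of how a client might read back the three pieces of semantics recommended above (metric, unit and value), the following is a minimal sketch in Python using only the standard library. The gmd, gco and xlink namespace URIs are the usual ISO 19139 and W3C ones; the qml namespace URI and the function name read_quantitative_result are assumptions made for this sketch and are not taken from the QualityML schema.

```python
import xml.etree.ElementTree as ET

# Namespace URIs. The qml URI is a placeholder assumed for this sketch.
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
    "xlink": "http://www.w3.org/1999/xlink",
    "qml": "http://example.org/qualityml/1.0",  # hypothetical
}

# The DQ_QuantitativeResult of the first option, with namespaces declared.
FRAGMENT = """
<gmd:DQ_QuantitativeResult
    xmlns:gmd="http://www.isotc211.org/2005/gmd"
    xmlns:gco="http://www.isotc211.org/2005/gco"
    xmlns:xlink="http://www.w3.org/1999/xlink"
    xmlns:qml="http://example.org/qualityml/1.0">
  <gmd:valueType>
    <gco:RecordType xlink:href="http://www.qualityml.org/1.0/metrics/RootMeanSquareError2D">Root mean square error 2D</gco:RecordType>
  </gmd:valueType>
  <gmd:valueUnit xlink:href="urn:ogc:def:uom:OGC:1.0:metre"/>
  <gmd:value>
    <gco:Record>
      <qml:RootMeanSquare>
        <qml:values>2.9</qml:values>
      </qml:RootMeanSquare>
    </gco:Record>
  </gmd:value>
</gmd:DQ_QuantitativeResult>
"""


def read_quantitative_result(xml_text):
    """Return the metric URI, unit URI and numeric value of a result."""
    root = ET.fromstring(xml_text)
    href = "{%s}href" % NS["xlink"]
    record_type = root.find("gmd:valueType/gco:RecordType", NS)
    unit = root.find("gmd:valueUnit", NS)
    # The metric element name varies, so match any child of gco:Record.
    values = root.find("gmd:value/gco:Record/*/qml:values", NS)
    return {
        "metric": record_type.get(href),
        "unit": unit.get(href),
        "value": float(values.text),
    }


result = read_quantitative_result(FRAGMENT)
```

The same approach should apply to the 19115-1 option by swapping the gmd paths for the corresponding mdq ones (mdq:valueRecordType instead of gmd:valueType).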

7.4.2. Parameter encoding

Some metrics, or some metric options, need parameters in order to be fully described. A first example is the parameter "max" when an Items/rate metric is used, or the parameter "level" when the CircularError metric is used. More details can be found in QualityML quality vocabulary and encoding or in QualityML.

The following examples cover the quality measure conceptually described in QM_2 Accuracy Pixel location QMR. The full XML encoding for this quality measure can be found in Appendix QM Raster, in QM_2 Accuracy Pixel location old 19115 XML and QM_2 Accuracy Pixel location 19115_1 XML.

The encodings of RecordType, Record and valueUnit for this quality measure result are described here for convenience (as an illustration of how the QualityML metrics elements and their parameters are used):

First option: 19115 schema
	<gmd:DQ_GriddedDataPositionalAccuracy>
		[...]
		<gmd:result>
			<gmd:DQ_QuantitativeResult>
				<gmd:valueType>
					<gco:RecordType xlink:href="http://www.qualityml.org/1.0/metrics/CircularError">Circular Error</gco:RecordType>
				</gmd:valueType>
				<gmd:valueUnit xlink:href="urn:ogc:def:uom:OGC:1.0:metre"/>
				<gmd:value>
					<gco:Record>
						<qml:CircularError level="0.95">
							<qml:values>3.2</qml:values>
						</qml:CircularError>
					</gco:Record>
				</gmd:value>
			</gmd:DQ_QuantitativeResult>
		</gmd:result>
	</gmd:DQ_GriddedDataPositionalAccuracy>
Second option: 19115-1 schema
	<mdq:DQ_GriddedDataPositionalAccuracy>
		<mdq:measure>
		    [...]
		</mdq:measure>
		<mdq:result>
			<mdq:DQ_QuantitativeResult>
				<mdq:value>
					<gco:Record>
						<qml:CircularError level="0.95">
							<qml:values>3.2</qml:values>
						</qml:CircularError>
					</gco:Record>
				</mdq:value>
				<mdq:valueUnit xlink:href="urn:ogc:def:uom:OGC:1.0:metre"/>
				<mdq:valueRecordType>
					<gco:RecordType xlink:href="http://www.qualityml.org/1.0/metrics/CircularError">Circular Error</gco:RecordType>
				</mdq:valueRecordType>
			</mdq:DQ_QuantitativeResult>
		</mdq:result>
	</mdq:DQ_GriddedDataPositionalAccuracy>
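
Because the parameters travel as XML attributes of the metric element, they can be read separately from the values. The following minimal Python sketch shows this for the gco:Record of the second option. The gco namespace URI is the ISO 19115-3 one; the qml namespace URI and the function name read_metric are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

GCO = "http://standards.iso.org/iso/19115/-3/gco/1.0"
QML = "http://example.org/qualityml/1.0"  # hypothetical namespace URI

RECORD = f"""
<gco:Record xmlns:gco="{GCO}" xmlns:qml="{QML}">
  <qml:CircularError level="0.95">
    <qml:values>3.2</qml:values>
  </qml:CircularError>
</gco:Record>
"""


def read_metric(record_xml):
    """Split a record into metric name, parameters and values."""
    record = ET.fromstring(record_xml)
    metric = record[0]  # first child of gco:Record is the metric element
    name = metric.tag.split("}", 1)[1]  # drop the "{namespace}" prefix
    # Parameters are the attributes of the metric element.
    params = {key: float(val) for key, val in metric.attrib.items()}
    # Values are the text of the metric's child elements.
    values = [float(child.text) for child in metric]
    return name, params, values


name, params, values = read_metric(RECORD)
```

With the record above, the metric name is CircularError, the only parameter is level, and the single value is the circular error expressed in the unit given by valueUnit.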

7.5. Give semantics to Quality Domain

7.5.1. Basic encoding

Some metrics, or some metric options, need parameters in order to be fully described. A first example is the parameter "max" when an Items/rate metric is used, or the parameter "level" when the CircularError metric is used. Quality metrics are described in QualityML as the elements used to describe the result of a certain quality measure. A single quality measure can be expressed by different metrics, which is the reason for the split between the two concepts. More details can be found in QualityML quality vocabulary and encoding or in QualityML.

The use of the domain of the quality measure in QualityML has been described in QualityML vocabulary encoding; it allows a higher-level grouping scheme. QualityML describes several domains to which the quality measures can be applied.

The following examples cover the quality measure conceptually described in QM_7 Consistency SurfRefl value domain QMR. The full XML encoding for this quality measure can be found in Appendix QM Raster, in QM_7 Consistency SurfRefl value domain old 19115 XML and QM_7 Consistency SurfRefl value domain 19115_1 XML.

The encodings of RecordType, Record and valueUnit for this quality measure result are described here for convenience, as an illustration of the QualityML metrics elements, their parameters, and a general "rule" for the quality metric domain:

First option: 19115 schema
	<gmd:DQ_DomainConsistency>
        [...]
		<gmd:result>
			<gmd:DQ_QuantitativeResult>
				<gmd:valueType>
					<gco:RecordType xlink:href="http://www.qualityml.org/1.0/metrics/items">Items</gco:RecordType>
				</gmd:valueType>
				<gmd:valueUnit xlink:href="urn:ogc:def:uom:OGC:1.0:percent"/>
				<gmd:value>
					<gco:Record>
						<qml:Items>
							<qml:rate max="100">5</qml:rate>
						</qml:Items>
						<qmld:NonConformance>
							<qmld:rule>Pixel reflectances with values higher than its theoretical maximum, due to sensor saturation</qmld:rule>
						</qmld:NonConformance>
					</gco:Record>
				</gmd:value>
			</gmd:DQ_QuantitativeResult>
		</gmd:result>
    </gmd:DQ_DomainConsistency>
Second option: 19115-1 schema
	<mdq:DQ_DomainConsistency>
		<mdq:measure>
			[...]
		</mdq:measure>
		<mdq:result>
			<mdq:DQ_QuantitativeResult>
				<mdq:value>
					<gco:Record>
						<qml:Items>
							<qml:rate max="100">5</qml:rate>
						</qml:Items>
						<qmld:NonConformance>
							<qmld:rule>Pixel reflectances with values higher than its theoretical maximum, due to sensor saturation</qmld:rule>
						</qmld:NonConformance>
					</gco:Record>
				</mdq:value>
				<mdq:valueUnit xlink:href="urn:ogc:def:uom:OGC:1.0:percent"/>
				<mdq:valueRecordType>
					<gco:RecordType xlink:href="http://www.qualityml.org/1.0/metrics/items">Items</gco:RecordType>
				</mdq:valueRecordType>
			</mdq:DQ_QuantitativeResult>
		</mdq:result>
	</mdq:DQ_DomainConsistency>

7.5.2. Parameters encoding

Sometimes restrictions on the domain used need to be described, and thus some parameters are needed for the domain.

The following examples cover the quality measure conceptually described in QM_5 Currency TemporalValidity QMR. The full XML encoding for this quality measure can be found in Appendix QM Raster, in QM_5 Currency TemporalValidity old 19115 XML and QM_5 Currency TemporalValidity 19115_1 XML.

The encodings of RecordType, Record and valueUnit for this quality measure result are described here for convenience, as an illustration of the QualityML metrics elements, their parameters, and also a domain and its parameters:

First option: 19115 schema
	<gmd:DQ_TemporalValidity>
		[...]
		<gmd:result>
			<gmd:DQ_QuantitativeResult>
				<gmd:valueType>
					<gco:RecordType xlink:href="http://www.qualityml.org/1.0/metrics/items">Items</gco:RecordType>
				</gmd:valueType>
				<gmd:valueUnit xlink:href="urn:ogc:def:uom:OGC:1.0:percent"/>
				<gmd:value>
					<gco:Record>
						<qml:Items>
							<qml:rate max="100">30</qml:rate>
						</qml:Items>
						<qmld:NonConformance>
							<qmld:range>
								<qmld:min>
									<qmld:date>2014-05-01</qmld:date>
								</qmld:min>
							</qmld:range>
							<qmld:rule>Pixels are more than 24 months older than the time intended for the data (i.e. older than 2014-05-01)</qmld:rule>
						</qmld:NonConformance>
					</gco:Record>
				</gmd:value>
			</gmd:DQ_QuantitativeResult>
		</gmd:result>
	</gmd:DQ_TemporalValidity>
Second option: 19115-1 schema
	<mdq:DQ_TemporalValidity>
		<mdq:measure>
			[...]
		</mdq:measure>
		<mdq:result>
			<mdq:DQ_QuantitativeResult>
				<mdq:value>
					<gco:Record>
						<qml:Items>
							<qml:rate max="100">30</qml:rate>
						</qml:Items>
						<qmld:NonConformance>
							<qmld:range>
								<qmld:min>
									<qmld:date>2014-05-01</qmld:date>
								</qmld:min>
							</qmld:range>
							<qmld:rule>Pixels are more than 24 months older than the time intended for the data (i.e. older than 2014-05-01)</qmld:rule>
						</qmld:NonConformance>
					</gco:Record>
				</mdq:value>
				<mdq:valueUnit xlink:href="urn:ogc:def:uom:OGC:1.0:percent"/>
				<mdq:valueRecordType>
					<gco:RecordType xlink:href="http://www.qualityml.org/1.0/metrics/items">Items</gco:RecordType>
				</mdq:valueRecordType>
			</mdq:DQ_QuantitativeResult>
		</mdq:result>
	</mdq:DQ_TemporalValidity>
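
The arithmetic behind such a result can be sketched as follows. This is only an illustration of how the threshold date and the Items rate relate to each other. It assumes that the time intended for the data is 2016-05-01 (24 months after the 2014-05-01 threshold in the example), and the pixel dates, function names and parameter names are all hypothetical.

```python
from datetime import date


def months_before(d, months):
    """Return the date `months` calendar months before `d`.

    Assumes the same day of month exists in the target month.
    """
    total = d.year * 12 + (d.month - 1) - months
    return date(total // 12, total % 12 + 1, d.day)


def nonconformance_rate(pixel_dates, intended, max_age_months=24, max_value=100):
    """Rate of pixels older than the threshold, scaled to `max_value`."""
    threshold = months_before(intended, max_age_months)
    too_old = sum(1 for d in pixel_dates if d < threshold)
    return threshold, max_value * too_old / len(pixel_dates)


# Hypothetical acquisition dates for ten pixels: three old, seven recent.
pixel_dates = [date(2013, 6, 1)] * 3 + [date(2015, 8, 20)] * 7
threshold, rate = nonconformance_rate(pixel_dates, date(2016, 5, 1))
```

Here threshold corresponds to the qmld:date of the domain range, rate to the content of qml:rate, and max_value to its "max" parameter.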

7.6.1. Temporal extent

The TemporalExtent general metadata element, which is related to the temporal quality element, is usually documented and sometimes provides information needed to better understand the quality measure.

The following example covers the description of the temporal extent related to the quality measure conceptually described in QM_5 Currency TemporalValidity QMR. The full XML encoding for this quality measure can be found in Appendix QM Raster, in QM_5 Currency TemporalValidity old 19115 XML and QM_5 Currency TemporalValidity 19115_1 XML.

The encoding of temporalExtent (following ISO 19115) is described here for convenience (other sections of the same example have been shown previously and the full XML is also referenced above):

	<gmd:EX_Extent>
		<gmd:temporalElement>
			<gmd:EX_TemporalExtent>
				<gmd:extent>
					<gml:TimePeriod gml:id="TE_1">
						<gml:beginPosition>2012-07-15</gml:beginPosition>
						<gml:endPosition>2016-05-01</gml:endPosition>
					</gml:TimePeriod>
				</gmd:extent>
			</gmd:EX_TemporalExtent>
		</gmd:temporalElement>
	</gmd:EX_Extent>
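
To show how the temporal extent helps to interpret the temporal validity measure, the following minimal Python sketch reads the TimePeriod and checks that the 2014-05-01 threshold used in the QM_5 example falls inside it. The gmd namespace URI is the ISO 19139 one; the choice of the GML 3.2 namespace URI and the function name read_time_period are assumptions made for this sketch.

```python
import xml.etree.ElementTree as ET
from datetime import date

NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gml": "http://www.opengis.net/gml/3.2",  # assumed GML version
}

EXTENT = """
<gmd:EX_Extent xmlns:gmd="http://www.isotc211.org/2005/gmd"
               xmlns:gml="http://www.opengis.net/gml/3.2">
  <gmd:temporalElement>
    <gmd:EX_TemporalExtent>
      <gmd:extent>
        <gml:TimePeriod gml:id="TE_1">
          <gml:beginPosition>2012-07-15</gml:beginPosition>
          <gml:endPosition>2016-05-01</gml:endPosition>
        </gml:TimePeriod>
      </gmd:extent>
    </gmd:EX_TemporalExtent>
  </gmd:temporalElement>
</gmd:EX_Extent>
"""


def read_time_period(extent_xml):
    """Return the begin and end dates of the first gml:TimePeriod."""
    root = ET.fromstring(extent_xml)
    period = root.find(".//gml:TimePeriod", NS)
    begin = date.fromisoformat(period.find("gml:beginPosition", NS).text)
    end = date.fromisoformat(period.find("gml:endPosition", NS).text)
    return begin, end


begin, end = read_time_period(EXTENT)
# The non-conformance threshold of the QM_5 example lies within the extent,
# so pixels acquired between `begin` and the threshold are the old ones.
threshold_in_extent = begin <= date(2014, 5, 1) <= end
```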

7.6.2. Cloud coverage percentage

The cloudCoverPercentage general metadata element (in MD_ContentInformation | MD_ImageDescription), which is related to the Completeness quality element, is usually documented when only a single dataset-level percentage is available. If a cloud flag mask (pixel level) is available, it can be described in the metadata as explained in Pixel level Quality.

The following examples cover the description of cloudCoverPercentage related to the quality measure conceptually described in QM_6 Completeness FlagAreas QMR. The full XML encoding for this quality measure can be found in Appendix QM Raster, in QM_6 Completeness FlagAreas old 19115_XML and QM_6 Completeness FlagAreas 19115_1 XML.

The encoding of cloudCoverPercentage is described here for convenience (other sections of the same example will be discussed in Pixel Level):

First option: 19115 schema
	<gmd:contentInfo>
		<gmd:MD_ImageDescription>
			<gmd:attributeDescription>
				<gco:RecordType>Imagery bands</gco:RecordType>
			</gmd:attributeDescription>
			<gmd:contentType>
				<gmd:MD_CoverageContentTypeCode codeList="MD_CoverageContentTypeCode"
				codeListValue="MD_CoverageContentTypeCode_physicalMeasurement"/>
			</gmd:contentType>
			<gmd:dimension>
				<gmd:MD_RangeDimension id="Imagery_R">
					<gmd:descriptor>
						<gco:CharacterString>Red imagery band</gco:CharacterString>
					</gmd:descriptor>
				</gmd:MD_RangeDimension>
			</gmd:dimension>
			<gmd:dimension>
				<gmd:MD_RangeDimension id="Imagery_G">
					<gmd:descriptor>
						<gco:CharacterString>Green imagery band</gco:CharacterString>
					</gmd:descriptor>
				</gmd:MD_RangeDimension>
			</gmd:dimension>
			<gmd:dimension>
				<gmd:MD_RangeDimension id="Imagery_B">
					<gmd:descriptor>
						<gco:CharacterString>Blue imagery band</gco:CharacterString>
					</gmd:descriptor>
				</gmd:MD_RangeDimension>
			</gmd:dimension>
			<gmd:cloudCoverPercentage>
				<gco:Real>26</gco:Real>
			</gmd:cloudCoverPercentage>
		</gmd:MD_ImageDescription>
	</gmd:contentInfo>
Second option: 19115-1 schema
	<mdb:contentInfo>
		<mrc:MI_ImageDescription>
			<mrc:attributeDescription>
				<gco:RecordType>Imagery and quality flag bands</gco:RecordType>
			</mrc:attributeDescription>
			<mrc:attributeGroup>
				<mrc:MD_AttributeGroup id="Imagery">
					<mrc:contentType>
						<mrc:MD_CoverageContentTypeCode codeList="MD_CoverageContentTypeCode"
						codeListValue="MD_CoverageContentTypeCode_physicalMeasurement"/>
					</mrc:contentType>
					<mrc:attribute>
						<mrc:MD_RangeDimension>
							<mrc:description>
								<gco:CharacterString>Red imagery band</gco:CharacterString>
							</mrc:description>
						</mrc:MD_RangeDimension>
					</mrc:attribute>
					<mrc:attribute>
						<mrc:MD_RangeDimension>
							<mrc:description>
								<gco:CharacterString>Green imagery band</gco:CharacterString>
							</mrc:description>
						</mrc:MD_RangeDimension>
					</mrc:attribute>
					<mrc:attribute>
						<mrc:MD_RangeDimension>
							<mrc:description>
								<gco:CharacterString>Blue imagery band</gco:CharacterString>
							</mrc:description>
						</mrc:MD_RangeDimension>
					</mrc:attribute>
				</mrc:MD_AttributeGroup>
			</mrc:attributeGroup>
			<mrc:attributeGroup>
				<mrc:MD_AttributeGroup id="SnowCover_Quality_AttributeGroup">
					<mrc:contentType>
						<mrc:MD_CoverageContentTypeCode codeList="MD_CoverageContentTypeCode"
						codeListValue="MD_CoverageContentTypeCode_qualityInformation"/>
					</mrc:contentType>
					<mrc:attribute>
					    [...]
					</mrc:attribute>
				</mrc:MD_AttributeGroup>
			</mrc:attributeGroup>
			<mrc:cloudCoverPercentage>
				<gco:Real>26</gco:Real>
			</mrc:cloudCoverPercentage>
		</mrc:MI_ImageDescription>
	</mdb:contentInfo>
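
When a pixel-level cloud flag mask is available, the dataset-level percentage can be derived from it. The following minimal Python sketch illustrates this relation; the mask, its size and the function name cloud_cover_percentage are hypothetical and chosen so that the result matches the value of 26 used in the example.

```python
def cloud_cover_percentage(cloud_mask):
    """Percentage of pixels flagged as cloud in a 2D mask of 0/1 values."""
    flags = [flag for row in cloud_mask for flag in row]
    return 100.0 * sum(1 for flag in flags if flag) / len(flags)


# Hypothetical 5 x 10 mask with 13 cloudy pixels out of 50.
mask = [
    [1] * 10,
    [1] * 3 + [0] * 7,
    [0] * 10,
    [0] * 10,
    [0] * 10,
]
percentage = cloud_cover_percentage(mask)
```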

7.6.3. Image processing level

One way to document part of this information is to use general purpose metadata to describe the processing level of the image. ISO 19115-1 introduces a new element in MD_CoverageDescription called processingLevelCode, which is an MD_Identifier; the proposal is therefore to link it to a code described in the QualityML vocabulary using the Anchor element (as explained in other examples above).

The following examples cover the description of the image processing level general metadata related to the quality measure conceptually described in QM_7 Consistency SurfRefl value domain QMR. The full XML encoding for this quality measure can be found in Appendix QM Raster, in QM_7 Consistency SurfRefl value domain old 19115 XML and QM_7 Consistency SurfRefl value domain 19115_1 XML.

The encoding of processingLevelCode is described here for convenience (other sections of the same example have been shown previously and the full XML is also referenced above):

First option: 19115 schema
	<gmd:contentInfo>
		<gmd:MD_ImageDescription>
			<gmd:attributeDescription>
				<gco:RecordType>Surface reflectance</gco:RecordType>
			</gmd:attributeDescription>
			<gmd:contentType>
				<gmd:MD_CoverageContentTypeCode codeList="MD_CoverageContentTypeCode"
				codeListValue="MD_CoverageContentTypeCode_physicalMeasurement"/>
			</gmd:contentType>
			<gmd:dimension>
				<gmd:MD_RangeDimension id="SurfRefl_AttributeGroup">
					<gmd:descriptor>
						<gco:CharacterString>Surface Reflectance</gco:CharacterString>
					</gmd:descriptor>
				</gmd:MD_RangeDimension>
			</gmd:dimension>
			<gmd:processingLevelCode>
				<gmd:MD_Identifier>
					<gmd:code>
						<gmx:Anchor xlink:href="http://www.qualityml.org/processLevel">SurfaceReflectancePseudoInvariantAreas</gmx:Anchor>
					</gmd:code>
				</gmd:MD_Identifier>
			</gmd:processingLevelCode>
		</gmd:MD_ImageDescription>
	</gmd:contentInfo>
Second option: 19115-1 schema
	<mdb:contentInfo>
		<mrc:MI_ImageDescription>
			<mrc:attributeDescription>
				<gco:RecordType>Surface reflectance</gco:RecordType>
			</mrc:attributeDescription>
			<mrc:processingLevelCode>
				<mcc:MD_Identifier>
					<mcc:code>
						<gcx:Anchor xlink:href="http://www.qualityml.org/processLevel">SurfaceReflectancePseudoInvariantAreas</gcx:Anchor>
					</mcc:code>
					<mcc:codeSpace>
						<gcx:Anchor xlink:href="http://www.qualityml.org">QualityML</gcx:Anchor>
					</mcc:codeSpace>
					<mcc:version>
						<gco:CharacterString>1.0</gco:CharacterString>
					</mcc:version>
				</mcc:MD_Identifier>
			</mrc:processingLevelCode>
			<mrc:attributeGroup>
				<mrc:MD_AttributeGroup id="SurfRefl_AttributeGroup">
					<mrc:contentType>
						<mrc:MD_CoverageContentTypeCode codeList="MD_CoverageContentTypeCode"
						codeListValue="MD_CoverageContentTypeCode_physicalMeasurement"/>
					</mrc:contentType>
					<mrc:attribute>
						<mrc:MD_RangeDimension>
							<mrc:description>
								<gco:CharacterString>Surface Reflectance</gco:CharacterString>
							</mrc:description>
						</mrc:MD_RangeDimension>
					</mrc:attribute>
				</mrc:MD_AttributeGroup>
			</mrc:attributeGroup>
		</mrc:MI_ImageDescription>
	</mdb:contentInfo>

7.7. A multicomponent quality attribute

Sometimes more than one value is needed to describe a quality parameter. This is easily solved by using the quality measures described in QualityML, because the schema for each metric can include several child elements if needed.

The following example covers the quality measure conceptually described in QM_3b Accuracy DEM accuracy QMR. The full XML encoding for this quality measure can be found in Appendix QM Raster, in QM_3b Accuracy DEM accuracy old_19115 XML and QM_3b Accuracy DEM accuracy 19115_1 XML.

The encoding of DQ_QuantitativeResult (following ISO 19115) is described here for convenience (the full XML is referenced above):

	<gmd:DQ_QuantitativeResult>
		<gmd:valueType>
			<gco:RecordType xlink:href="http://www.uncertml.org/distributions/normal">Normal Distribution</gco:RecordType>
		</gmd:valueType>
		<gmd:valueUnit xlink:href="urn:ogc:def:uom:OGC:1.0:metre"/>
		<gmd:value>
			<gco:Record>
				<un:NormalDistribution>
					<un:mean>1.21</un:mean>
					<un:variance>3.06</un:variance>
				</un:NormalDistribution>
			</gco:Record>
		</gmd:value>
	</gmd:DQ_QuantitativeResult>
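
A two-component result of this kind can be turned into more familiar figures on the client side. The following minimal Python sketch derives the standard deviation and a central 95% interval from the mean and variance of the example using the standard library. It assumes that the variance is expressed in square metres (the example only states metre as the unit), and the variable names are chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Mean and variance taken from the un:NormalDistribution example.
# Assumption: mean in metres, variance in square metres.
mean, variance = 1.21, 3.06
error = NormalDist(mu=mean, sigma=sqrt(variance))

std_dev = error.stdev
# Central 95% interval of the error distribution.
low, high = error.inv_cdf(0.025), error.inv_cdf(0.975)
```

Under these assumptions the standard deviation is about 1.75 m and the interval is roughly mean ± 1.96 standard deviations.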

8. Pixel level Quality

This ER assumes that dataset level quality and pixel level quality will be documented in an ISO 19115-1 metadata document. In this chapter we discuss the different possible encodings for pixel level quality, reviewing the alternatives and defining a recommended proposal. Moreover, some examples of quality measures described in Quality Measures Raster will be included.

8.1. Encoding alternatives

ISO 19115-1 was initially designed for dataset level metadata, but there are still some ways to convey pixel level quality (one of them is included in ISO 19115-2). Again, this is not complicated in simple cases, but it can be more complex than initially expected, in particular if the dataset is a composite of several layers (or bands) and each layer has quality indicators that require more than one parameter to be described.

The current approach for CoverageResult assumes that the “coverage” that defines the quality is NOT part of the dataset but part of the metadata. This approach adds a link to an “external file” as an MX_DataFile, which has two problems:

  • Since this file is not part of the dataset, it has no “distribution”, so it is likely that it will never be distributed. People will forget about the link or will not be able to support it, because current metadata systems assume that the metadata record is a “monolithic” self-contained XML.

  • In remote sensing, data “bands” and quality “bands” are usually integrated in a single package and they are distributed together. E.g. you can have an IR band, 3 visible bands + a cloud flag band, a nodata flag band, and even an uncertainty band. All bands should be described in the contentInfo.

This is why two approaches will be considered:

  • Quality bands included in the product (and thus explained in the same metadata file) e.g. for MODIS or Landsat-8 quality masks.

  • Quality bands included in other files or even obtained through web services

These two approaches are more in line with common practice in the remote sensing domain.

8.1.1. Alternative 1: Quality coverage

ISO 19115-2 adds a new subclass to DQ_Result specifically designed to include a coverage that contains a raster file that has values indicating the error or the uncertainty for each pixel.

Figure 9. DQ_CoverageResult as an alternative to DQ_QuantitativeResult

The coverage result has several additional attributes of type MD_SpatialRepresentation (to describe the grid, its dimensions, etc.), MD_CoverageDescription (to describe the semantics of the coverage and the bands, grouped in attribute groups), MD_Format (to describe the format of the data) and MX_DataFile (which allows for the inclusion of the file itself).

Figure 10. DQ_CoverageResult details

Example:

XML code fragment for a DQ_DataQuality element that has QE_CoverageResult as a result.
<?xml version="1.0" encoding="UTF-8"?>
<mdq:DQ_DataQuality xmlns:gco="http://standards.iso.org/iso/19115/-3/gco/1.0" xmlns:gml="http://www.opengis.net/gml/3.2" xmlns:cit="http://standards.iso.org/iso/19115/-3/cit/1.0" xmlns:gcx="http://standards.iso.org/iso/19115/-3/gcx/1.0" xmlns:mcc="http://standards.iso.org/iso/19115/-3/mcc/1.0" xmlns:mdq="http://standards.iso.org/iso/19157/-2/mdq/1.0" xmlns:mrd="http://standards.iso.org/iso/19115/-3/mrd/1.0" xmlns:mrc="http://standards.iso.org/iso/19115/-3/mrc/1.0" xmlns:msr="http://standards.iso.org/iso/19115/-3/msr/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink" xsi:schemaLocation="http://standards.iso.org/iso/19115/-3/gco/1.0 ../19115/-3/gco/1.0/gco.xsd http://www.opengis.net/gml/3.2 ../19136/gml.xsd http://standards.iso.org/iso/19115/-3/cit/1.0 ../19115/-3/cit/1.0/cit.xsd http://standards.iso.org/iso/19115/-3/gcx/1.0 ../19115/-3/gcx/1.0/gcx.xsd http://standards.iso.org/iso/19115/-3/mcc/1.0 ../19115/-3/mcc/1.0/mcc.xsd http://standards.iso.org/iso/19157/-2/mdq/1.0 ../19157/-2/mdq/1.0/mdq.xsd http://standards.iso.org/iso/19115/-3/mrd/1.0 ../19115/-3/mrd/1.0/mrd.xsd http://standards.iso.org/iso/19115/-3/mrc/1.0 ../19115/-3/mrc/1.0/mrc.xsd http://standards.iso.org/iso/19115/-3/msr/1.0 ../19115/-3/msr/1.0/msr.xsd">
	<mdq:scope>
		<mcc:MD_Scope>
			<mcc:level>
				<mcc:MD_ScopeCode codeListValue="dataset" codeList="MD_ScopeCode"/>
			</mcc:level>
		</mcc:MD_Scope>
	</mdq:scope>
	<mdq:report>
		<mdq:DQ_QuantitativeAttributeAccuracy>
			<mdq:result>
				<mdq:QE_CoverageResult>
					<mdq:spatialRepresentationType>
						<mcc:MD_SpatialRepresentationTypeCode codeListValue="grid"
						codeList="MD_SpatialRepresentationTypeCode"/>
					</mdq:spatialRepresentationType>
					<mdq:resultFile>
						<mdq:QualityResultFile>
							<mdq:fileName>
								<gcx:FileName>UncertaintyCoverage.tiff</gcx:FileName>
							</mdq:fileName>
							<mdq:fileType>
								<gcx:MimeFileType type="image/tiff"/>
							</mdq:fileType>
							<mdq:fileDescription>
								<gco:CharacterString>Uncertainty Coverage Description</gco:CharacterString>
							</mdq:fileDescription>
							<mdq:fileFormat xlink:href="#TIFF_File_format"/>
						</mdq:QualityResultFile>
					</mdq:resultFile>
					<mdq:resultSpatialRepresentation>
						<msr:MD_Georectified>
							<msr:numberOfDimensions>
								<gco:Integer>2</gco:Integer>
							</msr:numberOfDimensions>
							<msr:cellGeometry>
								<msr:MD_CellGeometryCode codeListValue="point" codeList="MD_CellGeometryCode"/>
							</msr:cellGeometry>
							<msr:transformationParameterAvailability>
								<gco:Boolean>false</gco:Boolean>
							</msr:transformationParameterAvailability>
							<msr:checkPointAvailability>
								<gco:Boolean>false</gco:Boolean>
							</msr:checkPointAvailability>
							<msr:cornerPoints>
								<gml:Point gml:id="ID_1">
									<gml:pos>123.0 45.0</gml:pos>
								</gml:Point>
							</msr:cornerPoints>
							<msr:cornerPoints>
								<gml:Point gml:id="ID_2">
									<gml:pos>123.1 45.1</gml:pos>
								</gml:Point>
							</msr:cornerPoints>
							<msr:pointInPixel>
								<msr:MD_PixelOrientationCode>upperRight</msr:MD_PixelOrientationCode>
							</msr:pointInPixel>
						</msr:MD_Georectified>
					</mdq:resultSpatialRepresentation>
					<mdq:resultContentDescription>
						<mrc:MD_ImageDescription>
							<mrc:attributeDescription>
								<gco:RecordType>uncertainty</gco:RecordType>
							</mrc:attributeDescription>
						</mrc:MD_ImageDescription>
					</mdq:resultContentDescription>
					<mdq:resultFormat>
						<mrd:MD_Format id="TIFF_File_format">
							<mrd:formatSpecificationCitation>
								<cit:CI_Citation>
									<cit:title>
										<gco:CharacterString>TIFF format</gco:CharacterString>
									</cit:title>
								</cit:CI_Citation>
							</mrd:formatSpecificationCitation>
						</mrd:MD_Format>
					</mdq:resultFormat>
				</mdq:QE_CoverageResult>
			</mdq:result>
		</mdq:DQ_QuantitativeAttributeAccuracy>
	</mdq:report>
</mdq:DQ_DataQuality>

Limitations:

  • A single MX_File can be added per DQ_Element. A quality indicator with more than one component should be described using a multiband file. The semantics of each band may be included in resultContentDescription or in the same MX_File if the format allows it (e.g. netCDF file, MMZ file, ...).

  • It is difficult to add semantics. The only possible way is through attributeDescription (RecordType).

  • Coverage data is described as a file that has a path. A more service oriented approach is sometimes convenient.

  • The coverage of quality is sometimes one of the "bands" offered in the band collection, and this needs to be properly described.

8.1.2. Alternative 2: Quality band

ISO 19115-1 adds a new MD_CoverageContentTypeCode called "qualityInformation" specifically designed to include a "band" that contains a raster file that has values indicating the error or the uncertainty for each pixel.