Publication Date: 2017-06-30

Approval Date: 2017-01-26

Posted Date: 2016-11-11

Reference number of this document: OGC 16-041r1

Reference URL for this document: http://www.opengis.net/doc/PER/t12-A080

Category: Public Engineering Report

Editors: Liping Di, Eugene G. Yu, Md Shahinoor Rahman, Ranjay Shrestha

Title: Testbed-12 WPS ISO Data Quality Service Profile Engineering Report


Testbed-12 WPS ISO Data Quality Service Profile Engineering Report (16-041r1)

COPYRIGHT

Copyright © 2017 Open Geospatial Consortium. To obtain additional rights of use, visit http://www.opengeospatial.org/

WARNING

This document is an OGC Public Engineering Report created as a deliverable of an initiative from the OGC Innovation Program (formerly OGC Interoperability Program). It is not an OGC standard and not an official position of the OGC membership. It is distributed for review and comment. It is subject to change without notice and may not be referred to as an OGC Standard. Further, any OGC Engineering Report should not be referenced as required or mandatory technology in procurements. However, the discussions in this document could very well lead to the definition of an OGC Standard.

LICENSE AGREEMENT

Permission is hereby granted by the Open Geospatial Consortium, ("Licensor"), free of charge and subject to the terms set forth below, to any person obtaining a copy of this Intellectual Property and any associated documentation, to deal in the Intellectual Property without restriction (except as set forth below), including without limitation the rights to implement, use, copy, modify, merge, publish, distribute, and/or sublicense copies of the Intellectual Property, and to permit persons to whom the Intellectual Property is furnished to do so, provided that all copyright notices on the intellectual property are retained intact and that each person to whom the Intellectual Property is furnished agrees to the terms of this Agreement.

If you modify the Intellectual Property, all copies of the modified Intellectual Property must include, in addition to the above copyright notice, a notice that the Intellectual Property includes modifications that have not been approved or adopted by LICENSOR.

THIS LICENSE IS A COPYRIGHT LICENSE ONLY, AND DOES NOT CONVEY ANY RIGHTS UNDER ANY PATENTS THAT MAY BE IN FORCE ANYWHERE IN THE WORLD. THE INTELLECTUAL PROPERTY IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, AND NONINFRINGEMENT OF THIRD PARTY RIGHTS. THE COPYRIGHT HOLDER OR HOLDERS INCLUDED IN THIS NOTICE DO NOT WARRANT THAT THE FUNCTIONS CONTAINED IN THE INTELLECTUAL PROPERTY WILL MEET YOUR REQUIREMENTS OR THAT THE OPERATION OF THE INTELLECTUAL PROPERTY WILL BE UNINTERRUPTED OR ERROR FREE. ANY USE OF THE INTELLECTUAL PROPERTY SHALL BE MADE ENTIRELY AT THE USER’S OWN RISK. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR ANY CONTRIBUTOR OF INTELLECTUAL PROPERTY RIGHTS TO THE INTELLECTUAL PROPERTY BE LIABLE FOR ANY CLAIM, OR ANY DIRECT, SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES, OR ANY DAMAGES WHATSOEVER RESULTING FROM ANY ALLEGED INFRINGEMENT OR ANY LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR UNDER ANY OTHER LEGAL THEORY, ARISING OUT OF OR IN CONNECTION WITH THE IMPLEMENTATION, USE, COMMERCIALIZATION OR PERFORMANCE OF THIS INTELLECTUAL PROPERTY.

This license is effective until terminated. You may terminate it at any time by destroying the Intellectual Property together with all copies in any form. The license will also terminate if you fail to comply with any term or condition of this Agreement. Except as provided in the following sentence, no such termination of this license shall require the termination of any third party end-user sublicense to the Intellectual Property which is in force as of the date of notice of such termination. In addition, should the Intellectual Property, or the operation of the Intellectual Property, infringe, or in LICENSOR’s sole opinion be likely to infringe, any patent, copyright, trademark or other right of a third party, you agree that LICENSOR, in its sole discretion, may terminate this license without any compensation or liability to you, your licensees or any other party. You agree upon termination of any kind to destroy or cause to be destroyed the Intellectual Property together with all copies in any form, whether held by you or by any third party.

Except as contained in this notice, the name of LICENSOR or of any other holder of a copyright in all or part of the Intellectual Property shall not be used in advertising or otherwise to promote the sale, use or other dealings in this Intellectual Property without prior written authorization of LICENSOR or such copyright holder. LICENSOR is and shall at all times be the sole entity that may authorize you or any third party to use certification marks, trademarks or other special designations to indicate compliance with any LICENSOR standards or specifications.

This Agreement is governed by the laws of the Commonwealth of Massachusetts. The application to this Agreement of the United Nations Convention on Contracts for the International Sale of Goods is hereby expressly excluded. In the event any provision of this Agreement shall be deemed unenforceable, void or invalid, such provision shall be modified so as to make it valid and enforceable, and as so modified the entire Agreement shall remain in full force and effect. No decision, action or inaction by LICENSOR shall be construed to be a waiver of any rights or remedies available to it.

None of the Intellectual Property or underlying information or technology may be downloaded or otherwise exported or reexported in violation of U.S. export laws and regulations. In addition, you are responsible for complying with any local laws in your jurisdiction which may impact your right to import, export or use the Intellectual Property, and you represent that you have complied with any regulations or registration procedures required by applicable law to make this license enforceable.

Abstract

This Data Quality Engineering Report describes data quality handling requirements, challenges, and solutions. One focus is on data quality in general, which needs to be communicated from one service to another. In addition, the report discusses WPS data quality solutions; the ultimate goal is to nominate them as a WPS ISO Data Quality Service Profile. ISO 19139 is used as the base to encode the data quality. WPS and workflows are used to streamline and standardize the process of data quality assurance and quality control. The main topics include: (1) a generalized summary and description of the design and best practices for analyzing the data quality of all feature data sources used in the Citizen Observatory WEB (COBWEB) project, (2) solutions and recommendations for making the provenance of data quality transparent to end users when data is processed through a WPS, (3) best practices and recommendations for designing and prototyping a WPS profile to support data quality services conformant to the NSG Metadata Framework, and (4) a general data quality solution that fits both raster-based imagery and vector-based features.

Business Value

This Engineering Report (ER) captures the essence and best practices for data quality that were successfully established and applied in the Citizen Observatory Web (COBWEB) project. It goes one step further to formalize and standardize those processes as OGC WPS processes that address data quality issues by using networks of "people as sensors" and by analyzing observations and measurements in real time in combination with authoritative models and datasets. The ER content can be summarized as follows:

  • Innovative use of crowdsourcing and citizen sensors to solve data quality control and assurance with seven prescribed standard WPS processes,

  • Formalization of the processes that solve data quality issues using citizen sensors, harmonizing data and service interoperation across processes exposed as Web services, and

  • Achievement of compatible data quality assurance levels.

Technology Value

The relevance and importance of this ER to the WPS 2.0 SWG are obvious in two aspects. On the one hand, the best practices and solutions described in the ER utilize WPS 2.0 as a general framework and service implementation specification to achieve data quality control and assurance when dealing with networks of citizen sensors and the information they offer. Each data quality operation is implemented as a WPS process. The adoption of WPS not only benefits high-level interoperation among services, but also promotes the application of WPS in citizen sensor networks. On the other hand, the formalization and standardization of the seven processes identified in the COBWEB project lead to the development of a WPS profile, based on ISO data quality standards, that is applicable to citizen sensor data quality control and assurance. The seven WPS processes are: (1) LBS-Positioning, (2) Cleaning, (3) Automatic Validation, (4) Authoritative Data Comparison, (5) Model-Based Validation, (6) Linked Data Analysis, and (7) Semantic Harmonization.

How does this ER relate to the work of the Working Group

This ER demonstrates a use case for web-based processing using the WPS 2.0 interface standard. Also, a basis for a data quality WPS profile is described. The goal of the hierarchical profiling approach specified in the WPS 2.0 standard is to foster interoperability among different WPS clients and servers. A data quality profile could serve as proof of concept of the WPS 2.0 profiling approach and could be used to incorporate data quality checks in (automated) geoprocessing workflows.

Keywords

ogcdocs, testbed-12, WPS, Web services, ISO 19139, ISO 19115, Workflow

Proposed OGC Working Group for Review and Approval

The ER will be submitted to WPS 2.0 SWG for review. The ultimate goal is to develop and promote it as a WPS profile with the approval of WPS 2.0 SWG.

1. Introduction

1.1. Scope

This report captures the best practice of using WPS processes as an interoperation framework to support data quality assurance and control using networks of "people as sensors". Seven processes for data quality control are formalized and specified as WPS processes. Interoperation among processes, as well as with citizen sensors, is enabled at both the data and service levels.

1.2. Document contributor contact points

All questions regarding this document should be directed to the editor or the contributors:

Table 1.1. Contacts

Name                  Organization
Eugene G. Yu          George Mason University/CSISS
Liping Di             George Mason University/CSISS
Md Shahinoor Rahman   George Mason University/CSISS
Ranjay Shrestha       George Mason University/CSISS
Lingjun Kang          George Mason University/CSISS
Sam Meek              Helyx Secure Information Systems Ltd

1.3. Future Work

Several future recommendations have been identified. Details will be discussed in the section on Future Recommendations. The recommendations are: (1) alignment with the evolution of geospatial standards, (2) data quality workflow enablement, and (3) data quality service test suites.

1.4. Foreword

Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. The Open Geospatial Consortium shall not be held responsible for identifying any or all such patent rights.

Recipients of this document are requested to submit, with their comments, notification of any relevant patent claims or other intellectual property rights of which they may be aware that might be infringed by any implementation of the standard set forth in this document, and to provide supporting documentation.

2. References

The following documents are referenced in this document. For dated references, subsequent amendments to, or revisions of, any of these publications do not apply. For undated references, the latest edition of the normative document referred to applies.

  • OGC 06-121r9, OGC® Web Services Common Standard

Note
This OWS Common Standard contains a list of normative references that are also applicable to this document.
  • OGC 14-065, OGC® WPS 2.0 Interface Standard

  • ISO 19157:2013, Geographic information — Data quality

  • ISO/DTS 19157-2, Geographic information — Data quality — Part 2: XML Schema Implementation

  • ISO 19115:2003, Geographic information — Metadata

3. Terms and definitions

For the purposes of this report, the definitions specified in Clause 4 of the OWS Common Implementation Standard [OGC 06-121r9] and in OGC® Abstract Specification Topic 11: Metadata [OGC 01-111] shall apply. In addition, the following terms and definitions apply.

3.1. data quality

Data quality is a concept used in this report to represent geospatial data quality through multiple components, including data validity, precision, and accuracy. Data validity may be described as “fitness for use,” i.e., the degree to which data are fit for an application. Geospatial precision is related to resolution and variation. Geospatial accuracy refers to how close a measurement is to the true value.

3.2. positional accuracy

the quantifiable value that represents the positional difference between two geospatial layers or between a geospatial layer and reality

3.3. lineage

description of the history of the spatial data, including descriptions of the source material from which the data were derived, and the methods of derivation

3.4. attribute accuracy

the accuracy of the quantitative and qualitative information attached to each feature

3.5. consistency

description of the dependability of relationships encoded in the data structure of the digital spatial data

3.6. completeness

the degree to which geographic features, their attributes and their relationships are included or omitted in a dataset

4. Conventions

4.1. Abbreviated terms

  • API Application Program Interface

  • BPMN Business Process Model and Notation

  • COBWEB Citizen OBservatory WEB

  • COM Component Object Model

  • CORBA Common Object Request Broker Architecture

  • COTS Commercial Off The Shelf

  • DCE Distributed Computing Environment

  • DCOM Distributed Component Object Model

  • DQ Data Quality

  • DTS Draft Technical Specification

  • GeoJSON Geographic JavaScript Object Notation

  • GML Geography Markup Language

  • IDL Interface Definition Language

  • ISO International Organization for Standardization

  • JSON JavaScript Object Notation

  • NGA National Geospatial-Intelligence Agency

  • NSG National System for Geospatial Intelligence

  • UC Use Case

  • WFS Web Feature Service

  • WPS Web Processing Service

  • XML Extensible Markup Language

4.2. UML notation

Most diagrams that appear in this report are presented using the Unified Modeling Language (UML) static structure diagram, as described in Subclause 5.2 of [OGC 06-121r9].

4.3. Used parts of other documents

This document uses significant parts of document [OGC 06-121r9]. To reduce the need to refer to that document, this document copies some of those parts with small modifications. To indicate those parts to readers of this document, the largely copied parts are shown with a light grey background (15%).

5. Overview

Data quality services are the focus of this report. Specifications and standards define how data quality is described and presented. Many processes for deriving data quality share common solutions across different cases. This Engineering Report (ER) aims to enable automation of commonly required data quality measurements and assessments, using the Web Processing Service (WPS) as the vehicle to achieve that automation.

6. Status Quo & New Requirements Statement

6.1. Status Quo

6.1.1. Data quality assurance and data quality control

The Citizen Observatory Web (COBWEB) is a citizen science project that explores the potential of combining citizen resources and open geospatial standards to support biosphere data collection, validation, and analysis[1]. Its infrastructure assembles a suite of technologies into a citizens’ observatory framework that effectively exploits technological developments in ubiquitous mobile devices, crowdsourcing of geographic information, and the operational application of standards-based spatial data infrastructure (SDI). The framework enables citizens to collect environmental information on a range of parameters including species distribution, flooding, and land cover/use[1, 2]. Workflows were used to compose complex processes from component services[3]. Dealing with diverse sources of data, the project had to tackle data quality issues. One important and efficient approach was its adoption of WPS processes to enable data quality assurance and validation. Data quality was addressed by using networks of “people as sensors” and by analyzing observations and measurements in real time in combination with authoritative models and datasets. The COBWEB project represents the status quo, or starting point, for the work done in Testbed-12 to develop and formalize WPS processes that facilitate data quality assurance.

6.1.2. Data quality assessment challenges

In the COBWEB project, the central challenge of quality assurance was how to design and implement a system flexible enough to qualify data with different fitness-for-purpose requirements and different data schemas, recorded by different devices. More specifically, the challenges are as follows.

  1. Fitness of the data quality model: What should be modeled? What is the proper modeling process? What are the variations across capture devices and persons?

  2. Provenance: The history of data collection is important; it is related to the curation processes involved.

  3. Metaquality: Questions about the quality of the DQ metadata itself need to be answered. How should accuracy be defined? How should completeness be defined? What criteria and strategies should be used to maintain consistency?

  4. Levels of DQ assessment: DQ assessment can be done at different levels. What is the proper level? Does assessment need to be as detailed as the individual dataset, or is it sufficient to evaluate at the collection level?

  5. Propagation of data uncertainty: Data error and uncertainty may be propagated through a chain of processes when multiple processes are involved. How should propagation through workflows be represented and recorded? How can propagation through data fusion be tracked?

6.2. Requirements Statement

The requirements for the sponsor, NGA, differ from the COBWEB project in the following ways:

  • The data is authoritative.

  • The data is likely to have a static structure.

  • Metadata is likely to exist for the products, which can be utilized in the qualification process.

  • In COBWEB the focus was on observations recorded as points; this project requires qualification to be performed on different types of data, including points, lines, polygons, and images.

By analyzing the requirements and the demand on data quality services, the following common requirements can be identified:

  1. Quality assurance of data quality: This defines what is to be assessed, how it is assessed, and the required standard approach.

  2. Fitting data quality assessment approaches: An atomic process may be represented as a WPS process. A complex assessment process may combine or chain several atomic processes into workflows. Enabling workflows and composition of atomic processes provides the adaptability and flexibility needed to meet requirements at different levels of complexity. Efficiency is achieved through the enhanced reusability of atomic processes.

  3. Provenance: This keeps track of data quality and data history.

  4. Unified aggregation of data quality to higher levels: Approaches and methods for aggregating data quality need to be unified.

  5. Standard mechanisms to encode, store, and retrieve data quality metadata at multiple levels: Different levels of detail may call for different encoding, storage, and access mechanisms. Geospatial data are generally dealt with at two levels: dataset and data collection.

  6. Data quality consumption: For data quality processes and outputs, it should be clearly understood who the intended consumer of the recorded DQ information is. Typically, two distinguishable types of consumption should be considered: machine-readable and human-readable.

There are also extra requirements from the sponsor, including adherence to specific standards for data and metadata. The main required standards are the NSG Metadata Framework and ISO 19115 metadata documents. Recently, the quality elements of ISO 19115 have been split into a separate document, ISO 19157, which will be adopted for recording the quality elements as the specifications from OGC and NGA evolve. In this testbed, all of these are taken into account in designing the WPS Data Quality processes.

7. Solutions

7.1. Targeted Solutions

7.1.1. Overall Design Strategy and Architecture

Data quality (DQ) involves different aspects: completeness, positional accuracy, topological accuracy, domain consistency, conceptual consistency, format consistency, and correctness. It is recommended that such DQ functionality be implemented as a series of atomic WPS processes. WPS, as an OGC processing specification, is identified as a fit technology for implementing DQ processes in the Web environment[1, 2]. To handle complexity and multiple levels of granularity, each WPS process should be designed to be as atomic as possible, allowing its reuse in composition through workflows.

As defined in ISO 19157, there are many DQ criteria tests. The DQ WPS should consist of a set of atomic DQ WPS test processes that meet the functional requirements defined in ISO 19157. Each DQ process should be configurable and atomic, and should be passed metrics that correspond to the Universe of Discourse, i.e., the thresholds for what is considered quality in ISO 19157 terms. The WPS processes all follow a similar design so that they are interoperable, suited for chaining, and conformant to a uniform pattern. This is depicted in the following figure.

Figure 7.1. Atomic DQ WPS process

Each atomic DQ WPS process may take two types of inputs: data and reference data. Both can be served through standard WCS or WFS services and can be encoded in GML, GeoJSON, XML, or JSON. An atomic DQ WPS process may output metadata in XML and, optionally, non-conforming data in GML. The output should contain a statement clarifying its conformance.
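To make this pattern concrete, the following is a minimal sketch of a WPS 2.0 Execute request against a hypothetical deployment of the generic completeness process defined later in Table 7.1. The endpoint URL and the input/output identifiers are illustrative assumptions, not part of an adopted profile.

    # Minimal sketch: POST a WPS 2.0 Execute request to an atomic DQ process.
    # The endpoint and the input/output identifiers are hypothetical.
    import requests

    WPS_ENDPOINT = "http://example.org/wps"  # placeholder service URL

    execute_request = """<?xml version="1.0" encoding="UTF-8"?>
    <wps:Execute service="WPS" version="2.0.0" response="document" mode="sync"
        xmlns:wps="http://www.opengis.net/wps/2.0"
        xmlns:ows="http://www.opengis.net/ows/2.0"
        xmlns:xlink="http://www.w3.org/1999/xlink">
      <ows:Identifier>iso19157.DQ_Completeness.DQ_Completeness</ows:Identifier>
      <!-- data and reference data can be served from standard WFS endpoints -->
      <wps:Input id="target_dataset">
        <wps:Reference xlink:href="http://example.org/wfs?service=WFS&amp;request=GetFeature&amp;typeName=roads"/>
      </wps:Input>
      <wps:Input id="reference_dataset">
        <wps:Reference xlink:href="http://example.org/wfs?service=WFS&amp;request=GetFeature&amp;typeName=roads_ref"/>
      </wps:Input>
      <!-- return the quality report inline as an XML document -->
      <wps:Output id="quality_metadata" transmission="value"/>
    </wps:Execute>"""

    response = requests.post(WPS_ENDPOINT, data=execute_request,
                             headers={"Content-Type": "application/xml"})
    print(response.text)  # ISO-encoded DQ metadata or a wps:StatusInfo document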

There are three main aspects of data quality issues to be tested with DQ WPS processes. They are as follows.

  1. Data Quality Assurance/Quality Control WPS,

  2. Encoding/curating data quality: correctness, completeness, consistency, and provenance, and

  3. Standard data quality metadata consumption: making the mapping to the NSG Metadata Framework mandatory, and providing both machine-readable and human-readable formats.

7.1.2. Completeness Omission/Completeness Commission

Completeness has two connotations. One is to inspect omission, i.e., how much is missing from the geospatial database. The other is to inspect commission, i.e., how much is falsely included in the geospatial database. The measurements can be counts, rates, or numbers of duplicates. This can be implemented as one WPS process that performs the computation by comparing the geospatial database with a reference geospatial database.
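As a minimal illustration of this comparison, the following Python sketch derives omission and commission counts and rates from the values of a lookup field in the target and reference datasets. The measure definitions and their encoding are governed by ISO 19157; the field handling here is deliberately simplified.

    # Minimal sketch: omission/commission counts and rates computed by
    # comparing a lookup field between a target and a reference dataset.
    from collections import Counter

    def completeness(target_values, reference_values):
        target = Counter(target_values)        # entry type -> frequency
        reference = Counter(reference_values)
        omitted = reference.keys() - target.keys()    # missing from target
        committed = target.keys() - reference.keys()  # falsely included
        return {
            "omission_count": len(omitted),
            "commission_count": len(committed),
            "omission_rate_pct": 100.0 * len(omitted) / max(len(reference), 1),
            "commission_rate_pct": 100.0 * len(committed) / max(len(target), 1),
        }

    # Example: 'motorway' is omitted and 'track' is committed.
    print(completeness(["road", "track"], ["road", "motorway"]))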

7.1.3. Positional Accuracy

Positional accuracy is related to geometrical measurements. There are two cases with quite distinct characteristics due to their different formats: vector and raster. Two separate processes are proposed to deal with these different types of geospatial databases.

Positional Accuracy (vector feature)

Vector-based geospatial features are often managed by a database or database-like system. Each feature has a set of attributes, and one or more fields form the primary key. By joining the database to the reference database on this key, one can verify whether the features have the required positional accuracy. This will be designed as a dedicated WPS process.

Positional Accuracy (gridded)

Raster-based geospatial data are concerned with spatial resolution and location displacement. Comparison and validation against a reference raster dataset need to consider both. This will be developed as a dedicated WPS process that checks positional accuracy against a reference dataset.
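A minimal sketch of such a check is shown below, assuming GDAL-readable rasters; the displacement tolerance, expressed as a fraction of the reference pixel size, is an illustrative convention rather than a profile requirement.

    # Minimal sketch: compare spatial resolution and origin displacement
    # of a target raster against a reference raster (GDAL assumed).
    from osgeo import gdal

    def raster_positional_check(target_path, reference_path, tol=0.5):
        tg = gdal.Open(target_path).GetGeoTransform()
        rg = gdal.Open(reference_path).GetGeoTransform()
        # geotransform = (origin_x, pixel_w, rot_x, origin_y, rot_y, pixel_h)
        same_resolution = tg[1] == rg[1] and tg[5] == rg[5]
        dx, dy = abs(tg[0] - rg[0]), abs(tg[3] - rg[3])
        within_tolerance = dx <= tol * abs(rg[1]) and dy <= tol * abs(rg[5])
        return same_resolution and within_tolerance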

7.1.4. Topological Consistency

Geometrical contradictions should not exist in the resulting geospatial database. This requires verifying that geospatial rules are met, such as one location per point, polygons bounded by lines, etc. A WPS process will be designed and developed to complete the consistency check within a single geospatial database.
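A minimal sketch of a geometry-level consistency check follows, assuming GeoJSON-like features and using Shapely's validity rules as a stand-in for the full set of topological rules.

    # Minimal sketch: flag features whose geometries violate basic
    # topological rules (self-intersections, unclosed rings, and so on).
    from shapely.geometry import shape
    from shapely.validation import explain_validity

    def topological_consistency(features):
        errors = []
        for i, feature in enumerate(features):
            geom = shape(feature["geometry"])
            if not geom.is_valid:
                errors.append((i, explain_validity(geom)))
        return errors  # an empty list means the dataset passed the check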

7.2. Recommendations

ISO standards will be adopted to encode data quality. Specifically, ISO 19157 is one of the primary standards supporting different aspects of data quality. The mapping of elements is shown in the following figure.

Figure 7.2. ISO 19157 Element Map

In the overall design, the following are recommended for dealing with data quality issues.

  1. WPS workflow enablement with BPMN for flexibility

  2. Seven important aspects of data quality control: location-based-service position correction, cleaning, model-based validation, authoritative data comparison, automatic validation, linked data analysis, and semantic harmonization (Meek, Jackson, and Leibovici, 2014)

  3. Recommended levels of data quality metadata: multiple levels of conformance to meet different requirements, and standard information to make users aware of the levels of data quality assurance and data quality control applied.

7.2.1. Completeness Omission/Completeness Commission WPS processes

The following table defines the generic WPS process for evaluating Completeness Omission/Completeness Commission. There are two types: omission and commission. The process can be further broken down into separate processes for vector-based and raster-based features.

Table 7.1. Completeness WPS Process

Name:

iso19157.DQ_Completeness.DQ_Completeness

Description:

1. Calculate omission and commission of a dataset based on a reference dataset.

2. Calculate rate of omission and commission of a dataset based on a reference dataset.

3. Calculate duplicate features within a dataset, for both vector-based and raster-based features.

Input:

Target dataset and its field of interest; reference dataset and its field of interest.

Algorithm:

1. Summarizes the data in each dataset, calculates entry type and frequency for both datasets, and compares the results.

2. Uses the summary table calculated in 1) and calculates a percentage of omission/commission.

3. Performs a multi-step check on the dataset. Compares the geometry of each feature to all other features; if the geometries match, then compares each of the fields within the dataset; if the values all match, then the entry is a duplicate (see the sketch after this table).

Output:

One of the following:

1. A table listing all data types and frequency for both target and reference datasets.

2. A list of data types and rate of omission/commission

3. The number of duplicate features.
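The duplicate check in step 3 of the algorithm above can be sketched as follows, assuming GeoJSON-like features. The pairwise comparison is quadratic in the number of features and is shown only to illustrate the logic.

    # Minimal sketch of the multi-step duplicate check: two features are
    # duplicates when their geometries match and all field values match.
    from shapely.geometry import shape

    def count_duplicates(features):
        duplicates = 0
        for i, a in enumerate(features):
            geom_a = shape(a["geometry"])
            for b in features[i + 1:]:
                if geom_a.equals(shape(b["geometry"])) \
                        and a["properties"] == b["properties"]:
                    duplicates += 1
        return duplicates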

Completeness Omission WPS processes

This section describes completeness omission WPS processes.

Completeness Omission WPS process for vector-based dataset

The following table defines the WPS process to evaluate the Completeness Omission of a vector-based dataset.

Table 7.2. Completeness Omission WPS Process for vector-based dataset

Name:

iso19157.DQ_Completeness.DQ_CompletenessOmission

Description:

1. Calculate omission of a vector dataset based on a reference vector dataset.

2. Calculate rate of omission of a vector dataset based on a reference vector dataset.

3. Calculate duplicated features within a vector dataset.

Input:

1. Target vector dataset to be qualified

2. Reference vector dataset to qualify the target vector dataset against

3. Lookup field for the target vector dataset

4. Lookup field for the reference vector dataset

5. Link to metadata document (optional)

6. Threshold for omission rate (percentage)

Algorithm:

1. Summarizes the data in each dataset, calculates entry type and frequency for both datasets, and compares the results.

2. Uses the summary table calculated in 1) and calculates a percentage of omission.

3. Performs a multi-step check on the dataset. Compares the geometry of each feature to all other features; if the geometries match, then compares each of the fields within the dataset; if the values all match, then the entry is a duplicate.

Output:

One of the following:

1. A table listing all data types and frequency for both target and reference datasets.

2. A list of data types and rate of omission

3. The number of duplicate features.

UML:

See Figure 7.3.

Example:

Endpoint: http://54.201.124.35/wps/WebProcessingService

Request: See example shown in Table A.1. in Appendix A.

Response: See example response shown in Table A.1. in Appendix A.

Figure 7.3. UML model for the Completeness Omission WPS process (vector-based dataset)

Completeness Omission WPS process for raster-based dataset

The following table defines the WPS process to evaluate the Completeness Omission of a raster-based dataset.

Table 7.3. Completeness Omission WPS Process for raster-based dataset

Name:

iso19157.DQ_Completeness.DQ_CompletenessOmissionR

Description:

1. Calculate omission of a raster dataset based on a reference raster dataset.

2. Calculate rate of omission of a raster dataset based on a reference raster dataset.

3. Calculate duplicated features within a raster dataset.

Input:

1. Target raster dataset to be qualified

2. Link to metadata document (optional)

3. Threshold for omission rate (percentage)

Algorithm:

1. Summarizes the data and calculates entry type and frequency for the input dataset.

2. Uses the summary table calculated in 1) and calculates a percentage of omission.

3. Performs a multi-step check on the dataset. Compares the pixels of a feature to those of all other features; if the geometries match, then compares each of the fields within the dataset; if the values all match, then the entry is a duplicate.

Output:

One of the following as the result of the test:

1. A table listing all data types and frequency.

2. A list of data types and rate of omission

3. The number of duplicate features.

UML:

See Figure 7.4.

Example:

Request: See example shown in Table A.2. in Appendix A.

Response: See example response shown in Table A.2. in Appendix A.

Figure 7.4. UML model for the Completeness Omission WPS process (raster-based dataset)

Completeness Commission WPS processes

This section describes completeness commission WPS processes.

Completeness Commission WPS process for vector-based dataset

The following table defines the WPS process to evaluate the Completeness Commission of a vector-based dataset.

Table 7.4. Completeness Commission WPS Process for vector-based dataset

Name:

iso19157.DQ_Completeness.DQ_CompletenessCommission

Description:

1. Calculate commission of a vector dataset based on a reference vector dataset.

2. Calculate rate of commission of a vector dataset based on a reference vector dataset.

3. Calculate duplicated features within a vector dataset.

Input:

1. Target vector dataset to be qualified

2. Reference vector dataset to qualify the target vector dataset against

3. Lookup field for the target vector dataset

4. Lookup field for the reference vector dataset

5. Link to metadata document (optional)

6. Threshold for commission rate (percentage)

Algorithm:

1. Summarizes the data in each dataset, calculates entry type and frequency for both datasets, and compares the results.

2. Uses the summary table calculated in 1) and calculates a percentage of commission.

3. Performs a multi-step check on the dataset. Compares the geometry of each feature to all other features; if the geometries match, then compares each of the fields within the dataset; if the values all match, then the entry is a duplicate.

Output:

One of the following:

1. A table listing all data types and frequency for both target and reference datasets.

2. A list of data types and rate of commission

3. The number of duplicate features.

UML:

See Figure 7.5.

Example:

Request: See example shown in Table A.3. in Appendix A.

Response: See example response shown in Table A.3. in Appendix A.

Figure 7.5. UML model for the Completeness Commission WPS process (vector-based dataset)

Completeness Commission WPS process for raster-based dataset

The following table defines the WPS process to evaluate the Completeness Commission of a raster-based dataset.

Table 7.5. Completeness Commission WPS Process for raster-based dataset

Name:

iso19157.DQ_Completeness.DQ_CompletenessCommissionR

Description:

1. Calculate commission of a raster dataset based on a reference raster dataset.

2. Calculate rate of commission of a raster dataset based on a reference raster dataset.

3. Calculate duplicated features within a raster dataset.

Input:

1. Target raster dataset to be qualified

2. Link to metadata document (optional)

3. Threshold for commission rate (percentage)

Algorithm:

1. Summarizes the data and calculates entry type and frequency for the input dataset.

2. Uses the summary table calculated in 1) and calculates a percentage of commission.

3. Performs a multi-step check on the dataset. Compares the pixels of a feature to those of all other features; if the geometries match, then compares each of the fields within the dataset; if the values all match, then the entry is a duplicate.

Output:

One of the following as the result of the test:

1. A table listing all data types and frequency.

2. A list of data types and rate of commission

3. The number of duplicate features.

UML:

See Figure 7.6.

Example:

Request: See example shown in Table A.4. in Appendix A.

Response: See example response shown in Table A.4. in Appendix A.

Figure 7.6. UML model for the Completeness Commission WPS process (raster-based dataset)

7.2.2. Positional Accuracy WPS processes

This section describes positional accuracy WPS processes.

Positional Accuracy (vector feature) WPS process

The following table defines the Positional Accuracy (vector feature) WPS process.

Table 7.6. Positional Accuracy (vector feature) WPS processes

Name:

iso19157.DQ_PositionalAccuracy.DQ_AbsoluteExternalPositionalAccuracy

Description:

Calculates the positional accuracy of a target dataset given a reference dataset and lookup field

Input:

Target dataset, target dataset field ID, reference dataset, reference dataset field ID

Algorithm:

Takes the target dataset and matches its entries with those in the reference dataset by comparing their identifiers (IDs), which must be integers, i.e., the target dataset field ID and the reference dataset field ID defined in the inputs (sketched after this table).

Output:

The mean uncertainties as defined by ISO 19157

UML:

See Figure 7.7.

Example:

Request: See example shown in Table B.1. in Appendix B.

Response: See example response shown in Table B.1. in Appendix B.
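A minimal sketch of the algorithm in Table 7.6 follows, assuming GeoJSON-like features joined on integer ID fields; the single mean distance reported here stands in for the fuller set of ISO 19157 positional uncertainty measures.

    # Minimal sketch: join target and reference features on their ID fields
    # and report the mean positional error between matched geometries.
    from shapely.geometry import shape

    def mean_positional_uncertainty(target_feats, reference_feats,
                                    target_id_field, reference_id_field):
        # field names are placeholders for the IDs declared in the inputs
        ref = {f["properties"][reference_id_field]: shape(f["geometry"])
               for f in reference_feats}
        errors = [shape(f["geometry"]).distance(ref[f["properties"][target_id_field]])
                  for f in target_feats
                  if f["properties"][target_id_field] in ref]
        return sum(errors) / len(errors) if errors else float("nan")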