Publication Date: 2018-01-26
Approval Date: 2018-01-22
Posted Date: 2017-11-11
Reference number of this document: OGC 17-018
Reference URL for this document: http://www.opengis.net/doc/PER/t13-FA002
Category: Public Engineering Report
Editor: Alaitz Zabala, Joan Maso
Title: OGC Testbed-13: Data Quality Specification Engineering Report
COPYRIGHT
Copyright © 2018 Open Geospatial Consortium. To obtain additional rights of use, visit http://www.opengeospatial.org/
WARNING
This document is not an OGC Standard. This document is an OGC Public Engineering Report created as a deliverable in an OGC Interoperability Initiative and is not an official position of the OGC membership. It is distributed for review and comment. It is subject to change without notice and may not be referred to as an OGC Standard. Further, any OGC Engineering Report should not be referenced as required or mandatory technology in procurements. However, the discussions in this document could very well lead to the definition of an OGC Standard.
LICENSE AGREEMENT
Permission is hereby granted by the Open Geospatial Consortium, ("Licensor"), free of charge and subject to the terms set forth below, to any person obtaining a copy of this Intellectual Property and any associated documentation, to deal in the Intellectual Property without restriction (except as set forth below), including without limitation the rights to implement, use, copy, modify, merge, publish, distribute, and/or sublicense copies of the Intellectual Property, and to permit persons to whom the Intellectual Property is furnished to do so, provided that all copyright notices on the intellectual property are retained intact and that each person to whom the Intellectual Property is furnished agrees to the terms of this Agreement.
If you modify the Intellectual Property, all copies of the modified Intellectual Property must include, in addition to the above copyright notice, a notice that the Intellectual Property includes modifications that have not been approved or adopted by LICENSOR.
THIS LICENSE IS A COPYRIGHT LICENSE ONLY, AND DOES NOT CONVEY ANY RIGHTS UNDER ANY PATENTS THAT MAY BE IN FORCE ANYWHERE IN THE WORLD. THE INTELLECTUAL PROPERTY IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, AND NONINFRINGEMENT OF THIRD PARTY RIGHTS. THE COPYRIGHT HOLDER OR HOLDERS INCLUDED IN THIS NOTICE DO NOT WARRANT THAT THE FUNCTIONS CONTAINED IN THE INTELLECTUAL PROPERTY WILL MEET YOUR REQUIREMENTS OR THAT THE OPERATION OF THE INTELLECTUAL PROPERTY WILL BE UNINTERRUPTED OR ERROR FREE. ANY USE OF THE INTELLECTUAL PROPERTY SHALL BE MADE ENTIRELY AT THE USER’S OWN RISK. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR ANY CONTRIBUTOR OF INTELLECTUAL PROPERTY RIGHTS TO THE INTELLECTUAL PROPERTY BE LIABLE FOR ANY CLAIM, OR ANY DIRECT, SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES, OR ANY DAMAGES WHATSOEVER RESULTING FROM ANY ALLEGED INFRINGEMENT OR ANY LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR UNDER ANY OTHER LEGAL THEORY, ARISING OUT OF OR IN CONNECTION WITH THE IMPLEMENTATION, USE, COMMERCIALIZATION OR PERFORMANCE OF THIS INTELLECTUAL PROPERTY.
This license is effective until terminated. You may terminate it at any time by destroying the Intellectual Property together with all copies in any form. The license will also terminate if you fail to comply with any term or condition of this Agreement. Except as provided in the following sentence, no such termination of this license shall require the termination of any third party end-user sublicense to the Intellectual Property which is in force as of the date of notice of such termination. In addition, should the Intellectual Property, or the operation of the Intellectual Property, infringe, or in LICENSOR’s sole opinion be likely to infringe, any patent, copyright, trademark or other right of a third party, you agree that LICENSOR, in its sole discretion, may terminate this license without any compensation or liability to you, your licensees or any other party. You agree upon termination of any kind to destroy or cause to be destroyed the Intellectual Property together with all copies in any form, whether held by you or by any third party.
Except as contained in this notice, the name of LICENSOR or of any other holder of a copyright in all or part of the Intellectual Property shall not be used in advertising or otherwise to promote the sale, use or other dealings in this Intellectual Property without prior written authorization of LICENSOR or such copyright holder. LICENSOR is and shall at all times be the sole entity that may authorize you or any third party to use certification marks, trademarks or other special designations to indicate compliance with any LICENSOR standards or specifications.
This Agreement is governed by the laws of the Commonwealth of Massachusetts. The application to this Agreement of the United Nations Convention on Contracts for the International Sale of Goods is hereby expressly excluded. In the event any provision of this Agreement shall be deemed unenforceable, void or invalid, such provision shall be modified so as to make it valid and enforceable, and as so modified the entire Agreement shall remain in full force and effect. No decision, action or inaction by LICENSOR shall be construed to be a waiver of any rights or remedies available to it.
None of the Intellectual Property or underlying information or technology may be downloaded or otherwise exported or reexported in violation of U.S. export laws and regulations. In addition, you are responsible for complying with any local laws in your jurisdiction which may impact your right to import, export or use the Intellectual Property, and you represent that you have complied with any regulations or registration procedures required by applicable law to make this license enforceable.
- 1. Summary
- 2. References
- 3. Terms and definitions
- 4. Abbreviated terms
- 5. Overview
- 6. Aviation Quality Measures
- 7. SDCM Extension
- 8. Quality of Service parameters related to Quality of Data
- Appendix A: Unified Modeling Language (UML) model
- Appendix B: Revision History
- Appendix C: Bibliography
1. Summary
With the proliferation of digital services that provide data for the Aeronautical Domain, a formalized definition of the quality of the data offered is needed. This responds to the needs of the following use cases:
-
Service advertising: In this case, a service makes known to a potential client the quality of the data provided by the service. Based on this information, the client can determine whether the service meets its needs (that is, to determine if it is fit-for-purpose).
-
Service validation: In this case, assurance is given that the quality of the data provided by a service is consistent with the quality that is explicitly defined in a service requirement or any kind of agreement that may exist between a service provider and the clients.
In practical terms, users will approach a catalog of aviation services, such as the National Airspace System (NAS) Service Registry and Repository (NSRR) provided by the United States Federal Aviation Administration (FAA) (https://nsrr.faa.gov/), and will look for and compare information about the quality of the aviation data offered. The NSRR uses the Service Description Conceptual Model (SDCM), which provides a Service Description for aviation services (in other words, metadata about the aviation service) in a similar way to what GetCapabilities does for OGC services and ISO 19119 (now included in ISO 19115-1) does for geospatial services. Currently, none of the above-mentioned service descriptions provides direct information about the data quality offered by the service. This document is one of a set of three Engineering Reports (ERs) that study some of the technical possibilities for including data quality information in aviation service descriptions. In particular:
-
OGC 17-032 (Testbed-13 Abstract Data Quality Engineering Report) provides a taxonomy and a model for the fundamental concepts covered by the internationally agreed rules and regulations related to data quality in terms of accuracy, resolution and integrity (or equivalent assurance level), traceability, timeliness, completeness, and format. It maps these concepts to the equivalent concepts of the International Organization for Standardization (ISO) Technical Committee (TC) 211 for consistency with the geospatial domain.
-
OGC 17-018 (Testbed-13 Data Quality Specification Engineering Report) provides methods to quantify the quality concepts defined in OGC 17-032 and a way to include the quantifications in service descriptions. It extends QualityML quality metrics (which already include ISO 19157) into the aviation domain. It lists a set of quantitative and conformance measurements that are specified in terms of quality measures, domains, and metrics (value types and units) and are appropriate for each quality type and data type. It also extends the SDCM to be able to encode and include the above-mentioned quality information for each service in an interoperable way.
-
OGC 17-025 (Testbed-13 Quality Assessment Service) provides a description of a service that is able to connect to other services and infer the quality of their data. To do that, it reads the data that the external service contains, applies a set of rules and procedures to determine the quality of that data, and documents it in the service description metadata. The rules and procedures to apply may differ considerably from one data type to another. The service procedures are based on the measures, domains and metrics defined in OGC 17-018 and might require comparison with data that is considered ground truth, statistical analysis of repetitive measurements (e.g. weather forecast ensembles) or consistency checks. In the end, the results will be added to the data quality section of the service description following the SDCM model.
1.1. Requirements
Upon successful completion of the abstract model in OGC 17-032, this ER develops a Data Quality Assessment Specification. This specification defines a set of data quality parameters as well as the methods and units of measure employed for measuring these parameters. This specification is information domain neutral, i.e., it specifies data quality characteristics and methods that can be applied to all aviation information domains: weather, flight, and aeronautical. This document also includes:
-
An extension mechanism that allows the abstract model to address domain-specific requirements.
-
A mechanism for augmenting the SDCM with classes/concepts for describing a service’s data quality. This includes taxonomies that capture defined parameters, methods of measurement, and units of measure.
-
Discussions of the relationships between Quality of Service (QoS) parameters already defined in the SDCM and data quality parameters proposed in this document.
1.2. Key Findings and Prior-After Comparison
Currently QualityML and ISO 19157 have abundant information on common quality measures that can be applied to the aviation domain. We will analyze how these measures adapt to aviation dataset(s) and will include new ones when needed.
Service Description for aviation services (SDCM), OGC ServiceMetadata response to GetCapabilities and ISO 19119 (now included in ISO 19115-1) describe several characteristics of the services and the data they provide but none of them directly include information about data quality (some data quality information can be indirectly found by getting access to the metadata describing the data in the service). This engineering report describes a possible way to do it in SDCM that is new and can open the possibility to include data quality in other service metadata standards improving the process of finding data that is fit for purpose.
1.3. What does this ER mean for the Working Group and OGC in general
The Aviation Domain Working Group (DWG) and the Data Quality DWG should be interested in this work for different reasons. For the Aviation DWG, it represents a way to complete the SDCM data model. For the Data Quality DWG, it brings the perspective of the aviation domain and contributes to increasing the list of relevant quality measures known by the community.
1.4. Document contributor contact points
All questions regarding this document should be directed to the editor or the contributors:
Name | Organization |
---|---|
Alaitz Zabala | UAB-CREAF |
Joan Maso | UAB-CREAF |
1.5. Future Work
The work in this document can impact OWS Common by promoting the adoption of data quality descriptions in service metadata. It can also impact the future evolution of SDCM. Future editions of the Testbed can experiment with implementations of the proposed approach in aviation services and catalogues.
1.6. Foreword
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. The Open Geospatial Consortium shall not be held responsible for identifying any or all such patent rights.
Recipients of this document are requested to submit, with their comments, notification of any relevant patent claims or other intellectual property rights of which they may be aware that might be infringed by any implementation of the standard set forth in this document, and to provide supporting documentation.
2. References
The following normative documents are referenced in this document.
-
OGC 17-032, OGC® Testbed-13 Abstract Data Quality ER
-
ISO 19115-1:2014, Geographic information - Metadata - Part 1: Fundamentals
-
ISO 19157:2013, Geographic information - Data quality
-
ISO/TS 19115-3:2016, Geographic information - Metadata - Part 3: XML schema implementation for fundamental concepts
-
AIXM, Aeronautical Information Exchange Model
-
FIXM, Flight Information Exchange Model
-
WXXM, Weather Information Exchange Model
-
QualityML v1.0, Quality Indicators Dictionary and Markup Language
-
FAA SWIM Governance team, SWIM Controlled Vocabulary (v.1.1)
3. Terms and definitions
For the purposes of this report, the definitions specified in Clause 4 of the OWS Common Implementation Standard OGC 06-121r9 shall apply. In addition, the following terms and definitions apply.
3.1. accuracy
closeness of agreement between a test result or measurement result and the true value. [SOURCE: ISO 3534-2:2006, 3.3.1]
a degree of conformance between the estimated or measured value and true value. [SOURCE: ICAO Annex 15]
3.3. integrity
a degree of assurance that an aeronautical data and its value has not been lost or altered since the data origination or authorized amendment. [SOURCE: ICAO Annex 15]
3.4. data quality basic measure
generic data quality (4.21) measure used as a basis for the creation of specific data quality measures. [SOURCE: ISO 19157:2013, 4.7]
3.7. lineage
provenance, source(s) and production process(es) used in producing a resource [SOURCE: ISO 19115-1:2014, 4.9]
3.8. precision
The smallest difference that can be reliably distinguished by a measurement process. [SOURCE: ICAO Annex 15]
3.9. provenance
organization or individual that created, accumulated, maintained and used records [SOURCE: ISO 5127:2001, 4.1.1.10]
3.10. provider
supplier, organization that provides a product or a service [SOURCE: ISO 9000:2015, 3.2.5]
3.11. quality
degree to which a set of inherent characteristics fulfills requirements. [SOURCE: ISO 9000:2015, 3.6.2]
a degree or level of confidence that the data provided meets the requirements of the data user in terms of accuracy, resolution and integrity. [SOURCE: ICAO Annex 15]
3.12. quality of service
A parameter that specifies and measures the value of a provided service. [SOURCE: SWIM Controlled Vocabulary (v.1.1), #quality-of-service]
3.13. resolution
A number of units or digits to which a measured or calculated value is expressed and used. [SOURCE: ICAO Annex 15]
3.14. service
capability which a service provider entity makes available to a service user entity at the interface between those entities [SOURCE: ISO 19101:2002, 4.11]
distinct part of the functionality that is provided by an entity through interfaces [SOURCE: ISO 19119:2005, 4.1]
A mechanism to enable access to one or more capabilities, where the access is provided using a prescribed interface and is exercised consistent with constraints and policies as specified by the service description. [SOURCE: SWIM Controlled Vocabulary (v.1.1), #service]
4. Abbreviated terms
-
AIM: Aeronautical Information Management
-
AIS: Aeronautical Information Services
-
AIXM: Aeronautical Information Exchange Model
-
AQM: Abstract Quality Model
-
FIXM: Flight Information Exchange Model
-
ICAO: International Civil Aviation Organization
-
ISO: International Organization for Standardization
-
IWXXM: ICAO Meteorological Information Exchange Model
-
NAS: National Airspace System
-
NSRR: NAS Service Registry/Repository
-
O&M: Observations and Measurements
-
QoD: Quality of Data
-
QoS: Quality of Service
-
QualityML: Quality Indicators Dictionary and Markup Language
-
SDCM: Service Description Conceptual Model
-
SWIM: System Wide Information Management
-
UML: Unified Modeling Language
-
UoM: Unit of Measurement
-
WSDD: Web Service Description Document
-
WSQM: Web Service Quality Model
-
WXXM: Weather Information Exchange Model
5. Overview
The scenario of rapidly growing geodata catalogues requires tools focused on facilitating the users' choice of services and datasets. Quality of service in the context of System Wide Information Management (SWIM) has two major use cases: service advertising and service validation. To support both use cases, the quality of the data provided by the service needs to be available so that users can determine whether it meets their needs, or whether the data provided by a service is consistent with the service requirements. Thus, having populated quality fields in metadata using an unambiguous definition of the data quality concept and a set of measurable parameters is "a must" for QoS. Moreover, this would lead to a Data Quality Assessment Service (DQAS) that will evaluate the quality of data based on a set of criteria. In addition, having clear data quality concepts and a set of measurable parameters allows other components (such as visualization, discovery, or comparison tools) to be quality-aware and interoperable.
This ER is related to the "FA001: Abstract Quality Model Engineering Report" that develops a conceptual model for data quality in the context of Service-Oriented Architecture (SOA) services in general and OGC-compliant services in particular. It is based on Service Description Conceptual Model (SDCM), ISO 19157, and QualityML to improve quality description in the metadata.
The ER developed in this activity will be based on the previous one, and will develop a Data Quality Assessment Specification. It will define a set of data quality parameters dealing with completeness, logical consistency, positional accuracy, temporal accuracy and thematic accuracy. Moreover, this specification will review the common aviation information domains (weather, flight, and aeronautical), and will define the quality they require.
In particular, this document addresses:
-
Definition of data and quality measures: The spatial resolution of the data is commonly confused with its spatial accuracy. The spatial resolution is related to the pixel size chosen to encode the data in a raster format, while the spatial accuracy refers to the deviation of the geographic position of the pixel from its real ground position. In many cases the two are related, but they are not the same. This deliverable discusses how to encode both in a clear way. The same happens with the temporal extent and the temporal accuracy. The temporal extent indicates the interval of time (hours, dates, etc.) covered by the data in the image, while the temporal accuracy refers to the uncertainty in the individual time measurement. All these aspects will be recorded in the ER.
-
Define a set of data quality parameters as well as the methods and units of measure employed for measuring these parameters. This description will be domain neutral, but will include an extension mechanism to address domain-specific requirements.
-
The use of standard vocabularies and taxonomies to describe data quality is mandatory in a QoS paradigm. GeoViQua, a 3-year project of the 7th Framework Programme of the European Commission coordinated by UAB-CREAF, worked on different aspects of data quality and data visualization. One of the outcomes of the project was the QualityML vocabulary. This vocabulary is an extension of UncertML (version 1 of that community standard is an OGC discussion paper). It provides a common solution for all quality indicators described in ISO 19157 and proposes a clear encoding in XML metadata documents (see www.quality.org). QualityML was reviewed and extended in Testbed 12 DG003: Imagery Quality and Accuracy ER to address the needs of imagery and to meet the A3C quality framework[2]. The FA001: Abstract Quality Model Engineering Report and FA002: Data Quality Assessment Specification Engineering Report activities are suitable opportunities to apply or extend the QualityML vocabulary in order to describe new quality concepts and parameters (or adapt the existing ones) needed in the QoS framework. This links to the taxonomies requirement in the OGC Testbed 13 Call for Participation (CFP): capturing defined parameters, methods of measurement and units of measure.
-
Extension of the SDCM to be able to encode and include the above-mentioned quality information for each service in an interoperable way.
-
Discussions of the relationships between Quality of Service (QoS) parameters already defined in the SDCM and data quality parameters proposed in the specification.
6. Aviation Quality Measures
6.1. Introduction
An extension mechanism that allows the abstract model to address domain-specific requirements will be developed with regard to the following:
-
Definition of data and quality measures
-
Define a set of data quality parameters
-
The use of standard vocabularies and taxonomies
In other words: "OGC 17-018 (Data Quality Specification Engineering Report) provides methods to quantify the quality concepts defined in OGC 17-032 and a way to include the quantifications in service descriptions. It extends QualityML quality metrics (which already include ISO 19157) into the aviation domain. It lists a set of quantitative and conformance measurements that are specified in terms of quality measures, domains, and metrics (value types and units) and are appropriate for each quality type and data type."
6.2. Levels of granularity in quality measures
There are several levels of granularity of quality measures that will be covered in the next sub-sections, namely:
-
Feature instance level
-
Dataset level
-
Service level
6.2.1. Feature instance level quality measures
Usually, feature instance level metadata, and even attribute instance level metadata, are supported by adding fragments of metadata types to the other attributes in the data model of the feature types.
The exploration of the Aeronautical Information Exchange Model (AIXM) 5.1 reveals that there are some quality measures that have already been considered for some feature types.
Most feature types in AIXM are derived from basic geometric primitives or from their elevated versions. The next figure shows, as an example, one feature type for each elevated geometric primitive.
In addition to the horizontal and vertical accuracy, specific feature types for aviation objects can carry other quality elements for aspects other than positions. For example, Accuracy is included as an attribute of:
-
AirportHeliport|Airport/Heliport|AirportHeliport: fieldElevationAccuracy + magneticVariationAccuracy
-
Runway|NavaidEquipmentDistance: distanceAccuracy
-
Runway|Runway: lengthAccuracy + widthAccuracy
-
Runway|RunwayDeclaredDistanceValue: distanceAccuracy
-
Runway|RunwayDirection: trueBearingAccuracy + elevationTDZAccuracy
-
Geometry|Point & Geometry|Curve & Geometry|Surface: horizontalAccuracy
-
Geometry|ElevatedPoint & Geometry|ElevatedCurve & Geometry|ElevatedSurface: verticalAccuracy
-
Navaids Points|Navaids|Azimuth: trueBearingAccuracy
-
Navaids Points|Navaids|Elevation: angleAccuracy
-
Navaids Points|Navaids|GlidePath: angleAccuracy + rdhAccuracy
-
Navaids Points|Navaids|Localizer: magneticVariationAccuracy + trueBearingAccuracy + widthCourseAccuracy
-
Navaids Points|Navaids|NavaidEquipment: magneticVariationAccuracy
-
Obstacle|VerticalStructurePart: verticalExtentAccuracy
-
Surveillance|PrecisionApproachRadar: slopeAccuracy
-
Surveillance|RadarEquipment: rangeAccuracy + magneticVariationAccuracy
There are also some other feature elements that can be somehow related to quality, for example the CodeIntegrityLevelILSBaseType, defined as "A coded value indicating the quality which relates to the trust which can be placed in the correctness of the information supplied by the ILS facility", is included in Navaids Points|Navaids|Navaid.
On the other hand, the Meteorological Community Exchange model (METCE), within the Weather Information Exchange Model (WXXM), addresses resolution in Procedure|MeasurementContext: resolutionScale, defined as "The attribute 'resolutionScale' specifies the smallest change (e.g. the 'resolution') in property value of the 'measurand' that is intended to be measured within this procedure, using the unit of measure 'uom'. It shall be provided as a scaling factor, e.g. scale = -2 implies a precision of 100 units."
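The scaling-factor convention in the quoted definition can be illustrated with a small sketch. It assumes that the resolution is a power of ten, 10^(-scale), which is inferred from the single example in the definition (scale = -2 gives 100 units); the function name `resolution_from_scale` is hypothetical and not part of METCE or WXXM.

```python
def resolution_from_scale(scale: int) -> float:
    """Return the smallest measurable change, expressed in the 'uom' unit.

    Assumption (inferred from the METCE example only): the resolution is
    10 raised to the negated scale, so scale = -2 gives 100 units.
    """
    return 10.0 ** (-scale)


# The example given in the definition: scale = -2 implies 100 units.
print(resolution_from_scale(-2))  # 100.0
# Under the same assumption, scale = 0 corresponds to one unit.
print(resolution_from_scale(0))   # 1.0
```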
In the models, there are currently no indications about other data quality facets, such as timeliness.
6.3. Data models used in the aviation services
Service classifications depend on the aviation data standards they use:
-
Aeronautical Information Exchange Model (AIXM) based: describes the AIS (e.g. the infrastructures, air spaces, etc.) and its temporal modifications, which are published through a Notice To Airmen (NOTAM). It is the most static one. It is distributed in “packs” and the quality measures should be at the feature and dataset level.
-
Flight Information Exchange Model (FIXM) based: describes the flight and flow information of aircraft. It is used in navigation and flight. It is distributed in real time and each piece of information is generated by different agents along the flight route. Data quality measures should be at the feature level, as different providers may have different quality in their data. Overall quality measures can also be generated.
-
Weather Information Exchange Model (WXXM) based: describes the current weather and its forecast. It is produced by weather centers that provide their own data quality measures associated with products.
6.4. Data quality measures that are appropriate for the data models used in aviation
As noted above, AIXM describes the AIS (e.g. the infrastructures, air spaces, etc.) and its temporal modifications (NOTAMs). AIXM is therefore the most static of the above-listed data models. It is distributed in “packs” and the quality measures should be at the dataset level.
6.4.1. Accuracy and Precision: positional and thematic
Positional accuracy in the aviation context is concerned with the recorded position of each feature compared to its actual position. Note that this also includes a measurement of precision, a separate measure for the aviation domain.
Within AIXM models, feature positional descriptions use 2D elements (Point, Curve and Surface) or their elevated versions (ElevatedPoint, ElevatedCurve and ElevatedSurface). All of them have horizontalAccuracy and the latter also have verticalAccuracy. Thus, elevation accuracy is decoupled from horizontal accuracy, so that pure 3D quality measures do not apply to this model.
All types of accuracy are described as quality elements in this quality of data model. The quality category is used to describe (as an enumeration) the different quality elements described in ISO, such as positional accuracy, quantitative attribute accuracy and so on, as well as the new elements described in this document, such as positional precision. The list of category values can be seen in the SDCM extension model explanation, in the next section.
Feature level horizontalAccuracy (for points, curves and surfaces) is defined in AIXM models as "The difference between the recorded horizontal coordinates of a feature and its true position referenced to the same geodetic datum expressed as a circular error at 95 percent probability".
Model element | Content |
---|---|
Quality category | positional accuracy |
Quality scope | dataset |
Measure name | http://qualityml.geoviqua.org/1.0/measure/CircularMapAccuracy |
Measure domain | |
Measure metrics | |
Measure parameter name | level |
Measure parameter value | 0.95 |
Quantity value | 0.2 |
Quantity unit of measure | m |
Origin | ISO 19157 Id. 45 |
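As an illustration only, the model elements listed in the table above can be held in a simple record structure. The sketch below is not the SDCM extension itself; the class name `QualityElement` and its field names are hypothetical and were chosen to mirror the table rows.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class QualityElement:
    """Hypothetical container mirroring the 'Model element' rows."""
    category: str                           # Quality category
    scope: str                              # Quality scope
    measure_name: Optional[str] = None      # Measure name (URI)
    measure_domain: Optional[str] = None    # Measure domain (URI)
    measure_metrics: Optional[str] = None   # Measure metrics (URI)
    parameters: Dict[str, float] = field(default_factory=dict)
    value: Optional[float] = None           # Quantity value
    unit: Optional[str] = None              # Quantity unit of measure
    origin: Optional[str] = None            # Origin

    def is_quantitative(self) -> bool:
        """True when both a quantity value and its unit are present."""
        return self.value is not None and self.unit is not None


# The horizontal accuracy example from the table above.
horizontal = QualityElement(
    category="positional accuracy",
    scope="dataset",
    measure_name="http://qualityml.geoviqua.org/1.0/measure/CircularMapAccuracy",
    parameters={"level": 0.95},
    value=0.2,
    unit="m",
    origin="ISO 19157 Id. 45",
)
print(horizontal.is_quantitative())  # True
```

Keeping the measure parameters in a name-to-value mapping lets the same structure carry measures with one or several parameters.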
Feature level verticalAccuracy is defined as "The difference between the recorded elevation of a feature and its true elevation referenced to the same vertical datum expressed as a linear error at 95 percent probability".
Model element | Content |
---|---|
Quality category | vertical accuracy |
Quality scope | dataset |
Measure name | http://qualityml.geoviqua.org/1.0/measure/QuantitativeAttributeCorrectness |
Measure domain | |
Measure metrics | http://www.qualityml.org/1.0/metrics/Half-lengthConfidenceInterval |
Measure parameter name | level |
Measure parameter value | 0.95 |
Quantity value | 1 |
Quantity unit of measure | m |
Origin | ISO 19157 Id. 71 |
There are two main kinds of features within the models: those representing actual phenomena on the Earth (e.g. a tower), and those representing human conventions (e.g. the delimitation of areas tagged as forbidden).
For the first group, a validation campaign (sampling on the ground) may be carried out to check the correspondence between the dataset position of these elements and their real position. These differences may be aggregated to compute a Circular Map Accuracy quality measure for the whole dataset. For example, an AeronauticalGroundLight (that is used for markingTheSiteOf within AirportHeliport) has an elevatedPoint position that may include its 95% circular error (CE95) in the horizontalAccuracy element and an elevationAccuracy that may include its linear error at 95% (LE95). Several CE95 and LE95 values (for several AeronauticalGroundLight features) will be used to compute the dataset CE95 and LE95 values, respectively.
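The text does not prescribe how the differences are aggregated. One possible approach, shown here as a minimal sketch, is to take the empirical 95th percentile of the errors observed at the ground check points. The nearest-rank percentile rule and the function names are assumptions made for illustration, not the method defined by ISO 19157 or AIXM.

```python
import math
from typing import Sequence, Tuple

Point2D = Tuple[float, float]


def percentile_nearest_rank(values: Sequence[float], level: float) -> float:
    """Empirical percentile using the nearest-rank rule (an assumption)."""
    if not values:
        raise ValueError("at least one value is required")
    if not 0.0 < level <= 1.0:
        raise ValueError("level must be in (0, 1]")
    ordered = sorted(values)
    # The small tolerance protects against floating-point noise in level * n.
    rank = max(1, math.ceil(level * len(ordered) - 1e-9))
    return ordered[rank - 1]


def dataset_ce95(pairs: Sequence[Tuple[Point2D, Point2D]]) -> float:
    """Horizontal error not exceeded by 95% of (recorded, true) positions."""
    radial = [math.hypot(xr - xt, yr - yt) for (xr, yr), (xt, yt) in pairs]
    return percentile_nearest_rank(radial, 0.95)


def dataset_le95(pairs: Sequence[Tuple[float, float]]) -> float:
    """Vertical error not exceeded by 95% of (recorded, true) elevations."""
    return percentile_nearest_rank([abs(zr - zt) for zr, zt in pairs], 0.95)


# Hypothetical check points: (recorded, true) positions in metres.
horizontal_checks = [((10.1, 5.0), (10.0, 5.0)), ((3.0, 4.2), (3.0, 4.0))]
vertical_checks = [(101.5, 101.0), (87.2, 88.0)]
print(dataset_ce95(horizontal_checks), dataset_le95(vertical_checks))
```

With only a handful of check points, the 95th percentile under this rule is simply the largest observed error, so a meaningful dataset value needs a reasonably large sample.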
Sometimes it is not possible to provide a numerical value for a quality category because the quantitative uncertainty of the instrument used to make the measurement is not available. Nevertheless, most instruments have a known order-of-magnitude precision, so it is useful to provide the instrument name as an indication of this precision. For example, the position of an aircraft can be measured using different on-board instruments or estimation methods, which can be named in the positional precision element. Typical examples are: GPS, ADS-B (meters), ADSBLostCoverageEstimation or TimeSpeedDistanceEstimation (100 km or more).
Model element | Content |
---|---|
Quality category | positional precision |
Quality scope | FlightObject/Flight/EnRoute/Position/AircraftPosition/position |
Description | TimeSpeedDistanceEstimation |
Origin | OGC 17-032 |
A descriptive result for a data quality element can also be used to describe the spatial distribution of data quality. For example, it could be useful to state that the horizontal accuracy of the elements next to or inside an airport is known with more certainty than that of the obstacles around it. This is the typical use of a descriptive result as defined in ISO 19157.
Model element | Content |
---|---|
Quality category | positional accuracy |
Quality scope | dataset |
Description | The relative positional accuracy has a higher value for features far from airports |
Origin | TestBed 13 |
Thematic accuracy is defined as the accuracy of attributes and depends on the attribute type. In ISO it consists of three data quality elements: classification correctness, non-quantitative attribute correctness and quantitative attribute accuracy.
Regarding quantitative attributes, the AIXM models contain several relevant variables covering angles (magnetic variations, bearing, slope) and sizes (height, width) that are difficult to combine into a single meaningful overall quality measurement. We suggest that thematic accuracy is only provided for the dataset when it covers a coherent set of variables, such as magnetic variations. The attribute that is being assessed should be defined in the scope of the quality element (or in the scope of the whole quality of data section if needed).
Model element | Content |
---|---|
Quality category | quantitative attribute accuracy |
Quality scope | Navaids Points/Navaids/Azimuth/trueBearingAccuracy |
Measure name | http://qualityml.geoviqua.org/1.0/measure/QuantitativeAttributeCorrectness |
Measure domain | |
Measure metrics | http://www.qualityml.org/1.0/metrics/Half-lengthConfidenceInterval |
Measure parameter name | level |
Measure parameter value | 0.95 |
Quantity value | 2 |
Quantity unit of measure | degree |
Origin | ISO 19157 Id. 71 |
On the other hand, to cover classification correctness, the usual approach uses misclassification matrices to indicate thematic accuracy. An example is the misclassification matrix for the attribute type in Routes/En-route/Route/type within the AIXM Route feature:
Model element | Content |
---|---|
Quality category | classification correctness |
Quality scope | Routes/En-route/Route/type |
Measure name | http://qualityml.geoviqua.org/1.0/measure/Misclassification/ |
Measure domain | http://www.qualityml.org/1.0/domain/predictedValues |
Measure metrics | |
Measure parameter name | max |
Measure parameter value | 100 |
Measure parameter name | actualCategories |
Measure parameter value | ATS NAT OTHER |
Measure parameter name | predictedCategories |
Measure parameter value | ATS NAT OTHER |
Quantity value | 87 1 12 |
Quantity unit of measure | percentage |
Origin | ISO 19157 Id. 61 |
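The following Python sketch illustrates how a misclassification matrix expressed as percentages could be derived from paired actual and predicted categories. Normalizing each row by the number of items of the actual category, the function name and the `rate_max` parameter (mirroring the "max" parameter above) are assumptions for illustration; the sample data are hypothetical and were chosen to reproduce one row of values like the one shown in the table.

```python
from collections import Counter


def misclassification_matrix(actual, predicted, categories, rate_max=100):
    """Return {actual_category: {predicted_category: rate}}.

    Assumption: each row is normalized by the number of items whose
    actual category is the row category; rate_max=100 yields percentages.
    """
    counts = Counter(zip(actual, predicted))
    matrix = {}
    for actual_category in categories:
        row_total = sum(counts[(actual_category, p)] for p in categories)
        matrix[actual_category] = {
            p: (rate_max * counts[(actual_category, p)] / row_total
                if row_total else 0.0)
            for p in categories
        }
    return matrix


# Hypothetical validation sample: 100 routes whose actual type is ATS
# and 10 routes whose actual type is NAT.
categories = ["ATS", "NAT", "OTHER"]
actual = ["ATS"] * 100 + ["NAT"] * 10
predicted = ["ATS"] * 87 + ["NAT"] * 1 + ["OTHER"] * 12 + ["NAT"] * 10
matrix = misclassification_matrix(actual, predicted, categories)
```

In this sketch, a category with no actual items yields a row of zeros rather than an error, which is a design choice and not something stated in QualityML.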
Moreover, as recognized in OGC 17-032, in addition to the usual misclassification indication, an indication of unclassified items may be helpful. This can be described using the NodataAreas metrics in QualityML, understanding that the percentage of unclassified items is similar to the percentage of unclassified area in a classification image:
Model element | Content |
---|---|
Quality category | completeness omission |
Quality scope | Routes/En-route/Route/type |
Measure name | |
Measure metrics | |
Measure parameter name | rate max |
Measure parameter value | 100 |
Quantity value | 15 |
Quantity unit of measure | percentage |
Origin | A3C - OGC 17-032 |
6.4.2. Resolution
The resolution class contains information about the resolution of a dataset including its positional (or spatial), vertical, temporal and attribute resolutions. ISO describes spatial resolution through the MD_Resolution class and temporal resolution through the TM_Duration class.
The SDCM quality of data model describes a general Quantity class that can be used to describe the value and units of measure of any of these resolutions in a general way, and that is specified for each of the four resolution types. Positional resolution also has an equivalentScale element in order to be able to describe the most important elements in ISO (distance, angularDistance and levelOfDetail can be described within positional resolution using the value and the needed units). For temporal resolution, the value and units are enough to describe the elements contained in ISO TM_Duration.
Model element | Content |
---|---|
scope | can be described if the resolution should be applied only to certain features |
Positional resolution value | 1.5 |
Positional resolution unit of measure | m |
Origin | OGC 17-018 |
Model element | Content |
---|---|
scope | AirportHeliport/Surface Contamination/SurfaceContamination/observationTime |
Temporal resolution value | 10 |
Temporal resolution unit of measure | s |
Origin | OGC 17-018 |
Regarding attribute resolution, the element resolutionScale in Procedure|MeasurementContext from the Meteorological Community Exchange model METCE (within WXXM) is another indication of resolution, and thus should be "translated" to the resolution quality element within the SDCM extension model. This is a direct translation, as resolutionScale is a scaling factor and the resolution value can be computed using the formula:
resolution = 10^(-resolutionScale)
For example, for a feature with a resolutionScale of -2, an attribute resolution with a value of 100, expressed in the units of measure of the attribute, should be defined.
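The conversion above can be sketched in a few lines of Python. The function name is hypothetical; only the formula comes from the text.

```python
def attribute_resolution(resolution_scale: int) -> float:
    """Convert a METCE resolutionScale into a resolution value.

    Implements resolution = 10^(-resolutionScale); the result is expressed
    in the units of measure of the attribute itself.
    """
    return 10.0 ** (-resolution_scale)


# The example from the text: a resolutionScale of -2 gives a resolution of 100.
example_resolution = attribute_resolution(-2)
```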
6.4.3. Traceability
Traceability describes the ability to trace the history, application or location of that which is under consideration. The closest match within ISO standards is the LI_Lineage element. Information on the originator/person responsible should be included.
A first approach to the lineage of a dataset is to give it as a statement. This is a simple way to describe traceability, but it is not the recommended one, as it does not easily state the sources used and the process steps related to the dataset history. ISO describes a fully flexible system in which several options are possible:
- sources can be described including the process steps to generate each source
- process steps of the dataset can be described including the sources for each process step
- a mixed situation can be defined
Even though the power of the ISO schema should be recognized, allowing a hierarchical definition of process steps and sources creates a very flexible situation for each dataset of each service, which would probably lead to a complex situation with hardly comparable datasets within a service or among services. Moreover, the originator of the data entity is also recognized in OGC 17-032 as relevant for traceability.
Thus, the proposal for the SDCM extension described in this ER is to recommend a flat structure for sources, process steps and originator in the traceability section. To this end, the current SDCM elements are reviewed in order to decide whether they can be used to describe these elements or whether new elements are needed.
The SDCM Operation element may seem a good candidate to be reused (even if an extension is needed) and mapped to the ISO processStep element. Unfortunately, stepDateTime and processor are missing in the SDCM Operation class, and these are important elements for traceability as they describe who and when. Moreover, several Operation elements make no sense in a process step description. Thus, a new Process Step element is created in SDCM to fulfill the process step requirements, including the process step description, date and time, the organization carrying out the process and an optional reference to the process description. The processor description can be omitted if it is the same as the data originator (or even the service originator). The process step can refer to a process description document, similarly to the data description document of a data entity.
Source descriptions for lineage and traceability can be provided by reusing the "Data Entity" class within the SDCM model, describing the name and description of the data entity and giving a reference document as well. Since the extended SDCM model includes a Quality of Data section that is aggregated to Data Entity, the source description could also include its originator or its spatial resolution through Data quality sub-elements.
As an example, a service can provide a feature collection composed of an airport description, the approach and departure procedures, and obstacles around the airport. This feature collection constitutes a data entity delivered by one service. In the Quality of Data entity, the traceability of this feature collection can be stated as:
Model element | Model sub-element | Content |
---|---|---|
Statement | | The feature collection is created by combining three original datasets, after a projection change is applied to the original datasets |
Data source 1 | name | Airport description |
 | definition | a link to a data documentation may be described here |
 | quality - traceability originator name | it is not defined as it is the same as the service provider |
 | quality - positional resolution scaleDenominator | 5000 |
Data source 2 | name | Approach and departure procedures |
 | description | A series of predetermined maneuvers for the orderly transfer of an aircraft under instrument flight conditions from the beginning of the initial approach to a landing or to a point from which a landing may be made visually |
 | quality - traceability originator name | Aeronautical Information Management Modernization |
 | quality - traceability originator web page | |
 | quality - positional resolution value | 10 |
 | quality - positional resolution units of measure | m |
Data source 3 | name | Obstacles |
 | description | |
 | quality - traceability originator name | Aeronautical Information Management Modernization |
 | quality - traceability originator web page | |
 | quality - positional resolution value | 12 |
 | quality - positional resolution units of measure | m |
Process Step 1 | description | A projection change is applied to each data source |
 | processor name | it is not defined as it is the same as the service provider |
Process Step 2 | description | A fusion procedure is developed to generate a single feature collection with the selected information of each data source |
 | processor name | it is not defined as it is the same as the service provider |
Originator | name | it is not defined as it is the same as the service provider |
Origin | | OGC 17-018 |
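To make the proposed flat structure more concrete, the following Python sketch models the traceability section with plain data classes and fills it with part of the example above. The class and field names are hypothetical and do not correspond to an official SDCM encoding; `None` is used here to express "same as the service provider".

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Originator:
    name: Optional[str] = None       # None: same as the service provider
    web_page: Optional[str] = None


@dataclass
class DataSource:
    name: str
    description: Optional[str] = None
    originator: Optional[Originator] = None
    resolution_value: Optional[float] = None
    resolution_uom: Optional[str] = None
    scale_denominator: Optional[int] = None


@dataclass
class ProcessStep:
    description: str
    date_time: Optional[str] = None
    processor: Optional[Originator] = None  # None: same as the originator
    reference: Optional[str] = None


@dataclass
class Traceability:
    statement: Optional[str] = None
    sources: List[DataSource] = field(default_factory=list)
    process_steps: List[ProcessStep] = field(default_factory=list)
    originator: Optional[Originator] = None


aimm = Originator(name="Aeronautical Information Management Modernization")
trace = Traceability(
    statement="Three original datasets combined after a projection change",
    sources=[
        DataSource("Airport description", scale_denominator=5000),
        DataSource("Approach and departure procedures", originator=aimm,
                   resolution_value=10, resolution_uom="m"),
        DataSource("Obstacles", originator=aimm,
                   resolution_value=12, resolution_uom="m"),
    ],
    process_steps=[
        ProcessStep("A projection change is applied to each data source"),
        ProcessStep("The data sources are merged into one feature collection"),
    ],
)
```

Because sources and process steps are sibling lists rather than nested inside each other, two datasets described this way can be compared element by element, which is the motivation for the flat structure.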
A second example is related to gridded weather products, such as the ones delivered by the CSS-Wx Web Coverage Service (WCS), which enable National Airspace System (NAS) systems to access high-resolution aviation weather data to meet their individual needs and to support NAS operations.
Model element | Model sub-element | Content |
---|---|---|
Statement | | A gridded precipitation (VIL) dataset is created by fusion of weather data received from multiple radar and sensor systems |
Process Step 1 | description | Interpolation of source data using linear method |
 | processor name | Aviation Weather & Aeronautical Services (AJM-33) Weather and Radar Processor |
Originator | name | it is not defined as it is the same as the service provider |
Origin | | OGC 17-018 |
6.4.4. Completeness
Completeness describes the amount of data in a dataset compared with the expected data; it is described using measures of commission and omission. Completeness may be described at a dataset level or at feature level, for example regarding omission and commission errors among categories in a classification.
Several measures are described in ISO 19157 to describe completeness. They mainly use excess, duplicate and omission measures. QualityML groups and extends the list of measures in order to aggregate measures that share the same concept but use a different metric. All the measures related to the same quality measure are grouped and use a metric called items, whose results can be expressed as a boolean, a count or a rate. In fact, ISO 19157 suggests several options, in this case for the rate elements, when it states that "[Error rate / Correct items rate] can either be presented as percentage or as a ratio. The value unit in the quantitative result (see 7.5.4.2) can be used to specify that the result is presented in percentage or as a ratio". To standardize these options for the rate as well as to combine the other two options (boolean and count), QualityML describes the Items metrics as a choice among "indicator" (for boolean), "count" or "rate". For the last one, a parameter is described in order to include the maximum value of the rate. Thus, a value of 100 in this attribute is used to express that the value is a percentage. The default value for this attribute is 1, representing a pure ratio.
Moreover, ISO 19157 usually describes both measures based on errors and measures based on correct items. Both definitions are exactly the same, the only difference being which elements the measure is counting. In QualityML, this is described by the Domain of the quality measure, which allows a higher aggregation schema relating several ISO measures to the same QualityML measure with several metrics and domains:
-
Commission:
-
Omission:
-
measure/FlagAreas: (used to flag elements that are detected as anomalous such as "cloud flag" or "snow flag") metrics/items
-
measure/Misclassification: domain/predictedValues or domain/actualValues + metrics/OmissionError or metrics/FalseNegative
These measures are mainly related to a dataset level, as they describe the differences among the elements described in the dataset and the elements in the universe of discourse, finding those that are missing, duplicated or existing only in the dataset. As an example, for a dataset describing obstacles in an airport, an omission indicator may be described as:
Model element | Content |
---|---|
Quality category | completeness omission |
Quality scope | obstacles dataset |
Measure name | |
Measure metrics | |
Quantity value | 3 |
Quantity unit of measure | obstacles |
Origin | ISO 19157 Id. 6 |
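The three forms of the items metric described in this section (indicator, count and rate with a maximum value) can be sketched as follows. The function and parameter names are hypothetical, and the total of 200 expected obstacles is an invented figure used only to show how the count of 3 omissions from the table would translate into a rate.

```python
def items_metric(error_count, total_count, kind="rate", rate_max=1):
    """Express an item-based quality result as indicator, count or rate.

    rate_max follows the convention described in the text:
    1 (the default) gives a pure ratio, 100 gives a percentage.
    """
    if kind == "indicator":
        return error_count > 0
    if kind == "count":
        return error_count
    if kind == "rate":
        if total_count <= 0:
            raise ValueError("total_count must be positive to compute a rate")
        return rate_max * error_count / total_count
    raise ValueError(f"unknown kind: {kind!r}")


# 3 omitted obstacles out of a hypothetical 200 expected obstacles.
omission_flag = items_metric(3, 200, kind="indicator")
omission_count = items_metric(3, 200, kind="count")
omission_ratio = items_metric(3, 200, kind="rate")
omission_percentage = items_metric(3, 200, kind="rate", rate_max=100)
```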
6.4.5. Temporal Accuracy and Precision
Time is an essential aspect of the aeronautical information world, where change notifications are usually made well in advance of their effective dates. Aeronautical information systems are usually requested to store and to provide both the current situation and the future changes.
In order to satisfy the temporal requirements of aeronautical information systems, AIXM must include an exhaustive temporality model, which enables a precise representation of the states and events of aeronautical features. A general temporal model should be uniformly applied to all aeronautical feature types and the temporality concept should be abstracted from the task of modeling object properties. At the conceptual level, the model should describe the temporal evolution of the features, as they occur in the real world.
The AIXM Temporality Model describes two levels at which aeronautical feature instances are affected by time: 1) Every feature has a start of life and an end of life; and 2) The properties of a feature can change within the lifetime of the feature[3]. It is considered that any feature property may change in time, except for the global unique identifier. This is a key assumption of the AIXM Temporality model.
Within FIXM models, there are several elements describing time, such as runwayTime (in FIXM 3.0.1.Base.Aerodrome.RunwayPositionAndTime.runwayTime) or standTime (in FIXM 3.0.1.Base.Aerodrome.StandPositionAndTime.standTime). FIXM also includes the Base.Time schema that provides representations for time elements.
What the AIXM temporality and FIXM models do not cover is the accuracy of a time measurement, which is fixed to 1 minute because the validTime in the model is a date and time element. The accuracy or the precision of a time measurement could be described for the whole dataset or for several temporal elements described in the AIXM or FIXM models.
When accuracy elements are described within the AIXM and FIXM models, the selected measure is related to a 95% probability (as explained, for example, for the trueBearingAccuracy attribute in AIXM Navaids Points|Navaids|Azimuth). Thus, the recommendation for describing temporal accuracy is to use "accuracy of a time measurement", also with a 95% probability, for example:
Model element | Content |
---|---|
Quality category | accuracy of a time measurement |
Quality scope | dataset |
Measure name | |
Measure domain | |
Measure metrics | http://www.qualityml.org/1.0/metrics/Half-lengthConfidenceInterval |
Measure parameter name | level |
Measure parameter value | 0.95 |
Quantity value | 2.5 |
Quantity unit of measure | min |
Origin | ISO 19157 Id. 57 |
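One possible way to obtain the half-length of a confidence interval at a given level is sketched below. It assumes that the time errors are normally distributed with zero mean, so that the half-length equals the normal quantile multiplied by the root mean square error; this assumption, the function name and the sample errors are illustrative and are not prescribed by this ER or by QualityML.

```python
from statistics import NormalDist


def half_length_confidence_interval(errors, level=0.95):
    """Half-length of the interval expected to contain `level` of the errors.

    Assumption: errors follow a zero-mean normal distribution, so the
    half-length is z * RMSE, with z the two-sided normal quantile.
    """
    errors = list(errors)
    if not errors:
        raise ValueError("at least one error value is required")
    z = NormalDist().inv_cdf(0.5 + level / 2)
    rmse = (sum(e * e for e in errors) / len(errors)) ** 0.5
    return z * rmse


# Hypothetical differences (in minutes) between reported and reference times.
time_errors_min = [1.0, -1.0, 1.0, -1.0]
half_length_min = half_length_confidence_interval(time_errors_min, level=0.95)
```

If the errors are clearly not normal, an empirical percentile of the absolute errors may be a more faithful summary.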
A temporal precision element has also been added as a quality category, as suggested by OGC 17-032.
6.4.6. Timeliness
Timeliness is a representation of the concepts of currency and fitness for purpose and is described including date and time of capture, maintenance date and time and maintenance frequency. Using these elements a dataset can be described in terms of when it was produced, and whether it is valid.
For a dataset that is updated quarterly, for example, capture date, last maintenance and maintenance frequency can be described.
Model element | Content |
---|---|
date and time of capture | 2017-09-12 10:00 |
maintenance date and time | 2017-09-25 10:00 |
maintenance frequency | quarterly |
Origin | OGC 17-018 |
Moreover, for a service, an aggregated measure related to timeliness can be computed if a requirement on timeliness is defined, for example that the data should not be more than a certain number of days old. This could be described using a temporal validity data quality category, using an 'item' metric (a boolean indicator, number of items or rate) and a domain requirement that defines the time limit used to consider an element conformant or not with this time requirement. To describe the conformance or non-conformance domain requirements, the domain can describe a range ("domain min" and/or "domain max" parameters), as needed.
Using the AIXM temporality model, several data quality measures can be computed to describe the temporal validity of the dataset, for example describing whether there are elements that have already ended their lives (and thus do not follow timeliness requirements).
For example, if the feature requirements consider that features older than 15 days are not current enough (so they do not conform to the rule), a quality indicator with a temporal validity category can report the rate of elements that do not conform, with the specific requirement given as a parameter in the domain of the quality measure (3% in this example):
Model element | Content |
---|---|
Quality category | temporal validity |
Quality scope | dataset |
Measure name | |
Measure domain | |
Measure metrics | |
Measure parameter name | domain min |
Measure parameter value | 2017-09-15 ¹ |
Measure parameter name | rate max |
Measure parameter value | 100 |
Quantity value | 3 |
Quantity unit of measure | percentage |
Origin | ISO 19157 Id. 18 |

¹ Features older than 15 days of the time intended for the data (i.e. older than 2017-09-15 if the day this measure is computed is 2017-10-01)
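The rate reported in this example can be sketched as follows. The "domain min" date is passed in directly, as in the table, and features dated before it are counted as non-conformant. The function name and the sample feature dates are hypothetical.

```python
from datetime import date


def non_conformant_rate(feature_dates, domain_min, rate_max=100):
    """Rate of features whose date is earlier than the 'domain min' limit.

    rate_max=100 expresses the result as a percentage.
    """
    dates = list(feature_dates)
    if not dates:
        raise ValueError("at least one feature date is required")
    too_old = sum(1 for d in dates if d < domain_min)
    return rate_max * too_old / len(dates)


# Hypothetical dataset: 3 stale features and 97 current ones.
feature_dates = [date(2017, 9, 1)] * 3 + [date(2017, 9, 20)] * 97
stale_percentage = non_conformant_rate(feature_dates, date(2017, 9, 15))
```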
Within the WXXM model, there are also highly dynamic features that should be properly described in order to be able to assess their timeliness. Inside the model, there is an AIRMET package that reports the occurrence and/or expected occurrence of specified en-route weather phenomena which may affect the safety of aircraft operations, and the development of those phenomena in time and space. These weather phenomena are reported as impacted regions of airspace.
This package contains candidate representations for eventual adoption by ICAO Meteorological Information Exchange Model (IWXXM). Representations are based upon ICAO Annex 3 Amendment 76 / WMO No. 49. These representations should be considered unofficial until incorporated into IWXXM. Each observation/forecast phenomenon includes its own period of validity for described meteorological conditions, which is represented as the O&M Observation validTime. These elements can be similarly used to compute aggregated measures such as percentage of features covering a certain time requirement (like the last example).
6.4.7. Integrity
Integrity describes the degree of assurance that can be given that the dataset has not been altered or lost since creation or update from the required body. There are several commonly used strategies to describe and ensure data integrity (some recognized in ICAO Annex 15 or other standards [5]), and those are included in the Integrity class in the model:
- cyclic redundancy check (CRC) values: electronic aeronautical data sets shall be protected by the inclusion in the data sets of a 32-bit cyclic redundancy check (CRC) implemented by the application dealing with the data sets
- designated level: demonstration of compliance of the quality management system applied shall be by audit. If nonconformity is identified, initiating action to correct its cause shall be determined and taken. All audit observations and remedial actions shall be evidenced and properly documented
- integrity faults: related to the ICAO DIL classification: routine data (10^-3 integrity level), essential data (10^-5 integrity level), critical data (10^-8 integrity level), described in the aeronautical data quality requirements in Appendix 7 of ICAO Annex 15
- signature: as recognized by ISO 19165, not only integrity (such as CRC values) is important but also signature and certification of data. This element allows for the identification of signature information.
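As an illustration of the CRC strategy listed above, the following Python sketch computes and verifies a 32-bit CRC with the standard library. Note the assumption: `zlib.crc32` implements the common CRC-32 polynomial used by zlib and gzip, which is not necessarily the CRC algorithm required for a given aeronautical data set; the helper names are hypothetical.

```python
import zlib


def crc32_hex(payload: bytes) -> str:
    """Return the CRC-32 of the payload as 8 lowercase hex digits."""
    return format(zlib.crc32(payload) & 0xFFFFFFFF, "08x")


def verify_crc32(payload: bytes, expected_hex: str) -> bool:
    """Check that the payload still matches a previously stored CRC value."""
    return crc32_hex(payload) == expected_hex.lower()


# The producer stores the CRC alongside the data set ...
data_set = b"123456789"
stored_crc = crc32_hex(data_set)

# ... and the consumer recomputes it to detect alteration or loss.
unaltered = verify_crc32(data_set, stored_crc)
altered = verify_crc32(data_set + b"0", stored_crc)
```

A CRC detects accidental corruption only; it does not identify who produced the data, which is why the signature element is listed separately.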
Moreover, several quality elements can be described in order to include the audit results as well as other integrity parameters related to logical consistency elements in ISO (and its four types) and temporal consistency:
- Conceptual consistency: adherence to the rules of the conceptual schema. This may include the audit of datasets and whether the requirements on the classification of integrity faults are met.
- Domain consistency: check whether feature attributes have the expected domains, for example: AIXM navaids features may be checked during quality control to see if the Navaids Points.Navaids.Azimuth.trueBearing attribute is a value within the range [0,360].
- Format consistency: degree to which data is stored in accordance with the physical structure of the dataset
- Topological consistency: explicitly encoded topological characteristics of a dataset
- Temporal consistency: within the AIXM temporality model, check whether a specific feature has chronological errors (i.e. the feature/time slice start of life is later than its end of life)
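Two of the checks listed above, domain consistency of trueBearing and temporal consistency of the feature lifetime, can be sketched as simple predicates whose failures are then counted as items. The function names and sample values are hypothetical; treating a missing end of life as valid is an assumption of this sketch.

```python
from datetime import datetime
from typing import Iterable, Optional


def bearing_in_domain(true_bearing: float) -> bool:
    """Domain consistency: trueBearing must lie within [0, 360]."""
    return 0.0 <= true_bearing <= 360.0


def lifetime_is_chronological(start: datetime,
                              end: Optional[datetime]) -> bool:
    """Temporal consistency: start of life must not be later than end of life.

    Assumption: a feature with no end of life (None) is still alive and valid.
    """
    return end is None or start <= end


def count_failures(results: Iterable[bool]) -> int:
    """Number of items that failed a check (usable as a 'count' result)."""
    return sum(1 for ok in results if not ok)


# Hypothetical values extracted from a dataset.
bearings = [12.5, 360.0, 361.0, -3.0]
lifetimes = [
    (datetime(2017, 1, 1), datetime(2017, 6, 1)),
    (datetime(2017, 6, 1), datetime(2017, 1, 1)),  # ends before it starts
    (datetime(2017, 3, 1), None),
]

domain_errors = count_failures(bearing_in_domain(b) for b in bearings)
temporal_errors = count_failures(
    lifetime_is_chronological(s, e) for s, e in lifetimes
)
```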
There are several QualityML measures and metrics that can be used to describe consistency quality parameters, all of them using the items measure metrics, for example:
Quality category | Measure name | Measure domain | Origin |
---|---|---|---|
Conceptual consistency |
ISO 19157: Id. 8 + Id. 9 (boolean), Id. 10 + GeoViQua (count), Id. 12 + Id. 13 (rate) |
||
GeoViQua (boolean, rate), ISO 19157 Id. 11 (count) |
|||
Domain consistency |
ISO 19157 Id. 14 + Id. 15 (boolean), Id. 16 + GeoViQua (count), Id. 17 + Id. 18 (rate) |
||
Format consistency |
GeoViQua (boolean), ISO 19157 Id. 119 (boolean), Id. 19 (count), Id. 20 (rate) |
||
Topological consistency |
GeoViQua (boolean), ISO 19157 Id. 21 (count), Id. 22 (rate) |
||
GeoViQua (boolean, rate), ISO 19157 Id. 23 (count) |
|||
GeoViQua (boolean, rate), ISO 19157 Id. 24 (count) |
|||
GeoViQua (boolean, rate), ISO 19157 Id. 25 (count) |
|||
GeoViQua (boolean, rate), ISO 19157 Id. 26 (count) |
|||
GeoViQua (boolean, rate), ISO 19157 Id. 27 (count) |
6.5. NSRR service exploration
The FAA has a National Airspace System (NAS) Service Registry and Repository (NSRR) that includes metadata describing around 80 services in several life cycle stages (Proposed, Verification, Definition, Deprecated, Production and Development). These services are categorized in several groups under several criteria, for example depending on the ATM service or the SWIM product category, as seen in the next figure:
Metadata for each service includes several sections, one of them describing the Data of the service. The Data section describes the nature, structure, and representation of the data that constitutes the body or payload of the service’s messages, i.e., the data shared between service providers and consumers. The Data section captures bibliographic and location information for relevant data definition documents, including XML Schemas, Data Model Diagrams, or other descriptive documents. Only 37 of the services described in the registry include some information in the Data section. Since the Data section is not always populated, a careful reading of some service descriptions is needed to reveal more details about the data types involved. We also have to take into account that sometimes more than one service provides the same data in different formats, and we only need to consider this data once.
This information provides a list of the different data used by these services and is thus worth checking in order to define quality measures related to most of the data available through the services. The data can be grouped by model as seen in the next table:
code | Name | Model | Data |
---|---|---|---|
Aeronautical Common Services Data Query (ACS-DQ) V 2.0 |
AIXM |
Special User Airspace (SUA), NOTAM |
|
Aeronautical Common Services Geodetic Computation (ACS-GC) |
AIXM |
Magnetic declination |
|
Aeronautical Common Services Web Feature Service (ACS-WFS) v2.0 |
AIXM |
Airports, navaids, obstacles, procedures, and NOTAMs |
|
Get Static Special Activity Airspace (SAA) |
AIXM |
Special User Airspace (SUA) |
|
En Route Airspace Data Publication - v2.0, SWIM Flight Data Publication Service (SFDPS) |
AIXM |
route, sector, altimeter setting, and special activities airspace information |
|
Federal NOTAM System (FNS) Publication |
AIXM |
NOTAM |
|
Aeronautical Feature Data |
AIXM |
||
Aeronautical Information Authoritative Source (AIAS) Data Service Web Feature Service (WFS) |
AIXM |
Obstacle Authoritative Source (OAS) |
|
SWIM Terminal Data Distribution System (STDDS) Airport Data Service (APDS) |
AIXM |
Runway Visual Range (RVR) data: Runway visibility and trend for touchdown, midpoint and rollout. Edge and centerline light intensity settings |
|
En Route Flight Data Publication - v2.0 (SFDPS) |
FIXM |
flight plan, track |
|
STDDS Surface Movement Event (SMES) |
FIXM |
Surface movement events for all aircraft monitored at select towers. track positions for all aircraft and vehicles collected from towers |
|
STDDS Terminal Automation Information Service (TAIS) |
FIXM |
Flight plan data, track data, sign-in / sign-out (SISO) data, alert data, Instrument Meteorological Conditions (IMC) data, traffic count data, and performance monitoring data |
|
STDDS Tower Departure Event (TDES) |
FIXM |
Departure events for all flights |
|
Time Based Flow Metering (TBFM) Publication (TBFM-MIS) |
FIXM |
TBFM Metering Status, TBFM Interface Status, TRACON Name, Gate Name, Arrival Airport Information, Airport Configuration, MRE Information, Arrival Airport Configuration Information, Airport Acceptance Rate Group, Terminal Radar Approach Control (TRACON) Acceptance Rate Group, Meter Point Acceptance Rate Group, Runway Acceptance Rate Group, Super Stream Class Configuration Group, Satellite Airport Configuration Group, Flight Plan Information, Estimated Times of Arrivals (ETA), Scheduled Times of Arrival (STA), Meter Reference Element (MRE) information, and Scheduling information. |
|
TBFM Release Time Coordination Service (TBFM-RTCS) |
FIXM |
Departure release time |
|
Terminal Flight Data Manager (TFDM) Airport and Flight Information Service (AFIS) |
FIXM |
Airport’s configuration, demand, delay, other airport information, and flight specific data and delay information. |
|
Traffic Flow Management (TFM) Data |
FIXM |
Route information, entry/exit data for certain Traffic Management Initiatives (TMIs), Route Availability Planning Tool (RAPT) timeline data, National Traffic Management Log (NTML) restrictions |
|
TFM Data R13 |
FIXM |
Route information, entry/exit data, Route Availability Planning Tool (RAPT) timeline data, Traffic restrictions |
|
Corridor Integrated Weather System (CIWS) Data Distribution Service (CDDS) Web Coverage Service (WCS) Gridded Weather Products (CIWS WCS) |
WXXM |
Gridded - current Continental United States (CONUS) Vertically Integrated Liquid (VIL) Dataset, Forecast CONUS VIL Dataset, Current Echo Tops Dataset, Forecast Echo Tops Dataset and Current CONUS Satellite Dataset |
|
CDDS Web Feature Service (WFS) NonGridded Weather Products (CIWS WFS) |
WXXM |
Non- Gridded - Growth and Decay Trends, Storm Information – Echo Tops Tags, Storm Information – Leading Edges, Storm Information – Motion Vectors, Forecast Standard-Mode VIL Contours, Forecast Winter-Mode VIL Contours, Forecast Echo Tops Contours, Echo Tops Forecast Accuracy Scores, Standard-Mode VIL Forecast Accuracy Scores and Winter-Mode VIL Forecast Accuracy Scores. |
|
Common Support Services Weather (CSS-Wx) Web Coverage Service (WCS) |
WXXM |
Gridded weather products: Precipitation, Precipitation with Mask, Precipitation Forecast, Precipitation (VIL) Forecast with Mask, Echo Tops, Echo Tops Forecast, Precipitation (Base Reflectivity), Precipitation (Composite Reflectivity), Precipitation (Composite Reflectivity) with Mask, Surface Precipitation Phase, Surface Precipitation Phase Forecast, Precipitation (ASR), Precipitation (ASR AP Flagged), Icing Tops, Icing Tops Forecast, Icing Bottoms, Icing Bottoms Forecast, Icing Layer, Composite Icing, Icing Layer Forecast, Composite Icing Forecast, Turbulence Layer, Turbulence Layer Forecast, Composite Turbulence, Composite Turbulence Forecast, Convective Weather Avoidance Fields, Convective Weather Avoidance Field Forecast, Satellite, Terminal Winds, NOAA Model Data (Rapid Refresh -RAP-, High-Resolution Rapid Refresh -HRRR-, Global Forecast System -GFS-) |
|
CSS-Wx Web Feature Service (WFS) |
WXXM |
Meteorological Terminal Aviation Routine Weather Report (METAR) and Terminal Aerodrome Forecast (TAF), Precipitation (VIL) Forecast Accuracy, Precipitation (VIL) Forecast Contours, Echo Tops Forecast Accuracy, Echo Tops Forecast Contours, Lightning, Airport Lightning Warning, Storm Information Hazard Text, Storm Information Leading Edges, Storm Information Motion Vectors, Fronts Forecast, Growth Trends, Decay Trends, Forecast Confidence, Convective Weather Avoidance Polygons, Wind Profiles, Tornado Detections, Jet Stream (WP2), Winds Aloft Forecast, Microburst, Gust Front, Gust Front Estimated Time to Impact, Tornado Alert, Tornado Warnings, Tornado Watches, Configured Alerts, Wind Shear ATIS Timers – Microburst, Wind Shear ATIS Timers – Wind Shear, Terminal Weather Graphics, Terminal Weather Text, Icing Layer Contours, Composite Icing Contours, Turbulence Layer Contours, Composite Turbulence Contours, Pilot Report (PIREP), ICAO Aircraft Report, Urgent Pilot Report (PIREP), Significant Meteorological Information (SIGMET), Convective Significant Meteorological Information (Convective SIGMET), Airmen’s Meteorological Information Advisories (AIRMET), Surface Weather Observations, Airport Status Summary, Aviation Watch Notification, Severe Thunderstorm Warnings, Severe Thunderstorm Watches, Volcanic Ash Advisory Statement (VAAS), Terminal Area Forecast (TAF), Area Forecasts, Center Weather Advisories, Meteorological Impact Statements |
|
Weather and Radar Processor (WARP) Publication |
WXXM |
Weather information and radar product |
|
WARP Vendor Weather Data Publication |
WXXM |
Weather information and radar product |
|
National Weather Service (NWS) Weather Information Network Server (WINS). WCS Gridded data products |
WXXM |
Rapid Refresh (RAP), Global Forecast System (GFS), North American Mesoscale (NAM) (CONUS, Alaska and Puerto Rico domains), National Convective Weather Diagnostic (NCWD) and Current Icing Product (CIP). All of the data products are in netCDF-4 format and use Climate and Forecast (CF) metadata conventions. |
|
NWS WINS WFS Non Gridded products |
WXXM 1.1 |
National Convective Weather Forecast (NCWF), Airmen’s Meteorological Information (AIRMETs), Significant Meteorological Information (SIGMETs), Meteorological Aviation Reports (METARs) and Meteorological Data Collection and Reporting System (MDCRS) |
|
Acknowledge Weather Report, Weather Message Switching Center Replacement System (WMSCR) |
WXXM 1.1 |
NOTAM, Pilot Reports (PIREPS), altimeter setting |
|
Publish Altimeter Setting (WMSCR) |
WXXM 1.1 |
altimeter setting |
|
Publish PIREP (WMSCR) |
WXXM 1.1 |
PIREP |
|
WMSCR Report Retrieval (WMSCR) |
WXXM 1.1 |
PIREPs or Altimeter Setting |
7. SDCM Extension
The second version of the Service Description Conceptual Model (SDCM v2.0) is extended in this ER to include new elements covering the quality information for each service in an interoperable way.
In its profile diagram, SDCM describes a Quality of Service (QoS) class where the requirements of the services can be included and added to the QoS class as a new property "parameter type" related to the taxonomy of parameters describing QoS (e.g., timeliness). This is shown in figure 3 of SDCM v2.0.
Moreover, in the model diagram, SDCM has a Data class where the quality of the data or basic metadata of the data may also be included. This is shown in figure 4 of SDCM v2.0.
7.1. Extension options
The SDCM v2.0 model needs to be extended to include quality of the data. This can be done in several ways, as described in the next sub-sections.
7.1.1. First option: quality of service for quality of data
The Quality of Service class could be applied directly to Quality of Data. This approach is the simplest one: the model is only slightly modified, and no new classes or class modifications are needed. On the other hand, this approach may be too simple to contain all the elements, or to provide enough semantic information about the quality measures, needed to describe aviation services in general, and specifically those registered services in NASS that include a Quality Element.
7.1.2. Second option: new elements inspired by ISO 19115 and 19157 concepts
In this case, the SDCM extensions are more substantial. The idea is to modify SDCM following ISO 19115 and 19157 concepts while keeping as much of SDCM's simplicity as possible. Instead of directly adopting the ISO 19115 and 19157 classes, a reinterpretation (usually simplifying the ISO classes) is suggested.
A new "Quality of Data" class is created that includes several child elements:
- quality element
- positional, vertical, temporal and attribute resolution
- traceability
- timeliness
- integrity
The new "Quality of Data" element can be attached at one of two levels of granularity: "dataset" or "service". Details are described in the next sub-sections.
Dataset level
In this case, the suggestion is to extend the model diagram (figure 4 of SDCM v2.0) to attach the quality to each data payload returned by each operation. The benefit is that quality is provided at the dataset level, permitting a separate quality report for each data type that a service can potentially provide. The approach also has drawbacks. First, if the same data type is provided by more than one operation, the data quality description will be repeated for each instance. Second, different service instances can serve the same data using different operation structures with different names and payload sets, which makes quality comparison among services more challenging.
Service level
In this case, the suggestion is to extend the profile diagram (figure 3 of SDCM v2.0) to attach an overall quality report that summarizes the characteristics of all datasets in the service. This has the advantage of making the data quality more prominent in the model (at the same level as the Quality of Service), and allows the easy comparison of the data quality of several services.
7.1.3. Third option: full MD_Metadata record
OGC 17-032, OGC® Testbed-13 Abstract Data Quality ER, explained how to describe aviation concepts using ISO elements. With this approach, an XML file containing the full ISO metadata record for a dataset (or service) may be provided.
The SDCM v2.0 "Data Definition" element of the "Data Entity" class could be used to link to the full MD_Metadata record.
The drawback of this option is that ISO standards are quite complex: several mandatory information elements that would be required are potentially not relevant, and the relevant information is harder to find than in a simpler model, which would also be more easily comparable.
7.2. Model description
7.2.1. Scope
Within the new Quality of Data class, an element "scope" is defined. This element describes the scope of the elements to which this quality information applies.
Typically, "dataset" level is the proper value if this quality report is describing the whole dataset, but sometimes an attribute or feature type may be specified if this quality report only applies to them, for example "FlightObject/Flight/EnRoute/Position/AircraftPosition/position".
7.2.2. Quality element
The first element inside Quality of Data is Quality element. This element is intended to describe quality reports about the dataset, feature or attribute. It is inspired by ISO 19157 DQ_Element.
The element contains a quality category, which should be selected from the enumeration of the same name. Values for this enumeration come from the ISO 19157 DQ_Element subclasses, as well as from the proposals in OGC 17-032 and in this ER:
- positional accuracy
- positional precision
- vertical accuracy
- quantitative attribute accuracy
- quantitative attribute precision
- classification correctness
- accuracy of a time measurement
- temporal validity
- completeness commission
- completeness omission
- topological consistency
- domain consistency
- conceptual consistency
- format consistency
- temporal consistency
The second element is quality scope. It only needs to be populated if its value differs from the general "scope" element (described in the previous section). This allows, for example, a single "Quality of Data" section for the whole dataset to include "Quality elements" that apply to the whole dataset (for which "quality scope" is not defined) alongside others that apply to specific features or attributes (for which "quality scope" is described).
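The fallback between "quality scope" and the general "scope" can be sketched in Python as follows. This is only an illustrative sketch: the class, attribute and function names (`QualityOfData`, `QualityElement`, `effective_scope`) are assumptions made for this example and are not defined by SDCM.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class QualityElement:
    quality_category: str
    # Hypothetical attribute: only populated when it differs from the general scope.
    quality_scope: Optional[str] = None


@dataclass
class QualityOfData:
    scope: str
    quality_elements: List[QualityElement] = field(default_factory=list)


def effective_scope(qod: QualityOfData, element: QualityElement) -> str:
    """Return the element's own scope, or fall back to the general scope."""
    if element.quality_scope is not None:
        return element.quality_scope
    return qod.scope


qod = QualityOfData(
    scope="dataset",
    quality_elements=[
        QualityElement("completeness omission"),
        QualityElement(
            "positional accuracy",
            quality_scope="FlightObject/Flight/EnRoute/Position/AircraftPosition/position",
        ),
    ],
)

for element in qod.quality_elements:
    print(element.quality_category, "->", effective_scope(qod, element))
```

With this arrangement a single Quality of Data section can mix dataset-wide and attribute-specific quality elements without repeating the general scope on every element.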
The result of the quality element can be described in one of three ways:
- descriptive result: a textual description. It can be used, for example, to describe the order of magnitude of the precision of the instrument used for positional accuracies, such as GPS or ADS-B (meters), or ADSBLostCoverageEstimation or TimeSpeedDistanceEstimation (100 km or more). It can also be used, as is typical in ISO 19157, to describe the spatial distribution of data quality, for example to explain that the horizontal accuracy of the elements next to or inside an airport is defined with more certainty than that of the obstacles around it.
- conformance result: describes whether a certain specification is met. The result of the measure is described in the conformance pass element, the specification is described in the conformance description element, and a formal reference to the specification document can also be provided.
- quantitative result: a numerical value.
When a quantitative result is provided, several elements should be described:
- measure name: identification of the measure. When possible, a link to an external and recognized description should be provided. This ER recommends using links to QualityML concepts, which include the ISO 19157, GeoViQua and A3C (after the Testbed-12 efforts) proposals. For example: Circular map accuracy in QualityML or Value domain in QualityML.
- measure domain: the selected measure can be applied to several value domains, for example Differential errors 2D in QualityML, Differential errors 1D in QualityML or Non conformance domain in QualityML.
- measure metrics: a specific metric (numerical computation) may be used, for example Circular error in QualityML or items in QualityML.
- measure parameter name: sometimes, to fully describe the domain or the metrics, or to set up a requirement on the domain, some parameters are needed. For example, a level of confidence is needed when a differential error or a half-confidence interval is used. Another example is the maximum value that must be provided when an items metric is used with its rate option. These parameters are described in the domain or metrics description in QualityML.
- measure parameter value: the value of the previous parameter, for example 0.95 for level of confidence or 100 to indicate that an items rate is a percentage.
- quantity value: the value for the quantitative quality element.
- quantity unit of measure: the unit of measure for the value.
More information on QualityML use to describe measure name, domain, metrics and values (including parameters when needed) can be found in OGC 16-050 Imagery Quality and Accuracy ER.
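As an illustration only, a quantitative result could be held in a simple record such as the following Python sketch. The class and field names are assumptions that mirror the elements of a quantitative result; the measure, domain and metrics are given as plain labels rather than as QualityML links, their combination is not checked against QualityML, and the value of 5.0 m is invented for the example.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class QuantitativeResult:
    measure_name: str
    measure_domain: str
    measure_metrics: str
    quantity_value: float
    quantity_unit_of_measure: str
    # Parameter name -> parameter value, e.g. {"level of confidence": 0.95}.
    measure_parameters: Dict[str, float] = field(default_factory=dict)

    def summary(self) -> str:
        """Render the result as a short human-readable string."""
        params = ", ".join(f"{k}={v}" for k, v in self.measure_parameters.items())
        suffix = f" ({params})" if params else ""
        return (
            f"{self.measure_name}: {self.quantity_value} "
            f"{self.quantity_unit_of_measure}{suffix}"
        )


# Hypothetical example values, for illustration only.
result = QuantitativeResult(
    measure_name="Circular map accuracy",
    measure_domain="Differential errors 2D",
    measure_metrics="Circular error",
    quantity_value=5.0,
    quantity_unit_of_measure="m",
    measure_parameters={"level of confidence": 0.95},
)
print(result.summary())
```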
7.2.3. Positional, vertical, temporal and attribute resolution
The resolution classes contain information about the resolution of a dataset including its positional (or spatial), vertical, temporal and attribute resolutions.
ISO describes spatial resolution through the MD_Resolution class and temporal resolution through the TM_Duration class.
All of them are derived from a general Quantity element, and thus include a value and a unit of measure. This is enough for vertical, temporal and attribute resolutions. For positional resolution, an equivalentScale element is also added so that resolution can be expressed as an equivalent scale, as the ISO standards allow. Note that the other options described by ISO, such as distance, angularDistance and levelOfDetail, can be described using the general Quantity class with the proper units.
7.2.4. Traceability
The traceability class includes a first textual element, statement, that describes in a general way the lineage of the element it refers to (usually a dataset). Moreover, the traceability class has three elements that can be described:
- data source
- process step
- originator
Several data sources can be described as being part of the dataset creation. The data source class is derived from the Data Entity class, and thus it may have, if necessary, its own quality of data section to describe its traceability or other quality elements.
Several process steps can be described, covering the algorithms or processes that have been applied to obtain the dataset. A description and a date-time of execution need to be provided for each process step. Moreover, the organization carrying out the process (processor) may be described. It can be omitted if it is the same as the data originator (or even the service originator). A process description document can also be referenced.
Finally, the originator of the dataset may also be described as a part of the lineage. It can be omitted if it is the same as the service provider.
Both the originator and processor classes are derived from the SDCM Organization class and thus can be described with a name, description and website, and at least one point of contact.
7.2.5. Timeliness
The timeliness class in the SDCM extension is intended to describe the date and time of the information, the date and time of its maintenance, and its expected maintenance frequency. Possible values for the last element are those in the ISO MD_MaintenanceFrequencyCode codelist.
Moreover, and particularly important for dataset and service comparison, timeliness is related to the quality elements categorized as time validity.
7.2.6. Integrity
The integrity class includes several elements to certify the integrity of the data and the data authorship, following ICAO and other widely used rules:
- crc value
- designated level
- integrity faults (routine, essential or critical levels)
Moreover, and particularly important for dataset and service comparison, integrity is related to the quality elements categorized under the consistency categories, i.e. conceptual consistency, domain consistency, format consistency, topological consistency and temporal consistency.
8. Quality of Service parameters related to Quality of Data
8.1. Introduction
SDCM, in its current version, contains a Quality of Service (QoS) class that allows service providers to describe quality parameters of their services within the Web Service Description Documents (WSDD). Nevertheless, the SDCM document does not provide a list of such parameters; it gives only a couple of example names (Table 3.13 1 Quality of Service Attributes: Examples include: capacity, response time, etc.).
On the other hand, the document Preparation of Web Service Description Documents (FAA-STD-065A, 2013) gives a list of example QoS parameters in its Appendix D [4]. This appendix recognizes that WSDD developers may reuse these parameters or provide their own, as well as their own values or ranges of values.
8.2. FAA-STD-065A parameters
The QoS parameters listed in FAA-STD-065A are:
Name | Definition | Method | Unit of Measure | Value or Range of Values |
---|---|---|---|---|
Accuracy | Number of errors produced by the service over a period of time. | Simple count. Measurements are taken daily and apply to the preceding 24-hour period | Whole positive number | 250 |
Availability | Probability that the service is present or ready for immediate use | 100 * ((24 – Total Outage Time in Hours) / 24). Measurements are taken daily and apply to the preceding 24-hour period. | Percentage, accurate to 3 decimal places | Greater than or equal to 99.900% |
Capacity | Number of service requests that the service can accommodate within a given time period | Simple count | Whole positive number, per period of time | 25 per minute |
Mean Time Between Critical Failure (MTBCF) | The average time between hardware or software component failures that result in the loss of the service | The sum of the individual times between critical failures divided by the number of critical failures | Hours | Greater than or equal to 3000 |
Mean Time Between Failure (MTBF) | Average time between hardware or software component failures that do not result in the loss of the service | The sum of the individual times between noncritical failures divided by the number of noncritical failures | Hours | Greater than or equal to 5000 |
Mean Time To Restore (MTTR) | Average time required to localize a component failure, remove and replace the failed component, and to perform tests to confirm operational readiness of the component | The sum of the individual times to repair divided by the number of repairs | Hours | Less than or equal to 0.5 |
Response Time | Maximum time required to complete a service request | Measured from the time the provider agent receives the request to the time the service provider transmits the response | Seconds | 10 |
None of these parameters relates to the quality of the data provided by the service. The next section proposes new QoS parameters as examples of such a relation.
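The Availability method in the FAA-STD-065A table above can be worked through with a short Python sketch. The function names and the sample outage durations are assumptions made for this example; only the formula, the 3-decimal rounding and the 99.900% example threshold come from the table.

```python
def availability_percent(total_outage_hours: float) -> float:
    """Availability for the preceding 24-hour period, rounded to 3 decimals."""
    if not 0 <= total_outage_hours <= 24:
        raise ValueError("total outage time must be between 0 and 24 hours")
    return round(100 * ((24 - total_outage_hours) / 24), 3)


def meets_example_threshold(availability: float, threshold: float = 99.9) -> bool:
    """Compare against the example value 'greater than or equal to 99.900%'."""
    return availability >= threshold


# Hypothetical daily outage durations, in hours.
for outage in (0.0, 0.024, 0.5):
    value = availability_percent(outage)
    print(f"outage={outage} h -> availability={value:.3f}% "
          f"meets threshold: {meets_example_threshold(value)}")
```

Under this formula, a total outage of about 0.024 hours (roughly 86 seconds) in a day is the largest that still reaches the 99.900% example value.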
8.3. New parameters
These QoS parameters are described using the same structure as in the FAA-STD-065A document, which in fact describes the elements of the QoS class in SDCM. The only element not provided is a reference value to compare against, because each parameter is intended to be used as a quantitative value that establishes an order when comparing services. Moreover, the specific values that make a service suitable or unsuitable depend on the application and will be defined by the user (as each application has certain data requirements). Two groups of parameters are proposed in the following sub-sections. They are complementary, as the first one helps to better understand and compare the values of the second one across different services.
8.3.1. Describing completeness about the quality of the data documentation
The first set of quality parameters refers to the completeness of the server and dataset documentation, taking into account how many datasets of the service provide a specific data quality element.
A similar approach was previously used in the GeoViQua project for the GeoLabel, a graphic representation (i.e., a static image) that is generated for each dataset based on the quality information available for that dataset. This idea evolved and was later used as a voluntary label within the GEO Data branding strategy, to give visibility to the effort data providers put into making their processes conformant with the Data Management Principles (DMP) of the intergovernmental Group on Earth Observations (GEO).
Along the same lines, NOAA defined and uses the Completeness Rubric in addition to ISO compliance, to provide an extra level of assessment that helps metadata authors provide more thorough information about their data [12].
The following tables describe the proposed Quality of Service parameters that are based on Quality of Data parameters and describe the completeness of the data quality documentation. Several parameters describe the completeness of different aspects of the SDCM data quality extension; one table is presented below for each aspect.
Name | Definition | Method | Related QoD information | Unit of Measure |
---|---|---|---|---|
Quality Category Completeness | Percentage of datasets provided by this service including some quality elements describing a certain quality category¹ | The number of datasets² including any kind of quality elements describing a certain quality category divided by the number of datasets provided by the service, and multiplied by 100 | Quality elements with quality category equal to the selected quality category | percentage |
Quantitative Quality Category Completeness | Percentage of datasets provided by this service including a quantitative quality element describing a certain quality category¹ | The number of datasets² including quantitative quality elements describing a certain quality category divided by the number of datasets provided by the service, and multiplied by 100 | Quality elements with quality category equal to the selected quality category and a quantitative result | percentage |
¹ Possible quality categories are: positional accuracy, positional precision, vertical accuracy, quantitative attribute accuracy, quantitative attribute precision, classification correctness, accuracy of a time measurement, temporal validity, completeness commission, completeness omission, topological consistency, domain consistency, conceptual consistency, format consistency, temporal consistency.
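The two completeness parameters for a quality category can be computed as in the following Python sketch. The dataset structure (a list of dictionaries with `quality_elements`, each having a `category` and a `result_type`), the function name and the sample data are assumptions made for this example, not part of SDCM.

```python
def quality_category_completeness(datasets, category, quantitative_only=False):
    """Percentage of datasets having a quality element of the given category.

    With quantitative_only=True, only elements with a quantitative result count.
    """
    if not datasets:
        raise ValueError("the service provides no datasets")
    matching = sum(
        1
        for ds in datasets
        if any(
            el["category"] == category
            and (not quantitative_only or el["result_type"] == "quantitative")
            for el in ds["quality_elements"]
        )
    )
    return 100 * matching / len(datasets)


# Hypothetical service with four datasets.
datasets = [
    {"name": "A", "quality_elements": [
        {"category": "positional accuracy", "result_type": "quantitative"}]},
    {"name": "B", "quality_elements": [
        {"category": "positional accuracy", "result_type": "descriptive"}]},
    {"name": "C", "quality_elements": [
        {"category": "positional accuracy", "result_type": "conformance"},
        {"category": "temporal validity", "result_type": "quantitative"}]},
    {"name": "D", "quality_elements": []},
]

any_kind = quality_category_completeness(datasets, "positional accuracy")
quantitative = quality_category_completeness(
    datasets, "positional accuracy", quantitative_only=True
)
print(f"Quality Category Completeness: {any_kind}%")
print(f"Quantitative Quality Category Completeness: {quantitative}%")
```

In this sample, three of the four datasets document positional accuracy but only one does so quantitatively, so the quantitative parameter is never larger than the general one.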
Name | Definition | Method | Related QoD information | Unit of Measure |
---|---|---|---|---|
Resolution Completeness | Percentage of datasets provided by this service including some resolution information | The number of datasets including any kind of resolution information divided by the number of datasets provided by the service, and multiplied by 100 | Positional Resolution, Vertical Resolution, Temporal Resolution and/or Attribute Resolution | percentage |
Certain Resolution Completeness | Percentage of datasets provided by this service including some resolution information describing a certain kind of resolution | The number of datasets including a certain kind of resolution information divided by the number of datasets provided by the service, and multiplied by 100 | Positional Resolution, Vertical Resolution, Temporal Resolution or Attribute Resolution | percentage |
Quantitative Positional Resolution Completeness | Percentage of datasets provided by this service including quantitative positional resolution information | The number of datasets including quantitative positional resolution information divided by the number of datasets provided by the service, and multiplied by 100 | Positional resolution > value | percentage¹ |
Equivalent Scale Positional Resolution Completeness | Percentage of datasets provided by this service including equivalent scale positional resolution information | The number of datasets including equivalent scale positional resolution information divided by the number of datasets provided by the service, and multiplied by 100 | Positional resolution > equivalentScale | percentage¹ |
¹ Note that these two percentages may sum to more than 100%, as some datasets can have both quantitative and equivalent scale positional resolution described.
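The footnote on the two positional resolution percentages can be checked with a small Python sketch. The dataset structure, key names (`value`, `equivalent_scale`) and sample figures are assumptions made for this example.

```python
def positional_resolution_completeness(datasets, key):
    """Percentage of datasets whose positional resolution defines the given key."""
    if not datasets:
        raise ValueError("the service provides no datasets")
    count = sum(
        1
        for ds in datasets
        if ds.get("positional_resolution")
        and ds["positional_resolution"].get(key) is not None
    )
    return 100 * count / len(datasets)


# Hypothetical service: A and B describe both forms, C only a value, D none.
datasets = [
    {"name": "A", "positional_resolution": {"value": 30.0, "unit": "m", "equivalent_scale": 50000}},
    {"name": "B", "positional_resolution": {"value": 10.0, "unit": "m", "equivalent_scale": 25000}},
    {"name": "C", "positional_resolution": {"value": 100.0, "unit": "m"}},
    {"name": "D", "positional_resolution": None},
]

quantitative = positional_resolution_completeness(datasets, "value")
equivalent_scale = positional_resolution_completeness(datasets, "equivalent_scale")
print(quantitative, equivalent_scale, quantitative + equivalent_scale)
```

Because datasets A and B are counted in both parameters, the two percentages add up to more than 100% even though one dataset has no positional resolution at all.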
Name | Definition | Method | Related QoD information | Unit of Measure |
---|---|---|---|---|
Traceability Completeness | Percentage of datasets provided by this service including some traceability information | The number of datasets including any kind of traceability information divided by the number of datasets provided by the service, and multiplied by 100 | Traceability elements | percentage |
Traceability | Percentage of datasets provided by this service including traceability data sources information | The number of datasets including traceability data sources information divided by the number of datasets provided by the service, and multiplied by 100 | Traceability > Data Source | percentage |
Traceability | Percentage of datasets provided by this service including traceability process steps information | The number of datasets including traceability process steps information divided by the number of datasets provided by the service, and multiplied by 100 | Traceability > Process Steps | percentage |
Traceability Originator Completeness | An originator not being described in a traceability element does not mean that it is unknown; in this case it is assumed to be the service provider. Thus, it does not make much sense to include a completeness indicator for this element. | | | |
Name | Definition | Method | Related QoD information | Unit of Measure |
---|---|---|---|---|
Timeliness Completeness | Percentage of datasets provided by this service including some timeliness information | The number of datasets including any kind of timeliness information divided by the number of datasets provided by the service, and multiplied by 100 | Timeliness elements | percentage |
Date and Time of Capture Completeness | Percentage of datasets provided by this service including date and time of capture information | The number of datasets including date and time of capture information divided by the number of datasets provided by the service, and multiplied by 100 | Timeliness > date and time of capture | percentage |
Maintenance Date and Time Completeness | Percentage of datasets provided by this service including maintenance date and time information | The number of datasets including maintenance date and time information divided by the number of datasets provided by the service, and multiplied by 100 | Timeliness > maintenance date and time | percentage |
Maintenance frequency Completeness | Percentage of datasets provided by this service including maintenance frequency information | The number of datasets including maintenance frequency divided by the number of datasets provided by the service, and multiplied by 100 | Timeliness > maintenance frequency | percentage |
Time validity Completeness | Percentage of datasets provided by this service including some quality elements describing time validity quality category | The number of datasets including any kind of quality element describing time validity quality category divided by the number of datasets provided by the service, and multiplied by 100 | Quality Element with quality category equal to time validity | percentage |
Quantitative Time validity Completeness | Percentage of datasets provided by this service including quantitative quality elements describing time validity quality category | The number of datasets including quantitative time validity quality category description divided by the number of datasets provided by the service, and multiplied by 100 | Quality elements with quality category equal to time validity and a quantitative result | percentage |
Name | Definition | Method | Related QoD information | Unit of Measure |
---|---|---|---|---|
Integrity Completeness | Percentage of datasets provided by this service including some integrity information | The number of datasets including any kind of integrity information divided by the number of datasets provided by the service, and multiplied by 100 | Integrity elements | percentage |
Crc value Completeness | Percentage of datasets provided by this service including crc value information | The number of datasets including crc value information divided by the number of datasets provided by the service, and multiplied by 100 | Integrity > crc value | percentage |
Designated level Completeness | Percentage of datasets provided by this service including designated level information | The number of datasets including designated level information divided by the number of datasets provided by the service, and multiplied by 100 | Integrity > designated level | percentage |
Integrity faults Completeness | Percentage of datasets provided by this service including integrity faults information | The number of datasets including integrity faults information divided by the number of datasets provided by the service, and multiplied by 100 | Integrity > integrity faults | percentage |
Signature Completeness | Percentage of datasets provided by this service including signature information | The number of datasets including signature information divided by the number of datasets provided by the service, and multiplied by 100 | Integrity > signature | percentage |
Consistency Completeness | Percentage of datasets provided by this service including some quality elements describing any kind of consistency quality category | The number of datasets including any kind of quality element describing any kind of consistency quality category divided by the number of datasets provided by the service, and multiplied by 100 | Quality Element with quality category equal to topological, domain, conceptual, format or temporal consistency | percentage |
Quantitative Consistency Completeness | Percentage of datasets provided by this service including quantitative quality elements describing any kind of consistency quality category | The number of datasets including a quantitative quality element describing any kind of consistency quality category divided by the number of datasets provided by the service, and multiplied by 100 | Quality elements with quality category equal to topological, domain, conceptual, format or temporal consistency and a quantitative result | percentage |
Topological consistency Completeness | Percentage of datasets provided by this service including some quality elements describing topological consistency quality category | The number of datasets including any kind of quality element describing topological consistency quality category divided by the number of datasets provided by the service, and multiplied by 100 | Quality Element with quality category equal to topological consistency | percentage |
Quantitative Topological consistency Completeness | Percentage of datasets provided by this service including quantitative quality elements describing topological consistency quality category | The number of datasets including quantitative topological consistency quality category description divided by the number of datasets provided by the service, and multiplied by 100 | Quality elements with quality category equal to topological consistency and a quantitative result | percentage |
Domain consistency Completeness | Percentage of datasets provided by this service including some quality elements describing domain consistency quality category | The number of datasets including any kind of quality element describing domain consistency quality category divided by the number of datasets provided by the service, and multiplied by 100 | Quality Element with quality category equal to domain consistency | percentage |
Quantitative Domain consistency Completeness | Percentage of datasets provided by this service including quantitative quality elements describing domain consistency quality category | The number of datasets including quantitative domain consistency quality category description divided by the number of datasets provided by the service, and multiplied by 100 | Quality elements with quality category equal to domain consistency and a quantitative result | percentage |
Conceptual consistency Completeness | Percentage of datasets provided by this service including some quality elements describing conceptual consistency quality category | The number of datasets including any kind of quality element describing conceptual consistency quality category divided by the number of datasets provided by the service, and multiplied by 100 | Quality Element with quality category equal to conceptual consistency | percentage |
Quantitative Conceptual consistency Completeness | Percentage of datasets provided by this service including quantitative quality elements describing conceptual consistency quality category | The number of datasets including quantitative conceptual consistency quality category description divided by the number of datasets provided by the service, and multiplied by 100 | Quality elements with quality category equal to conceptual consistency and a quantitative result | percentage |
Format consistency Completeness | Percentage of datasets provided by this service including some quality elements describing format consistency quality category | The number of datasets including any kind of quality element describing format consistency quality category divided by the number of datasets provided by the service, and multiplied by 100 | Quality Element with quality category equal to format consistency | percentage |
Quantitative Format consistency Completeness | Percentage of datasets provided by this service including quantitative quality elements describing format consistency quality category | The number of datasets including quantitative format consistency quality category description divided by the number of datasets provided by the service, and multiplied by 100 | Quality elements with quality category equal to format consistency and a quantitative result | percentage |
Temporal consistency Completeness | Percentage of datasets provided by this service including some quality elements describing temporal consistency quality category | The number of datasets including any kind of quality element describing temporal consistency quality category divided by the number of datasets provided by the service, and multiplied by 100 | Quality Element with quality category equal to temporal consistency | percentage |
Quantitative Temporal consistency Completeness | Percentage of datasets provided by this service including quantitative quality elements describing temporal consistency quality category | The number of datasets including quantitative temporal consistency quality category description divided by the number of datasets provided by the service, and multiplied by 100 | Quality elements with quality category equal to temporal consistency and a quantitative result | percentage |
Of course, a high value for these indicators is not per se an indication that the datasets served by a server are better than others: the presence of documentation says nothing about the documented value (for example a higher or lower attribute accuracy) or about its usefulness (a process step description that says "unknown", or an unintelligible character string identifying a process in a certain software, may not be very informative). These indicators need to be complemented with quantitative parameters (next section) when possible.
8.3.2. Quantitatively describing service and its datasets
The second set of quality parameters are those that quantitatively describe services and their datasets. Several measures can be computed from the dataset values for a certain quality parameter, in order to give some insight into how services compare regarding the quality of their datasets.
For many of the parameters, a set of summary statistics (or metrics) can be provided, each offering a different view of the measure. The most common summary value is the average (or arithmetic mean), but the count of values is also meaningful, as it indicates how many elements the average represents. The median is also of interest, as it may differ from the average depending on the data distribution. The minimum and maximum values describe the full range of the variable in a certain service. Finally, the standard deviation is a useful measure of dispersion and can be used to obtain confidence intervals centered on the average (if a normal distribution is assumed).
As this set of metrics can be applied to several quality parameters, it is defined once in the next table and then referred to from the table describing the proposed QoS parameters:
Metrics | Method | Linear positional accuracy example |
---|---|---|
Count | Number of datasets that have this information¹ defined | Number of datasets with a linear positional accuracy defined |
Average | The arithmetic mean (typically just the mean) is what is commonly called the average: the sum of values divided by the number of datasets that have this value defined | The sum of individual linear positional accuracies divided by the number of datasets that have this value defined |
Median | The median is the numeric value separating the higher half of a sample (or population) from the lower half. The median of a finite list of numbers can be found by arranging all the observations from lowest to highest value and picking the middle one. If there is an even number of observations, there is no single middle value, so the average of the two middle values is used. The median is also the 0.5 quantile, or 50th percentile. | The median of individual linear positional accuracies |
Minimum | Minimum value among the datasets that have this value defined | The minimum value of individual linear positional accuracies |
Maximum | Maximum value among the datasets that have this value defined | The maximum value of individual linear positional accuracies |
Standard deviation | The standard deviation of a distribution or population is the square root of its variance. The standard deviation is a widely used measure of variability or dispersion, since it is reported in the natural units of the quantity being considered. | The standard deviation of individual linear positional accuracies |
¹ These summary metrics are applied to a specific piece of information, for example to quality elements of a certain type (such as linear positional accuracy in the third column of this table), to traceability elements or to resolution elements. The concrete element summarized in each case is described in the Related QoD information column of the following tables.
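The six summary metrics defined in the table above can be sketched with the Python standard library as follows. The function name `summarize` and the sample values are hypothetical. The population standard deviation (`statistics.pstdev`) is used here as one reading of the definition above; `statistics.stdev` would be the alternative if the datasets are treated as a sample.

```python
import statistics


def summarize(values):
    """Summary metrics over the datasets that have the value defined (not None)."""
    defined = [v for v in values if v is not None]
    if not defined:
        return {"count": 0}
    return {
        "count": len(defined),
        "average": statistics.mean(defined),
        "median": statistics.median(defined),
        "minimum": min(defined),
        "maximum": max(defined),
        # Population standard deviation: square root of the population variance.
        "standard_deviation": statistics.pstdev(defined),
    }


# Hypothetical positional accuracies (in meters) of five datasets in a service;
# None marks a dataset that does not document this quality element.
accuracies_m = [2.0, None, 4.0, 4.0, 10.0]
summary = summarize(accuracies_m)
print(summary["count"], summary["average"], summary["median"])  # 4 5.0 4.0
```

Datasets without the value defined are excluded from every metric, so the count tells the reader how many datasets the other five metrics actually represent.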
The next table describes proposed Quality of Service parameters based on Quality of Data parameters and describing summary metrics about the quality of the data within a service:
Quality Category | Parameter Name | Related QoD information | Measure, domain & metric¹ | Unit of Measure |
---|---|---|---|---|
positional accuracy | Circular Error Positional Accuracy | Quality elements with quality category equal to positional accuracy, a quantitative result and a linear or angular unit of measure: | Measure name: Circular Map Accuracy | meters or degrees² |
vertical accuracy | Half-length Confidence Interval Vertical Accuracy | Quality elements with quality category equal to vertical accuracy, a quantitative result and a linear or angular unit of measure: | Measure name: Quantitative Attribute Correctness | meters or degrees² |
quantitative attribute accuracy | Half-length Confidence Interval Quantitative Attribute Accuracy | Quality elements with quality category equal to quantitative attribute accuracy, a quantitative result, the same Quality Scope³, and units of measure: | Measure name: Quantitative Attribute Correctness | units of measure depending on the measured attribute, described in the Quality Scope |
classification correctness | Confusion Matrix Classification Correctness | Quality elements with quality category equal to classification correctness, a quantitative result, the same Quality Scope³, and units of measure: | Measure name: Misclassification | dimensionless, in percentage or per unit |
accuracy of a time measurement | Half-length Confidence Interval Temporal Accuracy | Quality elements with quality category equal to accuracy of a time measurement, a quantitative result, the same Quality Scope³, and units of measure: | Measure name: Time Accuracy | temporal units of measure² |
temporal validity | Temporal validity non conformance | Quality elements with quality category equal to temporal validity, a quantitative result, the same Quality Scope³, and units of measure: | Measure name: Value Domain | temporal units of measure² |
completeness commission | Excess Items completeness commission | Quality elements with quality category equal to completeness commission, a quantitative result, the same Quality Scope³, and units of measure: | Measure name: Excess | dimensionless, in percentage or per unit |
completeness commission | Duplicate Items completeness commission | Quality elements with quality category equal to completeness commission, a quantitative result, the same Quality Scope³, and units of measure: | Measure name: Duplicate | dimensionless, in percentage or per unit |
completeness omission | Missing completeness omission | Quality elements with quality category equal to completeness omission, a quantitative result, the same Quality Scope³, and units of measure: | Measure name: Missing Items | dimensionless, in percentage or per unit |
completeness omission | Nodata Areas completeness omission | Quality elements with quality category equal to completeness omission, a quantitative result, the same Quality Scope³, and units of measure: | Measure name: Nodata Areas | dimensionless, in percentage or per unit |
conceptual consistency | Conceptual schema compliance | Quality elements with quality category equal to conceptual consistency, a quantitative result, the same Quality Scope³, and units of measure: | Measure name: Conceptual Schema | dimensionless, in percentage or per unit |
conceptual consistency | Conceptual schema non compliance | Quality elements with quality category equal to conceptual consistency, a quantitative result, the same Quality Scope³, and units of measure: | Measure name: Conceptual Schema | dimensionless, in percentage or per unit |
domain consistency | Value Domain compliance | Quality elements with quality category equal to domain consistency, a quantitative result, the same Quality Scope³, and units of measure: | Measure name: Value Domain | dimensionless, in percentage or per unit |
domain consistency | Value Domain non compliance | Quality elements with quality category equal to domain consistency, a quantitative result, the same Quality Scope³, and units of measure: | Measure name: Value Domain | dimensionless, in percentage or per unit |
format consistency | Physical structure conflicts | Quality elements with quality category equal to format consistency, a quantitative result, the same Quality Scope³, and units of measure: | Measure name: Physical Structure Conflicts | dimensionless, in percentage or per unit |
topological consistency | Faulty point-curve connections | Quality elements with quality category equal to topological consistency, a quantitative result, the same Quality Scope³, and units of measure: | Measure name: Faulty Point-curve Connections | dimensionless, in percentage or per unit |
¹ The documentation of different datasets may use different measures and measure metrics. To compute a meaningful value, values should only be summarized over the same measure and measure metric. Moreover, if the metric has parameters, these should also have the same value for the results to be summarized together, for example: Circular error or Half-Length Confidence interval with the same probability (encoded in the "level" parameter of each metric), or Items with the same max parameter (percentage or per unit).
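The restriction in this note can be sketched as a grouping step that precedes any summary: quality elements are only combined when they share the measure, the metric and the metric parameter. The record layout and the function name below are hypothetical; only the "level" parameter and the measure and metric names are taken from this section.

```python
from collections import defaultdict
import statistics

# Hypothetical quality elements collected from the datasets of one service.
elements = [
    {"dataset": "a", "measure": "Circular Map Accuracy",
     "metric": "Circular error", "level": 0.90, "value": 5.0},
    {"dataset": "b", "measure": "Circular Map Accuracy",
     "metric": "Circular error", "level": 0.90, "value": 7.0},
    {"dataset": "c", "measure": "Circular Map Accuracy",
     "metric": "Circular error", "level": 0.95, "value": 9.0},
    {"dataset": "d", "measure": "Circular Map Accuracy",
     "metric": "Half-length confidence interval", "level": 0.95, "value": 3.0},
]


def summarize_by_group(elements):
    """Summarize values only within groups sharing measure, metric and level."""
    groups = defaultdict(list)
    for e in elements:
        key = (e["measure"], e["metric"], e["level"])
        groups[key].append(e["value"])
    return {
        key: {"count": len(values), "average": statistics.mean(values)}
        for key, values in groups.items()
    }


summaries = summarize_by_group(elements)
for key, stats in summaries.items():
    print(key, stats)
```

With this sample, datasets "a" and "b" are averaged together, while "c" (different probability level) and "d" (different metric) each remain in a group of their own.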
Name | Related QoD information | Unit of Measure |
---|---|---|
Certain resolution summary | Certain resolution element with a quantitative result and the same units of measure: Positional Resolution, Vertical Resolution, Temporal Resolution and/or Attribute Resolution > value | certain units of measure¹ |
Equivalent scale positional resolution summary | Positional resolution with an Equivalent scale information | dimensionless |
¹ The documentation of different datasets and resolution types may use different units of measure (UoM). To compute a meaningful value, either values are summarized only over the same UoM or the system applies internal unit transformations.
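The second option in this note (internal unit transformations) could look like the sketch below, which converts positional resolutions to meters before summarizing them. The conversion table, the UoM codes and the function name are assumptions made for illustration. Values in units that cannot be converted (for example degrees, whose ground length depends on location) are set aside rather than mixed in.

```python
import statistics

# Assumed conversion factors from a few linear units to meters.
TO_METERS = {"m": 1.0, "km": 1000.0, "cm": 0.01}


def resolutions_in_meters(resolutions):
    """Split (value, uom) pairs into values converted to meters and skipped pairs."""
    converted = []
    skipped = []
    for value, uom in resolutions:
        factor = TO_METERS.get(uom)
        if factor is None:
            skipped.append((value, uom))  # unknown or non-linear unit
        else:
            converted.append(value * factor)
    return converted, skipped


# Hypothetical positional resolutions documented for four datasets.
resolutions = [(30.0, "m"), (0.25, "km"), (10.0, "m"), (0.5, "deg")]
converted, skipped = resolutions_in_meters(resolutions)

print(converted)                     # [30.0, 250.0, 10.0]
print(statistics.median(converted))  # 30.0
print(skipped)                       # [(0.5, 'deg')]
```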
Name | Related QoD information | Explanation |
---|---|---|
Date and time of capture | Timeliness > date and time of capture | For a service, statistics can be summarized using date and time of capture to describe its minimum value (oldest dataset), maximum value (most current dataset), its average, etc. |
Maintenance date and time | Timeliness > maintenance date and time | For a service, statistics can be summarized using maintenance date and time to describe its minimum value (dataset updated least recently), maximum value (dataset updated most recently), its average, etc. |
Maintenance frequency mode | Timeliness > maintenance frequency | As this is a categorical metadata element, the mode across the datasets of a service can be reported |
Maintenance frequency histogram | Timeliness > maintenance frequency | In addition, a histogram showing how many datasets within a service have each maintenance frequency can be provided |
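The timeliness summaries in the table above can be sketched as follows: minimum, maximum and average for date-time values, and mode and histogram for the categorical maintenance frequency. The sample dates and frequency labels are hypothetical, and averaging through POSIX timestamps is just one possible way to define the average of a set of dates.

```python
from collections import Counter
from datetime import datetime, timezone

# Hypothetical capture dates of the datasets in one service.
capture_times = [
    datetime(2015, 1, 1, tzinfo=timezone.utc),
    datetime(2016, 1, 1, tzinfo=timezone.utc),
    datetime(2017, 1, 1, tzinfo=timezone.utc),
]

oldest = min(capture_times)  # minimum value: oldest dataset
newest = max(capture_times)  # maximum value: most current dataset

# Average date: mean of the POSIX timestamps, converted back to a datetime.
mean_ts = sum(t.timestamp() for t in capture_times) / len(capture_times)
average = datetime.fromtimestamp(mean_ts, tz=timezone.utc)

# Hypothetical maintenance frequencies (a categorical element).
frequencies = ["annually", "monthly", "annually", "daily", "annually"]
histogram = Counter(frequencies)               # datasets per frequency
mode, mode_count = histogram.most_common(1)[0]

print(oldest.date(), newest.date())  # 2015-01-01 2017-01-01
print(mode, mode_count)              # annually 3
print(dict(histogram))
```

An arithmetic average is not meaningful for the categorical element, which is why only the mode and the histogram are computed for the maintenance frequency.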
Appendix A: Unified Modeling Language (UML) model
The Unified Modeling Language (UML) model, including the SDMC extension described in this ER, is provided as a ZIP file in the UML folder of the ER repository on GitHub, and can be found in the OGC® Public Engineering Report Repository. The following figure presents the UML diagram of the SDCM extension for describing Data Quality.
Appendix B: Revision History
Date | Release | Editor | Primary clauses modified | Descriptions |
---|---|---|---|---|
June 30, 2017 | 0.1 | A. Zabala | all | initial version |
September 30, 2017 | 0.7 | A. Zabala | various | final content for most sections |
October 19, 2017 | 0.8 | A. Zabala | various | preparation for publication (section 8: some content is left, main structure and general ideas described) |
October 25, 2017 | 0.9 | A. Zabala | various | Greg Buehler comments included, section 8 improved |
October 31, 2017 | 1.0 | A. Zabala | various | Final version for all sections. Gobe Hobona comments included. |