i. Abstract
GeoSciML is a model of geological features commonly described and portrayed in geological maps, cross sections, geological reports and databases. The model was developed by the IUGS CGI (Commission for the Management and Application of Geoscience Information), and version 4.1 is the first version officially submitted as an OGC standard. This specification describes a logical model and GML/XML encoding rules for the exchange of geological map data, geological time scales, boreholes, and metadata for laboratory analyses. It includes a Lite model, used for simple map-based applications; a basic model, aligned with INSPIRE, for basic data exchange; and an extended model to address more complex scenarios.
The specification also provides patterns, profiles (most notably of Observations and Measurements, ISO 19156), and best practices to deal with common geoscience use cases.
ii. Keywords
The following are keywords to be used by search engines and document catalogues.
ogcdoc, OGC document, geology, geoscience, stratigraphy, borehole, geochemistry, geophysics, rock, fault, contact, fold, fossil, UML, GML, XML.
iii. Preface
The primary goal of this specification is to capture the semantics, schema, and encoding syntax of key elements described and portrayed in geological maps and databases, in order to enable information systems to interoperate with such data.
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. The Open Geospatial Consortium shall not be held responsible for identifying any or all such patent rights.
Recipients of this document are requested to submit, with their comments, notification of any relevant patent claims or other intellectual property rights of which they may be aware that might be infringed by any implementation of the standard set forth in this document, and to provide supporting documentation.
iv. Submitting organizations
The following organizations submitted this Document to the Open Geospatial Consortium (OGC):
- Arizona Geological Survey (AzGS), Arizona, USA
- British Geological Survey (NERC-BGS), UK
- Bureau de Recherches Géologiques et Minières (BRGM), France
- Commonwealth Scientific and Industrial Research Organisation (CSIRO), Australia
- Geological Survey of Victoria (GSV), Australia
- Geological Survey of Finland (GTK), Finland
- Geological Survey of Italy (ISPRA), Italy
- Geological Survey of Sweden (SGU), Sweden
- Geoscience Australia (GA), Australia
- Institute of Geological and Nuclear Sciences (GNS), New Zealand
- Landcare Research, New Zealand
- Natural Resources Canada (NRCan), Canada
- U.S. Geological Survey (USGS), United States of America
v. Submitters
All questions regarding this submission should be directed to the editors or submitters:
| Name | Affiliation | OGC Member |
|---|---|---|
| Eric Boisvert | Geological Survey of Canada (Natural Resources Canada) | Y |
| Ollie Raymond | Geoscience Australia | Y |
| Marcus Sen | British Geological Survey | Y |
1. Scope
GeoSciML (Geoscience Markup Language) covers the domain of geology (earth materials, geological units and stratigraphy, geological time, geological structures, geomorphology, geochemistry) and sampling features common to the practice of geoscience, such as boreholes and geological specimens. The specification also proposes a simplified version of GeoSciML suitable for portrayal of geological features on digital maps. This specification does not address, or only partially addresses, more specialised geoscience domains such as hydrogeology, seismology, geophysics or economic geology. Some of these domains are covered by other specifications (e.g. GroundwaterML, OGC 16-032, for hydrogeology and EarthResourceML for economic geology, both developed in concert with GeoSciML).
2. Conformance
This standard defines a logical model and an XML encoding that conform to the OGC GML 3.3 encoding rules, which are themselves an iteration of ISO 19136:2007.
Requirements for three standardization target types are considered:
- Logical Model
- Encoding
- Data instance
Conformance with this standard shall be checked using all the relevant tests specified in Annex A (normative) of this document. The framework, concepts, and methodology for testing, and the criteria to be achieved to claim conformance are specified in the OGC Compliance Testing Policies and Procedures and the OGC Compliance Testing web site[1].
All requirements-classes and conformance-classes described in this document are owned by the standard(s) identified.
3. References
The following normative documents are referenced in the text or provide significant context for the development of GeoSciML 4.1. For dated references, subsequent amendments to, or revisions of, any of these publications do not apply. However, parties to agreements based on this document are encouraged to investigate the possibility of applying the most recent editions of the normative documents indicated below. For undated references, the latest edition of the normative document applies.
- OGC: OGC 06-121r9, OGC® Web Services Common Standard (2010)
- ISO / TC 211: ISO 19103:2005 - Conceptual Schema Language (2005)
- ISO / TC 211: ISO 19107:2003 - Spatial Schema (2003)
- ISO / TC 211: ISO 19108:2006 - Temporal Schema (2006)
- ISO: ISO 8601 - Data elements and interchange formats – Information interchange – Representation of dates and times (2004)
- OGC: OGC Abstract Specification Topic 20 – Observations and Measurements (also ISO 19156:2011) (2011)
- OGC: OGC Abstract Specification Topic 2 – Spatial Referencing by Coordinates (also ISO 19111:2007) (2007)
- ISO / TC 211: ISO 19115:2003 – Geographic information - Metadata (also OGC Abstract Specification Topic 11) (2003)
- OGC: OGC 07-036, Geography Markup Language (also ISO 19136:2007) (2007)
- OGC: OGC 10-025r1, OGC Observations and Measurements - XML Implementation v2.0 (2011)
- OGC: OGC 08-094r1, OGC SWE Common Data Model Encoding Standard v2.0 (2011)
- IETF: RFC 3986 - Uniform Resource Identifier (URI): Generic Syntax, (2005)
- ISO/IEC: ISO/IEC 19757-3, Information technology - Document Schema Definition Languages (DSDL) - Part 3: Rule-based validation – Schematron (2006)
- OGC: OGC 08-131r3, The Specification Model - A Standard for Modular specifications (2009)
- Schadow, G. and McDonald, C. J.: Unified Code for Units of Measure (UCUM) - Version 2.0.1, (2014)
- OMG: Unified Modeling Language (UML). Version 2.3. (2010)
- W3C: Extensible Markup Language (XML) - Version 1.0 (Fourth Edition) (2006)
- W3C: XML Schema - Version 1.0 (Second Edition) (2004)
- INSPIRE Thematic Working Group Geology: INSPIRE Data Specification for the spatial data theme Geology Version 3.0. European Commission Joint Research Group (2013)
- North American Geologic Map Data Model Steering Committee: NADM Conceptual Model 1.0—A conceptual model for geologic map information. U.S. Geological Survey Open-File Report 2004-1334, 58 p., accessed online at URL http://pubs.usgs.gov/of/2004/1334 (2004)
- North American Geologic Map Data Model science language products (http://ngmdb.usgs.gov/www-nadm/sltt/products.html)
- Murphy, M. A. and Salvador, A.: International Stratigraphic Guide – an abridged version, accessed online at URL http://www.stratigraphy.org/index.php/ics-stratigraphicguide (1994)
4. Terms and Definitions
This document uses the terms defined in Sub-clause 5.3 of [OGC 06-121r9], which is based on the ISO/IEC Directives, Part 2, Rules for the structure and drafting of International Standards. In particular, the word “shall” (not “must”) is the verb form used to indicate a requirement to be strictly followed to conform to this standard.
For the purposes of this document, the following additional terms and definitions apply.
- 4.1 classifier
A classifier is an abstract UML metaclass which describes (classifies) a set of instances having common features (not to be confused with the “Feature” stereotype from the OGC Feature Model). A feature declares a structural or behavioral characteristic of instances of classifiers (http://www.uml-diagrams.org/classifier.html). Classes, Interfaces, Associations, and Types are kinds of classifiers.
- 4.2 domain feature
Feature of a type defined within a particular application domain.
NOTE: This may be contrasted with observations and sampling features, which are features of types defined for cross-domain purposes.
[ISO 19156, definition 4.4]
- 4.3 element <XML>
Basic information item of an XML document containing child elements, attributes and character data.
NOTE: From the XML Information Set: each XML document contains one or more elements, the boundaries of which are either delimited by start-tags and end-tags, or, for empty elements, by an empty-element tag. Each element has a type, identified by name, sometimes called its ‘generic identifier’ (GI), and may have a set of attribute specifications. Each attribute specification has a name and a value.
[ISO 19136:2007]
- 4.4 feature
Abstraction of a real-world phenomenon.
[ISO 19101:2002, definition 4.11]
- 4.5 GML application schema
Application schema written in XML Schema in accordance with the rules specified in OGC GML 3.3
[ISO 19136:2007]
- 4.6 GML document
XML document with a root element that is one of the elements AbstractFeature, Dictionary or TopoComplex, specified in the GML schema or any element of a substitution group of any of these elements.
[ISO 19136:2007]
- 4.7 GML schema
Schema components in the XML namespace ― as specified in OGC GML 3.3
[ISO 19136:2007]
- 4.8 measurement
Set of operations having the objective of determining the value of a quantity.
[ISO/TS 19101-2:2008, definition 4.20]
- 4.9 observation
Act of observing a property.
NOTE: The goal of an observation may be to measure or otherwise determine the value of a property.
[ISO 19156:2011 definition 4.10]
- 4.10 observation procedure
Method, algorithm, instrument or system which may be used in making an observation.
[ISO 19156:2011, definition 4.11]
- 4.11 observation result
Estimate of the value of a property determined through a known procedure.
[ISO 19156:2011]
- 4.12 property <General Feature Model>
Facet or attribute of an object referenced by a name.
EXAMPLE: Abby’s car has the color red, where “color red” is a property of the car instance.
- 4.13 sampled feature
The real-world domain feature of interest, such as a geological unit or structure which is observed.
[ISO 19156:2011]
- 4.14 sampling feature
Feature, such as a station, outcrop, borehole, section or specimen, which is involved in making observations of a domain feature.
NOTE: A sampling feature is purely an artefact of the observational strategy, and has no significance independent of the observational campaign.
[ISO 19156:2011, definition 4.16]
- 4.15 schema <XML Schema>
XML document containing a collection of schema component definitions and declarations within the same target namespace.
EXAMPLE: Schema components of W3C XML Schema are types, elements, attributes, groups, etc.
NOTE: The W3C XML Schema provides an XML interchange format for schema information. A single schema document provides descriptions of components associated with a single XML namespace, but several documents may describe components in the same schema, i.e. the same target namespace.
[ISO 19136:2007]
5. GeoSciML Models
GeoSciML 4.1 is an ISO General Feature Model (ISO 19101, ISO 19109) implementation of portions of the North American Data Model [12] and CSIRO’s XMML model. GeoSciML also provides models for concepts at the immediate periphery of geological mapping, such as boreholes, geologic specimens and laboratory analysis, modelled as SF_SamplingFeatures and OM_Observations (OGC 10-004r3).
GeoSciML has been through four major releases and a few minor releases since 2005. Each version brought a different interpretation of what is essentially the same conceptual model. The reader looking at all iterations will see, with few exceptions, the same concepts, the same associations and the same properties, but packaged differently. As it evolved, GeoSciML increasingly adopted other domain models: it replaced XMML (eXploration and Mining Markup Language, developed by CSIRO) with Observations and Measurements (ISO 19156), replaced custom data types for ranges and categories with SWE Common, and removed custom vocabularies in favour of web resources.
This fourth iteration is essentially a repackaging of the previous version 3.2 from 13 packages organised by themes into 6 packages organised by use cases (Figure 2):
- GeoSciML Basic: a set of core geologic features, aligned to the INSPIRE Data Specification on Geology.
- GeoSciML Extension: an extension that adds further properties and associations for the detailed description of basic features.
- GeoSciML Geologic Age: a model for the representation of geologic time using procedures adopted by the International Commission on Stratigraphy.
- GeoSciML Borehole: a model for boreholes, including geologic logs and drilling details and other engineering information.
- GeoSciML Laboratory and Analysis: a model for laboratory analytical metadata, geological sampling and specimens, and isotopic age observation results.
- GeoSciML Lite: previously known as “GeoSciML Portrayal” in version 3.2; a simplified alternate implementation of the conceptual model for layer based applications.
Each application package is the subject of at least one requirements class (to conform to the modular specification) per target implementation. This specification has three targets: logical model, encoding and data instance. More target implementations might be published as separate documents.
5.1 GeoSciML Basic and Extension
GeoSciML describes geological features from the mapping perspective, articulated around the concept of a MappedFeature – the cartographic element shown on a map, and the GeologicFeature it represents. All geologic concepts that can be represented on a map are subtypes of GeologicFeature.
GeologicFeature is an abstract class materialised into four concrete classes (Figure 4) - GeologicEvent, GeologicStructure, GeologicUnit and GeomorphologicFeature. The other main features of the GeoSciML model are not geologic features themselves, but features related to the activity of sampling and observing geology (such as Borehole or GeologicSpecimen) and are therefore modelled as SF_SamplingFeature (O&M) subtypes.
GeologicFeature instances can be related to one another through a relation class (AbstractFeatureRelation), which is subtyped into different kinds of relationships, each providing distinct properties and constraints.
To provide a simple entry-level model for data providers, and also to align with INSPIRE, the basic package supports only a minimal set of properties. When more properties are required, the data provider can use the extension package. To split properties between basic and extension, a modelling pattern has been adopted that overcomes the limitations of classical object-oriented subtyping imposed by UML and XSD.
5.1.1 AbstractDescription classes
The technique to add extended properties to an existing class is normally to create a subtype to carry the new properties (Figure 5).
But this only works when properties need to be added to a leaf class. Properties added by subtyping a class higher up in the chain of inheritance will create a new branch, and new properties won’t propagate to existing subtypes in the main branch. GeoSciML 4 adopts an extension pattern using abstract property blocks or ‘AbstractDescription’ classes (Figure 6). Blocks of extended properties are organized in their own Datatype, subtyping AbstractDescription.
This pattern has two main advantages:
- It does not require the creation of a new feature type to add properties to core features.
- Extra properties can be defined and used by other user communities (e.g., properties added by a geophysical application could be reused by groundwater applications).
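The property-block pattern can be illustrated with a minimal Python sketch. It is not part of the GeoSciML model: the class `HypotheticalUnitSubtype`, the two `Example...` description blocks and all attribute names are hypothetical and chosen only to show how a core feature and its existing subtypes gain extended properties without a new feature type.

```python
from dataclasses import dataclass, field
from typing import List, Optional


class GeologicUnitAbstractDescription:
    """Stub property block declared alongside the core feature.

    By convention it is never instantiated directly; only subtypes carry properties.
    """


@dataclass
class GeologicUnit:
    """Core feature: a minimal property set plus a slot for description blocks."""
    name: str
    descriptions: List[GeologicUnitAbstractDescription] = field(default_factory=list)


@dataclass
class HypotheticalUnitSubtype(GeologicUnit):
    """An existing subtype in the main branch; it inherits the slot unchanged."""
    rank: Optional[str] = None


@dataclass
class ExampleUnitDetails(GeologicUnitAbstractDescription):
    """Hypothetical block of extended properties from one extension package."""
    body_morphology: Optional[str] = None


@dataclass
class ExampleGeophysicalDetails(GeologicUnitAbstractDescription):
    """Hypothetical block defined by another community and reusable by others."""
    magnetic_susceptibility: Optional[float] = None


# Extended properties reach the subtype without creating a new feature type.
unit = HypotheticalUnitSubtype(name="Example Formation", rank="formation")
unit.descriptions.append(ExampleUnitDetails(body_morphology="tabular"))
unit.descriptions.append(ExampleGeophysicalDetails(magnetic_susceptibility=0.002))
```

Because the blocks attach to a slot defined high in the inheritance chain, every existing subtype can carry them, which is the limitation of plain subtyping that the pattern is meant to avoid.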
GeoSciML Basic contains nine stub AbstractDescription classes ultimately materialised in GeoSciML Extension (Table 1).
| Class | Description |
|---|---|
| AbstractFeatureRelation | Association class placeholder to describe relations between geologic features. |
| EarthMaterialAbstractDescription | Detailed earth material description placeholder for GeologicUnit and EarthMaterial. |
| GeologicUnitAbstractDescription | Detailed description placeholder for GeologicUnit. |
| GeologicEventAbstractDescription | Detailed description placeholder for GeologicEvent. |
| ContactAbstractDescription | Detailed description placeholder for Contact. |
| FoliationAbstractDescription | Detailed description placeholder for Foliation. |
| FoldAbstractDescription | Detailed description placeholder for Fold. |
| ShearDisplacementStructureAbstractDescription | Detailed description placeholder for ShearDisplacementStructure. |
| GeomorphologicUnitAbstractDescription | Detailed description placeholder for GeomorphologicUnit. |
Since those classes are abstract in GeoSciML Basic, data providers need to implement GeoSciML Extension, or a third-party extension, to obtain concrete classes.
This modelling pattern is also used by other standards communities (e.g., ISO 19115-3).
5.2 GeoSciML Lite
GeoSciML Lite is a denormalised view, or a transformation, of key geological and sampling features, designed as a simple entry-level model for publishing datasets and particularly suited to geographic visualization with key reporting properties. The use case for GeoSciML Lite is a simple layer-based application, such as a web map application or GIS, where the key functionality is to display a map layer and perform simple identify or query operations. The classes are modelled to be easily implementable in any GIS or web mapping application. One feature type maps to one table composed of optional, single-occurrence properties, consistent with the structure of denormalised RDBMS tables. The XML implementation (clause 9.8) is conformant with GML Simple Feature (OGC 10-100r3).
Each property of GeoSciML Lite classes is derived from a subset of the properties available in the full GeoSciML model, with the exception of “genericSymbolizer”, which is a convenience property providing a cartographic symbol or code. The property is a shortcut to symbolisation that would otherwise be provided by an SLD (Styled Layer Descriptor).
Some fields are external references, in the form of HTTP URIs, that give applications hyperlinks to linked data definitions of externally governed vocabulary terms and/or to complex representations of the features when required.
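The denormalisation described in this clause can be sketched as a flattening step. The following Python fragment is only an illustration under stated assumptions: the input dictionary layout and the keys `names`, `lithology_uris` and `symbol_code` are hypothetical, and the output column names are illustrative rather than taken from the normative Lite classes (only `genericSymbolizer` is named in the text above).

```python
from typing import Any, Dict, List, Optional


def first_or_none(values: Optional[List[Any]]) -> Optional[Any]:
    """Collapse a multi-valued property to a single optional value."""
    return values[0] if values else None


def to_lite_row(unit: Dict[str, Any]) -> Dict[str, Optional[str]]:
    """Flatten a nested feature into one row of optional, single-occurrence fields."""
    return {
        "identifier": unit.get("identifier"),
        # Multi-valued source properties keep only one representative value.
        "name": first_or_none(unit.get("names")),
        # External reference kept as an HTTP URI rather than an inline description.
        "representative_lithology_uri": first_or_none(unit.get("lithology_uris")),
        # Convenience code for symbolisation, standing in for an SLD rule.
        "generic_symbolizer": unit.get("symbol_code"),
    }


row = to_lite_row({
    "identifier": "http://example.org/unit/1",
    "names": ["Example Formation", "Example Fm"],
    "lithology_uris": [],
})
```

Every column is optional, so a missing source value simply yields `None` in the row instead of a nested or repeated structure.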
6. Conventions
6.1 Requirement class
Each normative statement (requirement or recommendation) in this specification is a member of a requirements class. Each requirements class is described in a discrete clause or sub-clause, and summarized using the following template:
| Requirements class | /req/{classM} |
|---|---|
| Target type | [artefact or technology type] |
| Dependency | [identifier for another requirements class] |
| Requirement | /req/{classM}/{reqN} |
| Recommendation | /req/{classM}/{recO} |
| Requirement | /req/{classM}/{reqP} |
| Requirement /Recommendation | [repeat as necessary] |
All requirements in a requirements class must be satisfied. Hence, the requirements class is the unit of re-use and dependency, and the value of a dependency requirement is another requirements class. All requirements in a dependency must also be satisfied by a conforming implementation. A requirements class may consist only of dependencies and introduce no new requirements.
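The dependency rule can be read as a transitive closure: conforming to a class means satisfying its own requirements and those of every class it depends on. The sketch below illustrates that reading only; the class and requirement identifiers in `CLASSES` are hypothetical and do not appear in this specification.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical requirements classes: identifier -> (own requirements, dependencies).
CLASSES: Dict[str, Tuple[List[str], List[str]]] = {
    "/req/example-core": (["/req/example-core/r1"], []),
    "/req/example-ext": (["/req/example-ext/r1"], ["/req/example-core"]),
    # A class may consist only of dependencies and add no new requirements.
    "/req/example-profile": ([], ["/req/example-ext"]),
}


def all_requirements(class_id: str,
                     classes: Dict[str, Tuple[List[str], List[str]]],
                     seen: Optional[Set[str]] = None) -> Set[str]:
    """Collect the requirements of a class and, recursively, of its dependencies."""
    seen = set() if seen is None else seen
    if class_id in seen:  # guard against cyclic dependency declarations
        return set()
    seen.add(class_id)
    own, dependencies = classes[class_id]
    collected = set(own)
    for dependency in dependencies:
        collected |= all_requirements(dependency, classes, seen)
    return collected


def conforms(class_id: str, satisfied: Set[str],
             classes: Dict[str, Tuple[List[str], List[str]]]) -> bool:
    """True when every own and inherited requirement is in the satisfied set."""
    return all_requirements(class_id, classes) <= satisfied
```

For example, `conforms("/req/example-ext", {"/req/example-ext/r1"}, CLASSES)` is false, because the requirement inherited from the core class is not satisfied.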
6.2 Requirement and Recommendation
All requirements and recommendations are normative, and each is presented using the following template:
| Requirement /req/[classM]/[reqN] |
|---|
| [Normative statement] |
where /req/[classM]/[reqN] identifies the requirement or recommendation. The use of this layout convention allows the normative provisions of this specification to be easily located by implementers.
6.3 Conformance class
Conformance to this specification is possible at a number of levels, specified by conformance classes (Annex A). Each conformance class is summarized using the following template:
| Conformance class | |
|---|---|
| /conf/{classM} | |
| Dependency | [identifier for another conformance class] |
| Requirements | /req/{classA} |
| Tests | [reference to clause(s) containing tests] |
All tests in a class must be passed. Each conformance class tests conformance to a set of requirements packaged in a requirements class.
W3C Schema (XSD) and ISO Schematron (SCH) files are considered as part of this specification, although available online only, due to concerns about document size. Many requirements are expressed in a single XSD or SCH file, although tests are listed individually in the conformance annex (one test for XSD and one test for SCH).
Schematron files explicitly specify which requirements are being tested in the title of the schematron pattern.
<pattern id="unit-of-measure">
  <title>Test requirement: /req/gsml4xsd/unit-of-measure</title>
  <rule context="swe:Quantity">
    <assert test="swe:uom">Quantity must have a UOM</assert>
  </rule>
</pattern>
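For readers without a Schematron processor, the intent of this pattern (every quantity carries a unit of measure) can be approximated with a few lines of Python. This is a sketch, not the normative test: it assumes the SWE Common 2.0 namespace URI `http://www.opengis.net/swe/2.0` and that the unit is given as a `swe:uom` child element, and the sample document is invented for the illustration.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for SWE Common 2.0.
SWE_NS = "http://www.opengis.net/swe/2.0"

# Invented sample: the second quantity lacks a unit of measure.
SAMPLE = """<root xmlns:swe="http://www.opengis.net/swe/2.0">
  <swe:Quantity><swe:uom code="m"/><swe:value>12.5</swe:value></swe:Quantity>
  <swe:Quantity><swe:value>3.0</swe:value></swe:Quantity>
</root>"""


def quantities_without_uom(xml_text: str) -> int:
    """Count swe:Quantity elements that have no swe:uom child."""
    root = ET.fromstring(xml_text)
    missing = 0
    for quantity in root.iter(f"{{{SWE_NS}}}Quantity"):
        if quantity.find(f"{{{SWE_NS}}}uom") is None:
            missing += 1
    return missing
```

A count of zero corresponds to the Schematron assertion holding for every quantity in the instance.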
6.4 Identifiers
The normative provisions in this specification are denoted by a URI constructed using this pattern:
http://www.opengis.net/spec/{standard}/{m.n}
All requirements and conformance tests that appear in this document are denoted by a partial URI which is relative to this base. The identifier supports cross-referencing of class membership, dependencies, and links from each conformance test to the requirements tested. In this specification identifiers are expressed as partial URIs or paths, which can be appended to a base URI that identifies the specification as a whole in order to construct a complete URI for identification in an external context.
The URI for each requirements class has the form:
http://www.opengis.net/spec/geosciml/4.1/req/[classM].
The URI for each requirement or recommendation has the form:
http://www.opengis.net/spec/geosciml/4.1/req/[classM]/[reqN].
The URI for each conformance class has the form:
http://www.opengis.net/spec/geosciml/4.1/conf/[classM].
The URI for each conformance test has the form:
http://www.opengis.net/spec/geosciml/4.1/conf/[classM]/[testN].
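As a small illustration of how the partial identifiers used in this document expand to complete URIs, the following sketch appends a path to the base URI. The function names `spec_base` and `full_uri` are hypothetical helpers, not part of any OGC tooling.

```python
def spec_base(standard: str, version: str) -> str:
    """Base URI following the pattern http://www.opengis.net/spec/{standard}/{m.n}."""
    return f"http://www.opengis.net/spec/{standard}/{version}"


def full_uri(partial: str, base: str = spec_base("geosciml", "4.1")) -> str:
    """Append a partial identifier such as /req/{classM}/{reqN} to the base URI."""
    return base.rstrip("/") + "/" + partial.lstrip("/")


# A requirement identifier used later in this document, expanded to a full URI.
uri = full_uri("/req/gsml4-core/uml-abstract")
```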
6.5 Classifiers
This document contains a large number of references to classifiers that might sometimes be ambiguous. Classes and packages are referred to by their “CamelCase” name, set in monospace type. Duplicate names do exist, so the scope (the package of a class, or the class a property belongs to) must be made explicit.
OCL syntax will be used to identify a logical model classifier from the UML model.
Package::{…}Package::Classifier::Property:Type
Package names are not formal in UML and can change from one implementation to another. The reference model used by GeoSciML, and several other domain models, is HollowWorld. For example, a complete path for a SF_SamplingPoint in HollowWorld (from HollowWorld root) is
ISO TC211::ISO 19156 All::ISO 19156:2011 Observations and Measurements::Sampling Features::samplingPoint::SF_SamplingPoint
For the sake of readability, and also because some HollowWorld package names do not have OCL friendly names (e.g. some package names contain ‘:’, as shown in the previous example), this document will use shortcuts to identify packages. For example, for OM::SF_SamplingPoint, OM acts as a shortcut for (ISO TC211::ISO 19156 All::ISO 19156:2011 Observations and Measurements::*) that includes all classifiers in all sub packages and avoids creating a shortcut for all sub packages. The list of shortcuts is provided in Section 8.1.2. GeoSciML also uses the recently published ISO19115-3 model which has numerous classifier name overlaps with ISO19115 from HollowWorld.
W3C XPath will be used in XML instances. XML entities will be identified using their full qualified name (namespace, identified by its prefix, and entity name).
- gsmlb:GeologicUnit refers to an instance of GeologicUnit, from namespace xmlns:gsmlb="http://www.opengis.net/gsml/4.1/GeoSciML-Basic"
- gsmlb:GeologicUnit/gml:name refers to the name property of GeologicUnit
- gsmlb:GeologicUnit/gml:name/@codeSpace refers to the codeSpace attribute of the name property of GeologicUnit
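Paths of this kind can be evaluated with any namespace-aware XPath implementation. The sketch below uses Python's standard `xml.etree.ElementTree`, which supports only a subset of XPath, so attributes are read with `.get()` rather than an `@` step. The `collection` wrapper element and the sample content are invented, and the `gml` namespace URI `http://www.opengis.net/gml/3.2` is an assumption; the `gsmlb` URI is the one given above.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Prefix-to-namespace map; the gml URI is an assumption for this sketch.
NS = {
    "gsmlb": "http://www.opengis.net/gsml/4.1/GeoSciML-Basic",
    "gml": "http://www.opengis.net/gml/3.2",
}

# Invented instance fragment wrapped in a hypothetical container element.
SAMPLE = """<collection
    xmlns:gsmlb="http://www.opengis.net/gsml/4.1/GeoSciML-Basic"
    xmlns:gml="http://www.opengis.net/gml/3.2">
  <gsmlb:GeologicUnit gml:id="unit-1">
    <gml:name>Example Formation</gml:name>
  </gsmlb:GeologicUnit>
</collection>"""


def first_unit_name(xml_text: str) -> Optional[str]:
    """Return the text of gsmlb:GeologicUnit/gml:name, or None if absent."""
    root = ET.fromstring(xml_text)
    name = root.find("gsmlb:GeologicUnit/gml:name", NS)
    return None if name is None else name.text
```

The prefixes in the path are resolved through the `NS` map, so the lookup depends on the namespace URIs and not on the prefixes chosen in the instance document.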
7. Conceptual Model
The strictly geological portion of GeoSciML, as opposed to the parts dealing with sampling (e.g., boreholes) and laboratory metadata, is largely an implementation of the North American Data Model [12]. NADM is a technologically neutral conceptual model that addresses geoscience concepts and the relationships between them. GeoSciML 4 does not implement NADM Geologic Portrayal (a model of cartographic elements composing a geological map, such as legends, symbols, insets, etc.) nor Geologic Vocabulary (although older versions of GeoSciML did).
GeoSciML is an ISO Feature Model implementation of NADM, and this creates subtle differences between NADM and GeoSciML because the logical model must deal with ISO Feature Model idiosyncrasies. For instance, the multiple inheritance that NADM uses for Fossil could not be implemented, because ISO modelling rules forbid such constructs. There were also conceptual changes, especially regarding EarthMaterial, which GeoSciML treats as a Type rather than as a GeologicFeature (and hence an ISO FeatureType).
Sampling and analytical metadata features (Borehole and GeologicSpecimen in particular) are extensions of Observations and Measurements (10-004r3) and as such implement the underlying Observations and Measurements conceptual model. Borehole introduces engineering concepts known to the industry without a formal conceptual model. It has been recognised that boreholes are features that are common outside the geological mapping realm (for example in the energy and mineral resources industries, hydrogeology and civil engineering) and that more formal work could be carried out by those interested parties. Therefore, Borehole in GeoSciML is essentially a placeholder waiting to be replaced by a more formal Borehole model that is applicable across more domains than geology. It is expected that the Laboratory Analysis model could also be formalised by parties interested in (geo)chemical analysis.
| Requirements class | |
|---|---|
| Target type | Logical Model |
| Name | GeoSciML conceptual model |
| Dependency | ISO19101:2002 Reference Model Clause 7 |
| Dependency | ISO19103 2015 Conceptual Model Language |
| Dependency | ISO19104:2008 |
| Dependency | Unified Modeling Language (UML). Version 2.3. May 2010 |
| Requirement | /req/gsml4-conceptual/similarity A target logical model, when claiming compliance with this conceptual model, SHALL implement its components (classes, attributes, relationships) respecting the conceptual model definitions and intent, such that high semantic similarity is obtained between the logical and conceptual model components. |
Target logical models that are compliant with the conceptual model shall implement components of the conceptual model respecting their semantics, i.e. their definition and intent. In other words, the logical model must be highly semantically similar to components of the conceptual model. Semantic similarity can be tested in multiple ways, including but not limited to: (i) direct comparison of UML components, (ii) comparison after mapping components to a common expressive knowledge representation language, such as first order logic or common logic, or (iii) comparison after mapping components to a reference ontology. The target can reuse and adapt existing logical models.
| Requirement /req/gsml4-conceptual/similarity |
|---|
| A target logical model, when claiming compliance with this conceptual model, SHALL implement its components (classes, attributes, relationships) respecting the conceptual model definitions and intent, such that high semantic similarity is obtained between the logical and conceptual model components. |
8. Logical Model
This section describes requirements that must be met by all target implementations that claim conformance to this specification. The target implementation of the logical model is generally an encoding specification or a schema (which could use technologies like XSD, for example) and not a data instance. The logical model, expressed using UML, provides naming, structure and cardinality for any physical implementation. The UML model is a normative artefact as the official representation of GeoSciML. Rules that can be unambiguously inferred from the UML model will not be documented as explicit requirements. Specific encoding idiosyncrasies shall be addressed in the requirement clauses pertaining to that encoding.
The logical model contains almost no semantic requirements (i.e., vocabularies, enumerations). It is expected that users will employ controlled vocabularies of terms which are developed by user communities. The model provides mechanisms for delivering concepts from controlled vocabularies via URIs and linked data principles.
8.1 UML Model
The UML model provides name, structure and cardinality for the data elements of the various potential physical implementations of GeoSciML. There are formal mappings between UML and GML (ISO 19136) and between UML and RDF (ISO 19150), and best practices exist for mapping UML to RDBMS. Although UML is assumed to be technologically neutral, in practice UML models always end up addressing some details of the encoding specification. The current GeoSciML UML model has been designed as a GML application according to ISO 19109 and borrows some artefacts of GML. Several design decisions were guided by limitations of UML (e.g. single inheritance) and XSD (package dependency artefacts) and by some constraints of GML delivery using ISO 19142 WFS (for instance, some XSD encodings are not easily queryable with ISO 19143 FES). However, the UML model is detailed enough to constrain the main elements of any encoding: the names of entities, the cardinality of properties, the associations between entities and, to some extent, property types. On the other hand, some UML features have no equivalent in certain encodings (for instance, JSON has no native support for namespaces or even schemas).
Figure 8 shows requirements class dependencies:
This section defines the minimum UML mapping requirements that shall be met by any target claiming compliance to this specification.
8.1.1 Property cardinality
All properties that could feasibly be made optional are optional in GeoSciML 4.1. This is a reversal of the pattern used in GeoSciML version 3.2, where all properties were made mandatory, forcing the data provider to document why a property was missing using nillable properties. This design attracted a lot of criticism (not only for GeoSciML but also in other communities confronted with the same pattern) from application developers and data providers, who consider that filling the instance with nil properties is “unnecessarily verbose” and a waste of bandwidth. It has been argued that nillable properties are just verbose absent values. This issue is a real concern for mobile applications, where payload size has an impact on user experience.
Nillable properties nevertheless carry useful or even required information in certain use cases, such as legally binding data exchange scenarios. Some communities using GeoSciML may still want to force usage of nillable properties, and the SWG recognised that different communities might want to enforce the use of some properties for their particular needs. To meet this requirement and to offer flexibility to various communities wanting to use (or extend) GeoSciML, properties are optional, but can still be nilled. A data provider is offered two options when a value is missing:
- Omit the property
- Deliver a nilled property with relevant justification.
Which option is most useful for a community is left to that community to decide. Their decision can be enforced using Schematron. The editors foresee the use of a) the GeoSciML data model as defined by this specification providing naming and structure and b) a series of community-defined rules to enforce the presence of certain properties relevant to their use cases. For an XML implementation, this translates into a set of common XSD and SCH to govern conformance to GeoSciML, and community-specific SCH to enforce specific use cases, such as the INSPIRE geology specification [8].
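The two options for a missing value can be sketched as follows. This is an illustration only: the `http://example.org/hypothetical` namespace, the `purpose` property name and the helper `add_property` are hypothetical, and the use of `xsi:nil` together with an unqualified `nilReason` attribute follows the usual GML convention, which is assumed here rather than quoted from this specification.

```python
import xml.etree.ElementTree as ET
from typing import Optional

XSI = "http://www.w3.org/2001/XMLSchema-instance"
EX = "http://example.org/hypothetical"  # hypothetical namespace for the sketch


def add_property(parent: ET.Element, local_name: str,
                 value: Optional[str] = None,
                 nil_reason: Optional[str] = None) -> Optional[ET.Element]:
    """Encode a property value, a nilled property, or nothing at all."""
    if value is not None:
        child = ET.SubElement(parent, f"{{{EX}}}{local_name}")
        child.text = value
        return child
    if nil_reason is not None:
        # Option 2: deliver a nilled property with a justification.
        child = ET.SubElement(parent, f"{{{EX}}}{local_name}")
        child.set(f"{{{XSI}}}nil", "true")
        child.set("nilReason", nil_reason)
        return child
    # Option 1: omit the property entirely.
    return None


feature = ET.Element(f"{{{EX}}}Feature")
omitted = add_property(feature, "purpose")
nilled = add_property(feature, "purpose", nil_reason="missing")
```

A community rule that forbids omission would then amount to checking that the property element is always present, nilled or not, which is the kind of constraint the text above assigns to community-specific Schematron.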
8.1.2 Package shortcuts
The following shortcuts are used to refer to external (non GeoSciML) classifiers.
| Shortcut | Full path (HollowWorld) |
|---|---|
| OM | ISO TC211/ISO 19156 All/ISO 19156:2011 Observations and Measurements |
| SWE | OGC/Sensor Web Enablement 2.0/SWE Common Data Model 2.0 |
| GEO | ISO TC211/ISO 19107 All/ISO 19107:2003 Spatial Schema |
| Primitive | ISO TC211/ISO 19103 All/ISO 19103:2005 Conceptual schema language |
| Temporal | ISO TC211/ISO 19108 All/ISO 19108:2006 Temporal Schema |
| GML | ISO TC211/ISO 19136:2007 GML |
8.2 GeoSciML Core Abstract Requirements Class (Normative)
| Abstract Requirements Class | |
|---|---|
| /req/gsml4-core | |
| Target type | Encoding |
| Dependency | ISO19103:2005 Conceptual Schema Language |
| Dependency | ISO19107:2003 Spatial Schema |
| Dependency | ISO19109:2015 Rules for application schemas |
| Dependency | RFC 3986 Uniform Resource Identifier (URI): Generic Syntax |
| Dependency | ISO19115-3 Metadata |
| Requirement | /req/gsml4-core/uml-entity-name When the target implementation allows it, the exact name of the classifier SHALL be used. |
| Requirement | /req/gsml4-core/uml-cardinality If the target implementation allows it, it SHALL implement the same cardinality of properties and associations as defined in the UML. |
| Requirement | /req/gsml4-core/uml-abstract Abstract classes SHALL NOT be materialised. |
| Requirement | /req/gsml4-core/uml-polymorphism A target implementation SHALL implement type substitutions inferred from the UML model. |
| Requirement | /req/gsml4-core/quantities-uom Quantities and measurements SHALL have explicit units of measure from a governed ontology. |
| Requirement | /req/gsml4-core/quantities-single-range QuantityRange properties that must report a single value SHALL assign both lower and upper value as equal to that single value. |
| Requirement | /req/gsml4-core/codelist Empty classes with stereotype <<CodeList>> SHALL be implemented as externally governed vocabularies which terms are encoded as URI (RFC 3986). |
This section presents the requirements to which all target encodings must conform in order to claim compliance with GeoSciML 4.1.
8.2.1 Naming of entities
| Requirement /req/gsml4-core/uml-entity-name |
|---|
| When the target implementation allows it, the exact name of the classifier SHALL be used. |
If a target implementation is capable of encoding all the artefacts (classes and properties) using the same names as in the UML, it shall do so. Some target implementations might prevent it; for example, dBase (DBF file) column names are restricted to 10 characters, and some RDBMSs limit the use of camel case names. But if the target allows it, the exact names shall be used.
8.2.2 Cardinality
| Requirement /req/gsml4-core/uml-cardinality |
|---|
| If the target implementation allows it, it SHALL implement the same cardinality of properties and associations as defined in the UML. |
Cardinality shall be the same as defined in the UML model. Since essentially all properties are optional, this clause addresses the upper bounds of cardinality: “1” or “many” in almost all cases. Therefore, if the UML model limits a property’s maximum cardinality to “1”, then the target implementation cardinality cannot be “many”.
8.2.3 Abstract classes
| Requirement /req/gsml4-core/uml-abstract |
|---|
| Abstract classes SHALL NOT be materialised. |
Not all physical implementations support the concept of an abstract class, or even inheritance and polymorphism. XSD supports that concept and all its implications, but JSON does not, although JavaScript can emulate it to some extent. This requirement specifies that the encoding specification shall not allow materialisation of an instance of a class stereotyped as abstract.
8.2.4 Polymorphism
| Requirement /req/gsml4-core/uml-polymorphism |
|---|
| A target implementation SHALL implement type substitutions inferred from the UML model. |
The type hierarchy of the UML model implies type substitutions for property values. For instance, a property value of type GeologicEvent can be substituted by a value of type DisplacementEvent because DisplacementEvent is a subtype of GeologicEvent. Many property types are abstract types and only a concrete subtype may be materialised (as per /req/gsml4-core/uml-abstract). A target implementation shall consider type substitutions using mechanisms available for this implementation.
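As an illustration of how the abstract-class and polymorphism requirements might look in a target that supports inheritance, the following sketch uses Python's `abc` module. The class names come from the UML model as described in this document; the `type_name` method, the `FeatureWithHistory` holder and its `history` property are hypothetical and exist only to make the example self-contained.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import List


class GeologicFeature(ABC):
    """Abstract in the UML model, so it must never be instantiated directly."""

    @abstractmethod
    def type_name(self) -> str:
        ...


class GeologicEvent(GeologicFeature):
    def type_name(self) -> str:
        return "GeologicEvent"


class DisplacementEvent(GeologicEvent):
    def type_name(self) -> str:
        return "DisplacementEvent"


@dataclass
class FeatureWithHistory:
    """Hypothetical holder of a property whose declared type is GeologicEvent."""

    history: List[GeologicEvent] = field(default_factory=list)

    def add_event(self, event: GeologicEvent) -> None:
        # Any subtype of GeologicEvent may be substituted for the declared type.
        if not isinstance(event, GeologicEvent):
            raise TypeError("expected a GeologicEvent or one of its subtypes")
        self.history.append(event)


def abstract_class_can_be_materialised() -> bool:
    """Return True only if the abstract class could (wrongly) be instantiated."""
    try:
        GeologicFeature()
    except TypeError:
        return False
    return True
```

Targets without native inheritance would need another mechanism to reach the same effect, for example an explicit type discriminator on each encoded value.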
8.2.5 Quantities
| Requirement /req/gsml4-core/quantities-uom |
|---|
| Quantities and measurements SHALL have explicit units of measure from a governed ontology. |
The units of measure of quantities and measurements shall be taken from a standard vocabulary governed by an appropriate community, for example the Unified Code for Units of Measure (UCUM).
8.2.6 QuantityRange
A QuantityRange is a quantity formed of a lower and an upper value that together define a range of values. If a single value needs to be represented as a QuantityRange, the single value is assigned to both the lower and upper properties.
| Requirement /req/gsml4-core/quantities-single-range |
|---|
| QuantityRange properties that must report a single value SHALL assign both lower and upper value as equal to that single value. |
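The two quantity requirements can be sketched together as follows. This is a simplified stand-in for the model's QuantityRange, not its actual definition: the attribute names `lower_value`, `upper_value` and `uom`, and the helper `from_single_value`, are assumptions, and the unit is assumed to be given as a UCUM code such as `m`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuantityRange:
    lower_value: float
    upper_value: float
    uom: str  # unit of measure, assumed here to be a UCUM code such as "m"

    def __post_init__(self):
        # quantities-uom: the unit of measure must be explicit.
        if not self.uom:
            raise ValueError("a unit of measure is required")

    @classmethod
    def from_single_value(cls, value: float, uom: str) -> "QuantityRange":
        # quantities-single-range: lower and upper both take the single value.
        return cls(lower_value=value, upper_value=value, uom=uom)


thickness = QuantityRange.from_single_value(12.5, "m")
print(thickness)
```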
8.2.7 Code lists
| Requirement /req/gsml4-core/codelist |
|---|
| Empty classes with stereotype <<CodeList>> SHALL be implemented as externally governed vocabularies which terms are encoded as URI (RFC 3986). |
All properties that require formal vocabularies are modelled in the UML as classes having the stereotype <<CodeList>>. The list of valid terms is either prescribed by this specification, with a list of possible entries (Figure 9) or open (i.e., without any terms).
When the list is open, the vocabulary is managed externally over the web where each vocabulary term should be encoded as a resource. Vocabulary term identifiers are URIs representing concepts from a standard vocabulary governed by an appropriate community - for example, the IUGS CGI Geoscience Terminology Working Group (http://www.cgi-iugs.org/tech_collaboration/geoscience_terminology_working_group.html and http://resource.geosciml.org) or INSPIRE [8].
This requirement does not require that URIs be actually dereferenceable, but just that a vocabulary term is associated with a syntactically correct URI.
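A purely syntactic check of this kind could be sketched as below. It is a loose approximation rather than a full RFC 3986 validator: it only verifies that the term has a well-formed scheme, contains no whitespace, and has some content after the scheme. The function name `looks_like_uri` and the example URIs are hypothetical.

```python
import re
from urllib.parse import urlsplit

# Scheme syntax from RFC 3986: ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.\-]*$")


def looks_like_uri(term: str) -> bool:
    """Loose syntactic check; no network access is attempted."""
    if not term or any(ch.isspace() for ch in term):
        return False
    try:
        parts = urlsplit(term)
    except ValueError:
        # Raised by urlsplit for some malformed authorities.
        return False
    return bool(_SCHEME.match(parts.scheme)) and bool(parts.netloc or parts.path)


print(looks_like_uri("http://example.org/vocab/lithology/granite"))
print(looks_like_uri("granite"))
```

A bare label such as `granite` fails the check because it has no scheme, whereas both HTTP URIs and URNs pass, whether or not anything is published at that address.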
8.3 Linked Open Data Requirements Class (Normative)
| Requirements Class | |
|---|---|
| /req/gsml4-lod | |
| Target type | Encoding |
| Dependency | /req/gsml4-core |
| Dependency | URI |
| Dependency | HTTP |
| Requirement | /req/gsml4-lod/codelistURI URI used for vocabulary terms SHALL be dereferenceable to one or more representations of the vocabulary term. |
| Requirement | /req/gsml4-lod/identifier HTTP URI used as entity identifiers SHALL be dereferenceable to one or more representations of that entity. |
| Recommendation | /req/gsml4-lod/gsml-representation Dereferenceable HTTP URI used as identifiers SHOULD provide GeoSciML 4.1 as one of its representations. |
| Requirement | /req/gsml4-lod/byref External references to an entity conforming to [/req/gsml4-lod/identifier] shall be expressed using this entity identifier. |
Although OGC standards are not restricted to a web environment, they are strongly influenced by this environment. GeoSciML was originally developed specifically for XML, but many other encodings are suitable hypermedia formats (RDF/XML, JSON-LD, HTML). This requirements class describes extra rules that shall be implemented to turn GeoSciML data instances into hypermedia compatible with Linked Open Data principles.
Linked Open Data is a method to publish structured data on the web. It leverages existing web technologies such as HTTP (transfer protocol) and URI (addressing over the web) to connect structured resources. The principle is similar to web pages interconnected through hyperlinks, except that pages are replaced with structured information that can be processed by machines.
The following requirements essentially impose that URIs used as vocabulary terms, identifiers and references can be “dereferenced”, which is “The act of retrieving a representation of a resource identified by a URI”[2] from the web. A resource can have multiple representations (GML, XML, RDF, etc.) and this specification does not impose a particular one, although it is common sense in this context to provide a GeoSciML representation for geological features.
It is important to note that an HTTP URI in this context is both an identifier and a location. The same identifier is used to refer to any number of representations. Therefore, different representations are selected through content negotiation with the server.
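Content negotiation against a single identifier can be sketched with the standard `Accept` request header. The example below only builds the requests and does not send them; the identifier is hypothetical, and the media types `application/gml+xml` and `application/rdf+xml` are assumptions about what a given server might offer.

```python
from urllib.request import Request


def negotiated_request(identifier: str, media_type: str) -> Request:
    """Build (but do not send) a GET request asking for one representation."""
    return Request(identifier, headers={"Accept": media_type}, method="GET")


feature_uri = "http://example.org/feature/unit/123"  # hypothetical identifier

# Same identifier, two different representations requested.
as_gml = negotiated_request(feature_uri, "application/gml+xml")
as_rdf = negotiated_request(feature_uri, "application/rdf+xml")

print(as_gml.full_url, as_gml.get_header("Accept"))
print(as_rdf.full_url, as_rdf.get_header("Accept"))
```

Passing either request to `urllib.request.urlopen` would perform the actual retrieval; which representations are returned depends entirely on the server.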
8.3.1 Code lists URI
| Requirement /req/gsml4-lod/codelistURI |
|---|
| URI used for vocabulary terms SHALL be dereferenceable to one or more representations of the vocabulary term. |
The requirement described in clause 8.2.7 of the Abstract Requirements Class demands that a vocabulary reference be encoded as a URI, but does not require that the URI actually resolve to anything (it could, but it is not required). In this class, the target must ensure that the URI used to identify a vocabulary term SHALL dereference to one or more representations of a definition of the term (e.g., RDF/SKOS, HTML, GML Definition).
8.3.2 Identifier
| Requirement /req/gsml4-lod/identifier |
|---|
| HTTP URI used as entity identifiers SHALL be dereferenceable to one or more representations of that entity. |
This requirement demands that the target ensures that a data instance exposes a URI as a unique identifier for this feature and this identifier SHALL be dereferenceable to one or more representations of that feature.
| Requirement /req/gsml4-lod/gsml-representation |
|---|
| Dereferenceable HTTP URI used as identifiers SHOULD provide GeoSciML 4.1 as one of its representations. |
It is expected that one of the representations should be an XML (GML) representation or any other GeoSciML-compliant representation, including any profile derived from this specification.
8.3.3 ByReference associations
| Requirement /req/gsml4-lod/byref |
|---|
| External references to an entity conforming to [/req/gsml4-lod/identifier] shall be expressed using this entity identifier. |
Serialization of a dataset will often omit the full description of a feature and replace the property value with an external reference. This reference is formed by the dereferenceable identifier described in clause 8.3.2. A client ingesting the dataset can use the reference to retrieve a feature representation if need be. Over the web, this reference shall be an HTTP URI that can be dereferenced to one or more representations of that feature.
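In an XML encoding, a by-reference property is conventionally written as an empty element carrying an `xlink:href` attribute. The sketch below assumes that convention. The element names are written without their namespaces for brevity, and the feature identifier is hypothetical.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"
ET.register_namespace("xlink", XLINK)


def specification_by_reference(feature_uri: str) -> ET.Element:
    """Encode a MappedFeature whose specification is referenced, not inlined."""
    mapped = ET.Element("MappedFeature")  # namespace omitted for brevity
    # The property element stays empty; the identifier is the reference.
    ET.SubElement(mapped, "specification", {f"{{{XLINK}}}href": feature_uri})
    return mapped


element = specification_by_reference("http://example.org/feature/unit/123")
print(ET.tostring(element, encoding="unicode"))
```

A client that needs the full description can dereference the `xlink:href` value; a client that does not can ignore it at no extra payload cost.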
8.4 GeoSciML Basic Requirements Class (Normative)
| Requirements Class | |
|---|---|
| /req/gsml4-basic | |
| Target type | Encoding |
| Dependency | /req/gsml4-core |
| Dependency | Spatial Schema ISO19107 |
| Dependency | Conceptual schema language ISO19103 |
| Requirement | /req/gsml4-basic/geologicfeature-purpose Purpose SHALL be a value from Table 2 of clause 8.4.1.1.2. |
| Requirement | /req/gsml4-basic/geologicevent-single An individual GeologicEvent SHALL only apply to one of DisplacementEvent, AlterationDescription or MetamorphicDescription. |
| Requirement | /req/gsml4-basic/geologicevent-non-null Either (olderNamedAge + youngerNamedAge) or numericAge SHALL not be null. |
| Requirement | /req/gsml4-basic/plane-pol-dip-az-not-null At least one of polarity, azimuth or dip SHALL not be null. |
| Requirement | /req/gsml4-basic/linear-trend-plunge-not-null At least one of plunge or trend SHALL not be null. |
| Requirement | /req/gsml4-basic/quantity-range-order The QuantityRange lowerValue SHALL be less than or equal to the upperValue. |
| Requirement | /req/gsml4-basic/quantity-range-repeat The QuantityRange’s value[0] SHALL provide the same value as lowerValue and value[1] SHALL provide the same value as upperValue. |
The Basic package provides a collection of classes representing fundamental geological and geomorphological features (units, structures, and events), earth materials, geologic time, and the relations between them. It limits the number of descriptive properties to match important common use cases, including the INSPIRE geological theme specification [8].
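Several of the requirements listed for this class are simple presence or ordering constraints. The following sketch shows one possible reading of them as plain predicate functions, with `None` standing in for a null value. The function and parameter names are hypothetical, and the interpretation of /req/gsml4-basic/geologicevent-non-null (both named ages present, or a numeric age present) is an assumption.

```python
from typing import Optional, Sequence


def event_age_is_given(older_named_age: Optional[str],
                       younger_named_age: Optional[str],
                       numeric_age: Optional[float]) -> bool:
    """geologicevent-non-null: a named age pair or a numeric age is present."""
    has_named_pair = older_named_age is not None and younger_named_age is not None
    return has_named_pair or numeric_age is not None


def plane_orientation_is_given(polarity, azimuth, dip) -> bool:
    """plane-pol-dip-az-not-null: at least one of the three is present."""
    return any(v is not None for v in (polarity, azimuth, dip))


def linear_orientation_is_given(plunge, trend) -> bool:
    """linear-trend-plunge-not-null: at least one of the two is present."""
    return any(v is not None for v in (plunge, trend))


def quantity_range_is_consistent(lower_value: float,
                                 upper_value: float,
                                 value: Sequence[float]) -> bool:
    """quantity-range-order and quantity-range-repeat, checked together."""
    return (lower_value <= upper_value
            and len(value) == 2
            and value[0] == lower_value
            and value[1] == upper_value)
```

In an XML implementation, equivalent rules would more naturally be expressed in Schematron than in application code.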
8.4.1 Geology Basic
GeologyBasic is a package of classes representing fundamental geological map features and the relations between them. GeoSciML describes a geologic dataset as a series of GeologicFeature occurrences, spatially represented as MappedFeatures. A map is a collection of MappedFeatures. The term “map”, typically understood as a map sheet (a given area on the surface of the earth), denotes only one of the possible kinds of collection of MappedFeatures. Other examples are cross-sections, block diagrams, and even borehole logs (a linear map). A MappedFeature can represent any kind of feature, and GeologicFeature is one of the kinds it can represent. A MappedFeature identifies what it represents using its “specification” association.
Figure 11 shows the fundamental relationships between a MappedFeature and the GeologicFeature. GeologicFeature is further subtyped into GeologicUnit, GeologicStructure, GeomorphologicFeature and GeologicEvent.
8.4.1.1 GeologicFeature
The abstract GeologicFeature class represents a conceptual feature that is hypothesized to exist coherently in the world. It corresponds with a “legend item” from a traditional geologic map and its instance acts as the “description package”. The description package is classified according to its intended purpose as a typicalNorm, definingNorm or instance. A GeologicFeature can be used outside the context of a map (it can lack a MappedFeature), for example when describing typical norms (the properties expected of a feature) or defining norms (the properties a feature is required to have in order to be classified in a group, such as a given geologic unit). A GeologicFeature appearing on a map is considered to be an “instance”.
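The relationship between a MappedFeature, its “specification” and the purpose of a description package can be sketched with simplified data classes. These are illustrative stand-ins, not the model's definitions: `FeatureDescription`, the `shape` string and the example names are hypothetical, and only the three purpose values named in the paragraph above are used.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class DescriptionPurpose(Enum):
    INSTANCE = "instance"
    TYPICAL_NORM = "typicalNorm"
    DEFINING_NORM = "definingNorm"


@dataclass
class FeatureDescription:
    """Simplified stand-in for a concrete GeologicFeature subtype."""

    name: str
    purpose: DescriptionPurpose


@dataclass
class MappedFeature:
    shape: str  # placeholder for a geometry, here a WKT string
    specification: Optional[FeatureDescription] = None


# A norm exists on its own, without any MappedFeature pointing at it.
norm = FeatureDescription("Example Formation", DescriptionPurpose.TYPICAL_NORM)

# A feature that appears on a map is an "instance" and is reached through
# the MappedFeature's specification association.
observed = FeatureDescription("Example Formation outcrop", DescriptionPurpose.INSTANCE)
polygon = MappedFeature("POLYGON((0 0, 1 0, 1 1, 0 0))", specification=observed)
```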