I. Executive Summary
The Federated Marine Spatial Data Infrastructure (FMSDI) Pilot is an OGC Innovation Program initiative with the objective of enhancing Marine Spatial Data Infrastructures (MSDIs), better understanding MSDI maturity, and demonstrating the power of FAIR (Findable, Accessible, Interoperable, Reusable) data in the context of the marine environment. It is organized into three phases. This Engineering Report is based on the work of the second phase.
Motivation and Objectives
One of the challenges of Marine Protected Area (MPA) data is making it available to a wide variety of users, including those outside the MSDI domain, such as fishermen, resource extractors, utilities, tourists, or recreational boaters. These users, who lack direct access to the MPA databases holding the data they need to perform their activities, rely on smaller consumer-facing applications, which in turn rely on Application Programming Interfaces (APIs) to request and consume the data they work with.
The use of standards makes it easier for developers to build software applications. The more robust these standards are, the easier it is to build applications, and the more diverse the audiences that can utilize them in a variety of scenarios. Because of this, the demonstration of standards related to both MPA data and the APIs they are served through becomes of key importance.
Within this context, this pilot addressed the following research questions:
What stages do the data go through from MPA source data to S-122?
What steps were taken in the server development to standardize the various data into an S-122 dataset?
What stages do the data go through in a fusion scenario, regarding format, metadata, and so on?
What steps were taken in the server development to synthesize the data and create digestible data for clients?
Which OGC API standards were leveraged to perform transformations on these data?
How were the data processed by the clients, and which views were used?
What modifications do the S-122 and OGC API standards need to better address the use of MPA data?
Technical Overview
The activities were divided into two concurrent stages. The first stage focused on demonstrating the transformation of MPA data into the S-122 standard and the interoperability achieved when the data are served through OGC APIs. The second stage went beyond marine protected areas and opened the examination to a broader set of data and standards. These stages saw the demonstration of seven components.
One Baltic/North Sea Server (D100): One processing server that ingested the data from various sources of the Baltic Sea / North Sea providers. These MPA data were brought into the server and transcribed into the International Hydrographic Organization (IHO) S-122 Marine Protected Area standard.
Two Baltic/North Sea Clients (D101 and D102): These client services demonstrated different viewpoints and methods for digesting the data from the server and standardized data.
Two Data Fusion Servers (D120 and D121): These servers ingested various data inputs including MPA data. D120 was implemented using the OGC API — Features standard, while D121 was implemented using the OGC API — Environmental Data Retrieval (EDR) standard.
Two Data Fusion Clients (D122 and D123): These clients ingested the outputs from the two servers and displayed the data to end users.
Lessons Learned and Recommendations
The development, testing, and demonstrations carried out throughout this Pilot provided lessons learned for all of the Pilot’s participants. The following list summarizes the lessons learned and recommendations for IHO and OGC standards that resulted from the activities of this Pilot.
OGC Standards
Using OGC API — Features to Serve MPA Data: The implementations of OGC API standards were found to be of great use because of their ability to retrieve data from authoritative sources, the ease of automation by client services, the format neutrality of such APIs, and the fine control over retrieval, which is not present in file-based encodings. This control allows for simple filters on data fields, compound and spatial queries, and simpler queries against bounding boxes.
Accessing MPA Data Through OGC API — Features: Requesting all MPA features proved to be computationally expensive. Authentication and authorization were identified as areas that were not sufficiently explored.
Spatial Filtering Using a Bounding Box Query: The bbox spatial filter, which is specified in Part 1 of the OGC API — Features standard, returned unnecessary data to clients, which can be a challenge for users operating in low-connectivity environments where bandwidth is at a premium.
Using Bounding Boxes to Represent Features in Denied, Degraded, Intermittent, or Limited Bandwidth (DDIL) environments: Bounding boxes were created for some of the feature collections to reduce the complexity and size of the MPAs in order to make them suitable for DDIL environments.
EDR API and Discrete Global Grid System (DGGS): While the EDR API has been shown in the pilot project to provide a naïve client with the tools to successfully “explore” DGGS data, any client that requests a location from a DGGS server must understand the DGGS geometry it is receiving.
Features API vs. EDR API: While the Features API proved to do well at serving distinctly identifiable entities such as ships, routes, lakes, and zones, the EDR API provided a great way of accessing data that is not easily mapped to identifiable entities, such as subsets of data about an area of interest.
Common Query Language (CQL) Support in Service Metadata: The lack of information about CQL support in the API Service Metadata proved to be a downside.
Filtering Complex Features: Searching data within complex features was found to complicate the filtering processes for the clients.
Feature Styling in the Features and EDR APIs: The lack of styling support in both the Features and EDR APIs was not ideal.
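The bbox and property filtering described above can be sketched as a simple request-building helper. The endpoint and collection name below are hypothetical, used only to illustrate the query parameters defined in OGC API — Features Part 1.

```python
from urllib.parse import urlencode

def build_features_request(base_url, collection_id, bbox=None, limit=None, extra_filters=None):
    """Build an OGC API - Features items request URL.

    bbox: (minLon, minLat, maxLon, maxLat), per OGC API - Features Part 1.
    extra_filters: simple equality filters on feature properties.
    """
    params = {}
    if bbox is not None:
        params["bbox"] = ",".join(str(v) for v in bbox)
    if limit is not None:
        params["limit"] = str(limit)
    if extra_filters:
        params.update(extra_filters)
    url = f"{base_url}/collections/{collection_id}/items"
    if params:
        url += "?" + urlencode(params, safe=",")
    return url

# Hypothetical endpoint and collection name for illustration only.
url = build_features_request(
    "https://example.org/ogcapi",
    "S122_MarineProtectedArea",
    bbox=(9.0, 54.0, 15.0, 58.0),  # rough Baltic approaches
    limit=100,
)
```

Restricting requests with `bbox` and `limit` in this way is what allowed clients to avoid the expensive "request all MPA features" pattern noted above.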
IHO Standards
The S-100/S-122 Model from a Server Perspective: The current S-122 model was found to be fairly basic in terms of its representation of MPA data. This encoding, whilst a good fit for maritime use cases, did not currently reflect the broader application of MPAs in different geospatial agencies and the richer attribution required for those uses.
A number of proposed suggestions have been implemented in the model used for the server, specifically:
Added three new values to categoryOfMarineProtectedArea;
Added a complex attribute to record other designations;
Added a dimensions complex attribute to record the calculated dimensions of the MPA;
Added an enactment date (mandatory) and an update date (optional) to all MPAs;
Added information representing Management Plans;
Added multiple identifiers to all Feature Types;
Added producerCode as a simple attribute to Agency; and
Added regional to jurisdiction.
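To make the proposed additions concrete, the sketch below expresses a single MPA as a GeoJSON-style Python dict carrying the new attributes. All property names and values here are illustrative assumptions for this report, not the normative S-122 schema.

```python
# Illustrative S-122 MPA feature carrying the pilot's proposed additions.
# Property names below are assumptions for illustration only.
mpa_feature = {
    "type": "Feature",
    "id": "mpa-example-001",
    "geometry": {
        "type": "Polygon",
        "coordinates": [[[9.0, 54.0], [9.5, 54.0], [9.5, 54.5], [9.0, 54.0]]],
    },
    "properties": {
        "featureName": [{"name": "Example MPA", "language": "eng"}],
        "categoryOfMarineProtectedArea": "IUCN Category IV",
        "otherDesignations": [{"designation": "Natura 2000"}],    # proposed complex attribute
        "dimensions": {"areaKm2": 1234.5},                        # proposed calculated dimensions
        "enactmentDate": "2010-06-01",                            # proposed, mandatory
        "updateDate": "2021-03-15",                               # proposed, optional
        "identifiers": ["urn:example:mpa:001", "WDPA:555512345"], # multiple identifiers
        "jurisdiction": "regional",                               # proposed jurisdiction value
    },
}

def check_proposed_attributes(feature):
    """Verify the mandatory addition is present and report which optional ones are."""
    props = feature["properties"]
    assert "enactmentDate" in props  # mandatory per the proposal
    return sorted(k for k in ("dimensions", "otherDesignations", "updateDate") if k in props)

present = check_proposed_attributes(mpa_feature)
```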
Client Perspective of S-122: The MPA Feature ID was not clearly ‘universally persistent’. In addition, managing the MPA feature name as a complex type made it difficult to manage queries based on the well-known name of an MPA. Furthermore, the S-122 specification did not address the concept of marine protected area networks, preventing the modeling of the ‘synergistic’ properties of an MPA network and its application toward a common objective.
Authority Names: The authority is represented as a featureName in the S-100 model and is affected by the same naming convention issues identified in previous works. The client application must have prior knowledge of the locale-specific authority name in order to provide this information as part of the query filter.
Use of Bounding Boxes by the S-122 Product Specification: The S-122 product specification supports bounding boxes for individual features; however, the Part 10b Geography Markup Language (GML) encoding does not specify their use.
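Computing a per-feature bounding box, for example to attach as the optional GeoJSON `bbox` member on a simplified feature, is straightforward. A minimal sketch (the sample polygon is invented for illustration):

```python
def feature_bbox(feature):
    """Compute the [minLon, minLat, maxLon, maxLat] bounding box of a
    GeoJSON Polygon/MultiPolygon feature by walking its coordinate arrays."""
    def walk(coords):
        # Recurse until we reach [lon, lat] positions.
        if isinstance(coords[0], (int, float)):
            yield coords
        else:
            for c in coords:
                yield from walk(c)
    xs, ys = zip(*((pt[0], pt[1]) for pt in walk(feature["geometry"]["coordinates"])))
    return [min(xs), min(ys), max(xs), max(ys)]

poly = {
    "type": "Feature",
    "geometry": {
        "type": "Polygon",
        "coordinates": [[[9.0, 54.0], [9.5, 54.2], [9.3, 54.6], [9.0, 54.0]]],
    },
}
bbox = feature_bbox(poly)
```

Replacing a detailed MPA boundary with its bounding box in this way is the kind of simplification the pilot used to keep payloads small for DDIL environments.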
GeoJSON
Challenges With The GeoJSON Encoding: The use of GeoJSON as an encoding is not part of S-100 itself. However, its ubiquity as a format for exchange of geospatial data raises the possibility of its use for modeling S-100 General Feature Model (GFM) data.
Use of GeoJSON: The GeoJSON format provides several advantages for interoperability because of wide adoption and support in mapping software. However, the return of non-standardized JSON objects made it difficult to know how to display the information to the user in a meaningful way.
CovJSON vs. GeoJSON: For an EDR service, Coverage JSON would represent the data better than GeoJSON.
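The display problem noted above, namely non-standardized JSON property objects arriving at a generic client, can be mitigated by flattening nested properties into dotted key/value pairs that can at least be listed to the user. A minimal sketch; the sample payload is invented to resemble the complex feature-name attributes the clients encountered:

```python
import json

def flatten_properties(props, prefix="", sep="."):
    """Flatten nested (non-standardized) GeoJSON properties into
    dotted key/value pairs for generic display."""
    flat = {}
    for key, value in props.items():
        name = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten_properties(value, name, sep))
        elif isinstance(value, list):
            for i, item in enumerate(value):
                if isinstance(item, dict):
                    flat.update(flatten_properties(item, f"{name}[{i}]", sep))
                else:
                    flat[f"{name}[{i}]"] = item
        else:
            flat[name] = value
    return flat

# Hypothetical complex-attribute payload similar to what the clients saw.
props = json.loads(
    '{"featureName": [{"name": "Kattegat MPA", "language": "eng"}], "status": "designated"}'
)
flat = flatten_properties(props)
```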
Recommended Future Work
Future work should build upon the findings that emerged from the development and testing of these components, and answer questions that were left out of the scope. The following list summarizes the recommendations for future work that resulted from the activities of this Pilot.
OGC Standards
Denied, Degraded, Intermittent, or Limited Bandwidth (DDIL) Environments: Further investigation is required on how to optimize the retrieval and storage of MPA feature collections as a GeoPackage using a supported OGC file encoding.
Further Enhancing MPA Filters: There is a need to further develop the client application alongside a server implementing the OGC API — Features — Part 3: Filtering candidate standard for additional filtering to reduce the amount of data requested from the server.
Using Vector Tiles: Due to the many benefits that vector tiles offer, especially for users operating in a DDIL environment, future work should explore using vector tiles for MPA data.
Explore Potential Solutions to Challenges With DGGS: While the D121 Fusion Server was implemented with a DGGS, the interface also shows promise for further development.
Beyond the core EDR query capabilities such as query by position, radius, trajectory, bounding box, etc., the DGGS service could provide a refined query interface to allow the client to specify the tessellation level for an area of interest.
Also of benefit would be support for temporal extents in which the client would provide both a spatial and temporal extent over which the DGGS service would provide aggregated datums.
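The core EDR queries mentioned above can be sketched as URL construction against a position endpoint. The endpoint and collection name are hypothetical; the `coords` (WKT) and `datetime` (ISO 8601 interval) parameters follow the OGC API — EDR standard.

```python
from urllib.parse import urlencode, quote

def build_edr_position_query(base_url, collection_id, lon, lat,
                             parameter_names=None, datetime_interval=None):
    """Build an OGC API - EDR position query URL.

    coords uses WKT; datetime_interval is an ISO 8601 instant or interval,
    e.g. "2022-06-01T00:00:00Z/2022-06-30T23:59:59Z".
    """
    params = {"coords": f"POINT({lon} {lat})"}
    if parameter_names:
        params["parameter-name"] = ",".join(parameter_names)
    if datetime_interval:
        params["datetime"] = datetime_interval
    query = urlencode(params, safe=",()/:", quote_via=quote)
    return f"{base_url}/collections/{collection_id}/position?{query}"

# Hypothetical DGGS-backed collection name for illustration.
url = build_edr_position_query(
    "https://example.org/edr", "sea_surface_temp",
    lon=10.5, lat=55.25,
    parameter_names=["sst"],
    datetime_interval="2022-06-01T00:00:00Z/2022-06-30T23:59:59Z",
)
```

A refined DGGS interface as proposed above might add a further parameter for the tessellation level; that parameter is not part of the current EDR standard and is left out here.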
IHO Standards
S-122 and ISO 19152: More work is required to map examples and test the application of ISO 19152 for some providers.
Establishing a Data Schema for DDIL Environments: If using a DDIL twin for any data is to be considered going forward, then there needs to be some consideration for what the data schema would need to be.
GeoJSON
GeoJSON Elements that Remain to be Worked Out: Several elements remain to be worked out, including information types, relationships, identifiers, other standardized metadata, and additional geometry types.
Further Explorations on GeoJSON and S-100: Most GeoJSON data tends to have the same attributes for each feature. So, while S-100 data can be expressed in conformant GeoJSON, it may pose challenges for some implementers, and perhaps accommodations can be made. Furthermore, the ability to seamlessly aggregate datasets together in GeoJSON would be a good step forward for working with S-100.
GeoJSON and Moving Features: For future work, application of the GeoJSON Moving Features specification would be well-positioned to represent vessel traffic.
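The seamless aggregation mentioned above is simple at the GeoJSON level, which is part of the format's appeal. A minimal sketch, deliberately ignoring the identifier reconciliation and richer metadata that S-100 would require (the sample collections are invented):

```python
def merge_feature_collections(*collections):
    """Naively aggregate several GeoJSON FeatureCollections into one.

    Assumes feature ids are already unique across inputs; a production
    aggregator would also need to reconcile duplicates and metadata."""
    return {
        "type": "FeatureCollection",
        "features": [f for fc in collections for f in fc["features"]],
    }

fc_a = {"type": "FeatureCollection",
        "features": [{"type": "Feature", "id": "a1", "geometry": None, "properties": {}}]}
fc_b = {"type": "FeatureCollection",
        "features": [{"type": "Feature", "id": "b1", "geometry": None, "properties": {}}]}
merged = merge_feature_collections(fc_a, fc_b)
```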
I.A. Document contributor contact points
All questions regarding this document should be directed to the editor or the contributors:
Table — Contacts
Name | Organization | Role |
---|---|---|
Sergio Taleisnik | Skymantics | Editor |
Terry Idol, PhD. | Skymantics | Editor |
Sara Saeedi, PhD. | OGC | Collaborator |
Sina Taghavikish | OGC | Collaborator |
Jason MacDonald | Compusult | Collaborator |
Matthew Jestico | Helyx | Collaborator |
Jonathan Pritchard | IIC Tech | Collaborator |
Glenn Laughlin | Pelagis | Collaborator |
Marta Padilla Ruiz | University of Calgary | Collaborator |
Perry Peterson | University of Calgary | Collaborator |
II. Keywords
The following are keywords to be used by search engines and document catalogues.
Federated Marine, SDI, IHO, OGC, Spatial Data Infrastructure
III. Security considerations
No security considerations have been made for this document.
IV. Submitting Organizations
The following organizations submitted this Document to the Open Geospatial Consortium (OGC):
- Skymantics, LLC
Towards a Federated Marine SDI: IHO and OGC standards applied to Marine Protected Area Data Engineering Report
1. Normative references
The following documents are referred to in the text in such a way that some or all of their content constitutes requirements of this document. For dated references, only the edition cited applies. For undated references, the latest edition of the referenced document (including any amendments) applies.
IHO: IHO S-57, IHO Transfer Standard for Digital Hydrographic Data. International Hydrographic Organization, Monaco (2000–). https://iho.int/uploads/user/pubs/standards/s-57/31Main.pdf.
IHO: IHO S-100, IHO Universal Hydrographic Data Model. International Hydrographic Organization, Monaco (2018–). https://iho.int/uploads/user/pubs/standards/s-100/S-100%20Ed%204.0.0%20Clean%2017122018.pdf.
IHO: IHO S-122, Marine Protected Areas. International Hydrographic Organization, Monaco (2019–). https://registry.iho.int/productspec/view.do?idx=73=S-122=5=ALL=product_ID
OGC API — Processes, https://ogcapi.ogc.org/processes/
OGC API — Features, https://ogcapi.ogc.org/features/
OGC API — Records, https://ogcapi.ogc.org/records/
OGC API — Discrete Global Grid System, https://ogcapi.ogc.org/dggs/
OGC API — Environmental Data Retrieval, https://ogcapi.ogc.org/edr/
2. Terms, definitions and abbreviated terms
This document uses the terms defined in OGC Policy Directive 49, which is based on the ISO/IEC Directives, Part 2, Rules for the structure and drafting of International Standards. In particular, the word “shall” (not “must”) is the verb form used to indicate a requirement to be strictly followed to conform to this document and OGC documents do not use the equivalent phrases in the ISO/IEC Directives, Part 2.
This document also uses terms defined in the OGC Standard for Modular specifications (OGC 08-131r3), also known as the ‘ModSpec’. The definitions of terms such as standard, specification, requirement, and conformance test are provided in the ModSpec.
For the purposes of this document, the following additional terms and definitions apply.
2.1. DDIL
Denied, Degraded, Intermittent, or Limited Bandwidth environments. Used to describe scenarios where connectivity is not ideal and actions need to be taken to guarantee normal or minimal operation of software applications.
2.2. Interoperability
Capability to communicate, execute programs, or transfer data among various functional units in a manner that allows the user to have little or no knowledge of the unique characteristics of those units [ISO 19119].
2.3. Marine Spatial Data Infrastructure (MSDI)
A specific type of Spatial Data Infrastructure (SDI) with a focus on the marine environment.
2.4. Abbreviated terms
API
Application Programming Interface
DDIL
Denied, Degraded, Intermittent, or Limited Bandwidth Environments
DGGS
Discrete Global Grid System
EDR
Environmental Data Retrieval
ER
Engineering Report
FAIR
Findable, Accessible, Interoperable, and Reusable
FMSDI
Federated Marine Spatial Data Infrastructure
IHO
International Hydrographic Organization
MPA
Marine Protected Area
MSDIWG
Marine Spatial Data Infrastructures Working Group
OGC
Open Geospatial Consortium
SDI
Spatial Data Infrastructure
TIE
Technology Integration Experiment
3. Subject
This Engineering Report (ER) summarizes the demonstrations, findings, and recommendations that emerged from the second phase of the OGC Federated Marine Spatial Data Infrastructure (FMSDI) Pilot. The goal of this initiative was to further advance the interoperability and usage of Marine Protected Area (MPA) data through the implementation of the IHO standard S-122 and several OGC API standards.
This ER describes a solution architecture consisting of a collection of interoperable components developed to demonstrate technologies that helped to achieve the objectives of this Pilot’s phase. This document describes a server built to serve MPA data through an OGC API — Features endpoint, as well as two servers that combined MPA data with additional datasets and served the results through an OGC API — Features endpoint and an OGC API — EDR endpoint. This document also describes the three clients built to consume, under different scenarios, the data offered by these servers. Finally, this ER captures lessons learned, recommendations for IHO and OGC API standards, and recommendations for future work.
4. Overview
Due to the depth of the topic and the complexities of developing a Federated Marine Spatial Data Infrastructure, this initiative, and therefore this Engineering Report, will remain an ongoing, living document beyond its initial publication. New versions and additions will extend the development and structure of a Federated Marine Spatial Data Infrastructure in both the technology and the locations of focus and testing.
Section 5: Towards an FMSDI (Initiative Overview)
This section introduces the FMSDI Pilot, describing past initiatives, the Pilot’s problem statement and motivation, its phases, and the tasks within those phases.
Section 6: Background
This section describes the technologies and knowledge base that make up the baseline for this Pilot.
Section 7: Research Objectives and Technical Architecture
This chapter describes the motivations that guided this Pilot’s work, the research objectives, and the component architecture that was demonstrated to address this Pilot’s goals.
Section 8: Baltic Sea/North Sea Server
This section describes the Baltic Sea/North Sea Server. This component was designed to ingest the MPA data from various sources of the Baltic Sea / North Sea providers, transform the data to comply with the S-122 standard, and offer it through an API built using OGC API standards. This component was demonstrated by IIC Technologies.
Section 9: Baltic Sea/North Sea Client 1
This section describes the Baltic Sea/North Sea Client. This component was designed to ingest MPA data from the Baltic Sea/North Sea Server developed in this Pilot, consume it in S-122 format, and demonstrate different viewpoints and methods for digesting data from the server. This component was demonstrated by Pelagis.
Section 10: Baltic Sea/North Sea Client 2
This section describes the Baltic Sea/North Sea Client 2. This component was designed to ingest MPA data from the Baltic Sea/North Sea Server developed in this Pilot. It was designed to demonstrate an alternative viewpoint and method to the Baltic Sea/North Sea Client 1 for digesting S-122 data. The viewpoint that this client utilized was that of a less connected service, such as that on an older vessel with limited technology or connectivity. This component was demonstrated by Helyx Secure Information Systems Ltd.
Section 11: Data Fusion Server 1
This section describes the Data Fusion Server #1. This component was designed to ingest MPA datasets as well as other datasets, combine them, and serve them through an API built using the OGC API standards. This component was demonstrated by IIC Technologies.
Section 12: Data Fusion Server 2
This section describes the Data Fusion Server #2. This component successfully published feature and coverage data sources processed into an Icosahedral Snyder Equal Area Aperture 3 Hexagonal (ISEA3H) DGGS for discovery as EDR Collections and EDR Parameter Queries. This component was demonstrated by the University of Calgary.
Section 13: Data Fusion Client 1
This section describes the Data Fusion Client #1. This component was designed to ingest the various datasets (S-122 and others) served by the Data Fusion Servers developed throughout this Pilot. This component was demonstrated by Compusult.
Section 14: Data Fusion Client 2
This section describes the Data Fusion Client #2. This component was designed to ingest various datasets served by the Data Fusion Servers federated with 3rd party open data services. This component was demonstrated by Pelagis.
Section 15: Technology Integration Experiments (TIEs)
The FMSDI Pilot Technology Integration Experiments (TIEs) focused on the exchange of MPA data through OGC APIs. Each TIE explored under different circumstances the potential of OGC API standards and IHO standards with the objective of relaying MPA data. This section describes each TIE and outlines its expected and actual results.
Section 16: Challenges and Recommendations for OGC Standards and IHO Standards
This section outlines a prescriptive list of challenges and lessons learned through the different stages of the initiative. This section also includes recommendations for the various standards utilized through the initiative.
Section 17: Recommendations for Future Works
This section outlines a descriptive list of various items that could be expanded upon in future initiatives or for the sponsors to utilize and build from.
5. Towards an FMSDI (Initiative Overview)
The Federated Marine Spatial Data Infrastructure (FMSDI) Pilot is an OGC Innovation Program initiative with the objective of enhancing Marine Spatial Data Infrastructures (MSDIs), better understanding MSDI maturity, and demonstrating the power of FAIR (Findable, Accessible, Interoperable, and Reusable) data in the context of the marine environment.
A Marine Spatial Data Infrastructure (MSDI) is a specific type of Spatial Data Infrastructure (SDI) with a focus on the marine environment. It is not only a collection of hydrographic products, but is also an infrastructure that promotes the interoperability of data at all levels (e.g., regional, national, and international). Like all SDIs, it tries to enhance the discoverability, accessibility, and interoperability of marine data. By doing so it supports a wider, non-traditional user-base of marine data far beyond what is typically used for navigation.
This initiative, which is conducted under the OGC Innovation Program, builds directly on what was accomplished earlier in the year through the Federated Marine Spatial Data Infrastructure Pilot 2021. The 2021 pilot in turn built on the work of prior initiatives, such as the Marine Spatial Data Infrastructure Concept Development Study, the Maritime Limits and Boundaries Pilot, and the Arctic Spatial Data Infrastructure Pilot.
5.1. Problem Statement and Motivation
Ocean and marine data are recognized as valuable resources that tend to have a high cost of acquisition. Large quantities of these data are collected and stored all over the world for a wide variety of purposes by a variety of public and private entities. Due to their importance and value, these data should be well managed and made as widely available to end users as possible for a variety of uses, including planning, policy and decision making, marine management, scientific research, and economic activities.
The collection, protection and sharing of marine data provides enormous societal benefits. Data and information on the state and variability of the marine environment is crucial for understanding changes that may result from human activity, including the effects of human-induced climate change and ocean acidification.
Currently, Government Agencies, research institutions, and the private sector provide a considerable investment in marine monitoring and observation, data sharing and assembly, and downstream services. As a result, significant progress has been made to collect, aggregate, and make publicly available the data and information derived from monitoring and observing the Marine environment.
However, data-sharing initiatives still face common challenges in their efforts to unlock the full societal and economic potential of the wealth of marine data and observations at national, regional, or local levels. The ability to effectively share, use, and re-use geospatial information and applications across and between governments and Non-Government Organizations (NGOs) is dependent upon having an effective SDI already in-place.
The motivations of the FMSDI Pilot include the following.
Demonstration — A practical technology demonstration from global community experts showcasing a federated Marine SDI for selected Land/Sea use cases. Possible examples include use cases for the Arctic, European coastal regions, and a southeast Asian region. The demonstration will show how using OGC, IHO, and other open standards gives the community the ability to find, obtain, use, share, interoperate with, and reuse data.
Impact on OGC Standards — Lessons learned, gaps, and the need for changes to the OGC Standards Baseline will be summarized in an Engineering Report that will inform the OGC Standards Program.
Impact on IHO Standards — Practical testing of relevant S-100 based IHO standards will accelerate the process for adoption and implementation of IHO standards. The resulting Engineering Report will help to inform the work of the IHO HSSC Working Group and will provide inputs to enhance the framework and its component standards.
Development of the Marine Spatial Data Infrastructure (MSDI) Maturity Model — Provide a roadmap for MSDI development.
Several challenges facing Marine SDI can be identified:
lack of an integrated policy and operational framework to facilitate rapid acceptance, qualification, ingest, and use of relevant geospatial information from a range of government, commercial providers, and citizens;
the current focus on products supporting a single customer group;
the inability with existing metadata approaches to quickly discover and understand which information sources are most useful in the context of a user’s need;
the inability to properly fuse and synthesize multiple data sources; and
the need for a persistent platform to organize and manage marine information and the tools necessary for collaboration among organizations to fully utilize the variety of marine data.
5.2. Previous Marine SDI Initiatives
This initiative builds on what has been accomplished in previous initiatives: the Marine Spatial Data Infrastructure Concept Development Study; the Maritime Limits and Boundaries Pilot; and the Arctic Spatial Data Infrastructure Pilot. The Marine Spatial Data Infrastructure Concept Development Study summarized the efforts and information gathered from a Request for Information which focused on in-depth data requirements, architecture, and standards needs for a Marine Spatial Data Infrastructure. The Maritime Limits and Boundaries Pilot worked to build a detailed implementation for testing S-121 Standard data. The Arctic Spatial Data Infrastructure Pilot aimed to utilize international standards to support a spatial data exchange focusing on the complex issues of Arctic marine space.
5.3. FMSDI Pilot Phases
The FMSDI pilot started in August 2021 and is currently planned to run until December 2022. It is organized into three phases: Phase I: Initial RFIs and Datasets Overview; Phase II: IHO and OGC standards applied to MPAs; and Phase III: Overview of the next phase.
5.3.1. Phase I: Initial RFIs and Datasets Overview
The already completed first phase included the Marine Data Availability and Accessibility Study (MDAAS). MDAAS began with the release of a Request for Information (RFI) to help determine data availability and accessibility of Marine Protected Areas (MPA, IHO S-122) and other marine data in the North Sea and Baltic Sea. The MDAAS further helped assess interoperability, availability, and usability of data, geospatial Web services, and tools across different regions and uses of marine spatial data. MDAAS also provided identification of gaps and helped define reference use-cases and scenarios for use in future FMSDI Pilot activities.
Phase I brought together diverse stakeholders from the global marine community to assess the current state of Marine SDI. The RFI was used to gather knowledge from marine domain stakeholders and contributors. A summary of the RFI responses is available in Annex A, and an overview of the datasets is provided in Table 1.
Table 1 — Phase I (RFI) Dataset Overview
Organization | Notes | Link |
---|---|---|
HELCOM — Baltic Marine Environment Protection Commission | Reported tabular data is collected and made available via the HELCOM MPA database | http://mpas.helcom.fi |
 | Spatial data on MPA areas is also available as a spatial dataset (shapefile). The spatial data can be accessed via a web service | https://maps.helcom.fi/website/mapservice/?datasetID=d27df8c0-de86-4d13-a06d-35a8f50b16fa |
 | Metadata record for the above shapefile | http://metadata.helcom.fi/geonetwork/srv/eng/catalog.search#/metadata/d27df8c0-de86-4d13-a06d-35a8f50b16fa |
 | OGC Web Map Service (WMS) | https://maps.helcom.fi/arcgis/services/MADS/Biodiversity/MapServer/WMSServer?request=GetCapabilities&service=WMS |
 | ArcGIS REST | https://maps.helcom.fi/arcgis/rest/services/MADS/Biodiversity/MapServer/54 |
UK Hydrographic Office | UK Offshore Marine Protected Areas/JNCC Resource Hub | https://hub.jncc.gov.uk/assets/ade43f34-54d6-4084-b66a-64f0b4a5ef27 |
The Danish Agency for Culture and Palaces | Data can be downloaded through two locations | https://www.kulturarv.dk/ffreg/ |
 | The data can also be accessed via web services | https://www.kulturarv.dk/ffpublic/wms/ows?service=wms&version=1.1.0&request=GetCapabilities |
Danish Geodata Agency | Protected areas can be retrieved from the Danish Environmental Portal, which is a public partnership to make environmental data easily available | https://arealinformation.miljoeportal.dk/html5/index.html?viewer=distribution |
 | Geodata-info is the national metadata portal for data discovery | www.geodata-info.dk |
 | Dataforsyningen — the Danish data and map supply — provides access to free public data | https://dataforsyningen.dk/ |
Finland Traficom | All Traficom datasets can be found on its geoportal | https://julkinen.traficom.fi/oskari/ |
 | Calls to interfaces | https://www.traficom.fi/en/statistics-and-publications/spatial-dataset-material/calls-interfaces |
 | Additional resources | https://kartta.paikkatietoikkuna.fi/?lang=en |
Lithuanian Transport Safety Administration | Uses public data from the national spatial data center | www.geoportal.lt |
German Federal Maritime and Hydrographic Agency | The GeoSeaPortal is part of the integrated German and European MSDI network | https://www.geoseaportal.de/mapapps/?lang=en |
Swedish Hydrographic Organization | Many GIS stakeholders rely on the national SDI for data discovery | www.geodata.se/geodataportalen |
Flemish Hydrography | Data custodian for various relevant datasets as they are included on navigational charts. It concerns the six MPAs described in the Marine Spatial Plan. | https://www.geopunt.be/catalogus/webservicefolder/688b3a9c-025b-4872-b1c6-06126a821e25 |
Geoscience Australia | A whole-of-government data access and visualization application. A Web Coverage Service (WCS) harvester compiles web services into a common framework | https://nationalmap.gov.au/ |
 | Maritime boundaries thematic mapping application. Internal curated datasets are made available with analysis tools | http://maps.ga.gov.au/interactive-maps/#/theme/amsis |
 | Seafloor thematic mapping application. Internal curated datasets are made available with analysis tools | https://portal.ga.gov.au/persona/marine |
 | Location Index (Loc-I) is a framework that provides a consistent way to seamlessly integrate data on people, business, and the environment. Open datasets are converted to linked data for research and development | http://www.locationindex.org/ |
AusSeabed | Seafloor topography GeoTIFFs available for download | https://portal.ga.gov.au/persona/marine |
 | AWS S3, eCat GeoNetwork | https://ecat.ga.gov.au/geonetwork/srv/eng/catalog.search#/home |
5.3.2. Phase II: IHO and OGC standards applied to MPAs
The second phase, on which this Engineering Report is based, extended the MPA focus of the first phase by examining the various data services in depth and beginning to build out an S-122 demonstration model, including exploration of the S-100 data specifications and of how other data (terrestrial, meteorological, Earth observation, etc.) can be combined to create a more holistic view of the region of focus. In addition, Phase II designed a Marine Spatial Data Infrastructure (MSDI) maturity model to provide a roadmap for MSDI development. The maturity model is described in a separate Engineering Report.
This Pilot phase was broken into three segments of focus:
Task 1: Developing a federation of S-122 Standard Marine Protected Area (MPA) data sets in the Baltic/North Sea area;
Task 2: Exploring the data fidelity, mobility, and versatility of the S-100 Product Specification as well as other marine standards and data; and
Task 3: Designing a UNGGIM-IGIF (United Nations Global Geospatial Information Management-Integrated Geospatial Information Framework) derived Marine Spatial Data Infrastructure maturity model which provides a roadmap for MSDI development.
5.3.2.1. Task 1: BNS Overview (Scenarios and Architecture)
The Baltic/North Sea use case looked at utilizing numerous Marine Protected Area data and related data to identify Marine Protected Areas within the Baltic/North Sea. To accomplish this, a federation of MPA data was created from the various countries that have an interest in the Baltic/North Sea region.
5.3.2.2. Task 2: Fusion Overview (Scenarios and Architecture)
The second stage identified, examined, and expanded upon existing data sets to give them greater fidelity, mobility, and versatility. This went beyond marine protected areas and opened the examination to a broader set of data and standards. These included other data sets and standards that could be utilized to develop a firmer, more holistic view of a region.
5.3.2.3. Task 3: IGIF-MSDI Maturity Roadmap Overview
UKHO has developed an IGIF-MSDI Maturity Framework with international contributions, highlighting that an MSDI is a continual journey and not an end-state of technological sophistication. It asserts that nations are sovereign in deciding what type of MSDI they genuinely require for their national needs, rather than being bound to an externally imposed level of technological exploitation and concomitant expenditure (unless deliberately chosen otherwise). The foundation of this initiative is the UN IGIF Nine Pathways model that lays out the strategic vision and rationale for an All-Domain geospatial infrastructure (air, land, sea, and space), which equitably benefits all socioeconomic stakeholders and sectors of a nation.
The IGIF-MSDI Maturity Framework was developed in conjunction with the UN, IHO, OGC, and World Bank with representation from Denmark (DGA), Singapore (MPA), and the United States (NOAA). As a compact and accessible document its intent is to provide a more quantitative and prescriptive “Quick Start” or “Stepping Stone” for nations beginning their IGIF-aligned MSDI implementation plans. It seeks to supplement, not supplant, any existing resource within this area and is intended to be read alongside referenced publications from the UN, IHO, OGC and World Bank. One of the major outputs from its application is the quantitative baseline rating that a nation (or national agency) can use for self-improvement towards a defined future end-state (not for regional comparisons).
The World Bank, with its many decades of financial expertise in global development programs, is an indispensable partner for realizing this vision. The involvement of the World Bank was crucial in providing answers to the Financing question (incl. costed business cases), alongside Why (UN), What (IHO), and How (OGC). The current World Bank 91-question SDI Diagnostic Toolkit with its Terrestrial Heritage is being augmented with IHO and OGC insights to maximize its benefits to the Marine community, whilst being aligned with the UN IGIF principles (and UN SDGs as a result). These insights from the Hydrographic and Standards communities are modular additions to a condensed subset of the current SDI Diagnostic Toolkit, which itself is an overlap of the World Bank’s “Decision-Maker” and “End-User” question subsets. This ensures that the IGIF-MSDI Maturity Framework is fully interoperable with the World Bank’s full IGIF implementation methodology, of which the SDI Diagnostic toolkit is only the first step.
5.3.3. Phase III: Overview of the next phase
Phase III, which will start with an additional Call for Participation in Summer 2022, will primarily extend the use cases developed in Phase II and add the Arctic region as a new location to the demonstration scenarios.
The Arctic Regional Marine Spatial Data Infrastructure Data Working Group (ARMSDIWG) of the International Hydrographic Organization (IHO) identifies and assesses the status of individual MSDI implementations, considers MSDI policies in related international projects, and cooperates specifically with the spatial data infrastructure for the Arctic. Among other tasks, the working group analyzes how maritime authorities can contribute their spatial information and updates so that this information can easily be collated with other data into a current overall picture of the Arctic region. Through association with the OGC Marine DWG, the ARMSDIWG monitors the development of relevant and applicable OGC standards and activities in the context of marine data services for the Arctic.
The Federated Marine Spatial Data Infrastructure 2022 Pilot will focus on several aspects that contribute to an overarching scenario that helps better understand both the challenges and potential opportunities for coastal communities, ecosystems, and economic activities in the Arctic region. Potential activities may include:
demonstrating interoperability between land and marine data that is necessary to understand coastal erosion (e.g., ocean currents, geology, permafrost characteristics, etc.);
coastal erosion over time, which includes a temporal component (possible study area: Alaska, Canada, Greenland, and Iceland);
defining coastline (highest line) and coastal transition zone (intertidal zone);
effects of climate change and a changing Arctic environment on wildlife migration corridors: land-sea ice-island (caribou) and sea (marine mammals);
demonstrating the role of OGC standards to support the measurement of impacts of coastal erosion (e.g., infrastructure, food safety, traditional activities, wildlife migration, etc.) on coastal areas in the context of a changing Arctic;
mapping of coastal sensitivity to climate change and the impacts on local communities;
investigating the role of vector tiles and style sheets across the land-sea interface; and
a sea-based health and safety scenario incorporating the land/sea interface in the Arctic. This scenario would demonstrate the technology and data used with OGC, IHO, and other community standards in response to a grounding event and the evacuation of a cruise ship or research vessel in the Arctic.
6. Background
This section describes the technologies and knowledge base that make up the baseline for this Pilot.
6.1. Marine Protected Areas (MPAs)
A Marine Protected Area (MPA) is a protected area whose boundaries include an area of the ocean. They include areas of the intertidal or sub-tidal terrain together with their overlying water and associated flora, fauna, historical, and cultural features which have been reserved by law or other effective means to protect part or all of the enclosed environment. For example, MPAs may be established to protect fish species, rare habitat areas, or entire ecosystems.
MPAs can range from simple declarations to protect a resource to areas that are extensively regulated. The degree to which environmental regulations affect shipping varies according to whether MPAs are located in territorial waters, exclusive economic zones, or high seas. These limits are regulated by the Law of the Sea. Most MPAs are located in the territorial waters of coastal states where enforcement can be ensured. MPAs can also be established in a state’s exclusive economic zone or within international waters. [2]
6.2. S-122 Standard
The S-122 Product Specification is intended to encode Marine Protected Area (MPA) information for use in Electronic Chart Display and Information Systems (ECDIS) and other information systems. MPAs are protected areas of seas, oceans, estuaries, or large lakes. Marine Protected Area information may be considered supplementary information that complements the S-101 ENC. [3]
The S-122 product is based on the S-100 General Feature Model (GFM) and is a feature-based vector product. Figure 1 shows how the S-122 application schema is realized from the S-100 GFM. [2]
Figure 1 — Realizations from the S-100 General Feature Model
The S-100 Standard is a framework document that is intended for the development of digital products and services for hydrographic, maritime, and GIS communities. It comprises multiple parts that are based on the geospatial standards developed by the International Organization for Standardization, Technical Committee 211 (ISO/TC211). [4]
6.3. OGC APIs
For several years OGC members have worked on developing a family of Web API standards for the various geospatial resource types. These APIs are defined using the OpenAPI specification. As the OGC API Standards are developed, approved by the OGC, and then implemented by the community, the marine community can subsequently experiment and implement them as well.
The following OGC API Standards and Draft Specifications were used for the development of APIs during this Pilot.
OGC API – Features: a multi-part standard that defines the capability to create, modify, and query vector feature data on the Web and specifies requirements and recommendations for APIs that want to follow a standard way of accessing and sharing feature data. It currently consists of four parts.
OGC API — Features — Part 1: Core. Approved September 2019, this standard defines discovery and query operations. [5]
OGC API — Features — Part 2: Coordinate Reference Systems by Reference. Approved October 2020, this standard extends the core capabilities specified in Part 1: Core with the ability to use coordinate reference system identifiers other than the defaults defined in the core. [6]
Draft OGC API — Features — Part 3: Filtering. Part 3 specifies an extension to the OGC API — Features — Part 1: Core standard that defines the behavior of a server that supports enhanced filtering capabilities. [7]
Draft OGC API — Features — Part 4: Create, Replace, Update and Delete. Part 4 specifies an extension that defines the behavior of a server that supports operations to add, replace, modify or delete individual resources from a collection. [8]
A specification for Version 2 of the Common Query Language (CQL2) is being developed together with Part 3 to standardize a language that is recommended for filter expressions. [9]
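As a concrete illustration, a CQL2 text filter expression can be attached to an items request on a server implementing the draft Part 3. This is a sketch only: the host and collection path are hypothetical, while the property names follow the S-122 MPA model.

```python
from urllib.parse import urlencode

# Hypothetical S-122 collection on an OGC API - Features server that supports
# the draft Part 3 filtering extension. Host and collection id are assumptions.
base = "https://example.org/api/collections/S-122/items"

# CQL2 text expression: select permanent MPAs of a given IUCN category.
cql2_text = ("categoryOfMarineProtectedArea = 'IUCN Category II' "
             "AND status = 'permanent'")

params = urlencode({"filter": cql2_text,
                    "filter-lang": "cql2-text",
                    "limit": 100})
url = f"{base}?{params}"
print(url)
```

The filter is evaluated server-side, so a client only receives the features of interest rather than paging through the whole collection.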
OGC API — Environmental Data Retrieval: The Environmental Data Retrieval (EDR) Application Programming Interface (API) provides a family of lightweight query interfaces to access spatiotemporal data resources by requesting data at a Position, within an Area, along a Trajectory or through a Corridor. A spatiotemporal data resource is a collection of spatiotemporal data that can be sampled using the EDR query pattern geometries. These patterns are detailed in the section describing the Core Requirements Class.
The goals of the EDR API are to make it easier to access a wide range of data through a uniform, well-defined simple Web interface and to achieve data reduction to only the data needed by the user or client while hiding much of the data storage complexity. A major use case for the EDR API is to retrieve small subsets from large collections of environmental data, such as weather forecasts, though many other types of data can be accessed. The important aspect is that the data can be unambiguously specified by spatiotemporal coordinates. [10]
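The EDR query patterns can be sketched as follows. The host, collection identifier, and parameter name below are assumptions; the query endpoints (position, area) and their parameters (coords, parameter-name, datetime) follow the conventions of the EDR API standard.

```python
from urllib.parse import urlencode

# Illustrative EDR requests against a hypothetical environmental data server.
base = "https://example.org/edr/collections/sea-surface-temperature"

# Position query: sample the collection at a single point, over a time range,
# for one named parameter only (data reduction at the server).
position_url = base + "/position?" + urlencode({
    "coords": "POINT(10.5 55.2)",
    "parameter-name": "sst",
    "datetime": "2022-05-01T00:00:00Z/2022-05-02T00:00:00Z",
})

# Area query: retrieve only the data falling inside a polygon of interest.
area_url = base + "/area?" + urlencode({
    "coords": "POLYGON((6.5 53.3,6.5 53.9,8.6 53.9,8.6 53.3,6.5 53.3))",
    "parameter-name": "sst",
})
print(position_url)
print(area_url)
```

Note that the geometry is passed as Well-Known Text in the coords parameter, which is what allows the same lightweight query pattern to work across very different backing stores.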
Draft OGC API – Styles: This draft API specifies building blocks for OGC Web APIs that enable map servers and clients as well as visual style editors to manage and fetch styles. [1]
7. Research Objectives and Technical Architecture
As seen in Clause 5 this Pilot was the second phase of a broader initiative. This chapter describes the motivations that guided this Pilot’s work and the component architecture that was developed to address this Pilot’s goals.
7.1. Problem Statement
One of the challenges of Marine Protected Area (MPA) data is to make it available for a wide variety of users, including those outside the MSDI domain, such as fishermen, resource extractors, emergency services, utilities, tourists, or recreational boaters. These users, who may not have direct access to MPA databases to access the data they need to perform their activities, rely on smaller consumer-facing applications, which in turn rely on APIs to request and consume the data they work with.
The use of standards makes it easier for developers to build software applications. The more robust these standards are, the easier it is to build applications and the more diverse the audiences that are able to utilize them in a variety of scenarios. Because of this, the demonstration of standards related to both MPA data and the APIs they are served through becomes of key importance.
Within this context, this pilot addressed the following research questions:
What stages the data goes through from MPA to S-122;
What steps were taken in the server development to standardize the various data into an S-122 data set;
What stages the data goes through in a fusion scenario, regarding format, metadata, etc.;
What steps were taken in the server development to synthesize the data and create digestible data for clients;
Which OGC APIs were leveraged to perform transformations to this data;
How the data were processed by the clients and what views were used; and
What kind of modifications do the S-122 and OGC API standards need to better address the use of MPA data.
7.2. Technical Overview
The activities were divided into two concurrent stages or sections, each one with its own technical architecture.
7.2.1. First Stage
The first stage (Figure 2) focused on the demonstration of the transformation of MPA data into the S-122 standard, and its achieved interoperability when being served through OGC APIs. This stage demonstrated the access to MPA data through APIs built based on OGC API standards in order to test the ability of OGC API standards to build APIs that make MPA data more Findable, Accessible, Interoperable, and Reusable (FAIR) for a community greater than just the traditional domain experts. By demonstrating the use of the IHO S-122 MPA data standard, it also demonstrated its combined use with APIs based on OGC API standards.
This stage saw the demonstration of three components.
One Baltic/North Sea Server (D100): One processing server that ingested the data from various sources of the Baltic Sea / North Sea providers. These MPA data were brought into the server and transcribed into the IHO S-122 Marine Protected Area standard.
Two Baltic/North Sea Clients (D101 and D102): These client services demonstrated different viewpoints and methods for digesting the data from the server and standardized data, including:
A well-connected, online service that one may use to analyze the scenario from afar in an office or other remote location;
A less-connected (DDIL: Denied, Disrupted, Intermittent, and Low Bandwidth) service for use on the Baltic / North Sea on an older vessel that may have limited technology; and
An entity on a newer vessel with more recent technology that could be actively committing additional data to the input data.
Figure 2 — First Stage Architecture
MPA datasets might also need to be combined with datasets from other domains for the purpose of providing a more comprehensive overview of a region. Scenarios that might have an impact on land and water would require the consumption of a combination of hydrographic, terrestrial, and meteorological data. Examples of these activities include construction, disaster response, and multimodal transportation. The second segment addressed this data combination challenge, demonstrating the use of OGC API standards and IHO standards in combination with other datasets and standards.
7.2.2. Second Stage
The second stage (Figure 3) identified, examined, and expanded upon existing data sets to give them greater fidelity, mobility, and versatility. This went beyond marine protected areas and opened the examination to a broader set of data and standards. These included other data sets and standards that could be utilized to develop a firmer, more holistic view of a region, such as terrestrial data, meteorological data, earth observation data, etc.
Figure 3 — Second Stage Architecture
This stage saw the demonstration of four components.
Two Data Fusion Servers (D120 and D121): These servers ingested various data inputs, including MPA data. D120 was implemented using the OGC API — Features standard, while D121 was implemented using the OGC API — EDR standard.
Two Data Fusion Clients (D122 and D123): These clients ingested the outputs from the two servers and displayed the data to end users.
The following seven chapters, Chapters 8 through 14, describe all these components individually followed by Chapter 15 which describes the interactions between them. Each component chapter includes a description of the baseline from which that component was demonstrated, the technical architecture of that component, and the challenges and lessons learned that emerged from the demonstration of that component.
The final two chapters, Chapters 16 and 17, present the main lessons learned and recommendations, and suggested future work, respectively.
8. Baltic Sea/North Sea Server
The Baltic Sea/North Sea Server component was designed to ingest MPA data from various sources of the Baltic Sea / North Sea providers, transform the data to comply with the S-122 standard, and offer it through an API built using OGC API standards. This component was demonstrated by IIC Technologies.
8.1. Status Quo
IIC has previous experience designing and building proof of concept (POC) access points for data using open source tools and OGC Standards. Most platforms use combinations of open source geospatial databases (PostgreSQL/PostGIS) and middleware (most notably GeoServer) implementing existing OGC W*S standards. Although the concept of such access points is nothing new, there are a number of gaps in implementation and the project offered opportunities to advance the work further.
The IHO S-100 framework is approaching a level of maturity where it is ready for formalized adoption within the IMO SOLAS framework for marine charts. In addition, the potential of S-100, as an ISO-based implementation of model-driven marine geospatial data, is enormous. Using the framework, a broad array of marine phenomena can be modeled and commissioned as standards-conformant datasets with known structures and metadata. The maturity of S-100 has made it attractive to the MSDI community as a potential vehicle for broader marine geospatial data, hence the commissioning of the OGC FMSDI project to explore the subject in more detail. As well as exploring implementation/adaptation of the S-100 framework, the other area of research is the implementation of API-based standards. There is an existing Part within S-100 dealing with data streaming, but this is more of a conceptual framework around which to build specific web services than a specific methodology. In addition, the navigational community is in the process of developing an API-based structure for eNavigation on SOLAS vessels under IEC 63135, the SECOM standard for secure marine communications. This standard, developed in parallel with S-100, will offer a way of transferring file-based S-100 data to vessel navigation systems. There is currently no standardized way of transferring S-100 data via API to broader stakeholders, nor is there an understanding of the potential impact this would have on the traditional Hydrographic Office (HO) community.
In addition to these constraints, S-100 currently has no formalized encodings specific to modern web services. Although a GML implementation is included in S-100 Part 10b, it is now somewhat dated (though it has been updated for S-100 edition 5.0.0), and a draft GeoJSON encoding has been developed for use in other projects. This has proved useful for implementing OGC API Features tools, as most are designed natively for GeoJSON distribution. This in no way precludes distribution via GML (the current encoding supported by S-122) but is useful for GIS integration, web services, and API processing by bespoke scripts. The project provided a useful opportunity to refine the GeoJSON encoding, the eventual aim being to publish it as a GFM/OGC “bridging” specification between the two organizations.
Many institutions currently hold marine data in a variety of database structures and formats. Interchange between agencies (nationally, regionally, and internationally) is not regulated (in contrast to vessel navigation) and tends to be implemented according to national priorities. Therefore, the datasets received as part of the project’s Phase 1, a survey of S-122/MPA implementations, used a variety of formats and content models. It is hoped that the project can inform such infrastructure on the content, exchange, and transformation of such datasets to positively benefit stakeholders.
The main objectives explored in Phase 1 were to look in detail at the possibilities offered by S-100 within the domain of Marine Protected Areas (MPAs). Currently, many agencies promulgate data on MPAs and aggregate other data sources into their own. This results in a great deal of duplication and a lack of “custodianship” across the region.
The main questions/issues explored by IIC were as follows.
How can S-100, specifically S-122, help? By providing a common model across the domain of MPAs which can be implemented by all and used to create better interoperability across the region and across boundaries.
S-122 creation from different agencies’ data must be possible. So, all agencies’ requirements must be accounted for (this implies some filtering of data which is being released for access).
Interoperability between the different data sources must be achieved using a harmonized data model conforming to the S-100 modeling conventions and, preferably, within a revision of S-122 which is geared towards use by a broader class of agencies.
What needs to be done to the S-122 model to try and make it work “better” for use cases other than navigation?
What difference does access via API bring? This is really important for hydrographic offices who are used to distributing data via hard media. Most have “download” services but this means that data gets out of date and authenticity is lost.
How does the use of OGC API Standards “help”? (Does it help?)
Transboundary, regional, and multi-partner aspects are a high priority.
There is a need to reconcile data with existing limits and boundaries. This could be done with S-121, but boundaries can also be delimited by the S-122 data. It is also important to bring in WEND100 as a governance mechanism that ties distribution to EEZ limits. Furthermore, data integrity and traceability are always big issues for producers.
8.2. Technical Architecture
The server component converts data by mapping individual fields from the native format in which it was received to the fields defined by the S-122 feature catalogue. This process was achieved by mapping the fields individually and then constructing bespoke pipelines for the actual transformations.
The server uses an S-100 database with a simple schema implementing the framework. Features do not have their own tables; the database schema stores features and attributes together with the feature catalogue mediating attribute type, multiplicity, bindings, and relationships.
The S-122 form of the data is then stored in files which are served by the OGC API Features implementation built using pygeoapi, a framework-based tool not specific to any one dataset. Bespoke pipelines were constructed using IIC’s API for the S-100 framework, which simplifies the construction of features/attributes and sub-attributes and their storage in the database.
This integration into the database also then enables manual GUI editing of the S-122 data via the IIC Feature Builder tool (Figure 4). Once the data is prepared, edited, and ready it can be exported into the correct format ready for deployment with pygeoapi.
Figure 4 — IIC Feature Builder Tool
8.2.1. Data Transformation into S-122
Data was received in a variety of forms, mainly shapefile and OGC WFS. The format was first reduced to PostGIS database tables using OGR tools. A standardized interface to the database was then used to construct bespoke data processing pipelines to create the S-122 data. As IIC have an open database implementing S-100, this was a fairly rapid process and the GeoJSON outputs conforming to the S-122 feature catalogue were simple to produce. Pygeoapi was customized in a standard way to show bounding boxes, contact details, and other dataset metadata.
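The field-by-field mapping step at the heart of these pipelines can be sketched as below. This is a minimal illustration, not IIC's actual pipeline code: the source column names are hypothetical, while the output property names follow the S-122 feature catalogue as used elsewhere in this report.

```python
# Sketch of a bespoke mapping pipeline step: one source MPA record (e.g., a row
# read from a PostGIS table loaded via OGR) is mapped onto an S-122 GeoJSON
# feature. Source field names (national_id, site_name, ...) are assumptions.
def to_s122_feature(src: dict) -> dict:
    """Map one source MPA record to an S-122 GeoJSON feature."""
    return {
        "type": "Feature",
        "id": src["national_id"],
        "properties": {
            "foid": src["national_id"],
            "featureName": {"name": src["site_name"]},
            "categoryOfMarineProtectedArea": src["iucn_category"],
            "status": src.get("status", "permanent"),
        },
        # GeoJSON geometry, e.g., as produced by ST_AsGeoJSON or OGR export
        "geometry": src["geom"],
    }

record = {
    "national_id": "DNK:1:42",
    "site_name": "Example Bight",
    "iucn_category": "IUCN Category IV",
    "geom": {"type": "Polygon",
             "coordinates": [[[10.0, 55.0], [10.1, 55.0],
                              [10.1, 55.1], [10.0, 55.0]]]},
}
feature = to_s122_feature(record)
```

In practice each national dataset needed its own variant of this mapping, which is why the pipelines were bespoke even though the target model was common.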
The transformation of data from the bespoke form to the S-122 standardized form was generally done using bespoke pipelines, as described. The server was a framework-level component requiring no customization specifically for S-122 data. The pipelines were run locally with data deployed to the S-122 server. The server had a customized HTML skin and was implemented with both flat GeoJSON content as well as content using Elasticsearch as a service provider, providing more comprehensive filtering and search capabilities.
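The dual serving arrangement can be sketched as a pygeoapi resource configuration. The collection names, file path, and index URL below are hypothetical; the provider names (GeoJSON, Elasticsearch) are standard pygeoapi feature providers.

```yaml
# Illustrative pygeoapi resource definitions (names and paths are assumptions):
# the same S-122 data served from a flat GeoJSON file and, alternatively, from
# an Elasticsearch index for richer filtering and search.
resources:
  s122-mpa:
    type: collection
    title: S-122 Marine Protected Areas (Baltic/North Sea)
    providers:
      - type: feature
        name: GeoJSON
        data: /data/s122/mpa_baltic.geojson
        id_field: id
  s122-mpa-es:
    type: collection
    title: S-122 Marine Protected Areas (Elasticsearch-backed)
    providers:
      - type: feature
        name: Elasticsearch
        data: http://localhost:9200/s122-mpa
        id_field: id
```

The flat-file provider is trivial to deploy, while the Elasticsearch provider supports attribute queries beyond the bounding-box filtering available against static GeoJSON.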
8.2.2. Usage of OGC APIs
OGC APIs were not used to effect transformations to the data as the pipelines required bespoke processing for all the supplied data. Parts 1 and 3 of OGC API Features were used for distribution of the S-122 data only.
The servers implemented were API endpoints capable of distributing data in a variety of forms and with a selection of content drawn from both the existing IHO S-122 data model and an enhanced model drafted during the project. The Baltic Server collected endpoints using the basic model of S-122, as published by the IHO, and using data received at project outset collected from sponsors and questionnaire respondents by OGC.
The capabilities of the server extended to a full, conformant implementation of OGC API Features Parts 1 and 3 together with some facilities for filtering and querying. Filtering was performed via individual simple attributes on the data and querying was bounding box only as per the current pygeoapi implementation.
Distribution of data was accomplished via an S-122 GeoJSON encoding developed prior to the project and enhanced during its execution. Although at an early stage, this encoding bridged many gaps between IHO S-100’s default encodings (ISO8211, GML and HDF5) and those more commonly used in popular OGC implementations (OGC API commonly implements GeoJSON and HTML). An example of such an encoding is shown below in Clause 8.2.3.
8.2.3. Restricted Bandwidth Collections
One of the client implementations examined the possibility of data provision in areas with reduced bandwidth. Examination of the data showed that the vast majority of the size of the datasets comes from the representation of detailed coordinates, since MPA data is often cut to coastlines, requiring large numbers of vertices to accurately represent the shoreline boundary. Conversely, the seaward boundaries are more normally delimited by, or tied to, regulated positions and simple polygons only. Therefore, a separate API endpoint implemented a bounding box alternative to the full data. The code below shows one of these bounding box features. The identifiers of the bounding box features are identical to those in the full data, allowing the client to establish potential intersections with areas of interest and then only request full data with authoritative positions once the initial list has been pre-filtered.
{
  "type": "Feature",
  "id": "DEU:83:11837:27",
  "properties": {
    "foid": "DEU:83:11837:27",
    "featureName": {
      "name": "Niedersächsisches Wattenmeer"
    },
    "categoryOfMarineProtectedArea": "IUCN Category II",
    "status": "permanent"
  },
  "geometry": {
    "type": "Polygon",
    "coordinates": [
      [
        [6.580833209, 53.368864047],
        [6.580833209, 53.872514458],
        [8.558466819, 53.872514458],
        [8.558466819, 53.368864047],
        [6.580833209, 53.368864047]
      ]
    ]
  }
}
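The reduction from a full-resolution feature to its bounding-box stand-in can be sketched as follows. This is an illustration of the approach, not the project's server code; it handles a single Polygon ring set (a MultiPolygon would need one further level of nesting).

```python
# Sketch of the bandwidth-reducing transformation described above: replace a
# high-vertex polygon with its bounding box while keeping the same feature
# identifier, so clients can pre-filter before requesting full geometry.
def bbox_feature(feature: dict) -> dict:
    xs, ys = [], []
    for ring in feature["geometry"]["coordinates"]:
        for x, y in ring:
            xs.append(x)
            ys.append(y)
    minx, maxx = min(xs), max(xs)
    miny, maxy = min(ys), max(ys)
    return {
        "type": "Feature",
        "id": feature["id"],  # identical id links back to the full feature
        "properties": feature["properties"],
        "geometry": {
            "type": "Polygon",
            # closed 5-point ring, matching the sample encoding above
            "coordinates": [[[minx, miny], [minx, maxy], [maxx, maxy],
                             [maxx, miny], [minx, miny]]],
        },
    }

full = {
    "type": "Feature",
    "id": "EX:1",
    "properties": {"status": "permanent"},
    "geometry": {"type": "Polygon",
                 "coordinates": [[[1.0, 2.0], [3.0, 2.0], [3.0, 5.0],
                                  [1.0, 5.0], [1.0, 2.0]]]},
}
reduced = bbox_feature(full)
```

Because the identifier is preserved, a client can intersect the lightweight bounding boxes with its area of interest and then fetch only the matching full features.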
8.3. Challenges and Lessons Learned
Please also refer to the challenges and recommendations section of the chapter D120 Data Fusion Component.
8.3.1. API distribution of S-122 data
Many technical issues with the API distribution of S-122 data are related to the high vertex density of some of the polygons in the datasets. These are mainly in two areas.
Polygons with a shoreline component (although which shoreline was used is rarely, if ever, documented). These introduce a large number of vertices into polygons, which makes downloading and viewing challenging from a performance perspective (Figure 5).
Polygons composed of a large number of parts. These multi-part geometries can also be challenging to download and retrieve because of their complex nature (Figure 6).
Figure 5 — Polygons With a Shoreline Component
Figure 6 — Polygons With a Large Number of Multiple Components
A possible remedy for this situation would be to partition features into multiple sections based on a grid of variable resolution. One of the participants in the project used a DGGS representation, and partitioning the MPAs according to such grids alleviated the issue of having too many vertices. The client was then left with the task of piecing together polygons composed of multiple parts. This was a good solution for performance and one which is used in other contexts within the IHO. The following guidelines could be useful as a recommendation.
Partitioning to a grid is fairly simple to do and can be done by the middleware components. It requires a suitable grid for the region in question and pre-processing of features to ensure good performance.
It is likely that some areas will be complete grid cells, i.e., entirely MPA, while others will be partial. So, a concept of “coverage” within a grid cell is important. Many IHO product specifications have a concept of “data coverage” which accounts for this in the feature model.
A distinction should be made in the returned features between boundaries introduced by the grid partitioning and those which are part of the actual data. In the context of marine geo-regulation and maritime limits and boundaries (MLB), these could be “construction lines” and designated as such. There is no provision in the data model as yet for such things, although product specifications like S-121 have already made such recommendations.
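The partitioning step in the guidelines above can be sketched with a regular longitude/latitude grid (a simplification, not the DGGS used by the participant): determine which cells a feature's bounding box touches so the feature can be cut and served per cell by the middleware.

```python
import math

# Minimal sketch of grid partitioning: return the (col, row) indices of the
# cells of a regular lon/lat grid that intersect a feature's bounding box.
# The 0.5-degree cell size is an illustrative choice, not a recommendation.
def covering_cells(bbox, cell_size=0.5):
    """bbox = (minx, miny, maxx, maxy) in degrees."""
    minx, miny, maxx, maxy = bbox
    c0, c1 = math.floor(minx / cell_size), math.floor(maxx / cell_size)
    r0, r1 = math.floor(miny / cell_size), math.floor(maxy / cell_size)
    return [(c, r) for c in range(c0, c1 + 1) for r in range(r0, r1 + 1)]

# Bounding box of the German MPA sample shown earlier in this chapter.
cells = covering_cells((6.58, 53.37, 8.56, 53.87))
```

A real implementation would then clip the feature geometry to each cell and flag the resulting cut edges as construction lines, per the guideline above.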
8.3.2. Use of GeoJSON
The implementation of the OGC API Features service in the project relied upon an encoding of data in GeoJSON, replacing the existing GML implementation which is currently part of S-100 edition 4.0.0. The GML implementation has been substantially revised for edition 5.0.0 of S-100 since previous implementations suffered from mismatches between GML and feature catalogue structures. In S-100 the feature catalogue represents the single definition of the data structure of a product specification and binds entities drawn together from the IHO geospatial registry.
Product development is currently advanced for many product specifications and the IHO's Nautical Information Provision Working Group (NIPWG) oversees the creation of many of these. All product specifications, even gridded data, maintain a feature catalogue. Those with specific symbolization requirements also maintain an S-100 specific portrayal catalogue. Each product specification also contains a default encoding, normally drawn from the three included in S-100 Part 10 (a, b, or c). These are:
Part 10a – ISO8211;
Part 10b – Geography Markup Language (GML); and
Part 10c – HDF5.
The use of GeoJSON as an encoding is not currently part of S-100 itself. However, its ubiquity as a format for exchange of geospatial data raises the possibility of its use for modeling S-100 General Feature Model (GFM) data.
In order to progress this useful addition and to form a bridge between the OGC API family of standards with the S-100 GFM, a draft GeoJSON model has been developed and implemented during the project. This is based around the following principles.
Each S-100 GFM Feature is a named and identified GeoJSON Feature. The feature names used are those defined in the feature catalogue.
Similarly, all attribute names (simple or complex) are identical with those in the feature catalogue.
Both simple and complex attributes are GeoJSON properties.
Simple attributes are rendered as strings, although some simple types could be implemented as GeoJSON native types (String, Integer, Real, Boolean). String representation simplifies the initial encoding.
Where the feature catalogue multiplicities have cardinality > 1 and more than one is encoded, they are encoded as an array. Singular instances are encoded as a singular property.
Geometry is encoded without inline topology using Point, LineString, and Polygon primitives. Other GeoJSON geometry primitives can also be used (“Multi”) if required. Most features do not use Z coordinates (depth) preferring to attribute depth but this could be implemented as a z coordinate.
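A minimal sketch of these encoding principles in Python, using the S-122 attribute names that appear in the example feature later in this section; the `encode_attribute` helper is hypothetical, not part of any S-100 toolkit.

```python
def encode_attribute(values):
    """Apply the principles above: simple values become strings, complex
    attributes stay as nested objects of sub-attributes, and an array is
    used only when more than one instance is bound; a single instance
    is encoded as a singular property."""
    encoded = [v if isinstance(v, dict) else str(v) for v in values]
    return encoded if len(encoded) > 1 else encoded[0]

feature = {
    "type": "Feature",
    "properties": {
        # Feature and attribute names are taken from the feature catalogue.
        "featureType": "MarineProtectedArea",
        "status": encode_attribute(["permanent"]),
        "designation": encode_attribute([
            {"designationValue": "NCMPA", "jurisdiction": "National"},
            {"designationValue": "555560480", "jurisdiction": "International"},
        ]),
    },
    # Geometry uses plain GeoJSON primitives without inline topology.
    "geometry": {"type": "Polygon", "coordinates": []},
}
```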
The following elements of the GeoJSON encoding remain to be worked out. Some attempts at definition have been done but a consultation period with interested parties is probably required before a first draft of an encoding is published for testing.
Information Types: GeoJSON and its clients are often not good at dealing with features which have no geometry. This needs to be explored and resolved. Currently this is done either by leaving out the information types or by including them inline with whatever features refer to them, which wastes space for the client. A separate download of JSON renderings of information types could also be offered if an inter-dataset referencing system can be established.
Relationships: A systematic way of associating features together needs to be established in the encoding. This could use either standard JSON encodings or inline expansion of linked features. Relationships are absolutely key to the S-100 GFM and an integral part of the data. As with identifiers, each encoding in S-100 implements its own relationship methodology (ISO8211 uses LNAM; GML uses gml:ids and XML references), so GeoJSON can do the same. S-100 lacks, at a framework level, a way of relating features from different product specifications together; this could, however, be achieved through Maritime Resource Name (MRN) identifiers.
Identifiers: Each encoding implements its own identifiers: ISO8211 uses FOID, while GML uses gml:id. Most product specifications will also attribute identifiers, most likely MRNs. A standardized identifier for features in the encodings needs to be defined, probably as simple as an “id” property or feature attribute (to be determined). The pilot kept this broad, as a string, but formatting it with the product specification may be better, e.g., “S-122:UKNCMPA020”. The “links” fields have been used in the current implementation, which seems to work, although links with the same key value need to be accounted for.
Other standardized metadata: The feature type should be included as a standard field in each feature; there may be others, but testing will show those up. The Feature Collection could also contain some metadata. S-100 embeds more information in the dataset metadata, so it would be good to be able to encode some of this somewhere in the feature collection. Although currently optional, each feature should probably also carry its product specification, perhaps as a field in the id.
Geometry: It is possible, of course, to encode any valid GeoJSON geometry, but a stronger relationship to S-100 Part 7 would probably be a good addition and would make the encoding clearer. As TopoJSON becomes more accepted, an extension of this encoding to include a formal topology would be extremely valuable; it should profile more closely the topology structure encapsulated in the current ISO8211 encoding (Part 10a).
An example of a feature encoded in GeoJSON is shown in Figure 7. A sample source code is shown in the code below.
Figure 7 — HTML Representation (pygeoapi) - GeoJSON Rendering on Right
{
  "type": "Feature",
  "id": "1:UKNCMPA020",
  "properties": {
    "featureType": "MarineProtectedArea",
    "dateEnacted": "23-07-2014",
    "foid": "GB:1:UKNCMPA020:2",
    "featureName": {
      "name": "Central Fladen"
    },
    "categoryOfMarineProtectedArea": "Not Applicable",
    "designation": [
      {
        "identifier": "UKNCMPA020",
        "designationValue": "NCMPA",
        "jurisdiction": "National"
      },
      {
        "designationValue": "555560480",
        "jurisdiction": "International"
      }
    ],
    "status": "permanent",
    "dimensions": {
      "valueOfDimension": "92500.0",
      "categoryOfDimension": "Area",
      "unitOfMeasure": "ha"
    }
  },
  "geometry": {
    "type": "Polygon",
    "coordinates": [
    ]
  }
}
Feature Encoded in GeoJSON
This potentially extends to many product specifications. There are two aspects which need exploring in more detail.
The use of the IHO feature catalogue and an analogous structure for GeoJSON data: JSON Schema is probably the way to validate data against a schema, and mapping from the feature catalogue (FC) to a JSON Schema for each product specification is probably achievable, though difficult. The key observation here is that each feature in S-100 is normally different, with the FC describing the range of possible attributes and values each can have, whereas most GeoJSON data tends to have the same attributes for every feature, essentially a tabular structure. So, whilst this encoding is conformant, it may pose challenges for some implementers, and perhaps accommodations can be made. The API should probably return the FC / JSON Schema in its conformance classes.
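As an illustration of what an FC-derived schema might look like, the fragment below hand-writes a minimal JSON-Schema-style structure for the example MarineProtectedArea feature and checks only the `required` keywords. A real implementation would generate the schema from the feature catalogue and use a full JSON Schema validator; the helper here is a deliberately tiny stand-in.

```python
# Hand-derived from the attributes used in the example feature; a real
# mapping would be generated from the S-122 feature catalogue itself.
mpa_schema = {
    "type": "object",
    "required": ["type", "id", "properties", "geometry"],
    "properties": {
        "properties": {
            "type": "object",
            "required": ["featureType"],
            "properties": {
                "featureType": {"const": "MarineProtectedArea"},
                "status": {"type": "string"},
                "designation": {"type": "array"},
            },
        }
    },
}

def check_required(instance, schema):
    """Tiny stand-in for a JSON Schema validator: checks 'required'
    keywords recursively and returns the list of missing keys."""
    missing = [k for k in schema.get("required", []) if k not in instance]
    for key, subschema in schema.get("properties", {}).items():
        if key in instance and isinstance(instance[key], dict):
            missing += check_required(instance[key], subschema)
    return missing

example_feature = {
    "type": "Feature",
    "id": "1:UKNCMPA020",
    "properties": {"featureType": "MarineProtectedArea"},
    "geometry": {"type": "Polygon", "coordinates": []},
}
```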
Aggregation: S-100 aggregates features into datasets and datasets into exchange sets, with appropriate placement of metadata. OGC API — Features has only one level of aggregation, that of items into collections. A robust way of aggregating recursively would be a good step forward for OGC and would provide an analogous structure for “exchange sets.” The ability to seamlessly aggregate datasets together in GeoJSON would also be a good step forward. As noted earlier, if MRN identifiers are implemented and used to refer to features outside a feature aggregation, then inter-product relationships could be implemented.
9. Baltic Sea/North Sea Client 1
The Baltic Sea/North Sea Client component was designed to ingest the MPA data from the Baltic Sea/North Sea Server developed in this Pilot, consume it in S-122 format, and demonstrate different viewpoints and methods for digesting the data from the server. This component was demonstrated by Pelagis.
9.1. Status Quo
Pelagis is an ocean-tech organization based in Nova Scotia, Canada. Its approach to addressing the needs of the marine environment, and the shared use of ocean resources generally, is to make Marine Spatial Planning a core foundation on which to build out vertical applications. The platform architecture is based on a strongly typed federated information model represented as a unified social graph. This provides a decentralized approach to designing various data pipelines, each represented by its well-known and/or standardized model.
Pelagis’ previous work has focused primarily on the coastal marine environment and, in particular, on working closely with various agencies and non-profits supporting the growth of sustainable ocean farming. This pilot is Pelagis’ first experience as an OGC member and its first involvement with IHO standards.
Pelagis provides an opinionated framework that purposely abstracts a microservice architecture behind a strongly typed information graph model. Each service endpoint represents a feature domain encapsulating the business requirements and behavior specific to its domain. This separation of concerns provides the means to implement each service endpoint independently while hiding the complexity of the implementation from the client environment. In this way, client applications may be developed based on a unified information model hosted as an open, spatially-enabled application service (OpenSEA) while domain-specific endpoints are developed by separate teams. This federated graph model is represented as a centralized information schema accessible through a non-procedural query language representing ‘what’ information a client is requesting while delegating ‘how’ to access the information to the platform’s query optimizer.
9.2. Technical Architecture
The D100 Baltic/North Sea (BNS) service endpoint provides Marine Protected Area features and related data to identify Marine Protected Areas within the Baltic/North Sea. The service is designed to ingest various sources of MPA feature data provisioned through various agencies. The BNS service is implemented according to the OGC API — Features Standard and represents feature data compliant with the S-122 information model.
The D101 client issues queries against the BNS service endpoint conforming to OGC API — Features over http(s) to retrieve a feature collection of MPA features. The client environment demonstrates various viewpoints designed to stress the S-122 information model and the capabilities of the OGC Features API as a recommended API for managing the MPA feature domain.
Figure 8 — D101 Client & Application Server Architecture
The framework seen in Figure 8 incorporates the requirements of the FMSDI project by extending the OpenSEA architecture to define the S-122 feature model for marine protected features. The core design of the OpenSEA application service aligns closely with Pelagis’ requirements to support a general marine spatial data infrastructure and, in this case, directly supports its initiatives in providing a federated view of marine protected features.
The main components of the application architecture are the end-user client libraries, the OpenSEA application service, and each of the individual domain service endpoints.
Client Library
The client library is a declarative state management library responsible for interfacing between the end-user application and the OpenSEA service endpoint. The library is implementation-dependent based on the development framework of the application. The SentinelNg library is a Typescript library for web development providing a reactive-style design pattern. The Sentry client library is a Python library useful for analytics and exploratory data analysis at scale. Each of these libraries provides the core capabilities to query and modify application features through a local cache which supports both real-time and offline navigation of the domain feature model.
OpenSEA Application Service
The OpenSEA application service implements a unified graph model for each feature domain. A query issued to the service from the client library will validate against the feature domain schema and build the appropriate query plan to access each of the backend data services.
For the purpose of this project, the OpenSEA Application Service represents the client environment and is responsible for directly interfacing between the D100 Baltic/North Sea service, the D120 Data Fusion Server, and the D121 Data Fusion Server.
Data Service
The Baltic/North Sea (BNS) service publishes MPA features in accordance with the OGC API — Features Standard. This approach is based on an adapter design pattern in which each individual MPA feature source is published according to the S-122 information model. This provides a consistent interface to the MPA feature model independent of how the source system represents the MPA feature model.
9.2.1. Scenarios
Use Case: As a user, I want to see all marine protected areas
This scenario satisfies the basic requirement to query for all marine protected areas specific to a source authority and published through the D100 BNS feature service. The information model was consistent with the S-122 standard, allowing the client applications to access individual MPA feature properties and geometry. The goal behind this scenario was to stress the core S-122 feature model as it aligned with the OGC set of standards.
Figure 9 — D101 Client Workflow
Client query:
post(URI='https://…/ogcfmsdi/', json='query all_MPAs($authcode: AuthorityCode) { marine_protected_areas(authCode: $authcode) { _id geometry { geojson } featureName categoryOfMarineProtectedArea { category } } }')
Result:
Figure 10 — D101 Client Query Result
9.3. Challenges and Lessons Learned
10. Baltic Sea/North Sea Client 2
The Baltic Sea/North Sea Client 2 (D102) component was designed to ingest the MPA data from the Baltic Sea/North Sea Server (D100) developed in this Pilot. It was designed to demonstrate an alternative viewpoint and method to the D101 client for digesting the S-122 data. The viewpoint that D102 utilized in this pilot was that of a less connected service, such as that on an older vessel with limited technology or connectivity. This component was demonstrated by Helyx Secure Information Systems Ltd.
10.1. Status Quo
Low-connectivity environments, often described as DDIL (Denied, Disrupted, Intermittent, and Limited bandwidth), are a concern for many communities operating across various sectors. The maritime domain certainly has a significant DDIL component, with vessels frequently operating in less-than-ideal network conditions. Vessels can have poor signal quality, limited bandwidth, and intermittent connectivity, all of which impact the ability to send and receive data. Whilst the technology continues to improve, there is a need to understand and consider the range of technology currently equipping the increasingly large user base of maritime data, and not to assume that all user needs are the same.
There are two overarching considerations when working in a less connected environment. The first concerns the various approaches that could be used to mitigate some or most of the effects of operating in a DDIL environment: for example, pre-loading or pre-caching the data, storing the data locally and updating it via physical means, making the best use of networks, or using compression techniques to transmit the data. The second, and often overlooked, consideration concerns the data itself being transmitted to and from users operating in a less connected environment. There remains a question of whether all users require the original data all the time, or whether some derivative or simplified version would be sufficient for some users. This simplified version of the data would be much smaller in file size and could therefore be easier for vessels to receive whilst at sea. Having an alternative version of the data also allows the data to be consumed and used by a wider user base that might not have been able to use the data in its original form. The simplified version of the data is envisaged as a supplement to the original data rather than a replacement.
As this stage of the FMSDI pilot focuses on demonstrating improved access to MPA data for a broader variety of end users outside of the traditional MSDI domain, the D102 client primarily deals with this second consideration concerning the data. Previous work addresses approaches to compensate for DDIL environments, including networks, GeoPackages, and data compression techniques. Furthermore, as the MPA data access is widened to other maritime users, now is a good time to consider possible alternatives for how the data could be presented and delivered to an increasingly wide range of users that may not have optimal connectivity or onboard technology.
MPA features can vary in shape, extent, and complexity, with some features containing hundreds of vertices. These more complex features are inherently larger in file size than features with fewer vertices and will therefore take longer to transmit to a user operating in a DDIL environment. MPA features located in the littoral or inland regions are especially complex in their shape as they follow the coastline or inland waterways.
10.2. Technical Architecture
10.2.1. Approach
In the pilot scenario, it was envisaged that a user aboard a vessel in a low-connectivity environment would need to know the location of any MPAs along a given route; for example, when changing the planned route mid-journey in reaction to an emergency event or to respond to a natural disaster. This would require the user onboard the vessel to request data for a geographic area that was not necessarily pre-loaded onto the vessel’s system. It was assumed that a small buffer on the vessel’s route would also be required so that MPAs within a known distance of the route would be returned from the server. For the D102 client, an arbitrary buffer of 5 nautical miles was added to the vessel’s route.
The approach taken in this pilot was to minimize the volume of data requested and to determine whether using an alternative to the original data would be suitable and appropriate for the S-122 MPA standard. This was achieved in two ways.
Using bounding boxes of the MPA features, rather than the MPA features themselves, to reduce the size of the data requested from the D100 server.
Using a spatial filter to only retrieve data from the D100 server that is located within a bounding box of the vessel’s route (buffered by 5 NM).
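A rough sketch of the buffered-bbox computation is shown below. It uses a flat-earth degree approximation (1 NM is roughly 1/60 degree of latitude, with longitude scaled by the cosine of latitude), which is adequate for deriving a bbox filter but is not the true geodesic buffer that the client actually computes; the function name and route values are illustrative.

```python
import math

def buffered_route_bbox(route, buffer_nm=5.0):
    """Approximate bounding box of a route buffered by `buffer_nm` nautical
    miles. route is a list of (lon, lat) pairs; returns
    (min_lon, min_lat, max_lon, max_lat)."""
    lons = [p[0] for p in route]
    lats = [p[1] for p in route]
    d_lat = buffer_nm / 60.0                      # NM to degrees of latitude
    mean_lat = sum(lats) / len(lats)
    d_lon = d_lat / math.cos(math.radians(mean_lat))
    return (min(lons) - d_lon, min(lats) - d_lat,
            max(lons) + d_lon, max(lats) + d_lat)

# Illustrative subset of the drawn route from the Baltic scenario.
route = [(22.212982, 60.43622), (21.051178, 59.974944), (20.89325, 59.950885)]
bbox = buffered_route_bbox(route)
```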
10.2.2. Architecture and Interactions
Figure 11 — BSNS Client #2 Architecture Overview
A lightweight client was built that connected to the D100 server (Figure 11). The D102 client was developed using Leaflet, a JavaScript library for interactive maps, chosen for its ease of use, its suitability for desktop and mobile devices, and its extensibility through plugins. Two Leaflet plugins were also used for the client.
Leaflet Draw: This was employed to implement the draw feature in the client, enabling the user to draw their route.
Turf.js: This enables geoprocessing to be done in the client, such as buffering features, performing intersections, etc.
Figure 12 — Workflow for the BSNS Client
The sequence of interactions that take place is detailed below and shown in Figure 12.
1 - The user creates a route for the vessel by drawing it on the map in the client. This uses the Leaflet Draw plugin functionality.
Vessel Route Drawn in the Client Application
2 - The route drawn by the user is then converted in the client. First, the route is converted into a GeoJSON object and logged in the console. Then the GeoJSON object is converted into a Turf Linestring object using the turf.lineString method from the Turf.js plugin, as seen on the code below. This conversion into a Turf Linestring is required to perform the buffer in Step 3.
The drawn route: {"type":"Feature","properties":{},"geometry":{"type":"LineString","coordinates":[[22.212982,60.43622],[22.118225,60.40368],[22.103119,60.379254],[22.015228,60.281366],[21.814728,60.251397],[21.758423,60.224129],[21.656799,60.213898],[21.595001,60.213216],[21.507111,60.198203],[21.472778,60.180453],[21.430206,60.175672],[21.128082,60.033988],[21.051178,59.974944],[20.887756,59.965322],[20.89325,59.950885]]}}
3 - The Turf Linestring object created in Step 2 is then buffered in the client using the turf.buffer method from the Turf.js plugin. This is currently set at an arbitrary distance of 9.26 km (5 NM) for demonstration purposes. The buffer is not displayed in the client, but the coordinates of the buffer are shown in the console log.
Route Buffer Coordinates as Shown in the Console Log
4 - The geographical extent of the buffered route is calculated in the client using Leaflet’s .getBounds() and .toBBoxString() methods. This returns the bottom-left and top-right coordinates of the bounding box that encloses the buffered route and outputs them as a string X1,Y1,X2,Y2, as seen in the output below. This is used as the bbox spatial filter when querying the D100 server and is added to the console log.
Bounding Box of the Buffered Line: 20.72139348936155,59.86761112867299,22.381746235013203,60.519483021624964
5 - The coordinates of the buffered route bounding box are added to the URL query as a bbox spatial filter, as seen below. The URL query is added to the console log as a hyperlink.
The URL query of the feature collection is: http://35.176.64.149/pygeoapi/collections/S-122WDPA_ES_BB/items?bbox=20.72139348936155,59.86761112867299,22.381746235013203,60.519483021624964&limit=2000
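The construction of such a request can be sketched as follows; the base URL and collection identifier below are placeholders, not the actual D100 endpoint.

```python
from urllib.parse import urlencode

def items_url(base, collection, bbox, limit=2000):
    """Build an OGC API - Features items request with a bbox spatial filter.
    bbox is (min_lon, min_lat, max_lon, max_lat)."""
    # safe="," keeps the bbox commas readable rather than percent-encoded.
    query = urlencode({"bbox": ",".join(str(c) for c in bbox),
                       "limit": limit}, safe=",")
    return f"{base}/collections/{collection}/items?{query}"

url = items_url("http://example.org/pygeoapi", "S-122-MPA-BBOX",
                (20.72139348936155, 59.86761112867299,
                 22.381746235013203, 60.519483021624964))
```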
6 - The D102 client then requests the MPA features from the D100 server using the query created in Step 5. Note that the map extent of the feature collection on the D100 server is confined to the bounding box created in Step 4.
MPA Features Retrieved From the D100 Server
7 - The D100 server then returns to the D102 client the MPA features that are located within the bbox spatial filter. The number of features returned is added to the console log.
FeatureCollection Object Returned Back to the Client
8 - The MPA features returned in Step 7 are then processed in the client using the Turf.js plugin. The MPA features are converted into Turf multi-polygons using the turf.multiPolygon method and the buffered route is converted into a Turf polygon using the turf.polygon method. An intersection between the two is then performed in the client using the turf.intersect method.
MPA Features Displayed in the Client
9 - The D102 client then displays only the MPA features that intersect the route buffer. The number of MPA features displayed in the client is added to the console log. Note that in this example 544 MPA features were returned from the D100 server and only the 65 features that intersected the buffered route were displayed in the client.
There are 65 bounding boxes of Marine Protected Areas within 5NM of the vessels route
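The client-side filtering of Steps 8 and 9 can be approximated as below. Because the collection already serves bounding-box features, a box-overlap test is used here as a simplified stand-in for the polygon intersection that turf.intersect performs in the actual client; all names are illustrative.

```python
def bboxes_overlap(a, b):
    """True if two (min_lon, min_lat, max_lon, max_lat) boxes intersect."""
    return not (a[2] < b[0] or b[2] < a[0] or a[3] < b[1] or b[3] < a[1])

def feature_bbox(feature):
    """Bounding box of a GeoJSON Polygon feature's rings."""
    pts = [p for ring in feature["geometry"]["coordinates"] for p in ring]
    lons = [p[0] for p in pts]
    lats = [p[1] for p in pts]
    return (min(lons), min(lats), max(lons), max(lats))

def features_near_route(features, buffer_bbox):
    """Keep only the returned MPA bounding-box features that intersect the
    buffered route's bounding box."""
    return [f for f in features if bboxes_overlap(feature_bbox(f), buffer_bbox)]

# Two fake MPA bounding-box features: one inside the Baltic query area, one not.
mpas = [
    {"geometry": {"type": "Polygon", "coordinates":
        [[[21.0, 60.0], [21.5, 60.0], [21.5, 60.2], [21.0, 60.2], [21.0, 60.0]]]}},
    {"geometry": {"type": "Polygon", "coordinates":
        [[[10.0, 55.0], [10.5, 55.0], [10.5, 55.2], [10.0, 55.2], [10.0, 55.0]]]}},
]
near = features_near_route(mpas, (20.7, 59.8, 22.4, 60.5))
```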
Two of the feature collections (WDPA source and JNCC source) on the D100 server were provided in two different formats: one had the original MPA features, and the other had bounding boxes that represented the original MPA features. Of the two collections, the WDPA contained 2000 features (Figure 13) whilst the JNCC collection contained only 25 features. Figure 14 and Figure 15 contain an overview of the differences between querying the original data and the bounding boxes when using the two test routes hosted on the D100 server. It was evident that using a simplified version of the MPA data can significantly reduce the size of the returned query.
Figure 13 — WDPA Collection
Figure 14 — Overview of using WDPA data on the two test routes
Figure 15 — Overview of using JNCC data on the two test routes
Network emulation was implemented to simulate a network that had limited bandwidth, packet loss, and an increased delay. Even with low bandwidth the D102 client was able to successfully query the D100 server and display the bounding box MPA features that intersected the vessel’s buffered route. Conversely, using network emulation and then executing a query using the original MPA features would take significantly longer and often fail to load on the client.
10.3. Challenges and Lessons Learned
10.3.1. Spatial Filtering Using a Bounding Box Query
The bbox spatial filter, which is specified in Part 1 of the OGC API — Features Standard, is a useful method to query a server and reduce the volume of data returned to the client. However, using this method often covers a significantly larger geographic area than is required (Figure 16). Additionally, maritime routes that have significant displacement in both the X and Y axes will have a large bounding box enclosing the route, whereas a route that has minimal displacement in either the X or Y axis will have a much smaller bounding box. Using the bounding box spatial filter can therefore mean that valuable network capacity in low-connectivity environments is consumed by returning features that are irrelevant to the user’s query. An alternative approach using Part 1 of the OGC API — Features Standard could be to use a series of bbox spatial filters, one for each segment of a route, with segments defined either by a specified distance or by a change in bearing. Whilst not the most efficient technique, if only OGC API — Features — Part 1 is implemented by a server, this would still reduce the overall area of the query and lower the number of features returned compared to using a single bbox spatial query.
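The per-segment alternative can be sketched as follows. The buffer margin and route are illustrative, and overlap between neighboring boxes is ignored; even so, the combined area of the small boxes is well below that of a single bounding box over a dog-leg route.

```python
def segment_bboxes(route, buffer_deg=0.1):
    """One buffered bbox per route segment instead of a single bbox over the
    whole route. route is a list of (lon, lat) pairs; buffer_deg is a
    placeholder degree margin standing in for the 5 NM buffer."""
    boxes = []
    for (x1, y1), (x2, y2) in zip(route, route[1:]):
        boxes.append((min(x1, x2) - buffer_deg, min(y1, y2) - buffer_deg,
                      max(x1, x2) + buffer_deg, max(y1, y2) + buffer_deg))
    return boxes

def bbox_area(b):
    return (b[2] - b[0]) * (b[3] - b[1])

# A dog-leg route with displacement in both axes: compare one big bbox
# against the series of per-segment boxes.
route = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
single = (-0.1, -0.1, 1.1, 1.1)
per_segment = segment_bboxes(route)
```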
A more efficient method of performing a spatial filter would be to use the CQL2 functionality described in Part 3 of the OGC API — Features Standard. This additional functionality enables spatial filtering with other geometry types and is not limited to a simple bounding box. In this scenario, the buffered route of the vessel, or the route itself with additional conditions specified in the spatial filter (e.g., distance from the route), could be used as the spatial filter (Figure 16). A server that implements OGC API — Features — Part 3 would significantly reduce the area of the query sent to the server and subsequently reduce the number of MPA features returned to the client.
Figure 16 — Example of how using a bbox spatial filter (black line) covers a much larger area compared to a polygon shape of the buffered route (blue line).
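Assuming the server advertises CQL2 text filtering, such a request might be built as below. The queryable name `geometry` and the polygon coordinates are placeholders; the actual property name would depend on the server's declared queryables.

```python
from urllib.parse import urlencode

def cql2_intersects_query(polygon_wkt, limit=2000):
    """Build the query string for an OGC API - Features request that filters
    on the buffered route polygon itself (CQL2 text encoding) rather than
    on its enclosing bbox."""
    return urlencode({
        "filter": f"S_INTERSECTS(geometry,{polygon_wkt})",
        "filter-lang": "cql2-text",
        "limit": limit,
    })

# Hypothetical buffered-route polygon in WKT.
query = cql2_intersects_query(
    "POLYGON((21.0 60.0, 22.0 60.0, 22.0 60.5, 21.0 60.5, 21.0 60.0))")
```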
During the pilot the D100 server used pygeoapi v0.12 to publish MPA vector data through an OGC API Features endpoint. However, the bbox spatial filter only works with certain data providers: GeoJSON and CSV providers cannot be filtered using a bbox query in pygeoapi v0.12. If the bbox spatial filter approach is to be used, then the data would need to be in an appropriate format (i.e., provider) or hosted on a server with suitable spatial filtering capability. Additionally, CQL filtering on pygeoapi v0.12 only works with Elasticsearch providers; this functionality was not enabled on the D100 server and therefore could not be tested during the pilot.
10.3.2. Using Bounding Boxes to Represent Features
It was clear from the very early stages of the pilot that MPA features can be complex in shape, especially in littoral regions. This affects the number of vertices that a particular feature has and significantly increases the file size of the feature. When dealing with a small number of features this is not necessarily an insurmountable problem. However, when dealing with a collection that has hundreds or thousands of complex features, it can severely impact the end user’s experience and is unsuitable for low-connectivity environments. Whilst this affects the S-122 standard for MPAs, it is not unique to S-122: a number of other datasets with very complex shapes potentially suffer from the same problem.
Therefore, creating a DDIL twin (a concept similar to a digital twin, whereby a simple and lightweight version of the data coexists with the original data) was suggested where the MPA shape needed to be simplified. Bounding boxes were created for some of the feature collections to test whether this concept would work and what issues would arise.
During the pilot there were several instances of MPA feature collections that had been converted to bounding boxes before the final two collections (WDPA, JNCC). Some issues were identified, especially with these earlier instances of bounding box collections. For example:
the features that were converted into bounding boxes used single polygons for all features, which was an issue for multi-part features; and
the bounding boxes were an envelope-type minimum-bounding geometry of the feature.
MPAs that are multi-part features, i.e., features that have the same attributes but consist of numerous individual parts, were represented by a single bounding box (Figure 17). This could cause a route query to return a bounding box that covers many individual MPA features.
Figure 17 — Example of a single bounding box (A) that represents numerous multi-part features (B)
Additionally, depending on the shape of the MPA feature, using an envelope-type bounding box would sometimes produce large areas that did not contain MPA features (Figure 18). An alternative option is to use a convex-hull-type minimum bounding geometry, which would not only provide a better representation of the feature but also reduce the amount of empty space that does not actually contain MPA features (Figure 19). This would reduce the occurrence of ‘false positives,’ whereby a bounding box feature is returned to the client because it is within 5 NM of the vessel’s route even though the actual MPA feature the bounding box represents is not close to the route. If the bounding box approach is going to be taken forward, these issues will need to be considered in future implementations.
Figure 18 — Example of how an envelope type bounding box (A) can lead to a large geographic area (B) that contains no MPA feature
Figure 19 — Example of an envelope bounding box (left) and convex hull (right) minimum bounding geometry of the same feature
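The difference between the two minimum bounding geometries can be illustrated with a standard convex hull construction (Andrew's monotone chain) on an L-shaped feature, where the hull covers noticeably less empty area than the envelope; the coordinates are illustrative.

```python
def convex_hull(points):
    """Andrew's monotone chain convex hull; returns hull vertices in
    counter-clockwise order without repeating the first point."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def polygon_area(ring):
    """Shoelace formula for a ring given without a repeated closing point."""
    s = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0

# An L-shaped feature: envelope bbox is 4 x 3 = 12 units of area,
# while the convex hull trims the empty corner.
pts = [(0, 0), (4, 0), (4, 1), (1, 1), (1, 3), (0, 3)]
hull = convex_hull(pts)
env_area = (4 - 0) * (3 - 0)
hull_area = polygon_area(hull)
```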
The S-122 standard does not appear to readily use bounding boxes other than for the overall dataset extent. However, in the GML data format documentation there are several sections where the gml:boundedBy property could be used to store the coordinates of a bounding box of the features and include the information in the metadata. Alternatively, bounding boxes for individual features can easily be created from S-122 MPA datasets in several GIS software applications (ArcGIS, QGIS, etc.) before the bounding box features are hosted on a server.
10.4. Recommendations for Future Work
10.4.1. Further Enhancing MPA Filters
There is an opportunity for further development of the client alongside a server implementing OGC API — Features — Part 3 to use additional filtering to reduce the amount of data requested from the server. For example, giving the user the option to filter depending on what is important for their needs, which may change over time depending on their location or mission.
This could be accomplished by allowing the user to query using the attributes in the S-122 standard, such as the data provider, country, MPA status, category, vessel restrictions, etc. There is also the potential to increase the functionality of the client by allowing the user to request specific original MPA feature(s) once the bounding boxes have been returned.
Finally, the functionality on the server implementation could be expanded to clip the MPA features that intersect the query from the client and only return these segments of intersecting MPA features, further reducing the amount of data returned to the client.
10.4.2. Establishing a Data Schema for DDIL Environments
If a DDIL twin for any data is to be considered going forward, then there needs to be some consideration of what the data schema would need to be. For the S-122 MPA data, this would be the simplified bounding box version of the original data. As the simplified version of the data is envisaged as a supplement to the original data rather than a replacement, it would need to share common attributes with the original data and have a clear link back to the original features.
Whilst having additional metadata would slightly increase the file size, this is not the main contributing factor, as the features remain simple shapes with few vertices. Some important parts of the DDIL schema are listed below. This is not an exhaustive list, and consultation with relevant stakeholders in the community and beyond should be undertaken to fully understand what is important.
Feature ID
Feature Name
Category
Status
Country
Any restrictions in the area
A clear notation that the feature is a DDIL representation and not the original feature
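The attributes listed above might be carried in a simplified feature along the following lines. This is an illustrative GeoJSON sketch with hypothetical identifiers and property names, not a ratified schema.

```python
# A minimal sketch of a DDIL bounding-box feature carrying the attributes
# listed above. All names and values are illustrative assumptions.
ddil_feature = {
    "type": "Feature",
    "id": "DK-MPA-0042-ddil",
    "geometry": {  # bounding box of the original MPA as a simple polygon
        "type": "Polygon",
        "coordinates": [[[10.1, 55.2], [10.6, 55.2], [10.6, 55.5],
                         [10.1, 55.5], [10.1, 55.2]]],
    },
    "properties": {
        "featureName": "Example Marine Protected Area",
        "category": "IUCN-IV",
        "status": "designated",
        "country": "DK",
        "restriction": "anchoring prohibited",
        "ddilRepresentation": True,       # not the original geometry
        "sourceFeature": "DK-MPA-0042",   # link back to the full feature
    },
}
```

The `ddilRepresentation` flag and the `sourceFeature` link address the last two requirements in the list: a clear notation that the feature is a simplified representation, and a traceable path back to the original data.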
There are other options that could be leveraged if the objective is to reduce either the size of the response payload or the complexity of the MPA feature boundary. The former could be addressed, for example, through compression or an alternative encoding such as GeoParquet. The latter is a discussion of how much accuracy the client is willing to give up when the MPA boundary is approximated by a bounding box.
10.4.3. Using Vector Tiles
Tiled feature data, colloquially referred to as Vector Tiles, delivers vector feature data in pre-defined tiles, allowing small amounts of data to be sent to the client, and has previously been proven to work in DDIL environments. Some of the potential benefits are:
efficiently storing, delivering, and visualizing vector data from large datasets (such as S-122);
varying levels of data ‘resolution’ allow low resolution over a large geographical area, or high resolution over a much smaller area;
enables efficient caching of data;
can provide clients with a hierarchy of available data, while awaiting requests for higher resolution tiles; and
uses established techniques and APIs (OGC API – Tiles, OGC WFS 3.0, OGC WMTS).
Hosting and serving the S-122 data as vector tiles could be an alternative approach for using S-122 in low-connectivity environments. It is recommended that this be explored in the next phase of the FMSDI pilot.
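As an illustration of the tile addressing involved, the snippet below maps a WGS84 coordinate to an XYZ/WebMercatorQuad tile index (the standard Web Mercator tiling math) and composes a hypothetical OGC API — Tiles request. The URL template is an assumption for illustration, not an existing endpoint.

```python
import math

def lonlat_to_tile(lon, lat, zoom):
    """Map a WGS84 coordinate to an XYZ/WebMercatorQuad tile index,
    the addressing scheme used by most vector-tile services."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.log(math.tan(lat_rad) + 1.0 / math.cos(lat_rad))
             / math.pi) / 2.0 * n)
    return x, y

# Hypothetical OGC API - Tiles URL template for an S-122 tile set;
# the path order is {tileMatrix}/{tileRow}/{tileCol}.
x, y = lonlat_to_tile(10.0, 55.0, 10)
tile_url = f"https://example.org/api/collections/S-122/tiles/WebMercatorQuad/10/{y}/{x}"
```

A client in a DDIL environment could cache low-zoom tiles covering a wide area and request higher-zoom tiles only when connectivity allows.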
11. Data Fusion Server 1
The Data Fusion Server component was designed to ingest MPA datasets as well as other datasets, combine them, and serve them through an API built using the OGC API Standards. This component was demonstrated by IIC Technologies.
11.1. Status Quo
The fusion server was a development of some of the facilities on the Baltic Server designed to enhance the data content available in respect of Marine Protected Areas, to enhance the S-122 model following feedback from stakeholders, and to produce data fusing the old/new aspects of the model. An additional aim was to enable more sophisticated search, filter, and retrieval technologies to facilitate clients with richer functionality. These facilities have traditionally been non-integrated, using middleware such as GeoServer implementing older OGC Web Service standards.
The process of enhancing the model and producing S-100 conformant data examples that take advantage of those enhancements generally takes many months of evolution with stakeholders. The IHO-established process for such development is model-driven but tends to be predominantly manual. The FMSDI project used a rapid application development methodology to speed this up and, through demonstration, prototype the possibilities of enhanced data models using S-100 as a framework.
Data distribution by the stakeholders is currently accomplished largely through a “portal” methodology, where specific websites offer data for browsing and download in a number of formats, both proprietary and open-standards based. Although some portals have used earlier OGC standards to stream data, many still use file download as the primary mechanism. Additionally, a number of regional “aggregators” exist, where individual contributions are joined together under a single data model representative of, and specific to, the regional aggregator's identity. Therefore, although OGC Standards have made inroads into SDI development, they are by no means pre-eminent, and the presence of aggregators adds to the complexity. The fusion server concept contributes to this picture by providing specific vector feature API endpoints with a high degree of interoperability, data conformance to S-100, and adaptability using structured query techniques.
11.2. Technical Architecture
The fusion server was implemented with a set of extra API Endpoints co-located with the Baltic Server implementation. The technical services re-use the components from D100, namely PostGIS, pygeoapi, GeoJSON and bespoke software pipelines to create data conforming to a number of model enhancements proposed following stakeholder consultations. Unified Modeling Language (UML) diagrams expressed using Enterprise Architect were transformed via IIC’s embedded IHO Feature Catalogue extensions and then transferred into an S-100 conformant FC for implementation in the enhanced pipelines creating MPA data.
At this stage in the project it became apparent that the implementation of OGC API Features — Part 3 CQL querying was fairly basic in the pygeoapi components. Therefore, a previous version was sought, installed, and configured with a selection of datasets in order to try a more complete implementation of CQL. This proved more successful and was enough to exercise querying via the API programmatically, although GIS packages do not currently provide query interfaces conforming to the CQL standard.
It should also be recalled that many of the fusion aspects of the data endpoints can be realized by using multiple feature types within individual datasets. Unlike most data frameworks, S-100 is inherently multi-feature: a “collection” in the IHO context (normally associated with a “dataset”) is an aggregation of many different kinds of features and the relationships between them, so “fusion” in an S-100 ecosystem is often accomplished at the content level, not at the web services level.
The first element of the Fusion Server to be focussed on was the inclusion of data from broader authorities and its content.
The Danish habitat data, described in the section on the S-122 model, is an example of this mixture of individual data from different sources and the S-122 data model.
The integration of the habitat (WFS source) data with the S-122 model was also performed using the IIC Feature Builder Tool (Figure 20).
Figure 20 — Integrating Data With the IIC Feature Builder Tool
This data modeling and mapping exercise provided many suggestions for the data model enhancements which were then back-propagated into the original data sources to enhance the S-122 representation of the data.
The fusion server data was hosted in parallel with the Baltic Server and was accessed through a different port (5000). This was an implementation configured during the course of the project using an older version of the pygeoapi implementation which supports fuller capabilities for intelligent querying, specifically the CQL interfaces from OGC API Features — Part 3.
Data was ingested into PostGIS databases and then custom automated processes created S-100 GFM features and information types using an S-100 GFM database schema. Once populated, the data was then used to extract S-100 GFM GeoJSON which was then deployed under pygeoapi.
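A minimal sketch of the GeoJSON extraction step described above, assuming a PostGIS table of S-100 GFM features, is shown below. The table and column names are hypothetical, chosen for illustration only.

```python
# Hypothetical sketch of the GeoJSON extraction step: pull S-100 GFM
# features out of a PostGIS table for deployment under pygeoapi. Table
# and column names are illustrative assumptions.
def geojson_export_sql(table, geom_col="geom", props=("id", "featurename")):
    """Build a SQL statement exporting feature geometry as GeoJSON
    using the PostGIS ST_AsGeoJSON function."""
    cols = ", ".join(props)
    return f"SELECT {cols}, ST_AsGeoJSON({geom_col}) AS geometry FROM {table};"

sql = geojson_export_sql("s122_mpa")
```

In the pilot, bespoke automated pipelines performed this extraction and assembled the results into GeoJSON feature collections for deployment under pygeoapi.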
The version of pygeoapi under development at the time of writing (0.13) is fairly limited in its CQL support, but an earlier version (v0.9.0) implemented more complete support. So, in order to provide facilities for broader scenarios, the earlier version was deployed in parallel. This allows full search capabilities in addition to bounding box queries.
In order to demonstrate the processing potential of the API model, a Python Jupyter notebook was written which explored various aspects of the server operation in annotated code. The URL below shows an example of a simple “WITHIN” query.
Compound queries were also possible, such as queries selecting features from within a polygon and then filtering on the id:
Figure 21 — Testing CQL Queries on the D120 Fusion Server
There were many difficulties in implementing such queries reliably with pygeoapi, as the CQL implementation is at an early stage; such complex spatial queries require fuller implementation in pygeoapi and more robust error detection and messaging. The initial results are encouraging, though, and feedback to pygeoapi should be structured and passed on as part of the project outputs.
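For illustration, a compound filter of the kind described above can be expressed in CQL text as follows. The endpoint and property names are hypothetical, and the request parameter names follow OGC API Features — Part 3 as implemented in pygeoapi, which may vary between versions.

```python
from urllib.parse import quote

# Hypothetical fusion-server endpoint (the pilot's instance was reached
# on port 5000); collection and property names are assumptions.
BASE = "http://example.org:5000/collections/mpa/items"

# Select features WITHIN a search polygon, then filter on an attribute,
# combined with AND in a single CQL text expression.
polygon = "POLYGON((10 54, 13 54, 13 58, 10 58, 10 54))"
cql = f"WITHIN(geometry, {polygon}) AND featureName LIKE 'Natura%'"

url = f"{BASE}?f=json&filter-lang=cql-text&filter={quote(cql)}"
```

Such compound expressions are where the error handling mentioned above matters most: a partially supported predicate can fail silently rather than returning a clear diagnostic.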
In order to support more detailed scenarios of establishing which MPAs lie along individual routes, a pair of routes was digitized from existing public-domain maritime routing data in the IMO’s recommended routes publication. These were made available on the query server for clients to ingest, buffer, and run proximity queries against, as seen in Figure 22.
Figure 22 — pygeoapi Displaying Routes
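A client-side proximity query of the kind described above might be approximated as follows: pad the route's bounding box by a buffer distance and use the result as a bbox query against the MPA collection. This is a simplified stdlib-only sketch with illustrative coordinates; a real client would buffer the line geometry itself, for example with Shapely.

```python
def route_buffer_bbox(route, buffer_deg):
    """Approximate a proximity query for a digitized route: take the
    route's bounding box and pad it by a buffer in degrees. A real
    client would buffer the line geometry itself."""
    lons = [p[0] for p in route]
    lats = [p[1] for p in route]
    return (min(lons) - buffer_deg, min(lats) - buffer_deg,
            max(lons) + buffer_deg, max(lats) + buffer_deg)

# Illustrative route vertices (lon, lat), not real routing data
route = [(8.2, 57.1), (9.5, 57.6), (10.8, 57.9)]
bbox = route_buffer_bbox(route, 0.1)  # pad by roughly 0.1 degree
```

The resulting bounding box can be passed directly as the `bbox` parameter of an OGC API — Features request to retrieve candidate MPAs near the route.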
The routes themselves represent a fusion of datasets, as they were created from maritime route data and digitized from Automatic Identification System (AIS) heat maps obtained from public-domain datasets. There is no automated means to create such routes, since the constraints on them in terms of maritime traffic flow are substantial, so in order to support these scenarios the routes were created manually. An interesting extension of this work would be to derive such routing data automatically from point-cloud AIS data and navigational constraints; this is beyond the scope of the existing project, but much of the source data undoubtedly exists. Figure 23 shows Denmark MPA data, IMO routing features, and AIS heatmap tracks.