| Guidance name | User-friendly and efficient management of geodata in a metadata catalogue for geodata |
| Motivation / Intent / Aim of guidance | Managing geodata can be challenging, particularly when a large number of (big) spatial datasets must be made available for collaborative use and processing, or must be findable within and outside projects. To discover geodata, users need interfaces that support spatial filtering combined with temporal and thematic filtering. To evaluate the fitness for use of candidate geodata, users often also need a map-based visualisation. Metadata catalogues for geodata provide user interfaces, a related database, and (often) a standardised interface to manage and discover metadata and geodata, and they implement role and access management. They are available either as open-source products that can be configured, modified, and extended, or as commercial solutions. |
| Recommended activities | In Earth System Sciences, metadata catalogues for geodata are used to manage geospatial data and related metadata through discipline-specific user interfaces (e.g. spatial filter and search menus) and APIs (e.g. for spatial requests). Manage geospatial data in a metadata catalogue for geodata from the start of the project. Whenever possible, use an existing catalogue, e.g. an institutional catalogue. If no existing metadata catalogue for geodata is available, choose one of the existing (open-source) catalogues. |
|Example / Use case| The BMBF project GeoKur aims to support the curation and quality assurance of Earth System Science (ESS) datasets, focusing on the suitability of geospatial time series of global land use data by analysing human-environment relations such as land degradation, biodiversity, human migration and ecosystem services. The project uses existing publicly available datasets and publishes the datasets it produces as open-access, FAIR-compliant open data. During the project, datasets will be managed in the open-source catalogue CKAN with spatial extensions, which facilitates direct metadata and data access via API. After the project ends, selected results, including raw data, will be stored in the institutional data management platform DMP/DRP or published on PANGAEA for long-term storage. The researchers develop data analysis scripts in R. Scripts will be managed on GitHub and published via Zenodo following reproducible research approaches, with links to the well-documented open-source GitHub repository and to the datasets used. The data management focus of the project is on discipline-specific provenance and quality tracking for produced datasets, and on documentation for both collected and produced datasets. Therefore, all datasets will be described using a project-specific GeoDCAT metadata profile with linked PROV-O and Data Quality Vocabulary. Metadata in the GeoDCAT format will be extracted or tracked automatically by GeoKur-specific tools, or extended manually where automatic extraction or tracking is not possible. A specific quality register facilitates the curated management of quality measure descriptions.<br><br> Checklist for choosing a catalogue in GeoKur: <ul><li>Options for managing private and public datasets</li><li>Open-source, strong community, high number of extensions, several active instances available on the Web</li><li>Options to manage a discipline-specific metadata profile, here a project-specific GeoDCAT profile with extended provenance and quality information for geospatial datasets</li><li>Options to implement several metadata profiles, here for datasets, workflows and processes</li><li>Geospatial filter and extent visualisation</li><li>Options to manage different types of data, both geospatial and non-spatial</li><li>API to directly create, update or delete metadata and data via R scripts</li><li>Interface to link the project-specific and institutional catalogues</li><li>Options to link the catalogue with existing geospatial Web services, e.g. OGC WMS visualisation services</li></ul> |
|Context: Discipline, Funder, Template, Keywords | Discipline: Earth System Sciences or related disciplines mainly using geospatial data; Funder: not relevant for this guidance; Template: Science Europe, Section 5 - Data sharing |
| Consequences / Costs| Without a metadata catalogue for geodata, data discovery and data use will be limited and inefficient, in particular spatial filtering, search, and API-based use or update of data and metadata, e.g. in analysis scripts. Catalogues for geodata provide specific filter, search and preview functionality for geospatial datasets. Without a catalogue, datasets have to be indexed or tagged with their spatial extent using other tools or manually. If no Web-based preview is provided, the data must be downloaded before its fitness for use can be evaluated. Furthermore, metadata catalogues for geodata provide specific interfaces to automatically publish and update data in geospatial Web services, which allows researchers to use the catalogue as a central entry point for managing and publishing data. Without such a catalogue, publishing data in these Web services has to be done as a separate step. |
|Literature| <ul><li>Open-Source Geo-Catalogue GeoNetwork: https://geonetwork-opensource.org/</li><li>Open-Source Catalogue with several geospatial extensions: https://ckan.org/</li><li>Open Geospatial Consortium, Description of Catalogue Services: https://www.ogc.org/standards/cat</li><li>Scientific geodata infrastructures: challenges, approaches and directions: https://www.tandfonline.com/doi/full/10.1080/17538947.2013.781244</li><li>Handbook of Research on Geoinformatics, Chapter 5: Spatial Data Infrastructures: https://www.igi-global.com/gateway/book/470</li></ul> |
|Participants| Data manager, IT team|
|Related guidance| Metadata profiling for geospatial data; Open geospatial formats |
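
The checklist item "API to directly create, update or delete metadata and data" and the spatial filtering described above can be illustrated with a minimal sketch against a CKAN catalogue. It is written in Python rather than the R used in GeoKur, and it rests on several assumptions: the catalogue URL, dataset name and organisation are hypothetical; the `ext_bbox` search parameter and the `spatial` extra field are assumed to be provided by the ckanext-spatial extension and may differ between extension versions. The sketch only builds the request URL and payload; sending them (with an API token for write access) is left to the HTTP client of your choice.

```python
import json
from urllib.parse import urlencode

# Hypothetical catalogue address, used for illustration only.
CKAN_URL = "https://catalogue.example.org"


def check_bbox(bbox):
    """Validate a (min_lon, min_lat, max_lon, max_lat) box in WGS 84 degrees."""
    min_lon, min_lat, max_lon, max_lat = bbox
    if not (-180 <= min_lon <= max_lon <= 180 and -90 <= min_lat <= max_lat <= 90):
        raise ValueError(f"invalid bounding box: {bbox}")
    return min_lon, min_lat, max_lon, max_lat


def spatial_search_url(base_url, bbox, query="*:*", rows=10):
    """Build a CKAN package_search URL restricted to a bounding box.

    Assumes the catalogue runs ckanext-spatial, which is expected to
    accept the box as ext_bbox=min_lon,min_lat,max_lon,max_lat.
    """
    params = {
        "q": query,
        "ext_bbox": ",".join(str(v) for v in check_bbox(bbox)),
        "rows": rows,
    }
    return f"{base_url.rstrip('/')}/api/action/package_search?{urlencode(params)}"


def dataset_payload(name, title, bbox, owner_org, private=True):
    """Build a JSON-serialisable body for CKAN's package_create action.

    The spatial extent is stored as a GeoJSON polygon in an extra field
    named 'spatial' (assumed ckanext-spatial convention).
    """
    min_lon, min_lat, max_lon, max_lat = check_bbox(bbox)
    extent = {
        "type": "Polygon",
        "coordinates": [[
            [min_lon, min_lat], [max_lon, min_lat],
            [max_lon, max_lat], [min_lon, max_lat],
            [min_lon, min_lat],  # close the ring
        ]],
    }
    return {
        "name": name,
        "title": title,
        "owner_org": owner_org,
        "private": private,
        "extras": [{"key": "spatial", "value": json.dumps(extent)}],
    }


if __name__ == "__main__":
    box = (5.0, 47.0, 15.0, 55.0)  # illustrative extent
    print(spatial_search_url(CKAN_URL, box, query="land use"))
    print(json.dumps(dataset_payload("landuse-example", "Land use example",
                                     box, owner_org="example-project"), indent=2))
```

Keeping request construction separate from sending makes the spatial logic easy to check without a running catalogue; the same URL and payload structure could be produced from R scripts.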