Datahub harvesting pipeline: mapping of metadata fields
The initial code to harvest aggregated metadata of ESS datasets from the search index of the Datahub is already implemented, see https://git.rwth-aachen.de/nfdi4earth/knowledgehub/kh-populator/-/blob/dev/kh_populator_domain/datahub.py?ref_type=heads#L8 (plus the related searcher in kh_populator/harvesting_searchers and the consumer in kh_populator/harvesting_consumers).
The metadata is retrieved as JSON documents from the endpoint https://o2a-data.de/index/rest/search, as returned by the underlying Elasticsearch index.
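As a rough illustration, a paged query against that endpoint could be built as sketched below. Note that this is an assumption: whether the endpoint accepts the standard Elasticsearch query DSL, and the exact parameter names, must be verified against a real response (the existing searcher in the repository is the authoritative reference).

```python
import json

# Assumption: the endpoint accepts a standard Elasticsearch query body.
SEARCH_ENDPOINT = "https://o2a-data.de/index/rest/search"

def build_search_query(page_size: int = 100, offset: int = 0) -> dict:
    """Build a match-all query body for paged harvesting (sketch only)."""
    return {
        "query": {"match_all": {}},
        "size": page_size,  # number of records per page
        "from": offset,     # paging offset into the result set
    }

# The body would then be POSTed to SEARCH_ENDPOINT by the harvester.
print(json.dumps(build_search_query(page_size=50, offset=100)))
```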
However, the function datahub_record_to_n4e currently maps/translates only a few basic fields; the following fields must also be included:
- title
- description
- download/Distribution->accessUrl
- spatialCoverage
- temporalCoverage
- keywords
- link to entry of re3data repository in the KH
- license
- authors
- publishing organization
- ...
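An extended mapping covering the fields above might look like the following minimal sketch. All source key names (e.g. `download`, `publishingOrganization`) are hypothetical and must be checked against a sample JSON document from the index; the re3data link in particular likely requires a KH lookup step rather than a plain field copy.

```python
def datahub_record_to_n4e_extended(record: dict) -> dict:
    """Sketch of the extended field mapping (source keys are assumptions)."""
    distribution = record.get("download") or {}
    return {
        "title": record.get("title"),
        "description": record.get("description"),
        # download -> Distribution/accessUrl; handle both a bare URL
        # string and a nested object (assumed shapes)
        "accessUrl": (
            distribution.get("url")
            if isinstance(distribution, dict)
            else distribution
        ),
        "spatialCoverage": record.get("spatialCoverage"),
        "temporalCoverage": record.get("temporalCoverage"),
        "keywords": record.get("keywords", []),
        # Placeholder: linking to the re3data repository entry in the KH
        # would need a resolution step against the KH, not a field copy.
        "re3dataRepository": record.get("repository"),
        "license": record.get("license"),
        "authors": record.get("authors", []),
        "publisher": record.get("publishingOrganization"),
    }
```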