This repository contains the code developed for the master's thesis "An Efficient Semantic Search Engine for Research Data in an RDF-based Knowledge Graph". It implements a search engine for RDF-based metadata graphs, built on Virtuoso, Elasticsearch, and the data mapping developed in the thesis.
Code documentation:
The code documentation can be found under ./docs/index.html.
Setup/Prerequisites:
- Virtuoso
  - Virtuoso.ini file
    - NumberOfBuffers >= 660000
    - MaxDirtyBuffers >= 495000
  - Define the following graphs as rulesets named "ruleset" for inference rules with `rdfs_rule_set ('<rulesetname>', '<graphname>') ;`
  - Data mentioned in "Sample Dataset for Search Engine Evaluation for Research Data in an RDF-based Knowledge Graph"
    - All named_graphs, use virtuoso.db/graphs.db
- Elasticsearch
- Specified packages
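The buffer requirements above can be checked before starting Virtuoso. The following is a minimal sketch, not part of this repository: it assumes the two settings live in the `[Parameters]` section of Virtuoso.ini, and the function name `check_buffer_settings` is hypothetical.

```python
import configparser

# Minimum values taken from the prerequisites listed above.
REQUIRED_MINIMUMS = {"NumberOfBuffers": 660000, "MaxDirtyBuffers": 495000}


def check_buffer_settings(ini_text):
    """Return a list of problems; the list is empty if both minimums are met."""
    # strict=False tolerates repeated keys, interpolation=None leaves '%' alone.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(ini_text)
    problems = []
    for key, minimum in REQUIRED_MINIMUMS.items():
        # Assumption: both settings are defined in the [Parameters] section.
        raw = parser.get("Parameters", key, fallback=None)
        if raw is None:
            problems.append(f"{key} is not set in [Parameters]")
            continue
        try:
            value = int(raw)
        except ValueError:
            problems.append(f"{key} = {raw} is not an integer")
            continue
        if value < minimum:
            problems.append(f"{key} = {value} is below the required {minimum}")
    return problems
```

To check a real file, read it first and pass its contents, for example `check_buffer_settings(open("virtuoso.ini").read())`; the path depends on your Virtuoso installation.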
How to use:
- (Re-)Build solution (Install/restore necessary packages)
- Start Virtuoso
- Start Elasticsearch
- Run Coscine.SemanticSearch.Cmd.exe with the necessary arguments
- Find your index name under http://localhost:9200/_cat/indices/
- Preview your index at http://localhost:9200/<INDEXNAME>/
- Preview the data of your index at http://localhost:9200/<INDEXNAME>/_search
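The three Elasticsearch URLs above can also be built and fetched from a script. This is a minimal sketch using only the Python standard library; the helper names `elasticsearch_urls` and `fetch_text` are hypothetical, and the defaults assume Elasticsearch runs on localhost:9200 as in the steps above.

```python
from urllib.parse import quote
from urllib.request import urlopen


def elasticsearch_urls(index_name, host="localhost", port=9200):
    """Build the URLs from the steps above for a given index name."""
    base = f"http://{host}:{port}"
    index = quote(index_name, safe="")  # escape characters that are unsafe in a path
    return {
        "indices": f"{base}/_cat/indices/",   # lists all index names
        "index": f"{base}/{index}/",          # index settings and mappings
        "search": f"{base}/{index}/_search",  # documents stored in the index
    }


def fetch_text(url, timeout=10):
    """Return the response body as text; needs a running Elasticsearch."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With Elasticsearch running, `print(fetch_text(elasticsearch_urls("<INDEXNAME>")["indices"]))` prints the index list; replace `<INDEXNAME>` with the name found there before requesting the other two URLs.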
Command-line arguments:
Option | Description |
---|---|
-a, --action | Required. Possible actions: search, reindex, index, delete or add |
-m | (Default: http://localhost:8890/sparql) URL of the SPARQL endpoint |
-e | (Default: localhost) Server name of Elasticsearch |
--ep | (Default: 9200) Port of Elasticsearch |
-d, --doc | ID of metadata graph |
-q, --query | (Default: *) Elasticsearch query |
--adv | (Default: false) Set true for advanced Elasticsearch search syntax |
-u, --user | (Default: empty) User to search as; if no user is specified, only public metadata records can be found |
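
The options in the table can be assembled into a call from a script. The following is a minimal sketch, not part of this repository: `build_command` is a hypothetical helper, and it assumes that each option takes its value as the next argument (for example `--query temperature`) and that the executable is reachable under the name shown.

```python
import subprocess

# Actions listed for -a/--action in the table above.
ALLOWED_ACTIONS = {"search", "reindex", "index", "delete", "add"}


def build_command(action, query=None, doc=None, user=None,
                  exe="Coscine.SemanticSearch.Cmd.exe"):
    """Build the argument list for the command-line tool."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown action: {action}")
    command = [exe, "--action", action]
    # Assumption: option values are passed as separate arguments.
    if query is not None:
        command += ["--query", query]
    if doc is not None:
        command += ["--doc", doc]
    if user is not None:
        command += ["--user", user]
    return command


def run(command):
    """Run the tool; raises CalledProcessError on a non-zero exit code."""
    return subprocess.run(command, check=True)
```

For example, `run(build_command("search", query="temperature"))` would start a search with the default SPARQL and Elasticsearch settings; omitting `user` means only public metadata records can be found.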