  • SemanticSimilarity

    This repository contains the code for the semantic similarity implementation. It is designed so that multiple similarity methods can be used and compared against each other.

    A server implementation exposes the similarity computation as a service so that other applications can use it.

    The analysis data can be found in the Analysis folder.

    Installation

    Run:

    pip install -r requirements.txt

    Server

    To run SemanticSimilarity as a server, execute server.py with Python:

    python server.py

    Set the environment variable SEMANTICSIMILARITYPORT to specify the port and SEMANTICSIMILARITYHOST to specify the host.
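
    As a minimal sketch, the two variables can also be set from a Python launcher script instead of the shell. The helper names build_server_env and start_server are hypothetical and not part of this repository; the sketch only assumes that server.py sits in the current working directory.

```python
import os
import subprocess
import sys


def build_server_env(port=None, host=None, base_env=None):
    """Copy the environment and add the two variables named in this README."""
    env = dict(os.environ if base_env is None else base_env)
    if port is not None:
        env["SEMANTICSIMILARITYPORT"] = str(port)
    if host is not None:
        env["SEMANTICSIMILARITYHOST"] = host
    return env


def start_server(port=None, host=None):
    """Start server.py (assumed to be in the working directory) as a child process."""
    return subprocess.Popen(
        [sys.executable, "server.py"],
        env=build_server_env(port, host),
    )
```

    Calling start_server(port=8080, host="127.0.0.1") would then launch the server with example values; both values are placeholders to replace with your own.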

    Build

    This repository comes with build artifacts that contain an executable which starts the server.

    Example Usage

    Once the server is running, open the application in a browser and explore the Swagger interface.

    Example Request

    Once the server is set up and running, a POST request with the following JSON data yields results (here comparing two SHACL application profiles):

    {
      "graphs": [
        {
          "definition": "@prefix dcmitype: <http://purl.org/dc/dcmitype/> .\n@prefix dcterms: <http://purl.org/dc/terms/> .\n@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .\n@prefix sh: <http://www.w3.org/ns/shacl#> .\n@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .\n@prefix aps: <https://purl.org/coscine/ap/> .\n\n<https://purl.org/coscine/ap/radar#subject> sh:path dcterms:subject ;\n\tsh:order 3 ;\n\tsh:maxCount 1 ;\n\tsh:class <http://www.dfg.de/dfg_profil/gremien/fachkollegien/faecher/> ;\n\tsh:name \"Fachrichtung\"@de, \"Subject Area\"@en .\n\n<https://purl.org/coscine/ap/radar#created> sh:path dcterms:created ;\n\tsh:order 2 ;\n\tsh:maxCount 1 ;\n\tsh:name \"Production Date\"@en, \"Erstelldatum\"@de ;\n\tsh:minCount 1 ;\n\tsh:datatype xsd:date ;\n\tsh:defaultValue \"{TODAY}\" .\n\n<https://purl.org/coscine/ap/radar#creator> sh:path dcterms:creator ;\n\tsh:order 0 ;\n\tsh:maxCount 1 ;\n\tsh:name \"Ersteller\"@de, \"Creator\"@en ;\n\tsh:minCount 1 ;\n\tsh:datatype xsd:string ;\n\tsh:defaultValue \"{ME}\" ;\n\tsh:minLength 1 .\n\n<https://purl.org/coscine/ap/radar#rights> sh:path dcterms:rights ;\n\tsh:order 5 ;\n\tsh:maxCount 1 ;\n\tsh:name \"Rights\"@en, \"Berechtigung\"@de ;\n\tsh:datatype xsd:string .\n\n<https://purl.org/coscine/ap/radar#rightsHolder> sh:path dcterms:rightsHolder ;\n\tsh:order 6 ;\n\tsh:maxCount 1 ;\n\tsh:name \"Rightsholder\"@en, \"Rechteinhaber\"@de ;\n\tsh:datatype xsd:string .\n\n<https://purl.org/coscine/ap/radar#title> sh:path dcterms:title ;\n\tsh:order 1 ;\n\tsh:maxCount 1 ;\n\tsh:name \"Titel\"@de, \"Title\"@en ;\n\tsh:minCount 1 ;\n\tsh:datatype xsd:string ;\n\tsh:minLength 1 .\n\n<https://purl.org/coscine/ap/radar#type> sh:path dcterms:type ;\n\tsh:order 4 ;\n\tsh:maxCount 1 ;\n\tsh:class <http://purl.org/dc/dcmitype/> ;\n\tsh:name \"Resource\"@en, \"Ressource\"@de .\n\n<https://purl.org/coscine/ap/radar/> dcterms:rights \"Copyright © 2020 IT Center, RWTH Aachen University\" ;\n\tdcterms:title \"radar application profile\"@en ;\n\tdcterms:license <http://spdx.org/licenses/MIT> ;\n\tsh:targetClass <https://purl.org/coscine/ap/radar/> ;\n\tsh:closed true ;\n\tsh:property <https://purl.org/coscine/ap/radar#subject>, <https://purl.org/coscine/ap/radar#created>, <https://purl.org/coscine/ap/radar#creator>, <https://purl.org/coscine/ap/radar#rights>, <https://purl.org/coscine/ap/radar#rightsHolder>, <https://purl.org/coscine/ap/radar#title>, <https://purl.org/coscine/ap/radar#type>, [\n\t\tsh:path rdf:type ;\n\t] ;\n\tdcterms:publisher <https://itc.rwth-aachen.de/> ;\n\ta sh:NodeShape .",
          "type": "turtle"
        },
        {
          "definition": "@prefix dcmitype: <http://purl.org/dc/dcmitype/> .\n@prefix dcterms: <http://purl.org/dc/terms/> .\n@prefix prov: <http://www.w3.org/ns/prov#> .\n@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .\n@prefix sh: <http://www.w3.org/ns/shacl#> .\n@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .\n@prefix aps: <https://purl.org/coscine/ap/> .\n\n<https://purl.org/coscine/ap/8e9c8f97-8bf3-46c1-aba6-27777791d01d/> dcterms:license <http://spdx.org/licenses/MIT> ;\n\tdcterms:publisher <https://itc.rwth-aachen.de/> ;\n\tdcterms:rights \"Copyright © 2020 IT Center, RWTH Aachen University\" ;\n\tdcterms:title \"radar application profile\"@en ;\n\ta sh:NodeShape ;\n\tprov:wasDerivedFrom <https://purl.org/coscine/ap/radar/> ;\n\tsh:closed true ;\n\tsh:property <https://purl.org/coscine/ap/radar#created>, <https://purl.org/coscine/ap/radar#rights>, <https://purl.org/coscine/ap/radar#rightsHolder>, <https://purl.org/coscine/ap/radar#subject>, <https://purl.org/coscine/ap/radar#title>, <https://purl.org/coscine/ap/radar#type>, _:b50_c14n0 ;\n\tsh:targetClass <https://purl.org/coscine/ap/radar/> .\n\n<https://purl.org/coscine/ap/radar#created> sh:datatype xsd:date ;\n\tsh:defaultValue \"{TODAY}\" ;\n\tsh:maxCount 1 ;\n\tsh:minCount 1 ;\n\tsh:name \"Erstelldatum\"@de, \"Production Date\"@en ;\n\tsh:order 2 ;\n\tsh:path dcterms:created .\n\n<https://purl.org/coscine/ap/radar#rights> sh:datatype xsd:string ;\n\tsh:maxCount 1 ;\n\tsh:name \"Berechtigung\"@de, \"Rights\"@en ;\n\tsh:order 5 ;\n\tsh:path dcterms:rights .\n\n<https://purl.org/coscine/ap/radar#rightsHolder> sh:datatype xsd:string ;\n\tsh:maxCount 1 ;\n\tsh:name \"Rechteinhaber\"@de, \"Rightsholder\"@en ;\n\tsh:order 6 ;\n\tsh:path dcterms:rightsHolder .\n\n<https://purl.org/coscine/ap/radar#subject> sh:maxCount 1 ;\n\tsh:name \"Fachrichtung\"@de, \"Subject Area\"@en ;\n\tsh:order 3 ;\n\tsh:path dcterms:subject ;\n\tsh:class <http://www.dfg.de/dfg_profil/gremien/fachkollegien/faecher/> .\n\n<https://purl.org/coscine/ap/radar#title> sh:datatype xsd:string ;\n\tsh:maxCount 1 ;\n\tsh:minCount 1 ;\n\tsh:name \"Titel\"@de, \"Title\"@en ;\n\tsh:order 1 ;\n\tsh:path dcterms:title ;\n\tsh:minLength 1 .\n\n<https://purl.org/coscine/ap/radar#type> sh:maxCount 1 ;\n\tsh:name \"Resource\"@en, \"Ressource\"@de ;\n\tsh:order 4 ;\n\tsh:path dcterms:type ;\n\tsh:class <http://purl.org/dc/dcmitype/> .\n\n",
          "type": "turtle"
        }
      ],
      "methods": [
        "FilterStructureSimplerJaccardSimilarity"
      ]
    }
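
    The request above could be assembled and sent from Python roughly as follows. This is a sketch using only the standard library. This README does not state the endpoint path, so the url argument is left to the caller (the Swagger interface should list the actual route), and the function names build_request_body and post_similarity_request are hypothetical.

```python
import json
import urllib.request


def build_request_body(turtle_documents, methods):
    """Assemble a body with the 'graphs' and 'methods' keys shown in the example."""
    return {
        "graphs": [
            {"definition": text, "type": "turtle"} for text in turtle_documents
        ],
        "methods": list(methods),
    }


def post_similarity_request(url, body, timeout=30):
    """POST the body as JSON to the given URL and return the decoded JSON response.

    The URL must include the endpoint path, which is not documented here.
    """
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

    Passing the two Turtle strings as a list, together with ["FilterStructureSimplerJaccardSimilarity"], reproduces the structure of the example request.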

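    The response contains one entry per requested method. Despite its name, graphDistances appears to hold similarity scores rather than distances (each graph scores 1 against itself), so the two profiles here score about 0.88 against each other: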

    [
      {
        "method": "FilterStructureSimplerJaccardSimilarity",
        "returnObject": {
          "similarity_matrix": {
            "fileDistances": [],
            "graphDistances": [
              [
                1,
                0.880920060331825
              ],
              [
                0.880920060331825,
                1
              ]
            ]
          },
          "time_spent": 0.012035846710205078
        }
      }
    ]
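
    To work with such a response in code, the matrix can be reduced to one score per pair of graphs. The sketch below assumes the response always has the shape shown above and that the matrix is symmetric, which the single example suggests but does not guarantee; pairwise_similarities is a hypothetical helper, not part of this repository.

```python
def pairwise_similarities(results):
    """Map each method name to {(i, j): score} for every graph pair with i < j."""
    scores = {}
    for entry in results:
        matrix = entry["returnObject"]["similarity_matrix"]["graphDistances"]
        scores[entry["method"]] = {
            (i, j): matrix[i][j]
            for i in range(len(matrix))
            for j in range(i + 1, len(matrix))
        }
    return scores


# The example response from this README, copied verbatim.
example_results = [
    {
        "method": "FilterStructureSimplerJaccardSimilarity",
        "returnObject": {
            "similarity_matrix": {
                "fileDistances": [],
                "graphDistances": [
                    [1, 0.880920060331825],
                    [0.880920060331825, 1],
                ],
            },
            "time_spent": 0.012035846710205078,
        },
    }
]

summary = pairwise_similarities(example_results)
```

    Here summary maps the method name to a single pair, (0, 1), holding the score of the first graph against the second.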

    Docker

    To use this repository with Docker, run the following commands (replacing {yourPort} with the host port of your choice):

    docker build -t semantic-similarity .
    docker run --publish {yourPort}:36543 semantic-similarity