NFDI4Earth Community Interoperability Label
Description
We are designing evaluation criteria and an application process for an NFDI4Earth label. The label will provide information about the trustworthiness and compatibility of technical infrastructures (repositories, data services, etc.) within NFDI4Earth.
This package contains functions that retrieve information on repositories to create our Community Interoperability Label.
Installation and Development setup
Clone this repo and change into the directory, then:
python -m venv . # only once
source bin/activate
pip install -e .[dev]
If you want type annotations for dependencies:
mypy --install-types
pip install pandas-stubs types-requests
Note: The KH schema is available in the `nfdi4earth-kh-schema` repo.
Running the tests
pytest tests/
NOTE: If you get a 500 error ("Something went wrong, contact your admin") when running the tests (or the main of persist.py or create_testdata.py), an object most likely already exists in the database under a different ID while one of its fields has to be unique (such as the username for users). This is a known issue with the tests at the moment.
Required F-UJI version
This tool was tested with F-UJI version 3.2.0, so make sure to deploy the matching version of the service (e.g. `docker run -p 1071:1071 ghcr.io/pangaea-data-publisher/fuji:3.2.0`).
Make sure both the Knowledge Hub backend and F-UJI are running.
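As a quick sanity check, you can verify that the F-UJI container answers on the port used above. This is only a reachability probe under the assumption that you used the default port 1071 from the Docker command; it does not call any particular F-UJI endpoint.

```python
# Minimal reachability check for the local F-UJI service (port assumed from the
# docker run command above). Any HTTP response means the container is up.
import requests

try:
    resp = requests.get("http://localhost:1071", timeout=5)
    print(f"F-UJI service reachable (HTTP {resp.status_code})")
except requests.ConnectionError:
    print("F-UJI service not reachable -- is the Docker container running?")
```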
Interactions of the Label Software with other systems
In general, the primary data source is Cordra. When the Label Web App needs to access sensitive information (like details of an Assessment) or needs to modify something in Cordra, this is done via the backend (called Python API below). The backend checks the authorization of the frontend user (via the bearer token the client gets from Keycloak when the user logs into the frontend) and, if the check succeeds, performs the requested operation.
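To illustrate this flow (this is a sketch, not the actual implementation in `labelapi.py`), a FastAPI route can require the bearer token and validate it against Keycloak's userinfo endpoint. The route path, realm name, and Keycloak base URL below are assumptions.

```python
# Sketch only: how a FastAPI route might validate the Keycloak bearer token.
# The real checks live in the Label backend; realm, URL, and route path are assumptions.
import requests
from fastapi import Depends, FastAPI, HTTPException
from fastapi.security import HTTPAuthorizationCredentials, HTTPBearer

app = FastAPI()
bearer_scheme = HTTPBearer()
KEYCLOAK_USERINFO = "http://localhost:8081/realms/NFDI4EARTH/protocol/openid-connect/userinfo"

def current_user(creds: HTTPAuthorizationCredentials = Depends(bearer_scheme)) -> dict:
    """Ask Keycloak who the token belongs to; reject the request otherwise."""
    resp = requests.get(
        KEYCLOAK_USERINFO,
        headers={"Authorization": f"Bearer {creds.credentials}"},
        timeout=10,
    )
    if resp.status_code != 200:
        raise HTTPException(status_code=401, detail="Invalid or expired token")
    return resp.json()  # contains e.g. the username and configured attributes

@app.post("/assessments")  # hypothetical route name
def save_assessment(assessment: dict, user: dict = Depends(current_user)):
    # After the authorization check succeeds, the backend would write to Cordra here.
    return {"saved_by": user.get("preferred_username")}
```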
- Frontend
  - SelfAssessmentForm ====> Python API: save assessment to ====> Cordra
  - RequestRe3dataAssessmentForm ====> Python API: query TripleStore ====> TripleStore: return data on repo ====> Python API: perform assessment and save assessment to ====> Cordra
  - RequestFujiAssessmentForm ====> Python API: query Fuseki ====> TripleStore: return API URL ====> Python API: run F-UJI assessment ====> F-UJI: connect to repository API ====> Python API: save assessment to ====> Cordra
  - ApplicationForm ====> Python API ====> webmailer to send emails
  - All other pages that list repositories or public assessments: query data from ====> Cordra
  - All other pages that display a Label badge: connect to ====> Python API ====> sends link to badge generated via shields.io (a hypothetical example of such a badge URL is sketched after this list)
- Command line interface
  - `lbl_evalrepos`: connects to TripleStore to get data on the repo and its API URL (re3data sub assessment). Connects to TripleStore to get the API URL and to F-UJI if a suitable API URL is available (F-UJI sub assessment). Connects to Cordra to save assessments.
  - `lbl_runassessment`: connects to Cordra. Connects to TripleStore to get data on the repo (if asmtype==re3data). If asmtype==fuji, connects to TripleStore to get the API URL and connects to F-UJI to run the assessment.
  - `lbl_listrepos`: connects to Cordra to retrieve repos.
  - `lbl_listassessments`: connects to Python API, and that to Cordra to retrieve assessments.
  - `lbl_saveassessment`: connects to Python API, and that to Cordra to save the assessment.
  - `lbl_asmtemplate`: no interaction with external systems.
  - `lbl_listsubassessments`: connects to Python API, and that to Cordra to retrieve sub assessments.
  - `lbl_setsubassessmentcategory`: connects to Python API, and that to Cordra to modify the assessment.
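For the badge links mentioned above, here is a hypothetical sketch of how such a shields.io URL could be assembled. The label text, message, and colour are assumptions, not the values the backend actually uses.

```python
# Hypothetical construction of a shields.io static badge URL.
# Static badge format: https://img.shields.io/badge/<LABEL>-<MESSAGE>-<COLOR>
# Dashes and underscores inside label/message must be escaped ('--' and '__').
def badge_url(label: str, message: str, color: str) -> str:
    def esc(text: str) -> str:
        return text.replace("-", "--").replace("_", "__").replace(" ", "%20")
    return f"https://img.shields.io/badge/{esc(label)}-{esc(message)}-{color}"

print(badge_url("NFDI4Earth Label", "achieved", "brightgreen"))
# -> https://img.shields.io/badge/NFDI4Earth%20Label-achieved-brightgreen
```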
Running the Label API service
Backend
The backend is implemented in FastAPI. The Python package in this repository is named `nfdi4earth-label` (see the file `labelapi.py` for the API). It includes a command line interface to the Label and the FastAPI backend.
To run it, first start the F-UJI service (e.g., the F-UJI Docker image) and the Knowledge Hub backend (from the repo `knowledge-hub-backend-setup`). Then change the working directory to the root of the `nfdi4earth-label` repository.
uvicorn nfdi4earth_label.labelapi:app --reload
Note: the dependencies required to run the backend (like `uvicorn`) are part of the package dependencies (in `pyproject.toml`), so you should already have them.
Then use the API, e.g., connect with your web browser to 127.0.0.1:8000/repos to GET the list of repo IDs and repo names.
You can go to 127.0.0.1:8000/frontend/index.html to see the frontend.
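The same endpoint can of course be queried from a script. A small sketch using requests; the exact response structure beyond "repo IDs and names" is not specified here, so the output handling is kept generic.

```python
# Fetch the repository list from the locally running Label API.
import requests

resp = requests.get("http://127.0.0.1:8000/repos", timeout=10)
resp.raise_for_status()
repos = resp.json()
print(f"Received {len(repos)} entries")
print(repos)  # exact structure (IDs, names) depends on the backend's response model
```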
Frontend
Note: The old frontend is a React single-page application stored in its own repo `nfdi4earth-label-frontend`. It is now deprecated and should not be used anymore.
The old, standalone frontend application (or rather its React components) was integrated into the OneStop web app, in the `onestop4all-implementation` repository (the integration branch is named `feature/labelintegration`), so go there to find information on the Label frontend.
Using the command line interface
The Python package that implements the backend also comes with a command line interface. To use it, first make sure you are in the correct (virtual) environment, e.g., run `source bin/activate` if you are using `venv`. Then the following command line applications are available:
lbl_runassessment # run a specific sub assessment type (re3data, fuji) for a single repository.
lbl_evalrepos # run re3data (and optionally F-UJI) assessments for all repositories. See also lbl_runassessment to have more control and run for individual repositories. If you use lbl_runassessment, you do not need this.
lbl_listrepos # list all repos which are currently in the Cordra database in CSV format
lbl_listassessments # list all assessments which are currently in the Cordra database in CSV format
lbl_listsubassessments # list all sub assessments which are currently in the Cordra database in CSV format
lbl_saveassessment # test app and tool to manually save a sub assessment (of type 're3data', 'fuji', or 'selfassessment') that you have in a JSON file into the Cordra database, and assign it to a repository.
lbl_asmtemplate # get JSON-formatted string of assessment templates.
lbl_setsubassessmentcategory # set or unset a sub assessment (of type re3data, fuji, or selfasm) as official
Developer reminder: if you add a new CLI tool in `pyproject.toml`, you will have to run `pip install -e .` again.
Development Info
Quickly adding some test assessment for development
Use the command line tools like this if you want to add them for PANGAEA:
lbl_listrepos | grep PANGAEA # gives you the internal ID of PANGAEA, something like n4e/r3d-r3d100010134
Then generate some re3data (sub) assessments:
lbl_runassessment re3data n4e/r3d-r3d100010134 --store
lbl_runassessment re3data n4e/r3d-r3d100010134 --store # a 2nd one
And some F-UJI (sub) assessments:
lbl_runassessment fuji n4e/r3d-r3d100010134 --store --apiurl auto
And, finally, some (sub) assessments of type selfassessment:
If you just want to quickly add a positive and a negative one:
lbl_saveassessment selfassessment n4e/r3d-r3d100010134 # negative result
lbl_saveassessment selfassessment n4e/r3d-r3d100010134 --achieved # positive result
Alternatively, if you want to generate one, save it to a file and edit manually before uploading:
lbl_asmtemplate selfassessment --positive > myselfasm.txt # now edit myselfasm.txt in a text editor
# after editing, upload myselfasm.txt
lbl_saveassessment selfassessment n4e/r3d-r3d100010134 --afile myselfasm.txt
Accessing the cordra logs
To access the cordra logs, get the Docker ID of the cordra container, find the mount point, then access the logs.
docker ps # shows all running containers. get the docker ID of the cordra container.
docker inspect <dockerID> # browse output till you find 'Mounts' section. Find the mount and check the 'Source' entry for the path. Something like /var/snap/docker/common/var-lib-docker/volumes/knowledge-hub-backend-setup_knowledge-hub-cordra-volume/_data
Now view the logs:
sudo su
cd <cordra_volume_path>
ls logs/ # will contain something like `error.log-202407`, which is what you are looking for.
Resetting your cordra if the volume is outdated
Note that users you added manually and other changes will be gone, and you will have to re-create them.
In the `knowledge-hub-backend-setup` repo root:
cp variables.env.default variables.env # in case new ones were added
docker compose down # if running
docker volume ls # now find the exact cordra-keycloak volume name; we assume it is 'knowledge-hub-backend-setup_knowledge-hub-keycloak-pg-volume' in the next line.
docker volume rm knowledge-hub-backend-setup_knowledge-hub-keycloak-pg-volume
docker compose build # rebuild volume
docker compose up # start backend
# now set up the realm again.
cd devops/keycloak/
python setup_keycloak_realm.py
Adding a user for the Label web frontend and assigning to a repo
User with status repository representative for a repo
- make sure the Knowledge Hub backend Docker container is running
- go to the Keycloak admin interface in your browser at http://localhost:8081
- use the Keycloak admin password from the file `variables.env`
- important: select the realm `NFDI4EARTH` on the top left
- go to `Users`, create or select the user that should be able to perform assessments for a repository
- for the user, select `Attributes`. Add a new attribute with key `canAssessRepos`, and the value must be a repo ID, e.g. `n4e/r3d-r3d………`. Note: you can get a list of the IDs on the command line by activating the virtual environment of your `nfdi4earth-label` repo (the backend Label repo) and running the `lbl_listrepos` command. A hypothetical sketch of setting the attribute via the Keycloak Admin REST API instead of the web UI follows after this list.
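If you prefer scripting over the web UI, the same attribute can in principle be set via the Keycloak Admin REST API. The sketch below is not part of this repository's tooling; it assumes the admin credentials from `variables.env`, the realm and port from above, an existing user named `somerepouser`, and the example repo ID used elsewhere in this README.

```python
# Hypothetical sketch: set the canAssessRepos attribute via the Keycloak Admin REST API.
# Assumes Keycloak on localhost:8081, realm NFDI4EARTH, and an existing user.
import requests

KEYCLOAK = "http://localhost:8081"
REALM = "NFDI4EARTH"

# 1. Obtain an admin token (admin credentials come from variables.env).
token = requests.post(
    f"{KEYCLOAK}/realms/master/protocol/openid-connect/token",
    data={"grant_type": "password", "client_id": "admin-cli",
          "username": "admin", "password": "<admin password from variables.env>"},
    timeout=10,
).json()["access_token"]
headers = {"Authorization": f"Bearer {token}"}

# 2. Look up the user by username (assumed to exist already).
users = requests.get(f"{KEYCLOAK}/admin/realms/{REALM}/users",
                     params={"username": "somerepouser", "exact": "true"},
                     headers=headers, timeout=10).json()
user = users[0]

# 3. Add the attribute; the value must be a repo ID, e.g. one listed by lbl_listrepos.
attributes = user.get("attributes", {})
attributes["canAssessRepos"] = ["n4e/r3d-r3d100010134"]
requests.put(f"{KEYCLOAK}/admin/realms/{REALM}/users/{user['id']}",
             json={**user, "attributes": attributes}, headers=headers, timeout=10)
```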
User with status Label team member
Do as described above to create or select the user. Then:
- for the user, select `Attributes`. Add a new attribute with key `isLabelTeamMember`, and the value must be the string `yes`.
Making sure the schemas in Cordra and Fuseki are up-to-date
This application requires a running Knowledge Hub (Cordra+Fuseki).
The Label assessment information for a repository is saved back to the database, to an object of type `RepositoryEvalForLabel`. To get the schema for that object into Cordra, you currently have to use the `dev` branch of the `knowledge-hub-backend-setup` repo and run the script `knowledge-hub-backend-setup/devops/cordra/upsert_cordra_types.py` (after setting up the Cordra container, as described in the README of that repo).
Note: If the NFDI4Earth Knowledge Hub schema has changed since you installed the backend, you should make sure to update it as well. To do that, see the `nfdi4earth-kh-schema` repo. Follow the installation instructions in there, and check the Makefile once it is installed. You will need to run `make generate-json-schema-kh` and `make generate-jsonld-context-kh`. Do not forget to reinstall the local version of the package into the venv used by the backend.
What to do after changing something in the schema of RepositoryEvalForLabel objects
Overview
If you need new fields for an assessment or sub assessment, you will have to do three things:
- In the `knowledge-hub-backend-setup` repo, adapt the schema in `devops/cordra/RepositoryEvalForLabel.json`
- Adapt the backend code to work with the new schema
- Adapt the frontend code to work with the new schema
Schema adaptation step by step
- In the `knowledge-hub-backend-setup` repo, go to `devops/cordra/RepositoryEvalForLabel.json` and edit the file according to your new schema needs.
- To test your changes and make sure they contain no syntax errors, start the Docker container of the Knowledge Hub backend if it is not already running, and then:
  - First run the script `devops/cordra/upsert_cordra_types.py` to upload the schema to Cordra. See the README of the `knowledge-hub-backend-setup` repo for instructions on running the script (command line arguments). Watch for error messages. If uploading succeeds, the syntax is fine. A hypothetical local syntax check in Python is sketched after this list.
  - You may also want to delete all old instances during development, as they will no longer conform to the schema: `/delete_all_data.py --types RepositoryEvalForLabel --delete`. For a production instance, you will of course have to write code to migrate existing instances to the new schema.
  - You may want to add new instances and run the Label app to see that everything worked out. See the section on quickly adding test assessments (above) for tips.
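Before uploading, you can also do a quick local check that the edited file is still valid JSON and structurally a valid JSON Schema. A small sketch using the `jsonschema` package; the draft version Cordra expects is an assumption here.

```python
# Quick local sanity check of the edited schema file before uploading it to Cordra.
# Assumes the schema is a Draft-07 JSON Schema; adjust if Cordra expects another draft.
import json
from jsonschema import Draft7Validator

with open("devops/cordra/RepositoryEvalForLabel.json") as fh:
    schema = json.load(fh)  # fails loudly on JSON syntax errors

Draft7Validator.check_schema(schema)  # raises SchemaError if the schema itself is malformed
print("RepositoryEvalForLabel.json parses and looks like a valid JSON Schema")
```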