Coscine Python SDK

This IPython notebook is a quick hands-on introduction to using the coscine Python module.

Index

  1. Installation
  2. API Token
  3. Import and initialization
  4. Logging and CLI output
  5. Working with Projects
  6. Working with Resources
  7. Working with Objects and Metadata
  8. Exceptions
  9. Documentation

Installation

Ensure you have Python and pip installed. How to install them depends on your platform, so consult the installation instructions for your operating system.

This module is hosted on the Python Package Index (PyPI). You can install or update it, together with all of its dependencies, via the Python package manager (pip):

python -m pip install --user --upgrade coscine

Depending on your installation, you may have to replace python with py or python3.


Creating an API token

You need an API token to use the Coscine API. If you have not already, create a new API token. Once you have an API token you are ready to use the API.

A word of advice

The token is sensitive data and grants anyone in possession of it full access to your data in Coscine. Do not leak it publicly on online platforms such as GitHub! Do not include it in your source code if you intend to upload that code to the internet. Take precautions and follow best practices to avoid corruption, theft or loss of data!

Using the API token

There are two simple and safe methods of using the API token without exposing it to unintended audiences.

Storing the token in a file and loading it when needed

Simply put your API token in a file on your hard drive and read the file when initializing the Coscine client. This keeps the token out of the source code and gives the user an easy way to switch between tokens by changing the filename.

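# Read the token from a file that is kept out of version control.
# strip() removes a trailing newline that editors often append.
with open("token.txt", "rt") as fd:
    token = fd.read().strip()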

However, this comes with the risk of accidentally leaking the token file together with the source code. Precautions must therefore be taken, e.g. when using git as a version control system. A .gitignore file covering every possible token file name or file extension should be mandatory. You could, for example, exclude the filename token.txt. A better way is to agree on a common token file extension such as .token and exclude that extension. Then you can safely push your code to online platforms such as GitLab or GitHub.
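As a quick sanity check before pushing, a script can confirm that a token file name matches one of the patterns in .gitignore. The following is a minimal sketch under simplifying assumptions: it matches bare file names with fnmatch and does not support gitignore features such as negation (!), directory-only rules or nested paths. The function name is_ignored is hypothetical.

```python
from fnmatch import fnmatch


def is_ignored(filename, gitignore_lines):
    """Return True if filename matches any simple .gitignore pattern.

    Simplified on purpose: blank lines, comments and negated patterns
    are skipped, and directory-specific rules are not supported.
    """
    for line in gitignore_lines:
        pattern = line.strip()
        if not pattern or pattern.startswith("#") or pattern.startswith("!"):
            continue
        if fnmatch(filename, pattern):
            return True
    return False


# Example .gitignore content that excludes token files
patterns = ["# secrets", "token.txt", "*.token"]
print(is_ignored("my_project.token", patterns))
print(is_ignored("main.py", patterns))
```

For an authoritative answer inside a real repository, git's own check-ignore command is the better tool; the sketch only illustrates the idea.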

Storing the token in an environment variable

This method does not rely on any files but on environment variables instead. Simply set an environment variable containing your token and read that variable from your Python program.

import os

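# Set environment variable (shown here for demonstration only; in practice,
# set it outside of your program so the token never appears in source code)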
os.environ["COSCINE_API_TOKEN"] = "My Token Value"

# Get environment variable
token = os.getenv("COSCINE_API_TOKEN")

This is a little more complex for users of your program. A token file can easily be shared with colleagues, whereas an environment variable has to be created by each user on their local machine.
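The two methods can also be combined. The sketch below checks the environment variable first and falls back to a token file. The function name load_token and the error message are hypothetical; the variable name COSCINE_API_TOKEN and the file name token.txt are taken from the examples above.

```python
import os
from pathlib import Path


def load_token(env_var="COSCINE_API_TOKEN", token_file="token.txt"):
    """Return the API token from the environment or, failing that, a file."""
    token = os.getenv(env_var)
    if token:
        return token.strip()
    path = Path(token_file)
    if path.is_file():
        # strip() removes surrounding whitespace such as a trailing newline
        return path.read_text(encoding="utf-8").strip()
    raise RuntimeError(
        f"No API token found: set {env_var} or create {token_file}"
    )
```

Calling token = load_token() then works for users of either method, and fails with a clear message if neither source is available.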

How to set environment variables temporarily or permanently depends on your operating system; consult its documentation for details.


Import

Before you can use this Python library, you have to import it in your source code.

import coscine

Initialization

Initializing the Coscine API client is done by calling the coscine.Client constructor:

coscine.Client(token: str, verbose: bool = False, lang: str = "en", persistent_cache: bool = False, loglevel)

The constructor takes one mandatory argument - the Coscine API token. A minimal usage example would thus look like this:

client = coscine.Client(token)

Note how we call the coscine.Client constructor with a variable called token. This variable should contain a string with your Coscine API token. You can set this variable by following one of the methods described in Using the API token.

The constructor also takes a few optional arguments, as shown in the signature above: verbose, lang, persistent_cache and loglevel.


Setup

Logging and CLI output

The Python SDK provides logging functionality for inspecting the data sent to and received from Coscine and for tracking upload and download progress.

You can enable/disable and configure it in the coscine.Client() constructor.

Setting verbose to True enables command line output. By default this option is disabled. Furthermore, you can specify which kinds of messages are printed by setting loglevel, which expects a list of strings containing any combination of the supported values. For example, loglevel = ["REQUEST", "DATA"] lets you inspect all incoming and outgoing HTTP traffic as well as the data being received and sent.
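One way to keep these settings in a single place is to collect them in a dictionary and unpack it into the constructor. This is only a sketch: the keyword names verbose and loglevel come from the constructor signature shown earlier, while the dictionary name client_options is made up for illustration. The constructor call itself is left as a comment because it needs the installed package and a valid token.

```python
# Logging options, using keyword names from the constructor signature
client_options = {
    "verbose": True,                  # enable command line output
    "loglevel": ["REQUEST", "DATA"],  # HTTP traffic plus transferred data
}

# With the package installed and a valid token, these would be passed as:
# client = coscine.Client(token, **client_options)
print(client_options)
```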


Working with Coscine Projects

Coscine is a project-based data management platform. A Project represents the central hub for all the data resources and the people involved in a scientific undertaking.

Get projects

Getting a list of projects is easy:

If we know which project we require beforehand, we can just query it by its name. We now get a single object of type Project, instead of a list:

Here we can see some more project metadata. When printing a project object, its metadata is output nicely formatted inside of a table.

The keys and values are human readable and easy to understand. Under the hood it's not that simple, and sometimes we might need to access the Coscine internal identifier for a certain field. In that case we can use the data dictionary, which grants us access to the metadata as seen by the Coscine server. Printing it yields a JSON representation of our project's metadata:

Delete a project

Once again, this is a plain and simple function call. Be aware, though, that it may fail due to insufficient privileges. The call may raise an error, which we should catch and handle:

Create a project

The Python module provides a simplified way of programmatically creating a project. All data formatting and error detection is performed under the hood.

Looking at the use of a dictionary, one might ask why we do not just use a function with named arguments for each dictionary field. Multiple benefits arise from the use of a dictionary:

The 2nd benefit ultimately enables easy inclusion in GUI applications.

Downloading a project

To download a project and all of the resources contained within the project, call the download method. You need to specify a storage location using the path argument, otherwise the current directory will be used.
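It can be convenient to create a dedicated target directory before downloading. Whether download creates missing directories itself is not stated here, so the sketch below creates the directory up front. The helper name prepare_download_dir and the directory names are hypothetical; the commented call assumes a project object obtained as described above and the path argument mentioned in the text.

```python
from pathlib import Path


def prepare_download_dir(base_dir, name):
    """Create the directory base_dir/name if needed and return its path."""
    target = Path(base_dir) / name
    target.mkdir(parents=True, exist_ok=True)
    return target


# Hypothetical usage, assuming `project` was obtained from the client:
# target = prepare_download_dir("downloads", "my-project")
# project.download(path=str(target))
```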

Member management

Inviting new members:


Working with Coscine Resources

Resources store all of your data and metadata. As such they represent a key data structure that you will interact with frequently.

Getting a list of resources

Analogous to getting a list of projects, you can get a list of resources. The only difference is that the resources() method is part of the project object and only queries resources contained within that project. Just like for projects, you can specify a filter to narrow the results by certain resource properties.

If we know which resource we require beforehand, we can just query it by its name. We now get a single object of type Resource, instead of a list:

Deleting a resource

Again, this does not work with our public token - you do not have privileges to delete our sample resource. Try deleting a resource you have created, but be careful not to delete anything of value.

Downloading a resource

Downloading a resource and all of the data contained within the resource is just as simple as downloading a project. In fact internally project.download() just calls Resource.download() for all resources contained within the project.

Resource Quota

You can fetch the used quota of a resource as an integer giving the size used in bytes.
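A raw byte count is hard to read for larger resources. The helper below is a small sketch that converts such an integer into a human-readable string using binary units. The function name format_bytes and the sample value are made up for illustration and are not part of the SDK.

```python
def format_bytes(size):
    """Convert a size in bytes to a human-readable string (binary units)."""
    units = ["B", "KiB", "MiB", "GiB", "TiB"]
    value = float(size)
    for unit in units:
        # Stop at the first unit that fits, or at the largest unit available
        if value < 1024 or unit == units[-1]:
            return f"{value:.1f} {unit}"
        value /= 1024


used = 1536  # placeholder value, not a real quota
print(format_bytes(used))  # prints "1.5 KiB"
```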

Resource Application profile

An application profile specifies a template for metadata. There may be times when you need to interact with that profile. To get the application profile of a resource, simply call the application_profile() method. You can either get the raw application profile in JSON-LD format or, by setting the parse argument to True, a parsed version that is more readable and easier to interact with.

Creating a resource

Once again we are using InputForms to set metadata of a Coscine object.

Getting S3 credentials

RDS-S3 resources can be accessed directly via an S3 client. Direct connections require S3 credentials, which S3 resource instances provide. This only works for S3 resources.


Working with Objects and Metadata

Files are stored inside resources. However, resources do not necessarily contain files; it depends on the resource type. RDS and RDS-S3 resources contain files, but Linked Data resources contain references to files. Therefore we cannot just talk about files, but have to use a more abstract term such as 'object'. Objects represent files and file-like instances in Coscine. Nonetheless, the way of interacting with files and file-like objects is always the same. Just don't expect file contents for objects of Linked Data resources, as those merely contain links or whatever has been specified as their content.

Getting a list of files

Downloading a file

Uploading a file

Deleting a file

Working with file metadata

We can interact with metadata using a MetadataForm.

The form fields change depending on the selected application profile for the resource.


Exceptions

The Python module defines a number of custom exceptions.

class CoscineException(Exception):
    """
    Coscine base Exception class.
    """
    pass

###############################################################################

class ConnectionError(CoscineException):
    """
    In case the client is not able to establish a connection with
    the Coscine servers, a ConnectionError is raised.
    """
    pass

###############################################################################

class ClientError(CoscineException):
    """
    An error has been made or detected on the client side.
    """
    pass

###############################################################################

class ServerError(CoscineException):
    """
    An error has been made or detected on the Coscine server.
    """
    pass

###############################################################################

class VocabularyError(CoscineException):
    """
    Raised in InputForms when a supplied value is not contained within
    a controlled vocabulary.
    """
    pass

###############################################################################

class RequirementError(CoscineException):
    """
    Commonly raised in InputForms when a required field has not been set.
    """
    pass

###############################################################################

class AuthorizationError(CoscineException):
    """
    AuthorizationErrors are thrown when the owner of the Coscine API
    token does not hold enough privileges.
    """
    pass

###############################################################################

class AmbiguityError(CoscineException):
    """
    An AmbiguityError is raised in cases where two objects could
    not be differentiated between.
    """
    pass

###############################################################################

class ParameterError(CoscineException):
    """
    Invalid (number of) function parameters provided. In some cases
    the user has the option of choosing between several optional arguments,
    but has to provide at least one.
    """
    pass

###############################################################################
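Because every custom exception derives from CoscineException, a specific error can be handled first and the base class used as a catch-all. The sketch below shows this pattern. To stay runnable without the package, it defines local stand-ins for two of the classes listed above; with the package installed, the real classes would be used instead. The functions delete_sample_project and run are hypothetical.

```python
# Local stand-ins mirroring two classes from the listing above, so that
# the example runs without the coscine package installed.
class CoscineException(Exception):
    pass


class AuthorizationError(CoscineException):
    pass


def delete_sample_project():
    """Hypothetical operation that fails due to missing privileges."""
    raise AuthorizationError("token owner may not delete this project")


def run(operation):
    """Call operation and translate Coscine errors into a status string."""
    try:
        operation()
    except AuthorizationError as err:
        # Most specific handler first
        return f"not permitted: {err}"
    except CoscineException as err:
        # Catch-all for every other error derived from the base class
        return f"coscine error: {err}"
    return "ok"


print(run(delete_sample_project))
```

The order of the except clauses matters: if the base class came first, it would also catch AuthorizationError and the specific handler would never run.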

Documentation

You can generate detailed documentation using pydoc.
pydoc is part of the Python standard library, so it does not need to be installed separately.
Install coscine with py -m pip install coscine.
Download the Coscine Python SDK repository as a zip archive or clone it with git.
Navigate to the coscine/doc directory inside of the repository and either use one of the supplied scripts or generate documentation yourself with py -m pydoc -w coscine, py -m pydoc -w coscine.project, etc.
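The same HTML files can also be generated from Python rather than from the command line. The sketch below uses pydoc.writedoc from the standard library, which writes into the current working directory. The helper name write_html_docs and the output directory are hypothetical, and the commented call assumes that the coscine package is installed.

```python
import os
import pydoc


def write_html_docs(module_names, out_dir):
    """Write one HTML file per module into out_dir using pydoc."""
    os.makedirs(out_dir, exist_ok=True)
    previous = os.getcwd()
    os.chdir(out_dir)  # pydoc.writedoc writes into the current directory
    try:
        for name in module_names:
            pydoc.writedoc(name)
    finally:
        # Always return to the original working directory
        os.chdir(previous)


# Hypothetical usage with the module names mentioned above:
# write_html_docs(["coscine", "coscine.project"], "doc")
```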