Commit e8ca5086 authored by Julius

deleted some doc, updated gitignore

parent c79a1229
1 merge request: !19 completed four issues, added heavy support for adding metadata, revised part...
......@@ -14,6 +14,7 @@ __pycache__
!/.gitignore
!/make.bat
!/Makefile
!/CONTRIBUTORS.md
!/pyproject.toml
!/requirements.txt
......
## Original Authors
- Michaela Leštáková (Idea and initial implementation)
- Ning Xia (Idea and initial implementation)
## Contributors
- Jan Groen (first build after prototype as part of a bachelor project)
- Johnas Jahnel (first build after prototype as part of a bachelor project)
- Julius Florstedt (first build after prototype as part of a bachelor project, expanded first build as scientific staff)
- Max Troppmann (first build after prototype as part of a bachelor project)
- Thomas Wu (first build after prototype as part of a bachelor project)
\ No newline at end of file
Serializing 3D-Plots
===========================================
PlotSerializer only supports initializing the figure and axes via the ``subplots`` method.
The following two-step initialization for drawing a 3D plot is not supported:
.. code-block:: python
import matplotlib.pyplot as plt
fig = plt.figure()
ax = fig.add_subplot(projection='3d')
x = [1,2,3]
y = [3,2,4]
z = [4,4,4]
ax.scatter(x,y,z)
plt.show()
Instead, pass the projection to the ``subplots`` method via ``subplot_kw``:
.. code-block:: python
import matplotlib.pyplot as plt
from plot_serializer.matplotlib.serializer import MatplotlibSerializer
serializer = MatplotlibSerializer()
fig, ax = serializer.subplots(subplot_kw={"projection": "3d"})
x = [1,2,3]
y = [3,2,4]
z = [4,4,4]
ax.scatter(x,y,z)
plt.show()
Deserializer
===================
Matplotlib-Deserializer
---------------------------------
PlotSerializer also provides the functionality of converting the JSON file back into a diagram.
......@@ -20,18 +21,6 @@ We deserialize the JSON file created above as follows:
Hint for Jupyter Notebook users: Calling ``plt.show`` is unnecessary because the ``deserialize_from_json_file`` function returns a figure, which is rendered automatically.
Serializer -> JSON/RO Export -> Deserializer
RO-Crates
---------------------------------
PlotSerializer is able to integrate with `RO-Crates <https://www.researchobject.org/ro-crate/>`_.
This means that you can add the serialized diagrams to an RO-Crate as a file with appropriate metadata.
You can accomplish this through the ``add_to_ro_crate()``-Method.
The first argument is the file path to the ro-crate directory and the second argument is the location where the file should be placed within the crate.
When the specified RO-Crate does not exist, a new one is created (this can also be controlled through the ``create`` parameter).
PlotSerializer will try to infer an appropriate name for the object; you can also set it explicitly with the ``name`` parameter.
Getting Started
===============
Installation
------------
Install PlotSerializer by running
.. code-block:: bash
pip install plot-serializer
Serializing your first plot
---------------------------
We will serialize an example matplotlib plot that we have created as follows:
.. code-block:: python
import numpy as np
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
np.random.seed(19680801)
X = np.round(np.linspace(0.5, 3.5, 100), 3)
Y1 = 3 + np.cos(X)
Y2 = 1 + np.cos(1 + X / 0.75) / 2
ax.set_xlim(0, 4)
ax.set_ylim(0, 4)
ax.tick_params(which="major", width=1.0, length=10, labelsize=14)
ax.tick_params(which="minor", width=1.0, length=5, labelsize=10, labelcolor="0.25")
ax.grid(linestyle="--", linewidth=0.5, color=".25", zorder=-10)
ax.plot(X, Y1, c="C0", lw=2.5, label="Blue signal", zorder=10)
ax.plot(X, Y2, c="C1", lw=2.5, label="Orange signal")
ax.set_title("Example figure", fontsize=20, verticalalignment="bottom")
ax.set_xlabel("TIME in s", fontsize=14)
ax.set_ylabel("DISTANCE in m", fontsize=14)
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
ax.set_ylim([40,0])
ax.legend(loc="upper right", fontsize=14)
plt.show()
Of particular interest are the two following lines:
.. code-block:: python
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
To collect the data from the plot we want to serialize, we first need to create a ``Serializer`` object.
Specifically, since we're looking at matplotlib in this case, we need to create a ``MatplotlibSerializer``.
The ``MatplotlibSerializer`` also exposes a ``subplots()``-Method.
This way the ``Serializer`` is able to capture everything you do with the returned objects.
In concrete terms, we replace the two lines above with the following code:
.. code-block:: python
from plot_serializer.matplotlib.serializer import MatplotlibSerializer
serializer = MatplotlibSerializer()
fig, ax = serializer.subplots()
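To make the idea of capturing calls on the returned objects more concrete, here is a minimal, self-contained sketch of a recording proxy. It only illustrates the general technique and is not PlotSerializer's actual implementation; the names ``RecordingProxy`` and ``FakeAxes`` are hypothetical.

.. code-block:: python

    class FakeAxes:
        """Hypothetical stand-in for a matplotlib Axes object."""

        def plot(self, x, y, label=None):
            return f"line with {len(x)} points"


    class RecordingProxy:
        """Forwards method calls to a wrapped object and records them."""

        def __init__(self, wrapped):
            self._wrapped = wrapped
            self.calls = []

        def __getattr__(self, name):
            # Only reached for attributes not defined on the proxy itself.
            target = getattr(self._wrapped, name)
            if not callable(target):
                return target

            def recorder(*args, **kwargs):
                # Remember the call, then forward it unchanged.
                self.calls.append((name, args, kwargs))
                return target(*args, **kwargs)

            return recorder


    ax = RecordingProxy(FakeAxes())
    result = ax.plot([1, 2, 3], [3, 2, 4], label="demo")
    print(ax.calls)

In this sketch only calls made through the proxy are recorded; anything done directly on the value returned by ``plot`` bypasses the record.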
Finally, to get the resulting JSON string, we can invoke the ``to_json()``-Method on the serializer:
.. code-block:: python
serializer.to_json()
We can also write the plot to a file directly:
.. code-block:: python
serializer.write_json_file("test_plot.json")
Adding custom metadata
----------------------------------------
If you have data that cannot be plotted or serialized, PlotSerializer still lets you add it to the JSON file.
Metadata can be added at five levels of the JSON hierarchy:
the entire figure, individual plots, their axes (if present), their traces, and the traces' datapoints/slices/boxes. For an introduction to the hierarchy, see Overview.
Metadata is always added via a dict parameter.
A full example:
.. code-block:: python
from plot_serializer.matplotlib.serializer import MatplotlibSerializer
import logging
# set logging to info to get feedback how many traces/datapoints get selected
logging.basicConfig(level=logging.INFO)
serializer = MatplotlibSerializer()
_, ax = serializer.subplots()
x = [1, 4]
y = [7, 4]
z = [10, 4]
ax.plot(x, y)
ax.plot(y, z)
serializer.add_custom_metadata_figure({'date_created' : "10.01.2023"})
serializer.add_custom_metadata_plot({'group_traces' : "data for longevity in mice"})
serializer.add_custom_metadata_axis({'axis_information' : "link to unit: example_unit.html"}, axis="y")
serializer.add_custom_metadata_trace({'collected_data' : "from 08.01.2023"}, trace_selector=0)
serializer.add_custom_metadata_datapoints(
{'information' : "the data of this point might be faulty"}, trace_selector=0, point_selector= 1
)
To see where each piece of metadata is added, take a look at the JSON output:
.. image:: static/custom_metadata_example.png
:width: 400
:alt: JSON file with custom metadata
**Selecting Traces:**
There are two ways to select traces: by index and by distance.
Selecting by index is done by passing an integer as ``trace_selector``. It selects the trace corresponding to the i-th plot plotted.
Selecting by distance is done by passing a tuple as ``trace_selector`` together with a relative tolerance (``trace_rel_tolerance``). It selects the traces that have a datapoint near the given point.
.. code-block:: python
serializer.add_custom_metadata_trace(
{'data1' : "from 08.01.2023"}, trace_selector=0
)
serializer.add_custom_metadata_trace(
{'data2' : "from 17.07.2023"}, trace_selector=(3,3), trace_rel_tolerance=0.0001
)
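The exact matching rule used for distance-based selection is not spelled out here. As a rough mental model, the sketch below assumes that a datapoint counts as "near" when every coordinate lies within the relative tolerance of the target, using ``math.isclose``. The helper ``select_traces_near`` is hypothetical and not part of PlotSerializer.

.. code-block:: python

    import math


    def select_traces_near(traces, target, rel_tol):
        """Return indices of traces with at least one datapoint near target.

        Assumption: "near" means every coordinate is within rel_tol
        (relative tolerance) of the corresponding target coordinate.
        """
        selected = []
        for index, points in enumerate(traces):
            for point in points:
                if all(
                    math.isclose(p, t, rel_tol=rel_tol)
                    for p, t in zip(point, target)
                ):
                    selected.append(index)
                    break  # one matching datapoint is enough for this trace
        return selected


    # Two traces given as lists of (x, y) datapoints.
    traces = [
        [(1, 7), (4, 4)],
        [(7, 10), (4, 4.001)],
    ]
    tight = select_traces_near(traces, (4, 4), rel_tol=0.0001)
    loose = select_traces_near(traces, (4, 4), rel_tol=0.001)
    print(tight, loose)

Under this assumption a tight tolerance selects only the trace that contains the exact point, while a looser one also picks up the trace with a nearby datapoint.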
**Selecting Points:**
Points are selected similarly to traces: by index or by distance. You can also narrow down the traces to search using the rules given in the paragraph above.
Selecting by index is done by passing an integer as ``point_selector``. It selects the datapoint corresponding to the index of your input data.
Pie plot slices, bar plot bars, boxplot boxes, and histogram datasets are also treated as "points" in this regard and can be supplemented with metadata.
Selecting by distance only works for datapoints whose coordinates are all numeric, as in scatter, line, and surface plots.
.. code-block:: python
serializer.add_custom_metadata_datapoints(
{'info1' : "point might be faulty"}, trace_selector=0, point_selector= 1
)
serializer.add_custom_metadata_datapoints(
{'info2' : "point might be faulty"}, trace_selector=(1,1), trace_rel_tolerance=0.2, point_selector= 1
)
serializer.add_custom_metadata_datapoints(
{'info3' : "point might be faulty"}, trace_selector=0, point_selector=(4,4), point_rel_tolerance=0.0001
)
serializer.add_custom_metadata_datapoints(
{'info4' : "point might be faulty"}, trace_selector=(1,1), trace_rel_tolerance=0.2, point_selector= (4,4), point_rel_tolerance= 0.1
)
# specifying no trace searches across all traces
serializer.add_custom_metadata_datapoints({'info1' : "point might be faulty"}, point_selector= 1)
What does and does not get serialized?
----------------------------------------
PlotSerializer always extracts the main data of the plot. Additional supported parameters are listed in this documentation, see "Supported Plot Types".
Note that parameters used only to make the diagram more appealing are not extracted by PlotSerializer and may even distort the data in the JSON file.
We therefore recommend running PlotSerializer once with your raw data first and adding all styling choices to the plot afterwards.
Similarly, avoid modifying the diagram anywhere other than in the creation methods, such as ``plot``, ``pie``, and ``scatter``.
For example, if you take the objects returned by these methods and call further functions on them to modify their attributes,
PlotSerializer will not pick up the change, and it will be ignored.
Besides the data of the plots, the title, labels, and scales of the axes are serialized, as well as the title of the whole figure that combines them.
Integrating with RO-Crates
--------------------------
PlotSerializer is able to integrate with `RO-Crates <https://www.researchobject.org/ro-crate/>`_.
This means that you can add the serialized diagrams to an RO-Crate as a file with appropriate metadata.
You can accomplish this through the ``add_to_ro_crate()``-Method.
The first argument is the file path to the ro-crate directory and the second argument is the location where the file should be placed within the crate.
When the specified RO-Crate does not exist, a new one is created (this can also be controlled through the ``create`` parameter).
PlotSerializer will try to infer an appropriate name for the object; you can also set it explicitly with the ``name`` parameter.
.. code-block:: python
serializer.add_to_ro_crate("crate", "my-plot-2.json")
Deserializing a plot from JSON
------------------------------
PlotSerializer also provides the functionality of converting the JSON file back into a diagram.
Only serialized attributes can influence the deserialized plot,
so the created graph might look slightly different from the original; the data, however, remains unchanged.
We deserialize the JSON file created above as follows:
.. code-block:: python
from plot_serializer.matplotlib.deserializer import deserialize_from_json_file
import matplotlib.pyplot as plt
fig = deserialize_from_json_file("test_plot.json")
plt.show()
Hint for Jupyter Notebook users: Calling ``plt.show`` is unnecessary because the ``deserialize_from_json_file`` function returns a figure, which is rendered automatically.
Introduction
==========================================
Overview
---------------------------------
PlotSerializer helps researchers and scientists of all kinds to store research data cleanly.
......@@ -60,27 +63,13 @@ In concrete terms, we replace the two lines above with the following code:
serializer = MatplotlibSerializer()
fig, ax = serializer.subplots()
Finally, to get the resulting JSON string, we can invoke the ``to_json()``-Method on the serializer:
.. code-block:: python
serializer.to_json()
We can also write the plot to a file directly:
Finally, to get the resulting JSON file, we can invoke the ``write_json_file()``-Method on the serializer:
.. code-block:: python
serializer.write_json_file("test_plot.json")
How PlotSerializer sees diagrams
---------------------------------
......
Overview
========
Why PlotSerializer?
---------------------------------
PlotSerializer helps researchers and scientists of all kinds to store research data cleanly.
Specifically, the aim is to convert the raw data behind graphs in scientific publications into a machine-readable format as easily as possible.
In the case of a scientific paper, for example, the data can be published directly together with the paper so that it can be used later by other researchers.
In a broader sense, this also helps prevent studies that cannot be reproduced (keyword: reproducibility crisis).
Access to the raw data of research gives scientists who want to build on existing work a much deeper insight into the original findings.
How PlotSerializer sees diagrams
---------------------------------
PlotSerializer uses its own data model for representing scientific diagrams.
The base class for this data model is ``plot_serializer.model.Figure``.
A full JSON Schema for this model is available in this documentation as well.
The basics are illustrated by the following diagram:
.. image:: static/data_structure.svg
:width: 800
:alt: PlotSerializer data structure
Serializer
==========================================
Serialized parameters
----------------------------------------
......@@ -83,9 +86,6 @@ Optional:
* norm
* marker
3D-Plots
---------------------------------
**3D-Line**:
* x
* y
......@@ -195,3 +195,60 @@ Selecting by distance is only viable for datapoints where all axes units are num
)
# specifying no trace searches across all traces
serializer.add_custom_metadata_datapoints({'info1' : "point might be faulty"}, point_selector= 1)
Serializing 3D-Plots
---------------------------------
PlotSerializer only supports initializing the figure and axes via the ``subplots`` method.
The following two-step initialization for drawing a 3D plot is not supported:
.. code-block:: python
import matplotlib.pyplot as plt
fig = plt.figure()
ax = fig.add_subplot(projection='3d')
x = [1,2,3]
y = [3,2,4]
z = [4,4,4]
ax.scatter(x,y,z)
plt.show()
Instead, pass the projection to the ``subplots`` method via ``subplot_kw``:
.. code-block:: python
import matplotlib.pyplot as plt
from plot_serializer.matplotlib.serializer import MatplotlibSerializer
serializer = MatplotlibSerializer()
fig, ax = serializer.subplots(subplot_kw={"projection": "3d"})
x = [1,2,3]
y = [3,2,4]
z = [4,4,4]
ax.scatter(x,y,z)
plt.show()
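In plain matplotlib, both initialization styles yield the same kind of 3D axes, so switching to the ``subplots`` form should not change the plot itself. The sketch below checks this with matplotlib alone (it does not use PlotSerializer); the ``Agg`` backend is chosen only so that the example runs without a display.

.. code-block:: python

    import matplotlib

    matplotlib.use("Agg")  # non-interactive backend, no window needed
    import matplotlib.pyplot as plt

    # Two-step initialization: create the figure, then add 3D axes.
    fig_a = plt.figure()
    ax_a = fig_a.add_subplot(projection="3d")

    # One-step initialization: pass the projection through subplot_kw.
    fig_b, ax_b = plt.subplots(subplot_kw={"projection": "3d"})

    # Both axes report the "3d" projection name.
    print(ax_a.name, ax_b.name)

    plt.close("all")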
Output
---------------------------------
**JSON:**
The JSON string can be accessed via the ``to_json()``-Method on the serializer or written to a file via the ``write_json_file()``-Method.
.. code-block:: python
serializer.to_json()
serializer.write_json_file("test_plot.json")
**RO-Crate:**
PlotSerializer is able to integrate with `RO-Crates <https://www.researchobject.org/ro-crate/>`_.
This means that you can add the serialized diagrams to an RO-Crate as a file with appropriate metadata.
You can accomplish this through the ``add_to_ro_crate()``-Method.
The first argument is the file path to the ro-crate directory and the second argument is the location where the file should be placed within the crate.
When the specified RO-Crate does not exist, a new one is created (this can also be controlled through the ``create`` parameter).
PlotSerializer will try to infer an appropriate name for the object; you can also set it explicitly with the ``name`` parameter.
Supported Plot Types
===========================================
Plot Serializer currently supports the following plot types. Supported arguments that will get serialized are noted below.
See `here <https://matplotlib.org/stable/plot_types/index.html>`_ for an explanation of these parameters.
1D/2D Plots
---------------------------------
**Line**:
* x
* y
Optional:
* label
* linestyle
* linewidth
* color, given as a string or a rgb/rgba tuple
* marker
**Pie**:
* x
Optional:
* labels
* explode
* color, given as a list of strings
**Bar**:
* x
* height
Optional:
* color, given as a string, a rgb/rgba tuple or an array of the former
**Boxplot**:
* x
Optional:
* tick_labels
* notch
* whis
* bootstrap
* usermedians
* conf_intervals
**ErrorBar**:
* x
* y
Optional:
* xerr
* yerr
* color
* ecolor
* label
* marker
**Histogram**:
* x
Optional:
* bins
* label
* color
* density
* cumulative
**2D-Scatter**:
* x
* y
Optional:
* label
* s
* c
* cmap
* norm
* marker
Note that the scatter plot has increased support for colors. The following input types are allowed:
* string
* list of strings
* list of rgb/rgba tuples
* list of scalar values in combination with a cmap and normalization
3D-Plots
---------------------------------
**3D-Line**:
* x
* y
Optional:
* label
* color as a string, or rgb/rgba tuple
* linewidth
* linestyle
* marker
**3D-Surface**:
* x, as a 2D float array
* y, as a 2D float array
* z, as a 2D float array
Optional:
* label
* marker
**3D-Scatter**:
* x
* y
* z
Optional:
* label
* s
* c
* cmap
* norm
* marker
Note that the scatter plot has increased support for colors. The following input types are allowed:
* string
* list of strings
* list of rgb/rgba tuples
* list of scalar values in combination with a cmap and normalization
\ No newline at end of file
......@@ -10,15 +10,13 @@ Plot Serializer is a tool for creating easily readable JSON files for your scien
It increases the transparency of scientific publications by helping you share data from your diagrams - with the option of providing custom metadata.
It's simple and hassle-free!
Note: Plot Serializer is currently under development and its functionality is limited to 2D line plots, pie plots and bar plots. More will come soon!
.. toctree::
:maxdepth: 1
Overview
Getting Started
3D-Plots
Supported Plot Types
Introduction
Serializer
Deserializer
API documentation:
......