diff --git a/.gitignore b/.gitignore
new file mode 100644
index 0000000000000000000000000000000000000000..c4c4ffc6aa41a89cc618a31d17f6d5924ddf2b10
--- /dev/null
+++ b/.gitignore
@@ -0,0 +1 @@
+*.zip
diff --git a/exercises/Numpy_KMeansClustering/NumPy_KMeansClustering.ipynb b/exercises/Numpy_KMeansClustering/NumPy_KMeansClustering.ipynb
new file mode 100644
index 0000000000000000000000000000000000000000..636b74b0a37eba8272bbc6986d9ae5a6480938df
--- /dev/null
+++ b/exercises/Numpy_KMeansClustering/NumPy_KMeansClustering.ipynb
@@ -0,0 +1,332 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "9ccf7386",
+   "metadata": {},
+   "source": [
+    "# $K$-Means Clustering"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "random-contract",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "%matplotlib inline\n",
+    "\n",
+    "from matplotlib import pyplot as plt\n",
+    "import numpy as np\n",
+    "\n",
+    "import importlib\n",
+    "import helper\n",
+    "importlib.reload(helper)\n",
+    "\n",
+    "from IPython.display import clear_output\n",
+    "from time import sleep, time"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "3ae69504",
+   "metadata": {
+    "jp-MarkdownHeadingCollapsed": true,
+    "tags": []
+   },
+   "source": [
+    "## Introduction\n",
+    "$K$-Means Clustering is a method from classical machine learning. It is used to find $K$ different groups of similar items in a dataset.\n",
+    "\n",
+    "In our case the dataset is a set of $N$ 2-dimensional coordinate vectors $\\vec{x}_1,\\vec{x}_2,\\dots,\\vec{x}_N$. These points form $K < N$ clusters which we would like to find. In order to characterise a cluster we use the cluster centre $\\vec{\\mu}_j$ ($1 \\leq j \\leq K$). *Each* point from the size-$N$ set can be assigned to *one* of these clusters (we will limit ourselves to cases where this indeed is possible)."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "e8075aed",
+   "metadata": {},
+   "source": [
+    "## Algorithm\n",
+    "Assigning a point to a cluster works according to the following procedure:\n",
+    "\n",
+    "1. **Initialisation**: Randomly choose cluster centres $\\vec{\\mu}_j$ ($1 \\leq j \\leq K$). A simple way to achieve this is to choose them from the set of points $\\{\\vec{x}_i\\}_{i = 1, \\dots, N}$.\n",
+    "\n",
+    "2. **Iterations**: \n",
+    "    - For all $i = 1, \\dots, N$ find the cluster centre with position $\\vec{\\mu}_j$ to which $\\vec{x}_i$ has the *smallest* Euclidean distance:\n",
+    "    $$\n",
+    "    c^{(i)} = \\operatorname{argmin}_{j \\in \\{1, \\dots, K\\}} \\left\\|\\vec{x}_i - \\vec{\\mu}_j\\right\\|_2^2,\n",
+    "    $$\n",
+    "    where $\\|\\vec{x}\\|_2 = \\sqrt{x_1^2 + x_2^2}$. $c^{(i)}$ is an integer from the set $\\{1, \\dots, K\\}$. We use it to assign an index to each point $\\vec{x}_i$. This index designates the cluster centre to which the $i$th point is closest. Hence, for each point we must compute the (squared) distance to *all* cluster centres $\\vec{\\mu}_j$ ($1 \\leq j \\leq K$) and determine the smallest of these distances. The index $j$ of the cluster centre with the smallest distance to the point with index $i$ is assigned to $c^{(i)}$.\n",
+    "    - After having assigned each point of the set $\\{\\vec{x}_i\\}_{i = 1, \\dots, N}$ to a cluster centre, recompute the positions of all cluster centres:\n",
+    "    $$\n",
+    "    \\vec{\\mu}_j = \\frac{1}{n_j} \\sum_{\\vec{x}_i\\text{ with }c^{(i)} = j} \\vec{x}_i,\n",
+    "    $$\n",
+    "    where $n_j$ is the total number of points for which $c^{(i)} = j$. The *new* cluster centre is simply the arithmetic mean of all points $\\vec{x}_i$ that were assigned to the previous cluster centre.\n",
+    "    - We compare the set of cluster centres $C^{\\mathrm{old}} = \\{\\vec{\\mu}_1^{\\mathrm{old}}, \\dots, \\vec{\\mu}_K^{\\mathrm{old}} \\}$ from the previous iteration with the current set of cluster centres $C = \\{\\vec{\\mu}_1, \\dots, \\vec{\\mu}_K \\}$. If the cluster centres are pairwise equal (comparing those with the same index), we stop iterating: a steady state has been reached and the algorithm has *converged*."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "ba899ad5",
+   "metadata": {},
+   "source": [
+    "## Task formulation\n",
+    "\n",
+    "Implement the outlined algorithm for the method of $K$-Means Clustering. Stick to the paradigm of *array-oriented programming* as often as possible.\n",
+    "\n",
+    "If you have trouble mapping the algorithm to NumPy commands and functions, it can help to first implement it with standard Python only.\n",
+    "\n",
+    "The folder `sample-data` contains some sample datasets that you can use to explore the algorithm and your implementation.\n",
+    "\n",
+    "*Hint*: It can be helpful to plot the data and the cluster centres determined with your implementation. Have a look at the `make_scatter_plot` function from the `helper.py` module provided with this notebook."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "25abbe32",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "n_clusters = 2\n",
+    "dataset = np.loadtxt(f\"sample-data/coords-with-labels-{n_clusters}.dat\", delimiter=\",\")\n",
+    "coords, labels = dataset.T[:2].T, dataset.T[-1]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "5bf0a0e2",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "fig, ax = plt.subplots()\n",
+    "\n",
+    "helper.make_scatter_plot(\n",
+    "        ax,\n",
+    "        [coords[labels == tt] for tt in range(n_clusters)], \n",
+    "        labels=[f\"cluster {tt}\" for tt in range(n_clusters)],\n",
+    "        markers=[\"o\"] * n_clusters\n",
+    "    )"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "21546ed7",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "markdown",
+   "id": "62ababfe",
+   "metadata": {},
+   "source": [
+    "## Implementation of solution"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "european-bookmark",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# return True, if centers have not changed and the algorithm can therefore stop\n",
+    "def centers_have_not_changed(a, b):\n",
+    "    # Provide your implementation here.\n",
+    "    return np.allclose(a,b)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "ahead-antenna",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# return the updated locations of the cluster centers\n",
+    "def compute_centers(coords, labels, n_centers):\n",
+    "    # Provide your implementation here. \n",
+    "    # **HINT**:\n",
+    "    # \n",
+    "    # Use advanced indexing with boolean masks to access\n",
+    "    # all points that have a label corresponding to the \n",
+    "    # index of a cluster center.\n",
+    "    return np.array([coords[labels == idx].mean(axis=0) for idx in range(n_centers)])"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "known-travel",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# return the list of *indices* of the cluster centers for the coordinates\n",
+    "def find_closest_center(coords, coords_center):\n",
+    "    # Provide your implementation here.\n",
+    "    # **HINT**:\n",
+    "    # \n",
+    "    # Use `np.tile()` to augment `coords` and then make use\n",
+    "    # of NumPy's implicit broadcasting capabilities to\n",
+    "    # compute the distance of each point to *all* cluster\n",
+    "    # centers. You might also need to reshape the array.\n",
+    "    # Think about along which *axis* to compute the norm. \n",
+    "    #\n",
+    "    # Then select the *index* of cluster center with the \n",
+    "    # least distance for each point (Look up the \n",
+    "    # `np.argmin()` function.).\n",
+    "    n_centers = coords_center.shape[0]\n",
+    "    coords_shifted = np.reshape(\n",
+    "        np.tile(coords, (1, n_centers)) - coords_center.ravel(),\n",
+    "        (coords.shape[0], n_centers, coords.shape[1]),\n",
+    "    )\n",
+    "    return np.argmin(np.linalg.norm(coords_shifted, axis=2), axis=1)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "physical-saturday",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# The driver function - You need to change it, as there is an error in it\n",
+    "# the error is *not* in the visualization part\n",
+    "def kmeans(coords, n_centers, n_iter, initial_random_state=42,visualize_progres=True,sleep_time=0.5):\n",
+    "    # Initialise the coordinates of the cluster centers\n",
+    "    rng = np.random.RandomState(initial_random_state)\n",
+    "    index = rng.choice(coords.shape[0], n_centers, replace=False)\n",
+    "    \n",
+    "    # Store coords of the center for iterations\n",
+    "    coords_center = coords[index, ...].copy()\n",
+    "    coords_center_old = coords_center.copy()\n",
+    "    \n",
+    "    for i in range(n_iter):\n",
+    "        # Find closest center for each point\n",
+    "        ### --> you provide this function ###\n",
+    "        labels = find_closest_center(coords, coords_center)\n",
+    "        if visualize_progres:\n",
+    "            # Visualization of the process\n",
+    "            sleep(sleep_time) \n",
+    "            clear_output(wait=True)\n",
+    "            helper.plot_clustering(n_centers,coords,coords_center,labels)\n",
+    "       \n",
+    "        # Update the centroids\n",
+    "        # INFO: \"...\" in x[...] is a slicing operation called \"ellipsis\". You can learn\n",
+    "        # more about it here: https://stackoverflow.com/questions/118370/how-do-you-use-the-ellipsis-slicing-syntax-in-python\n",
+    "        coords_center_old = coords_center # save old version for testing convergence\n",
+    "        ### --> you provide this solution ###\n",
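+    "        # erroneous (writes in place, so coords_center_old, which refers to the same array, changes too):\n",
+    "        #coords_center[...] = compute_centers(coords, labels, n_centers)\n",
+    "        # correct (binds a new array and leaves coords_center_old untouched):\n",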
+    "        coords_center = compute_centers(coords, labels, n_centers)\n",
+    "        # Test for convergence\n",
+    "        ### --> you provide this solution ###\n",
+    "        if centers_have_not_changed(coords_center, coords_center_old):\n",
+    "            if visualize_progres:\n",
+    "                # visualize final state\n",
+    "                sleep(sleep_time)\n",
+    "                clear_output(wait=True)\n",
+    "                helper.plot_clustering(n_centers,coords,coords_center,labels)\n",
+    "            print(\"Finished after %d iterations\"%i)\n",
+    "            break\n",
+    "\n",
+    "            \n",
+    "    return coords_center, labels"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "bronze-advocate",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def main(n_clusters, dataset, n_iter=1000):\n",
+    "#     coords, labels = dataset.T[:2].T, dataset.T[-1].astype(int)\n",
+    "    coords  = dataset.T[:2].T\n",
+    "    \n",
+    "    coords_center, center_labels = kmeans(\n",
+    "        coords=coords,# the input data (coordinates of the points to be clustered)\n",
+    "        n_centers=n_clusters,# number of clusters\n",
+    "        n_iter=n_iter,# maximum number of iterations to perform, if algorithm does not converge before\n",
+    "        #initial_random_state=int(time()),# initial random seed - use a fixed value, if you want to have the same initial state for every execution\n",
+    "        # this is a good random seed to see the bug\n",
+    "        initial_random_state=4321,\n",
+    "        visualize_progres=True,#Turn Off, if you do not want to wait for the visualization\n",
+    "        sleep_time=1 # the sleep time controls the speed of the visualization (lower means faster)\n",
+    "        \n",
+    "    )\n",
+    "    \n",
+    "    print(coords_center)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "middle-planner",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "if __name__ == \"__main__\":\n",
+    "    n_clusters = 4 # change this value to test different datasets\n",
+    "    dataset = np.loadtxt(f\"sample-data/coords-with-labels-{n_clusters}.dat\", delimiter=\",\")\n",
+    "    main(n_clusters, dataset)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "surprising-austria",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "91f9bc21",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.6"
+  },
+  "toc": {
+   "base_numbering": 1,
+   "nav_menu": {},
+   "number_sections": true,
+   "sideBar": true,
+   "skip_h1_title": false,
+   "title_cell": "Table of Contents",
+   "title_sidebar": "Contents",
+   "toc_cell": false,
+   "toc_position": {},
+   "toc_section_display": true,
+   "toc_window_display": true
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
diff --git a/exercises/Numpy_KMeansClustering/NumPy_KMeansClustering.ipynb.license b/exercises/Numpy_KMeansClustering/NumPy_KMeansClustering.ipynb.license
new file mode 100644
index 0000000000000000000000000000000000000000..c207ab8c094a9d18d7c6cb5c9dfbf8913df4aa8a
--- /dev/null
+++ b/exercises/Numpy_KMeansClustering/NumPy_KMeansClustering.ipynb.license
@@ -0,0 +1,4 @@
+SPDX-FileCopyrightText: © 2021 HPC Core Facility of the Justus-Liebig-University Giessen <philipp.e.risius@theo.physik.uni-giessen.de>,<marcel.giar@physik.jlug.de>
+SPDX-FileCopyrightText: © 2022 Competence Center for High Performance Computing in Hessen (HKHLR) <tim.jammer@hpc-hessen.de>, <marcel.giar@hpc-hessen.de>
+
+SPDX-License-Identifier: MIT
diff --git a/exercises/Numpy_KMeansClustering/NumPy_KMeansClustering_stdPython.ipynb b/exercises/Numpy_KMeansClustering/NumPy_KMeansClustering_stdPython.ipynb
new file mode 100644
index 0000000000000000000000000000000000000000..33e38357348697c443aab237abfc8f10036fc6fd
--- /dev/null
+++ b/exercises/Numpy_KMeansClustering/NumPy_KMeansClustering_stdPython.ipynb
@@ -0,0 +1,318 @@
+{
+ "cells": [
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "partial-munich",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "%matplotlib inline\n",
+    "\n",
+    "from matplotlib import pyplot as plt\n",
+    "import numpy as np\n",
+    "\n",
+    "import importlib\n",
+    "import helper\n",
+    "importlib.reload(helper)\n",
+    "\n",
+    "import math\n",
+    "\n",
+    "from IPython.display import clear_output\n",
+    "from time import sleep, time"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "honest-mexico",
+   "metadata": {},
+   "source": [
+    "## Example of a dataset with 4 clusters"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "invalid-baseball",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "dataset = np.loadtxt(\"sample-data/coords-with-labels-4.dat\", delimiter=\",\")\n",
+    "coords, labels = dataset.T[:2].T, dataset.T[-1].astype(int)\n",
+    "\n",
+    "num_labels = np.unique(labels).size\n",
+    "coords_by_label = list(coords[labels == tt] for tt in range(num_labels))\n",
+    "\n",
+    "coords_center = np.loadtxt(\"sample-data/cluster-centers-4.dat\", delimiter=\",\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "piano-vehicle",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "ax1, ax2 = helper.init_figure()\n",
+    "# Scatter plot of coords without clustering.\n",
+    "helper.make_scatter_plot(ax1, coords=[coords], labels=[\"\"])\n",
+    "# Scatter plot of coords assigned to clusters\n",
+    "helper.make_scatter_plot(\n",
+    "    ax2,\n",
+    "    coords_by_label, \n",
+    "    labels=[f\"cluster {tt}\" for tt in range(num_labels)],\n",
+    "    markers=[\"o\"] * num_labels\n",
+    ")\n",
+    "# Plot cluster centers.\n",
+    "helper.make_scatter_plot(\n",
+    "    ax2,\n",
+    "    coords_center, \n",
+    "    labels=[f\"centroid {tt}\" for tt in range(num_labels)],\n",
+    "    colors=[\"black\"] * num_labels,\n",
+    "    with_legend=True,\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "collectible-detector",
+   "metadata": {},
+   "source": [
+    "## Implementation using standard Python only"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "3084ddcb",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# return True, if centers have not changed and the algorithm can therefore stop\n",
+    "def centers_have_not_changed(a, b):\n",
+    "    # if the center locations change only very little, we also consider them unchanged\n",
+    "    rtol=1e-05\n",
+    "    atol=1e-08\n",
+    "    #has_changed=False\n",
+    "    # Provide your implementation here.\n",
+    "    for point_a,point_b in zip(a,b):\n",
+    "        for coordinate_a, coordinate_b in zip(point_a,point_b):\n",
+    "            if abs(coordinate_a - coordinate_b) > (atol + rtol * abs(coordinate_b)):\n",
+    "                #has_changed=True\n",
+    "                return False\n",
+    "    return True\n",
+    "        "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "0ae4cfd1",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# return the updated locations of the cluster centers\n",
+    "def compute_centers(coords, labels, n_centers):\n",
+    "    # Provide your implementation here. \n",
+    "    # **HINT**:\n",
+    "    # \n",
+    "    # Use advanced indexing with boolean masks to access\n",
+    "    # all points that have a label corresponding to the \n",
+    "    # index of a cluster center.\n",
+    "    coords_center = []\n",
+    "    # For every cluster we look up all points that are closest to it.\n",
+    "    for ccidx in range(n_centers):\n",
+    "        ccx, ccy = 0, 0\n",
+    "        cluster_size = 0\n",
+    "        # Find all points \"assigned\" to the current cluster center.\n",
+    "        for lc, c in zip(labels, coords):\n",
+    "            cx, cy = c\n",
+    "            if ccidx == lc:\n",
+    "                cluster_size += 1\n",
+    "                ccx += cx\n",
+    "                ccy += cy\n",
+    "        assert cluster_size > 0, \"Error - found cluster size with value 0.\"\n",
+    "        # Remember to divide by the cluster_size since we compute the \n",
+    "        # new cluster centre as the arithmetic mean from the coordinates\n",
+    "        # of all points assigned to it.\n",
+    "        coords_center.append([ccx / cluster_size, ccy / cluster_size])\n",
+    "    return coords_center"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "3c7f163f",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# return the list of *indices* of the cluster centers for the coordinates\n",
+    "def find_closest_center(coords, coords_center):\n",
+    "    # Provide your implementation here.\n",
+    "    # **HINT**:\n",
+    "    # \n",
+    "    # Use `np.tile()` to augment `coords` and then make use\n",
+    "    # of NumPy's implicit broadcasting capabilities to\n",
+    "    # compute the distance of each point to *all* cluster\n",
+    "    # centers. You might also need to reshape the array.\n",
+    "    # Think about along which *axis* to compute the norm. \n",
+    "    #\n",
+    "    # Then select the *index* of cluster center with the \n",
+    "    # least distance for each point (Look up the \n",
+    "    # `np.argmin()` function.).\n",
+    "    labels = []\n",
+    "    # For *all* points search the closest cluster centre.\n",
+    "    for c in coords:\n",
+    "        min_ccidx, min_dist = 100000, 1e+18\n",
+    "        # Test each cluster center ...\n",
+    "        for ccidx, cc in enumerate(coords_center):\n",
+    "            # Squared distance of point to cluster centre.\n",
+    "            dist = sum(r ** 2 for r in (x - y for x, y in zip(c, cc)))\n",
+    "            # Found a new candidate.\n",
+    "            if dist < min_dist:\n",
+    "                min_ccidx, min_dist = ccidx, dist\n",
+    "        # After finishing this loop we have found the closest cluster centre.\n",
+    "        # (Or at least one of the closest, in case several have the same distance.)\n",
+    "        # The *index* of that cluster centre is stored.\n",
+    "        labels.append(min_ccidx)\n",
+    "    return labels"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "0707a439",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# The driver function is supplied, you do not need to change it\n",
+    "def kmeans(coords, n_centers, n_iter, initial_random_state=42,visualize_progres=True,sleep_time=0.5):\n",
+    "    # Initialise the coordinates of the cluster centers\n",
+    "    rng = np.random.RandomState(initial_random_state)\n",
+    "    index = rng.choice(coords.shape[0], n_centers, replace=False)\n",
+    "    \n",
+    "    # Store coords of the center for iterations\n",
+    "    coords_center = coords[index, ...].copy()\n",
+    "    coords_center_old = coords_center.copy()\n",
+    "    \n",
+    "    for i in range(n_iter):\n",
+    "        # Find closest center for each point\n",
+    "        ### --> you provide this function ###\n",
+    "        labels = find_closest_center(coords, coords_center)\n",
+    "        if visualize_progres:\n",
+    "            # Visualization of the process\n",
+    "            sleep(sleep_time) \n",
+    "            clear_output(wait=True)\n",
+    "            # for visualization, we have to convert the lists back into NumPy arrays\n",
+    "            helper.plot_clustering(n_centers,coords,np.asarray(coords_center),np.asarray(labels))\n",
+    "       \n",
+    "        # Update the centroids\n",
+    "        # INFO: \"...\" in x[...] is a slicing operation called \"ellipsis\". You can learn\n",
+    "        # more about it here: https://stackoverflow.com/questions/118370/how-do-you-use-the-ellipsis-slicing-syntax-in-python\n",
+    "        coords_center_old = coords_center # save old version for testing convergence\n",
+    "        ### --> you provide this solution ###\n",
+    "        coords_center = compute_centers(coords, labels, n_centers)\n",
+    "        # Test for convergence\n",
+    "        ### --> you provide this solution ###\n",
+    "        if centers_have_not_changed(coords_center, coords_center_old):\n",
+    "            if visualize_progres:\n",
+    "                # visualize final state\n",
+    "                sleep(sleep_time)\n",
+    "                clear_output(wait=True)\n",
+    "                helper.plot_clustering(n_centers,coords,np.asarray(coords_center),np.asarray(labels))\n",
+    "            print(\"Finished after %d iterations\"%i)\n",
+    "            break\n",
+    "\n",
+    "            \n",
+    "    return coords_center, labels"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "authorized-slovenia",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def main(n_clusters, dataset, n_iter=1000):\n",
+    "#     coords, labels = dataset.T[:2].T, dataset.T[-1].astype(int)\n",
+    "    coords  = dataset.T[:2].T\n",
+    "    \n",
+    "    coords_center, center_labels = kmeans(\n",
+    "        coords=coords,# the input data (coordinates of the points to be clustered)\n",
+    "        n_centers=n_clusters,# number of clusters\n",
+    "        n_iter=n_iter,# maximum number of iterations to perform, if algorithm does not converge before\n",
+    "        initial_random_state=int(time()),# initial random seed - use a fixed value, if you want to have the same initial state for every execution\n",
+    "        visualize_progres=True,#Turn Off, if you do not want to wait for the visualization\n",
+    "        sleep_time=0.5 # the sleep time controls the speed of the visualization (lower means faster)\n",
+    "        \n",
+    "    )\n",
+    "    \n",
+    "    print(coords_center)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "scientific-compensation",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "if __name__ == \"__main__\":\n",
+    "    n_clusters = 4 # change this value to test different datasets\n",
+    "    dataset = np.loadtxt(f\"sample-data/coords-with-labels-{n_clusters}.dat\", delimiter=\",\")\n",
+    "    main(n_clusters, dataset)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "098b1f1d-049d-4459-a518-1b2aef76c40e",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "6c4b7a8a",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.6"
+  },
+  "toc": {
+   "base_numbering": 1,
+   "nav_menu": {},
+   "number_sections": true,
+   "sideBar": true,
+   "skip_h1_title": false,
+   "title_cell": "Table of Contents",
+   "title_sidebar": "Contents",
+   "toc_cell": false,
+   "toc_position": {},
+   "toc_section_display": true,
+   "toc_window_display": false
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
diff --git a/exercises/Numpy_KMeansClustering/NumPy_KMeansClustering_stdPython.ipynb.license b/exercises/Numpy_KMeansClustering/NumPy_KMeansClustering_stdPython.ipynb.license
new file mode 100644
index 0000000000000000000000000000000000000000..c207ab8c094a9d18d7c6cb5c9dfbf8913df4aa8a
--- /dev/null
+++ b/exercises/Numpy_KMeansClustering/NumPy_KMeansClustering_stdPython.ipynb.license
@@ -0,0 +1,4 @@
+SPDX-FileCopyrightText: © 2021 HPC Core Facility of the Justus-Liebig-University Giessen <philipp.e.risius@theo.physik.uni-giessen.de>,<marcel.giar@physik.jlug.de>
+SPDX-FileCopyrightText: © 2022 Competence Center for High Performance Computing in Hessen (HKHLR) <tim.jammer@hpc-hessen.de>, <marcel.giar@hpc-hessen.de>
+
+SPDX-License-Identifier: MIT
diff --git a/exercises/Numpy_KMeansClustering/NumPy_KMeansClustering_tasks.ipynb b/exercises/Numpy_KMeansClustering/NumPy_KMeansClustering_tasks.ipynb
new file mode 100644
index 0000000000000000000000000000000000000000..40f98b5b7de4f2afa664b8a62ab49e9792cbe4db
--- /dev/null
+++ b/exercises/Numpy_KMeansClustering/NumPy_KMeansClustering_tasks.ipynb
@@ -0,0 +1,318 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "9ccf7386",
+   "metadata": {},
+   "source": [
+    "# $K$-Means Clustering"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "random-contract",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "%matplotlib inline\n",
+    "\n",
+    "from matplotlib import pyplot as plt\n",
+    "import numpy as np\n",
+    "\n",
+    "import importlib\n",
+    "import helper\n",
+    "importlib.reload(helper)\n",
+    "\n",
+    "from IPython.display import clear_output\n",
+    "from time import sleep, time"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "3ae69504",
+   "metadata": {
+    "jp-MarkdownHeadingCollapsed": true,
+    "tags": []
+   },
+   "source": [
+    "## Introduction\n",
+    "$K$-Means Clustering is a method from classical machine learning. It is used to find $K$ different groups of similar items in a dataset.\n",
+    "\n",
+    "In our case the dataset is a set of $N$ 2-dimensional coordinate vectors $\\vec{x}_1,\\vec{x}_2,\\dots,\\vec{x}_N$. These points form $K < N$ clusters which we would like to find. In order to characterise a cluster we use the cluster centre $\\vec{\\mu}_j$ ($1 \\leq j \\leq K$). *Each* point from the size-$N$ set can be assigned to *one* of these clusters (we will limit ourselves to cases where this indeed is possible)."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "e8075aed",
+   "metadata": {},
+   "source": [
+    "## Algorithm\n",
+    "Assigning a point to a cluster works according to the following procedure:\n",
+    "\n",
+    "1. **Initialisation**: Randomly choose cluster centres $\\vec{\\mu}_j$ ($1 \\leq j \\leq K$). A simple way to achieve this is to choose them from the set of points $\\{\\vec{x}_i\\}_{i = 1, \\dots, N}$.\n",
+    "\n",
+    "2. **Iterations**: \n",
+    "    - For all $i = 1, \\dots, N$ find the cluster centre with position $\\vec{\\mu}_j$ to which $\\vec{x}_i$ has the *smallest* Euclidean distance:\n",
+    "    $$\n",
+    "    c^{(i)} = \\operatorname{argmin}_{j \\in \\{1, \\dots, K\\}} \\left\\|\\vec{x}_i - \\vec{\\mu}_j\\right\\|_2^2,\n",
+    "    $$\n",
+    "    where $\\|\\vec{x}\\|_2 = \\sqrt{x_1^2 + x_2^2}$. $c^{(i)}$ is an integer from the set $\\{1, \\dots, K\\}$. We use it to assign an index to each point $\\vec{x}_i$. This index designates the cluster centre to which the $i$th point is closest. Hence, for each point we must compute the (squared) distance to *all* cluster centres $\\vec{\\mu}_j$ ($1 \\leq j \\leq K$) and determine the smallest of these distances. The index $j$ of the cluster centre with the smallest distance to the point with index $i$ is assigned to $c^{(i)}$.\n",
+    "    - After having assigned each point of the set $\\{\\vec{x}_i\\}_{i = 1, \\dots, N}$ to a cluster centre, recompute the positions of all cluster centres:\n",
+    "    $$\n",
+    "    \\vec{\\mu}_j = \\frac{1}{n_j} \\sum_{\\vec{x}_i\\text{ with }c^{(i)} = j} \\vec{x}_i,\n",
+    "    $$\n",
+    "    where $n_j$ is the total number of points for which $c^{(i)} = j$. The *new* cluster centre is simply the arithmetic mean of all points $\\vec{x}_i$ that were assigned to the previous cluster centre.\n",
+    "    - We compare the set of cluster centres $C^{\\mathrm{old}} = \\{\\vec{\\mu}_1^{\\mathrm{old}}, \\dots, \\vec{\\mu}_K^{\\mathrm{old}} \\}$ from the previous iteration with the current set of cluster centres $C = \\{\\vec{\\mu}_1, \\dots, \\vec{\\mu}_K \\}$. If the cluster centres are pairwise equal (comparing those with the same index), we stop iterating: a steady state has been reached and the algorithm has *converged*."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "ba899ad5",
+   "metadata": {},
+   "source": [
+    "## Task formulation\n",
+    "\n",
+    "Implement the outlined algorithm for the method of $K$-Means Clustering. Stick to the paradigm of *array-oriented programming* as often as possible.\n",
+    "\n",
+    "If you have trouble mapping the algorithm to NumPy commands and functions, it can help to first implement it with standard Python only.\n",
+    "\n",
+    "The folder `sample-data` contains some sample datasets that you can use to explore the algorithm and your implementation.\n",
+    "\n",
+    "*Hint*: It can be helpful to plot the data and the cluster centres determined with your implementation. Have a look at the `make_scatter_plot` function from the `helper.py` module provided with this notebook."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "25abbe32",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "n_clusters = 2\n",
+    "dataset = np.loadtxt(f\"sample-data/coords-with-labels-{n_clusters}.dat\", delimiter=\",\")\n",
+    "coords, labels = dataset.T[:2].T, dataset.T[-1]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "5bf0a0e2",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "fig, ax = plt.subplots()\n",
+    "\n",
+    "helper.make_scatter_plot(\n",
+    "        ax,\n",
+    "        [coords[labels == tt] for tt in range(n_clusters)], \n",
+    "        labels=[f\"cluster {tt}\" for tt in range(n_clusters)],\n",
+    "        markers=[\"o\"] * n_clusters\n",
+    "    )"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "21546ed7",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "markdown",
+   "id": "62ababfe",
+   "metadata": {},
+   "source": [
+    "## Implementation of solution"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "european-bookmark",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# return True, if centers have not changed and the algorithm can therefore stop\n",
+    "def centers_have_not_changed(a, b):\n",
+    "    # Provide your implementation here.\n",
+    "    "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "ahead-antenna",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# return the updated locations of the cluster centers\n",
+    "def compute_centers(coords, labels, n_centers):\n",
+    "    # Provide your implementation here. \n",
+    "    # **HINT**:\n",
+    "    # \n",
+    "    # Use advanced indexing with boolean masks to access\n",
+    "    # all points that have a label corresponding to the \n",
+    "    # index of a cluster center.\n",
+    "    "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "known-travel",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# return the list of *indices* of the cluster centers for the coordinates\n",
+    "def find_closest_center(coords, coords_center):\n",
+    "    # Provide your implementation here.\n",
+    "    # **HINT**:\n",
+    "    # \n",
+    "    # Use `np.tile()` to augment `coords` and then make use\n",
+    "    # of NumPy's implicit broadcasting capabilities to\n",
+    "    # compute the distance of each point to *all* cluster\n",
+    "    # centers. You might also need to reshape the array.\n",
+    "    # Think about along which *axis* to compute the norm. \n",
+    "    #\n",
+    "    # Then select the *index* of cluster center with the \n",
+    "    # least distance for each point (Look up the \n",
+    "    # `np.argmin()` function.).\n",
+    "    "
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "06d72489",
+   "metadata": {},
+   "source": [
+    "## The driver function \n",
+    "You need to change it, as it contains an error. The error is *not* in the visualization part."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "physical-saturday",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "\n",
+    "def kmeans(coords, n_centers, n_iter, \n",
+    "           initial_random_state=42, \n",
+    "           visualize_progress=True,\n",
+    "           sleep_time=0.5):\n",
+    "    # Initialise the coordinates of the cluster centers\n",
+    "    rng = np.random.RandomState(initial_random_state)\n",
+    "    index = rng.choice(coords.shape[0], n_centers, replace=False)\n",
+    "    \n",
+    "    # Store coords of the center for iterations\n",
+    "    coords_center = coords[index, ...].copy()\n",
+    "    coords_center_old = coords_center.copy()\n",
+    "    \n",
+    "    for i in range(n_iter):\n",
+    "        # Find closest center for each point\n",
+    "        ### --> you provide this function ###\n",
+    "        labels = find_closest_center(coords, coords_center)\n",
+    "        if visualize_progress:\n",
+    "            # Visualization of the process\n",
+    "            sleep(sleep_time) \n",
+    "            clear_output(wait=True)\n",
+    "            helper.plot_clustering(n_centers,coords,coords_center,labels)\n",
+    "       \n",
+    "        # Update the centroids\n",
+    "        # INFO: \"...\" in x[...] is a slicing operation called \"ellipsis\". You can learn\n",
+    "        # more about it here: https://stackoverflow.com/questions/118370/how-do-you-use-the-ellipsis-slicing-syntax-in-python\n",
+    "        coords_center_old = coords_center # save old version for testing convergence\n",
+    "        ### --> you provide this function ###\n",
+    "        coords_center[...] = compute_centers(coords, labels, n_centers)\n",
+    "        # Test for convergence\n",
+    "        ### --> you provide this function ###\n",
+    "        if centers_have_not_changed(coords_center, coords_center_old):\n",
+    "            if visualize_progress:\n",
+    "                # visualize final state\n",
+    "                sleep(sleep_time)\n",
+    "                clear_output(wait=True)\n",
+    "                helper.plot_clustering(n_centers,coords,coords_center,labels)\n",
+    "            print(\"Finished after %d iterations\"%i)\n",
+    "            break\n",
+    "            \n",
+    "    return coords_center, labels"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "bronze-advocate",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def main(n_clusters, dataset, n_iter=1000):\n",
+    "#     coords, labels = dataset.T[:2].T, dataset.T[-1].astype(int)\n",
+    "    coords  = dataset.T[:2].T\n",
+    "    \n",
+    "    coords_center, center_labels = kmeans(\n",
+    "        coords=coords,# the input data (coordinates of the points to be clustered)\n",
+    "        n_centers=n_clusters,# number of clusters\n",
+    "        n_iter=n_iter,# maximum number of iterations to perform, if algorithm does not converge before\n",
+    "        #initial_random_state=int(time()),# initial random seed - use a fixed value, if you want to have the same initial state for every execution\n",
+    "        # this is a good random seed to see the bug\n",
+    "        initial_random_state=4321,\n",
+    "        visualize_progress=True,# turn off if you do not want to wait for the visualization\n",
+    "        sleep_time=0.5 # the sleep time controls the speed of the visualization (lower means faster)\n",
+    "        \n",
+    "    )\n",
+    "    \n",
+    "    print(coords_center)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "middle-planner",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "if __name__ == \"__main__\":\n",
+    "    n_clusters = 2 # change this value to test different datasets\n",
+    "    dataset = np.loadtxt(f\"sample-data/coords-with-labels-{n_clusters}.dat\", delimiter=\",\")\n",
+    "    main(n_clusters, dataset)"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.6"
+  },
+  "toc": {
+   "base_numbering": 1,
+   "nav_menu": {},
+   "number_sections": true,
+   "sideBar": true,
+   "skip_h1_title": false,
+   "title_cell": "Table of Contents",
+   "title_sidebar": "Contents",
+   "toc_cell": false,
+   "toc_position": {},
+   "toc_section_display": true,
+   "toc_window_display": true
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
diff --git a/exercises/Numpy_KMeansClustering/NumPy_KMeansClustering_tasks.ipynb.license b/exercises/Numpy_KMeansClustering/NumPy_KMeansClustering_tasks.ipynb.license
new file mode 100644
index 0000000000000000000000000000000000000000..c207ab8c094a9d18d7c6cb5c9dfbf8913df4aa8a
--- /dev/null
+++ b/exercises/Numpy_KMeansClustering/NumPy_KMeansClustering_tasks.ipynb.license
@@ -0,0 +1,4 @@
+SPDX-FileCopyrightText: © 2021 HPC Core Facility of the Justus-Liebig-University Giessen <philipp.e.risius@theo.physik.uni-giessen.de>,<marcel.giar@physik.jlug.de>
+SPDX-FileCopyrightText: © 2022 Competence Center for High Performance Computing in Hessen (HKHLR) <tim.jammer@hpc-hessen.de>, <marcel.giar@hpc-hessen.de>
+
+SPDX-License-Identifier: MIT
diff --git a/exercises/Numpy_KMeansClustering/helper.py b/exercises/Numpy_KMeansClustering/helper.py
new file mode 100644
index 0000000000000000000000000000000000000000..a5e45b0a287cd37fe6765a61470f84d4316ea61a
--- /dev/null
+++ b/exercises/Numpy_KMeansClustering/helper.py
@@ -0,0 +1,88 @@
+# SPDX-FileCopyrightText: © 2021 HPC Core Facility of the Justus-Liebig-University Giessen <philipp.e.risius@theo.physik.uni-giessen.de>,<marcel.giar@physik.jlug.de>
+# SPDX-FileCopyrightText: © 2022 Competence Center for High Performance Computing in Hessen (HKHLR) <tim.jammer@hpc-hessen.de>, <marcel.giar@hpc-hessen.de>
+#
+# SPDX-License-Identifier: MIT
+
+import matplotlib.pyplot as plt
+from matplotlib.lines import Line2D
+
+
+def init_figure(figsize=(8, 8)):
+    _, (ax1, ax2) = plt.subplots(1, 2, sharex=True, sharey=True, figsize=figsize)
+    return ax1, ax2
+
+
+# We have not dealt with `matplotlib` (or other packages for plotting data) yet
+# but it is quite convenient for the purpose of visualising the results of the
+# cluster search.
+def make_scatter_plot(
+    ax,
+    coords,
+    labels,
+    markers=None,
+    colors=None,
+    with_legend=False,
+    figname=None,
+):
+    ax.set_aspect("equal")
+    ax.minorticks_on()
+
+    if colors is None:
+        cmap = plt.get_cmap("tab10")
+        color_list = [cmap(idx) for idx in range(len(labels))]
+    else:
+        color_list = colors
+
+    marker_list = (
+        list(Line2D.filled_markers)[: len(labels)] if markers is None else markers
+    )
+
+    for xy, col, ll, mm in zip(coords, color_list, labels, marker_list):
+        try:
+            x, y = xy.transpose()
+        except AttributeError:
+            x, y = [c[0] for c in xy], [c[1] for c in xy]
+        ax.scatter(x, y, s=20, color=col, label=ll, marker=mm)
+
+    if with_legend:
+        ax.legend(bbox_to_anchor=(1, 1), loc="upper left")
+
+    if figname is not None:
+        plt.savefig(figname, bbox_inches="tight")
+
+def plot_clustering(n_clusters,coords,coords_center,center_labels):
+    fig, ax = plt.subplots()
+
+    # Assign each point to a cluster.
+    coords_labelled = list(
+        coords[center_labels == tt] for tt in range(n_clusters)
+    )
+    # Plot clusters with colors according to which cluster they belong.
+    make_scatter_plot(
+        ax,
+        coords_labelled, 
+        labels=[f"cluster {tt}" for tt in range(n_clusters)],
+        markers=["o"] * n_clusters
+    )
+    # Plot cluster centers.
+    make_scatter_plot(
+        ax,
+        coords_center, 
+        labels=[f"centroid {tt}" for tt in range(n_clusters)],
+        colors=["black"] * n_clusters,
+        with_legend=True,
+#         figname="kmeans.pdf"
+    )
+    plt.show()
+  
+        
+
+def read_cluster_data(filename):
+    """Helper function to read sample datasets."""
+    with open(filename, "r", encoding="utf-8") as datafile:
+        coords, labels = [], []
+        for line in datafile:
+            x, y, l = map(float, line.split(","))
+            coords.append([x, y])
+            labels.append(int(l))
+    return coords, labels
diff --git a/exercises/Numpy_KMeansClustering/sample-data/cluster-centers-2.dat b/exercises/Numpy_KMeansClustering/sample-data/cluster-centers-2.dat
new file mode 100644
index 0000000000000000000000000000000000000000..448b8cea44a93df9c556d1ac1a864bfa83845125
--- /dev/null
+++ b/exercises/Numpy_KMeansClustering/sample-data/cluster-centers-2.dat
@@ -0,0 +1,7 @@
+# SPDX-FileCopyrightText: © 2021 HPC Core Facility of the Justus-Liebig-University Giessen <philipp.e.risius@theo.physik.uni-giessen.de>,<marcel.giar@physik.jlug.de>
+# SPDX-FileCopyrightText: © 2022 Competence Center for High Performance Computing in Hessen (HKHLR) <tim.jammer@hpc-hessen.de>, <marcel.giar@hpc-hessen.de>
+#
+# SPDX-License-Identifier: CC0-1.0
+
+-2.621797518717023490e+00,9.050606662874209007e+00
+4.737827335215956559e+00,1.994987523048710854e+00
diff --git a/exercises/Numpy_KMeansClustering/sample-data/cluster-centers-3.dat b/exercises/Numpy_KMeansClustering/sample-data/cluster-centers-3.dat
new file mode 100644
index 0000000000000000000000000000000000000000..8bc42fcc7d2387f0de5840457a4f1da1bb4abeca
--- /dev/null
+++ b/exercises/Numpy_KMeansClustering/sample-data/cluster-centers-3.dat
@@ -0,0 +1,8 @@
+# SPDX-FileCopyrightText: © 2021 HPC Core Facility of the Justus-Liebig-University Giessen <philipp.e.risius@theo.physik.uni-giessen.de>,<marcel.giar@physik.jlug.de>
+# SPDX-FileCopyrightText: © 2022 Competence Center for High Performance Computing in Hessen (HKHLR) <tim.jammer@hpc-hessen.de>, <marcel.giar@hpc-hessen.de>
+#
+# SPDX-License-Identifier: CC0-1.0
+
+-2.633232678649361613e+00,9.043569782044549754e+00
+-6.883871789341749370e+00,-6.983984146713130059e+00
+4.747103374180733582e+00,2.010594272771337288e+00
diff --git a/exercises/Numpy_KMeansClustering/sample-data/cluster-centers-4.dat b/exercises/Numpy_KMeansClustering/sample-data/cluster-centers-4.dat
new file mode 100644
index 0000000000000000000000000000000000000000..62087c101ff846b14626f9cb75fa90cf24779d47
--- /dev/null
+++ b/exercises/Numpy_KMeansClustering/sample-data/cluster-centers-4.dat
@@ -0,0 +1,9 @@
+# SPDX-FileCopyrightText: © 2021 HPC Core Facility of the Justus-Liebig-University Giessen <philipp.e.risius@theo.physik.uni-giessen.de>,<marcel.giar@physik.jlug.de>
+# SPDX-FileCopyrightText: © 2022 Competence Center for High Performance Computing in Hessen (HKHLR) <tim.jammer@hpc-hessen.de>, <marcel.giar@hpc-hessen.de>
+#
+# SPDX-License-Identifier: CC0-1.0
+
+-6.883871789341752923e+00,-6.983984146713127394e+00
+-2.633232678649360281e+00,9.043569782044546201e+00
+-8.929211039812535944e+00,7.381960674811766765e+00
+4.747103374180734470e+00,2.010594272771337288e+00
diff --git a/exercises/Numpy_KMeansClustering/sample-data/coords-with-labels-2.dat b/exercises/Numpy_KMeansClustering/sample-data/coords-with-labels-2.dat
new file mode 100644
index 0000000000000000000000000000000000000000..f486ef8b5b3c218a000a33a8769e238b4b63d8b2
--- /dev/null
+++ b/exercises/Numpy_KMeansClustering/sample-data/coords-with-labels-2.dat
@@ -0,0 +1,205 @@
+# SPDX-FileCopyrightText: © 2021 HPC Core Facility of the Justus-Liebig-University Giessen <philipp.e.risius@theo.physik.uni-giessen.de>,<marcel.giar@physik.jlug.de>
+# SPDX-FileCopyrightText: © 2022 Competence Center for High Performance Computing in Hessen (HKHLR) <tim.jammer@hpc-hessen.de>, <marcel.giar@hpc-hessen.de>
+#
+# SPDX-License-Identifier: CC0-1.0
+
+3.045451177433734280e+00,1.373794660986959126e+00,1.000000000000000000e+00
+4.962597396566191144e+00,1.145938740388408927e+00,1.000000000000000000e+00
+4.664389010487044018e+00,2.471167975186181920e+00,1.000000000000000000e+00
+-3.571501336778855062e+00,9.487878558833502396e+00,0.000000000000000000e+00
+4.920870703963133863e+00,1.350470164120138206e+00,1.000000000000000000e+00
+6.783822925553426586e+00,2.607088706258743116e+00,1.000000000000000000e+00
+4.753396181479349281e+00,2.635300358461778458e+00,1.000000000000000000e+00
+4.164933525067144870e+00,1.319840451367020107e+00,1.000000000000000000e+00
+-2.955712575119771479e+00,9.870684922521792970e+00,0.000000000000000000e+00
+5.497538459430121094e+00,1.813231153977304944e+00,1.000000000000000000e+00
+-2.448967413111723612e+00,1.147752824068360766e+01,0.000000000000000000e+00
+5.539478711661351973e+00,2.280469204817341389e+00,1.000000000000000000e+00
+-1.106403312116650994e+00,7.612435065406041090e+00,0.000000000000000000e+00
+5.186976217398139077e+00,1.770977031506837829e+00,1.000000000000000000e+00
+1.398611496159028800e+00,9.487820426064421664e-01,1.000000000000000000e+00
+-6.434231119079936168e-01,9.488119049110109060e+00,0.000000000000000000e+00
+4.863971318038518454e+00,1.985762084722526799e+00,1.000000000000000000e+00
+3.633861454728399387e+00,7.589810711529998422e-01,1.000000000000000000e+00
+4.154515288398997974e+00,2.055043823327054486e+00,1.000000000000000000e+00
+3.909512204510964928e+00,2.189628273522707058e+00,1.000000000000000000e+00
+5.321831807523064839e+00,1.662902927347275961e+00,1.000000000000000000e+00
+5.154914103436761152e+00,2.486955634852940911e+00,1.000000000000000000e+00
+-1.043548854131196135e+00,8.788509827711786571e+00,0.000000000000000000e+00
+3.810883825306029316e+00,1.412988643743762429e+00,1.000000000000000000e+00
+-2.185113653657955179e+00,8.629203847782004999e+00,0.000000000000000000e+00
+-3.053580347577932841e+00,9.125208717908186884e+00,0.000000000000000000e+00
+5.144866115208558632e+00,2.838924878110853367e+00,1.000000000000000000e+00
+-1.686652710949561040e+00,7.793442478227299297e+00,0.000000000000000000e+00
+3.741464164879743315e+00,2.465088855447237659e+00,1.000000000000000000e+00
+-1.696671800658552165e+00,1.037052615676914513e+01,0.000000000000000000e+00
+-2.545023662162701594e+00,1.057892978401232753e+01,0.000000000000000000e+00
+5.803042588383060973e+00,1.983402744960319097e+00,1.000000000000000000e+00
+-3.499733948183438415e+00,8.447988398595549953e+00,0.000000000000000000e+00
+-2.147561598005116146e+00,8.369166373593197150e+00,0.000000000000000000e+00
+-1.695680405683080316e+00,7.783421811764366538e+00,0.000000000000000000e+00
+4.838938531801571408e+00,1.372952806781937429e+00,1.000000000000000000e+00
+-1.366374808537729635e+00,9.766219160885095008e+00,0.000000000000000000e+00
+6.225895652373453437e+00,7.353541851138829522e-01,1.000000000000000000e+00
+-2.422150554814578971e+00,8.715278777732454074e+00,0.000000000000000000e+00
+3.847358097795400944e+00,1.858433242473833014e+00,1.000000000000000000e+00
+-1.031303578311234093e+00,8.496015909924674148e+00,0.000000000000000000e+00
+5.052810290503725987e+00,1.409445131136757290e+00,1.000000000000000000e+00
+4.627632063381186711e+00,1.075915312454900352e+00,1.000000000000000000e+00
+4.996894322193148774e+00,1.280260088680077679e+00,1.000000000000000000e+00
+-2.496195731174843058e+00,1.046782020535563795e+01,0.000000000000000000e+00
+3.814381639435589832e+00,1.651783842287738668e+00,1.000000000000000000e+00
+-2.151410262704466891e+00,9.575070654566555817e+00,0.000000000000000000e+00
+-3.317691225945937905e+00,8.512529084613785102e+00,0.000000000000000000e+00
+-2.249314828804326538e+00,9.796108999975631448e+00,0.000000000000000000e+00
+5.614998569645852200e+00,1.826112302438593460e+00,1.000000000000000000e+00
+2.515983111918294490e+00,1.447414662259971063e+00,1.000000000000000000e+00
+-3.393055059253883066e+00,9.168011234143849109e+00,0.000000000000000000e+00
+-2.624845905440990723e+00,8.713182432609032801e+00,0.000000000000000000e+00
+-3.109836312971554939e+00,8.722592378405044755e+00,0.000000000000000000e+00
+-1.426146379877473169e+00,1.006808818023322516e+01,0.000000000000000000e+00
+3.712948364650018540e+00,1.913644327878931906e+00,1.000000000000000000e+00
+-2.412120073704709711e+00,9.982931118731210418e+00,0.000000000000000000e+00
+-2.216125149754069046e+00,8.299934710171953611e+00,0.000000000000000000e+00
+4.168840530609778661e+00,2.205219621298368349e+00,1.000000000000000000e+00
+3.658370185180150447e+00,2.435273158204002808e+00,1.000000000000000000e+00
+4.431756585870826548e+00,1.480168749281899121e+00,1.000000000000000000e+00
+3.880746174674403193e+00,2.123563470416939492e+00,1.000000000000000000e+00
+4.737554934776933457e+00,1.200159900085265630e+00,1.000000000000000000e+00
+-2.441669418364826427e+00,7.589537941984865199e+00,0.000000000000000000e+00
+4.525338990975483533e+00,3.210985995914193758e+00,1.000000000000000000e+00
+-4.059861054118883317e+00,9.082849103004349445e+00,0.000000000000000000e+00
+-2.522694847790684314e+00,7.956575199242420737e+00,0.000000000000000000e+00
+5.263998653280256512e+00,2.601515193205012011e+00,1.000000000000000000e+00
+-3.837383671951180908e+00,9.211147364067445054e+00,0.000000000000000000e+00
+-2.165579333484288771e+00,7.251245972835587139e+00,0.000000000000000000e+00
+5.159225350469273330e+00,3.505908596943309696e+00,1.000000000000000000e+00
+-3.522028743387173755e+00,9.328533460793595466e+00,0.000000000000000000e+00
+-1.883530275287744082e+00,8.157128571782038762e+00,0.000000000000000000e+00
+-1.718165676009703269e+00,8.104898673403582166e+00,0.000000000000000000e+00
+6.081152125294217115e+00,5.373075327612926166e-01,1.000000000000000000e+00
+-2.773854456290706150e+00,1.173445529478794036e+01,0.000000000000000000e+00
+-9.299848075453587271e-01,9.781720857351229981e+00,0.000000000000000000e+00
+3.262209468271010326e+00,1.035344644025609107e+00,1.000000000000000000e+00
+-2.177934191649186335e+00,9.989831255320680725e+00,0.000000000000000000e+00
+-3.110904235282147212e+00,1.086656431270725953e+01,0.000000000000000000e+00
+3.378994881893055968e+00,2.891031630995508195e+00,1.000000000000000000e+00
+5.387172441351363084e+00,2.583539949374197064e+00,1.000000000000000000e+00
+5.465295185216131557e+00,2.786679319941370636e+00,1.000000000000000000e+00
+5.945357643382430446e+00,1.994173525573491146e+00,1.000000000000000000e+00
+4.387310684834941021e+00,7.253865019758825028e-01,1.000000000000000000e+00
+6.954537402901610044e+00,1.059044913489839423e-01,1.000000000000000000e+00
+-5.128942727142494107e+00,9.836188632573545476e+00,0.000000000000000000e+00
+5.906789985414723887e+00,1.265500218321951253e+00,1.000000000000000000e+00
+3.817658440661670038e+00,2.216856895432644414e+00,1.000000000000000000e+00
+3.800156994047325210e+00,1.373777038496709846e+00,1.000000000000000000e+00
+-2.504084166410289303e+00,8.779698994823174729e+00,0.000000000000000000e+00
+-2.409546257965109017e+00,8.510810474082122212e+00,0.000000000000000000e+00
+-2.701558587833872593e+00,9.315833470531934779e+00,0.000000000000000000e+00
+-2.232506823722731237e+00,9.841469377234345117e+00,0.000000000000000000e+00
+4.884845407336824152e+00,1.466226508569602238e+00,1.000000000000000000e+00
+-1.478198100556799233e+00,9.945566247314520325e+00,0.000000000000000000e+00
+-1.987256057435852430e+00,9.311270801431508204e+00,0.000000000000000000e+00
+6.762035033240734627e+00,3.005634944491879068e+00,1.000000000000000000e+00
+-3.211250716930102556e+00,8.686623981600552824e+00,0.000000000000000000e+00
+3.867053621690529575e+00,1.736351077200723125e+00,1.000000000000000000e+00
+3.319645629207458981e+00,3.804628449795085743e+00,1.000000000000000000e+00
+-3.924568365103164425e+00,8.593640805432961827e+00,0.000000000000000000e+00
+6.772912210884367568e+00,2.108188441823011239e-02,1.000000000000000000e+00
+-2.901305776184907703e+00,7.550771180066202959e+00,0.000000000000000000e+00
+-3.580090121113862267e+00,9.496758543441506717e+00,0.000000000000000000e+00
+4.620862628325412835e+00,9.706403193029231602e-01,1.000000000000000000e+00
+5.593880599721304137e+00,2.624560935246529780e+00,1.000000000000000000e+00
+2.614736249570494220e+00,2.159623998710159754e+00,1.000000000000000000e+00
+5.590302674414151518e+00,1.396266028278328797e+00,1.000000000000000000e+00
+-4.116680857613977729e+00,9.198919986730626164e+00,0.000000000000000000e+00
+5.452740955067061357e+00,2.602798525864344015e+00,1.000000000000000000e+00
+-2.969836394012537628e+00,1.007140835441723681e+01,0.000000000000000000e+00
+3.439582429172324929e+00,1.638668448099783514e+00,1.000000000000000000e+00
+-1.593795505350676045e+00,9.343037237858005994e+00,0.000000000000000000e+00
+6.793061293739658169e+00,1.205822121052682494e+00,1.000000000000000000e+00
+3.821658152994628743e+00,4.065556959626192679e+00,1.000000000000000000e+00
+-2.267235351486716066e+00,7.101005883540523200e+00,0.000000000000000000e+00
+-3.987719613420177556e+00,8.294441919803613672e+00,0.000000000000000000e+00
+-1.770731043057339749e+00,9.185654409388291697e+00,0.000000000000000000e+00
+5.917543732016525837e+00,1.381598295104902174e+00,1.000000000000000000e+00
+-1.922340529252479779e+00,1.120474175400829964e+01,0.000000000000000000e+00
+5.330022827939213670e+00,1.571949212054895684e+00,1.000000000000000000e+00
+6.829681769445773654e+00,1.164871398585580531e+00,1.000000000000000000e+00
+-3.355991341121155269e+00,7.499438903512457344e+00,0.000000000000000000e+00
+-3.348415146275388832e+00,8.705073752347107785e+00,0.000000000000000000e+00
+5.083698264374329590e+00,2.747803737370068777e+00,1.000000000000000000e+00
+-2.336016697201568348e+00,9.399603507927158930e+00,0.000000000000000000e+00
+-3.292450915388987376e+00,8.692224611992646288e+00,0.000000000000000000e+00
+-3.186119623358708797e+00,9.625962417039190200e+00,0.000000000000000000e+00
+5.210769346921268586e+00,3.108735324121330912e+00,1.000000000000000000e+00
+-3.417221698573960964e+00,7.601982426863029829e+00,0.000000000000000000e+00
+4.531118687771243714e+00,2.374881406039673237e+00,1.000000000000000000e+00
+6.091022444023143301e+00,2.932440510025938973e+00,1.000000000000000000e+00
+-1.350602044045346117e+00,8.193603809846610631e+00,0.000000000000000000e+00
+4.167946970438667798e+00,3.062120280908097847e+00,1.000000000000000000e+00
+4.685450676131915237e+00,1.321569336334914802e+00,1.000000000000000000e+00
+-3.038957826819788988e+00,9.527553561311677299e+00,0.000000000000000000e+00
+3.120508870274087965e+00,1.488935611074480692e+00,1.000000000000000000e+00
+4.645122535946284437e+00,2.020150277705473840e+00,1.000000000000000000e+00
+-4.234115455565783392e+00,8.451998598957349174e+00,0.000000000000000000e+00
+5.512199472948779544e+00,2.156511689679083688e+00,1.000000000000000000e+00
+-2.281737688448620904e+00,1.032142888248074897e+01,0.000000000000000000e+00
+-3.398712052678273476e+00,8.198475843232882809e+00,0.000000000000000000e+00
+-2.300334028047994916e+00,7.054616004318545741e+00,0.000000000000000000e+00
+-2.258704772706873420e+00,9.360734337695296503e+00,0.000000000000000000e+00
+3.191794494730777032e+00,5.657059095641767676e-01,1.000000000000000000e+00
+4.709680921218120098e+00,1.587856087078971745e+00,1.000000000000000000e+00
+6.272290140159736183e+00,5.430283059800993239e-01,1.000000000000000000e+00
+-2.988371860898040300e+00,8.828627151534504947e+00,0.000000000000000000e+00
+4.950786401826105632e+00,3.448525900890284213e+00,1.000000000000000000e+00
+-1.545821493808428482e+00,9.427067055134820350e+00,0.000000000000000000e+00
+4.981634812005260926e+00,3.849340523156618232e+00,1.000000000000000000e+00
+4.324609591587755375e+00,2.732138904433999649e+00,1.000000000000000000e+00
+4.736874801220819720e+00,2.568326709377645400e+00,1.000000000000000000e+00
+4.964045188716543322e+00,1.843026629573047526e+00,1.000000000000000000e+00
+-6.230117218422199787e-01,9.188863941030160021e+00,0.000000000000000000e+00
+-2.732660408378601247e+00,9.728286622290413632e+00,0.000000000000000000e+00
+-3.483879293280071732e+00,9.801370731940773240e+00,0.000000000000000000e+00
+-3.615532597058778386e+00,7.818079504117650735e+00,0.000000000000000000e+00
+-1.687137463058260067e+00,1.091107911085226867e+01,0.000000000000000000e+00
+-2.450988904606750118e+00,7.871315830367698219e+00,0.000000000000000000e+00
+4.621365700235711138e+00,1.684511045020593567e+00,1.000000000000000000e+00
+-2.417436846517247773e+00,7.026717213597429179e+00,0.000000000000000000e+00
+5.154926522534148958e+00,5.825901174595452758e+00,1.000000000000000000e+00
+-3.189222344631240880e+00,9.246539825359324283e+00,0.000000000000000000e+00
+-3.428621857286553443e+00,1.056422053321586141e+01,0.000000000000000000e+00
+4.488093741192518138e+00,2.561486890425308527e+00,1.000000000000000000e+00
+5.819318956949388166e+00,1.503994031836027201e+00,1.000000000000000000e+00
+4.618977242263953009e+00,2.090497067249514007e+00,1.000000000000000000e+00
+-2.213077345988174294e+00,9.275341400378211532e+00,0.000000000000000000e+00
+4.704158855323564481e+00,8.954249060114258807e-01,1.000000000000000000e+00
+2.926744307137223888e+00,3.327042058106144840e+00,1.000000000000000000e+00
+-2.543909392757993437e+00,7.845608090578789273e+00,0.000000000000000000e+00
+4.199834349531117894e+00,2.103910261226823231e+00,1.000000000000000000e+00
+-4.427968838351791447e+00,8.987772252749104851e+00,0.000000000000000000e+00
+-3.660191200475052753e+00,9.389984146543993049e+00,0.000000000000000000e+00
+-2.851912139579519501e+00,8.212008858976702186e+00,0.000000000000000000e+00
+6.405333076509197809e+00,2.378151394901687699e+00,1.000000000000000000e+00
+-2.978672008987702124e+00,9.556846171784286526e+00,0.000000000000000000e+00
+3.978092371459713394e+00,2.825603018736956074e+00,1.000000000000000000e+00
+5.797989709728168961e+00,2.764832377903667648e+00,1.000000000000000000e+00
+4.422197633000880757e+00,3.071946535927922106e+00,1.000000000000000000e+00
+-2.728869510890262085e+00,9.371398699710068669e+00,0.000000000000000000e+00
+-3.746148333930832131e+00,7.693829515114044781e+00,0.000000000000000000e+00
+-2.295103878922546414e+00,7.768547349486333076e+00,0.000000000000000000e+00
+-2.035959998479205169e+00,8.941457215541449344e+00,0.000000000000000000e+00
+-2.147802017544336195e+00,1.055232269466429074e+01,0.000000000000000000e+00
+-2.581207744633084111e+00,1.001781902609034525e+01,0.000000000000000000e+00
+3.924575126968133265e+00,2.652767432875407838e+00,1.000000000000000000e+00
+-2.972615315865212438e+00,8.548556374628065058e+00,0.000000000000000000e+00
+3.921434614975665589e+00,1.759722532228884750e+00,1.000000000000000000e+00
+-2.670483334718759316e+00,9.418336985012860652e+00,0.000000000000000000e+00
+-2.743350997776086153e+00,8.780149171249140849e+00,0.000000000000000000e+00
+5.326139026602614734e+00,3.604538127510803491e-01,1.000000000000000000e+00
+-3.700501120255398568e+00,9.670839736832151701e+00,0.000000000000000000e+00
+-2.586299332466854395e+00,9.355438103014964923e+00,0.000000000000000000e+00
+4.050514079283889401e+00,2.822771780961756516e+00,1.000000000000000000e+00
+-2.754585739055620763e+00,8.260549963840832177e+00,0.000000000000000000e+00
+4.715683394421827934e+00,1.296007972428620203e+00,1.000000000000000000e+00
+-2.251647232329985648e+00,8.939840212432153876e+00,0.000000000000000000e+00
diff --git a/exercises/Numpy_KMeansClustering/sample-data/coords-with-labels-3.dat b/exercises/Numpy_KMeansClustering/sample-data/coords-with-labels-3.dat
new file mode 100644
index 0000000000000000000000000000000000000000..420385f5a44f81040f2909d027ba961cc78d774e
--- /dev/null
+++ b/exercises/Numpy_KMeansClustering/sample-data/coords-with-labels-3.dat
@@ -0,0 +1,305 @@
+# SPDX-FileCopyrightText: © 2021 HPC Core Facility of the Justus-Liebig-University Giessen <philipp.e.risius@theo.physik.uni-giessen.de>,<marcel.giar@physik.jlug.de>
+# SPDX-FileCopyrightText: © 2022 Competence Center for High Performance Computing in Hessen (HKHLR) <tim.jammer@hpc-hessen.de>, <marcel.giar@hpc-hessen.de>
+#
+# SPDX-License-Identifier: CC0-1.0
+
+-7.338988090691514365e+00,-7.729953962740738760e+00,1.000000000000000000e+00
+-7.740040556435222818e+00,-7.264665137505772030e+00,1.000000000000000000e+00
+-1.686652710949561040e+00,7.793442478227299297e+00,0.000000000000000000e+00
+4.422197633000880757e+00,3.071946535927922106e+00,2.000000000000000000e+00
+-8.917751726329123940e+00,-7.888195904193350927e+00,1.000000000000000000e+00
+5.497538459430121094e+00,1.813231153977304944e+00,2.000000000000000000e+00
+-2.336016697201568348e+00,9.399603507927158930e+00,0.000000000000000000e+00
+5.052810290503725987e+00,1.409445131136757290e+00,2.000000000000000000e+00
+-2.988371860898040300e+00,8.828627151534504947e+00,0.000000000000000000e+00
+-3.700501120255398568e+00,9.670839736832151701e+00,0.000000000000000000e+00
+-3.110904235282147212e+00,1.086656431270725953e+01,0.000000000000000000e+00
+4.996894322193148774e+00,1.280260088680077679e+00,2.000000000000000000e+00
+-2.300334028047994916e+00,7.054616004318545741e+00,0.000000000000000000e+00
+-3.924568365103164425e+00,8.593640805432961827e+00,0.000000000000000000e+00
+-7.530269760273096580e+00,-7.367234977040642896e+00,1.000000000000000000e+00
+-3.211250716930102556e+00,8.686623981600552824e+00,0.000000000000000000e+00
+-8.507169629034432745e+00,-6.832024646614564212e+00,1.000000000000000000e+00
+2.614736249570494220e+00,2.159623998710159754e+00,2.000000000000000000e+00
+-2.412120073704709711e+00,9.982931118731210418e+00,0.000000000000000000e+00
+-1.922340529252479779e+00,1.120474175400829964e+01,0.000000000000000000e+00
+-1.350602044045346117e+00,8.193603809846610631e+00,0.000000000000000000e+00
+-2.670483334718759316e+00,9.418336985012860652e+00,0.000000000000000000e+00
+5.614998569645852200e+00,1.826112302438593460e+00,2.000000000000000000e+00
+-6.991955240842099961e+00,-7.101079192809169882e+00,1.000000000000000000e+00
+-2.972615315865212438e+00,8.548556374628065058e+00,0.000000000000000000e+00
+-6.349823013235987190e+00,-5.438540972618046254e+00,1.000000000000000000e+00
+-7.456398521719602712e+00,-6.124718367450190826e+00,1.000000000000000000e+00
+3.821658152994628743e+00,4.065556959626192679e+00,2.000000000000000000e+00
+4.627632063381186711e+00,1.075915312454900352e+00,2.000000000000000000e+00
+-3.398712052678273476e+00,8.198475843232882809e+00,0.000000000000000000e+00
+-3.499733948183438415e+00,8.447988398595549953e+00,0.000000000000000000e+00
+-3.580090121113862267e+00,9.496758543441506717e+00,0.000000000000000000e+00
+-6.049291374607024707e+00,-7.736193419184814069e+00,1.000000000000000000e+00
+-2.295103878922546414e+00,7.768547349486333076e+00,0.000000000000000000e+00
+-8.394818253349821902e+00,-5.513235325831422173e+00,1.000000000000000000e+00
+-2.281737688448620904e+00,1.032142888248074897e+01,0.000000000000000000e+00
+-6.122638574505918641e+00,-7.802274917453572378e+00,1.000000000000000000e+00
+4.884845407336824152e+00,1.466226508569602238e+00,2.000000000000000000e+00
+-6.986657551105827757e+00,-7.915351915695320706e+00,1.000000000000000000e+00
+4.981634812005260926e+00,3.849340523156618232e+00,2.000000000000000000e+00
+5.906789985414723887e+00,1.265500218321951253e+00,2.000000000000000000e+00
+-2.251647232329985648e+00,8.939840212432153876e+00,0.000000000000000000e+00
+-7.367233415223763515e+00,-7.312667781095567143e+00,1.000000000000000000e+00
+4.525338990975483533e+00,3.210985995914193758e+00,2.000000000000000000e+00
+-2.543909392757993437e+00,7.845608090578789273e+00,0.000000000000000000e+00
+-2.147802017544336195e+00,1.055232269466429074e+01,0.000000000000000000e+00
+-6.808060953931877712e+00,-7.357767040041062856e+00,1.000000000000000000e+00
+4.154515288398997974e+00,2.055043823327054486e+00,2.000000000000000000e+00
+-6.542024529076067907e+00,-7.291986559398414336e+00,1.000000000000000000e+00
+6.225895652373453437e+00,7.353541851138829522e-01,2.000000000000000000e+00
+4.715683394421827934e+00,1.296007972428620203e+00,2.000000000000000000e+00
+-6.887599832467887317e+00,-5.400165454385920327e+00,1.000000000000000000e+00
+-6.513028945054421648e+00,-7.819989379603302204e+00,1.000000000000000000e+00
+-1.031303578311234093e+00,8.496015909924674148e+00,0.000000000000000000e+00
+-5.700330007087443640e+00,-6.812591111865837767e+00,1.000000000000000000e+00
+5.154926522534148958e+00,5.825901174595452758e+00,2.000000000000000000e+00
+-6.485175048772973128e+00,-7.301094074096209141e+00,1.000000000000000000e+00
+-1.545821493808428482e+00,9.427067055134820350e+00,0.000000000000000000e+00
+4.753396181479349281e+00,2.635300358461778458e+00,2.000000000000000000e+00
+-2.969836394012537628e+00,1.007140835441723681e+01,0.000000000000000000e+00
+-6.644012633042704508e+00,-6.109244399388980007e+00,1.000000000000000000e+00
+6.772912210884367568e+00,2.108188441823011239e-02,2.000000000000000000e+00
+5.539478711661351973e+00,2.280469204817341389e+00,2.000000000000000000e+00
+-3.800746382696032377e+00,-5.760534681841369853e+00,1.000000000000000000e+00
+-7.128591339630343526e+00,-5.908538642321591539e+00,1.000000000000000000e+00
+3.741464164879743315e+00,2.465088855447237659e+00,2.000000000000000000e+00
+3.921434614975665589e+00,1.759722532228884750e+00,2.000000000000000000e+00
+-6.168012313062380514e+00,-8.004751685113815185e+00,1.000000000000000000e+00
+-8.583009630506424514e+00,-6.935657292172565214e+00,1.000000000000000000e+00
+-3.571501336778855062e+00,9.487878558833502396e+00,0.000000000000000000e+00
+5.945357643382430446e+00,1.994173525573491146e+00,2.000000000000000000e+00
+-5.821202704301682296e+00,-8.638849079699060241e+00,1.000000000000000000e+00
+-7.579352699143855787e+00,-6.666129682541724222e+00,1.000000000000000000e+00
+-2.035959998479205169e+00,8.941457215541449344e+00,0.000000000000000000e+00
+-2.901305776184907703e+00,7.550771180066202959e+00,0.000000000000000000e+00
+-6.609170365371431544e+00,-6.930347702725083714e+00,1.000000000000000000e+00
+-8.947069291191146689e+00,-6.969229632788734641e+00,1.000000000000000000e+00
+3.880746174674403193e+00,2.123563470416939492e+00,2.000000000000000000e+00
+-3.109836312971554939e+00,8.722592378405044755e+00,0.000000000000000000e+00
+5.819318956949388166e+00,1.503994031836027201e+00,2.000000000000000000e+00
+-3.522028743387173755e+00,9.328533460793595466e+00,0.000000000000000000e+00
+-2.581207744633084111e+00,1.001781902609034525e+01,0.000000000000000000e+00
+-6.378710003526888883e+00,-7.857664838074497560e+00,1.000000000000000000e+00
+-2.177934191649186335e+00,9.989831255320680725e+00,0.000000000000000000e+00
+5.590302674414151518e+00,1.396266028278328797e+00,2.000000000000000000e+00
+-6.043935079086128148e+00,-8.009816447933564731e+00,1.000000000000000000e+00
+-5.711845129491463169e+00,-6.625688749974733227e+00,1.000000000000000000e+00
+-6.434231119079936168e-01,9.488119049110109060e+00,0.000000000000000000e+00
+6.405333076509197809e+00,2.378151394901687699e+00,2.000000000000000000e+00
+-3.886866991009841232e+00,8.076461088283199530e+00,0.000000000000000000e+00
+-8.549032472272642735e+00,-6.336749400896011686e+00,1.000000000000000000e+00
+-2.545023662162701594e+00,1.057892978401232753e+01,0.000000000000000000e+00
+-6.400647365404878109e+00,-6.546447487988998226e+00,1.000000000000000000e+00
+-1.593795505350676045e+00,9.343037237858005994e+00,0.000000000000000000e+00
+-3.038957826819788988e+00,9.527553561311677299e+00,0.000000000000000000e+00
+-7.433276496498452346e+00,-8.077987485864795758e+00,1.000000000000000000e+00
+-7.947247620533864243e+00,-7.022489078297240006e+00,1.000000000000000000e+00
+-2.249314828804326538e+00,9.796108999975631448e+00,0.000000000000000000e+00
+-7.642886347693787386e+00,-8.684991693940466106e+00,1.000000000000000000e+00
+-6.466192287927569282e+00,-5.003313780717880910e+00,1.000000000000000000e+00
+4.164933525067144870e+00,1.319840451367020107e+00,2.000000000000000000e+00
+-2.151410262704466891e+00,9.575070654566555817e+00,0.000000000000000000e+00
+4.431756585870826548e+00,1.480168749281899121e+00,2.000000000000000000e+00
+-1.718165676009703269e+00,8.104898673403582166e+00,0.000000000000000000e+00
+-3.348415146275388832e+00,8.705073752347107785e+00,0.000000000000000000e+00
+-2.267235351486716066e+00,7.101005883540523200e+00,0.000000000000000000e+00
+-2.165579333484288771e+00,7.251245972835587139e+00,0.000000000000000000e+00
+-2.258704772706873420e+00,9.360734337695296503e+00,0.000000000000000000e+00
+6.793061293739658169e+00,1.205822121052682494e+00,2.000000000000000000e+00
+-8.278194764970411512e+00,-6.317140356585375649e+00,1.000000000000000000e+00
+5.326139026602614734e+00,3.604538127510803491e-01,2.000000000000000000e+00
+-3.292450915388987376e+00,8.692224611992646288e+00,0.000000000000000000e+00
+-3.317691225945937905e+00,8.512529084613785102e+00,0.000000000000000000e+00
+-2.441669418364826427e+00,7.589537941984865199e+00,0.000000000000000000e+00
+-2.522694847790684314e+00,7.956575199242420737e+00,0.000000000000000000e+00
+4.838938531801571408e+00,1.372952806781937429e+00,2.000000000000000000e+00
+-7.916873345477726254e+00,-7.070448271359555115e+00,1.000000000000000000e+00
+2.926744307137223888e+00,3.327042058106144840e+00,2.000000000000000000e+00
+-8.750419112177125314e+00,-7.231623077317255621e+00,1.000000000000000000e+00
+3.633861454728399387e+00,7.589810711529998422e-01,2.000000000000000000e+00
+5.159225350469273330e+00,3.505908596943309696e+00,2.000000000000000000e+00
+4.863971318038518454e+00,1.985762084722526799e+00,2.000000000000000000e+00
+-1.106403312116650994e+00,7.612435065406041090e+00,0.000000000000000000e+00
+-6.303070228095503325e+00,-6.568859438732410183e+00,1.000000000000000000e+00
+-6.547313179171678321e+00,-7.628596129832500239e+00,1.000000000000000000e+00
+-7.007544782632036728e+00,-7.835650033876372156e+00,1.000000000000000000e+00
+-6.265460491107845087e+00,-6.122601883228641739e+00,1.000000000000000000e+00
+-3.186119623358708797e+00,9.625962417039190200e+00,0.000000000000000000e+00
+-5.842087246893370889e+00,-7.390125992130693433e+00,1.000000000000000000e+00
+-7.409884809523711091e+00,-7.672982425538291018e+00,1.000000000000000000e+00
+-1.426146379877473169e+00,1.006808818023322516e+01,0.000000000000000000e+00
+-4.427968838351791447e+00,8.987772252749104851e+00,0.000000000000000000e+00
+-2.417436846517247773e+00,7.026717213597429179e+00,0.000000000000000000e+00
+-4.234115455565783392e+00,8.451998598957349174e+00,0.000000000000000000e+00
+-3.987719613420177556e+00,8.294441919803613672e+00,0.000000000000000000e+00
+5.144866115208558632e+00,2.838924878110853367e+00,2.000000000000000000e+00
+4.387310684834941021e+00,7.253865019758825028e-01,2.000000000000000000e+00
+-7.149502126444641448e+00,-7.858873309058253653e+00,1.000000000000000000e+00
+-2.504084166410289303e+00,8.779698994823174729e+00,0.000000000000000000e+00
+-7.393494108487963956e+00,-7.939323115164897970e+00,1.000000000000000000e+00
+-2.978672008987702124e+00,9.556846171784286526e+00,0.000000000000000000e+00
+-2.754585739055620763e+00,8.260549963840832177e+00,0.000000000000000000e+00
+-4.818879266269282979e+00,-5.124768750832742192e+00,1.000000000000000000e+00
+-2.422150554814578971e+00,8.715278777732454074e+00,0.000000000000000000e+00
+3.120508870274087965e+00,1.488935611074480692e+00,2.000000000000000000e+00
+3.924575126968133265e+00,2.652767432875407838e+00,2.000000000000000000e+00
+5.452740955067061357e+00,2.602798525864344015e+00,2.000000000000000000e+00
+-2.232506823722731237e+00,9.841469377234345117e+00,0.000000000000000000e+00
+5.917543732016525837e+00,1.381598295104902174e+00,2.000000000000000000e+00
+-3.189222344631240880e+00,9.246539825359324283e+00,0.000000000000000000e+00
+-3.417221698573960964e+00,7.601982426863029829e+00,0.000000000000000000e+00
+-4.914902058234880577e+00,-6.844846041304218254e+00,1.000000000000000000e+00
+4.920870703963133863e+00,1.350470164120138206e+00,2.000000000000000000e+00
+-8.413741361886891923e+00,-5.602432771377437781e+00,1.000000000000000000e+00
+-2.213077345988174294e+00,9.275341400378211532e+00,0.000000000000000000e+00
+3.439582429172324929e+00,1.638668448099783514e+00,2.000000000000000000e+00
+4.621365700235711138e+00,1.684511045020593567e+00,2.000000000000000000e+00
+-7.410128338761797551e+00,-7.455927833920626746e+00,1.000000000000000000e+00
+-6.241034732373895721e+00,-8.541629655544905830e+00,1.000000000000000000e+00
+-3.393055059253883066e+00,9.168011234143849109e+00,0.000000000000000000e+00
+-5.128942727142494107e+00,9.836188632573545476e+00,0.000000000000000000e+00
+-7.755245444536027044e+00,-8.262909324240283127e+00,1.000000000000000000e+00
+-5.328475215628746930e+00,-6.764434958983088109e+00,1.000000000000000000e+00
+-5.678413268987325679e+00,-7.288184966297498235e+00,1.000000000000000000e+00
+-2.450988904606750118e+00,7.871315830367698219e+00,0.000000000000000000e+00
+-6.008502487719577623e+00,-7.206133125443788146e+00,1.000000000000000000e+00
+4.168840530609778661e+00,2.205219621298368349e+00,2.000000000000000000e+00
+-2.448967413111723612e+00,1.147752824068360766e+01,0.000000000000000000e+00
+-1.987256057435852430e+00,9.311270801431508204e+00,0.000000000000000000e+00
+-1.695680405683080316e+00,7.783421811764366538e+00,0.000000000000000000e+00
+-1.366374808537729635e+00,9.766219160885095008e+00,0.000000000000000000e+00
+-3.428621857286553443e+00,1.056422053321586141e+01,0.000000000000000000e+00
+3.800156994047325210e+00,1.373777038496709846e+00,2.000000000000000000e+00
+-2.955712575119771479e+00,9.870684922521792970e+00,0.000000000000000000e+00
+3.658370185180150447e+00,2.435273158204002808e+00,2.000000000000000000e+00
+4.664389010487044018e+00,2.471167975186181920e+00,2.000000000000000000e+00
+5.512199472948779544e+00,2.156511689679083688e+00,2.000000000000000000e+00
+-2.773854456290706150e+00,1.173445529478794036e+01,0.000000000000000000e+00
+6.762035033240734627e+00,3.005634944491879068e+00,2.000000000000000000e+00
+5.263998653280256512e+00,2.601515193205012011e+00,2.000000000000000000e+00
+-6.793037403678369834e+00,-7.035786828668026516e+00,1.000000000000000000e+00
+-3.053580347577932841e+00,9.125208717908186884e+00,0.000000000000000000e+00
+-7.542250950097116657e+00,-6.309510924682787625e+00,1.000000000000000000e+00
+6.272290140159736183e+00,5.430283059800993239e-01,2.000000000000000000e+00
+5.210769346921268586e+00,3.108735324121330912e+00,2.000000000000000000e+00
+-9.351271691278558507e+00,-7.677004848746423527e+00,1.000000000000000000e+00
+4.964045188716543322e+00,1.843026629573047526e+00,2.000000000000000000e+00
+-1.043548854131196135e+00,8.788509827711786571e+00,0.000000000000000000e+00
+1.398611496159028800e+00,9.487820426064421664e-01,2.000000000000000000e+00
+4.199834349531117894e+00,2.103910261226823231e+00,2.000000000000000000e+00
+-6.942306288424441973e+00,-5.924967272774708249e+00,1.000000000000000000e+00
+-5.251011645579978016e+00,-8.260211051490838230e+00,1.000000000000000000e+00
+3.814381639435589832e+00,1.651783842287738668e+00,2.000000000000000000e+00
+-8.486073511408843473e+00,-6.676645957408723575e+00,1.000000000000000000e+00
+3.909512204510964928e+00,2.189628273522707058e+00,2.000000000000000000e+00
+3.319645629207458981e+00,3.804628449795085743e+00,2.000000000000000000e+00
+4.620862628325412835e+00,9.706403193029231602e-01,2.000000000000000000e+00
+6.783822925553426586e+00,2.607088706258743116e+00,2.000000000000000000e+00
+-3.483879293280071732e+00,9.801370731940773240e+00,0.000000000000000000e+00
+-6.697760936092774564e+00,-6.631889006975610457e+00,1.000000000000000000e+00
+-1.770731043057339749e+00,9.185654409388291697e+00,0.000000000000000000e+00
+-2.624845905440990723e+00,8.713182432609032801e+00,0.000000000000000000e+00
+3.817658440661670038e+00,2.216856895432644414e+00,2.000000000000000000e+00
+4.050514079283889401e+00,2.822771780961756516e+00,2.000000000000000000e+00
+-1.696671800658552165e+00,1.037052615676914513e+01,0.000000000000000000e+00
+4.950786401826105632e+00,3.448525900890284213e+00,2.000000000000000000e+00
+-7.865353237486814031e+00,-6.376063077758102438e+00,1.000000000000000000e+00
+-7.526200075393796318e+00,-7.961657596890341360e+00,1.000000000000000000e+00
+4.736874801220819720e+00,2.568326709377645400e+00,2.000000000000000000e+00
+-2.147561598005116146e+00,8.369166373593197150e+00,0.000000000000000000e+00
+-2.409546257965109017e+00,8.510810474082122212e+00,0.000000000000000000e+00
+-7.844550651731374558e+00,-6.194058133277507316e+00,1.000000000000000000e+00
+6.091022444023143301e+00,2.932440510025938973e+00,2.000000000000000000e+00
+3.378994881893055968e+00,2.891031630995508195e+00,2.000000000000000000e+00
+-6.831105563206443243e+00,-7.711059709686984398e+00,1.000000000000000000e+00
+-5.377270139055242204e+00,-6.806014812856171048e+00,1.000000000000000000e+00
+-6.246845325044985131e+00,-4.609416735471550730e+00,1.000000000000000000e+00
+-6.302555063970729954e+00,-7.083154979318939226e+00,1.000000000000000000e+00
+-3.746148333930832131e+00,7.693829515114044781e+00,0.000000000000000000e+00
+-7.154678888302914430e+00,-9.182030758011531901e+00,1.000000000000000000e+00
+-6.050221609967780800e+00,-9.091244902283831308e+00,1.000000000000000000e+00
+2.515983111918294490e+00,1.447414662259971063e+00,2.000000000000000000e+00
+-7.635977936435573099e+00,-8.302363302873621009e+00,1.000000000000000000e+00
+-8.184096691656122857e+00,-6.210437044445908050e+00,1.000000000000000000e+00
+-2.496195731174843058e+00,1.046782020535563795e+01,0.000000000000000000e+00
+3.847358097795400944e+00,1.858433242473833014e+00,2.000000000000000000e+00
+-7.323920451227381889e+00,-6.502809100231094597e+00,1.000000000000000000e+00
+-5.192485556078705322e+00,-5.998469836326496107e+00,1.000000000000000000e+00
+4.324609591587755375e+00,2.732138904433999649e+00,2.000000000000000000e+00
+-2.586299332466854395e+00,9.355438103014964923e+00,0.000000000000000000e+00
+-1.687137463058260067e+00,1.091107911085226867e+01,0.000000000000000000e+00
+-5.953449643619628695e+00,-4.970692952805816134e+00,1.000000000000000000e+00
+-2.851912139579519501e+00,8.212008858976702186e+00,0.000000000000000000e+00
+-8.062885703817045169e+00,-8.919341771036046751e+00,1.000000000000000000e+00
+4.685450676131915237e+00,1.321569336334914802e+00,2.000000000000000000e+00
+5.321831807523064839e+00,1.662902927347275961e+00,2.000000000000000000e+00
+-7.531463298953429586e+00,-6.832710921959532335e+00,1.000000000000000000e+00
+4.618977242263953009e+00,2.090497067249514007e+00,2.000000000000000000e+00
+-5.234659477649985959e+00,-7.129145632832324608e+00,1.000000000000000000e+00
+-6.945706989798586584e+00,-8.091125793038402847e+00,1.000000000000000000e+00
+-6.589852334254857169e+00,-4.804708794630507818e+00,1.000000000000000000e+00
+4.962597396566191144e+00,1.145938740388408927e+00,2.000000000000000000e+00
+5.797989709728168961e+00,2.764832377903667648e+00,2.000000000000000000e+00
+-1.883530275287744082e+00,8.157128571782038762e+00,0.000000000000000000e+00
+-5.356503113881612599e+00,-6.341199549591287621e+00,1.000000000000000000e+00
+3.045451177433734280e+00,1.373794660986959126e+00,2.000000000000000000e+00
+5.330022827939213670e+00,1.571949212054895684e+00,2.000000000000000000e+00
+4.645122535946284437e+00,2.020150277705473840e+00,2.000000000000000000e+00
+-6.619904689429787936e+00,-7.784426218380355422e+00,1.000000000000000000e+00
+5.186976217398139077e+00,1.770977031506837829e+00,2.000000000000000000e+00
+-6.508481317779961195e+00,-7.484094779991766977e+00,1.000000000000000000e+00
+4.531118687771243714e+00,2.374881406039673237e+00,2.000000000000000000e+00
+-7.472021115390139023e+00,-7.744100362955762762e+00,1.000000000000000000e+00
+6.954537402901610044e+00,1.059044913489839423e-01,2.000000000000000000e+00
+6.829681769445773654e+00,1.164871398585580531e+00,2.000000000000000000e+00
+-6.552699817387107828e+00,-7.099210122084810948e+00,1.000000000000000000e+00
+3.712948364650018540e+00,1.913644327878931906e+00,2.000000000000000000e+00
+-3.837383671951180908e+00,9.211147364067445054e+00,0.000000000000000000e+00
+-8.358213436931110962e+00,-5.736355550069017539e+00,1.000000000000000000e+00
+-2.216125149754069046e+00,8.299934710171953611e+00,0.000000000000000000e+00
+-2.732660408378601247e+00,9.728286622290413632e+00,0.000000000000000000e+00
+-1.478198100556799233e+00,9.945566247314520325e+00,0.000000000000000000e+00
+-6.802258883503651710e+00,-7.741393794604210399e+00,1.000000000000000000e+00
+-4.059861054118883317e+00,9.082849103004349445e+00,0.000000000000000000e+00
+5.465295185216131557e+00,2.786679319941370636e+00,2.000000000000000000e+00
+3.978092371459713394e+00,2.825603018736956074e+00,2.000000000000000000e+00
+-6.759331559439370807e+00,-6.365670759217197272e+00,1.000000000000000000e+00
+4.709680921218120098e+00,1.587856087078971745e+00,2.000000000000000000e+00
+5.387172441351363084e+00,2.583539949374197064e+00,2.000000000000000000e+00
+-2.701558587833872593e+00,9.315833470531934779e+00,0.000000000000000000e+00
+-9.299848075453587271e-01,9.781720857351229981e+00,0.000000000000000000e+00
+4.737554934776933457e+00,1.200159900085265630e+00,2.000000000000000000e+00
+4.167946970438667798e+00,3.062120280908097847e+00,2.000000000000000000e+00
+5.154914103436761152e+00,2.486955634852940911e+00,2.000000000000000000e+00
+-6.780294885722044640e+00,-6.128722469904158032e+00,1.000000000000000000e+00
+-5.873334381936829551e+00,-7.457001462799095037e+00,1.000000000000000000e+00
+-7.149034025595828012e+00,-6.162567337479984531e+00,1.000000000000000000e+00
+-3.615532597058778386e+00,7.818079504117650735e+00,0.000000000000000000e+00
+-6.230117218422199787e-01,9.188863941030160021e+00,0.000000000000000000e+00
+-3.355991341121155269e+00,7.499438903512457344e+00,0.000000000000000000e+00
+3.867053621690529575e+00,1.736351077200723125e+00,2.000000000000000000e+00
+5.083698264374329590e+00,2.747803737370068777e+00,2.000000000000000000e+00
+6.081152125294217115e+00,5.373075327612926166e-01,2.000000000000000000e+00
+3.191794494730777032e+00,5.657059095641767676e-01,2.000000000000000000e+00
+-6.541130783656855741e+00,-7.295397507176748064e+00,1.000000000000000000e+00
+4.704158855323564481e+00,8.954249060114258807e-01,2.000000000000000000e+00
+-6.234251241566122204e+00,-5.511478035743597736e+00,1.000000000000000000e+00
+5.593880599721304137e+00,2.624560935246529780e+00,2.000000000000000000e+00
+4.488093741192518138e+00,2.561486890425308527e+00,2.000000000000000000e+00
+-6.495561742211962475e+00,-6.912804341370039296e+00,1.000000000000000000e+00
+-2.185113653657955179e+00,8.629203847782004999e+00,0.000000000000000000e+00
+4.189813364748857794e+00,2.596019616288230747e+00,2.000000000000000000e+00
+5.803042588383060973e+00,1.983402744960319097e+00,2.000000000000000000e+00
+-2.728869510890262085e+00,9.371398699710068669e+00,0.000000000000000000e+00
+-7.118575238017680107e+00,-7.787673255317544729e+00,1.000000000000000000e+00
+-3.660191200475052753e+00,9.389984146543993049e+00,0.000000000000000000e+00
+3.810883825306029316e+00,1.412988643743762429e+00,2.000000000000000000e+00
+-4.116680857613977729e+00,9.198919986730626164e+00,0.000000000000000000e+00
+-6.861208811961718723e+00,-5.203672281000663702e+00,1.000000000000000000e+00
+-6.010021271045610014e+00,-5.524471734470996154e+00,1.000000000000000000e+00
diff --git a/exercises/Numpy_KMeansClustering/sample-data/coords-with-labels-4.dat b/exercises/Numpy_KMeansClustering/sample-data/coords-with-labels-4.dat
new file mode 100644
index 0000000000000000000000000000000000000000..dd089173f784c08f0c5a2437d703834aa8193780
--- /dev/null
+++ b/exercises/Numpy_KMeansClustering/sample-data/coords-with-labels-4.dat
@@ -0,0 +1,405 @@
+# SPDX-FileCopyrightText: © 2021 HPC Core Facility of the Justus-Liebig-University Giessen <philipp.e.risius@theo.physik.uni-giessen.de>,<marcel.giar@physik.jlug.de>
+# SPDX-FileCopyrightText: © 2022 Competence Center for High Performance Computing in Hessen (HKHLR) <tim.jammer@hpc-hessen.de>, <marcel.giar@hpc-hessen.de>
+#
+# SPDX-License-Identifier: CC0-1.0
+
+-1.011875710913229298e+01,9.078317097483067144e+00,2.000000000000000000e+00
+-5.128942727142494107e+00,9.836188632573545476e+00,1.000000000000000000e+00
+-9.084070820721954931e+00,7.050799345751032732e+00,2.000000000000000000e+00
+5.614998569645852200e+00,1.826112302438593460e+00,3.000000000000000000e+00
+5.210769346921268586e+00,3.108735324121330912e+00,3.000000000000000000e+00
+-3.292450915388987376e+00,8.692224611992646288e+00,1.000000000000000000e+00
+-2.035959998479205169e+00,8.941457215541449344e+00,1.000000000000000000e+00
+-8.320668736166886958e+00,6.597779102345237234e+00,2.000000000000000000e+00
+-2.422150554814578971e+00,8.715278777732454074e+00,1.000000000000000000e+00
+6.091022444023143301e+00,2.932440510025938973e+00,3.000000000000000000e+00
+-8.667462318508068364e+00,7.139539579146023662e+00,2.000000000000000000e+00
+3.712948364650018540e+00,1.913644327878931906e+00,3.000000000000000000e+00
+-6.956303260160976443e+00,8.668942961653680612e+00,2.000000000000000000e+00
+3.924575126968133265e+00,2.652767432875407838e+00,3.000000000000000000e+00
+-7.367233415223763515e+00,-7.312667781095567143e+00,0.000000000000000000e+00
+-6.831105563206443243e+00,-7.711059709686984398e+00,0.000000000000000000e+00
+-6.378710003526888883e+00,-7.857664838074497560e+00,0.000000000000000000e+00
+-2.624845905440990723e+00,8.713182432609032801e+00,1.000000000000000000e+00
+4.050514079283889401e+00,2.822771780961756516e+00,3.000000000000000000e+00
+-6.589852334254857169e+00,-4.804708794630507818e+00,0.000000000000000000e+00
+4.167946970438667798e+00,3.062120280908097847e+00,3.000000000000000000e+00
+-8.512194734393892404e+00,6.072409339113399973e+00,2.000000000000000000e+00
+-3.053580347577932841e+00,9.125208717908186884e+00,1.000000000000000000e+00
+1.398611496159028800e+00,9.487820426064421664e-01,3.000000000000000000e+00
+-2.701558587833872593e+00,9.315833470531934779e+00,1.000000000000000000e+00
+-9.628802212081321699e+00,7.794991272634699264e+00,2.000000000000000000e+00
+-6.466192287927569282e+00,-5.003313780717880910e+00,0.000000000000000000e+00
+-2.412120073704709711e+00,9.982931118731210418e+00,1.000000000000000000e+00
+-7.635977936435573099e+00,-8.302363302873621009e+00,0.000000000000000000e+00
+-8.912761185736057357e+00,7.944195013049380805e+00,2.000000000000000000e+00
+-8.582298022322135012e+00,8.306213899444216509e+00,2.000000000000000000e+00
+5.917543732016525837e+00,1.381598295104902174e+00,3.000000000000000000e+00
+-7.809172119310366256e+00,7.796120397911746380e+00,2.000000000000000000e+00
+-1.024583945135383090e+01,6.545706227907827746e+00,2.000000000000000000e+00
+-2.988371860898040300e+00,8.828627151534504947e+00,1.000000000000000000e+00
+-6.168012313062380514e+00,-8.004751685113815185e+00,0.000000000000000000e+00
+6.405333076509197809e+00,2.378151394901687699e+00,3.000000000000000000e+00
+4.525338990975483533e+00,3.210985995914193758e+00,3.000000000000000000e+00
+-4.116680857613977729e+00,9.198919986730626164e+00,1.000000000000000000e+00
+-5.192485556078705322e+00,-5.998469836326496107e+00,0.000000000000000000e+00
+-1.366374808537729635e+00,9.766219160885095008e+00,1.000000000000000000e+00
+-1.039495693015991407e+01,7.929532866844342998e+00,2.000000000000000000e+00
+-8.642482501538328421e+00,6.345150137883670993e+00,2.000000000000000000e+00
+-6.400647365404878109e+00,-6.546447487988998226e+00,0.000000000000000000e+00
+4.189813364748857794e+00,2.596019616288230747e+00,3.000000000000000000e+00
+4.737554934776933457e+00,1.200159900085265630e+00,3.000000000000000000e+00
+-1.545821493808428482e+00,9.427067055134820350e+00,1.000000000000000000e+00
+-7.844550651731374558e+00,-6.194058133277507316e+00,0.000000000000000000e+00
+-2.851912139579519501e+00,8.212008858976702186e+00,1.000000000000000000e+00
+2.515983111918294490e+00,1.447414662259971063e+00,3.000000000000000000e+00
+-3.038957826819788988e+00,9.527553561311677299e+00,1.000000000000000000e+00
+-8.873316247132972734e+00,9.094323551134213091e+00,2.000000000000000000e+00
+-8.184096691656122857e+00,-6.210437044445908050e+00,0.000000000000000000e+00
+5.465295185216131557e+00,2.786679319941370636e+00,3.000000000000000000e+00
+5.263998653280256512e+00,2.601515193205012011e+00,3.000000000000000000e+00
+-8.728932962031118237e+00,8.049289539397395998e+00,2.000000000000000000e+00
+5.154926522534148958e+00,5.825901174595452758e+00,3.000000000000000000e+00
+-2.773854456290706150e+00,1.173445529478794036e+01,1.000000000000000000e+00
+-9.199293922455090922e+00,8.482852718862950780e+00,2.000000000000000000e+00
+5.052810290503725987e+00,1.409445131136757290e+00,3.000000000000000000e+00
+-5.821202704301682296e+00,-8.638849079699060241e+00,0.000000000000000000e+00
+3.921434614975665589e+00,1.759722532228884750e+00,3.000000000000000000e+00
+-7.902649363488548850e+00,8.595078010492862575e+00,2.000000000000000000e+00
+-6.942306288424441973e+00,-5.924967272774708249e+00,0.000000000000000000e+00
+-1.695680405683080316e+00,7.783421811764366538e+00,1.000000000000000000e+00
+-1.696671800658552165e+00,1.037052615676914513e+01,1.000000000000000000e+00
+-8.062885703817045169e+00,-8.919341771036046751e+00,0.000000000000000000e+00
+-8.549032472272642735e+00,-6.336749400896011686e+00,0.000000000000000000e+00
+-4.818879266269282979e+00,-5.124768750832742192e+00,0.000000000000000000e+00
+-8.127367788432518836e+00,7.767786226984743081e+00,2.000000000000000000e+00
+-3.398712052678273476e+00,8.198475843232882809e+00,1.000000000000000000e+00
+-6.302555063970729954e+00,-7.083154979318939226e+00,0.000000000000000000e+00
+-8.015157172674095776e+00,7.396840882687106600e+00,2.000000000000000000e+00
+3.867053621690529575e+00,1.736351077200723125e+00,3.000000000000000000e+00
+-8.908493468094658141e+00,5.662561981982712211e+00,2.000000000000000000e+00
+-6.434231119079936168e-01,9.488119049110109060e+00,1.000000000000000000e+00
+-9.565464932460878700e+00,7.076004279947198050e+00,2.000000000000000000e+00
+-8.583009630506424514e+00,-6.935657292172565214e+00,0.000000000000000000e+00
+-6.230117218422199787e-01,9.188863941030160021e+00,1.000000000000000000e+00
+-2.232506823722731237e+00,9.841469377234345117e+00,1.000000000000000000e+00
+-7.154678888302914430e+00,-9.182030758011531901e+00,0.000000000000000000e+00
+-1.593795505350676045e+00,9.343037237858005994e+00,1.000000000000000000e+00
+-8.357318524899296719e+00,7.547406939777834722e+00,2.000000000000000000e+00
+-9.463146334331689502e+00,7.349613965709536956e+00,2.000000000000000000e+00
+-7.128591339630343526e+00,-5.908538642321591539e+00,0.000000000000000000e+00
+5.497538459430121094e+00,1.813231153977304944e+00,3.000000000000000000e+00
+-9.069262286844688603e+00,8.019729280312121844e+00,2.000000000000000000e+00
+-7.410128338761797551e+00,-7.455927833920626746e+00,0.000000000000000000e+00
+-2.336016697201568348e+00,9.399603507927158930e+00,1.000000000000000000e+00
+-7.338988090691514365e+00,-7.729953962740738760e+00,0.000000000000000000e+00
+5.186976217398139077e+00,1.770977031506837829e+00,3.000000000000000000e+00
+-2.450988904606750118e+00,7.871315830367698219e+00,1.000000000000000000e+00
+-9.612116955739583801e+00,6.078868212187286346e+00,2.000000000000000000e+00
+-4.914902058234880577e+00,-6.844846041304218254e+00,0.000000000000000000e+00
+4.164933525067144870e+00,1.319840451367020107e+00,3.000000000000000000e+00
+-9.362848022915784441e+00,7.812897476726621271e+00,2.000000000000000000e+00
+-6.989371661690665150e+00,8.450087945046460547e+00,2.000000000000000000e+00
+-7.149034025595828012e+00,-6.162567337479984531e+00,0.000000000000000000e+00
+-6.991955240842099961e+00,-7.101079192809169882e+00,0.000000000000000000e+00
+-2.496195731174843058e+00,1.046782020535563795e+01,1.000000000000000000e+00
+-5.700330007087443640e+00,-6.812591111865837767e+00,0.000000000000000000e+00
+3.978092371459713394e+00,2.825603018736956074e+00,3.000000000000000000e+00
+-3.924568365103164425e+00,8.593640805432961827e+00,1.000000000000000000e+00
+-8.917751726329123940e+00,-7.888195904193350927e+00,0.000000000000000000e+00
+-8.358213436931110962e+00,-5.736355550069017539e+00,0.000000000000000000e+00
+-6.265460491107845087e+00,-6.122601883228641739e+00,0.000000000000000000e+00
+5.387172441351363084e+00,2.583539949374197064e+00,3.000000000000000000e+00
+-7.947247620533864243e+00,-7.022489078297240006e+00,0.000000000000000000e+00
+4.431756585870826548e+00,1.480168749281899121e+00,3.000000000000000000e+00
+-3.483879293280071732e+00,9.801370731940773240e+00,1.000000000000000000e+00
+-2.295103878922546414e+00,7.768547349486333076e+00,1.000000000000000000e+00
+-7.914300737429109667e+00,7.138620779055713683e+00,2.000000000000000000e+00
+4.838938531801571408e+00,1.372952806781937429e+00,3.000000000000000000e+00
+5.083698264374329590e+00,2.747803737370068777e+00,3.000000000000000000e+00
+-8.640242995868154807e+00,7.179162503574760379e+00,2.000000000000000000e+00
+-8.245226498667626913e+00,7.013976476184712538e+00,2.000000000000000000e+00
+-6.278243218367215661e+00,7.227463015774053368e+00,2.000000000000000000e+00
+4.531118687771243714e+00,2.374881406039673237e+00,3.000000000000000000e+00
+-9.383246843444959850e+00,7.722659029850774459e+00,2.000000000000000000e+00
+-3.211250716930102556e+00,8.686623981600552824e+00,1.000000000000000000e+00
+-7.531463298953429586e+00,-6.832710921959532335e+00,0.000000000000000000e+00
+4.199834349531117894e+00,2.103910261226823231e+00,3.000000000000000000e+00
+-8.183962100281952701e+00,7.267938244588248331e+00,2.000000000000000000e+00
+-3.746148333930832131e+00,7.693829515114044781e+00,1.000000000000000000e+00
+-8.660626755702756085e+00,5.988178556788602336e+00,2.000000000000000000e+00
+-6.945706989798586584e+00,-8.091125793038402847e+00,0.000000000000000000e+00
+-7.393494108487963956e+00,-7.939323115164897970e+00,0.000000000000000000e+00
+3.658370185180150447e+00,2.435273158204002808e+00,3.000000000000000000e+00
+-9.351271691278558507e+00,-7.677004848746423527e+00,0.000000000000000000e+00
+-1.043548854131196135e+00,8.788509827711786571e+00,1.000000000000000000e+00
+-4.427968838351791447e+00,8.987772252749104851e+00,1.000000000000000000e+00
+-2.955712575119771479e+00,9.870684922521792970e+00,1.000000000000000000e+00
+-6.547313179171678321e+00,-7.628596129832500239e+00,0.000000000000000000e+00
+-9.551173539313174032e+00,7.429953143190600073e+00,2.000000000000000000e+00
+-7.579352699143855787e+00,-6.666129682541724222e+00,0.000000000000000000e+00
+-9.201939968849865537e+00,7.266577291777635672e+00,2.000000000000000000e+00
+-1.770731043057339749e+00,9.185654409388291697e+00,1.000000000000000000e+00
+-6.697760936092774564e+00,-6.631889006975610457e+00,0.000000000000000000e+00
+-3.186119623358708797e+00,9.625962417039190200e+00,1.000000000000000000e+00
+-6.050221609967780800e+00,-9.091244902283831308e+00,0.000000000000000000e+00
+-3.571501336778855062e+00,9.487878558833502396e+00,1.000000000000000000e+00
+-7.530269760273096580e+00,-7.367234977040642896e+00,0.000000000000000000e+00
+-8.750419112177125314e+00,-7.231623077317255621e+00,0.000000000000000000e+00
+4.950786401826105632e+00,3.448525900890284213e+00,3.000000000000000000e+00
+-9.299848075453587271e-01,9.781720857351229981e+00,1.000000000000000000e+00
+-8.742206979695026803e+00,6.861247626793661070e+00,2.000000000000000000e+00
+-6.241034732373895721e+00,-8.541629655544905830e+00,0.000000000000000000e+00
+-3.522028743387173755e+00,9.328533460793595466e+00,1.000000000000000000e+00
+-8.458129905630046963e+00,7.934108660782526634e+00,2.000000000000000000e+00
+5.154914103436761152e+00,2.486955634852940911e+00,3.000000000000000000e+00
+-9.850432131896177168e+00,5.668666243632934254e+00,2.000000000000000000e+00
+-2.969836394012537628e+00,1.007140835441723681e+01,1.000000000000000000e+00
+-6.552699817387107828e+00,-7.099210122084810948e+00,0.000000000000000000e+00
+5.906789985414723887e+00,1.265500218321951253e+00,3.000000000000000000e+00
+-6.542024529076067907e+00,-7.291986559398414336e+00,0.000000000000000000e+00
+-2.177934191649186335e+00,9.989831255320680725e+00,1.000000000000000000e+00
+3.880746174674403193e+00,2.123563470416939492e+00,3.000000000000000000e+00
+-2.586299332466854395e+00,9.355438103014964923e+00,1.000000000000000000e+00
+4.387310684834941021e+00,7.253865019758825028e-01,3.000000000000000000e+00
+-2.147561598005116146e+00,8.369166373593197150e+00,1.000000000000000000e+00
+2.614736249570494220e+00,2.159623998710159754e+00,3.000000000000000000e+00
+-2.216125149754069046e+00,8.299934710171953611e+00,1.000000000000000000e+00
+4.964045188716543322e+00,1.843026629573047526e+00,3.000000000000000000e+00
+-8.810009380505549714e+00,7.353279054994448671e+00,2.000000000000000000e+00
+5.144866115208558632e+00,2.838924878110853367e+00,3.000000000000000000e+00
+-8.408709537503424869e+00,7.531210602661814413e+00,2.000000000000000000e+00
+-7.409884809523711091e+00,-7.672982425538291018e+00,0.000000000000000000e+00
+-9.696685536817224005e+00,8.023832794907693966e+00,2.000000000000000000e+00
+-8.205920017580488945e+00,8.296077365125432479e+00,2.000000000000000000e+00
+-8.004405602087105720e+00,7.782702994727140222e+00,2.000000000000000000e+00
+-8.877882910492665758e+00,8.005023612871328353e+00,2.000000000000000000e+00
+-3.348415146275388832e+00,8.705073752347107785e+00,1.000000000000000000e+00
+-6.802258883503651710e+00,-7.741393794604210399e+00,0.000000000000000000e+00
+-7.007544782632036728e+00,-7.835650033876372156e+00,0.000000000000000000e+00
+-8.413741361886891923e+00,-5.602432771377437781e+00,0.000000000000000000e+00
+4.704158855323564481e+00,8.954249060114258807e-01,3.000000000000000000e+00
+5.803042588383060973e+00,1.983402744960319097e+00,3.000000000000000000e+00
+-2.147802017544336195e+00,1.055232269466429074e+01,1.000000000000000000e+00
+6.225895652373453437e+00,7.353541851138829522e-01,3.000000000000000000e+00
+-7.172853312173433693e+00,8.337892980516834029e+00,2.000000000000000000e+00
+-5.953449643619628695e+00,-4.970692952805816134e+00,0.000000000000000000e+00
+3.120508870274087965e+00,1.488935611074480692e+00,3.000000000000000000e+00
+5.330022827939213670e+00,1.571949212054895684e+00,3.000000000000000000e+00
+4.715683394421827934e+00,1.296007972428620203e+00,3.000000000000000000e+00
+-6.049291374607024707e+00,-7.736193419184814069e+00,0.000000000000000000e+00
+4.620862628325412835e+00,9.706403193029231602e-01,3.000000000000000000e+00
+-2.522694847790684314e+00,7.956575199242420737e+00,1.000000000000000000e+00
+-2.670483334718759316e+00,9.418336985012860652e+00,1.000000000000000000e+00
+-6.010021271045610014e+00,-5.524471734470996154e+00,0.000000000000000000e+00
+-6.759331559439370807e+00,-6.365670759217197272e+00,0.000000000000000000e+00
+3.847358097795400944e+00,1.858433242473833014e+00,3.000000000000000000e+00
+-9.919384297044272714e+00,8.376675768831606916e+00,2.000000000000000000e+00
+-8.278537308704970954e+00,8.404303641053324725e+00,2.000000000000000000e+00
+4.618977242263953009e+00,2.090497067249514007e+00,3.000000000000000000e+00
+-6.861208811961718723e+00,-5.203672281000663702e+00,0.000000000000000000e+00
+-8.947069291191146689e+00,-6.969229632788734641e+00,0.000000000000000000e+00
+-8.996335655214998894e+00,6.896641845551283012e+00,2.000000000000000000e+00
+-7.323920451227381889e+00,-6.502809100231094597e+00,0.000000000000000000e+00
+-9.787726645467953901e+00,9.955904980336093502e+00,2.000000000000000000e+00
+-8.871081026852008833e+00,6.780098144364938406e+00,2.000000000000000000e+00
+2.926744307137223888e+00,3.327042058106144840e+00,3.000000000000000000e+00
+-2.165579333484288771e+00,7.251245972835587139e+00,1.000000000000000000e+00
+-6.485175048772973128e+00,-7.301094074096209141e+00,0.000000000000000000e+00
+-1.350602044045346117e+00,8.193603809846610631e+00,1.000000000000000000e+00
+-1.922340529252479779e+00,1.120474175400829964e+01,1.000000000000000000e+00
+5.321831807523064839e+00,1.662902927347275961e+00,3.000000000000000000e+00
+-9.411989763516245944e+00,6.776663974258310574e+00,2.000000000000000000e+00
+-3.189222344631240880e+00,9.246539825359324283e+00,1.000000000000000000e+00
+-9.181015350716359436e+00,6.952082049502911865e+00,2.000000000000000000e+00
+-7.916873345477726254e+00,-7.070448271359555115e+00,0.000000000000000000e+00
+-2.249314828804326538e+00,9.796108999975631448e+00,1.000000000000000000e+00
+4.627632063381186711e+00,1.075915312454900352e+00,3.000000000000000000e+00
+5.326139026602614734e+00,3.604538127510803491e-01,3.000000000000000000e+00
+-1.883530275287744082e+00,8.157128571782038762e+00,1.000000000000000000e+00
+5.590302674414151518e+00,1.396266028278328797e+00,3.000000000000000000e+00
+-8.278194764970411512e+00,-6.317140356585375649e+00,0.000000000000000000e+00
+-7.149502126444641448e+00,-7.858873309058253653e+00,0.000000000000000000e+00
+5.593880599721304137e+00,2.624560935246529780e+00,3.000000000000000000e+00
+-3.837383671951180908e+00,9.211147364067445054e+00,1.000000000000000000e+00
+3.909512204510964928e+00,2.189628273522707058e+00,3.000000000000000000e+00
+-5.678413268987325679e+00,-7.288184966297498235e+00,0.000000000000000000e+00
+-2.213077345988174294e+00,9.275341400378211532e+00,1.000000000000000000e+00
+-3.110904235282147212e+00,1.086656431270725953e+01,1.000000000000000000e+00
+3.319645629207458981e+00,3.804628449795085743e+00,3.000000000000000000e+00
+-8.394818253349821902e+00,-5.513235325831422173e+00,0.000000000000000000e+00
+-3.109836312971554939e+00,8.722592378405044755e+00,1.000000000000000000e+00
+-6.513028945054421648e+00,-7.819989379603302204e+00,0.000000000000000000e+00
+-3.700501120255398568e+00,9.670839736832151701e+00,1.000000000000000000e+00
+-5.328475215628746930e+00,-6.764434958983088109e+00,0.000000000000000000e+00
+-2.978672008987702124e+00,9.556846171784286526e+00,1.000000000000000000e+00
+-6.609170365371431544e+00,-6.930347702725083714e+00,0.000000000000000000e+00
+-2.754585739055620763e+00,8.260549963840832177e+00,1.000000000000000000e+00
+3.821658152994628743e+00,4.065556959626192679e+00,3.000000000000000000e+00
+-9.761561002746914184e+00,5.971838309882369522e+00,2.000000000000000000e+00
+-8.724100107973971063e+00,7.473824676960580504e+00,2.000000000000000000e+00
+-7.456398521719602712e+00,-6.124718367450190826e+00,0.000000000000000000e+00
+-6.264967953386149979e+00,7.382741349513191054e+00,2.000000000000000000e+00
+-8.819893823570616576e+00,7.671104620860374368e+00,2.000000000000000000e+00
+4.753396181479349281e+00,2.635300358461778458e+00,3.000000000000000000e+00
+-7.118575238017680107e+00,-7.787673255317544729e+00,0.000000000000000000e+00
+-2.185113653657955179e+00,8.629203847782004999e+00,1.000000000000000000e+00
+-2.581207744633084111e+00,1.001781902609034525e+01,1.000000000000000000e+00
+-8.824398464723063995e+00,7.299397828388699772e+00,2.000000000000000000e+00
+-1.718165676009703269e+00,8.104898673403582166e+00,1.000000000000000000e+00
+-6.780294885722044640e+00,-6.128722469904158032e+00,0.000000000000000000e+00
+-5.234659477649985959e+00,-7.129145632832324608e+00,0.000000000000000000e+00
+6.081152125294217115e+00,5.373075327612926166e-01,3.000000000000000000e+00
+-3.417221698573960964e+00,7.601982426863029829e+00,1.000000000000000000e+00
+3.800156994047325210e+00,1.373777038496709846e+00,3.000000000000000000e+00
+3.814381639435589832e+00,1.651783842287738668e+00,3.000000000000000000e+00
+5.945357643382430446e+00,1.994173525573491146e+00,3.000000000000000000e+00
+4.981634812005260926e+00,3.849340523156618232e+00,3.000000000000000000e+00
+-3.886866991009841232e+00,8.076461088283199530e+00,1.000000000000000000e+00
+-6.541130783656855741e+00,-7.295397507176748064e+00,0.000000000000000000e+00
+4.863971318038518454e+00,1.985762084722526799e+00,3.000000000000000000e+00
+-6.122638574505918641e+00,-7.802274917453572378e+00,0.000000000000000000e+00
+-2.901305776184907703e+00,7.550771180066202959e+00,1.000000000000000000e+00
+4.884845407336824152e+00,1.466226508569602238e+00,3.000000000000000000e+00
+-1.148929756502902322e+01,8.415029767421165374e+00,2.000000000000000000e+00
+-9.093304974056865220e+00,8.827515904081391085e+00,2.000000000000000000e+00
+-3.355991341121155269e+00,7.499438903512457344e+00,1.000000000000000000e+00
+3.817658440661670038e+00,2.216856895432644414e+00,3.000000000000000000e+00
+5.452740955067061357e+00,2.602798525864344015e+00,3.000000000000000000e+00
+-8.558359130316189223e+00,6.198033868200326424e+00,2.000000000000000000e+00
+-9.078653154794144697e+00,6.948702107949105589e+00,2.000000000000000000e+00
+-3.800746382696032377e+00,-5.760534681841369853e+00,0.000000000000000000e+00
+-6.043935079086128148e+00,-8.009816447933564731e+00,0.000000000000000000e+00
+-6.793037403678369834e+00,-7.035786828668026516e+00,0.000000000000000000e+00
+-2.543909392757993437e+00,7.845608090578789273e+00,1.000000000000000000e+00
+5.512199472948779544e+00,2.156511689679083688e+00,3.000000000000000000e+00
+6.783822925553426586e+00,2.607088706258743116e+00,3.000000000000000000e+00
+-7.865353237486814031e+00,-6.376063077758102438e+00,0.000000000000000000e+00
+-6.644012633042704508e+00,-6.109244399388980007e+00,0.000000000000000000e+00
+4.685450676131915237e+00,1.321569336334914802e+00,3.000000000000000000e+00
+-1.687137463058260067e+00,1.091107911085226867e+01,1.000000000000000000e+00
+-6.008502487719577623e+00,-7.206133125443788146e+00,0.000000000000000000e+00
+-2.417436846517247773e+00,7.026717213597429179e+00,1.000000000000000000e+00
+-2.732660408378601247e+00,9.728286622290413632e+00,1.000000000000000000e+00
+4.621365700235711138e+00,1.684511045020593567e+00,3.000000000000000000e+00
+-7.689054430350334535e+00,6.620346490372815751e+00,2.000000000000000000e+00
+-2.728869510890262085e+00,9.371398699710068669e+00,1.000000000000000000e+00
+6.762035033240734627e+00,3.005634944491879068e+00,3.000000000000000000e+00
+-6.246845325044985131e+00,-4.609416735471550730e+00,0.000000000000000000e+00
+-1.012828865637706421e+01,6.028444143435087277e+00,2.000000000000000000e+00
+-1.478198100556799233e+00,9.945566247314520325e+00,1.000000000000000000e+00
+-3.317691225945937905e+00,8.512529084613785102e+00,1.000000000000000000e+00
+-8.782602844347316307e+00,8.417714433969651466e+00,2.000000000000000000e+00
+-8.216517794418813025e+00,5.753298195608246957e+00,2.000000000000000000e+00
+-8.875962459060858123e+00,8.426824797515225285e+00,2.000000000000000000e+00
+5.819318956949388166e+00,1.503994031836027201e+00,3.000000000000000000e+00
+4.709680921218120098e+00,1.587856087078971745e+00,3.000000000000000000e+00
+-8.651560992158932706e+00,6.568139983145380612e+00,2.000000000000000000e+00
+3.378994881893055968e+00,2.891031630995508195e+00,3.000000000000000000e+00
+4.324609591587755375e+00,2.732138904433999649e+00,3.000000000000000000e+00
+-5.873334381936829551e+00,-7.457001462799095037e+00,0.000000000000000000e+00
+-9.272823984068326197e+00,7.014350792030064063e+00,2.000000000000000000e+00
+-2.281737688448620904e+00,1.032142888248074897e+01,1.000000000000000000e+00
+-3.580090121113862267e+00,9.496758543441506717e+00,1.000000000000000000e+00
+3.439582429172324929e+00,1.638668448099783514e+00,3.000000000000000000e+00
+-9.413965582873784044e+00,7.445532730144064359e+00,2.000000000000000000e+00
+-9.814201009613343629e+00,8.377164712106543121e+00,2.000000000000000000e+00
+3.633861454728399387e+00,7.589810711529998422e-01,3.000000000000000000e+00
+-8.530525987743951433e+00,5.613354522842077365e+00,2.000000000000000000e+00
+-5.251011645579978016e+00,-8.260211051490838230e+00,0.000000000000000000e+00
+4.996894322193148774e+00,1.280260088680077679e+00,3.000000000000000000e+00
+-8.116655692592775750e+00,6.194471144281473940e+00,2.000000000000000000e+00
+5.159225350469273330e+00,3.505908596943309696e+00,3.000000000000000000e+00
+-9.361050777155050184e+00,8.372532141335591760e+00,2.000000000000000000e+00
+-1.153521439957758155e+01,7.269228048980891366e+00,2.000000000000000000e+00
+-3.987719613420177556e+00,8.294441919803613672e+00,1.000000000000000000e+00
+-1.426146379877473169e+00,1.006808818023322516e+01,1.000000000000000000e+00
+-2.441669418364826427e+00,7.589537941984865199e+00,1.000000000000000000e+00
+-9.378087436945371280e+00,6.545218190096390387e+00,2.000000000000000000e+00
+-5.377270139055242204e+00,-6.806014812856171048e+00,0.000000000000000000e+00
+-2.504084166410289303e+00,8.779698994823174729e+00,1.000000000000000000e+00
+-2.448967413111723612e+00,1.147752824068360766e+01,1.000000000000000000e+00
+-8.507169629034432745e+00,-6.832024646614564212e+00,0.000000000000000000e+00
+-4.234115455565783392e+00,8.451998598957349174e+00,1.000000000000000000e+00
+-6.619904689429787936e+00,-7.784426218380355422e+00,0.000000000000000000e+00
+-7.542250950097116657e+00,-6.309510924682787625e+00,0.000000000000000000e+00
+-7.526200075393796318e+00,-7.961657596890341360e+00,0.000000000000000000e+00
+-6.495561742211962475e+00,-6.912804341370039296e+00,0.000000000000000000e+00
+-1.053079238635083037e+01,8.853073234959316196e+00,2.000000000000000000e+00
+-9.948903602101839994e+00,9.075793358922325638e+00,2.000000000000000000e+00
+4.154515288398997974e+00,2.055043823327054486e+00,3.000000000000000000e+00
+-6.887599832467887317e+00,-5.400165454385920327e+00,0.000000000000000000e+00
+-7.245141129996612861e+00,6.812307239067518339e+00,2.000000000000000000e+00
+-4.059861054118883317e+00,9.082849103004349445e+00,1.000000000000000000e+00
+3.810883825306029316e+00,1.412988643743762429e+00,3.000000000000000000e+00
+-1.987256057435852430e+00,9.311270801431508204e+00,1.000000000000000000e+00
+-1.106403312116650994e+00,7.612435065406041090e+00,1.000000000000000000e+00
+-7.642886347693787386e+00,-8.684991693940466106e+00,0.000000000000000000e+00
+-2.251647232329985648e+00,8.939840212432153876e+00,1.000000000000000000e+00
+4.645122535946284437e+00,2.020150277705473840e+00,3.000000000000000000e+00
+-1.092025716451973238e+01,9.019979283788741142e+00,2.000000000000000000e+00
+-6.986657551105827757e+00,-7.915351915695320706e+00,0.000000000000000000e+00
+-7.592242564138381056e+00,5.250132683090553698e+00,2.000000000000000000e+00
+-2.409546257965109017e+00,8.510810474082122212e+00,1.000000000000000000e+00
+-3.660191200475052753e+00,9.389984146543993049e+00,1.000000000000000000e+00
+-9.449845559627958025e+00,5.916861818650480664e+00,2.000000000000000000e+00
+-9.097919107999615562e+00,5.820379962380597405e+00,2.000000000000000000e+00
+-5.842087246893370889e+00,-7.390125992130693433e+00,0.000000000000000000e+00
+3.045451177433734280e+00,1.373794660986959126e+00,3.000000000000000000e+00
+-3.499733948183438415e+00,8.447988398595549953e+00,1.000000000000000000e+00
+-8.627310289433392398e+00,7.226809803628310824e+00,2.000000000000000000e+00
+6.829681769445773654e+00,1.164871398585580531e+00,3.000000000000000000e+00
+6.772912210884367568e+00,2.108188441823011239e-02,3.000000000000000000e+00
+-9.919391084235908096e+00,7.939458522442967237e+00,2.000000000000000000e+00
+-6.392575777019184002e+00,7.452744097473930296e+00,2.000000000000000000e+00
+-2.258704772706873420e+00,9.360734337695296503e+00,1.000000000000000000e+00
+3.741464164879743315e+00,2.465088855447237659e+00,3.000000000000000000e+00
+4.422197633000880757e+00,3.071946535927922106e+00,3.000000000000000000e+00
+-7.900043950660013081e+00,6.807478187281329696e+00,2.000000000000000000e+00
+5.539478711661351973e+00,2.280469204817341389e+00,3.000000000000000000e+00
+-9.107216447190840114e+00,6.216997006757033262e+00,2.000000000000000000e+00
+4.736874801220819720e+00,2.568326709377645400e+00,3.000000000000000000e+00
+4.920870703963133863e+00,1.350470164120138206e+00,3.000000000000000000e+00
+-2.972615315865212438e+00,8.548556374628065058e+00,1.000000000000000000e+00
+4.962597396566191144e+00,1.145938740388408927e+00,3.000000000000000000e+00
+-3.428621857286553443e+00,1.056422053321586141e+01,1.000000000000000000e+00
+-1.006045556552795617e+01,8.036521345671090444e+00,2.000000000000000000e+00
+-1.018651317874172335e+01,8.066787009521418028e+00,2.000000000000000000e+00
+-7.740040556435222818e+00,-7.264665137505772030e+00,0.000000000000000000e+00
+-8.486073511408843473e+00,-6.676645957408723575e+00,0.000000000000000000e+00
+-1.067920198796765519e+01,6.043945948763001397e+00,2.000000000000000000e+00
+-6.508481317779961195e+00,-7.484094779991766977e+00,0.000000000000000000e+00
+6.793061293739658169e+00,1.205822121052682494e+00,3.000000000000000000e+00
+-6.303070228095503325e+00,-6.568859438732410183e+00,0.000000000000000000e+00
+-5.356503113881612599e+00,-6.341199549591287621e+00,0.000000000000000000e+00
+-9.174112455926138665e+00,8.992544440788096338e+00,2.000000000000000000e+00
+-3.615532597058778386e+00,7.818079504117650735e+00,1.000000000000000000e+00
+-1.061704800554028871e+01,8.819567226987885533e+00,2.000000000000000000e+00
+-8.566748919440636101e+00,6.046774339678393950e+00,2.000000000000000000e+00
+-8.130575821180535456e+00,6.761056139604435522e+00,2.000000000000000000e+00
+-3.393055059253883066e+00,9.168011234143849109e+00,1.000000000000000000e+00
+-5.711845129491463169e+00,-6.625688749974733227e+00,0.000000000000000000e+00
+4.488093741192518138e+00,2.561486890425308527e+00,3.000000000000000000e+00
+-6.349823013235987190e+00,-5.438540972618046254e+00,0.000000000000000000e+00
+-7.755245444536027044e+00,-8.262909324240283127e+00,0.000000000000000000e+00
+-7.472021115390139023e+00,-7.744100362955762762e+00,0.000000000000000000e+00
+-2.545023662162701594e+00,1.057892978401232753e+01,1.000000000000000000e+00
+4.168840530609778661e+00,2.205219621298368349e+00,3.000000000000000000e+00
+-7.433276496498452346e+00,-8.077987485864795758e+00,0.000000000000000000e+00
+-9.827932576894591321e+00,7.197735995399055398e+00,2.000000000000000000e+00
+-9.465294814423778291e+00,9.135971473495631656e+00,2.000000000000000000e+00
+6.954537402901610044e+00,1.059044913489839423e-01,3.000000000000000000e+00
+-2.300334028047994916e+00,7.054616004318545741e+00,1.000000000000000000e+00
+5.797989709728168961e+00,2.764832377903667648e+00,3.000000000000000000e+00
+-6.234251241566122204e+00,-5.511478035743597736e+00,0.000000000000000000e+00
+-9.542671447178769029e+00,5.915061619135143722e+00,2.000000000000000000e+00
+-2.267235351486716066e+00,7.101005883540523200e+00,1.000000000000000000e+00
+-8.345009855755121109e+00,7.508359039193576834e+00,2.000000000000000000e+00
+-1.686652710949561040e+00,7.793442478227299297e+00,1.000000000000000000e+00
+-1.031303578311234093e+00,8.496015909924674148e+00,1.000000000000000000e+00
+-8.430075000921538830e+00,5.620939311260862326e+00,2.000000000000000000e+00
+4.664389010487044018e+00,2.471167975186181920e+00,3.000000000000000000e+00
+3.191794494730777032e+00,5.657059095641767676e-01,3.000000000000000000e+00
+-6.808060953931877712e+00,-7.357767040041062856e+00,0.000000000000000000e+00
+6.272290140159736183e+00,5.430283059800993239e-01,3.000000000000000000e+00
+-2.151410262704466891e+00,9.575070654566555817e+00,1.000000000000000000e+00
diff --git a/exercises/Pandas_WeatherData/WeatherData_Analysis.ipynb b/exercises/Pandas_WeatherData/WeatherData_Analysis.ipynb
new file mode 100644
index 0000000000000000000000000000000000000000..7e6721adaf54289d6706b560439e6a363709cff3
--- /dev/null
+++ b/exercises/Pandas_WeatherData/WeatherData_Analysis.ipynb
@@ -0,0 +1,448 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "7114f455",
+   "metadata": {},
+   "source": [
+    "# Weather Data "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "f361dfb4",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "%matplotlib inline\n",
+    "\n",
+    "from matplotlib import pyplot as plt\n",
+    "import numpy as np\n",
+    "import pandas as pd\n",
+    "\n",
+    "import io\n",
+    "import urllib.request\n",
+    "import zipfile\n",
+    "from pathlib import Path"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "70f8451d",
+   "metadata": {},
+   "source": [
+    "## Downloading the weather dataset\n",
+    "\n",
+    "Use the function `download_and_extract_weatherdata` provided below to download the weather dataset.\n",
+    "\n",
+    "This function downloads a ZIP file and unpacks it into the directory `./tmp` (unless specified otherwise). One of the files contained in the archive is `produkt_tu_stunde_19500101_20211231_01639.txt`. This file contains measurements of weather data (in particular temperature and relative humidity) from 1950 to the end of 2021."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "f6ec374f",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def download_and_extract_weatherdata(url: str, outdir: Path = Path('tmp')) -> None:\n",
+    "    \"\"\"Download DWD climate data from url and extract it into outdir.\"\"\"\n",
+    "    # Create the output directory if it does not exist yet\n",
+    "    outdir.mkdir(exist_ok=True)\n",
+    "    \n",
+    "    # Retrieve the ZIP archive from the URL\n",
+    "    response = urllib.request.urlopen(url)\n",
+    "    \n",
+    "    # Extract the ZIP archive from memory into the output directory\n",
+    "    z = zipfile.ZipFile(io.BytesIO(response.read()))\n",
+    "    z.extractall(path=outdir)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "62103fc3",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Download the hourly air temperature data from the DWD open data server\n",
+    "# and extract it into the temporary directory.\n",
+    "URL = (\n",
+    "    'https://opendata.dwd.de/climate_environment/CDC/observations_germany/climate/hourly/'\n",
+    "    'air_temperature/historical/stundenwerte_TU_01639_19500101_20211231_hist.zip'\n",
+    ")\n",
+    "TMP_DIRECTORY = Path(\"tmp\")\n",
+    "download_and_extract_weatherdata(url=URL, outdir=TMP_DIRECTORY)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "f0be4be5",
+   "metadata": {},
+   "source": [
+    "## Importing the measurement data\n",
+    "\n",
+    "The file `produkt_tu_stunde_19500101_20211231_01639.txt` is a CSV file (although the suffix `.txt` suggests otherwise).\n",
+    "\n",
+    "The columns of the file have the following headers:\n",
+    "\n",
+    "```\n",
+    "STATIONS_ID;MESS_DATUM;QN_9;TT_TU;RF_TU;eor\n",
+    "```\n",
+    "\n",
+    "Load the content of this file into a `pd.DataFrame` by using a suitable function. Only import the columns \n",
+    "- `MESS_DATUM` (date of the measurement), \n",
+    "- `TT_TU` (measured temperature in ${}^{\\circ}\\text{C}$), and \n",
+    "- `RF_TU` (measured relative humidity). \n",
+    "\n",
+    "After importing the data, gather some `info`rmation on it (e.g. the datatypes of the columns or the memory usage)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "f99173c3",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "df_weather = pd.read_csv(\n",
+    "    TMP_DIRECTORY / \"produkt_tu_stunde_19500101_20211231_01639.txt\",\n",
+    "    delimiter=\";\",\n",
+    "    usecols=[\"MESS_DATUM\",\"TT_TU\", \"RF_TU\"]\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "28ec724b",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "df_weather.head()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "a9448957",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "df_weather.tail()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "0377e45a",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "df_weather.info(memory_usage=\"deep\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "040a748e",
+   "metadata": {},
+   "source": [
+    "## Prepare the data for further analysis"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "b111bfdc",
+   "metadata": {},
+   "source": [
+    "The column `MESS_DATUM` contains the date of each measurement in the format `%Y%m%d%H`. The datatype of this column is `int64`.\n",
+    "\n",
+    "Create a new `DataFrame` named `df_weather_tweaked` that is based on the original `df_weather` from above.\n",
+    "\n",
+    "Apply the modifications described below to `df_weather` and assign the result to the `df_weather_tweaked` variable.\n",
+    "\n",
+    "**Note**: You will be using several methods of the `DataFrame` object. Consider *chaining* the calls to these methods to have a compact way to make the relevant transformations to the `df_weather` `DataFrame`.\n",
+    "\n",
+    "\n",
+    "### Changing the format of the measurement dates\n",
+    "\n",
+    "The `\"MESS_DATUM\"` column in the original `DataFrame` contains the dates of measurement as integer values. The format is `%Y%m%d%H`, which stands for \"YearMonthDayHour\". We would like to convert these values to a date-like format.\n",
+    "\n",
+    "To convert these integer values to a suitable date format, look at the [pandas time series tutorial](https://pandas.pydata.org/pandas-docs/stable/getting_started/intro_tutorials/09_timeseries.html?highlight=datetime) to find a suitable function for making such a conversion.\n",
+    "\n",
+    "Use this function together with the `assign` method of a `DataFrame` instance to modify the \"`MESS_DATUM`\" column in an appropriate manner.\n",
+    "\n",
+    "### Rename the column headers\n",
+    "\n",
+    "Rename the following columns:\n",
+    "\n",
+    "* `\"MESS_DATUM\"` $\\to$ `\"Date of Measurement\"`\n",
+    "* `\"TT_TU\"` $\\to$ `\"Temperature\"`\n",
+    "* `\"RF_TU\"` $\\to$ `\"Humidity\"`\n",
+    "\n",
+    "\n",
+    "### Setting a new index \n",
+    "\n",
+    "Make the column named `\"Date of Measurement\"` the new index of the new `DataFrame` instance.\n",
+    "\n",
+    "### Note\n",
+    "\n",
+    "In all the following tasks you are supposed to work with the new modified `df_weather_tweaked`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "8030b621",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "df_weather_tweaked = (\n",
+    "    df_weather\n",
+    "    # Transform the integer dates to a date-like format\n",
+    "    .assign(\n",
+    "        MESS_DATUM=pd.to_datetime(\n",
+    "            df_weather[\"MESS_DATUM\"].astype('str'), \n",
+    "            format=\"%Y%m%d%H\"\n",
+    "        )\n",
+    "    )\n",
+    "    # Rename the columns\n",
+    "    .rename(\n",
+    "        columns={\n",
+    "            \"MESS_DATUM\": \"Date of Measurement\",\n",
+    "            \"TT_TU\": \"Temperature\",\n",
+    "            \"RF_TU\": \"Humidity\"\n",
+    "        }\n",
+    "    )\n",
+    "    # Set a new index\n",
+    "    .set_index(\"Date of Measurement\")\n",
+    "    .astype(\n",
+    "        {\n",
+    "            \"Temperature\": np.float64,\n",
+    "            \"Humidity\": np.float64\n",
+    "        }\n",
+    "    )\n",
+    ")\n",
+    "df_weather_tweaked"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "4b681490",
+   "metadata": {},
+   "source": [
+    "## Clean up the dataset\n",
+    "\n",
+    "When measurements are taken over a long period of time, it is quite likely that erroneous data sneaks into the dataset. We should remove this data from the `DataFrame`.\n",
+    "\n",
+    "Analyse the dataset in a suitable manner to check whether the measured values for the temperature and the relative humidity seem reasonable.\n",
+    "\n",
+    "- Plot the distribution of the temperature and the relative humidity. Look for suitable functions in the [`pandas.DataFrame.plot`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.plot.html) accessor.\n",
+    "- Determine the smallest (minimal) as well as the largest (maximal) value for each of the data columns.\n",
+    "- Remove all conspicuously small or large values from the dataset. Make sure not to generate a new `DataFrame` but rather to perform all adjustments on the already-existing one. Afterwards, re-check your results."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "2a28e418",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4), sharey=\"row\")\n",
+    "\n",
+    "ax1.set_xlabel(\"temperature / degree Celsius\")\n",
+    "df_weather_tweaked.plot.hist(ax=ax1, y=\"Temperature\", bins=50)\n",
+    "ax2.set_xlabel(\"relative humidity / %\")\n",
+    "df_weather_tweaked.plot.hist(ax=ax2, y=\"Humidity\", bins=50)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "d16a3cfb",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "df_weather_tweaked.describe()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "3eb31737",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Drop all rows whose temperature or humidity is implausibly small (below -998.9999)\n",
+    "invalid_rows = df_weather_tweaked.index[(df_weather_tweaked[\"Temperature\"] < -998.9999) | (df_weather_tweaked[\"Humidity\"] < -998.9999)]\n",
+    "df_weather_tweaked.drop(invalid_rows, inplace=True)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "2ee9be9c",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "df_weather_tweaked.describe()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "546c3127",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))\n",
+    "\n",
+    "ax1.set_xlabel(\"temperature / degree Celsius\")\n",
+    "ax2.set_xlabel(\"relative humidity / %\")\n",
+    "\n",
+    "df_weather_tweaked[\"Temperature\"].value_counts().plot.line(ax=ax1, style=\"o\")\n",
+    "df_weather_tweaked[\"Humidity\"].value_counts().plot.line(ax=ax2, style=\"s\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "de80c2cf",
+   "metadata": {},
+   "source": [
+    "## Analyse the data"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "74dd1b57",
+   "metadata": {},
+   "source": [
+    "### Monthly distribution of temperature and humidity\n",
+    "\n",
+    "- Group the data by the month in which each measurement was taken. *Hint*: The `index` of the `DataFrame` has a `month` attribute.\n",
+    "\n",
+    "- Display the distribution of the temperature and the relative humidity for each month in a [violin plot](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.violinplot.html). The abscissa must show each month as an integer value while the ordinate must show the values for the temperature or the relative humidity, respectively. *Hint*: In order to extract the data from the subframes of the `DataFrameGroupBy` object you need to iterate over it in a suitable manner."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "ceaaacae",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "by_month = df_weather_tweaked.groupby(df_weather_tweaked.index.month)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "b9251d7e",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(8, 5), sharex=\"col\")\n",
+    "\n",
+    "ax2.set_xticks(range(1, 13))\n",
+    "ax2.set_xlabel(\"month\")\n",
+    "\n",
+    "ax1.set_ylabel(\"temperature / deg. C\")\n",
+    "ax1.violinplot([subframe[\"Temperature\"] for _, subframe in by_month]); # trailing semicolon avoids verbose output\n",
+    "ax2.set_ylabel(\"rel. humidity / %\")\n",
+    "ax2.violinplot([subframe[\"Humidity\"] for _, subframe in by_month]); # trailing semicolon avoids verbose output"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "95b85f06",
+   "metadata": {},
+   "source": [
+    "### Yearly mean temperature\n",
+    "\n",
+    "- Group the data by the year in which the measurements have been conducted. Then use *two* different methods of your choice to compute the mean value of the temperatures in each subframe. The result is the average temperature for each year in the dataset.\n",
+    "\n",
+    "- Plot the results for the yearly averaged temperature in a suitable manner."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "b750cd9d",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "by_year = df_weather_tweaked.groupby(df_weather_tweaked.index.year)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "a83a4e4b",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "df_by_year_agg = by_year.agg([\"mean\"])\n",
+    "df_by_year_apply = by_year.apply(lambda x: x.mean())"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "ed4157a9",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "df_by_year_apply[\"Temperature\"].plot.line(style=\"o\", xlabel=\"year\", ylabel=\"temperature / degree Celsius\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "489abbfd",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.6"
+  },
+  "toc": {
+   "base_numbering": 1,
+   "nav_menu": {},
+   "number_sections": true,
+   "sideBar": true,
+   "skip_h1_title": false,
+   "title_cell": "Table of Contents",
+   "title_sidebar": "Contents",
+   "toc_cell": false,
+   "toc_position": {
+    "height": "calc(100% - 180px)",
+    "left": "10px",
+    "top": "150px",
+    "width": "384px"
+   },
+   "toc_section_display": true,
+   "toc_window_display": true
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
diff --git a/exercises/Pandas_WeatherData/WeatherData_Analysis.ipynb.license b/exercises/Pandas_WeatherData/WeatherData_Analysis.ipynb.license
new file mode 100644
index 0000000000000000000000000000000000000000..c207ab8c094a9d18d7c6cb5c9dfbf8913df4aa8a
--- /dev/null
+++ b/exercises/Pandas_WeatherData/WeatherData_Analysis.ipynb.license
@@ -0,0 +1,4 @@
+SPDX-FileCopyrightText: © 2021 HPC Core Facility of the Justus-Liebig-University Giessen <philipp.e.risius@theo.physik.uni-giessen.de>,<marcel.giar@physik.jlug.de>
+SPDX-FileCopyrightText: © 2022 Competence Center for High Performance Computing in Hessen (HKHLR) <tim.jammer@hpc-hessen.de>, <marcel.giar@hpc-hessen.de>
+
+SPDX-License-Identifier: MIT
diff --git a/exercises/Pandas_WeatherData/WeatherData_Analysis_tasks.ipynb b/exercises/Pandas_WeatherData/WeatherData_Analysis_tasks.ipynb
new file mode 100644
index 0000000000000000000000000000000000000000..1ad3696e349ee55baf0a967e92e89f71b2ecbc89
--- /dev/null
+++ b/exercises/Pandas_WeatherData/WeatherData_Analysis_tasks.ipynb
@@ -0,0 +1,476 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "7114f455",
+   "metadata": {},
+   "source": [
+    "# Weather Data "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "f361dfb4",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "%matplotlib inline\n",
+    "\n",
+    "from matplotlib import pyplot as plt\n",
+    "import numpy as np\n",
+    "import pandas as pd\n",
+    "\n",
+    "import io\n",
+    "import urllib.request\n",
+    "import zipfile\n",
+    "from pathlib import Path"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "70f8451d",
+   "metadata": {},
+   "source": [
+    "## Downloading the weather dataset\n",
+    "\n",
+    "Use the function `download_and_extract_weatherdata` provided below to download the weather dataset.\n",
+    "\n",
+    "This function downloads a ZIP file and unpacks it into the directory `./tmp` (unless specified otherwise). One of the files contained in the ZIP archive is `produkt_tu_stunde_19500101_20201231_01639.txt`. This file contains measurements of weather data (in particular temperature and relative humidity) from 1950 to the end of 2020."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "f6ec374f",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def download_and_extract_weatherdata(url: str, outdir: Path = Path('tmp')) -> None:\n",
+    "    \"\"\"Download DWD climate data from `url` and extract it into `outdir`.\"\"\"\n",
+    "    # Create the output directory for saving the files to disk\n",
+    "    outdir.mkdir(exist_ok=True)\n",
+    "    \n",
+    "    # Retrieve the ZIP archive from the URL\n",
+    "    response = urllib.request.urlopen(url)\n",
+    "    \n",
+    "    # Extract the ZIP archive into the output directory\n",
+    "    z = zipfile.ZipFile(io.BytesIO(response.read()))\n",
+    "    z.extractall(path=outdir)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "62103fc3",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Download the ZIP archive from the DWD open-data server\n",
+    "# and extract it into the temporary directory.\n",
+    "URL = (\n",
+    "    'https://opendata.dwd.de/climate_environment/CDC/observations_germany/climate/hourly/'\n",
+    "    'air_temperature/historical/stundenwerte_TU_01639_19500101_20211231_hist.zip'\n",
+    ")\n",
+    "TMP_DIRECTORY = Path(\"tmp\")\n",
+    "\n",
+    "download_and_extract_weatherdata(url=URL, outdir=TMP_DIRECTORY)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "f0be4be5",
+   "metadata": {},
+   "source": [
+    "## Importing the measurement data\n",
+    "\n",
+    "The file `produkt_tu_stunde_19500101_20201231_01639.txt` is a CSV file (although the suffix `.txt` suggests otherwise).\n",
+    "\n",
+    "The columns of the file have the following headers:\n",
+    "\n",
+    "```\n",
+    "STATIONS_ID;MESS_DATUM;QN_9;TT_TU;RF_TU;eor\n",
+    "```\n",
+    "\n",
+    "Load the content of this file into a `pd.DataFrame` by using a suitable function. Only import the columns \n",
+    "- `MESS_DATUM` (date of the measurement), \n",
+    "- `TT_TU` (measured temperature in ${}^{\\circ}\\text{C}$), and \n",
+    "- `RF_TU` (measured relative humidity). \n",
+    "\n",
+    "After having imported the data gather some `info`rmation on the data (e.g. datatypes of columns or memory usage)."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "06461f7c",
+   "metadata": {},
+   "source": [
+    "Import the data:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "f99173c3",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "### YOUR CODE GOES HERE"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "65a852ea",
+   "metadata": {},
+   "source": [
+    "Inspect the first few lines of the `DataFrame`:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "28ec724b",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "### YOUR CODE GOES HERE"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "f05b58c7",
+   "metadata": {},
+   "source": [
+    "Inspect the last lines of the `DataFrame`:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "a9448957",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "### YOUR CODE GOES HERE"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "379a7374",
+   "metadata": {},
+   "source": [
+    "What is the memory usage of the current `DataFrame` instance?"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "0377e45a",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# YOUR CODE GOES HERE"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "040a748e",
+   "metadata": {},
+   "source": [
+    "## Prepare the data for further analysis"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "b111bfdc",
+   "metadata": {},
+   "source": [
+    "The column `MESS_DATUM` contains the date of each measurement in the format `%Y%m%d%H`. The datatype of this column is `int64`.\n",
+    "\n",
+    "Create a new `DataFrame` named `df_weather_tweaked` that is based on the original `df_weather` from above.\n",
+    "\n",
+    "To do so, make the following modifications to the `df_weather` `DataFrame` and assign the result to the `df_weather_tweaked` variable.\n",
+    "\n",
+    "**Note**: You will be using several methods of the `DataFrame` object. Consider *chaining* the calls to these methods to have a compact way to make the relevant transformations to the `df_weather` `DataFrame`.\n",
+    "\n",
+    "\n",
+    "### Changing the format of the measurement dates\n",
+    "\n",
+    "The `\"MESS_DATUM\"` column in the original `DataFrame` contains the dates of measurement as integer values. The format is `%Y%m%d%H` which is meant to represent \"YearMonthDayHour\". We would like to have these in a date-like format.\n",
+    "\n",
+    "To convert these integer values to a suitable date format look at the [pandas time series tutorial](https://pandas.pydata.org/pandas-docs/stable/getting_started/intro_tutorials/09_timeseries.html?highlight=datetime) to find a suitable function for making such a conversion.\n",
+    "\n",
+    "Use this function together with the `assign` method of a `DataFrame` instance to modify the `\"MESS_DATUM\"` column in an appropriate manner.\n",
+    "\n",
+    "### Rename the column headers\n",
+    "\n",
+    "Rename the following columns:\n",
+    "\n",
+    "* `\"MESS_DATUM\"` $\\to$ `\"Date of Measurement\"`\n",
+    "* `\"TT_TU\"` $\\to$ `\"Temperature\"`\n",
+    "* `\"RF_TU\"` $\\to$ `\"Humidity\"`\n",
+    "\n",
+    "\n",
+    "### Setting a new index \n",
+    "\n",
+    "Make the column named `\"Date of Measurement\"` the new index of the new `DataFrame` instance.\n",
+    "\n",
+    "### Note\n",
+    "\n",
+    "In all the following tasks you are supposed to work with the new modified `df_weather_tweaked`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "8030b621",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# YOUR CODE GOES HERE"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "4b681490",
+   "metadata": {},
+   "source": [
+    "## Clean up the dataset\n",
+    "\n",
+    "When measurements are taken over a long period of time it is quite likely that erroneous data sneaks into the dataset. We should remove such data from the `DataFrame`.\n",
+    "\n",
+    "Analyse the dataset in a suitable manner to check whether it contains measured values of the temperature or the relative humidity that do not seem reasonable.\n",
+    "\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "1c6d2689",
+   "metadata": {},
+   "source": [
+    "- Plot the distribution of the temperature and the relative humidity. Look for suitable methods of the [`pandas.DataFrame.plot`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.plot.html) accessor."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "2a28e418",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# YOUR CODE GOES HERE"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "d9c3c339",
+   "metadata": {},
+   "source": [
+    "- Determine the smallest (minimal) as well as the largest (maximal) value for each of the data columns.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "d16a3cfb",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# YOUR CODE GOES HERE"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "e63a133a",
+   "metadata": {},
+   "source": [
+    "- Remove all conspicuously small or large values from the dataset. Make sure not to generate a new `DataFrame` but rather to perform all adjustments on the already-existing one. Afterwards, re-check your results."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "3eb31737",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# YOUR CODE GOES HERE"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "2ee9be9c",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# YOUR CODE GOES HERE"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "546c3127",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# YOUR CODE GOES HERE"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "de80c2cf",
+   "metadata": {},
+   "source": [
+    "## Analyse the data"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "74dd1b57",
+   "metadata": {},
+   "source": [
+    "### Monthly distribution of temperature and humidity\n",
+    "\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "7bee5b7d",
+   "metadata": {},
+   "source": [
+    "- Group the data by the month in which each measurement was taken. *Hint*: The `index` of the `DataFrame` has a `month` attribute.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "ceaaacae",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# YOUR CODE GOES HERE"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "28c23a20",
+   "metadata": {},
+   "source": [
+    "- Display the distribution of the temperature and the relative humidity for each month in a [violin plot](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.violinplot.html). The abscissa must show each month as an integer value while the ordinate must show the values for the temperature or the relative humidity, respectively. *Hint*: In order to extract the data from the subframes of the `DataFrameGroupBy` object you need to iterate over it in a suitable manner."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "b9251d7e",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# YOUR CODE GOES HERE"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "95b85f06",
+   "metadata": {},
+   "source": [
+    "### Yearly mean temperature"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "44fbb458",
+   "metadata": {},
+   "source": [
+    "\n",
+    "- Group the data by the year in which the measurements have been conducted. Then use *two* different methods of your choice to compute the mean value of the temperatures in each subframe. The result is the average temperature for each year in the dataset."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "b750cd9d",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# YOUR CODE GOES HERE"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "a83a4e4b",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# YOUR CODE GOES HERE"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "c3000dae",
+   "metadata": {},
+   "source": [
+    "- Plot the results for the yearly averaged temperature in a suitable manner."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "ed4157a9",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# YOUR CODE GOES HERE"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "9bcce04b",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.6"
+  },
+  "toc": {
+   "base_numbering": 1,
+   "nav_menu": {},
+   "number_sections": true,
+   "sideBar": true,
+   "skip_h1_title": false,
+   "title_cell": "Table of Contents",
+   "title_sidebar": "Contents",
+   "toc_cell": false,
+   "toc_position": {
+    "height": "calc(100% - 180px)",
+    "left": "10px",
+    "top": "150px",
+    "width": "384px"
+   },
+   "toc_section_display": true,
+   "toc_window_display": true
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
diff --git a/exercises/Pandas_WeatherData/WeatherData_Analysis_tasks.ipynb.license b/exercises/Pandas_WeatherData/WeatherData_Analysis_tasks.ipynb.license
new file mode 100644
index 0000000000000000000000000000000000000000..c207ab8c094a9d18d7c6cb5c9dfbf8913df4aa8a
--- /dev/null
+++ b/exercises/Pandas_WeatherData/WeatherData_Analysis_tasks.ipynb.license
@@ -0,0 +1,4 @@
+SPDX-FileCopyrightText: © 2021 HPC Core Facility of the Justus-Liebig-University Giessen <philipp.e.risius@theo.physik.uni-giessen.de>,<marcel.giar@physik.jlug.de>
+SPDX-FileCopyrightText: © 2022 Competence Center for High Performance Computing in Hessen (HKHLR) <tim.jammer@hpc-hessen.de>, <marcel.giar@hpc-hessen.de>
+
+SPDX-License-Identifier: MIT
diff --git a/slides/Day1.ipynb b/slides/Day1.ipynb
new file mode 100644
index 0000000000000000000000000000000000000000..ed50bae8d42016e63fb438057cd83ff94fe5de87
--- /dev/null
+++ b/slides/Day1.ipynb
@@ -0,0 +1,3911 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "d93c06a4",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "# HiPerCH 14 Module 1:  Introduction to Python Data Processing tools"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "09758895",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "notes"
+    }
+   },
+   "source": [
+    "Note: \",\" removes the ? icon"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "094990cd",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Content\n",
+    "- Basic overview of NumPy\n",
+    "  - datatypes\n",
+    "  - array-oriented programming\n",
+    "  - linear algebra\n",
+    "- Tabulated data: Pandas\n",
+    "  - adding semantic information\n",
+    "  - reading, transforming, and plotting data\n",
+    "  - grouping and aggregation"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "7a816a80",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "## Formats\n",
+    "### Presentation\n",
+    "- zoom conference with screenshare\n",
+    "- notebooks available for download\n",
+    "\n",
+    "### Small practical demonstrations\n",
+    "- integrated into presentation\n",
+    "- ~10 minutes assigned per demo\n",
+    "\n",
+    "### Take-home exercises\n",
+    "- simple practical projects\n",
+    "- demonstration of common methods\n",
+    "- self-study during or after the workshop"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "42c8b5d5",
+   "metadata": {
+    "cell_style": "split",
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Agenda for today\n",
+    "- 09:00 - 12:00 Morning session\n",
+    "  - Introduction to Numpy\n",
+    "  - Arrays, datatypes, array access\n",
+    "- 12:00 - 13:00 Lunch break\n",
+    "- 13:00 - 17:00 Afternoon session\n",
+    "  - Broadcasting and universal functions\n",
+    "  - **Hands on Exercises**"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "9e6a61ae",
+   "metadata": {
+    "cell_style": "split",
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "source": [
+    "## Agenda for tomorrow\n",
+    "- 09:00 - 12:00 Morning session\n",
+    "  - Introduction to Pandas\n",
+    "  - Usage of Pandas `DataFrame`s\n",
+    "- 12:00 - 13:00 Lunch break\n",
+    "- 13:00 - 17:00 Afternoon session\n",
+    "  - Some more `DataFrame`s\n",
+    "  - **Hands on Exercises**\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "791069bd",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Why Use Python for scientific computing?\n",
+    "* Python is easy to learn\n",
+    "* Fast prototyping\n",
+    "* Excellent for interactive exploratory work (e.g. Jupyter Notebook)\n",
+    "* Many different scientific modules / libraries available\n",
+    "* Efficient calculations are possible\n",
+    "\n",
+    "$\\Rightarrow$ Python code is glue-code between \"high-performance\" languages (C/C++, Fortran, ...)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "425fd375",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "# Setting up: environment creation and validation"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "b98d9e52",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "## Environment Creation"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "71cd16fb",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "### Python Anaconda distribution\n",
+    " \n",
+    "#### From the command line\n",
+    "To create a conda environment, execute the following command from the command line:\n",
+    "```bash\n",
+    "$ cd /path/to/course/directory  # make sure to navigate to the course directory first!\n",
+    "$ conda env create -f environment.yml\n",
+    "```\n",
+    "\n",
+    "Afterwards, activate it:\n",
+    "```bash\n",
+    "$ conda activate scipython\n",
+    "```\n",
+    "\n",
+    "#### From the Anaconda Navigator (e.g. Windows)\n",
+    "Follow the instructions at https://docs.anaconda.com/anaconda/navigator/tutorials/manage-environments/#id6 and use the provided `environment.yml` file."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "6c2c10b0",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "### Python virtual environments\n",
+    "For creating a Python-native virtual environment, open a terminal emulator and execute the following commands: \n",
+    "```bash\n",
+    "$ cd /path/to/course/directory  # make sure to navigate to the course directory first!\n",
+    "$ python3 -m venv .venv  # creates a virtual environment\n",
+    "$ source .venv/bin/activate\n",
+    "$ pip3 install --upgrade pip\n",
+    "$ pip3 install -r requirements.txt\n",
+    "$ jupyter contrib nbextension install --sys-prefix\n",
+    "$ jupyter-nbextension install rise --py --sys-prefix\n",
+    "$ jupyter-nbextension enable rise --py --sys-prefix\n",
+    "\n",
+    "```\n",
+    "Please refer to [this link](https://docs.python.org/3/library/venv.html) for how to create a virtual environment on Windows."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "0c8c43d2",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Starting the Jupyter server\n",
+    "### From the command line\n",
+    "Make sure that the course environment is active. In the course directory, start a Jupyter server:\n",
+    "```bash\n",
+    "$ cd /path/to/course/directory  # make sure to navigate to the course directory first!\n",
+    "$ jupyter notebook  # this will open a browser window\n",
+    "```\n",
+    "\n",
+    "### From the Anaconda Navigator\n",
+    "Make sure that the course environment is active. Then open a Jupyter notebook from the GUI."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "507795f7",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Validation: Are you ready to start?\n",
+    "\n",
+    "If you can execute the following cells without error, you are ready to start this module."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "677dee02",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "%matplotlib inline\n",
+    "\n",
+    "import numpy as np\n",
+    "import pandas as pd\n",
+    "import scipy\n",
+    "from matplotlib import pyplot as plt\n",
+    "\n",
+    "print(f\"Numpy version  : {np.__version__}\")\n",
+    "print(f\"Scipy version  : {scipy.__version__}\")\n",
+    "print(f\"Pandas version : {pd.__version__}\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "ca28902c",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "# Part 1: Numpy\n",
+    "\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "e1a6bd4d",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Content\n",
+    "\n",
+    "- Introduction to Numpy\n",
+    "- Datatypes\n",
+    "- Concept of multi-dimensional arrays\n",
+    "- Array access\n",
+    "- Broadcasting\n",
+    "- Universal functions\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "7501ca2b",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Content\n",
+    "  - ***Introduction to Numpy***\n",
+    "  - Datatypes\n",
+    "  - Concept of multi-dimensional arrays\n",
+    "  - Array access\n",
+    "  - Broadcasting\n",
+    "  - Universal functions\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "b420d5ce",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## NumPy (Numerical Python)\n",
+    "* Open source Python library\n",
+    "* Multi-dimensional arrays\n",
+    "* Efficient storing of data\n",
+    "* Efficient computing (e.g. vectorization)\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "b080c967",
+   "metadata": {
+    "cell_style": "split",
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "![Scientific Python Ecosystem](https://media.springernature.com/lw685/springer-static/image/art%3A10.1038%2Fs41586-020-2649-2/MediaObjects/41586_2020_2649_Fig2_HTML.png?as=webp)\n",
+    "Figure From: https://doi.org/10.1038/s41586-020-2649-2"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "da03546b",
+   "metadata": {
+    "cell_style": "split",
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "source": [
+    "* Numpy is the base of many other scientific libraries.\n",
+    "\n",
+    "* For example, in astronomy, NumPy was an important part of the software stack used in the discovery of gravitational waves and in the first imaging of a black hole."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "5ba975b4",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Numpy VS Standard Python\n",
+    "\n",
+    "Hands on: Time the code on the next slides and see where Numpy is faster!"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "cf5059d3",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "import numpy as np\n",
+    "myrange = range(10000) # built-in Python range object (not a NumPy array)\n",
+    "%timeit [i ** 2 for i in myrange]\n",
+    "a = np.arange(10000) # numpy array\n",
+    "%timeit a ** 2 # note that we operate on the array like a scalar value!"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "9a1b1d4a",
+   "metadata": {
+    "cell_style": "center",
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Example: vector addition\n",
+    "def python_version(size):\n",
+    "    X = range(size)\n",
+    "    Y = range(size)\n",
+    "    Z = [X[i] + Y[i] for i in range(len(X)) ]\n",
+    "\n",
+    "\n",
+    "def numpy_version(size):\n",
+    "    X = np.arange(size)\n",
+    "    Y = np.arange(size)\n",
+    "    Z = X + Y\n",
+    "\n",
+    "%timeit python_version(1000)\n",
+    "%timeit numpy_version(1000)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "298661a9",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "notes"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# TODO: an example with strings?"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "4d94a263",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "# So where does the performance benefit come from?"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "62f09236",
+   "metadata": {
+    "cell_style": "split",
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "source": [
+    "### Datatypes\n",
+    "* Fixed static datatypes for usage of more efficient CPU instructions"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "a21e7d4f",
+   "metadata": {
+    "cell_style": "split",
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "source": [
+    "### Array oriented programming\n",
+    "* Only work with array objects\n",
+    "* Offload loops to *compiled* programming language\n",
+    "* Use the CPU's vector instructions (SSE, AVX, ...) instead of loops"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "bf7b6ed4",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Content\n",
+    "  - Introduction to Numpy\n",
+    "  - ***Datatypes***\n",
+    "  - Concept of multi-dimensional arrays\n",
+    "  - Array access\n",
+    "  - Broadcasting\n",
+    "  - Universal functions\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "28074bdf",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Python has 3 built-in numeric datatypes:\n",
+    "- integers (`int`)\n",
+    "- floating point numbers (`float`)\n",
+    "- complex numbers (`complex`)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "a0ee54a1",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## `int`: integer numbers\n",
+    "* literals: `42`, `1`, `-15`\n",
+    "* no hard maximum value (e.g. `10**1000` is perfectly valid)\n",
+    "* dynamic size, memory overhead, *cannot directly use native CPU instructions for basic math* \n",
+    "* $\\Rightarrow$ very flexible, but often not very efficient\n",
+    "* typical integer datatypes in C: `int`, `unsigned int`, `long int`"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "4dd77e54",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# The size of a Python built-in `int` object depends on the value of the integer\n",
+    "import sys\n",
+    "\n",
+    "# sizes are given in bytes\n",
+    "print(sys.getsizeof(1))\n",
+    "print(sys.getsizeof(10 ** 1000))\n",
+    "print(sys.getsizeof(10 ** 10000))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "651b5a9f",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## `float`: floating point numbers\n",
+    "* floating point number: $\\textrm{sign} * \\textrm{mantissa} * \\textrm{base}^\\textrm{exponent}$, e.g. $-1.234\\cdot10^{2}$\n",
+    "* literals in Python: `0.0`, `314.15`, `-1.5e7` (meaning $-1.5\\cdot10^{7}$)\n",
+    "* usually implemented as `double` in C (64 bit / 8 byte) \n",
+    "* thus, limited max, min, eps (see [`sys.float_info`](https://docs.python.org/3/library/sys.html#sys.float_info))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "471ca00e",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# The size of `float` objects is *independent* of the value\n",
+    "import sys\n",
+    "\n",
+    "print(sys.getsizeof(1.0))\n",
+    "print(sys.getsizeof(1.34e18))\n",
+    "\n",
+    "# low level information about precision and internal representation of `float`s\n",
+    "print(sys.float_info)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "9b2467e7",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## `complex`: complex floating point numbers\n",
+    "* literals: `1.0+2.0j`, `1j`\n",
+    "* `1j**2 == -1`\n",
+    "* `x.real`, `x.imag` are `float`s"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "0eaca8aa",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "notes"
+    }
+   },
+   "source": [
+    "-- comment: j comes from electrical engineering, where I is used for current\n",
+    "-- https://stackoverflow.com/questions/24812444/why-are-complex-numbers-in-python-denoted-with-j-instead-of-i"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "019a151e",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Content\n",
+    "  - Introduction to Numpy\n",
+    "  - Datatypes\n",
+    "  - ***Concept of multi-dimensional arrays***\n",
+    "  - Array access\n",
+    "  - Broadcasting\n",
+    "  - Universal functions\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "3b4a7b89",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "$N$-dimensional arrays of numerical data are essential for scientific computing, where data is often handled by means of multi-dimensional indexed arrays.\n",
+    "\n",
+    "- Natural sciences & numerical mathematics\n",
+    "    - Vectors, matrices, tensors\n",
+    "- Data Science\n",
+    "    - Datasets (e.g. via Pandas), tensors"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "f9a4116a",
+   "metadata": {
+    "cell_style": "center",
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "### Use cases\n",
+    "\n",
+    "- Linear algebra\n",
+    "    - Matrix-vector multiplication, matrix-matrix multiplication\n",
+    "- Statistics with large datasets\n",
+    "    - Aggregating data for computing mean, standard deviation, ...\n",
+    "- Deep learning:\n",
+    "    - Operations involving high-dimensional arrays (\"tensors\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "e9f6d378",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## What about Python's built-in container types (`list`, `tuple`)?\n",
+    "\n",
+    "- Can hold *any type* of Python object\n",
+    "    - Mostly not suitable for native CPU instructions\n",
+    "    - Have no concept of, e.g., a rectangular array\n",
+    "- Not designed with numerical calculations in mind\n",
+    "\n",
+    "Not efficient enough to be used for \"number crunching\"."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "230188fa",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## NumPy's [ndarrays](https://numpy.org/doc/stable/reference/arrays.ndarray.html)\n",
+    "\n",
+    "- multi-dimensional, fixed-size containers of items of the same *size* and *type*\n",
+    "- number of dimensions and items\n",
+    "    - `shape`: N-tuple with non-negative integer values describing the sizes of each dimension\n",
+    "- type of each item\n",
+    "    - data type object (`dtype`)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "9b542fa8",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "# Usage of Numpy"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "e26a3840",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Import NumPy\n",
+    "Import NumPy into the current namespace, usually with the alias `np` for conciseness."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "f092e077",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "import numpy as np"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "5ac42dc5",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Array creation"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "4226dbd4",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "### From Python's built-in container types"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "ae65a30e",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# from Python lists\n",
+    "A = np.array([1, 2, 3])\n",
+    "A, type(A)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "d996a1c8",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# from Python tuples\n",
+    "A = np.array((1, 2, 3))\n",
+    "A, type(A)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "f7a19ea3",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Does not work for generators since ndarrays need their size at creation time:\n",
+    "# `np.array()` does not consume the generator, but wraps it in a 0-dimensional object array.\n",
+    "A = np.array(i**2 for i in range(1, 5))\n",
+    "A\n",
+    "# We have to use `np.fromiter()` to convert an iterator to an ndarray.\n",
+    "# A = np.fromiter((i ** 2 for i in range(1, 5)), dtype=np.int32)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "79a667ca",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# We can also use nested lists\n",
+    "nested_list = [[1, 2, 3], [10, 20, 30]] \n",
+    "A = np.array(nested_list)\n",
+    "print(A)\n",
+    "print(f\"dimension of A: {A.ndim}\")\n",
+    "print(f\"shape     of A: {A.shape}\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "f346f93c",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# MIND: sublists must all have the *same* size\n",
+    "nested_list = [[1] * 3, [2] * 3, [3] * 2]\n",
+    "print(nested_list)\n",
+    "np.array(nested_list)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "cdfbc8cb",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# A = np.array(range(3))\n",
+    "A = np.array([[1.0] * 5, [4] * 5])\n",
+    "\n",
+    "# metadata sample of an ndarray (these are the attributes of the instances of ndarray class)\n",
+    "print(\"shape   :\", A.shape)\n",
+    "print(\"size    :\", A.size)\n",
+    "print(\"dtype   :\", A.dtype)  # we will come to the type later!\n",
+    "print(\"itemsize:\", A.itemsize) # size is in bytes"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "a38c59fb",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "### From NumPy built-in functions (factory functions)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "f18b2f91",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# np.arange(n) generates n integers,\n",
+    "# starting at 0, up to n-1\n",
+    "# (this function is similar to the standard Python range function)\n",
+    "A = np.arange(10)\n",
+    "A"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "74fb95b7",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# arange with start, stop, step parameters\n",
+    "# NOTE: the start value is included, the stop value is not\n",
+    "print(np.arange(3, 11, 1))\n",
+    "print(np.arange(3, 11, 2))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "014c3f70",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Numpy provides some factory methods for array initialisation:\n",
+    "\n",
+    "# zero array, here with 2D shape\n",
+    "array_of_zeros = np.zeros((2, 3))\n",
+    "print(array_of_zeros)\n",
+    "\n",
+    "# create ones array with same shape as zeros\n",
+    "array_of_ones = np.ones(array_of_zeros.shape)\n",
+    "print(array_of_ones)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "7cff636b",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "array_of_ones = np.ones((3, 3))\n",
+    "print(array_of_ones)\n",
+    "# Return an array of zeros with same shape and type as given array.\n",
+    "array_of_zeros = np.zeros_like(array_of_ones)\n",
+    "print(array_of_zeros)\n",
+    "# Return an array with `fill_value` with same shape and type as given array.\n",
+    "array_of_fives = np.full_like(array_of_ones, fill_value=5)\n",
+    "print(array_of_fives)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "ff4b8beb",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Diagonal arrays\n",
+    "print(np.diag(np.arange(1, 4), k=-1))\n",
+    "print(np.eye(3))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "5e72e19f",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "- Numpy arrays can also be read from / stored to disk.\n",
+    "- But this is not covered in this course.\n",
+    "- For this we will introduce Pandas tomorrow."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "f530026d",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Data type ([dtype](https://numpy.org/doc/stable/reference/arrays.dtypes.html))\n",
+    "You can explicitly set the underlying numeric data type for an ndarray.\n",
+    "A fixed numeric data type is what allows us to take advantage of the speed of NumPy.\n",
+    "\n",
+    "A (more) complete list of supported data types for `ndarray`s can be found [here](https://numpy.org/doc/stable/user/basics.types.html). The tables also feature the corresponding C datatype.\n",
+    "\n",
+    "NOTE: There are *platform dependent* datatypes (e.g. `np.intc`) where the corresponding C type (e.g. `int`) is also platform dependent."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "2c938caa",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "\n",
+    "- integer types\n",
+    "    - `np.int64` (default int type on most 64-bit platforms)\n",
+    "    - `np.int32`\n",
+    "    - `np.uint64` (unsigned int)\n",
+    "    - `np.uint8`\n",
+    "    - ...\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "dacd61f5",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "Floating point-based data types:\n",
+    "\n",
+    "- floating point types\n",
+    "    - `np.float64` (aka double precision; matches the precision of Python `float`)\n",
+    "    - `np.float32` (aka single precision)\n",
+    "    - `np.float16` (aka half precision)\n",
+    "    - ...\n",
+    "- complex types\n",
+    "    - `np.complex64` (2x 32-bit floats; for real and imag. part)\n",
+    "    - `np.complex128` (2x 64-bit floats; for real and imag. part)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "c3886925",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# NumPy deduces the type of the elements from the input. The type can be inspected via the `dtype` attribute.\n",
+    "A = np.array([1, 2, 3])\n",
+    "print(A)\n",
+    "print(A.dtype)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "a01d43ae",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "A_float = np.array([1.1, 5.5, 9.9])\n",
+    "print(A_float)\n",
+    "print(A_float.dtype)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "e8409769",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# We can also pass the type explicitly\n",
+    "A_int64 = np.array(list(map(float, range(4))), dtype=np.int64)\n",
+    "A_float32 = np.array(list(map(float, range(4))), dtype=np.float32)\n",
+    "\n",
+    "print(f\"A_int64 type : {A_int64.dtype}\")\n",
+    "print(f\"A_float32 type: {A_float32.dtype}\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "65b112fb",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "- `dtype` can have an impact on the performance of computations with `ndarray`s. \n",
+    "- In essence, use a *numerical* datatype (e.g. `int` or `float64`) for representing numerical data."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "cb71d2da",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Compare numeric type vs object type\n",
+    "N = 100000\n",
+    "%timeit np.arange(N, dtype=object).sum()  # Python object\n",
+    "%timeit np.arange(N, dtype=int).sum()     # Python compatible integer (most likely np.int64)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "9dfba808",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "### Overflows\n",
+    "\n",
+    "Example: `np.int8` represents a *signed* integer of size 8 bit $\\Rightarrow$ $2^8 = 256$ possible values!\n",
+    "\n",
+    "- First bit is for sign: $\\pm$\n",
+    "- Remaining 7 bits for value: $[-2^7, 2^7 - 1] = [-128, 127]$\n",
+    "\n",
+    "MIND: Any number *larger* than 127 (or *smaller* than -128) *cannot* be represented with this type of integer.\n",
+    "\n",
+    "*Note*: Fixed size datatypes are essential for performant calculations and vectorization."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "1b63c961",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Overflows\n",
+    "print(np.arange(0, 256, 1, dtype=np.int8)) # note the \"wrap-around\" after 127\n",
+    "# print(np.arange(0, 256, 1, dtype=np.uint8)) # works for *unsigned* integer"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "207d3ba2",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# special float values\n",
+    "# 1.0 / 0.0  # raises ZeroDivisionError exception\n",
+    "# but what about numpy?\n",
+    "\n",
+    "A = np.array([1.0]) / 0.0  # raises no exception (only a RuntimeWarning)\n",
+    "print(A)                   # special value np.inf (\"infinity\")\n",
+    "B = np.array([-1.0]) / 0.0 \n",
+    "print(B)                   # results in -np.inf\n",
+    "C = np.array([0.0]) / 0.0  \n",
+    "print(C)                   # special value np.nan (\"not a number\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "60fbf8c8",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# If desired, we can specify in which cases we want an error to be raised.\n",
+    "\n",
+    "# np.seterr(**{'divide': 'warn', 'invalid': 'warn', 'over': 'warn', 'under': 'ignore'})  # default\n",
+    "# np.seterr(all='raise')  # raise exceptions on numerical errors"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "336d45fe",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "### Rounding Errors and Machine Epsilon"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "8ed2d381",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "np.finfo(np.float32)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "56c6a257",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "np.finfo(np.float64)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "c5298837",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "x = np.asarray([0.001, 1, 1e10], dtype=np.float32)\n",
+    "one = np.ones(3, dtype=np.float32)\n",
+    "y = x + one\n",
+    "z = y - one\n",
+    "print(x)\n",
+    "print(y)\n",
+    "print(z)\n",
+    "print(one)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "569d8895",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "print(x == z)  # would be True for all entries in exact arithmetic\n",
+    "print(y == x)  # would be False for all entries in exact arithmetic"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "09e93162",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Machine epsilon\n",
+    "- Computers cannot represent every number exactly\n",
+    "- Machine epsilon is the distance between 1 and the next larger representable number\n",
+    "    - the smallest power of two $\\epsilon$ such that $1+\\epsilon >1$ in floating point arithmetic"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "cb8be20e",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "def geteps(base_val=1, dtype=np.float64):\n",
+    "    one, new_eps, eps = dtype(base_val), dtype(base_val), dtype(base_val)\n",
+    "    # halve new_eps until adding it to `one` no longer changes the result\n",
+    "    while (one + new_eps) > one:\n",
+    "        eps = new_eps\n",
+    "        new_eps /= dtype(2)\n",
+    "    return eps"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "0eacb662",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "print(geteps(dtype=np.float32))\n",
+    "print(geteps(dtype=np.float64))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "fb64a4fd",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Machine Epsilon\n",
+    "- The value of machine epsilon depends on the number of bits used for a float\n",
+    "- The spacing between neighbouring representable numbers also grows with the magnitude of the numbers\n",
+    "- **Machine epsilon describes a *relative* error**!"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "33e9df26",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "print(geteps(1, dtype=np.float64))\n",
+    "print(geteps(1e6, dtype=np.float64))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "8ac8a4f8",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Number Representation\n",
+    " - Computers use a binary representation\n",
+    " - Some numbers that have a finite number of digits in decimal representation have an infinite number of digits in binary representation\n",
+    " - Example:  \n",
+    " $\\frac{1}{10}=0.1$ in decimal, but $0.000110011\\overline{0011}$ in binary\n",
+    " - This is for the same reason that $\\frac{1}{6}=0.1\\overline{6}$ has an infinite number of digits in decimal representation"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "768e6b74",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "notes"
+    }
+   },
+   "source": [
+    "- This is why rounding errors larger than machine epsilon are possible"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "e0843fbc",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Order of operations\n",
+    "- Floating point addition is not associative: the rounding errors of floating point operations may depend on their execution order"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "d2c1b62d",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "x = np.asarray([0.001, 1, 1e10], dtype=np.float32)\n",
+    "a = np.asarray([1, 1, 1], dtype=np.float32)\n",
+    "print(x == x + a - a)    # executed as (x + a) - a\n",
+    "print(x == x + (a - a))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "400b1d00",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "source": [
+    "- These errors can propagate through multiple steps of a calculation and can even be amplified"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "10d62ba5",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## General rules of thumb\n",
+    "- Multiplication and division are mostly safe\n",
+    "- Addition and subtraction can lead to errors:\n",
+    "    - when values of different magnitude are involved, the digits of the smaller one can be lost\n",
+    "    - when subtracting two numbers that are close together, the leading digits cancel and the relative error of the result can become large\n",
+    "\n",
+    "- More detailed information: https://doi.org/10.1145/103162.103163"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "a7d5b7a4",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Comparing floating point numbers with rounding errors\n",
+    "- Comparing numbers with rounding errors with `==` can lead to wrong conclusions\n",
+    "    - We have seen that the number representation is not exact in some cases\n",
+    "- NumPy offers `isclose` (or `allclose`) for this purpose\n",
+    "- $a$ and $b$ are considered close if $|a-b| \\leq \\mathrm{atol}+\\mathrm{rtol}\\cdot|b|$\n",
+    "- NumPy uses $\\mathrm{rtol}=10^{-5}$ and $\\mathrm{atol}=10^{-8}$ by default\n",
+    "    - But you can specify your own tolerances\n",
+    "        - you may need to adjust $\\mathrm{atol}$ if you want to compare numbers close to 0"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "5a21c09a",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "a = np.asarray([1e-6, 1, 1e6])\n",
+    "b = np.asarray([1e-6 + 1e-6, 1e-6 + 1, 1e6 + 1])\n",
+    "print(a)\n",
+    "print(b)\n",
+    "print(np.isclose(a, b))\n",
+    "print(np.isclose(a, b, rtol=1e-6))\n",
+    "print(np.isclose(a, b, atol=1e-6))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "8fcd7aea",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "notes"
+    }
+   },
+   "source": [
+    "https://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "0efcf38d",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Array shape manipulation"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "3b0e3a9e",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "A = np.arange(10)\n",
+    "print(A)\n",
+    "print(A.shape) # the return type is a tuple"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "ab24e946",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Reshaping\n",
+    "A = A.reshape((2, 5))\n",
+    "# A = A.reshape((2, -1)) # make NumPy deduce the last dimension\n",
+    "\n",
+    "# * First  set of 5 values: row 1\n",
+    "# * Second set of 5 values: row 2\n",
+    "print(A)\n",
+    "print(A.shape)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "7025063c",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Transpose\n",
+    "A = np.arange(10).reshape((5, 2))\n",
+    "# We make an explicit copy here (more on copies and views later!).\n",
+    "A_transpose = A.copy().T # or A.copy().transpose((1, 0))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "1ac5962d",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "print(A)\n",
+    "print(A_transpose)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "f89634e3",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "print(f\"shape of A          : {A.shape}\")\n",
+    "print(f\"shape of A_transpose: {A_transpose.shape}\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "f3109ba5",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Transposition also works with N-dimensional arrays\n",
+    "A = np.ones((3, 4, 5))\n",
+    "\n",
+    "print(f\"shape of A: {A.shape}\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "70fe95ee",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "A_transpose = A.copy().transpose() # this reverses the order of the dimensions\n",
+    "print(f\"shape of A_transpose: {A_transpose.shape}\")\n",
+    "A.copy().transpose((2, 1, 0)).shape"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "3496374c",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "from itertools import permutations\n",
+    "for perm in permutations((0, 1, 2)):\n",
+    "    A_transpose = A.copy().transpose(perm) # pass a tuple with the new order of the axes\n",
+    "    print(f\"shape of A_transpose for permutation {perm}: {A_transpose.shape}\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "0fec034c",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Concatenation and stacking\n",
+    "A0 = np.zeros(3)\n",
+    "A1 = np.ones(3)\n",
+    "print(\"start with two arrays:\", A0, A1)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "d20cabf3",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Concatenation\n",
+    "A_concat = np.concatenate((A0, A1), axis=0)\n",
+    "print(\"concatenate along existing dimension:\")\n",
+    "print(A_concat)\n",
+    "print(A_concat.shape)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "3d224ca7",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Stacking\n",
+    "A_stack_ax0 = np.stack((A0, A1), axis=0) # stacking along rows\n",
+    "A_stack_ax1 = np.stack((A0, A1), axis=1) # stacking along columns\n",
+    "print(\"stack along axis=0:\")\n",
+    "print(A_stack_ax0)\n",
+    "print(\"stack along axis=1:\")\n",
+    "print(A_stack_ax1)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "e38879b4",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Multi-dimensional arrays can also be \"transformed\" into 1D arrays\n",
+    "A = np.arange(1, 28).reshape((3, 3, 3))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "08888075",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# this returns a 1D copy\n",
+    "A_flattened = A.flatten()\n",
+    "print(np.may_share_memory(A, A_flattened))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "0b4344d8",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# this returns a view whenever possible (we will come to this later)\n",
+    "A_ravelled = A.ravel() # np.ravel(A) in case you want to use the free function\n",
+    "print(np.may_share_memory(A, A_ravelled))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "ad68e206",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# `np.ravel()` is equivalent to using `reshape()`\n",
+    "A.reshape((-1,))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "2316fafd",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "### Timing analysis: Copies vs. views"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "ed9b4aa4",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "A = np.arange(500000).reshape(5, -1)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "dca9aefd",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# measure time of summing large array after ...\n",
+    "# - ... `flatten()`ing the array, and\n",
+    "# - ... `ravel()`ing the array."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "4332dcab",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "%timeit np.sum(A.flatten())\n",
+    "%timeit np.sum(A.ravel())"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "59f610ef",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "source": [
+    "We will come back to the difference between a *view* and a *copy* later."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "2be9aebd",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Content\n",
+    "  - Introduction to Numpy\n",
+    "  - Datatypes\n",
+    "  - Concept of multi-dimensional arrays\n",
+    "  - ***Array access***\n",
+    "  - Broadcasting\n",
+    "  - Universal functions\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "359ea28c",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Element access with indexing in 1D"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "9291539d",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Access array element\n",
+    "A = np.array([10, 20, 30])\n",
+    "A[0]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "fb82fd90",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Change value of array element\n",
+    "A[1] = 222\n",
+    "A"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "19141d3d",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Reverse indexing also works like with Python lists\n",
+    "A[-1]"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "2ceb94a8",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "### Exercises\n",
+    "1. Create a NumPy array with the current date (year, month, day).\n",
+    "\n",
+    "2. Index the array to retrieve the year.\n",
+    "\n",
+    "3. Replace the year with the year of your birth.\n",
+    "\n",
+    "4. Create a NumPy array containing every 7th number from 123 to 456. What is the 9th number in this array? What is the 78th?"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "cf66e38e",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "#This field is for the solution of the exercise"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "4ec81374",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Elementwise operations"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "6822f4d7",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "A = np.array([1, 2, 3])\n",
+    "print(A)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "9ae54600",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "A + 1 # Note that this operation is broadcast over the whole array"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "bd4ad7ec",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "A / 2 # Note that this operation is broadcast over the whole array"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "e94e610d",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "B = np.ones((2, 2))\n",
+    "B"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "78fe1c57",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Element-wise addition\n",
+    "B + B"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "bff45d9c",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Element-wise multiplication. This is *not* matrix-matrix multiplication.\n",
+    "# This is the Hadamard product: https://en.wikipedia.org/wiki/Hadamard_product_(matrices)\n",
+    "B * B  # element-wise product"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "fb4c4393",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "### `np.dot()` and `np.matmul()`"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "efa62480",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "A = np.arange(1, 5).reshape((2, 2))\n",
+    "B = np.arange(5, 9).reshape((2, 2))\n",
+    "A, B"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "a4646f4b",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Compute the matrix-matrix multiplication in 3 equivalent ways\n",
+    "print(np.dot(A, B)) # np.dot is much more general; prefer np.matmul for matrix-multiplication\n",
+    "print(np.matmul(A, B)) # implements the semantics of the `@` operator\n",
+    "print(A @ B)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "4437b478",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "notes"
+    }
+   },
+   "source": [
+    "`matmul` differs from `dot` in two important ways:\n",
+    "\n",
+    "- Multiplication by scalars is not allowed.\n",
+    "- Stacks of matrices are broadcast together as if the matrices were elements.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "84358ebf",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Matrix-vector / vector-matrix multiplication\n",
+    "M = np.arange(1, 5).reshape((2, 2))\n",
+    "v = np.arange(1, 3)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "21d500cc",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "source": [
+    "Compute $\\sum_{j = 1}^N M_{ij} v_j$: Sum along columns (`axis = 1`) of the matrix."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "6816a443",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "print(np.dot(M, v))\n",
+    "print([sum(row * v) for row in M])  # test if `dot()` does what it is supposed to"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "13b90b55",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "source": [
+    "Compute $\\sum_{i = 1}^N v_i M_{ij}$: Sum along the rows (`axis = 0`) of the matrix."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "f4805934",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "print(np.dot(v, M))\n",
+    "print([sum(v * row) for row in M.transpose()]) # test if `dot()` does what it is supposed to"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "16cc2079",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "For 1D and 2D arrays `np.matmul` and `np.dot` give the same result; they differ for higher-dimensional arrays (3 dimensions and upwards).\n",
+    "\n",
+    "See the docs for [`np.matmul`](https://numpy.org/doc/stable/reference/generated/numpy.matmul.html?highlight=matmul#numpy.matmul) and [`np.dot`](https://numpy.org/doc/stable/reference/generated/numpy.dot.html#numpy.dot) for more details."
+   ]
+  },
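+  {
+   "cell_type": "markdown",
+   "id": "3f9b2d60",
+   "metadata": {},
+   "source": [
+    "As a sketch of that difference (the array names `stack_a` and `stack_b` are chosen only for this example): for stacks of matrices, `np.matmul` multiplies matching pairs, while `np.dot` combines every matrix of the first stack with every matrix of the second."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "3f9b2d61",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import numpy as np\n",
+    "\n",
+    "stack_a = np.ones((4, 2, 3))  # stack of four 2x3 matrices\n",
+    "stack_b = np.ones((4, 3, 5))  # stack of four 3x5 matrices\n",
+    "\n",
+    "print(np.matmul(stack_a, stack_b).shape)  # (4, 2, 5): pairwise products stack_a[i] @ stack_b[i]\n",
+    "print(np.dot(stack_a, stack_b).shape)     # (4, 2, 4, 5): all combinations of stack_a[i] and stack_b[k]"
+   ]
+  },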
+  {
+   "cell_type": "markdown",
+   "id": "29d79088",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "### Computing the dot-product for (complex-valued) vectors: `np.dot()` vs. `np.vdot()`"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "29c09a65",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "source": [
+    "For $\\vec{v}, \\vec{w} \\in \\mathbb{R}^N$: $\\langle v, w \\rangle = \\sum_{i=1}^N v_i w_i$ "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "3d7f76bc",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# For real-valued vectors there is *no* difference\n",
+    "v = np.arange(1, 5)\n",
+    "print(np.dot(v, v), np.vdot(v, v))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "b13ef4b6",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "source": [
+    "For $\\vec{v}, \\vec{w} \\in \\mathbb{C}^N$: $\\langle v, w \\rangle = \\sum_{i=1}^N \\overline{v_i} w_i$ "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "11eb626e",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# For complex-valued vectors there is a difference\n",
+    "v = np.arange(1, 5) + 1j * np.arange(1, 5)\n",
+    "print(np.dot(v, v))  # does *not* automatically apply complex conjugation of 1st argument (use np.dot(v.conj(), v))\n",
+    "print(np.vdot(v, v))  # does apply complex conjugation to 1st argument"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "13fe7988",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "### Reductions"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "c9a7cf0e",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "A = np.array([[1, 2], [3, 4]])\n",
+    "A"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "e477c39a",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Summing all elements in an array\n",
+    "np.sum(A), A.sum()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "7f7c5bb2",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Summing along rows (axis=0) and along columns (axis=1)\n",
+    "np.sum(A, axis=0), np.sum(A, axis=1)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "d81b0eb0",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Some statistical quantities\n",
+    "np.mean(A), np.std(A)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "e3920dd3",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Slicing (1D)\n",
+    "Access a range of elements using slice notation: \n",
+    "\n",
+    "```\n",
+    "    A[start:stop:step]\n",
+    "```\n",
+    "\n",
+    "defaults: `start=0`, `stop=len(array)`, `step=1`  \n",
+    "second `:` is optional, if default step is used\n",
+    "\n",
+    "Remember: Indices start at 0. `stop` is exclusive."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "d4b44b53",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "A = np.arange(10)\n",
+    "A"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "e80ce38f",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "#equivalent due to default values:\n",
+    "print(A[0:6:1])\n",
+    "print(A[0:6:])\n",
+    "print(A[0:6])   #second ':' is optional if default step used\n",
+    "print(A[:6])"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "d45423bc",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "A[:-2]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "c4c26dd8",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "A[1::2]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "cf391c33",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "A[::-1]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "e4a1472c",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "A[3:1:-1]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "50149cd6",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Assigning a scalar to a slice is something we cannot do with a Python list.\n",
+    "A = np.arange(10)\n",
+    "A[1::2] = -100\n",
+    "print(A)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "faa52409",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# We get an error with a Python list.\n",
+    "A = list(range(10))\n",
+    "A[1::2] = -100\n",
+    "# A[1::2] = [-100] * 5 # This implies knowing the length of the slice.\n",
+    "print(A)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "1ea5138b",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "from time import time\n",
+    "# Example: Prime sieve\n",
+    "N = 100000\n",
+    "prime_candidates = np.ones((N,), dtype=np.bool_)\n",
+    "prime_candidates[0] = False\n",
+    "prime_candidates[1] = False\n",
+    "\n",
+    "tstart = time()\n",
+    "for i in range(2, N):  # for each integer starting from 2 cross out higher multiples\n",
+    "    prime_candidates[2 * i :: i] = False\n",
+    "print(f\"Time needed: {time() - tstart}\")\n",
+    "    \n",
+    "# print(prime_candidates)\n",
+    "# boolean array; if prime_candidates[x] == True, then x is prime"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "018e8135",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "#convert boolean mask to list of integers with list comprehension\n",
+    "%timeit np.array([x for x in range(N) if prime_candidates[x]])  # list comprehension with conditional"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "4857233a",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# convert boolean mask to list of integers with NumPy built-in function (much faster)\n",
+    "%timeit np.nonzero(prime_candidates)  # or using a NumPy function"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "62a2ece1",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "### Array-oriented programming\n",
+    "\n",
+    "- Use `numpy` built-in functions and methods of the `ndarray` class to operate on `ndarray`s\n",
+    "- *Avoid* using standard Python loop constructs such as \"raw\" for loops or list comprehensions"
+   ]
+  },
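+  {
+   "cell_type": "markdown",
+   "id": "c58d07e2",
+   "metadata": {},
+   "source": [
+    "A small sketch of this advice, assuming we want the sum of squares of a vector (the names `samples`, `loop_result` and `vectorised_result` are only for illustration):"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "c58d07e3",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import numpy as np\n",
+    "\n",
+    "samples = np.arange(1000, dtype=np.float64)\n",
+    "\n",
+    "# \"raw\" Python loop: one interpreter step per element\n",
+    "loop_result = 0.0\n",
+    "for value in samples:\n",
+    "    loop_result += value * value\n",
+    "\n",
+    "# array-oriented version: the loop runs inside NumPy\n",
+    "vectorised_result = np.sum(samples * samples)\n",
+    "\n",
+    "print(np.isclose(loop_result, vectorised_result))"
+   ]
+  },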
+  {
+   "cell_type": "markdown",
+   "id": "7dfbfb0f",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Indexing and slicing in $N$ dimensions"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "9d50f99e",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "B = np.array([[1, 2, 3], [10, 20, 30]])\n",
+    "print(B)\n",
+    "print(B.shape)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "7aef119c",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "B[0, 1] # get value from index along each axis\n",
+    "# B[0][1]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "eef08541",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "B[:2, :2] # use slicing along each axis"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "a262f62f",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "B[:, 1] = [70, 700] # use full slice ':' to select whole axis 1 (here: 2nd column)\n",
+    "B"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "daae0878",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "a = np.arange(1, 49).reshape((8, 6))\n",
+    "a"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "56309989",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# From the first row, take the two elements before the last one\n",
+    "a[0, 3:5]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "ac557199",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Submatrix in upper left corner\n",
+    "a[:2, :2]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "1b46d191",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Last column\n",
+    "a[:, -1]\n",
+    "# a[-1, :]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "621a7337",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# A more complicated pattern with non-unit steps\n",
+    "a[2::2, 3::]"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "a6b1df82",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Views and copies\n",
+    "\n",
+    "When using *slicing* or *transposition* we create *references* to the original data (memory views). **No** copy of the original array is made and stored in memory. We can use `np.may_share_memory()` to check whether two arrays might share the same memory block. Note, however, that this function only compares memory bounds and may give false positives; `np.shares_memory()` performs an exact check but can be slower."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "5e3fee47",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "A = np.arange(5)\n",
+    "A"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "e7aca885",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "A_view = A[::2]              # slicing! This gives us a *view* on this memory location\n",
+    "print(A_view)\n",
+    "print(A_view.shape, A.shape) # the view has its own metadata"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "2d6404ba",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Notice how this operation changes the original array!\n",
+    "A_view[0] = 100\n",
+    "A"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "7c5648a3",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "np.shares_memory(A, A_view), np.may_share_memory(A, A_view)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "c4f38792",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# If this were not a view, we could not manipulate the data\n",
+    "A[::2] = 100\n",
+    "A"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "32fa43d7",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# If you want a real copy, use the `copy()` method provided by ndarrays\n",
+    "A = np.arange(5)\n",
+    "A_copy = A[::2].copy() # we make an explicit copy\n",
+    "np.shares_memory(A, A_copy)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "b5327c13",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "A_copy[0] = 100"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "ac181286",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "print(A.shape, A_copy.shape)\n",
+    "print(A, A_copy)  # check if the original array was modified as well"
+   ]
+  },
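+  {
+   "cell_type": "markdown",
+   "id": "e40b6a18",
+   "metadata": {},
+   "source": [
+    "Transposition behaves like slicing in this respect. A minimal sketch (the names `M_demo` and `M_demo_t` are chosen only for this example):"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "e40b6a19",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import numpy as np\n",
+    "\n",
+    "M_demo = np.arange(6).reshape((2, 3))\n",
+    "M_demo_t = M_demo.T   # transposition gives a view, not a copy\n",
+    "\n",
+    "M_demo_t[0, 1] = -1   # writes to the element M_demo[1, 0]\n",
+    "print(M_demo)\n",
+    "print(np.shares_memory(M_demo, M_demo_t))"
+   ]
+  },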
+  {
+   "cell_type": "markdown",
+   "id": "6038e1d3",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "### Copying values into an existing object using [:]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "4edb042c",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "A = np.array([1, 2, 3])\n",
+    "B = np.array([4, 5, 6])\n",
+    "A = B  # assign object referenced by B to variable A\n",
+    "B[0] = 100\n",
+    "print(A, B)\n",
+    "print(\"A has same identity as B: \", id(A) == id(B))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "64928398",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "A = np.array([1, 2, 3])\n",
+    "B = np.array([4, 5, 6])\n",
+    "old_A_id = id(A)\n",
+    "A = B.copy() # copy values into new object, then assign this new object to name A\n",
+    "B[0] = 100\n",
+    "print(A, B)\n",
+    "print(\"A has same identity as B: \", id(A) == id(B))\n",
+    "print(\"A has kept its identity: \", old_A_id == id(A))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "01b8a7de",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "A = np.array(range(6)).reshape((3, 2))\n",
+    "B = np.array(range(6, 12)).reshape((3, 2))\n",
+    "old_A_id = id(A)\n",
+    "A[:, :] = B  # copy values from B into existing object A\n",
+    "# the syntax A[:, :] (or A[:]) ensures that the assignment goes to the object\n",
+    "# referenced by A, not to the name A\n",
+    "B[0] = 100\n",
+    "print(A)\n",
+    "print(B)\n",
+    "print(\"A has kept its identity: \", old_A_id == id(A))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "7fc99cb2",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Advanced indexing\n",
+    "\n",
+    "*Use of arrays (integer or boolean type) to index other arrays*."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "9f4eb590",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "source": [
+    "\n",
+    "- \"Elementary\" indexing (and slices) always returns a *view*.\n",
+    "- Advanced (\"fancy\") indexing always returns a *copy*."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "5d12425e",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "source": [
+    "- Assignment *to* array with advanced indexing changes original array (just like with normal indexing and slicing).\n",
+    "- Assignment *from* array with advanced indexing creates copies (and not views like regular slicing)."
+   ]
+  },
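+  {
+   "cell_type": "markdown",
+   "id": "91d3f5b4",
+   "metadata": {},
+   "source": [
+    "A minimal sketch of this behaviour with an integer index list (the names `base` and `picked` are chosen only for this example):"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "91d3f5b5",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import numpy as np\n",
+    "\n",
+    "base = np.arange(5)\n",
+    "picked = base[[0, 2, 4]]  # advanced indexing: returns a copy\n",
+    "\n",
+    "picked[0] = 100           # modifies only the copy\n",
+    "print(base, picked)\n",
+    "print(np.shares_memory(base, picked))\n",
+    "\n",
+    "base[[0, 2, 4]] = -1      # assignment *to* the indexed array changes the original\n",
+    "print(base)"
+   ]
+  },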
+  {
+   "cell_type": "markdown",
+   "id": "404837b7",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "### Boolean expressions with `ndarray`s"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "fc042b09",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "A = np.array([1, 2, 3])\n",
+    "B = np.array([3, 2, 1])\n",
+    "A == B # component-wise comparison"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "f8329f0b",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "source": [
+    "In boolean expressions with `ndarray`s use *bitwise* operators instead of logical operators:\n",
+    "\n",
+    "Operation | Not to use | Use\n",
+    "------   | -----------|----------\n",
+    "and      | `and`      | `&`\n",
+    "or       | `or`       | `\\|`\n",
+    "not      | `not`      | `~`  "
+   ]
+  },
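+  {
+   "cell_type": "markdown",
+   "id": "2b6c8e04",
+   "metadata": {},
+   "source": [
+    "A short sketch of why the logical operators are unsuitable: `and` tries to convert a whole array into a single truth value, which NumPy refuses for arrays with more than one element. The names `P` and `Q` are chosen only for this example. Note the parentheses: `&` and `|` bind more tightly than comparisons."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "2b6c8e05",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import numpy as np\n",
+    "\n",
+    "P = np.array([1, 2, 3])\n",
+    "Q = np.array([3, 2, 1])\n",
+    "\n",
+    "try:\n",
+    "    (P > 1) and (Q > 1)  # logical `and` needs a single truth value\n",
+    "except ValueError as err:\n",
+    "    print('ValueError:', err)\n",
+    "\n",
+    "print((P > 1) & (Q > 1))  # element-wise 'and'"
+   ]
+  },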
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "b89d056f",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# The following expressions yield the same result\n",
+    "print(np.array([not x for x in ((A == B) | ([True, False, False]))]))\n",
+    "print(~((A == B) | ([True, False, False])))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "15b5a088",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Boolean masks\n",
+    "A = np.arange(100000)  # change to larger value for time measurement below\n",
+    "divisible_by_3_mask = (A % 3 == 0)\n",
+    "# print(A)\n",
+    "# print(divisible_by_3_mask)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "80f77707",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "%timeit A[divisible_by_3_mask]\n",
+    "%timeit [A[i] for i in range(A.size) if divisible_by_3_mask[i] == True]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "f02f53fa",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Assign *from* array with boolean mask!\n",
+    "A_masked = A[divisible_by_3_mask] # creates a copy!!!\n",
+    "print(A_masked)\n",
+    "print(id(A_masked) == id(A))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "4059b8c8",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Assign *to* array with boolean mask\n",
+    "A[divisible_by_3_mask] = -100 # changes original!!!\n",
+    "print(A)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "932492d7",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# get indices where condition is True\n",
+    "np.where(divisible_by_3_mask)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "19fdda9b",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# integer list indexing\n",
+    "B = np.arange(4 * 4).reshape(4, 4)\n",
+    "print(B)\n",
+    "list_index = [1, 3]\n",
+    "B[list_index] # or B[list_index, :]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "d091bf47",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# 2D: 2 lists, each containing indices for axis 0 and axis 1, respectively\n",
+    "B[[0, 3], [1, 2]], B[0, 1], B[3, 2]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "5822143f",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# or, to create a boolean mask from these integer indices:\n",
+    "mask = np.zeros_like(B, dtype=np.bool_)\n",
+    "mask[[0, 3], [1, 2]] = 1\n",
+    "print(mask)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "f3bf12e7",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "B[mask] = -10000\n",
+    "print(B)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "5091adea",
+   "metadata": {},
+   "source": [
+    "For a visual example of fancy indexing see [here](http://scipy-lectures.org/intro/numpy/array_object.html#fancy-indexing)."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "d55d3a8b",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Content\n",
+    "  - Introduction to Numpy\n",
+    "  - Datatypes\n",
+    "  - Concept of multi-dimensional arrays\n",
+    "  - Array access\n",
+    "  - ***Broadcasting***\n",
+    "  - Universal functions\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "586156e7",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Broadcasting\n",
+    "[Broadcasting](https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html) describes how arrays with *different* shapes are treated during arithmetic operations."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "0386670a",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Let's start with a very simple example\n",
+    "a = np.array([1.0, 2.0, 3.0])\n",
+    "b = np.array([2.0, 2.0, 2.0])\n",
+    "a * b # Multiplication means element-wise multiplication"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "e9504866",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# NumPy *automatically* applies the scalar value to all elements of the ndarray. We can \"think\" of `b` being \n",
+    "# replicated to an ndarray of the same size as a. NumPy, however, is smart enough *not* to make additional copies.\n",
+    "a = np.array([1.0, 2.0, 3.0])\n",
+    "b = 2.0\n",
+    "a * b"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "da58f3b5",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# This also works for higher-dimensional arrays\n",
+    "A = np.arange(1, 10).reshape((3, 3))\n",
+    "b = 0.5\n",
+    "A * b"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "687bcf20",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "### General broadcasting rules\n",
+    "NumPy compares shapes of two arrays element-wise, starting from the *end* of the `shape` tuple and working its way to the beginning. Dimensions are compatible if\n",
+    "\n",
+    "1. they are *equal*, or\n",
+    "2. one of them is 1."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "c6ead848",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "\n",
+    "Arrays can be broadcast into the same shape if one of the following conditions is fulfilled:\n",
+    "1. The arrays already have exactly the same shape.\n",
+    "2. The arrays have the same number of dimensions, and each pair of dimensions is either of the same length, or one of them has length 1.\n",
+    "3. If the arrays have an unequal number of dimensions, the shape of the lower-dimensional array is prepended with dimensions of length 1. Then rule 2. applies."
+   ]
+  },
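+  {
+   "cell_type": "markdown",
+   "id": "b7d1c0a2",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "The rules above can be tried out on shapes alone. The following is a minimal sketch, assuming NumPy 1.20 or newer (where `np.broadcast_shapes` is available); the shapes are chosen for illustration only."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "c3e9f4d6",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "import numpy as np\n",
+    "\n",
+    "# Only the shapes are compared; no array data is allocated.\n",
+    "print(np.broadcast_shapes((4, 8, 3, 8), (8,)))\n",
+    "print(np.broadcast_shapes((2, 8, 1, 6, 1), (7, 1, 5)))\n",
+    "\n",
+    "# Incompatible shapes raise a ValueError (trailing dimensions 4 vs. 3).\n",
+    "try:\n",
+    "    np.broadcast_shapes((3, 4), (3,))\n",
+    "except ValueError as err:\n",
+    "    print(\"Not compatible:\", err)"
+   ]
+  },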
+  {
+   "cell_type": "markdown",
+   "id": "027bc942",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "### Note\n",
+    "\n",
+    "The following examples only serve to deepen the conceptual understanding of broadcasting rules.\n",
+    "\n",
+    "This is **not** how NumPy does broadcasting. NumPy is much more memory efficient since it avoids making needless copies of the data."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "26fa673b",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "Arrays do *not* need to have the same number of dimensions. Then rule 3. applies.\n",
+    "```\n",
+    "data    (4d array): 4 x 8 x 3 x 8\n",
+    "factor  (1d array):             8  # replace missing dimensions with 1 (1 x 1 x 1 x 8)\n",
+    "result  (4d array): 4 x 8 x 3 x 8\n",
+    "```"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "40e511a2",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "data = np.random.random((4, 8, 3, 8))\n",
+    "factor = np.random.random((8,))\n",
+    "result = factor * data\n",
+    "result.shape"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "ea5d63af",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "factor_augmented = np.tile(\n",
+    "    factor.reshape((1, 1, 1, 8)), # Add missing dimensions\n",
+    "    (4, 8, 3, 1)                  # Repeat along the new axes to reach shape (4, 8, 3, 8)\n",
+    ")\n",
+    "print(factor_augmented.shape)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "ba6e6a46",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Compare sizes of arrays\n",
+    "print(factor.nbytes)\n",
+    "print(factor_augmented.nbytes) # NumPy does not create such an array"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "79169f4d",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "np.allclose(data * factor_augmented, result)"
+   ]
+  },
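+  {
+   "cell_type": "markdown",
+   "id": "d4a8e1f7",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "To get a rough picture of why no copy is needed, one can look at `np.broadcast_to`, which returns a read-only view on the original data. This is an illustration of the idea and not a description of NumPy's internals; the name `row` is chosen here for the example only."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "e5b9f2a8",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "import numpy as np\n",
+    "\n",
+    "row = np.random.random((8,))\n",
+    "view = np.broadcast_to(row, (4, 8, 3, 8))  # read-only view, no data is copied\n",
+    "\n",
+    "print(view.shape)\n",
+    "print(view.strides)                 # strides of 0 along the broadcast axes\n",
+    "print(np.shares_memory(view, row))  # the view reuses the memory of `row`"
+   ]
+  },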
+  {
+   "cell_type": "markdown",
+   "id": "7d52d43a",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "```\n",
+    "data1  (5d array):  2 x 8 x 1 x 6 x 1\n",
+    "data2  (3d array):          7 x 1 x 5 # axes with length 1 will be expanded\n",
+    "Result (5d array):  2 x 8 x 7 x 6 x 5\n",
+    "```"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "7631defe",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Example\n",
+    "data1 = np.random.random((2, 8, 1, 6, 1))\n",
+    "data2 = np.random.random((7, 1, 5))\n",
+    "result = data1 * data2\n",
+    "print(result.shape)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "8fffaf22",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "data1_augmented = np.tile(data1, (1, 1, 7, 1, 5))\n",
+    "data2_augmented = np.tile(data2.copy().reshape((1, 1) + data2.shape), (2, 8, 1, 6, 1))\n",
+    "print(data1_augmented.shape)\n",
+    "print(data2_augmented.shape)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "3053479c",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "print(np.allclose(result, data1_augmented * data2_augmented))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "35cae9b9",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "A = np.arange(3 * 4).reshape((3, 4))\n",
+    "B = np.array([10, 20, 30, 40])\n",
+    "print(A)\n",
+    "print(B)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "6fb5ea59",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Broadcasting occurs implicitly\n",
+    "# print(np.tile(B, (3, 1)))\n",
+    "A + B"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "15b1d452",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# explicit broadcast\n",
+    "A_broadcast, B_broadcast = np.broadcast_arrays(A, B)\n",
+    "print(\"A:\")\n",
+    "print(A_broadcast)\n",
+    "print(\"B:\")\n",
+    "print(B_broadcast)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "bc9a187f",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Methods for adding dimensions\n",
+    "A = np.arange(3)\n",
+    "print(A.reshape(1, -1))\n",
+    "print(A[None, :])\n",
+    "print(A[np.newaxis, :])\n",
+    "print(np.newaxis is None) # np.newaxis is an alias for None"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "536fab81",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# https://numpy.org/doc/stable/reference/generated/numpy.expand_dims.html\n",
+    "print(np.expand_dims(A, axis=0))\n",
+    "print(np.expand_dims(A, axis=1))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "801f988d",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "A = np.arange(3 * 4).reshape((3, 4))\n",
+    "B = np.array([1, 2, 3])\n",
+    "print(A.shape, B.shape)\n",
+    "# This does not work, since A.shape and B.shape are not compatible:\n",
+    "# the trailing dimensions (4 vs. 3) do not match.\n",
+    "# A_broadcast, B_broadcast = np.broadcast_arrays(A, B)\n",
+    "A_broadcast, B_broadcast = np.broadcast_arrays(A, B[:, None]) # adding another dimension helps"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "2ba18826",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "Sometimes it can be useful to [manually add another axis](https://numpy.org/doc/stable/reference/generated/numpy.expand_dims.html) to leverage broadcasting."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "5e54e77d",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "A = np.zeros((3, 2))\n",
+    "B = np.arange(3)\n",
+    "print(A.shape)\n",
+    "print(B.shape)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "9eb94a5b",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "A + B # This fails because the last dimensions do not match: A has 2, B has 3"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "9d14fff6",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "A + B[:, None]"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "893b2ae5",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "TODO Broadcasting Quiz, if still time left"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "a6c61ee4",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Content\n",
+    "  - Introduction to Numpy\n",
+    "  - Datatypes\n",
+    "  - Concept of multi-dimensional arrays\n",
+    "  - Array access\n",
+    "  - Broadcasting\n",
+    "  - ***Universal functions***\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "c31a4456",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## NumPy universal elementwise functions (\"ufuncs\")\n",
+    "[ufuncs](https://docs.scipy.org/doc/numpy/reference/ufuncs.html) apply a function to `ndarray`s in an element-by-element manner. They have\n",
+    "broadcasting built-in."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "115ab970",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "Non-exhaustive list of universal functions and their operator equivalent.\n",
+    "\n",
+    "Operator | ufunc           | Description\n",
+    "---------| ----------------|----------\n",
+    "`+`      | `np.add()`      | addition\n",
+    "`-`      | `np.subtract()` | subtraction\n",
+    "`*`      | `np.multiply()` | multiplication\n",
+    "`/`      | `np.divide()`   | division\n",
+    "`//`     | `np.floor_divide()`| floor division\n",
+    "`**`     | `np.power()`    | exponentiation\n",
+    "`%`      | `np.mod()`      | remainder of division\n",
+    "\n",
+    "For more mathematical functions included in NumPy see [here](https://numpy.org/doc/stable/reference/ufuncs.html)."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "db3eaeb7",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "These functions generally create a new (temporary) target array. You can use the `out=` parameter to avoid creating temporary output arrays by supplying an existing array."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "7a1e4b7d",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "A = np.arange(3 * 3).reshape(3, 3)\n",
+    "np.power(A, 2, out=A)  # no temporary array\n",
+    "A"
+   ]
+  },
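+  {
+   "cell_type": "markdown",
+   "id": "f6c0a3b9",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "Operators, ufuncs and the `out=` parameter also combine with broadcasting. A minimal sketch with made-up arrays `x`, `y` and a preallocated `target` (names chosen for this example): the array passed to `out=` has to have the shape of the broadcast result."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "a7d1b4c0",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "import numpy as np\n",
+    "\n",
+    "x = np.array([1.0, 2.0, 3.0])    # shape (3,)\n",
+    "y = np.array([[10.0], [20.0]])   # shape (2, 1)\n",
+    "\n",
+    "# Operator and ufunc give the same result; (3,) and (2, 1) broadcast to (2, 3)\n",
+    "print(np.array_equal(x * y, np.multiply(x, y)))\n",
+    "\n",
+    "# Write the result into an existing array instead of creating a new one\n",
+    "target = np.empty((2, 3))\n",
+    "np.multiply(x, y, out=target)\n",
+    "print(target)"
+   ]
+  },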
+  {
+   "cell_type": "markdown",
+   "id": "30310e19",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "source": [
+    "A useful resource on universal functions can be found [here](https://jakevdp.github.io/PythonDataScienceHandbook/02.03-computation-on-arrays-ufuncs.html)."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "d3d8ed2b",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "### Create your own ufuncs from scalar Python functions"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "ba396e41",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Poor implementation of a function that tests if its argument is prime.\n",
+    "def is_prime(x):\n",
+    "    \"\"\"Check if input number is a prime number.\"\"\"\n",
+    "    if x < 2:\n",
+    "        return False\n",
+    "    for value in range(2, x): # x is not included in range!\n",
+    "        if x % value == 0:\n",
+    "            return False\n",
+    "    return True"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "cb952cff",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# We can apply this function to the elements of an `ndarray` one by one. This will be inefficient when we have to check many numbers.\n",
+    "[is_prime(x) for x in np.arange(1, 10, dtype=int)]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "2a88b8c1",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# We can create our *own* ufunc-like function (`np.vectorize` is a convenience wrapper, not a compiled ufunc)\n",
+    "is_prime_ufunc = np.vectorize(is_prime)\n",
+    "print(is_prime_ufunc.__doc__)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "493eace7",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# let's make a timing analysis\n",
+    "number_range = np.arange(1, 1000, dtype=int)\n",
+    "%timeit [is_prime(x) for x in number_range]\n",
+    "%timeit is_prime_ufunc(number_range)"
+   ]
+  },
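+  {
+   "cell_type": "markdown",
+   "id": "b8e2c5d1",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "`np.vectorize` is provided mainly for convenience; the NumPy documentation describes its implementation as essentially a `for` loop, so a large speed-up should not be expected. As a sketch of an array-oriented alternative, the cell below swaps in a different technique, the sieve of Eratosthenes, which is not part of the material above. `primes_mask` is a hypothetical helper name."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "c9f3d6e2",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "import numpy as np\n",
+    "\n",
+    "def primes_mask(n):\n",
+    "    \"\"\"Return a boolean array of length n + 1 where entry k is True if k is prime.\"\"\"\n",
+    "    mask = np.ones(n + 1, dtype=bool)\n",
+    "    mask[:2] = False  # 0 and 1 are not prime\n",
+    "    for k in range(2, int(n ** 0.5) + 1):\n",
+    "        if mask[k]:\n",
+    "            mask[k * k::k] = False  # strike out multiples of k with one slice assignment\n",
+    "    return mask\n",
+    "\n",
+    "print(np.flatnonzero(primes_mask(30)))"
+   ]
+  },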
+  {
+   "cell_type": "markdown",
+   "id": "123372f3",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "# Time For Hands On"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "403b7e56",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Hands On Exercise\n",
+    " - Implement the K-means clustering algorithm. \n",
+    "     - See [here](https://en.wikipedia.org/wiki/K-means_clustering) for a description of the algorithm.\n",
+    " - To explain the algorithm, we will first implement it with standard Python together.\n",
+    " - Then it's your turn to use Numpy for it.\n",
+    " - Which implementation is more efficient?\n",
+    " "
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "ce7e0814",
+   "metadata": {
+    "cell_style": "split",
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "### Summary\n",
+    "- Concept of numpy arrays\n",
+    "- Indexing / slicing arrays\n",
+    "- Difference between a copy and a view\n",
+    "- Array-oriented programming for better performance"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "cd06a649",
+   "metadata": {
+    "cell_style": "split",
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "source": [
+    "## Agenda for tomorrow\n",
+    "- 09:00 - 12:00 Morning session\n",
+    "  - Introduction to Pandas\n",
+    "  - Usage of Pandas `DataFrame`s\n",
+    "- 12:00 - 13:00 Lunch break\n",
+    "- 13:00 - 17:00 Afternoon session\n",
+    "  - Some more `DataFrame`s\n",
+    "  - **Hands on Exercises**"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "6fad04ed",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "notes"
+    }
+   },
+   "source": [
+    "TODO interactive quiz on broadcasting rules?\n",
+    "\n"
+   ]
+  }
+ ],
+ "metadata": {
+  "celltoolbar": "Slideshow",
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.6"
+  },
+  "rise": {
+   "controls": true,
+   "controlsLayout": "edges",
+   "controlsTutorial": false,
+   "footer": "<img src=hpc-hessen-logo-only.png height=60 width=100>Competence Center for High Performance Computing in Hessen (HKHLR)&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Tim Jammer, Marcel Giar &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;HiPerCH 2022",
+   "header": "",
+   "help": false,
+   "slideNumber": "c/t",
+   "theme": "white"
+  },
+  "toc": {
+   "base_numbering": 1,
+   "nav_menu": {},
+   "number_sections": false,
+   "sideBar": true,
+   "skip_h1_title": false,
+   "title_cell": "Table of Contents",
+   "title_sidebar": "Contents",
+   "toc_cell": false,
+   "toc_position": {},
+   "toc_section_display": true,
+   "toc_window_display": false
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
diff --git a/slides/Day1.ipynb.license b/slides/Day1.ipynb.license
new file mode 100644
index 0000000000000000000000000000000000000000..c207ab8c094a9d18d7c6cb5c9dfbf8913df4aa8a
--- /dev/null
+++ b/slides/Day1.ipynb.license
@@ -0,0 +1,4 @@
+SPDX-FileCopyrightText: © 2021 HPC Core Facility of the Justus-Liebig-University Giessen <philipp.e.risius@theo.physik.uni-giessen.de>,<marcel.giar@physik.jlug.de>
+SPDX-FileCopyrightText: © 2022 Competence Center for High Performance Computing in Hessen (HKHLR) <tim.jammer@hpc-hessen.de>, <marcel.giar@hpc-hessen.de>
+
+SPDX-License-Identifier: MIT
diff --git a/slides/Day2_PandasDataFrames.ipynb b/slides/Day2_PandasDataFrames.ipynb
new file mode 100644
index 0000000000000000000000000000000000000000..288ef4f76211849868b2ac8055e7f3dfe237e890
--- /dev/null
+++ b/slides/Day2_PandasDataFrames.ipynb
@@ -0,0 +1,2330 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "4d6423b0",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "# HiPerCH 14 Module 1:  Introduction to Python Data Processing tools"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "dbd2f680",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "# Pandas `DataFrame`s"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "7f4829e5",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "%matplotlib inline\n",
+    "\n",
+    "from matplotlib import pyplot as plt\n",
+    "\n",
+    "import numpy as np\n",
+    "import pandas as pd\n",
+    "\n",
+    "print(f\"Numpy version: {np.__version__}; Pandas version: {pd.__version__}\")\n",
+    "\n",
+    "import importlib\n",
+    "import utils\n",
+    "importlib.reload(utils)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "e1376473",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "# `DataFrame` Objects\n",
+    "\n",
+    "The `pd.DataFrame` class provides a data structure to handle 2-dimensional tabular data. `DataFrame`  objects are *size-mutable* and can contain mixed datatypes (e.g. `float`, `int` or `str`). All data columns inside a `DataFrame` share the same `index`."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "857d1d3c",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "## Creating `DataFrame`s"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "62875512",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "name = [\"person 1\", \"person 2\", \"person 3\"]\n",
+    "age = [23, 27, 34] "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "e2f65441",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Pass the rows as (name, age) pairs together with the column names\n",
+    "df = pd.DataFrame(data=zip(name, age), columns=[\"Name\", \"Age\"])\n",
+    "df # This gives a nicely formatted output. When using the `print` function the output looks different."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "d6e0313e",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# The same can be achieved by using a `dict`\n",
+    "df =  pd.DataFrame(data={\"Name\": name, \"Age\": age})\n",
+    "df"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "b81aae00",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "It is also possible to create `DataFrame`s from `Series` objects."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "5b8385cc",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "math_grades = pd.Series({\n",
+    "    'student1': 15,\n",
+    "    'student2': 11,\n",
+    "    'student3': 9,\n",
+    "    'student4': 13,\n",
+    "    'student5': 12,\n",
+    "    'student6': 7,\n",
+    "    'student7': 14\n",
+    "})\n",
+    "chemistry_grades = pd.Series({\n",
+    "    'student1': 10,\n",
+    "    'student2': 14,\n",
+    "    'student3': 12,\n",
+    "    'student4': 8,\n",
+    "    'student5': 11,\n",
+    "    'student6': 10,\n",
+    "    'student7': 12,\n",
+    "    \"student8\": 5  # <-- note the additional entry here\n",
+    "})"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "1234ca1c",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "df = pd.DataFrame(data={\"Math Grades\": math_grades, \"Chemistry Grades\": chemistry_grades})"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "5ae40e4e",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "source": [
+    "\n",
+    "Series objects are *matched by index* and missing values are replaced with a default value."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "2222bd8e",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "df # default value is `NaN`"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "a220919f",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Exercises (optional)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "552b4899",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "* Given the two iterables `values1` and `values2`, create a `pd.DataFrame` containing both in two different ways. Label the columns `'label1'` and `'label2'`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "267a91a5",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "values1 = np.random.randint(-10, 10, 5)\n",
+    "values2 = range(5)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "ffbf28d0",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "df_iterables = pd.DataFrame(data=zip(values1, values2), columns=[\"label1\", \"label2\"])\n",
+    "df_iterables"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "00fc0dd4",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "\n",
+    "* Combine the two `pd.Series` named `series1` and `series2` to a `pd.DataFrame`. Label the columns `'col1'` and `'col2'`. \n",
+    "    * Replace missing values with `0`.\n",
+    "    * Remove rows that contain `NaN` values."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "e0881ae3",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "series1 = pd.Series(data=range(5), \n",
+    "                    index=[f\"{idx}\" for idx in range(5)])\n",
+    "series2 = pd.Series(data=range(0, 10, 2), \n",
+    "                    index=[f\"{idx}\" for idx in range(0, 10, 2)])"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "de1b4fbe",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "df_from_series = pd.DataFrame({\"col1\": series1, \"col2\": series2})\n",
+    "df_from_series"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "1f76d67e",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# df_from_series.replace(np.nan, 0)\n",
+    "# df_from_series.dropna() "
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "86337fd7",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## What characterises a `DataFrame`?"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "6fd61f32",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "df = pd.DataFrame(data={\"Math Grades\": math_grades, \"Chemistry Grades\": chemistry_grades})"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "6f08fae8",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "How many rows and columns are contained in the `DataFrame`? We have already seen this attribute when dealing with `ndarray`s ..."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "c9fc5896",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "df.shape"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "a12c423b",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Detailed information on the data contained inside the `DataFrame`.\n",
+    "df.info()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "7f4f886e",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "`DataFrame`s are essentially composed of 3 components. These components can be accessed with specific data attributes.\n",
+    "\n",
+    "- Index (`df.index`)\n",
+    "- Columns (`df.columns`)\n",
+    "- Body (`df.values`)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "e6ed9ab7",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "df.index"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "b0afa504",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "df.columns"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "7ade6b93",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "df.values"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "de4be0fb",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Data indexing and selection"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "fa82625d",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "### The Iris flower dataset\n",
+    "\n",
+    "<a title=\"w:ru:Денис Анисимов (talk | contribs), Public domain, via Wikimedia Commons\" href=\"https://commons.wikimedia.org/wiki/File:Irissetosa1.jpg\"><img width=\"512\" alt=\"Irissetosa1\" src=\"https://upload.wikimedia.org/wikipedia/commons/thumb/a/a7/Irissetosa1.jpg/512px-Irissetosa1.jpg\"></a>\n",
+    "\n",
+    "Image taken from: <a href=\"https://commons.wikimedia.org/wiki/File:Irissetosa1.jpg\">w:ru:Денис Анисимов (talk | contribs)</a>, Public domain, via Wikimedia Commons\n",
+    "\n",
+    "Attribution for dataset: *Dua, D. and Graff, C. (2019). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.*"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "673472df",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "The dataset contains measurements of four \"features\" related to the species of Iris flowers:\n",
+    "* Petal length (\"Bluetenblattlaenge\")\n",
+    "* Petal width (\"Bluetenblattbreite\")\n",
+    "* Sepal length (\"Kelchblattlaenge\")\n",
+    "* Sepal width (\"Kelchblattbreite\")\n",
+    "\n",
+    "The species contained in the dataset are:\n",
+    "\n",
+    "* Iris setosa\n",
+    "* Iris virginica\n",
+    "* Iris versicolor"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "f95709e5",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "df = utils.download_IRIS()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "30c97a17",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Quick check if data looks alright\n",
+    "# petal - Bluetenblatt\n",
+    "# sepal - Kelchblatt\n",
+    "df.head() \n",
+    "# df.tail()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "76e5852d",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "df.columns"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "285fad0e",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Column access with the `[]` operator.\n",
+    "df[\"Name\"]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "0c7fc248",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# The columns of a DataFrame are `Series` objects.\n",
+    "type(df[\"Name\"])"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "87123e39",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "data_columns = [cname for cname in df.columns if cname != \"Name\"]\n",
+    "data_columns"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "32a00007",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "df[data_columns]"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "ce8a4f9e",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "As with `Series` objects, the `loc` and `iloc` indexers are also available for `DataFrame`s."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "146337e7",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Remember that when using the `loc` method the argument passed to the `[]` operator must be present in `df.index`.\n",
+    "df.loc[0]"
+   ]
+  },
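+  {
+   "cell_type": "markdown",
+   "id": "d0a4e7f3",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "The difference between label-based (`loc`) and position-based (`iloc`) access is easiest to see when the index is not simply `0, 1, 2, ...`. A minimal sketch with a hypothetical toy `DataFrame` called `demo` (not the Iris data):"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "e1b5f8a4",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "import pandas as pd\n",
+    "\n",
+    "# Toy data whose integer index labels are deliberately out of order\n",
+    "demo = pd.DataFrame({\"value\": [10.0, 20.0, 30.0]}, index=[2, 0, 1])\n",
+    "\n",
+    "print(demo.loc[0, \"value\"])   # label 0 refers to the second row\n",
+    "print(demo.iloc[0, 0])        # position 0 refers to the first row"
+   ]
+  },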
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "7bac9da5",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# We can also use slicing with the `loc` method.\n",
+    "df.loc[0::50].head()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "75a449ca",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Fancy indexing is also possible.\n",
+    "df.loc[[0, 50, 100]]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "7dcc63b4",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# We can combine row and column access with the `loc` method.\n",
+    "df.loc[:, ['sepal width', 'sepal length']].head()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "ab98b434",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Rows can also be selected with boolean masks.\n",
+    "mask = (df[\"Name\"] == \"Iris-setosa\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "87024f21",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "df.loc[mask].head()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "9eb788b5",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# More complicated boolean masks can be conceived\n",
+    "mask = (df[\"sepal length\"] > 6.0) & (df[\"petal length\"] > 1.0) # use () for each boolean sub-expression\n",
+    "df.loc[mask]"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "2b97d718",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Exercises (optional)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "b1d0491f",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "* Change all column names to uppercase, e.g.\n",
+    "    * \"petal length\" $\\to$ \"PETAL LENGTH\""
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "86cadb9b",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "markdown",
+   "id": "77c2b1fc",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "* From the `\"sepal length\"` column retrieve all values that are `> 6` but `< 7`! How often does each of the resulting values occur in this column? (*Hint*: Refer to the [`DataFrame` documentation](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html) for a method to count values.)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "905540ad",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "markdown",
+   "id": "a71c5cd3",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "* In the DataFrame `df`, *simultaneously* access the columns `\"sepal length\"`, `\"petal width\"`, and `\"Name\"` in two different ways."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "4a346e36",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "e1b3f2b9",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "markdown",
+   "id": "f5522a5c",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "* Compare the following two ways of replacing data in a DataFrame. Do they both work? Why?"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "c85a31fe",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "50f405da",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "markdown",
+   "id": "eb2bcfe0",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "* Determine the indices in the `DataFrame` that correspond to rows that contain data on the Iris setosa species.\n",
+    "* Use indices to delete the corresponding rows from the `DataFrame`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "8cc5891d",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "markdown",
+   "id": "69acd4c9",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "* Sort the rows of the `DataFrame` by the values contained in the columns `\"petal length\"` *and* `\"petal width\"`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "4b240056",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "markdown",
+   "id": "e852cb23",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "# Reading data into a `DataFrame`"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "1c151460",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "Pandas can import several common file formats:\n",
+    "\n",
+    "- `pd.read_csv`: Read CSV files (`.csv` suffix)\n",
+    "- `pd.read_excel`: Read MS Excel spreadsheets (`.xls` and `.xlsx` suffix)\n",
+    "- `pd.read_stata`: Read Stata datasets (`.dta` suffix)\n",
+    "- `pd.read_hdf`: Read HDF5 datasets (`.hdf` suffix)\n",
+    "- `pd.read_sql`: Read from an SQL database\n",
+    "\n",
+    "Other file formats are [supported](https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html) as well."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "2ec29c1d",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "## Reading CSV files "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "002a168c",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Download the files and write to CSV file.\n",
+    "from pathlib import Path\n",
+    "importlib.reload(utils)\n",
+    "utils.download_IRIS_with_addons(delimiter=\";\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "bdb4b736",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Inspect the file content. This command will only work on a UNIX-like operating system.\n",
+    "! head -n 15 tmp_with_addons/iris-data.csv | nl"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "bcadd113",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Read the file with Pandas and specify the delimiter symbol as well as the symbol that starts a comment.\n",
+    "df = pd.read_csv(Path(\"tmp_with_addons\") / \"iris-data.csv\", delimiter=\";\", comment='#')\n",
+    "df.head()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "7e4cb505",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# We can limit the number of imported columns by specifying those that we explicitly want to have.\n",
+    "df = pd.read_csv(Path(\"tmp_with_addons\") / \"iris-data.csv\", \n",
+    "                 delimiter=\";\", \n",
+    "                 comment=\"#\", \n",
+    "                 usecols=[\"Name\", \"sepal length\", \"sepal width\"])\n",
+    "df.head()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "62801f60",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# When importing data we can specify which data column should become the index in the `DataFrame`.\n",
+    "df = pd.read_csv(Path(\"tmp_with_addons\") / \"iris-data.csv\", delimiter=\";\", \n",
+    "                  comment=\"#\", index_col=\"Name\")\n",
+    "df.head()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "d9cf40bc",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "df_tmp1 = df.copy(deep=True)\n",
+    "df_tmp2 = df.copy(deep=True)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "53834587",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "Often, invoking a method of a `DataFrame` object returns a *new* `DataFrame` instance. This requires new memory allocations, which can be time-consuming and can waste precious memory resources."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "c79088e2",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "source": [
+    "Reset the index of the current `DataFrame`. This is done *out-of-place* and a new instance is returned."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "7c17dbdd",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "df_tmp1.reset_index().set_index(\"sepal length\").head()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "d8f6726f",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "We can use the `inplace` argument to modify the current instance itself."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "b244ae21",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# We can use the `inplace` argument to modify the object itself. \n",
+    "df_tmp2.reset_index(inplace=True)\n",
+    "df_tmp2.set_index(\"sepal length\", inplace=True)\n",
+    "df_tmp2.head()"
+   ]
+  },
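+  {
+   "cell_type": "markdown",
+   "id": "f3a1c0d1",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "The difference between the two calling styles can be checked on a small example. The following is only a minimal sketch: the `toy` frame and its column names are made up for illustration and are not part of the Iris data.\n",
+    "\n",
+    "```python\n",
+    "import pandas as pd\n",
+    "\n",
+    "# Hypothetical toy data (an assumption, not the Iris dataset).\n",
+    "toy = pd.DataFrame({\"key\": [\"a\", \"b\", \"c\"], \"value\": [1, 2, 3]})\n",
+    "\n",
+    "# Out-of-place: `set_index` returns a new DataFrame and leaves `toy` untouched.\n",
+    "indexed = toy.set_index(\"key\")\n",
+    "print(list(toy.columns))   # \"key\" is still a regular column of `toy`\n",
+    "print(indexed.index.name)\n",
+    "\n",
+    "# In-place: the call returns None and modifies the object itself.\n",
+    "toy_inplace = toy.copy(deep=True)\n",
+    "result = toy_inplace.set_index(\"key\", inplace=True)\n",
+    "print(result is None, toy_inplace.index.name)\n",
+    "```\n",
+    "\n",
+    "Because the in-place call returns `None`, it cannot be chained with further method calls, whereas the out-of-place variant can."
+   ]
+  },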
+  {
+   "cell_type": "markdown",
+   "id": "125a396b",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "# Operations with `DataFrame`s"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "4446f2d6",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "## Arithmetic operations"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "5a6a944b",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "Mapping between Python arithmetic operators and `DataFrame` methods.\n",
+    "\n",
+    "| Python operator | Pandas methods                   |\n",
+    "|:---------------:|----------------------------------|\n",
+    "|       `+`       | `add()`                          |\n",
+    "|       `-`       | `sub()`, `subtract()`            |\n",
+    "|       `*`       | `mul()`, `multiply()`            |\n",
+    "|       `/`       | `truediv()`, `div()`, `divide()` |\n",
+    "|       `//`      | `floordiv()`                     |\n",
+    "|       `%`       | `mod()`                          |\n",
+    "|       `**`      | `pow()`                          |"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "74e113b4",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "A = pd.DataFrame(np.random.randint(0, 20, (3, 2)), columns=list(\"AB\"))\n",
+    "B = pd.DataFrame(np.random.randint(0, 20, (3, 3)), columns=list(\"BAC\"))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "a9bb50e8",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# The row and column labels of both DataFrames are aligned before the operation. The order of the labels is irrelevant.\n",
+    "# Columns not shared by both DataFrames are filled with `NaN` in the result.\n",
+    "A + B"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "dc290da1",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Use the `add` method to specify the `fill_value`. Note that the `fill_value` is inserted into the DataFrame with the\n",
+    "# *missing* column. The specified `fill_value` is then used in the arithmetic operation.\n",
+    "# >>> Choose wisely when using the `fill_value` argument <<<\n",
+    "A.add(B, fill_value=-1000)"
+   ]
+  },
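+  {
+   "cell_type": "markdown",
+   "id": "f3a1c0d2",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "Label alignment and the effect of `fill_value` are easier to follow with fixed numbers than with random ones. A minimal sketch, assuming two made-up frames `left` and `right` (these names and values are illustrative only):\n",
+    "\n",
+    "```python\n",
+    "import pandas as pd\n",
+    "\n",
+    "# Hypothetical frames with fixed values; `right` has an extra column \"C\".\n",
+    "left = pd.DataFrame({\"A\": [1, 2], \"B\": [10, 20]})\n",
+    "right = pd.DataFrame({\"B\": [100, 200], \"A\": [1000, 2000], \"C\": [5, 6]})\n",
+    "\n",
+    "# Columns are matched by label, not by position; \"C\" has no partner and becomes NaN.\n",
+    "plain = left + right\n",
+    "print(plain)\n",
+    "\n",
+    "# With `fill_value=0` the missing column of `left` is treated as 0 before adding.\n",
+    "filled = left.add(right, fill_value=0)\n",
+    "print(filled)\n",
+    "```\n",
+    "\n",
+    "Here `0` is the neutral element of the addition, so column `C` of `right` is carried over unchanged. For other operations a different `fill_value` may be more appropriate."
+   ]
+  },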
+  {
+   "cell_type": "markdown",
+   "id": "7d5fccee",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "NumPy broadcasting rules apply for `DataFrame`s as well."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "dfd28894",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "df = pd.DataFrame(np.random.randint(10, size=(3, 4)), columns=list(\"wxyz\"))\n",
+    "df"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "9d8b8e06",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Subtract a row.\n",
+    "df - df.loc[0]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "9c0ecfb7",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# To subtract a column instead, call the `sub` method with `axis=0` so that the column is aligned with the row index.\n",
+    "df.sub(df[\"x\"], axis=0)"
+   ]
+  },
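+  {
+   "cell_type": "markdown",
+   "id": "f3a1c0d3",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "The two cells above use random data, so the result changes on every run. The sketch below repeats both operations on a small, made-up frame (`toy` is a hypothetical name) so that the outcome can be verified by hand.\n",
+    "\n",
+    "```python\n",
+    "import pandas as pd\n",
+    "\n",
+    "# Hypothetical frame with fixed values.\n",
+    "toy = pd.DataFrame({\"x\": [1, 2, 3], \"y\": [10, 20, 30]})\n",
+    "\n",
+    "# Subtracting a row: the labels of the row (\"x\", \"y\") are aligned with the columns.\n",
+    "row_centred = toy - toy.loc[0]\n",
+    "print(row_centred)\n",
+    "\n",
+    "# Subtracting a column: `axis=0` aligns the index of the column with the row index.\n",
+    "col_centred = toy.sub(toy[\"x\"], axis=0)\n",
+    "print(col_centred)\n",
+    "```\n",
+    "\n",
+    "In the first case every row is reduced by the first row, in the second case every column is reduced by column `x`."
+   ]
+  },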
+  {
+   "cell_type": "markdown",
+   "id": "8da5ad3d",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "`DataFrame`s can be passed to NumPy `ufunc`s."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "55d00536",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "np.exp(df)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "60dfcb8b",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "New columns can be added with arithmetic operations."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "29bb3975",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "df[\"asdf\"] = np.sin( df[\"x\"] + df[\"y\"] )\n",
+    "df"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "b1381d02",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Methods for operating on `DataFrame`s"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "aa5758e3",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "Pandas `DataFrame` and `Series` objects have several built-in methods to operate on the data.\n",
+    "\n",
+    "- `apply()`: available for *both* `Series` and `DataFrame` objects\n",
+    "- `transform()`: available for *both* `Series` and `DataFrame` objects\n",
+    "- `applymap()`: *only* available for `DataFrame` objects\n",
+    "- `map()`: *only* available for `Series` objects"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "0545a722",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "df = utils.download_IRIS()\n",
+    "df.head()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "a0b8e197",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Get a subset of columns by using regular expressions\n",
+    "data_columns = df.columns[df.columns.str.match('^(petal|sepal).*(width|length)$')]\n",
+    "data_columns"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "c4ca8241",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "### [`apply()`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.apply.html)\n",
+    "\n",
+    "```python\n",
+    "DataFrame.apply(func, axis=0, raw=False, result_type=None, args=(), **kwds)\n",
+    "```\n",
+    "- *applies* a function (callable) along an `axis` of the `DataFrame`\n",
+    "    - `axis=0`: `func` is applied to each column (a `Series` object). This is the default!\n",
+    "    - `axis=1`: `func` is applied to each row\n",
+    "- return type is inferred from `func`"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "ec3f1484",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "The return type of `func` determines the form of the result.\n",
+    "\n",
+    "`func` can operate on `Series` objects and perform operations that are supported by these types of objects (e.g. by means of the methods `.min()`, `.max()` or `.mean()`). \n",
+    "- result can be a scalar value (e.g. `.sum()` which is an aggregation operation)\n",
+    "- result can be another `Series` object"
+   ]
+  },
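+  {
+   "cell_type": "markdown",
+   "id": "f3a1c0d4",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "Both cases can be illustrated with a small, made-up frame. This is only a sketch under the assumption of purely numeric columns; `toy`, `reduced` and `shifted` are hypothetical names.\n",
+    "\n",
+    "```python\n",
+    "import pandas as pd\n",
+    "\n",
+    "# Hypothetical frame with two numeric columns.\n",
+    "toy = pd.DataFrame({\"a\": [1.0, 2.0, 3.0], \"b\": [4.0, 5.0, 6.0]})\n",
+    "\n",
+    "# `func` returns a scalar for each column -> the result is a Series.\n",
+    "reduced = toy.apply(lambda s: s.max() - s.min())\n",
+    "print(type(reduced).__name__)\n",
+    "\n",
+    "# `func` returns a Series for each column -> the result is a DataFrame.\n",
+    "shifted = toy.apply(lambda s: s - s.mean())\n",
+    "print(type(shifted).__name__)\n",
+    "```\n",
+    "\n",
+    "The index of `reduced` consists of the column labels of `toy`, while `shifted` keeps the shape of `toy`."
+   ]
+  },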
+  {
+   "cell_type": "markdown",
+   "id": "440dfc49",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "Compute the mean value of each column (this is the default because we do not specify the `axis` argument)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "f412f504",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "mean_values = df[data_columns].apply(lambda x: x.mean())\n",
+    "mean_values # This returns a `Series` object because x.mean() returns a scalar value."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "511add01",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "### Question\n",
+    "\n",
+    "What does the result look like if we operate along the rows of the `DataFrame`? This is achieved by using the argument `axis=1`. What is the shape of the resulting object?"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "ef03fc24",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "Now we transform the values in the columns of the `DataFrame`. We define a function that will operate on the `Series` objects that form the columns.\n",
+    "\n",
+    "The object resulting from this operation is another `DataFrame` instance."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "58bdf8e5",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "def scale_to_mm(s):\n",
+    "    return s * 10\n",
+    "\n",
+    "df_scaled_to_mm = df[data_columns].apply(scale_to_mm) # This will return a new DataFrame\n",
+    "df_scaled_to_mm[\"Name\"] = df[\"Name\"]\n",
+    "df_scaled_to_mm.head()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "233adbba",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "### Question\n",
+    "\n",
+    "How must the above command be changed if we want to operate along the rows of the `DataFrame` instead? Does this also work with the already-defined function or do we have to define a dedicated function?"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "5dd57a2b",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "df[data_columns].apply(scale_to_mm, axis=1).head()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "371d521f",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "### Experimenting with the `apply()` method"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "afb7a126",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "Let's generate a large `DataFrame`. We wish to operate on the data with the `apply` method. We can do this in two different ways:\n",
+    "- Operate along the rows (`axis=1`)\n",
+    "- Operate along the columns (`axis=0`)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "355d3698",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "N_rows, N_cols = 10_000, 500\n",
+    "data = pd.DataFrame(np.random.random((N_rows, N_cols)), columns=[f\"col{idx}\" for idx in range(N_cols)])"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "2d78baa7",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "### Question \n",
+    "\n",
+    "What do you think is faster: Operating along the columns or operating along the rows?\n",
+    "\n",
+    "When you have made your decision, try to come up with a reason!"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "368b5567",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "%timeit data.apply(lambda x: x ** 2, axis=0) # operate along columns\n",
+    "%timeit data.apply(lambda x: x ** 2, axis=1) # operate along rows"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "524759b7",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "The `apply` method operates on `Series` objects. The columns of a `DataFrame` are `Series`. Inside each `Series` the data is stored contiguously in memory. Hence, operating on the columns is *fast*.\n",
+    "\n",
+    "When operating row-wise, a new `Series` object must be generated for *each* row. A buffer must be allocated in memory and the data needs to be copied to that buffer before `apply` can operate on it. Since these steps are repeated for each row, this procedure is generally *slower* than operating along the columns."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "cb2244cb",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "### Task (optional)\n",
+    "\n",
+    "The names of the Iris species are contained in the column with heading `\"Name\"`. The names follow the pattern:\n",
+    "\n",
+    "```\n",
+    "Iris-<identifier for species>\n",
+    "```\n",
+    "\n",
+    "Remove the `Iris-` prefix (including the dash `-`) from the names and keep only the identifier for each species. Use the `apply` method."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "79f7a36a",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "markdown",
+   "id": "3f112be8",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "### `transform()`\n",
+    "\n",
+    "```python\n",
+    "DataFrame.transform(func, axis=0, *args, **kwargs)\n",
+    "```\n",
+    "\n",
+    "`func` can either be\n",
+    "- callable, e.g. `np.exp`\n",
+    "- list-like, e.g. `[np.sin, np.cos]`\n",
+    "- dict-like, e.g. `{\"sepal length\": np.sin,  \"petal length\": np.cos}`. Application is limited to the column names passed as keys of the `dict`.\n",
+    "- string, e.g. `\"sqrt\"`\n",
+    "\n",
+    "*Note*: This function *transforms*, i.e., when the input is a `Series`, another (transformed) `Series` is returned. Returning a scalar value is not valid (the resulting error message is `ValueError: Function did not transform`)."
+   ]
+  },
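+  {
+   "cell_type": "markdown",
+   "id": "f3a1c0d5",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "The variants of `func` listed above, and the failure for an aggregating function, can be tried out on a small example. The following is a sketch on a made-up frame; the names `toy`, `roots`, `mixed` and `message` are assumptions for illustration.\n",
+    "\n",
+    "```python\n",
+    "import numpy as np\n",
+    "import pandas as pd\n",
+    "\n",
+    "# Hypothetical frame with two numeric columns.\n",
+    "toy = pd.DataFrame({\"a\": [1.0, 4.0, 9.0], \"b\": [0.0, 1.0, 2.0]})\n",
+    "\n",
+    "# A single callable is applied to every column.\n",
+    "roots = toy.transform(np.sqrt)\n",
+    "\n",
+    "# A dict applies a different callable to each of the named columns.\n",
+    "mixed = toy.transform({\"a\": np.sqrt, \"b\": np.exp})\n",
+    "\n",
+    "# An aggregation returns one scalar per column and is therefore rejected.\n",
+    "message = None\n",
+    "try:\n",
+    "    toy.transform(lambda s: s.sum())\n",
+    "except ValueError as err:\n",
+    "    message = str(err)\n",
+    "print(message)\n",
+    "```\n",
+    "\n",
+    "Both `roots` and `mixed` have the same index as `toy`, which is what distinguishes a transformation from an aggregation."
+   ]
+  },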
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "4848e300",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "df[data_columns].transform({\"sepal length\": np.cos, \"petal length\": np.sin}).head()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "baccb2f2",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "### Task (optional)\n",
+    "\n",
+    "Convert the measured values (which are all given in cm units) to mm units by using the `transform` method."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "d19379c2",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "markdown",
+   "id": "c0910d4c",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "### Performance considerations"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "b7d5536e",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "When operating on columns of a `DataFrame` or on a `DataFrame` *as a whole*, it is often faster to use vectorised operations instead of column-/row-wise operations."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "d8432d37",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "df = pd.DataFrame(np.random.randn(1_000_000, 3), columns=list(\"abc\"))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "75e45ac5",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "%timeit df.apply(lambda x: x ** 2, axis=0)\n",
+    "%timeit df ** 2\n",
+    "%timeit (df.values ** 2) # here we operate on the underlying `ndarray`"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "dd5b8a17",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "### `assign`"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "2cd2aaa0",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "source": [
+    "The `assign` method adds new columns to a `DataFrame`. It is called on an existing `DataFrame` and returns a new `DataFrame` that contains all columns of the original `DataFrame` plus the new ones.\n",
+    "\n",
+    "* It can add a single column or multiple columns per call."
+   ]
+  },
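+  {
+   "cell_type": "markdown",
+   "id": "f3a1c0d6",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "A minimal sketch of `assign` on a made-up frame (the names `toy`, `area` and `length_dev` are hypothetical) shows that the original object is left unchanged:\n",
+    "\n",
+    "```python\n",
+    "import pandas as pd\n",
+    "\n",
+    "# Hypothetical frame with two numeric columns.\n",
+    "toy = pd.DataFrame({\"length\": [1.0, 2.0, 3.0], \"width\": [0.5, 1.0, 1.5]})\n",
+    "\n",
+    "# Each keyword becomes a new column; a callable receives the DataFrame as its argument.\n",
+    "extended = toy.assign(\n",
+    "    area=lambda d: d[\"length\"] * d[\"width\"],\n",
+    "    length_dev=lambda d: d[\"length\"] - d[\"length\"].mean(),\n",
+    ")\n",
+    "\n",
+    "print(list(toy.columns))       # the original frame still has two columns\n",
+    "print(list(extended.columns))  # the new frame has four columns\n",
+    "```\n",
+    "\n",
+    "Since the keyword names become column labels, columns created this way must have names that are valid Python identifiers."
+   ]
+  },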
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "527d0daf",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
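+    "# `df` was overwritten with random data above, so the Iris data is loaded again first.\n",
+    "df = utils.download_IRIS()\n",
+    "df_mean = df[data_columns].mean()\n",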
+    "df.assign(\n",
+    "    petal_length_dev_from_mean=lambda x: x[\"petal length\"] - df_mean[\"petal length\"],\n",
+    "    petal_width_dev_from_mean=lambda x: x[\"petal width\"] - df_mean[\"petal width\"],\n",
+    "    sepal_length_dev_from_mean=lambda x: x[\"sepal length\"] - df_mean[\"sepal length\"],\n",
+    "    sepal_width_dev_from_mean=lambda x: x[\"sepal width\"] - df_mean[\"sepal width\"]\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "f470a4a6",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Grouping data"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "ab7c9102",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "### Properties of `GroupBy` objects"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "e360d0f1",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "- Often, items in a dataset can be grouped in a certain manner (e.g., if a column contains a value multiple times). The Iris dataset, for instance, can be grouped according to the species of each flower.\n",
+    "\n",
+    "    ```python\n",
+    "    my_dataframe.groupby(by=[\"<column label>\"])\n",
+    "    ```\n",
+    "- The `DataFrame` is split and entries are grouped according to the values in the column with `\"<column label>\"`. Once the data has been grouped, operations can be conducted on the items of each group.\n",
+    "\n",
+    "*Note*: Grouping by the entries of a column is not the only option. `DataFrame`s can also be [grouped](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.groupby.html) in other ways."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "99fce3ac",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "The return type of `groupby()` is *not* another `DataFrame` but rather a `DataFrameGroupBy` object. We can imagine this object to be a grouping of multiple `DataFrame`s.\n",
+    "\n",
+    "It is important to understand that such an object essentially is a special *view* on the original `DataFrame`. No computations have been carried out when generating it (lazy evaluation)."
+   ]
+  },
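+  {
+   "cell_type": "markdown",
+   "id": "f3a1c0d7",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "Before turning to the Iris data, the idea can be sketched on a tiny, made-up frame. The names `toy`, `species` and `length` are assumptions for illustration only.\n",
+    "\n",
+    "```python\n",
+    "import pandas as pd\n",
+    "\n",
+    "# Hypothetical frame in which the column \"species\" contains repeated values.\n",
+    "toy = pd.DataFrame({\n",
+    "    \"species\": [\"a\", \"b\", \"a\", \"b\"],\n",
+    "    \"length\": [1.0, 2.0, 3.0, 4.0],\n",
+    "})\n",
+    "\n",
+    "# Creating the grouped object does not yet compute anything on the data.\n",
+    "grouped = toy.groupby(by=\"species\")\n",
+    "print(type(grouped).__name__)\n",
+    "\n",
+    "# The aggregation is evaluated only when it is requested.\n",
+    "means = grouped[\"length\"].mean()\n",
+    "sizes = grouped.size()\n",
+    "print(means)\n",
+    "print(sizes)\n",
+    "```\n",
+    "\n",
+    "The results are indexed by the group keys, i.e. by the distinct values found in the `species` column."
+   ]
+  },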
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "72c259e0",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "df = utils.download_IRIS()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "00f1cbca",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# We group the data according to the species of the flowers.\n",
+    "# Passing the column label as a plain string keeps the group keys plain strings as well.\n",
+    "grouped_by_species = df.groupby(by=\"Name\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "acbfef47",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "print(type(grouped_by_species))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "12e3ea0f",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "This data structure still knows about the `columns` that were present in the original `DataFrame`. We can use the `[<column-name>]` operation to access the column with the corresponding label in each of the group members (subframes)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "74f7786d",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "grouped_by_species[\"sepal length\"]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "ab6a8365",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
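+    "# Pandas will access the corresponding column of all subframes and apply the functions passed to the `agg()` method.\n",
+    "# The functions are given by name here; NumPy callables such as `np.min` can trigger warnings in recent Pandas versions.\n",
+    "grouped_by_species[\"sepal length\"].agg([\"min\", \"max\", \"mean\"])"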
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "3e17a46e",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "We can iterate over the `DataFrameGroupBy` object where each subframe is returned as a `Series` or a `DataFrame`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "a245fca9",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "for (species, subframe) in grouped_by_species:\n",
+    "    print(f\"Subframe for species {species} has shape {subframe.shape}\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "4d110f91",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Call the getter to obtain a `DataFrame`.\n",
+    "grouped_by_species.get_group(\"Iris-setosa\").head()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "cf9faf2d",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "Methods that are not directly implemented for the `DataFrameGroupBy` object are passed to the subframes and executed on these."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "db2a75ca",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# The `describe()` method can also be called on the full object but the output would be rather hard to view.\n",
+    "grouped_by_species[\"sepal length\"].describe() # The return type is a `DataFrame`"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "2df3b968",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Single methods are available as well. E.g. `mean()`, `std()` or `sum()`\n",
+    "grouped_by_species.mean() # The return type is a `DataFrame`"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "dafb2e66",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "### Operating on `GroupBy` objects"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "9bbd870d",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "`DataFrameGroupBy` objects support `aggregate()`, `filter()`, `transform()` and `apply()` operations.\n",
+    "\n",
+    "These methods can be efficiently used to implement a great variety of operations on grouped data."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "03cd3096",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "#### [`aggregate()`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.core.groupby.DataFrameGroupBy.aggregate.html) (or simply `agg()`)\n",
+    "\n",
+    "```python\n",
+    "DataFrameGroupBy.aggregate(func=None, *args, engine=None, \n",
+    "                           engine_kwargs=None, **kwargs)\n",
+    "```\n",
+    "\n",
+    "`func` can for example be ...\n",
+    "- ... a function (Python callable),\n",
+    "- ... a string specifying a function name (e.g. `\"mean\"`),\n",
+    "- ... a list of functions or strings (e.g. `[\"std\", np.mean]`),\n",
+    "- ... a `dict` mapping column labels to the function to apply (e.g. `{'data1': np.mean}`)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "32798a44",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Perform some common aggregations within each subframe. The output of this method is another `DataFrame`.\n",
+    "group_agg = grouped_by_species.agg([np.min, np.max, np.mean, np.std])\n",
+    "group_agg"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "217e1c1f",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# To understand this a bit better consider the following. Note that we limit the output to only one species.\n",
+    "df.loc[df[\"Name\"] == \"Iris-setosa\", df.columns[:-1]].agg(\n",
+    "    [np.min, \n",
+    "     np.max, \n",
+    "     np.mean, \n",
+    "     np.std]\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "cfe77e99",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "The resulting output looks somewhat more complicated than what we are used to from `DataFrame`s so far. The column labels are now hierarchical: the first level holds the original column name and the second level the aggregation function."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "9de104f9",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "group_agg.columns # This is a so-called `MultiIndex`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "4ef5258d",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "df"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "9fe0ff59",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Exercises (optional)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "e1b12f15",
+   "metadata": {},
+   "source": [
+    "### Task 1\n",
+    "\n",
+    "Consider the Iris dataset.\n",
+    "\n",
+    "* For each of the features compute the mean value as well as the standard deviation.\n",
+    "* Center the values of a particular feature on the mean values and scale them to have unit variance.\n",
+    "\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "62978f94",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "df = utils.download_IRIS()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "5b09bfcb",
+   "metadata": {},
+   "source": [
+    "Let us first make a working copy of the `DataFrame` containing the data on the Iris dataset."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "15edb74b",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "df_tmp = df.copy()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "d7ba0a2a",
+   "metadata": {},
+   "source": [
+    "Next, compute the mean value and the standard deviation for all features of the dataset. Computing these quantities does *not* take the particular species into account."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "53d45569",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "markdown",
+   "id": "ae660e8f",
+   "metadata": {},
+   "source": [
+    "Now transform each of the features to be centred on the mean value and to have unit variance."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "52a96d54",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "markdown",
+   "id": "f2b49c83",
+   "metadata": {},
+   "source": [
+    "### Task 2\n",
+    "\n",
+    "Again consider the Iris dataset.\n",
+    "\n",
+    "* Group the measured values by the species.\n",
+    "* Create boxplots for each species for all features.\n",
+    "    * Retrieve the names of the individual groups from the `DataFrameGroupBy` object.\n",
+    "    * Get the `DataFrame` for each of the groups from the `DataFrameGroupBy` object and call the [`boxplot` method](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.boxplot.html) to create the plot.\n",
+    "    * Use the names in the titles of the plot.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "7ae51b08",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "df = utils.download_IRIS()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "801dd946",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "90b8c49e",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "a031db1b",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "celltoolbar": "Slideshow",
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.6"
+  },
+  "rise": {
+   "controls": true,
+   "controlsLayout": "edges",
+   "controlsTutorial": false,
+   "footer": "<img src=hpc-hessen-logo-only.png height=60 width=100>Competence Center for High Performance Computing in Hessen (HKHLR) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Tim Jammer, Marcel Giar &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;HiPerCH 2022",
+   "header": "",
+   "help": false,
+   "slideNumber": "c/t",
+   "theme": "white"
+  },
+  "toc": {
+   "base_numbering": 1,
+   "nav_menu": {},
+   "number_sections": false,
+   "sideBar": true,
+   "skip_h1_title": false,
+   "title_cell": "Table of Contents",
+   "title_sidebar": "Contents",
+   "toc_cell": false,
+   "toc_position": {},
+   "toc_section_display": true,
+   "toc_window_display": false
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
diff --git a/slides/Day2_PandasDataFrames.ipynb.license b/slides/Day2_PandasDataFrames.ipynb.license
new file mode 100644
index 0000000000000000000000000000000000000000..c207ab8c094a9d18d7c6cb5c9dfbf8913df4aa8a
--- /dev/null
+++ b/slides/Day2_PandasDataFrames.ipynb.license
@@ -0,0 +1,4 @@
+SPDX-FileCopyrightText: © 2021 HPC Core Facility of the Justus-Liebig-University Giessen <philipp.e.risius@theo.physik.uni-giessen.de>,<marcel.giar@physik.jlug.de>
+SPDX-FileCopyrightText: © 2022 Competence Center for High Performance Computing in Hessen (HKHLR) <tim.jammer@hpc-hessen.de>, <marcel.giar@hpc-hessen.de>
+
+SPDX-License-Identifier: MIT
diff --git a/slides/Day2_PandasSeries.ipynb b/slides/Day2_PandasSeries.ipynb
new file mode 100644
index 0000000000000000000000000000000000000000..9dec8104e92205f611b670ca6097e2e73ce17fcb
--- /dev/null
+++ b/slides/Day2_PandasSeries.ipynb
@@ -0,0 +1,1749 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "# HiPerCH 14 Module 1:  Introduction to Python Data Processing tools"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "# Day 2: Pandas"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## A Python data analysis library and data manipulation tool\n",
+    "- essential Python library for data analysis\n",
+    "\n",
+    "- a \"wrapper\" around numpy\n",
+    "    - basic knowledge of numpy is required for this course\n",
+    "    - numpy provides efficiency \"under the hood\"\n",
+    "    - pandas provides lots of ready-made functions for analyzing and plotting data\n",
+    "\n",
+    "- \"Excel inside of Python\"\n",
+    "\n",
+    "- provides its own data structures\n",
+    "    - `Series` and `DataFrame`s have numerous methods to work on data\n",
+    "    - no need for imperative programming!"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "%matplotlib inline\n",
+    "\n",
+    "from matplotlib import pyplot as plt\n",
+    "\n",
+    "import pandas as pd\n",
+    "import numpy as np \n",
+    "\n",
+    "print(f'Pandas version: {pd.__version__}\\nNumpy version: {np.__version__}')\n",
+    "\n",
+    "import importlib\n",
+    "# import utils"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "# Pandas `Series` Objects\n",
+    "\n",
+    "- essentially `np.ndarray`s with generalized indexing capabilities\n",
+    "- have an `index`, `values`, a `size`, and a `dtype`"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Constructing `Series` from Python objects\n",
+    "- may use `list`s, `tuple`s, or `dict`s\n",
+    "- `set` does *not* work since contained data is *unordered*\n",
+    "- can contain different datatypes"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Construct from Python `list` object\n",
+    "integers = pd.Series(data=[10, 30, 195, 2021])  # data keyword can be omitted since it is the first positional argument\n",
+    "integers"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Series objects have important metadata\n",
+    "integers.values, integers.index, integers.dtype, integers.size"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# When constructing from a `dict` the keys become the index and the values become the value entries.\n",
+    "ordinal_values = pd.Series({'a': 97, 'b': 98, 'c': 99})\n",
+    "ordinal_values"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Generating `Series` from numpy arrays\n",
+    "- fastest to \"stay in the numpy world\"\n",
+    "- Series neatly wrap themselves around numpy arrays"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "array = np.arange(10, 14)\n",
+    "integers = pd.Series(data=array)\n",
+    "integers"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "floats = pd.Series(data=np.random.randn(4))\n",
+    "floats"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Indexing Series\n",
+    "- *not* recommended: Python-style indexing with `[]` operator\n",
+    "    - unintuitive behavior\n",
+    "    - slicing refers to *numeric* indices\n",
+    "- use Series methods `.loc`, `.iloc` instead"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "### Indexing with `.loc`, `.iloc` methods\n",
+    "- [`loc[<index value>]`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.loc.html)\n",
+    "    - access by actual (index) label\n",
+    "    - slices include both end points\n",
+    "\n",
+    "- [`iloc[<integer position>]`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.iloc.html)\n",
+    "    - numeric indexing with integers, from 0\n",
+    "    - slices exclude the end point (as with e.g. ranges)\n",
+    "- can be used with boolean arrays"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "pycharm": {
+     "name": "#%%\n"
+    },
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "ordinal_values = pd.Series({'a': 97, 'b': 98, 'c': 99})\n",
+    "ordinal_values"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# We can use the `[]` operator with the `loc` and `iloc` methods.\n",
+    "ordinal_values.loc['a'], ordinal_values.iloc[1]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Slicing\n",
+    "ordinal_values.loc['a':'c']"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "yearly_numbers = pd.Series(data=[137, 214, 195, 271], index=[2014, 2016, 2018, 2020])\n",
+    "yearly_numbers"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "pycharm": {
+     "name": "#%%\n"
+    },
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "yearly_numbers.loc[2020], yearly_numbers.iloc[-1]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "yearly_numbers.iloc[0:2]  # Positional slice; `.loc[0:2]` would instead look for the *labels* 0 to 2"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "yearly_numbers.loc[2014:2018]  # Label-based slice (end point included); `.iloc[2014:2018]` would instead refer to *positions*"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "### Setting values\n",
+    "- values of a `Series` can be modified\n",
+    "- use `.loc`, `.iloc` for indexing!\n",
+    "    - unintuitive results with \"standard Python\" indices"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "yearly_numbers"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "yearly_numbers[2014] = 138"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "yearly_numbers[2016] = 300.78  # Warning: This is typecast to `int`"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "yearly_numbers[2018] = \"300\"  # Warning: This is typecast to `int`"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "yearly_numbers[2020] = \"300.7889\"  # Warning: Now the *Series* changes!\n",
+    "yearly_numbers[0:2] = [1.5, 2.2]   # This also changes the series!"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Do it this way instead\n",
+    "yearly_numbers = pd.Series(data=[137, 214, 195, 271], index=[2014, 2016, 2018, 2020])\n",
+    "yearly_numbers.dtype"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "yearly_numbers.loc[2014] = 300.78\n",
+    "yearly_numbers  # Setting with `.loc` changes the Series datatype when the new value does not fit (here: type conversion from `int64` to `float64`)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "yearly_numbers.iloc[2] = \"Can also set to a string\""
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "yearly_numbers.loc[2014:2018] = ['this', 'also', 'works']\n",
+    "yearly_numbers"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "### Setting the Index\n",
+    "- `Series` have an `index` as a separate attribute\n",
+    "    - the index is a pandas `Index` object (backed by a numpy array)\n",
+    "- can be inspected and set\n",
+    "    - various data types possible\n",
+    "   "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "yearly_numbers.index"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "yearly_numbers.index = [0, 2, 4, 6]\n",
+    "yearly_numbers"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "floats.values, floats.index"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "floats.index = ['a', 'c', 'b', 'd']\n",
+    "floats.loc['b']"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "floats.loc['c':'d']"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "We can set the index when we create an instance of the `pd.Series` object. Use the `index` argument of the `pd.Series` constructor for this purpose."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "monthly_numbers = pd.Series([5, 2, 3, 91], index = 'Jan Feb Mar Apr'.split())\n",
+    "monthly_numbers"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "### Fancy indexing\n",
+    "- we can address series by more than one index at once\n",
+    "    - give a sequence of indices we want to pull out for `.loc`, `.iloc`\n",
+    "    - values may repeat"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "monthly_numbers.loc[['Jan', 'Mar']]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "monthly_numbers.iloc[[1, 2, 3, 2, 3, 2, 1]]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "integers.loc[[True, False, True, False]]"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Operations on Series\n",
+    "- Series can be added, multiplied, divided, ...\n",
+    "    - operations are performed element-wise\n",
+    "    - with other series: values are aligned by index label (*not* by position!)\n",
+    "    - with scalar values: broadcast to all values"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "yearly_revenue = pd.Series([4, 20, 69, 420])\n",
+    "yearly_expenses = pd.Series([1, 33, 7, 57])\n",
+    "\n",
+    "yearly_revenue - yearly_expenses"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "yearly_revenue + yearly_expenses"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "yearly_revenue % yearly_expenses"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "yearly_revenue > yearly_expenses"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "yearly_revenue = pd.Series([4, 20, 69, 420], index=[2017, 2018, 2019, 2020])\n",
+    "yearly_expenses = pd.Series([1, 33, 7, 57], index = [2020, 2018, 2017, 2019])\n",
+    "\n",
+    "yearly_revenue - yearly_expenses"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "yearly_expenses = pd.Series([1, 33, 7, 57, 120000], index=[2020, 2018, 2017, 2019, 2020])\n",
+    "yearly_revenue - yearly_expenses"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "integers"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "integers + 2"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "integers ** 2"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "integers + 0.2"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "integers < 12"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Boolean Masks\n",
+    "- an easy way to extract data by condition\n",
+    "    1. create a boolean mask (same length, entries `True` / `False`)\n",
+    "    2. use with `.loc`\n",
+    "- careful: Cannot use \"Truthiness\" in place of booleans\n",
+    "    - may need to explicitly compare"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "integers < 10"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "integers.loc[integers < 12]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "integers % 2"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "integers[integers % 2]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "integers.loc[integers % 2 == 1]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "integers.loc[(integers < 11) | (integers > 12)]"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Exercises\n",
+    "1. Create a `Series` with 8 random integers in the range $[0,7]$. For the `index` use letters \"a\" to \"h\".\n",
+    "2. Which entry do you get for index \"d\"? \n",
+    "3. Retrieve the first, the fifth and the last entry of the `Series`.\n",
+    "4. Retrieve all `Series` entries that are even.\n",
+    "5. Is the sum of all entries an even or an odd number?\n",
+    "6. Copy all values into a new `Series` object. For the new object use indices 'fegdachb' (in this order).\n",
+    "7. What do you get when dividing one `Series` object by the other?"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "# Plotting data"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "Pandas `Series` objects have an interface to Matplotlib that can be conveniently used to generate plots of datasets. The advantage of having a dedicated method for visualising (parts of) the data will become even more apparent when we deal with Pandas `DataFrame` objects."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "- `Series` instances have a [`plot()`](https://pandas.pydata.org/docs/reference/api/pandas.Series.plot.html) method that returns a Matplotlib `Axes` object.\n",
+    "    - The `kind` parameter of this method allows you to choose between different *types* of plots (default value is `'line'`). \n",
+    "- It is also possible to use the `plot` accessor, which offers dedicated methods for certain types of plots (e.g. `pandas.Series.plot.line` or `pandas.Series.plot.bar`)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "x_values = np.linspace(-np.pi, np.pi, num=201)\n",
+    "cos_data = pd.Series(data=np.cos(x_values), index=x_values)\n",
+    "\n",
+    "ax = cos_data.plot.line()\n",
+    "ax.set_xlabel(\"$x$ label\")\n",
+    "ax.set_ylabel(\"$y$ label\")\n",
+    "ax.grid()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "The plotting capabilities of `Series` are particularly useful when dealing with categorical data.\n",
+    "\n",
+    "We will learn more about this later when we deal with `pd.DataFrame`s."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "# Datatypes and Missing values"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "## Series data types\n",
+    "* internally, `Series` (and indices) use NumPy datatypes\n",
+    "* important implications for \"big data\":\n",
+    "    * storage requirement differs widely\n",
+    "    * overflow, precision\n",
+    "* at creation, pandas determines a \"fitting\" dtype\n",
+    "    * only numeric types or \"object\"\n",
+    "* `Series` are \"flexible\"\n",
+    "    * assignment can *change* the Series data type (and therefore the type of the underlying C array)\n",
+    "    * easy typecasting with `.astype`"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "integers32 = pd.Series(np.ones((1000000,)), dtype=np.int32)  # use `dtype` to specify the type when creating a `Series`\n",
+    "integers32.dtype\n",
+    "# integers32.memory_usage(index=False, deep=True) # returned values are in [bytes]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Use the `astype` method for explicit type conversion.\n",
+    "integers64 = integers32.astype(np.int64)\n",
+    "integers64.dtype"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Type casts can also happen *implicitly*\n",
+    "floats64 = integers32 * 1.0\n",
+    "floats64.dtype"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Implicit type conversion can also happen when changing a single value.\n",
+    "integers32.loc[0] = 1.234\n",
+    "integers32.dtype"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "An implicit / explicit type conversion can *increase* memory demands of a `Series`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "integers = pd.Series(np.ones((1000000,)), dtype=np.int32)\n",
+    "integers.memory_usage(index=False, deep=True)  # returned values are in [bytes]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "integers.astype(np.float64).memory_usage(index=False, deep=True)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "source": [
+    "When casting to a type with a larger \"itemsize\" (i.e. more bits used to represent each value), a reallocation must occur to accommodate the larger memory demand of the underlying C array."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Exercises\n",
+    "Create three Series with random entries:\n",
+    "* a Series with 4 integer values and indices 'abcd',\n",
+    "* a Series with 5 float values and indices 'abcde',\n",
+    "* and a Series with 6 boolean values and indices 'abcabc'\n",
+    "\n",
+    "\n",
+    "1. Multiply these Series pairwise (for each pair of two Series). Where and why do you see missing values?\n",
+    "2. How can you deal with `NaN` values or prevent their creation?"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "# Transformations (additional material)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "* Series values and indices are mutable\n",
+    "    * can easily be re-assigned\n",
+    "    * typical operations still create new instances"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "More comprehensive transformations need dedicated methods:"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "cell_style": "split",
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "source": [
+    "* replace\n",
+    "    * `Series.replace` leaves values not found *unchanged*\n",
+    "    * `Series.map` turns values not found into `NaN`\n",
+    "\n",
+    "* condense\n",
+    "    * `Series.cumsum` adds progressively\n",
+    "    * `Series.aggregate` (or `Series.agg`) returns a scalar value\n",
+    "        \n",
+    "   "
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "cell_style": "split",
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "source": [
+    "* sort\n",
+    "    * `Series.sort_values` sorts by series *values*\n",
+    "    * `Series.sort_index` sorts by series *index*\n",
+    "* manipulate\n",
+    "    * `Series.apply` uses a single function\n",
+    "    * `Series.transform` uses one or more functions, \"string functions\", or dicts"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "## Replace and map\n",
+    "* Replace values with different values according to a replacement rule\n",
+    "* for the difference, see also https://stackoverflow.com/a/62947436"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "### `Series.replace`\n",
+    "- can utilize strings or regular expressions\n",
+    "- may give two positional arguments: replace first with second\n",
+    "- may also give a mapping (dict or Series)\n",
+    "- all values not explicitly given are left unchanged\n",
+    "\n",
+    "See `help(pd.Series.replace)` for more details."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "strings = pd.Series('Er sah das Wasser as'.split())\n",
+    "strings"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "strings.replace(to_replace='as', value='an')  # replaces only entries that are exactly 'as'"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "strings.str.replace('as', 'an')  # `str.replace` replaces the substring 'as' inside every entry"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "integers = pd.Series((0, 10, 20, 30))\n",
+    "integers.replace(0, 1000)  # two positional arguments: replace 0 with 1000"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "integers.replace({10: 100, 20: 200, 50: 10})  # replace with a dict; keys that do not occur (here 50) have no effect"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "### `Series.map`\n",
+    "- accepts a Series, dict, or function\n",
+    "    - `Series` with old values in the index\n",
+    "    - `dict` with old values: new values as key-value pairs\n",
+    "    - function with a single argument: similar to `apply` (see below)\n",
+    "- a value that is not found in the mapping is replaced with `NaN`\n",
+    "\n",
+    "Refer to `help(pd.Series.map)` for more details.\n",
+    "\n",
+    "Main difference from `replace`: `map` is applied to every element of a `Series` object, whereas `replace` usually changes only a few elements."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "integers = pd.Series(range(1, 5))\n",
+    "integers"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "integers.map(lambda x: x ** 2 / (x + 1))  # pass a callable"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "### Condense\n",
+    "- `Series.cumsum` cumulates values\n",
+    "- `Series.mean`, `Series.std` for statistics\n",
+    "- `Series.all`, `Series.any` for truthiness\n",
+    "- `Series.agg` with arbitrary functions"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "#### `Series.cumsum`\n",
+    "- adds up all values up to and including a given index of the `Series`\n",
+    "- sometimes useful in statistics\n",
+    "- returns a `Series` with the running total at each index"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "errors = pd.Series((1, 1, 0, 0, 2, 2, 1), index=pd.date_range(start='2021-04-01', periods=7))\n",
+    "errors"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "errors.cumsum()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "#### `Series.aggregate`\n",
+    "* applies a function to a Series\n",
+    "    * returns a single value\n",
+    "* applies a *list of* functions\n",
+    "    * returns a *Series of* values"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "errors.agg(np.sum)  # pass a single callable. Return value is a scalar."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "errors.agg([np.sum, np.std])  # pass a list of callables. The operation then returns a `Series` object."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "random_normal = pd.Series(np.random.normal(size=5))\n",
+    "# A list may contain any number of callables; the result is again a `Series` object.\n",
+    "random_normal.agg([pd.Series.count, pd.Series.mean, pd.Series.std,\n",
+    "                   pd.Series.min, pd.Series.max, pd.Series.quantile],\n",
+    "                  q=0.25)  # keyword arguments to be passed to each function"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "#### `Series` and statistics\n",
+    "`pd.Series` objects have a number of methods that can be used to compute statistical quantities such as the mean, the median, or the standard deviation.\n",
+    "\n",
+    "The latter deserves some explanation: there is a *difference* between NumPy and pandas in how the standard deviation (often denoted as $\\sigma$) is computed:\n",
+    "\n",
+    "Generally:\n",
+    "$$\\mu = \\frac{1}{N} \\sum_{i=1}^N s_i\\quad;\\quad\\sigma = \\sqrt{\\frac{1}{N-\\Delta_{\\text{dof}}} \\sum_{i=1}^N (s_i - \\mu)^2}$$\n",
+    "- with the *delta degrees of freedom* $\\Delta_\\text{dof}$: pandas uses 1 by default, **differing from [numpy.std](https://numpy.org/doc/stable/reference/generated/numpy.std.html)** (where `ddof=0` by default)\n",
+    "- pass `ddof=0` for the \"uncorrected\" standard deviation"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "random_values = pd.Series(np.random.random((10,)))\n",
+    "print(f\"Standard deviation with default Pandas behaviour: {random_values.std()}\")  # ddof=1 by default\n",
+    "print(f\"Standard deviation with default NumPy behaviour : {random_values.std(ddof=0)}\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Exercises\n",
+    "\n",
+    "Create a `Series` named `ints` with random integers between 0 and 100.\n",
+    "\n",
+    "* what do you get with `ints.replace(ints)`, versus `ints.map(ints)`? Where and why do you get missing values?\n",
+    "\n",
+    "* replace all values < 10 and all values > 90 with `np.nan`. What else changes?\n",
+    "* write functions `sum_odd` and `sum_even`, which sum the odd and even values of a series, respectively. Use `Series.aggregate` to create a new Series with the sum of even values, the sum of odd values, and the sum of all values."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "### Apply and Transform\n",
+    "- invoke a function on the values\n",
+    "    - operates on *one row at a time*\n",
+    "    - may provide additional keyword args\n",
+    "- for the difference, see https://towardsdatascience.com/difference-between-apply-and-transform-in-pandas-242e5cf32705"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "#### `Series.transform`\n",
+    "(single Series $\\rightarrow$ multiple results)\n",
+    "- may use a (numpy or python) function, a 'string function', a list of functions, or a dict\n",
+    "- cannot be used to aggregate a Series (the result must have the same length as the input)\n",
+    "- may only use a single Series at a time"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "values = pd.Series(range(10, 40, 10))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "values.transform(np.exp)  # transforms the whole `Series` and returns another `Series`"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "df_values = values.transform([np.exp, np.sin, np.cos])\n",
+    "df_values  # this is a DataFrame (we will deal with this data structure later)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Non-transforming functions produce a ValueError\n",
+    "def compute_mean(x):\n",
+    "    return x.mean()\n",
+    "\n",
+    "values.transform(compute_mean)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "#### `Series.apply`\n",
+    "(multiple Series $\\rightarrow$ single result)\n",
+    "- may *only* use a numpy ufunc, string function, or a Python function\n",
+    "  - cannot always use list or dict\n",
+    "- may use multiple Series (of a DataFrame) at a time\n",
+    "- may produce aggregated results\n",
+    "- may automatically convert the data type"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "values = pd.Series(range(10, 40, 10))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "values.apply(np.exp)  # ufunc applied to each value of the Series -> returns another Series"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Try this with the `transform()` method and see what happens.\n",
+    "values.apply('sum')  # reduction operation: returns the sum of all values in the Series (a scalar!)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Exercises\n",
+    "\n",
+    "Create a `Series` named `ints` with random integers between 0 and 100.\n",
+    "\n",
+    "- apply (with `Series.apply`) the list of functions `[np.log, np.exp, 'sqrt', 'square']` to the Series. Inspect the resulting object. Then apply the function `'sum'` to this object, passing the additional argument `axis=1`.\n",
+    "\n",
+    "- how can you reach the same final result with a single call to `Series.transform`?"
+   ]
+  }
+ ],
+ "metadata": {
+  "celltoolbar": "Slideshow",
+  "file_extension": ".py",
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.6"
+  },
+  "mimetype": "text/x-python",
+  "name": "python",
+  "npconvert_exporter": "python",
+  "pygments_lexer": "ipython3",
+  "rise": {
+   "controls": true,
+   "controlsLayout": "edges",
+   "controlsTutorial": false,
+   "footer": "<img src=hpc-hessen-logo-only.png height=60 width=100>Competence Center for High Performance Computing in Hessen (HKHLR) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Tim Jammer, Marcel Giar &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;HiPerCH 2022",
+   "header": "",
+   "help": false,
+   "slideNumber": "c/t",
+   "theme": "white"
+  },
+  "toc": {
+   "base_numbering": 1,
+   "nav_menu": {},
+   "number_sections": false,
+   "sideBar": true,
+   "skip_h1_title": false,
+   "title_cell": "Table of Contents",
+   "title_sidebar": "Contents",
+   "toc_cell": false,
+   "toc_position": {
+    "height": "calc(100% - 180px)",
+    "left": "10px",
+    "top": "150px",
+    "width": "384px"
+   },
+   "toc_section_display": true,
+   "toc_window_display": false
+  },
+  "varInspector": {
+   "cols": {
+    "lenName": 16,
+    "lenType": 16,
+    "lenVar": 40
+   },
+   "kernels_config": {
+    "python": {
+     "delete_cmd_postfix": "",
+     "delete_cmd_prefix": "del ",
+     "library": "var_list.py",
+     "varRefreshCmd": "print(var_dic_list())"
+    },
+    "r": {
+     "delete_cmd_postfix": ") ",
+     "delete_cmd_prefix": "rm(",
+     "library": "var_list.r",
+     "varRefreshCmd": "cat(var_dic_list()) "
+    }
+   },
+   "types_to_exclude": [
+    "module",
+    "function",
+    "builtin_function_or_method",
+    "instance",
+    "_Feature"
+   ],
+   "window_display": false
+  },
+  "version": 3
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
diff --git a/slides/Day2_PandasSeries.ipynb.license b/slides/Day2_PandasSeries.ipynb.license
new file mode 100644
index 0000000000000000000000000000000000000000..c207ab8c094a9d18d7c6cb5c9dfbf8913df4aa8a
--- /dev/null
+++ b/slides/Day2_PandasSeries.ipynb.license
@@ -0,0 +1,4 @@
+SPDX-FileCopyrightText: © 2021 HPC Core Facility of the Justus-Liebig-University Giessen <philipp.e.risius@theo.physik.uni-giessen.de>,<marcel.giar@physik.jlug.de>
+SPDX-FileCopyrightText: © 2022 Competence Center for High Performance Computing in Hessen (HKHLR) <tim.jammer@hpc-hessen.de>, <marcel.giar@hpc-hessen.de>
+
+SPDX-License-Identifier: MIT
diff --git a/slides/utils.py b/slides/utils.py
new file mode 100644
index 0000000000000000000000000000000000000000..b9207ddc54dbb6d7c07b8adc10337bdf26b90c0a
--- /dev/null
+++ b/slides/utils.py
@@ -0,0 +1,53 @@
+# SPDX-FileCopyrightText: © 2021 HPC Core Facility of the Justus-Liebig-University Giessen <philipp.e.risius@theo.physik.uni-giessen.de>,<marcel.giar@physik.jlug.de>
+# SPDX-FileCopyrightText: © 2022 Competence Center for High Performance Computing in Hessen (HKHLR) <tim.jammer@hpc-hessen.de>, <marcel.giar@hpc-hessen.de>
+#
+# SPDX-License-Identifier: MIT
+
+import urllib.request
+from os import makedirs, path
+from pathlib import Path
+
+import pandas as pd
+
+
+def download_IRIS(url="https://archive.ics.uci.edu/ml/machine-learning-databases/iris/"):
+    datafile = 'iris.data'
+    namesfile = 'iris.names' 
+    
+    output_path = Path('tmp')
+    output_datafile = output_path / "iris-data.csv"
+    
+    makedirs(output_path, exist_ok=True)
+    
+    column_names = ["sepal length", "sepal width", 'petal length', 'petal width', "Name"]
+    if not path.exists(output_datafile):
+        print("Downloading Iris dataset...")
+        with urllib.request.urlopen(url + datafile) as response, open(output_datafile, "w", encoding="utf-8") as out_file:
+            data = response.read()
+            out_file.write(",".join(column_names) + "\n")
+            out_file.write(data.decode('utf-8'))
+    else:
+        print(f"No need to download Iris dataset. Data is already present in {output_datafile}.")
+    
+    df = pd.read_csv(output_datafile, delimiter=',')
+    
+    return df
+
+def download_IRIS_with_addons(url="https://archive.ics.uci.edu/ml/machine-learning-databases/iris/",
+                              delimiter=None, datafile='iris.data', namesfile='iris.names'):
+    output_path = Path("tmp_with_addons")
+    output_datafile = output_path / "iris-data.csv"
+    makedirs(output_path, exist_ok=True)
+    
+    column_names = ["sepal length", "sepal width", 'petal length', 'petal width', "Name"]
+    with urllib.request.urlopen(url + datafile) as response, open(output_datafile, "w", encoding="utf-8") as out_file:
+        data = response.read()
+        for cname in column_names[:-1]:
+            out_file.write(f"# {cname} is in [cm]\n") # We use the '#' symbols for comments.
+        out_file.write("# Species:\n# - Iris Setosa\n# - Iris Versicolour\n# - Iris Virginica\n")
+        if delimiter is None:
+            out_file.write(",".join(column_names) + "\n")
+            out_file.write(data.decode('utf-8'))
+        else:
+            out_file.write(f"{delimiter}".join(column_names) + "\n")
+            out_file.write(data.decode("utf-8").replace(",", delimiter))
\ No newline at end of file