diff --git a/datascienceintro/AutoGrad.ipynb b/datascienceintro/AutoGrad.ipynb
new file mode 100644
index 0000000000000000000000000000000000000000..62ce94339cecbb9fc28f874bb8e5585ec78f76c5
--- /dev/null
+++ b/datascienceintro/AutoGrad.ipynb
@@ -0,0 +1,379 @@
+{
+  "cells": [
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "xyl4csp7yEbk"
+      },
+      "source": [
+        "# Computing Gradients in PyTorch\n",
+        "\n",
+        "[PyTorch](https://pytorch.org/) is a comprehensive library that is primarily used for machine learning. However, it can also be used as an effective way to handle matrix operations or gradients.\n",
+        "For gradients in particular, we can exploit the fact that training neural networks requires computing gradients efficiently, since gradient computation is the backbone of the training algorithms.\n",
+        "\n",
+        "Therefore, if we can formulate the problem at hand in a way that PyTorch can work with, we can use its built-in methods to compute the gradients.\n",
+        "In PyTorch, this is done via [AutoGrad](https://pytorch.org/tutorials/beginner/blitz/autograd_tutorial.html).\n",
+        "\n",
+        "In this example, we use a simple sine function. Such a simple function is easy for a neural network to learn. Moreover, we can compare the result with the well-known derivative $\\frac{d \\sin(x)}{d x} = \\cos(x)$, which makes it immediately obvious whether the network has learned the correct gradient."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 1,
+      "metadata": {
+        "colab": {
+          "base_uri": "https://localhost:8080/"
+        },
+        "id": "rxyiP-tSyIPg",
+        "outputId": "05dd72f4-c7d5-4af9-ba67-2a55c08bf1b5"
+      },
+      "outputs": [
+        {
+          "name": "stdout",
+          "output_type": "stream",
+          "text": [
+            "Using cpu device\n"
+          ]
+        }
+      ],
+      "source": [
+        "import torch\n",
+        "from torch import nn\n",
+        "import torch.optim as optim\n",
+        "import torch.nn.functional as F\n",
+        "\n",
+        "\n",
+        "import matplotlib.pyplot as plt\n",
+        "import seaborn as sns\n",
+        "import numpy as np\n",
+        "\n",
+        "# Get cpu or gpu device for training.\n",
+        "device = \"cuda\" if torch.cuda.is_available() else \"cpu\"\n",
+        "print(f\"Using {device} device\")\n",
+        "\n",
+        "seed = 42\n",
+        "np.random.seed(seed)\n",
+        "torch.manual_seed(seed)\n",
+        "torch.cuda.manual_seed(seed)\n"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "KX6GlXtUysE9"
+      },
+      "source": [
+        "## Training Data\n",
+        "\n",
+        "In this simple example, we will use $f(x) = \\sin(x)$ to generate training data.\n",
+        "First, the relationship is very simple, i.e. even small networks can learn it quickly. Second, we know what the gradient looks like: $\\frac{dy}{dx} = \\cos(x)$, i.e. we know immediately whether the network has learned the correct gradient.\n",
+        "\n",
+        "The function [torch.linspace](https://pytorch.org/docs/stable/generated/torch.linspace.html) is the equivalent of the NumPy version but produces a tensor directly.\n",
+        "The call ```.view(-1,1)``` reshapes the resulting tensor such that we have one feature: called with 100 steps, torch.linspace creates a tensor with shape (100), i.e. a 1D tensor with 100 elements. The ```-1``` is a placeholder that tells PyTorch to infer the length of that dimension automatically from the number of elements in the original tensor. The ```1``` tells PyTorch to arrange the data such that we have one feature. The resulting tensor has a shape of (100,1), i.e. 100 rows of 1 feature each."
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "aeKM4U2ellXN"
+      },
+      "source": [
+        "**Exercise**\n",
+        "\n",
+        "Create training data ```x_train``` and ```y_train``` for the function $\\sin(x)$ on the interval $x_{train} \\in [0, 2\\pi]$.\n",
+        "\n",
+        "Plot the resulting training data."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "id": "Ta5rGSisl58h"
+      },
+      "outputs": [],
+      "source": [
+        "##\n",
+        "## Your code here\n",
+        "##"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "JoWaGzuPl7tm"
+      },
+      "source": [
+        "**Solution**"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "mu-cy0lXyu7z"
+      },
+      "source": [
+        "## Network Definition and Training\n",
+        "\n",
+        "We now define a very small neural network, for example a \"shallow\" network with just three fully connected layers.\n",
+        "\n",
+        "- How many input nodes do we need?\n",
+        "- How many output nodes do we need?\n",
+        "\n",
+        "Here, we need one input node, since we pass one value at a time to the network: $y = \\sin(x)$.\n",
+        "\n",
+        "Similarly, we only need one output node as we want the network to learn a single number."
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "LKl2PBVxl_H3"
+      },
+      "source": [
+        "**Exercise**\n",
+        "\n",
+        "Write a class for a small neural network with three fully connected (linear) layers and $\\tanh(x)$ as the activation function.\n",
+        "\n",
+        "Discuss how many input and output nodes the network needs."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "id": "yyY9odYjmQMm"
+      },
+      "outputs": [],
+      "source": [
+        "class NeuralNetwork(nn.Module):\n",
+        "    def __init__(self):\n",
+        "        super(NeuralNetwork, self).__init__()\n",
+        "        ##\n",
+        "        ## your code here\n",
+        "        ##\n",
+        "\n",
+        "    def forward(self, x):\n",
+        "        ##\n",
+        "        ## your code here\n",
+        "        ##\n",
+        "        return x\n",
+        "\n",
+        "model = NeuralNetwork().to(device)\n",
+        "print(model)"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "colab": {
+          "base_uri": "https://localhost:8080/"
+        },
+        "id": "AxNeglAYzqzf",
+        "outputId": "3ac54803-87d8-4b10-cf43-3900fc8987e7"
+      },
+      "outputs": [
+        {
+          "name": "stdout",
+          "output_type": "stream",
+          "text": [
+            "Epoch 0/1000, Total Loss: 1.043527\n",
+            "Epoch 100/1000, Total Loss: 0.000421\n",
+            "Epoch 200/1000, Total Loss: 0.000122\n",
+            "Epoch 300/1000, Total Loss: 0.000928\n",
+            "Epoch 400/1000, Total Loss: 0.000005\n",
+            "Epoch 500/1000, Total Loss: 0.000288\n",
+            "Epoch 600/1000, Total Loss: 0.000004\n",
+            "Epoch 700/1000, Total Loss: 0.000003\n",
+            "Epoch 800/1000, Total Loss: 0.000210\n",
+            "Epoch 900/1000, Total Loss: 0.000007\n"
+          ]
+        }
+      ],
+      "source": [
+        "# Train model\n",
+        "\n",
+        "\n",
+        "# Define the optimizer (the loss is computed explicitly inside the loop)\n",
+        "optimizer = # YOUR CODE HERE\n",
+        "\n",
+        "# Training loop\n",
+        "num_epochs = 1000\n",
+        "loss_history = []\n",
+        "\n",
+        "for epoch in range(num_epochs):\n",
+        "    # Enable gradient tracking for the inputs\n",
+        "    x_train.requires_grad = True\n",
+        "\n",
+        "    # Forward pass: Predict\n",
+        "    predictions = # YOUR CODE HERE\n",
+        "\n",
+        "    # Compute the data loss (difference from sin(x))\n",
+        "    # using the mean squared error as a loss-function for regression\n",
+        "    data_loss = torch.mean((predictions - y_train) ** 2)\n",
+        "\n",
+        "    # Compute the gradient dy/dx using torch.autograd.grad\n",
+        "    dy_train = torch.autograd.grad(\n",
+        "        outputs=predictions,\n",
+        "        inputs=x_train,\n",
+        "        grad_outputs=torch.ones_like(predictions),\n",
+        "        create_graph=True\n",
+        "    )[0]\n",
+        "\n",
+        "    # Physics loss: Enforce the relationship dy/dx = cos(x)\n",
+        "    physics_loss = torch.mean((# YOUR CODE HERE#) ** 2)\n",
+        "\n",
+        "    # Total loss: Combine data and physics losses\n",
+        "    total_loss = data_loss + physics_loss\n",
+        "\n",
+        "    # Backward pass and optimization step\n",
+        "    # YOUR CODE HERE\n",
+        "    # YOUR CODE HERE\n",
+        "    # YOUR CODE HERE\n",
+        "\n",
+        "    # Record the loss\n",
+        "    loss_history.append(total_loss.item())\n",
+        "\n",
+        "    # Print progress every 100 epochs\n",
+        "    if epoch % 100 == 0:\n",
+        "        print(f\"Epoch {epoch}/{num_epochs}, Total Loss: {total_loss.item():.6f}\")"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 5,
+      "metadata": {
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 455
+        },
+        "id": "wpmM_03k0gZh",
+        "outputId": "98b8277d-e929-4c50-bb25-204b83fb47cd"
+      },
+      "outputs": [
+        {
+          "data": {
+            "image/png": "",
+            "text/plain": [
+              "<Figure size 640x480 with 1 Axes>"
+            ]
+          },
+          "metadata": {},
+          "output_type": "display_data"
+        }
+      ],
+      "source": [
+        "sns.lineplot(loss_history, label='Training loss')\n",
+        "plt.xlabel('Epoch')\n",
+        "plt.ylabel('Loss')\n",
+        "plt.show()"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "xnVeLeA211hL"
+      },
+      "source": [
+        "## Plot the Gradient\n",
+        "\n",
+        "We now check whether the network has learned the correct gradient, i.e. $\\frac{dy}{dx} = \\cos(x)$.\n",
+        "\n",
+        "We generate independent test points on the same domain, obtain the predictions $\\hat{y}$, and plot:\n",
+        "- the ground truth $y = \\sin(x)$,\n",
+        "- the predictions $\\hat{y}$,\n",
+        "- the predicted gradient $\\frac{d\\hat{y}}{dx}$."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 6,
+      "metadata": {
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 506
+        },
+        "id": "jKniVE8M10qA",
+        "outputId": "f6e1fe97-f0e0-431d-8103-4c223020c24d"
+      },
+      "outputs": [
+        {
+          "name": "stderr",
+          "output_type": "stream",
+          "text": [
+            "/tmp/ipykernel_6172/235726813.py:3: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).\n",
+            "  y_test = torch.sin(torch.tensor(x_test))\n"
+          ]
+        },
+        {
+          "data": {
+            "image/png": "",
+            "text/plain": [
+              "<Figure size 640x480 with 1 Axes>"
+            ]
+          },
+          "metadata": {},
+          "output_type": "display_data"
+        }
+      ],
+      "source": [
+        "# Prepare test data with requires_grad=True\n",
+        "x_test = torch.linspace(0, 2 * torch.pi, steps=200, device=device, requires_grad=True).view(-1, 1)\n",
+        "y_test = torch.sin(x_test.detach())\n",
+        "\n",
+        "# Predictions from the trained model\n",
+        "y_hat = model(x_test)\n",
+        "\n",
+        "# Gradient of the predictions with respect to the inputs\n",
+        "dy_dx = torch.autograd.grad(\n",
+        "    outputs=y_hat,\n",
+        "    inputs=x_test,\n",
+        "    grad_outputs=torch.ones_like(y_hat),\n",
+        "    create_graph=True\n",
+        ")[0]\n",
+        "\n",
+        "# Detach from the graph, move to the CPU, and convert to flat NumPy arrays\n",
+        "x_test = x_test.detach().cpu().numpy().flatten()\n",
+        "y_test = y_test.detach().cpu().numpy().flatten()\n",
+        "y_hat = y_hat.detach().cpu().numpy().flatten()\n",
+        "dy_dx = dy_dx.detach().cpu().numpy().flatten()\n",
+        "\n",
+        "\n",
+        "\n",
+        "# Plot predictions and gradients\n",
+        "sns.lineplot(x=x_test, y=y_test, label='sin(x) (Ground Truth)')\n",
+        "sns.lineplot(x=x_test, y=y_hat, label='Prediction')\n",
+        "sns.lineplot(x=x_test, y=dy_dx, label='Predicted Gradient')\n",
+        "plt.xlabel('x')\n",
+        "plt.ylabel('y')\n",
+        "plt.legend()\n",
+        "plt.show()\n"
+      ]
+    }
+  ],
+  "metadata": {
+    "colab": {
+      "provenance": []
+    },
+    "kernelspec": {
+      "display_name": "datascienceintro-eVBNPtpL-py3.11",
+      "language": "python",
+      "name": "python3"
+    },
+    "language_info": {
+      "codemirror_mode": {
+        "name": "ipython",
+        "version": 3
+      },
+      "file_extension": ".py",
+      "mimetype": "text/x-python",
+      "name": "python",
+      "nbconvert_exporter": "python",
+      "pygments_lexer": "ipython3",
+      "version": "3.11.11"
+    }
+  },
+  "nbformat": 4,
+  "nbformat_minor": 0
+}