"Welcome to the linear regression exercise. In this exercise we will fit linear data with a minimal neural network model. We will learn how to plot the data, how to set up a suitable model, how to train it and how to look at the results."
]
},
{
"cell_type": "markdown",
"id": "c1227fac",
"metadata": {},
"source": [
"## Imports and Seeding\n",
"First we will do the necessary imports:\n",
"* `numpy` for general data handling and array manipulation\n",
"* `tensorflow` to build and train the regression model\n",
"* `matplotlib.pyplot` for plotting"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "60d0b21e",
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import tensorflow as tf\n",
"from matplotlib import pyplot as plt"
]
},
{
"cell_type": "markdown",
"id": "866c84a8",
"metadata": {},
"source": [
"Then we set a random seed for the `np.random` module. This makes our code reproducible as the random operations will yield the same results in every run through the notebook."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "44ec8731",
"metadata": {},
"outputs": [],
"source": [
"# for reproducibility\n",
"np.random.seed(42)"
]
},
{
"cell_type": "markdown",
"id": "9ae28192",
"metadata": {},
"source": [
"## Data creation\n",
"First we set the main parameters of our data:\n",
"* `n_dim`: The number of dimensions to be used\n",
"* `n_data`: The number of datapoints that will be used\n",
"* `uncertainty`: The uncertainty which will be used when creating the dataset"
]
},
{
"cell_type": "code",
"execution_count": 20,
"id": "e8a7ccbf",
"metadata": {},
"outputs": [],
"source": [
"# parameter\n",
"n_dim = 1\n",
"n_data = 1000\n",
"uncertainty = 0.0"
]
},
{
"cell_type": "markdown",
"id": "d96c7c5a",
"metadata": {},
"source": [
"Now we will create our data by dicing random values x (n-dimensional) and then transforming them using a multi-dimensional linear function with randomly set parameters."
]
},
{
"cell_type": "code",
"execution_count": 21,
"id": "8d2682b8",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Your linear function has slopes of [[0.53355443]] and an offset of [0.000241]\n"
]
}
],
"source": [
"# create data\n",
"# we dice random linear functions\n",
"x = np.random.uniform(size=(n_data, n_dim))\n",
"w = np.random.rand(n_dim)[None, ...]\n",
"b = np.random.rand(1)\n",
"\n",
"print(f\"Your linear function has slopes of {w} and an offset of {b}\")\n",
"\n",
"y = np.sum(x*w, axis=-1) + b\n",
"y += np.random.uniform(low=-uncertainty, high=uncertainty, size=y.shape)"
]
},
{
"cell_type": "markdown",
"id": "4e898a50",
"metadata": {},
"source": [
"## Data visualization\n",
"Using `plt.scatter` we can plot our data $y = f(x_i)$ in all dimensions ($i$):"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "6f5506cd",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'\\nTODO: Visualize the data y = f(x_i) in every dimension (i).\\n'"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"\"\"\"\n",
"TODO: Visualize the data y = f(x_i) in every dimension (i).\n",
"\"\"\""
]
},
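{
"cell_type": "markdown",
"id": "sketch-vis-md",
"metadata": {},
"source": [
"A minimal sketch of one possible solution (not the only one), assuming the `x`, `y` and `n_dim` defined above: scatter `y` against each input dimension separately."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "sketch-vis-code",
"metadata": {},
"outputs": [],
"source": [
"# sketch: scatter y against every input dimension in one figure\n",
"for i in range(n_dim):\n",
"    plt.scatter(x[:, i], y, s=5, label=f\"dimension {i}\")\n",
"plt.xlabel(\"$x_i$\")\n",
"plt.ylabel(\"$y$\")\n",
"plt.legend()\n",
"plt.show()"
]
},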
{
"cell_type": "markdown",
"id": "24effc72",
"metadata": {},
"source": [
"## Model Creation\n",
"Now it is time to set up the model. We will use the `tf.keras.models.Sequential` API to do so."
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "f1f1e7b7",
"metadata": {},
"outputs": [],
"source": [
"\"\"\"\n",
"TODO: Create a tf.keras.model using the `tf.keras.models.Sequential` API.\n",
"You may answer the following questions first:\n",
"- How many inputs does our model need?\n",
"- How many outputs does our model need?\n",
"\"\"\"\n",
"model = None"
]
},
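{
"cell_type": "markdown",
"id": "sketch-model-md",
"metadata": {},
"source": [
"A minimal sketch of such a model, assuming a single `Dense` layer with `n_dim` inputs, one output and no activation, i.e. exactly the linear map $y = w \\cdot x + b$ (this overwrites the `model = None` placeholder above):"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "sketch-model-code",
"metadata": {},
"outputs": [],
"source": [
"# sketch: a single Dense layer with one output and no activation\n",
"# realizes exactly the linear map y = w*x + b\n",
"model = tf.keras.models.Sequential([\n",
"    tf.keras.layers.Input(shape=(n_dim,)),\n",
"    tf.keras.layers.Dense(1),\n",
"])\n",
"model.summary()"
]
},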
{
"cell_type": "markdown",
"id": "b4bf9817",
"metadata": {},
"source": [
"We can extract the prediction of our model using the `model.predict` function.\n",
"Do so and plot the prediction of the model along with the original data."
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "08f3500f",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'\\nTODO: Extract the prediction of the model y = f_DNN(x_i) and plot it along with the data f(x_i) that\\nyou already have visualized in the task above.\\n'"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"\"\"\"\n",
"TODO: Extract the prediction of the model y = f_DNN(x_i) and plot it alogn with the data f(x_i) that\n",
"you already have visualized in the task above.\n",
"\"\"\""
]
},
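{
"cell_type": "markdown",
"id": "sketch-pred-md",
"metadata": {},
"source": [
"One possible sketch, assuming the (still untrained) `model` built above; it overlays the model prediction on the scatter plot of the data:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "sketch-pred-code",
"metadata": {},
"outputs": [],
"source": [
"# sketch: predict with the untrained model and overlay prediction and data\n",
"y_pred = model.predict(x)\n",
"for i in range(n_dim):\n",
"    plt.scatter(x[:, i], y, s=5, label=f\"data (dim {i})\")\n",
"    plt.scatter(x[:, i], y_pred[:, 0], s=5, label=f\"model (dim {i})\")\n",
"plt.xlabel(\"$x_i$\")\n",
"plt.ylabel(\"$y$\")\n",
"plt.legend()\n",
"plt.show()"
]
},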
{
"cell_type": "markdown",
"id": "79cbb263",
"metadata": {},
"source": [
"## Model Training\n",
"Before training the model we need to compile it.\n",
"In the compilation we can configure its losses, metrics and the used optimizers.\n",
"For the loss, we will use the mean squarred error (`\"mse\"`), as optimizer we will use the Stochastic Gradient Descent (`\"sgd\"`)."
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "bab12ffa",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'\\nTODO: Compile the model using the model.compile function.\\n'"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"\"\"\"\n",
"TODO: Compile the model using the model.compile function.\n",
"\"\"\""
]
},
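{
"cell_type": "markdown",
"id": "sketch-compile-md",
"metadata": {},
"source": [
"A minimal sketch of the compilation step with the loss and optimizer named above:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "sketch-compile-code",
"metadata": {},
"outputs": [],
"source": [
"# sketch: mean squared error loss, stochastic gradient descent optimizer\n",
"model.compile(loss=\"mse\", optimizer=\"sgd\")"
]
},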
{
"cell_type": "markdown",
"id": "eb78e8f5",
"metadata": {},
"source": [
"Now everything is set up and we are ready to train our model using the `model.fit` function."
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "0b41f262",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'\\nTODO: Train the model using the model.fit function.\\n'"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"\"\"\"\n",
"TODO: Train the model using the model.fit function.\n",
"\"\"\""
]
},
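{
"cell_type": "markdown",
"id": "sketch-fit-md",
"metadata": {},
"source": [
"A minimal sketch of the training call; the number of epochs and the batch size below are free choices you can tune:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "sketch-fit-code",
"metadata": {},
"outputs": [],
"source": [
"# sketch: train on the full dataset; epochs and batch_size are free choices\n",
"history = model.fit(x, y, epochs=50, batch_size=32)"
]
},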
{
"cell_type": "markdown",
"id": "6522515d",
"metadata": {},
"source": [
"## Model evaluation\n",
"In such a simple model we can still look at every of the weights by eye. We also know, what the weights of our simple model should look like. Print the weights of the model and look at them. What do you observe? Do the values make sense?"
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "4ad8817c",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'\\nTODO: Print the weights of every layer of our model.\\n'"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"\"\"\"\n",
"TODO: Print the weights of every layer of our model.\n",
"\"\"\""
]
},
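{
"cell_type": "markdown",
"id": "sketch-weights-md",
"metadata": {},
"source": [
"One way to inspect the weights, assuming the single-`Dense`-layer model sketched above: after training, the kernel should be close to the true slopes `w` and the bias close to the offset `b`."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "sketch-weights-code",
"metadata": {},
"outputs": [],
"source": [
"# sketch: print kernel and bias of every layer and compare to the true values\n",
"for layer in model.layers:\n",
"    print(layer.name, layer.get_weights())\n",
"print(\"true slopes:\", w, \"true offset:\", b)"
]
},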
{
"cell_type": "markdown",
"id": "b98d2f16",
"metadata": {},
"source": [
"Last but not least we will also look at the prediction of our model after the training. You can reuse the code from some cells above. Plot the prediction of the model along with the original data. What do you observe? Does this meet your expectation?"
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "3a21b829",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'\\nTODO: Extract the prediction of the model y = f_DNN(x_i) and plot it along with the data f(x_i).\\n'"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"\"\"\"\n",
"TODO: Extract the prediction of the model y = f_DNN(x_i) and plot it alogn with the data f(x_i).\n",
"\"\"\""
]
},
{
"cell_type": "markdown",
"id": "663b76e2",
"metadata": {},
"source": [
"## Further Tasks\n",
"Go back to the beginning of the notebook. There, we set the uncertainty of the data generation and the number of dimensions. Perform the following tasks:\n",
"* What happens if the measured data (y) is uncertain (uncertainty > 0)? Explain your observation!\n",
"* Vary the number of input dimensions to n=1/2/10. How do you need to change the model? Describe your observation."
]
},
{
"cell_type": "markdown",
"id": "c856e7d0",
"metadata": {},
"source": [
"## Summary\n",
"This concludes our tutorial on the linear regression.\n",
"\n",
"In this tutorial you have learned:\n",
"* How to visualize N dimensional data distributions\n",
"* How to build a tf.keras model\n",
"* How to train a tf.keras model on a given data distribution\n",
"* How to visualize the output of your model\n",
"* How the scenario of the linear regression changes with uncertainty on the training data\n",
"* How the scenario of the linear regression changes with the number of dimensions"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e23c2c3b",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.8"
}
},
"nbformat": 4,
"nbformat_minor": 5
}