"In many applications, we need to find the root of a function, i.e. the point where the function crosses the $x$-axis: $f(x_r) = 0$\n",
"\n",
"A variety of methods exist for this problem, in this examle we want to use Newton's method. The general idea is the following:\n",
"We start at some point, our initial guess $x_0$. Then, we calculate the value of the function at point $x_n$ (starting from the initial guess) $f(x_n)$, as well as the derivative $f'(x_n)$, the derivative is the slope of the tangent line to the function $f(x)$ at this point $x_n$:\n",
"$$ y = f(x_n) + f'(x_n)(x-x_n)$$\n",
"We now want to find the point where the tangent line intersects with the $x$-axis, i.e. we set $y=0$, leading to:\n",
"$$f'(x_n)(x-x_n) = -f(x_n)$$\n",
"Assuming $f'(x_n) \\neq 0$, we can divide both sides by $f'(x_n)$, solve for $x$ and then iterate.\n",
"\n",
"More concisely, the overall approach is:\n",
"\n",
"\n",
"1. Choose an initial guess $ x_0 $.\n",
"2. Iterate using the formula:\n",
"$$x_{n+1} = x_n - \\frac{f(x_n)}{f'(x_n)}$$\n",
" where $ n = 0, 1, 2, \\ldots $\n",
"\n",
"The process continues until the difference between successive approximations is less than a predetermined tolerance level or until a maximum number of iterations is reached.\n",
"\n",
"Note that if our initial guess $x_0$ is not suitable, the method may not converge.\n",
"\n",
"One of the underrated features of modern deep learning frameworks is the automatic differentiation. In \"conventional\" deep learning, we use this as a tool behind the scenes to train a neural network and do not really interact with this. However, this method is useful in a range of applications, such as physics-informed neural networks or, indeed, this example of finding the root of a function efficiently.\n",
"While we perceive deep-learning frameworks such as [PyTorch](https://pytorch.org/) or [TensorFlow](https://www.tensorflow.org/) primarily as libraries for deep learning (and we do indeed use them for this purpose), they are, essentially, heavily optimised libraries for matrix operations and numerical handling of equations that can, in addition, levarage the computation power of GPUs.\n",
"\n",
"Note that while we would ideally work with functions where we can caluclate the derivative analytically, this is not necessary.\n",
"We will use the example of a conic steel vessel discussed in the lecture \"Numerical Models in Processing\" by [PD Dr. W. Lenz](https://www.iob.rwth-aachen.de/habilitation-von-dr-wolfgang-lenz/). In this example, a numerical solution is derived which we will use as starting point.\n",
"\n",
"First, we will start with a motivating generic example to get familiar with the method and general code structure before then turning to the concrete example."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"id": "FTDBpsfQAJUu"
},
"outputs": [],
"source": [
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"import seaborn as sns\n",
"\n",
"import torch\n",
"\n",
"from datetime import datetime\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "gbNrWJBShOFE"
},
"source": [
"## General Example\n",
"\n",
"We start with a generic example using the function\n",
"$f(x) = \\cos(x) -x$.\n",
"\n",
"First, we plot the function.\n",
"Note that we directly use [torch.tensor](https://pytorch.org/docs/stable/tensors.html) as we will later on use the automatic differentiation to implement Newton's method for finding roots.\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "fQA-El5LATbi"
},
"outputs": [],
"source": [
"def f(x):\n",
" return # YOUR CODE HERE"
]
},
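{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before we use automatic differentiation inside Newton's method, the following cell is a minimal, self-contained sketch of how it works in PyTorch. To avoid giving away the exercise, it uses a simple placeholder function $g(x) = x^2$ (not the $f$ defined above): calling `backward()` on the output fills `x.grad` with the derivative $g'(x) = 2x$.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Minimal autograd sketch with a placeholder function g(x) = x**2 (not the exercise function f).\n",
"x_demo = torch.tensor([3.0], requires_grad=True)  # point at which we evaluate the derivative\n",
"y_demo = x_demo ** 2                              # g(x) = x^2\n",
"y_demo.backward()                                 # fills x_demo.grad with dg/dx\n",
"print(x_demo.grad)                                # expected: tensor([6.]) since g'(3) = 6\n"
]
},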
{
"cell_type": "markdown",
"metadata": {
"id": "5Kh08W8faFrI"
},
"source": [
"Let's first make a plot of this function.\n",
"Assuming that we already know that the root of the function is at $x=0.755$, we add a vertical line to indicate this root."
"Function value at root approximation f(x) = 0.0\n"
]
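{
"cell_type": "markdown",
"metadata": {},
"source": [
"The cell below is one possible way to create this plot; it assumes that `f` above has been implemented and uses the approximate root value for the vertical line.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# One possible plotting sketch (assumes f above has been implemented).\n",
"x_plot = torch.linspace(-2.0, 2.0, 200)                  # sample points as a torch tensor\n",
"y_plot = f(x_plot)                                       # evaluate the function on the grid\n",
"\n",
"plt.plot(x_plot.numpy(), y_plot.numpy(), label='f(x) = cos(x) - x')\n",
"plt.axhline(0.0, color='grey', linewidth=0.5)            # the x-axis\n",
"plt.axvline(0.739, color='red', linestyle='--', label='approximate root')\n",
"plt.xlabel('x')\n",
"plt.ylabel('f(x)')\n",
"plt.legend()\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next, we implement Newton's method directly: in each iteration we evaluate $f(x_n)$, obtain $f'(x_n)$ via automatic differentiation, and apply the update $x_{n+1} = x_n - \\frac{f(x_n)}{f'(x_n)}$ until the change between successive iterations falls below the tolerance. Fill in the missing parts where indicated.\n"
]
},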
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"x = torch.tensor([0.1], requires_grad=True)\n",
"tolerance = 1e-6\n",
"max_iterations = 100\n",
"\n",
"t_start = datetime.now()\n",
"for i in range(max_iterations):\n",
" y = f(x)\n",
" # YOUR CODE HERE\n",
" with torch.no_grad():\n",
" # Replacing in-place copy with out-of-place operation\n",
" x_new = # YOUR CODE HERE\n",
"\n",
" if torch.abs(x_new - x).item() < tolerance: #add .item() to get a python number\n",
" t_stop = datetime.now()\n",
" print(f'Converged after {i+1} iterations.')\n",
" print(f'Time taken: {t_stop - t_start}')\n",
" break\n",
"\n",
" x = x_new.clone().detach().requires_grad_(True) # Create a new tensor with gradient enabled\n",
"\n",
"print(f'Root approximated at x = {x.item()}')\n",
"print(f'Function value at root approximation f(x) = {f(x).item()}')"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Svek5D1zaYaJ"
},
"source": [
"# With Optimiser\n",
"\n",
"In the above code, we have implemented Newton's method directly.\n",
"However, modern deep learning packages include poweful optimisers that perform the calculation of the gradient, as well as the subsequent updates of the parameters.\n",
"\n",
"As an exercise, re-write the code to use the [Adam](https://pytorch.org/docs/stable/generated/torch.optim.Adam.html) optimiser.\n",
"\n",
"*Hint*: You need to think of a suitable loss function;\n",
"\n",
"*Note*: Depending on the problem at hand, using an optimiser and loss function may (or may not) improve convergence. You may find that the standard approach works sufficiently well for your problem."
]
},
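{
"cell_type": "markdown",
"metadata": {},
"source": [
"One possible sketch (not the only solution, and only one of many reasonable choices of loss and hyperparameters): treat $x$ as the single trainable parameter and minimise the squared residual $L(x) = f(x)^2$, which is non-negative and zero exactly at a root. The learning rate and the number of steps below are illustrative guesses and may need tuning for a given problem.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# One possible sketch: minimise the squared residual f(x)**2 with Adam.\n",
"# The learning rate and number of steps are illustrative and may need tuning.\n",
"x_opt = torch.tensor([0.1], requires_grad=True)\n",
"optimiser = torch.optim.Adam([x_opt], lr=0.1)\n",
"\n",
"for step in range(500):\n",
"    optimiser.zero_grad()        # reset accumulated gradients\n",
"    loss = f(x_opt) ** 2         # squared residual: zero exactly at a root\n",
"    loss.backward()              # compute d(loss)/dx via autograd\n",
"    optimiser.step()             # Adam update of x\n",
"    if loss.item() < 1e-12:      # simple stopping criterion on the loss\n",
"        break\n",
"\n",
"print(f'Root approximated at x = {x_opt.item()}')\n",
"print(f'Function value at root approximation f(x) = {f(x_opt).item()}')"
]
}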