Commit 5d287eba authored by Ulrich Kerzel

python dict update

parent b2897ce3
%% Cell type:markdown id: tags:
# Common Datatypes
Previously, we have seen the basic datatypes such as:
* int
* float
* string
* boolean
These are suitable to hold single values - but often, we want to create more complex datatypes that can hold more than one value.
## Lists
Lists are "containers" that store a sequence of elements.
We can initialise a list with a sequence of elements or start with an empty list.
We can get the length of the list with the function ```len()```.
%% Cell type:code id: tags:
``` python
my_empty_list = []
my_list = [1,2,3,4]
# print the content of the list
print(my_list)
# print the length of the list - remember the way to format the printed statements
print('The list has {} elements'.format(len(my_list)))
```
%% Output
[1, 2, 3, 4]
The list has 4 elements
%% Cell type:markdown id: tags:
### Extending lists
There are multiple ways to add new elements to a list ```my_list```
(note that normally we would use more descriptive variable names instead of just ```my_list``` !)
* add a new element at the end of the list: ```l.append(element)```
* insert an element at a specific position: ```l.insert(index, element)```
* concatenating lists: ```list_1 + list_2```
%% Cell type:code id: tags:
``` python
# add a new element at the end
my_list.append(5)
print(my_list)
print('-------------')
# add a new element in the middle at index 2
my_list.insert(2,3)
print(my_list)
print('-------------')
# concatenating lists
list_1 = [1,2,3]
list_2 = [2,3,4]
my_list = list_1 + list_2
print(my_list)
print('-------------')
```
%% Output
[1, 2, 3, 4, 5]
-------------
[1, 2, 3, 3, 4, 5]
-------------
[1, 2, 3, 2, 3, 4]
-------------
%% Cell type:markdown id: tags:
### Common list operations
* sorting a list in place: ```l.sort()```. We can add the argument ```reverse=True``` if we want the list to be sorted in descending order.
* count the number of times an element appears in the list: ```l.count(element)```
* reverse the order of elements in the list: ```l.reverse()```
* find the position (index) of the first occurrence of an element with value ```value```: ```l.index(value, pos)```. The second argument ```pos``` is optional; if we specify ```pos>0```, the search starts from this position (index) instead of the beginning.
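As a quick illustration of the operations above, here is a minimal sketch on a made-up list called ```scores``` (the name and the values are only assumptions for this example):

```python
# hypothetical example data, chosen only for illustration
scores = [4, 1, 3, 1, 5]

# count how often the value 1 appears
n_ones = scores.count(1)
print(n_ones)

# index of the first 1, and of the next 1 when the search starts at index 2
first_one = scores.index(1)
second_one = scores.index(1, 2)
print(first_one, second_one)

# sort() and reverse() change the list itself and return None
scores.sort()
print(scores)
scores.sort(reverse=True)
print(scores)
scores.reverse()
print(scores)
```

Note that ```sort()``` and ```reverse()``` modify the list in place, so we print the list itself rather than the return value of the call.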
**Exercise:**
Take the list defined below, sort it in ascending order and count how often the number 2 appears.
%% Cell type:code id: tags:
``` python
my_list = [0,3,2,6,3,2,1,7,8,7]
# ... your code here ....
```
%% Output
None
%% Cell type:markdown id: tags:
### Removing elements from lists
* remove all items from a list: ```l.clear()```
* remove an item at a specific index and return it: ```element = l.pop(index)```. The argument ```index``` is optional; if you do not specify it, the last element in the list is removed and returned.
* remove the first item with value ```value``` from the list: ```l.remove(value)```. If the ```value``` does not exist, we get an error (```ValueError```)
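The following sketch shows ```pop()```, ```clear()``` and the error raised by ```remove()``` on a made-up list ```colours``` (name and values are assumptions for illustration only):

```python
# hypothetical example data
colours = ['red', 'green', 'blue', 'green']

# pop() without an index removes and returns the last element
last = colours.pop()
print(last, colours)

# pop(0) removes and returns the element at index 0
first = colours.pop(0)
print(first, colours)

# remove() raises a ValueError if the value is not in the list
try:
    colours.remove('purple')
except ValueError:
    print('purple is not in the list')

# clear() removes all remaining elements
colours.clear()
print(colours)
```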
%% Cell type:code id: tags:
``` python
my_list = [0,3,2,6,3,2,1,7,8,7]
# remove the first 3
my_list.remove(3)
print(my_list)
```
%% Output
[0, 2, 6, 3, 2, 1, 7, 8, 7]
%% Cell type:markdown id: tags:
### Accessing list elements and slicing
In order to work with lists, we also need to access the elements. We can do this by using their index in the following way: ```l[index]```
Indices are counted forwards (starting from 0), i.e. the first element has ```index = 0```, the second element ```index = 1```, and so on.
However, we can also count backwards. Then, the last element has ```index = -1 ```, the second last has ```index = -2```, and so on.
![Image](ListIndex.png)
%% Cell type:code id: tags:
``` python
my_list = [1, 2, 3, 4, 5]
# print the full list
print(my_list)
print('-----------')
# print the third element
print(my_list[2])
print('-----------')
# print the last element
print(my_list[4])
print(my_list[len(my_list)-1])
print(my_list[-1])
```
%% Output
[1, 2, 3, 4, 5]
-----------
3
-----------
5
5
5
%% Cell type:markdown id: tags:
Here we see three ways of accessing the last element of the list:
* we happen to know that the list contains five elements, hence, the last one is at ```index = 4``` (because we start counting from zero)
* the function ```len( list )``` gives us the length of the list. We subtract one (as we start counting from zero), to get the index of the last element.
* we use the backward index and use ```index = -1``` to refer to the last element.
> **Note:**
>
> Think about which method you would use and why.
Lists are ***mutable***, i.e. we can change the elements, e.g.
%% Cell type:code id: tags:
``` python
print(my_list)
my_list[2] = 10
print(my_list)
```
%% Output
[1, 2, 3, 4, 5]
[1, 2, 10, 4, 5]
%% Cell type:markdown id: tags:
We can access ranges of lists via the index. The general syntax is ```list [ start_index : stop_index : step_size ]```.
This means:
* we start our slice at ```start_index``` (this element is included),
* end just before the ```stop_index``` (this element is not included), and
* go ```step_size``` steps at a time. Positive step sizes mean we go forward, negative we go backward.
All three arguments ```start_index```, ```stop_index```, and ```step_size``` are optional. If we go forward and do not specify them, the slice starts at the beginning of the list, runs to the end of the list, and uses a step size of 1.
Hence ```my_list[:]``` contains all elements of ```my_list``` (it is a copy of the whole list).
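A short sketch of a few slices on a made-up list ```letters``` (an assumption for this example), including the fact that a slice creates a new list:

```python
# hypothetical example data
letters = ['a', 'b', 'c', 'd', 'e', 'f']

# the start index is included, the stop index is not
middle = letters[1:4]
print(middle)

# negative indices count from the end of the list
last_two = letters[-2:]
print(last_two)

# a negative step size walks through the list backwards
backwards = letters[::-1]
print(backwards)

# [:] creates a new list with the same elements:
# changing the copy does not change the original
letters_copy = letters[:]
letters_copy[0] = 'z'
print(letters[0], letters_copy[0])
```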
%% Cell type:code id: tags:
``` python
my_list = [1, 2, 3, 4, 5]
print(my_list)
print(my_list[:])
print('-----------')
# print the list from the second element onwards
print(my_list[1:])
print('-----------')
# print the list up to the second last element
# should give: [1, 2, 3]
# ... your code here ....
print('-----------')
# print the list between the second and the second last element
# should give [2, 3]
# ... your code here ....
print('-----------')
# print every other element of the list
# this should give [1, 3, 5]
# ... your code here ....
```
%% Output
[1, 2, 3, 4, 5]
[1, 2, 3, 4, 5]
-----------
[2, 3, 4, 5]
-----------
-----------
-----------
%% Cell type:markdown id: tags:
### Lists and strings
In a way, lists and strings share some similar behaviour. We can (almost) interpret a string as a list of letters. In this sense, we can also access the elements of the string with indices as we did with lists, work with ranges, etc.
However, crucially, strings are ***immutable***, i.e. once created we cannot change the letters at the indices.
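To illustrate what immutable means in practice, the sketch below tries to change one letter and then builds a new string from slices instead (the variable names ```greeting``` and ```capitalised``` are only chosen for this example):

```python
greeting = 'I love python'

# assigning to an index of a string raises a TypeError
try:
    greeting[7] = 'P'
except TypeError as error:
    print('Cannot change the string in place:', error)

# instead, we build a new string from slices of the old one
capitalised = greeting[:7] + 'P' + greeting[8:]
print(capitalised)
# the original string is unchanged
print(greeting)
```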
%% Cell type:code id: tags:
``` python
my_string = 'I love python'
index_p = my_string.index('p')
print(my_string[index_p:])
```
%% Output
python
%% Cell type:markdown id: tags:
**Exercise:**
Use indices and slicing methods to print the word "run" from the word "nurse" below.
%% Cell type:code id: tags:
``` python
my_word = 'nurse'
# ... your code here ...
```
%% Cell type:markdown id: tags:
## Tuples
Tuples are quite similar to lists and we can also access them via their indices.
However, unlike lists, tuples are ***immutable***, i.e. once created, we cannot change the values.
Tuples are (technically) defined by the presence of the comma; however, we typically use round brackets to make them look neater and more easily recognisable.
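A small sketch of the role of the comma (the variable names are made up for this example): a tuple can be written without brackets, and a tuple with a single element needs a trailing comma.

```python
# the comma creates the tuple, the brackets are optional
point = 1, 2, 3
print(type(point))

# brackets alone do not create a tuple: this is just the integer 5
not_a_tuple = (5)
print(type(not_a_tuple))

# a tuple with one element needs a trailing comma
single = (5,)
print(type(single), len(single))
```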
%% Cell type:code id: tags:
``` python
my_tuple = (1, 2, 3, 4, 5)
print(my_tuple)
print('-----------')
print(my_tuple[2:])
print('-----------')
#this will fail.
my_tuple[1] = 10
```
%% Output
(1, 2, 3, 4, 5)
-----------
(3, 4, 5)
-----------
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In [10], line 10
7 print('-----------')
9 #this will fail.
---> 10 my_tuple[1] = 10
TypeError: 'tuple' object does not support item assignment
%% Cell type:markdown id: tags:
## Set
A set is also similar to a list - but here each element can only occur once and the order of the elements is not fixed, i.e. it is an unordered collection of distinct objects.
We use curly brackets to define a set.
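One way these properties can be used is sketched below with a made-up list ```visitors``` (an assumption for this example): converting a list to a set removes duplicates, and because a set has no fixed order, we cannot access its elements by index.

```python
# hypothetical example data with repeated entries
visitors = ['anna', 'ben', 'anna', 'cara', 'ben']

# converting the list to a set keeps each value only once
unique_visitors = set(visitors)
print(len(visitors), len(unique_visitors))

# check whether a value is part of the set
print('anna' in unique_visitors)

# adding a value that is already in the set changes nothing
unique_visitors.add('anna')
print(len(unique_visitors))

# sets have no fixed order, so indexing raises a TypeError
try:
    unique_visitors[0]
except TypeError as error:
    print('Cannot index a set:', error)
```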
%% Cell type:code id: tags:
``` python
my_set = {1, 2, 3}
print(my_set)
print('-----------')
# this is the same
my_set = {1, 1, 2, 3, 3, 3, 2, 1 }
print(my_set)
print('-----------')
```
%% Output
{1, 2, 3}
-----------
{1, 2, 3}
-----------
%% Cell type:markdown id: tags:
---
## Dictionaries
Dictionaries are a very common datatype in Python, typically used to store and look up information.
Each entry of a dictionary has two parts, the "key" and the "value", that always come together, i.e. we have *key-value pairs*.
* key: Each key is associated with a value and we can use this key to access or change the information stored in the corresponding value. The keys can be of any immutable data type: if we allowed a key to change later, we could no longer establish the relationship "key - value".
Each key needs to be unique: again, we could not establish a "key - value" relationship if we had the same key multiple times.
* value: This holds the content we want to associate with the key. The values can be of any Python data type. We could have, for example, a simple string, a number - but also lists or even other dictionaries.
The general syntax is:
``` my_dict = { key_1 : value_1, key_2 : value_2, ...} ```
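The rules for keys can be tried out with a short sketch (all names and values below are made up for illustration): an immutable tuple works as a key, a mutable list does not, and writing the same key twice keeps only the last value.

```python
# an immutable tuple can be used as a key
grid = {(0, 0): 'start', (2, 3): 'goal'}
print(grid[(2, 3)])

# a mutable list cannot be used as a key
try:
    bad_grid = {[0, 0]: 'start'}
except TypeError as error:
    print('Lists cannot be keys:', error)

# keys are unique: if a key is written twice, the last value wins
repeated = {'a': 1, 'a': 2}
print(repeated)
```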
%% Cell type:code id: tags:
``` python
# A simple dictionary describing a person
person = {
'hair_colour' : 'black',
'eye_colour' : 'green',
'glasses' : True,
'shoe_size' : 45
}
print(person)
```
%% Output
{'hair_colour': 'black', 'eye_colour': 'green', 'glasses': True, 'shoe_size': 45}
%% Cell type:markdown id: tags:
We can access and change the values by accessing the dictionary via its key:
```my_dict[key]``` will give the value, ```my_dict[key] = new_value``` will assign the new value.
If the key does not yet exist, it will be added to the dictionary.
%% Cell type:code id: tags:
``` python
print(person['shoe_size'])
print('------------')
person['hair_colour'] = 'green'
person['hair_style'] = 'short'
print(person)
```
%% Output
45
------------
{'hair_colour': 'green', 'eye_colour': 'green', 'glasses': True, 'shoe_size': 45, 'hair_style': 'short'}
%% Cell type:markdown id: tags:
If we no longer need a specific key-value pair, we can remove it via ```del my_dict[key]```
%% Cell type:code id: tags:
``` python
del person['shoe_size']
print (person)
```
%% Output
{'hair_colour': 'green', 'eye_colour': 'green', 'glasses': True, 'hair_style': 'short'}
%% Cell type:markdown id: tags:
***Exercise:***
Write a dictionary that describes your favourite pizza.
Use more than one topping - think about how you would organise this.
%% Cell type:code id: tags:
``` python
# ... your code here ...
```
%% Cell type:markdown id: tags:
Useful functions for dictionaries:
* ```my_dict.clear()``` : removes all key - value pairs
* ```my_dict.get(key)```: returns the value for the ```key``` but does not give an error if the key does not exist (instead, it returns ```None```)
* ```my_dict.items()```: returns a view of all key-value pairs, each as a ```(key, value)``` tuple
* ```my_dict.keys()``` : returns a view of all keys
* ```my_dict.values()```: returns a view of all values
* ```my_dict.pop(key)```: removes the ```key``` from the dictionary and returns the associated value
* ```my_dict.popitem()```: removes the most recently added key-value pair and returns it as a tuple.
* ```my_dict.update(other_dict)```: updates the dictionary ```my_dict``` with ```other_dict```: If the keys of ```other_dict``` exist in ```my_dict``` already, the values will be updated, otherwise, the keys will be added.
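A minimal sketch of some of these functions, using a shortened version of the ```person``` dictionary (the key ```'height'``` and the default value ```'unknown'``` are assumptions for this example):

```python
person = {'hair_colour': 'black', 'glasses': True}

# get() returns None for a missing key instead of raising an error;
# an optional second argument is returned instead of None
height = person.get('height')
height_or_default = person.get('height', 'unknown')
print(height, height_or_default)

# items() lets us loop over the key-value pairs
for key, value in person.items():
    print(key, '->', value)

# update() changes existing keys and adds new ones
person.update({'glasses': False, 'shoe_size': 45})
print(person)

# popitem() removes and returns the most recently added pair
last_pair = person.popitem()
print(last_pair)
print(person)
```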
%% Cell type:code id: tags:
``` python
person = {
'hair_colour' : 'black',
'eye_colour' : 'green',
'glasses' : True,
'shoe_size' : 45
}
print(person)
print('-------------')
print(person.values())
print('-------------')
print(person.keys())
print('-------------')
value = person.pop('shoe_size')
print('The value returned is: {}'.format(value))
print(person)
```
%% Output
{'hair_colour': 'black', 'eye_colour': 'green', 'glasses': True, 'shoe_size': 45}
-------------
dict_values(['black', 'green', True, 45])
-------------
dict_keys(['hair_colour', 'eye_colour', 'glasses', 'shoe_size'])
-------------
The value returned is: 45
{'hair_colour': 'black', 'eye_colour': 'green', 'glasses': True}
......
......@@ -507,7 +507,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
"version": "3.10.12"
},
"orig_nbformat": 4,
"vscode": {
......
%% Cell type:markdown id: tags:
# Fast Python with Numba
%% Cell type:code id: tags:
``` python
# all imports
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib as mpl
import seaborn as sns
import numba
from numba import jit
```
%% Cell type:code id: tags:
``` python
print(numba.__version__)
```
%% Output
0.56.4
%% Cell type:code id: tags:
``` python
@jit(nopython=True)
def random_walk(n_steps = 5000, step_size = 1):
    # we always start at (0,0)
    x_points = [0]
    y_points = [0]
    # do the random walk:
    for i in range(0, n_steps):
        # choose direction:
        # the following is the same as np.random.choice([-1,1]) but this cannot be optimized with Numba
        x_dir = np.round(2*(np.random.randint(0,2)-0.5))
        y_dir = np.round(2*(np.random.randint(0,2)-0.5))
        # calculate new positions: last position + step_size * direction
        new_x = x_points[-1] + step_size * x_dir
        new_y = y_points[-1] + step_size * y_dir
        # append to lists
        x_points.append(new_x)
        y_points.append(new_y)
    # calculate distance between start and end as Euclidean distance
    # bit explicit as numba does not work with the one line we have used before
    x_start = x_points[0]
    y_start = y_points[0]
    x_stop = x_points[-1]
    y_stop = y_points[-1]
    distance2 = (x_stop - x_start )**2 + ( y_stop - y_start )**2
    distance = np.sqrt( distance2)
    return x_points, y_points, distance
```
%% Cell type:markdown id: tags:
Now we can compare the timings with and without the ```@jit``` decorator. \
Remember that decorators change the behaviour of the function - but we do not have to change the function itself.
In this case, Numba is a specialised package that optimises a function "behind the scenes".
Note that the first call includes the optimisation / compile time. If we want to measure the time the optimised function takes, we need to discard the timing from the first call.
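One way to see the effect of the compile time is to time the first and the second call separately. The sketch below uses a small made-up function ```sum_of_squares``` and ```time.perf_counter()``` instead of the ```%%time``` cell magic; both are assumptions for this example, as is the fallback decorator that does nothing if Numba is not installed. The timings depend on the machine, so no numbers are shown here.

```python
import time

try:
    from numba import jit
except ImportError:
    # assumption for this sketch: without Numba we use a decorator that does nothing
    def jit(*args, **kwargs):
        def decorator(func):
            return func
        return decorator

@jit(nopython=True)
def sum_of_squares(n):
    # hypothetical example function: sum of i*i for i = 0 ... n-1
    total = 0.0
    for i in range(n):
        total += i * i
    return total

# first call: with Numba, this includes the time needed to compile the function
start = time.perf_counter()
first_result = sum_of_squares(100_000)
first_call = time.perf_counter() - start

# second call: the compiled version is reused
start = time.perf_counter()
second_result = sum_of_squares(100_000)
second_call = time.perf_counter() - start

print('first call:  {:.6f} s'.format(first_call))
print('second call: {:.6f} s'.format(second_call))
```

With Numba installed, we would expect the first call to take noticeably longer than the second one, which is why the first timing should be discarded.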
%% Cell type:code id: tags:
``` python
%%time
distances = []
for i in range(0,200):
    _, _, distance = random_walk()
    distances.append(distance)
```
%% Output
CPU times: user 33.4 ms, sys: 0 ns, total: 33.4 ms
Wall time: 33 ms
CPU times: user 34.5 ms, sys: 0 ns, total: 34.5 ms
Wall time: 33.9 ms