Commit fb57240b authored by Ulrich Kerzel

rename list l in DataTypes

parent 1ca8690a
%% Cell type:markdown id: tags:
# Common Datatypes
Previously, we have seen the basic datatypes such as:
* int
* float
* string
* boolean
These are suitable to hold single values - but often, we want to create more complex datatypes that can hold more than one value.
## Lists
Lists are "containers" that store a sequence of elements.
We can initialise a list with a sequence of elements or start with an empty list.
We can get the length of the list by using the function ```len()```.
%% Cell type:code id: tags:
``` python
my_empty_list = []
my_list = [1,2,3,4]
# print the content of the list
print(my_list)
# print the length of the list - remember the way to format the printed statements
print('The list has {} elements'.format(len(my_list)))
```
%% Output
[1, 2, 3, 4]
The list has 4 elements
%% Cell type:markdown id: tags:
### Extending lists
There are multiple ways to add new elements to a list ```my_list```
(note that in real code we would normally pick a more descriptive variable name that says what the list contains):
* add a new element at the end of the list: ```my_list.append(element)```
* insert an element at a specific position: ```my_list.insert(index, element)```
* concatenate two lists into a new list: ```list_1 + list_2```
%% Cell type:code id: tags:
``` python
# add a new element at the end
my_list.append(5)
print(my_list)
print('-------------')
# add a new element in the middle at index 2
my_list.insert(2,3)
print(my_list)
print('-------------')
# concatenating lists
list_1 = [1,2,3]
list_2 = [2,3,4]
my_list = list_1 + list_2
print(my_list)
print('-------------')
```
%% Output
[1, 2, 3, 4, 5]
-------------
[1, 2, 3, 3, 4, 5]
-------------
[1, 2, 3, 2, 3, 4]
-------------
%% Cell type:markdown id: tags:
### Common list operations
* sort a list in place: ```my_list.sort()```. We can add the argument ```reverse=True``` if we want the list to be sorted in descending order.
* count the number of times an element appears in the list: ```my_list.count(element)```
* reverse the order of elements in the list: ```my_list.reverse()```
* find the position (index) of the first occurrence of an element with value ```value```: ```my_list.index(value, pos)```. The second argument ```pos``` is optional; if we specify ```pos>0```, the search starts from this position (index) instead of the beginning. If the value is not found, we get an error (```ValueError```).
**Exercise:**
Take the list defined below, sort it in ascending order and count how often the number 2 appears.
%% Cell type:code id: tags:
``` python
my_list = [0,3,2,6,3,2,1,7,8,7]
# ... your code here ....
```
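%% Cell type:markdown id: tags:
As a minimal sketch of the list methods described above, the cell below applies them to a small list of names. The list ```names``` and its content are made up for this illustration only; note that ```sort()``` and ```reverse()``` change the list itself and do not return a new one.
%% Cell type:code id: tags:
```python
# Hypothetical example list, chosen only for illustration
names = ['Cleo', 'Ada', 'Bob', 'Ada']

# sort() works in place; reverse=True sorts in descending order
names.sort(reverse=True)
print(names)          # ['Cleo', 'Bob', 'Ada', 'Ada']

# index(value) finds the first occurrence of the value
first_ada = names.index('Ada')
print(first_ada)      # 2

# index(value, pos) starts the search at position pos
second_ada = names.index('Ada', first_ada + 1)
print(second_ada)     # 3

# reverse() also works in place
names.reverse()
print(names)          # ['Ada', 'Ada', 'Bob', 'Cleo']
```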
%% Cell type:markdown id: tags:
### Removing elements from lists
* remove all items from a list: ```my_list.clear()```
* remove an item at a specific index and return it: ```element = my_list.pop(index)```. The argument ```index``` is optional; if you do not specify it, the last element in the list is removed and returned.
* remove the first item with value ```value``` from the list: ```my_list.remove(value)```. If the ```value``` does not exist, we get an error (```ValueError```).
%% Cell type:code id: tags:
``` python
my_list = [0,3,2,6,3,2,1,7,8,7]
# remove the first 3
my_list.remove(3)
print(my_list)
```
%% Output
[0, 2, 6, 3, 2, 1, 7, 8, 7]
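%% Cell type:markdown id: tags:
The other two ways of removing elements can be sketched in the same manner. The list ```numbers``` is a made-up example, and the ```try``` / ```except``` block is only used here so that the cell keeps running when ```remove()``` does not find the value.
%% Cell type:code id: tags:
```python
# Hypothetical example list
numbers = [10, 20, 30, 40]

# pop() without an index removes and returns the last element
last = numbers.pop()
print(last, numbers)     # 40 [10, 20, 30]

# pop(0) removes and returns the element at index 0
first = numbers.pop(0)
print(first, numbers)    # 10 [20, 30]

# remove() raises a ValueError if the value is not in the list
try:
    numbers.remove(99)
    removed = True
except ValueError:
    removed = False
    print('99 is not in the list')

# clear() empties the list
numbers.clear()
print(numbers)           # []
```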
%% Cell type:markdown id: tags:
### Accessing list elements and slicing
In order to work with lists, we also need to access the elements. We can do this by using their index in the following way: ```my_list[index]```
Indices are counted forwards (starting from 0), i.e. the first element has ```index = 0```, the second element ```index = 1```, and so on.
However, we can also count backwards. Then, the last element has ```index = -1```, the second last has ```index = -2```, and so on.
![Image](ListIndex.png)
%% Cell type:code id: tags:
``` python
my_list = [1, 2, 3, 4, 5]
# print the full list
print(my_list)
print('-----------')
# print the third element (index 2, as we start counting from zero)
print(my_list[2])
print('-----------')
# print the last element
print(my_list[4])
print(my_list[len(my_list)-1])
print(my_list[-1])
```
%% Output
[1, 2, 3, 4, 5]
-----------
3
-----------
5
5
5
%% Cell type:markdown id: tags:
Here we see three ways of accessing the last element of the list:
* we happen to know that the list contains five elements, hence, the last one is at ```index = 4``` (because we start counting from zero)
* the function ```len(my_list)``` gives us the length of the list. We subtract one (as we start counting from zero) to get the index of the last element.
* we use the backward index and use ```index = -1``` to refer to the last element.
> **Note:**
>
> Think about which method you would use and why.
Lists are ***mutable***, i.e. we can change the elements, e.g.
%% Cell type:code id: tags:
``` python
print(my_list)
my_list[2] = 10
print(my_list)
```
%% Output
[1, 2, 3, 4, 5]
[1, 2, 10, 4, 5]
%% Cell type:markdown id: tags:
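We can access ranges of lists via the index; this is called *slicing*. The general syntax is ```my_list[start_index : stop_index : step_size]```.
This means:
* we start our slice at ```start_index``` (this element is included),
* stop just before ```stop_index``` (this element is *not* included), and
* go ```step_size``` steps at a time. Positive step sizes mean we go forward, negative ones mean we go backward.
All three arguments ```start_index```, ```stop_index```, and ```step_size``` are optional. For a forward slice, leaving out ```start_index``` means we start at the beginning of the list, leaving out ```stop_index``` means we go on to the end of the list, and leaving out ```step_size``` means we go one step at a time.
Hence ```my_list``` and ```my_list[:]``` both give the whole list (the slice is a new list with the same elements).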
%% Cell type:code id: tags:
``` python
my_list = [1, 2, 3, 4, 5]
print(my_list)
print(my_list[:])
print('-----------')
# print the list from the second element onwards
print(my_list[1:])
print('-----------')
# print the list up to the second last element
# should give: [1, 2, 3]
# ... your code here ....
print('-----------')
# print the list between the second and the second last element
# should give [2, 3]
# ... your code here ....
print('-----------')
# print every other element of the list
# this should give [1, 3, 5]
# ... your code here ....
```
%% Output
[1, 2, 3, 4, 5]
[1, 2, 3, 4, 5]
-----------
[2, 3, 4, 5]
-----------
-----------
-----------
[1, 3, 5]
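%% Cell type:markdown id: tags:
To make the slicing rules more concrete, here is a small sketch on a different list. The list ```letters``` is made up for this illustration; the comments show which elements each slice picks and that the element at ```stop_index``` is not part of the result.
%% Cell type:code id: tags:
```python
# Hypothetical example list
letters = ['a', 'b', 'c', 'd', 'e', 'f']

# start at index 1, stop before index 4
middle = letters[1:4]
print(middle)        # ['b', 'c', 'd']

# negative start index: the last two elements
last_two = letters[-2:]
print(last_two)      # ['e', 'f']

# negative step size: walk through the list backwards
backwards = letters[::-1]
print(backwards)     # ['f', 'e', 'd', 'c', 'b', 'a']

# step size 3: every third element
every_third = letters[::3]
print(every_third)   # ['a', 'd']
```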
%% Cell type:markdown id: tags:
### Lists and strings
In a way, lists and strings share some similar behaviour. We can (almost) interpret a string as a list of letters. In this sense, we can also access the elements of the string with indices as we did with lists, work with ranges, etc.
However, crucially, strings are ***immutable***, i.e. once created, we cannot change the letters at the indices.
%% Cell type:code id: tags:
``` python
my_string = 'I love python'
index_p = my_string.index('p')
print(my_string[index_p:])
```
%% Output
python
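%% Cell type:markdown id: tags:
The following sketch illustrates what "immutable" means in practice: assigning to an index of a string fails, so we build a new string from slices instead. The variable names ```changed``` and ```new_string``` are only used for this example, and the ```try``` / ```except``` block is there so that the cell does not stop at the error.
%% Cell type:code id: tags:
```python
my_string = 'I love python'

# Assigning to an index works for lists, but not for strings
try:
    my_string[0] = 'U'
    changed = True
except TypeError as error:
    changed = False
    print('Not allowed:', error)

# Instead, we create a new string from pieces of the old one
new_string = 'You' + my_string[1:]
print(new_string)    # You love python
print(my_string)     # I love python (unchanged)
```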
%% Cell type:markdown id: tags:
Exercise:
Use indices and slicing methods to print the word "run" from the word "nurse" below.
%% Cell type:code id: tags:
``` python
my_word = 'nurse'
# ... your code here ...
```
%% Output
RUN
%% Cell type:markdown id: tags:
## Tuples
Tuples are quite similar to lists and we can also access them via their indices.
However, unlike lists, tuples are ***immutable***, i.e. once created, we cannot change the values.
Tuples are (technically) defined by the presence of the comma; however, we typically use round brackets to make them look neater and more easily recognisable.
%% Cell type:code id: tags:
``` python
my_tuple = (1, 2, 3, 4, 5)
print(my_tuple)
print('-----------')
print(my_tuple[2:])
print('-----------')
#this will fail.
my_tuple[1] = 10
```
%% Output
(1, 2, 3, 4, 5)
-----------
(3, 4, 5)
-----------
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In [18], line 10
7 print('-----------')
9 #this will fail.
---> 10 my_tuple[1] = 10
TypeError: 'tuple' object does not support item assignment
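%% Cell type:markdown id: tags:
The role of the comma can be seen in a short sketch. The variable names below are made up for this illustration; the last two lines show "unpacking", i.e. assigning the elements of a tuple to separate variables in one go.
%% Cell type:code id: tags:
```python
# The comma makes the tuple, the brackets are optional
point = 3, 4
print(type(point))         # <class 'tuple'>

# A tuple with a single element needs a trailing comma
single = (5,)
print(type(single))        # <class 'tuple'>

# Without the comma, the brackets are just brackets
not_a_tuple = (5)
print(type(not_a_tuple))   # <class 'int'>

# Unpacking: assign the elements to separate variables
x, y = point
print(x, y)                # 3 4
```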
%% Cell type:markdown id: tags:
## Sets
A set is also similar to a list - but here each element can only occur once and the order of the set is not fixed, i.e. it is an unordered collection of distinct objects.
We use curly brackets to define a set.
%% Cell type:code id: tags:
``` python
my_set = {1, 2, 3}
print(my_set)
print('-----------')
# this is the same
my_set = {1, 1, 2, 3, 3, 3, 2, 1 }
print(my_set)
print('-----------')
```
%% Output
{1, 2, 3}
-----------
{1, 2, 3}
-----------
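%% Cell type:markdown id: tags:
A brief sketch of how sets are typically used: adding elements, testing whether an element is contained, and removing duplicates from a list. The names ```colours``` and ```unique_numbers``` are made up for this example. We print the sorted content because the order of a set is not fixed.
%% Cell type:code id: tags:
```python
# Hypothetical example set
colours = {'red', 'green'}

# add() inserts a new element; adding an existing one has no effect
colours.add('blue')
colours.add('red')
print(len(colours))            # 3

# 'in' tests whether an element is contained in the set
print('green' in colours)      # True

# Converting a list to a set removes the duplicates
unique_numbers = set([1, 1, 2, 3, 3])
print(sorted(unique_numbers))  # [1, 2, 3]
```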
%% Cell type:markdown id: tags:
---
## Dictionaries
Dictionaries are a very common datatype in python that are typically used to store and look up information.
Each element of a dictionary has two parts, the "key" and the "value", which always come together, i.e. we have *key-value pairs*.
* key: Each key is associated with a value and we can use this key to access or change the information stored in the corresponding value. The keys can be of any immutable data type, such as a string, a number or a tuple of immutable elements. If we allowed the key to change later, we could no longer establish the relationship "key - value".
Each key needs to be unique: again, we could not establish a "key - value" relationship if we had the same key multiple times.
* value: This holds the content we want to associate with the key. The values can be of any python data type. We could have, for example, a simple string, a number - but also lists or even other dictionaries.
The general syntax is:
``` my_dict = { key_1 : value_1, key_2 : value_2, ...} ```
%% Cell type:code id: tags:
``` python
# A simple dictionary describing a person
person = {
'hair_colour' : 'black',
'eye_colour' : 'green',
'glasses' : True,
'shoe_size' : 45
}
print(person)
```
%% Output
{'hair_colour': 'black', 'eye_colour': 'green', 'glasses': True, 'shoe_size': 45}
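%% Cell type:markdown id: tags:
The requirement that keys are immutable can be tried out directly. In the sketch below, the dictionary ```labels``` is a made-up example that uses tuples as keys; using a list as a key fails with a ```TypeError```, which we catch so that the cell keeps running.
%% Cell type:code id: tags:
```python
# Hypothetical example: tuples (immutable) work as keys
labels = {(0, 0): 'origin', (1, 0): 'right'}
print(labels[(1, 0)])     # right

# Lists are mutable and cannot be used as keys
try:
    bad = {[0, 0]: 'origin'}
    list_key_allowed = True
except TypeError as error:
    list_key_allowed = False
    print('Cannot use a list as a key:', error)
```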
%% Cell type:markdown id: tags:
We can access and change the values by accessing the dictionary via its key:
```my_dict[key]``` will give the value, ```my_dict[key] = new_value``` will assign the new value.
If the key does not yet exist, it will be added to the dictionary.
%% Cell type:code id: tags:
``` python
print(person['shoe_size'])
print('------------')
person['hair_colour'] = 'green'
person['hair_style'] = 'short'
print(person)
```
%% Output
45
------------
{'hair_colour': 'green', 'eye_colour': 'green', 'glasses': True, 'shoe_size': 45, 'hair_style': 'short'}
%% Cell type:markdown id: tags:
If we no longer need a specific key-value pair, we can remove it via ```del my_dict[key]```
%% Cell type:code id: tags:
``` python
del person['shoe_size']
print (person)
```
%% Output
{'hair_colour': 'green', 'eye_colour': 'green', 'glasses': True, 'hair_style': 'short'}
%% Cell type:markdown id: tags:
***Exercise:***
Write a dictionary that describes your favourite pizza.
Use more than one topping - think about how you would organise this.
%% Cell type:code id: tags:
``` python
# ... your code here ...
```
%% Cell type:markdown id: tags:
Useful functions for dictionaries:
* ```my_dict.clear()```: removes all key-value pairs
* ```my_dict.get(key)```: accesses the value for the ```key``` but does not give an error if the key does not exist (instead, it returns ```None```)
* ```my_dict.items()```: returns a view of all key-value pairs, each as a ```(key, value)``` tuple
* ```my_dict.keys()```: returns a view of all keys
* ```my_dict.values()```: returns a view of all values
* ```my_dict.pop(key)```: removes the ```key``` from the dictionary and returns the associated value
* ```my_dict.popitem()```: removes the key-value pair that was added last and returns it as a tuple
* ```my_dict.update(other_dict)```: updates the dictionary ```my_dict``` with ```other_dict```: If the keys of ```other_dict``` exist in ```my_dict``` already, the values will be updated, otherwise, the keys will be added.
%% Cell type:code id: tags:
``` python
person = {
'hair_colour' : 'black',
'eye_colour' : 'green',
'glasses' : True,
'shoe_size' : 45
}
print(person)
print('-------------')
print(person.values())
print('-------------')
print(person.keys())
print('-------------')
value = person.pop('shoe_size')
print('The value returned is: {}'.format(value))
print(person)
```
%% Output
{'hair_colour': 'black', 'eye_colour': 'green', 'glasses': True, 'shoe_size': 45}
-------------
dict_values(['black', 'green', True, 45])
-------------
dict_keys(['hair_colour', 'eye_colour', 'glasses', 'shoe_size'])
-------------
The value returned is: 45
{'hair_colour': 'black', 'eye_colour': 'green', 'glasses': True}
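%% Cell type:markdown id: tags:
The remaining dictionary functions can be sketched in the same way. The example uses a shortened version of the ```person``` dictionary; for merging it uses ```update()```, which is the built-in dictionary method that takes over the key-value pairs of another dictionary. The variable names ```missing``` and ```last_pair``` are only used for this illustration.
%% Cell type:code id: tags:
```python
person = {'hair_colour': 'black', 'eye_colour': 'green'}

# get() returns None instead of raising an error for a missing key
missing = person.get('shoe_size')
print(missing)                 # None

# items() lets us loop over all key-value pairs
for key, value in person.items():
    print(key, '->', value)

# update() changes existing keys and adds new ones
person.update({'eye_colour': 'blue', 'glasses': True})
print(person)

# popitem() removes and returns the pair that was added last
last_pair = person.popitem()
print(last_pair)               # ('glasses', True)
```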
......
%% Cell type:markdown id: tags:
# First Steps
We can use python in a variety of ways:
The simplest is to invoke the interactive shell by just typing ```python``` on the command line. This will always work whenever we have a python interpreter installed and we will get the characteristic prompt ```>>>```. However, it is not very convenient to use as it does not allow a rich history or editing. You can leave this shell by either typing ```quit()``` or ```CTRL+D``` (for EOF: End-Of-File). ```ipython``` is an enhanced interactive shell that is a good way to interact with python whenever we need to do so from the command line.
If we have a python program in a file (e.g. my_program.py), we can run this program via ```python my_program.py```.
Another - very popular - way to use python is via Jupyter Notebooks (such as this one). A notebook consists of separate "cells" where we can mix, for example, documentation or instructions (text, images), and code together with the output of the code.
Each cell can be executed: the code in this cell is passed to the python interpreter, executed, and then the resulting output (if any) is displayed.
This is a great way to explore and work interactively.
However (and there is always a "however") - the notebook keeps a "state": all variables defined by the cells executed so far stay in memory and, therefore, the order in which you execute the cells matters. As always: use with care...
But now, let's start!
%% Cell type:markdown id: tags:
## Python as a pocket calculator
Let's do a simple sum, say, what is 3 + 4
%% Cell type:code id: tags:
``` python
3+4
```
%% Output
7
%% Cell type:markdown id: tags:
We can also do the other basic operations, such as multiplication and division:
%% Cell type:code id: tags:
``` python
3/2
```
%% Output
1.5
%% Cell type:markdown id: tags:
Now multiply 25 by 25.
Note: lines starting with "#" are comments and ignored by python.
%% Cell type:code id: tags:
``` python
# .... your code here....
```
%% Cell type:markdown id: tags:
Next, we want to calculate the circumference (perimeter) of a rectangle. The long side is, say, 10 cm long, the short side 7 cm.
We could write it like this:
%% Cell type:code id: tags:
``` python
2*10+2*7
```
%% Output
34
%% Cell type:markdown id: tags:
However, this would be very inconvenient if we were to look at more rectangles.
We can use variables to store the values of the length of the sides:
%% Cell type:code id: tags:
``` python
short_side = 7
long_side = 10
```
%% Cell type:markdown id: tags:
First, we notice that we do not need to declare variables - we "simply" assign a value to them and use them.
Also, the above cell does not produce an output.
We can use the ```type()``` command to see that python has indeed determined that this is an integer variable.
%% Cell type:code id: tags:
``` python
type(short_side)
```
%% Output
int
%% Cell type:markdown id: tags:
We can see that python has inferred that we are using integers here. Python uses the concept of "dynamic typing", which means that the type of a variable is only determined at run-time (when the code is executed). Other languages, such as C++, use "static typing", where the type of the variable has to be declared when the code is written (e.g.: ```int short_side = 7;```).
As always - dynamic typing has advantages and disadvantages: It is very easy to use, as we see above, and we can also write quite flexible code that operates on a range of types. For example, if we had written ```short_side = 7.0``` we would have obtained a float variable. In python, the rest of our code would stay exactly the same; in statically typed languages, we would have to change the declared type as well.
On the other hand, static typing also allows us to check for incompatible types even before the code is executed.
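As a small sketch of dynamic typing, the cell below assigns values of different types to the same variable. The variable names ```side```, ```doubled_int``` and ```doubled_float``` are made up for this illustration (so that we do not overwrite the variables used for the rectangle); note that the line ```2 * side``` is the same code in both cases.
%% Cell type:code id: tags:
```python
# The type is determined by the value we assign
side = 7
doubled_int = 2 * side
print(doubled_int, type(doubled_int))       # 14 <class 'int'>

# Same code, but now the variable holds a float
side = 7.0
doubled_float = 2 * side
print(doubled_float, type(doubled_float))   # 14.0 <class 'float'>

# The same name can even be re-used for a string
side = 'seven'
print(type(side))                           # <class 'str'>
```
%% Cell type:markdown id: tags: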
> __A word on naming variables:__
>
> Try to name the variables such that their name reflects their purpose. We could have written ```x=7``` and ```y=10``` but then we would probably have forgotten what ```x``` and ```y``` stand for immediately. We could also have written ```short=7``` and ```long=10```. This would have been a bit better - but if we have a few more variables, we would probably wonder what we refer to as ```long``` and ```short```...
>
> Many conventions exist regarding variable names. Some use CamelCase (```ShortSide```); in python, most follow the convention to use lowercase words separated by underscores (```short_side```).
Back to the problem at hand, we wanted to calculate the circumference. Best to store that in a variable as well.
%% Cell type:code id: tags:
``` python
circumference = 2 * short_side + 2 * long_side
```
%% Cell type:markdown id: tags:
We notice that this cell does not have an output now.
To get the value of the variable, we can use the ```print()``` function to print the value to the output or screen:
%% Cell type:code id: tags:
``` python
print(circumference)
```
%% Output
34
%% Cell type:markdown id: tags:
This works, but it would be nice to answer the question "What is the circumference of the rectangle" with a complete sentence (much as you did in primary school, presumably).
The simplest way would be to print the string ```The circumference in cm is:``` and then the value.
Note that we can separate the two by a comma (```,```). The print function then prints all the arguments we give it one after another, separated by a space.
%% Cell type:code id: tags:
``` python
print('The circumference in cm is:', circumference)
```
%% Output
The circumference in cm is: 34
%% Cell type:markdown id: tags:
However, we can do much more when printing and place, for example, the value in the middle (before the unit) by using the string method ```format()```.
You can find more details in the official [format documentation](https://docs.python.org/3/tutorial/inputoutput.html#the-string-format-method).
The general format is:
```print('output string {<variable and qualifier>}'.format(variable1, variable2, ...))```
So here:
%% Cell type:code id: tags:
``` python
print('The circumference is {} cm.'.format(circumference))
```
%% Output
The circumference is 34 cm.
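%% Cell type:markdown id: tags:
To illustrate the "variable and qualifier" part of the format string, here is a short sketch with several placeholders. The variables ```item```, ```price``` and ```quantity``` are made up for this example; the qualifier ```:.1f``` asks for one digit after the decimal point, and a number inside the curly brackets selects an argument by its position.
%% Cell type:code id: tags:
```python
# Hypothetical example values
item = 'coffee'
price = 2.5
quantity = 3

# Placeholders are filled in the order of the arguments
line = '{} x {} costs {:.1f} EUR'.format(quantity, item, quantity * price)
print(line)       # 3 x coffee costs 7.5 EUR

# A number in the placeholder picks the argument by position
swapped = '{1} before {0}'.format('a', 'b')
print(swapped)    # b before a
```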
%% Cell type:markdown id: tags:
Exercise: extend the print statement and include the long and short side as well.
%% Cell type:code id: tags:
``` python
# ... your code here ....
```
%% Cell type:markdown id: tags:
So far we have used integer values. Now we take real valued variables ("float").
Repeat the exercise but now calculate the circumference of a circle.
When you print out the result, use only two digits after the decimal point in the printout.
First, think about which variables you want to use. Then calculate the circumference and print the result.
> __Note:__
>
> Python has no built-in name for pi (it is available as ```math.pi``` once the ```math``` module is imported); we'll define it ourselves here for the sake of this exercise.
> You also note another coding convention: Whenever we define a number or constant to use in our code, we write its name in capitals to signify this.
%% Cell type:code id: tags:
``` python
MY_PI = 3.141592653589793238
# .... your code here ...
```
%% Cell type:markdown id: tags:
Your output should look similar to:
``` The circumference of the circle is 3.14 cm```
%% Cell type:markdown id: tags:
The most basic types are:
- int (integer)
- float (real valued)
- string
- boolean (True / False)
%% Cell type:markdown id: tags:
There are two ways we can define strings, either with single quotes or with double quotes.
However, we cannot mix the two: a string has to end with the same kind of quote it started with.
%% Cell type:code id: tags:
``` python
my_string = 'This is a string.'
my_other_string = "This is also a string."
print(my_string)
print(my_other_string)
```
%% Output
This is a string.
This is also a string.
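%% Cell type:markdown id: tags:
Having two kinds of quotes is handy when the text itself contains a quote character. The sketch below shows the usual options; the variable names are made up for this illustration.
%% Cell type:code id: tags:
```python
# Double quotes outside, so the apostrophe inside is just a character
quote_1 = "It's a string."
print(quote_1)    # It's a string.

# Single quotes outside, double quotes inside
quote_2 = 'She said "hello".'
print(quote_2)    # She said "hello".

# Alternatively, escape the inner quote with a backslash
quote_3 = 'It\'s escaped.'
print(quote_3)    # It's escaped.
```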
%% Cell type:markdown id: tags:
We can convert between these types using ```int(number)```, ```float(number)```, ```str(number)```
%% Cell type:code id: tags:
``` python
number = 42
print('This is the number {} as a {}'.format(number, type(number)))
float_number = float(number)
print('This is the number {} as a {}'.format(float_number, type(float_number)))
string_number = str(number)
print('This is the number {} as a {}'.format(string_number, type(string_number)))
```
%% Output
This is the number 42 as a <class 'int'>
This is the number 42.0 as a <class 'float'>
This is the number 42 as a <class 'str'>
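%% Cell type:markdown id: tags:
Conversions also work in the other direction, i.e. from strings to numbers, as long as the string looks like a number. The sketch below uses made-up variable names; note that ```int()``` cuts off the decimal part of a float instead of rounding, and that a string which is not a number leads to a ```ValueError``` (caught here so that the cell keeps running).
%% Cell type:code id: tags:
```python
# From string to number
from_string = int('42') + 1
print(from_string)     # 43

as_float = float('2.5')
print(as_float)        # 2.5

# int() truncates, it does not round
truncated = int(3.9)
print(truncated)       # 3

# A string that is not a number cannot be converted
try:
    int('forty-two')
    converted = True
except ValueError:
    converted = False
    print('This string is not a number')
```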
%% Cell type:markdown id: tags:
# As a final note:
Look at the following code:
%% Cell type:code id: tags:
``` python
shortside = 10
long_side = 15
circumference = 2 * short_side + 2 * long_side
print('The circumference is {} cm.'.format(circumference))
```
%% Output
The circumference is 44 cm.
%% Cell type:markdown id: tags:
Hm....
we would have expected the output to be 50 cm (2 \* 10 + 2 \* 15) - but it's 44 cm.
Note that in the cell above we have written ```shortside = 10``` when we assigned the variable, however, in the calculation of the circumference we have used ```short_side```.
Earlier, we have indeed defined and used a variable called ```short_side``` - and this is the variable that is used here. This kind of typo is a frequent cause of bugs.
By adhering to best coding practices we can reduce how often this happens.
>Jupyter notebooks are particularly vulnerable to this kind of bug as they can easily become quite long and all variables from every executed cell are kept in memory. This can lead to very confusing behaviour if we execute cells in a different order or jump between cells.
>
> We will use Jupyter notebooks throughout the course for the majority of exercises - mainly as they allow you to participate from a browser and you do not have to set a development environment up on your computer. However, we will also discuss how to write code in less error-prone ways.
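%% Cell type:markdown id: tags:
For comparison, the sketch below shows what happens when no variable with the intended name has been defined earlier: python then notices the typo and raises a ```NameError```. The variable names ```sidelength``` and ```side_length``` are made up for this illustration and are assumed not to be defined anywhere else in the notebook; the ```try``` / ```except``` block is only there so that the cell does not stop at the error. This is also why it can help to restart the notebook and run all cells from the top from time to time.
%% Cell type:code id: tags:
```python
# Typo: we meant to write side_length
sidelength = 10

try:
    # side_length was never defined, so python cannot find it
    area = side_length * side_length
    typo_detected = False
except NameError as error:
    typo_detected = True
    print('Python noticed the typo:', error)
```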
......