Commit 3db2a483 authored by Leštáková, Michaela
# Unit-Testing in Python
This article should give a general introduction to unit testing. Various tools are introduced to perform unit tests in Python and different metrics to evaluate code completeness and quality are discussed. In the end a more advanced testing strategy and some tips for writing testable code are presented.
- [Why write tests?](#why-write-tests)
- [Setting up a testing framework in Python](#setting-up-a-testing-framework-in-python)
- [Choosing the testing tool](#choosing-the-testing-tool)
- [Anatomy of a test - AAA](#anatomy-of-a-test---aaa)
- [Example Project](#example-project)
- [Code coverage](#code-coverage)
- [Returning to the example](#returning-to-the-example)
- [How to write easy to test code](#how-to-write-easy-to-test-code)
## Why write tests?
There are many reasons to write tests; the most important ones are listed below.
Unit testing:
1. gives higher confidence that the code works as expected
2. provides a safety net when redesigning/refactoring the code base
3. is more time efficient than testing manually
4. improves code quality → writing easy to test code leads to more modular code
5. simplifies debugging → when a bug is introduced, the failing test points to the function that produces it
6. improves teamwork → tests are a good entry point for getting to know a code base, and when a project is handed over they show the next person what works
## Setting up a testing framework in Python
A Python project is usually set up in one of the following two layouts:
**flat-Layout**: All the source code is located in a folder with the same name as the project:
```
<project_name>
└───<project_name>
│   │   __init__.py
│   │   module1.py
│   │
│   └───subfolder1
│       │   __init__.py
│       │   module2.py
│       │   ...
└───tests
│   │   (__init__.py)
│   │   test_module1.py
│   │
│   └───subfolder1
│       │   (__init__.py)
│       │   test_module2.py
│       │   ...
│   README.md
│   pyproject.toml / setup.py
```
**src-Layout**: The folder with all the source code is inside another folder called src.
```
<project_name>
└───src
│   └───<project_name>
│       │   __init__.py
│       │   module1.py
│       │
│       └───subfolder1
│           │   __init__.py
│           │   module2.py
│           │   ...
└───tests
│   │   test_module1.py
│   │
│   └───subfolder1
│       │   test_module2.py
│       │   ...
│   README.md
│   pyproject.toml / setup.py
```
The differences between these approaches are discussed [here](https://packaging.python.org/en/latest/discussions/src-layout-vs-flat-layout/) and [here](https://blog.ionelmc.ro/2014/05/25/python-packaging/#the-structure). The src-Layout requires the project to be installed, which in turn requires a setup.py or pyproject.toml file. If the project is installed, the \_\_init\_\_.py files in the test folders are unnecessary. Leaving them out of the test folders is generally recommended for the flat-Layout as well. Because of its benefits, the src-Layout is used in the following examples.
If pyproject.toml is used, the following has to be defined inside the file:
``` toml
[project]
name = "<project_name>"
version = "1.0.0" # if the package is not intended to be published the version number doesn't matter, but it must be specified and must have the following format "X.X.X" where X is an integer
[tool.setuptools.packages.find]
where = ["src"]
```
If setup.py is used:
``` python
from setuptools import setup, find_packages
setup(
    name="<project_name>",
    version="1.0.0",
    package_dir={"": "src"},
    packages=find_packages(where="src"),
)
```
Before installing the project a virtual environment should be created:
Windows:
```bash
py -m venv venv
```
Mac:
```bash
python3 -m venv venv
```
After creating the virtual environment it needs to be activated:
Windows:
```bash
venv\Scripts\activate
```
Mac:
```bash
source venv/bin/activate
```
Now the following command can be run to install the project as an editable package.
```bash
pip install -e .
```
## Choosing the testing tool
There are two main tools in Python for writing automated tests:
1. [unittest](https://docs.python.org/3/library/unittest.html)
   - part of the standard library
   - test functions are methods inside classes
2. [pytest](https://docs.pytest.org/)
   - third-party package
   - test functions can be defined outside classes
Since pytest requires less boilerplate code compared to unittest and is simpler to use, it will be used in the following examples. It can be installed with the following command:
```bash
pip install pytest
```
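To make the difference in boilerplate concrete, the following sketch writes a single test in the `unittest` style using only the standard library. The `add` function, the `TestAdd` class and the `run_suite` helper are illustrative stand-ins defined just for this snippet. With pytest the same check reduces to a plain function containing one `assert`.

```python
import unittest


def add(num1, num2):
    """Illustrative stand-in for the function under test."""
    return num1 + num2


class TestAdd(unittest.TestCase):
    """unittest requires test methods to live inside a TestCase subclass."""

    def test_add(self) -> None:
        # unittest uses assert* helper methods instead of the bare assert keyword
        self.assertEqual(add(2, 2), 4)


def run_suite() -> unittest.TestResult:
    """Load and run the tests of TestAdd programmatically."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestAdd)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Such a file would normally be run with `python -m unittest`; `run_suite` only exists so that the snippet can be executed directly.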
## Anatomy of a test - AAA
A test usually consists of the following four steps, of which steps 2 and 3 are the core:
1. **Arrange**: preparation for everything necessary to run the test e.g. connection to database, creation of file,...
2. **Act**: call the function to be tested
3. **Assert**: compare output with expected output
4. **(Cleanup)**: clean up after the test is done, so no other tests are influenced by it
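The four steps can be sketched as a single test. The `count_lines` function below is a hypothetical function under test that exists only for this illustration; the temporary file stands in for any resource that has to be prepared and removed again.

```python
import os
import tempfile


def count_lines(path: str) -> int:
    """Hypothetical function under test: count the lines of a text file."""
    with open(path, encoding="utf-8") as file:
        return sum(1 for _ in file)


def test_count_lines() -> None:
    # Arrange: create a temporary file with known content
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w", encoding="utf-8") as file:
        file.write("a\nb\nc\n")
    try:
        # Act: call the function to be tested
        result = count_lines(path)
        # Assert: compare the output with the expected output
        assert result == 3
    finally:
        # Cleanup: remove the file so that no other test is influenced
        os.remove(path)
```

Putting the cleanup in a `finally` block is one possible design choice; it ensures that the file is removed even if the assertion fails.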
## Example Project
To demonstrate all the different steps an example package is written that can perform different math operations. The project structure looks as follows:
```
math_operations
└───src
│   └───math_operations
│       │   __init__.py
│       │   basic.py
└───tests
│   │   test_basic.py
│   │
│   README.md
│   pyproject.toml
```
The pyproject.toml file is configured as follows:
```toml
[project]
name = "math_operations"
version = "1.0.0"
[tool.setuptools.packages.find]
where = ["src"]
```
In basic.py a simple add function is written:
```python
from typing import Union

def add(num1: Union[int, float], num2: Union[int, float]) -> Union[int, float]:
    return num1 + num2
```
To test this function, a test function is added in test_basic.py:
```python
from math_operations.basic import add

def test_add() -> None:
    assert add(2, 2) == 4
```
To access the add function it needs to be imported first. This can be done without adding an \_\_init__.py file in the test directory, since the project is installed as an editable package. In this function just two steps of the four possible steps are performed:
- Act &rarr; the function to be tested (the add function) is called
- Assert &rarr; the output of the function is compared with the expected output of 4.
If the statement after the assert keyword is true, the test will pass, otherwise it will fail.
To run the tests, execute `pytest` in a terminal from the project root.
Code editors like VS Code and PyCharm can also run tests without the terminal. More information about Python testing in VS Code can be found [here](https://code.visualstudio.com/docs/python/testing), and about testing in PyCharm [here](https://www.jetbrains.com/help/pycharm/testing-your-first-python-application.html).
After running the tests from the terminal, the following message is printed:
```bash
collected 1 item
tests/test_basic.py . [100%]
=========== 1 passed in 0.05s ===========
```
The test passed.
To make the add function only accept numbers (int or float), it is slightly modified:
```python
def add(num1: Union[int, float], num2: Union[int, float]) -> Union[int, float]:
    if not isinstance(num1, (int, float)):
        raise TypeError(f"'num1' is {type(num1)}; expected float, int")
    if not isinstance(num2, (int, float)):
        raise TypeError(f"'num2' is {type(num2)}; expected float, int")
    return num1 + num2
```
Both input arguments are checked for being of type int or float. If one of them is not, a TypeError is raised.
To test the new functionality a new test is added.
```python
import pytest

def test_add_raises_type_error() -> None:
    with pytest.raises(TypeError):
        add("a", [1])
```
> **Note**: Each test should check only one specific behavior of the function under test. The name of the test function should state precisely what the test checks.
The test checks whether the add function raises a `TypeError` when an argument with the wrong type is passed. To check whether a function raises a certain exception, the `raises` context manager from pytest is used.
When the tests are run again, both pass.
## Code Coverage
To determine how much of the written add function is covered by the tests, a code coverage analysis can be performed. Code coverage is a metric for the amount of code tested by the unit tests. There are three main coverage criteria for unit tests:
- statement coverage: number of executed statements divided by the total number of statements
- branch coverage: number of executed branches divided by the total number of branches
- condition coverage: percentage of conditions that affected independently the outcome of a conditional statement (if-statement)
To illustrate the different code coverages, a code coverage analysis for the following example function will be performed:
```python
def foo(x: int, y: int) -> int:
    if x > 0 and y > 0:  # 1. statement
        z = x + y        # 2. statement
    else:
        z = 10           # 3. statement
    return z             # 4. statement
```
This function has a total of 4 statements (not counting the function declaration and the else keyword) and 2 branches. The condition of the if statement can be evaluated in 3 different combinations of True and False, depending on `x` and `y`. If both `x` and `y` are greater than 0, both comparisons are True. If `x` > 0 and `y` <= 0, the first comparison is True and the second one is False. Finally, if `x` <= 0, the first comparison is False. Because `and` short-circuits, the second comparison is not evaluated once the first one is False, so the last combination is (False, False/True).
To test this function, different inputs are passed to the function.
Test 1: `x` = 1, `y` = 1
- statement coverage = 3/4 = 75% (statement number 1,2 and 4 are covered)
- branch coverage = 1/2 = 50% (only the if branch is covered)
- condition coverage = 1/3 = 33% (both x and y are greater than 0, so both comparisons are True &rarr; combination (True, True))
Testing with another set of input arguments, the coverage increases:
Test 1-2: (`x` = 1, `y` = 1), (`x` = 0, `y` = 1):
- statement coverage = 4/4 = 100% (statement number 1,2,3 and 4 are covered)
- branch coverage = 2/2 = 100% (all branches are covered)
- condition coverage = 2/3 = 67% ((True, True), (False, True))
Adding a last test case leads to 100% for all three coverage criteria:
Test 1-3: (`x` = 1, `y` = 1), (`x` = 0, `y` = 1), (`x` = 1, `y` = 0):
- statement coverage = 4/4 = 100% (statement number 1,2,3 and 4 are covered)
- branch coverage = 2/2 = 100% (all branches are covered)
- condition coverage = 3/3 = 100% ((True, True), (False, True), (True, False))
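The manual bookkeeping for `foo` can be reproduced with a small script. This is only a sketch of the counting done above, not how a coverage tool works internally. The helper names `foo_conditions` and `coverage` are made up for this illustration, and a comparison that is skipped because of short-circuiting is recorded as `None`.

```python
def foo_conditions(x: int, y: int):
    """Return the evaluated comparisons and the branch taken by foo(x, y)."""
    first = x > 0
    # `and` short-circuits: the second comparison only runs if the first is True
    second = (y > 0) if first else None  # None marks "not evaluated"
    branch = "if" if (first and second) else "else"
    return (first, second), branch


def coverage(test_inputs):
    """Return (branch coverage, condition coverage) as fractions."""
    seen_branches = set()
    seen_conditions = set()
    for x, y in test_inputs:
        conditions, branch = foo_conditions(x, y)
        seen_conditions.add(conditions)
        seen_branches.add(branch)
    # foo has 2 branches and 3 reachable condition combinations
    return len(seen_branches) / 2, len(seen_conditions) / 3
```

Calling `coverage([(1, 1)])` gives 50% branch and 33% condition coverage, and passing all three test inputs raises both values to 100%, matching the numbers derived by hand.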
Instead of doing the analysis manually, a tool can be used to generate a coverage report. When using pytest, [pytest-cov](https://pytest-cov.readthedocs.io/en/latest/) can be used. It can be installed with the following command:
```bash
pip install pytest-cov
```
`pytest-cov` can calculate statement and branch coverage but not condition coverage. However, in most cases condition coverage does not need to be calculated, especially if the if-statements of the tested code contain only one boolean expression.
## Returning to the example
To generate a coverage report for the written test functions, the following command can be run.
```bash
pytest --cov=math_operations --cov-report=html --cov-branch
```
A folder with the name `htmlcov` is generated. It contains an `index.html` file which can be opened in a browser. Clicking on `math_operations/basic.py` shows that the second `TypeError` is never raised, because no test is provided where only the second input argument has the wrong type. To fix this, the `test_add_raises_type_error` test function is modified:
```python
def test_add_raises_type_error() -> None:
    with pytest.raises(TypeError):
        add("a", [1])
    with pytest.raises(TypeError):
        add(1, [1])
```
Running the coverage analysis again shows that 100% statement and branch coverage is achieved. However, the test function now contains duplicated code. This would be inefficient and error-prone if the function needed to be tested with many more input arguments. pytest gets around this problem by parametrizing test functions.
```python
@pytest.mark.parametrize("num1", [1, "1"])
@pytest.mark.parametrize("num2", [[1]])
def test_add_raises_type_error(num1, num2) -> None:
    # The parametrized arguments replace the hard-coded, duplicated calls
    with pytest.raises(TypeError):
        add(num1, num2)
```
All combinations of num1 and num2 are passed as input arguments, in this case `(1, [1])` and `("1", [1])`.
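Which argument pairs result from stacking two `parametrize` decorators can be illustrated with `itertools.product` from the standard library. This sketch only mirrors the set of combinations. It does not use pytest, and the order in which pytest actually runs the generated cases may differ.

```python
from itertools import product

num1_values = [1, "1"]
num2_values = [[1]]

# Every value of num1 is paired with every value of num2
combinations = list(product(num1_values, num2_values))

for num1, num2 in combinations:
    print(f"test_add_raises_type_error({num1!r}, {num2!r})")
```

With two values for `num1` and one value for `num2`, two test cases are generated. Adding a second value for `num2` would double that number.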
To make the first test more robust, it can be parametrized as well:
```python
@pytest.mark.parametrize("num1, num2, expected", [(2,2,4),(0.1,-10,-9.9), (0.1,0.2,0.3)])
def test_add(num1, num2, expected) -> None:
    assert add(num1, num2) == expected
```
If the tests are run again, the following is printed to the terminal:
```bash
============================= short test summary info ==============================
FAILED tests/test_basic.py::test_add[0.1-0.2-0.3] - assert 0.30000000000000004 == 0.3
```
The test fails because 0.1 + 0.2 evaluates to 0.30000000000000004, not 0.3. This shows a fundamental problem of floating point arithmetic: many decimal fractions cannot be represented exactly in binary floating point. More on this can be read [here](https://en.wikipedia.org/wiki/IEEE_754-1985).
> **Note**: Floats should **never** (whether in tests or in the code itself) be checked for exact equality.
To test for "approximate" equality, pytest provides a function:
```python
@pytest.mark.parametrize("num1, num2, expected", [(2,2,4),(0.1,-10,-9.9), (0.1,0.2,0.3)])
def test_add(num1, num2, expected) -> None:
    assert add(num1, num2) == pytest.approx(expected)
```
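For approximate comparisons in the code itself, where pytest is usually not available, the standard library provides `math.isclose`. The following sketch shows the failing exact comparison next to the tolerant one. The variable names are chosen only for this illustration.

```python
import math

total = 0.1 + 0.2

# Exact comparison fails because of the binary rounding error
exact_match = total == 0.3

# isclose compares within a relative tolerance (default rel_tol=1e-09)
close_match = math.isclose(total, 0.3)

# A relative tolerance is of no use when comparing against 0.0,
# so an absolute tolerance has to be passed explicitly in that case
near_zero_default = math.isclose(1e-12, 0.0)
near_zero_abs_tol = math.isclose(1e-12, 0.0, abs_tol=1e-9)
```

Which tolerance is appropriate depends on the magnitude of the values involved, so the defaults should be treated as a starting point.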
Now the question arises for how many inputs a function must be tested and how it can be ensured that edge cases are covered as well. This is not always easy and can be time-consuming. For some unit tests a tool called [hypothesis](https://hypothesis.readthedocs.io/en/latest/) can solve this problem. It can be installed with the following command:
```bash
pip install hypothesis
```
With `hypothesis` property-based testing can be performed. Property-based testing consists of three main steps:
1. generate many inputs matching some specification &rarr; Fuzzing
2. perform operations on the inputs
3. assert the result has some property
Property-based testing does not replace unit-tests, but extends them. It can make tests more efficient and robust.
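The three steps can be sketched by hand with the standard library. This is not how `hypothesis` is implemented. It is a deliberately simplified stand-in that draws random lists with `random` and checks properties of the built-in `sorted` function, and the helper name `check_sorting_property` is made up for this illustration. A dedicated tool additionally targets edge cases and reduces failing inputs to minimal examples.

```python
import random
from collections import Counter


def check_sorting_property(runs: int = 100, seed: int = 0) -> int:
    """Run `runs` randomized checks against sorted() and return how many passed."""
    rng = random.Random(seed)  # fixed seed keeps the run reproducible
    passed = 0
    for _ in range(runs):
        # 1. generate an input matching the specification: a list of integers
        values = [rng.randint(-1000, 1000) for _ in range(rng.randint(0, 20))]
        # 2. perform the operation under test
        result = sorted(values)
        # 3. assert properties of the result instead of a fixed expected value
        assert Counter(result) == Counter(values)
        assert all(a <= b for a, b in zip(result, result[1:]))
        passed += 1
    return passed
```

No expected output is hard-coded anywhere; the checks only state what must hold for every input.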
To create a property-based test a strategy needs to be created for the input arguments. In case of the add function inputs should have the type integer or float. For simplicity only integers will be looked at first. The test function would look as follows:
```python
from hypothesis import given
import hypothesis.strategies as st
@given(num1 = st.integers(), num2 = st.integers())
def test_add(num1, num2):
    assert add(num1, num2) == num1 + num2
```
The test function needs to be decorated with the `given` decorator. The strategies for the input arguments are passed to it as keyword arguments. In this case the inputs should be integers. Since it is not known which inputs are generated, an expected output cannot be passed to the function. Instead the output is tested for a desired property. Here the property is rather simple: the input arguments should add up to the output.
If the tests are run now, it seems that nothing has changed. To make the generated inputs visible, the test function has to be modified as follows:
```python
from hypothesis import given, Verbosity, settings
import hypothesis.strategies as st
@settings(verbosity=Verbosity.verbose)
@given(num1 = st.integers(), num2 = st.integers())
def test_add(num1, num2):
    assert add(num1, num2) == num1 + num2
```
Instead of running only `pytest`, the `-s` flag has to be added:
```bash
pytest -s
```
Now all the different inputs that are passed to the test are shown in the terminal. By default 100 inputs are generated. This can be changed by passing a different value for the `max_examples` argument of the `settings` decorator. A small snippet of the generated inputs is shown below.
```bash
tests/test_basic.py Trying example: test_add(
num1=0, num2=0,
)
Trying example: test_add(
num1=0, num2=0,
)
Trying example: test_add(
num1=27878, num2=4602116426027312825,
)
Trying example: test_add(
num1=0, num2=0,
)
Trying example: test_add(
num1=921618898953530018, num2=-21,
)
Trying example: test_add(
num1=-29772, num2=2432,
)
Trying example: test_add(
num1=-4083, num2=103974182981581506855294222177511708431,
)
Trying example: test_add(
num1=-21202785064719431102335601989547548353, num2=-31708,
)
```
As can be seen, `hypothesis` generates all kinds of inputs: very large and small numbers, negative numbers and 0.
To expand the test function to also generate floats a custom strategy is defined:
```python
@st.composite
def int_or_float(draw):
    return draw(
        st.one_of(
            st.floats(allow_infinity=False, allow_nan=False), st.integers()
        )
    )
```
This strategy generates either an integer or a float. By default the floats strategy also generates nan, inf and -inf. Since these are not wanted in this case, the corresponding arguments `allow_nan` and `allow_infinity` are set to `False`.
The test function can be modified as follows:
```python
@given(num1 = int_or_float(), num2 = int_or_float())
def test_add(num1, num2):
    assert add(num1, num2) == num1 + num2
```
The example shown is trivial. A more complex one would be a function that sorts a list containing integers and floats. A property-based test with hypothesis could look like the following:
```python
from collections import Counter
from typing import Union

@given(st.lists(int_or_float()))
def test_sort(arr: list[Union[int, float]]) -> None:
    sorted_arr = sort_array(arr)  # sort_array is the function under test
    assert isinstance(sorted_arr, list)
    assert Counter(arr) == Counter(sorted_arr)
    assert all(x <= y for x, y in zip(sorted_arr, sorted_arr[1:]))
```
First the strategy for the inputs is defined. In this case it is a list with integers and floats as its elements. Then the `sort_array` function is called. Afterwards three different properties of the output are tested. First it is checked whether the output is a list again. Then the `Counter` class from the `collections` module is used to check whether the sorted list contains exactly the same elements as the input list. Finally it is checked whether the list is in ascending order.
## How to write easy to test code
The previous sections showed how to set up a testing framework and how to write tests. The remaining question is how to write code that is easy to test. Two things should be considered when writing code:
1. functions should perform only one task
2. functions should be "pure functions"
A pure function is a function that returns identical output for identical inputs and has no side effects. This means that the output does not depend on global variables and that no print statements or other I/O actions are performed. Since side effects are often needed, it is not possible to write only pure functions. Instead, pure functions should be separated from impure ones. The following example illustrates this:
```python
def display_user_info(name: str, age: int) -> None:
    print(f"Name: {name}\nAge: {age}")
```
The `display_user_info` function does two things: it formats the user info and prints it to the terminal. This makes it hard to test, because it mixes pure functionality with impure functionality. The function can be rewritten as follows:
```python
def format_user_info(name: str, age: int) -> str:
    return f"Name: {name}\nAge: {age}"

def display_user_info(user_info: str) -> None:
    print(user_info)
```
The code has become more modular and is easier to test.
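The difference in testing effort can be sketched as follows. The two functions are repeated so that the snippet is self-contained. The test names and the sample values are chosen only for this illustration, and the output is captured with `contextlib.redirect_stdout` from the standard library as one possible approach.

```python
import io
from contextlib import redirect_stdout


def format_user_info(name: str, age: int) -> str:
    return f"Name: {name}\nAge: {age}"


def display_user_info(user_info: str) -> None:
    print(user_info)


def test_format_user_info() -> None:
    # Pure function: comparing the return value is enough
    assert format_user_info("Ada", 36) == "Name: Ada\nAge: 36"


def test_display_user_info() -> None:
    # Impure function: stdout has to be captured to observe the side effect
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        display_user_info("Name: Ada\nAge: 36")
    assert buffer.getvalue() == "Name: Ada\nAge: 36\n"
```

The test of the pure function needs no setup at all, whereas the impure function requires extra machinery just to observe what it did. After the split, that machinery is only needed for a function that contains a single `print` call.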