Corrections - Tom
- Tom Reclik authored
The data are often included as demo datasets in various data science packages and are also available on public repositories, such as the [Iris Data Set](https://archive.ics.uci.edu/ml/datasets/iris) entry on the UCI Machine Learning Repository. In this case, we use the copy from the [data archive in Seaborn](https://github.com/mwaskom/seaborn-data), which is a copy of the UCI data with some added information, such as a description in the data of what the columns mean.
As a first step, we read the contents of the file and store the data in a new dataframe. As mentioned when we first worked with files in Python, we do not *usually* do this manually, but use one of the many convenient functions that are already provided. In our case, Pandas can read CSV files with the [read_csv](https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html) function.
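As a minimal sketch of this step, the snippet below reads CSV text into a dataframe with `read_csv`. To keep it runnable without the file, it uses a three-row in-memory excerpt; the variable names `SAMPLE_CSV` and `iris_sample` are illustrative, and the column names are assumed to match the header of Seaborn's `iris.csv`.

```python
import io

import pandas as pd

# Three rows in the assumed layout of Seaborn's iris.csv (header row plus
# values). This is an in-memory stand-in so the sketch runs without the file.
SAMPLE_CSV = """sepal_length,sepal_width,petal_length,petal_width,species
5.1,3.5,1.4,0.2,setosa
7.0,3.2,4.7,1.4,versicolor
6.3,3.3,6.0,2.5,virginica
"""

# read_csv accepts a path, a URL, or a file-like object. For the real data,
# pass the location of your copy of iris.csv instead of the StringIO buffer.
iris_sample = pd.read_csv(io.StringIO(SAMPLE_CSV))

# The header row becomes the column names; numeric columns are parsed as floats.
print(iris_sample.head())
print(iris_sample.dtypes)
```

Because the first line of the file holds the column names, `read_csv` needs no extra arguments here.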
```
```
Here, Seaborn is more convenient to use: Matplotlib can produce the same plot, but it takes more work. With the `hue` parameter, Seaborn splits the histogram by the type of flower in our data, draws each type in its own colour, and adds a legend showing which colour belongs to which type.
```
```