Changes in Example

Commit 698fa096 authored 3 months ago by Ann-Kathrin Margarete Edrich
--- a/docs/source/example-plain.rst
+++ b/docs/source/example-plain.rst
@@ -303,17 +303,5 @@ Sure! Here’s your long table converted into an HTML table format with left-ali
                <td class="col-10">Bool</td>
                <td>True</td>
            </tr>
-            <tr>
-                <td>keep_cat_features</td>
-                <td class="col-50">True if instances in the input dataset without categorical class information shall be kept and proceeded as intended, else False</td>
-                <td class="col-10">Bool</td>
-                <td>True</td>
-            </tr>
-            <tr>
-                <td>remove_instances</td>
-                <td class="col-50">True if instances in the input dataset without categorical class information shall be removed and marked with the no data value in the final map, else False</td>
-                <td class="col-10">Bool</td>
-                <td>False</td>
-            </tr>
        </tbody>
    </table>
--- a/docs/source/example.rst
+++ b/docs/source/example.rst
@@ -225,23 +225,6 @@ Susceptibility map generation
   the mapping, they need to be removed. The feature names need to be provided in a comma-separated way without any spaces.
   Then we also need to provide the **Name of the label column in the training dataset**.
   
-   An important decision is made under **How to treat mismatching categories**. If one-hot encoding is used for training
-   and prediction dataset generation, then it might be that the categories might be mismatching between the two input datasets.
-   In the case that not all features that are contained in the training dataset are contained in the prediction dataset, the 
-   mapping process is aborted and the model is automatically retrained. Before retraining, the mismatching features are
-   removed from the training dataset. The results are stored in a separate folder named *<old folder name>_retrain*. 
-   In the case that there are more features contained in the prediction dataset than in the training dataset, the mismatching
-   features in the prediction dataset are automatically removed before mapping. Furthermore, an identical order of the features
-   between the input datasets is ensured. If one-hot encoded feature classes are removed from the prediction dataset there will
-   be instances is the prediction dataset, i.e. individual locations within the area of interest, which are described by
-   no feature class still contained in the prediction dataset. This means that the value for this feature for all classes 
-   still contained in the prediction dataset is
-   0. Under **How to treat mismatching categories**, when choosing **Keep instances of mismatching classes**, these instances
-   are kept in the prediction dataset and are included in the mapping in the same way as all the other locations. When
-   choosing **Remove instances of mismatching classes**, these instances are handled in the same way as the locations where
-   at least one feature contains a no data value and they will not be included in the mapping. As we are using ordinal encoding 
-   in this example, this decision of reduced importance for the moment.
-   
   Finally, the Random Forest needs to be defined regarding the number of trees, depth of the trees and the evaluation criterion.  
   For more information, see the documentation for scikit learn's `Random Forest Classifier <https://scikit-learn.org/dev/modules/generated/sklearn.ensemble.RandomForestClassifier.html>`_.
   Provide the **Size of the test dataset (0...1)** as well. For the values chosen in this example, refer to the image above.
@@ -275,10 +258,10 @@ Final map, output files and validation information
   
       
   Each of the previously described steps habe their own input files, which have been discussed and are described in the user manual.
-   When checking the folder of the training and prediction dataset were generated as well as the folder where training and prediction results are stored, it can be seen that
+   When checking the folder of the training and prediction dataset as well as the folder where training and prediction results are stored, it can be seen that
   several new files were created.

-   **Beware!:** The files produced in each run depend also on the chosen options, e.g. regarding compilation strategy of the training dataset.
+   **Beware!:** The files produced in each run depend also on the chosen options, e.g., regarding compilation strategy of the training dataset.

   Most of the files are intended to support transparency and reusability.