Palette Transport System Control
This is an implementation of the control component concept that uses the Unity ML-Agents Toolkit for reinforcement learning in a Unity simulation of a highly simplified palette transport system (PTS).
Prerequisites
- Test system used:
- Unity 2019.3.11f1
- macOS 10.15.6
- python 3.8.0
- MLAgents: Release 12
- com.unity.ml-agents 1.7.2
- ml-agents 0.23.0
- ml-agents-envs 0.23.0
- Communicator API 1.3.0
- (PyTorch 1.7.1)
- Install Unity and Python according to the Unity ML-Agents docs
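Since the setup depends on matching Python package versions, a small check can help before training. The following is a minimal sketch, not part of the repository: it assumes the PyPI distribution names `mlagents`, `mlagents-envs` and `torch` for the packages listed above, and the helper names `installed_version` and `find_mismatches` are made up for illustration.

```python
from importlib import metadata

# Versions taken from the test system list above. The distribution names
# are the PyPI names and are an assumption of this sketch.
EXPECTED = {
    "mlagents": "0.23.0",
    "mlagents-envs": "0.23.0",
    "torch": "1.7.1",
}


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(expected, lookup=installed_version):
    """Map each package whose version differs to (expected, found)."""
    mismatches = {}
    for name, wanted in expected.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(EXPECTED).items():
        print(f"{name}: expected {wanted}, found {found or 'not installed'}")
```

Run it inside the activated Python environment; no output means the three packages match the versions listed above. Other versions may work as well, the check only reports differences from the tested setup.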
Heuristic: How to test the PTS environment
- Open PTSSim Project in Unity
- Open the Test scene in Unity: Assets/Scenes/Test.unity
- Click Play
- Use the arrow keys to trigger the palette's actions
- left/right
- up/down on current conveyor
- Hold the y or x key together with the up or down arrow key to slide the left or right shift conveyor up or down (relative to the palette position)
- Alternatively, press a number key from 1 to 8
Training: How to train a model
- Open PTSSim Project in Unity
- Open the Training scene in Unity: Assets/Scenes/Training.unity
- Open a terminal in this folder:
- Activate the ml-agents environment according to your Python installation
- For example:
conda activate ml-agents
or
source ~/python-envs/mlagents/bin/activate
- (Optional) Start TensorBoard:
tensorboard --logdir=results/ --port=6006 &
- Start a training run:
mlagents-learn trainer_config.yaml --run-id <<uniqueNameForTraining>>
- In Unity
- Click Play
- Watch the training process in Unity or tensorboard
- Display 1 to Display 3 allow for switching between single PTS, arena or agent view
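Each training run needs its own run id. One way to avoid name clashes is to derive the id from the current time. The sketch below only assembles the `mlagents-learn` call from the steps above; the `pts` prefix and the helper names `make_run_id` and `build_command` are hypothetical.

```python
import shlex
from datetime import datetime


def make_run_id(prefix="pts"):
    """Build a run id such as 'pts-20220131-142500' (prefix is hypothetical)."""
    return f"{prefix}-{datetime.now():%Y%m%d-%H%M%S}"


def build_command(run_id, config="trainer_config.yaml"):
    """Return the mlagents-learn call from the steps above as an argument list."""
    return ["mlagents-learn", config, "--run-id", run_id]


if __name__ == "__main__":
    # Only prints the command; paste it into the terminal with the
    # ml-agents environment activated, then click Play in Unity.
    print(shlex.join(build_command(make_run_id())))
```

The returned list can also be passed to `subprocess.run` if the training should be started from a script instead of the terminal.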
Inference: How to test a trained model
- Open PTSSim Project in Unity
- Open the Run scene in Unity: Assets/Scenes/Run.unity
- Navigate to the Palette game object in the hierarchy view: PTS/PA01
- Place your trained model (from results/<>/PalettePathFinder.onnx) in the Model property of the Behavior Parameters component (in the inspector view).
- An example for a trained model can be found in Assets/Brains/Example.onnx
- Click Play
- Display 2 shows a view from the palette perspective
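After several training runs it can be tedious to locate the most recent model file. The following sketch lists candidate models, newest first. It assumes that each run writes its model into its own subfolder directly below `results/`; the helper name `find_trained_models` is made up for illustration.

```python
from pathlib import Path


def find_trained_models(results_dir="results", model_name="PalettePathFinder.onnx"):
    """Return matching model files one level below results_dir, newest first."""
    root = Path(results_dir)
    if not root.is_dir():
        return []
    candidates = root.glob(f"*/{model_name}")
    # Sort by modification time so the latest training result comes first.
    return sorted(candidates, key=lambda p: p.stat().st_mtime, reverse=True)


if __name__ == "__main__":
    for path in find_trained_models():
        print(path)
```

The first printed path is the most recently written model, which can then be copied into the Unity project and assigned to the Model property as described above.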
AT-Paper on Complexity Considerations
This repository was used to show how the choice of control task influences the outcome, considering the operational equipment measure model, in the following journal contribution:
- Title: Assessment of Reinforcement Learning Applications for Industrial Control Based on Complexity Measures
- Authors: Julian Grothoff, Nicolas Camargo Torres and Tobias Kleinert
- Journal: at - Automatisierungstechnik (ISSN: 2196-677X)
- Date: Jan. 2022
- DOI: 10.1515/auto-2021-0118
Scenarios
As shown below, four different scenarios for the control task were defined in the paper:
Scenarios 2 and 3 were realised. To this end, scenes, PTS environment prefabs, operation mode scripts and trainer configurations were added, along with some code changes: 6e9433b2...e6e13abb. The setup of the PTS training environment was changed to support the independent handling of RL agents and palettes with control components. Moreover, a script representing the physical palette was introduced that handles the physical positioning of the palettes.
Scenario 2: Procedure for each Coil
In scenario 2, the plant-oriented ML palette control operation mode was moved from the palette control component, e.g. GCU 1 in the figure above, to its own control component representing the product-related measure (procedure) to position a coil. Hence, some default operation modes for the palette (LEFT, RIGHT, UP, DOWN) were implemented in the control components for palettes, as depicted in the following figure. The regular operation modes LEFT and RIGHT also check whether the next conveyor is a shift table and move it to the right position, as in the "real" process control application, realized with ACPLT/RTE in acplt.
As a consequence, a control component for each coil, with one operation mode that positions the palette using the regular operation modes, is dynamically created based on the parameterization of the PTS environment. To run or train scenario 2, use the corresponding scenes.
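The behaviour of the regular operation modes described above can be sketched in a few lines. This is a simplified illustration in Python, not the repository's Unity implementation: the classes `Conveyor` and `PaletteControlComponent`, the grid model with `column` and `row`, and the `execute` method are all assumptions made for this example.

```python
from dataclasses import dataclass
from enum import Enum


class OperationMode(Enum):
    LEFT = "LEFT"
    RIGHT = "RIGHT"
    UP = "UP"
    DOWN = "DOWN"


@dataclass
class Conveyor:
    """Hypothetical conveyor; a shift table can slide to another row."""
    is_shift_table: bool = False
    row: int = 0


@dataclass
class PaletteControlComponent:
    """Hypothetical stand-in for a palette control component."""
    conveyors: list
    column: int = 0
    row: int = 0

    def execute(self, mode):
        """Run one regular operation mode; return False if it is not possible."""
        if mode in (OperationMode.UP, OperationMode.DOWN):
            # Move along the current conveyor (no limits in this sketch).
            self.row += 1 if mode is OperationMode.UP else -1
            return True
        target = self.column + (1 if mode is OperationMode.RIGHT else -1)
        if not 0 <= target < len(self.conveyors):
            return False
        next_conveyor = self.conveyors[target]
        if next_conveyor.is_shift_table:
            # Bring the shift table to the palette's row before handing over.
            next_conveyor.row = self.row
        self.column = target
        return True
```

In this sketch a coil-level control component would only issue `execute` calls with one of the four modes, so the shift table handling stays hidden inside the palette's control component, which mirrors the separation described above.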
Scenario 3: Procedure for all Coils
For scenario 3, a single RL agent controls all palettes in one PTS environment. It is realized as a measure (procedure) control component with one operation mode.
To run or train scenario 3, use the corresponding scenes.