1. [How to develop and train a CNN component using EMADL2CPP](#nn)
2. [How to build and run the app](#app)

# Development and training of a CNN component using EMADL2CPP

## Prerequisites

* Linux. Ubuntu Linux 16.04 and 18.04 were used during testing.
* Deep learning backend:
    * MXNet
        * training - the generated code is Python. Required is Python 2.7 or higher together with the Python packages `h5py` and `mxnet` (for training on the CPU) or e.g. `mxnet-cu75` for CUDA 7.5 (for training on a GPU with CUDA; the concrete package should be selected according to the CUDA version). Follow the [official instructions on the MXNet site](https://mxnet.incubator.apache.org/install/index.html?platform=Linux&language=Python&processor=CPU)
        * prediction - the generated code is C++. Install MXNet for C++ using the [official instructions on the MXNet site](https://mxnet.incubator.apache.org)

### HowTo

1. Define an EMADL component containing the architecture of a neural network and save it in a `.emadl` file. For more information on the architecture language please refer to the [CNNArchLang project](https://git.rwth-aachen.de/monticore/EmbeddedMontiArc/languages/CNNArchLang). An example of an NN architecture:

```
component VGG16{
    ports in Z(0:255)^{3, 224, 224} image,
          out Q(0:1)^{1000} predictions;

    implementation CNN {
        def conv(filter, channels){
            Convolution(kernel=(filter,filter), channels=channels) ->
            Relu()
        }
        def fc(){
            FullyConnected(units=4096) ->
            Relu() ->
            Dropout(p=0.5)
        }
        image ->
        conv(filter=3, channels=64, ->=2) ->
        Pooling(pool_type="max", kernel=(2,2), stride=(2,2)) ->
        conv(filter=3, channels=128, ->=2) ->
        Pooling(pool_type="max", kernel=(2,2), stride=(2,2)) ->
        conv(filter=3, channels=256, ->=3) ->
        Pooling(pool_type="max", kernel=(2,2), stride=(2,2)) ->
        conv(filter=3, channels=512, ->=3) ->
        Pooling(pool_type="max", kernel=(2,2), stride=(2,2)) ->
        conv(filter=3, channels=512, ->=3) ->
        Pooling(pool_type="max", kernel=(2,2), stride=(2,2)) ->
        fc() ->
        fc() ->
        FullyConnected(units=1000) ->
        Softmax() ->
        predictions
    }
}
```

2. Define a training configuration for this network and store it in a `.cnnt` file; the name of the file should be the same as that of the corresponding architecture (e.g. `VGG16.emadl` and `VGG16.cnnt`). For more information on the training language please refer to the [CNNTrainLang project](https://git.rwth-aachen.de/monticore/EmbeddedMontiArc/languages/CNNTrainLang). An example of a training configuration:

```
configuration VGG16{
    num_epoch:10
    batch_size:64
    normalize:true
    load_checkpoint:false
    optimizer:adam{
        learning_rate:0.01
        learning_rate_decay:0.8
        step_size:1000
    }
}
```

3. Generate GPL code for the specified deep learning backend using the jar package of the EMADL2CPP generator. The generator accepts the following command line parameters:
* `-m` path to the directory with the EMADL models
* `-r` name of the root model
* `-o` output path
* `-b` backend

Assume both the architecture definition `VGG16.emadl` and the corresponding training configuration `VGG16.cnnt` are located in a folder `models` and the target code should be generated into a `target` folder using the `MXNet` backend. An example command is then:

```
java -jar embedded-montiarc-emadl-generator-0.2.4-SNAPSHOT-jar-with-dependencies.jar -m models -r VGG16 -o target -b MXNET
```

You can find the EMADL2CPP jar [here](doc/embedded-montiarc-emadl-generator-0.2.4-SNAPSHOT-jar-with-dependencies.jar).

4. When the target code is generated, the corresponding trainer file (e.g. `CNNTrainer_.py` in the case of MXNet) can be executed; a sketch of this step follows below.
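Before running the trainer, the Python prerequisites listed above have to be installed. A minimal sketch, assuming training on the CPU:

```
# Packages listed under Prerequisites; swap mxnet for a CUDA-specific
# package such as mxnet-cu75 when training on a GPU with CUDA 7.5.
pip install h5py mxnet
```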
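The generated trainer can then be executed with Python from the output folder. The file name below is a hypothetical placeholder for the VGG16 example; use the trainer script that was actually generated for your root model:

```
cd target
# CNNTrainer_VGG16.py is a hypothetical name based on the VGG16
# example; check the generator output for the actual trainer file.
python CNNTrainer_VGG16.py
```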
# Building and running an application for TORCS

## Prerequisites

1. Linux. Ubuntu Linux 16.04 and 18.04 were used during testing.
2. ROS, a Java runtime environment, GCC/Clang and Armadillo - install them using your Linux distribution's tools, e.g. apt in Ubuntu: `apt-get install ros-base-dev clang openjdk-8-jre libarmadillo-dev`
3. MXNet - install for C++ using the [official instructions at the MXNet website](https://mxnet.incubator.apache.org/)
4. TORCS (see below)

### TORCS Installation

1. Download the customized TORCS distribution from the [DeepDriving site](http://deepdriving.cs.princeton.edu/)
2. Unpack the downloaded archive and navigate to the `DeepDriving/torcs-1.3.6` directory
3. Compile and install by running `./configure --prefix=/opt/torcs && make -j && make install && make datainstall`
4. Remove the original TORCS tracks and copy the customized tracks: `rm -rf /opt/torcs/share/games/torcs/tracks/* && cp -rf ../modified_tracks/* /opt/torcs/share/games/torcs/tracks/`
5. Start TORCS by running `/opt/torcs/bin/torcs`

Further installation help can be found in the Readme file provided with the DeepDriving distribution.

### TORCS Setup

1. Run TORCS
2. Configure the race:
    1. Select Race -> Quick Race -> Configure Race
    2. Select one of the maps with the chenyi- prefix and click Accept
    3. Remove all drivers from the Selected section on the left by selecting every driver and clicking (De)Select
    4. Select the driver chenyi on the right side and add it by clicking (De)Select
    5. Add other drivers with the chenyi- prefix if needed
    6. Click Accept -> Accept -> New Race

    Example of a drivers configuration screen:

    ![Drivers](doc/torcs_Drivers.png)

3. Use the keys `1-9` and `M` to hide all the widgets from the screen
4. Use the `F2` key to switch between camera modes and select a mode in which the car or its parts are not visible
5. Use the `PgUp`/`PgDown` keys to switch between cars and select `chenyi` - the car that does not drive on its own

## Code generation and running the project

1. Download and unpack the [archive](doc/deep_driving_project.zip) that contains all EMA and EMADL components of the application
2. Run the `generate.sh` script. It generates the code into the `target` folder, copies the handwritten part of the project (communication with TORCS via shared memory) as well as the weights of the trained CNN, and finally builds the project
3. Start TORCS and configure the race as described above. Select a camera mode in which the host car is not visible
4. Go to the `target` folder and start the `run.sh` script. It opens three terminals: one for the ROS core, one for the TORCSComponent (the application part responsible for the communication with TORCS) and one for the Mastercomponent (the application part generated from the models in step 2, which is responsible for the application logic). A hypothetical end-to-end session is sketched below.
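A hypothetical end-to-end session covering steps 2-4 above, assuming the archive was unpacked into a folder named `deep_driving_project` (the folder name is an assumption, not part of the archive's documented layout):

```
# Step 2: generate the code, copy the handwritten parts and the CNN
# weights, and build the project.
cd deep_driving_project
./generate.sh
# Step 3: start TORCS in the background and configure the race as
# described in the TORCS Setup section above.
/opt/torcs/bin/torcs &
# Step 4: run.sh opens three terminals (ROS core, TORCSComponent,
# Mastercomponent).
cd target
./run.sh
```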