    ## This small guide explains how to install and correctly run the parallel version of FlameMaster
    ## on the RWTH Cluster, in particular on the backend nodes.
    ## Access to both FlameMaster and eglib (https://git.rwth-aachen.de/ITV/eglib) repositories is required.
    ## Ask me if you are interested in getting access. 
    ## All members of the ITV-RWTH-GitLab group have access, and no further action is required for them.
    
    ## To check out and compile the parallel version of FlameMaster on the
    ## RWTH Cluster, the following steps are required:
    
    cd your/parallel/FlameMaster
    git clone git@git.rwth-aachen.de:ITV/FlameMaster.git Repository --branch jl_dco_activated
    cd Repository/src/libraries/
    git clone git@git.rwth-aachen.de:ITV/eglib.git
    cd ../../../
    mkdir -p Build && cd Build
    
    ## The following installation uses the Intel compilers and Intel MKL.
    ## For the RWTH Cluster, this is the preferred configuration (other compilers can be used,
    ## but the performance is lower).
    
    ## Unload modules for safety
    module unload gcc
    module unload intel
    module unload clang
    
    ## Load Intel, Eigen, and GCC 8
    module load intel Eigen/3.3.7 GCC/8.3.0
    
    ## Set up the Intel oneAPI environment variables
    source /opt/intel/oneAPI/2023.0/setvars.sh
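    
    ## Optional sanity check (not part of the original instructions): verify that the Intel
    ## compilers are available in the current environment before configuring. If any of these
    ## commands fails, re-check the "module load" and setvars.sh steps above.
    which icpc icc ifort
    icpc --version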
    
    
    ## CMake command with the correct flags for the Broadwell nodes (formerly "citv 5-7").
    ## The setup has been tested on the Broadwell nodes as well as on the frontend.
    ## The compiled executables do not work on Ivy Bridge machines.
    CXX=icpc CC=icc FC=ifort cmake ../Repository \
        -DCMAKE_BUILD_TYPE=Release \
        -DEIGEN_INTEGRATION=ON \
        -DCOMBUSTION_LIBS=ON \
        -DCMAKE_CXX_FLAGS_RELEASE="-Ofast -ffast-math -DNDEBUG -march=broadwell -mtune=broadwell -funroll-all-loops -qopt-multi-version-aggressive -ipo -parallel" \
        -DCMAKE_C_FLAGS_RELEASE="-Ofast -ffast-math -DNDEBUG -march=broadwell -mtune=broadwell -funroll-all-loops -qopt-multi-version-aggressive -ipo -parallel" \
        -DCMAKE_Fortran_FLAGS_RELEASE="-Ofast -DNDEBUG -march=broadwell -mtune=broadwell -funroll-all-loops -qopt-multi-version-aggressive -ipo -parallel" \
        -DFAST_COLLISION_INTEGRAL=ON \
        -DINSTALL_SUNDIALS=ON \
        -DSUNDIALS_LAPACK=ON
    
    ## Compile and install
    make -j12 install
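    
    ## Optional check, assuming the default install prefix puts the executable in ../Bin/bin
    ## relative to the Build directory (consistent with the FlameMan path used further below).
    ## If the build and installation succeeded, the binary should be listed here:
    ls -l ../Bin/bin/FlameMan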
    
    ###################################################################
    ## BENCHMARK RUN TO CHECK IF THE CODE HAS BEEN PROPERLY COMPILED ##
    ###################################################################
    
    # It is STRONGLY advised NOT to run on the frontend. 
    # Always use the backend nodes (in this case the "broadwell" nodes) through SLURM 
    
    ## Backend test ##
    
    # Setup for the test run 
    # Follow these steps:
    
    # Generate the required mechanisms
    cd /home/YOUR_TIMID/your/parallel/FlameMaster/Run/ScanMan/
    bash CreateAllMechanisms.bash
     
    # Test run directory
    cd ../../Run/FlameMan/Diff/SteadyPlugFlow/Wullenkord
    
    ## To run the test on the backend, copy the template submission script into the current directory
    ## with the following command. To get the correct access to the ITV nodes, use this
    ## script, which contains the latest options for running on our computational nodes.
    cp /home/itv/SLURM_submission_scripts/SlurmScript_FM .
    
    ## Modify line 76 ('exe') and line 79 ('arg') of the script with the correct paths of your
    ## FlameMaster executable and Data directory,
    ## e.g. exe='/home/YOUR_TIMID/your/parallel/FlameMaster/Bin/bin/FlameMan'
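    
    ## Optional helper: print the relevant part of the copied submission script to see the lines
    ## that need editing (the line numbers refer to SlurmScript_FM, not to this guide; the range
    ## shown here is only a rough window around lines 76 and 79).
    sed -n '74,81p' SlurmScript_FM
    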
    ## run the script:
    
    sbatch SlurmScript_FM
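    
    ## Optional: monitor the job with standard SLURM commands (generic SLURM usage, not specific
    ## to this script); sbatch prints the job ID on submission.
    squeue -u $USER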
    
    ## Check the job output (job.%JOBID.out) to see whether the execution time
    ## matches the expected time for the parallel run (listed below).
    
    # Expected runtime for 12 threads (parallel)
    # ~ 16.5 sec +- 0.5 sec (RWTH cluster frontend, Broadwell E5-2650 v4 @ 2.20GHz, with 12 threads, 
    # performance tuning node)
    ## Expected runtime for 1 thread (serial)
    # ~ 73.8 sec +- 1 sec (RWTH cluster frontend, Broadwell E5-2650 v4 @ 2.20GHz, performance tuning node)
    
    ## If your runtime differs significantly (by more than a couple of seconds) from the expected
    ## parallel runtime, something might be wrong with your setup.
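    
    ## Optional: once the job has finished, the elapsed wall time can also be queried from SLURM's
    ## accounting (replace JOBID with the ID printed by sbatch); whether accounting data is
    ## available depends on the cluster configuration.
    sacct -j JOBID --format=JobID,Elapsed,State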
    
    ################# RUN OTHER CASES ######################
    
    ## Adapt the SLURM script mentioned above (/home/itv/SLURM_submission_scripts/SlurmScript_FM)
    ## to your needs (executable path, total time, mail notification, input name).
    
    ## The correct set of modules for running the parallel FlameMaster is already specified.
    ## Unlike on the frontend, setting OMP_NUM_THREADS=XX is not required,
    ## as it is handled by SLURM via the option
    ## #SBATCH --cpus-per-task=XX
    ## XX=12 is the preferred number of threads for the Broadwell nodes (formerly "citv 5-7") used at the ITV
    ## (using all 24 cores brings no additional benefit).
    ## In SLURM, "CPUs" are the physical cores; with one task per CPU (no multithreading),
    ## the number of physical cores equals the number of OpenMP threads.
    
    
    ## The value for the memory requirement (mem-per-cpu) has been set to 3 GB per thread.
    ## Change it according to your needs (if not specified, 1 GB per thread will be assigned).
    ## The maximum value is the total node memory (~128 GB for Broadwell) divided by the number of threads used.
    ## SLURM will throw an error if the requested memory exceeds the available one.
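    
    ## For orientation only, a minimal sketch of the relevant #SBATCH lines, assuming 12 threads and
    ## 3 GB per CPU as discussed above. The actual template in
    ## /home/itv/SLURM_submission_scripts/SlurmScript_FM already contains the full, tested header.
    # #SBATCH --ntasks=1
    # #SBATCH --cpus-per-task=12
    # #SBATCH --mem-per-cpu=3G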
    
    ## Please remember that this configuration has been tested only with Intel compilers and Broadwell nodes
    ## on the RWTH Cluster.
    
    ## Other machines, operating systems, compilers, and LAPACK implementations have been
    ## tested and can be used, but they may require a different CMake configuration command.
    ## Switching compilers, replacing the Intel MKL with a different LAPACK implementation,
    ## or changing the architecture-specific optimization flags with an incorrect
    ## CMake configuration can lead to incorrect simulation results.
    ## Further details are discussed here: https://git.rwth-aachen.de/-/snippets/766 .
    
    