DDT not connecting when running on multiple nodes
When we run with DDT (the Allinea parallel debugger), the usual procedure is the following: run DDT locally on the workstation (or laptop), and in the Slurm submission script prepend the ddt command to the MPI launch as follows:
module load ddt/19.0.5
ddt --connect mpirun -np N_CORES ./nemo
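For context, a minimal sketch of a complete submission script for this basic case; the job name, core count, wall clock limit and output file names here are placeholders, not necessarily what we actually used:

#!/bin/bash
#SBATCH --job-name=ddt_nemo
#SBATCH --ntasks=48
#SBATCH --time=01:00:00
#SBATCH --output=ddt_nemo.out
#SBATCH --error=ddt_nemo.err

module load ddt/19.0.5
# DDT connects back to the GUI already running on the local workstation
ddt --connect mpirun -np 48 ./nemo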
However, when trying to run with ORCA1 we had the following problem: DDT was connecting only to the first N cores (those on the first node). In the beginning I thought it was due to the Slurm directive
#SBATCH --ntasks-per-node 20
that we were using to be able to spawn N OpenMP threads per task. After some tests it turned out that the problem was neither Slurm nor OpenMP. Oriol suggested that the problem appeared when running on more than one node, and he was right.
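For the record, the hybrid MPI+OpenMP setup we were testing looked roughly like this; only the --ntasks-per-node line is taken from our actual script, while the node count, cpus-per-task value and the OMP_NUM_THREADS export are illustrative:

#SBATCH --nodes=2
#SBATCH --ntasks-per-node 20
#SBATCH --cpus-per-task=2

module load ddt/19.0.5
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}   # one OpenMP thread per reserved core
ddt --connect mpirun -np 40 ./nemo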
After some research I found here instructions on how to run on more than one node. I am documenting the steps:
- Launch DDT locally
- Remote launch the connection to the MN4 login node
- In the "Application" box, put the absolute path to your executable
- In the "Working directory" box, put the path to the run folder with the input data + namelists
- In the "Submission template file" box, I put the path to the following script (an annotated copy follows the list):
#SBATCH --nodes=NUM_NODES_TAG
#SBATCH --time=WALL_CLOCK_LIMIT_TAG
#SBATCH --job-name="ddt"
#SBATCH --output=allinea.stdout
#SBATCH --error=allinea.stdout

source /gpfs/scratch/bsc32/bsc32402/RUN/ORIOL_OpenMP/param.cfg
set_env

AUTO_LAUNCH_TAG
- Close the tab and hit the "Submit" button
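For clarity, here is the same template again with comments on what each placeholder does; the tag meanings are my understanding of the Allinea queue-template mechanism, so check the DDT user guide for the authoritative description:

#SBATCH --nodes=NUM_NODES_TAG          # DDT replaces this with the node count from its run dialog
#SBATCH --time=WALL_CLOCK_LIMIT_TAG    # DDT replaces this with the wall clock limit from its run dialog
#SBATCH --job-name="ddt"
#SBATCH --output=allinea.stdout
#SBATCH --error=allinea.stdout

source /gpfs/scratch/bsc32/bsc32402/RUN/ORIOL_OpenMP/param.cfg   # environment setup specific to this run
set_env

AUTO_LAUNCH_TAG                        # DDT replaces this with the command that launches the job under its control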
The script is submitted to the "normal" queue, so change the queue manually from the MN4 login node if you don't want to wait in the queue for a long time.
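If the job is already pending, the QOS can usually be changed without resubmitting, assuming "debug" is the right QOS name on MN4 and you are allowed to use it:

scontrol update jobid=<JOBID> qos=debug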
Changing the submission template by adding a line that specifies the debug queue would probably change this behavior.
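A minimal sketch of that change, assuming MN4 selects the debug queue via --qos (worth verifying in the MN4 user guide), would be to add to the template:

#SBATCH --qos=debug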


