The former scripts, which used the srun option --ntasks=N, often led to sub-optimal MPI process placement on the cluster. An optimised version is now available; see the documentation for details.
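As a rough illustration of the difference, the sketch below requests an explicit node and per-node task layout instead of a bare task count. It is not the optimised version mentioned above, only a generic example built from standard Slurm options. The node count, the tasks per node and the application name ./my_mpi_app are assumptions chosen for illustration.

```shell
#!/bin/bash
# Illustrative sketch only: the values below are assumptions, not site settings.
#SBATCH --nodes=2              # assumed number of nodes
#SBATCH --ntasks-per-node=4    # assumed number of MPI ranks per node

# With only --ntasks=N, Slurm decides how ranks are spread over nodes and cores.
# Stating the layout explicitly makes the placement predictable:
#   --cpu-bind=cores            binds each rank to a core
#   --distribution=block:block  fills nodes, then sockets, with consecutive ranks
# ./my_mpi_app is a hypothetical application name.
srun --ntasks-per-node=4 --cpu-bind=cores --distribution=block:block ./my_mpi_app
```

Submitted with sbatch, this would start 8 ranks, 4 on each node. The right layout depends on the cluster hardware and the application, so the site documentation takes precedence over this sketch.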