In this repository you will find the [apptainer](https://apptainer.org/) definition files that are used to create
the Apptainer containers one can use for plasma PIC simulations on the **virgo3** cluster.
The containers are all built on top of [Rocky Linux v8.10](https://rockylinux.org/news/rocky-linux-8-8-ga-release/), which is
similar to the system used by all **virgo3** nodes.
The containers provide a self-consistent environment to perform PIC simulations using either
- [EPOCH 1d, 2d, 3d](https://github.com/Warwick-Plasma/epoch)
- [WarpX 1d, 2d, rz, 3d](https://github.com/ECP-WarpX/WarpX)
All the containers have been tested and guarantee
- full speed Infiniband interconnect communication
- Lustre filesystem optimised MPI I/O
The containers provide the **latest** versions of both the required libraries/dependencies and the PIC codes.
The latest versions are always installed in the **/dev** subdirectory, while former releases are kept in
the **/prod** and **/old** subdirectories.
The **/dev** (latest) software stack includes:
- Linux
- [RockyLinux v8.10](https://docs.rockylinux.org/release_notes/8_8/)
- [Lustre fs client v2.15.5](http://downloads.whamcloud.com/public/lustre/lustre-2.15.5/el8.10/client)
- [openMPI v5.0.5](https://www.open-mpi.org/software/ompi/v5.0/)
- [PMix v5.0.3](https://github.com/openpmix/openpmix)
- [UCX v1.17.0](https://github.com/openucx/ucx)
- [HDF5 v1.14.4](https://www.hdfgroup.org/solutions/hdf5/)
- [ADIOS2 2.10.1](https://github.com/ornladios/ADIOS2)
- PIC codes
- [EPOCH v4.19.0](https://github.com/Warwick-Plasma/epoch/releases/tag/v4.19.0)
- epoch1d,2d,3d
- epoch1d_lstr,2d_lstr,3d_lstr (increased string length)
- [WarpX v24.08](https://github.com/ECP-WarpX/WarpX)
- [OpenPMD-api v0.15.2]( https://github.com/openPMD/openPMD-api.git)
- [Python](https://www.python.org/)
- v3.6.8 (Rocky Linux 8.10 native version)
- scientific packages (numpy, matplotlib, lmfit, yt etc ...)
- `sdfutils` bindings for `Epoch`
Containers are available on `/cvmfs/phelix.gsi.de/sifs/`:
### [Epoch](https://github.com/Warwick-Plasma/epoch/releases/tag/v4.19.0) + [WarpX](https://github.com/ECP-WarpX/WarpX)
- `GCC 8.5 + python v3.6.8` (RockyLinux 8.10 system compiler+python)
- `/cvmfs/phelix.gsi.de/sifs/cpu/dev/rlx8_ompi_ucx.sif`
- `/cvmfs/phelix.gsi.de/sifs/cpu/dev/rlx8_ompi_ucx_dask.sif`
- `GCC 13.2 + python v3.12.1` (latest GNU compiler + Python versions)
- `/cvmfs/phelix.gsi.de/sifs/cpu/dev/rlx8_ompi_ucx_gcc13_py312.sif`
- `FLASH` dedicated container using the latest software stack
- `/cvmfs/phelix.gsi.de/sifs/cpu/dev/rlx8_ompi_ucx_gcc13_flash.sif`
### GPUs (AMD GPUs MI-100 Instinct)
### [WarpX](https://github.com/ECP-WarpX/WarpX)
- `GCC 8.5 + python v3.6.8 + ROCm 6.2` (RockyLinux 8.10 system compiler+python)
- `/cvmfs/phelix.gsi.de/sifs/gpu/dev/rlx8_rocm-6.2_warpx.sif`
### [PicOnGPU](https://github.com/ComputationalRadiationPhysics/picongpu)
- `GCC 8.5 + python v3.6.8 + ROCm 5.7.1` (RockyLinux 8.10 system compiler+python)
- `/cvmfs/phelix.gsi.de/sifs/gpu/prod/rlx8_rocm-5.7.1_picongpu.sif`
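Image metadata (build date, labels) can be checked directly on a submit node before use; this is only a convenience check, assuming the `apptainer` client is in your PATH (it is installed on virgo3, see below):
```
# List the published CPU dev containers and show the metadata of one image
ls /cvmfs/phelix.gsi.de/sifs/cpu/dev/
apptainer inspect /cvmfs/phelix.gsi.de/sifs/cpu/dev/rlx8_ompi_ucx.sif
```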
## Apptainer in unprivileged user namespace mode
A new version of `Apptainer` (1.3), running in unprivileged user namespace mode, is now installed on Virgo3.
In order to avoid possible user namespace exhaustion and to benefit from the full
MPI intra-node optimizations, MPI users need to use the so-called
Apptainer sharens mode.
To do that, MPI users should add the following environment variables to
their submit script:
```
export APPTAINER_SHARENS=true
export APPTAINER_CONFIGDIR=/tmp/$USER
```
Setting `APPTAINER_SHARENS` to `true` tells `Apptainer` to switch to sharens mode.
All the MPI processes spawned on a node will then be moved to the same user namespace,
defined by a unique Apptainer instance created on this node.
`APPTAINER_CONFIGDIR` is the location where the metadata
of this unique Apptainer instance is stored. This information does not need
to be shared between nodes, so one can safely use `/tmp/$USER`.
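As a minimal sketch, this is how the two variables typically sit at the top of a job script; the `mkdir -p` line is only our own precaution to make sure the node-local directory exists, it is not a site requirement:
```
# Enable shared-namespace (sharens) mode and keep instance metadata in node-local /tmp
export APPTAINER_SHARENS=true
export APPTAINER_CONFIGDIR=/tmp/$USER
mkdir -p "$APPTAINER_CONFIGDIR"   # precaution: create the directory if it does not exist yet
```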
## Getting started with pp-containers
To use a container you first need to log in to **virgo3** on a baremetal submit host:
```
ssh user_name@virgo3.hpc.gsi.de
```
From the baremetal submit node you can submit a job with the environment defined in
your own container:
```
export CONT=/cvmfs/phelix.gsi.de/sifs/cpu/dev/rlx8_ompi_ucx.sif
# Apptainer settings
export APPTAINER_SHARENS=true
export APPTAINER_CONFIGDIR=/tmp/$USER
# openMPI I/O module
export OMPI_MCA_io=romio341
# run your application as if it was installed on the host !
echo "." | srun --export=ALL -- $CONT epoch3d
```
And that's it!
The scheduling is done by SLURM installed on the baremetal host; for all the rest
(execution of the MPI processes, I/O and the PIC code) the software stack installed inside the container
takes over.
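Putting the pieces together, a complete batch script could look like the sketch below. The `#SBATCH` values (partition name, node and task counts, time limit) are placeholders to adapt to your own project; only the container path, the Apptainer variables and the `srun` line are taken from this README:
```
#!/bin/bash
#SBATCH --job-name=epoch3d          # placeholder job name
#SBATCH --partition=<cpu_partition> # placeholder: your CPU partition
#SBATCH --nodes=2                   # example values only
#SBATCH --ntasks-per-node=32
#SBATCH --time=02:00:00

# Container providing the MPI + PIC software stack
export CONT=/cvmfs/phelix.gsi.de/sifs/cpu/dev/rlx8_ompi_ucx.sif

# Apptainer sharens settings (see above)
export APPTAINER_SHARENS=true
export APPTAINER_CONFIGDIR=/tmp/$USER

# Run epoch3d from inside the container; the "." piped to stdin is the output directory
echo "." | srun --export=ALL -- $CONT epoch3d
```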
## Interaction with the container
Once your data is produced you can analyse it using the same containerized environment, since it also provides
the necessary Python libraries:
```
[dbertini@lxbk1131 ~]$ singularity exec /cvmfs/phelix.gsi.de/sifs/cpu/dev/rlx8_ompi_ucx.sif bash -l
Centos system profile loaded ...
Apptainer> python3 --version
Python 3.6.8
Apptainer> python3
Python 3.6.8 (default, Feb 21 2023, 16:57:46)
[GCC 8.5.0 20210514 (Red Hat 8.5.0-16)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> import matplotlib
>>> import sdf
...
```
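For non-interactive analysis the same container can be driven in batch mode; `my_analysis.py` below is a hypothetical user script that imports `sdf`/`numpy`:
```
# Run a (hypothetical) analysis script with the containerized Python stack,
# binding Lustre so that the simulation output is visible inside the container
export CONT=/cvmfs/phelix.gsi.de/sifs/cpu/dev/rlx8_ompi_ucx.sif
singularity exec -B /lustre -B /cvmfs $CONT python3 my_analysis.py
```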
A dedicated container can be used to run the latest [WarpX 24.01](https://github.com/ECP-WarpX/WarpX/tree/development)
on the virgo3 `gpu` partition.
The container is built on top of the standard `rlx8` container, featuring a Rocky Linux 8.10 system with the common additional software stack described above.
The latest Radeon Open Compute [ROCm 6.0](https://github.com/RadeonOpenCompute/ROCm)
is installed and has been tested on the AMD MI-100 GPUs available on the
virgo3 `gpu` partition.
Example scripts for submitting to the `gpu` partition are available in the `gpu_scripts` directory.
The submission is similar to the one used for the CPU-based partitions, i.e.
```
#!/bin/bash
# Define container image and working directory
export CONT=/cvmfs/phelix.gsi.de/sifs/gpu/dev/rlx8_rocm-6.0_warpx.sif
export WDIR=/lustre/rz/dbertini/gpu/warpx
# Define I/O module for openMPI
export OMPI_MCA_io=romio341
# Define apptainer external filesystem bindings
export APPTAINER_BINDPATH=/lustre/rz/dbertini/,/cvmfs
export APPTAINER_SHARENS=true
export APPTAINER_CONFIGDIR=/tmp/$USER
# Executable with dimensionality and corresponding input deck file.
srun --export=ALL -- $CONT warpx_2d $WDIR/scripts/inputs/warpx_opmd_deck
```
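Before launching a long run it can be useful to check that the GPUs are actually visible from inside the container; the sketch below reuses the `$CONT` defined above and assumes `rocm-smi` is available in the image (it ships with the ROCm stack) and that the job was allocated on a GPU node:
```
# Quick sanity check: list the AMD GPUs visible from inside the container
srun --export=ALL -- singularity exec -B /lustre -B /cvmfs $CONT rocm-smi
```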
## New Container directory layout on `/cvmfs`
The container directory layout now provides `old`, `prod` and `dev` directories.
It is recommended to always use the `dev` containers since they feature the latest
openMPI software stack versions.
Newly produced containers will be pushed to the `dev` directory, waiting for validation by
the user community. After validation we will move them to `prod`, and the previous `prod` containers will then
be moved to the `old` directory.
CVMFS directory layout (date `Tue Sep 3 12:42:56 CEST 2024`):
```
[dbertini@lxbk0724 /cvmfs/phelix.gsi.de/sifs]$ tree
.
├── cpu
│   ├── dev
│   │   ├── rlx8_ompi_ucx_dask.sif
│   │   ├── rlx8_ompi_ucx_flash.sif
│   │   ├── rlx8_ompi_ucx_gcc13_py312.sif
│   │   └── rlx8_ompi_ucx.sif
│   ├── old
│   └── prod
│       ├── rlx8_ompi_ucx_gcc12.sif
│       └── rlx8_ompi_ucx.sif
└── gpu
    ├── dev
    │   └── rlx8_rocm-6.2_warpx.sif
    ├── old
    │   ├── rlx8_rocm-5.4.6.def
    │   ├── rlx8_rocm-5.4.6.sif
    │   ├── rlx8_rocm-5.4.6_warpx.def
    │   ├── rlx8_rocm-5.4.6_warpx.sif
    │   ├── ubuntu-20.04_rocm-5.4.2_picongpu.def
    │   ├── ubuntu-20.04_rocm-5.4.2_picongpu.sif
    │   ├── ubuntu-20.04_rocm-5.4.2_warpx.def
    │   └── ubuntu-20.04_rocm-5.4.2_warpx.sif
    └── prod
        ├── rlx8_rocm-5.7.1_picongpu.sif
        ├── rlx8_rocm-5.7.1_warpx_aware.sif
        ├── rlx8_rocm-6.0_warpx_aware.sif
        └── rlx8_rocm-6.0_warpx.sif

8 directories, 19 files
```
## Compiling and installing your own packages
One can compile and install one's own packages using the provided containers, e.g.
`/cvmfs/phelix.gsi.de/sifs/cpu/dev/rlx8_ompi_ucx.sif`.
This container provides all the components that are necessary for a self
install of user packages.
To compile within the container environment you first need to load the container:
```
> export CONT=/cvmfs/phelix.gsi.de/sifs/cpu/dev/rlx8_ompi_ucx.sif
> singularity exec -B /lustre -B /cvmfs $CONT bash -l
```
The `Apptainer>` prompt means that the created shell now contains the containerized environment.
You can get the normal unix prompt back by typing the `bash` command once:
```
Apptainer> bash
[dbertini@lxbk1130 /lustre/rz/dbertini]$
```
You can check, for example, which versions are available within the environment:
```
[dbertini@lxbk1130 /lustre/rz/dbertini]$ singularity exec rlx8_ompi_ucx.sif bash -l
RLX system profile loaded ...
[pp_container]/lustre/rz/dbertini]$ g++ --version
g++ (GCC) 8.5.0 20210514 (Red Hat 8.5.0-20)
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
[pp_container]/lustre/rz/dbertini] $ ompi_info | grep ucx
Configure command line: '--prefix=/usr/local' '--with-pmix=/usr/local' '--with-libevent=/usr' '--with-ompi-pmix-rte' '--with-orte=no' '--disable-oshmem' '--enable-mpirun-prefix-by-default' '--enable-shared' '--without-verbs' '--with-hwloc' '--with-ucx=/usr/local/ucx' '--with-lustre' '--with-slurm' '--enable-mca-no-build=btl-uct'
MCA osc: ucx (MCA v2.1.0, API v3.0.0, Component v5.0.1)
MCA pml: ucx (MCA v2.1.0, API v2.1.0, Component v5.0.1)
```
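As an illustration of such a self install, the sketch below compiles a small MPI program with the toolchain shipped in the container; `hello_mpi.c` and the Lustre working directory are hypothetical placeholders for your own sources:
```
export CONT=/cvmfs/phelix.gsi.de/sifs/cpu/dev/rlx8_ompi_ucx.sif
export WDIR=/lustre/<your_group>/<your_user>/hello   # placeholder: your own Lustre directory

# Compile the (hypothetical) hello_mpi.c with the container's MPI compiler wrapper
singularity exec -B /lustre -B /cvmfs $CONT mpicc -O2 -o $WDIR/hello_mpi $WDIR/hello_mpi.c
```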
Installation of packages should be done on the Lustre shared filesystem, which is accessible from every node in the cluster.
To run the self-installed package, use the following commands in your `run-file.sh`:
```
export CONT=/cvmfs/phelix.gsi.de/sifs/cpu/dev/rlx8_ompi_ucx.sif
export OMPI_MCA_io=romio341
export APPTAINER_SHARENS=true
export APPTAINER_CONFIGDIR=/tmp/$USER
srun --export=ALL -- singularity exec -B /lustre -B /cvmfs $CONT <my_compiled_exec> <options>
```
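The binary built above would then be launched the same way, e.g. with the hypothetical `hello_mpi` from the compilation sketch:
```
srun --export=ALL -- singularity exec -B /lustre -B /cvmfs $CONT $WDIR/hello_mpi
```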