PSBLAS-EXT

Parallel Sparse BLAS GPU Plugin

Add NVIDIA-GPU capabilities to your code

Tested on thousands of GPUs

Description

This is the extensions plugin for PSBLAS.

This package contains:

  1. Extended matrix formats: ELLPACK, Hacked ELLPACK, DIAgonals, Hacked DIAgonals. Note: DIA and HDIA have limited support.
  2. A GPU plugin: GPU-enabled versions of the above, with the CUDA code from davidebarbieri/spgpu, plus interfaces to the CSR and HYB formats available in the NVIDIA cuSPARSE library. Note: DIAG and HDIAG have limited support.
  3. RSB: an interface to librsb

PREREQUISITES

To build this code you need to have PSBLAS 3.8.0-2 or later, together with its prerequisites.

To make use of the NVIDIA GPU you’ll need:

  1. An installation of the CUDA toolkit (version 10 or later);
  2. The SPGPU code from davidebarbieri/spgpu.
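
The CUDA requirement above can be checked before running configure. The following is only a minimal sketch: the helper names cuda_major and cuda_ok are hypothetical and not part of PSBLAS-EXT, and it assumes that nvcc is on the PATH and that `nvcc --version` prints a line containing "release X.Y".

```shell
#!/bin/sh
# Hypothetical helper (not part of PSBLAS-EXT): read `nvcc --version`
# output on stdin and print the major CUDA release number.
cuda_major() {
    sed -n 's/.*release \([0-9][0-9]*\)\..*/\1/p' | head -n 1
}

# Hypothetical helper: succeed if the given major version is at least 10.
cuda_ok() {
    [ -n "$1" ] && [ "$1" -ge 10 ]
}

if command -v nvcc >/dev/null 2>&1; then
    major=$(nvcc --version | cuda_major)
    if cuda_ok "$major"; then
        echo "CUDA $major found: OK"
    else
        echo "CUDA toolkit 10 or later is required (found: ${major:-unknown})"
    fi
else
    echo "nvcc not found in PATH"
fi
```

The script only reports what it finds; it does not verify the SPGPU or PSBLAS installations.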

RELEASE

Library releases for PSBLAS-EXT.

Release            Date                Sources      Documentation  Works with
Version 1.3.1      September 29, 2023  ZIP Archive  PDF            PSBLAS 3.8.1
Version 1.3.1      November 24, 2022   ZIP Archive  PDF            PSBLAS 3.8.0-2
Version 1.3.0      May 12, 2021        ZIP Archive  PDF            PSBLAS 3.7.0.1
Version 1.3.0-rc1  April 12, 2021      ZIP Archive  PDF            PSBLAS 3.7.0

Library releases can be downloaded from: psblas3-ext/releases

INSTALLING

./configure --prefix=/path/to/install \
            --with-psblas=/path/to/PSBLAS/install \
            --with-cuda=/CUDA/install \
            --with-spgpu=/SPGPU/install \
            --with-librsb=/LIBRSB/install

make
make install

Note: we have only tested with the GNU Fortran compiler.

Note: CUDA nvcc typically lags behind the latest versions of GCC/GNU Fortran; currently nvcc supports GCC 8, so this is the preferred choice. Mixing SPGPU CUDA code compiled with an older version and the rest compiled with e.g. 4.9 has worked fine so far, but YMMV. See the NVIDIA documentation for further information on the compatibility between GCC and nvcc.
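
As a rough pre-flight check for the compiler note above, the GCC major version can be compared against the newest one nvcc is expected to accept. This is a sketch under stated assumptions: the helper names gcc_major and gcc_supported and the variable MAX_GCC_MAJOR are hypothetical, and the limit of 8 is simply the value quoted in the note, which may not match the CUDA release in use.

```shell
#!/bin/sh
# Newest GCC major version assumed to be accepted by nvcc. The default of 8
# comes from the note above; adjust it for the CUDA release in use.
MAX_GCC_MAJOR=${MAX_GCC_MAJOR:-8}

# Hypothetical helper: reduce a version string such as "8.5.0" to "8".
gcc_major() {
    printf '%s\n' "$1" | cut -d. -f1
}

# Hypothetical helper: succeed if major version $1 does not exceed $2.
gcc_supported() {
    [ "$1" -le "$2" ]
}

if command -v gcc >/dev/null 2>&1; then
    major=$(gcc_major "$(gcc -dumpversion)")
    if gcc_supported "$major" "$MAX_GCC_MAJOR"; then
        echo "GCC $major should be usable as the nvcc host compiler"
    else
        echo "GCC $major is newer than $MAX_GCC_MAJOR: nvcc may reject it"
    fi
else
    echo "gcc not found in PATH"
fi
```

The limit can be overridden from the environment, for example MAX_GCC_MAJOR=11 sh check_gcc.sh, where check_gcc.sh is whatever name the script is saved under.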

TODO

Improve MPI support.

WHAT IS NOT HERE

Good preconditioners for the GPU. Triangular system solves perform very poorly on the GPU: we enable them in CSRG and HYBG, but we do not even bother to implement them in ELG and HLG.
So if you use the GPU, you are limited to either no preconditioning or diagonal scaling. We are working on an independent plugin that will deliver better alternatives based on approximate inverses.

Report bugs to: https://github.com/sfilippone/psblas3-ext/issues

Contributors

  • Salvatore Filippone
  • Alessandro Fanfarillo