# PSBLAS: Parallel Sparse BLAS

- A great addition to any scientific project
- Comes with a range of interfaces
- Tested on tens of thousands of cores
## Description
The PSBLAS library, developed with the aim of facilitating the parallelization of computationally intensive scientific applications, is designed to address the parallel implementation of iterative solvers for sparse linear systems on distributed-memory machines. It includes routines for multiplying sparse matrices by dense matrices, solving block-diagonal systems with triangular diagonal entries, and preprocessing sparse matrices, plus additional routines for dense matrix operations. The current implementation of PSBLAS addresses a distributed-memory execution model operating with message passing.
The PSBLAS library version 3 is implemented in the Fortran 2008 programming language, with reuse and/or adaptation of existing Fortran 77 and Fortran 95 software, plus a handful of C routines.
Library releases for PSBLAS:

| Release | Date | Sources | Documentation |
|---|---|---|---|
| Version 3.9.0 | December 23, 2025 | | |
| Version 3.9-RC3 | July 23, 2025 | | |
Library releases can be downloaded from: psblas3/releases
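The development sources can also be obtained with git; the repository URL below is inferred from the releases link above, so double-check it against the project page:

```sh
git clone https://github.com/sfilippone/psblas3.git
cd psblas3
```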
The architecture, philosophy and implementation details of the library are contained in the following papers:
- S. Filippone, A. Buttari, Object-Oriented Techniques for Sparse Matrix Computations in Fortran 2003, ACM Transactions on Mathematical Software, 38(4), 2012.
- V. Cardellini, S. Filippone, D. Rouson, Design Patterns for Sparse-Matrix Computations on Hybrid CPU/GPU Platforms, Scientific Programming, 22 (2014), pp. 1-19.
- S. Filippone, V. Cardellini, D. Barbieri, A. Fanfarillo, Sparse Matrix-Vector Multiplication on GPGPUs, ACM Transactions on Mathematical Software, 43(4), December 2016.
- S. Filippone, M. Colajanni, PSBLAS: a library for parallel linear algebra computation on sparse matrices, ACM Transactions on Mathematical Software, 26(4), December 2000, pp. 527-550.
- A. Buttari, P. D’Ambra, D. di Serafino, S. Filippone, Extending PSBLAS to build parallel Schwarz preconditioners, in Applied Parallel Computing. State of the Art in Scientific Computing: 7th International Workshop, PARA 2004, LNCS 3732, 2006, pp. 593-602.
- A. Buttari, P. D’Ambra, D. di Serafino, S. Filippone, 2LEV-D2P4: a package of high-performance preconditioners for scientific and engineering applications, Applicable Algebra in Engineering, Communication and Computing, 18(3), 2007, pp. 223-239.
- P. D’Ambra, D. di Serafino, S. Filippone, MLD2P4: a package of parallel algebraic multilevel domain decomposition preconditioners in Fortran 95, ACM Transactions on Mathematical Software, 37(3), 2010.
PSBLAS is the backbone of the Parallel Sparse Computation Toolkit (PSCToolkit) suite of libraries. See the paper:
- P. D’Ambra, F. Durastante, S. Filippone, Parallel Sparse Computation Toolkit, Software Impacts, 15, 2023, 100463.
We originally included a modified implementation of some of the Sparker (serial sparse BLAS) material; this has since been completely rewritten, well beyond the intentions and responsibilities of the original developers. The main reference for the serial sparse BLAS is:
- I. Duff, M. Marrone, G. Radicati, C. Vittoli, Level 3 basic linear algebra subprograms for sparse matrices: a user level interface, ACM Transactions on Mathematical Software, 23(3), 1997, pp. 379-401.
To compile (using configure/make/make install) and run our software you will need the following prerequisites (see also the notes on serial-only builds below):
- A working version of MPI.
- A version of the BLAS; you can specify a specific version with `--with-blas`.
- Optionally, the METIS library, with which we have had good results, from https://github.com/KarypisLab/METIS; it is used in the util and test/fileread directories, but only if you specify `--with-metis`.
- Optionally, the AMD package of Davis, Duff and Amestoy; if you have it, you can specify `--with-amd` (see `./configure --help` for more details). We use the C interface to AMD.
- If you have CUDA available, use:
  - `--enable-cuda` to compile CUDA-enabled methods;
  - `--with-cudadir=<path>` to specify the CUDA toolkit location;
  - `--with-cudacc=XX,YY,ZZ` to specify a list of target CCs (compute capabilities).
Note that CUDA versions have specific compatibility requirements: the compute capabilities you target must be supported by the CUDA toolkit version in use.
The configure script will generate a Make.inc file suitable for building
the library. The script is capable of recognizing the needed libraries
with their default names; if they are in unusual places consider adding
the paths with --with-libs, or explicitly specifying the names in
--with-blas, etc.
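As an illustrative sketch, a configure invocation combining the options above might look like the following; every path and value is a placeholder to adapt to your system, and the exact `--with-blas` value syntax should be checked against `./configure --help`:

```sh
# Hypothetical invocation; adjust paths and values to your installation.
./configure --prefix=$HOME/psblas \
            --with-blas=openblas \
            --with-metis \
            --enable-cuda \
            --with-cudadir=/usr/local/cuda \
            --with-cudacc=70,80
```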
> [!CAUTION]
> A common way for the configure script to fail is to specify inconsistent MPI vs. plain compilers, either directly or indirectly via environment variables; e.g. specifying the Intel compiler with `FC=ifort` while at the same time having an `MPIFC=mpif90` which points to GNU Fortran.
> [!TIP]
> The best way to avoid this situation is (in our opinion) to use the environment modules package (see http://modules.sourceforge.net/) and load the relevant variables with, e.g.,
>
> ```sh
> module load gcc/14.2.0 openmpi/5.0.8
> ```
>
> This delegates to the modules setup the task of making sure that the version of OpenMPI in use is the one compiled with the matching GNU compilers. After the configure script has completed, you can always tweak the Make.inc file yourself.
After you have Make.inc fixed, run

```sh
make
```

to compile the library; go to the test directory and its subdirectories to build the test programs. You can then install with

```sh
make install
```
We recommend specifying --prefix=/path in the configure step, so that
the libraries will be installed under /path/lib,
the module files will be installed under /path/modules, the documentation under /path/docs and so on.
The C interface header files are under /path/include.
If /path is a system directory, you may need

```sh
sudo make install
```

If you do not specify --prefix, the usual default of /usr applies.
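As a sketch of how an application might then be compiled against the installed library: the include and library paths follow the layout above, while the individual `-lpsb_*` library names are assumptions based on the component names mentioned in this README, so check the contents of /path/lib:

```sh
# Hypothetical compile line for a user program against an installed PSBLAS.
mpif90 -I/path/modules my_solver.f90 \
       -L/path/lib -lpsb_util -lpsb_linsolve -lpsb_prec -lpsb_base \
       -o my_solver
```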
This version of PSBLAS incorporates into a single package three entities that were previously separated:

| Library | Description |
|---------|-------------|
| PSBLAS | the base library |
| PSBLAS-EXT | a library providing additional storage formats for matrices and vectors |
| SPGPU | a package of kernels for NVIDIA GPUs originally written by Davide Barbieri and Salvatore Filippone; see the license file cuda/License-spgpu.md |
Moreover, the module and library previously called psb_krylov are now called psb_linsolve, but their usage is otherwise unchanged.
There is a highly experimental version of an OpenACC interface; you can compile it by specifying

```sh
--enable-openacc --with-extraopenacc="-foffload=nvptx-none=-march=sm_70"
```

where the argument to the extraopenacc option depends on the compiler you are using (the example shown here is relevant for the GNU compiler).
Configuring with --enable-serial will provide a fake MPI stub library
that enables running in pure serial mode; no MPI installation is needed
in this case (but note that the fake MPI stubs are only guaranteed to
cover what we use internally, it’s not a complete replacement).
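For instance, a pure-serial build (no MPI required) could be configured and built as follows; the prefix is a placeholder:

```sh
./configure --enable-serial --prefix=$HOME/psblas-serial
make
make install
```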
We have two kinds of integers: IPK for local indices and LPK for global indices. Their sizes can be specified at configure time, e.g.

```sh
--with-ipk=4 --with-lpk=8
```

which asks for 4-byte local indices and 8-byte global indices (this is the default).
PSBLAS supports building with CMake (version 3.11 or higher). This method handles the automatic detection of compilers, MPI, and linear algebra libraries. To perform a standard compilation (without CUDA), run:

```sh
mkdir build
cd build
cmake ..
make
```
If you wish to install PSBLAS in a specific location (similar to using the --prefix option in the legacy configure script), you must define the CMAKE_INSTALL_PREFIX variable. To set a custom installation path, run the configuration command as follows:

```sh
cmake -DCMAKE_INSTALL_PREFIX=/home/user/psblas_install ..
```
To enable GPU support via CUDA, set the PSB_BUILD_CUDA option to ON during the configuration step. Important compatibility note: CUDA support is strictly incompatible with 8-byte local integers; if you manually set CMAKE_PSB_IPK to 8, CUDA support will be automatically disabled. To build with CUDA enabled:

```sh
cmake -DPSB_BUILD_CUDA=ON ..
```

The compilation then proceeds as before through `make`. When this flag is active, CMake will search for the CUDAToolkit, enable the CUDA language, and define the necessary macros such as PSB_HAVE_CUDA.
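Putting the pieces together, a CUDA-enabled out-of-tree build with a custom installation prefix might look like this (the prefix path is a placeholder):

```sh
mkdir build && cd build
cmake -DPSB_BUILD_CUDA=ON -DCMAKE_INSTALL_PREFIX=$HOME/psblas ..
make
make install
```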
You can override the default integer sizes (4-byte local IPK and 8-byte global LPK) using the CMAKE_PSB_IPK and CMAKE_PSB_LPK variables. For example, to request 4-byte local integers and 8-byte global integers (the defaults):

```sh
cmake -DCMAKE_PSB_IPK=4 -DCMAKE_PSB_LPK=8 ..
```
To install the libraries, header files, and Fortran modules to your system (or a custom path defined by -DCMAKE_INSTALL_PREFIX), run:
make install
The files will be organized into the lib, include, and modules subdirectories within the installation prefix, same as the configure build.
The library has been successfully compiled and tested with multiple compilers and MPI implementations. Moreover, this release has been tested with the Intel OneAPI toolchain versions 2025.2 and 2025.3.
As of this release, the NVIDIA compiler 25.7 fails to handle our code. Cray, IBM and NAG compilers have been used for testing in the past, but not on this version.
Further information on installation and configuration can be found in the documentation; see docs/psblas-3.9.pdf (an HTML version of the same document is available in docs/html). Please consult the sample programs, especially test/pargen/psb_[sd]_pde[23]d.f90, which contain examples for the solution of linear systems obtained by the discretization of a generic second-order differential equation in two dimensions,

$$
- a_1 \frac{\partial^2 u}{\partial x^2}
- a_2 \frac{\partial^2 u}{\partial y^2}
+ b_1 \frac{\partial u}{\partial x}
+ b_2 \frac{\partial u}{\partial y}
+ c\,u = f,
$$

or three dimensions,

$$
- a_1 \frac{\partial^2 u}{\partial x^2}
- a_2 \frac{\partial^2 u}{\partial y^2}
- a_3 \frac{\partial^2 u}{\partial z^2}
+ b_1 \frac{\partial u}{\partial x}
+ b_2 \frac{\partial u}{\partial y}
+ b_3 \frac{\partial u}{\partial z}
+ c\,u = f,
$$

on the unit square/cube with Dirichlet boundary conditions.
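Assuming a standard MPI launcher and that each compiled sample keeps the name of its source file (both assumptions; the runtime input each program expects is described in the documentation), a parallel run of the 3D example might look like:

```sh
# Hypothetical run of the 3D PDE sample on 8 processes.
cd test/pargen
mpirun -np 8 ./psb_d_pde3d
```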
The test/util directory contains some utilities to convert to/from Harwell-Boeing and MatrixMarket file formats.
> [!NOTE]
> To report bugs 🐛 or issues ❓ please use the GitHub issue system.
The PSBLAS library is also distributed as a package for the following Linux distributions.
| Distro | PSBLAS Version | Reference | Maintainer | To install |
|---|---|---|---|---|
| Fedora | 3.8.0-2 | psblas3 | sagitter | `dnf install psblas3-serial`<br>`dnf install psblas3-openmpi`<br>`dnf install psblas3-mpich`<br>`dnf install psblas3-debugsource`<br>`dnf install psblas3-debuginfo`<br>Note: for the mpich and openmpi versions the -debuginfo and -devel packages are also available. |
From version 3.7.0, PSBLAS handles 8-byte integer data for the global indices of distributed matrices, allowing an index space larger than 2G; the sizes are selected through the `--with-ipk` and `--with-lpk` configure flags described above.

Note: if you wish to compile PSBLAS with the METIS interfaces, then METIS itself needs to be configured and compiled consistently with the choice made for `--with-lpk`. This can be achieved by setting the defines in metis.h, e.g., to use 8-byte integers and 4-byte floats:

```c
#define IDXTYPEWIDTH 64
#define REALTYPEWIDTH 32
```
TODO: fix all remaining bugs. Bugs? We don't have any! ;-)
Project lead: Salvatore Filippone
Contributors (roughly reverse chronological order):
PSBLAS is distributed under the BSD 3-Clause license.
Parallel Sparse BLAS version 3.8
(C) Copyright 2006-2022
Salvatore Filippone
Alfredo Buttari
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions, and the following disclaimer in the
documentation and/or other materials provided with the distribution.
3. The name of the PSBLAS group or the names of its contributors may
not be used to endorse or promote products derived from this
software without specific written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE PSBLAS GROUP OR ITS CONTRIBUTORS
BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.