Motivation
@j.schmoelder and I were wondering whether CADET-Core installed from conda uses OpenBLAS or Intel MKL as its BLAS library, and what the performance difference might be. So I checked.
Background
CADET-Core compiled for conda links its dependencies dynamically. We can therefore easily switch which BLAS library is used by installing a different `libblas` build in our conda environment through conda.
The default on Windows is MKL, the default on WSL Ubuntu is OpenBLAS. Which one you have can be checked with `conda list libblas`.
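For scripted checks, the same information can be read from conda's JSON output. The snippet below is a minimal sketch, assuming `conda` is on the PATH and that the `libblas` build string ends in the BLAS variant name, as the build strings in the results table do (for example `23_linux64_mkl`). The helper names `blas_variant` and `installed_blas` are made up for illustration.

```python
import json
import subprocess


def blas_variant(build_string):
    """Return the last underscore-separated field of a libblas build string.

    Assumption: build strings look like '23_linux64_mkl', where the
    trailing field names the BLAS implementation.
    """
    return build_string.rsplit("_", 1)[-1]


def installed_blas(conda_exe="conda"):
    """Ask conda which libblas build is installed in the active environment."""
    result = subprocess.run(
        [conda_exe, "list", "libblas", "--json"],
        capture_output=True, text=True, check=True,
    )
    packages = json.loads(result.stdout)
    # 'conda list <name>' treats the name as a pattern, so filter for the exact name.
    for pkg in packages:
        if pkg["name"] == "libblas":
            return blas_variant(pkg["build_string"])
    return None  # libblas is not installed in this environment
```

Calling `installed_blas()` inside the activated environment should then return something like `"mkl"` or `"openblas"`, depending on the installed build.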
A different variant can be installed with `mamba install libblas=*=*mkl`.
The syntax of `=*=*` is:
- The first `=*` indicates that you are willing to accept any version of the `libblas` package.
- The second `=*mkl` indicates that you want to install any build of `libblas` that is built with the `mkl` (Intel Math Kernel Library) variant.
(thanks ChatGPT)
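The `name=version=build` pattern described above can be illustrated with a small sketch. This is not conda's actual matching code, only a simplified stand-in that splits the spec on `=` and applies the third field as a glob to a build string. The function names `parse_spec` and `build_matches` are hypothetical.

```python
from fnmatch import fnmatchcase


def parse_spec(spec):
    """Split 'name=version=build' into three parts, defaulting to '*'."""
    parts = spec.split("=")
    name = parts[0]
    version = parts[1] if len(parts) > 1 else "*"
    build = parts[2] if len(parts) > 2 else "*"
    return name, version, build


def build_matches(spec, build_string):
    """Check a build string against the glob in the third field of the spec."""
    _, _, build_glob = parse_spec(spec)
    return fnmatchcase(build_string, build_glob)


# Build strings taken from the results table in this post.
print(build_matches("libblas=*=*mkl", "23_linux64_mkl"))       # True
print(build_matches("libblas=*=*mkl", "21_linux64_openblas"))  # False
```

By the same logic, replacing `*mkl` with `*openblas` or `*blis` should select the other variants listed in the table. In shells that expand `*`, quoting the spec (`"libblas=*=*mkl"`) avoids accidental glob expansion.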
Performance
Simulation duration for a 3-component load-wash-elute (LWE) simulation with tight tolerances (`abstol = 1e-12`, `algtol = 1e-12`, `reltol = 1e-8`):
OS | CADET source | BLAS | version | TBB | time [s] |
---|---|---|---|---|---|
Linux (WSL Ubuntu) | conda | MKL | 23_linux64_mkl | yes | 20.29 |
Linux (WSL Ubuntu) | conda | openblas | 21_linux64_openblas | yes | 30.51 |
Linux (WSL Ubuntu) | conda | BLIS | 21_linux64_blis | yes | 44.52 |
Windows | self-compiled | MKL | oneAPI 2022.1.0 | no | 21.00 |
Windows | conda | MKL | 20_win64_mkl | yes | 22.14 |
Windows | conda | MKL | 23_win64_mkl | yes | 22.95 |
Windows | conda | openblas | 23_win64_openblas | yes | 40.99 |
Windows | conda | BLIS | 23_win64_blis | yes | 48.77 |
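To put the table in relative terms, here is a short sketch that divides each conda timing by the MKL timing on the same OS. The numbers are copied from the table. Using the `23_win64_mkl` row as the Windows baseline is my choice for the illustration, since it matches the version of the other Windows conda builds.

```python
# Timings in seconds, copied from the conda rows of the table above.
times = {
    "Linux": {"MKL": 20.29, "openblas": 30.51, "BLIS": 44.52},
    # Windows has two conda MKL rows; the 23_win64_mkl one is used here.
    "Windows": {"MKL": 22.95, "openblas": 40.99, "BLIS": 48.77},
}


def slowdown_vs_mkl(timings):
    """Return each backend's runtime divided by the MKL runtime on the same OS."""
    return {
        os_name: {blas: t / by_blas["MKL"] for blas, t in by_blas.items()}
        for os_name, by_blas in timings.items()
    }


ratios = slowdown_vs_mkl(times)
for os_name, by_blas in ratios.items():
    for blas, ratio in by_blas.items():
        print(f"{os_name:8s} {blas:9s} {ratio:.2f}x")
```

With these inputs, openblas comes out at about 1.50x (Linux) and 1.79x (Windows) the MKL runtime, and BLIS at about 2.19x and 2.13x. These are single runs on one machine, so the ratios are only a rough indication.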
The poor result for BLIS is surprising, as the BLIS team reports better performance than MKL on the Zen 1 and Zen 2 architectures. Maybe my Zen 3 CPU works well with MKL again, maybe Intel TBB and BLIS don’t get along, maybe it’s something else. ¯_(ツ)_/¯
Key takeaways
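Conda can be used to select the BLAS backend. In this benchmark, MKL was significantly faster than OpenBLAS and BLIS on both Linux and Windows: OpenBLAS took roughly 1.5x (Linux) to 1.8x (Windows) as long as MKL, and BLIS took more than 2x as long on both.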