Motivation
@j.schmoelder and I were wondering whether CADET-Core installed from conda uses OpenBLAS or Intel MKL as its BLAS library, and what the performance difference might be. So I checked.
Background
CADET-Core compiled for conda links its dependencies dynamically. We can therefore easily switch the BLAS library it uses by installing a different libblas build into our conda environment.
The default on Windows is MKL; the default on WSL Ubuntu is OpenBLAS. You can check which one you have with `conda list libblas`.
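If you want to do that check from a script, here is a minimal Python sketch. It assumes that `conda` is on your PATH, that `conda list --json` reports a `build_string` field per package, and that the backend name is the last underscore-separated part of the build string (as in `23_linux64_mkl`). The function names are made up for illustration.

```python
import json
import subprocess


def blas_variant(build_string):
    """Return the BLAS variant encoded at the end of a libblas build string."""
    # Assumption: build strings look like "23_linux64_mkl", so the part
    # after the last underscore names the backend.
    return build_string.rsplit("_", 1)[-1]


def installed_libblas_build():
    """Ask conda for the libblas build string in the active environment."""
    # Assumption: `conda` can be launched directly from this process.
    result = subprocess.run(
        ["conda", "list", "libblas", "--json"],
        capture_output=True,
        text=True,
        check=True,
    )
    for package in json.loads(result.stdout):
        if package["name"] == "libblas":
            return package["build_string"]
    return None  # libblas is not installed in this environment
```

Calling `blas_variant(installed_libblas_build())` inside the environment should then give `mkl`, `openblas` or `blis`, provided libblas is installed.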
A different variant can be installed with, for example, `mamba install libblas=*=*mkl`.
The syntax of `=*=*` is:

- The first `=*` indicates that you are willing to accept any version of the `libblas` package.
- The second `=*mkl` indicates that you want to install any build of `libblas` that is built with the `mkl` (Intel Math Kernel Library) variant.

(thanks ChatGPT)
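To make the two wildcards concrete, here is a small Python sketch that mimics the matching with glob patterns. It is a simplification and not conda's actual resolver: it only handles the three-field `name=version=build` form, the helper `matches_spec` is hypothetical, and the version `3.9.0` is just a placeholder. The build strings are the Linux ones from the table below.

```python
from fnmatch import fnmatchcase


def matches_spec(spec, name, version, build):
    """Check a package against a simplified `name=version=build` spec."""
    # Simplification: real conda match specs support more syntax than this.
    spec_name, spec_version, spec_build = spec.split("=")
    return (
        spec_name == name
        and fnmatchcase(version, spec_version)  # "*" accepts any version
        and fnmatchcase(build, spec_build)      # "*mkl" needs a build ending in "mkl"
    )


spec = "libblas=*=*mkl"
builds = ["23_linux64_mkl", "21_linux64_openblas", "21_linux64_blis"]

# "3.9.0" is a placeholder version, chosen only for illustration.
selected = [b for b in builds if matches_spec(spec, "libblas", "3.9.0", b)]
print(selected)  # ['23_linux64_mkl']
```

Swapping the last pattern for `*openblas` or `*blis` would select the other variants in the same way.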
Performance
Simulation duration for a 3-component LWE with tight tolerances (abstol = 1e-12, algtol = 1e-12, reltol = 1e-8):
| OS | CADET source | BLAS | version | TBB | time [s] |
|---|---|---|---|---|---|
| Linux (WSL Ubuntu) | conda | MKL | 23_linux64_mkl | yes | 20.29 |
| Linux (WSL Ubuntu) | conda | openblas | 21_linux64_openblas | yes | 30.51 |
| Linux (WSL Ubuntu) | conda | BLIS | 21_linux64_blis | yes | 44.52 |
| Windows | self-compiled | MKL | oneAPI 2022.1.0 | no | 21.00 |
| Windows | conda | MKL | 20_win64_mkl | yes | 22.14 |
| Windows | conda | MKL | 23_win64_mkl | yes | 22.95 |
| Windows | conda | openblas | 23_win64_openblas | yes | 40.99 |
| Windows | conda | BLIS | 23_win64_blis | yes | 48.77 |
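To put the conda rows into relative terms, the following Python sketch divides each non-MKL time by the fastest conda MKL time on the same OS. The numbers are copied from the table; picking the fastest MKL run as the baseline is my own choice, and since the table lists one time per configuration, small differences should not be over-interpreted.

```python
# (OS, BLAS, time in seconds) for the conda builds in the table above.
runs = [
    ("Linux", "MKL", 20.29),
    ("Linux", "openblas", 30.51),
    ("Linux", "BLIS", 44.52),
    ("Windows", "MKL", 22.14),
    ("Windows", "MKL", 22.95),
    ("Windows", "openblas", 40.99),
    ("Windows", "BLIS", 48.77),
]

# The fastest MKL time per OS serves as the baseline.
baseline = {}
for os_name, blas, seconds in runs:
    if blas == "MKL":
        baseline[os_name] = min(seconds, baseline.get(os_name, seconds))

# Runtime of every other backend relative to that baseline.
slowdown = {
    (os_name, blas): seconds / baseline[os_name]
    for os_name, blas, seconds in runs
    if blas != "MKL"
}

for (os_name, blas), factor in slowdown.items():
    print(f"{os_name:8s} {blas:9s} {factor:.2f}x the MKL time")
```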
The poor BLIS result is surprising, as the BLIS team reports better performance than MKL on the Zen 1 and Zen 2 architectures. Maybe my Zen 3 CPU works well with MKL again, maybe Intel TBB and BLIS don’t get along, maybe it’s something else. ¯_(ツ)_/¯
Key takeaways
Conda can be used to select the BLAS backend. In this benchmark, MKL was roughly 1.5 to 2.2 times faster than OpenBLAS and BLIS on both Linux and Windows.