CADET-Core BLAS linking in conda

Motivation

@j.schmoelder and I were wondering whether CADET-Core installed from conda uses OpenBLAS or Intel MKL as its BLAS library, and what the performance difference might be. So I checked.

Background

CADET-Core compiled for conda links its dependencies dynamically. We can therefore easily switch the BLAS library by installing a different libblas build into our conda environment.

The default on Windows is MKL, the default on WSL Ubuntu is OpenBLAS. You can check which one you have with conda list libblas; the variant shows up in the build string.
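If you want to do this check from a script, here is a minimal Python sketch. It assumes that the variant is the last underscore-separated part of the build string, as in the builds listed in this post (e.g. 23_linux64_mkl), and that the output of conda list libblas --json contains records with "name" and "build_string" fields. The function names are made up for illustration.

```python
import json


def blas_variant(build_string: str) -> str:
    """Return the last underscore-separated token of a libblas build string.

    Assumption: the variant is encoded there, e.g. '23_linux64_mkl' -> 'mkl'.
    """
    return build_string.rsplit("_", 1)[-1]


def variant_from_conda_json(text: str) -> str:
    """Find the libblas record in JSON text and return its BLAS variant.

    Assumption: `text` is the output of `conda list libblas --json`,
    i.e. a list of records with 'name' and 'build_string' keys.
    """
    for record in json.loads(text):
        if record.get("name") == "libblas":
            return blas_variant(record["build_string"])
    raise LookupError("no libblas record found")
```

To use it, save the output of conda list libblas --json to a file (or capture it with subprocess) and pass the text to variant_from_conda_json.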

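A different variant can be installed with mamba install libblas=*=*mkl (in shells that expand * themselves, such as zsh, you may need to quote the spec).

The spec follows the pattern name=version=build, so =*=* reads as: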

  • The first =* indicates that you are willing to accept any version of the libblas package.
  • The second =*mkl indicates that you want to install any build of libblas that is built with the mkl (Intel Math Kernel Library) variant.

(thanks ChatGPT)
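To make the name=version=build pattern concrete, here is a small Python sketch that splits such a spec and checks build strings against its build glob. This is a simplified illustration only, not how conda parses specs internally; split_spec and build_matches are hypothetical helpers, and real specs support more syntax than is handled here.

```python
from fnmatch import fnmatchcase


def split_spec(spec: str) -> tuple[str, str, str]:
    """Split 'name=version=build' into three parts.

    Simplification: parts that are left out default to '*' (match anything).
    """
    parts = spec.split("=")
    parts += ["*"] * (3 - len(parts))
    name, version, build = parts[:3]
    return name, version, build


def build_matches(spec: str, build_string: str) -> bool:
    """Check a build string against the build glob of the spec."""
    _, _, build_glob = split_spec(spec)
    return fnmatchcase(build_string, build_glob)


# Build strings taken from the benchmark in this post.
for build in ["23_linux64_mkl", "21_linux64_openblas", "21_linux64_blis"]:
    print(build, build_matches("libblas=*=*mkl", build))
```

Only the MKL build string matches the *mkl glob, which is why the spec selects the MKL variant regardless of the build number in front of it.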

Performance

Simulation duration for a 3-component LWE with tight tolerances (abstol = 1e-12, algtol = 1e-12, reltol = 1e-8):

| OS                 | CADET source  | BLAS     | version             | TBB | time [s] |
|--------------------|---------------|----------|---------------------|-----|----------|
| Linux (WSL Ubuntu) | conda         | MKL      | 23_linux64_mkl      | yes | 20.29    |
| Linux (WSL Ubuntu) | conda         | openblas | 21_linux64_openblas | yes | 30.51    |
| Linux (WSL Ubuntu) | conda         | BLIS     | 21_linux64_blis     | yes | 44.52    |
| Windows            | self-compiled | MKL      | oneAPI 2022.1.0     | no  | 21.00    |
| Windows            | conda         | MKL      | 20_win64_mkl        | yes | 22.14    |
| Windows            | conda         | MKL      | 23_win64_mkl        | yes | 22.95    |
| Windows            | conda         | openblas | 23_win64_openblas   | yes | 40.99    |
| Windows            | conda         | BLIS     | 23_win64_blis       | yes | 48.77    |

The poor result for BLIS is surprising, as the BLIS team reports better performance than MKL on Zen 1 and Zen 2 architectures. Maybe my Zen 3 CPU works well with MKL again, maybe Intel TBB and BLIS don’t get along, maybe it’s something else. ¯_(ツ)_/¯
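For a quick relative comparison, the timings can be turned into slowdown factors versus MKL. A small Python sketch using only the conda timings from the table; for Windows I picked the 23_win64_mkl run as the MKL reference, which is my own choice:

```python
# Timings in seconds, copied from the table (conda builds only).
# Windows MKL reference: the 23_win64_mkl run (arbitrary choice of baseline).
times = {
    "Linux (WSL Ubuntu)": {"mkl": 20.29, "openblas": 30.51, "blis": 44.52},
    "Windows": {"mkl": 22.95, "openblas": 40.99, "blis": 48.77},
}


def slowdown_vs_mkl(times_for_os: dict[str, float]) -> dict[str, float]:
    """Divide each runtime by the MKL runtime measured on the same OS."""
    mkl = times_for_os["mkl"]
    return {name: round(t / mkl, 2) for name, t in times_for_os.items()}


ratios = {os_name: slowdown_vs_mkl(t) for os_name, t in times.items()}
for os_name, r in ratios.items():
    print(os_name, r)
```

A factor of 1.0 is MKL itself; larger values mean the simulation took that many times longer with the other backend.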

Key takeaways

Conda can be used to select the BLAS backend. In this benchmark, MKL was roughly 1.5 to 2.2 times faster than OpenBLAS and BLIS on both Linux and Windows.


Does this have an impact on speed of simulations or fitting? Or is it just installation?

Those times (in seconds) are the duration of a single simulation. So this will also have an effect on fitting and optimization runs.
