
I was wondering which open-source BLAS/LAPACK package is best optimized for modern multi-core processors (Haswell and beyond). Is there any distribution that can attain performance close to that of Intel MKL, for instance?

In addition to Intel, which libraries perform well on AMD architectures, specifically Zen 1 and Zen 2?

I was hoping the answers could be based on actual performance analysis work.
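For concreteness, the kind of measurement I have in mind is a plain DGEMM timing through the standard CBLAS interface, with the same source relinked against each candidate library. Below is a minimal sketch; the matrix size, repetition count, and example link lines are arbitrary choices on my part, not recommendations.

    // bench.cpp -- minimal DGEMM timing sketch, not a rigorous benchmark.
    // Relink the same source against each candidate BLAS, e.g.
    //   g++ -O2 bench.cpp -lopenblas   (OpenBLAS)
    //   g++ -O2 bench.cpp -lblis       (BLIS, if built with its CBLAS layer)
    //   g++ -O2 bench.cpp -lmkl_rt     (MKL single dynamic library)
    #include <cblas.h>
    #include <chrono>
    #include <cstdio>
    #include <vector>

    int main() {
        const int n = 2000;   // arbitrary problem size
        const int reps = 10;
        std::vector<double> A(n * n, 1.0), B(n * n, 1.0), C(n * n, 0.0);

        // Warm-up call so thread pools and code paths are initialized.
        cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                    n, n, n, 1.0, A.data(), n, B.data(), n, 0.0, C.data(), n);

        auto t0 = std::chrono::steady_clock::now();
        for (int r = 0; r < reps; ++r)
            cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                        n, n, n, 1.0, A.data(), n, B.data(), n, 0.0, C.data(), n);
        auto t1 = std::chrono::steady_clock::now();

        // Average time per call and the usual 2*n^3 flop count for GEMM.
        double secs = std::chrono::duration<double>(t1 - t0).count() / reps;
        std::printf("n = %d: %.3f s per DGEMM, ~%.1f GFLOP/s\n",
                    n, secs, 2.0 * n * n * n / secs / 1e9);
        return 0;
    }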

Thanks, Michael

tamumiket
  • I don't have concrete numbers, but in my experience, OpenBLAS has given me close to MKL performance. (To be specific, this was for NumPy/SciPy linked against MKL -- before they changed their noncommercial license policy -- vs. OpenBLAS.) As a bonus, getting things to work with OpenBLAS was much less painful, too. – Christian Clason Mar 05 '16 at 00:07
  • In which classes of problems was OpenBLAS close to MKL? Have you ever compared OpenBLAS against GotoBLAS or ATLAS? – tamumiket Mar 07 '16 at 03:47
  • This may not answer your question, but why not use the free version of MKL: https://software.intel.com/en-us/articles/free_mkl Other than that, I would guess OpenBLAS, GotoBLAS, or ATLAS. – Scott Thornton Mar 06 '16 at 02:28
  • Solving linear systems for sparse matrices and eigenvalue problems, mostly. I have made no comparison to GotoBLAS (which is no longer actively maintained, and on which OpenBLAS is based) or ATLAS. Have you actually googled around? There are tons of benchmarks on the net, e.g., https://github.com/tmolteno/necpp/issues/18, http://stackoverflow.com/questions/25830764/numpy-with-atlas-or-openblas (follow the links in the comments), http://blog.nguyenvq.com/blog/2014/11/10/optimized-r-and-python-standard-blas-vs-atlas-vs-openblas-vs-mkl/. – Christian Clason Mar 07 '16 at 08:24
  • In particular, this answer on Stack Overflow should exhaustively answer your question: http://stackoverflow.com/a/7645939. Be aware that the numbers could be very different depending on the precise problem you are trying to solve. – Christian Clason Mar 07 '16 at 08:27
  • Talking about performance: you may consider using flexiblas, which lets you switch the BLAS backend at runtime. http://www.mpi-magdeburg.mpg.de/projects/flexiblas (disclaimer: flexiblas was developed by my colleagues) – Jan Mar 07 '16 at 23:05
  • A more modern solution is the BLAS-like Library Instantiation Software (BLIS): https://github.com/flame/blis. It implements the same algorithm that underlies GotoBLAS and OpenBLAS, but in a more flexible way and with alternative interfaces in addition to the traditional BLAS interface. It is what AMD distributes as part of its open-source library solution. – Robert van de Geijn May 13 '18 at 14:29
  • IMHO, BLIS is the current gold standard for open source BLAS on these architectures: https://github.com/flame/blis (Disclaimer: Poster is part of the BLIS project.) – Robert van de Geijn Jun 20 '18 at 17:01
  • Eigen has had an edge over MKL+ICC in my recent tests in this context, especially dense linear algebra; you need to compile with optimization flags to get this performance (a sketch of the kind of kernel I mean follows this comment list). It is not a standard BLAS/LAPACK API, though (which can actually be a plus for optimization). – rfabbri Mar 08 '19 at 15:11
  • Not only is OpenBLAS as good as MKL, but it's also as easy to install as sudo apt install libopenblas-dev (on Ubuntu). – schneiderfelipe Aug 28 '19 at 04:49
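Regarding the Eigen comment above, here is a minimal sketch of the kind of dense kernel meant there. The compile line is an assumption on my part (something like g++ -O3 -march=native -DNDEBUG, plus -fopenmp for multithreading), not rfabbri's exact setup.

    // eigen_gemm.cpp -- dense matrix product using Eigen's own kernels (no BLAS call).
    // Assumed compile line: g++ -O3 -march=native -DNDEBUG -fopenmp eigen_gemm.cpp
    #include <Eigen/Dense>
    #include <chrono>
    #include <cstdio>

    int main() {
        const int n = 2000;   // arbitrary problem size
        Eigen::MatrixXd A = Eigen::MatrixXd::Random(n, n);
        Eigen::MatrixXd B = Eigen::MatrixXd::Random(n, n);
        Eigen::MatrixXd C(n, n);

        auto t0 = std::chrono::steady_clock::now();
        C.noalias() = A * B;   // noalias() avoids the temporary Eigen would otherwise create
        auto t1 = std::chrono::steady_clock::now();

        // Time for one product and the usual 2*n^3 flop count for GEMM.
        double secs = std::chrono::duration<double>(t1 - t0).count();
        std::printf("n = %d: %.3f s, ~%.1f GFLOP/s\n",
                    n, secs, 2.0 * n * n * n / secs / 1e9);
        return 0;
    }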

0 Answers