BLAS performance tests #3566

We should have a set of BLAS performance tests that cover various problems of various sizes with different numbers of threads. We can then compare the performance of different BLAS libraries on different platforms.
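Concretely, a run of such tests might sweep matrix sizes and BLAS thread counts along these lines. This is only a minimal sketch: the sizes, thread counts, and the helper name `sweep_gemm` are placeholders, not an agreed design.

```julia
using LinearAlgebra

# Minimal sketch of a size/thread sweep for double-precision gemm.
# The sizes, thread counts, and function name are illustrative placeholders.
function sweep_gemm(; sizes = (64, 256, 1024), threads = (1, 2, 4, 8))
    for t in threads
        BLAS.set_num_threads(t)   # tell the BLAS backend how many threads to use
        for n in sizes
            A = rand(n, n); B = rand(n, n); C = zeros(n, n)
            mul!(C, A, B)                       # warm up before timing
            secs = @elapsed mul!(C, A, B)
            println("threads=$t n=$n time=$(round(secs; sigdigits=3))s")
        end
    end
end
```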
This should also help us decide on a size threshold: when matrix sizes are below it, we should fall back to a hand-crafted, lightweight implementation.
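As a sketch of what that fallback could look like for a level-1 operation (the cutoff value and the function name `dot_with_fallback` are hypothetical; picking the real cutoff is exactly what these tests would inform):

```julia
using LinearAlgebra

const SMALL_N = 64   # hypothetical cutoff; the benchmarks would pick the real one

# Use a plain loop below the threshold, and BLAS above it.
function dot_with_fallback(x::Vector{Float64}, y::Vector{Float64})
    n = length(x)
    n == length(y) || throw(DimensionMismatch("vectors must have equal length"))
    if n < SMALL_N
        s = 0.0
        @inbounds @simd for i in 1:n
            s += x[i] * y[i]
        end
        return s
    end
    return BLAS.dot(n, x, 1, y, 1)
end
```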
This would be useful to help make tuning decisions in OpenBLAS (OpenMathLib/OpenBLAS#103).
The BLAS performance tests are already showing something interesting. @staticfloat, what is the sysblas on criid? It seems to be at least 50% faster than OpenBLAS on the level-1 tests. At level 2, OpenBLAS is finally faster at the "large" and "huge" sizes of gemv. At level 3, the differences are pretty minimal.
sysblas is the flavor of Julia built against the system-provided BLAS. On a Debian-based system, that means whatever is providing `libblas.so.3`. For the purposes of these benchmarks, sysblas is reference BLAS on Linux and Accelerate on OS X.
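For anyone reproducing this today: recent Julia versions can report which backend a given build actually dispatches to. This uses the current LinearAlgebra API, which postdates this thread.

```julia
using LinearAlgebra

# Reports the BLAS/LAPACK libraries the running session actually calls into
# (available on Julia 1.7+ via libblastrampoline).
println(BLAS.get_config())
```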
I see, so since 'criid' is an OS X machine, I'm looking at OpenBLAS vs. Accelerate.
Yep. Exactly. We definitely want to compare the BLAS implementations on various systems and see how they affect us on all our benchmarks.
I probably should have asked this before throwing a bunch of tests out there, but what is the appropriate design for BLAS performance tests, in terms of how many times to loop an operation? I chose iteration counts such that each test would take >100 ms on my machine, and scaled the counts inversely with problem size to keep the runtimes of the tests within an order of magnitude of each other. It now occurs to me that this requires a little extra care when interpreting the results, because the absolute time differences at the smaller problem sizes are 'levered up' by the large iteration counts. Fortunately, one can still compare different BLAS implementations against each other, because the percentage differences should still accurately convey the performance gaps.
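In code, that scaling might look like the following sketch (the flop budget `base_flops` and the helper name `bench_gemv` are made up for illustration):

```julia
using LinearAlgebra

# Pick an iteration count inversely proportional to the work per call, so
# each test runs for a comparable total wall time regardless of problem size.
function bench_gemv(n::Int; base_flops = 2_000_000_000)  # made-up budget
    A = rand(n, n); x = rand(n); y = zeros(n)
    iters = max(1, round(Int, base_flops / (2n^2)))  # gemv costs about 2n^2 flops
    mul!(y, A, x)                                    # warm-up call
    secs = @elapsed for _ in 1:iters
        mul!(y, A, x)
    end
    return (total = secs, per_call = secs / iters, iters = iters)
end
```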
That was what I was going for to start with. This is also good enough to detect regressions over time.
Indeed. I've also seen BLAS tests where the size of the problem is divided out, so that the units being reported are no longer seconds but rather FLOPS (floating-point operations per second). I'm not sure we have a problem here yet, though. I still have yet to track down all the segfaults and bugs in codespeed that are preventing everything from working perfectly. :)
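That normalization is a one-liner for gemv, using the standard ~2n^2 operation count (the helper name `gemv_gflops` is illustrative):

```julia
# gemv on an n×n matrix performs roughly 2n^2 floating-point operations;
# dividing that by the per-call time gives a size-independent rate.
gemv_gflops(n, secs_per_call) = 2n^2 / secs_per_call / 1e9

gemv_gflops(1000, 2.5e-4)   # ≈ 8.0 GFLOPS
```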
Can/should this be closed? |
Closing this as too general; it has been somewhat addressed by all the benchmarking work in recent times.