I have two 2-d numpy arrays with the same dimensions, A and B, and am trying to calculate the row-wise dot product of them. I could do:

np.sum(A * B, axis=1)

Is there another way to do this so that numpy is doing the row-wise dot product in one step rather than two? Maybe with tensordot?
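For concreteness, a toy example of the result I mean (using the two-step version above):

```python
import numpy as np

# Row-wise dot product: one scalar per row, i.e. dot(A[i], B[i]).
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

print(np.sum(A * B, axis=1))
# [17 53]   (1*5 + 2*6 = 17, 3*7 + 4*8 = 53)
```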

Neil G
  • Are A and B matrices? I'm assuming with identical dimensions? – Roger Fan Oct 02 '14 at 19:56
  • [Note that "matrix" is a technical word in numpy-speak, and differs from "array": if A and B are arrays, not matrices, then `A*B` is an elementwise product, not a dot product.] – DSM Oct 02 '14 at 19:58
  • 1
    @DSM: Yes, it's elementwise. Looking forward to the `@` operator so that we can forget about `array` :) – Neil G Oct 02 '14 at 20:05

3 Answers

This is a good application for numpy.einsum.

a = np.random.randint(0, 5, size=(6, 4))
b = np.random.randint(0, 5, size=(6, 4))

res1 = np.einsum('ij, ij->i', a, b)
res2 = np.sum(a*b, axis=1)

print(res1)
# [18  6 20  9 16 24]

print(np.allclose(res1, res2))
# True

einsum also tends to be a bit faster.

a = np.random.normal(size=(5000, 1000))
b = np.random.normal(size=(5000, 1000))

%timeit np.einsum('ij, ij->i', a, b)
# 100 loops, best of 3: 8.4 ms per loop

%timeit np.sum(a*b, axis=1)
# 10 loops, best of 3: 28.4 ms per loop
Roger Fan

Even faster is inner1d from numpy.core.umath_tests:

[perfplot timing comparison: inner1d vs. einsum vs. sum]


Code to reproduce the plot:

import numpy
from numpy.core.umath_tests import inner1d
import perfplot


perfplot.show(
    setup=lambda n: (numpy.random.rand(n, 3), numpy.random.rand(n, 3)),
    kernels=[
        lambda a: numpy.sum(a[0] * a[1], axis=1),
        lambda a: numpy.einsum('ij, ij->i', a[0], a[1]),
        lambda a: inner1d(a[0], a[1]),
    ],
    labels=['sum', 'einsum', 'inner1d'],
    n_range=[2**k for k in range(20)],
    xlabel='len(a), len(b)',
    logx=True,
    logy=True,
)
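Note that `numpy.core.umath_tests` is a private module that has been deprecated and is absent from recent NumPy releases, so a guarded import with a fallback to `einsum` (which computes the same row-wise dot product) may be safer:

```python
import numpy as np

a = np.random.rand(6, 4)
b = np.random.rand(6, 4)

# inner1d comes from a private NumPy module that newer releases no
# longer ship; fall back to einsum, which computes the same thing.
try:
    from numpy.core.umath_tests import inner1d
    res = inner1d(a, b)
except ImportError:
    res = np.einsum('ij, ij->i', a, b)

print(np.allclose(res, np.sum(a * b, axis=1)))
# True
```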
Nico Schlömer

Even though it is significantly slower for even moderately sized data, I would use

np.diag(A.dot(B.T))

while developing the library, and worry about optimizing it later, once it runs in a production setting or after the unit tests are written.

To most people who come upon your code, this will be more understandable than einsum, and it doesn't require you to break best practices by embedding your calculation in a mini-DSL string passed as the argument to some function call.

I agree that computing the off-diagonal elements is worth avoiding for large cases. It would have to be really, really large for me to care about that, though, and the trade-off of paying the awful price of expressing the calculation in an embedded string in einsum is pretty severe.
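For what it's worth, the two spellings agree numerically; the diag version just materializes the full n-by-n product before discarding everything off the diagonal (a quick sketch):

```python
import numpy as np

A = np.random.rand(5, 3)
B = np.random.rand(5, 3)

# Readable but wasteful: builds the full 5x5 matrix product,
# then keeps only its diagonal.
readable = np.diag(A.dot(B.T))
fast = np.sum(A * B, axis=1)

print(np.allclose(readable, fast))
# True
```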

ely
  • 1
    Using the example in my question, this solution takes almost 100x as long as the `einsum` example and about 20x as long as the `np.sum(A * B, axis=1)` one. I think `einsum` is a powerful and expressive tool, but if you really don't like it then you should use the solution in the question over this. – Roger Fan Oct 02 '14 at 20:29
  • Two orders of magnitude would be worth it to me to avoid the way the calculation is expressed in `einsum` in all but the most severe of production environments. I might also consider what needs to be done with the resulting vector of sums and write a generator with a stupid pure Python for loop performing the use of `.dot` manually on each pair of rows. If I only needed to consume the output once, that would be even better. – ely Oct 02 '14 at 20:31
  • +1 for the good point, although I prefer either of the more efficient solutions. It can be hard to debug when your input becomes big and all of a sudden everything is really slow. – Neil G Oct 02 '14 at 20:34
  • I'm getting over half a second for two 5000x1000 arrays. That's not even close to a large-scale problem, and that's already pretty significant amount of time. Either way, there's absolutely no advantage to this over the `np.sum(A*B, axis=1)` method. That also avoids `einsum` and is 20x faster. – Roger Fan Oct 02 '14 at 20:34
  • I agree that `einsum` is opaque. Thing is, I can always write the efficient solution and then add a comment. I also have a design document, which explains everything I'm doing. – Neil G Oct 02 '14 at 20:37
  • I agree with your second point. There is no advantage to this over `np.sum`. RE your timing: It all depends on what you are doing. If you're constantly re-running a bunch of code on 5000 x 1000 arrays, you're doing it wrong. Write the easier to understand code and develop it on small matrices until you have your application correct and well-tested. Then profile it and discover that the crappy use of `A.dot(B.T)` was slowing you down and spend 10 minutes optimizing it later. In the mean time, if any part of the code is under rapid development, use a more expressive approach. – ely Oct 02 '14 at 20:38
  • @NeilG I actually don't think `einsum` is opaque, you just need to know [Einstein summation notation](http://en.wikipedia.org/wiki/Einstein_notation). If you're comfortable with that, `einsum` is a powerful, fast, and extremely expressive way to perform a large set of matrix operations. – Roger Fan Oct 02 '14 at 20:40
  • 4
    If you dislike strings so much, you can also tell einsum to do its thing, for the OP's use case, as: `np.einsum(a, [0,1], b, [0,1], [0])`. It would be good if you could specify a few of the "so many reasons" that make `einsum` such a bad idea. As is, I am strongly tempted to downvote this for giving bad advice... – Jaime Oct 02 '14 at 20:41
  • Embedding a calculation into an obfuscated string format is bad because it hurts modularity and increases hardcoding. To detect whether there is a problem in the calculation, it's not feasible to break it up into smaller algebraic steps, as it would be with chained application of basic linear algebra steps or canned functions. If you've got an index in the wrong place, that may be entirely non-obvious to someone reading the code and very hard to debug. – ely Oct 02 '14 at 20:45
  • Modularity gets hurt because you are dependent on a string constant that meant something to the person writing the code, but not to consumers who might use it later. Basically, you'd need to wrap this particular use of `einsum` with this particular string into a stand-alone helper function whose name and documentation made it clear that it was this unusual row-wise dot product. Then users of that code could feel comfortable with the implementation details of that thing being swept into the helper function. – ely Oct 02 '14 at 20:47
  • But when you just stumble across some code that embeds that, more or less hardcoded, then you can't pretend like the implementation details don't matter. What if someone had the wrong index reduction, what if they didn't think about how it would generalize to larger arrays when the code accidentally consumes 3D or 4D arrays, or some other unforeseen use case that creates an error? The naked call to `einsum` is then a big problem. I suppose you can orthogonalize it out of the code by making a helper function as I mentioned above, but who wants to always wrap all their uses of `einsum`? – ely Oct 02 '14 at 20:49
  • In a sense, it shares a lot of the downsides of using something like `eval` -- the code that gets fed to `eval` becomes a magic constant as far as users of the code are concerned and possibly also a parameter that has to be understood and kept track of to understand or debug the program. – ely Oct 02 '14 at 20:53
  • 2
    @EMS: Isn't *all code* "a string constant that meant something to the person writing the code"? I don't see what's wrong with just adding a comment to explain unobvious code. – Neil G Oct 02 '14 at 21:03
  • It's really disingenuous to draw that comparison. For example, suppose that instead of offering its API with functions, `NumPy` offered just one function, `np.eval` but it allowed you to use keywords inside a big string that it would interpret as math functions. So you could say, `x = np.eval('g = sin(y); x = cos(g);')` and it would look up `y` in the local namespace, apply the functions, and return whatever `x` got bound to within the string context back into the namespace as the variable named `x`. Then suppose someone accidentally typed `sign` instead of `sin`. – ely Oct 02 '14 at 21:07
  • To take apart and debug the computation, you have to read the string and, at a semantic level, determine the meaning of the statements and split them out into their own separate `np.eval` statements, then check step-wise to see where the expected result got off track. And maybe you didn't know they wanted `sin` instead of `sign`. The same problem can happen when those are API functions as well (e.g. they call `np.sign` instead of `np.sin`) but the difference is that you don't have to hunt through a giant string to make sense of the computations statements. They are just Python statements. – ely Oct 02 '14 at 21:09
  • It's not quite that bad with `einsum` since if you choose to use it, you must already be restricted to a very specific kind of operation within linear algebra, but it's the same idea. No, "code" in general is not the same thing when it provides an API of composable structures, versus when its statements are embedded into a string format that differs from the language's native way of producing statements. – ely Oct 02 '14 at 21:10
  • You mean because you're not getting line numbers? Python's eval could theoretically provide that. Everything in Python can be theoretically done at runtime, so this distinction is IMHO a leftover from what I guess is your background with statically compiled languages. :) – Neil G Oct 02 '14 at 21:32
  • EMS: your opinion is wrong. einsum is a godsend for writing readable and bug-free linear products. Explicitly specifying axes clarifies intent both to other programmers as well as the computer. – Eelco Hoogendoorn Oct 02 '14 at 21:41
  • @RogerFan You are mistaken about numexpr. That's a very different thing, more like Theano, in which you want to bind symbolics to deferred execution models or to target different hardware. A very different issue altogether. @Eelco Hoogendoorn: it is for exactly those reasons that I feel `einsum`'s calling signature is quite poor. – ely Oct 02 '14 at 22:39
  • @NeilG Python is actually the first language I ever learned seriously enough to bother with typing systems. Haskell is the only statically typed language I know well enough to draw comparisons, and that really doesn't apply in this case. I'm not sure I understand your comment. For all of the reasons why many PEPs and experienced Python devs have lambasted use of `eval` I feel the same criticisms apply in a narrower scope to `einsum`. I'm happy to disagree, but I do think the responses here are just talking past the points I've raised. – ely Oct 02 '14 at 22:41
  • Yes, I have used `numexpr` ... in particular I used it a lot when I was working at Continuum on the Numba team alongside the maintainer of sympy. Just because it can be used to apply a (restricted) set of operations into a namespace like that *in no way* makes it the same as what I was talking about. I'm really getting a bit frustrated with the conflation going on. – ely Oct 02 '14 at 22:43
  • @RogerFan While that may be the primary reason, it's hardly the only reason. There are scores of other reasons to not sequester statements you intend to execute into a string that obfuscates them out of the context where you could have executed them as basic, readable statements in the language's basic way for providing for statements. – ely Oct 02 '14 at 22:45
  • To be quite clear, `numexpr`, like Theano, is a way to bind data to symbolic entities, create expression graphs out of computations that will happen to those entities, and then handle the computation separately from its expression. That's not the same thing as providing a library's API through a mini DSL using strings. *As it happens*, tools like `numexpr` and `Theano` *do* use string names as their choice for representing statements. The Blaze project, for example, is a way to approach the same thing but without basing it on a string API. But really, this is all completely unrelated to my point. – ely Oct 02 '14 at 22:48
  • 1
    prpl: sorry for the necro. Of course I agree that DSL strings have their drawbacks; expressing everything as an expression graph in the native language is of course much cleaner. That is, if the native language permits it; if expressing your statement into the native language feels a lot like fitting a square peg into a round hole, a little DSL may not be so bad. And as far as linear algebra is concerned; its a DSL that the people that care about a function like einsum should already know and love. But the fact that the einsum syntax forces one to explicitly name all the axes is a good thing. – Eelco Hoogendoorn Dec 27 '14 at 11:09