0

if I have k many n,m matrices represented by a n,m,k numpy array, how can I multiply that by k many m,j matrices represented by a 'm,j,k' numpy array at the same time giving me a n,j,k ndarray?

In other words, I need to perform k many matrix multiplications of n,m * m,j = n,j. Is it possible to perform them at once?

EDIT: All of the dimensions vary, but are in general large.

user79950
  • 239
  • 2
  • 10

2 Answers2

2

This really depends on the size and shape of your arrays. Where n,m, and j are small you can do something like the following:

import numpy as np
>>> a = np.random.rand(5,2,1E6)
>>> b = np.random.rand(2,5,1E6)
>>> out = np.einsum('nmk,mjk->njk',a,b)
>>> out.shape
(5, 5, 1000000)

If n, m, and j are large you might want to take advantage of a BLAS like so:

>>> a= np.random.rand(1E3,1E2,5)
>>> b= np.random.rand(1E2,1E3,5)
>>> out = np.empty((1E3,1E3,5))
>>> for x in range(out.shape[-1]):
...     out[:,:,x] = np.dot(a[:,:,x], b[:,:,x])

Keep in mind that numpy arrays are row-major. You may want to rearrange your data depending on what you are doing.

Daniel
  • 18,441
  • 7
  • 56
  • 74
2

The second solution of @Ophion can do without a loop, and it is faster with larger dimension:

In [65]:

#k,n,m,j=2,4,5,6
k,n,m,j=100,101,102,103
A=np.random.random((n,m,k))
B=np.random.random((m,j,k))
In [66]:

%timeit np.rollaxis(np.array(map(np.dot, np.rollaxis(A,2), np.rollaxis(B,2))), 0, 3)
1 loops, best of 3: 313 ms per loop
In [67]:

%timeit np.einsum('nmk,mjk->njk',A,B)
1 loops, best of 3: 793 ms per loop

And slower than enisum when dimension is small:

In [68]:

k,n,m,j=2,4,5,6
#k,n,m,j=100,101,102,103
A=np.random.random((n,m,k))
B=np.random.random((m,j,k))
In [69]:

%timeit np.rollaxis(np.array(map(np.dot, np.rollaxis(A,2), np.rollaxis(B,2))), 0, 3)
10000 loops, best of 3: 73.7 µs per loop
In [70]:

%timeit np.einsum('nmk,mjk->njk',A,B)
100000 loops, best of 3: 13.5 µs per loop

Sure, this is for python 2.x, in 3.x, be aware that the map returns map objects.

CT Zhu
  • 49,083
  • 16
  • 110
  • 125
  • 2
    Two things: First the roll axises is an interesting idea; however if you run the python loop solution it is actually faster then your roll axis solution. Second the memory layout is what makes these operations so slow, if you change the arrays around so that `k` is the first axis all operations are much faster making the above solution slower due to the memory manipulations required. – Daniel May 10 '14 at 02:16