I have two tensors a: [batch_size, dim] and b: [batch_size, dim].
I want to compute the inner product for every pair in the batch, producing c: [batch_size, 1], where c[i,0] = a[i,:].T * b[i,:]. How can I do this?
3 Answers
There is no native .dot_product method. However, a dot product between two vectors is just an element-wise multiply followed by a sum, so the following example works:
import tensorflow as tf
# Arbitrarily, we'll use placeholders and allow batch size to vary,
# but fix vector dimensions.
# You can change this as you see fit
a = tf.placeholder(tf.float32, shape=(None, 3))
b = tf.placeholder(tf.float32, shape=(None, 3))
c = tf.reduce_sum( tf.multiply( a, b ), 1, keep_dims=True )
with tf.Session() as session:
    print( c.eval(
        feed_dict={ a: [[1,2,3],[4,5,6]], b: [[2,3,4],[5,6,7]] }
    ) )
The output is:
[[ 20.]
[ 92.]]
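(Side note: this answer predates TensorFlow 2.x. In current releases the reduce_sum argument is spelled keepdims rather than keep_dims, and placeholders/sessions are gone. A minimal eager-mode sketch of the same idea, assuming TF 2.x:)

import tensorflow as tf

# TF 2.x sketch of the same element-wise multiply + sum (eager execution)
a = tf.constant([[1., 2., 3.], [4., 5., 6.]])
b = tf.constant([[2., 3., 4.], [5., 6., 7.]])
c = tf.reduce_sum(tf.multiply(a, b), axis=1, keepdims=True)
print(c.numpy())  # [[20.], [92.]]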
- It solved my problem, thx! – HenrySky Nov 10 '16 at 08:52
- tf.mul is now tf.multiply. https://github.com/tensorflow/tensorflow/issues/7032 – Rahul Jha Sep 13 '17 at 21:30
- There's seemingly nothing TF developers love more than changing the API... – Emre Sep 14 '17 at 07:42
- https://www.tensorflow.org/api_docs/python/tf/keras/backend/batch_dot – sajed zarrinpour May 22 '20 at 16:08
- @sajedzarrinpour Thanks. I hope that appeared some time between 2016 and now? Will adjust my answer appropriately – Neil Slater May 22 '20 at 16:39
Another option worth checking out is tf.einsum - it's essentially a simplified version of Einstein notation.
Following along with Neil and dumkar's examples:
import tensorflow as tf
a = tf.placeholder(tf.float32, shape=(None, 3))
b = tf.placeholder(tf.float32, shape=(None, 3))
c = tf.einsum('ij,ij->i', a, b)
with tf.Session() as session:
    print( c.eval(
        feed_dict={ a: [[1,2,3],[4,5,6]], b: [[2,3,4],[5,6,7]] }
    ) )
The first argument to einsum is an equation representing the axes to be multiplied and summed over. The basic rules for an equation are:
- Input tensors are described by a comma-separated string of dimension labels
- Repeated labels indicate that the corresponding dimensions will be multiplied
- The output tensor is described by another string of dimension labels representing the corresponding inputs (or products)
- Labels that are missing from the output string are summed over
In our case, ij,ij->i means that our inputs will be 2 matrices of equal shape (i,j), and our output will be a vector of shape (i,).
Once you get the hang of it, you'll find that einsum generalizes a huge number of other operations:
X = [[1, 2]]
Y = [[3, 4], [5, 6]]
einsum('ab->ba', X) == [[1],[2]] # transpose
einsum('ab->a', X) == [3] # sum over last dimension
einsum('ab->', X) == 3 # sum over both dimensions
einsum('ab,bc->ac', X, Y) == [[13,16]] # matrix multiply
einsum('ab,bc->abc', X, Y) == [[[3,4],[10,12]]] # multiply and broadcast
Unfortunately, einsum takes a pretty hefty performance hit when compared to a manual multiply+reduce. Where performance is critical, I'd definitely recommend sticking with Neil's solution.
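If performance matters for your sizes, it's worth timing both on your own hardware. A rough benchmark sketch (same TF 1.x placeholder/session style as above; the batch size and vector length here are arbitrary choices):

import time
import numpy as np
import tensorflow as tf

# Compare multiply + reduce_sum against einsum on random data
a = tf.placeholder(tf.float32, shape=(None, 300))
b = tf.placeholder(tf.float32, shape=(None, 300))
c_reduce = tf.reduce_sum(tf.multiply(a, b), 1, keep_dims=True)
c_einsum = tf.einsum('ij,ij->i', a, b)

feed = {a: np.random.rand(10000, 300).astype(np.float32),
        b: np.random.rand(10000, 300).astype(np.float32)}

with tf.Session() as session:
    for name, op in [('multiply+reduce_sum', c_reduce), ('einsum', c_einsum)]:
        start = time.perf_counter()
        for _ in range(100):
            session.run(op, feed_dict=feed)
        print(name, time.perf_counter() - start, 'seconds for 100 runs')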
Taking the diagonal of tf.tensordot also does what you want, if you set axes to e.g. [[1], [1]].
I have adapted Neil Slater's example:
import tensorflow as tf
# Arbitrarily, we'll use placeholders and allow batch size to vary,
# but fix vector dimensions.
# You can change this as you see fit
a = tf.placeholder(tf.float32, shape=(None, 3))
b = tf.placeholder(tf.float32, shape=(None, 3))
c = tf.diag_part(tf.tensordot( a, b, axes=[[1],[1]]))
with tf.Session() as session:
    print( c.eval(
        feed_dict={ a: [[1,2,3],[4,5,6]], b: [[2,3,4],[5,6,7]] }
    ) )
which now also gives:
[ 20. 92.]
This might be suboptimal for large matrices though (see discussion here)
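To see where the extra cost comes from, here is a sketch of the intermediate shapes (reusing the placeholders from the example above):

# tensordot first forms the full matrix of all pairwise dot products,
# and diag_part then keeps only its diagonal. For batch size n that is
# O(n^2) memory and compute, versus O(n) for multiply + reduce_sum.
full = tf.tensordot(a, b, axes=[[1], [1]])  # shape: (batch_size, batch_size)
c = tf.diag_part(full)                      # shape: (batch_size,)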
- The march of progress :-), I'm not sure which API version this was added in? I suggest you expand your answer with a short example (perhaps based on mine, but it should be simpler, since it won't need the reduce_sum) – Neil Slater Jul 18 '17 at 11:44
- I added the example! Actually it also gives off-diagonal dot-products if you don't use tf.diag_part, so your answer will probably be faster. Not really sure in which API version tf.tensordot got introduced, but it might be long ago since it is also available in numpy. – dumkar Jul 18 '17 at 12:04
- Wouldn’t this take much more memory than the element-wise multiply and sum? – kbrose Nov 13 '17 at 16:30