I have two tensors a: [batch_size, dim] and b: [batch_size, dim].
I want to compute the inner product for every pair in the batch, producing c: [batch_size, 1], where c[i,0] = a[i,:].T * b[i,:]. How can I do this?
3 Answers
There is no native .dot_product method. However, a dot product between two vectors is just an element-wise multiply followed by a sum, so the following example works:
import tensorflow as tf
# Arbitrarily, we'll use placeholders and allow batch size to vary,
# but fix vector dimensions.
# You can change this as you see fit
a = tf.placeholder(tf.float32, shape=(None, 3))
b = tf.placeholder(tf.float32, shape=(None, 3))
c = tf.reduce_sum( tf.multiply( a, b ), 1, keep_dims=True )
with tf.Session() as session:
    print( c.eval(
        feed_dict={ a: [[1,2,3],[4,5,6]], b: [[2,3,4],[5,6,7]] }
    ) )
The output is:
[[ 20.]
[ 92.]]
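(Side note: this answer predates TensorFlow 2.x. In current releases the reduce_sum argument is spelled keepdims rather than keep_dims, and placeholders/sessions are gone. A minimal eager-mode sketch of the same idea, assuming TF 2.x:)

import tensorflow as tf

# TF 2.x sketch of the same element-wise multiply + sum (eager execution)
a = tf.constant([[1., 2., 3.], [4., 5., 6.]])
b = tf.constant([[2., 3., 4.], [5., 6., 7.]])
c = tf.reduce_sum(tf.multiply(a, b), axis=1, keepdims=True)
print(c.numpy())  # [[20.], [92.]]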
- It solved my problem, thx! – HenrySky Nov 10 '16 at 08:52
- tf.mul is now tf.multiply. https://github.com/tensorflow/tensorflow/issues/7032 – Rahul Jha Sep 13 '17 at 21:30
- There's seemingly nothing TF developers love more than changing the API... – Emre Sep 14 '17 at 07:42
- https://www.tensorflow.org/api_docs/python/tf/keras/backend/batch_dot – sajed zarrinpour May 22 '20 at 16:08
- @sajedzarrinpour Thanks. I hope that appeared some time between 2016 and now? Will adjust my answer appropriately – Neil Slater May 22 '20 at 16:39
Another option worth checking out is tf.einsum - it's essentially a simplified version of Einstein notation.
Following along with Neil and dumkar's examples:
import tensorflow as tf
a = tf.placeholder(tf.float32, shape=(None, 3))
b = tf.placeholder(tf.float32, shape=(None, 3))
c = tf.einsum('ij,ij->i', a, b)
with tf.Session() as session:
    print( c.eval(
        feed_dict={ a: [[1,2,3],[4,5,6]], b: [[2,3,4],[5,6,7]] }
    ) )
The first argument to einsum is an equation representing the axes to be multiplied and summed over. The basic rules for an equation are:
- Input tensors are described by a comma-separated string of dimension labels
- Repeated labels indicate that the corresponding dimensions will be multiplied
- The output tensor is described by another string of dimension labels representing the corresponding inputs (or products)
- Labels that are missing from the output string are summed over
In our case, ij,ij->i means that our inputs will be 2 matrices of equal shape (i,j), and our output will be a vector of shape (i,).
Once you get the hang of it, you'll find that einsum generalizes a huge number of other operations:
X = [[1, 2]]
Y = [[3, 4], [5, 6]]
einsum('ab->ba', X) == [[1],[2]] # transpose
einsum('ab->a', X) == [3] # sum over last dimension
einsum('ab->', X) == 3 # sum over both dimensions
einsum('ab,bc->ac', X, Y) == [[13,16]] # matrix multiply
einsum('ab,bc->abc', X, Y) == [[[3,4],[10,12]]] # multiply and broadcast
Unfortunately, einsum takes a pretty hefty performance hit when compared to a manual multiply+reduce. Where performance is critical, I'd definitely recommend sticking with Neil's solution.
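If performance matters for your sizes, it's worth timing both on your own hardware. A rough benchmark sketch (same TF 1.x placeholder/session style as above; the batch size and vector length here are arbitrary choices):

import time
import numpy as np
import tensorflow as tf

# Compare multiply + reduce_sum against einsum on random data
a = tf.placeholder(tf.float32, shape=(None, 300))
b = tf.placeholder(tf.float32, shape=(None, 300))
c_reduce = tf.reduce_sum(tf.multiply(a, b), 1, keep_dims=True)
c_einsum = tf.einsum('ij,ij->i', a, b)

feed = {a: np.random.rand(10000, 300).astype(np.float32),
        b: np.random.rand(10000, 300).astype(np.float32)}

with tf.Session() as session:
    for name, op in [('multiply+reduce_sum', c_reduce), ('einsum', c_einsum)]:
        start = time.perf_counter()
        for _ in range(100):
            session.run(op, feed_dict=feed)
        print(name, time.perf_counter() - start, 'seconds for 100 runs')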
Taking the diagonal of tf.tensordot also does what you want, if you set axes to e.g. [[1], [1]].
I have adapted Neil Slater's example:
import tensorflow as tf
# Arbitrarily, we'll use placeholders and allow batch size to vary,
# but fix vector dimensions.
# You can change this as you see fit
a = tf.placeholder(tf.float32, shape=(None, 3))
b = tf.placeholder(tf.float32, shape=(None, 3))
c = tf.diag_part(tf.tensordot( a, b, axes=[[1],[1]]))
with tf.Session() as session:
    print( c.eval(
        feed_dict={ a: [[1,2,3],[4,5,6]], b: [[2,3,4],[5,6,7]] }
    ) )
which now also gives:
[ 20. 92.]
This might be suboptimal for large matrices though (see discussion here)
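To see where the extra cost comes from, here is a sketch of the intermediate shapes (reusing the placeholders from the example above):

# tensordot first forms the full matrix of all pairwise dot products,
# and diag_part then keeps only its diagonal. For batch size n that is
# O(n^2) memory and compute, versus O(n) for multiply + reduce_sum.
full = tf.tensordot(a, b, axes=[[1], [1]])  # shape: (batch_size, batch_size)
c = tf.diag_part(full)                      # shape: (batch_size,)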
- The march of progress :-), I'm not sure which API version this was added in? I suggest you expand your answer with a short example (perhaps based on mine, but it should be simpler, since it won't need the reduce_sum) – Neil Slater Jul 18 '17 at 11:44
- I added the example! Actually it also gives off-diagonal dot-products if you don't use tf.diag_part, so your answer will probably be faster. Not really sure in which API version tf.tensordot got introduced, but it might be long ago since it is also available in numpy. – dumkar Jul 18 '17 at 12:04
- Wouldn’t this take much more memory than the element-wise multiply and sum? – kbrose Nov 13 '17 at 16:30