
Curiously, I just found out that my CPU is much faster for predictions: doing inference on the GPU is much slower than on the CPU.

I have a tf.keras (TF2) NN model with a single dense layer:

import numpy as np
import tensorflow as tf

input = tf.keras.layers.Input(shape=(100,), dtype='float32')
X = tf.keras.layers.Dense(2)(input)
model = tf.keras.Model(input, X)

# also initialized with weights from a file
weights = np.load("weights.npy", allow_pickle=True)
model.layers[-1].set_weights(weights)

scores = model.predict_on_batch(data)

For predictions on 100 samples I get:

2 s on the GPU
0.07 s on the CPU (!)

I am using a simple GeForce MX150 with 2 GB of memory.

I also tried predict_on_batch(x), since someone suggested it is faster than plain predict(), but here it takes the same time.
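
For reference, here is a minimal self-contained sketch of the kind of comparison I mean. It is only a sketch: random data stands in for my real input, and it assumes TF sees a GPU at /GPU:0. The first call on each device is a warm-up so one-time graph tracing is not counted, and tf.device forces placement:

import time

import numpy as np
import tensorflow as tf

data = np.random.rand(100, 100).astype('float32')  # 100 samples, 100 features

inp = tf.keras.layers.Input(shape=(100,), dtype='float32')
out = tf.keras.layers.Dense(2)(inp)
model = tf.keras.Model(inp, out)

for device in ('/GPU:0', '/CPU:0'):
    with tf.device(device):
        model.predict_on_batch(data)          # warm-up call (graph tracing)
        start = time.perf_counter()
        model.predict_on_batch(data)          # timed call
        print(device, 'took', time.perf_counter() - start, 's')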

See: Why does keras model predict slower after compile?

Does anyone have an idea what is going on here? What could the issue possibly be?

  • "As is this a simple 1 layer NN with tf.keras I think I do not need an example. I think it is a GPU related questions". You absolutely do need an example. Often times the problem in situations like these is something you did in calling the code. – mCoding Dec 18 '20 at 18:04
  • See my changes. – ctiid Dec 18 '20 at 18:10
  • There are hundreds of questions asking why this code runs slow on the GPU but fast on the CPU, and the answer is always the same: you are not putting enough load on the GPU (the model is very small) to overcome the communication between CPU and GPU, so the whole process is slower than just using the CPU (see the sketch after the comments). – Dr. Snoopy Dec 18 '20 at 21:36
  • Thanks. But what do you mean by 'you are not putting enough load on the GPU (model is very small) to overcome communication between CPU and GPU'? – ctiid Dec 19 '20 at 14:24
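
If the comment above is right, the effect should reverse once the per-call arithmetic dwarfs the CPU-to-GPU transfer cost. Here is a minimal sketch to check that, under assumptions not taken from the question itself: an arbitrarily larger layer (4096 units), a larger batch (8192 samples), and random data:

import time

import numpy as np
import tensorflow as tf

# Bigger model and batch so the matmul outweighs the CPU<->GPU copy.
# 4096/8192 are arbitrary sizes chosen for illustration.
inp = tf.keras.layers.Input(shape=(4096,), dtype='float32')
out = tf.keras.layers.Dense(4096)(inp)
model = tf.keras.Model(inp, out)

data = np.random.rand(8192, 4096).astype('float32')

for device in ('/CPU:0', '/GPU:0'):
    with tf.device(device):
        model.predict_on_batch(data)          # warm-up call
        start = time.perf_counter()
        model.predict_on_batch(data)          # timed call
        print(device, 'took', time.perf_counter() - start, 's')

Even on a card like the MX150, the GPU should only come out ahead once the work per call is large enough; for the original 100-sample, 100-to-2 model, the copy overhead dominates.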

0 Answers