I'm trying to understand the Dropout algorithm. In this paper, the authors say that the nodes are randomly switched off with probability $p$ for each "Training Example".
Does this literally mean that every row in your dataset gets its own dropout layout? Or can "Training Example" also refer to an entire training batch?
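To make the question concrete, here is a minimal numpy sketch of inverted dropout as I understand it (my own illustration, not code from the paper; the function name and the 1/(1-p) scaling convention are assumptions). Because the mask is sampled with the full batch shape, each row ends up with its own layout, which is the first interpretation above:

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(x, p, training=True):
    """Inverted dropout: zero each activation independently with
    probability p, then scale survivors by 1/(1-p) so the expected
    activation is unchanged."""
    if not training or p == 0.0:
        return x
    mask = rng.random(x.shape) >= p  # fresh mask on every forward pass
    return x * mask / (1.0 - p)

# A batch of 4 examples with 5 features each: the mask has the same
# shape as the batch, so every row (example) gets a different layout.
batch = np.ones((4, 5))
print(dropout(batch, p=0.5))
```

If "Training Example" instead meant a batch, the mask would be sampled once per batch with shape `(1, 5)` and broadcast across the rows, so all examples in the batch would share the same dropped units.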