
In the top answer to the post "What are the advantages of stacking multiple LSTMs?", stacking LSTMs vertically is distinguished from stacking them horizontally. I don't quite understand what this means. When I think of stacking LSTMs, I think of the output of one LSTM layer being fed into another LSTM layer. Would this be vertical or horizontal stacking?

To me, that is the only way the word "stacking" makes sense. If you had two layers side by side (as with bidirectional layers, which are technically just two separate layers whose outputs are concatenated or averaged), it wouldn't make much sense to call them "stacked".
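To make the two arrangements I have in mind concrete, here is a minimal pure-Python sketch. Everything in it is an assumption for illustration: the helper names (`init_lstm`, `lstm_layer`, `stacked`, `bidirectional`) are hypothetical, the weights are small random numbers, and none of it comes from the linked answer or from any library. `stacked` feeds one layer's per-step outputs into the next layer, while `bidirectional` runs two independent layers over the same input and concatenates their outputs.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_lstm(input_size, hidden_size, rng):
    """Toy random weights for the four gates: input, forget, candidate, output."""
    def matrix(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {
        "W": {g: matrix(hidden_size, input_size + hidden_size) for g in "ifgo"},
        "b": {g: [0.0] * hidden_size for g in "ifgo"},
        "hidden_size": hidden_size,
    }


def lstm_layer(xs, params):
    """Run one LSTM layer over a sequence and return the hidden state at every step."""
    n = params["hidden_size"]
    h, c = [0.0] * n, [0.0] * n
    outputs = []
    for x in xs:
        z = x + h  # list concatenation: current input followed by previous hidden state
        pre = {
            g: [sum(w * v for w, v in zip(row, z)) + b
                for row, b in zip(params["W"][g], params["b"][g])]
            for g in "ifgo"
        }
        i = [sigmoid(v) for v in pre["i"]]
        f = [sigmoid(v) for v in pre["f"]]
        cand = [math.tanh(v) for v in pre["g"]]
        o = [sigmoid(v) for v in pre["o"]]
        c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, cand)]
        h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
        outputs.append(h)
    return outputs


def stacked(xs, layer1, layer2):
    """What I call stacking: layer 2 consumes the output sequence of layer 1."""
    return lstm_layer(lstm_layer(xs, layer1), layer2)


def bidirectional(xs, fwd, bwd):
    """Side by side: two separate layers read the same input, outputs are concatenated."""
    forward = lstm_layer(xs, fwd)
    backward = lstm_layer(xs[::-1], bwd)[::-1]  # re-align backward outputs with time
    return [f_out + b_out for f_out, b_out in zip(forward, backward)]


rng = random.Random(0)
seq = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(5)]  # 5 steps, 3 features

deep = stacked(seq, init_lstm(3, 4, rng), init_lstm(4, 2, rng))
wide = bidirectional(seq, init_lstm(3, 4, rng), init_lstm(3, 4, rng))

print(len(deep), len(deep[0]))  # 5 2
print(len(wide), len(wide[0]))  # 5 8
```

In the first case the second layer's input size has to match the first layer's hidden size (4 here). In the second case the two layers never see each other's outputs, and the feature dimension simply doubles.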

What is the other form of stacking, other than feeding one layer into another? And in what cases should you use one type of stacking vs the other?

tomjavg

0 Answers