0

I implemented a simple RNN from scratch (using only the numpy library )for predicting the next characters, and I trained it on a simple text=“hello world”. It works fine, but I want to train it on a very large text. So I don’t know how I should train it. I understand I cannot train it on the large text at once because of the vanishing/exploding problem. So, I should train it in small batches, but I still don’t understand how that will work. How will the network learn from all the batches?”

Please note that I have implemented the entire RNN from scratch, including backpropagation through time, so I am familiar with the fundamentals.

  • The term of art you’re looking for is “truncated back-propagation through time.” – Sycorax Feb 05 '24 at 19:31
  • i know how back propagation through time works,but for a single text not batches –  Feb 05 '24 at 19:39

0 Answers0