Simple RNN for predicting the next character

Asked Feb 05 '24 at 19:23

Active Feb 05 '24 at 20:29

Viewed 37 times

I implemented a simple RNN from scratch (using only NumPy) to predict the next character, and I trained it on a short text, “hello world”. It works fine, but now I want to train it on a very large text, and I am not sure how to do that. As I understand it, I cannot backpropagate through the whole text at once because of the vanishing/exploding gradient problem, so I should train on small batches (chunks of the text). What I still don’t understand is how that works: how does the network learn from all of the batches?

Please note that I have implemented the entire RNN from scratch, including backpropagation through time, so I am familiar with the fundamentals.

edited Feb 05 '24 at 20:03


The term of art you’re looking for is “truncated back-propagation through time.” – Sycorax Feb 05 '24 at 19:31
I know how backpropagation through time works, but only for a single text, not for batches. – Feb 05 '24 at 19:39
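
For readers following the comments, here is a minimal NumPy sketch of what truncated backpropagation through time could look like for a character-level RNN. It is not the asker's code: the function name `train_tbptt`, the hyperparameters, the tanh/softmax architecture, and the gradient clipping threshold are all assumptions made for illustration. The idea it shows is that the text is cut into fixed-length chunks, gradients are backpropagated only within each chunk, the same weights are updated after every chunk, and the final hidden state of one chunk is carried over as the initial state of the next.

```python
import numpy as np

def train_tbptt(text, hidden_size=16, seq_len=8, epochs=30, lr=0.1, seed=0):
    """Hypothetical helper: vanilla char-RNN trained with truncated BPTT.
    Returns the mean per-character loss of each pass over the text."""
    rng = np.random.default_rng(seed)
    chars = sorted(set(text))
    idx = {c: i for i, c in enumerate(chars)}
    data = np.array([idx[c] for c in text])
    V, H = len(chars), hidden_size

    # One set of weights, shared by every chunk of the text.
    Wxh = rng.normal(0, 0.1, (H, V))
    Whh = rng.normal(0, 0.1, (H, H))
    Why = rng.normal(0, 0.1, (V, H))
    bh, by = np.zeros(H), np.zeros(V)
    params = [Wxh, Whh, Why, bh, by]

    history = []
    for _ in range(epochs):
        h = np.zeros(H)  # reset the hidden state at the start of each pass
        total, count = 0.0, 0
        for start in range(0, len(data) - 1, seq_len):
            y = data[start + 1:start + seq_len + 1]  # next-character targets
            x = data[start:start + len(y)]           # last chunk may be shorter

            # Forward pass over this chunk only.
            hs, ps, loss = [h], [], 0.0
            for t in range(len(x)):
                h_t = np.tanh(Wxh[:, x[t]] + Whh @ hs[-1] + bh)
                logits = Why @ h_t + by
                p = np.exp(logits - logits.max())
                p /= p.sum()
                loss -= np.log(p[y[t]])
                hs.append(h_t)
                ps.append(p)

            # Backward pass: gradients stop at the chunk boundary.
            grads = [np.zeros_like(w) for w in params]
            dWxh, dWhh, dWhy, dbh, dby = grads
            dh_next = np.zeros(H)
            for t in reversed(range(len(x))):
                dlogits = ps[t].copy()
                dlogits[y[t]] -= 1.0           # softmax + cross-entropy gradient
                dWhy += np.outer(dlogits, hs[t + 1])
                dby += dlogits
                dh = Why.T @ dlogits + dh_next
                draw = (1.0 - hs[t + 1] ** 2) * dh  # back through tanh
                dWxh[:, x[t]] += draw
                dWhh += np.outer(draw, hs[t])
                dbh += draw
                dh_next = Whh.T @ draw

            # Update the shared weights after every chunk (plain SGD, clipped).
            for w, g in zip(params, grads):
                np.clip(g, -5.0, 5.0, out=g)
                w -= lr * g / len(x)

            h = hs[-1]  # carry the state forward; no gradient flows through it
            total += loss
            count += len(x)
        history.append(total / count)
    return history

losses = train_tbptt("hello world " * 20)
print(losses[0], losses[-1])
```

Under these assumptions, the network "learns from all the batches" simply because every chunk updates the same weight matrices, so each update builds on the previous ones, while the carried-over hidden state gives each chunk some context from the text that came before it.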


0 Answers