I am new to ML and trying to write a very simple model that predicts the future values of a function given its past values. That is, I define a simple cubic function (y = x^3 + x^2 + x) and evaluate it at 100 points from x = -10 to x = 10. I would like to use the first 80 points -- params = y[:80] -- to train some weights via SGD, where the weights are drawn from a normal distribution with shape (20, 80). I chose this shape because weights @ params returns an array of 20 values, which I compare against trgts = y[80:] using torch.nn.MSELoss.
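For reference, here is a quick sanity check of the shapes I have in mind (plain numpy, no training involved):

import numpy as np

x = np.linspace(-10, 10, 100)
y = x**3 + x**2 + x
params = y[:80]                              # the 80 "known" points
trgts = y[80:]                               # the 20 points I want to predict
weights = np.random.normal(0, 3, (20, 80))   # same shape as in the real code below
preds = weights @ params                     # (20, 80) @ (80,) -> (20,)
print(preds.shape, trgts.shape)              # (20,) (20,) -- same length, so MSE is well-defined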
When I run my code, which I post in full below, the loss diverges rather than converging, and I would like to understand why.
import numpy as np
import torch
x = np.linspace(-10,10,100)
y = x**3 + x**2 + x
loss_fn = torch.nn.MSELoss(reduction = 'sum')
def get_trgts(i): return torch.tensor(y[i:])
def get_weights(i,o): return torch.tensor(np.random.normal(0,3,(o,i))).requires_grad_()
def get_params(i): return torch.tensor(y[:i])
weights = get_weights(80,20)
trgts = get_trgts(80)
params = get_params(80)
bias = get_weights(1,1)
lr = 1e-6
def apply_step(weights, bias, prn=True):
    preds = weights @ params + bias     # linear prediction of the last 20 points
    loss = loss_fn(preds, trgts)
    loss.backward()
    with torch.no_grad():               # manual SGD update
        weights -= weights.grad * lr
        bias -= bias.grad * lr
        weights.grad.zero_()
        bias.grad.zero_()
    if prn: print(loss.item())
    return preds
for i in range(10): apply_step(weights, bias)
The printed output in one case is:
1305310646.9058952
184198069792.42996
25992991779912.277
3667984265155172.0
5.176052331083886e+17
7.304152852734643e+19
1.0307208174021644e+22
1.4544950315880012e+24
2.0525012798773587e+26
2.8963739390011883e+28
The loss is evidently diverging.
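Just to quantify that: taking the ratio of consecutive printed losses shows the loss growing by roughly the same factor (about 140x) every step:

losses = [1305310646.9058952, 184198069792.42996, 25992991779912.277, 3667984265155172.0]
print([round(losses[i + 1] / losses[i], 1) for i in range(len(losses) - 1)])
# [141.1, 141.1, 141.1] -- a steady geometric blow-up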
P.S. I understand that in its current form the model is useless; I'm simply using it to learn the principles.