I have this plot, where I found that the best number of degrees of freedom is 8, since it minimizes the SSE when predicting from the training data. However, the fitted curve does not reach the final (8th) knot of the spline. Am I doing something wrong in how I am plotting? If not, what is happening?

library(splines)  # ns()
library(ggplot2)

# Grid of x values over the observed range, then fit and predict with standard errors
xlims_sample2_test <- range(sample2_test$x)
x.grid_sample2_test <- seq(from = xlims_sample2_test[1], to = xlims_sample2_test[2])
sample2_test_fit <- lm(y ~ ns(x, df = 8), data = sample2_test)
sample2_test_pred <- predict(sample2_test_fit, newdata = data.frame(x = x.grid_sample2_test), se.fit = TRUE)

# Points, fitted curve (red), +/- 2 SE bands (dashed black), knot locations (dashed grey)
ggplot(sample2_test, aes(x = x, y = y)) +
  geom_point(pch = 1) +
  geom_line(data = data.frame(x.grid_sample2_test, sp = sample2_test_pred$fit),
            aes(x = x.grid_sample2_test, y = sp), color = "red") +
  geom_line(data = data.frame(x.grid_sample2_test, ub = sample2_test_pred$fit + 2 * sample2_test_pred$se.fit),
            aes(x = x.grid_sample2_test, y = ub), linetype = 2, color = "black", lwd = 0.65) +
  geom_line(data = data.frame(x.grid_sample2_test, lb = sample2_test_pred$fit - 2 * sample2_test_pred$se.fit),
            aes(x = x.grid_sample2_test, y = lb), linetype = 2, color = "black", lwd = 0.65) +
  geom_vline(xintercept = attributes(ns(sample2_test$x, df = 8))$knots, linetype = "dashed", color = "grey30") +
  labs(title = "Cubic Spline of sample 2 testing, df = 8")

[Image: the plot produced by the code above]

Sam
  • Using ns with $8$ degrees of freedom fits a spline with a total of $9$ knots: $2$ boundary knots and $7$ internal knots. See here for example. – COOLSerdash Mar 23 '23 at 15:43
  • There shouldn't be sharp corners in a spline fit like this. The whole point is to make the fit appear to be smooth. Looks like something is wrong in your code somewhere. Providing a minimal reproducible example with data would help a lot. – EdM Mar 23 '23 at 15:49
  • On your 2nd line, inside your seq() function, try adding the argument length.out = 1000 or something. Right now x.grid_sample2_test only has something like 7 or 8 distinct values, so you only evaluate predict() at these few x-values, and geom_line() just connects them with straight line segments. That's why you don't see the spline's actual curvature, and it might also explain why the right end of the spline is missing. – civilstat Mar 23 '23 at 16:09
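
The two points raised in the comments (the knot count and the coarse prediction grid) can be checked with a minimal sketch along the following lines. The original sample2_test was not posted, so the data here are simulated purely as a stand-in, and the names coarse_grid, dense_grid, and plot_data are illustrative rather than taken from the question.

```r
library(splines)
library(ggplot2)

# ASSUMPTION: simulated stand-in data; the real sample2_test is not available.
set.seed(1)
sample2_test <- data.frame(x = runif(200, min = 0.3, max = 7.6))
sample2_test$y <- sin(sample2_test$x) + rnorm(200, sd = 0.3)

fit <- lm(y ~ ns(x, df = 8), data = sample2_test)

# ns(df = 8) places df - 1 = 7 interior knots plus 2 boundary knots.
basis <- ns(sample2_test$x, df = 8)
interior_knots <- attr(basis, "knots")
boundary_knots <- attr(basis, "Boundary.knots")

xlims <- range(sample2_test$x)

# Grid as in the question: seq() steps by 1 by default, so it yields only a
# handful of points and can stop short of the largest x value.
coarse_grid <- seq(from = xlims[1], to = xlims[2])

# Dense grid: 1000 evenly spaced points covering the full range of x.
dense_grid <- seq(from = xlims[1], to = xlims[2], length.out = 1000)

pred <- predict(fit, newdata = data.frame(x = dense_grid), se.fit = TRUE)
plot_data <- data.frame(
  x     = dense_grid,
  fit   = pred$fit,
  lower = pred$fit - 2 * pred$se.fit,
  upper = pred$fit + 2 * pred$se.fit
)

# Same layers as the question's plot, drawn from one data frame.
p <- ggplot(sample2_test, aes(x = x, y = y)) +
  geom_point(pch = 1) +
  geom_line(data = plot_data, aes(y = fit), color = "red") +
  geom_line(data = plot_data, aes(y = lower), linetype = 2) +
  geom_line(data = plot_data, aes(y = upper), linetype = 2) +
  geom_vline(xintercept = interior_knots, linetype = "dashed", color = "grey30") +
  labs(title = "Natural cubic spline, df = 8, dense prediction grid")
print(p)
```

Comparing length(coarse_grid) with length(dense_grid), and max(coarse_grid) with xlims[2], shows why the coarse version draws straight segments and may end before the last data point. The vertical lines mark only the interior knots; the two boundary knots sit at the ends of the x range.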
