The splines package has functions bs and ns that create spline bases for use with the lm function. You can fit a plain linear model and a model including the splines, then use the anova function to run the full versus reduced model test and see whether the spline model fits significantly better than the linear model.
Here is some example code:
library(splines)

# simulated data with a clearly non-linear relationship
x <- rnorm(1000)
y <- sin(x) + rnorm(1000, 0, 0.5)

fit0 <- lm(y ~ 1)          # intercept-only model
fit1 <- lm(y ~ x)          # straight-line model
fit2 <- lm(y ~ bs(x, 5))   # B-spline model with 5 degrees of freedom

anova(fit1, fit2)   # does the spline fit significantly better than the straight line?
anova(fit0, fit2)   # overall test against the intercept-only model

plot(x, y, pch = '.')
abline(fit1, col = 'red')                       # linear fit
xx <- seq(min(x), max(x), length.out = 250)
yy <- predict(fit2, data.frame(x = xx))
lines(xx, yy, col = 'blue')                     # spline fit
You can also use the poly function to do a polynomial fit and test the non-linear terms as a test of curvature.
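For example, a minimal sketch continuing the simulated data above (the choice of a cubic is arbitrary):

fit3 <- lm(y ~ poly(x, 3))   # orthogonal polynomial of degree 3
summary(fit3)                # t tests on the quadratic and cubic terms test for curvature
anova(fit1, fit3)            # or test all non-linear terms at once against the linear fit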
For the loess fit it is a little more complicated. There are estimates of the equivalent degrees of freedom of a loess fit that could be used, along with the $R^2$ values for the linear and loess models, to construct an F test. I think methods based on bootstrapping and permutation tests may be more intuitive.
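For instance, here is a rough sketch of such a test built from the equivalent number of parameters (enp) and hat-matrix trace that a loess object reports; the degrees of freedom are approximations, so treat the resulting p-value as a guide rather than an exact test:

fit.lo <- loess(y ~ x)                  # default span and degree
rss.lin <- sum(resid(fit1)^2)           # residual SS of the straight line
rss.lo  <- sum(resid(fit.lo)^2)         # residual SS of the loess fit
df1 <- fit.lo$enp - 2                   # extra equivalent parameters beyond the line's 2
df2 <- length(y) - fit.lo$trace.hat     # approximate residual df for loess
Fstat <- ((rss.lin - rss.lo) / df1) / (rss.lo / df2)
pf(Fstat, df1, df2, lower.tail = FALSE) # approximate p-value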
There are also techniques to compute and plot a confidence interval for a loess fit (the ggplot2 package does this by default with geom_smooth). You can plot the confidence band and see whether a straight line would fit within the band (this is not a p-value, but it still gives a yes/no answer).
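A minimal sketch with ggplot2's geom_smooth, which adds a pointwise confidence band by default:

library(ggplot2)
ggplot(data.frame(x = x, y = y), aes(x, y)) +
  geom_point(size = 0.3) +
  geom_smooth(method = "loess") +                        # loess fit with confidence band
  geom_smooth(method = "lm", se = FALSE, col = "red")    # straight line for comparison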
You could fit a linear model, take the residuals, and fit a loess model with the residuals as the response (and the variable of interest as the predictor). If the true model is linear, this fit should be close to a flat line, and reordering the residuals relative to the predictor should not make any difference. You can use this to create a permutation test: fit the loess and find the predicted value farthest from 0; then randomly permute the residuals, fit a new loess, and find its predicted value farthest from 0; repeat a bunch of times. The p-value is the proportion of permuted values that are farther from 0 than the original value.
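A sketch of that permutation test on the simulated data above (the choice of 999 permutations is arbitrary):

r <- resid(fit1)                            # residuals from the linear fit
maxdev <- function(res) {
  lo <- loess(res ~ x)                      # loess of (possibly permuted) residuals on x
  max(abs(predict(lo)))                     # fitted value farthest from 0
}
obs <- maxdev(r)
perm <- replicate(999, maxdev(sample(r)))   # permute residuals relative to the predictor
mean(perm >= obs)                           # permutation p-value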
You may also want to look at cross-validation as a method of choosing the loess bandwidth. This does not give a p-value, but an infinite bandwidth corresponds to a perfectly linear fit. If the cross-validation suggests a very large bandwidth, that suggests a linear model may be reasonable; if the larger bandwidths are clearly inferior to some of the smaller bandwidths, that suggests definite curvature, and linear is not sufficient.
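A minimal 10-fold cross-validation sketch over a grid of span values (loess's version of the bandwidth; the grid here is arbitrary):

d <- data.frame(x = x, y = y)
spans <- c(seq(0.2, 1, by = 0.1), 1.5, 2)    # larger span = smoother fit
folds <- sample(rep(1:10, length.out = nrow(d)))
cv.mse <- sapply(spans, function(s) {
  pred <- numeric(nrow(d))
  for (k in 1:10) {
    lo <- loess(y ~ x, data = d[folds != k, ], span = s,
                control = loess.control(surface = "direct"))  # allow prediction outside the fold's range
    pred[folds == k] <- predict(lo, d[folds == k, ])
  }
  mean((d$y - pred)^2)
})
plot(spans, cv.mse, type = 'b')   # do the large spans compete with the small ones?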
Comments:

…anova with splines approach. For the F test from $R^2$, consider that $R^2$ is SSR divided by SST and $1-R^2$ is SSE divided by SST, so the ratio $\frac{R^2}{1-R^2}$ is just SSR divided by SSE (the two copies of SST cancel out). Include the degrees of freedom and a little algebra and you have the F statistic for overall regression. – Greg Snow Jan 31 '15 at 21:06

1) What is lm(y~bs(x,5)) doing, and why is it not lm(y~I(bs(x,5)))? I am quite confused by this call because the result of bs(x,5) is not a variable... 2) Do I understand it correctly that the p-value I am looking for is the result of anova(fit0,fit2)? – Tomas Jan 31 '15 at 21:23

The bs function creates a matrix of basis splines that is passed to the lm function, and lm knows how to deal with the results of functions like this. The anova with fit0 compares your splines to an overall mean; the one with fit1 compares them to a linear relationship. I think the 2nd is probably the more interesting. The comparison to fit0 is actually the same as the overall F test from summary(fit2). – Greg Snow Jan 31 '15 at 23:59

So lm(y~bs(x,5)) is not ordinary linear regression on the result of bs(x,5), but a completely different spline regression? Maybe this is the confusing part, because I thought lm only does linear regression. – Tomas Feb 01 '15 at 00:09

The bs function creates transforms of the $x$ variable and passes them to lm, which does the linear regression. – Greg Snow Feb 01 '15 at 00:20

…(if or step function, I think, right?). So this still confuses me a bit... – Tomas Feb 01 '15 at 01:40
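In symbols, the overall-regression F statistic sketched in the first comment above, for a model with $p$ predictors and $n$ observations, is

$$F = \frac{R^2/p}{(1-R^2)/(n-p-1)} = \frac{\mathrm{SSR}/p}{\mathrm{SSE}/(n-p-1)}.$$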