My answer is a combination of Peter's and mkt's advice. Basically, if you are worried about confounding, you should model it directly. And if you suspect your data is not linear, then again, you should model it that way, with the understanding that there may be structure in the data; for this, mgcv may be your friend.
Take this scatterplot for example. We can see that one portion of it is relatively flat/linear while another portion has an almost exponential relationship behind it:

Fitting a GAM to this data captures some of the nonlinearity, but much of what is going on is obscured by something in the data-generating process. We can see that in how the fitted line is drawn, rather bizarrely, through the empty space between the two clusters of points.

There is some omitted variable bias going on here. In actuality, this data was generated from two groups defined by a factor; the simulated data and initial fit are below:
#### Load packages ####
library(mgcv)
library(tibble)
#### N and Scale ####
set.seed(123)
n <- 100
scale <- 2
#### Create X and F ####
x <- runif(n, 0, 1)
f1 <- function(x, b, c = 1) exp(b * x) * c
f2 <- function(x, b, c = 1) exp(b * x) * c
f1_x <- f1(x, b = 4)
f2_x <- f2(x, b = 2)
e <- rnorm(n, 0, scale)
#### Create Factors ####
fac <- as.factor(sample(1:2,n,replace=TRUE))
fac1 <- as.numeric(fac==1)
fac2 <- as.numeric(fac==2)
#### Estimate Y ####
y <- x + f1_x*fac1 + f2_x*fac2 + e
#### Merge Data ####
df <- tibble(y = y, x = x, fac = fac)
df
#### Plot ####
plot(x,y,
pch=23,
main="Simulated Data",
xlab="X",
ylab="Y",
cex=1.5)
#### Fit Model ####
fit <- gam(
y ~ s(x),
data = df,
method = "REML"
)
#### Add Lines to Plot ####
newdata <- data.frame(
x = seq(
min(x),max(x),length.out=100
)
)
pred <- predict(fit,
newdata=newdata,
se.fit=TRUE)
lines(newdata$x,
pred$fit)
lines(newdata$x,
pred$fit + pred$se.fit,
lty="dashed")
lines(newdata$x,
pred$fit - pred$se.fit,
lty="dashed")
Now you may notice there is actually a factor variable here, which may be useful to visualize:
#### Plot ####
plot(x,y,
bg=c("gray","black")[fac],
pch=23,
main="Simulated Data",
xlab="X",
ylab="Y",
cex=1.5)

We can now see where the splitting is coming from. There are two distinct groups: one is linear while the other is nonlinear. We can model this directly in mgcv and plot the fitted lines like so:
#### Second Fit ####
fit2 <- gam(
y
~ fac
+ s(x, by = fac),
data = df,
method = "REML"
)
#### New Pred Lines ####
newdata2 <- data.frame(
x = seq(
min(x),max(x),length.out=100
),
fac = "1"
)
pred2 <- predict(fit2,
newdata = newdata2,
se.fit = TRUE)
newdata3 <- data.frame(
x = seq(
min(x),max(x),length.out=100
),
fac = "2"
)
pred3 <- predict(fit2,
newdata = newdata3,
se.fit = TRUE)
#### Plot ####
plot(x,y,
bg=c("gray","black")[fac],
pch=23,
main="Simulated Data",
xlab="X",
ylab="Y",
cex=1.5)
#### Fac 1 Lines ####
lines(newdata2$x,
pred2$fit)
lines(newdata2$x,
pred2$fit + pred2$se.fit,
lty="dashed")
lines(newdata2$x,
pred2$fit - pred2$se.fit,
lty="dashed")
#### Fac 2 Lines ####
lines(newdata3$x,
pred3$fit)
lines(newdata3$x,
pred3$fit + pred3$se.fit,
lty="dashed")
lines(newdata3$x,
pred3$fit - pred3$se.fit,
lty="dashed")
Now we have clearly demonstrated that the shape of this relationship depends on the factor variable, which confounded the initial fit:

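If you want something more formal than the picture, you can also compare the two fits. One caveat: comparing models that differ in their fixed effects is cleaner if both are refit with method = "ML" rather than "REML". The names fit_ml and fit2_ml below are just illustrative (a minimal sketch):
#### Compare Fits ####
fit_ml  <- update(fit,  method = "ML") # refit single-smooth model by ML
fit2_ml <- update(fit2, method = "ML") # refit factor-smooth model by ML
AIC(fit_ml, fit2_ml)                   # lower AIC indicates the better-supported model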