When each variable is used on its own, X1 is the best single predictor, but that is partly because it correlates with a combination of the other variables that predicts the data even better. For smaller $\lambda$ that combination enters the model, so the coefficient of X1 falls. X7 may have the largest adjusted effect, yet it works well as a predictor only in the context of the other variables. Depending on the correlations among the covariates, they may explain very little variation on their own; see these questions:
- How can delta (Δ) R² for a term be larger than R² for that term?
- Should individual $R^2$ of a predictor always be greater than $\Delta R^2$ when removing that predictor from an expanded model?
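To make that distinction concrete, here is a minimal sketch (my own made-up suppressor setup, not taken from the linked questions) where a predictor's individual $R^2$ is near zero while its $\Delta R^2$ in the full model is large:

    # suppressor example: z has almost no marginal R^2,
    # but dropping it from the full model costs a lot of R^2
    set.seed(2)
    n <- 1000
    z <- rnorm(n)
    x <- z + rnorm(n)        # x carries z as "noise"
    y <- x - z + rnorm(n)    # z cancels the z-part of x

    r2 <- function(m) summary(m)$r.squared
    r2_z     <- r2(lm(y ~ z))                        # individual R^2 of z: ~0
    delta_r2 <- r2(lm(y ~ x + z)) - r2(lm(y ~ x))    # Delta R^2 of z: ~0.25
    c(individual = r2_z, delta = delta_r2)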
I have constructed a simple example in R, where `x1` is only a noisy proxy for the true predictor `x2 - x3`:
set.seed(1)  # for reproducibility
n  <- 100
x2 <- rnorm(n)
x3 <- rnorm(n)
# y depends on x2 - x3; the opposite signs just separate the two paths in the plot
y  <- x2 - x3 + rnorm(n)
# x1 is a noisy proxy for the same combination
x1 <- (x2 - x3) + rnorm(n, sd = 0.5)

# among the single-predictor models, x1 has the lowest RSS (highest R^2)
anova(lm(y ~ x1), lm(y ~ x2), lm(y ~ x3))

library(glmnet)
mod <- glmnet(cbind(x1, x2, x3), y, alpha = 1)  # alpha = 1: lasso
plot(mod, xvar = "lambda", xlim = c(-3, 1))
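The same story can be read off numerically by inspecting the coefficients at a strong and a weak penalty (the two values of `s` below are illustrative, not tuned):

    coef(mod, s = exp(0))   # strong penalty: typically only x1, the best single predictor, has entered
    coef(mod, s = exp(-3))  # weak penalty: x2 and x3 take over and x1's coefficient shrinks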
