The answer from @Billy (+1) gets to the critical points of the question you posed. These are just a few further thoughts on your modeling strategy that are too extensive to fit into comments.
First, from what you describe it's not clear what you will gain with elastic net. With 6000 cases and what seems to be a continuous outcome, you have a lot of flexibility in fitting your model without the variable omission and coefficient penalization involved in elastic net. By usual rules of thumb for biomedical studies, you could evaluate 300 or more predictors in a regression model without much risk of overfitting (6000 cases / 300 predictors is a case/predictor ratio of 20). If you have thousands of predictors, as with RNA sequencing (RNAseq) data, elastic net might make sense--depending on how you want to apply your model in the future.
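As a quick check of that arithmetic, here is a minimal sketch. Only the ratio of 20 comes from the paragraph above; the function name and the other ratios (10 and 15) are illustrative assumptions showing how the predictor budget changes with a looser rule.

```python
def max_predictors(n_cases: int, cases_per_predictor: int) -> int:
    """Largest number of candidate predictors allowed by a cases-per-predictor rule of thumb."""
    return n_cases // cases_per_predictor


n_cases = 6000
# 20 is the ratio used in the text; 10 and 15 are assumed here for comparison only.
for ratio in (10, 15, 20):
    print(f"ratio {ratio}: up to {max_predictors(n_cases, ratio)} predictors")
```

Note that the budget counts every coefficient you consider, including those for spline terms and interactions, not just the number of raw variables.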
Second, it's not clear precisely what you mean by a "non-linear model" in this context. Some models that appear to be non-linear, like those that fit outcomes to polynomial functions of predictors, are still "linear models" insofar as they are linear in the regression coefficients. Sometimes you need a truly non-linear model, but linear modeling can cover a remarkably wide range of applications. You can use regression splines to model predictors flexibly, apply non-linear transformations to variables before linear regression (like the log transform often used for RNAseq data), or use generalized linear models to get a nonlinear mapping between the linear predictor and the outcome. Those are all still considered linear models in an important technical sense.
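To make "linear in the regression coefficients" concrete, here is a small sketch with a quadratic in one predictor. The coefficient values and the helper name `poly_predict` are made up for illustration; exact fractions are used so the equality checks are not affected by floating-point rounding.

```python
from fractions import Fraction


def poly_predict(coefs, x):
    """Evaluate b0 + b1*x + b2*x**2 + ... for one predictor value x."""
    return sum(b * x**j for j, b in enumerate(coefs))


x = Fraction(3, 2)
beta_a = [Fraction(1), Fraction(-2), Fraction(1, 2)]  # hypothetical coefficients
beta_b = [Fraction(0), Fraction(4), Fraction(3)]      # hypothetical coefficients
c = Fraction(5)

# Linear in the coefficients: predicting with (c*beta_a + beta_b)
# gives c*pred(beta_a) + pred(beta_b).
combined = [c * a + b for a, b in zip(beta_a, beta_b)]
linear_in_coefs = poly_predict(combined, x) == (
    c * poly_predict(beta_a, x) + poly_predict(beta_b, x)
)

# Not linear in the predictor: doubling x does not double the prediction.
linear_in_x = poly_predict(beta_a, 2 * x) == 2 * poly_predict(beta_a, x)

print(linear_in_coefs, linear_in_x)
```

The first check holds for any coefficients and any `x`, which is why ordinary least squares can still fit such a model even though the fitted curve is not a straight line.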
Consider whether you really need a non-linear model for your application. If you can perform your "non-linear" modeling in the context of generalized linear models and you do need to use elastic net, standard tools allow you to do that together instead of separately.
Third, remember that extreme values aren't necessarily "outliers" if the values of the associated predictor variables are also appropriately extreme. What matters is whether the differences between the observed and the model-predicted values (the residuals) are large or vary systematically. You certainly don't want to remove extreme values as "outliers" at an early stage of analysis unless you know that they result from some technical error.
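A toy illustration of that distinction, using made-up data and a line whose intercept and slope are simply assumed rather than estimated: the most extreme outcome sits exactly on the line, while a mid-range outcome is the one with the large residual.

```python
def residuals(xs, ys, intercept, slope):
    """Observed minus model-predicted outcome for a straight-line model."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


# Made-up data: the last case is extreme in both x and y,
# the third case has an unremarkable y that the model predicts badly.
xs = [1, 2, 3, 4, 50]
ys = [2, 4, 16, 8, 100]

# Assumed model y = 0 + 2*x (hypothetical coefficients, not a fit to these data).
res = residuals(xs, ys, intercept=0, slope=2)

most_extreme_y = max(range(len(ys)), key=lambda i: ys[i])
largest_residual = max(range(len(res)), key=lambda i: abs(res[i]))

print(res)
print(most_extreme_y, largest_residual)
```

In a real analysis the coefficients would be estimated from the data, and a case with an extreme predictor value can pull the fitted line toward itself, so residuals are best examined together with leverage or influence measures.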
Fourth, do be sure to include your sites as predictors in the model. Even if the biochemical assays were all performed at the same central location, it's possible for differences among sites in sample handling, patient characteristics, etc. to be important in a way that requires some form of statistical control.
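One simple way to include site is as a set of indicator (dummy) columns with one site as the reference level. The sketch below uses hypothetical site labels and a hypothetical helper name; it is only one form of control, and a random effect for site in a mixed model is another option.

```python
def dummy_code(sites, reference):
    """Return (levels, rows): one 0/1 indicator column per non-reference site."""
    levels = sorted(set(sites) - {reference})
    rows = [[1 if s == level else 0 for level in levels] for s in sites]
    return levels, rows


# Hypothetical site label for each case.
sites = ["A", "B", "C", "A", "C"]
levels, rows = dummy_code(sites, reference="A")

print(levels)
for site, row in zip(sites, rows):
    print(site, row)
```

These columns would be appended to the other predictors in the design matrix, so each non-reference site gets its own intercept shift relative to the reference site.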
The search function on this site can lead you to much information about these issues. If you don't find an answer that helps with future questions, ask further focused questions. See this help page for ways to write questions that can help both you and other visitors to the site.