If p-values are not useful to look at after performing AIC variable selection (see "Why are p-values misleading after performing a stepwise selection?"), what is the right thing to do in a scientific paper in order to say: "these variables are the ones that are (likely to be) important"?
Should we consider all the selected variables? Or should we consider only the selected variables that have a very small p-value (for example, tightening the standard 0.05 threshold to 0.01)? Is there a formula for this?
Any reference would be appreciated. I have gone through several posts, but they contain many conflicting opinions, most of the time without references, which makes it hard to justify whatever approach I end up choosing.
EDIT: Some people also say to avoid AIC variable selection altogether ("You really want to avoid automated model selection methods, if you possibly can. If you must use one, try LASSO or LAR."), which makes me even more confused. What should one do with around 15 metrics when trying to infer which of them are useful for determining whether a patient has a certain disease? I feel like this is a relatively standard problem.
EDIT 2: I understand that my p-values will be badly biased. However, I am looking for a way to get unbiased p-values so that I can say: "these variables are statistically significantly correlated with the outcome variable".
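To make concrete what "unbiased p-values after selection" could mean, here is a minimal sketch of sample splitting: variables are selected on one half of the data and p-values are computed on the other half, which played no part in the selection. Everything in it is an assumption made for illustration: the data are simulated, the outcome is continuous (for a disease yes/no outcome a logistic model would take the place of OLS), the helper names `forward_aic` and `ols_z_pvalues` are hypothetical, and the p-values use a normal approximation instead of the exact t distribution. It is not meant as the recommended procedure, only as one possibility.

```python
import math
import numpy as np

def aic(X, y, cols):
    """Gaussian AIC (up to a constant) of an OLS fit with intercept on columns `cols`."""
    n = len(y)
    A = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = float(np.sum((y - A @ beta) ** 2))
    return n * math.log(rss / n) + 2 * A.shape[1]

def forward_aic(X, y):
    """Greedy forward selection: add the column that lowers AIC most, stop when none does."""
    selected, remaining = [], set(range(X.shape[1]))
    best = aic(X, y, selected)
    while remaining:
        score, j = min((aic(X, y, selected + [j]), j) for j in remaining)
        if score >= best:
            break
        selected.append(j)
        remaining.remove(j)
        best = score
    return sorted(selected)

def ols_z_pvalues(X, y):
    """OLS with intercept; two-sided p-values from a normal approximation (large n only)."""
    n = X.shape[0]
    A = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    sigma2 = resid @ resid / (n - A.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(A.T @ A)))
    z = beta / se
    p = np.array([math.erfc(abs(v) / math.sqrt(2)) for v in z])
    return beta, p

# Simulated data (assumption): 15 metrics, only columns 0 and 3 affect the outcome.
rng = np.random.default_rng(0)
n, p = 400, 15
X = rng.normal(size=(n, p))
y = 1.0 * X[:, 0] - 0.8 * X[:, 3] + rng.normal(size=n)

# Random split: one half for selection, the other half for inference.
idx = rng.permutation(n)
sel_idx, inf_idx = idx[: n // 2], idx[n // 2:]

selected = forward_aic(X[sel_idx], y[sel_idx])
beta, pvals = ols_z_pvalues(X[inf_idx][:, selected], y[inf_idx])

for j, pv in zip(selected, pvals[1:]):  # pvals[0] belongs to the intercept
    print(f"column {j}: p = {pv:.3g}")
```

The price of this approach is power: each step only sees half of the patients, and the result depends on the random split.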
statsmodels, a Python module. I'm not sure what the p-values actually refer to, but if I remember correctly, it's always the F-statistic for linear regression models? – FluidMechanics Potential Flows Jul 18 '22 at 15:12
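On what the p-values refer to: in a statsmodels OLS fit, the per-coefficient p-values (the `pvalues` attribute of the results) come from a t-test on each coefficient, while the F-statistic (`fvalue`, with p-value `f_pvalue`) tests all slopes jointly against an intercept-only model. The sketch below computes both statistics by hand with NumPy on simulated data (an assumption, as are all variable names) and checks that they coincide only in the special case of a single predictor, where F equals the squared t statistic of the slope.

```python
import numpy as np

# Simulated data (assumption): one predictor, continuous outcome.
rng = np.random.default_rng(1)
n = 100
x = rng.normal(size=n)
y = 2.0 + 0.5 * x + rng.normal(size=n)

# OLS fit with an intercept.
A = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
resid = y - A @ beta
dof = n - A.shape[1]

# One t statistic per coefficient: estimate divided by its standard error.
sigma2 = resid @ resid / dof
se = np.sqrt(np.diag(sigma2 * np.linalg.inv(A.T @ A)))
t_stats = beta / se

# One F statistic for the whole model: fitted model versus intercept-only model.
rss = float(resid @ resid)
tss = float(np.sum((y - y.mean()) ** 2))
num_slopes = A.shape[1] - 1
f_stat = ((tss - rss) / num_slopes) / (rss / dof)

print("t statistics (intercept, slope):", t_stats)
print("F statistic:", f_stat)
print("slope t squared:", t_stats[1] ** 2)
```

With several predictors there is still a single F statistic but one t statistic per variable, so the p-values listed next to each variable are the t-test ones.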