I need to convince colleagues that variable selection on the same data you use for inference is a bad idea. I know of some general references on the problems with model selection, listed below -- but none is really appropriate for people with limited time and no mathematical background.
I'm looking for a short, punchy empirical demonstration (i.e., a simulation) of what can go wrong with variable selection, ideally containing a single chart that illustrates the problem clearly. A video might well work better than a book here...
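To make the kind of thing I have in mind concrete, here is a rough sketch of my own (not taken from the references below). It uses a simple "screen on p-values, then refit and report p-values on the same data" procedure as a stand-in for stepwise selection; all predictors are pure noise, so every "significant" coefficient it reports is a false positive. The threshold `screen_alpha`, the sample sizes, and the number of simulations are arbitrary choices of mine.

```python
import numpy as np
import statsmodels.api as sm
import matplotlib.pyplot as plt

rng = np.random.default_rng(42)
n, p, n_sims = 100, 20, 500
screen_alpha = 0.15           # entry threshold for the selection step (arbitrary)
selected_pvals = []           # p-values of selected variables in the refitted model

for _ in range(n_sims):
    X = rng.standard_normal((n, p))
    y = rng.standard_normal(n)     # y is independent of every predictor

    # Selection step: keep predictors whose univariate p-value beats the threshold.
    keep = [j for j in range(p)
            if sm.OLS(y, sm.add_constant(X[:, [j]])).fit().pvalues[1] < screen_alpha]

    # Inference step on the SAME data, pretending selection never happened.
    if keep:
        refit = sm.OLS(y, sm.add_constant(X[:, keep])).fit()
        selected_pvals.extend(refit.pvalues[1:])   # drop the intercept's p-value

selected_pvals = np.array(selected_pvals)
print(f"Selected coefficients 'significant' at 0.05: "
      f"{(selected_pvals < 0.05).mean():.0%} (all true coefficients are zero)")

# The single chart: post-selection p-values pile up near zero instead of
# being roughly uniform on [0, 1] as they would be without the selection step.
plt.hist(selected_pvals, bins=20, edgecolor="black")
plt.xlabel("p-value after selection (true coefficient = 0)")
plt.ylabel("count")
plt.title("Inference after variable selection on the same data")
plt.show()
```

Without the selection step, about 5% of the p-values would fall below 0.05; after screening at 0.15, roughly a third of the retained ones should, since the screen only keeps small p-values and the refit barely changes them. That is the single-chart message I'm after, but I'd still prefer something published or citable (or a well-made video) over my own toy script.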
Altman, D. G., & Andersen, P. K. (1989). Bootstrap investigation of the stability of a Cox regression model. Statistics in Medicine, 8(7), 771–783. https://doi.org/10.1002/sim.4780080702
Derksen, S., & Keselman, H. J. (1992). Backward, forward and stepwise automated subset selection algorithms: Frequency of obtaining authentic and noise variables. British Journal of Mathematical and Statistical Psychology, 45, 265–282. https://doi.org/10.1111/j.2044-8317.1992.tb00992.x
Harrell, F. E., Jr. (2016). Regression modeling strategies. Springer International Publishing.
Leeb, H., & Pötscher, B. M. (2005). Model selection and inference: Facts and fiction. Econometric Theory, 21(1), 21–59. http://www.jstor.org/stable/3533623