First, Frank Harrell discusses many approaches to this problem in Regression Modeling Strategies (RMS), especially Chapters 4 and 5. There's nothing wrong with selecting predictors based on your understanding of the subject matter, grouping related predictors into single combined predictors, or using other methods of data reduction, provided that you don't use the outcomes in making those choices. If you don't use outcomes in this process, you don't inflate the Type I error rate. That type of data reduction can be a useful first step in any event. Don't underestimate the importance of applying subject-matter knowledge first, something that can be underemphasized in discussions of machine learning from "big data."
Second, with a binary outcome like yours it's wise to include as many outcome-associated predictors in the model as possible. Otherwise there is a risk of omitted-variable bias, even when the omitted predictors are uncorrelated with those in the model (a consequence of the non-collapsibility of the odds ratio). The task is thus to include as many outcome-associated predictors as you can without overfitting.
Third, L2-penalized maximum-likelihood estimation (extending ridge regression to a binary outcome) is a well-respected choice if there are still too many predictors. It keeps all predictors in the model while shrinking their coefficients to minimize overfitting. If there is a particular predictor of major interest, you could choose not to penalize its coefficient while penalizing those of the covariates you are adjusting for. Be careful with penalization when you have categorical predictors, however; see Section 9.11 of RMS and this page and its links.
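For concreteness, here is a minimal sketch of that approach, assuming Python's scikit-learn and simulated stand-in data (in practice X would be your predictor matrix and y the binary enrollment indicator). Note that scikit-learn applies a single penalty to all coefficients; leaving one predictor of special interest unpenalized would need a tool with per-predictor penalty factors, such as R's glmnet.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegressionCV

# Simulated stand-in data: 500 patients, 40 candidate predictors.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 40))
logit = -0.5 + 1.0 * X[:, 0] + 0.5 * X[:, 1]
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))  # binary outcome (enrolled or not)

# Standardize so the single L2 penalty treats all predictors comparably,
# then choose the penalty strength by 10-fold cross-validation.
ridge = make_pipeline(
    StandardScaler(),
    LogisticRegressionCV(Cs=20, cv=10, penalty="l2",
                         scoring="neg_log_loss", max_iter=5000),
)
ridge.fit(X, y)

# Shrunken coefficients for all 40 predictors (none are dropped).
coefs = ridge.named_steps["logisticregressioncv"].coef_.ravel()
```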
Fourth, you could use learning methods like boosted trees. If they learn slowly, they can use all the data and incorporate unsuspected interactions among predictors without overfitting. The resulting model can be very good at prediction but is typically difficult to interpret in terms of the individual predictors. One way to simplify the model and improve interpretability is to fit the tree-based model, collect its predicted log-odds estimates, and then use those predictions as the outcome in a linear regression on the predictor variables, as sketched below.
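A minimal sketch of that two-step idea, again assuming scikit-learn and simulated stand-in data; GradientBoostingClassifier and the particular "slow learning" settings are illustrative choices, not a prescription.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LinearRegression

# Simulated stand-in data with an interaction the trees might pick up.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 10))
logit = X[:, 0] + 0.5 * X[:, 1] - 0.5 * X[:, 0] * X[:, 2]
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))

# "Slow learning": many shallow trees with a small learning rate.
boost = GradientBoostingClassifier(n_estimators=1000, learning_rate=0.01,
                                   max_depth=2, subsample=0.5, random_state=0)
boost.fit(X, y)

# For a binary outcome, decision_function() returns predictions on the
# log-odds scale.
log_odds = boost.decision_function(X)

# Linear-regression summary of the tree model's predictions: rough
# per-predictor contributions on the log-odds scale.
summary = LinearRegression().fit(X, log_odds)
print(summary.coef_)
```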
Types of bias
It's important to distinguish selection bias from the bias introduced by penalized regression.
Selection bias means that the analyzed data sample doesn't adequately represent the underlying population, so that estimates based on the analysis don't reflect what you would find in the full population. Consider your situation: a large set of patients, of whom only a subset enrolls in trials.
A trial based on your enrollees would suffer from selection bias if they don't represent the underlying population. A model of who chooses to enroll, based on your electronic records, won't suffer from selection bias provided that your full set of records adequately represents the underlying population.
Penalized regression doesn't produce selection bias. It introduces a different type of bias, a downward bias of the magnitudes of regression coefficient estimates that leads to a corresponding bias in model predictions.
That's a choice made to improve how the model will work on the underlying population, via the bias-variance tradeoff. See Section 2.2.2 of ISLR. A low ratio of cases to predictors in building a model can lead to excessive variance when you apply the model to the broader population. Deliberately introducing a small amount of bias in coefficient estimates and model predictions via penalized regression can provide a more-than-corresponding decrease in variance and greatly improve model performance.
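As a rough illustration of that tradeoff, here is a sketch assuming scikit-learn and a simulated data set with only two cases per predictor, comparing cross-validated log loss for a nearly unpenalized logistic fit and a ridge-penalized one. With settings like these the penalized fit typically does better out of sample, though the exact numbers depend on the simulated draw.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Simulated data with a low ratio of cases to predictors (120 vs 60).
rng = np.random.default_rng(2)
X = rng.normal(size=(120, 60))
logit = 0.5 * X[:, :5].sum(axis=1)
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))

# Compare out-of-sample log loss with essentially no penalty (huge C)
# versus a moderate ridge penalty (small C).
for label, C in [("nearly unpenalized", 1e6), ("ridge-penalized", 0.1)]:
    model = LogisticRegression(penalty="l2", C=C, max_iter=10000)
    score = cross_val_score(model, X, y, cv=5, scoring="neg_log_loss").mean()
    print(f"{label:>20s}: mean CV log loss = {-score:.3f}")
```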
Furthermore, L2 penalization (ridge regression) keeps all predictors in the model and can work even when there are more predictors than cases, so you don't face the predictor-selection problem that arises with LASSO or stepwise regression. All the predictors can be penalized to similar degrees if you wish, so you can (at least try to) evaluate the relative contributions of predictors to the choice of whether to enroll. The estimates of individual coefficients will be biased toward zero, but their relative magnitudes are largely maintained; see the sketch below.
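A small sketch of that point, again assuming scikit-learn, with arbitrary illustrative dimensions (60 cases, 100 predictors) and an arbitrary penalty strength:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

# Simulated data with more predictors than cases (100 vs 60); the first
# two predictors carry signal of different strengths.
rng = np.random.default_rng(3)
X = rng.normal(size=(60, 100))
logit = 1.5 * X[:, 0] + 0.5 * X[:, 1]
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))

model = make_pipeline(StandardScaler(),
                      LogisticRegression(penalty="l2", C=0.5, max_iter=5000))
model.fit(X, y)
coefs = model.named_steps["logisticregression"].coef_.ravel()

print(coefs.shape)         # (100,): every predictor keeps a (shrunken) coefficient
print(coefs[0], coefs[1])  # both shrunken toward zero; their ordering usually holds
```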
If you want to model the choice to enroll, penalized regression thus can reliably accomplish what you want. The selection bias will be no different from what's already in your full data set, and the coefficient bias will improve the model's performance with respect to the underlying population beyond your current data set.