As someone relatively new to applied statistics, I have been trying to learn best practices for statistical methods, and recently I have been trying to work out when to use the different multiple-comparison procedures.
The texts I have read suggest using the method most appropriate for the data and the goals of the analysis. In these texts, I frequently come across procedures such as Bonferroni and Tukey HSD (for controlling the FWER) and Benjamini-Hochberg (for controlling the FDR), with different recommendations for each. They describe Bonferroni as the most conservative and Tukey HSD as "moderately" conservative. In cases with "many" comparisons, techniques that control the FWER are said to become overly conservative, and Benjamini-Hochberg is often recommended as a way out.
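For concreteness, here is a minimal sketch of what I mean (Python, assuming `numpy`, `scipy`, and `statsmodels` are available; the group means and sizes are made-up placeholders), comparing how the same set of pairwise comparisons is handled by each procedure:

```python
import numpy as np
from itertools import combinations
from scipy import stats
from statsmodels.stats.multitest import multipletests
from statsmodels.stats.multicomp import pairwise_tukeyhsd

rng = np.random.default_rng(0)
k = 5                                   # number of treatments (placeholder)
mus = [0, 0, 0, 0.8, 0.8]               # placeholder effect sizes
groups = [rng.normal(loc=mu, size=20) for mu in mus]

# Raw p-values from all pairwise Welch t-tests
pairs = list(combinations(range(k), 2))
raw_p = np.array([stats.ttest_ind(groups[i], groups[j], equal_var=False).pvalue
                  for i, j in pairs])

# FWER control via Bonferroni, FDR control via Benjamini-Hochberg
rej_bonf, p_bonf, _, _ = multipletests(raw_p, alpha=0.05, method="bonferroni")
rej_bh,   p_bh,   _, _ = multipletests(raw_p, alpha=0.05, method="fdr_bh")

for (i, j), p, b, h in zip(pairs, raw_p, rej_bonf, rej_bh):
    print(f"{i} vs {j}: raw p={p:.4f}  Bonferroni reject={b}  BH reject={h}")

# Tukey HSD works directly on the raw data (studentized range distribution)
data = np.concatenate(groups)
labels = np.repeat(np.arange(k), 20)
print(pairwise_tukeyhsd(data, labels, alpha=0.05))
```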
I am uncomfortable with these vague recommendations. What counts as "many" treatments or comparisons? What is a low or high sample size, especially when the sample sizes vary by treatment? Ideally, I would like to know not only $\alpha$ but also $\beta$, so I could quantify precisely how conservative each procedure is for my particular design. To my surprise, this matter is rarely discussed in the literature. What is the state of the art for deciding which procedure is appropriate for a given data set?
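To illustrate the kind of answer I am hoping for, the only approach I can think of is a rough Monte Carlo sketch like the one below (my own illustration, again assuming `numpy`, `scipy`, and `statsmodels`, with placeholder effect sizes), which estimates the realized FWER and average power for a specific design under Bonferroni and Benjamini-Hochberg. Is simulation like this really the standard answer, or is there something more principled?

```python
import numpy as np
from itertools import combinations
from scipy import stats
from statsmodels.stats.multitest import multipletests

def simulate(mus, n=20, alpha=0.05, n_sims=2000, seed=1):
    """Estimate FWER and average per-comparison power for a given design."""
    rng = np.random.default_rng(seed)
    k = len(mus)
    pairs = list(combinations(range(k), 2))
    truly_diff = np.array([mus[i] != mus[j] for i, j in pairs])
    results = {}
    for method in ("bonferroni", "fdr_bh"):
        any_false_rej = 0          # simulations with >= 1 false rejection (FWER)
        true_rej = 0               # total correct rejections (power)
        for _ in range(n_sims):
            groups = [rng.normal(mu, 1.0, n) for mu in mus]
            p = np.array([stats.ttest_ind(groups[i], groups[j]).pvalue
                          for i, j in pairs])
            reject, *_ = multipletests(p, alpha=alpha, method=method)
            any_false_rej += np.any(reject & ~truly_diff)
            true_rej += np.sum(reject & truly_diff)
        results[method] = {
            "FWER": any_false_rej / n_sims,
            "avg power": true_rej / (n_sims * truly_diff.sum()),
        }
    return results

print(simulate(mus=[0, 0, 0, 0.8, 0.8]))
```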