I think this question is conflating a few distinct issues.
First of all, the terms "multilevel modeling," "random effects," and "fixed effects" are all used in different ways by different people. This post outlines FIVE different ways people define the difference between fixed and random effects.
Second, the most common use of MLM is for when you have observations "nested" at multiple levels (so students nested within schools, or observations nested within people in a longitudinal dataset). The question there is how you should deal with the higher level "units" (schools or people). One approach is to treat them as "fixed effects" (basically include a dummy variable for each "group"). On the one hand, this approach controls for ALL group level differences that are constant within a group, whether you measured them or not, so that's good. On the other hand, precisely because of that, it doesn't allow you to actually analyze the effect of any group level variable (like school size, or "race" in a longitudinal dataset), since the dummies already absorb everything that varies only between groups. Treating the groups as "random effects" (allowing the intercept and/or one or more coefficients to vary randomly at the group level) allows you to include group level variables and estimate their effects (and to do various other cool things like empirical Bayes estimation of group level characteristics), but also opens you up to group level bias if you haven't controlled for all of the important group level factors (which is always the case to some extent).
So in a nutshell, that's the trade-off between fixed and random effects when using MLM to analyze clustered data. How you navigate that trade-off depends on your research question and how the data are set up.
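To make the fixed effects side of that trade-off concrete, here is a minimal simulation sketch (standard library only). Everything in it is an assumption made up for illustration: the data-generating process, the true slope of 1.0, the unobserved group level confounder `u`, and the hypothetical group level variable standing in for something like school size. It relies on the standard result that including a dummy for every group gives the same slope as demeaning `x` and `y` within each group. Pooled OLS is used only as a stand-in for "group level confounding left uncontrolled"; no random intercept model is fit here.

```python
import random
from statistics import mean


def slope(xs, ys):
    """OLS slope of y on a single predictor x (intercept included)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


def simulate(n_groups=200, n_per_group=20, true_beta=1.0, seed=0):
    """Hypothetical clustered data: x is correlated with an unobserved group effect u."""
    rng = random.Random(seed)
    rows = []  # each row is (group id, x, y)
    for g in range(n_groups):
        u = rng.gauss(0, 1)  # unobserved group level confounder (assumed)
        for _ in range(n_per_group):
            x = u + rng.gauss(0, 1)
            y = true_beta * x + 2.0 * u + rng.gauss(0, 1)
            rows.append((g, x, y))
    return rows


def demean_by_group(groups, values):
    """Subtract each group's mean from its members (the 'within' transformation)."""
    by_group = {}
    for g, v in zip(groups, values):
        by_group.setdefault(g, []).append(v)
    group_means = {g: mean(vs) for g, vs in by_group.items()}
    return [v - group_means[g] for g, v in zip(groups, values)]


def pooled_and_within_slopes(rows):
    groups = [g for g, _, _ in rows]
    xs = [x for _, x, _ in rows]
    ys = [y for _, _, y in rows]
    pooled = slope(xs, ys)  # ignores the grouping entirely
    within = slope(demean_by_group(groups, xs), demean_by_group(groups, ys))
    return pooled, within


if __name__ == "__main__":
    rows = simulate()
    pooled, within = pooled_and_within_slopes(rows)
    print(f"pooled OLS slope:     {pooled:.2f}")  # pulled away from 1.0 by u
    print(f"within (fixed) slope: {within:.2f}")  # close to the true 1.0

    # A variable that is constant within each group has nothing left
    # after demeaning, so its coefficient cannot be estimated.
    groups = [g for g, _, _ in rows]
    group_var = [float(g % 7) for g in groups]  # hypothetical group level variable
    print(max(abs(v) for v in demean_by_group(groups, group_var)))
```

In this setup the within estimate recovers the slope despite never observing `u`, which is the "controls for all group level bias" property. The last lines show the cost: a group level variable is wiped out by the same transformation, which is why you would need a random effects specification (and its extra assumptions) to estimate its effect.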
Now, as you note, some Bayesians (like Andrew Gelman and perhaps also Richard McElreath) advocate using MLM (the "random effects" approach) even when there is no "nesting" of data, because Bayesians see all model parameters as inherently "random." But this is a more complicated approach and, in my experience, isn't yet super common among day-to-day statisticians due to various philosophical and logistical issues.
Also, any time you run a normal OLS model and include dummy variables for race, you could also correctly say that you are including "fixed effects" for race... but people don't usually consider that "multilevel modeling."
What does all of this have to do with causal inference? Nothing and everything. Causal inference is really tough, and "running a regression model" on observational data is generally regarded as a pretty suboptimal way to establish causality... although sometimes it's all we've got. The extent to which we can interpret the results of a model in causal terms depends both on the model specification and the underlying theory behind it. MLM is just one way of specifying models to deal with particular problems that might contribute to bias or error in our estimates of coefficients and/or standard errors. If deployed well, MLM might make a causal interpretation of a particular coefficient in a particular model more defensible, or it might not. But like any kind of model specification, MLM (either fixed or random effects) has no inherent power to make a model's results causally interpretable, any more than "including an interaction term," "including a control for age," or any other way we might modify the specification of a model.