I am interested in plotting the PDF of a zero-one-inflated beta distribution so that I can overlay an empirical density function of the observed data, with a PDF using the parameters I estimated from a regression fit using the brms package in R.
I understand how to generate a PDF for a discrete zero-inflated distribution but I am having trouble wrapping my head around how this would be possible for a continuous distribution like the beta distribution. Does anyone have any pointers to either an existing implementation of this in R or elsewhere, or could provide some example code for how to do this?
I can provide more specific details if needed, but I think what I generally need is a method to create a PDF for a continuous distribution with one or more "inflation" components, that I could apply to create a zero-one-inflated beta probability density function.
model fit summary
Here is what displays in the console when calling summary(fit). The model allows all parameters of the zero-one-inflated beta to vary as a function of treatment. My goal is to plot a PDF of the modeled response distribution for each treatment.
Family: zero_one_inflated_beta
Links: mu = logit; phi = log; zoi = logit; coi = logit
Formula: Proportion ~ Treatment
phi ~ Treatment
zoi ~ Treatment
coi ~ Treatment
Data: simple_data (Number of observations: 360)
Draws: 4 chains, each with iter = 4000; warmup = 3000; thin = 1;
total post-warmup draws = 4000
Population-Level Effects:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept -0.91 0.10 -1.09 -0.72 1.00 3242 2746
phi_Intercept 1.58 0.14 1.30 1.85 1.00 3300 2788
zoi_Intercept -1.19 0.22 -1.62 -0.78 1.00 2793 2637
coi_Intercept -6.50 3.10 -14.37 -2.63 1.00 1820 1331
TreatmentAtG -0.28 0.14 -0.54 -0.01 1.00 3470 2874
TreatmentAPG -0.19 0.14 -0.46 0.08 1.00 3213 2973
phi_TreatmentAtG 0.33 0.21 -0.09 0.73 1.00 3265 2826
phi_TreatmentAPG 0.19 0.20 -0.23 0.58 1.00 3019 3015
zoi_TreatmentAtG 0.65 0.28 0.11 1.21 1.00 2969 2872
zoi_TreatmentAPG 0.34 0.29 -0.24 0.92 1.00 2766 2872
coi_TreatmentAtG 2.60 3.25 -2.23 10.50 1.00 1863 1404
coi_TreatmentAPG -0.25 4.53 -9.77 8.95 1.00 1423 1523
Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
Here is the output of dput(summary(fit)) for the model that contains, among other things, the median values from the posterior distributions of all the parameters.
structure(list(formula = structure(list(formula = Proportion ~
Treatment, pforms = list(phi = phi ~ Treatment, zoi = zoi ~
Treatment, coi = coi ~ Treatment), pfix = list(), resp = "Proportion",
family = structure(list(family = "zero_one_inflated_beta",
link = "logit", linkfun = function (mu)
link(mu, link = slink), linkinv = function (eta)
inv_link(eta, link = slink), dpars = c("mu", "phi", "zoi",
"coi"), type = "real", ybounds = c(0, 1), closed = c(TRUE,
TRUE), ad = c("weights", "subset", "index"), include = "fun_zero_one_inflated_beta.stan",
specials = "sbi_zi_logit", normalized = "", link_phi = "log",
link_zoi = "logit", link_coi = "logit"), class = c("brmsfamily",
"family")), mecor = TRUE), class = c("brmsformula", "bform"
)), data_name = "simple_data", group = character(0), nobs = 360L,
ngrps = NULL, autocor = NULL, prior = structure(list(prior = character(0),
class = character(0), coef = character(0), group = character(0),
resp = character(0), dpar = character(0), nlpar = character(0),
lb = character(0), ub = character(0), source = character(0)), class = c("brmsprior",
"data.frame"), row.names = integer(0)), algorithm = "sampling",
chains = 4L, iter = 4000L, warmup = 3000L, thin = 1L, sampler = "sample(hmc)",
fixed = structure(list(Estimate = c(-0.90528752625, 1.580811685,
-1.19292657275, -6.499043965, -0.27649975441075, -0.187851774786015,
0.32542871341475, 0.18643767712385, 0.6463566213525, 0.3447500927745,
2.60338112814775, -0.2461269624225), Est.Error = c(0.0959046589040605,
0.13945355554377, 0.215237395928849, 3.1009864359503, 0.135865577346364,
0.136497497315428, 0.207714190070951, 0.204924681244251,
0.28040277149041, 0.293531084311733, 3.25075636844655, 4.52740348744564
), `l-95% CI` = c(-1.093054, 1.29537825, -1.6244415, -14.3712875,
-0.539085225, -0.4575207, -0.08814557, -0.2328505, 0.112203525,
-0.235284349999999, -2.2301125, -9.76533725), `u-95% CI` = c(-0.715749925,
1.851207, -0.78194595, -2.63277625, -0.0126142324999999,
0.0835585625000001, 0.72997525, 0.582225925, 1.2074075, 0.91722755,
10.502935, 8.9491615), Rhat = c(1.0015884207194, 0.999917786584758,
1.00037997753762, 1.00176907634062, 1.0003540915247, 1.00059282672043,
1.00052829336817, 1.00026305806547, 1.00124237305807, 1.00075322541757,
1.00089230002969, 1.00201414758976), Bulk_ESS = c(3242.04442608349,
3300.32028285236, 2792.94823881214, 1819.53559579282, 3469.94101433756,
3212.6066785496, 3264.71227200253, 3018.95934804462, 2968.56213796343,
2765.6524712156, 1863.16701706442, 1423.10214690165), Tail_ESS = c(2746.32327946702,
2788.22346690149, 2637.07590677314, 1330.5596854849, 2873.96456716223,
2973.27203895491, 2825.68060612819, 3014.71081254269, 2871.57181775515,
2871.90247589758, 1404.12421696219, 1523.09010264777)), row.names = c("Intercept",
"phi_Intercept", "zoi_Intercept", "coi_Intercept", "TreatmentAtG",
"TreatmentAPG", "phi_TreatmentAtG", "phi_TreatmentAPG", "zoi_TreatmentAtG",
"zoi_TreatmentAPG", "coi_TreatmentAtG", "coi_TreatmentAPG"
), class = "data.frame"), spec_pars = structure(list(Estimate = numeric(0),
Est.Error = numeric(0), `l-95% CI` = numeric(0), `u-95% CI` = numeric(0),
Rhat = numeric(0), Bulk_ESS = numeric(0), Tail_ESS = numeric(0)), row.names = character(0), class = "data.frame"),
cor_pars = structure(list(Estimate = numeric(0), Est.Error = numeric(0),
`l-95% CI` = numeric(0), `u-95% CI` = numeric(0), Rhat = numeric(0),
Bulk_ESS = numeric(0), Tail_ESS = numeric(0)), row.names = character(0), class = "data.frame")), class = "brmssummary")
dput(your_fitted_model). – Stephan Kolassa Jul 22 '22 at 17:19