Probability density function of a zero-inflated (or zero-one-inflated) beta distribution?

Question

I am interested in plotting the PDF of a zero-one-inflated beta distribution so that I can overlay an empirical density function of the observed data, with a PDF using the parameters I estimated from a regression fit using the brms package in R.

I understand how to generate a PDF for a discrete zero-inflated distribution but I am having trouble wrapping my head around how this would be possible for a continuous distribution like the beta distribution. Does anyone have any pointers to either an existing implementation of this in R or elsewhere, or could provide some example code for how to do this?

I can provide more specific details if needed, but I think what I generally need is a method to create a PDF for a continuous distribution with one or more "inflation" components, that I could apply to create a zero-one-inflated beta probability density function.

model fit summary

Here is what displays in the console when calling summary(fit). The model allows all parameters of the zero-one-inflated beta to vary as a function of treatment. My goal is to plot a PDF of the modeled response distribution for each treatment.

Family: zero_one_inflated_beta 
  Links: mu = logit; phi = log; zoi = logit; coi = logit 
Formula: Proportion ~ Treatment 
         phi ~ Treatment
         zoi ~ Treatment
         coi ~ Treatment
   Data: simple_data (Number of observations: 360) 
  Draws: 4 chains, each with iter = 4000; warmup = 3000; thin = 1;
         total post-warmup draws = 4000
Population-Level Effects: 
                 Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept           -0.91      0.10    -1.09    -0.72 1.00     3242     2746
phi_Intercept        1.58      0.14     1.30     1.85 1.00     3300     2788
zoi_Intercept       -1.19      0.22    -1.62    -0.78 1.00     2793     2637
coi_Intercept       -6.50      3.10   -14.37    -2.63 1.00     1820     1331
TreatmentAtG        -0.28      0.14    -0.54    -0.01 1.00     3470     2874
TreatmentAPG        -0.19      0.14    -0.46     0.08 1.00     3213     2973
phi_TreatmentAtG     0.33      0.21    -0.09     0.73 1.00     3265     2826
phi_TreatmentAPG     0.19      0.20    -0.23     0.58 1.00     3019     3015
zoi_TreatmentAtG     0.65      0.28     0.11     1.21 1.00     2969     2872
zoi_TreatmentAPG     0.34      0.29    -0.24     0.92 1.00     2766     2872
coi_TreatmentAtG     2.60      3.25    -2.23    10.50 1.00     1863     1404
coi_TreatmentAPG    -0.25      4.53    -9.77     8.95 1.00     1423     1523
Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

Here is the output of dput(summary(fit)) for the model that contains, among other things, the median values from the posterior distributions of all the parameters.

structure(list(formula = structure(list(formula = Proportion ~ 
    Treatment, pforms = list(phi = phi ~ Treatment, zoi = zoi ~ 
    Treatment, coi = coi ~ Treatment), pfix = list(), resp = "Proportion", 
    family = structure(list(family = "zero_one_inflated_beta", 
        link = "logit", linkfun = function (mu) 
        link(mu, link = slink), linkinv = function (eta) 
        inv_link(eta, link = slink), dpars = c("mu", "phi", "zoi", 
        "coi"), type = "real", ybounds = c(0, 1), closed = c(TRUE, 
        TRUE), ad = c("weights", "subset", "index"), include = "fun_zero_one_inflated_beta.stan", 
        specials = "sbi_zi_logit", normalized = "", link_phi = "log", 
        link_zoi = "logit", link_coi = "logit"), class = c("brmsfamily", 
    "family")), mecor = TRUE), class = c("brmsformula", "bform"
)), data_name = "simple_data", group = character(0), nobs = 360L, 
    ngrps = NULL, autocor = NULL, prior = structure(list(prior = character(0), 
        class = character(0), coef = character(0), group = character(0), 
        resp = character(0), dpar = character(0), nlpar = character(0), 
        lb = character(0), ub = character(0), source = character(0)), class = c("brmsprior", 
    "data.frame"), row.names = integer(0)), algorithm = "sampling", 
    chains = 4L, iter = 4000L, warmup = 3000L, thin = 1L, sampler = "sample(hmc)", 
    fixed = structure(list(Estimate = c(-0.90528752625, 1.580811685, 
    -1.19292657275, -6.499043965, -0.27649975441075, -0.187851774786015, 
    0.32542871341475, 0.18643767712385, 0.6463566213525, 0.3447500927745, 
    2.60338112814775, -0.2461269624225), Est.Error = c(0.0959046589040605, 
    0.13945355554377, 0.215237395928849, 3.1009864359503, 0.135865577346364, 
    0.136497497315428, 0.207714190070951, 0.204924681244251, 
    0.28040277149041, 0.293531084311733, 3.25075636844655, 4.52740348744564
    ), `l-95% CI` = c(-1.093054, 1.29537825, -1.6244415, -14.3712875, 
    -0.539085225, -0.4575207, -0.08814557, -0.2328505, 0.112203525, 
    -0.235284349999999, -2.2301125, -9.76533725), `u-95% CI` = c(-0.715749925, 
    1.851207, -0.78194595, -2.63277625, -0.0126142324999999, 
    0.0835585625000001, 0.72997525, 0.582225925, 1.2074075, 0.91722755, 
    10.502935, 8.9491615), Rhat = c(1.0015884207194, 0.999917786584758, 
    1.00037997753762, 1.00176907634062, 1.0003540915247, 1.00059282672043, 
    1.00052829336817, 1.00026305806547, 1.00124237305807, 1.00075322541757, 
    1.00089230002969, 1.00201414758976), Bulk_ESS = c(3242.04442608349, 
    3300.32028285236, 2792.94823881214, 1819.53559579282, 3469.94101433756, 
    3212.6066785496, 3264.71227200253, 3018.95934804462, 2968.56213796343, 
    2765.6524712156, 1863.16701706442, 1423.10214690165), Tail_ESS = c(2746.32327946702, 
    2788.22346690149, 2637.07590677314, 1330.5596854849, 2873.96456716223, 
    2973.27203895491, 2825.68060612819, 3014.71081254269, 2871.57181775515, 
    2871.90247589758, 1404.12421696219, 1523.09010264777)), row.names = c("Intercept", 
    "phi_Intercept", "zoi_Intercept", "coi_Intercept", "TreatmentAtG", 
    "TreatmentAPG", "phi_TreatmentAtG", "phi_TreatmentAPG", "zoi_TreatmentAtG", 
    "zoi_TreatmentAPG", "coi_TreatmentAtG", "coi_TreatmentAPG"
    ), class = "data.frame"), spec_pars = structure(list(Estimate = numeric(0), 
        Est.Error = numeric(0), `l-95% CI` = numeric(0), `u-95% CI` = numeric(0), 
        Rhat = numeric(0), Bulk_ESS = numeric(0), Tail_ESS = numeric(0)), row.names = character(0), class = "data.frame"), 
    cor_pars = structure(list(Estimate = numeric(0), Est.Error = numeric(0), 
        `l-95% CI` = numeric(0), `u-95% CI` = numeric(0), Rhat = numeric(0), 
        Bulk_ESS = numeric(0), Tail_ESS = numeric(0)), row.names = character(0), class = "data.frame")), class = "brmssummary")

Can you edit your post with the fitted model? Ideally, just paste in the output from dput(your_fitted_model). — Stephan Kolassa, Jul 22 '22 at 17:19
The fitted model is >5MB in size because of all the posterior samples it contains, so I will add some output with just the summary of the parameter estimates from a simplified version of the model. Will do that in just a minute. — qdread, Jul 22 '22 at 17:31
If you have in principle a spike at zero, how are you going to determine the empirical density function? Kernel estimation at least has great difficulty with bounds and even greater difficulty with spikes. — Nick Cox, Jul 22 '22 at 17:37
Ah, thanks. That is indeed rather complicated. Can you instead give us the fitted parameters of the zero inflated beta for a particular predictor setting? Sorry for the confusion. — Stephan Kolassa, Jul 22 '22 at 19:38
The following might be helpful: https://stats.stackexchange.com/questions/335547/how-does-one-graph-the-pdf-of-a-variable-having-a-mixed-discrete-continuous-dist. — JimB, Jul 22 '22 at 22:18
An empirical quantile plot or cumulative distribution plot could help. — Nick Cox, Jul 23 '22 at 10:07

Probability density function of a zero-inflated (or zero-one-inflated) beta distribution?

model fit summary

0 Answers0