
I wish to estimate the size parameter of the negative binomial distribution in R. Ideally I would like to estimate it when all I have are n = the number of samples, m = the mean count per sample, and n*m = the total count. I started with the binomial distribution and estimated the binomial probability in several ways, shown below, thinking I could use that R code as a template for the negative binomial distribution.

Here is R code for the binomial distribution:

set.seed(1234)
N       <- 100
Size    <- 1
Prob    <- 0.75
my.data <- rbinom(n = N, size = Size, prob = Prob)
sum(my.data) / length(my.data)
#[1] 0.81

use dbinom

getParm1 = function(betas, my.data) {
     b1 = betas[1]
     ll = dbinom(my.data, 1, b1)
     return(-sum(log(ll)))
}
r1 = optim(c(0.50), fn = getParm1, my.data = my.data, method = "Brent", lower = 0, upper = 1)
r1$par
#[1] 0.81

use the explicit likelihood

getParm2 = function(betas, N, K) {
     p1 = betas[1]
     nCk = gamma(N+1) / (gamma(K+1) * gamma((N-K)+1))
     my.likely <- nCk * p1^K * (1-p1)^(N-K)
     return(-log(my.likely))
}
r2 = optim(c(0.50), fn = getParm2, N = length(my.data), K = sum(my.data), method = "Brent", lower = 0, upper = 1)
r2$par
#[1] 0.81

remove the binomial coefficient

getParm3 = function(betas, N, K) {
     p1 = betas[1]
     my.likely <- p1^K * (1-p1)^(N-K)
     return(-log(my.likely))
}
r3 = optim(c(0.50), fn = getParm3, N = length(my.data), K = sum(my.data), method = "Brent", lower = 0, upper = 1)
r3$par
#[1] 0.81

use the sum of outcomes with dbinom

getParm4 = function(betas, my.data) {
     b1 = betas[1]
     ll = dbinom(sum(my.data), length(my.data), b1)
     return(-log(ll))
}
r4 = optim(c(0.50), fn = getParm4, my.data = my.data, method = "Brent", lower = 0, upper = 1)
r4$par
#[1] 0.81

However, with the negative binomial distribution I am getting stuck even when using the counts from every sample (my.data), let alone when using only the total count (total.count) without my.data. Here the estimate of size = 3.073976 is not close to the true value of 25.

set.seed(1234)
N           <- 100            # number of samples
theta       <- 25             # size or shape or dispersion parameter
m           <- 0.75           # mean counts per sample
my.data     <- rnbinom(n = N, mu = m, size = theta)
total.count <- sum(my.data)

use dnbinom

getParm1 = function(betas, my.data, m) {
     b1 = betas[1]
     ll = dnbinom(quantile(my.data), mu = m, size = b1)
     return(-sum(log(ll)))
}
r1 = optim(c(50), fn = getParm1, my.data = my.data, m = m, method = "Brent", lower = 0, upper = 100)
r1$par
#[1] 3.073976

Is it possible to estimate the size parameter of the negative binomial distribution using approaches similar to what I used above with the binomial distribution? Is it possible to estimate the size parameter without having the counts per individual sample in my.data?

Here is a similar post, maybe even asking the same question, but without much R code:

Calculating the parameters of a negative binomial distribution given a mean and high density intervals

P.S.

Originally I asked this question on StackOverflow and Ben Bolker provided the following two comments:

"This belongs on CrossValidated. I think the simple answer is "no". (1) Number of samples is information about the experimental design, not about the underlying distribution. (2) n, m, and n*m represent only two independent pieces of information, not three (if you know any two you can compute the third). (3) Therefore you really only have one piece of information about the distribution (the mean). (4) Unlike the binomial (which only has a single undetermined parameter, p), the NB has two ({prob, size} or {mu, size}). (5) You're out of luck ... " – Ben Bolker

"If you had another piece of information (like the variance or standard deviation of the counts) you could do it. But not with just n and m." – Ben Bolker

I need to explore using the variance or standard deviation of the counts.

1 Answer


I have found how to estimate theta from the vector of counts, my.data, using optim in R. I have also found how to estimate theta from just the mean and variance of the counts using the method of moments. I arrived at these solutions using notes from one of Ben Bolker's labs at this link:

https://math.mcmaster.ca/~bolker/emdbook/lab6.html

set.seed(1234)
N           <- 1000           # number of samples
theta       <- 5
m           <- 0.75           # mean counts per sample
my.data     <- rnbinom(n = N, mu = m, size = theta)
total.count <- sum(my.data)

use dnbinom

getParm1 = function(betas, my.data, m) {
     b1 = betas[1]
     return(-sum(dnbinom(my.data, mu = m, size = b1, log = TRUE)))
}
r1 = optim(c(50), fn = getParm1, my.data = my.data, m = m, method = "Brent", lower = 0, upper = 100, hessian = TRUE)
r1$par
#[1] 6.131066
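For comparison (an optional check, assuming the MASS package is available), theta.ml() maximizes the same likelihood in theta with the mean held fixed, so it should essentially reproduce r1$par above:

library(MASS)

# ML estimate of theta with the mean fixed at m
theta.ml(y = my.data, mu = m)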

method-of-moments estimate of theta

theta.emp <- mean(my.data) / (var(my.data) / mean(my.data) - 1)
theta.emp
#[1] 6.639573
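The formula for theta.emp follows from the standard variance of the negative binomial in this parameterization:

$$\operatorname{Var}(Y) = \mu + \frac{\mu^{2}}{\theta} \quad\Longrightarrow\quad \theta = \frac{\mu^{2}}{\operatorname{Var}(Y) - \mu} = \frac{\mu}{\operatorname{Var}(Y)/\mu - 1},$$

with mean(my.data) and var(my.data) substituted for $\mu$ and $\operatorname{Var}(Y)$.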

EDIT - October 25, 2022

Below is R code for estimating the mean and the size parameter using the explicit log-likelihood. The log-likelihood is defined here:

Theta in a negative binomial random generator
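For reference (written out here from the standard (mu, theta) parameterization, which is what the code below implements), the likelihood contribution of a single count $y$ is

$$P(Y = y \mid \mu, \theta) = \frac{\Gamma(y + \theta)}{\Gamma(\theta)\, y!} \cdot \frac{\theta^{\theta}\, \mu^{y}}{(\mu + \theta)^{y + \theta}},$$

and the log-likelihood is the sum of $\log P(Y = y_i \mid \mu, \theta)$ over all samples.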

set.seed(1234)
N            <- 1000           # number of samples
Size         <- 5
m            <- 0.75           # mean counts per sample
my.data      <- rnbinom(n = N, mu = m, size = Size)

getParm3 = function(betas, y) {
     mu    = betas[1]
     theta = betas[2]
     my.log.like = 0
     for(i in 1:length(y)) {
          part1 <- gamma(y[i] + theta) / (gamma(theta) * factorial(y[i]))
          part2 <- (theta^theta * mu^y[i]) / ((mu + theta)^(y[i] + theta))
          my.log.like <- my.log.like + log(part1 * part2)
     }
     -my.log.like
}
r1 = optim(c(0.75, 5), fn = getParm3, y = my.data, method = "BFGS", hessian = TRUE)
r1$par
#[1] 0.7591243 6.0963446

Here is R code that does the same thing using dnbinom:

getParm2 = function(betas, my.data) {
     b1 = betas[1]
     b2 = betas[2]
     return(-sum(dnbinom(my.data, mu = b1, size = b2, log = TRUE)))
}
r1 = optim(c(0.75, 5), fn = getParm2, my.data=my.data, method = "BFGS", hessian = TRUE)
r1$par
#[1] 0.7591243 6.0963447
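As a sanity check (not part of the original workflow, and assuming the MASS package is installed), fitdistr() performs the same joint maximum-likelihood fit of size and mu, so its estimates should be close to the optim() results above:

library(MASS)

# joint MLE of size and mu for the same data
fit <- fitdistr(my.data, densfun = "negative binomial")
fit$estimate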

Here is R code that estimates the mean and the variance using the explicit log-likelihood. This code returns warnings, which I assume arise from dividing by zero (or from a negative theta when the candidate variance falls below the candidate mean) early in the estimation process:

getParm5 = function(betas, y) {
     mu    = betas[1]
     theta = mu / (betas[2] / mu - 1)
     my.log.like = 0
     for(i in 1:length(y)) {
          part1 <- gamma(y[i] + theta) / (gamma(theta) * factorial(y[i]))
          part2 <- (theta^theta * mu^y[i]) / ((mu + theta)^(y[i] + theta))
          my.log.like <- my.log.like + log(part1 * part2)
     }
     -my.log.like
}
r1 = optim(c(0.75, 6), fn = getParm5, y = my.data, method = "BFGS", hessian = TRUE)
#There were 50 or more warnings (use warnings() to see the first 50)
r1$par
#[1] 0.7590013 0.8528656

mean(my.data)
#[1] 0.759

var(my.data)
#[1] 0.8457648
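One possible way to avoid these warnings (a sketch of my own, not taken from the sources above) is to optimise over log(mu) and log(variance - mu), so that the candidate variance can never fall to or below the candidate mean, and to let dnbinom compute the log-likelihood:

getParm6 = function(betas, y) {
     mu    = exp(betas[1])              # mean, kept positive via the log scale
     v     = mu + exp(betas[2])         # variance, forced to stay above the mean
     theta = mu^2 / (v - mu)            # implied size parameter, always positive
     return(-sum(dnbinom(y, mu = mu, size = theta, log = TRUE)))
}
r1 = optim(c(log(0.75), log(6 - 0.75)), fn = getParm6, y = my.data, method = "BFGS", hessian = TRUE)
exp(r1$par[1])                          # estimated mean
exp(r1$par[1]) + exp(r1$par[2])         # estimated variance

The starting values are the same c(0.75, 6) as above, just transformed to the log scale.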