I am trying, without success, to calculate the log-likelihood of the most basic logistic regression model: a constant-probability model (i.e. the only coefficient is $\beta_0$).
For the simplest model with 1 coefficient (i.e. constant probability):
$$ E(y) = \pi = \frac{1}{1 + e^{-\beta_0}} $$
The maximum likelihood estimate of $\pi$ is $\hat{\pi} = y/n$, where $y$ is the total number of successes and $n$ the total number of trials. The log-likelihood is then:
$$ \begin{aligned} \ln L(y_1, \dots, y_n) &= \ln \left[ \left(\frac{y}{n}\right)^{y}\left(1 - \frac{y}{n} \right)^{n-y} \right] \\ &= y[\ln(y) - \ln(n)] + (n-y)\ln\left(1 - \frac{y}{n}\right) \\ &= y\ln(y) - y\ln(n) + (n-y)[\ln(n - y) - \ln(n)] \\ &= y\ln(y) - y\ln(n) - n\ln(n) + y\ln(n) + (n-y) \ln(n-y) \\ &= y\ln(y) - n\ln(n) + (n-y)\ln(n-y) \end{aligned} $$
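As a quick numerical sanity check (using $y = 44$ and $n = 371$, the totals from the mining data below), the simplified expression agrees with the direct form $y\ln\hat{\pi} + (n-y)\ln(1-\hat{\pi})$:

```r
# Sanity check: the simplified log-likelihood equals the direct form
# y*log(pi_hat) + (n - y)*log(1 - pi_hat) with pi_hat = y/n.
# y = 44 and n = 371 are the totals from the mining data below.
y <- 44
n <- 371
direct     <- y * log(y / n) + (n - y) * log(1 - y / n)
simplified <- y * log(y) - n * log(n) + (n - y) * log(n - y)
print(direct)
print(simplified)  # both print -135.0896
```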
An example can be found in the book *Introduction to Linear Regression Analysis* by Montgomery on p. 426, quoted below.
> A 1959 article in the journal *Biometrics* presents data concerning the proportion of coal miners who exhibit symptoms of severe pneumoconiosis and the number of years of exposure. The response variable of interest, $y$, is the proportion of miners who have severe symptoms. A reasonable probability model for the number of severe cases is the binomial, so we will fit a logistic regression model to the data.
exposure = c(5.8, 15.0, 21.5, 27.5, 33.5, 39.5, 46.0, 51.5)
cases = c(0,1,3,8,9,8,10,5)
miners = c(98,54,43,48,51,38,28,11)
y = sum(cases)
n = sum(miners)
print(y * log(y) - n * log(n) + (n - y)*log(n - y))
The above prints -135.0896. However, fitting the constant-probability model with `glm` and asking for its log-likelihood gives:
m0 = glm(cbind(cases, miners - cases) ~ 1, family = binomial(link = 'logit'))
logLik(m0)
The above prints -39.8646.
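For reference, the same intercept-only model can also be fitted through `glm`'s proportion-plus-weights interface (a sketch using the vectors defined above); as far as I can tell, `logLik` should report the same value for both parameterizations:

```r
# Equivalent intercept-only fit using proportions as the response
# and the group sizes as weights (cases and miners as defined above).
cases  <- c(0, 1, 3, 8, 9, 8, 10, 5)
miners <- c(98, 54, 43, 48, 51, 38, 28, 11)
m0b <- glm(cases / miners ~ 1, weights = miners,
           family = binomial(link = 'logit'))
logLik(m0b)  # same value as logLik(m0) above
```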
I don't understand where I'm going wrong with this.