Let's say that I generate observations from logistic regression:
n <- 1000 #n. of observations
p <- 5 #n. of covariates
u <- dnorm(runif(n*p, min = 0, max = 1))
x <- matrix(u, nrow = n, byrow = TRUE)
b <- c(2, 1, 2, 3, 6, -3)
prob <- 1/(1 + exp(-b%*%t(cbind(1,x))))
y <- rbinom(n,1,prob)
But now if I fit a model:
df = data.frame(y, x)
mod <- glm(y ~ . , data = df, family = "binomial")
coefficients(mod)
Coefficients I get are:
(Intercept) X1 X2 X3 X4 X5
-7.0004768 16.5132458 -0.6912072 11.7006775 1.5552072 7.0992499
It might well be a very dumb question showing a lack of knowledge, but why are the coefficients returned by glm different from those in b? A shot in the dark: while generating y I rely on probabilities fed into a binomial draw, so there is room for randomness. Am I right? In the sense that drawing from Binomial(4, 0.5) I could get, e.g., both (1,1,0,0) and (0,1,0,0).
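The sampling-variability point can be illustrated directly in R (a standalone toy example, separate from the simulation above):

```r
# Two draws from the same Binomial(size = 1, prob = 0.5) setup
# will generally differ, even though the probability is identical.
set.seed(123)  # arbitrary seed, just for reproducibility
rbinom(4, size = 1, prob = 0.5)  # one realisation of four coin flips
rbinom(4, size = 1, prob = 0.5)  # a second, generally different, realisation
```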
[…] by a couple of orders of magnitude rectifies the problem. I show how to simulate some responses here. – Demetri Pananos Apr 25 '21 at 12:55
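A minimal sketch of the kind of simulation the comment alludes to (illustrative coefficient values, not the commenter's actual code). Note that dnorm() only evaluates the normal density; to sample normal covariates you want rnorm(). With properly sampled covariates and moderate coefficients, glm() recovers b up to sampling error:

```r
# Sketch: simulate logistic-regression data so that glm() recovers the
# true coefficients. Assumed values throughout are illustrative.
set.seed(1)
n <- 10000                               # larger n -> tighter estimates
p <- 5
x <- matrix(rnorm(n * p), nrow = n)      # rnorm() samples; dnorm() does not
b <- c(0.5, 1, -1, 0.25, -0.5, 0.75)     # intercept + 5 slopes (illustrative)
eta  <- cbind(1, x) %*% b                # linear predictor
prob <- 1 / (1 + exp(-eta))              # inverse-logit
y <- rbinom(n, 1, prob)

mod <- glm(y ~ x, family = binomial)
cbind(true = b, estimated = coef(mod))   # estimates should be close to b
```

With very large coefficients (like 6 in the original b) many fitted probabilities sit at essentially 0 or 1, which makes the likelihood surface flat and the estimates unstable; keeping the linear predictor in a moderate range avoids that.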