I've fitted a negative binomial model with language (Tamil or French) as the IV, the number of prolongations (a count) as the DV, and the log of the number of words in each language as an offset (to account for exposure). The question I'm trying to answer: on average, do prolongations occur more often in one language than in the other? My output shows a negative coefficient for just 'Language', without specifying which language it refers to, so I don't know how to interpret these results.
Call:
glm.nb(formula = Prolongations ~ Language + offset(log(Words)),
    data = Poisdf, init.theta = 2.963920129, link = log)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-2.4382  -0.7270  -0.1176   0.3917   1.6092

Coefficients:
            Estimate Std. Error z value            Pr(>|z|)
(Intercept)  -2.6700     0.1994 -13.390 <0.0000000000000002 ***
Language     -0.6290     0.3103  -2.027              0.0426 *
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for Negative Binomial(2.9639) family taken to be 1)

    Null deviance: 23.049  on 17  degrees of freedom
Residual deviance: 19.014  on 16  degrees of freedom
AIC: 130.77

Number of Fisher Scoring iterations: 1

              Theta:  2.96
          Std. Err.:  1.22

 2 x log-likelihood:  -124.769
How do I interpret this 'Language' coefficient? Can I really say that "for every 1-unit increase in the IV, there is a 0.63-unit decrease in the number of prolongations" (a decrease because of the minus sign, or should the sign be disregarded here)? The part about the DV makes sense, but it doesn't really make sense to speak of a "1-unit increase" in language when it is a categorical variable with two levels (Tamil or French).
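For reference, here is a minimal sketch of how I think a 0/1 column could be turned into a labelled factor so that the output names the level being compared. It uses simulated data only: the `toy` data frame, the seed, and the values used to generate the counts are hypothetical and not from my study. With the first factor level as the reference, R should label the coefficient with the other level.

```r
library(MASS)  # provides glm.nb()

set.seed(1)

# Hypothetical toy data (NOT my real data): 18 rows, 0/1 coded language
toy <- data.frame(
  Language = rep(c(0, 1), each = 9),
  Words    = round(runif(18, min = 200, max = 600))
)

# Simulate counts whose rate per word depends on Language (made-up values)
toy$Prolongations <- rnbinom(
  18, size = 3,
  mu = toy$Words * exp(-2.7 - 0.6 * toy$Language)
)

# Recode as a labelled factor; the first level (Tamil) is the reference
toy$Language <- factor(toy$Language, levels = c(0, 1),
                       labels = c("Tamil", "French"))

fit <- glm.nb(Prolongations ~ Language + offset(log(Words)), data = toy)

# The coefficient is now named after the non-reference level
names(coef(fit))
```

If this is right, a coefficient named `LanguageFrench` would be the difference in log rate for French relative to Tamil, rather than the effect of a "1-unit increase".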
EDIT: In my data, Tamil = 0 and French = 1. Could this imply that, for French, the rate of prolongations per word is exp(-0.63) ≈ 0.53 times the rate for Tamil, i.e. roughly a 47% decrease?
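A quick sketch of the arithmetic I have in mind, using only the estimate and standard error from the output above. It assumes the coefficient is a log rate ratio (French relative to Tamil, given the 0/1 coding) and uses an approximate Wald interval; the variable names are my own.

```r
# Values copied from the summary output
b  <- -0.6290  # 'Language' coefficient (Tamil = 0, French = 1)
se <-  0.3103  # its standard error

# Exponentiating a log-link coefficient gives a rate ratio
rate_ratio <- exp(b)

# Approximate 95% Wald interval on the rate-ratio scale (an assumption;
# a profile-likelihood interval from the fitted model may differ)
wald_ci <- exp(b + c(-1, 1) * qnorm(0.975) * se)

# Percentage change in the rate; negative means a lower rate for French
pct_change <- 100 * (rate_ratio - 1)

print(round(c(rate_ratio = rate_ratio,
              lower = wald_ci[1], upper = wald_ci[2]), 2))
print(round(pct_change))
```

On this reading, the percentage change comes from exp(b) - 1, so the minus sign matters and should not be dropped before exponentiating.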