My question is about which logistic regression model best fits my goal.
My data set contains 641 rows, where each row is one sample with several independent variables (continuous, nominal and ordinal). However, I'm a bit confused about how to classify my response variable. The response variable is constructed as follows:
N-breams (length class 16-40cm) / (N-breams (length class 16-40cm) + N-breams (length class 40cm+))
This results in a response variable in the range 0-1, where values above 0.5 indicate more breams of length class 16-40cm than of 40cm+, and values below 0.5 the reverse.
In a normal aquatic system the ratio should be higher than 0.5 (or even 1.0); however, this isn't always the case, and the ratio is sometimes lower than 0.5 (or even 0.0). I'm interested in which environmental variables influence this ratio.
So, initially I thought of a binomial distribution, which looks like this in R (using a GLM or GLMM):
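As a small illustration of how such a response could be built from raw counts, here is a sketch in R. The column names `n_small` and `n_large` and the counts themselves are made up for the example; they are not taken from the actual data set.

```r
# Hypothetical counts per sample; names and values are illustrative only.
d <- data.frame(
  n_small = c(5, 12, 0, 7),   # breams in length class 16-40cm
  n_large = c(10, 3, 4, 7)    # breams in length class 40cm+
)

# Total number of breams per sample and the ratio described above.
d$total <- d$n_small + d$n_large
d$ratio <- d$n_small / d$total

# ratio > 0.5: more 16-40cm breams; ratio < 0.5: more 40cm+ breams.
d$majority <- ifelse(d$ratio > 0.5, "16-40cm",
                     ifelse(d$ratio < 0.5, "40cm+", "tie"))
print(d)
```

Keeping `total` next to `ratio` matters because the ratio alone no longer says how many fish it was based on. A sample with no breams at all would give `0/0`, so such rows would need separate handling.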
glm(y ~ x1 + x2 + x3, family = binomial)
The output predicts a probability (0-1) as a function of the significant independent variables. This is the part where I get confused. Since the 0.5 value is a "tipping point", every predicted/fitted value (from the output) lower than 0.5 means more 40cm+ breams than 16-40cm breams, right? Or are we talking about chances, so that a 0.5 value is a 50% chance?
Question
So my real question is whether the predicted values are chances (%) or remain ratio values (predicted as with the output of a Poisson or normal model). I'm almost certain it is the latter, but somehow I still have doubts.
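One way to look at this is to fit a binomial GLM to simulated counts and inspect the two prediction scales. The sketch below uses made-up data (`x1`, `n_small`, `total` are assumptions for illustration). In a binomial GLM the fitted value on the response scale is the expected proportion for a sample, which is the same number as the estimated probability that a single bream in that sample belongs to the 16-40cm class, so the two readings coincide.

```r
set.seed(2)

# Simulated data: one covariate and 25 breams per sample (assumed values).
n       <- 150
x1      <- runif(n, -2, 2)
total   <- rep(25, n)
n_small <- rbinom(n, size = total, prob = plogis(-0.2 + 1.1 * x1))

# Binomial GLM on (successes, failures) counts.
fit <- glm(cbind(n_small, total - n_small) ~ x1, family = binomial)

new <- data.frame(x1 = c(-1, 0, 1))
eta <- predict(fit, newdata = new, type = "link")      # log-odds scale
mu  <- predict(fit, newdata = new, type = "response")  # expected proportion, 0-1

# The response scale is the inverse logit of the link scale.
print(all.equal(unname(mu), unname(plogis(eta))))

# Expected number of 16-40cm breams in a sample of 25 at each x1.
print(mu * 25)
```

A fitted value below 0.5 then means the model expects more 40cm+ than 16-40cm breams at those covariate values.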
[...] glm function can deal with proportion data if supplied by the weights argument. Mark, this is not a beta regression, it is logistic regression. See here: http://stats.stackexchange.com/questions/26762 Your Q might be a duplicate. – amoeba Sep 28 '16 at 13:39
[...] N- = 5, N+ = 10 becomes 5 rows with y=0 and 10 rows with y=1. – jwimberley Sep 28 '16 at 13:40
[...] weights argument in a GLM, since the ratio is generated by a/(a+b). Would the "total" then be a+a+b (simply put)? – Mark Sep 28 '16 at 14:42
[...] weight argument produces a much better fit for the model, great! Quick question: normally I compare models based on AIC values. However, no AIC values are produced with a quasi-binomial distribution. I understand that there are ways to produce AIC values manually, however this is (apparently) debatable. Are there other ways to compare the goodness of fit? – Mark Sep 29 '16 at 07:46
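The points raised in the comments can be sketched in R on simulated data (all variable names and values below are assumptions, not the original data set). For a ratio a/(a+b), the number of trials passed to `weights` is the denominator a+b. Expanding each sample into 0/1 rows gives the same coefficient estimates. For quasi-binomial fits, where `AIC()` returns `NA`, one commonly used alternative for nested models is an F test via `anova()`.

```r
set.seed(3)

# Simulated samples: a = n_small, b = n_large (assumed values).
n       <- 100
x1      <- rnorm(n)
x2      <- rnorm(n)
total   <- rpois(n, 15) + 1            # a + b, at least 1 per sample
n_small <- rbinom(n, size = total, prob = plogis(0.5 * x1))
n_large <- total - n_small
ratio   <- n_small / total             # a / (a + b)

# Proportion response with weights = number of trials (a + b).
fit_w <- glm(ratio ~ x1, family = binomial, weights = total)

# Same data expanded to one row per fish: y = 1 for 16-40cm, y = 0 for 40cm+.
long <- data.frame(
  y  = rep(c(1, 0), times = c(sum(n_small), sum(n_large))),
  x1 = c(rep(x1, times = n_small), rep(x1, times = n_large))
)
fit_long <- glm(y ~ x1, family = binomial, data = long)

# Both formulations should give the same coefficient estimates.
print(all.equal(coef(fit_w), coef(fit_long), tolerance = 1e-5))

# Quasi-binomial fits have no likelihood-based AIC.
q1 <- glm(ratio ~ x1,      family = quasibinomial, weights = total)
q2 <- glm(ratio ~ x1 + x2, family = quasibinomial, weights = total)
print(AIC(q1))

# Nested quasi-binomial models can be compared with an F test instead.
cmp <- anova(q1, q2, test = "F")
print(cmp)
```

The F test only applies to nested models. Whether a quasi-AIC is appropriate for non-nested comparisons is, as noted in the comment, debated, so it is left out of this sketch.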