When I run a logistic regression in XLstat and then fit the same model in R on the same data (same variables, exactly the same observations), I get totally different coefficients. The essential R code is below. Could somebody explain why there is such a difference, and how I can replicate the XLstat results in R?
library(caTools)
set.seed(88)
split <- sample.split(train$Recommended, SplitRatio = 0.75)
dresstrain <- subset(train, split == TRUE)
dresstest <- subset(train, split == FALSE)
model <- glm(one ~ two + three + four, data = dresstrain, family = binomial)
R output
Coefficients:
               Estimate Std. Error z value Pr(>|z|)
(Intercept)  -2.295e+02  8.058e+06       0        1
Altitude     -1.532e-01  1.033e+03       0        1
Pool_length  -8.374e+00  8.042e+04       0        1
Pool_breadth  1.063e+01  2.102e+05       0        1
Pool_Depth   -4.799e+02  7.066e+06       0        1
pH            8.422e+00  2.344e+05       0        1
Conductivity  3.522e-01  3.790e+04       0        1
TDS          -2.709e-01  7.375e+04       0        1
Temperature   6.800e+00  2.010e+05       0        1
Nitrate      -1.041e+03  7.301e+06       0        1
Phosphate     3.807e+00  9.269e+04       0        1
Sodium        5.410e+00  1.634e+05       0        1
Ammonium     -2.277e+02  1.696e+06       0        1
Potassium    -5.502e+01  1.133e+06       0        1
Calcium       1.969e+01  3.628e+05       0        1
Magnesium    -4.456e+01  1.221e+06       0        1
Fluride       6.257e+00  7.875e+05       0        1
Chloride      1.982e+01  6.618e+04       0        1
Bromide      -5.380e+01  5.328e+05       0        1
Sulphate      4.050e-01  3.086e+04       0        1

set.seed): can you check the models are the same if you estimate them on the whole data? – Vincent Guillemot Jan 06 '20 at 15:53
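
The check suggested in the comment above could be sketched roughly as follows. This is only an illustration under stated assumptions: the real `train` data frame is not shown in the question, so the data here is simulated, the column names `one`, `two`, `three`, `four` are the placeholders from the question's formula, and the 75% subset is drawn with base R's `sample()` as a simple (non-stratified) stand-in for `caTools::sample.split`.

```r
# Hypothetical stand-in data; replace with the real `train` data frame.
set.seed(88)
n <- 200
train <- data.frame(two = rnorm(n), three = rnorm(n), four = rnorm(n))
train$one <- rbinom(n, 1, plogis(0.5 * train$two - 0.3 * train$three))

# Fit on every row: no random split is involved, so the result
# does not depend on the seed and can be compared with another tool
# that was also run on the full data.
full_model <- glm(one ~ two + three + four, data = train, family = binomial)

# Fit on a random 75% subset (assumption: plain random sampling,
# not the stratified split that caTools::sample.split performs).
idx <- sample(seq_len(n), size = round(0.75 * n))
split_model <- glm(one ~ two + three + four, data = train[idx, ], family = binomial)

# Coefficients side by side: differences between the columns come
# only from which rows were used for fitting.
print(cbind(full = coef(full_model), split = coef(split_model)))
```

If the full-data fit in R agrees with the other program's output, the discrepancy is likely due to the rows selected by the split rather than to the estimation itself.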