I am fitting a model to infer which covariates affect whether a fish has an empty stomach (1 = empty, 0 = not empty). I grand-mean centered the variable "SL" (standard length) so that the intercept refers to a fish of average length rather than to SL = 0. However, I'm not sure how to interpret the interaction terms in the summary output when one of the covariates is centered. My categorical variable is "fZone" (factor Zone, my location variable).
center_sl = grand-mean centered standard length of each fish caught
fZone = location of catch (3 levels)
> table(c_neb5$fZone)
 Rankin    West Whipray
    201     436      42
c_neb5$center_sl <- scale(c_neb5$SL, scale=FALSE)
mod2 <- bam(empty ~
center_sl +
fZone +
center_sl:fZone +
...,
data = c_neb5,
method = 'fREML',
discrete = TRUE,
family = binomial(link = "logit"),
select = FALSE)
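As a side note on the centering step, here is a minimal sketch of what `scale(..., scale = FALSE)` returns. It uses only the ten `SL` values from the subset posted further down, so the centered values will not match `center_sl` in the full data (which was centered on the mean of all 679 fish). The names `centered_matrix` and `centered_vector` are illustrative only.

```r
# Ten SL values taken from the example_data subset posted in this question.
SL <- c(33.58, 20.12, 50.25, 23.18, 68.72, 14.85, 73.49, 61.84, 13.26, 25.79)

# scale() with scale = FALSE only subtracts the mean, but it returns a
# one-column matrix (with a "scaled:center" attribute), not a plain vector.
centered_matrix <- scale(SL, scale = FALSE)

# Subtracting the mean by hand gives a plain numeric vector.
centered_vector <- SL - mean(SL)

# Same centered values; only the container differs.
print(dim(centered_matrix))
print(all.equal(as.numeric(centered_matrix), centered_vector))
```

This is why `center_sl` shows up with `dim = c(10L, 1L)` in the `dput()` output; wrapping the call in `as.numeric()` would store it as an ordinary numeric column instead.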
EDIT: Full model summary
> summary(mod2)
Family: binomial
Link function: logit
Formula:
empty ~ center_sl + fZone + center_sl:fZone + s(sal) + s(temp) +
s(ToD) + s(fStation, bs = "re") + s(fCYR, bs = "re") + s(fStation,
fCYR, bs = "re") + s(fStation, CYR.std, bs = "re")
Parametric coefficients:
                        Estimate Std. Error z value Pr(>|z|)
(Intercept)            -1.298719   0.291203  -4.460  8.2e-06 ***
center_sl              -0.038851   0.011985  -3.242  0.00119 **
fZoneWest               0.122594   0.311480   0.394  0.69389
fZoneWhipray           -0.327579   0.639371  -0.512  0.60841
center_sl:fZoneWest    -0.002926   0.014650  -0.200  0.84169
center_sl:fZoneWhipray  0.061163   0.025891   2.362  0.01816 *
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Approximate significance of smooth terms:
                          edf  Ref.df Chi.sq  p-value
s(sal)              1.783e+00   2.231  1.590 0.558432
s(temp)             1.128e+00   1.236  2.972 0.134198
s(ToD)              2.112e+00   2.637 16.235 0.000755 ***
s(fStation)         1.096e-04  82.000  0.000 0.619807
s(fCYR)             4.740e+00  12.000 14.165 0.009002 **
s(fCYR,fStation)    9.693e+00 237.000 11.111 0.205201
s(CYR.std,fStation) 1.258e+01  80.000 23.798 0.008646 **
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
R-sq.(adj) = 0.16 Deviance explained = 18.1%
fREML = 990.1 Scale est. = 1 n = 679
My interpretation is that (Intercept) = -1.298719 means an average-sized fish has odds of exp(-1.298719) = 0.273 of having an empty stomach; fZoneWest = 0.122594 means the odds of an empty stomach in West, compared to my reference level (Rankin), are multiplied by exp(0.122594) = 1.130425; and center_sl:fZoneWest = -0.002926 means that for every 1 unit above average in size, the odds of an empty stomach are multiplied by exp(-0.002926) = 0.9970783, compared to my reference level. Am I on the right track? Any advice or corrections are greatly appreciated! The data set has 679 rows, so the best I could do was post a subset of it below.
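The arithmetic in this paragraph can be reproduced in R. The following is a minimal sketch, not a definitive reading of the model: it hard-codes the estimates from the summary and assumes R's default treatment contrasts with Rankin as the reference level (which the coefficient names suggest). Under that assumption an interaction coefficient is the difference in the `center_sl` slope relative to Rankin, so the zone-specific slopes are sums. The names `b`, `odds_scale` and `slope_by_zone` are illustrative only.

```r
# Estimates copied from the parametric part of summary(mod2) in this question.
b <- c(
  "(Intercept)"            = -1.298719,
  "center_sl"              = -0.038851,
  "fZoneWest"              =  0.122594,
  "fZoneWhipray"           = -0.327579,
  "center_sl:fZoneWest"    = -0.002926,
  "center_sl:fZoneWhipray" =  0.061163
)

# Exponentiate to move from the log-odds scale to odds / odds ratios.
odds_scale <- exp(b)

# ASSUMPTION: treatment contrasts with Rankin as the reference level.
# The interaction terms are then slope differences relative to Rankin,
# so each zone's slope for center_sl is the main effect plus its interaction.
slope_by_zone <- c(
  Rankin  = b[["center_sl"]],
  West    = b[["center_sl"]] + b[["center_sl:fZoneWest"]],
  Whipray = b[["center_sl"]] + b[["center_sl:fZoneWhipray"]]
)

print(round(odds_scale, 4))
# Multiplicative change in the odds per 1-unit increase in SL, by zone.
print(round(exp(slope_by_zone), 4))
```

Note that these values are conditional on the other terms in the model: the smooths and random effects also contribute to the linear predictor, so the intercept corresponds to a Rankin fish of average length only when those terms contribute zero.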
Subset of my data:
example_data <- c_neb5[sample(nrow(c_neb5), 10), ]
> dput(example_data)
structure(list(CYR_Keyfield = c("C-2018-10-6-255", "C-2017-6-26-278",
"C-2018-9-16-291", "C-2017-10-9-265", "C-2010-11-10-167", "C-2019-10-30-169",
"C-2018-10-6-279", "C-2022-7-10-241", "C-2017-9-4-70", "C-2022-6-23-241"
), Species = c("Cynoscion nebulosus", "Cynoscion nebulosus",
"Cynoscion nebulosus", "Cynoscion nebulosus", "Cynoscion nebulosus",
"Cynoscion nebulosus", "Cynoscion nebulosus", "Cynoscion nebulosus",
"Cynoscion nebulosus", "Cynoscion nebulosus"), ID = c("201810255_86",
"20176278_52", "20189291_39", "201710265_100", "201011167_61",
"201910169_54", "201810279_75", "20227241_46", "2017970_91",
"20226241_34"), SL = c(33.58, 20.12, 50.25, 23.18, 68.72, 14.85,
73.49, 61.84, 13.26, 25.79), empty = c(0, 0, 0, 0, 0, 1, 0, 0,
1, 0), DateTime = structure(c(1538842500, 1498499220, 1537107120,
1507558920, 1289399700, 1572449400, 1538837160, 1657460040, 1504530660,
1656001620), class = c("POSIXct", "POSIXt"), tzone = ""), CYR = c(2018L,
2017L, 2018L, 2017L, 2010L, 2019L, 2018L, 2022L, 2017L, 2022L
), Month = c(10L, 6L, 9L, 10L, 11L, 10L, 10L, 7L, 9L, 6L), DoY = c(279,
177, 259, 282, 314, 303, 279, 191, 247, 174), ToD = c(12.25,
13.7833333333333, 10.2, 10.3666666666667, 9.58333333333333, 11.5,
10.7666666666667, 9.56666666666667, 9.18333333333333, 12.45),
JDay = c(5129, 4662, 5109, 4767, 2242, 5518, 5129, 6502,
4732, 6485), Zone = c("Rankin", "Rankin", "Whipray", "Rankin",
"West", "West", "Rankin", "Rankin", "West", "Rankin"), Station = c(255,
278, 291, 265, 167, 169, 279, 241, 70, 241), Standard_collection_station = c(0,
0, 0, 0, 0, 0, 0, 1, 1, 1), Latitude = c(25.085, 25.145,
25.118, 25.135, 25.106, 25.081, 25.133, 25.0750000309199,
25.132, 25.0750000309199), Longitude = c(-80.802, -80.809,
-80.76, -80.823, -80.917, -80.893, -80.797, -80.8159999921917,
-80.941, -80.8159999921917), sal = c(38.27, 41.01, 33.61,
26.75, 32, 36.18, 36.42, 40.08, 38.1, 39.07), temp = c(27.856,
32.2, 31.791, 29.512, 19.3, 28.398, 27.6679999999999, 30.243,
29.71, 29.262), fCYR = structure(c(9L, 8L, 9L, 8L, 2L, 10L,
9L, 12L, 8L, 12L), levels = c("2009", "2010", "2011", "2012",
"2013", "2015", "2016", "2017", "2018", "2019", "2021", "2022"
), class = "factor"), fMonth = structure(c(8L, 4L, 7L, 8L,
9L, 8L, 8L, 5L, 7L, 4L), levels = c("1", "3", "5", "6", "7",
"8", "9", "10", "11", "12"), class = "factor"), fStation = structure(c(60L,
69L, 76L, 63L, 40L, 42L, 70L, 57L, 11L, 57L), levels = c("20",
"21", "22", "23", "24", "40", "54", "65", "67", "68", "70",
"71", "73", "101", "105", "106", "107", "111", "112", "117",
"118", "119", "122", "123", "124", "130", "133", "134", "135",
"137", "143", "144", "145", "146", "147", "156", "157", "158",
"159", "167", "168", "169", "171", "172", "173", "174", "175",
"176", "224", "225", "226", "227", "229", "237", "239", "240",
"241", "253", "254", "255", "256", "257", "265", "266", "267",
"268", "269", "270", "278", "279", "280", "281", "282", "284",
"290", "291", "292", "294", "301", "302", "312", "609"), class = "factor"),
fZone = structure(c(1L, 1L, 3L, 1L, 2L, 2L, 1L, 1L, 2L, 1L
), levels = c("Rankin", "West", "Whipray"), class = "factor"),
CYR.std = c(9L, 8L, 9L, 8L, 1L, 10L, 9L, 13L, 8L, 13L), center_sl = structure(c(-6.70160530191458,
-20.1616053019146, 9.96839469808542, -17.1016053019146, 28.4383946980854,
-25.4316053019146, 33.2083946980854, 21.5583946980854, -27.0216053019146,
-14.4916053019146), dim = c(10L, 1L)), center_sal = structure(c(1.43373534609722,
4.17373534609722, -3.22626465390278, -10.0862646539028, -4.83626465390278,
-0.656264653902781, -0.416264653902779, 3.24373534609722,
1.26373534609722, 2.23373534609722), dim = c(10L, 1L)), center_temp = structure(c(-1.51357879234165,
2.83042120765835, 2.42142120765835, 0.142421207658348, -10.0695787923417,
-0.971578792341653, -1.70157879234175, 0.873421207658346,
0.340421207658348, -0.107578792341652), dim = c(10L, 1L))), row.names = c(495L,
364L, 303L, 652L, 404L, 375L, 469L, 676L, 508L, 675L), class = "data.frame")
lmer or glmer which are suited for linear models? – Shawn Hemelstrand May 23 '23 at 22:00