Say I have some data where a dependent variable, dv, is a function of an independent variable, iv, and a categorical predictor, cat. Here are some example data, generated in R:
set.seed(1)
a <- 1:100                  # independent variable values
err <- rnorm(100, sd = 30)  # noise, shared by both groups
b <- a + err                # dv when cat is 0
c <- a + err + 20           # dv when cat is 1: shifted up by exactly 20
cat1 <- rep(0, 100)
cat2 <- rep(1, 100)
iv <- c(a, a)
dv <- c(b, c)
cat <- c(cat1, cat2)
data <- data.frame(dv = dv, iv = iv, cat = cat)
I then model dv as a function of iv and cat with this code:
summary(lm(dv~iv + cat, data=data))
and get the following output:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)   3.94997    4.29669   0.919    0.359
iv            0.98647    0.06617  14.909  < 2e-16 ***
cat          20.00000    3.81999   5.236  4.2e-07 ***
Now, I want to plot the effect of cat using a standard bar graph: means and error bars. So, based on the model, I calculate what the value of dv should be when cat is 0 and when cat is 1, using a common iv value of 50. For my particular data set, I get dv values of 53.27339 and 73.27339 for cat levels 0 and 1, respectively.
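For reference, this is (roughly) how I computed those values, using predict() on a small data frame that holds iv at 50 for both levels of cat:

fit <- lm(dv ~ iv + cat, data = data)
newdat <- data.frame(iv = 50, cat = c(0, 1))  # iv fixed at 50 for cat = 0 and cat = 1
predict(fit, newdata = newdat)
# 53.27339 73.27339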
My question is: Which term from the model should I use for the error bars? Should I just use the standard error of the cat coefficient? Or something more complex that also incorporates the standard errors of the intercept and the iv coefficient? A sketch of both options follows below.
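To make the two options concrete, here is a sketch of what I mean by each (continuing from the predict() code above; the se.fit output is my guess at what the "more complex" option would look like):

# Option 1: just the standard error of the cat coefficient
summary(fit)$coefficients["cat", "Std. Error"]   # 3.81999

# Option 2: the standard error of each fitted value at iv = 50,
# which also propagates uncertainty in the intercept and iv estimates
pr <- predict(fit, newdata = newdat, se.fit = TRUE)
pr$fit     # predicted dv for cat = 0 and cat = 1
pr$se.fit  # one standard error per prediction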
