
I am trying to evaluate the relationship between two variables, x and y. Variable x has a number of known covariates: a, b, and c. I have modeled how these covariates are related to x in R with the formula:

lm(x ~ a + b + c)

How do I adjust x such that I can plot a regression line for y ~ x?

I apologize if this is a simple question. Everywhere I look, the answer is to make the adjustment in the linear model itself by adding the covariates as terms (e.g. y ~ x + a + b + c). This does not seem right to me: y may or may not covary with a, b, and c, but I am only interested in controlling for the effect these have on x. Am I missing something conceptually? How can I do this in R?

womy
  • Is your question motivated by some application? It seems like you have a two-stage least squares problem. – utobi Oct 28 '22 at 20:15
  • You can consider plotting the residuals of y ~ a + b + c against the residuals of x ~ a + b + c. Alternatively, you can create some ranges of values of [a, b, c] and plot the bivariate y ~ x relation in a grid among those ranges (see ?coplot in R for some motivating examples). – AdamO Oct 28 '22 at 20:33
  • See https://stats.stackexchange.com/questions/17336 if you need any elaboration of @AdamO's suggestions. – whuber Oct 28 '22 at 21:33
  • It sounds like what you want is y ~ w, where w is x adjusted for a, b, c. But how would you know how to make this adjustment? You could use some theoretical or empirical understanding of how x, a, b, c could be combined into w. Or, I think, you would use y ~ x + a + b + c, and look at the predicted values from that regression. That would be the best combination of x, a, b, c to predict y given the constraints of the chosen model. – Sal Mangiafico Oct 28 '22 at 22:44
  • This is helpful, thank you! – womy Oct 28 '22 at 22:48
  • @SalMangiafico Yes, your first line is what I want, thank you! Can you briefly explain why y ~ x + a + b + c would work for this? – womy Oct 28 '22 at 22:51
  • A potentially useful search term is partial regression plot. E.g. see https://en.wikipedia.org/wiki/Partial_regression_plot particularly some of the references – Glen_b Oct 29 '22 at 00:51
  • @Glen_b Thank you!! – womy Oct 29 '22 at 01:03
  • To add to the other comments: throughout you assume that the associations are linear, which might or might not be a justified assumption. – dipetkov Oct 29 '22 at 13:29

1 Answer


Answer mostly from comments

This answer assumes you want a linear combination of variables, without polynomials or interactions.

It sounds like what you want is y ~ w, where w is x adjusted for a, b, c. But how would you know how to make this adjustment?

  • You could use some theoretical or empirical understanding of how x, a, b, c could be combined into w.

  • Or you could use y ~ x + a + b + c, and look at the predicted values from that regression. That would be the best combination of x, a, b, c to predict y given the constraints of the chosen model.

In the case of the second approach, you could extract the coefficients from the model, and consider the predicted values from the model to be your new w values.
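A minimal sketch of the second approach in R, using simulated data. The data frame `d`, the seed, and the coefficients used to generate the data are all made up for illustration; substitute your own variables.

```r
# Simulated data: names and generating coefficients are illustrative assumptions
set.seed(1)
n <- 200
d <- data.frame(a = rnorm(n), b = rnorm(n), c = rnorm(n))
d$x <- 1 + 0.5 * d$a - 0.3 * d$b + 0.8 * d$c + rnorm(n)
d$y <- 2 + 1.5 * d$x + 0.4 * d$a + rnorm(n)

# Fit the full model with the covariates as additional terms
fit <- lm(y ~ x + a + b + c, data = d)

# Extract the estimated coefficients
beta <- coef(fit)

# Treat the predicted (fitted) values as the new variable w
d$w <- fitted(fit)

# The same values rebuilt by hand from the coefficients
w_manual <- beta["(Intercept)"] + beta["x"] * d$x + beta["a"] * d$a +
  beta["b"] * d$b + beta["c"] * d$c

# Plot y against w and add the regression line for y ~ w
plot(d$w, d$y, xlab = "w (fitted values)", ylab = "y")
abline(lm(y ~ w, data = d))
```

Because w is built from the least-squares fit with an intercept, the line for y ~ w has intercept 0 and slope 1 by construction, so the plot mainly shows how tightly y follows the fitted combination.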

But note that these coefficients, and therefore the w values, are determined solely by the values in your current data set.

Also note that practically speaking, with the second approach, you are simply doing what you find to be the common approach to your problem, fitting a model y ~ x + a + b + c.
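One way to check this is to compare the full model with the residual-based adjustment suggested in the comments (the partial regression idea). The sketch below again uses simulated data with made-up names and coefficients. It replaces x by its residuals from lm(x ~ a + b + c) and shows that the slope of y on this adjusted x matches the coefficient of x in the full model.

```r
# Simulated data: names and generating coefficients are illustrative assumptions
set.seed(2)
n <- 200
d <- data.frame(a = rnorm(n), b = rnorm(n), c = rnorm(n))
d$x <- 1 + 0.5 * d$a - 0.3 * d$b + 0.8 * d$c + rnorm(n)
d$y <- 2 + 1.5 * d$x + 0.4 * d$a + rnorm(n)

# x adjusted for a, b, c: the part of x the covariates do not explain
d$x_adj <- resid(lm(x ~ a + b + c, data = d))

# y adjusted the same way, as used in a partial regression plot
d$y_adj <- resid(lm(y ~ a + b + c, data = d))

full      <- lm(y ~ x + a + b + c, data = d)  # covariates as terms
partial   <- lm(y_adj ~ x_adj, data = d)      # residuals on residuals
one_sided <- lm(y ~ x_adj, data = d)          # only x adjusted

# Collect the three slope estimates for comparison
slopes <- c(
  full      = unname(coef(full)["x"]),
  partial   = unname(coef(partial)["x_adj"]),
  one_sided = unname(coef(one_sided)["x_adj"])
)
print(slopes)

# Partial regression plot with its regression line
plot(d$x_adj, d$y_adj, xlab = "x adjusted for a, b, c",
     ylab = "y adjusted for a, b, c")
abline(partial)
```

The three slopes agree up to floating-point error (the Frisch-Waugh-Lovell result), so adding a, b, and c as terms already gives the slope for x adjusted for those covariates. The standard errors reported by the two shortcut fits generally differ from those of the full model, so inference is better taken from the full model.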

Sal Mangiafico