
This has probably been asked quite a few times, but I do not know what to search for.

My question is: if I have an X variable in my linear regression that is a percentage of something, does this "destroy" my linear regression? Furthermore, if I have multiple variables that are percentages of something, will this destroy my linear regression even more?

Since even I find it hard to understand my question, I will give an example:

$$ Y_{totalcost} = 0 + \beta_0X_{app} + \beta_1X_{morning} + \beta_2X_{midday} + \beta_3X_{afternoon} + \beta_4X_{night} + \beta_5X_{babstation} + \beta_6X_{notbabstation} + \beta_7X_{drivendistance} + \beta_8X_{fillingstations} $$

I want to find out how the usage of a refueling app, where consumers can look up the current fuel prices in their area, affects the total cost. Therefore I have created statistical twins, where one observation has used the app and the other has not.

The X[morning, midday, afternoon, night] variables describe the number of refill events in the morning (midday, afternoon, night) divided by the total number of refill events; they sum up to 1.

The X[babstation, notbabstation] variables describe the number of refill events that took place at a highway refilling station (or not at one, respectively) divided by the total number of refill events.

As you can see, I have two groups of variables that each sum up to one, and I'm not sure if I can do this in a linear regression. I'm trying to implement such a model in R, so hints on how to do this in R would be great.
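
To make the concern concrete, here is a sketch of the two constraints in the notation of the model, assuming every observation has at least one refill event so that the shares are defined:

$$ X_{morning} + X_{midday} + X_{afternoon} + X_{night} = 1, \qquad X_{babstation} + X_{notbabstation} = 1 $$

Subtracting one from the other gives a relation that holds for every observation:

$$ X_{morning} + X_{midday} + X_{afternoon} + X_{night} - X_{babstation} - X_{notbabstation} = 0 $$

So one share in each group is fully determined by the others (for example $X_{night} = 1 - X_{morning} - X_{midday} - X_{afternoon}$), and the two groups are exactly linearly related to each other. This is the linear dependence among the regressors that the question is about.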

Benjamin
  • Is it conceivable that the app would give someone information that would lead them, for instance, to choose a babstation instead of a nonbabstation? – jlimahaverford Sep 15 '15 at 09:58
  • I updated my question a little. I have statistical twins, where one has used an app and the other has not. Basically, the one without the app always chooses the nearest filling station, whereas the one with the app takes the price into consideration and chooses the cheapest filling station. – Benjamin Sep 15 '15 at 10:26
  • Can you answer my specific question? Could it make them choose a babstation instead of a nonbabstation? – jlimahaverford Sep 15 '15 at 10:35
  • They might, if the highway station is cheaper. That's what I was trying to say in my answer. The babstation variable for each refill event is a dummy that indicates whether the refill event took place at a highway filling station. The same is true for morning, afternoon, midday and night. – Benjamin Sep 15 '15 at 10:40
  • Then you might not see the usefulness of the app manifested in $\beta_0$. It might be manifested in shifts in the distributions of different IVs. – jlimahaverford Sep 15 '15 at 10:55
  • See https://stats.stackexchange.com/questions/58664/ratios-in-regression-aka-questions-on-kronmal/410465#410465 – kjetil b halvorsen Nov 29 '22 at 16:06
