1

I have an experiment where we measure the energy used by a building and want to regress this energy linearly against so-called degree-days, calculated with two different methods. The data looks like this:

enter image description here

A regression line has been added to each group, that has been forced to go through the origin.

I want to compute the slope of these lines (with std. error), but I'm not sure what is the right way. My data looks like this:

> alvDegreeDays[sample(nrow(alvDegreeDays), 4),]
          Energy  BaseTemp DegreeDays
Feb 2014   984.7 Estimated   365.9771
Mar 2014   864.7 Estimated   307.2246
Apr 20151  512.8       SIA    50.0000
Sep 2015   239.2 Estimated    95.4787

I've tried this first:

lm(Energy ~ DegreeDays * BaseTemp + 0, alvDegreeDays)

Call:
lm(formula = Energy ~ DegreeDays * BaseTemp + 0, data = alvDegreeDays)

Coefficients:
            DegreeDays       BaseTempEstimated             BaseTempSIA  
                 2.436                  23.094                 174.390  
DegreeDays:BaseTempSIA  
                 1.181  

But this yields BaseTempEstimated and BaseTempSIA terms which are, in effect, intercept terms.

Next I tried the following:

(foo <- lm(Energy ~ DegreeDays + DegreeDays:BaseTemp + 0, alvDegreeDays))

Call:
lm(formula = Energy ~ DegreeDays + DegreeDays:BaseTemp + 0, data = alvDegreeDays)

Coefficients:
                  DegreeDays  DegreeDays:BaseTempEstimated        DegreeDays:BaseTempSIA  
                       4.401                        -1.897                            NA  

This looks better, but when I try to call predict on this model I get weird error messages:

> predict(foo, list(DegreeDays = 1, BaseTemp = "Estimated"))
       1 
2.504507 
Warning message:
In predict.lm(foo, list(DegreeDays = 1, BaseTemp = "Estimated")) :
  prediction from a rank-deficient fit may be misleading

Any idea what I may be doing wrong (or right) here?

lindelof
  • 1,119

2 Answers2

3

I think what you want is:

lm(Energy ~ 0 + DegreeDays : BaseTemp, alvDegreeDays)

Which should give:

Call:
lm(formula = Energy ~ 0 + DegreeDays:BaseTemp, data = alvDegreeDays)

Coefficients:
DegreeDays:BaseTempEstimated        DegreeDays:BaseTempSIA  
                       2.504                         4.401 
pete
  • 210
1

Given that you only want the slopes, and both lines are constrained to pass through the origin, there's a simple solution: split the dataset by BaseTemp and run separate regressions.

alvDegreeDays_est <- alvDegreeDays[alvDegreeDays$BaseTemp == 'Estimated',]
alvDegreeDays_sia <- alvDegreeDays[alvDegreeDays$BaseTemp == 'SIA',]

lm_est <- lm(Energy ~ DegreeDays + 0, alvDegreeDays_est)
lm_sia <- lm(Energy ~ DegreeDays + 0, alvDegreeDays_sia)
mkt
  • 18,245
  • 11
  • 73
  • 172