This topic has been discussed thoroughly, but I have never come across an explanation that is intuitive. There may also be several possible reasons for the effect, none of which I find intuitive.

I have a study of patients who have had a bleed in the brain. They are treated with surgery, after which a drain is inserted for up to 24 hours to remove any residual blood. I want to know whether the length of time the drain is in place is related to recurrence of brain bleeds later on.

If I simply fit a logistic regression of recurrence on drain time, I don't get a significant relationship. However, if I add a covariate, the size of the bleed in milliliters, the coefficient for drain time becomes significant.

With a larger bleed you might expect a longer drain time, but I don't see how this shifts the relationship so drastically.

Can anyone see through the numbers and intuitively explain why this happens? I also think such an answer would be of use for a lot of users asking this question and looking for an intuitive explanation on a concrete example.

Example code:

[CODE]
* Example generated by -dataex-. To install: ssc install dataex
clear
input float(Reoccurence ProductionHours sizeAdjust)
0  8  1.67265
0 22     5.28
0 24   5.9778
0 18     4.29
0  6     3.12
0  0   4.9245
0  7     9.18
0 17   11.088
1 11    10.12
0  9   6.4935
0  1   8.6768
0  4   9.4752
0 13 15.15105
0 22        .
0 18   25.185
0  4  10.4058
1  0  5.47515
0 16  10.4104
0 16  8.12515
0  3    8.775
0  8   8.7098
0 20    9.591
0 23    4.224
0  8  9.01425
0  3   2.1417
1  0  13.5036
0 13    8.239
0 15  10.1365
0 15   5.4432
0 14   14.685
0 24   22.011
1 12   9.0937
1 15  26.6067
0 13  17.5272
0  3   10.528
0 17  20.1856
0  1   6.7527
0 12    5.612
0  1   2.3114
0  8  15.9588
0  2   11.534
0 12   10.115
1 16   10.296
0 17    3.528
0 24   13.224
0 19    7.917
0 24    12.95
1  6   17.875
0 20   10.332
0  4   11.745
0  4     19.6
0 15     13.8
0  7   22.185
0  2    4.875
0  9   12.012
0 11  11.5575
1  3    8.835
0  6  16.1161
0  3  19.5776
0 16  14.4144
0 12     15.9
0 13  15.6664
0  1     7.56
0 24  10.1439
1 13    11.88
0 24   9.9279
0 14  11.1375
0  3        .
0 24  11.7612
0 10   4.9504
0  3        .
0 23    9.734
0 24   7.3575
0 21   6.1968
0  9   13.167
0  2   6.3597
0  8    4.675
0  1   8.3121
1 16   16.132
1 18   16.102
1 16     13.9
0  2   7.7653
0 16  15.8158
0 10  14.0332
0  5    17.76
0 23   16.014
0  8   16.422
0 21   12.064
0 22   4.6926
0 24  10.3071
0 17   13.122
0  4     9.01
0  5   11.904
0  2  12.4168
1  3  15.3792
1  2   4.9044
.  0    3.829
0 11   8.8011
0  0   5.9363
0 17    8.763
end
[/CODE]

Watch what happens when you regress Reoccurence on ProductionHours alone and then add sizeAdjust.
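A minimal sketch of that comparison, assuming the variable names from the dataex listing above. The `insample` indicator is a hypothetical helper name, added only so that the unadjusted model can be refit on exactly the rows the adjusted model uses (three rows have a missing sizeAdjust and would otherwise be in one model but not the other). The correlation line is an extra check, not part of the original analysis.

[CODE]
* Model 1: drain time alone
logit Reoccurence ProductionHours

* Model 2: drain time adjusted for bleed size
* (rows with missing sizeAdjust are dropped automatically)
logit Reoccurence ProductionHours sizeAdjust

* Flag the rows Model 2 actually used (hypothetical helper variable)
generate byte insample = e(sample)

* Model 1 again, restricted to the same rows as Model 2
logit Reoccurence ProductionHours if insample

* How strongly are the two predictors related on those rows?
correlate ProductionHours sizeAdjust if insample
[/CODE]

Comparing the ProductionHours coefficient and its standard error across the three fits should show whether the change comes from the covariate itself or partly from the slightly different sample.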

Paze
  • If you expected to get the same result, why would you add another variable? A very simple example: What's the effect of weight on body fat percentage? What's the effect of weight, controlling for height, on body fat percentage? Which will be higher? – Jeremy Miles Jun 09 '21 at 16:57
  • Don't think in terms of significant vs. not significant; that's an arbitrary threshold. – Jeremy Miles Jun 09 '21 at 16:57
  • The quick answer is that some other variable introduces variability that you are not considering when you use just the one variable. By including that variable, you account for that variability, which gives you a higher signal-to-noise ratio and makes differences easier to detect. Perhaps look into ANCOVA (analysis of covariance). – Dave Jun 09 '21 at 17:00
