
I'm looking at carrying out inference on the effect of comorbidity, polypharmacy and personal characteristics (age, ethnicity, etc.) on health outcomes such as time on trial and emergency scans/admissions.

I've found some research that combines polypharmacy and comorbidity into a single variable, the comorbidity-polypharmacy score (CPS): the sum of the number of diseases a patient has and the number of medications they are on.

I have a few questions about this:

  1. It seems problematic that all diseases and medications are counted the same. For example, in a linear model, the response variable would show the same unit change for an additional medication whether that medication is paracetamol or fentanyl. The same objection applies to tallying diseases. Are there arguments, such as simplicity for clinicians, that overrule these concerns?

  2. Even if counting every item equally is accepted, wouldn't it be preferable to estimate the coefficients of medication count and comorbidity count separately, and capture any interplay between them with an interaction term?

  3. I was considering carrying out the analysis with both the CPS and the approach in 2. However, because of multiple testing, it feels that additional caution would be needed, even with multiplicity corrections, when interpreting anything significant. Would it be preferable to stick to one analysis and write up only that?

Geoff
  • I think your statement (1) is a problem with simple counts but is based on a false premise about what actually gets used: there are lots of (co)morbidity scores that work by applying a previously calculated weighting for each given condition (or medication) and summing these to make a composite score. For example, classically the Charlson index was the tool of choice and more recent updates like the Elixhauser index are now perhaps preferable. – James Stanley Oct 03 '23 at 02:20
  • If you haven't seen it, there is quite a wide literature on choices of measures: see e.g. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7190061/ which doesn't explicitly cover condition-counting measures but has an extensive reference list that includes papers covering that topic, noting that in some instances simple counts can be more effective than you might expect... but of course it does depend on what you are counting. – James Stanley Oct 03 '23 at 02:22

1 Answer


Frank Harrell devotes a good deal of attention to formulating composite independent variables in Regression Modeling Strategies. In Section 4.7, that's discussed in the context of "data reduction": you have more predictors than you can incorporate into your model without overfitting, so you try to summarize combinations of predictors so that you have fewer "independent" variables in your model. Several of the Case Studies in that online book show the application of those principles to clinical studies. Sometimes simplicity for clinical application is considered in deciding how to proceed. What would make the most sense for your particular application would depend on knowledge of your specific subject matter, which I don't have. The recent reference provided by James Stanley in a comment on your question seems to be a good place to learn about best practices for comorbidity scores.

If you have a single model including medications, comorbidities, and an interaction between them, then there isn't a big problem with multiple testing. You will have a "significance" estimate for the whole model, and the interaction coefficient will indicate how much the association of medications with outcome depends on comorbidities (and vice-versa). Sometimes people build separate models for separate predictors, an inefficient use of data that almost always leads to problems with things like omitted-variable bias. Follow instead Harrell's advice in the Preface to build a single comprehensive model:

A good overall strategy is to decide how many degrees of freedom (i.e., number of regression parameters) can be "spent", where they should be spent, to spend them with no regrets.

Chapter 4 explains in detail.
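To make the single-model idea concrete, here is a minimal sketch in Python (NumPy only) on simulated data. Everything in it is an assumption for illustration: the variable names (`n_meds`, `n_conditions`, `outcome`), the simulated effect sizes, and the choice of ordinary least squares on a continuous outcome. It is not taken from Harrell's book, and a time-to-event outcome such as time on trial would call for a survival model instead. The point it shows is that a CPS model is a constrained special case of the model with separate coefficients and an interaction (equal slopes, zero interaction), so the two can be compared within one nested framework rather than treated as two unrelated analyses.

```python
import numpy as np

# Simulated data: all names and effect sizes are hypothetical.
rng = np.random.default_rng(0)
n = 500
n_meds = rng.poisson(4, size=n).astype(float)        # medication count
n_conditions = rng.poisson(2, size=n).astype(float)  # comorbidity count
outcome = (
    1.0
    + 0.5 * n_meds
    + 1.5 * n_conditions
    + 0.2 * n_meds * n_conditions
    + rng.normal(0.0, 2.0, size=n)
)


def fit_ols(X, y):
    """Least-squares fit; returns coefficients and residual sum of squares."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return beta, float(resid @ resid)


ones = np.ones(n)

# CPS model: one slope shared by medications and comorbidities.
X_cps = np.column_stack([ones, n_meds + n_conditions])

# Full model: separate slopes plus an interaction term.
X_full = np.column_stack([ones, n_meds, n_conditions, n_meds * n_conditions])

beta_cps, rss_cps = fit_ols(X_cps, outcome)
beta_full, rss_full = fit_ols(X_full, outcome)

# The CPS model imposes 2 constraints on the full model
# (equal slopes, zero interaction), so the models are nested.
df_diff = X_full.shape[1] - X_cps.shape[1]
df_resid = n - X_full.shape[1]
f_stat = ((rss_cps - rss_full) / df_diff) / (rss_full / df_resid)

print("CPS coefficients: ", beta_cps)
print("Full coefficients:", beta_full)
print("F statistic for the CPS constraints:", f_stat)
```

Under the usual linear-model assumptions, the F statistic would be referred to an F distribution with `df_diff` and `df_resid` degrees of freedom. In practice a modelling package would report this comparison directly; the hand calculation here is only meant to show where the two extra degrees of freedom are being spent.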

EdM
  • Thanks for this, I'll have a read! Regarding multiple testing, if several coefficients within a model are interpreted, do corrections need to be applied based on the number that I am interested in? Also, if several models are estimated (a CPS model and a model with separate coefficients and an interaction), does the adjustment need to cover the coefficients across both models? – Geoff Oct 03 '23 at 09:20
  • @Geoff don't get so hung up over arbitrary "statistical significance" thresholds. Look at this page. Try to avoid multiple separate models on the same data; follow Harrell's recommendations. Within a single model that is "significant" overall, people typically report coefficient p-values without correction. Focus on predictions for particular scenarios of interest; report their confidence intervals. See this answer. If you compare multiple scenarios, do corrections on those comparisons. – EdM Oct 03 '23 at 13:07