I can't seem to find a clear answer to my question. My dependent variable is a count (the number of times certain companies appeared in the news). One of my explanatory variables is the age of the company.

Would it make sense to square age in such a Poisson regression?

1 Answer

The answer isn't different for a Poisson regression versus others. You are generally wise to include nonlinear terms in continuous predictors like age. You are even wiser if you use a regression spline instead of a fixed polynomial.

There is seldom an exactly linear relationship between a continuous predictor like age and the quantity that the linear predictor models directly, which is the log of the expected count in a Poisson regression with a log link. Frank Harrell suggests a generally useful strategy: first decide how much complexity you can afford to devote to each of your predictors, then spend the corresponding number of degrees of freedom on each of them in a way that avoids overfitting. See Chapter 4 of his course notes or book.
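To make that concrete, here is how the candidate models could be written. The notation ($Y$ for the news count, $\beta_j$ for coefficients, $B_j$ for spline basis functions, $k$ for the number of knots) is mine for illustration and is not taken from the Harrell references. A model that is linear in age and one that adds a squared term are

$$\log E[Y \mid \text{age}] = \beta_0 + \beta_1\,\text{age} \qquad\text{and}\qquad \log E[Y \mid \text{age}] = \beta_0 + \beta_1\,\text{age} + \beta_2\,\text{age}^2,$$

whereas a regression spline replaces the fixed polynomial with a set of basis functions of age:

$$\log E[Y \mid \text{age}] = \beta_0 + \sum_{j=1}^{k-1} \beta_j\, B_j(\text{age}).$$

For a restricted cubic spline with $k$ knots, this spends $k-1$ degrees of freedom on age, compared with 2 for the quadratic.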

A simple quadratic form as you propose is seldom wise, however, as it assumes a strict functional relationship between age and log(counts). A regression spline allows the data to help show the form of the relationship. The second chapters of the Harrell references go into more detail. Generalized additive models are another approach to handle nonlinearities.
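As a rough illustration of the comparison, here is a minimal sketch in Python using only NumPy. Everything in it is an assumption for demonstration: the data are simulated (not the questioner's), the helper names `rcs_basis`, `fit_poisson`, and `deviance` are hypothetical, the spline is a restricted cubic spline in truncated-power form with 5 knots at fixed quantiles, and the Poisson fit is a bare-bones iteratively reweighted least squares loop. In real work you would use an established regression package rather than hand-rolled code like this.

```python
import numpy as np

def rcs_basis(x, knots):
    """Restricted (natural) cubic spline basis, truncated-power form.
    For k knots, returns k-1 columns: x itself plus k-2 nonlinear terms."""
    x = np.asarray(x, dtype=float)
    t = np.asarray(knots, dtype=float)
    scale = (t[-1] - t[0]) ** 2          # keeps columns on a scale similar to x
    d = t[-1] - t[-2]
    pos3 = lambda u: np.maximum(u, 0.0) ** 3
    cols = [x]
    for j in range(len(t) - 2):
        term = (pos3(x - t[j])
                - pos3(x - t[-2]) * (t[-1] - t[j]) / d
                + pos3(x - t[-1]) * (t[-2] - t[j]) / d)
        cols.append(term / scale)
    return np.column_stack(cols)

def fit_poisson(X, y, n_iter=50, tol=1e-8):
    """Poisson regression with a log link, fitted by IRLS.
    An intercept column is added here. Returns (coefficients, fitted means)."""
    X = np.column_stack([np.ones(len(y)), X])
    eta = np.log(y + 0.5)                # crude starting values
    for _ in range(n_iter):
        mu = np.exp(eta)
        z = eta + (y - mu) / mu          # working response
        sw = np.sqrt(mu)                 # square root of the working weights
        beta, *_ = np.linalg.lstsq(X * sw[:, None], z * sw, rcond=None)
        eta_new = X @ beta
        converged = np.max(np.abs(eta_new - eta)) < tol
        eta = eta_new
        if converged:
            break
    return beta, np.exp(eta)

def deviance(y, mu):
    """Poisson deviance; the y*log(y/mu) term is taken as 0 when y == 0."""
    ratio = np.where(y > 0, y / mu, 1.0)
    return 2.0 * np.sum(y * np.log(ratio) - (y - mu))

# Simulated data (an assumption): log expected count has a bump around age 15.
rng = np.random.default_rng(0)
age = rng.uniform(1, 60, size=500)
counts = rng.poisson(np.exp(0.5 + 1.5 * np.exp(-((age - 15) / 10) ** 2)))

# Five knots placed at fixed quantiles of age (an illustrative choice).
knots = np.quantile(age, [0.05, 0.275, 0.5, 0.725, 0.95])

designs = {
    "linear": age[:, None],
    "quadratic": np.column_stack([age, age ** 2]),
    "spline": rcs_basis(age, knots),
}
fits = {name: fit_poisson(X, counts) for name, X in designs.items()}
dev = {name: deviance(counts, mu) for name, (_, mu) in fits.items()}
for name in designs:
    print(f"{name:9s} residual deviance = {dev[name]:.1f}")

# Show the fitted association over a grid of ages instead of reading coefficients.
grid = np.linspace(5, 55, 6)
beta_spline = fits["spline"][0]
curve = np.exp(np.column_stack([np.ones(len(grid)), rcs_basis(grid, knots)]) @ beta_spline)
for a, m in zip(grid, curve):
    print(f"age {a:4.0f}: expected count {m:.2f}")
```

The residual deviances give only a rough in-sample comparison, and the spline uses more degrees of freedom than the quadratic, so a lower deviance alone does not settle the choice. The more useful output is the fitted curve over the age grid, which shows the shape the data suggest rather than one imposed in advance.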

EdM
  • This is a question for EdM, not an answer; I don't have enough points to post this as a comment, so I apologize in advance. My question is: how does one interpret the coefficients from a spline model? – Ahir Bhairav Orai Apr 27 '22 at 04:30
  • @AhirBhairavOrai Don't try to interpret the individual coefficients from a spline fit unless you really understand spline bases. The coefficients depend on how the spline is constructed, which differs among implementations. See this page for extensive discussion. It's best to display the overall association over a range of reasonable values. – EdM Apr 27 '22 at 11:56