I am trying to model count data. The dependent variable (y) is the ticket count for each day, and the predictor is the corresponding day of the week, one-hot encoded (OHE), over 78 days. I am assuming a Poisson distribution.

I was wondering whether the independent variables are formatted correctly to capture a relationship between day of the week and ticket count. My goal is to forecast ticket counts for any particular day of the week (using negative binomial regression).

The first image shows the dependent variable y (ticket count), and the second shows the predictor variables in the regression (days of the week). I just want to know whether this is the correct thought process. Is it appropriate to use individual categorical features as predictor variables, even if the days always appear in the same order in the observations? If not, what is a more appropriate way to incorporate days of the week?

[Image 1: ticket count] [Image 2: day-of-week predictors]

mkt
ty101
    If you feel you have to post two almost identical questions within one hour, it is good form to at least link back to the first question you posted, to give people some context. It would be even more appreciated if you asked people who answered your original question for clarification if their answer is not helpful, instead of posting a new question. People are helping you and spending time to do so for free here. – Stephan Kolassa Jul 28 '22 at 20:37

1 Answer

Yes, days of the week can be modelled either as continuous 'seasonal' features in time series models (sometimes with an additional binary variable indicating holidays) or as a categorical variable. Which approach is better depends on the specific problem.
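
To make the two options concrete, here is a minimal sketch of how each encoding could be built with pandas and NumPy. It is only an illustration: the start date, the holiday date, and the column names are assumptions, and a single sine/cosine pair with period 7 is just one possible reading of "continuous seasonal features".

```python
import numpy as np
import pandas as pd

# Hypothetical 78-day window; the start date is an assumption for illustration.
dates = pd.date_range("2022-05-02", periods=78, freq="D")
dow = dates.dayofweek.to_numpy()  # 0 = Monday, ..., 6 = Sunday

# Option 1: categorical encoding, one 0/1 column per weekday.
# One level is dropped so the columns are not collinear with an intercept.
X_cat = pd.get_dummies(
    pd.Series(dates.day_name()), prefix="dow", drop_first=True, dtype=int
)

# Option 2: continuous seasonal encoding, one sine/cosine pair with period 7.
# This uses 2 columns instead of 6 but can only represent a smooth weekly wave.
angle = 2 * np.pi * dow / 7
X_seasonal = pd.DataFrame({"dow_sin": np.sin(angle), "dow_cos": np.cos(angle)})

# Optional holiday indicator; the holiday date below is made up for the example.
holidays = pd.to_datetime(["2022-05-30"])
is_holiday = dates.isin(holidays).astype(int)

print(X_cat.shape, X_seasonal.shape, int(is_holiday.sum()))
```

The row order plays no role in either encoding: a regression on these columns treats each day as a separate observation, so it does not matter that the weekdays always appear in the same sequence.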

mkt
  • Thanks for responding. However, my predictions don't align well with the observed counts. More specifically, the predictions follow a pattern: every Monday's predicted value is x, every Tuesday's is y, etc. Any advice on what to do here? – ty101 Jul 28 '22 at 20:01
  • I am also using pandas and scikit-learn in Python. – ty101 Jul 28 '22 at 20:30
  • @ty101 I don't understand your (new) question here. Are you asking why your model isn't performing well? I'm afraid you've not provided enough information to answer that. I suggest you post a new question about this with more detail. – mkt Jul 28 '22 at 20:36
  • @mkt: please note my answer to an almost identical question here. – Stephan Kolassa Jul 28 '22 at 20:38
  • @StephanKolassa Thanks for pointing that out. Given that you've already addressed the day of week issue in your answer (aside from much else, +1), perhaps this is best closed as a duplicate? – mkt Jul 28 '22 at 20:44
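
On the pattern ty101 describes in the comments (every Monday gets one predicted value, every Tuesday another): that is the expected behaviour of a model whose only predictors are day-of-week dummies. The sketch below uses synthetic counts, not ty101's data, and assumes scikit-learn's `PoissonRegressor` with the penalty switched off. The weekday means are invented for the example.

```python
import numpy as np
from sklearn.linear_model import PoissonRegressor

rng = np.random.default_rng(0)

# Synthetic data: 78 days, weekday index 0..6, made-up mean count per weekday.
dow = np.arange(78) % 7
true_means = np.array([40, 35, 33, 30, 28, 12, 10])
y = rng.poisson(true_means[dow])

# One-hot encode the weekday, dropping the first level as the baseline.
X = np.eye(7)[dow][:, 1:]

# alpha=0 turns off scikit-learn's default L2 penalty.
model = PoissonRegressor(alpha=0.0, max_iter=1000, tol=1e-8).fit(X, y)
pred = model.predict(X)

# With weekday dummies as the only predictors, each weekday gets one fitted
# value, and it matches that weekday's average observed count.
for d in range(7):
    print(d, round(pred[dow == d][0], 2), round(y[dow == d].mean(), 2))
```

The model has no information that distinguishes one Monday from another, so it cannot predict different values for them. Getting predictions that vary within a weekday would require additional predictors, such as a trend or holiday indicator.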