I'm using a public dataset available at this link.

It's about marketing, and one of the variables (pdays, numeric) is the number of days that have passed since the client was last contacted in a previous campaign.

Rows where the value is 999 mean that the client was not previously contacted. I'm afraid that feeding this into an ML algorithm will lead to wrong results.

I'm thinking of turning them into zero. But I don't know what to do with the zeroes when scaling the dataset before using an algorithm (should I include the zeroes?).

Is there a better solution?

Guilherme

1 Answer

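You could try creating a binary variable from pdays that is 0 where pdays is 999 and 1 otherwise. That way the model sees "never contacted" as its own category instead of as a very large number of days, so the ML algorithm should perform better.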
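Here is a minimal sketch of that encoding in plain Python. The sample values and the name previously_contacted are made up for illustration; only the 999 sentinel comes from the question.

```python
# Sentinel the dataset uses in pdays for "client was not previously contacted".
NOT_CONTACTED = 999

# Made-up sample of raw pdays values (assumption, not taken from the dataset).
pdays = [999, 3, 999, 6, 12]

# Binary flag: 0 where pdays is the sentinel, 1 otherwise.
previously_contacted = [0 if value == NOT_CONTACTED else 1 for value in pdays]

print(previously_contacted)
```

The flag is already 0/1, so it usually does not need the same scaling as the other numeric columns.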

veng19