Variables that apply only to a subset of the data

Question

I'm using a public dataset available at this link.

It's about marketing, and one of the variables (pdays, numeric) is the number of days that have passed since the client was last contacted in a previous campaign.

Rows in which the value is 999 mean that the client was not previously contacted. I'm afraid that feeding this into an ML algorithm will lead to wrong results.

I'm thinking of turning them into zeroes. But I don't know what to do with the zeroes when scaling the dataset before using an algorithm (should I take the zeroes into account?).

Is there a better solution?

score 0 · Answer 1 · answered Apr 23 '22 at 22:23

You could try creating a binary variable from pdays that is 0 where pdays is 999 and 1 otherwise. The ML algorithm should then perform better.
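A minimal sketch of that indicator in plain Python might look like the following. The sample values and the names `NOT_CONTACTED` and `previously_contacted` are made up for illustration and are not taken from the dataset.

```python
# Hypothetical sample of pdays values; 999 is the "not previously contacted" code.
NOT_CONTACTED = 999
pdays = [999, 3, 6, 999, 0, 12]

# Binary indicator: 0 where pdays is 999, 1 otherwise.
previously_contacted = [0 if d == NOT_CONTACTED else 1 for d in pdays]

print(previously_contacted)  # [0, 1, 1, 0, 1, 1]
```

If the data is in a pandas DataFrame (assumed here to be called `df`), the same column could probably be built with `(df["pdays"] != 999).astype(int)`.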

veng19
