Hi there, my dataset looks as follows:
| Patient ID | Medicine | Death |
|---|---|---|
| 1 | A,B,C,D,E | 1 |
| 2 | B,D | 0 |
| 3 | A,D,E | 1 |
My dependent variable is Death and my independent feature is Medicine. I am trying to predict death from the medication a patient received, using machine learning.
There are five distinct medicines: A, B, C, D, and E. Each patient can be given any combination of these medicines. My question is: how do I process the Medicine feature?
I was thinking of creating a dummy binary variable for each medicine to indicate whether it was administered. But that seems quite cumbersome, especially if I have many more than 5 medicines, say 100. I would appreciate your input on how this feature can be processed. I am sure there is a standard way to handle this kind of situation that I am not aware of. Thanks.
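
For reference, here is a minimal sketch of the dummy-variable idea I have in mind, in plain Python. The `rows` list just copies the three example patients from the table above, and the function name `multi_hot` is a placeholder I made up for illustration, not something from a library.

```python
# Toy rows copied from the example table: (patient_id, medicines, death)
rows = [
    (1, "A,B,C,D,E", 1),
    (2, "B,D", 0),
    (3, "A,D,E", 1),
]


def multi_hot(rows):
    """Turn comma-separated medicine strings into 0/1 indicator columns.

    Hypothetical helper for illustration only.
    """
    # Split each medicine string into a set of medicine names
    med_sets = [
        {m.strip() for m in meds.split(",") if m.strip()}
        for _, meds, _ in rows
    ]
    # Column order: every medicine seen in the data, sorted alphabetically
    vocab = sorted(set().union(*med_sets))
    # One row per patient, one 0/1 column per medicine
    X = [[1 if med in s else 0 for med in vocab] for s in med_sets]
    # Target column (Death)
    y = [death for _, _, death in rows]
    return vocab, X, y


vocab, X, y = multi_hot(rows)
print(vocab)
for features, label in zip(X, y):
    print(features, label)
```

Here the columns are derived from the data rather than listed by hand. Whether this is still a sensible representation with around 100 medicines is the part I am unsure about.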