As an example, I have two classes (say A and B). I would like to determine whether age, gender, location, etc. are associated with class B. My initial thought is to perform a multivariable analysis using logistic regression. The problem is that a sample could belong to both classes. For example, a patient (sample) could be in class A between January and March and then in class B in December. Hence, this particular patient is in both classes in the dataset. Given this problem, can I still perform a multivariable analysis using logistic regression? Are there any limitations? Please feel free to suggest other methods I can use to answer the question.

EDIT: AN EXAMPLE OF HOW A SAMPLE CAN BE IN BOTH CLASSES

During a year (or data period), patient A (sample) could have been placed in multiple short-term placements. As an example, let's assume patient A had two placements in a year. If patient A stayed in the first placement for less than 20 days, that's okay. But if patient A stayed in the second placement for 20 days or more, then it's a problem. I want to determine whether any patient-related factors are responsible for patient A staying 20 days or more. So I created a class label for duration of stay: < 20 days and >= 20 days. Hence patient A would be in both classes: the first placement was less than 20 days and the second placement was 20 days or more.

Bebari
  • Why can it belong to both classes? What are the classes? What is the problem you are trying to solve with the model? – Tim Nov 15 '22 at 19:40
  • I second the comment by @Tim and would like to know more about the problem(s) you want to solve and the available data, though I wonder if this might be a situation where multi-label models are appropriate. – Dave Nov 15 '22 at 19:41
  • @Tim I have edited my post to answer your questions. Thank you. Let me know if you have any questions – Bebari Nov 15 '22 at 20:04
  • @Dave See edit. Let me know if you have any other questions. – Bebari Nov 15 '22 at 20:05
  • Why are you using logistic regression, which posits mutually exclusive classes, instead of a more flexible method, such as predicting the number of days in each placement? – Sycorax Mar 14 '23 at 00:06

1 Answer

I see two approaches.

  1. Treat the data as panel (longitudinal) data, where you record each patient's category in each month.

  2. If you only care if someone belongs to group B at some point in the year, you might be interested in treating this as a multi-label problem. Briefly, a multi-label problem models the probability of membership in each category, allowing for a high probability of being in both categories. This sounds like your situation, especially if you only have data on the groups to which someone belonged during a given year, rather than knowing they were in group A in the beginning of the year before moving to group B.
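To make the difference between the two approaches concrete, here is a minimal sketch of how the same records could be laid out for each one. The placement records, the field names, and the helper functions `panel_rows` and `multilabel_rows` are all made up for illustration; only the 20-day cutoff comes from the question. Note that it uses one row per placement rather than per month, which is an assumption about the available data.

```python
from collections import defaultdict

# Hypothetical placement records: (patient_id, placement_number, days_stayed).
placements = [
    ("A", 1, 12),
    ("A", 2, 35),
    ("B", 1, 8),
    ("C", 1, 41),
]

THRESHOLD_DAYS = 20  # cutoff from the question: < 20 days vs >= 20 days


def panel_rows(records, threshold=THRESHOLD_DAYS):
    """Approach 1: one row per placement, each with a single binary outcome."""
    return [
        {"patient": pid, "placement": num, "long_stay": int(days >= threshold)}
        for pid, num, days in records
    ]


def multilabel_rows(records, threshold=THRESHOLD_DAYS):
    """Approach 2: one row per patient, with one 0/1 indicator per class."""
    labels = defaultdict(lambda: {"short_stay": 0, "long_stay": 0})
    for pid, _num, days in records:
        key = "long_stay" if days >= threshold else "short_stay"
        labels[pid][key] = 1
    return dict(labels)


panel = panel_rows(placements)
multilabel = multilabel_rows(placements)
```

In the panel layout, patient A contributes two rows with different outcomes, so a model fit to it would need to account for repeated observations on the same patient. In the multi-label layout, patient A has both indicators set to 1, and each indicator column could be modelled with its own binary model.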

Dave