I have a dataset of 170 observations and 18 variables, 16 of which are continuous. I have calculated the proportion of observations with metabolic syndrome (19%).
Now I have to determine "if the variables in the dataset distinguish between people with and without metabolic syndrome".
I would do this by fitting a logistic regression with metabolic syndrome as the dependent variable and all the other (relevant) variables as independent variables. Then I would plot a ROC curve to assess the discriminatory power of the variables and the logistic regression model.
But I am not sure this is the correct way to answer this question. Could anyone confirm or improve this approach, or suggest a better strategy?
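
To make the plan concrete, here is a minimal sketch in plain Python of what I have in mind: fit a logistic regression and summarise the ROC curve by its area (AUC). Everything in it is an assumption for illustration only: the data are simulated (170 rows, 32 cases, roughly 19%), there are just two made-up predictors (`x1` informative, `x2` noise), and the helper names `fit_logistic`, `predict_proba` and `auc` are my own. A real analysis would more likely use an established library.

```python
import math
import random

def predict_proba(x, w, b):
    """Predicted probability of the outcome for one observation."""
    z = b + sum(wj * xj for wj, xj in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit logistic regression by batch gradient descent on the log-loss."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(xi, w, b) - yi
            for j in range(p):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def auc(scores, labels):
    """Area under the ROC curve: the probability that a random case
    scores higher than a random non-case (ties count as one half)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Simulated stand-in for the real data (hypothetical values).
random.seed(0)
y = [1] * 32 + [0] * 138                    # 32 / 170 is about 19%
X = [[random.gauss(1.5 * yi, 1.0),          # x1: shifted upwards in cases
      random.gauss(0.0, 1.0)]               # x2: pure noise
     for yi in y]

w, b = fit_logistic(X, y)
scores = [predict_proba(xi, w, b) for xi in X]
apparent_auc = auc(scores, y)
print("coefficients:", w, "intercept:", b)
print("apparent AUC:", round(apparent_auc, 3))
```

The AUC here is computed on the same data used to fit the model, so it is an apparent (optimistic) value. With only about 32 cases and up to 16 candidate predictors, I assume some form of internal validation, such as cross-validation or bootstrapping, would be needed before trusting it.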
A list of assumptions is here: https://www.statisticssolutions.com/free-resources/directory-of-statistical-analyses/assumptions-of-logistic-regression/
– Peter Flom Dec 26 '23 at 01:00