I am trying to use binomial logistic regression to identify X-ray features that are associated with a disease state. The idea was to try to get odds ratios for important radiographic features*. It's pilot work right now, with only X-ray-based variables being considered.
Since some features might interact and multiply their effects, I used stepwise Logistic regression to select the 'best' possible model, as defined by BIC.
My sample is not huge: 45 controls and 28 cases.
When looking at significant features in isolation (single-IV logistic models), the odds ratios look reasonable:
Feature   OR        Lower CI   Upper CI
AC_6      4.6186    2.1386     9.9745
SC_6      3.0416    1.5989     5.7862
SC_2      2.7387    1.4714     5.0976
TC_10     1.7693    1.0285     3.0437
However, in the 'best' model (BIC of 50; the 'worst' model has a BIC of 100), the odds ratios and confidence intervals jump up quite a lot.
Feature   OR          Lower CI   Upper CI
const     0.5544      0.1849     1.6624
AC_6      26.4084*    3.7665     185.1573*
SC_2      9.0166      1.8901     43.0129
TC_10     16.9434*    2.0030     143.3262*
SC_6      10.1433     1.8199     56.5358
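One way to compare the two tables on an equal footing is to back out the standard error of each log odds ratio from its reported interval. The sketch below assumes the intervals are 95% Wald intervals, i.e. exp(beta ± 1.96·SE); that is an assumption, not something stated above, although the reported limits are symmetric around log(OR), which is consistent with it. The helper name `implied_se` is just for illustration.

```python
import math

Z95 = 1.959964  # normal quantile for a two-sided 95% interval (assumed level)

def implied_se(lower, upper, z=Z95):
    """Standard error of log(OR) implied by a Wald interval (lower, upper)."""
    return (math.log(upper) - math.log(lower)) / (2 * z)

# (OR, lower, upper) copied from the two tables above
single = {"AC_6": (4.6186, 2.1386, 9.9745), "SC_6": (3.0416, 1.5989, 5.7862),
          "SC_2": (2.7387, 1.4714, 5.0976), "TC_10": (1.7693, 1.0285, 3.0437)}
joint = {"AC_6": (26.4084, 3.7665, 185.1573), "SC_6": (10.1433, 1.8199, 56.5358),
         "SC_2": (9.0166, 1.8901, 43.0129), "TC_10": (16.9434, 2.0030, 143.3262)}

se_table = {}
for name in single:
    se_single = implied_se(*single[name][1:])  # one-feature model
    se_joint = implied_se(*joint[name][1:])    # four-feature model
    se_table[name] = (se_single, se_joint)
    print(f"{name:6s} SE single={se_single:.2f}  SE joint={se_joint:.2f}  "
          f"ratio={se_joint / se_single:.1f}")
```

Under that assumption, the implied standard errors go from roughly 0.3 to 0.4 in the single-feature models to roughly 0.8 to 1.1 in the four-feature model. So the wider intervals are not only a side effect of the larger point estimates: the log-scale uncertainty itself has more than doubled for every feature.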
The idea of using the multivariate model is quite attractive; the features that were selected fit in nicely with current literature concerning the disease. Together, they paint a plausible and interesting picture of how this disease might develop.
However, I feel I need some help interpreting these results.
- Firstly, I am trying to diagnose why this happened.
- Secondly, I am trying to work out what conclusions can safely be drawn from these results.
My current ideas on why this happened are:
- The sample size is too small for a 4-variable model. I am asking too much of too little data, and precision is suffering.
- Very large odds ratios, and therefore larger jumps in the uncertainty.
- Sparsity of feature values: some of my features are concentrated around 0. No samples actually have a value of exactly 0, but many sit at ~0.1. Most features are on a scale ranging from -2 to +2, where 0 means the feature is absent and -2 and +2 represent the extremes of a range of morphologies (imagine knee flexion: at -2 the knee is fully flexed, at 0 it's neutral, at +2 it's hyper-extended). Here is an example of the distribution between cases and controls in one X-ray feature:

- Software issues (as discussed in other posts).
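To put a number on the first idea, here is a minimal sketch of the events-per-variable (EPV) arithmetic for this sample. The function name is made up for illustration, and the "about 10 EPV" figure in the comments is only a commonly quoted rule of thumb, not a hard threshold. The count also ignores the extra candidate features that the stepwise search considered, which would make the effective ratio lower still.

```python
def events_per_variable(n_cases, n_controls, n_predictors):
    """EPV: size of the smaller outcome group divided by the number of predictors."""
    limiting_group = min(n_cases, n_controls)
    return limiting_group / n_predictors

N_CASES, N_CONTROLS = 28, 45  # sample sizes given in the question

# EPV for models with 1 to 4 predictors (4 is the size of the 'best' model)
epv_by_size = {k: events_per_variable(N_CASES, N_CONTROLS, k) for k in range(1, 5)}

for k, epv in epv_by_size.items():
    # A commonly quoted rule of thumb asks for roughly 10 or more EPV
    flag = "below ~10" if epv < 10 else "at or above ~10"
    print(f"{k} predictor(s): EPV = {epv:.1f} ({flag})")
```

With 28 cases, the four-feature model sits at 7 events per variable, while a two-feature model would sit at 14.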
Can anyone advise on how I might narrow down the main cause?
As to what conclusions can be made:
- None from the multivariate analysis: it's just too little data. Focus on the single-IV models.
- The combination of X-ray features could represent a real-world phenomenon, but you can never report these, as you will be laughed out of any conference...
- Try to remake the models smaller, using BIC plus more acceptable confidence intervals as the criteria.
Any advice would be appreciated!
*I hope this doesn't come across as lazy repetition of a frequently asked question! I have read through previous posts on wide confidence intervals; however, I am struggling to pin down which explanations might work best for this problem. I am keen to learn how one might narrow down the possibilities and make the correct inferences.