My question is a bit long, with 2 major parts. Here are the variables:
- Number of cells (C): main dependent variable
- Disease severity 1 (D1): continuous
- Disease severity 2 (D2): continuous but only quantifiable on diseased organ
- Age, Sex
- Organ side: L or R
- Lateralization (L or R = 1, L and R = 2)
- Location of disease in organ
- Concurrent Disease 1, Concurrent Disease 2
- N=208
We are trying to reproduce a previously published paper that found a significant association between C and D1. The disease can be present in L, R, or both. Age is a confounding factor because C normally decreases with age. If both organs are affected, L and R are each entered in the database as their own line; if only L or R is affected, there is a single line. Each line contains the data for both organs and includes the Lateralization variable. We followed the previously published statistics and found directly conflicting evidence, and we want to show that D1 and D2 are not related to C.
ANALYSIS A.
The analysis we reproduced is as follows:
In the entire cohort (both Lateralization 1 and 2): with age as a covariate, partial correlations between C & D1 and between C & D2.
In cohort of only Lateralization 1:
- With age as a covariate, partial correlation between: C & D1, C & D2, and difference in C between diseased and nondiseased organ vs difference in D1 between diseased and nondiseased organ
- Paired t-test to compare C in diseased organ vs. non-diseased organ
- In patients with D1 < 2 (arbitrary cutoff by previous authors): Pearson correlations among C, D1, D2, and age.
In Lateralization 1 patients only, subdivided into groups with D1 < 2 and D1 ≥ 2: Mann-Whitney U tests for age, D1, D2, and C.
All of the steps above were repeated separately for patients without disease 1 and for patients without disease 2 (not looking for an interaction between these diseases).
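To make the age-adjusted partial correlation concrete, here is a minimal sketch on synthetic data (not my real data). It assumes NumPy, and the function name `partial_corr` and all simulated effect sizes are made up for illustration. It regresses both variables on the covariate(s) and correlates the residuals, which is one standard way to compute a partial correlation and is not limited to a single covariate.

```python
import numpy as np

def partial_corr(x, y, covariates):
    """Correlation between x and y after regressing both on the covariates."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Design matrix: intercept plus one column per covariate
    Z = np.column_stack([np.ones(len(x)), covariates])
    # Residuals of x and y once the linear effect of the covariates is removed
    rx = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
    ry = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    return float(np.corrcoef(rx, ry)[0, 1])

# Synthetic data only: all effect sizes below are invented for illustration
rng = np.random.default_rng(0)
n = 208
age = rng.uniform(20, 80, n)
C = 3000 - 10 * age + rng.normal(0, 100, n)   # C declines with age
D1 = 0.05 * age + rng.normal(0, 1, n)         # D1 rises with age, no direct link to C

raw_r = float(np.corrcoef(C, D1)[0, 1])
adj_r = partial_corr(C, D1, age)
print(f"raw r = {raw_r:.3f}, partial r (age removed) = {adj_r:.3f}")
```

In this simulation C and D1 are linked only through age, so the raw correlation should be clearly negative while the age-adjusted one should sit near zero. Passing a two-dimensional array of several covariates to `partial_corr` works the same way.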
ANALYSIS B.
However, I thought I could also do 2 hierarchical multiple regressions, both with C as the dependent variable. The blocks would unfold as follows:
- Age, Sex,
- Lateralization, location of disease in organ,
- Concurrent Diseases 1 and 2,
- D1
and
- Age, Sex,
- Lateralization, location of disease in organ,
- Concurrent Diseases 1 and 2,
- D2
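The block structure above can be sketched roughly as follows, again on synthetic data. This assumes NumPy; the helper names (`r_squared`, `hierarchical_steps`), the 0/1 coding of the categorical variables, and the simulated effects are all hypothetical. Each block is added cumulatively, and the change in R² and the corresponding F-change are recorded per step.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an OLS fit of y on X (an intercept is added here)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta = np.linalg.lstsq(X1, y, rcond=None)[0]
    resid = y - X1 @ beta
    return 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)

def hierarchical_steps(blocks, y):
    """Enter blocks cumulatively; return (name, R2, delta R2, F-change) per step."""
    n = len(y)
    rows, cols, prev_r2 = [], [], 0.0   # prev_r2 = 0 is the intercept-only model
    for name, block in blocks:
        block = np.asarray(block, dtype=float).reshape(n, -1)
        cols.append(block)
        X = np.hstack(cols)
        r2 = r_squared(X, y)
        df_added = block.shape[1]        # predictors added in this step
        df_resid = n - X.shape[1] - 1    # residual df of the current model
        f_change = ((r2 - prev_r2) / df_added) / ((1.0 - r2) / df_resid)
        rows.append((name, r2, r2 - prev_r2, f_change))
        prev_r2 = r2
    return rows

# Synthetic data only: codings and effect sizes are invented for illustration
rng = np.random.default_rng(1)
n = 208
age = rng.uniform(20, 80, n)
sex = rng.integers(0, 2, n)
lateralization = rng.integers(1, 3, n)   # 1 or 2
location = rng.integers(0, 2, n)         # hypothetical binary coding
conc1 = rng.integers(0, 2, n)
conc2 = rng.integers(0, 2, n)
D1 = rng.normal(2, 1, n)                 # unrelated to C by construction
C = 3000 - 10 * age + rng.normal(0, 100, n)

blocks = [
    ("age, sex", np.column_stack([age, sex])),
    ("lateralization, location", np.column_stack([lateralization, location])),
    ("concurrent diseases", np.column_stack([conc1, conc2])),
    ("D1", D1),
]
steps = hierarchical_steps(blocks, C)
for name, r2, d_r2, f_change in steps:
    print(f"{name:28s} R2={r2:.3f}  dR2={d_r2:.4f}  F-change={f_change:.2f}")
```

The second regression would swap D2 in for D1 in the last block. A p-value for each F-change could be read off the F distribution with `df_added` and `df_resid` degrees of freedom (for example with `scipy.stats.f.sf`, if SciPy is available).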
ANALYSIS C.
I ran a partial correlation, entering all variables from blocks 1-3 of the regressions as covariates, with D1 and D2 as my IVs and C as the dependent variable. I read somewhere that a partial correlation is only good for 3 covariates; is that true?
Which is a better analysis to report?
If the multiple regressions are better to report, I have an issue with my results. My regressors are non-significant, which is what we want to confirm. However, the ANOVA for each model has p < 0.01. I checked the VIFs, and all of my variables have VIF < 1.5.
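For reference, this is a sketch of how the three quantities in question (per-coefficient t-statistics, the overall model F, and VIFs) relate, on synthetic data where only age affects C. It assumes NumPy; `ols_summary` and `vif` are hypothetical helper names, and the simulated numbers are not my real results. It is only meant to show one situation in which the overall ANOVA can be driven by a covariate such as age while the coefficient for D1 stays small, with low VIFs throughout.

```python
import numpy as np

def ols_summary(X, y):
    """Return coefficients, t-statistics and the overall F-statistic of an OLS fit."""
    n, k = X.shape
    X1 = np.column_stack([np.ones(n), X])          # add intercept
    beta = np.linalg.lstsq(X1, y, rcond=None)[0]
    resid = y - X1 @ beta
    df_resid = n - k - 1
    sigma2 = resid @ resid / df_resid              # residual variance estimate
    cov = sigma2 * np.linalg.inv(X1.T @ X1)        # covariance of the coefficients
    t = beta / np.sqrt(np.diag(cov))
    ss_model = np.sum((y - y.mean()) ** 2) - resid @ resid
    f_stat = (ss_model / k) / sigma2               # tests all k slopes jointly
    return beta, t, f_stat

def vif(X):
    """Variance inflation factor of each column: 1 / (1 - R^2_j)."""
    n, k = X.shape
    out = []
    for j in range(k):
        # Regress column j on all the other columns plus an intercept
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        target = X[:, j]
        fitted = others @ np.linalg.lstsq(others, target, rcond=None)[0]
        r2 = 1.0 - np.sum((target - fitted) ** 2) / np.sum((target - target.mean()) ** 2)
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

# Synthetic data only: age drives C, D1 is generated independently of both
rng = np.random.default_rng(2)
n = 208
age = rng.uniform(20, 80, n)
D1 = rng.normal(2, 1, n)
C = 3000 - 10 * age + rng.normal(0, 100, n)

X = np.column_stack([age, D1])
beta, t, f_stat = ols_summary(X, C)
vifs = vif(X)
print(f"overall F = {f_stat:.1f}")
print(f"t(age) = {t[1]:.2f}, t(D1) = {t[2]:.2f}")
print("VIFs:", np.round(vifs, 3))
```

Here the overall F tests all slopes jointly, so a strong age effect alone is enough to make it large, whereas the t-statistic for D1 tests only that one coefficient.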
EDIT:
here is my output
EDIT:
changed order of predictors


"One conclusion we can draw from this is that when too many variables are included in a model they can mask the truly significant ones."
"Even if you had no multicollinearity, you can still get non-significant predictors and an overall significant model if two or more individual predictors are close to significant"
- I have done the hierarchical regression analysis with only 2 blocks, with 1 predictor in each (age, which is a huge confounder, and D), and I still get the same results for my hierarchical regression.
– Jenny H Nov 11 '16 at 16:50