I have analyzed some secondary data that relates to Plasmodium infection at three forest sites: inside the forest, at the forest fringe, and outside the forest.
The outcome variable is infection with Plasmodium parasites. I have about 60 variables of interest (at the individual and household level) that I have analyzed for their association with individual infection. I also have data on the households in which each person lives.
A previous study analyzed the pooled data and used random effects for households and villages (there were 16 villages in total: 9 outside the forest, 3 at the fringe, and 4 inside the forest). It included "forest proximity" as a fixed effect.
My MSc thesis focuses on the "inside-the-forest" sites only (there is strong evidence to suggest that risk factors for the inside-the-forest sites should be investigated separately). I have used only a household random effect.
Can a variable (for example, sleeping outdoors) be statistically significantly associated with infection in the pooled data (all 16 villages, with random effects for household and village), yet not be significant in any of the separate strata (each fitted with only a household random effect)? My gut feeling is that it can (and I certainly hope so), but I am still something of a statistics novice.
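To make the question concrete, here is a minimal sketch of the intuition behind my gut feeling. It uses made-up counts (not my data) and a plain two-proportion z-test rather than the mixed models, so it ignores household and village clustering entirely. The function name, the stratum labels, and every number are hypothetical and chosen only for illustration.

```python
import math


def two_proportion_p_value(inf_exp, n_exp, inf_unexp, n_unexp):
    """Two-sided p-value from a pooled two-proportion z-test."""
    p_exp = inf_exp / n_exp
    p_unexp = inf_unexp / n_unexp
    # Pooled infection proportion under the null of no association.
    p_all = (inf_exp + inf_unexp) / (n_exp + n_unexp)
    se = math.sqrt(p_all * (1 - p_all) * (1 / n_exp + 1 / n_unexp))
    z = (p_exp - p_unexp) / se
    # Two-sided tail area of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical counts per stratum (NOT real data):
# (infected & exposed, total exposed, infected & unexposed, total unexposed)
strata = {
    "inside": (12, 40, 6, 40),
    "fringe": (12, 40, 6, 40),
    "outside": (12, 40, 6, 40),
}

# Test the association separately within each stratum.
stratum_p = {name: two_proportion_p_value(*counts) for name, counts in strata.items()}

# Pool the strata by summing each column of counts, then test once.
pooled_counts = tuple(sum(column) for column in zip(*strata.values()))
pooled_p = two_proportion_p_value(*pooled_counts)

for name, p in stratum_p.items():
    print(f"{name}: p = {p:.3f}")
print(f"pooled: p = {pooled_p:.3f}")
```

In this toy setup the infection proportions (30% versus 15%) are identical in every stratum, yet each stratum-level p-value is above 0.05 while the pooled p-value is below it, simply because the pooled sample is three times larger. Whether the same reasoning carries over to models with household and village random effects is exactly what I am unsure about.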
If it can, is this simply a matter of statistical logic, or would I be better off providing a reference to support it? If so, can anyone point me in the direction of an appropriate reference?