I've seen residuals handled with a prior in the model. Well, in a way. In a paper by a colleague, they show that using Bayesian grouping hypotheses over B-splines accounts for perceptual grouping of skeletons really well. But there's a catch. The skeletons are fit with ribs of a certain length. This is a necessary parameter in B-splines. The width is sensitive to the scale of the object: how big or small it is. Thus, the width has a Gaussian prior placed on it. Though, I believe that they marginalize out such hyperparameters.
Another colleague posits a 'regression to the mean' effect in human memory. In memory tasks, the recalled value of an episodic event (a color value, the height of a person, etc.) has larger residuals when the true value is further from the prototypical value of its category (the prototype of blue, or the typical height of men). Thus, the entire theory revolves around the residuals!
I think the danger in predicting the residuals is failing to consider all of the alternative hypotheses. For example, perhaps the residuals are measurement error, observational bias, or some other process that would give rise to deviations from a central tendency. If you go this route, I would make sure you model the plausible alternative hypotheses and use some form of model comparison, perhaps AIC.
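To make the model comparison idea concrete, here is a rough sketch of how two hypotheses about the residuals could be compared with AIC. Everything in it is an assumption for illustration: the data are simulated, the prototype value, the shrinkage weight of 0.3, and the noise levels are made up, and `gaussian_aic` is a hypothetical helper, not something from either colleague's work. The first hypothesis says the residuals are pure measurement noise around the true value; the second says recalled values shrink linearly toward a prototype.

```python
import numpy as np


def gaussian_aic(residuals, n_params):
    """AIC for a Gaussian error model with the variance set to its MLE."""
    n = residuals.size
    sigma2 = np.mean(residuals ** 2)  # MLE of the noise variance
    log_lik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)
    return 2 * n_params - 2 * log_lik


# Simulated memory data (assumed values, for illustration only).
rng = np.random.default_rng(0)
prototype = 175.0  # hypothetical category prototype, e.g. a typical height
true_value = rng.normal(prototype, 10.0, size=200)
recalled = (
    true_value
    + 0.3 * (prototype - true_value)  # pull toward the prototype
    + rng.normal(0.0, 3.0, size=200)  # measurement noise
)

# Hypothesis 1: recalled = true value + noise (one parameter: the variance).
res_noise = recalled - true_value
aic_noise = gaussian_aic(res_noise, n_params=1)

# Hypothesis 2: recalled = intercept + slope * true value + noise
# (three parameters: intercept, slope, variance). A slope below 1
# corresponds to shrinkage toward the prototype.
slope, intercept = np.polyfit(true_value, recalled, deg=1)
res_shrink = recalled - (intercept + slope * true_value)
aic_shrink = gaussian_aic(res_shrink, n_params=3)

print(f"AIC, pure noise: {aic_noise:.1f}")
print(f"AIC, shrinkage:  {aic_shrink:.1f}")
print(f"estimated slope: {slope:.2f}")
```

The lower AIC is preferred. With real data you would add one candidate per alternative explanation (observational bias, scale-dependent noise, and so on) and compare them all the same way.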