A log-linear model, or any model that ignores the dependence among a subject's responses, can underestimate (or overestimate) standard errors because it does not account for subject-level association between responses. For example, if some subjects tend to produce response patterns like (A,A,A,A) and others like (C,C,C,C), treating the responses as independent is problematic.
An appropriate model would be the multinomial logit with subject-level random intercepts.
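To make that concrete, here is one common way to write such a model. The notation is my own assumption, not something from your colleague's setup: subject $i$, response occasion $j$, response categories $c = 1, \dots, C$ with $C$ as the reference category, covariates $x_{ij}$, and subject-level random intercepts $u_{ic}$.

$$
\log \frac{P(Y_{ij} = c \mid u_i)}{P(Y_{ij} = C \mid u_i)} = \alpha_c + x_{ij}^\top \beta_c + u_{ic}, \qquad c = 1, \dots, C-1, \qquad u_i = (u_{i1}, \dots, u_{i,C-1}) \sim N(0, \Sigma)
$$

The random intercepts $u_{ic}$ are shared across all of subject $i$'s responses, which is what induces the within-subject association: a subject with a large $u_{iA}$ will tend toward patterns like (A,A,A,A).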
Depending on your colleague's modeling goals, another approach might be latent class regression, which estimates class membership probabilities and class-conditional response probabilities for $k$ latent classes. If you expect strong clustering in subjects' responses, this might be a particularly nice approach because you get regression estimates for each of the $k$ fuzzy classes, which might have meaningful psychological labels. Identifiability is an issue here because of the large number of parameters. See poLCA in R and the PDF write-up here.
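As a rough sketch of what a latent class regression model looks like (again in notation I am assuming for illustration: $J$ responses per subject, $C$ categories, subject-level covariates $x_i$ with $p$ coefficients including the intercept, and class 1 as the reference class):

$$
P(Y_i = y_i \mid x_i) = \sum_{r=1}^{k} p_r(x_i) \prod_{j=1}^{J} \prod_{c=1}^{C} \pi_{jrc}^{\,\mathbb{1}[y_{ij} = c]}, \qquad p_r(x_i) = \frac{\exp(x_i^\top \beta_r)}{\sum_{q=1}^{k} \exp(x_i^\top \beta_q)}, \quad \beta_1 = 0
$$

Here $\pi_{jrc}$ is the probability that a member of class $r$ gives category $c$ on response $j$. Counting free parameters gives $(k-1)\,p$ regression coefficients plus $k\,J\,(C-1)$ response probabilities, which shows how quickly the parameter count grows with $k$.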
drm in R is another package which is supposed to be able to model clustered categorical responses, but I have not tried it.
Finally, for very specific applications/hypotheses, you could use resampling methods that resample entire vectors of responses, e.g., a permutation test on odds ratios across groups that permutes group labels across subjects (without replacement) while keeping each subject's response vector intact.
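A minimal sketch of that kind of permutation test, in plain Python, might look like the following. Everything specific here is an assumption for illustration: the toy data, two groups coded 0/1, the choice of statistic (the log odds ratio of giving response "A", pooled over each subject's responses), and the function names `log_odds_ratio` and `permutation_test`. The point is only that labels are shuffled across subjects, so each response vector stays whole and the within-subject dependence is preserved under the null.

```python
import math
import random


def log_odds_ratio(vectors, labels, target="A"):
    """Log odds ratio of giving `target` in group 1 vs group 0, pooling responses."""
    counts = {0: [0, 0], 1: [0, 0]}  # group -> [target count, non-target count]
    for vec, group in zip(vectors, labels):
        hits = sum(1 for r in vec if r == target)
        counts[group][0] += hits
        counts[group][1] += len(vec) - hits
    # Add 0.5 to every cell so empty cells do not cause division by zero.
    a, b = counts[1][0] + 0.5, counts[1][1] + 0.5
    c, d = counts[0][0] + 0.5, counts[0][1] + 0.5
    return math.log((a * d) / (b * c))


def permutation_test(vectors, labels, n_perm=2000, seed=0, target="A"):
    """Two-sided permutation p-value, shuffling group labels across subjects."""
    rng = random.Random(seed)
    observed = log_odds_ratio(vectors, labels, target)
    shuffled = list(labels)  # copy, so the caller's labels are not modified
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # permute labels; response vectors stay intact
        if abs(log_odds_ratio(vectors, shuffled, target)) >= abs(observed):
            extreme += 1
    # Count the observed labelling itself so the p-value is never exactly 0.
    p_value = (extreme + 1) / (n_perm + 1)
    return observed, p_value


# Hypothetical toy data: 6 subjects per group, 4 responses per subject.
vectors = [("A", "A", "A", "A")] * 6 + [("C", "C", "C", "C")] * 6
labels = [1] * 6 + [0] * 6

observed, p_value = permutation_test(vectors, labels)
print("log odds ratio:", observed, "permutation p-value:", p_value)
```

With real data you would swap in whatever statistic matches the hypothesis; the shuffling step is the part that respects the clustering.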