At work we collect a lot of surveys that contain a mix of question types: multiple choice, checkbox and numeric. We know that in some instances enumerators fudge a survey by entering answers at random. We also know that, for the majority of surveys, the responses to certain questions correlate strongly with the answers to other questions (e.g. the more advanced a respondent's schooling, the higher their income is likely to be).
Is it possible, with this knowledge, to design an approach that determines which surveys are most likely to have been filled out at random? We don't know ahead of time which questions a survey will contain (although we do know the types of questions).
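To make the idea concrete, here is a rough sketch of one possible starting point, not a validated method. It assumes every answer has already been encoded as a number, finds the question pairs that correlate strongly across the whole pool (so the questions need not be known in advance), and scores each survey by how far it departs from those relationships. The function names, the 0.5 threshold and the synthetic schooling/income/household data are all made up for illustration; multiple choice and checkbox answers would need a different association measure.

```python
import random
from itertools import combinations
from statistics import mean, pstdev


def zscores(column):
    """Standardize one question's numeric answers across all surveys."""
    mu, sigma = mean(column), pstdev(column)
    return [(x - mu) / sigma if sigma > 0 else 0.0 for x in column]


def inconsistency_scores(surveys, min_abs_r=0.5):
    """Score each survey by how badly it violates the strong pairwise
    correlations seen in the pool (higher = more suspicious)."""
    n_questions = len(surveys[0])
    cols = [zscores([s[q] for s in surveys]) for q in range(n_questions)]
    totals = [0.0] * len(surveys)
    n_pairs = 0
    for a, b in combinations(range(n_questions), 2):
        # Pearson r of two standardized columns is the mean of their products.
        r = mean(x * y for x, y in zip(cols[a], cols[b]))
        if abs(r) < min_abs_r:
            continue  # weakly related pairs carry little signal
        n_pairs += 1
        for i, (x, y) in enumerate(zip(cols[a], cols[b])):
            # Squared residual when predicting answer b from answer a.
            totals[i] += (y - r * x) ** 2
    return [t / n_pairs for t in totals] if n_pairs else totals


# Synthetic data (assumption): 200 plausible surveys plus 10 random ones.
random.seed(0)
honest = []
for _ in range(200):
    schooling = random.gauss(12, 3)                     # years of schooling
    income = 2000 * schooling + random.gauss(0, 3000)   # tied to schooling
    household = random.randint(1, 8)                    # unrelated question
    honest.append([schooling, income, household])
fake = [[random.uniform(0, 20), random.uniform(0, 50000), random.randint(1, 8)]
        for _ in range(10)]

scores = inconsistency_scores(honest + fake)
honest_mean = mean(scores[:200])
fake_mean = mean(scores[200:])
print(f"honest: {honest_mean:.2f}  fabricated: {fake_mean:.2f}")
```

Surveys with the highest scores would be candidates for manual review rather than proven fabrications, since an unusual but genuine respondent can also score high.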
Thanks.