
In my statistics/data mining course, I'm often asked questions like "How can you interpret these results?", and most of the time there's little or nothing to interpret, so I wonder whether this kind of question expects a different answer than the one I'd usually give.

For example, sometimes I'll work on a simulated dataset that I have no information about: I don't know what it represents, and it may not represent anything at all, just draws from some distribution I don't know. I then apply a number of statistical models to it, compute the error rate for each one, and the final (and only) question is: "What can you observe? Interpret the results."

Am I just supposed to say, "well, this method gives slightly better results than the others, and this one lags a little behind"? I have no information about the data, so I can't say much more. Am I expected to know why some methods work better, or should I just draw general conclusions?

I'm sorry if this sounds a little too much like a homework question; I'm not looking for an answer to my test, I'm just using an example to get at a broader question.

1 Answer


Results are just numbers; interpretation is where we ascribe meaning to them. Your results can tell you several different things: something about the dataset you're analyzing, about the method you're applying, or about the particular model that was learned.

Were your results good? Does this dataset represent a trivial problem or a complex one? Were all your results good regardless of the method? Perhaps only the nonlinear methods performed well, suggesting something about the complexity of the problem.
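As a rough illustration (a minimal sketch, assuming scikit-learn and a toy simulated dataset rather than your actual course data), you could compare a linear and a nonlinear method and see whether the gap between them hints at the shape of the problem:

```python
# Hypothetical sketch: compare a linear and a nonlinear method on simulated data.
# If only the nonlinear model does well, the decision boundary is probably not linear.
from sklearn.datasets import make_moons
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier

# Stand-in for the unknown dataset
X, y = make_moons(n_samples=500, noise=0.25, random_state=0)

for name, model in [("logistic regression (linear)", LogisticRegression(max_iter=1000)),
                    ("random forest (nonlinear)", RandomForestClassifier(random_state=0))]:
    acc = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: mean CV accuracy = {acc:.3f}")
```

If both scores are high, the problem may be easy; if only the nonlinear model does well, that itself is an interpretation worth writing down.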

What did your final models look like? Are all of the variables used in your model, or can you find a sparse solution with reduced dimensionality? Perhaps there are hidden variables explaining the behavior of the data that weren't actually measured.
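For the sparsity question, a small sketch (again assuming scikit-learn and a made-up regression problem, not your data) is to fit an L1-penalised model and count how many coefficients survive:

```python
# Hypothetical sketch: fit a lasso and count nonzero coefficients.
# A sparse fit suggests a reduced set of variables explains most of the signal.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV

# Simulated data where only 5 of 50 features are actually informative.
X, y = make_regression(n_samples=300, n_features=50, n_informative=5,
                       noise=10.0, random_state=0)

lasso = LassoCV(cv=5, random_state=0).fit(X, y)
n_used = np.sum(lasso.coef_ != 0)
print(f"Lasso kept {n_used} of {X.shape[1]} features")
```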

What about meta results? Did your algorithm run efficiently regardless of the problem, or did it take too long as the dimensionality increased? These types of questions won't tell you much about the data, but can be another metric for comparing algorithms.
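A quick way to look at that (a sketch under assumed settings, using scikit-learn's synthetic data generator as a placeholder) is to time the same fit as the number of features grows:

```python
# Hypothetical sketch: time the same method as dimensionality increases,
# to see how its cost scales with the number of features.
import time
from sklearn.datasets import make_classification
from sklearn.svm import SVC

for n_features in (10, 100, 1000):
    X, y = make_classification(n_samples=1000, n_features=n_features, random_state=0)
    start = time.perf_counter()
    SVC(kernel="rbf").fit(X, y)
    print(f"{n_features:>5} features: fit took {time.perf_counter() - start:.2f} s")
```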

With time, you'll start to see patterns beyond "this method did better than that method". The key is to think about why you got the observed results, and what the relationship is between your data and the analysis method.