Can someone share their thoughts on a structured process for understanding a collection of data? The scenario is: you've been given a data set (features and observations, with descriptions) and been told to "tell me what kind of interesting things this data can tell me", i.e., what interesting questions can this data answer? The meaning of "interesting" is certainly subjective.
This appears to be a classical unsupervised learning problem.
My initial thoughts:
- Cluster each pair of variables and look for interesting clusters
- Run PCA to find high-variance combinations of variables
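To make those two ideas concrete, here is a rough sketch of what I have in mind, using only NumPy. Everything in it is my own assumption rather than an established recipe: the helper names (`standardize`, `pca`, `kmeans`, `score_feature_pairs`) are made up for illustration, `k=2` is arbitrary, and "share of variance explained by the clusters" is just one possible stand-in for "interesting".

```python
import itertools
import numpy as np


def standardize(X):
    """Center each column and scale it to unit variance."""
    X = np.asarray(X, dtype=float)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # leave constant columns at zero instead of dividing by 0
    return (X - X.mean(axis=0)) / std


def pca(X):
    """PCA via SVD. Returns (explained_variance_ratio, components)."""
    Xs = standardize(X)
    _, s, vt = np.linalg.svd(Xs, full_matrices=False)
    var = s ** 2  # proportional to the variance along each component
    return var / var.sum(), vt


def kmeans(X, k=2, n_iter=50, seed=0):
    """Plain Lloyd's k-means. Returns (labels, within-cluster sum of squares)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):  # keep the old center if a cluster empties out
                centers[j] = members.mean(axis=0)
    # Final assignment against the last set of centers.
    dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = dist.argmin(axis=1)
    return labels, dist.min(axis=1).sum()


def score_feature_pairs(X, names, k=2):
    """Rank feature pairs by the share of variance that k clusters explain."""
    Xs = standardize(X)
    scores = {}
    for i, j in itertools.combinations(range(Xs.shape[1]), 2):
        pair = Xs[:, [i, j]]
        _, within = kmeans(pair, k=k)
        total = (pair ** 2).sum()  # columns are centered, so this is the total SS
        scores[(names[i], names[j])] = 1.0 - within / total if total > 0 else 0.0
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

The idea would be to call `pca(X)` to see how quickly the explained-variance ratios fall off, and `score_feature_pairs(X, names)` to get a ranked list of pairs worth plotting. The pair score is only useful as a relative ranking, since even structureless data gets a nonzero score for any `k`.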
Is there a general "how to understand a set of data" process that you've found successful?
Thanks