I have been thinking about what the properties of an ideal data set $X \in \mathbb{R}^{n \times d}$ would be, where $n$ is the sample size and $d$ is the number of features. I think (or at least this is what I understood from reading textbooks) that two things have to be satisfied:
- Each feature ($d_{i}$) must be independent of the others, so that $X^{T}X$ is positive definite and its condition number is close or equal to 1.
- There have to be enough samples ($n$) to keep the data from suffering from the curse of dimensionality.
Is this enough?
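
To make the first condition concrete, this is roughly how I would check it numerically. It is only a minimal sketch, assuming NumPy: the data is synthetic, and the sizes `n` and `d`, the uniform distribution, and the noise level `1e-3` are arbitrary choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 10_000, 5  # hypothetical sample size and feature count

# Independent features with zero mean and unit variance
# (uniform on [-sqrt(3), sqrt(3)]; an arbitrary illustrative choice).
X = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(n, d))

# Scaled Gram matrix X^T X / n; the scaling does not change the condition number.
gram = X.T @ X / n

# Positive definite <=> all eigenvalues of the symmetric matrix are > 0.
eigvals = np.linalg.eigvalsh(gram)
is_positive_definite = bool(eigvals.min() > 0)

# 2-norm condition number: largest singular value over the smallest.
cond = np.linalg.cond(gram)

# Counterexample: make feature 1 an almost exact copy of feature 0.
X_bad = X.copy()
X_bad[:, 1] = X[:, 0] + 1e-3 * rng.standard_normal(n)
cond_bad = np.linalg.cond(X_bad.T @ X_bad / n)

print("positive definite:", is_positive_definite)
print("condition number (independent features):", cond)
print("condition number (near-duplicate feature):", cond_bad)
```

With independent, equally scaled features the condition number should come out close to 1, while the near-duplicate column should push it up by several orders of magnitude.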
Once these requirements are satisfied, can we directly infer anything about the distribution of the data? Is there a rule of thumb? For instance, if those two (or more) conditions hold, then must the data be Gaussian, or follow some other particular distribution?
I am trying to bridge the gap between the statistical properties and the algebraic properties of the data, and I am a little confused about how to relate the two. Could someone point me to material that covers this relationship, or take the time to explain it?