How do I find the hierarchical interrelationships between features/variables?
I made a test input file with p = 30 features and n = 569 samples (n_samples) from a pre-made dataset:
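import pandas as pd
from sklearn.datasets import load_breast_cancer

# Load the bundled breast cancer dataset (569 samples, 30 features)
cancer = load_breast_cancer()
cancer.keys()  # inspect the available fields
# Build a DataFrame with one column per feature
df = pd.DataFrame(cancer['data'], columns=cancer['feature_names'])
# Write the features to the input file
df.to_csv(r'input file', index=False)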
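To make the kind of grouping I am after more concrete, here is a rough sketch of one possible starting point: agglomerative clustering of the features using 1 - |Pearson correlation| as the distance. The choice of distance, the average linkage, and the cut threshold of 0.5 are all my own assumptions for illustration, and this sketch does not address the 80% variance requirement.

```python
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
from sklearn.datasets import load_breast_cancer

cancer = load_breast_cancer()
df = pd.DataFrame(cancer["data"], columns=cancer["feature_names"])

# Absolute Pearson correlation between every pair of features (30 x 30).
corr = df.corr().abs().to_numpy()

# Assumption: treat strongly correlated features as "close".
dist = 1.0 - corr
dist = (dist + dist.T) / 2.0      # remove tiny floating-point asymmetry
dist = np.clip(dist, 0.0, None)   # guard against small negative values
np.fill_diagonal(dist, 0.0)

# Average-linkage hierarchical clustering on the condensed distance matrix.
Z = linkage(squareform(dist, checks=False), method="average")

# Cut the tree at an arbitrary threshold (0.5 is a placeholder, not tuned).
labels = fcluster(Z, t=0.5, criterion="distance")
groups = pd.Series(labels, index=df.columns).sort_values()
print(groups)
```

The linkage matrix `Z` holds the full hierarchy (it can be passed to `scipy.cluster.hierarchy.dendrogram` for plotting), while `groups` is just one flat cut of that tree.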
The result should look something like this, but it should explain at least 80% of the variance between the variables:
