Suppose I have a dataset with 3 features and I know the pairwise correlations among them.

I want to build a synthetic dataset that respects those correlations, simulating the real dataset. I would provide fake values for one of the features, and I would like to calculate the other two features from the pairwise correlations of the real dataset.

Is it possible to do that?

Zaratruta
  • You can't calculate the values simply from the correlations. It may be possible to generate values. There's a number of posts on site that explain how to simulate data to match either population or sample correlations. Presuming you want the latter, I think the sample correlation one could be adapted to specifying values for one series. – Glen_b Oct 19 '22 at 01:22
  • You may want to look into Metric Multi Dimensional Scaling. – John Madden Oct 19 '22 at 02:37
  • Here's a search to get you started: https://stats.stackexchange.com/search?q=sample+distribution+given+correlation – Sycorax Oct 19 '22 at 04:08
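
The first comment's idea (match the sample correlations while keeping one series fixed) could be sketched roughly as follows. This is only one possible approach, not something taken from the linked posts: the function name `simulate_given_first` is hypothetical, the correlation matrix `R` is assumed to be symmetric positive definite with a unit diagonal, and the generated columns are noise-based, so they are one of infinitely many solutions rather than "the" values of the other features.

```python
import numpy as np

def simulate_given_first(x1, R, seed=0):
    """Hypothetical helper: keep x1 as column 0 and generate the remaining
    columns so that the sample correlation matrix equals R.
    Assumes R is a valid (positive definite, unit diagonal) correlation matrix."""
    rng = np.random.default_rng(seed)
    x1 = np.asarray(x1, dtype=float)
    R = np.asarray(R, dtype=float)
    n, p = x1.size, R.shape[0]
    if n < p + 1:
        raise ValueError("need at least p + 1 observations")

    # Lower-triangular factor with L @ L.T == R (fails if R is not positive definite).
    L = np.linalg.cholesky(R)

    # Centre x1 and scale it to unit length.
    z1 = x1 - x1.mean()
    z1 /= np.linalg.norm(z1)

    # Random noise, centred and with its component along z1 removed.
    E = rng.standard_normal((n, p - 1))
    E -= E.mean(axis=0)
    E -= np.outer(z1, z1 @ E)

    # Orthonormal columns that are orthogonal to z1 and have mean zero.
    Q, _ = np.linalg.qr(E)
    basis = np.column_stack([z1, Q])

    # Mix the orthonormal columns; column 0 of Z is still z1 because L is lower triangular.
    Z = basis @ L.T

    # Put the original x1 back: correlations are unchanged by shifting and positive scaling.
    return np.column_stack([x1, Z[:, 1:]])

# Example with made-up values for the first feature and a made-up target matrix.
x1 = np.random.default_rng(1).normal(10.0, 2.0, size=50)
R = np.array([[1.0, 0.5, 0.3],
              [0.5, 1.0, 0.2],
              [0.3, 0.2, 1.0]])
X = simulate_given_first(x1, R)
```

The generated columns have mean zero and arbitrary scale; since correlation is unaffected by shifting or positive rescaling, each can be mapped to whatever mean and standard deviation the real feature has.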
