
How do I define the distribution of a random variable $Y$ such that a draw from $Y$ has correlation $\rho$ with $x_1$, where $x_1$ is a single draw from a distribution with cumulative distribution function $F_{X}(x)$?

Macro
OctaviaQ

1 Answer


You can define it in terms of a data generating mechanism. For example, if $X \sim F_{X}$ and

$$ Y = \rho X + \sqrt{1 - \rho^{2}} Z $$

where $Z \sim F_{X}$ and is independent of $X$, then,

$$ {\rm cov}(X,Y) = {\rm cov}(X, \rho X) = \rho \cdot {\rm var}(X)$$

Also note that ${\rm var}(Y) = \rho^{2}\,{\rm var}(X) + (1 - \rho^{2})\,{\rm var}(Z) = {\rm var}(X)$, since $Z$ is independent of $X$ and has the same distribution as $X$. Therefore,

$$ {\rm cor}(X,Y) = \frac{ {\rm cov}(X,Y) }{ \sqrt{ {\rm var}(X)\,{\rm var}(Y) } } = \frac{ \rho \cdot {\rm var}(X) }{ \sqrt{ {\rm var}(X)^{2} } } = \rho $$
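As a quick sanity check, here is a minimal simulation sketch of the mechanism above. The choice of an Exponential(1) distribution for $F_{X}$, the value $\rho = 0.6$, the sample size, and all variable names are assumptions made only for illustration; any $F_{X}$ with finite variance should behave the same way.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
rho = 0.6                       # target correlation (illustrative value)
n = 200_000                     # number of simulated (X, Y) pairs

# Assumption: F_X is Exponential(1), chosen only as a non-normal example.
x = rng.exponential(scale=1.0, size=n)  # X ~ F_X
z = rng.exponential(scale=1.0, size=n)  # Z ~ F_X, drawn independently of X

# Y = rho * X + sqrt(1 - rho^2) * Z
y = rho * x + np.sqrt(1.0 - rho**2) * z

sample_corr = np.corrcoef(x, y)[0, 1]
print(f"target rho = {rho}, sample correlation = {sample_corr:.3f}")
print(f"var(X) = {x.var():.3f}, var(Y) = {y.var():.3f}")
```

The sample correlation and the two sample variances should agree with the derivation up to simulation noise.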

So if you can generate data from $F_{X}$, you can generate a variate, $Y$, that has a specified correlation $(\rho)$ with $X$. Note, however, that the marginal distribution of $Y$ will generally not be $F_{X}$. It is $F_{X}$ in the special case where $F_{X}$ is a normal distribution with mean zero (for a nonzero mean, apply the construction to the centered variables and add the mean back), or another family that is closed under this kind of linear combination. This is because linear combinations of independent normal variables are normal, which is not a general property of distributions. In the general case, assuming $F_{X}$ has a density, you will have to calculate the distribution of $Y$ by convolving the densities of $\rho X$ and $\sqrt{1 - \rho^{2}} Z$, that is, appropriately scaled versions of the density corresponding to $F_{X}$.
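To make the convolution step concrete, here is a rough numerical sketch. It assumes $F_{X}$ is Exponential(1) and $\rho = 0.6$ (both chosen only for illustration), approximates the convolution with a Riemann sum on a grid, and compares the result with the known closed form for the sum of two independent exponentials with different scales. The helper `scaled_exp_pdf` and the grid settings are hypothetical names and values, not part of the original answer.

```python
import numpy as np

rho = 0.6
a = rho                    # scale applied to X
b = np.sqrt(1.0 - rho**2)  # scale applied to Z

def scaled_exp_pdf(t, scale):
    """Density of scale * X when X ~ Exponential(1), for scale > 0."""
    return np.where(t >= 0, np.exp(-t / scale) / scale, 0.0)

dx = 0.002
grid = np.arange(0.0, 20.0, dx)  # the mass beyond 20 is negligible here

f_a = scaled_exp_pdf(grid, a)    # density of rho * X
f_b = scaled_exp_pdf(grid, b)    # density of sqrt(1 - rho^2) * Z

# Riemann-sum approximation of the convolution integral on the grid
f_y = np.convolve(f_a, f_b)[: grid.size] * dx

# Closed form for the sum of two independent exponentials with scales a != b
f_y_exact = (np.exp(-grid / b) - np.exp(-grid / a)) / (b - a)

print("approximate total mass:", f_y.sum() * dx)
print("max abs error vs closed form:", np.max(np.abs(f_y - f_y_exact)))
```

For a distribution without a convenient closed form, the same grid-based convolution can be applied to whatever density corresponds to $F_{X}$, with the grid range and step adjusted to its support.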

Macro
  • +1 Very nice answer. Nitpick: in the last line you need to convolve scaled versions of $F_X$. – whuber Jul 22 '11 at 16:53
  • Thanks so much, Macro. Just to clarify something -- you mean in your last paragraph that you would need to convolve the rhoX with the sqrt(1 - rho^2)X? (sorry, I couldn't get any formatting, even HTML to work in this particular comment) – OctaviaQ Jul 22 '11 at 23:27
  • Convolve the density of $\rho X$ with the density of $\sqrt{1 - \rho^{2}} X$. This is a result of the general fact that the density of the sum of two independent continuous random variables is the convolution of the two densities. – Macro Jul 23 '11 at 01:51
  • A long time but...ideas of how to do this, also enforcing the marginal distribution of Y? – Julián Urbano Mar 22 '15 at 21:29