
If $Y \sim N(\mu,\sigma^2)$ is normally distributed, then $X=\exp(Y)$ is lognormally distributed. In the univariate case, the parameters $\mu_1$ and $\sigma_1^2$ of the lognormal variable $X_1$ are given in terms of its moments by

$$\mu_1 = \ln\left(\frac{E[X_1]^2}{\sqrt{Var[X_1]+E[X_1]^2}}\right)$$

and

$$\sigma_1^2 = \ln\left(1+\frac{Var[X_1]}{E[X_1]^2}\right)$$
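
For concreteness, here is a minimal Python sketch of this moment matching (the mean and variance values are arbitrary examples):

```python
import numpy as np

# Given moments of the lognormal X (arbitrary example values) ...
EX, VarX = 2.0, 1.5

# ... recover the parameters of the underlying Normal Y = log(X).
mu = np.log(EX**2 / np.sqrt(VarX + EX**2))
sigma2 = np.log(1 + VarX / EX**2)

# Round trip: the lognormal mean exp(mu + sigma2/2) should return EX.
print(np.exp(mu + sigma2 / 2))  # 2.0
```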

Similarly, for a bivariate distribution, I want to obtain the covariance parameter $\sigma_{12}$. The only equation describing this relation that I have been able to find is

$$\sigma_{12} = \ln\left(1+\rho\left[\sqrt{1+\frac{Var[X_1]}{E[X_1]^2}}\sqrt{1+\frac{Var[X_2]}{E[X_2]^2}}-1\right]\right),$$

where $\rho$ is the correlation between $X_1$ and $X_2$ and is given by $\rho = \frac{Cov(X_1,X_2)}{\sqrt{Var[X_1]}\sqrt{Var[X_2]}}$.

Is this equation for $\sigma_{12}$ describing the relation correctly, and can anyone give me a proof of its validity (or a reference to one)? I checked this topic and this one, but I couldn't derive a proof from them myself.

    Start with the formula at https://stats.stackexchange.com/questions/319122/particular-mean-value-of-fx-y-where-x-y-gaussian-vector (which is straightforward to derive from the usual covariance formulas) and solve. – whuber May 23 '22 at 12:53

1 Answer


Some basic facts about the Normal distribution help with this.

Background on the Normal distribution

The first fact is that any Normal random variable $Y$ with mean $\mu$ and standard deviation $\sigma$ has the same distribution as $\sigma Z + \mu$ where $Z$ is a standard Normal variable (that is, it has zero mean and unit s.d.).

The second fact is that when $(Y_1, Y_2)$ have a bivariate Normal distribution, then any linear combination $U = \alpha Y_1 + \beta Y_2$ has a Normal distribution. We may determine exactly which distribution that is by computing the mean and variance of $U$ using the usual rules,

$$E[U] = \alpha E[Y_1] + \beta E[Y_2]$$

and

$$\operatorname{Var}(U) = \alpha^2\operatorname{Var}(Y_1) + \beta^2\operatorname{Var}(Y_2) + 2\alpha\beta\operatorname{Cov}(Y_1,Y_2).$$

The third fact is that the density function of a standard Normal variable $Z$ at the value $z$ is $C\exp(-z^2/2)$ for a universal constant $C$ (whose value we don't need to know).

Because this is a density function, it integrates to unity. By a simple change of variable $z \to \alpha z + \beta$ ($\alpha \ne 0$) we can compute a host of related integrals:

$$1 = \int_{\mathbb R}C e^{-z^2/2}\,\mathrm{d}z = C\int_{\mathbb R} e^{-(\alpha z + \beta)^2/2}\,\mathrm{d}(\alpha z + \beta) = |\alpha|\, e^{-\beta^2/2} C \int_{\mathbb R} e^{-\alpha^2 z^2/2 - \alpha\beta z}\,\mathrm{d}z$$

which is equivalent to our fourth (and final) fact,

$$\frac{e^{\beta^2/2}}{|\alpha| } = C\int_{\mathbb R} e^{-\alpha^2 z^2/2 - \alpha\beta z}\,\mathrm{d}z.$$
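
As a numeric sanity check (not part of the argument), this fourth fact can be verified by quadrature with $C = 1/\sqrt{2\pi}$ and arbitrary values of $\alpha \ne 0$ and $\beta$:

```python
import numpy as np
from scipy.integrate import quad

# Verify: C * integral of exp(-a^2 z^2/2 - a*b*z) dz = exp(b^2/2) / |a|.
C = 1 / np.sqrt(2 * np.pi)
a, b = 1.7, -0.6  # arbitrary test values with a != 0

lhs = np.exp(b**2 / 2) / abs(a)
rhs = C * quad(lambda z: np.exp(-a**2 * z**2 / 2 - a * b * z), -np.inf, np.inf)[0]
print(lhs, rhs)  # the two values agree to quadrature precision
```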


Lognormal distributions

Suppose, then, that $(Y_1,Y_2)$ has a bivariate Normal distribution with means $\mu_i,$ standard deviations $\sigma_i,$ and covariance $\sigma_{12}=\rho\sigma_1\sigma_2$ (thus, $\rho$ is the correlation coefficient). By definition, $(X_1,X_2) = (e^{Y_1}, e^{Y_2})$ then has a bivariate Lognormal distribution. Let's compute some of its moments.

  • The raw moments of any order $k$ are evaluated from the fourth fact as

    $$E\left[X_i^k\right] = E\left[ \left(e^{Y_i}\right)^k\right] = E\left[e^{k Y_i}\right] = E\left[e^{k(\sigma_i Z_i + \mu_i)}\right] = E\left[e^{(k\sigma_i)Z_i + k\mu_i}\right] = e^{k\mu_i + (k\sigma_i)^2/2}$$

    and the mixed raw moments of orders $(j,k)$ as

    $$E\left[X_1^j X_2^k\right] = E\left[ \left(e^{Y_1}\right)^j \left(e^{Y_2}\right)^k\right] = E\left[e^{j Y_1 + k Y_2}\right] = e^{j\mu_1 + k\mu_2} e^{(j^2\sigma_1^2 + k^2\sigma_2^2 + 2jk\rho\sigma_1\sigma_2)/2}.$$

    The last equality follows from the variance formula in the second fact, as applied to the linear combination $jY_1 + kY_2.$

  • Consequently, the variances and covariances are

    $$S_i^2=\operatorname{Var}(X_i) = E[X_i^2] - E[X_i]^2 = e^{2\mu_i + (2\sigma_i)^2/2} - \left(e^{\mu_i + \sigma_i^2/2}\right)^2 = e^{2\mu_i + \sigma_i^2}\left(e^{\sigma_i^2}-1\right)$$

    and, with similar calculations,

    $$S_{12}=\operatorname{Cov}(X_1, X_2) = E[X_1X_2] - E[X_1]E[X_2] = \cdots = e^{\mu_1+\mu_2 + \sigma_1^2/2 + \sigma_2^2/2}\left(e^{\rho\sigma_1\sigma_2} - 1\right).$$

  • By definition, the correlation is

    $$R_{12}=\operatorname{Cor}(X_1,X_2) = \frac{S_{12}}{S_1S_2} = \frac{e^{\rho\sigma_1\sigma_2} - 1}{\sqrt{(e^{\sigma_1^2} -1 )(e^{\sigma_2^2}-1)}}.$$

    (A simulation check of these three formulas appears just below this list.)
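
A quick Monte Carlo sketch (with arbitrarily chosen parameters) comparing these closed forms against simulated moments:

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary parameters for the underlying bivariate Normal (Y1, Y2).
mu1, mu2, s1, s2, rho = 0.3, -0.5, 0.8, 0.9, 0.4
cov = [[s1**2, rho * s1 * s2], [rho * s1 * s2, s2**2]]
Y = rng.multivariate_normal([mu1, mu2], cov, size=1_000_000)
X1, X2 = np.exp(Y[:, 0]), np.exp(Y[:, 1])

# Closed forms derived above.
M1 = np.exp(mu1 + s1**2 / 2)                                          # E[X1]
S12 = np.exp(mu1 + mu2 + (s1**2 + s2**2) / 2) * (np.exp(rho * s1 * s2) - 1)
R12 = (np.exp(rho * s1 * s2) - 1) / np.sqrt(
    (np.exp(s1**2) - 1) * (np.exp(s2**2) - 1))

print(M1, X1.mean())                   # mean
print(S12, np.cov(X1, X2)[0, 1])       # covariance
print(R12, np.corrcoef(X1, X2)[0, 1])  # correlation
```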

Answering the question

The question is tantamount to asking how to recover the covariance parameter, $\sigma_{12} = \operatorname{Cov}(Y_1,Y_2),$ in terms of the correlation and other moments of the lognormally distributed variables $(X_1,X_2).$ Writing $$M_i = E[X_i] = e^{\mu_i + \sigma_i^2/2}$$ for the expectations, easy algebra gives

$$e^{\sigma_i^2} = 1 + \frac{S_i^2}{M_i^2},$$

whence

$$\sigma_i = \sqrt{\log \left(1 + \frac{S_i^2}{M_i^2}\right)};$$

and

$$e^{\rho \sigma_1 \sigma_2} = 1 + R_{12} \frac{S_1S_2}{M_1M_2},$$

entailing

$$\sigma_{12} = \rho \sigma_1 \sigma_2 = \log\left(1 + R_{12} \frac{S_1S_2}{M_1M_2}\right).$$
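
A minimal Python sketch of this recovery (the helper name is chosen here purely for illustration), round-tripping from Normal parameters through the lognormal moments and back:

```python
import numpy as np

def normal_params_from_lognormal_moments(M1, M2, S1, S2, R12):
    """Recover sigma_1, sigma_2, and sigma_12 of the underlying Normal."""
    sigma1 = np.sqrt(np.log(1 + S1**2 / M1**2))
    sigma2 = np.sqrt(np.log(1 + S2**2 / M2**2))
    sigma12 = np.log(1 + R12 * S1 * S2 / (M1 * M2))
    return sigma1, sigma2, sigma12

# Start from known Normal parameters ...
mu1, mu2, s1, s2, rho = 0.3, -0.5, 0.8, 0.9, 0.4

# ... compute the lognormal moments from the formulas derived above ...
M1, M2 = np.exp(mu1 + s1**2 / 2), np.exp(mu2 + s2**2 / 2)
S1 = np.sqrt(np.exp(2 * mu1 + s1**2) * (np.exp(s1**2) - 1))
S2 = np.sqrt(np.exp(2 * mu2 + s2**2) * (np.exp(s2**2) - 1))
R12 = (np.exp(rho * s1 * s2) - 1) / np.sqrt(
    (np.exp(s1**2) - 1) * (np.exp(s2**2) - 1))

# ... and recover (sigma_1, sigma_2, sigma_12) exactly.
print(normal_params_from_lognormal_moments(M1, M2, S1, S2, R12))
print(s1, s2, rho * s1 * s2)
```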


The formula proposed in the question appears to be in a mixed form in which the Normal parameters occur on both sides. The closest I can come retains $\rho$ in the foregoing equation and re-expresses the $\sigma_i$ in terms of the moments of the $X_i$ to write

$$\sigma_{12} = \rho \sqrt{\log \left(1 + \frac{S_1^2}{M_1^2}\right)\,\log \left(1 + \frac{S_2^2}{M_2^2}\right)}.$$
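
A quick numeric comparison shows how the proposed formula diverges from the correct one. (Note that the two expressions coincide exactly when $S_1/M_1 = S_2/M_2$, which may explain why the formula seemed to work for nearly equal moments; the values below are arbitrary moments chosen so the coefficients of variation differ.)

```python
import numpy as np

# Arbitrary moments with S1/M1 = 0.2 but S2/M2 = 3.
M1, M2, S1, S2, R12 = 1.0, 1.0, 0.2, 3.0, 0.3

correct = np.log(1 + R12 * S1 * S2 / (M1 * M2))
proposed = np.log(1 + R12 * (np.sqrt(
    (1 + S1**2 / M1**2) * (1 + S2**2 / M2**2)) - 1))
print(correct, proposed)  # about 0.166 vs 0.511 -- clearly different
```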

whuber
    I do not know why my original equation works for $Var[X_1] \approx Var[X_2]$ and $E[X_1] \approx E[X_2]$; but with the equation provided in this answer, I could show that for $Var[X_1] \ne Var[X_2]$ and $E[X_1] \ne E[X_2]$ it is plainly wrong. – user_829312 May 25 '22 at 09:14