Does $Y=\alpha X + \beta$ hold for multivariate Gaussian density?

Question

In the one-dimensional case, if $X$ is $\mathcal{N}(\mu,\sigma^2)$, then $Y = \alpha X + \beta$ is $\mathcal{N}(\alpha \mu + \beta,\alpha^2\sigma^2)$. We can prove this using the cumulative distribution function of $Y$ (taking $\alpha > 0$, so that dividing by $\alpha$ preserves the inequality):

$F_Y(a) = P\{Y \leq a\} = P\{\alpha X + \beta \leq a\} = P\{X \leq (a-\beta)/\alpha\}$.

Writing this probability as an integral of the density of $X$ and changing variables to $v = \alpha x + \beta$ gives

$F_Y(a) = \int_{-\infty}^{a} \frac{1}{\sqrt{2\pi}(\alpha\sigma)} \exp \{ \frac{-(v-(\alpha \mu + \beta))^2}{2(\alpha\sigma)^2}\} dv $

Hence $f_Y(v) = \frac{1}{\sqrt{2\pi}(\alpha\sigma)} \exp \{ \frac{-(v-(\alpha \mu + \beta))^2}{2(\alpha\sigma)^2}\} $

Thus $Y$ is $\mathcal{N}(\alpha \mu + \beta, \alpha^2 \sigma^2)$.
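As a quick numerical sanity check of the one-dimensional claim (not a proof), the sketch below simulates $Y = \alpha X + \beta$ and compares the sample mean and variance with $\alpha\mu+\beta$ and $\alpha^2\sigma^2$. The function name `simulate_affine` and the parameter values are made up for illustration; a negative $\alpha$ is used on purpose, since only $\alpha^2$ enters the variance.

```python
import math
import random


def simulate_affine(mu, sigma, alpha, beta, n=200_000, seed=0):
    """Draw X ~ N(mu, sigma^2); return the sample mean and variance of Y = alpha*X + beta."""
    rng = random.Random(seed)
    ys = [alpha * rng.gauss(mu, sigma) + beta for _ in range(n)]
    mean = math.fsum(ys) / n
    var = math.fsum((y - mean) ** 2 for y in ys) / n
    return mean, var


# Illustrative (hypothetical) parameter values.
mu, sigma, alpha, beta = 1.0, 2.0, -3.0, 0.5
mean_y, var_y = simulate_affine(mu, sigma, alpha, beta)

print("sample mean:", mean_y, " theory:", alpha * mu + beta)
print("sample var: ", var_y, " theory:", alpha ** 2 * sigma ** 2)
```

The two printed pairs should agree up to Monte Carlo error, which shrinks like $1/\sqrt{n}$.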

In the multivariate case, if $X$ is $\mathcal{N}(\mu,\Sigma)$ and $Y=\alpha X + \beta$, is $Y \sim \mathcal{N}(\alpha \mu + \beta,\alpha^2\Sigma)$? If so, how do we prove it?

score 8 · Accepted Answer · edited Apr 26 '11 at 11:59

The method of characteristic functions (CF) will work here. So we have the CF for $X$ as $$\varphi_{X}(t)=\exp\left(it^{T}\mu_{X}-\frac{1}{2}t^{T}\Sigma_{X}t\right)$$ Now we make the substitution $Y=\alpha X + \beta$ in the CF and we get:

$$\varphi_{Y}(t)=E\left[\exp(it^{T}Y)\right]=E\left[\exp(it^{T}\alpha X +it^{T}\beta)\right]=\exp(it^{T}\beta)\varphi_{X}(\alpha t)$$

Then substitute in the CF expression for $X$.

$$\varphi_{Y}(t)=\exp(it^{T}\beta)\exp\left(i(\alpha t)^{T}\mu_{X}-\frac{1}{2}(\alpha t)^{T}\Sigma_{X}(\alpha t)\right)$$ $$=\exp\left(it^{T}[\alpha\mu_{X}+\beta]-\frac{1}{2}t^{T}[\alpha^{2}\Sigma_{X}]t\right)$$

But this is the characteristic function of a normal distribution with mean vector $\alpha\mu_{X}+\beta$ and covariance matrix $\alpha^{2}\Sigma_{X}$. Since a distribution function uniquely determines its characteristic function and vice versa, you have your proof.
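The identity used above can be checked numerically at a single point $t$. This is only a minimal sketch with arbitrary illustrative numbers; the helper names `dot`, `quad_form` and `cf_mvn` are hypothetical and not from any library.

```python
import cmath


def dot(u, v):
    """Inner product u^T v."""
    return sum(a * b for a, b in zip(u, v))


def quad_form(t, S):
    """Quadratic form t^T S t."""
    n = len(t)
    return sum(t[i] * S[i][j] * t[j] for i in range(n) for j in range(n))


def cf_mvn(t, mu, S):
    """Characteristic function of N(mu, S) evaluated at t."""
    return cmath.exp(1j * dot(t, mu) - 0.5 * quad_form(t, S))


# Illustrative (hypothetical) values; alpha is a scalar here.
mu_x = [1.0, -2.0]
sigma_x = [[2.0, 0.3], [0.3, 1.0]]
alpha = 1.5
beta = [0.5, 4.0]
t = [0.2, -0.7]

# exp(i t^T beta) * phi_X(alpha t)
lhs = cmath.exp(1j * dot(t, beta)) * cf_mvn([alpha * ti for ti in t], mu_x, sigma_x)

# CF of N(alpha mu_X + beta, alpha^2 Sigma_X) at the same t
mu_y = [alpha * m + b for m, b in zip(mu_x, beta)]
sigma_y = [[alpha ** 2 * s for s in row] for row in sigma_x]
rhs = cf_mvn(t, mu_y, sigma_y)

print(abs(lhs - rhs))  # difference should be at floating-point rounding level
```

Agreement at one $t$ is of course not a proof; the algebra above shows the two sides coincide for every $t$.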

To generalise to the case where $\alpha$ is an appropriately defined $c\times p$ matrix (where $p$ is the dimension of $X$), we simply replace the covariance matrix $\alpha^{2}\Sigma_{X}$ with the $c\times c$ covariance matrix $\alpha\Sigma_{X}\alpha^{T}$; in the derivation, $\varphi_{X}(\alpha t)$ becomes $\varphi_{X}(\alpha^{T}t)$, because $t^{T}\alpha X=(\alpha^{T}t)^{T}X$. Note that $\beta$ must be a $c\times 1$ vector for the mean vector to make sense, but the mean keeps the same form, $\alpha\mu_{X}+\beta$.
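For the matrix case, here is a small sketch that computes $\alpha\mu_{X}+\beta$ and $\alpha\Sigma_{X}\alpha^{T}$ for a $3\times 2$ matrix and compares them with a simulation. Everything here is an assumption for illustration: the numbers are arbitrary, `A` and `b` stand in for $\alpha$ and $\beta$, and the helpers (`matmul`, `transpose`, `matvec`, `cholesky_2x2`) are hand-written rather than library calls.

```python
import math
import random


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(A):
    return [list(col) for col in zip(*A)]


def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def cholesky_2x2(S):
    """Lower-triangular L with L L^T = S, for a 2x2 positive definite S."""
    l11 = math.sqrt(S[0][0])
    l21 = S[1][0] / l11
    l22 = math.sqrt(S[1][1] - l21 * l21)
    return [[l11, 0.0], [l21, l22]]


# Illustrative (hypothetical) values: p = 2, c = 3.
mu_x = [1.0, -2.0]
sigma_x = [[2.0, 0.3], [0.3, 1.0]]
A = [[1.0, 2.0], [0.0, -1.0], [3.0, 1.0]]   # plays the role of alpha (c x p)
b = [0.5, 0.0, -1.0]                        # plays the role of beta (c x 1)

# Theoretical mean and covariance of Y = A X + b.
mean_y = [m + bi for m, bi in zip(matvec(A, mu_x), b)]
cov_y = matmul(matmul(A, sigma_x), transpose(A))

# Monte Carlo check: X = mu_x + L z with z standard normal.
rng = random.Random(0)
L = cholesky_2x2(sigma_x)
n = 100_000
samples = []
for _ in range(n):
    z = [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
    x = [m + lz for m, lz in zip(mu_x, matvec(L, z))]
    samples.append([ax + bi for ax, bi in zip(matvec(A, x), b)])

emp_mean = [sum(s[i] for s in samples) / n for i in range(3)]
emp_cov = [[sum((s[i] - emp_mean[i]) * (s[j] - emp_mean[j]) for s in samples) / n
            for j in range(3)] for i in range(3)]

print("theory mean:", mean_y, " empirical:", emp_mean)
print("theory cov: ", cov_y)
print("empirical:  ", emp_cov)
```

Since $c > p$ in this example, $\alpha\Sigma_{X}\alpha^{T}$ is singular, so $Y$ has no density on $\mathbb{R}^{3}$, yet the characteristic function argument still goes through unchanged.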

edited Apr 26 '11 at 11:59

cardinal


answered Apr 25 '11 at 07:00

probabilityislogic


@probabilityislogic: Thanks for pointing out the variance part. It was a mistake I made while typing. Many thanks for your answer. Got to know the importance of CFs. – Learner Apr 25 '11 at 07:59

@Arun, it should be noted that this result holds much more generally, in particular, when $\alpha$ is a matrix (of appropriate dimensions). Only the expression for the covariance changes slightly to accommodate this. Note that @probabilityislogic is (rightly) treating $\beta$ as an arbitrary vector of the correct dimension here. – cardinal Apr 25 '11 at 11:43

(+1) A couple minor quibbles: (a) Generalize to the matrix case mentioned above, (b) The distribution and not the pdf is characterized by the ch.f., so the last sentence should be changed and (c) Using \exp will get $\exp$ to render properly. – cardinal Apr 25 '11 at 13:22
Cardinal - thanks for the comments - just one minor counter quibble, I thought CFs were just a Fourier transform of the PDF - hence you can get the PDF from the CF by inverting the Fourier transform. How then is the PDF not uniquely defined by the CF? – probabilityislogic Apr 25 '11 at 22:53

@probabilityislogic: Certainly the characteristic function is uniquely defined once the density is specified. The reverse is not (quite) true. First there is the issue of the existence of the pdf in the first place. Second, we can change the pdf on any set of measure zero and obtain the same characteristic function. Note that this includes dense(!) subsets of $\mathbb{R}$, for example, the rationals. The distribution function is uniquely determined by the ch.f., though. – cardinal Apr 26 '11 at 00:15
@cardinal - I see your point, but my statement basically implies that a PDF exists (unless there is a way to derive a CF from a non-existent PDF - the vice versa). I wouldn't mind seeing a specific example with numbers of the "changing the PDF on a set of measure zero" - this sounds odd to me, like one of the set theory annoyances (sounds related to the axiom of choice and some of its consequences). – probabilityislogic Apr 26 '11 at 02:21

@probabilityislogic, the pdf need not exist for the ch.f. to exist. Indeed, a ch.f. exists for all distribution functions $F$ (not just those with densities). This is a simple consequence of monotonicity of expectation, i.e., $|\mathbb{E} e^{i X}| \leq \mathbb{E}|e^{iX}| \leq 1$. – cardinal Apr 26 '11 at 02:38

@probabilityislogic, For an example of changing a pdf on a dense subset of the real line, let $\varphi(x)$ be the density of a standard normal and define $\bar{\varphi}(x) = \varphi(x)$ for all $x \not\in \mathbb{Q}$ and $\bar{\varphi}(x) = \varphi(x) + 1/q(x)$ for every $x \in \mathbb{Q}$ where $q(x)$ is the denominator of $x$ expressed in lowest terms. Then $\bar{\varphi}(x)$ is continuous at every irrational number $x$ and discontinuous at every rational number, yet the characteristic functions of $\varphi$ and $\bar{\varphi}$ coincide. – cardinal Apr 26 '11 at 02:39
@cardinal - I understand what you are saying here, but your proof depends on being able to distinguish a rational number from one that is irrational - and this can't be done without the axiom of choice (I think). This is what I meant by a specific numerical example. For any irrational number can be represented as a limit of a rational one, just by adding more decimal places to the fraction. Your proof relies on the comparison of two infinite quantities - this usually cannot be done without specifying how the "infinities" are to be generated. – probabilityislogic Apr 26 '11 at 03:43
@cardinal this article provides a proof of what I am talking about - basically proves that if axiom of choice is accepted, then cardinality of rationals equals cardinality of irrationals or is bigger, hence the set $\mathbb{Q}$ does not have a measure zero, and your example doesn't apply – probabilityislogic Apr 26 '11 at 06:07
@probabilityislogic, the importance of axiom of choice is well described in wikipedia page, if you want to disregard it you have to have serious reasons. I've read the article in your link, and I suspect that it is false. The assumptions for the proof for example are a tad suspicious for me. Furthermore axioms are generally assumed not proven and given this result's enormous implication it is strange that half of the proof involves historical citations. – mpiktas Apr 26 '11 at 07:34

@probabilityislogic, it's good to be able to recognize a crank when you see one. The countability of $\mathbb{Q}$ and uncountability of $\mathbb{R}$ are among the first things one typically learns in any introductory real analysis course. The proofs are elementary and the arguments beautiful. – cardinal Apr 26 '11 at 08:15
@probabilityislogic, the proof that $\mu(\mathbb{Q}) = 0$ is (almost) equally elementary and beautiful. ($\mu$ here being Lebesgue measure or any measure absolutely continuous with respect to Lebesgue measure.) – cardinal Apr 26 '11 at 08:22
@cardinal - but they are both dense sets, so given any two real numbers, we can find a rational one in between them - countability is irrelevant. But you can never actually "write down" an irrational number (should be called the fake numbers I think :) ), except as a limit of a sequence of rational numbers (such as square roots, $\pi$, Euler's number). – probabilityislogic Apr 26 '11 at 09:41
@probabilityislogic, I fail to see your point or the relevance to the question, your answer, or my comments. Can you clarify? Have you had any exposure to (introductory) real analysis? If you're really disputing the countability of $\mathbb{Q}$ or its implication in establishing the Lebesgue measure of $\mathbb{Q}$, then I'm afraid you'll find you're in very sparse and (mathematically) naive company. That discussion may be better suited to a different medium than the comments to this answer. – cardinal Apr 26 '11 at 10:12
@cardinal, agreed (discussion finished). Basically a tangent gone too far from a proposed counter example. – probabilityislogic Apr 26 '11 at 10:50
@probabilityislogic, actually my counterexample is a bona fide one and I can provide even "stranger" ones if you'd like. For example, I can give you a distribution that has no atoms and yet has derivative of zero almost everywhere, so it has no density either. Yet, its characteristic function exists. – cardinal Apr 26 '11 at 11:51
