3

I'm working on distributions for a physics problem and I am quite stumped on how to proceed properly.

The problem is as follows: I start at the point $(0,0)$ and go a distance $\eta$ in the x direction to $(\eta, 0)$. From there I go in a random direction $\phi$ (uniformly distributed) a distance $x$ that is Gaussian distributed. I need the distribution of $y$, where $y$ is the distance of the resulting point from $(0,0)$.

To formalize that:

$f(x)=\frac{1}{b\sqrt{\pi}}\exp(-x^2/b^2)$

(this is the distribution of the random distance $x$), and I also know that

$y = \sqrt{2 x \eta \cos (\phi )+\eta ^2 + x^2}$

with $\phi$ being uniformly distributed on $[0, 2\pi)$.

Is it correct to say (performing a change of variables from $x$ to $y$, keeping $\phi$):

$g(y, \phi)\propto f(x)|\frac{dx}{dy}||\frac{dx}{d\phi}|$?

I'm pretty sure this is wrong (I suppose I need the Jacobian in some way), but I can't find good references for it.
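
For concreteness, the process is easy to simulate; below is a minimal R sketch (taking $\eta=1$ and $b=\sqrt{2}$ for illustration, so that the Gaussian step has unit standard deviation) that generates samples of $y$ against which any candidate density can be checked:

### simulate the process: step eta along x, then a Gaussian-length step in a uniform direction
set.seed(1)
n   <- 1e5
eta <- 1
b   <- sqrt(2)                      # f(x) above is Normal with sd b/sqrt(2) = 1
phi <- runif(n, 0, 2*pi)
x   <- rnorm(n, 0, b/sqrt(2))
y   <- sqrt(2*x*eta*cos(phi) + eta^2 + x^2)
hist(y, freq = FALSE, breaks = 100) # empirical distribution of y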

wa4557
  • The modern theory of changes of variable in integration (dating to about 1900) is outlined at https://stats.stackexchange.com/questions/36093. An older theory (based on "Jacobians") is illustrated in many posts here. For intuitive explanations, see https://stats.stackexchange.com/questions/14483. Maybe that answers your question? – whuber May 29 '22 at 12:36
  • You are right about your expression being wrong (it's not even a density function - the derivatives can be negative, and the whole thing doesn't integrate to 1). In general it is impossible to determine the joint distribution of two variables from their individual marginal distributions – J. Delaney May 29 '22 at 14:01
  • Sorry, sloppy physicist here. I changed my question accordingly. On a general level, will I be out of luck here? Or is there a general approach I could take? – wa4557 May 29 '22 at 15:51
  • $y$ is not known but a consequence of $\phi$ and $x$. That's why I want the distribution $g(y)$ – wa4557 May 29 '22 at 16:32
  • @wa4557 Do you mean that $x$ and $\phi$ are independent random variables and $y$ is a function of them ? – J. Delaney May 29 '22 at 17:07
  • Yes, exactly. Sorry I was not clear in the question. I changed it again. The PDFs for $x$ and $\phi$ are known, and how $y$ relates to them. Now I would like to make a change of variables using the knowledge of $y$ and the distributions. $x$ and $\phi$ are independent of each other. – wa4557 May 29 '22 at 17:09
  • What physical problem is underlying here? When I reverse engineer this, I can model it as the distance between the origin and a random point on a circle with radius $\eta$ and midpoint $(0,x)$, where $x$ follows a normal distribution with standard deviation $b/\sqrt{2}$. But possibly you meant to model the situation with a midpoint $(x_1,x_2)$ that is bivariate normal distributed? In that case $f(x)$ should be a chi-distribution with two degrees of freedom instead of one, and the distribution of $y$ is related to the Rice distribution. – Sextus Empiricus May 29 '22 at 20:53
  • In your edit you write "from there I go in a random direction (uniformly distributed) a distance that is Gaussian distributed". To me this seems like a very unlikely problem situation. What sort of physics is generating this problem? – Sextus Empiricus May 30 '22 at 12:41
  • Thanks for your answers; as it turns out it really is the Rice distribution I am after: when doing the calculations carefully as laid out in the answers I end up there (or at least somewhere close). The physics problem is (an approximation of) a pendulum that is turned upside down, experiencing Gaussian-distributed fluctuations but at the same time pushed in a certain direction. That is captured by the formulas above. – wa4557 May 31 '22 at 11:45

3 Answers

6

Let $(X,\Phi,Y)$ be the random variables in question. Let's suppose $X$ and $\Phi$ are independent. $Y$ is a function of those two,

$$Y = f_\eta(X,\Phi)=\sqrt{2X\eta\cos(\Phi) + \eta^2+X^2}$$

for some number $\eta \ge 0.$ (If $\eta \lt 0,$ negating both it and $X$ does not change the distribution of $X$ and places us into the $\eta \gt 0$ situation.) That is, when $X\ge 0,$ $Y$ is the length of the third side of a triangle of side lengths $\eta$ and $X$ with included angle $\Phi;$ and when $X\le 0,$ $Y$ is the length of the third side of a triangle of side lengths $\eta$ and $-X.$ This shows the formula makes sense (we're not trying to find the root of a negative number) and that it is almost everywhere a two-to-one mapping, because

$$f_{\eta}(X,\Phi) = f_{\eta}(-X,\Phi+\pi).$$
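
To check this, note that $\cos(\Phi+\pi) = -\cos(\Phi),$ so

$$f_\eta(-X,\Phi+\pi) = \sqrt{2(-X)\eta\cos(\Phi+\pi) + \eta^2 + X^2} = \sqrt{2X\eta\cos(\Phi) + \eta^2 + X^2} = f_\eta(X,\Phi).$$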

The question appears to ask for the joint density function of $(Y,\Phi) = (f_\eta(X,\Phi), \Phi).$

We may easily write the joint density of $(X,\Phi),$ because independence implies the joint density is the product of the univariate densities. The joint probability element therefore is (up to sign)

$$f_{X,\Phi}(x,\phi) = \frac{1}{b\sqrt{\pi}} e^{-x^2/b^2}\,\mathrm{d}x \frac{1}{2\pi}\,\mathrm{d}\phi.\tag{*}$$

To change the coordinates, notice the definition of $Y$ implies

$$y^2 - x^2 - \eta^2 - 2x\eta\cos(\phi)=0.$$

From this we obtain $x$ and $x^2$ (sort of, up to a choice of two solutions) as

$$x = -B(\phi,\eta) \pm \sqrt{\Delta(\phi,\eta,y)}\tag{1a}$$

and

$$x^2 = B(\phi,\eta)^2 + \Delta(\phi,\eta, y) \mp 2B(\phi,\eta)\sqrt{\Delta(\phi,\eta, y)}\tag{1b}$$

where $B(\phi,\eta) = \eta\cos(\phi)$ and $\Delta(\phi,\eta, y) = \eta^2\cos^2(\phi) + (y^2-\eta^2).$
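
In detail, the defining relation is a quadratic in $x,$ namely $x^2 + 2B(\phi,\eta)\,x + (\eta^2 - y^2) = 0,$ so the quadratic formula gives

$$x = -B(\phi,\eta) \pm \sqrt{B(\phi,\eta)^2 - (\eta^2 - y^2)} = -B(\phi,\eta) \pm \sqrt{\Delta(\phi,\eta,y)},$$

and squaring this yields $(1b).$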

Differentiating (with respect to the variables $(y,x,\phi)$) we also find

$$0 = \mathrm{d}\left(y^2 - x^2 - \eta^2 - 2x\eta\cos(\phi)\right) = 2y\mathrm{d}y - 2x\mathrm{d}x - 2\eta\cos(\phi)\mathrm{d}x + 2x\eta\sin(\phi)\mathrm{d}\phi.$$

Consequently

$$0 = 0\wedge \mathrm{d}\phi = 2y\mathrm{d}y\wedge\mathrm{d}\phi - (2x + 2\eta\cos(\phi))\mathrm{d}x\wedge\mathrm{d}\phi,$$

which we can solve (almost everywhere) to convert the original differential element $\mathrm{d}x\mathrm{d}\phi$ to the new variables,

$$\mathrm{d}x\mathrm{d}\phi = \frac{y}{x + \eta\cos(\phi)}\,\mathrm{d}y\mathrm{d}\phi.\tag{2}$$
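
Equivalently, at fixed $\phi$ this is the familiar one-dimensional change of variable: differentiating $y^2 = x^2 + \eta^2 + 2x\eta\cos(\phi)$ with respect to $x$ gives

$$\frac{\partial y}{\partial x} = \frac{x + \eta\cos(\phi)}{y},$$

whose reciprocal is the factor appearing in $(2).$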

Plugging $(2)$ into $(*)$ gives an expression for the joint probability element $f_{Y,\Phi}(y,\phi)\mathrm{d}y\mathrm{d}\phi$ in which "$x$" and "$x^2$" can be replaced by expressions $(1a)$ and $(1b),$ respectively.

Finally, because the map from $(X,\Phi)$ to $(Y,\Phi)$ is two-to-one, we must double the preceding result: that's the answer.
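
Written out (a sketch only), with the Normal density from $(*)$ and the fact that $|x + \eta\cos(\phi)| = \sqrt{\Delta(\phi,\eta,y)}$ at both roots of $(1a),$ the recipe produces, for $y \gt \eta\,|\sin(\phi)|,$

$$f_{Y,\Phi}(y,\phi) = \frac{1}{2\pi}\,\frac{y}{\sqrt{\Delta}}\left[\frac{1}{b\sqrt{\pi}}e^{-\left(-B+\sqrt{\Delta}\right)^2/b^2} + \frac{1}{b\sqrt{\pi}}e^{-\left(-B-\sqrt{\Delta}\right)^2/b^2}\right]$$

(writing $B$ and $\Delta$ for $B(\phi,\eta)$ and $\Delta(\phi,\eta,y)$ and summing the contributions of the two roots), and zero otherwise; the marginal density of $Y$ then follows by integrating out $\phi.$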


I'm not going to work this out further for several reasons. One is that the many-to-one transformation suggests the problem hasn't been stated quite as clearly as it ought to be. Another is that in the question, the expression given for the density of $X$ is incorrect. I have silently fixed it above, assuming the intention is that $X$ have a Normal distribution: but perhaps that's the wrong assumption. A change in the density of $X$ would not change how the answer is worked out, but it would change the details of the final answer. Finally, I suspect that this is just one step in the solution of a problem that, if it were disclosed, might be (much) simpler to solve some other way. Analyzing the density of $(Y,\Phi)$ is not an appetizing prospect.

whuber
5

$$ \begin{aligned} g_Y(y) &=\int_0^{2\pi} \frac{d\phi}{2\pi} \int_{-\infty}^{\infty} dx\, \frac{1}{b\sqrt{\pi}} \exp\left(\frac{-x^2}{b^2}\right) \, \delta \left( y-\sqrt{2x\eta\cos{\phi}+\eta^2+x^2} \right) \\ &=\int_0^{2\pi} \frac{d\phi}{2\pi}\, \frac{1}{b\sqrt{\pi}}\exp\left(\frac{-\tilde{x}^2}{b^2}\right)\frac{1}{\left| \frac{\partial}{\partial x }\sqrt{2x\eta\cos{\phi}+\eta^2+x^2}\,\right|_{x=\tilde{x}}}, \\ &\text{where } \tilde{x} \text{ solves } y^2 = 2\tilde{x}\eta \cos{\phi}+\tilde{x}^2+\eta^2. \end{aligned} $$ Complete the square in $y^2=2\tilde{x}\eta\cos{\phi}+\eta^2+\tilde{x}^2$ and solve for $\tilde{x}$ in terms of $\phi$, $y$, and $\eta$. And now the work begins . . .

But the integral looks vaguely familiar. $\sqrt{2x\eta\cos{\phi}+\eta^2+x^2}$ is screaming out to be $Y=\|X+\eta\|_2$, the Euclidean length of $X+\eta$ where $X$ and $\eta$ are in $\mathbb{R}^d$ and $\phi$ is simply the angle between them. It suggests that there is an earlier statement of this problem that might yield a geometric transformation and a much easier problem to solve. Maybe not, but you never know.
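
For what it's worth, the recipe is easy to check numerically. Below is a minimal R sketch (assuming the Normal form of $f$ from the question, taking $\eta=1$ and $b=\sqrt{2}$ for illustration, and summing the contributions of both roots $\tilde{x}$, per the multivaluedness noted in the comments): it integrates $\phi$ out on a grid and compares the result with a direct simulation.

### numerical sketch of g_Y(y): sum both roots x~, then integrate phi out on a grid
eta <- 1
b   <- sqrt(2)                                   # f(x) = exp(-x^2/b^2)/(b*sqrt(pi)) has sd 1
f   <- function(x) exp(-x^2/b^2)/(b*sqrt(pi))

g <- function(y) {
  dphi <- 2*pi/1e5
  phig <- seq(dphi/2, 2*pi - dphi/2, dphi)       # midpoint grid in phi
  B    <- eta*cos(phig)
  D    <- B^2 + y^2 - eta^2                      # discriminant; roots are x = -B +/- sqrt(D)
  ok   <- D > 1e-9                               # 1/sqrt(D) singularity is integrable; drop a tiny sliver
  # |dy/dx| = |x + eta*cos(phi)|/y = sqrt(D)/y at either root
  sum(y/sqrt(D[ok]) * (f(-B[ok] + sqrt(D[ok])) + f(-B[ok] - sqrt(D[ok])))) * dphi/(2*pi)
}

### compare with a direct simulation
set.seed(1)
n    <- 1e5
phi  <- runif(n, 0, 2*pi)
x    <- rnorm(n, 0, b/sqrt(2))
ysim <- sqrt(2*x*eta*cos(phi) + eta^2 + x^2)
hist(ysim, freq = FALSE, breaks = 100)
ygrid <- seq(0.05, 4, 0.05)
lines(ygrid, sapply(ygrid, g), col = 2)

If the bookkeeping is right, the red curve should track the histogram; the same scaffold works for any other choice of $f$.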

Peter Leopold
  • Two subtleties cannot be ignored: the first concerns why you only differentiate the transformation with respect to $x$ when it also involves $\phi.$ The second concerns the fact that this transformation is not one-to-one. – whuber May 29 '22 at 18:21
  • @whuber 1) I did the $x$ integral first. That's a free choice. $\phi$ & its pdf remain in the integrand & are the subject of the 2nd integral. 2) You're right that I ignored the multivalued nature of the sqrt. I was hoping the OP would give the whole problem domain, which seems like the distribution of the distance for a random walker in, say, $\mathbb{R}^2$ that starts at point $\eta$, proceeds $X$ relative to $\eta$, & then we seek the distance from $\eta+X$ to the origin. If that's the problem, then it decomposes into sub-problems: walk parallel & perpendicular to $\eta$, then the Pythagorean theorem. – Peter Leopold May 29 '22 at 19:39
5

Whuber gave an answer to the question about the coordinate transformation. But in the final two sentences of his answer he noted

Finally, I suspect that this is just one step in the solution of a problem that, if it were disclosed, might be (much) simpler to solve some other way. Analyzing the density of $(Y,\Phi)$ is not an appetizing prospect.

In your recent edit you disclosed that you are not after the joint distribution $f_{Y,\Phi}(y,\phi)$ but instead after the marginal distribution $f_Y(y)$.

In this answer I show one of those other, simpler ways to solve it, without directly tackling the joint density distribution.


When I reverse engineer this, I can model it as the distance $y$ between the origin and a random point on a circle with radius $\eta$ and midpoint $(0,x)$, where $x$ follows a normal distribution with standard deviation $b/\sqrt{2}$.

Instead of computing the distribution density it might be easier to compute the cumulative distribution.

We can consider the values of $x$ and $\phi$ for which the point on the circle is still within a distance $y$ (i.e. $f(x,\phi)<y$) and integrate over that area.

We can do this by considering, for every value of $x$, how much of the circle of radius $\eta$ centered at $(0,x)$ lies inside the circle of radius $y$ centered at $(0,0)$. Then we integrate over all $x$ while multiplying by the fraction of the circle that lies inside the circle of radius $y$.
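
In detail, by the law of cosines a point on the $\eta$-circle whose angle to the outward direction (away from the origin) is $\theta$ lies at squared distance $x^2 + \eta^2 + 2x\eta\cos(\theta)$ from the origin, so it falls inside the circle of radius $y$ exactly when $\cos(\theta) \le \frac{y^2 - x^2 - \eta^2}{2x\eta}.$ That arc covers a fraction

$$1 - \frac{1}{\pi}\cos^{-1}\left(\frac{y^2 - x^2 - \eta^2}{2x\eta}\right) = \frac{1}{\pi}\cos^{-1}\left(\frac{x^2 + \eta^2 - y^2}{2x\eta}\right)$$

of the full circle, which is the overbraced factor below.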

$$F(Y<y) = 2 \int_{\max(\eta-y,\,y-\eta)}^{y+\eta} \frac{1}{b\sqrt{\pi}} e^{-\frac{x^2}{b^2}} \overbrace{\frac{1}{\pi} \cos^{-1}\left(\frac{x^2+\eta^2-y^2}{2x\eta}\right)}^{\text{fraction of circular segment intersecting}}dx + 2\int_0^{\max(0,\,y-\eta)} \frac{1}{b\sqrt{\pi}} e^{-\frac{x^2}{b^2}} dx$$

To get the density you differentiate the above. For this you can use the Leibniz integral rule, and you will get several terms because not only the integrand depends on $y$ but the integration limits do as well (but since the integrand is zero at the limits, or the terms cancel, these contributions vanish).

If I do this then I get to

$$f(y) = 2 \int_{|y-\eta|}^{y+\eta} \frac{1}{b\sqrt{\pi}} e^{-\frac{x^2}{b^2}} \frac{1}{\pi} \frac{y}{x\eta}\frac{1}{\sqrt{1-\left(\frac{x^2+\eta^2-y^2}{2x\eta}\right)^2}}dx $$

or, using $(2x\eta)^2-\left(x^2+\eta^2-y^2\right)^2 = \left(x^2-(y-\eta)^2\right)\left((y+\eta)^2-x^2\right)$,

$$f(y) = \frac{y}{b\pi^{3/2}} \int_{|y-\eta|}^{y+\eta} e^{-\frac{x^2}{b^2}} \frac{1}{\sqrt{x^2-(y-\eta)^2}\sqrt{(y+\eta)^2-x^2}}dx $$

Demonstration with computational simulation

Below is a histogram for a simulation of size $n=10^4$ with parameters $\eta= 1$ and $\sigma = b/\sqrt{2} = 1$. Superimposed is a computation of the density distribution based on a differentiation of the formula above.

[Figure: histogram of the simulated $y$ values with the computed density curve superimposed]

### settings
set.seed(1)
n = 10^4
eta = 1
sig = 1

### simulate and plot histogram
phi = runif(n, 0, 2*pi)
x = rnorm(n, 0, sig)
y = sqrt(eta^2 + x^2 + 2*x*eta*cos(phi))
hist(y, freq = 0, breaks = seq(0, 5, 0.1))

### integrand function
integrand = function(x, z) {
  ### exit for special cases to prevent NA values for acos
  if (x - eta >= z) {
    return(0)
  }
  if (x + eta <= z) {
    return(dnorm(x, 0, sig))
  }
  return(dnorm(x, 0, sig) * acos((x^2 + eta^2 - z^2) / (2*x*eta)) / pi)
}
integrand = Vectorize(integrand)

### cumulative distribution by integrating the integrand
pcomp = function(z) {
  d = 0.001
  xi = seq(abs(z - eta), z + eta, d)
  2*sum(integrand(xi, z))*d + 2*(pnorm(max(0, z - eta), 0, sig) - 0.5)
}
pcomp = Vectorize(pcomp)

### compute cumulative distribution and differentiate for density
dd = 0.05
zs = seq(0, 5, dd)
lines(zs[-1] - dd/2, diff(pcomp(zs))/dd, col = 2)

  • +1 Excellent answer! For whatever it's worth, when I use your formula for the cdf I can only come up with a single symbolic result, and that's when $y=\eta$: $\frac{4 \eta \, {}_2F_2\left(\frac{1}{2},1;\frac{3}{2},\frac{3}{2};-\frac{4 \eta ^2}{b^2}\right)}{\pi ^{3/2} b}$. – JimB May 29 '22 at 23:56
  • @JimB This function is, practically by definition, an elliptic integral. – whuber May 30 '22 at 12:56
  • Sextus: The reason I didn't go here is that the density likely is wrong. Because it's impossible for a distance to have a Gaussian distribution, I suspect the OP means $x$ is a polar coordinate for a bivariate Gaussian position. If so, $f$ needs to be multiplied by $x\mathcal{I}(x\gt 0)$ (and renormalized). Here is a quick simulation of this interpretation: xy <- matrix(rnorm(n <- 1e4, c(eta <- 1, 0), b <- 1), 2); hist(r <- sqrt(xy[1,]^2 + xy[2,]^2)). (I see you are asking the OP related questions...) – whuber May 30 '22 at 13:04
  • @Whuber, technically the density is not wrong. It is not the distance which is Gaussian distributed, but it is the variable $x$. It is just that this variable can be interpreted as a distance. – Sextus Empiricus May 30 '22 at 13:30
  • But I have the same idea that the underlying problem might relate to a bivariate normal distribution and not a univariate normal distribution. I am puzzled by this problem where the radius is normally distributed; I looked for a distribution of the Cartesian coordinates that relates to it and got something weird as a distribution for the coordinates, like $f(x) \propto e^{-x^2/4} K_0(x^2/4)$ (I don't remember exactly; I puzzled it out on my phone and the notes got lost). – Sextus Empiricus May 30 '22 at 13:30
  • @SextusEmpiricus You are right, the underlying problem is a bivariate normal distribution (see comment above). I am a bit puzzled why your answer does not produce that though (i.e. a Rice distribution), since it seems the assumptions of your question would need to lead there, no? – wa4557 May 31 '22 at 11:47
  • I figured out where my question is wrong: the distribution in $x$ should read $f(x)=x\frac{1}{b\sqrt{\pi}}\exp(-x^2/b^2)$; that would model my problem better, and it is the reason why your excellent answer doesn't fully explain what I am after. I will accept your answer, though, because it does answer the problem as asked! – wa4557 May 31 '22 at 12:25