7

I have a set of points $P_i$ which are described by an angle $\theta_i$ and a magnitude $r_i$.

$\theta_i$ follows a Uniform distribution $(\theta_i \sim U(0, 2\pi))$ and $r_i$ follows a chi-k distribution $r_i \sim \chi_k$.

k=8

Is there any way of describing this distribution using a multivariate distribution in cartesian coordinates?

If I use a bivariate gaussian, I get a circular distribution. How can I take into account the hole in the middle?

Liam F-A
  • 73
  • 4
  • I would rather call this an "annulus distribution". Searching for that term gives you links like this: https://mathematica.stackexchange.com/q/113988/73813 – Stephan Kolassa Mar 27 '23 at 18:27
  • 1
    What's wrong with what you described above? It's a bivariate distribution on the radius and the angle; just because the two are independent doesn't mean it isn't a bivariate distribution. – jbowman Mar 27 '23 at 18:40
  • @jbowman I agree that it could be a bivariate distribution, but I don't want to leave it in terms of radius and angle. I have further maths I need to do on the variances, and I won't be able to treat the angular variance the same way as the radius variance. – Liam F-A Mar 27 '23 at 18:46
  • https://stats.stackexchange.com/questions/523338/convergence-of-uniformly-distributed-random-variables-on-a-sphere/603767#603767 – Zhanxiong Mar 27 '23 at 18:48
  • This is a multivariate distribution. Are you perhaps wanting to describe it using different coordinates, such as Cartesian coordinates? – whuber Mar 27 '23 at 21:01
  • 1
    @whuber Yes, thanks. What I meant to say was using Cartesian coordinates – Liam F-A Mar 27 '23 at 23:26
  • 1
    A similar question: https://stats.stackexchange.com/questions/604765/maximum-likelihood-estimate-for-mixture-with-components-using-both-cartesian-and/604769#604769 – Zhanxiong Mar 28 '23 at 15:15

1 Answers1

4

Converting a 2D density $f_{r,\theta}$ from polar coordinates $(r,\theta)$ to Cartesian coordinates $(x,y)$ where $r(x,y)^2 = x^2+y^2$ and $\tan{\theta(x,y)}=y/x$ turns out to be simple, because the reference (Lebesgue) measure merely changes from $r\,\mathrm dr\,\mathrm d\theta$ to $\mathrm dx\,\mathrm dy.$ Therefore the probability element can be re-expressed as

$$f_{r,\theta}(r,\theta)\mathrm dr\,\mathrm d\theta = \frac{f_{r,\theta}(r,\theta)}{r} r\,\mathrm dr\,\mathrm d\theta = \frac{f_{r,\theta}(r(x,y),\theta(x,y))}{r(x,y)}\mathrm dx\,\mathrm dy.\tag{*}$$

This formula says to

  1. Divide the density by $r$ and then

  2. Express $r$ and $\theta$ in Cartesian coordinates.

The function f below implements this formula for general radial densities $f_{r,\theta}.$

In the question the density is the product of the chi-squared$(k)$ density for $r$ and the Uniform$(0,2\pi)$ density for $\theta,$ giving

$$f_{r,\theta}(r,\theta) = \frac{1}{2^{k/2}\Gamma(k/2)} r^{k-1}e^{-r/2}\times\frac{1}{2\pi}$$

and plugging that into $(*)$ produces

$$f_{x,y}(x,y) = \frac{1}{\pi\,2^{k/2+1}\,\Gamma(k/2)} r^{k-2}e^{-r/2} = \frac{ \left(x^2+y^2\right)^{k/2-1}e^{-\sqrt{x^2+y^2}/2}}{\pi\,2^{k/2+1}\,\Gamma(k/2)}.$$


As an example (with $k=20$), I used R to generate 100,000 values of $(r,\theta),$ converted those to $(x,y),$ and computed a kernel density estimate.

library(MASS)
# Generate (r, theta).
k <- 20
n <- 1e5
r <- rchisq(n, k)
theta <- runif(n, 0, 2*pi)
# Convert to (x, y).
x <- r * cos(theta)
y <- r * sin(theta)
# Compute and plot a density (omitting some outliers).
q <- qchisq(sqrt(0.99), k)
den <- kde2d(x, y, lims = q * c(-1, 1, -1, 1), n = 50)
image(den, asp = 1, bty = "n")

Here's an image showing your "hole:"

enter image description here

This R function performs steps (1) and (2) with an arbitrary density function df for the radial density (and a uniform density for the angle):

f <- function(x, y, df, ...) {
  r <- outer(x, y, \(x,y) sqrt(x^2 + y^2)) # Step 2: r(x,y)
  df(r, ...) / r * 1 / (2 * pi)            # Step 1: Formula (*)
}

Its inputs are vectors x and y. Its output is a matrix of densities at all ordered pairs from x and y.

Using this function, let's compare the empirical density (shown in the image) to the calculated density using $(*)$ (implemented as dc):

z.hat <- with(den, f(x, y, dc, k = k))
A <- with(den, mean(diff(x)) * mean(diff(y)))
with(den, plot(sqrt(z.hat), sqrt(z), col = gray(0, .25),
               ylab = "Root Empirical Density",
               xlab = "Root Calculated Density"))
abline(0:1, col = "Red", lwd = 2)

enter image description here

The agreement is excellent. (I use root scales to achieve a heteroscedastic response, making the variation around the 1:1 reference line the same at all locations.)

whuber
  • 322,774
  • Amazing answer ! Thank you. – Liam F-A Mar 28 '23 at 15:03
  • Do you think there are any ways of computing the marginal distributions for X and Y ? – Liam F-A Mar 28 '23 at 21:19
  • I would compute them numerically because the formula is likely to be complicated. Think about it: for $x$ and $y$ near $0,$ the distribution is bimodal, but for large $x$ or $y$ it becomes unimodal. The polar coordinate formulation shows you need to integrate an incomplete gamma function to obtain the marginal CDF or PDF. – whuber Mar 28 '23 at 21:26