
In statistics, we often assume that a particular variable follows a certain distribution. For example, if we know $Y \in \{0, 1\}$, then we can assume $Y \sim \text{Bernoulli}(p)$, since the Bernoulli is really the only way to model a binary random variable. This assumption lets us rely on tools such as logistic regression, which makes specific parametric assumptions about the distribution of $Y$.

However, relationships between the type of data and the random variable are not always so clear-cut. If we know that $Y$ represents a count of something, then there are multiple options (Binomial, Negative Binomial, Poisson, etc.), depending on how that count is generated. Likewise, other distributions purport to model specific types of data, such as the Beta for proportions and the Gamma for time-to-event data.

Hence, we can infer logical statements like "If $Y \sim \text{Beta}$, then $Y$ is a proportion." But what about the converse? Is it true that "If $Y$ is a proportion, then $Y \sim \text{Beta}$"? When can we say "If $Y$ is a _______, then $Y \sim f_Y(y)$"? Intuitively, it seems possible for the Bernoulli ("If $Y \in \{0, 1\}$, then $Y \sim \text{Bernoulli}(p)$"), but I have no proof of this. Do any other distributions possess this property? And how would you prove something like this?

EDIT: To clarify, I suppose a more specific question is: when does the support of a random variable imply its distribution (without necessarily restricting the number of parameters)?

  • Welcome to CV, Sal. If by "this property" you mean that the set of possible probability distributions can be parameterized by a single parameter, the answer is a flat no, because as soon as a set has more than two elements, the family of all possible probability distributions has at least two parameters. But since that's a trivial mathematical observation, it seems likely you mean something else by "this property." Could you please explain or clarify it? – whuber Jun 11 '23 at 17:09

1 Answer


> Hence, we can infer logical statements like "If $Y \sim \text{Beta}$, then $Y$ is a proportion." But what about the converse? Is it true that "If $Y$ is a proportion, then $Y \sim \text{Beta}$"? When can we say "If $Y$ is a _______, then $Y \sim f_Y(y)$"? Intuitively, it seems possible for the Bernoulli ("If $Y \in \{0, 1\}$, then $Y \sim \text{Bernoulli}(p)$"), but I have no proof of this. Do any other distributions possess this property? And how would you prove something like this?

It depends on what you mean by this. For example, by definition, if $Y = \sum_{i=1}^n X_i$ where the $X_i$ are independent variables each following a Bernoulli distribution with probability $p$, then $Y$ follows a binomial distribution with parameters $n$ and $p$. If $X_1$ and $X_2$ follow Poisson distributions, then $X_1 - X_2$ follows a Skellam distribution. By the central limit theorem, sample means of many independent draws are approximately Gaussian. Below you can see a diagram showing the relations between different probability distributions, from here, and another one here.

[Figure: a diagram showing mathematical relations between different probability distributions.]
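These constructions are also easy to check by simulation. Here is a minimal sketch (not part of the original answer) using numpy and scipy; the sample sizes and parameter values are arbitrary choices for illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
reps = 100_000

# Sum of n iid Bernoulli(p) draws follows Binomial(n, p).
n, p = 20, 0.3
bern_sums = rng.binomial(1, p, size=(reps, n)).sum(axis=1)
print(bern_sums.mean(), n * p)           # empirical mean vs. n*p = 6.0
print(bern_sums.var(), n * p * (1 - p))  # empirical variance vs. 4.2

# Difference of two independent Poissons follows a Skellam distribution.
mu1, mu2 = 4.0, 2.5
diff = rng.poisson(mu1, reps) - rng.poisson(mu2, reps)
print(diff.mean(), mu1 - mu2)  # Skellam mean:     mu1 - mu2
print(diff.var(), mu1 + mu2)   # Skellam variance: mu1 + mu2

# CLT: standardized means of a skewed distribution are close to N(0, 1).
means = rng.exponential(1.0, size=(reps, 50)).mean(axis=1)
z = (means - 1.0) * np.sqrt(50)           # Exp(1) has mean 1 and sd 1
print(stats.kstest(z, "norm").statistic)  # small value => near-Gaussian
```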

So if you know how different probability distributions are constructed, you know what leads to them. However, the relations are not always that simple. For example, "if $X$ is a count, then ..." does not have a single answer, as there are many distributions for counts, as you already noticed; the sketch below illustrates this.
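For instance, a Poisson and a negative binomial can share the same mean yet have very different spread, so "a count" alone does not identify the model. A small sketch (again just an illustration, with arbitrary parameter values):

```python
import numpy as np

rng = np.random.default_rng(1)
reps = 100_000

# Two count models with the same mean of 5 but different variances.
mean = 5.0
poisson_draws = rng.poisson(mean, reps)

# numpy's negative_binomial(n, q) has mean n*(1-q)/q; pick q so the
# mean matches the Poisson above.
n_param = 2
q = n_param / (n_param + mean)
negbin_draws = rng.negative_binomial(n_param, q, reps)

print(poisson_draws.mean(), negbin_draws.mean())  # both ~5
print(poisson_draws.var(), negbin_draws.var())    # ~5 vs ~17.5 (overdispersed)
```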

Also keep in mind that distributions are just mathematical models: no "Gaussian distribution" (or any other) exists in nature. They are simply useful approximations of what we observe.
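As for the specific Bernoulli case you raised, that is one setting where the support alone does determine the family, and the argument is short. Any distribution on $\{0, 1\}$ is specified entirely by $p := \Pr(Y = 1)$, since the only other probability is forced:

$$
\Pr(Y = 1) = p, \qquad \Pr(Y = 0) = 1 - p \quad \Longrightarrow \quad Y \sim \text{Bernoulli}(p).
$$

With three or more support points this breaks down: as whuber notes in the comments, the family of all distributions on such a set needs at least two parameters, so the support no longer implies a one-parameter family.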

Tim