
I was implementing a simple sampler for the Bernoulli distribution, $X \sim B(p)$. I have a function that generates a uniform random number $r \in (0,1)$. I then set $X = 1$ if $p > r$, and $X = 0$ otherwise. Is this correct?

Tim
  • 138,066
avi
  • 409

1 Answer


If $X$ is a Bernoulli random variable then $E[X]=p$ and $V[X]=p(1-p)$. For example:

> x <- rbinom(1000,1,0.3)
> mean(x)
[1] 0.302
> var(x)
[1] 0.211007
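
The rule described in the question is consistent with these moments. A short derivation, writing $U$ for the uniform draw (called $r$ in the question) and assuming it is continuous on $(0,1)$:

$$\begin{align} P(X=1) &= P(U < p) = \int_0^p 1\,du = p, \\ P(X=0) &= 1 - p, \\ E[X] &= 1\cdot p + 0\cdot(1-p) = p, \\ V[X] &= E[X^2] - E[X]^2 = p - p^2 = p(1-p). \end{align}$$

Since $P(U = p) = 0$ for a continuous uniform, comparing with $<$ or with $\le$ gives the same distribution.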

The most basic way to generate a binomial variate is to count successes over $n$ uniform draws (Kachitvichyanukul and Schmeiser): $$\begin{align} 1.&\quad x \leftarrow 0,\ k \leftarrow 0 \\ 2.&\quad \text{Repeat} \\ &\quad\quad \text{Generate } u\sim\mathcal{U}(0,1),\ k\leftarrow k + 1 \\ &\quad\quad \text{if }u\le p\text{ then } x \leftarrow x + 1\\ &\quad\text{Until }k = n \\ 3.&\quad \text{Return } x\end{align}$$ This algorithm generates $x$ successes out of $n$ trials, but it can be slightly modified to return the individual 0/1 outcomes instead, which form a sample from the Bernoulli distribution. In R:

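> rbernoulli <- function(n, p) {
+     x <- integer(n)              # preallocate instead of growing the vector
+     for (i in seq_len(n)) {      # seq_len() also handles n = 0 correctly
+         u <- runif(1, 0, 1)      # one uniform draw per trial
+         if (u <= p)
+             x[i] <- 1L           # success
+         # otherwise x[i] keeps its initial value 0 (failure)
+     }
+     return(x)
+ }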
> x <- rbernoulli(1000, 0.3)
> mean(x)
[1] 0.314
> var(x)
[1] 0.2156196
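
The same sampler can also be written without the loop. The following is only a sketch: the name `rbernoulli_vec`, the seed, and the sample size are arbitrary choices made for illustration. It also checks the two things that matter for correctness, namely that every draw is 0 or 1 and that the proportion of ones is compatible with $p$ (here via the exact test in `binom.test`).

```r
# Vectorized Bernoulli sampler: one uniform draw per trial, thresholded at p.
# `rbernoulli_vec` is a hypothetical name used only for this illustration.
rbernoulli_vec <- function(n, p) {
  stopifnot(n >= 0, p >= 0, p <= 1)
  as.integer(runif(n) < p)   # TRUE/FALSE converted to 1/0
}

set.seed(42)                 # arbitrary seed, only to make the run reproducible
n <- 10000
p <- 0.3
x <- rbernoulli_vec(n, p)

# (a) every draw is either 0 or 1
all_binary <- all(x %in% c(0L, 1L))

# (b) the proportion of ones is compatible with p (exact binomial test)
check <- binom.test(sum(x), n, p)

print(all_binary)
print(mean(x))
print(check$p.value)
```

A large p-value in (b) means the observed proportion gives no evidence against $p$; it does not by itself prove the draws are independent.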
Sergio
  • 5,951
  • Some comments: (1) rbernoulli can be written more succinctly (and far more efficiently) as rbernoulli <- function(n,p) runif(n) < p. Many people would find this much clearer than the pseudocode, too. (2) Checking the variance is redundant. A thorough verification of accuracy (assuming the values truly are independent) requires only (a) demonstrating that all the results are zeros or ones and (b) the proportion of ones is not significantly different from $p$. – whuber May 29 '14 at 18:55
  • @whuber, would it be possible for you to help with this query? https://stats.stackexchange.com/questions/443370/how-to-generate-5-columns-random-data-with-a-specified-correlation-coefficient – User20100 Jan 05 '20 at 19:43