I'm coding some Bayesian bandit algorithms for exponential families. For the case where my rewards are normally distributed, I need to use an improper uniform prior. Is there any way to represent this in R? I guess I could use runif(1,0,n) with some arbitrarily large $n$, but that still only draws from Unif[0,n], so it wouldn't be ideal.
Here's my current code. Once an arm has been pulled at least once (think of a slot machine), I can sample from its normal posterior. I'm just trying to see how I can formulate a draw from the prior when it's not a probability distribution.
    select_arm = function() {
      sampled_theta <- NULL
      # First we test whether the arm has been pulled. If it hasn't, we draw
      # from a wide uniform standing in for the improper uniform prior.
      # Otherwise, we draw from the normal posterior.
      for (i in seq_len(self$num_arms)) {
        if (private$trials[i] == 0) {
          sampled_theta <- c(sampled_theta,
                             runif(1, min = -10^4, max = 10^4))
        } else {
          # Posterior mean is the sample mean; posterior variance is variance / n
          a <- private$scores[i] / private$trials[i]
          b <- self$variance[i] / private$trials[i]
          sampled_theta <- c(sampled_theta, rnorm(1, a, sqrt(b)))
        }
      }
    }
runif(n) gives you $n$ independent $\mathrm{U}[0,1]$. This is not what you want. You probably want something like runif(n, min = -10^4, max = 10^4). – Zen Dec 01 '14 at 19:10
For example, there is a case where we have rewards for each arm $i$ distributed $\mathrm{Bernoulli}(p_i)$. However, we first need starting assumptions on $p$, so often we just assume $p_i \sim \mathrm{Beta}(1,1)$ for every $i$. Then after the $i$th arm is pulled $n$ times with results $x_1,\ldots,x_n$, we draw $p_i$ from the posterior $p_i \mid x_1,\ldots,x_n \sim \mathrm{Beta}(s_n+1,\, n-s_n+1)$, where $s_n$ is the total reward in $n$ trials.
– Kashif Dec 02 '14 at 02:16
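To make the Bernoulli case in the comment above concrete, here is a minimal sketch of the posterior draw in R. The function name sample_bernoulli_arms and the vectors successes and trials are hypothetical names chosen for illustration; they are not part of the code in the question.

```r
# Hypothetical helper: one posterior draw of p_i per Bernoulli arm.
# successes[i] plays the role of s_n and trials[i] the role of n for arm i.
sample_bernoulli_arms <- function(successes, trials) {
  stopifnot(length(successes) == length(trials),
            all(successes >= 0), all(successes <= trials))
  # Beta(1, 1) prior => Beta(s_n + 1, n - s_n + 1) posterior.
  # An arm with n = 0 gets Beta(1, 1), i.e. a draw from the prior itself.
  rbeta(length(trials), successes + 1, trials - successes + 1)
}

set.seed(1)
successes <- c(0, 3, 7)   # illustrative counts, not real data
trials    <- c(0, 10, 10) # the first arm has never been pulled
theta <- sample_bernoulli_arms(successes, trials)
chosen_arm <- which.max(theta)  # play the arm with the largest sampled p_i
```

Because Beta(1,1) is a proper distribution, an arm that has never been pulled needs no special case here, which is exactly what is missing in the normal case with a flat prior.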