1

Is there a way to "check" if your MCMC is "correct"?

Suppose I have the following irregular probability distribution (a mixture of two normal distributions): P(x,y,z) = **0.3 * N1 + 0.7 * N2

N1 (x,y,z) ~ Normal with mu = (1,1,1) and sigma(1,0,0,0,1,0,0,0,1)

N2 (x,y,z) ~ Normal with mu = (1,0,1) and sigma(1,0,0,0,1,0,0,0,1)**

I am interested in generating Random Samples this mixture distribution when x1 = 2 (conditional distribution). I wrote the following R code for the Metropolis-Hastings Algorithm to take these random samples.

First, I defined a function that represents the conditional probability distribution I am interested in:

#define constants needed for the multivariate normal
    sigma1 <- c(1,0,0,0,1,0,0,0,1)
      sigma <- matrix(sigma1, nrow=3, ncol= 3, byrow = TRUE)
      sigma_inv <- solve(sigma)
      sigma_det <- det(sigma)
      denom = sqrt( (2*pi)^4 * sigma_det)

#mixture model target <- function(y,z)

{
  x_one = 2 - 1
  x_two = y - 1
  x_three = z - 1


  x_t = c(x_one, x_two, x_three)
  x_t_one &lt;- matrix(x_t, nrow=3, ncol= 1, byrow = TRUE)
  x_t_two =  matrix(x_t, nrow=1, ncol= 3, byrow = TRUE)



  num = exp(-0.5 * x_t_two  %*%  sigma_inv  %*%  x_t_one)

  answer_1 = num/denom


  x_one2 = 2 - 1
  x_two2 = y - 1
  x_three2 = z - 1


  x_t2 = c(x_one2, x_two2, x_three2)
  x_t_one2 &lt;- matrix(x_t2, nrow=3, ncol= 1, byrow = TRUE)
  x_t_two2 =  matrix(x_t2, nrow=1, ncol= 3, byrow = TRUE)


  # In this part, as it's (x-mu)^T * SIGMA * (x-mu)


  num2 = exp(-0.5 * x_t_two2  %*%  sigma_inv  %*%  x_t_one2)

  answer_2 = num2/denom
  return(0.3*answer_1 + 0.7*answer_2)
}

Then, I took MCMC samples from this function using the Metropolis-Hastings Algorithm:

library(mvtnorm)
x = rep(0,10000)
y = rep(0,10000)

x[1] = 3 #initialize; I've set arbitrarily set this to 3 and 1 y[1] =1

for(i in 2:10000){ current_x = x[i-1] current_y = y[i-1]

new = rmvnorm(n=1, mean=c(current_x,current_y), sigma=diag(2), method="chol") # generate bivariate random numbers proposed_x = new[1] proposed_y = new[2]

A = target(proposed_x,proposed_y)/target(current_x,current_y) if(runif(1)<A){ x[i] = proposed_x # accept move with probabily min(1,A) y[i] = proposed_y

} else {
x[i] = current_x        # otherwise &quot;reject&quot; move, and stay where we are
y[i] = current_y

}

}

The samples from this conditional distribution can be seen below:

head(data.frame(x,y))
      x         y

1 3.0000000 1.0000000 2 1.7251393 0.2930731 3 0.7847516 0.3534722 4 0.4495596 1.4447259 5 0.4654599 1.2970119 6 0.4654599 1.2970119

Question: The above R code ran and produced what appears to be random samples from the conditional distributions. When inspecting these random samples, there does not appear to be irregular values : if the random samples included extremely large numbers (e.g. 43321), this number would be extremely unlikely to have come from a probability distribution having a mean vectors and a covariance matrix with elements far smaller than such extreme numbers. But apart from checking that the code actually ran and the samples are within a logical range given the model parameters - is there any way to "check" if the MCMC random samples are in fact from this conditional distribution? (I am also not sure if there is a straightforward way to check if the random samples have passed the "burn in stage" and converged : I have heard that if you generate a large number of samples, they will likely converge - but this of course is not an exact way).

Can someone please tell me how to do this?

Thanks!

stats_noob
  • 1
  • 3
  • 32
  • 105

0 Answers0