Is there a way to "check" if your MCMC is "correct"?
Suppose I have the following irregular probability distribution (a mixture of two normal distributions): P(x,y,z) = **0.3 * N1 + 0.7 * N2
N1 (x,y,z) ~ Normal with mu = (1,1,1) and sigma(1,0,0,0,1,0,0,0,1)
N2 (x,y,z) ~ Normal with mu = (1,0,1) and sigma(1,0,0,0,1,0,0,0,1)**
I am interested in generating Random Samples this mixture distribution when x1 = 2 (conditional distribution). I wrote the following R code for the Metropolis-Hastings Algorithm to take these random samples.
First, I defined a function that represents the conditional probability distribution I am interested in:
#define constants needed for the multivariate normal
sigma1 <- c(1,0,0,0,1,0,0,0,1)
sigma <- matrix(sigma1, nrow=3, ncol= 3, byrow = TRUE)
sigma_inv <- solve(sigma)
sigma_det <- det(sigma)
denom = sqrt( (2*pi)^4 * sigma_det)
#mixture model
target <- function(y,z)
{
x_one = 2 - 1
x_two = y - 1
x_three = z - 1
x_t = c(x_one, x_two, x_three)
x_t_one <- matrix(x_t, nrow=3, ncol= 1, byrow = TRUE)
x_t_two = matrix(x_t, nrow=1, ncol= 3, byrow = TRUE)
num = exp(-0.5 * x_t_two %*% sigma_inv %*% x_t_one)
answer_1 = num/denom
x_one2 = 2 - 1
x_two2 = y - 1
x_three2 = z - 1
x_t2 = c(x_one2, x_two2, x_three2)
x_t_one2 <- matrix(x_t2, nrow=3, ncol= 1, byrow = TRUE)
x_t_two2 = matrix(x_t2, nrow=1, ncol= 3, byrow = TRUE)
# In this part, as it's (x-mu)^T * SIGMA * (x-mu)
num2 = exp(-0.5 * x_t_two2 %*% sigma_inv %*% x_t_one2)
answer_2 = num2/denom
return(0.3*answer_1 + 0.7*answer_2)
}
Then, I took MCMC samples from this function using the Metropolis-Hastings Algorithm:
library(mvtnorm)
x = rep(0,10000)
y = rep(0,10000)
x[1] = 3 #initialize; I've set arbitrarily set this to 3 and 1
y[1] =1
for(i in 2:10000){
current_x = x[i-1]
current_y = y[i-1]
new = rmvnorm(n=1, mean=c(current_x,current_y), sigma=diag(2), method="chol") # generate bivariate random numbers
proposed_x = new[1]
proposed_y = new[2]
A = target(proposed_x,proposed_y)/target(current_x,current_y)
if(runif(1)<A){
x[i] = proposed_x # accept move with probabily min(1,A)
y[i] = proposed_y
} else {
x[i] = current_x # otherwise "reject" move, and stay where we are
y[i] = current_y
}
}
The samples from this conditional distribution can be seen below:
head(data.frame(x,y))
x y
1 3.0000000 1.0000000
2 1.7251393 0.2930731
3 0.7847516 0.3534722
4 0.4495596 1.4447259
5 0.4654599 1.2970119
6 0.4654599 1.2970119
Question: The above R code ran and produced what appears to be random samples from the conditional distributions. When inspecting these random samples, there does not appear to be irregular values : if the random samples included extremely large numbers (e.g. 43321), this number would be extremely unlikely to have come from a probability distribution having a mean vectors and a covariance matrix with elements far smaller than such extreme numbers. But apart from checking that the code actually ran and the samples are within a logical range given the model parameters - is there any way to "check" if the MCMC random samples are in fact from this conditional distribution? (I am also not sure if there is a straightforward way to check if the random samples have passed the "burn in stage" and converged : I have heard that if you generate a large number of samples, they will likely converge - but this of course is not an exact way).
Can someone please tell me how to do this?
Thanks!
(2*pi)^4should be(2*pi)^3– Xi'an Nov 09 '21 at 20:51targetis constructed. The means should differ between the two components. – Xi'an Nov 09 '21 at 20:55