I'm trying to create artificial correlated data using the genCorData function, in the simstudy package. I'm running the following in R:

set.seed(1234)
n=50
p=200
X=genCorData(n=n,mu=rep(0,p),sigma=rep(4,p),rho=0)
X=X[,-1]

beta = rnorm(p)
y <- as.matrix(X)%*%beta + rnorm(n) # calculate response

intercept <- mean(y)
y <- y - intercept

Why do I get exactly the same X, and therefore the same y, when I change the rho parameter? I've tried 0, 0.3, and 0.9, and they all lead to the same output. Why is this, and how do I generate correlated data?

user19904

1 Answer

There are two issues with the code. First, rho only has an effect if you also specify a correlation structure (the corstr argument) along with a nonzero rho; from the documentation,

"... if a [correlation] matrix is not provided, then a structure and correlation coefficient rho must be specified."

Second, X as returned by genCorData is a data.table (a kind of list), not a matrix, so you should convert it to a matrix before deleting the first column.

The rewritten code looks like:

set.seed(1234)
n=50
p=200
X=genCorData(n=n,mu=rep(0,p),sigma=rep(4,p), rho=0.0)
X = as.matrix(X)[,-1]

beta = rnorm(p)
y <- X%*%beta + rnorm(n) # calculate response

with output:

> head(y)
          [,1]
[1,] -41.69473
[2,] -21.92404
[3,]  22.59085
[4,] -71.62128
[5,] -22.43739
[6,] -10.92288

and with nonzero correlation:

> set.seed(1234)
> n=50
> p=200
> X=genCorData(n=n,mu=rep(0,p),sigma=rep(4,p), rho=0.4, corstr="cs")
> X = as.matrix(X)[,-1]
> 
> beta = rnorm(p)
> y <- X%*%beta + rnorm(n) # calculate response
> head(y)
          [,1]
[1,] -65.01533
[2,] -41.70743
[3,]  42.23017
[4,] -36.72938
[5,] -38.53522
[6,] -12.85205
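
To see what a compound-symmetric structure such as corstr="cs" is asking for, here is a minimal base-R sketch that builds such a correlation matrix by hand and draws from it with a Cholesky factor. It is only an illustration of the idea, not simstudy's internal implementation; the names R, Sigma, Z, X_cs, and mean_offdiag are made up for this example, and n and p are changed from the question (assumed values) so that the sample correlations are stable.

```r
# Sketch: draw n observations of p variables with compound-symmetric
# correlation, using base R only (NOT simstudy's internal code).
set.seed(1234)
n <- 5000    # assumption: larger than the question's n = 50 for stable estimates
p <- 5       # assumption: small p to keep the example quick
sigma <- 4   # common standard deviation, as in the question
rho <- 0.4   # common pairwise correlation

# Compound symmetry: 1 on the diagonal, rho everywhere else
R <- matrix(rho, nrow = p, ncol = p)
diag(R) <- 1

# Covariance matrix when every variable has standard deviation sigma
Sigma <- sigma^2 * R

# chol() returns upper-triangular U with t(U) %*% U = Sigma, so if the rows
# of Z are independent N(0, I) draws, the rows of Z %*% U have covariance Sigma
Z <- matrix(rnorm(n * p), nrow = n, ncol = p)
X_cs <- Z %*% chol(Sigma)

# Average off-diagonal sample correlation; it should be close to rho
emp <- cor(X_cs)
mean_offdiag <- mean(emp[upper.tri(emp)])
```

Running the same check, cor(X) on the matrix returned by genCorData, is a quick way to confirm that a given rho and corstr actually took effect.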
jbowman