Questions:
- Can we talk about:
  - variance of a deterministic variable?
  - covariance between a deterministic variable and a stochastic variable?
  - covariance between two deterministic variables?
- Are these concepts well defined in sample? In population?
Motivation
Take a simple regression
$$y = \beta_0 + \beta_1 x + \varepsilon.$$
Suppose the regressor $x$ is stochastic. The OLS estimate of $\beta_1$ will be
$$\hat{\beta}_1=\frac{\widehat{\text{Cov}}(x,y)}{\widehat{\text{Var}}(x)}$$
where hats denote sample counterparts of the population concepts. No problem here.
Now suppose $x$ is deterministic. I am not sure whether terms like variance and covariance apply in this context. Should I replace $\hat{\beta}_1=\frac{\widehat{\text{Cov}}(x,y)}{\widehat{\text{Var}}(x)}$ with something like
$$\hat{\beta}_1=\frac{\frac{1}{n-1}\sum(x_i-\bar{x})(y_i-\bar{y})}{\frac{1}{n-1}\sum(x_i-\bar{x})^2}$$
to be correct? But then again, how meaningful is $\bar{x}$ when $x$ is deterministic? So should I go all the way to
$$\hat{\beta}_1=\frac{\frac{1}{n-1}\sum_{i=1}^n\left(x_i-\frac{1}{n}\sum_{j=1}^n x_j\right)\left(y_i-\frac{1}{n}\sum_{j=1}^n y_j\right)}{\frac{1}{n-1}\sum_{i=1}^n\left(x_i-\frac{1}{n}\sum_{j=1}^n x_j\right)^2}?$$
I am nitpicking here, and this may not be too important; my main questions are listed at the top of the post.
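For concreteness, here is a minimal sketch of the arithmetic behind the formulas above for a fixed, non-random regressor. It is written in Python with only the standard library; the function names, the design $x = 1, \dots, n$, and the values of $\beta_0$, $\beta_1$ and the noise scale are all hypothetical choices made for illustration. It only shows that the "sample covariance over sample variance" expression and the slope obtained from the least-squares normal equations are the same number. It does not settle whether the population concepts are well defined.

```python
import random


def mean(values):
    """Arithmetic mean of a sequence of numbers."""
    return sum(values) / len(values)


def sample_cov(a, b):
    """Sum of cross-products of deviations, divided by n - 1."""
    a_bar, b_bar = mean(a), mean(b)
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b)) / (len(a) - 1)


def slope_from_moments(x, y):
    """Slope written as sample covariance over sample variance."""
    return sample_cov(x, y) / sample_cov(x, x)


def slope_from_normal_equations(x, y):
    """Same slope from the least-squares normal equations (raw sums only)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_xx = sum(xi * xi for xi in x)
    return (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)


random.seed(0)                      # fixed seed so the run is reproducible
n = 100
beta0, beta1 = 2.0, 0.5             # hypothetical true coefficients
x = list(range(1, n + 1))           # deterministic regressor: 1, 2, ..., n
y = [beta0 + beta1 * xi + random.gauss(0.0, 1.0) for xi in x]

slope_moments = slope_from_moments(x, y)
slope_normal = slope_from_normal_equations(x, y)
spread_of_x = sample_cov(x, x)      # a fixed number, here n * (n + 1) / 12

print(slope_moments, slope_normal, spread_of_x)
```

The quantity `spread_of_x` is computed by exactly the same sum whether $x$ is regarded as random or fixed. In the fixed case it is simply a descriptive summary of the design points rather than an estimate of a population parameter.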
[…] `R` simulation `n <- 100; cor(rnorm(n) + 1:n, rnorm(n) + 1:n)` should be close to $0$, but it is actually close to $1$. The latter does make sense to me as the two sequences are identical up to a small noise component. Am I getting something wrong here? – statmerkur Jan 20 '22 at 15:09

[…] `R` code calculates the correlation in one particular realization of this sequence. – statmerkur Jan 21 '22 at 15:53

[…] `R` code treats `1:100` as the equivalent of its empirical distribution. You can confirm this by computing the covariance directly or using bilinearity: `x <- rnorm(n); y <- rnorm(n); z <- 1:n; (c(Covariance=cov(z+x, z+y), Sum=cov(z, z) + cov(z, y) + cov(z, x) + cov(x, y)))` – whuber Jan 21 '22 at 16:00