Very belated follow-up to a previous question:
I have some fairly simple linear models predicting a rate (a continuous response variable) from features of the distribution of some measured value; the distribution features are the predictor/explanatory variables. The theory is that a combination of features of each day's distribution (e.g. mean, s.d., skewness, possibly quantiles) relates to the relevant rate / direction on that day, so e.g. rate ~ mean_x + skew_x.
The measured predictor values come from a sample taken every day, but the sample sizes vary widely from day to day, from dozens to thousands (total sample size ~300k, average ~400/day).
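For concreteness, the daily features are built roughly like this (a sketch only; it assumes the raw observations sit in a data frame called raw with columns day and x, which aren't the real names):

```r
library(dplyr)

# One row per day: distribution features of x plus the day's sample size n,
# which is all the information I have about measurement precision.
daily <- raw %>%
  group_by(day) %>%
  summarise(
    n      = n(),
    mean_x = mean(x),
    sd_x   = sd(x),
    skew_x = mean((x - mean(x))^3) / sd(x)^3   # simple moment-based skewness
  )
```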
Conceptually, I understand that if there is measurement error in both the predictor and response variables, then some kind of errors-in-variables (EIV) regression may be in order. But that's about the extent of my understanding. From what I've seen, you normally just supply the measurement error for each predictor (and possibly the response) to the regression, one value per variable.
But I don't have a known or constant measurement error in the predictor variables; all I have is the sample size for each day (which relates to the standard error of the measured mean, skewness, etc.). Some data points are therefore expected to have much smaller or much larger measurement errors than others. Similarly, I have no information on measurement error in the response variable, only a sample size for each day.
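What I can do is translate each day's n into an approximate standard error for each measured feature, continuing the hypothetical daily summary above (the skewness SE here is the large-sample normal-theory approximation sqrt(6/n), which may not hold for my data):

```r
# Approximate per-day standard errors implied by the sample sizes.
daily <- daily %>%
  mutate(
    se_mean_x = sd_x / sqrt(n),   # SE of the daily sample mean
    se_skew_x = sqrt(6 / n)       # rough normal-theory SE of sample skewness
  )
```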
Is it feasible to use the information I have on sample sizes to inform an EIV regression? If so, how? (I'm working in R.)
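The only concrete thing I've been able to imagine is a Bayesian measurement-error model, e.g. via brms, where the me() term takes a per-row standard deviation, so each day's precision could reflect its sample size. This is just a sketch under the assumptions above (column names are hypothetical, and se_rate would be a per-day SE for the response if I can construct one); I don't know whether it's statistically appropriate here:

```r
library(brms)

# Measurement-error model: per-day SEs on the predictors via me(),
# and (optionally) on the response via mi().
fit <- brm(
  rate | mi(se_rate) ~ me(mean_x, se_mean_x) + me(skew_x, se_skew_x),
  data   = daily,
  family = gaussian()
)
```

If the response error can't be quantified, the mi() part could presumably be dropped so that only the predictor errors are modelled.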
…tau_x with each i, similar to how the mean true_x[i] already does). – Durden Feb 28 '24 at 15:49

Since the true covariate here is a population mean, and the measured quantity is a sample mean, would it be possible to use the (inverse of) sample size as a proxy for variance? (Note that for theoretical reasons the underlying distribution of values from which we're observing the mean is expected to change across data points.) – TY Lim Feb 28 '24 at 17:53