I am writing a seminar on estimating a population ratio using the delta method and am having trouble with the literature review. I have written the introduction, but I need help with how to use the delta method to estimate the population ratio and with the methodology.
-
What ratio is this? A ratio of means? A population attributable fraction? – AdamO Jun 12 '18 at 15:53
-
@gung and others - this question is terribly worded I agree, but Lucas's answer is the only one on the site with the full derivation and explanation - this question deserves to be better worded and re-opened – Xavier Bourret Sicotte Feb 13 '20 at 03:02
-
@XavierBourretSicotte maybe a conversation I wasn't privy to but if my answer would be better suited on another page/question so that this one may be closed please let me know. I agree question is not clearly stated. My answer was written assuming the OP is asking for a ratio of means. – Lucas Roberts Feb 14 '20 at 20:16
-
Yes - this question (mine :) ) is a good place! https://stats.stackexchange.com/questions/398436/a-b-testing-ratio-of-sums . What happened with my question above is that I used a combination of the bootstrap and the delta method (using your exact formula below) to validate the distribution of the metric, which was indeed normal but had a different variance from the Binomial – Xavier Bourret Sicotte Feb 15 '20 at 01:17
1 Answer
The multivariate delta method has a heuristic justification here: https://en.wikipedia.org/wiki/Delta_method#Multivariate_delta_method. It applies to a function $f$ that takes a $p$-dimensional vector argument and maps it to a $k$-dimensional space. For a ratio estimator, $p=2$ and $k=1$. The function $f$ is
$$f\left(\begin{bmatrix} \bar{y} \\ \bar{x} \\ \end{bmatrix}\right) = \bar{y}/\bar{x}$$
A few more quantities are needed. The first is:
$$f(\vec{\mu})=f\left(\begin{bmatrix} \mu_{y} \\ \mu_{x} \\ \end{bmatrix}\right) = \mu_{y}/\mu_{x}$$
These are $h(B)$ and $h(\beta)$, respectively, in the notation of the Wikipedia link.
Next you need the vector of partial derivatives of $f$, evaluated at $\vec{\mu}$:
$$\nabla f(\vec{\mu})=\begin{bmatrix} \frac{1}{\mu_{x}} \\ \frac{-\mu_{y}}{\mu_{x}^2} \\ \end{bmatrix}$$
We also need the variance-covariance matrix of the vector
$$\begin{bmatrix} \bar{y} \\ \bar{x} \\ \end{bmatrix}$$
which, for $n$ paired i.i.d. observations $(y_i, x_i)$, is
$$\begin{bmatrix} \sigma^2_{y}/n & \sigma_{yx}/n \\ \sigma_{yx}/n & \sigma^2_{x}/n\\ \end{bmatrix}.$$
Note this variance-covariance matrix is the $\Sigma/n$ in the Wikipedia notation, so every entry, including the covariance, carries the factor $1/n$. For a proof that $\mathbb{C}ov(\bar{y},\bar{x}) =\mathbb{C}ov(x,y)/n$ see Estimating the covariance of the means from two samples? Now the only thing left is to calculate the quadratic form:
$$\nabla f(\vec{\mu})^T\begin{bmatrix} \sigma^2_{y}/n & \sigma_{yx}/n \\ \sigma_{yx}/n & \sigma^2_{x}/n\\ \end{bmatrix}\nabla f(\vec{\mu}) = \begin{bmatrix} \frac{1}{\mu_{x}} \\ \frac{-\mu_{y}}{\mu_{x}^2} \\ \end{bmatrix}^T \begin{bmatrix} \sigma^2_{y}/n & \sigma_{yx}/n \\ \sigma_{yx}/n & \sigma^2_{x}/n\\ \end{bmatrix} \begin{bmatrix} \frac{1}{\mu_{x}} \\ \frac{-\mu_{y}}{\mu_{x}^2} \\ \end{bmatrix}.$$
Working this out gives the equation:
$$\sigma^2_R=\frac{\sigma_y^2}{n\mu_x^2} - 2\frac{\mu_y\sigma_{xy}}{n\mu_x^3}+\frac{\sigma^2_x\mu_y^2}{n\mu_x^4},$$
where this quantity is the variance of the delta method's approximating normal distribution. Putting this all together gives, approximately for large $n$,
$$\left(\frac{\bar{y}}{\bar{x}}-\frac{\mu_y}{\mu_x}\right) \sim N(0,\sigma^2_R)$$
So you can estimate the ratio of the population means by the ratio of the sample means, provided you can estimate the variances and the covariance or, equivalently, the correlation $\rho = \frac{\sigma_{xy}}{\sigma_x\sigma_y}$, since by substitution $\sigma_{xy} = \sigma_x\sigma_y\rho$. This is how the delta method is most commonly used in deriving the distribution of the ratio estimator.
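As a rough illustration of the recipe above, the following Python sketch computes the ratio of sample means and a delta-method variance estimate as a quadratic form. It assumes paired samples of equal length and uses the sample covariance matrix divided by $n$ as the estimated covariance matrix of the two sample means. The function name `ratio_delta_method` and the simulated data are hypothetical and are not part of the original answer.

```python
import numpy as np


def ratio_delta_method(y, x):
    """Ratio of sample means with a delta-method variance estimate.

    Hypothetical helper: assumes y and x are paired observations
    of equal length n.
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    n = y.size

    mean_y, mean_x = y.mean(), x.mean()
    ratio = mean_y / mean_x

    # Sample covariance matrix of (y, x), scaled by 1/n to estimate the
    # covariance matrix of the two sample means.
    cov_means = np.cov(y, x, ddof=1) / n

    # Gradient of f(mu_y, mu_x) = mu_y / mu_x, evaluated at the sample means.
    grad = np.array([1.0 / mean_x, -mean_y / mean_x**2])

    # Quadratic form grad^T (Sigma / n) grad.
    var_ratio = grad @ cov_means @ grad
    return ratio, var_ratio


# Simulated data (an assumption for illustration): true ratio of means is 2.
rng = np.random.default_rng(0)
x = rng.normal(loc=10.0, scale=1.0, size=500)
y = 2.0 * x + rng.normal(loc=0.0, scale=1.0, size=500)

ratio, var_ratio = ratio_delta_method(y, x)
se = np.sqrt(var_ratio)
print(f"ratio = {ratio:.4f}")
print(f"approx. 95% CI = ({ratio - 1.96 * se:.4f}, {ratio + 1.96 * se:.4f})")
```

Writing the variance as a quadratic form in a covariance matrix, rather than as the expanded three-term formula, keeps the estimate non-negative by construction, up to floating-point rounding.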
-
I believe that the evaluation of the quadratic form should give a slightly different result.
$$\sigma^2_R=\frac{\sigma_y^2}{n\mu_x^2} - 2\frac{\sigma_{xy}\mu_y}{\mu_x^3}+\frac{\sigma^2_x\mu_y^2}{n\mu_x^4}$$
– Sextus Empiricus Mar 28 '19 at 19:19 -
@MartijnWeterings you are correct, that is a typo in going from the quadratic form to the equation for $\sigma^2_R$. Will fix that now, thanks for catching my error. – Lucas Roberts Mar 29 '19 at 00:52
-
@LucasRoberts do you need the $\sqrt{n}$ in front of the expression $\left(\frac{\bar{y}}{\bar{x}}-\frac{\mu_y}{\mu_x}\right) \sim N(0,\sigma^2_R)$? Isn't $1/n$ already built into $\sigma^2_R$? – Amazonian Oct 26 '19 at 20:05
-
@Amazonian No, that is a copy-paste error; I've removed it. Also, you can use $\LaTeX$ in comments just as in the regular posts if you'd like your math rendered more legibly. Please also keep in mind that this is an approximation, and if you decide to use it, your mileage may vary: e.g. if you do not have finite first or second moments for $X$ or $Y$, the approximation is likely not a good one. – Lucas Roberts Oct 26 '19 at 23:58
-
Should it be possible to get a negative number from this formulation? Not sure what I'm doing wrong.. this is what I have coded out in Python:
(var_y / (n * (mean_x ** 2))) - ((2 * mean_y * cov_xy) / (mean_x ** 3)) + ((var_x * (mean_y ** 2)) / (n * (mean_x ** 4)))– getup8 Sep 06 '22 at 05:00 -
@getup8 I'm not clear which part of the formulation you are referring to with getting a negative number. The Normal distribution does put non-zero probability mass on negative numbers so a negative value for $\frac{\bar{y}}{\bar{x}} - \frac{\mu_y}{\mu_x}$ is possible. However, the variance term $\sigma_R^2$ should not be negative. – Lucas Roberts Sep 07 '22 at 13:00
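One possible source of a negative value, offered as a sketch and not as a diagnosis of the code above: for paired samples, the covariance of the two sample means is $\sigma_{xy}/n$, so the cross term needs the same $1/n$ factor as the two variance terms. If the covariance is left undivided, the expression is no longer a quadratic form in a valid covariance matrix and can go negative. The function names and summary statistics below are hypothetical.

```python
def var_ratio_consistent(mean_y, mean_x, var_y, var_x, cov_xy, n):
    # Every term carries 1/n, matching the covariance matrix of the
    # two sample means for paired observations.
    return (var_y / mean_x**2
            - 2 * mean_y * cov_xy / mean_x**3
            + var_x * mean_y**2 / mean_x**4) / n


def var_ratio_mixed(mean_y, mean_x, var_y, var_x, cov_xy, n):
    # Same expression, but the covariance term is NOT divided by n.
    return (var_y / (n * mean_x**2)
            - 2 * mean_y * cov_xy / mean_x**3
            + var_x * mean_y**2 / (n * mean_x**4))


# Hypothetical summary statistics, chosen only for illustration.
stats = dict(mean_y=20.0, mean_x=10.0, var_y=5.0, var_x=1.0, cov_xy=2.0, n=500)

print("all terms scaled by 1/n:     ", var_ratio_consistent(**stats))
print("covariance term left unscaled:", var_ratio_mixed(**stats))
```

With these inputs the first version is positive and the second is negative, because the unscaled cross term dominates the two variance terms once $n$ is large.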
-
Any chance you can explain why COV(B) = $\Sigma/n$ in Wiki and how that translates into your matrix above where some of the elements are divided by n and others are not? – B_Miner Jan 01 '23 at 01:02
-
@B_Miner I think you are referring to $\mathbb{C}ov(h(B))$ and not $\mathbb{C}ov(B)$. These are basic consequences of the mean, variance, and covariance operators on a sample mean. Here $h(\cdot)$ is a vector valued function of the two sample means, $\bar{x}$ and $\bar{y}$. – Lucas Roberts Jan 02 '23 at 23:23
-
Yes, sorry. I guess I am confused because I thought Cov(h(B)) was $\Sigma$, so I wasn't sure how those two lines in the wiki could be equal. – B_Miner Jan 03 '23 at 04:05