I need to find the expected value and variance of KDE given that $$(i) E[u] = 0 \to \int u\phi(u)du=0\\ (ii)V[u] = \sigma^2 \to \int u^2\phi(u)du=\sigma^2$$ where $\phi$ is the kernel function.

I've tried to do it myself and also searched online. I found answers, but they aren't really explained and I don't understand where they come from. For the expected value: $$\mathbf{E}(p_n (x)) = \mathbf{E}\left(\frac{1}{Nh} \sum_{i=1}^{N} \phi\left(\frac{x-x_i}{h}\right)\right) = \frac{1}{h}\mathbf{E}\left(\phi\left(\frac{x-x_1}{h}\right)\right) \\= h^{-1} \int \phi\left(\frac{x-x_1}{h}\right) p(x_1) dx_1 =\int \phi(z) p(x - hz) dz $$ By Taylor's theorem, $p(x+(-hz)) = p(x) - p'(x)hz + \frac{1}{2}p''(x)h^2 z^2 + O(h^3)$, so we get $$ \mathbf{E}(p_n (x)) = \int \phi(z) \left[p(x) - p'(x)hz + \frac{1}{2}p''(x)h^2 z^2+O(h^3)\right]dz \\ = p(x)\int \phi(z)dz -hp'(x)\int \phi(z)zdz + \frac{1}{2}h^2 p''(x) \int \phi(z)z^2dz + O(h^3) \\ = p(x) - 0 + \frac{1}{2}h^2p''(x) \int\phi(z)z^2dz + O(h^3) \\ = p(x) +\frac{1}{2}h^2p''(x) \sigma^2 + O(h^3)$$
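The expansion above can be sanity-checked numerically. The following is only a minimal sketch under assumptions that are not part of the question: $p$ is taken to be the standard normal density and $\phi$ the Gaussian kernel (so $\sigma^2 = 1$), and the helper names `expected_kde` and `taylor_approx` are hypothetical. It evaluates $\int \phi(z) p(x-hz)dz$ by a Riemann sum and compares it with $p(x) + \frac{1}{2}h^2p''(x)\sigma^2$.

```python
import numpy as np

def normal_pdf(x, var=1.0):
    """Density of N(0, var); used here as both the kernel and the assumed p."""
    return np.exp(-x**2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)

def expected_kde(x, h, num=20001, zmax=10.0):
    """E[p_n(x)] = integral of phi(z) p(x - h z) dz, approximated by a Riemann sum."""
    z = np.linspace(-zmax, zmax, num)
    dz = z[1] - z[0]
    return np.sum(normal_pdf(z) * normal_pdf(x - h * z)) * dz

def taylor_approx(x, h, sigma2=1.0):
    """p(x) + 0.5 h^2 p''(x) sigma^2, with p''(x) = (x^2 - 1) p(x) for the standard normal."""
    p = normal_pdf(x)
    p_second = (x**2 - 1.0) * p
    return p + 0.5 * h**2 * p_second * sigma2

x = 0.5  # arbitrary evaluation point (assumption)
for h in (0.4, 0.2, 0.1):
    exact = expected_kde(x, h)
    approx = taylor_approx(x, h)
    print(f"h={h:.1f}  E[p_n(x)]={exact:.6f}  Taylor={approx:.6f}  gap={abs(exact - approx):.2e}")
```

With these choices the gap between the two columns should shrink quickly as $h$ decreases, which is what the remainder term in the expansion predicts.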

I think I understand what's going on so far and I could probably do it myself, but I can't figure out the variance. $$\operatorname{Var}(p_n(x)) = \operatorname{Var}\left(\frac{1}{Nh}\sum_{i=1}^{N} \phi\left(\frac{x-x_i}{h}\right)\right) \\ =\frac{1}{N^2h^2} \sum_{i=1}^{N}\operatorname{Var}\left(\phi\left(\frac{x-x_i}{h}\right)\right) + 0 \\ =\frac{1}{Nh^2}\operatorname{Var}\left(\phi\left(\frac{x-x_1}{h}\right)\right) \\ =\frac{1}{Nh^2}\left[\mathbf{E}\left(\phi^2\left(\frac{x-x_1}{h}\right)\right)-\left[\mathbf{E}\left(\phi\left(\frac{x-x_1}{h}\right)\right)\right]^2\right] \\ =\frac{1}{Nh^2}\left[\int \phi^2\left(\frac{x-x_1}{h}\right)p(x_1)dx_1 - \left[ \int \phi\left(\frac{x-x_1}{h}\right)p(x_1)dx_1\right]^2\right] \\ = \frac{1}{Nh^2}\left[h\int \phi^2(z)p(x-hz)dz - \left[h\int \phi(z)p(x-hz)dz\right]^2\right] $$

I've gotten that far but I'm not sure how to continue. The second term in the last line is $h^2$ times the square of the expected value calculated above. The online resources I found use Taylor's expansion again and write $O(h^2)$ to describe the second term, then switch to $O(h)$ and then to $o(h)$, and I'm totally confused, especially because I don't completely understand this notation. Any help would be appreciated. Thanks.

thenac

1 Answer

Here I follow the derivation given in this link. Before discussing the properties of the density estimator $$ \widehat{f}(x)=\frac{1}{n h} \sum_{i=1}^n K\left(\frac{X_i-x}{h}\right), $$ we impose the following mild assumptions:

  • A.1 $X_1, \cdots, X_n \stackrel{\text{i.i.d.}}{\sim} f$.
  • A.2 $f^{\prime\prime}(x)$ is continuous and bounded in a neighborhood of $x$.
  • A.3 The kernel function $K(\cdot)$ is a symmetric pdf satisfying: $$ (i)\ \int K(u) d u=1,\quad (ii)\ \nu_2:=\int K^2(u) d u<\infty,\quad (iii)\ \kappa_2:=\int u^2 K(u) d u \in(0, \infty) $$
  • A.4 $h \rightarrow 0$ and $n h \rightarrow \infty$ as $n \rightarrow \infty$, which implies $n^{-1}=o\left((n h)^{-1}\right)$.

Then, we have: $$ \begin{aligned} \operatorname{Var}\widehat{f}(x) &=\operatorname{Var}\left[\frac{1}{n h} \sum_{i=1}^n K\left(\frac{X_i-x}{h}\right)\right] \\ (\text{By A.1: $X_1, \cdots, X_n \stackrel{\text{i.i.d.}}{\sim} f$})\quad &=\frac{1}{n} \operatorname{Var}\left[\frac{1}{h} K\left(\frac{X-x}{h}\right)\right] \\ &=\frac{1}{n}\left( \mathbb{E}\left[\frac{1}{h^{2}} K^{2}\left(\frac{X-x}{h}\right)\right] - \mathbb{E}^{2}\left[\frac{1}{h} K\left(\frac{X-x}{h}\right)\right]\right) \\ &=\frac{1}{n}\left( \mathbb{E}\left[\frac{1}{h^{2}} K^{2}\left(\frac{X-x}{h}\right)\right] - \mathbb{E}^{2}[\widehat{f}(x)]\right) \\ &=\frac{1}{n}\left(\int \frac{1}{h^{2}} K^{2}\left(\frac{t-x}{h}\right)f(t) d t - \mathbb{E}^{2}[\widehat{f}(x)]\right) \\ (\text{Define $u:=\frac{t-x}{h}$})\quad &=\frac{1}{n}\left(\int \frac{1}{h} K^{2}\left(u\right)f(uh+x) d u - \mathbb{E}^{2}[\widehat{f}(x)]\right) \\ (\text{Perform 1st Order Taylor Expansion})\quad &=\frac{1}{n}\left(\int \frac{1}{h} K^{2}\left(u\right)\left(f(x) + O(uh)\right)d u - \mathbb{E}^{2}[\widehat{f}(x)]\right) \\ (\text{Insert $\mathbb{E}[\widehat{f}(x)] = f(x)+O(h^{2})$})\quad &=\frac{1}{n}\left(\frac{1}{h}\left(f(x)\int K^{2}\left(u\right)d u + O(h)\right) - (f(x)+O(h^{2}))^{2}\right) \\ &=\frac{1}{n}\left(\frac{f(x)}{h}\int K^{2}\left(u\right)d u + O(1) - O(1)\right) \\ &=\frac{f(x)}{nh}\int K^{2}\left(u\right)d u + O\left(\frac{1}{n}\right) \\ (\text{By A.4: $n^{-1}=o\left((n h)^{-1}\right)$})\quad &=\frac{f(x)}{n h} \int_{-\infty}^{\infty} K^2(u) d u+o\left(\frac{1}{n h}\right) \\ (\text{By A.3(ii): $\nu_2:=\int K^2(u) d u<\infty$})\quad &=\frac{f(x)}{n h} \nu_2+o\left(\frac{1}{n h}\right). \end{aligned} $$

On the notation: $O(g)$ denotes a term bounded in absolute value by a constant times $g$, while $o(g)$ denotes a term whose ratio to $g$ tends to $0$. Since $\frac{1/n}{1/(nh)} = h \to 0$, any $O(1/n)$ term is $o\left(\frac{1}{nh}\right)$, which is why the remainder changes from $O$ to $o$ in the second-to-last step.
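The leading term $\frac{f(x)}{nh}\nu_2$ can also be compared with a simulation. What follows is a rough Monte Carlo sketch, not part of the linked derivation. It assumes a standard normal $f$ and a Gaussian kernel, for which $\nu_2 = \frac{1}{2\sqrt{\pi}}$, and the values of `n`, `h`, `x`, `reps` and the helper name `kde_at` are arbitrary illustrative choices.

```python
import numpy as np

def gaussian_kernel(u):
    """Gaussian kernel K(u); it is also the assumed true density f here."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def kde_at(x, samples, h):
    """f_hat(x) for each row of `samples` (shape: reps x n)."""
    return gaussian_kernel((samples - x) / h).mean(axis=1) / h

rng = np.random.default_rng(0)
n, h, x, reps = 2000, 0.05, 0.0, 2000  # illustrative values (assumptions)

# Draw `reps` independent samples of size n from f = N(0, 1).
samples = rng.standard_normal((reps, n))
estimates = kde_at(x, samples, h)

# Sample variance of f_hat(x) across the replications.
empirical_var = estimates.var(ddof=1)

# Leading term f(x) * nu_2 / (n h), with nu_2 = 1 / (2 sqrt(pi)) for the Gaussian kernel.
nu2 = 1.0 / (2.0 * np.sqrt(np.pi))
leading_term = gaussian_kernel(x) * nu2 / (n * h)

print(f"empirical Var[f_hat(x)] = {empirical_var:.3e}")
print(f"f(x) nu_2 / (n h)       = {leading_term:.3e}")
```

The two numbers should be close but not identical: the difference is the $O(1/n)$ remainder plus simulation noise, and relative to the leading term the remainder shrinks as $h \to 0$.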