According to the solution above, the integral of the kernel over x is the volume of the hypercube. However, I do not understand why. Can someone explain to me please?
1 Answer
For the Parzen window, under some assumptions, you have:
$$p(\mathbf x) = \frac K {NV} \tag{1}$$
where $K$ is the number of data points inside your region $\mathcal R$ (the hypercube), $N$ is the number of observations, and $V$ is the volume of the hypercube. In your notation, the estimated density at $\mathbf x$ is:
$$p_{\phi}(\mathbf x) = \frac 1 {n\cdot h^d} \sum_{i = 1}^{n} \phi \left(\frac{\mathbf x-\mathbf x_i}{h}\right) \tag{2}$$
Comparing $(1)$ and $(2)$ gives $N = n$, $V = h^d$, and $K = \sum_{i = 1}^{n}\phi \left(\frac{\mathbf x-\mathbf x_i}{h}\right)$. To ensure that $p_{\phi}(\mathbf x)$ is a proper density function (i.e. nonnegative everywhere and integrating to one), it is required that:
- $\phi(\mathbf u) \ge 0$
- $\displaystyle \int \phi(\mathbf u) \text{ d}\mathbf u=1$
With a change of variable, you have $\displaystyle \int \phi\left(\frac{\mathbf x-\mathbf x_i}{h}\right) \text{ d}\mathbf x = h^d \displaystyle \int \phi(\mathbf u) \text{ d}\mathbf u$.
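To spell out that substitution (assuming $h > 0$): let $\mathbf u = \frac{\mathbf x - \mathbf x_i}{h}$, so $\mathbf x = \mathbf x_i + h\mathbf u$. Each of the $d$ coordinates is scaled by $h$, so the volume element picks up a factor $h^d$:

$$\text{d}\mathbf x = h^d \text{ d}\mathbf u \quad\Longrightarrow\quad \int \phi\left(\frac{\mathbf x-\mathbf x_i}{h}\right) \text{ d}\mathbf x = \int \phi(\mathbf u)\, h^d \text{ d}\mathbf u = h^d \cdot 1 = h^d$$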
Using the requirement above and the change of variable to your equation, you have:
$$\int p_{\phi}(\mathbf x) \text{ d}\mathbf x= \frac 1 {n\cdot h^d} \sum_{i = 1}^{n} \int \phi \left(\frac{\mathbf x-\mathbf x_i}{h}\right)\text{ d}\mathbf x = \frac 1 {n\cdot h^d} \sum_{i = 1}^{n} h^d\cdot 1 = 1$$
So the integral of each kernel term over $\mathbf x$ is indeed $h^d$, the volume of the hypercube, and $p_{\phi}(\mathbf x)$ integrates to $1$.
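If it helps, you can check both claims numerically. The following is only a minimal sketch, not anything from your textbook: it assumes the unit hypercube window ($\phi(\mathbf u) = 1$ when every $|u_j| \le 1/2$, else $0$), and the names `hypercube_kernel` and `parzen_density`, the sample points, and the values $d = 2$, $n = 5$, $h = 0.7$ are all made up for illustration. It approximates the integrals with a midpoint Riemann sum on a grid.

```python
import numpy as np

def hypercube_kernel(u):
    """Unit hypercube window: 1 if every coordinate of u lies in [-1/2, 1/2], else 0."""
    return np.all(np.abs(u) <= 0.5, axis=-1).astype(float)

def parzen_density(x, samples, h):
    """Parzen estimate 1/(n h^d) * sum_i phi((x - x_i)/h) at query points x of shape (m, d)."""
    n, d = samples.shape
    u = (x[:, None, :] - samples[None, :, :]) / h  # shape (m, n, d)
    return hypercube_kernel(u).sum(axis=1) / (n * h**d)

# Illustrative (assumed) setup: 5 random points in [-1, 1]^2, window width h = 0.7.
rng = np.random.default_rng(0)
d, n, h = 2, 5, 0.7
samples = rng.uniform(-1.0, 1.0, size=(n, d))

# Midpoint grid over [-2, 2]^2, wide enough to contain every window.
m = 400
edges = np.linspace(-2.0, 2.0, m + 1)
centers = 0.5 * (edges[:-1] + edges[1:])
cell = (edges[1] - edges[0]) ** d  # volume of one grid cell
grid = np.stack(np.meshgrid(centers, centers, indexing="ij"), axis=-1).reshape(-1, d)

# Integral over x of a single kernel term phi((x - x_1)/h): should be close to h^d.
kernel_integral = hypercube_kernel((grid - samples[0]) / h).sum() * cell

# Integral over x of the full estimate p_phi(x): should be close to 1.
density_integral = parzen_density(grid, samples, h).sum() * cell

print("single kernel term:", kernel_integral, "vs h^d =", h**d)
print("density estimate:  ", density_integral)
```

The first number should land near $h^d$ and the second near $1$, up to the discretization error of the grid. Changing `h` changes the first value but not the second, which is exactly the cancellation of $h^d$ in the last equation.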
