I have an independent, identically distributed sample $x_1, \ldots, x_N$.

I construct an empirical CDF as $P_{\mathrm{emp}}(x)=\sum\limits_{i=1}^N H(x-x_i)$, where $H(x)$ is the Heaviside step function.

Then I make an interpolation of the raw CDF as

$$P_{\mathrm{interpolated}}(x) = \frac{P_{\mathrm{emp}}(x_{i+1}) - P_{\mathrm{emp}}(x_i)}{x_{i+1}-x_i}\cdot(x_{i+1}-x), \qquad x \in [x_i, x_{i+1}]$$

Then I take a known window function $w(x)$ and convolve $P_{\mathrm{interpolated}}(x)$ with it:

$$P_{\mathrm{smoothed}}(x)=[P_{\mathrm{interpolated}} \ast w](x)$$

Finally, I take the PDF as the derivative of $P_{\mathrm{smoothed}}(x)$:

$$f_{\mathrm{PDF}}(x)=P_{\mathrm{smoothed}}'(x)$$
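
A minimal sketch of the steps above in Python with NumPy, offered as an illustration rather than the exact procedure used. It assumes several things not stated in the question: the ECDF is normalised so that it runs from 0 at the smallest observation to 1 at the largest, the interpolation is piecewise linear through the sorted sample, the window $w$ is a Gaussian, and the evaluation grid is uniform and extends several bandwidths beyond the data. The function name `smoothed_ecdf_density` and the `bandwidth` parameter are hypothetical.

```python
import numpy as np

def smoothed_ecdf_density(sample, grid, bandwidth):
    """Interpolate the ECDF, smooth it with a Gaussian window, then differentiate."""
    xs = np.sort(np.asarray(sample, dtype=float))
    # Assumption: heights run from 0 at min(sample) to 1 at max(sample).
    heights = np.linspace(0.0, 1.0, xs.size)
    # Piecewise-linear interpolation; np.interp holds 0 / 1 outside the data range.
    cdf = np.interp(grid, xs, heights)

    # Assumption: Gaussian window, truncated at 4 bandwidths, weights summing to 1.
    dx = grid[1] - grid[0]
    half = int(np.ceil(4 * bandwidth / dx))
    t = np.arange(-half, half + 1) * dx
    w = np.exp(-0.5 * (t / bandwidth) ** 2)
    w /= w.sum()

    # Pad with edge values so the convolution sees a flat CDF beyond the grid.
    padded = np.pad(cdf, half, mode="edge")
    smoothed = np.convolve(padded, w, mode="valid")

    # Numerical derivative of the smoothed CDF gives the density estimate.
    return np.gradient(smoothed, dx)

rng = np.random.default_rng(0)
sample = rng.normal(size=500)
grid = np.linspace(-6.0, 6.0, 1201)
pdf = smoothed_ecdf_density(sample, grid, bandwidth=0.3)
```

The sketch assumes no tied observations, since `np.interp` expects increasing abscissae.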

Can you tell me the proper name of this method of PDF restoration?

Or, failing that, the name of the wider class of PDF restoration methods it belongs to?

Felix
  • I like to use Adobe Acrobat. Sorry, had to do this. – Taal Oct 06 '13 at 22:47
  • A related idea, performed directly on the empirical probability function rather than the ECDF, is kernel density estimation. This avoids the need for differentiation. This is a very widely used tool. There's also log-spline density estimation. There are many other techniques in particular situations. – Glen_b Oct 07 '13 at 01:10
  • I know about KDE. But in my case the restoration from the empirical CDF works better. All I need is its proper name so that I can use it in my article. – Felix Oct 07 '13 at 07:24
  • I suspect this method is equivalent, or nearly so (up to effects occurring beyond the observed range of data) to a KDE using a one-parameter family of kernels parameterized by gaps between the data. I'm not really sure because (a) $P_{emp}$ is not a CDF, since its values range from $0$ through $N$ and (b) $P_{interpolated}$ clearly does not interpolate $P_{emp}$: for instance, it has zeros at $x=x_1, x_2, \ldots, x_N$. I am assuming you intend $P_{interpolated}$ to be a piecewise linear interpolation of the ECDF within the interval $[x_1, x_N]$. – whuber Jan 07 '14 at 22:21
  • I agree with @whuber. This method cannot be better than KDE, and is in fact the same, as can be seen by evaluating $P_{smoothed}'$. – Aksakal May 13 '14 at 02:52
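
The relation to KDE suggested in the comments can be written out as a short derivation. This is a sketch under assumptions that go beyond the question as posted: the interpolation is taken to be piecewise linear through the sorted sample $x_{(1)} < \ldots < x_{(N)}$, normalised to rise from $0$ to $1$ in equal steps of $1/(N-1)$, and $w$ is integrable so that the derivative can be moved inside the convolution. The order-statistic notation $x_{(i)}$ is introduced here for illustration. The derivative of the interpolated CDF is then a histogram with one bin per gap:

$$P_{\mathrm{interpolated}}'(t) = \frac{1}{N-1}\sum_{i=1}^{N-1} \frac{\mathbf{1}[x_{(i)} \le t < x_{(i+1)}]}{x_{(i+1)}-x_{(i)}}$$

and the resulting density is

$$f_{\mathrm{PDF}}(x) = [P_{\mathrm{interpolated}}' \ast w](x) = \frac{1}{N-1}\sum_{i=1}^{N-1} \frac{1}{x_{(i+1)}-x_{(i)}}\int_{x_{(i)}}^{x_{(i+1)}} w(x-t)\,dt$$

Under these assumptions the estimate is an equally weighted mixture of $N-1$ kernels, each being $w$ averaged over one gap between neighbouring observations, which matches the "kernels parameterized by gaps between the data" reading above.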

1 Answer

It sounds like you're asking about the "probability density function", perhaps?

Here's a link to a related question: Estimating PDF of continuous distribution from (few) data points

Taal