I have a univariate dataset (a time series) for each of two kinds of simulated systems, and I want to explore the differences between the two.
To do that, I can build a univariate Gaussian KDE for each dataset and compare the resulting densities. I am attaching an example of what the two time series I want to compare look like, as well as the PDFs I obtain using R's ggplot2 package (sorry for the color mismatch: the green time series is the blue line in the density plot).
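To be concrete about the KDE step, here is a minimal sketch in plain Python (chosen only for illustration; my actual plots come from R). The two series are synthetic stand-ins, not my MD data, and the names `silverman_bandwidth` and `gaussian_kde` are helpers made up for this example. The bandwidth uses the same rule of thumb as R's `bw.nrd0`, with cruder quartiles. Both densities are evaluated on one shared grid so that they can be subtracted point by point.

```python
import math
import random

def silverman_bandwidth(data):
    """Rule-of-thumb bandwidth: 0.9 * min(sd, IQR/1.34) * n^(-1/5)."""
    n = len(data)
    mean = sum(data) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))
    s = sorted(data)
    iqr = s[(3 * n) // 4] - s[n // 4]  # crude quartiles, good enough for a sketch
    spread = min(sd, iqr / 1.34) if iqr > 0 else sd
    return 0.9 * spread * n ** (-0.2)

def gaussian_kde(data, grid, bw):
    """Evaluate a Gaussian KDE of `data` at every point of `grid`."""
    norm = 1.0 / (len(data) * bw * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((g - x) / bw) ** 2) for x in data)
        for g in grid
    ]

# Synthetic stand-ins for the two distance series (assumption: NOT real MD data).
rng = random.Random(0)
series_a = [rng.gauss(5.0, 0.5) for _ in range(400)]
series_b = [rng.gauss(5.6, 0.7) for _ in range(250)]  # deliberately shorter

bw_a = silverman_bandwidth(series_a)
bw_b = silverman_bandwidth(series_b)

# One shared grid, padded so that both densities decay to ~0 at the edges.
pad = 4.0 * max(bw_a, bw_b)
lo = min(series_a + series_b) - pad
hi = max(series_a + series_b) + pad
n_grid = 200
step = (hi - lo) / (n_grid - 1)
grid = [lo + i * step for i in range(n_grid)]

dens_a = gaussian_kde(series_a, grid, bw_a)
dens_b = gaussian_kde(series_b, grid, bw_b)
diff = [a - b for a, b in zip(dens_a, dens_b)]  # pointwise density difference

# Sanity check: each density should integrate to roughly 1 on the grid.
area_a = step * sum(dens_a)
area_b = step * sum(dens_b)
print(round(area_a, 3), round(area_b, 3))
```

Note that the rule-of-thumb bandwidth shrinks as the sample grows, so the two curves are smoothed by different amounts when the series have different lengths.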
My problem is that the shape of the estimated distribution depends on the amount of data used to construct it, and the two datasets I have to compare contain different amounts of data. The data are the distance between two atoms in a molecular dynamics simulation, and as you can see they show a fair bit of history dependence.
I was wondering whether one could apply a resampling technique such as cross-validation or the bootstrap to this problem to estimate a range of uncertainty for the density plots, so that the distributions can be compared in a more statistically rigorous way.
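To illustrate the kind of thing I have in mind, below is a rough sketch of a moving-block bootstrap, which resamples contiguous blocks rather than single points so that some of the history dependence survives. I am not sure this is the right tool; the AR(1) toy series, the block length, the fixed bandwidth, and the helper names (`moving_block_resample`, `kde_band`) are all assumptions made up for the example, and Python is used only for illustration.

```python
import math
import random

def gaussian_kde(data, grid, bw):
    """Evaluate a Gaussian KDE of `data` at every point of `grid`."""
    norm = 1.0 / (len(data) * bw * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((g - x) / bw) ** 2) for x in data)
        for g in grid
    ]

def moving_block_resample(series, block_len, rng):
    """Concatenate randomly chosen contiguous blocks, then trim to len(series)."""
    n = len(series)
    out = []
    for _ in range(math.ceil(n / block_len)):
        start = rng.randrange(n - block_len + 1)
        out.extend(series[start:start + block_len])
    return out[:n]

def kde_band(series, grid, bw, block_len, n_boot=100, alpha=0.05, seed=0):
    """Pointwise percentile band for the KDE from block-bootstrap resamples."""
    rng = random.Random(seed)
    curves = [
        gaussian_kde(moving_block_resample(series, block_len, rng), grid, bw)
        for _ in range(n_boot)
    ]
    lo_idx = int((alpha / 2) * (n_boot - 1))
    hi_idx = int(math.ceil((1 - alpha / 2) * (n_boot - 1)))
    lower, upper = [], []
    for j in range(len(grid)):
        vals = sorted(curve[j] for curve in curves)
        lower.append(vals[lo_idx])
        upper.append(vals[hi_idx])
    return lower, upper

# Toy autocorrelated series (assumption: an AR(1) process standing in for MD data).
rng = random.Random(1)
mu, phi, n = 5.0, 0.9, 200
series = [mu]
for _ in range(n - 1):
    series.append(mu + phi * (series[-1] - mu) + rng.gauss(0.0, 0.2))

grid = [3.0 + 4.0 * i / 39 for i in range(40)]  # 40 points on [3, 7]
bw = 0.15        # held fixed across resamples (assumption)
block_len = 20   # should exceed the correlation time (assumption)

point_estimate = gaussian_kde(series, grid, bw)
lower, upper = kde_band(series, grid, bw, block_len)
```

The band is pointwise, not simultaneous, and it reflects sampling variability only, not the smoothing bias of the KDE. Doing the same for the second series on the same grid would give two bands whose overlap could then be inspected.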
Any comments would be a great help.



... xshade is built. In your example, the 'real' distribution from which x is simulated is xax, if I understood correctly? In my case, I don't have to use xax at all, is that right? – j_eiros Apr 15 '16 at 14:55

... rnorm, and you would replace that with your data in practice. I made edits to hopefully clarify xax. As for xshade I would like to refer you to ?polygon (a command which I also find difficult to get my head around...) – Christoph Hanck Apr 16 '16 at 13:24

... polygon command is a bit weird indeed ;) – j_eiros Apr 18 '16 at 08:41