Modified question to better explain the context of my problem:
I am studying young stars. When a star is born, it is surrounded by a disk of dust called a "protoplanetary disk". Planets form in these disks, so understanding how they evolve gives information on planet formation. Current theories and observations suggest that every star is born with one of these disks. However, different processes make these disks dissipate in about 10 million years. The usual way to study this subject is to measure the fraction of stars with protoplanetary disks at different ages to see how they dissipate. Past studies have found "hints" of massive stars losing their disks earlier than low-mass stars, which would mean they may form different planetary systems. My aim is to determine whether this dependence on stellar mass is real.
To study these disks, we look at the flux measured at infrared wavelengths. When you know what type of star it is (let's say you know its temperature), you can apply a stellar model. If the flux you measure is significantly higher (defined in some way) than that expected from the stellar model (a naked star), that could mean you have additional infrared flux emitted by the protoplanetary disk. Also, you need an age estimate for the star, and a stellar mass estimate if you want to compare different masses. So, there are several sources of uncertainty:
errors from the infrared measurements
errors from the estimated temperature of the star
errors from the age estimate
errors from the mass estimate.
The origin and behaviour of these uncertainties are very complicated, and usually not included in the calculations.
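To make the "significantly higher" criterion above concrete, here is a minimal sketch of one possible definition: flag a star when the observed infrared flux exceeds the model flux by more than some number of sigma. It only accounts for the photometric error (the first item in the list), and the function name, the 3-sigma threshold and the example numbers are hypothetical choices for illustration, not the definition actually used in my analysis.

```python
import numpy as np

def has_infrared_excess(flux_obs, flux_err, flux_model, n_sigma=3.0):
    """Flag stars whose observed IR flux exceeds the stellar model.

    Assumption: Gaussian photometric errors and an error-free model flux.
    """
    flux_obs = np.asarray(flux_obs, dtype=float)
    flux_err = np.asarray(flux_err, dtype=float)
    flux_model = np.asarray(flux_model, dtype=float)
    # Excess expressed in units of the measurement error
    significance = (flux_obs - flux_model) / flux_err
    return significance > n_sigma

# Three made-up stars (arbitrary flux units); only the second one
# lies more than 3 sigma above the naked-star model.
flags = has_infrared_excess([1.05, 2.4, 0.9], [0.1, 0.2, 0.1], [1.0, 1.0, 1.0])
print(flags)
```

An error on the stellar temperature would enter through `flux_model`, which this sketch treats as exact.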
I have built a large sample of young stars, and I want to see what evidence there is that stellar mass affects the evolution/dissipation of protoplanetary disks. To do so, I have divided the sample into two mass bins and two age bins (the cuts have some physical meaning). As a result, I have four bins: "young low-mass", "young high-mass", "old high-mass", "old low-mass". Computing the % of protoplanetary disks for each of these bins is simple, but that is not enough to prove or rule out the influence of mass. On the other hand, assigning errors to that % by error propagation is extremely complicated. Usually, one assumes simple Poisson errors, but that is not correct, as it does not account for the uncertainties listed above. That is why I thought I could use bootstrapping, varying these quantities within reasonable ranges during the iterations to account for them.
As a result of that process, I end up with a list of % values for each bin, from which I can get statistical quantities (mean, standard deviation, …). They also provide an estimate of the corresponding PDFs.
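The procedure described above can be sketched roughly as follows. This is a simplified illustration on synthetic data, not my actual pipeline: it assumes independent Gaussian errors on flux, mass and age, ignores the temperature uncertainty, and uses a plain 3-sigma excess criterion. The function name `bootstrap_disk_fraction`, the cuts and all the numbers are hypothetical.

```python
import numpy as np

def bootstrap_disk_fraction(flux, flux_err, model_flux, mass, mass_err,
                            age, age_err, mass_cut, age_cut,
                            n_sigma=3.0, n_iter=1000, seed=0):
    """Return, for each bin, the list of disk fractions (in %) over all iterations."""
    rng = np.random.default_rng(seed)
    n = len(flux)
    labels = ["young low-mass", "young high-mass", "old low-mass", "old high-mass"]
    fractions = {k: [] for k in labels}
    for _ in range(n_iter):
        # Resample the stars with replacement (the bootstrap step)
        idx = rng.integers(0, n, size=n)
        # Perturb each quantity within its (assumed Gaussian) error
        f = rng.normal(flux[idx], flux_err[idx])
        m = rng.normal(mass[idx], mass_err[idx])
        a = rng.normal(age[idx], age_err[idx])
        # Disk = infrared excess above the stellar model
        has_disk = (f - model_flux[idx]) > n_sigma * flux_err[idx]
        young, low = a < age_cut, m < mass_cut
        masks = {"young low-mass": young & low, "young high-mass": young & ~low,
                 "old low-mass": ~young & low, "old high-mass": ~young & ~low}
        for k, mask in masks.items():
            if mask.any():  # skip a bin that happens to be empty
                fractions[k].append(100.0 * has_disk[mask].mean())
    return {k: np.array(v) for k, v in fractions.items()}

# Synthetic sample: 400 stars, half of them with a disk (made-up numbers)
rng = np.random.default_rng(42)
n = 400
mass = rng.uniform(0.2, 5.0, n)    # solar masses
age = rng.uniform(1.0, 10.0, n)    # Myr
model_flux = np.ones(n)            # arbitrary flux units
flux_err = np.full(n, 0.1)
true_disk = rng.random(n) < 0.5
flux = model_flux + np.where(true_disk, 1.0, 0.0) + rng.normal(0.0, flux_err)

result = bootstrap_disk_fraction(flux, flux_err, model_flux,
                                 mass, 0.1 * mass, age, 0.2 * age,
                                 mass_cut=2.0, age_cut=5.0, n_iter=500)
for label, values in result.items():
    print(f"{label}: {values.mean():.1f} +- {values.std():.1f}")
```

Each array in `result` is the list of % values for one bin, so a histogram of it gives the estimated PDF for that bin.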
I would like to know how to quantify the statistical evidence of these bins having different protoplanetary disk fractions, which translates into evidence of stellar mass having an impact on their evolution.
This is an example of the outcome. sample1 is "young, low-mass stars". sample2 is "young, high-mass stars". And their means and standard deviations are:
sample1: 61 +- 2
sample2: 47 +- 5
Also, these are the PDFs obtained.
