
Statistics enthusiasts, I need your guidance!

I have run into a modeling problem where I could really use a push in the right direction, both terminology-wise (with the right search terms I may well find a rich body of literature on the subject) and in terms of modeling advice.

Consider the following: some unspecified raw material is processed industrially by dividing it into smaller components, which are then refined further. That initial division is our focus. It can be done in many different ways depending on the precise form of each raw material unit. The overall goal is specified by a simplified bucket distribution for the total production volume over a variable x, along which the divisions occur. The bucket distribution has a few non-zero cells (which sum to 1) and several zeroed-out buckets. The specification could look something like the following (made-up values, which will not match the plots that follow exactly):

x_min  x_max  distribution
0      18     0.00
19     27     0.31
28     36     0.00
37     45     0.02
46     51     0.00
52     60     0.33
61     68     0.00
69     75     0.02
76     83     0.00
84     92     0.32
93     100    0.00

The specification is uniform within each bucket, but the industrial equipment is not set up to mimic that (for some reason; maybe it is simply not possible). Instead, the equipment is operated to aim at some value in the middle of each non-zero bucket, with some variability due to equipment flaws and measurement errors. The figure below shows the actual outcome distribution corresponding to the made-up specification above, with a zoom to the right showing the peak at roughly 3-4 units of x into the bucket. The distribution looks normal-ish, albeit with a little skew. I should mention that I have several such item-level outcome sets, corresponding to a few different specification lists.

[Figure: actual outcome distribution for the specification above; the right-hand panel zooms in on one peak]
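
One way I could write the setup down, assuming each peak is roughly normal and ignoring the slight skew, is as a finite mixture with one component per non-zero bucket. The symbols are my own and not taken from any reference: bucket j spans [a_j, b_j] with specified share w_j, δ is the aiming offset into the bucket, and σ is the spread from equipment flaws and measurement errors.

```latex
% Outcome density: one normal component per non-zero bucket,
% weighted by the specified share w_j (assumed model, not fitted).
f(x) = \sum_{j:\, w_j > 0} w_j \, \frac{1}{\sigma}\,
       \varphi\!\left(\frac{x - (a_j + \delta)}{\sigma}\right)

% Fraction of production that lands inside the non-zero buckets.
P_{\text{in}} = \sum_{j:\, w_j > 0} \int_{a_j}^{b_j} f(x)\, dx
```

Here φ is the standard normal density. Whether δ and σ are shared across buckets or differ per bucket is exactly the kind of thing I would want to learn from the outcome data.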

How should I tackle this? What I am after is a statistical description of the outcomes, grounded in the outcome data and expressed in terms of the specification distribution. My idea is that once I have the composite distribution (parameterized by the specification table), I can simulate what fraction would end up in the non-zero cells as the specification table is varied.
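
To make the idea concrete, here is a minimal sketch of the simulation I have in mind. Everything in it is an assumption for illustration, not a fitted model: the peaks are taken as normal (no skew), the aiming offset `offset` and spread `sigma` are shared by all buckets, each bucket is treated as the half-open interval [x_min, x_max + 1) so that the integer table has no gaps, and the function names are hypothetical.

```python
import numpy as np

# Specification table from above: (x_min, x_max, target share).
SPEC = [
    (0, 18, 0.00), (19, 27, 0.31), (28, 36, 0.00), (37, 45, 0.02),
    (46, 51, 0.00), (52, 60, 0.33), (61, 68, 0.00), (69, 75, 0.02),
    (76, 83, 0.00), (84, 92, 0.32), (93, 100, 0.00),
]


def simulate_outcomes(spec, offset, sigma, n, rng):
    """Draw n outcomes from a normal mixture, one component per non-zero bucket.

    Assumption: every component is centred `offset` units above its bucket's
    x_min and has the same standard deviation `sigma`.
    """
    active = [(lo, hi, w) for lo, hi, w in spec if w > 0]
    weights = np.array([w for _, _, w in active], dtype=float)
    weights /= weights.sum()  # guard against rounding in the table
    targets = np.array([lo + offset for lo, _, _ in active], dtype=float)
    component = rng.choice(len(active), size=n, p=weights)
    return rng.normal(loc=targets[component], scale=sigma)


def bucket_fractions(spec, x):
    """Fraction of outcomes in each bucket, treated as [x_min, x_max + 1)."""
    x = np.asarray(x)
    return np.array([np.mean((x >= lo) & (x < hi + 1)) for lo, hi, _ in spec])


def fraction_in_spec(spec, x):
    """Total fraction of outcomes that land in the non-zero buckets."""
    frac = bucket_fractions(spec, x)
    nonzero = np.array([w > 0 for _, _, w in spec])
    return float(frac[nonzero].sum())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = simulate_outcomes(SPEC, offset=3.5, sigma=1.5, n=100_000, rng=rng)
    print("fraction in non-zero buckets:", fraction_in_spec(SPEC, x))
```

Varying the specification table then just means passing a different `spec` list. If the skew turns out to matter, the normal draw could be swapped for a skewed family such as `scipy.stats.skewnorm`, at the cost of one more parameter to estimate from the outcome data.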

Any help on this matter would be highly appreciated!

Best,

// R.

Robert