
I have a question about how to calculate and compare median cycle times for the production of widgets in factories. I get a report of median weekly cycle times for the factories. I'd like to compute a median for groups of factories and for the company as a whole, but I am suspicious of calculating medians of medians.

Let’s say there are some factories that make widgets. The factories are grouped into 4 groups, and the groups are different sizes.

Even though the factories each make different numbers of widgets, they all have the same cycle time goal, let’s say 30 minutes, for the production of an individual widget.

Each factory tracks widget cycle time every day, and it is tracked for all widgets, not for a sample of widget production.

Sometimes machines overheat or there’s some other kerfuffle, which causes cycle-time outliers (longer times), so it has been decided that cycle time should be reported as a median, specifically as a weekly median cycle time, which is a pretty common practice industry-wide.

It is not actually clear how the factories are aggregating their widget-level data in order to arrive at the weekly median cycle time, and the raw widget-level data is not available, nor is the daily (morning and afternoon shift) data.

Let’s also say we don’t yet know how the data is distributed, but it’s right-tailed and would still be right-tailed if the most extreme outliers were dropped. The data are expected to be distributed similarly for each of the factories.

I need to calculate the weekly median cycle time for each of the 4 factory groups and for the company overall.

If averages were reported, I could take averages of averages to my heart’s content, as I understand it, but taking medians of medians is suspect, although not always inappropriate.

Aside from trying to get my hands on the raw data, is it possible to make a reasonable stab at these numbers?

What specific issues should I address if I have no choice but to work with the medians?

helinum
  • If you have a good question, there's no need to apologise. If you don't, an apology won't help! So I've edited out some polite words. The tone of this is fine, but once I reached the end I felt I needed to start all over again. If there's a fault, it's just too long for anyone to grasp, but I can't rule out someone smarter than me seeing your point. – Nick Cox Dec 12 '16 at 22:20
  • Ha ha - I've seen so many posts where the OP is asked to be more specific that I was trying to head it off at the pass. Guess I overshot it. I just shortened it up. – helinum Dec 12 '16 at 22:28
  • Have you checked out http://stats.stackexchange.com/questions/4462/median-of-medians-calculation which might not give quite enough answer but seems to be related. They do mention weighted medians, but you may not have that information. – Wayne Dec 12 '16 at 23:03
  • Medians of medians are not in general medians of the aggregate. It's not immediately clear to me whether medians of medians will in general converge to something you might be interested in. Given some suitable assumptions, it may be possible to arrive at some kind of bound on the median, though. – Glen_b Dec 12 '16 at 23:49
  • Note that even if you're taking averages of averages, if the original averages derive from different sample sizes, you would need to weight the averages in your overall average to get the same answer as averaging the entire collection. – Glen_b Dec 12 '16 at 23:52
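
The two points in the last comments can be checked on a toy example. This is only a sketch: the factory names and cycle times below are made up for illustration, and the raw per-widget data and per-factory counts it relies on are exactly what the question says is not available. It shows that a median of medians can differ from the median of the pooled data, and that a mean of means matches the pooled mean only when each factory's mean is weighted by its widget count.

```python
from statistics import mean, median

# Hypothetical cycle times (minutes) for three factories of unequal size.
# These numbers are invented for illustration only.
factories = {
    "A": [28, 29, 30, 31, 95],                     # one overheating outlier
    "B": [27, 30, 33],
    "C": [40, 41, 42, 43, 44, 45, 46, 47, 48],
}

# All widgets from all factories in one list.
pooled = [t for times in factories.values() for t in times]

# Median of the per-factory medians vs. median of the pooled data.
median_of_medians = median([median(times) for times in factories.values()])
pooled_median = median(pooled)

# Unweighted mean of per-factory means vs. pooled mean.
mean_of_means = mean([mean(times) for times in factories.values()])
pooled_mean = mean(pooled)

# Weighting each factory mean by its widget count recovers the pooled mean.
weighted_mean_of_means = (
    sum(len(times) * mean(times) for times in factories.values()) / len(pooled)
)

print("median of medians:     ", median_of_medians)   # 30
print("pooled median:         ", pooled_median)       # 41
print("mean of means:         ", mean_of_means)
print("weighted mean of means:", weighted_mean_of_means)
print("pooled mean:           ", pooled_mean)
```

In this toy data the two medians disagree because the largest factory dominates the pooled data but counts as only one of three values in the median of medians. No analogous weighting of the factory medians is guaranteed to recover the pooled median.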
