I'm running an experiment in which I collect samples of different sizes (numeric data only) and compute the mean, median and mode of each sample. I'm interested in finding the mode across all samples, but I am not sure whether I can get it by taking the mode of the per-sample modes. So, I'd like to find

Mode(X) from Mode(X1), Mode(X2), ..., and Mode(Xn), where X = [X1,X2,…,Xn].

Can I do this by taking the mode from all modes?

develarist
alicek
  • If a distribution has multiple modes, then it has multiple modes. I'm not sure that trying to compute an "overall mode" is meaningful, and other measures such as the mean or median are available. Multiple modes indicate that there might be 'groups' in your data. Could you provide a histogram of your data and a bit of context? What are you collecting data on? – jcken Jul 16 '20 at 08:49

2 Answers

One approach to what you're trying to get at is to think of the individual samples as components or subpopulations of a mixture distribution, which covers the overall population. Below is an example of the density functions of two subpopulations or samples of data (dashed lines) that have their own modes. Fitted over them is the overall population set, or global density (solid line), enclosing both.

[Figure: density functions of two subpopulations (dashed lines) with the overall mixture density (solid line) fitted over them]

I wouldn't go so far as to expect that the global density will always have its own single mode, since, for the example shown, it slumps in areas between the subpopulations' modes before growing a hump.

Having a look at Gaussian mixture models might be a step in the right direction towards quantifying the mode of the global density. The global density itself is a weighted average of the underlying component densities, although its mode need not be a simple weighted average of the underlying modes. What's more likely is that the mixture distribution will still have multiple modes corresponding to the underlying modes, especially if the underlying modes are far apart.
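To make the "far apart versus close together" point concrete, here is a minimal sketch that evaluates a two-component Gaussian mixture density on a grid and lists its local maxima. The weights, means, standard deviations, grid limits and the helper names `normal_pdf` and `mixture_modes` are all made up for illustration; a grid search is only a rough way to locate modes.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and sd sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def mixture_modes(components, lo, hi, steps=1500):
    """components: list of (weight, mu, sigma) tuples.
    Returns the grid points where the mixture density has a local maximum."""
    xs = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    f = [sum(w * normal_pdf(x, mu, s) for w, mu, s in components) for x in xs]
    return [round(xs[i], 2) for i in range(1, steps) if f[i - 1] < f[i] > f[i + 1]]

# Component modes 5 standard deviations apart: the mixture keeps both modes.
far_apart = mixture_modes([(0.5, 0.0, 1.0), (0.5, 5.0, 1.0)], -5.0, 10.0)

# Component modes 1 standard deviation apart: the mixture has a single mode.
close = mixture_modes([(0.5, 0.0, 1.0), (0.5, 1.0, 1.0)], -5.0, 10.0)

print(far_apart, close)
```

In the second case the single mode sits between the two component modes, so it matches neither of them.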

develarist

The short answer is No. It's even possible to have set-ups where the mode of a combined sample is not the mode of any of the individual samples.

Suppose sample A has 30 values of 0 and 20 values of 1 and sample B has 20 values of 1 and 30 values of 2. Evidently 0 is the mode of sample A, 2 is the mode of sample B, but neither is the mode of the combined sample, a distinction that belongs to 1. We could rescue that example by an averaging rule, but no averaging rule works when an average produces a mode that isn't a possible value. Worse, an average often can't even be defined: Instead of 0, 1, 2 we might have categories "frog", "toad", "newt".
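The numeric example above can be checked directly. This is just a sketch using `statistics.multimode` from the Python standard library (Python 3.8 or later), with the counts taken from the example:

```python
from statistics import multimode

sample_a = [0] * 30 + [1] * 20   # 30 zeros, 20 ones
sample_b = [1] * 20 + [2] * 30   # 20 ones, 30 twos
combined = sample_a + sample_b

mode_a = multimode(sample_a)                 # [0]
mode_b = multimode(sample_b)                 # [2]
mode_combined = multimode(combined)          # [1]
mode_of_modes = multimode(mode_a + mode_b)   # [0, 2]: a tie, and neither is 1

print(mode_a, mode_b, mode_combined, mode_of_modes)
```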

In brief, the mode must be re-calculated from the original data.
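For categorical or count data, "the original data" need not mean every raw observation: if a frequency table was kept for each sample, the tables can be added together and the mode read off the total. A minimal sketch, assuming such tables exist; the function name `combined_mode` and the counts are made up for illustration:

```python
from collections import Counter

def combined_mode(frequency_tables):
    """Add per-sample frequency tables and return all most-common values."""
    total = Counter()
    for table in frequency_tables:
        total.update(table)          # adds counts value by value
    top = max(total.values())        # raises ValueError if there are no data
    return sorted(value for value, count in total.items() if count == top)

table_a = Counter({"frog": 30, "toad": 20})
table_b = Counter({"toad": 20, "newt": 30})

overall = combined_mode([table_a, table_b])
print(overall)   # ['toad']
```

This does not carry over to continuous data, where counting distinct values is not a sensible way to find a mode in the first place.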

But backing up, what is the mode anyway? Let's reconstruct a dialogue of simple (S) and complicating (C) comments, starting with the treatment in many introductory texts.

S1. The mode is the most common value.

S2. For categorical or count variables, the mode can be established by counting, or equivalently by inspecting a table or bar chart of frequencies.

C3. But watch out: Ties can occur, and modes may not be well defined, especially but not only in small samples.
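As a small illustration of ties, Python's standard library (3.8 or later) offers both `statistics.mode`, which quietly returns the first of the tied values it encounters, and `statistics.multimode`, which returns all of them. The data are made up:

```python
from statistics import mode, multimode

values = [1, 1, 2, 2, 3]

first_mode = mode(values)        # 1: the first tied value encountered
all_modes = multimode(values)    # [1, 2]: both tied values

print(first_mode, all_modes)
```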

C4. For (approximately) continuous variables, we need to think in terms of the position of the maximum of a density function. Counting may fail abysmally and it may even be true that each distinct value occurs only once, or that any ties are just quirks of measurement or sampling.

S5. Histograms can help. Which bar is the highest? Looking at a histogram often makes it clear that (a) there is a strong mode or (b) there isn't. It may be helpful to think in terms of two or more modes -- with a looser definition of mode as a peak in density.

C6. But watch out: The occurrence of peaks on a histogram can be an artefact of bin origin, bin width, or even boundary rules (are bins $(a, b]$ or $[a, b)$?). Even when a single bar is clearly highest, that still leaves the mode identified as an interval. Some older texts introduced procedures for using the frequencies associated with adjoining bars to get a point estimate.
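A sketch of the bin-origin problem, using made-up data and a hypothetical helper `modal_bin` that bins with the $[a, b)$ rule. Shifting the origin by half a bin width moves the highest bar to a completely different interval:

```python
import math
from collections import Counter

def modal_bin(data, origin, width):
    """Return (left, right, count) for the fullest bin, with bins [a, b)."""
    counts = Counter(math.floor((x - origin) / width) for x in data)
    index, count = counts.most_common(1)[0]
    left = origin + index * width
    return (left, left + width, count)

data = [0.6, 0.7, 0.8, 0.9, 1.1, 1.2, 2.0, 2.1, 2.2, 2.3, 2.4]

bins_from_0 = modal_bin(data, origin=0.0, width=1.0)    # (2.0, 3.0, 5)
bins_from_half = modal_bin(data, origin=0.5, width=1.0) # (0.5, 1.5, 6)

print(bins_from_0, bins_from_half)
```

Either way the answer is an interval, not a point.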

S7. Hang on: who said that we are limited to using histograms? We can work from other estimates of the density function, such as kernel estimates.

C8. But watch out: The occurrence of peaks in density estimates can be an artefact of kernel shape (occasionally), kernel width (often), whether estimation is combined with transformation, or boundary rules when the support of the variable is bounded.
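The dependence on kernel width is easy to see with a hand-rolled Gaussian kernel estimate. This is only a sketch with made-up data and bandwidths; `kde` and `count_peaks` are illustrative helpers, not functions from any library:

```python
import math

def kde(x, data, bandwidth):
    """Gaussian kernel density estimate at x."""
    total = sum(math.exp(-0.5 * ((x - d) / bandwidth) ** 2) for d in data)
    return total / (len(data) * bandwidth * math.sqrt(2 * math.pi))

def count_peaks(data, bandwidth, lo, hi, steps=1300):
    """Count local maxima of the estimate on an evenly spaced grid."""
    xs = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    f = [kde(x, data, bandwidth) for x in xs]
    return sum(1 for i in range(1, steps) if f[i - 1] < f[i] > f[i + 1])

data = [0.0, 0.1, 0.2, 3.0, 3.1, 3.2]   # two tight clusters

narrow = count_peaks(data, bandwidth=0.5, lo=-5.0, hi=8.0)   # 2 peaks
wide = count_peaks(data, bandwidth=3.0, lo=-5.0, hi=8.0)     # 1 peak

print(narrow, wide)
```

Same data, different bandwidth, different number of modes.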

S9. Sometimes if a brand-name distribution is a good fit, then the mode can be identified directly using estimates of other parameters.
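As one illustration, and purely as an assumption about which distribution might fit, take the lognormal: if the logarithms of the data have mean $\mu$ and variance $\sigma^2$, the mode is $\exp(\mu - \sigma^2)$. A minimal sketch that plugs in the sample mean and sample variance of the logs (`lognormal_mode` is a hypothetical helper, and the data must be strictly positive):

```python
import math
from statistics import fmean, variance

def lognormal_mode(data):
    """Mode of a fitted lognormal, exp(mu - sigma^2), estimated from logs."""
    logs = [math.log(x) for x in data]   # requires all x > 0
    mu = fmean(logs)
    sigma2 = variance(logs)              # sample variance of the logs
    return math.exp(mu - sigma2)

# Logs are 0, 1, 2: mean 1 and sample variance 1, so the mode is exp(0) = 1.
data = [math.exp(0.0), math.exp(1.0), math.exp(2.0)]
estimate = lognormal_mode(data)
print(estimate)
```

The result is only as good as the fit of the assumed distribution.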

S10. A relatively simple half-sample mode algorithm often works quite well. Much more at "How to find the mode of a probability density function?"
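A sketch of one common formulation of the half-sample mode: sort the data, keep the shortest interval that contains half of the points, and repeat on that half until three or fewer points remain. The handling of the last few points and of ties (the first shortest window wins) is an assumption here, and implementations differ in such details:

```python
import math

def half_sample_mode(values):
    """Half-sample mode: repeatedly keep the densest half of the sorted data."""
    x = sorted(values)
    if not x:
        raise ValueError("no data")
    while len(x) > 3:
        h = math.ceil(len(x) / 2)
        # Start index of the shortest window covering h consecutive points.
        start = min(range(len(x) - h + 1), key=lambda i: x[i + h - 1] - x[i])
        x = x[start:start + h]
    if len(x) == 1:
        return x[0]
    if len(x) == 2:
        return (x[0] + x[1]) / 2
    # Three points left: average the closer pair, or take the middle on a tie.
    left_gap, right_gap = x[1] - x[0], x[2] - x[1]
    if left_gap < right_gap:
        return (x[0] + x[1]) / 2
    if right_gap < left_gap:
        return (x[1] + x[2]) / 2
    return x[1]

data = [1.0, 4.0, 4.5, 4.75, 5.25, 8.0, 12.0]
hsm = half_sample_mode(data)   # 4.625
print(hsm)
```

For these made-up data the densest half is 4.0 to 5.25 and the densest pair within it is 4.5 and 4.75, whose midpoint is returned.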

What is the big picture? Even statistically-minded researchers differ on how far modes are interesting or useful. When they are interesting or useful, it can be true that a rough indication of modes is good enough, and the other details can look like fussing.

Nick Cox