Suppose I want to improve the precision of an assay by averaging a few repeat measurements of each sample. The random error is generally normally distributed, but occasional outliers are possible due to failures in the assay process that can't be identified for exclusion by any means other than the result itself. Which average provides the better estimate, the mean or the median? (Keep in mind that what matters is which is more representative of the actual concentration of the analyte, not necessarily which is the best measure of central tendency of the numeric data.)
Suppose I test the same blood sample three times with a glucometer. If the results are 160, 21, and 159 mg/dL, obviously the 21 is an outlier caused by some kind of experimental failure, and the median of 159 is a far better estimate of the concentration than the mean of 113. However, what if the results are 149, 130, and 147, or 149, 112, and 147? Both sets of results are within the range of ordinary random error of the glucometer.
My inclination is that in both cases, the median of 147 is the best estimate. When two results are close together and the third sits several times that gap away from either of them, the true concentration is likely to be in the vicinity of the two close results; and of those two, the one nearer the distant result is the better estimate.
However, based on what I've read, for normally distributed data the median has a ~25.3% higher standard error than the mean. That figure is asymptotic; the penalty is smaller for small samples (for n = 3 the median's efficiency relative to the mean is about 0.74, which works out to roughly a 16% higher standard error), but it is still a penalty. So, by using the median, am I sacrificing precision for robustness?
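That small-sample penalty is easy to check by simulation. The sketch below (sample size, trial count, and seed are arbitrary choices for illustration) draws many samples of three standard-normal values and compares the empirical standard errors of the two averages:

```python
import random
import statistics

# Monte Carlo estimate of the ratio of the median's standard error to the
# mean's, for samples of n standard-normal values with no outliers.
def se_ratio(n, trials=200_000, seed=1):
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(trials):
        x = [rng.gauss(0, 1) for _ in range(n)]
        means.append(statistics.fmean(x))
        medians.append(statistics.median(x))
    return statistics.stdev(medians) / statistics.stdev(means)

# For n = 3 this lands near 1.16, below the asymptotic 1.253.
print(round(se_ratio(3), 2))
```

So at n = 3 the efficiency cost of the median, under clean normal error, is real but milder than the asymptotic figure suggests.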
For some applications, this may be a desirable trade-off. For example, if I'm using the averages to calibrate a continuous glucose monitor (CGM) from which I intend to draw aggregate data, a few poor calibrations could skew a substantial portion of the data, and the aggregate will be more reliable if I sacrifice a little precision with each calibration to limit the influence of outliers.
However, for statistical analysis, such as calculating the MAE and MAPE of the CGM by periodically comparing it to concurrent averaged meter results from the same subject, would I achieve greater precision using the means of the triplicate glucometer measurements? Or is my intuition correct that the median is generally a better estimate of the concentration (at least in cases where there is a substantial difference between median and mean), so that I'm better off using it to average the repeat measurements for all purposes? Would that change if I test each sample five or six times instead of three?
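The trade-off can be made concrete with a simulation under an assumed contamination model: each replicate is drawn from a normal around the true concentration, but with some probability it is replaced by a gross assay failure. All parameters here (true value 150 mg/dL, SD 8, the failure rate, and the uniform failure distribution) are illustrative assumptions, not measured glucometer behavior:

```python
import random
import statistics

# Root-mean-square error of an averaging estimator applied to n replicates,
# where each replicate fails (is replaced by a wild value) with probability p.
def rmse(estimator, n=3, p=0.03, trials=50_000, seed=2):
    rng = random.Random(seed)
    true = 150.0
    sq = 0.0
    for _ in range(trials):
        x = [rng.uniform(20, 400) if rng.random() < p   # assay failure
             else rng.gauss(true, 8.0)                  # ordinary random error
             for _ in range(n)]
        sq += (estimator(x) - true) ** 2
    return (sq / trials) ** 0.5

for p in (0.0, 0.02, 0.05):
    print(p, round(rmse(statistics.fmean, p=p), 2),
             round(rmse(statistics.median, p=p), 2))
```

Under this model the mean wins when p = 0, and the ordering flips once failures occur at even a few percent, which is the crossover the question is really about: the answer depends on how often the assay actually fails.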
[As a side note, I do find experimentally that if I test the same sample four times, the fourth result is closer to the median of the first three than to their mean slightly more often than not, and on average the difference between the fourth result and the median is smaller than its difference from the mean, in both absolute and relative terms. However, I'm not sure this finding is statistically significant, as it's not based on a large amount of data and the differences aren't very large.]
