I'm looking for a reconstruction error metric with the following properties:
- Robustness to sparsity: the error decreases in the presence of many zeros or small values (provided they are predicted correctly)
- Scale invariance: the error is unchanged when both the ground truth and the prediction are scaled by the same factor
- Robustness to outliers: the metric shouldn't respond 'strangely' to outliers (e.g. change a lot even though the predictions otherwise match)
The context is signal reconstruction; the real example is the magnitude of a spectrum (shown below). I've defined three metrics, each of which handles the cases below differently:
mad_mav = mean(abs(pred - true)) / mean(abs(true))
mad_rms = mean(abs(pred - true)) / sqrt(mean(true**2))
mar = mean(abs(pred) / abs(true)) # and set nans/infs to 0
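For concreteness, here is a runnable NumPy version of the three metrics, including the nan/inf handling used for `mar` (`pred` and `true` are assumed to be equal-length 1-D float arrays):

```python
import numpy as np

def mad_mav(pred, true):
    # mean absolute deviation, normalized by the mean absolute value of the reference
    return np.mean(np.abs(pred - true)) / np.mean(np.abs(true))

def mad_rms(pred, true):
    # mean absolute deviation, normalized by the RMS of the reference
    return np.mean(np.abs(pred - true)) / np.sqrt(np.mean(true**2))

def mar(pred, true):
    # mean absolute ratio; 0/0 and x/0 entries are set to 0
    with np.errstate(divide='ignore', invalid='ignore'):
        ratio = np.abs(pred) / np.abs(true)
    ratio[~np.isfinite(ratio)] = 0
    return np.mean(ratio)
```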
- Case 1: all data. Reference.
- Case 2: Data doubled. All metrics pass.
- Case 3: Outliers dropped. Both `mad_mav` and `mad_rms` seem to respond appropriately, but `mar` seems "overly robust".
- Case 4: A chunk of data that is large relative to the rest is dropped, turning its remainder into outliers. `mad_mav` responds to this 1.5x more strongly than `mad_rms`; hard to tell whether this is 'overreacting'.
- Case 5: All outliers dropped; now `pred` is consistently greater than `true`. `mad_rms` reacts 2.85x more strongly than `mad_mav`, and both increase by an order of magnitude. Again, not too clear which is 'better'.
- Case 6: Zero-padded by its own length; the error should drop, since half of all samples are now predicted perfectly. `mad_mav` doesn't care - bad. `mad_rms` drops a bit. `mar` drops, perhaps ideally, by half. (A rough sketch of these case constructions follows the list.)
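A rough sketch of how the cases can be set up (the outlier-region size `n` and the stand-in `pred` are illustrative, not my exact setup; the metric functions are the ones defined above):

```python
import numpy as np

true = np.load('data.npy')                              # reference spectrum magnitude
pred = true * (1 + 0.05 * np.random.randn(len(true)))   # stand-in for an actual reconstruction

n = 50  # illustrative size of the outlier region at the start of the array
cases = {
    'case2_doubled':     (2 * pred, 2 * true),              # scale both by the same factor
    'case3_no_outliers': (pred[n:], true[n:]),              # drop the outlier region
    'case6_zero_padded': (np.pad(pred, (0, len(pred))),     # zero-pad by own length;
                          np.pad(true, (0, len(true)))),    # padded half is predicted perfectly
}
for name, (p, t) in cases.items():
    print(name, mad_mav(p, t), mad_rms(p, t), mar(p, t))
```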
Throughout Cases 3-5, and in fact when sweeping k from 1 to 200 in `data[k:]`, `mar`'s estimate grows approximately linearly (or, more accurately, as a very flat parabola), which is strange.
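The sweep I mean, continuing the snippet above (illustrative):

```python
import numpy as np
import matplotlib.pyplot as plt

# Drop the first k samples and recompute mar, using `pred`, `true`,
# and the mar() function defined earlier.
ks = np.arange(1, 201)
mar_vs_k = [mar(pred[k:], true[k:]) for k in ks]

plt.plot(ks, mar_vs_k)
plt.xlabel('k (leading samples dropped)')
plt.ylabel('mar')
plt.show()
```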
Is there a metric that handles these cases "better", as per the comments above? `data.npy`/`data.mat` are attached for testing.

`mad_rms` is pretty close but disappoints with Case 6, in which `mar` excels; frankly, though, I'm unsure whether `mar` does great or very poorly everywhere else by my own criterion. – OverLordGoldDragon Nov 05 '20 at 19:35