I'm looking for a reconstruction error metric with the following properties:
- Robustness to sparsity: the error decreases in the presence of many zeros or small values (provided they are predicted correctly)
- Scale invariance: the error is unchanged when both the ground truth and the prediction are scaled by the same factor
- Robustness to outliers: the metric shouldn't respond 'strangely' to outliers (e.g. change a lot even though the predictions otherwise match)
The context is signal reconstruction; the real example is the magnitude of a spectrum (shown below). I've defined three metrics, each of which handles the cases below differently:
mad_mav = mean(abs(pred - true)) / mean(abs(true))
mad_rms = mean(abs(pred - true)) / sqrt(mean(true**2))
mar = mean(abs(pred) / abs(true)) # and set nans/infs to 0
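For concreteness, here is a runnable NumPy version of the three metrics, including the nan/inf handling used for `mar` (`pred` and `true` are assumed to be equal-length 1-D float arrays):

```python
import numpy as np

def mad_mav(pred, true):
    # mean absolute deviation, normalized by the mean absolute value of the reference
    return np.mean(np.abs(pred - true)) / np.mean(np.abs(true))

def mad_rms(pred, true):
    # mean absolute deviation, normalized by the RMS of the reference
    return np.mean(np.abs(pred - true)) / np.sqrt(np.mean(true**2))

def mar(pred, true):
    # mean absolute ratio; 0/0 and x/0 entries are set to 0
    with np.errstate(divide='ignore', invalid='ignore'):
        ratio = np.abs(pred) / np.abs(true)
    ratio[~np.isfinite(ratio)] = 0
    return np.mean(ratio)
```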
- Case 1: all data. Reference.
- Case 2: Data doubled. All metrics pass.
- Case 3: Outliers dropped. Both `mad_mav` and `mad_rms` seem to respond appropriately, but `mar` seems "overly robust".
- Case 4: A chunk of data that is large relative to the rest is dropped, turning its remainder into outliers. `mad_mav` responds to this 1.5x more strongly than `mad_rms`; hard to tell whether this is 'overreacting'.
- Case 5: All outliers dropped; now `pred` is consistently greater than `true`. `mad_rms` reacts 2.85x more strongly than `mad_mav`, and both increase by an order of magnitude. Again, not too clear which is 'better'.
- Case 6: Zero-padded by its own length; the error should drop, since half of all samples are now predicted perfectly. `mad_mav` doesn't care - bad. `mad_rms` drops a bit. `mar` drops, perhaps ideally, by half. (A rough sketch of these case constructions follows the list.)
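A rough sketch of how the cases can be set up (the outlier-region size `n` and the stand-in `pred` are illustrative, not my exact setup; the metric functions are the ones defined above):

```python
import numpy as np

true = np.load('data.npy')                              # reference spectrum magnitude
pred = true * (1 + 0.05 * np.random.randn(len(true)))   # stand-in for an actual reconstruction

n = 50  # illustrative size of the outlier region at the start of the array
cases = {
    'case2_doubled':     (2 * pred, 2 * true),              # scale both by the same factor
    'case3_no_outliers': (pred[n:], true[n:]),              # drop the outlier region
    'case6_zero_padded': (np.pad(pred, (0, len(pred))),     # zero-pad by own length;
                          np.pad(true, (0, len(true)))),    # padded half is predicted perfectly
}
for name, (p, t) in cases.items():
    print(name, mad_mav(p, t), mad_rms(p, t), mar(p, t))
```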
Throughout Cases 3-5, and in fact when sweeping k from 1 to 200 in `data[k:]`, `mar`'s estimate grows approximately linearly (or, more accurately, as a very flat parabola), which is strange.
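The sweep I mean, continuing the snippet above (illustrative):

```python
import numpy as np
import matplotlib.pyplot as plt

# Drop the first k samples and recompute mar, using `pred`, `true`,
# and the mar() function defined earlier.
ks = np.arange(1, 201)
mar_vs_k = [mar(pred[k:], true[k:]) for k in ks]

plt.plot(ks, mar_vs_k)
plt.xlabel('k (leading samples dropped)')
plt.ylabel('mar')
plt.show()
```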
Is there a metric that handles these cases "better", as per the comments above? `data.npy`/`data.mat` are attached for testing.

`mad_rms` is pretty close but disappoints with Case 6, in which `mar` excels; frankly, though, I'm unsure whether `mar` does great or very poorly everywhere else by my own criterion. – OverLordGoldDragon Nov 05 '20 at 19:35