
I have two questions.

1.

I know that in predictive analytics contests, when faced with yes/no problems scored by the absolute cost function

$f(x) = \frac{1}{n}\sum_{i=1}^n\lvert x_i-\hat x_i\rvert$

the best approach is to predict

$1 \text{ if } \hat x_i \ge 0.5$

or

$0 \text{ if } \hat x_i < 0.5$

Does anyone know a nice formal proof of this?

2.

In general, do you know of any readable articles on the best approach for different cost functions (such as MSE and so on)?

1 Answer

It is well known that, in general, squared-error loss is minimized by the mean, absolute loss by the median, and zero-one loss by the mode. For a variate taking the values 1 or 0, with the probability of 1 being $p = 1 - q$, both the median and the mode are 0 if $q > p$ and 1 if $p > q$. For example, if $p = 0.7 > 0.5$, the median is 1, so predicting 1 minimizes the absolute loss you specified. Subscripts are omitted for simplicity.
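A quick numerical check can make this concrete. The following is only a sketch under stated assumptions: the sample size, the seed, the grid of candidate predictions, and all function names are made up for illustration, and $p = 0.7$ is taken from the example above.

```python
import random

def average_loss(samples, c, loss):
    """Mean of loss(x, c) over all samples, for a constant prediction c."""
    return sum(loss(x, c) for x in samples) / len(samples)

def best_constant(samples, candidates, loss):
    """Candidate prediction with the smallest average loss."""
    return min(candidates, key=lambda c: average_loss(samples, c, loss))

def squared(x, c):
    return (x - c) ** 2

def absolute(x, c):
    return abs(x - c)

def zero_one(x, c):
    return 0.0 if x == c else 1.0

random.seed(0)  # arbitrary seed, only for reproducibility
p = 0.7         # assumed probability of a 1
samples = [1.0 if random.random() < p else 0.0 for _ in range(5000)]

grid = [i / 100 for i in range(101)]  # candidate predictions 0.00 .. 1.00
sample_mean = sum(samples) / len(samples)

c_sq = best_constant(samples, grid, squared)          # grid point nearest the sample mean
c_abs = best_constant(samples, grid, absolute)        # the sample median
c_01 = best_constant(samples, [0.0, 1.0], zero_one)   # the sample mode

print(sample_mean, c_sq, c_abs, c_01)
```

With most of the samples equal to 1, the absolute-loss and zero-one minimizers both land on 1, while the squared-error minimizer stays close to the sample mean.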

As for a formal proof that the median minimizes the absolute loss function, see, for example, the paper below (for non-negative random variables) and the references therein for other related proofs. As for a readable article on this topic, see modes-medians-and-means-an-unifying-perspective.

Adell, J. & Jodrá, P. The median of the Poisson distribution. Metrika, Springer, 2005, 61, 337-346.
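For the binary case in the question, the argument is short enough to sketch directly. The notation here is an assumption of this sketch: $X$ is the 0/1 outcome with $P(X = 1) = p$, and $c \in [0, 1]$ is a constant prediction. The expected absolute loss is

$E\lvert X - c\rvert = p\,(1 - c) + (1 - p)\,c = p + (1 - 2p)\,c$

This is linear in $c$ with slope $1 - 2p$. If $p > 0.5$ the slope is negative, so the loss is smallest at $c = 1$; if $p < 0.5$ the slope is positive, so it is smallest at $c = 0$; if $p = 0.5$ every $c$ in $[0, 1]$ gives the same loss of $0.5$. Predictions outside $[0, 1]$ only increase the loss, so nothing is lost by the restriction. For comparison, the expected squared error is

$E(X - c)^2 = p\,(1 - c)^2 + (1 - p)\,c^2$

whose derivative in $c$ is $2c - 2p$, so it is minimized at $c = p$, the mean.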

Hibernating
  • Thank you for your answer, that's new knowledge for me. – Kulawy Krul Dec 24 '13 at 10:44
  • Yet, while it is intuitive why the mode minimizes zero-one loss, for the pairs median/absolute loss and mean/MSE it is harder to see why. It's a shame that John Myles White didn't prove it in his really good and readable article! And as the Springer article you posted is paywalled, it's hard for me to get it... – Kulawy Krul Dec 24 '13 at 10:50
  • See also the bottom of the second page of http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.34.4144&rep=rep1&type=pdf and page 67 of http://books.google.com.au/books?id=Zf0gCwxC9ocC&pg=PA67#v=onepage&q&f=false for the relevant material. – Hibernating Dec 24 '13 at 11:26