I would like to understand what difference it makes whether I use, for example, Mean Squared Error or Poisson Deviance as the error metric/loss function for a regression on count data. Are there any a priori or theoretical reasons to prefer one metric over the other?
Background
I have a large dataset of count data (i.e. numbers of events and exposures) depending on various covariates. I would like to fit various models (classical GLMs as well as machine learning models) to this dataset and assess their quality. My main goal is to make point predictions of the conditional means of the distribution given the covariates. The corresponding rates are typically small (say, between 0 and 10%), and in many cases there are zero observed events. I do not have any domain-specific reason, such as a cost function, to choose my error metric, and I am not (yet) particularly interested in inference. At the moment, I just want a "best" prediction, whatever that means.
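For concreteness, here is a minimal sketch of the kind of setup I have in mind, using simulated data and a Poisson GLM with an exposure term fitted via statsmodels (the column names, coefficients, and rate structure are made up purely for illustration):

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 1_000

# Simulated stand-in for the data described above:
# covariates, an exposure, and a small event rate (many zero counts).
df = pd.DataFrame({
    "x1": rng.normal(size=n),
    "x2": rng.normal(size=n),
    "exposure": rng.uniform(1, 100, size=n),
})
true_rate = np.exp(-3.0 + 0.3 * df["x1"] - 0.2 * df["x2"])
df["events"] = rng.poisson(true_rate * df["exposure"])

X = sm.add_constant(df[["x1", "x2"]])
# exposure= includes log(exposure) as an offset under the default log link
model = sm.GLM(df["events"], X, family=sm.families.Poisson(),
               exposure=df["exposure"]).fit()

# Point predictions of the conditional mean number of events
df["pred_events"] = model.predict(X, exposure=df["exposure"])
```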
As I understand it, I should choose a "proper scoring function" for the conditional mean. I further understand that both Mean Squared Error and Poisson Deviance fulfil this requirement.
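To make the comparison concrete, this is roughly how I would compute both metrics on predicted conditional means, using scikit-learn's mean_squared_error and mean_poisson_deviance (the arrays are placeholder values, not my actual data):

```python
import numpy as np
from sklearn.metrics import mean_squared_error, mean_poisson_deviance

y_true = np.array([0, 0, 1, 2, 0, 3, 1])          # observed counts
y_pred = np.array([0.1, 0.3, 0.8, 1.9, 0.2, 2.5, 1.1])  # predicted means (must be > 0 for the deviance)

mse = mean_squared_error(y_true, y_pred)
# Mean Poisson deviance: (2/n) * sum(y*log(y/mu) - y + mu), with y*log(y/mu) = 0 when y = 0
dev = mean_poisson_deviance(y_true, y_pred)

print(f"MSE:              {mse:.4f}")
print(f"Poisson deviance: {dev:.4f}")
```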
My questions
- Beyond being proper scoring functions for the mean, are there any other requirements I should be aware of?
- Are there other relevant metrics which could (or should) be used for assessing the quality of predictions?
- What might be reasons to prefer one error metric, such as Poisson Deviance or Mean Squared Error, over another?