
I am currently trying to find 'price outliers' in a set of data. Given a set of product prices within a product category (e.g. milk), I want to flag prices that are suspicious. For example, given the milk prices $\{1, 1.5, 10, 0.1\}$, I want to flag $10$ and $0.1$. The problem is that, depending on the product category, the price range within it can be very wide. Take fish: a tin of tuna and a tin of caviar differ enormously in price even after converting both to a price per kg.

So my question is whether anyone knows a (mathematical) model that describes pricing among 'similar' products. I am more interested in relative statements about prices than in absolute ones. For example, I am not interested in milk at 1.5 Euro per liter versus milk at 1 Euro per liter; I am interested in the range that prices can plausibly span, with the aim of finding suspicious prices.

So far I have worked with the empirical mean and standard deviation: I assumed that prices are log-normally distributed, computed the 2-sigma interval, and flagged every price outside that interval. This gives more or less good results but is not really satisfying.
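For concreteness, here is a minimal sketch of that filter. It assumes the 2-sigma interval is computed on the logarithms of the prices (which is what log-normality suggests) and that the sample standard deviation is used. The function name `two_sigma_outliers`, the parameter `k`, and the milk prices in the example are made up for illustration.

```python
import math
import statistics


def two_sigma_outliers(prices, k=2.0):
    """Return the prices whose logarithm lies outside mean +/- k * std of the log-prices."""
    # Log-normal assumption: log(price) is treated as normally distributed.
    # Prices must be strictly positive for the logarithm to exist.
    logs = [math.log(p) for p in prices]
    mu = statistics.mean(logs)
    sigma = statistics.stdev(logs)  # sample standard deviation (n - 1 in the denominator)
    lower, upper = mu - k * sigma, mu + k * sigma
    return [p for p, x in zip(prices, logs) if not (lower <= x <= upper)]


# Hypothetical per-liter milk prices, invented for this example.
milk = [0.95, 1.0, 1.05, 1.1, 1.2, 1.0, 0.9, 1.15, 1.5, 1.05, 10.0]
print(two_sigma_outliers(milk))  # flags only the price 10.0
```

Note that on the four-price example $\{1, 1.5, 10, 0.1\}$ this filter flags nothing, because the two extreme prices inflate the standard deviation themselves. That is one way in which the approach is unsatisfying on small samples.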

luchonacho
Abbraxas
  • This paper seems to be a good entry point to the 'more mathematical' approach to the problem you are describing. – sen_saven Aug 14 '17 at 12:05
  • Do you have any additional information about how "similar" the products are? If you have something quantifiable, perhaps you could build a stochastic frontier. See this survey, for example: http://pages.stern.nyu.edu/~wgreene/FrontierModeling/SurveyPapers/LIMDEP-Chapter33.pdf – kitsune Aug 14 '17 at 17:00
  • @sen_saven So I've read the paper and, as far as I can tell, they aggregate the data for a retailer and test for noticeable price changes. As you said, it seems like a good entry point, but I am not sure whether it is applicable in my case, as I am interested in the price ranges within a given product group and not in price changes within an aggregated group of products. Please correct me if I am wrong. But thanks anyway, I will try to track down the papers mentioned and see if I can find anything. – Abbraxas Aug 15 '17 at 07:20
  • @kitsune Unfortunately no, I do not have a quantifiable characteristic. I'll look into the frontier modeling survey and give feedback. – Abbraxas Aug 15 '17 at 07:21
  • @kitsune So I looked into it and I am not quite sure where to start. You suggested the stochastic frontier; could you explain how it would be applicable in my case? – Abbraxas Aug 15 '17 at 08:23
  • Seems like it can be filtered using simple conditioning. Do you want straight-up mathematical equations, or do you want some code that will do it for you? – EconJohn Aug 16 '17 at 04:26
  • @EconJohn Hi, sorry for answering this late, I had a lot to do at work... I am a mathematician, so equations would always be nice for me. If you have a model and could share it or point me to a source, I would be very thankful. – Abbraxas Aug 23 '17 at 13:54

0 Answers