
I have credit card transaction data, and I'd like to detect anomalies within customers' transactions.

When I have a customer's historical data, I can use multiple algorithms to detect anomalous behavior. However, I'm having trouble detecting anomalies in the transactions of "new" customers (not only customers making their first transaction, but also those with just a few more).

Are there practical ways to handle this problem? Maybe define typical "new"-customer behavior and compare each transaction of a "new" customer to it?

Thank you in advance.

staove7
  • If your question has been answered to your satisfaction, you can accept an answer by clicking the check mark under the voting arrows. – Kodiologist Jul 18 '17 at 16:22

1 Answer


You may have to bite the bullet and accept that your ability to detect anomalies will be limited at the beginning of a customer's history. You don't know what's normal for them yet; hence, you can't in good conscience declare something abnormal.

However, consider that, if I'm not mistaken, your goal isn't merely to detect abnormal transactions, but fraudulent ones. So you're looking for a particular type of abnormal transaction rather than just any old abnormal transaction. Perhaps you can use data on other customers to characterize fraudulent transactions in general, and construct a model that can, with a useful degree of accuracy, identify fraudulent transactions with few or no known-normal transactions to compare them to.
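
To make the pooled-model idea concrete, here is a minimal sketch, not a production fraud model. It assumes you have fraud labels for past transactions across all customers; the toy rows, the two features (amount and a gift-card flag), and the helper names `train` and `fraud_score` are all hypothetical choices for illustration. It fits a small logistic regression by plain gradient descent using only the standard library.

```python
import math

# Hypothetical labeled transactions pooled across ALL customers.
# Each row: (amount in dollars, is_gift_card purchase, is_fraud label).
LABELED = [
    (12.0, 0, 0), (45.0, 0, 0), (80.0, 0, 0), (23.0, 0, 0),
    (150.0, 0, 0), (25.0, 1, 0), (60.0, 0, 0), (30.0, 0, 0),
    (500.0, 1, 1), (400.0, 1, 1), (900.0, 0, 1), (350.0, 1, 1),
]

def features(amount, is_gift_card):
    # Bias term, log10 amount centered at $100, and the gift-card flag.
    return [1.0, math.log10(amount) - 2.0, float(is_gift_card)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(rows, epochs=5000, lr=0.1):
    # Batch gradient descent on the logistic loss.
    w = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        grad = [0.0, 0.0, 0.0]
        for amount, gift, label in rows:
            x = features(amount, gift)
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x))) - label
            for j in range(3):
                grad[j] += err * x[j]
        w = [wi - lr * g / len(rows) for wi, g in zip(w, grad)]
    return w

def fraud_score(w, amount, is_gift_card):
    # Estimated probability of fraud; needs no history for this customer.
    x = features(amount, is_gift_card)
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))

WEIGHTS = train(LABELED)
```

Because the score depends only on the transaction itself, `fraud_score(WEIGHTS, 600.0, 1)` can be computed for a customer's very first purchase. In practice you would want many more features and a proper library, and fraud labels are usually rare and noisy, so treat this only as an outline of the approach.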

Another option is to try to characterize normal transactions on the basis of data like ZIP code and the customer's age, which you'll have even for new customers.
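
As a rough sketch of that second option: build a baseline per demographic cohort from established customers, then compare a new customer's transaction to the baseline of the cohort they fall into. Everything specific here is an assumption for illustration: the toy history, the cohort definition (exact ZIP code plus a 10-year age band), and the use of median and MAD with the conventional 0.6745 scaling as the robust score.

```python
from collections import defaultdict
from statistics import median

# Hypothetical history from established customers: (zip_code, age, amount).
HISTORY = [
    ("10001", 24, 18.0), ("10001", 27, 25.0), ("10001", 22, 30.0),
    ("10001", 29, 22.0), ("10001", 25, 40.0),
    ("94105", 52, 120.0), ("94105", 58, 200.0), ("94105", 55, 150.0),
    ("94105", 51, 180.0), ("94105", 59, 90.0),
]

def cohort_key(zip_code, age):
    # Assumed cohort definition: exact ZIP code plus a 10-year age band.
    return (zip_code, age // 10 * 10)

def build_baselines(history):
    # Map each cohort to (median amount, median absolute deviation).
    amounts = defaultdict(list)
    for zip_code, age, amount in history:
        amounts[cohort_key(zip_code, age)].append(amount)
    baselines = {}
    for key, values in amounts.items():
        med = median(values)
        mad = median(abs(v - med) for v in values)
        baselines[key] = (med, mad)
    return baselines

def robust_z(baselines, zip_code, age, amount):
    # Robust z-score of a transaction relative to the customer's cohort.
    # Returns None when the cohort is unknown or has zero spread.
    key = cohort_key(zip_code, age)
    if key not in baselines:
        return None
    med, mad = baselines[key]
    if mad == 0:
        return None
    return 0.6745 * (amount - med) / mad

BASELINES = build_baselines(HISTORY)
```

With this toy data, a $150 purchase is far outside the baseline for the younger "10001" cohort but sits exactly at the median of the older "94105" cohort, so the same amount is unusual for one new customer and ordinary for another. A cutoff such as an absolute score above 3.5 is a common rule of thumb, not something derived from this data.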

Kodiologist
  • Thank you very much for your answer. I'm just trying to understand: what do you mean by "use data on other customers to characterize fraudulent transactions in general"? How do you suggest doing that? Is it the same as comparing a "new" data point with a (random) subset of points from historical "new" customers? – staove7 May 28 '17 at 20:14
  • @staove7 No, I mean using the data you have across all customers to train a model to distinguish fraudulent from legitimate transactions. For example, perhaps fraudulent transactions are more likely to involve buying large-denomination gift cards than legitimate transactions. – Kodiologist May 28 '17 at 22:40
  • If I understand correctly, you're talking about making it a supervised model. I'm just curious: is there no way of finding anomalous behavior without historical evidence? For example, suppose I'd like to estimate who is going to need medical treatment during a marathon, and (for simplicity) I'm monitoring the runners' heartbeats. In this case, I'd like to flag those with a "different" heartbeat from the others, assuming they will need to be treated. How can I find the "normal" heartbeat in order to detect those with an abnormal heartbeat? – staove7 May 29 '17 at 03:15
  • @staove7 You can, e.g., graph all the heartbeats and look for unusually high or low values. But heartbeat is more consistent between different people than credit-card usage data is. People buy different things even when all transactions are legitimate. – Kodiologist May 29 '17 at 17:50
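
For the marathon example in the comments, "look for unusually high or low values" can be done with no history at all by comparing each runner to the rest of the field. The sketch below uses Tukey's fences (1.5 times the interquartile range) as one possible rule; the readings, runner IDs, and the function name `tukey_outliers` are made up for illustration.

```python
from statistics import quantiles

# Hypothetical heart rates (beats per minute) for runners at one checkpoint.
HEART_RATES = {
    "r01": 150, "r02": 155, "r03": 160, "r04": 158,
    "r05": 162, "r06": 149, "r07": 157, "r08": 153,
    "r09": 161, "r10": 156, "r11": 205, "r12": 95,
}

def tukey_outliers(readings, k=1.5):
    # Flag readings outside [Q1 - k*IQR, Q3 + k*IQR], computed from the
    # whole group rather than from any individual's past data.
    q1, _, q3 = quantiles(readings.values(), n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return {rid: bpm for rid, bpm in readings.items() if bpm < low or bpm > high}

FLAGGED = tukey_outliers(HEART_RATES)
```

This works here because most runners cluster in a narrow band. As noted above, credit-card usage varies far more between legitimate customers, so the same cross-sectional rule would likely flag many ordinary transactions.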