Data collection after the model is built and deployed

Asked Jun 12 '22 at 10:55

Active Jun 12 '22 at 10:55


I have built a machine learning model that predicts whether a customer will buy a product. The model performs well under cross-validation. I now plan to deploy it in production and recommend the product to customers for whom the model predicts a high likelihood of buying.

I will continue to collect data to further improve the model. I suspect that collecting only the actions (bought or not bought) of the users to whom the product was recommended will introduce bias: the model itself decides who gets a recommendation, so the new data is no longer a random sample, which seems to violate the i.i.d. assumption. My general question is: how should I continue collecting data once the model is in production? Should I collect some of it at random, for example by recommending the product to randomly chosen users?
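To make the "random users" idea concrete, here is a minimal sketch of one way it could look, assuming an epsilon-greedy style policy: most decisions follow the model, but a small fraction `epsilon` is decided by a coin flip. The names `decide`, `collect`, `epsilon`, `threshold`, and the `scores` dictionary are all hypothetical and not part of my actual system. The sketch also logs the probability with which each user was recommended the product, on the assumption that this could later be used to reweight the data.

```python
import random


def decide(score, epsilon=0.1, threshold=0.5, rng=random):
    """Return (recommend, p_recommend) for one user.

    score: the model's predicted purchase probability (hypothetical input).
    p_recommend: probability that this mixed policy recommends the product.
    """
    # Model-driven part: recommend only when the score clears the threshold.
    greedy = 1.0 if score >= threshold else 0.0
    # Exploration part: with probability epsilon, recommend on a fair coin flip.
    p_recommend = (1 - epsilon) * greedy + epsilon * 0.5
    recommend = rng.random() < p_recommend
    return recommend, p_recommend


def collect(scores, epsilon=0.1, threshold=0.5, seed=0):
    """Apply the policy to every user and log the decision with its probability."""
    rng = random.Random(seed)
    log = []
    for user_id, score in scores.items():
        recommend, p = decide(score, epsilon, threshold, rng)
        log.append({"user": user_id, "recommended": recommend, "p_recommend": p})
    return log


if __name__ == "__main__":
    # Made-up scores, for illustration only.
    example_scores = {"u1": 0.92, "u2": 0.15, "u3": 0.55}
    for row in collect(example_scores, epsilon=0.2):
        print(row)
```

With `epsilon=0` this reduces to the deployment I described above (only high-scoring users ever see the recommendation). With `epsilon > 0` every user has a nonzero chance of being recommended the product, and the logged `p_recommend` could in principle be used as an inverse-probability weight when retraining. Whether that is the right approach is part of what I am asking.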


Sanyo Mn


You could look into active learning; see, for instance, https://stats.stackexchange.com/questions/422186/motivations-for-experiment-design-in-statistical-learning/422518#422518. Consider adding the tag [tag:active-learning] – kjetil b halvorsen Aug 16 '22 at 18:00


0 Answers