Reverse Survivorship Bias

Asked Dec 14 '21 at 22:59

Goal:

I have car accident data with geo-positions and would like to create a model to predict hotspots due to specific influence factors or features.

Problem:

To validate the results I want to create a test set. Since I only have accidents and no samples of accident-free car rides, I thought about creating the accident-free samples artificially. Unfortunately, I have no traffic-density data for specific roads.

Current Approach:

Therefore I thought about two ways of approaching this:

1. Use the geo-positions of the accidents that occurred and pick the other features randomly, to keep the distribution intact.
2. Create random samples at random geo-positions (within the road network) with random features.
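
The two options can be sketched roughly as follows. This is only an illustration under assumptions that are not part of my data description: accidents are dicts with hypothetical `lat`/`lon` keys plus feature keys, the road network is a list of straight segments given by their endpoints, and `label = 0` marks an artificial "no accident" sample. The function names are made up for this sketch.

```python
import math
import random


def resample_features_at_accident_sites(accidents, feature_names, n_samples, seed=0):
    """Option 1: keep real accident positions, draw every other feature
    independently from the values observed in the accident data."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        site = rng.choice(accidents)
        # label 0 = artificial "no accident" sample (assumed convention)
        sample = {"lat": site["lat"], "lon": site["lon"], "label": 0}
        for name in feature_names:
            # Independent draw per feature: keeps each marginal distribution,
            # but breaks any correlation between features.
            sample[name] = rng.choice(accidents)[name]
        samples.append(sample)
    return samples


def sample_on_road_network(segments, feature_values, n_samples, seed=0):
    """Option 2: random positions on the road network with random features.
    Segments are picked proportionally to their length, which implicitly
    assumes equal traffic per unit of road length."""
    rng = random.Random(seed)
    # Planar approximation of segment length in degrees; fine for a sketch,
    # not for large areas.
    lengths = [math.dist(start, end) for start, end in segments]
    samples = []
    for _ in range(n_samples):
        (lat1, lon1), (lat2, lon2) = rng.choices(segments, weights=lengths, k=1)[0]
        t = rng.random()  # relative position along the chosen segment
        sample = {
            "lat": lat1 + t * (lat2 - lat1),
            "lon": lon1 + t * (lon2 - lon1),
            "label": 0,
        }
        for name, values in feature_values.items():
            sample[name] = rng.choice(values)
        samples.append(sample)
    return samples
```

Both variants need some stand-in for exposure: option 1 reuses the accident locations themselves, while option 2 weights by road length because no traffic density is available. Either choice is a possible source of the bias I am asking about.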

Question:

Is there a way to create artificial samples for this that introduces less bias?

Andreas
