Suppose we have a data set of how long someone watches YouTube, where each data point is only the raw number of minutes watched. It is known that some of those data points come from sessions where the viewer watched for a while, then left the room and let the videos keep playing. How could we statistically determine a cutoff beyond which a session is too long, meaning the videos were playing with no one present? Additionally, how many minutes should be taken off such an observation to obtain a truer viewing time?
My idea is to use the outlier rule, i.e. treat anything more than 1.5 × IQR above the third quartile as having been played without anyone there. Additionally, perhaps it would make sense to use a hypothesis test, at a given confidence level, of whether a value is significantly different from the average.
For the reduction in time, would the box plot imply using the upper bound (the upper fence) as the value that replaces every outlier?
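
To make the idea concrete, here is a rough Python sketch of what I have in mind. The viewing times are made up for illustration, the function names `upper_fence` and `cap_at_fence` are my own, and the quartiles use the default method of the standard library's `statistics.quantiles`, so other quartile conventions would give a slightly different fence. Capping at the fence is only the replacement rule I am asking about, not something I know to be correct.

```python
from statistics import quantiles

def upper_fence(minutes, k=1.5):
    """Return Q3 + k * IQR, the upper box plot fence."""
    q1, _median, q3 = quantiles(minutes, n=4)  # quartile cut points
    return q3 + k * (q3 - q1)

def cap_at_fence(minutes, k=1.5):
    """Replace every value above the upper fence with the fence itself."""
    fence = upper_fence(minutes, k)
    return [min(m, fence) for m in minutes]

# Hypothetical session lengths in minutes; the last one looks unattended.
sessions = [12, 25, 30, 35, 40, 45, 50, 55, 60, 480]

fence = upper_fence(sessions)
flagged = [m for m in sessions if m > fence]  # sessions deemed unattended
capped = cap_at_fence(sessions)               # flagged values pulled down to the fence

print("upper fence:", fence)
print("flagged:", flagged)
print("capped:", capped)
```

With this rule only the 480-minute session is flagged, and it is replaced by the fence value while every other observation is left unchanged.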