I have been tasked with estimating approximately how many people were in a location at a given time. For each arrival hour, I know how many people arrived, how many people left, and the average time someone who arrived in that hour stayed. The data looks like this:
arr hourly_arrival avg_time_here hourly_leaving
0 1631 7 2575
1 1294 7 2434
2 1135 6 2248
3 930 6 2011
4 878 7 1803
5 856 7 1619
6 1152 6 1603
7 1710 6 1091
8 2354 6 1487
9 3153 5 1605
10 3652 6 1913
11 3873 6 2220
12 3901 6 2642
13 3766 6 2983
14 3623 6 3355
15 3672 6 3515
16 3613 6 3607
17 3644 6 3672
18 3735 6 3599
19 3654 6 3343
20 3423 6 3702
21 3072 6 3832
22 2675 6 3595
23 2124 6 3092
Hour 0 is midnight and hour 23 is 11pm. What is the best way to go about this? Here is what I have done so far, but it does not seem right.
I took the hourly_arrival column and divided each row by 365.25, since the data covers a full year, to get the average number of arrivals per day for each hour. The avg_time_here column is the average number of hours someone stays, given the hour they arrived. For hourly_leaving, I did the same as for hourly_arrival.
avg_hrl_arr becomes
avg_hrl_arr = hourly_arrival / 365.25
avg_hrl_leave becomes
avg_hrl_leave = hourly_leaving / 365.25
Then, for each hour, I computed avg_people_here as
avg_people_here = avg_hrl_arr * avg_time_here
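To make the steps above concrete, here is a minimal Python sketch of exactly that calculation, using the table as given. The names rows, DAYS and hourly_estimates are just placeholders I picked for illustration; the three formulas are the ones written out above. It only reproduces my current approach, it is not meant as the correct answer.

```python
# Rows copied from the table above:
# (arr, hourly_arrival, avg_time_here, hourly_leaving)
rows = [
    (0, 1631, 7, 2575), (1, 1294, 7, 2434), (2, 1135, 6, 2248),
    (3, 930, 6, 2011), (4, 878, 7, 1803), (5, 856, 7, 1619),
    (6, 1152, 6, 1603), (7, 1710, 6, 1091), (8, 2354, 6, 1487),
    (9, 3153, 5, 1605), (10, 3652, 6, 1913), (11, 3873, 6, 2220),
    (12, 3901, 6, 2642), (13, 3766, 6, 2983), (14, 3623, 6, 3355),
    (15, 3672, 6, 3515), (16, 3613, 6, 3607), (17, 3644, 6, 3672),
    (18, 3735, 6, 3599), (19, 3654, 6, 3343), (20, 3423, 6, 3702),
    (21, 3072, 6, 3832), (22, 2675, 6, 3595), (23, 2124, 6, 3092),
]

DAYS = 365.25  # the counts cover one year


def hourly_estimates(rows, days=DAYS):
    """Apply the three formulas from the post to every hour (hypothetical helper)."""
    out = []
    for hour, hourly_arrival, avg_time_here, hourly_leaving in rows:
        avg_hrl_arr = hourly_arrival / days      # average arrivals in this hour, per day
        avg_hrl_leave = hourly_leaving / days    # average departures in this hour, per day
        avg_people_here = avg_hrl_arr * avg_time_here
        out.append((hour, avg_hrl_arr, avg_hrl_leave, avg_people_here))
    return out


estimates = hourly_estimates(rows)

if __name__ == "__main__":
    print("hour  avg_arr  avg_leave  avg_here")
    for hour, arr, leave, here in estimates:
        print(f"{hour:4d} {arr:8.2f} {leave:10.2f} {here:9.2f}")
```

Note that avg_hrl_leave is computed but never used in avg_people_here, which is part of why the result does not seem right to me.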