I have a data frame that looks like this:
head(d)
SW_ID test_date Public_Positive Public_Total Private_Positive Private_Total Tested_Positive Tested_Total casedate
7344 36067NY00270811700 8/31/20 1 65 0 0 1 65 2020-08-31
7345 36067NY00270811700 5/8/20 1 24 0 0 1 24 2020-05-08
7346 36067NY00270811700 7/5/20 1 11 0 0 1 11 2020-07-05
7347 36067NY00270811700 8/19/20 0 108 0 0 0 108 2020-08-19
7348 36067NY00270811700 4/11/20 0 4 0 0 0 4 2020-04-11
7349 36067NY00270811700 4/29/20 1 11 0 0 1 11 2020-04-29
County POP2020
7344 Onondaga 16260
7345 Onondaga 16260
7346 Onondaga 16260
7347 Onondaga 16260
7348 Onondaga 16260
7349 Onondaga 16260
I want to sum Tested_Positive across all dates for each SW_ID and store the result in a new variable called "total_positive". I then want to divide that variable by POP2020 and multiply by 100,000 to get the incidence rate. I believe I can compute the rate with d$incidence <- d$total_positive / d$POP2020 * 100000, but I am unsure how to aggregate the sum across dates within each SW_ID. Please advise.
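For reference, one way this could be done in base R is sketched below. It assumes POP2020 is constant within each SW_ID, as in the rows shown. The toy data frame reuses the column names from the question, but its values and the IDs "A" and "B" are made up for illustration.

```r
# Toy data: column names follow the question; the values are hypothetical.
d <- data.frame(
  SW_ID           = c("A", "A", "A", "B", "B"),
  Tested_Positive = c(1, 0, 1, 2, 3),
  POP2020         = c(16260, 16260, 16260, 50000, 50000)
)

# ave() applies FUN within each SW_ID group and returns a vector as long as
# the input, so every row receives its group's total.
d$total_positive <- ave(d$Tested_Positive, d$SW_ID, FUN = sum)

# Incidence per 100,000 population (assumes POP2020 is the same on every
# row of a given SW_ID).
d$incidence <- d$total_positive / d$POP2020 * 100000

# Optional: collapse to one row per SW_ID instead of repeating the totals.
per_site <- unique(d[, c("SW_ID", "POP2020", "total_positive", "incidence")])
```

Because ave() keeps the original row count, the totals repeat on every date row of a given SW_ID; per_site holds one row per site if that is the shape needed.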