I'm totally new to RStudio and am struggling with the following problem:

The structure of my data is as follows:

ID Year Country Volume
4435 2018 China 10
4435 2018 China 5
1220 2019 China 15
1337 2019 China 20
... ... ... ...

Each ID refers to one unique entity, but an entity can consist of multiple subentities, so the same ID can appear in several rows. I need to count the unique IDs per Country and Year and add that count to the data, while also collapsing the whole dataset to the following form:

Nr Year Country SumVolume NumberEntities
1 2018 China 15 1
2 2019 China 35 2
3 ... ... ... ...

The dataset contains more than 50,000 observations and is not limited to China, as this example might suggest.

Help would be very much appreciated!

  • With `library(dplyr)` you can use `your_data %>% group_by(Year, Country) %>% summarize(SumVolume = sum(Volume), NumberEntities = n_distinct(ID))` to do the collapsing. Use `mutate` instead of `summarize` to add the columns to the original dataset. – Gregor Thomas May 26 '22 at 16:26
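
For illustration, here is a minimal sketch of that suggestion, run on the four example rows from the question. The names `your_data`, `collapsed` and `annotated` are placeholders, and the `Nr` column is assumed to be a plain running row number; the real data would be read from a file instead of typed in.

```r
library(dplyr)

# Toy data: only the four rows shown in the question.
your_data <- data.frame(
  ID      = c(4435, 4435, 1220, 1337),
  Year    = c(2018, 2018, 2019, 2019),
  Country = c("China", "China", "China", "China"),
  Volume  = c(10, 5, 15, 20)
)

# Collapse to one row per Year/Country combination.
collapsed <- your_data %>%
  group_by(Year, Country) %>%
  summarise(
    SumVolume      = sum(Volume),      # total volume in the group
    NumberEntities = n_distinct(ID),   # unique IDs in the group
    .groups = "drop"
  ) %>%
  mutate(Nr = row_number()) %>%        # assumed: Nr is just a running index
  select(Nr, everything())

# Alternative: keep every original row and attach the group values.
annotated <- your_data %>%
  group_by(Year, Country) %>%
  mutate(
    SumVolume      = sum(Volume),
    NumberEntities = n_distinct(ID)
  ) %>%
  ungroup()

print(collapsed)
```

On this toy data, `collapsed` has two rows (2018 and 2019), while `annotated` keeps all four original rows with the two extra columns repeated within each group.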
