How to combine fields with duplicate keys into a comma-separated string

Question

I have a data.frame with a key, fignum, and a data field, codefile, where fignum may be duplicated.

Where duplicates occur, I want to combine the codefile values into a single row, separated by commas. Here's my input:

> cf
   fignum                       codefile
8     4.6           04_6-cholera-water.R
9     P.3 04_P3a-cholera-neighborhoods.R
10    P.3       04_P3b-SnowMap-density.R
11    5.5    05_5-playfair-east-indies.R

> duplicated(cf[,"fignum"])
[1] FALSE FALSE  TRUE FALSE

The desired output combines the two "P.3" codefile values into one observation, to look like this:

> cf-wanted
   fignum                                                  codefile
8     4.6                                      04_6-cholera-water.R
9     P.3  04_P3a-cholera-neighborhoods.R, 04_P3b-SnowMap-density.R
10    5.5                               05_5-playfair-east-indies.R

score 1 · Answer 1 · answered Nov 06 '21 at 22:55

We could group by fignum and then summarise, pasting each group's codefile values together with ', ' as the separator:

library(dplyr)
cf %>% 
  group_by(fignum) %>% 
  summarise(codefile = paste0(codefile, collapse = ', '), .groups = 'drop')

  fignum codefile                                                
  <chr>  <chr>                                                   
1 4.6    04_6-cholera-water.R                                    
2 5.5    05_5-playfair-east-indies.R                             
3 P.3    04_P3a-cholera-neighborhoods.R, 04_P3b-SnowMap-density.R
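
Note that the result above is sorted by fignum, whereas the desired output in the question keeps the rows in their original order (4.6, P.3, 5.5). If that order matters, one option is to make fignum a factor whose levels follow first appearance before grouping. The following is only a base R sketch of that idea, not part of the original answer; the cf data frame is rebuilt by hand from the printed input, and res is just an illustrative name.

```r
# Sample data reconstructed from the question's printed data frame (assumption)
cf <- data.frame(
  fignum   = c("4.6", "P.3", "P.3", "5.5"),
  codefile = c("04_6-cholera-water.R",
               "04_P3a-cholera-neighborhoods.R",
               "04_P3b-SnowMap-density.R",
               "05_5-playfair-east-indies.R"),
  stringsAsFactors = FALSE
)

# Factor levels in order of first appearance, so groups are not re-sorted alphabetically
cf$fignum <- factor(cf$fignum, levels = unique(cf$fignum))

# Collapse the codefile values within each fignum group into one string
res <- aggregate(codefile ~ fignum, data = cf, FUN = paste, collapse = ", ")

# Turn the key back into plain character
res$fignum <- as.character(res$fignum)
res
```

The same factor trick should also work ahead of the dplyr pipeline, since group_by orders groups by factor level.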
