
I'm a beginner at R. I have a CSV file with data like the following:

ID  Values
820 D1,D2,FE
730 D1,D2,D3,PC,Io,He,Bt,Te,AR,PG
730 DV,GTH,LYT
567 EDR,TYU,EOP,OMN
567 FGH,KIH,IOP

I want to remove the duplicate IDs and combine the data from their rows into a single Values entry per ID, like this:

ID  Values
820 D1,D2,FE
730 D1,D2,D3,PC,Io,He,Bt,Te,AR,PG,DV,GTH,LYT
567 EDR,TYU,EOP,OMN,FGH,KIH,IOP

How can I achieve this in R?

Jaap
LearneR

2 Answers

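library(dplyr)  # provides %>%, group_by() and summarise()

# Recreate the example data from the question
dat <- read.table(text="ID  Values
820 D1,D2,FE
730 D1,D2,D3,PC,Io,He,Bt,Te,AR,PG
730 DV,GTH,LYT
567 EDR,TYU,EOP,OMN
567 FGH,KIH,IOP", header=TRUE)

# Group by ID and join each group's Values into one comma-separated string
dat2 <- dat %>% group_by(ID) %>% summarise(val=paste(Values, collapse=","))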
Jaap

You can try

library(data.table)
# Convert df1 to a data.table by reference, then collapse Values within each ID
setDT(df1)[, list(Values=paste(Values, collapse=",")), by=ID]

Or using base R

# Collapse every non-ID column within each ID
aggregate(. ~ ID, df1, paste, collapse=",")
akrun
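
For anyone who wants to run the base R approach end to end, here is a minimal sketch. It assumes the question's data has been read into a data frame called df1 (the name used in the snippets above, built here with read.table purely for illustration) and names the Values column explicitly instead of using the dot shorthand. The name result is also just an illustrative choice.

```r
# Assumption: df1 holds the example data from the question.
df1 <- read.table(text = "ID  Values
820 D1,D2,FE
730 D1,D2,D3,PC,Io,He,Bt,Te,AR,PG
730 DV,GTH,LYT
567 EDR,TYU,EOP,OMN
567 FGH,KIH,IOP", header = TRUE, stringsAsFactors = FALSE)

# For each ID, paste all of its Values together with commas.
# collapse = "," is passed through aggregate() to paste().
result <- aggregate(Values ~ ID, data = df1, FUN = paste, collapse = ",")

print(result)
```

Note that aggregate() returns the groups sorted by ID, so the rows come back as 567, 730, 820 rather than in the order they appear in the file. Within each ID, the values keep their original row order.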