Merge on same column in R

Question

My colleague and I are collecting data on separate files and we want to reconcile the data. Our data files look like so:

df1 = data.frame(Id = c(1:5), Score= c(10,NA,4,NA,3))
df2 = data.frame(UserID= c(1:5), Result= c(NA,8,NA,3,NA))

What is the simplest way to merge the two to form the following dataset to attain the following result?

df3 = data.frame(Id= c(1:5), Score= c(10,8,4,3,3))

Changing column names and using merge() don't seem to work which is what I had hoped.

Any suggestions? Would the quickest to be run a for loop across both datasets?

What about `coalesce`? Does this help: `df1$Score – markus Nov 19 '18 at 08:58 — markus, Nov 19 '18 at 08:58

score 2 · Accepted Answer · answered Nov 19 '18 at 09:11

2

you are looking for the function coalesce in the library dplyr:

dplyr::coalesce(df1,df2)
  Id Score
1  1    10
2  2     8
3  3     4
4  4     3
5  5     3

answered Nov 19 '18 at 09:11

onyambu

49,350
3
19
45

score 1 · Answer 2 · answered Nov 19 '18 at 06:49

use the dplyr library

library(dplyr)


inner_join(df1, df2, by = c("Id" = "UserID")) %>%  #specify the column names to join
  mutate(Score = ifelse (is.na(Score), Result, Score)) %>% #This will fill all the NA Scores with values of Result
  select(Id, Score) #Finally, Select the two columns you desire

Hope that helps

score 0 · Answer 3 · answered Nov 19 '18 at 07:52

Assuming that any NA in one df corresponds to a value in the other df, the simplest solution is by using an ifelse() statement to insert the values from df2 inlieu of the NA in df1, thus:

df1$Score <- ifelse(is.na(df1$Score), df2$Result, df1$Score)

This says: if df1$Score is NA, write the value of df2$Result, otherwise keep df1$Score

score 0 · Answer 4 · answered Nov 19 '18 at 08:41

0

With base R this should work as well:

df3 <- merge(df1, df2, by.x = c("Id", "Score"), by.y = c("UserID", "Result"), all = T)
df3 <- df3[!is.na(df3$Score),]
df3

answered Nov 19 '18 at 08:41

Lennyy

5,754
2
8
22

Merge on same column in R

4 Answers4