R: excluding all duplicate rows (both of each pair) based on one column

Question

I have a file (called example.txt) that looks like the following:

A B C  
D E F  
H I C  
Z B Y  
A B C  
T E F  
W O F

Based on column 2, I would like to remove every row whose value appears more than once (dropping all copies, not just the repeats) to obtain the following file:

H I C  
W O F

`df[ave(seq_along(df$col2), df$col2, FUN = length) == 1,]` – d.b, Apr 06 '17 at 14:35

akrun · Answer 1 · 2017-04-06T14:34:22.650

0

We can use `duplicated`, once in each direction, so that every occurrence of a repeated value is flagged and dropped:

df1[!(duplicated(df1$col2)|duplicated(df1$col2, fromLast=TRUE)),]
#   col1 col2 col3
#3    H    I    C
#7    W    O    F
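
To see why both calls are needed, here is a minimal self-contained sketch. It assumes the data frame is named `df1` with columns `col1`, `col2`, `col3` (as in the output above); the intermediate names `dup_forward` and `dup_backward` are only for illustration.

```r
# Rebuild the sample data from the question; column names are assumed
# to match the output shown above.
df1 <- data.frame(
  col1 = c("A", "D", "H", "Z", "A", "T", "W"),
  col2 = c("B", "E", "I", "B", "B", "E", "O"),
  col3 = c("C", "F", "C", "Y", "C", "F", "F"),
  stringsAsFactors = FALSE
)

# duplicated() flags the second and later occurrences of each value.
dup_forward <- duplicated(df1$col2)
# With fromLast = TRUE it flags every occurrence except the last one.
dup_backward <- duplicated(df1$col2, fromLast = TRUE)

# Keep a row only if its col2 value is flagged in neither direction,
# i.e. the value occurs exactly once.
result <- df1[!(dup_forward | dup_backward), ]
print(result)
```

Using only one of the two calls would still leave one copy of each repeated value (the first or the last), which is not what the question asks for.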

edited Apr 06 '17 at 14:34

answered Apr 06 '17 at 14:29

akrun

score 0 · Accepted Answer · answered Apr 06 '17 at 14:39

0

You can just compute which values occur exactly once and select those rows - like this:

Tab = table(df$V2)              # count how often each value of V2 occurs
Vals = names(Tab)[Tab == 1]     # values that occur exactly once
df[df$V2 %in% Vals, ]
#   V1 V2 V3
# 3  H  I  C
# 7  W  O  F
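
The column names `V1`, `V2`, `V3` are the defaults that `read.table` assigns when the file has no header. As a rough end-to-end sketch, the rows from the question are read inline below with `read.table(text = ...)`, which stands in for `read.table("example.txt")`; the names `counts`, `singles`, and `result` are chosen here for illustration.

```r
# Read the sample rows inline; with a real file this would be
# read.table("example.txt", ...). colClasses keeps every column as text.
df <- read.table(text = "
A B C
D E F
H I C
Z B Y
A B C
T E F
W O F
", header = FALSE, colClasses = "character")

# Count occurrences of each V2 value and keep those seen exactly once.
counts <- table(df$V2)
singles <- names(counts)[counts == 1]

# Select the rows whose V2 value is one of the singletons.
result <- df[df$V2 %in% singles, ]
print(result)
```

If the result needs to go back to disk in the same plain format, `write.table(result, quote = FALSE, row.names = FALSE, col.names = FALSE)` with an output file name of your choice should do it.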


G5W
