How can I remove duplicate rows from this example data frame?
A 1
A 1
A 2
B 4
B 1
B 1
C 2
C 2
I would like to remove the duplicates based on both columns, giving:
A 1
A 2
B 4
B 1
C 2
Order is not important.
unique() does answer your question, but a related function, duplicated(), gets you to the same result.
It returns a logical vector marking each row that repeats an earlier row, so you can inspect the duplicates as well as drop them.
> a <- c(rep("A", 3), rep("B", 3), rep("C", 2))
> b <- c(1, 1, 2, 4, 1, 1, 2, 2)
> df <- data.frame(a, b)
> duplicated(df)
[1] FALSE  TRUE FALSE FALSE FALSE  TRUE FALSE  TRUE
> df[duplicated(df), ]
  a b
2 A 1
6 B 1
8 C 2
> df[!duplicated(df), ]
  a b
1 A 1
3 A 2
4 B 4
5 B 1
7 C 2
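If you need a bit more control, duplicated() can be pointed at a single column or told to scan from the end. The sketch below rebuilds the same df; the names last_kept, n_dup and first_per_a are just illustrative, not anything from the question.

```r
# Same example data as above
df <- data.frame(a = c(rep("A", 3), rep("B", 3), rep("C", 2)),
                 b = c(1, 1, 2, 4, 1, 1, 2, 2))

# Keep the LAST occurrence of each duplicated row instead of the first
last_kept <- df[!duplicated(df, fromLast = TRUE), ]

# Count how many rows repeat an earlier row
n_dup <- sum(duplicated(df))

# De-duplicate on column a only: keeps the first row seen for each value of a
first_per_a <- df[!duplicated(df$a), ]
```

With this data, last_kept holds original rows 2, 3, 4, 6 and 8, n_dup is 3, and first_per_a holds rows 1, 4 and 7.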
You are looking for unique().
a <- c(rep("A", 3), rep("B", 3), rep("C", 2))
b <- c(1, 1, 2, 4, 1, 1, 2, 2)
df <- data.frame(a, b)
> unique(df)
  a b
1 A 1
3 A 2
4 B 4
5 B 1
7 C 2
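For a data frame, unique(df) and df[!duplicated(df), ] should give the same rows. A quick way to check that on the example data (the variable names here are only for illustration):

```r
# Same example data as above
df <- data.frame(a = c(rep("A", 3), rep("B", 3), rep("C", 2)),
                 b = c(1, 1, 2, 4, 1, 1, 2, 2))

via_unique     <- unique(df)
via_duplicated <- df[!duplicated(df), ]

# TRUE if both approaches return exactly the same data frame
same <- identical(via_unique, via_duplicated)
```

Both keep the first occurrence of each row and preserve the original row names, so which one you use is mostly a matter of taste.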