Let's say that I have a simple data frame in R, as follows:

#example data frame
a = c("red","red","green")
b = c("01/01/1900","01/02/1950","01/05/1990")
df = data.frame(a,b)
colnames(df)<-c("Color","Dates")

My goal is to count the number of dates (as a total, not individually) for each value in the "Color" column. So, the result would look like this:

#output should look like this:
a = c("red","green")
b = c(2, 1)  # number of dates per color
df = data.frame(a,b)
colnames(df)<-c("Color","Dates")

Red is associated with two dates. The dates themselves are unimportant; I'd just like to count the total number of dates per color in the data frame.

knaslund

3 Answers

In base R:

sapply(split(df, df$Color), nrow)
# green   red 
#     1     2 
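
If the result is needed as a data frame with the column names from the question, rather than as a named vector, one possible base R sketch is shown here. It rebuilds the example data so it can run on its own; the name `counts` is made up for illustration and is not part of the original answer.

```r
# Rebuild the example data from the question
df <- data.frame(
  Color = c("red", "red", "green"),
  Dates = c("01/01/1900", "01/02/1950", "01/05/1990")
)

# table() tallies the rows for each value of Color;
# responseName sets the name of the count column
counts <- as.data.frame(
  table(Color = df$Color),
  responseName = "Dates",
  stringsAsFactors = FALSE
)
counts
#   Color Dates
# 1 green     1
# 2   red     2
```

Like the `split()` approach, this counts rows per color, so repeated dates are counted each time they appear.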
Ege Rubak

We can use data.table:

library(data.table)
setDT(df)[, .(Dates = uniqueN(Dates)), Color]
#   Color Dates
#1:   red     2
#2: green     1
akrun
  • This would work, but what if the dates are not unique? So, in red for example, both dates are "01/01/1900" ? – knaslund Jan 06 '17 at 16:40
  • @knaslund It will be 1 using this answer. What is your expected output for that case? Do you need `setDT(df)[, .(Dates = .N), Color]`? – akrun Jan 06 '17 at 16:40
  • ah, yes this seems like it will work fabulously! thank you! – knaslund Jan 06 '17 at 16:46
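
The difference raised in these comments, distinct dates versus total rows per color, can be illustrated without data.table. This is only a base R sketch: `df2` is a hypothetical variant of the example in which both red rows share one date, and `total_rows` and `distinct_dates` are illustrative names.

```r
# Hypothetical variant of the example: both red rows have the same date
df2 <- data.frame(
  Color = c("red", "red", "green"),
  Dates = c("01/01/1900", "01/01/1900", "01/05/1990")
)

# Rows per color, the quantity .N gives per group
total_rows <- tapply(df2$Dates, df2$Color, length)

# Distinct dates per color, the quantity uniqueN(Dates) gives per group
distinct_dates <- tapply(df2$Dates, df2$Color, function(x) length(unique(x)))

total_rows[["red"]]      # 2
distinct_dates[["red"]]  # 1
```

Which of the two is wanted depends on whether a repeated date should count once or every time it appears.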

Using the dplyr package from the tidyverse:

library(dplyr)
df %>% group_by(Color) %>% summarise(n())
# # A tibble: 2 × 2
#    Color `n()`
#   <fctr> <int>
# 1  green     1
# 2    red     2
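
The count column above ends up named `n()`. To get a column called "Dates", as in the question, the summary can be named explicitly. The following is a sketch rather than part of the original answer; it assumes dplyr is installed, and `res` and `res2` are illustrative names.

```r
library(dplyr)

# Rebuild the example data from the question
df <- data.frame(
  Color = c("red", "red", "green"),
  Dates = c("01/01/1900", "01/02/1950", "01/05/1990")
)

# Name the summary column instead of accepting the default `n()`
res <- df %>%
  group_by(Color) %>%
  summarise(Dates = n())

# count() is a shorthand for the same group-and-tally step
res2 <- df %>% count(Color, name = "Dates")
```

Both forms count rows per color, so duplicate dates are included in the totals.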
Mike Wise