I'm trying to create a column that ranks each person based on their date of entry, but since everyone's date of entry is unique, it's been challenging.

Here's a reprex:

df <- data.frame(
  unique_id = c(1, 1, 1, 2, 2, 3, 3, 3), 
  date_of_entry = c("3-12-2001", "3-13-2001", "3-14-2001", "4-1-2001", "4-2-2001", "3-28-2001", "3-29-2001", "3-30-2001"))

What I want:

df_desired <- data.frame(
  unique_id = c(1, 1, 1, 2, 2, 3, 3, 3), 
  date_of_entry = c("3-12-2001", "3-13-2001", "3-14-2001", "4-1-2001", "4-2-2001", "3-28-2001", "3-29-2001", "3-30-2001"), 
  day_at_facility = c(1, 2, 3, 1, 2, 1, 2, 3))

Basically, I want to number the days at the facility in order, with the count restarting for each unique ID. Let me know if this is not clear.

r2evans

1 Answer

(This is a dupe of something, haven't found it yet, but in the interim ...)

base R

ave(rep(1L,nrow(df)), df$unique_id, FUN = seq_along)
# [1] 1 2 3 1 2 1 2 3

so, to add it as a column:

df$day_at_facility <- ave(rep(1L,nrow(df)), df$unique_id, FUN = seq_along)
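Note that seq_along numbers rows in whatever order they currently appear, so the above relies on the rows already being sorted by date within each ID (as they are in the reprex). If that isn't guaranteed, here is a minimal sketch that ranks by the parsed date instead. It assumes the strings are month-day-year, which the sample suggests but the question doesn't state; df_shuffled is a hypothetical copy of the reprex with its rows shuffled for illustration.

```r
# Hypothetical example data: the question's rows in shuffled order
df_shuffled <- data.frame(
  unique_id = c(3, 1, 2, 1, 3, 2, 1, 3),
  date_of_entry = c("3-30-2001", "3-13-2001", "4-2-2001", "3-12-2001",
                    "3-28-2001", "4-1-2001", "3-14-2001", "3-29-2001"))

# Assumption: dates are month-day-year
entry <- as.Date(df_shuffled$date_of_entry, format = "%m-%d-%Y")

# Rank the dates within each ID; row order is left untouched
df_shuffled$day_at_facility <- ave(
  as.integer(entry), df_shuffled$unique_id,
  FUN = function(x) rank(x, ties.method = "first"))

df_shuffled$day_at_facility
# [1] 3 2 2 1 1 1 3 2
```

Parsing to Date matters here: sorting the raw strings would compare them as text, which need not match calendar order.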

dplyr

library(dplyr)
df %>%
  group_by(unique_id) %>%
  mutate(day_at_facility = row_number())
# # A tibble: 8 x 3
# # Groups:   unique_id [3]
#   unique_id date_of_entry day_at_facility
#       <dbl> <chr>                   <int>
# 1         1 3-12-2001                   1
# 2         1 3-13-2001                   2
# 3         1 3-14-2001                   3
# 4         2 4-1-2001                    1
# 5         2 4-2-2001                    2
# 6         3 3-28-2001                   1
# 7         3 3-29-2001                   2
# 8         3 3-30-2001                   3
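
row_number() with no argument also numbers rows in their current order. If the rows might not be sorted by date within each ID, one option is to pass the parsed date to row_number(), which then ranks by that value. A minimal sketch, assuming month-day-year strings; df_shuffled and the entry column are hypothetical names used only for illustration.

```r
library(dplyr)

# Hypothetical example data: the question's rows in shuffled order
df_shuffled <- data.frame(
  unique_id = c(3, 1, 2, 1, 3, 2, 1, 3),
  date_of_entry = c("3-30-2001", "3-13-2001", "4-2-2001", "3-12-2001",
                    "3-28-2001", "4-1-2001", "3-14-2001", "3-29-2001"))

res <- df_shuffled %>%
  # Assumption: dates are month-day-year
  mutate(entry = as.Date(date_of_entry, format = "%m-%d-%Y")) %>%
  group_by(unique_id) %>%
  # row_number(x) ranks x within the group, breaking ties by position
  mutate(day_at_facility = row_number(entry)) %>%
  ungroup()

res$day_at_facility
# [1] 3 2 2 1 1 1 3 2
```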
r2evans