I have a data frame with a character column storing strings separated by commas, such as:

  row_name my_col          
  <chr>    <chr>           
1 1        n, g, l, t, f, v
2 2        f, n, e         
3 3        w    

I want to transform that data frame to a data frame in which:

  • there is one column per distinct string value (in this example, one column per letter)
  • each column is named after one of the letters
  • the values in the data frame are either TRUE or FALSE, corresponding to whether the letter in the (new) column name was in the original string for that row, such as:
## desired output
# A tibble: 3 x 9
  row_name n     g     l     t     f     v     e     w    
     <dbl> <lgl> <lgl> <lgl> <lgl> <lgl> <lgl> <lgl> <lgl>
1        1 TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  FALSE FALSE
2        2 TRUE  FALSE FALSE FALSE TRUE  FALSE TRUE  FALSE
3        3 FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE 

Reproducible code

library(dplyr, warn.conflicts = FALSE)

sample_letters <- function() {
  sample(letters, size = sample(1:6, 1))
}

set.seed(2021)

row_1 <- tibble(my_col = toString(sample_letters()))
row_2 <- tibble(my_col = toString(sample_letters()))
row_3 <- tibble(my_col = toString(sample_letters()))

my_df <- bind_rows(row_1, row_2, row_3, .id = "row_name") 
my_df
#> # A tibble: 3 x 2
#>   row_name my_col          
#>   <chr>    <chr>           
#> 1 1        n, g, l, t, f, v
#> 2 2        f, n, e         
#> 3 3        w

Created on 2021-09-09 by the reprex package (v2.0.0)

I thought this task would be simple, but my attempt quickly turned into a messy script, and I suspect there is a much simpler solution. I'd appreciate any suggestions.

Thanks!

Emman
  • Indeed a duplicate. FYI, this is the answer that fits my question most closely: https://stackoverflow.com/a/42388592/6105259 – Emman Sep 09 '21 at 15:54
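
For reference, here is a minimal sketch of one way to get the desired output. It is not necessarily the approach taken in the linked answer. It assumes tidyr (version 1.1.0 or later, for a scalar `values_fill`) is installed alongside dplyr. The data frame is written out literally rather than sampled so the example does not depend on the random number generator, and `present` and `wide` are names made up for illustration.

```r
library(dplyr, warn.conflicts = FALSE)
library(tidyr)

# Same data as in the question, written out literally
my_df <- tibble(
  row_name = c("1", "2", "3"),
  my_col   = c("n, g, l, t, f, v", "f, n, e", "w")
)

wide <- my_df %>%
  # One row per (row_name, letter) pair; split on a comma plus optional spaces
  separate_rows(my_col, sep = ",\\s*") %>%
  # Guard against a letter repeated within the same row
  distinct() %>%
  # Helper column (hypothetical name) marking that the letter occurs
  mutate(present = TRUE) %>%
  # One column per letter; combinations that never occur become FALSE
  pivot_wider(
    names_from  = my_col,
    values_from = present,
    values_fill = FALSE
  )

wide
```

The new columns appear in order of first occurrence (n, g, l, t, f, v, e, w). `row_name` stays a character column here; wrap it in `as.numeric()` if a numeric column is wanted, as in the desired output.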
