3

I have the table like this:

  A   B    C
1 a   b'   1
2 a   b'   2
3 a   b''  3
4 a2  b1   1
5 a3  b2   3

For example: Column A - is a genus (like E.) and column B is a species (E. coli) Column C is a class of the item (doesn't metter)

So I need to understand, how many classes item b consist of:

  A  B   C1 C2 C3
1 a  b'  +  +  -
2 a  b'' -  -  +
3 a2 b1  +  -  -
4 a3 b2  -  -  +

3 Answers3

2

Using tidyverse

df <- data.frame(A = c('a','a','a','a2','a3'), 
                 B = c("b'", "b'", "b''", "b1", "b2"),
                 C = c(1,2,3,1,3))

df %>%
  mutate( C = paste("C",C, sep = ""),
          D = "+") %>%
  pivot_wider(names_from = C, values_from = D, values_fill = "-")

First I create the data.frame, and then I actually create the C1,C2,C3. You requested a +/- system, so I put in a + where it is present so when I pivot_wider. Now pivot_wider will end up with + and NA values; so to correct for it, use the values_fill = option and set that to -.

# A tibble: 4 x 5
  A     B     C1    C2    C3   
  <chr> <chr> <chr> <chr> <chr>
1 a     b'    +     +     -    
2 a     b''   -     -     +    
3 a2    b1    +     -     -    
4 a3    b2    -     -     +    
akash87
  • 3,718
  • 3
  • 13
  • 29
1

Using dcast

library(data.table)
dcast(setDT(df)[, tmp := "+"], A + B ~ paste0("C", C), value.var = "tmp", fill = "-")
    A   B C1 C2 C3
1:  a  b'  +  +  -
2:  a b''  -  -  +
3: a2  b1  +  -  -
4: a3  b2  -  -  +
akrun
  • 789,025
  • 32
  • 460
  • 575
1

Update: A more compact version generated with the help of akrun:

df %>%      
  mutate(C = as.factor(C)) %>%
  bind_cols((map(setNames(levels(.$C), paste0("C", levels(.$C))), function(x) ifelse(x == .$C, "+", "-")))) %>% 
  group_by(A, B) %>%  summarise(across(everything(), ~ first(.[order(. != '+')])), .groups = 'drop')

Here is another more complicated solution: Was generated with the help of fabulous akrun Combine, merge, coalesce rows by group and replace certain value by another value without pivoting

library(tidyverse)
df %>% 
    mutate(C = as.factor(C)) %>% 
    cbind(sapply(levels(.$C), `==`, .$C)) %>% 
    rename_with(.cols = 4:6, ~ paste0("C", .x)) %>% 
    mutate(across(C1:C3, ~ifelse(.==TRUE,"+", "-"))) %>% 
    select(-C) %>% 
    group_by(A, B) %>% 
    summarise(across(everything(), 
                     ~ first(.[order(. != '+')])), .groups = 'drop')

output:

  A     B     C1    C2    C3   
  <chr> <chr> <chr> <chr> <chr>
1 a     b'    +     +     -    
2 a     b''   -     -     +    
3 a2    b1    +     -     -    
4 a3    b2    -     -     +  
TarJae
  • 43,365
  • 4
  • 14
  • 40
  • 1
    Or may be a bit more compact `df %>% mutate(C = as.factor(C)) %>% bind_cols((map(setNames(levels(.$C), paste0("C", levels(.$C))), function(x) ifelse(x == .$C, "+", "-")))) %>% group_by(A, B) %>% summarise(across(everything(), ~ first(.[order(. != '+')])), .groups = 'drop')` – akrun Sep 15 '21 at 19:16
  • 1
    Wonderful. I have added to the answer! – TarJae Sep 15 '21 at 19:22