Pass function arguments by column position to mutate_at

Question

I'm trying to tighten up a %>% piped workflow where I need to apply the same function to several columns but with one argument changed each time. I feel like purrr's map or invoke functions should help, but I can't wrap my head around it.

My data frame has columns for life expectancy, poverty rate, and median household income. I can pass all these column names to vars in mutate_at, use round as the function to apply to each, and optionally supply a digits argument. But I can't figure out a way, if one exists, to pass different values for digits associated with each column. I'd like life expectancy rounded to 1 digit, poverty rounded to 2, and income rounded to 0.

I can call mutate on each column, but given that I might have more columns all receiving the same function with only an additional argument changed, I'd like something more concise.

library(tidyverse)

df <- tibble::tribble(
        ~name, ~life_expectancy,          ~poverty, ~household_income,
  "New Haven", 78.0580437642378, 0.264221051111753,  42588.7592521085
  )

In my imagination, I could do something like this:

df %>%
  mutate_at(vars(life_expectancy, poverty, household_income), 
            round, digits = c(1, 2, 0))

But I get the error:

Error in mutate_impl(.data, dots) : Column life_expectancy must be length 1 (the number of rows), not 3
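
The error is consistent with how round treats digits: it recycles a vector of digits against its input instead of picking one value per column, so each single-row column comes back with three values. A minimal base R check, using the life expectancy value from the example above:

```r
# round() recycles `digits` against `x`, so one input value
# combined with three digits values yields three results.
x <- round(78.0580437642378, digits = c(1, 2, 0))
length(x)
```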

Using mutate_at instead of mutate just to have the same syntax as in my ideal case:

df %>%
  mutate_at(vars(life_expectancy), round, digits = 1) %>%
  mutate_at(vars(poverty), round, digits = 2) %>%
  mutate_at(vars(household_income), round, digits = 0)
#> # A tibble: 1 x 4
#>   name      life_expectancy poverty household_income
#>   <chr>               <dbl>   <dbl>            <dbl>
#> 1 New Haven            78.1    0.26            42589

Mapping over the digits applies every digits value to every column rather than matching them by position, so I get 3 rows, each rounded to a different number of digits.

df %>%
  mutate_at(vars(life_expectancy, poverty, household_income), 
            function(x) map(x, round, digits = c(1, 2, 0))) %>%
  unnest()
#> # A tibble: 3 x 4
#>   name      life_expectancy poverty household_income
#>   <chr>               <dbl>   <dbl>            <dbl>
#> 1 New Haven            78.1    0.3            42589.
#> 2 New Haven            78.1    0.26           42589.
#> 3 New Haven            78      0              42589

Created on 2018-11-13 by the reprex package (v0.2.1)

In the past when faced with this problem I ended up gathering my columns, grouping them, mutating them, and spreading them back out. See also [How do I sweep specific columns with dplyr?](https://stackoverflow.com/q/28298688/1968) — Konrad Rudolph, Nov 13 '18 at 19:38
@KonradRudolph thanks, I was thinking about that too, and that's an approach I've used before, but I'm trying to figure out whether a super simple, one-line version is possible — camille, Nov 13 '18 at 19:50
@Henrik you might be on to something. Using `map2_dfc` could work, but that requires dropping the `name` column and then maybe joining it back on. I'm trying to imagine a `map2_dfc` / `map_at` hybrid — camille, Nov 13 '18 at 19:56
Seems like it might be easier when you will be able to pass a list of functions to summarize_at/mutate_at: https://github.com/tidyverse/dplyr/issues/3433. That doesn't seem to work yet. — MrFlick, Nov 13 '18 at 20:06
`mutate` supports `!!!` so the easiest in my opinion is to recreate the verbose `mutate` call (not `mutate_at`) programmatically through `map2` or (cleaner to me) `imap` — moodymudskipper, Nov 14 '18 at 14:21

moodymudskipper · Accepted Answer · 2018-11-14T20:10:08.477

2 solutions

mutate with !!!

invoke was a good idea, but you need it less now that most tidyverse functions support the !!! operator. Here's what you can do:

digits <- c(life_expectancy = 1, poverty = 2, household_income = 0)  
df %>% mutate(!!!imap(digits, ~ round(..3[[.y]], .x), .))
# # A tibble: 1 x 4
#          name life_expectancy poverty household_income
#         <chr>           <dbl>   <dbl>            <dbl>
#   1 New Haven            78.1    0.26            42589

..3 is the initial data frame, passed to the function as a third argument, through the dot at the end of the call.

Written more explicitly:

df %>% mutate(!!!imap(
  digits, 
  function(digit, name, data) round(data[[name]], digit),
  data = .))
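
For readers who find the imap call hard to follow, the same pairing of column name and digits value can be sketched in base R with Map. This is only an illustration of the idea, not the approach used above; the data frame is rebuilt with data.frame so the snippet runs without any packages.

```r
# Base R sketch: pair each column name with its digits value.
df <- data.frame(
  name = "New Haven",
  life_expectancy = 78.0580437642378,
  poverty = 0.264221051111753,
  household_income = 42588.7592521085
)
digits <- c(life_expectancy = 1, poverty = 2, household_income = 0)

# Map() walks over the digits values and their names in parallel,
# much like imap() passes .x (the value) and .y (the name).
rounded <- Map(
  function(digit, name) round(df[[name]], digit),
  digits, names(digits)
)

# The result is a named list, so it can be assigned back by name.
df[names(rounded)] <- rounded
df
```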

If you need to start from your old interface (though the one I propose will be more flexible), first do:

digits <- setNames(c(1, 2, 0), c("life_expectancy", "poverty", "household_income"))

mutate_at and <<-

Here we bend the good practice of avoiding <<- whenever possible a little, but readability matters and this one is really easy to read.

digits <- c(1, 2, 0)
i <- 0
df %>%
  mutate_at(vars(life_expectancy, poverty, household_income), ~ round(., digits[i <<- i + 1]))
# A tibble: 1 x 4
#     name      life_expectancy poverty household_income
#     <chr>               <dbl>   <dbl>            <dbl>
#   1 New Haven            78.1    0.26            42589

(or just df %>% mutate_at(names(digits), ~round(., digits[i<<- i+1])) if you use a named vector as in my first solution)
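
One caveat worth keeping in mind: the counter is state that lives outside the call, so it has to be reset to 0 before the pipeline is run again. The sketch below imitates the mechanism in base R, with lapply standing in for the once-per-column calls that the output above suggests mutate_at makes; the list `cols` is a stand-in for the three columns, not part of the original code.

```r
digits <- c(1, 2, 0)
i <- 0

# Stand-in for the three columns of the example data frame.
cols <- list(
  life_expectancy = 78.0580437642378,
  poverty = 0.264221051111753,
  household_income = 42588.7592521085
)

# Each call bumps the counter and uses the new value as the index.
rounded <- lapply(cols, function(x) round(x, digits[i <<- i + 1]))

# The counter now sits at 3. Running the lapply() again without
# resetting i would index digits[4], which is NA.
i
is.na(digits[i + 1])
```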

This is the correct way to do it. I've deleted my answer because while the output in the console matched OPs result, running `apply(df, 1, print)` showed that the values were each rounded to two decimals. — Mako212, Nov 14 '18 at 17:27
This is wild! So `imap` is mapping over `digits` and its names, then applying the `round` function, but also taking the original data frame in `...`? Am I getting that right? — camille, Nov 14 '18 at 20:04
Yes you got it perfectly, passing the `lhs` to the `...` is a trick I like a lot, I added a more explicit version for clarity. — moodymudskipper, Nov 14 '18 at 20:11

Calum You · Answer 2 · 2018-11-13T20:03:58.283

Here's a map2 solution along the lines of Henrik's comment. You can then wrap this inside a custom function. I provided a rough first attempt, but I have done minimal testing, so it probably breaks in all sorts of situations where evaluation is strange. It also doesn't use tidyselect for .at, but neither does modify_at...

library(tidyverse)

df <- tibble::tribble(
  ~name, ~life_expectancy,          ~poverty, ~household_income,
  "New Haven", 78.0580437642378, 0.264221051111753,  42588.7592521085,
  "New York", 12.349685329, 0.324067934, 32156.230974623
)

rounded <- df %>%
  select(life_expectancy, poverty, household_income) %>%
  map2_dfc(
    .y = c(1, 2, 0),
    .f = ~ round(.x, digits = .y)
  )
df %>%
  select(-life_expectancy, -poverty, -household_income) %>%
  bind_cols(rounded)
#> # A tibble: 2 x 4
#>   name      life_expectancy poverty household_income
#>   <chr>               <dbl>   <dbl>            <dbl>
#> 1 New Haven            78.1    0.26            42589
#> 2 New York             12.3    0.32            32156


modify2_at <- function(.x, .y, .at, .f) {
  modified <- .x[.at] %>%
    map2(.y, .f)
  .x[.at] <- modified
  return(.x)
}

df %>%
  modify2_at(
    .y = c(1, 2, 0),
    .at = c("life_expectancy", "poverty", "household_income"),
    .f = ~ round(.x, digits = .y)
  )
#> # A tibble: 2 x 4
#>   name      life_expectancy poverty household_income
#>   <chr>               <dbl>   <dbl>            <dbl>
#> 1 New Haven            78.1    0.26            42589
#> 2 New York             12.3    0.32            32156
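
Since modify2_at pairs .y with .at purely by position, one possible variation is to pass a named vector and match by name instead. The helper below, `modify_named`, is a hypothetical name chosen for this sketch and is not part of purrr; it uses base R's Map so it runs without packages, and it has been checked only against the example data.

```r
# Hypothetical helper (not part of purrr): pair columns and
# arguments by name rather than by position.
modify_named <- function(.x, .args, .f) {
  stopifnot(!is.null(names(.args)), all(names(.args) %in% names(.x)))
  .x[names(.args)] <- Map(.f, .x[names(.args)], .args)
  .x
}

df <- data.frame(
  name = c("New Haven", "New York"),
  life_expectancy = c(78.0580437642378, 12.349685329),
  poverty = c(0.264221051111753, 0.324067934),
  household_income = c(42588.7592521085, 32156.230974623)
)

# The order of the named vector no longer has to match the column order.
out <- modify_named(
  df,
  c(household_income = 0, life_expectancy = 1, poverty = 2),
  function(x, d) round(x, digits = d)
)
out
```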

Created on 2018-11-13 by the reprex package (v0.2.1)

score 2 · Answer 3 · answered Jan 31 '19 at 15:57

Fun with tidyeval:

prepared_pairs <- 
  map2(
    set_names(syms(list("life_expectancy", "poverty", "household_income"))),
    c(1, 2, 0), 
    ~expr(round(!!.x, digits = !!.y))
  )

mutate(df, !!! prepared_pairs)

# # A tibble: 1 x 4
#   name      life_expectancy poverty household_income
#   <chr>               <dbl>   <dbl>            <dbl>
# 1 New Haven            78.1    0.26            42589
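
To see what kind of objects are being spliced into mutate here, a rough base R analogue can help: bquote plays a role similar to expr with !!, building one unevaluated call per column. This is an illustrative sketch only, not the tidyeval machinery itself, and the data frame is rebuilt with data.frame so it runs without packages.

```r
df <- data.frame(
  name = "New Haven",
  life_expectancy = 78.0580437642378,
  poverty = 0.264221051111753,
  household_income = 42588.7592521085
)
digits <- c(life_expectancy = 1, poverty = 2, household_income = 0)

# Build one unevaluated call per column, e.g. round(poverty, digits = 2).
calls <- Map(
  function(col, d) bquote(round(.(as.name(col)), digits = .(d))),
  names(digits), digits
)
deparse(calls$poverty)

# Evaluate each call with the data frame's columns in scope.
df[names(calls)] <- lapply(calls, eval, envir = df)
df
```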

Aurèle

Interesting. Using `expr` in this way for the entire expression is comparable to using `enquo` on individual variables? I'm still getting the hang of the different tidyeval verbs – camille Jan 31 '19 at 16:04
(Prefixing everything I say with "As far as I understand"): `expr` is a little more "bare" in the sense that it doesn't carry an environment with it. `expr` is like the lighter `quo` (not `enquo`) without an environment – Aurèle Jan 31 '19 at 16:09
I think `expr` is just `quote` except that it understands `!!` – moodymudskipper Feb 01 '19 at 10:25
It's a cool solution, if you use the definition of `digits` that I use it's a bit simpler to read as you can do : `prepared_pairs – moodymudskipper Feb 01 '19 at 10:38
Thanks! An idea to make yours robust to grouped data frames is to wrap it in `do` like `df %>% do(mutate(., !!!imap(digits, ~round(..3[[.y]], .x),.)))` – Aurèle Feb 01 '19 at 15:08
