Use pipe operator %>% with replacement functions like colnames()<-

Question

How can I use the pipe operator to pipe into a replacement function like colnames()<- ?

Here's what I'm trying to do:

library(dplyr)
averages_df <- 
   group_by(mtcars, cyl) %>%
   summarise(mean(disp), mean(hp))
colnames(averages_df) <- c("cyl", "disp_mean", "hp_mean")
averages_df

# Source: local data frame [3 x 3]
# 
#   cyl disp_mean   hp_mean
# 1   4  105.1364  82.63636
# 2   6  183.3143 122.28571
# 3   8  353.1000 209.21429

But ideally it would be something like:

averages_df <- 
  group_by(mtcars, cyl) %>%
  summarise(mean(disp), mean(hp)) %>%
  add_colnames(c("cyl", "disp_mean", "hp_mean"))

Is there a way to do this without writing a specialty function each time?

The answers here are a start, but not exactly my question: Chaining arithmetic operators in dplyr

You could name your inputs to `summarise` - `group_by(mtcars, cyl) %>% summarise(disp_mean=mean(disp), hp_mean=mean(hp))` Though I'm not seeing how using `colnames` is that much of a drag. Does every little thing have to be done in dplyr? — thelatemail, Jan 22 '15 at 23:53
I believe there's a `rename()` function in `dplyr`. Or yeah, do what @thelatemail said. — Rich Scriven, Jan 22 '15 at 23:54
Or just use `setNames` as in `group_by(mtcars, cyl) %>% summarise(mean(disp), mean(hp)) %>% setNames(., c("cyl", "disp_mean", "hp_mean"))` — David Arenburg, Jan 22 '15 at 23:58
@DavidArenburg - now why didn't I think of that, seeing as I just pointed that out 2 minutes ago? — thelatemail, Jan 22 '15 at 23:58
Thanks for these suggestions. It's true that every little thing doesn't need to be done in `dplyr`. `rename()` and `setNames` solves the problem in question. But there are other "replacement" functions that do something like `foo(x) — Alex Coppock, Jan 23 '15 at 00:08
You can use such function in the same manner as shown in Henriks answer. — David Arenburg, Jan 23 '15 at 00:10

Henrik · Accepted Answer · 2015-01-23T00:47:37.367

131

You could call the replacement function directly by quoting its name in backticks (`colnames<-`), or use setNames (thanks to @David Arenburg):

group_by(mtcars, cyl) %>%
  summarise(mean(disp), mean(hp)) %>%
  `colnames<-`(c("cyl", "disp_mean", "hp_mean"))
  # or
  # `names<-`(c("cyl", "disp_mean", "hp_mean"))
  # setNames(., c("cyl", "disp_mean", "hp_mean")) 

#   cyl disp_mean   hp_mean
# 1   4  105.1364  82.63636
# 2   6  183.3143 122.28571
# 3   8  353.1000 209.21429
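The backtick form works because R evaluates every replacement call as an ordinary function call: colnames(x) <- value is treated as x <- `colnames<-`(x, value). The base R sketch below illustrates that equivalence and shows the same trick with another replacement function, levels<-. The data frame df and the factor f are made up for illustration.

```r
# Toy objects, invented for this example
df <- data.frame(a = 1:2, b = 3:4)
f  <- factor(c("lo", "hi", "lo"))

# Usual replacement syntax: assigns the new names to a copy
renamed_a <- df
colnames(renamed_a) <- c("x", "y")

# Functional form: returns a renamed copy and leaves df untouched
renamed_b <- `colnames<-`(df, c("x", "y"))

identical(renamed_a, renamed_b)  # both routes give the same object
colnames(df)                     # the original names are still there

# Any other replacement function can be called the same way.
# The levels of f are sorted as "hi", "lo", so "hi" becomes "HIGH".
relabelled <- `levels<-`(f, c("HIGH", "LOW"))
as.character(relabelled)
```

Because the functional form returns the modified object instead of assigning it, it can sit on the right-hand side of %>% like any other function.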

Or pick an alias (set_colnames) from magrittr:

library(magrittr)
group_by(mtcars, cyl) %>%
  summarise(mean(disp), mean(hp)) %>%
  set_colnames(c("cyl", "disp_mean", "hp_mean"))

dplyr::rename may be more convenient if you are only (re)naming a few out of many columns (it requires writing both the old and the new name; see @Richard Scriven's answer)
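As a small illustration of the "few out of many" case, the sketch below renames two of the eleven mtcars columns and leaves the rest alone. It assumes dplyr is installed; the new names cylinders and horsepower are arbitrary choices for the example.

```r
library(dplyr)

# rename() takes new_name = old_name pairs; unnamed columns are kept as-is
renamed <- mtcars %>%
  rename(cylinders = cyl, horsepower = hp)

names(renamed)
```

With colnames<- or setNames the same change would require spelling out all eleven names in order.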

edited Jan 23 '15 at 00:47

answered Jan 23 '15 at 00:04

Henrik


beautiful. I assume that the `\`foo – Alex Coppock Jan 23 '15 at 03:28
The first solution was enlightening and essentially blew my mind! Thanks! – ManuHaq Nov 09 '21 at 16:24

Rich Scriven · Answer 2 · 2015-01-23T21:21:42.980

In dplyr, there are a couple of different ways to rename the columns.

One is to use the rename() function. In this example you'd need to back-tick the names created by summarise(), since they are expressions.

group_by(mtcars, cyl) %>%
    summarise(mean(disp), mean(hp)) %>%
    rename(disp_mean = `mean(disp)`, hp_mean = `mean(hp)`)
#   cyl disp_mean   hp_mean
# 1   4  105.1364  82.63636
# 2   6  183.3143 122.28571
# 3   8  353.1000 209.21429

You could also use select(). This is a bit easier because we can use the column number, eliminating the need to mess around with back-ticks.

group_by(mtcars, cyl) %>%
    summarise(mean(disp), mean(hp)) %>%
    select(1, disp_mean = 2, hp_mean = 3)

But for this example, the best way would be to do what @thelatemail mentioned in the comments, and that is to go back one step and name the columns in summarise().

group_by(mtcars, cyl) %>%
    summarise(disp_mean = mean(disp), hp_mean = mean(hp))

score 13 · Answer 3 · answered Jan 26 '17 at 03:58

We can add a suffix to the summarised variables by passing a named vector to the .funs argument of dplyr's summarise_at, as in the code below.

library(dplyr)

# summarise_at with dplyr
mtcars %>% 
  group_by(cyl) %>%
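  summarise_at(
    .vars = c("disp", "hp"),   # this argument was called .cols in older dplyr releases
    .funs = c(mean = "mean")   # the name "mean" becomes the column suffix
  )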
# A tibble: 3 × 3
#     cyl disp_mean   hp_mean
#   <dbl>     <dbl>     <dbl>
# 1     4  105.1364  82.63636
# 2     6  183.3143 122.28571
# 3     8  353.1000 209.21429
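In dplyr 1.0.0 and later, scoped verbs such as summarise_at() are superseded by across(). A minimal sketch of the same summary, assuming a recent dplyr is installed, could look like this; the .names glue pattern builds the suffixed column names.

```r
library(dplyr)

averages_df <- mtcars %>%
  group_by(cyl) %>%
  # {.col} is replaced by each selected column name, giving disp_mean and hp_mean
  summarise(across(c(disp, hp), mean, .names = "{.col}_mean"))

names(averages_df)
```

This keeps the naming inside summarise(), so no separate renaming step is needed afterwards.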

Also, we can set column names in several ways.

# set_names with magrittr
mtcars %>% 
  group_by(cyl) %>%
  summarise(mean(disp), mean(hp)) %>%
  magrittr::set_names(c("cyl", "disp_mean", "hp_mean"))

# set_names with purrr
mtcars %>% 
  group_by(cyl) %>%
  summarise(mean(disp), mean(hp)) %>%
  purrr::set_names(c("cyl", "disp_mean", "hp_mean"))

# setNames with stats
mtcars %>%
  group_by(cyl) %>%
  summarise(mean(disp), mean(hp)) %>%
  stats::setNames(c("cyl", "disp_mean", "hp_mean"))

# A tibble: 3 × 3
#     cyl disp_mean   hp_mean
#   <dbl>     <dbl>     <dbl>
# 1     4  105.1364  82.63636
# 2     6  183.3143 122.28571
# 3     8  353.1000 209.21429
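Since stats::setNames() is an ordinary function that returns the renamed object, it should also work with the native pipe |> introduced in R 4.1.0. The base R sketch below assumes R >= 4.1.0 and uses aggregate() as a stand-in for group_by() plus summarise(), so it needs no extra packages.

```r
# Group means of disp and hp by cyl, computed with base R only
averages_df <- aggregate(cbind(disp, hp) ~ cyl, data = mtcars, FUN = mean) |>
  setNames(c("cyl", "disp_mean", "hp_mean"))

averages_df
```

The result here is a plain data frame, not a tibble, so it prints without the type row.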
