4

A) Instead of this (where cars <- data.table(cars))

cars[ , .(`Totals:`=.N), by=speed]  

I need this

strColumnName <- "Totals:"
cars [ , strColumnName = .N, by=speed]  

How to do it?

B) Similarly (more general case) - instead of this:

cars[ dist > 50, .(`Totals:`=.N, x=dist*100), by=speed] 

I need this:

strFactor <- "dist"
cars[ strFactor > 50, .(`Totals:`=.N, x=strFactor*100), by=speed] 

This question is about a GENERAL way of assigning/referencing column names held in variables in data.table, i.e. in 'j' (both RHS and LHS), as well as in 'i' and 'by', dynamically. This is needed when the column names are chosen elsewhere in the code (e.g. a user may enter them in a Shiny app).

C) General case involving i,j and by - Instead of this:

 cars[ dist > 50, .(`Totals x Factor: ` = .N * dist), by=speed] 

I need this:

strFactor <- "dist"; 
strNewVariable <- "Totals x Factor: "
strBy <- "speed"
cars[ strFactor > 50, .(strNewVariable = .N * strFactor), by=strBy] 
IVIM
  • The following does not work `col – IVIM Mar 23 '20 at 16:59
  • `cars[ , .N, by=speed]` is an aggregation, not an assignment. Please clarify what exactly you are trying to achieve. For assignment, there are many dupes with similar title, e.g. https://stackoverflow.com/questions/11745169/dynamic-column-names-in-data-table – David Arenburg Mar 23 '20 at 17:03
  • You need to use the `:=` set notation with this. – Mxblsdl Mar 23 '20 at 17:16
  • In my examples, instead of .N any other RHS statement can be used. And it is not about `:=` – IVIM Mar 23 '20 at 17:57
  • Related: [Dynamically add column names to data.table when aggregating](https://stackoverflow.com/questions/41239290/dynamically-add-column-names-to-data-table-when-aggregating) – Henrik Mar 28 '20 at 16:21

2 Answers

8

Edit: Based on your clarifications, here is an approach with setNames and get. The trick here is that .. instructs the evaluation to occur in the calling environment.

library(data.table)
cars <- data.table(cars)
strFactor <- "dist"
strNewVariable <- "Totals x Factor: "
strBy <- "speed"
cars[ get(strFactor)  > 50, 
     setNames(.(.N * get(..strFactor)),strNewVariable),
     by=strBy] 
Ian Campbell
  • Yes - I've been thinking it has to be done with `get()` somehow! Neat! - I'm going to edit my question though - because in my coding practice, I'm trying to "automate" (i.e. dynamically assign all variables) ALL parts inside the `data.table`, i.e. `i`, `j` and `by` – IVIM Mar 23 '20 at 19:47
  • Awesome! - This is what I'll be using from now on, with the only change that I will use `eval(as.name(strFactor))` instead of `get(..strFactor)` - as suggested by @akrun, so that there's no need to memorize a new `..` symbol, when we can live without it. Ian, perhaps, you can add this possible modification to your answer, and I will accept it as the best answer. – IVIM Mar 23 '20 at 21:38
  • +1. The `by` argument does not need `eval` - this would work: `by = strBy`. Also, for people who prefer avoiding `get`, you can use `cars[[strFactor]] > 50` and `.SD[[strFactor]]` for the i and j statements, respectively. A more robust API is in the works as well: https://github.com/Rdatatable/data.table/pull/4304 – Cole Apr 07 '20 at 02:08
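
The `get`-free variant suggested in the last comment can be sketched as follows. This is only a sketch under stated assumptions: `cars_dt` and `res_sd` are names made up for this illustration (to avoid masking the built-in `cars`), and `list()` is written out instead of the `.()` alias.

```r
library(data.table)

# cars_dt is a hypothetical name, chosen so the built-in `cars` is not masked
cars_dt <- data.table(datasets::cars)

strFactor      <- "dist"
strNewVariable <- "Totals x Factor: "
strBy          <- "speed"

res_sd <- cars_dt[
  cars_dt[[strFactor]] > 50,                              # i: plain logical vector, no get()
  setNames(list(.N * .SD[[strFactor]]), strNewVariable),  # j: column taken from .SD by name
  by = strBy                                              # by: character vector of column names
]

names(res_sd)
```

Note that `.SD` holds only the non-grouping columns, so this form presumably breaks if the same column is used for both `strFactor` and `strBy`.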
4

We can use `:=` and wrap the variable in `()` so that it is evaluated instead of being taken literally as the column name

library(data.table)
cars <- data.table(cars)
strColumnName <- "Totals:"
cars[ , (strColumnName) := .N, by=speed]  # adds the column by reference

If we need a summarised column,

setnames(cars[, .N, by = speed], 'N',  strColumnName)[]

With the updated code

cars[eval(as.name(strFactor)) > 50, .(`Totals:`=.N, x=eval(as.name(strFactor))*100), by=speed]
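
For case C of the question (dynamic `i`, `j` and `by` at once), the same `eval(as.name())` idiom can be combined with `setNames` for the output column name. The following is a minimal sketch rather than part of the original answer; `cars_dt`, `res_dyn` and `res_lit` are names introduced here for illustration only.

```r
library(data.table)

# cars_dt is a hypothetical name, chosen so the built-in `cars` is not masked
cars_dt <- data.table(datasets::cars)

strFactor      <- "dist"
strNewVariable <- "Totals x Factor: "
strBy          <- "speed"

# Fully dynamic version: every column reference comes from a string
res_dyn <- cars_dt[
  eval(as.name(strFactor)) > 50,
  setNames(list(.N * eval(as.name(strFactor))), strNewVariable),
  by = strBy
]

# Hard-coded version from the question, for comparison
res_lit <- cars_dt[dist > 50, .(`Totals x Factor: ` = .N * dist), by = speed]

all.equal(res_dyn, res_lit)
```

If the two results agree, the string-driven call can stand in for the hard-coded one wherever the column names are only known at run time.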
akrun