
For each categorical variable, I need to create a new variable that holds the frequency of each category. I've written the following code:

df[ , t_Product := .N , by = .(Product)]

but I have this error:

 Error in `[.data.frame`(datos, , `:=`(t_Product, .N), by = .(Product)) : 
  unused argument (by = .(Product))

where df is my data frame, t_Product is the name of the new column and Product is the existing column.

If I understand correctly, I am grouping by the Product column and creating another column named t_Product that holds the row count for each group, i.e. the frequency of each category.

Phil
  • Hi, and welcome to Stack Overflow! In order for us to help you, please provide a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). For example, to produce a minimal data set, you can use `head()`, `subset()`, or the indices. Then use `dput()` to give us something that can be put in R immediately. Also, please make sure you know what to do [when someone answers your question](https://stackoverflow.com/help/someone-answers). Finally, here is a link to Stack Overflow's [help center](https://stackoverflow.com/help). Thank you! – iamericfletcher Aug 10 '20 at 17:15

2 Answers


The := and by syntax you are using is understood by objects of class data.table, but not by objects of class data.frame, which is why [.data.frame reports by as an unused argument.

I would suggest installing the data.table package (if you haven't already) and adding the following before your line of code:

library(data.table)
setDT(df)
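
As a minimal end-to-end sketch of the fix above: the toy data frame below is made up for illustration (the asker's real data was not posted), and only the column names Product and t_Product come from the question.

```r
library(data.table)

# Hypothetical toy data standing in for the asker's df
df <- data.frame(Product = c("A", "B", "A", "C", "A", "B"))

setDT(df)                               # convert df to a data.table in place
df[, t_Product := .N, by = .(Product)]  # add the per-category row count
print(df)
```

Each row gets the count of its own category, so every "A" row holds 3, every "B" row 2, and the "C" row 1.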
J.P. Le Cavalier

Is your data a data.table? The error message mentions [.data.frame, which shows that it is not. To check this yourself in the future, run the following line of code:

is.data.table(df)

For a plain data.frame, as in your case, it returns FALSE.
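
A small sketch of that check, run on a made-up data frame (the object names df and dt here are illustrative only). It also shows that a data.table still inherits from data.frame:

```r
library(data.table)

# Hypothetical data frame for illustration
df <- data.frame(Product = c("A", "B", "A"))

is.data.table(df)        # FALSE: df is a plain data.frame
class(df)                # "data.frame"

dt <- as.data.table(df)  # converted copy
is.data.table(dt)        # TRUE
class(dt)                # "data.table" "data.frame"
```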

To fix this, convert df to a data.table by adding the following before your code:

library(data.table) # load the data.table package

df = as.data.table(df) # convert df to a data.table (this makes a copy)

df[, t_Product := .N , by = .(Product)] # your code

Or, converting df by reference instead of copying it:

library(data.table)

setDT(df)

df[, t_Product := .N , by = .(Product)]
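
The two variants differ in what happens to the original object. A rough sketch with made-up data (df1, dt1 and df2 are hypothetical names): as.data.table() returns a new data.table and leaves its input untouched, while setDT() changes the class of the object you pass in.

```r
library(data.table)

# Variant 1: as.data.table() returns a converted copy
df1 <- data.frame(Product = c("A", "B", "A"))
dt1 <- as.data.table(df1)
is.data.table(df1)  # FALSE: the original is still a data.frame
is.data.table(dt1)  # TRUE

# Variant 2: setDT() converts the object itself, no reassignment needed
df2 <- data.frame(Product = c("A", "B", "A"))
setDT(df2)
is.data.table(df2)  # TRUE
```

This is why the first variant needs the result assigned back to df, while the second does not.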

Example with Palmer Penguins

library(data.table)
library(palmerpenguins)
data("penguins", package = "palmerpenguins") # load the penguins data set

df = as.data.table(penguins) # convert to a data.table (a copy of penguins)
df[, t_Product := .N , by = .(species)] # using your code here, grouped by species
df[, .(species, island, t_Product)] # select columns by name

#>        species    island t_Product
#>   1:    Adelie Torgersen       152
#>   2:    Adelie Torgersen       152
#>   3:    Adelie Torgersen       152
#>   4:    Adelie Torgersen       152
#>   5:    Adelie Torgersen       152
#>  ---                              
#> 340: Chinstrap     Dream        68
#> 341: Chinstrap     Dream        68
#> 342: Chinstrap     Dream        68
#> 343: Chinstrap     Dream        68
#> 344: Chinstrap     Dream        68

Created on 2020-08-10 by the reprex package (v0.3.0)

iamericfletcher
  • You're welcome. Please finalize by checking the solution that you find most helpful. Or see what to do [when someone answers your question](https://stackoverflow.com/help/someone-answers) – iamericfletcher Aug 11 '20 at 16:08