R - How to convert a categorical variable into frequencies

Question

For each categorical variable, I need to create a new variable that holds the frequency of each category. I've written the following code:

df[ , t_Product := .N , by = .(Product)]

but I have this error:

 Error in `[.data.frame`(datos, , `:=`(t_Product, .N), by = .(Product)) : 
  unused argument (by = .(Product))

Here df is my data frame, t_Product is the name of the new column, and Product is the existing column.

If I understand correctly, I am grouping by the column Product and creating another column named t_Product that holds the count, that is, the frequency of each category.

Hi, and welcome to Stack Overflow! In order for us to help you, please provide a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). For example, to produce a minimal data set, you can use `head()`, `subset()`, or the indices. Then use `dput()` to give us something that can be put in R immediately. Also, please make sure you know what to do [when someone answers your question](https://stackoverflow.com/help/someone-answers). Finally, here is a link to Stack Overflow's [help center](https://stackoverflow.com/help). Thank you! — iamericfletcher, Aug 10 '20 at 17:15

score 1 · Answer 1 · answered Aug 10 '20 at 17:19

The syntax you are using is recognized by objects of class data.table, but not by objects of class data.frame.

I suggest installing the data.table package (if you have not already done so) and adding the following before your line of code:

library(data.table)
setDT(df)

iamericfletcher · Accepted Answer · 2020-08-10T18:09:25.543

Is your data in data.table format? Judging by the error message, it is not. To check this yourself in the future, run the following line of code:

is.data.table(df)

It should output FALSE.

To fix this, convert df to a data.table before running your code:

library(data.table) #import data.table package

df = as.data.table(df) #convert df to data.table

df[, t_Product := .N , by = .(Product)] #your code

OR

library(data.table)

setDT(df)

df[, t_Product := .N , by = .(Product)]
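The two options above differ in one respect: as.data.table() returns a converted copy and leaves the original object untouched, while setDT() converts the object in place. Here is a minimal sketch of both on a made-up data frame (the Product values below are hypothetical stand-ins for your data, and dt is just an illustrative name):

```r
library(data.table)

# Hypothetical toy data standing in for the asker's data frame
df <- data.frame(Product = c("A", "B", "A", "C", "A", "B"))

is.data.table(df)  # FALSE: df is still a plain data.frame

# Option 1: as.data.table() returns a converted copy; df itself is unchanged
dt <- as.data.table(df)
dt[, t_Product := .N, by = .(Product)]

# Option 2: setDT() converts df in place, without making a copy
setDT(df)
df[, t_Product := .N, by = .(Product)]

# The trailing [] makes sure the table is printed after a := assignment
print(df[])
```

In both cases every row gets the size of its group, so rows with Product "A" get 3, "B" rows get 2 and the single "C" row gets 1.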

Example with Palmer Penguins

library(data.table)
library(palmerpenguins)
data(package = 'palmerpenguins') #lists the datasets in the package; penguins itself is already available after library()

df = as.data.table(penguins) #you can also use setDT(df)
df[, t_Product := .N , by = .(species)] #using your code here
df[, .(species, island, t_Product)] #selecting columns using column name

#>        species    island t_Product
#>   1:    Adelie Torgersen       152
#>   2:    Adelie Torgersen       152
#>   3:    Adelie Torgersen       152
#>   4:    Adelie Torgersen       152
#>   5:    Adelie Torgersen       152
#>  ---                              
#> 340: Chinstrap     Dream        68
#> 341: Chinstrap     Dream        68
#> 342: Chinstrap     Dream        68
#> 343: Chinstrap     Dream        68
#> 344: Chinstrap     Dream        68

Created on 2020-08-10 by the reprex package (v0.3.0)

You're welcome. Please finalize by checking the solution that you find most helpful. Or see what to do [when someone answers your question](https://stackoverflow.com/help/someone-answers) — iamericfletcher, Aug 11 '20 at 16:08
