
Beginner question: what is a simple way to rename a value in a data frame column?

I have a data frame "Stuff" with a column of categorical data called "Age", where one of the values is "Age80+". I've learned that R does not like "+" in a name,

e.g. Age80+ <- brings up an error.

Column "Age" contains 7 other values, e.g. "Age18_30", so I cannot efficiently change the values by hand.

I have looked, but I haven't found a simple way to rename all "Age80+" to "Age80plus" without bringing in complicated packages like "stringr" or "dplyr". The data frame has hundreds of "Age80+" observations.

Thank you

I have tried

Stuff$Age <- gsub("Age80+", "Age80plus", Stuff$Age)

But that changes "Age80+" to "Age80plus+", not "Age80plus": the "+" is left behind.

deschen
ceallac
  • Does `gsub("Age80\\+", "Age80plus", Stuff$Age)` work? – jay.sf Jan 02 '22 at 16:17
  • If `is.character(Age)` is `TRUE`, then you could just do `Age[Age == "Age80+"]` – Mikael Jagan Jan 02 '22 at 16:51
  • @jay.sf yes it does! Thank you! Should you post it as a solution? – ceallac Jan 02 '22 at 16:52
  • Also a few clarifications: it seems you want to recode values of a column, NOT rename the column, in which case having the age coded as "Age80+" is no problem at all. Also, I wouldn't call dplyr a complicated package. On the contrary, for beginners it's easier to learn than all the base R syntax (although this won't be true for every person and people are free to disagree with this statement). – deschen Jan 02 '22 at 16:55
  • thank you @deschen. I'm starting out and if I cannot see it - then it's a little too advanced for me right now. But I will try it – ceallac Jan 02 '22 at 17:13

1 Answer

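`+` is a special character in regular expressions: it is a quantifier meaning "one or more of the preceding element", so the pattern `Age80+` matches "Age8" followed by one or more zeros and leaves the literal "+" in place. Escape it as `\\+` if you want the actual character.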
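As a small illustration of the difference, the sketch below runs both patterns on a single string. The variable names `x`, `matched`, `unescaped`, and `escaped` are made up for this example and are not part of the question's data.

```r
x <- "Age80+"

# Unescaped: "0+" means "one or more zeros", so only "Age80" is matched
# and the trailing plus sign is not part of the match.
matched <- regmatches(x, regexpr("Age80+", x))

# Replacing that match therefore leaves the plus sign behind.
unescaped <- gsub("Age80+", "Age80plus", x)

# Escaped: "\\+" matches a literal plus sign, so the whole value is replaced.
escaped <- gsub("Age80\\+", "Age80plus", x)

print(matched)
print(unescaped)
print(escaped)
```

The first `gsub()` call reproduces the "Age80plus+" result described in the question, while the escaped pattern gives "Age80plus".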

dat <- transform(dat, age=gsub('Age80\\+', 'Age80plus', age))
dat
#   id       age          x
# 1  1 Age80plus -0.9701187
# 2  2 Age80plus -0.5522213
# 3  3 Age80plus -1.6060125
# 4  4     Age60 -1.5417523
# 5  5     Age40 -1.9090871

Data:

dat <- structure(list(id = 1:5, age = c("Age80+", "Age80+", "Age80+", 
"Age60", "Age40"), x = c(-0.970118672988532, -0.552221336521097, 
-1.60601248510621, -1.54175233366043, -1.909087068272)), class = "data.frame", row.names = c(NA, 
-5L))
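Two base R alternatives avoid the escaping altogether. This is a minimal sketch that assumes the column is a character vector (not a factor); the data frame `dat2` and the columns `age_fixed` and `age_exact` are made-up names for illustration.

```r
# Small example data, built the same way as the age column above.
dat2 <- data.frame(
  id  = 1:5,
  age = c("Age80+", "Age80+", "Age80+", "Age60", "Age40"),
  stringsAsFactors = FALSE
)

# Option 1: fixed = TRUE tells gsub() to treat the pattern as a literal
# string rather than a regular expression, so "+" needs no escaping.
dat2$age_fixed <- gsub("Age80+", "Age80plus", dat2$age, fixed = TRUE)

# Option 2: no pattern matching at all; overwrite the elements that are
# exactly equal to "Age80+". If the column were a factor, it would need
# as.character() first (or the level itself would have to be renamed).
dat2$age_exact <- dat2$age
dat2$age_exact[dat2$age_exact == "Age80+"] <- "Age80plus"

print(dat2)
```

Option 2 is close to what the comment about `is.character(Age)` suggests, and it only touches values that match the whole string.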
jay.sf