Hi all, I'm new to R and would love your help. I have a data frame in which I would like to recode some values. Here's an example data frame:

x <- data.frame(age = sample(100, size = 6),
                gender = c("boy", "girl"))
print(x)
      age gender
    1  58    boy
    2  41   girl
    3  31    boy
    4  96   girl
    5  93    boy
    6  60   girl

Let's say I want to recode boy to man and girl to woman in a new column called new.gender. I tried using the ifelse function (to no avail):

x$new.gender <- NA
ifelse(x$gender == "boy", x$new.gender <- "man", x$new.gender <- "woman")
print(x)
  age gender new.gender
1  96    boy      woman
2  46   girl      woman
3  68    boy      woman
4   6   girl      woman
5  26    boy      woman
6  55   girl      woman

After some thinking, I changed the syntax a bit and got it to work:

x$new.gender <- NA
x$new.gender <- ifelse(x$gender == "boy", "man", "woman")
print(x)
  age gender new.gender
1  96    boy        man
2  46   girl      woman
3  68    boy        man
4   6   girl      woman
5  26    boy        man
6  55   girl      woman

Can someone help me understand why my first attempt resulted in all values changing to woman, while my second attempt worked? Thanks!

thelatemail
jason.f

1 Answer

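ifelse(test, yes, no) returns a vector of the same length as test; it does not modify anything by itself.

In your first attempt, the two assignments were passed as the yes and no arguments. ifelse does not evaluate them once per row: it evaluates yes once (if any element of test is TRUE) and then no once (if any element is FALSE). Each of those assignments overwrites the whole column with a single value, and since x$new.gender <- "woman" runs last, you see a queue of women in that column. The recoded vector that ifelse returned was never stored. Your second attempt works because it assigns that returned vector to x$new.gender.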
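If you want to check this yourself, here is a small sketch. It assumes fixed ages instead of sample() so the result is reproducible, and the names n_yes, n_no, res and out are made up for illustration. It counts how often each argument is evaluated, then repeats the first attempt while keeping the value that ifelse returns:

```r
# Fixed demo data (assumption: hard-coded ages instead of sample())
x <- data.frame(age = c(58, 41, 31, 96, 93, 60),
                gender = rep(c("boy", "girl"), times = 3))

# Count how many times ifelse() evaluates each argument
n_yes <- 0
n_no <- 0
res <- ifelse(x$gender == "boy",
              { n_yes <- n_yes + 1; "man" },
              { n_no <- n_no + 1; "woman" })

# Repeat the first attempt, but keep what ifelse() returns
x$new.gender <- NA
out <- ifelse(x$gender == "boy",
              x$new.gender <- "man",
              x$new.gender <- "woman")

print(c(n_yes = n_yes, n_no = n_no))  # each argument ran once, not once per row
print(out)                            # the recoded vector that was thrown away
print(x$new.gender)                   # only the last whole-column assignment survives
```

So the column ends up as "woman" because the no argument is evaluated after the yes argument, regardless of which row comes last in the data.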

TC Zhang