886

What are the differences between the assignment operators = and <- in R?

I know that the operators are slightly different, as this example shows:

x <- y <- 5
x = y = 5
x = y <- 5
x <- y = 5
# Error in (x <- y) = 5 : could not find function "<-<-"

But is this the only difference?

smci
csgillespie
  • 59
    As noted [here](http://blog.revolutionanalytics.com/2008/12/use-equals-or-arrow-for-assignment.html) the origins of the ` – joran Dec 12 '14 at 17:35

8 Answers

752

The difference in assignment operators is clearer when you use them to set an argument value in a function call. For example:

median(x = 1:10)
x   
## Error: object 'x' not found

In this case, x is declared within the scope of the function, so it does not exist in the user workspace.

median(x <- 1:10)
x    
## [1]  1  2  3  4  5  6  7  8  9 10

In this case, x is declared in the user workspace, so you can use it after the function call has been completed.
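As a rough check of the two calls above, the sketch below evaluates each one in a scratch environment and then asks whether x exists there. The names env, after_equals and after_arrow are only illustrative.

```r
# Scratch environment, so the check does not depend on what is already in the workspace.
env <- new.env()

# `x = 1:10` only names the argument of median(); nothing is created in env.
evalq(median(x = 1:10), env)
after_equals <- exists("x", envir = env, inherits = FALSE)

# `x <- 1:10` is evaluated in env first, creating x there, and its value is then passed on.
evalq(median(x <- 1:10), env)
after_arrow <- exists("x", envir = env, inherits = FALSE)

after_equals  # FALSE
after_arrow   # TRUE
```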


There is a general preference among the R community for using <- for assignment (other than in function signatures) for compatibility with (very) old versions of S-Plus. Note that the spaces help to clarify situations like

x<-3     # Does this mean assignment, or "less than -3"? (R parses it as assignment.)
x <- 3   # clearly assignment
x < -3   # clearly a comparison

Most R IDEs have keyboard shortcuts to make <- easier to type: Ctrl + = in Architect, Alt + - in RStudio (Option + - under macOS), and Shift + - (underscore) in Emacs with ESS.


If you prefer writing = to <- but want to use the more common assignment symbol for publicly released code (on CRAN, for example), then you can use one of the tidy_* functions in the formatR package to automatically replace = with <-.

library(formatR)
tidy_source(text = "x=1:5", arrow = TRUE)
## x <- 1:5

The answer to the question "Why does x <- y = 5 throw an error but not x <- y <- 5?" is "It's down to the magic contained in the parser". R's syntax contains many ambiguous cases that have to be resolved one way or another. The parser chooses to resolve the bits of the expression in different orders depending on whether = or <- was used.

To understand what is happening, you need to know that assignment silently returns the value that was assigned. You can see that more clearly by explicitly printing, for example print(x <- 2 + 3).

Secondly, it's clearer if we use prefix notation for assignment. So

x <- 5
`<-`(x, 5)  #same thing

y = 5
`=`(y, 5)   #also the same thing

The parser interprets x <- y <- 5 as

`<-`(x, `<-`(y, 5))

We might expect that x <- y = 5 would then be

`<-`(x, `=`(y, 5))

but actually it gets interpreted as

`=`(`<-`(x, y), 5)

This is because = has lower precedence than <-, as shown on the ?Syntax help page.
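One way to check this grouping yourself is to parse the code without evaluating it and look at the outermost call. This is only a sketch; e1 and e2 are illustrative names.

```r
# Parse (but do not evaluate) both expressions.
e1 <- parse(text = "x <- y <- 5")[[1]]
e2 <- parse(text = "x <- y = 5")[[1]]

# For x <- y <- 5 the outer call is `<-` and its right-hand side is `y <- 5`.
as.character(e1[[1]])  # "<-"
deparse(e1[[3]])       # "y <- 5"

# For x <- y = 5 the outer call is `=` and its left-hand side is `x <- y`.
as.character(e2[[1]])  # "="
deparse(e2[[2]])       # "x <- y"
```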

Richie Cotton
  • 10
    This is also mentioned in chapter 8.2.26 of [The R Inferno](http://www.burns-stat.com/pages/Tutor/R_inferno.pdf) by Patrick Burns (Not me but a recommendation anyway) – Uwe Jun 14 '16 at 09:17
  • 3
    However, `median((x = 1:10))` has the same effect as `median(x – Francesco Napolitano Sep 27 '16 at 13:27
  • 3
    i dont really consider them shortcuts, in any case you press same number of keys – yosemite_k Jun 08 '18 at 12:42
  • 5
    I just realised that your explanation of how `x – Konrad Rudolph Jul 27 '18 at 11:25
  • In R studio, `alt+-` actually gives you ` – HongboZhu Oct 04 '18 at 08:35
  • 5
    … And I just realised that the very first part of this answer is incorrect and, unfortunately, quite misleading because it perpetuates a common misconception: The way you use `=` in a function call **does not perform assignment**, and isn’t an assignment operator. It’s an entirely distinct parsed R expression, which just happens to use the same character. Further, the code you show does not “declare” `x` in the scope of the function. The *function declaration* performs said declaration. The function call doesn’t (it gets a bit more complicated with named `...` arguments). – Konrad Rudolph Apr 12 '19 at 10:33
  • Note also, for vim users. In the package Nvim-R " – DryLabRebel Jul 15 '19 at 00:16
  • 1
    @DryLabRebel Luckily that annoying behaviour can be disabled by putting `let R_assign = 0` into the .vimrc file. – Konrad Rudolph Jul 17 '19 at 13:03
  • @KonradRudolph I'm just going to pretend I already knew that and wasn't too lazy to figure it out. Although I do think I find it more useful than not. The only really irksome thing is that it affects macros, so that if you include an "_" in a string, and try to run it in a macro it will fail. – DryLabRebel Jul 17 '19 at 22:48
  • 2
    @DryLabRebel The issue with that shortcut is that the most widely recommended style convention for R uses underscores in delimiters `like_this`, and thatʼs very annoying to type with this shortcut. – Konrad Rudolph Jul 18 '19 at 07:56
  • > x Error: object 'x' not found > median(x = 1:10) [1] 5.5 Curious about the Behavior??? – Aashu Feb 17 '20 at 14:44
  • `median(x = 1:10)` works fine (returns 5.5) in version 4.0.2. Guessing this changed with version 4.0.0 – markhogue Jul 08 '20 at 18:43
  • @markhogue No, the behaviour didn’t change. The code always worked. What this answer attempts to convey is that this code does not perform assignment, so no variable `x` exists afterwards in the calling scope. – Konrad Rudolph Mar 05 '21 at 18:28
222

What are the differences between the assignment operators = and <- in R?

As your example shows, = and <- have slightly different operator precedence (which determines the order of evaluation when they are mixed in the same expression). In fact, ?Syntax in R gives the following operator precedence table, from highest to lowest:

…
‘-> ->>’           rightwards assignment
‘<- <<-’           assignment (right to left)
‘=’                assignment (right to left)
…

But is this the only difference?

Since you were asking about the assignment operators: yes, that is the only difference. However, you would be forgiven for believing otherwise. Even the R documentation of ?assignOps claims that there are more differences:

The operator <- can be used anywhere, whereas the operator = is only allowed at the top level (e.g., in the complete expression typed at the command prompt) or as one of the subexpressions in a braced list of expressions.

Let’s not put too fine a point on it: the R documentation is wrong. This is easy to show: we just need a counter-example in which the = operator is neither (a) at the top level nor (b) a subexpression in a braced list of expressions (i.e. {…; …}). Without further ado:

x
# Error: object 'x' not found
sum((x = 1), 2)
# [1] 3
x
# [1] 1

Clearly we’ve performed an assignment, using =, outside of contexts (a) and (b). So, why has the documentation of a core R language feature been wrong for decades?

It’s because in R’s syntax the symbol = has two distinct meanings that get routinely conflated (even by experts, including in the documentation cited above):

  1. The first meaning is as an assignment operator. This is all we’ve talked about so far.
  2. The second meaning isn’t an operator but rather a syntax token that signals named argument passing in a function call. Unlike the = operator it performs no action at runtime, it merely changes the way an expression is parsed.
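To make the two meanings concrete, here is a small sketch (the names named, unnamed and y are only for illustration). As the named-argument token, = labels the list element. As the operator, isolated by an extra pair of parentheses, it performs an assignment and leaves the element unnamed.

```r
named   <- list(x = 10)     # `=` is the named-argument token: the element is called "x"
unnamed <- list((y = 10))   # `=` is the assignment operator: y is created, the element has no name

names(named)     # "x"
names(unnamed)   # NULL
y                # 10
```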

So how does R decide whether a given usage of = refers to the operator or to named argument passing? Let’s see.

In any piece of code of the general form …

‹function_name›(‹argname› = ‹value›, …)
‹function_name›(‹args›, ‹argname› = ‹value›, …)

… the = is the token that defines named argument passing: it is not the assignment operator. Furthermore, = is entirely forbidden in some syntactic contexts:

if (‹var› = ‹value›) …
while (‹var› = ‹value›) …
for (‹var› = ‹value› in ‹value2›) …
for (‹var1› in ‹var2› = ‹value›) …

Any of these will raise an error “unexpected '=' in ‹bla›”.

In any other context, = refers to the assignment operator call. In particular, merely putting parentheses around the subexpression makes any of the above (a) valid, and (b) an assignment. For instance, the following performs assignment:

median((x = 1 : 10))

But also:

if (! (nf = length(from))) return()

Now you might object that such code is atrocious (and you may be right). But I took this code from the base::file.copy function (replacing <- with =) — it’s a pervasive pattern in much of the core R codebase.
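For illustration, here is the same pattern inside a small function. count_or_none is a hypothetical helper written for this sketch, not the actual file.copy code.

```r
# Hypothetical helper: returns a message for empty input, otherwise the number of elements.
count_or_none <- function(from) {
  # The extra parentheses make `=` an assignment; its value (the length) feeds the condition.
  if (!(nf = length(from))) return("nothing to copy")
  nf
}

count_or_none(character(0))         # "nothing to copy"
count_or_none(c("a.txt", "b.txt"))  # 2
```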

The original explanation by John Chambers, which the R documentation is probably based on, actually explains this correctly:

[= assignment is] allowed in only two places in the grammar: at the top level (as a complete program or user-typed expression); and when isolated from surrounding logical structure, by braces or an extra pair of parentheses.


In sum, by default the operators <- and = do the same thing. But either of them can be overridden separately to change its behaviour. By contrast, <- and -> (left-to-right assignment), though syntactically distinct, always call the same function. Overriding one also overrides the other. Knowing this is rarely practical but it can be used for some fun shenanigans.
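A minimal sketch of that last point, under the assumption that we shadow `<-` only inside a local() block so nothing leaks into the workspace (outer_call and intercepted are illustrative names):

```r
# `5 -> a` is parsed as a call to `<-`, with the operands swapped.
outer_call <- as.character(parse(text = "5 -> a")[[1]][[1]])
outer_call  # "<-"

intercepted <- local({
  # Shadow `<-` in this local environment only; defined with `=`, which is left untouched.
  `<-` = function(lhs, rhs) "intercepted"
  # The rightwards arrow now calls the replacement too, so no assignment happens.
  5 -> a
})
intercepted  # "intercepted"
```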

Konrad Rudolph
  • 4
    About the precedence, and errors in R's doc, the precedence of `?` is actually right in between `=` and ` – moodymudskipper Jan 10 '20 at 00:12
  • 2
    @Moody_Mudskipper that’s bizarre! You seem to be right, but according to the *source code* ([`main/gram.y`](https://github.com/wch/r-source/blob/386c3a93cbcaf95017fa6ae52453530fb95149f4/src/main/gram.y#L384-L390)), the precedence of `?` is correctly documented, and is lower than both `=` and ` – Konrad Rudolph Jan 10 '20 at 10:13
  • 1
    I don't speak C but I suppose that `=` get a special treatment before the parse tree is built. Maybe related to function arguments, it makes sense that in `foo(x = a ? b)` we'd look for `=` before parsing rest of the expression. – moodymudskipper Jan 10 '20 at 10:34
  • 1
    @Moody_Mudskipper [I’ve asked r-devel](https://stat.ethz.ch/pipermail/r-devel/2020-January/078898.html) – Konrad Rudolph Jan 10 '20 at 11:16
  • I had initiated a discussion a few months ago, in case you missed it: https://r.789695.n4.nabble.com/Syntax-wrong-about-s-precedence-td4759013.html – moodymudskipper Jan 10 '20 at 11:33
  • @Moody_Mudskipper Thanks. Yes, completely missed that (I generally only skim r-devel, and before posting now I only checked last month’s archive). – Konrad Rudolph Jan 10 '20 at 11:47
  • 7
    @Moody_Mudskipper FWIW this is finally fixed in 4.0.0. – Konrad Rudolph Apr 24 '20 at 13:17
  • Is the operator/syntax-token dichotomy the reason I get a named list from `list(x = 10)`, but not from `list(x – Captain Hat Jan 21 '21 at 13:35
  • @CaptainHat Yes. In this case, `=` isn’t the operator, it’s a syntactic token. Whereas ` – Konrad Rudolph Jan 21 '21 at 13:57
  • @KonradRudolph: would you be willing to update your answer to mention that it now refers to outdated behaviour ... ? – Ben Bolker May 17 '21 at 20:32
  • @BenBolker Which behaviour are you referring to? Lambda support in 4.1? (Besides this, I have been mulling over a large rewrite of this answer to tighten the information and make it more understandable, but I haven’t yet found time to give it its due attention.) – Konrad Rudolph May 17 '21 at 22:05
  • Oh, I guess it was just @Moody_Mudskipper's comment about precedence, which isn't in the answer at all. Sorry about that. Was the original issue about wrong documentation ever raised on r-devel, and did anyone bite? Is it worth my submitting a doc patch to the bug tracker? – Ben Bolker May 17 '21 at 22:30
110

Google's R style guide simplifies the issue by prohibiting the "=" for assignment. Not a bad choice.

https://google.github.io/styleguide/Rguide.xml

The R manual goes into nice detail on all 5 assignment operators.

http://stat.ethz.ch/R-manual/R-patched/library/base/html/assignOps.html

xxfelixxx
Nosredna
46

x = y = 5 is equivalent to x = (y = 5), because the assignment operators "group" right to left, which works. Meaning: assign 5 to y, leaving the number 5; and then assign that 5 to x.

This is not the same as (x = y) = 5, which doesn't work! Meaning: assign the value of y to x, leaving the value of y; and then assign 5 to, umm..., what exactly?

When you mix the different kinds of assignment operators, <- binds tighter than =. So x = y <- 5 is interpreted as x = (y <- 5), which is the case that makes sense.

Unfortunately, x <- y = 5 is interpreted as (x <- y) = 5, which is the case that doesn't work!
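If you want to see both outcomes without stopping your session, a sketch along these lines evaluates each mixed form in a scratch environment and catches the failure (env and outcome are illustrative names):

```r
env <- new.env()

# x = (y <- 5): both variables end up holding 5.
eval(parse(text = "x = y <- 5"), env)
c(env$x, env$y)  # 5 5

# (x <- y) = 5: parses, but fails when evaluated; we only record that it failed.
outcome <- tryCatch(
  eval(parse(text = "x <- y = 5"), env),
  error = function(e) "error"
)
outcome  # "error"
```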

See ?Syntax and ?assignOps for the precedence (binding) and grouping rules.

Steve Pitchers
  • Yes, as [Konrad Rudolph](https://stackoverflow.com/questions/1741820/what-are-the-differences-between-and-in-r#51564252)'s answer said ` – Nick Dong Apr 19 '19 at 03:04
  • 1
    @Nick Dong Yes indeed. Helpfully, the operator precedence table is documented unambiguously in [?Syntax {base}](https://stat.ethz.ch/R-manual/R-devel/library/base/html/Syntax.html). – Steve Pitchers Apr 28 '19 at 17:38
36

According to John Chambers, the operator = is only allowed at "the top level," which means it is not allowed in control structures like if, making the following programming error illegal.

> if(x = 0) 1 else x
Error: syntax error

As he writes, "Disallowing the new assignment form [=] in control expressions avoids programming errors (such as the example above) that are more likely with the equal operator than with other S assignments."

You can manage to do this if it's "isolated from surrounding logical structure, by braces or an extra pair of parentheses," so if ((x = 0)) 1 else x would work.
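Both forms can be checked from code by parsing them as text. This is a sketch; bad, good, env and value are illustrative names, and the exact wording of the parser's error message is deliberately not relied on.

```r
# The bare form is rejected by the parser, so we catch the error instead of stopping.
bad <- tryCatch(
  parse(text = "if (x = 0) 1 else x"),
  error = function(e) "syntax error"
)
bad  # "syntax error"

# With the extra parentheses it parses, assigns 0 to x, and takes the else branch.
good  <- parse(text = "if ((x = 0)) 1 else x")
env   <- new.env()
value <- eval(good[[1]], env)
value  # 0
```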

See http://developer.r-project.org/equalAssign.html

Aaron left Stack Overflow
  • What do you expect that line to do? Surely, it's equivalent to `x = 0`? – Steve Pitchers Sep 15 '14 at 08:19
  • 14
    It's a common bug, `x==0` is almost always meant instead. – Aaron left Stack Overflow Sep 15 '14 at 15:04
  • 17
    Ah, yes, I overlooked that you said "programming error". It's actually good news that this causes an error. And a good reason to prefer `x=0` as assignment over `x – Steve Pitchers Sep 16 '14 at 09:55
  • 9
    Yes, it is nice that this causes an error, though I draw a different lesson about what to prefer; I choose to use `=` as little as possible because `=` and `==` look so similar. – Aaron left Stack Overflow Sep 16 '14 at 16:09
  • 2
    The way this example is presented is so strange to me. `if(x = 0) 1 else x` throws an error, helping me find and correct a bug. `if(x – Gregor Thomas Feb 24 '17 at 21:32
  • Well, I suppose you could make the case that should throw an error too, but to me it's clear from the syntax of code what it's doing: assignment, and using the result of that assignment as the input to the if. On the other hand, the single = looks like it might be testing for equality, but it's not, which is why I think it's nice that it is not allowed. – Aaron left Stack Overflow Feb 25 '17 at 14:27
  • 1
    “The operator `=` is only allowed at the top level” — no this is completely wrong. Try `if ((x = 0)) 1 else x`. – Konrad Rudolph Mar 02 '17 at 14:39
  • Yes, an exception is if it's "isolated from the surrounding logical structure." I'll edit... – Aaron left Stack Overflow Mar 02 '17 at 16:46
  • 3
    I mean, a *really* helpful error checker would throw an error there and say "you have useless code that will always return the `else` value, did you mean to write it that way?", but, that may be a pipe dream... – TylerH Mar 08 '17 at 17:13
  • @AaronleftStackOverflow Wait, how? I mean, from 15 feet away, sure. – NoName May 18 '21 at 18:36
25

The operators <- and = assign into the environment in which they are evaluated. The operator <- can be used anywhere, whereas the operator = is only allowed at the top level (e.g., in the complete expression typed at the command prompt) or as one of the subexpressions in a braced list of expressions.

nbro
Haim Evgi
8

This may also add to understanding of the difference between those two operators:

df <- data.frame(
      a = rnorm(10),
      b <- rnorm(10)
)

For the first column, R has assigned the values and a proper name, while the name of the second column looks a bit strange.

str(df)
# 'data.frame': 10 obs. of  2 variables:
#  $ a             : num  0.6393 1.125 -1.2514 0.0729 -1.3292 ...
#  $ b....rnorm.10.: num  0.2485 0.0391 -1.6532 -0.3366 1.1951 ...

R version 3.3.2 (2016-10-31); macOS Sierra 10.12.1
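A deterministic variant of the same experiment may make the cause easier to see. It is only a sketch: df2 is an illustrative name and fixed vectors stand in for rnorm(). The second argument has no name, so data.frame builds one from the text of the expression, and the arrow additionally creates b in the calling environment.

```r
df2 <- data.frame(a = 1:3, b <- 4:6)

names(df2)              # "a"  "b....4.6"
make.names("b <- 4:6")  # "b....4.6": characters not allowed in a name become dots
b                       # 4 5 6, created by the `<-` inside the call
```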

Scarabee
Denis Rasulev
  • 8
    can you give a more detailed explanation of why this happens/what's going on here? (hint: `data.frame` tries to use the name of the provided variable as the name of the element in the data frame) – Ben Bolker Dec 10 '16 at 22:16
  • Just thought, could this possibly be a bug? And if so, how and where do I report it? – Denis Rasulev Jul 15 '17 at 06:27
  • 9
    it's not a bug. I tried to hint at the answer in my comment above. When setting the name of the element, R will use the equivalent of `make.names("b – Ben Bolker Jul 16 '17 at 15:21
0

I am not sure whether Patrick Burns's book The R Inferno has been cited here. In section 8.2.26, "= is not a synonym of <-", Patrick states: "You clearly do not want to use '<-' when you want to set an argument of a function." The book is available at https://www.burns-stat.com/documents/books/the-r-inferno/

Diego
  • Yup, [it has been mentioned](https://stackoverflow.com/questions/1741820/what-are-the-differences-between-and-assignment-operators-in-r/69078055#comment63079875_1742550). But the question is about the *assignment operator*, whereas your excerpt concerns the syntax for passing arguments. It should be made clear (because there’s substantial confusion surrounding this point) that this is *not* the assignment operator. – Konrad Rudolph Sep 06 '21 at 17:27