12

I am maximizing a likelihood function and trying to reduce the loop. I want to add the vector(parameters to be estimated) to all rows of a matrix (data). The length of vector equals to the column of matrix. a+b would give the wrong results because the recycle rule of R is by column not row.

a<-c(1,2,0,0,0)  # parameters to be optimized
b<-matrix(1,ncol=5,nrow=6) # data
t(a+t(b)) # my code would work, anything more intuitive?

Desired output

        [,1] [,2] [,3] [,4] [,5]
    [1,]    2    3    1    1    1
    [2,]    2    3    1    1    1
    [3,]    2    3    1    1    1
    [4,]    2    3    1    1    1
    [5,]    2    3    1    1    1
    [6,]    2    3    1    1    1

The wrong output

a+b
    [,1] [,2] [,3] [,4] [,5]
[1,]    2    3    1    1    1
[2,]    3    1    1    1    2
[3,]    1    1    1    2    3
[4,]    1    1    2    3    1
[5,]    1    2    3    1    1
[6,]    2    3    1    1    1

2 Answers2

18

We can use col to replicate the 'a' elements

b + a[col(b)]
#     [,1] [,2] [,3] [,4] [,5]
#[1,]    2    3    1    1    1
#[2,]    2    3    1    1    1
#[3,]    2    3    1    1    1
#[4,]    2    3    1    1    1
#[5,]    2    3    1    1    1
#[6,]    2    3    1    1    1

Or a faster option would be to use rep

b + rep(a, each = nrow(b))

Or use sweep

sweep(b, 2, a, "+")

Benchmarks

set.seed(24)
b <- matrix(sample(0:9, 8000*5000, replace=TRUE), ncol=5000)
a <- sample(0:3, 5000, replace=TRUE)
system.time(b + a[col(b)])
#  user  system elapsed 
#  1.08    0.06    1.14 
system.time(b + rep(a, each = nrow(b)))
#   user  system elapsed 
#   0.83    0.03    0.86 

system.time(t(a+t(b)))
#   user  system elapsed 
#   1.14    0.03    1.17 

system.time(sweep(b, 2, a, "+"))
#  user  system elapsed 
#  0.62    0.06    0.69 
akrun
  • 789,025
  • 32
  • 460
  • 575
  • I tried the b+a[col(b)], when data is large, it seems pretty slow. I will double check my code to have a feedback. – News_is_Selection_Bias Sep 13 '16 at 04:44
  • 2
    @Phdaml In that case, you can try with `rep` i.e. `b + rep(a, each = nrow(b))` should be faster. I updated the post – akrun Sep 13 '16 at 04:47
  • @Phdaml I updated with some benchmarks using `8000*5000`. It seems like the `sweep` and `rep` are faster. – akrun Sep 13 '16 at 04:53
3

These solutions using outer() or collapse::TRA() are is significantly faster than using rep or sweep.

Sebastian
  • 704
  • 5
  • 9