12

Suppose I have two matrices Nx2, Mx2 representing N, M 2d vectors respectively. Is there a simple and good way to calculate distances between each vector pair (n, m)?

The easy but inefficient way is of course:

d = zeros(N, M);
for i = 1:N,
  for j = 1:M,
    d(i,j) = norm(n(i,:) - m(j,:));
  endfor;
endfor;

The closest answer I've found is bsxfun, used like so:

bsxfun(inline("x-y"),[1,2,3,4],[3;4;5;6])

ans =
  -2 -1  0  1
  -3 -2 -1  0
  -4 -3 -2 -1
  -5 -4 -3 -2
Aron Ahmadia
  • 6,951
  • 4
  • 34
  • 54
Kelley van Evert
  • 221
  • 1
  • 2
  • 3
  • I took a look at this and I couldn't do much better than vectorizing the computation. I think this computation is a pretty good candidate for writing an external C/Fortran function. – Aron Ahmadia Feb 26 '12 at 11:05
  • 1
    I bet you could make a 2xNxM matrix which you populate with an outer product, then square each of the entries and sum along the zeroth axis and square root.

    In Python this would look like: distance_matrix = (n[:,:,nexaxis] * m[:,newaxis,:]); distance_matrix = distance_matrix**2; distance_matrix = sqrt(distance_matrix.sum(axis=1));

    If you only need to know the closest n-vectors there are much better ways to do this!

    – meawoppl Feb 27 '12 at 20:46
  • 3
    @meawoppl (New to Octave) I found out how to use the linear-algebra package in Octave, which provides cartprod, so now I can write: (1) x = cartprod(n(:,1), m(:,1)); (2) y = cartprod(n(:,2), m(:,2)); (3) d = sqrt((x(:,1)-x(:,2)).^2+(y(:,1)-y(:,2)).^2) ..which runs much faster! – Kelley van Evert Feb 27 '12 at 23:36
  • How about http://octave.sourceforge.net/statistics/function/pdist.html – Nemo Mar 07 '15 at 10:37

2 Answers2

6

Vectorizing is straightforward in these situations using a strategy like this:

eN = ones(N,1);
eM = ones(M,1);
d  = sqrt(eM*n.^2' - 2*m*n' + m.^2*eN');

Here's an example that vectorizes the for loop with a 15x speedup for M=1000 and N=2000.

n = rand(N,2);
m = rand(M,2);
eN = ones(N,2);
eM = ones(2,M);

tic;
d_vect  = sqrt(eN*m.^2' - 2*n*m' + n.^2*eM);
vect_time = toc;

tic;
for i=1:N
  for j=1:M
     d_for(i,j) = norm(n(i,:)-m(j,:));
  end
end
for_time = toc; 

assert(norm(d_vect-d_for) < 1e-10*norm(d_for)) 
Aron Ahmadia
  • 6,951
  • 4
  • 34
  • 54
  • David, nice to see you on scicomp! I shamelessly edited your code fragment and expanded it some, please revert if my edits went the wrong direction from clarifying what you intended. – Aron Ahmadia Apr 09 '12 at 21:56
2

From Octave 3.4.3 and later the operator - does automatic broadcasting (uses bsxfun internally). So you can proceed in this way.

Dx = N(:,1) - M(:,1)';
Dy = N(:,2) - M(:,2)';
D = sqrt (Dx.^2 + Dy.^2);

You can do the same using a 3d matrix but I guess like this is more clear. D is a NxM matrix of distances, every vector in N against every vector in M.

Hope this helps

JuanPi
  • 233
  • 1
  • 7