11

I'm using itertools.combinations() as follows:

import itertools
import numpy as np

L = [1,2,3,4,5]
N = 3

output = np.array([a for a in itertools.combinations(L,N)]).T

Which yields me the output I need:

array([[1, 1, 1, 1, 1, 1, 2, 2, 2, 3],
       [2, 2, 2, 3, 3, 4, 3, 3, 4, 4],
       [3, 4, 5, 4, 5, 5, 4, 5, 5, 5]])

I'm using this expression repeatedly and excessively in a multiprocessing environment and I need it to be as fast as possible.

From this post I understand that itertools-based code isn't the fastest solution and using numpy could be an improvement, however I'm not good enough at numpy optimazation tricks to understand and adapt the iterative code that's written there or to come up with my own optimization.

Any help would be greatly appreciated.

EDIT:

L comes from a pandas dataframe, so it can as well be seen as a numpy array:

L = df.L.values
Community
  • 1
  • 1
Khris
  • 2,884
  • 3
  • 27
  • 53
  • 1
    For 2D you'd have `numpy.triu_indices`, but higher dimensions are more difficult – Daniel F Feb 09 '17 at 14:11
  • What about [`sklearn.utils.extmath.cartesian`](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/utils/extmath.py#L557)? – blacksite Feb 09 '17 at 14:16
  • Never looked into `scikit-learn` so far, I'll read up on it. – Khris Feb 09 '17 at 14:22
  • I suppose T is a dynamic list, otherwise you could pre-compute/cache the combinations – BlackBear Feb 09 '17 at 16:33
  • Is T always the same size? – Daniel F Feb 09 '17 at 17:13
  • `T` can be any size though it has some soft upper limit. Its length is great-equal `N` of course. `T` isn't necessarily a list, in fact it comes from a column in a pandas dataframe so it can start off as a numpy array as well. I will edit my post for that. – Khris Feb 10 '17 at 06:20
  • http://stackoverflow.com/questions/16003217/n-d-version-of-itertools-combinations-in-numpy – Curt F. Feb 10 '17 at 06:28
  • @Curt F.: I tried doing `np.fromiter(list(a for a in itertools.combinations(T,N)),dtype=dt)` but this yields a result I can't transpose simply and according to `timeit` it's exactly as fast as my initial code. – Khris Feb 10 '17 at 07:50
  • Khris, can you say anything about what values of N you expect? – Paul Panzer Feb 13 '17 at 11:42

2 Answers2

5

Here's one that's slightly faster than itertools UPDATE: and one (nump2) that's actually quite a bit faster:

import numpy as np
import itertools
import timeit

def nump(n, k, i=0):
    if k == 1:
        a = np.arange(i, i+n)
        return tuple([a[None, j:] for j in range(n)])
    template = nump(n-1, k-1, i+1)
    full = np.r_[np.repeat(np.arange(i, i+n-k+1),
                           [t.shape[1] for t in template])[None, :],
                 np.c_[template]]
    return tuple([full[:, j:] for j in np.r_[0, np.add.accumulate(
        [t.shape[1] for t in template[:-1]])]])

def nump2(n, k):
    a = np.ones((k, n-k+1), dtype=int)
    a[0] = np.arange(n-k+1)
    for j in range(1, k):
        reps = (n-k+j) - a[j-1]
        a = np.repeat(a, reps, axis=1)
        ind = np.add.accumulate(reps)
        a[j, ind[:-1]] = 1-reps[1:]
        a[j, 0] = j
        a[j] = np.add.accumulate(a[j])
    return a

def itto(L, N):
    return np.array([a for a in itertools.combinations(L,N)]).T

k = 6
n = 12
N = np.arange(n)

assert np.all(nump2(n,k) == itto(N,k))

print('numpy    ', timeit.timeit('f(a,b)', number=100, globals={'f':nump, 'a':n, 'b':k}))
print('numpy 2  ', timeit.timeit('f(a,b)', number=100, globals={'f':nump2, 'a':n, 'b':k}))
print('itertools', timeit.timeit('f(a,b)', number=100, globals={'f':itto, 'a':N, 'b':k}))

Timings:

k = 3, n = 50
numpy     0.06967267207801342
numpy 2   0.035096961073577404
itertools 0.7981023890897632

k = 3, n = 10
numpy     0.015058324905112386
numpy 2   0.0017436158377677202
itertools 0.004743851954117417

k = 6, n = 12
numpy     0.03546895203180611
numpy 2   0.00997065706178546
itertools 0.05292179994285107
Paul Panzer
  • 49,838
  • 2
  • 44
  • 91
  • Thank you for the effort, but what are `n` and `k`? Is it possible to write that solution in a way that it can use a list with arbitrary elements e.g. with strings like these `['a','ret','g','fd']`? – Khris Feb 13 '17 at 11:15
  • Also the function used like you do throws an error: `ValueError: need at least one array to concatenate`, it seems to come from the iteration at level 10 or 11. – Khris Feb 13 '17 at 11:18
  • @Khris n and k are the sizes of the set and the subset. You can have the numbers replaced by arbitrary objects by creating an object array `oa = np.array(['a', 'ret', 'g', 'fd'], dtype=object)` and then using the output `out` of `nump` like so `oa[out[0]]`. Re the Exception are you running the script as is or do you use other parameters? If the latter, could you post them? – Paul Panzer Feb 13 '17 at 11:28
  • The exception was my own mistake, nvm that. Thanks for the hint with the indexing, it works, testing the timeit now. – Khris Feb 13 '17 at 12:05
  • I can see your solution being sometimes faster but often slower than mine. The problem is that your solution produces additional data that I'm not using. – Khris Feb 13 '17 at 13:07
  • @Khris added a new version `nump2` which is much faster. It returns an array that can be directly compared to the output of your function (with L=[0,1,2,...,n-1]. – Paul Panzer Feb 13 '17 at 14:30
2

This is is most certainly not faster than itertools.combinations but it is vectorized numpy:

def nd_triu_indices(T,N):
    o=np.array(np.meshgrid(*(np.arange(len(T)),)*N))
    return np.array(T)[o[...,np.all(o[1:]>o[:-1],axis=0)]]

 %timeit np.array(list(itertools.combinations(T,N))).T
The slowest run took 4.40 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 8.6 µs per loop

%timeit nd_triu_indices(T,N)
The slowest run took 4.64 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 52.4 µs per loop

Not sure if this is vectorizable another way, or if one of the optimization wizards around here can make this method faster.

EDIT: Came up with another way, but still not faster than combinations:

%timeit np.array(T)[np.array(np.where(np.fromfunction(lambda *i: np.all(np.array(i)[1:]>np.array(i)[:-1], axis=0),(len(T),)*N,dtype=int)))]
The slowest run took 7.78 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 34.3 µs per loop
Daniel F
  • 12,948
  • 1
  • 24
  • 53