It turns out that instead of np.split, list comprehension is more performative. So the below function (almost like @unutbu's consecutive function except it uses a list comprehension to split the array) is much faster:
def consecutive_w_list_comprehension(arr, stepsize=1):
idx = np.r_[0, np.where(np.diff(arr) != stepsize)[0]+1, len(arr)]
return [arr[i:j] for i,j in zip(idx, idx[1:])]
For example, for an array of length 100_000, consecutive_w_list_comprehension is over 4x faster:
arr = np.sort(np.random.choice(range(150000), size=100000, replace=False))
%timeit -n 100 consecutive(arr)
96.1 ms ± 1.22 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit -n 100 consecutive_w_list_comprehension(arr)
23.2 ms ± 858 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In fact, this relationship holds up no matter the size of the array. The plot below shows the runtime difference between the answers on here.
![enter image description here]()
Code used to produce the plot above:
import perfplot
import numpy as np
def consecutive(data, stepsize=1):
return np.split(data, np.where(np.diff(data) != stepsize)[0]+1)
def consecutive_w_list_comprehension(arr, stepsize=1):
idx = np.r_[0, np.where(np.diff(arr) != stepsize)[0]+1, len(arr)]
return [arr[i:j] for i,j in zip(idx, idx[1:])]
def group_consecutives(vals, step=1):
run = []
result = [run]
expect = None
for v in vals:
if (v == expect) or (expect is None):
run.append(v)
else:
run = [v]
result.append(run)
expect = v + step
return result
def JozeWs(array):
diffs = np.diff(array) != 1
indexes = np.nonzero(diffs)[0] + 1
groups = np.split(array, indexes)
return groups
perfplot.show(
setup = lambda n: np.sort(np.random.choice(range(2*n), size=n, replace=False)),
kernels = [consecutive, consecutive_w_list_comprehension, group_consecutives, JozeWs],
labels = ['consecutive', 'consecutive_w_list_comprehension', 'group_consecutives', 'JozeWs'],
n_range = [2 ** k for k in range(5, 22)],
equality_check = lambda *lst: all((x==y).all() for x,y in zip(*lst)),
xlabel = '~len(arr)'
)