For performance reasons I often use numba, and in my code I need to draw a random sample without replacement. I found that I can use np.random.choice for this, but inside a jitted function it is extremely slow compared to random.sample in plain Python. Am I doing something wrong? How can I improve the performance of the numba function? I boiled my code down to this minimal example:
import random
from timeit import timeit
import numpy as np
import numba as nb
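# Plain Python: 20000 samples of 10 items each from a range of 100000.
def func2():
    List = range(100000)
    for x in range(20000):
        random.sample(List, 10)

# Same workload inside a jitted function (third argument is replace=False).
@nb.njit()
def func3():
    Array = np.arange(100000)
    for x in range(20000):
        np.random.choice(Array, 10, False)

print(timeit(lambda: func2(), number=1))
print(timeit(lambda: func3(), number=1))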
>>>0.1196
>>>20.1245
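A note on the measurement itself: the first call to an njit function also triggers compilation, so timing with number=1 on a fresh function includes compile time. A minimal sketch of a warm-up wrapper follows; `time_after_warmup` is a hypothetical helper name, not part of timeit or numba, and nothing here claims that compilation explains the gap above.

```python
from timeit import timeit

def time_after_warmup(fn, number=1):
    """Call fn once untimed (so any JIT compilation happens here), then time it."""
    fn()  # warm-up call, excluded from the measurement
    return timeit(fn, number=number)

# Stand-in workload for illustration; a jitted function would be passed the same way.
def workload():
    return sum(range(1000))

print(time_after_warmup(workload, number=1))
```

With the functions above, this would be used as time_after_warmup(func3) instead of timeit(lambda: func3(), number=1).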
Edit: I'm now using my own sample function, which is much faster than np.random.choice.
@nb.njit()
def func4():
    for x in range(20000):
        # Rebuild the population, then pop 10 random elements from it.
        rangeList = list(range(100000))
        result = []
        for x in range(10):
            randint = random.randint(0, len(rangeList) - 1)
            result.append(rangeList.pop(randint))
    return result  # the last sample drawn

print(timeit(lambda: func4(), number=1))
>>>0.1767
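One possible variation, sketched here as an assumption rather than a measured improvement: when the sample is much smaller than the population, distinct indices can be drawn by rejection sampling, which avoids rebuilding the 100000-element list for every sample. The helper name `sample_indices` is hypothetical. It uses only plain loops, a list, and random.randrange, which numba is expected to accept under @nb.njit, but that has not been verified here.

```python
import random

def sample_indices(n, k):
    """Return k distinct integers drawn uniformly from range(n)."""
    if k > n:
        raise ValueError("k must not exceed n")
    chosen = []
    while len(chosen) < k:
        candidate = random.randrange(n)  # uniform in 0 .. n-1
        # Reject the candidate if it was already drawn.
        duplicate = False
        for value in chosen:
            if value == candidate:
                duplicate = True
                break
        if not duplicate:
            chosen.append(candidate)
    return chosen

indices = sample_indices(100000, 10)
print(indices)
```

The duplicate check costs O(k) per draw, so this only makes sense while k stays small relative to n, as with 10 out of 100000. To sample from an arbitrary array, the returned indices would be used to index into it.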