Why deploying Numba and using arrays instead of class gives different results for the same parameters in my program?

Question

So, to summarize I have the exact same parameters for my code but with three different methods:

Defining a class for my list of particles SP with three instances/attributes. (around 1 mins)
Defining SP as an array. (around 1 mins)
Defining SP as an array and using Numba (around 5 seconds)

Consider the first case:

n=1000
mu=np.random.uniform(0,1,n)
r=[sqrt(-2*log(1-i)) for i in mu]
eta=np.random.uniform(0,1,n)
theta=2*pi*eta;
cuz=[cos(i) for i in theta]
suz=[sin(i) for i in theta]
Zinitial=[a*b for a,b in zip(r,cuz)];
Pinitial=[a*b for a,b in zip(r,suz)];
class Particle:
    def __init__(self, pos, mom, spin):
        self.pos = pos
        self.mom = mom
        self.spin = spin
   
SP = sorted([Particle(pos = i, mom = j, spin = choice([1, 0])) for i,j in zip(Zinitial,Pinitial)],key=lambda x:x.pos)
Upi=[];
Downi=[];
count_plot=[];
for j in range(len(SP)):
    if SP[j].spin == 1:
        Upi.append(SP[j].pos)
    else:
        Downi.append(SP[j].pos)
Zavgi=sum(Zinitial)/len(Zinitial)
Zreli=sum(Upi)/len(Upi)-sum(Downi)/len(Downi)
"Observables"
Zavg=[Zavgi];
Zrelm=[Zreli];
T_plot=[0];

"Time"

iter=10**(4);
dt=1/(2*n);
alf=sqrt(n);

"Dynamics"


counter=0;
sum1,sum2=0,0;

for i in range(1,iter+1):
        
        t=i*dt;
        T_plot.append(t) 
        Z=[];
        Up=[];
        Down=[];
        c,s=cos(t),sin(t);
        c1,s1=cos(t-dt),sin(t-dt);
        for j in range(n-1):
            collchk=((c*(SP[j].pos)+s*(SP[j].mom))-(c*(SP[j+1].pos)+s*(SP[j+1].mom)))*(c1*(SP[j].pos)+s1*(SP[j].mom)-(c1*(SP[j+1].pos)+s1*(SP[j+1].mom)));

            prel=((c*(SP[j].mom)-s*(SP[j].pos))-(c*(SP[j+1].mom)-s*(SP[j+1].pos)))/2;
               
            rcoeff=1/(1+(prel*alf)**2);
            rand_value=random();
            
            
            if collchk<0:
               
              
               SP[j], SP[j+1]=SP[j+1],SP[j];
               
              
               if rcoeff>rand_value:
                   counter=counter+1
                   SP[j].spin,SP[j+1].spin=SP[j+1].spin,SP[j].spin;
            if SP[j].spin == 1:
                Up.append(c*(SP[j].pos)+s*(SP[j].mom))
            else:
                Down.append(c*(SP[j].pos)+s*(SP[j].mom))
            Z.append(c*(SP[j].pos)+s*(SP[j].mom))

        
        
        Zrel=sum(Up[0:])/len(Up) - sum(Down[0:])/len(Down);
        Zrelm.append(Zrel)
                        
        Zm=sum(Z)/len(Z)
        Zavg.append(Zm)
   

print("Rate of collision per particle = ",counter/(n*(dt*iter)))

The OUTPUT is: Rate of collision per particle = 0.0722

The second case:

n=1000
mu=np.random.uniform(0,1,n)
r=np.array([sqrt(-2*log(1-i)) for i in mu])
eta=np.random.uniform(0,1,n)
theta=2*pi*eta;
cuz=np.array([cos(i) for i in theta])
suz=np.array([sin(i) for i in theta])
Zinitial=np.multiply(r,cuz);
Pinitial=np.multiply(r,suz);

SP = np.array(sorted(np.array([  np.array([i,j,choice([1,0])]) for i, j in zip(Zinitial, Pinitial)]),
                key=lambda x: x[0]))

Upi=[];
Downi=[];
count_plot=[];
for j in range(len(SP)):
    if SP[j][2] == 1:
        Upi.append(SP[j][0])
    else:
        Downi.append(SP[j][0])
Zavgi=sum(Zinitial)/len(Zinitial)
Zreli=sum(Upi)/len(Upi)-sum(Downi)/len(Downi)
"Observables"
Zavg=[Zavgi];
Zrelm=[Zreli];
T_plot=[0];

"Time"

iter=10**(4);
dt=1/(2*n);
alf=sqrt(n);

"Dynamics"


counter=0;
sum1,sum2=0,0;

for i in range(1,iter+1):
        
        t=i*dt;
        T_plot.append(t) 
        Z=[];
        Up=[];
        Down=[];
        c,s=cos(t),sin(t);
        c1,s1=cos(t-dt),sin(t-dt);
        for j in range(n-1):
            collchk=((c*(SP[j][0])+s*(SP[j][1]))-(c*(SP[j+1][0])+s*(SP[j+1][1])))*(c1*(SP[j][0])+s1*(SP[j][1])-(c1*(SP[j+1][0])+s1*(SP[j+1][1])));

            prel=((c*(SP[j][1])-s*(SP[j][0]))-(c*(SP[j+1][1])-s*(SP[j+1][0])))/2;
               
            rcoeff=1/(1+(prel*alf)**2);
            rand_value=random();
            
            
            if collchk<0:
               
              
               SP[j], SP[j+1]=SP[j+1],SP[j];
               
              
               if rcoeff>rand_value:
                   counter=counter+1
                   SP[j][2],SP[j+1][2]=SP[j+1][2],SP[j][2];
            if SP[j][2] == 1:
                Up.append(c*(SP[j][0])+s*(SP[j][1]))
            else:
                Down.append(c*(SP[j][0])+s*(SP[j][1]))
            Z.append(c*(SP[j][0])+s*(SP[j][1]))

        
        
        Zrel=sum(Up[0:])/len(Up) - sum(Down[0:])/len(Down);
        Zrelm.append(Zrel)
                        
        Zm=sum(Z)/len(Z)
        Zavg.append(Zm)
   

print("Rate of collision per particle = ",counter/(n*(dt*iter)))

The OUTPUT is :

Rate of collision per particle = 0.0134

And the quickest Numba case;

import numpy as np
import matplotlib.pyplot as plt
import random
from math import *
from random import *
from numba import jit


"Dynamics"

@jit(nopython=True)
def f(SP, Zavgi, Zreli, alf, dt, n):
    "Time"
    counter = 0;
    
    Zavg = np.array([Zavgi]);
    Zrelm = np.array([Zreli]);
    T_plot = np.array([0]);
    for i in range(1, iter + 1):

        t = i * dt;
        np.append(T_plot,t)
        Z = [];
        Up = [];
        Down = [];
        c, s = cos(t), sin(t);
        c1, s1 = cos(t - dt), sin(t - dt);
        for j in range(n - 1):
            collchk = ((c * (SP[j][0]) + s * (SP[j][1])) - (c * (SP[j + 1][0]) + s * (SP[j + 1][1]))) * (
                    c1 * (SP[j][0]) + s1 * (SP[j][1]) - (c1 * (SP[j + 1][0]) + s1 * (SP[j + 1][1])));

            prel = ((c * (SP[j][1]) - s * (SP[j][0])) - (c * (SP[j + 1][1]) - s * (SP[j + 1][0]))) / 2;

            rcoeff = 1 / (1 + (prel * alf) ** 2);
            rand_value = random();

            if collchk < 0:

                SP[j], SP[j + 1] = SP[j + 1], SP[j];

                if rcoeff > rand_value:
                    counter = counter + 1
                    SP[j][2], SP[j + 1][2] = SP[j + 1][2], SP[j][2];
            if SP[j][2] == 1:
                Up.append(c * (SP[j][0]) + s * (SP[j][1]))
            else:
                Down.append(c * (SP[j][0]) + s * (SP[j][1]))
            Z.append(c * (SP[j][0]) + s * (SP[j][1]))

        Zrel = np.sum(np.array(Up)) / len(Up) - np.sum(np.array(Down)) / len(Down);
        Zrelm = np.append(Zrelm, Zrel)

        Zm = np.sum(np.array(Z)) / len(Z)
        Zavg = np.append(Zavg, Zm)


    return Zavg, Zrelm, counter, T_plot



if __name__ == '__main__':

    n = 1000
    mu = np.random.uniform(0, 1, n)
    r = [sqrt(-2 * log(1 - i)) for i in mu]
    eta = np.random.uniform(0, 1, n)
    theta = 2 * pi * eta;
    cuz = [cos(i) for i in theta]
    suz = [sin(i) for i in theta]
    Zinitial = [a * b for a, b in zip(r, cuz)];
    Pinitial = [a * b for a, b in zip(r, suz)];

    iter = 10 ** (4);
    dt = 1 / (2 * n);
    alf = sqrt(n);


    SP = np.array(sorted(np.array([  np.array([i,j,choice([1,0])]) for i, j in zip(Zinitial, Pinitial)]),
                key=lambda x: x[0]))
    Upi = [];
    Downi = [];
    count_plot = [];
    for j in range(len(SP)):
        if SP[j][2] == 1:
            Upi.append(SP[j][0])
        else:
            Downi.append(SP[j][0])
    Zavgi = np.sum(Zinitial) / len(Zinitial)
    Zreli = np.sum(Upi) / len(Upi) - np.sum(Downi) / len(Downi)


    Zavg, Zrelm, counter, T_plot = f(SP, Zavgi, Zreli, alf, dt, n)
    print("rate= ", counter/(n*(iter*dt)))

The OUTPUT:

rate= 0.00814

As can be seen, the rates are different for all three cases even when the parameters are the same. The one with the Numba has a rate different by a factor of 10.

Since the parameters are the same I would expect the three different methods to give very close rates.

Why is this happening?

EDIT:

I have shortened the time of the code by simply running it for a much smaller time. Due to this I have also removed the graphs which look super weird for such short time scales. I want to stress that the problem seems to be two fold but related: The rates differ by an order for many runs and the graphs for the array case (with and without Numba) appear more jittered (which becomes really prominent for long time runs) than the class case.

This is certainly due to a numerical instability. Why do you expect the result to be the same while calling `random()` in the middle of the code? Is the same code is deterministic? Besides this, there is an error in with the plotting: `ValueError: x and y must have same first dimension, but have shapes (1,) and (100001,)`. Can you fix this and provide a version of the code that does not require to wait for 18+6 minutes so to get the result (or the previous error) ? It make any debugging/check tricky. — Jérôme Richard, May 29 '22 at 17:34
I will try to provide a version that takes lesser time but the value error does not occur in my system. Maybe the problem occurs in some indentation while copy-pasting the code where an array is pasted outside the for loop. But I will still check it with the shorter versions if possible. Besides this, I expect it to give the same result while calling random() because the average behaviour (over time loop) remains almost same. This can be checked by re-running the same code again (which will produce different random values) but almost the same rate. — Lost, May 29 '22 at 17:49
Please add a [\[SO\]: How to create a Minimal, Reproducible Example (reprex (mcve))](https://stackoverflow.com/help/minimal-reproducible-example). What are in your opinion close rates? Is it the same code (because it's not small, and not well written (check https://peps.python.org/pep-0008))? — CristiFati, Jun 04 '22 at 09:30
@CristiFati I won't be able to reduce it fuether without getting rid of what I think important things. And yes it is the same code just written with arrays and with Numba. — Lost, Jun 04 '22 at 11:08

score 1 · Accepted Answer · answered Jun 04 '22 at 13:25

TL;DR: one problem comes from the naive line SP[j], SP[j + 1] = SP[j + 1], SP[j] behaving differently between programs due to the different type of SP. Another comes from the random number generation that behave differently in Numba (hard to track without taking care of reproducibility).

Debugging

Before delving into the problem, the first thing to do is to change the code so to get deterministic results so to be able to reproduce the error in a debugger and then track the difference with this. Then, you need to adapt the code so it can run quickly and be easy to debug since the program is generally executed several time before the bug can be located. Then, you can log the values of the variables impacting counter (which is the variable causing different results). Then, you can check the difference and locate a step where counter is different between the program. Finally, you can backtrack to the origin of the bug step by step. This is a generic method to debug such a program quite efficiently. Note that is could be even more efficient if you could use a reverse debugger (but it turns out that I did not found one that works on my machine for Python yet).

The key to get deterministic reproducible results is to use the following code:

np.random.seed(0)
seed(0)

The random Number generator of Numba is not synchronized with the one of Numpy by default and so the seed needs to be put in the Numba function so to actually impact the seed of the random number generator of Numba (rather than the one of Numpy from CPython). Moreover, random() of Numba is also not synchronized with the one of CPython, but using the same seed is not sufficient since Numba appears not to use the same algorithm internally anyway. Using np.random.rand() instead of random() fixes the reproducibility issue.

Under the hood

Here are the backtracking steps to understand what is the root of the problem:

counter differs at a specific step (when (i,j)=(2,20) for n=100 and iter=3)
collchk and prel are set to 0 in the second program while they are both different to 0 in the first
SP[j] and SP[j+1] are equal in the second program but not in the first
SP[j], SP[j + 1] = SP[j + 1], SP[j] causes the value to be equal only in the second program

This line behave differently in the two first program because the former deals with a 1D array of object wile the later deals with a 2D array of native values. Indeed, in the former, the object are properly swapped but the later operates on Numpy views and the implicit copy of the view does not copy the array content resulting in the overwrite. Here is how it works internally:

SP[j], SP[j + 1] = SP[j + 1], SP[j] is implicitly translated to something equivalent to that:

tmp1 = SP[j+1,:]
tmp2 = SP[j,:]
SP[j,:] = tmp1
SP[j+1,:] = tmp2

The thing is that tmp1 and tmp2 are references of two Numpy views: SP[j,:] and SP[j+1,:]. Thus, SP[j] = tmp1 causes the content of the view of tmp2 to be also overwritten. Thus, in the end, the row is replicated resulting to a sneaky bug.

One solution to fix this problem is to explicitly call np.copy on the swapped row. Swap two values in a numpy array. also provides an alternative interesting solution (note the second answer which is validated is currently the one causing bogus results in your case and comments pointed out that this was not a safe solution). Using SP[[j, j+1]] = SP[[j+1, j]] fixes the problem because SP[[j+1, j]] creates a temporary array (which is not a view of SP). Note that this solution is not supported by Numba. Numba does support the copy but creating temporary arrays is expensive. For the Numba code, you need to use the following code:

for k in range(SP.shape[1]):
    SP[j, k], SP[j + 1, k] = SP[j + 1, k], SP[j, k]

Finally, note that using np.random.rand with the above fix is sufficient to get the same results in Numba.

Why deploying Numba and using arrays instead of class gives different results for the same parameters in my program?

1 Answers1

Debugging

Under the hood

Linked