According to What is a "side effect?", a side effect means changing the outside world. But what about a function that changes outside state during execution and then restores that state to its original value before returning? For example, I originally had a physics-simulation function that does not modify the caller's data (it takes a copy and simulates on that):
int predictNumberOfStoppedMarbles(std::vector<Marble> marbles){
    //some physics simulation that modifies x,y,vx,vy of marbles but not add,remove marbles or change sequence of marbles
    int count=0;
    for(Marble& marble : marbles){
        count += marble.vx==0 && marble.vy==0?1:0;
    }
    return count;
}
However, I found this version too slow, because it has to copy all the marbles every time the function runs, so I changed it as follows to mutate the incoming data directly:
int predictNumberOfStoppedMarbles(std::vector<Marble>& marbles){
    std::vector<std::vector<float> > originalDataArray;
    for(Marble& marble : marbles){ //backup original x,y,vx,vy
        originalDataArray.push_back({marble.x,marble.y,marble.vx,marble.vy});
    }
    //some physics simulation that modifies x,y,vx,vy of marbles but not add,remove marbles or change sequence of marbles
    int count=0;
    for(Marble& marble : marbles){
        count += marble.vx==0 && marble.vy==0?1:0;
    }
    for(std::size_t i=0;i<marbles.size();i++){ //restore original x,y,vx,vy
        marbles[i].x=originalDataArray[i][0];
        marbles[i].y=originalDataArray[i][1];
        marbles[i].vx=originalDataArray[i][2];
        marbles[i].vy=originalDataArray[i][3];
    }
    return count;
}
Now it modifies the outer data source (the caller's marbles) directly during the simulation, but the function restores the data before it returns. Is the function still considered to have "no side effects"?
Note: In the real code, the physics engine needs to accept the Marble type as a parameter, and it is not easy to copy or rewrite the physics logic so that it operates on float arrays instead of Marble, so the solution of modifying a copied array is not suitable for me.
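For reference, the backup-and-restore pattern described above can be sketched with a small guard object, so the restore also runs if the simulation returns early or throws. This is only an illustration under stated assumptions: the Marble struct here is a hypothetical stand-in with just the four mutated fields, MarbleState and MarbleStateGuard are made-up names, and simulate is a placeholder for the real physics step. It also assumes, as stated in the question, that the simulation never adds, removes, or reorders marbles.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the real Marble type, which may carry more fields.
struct Marble {
    float x, y, vx, vy;
};

// Compact backup of the four fields the simulation mutates.
struct MarbleState {
    float x, y, vx, vy;
};

// Saves x, y, vx, vy on construction and writes them back on destruction,
// so the restore happens on every exit path, including exceptions.
class MarbleStateGuard {
public:
    explicit MarbleStateGuard(std::vector<Marble>& marbles) : marbles_(marbles) {
        saved_.reserve(marbles.size());
        for (const Marble& m : marbles) {
            saved_.push_back({m.x, m.y, m.vx, m.vy});
        }
    }
    ~MarbleStateGuard() {
        // Assumes the simulation did not add, remove, or reorder marbles.
        for (std::size_t i = 0; i < saved_.size(); ++i) {
            marbles_[i].x = saved_[i].x;
            marbles_[i].y = saved_[i].y;
            marbles_[i].vx = saved_[i].vx;
            marbles_[i].vy = saved_[i].vy;
        }
    }
    MarbleStateGuard(const MarbleStateGuard&) = delete;
    MarbleStateGuard& operator=(const MarbleStateGuard&) = delete;

private:
    std::vector<Marble>& marbles_;
    std::vector<MarbleState> saved_;
};

// Placeholder physics step (hypothetical): move once, stop anything moving left.
void simulate(std::vector<Marble>& marbles) {
    for (Marble& m : marbles) {
        m.x += m.vx;
        m.y += m.vy;
        if (m.vx < 0) {
            m.vx = 0;
            m.vy = 0;
        }
    }
}

int predictNumberOfStoppedMarbles(std::vector<Marble>& marbles) {
    MarbleStateGuard guard(marbles);  // backup happens here
    simulate(marbles);
    int count = 0;
    for (const Marble& m : marbles) {
        count += (m.vx == 0 && m.vy == 0) ? 1 : 0;
    }
    return count;  // guard's destructor restores the fields after count is read
}
```

The guard does not change the question itself: the caller's data is still written to while the function runs, so another thread reading marbles at that moment, or a caller holding a const vector, would still notice the difference from the copying version.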
… marble.vx==0 && marble.vy == 0 condition can SIMD vectorize without shuffles, like load a vector of 4 or 8 VX velocity components and another vector of 4 or 8 VYs, compare each against zero and AND the compare results together. Then accumulate the 0 / -1 (0xFFFFFFFF) compare results with an integer subtract into a count vector. Like x86 vcmpeqps ymm1, ymm0, [rdi] / vcmpeqps ymm2, ymm0, [rsi] / vpandd ymm1, ymm2 / vpsubd ymm7, ymm1. (Then horizontal sum the count vector after the loop.) – Peter Cordes Mar 11 '24 at 21:16

… -1 instead of 0. But half of your vector width is then wasted on elements you didn't need, X and Y components, and vpcmpeqq 64-bit integer compare makes a vector with the result from 1 marble instead of from 4 marbles with vpand. Perhaps shuffle together multiple vectors before that step, but that's still more work, and auto-vectorization probably won't. – Peter Cordes Mar 11 '24 at 21:21

… std::vector<float>, so each element of the outer std::vector is three pointers to dynamically allocated space for four floats. That's hilariously inefficient, like 24 bytes of pointers and a dynamic allocation for every 16 bytes of floats. Or instead of a struct, you could just have a flat std::vector<float> as the outer array, and use i*4 + 0..3 to access the components of marble i. – Peter Cordes Mar 11 '24 at 21:23

… marbles[i] for the unchanging parts of the marble data so you don't have to copy it all if there's stuff like colour, radius, mass or density, elasticity, spin, etc.) That would avoid the entire problem of a temporary side effect as well as reducing copying. – Peter Cordes Mar 11 '24 at 21:30
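The layout idea in these comments, keeping the velocity components in their own contiguous arrays so the stopped-marble test needs no shuffles, might look roughly like the sketch below. MarbleVelocities and countStopped are hypothetical names, not part of the question's code, and the sketch assumes vx and vy always have the same length. Whether a compiler actually turns the loop into SIMD compares depends on the compiler and its flags; nothing here guarantees it.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical structure-of-arrays layout: one flat array per velocity component.
// Assumption: vx.size() == vy.size().
struct MarbleVelocities {
    std::vector<float> vx;
    std::vector<float> vy;
};

// Counts marbles whose velocity is zero in both components.
// The branch-free form (bitwise & on the two comparison results) is the kind of
// loop a compiler may be able to auto-vectorize.
int countStopped(const MarbleVelocities& v) {
    int count = 0;
    for (std::size_t i = 0; i < v.vx.size(); ++i) {
        count += (v.vx[i] == 0.0f) & (v.vy[i] == 0.0f);
    }
    return count;
}
```

With this layout, a simulation working on a scratch copy would only need to copy the position and velocity arrays, not the unchanging per-marble data, which is the point made in the last comment.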