The obvious differences are that PAND's encoding is one byte longer than ANDPS's, and that PAND (in its XMM form) is available only from SSE2, but other than that, I cannot understand why there should be two different instructions. They both take xmm1, xmm2/m128 as arguments and perform a bitwise AND. There is no difference between performing a bitwise operation on integers and on floats, or is there? Even more interesting, according to the Intel Intrinsics Guide, the throughput (CPI) for ANDPS is 1 while for PAND it is 0.33. Does this mean PAND runs 3 times slower than ANDPS? Can these two operations ever produce different results when given the same inputs?
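To make the last question concrete, here is a minimal sketch of how the two could be compared bit for bit. It assumes an x86 target with SSE2 and uses the intrinsics `_mm_and_ps` and `_mm_and_si128`, which are documented as corresponding to ANDPS and PAND. The helper names (`and_with_andps`, `and_with_pand`, `and_results_match`) are hypothetical and only for illustration, and an optimizing compiler is free to emit either instruction for either intrinsic, so this checks the results, not which opcode is executed.

```c
#include <emmintrin.h> /* SSE2 intrinsics; also pulls in the SSE ones */
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: AND two 16-byte buffers through the float-domain
   intrinsic (_mm_and_ps, documented as ANDPS). */
static void and_with_andps(const uint8_t a[16], const uint8_t b[16],
                           uint8_t out[16])
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    /* The casts only reinterpret the bits; they do not convert values. */
    __m128 r = _mm_and_ps(_mm_castsi128_ps(va), _mm_castsi128_ps(vb));
    _mm_storeu_si128((__m128i *)out, _mm_castps_si128(r));
}

/* Hypothetical helper: the same AND through the integer-domain
   intrinsic (_mm_and_si128, documented as PAND). */
static void and_with_pand(const uint8_t a[16], const uint8_t b[16],
                          uint8_t out[16])
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, _mm_and_si128(va, vb));
}

/* Returns 1 if both paths produce identical bytes for these inputs. */
static int and_results_match(const uint8_t a[16], const uint8_t b[16])
{
    uint8_t r_ps[16], r_si[16];
    and_with_andps(a, b, r_ps);
    and_with_pand(a, b, r_si);
    return memcmp(r_ps, r_si, sizeof r_ps) == 0;
}
```

Calling `and_results_match` with arbitrary byte patterns, including ones that would be NaN or denormal if read as floats, is one way to look for any input where the two results differ.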
xiver77
Also https://stackoverflow.com/q/24943521/2945027 – Alex Guteniev Dec 15 '21 at 15:57