
I'm aware that most modern GPUs, although designed for floating point, are more or less equivalent in integer performance these days, with a few caveats such as the lack of a fused multiply-add. I'm not sure how this applies to shift operations, though. I'm doing Marching Cubes on the GPU: a first pass writes out a 32-bit packed position for each surface cube, and a later pass unpacks these into the actual vertices in that cube, like this:

ivec3 unpackedPos = ivec3((packedPos >> 20) & 0x3FF,  // x: bits 20..29
                          (packedPos >> 10) & 0x3FF,  // y: bits 10..19
                          packedPos & 0x3FF);         // z: bits 0..9
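
For completeness, here is a rough sketch of both directions of that packing. It assumes packedPos is a uint and that each coordinate fits in 10 bits (0..1023). The helper names packPos, unpackPos and unpackPosBfe are made up for illustration and are not from my actual shader. The last variant uses the built-in bitfieldExtract (GLSL 4.00+ / ES 3.10), which should give the same result. Whether it compiles to fewer instructions than shift-and-mask depends on the driver.

```glsl
// Sketch only: assumes a uint packed value and 10-bit coordinates (0..1023).
// packPos / unpackPos / unpackPosBfe are hypothetical helper names.

uint packPos(uvec3 p)
{
    // x -> bits 20..29, y -> bits 10..19, z -> bits 0..9
    return ((p.x & 0x3FFu) << 20) | ((p.y & 0x3FFu) << 10) | (p.z & 0x3FFu);
}

ivec3 unpackPos(uint packedPos)
{
    // Two shifts and three masks, the same as the snippet above
    return ivec3((packedPos >> 20) & 0x3FFu,
                 (packedPos >> 10) & 0x3FFu,
                 packedPos & 0x3FFu);
}

ivec3 unpackPosBfe(uint packedPos)
{
    // bitfieldExtract(value, offset, bits) zero-extends for unsigned inputs
    return ivec3(bitfieldExtract(packedPos, 20, 10),
                 bitfieldExtract(packedPos, 10, 10),
                 bitfieldExtract(packedPos,  0, 10));
}
```

With these definitions, unpackPos(packPos(p)) should return p unchanged for any p whose components are in 0..1023.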

It just occurred to me to wonder if shader units have barrel shifters in them these days? Am I doing 2 shifts here or 30?

EDIT>> I'm an idiot... Thanks for the answers guys, useful to know, but I've been going about this all wrong. I should just be using the RGB10_A2UI texture format, then packing/unpacking with a single image load/store instruction instead of messing around with bit-shifts myself.

RE_EDIT>> Or not... This method apparently works on red boxes but not on green ones, so it's back to bit-shifts.

russ
24 bit shifters are used in single-precision floating point to align mantissas, so the compiler might generate a few, but I don't think you'll see 30. – Mar 05 '16 at 18:34

1 Answer


Yes: a 32-bit shift is a single native instruction (with 50% of the 32-bit FMA throughput on NVIDIA Maxwell).

See https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#arithmetic-instructions

Fabrice NEYRET