I am trying to implement an encryption cipher in CUDA. It takes a 128-bit key as input, and the operations performed on the key are shifts and additions. I have currently implemented it using an array of two elements, each holding 64 bits of data.
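To make the two-word layout concrete, here is a minimal sketch of how 128-bit addition and shifts can be built from two 64-bit halves. It is only an illustration, not the actual cipher code: the names `U128`, `add128`, `shl128`, `shr128` and the `HD` macro are hypothetical, and the sketch assumes unsigned arithmetic modulo 2^128 with logical (zero-filling) shifts. The macro is there so the same source compiles as plain C++ on the host and as device code under nvcc.

```cpp
#include <cstdint>

// Only add CUDA qualifiers when compiled by nvcc; plain C++ otherwise.
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// Hypothetical 128-bit value stored as two 64-bit words.
struct U128 {
    uint64_t lo;  // bits 0..63
    uint64_t hi;  // bits 64..127
};

// Addition modulo 2^128: a carry out of the low word feeds the high word.
HD inline U128 add128(U128 a, U128 b) {
    U128 r;
    r.lo = a.lo + b.lo;
    // Unsigned overflow wrapped around if the sum is smaller than an operand.
    r.hi = a.hi + b.hi + (r.lo < a.lo ? 1u : 0u);
    return r;
}

// Logical left shift by n bits; shifts of 128 or more yield zero.
HD inline U128 shl128(U128 a, unsigned n) {
    U128 r = {0, 0};
    if (n == 0) {
        r = a;
    } else if (n < 64) {
        // Bits leaving the top of the low word enter the high word.
        r.hi = (a.hi << n) | (a.lo >> (64 - n));
        r.lo = a.lo << n;
    } else if (n < 128) {
        r.hi = a.lo << (n - 64);
    }
    return r;
}

// Logical right shift by n bits; shifts of 128 or more yield zero.
HD inline U128 shr128(U128 a, unsigned n) {
    U128 r = {0, 0};
    if (n == 0) {
        r = a;
    } else if (n < 64) {
        // Bits leaving the bottom of the high word enter the low word.
        r.lo = (a.lo >> n) | (a.hi << (64 - n));
        r.hi = a.hi >> n;
    } else if (n < 128) {
        r.lo = a.hi >> (n - 64);
    }
    return r;
}
```

The `n == 0` case is handled separately because shifting a 64-bit word by 64 bits is undefined behaviour in C++. The same carry-and-spill idea would apply to four 32-bit lanes, since the built-in vector types do not come with arithmetic operators of their own.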
From the CUDA documentation, I see that I can create a 128-bit vector (four 32-bit integers) using
int4 make_int4(int x, int y, int z, int w);
However, I am unable to perform shift or addition operations on it. Could you point me to where to look, or help me with this? Thanks in advance.