CUDA's atomicMin operation seems to find only the minimum value in a chunk of device memory. Is there any way to find which block/thread produced this minimum value? I have a compute capability 2.0 device.
Hailiang Zhang
1 Answer
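If you are doing an atomicMin on a 32-bit value, you can instead use a generalized (custom) atomic operation on a 64-bit value, in which 32 bits hold the value being minimized and the other 32 bits hold the global index of the thread. If the value occupies the upper 32 bits, comparing the packed 64-bit quantities orders them by value first, so the winning entry carries the index of the thread that supplied the minimum. A general approach is outlined here.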
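The packing idea above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the linked approach itself: the names `pack`, `atomicMinPacked`, `minKernel` and `findMinPacked` are made up for this example, the data is assumed to be `unsigned int` (signed or floating-point values would first need an order-preserving mapping to unsigned), and error checking is omitted. It relies on 64-bit `atomicCAS` on global memory, which is available on cc 2.0.

```cuda
#include <cuda_runtime.h>

// Hypothetical helper: value in the upper 32 bits, thread index in the lower
// 32 bits, so that an unsigned 64-bit compare orders by value first.
__host__ __device__ inline unsigned long long pack(unsigned int val, unsigned int idx) {
    return (static_cast<unsigned long long>(val) << 32) | idx;
}

// Hypothetical custom atomic: a 64-bit "min" built from a 64-bit atomicCAS loop.
__device__ unsigned long long atomicMinPacked(unsigned long long *addr,
                                              unsigned long long candidate) {
    unsigned long long old = *addr;
    while (candidate < old) {                  // only try while we would improve
        unsigned long long assumed = old;
        old = atomicCAS(addr, assumed, candidate);
        if (old == assumed) break;             // our swap went through
    }
    return old;
}

// Each thread offers its element together with its global index.
__global__ void minKernel(const unsigned int *data, unsigned int n,
                          unsigned long long *result) {
    unsigned int gid = blockIdx.x * blockDim.x + threadIdx.x;
    if (gid < n) atomicMinPacked(result, pack(data[gid], gid));
}

// Host wrapper (assumes n > 0): returns the packed (min value, global index).
unsigned long long findMinPacked(const unsigned int *h_data, unsigned int n) {
    const unsigned int threads = 256;          // assumed block size
    unsigned int *d_data = nullptr;
    unsigned long long *d_result = nullptr;
    unsigned long long packed = ~0ULL;         // start above any real candidate

    cudaMalloc(&d_data, n * sizeof(unsigned int));
    cudaMalloc(&d_result, sizeof(unsigned long long));
    cudaMemcpy(d_data, h_data, n * sizeof(unsigned int), cudaMemcpyHostToDevice);
    cudaMemcpy(d_result, &packed, sizeof(packed), cudaMemcpyHostToDevice);

    minKernel<<<(n + threads - 1) / threads, threads>>>(d_data, n, d_result);

    cudaMemcpy(&packed, d_result, sizeof(packed), cudaMemcpyDeviceToHost);
    cudaFree(d_data);
    cudaFree(d_result);
    return packed;
}
```

On the host, the minimum is `packed >> 32` and the global thread index is the low 32 bits; with the 1D launch assumed here, the block is `index / 256` and the thread within the block is `index % 256`. When several threads hold the same minimum, this packing favors the lowest index.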
Since 64-bit atomicMin is only supported on devices of compute capability 3.5 and higher, and you have a cc 2.0 device, I assume you are finding 32-bit minimum values.
If you are working with 64-bit values, the packing trick no longer fits in a single 64-bit atomic, so you can instead use a parallel reduction technique that carries both the minimum (or maximum) value and its index through the reduction. This question/answer demonstrates a parallel reduction approach which finds both the maximum and its index, per row of a matrix.
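As a rough sketch of carrying the index through a reduction (not the code from the linked answer), the kernel below reduces (value, index) pairs in shared memory within each block, and the host then reduces the per-block results. The names `blockMinIndex` and `findMinIndex`, the `BLOCK` size of 256, and the choice of `long long` data are assumptions made for this example; error checking is omitted.

```cuda
#include <climits>
#include <vector>
#include <cuda_runtime.h>

#define BLOCK 256  // assumed block size; must be a power of two for this loop

// Each block reduces its slice to one (minimum value, global index) pair.
__global__ void blockMinIndex(const long long *data, unsigned int n,
                              long long *outVal, unsigned int *outIdx) {
    __shared__ long long sVal[BLOCK];
    __shared__ unsigned int sIdx[BLOCK];
    unsigned int tid = threadIdx.x;
    unsigned int gid = blockIdx.x * blockDim.x + tid;

    // Out-of-range threads contribute the largest possible value.
    sVal[tid] = (gid < n) ? data[gid] : LLONG_MAX;
    sIdx[tid] = gid;
    __syncthreads();

    // Tree reduction: the index always travels with its value.
    for (unsigned int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (tid < s) {
            long long v = sVal[tid + s];
            unsigned int i = sIdx[tid + s];
            // Smaller value wins; on a tie, the smaller index wins.
            if (v < sVal[tid] || (v == sVal[tid] && i < sIdx[tid])) {
                sVal[tid] = v;
                sIdx[tid] = i;
            }
        }
        __syncthreads();
    }
    if (tid == 0) {
        outVal[blockIdx.x] = sVal[0];
        outIdx[blockIdx.x] = sIdx[0];
    }
}

// Host wrapper (assumes non-empty input): returns the global index of the
// minimum and writes the minimum value to *minVal.
unsigned int findMinIndex(const std::vector<long long> &h, long long *minVal) {
    unsigned int n = static_cast<unsigned int>(h.size());
    unsigned int blocks = (n + BLOCK - 1) / BLOCK;
    long long *d_data = nullptr, *d_val = nullptr;
    unsigned int *d_idx = nullptr;

    cudaMalloc(&d_data, n * sizeof(long long));
    cudaMalloc(&d_val, blocks * sizeof(long long));
    cudaMalloc(&d_idx, blocks * sizeof(unsigned int));
    cudaMemcpy(d_data, h.data(), n * sizeof(long long), cudaMemcpyHostToDevice);

    blockMinIndex<<<blocks, BLOCK>>>(d_data, n, d_val, d_idx);

    std::vector<long long> val(blocks);
    std::vector<unsigned int> idx(blocks);
    cudaMemcpy(val.data(), d_val, blocks * sizeof(long long), cudaMemcpyDeviceToHost);
    cudaMemcpy(idx.data(), d_idx, blocks * sizeof(unsigned int), cudaMemcpyDeviceToHost);
    cudaFree(d_data);
    cudaFree(d_val);
    cudaFree(d_idx);

    // Final pass over the per-block winners; blocks are visited in index order,
    // so a strict comparison keeps the lowest index on ties.
    unsigned int best = 0;
    for (unsigned int b = 1; b < blocks; ++b)
        if (val[b] < val[best]) best = b;
    *minVal = val[best];
    return idx[best];
}
```

The final pass runs on the host here only to keep the sketch short; for large grids, a second kernel launch over the per-block results would be the more usual choice.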
Robert Crovella