I'm working with DirectCompute, but I suppose this question applies to GPU programming in general.
As far as I know, the threads in a group execute in lockstep. That means every instruction is executed by every thread at the same time, right? But what if only one thread out of 1024 takes a different path through an if/else branch? Do the other 1023 threads just wait, or is the lockstep guarantee violated?
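
To make the scenario concrete, here is a small CPU-side C++ sketch of what I mean. It is not shader code and says nothing about how the hardware actually schedules the threads; it only models a group of 1024 threads in which a single thread satisfies the branch condition. The names `kGroupSize`, `takes_branch` and `count_paths` are made up for illustration, and the loop merely stands in for the threads of one group.

```cpp
#include <cassert>

// Assumed group size, matching the 1024 threads in the question.
constexpr int kGroupSize = 1024;

// Hypothetical branch condition: only thread 0 satisfies it.
bool takes_branch(int thread_id) {
    return thread_id == 0;
}

struct PathCounts {
    int if_path;    // threads that enter the `if` body
    int else_path;  // threads that enter the `else` body
};

// Walks over every thread id in the group and records which side
// of the branch it would take. A real GPU does not run the threads
// one by one like this; the loop is only a model of the scenario.
PathCounts count_paths() {
    PathCounts counts{0, 0};
    for (int tid = 0; tid < kGroupSize; ++tid) {
        if (takes_branch(tid)) {
            ++counts.if_path;
        } else {
            ++counts.else_path;
        }
    }
    return counts;
}
```

Calling `count_paths()` gives one thread on the `if` side and 1023 on the `else` side, which is exactly the split I'm asking about.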