How is PTRACE_SINGLESTEP implemented?

Question

To the best of my knowledge (I could be wrong), there's no way to just execute one instruction on an x86-64 system. Perhaps instead you could execute the instruction followed by the 'ud2' opcode to trigger a signal -- but then you have to worry about the instruction modifying control flow and going somewhere else.

Yet, if I understand correctly, the ptrace() syscall has a SINGLESTEP option that will execute only a single instruction. How is this implemented? I can't imagine the kernel has some kind of disassembler to identify the instruction and reason about it. So, is there some kind of architectural feature it's using that I don't know about? Or something entirely different?

I'm not up-to-date on this, but is the Trap Flag (TF) no longer a thing? — NPE, Aug 21 '16 at 14:05

Peter Cordes · Accepted Answer · 2021-10-29T00:57:09.603

Yes, there's an architectural single-step flag on x86: the Trap Flag (TF) in RFLAGS. Returning from kernel to user-space lets the kernel set both RIP and RFLAGS at the same time, so it can set TF for user-space without having it trigger on a kernel instruction.
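To make that concrete, here is a minimal sketch of the idea, not the kernel's actual code. The `saved_user_frame` struct and both function names are hypothetical, made up for illustration. The only architectural fact it relies on is that TF is bit 8 (0x100) of EFLAGS/RFLAGS.

```c
#include <stdint.h>

/* Trap Flag: bit 8 of EFLAGS/RFLAGS (value 0x100). */
#define RFLAGS_TF (UINT64_C(1) << 8)

/* Hypothetical stand-in for the user-space register state that the kernel
 * saves on entry and restores on return. Not a real kernel type. */
struct saved_user_frame {
    uint64_t rip;    /* where user-space resumes */
    uint64_t rflags; /* flags restored together with rip */
};

/* Arm single-stepping. Only the saved copy changes, so the kernel's own
 * instructions keep running with TF clear; TF takes effect once the frame
 * is restored on the way back to user-space. */
void arm_single_step(struct saved_user_frame *frame)
{
    frame->rflags |= RFLAGS_TF;
}

/* Disarm it again, e.g. once the debug trap has been reported. */
void disarm_single_step(struct saved_user_frame *frame)
{
    frame->rflags &= ~RFLAGS_TF;
}
```

With TF set in the restored flags, the CPU runs one user-space instruction and then raises a debug exception, which the kernel can report to the tracer as a SIGTRAP stop.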

For some reason, the Trap Flag has its own Wikipedia article! See also Wikipedia's EFLAGS article.

See the x86 tag wiki for links to Intel's architecture manuals which document all of this.

"Perhaps instead you could execute the instruction followed by the 'ud2' opcode to trigger a signal"

Then you'd need code to decode x86 instruction lengths. And you wouldn't use ud2; you'd use int3, which exists for this purpose.
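For illustration, here is a rough sketch of how a debugger might plant an int3 through ptrace. The helper names `patch_int3` and `set_breakpoint` are hypothetical. It assumes x86-64 Linux (little-endian, 64-bit `long`) and a tracee that is already attached and stopped.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>

#define INT3_OPCODE 0xCCL /* one-byte x86 breakpoint instruction */

/* Replace the lowest-addressed byte of a word with int3. x86 is
 * little-endian, so that is the least significant byte of the word. */
long patch_int3(long word)
{
    return (word & ~0xFFL) | INT3_OPCODE;
}

/* Plant a breakpoint at addr in a stopped tracee. Returns 0 on success.
 * *saved receives the original word so the caller can restore it later. */
int set_breakpoint(pid_t pid, void *addr, long *saved)
{
    errno = 0;
    long word = ptrace(PTRACE_PEEKTEXT, pid, addr, NULL);
    if (word == -1 && errno != 0) /* -1 can also be valid data */
        return -1;
    *saved = word;
    if (ptrace(PTRACE_POKETEXT, pid, addr, (void *)patch_int3(word)) == -1)
        return -1;
    return 0;
}
```

After the tracee hits the breakpoint, RIP points one byte past the int3, so the debugger has to write the saved word back and rewind RIP by one before resuming.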

IIRC, there are also debug registers which can set hardware breakpoints without modifying the code.

Fun fact: not all ISAs have hardware support for PTRACE_SINGLESTEP.

Case in point, the Linux kernel used to emulate it for ARM, but that required an ARM disassembler in the kernel to place a breakpoint at the next instruction to be executed, even when that is a branch target. It was removed in ~2011; now ptrace(PTRACE_SINGLESTEP) returns -EIO on ARM.

They just ripped out all that complexity instead of trying to make it SMP-safe and support every new instruction set like Thumb-2 and so on. (http://lists.infradead.org/pipermail/linux-arm-kernel/2011-February/041324.html)

So debuggers have to manually use breakpoints on such ISAs instead of having the kernel do it for them. If that means other threads notice a debug-break opcode in memory temporarily, that's not the kernel's problem.
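A debugger that wants to run on both kinds of ISA could probe for kernel support and fall back to breakpoints. This is only a sketch: `try_single_step` and the `step_result` enum are hypothetical names, and treating both EIO and ENOSYS as "unsupported" is a conservative assumption rather than documented behaviour.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>

enum step_result { STEP_OK, STEP_UNSUPPORTED, STEP_ERROR };

/* Ask the kernel to single-step a stopped tracee.
 * STEP_OK:          the tracee was resumed; waitpid() for the SIGTRAP stop.
 * STEP_UNSUPPORTED: the kernel rejected the request itself, so the caller
 *                   should fall back to a temporary breakpoint.
 * STEP_ERROR:       anything else, e.g. pid is not our stopped tracee. */
enum step_result try_single_step(pid_t pid)
{
    if (ptrace(PTRACE_SINGLESTEP, pid, NULL, NULL) == 0)
        return STEP_OK;
    if (errno == EIO || errno == ENOSYS) /* assumption: see note above */
        return STEP_UNSUPPORTED;
    return STEP_ERROR;
}
```

On STEP_UNSUPPORTED the debugger has to work out where execution goes next and plant a breakpoint there, which is exactly the work the kernel stopped doing.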
