The ud2 instruction (2 bytes) is a rather compact way to raise SIGILL on POSIX platforms. Are there any other similarly tight ways of raising a hardware exception on x86_64?
-
`ud2` is an x86_64 instruction. Note that it is unrelated to POSIX, and POSIX does not actually provide a `__builtin_trap` function. – fuz May 23 '22 at 12:29
-
You can use an instruction that is invalid in 64-bit mode; e.g. `AAA` (opcode `0x37`) will produce #UD. – Jester May 23 '22 at 12:36
-
@fuz What I meant is that POSIX kernel turns it into a SIGILL signal but that's not super important. I'm simply curious if another exception (SIGFPE? SIGSEGV? or another SIGILL that is distinguishable from the ud2-caused SIGILL) can be raised with similarly compact code (well, other than SIGTRAP, which can be raised with a one-byte INT3, but isn't suitable for my purpose). – PSkocik May 23 '22 at 12:37
-
@Jester Thanks! That looks promising. Is there any way to force it onto the GNU assembler, which rejects it with `'aaa' is not supported in 64-bit mode`? – PSkocik May 23 '22 at 12:39
-
Use `.byte 0x37`. You can also use privileged instructions for #GP. – Jester May 23 '22 at 12:44
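For readers who want to try the `.byte 0x37` suggestion from C, here is a minimal sketch. It assumes x86-64 Linux with GCC or Clang (GNU-style inline asm); the function name `signal_from_byte_0x37` is made up for illustration. The faulting byte runs in a forked child so the caller survives and can report which signal arrived.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: executes the raw byte 0x37 in a child process and
   returns the signal that killed the child (0 if it exited normally,
   -1 if fork or waitpid failed). Assumes x86-64. */
int signal_from_byte_0x37(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* The assembler rejects the mnemonic `aaa` in 64-bit mode,
           so the opcode byte is emitted directly. */
        __asm__ volatile(".byte 0x37");
        _exit(0); /* only reached if the byte did not fault */
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

On current x86-64 CPUs under Linux the return value should compare equal to `SIGILL`.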
-
@PSkocik You can reliably raise a `SIGFPE` using `DIV AH`. – fuz May 23 '22 at 13:27
-
@PSkocik Also note that there is no such thing as a POSIX kernel. It's an implementation detail of various UNIX-like operating systems to generate a SIGILL for undefined instructions. No standard exists to say what x86 hardware exceptions should be mapped to. There's only historical practice. – fuz May 23 '22 at 13:29
-
As discussed in [Looking for a \_one byte\_ invalid opcode with x86](https://stackoverflow.com/q/72334461), one-byte opcodes that currently #UD are *not* guaranteed to do so on future CPUs; only `ud2` is future-proof on paper. Hopefully some future extension will use that 64-bit-only coding space for something, instead of only longer encodings that can also be valid in 32-bit mode like VEX and EVEX prefixes. But on all current x86-64 CPUs, yes, things like `0x37` (32-bit mode AAA) will `#UD`. – Peter Cordes May 23 '22 at 14:52
-
@fuz: The wording of the POSIX standard does fairly strongly imply that if a kernel is going to deliver a signal as a result of a hardware exception on an arithmetic instruction, the signal should be SIGFPE. https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/signal.h.html lists `si_code` values like integer divide, integer overflow, and various unmasked FP exception types. (related: [On which platforms does integer divide by zero trigger a floating point exception?](https://stackoverflow.com/a/37266507)). – Peter Cordes May 23 '22 at 14:57
-
I've seen people say that SIGBUS is more appropriate for misaligned `movdqa`, too, unlike Linux's SIGSEGV. IIRC MacOS raises SIGBUS on something like `movaps xmm0, [-1]`. (More compact if you had a known register value that's either odd or even, so you could offset it by an odd disp8 if needed, or a RIP-relative to reach an odd address.) – Peter Cordes May 23 '22 at 15:01
-
Can you clarify more what kind of "hardware exception" you mean? Are you specifically looking for an instruction that will raise a different signal than `SIGILL`, or are there other criteria? `hlt` is a popular way to get SIGSEGV in one byte. – Nate Eldredge May 24 '22 at 00:28
-
@NateEldredge Thanks. It's basically already well-answered in the comments and the linked question. Was looking for a compact way of raising *some* signal (perhaps even another SIGILL, as long as it's distinguishable in a signal action from the ud2 SIGILL, which it will be if it's caused by a different instruction). – PSkocik May 24 '22 at 05:53
-
You'd like ARMv8. Their permanently undefined instruction `udf` consists of any 32-bit word of which the high 16 bits are zero, and the low 16 bits can be anything. So you effectively get 65536 easy-to-remember undefined instructions. – Nate Eldredge May 24 '22 at 06:15
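The encoding rule described in that comment can be written down as a tiny check. This is only an illustration of the stated rule for the 64-bit (A64) instruction set; the helper names are invented here.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if `word` matches the A64 `udf` pattern described above:
   the high 16 bits are all zero. */
bool is_a64_udf(uint32_t word)
{
    return (word >> 16) == 0;
}

/* The 16-bit immediate carried in the low half of a `udf` word. */
uint16_t a64_udf_imm(uint32_t word)
{
    return (uint16_t)(word & 0xffffu);
}
```

A handler could use the immediate to tell different trap sites apart, which is the "65536 undefined instructions" point made in the comment.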
-
@fuz Ran into another one: `lock std`, 2 bytes (felixcloutier.com/x86/std; `.word 0x3c0f`), generates #UD regardless of mode. There are probably more of those. But so far, I've really liked the one you suggested. – PSkocik Jun 04 '22 at 11:56
-
@PSkocik The classic undefined instruction is `ff ff` but today people prefer `ud1` and `ud2` (the difference is that `ud1` takes modr/m operands, `ud2` does not). I would not build on `lock std` remaining undefined in the foreseeable future. – fuz Jun 04 '22 at 12:19