Is there a CPU ISA preferring a test for the value of one over testing for zero?

Question

While discussing a question about the origin of Zero as value for the default exit code for success, I reflected if there is any Instruction Set Architecture or implementation thereof where testing a byte or word for a value of ONE is favoured over testing for a value of ZERO, but couldn't think of any. All ISA I have meet so far do worst in handling a test for ONE.

Or in a more generic form: is there any ISA treating any discrete value, i.e. a single value, not a range, other then zero, privileged over zero?

(To keep wording simple I'll continue asking about a value of one, but any other than zero will do it)

To give a clear benchmark: The test for ONE must measured against the test for ZERO on these criteria with lower numbers being better:

number clock cycles
required memory footprint
number of instructions

A solution is considered better if it has a lower value on any of these criteria, while not being higher on any of the other. Mixed cases (one higher one lower) might be as well valid after a closer look.

Note, this is about a test of the values of ONE or ZERO of a byte or larger memory unit (like a word), as used for return values. This is not about testing single bit tests (within a byte or word). Of course methods that test for a value of one using on bit test instruction are quite fine. Also, the test must be non-destructive.

So, is there any ISA where testing for a value other than zero performs better than testing for zero?

Bonus: An ISA where testing for any other discrete values is as privileged as testing for zero would already be interesting.

(Note: it still must be privileged over generic compare)

Examples

For example on a 6502 (or 68xx) testing a byte for a one value can be done by loading the value to be tested and compared :

 LDA  <value>
 CMP  #1
 BEQ  was_one

Or, if no longer needed in A decrementing will save a byte (only 65C02)

 LDA  <value>
 DEA
 BEQ  was_one

In contrast, testing for zero is done by just loading the byte in question:

 LDA  <value>
 BEQ  was_zero

This is possible as a load instruction sets the zero flag accordingly. Thus testing for zero will always be one instruction less, at least one byte shorter and two clocks faster than testing for one.

Of course this is a benefit of architectures that set flags during load. Architectures that set their flags/condition codes only due test instruction will usually need more instructions, but still often prefer tests for zero. For example an 8080 needs with

 LDA  <value>
 ORA  A
 JZ   was_zero

6 bytes and 27 cycles, while testing for one will be as well 3 instructions, but 7 bytes and 30 cycles

 LDA  <value>
 CPI  1
 JZ   was_one

Two address architectures like x86 or /370 will usually perform equal for both cases, as it comes down ot a simple compare - unless the input is already in a register, where a test instruction again prefers test for zero.

P.S.: I'm confident RC.SE patrons will soon find all possible loopholes :))

Or, similarly, a test for -1 would have been just as useful for a convention where -1 indicated ... something (either success or error as a status code, or a downward-counting loop that included 0 ...) — davidbak, Aug 08 '22 at 01:33
Actually, testing for 1 might be fast on a condition-code ISA if you have a spare register to burn (for some ISAs) or even better if the value being tested is dead after the test. Just subtract 1 and then branch on Z. Though those ISA probably already have a "free" test against zero as the side effect of loading it in a register or computing it in the first place. — davidbak, Aug 08 '22 at 01:48
Many processors include the built-in hardware logic required for ORing (or ANDing, your pleasure) all of the bits for an operation (which can be an ALU OP or can be a LOAD which involves the same hardware logic.) That can readily work with all '0' or all '1' bits in a word. Some procs have constant generators built-in. But the ones I know of often have more than 0 and 1 as constants and none I know of have only 1 but then not 0 as pre-defined generated constants. If 1 is special-cased then so is 0. May be hard to find one where 1 is special and 0 isn't. — jonk, Aug 08 '22 at 04:26
Given that test for equality is almost always implemented as doing an ALU operation and then testing for zero, I'd be really surprised if there was an ISA that preferred it. There are ISAs where the convention "use the carry flag to return some status" also makes sense, and for those it doesn't matter which way you do it (and it's only one bit). — dirkt, Aug 08 '22 at 06:06
@davidbak case of 'burning' due 'subtract one' is already shown in the second example, except it's still outperformed in all criteria by a test for zero. Point is not about programming tricks to improve a test for one, but an ISA. Sure, a test for -1 would be as fine. In fact, I think this can be generalized by asking for any discrete value other than 0 being privileged over 0 (or at least as privileged). I'll change the wording. — Raffzahn, Aug 08 '22 at 09:53
@jonk An ISA preferring any discrete constant (or small number of such) over zero would be an interesting find to me. Heck, it might be even interesting if it's as preferred as zero - as long as it is about one or a few values, not a simple compare with the whole range of that byte/word. So gimme your constant generator :)) — Raffzahn, Aug 08 '22 at 10:02
@dirkt Jup, that's the whole assumption behind that question. Zero is naturally favoured against all other values - not just in computing. Also, the question is not about return values, but only about testing a value. Other concepts of return codes are of course there (even more elaborate like returning to different addresses according to result), but not relevant to this. — Raffzahn, Aug 08 '22 at 10:06
Testing for zero has *many* more uses than just "is this number zero". (1) Is this unsigned integer positive? (2) Is this 2s-complement signed integer positive (also uses the sign bit (MSB))? (3) Are these numbers (un)equal? (4) Other integer comparisons (although many can be re-done as a check of the sign bit). (5) Certain loops terminate at zero. (6) Is this pointer null? (7) Is this the last character of the string? (8) Terminating sentinel values in arrays. Most of these uses would not work with a one-based comparison. I'm skeptical than an architecture would eliminate == 0. — DrSheldon, Aug 08 '22 at 15:01
Unlikely - A test for zero (or non-zero, for that matter) in a CPU is typically combinatory logic, while a test for a certain value is stateful logic, which is much more expensive. — tofro, Aug 09 '22 at 14:49

A. I. Breveleri · Answer 1 · 2022-08-10T16:02:40.577

12

ISZ on the PDP-8.

In the special case where the flag will only ever be tested once (because the test trashes the flag), the PDP-8 "ISZ" instruction distinguishes the value -1 from all other values.

This allows the construction of a 2-word (24 bit) conditional branch, the shortest possible on this ISA.

ISZ     flag
JMP     notset
...     / it was set (to -1)

Also, this sequence does not perturb any registers, which is very useful as there is only one of those.

Testing for zero on the same machine requires 3 words,

TAD     flag
SZA
JMP     notset
...     / it was set (to 0)

and trashes all the registers.

edited Aug 10 '22 at 16:02

answered Aug 09 '22 at 16:46

A. I. Breveleri

1,479
9
14

But note that the PDP-8 (and the PDP-11 in the other answer) do follow the general "zero is special because it can be tested with SZA etc." pattern. And if you'd try to make a non-destructive test with ISZ, it would probably be much worse than 3 words. Also, in case of a return value (which is the situation where the question originated), the value might very well be already in the accumulator. – dirkt Aug 10 '22 at 04:56
I expected that (if at all) one of the primitive ones to come around, but hadn't the PDP-8 in focus. Yes, this works much like the use of BCT(R) in a /360 or DJNZ for 8051. And likewise it's faster than any other test. – Raffzahn Aug 10 '22 at 07:20
@dirkt: Nondestructive use of the -1==true flag requires four words, but you have to put the restore code where the execution flow joins at the end of the segments. I only ever used this once, when I absolutely had to preserve the AC through the test. – A. I. Breveleri Aug 10 '22 at 16:02

score 7 · Answer 2 · answered Aug 08 '22 at 21:39

In the General Instruments PIC architecture families that debuted in the 1970s, and were adapted into the Microchip architecture families of the same name, there is no instruction to directly test whether the entire byte at an address is zero, but there are instructions to directly test whether bit 0 of a byte is set, whether bit 1 of a byte is set, etc. up through bit 7. To test whether an entire byte is zero, one must read the entire byte and then test whether bit 2 of the status register is set (it will be set if the byte was zero).

On the 6502, one can quickly test for whether a byte is zero by loading it into a register if one doesn't mind trashing the value of the register in question. On the other hand, it's possible to simultaneously test the state of both of the upper two bits of a byte in memory, and branch on them separately, without affecting the state of any registers. Thus, storing a pair of flags in the top two bits of a byte may allow code that would generally need to inspect both of them whenever inspecting either to be more efficient than would be possible if the bits were stored in two separate bytes.

On the PIC you can test a 'file register' (memory location) for a value of 1 and conditionally skip the next instruction with DECFSZ f,w, which decrements it and puts the result in w (working register) while leaving the file register untouched. This is 2 instructions shorter and 2 cycles less than the 'conventional' method ie. MOVF f,w : SUBLW k : SKPZ. — Bruce Abbott, Aug 09 '22 at 01:26
@BruceAbbott: I've used PICs for many years, but not really thought of doing that. Probably because btfss/btfsc work so well. As for incfsz,w the main place I used that is in the sequence lda srcLo / addwf destLo,f / movf srcHi,w / btfsc STATUS,C / incfsw srcHi,w / addwf destHi,f which is a brilliant cascadable way of doing an add-with-carry operation. — supercat, Aug 09 '22 at 14:52

score 3 · Answer 3 · answered Aug 09 '22 at 22:34

3

A little contrived, but:

The PDP-11 (in all but the earliest models) has the SOB instruction:

SOB Rn, LABEL

The register is decremented, and if the result is non-zero, a branch is made to LABEL. Thus you can code a (destructive) test for Rn == 1 in a single instruction.

Any similar test for zero, unless you already have the condition codes set from the previous instruction, requires at least two instructions:

 TST  Rn
 BEQ  LABEL

I grant you that the advantage is lost as soon as you're talking about a value in memory.

answered Aug 09 '22 at 22:34

dave

35,301
3
80
160

Much like the PDP-8 - see my comment there. The single instruction nature makes it naturally unbeatable - if destructive a test is ok. – Raffzahn Aug 10 '22 at 07:22
You can. do the same thing on the 68k with the DBF instruction to test for zero and branch in one, which can save some cycles – tofro Aug 10 '22 at 17:32

dirkt · Answer 4 · 2022-08-10T08:18:17.510

The Burroughs B5000 has no way to test a complete word directly against a value, be it zero or one. Looking at the instructions (section 5), the only conditional branch operations are BFC (branch forward conditional) and BBC (branch backward conditional), which both test the low-order bit of register B (the next to top-of-stack register) and branch if this bit is zero.

So while you can directly branch on "boolean" values (where only the low-order bit counts), for everything else you need to load a literal and execute a comparison.

So as far as testing for complete values is concerned, zero performs equally well as any other value, and the particular case if taking the branch the low-order bit is clear performs better.

So that at least sort of matches the question in the sense that testing for particular values is "as privileged" as testing for a complete zero.

Interesting point. Maybe two remarks: The B5000 is special in the way that not the branch instruction made to test for flags set prior, but the compare instruction distilling this into a boolean value. All to the advantage that only one bit is needed to branch afterwards - or use the value as a result for further logic operations. I always though of it as an interesting way to improve pipelining. Second: the 'as privileged as' part doesn't fit, as neither is privileged - it's the basic situation were everything has to go thru a compare instruction first. Bonus for almost finding a loophole :)) — Raffzahn, Aug 10 '22 at 07:33

score 1 · Answer 5 · answered Aug 10 '22 at 20:20

I dimly recall the i960 setting aside a few bits in certain instructions so that the opcode could include a small number, possibly 4 bits signed IIRC so that values in the range -8 to +7 could be handled without the need for a second fetch cycle. That wouldn’t make any value faster than testing for zero but would make the performance equal. The engineers at Intel had realised that the majority of numbers handled at machine code level were quite small and so this feature could significantly boost performance in a fairly wide range of situations, well worth setting aside some bits of the RISC instruction set for (who needs 4 billion instruction types anyway).

Is there a CPU ISA preferring a test for the value of one over testing for zero?

Examples

5 Answers5