What is the purpose of the Parity Flag on a CPU?

Question

Some CPUs (notably x86 CPUs) feature a parity flag on their status register. This flag indicates whether the number of set bits in the result of an operation is odd or even.

What actual practical purpose does the parity flag serve in a programming context?

Side note: I'm presuming it's intended to be used in conjunction with a parity bit in order to perform basic error checking, but such a task seems too uncommon to warrant an entire CPU flag.

1970s hardware, like paper tape punches and serial ports, those old bits fell over much easier :) Thumbwheels and nixie tubes begat the BCD instructions, like AAA. — Hans Passant, Sep 07 '14 at 06:54
@HansPassant BCD I understand keeping, 7-segs and nixies are still used by hobbyists (and maybe cheapskates or dot-matrix hating madmen). — Pharap, Sep 07 '14 at 07:02
Bad news for the hobbyists I'm afraid, they were actually dropped in x64 to make room for 64-bit instructions. — Hans Passant, Sep 07 '14 at 07:19

score 28 · Accepted Answer · edited Sep 07 '14 at 05:52

Back in the "old days" when performance was always a concern, it made more sense. It was used in communication to verify integrity (do error checking) and a substantial portion of communication was serial, which makes more use of parity than parallel communications. In any case, it was trivial for the CPU to compute it using just 8 XOR gates, but otherwise was rather hard to compute without CPU support. Without hardware support it took an actual loop (possibly unrolled) or a lookup table, both of which were very time consuming, so the benefits outweighed the costs. Now though, it is more like a vestige.
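To make the "loop or lookup table" cost concrete, here is a minimal C sketch of both software approaches. The function names `parity_loop` and `parity_table` and the 16-entry table are illustrative assumptions, not code from the answer; a period implementation would have been hand-written assembly.

```c
#include <stdint.h>

/* Bit-by-bit loop: returns 1 if `byte` has an odd number of 1 bits, else 0. */
unsigned parity_loop(uint8_t byte)
{
    unsigned odd = 0;
    while (byte != 0) {
        odd ^= byte & 1u;   /* toggle for every set bit */
        byte >>= 1;
    }
    return odd;
}

/* Lookup table: entry n is 1 if the 4-bit value n has an odd number of 1 bits. */
static const uint8_t NIBBLE_PARITY[16] = {
    0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0
};

/* Fold the high nibble into the low nibble, then do a single table lookup. */
unsigned parity_table(uint8_t byte)
{
    byte ^= byte >> 4;
    return NIBBLE_PARITY[byte & 0x0Fu];
}
```

Either version costs several instructions per byte, whereas the hardware flag is produced as a side effect of an instruction the program usually executes anyway.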

edited Sep 07 '14 at 05:52

Pharap

answered Sep 07 '14 at 05:01

Dwayne Towell

3

I think you should add the reason why Intel x86 (and clones) processors today still have it - Intel has tried to retain backwards compatibility (as much as possible) with each earlier chip. So since the root of the line comes from the 8086 in 1978, each generation of chip retained that functionality across 35 years. – Michael Petch Sep 07 '14 at 05:16
2

The weirdest thing here is that Intel initially designed this flag to be computed after _each_ operation which modifies condition codes. A less strange approach is to provide a separate instruction which checks parity and modifies a selected CC flag (e.g. CF), but the 8080 wasn't designed this way, and now even 64-bit operations set parity in the same manner. This combination of one-step forecasting and keeping old legacy is the most horrible Intel feature. – Netch Sep 07 '14 at 05:19
@Netch Such is the price of backwards compatibility. Personally I'm ok with tearing something up and replacing it if the new method is superior, but I think it's probably a little late in the game to change this one. – Pharap Sep 07 '14 at 06:04
@Pharap there is NO such price. Both the change to the 32-bit ISA and the change to the 64-bit one could have been the place to remove all bits of weird old crap. The reason it was never done is the brain damaged policy of "compatibility" when it could easily be ignored without any real problem. – Netch Sep 07 '14 at 06:56
3

32-bit support didn't really change the ISA so much as extend it to 32-bits. It would've meant special casing 32-bit operations to not generate the parity bit, for no actual gain. The hardware for parity generation would still be there, there would just now be additional hardware to conditionally not generate it. And remember, back in 1985 when the 80386 was introduced, parity and the cost of calculating it were still relevant. As for the 64-bit ISA, you still have the issue of additional silicon space being used to disable a feature that still needs to be implemented for backwards compatibility. – Ross Ridge Sep 07 '14 at 15:45
3

And if you want an ISA that didn't follow the "brain damaged policy of 'compatibility'" Intel called it Itanium. – Ross Ridge Sep 07 '14 at 15:49
@RossRidge when I want such an ISA, I don't look for it at Intel ;( – Netch Sep 08 '14 at 06:49
tl:dr; there is no use of this flag for a programmer, except for code obfuscation? – sivizius Sep 14 '17 at 16:52
@sivizius yes. It's used for swapping bits or [comparison in x87](https://stackoverflow.com/questions/25707130/what-is-the-purpose-of-the-parity-flag-on-a-cpu/25707223#comment79463092_43433515) – phuclv Sep 27 '20 at 00:56

Alex Yursha · Answer 2 · 2017-09-15T17:29:23.800

17

The Parity Flag is a relic from the old days to do parity checking in software.

TL;DR

What is parity

As Randall Hyde put it in The Art of Assembly Language, 2nd Edition:

Parity is a very simple error-detection scheme originally employed by telegraphs and other serial communication protocols. The idea was to count number of set bits in a character and include an extra bit in the transmission to indicate whether that character contained an even or odd number of set bits. The receiving end of the transmission would also count the bits and verify that the extra "parity" bit indicated a successful transmission.

Why Parity Flag was added to CPU architecture

In the old days there was serial communication hardware (UART) that lacked the ability to do parity checking on transmitted data, so programmers had to do it in software. Also, some really old devices, like paper tape punches and readers, used 7 data bits and a parity bit, and programmers had to do the parity checking in software to verify data integrity. To use a parity bit for error detection, the communicating parties had to agree in advance on whether every transmitted byte should have odd or even parity (as part of the communication protocol).

The primary methods to do parity checking in software without CPU support are bit counting or using a lookup table. Both are very expensive compared to having a Parity Flag in a CPU computed by a single instruction. For that reason, in April 1972 Intel introduced the Parity Flag into their 8008 8-bit CPU. Here is an example of how each byte could then be tested for integrity on the receiving end.

mov        al,<byte to be tested>
test       al,al
jp         <somewhere>         ; byte has even parity
                               ; byte has odd parity

Then a program could perform all sorts of conditional logic based on the value of the Parity Flag.
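For readers who prefer C, the 7-data-bits-plus-parity scheme described above can be sketched as follows. This is only an illustration under stated assumptions: the parity bit is placed in bit 7, both sides have agreed on even parity, and the names `odd_parity`, `add_even_parity` and `check_even_parity` are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns 1 if `b` has an odd number of 1 bits, else 0 (XOR folding). */
static unsigned odd_parity(uint8_t b)
{
    b ^= b >> 4;
    b ^= b >> 2;
    b ^= b >> 1;
    return b & 1u;
}

/* Sender: store a parity bit in bit 7 so the whole byte has even parity. */
uint8_t add_even_parity(uint8_t data7)
{
    data7 &= 0x7Fu;                                   /* keep the 7 data bits */
    return (uint8_t)(data7 | (odd_parity(data7) << 7));
}

/* Receiver: accept the byte only if its total number of 1 bits is even. */
bool check_even_parity(uint8_t received)
{
    return odd_parity(received) == 0;
}
```

A single flipped bit makes the count odd and is detected; two flipped bits cancel out and go unnoticed, which is the known limit of a single parity bit.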

Evolution of conditional parity instructions in Intel CPUs

1972 - the Parity Flag is first introduced with Intel 8008. There are conditional instructions for jumps (JPO, JPE), calls (CPO, CPE) and returns (RPO, RPE).
1978 - Intel 8086 drops everything except for conditional jumps (JNP/JPO, JP/JPE).
1985 - Conditional set instructions SETPE/SETP and SETPO/SETNP are added with Intel 80386.
1995 - Conditional move instructions CMOVP/CMOVPE, CMOVNP/CMOVPO are added with Pentium Pro.

This set of instructions which make use of the Parity Flag has remained fixed since then.

Nowadays the primary purpose of this flag has been taken over by hardware. To quote Randall Hyde in The Art of Assembly Language, 2nd Edition:

Serial communications chips and other communications hardware that use parity for error checking normally compute the parity in hardware; you don't have to use software for this purpose.

The antiquity of the Parity Flag is proved by the fact that it works on the low 8 bits only, so it's of limited use. According to the Intel® 64 and IA-32 Architectures Software Developer Manuals, the Parity Flag is:

Set if the least-significant byte of the result contains an even number of 1 bits; cleared otherwise.
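As a rough model of that definition, the following C sketch computes what PF would be for a given result value. The name `model_pf` is an assumption for illustration; real code cannot read the flag from portable C and would use `jp`/`setp` in assembly instead.

```c
#include <stdbool.h>
#include <stdint.h>

/* Models PF: true when the least-significant byte of `result`
   contains an even number of 1 bits. Upper bits are ignored. */
bool model_pf(uint64_t result)
{
    uint8_t low = (uint8_t)result;   /* only the low 8 bits matter */
    low ^= low >> 4;
    low ^= low >> 2;
    low ^= low >> 1;
    return (low & 1u) == 0;          /* even count -> flag set */
}
```

Note the polarity: the flag is set for an even count, so a result of zero sets PF, and bits above bit 7 never affect it.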

Interesting fact: By his own account, networking engineer Wolfgang Kern at some point scanned all the code he had written (~14 GB) for JPE and JPO instructions and found them only in an RS232 driver module and in a very old 8-bit calculation.

Sources

The Intel 8008 support page
Intel 8080 Assembly Programming Manual
Complete 8086 Instruction Set
Wikipedia x86 instruction listings
Intel® 64 and IA-32 Architectures Software Developer Manuals
The Art of Assembly Language, 2nd Edition by Randall Hyde

edited Sep 15 '17 at 17:29

answered Apr 16 '17 at 03:52

Alex Yursha

1

*80386 instead of 80383 – sivizius Sep 14 '17 at 16:49
2

@sivizius: these days the main use for x86's PF is in floating-point code, because FP compares set PF when the result is unordered (one or both operands are NaN). This is for historical x87 reasons: `fucom st1` / `fnstsw ax` / `sahf` ends up putting `c2` from the FP status word into PF, and later instructions like `fucomi` and SSE `ucomiss` put the compare result into integer flags directly with the same mapping. (See http://www.ray.masmcode.com/tutorial/fpuchap7.htm). Here's a real example of x86-64 code-gen by gcc7.2 using `JP`: https://godbolt.org/g/hHRCzv – Peter Cordes Sep 16 '17 at 02:09
Obviously in this case it's not actually a parity bit like the question is asking about, but you were commenting on the other answer. (See also [the `fucomi`](http://felixcloutier.com/x86/FCOMI:FCOMIP:%20FUCOMI:FUCOMIP.html) instruction-set manual entry for a table of how it sets flags.) – Peter Cordes Sep 16 '17 at 02:11
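A small C sketch of the kind of source code this applies to: the standard `isunordered` macro from `<math.h>` reports whether either operand is NaN, and on x86-64 compilers typically implement such a check with `ucomisd` followed by a test of PF. The function name `is_ordered` is an assumption for illustration, and the exact instructions emitted depend on the compiler and flags.

```c
#include <math.h>
#include <stdbool.h>

/* True when neither operand is NaN, i.e. the comparison is "ordered".
   The unordered case is the one that FP compares signal through PF. */
bool is_ordered(double a, double b)
{
    return !isunordered(a, b);
}
```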
1

@AlexYursha: to get the parity of a 64-bit integer using PF for the low 8: `x ^= (x>>32);` `x^=(x>>16);` `xor al, ah`, then PF is set according to the parity of the whole thing. PF saves you another three steps of shift/xor. (And without BMI2 `rorx` to copy+shift, `x^= x>>16` takes a MOV / SHR / XOR.) – Peter Cordes Sep 16 '17 at 02:18
It's better with AVX, where you can `vpsrlq ymm1, ymm0, 32` / `vpxor ymm0, ymm0, ymm1`, so only 2 instructions to narrow by two inside each 64-bit element. To get the parity of a very long bitstring, of course you `vpxor` in 256b chunks until you have one vector at the end to horizontal xor. (And then you'd use shuffle/vpxor to get down to one 8-bit element, starting like you would for a horizontal sum: https://stackoverflow.com/questions/6996764/fastest-way-to-do-horizontal-float-vector-sum-on-x86). Anyway, PF just saves 3 steps at the end, and is useful in any rare case you want parity. – Peter Cordes Sep 16 '17 at 02:20
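The folding idea from these comments can be sketched in portable C. Since C cannot read PF, the hypothetical helper `low_byte_odd` stands in for the final hardware step; `parity64_fold` and `parity64_reference` are likewise illustrative names, and the reference version exists only to cross-check the result.

```c
#include <stdint.h>

/* Stand-in for the hardware step: 1 if the low byte of x has an odd
   number of 1 bits. On x86 this is where PF would be read instead. */
static unsigned low_byte_odd(uint64_t x)
{
    uint8_t b = (uint8_t)x;
    b ^= b >> 4;
    b ^= b >> 2;
    b ^= b >> 1;
    return b & 1u;
}

/* Parity of all 64 bits: XOR the halves together until 8 bits remain. */
unsigned parity64_fold(uint64_t x)
{
    x ^= x >> 32;
    x ^= x >> 16;
    x ^= x >> 8;            /* plays the role of `xor al, ah` */
    return low_byte_odd(x);
}

/* Reference: toggle once per set bit by clearing the lowest set bit. */
unsigned parity64_reference(uint64_t x)
{
    unsigned odd = 0;
    while (x != 0) {
        odd ^= 1u;
        x &= x - 1;
    }
    return odd;
}
```

XOR preserves parity, so each fold halves the width without changing the answer; the flag only replaces the last three shift/XOR steps.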
@PeterCordes There is no `jp` in the code at this link you posted. – sivizius Sep 18 '17 at 14:38
@sivizius: in https://godbolt.org/g/hHRCzv? Look at line 8 of the asm output: `JP .L7`. gcc does a very poor job of CSEing, sorry I could have picked a better example with fewer other branches. This (https://godbolt.org/g/sY5cbo) is simpler: jump if Unordered (`jp`), then jump if not equal (`jne`). – Peter Cordes Sep 18 '17 at 14:44
NoScript prevented it from loading because it does not like potential JS code in the URL, but now I see it, thx. – sivizius Sep 18 '17 at 14:51
1

@Alex: oops, I just realized that PF isn't part of the optimal solution for parity of wider registers on CPUs with `popcnt`. As [Cody Gray points out](https://stackoverflow.com/a/43929095/224132), `popcnt rax, rax` / `and eax, 1` gives you the parity of a 64-bit register. No need to narrow down to 8 bit and `setp`. – Peter Cordes Sep 19 '17 at 06:01
1

@Peter Cordes: Your link on `pushf` is wrong. It is an 8086-level instruction, not 186. This error was apparently fixed some time between your link's revision and the one which I extracted from NASM 2.05: https://ulukai.org/ecm/doc/insref.htm#insPUSHF – ecm Apr 01 '21 at 19:27
1

@ecm: ok that makes sense, `pushf` is a pretty essential instruction (the only way to get at some of the bits in FLAGS except for making an exception push them) so it would have been really weird for 8086 not to have it. – Peter Cordes Apr 01 '21 at 19:32
1

The useful part of that earlier comment was: `lahf` loads AH from the low byte of FLAGS ([including PF as bit #2](https://en.wikipedia.org/wiki/FLAGS_register)), so you can build a `setp` out of that (with some shift/AND) without a branch on CPUs before 386. Although it's probably faster on 8086 to just branch, especially without 186 `shr ah, 2`, unless you're ok with `lahf` / `and ah, 1<<2` to get a 0 or 4 instead of 0 or 1. And of course `pushf` can push the whole (E)FLAGS. So you can read PF in ways other than `jp` / `jnp` even on 8086. – Peter Cordes Apr 01 '21 at 19:35

score 4 · Answer 3 · answered Sep 16 '17 at 04:47

There's one practical micro-optimization achievable with parity: bit swapping, as used e.g. in Fourier transform address generation with the butterfly kernel.

To swap bits 7 and 0, one can exploit parity of (a&0x81) followed by conditional (a^=0x81). Repeat for bits 6/1, 5/2 and 4/3.
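A minimal C sketch of this trick, assuming a software `odd_parity` helper in place of the hardware flag (in assembly the test would be an `and` or `test` followed by `jp`/`jnp`). The names `odd_parity` and `reverse_bits8` are illustrative. The two bits of a pair differ exactly when the masked value has odd parity, and in that case flipping both of them swaps them.

```c
#include <stdint.h>

/* Returns 1 if `b` has an odd number of 1 bits, else 0. */
static unsigned odd_parity(uint8_t b)
{
    b ^= b >> 4;
    b ^= b >> 2;
    b ^= b >> 1;
    return b & 1u;
}

/* Reverse the bit order of a byte by swapping bits 7/0, 6/1, 5/2, 4/3. */
uint8_t reverse_bits8(uint8_t a)
{
    static const uint8_t masks[4] = { 0x81, 0x42, 0x24, 0x18 };
    for (int i = 0; i < 4; i++) {
        /* Odd parity of the masked pair means the two bits differ,
           so toggling both exchanges them. */
        if (odd_parity(a & masks[i])) {
            a ^= masks[i];
        }
    }
    return a;
}
```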

score 2 · Answer 4 · answered Apr 01 '21 at 15:56

Personally, I think that rumours of the parity flag's death have been greatly exaggerated. It can be extremely useful in certain circumstances. Consider the following assembly-language procedure:

push       rbp
mov        rbp, rsp
xor        eax, eax
ucomisd    xmm0, xmm1
setnp      al
pop        rbp
ret

This takes two double-precision arguments in xmm0, xmm1, and returns a boolean result. See if you can figure out what it's doing.

answered Apr 01 '21 at 15:56

Dave Jewell

2

Indeed, that's the main use of PF in modern code, but it's not actually parity of anything. (Spoiler alert for what it does: [comments on a previous answer](https://stackoverflow.com/questions/25707130/what-is-the-purpose-of-the-parity-flag-on-a-cpu/66907854#comment79463092_43433515) mention this use-case.) – Peter Cordes Apr 01 '21 at 16:44
