19

The Motorola 68000 has 16 (somewhat) general-purpose registers of 32 bits each, a generous complement by the standards of its day. I would expect these to take a significant fraction of the die area. (If they didn't, there would be little reason for competing microprocessors like the 8086 to fail to provide something similar.)

Back-of-the-envelope calculation: 16 × 32 × 6 (static memory takes six transistors per bit) = 3072 transistors. The 68k is reckoned to have 40k transistors if you don't count microcode, almost 70k if you do. So the memory cells for the registers should take somewhere around 5% of the die; maybe it's closer to 10% if you also take the access circuitry into account?
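The estimate above can be reproduced with a minimal sketch. The six-transistor cell and the 40k/70k totals are the figures quoted in the question; the function names are made up for illustration, and transistor share is only a rough proxy for die area.

```python
# Back-of-the-envelope register-file estimate, using the question's figures.
REGISTERS = 16
BITS_PER_REGISTER = 32
TRANSISTORS_PER_BIT = 6  # assumption from the question: six-transistor static cell


def register_cell_transistors(registers=REGISTERS,
                              bits=BITS_PER_REGISTER,
                              per_bit=TRANSISTORS_PER_BIT):
    """Transistors spent on the storage cells alone (no access circuitry)."""
    return registers * bits * per_bit


def share_percent(part, total):
    """Share of `part` in `total`, as a percentage."""
    return 100.0 * part / total


cells = register_cell_transistors()
for label, total in (("without microcode", 40_000), ("with microcode", 70_000)):
    print(f"{label}: {cells} / {total} = {share_percent(cells, total):.1f}%")
```

This gives about 7.7% of the 40k count and about 4.4% of the 70k count, which brackets the "around 5%" guess.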

That is surprisingly small, sufficiently so that it seems like 32 registers could've been provided. (The ARM-1 did exactly that, somewhat later but with similar process technology.) Maybe the instruction encoding space would have been considered a problem. Or maybe the designers were considering use of compiled languages, and noting contemporary compilers were not good at using lots of registers.

Trying to find an annotated die photo, I found this: http://www.easy68k.com/paulrsm/doc/dpbm68k2.htm

Which... doesn't mention the registers at all, nor leave any unaccounted space where they could be.

Is there an error in the annotation? Or are the registers part of one of the marked units?

rwallace
  • 2
    have a look yourselves here: http://www.visual6502.org/images/68000/Motorola_68000_die_20x_1a_top_10000w.png – tofro Jul 06 '23 at 17:45
  • @tofro That is an impressively detailed photo, but it is not annotated. Where on it are the registers? – rwallace Jul 06 '23 at 17:52
  • 2
    "That is surprisingly small, sufficiently so that it seems like 32 registers could've been provided." Maybe could have transistor-wise, though instruction set encoding was already pretty clever, separating address and data register sets, so able to use only 3 bits to identify one of the 16 registers. – Erik Eidt Jul 06 '23 at 19:06
  • 4
    It wasn't so much the number of registers the 68k offered (admittedly generous, but the Z80 and 8086 aren't far off with 14), but rather the fact that they were truly (or nearly truly, considering the distinction between Ax and Dx) general-purpose. Gone were the days where you always ended up with the value in the "wrong" register for the next operation, gone were the days where you shuffled around values to free up a loop counter. That fact was also a relief for compiler writers, whose register allocators could now become almost trivial. – tofro Jul 07 '23 at 06:33
  • "The ARM-1 did exactly that" - didn't it only have 25 thirty-two bit registers, rather than a full 32? – psmears Jul 07 '23 at 14:27
  • 3
    The 8086 did not fail to provide sixteen general purpose registers. It provided as many registers as the system architects thought it needed when they looked at the big picture. Maybe you didn't enjoy writing 8086 assembly code as much as 68000 assembly code (I certainly didn't!!) but then again, we all eventually figured out not to write any assembly code except in very rare cases. (I wrote a bare-metal embedded system for a 68000, and I remember bragging to my peers about how there were only 17 lines of assembly source in the whole thing.) – Solomon Slow Jul 07 '23 at 15:44
  • 3
    @SolomonSlow Architect's vision and bigger picture are the key words. The 8086 is, much like the 6502, designed to excel at certain uses. If one is not able to take the same PoV, unhappiness is guaranteed. It's like having a pet with personality. In contrast, the 68k follows a rather bland recipe. Not much to dislike, but not much to like either - quite like Mickey D's. – Raffzahn Jul 07 '23 at 16:42
  • @Raffzahn I disagree (wholeheartedly) that there's not much to like about the m68k. That's how our mileage varies.... – tofro Jul 07 '23 at 22:40
  • 3
    @tofro Read close, it never said being mediocre can't be appealing :)) – Raffzahn Jul 07 '23 at 23:00
  • 5
    Hindsight is 20/20. We can look back today and categorically say that the 8086 was a questionable design for what it ultimately ended up being used for, but at the time it was a reasonable design for its intended purpose. Purpose-specific registers were relatively normal, it was designed to compete with comparable chips recently released by NatSemi, Motorola, and Zilog while remaining assembly-compatible with the 8008, 8080, and 8085, and it was supposed to be a stop-gap (Intel was betting on the success of the iAPX 432, which obviously did not pan out). – Austin Hemmelgarn Jul 08 '23 at 02:26
  • @SolomonSlow True enough, the only time I really got to use all the registers on the 68k was when I wrote assembly; the Amiga C compilers were hopeless at register allocation. Ironically, Turbo C was better at it, despite having less to work with. – rwallace Jul 08 '23 at 13:18

3 Answers

24

The Motorola 68000 has 16 (somewhat) general-purpose registers of 32 bits each

Well, not really; the 68k ISA does not feature a single set of 16 General Purpose Registers (GPRs) but two sets of 8 specialized registers - each handled and numbered on its own.

That is surprisingly small, sufficiently so that it seems like 32 registers could've been provided.

Sure. Chip space wouldn't increase much - even 64 registers wouldn't make it bloaty.

But the chip space needed is only a secondary consideration, if not lower, when designing a new CPU. Long before anyone tries to design the CPU circuitry, the desired instruction set will be drawn up, as it defines what is to be done and therefore what is needed in the first place. Hence people usually talk about an Instruction Set Architecture (ISA) when it comes to design, not CPU gate knitting.

When designing an ISA with a general purpose register file, the size of that file depends above all on how many bits within an instruction can be spent on selecting a register. In the case of the 68000 there was room for a single 3-bit field to mark which register is to be used. Which set is used is defined by the (3-bit) mode field (*1) and/or the instruction itself.

Having 16 GPRs instead of 8+8, without giving up any other functionality, would mean the opcodes would grow from 16 to 18 bits, since each of the two register fields needs one more bit; with 32 GPRs it's 20 bits.

This is why the 68k has not one 16-register file but two files of 8 registers each. Instruction bits are a more important resource than any chip area.
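The 16/18/20-bit figures can be checked with a small sketch. It assumes, as in MOVE or ADD, that an opcode word carries two register fields and that everything else in the word stays unchanged; the names are illustrative only.

```python
BASE_OPCODE_BITS = 16      # width of a 68000 opcode word
BASE_REG_FIELD_BITS = 3    # each register field selects one of 8 registers
REG_FIELDS_PER_OPCODE = 2  # assumption: two register fields, as in MOVE or ADD


def reg_field_bits(register_count):
    """Bits needed to select one of `register_count` registers."""
    return (register_count - 1).bit_length()


def opcode_bits(register_count):
    """Opcode width if every register field addressed `register_count` registers."""
    extra_per_field = reg_field_bits(register_count) - BASE_REG_FIELD_BITS
    return BASE_OPCODE_BITS + REG_FIELDS_PER_OPCODE * extra_per_field


for count in (8, 16, 32):
    print(f"{count} registers per field -> {opcode_bits(count)}-bit opcode")
```

With 8 registers per field the word stays at 16 bits; a flat file of 16 gives 18 bits and a flat file of 32 gives 20 bits.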


*1 - Mode 0 -> Data Register ; 1 -> Address Register; 2..6 -> Addressing modes; 7 -> Special Addressing Modes
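As a rough illustration of the footnote, the sketch below splits the low six bits of an opcode word into the 3-bit mode and 3-bit register fields and names the result. The function names and output strings are invented for this example; modes 2 to 6 are not broken down further.

```python
def decode_ea(opcode):
    """Split the low six bits of an opcode word into (mode, register)."""
    mode = (opcode >> 3) & 0b111
    reg = opcode & 0b111
    return mode, reg


def describe_ea(opcode):
    """Name the operand selected by the mode/register pair."""
    mode, reg = decode_ea(opcode)
    if mode == 0:
        return f"D{reg}"   # data register direct
    if mode == 1:
        return f"A{reg}"   # address register direct
    if mode <= 6:
        return f"addressing mode {mode} via A{reg}"  # indirect forms, all based on An
    return f"special addressing mode, selector {reg}"  # mode 7: register field picks the variant


print(describe_ea(0b000_101))  # D5
print(describe_ea(0b001_011))  # A3
print(describe_ea(0b010_010))  # addressing mode 2 via A2
```

The same 3-bit register number therefore means D5 or A5 depending only on the mode field, which is how two sets of eight fit behind one 3-bit selector.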

Raffzahn
  • Trivia addition: almost all of the instructions that can operate on either data or address registers can be decoded as if register were a contiguous 4-bit field with bit 3 set implying it’s an address operation. So, sort of, Motorola did encode as if they were just 16 registers, kind of. But not really because the operation often differs slightly depending on which group it’s using. Though I thought that the fact that it’s a contiguous four bits was interesting. – Tommy Jul 07 '23 at 13:17
  • @Tommy It's not quite that simple. The indirect modes are only available on A registers. You'd still need three mode bits even though mode 1 would not be necessary, so that's seven bits per operand. That said, I think it could be done with some tinkering with the instructions e.g. the arithmetic instructions seem to be duplicated for address and data destinations. – JeremyP Jul 07 '23 at 13:51
  • 1
    @JeremyP Well, that's true for the entire CPU. With a little more thinking, a lot could have been improved in the 68k. But it's a typical Motorola design: straight out of the book. – Raffzahn Jul 07 '23 at 15:26
  • @JeremyP I'm unclear what you're responding to; my observation is that the usual expression of mmm rrr where mmm = 000 means data register rrr and mmm = 001 means address register rrr is sufficient for the claim that "the instructions that can operate on either data or address registers can [almost always] be decoded as if register were a contiguous 4-bit field". I'm aware that there are many forms of addressing other than register direct, and those would not be "instructions that can operate on either data or address registers". – Tommy Jul 07 '23 at 17:50
  • (and, again, I thought it was an interesting observation, which does not invalidate anything else said here, and is not the same as having 16 general purpose registers, etc, etc) – Tommy Jul 07 '23 at 18:07
  • 1
    @Tommy the point is that most instructions do not operate on Dn or An, at least, not for both operands. Most instructions operate on effective addresses. Most of the effective address modes except the absolute, PC relative, immediate and register direct modes require the specification of an address register e.g. there is (An) but not (Dn). If you have 16 registers that can operate in the capacity of an address register instead of 8 registers, you need that extra fourth bit for the register specification and there's no way around it without adding complexity to the instruction decoding. – JeremyP Jul 10 '23 at 17:59
  • @JeremyP fantastic, then we both completely agree with the original statement. I’m unsure why it’s interesting to you to discuss what proportion of instructions it’s relevant to, but — yes — it is exactly as it was originally phrased applicable to a subset only. Part of the 68000 test suite that I maintain is a simple decoding map so I guess we could inspect that if your lengthy digression is that weirdly important to you. – Tommy Jul 11 '23 at 13:14
  • Ugh, too late to edit, but that was meant to be italics, not bold. Apologies for the crazed screaming. Otherwise, the decoder that I wrote and maintain is here, the one I merely maintain is here. I'm still at a complete loss as to how "you said that in applicable cases the encoding is as if there are 16 registers, but that's only true in applicable cases!" is a helpful or relevant follow up but whatever. Pull requests always welcome for errors. – Tommy Jul 11 '23 at 13:46
  • 1
    @Tommy Neat. Feels a bit bloaty, then again, it is for modern machines, so who cares about absolute speed:) I very much like the user centred approach. We need more stuff like that. – Raffzahn Jul 11 '23 at 14:01
  • @Raffzahn yeah, super-bloated, completely organised towards potential emulator authors in modern environments. There is no other good reason for doing it not just in text but in JSON. – Tommy Jul 11 '23 at 14:40
  • 1
    @Tommy well, that comment was more about the decoder code, but yeah, you're right. – Raffzahn Jul 11 '23 at 14:42
  • 2
    @Tommy Regarding the 4-bit field: this seems true only when looking at the effective address and taking the mode as part of the register number, which it is not. For one, mode 001 is simply not supported for most operations (see AND etc.); for another, the register field (bit 5..7) can only hold a data register. Using the direction bit doesn't help much, not to mention that the encoding would no longer be contiguous. That's why ADDA/SUBA exist as dedicated instructions. ISA-wise those are two separate register sets. (P.S.: does the linked map of yours decode ADDQ as ADD on purpose?) – Raffzahn Jul 11 '23 at 14:43
  • @Raffzahn indeed, “can be decoded as if” does a lot of the heavy lifting in my original comment — it is definitely intended as “this view of things isn’t true, but you can sometimes pretend that it is”. Will have to check on ADDQ. – Tommy Jul 11 '23 at 16:38
  • Oh, and re: code, yeah, it’s straightforward rather than fast. Where straightforward is arguably a synonym of lazy but might more charitably be used to cover the fact that time is finite, computers are fast, and so optimising for net productivity on a mere hobby implies preferring to spend the time elsewhere. That is, if I were trying to be as nice as possible, ummmm, to myself. – Tommy Jul 11 '23 at 16:41
  • Belated, re: ADDQ and appreciating that it was just an aside, yeah, those are printed as ADD with the Q field decoded to a constant value. E.g. ADD.l 1, -(A5). Since the map is on opcode only, no following words considered, you can view the Q as being implicit from the fact that one of the operands is known. But probably it should be explicit. – Tommy Jul 11 '23 at 16:56
  • @Tommy :)) Don't tell me about lazy :)) My way of lazy would be some tables and an interpreter, as I hate to code what I can implement as data :) YMMV. I don't mind ADDQ (and potentially others) being decoded as ADD; depending on usage this may speed up reading. Except when it's important to see what exact opcode is used (e.g. ADD #6 can be an ADDI or an ADDQ). Of course, such output cannot be guaranteed to compile back to the original form. – Raffzahn Jul 11 '23 at 17:36
  • Within the confines of what that case is permitting you to test, ADD #6 can't be ADDI because that would occupy two words. Otherwise: I've flipped and flopped between having that code populate tables of different levels of detail at startup versus just doing all decoding fully live; an early version which was opcode to complete list of bus operations ended up at more than half a megabyte of tables, which was both slower and harder to read. But part of the reason that the code keeps pushing things it had decoded into template parameters is so that I can pick the cut later. – Tommy Jul 11 '23 at 18:37
21

In the photo you link, they are near the bottom, part of the address execution unit and the data execution unit. In this cropped section from a higher resolution image (linked in the comments), you can see the regular structure. That is from the bottom left corner; the vertical columns are (part of) the address registers. The registers are partially fused with the arithmetic and logic.

Just eyeballing it, they seem to use, very roughly, 10% of the die area. This is rather close to the estimate you came up with.

RETRAC
9

I'm sure there were all sorts of tradeoffs the designers balanced, but eight GP data registers and seven GP address registers was already luxurious in those days.

Going to more registers would've required bumping the register-select part of every opcode to 4 bits instead of 3, which would've left less space available for distinguishing actual operations. (As you say, instruction encoding space.)
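To put a number on "less space", here is a sketch under the assumption of a fixed 16-bit opcode word laid out like the ADD-style instructions: operation code, register field, 3-bit opmode, then a 3-bit mode plus register field for the effective address. The function name is made up for illustration.

```python
WORD_BITS = 16
MODE_BITS = 3    # effective-address mode field
OPMODE_BITS = 3  # direction and operand size


def operation_bits(reg_bits):
    """Bits left for the operation itself when each register field is `reg_bits` wide."""
    used = reg_bits + OPMODE_BITS + MODE_BITS + reg_bits
    return WORD_BITS - used


for reg_bits in (3, 4, 5):
    left = operation_bits(reg_bits)
    print(f"{reg_bits}-bit register fields: {left} bits left, {2 ** left} operation codes")
```

Under these assumptions, 3-bit fields leave 4 bits (16 operation codes), 4-bit fields leave 2 bits (4 codes), and 5-bit fields leave nothing to tell operations apart.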

I'll have to dig out the linked issue of Byte. I was admiring the 68000 from afar at that time; a year or so later, I was teaching its assembly language to freshmen. Happy days!

jeffB
  • 3
    Beyond the increased opcode size, how much extra internal *wiring complexity* is created by adding more registers? – RonJohn Jul 07 '23 at 01:54
  • 3
    @RonJohn Not much, as register input and output go from and to buses, so it's all about latching, which a register needs anyway, and an output buffer, which is a given as well. – Raffzahn Jul 07 '23 at 15:27