50

In the '80s and '90s, the Intel x86 and Motorola 68000 families were the two leading microcomputer architectures in the 16-bit/32-bit personal computer scene. The 68000 was even preferred by purists because of its orthogonal instruction set. The Intel x86 family, although always the market leader, was criticized for its non-orthogonal instruction set and segmented addressing.

When the Macintosh line switched to PowerPC, the Motorola 68000 family began to disappear as a contender for newly designed personal computers outside the “Wintel” ecosystem. New contenders, such as the PowerPC, ARM and MIPS, took the 68000 family’s place.

I cannot find anywhere on the internet why the 68000 family fell out of favor in the personal computer market.

Can someone explain this or point me in the direction of the answer?

Toby Speight
Biff Iam
  • If it didn't run Windows it ran some version of Unix, and the desktop PC hardware vendors at the time saw cheap PCs eating their market from below. – Thorbjørn Ravn Andersen Sep 25 '23 at 04:04
  • 16
    High cost and low volume meant Motorola couldn't afford the engineering to keep making them faster, so by the early/mid '90s the x86s were both faster and cheaper. RISC CPUs, being much simpler and cheaper to build, took over everything not requiring Windows. – Chris Dodd Sep 25 '23 at 05:07
  • 1
    There's this very good instruction set, which is also a problem if you want to implement it in a RISC-like way, as some instructions are 10 bytes long (the record is 12) whereas the smallest ones are 2 bytes. Plus, with compilers becoming more and more powerful and systems more and more demanding feature-wise, the need to code in assembly faded away. – Jean-François Fabre Sep 25 '23 at 05:25
  • 19
    The success of x86 is not just the CPU, but the whole PC machine, e.g. the BIOS and the peripheral hardware: timers, DMA and interrupt controllers. For a few decades we kept duplicating and emulating the entire original PC machine, and the x86 ISA is just part of it. Even if someone designed a computer with an x86 core but a different set of peripherals, such that it was not compatible with PC versions of available software, it wouldn't sell. – user3528438 Sep 25 '23 at 06:04
  • 13
    Processor architectures come and go. The history of the 68k architecture is perfectly normal. It's the x86 that's anomalous. I believe that is the result of Microsoft's peculiar inability to switch architectures. No other software company seems so stuck: they move opportunistically to whatever architecture suits their immediate needs. Apple has evolved 6502->68k->PPC->x86->ARM. – John Doty Sep 25 '23 at 12:14
  • 14
    But Microsoft has (or had) the ability to switch architectures. Windows has been available on MIPS, PowerPC, Alpha, Itanium, etc. Why MS no longer supports those is due to application unavailability. Unless 3rd party apps are available for a platform, no-one wants to buy that platform, and therefore OS support is an undue burden on MS. (Granted, the argument here is parallel support, rather than serial changes of ISA). – dave Sep 25 '23 at 12:37
  • 14
    @JohnDoty: Microsoft could and did switch architectures, as another-dave said, at least for their server OSes. The problem with making that commercially relevant for desktop/laptop home users is the 3rd-party ecosystem that's designed around binary compatibility not source, the strong backwards-compat of x86 which is or was its reason for commercial success, and crufty 3rd-party codebases that make all kinds of assumptions that aren't documented but are de-facto true on x86 Windows, like little-endian, stack-args calling conventions, etc. (Even moving to x86-64 was slow for some code.) – Peter Cordes Sep 25 '23 at 13:20
  • @PeterCordes But other third-party ecosystems don't have that trouble. Linux is entirely a third-party ecosystem. – John Doty Sep 25 '23 at 13:34
  • 2
    @JohnDoty: Exactly, because Linux is a source-based ecosystem, not e.g. selling a binary DLL without source as a component for other people to build apps around. People knew that was a non-starter on Linux, I think. And vendors of closed-source software for Linux had often already ported it from something else, so it's probably portable enough to recompile if stuff (like target architecture) changed. I don't really know why the Wintel ecosystem evolved to be so binary centric while others didn't, but I think it's true to say that running old binaries would be key for any new PC desktop. – Peter Cordes Sep 25 '23 at 14:10
  • 1
    @PeterCordes: If everything a program will need to access can fit in a 4GB address space, why should it need to waste space with 64-bit pointers? For some kinds of applications, 64-bit addressing could be a major win, but for many others it would be a performance drain with no benefit. – supercat Sep 25 '23 at 19:20
  • 3
    @supercat: Some code stayed 32-bit for valid performance reasons, e.g. pointer-heavy data structures, and 32-bit-pointers-in-64-bit-mode ABIs like Linux x32 to get the best of both worlds never caught on for x86-64, unlike for AArch64 where I think that gets some use. But other code would benefit more from x86-64 having more registers than 32-bit mode, and guaranteed availability of SSE2, and a more modern calling convention. Benefit more than the extra cost of larger instruction sizes (REX prefixes) and larger cache footprint from pointers and size_t. – Peter Cordes Sep 25 '23 at 19:28
  • @PeterCordes: If a compiler and OS supported an ABI which used 32-bit pointers while also using the bigger register set, I would think it worthwhile to migrate programs to that, with an easy build option to select pointer size. I find it weird that people who claim to be interested in performance ignore an opportunity to slash the caching footprints of language frameworks that make heavy use of references. – supercat Sep 25 '23 at 20:42
  • 1
    @supercat: That was the case on Linux for many years. Most distros at least had an x32 libc and a few other core packages you could install on x86-64 systems, since multi-arch was already supported for i386 + x86-64. But I don't know of any distros that built the core OS like /bin/bash and init as x32 and treated x86-64 (full 64-bit pointers) as the unusual one. A third set of libraries to upgrade was more pain than most people wanted for the modest perf gains with most code :( Plus, if you have a mix of processes running, both libcs etc. are resident in RAM (and L3 cache). – Peter Cordes Sep 25 '23 at 21:19
  • @supercat: Some distros these days are dropping x32 support, e.g. turning it off in their kernel config. BTW, GCC's actual code-gen for -mx32 isn't as efficient as it could be, e.g. using address-size prefixes to truncate to 32-bit (and zero-extend to 64) even on addressing-modes that couldn't have involved a negative offset (in that case 32-bit address-size is a cheap way to avoid carry-out into the high half, an address above 4GiB). https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85038 and the earlier bug 82267 linked from it. So code-size suffers some, although fewer REX prefixes helps. – Peter Cordes Sep 25 '23 at 21:24
  • 2
    Belabouring the point, but Windows 11 is actively available for ARM, both via dedicated consumer devices and in virtual machines on other ARM devices such as the Mac. – Tommy Sep 25 '23 at 21:31
  • 1
    @supercat In the 2020s, people who are interested in performance are usually interested in performance because their datasets are multiple gigabytes in size. – Russell Borogove Sep 26 '23 at 22:41
  • @RussellBorogove: Most tasks don't involve such data sets. If a system is seldom if ever used for tasks that would involve such large data sets, accepting a performance penalty on all tasks seems a bit ugly, and viewing as obsolete anything which doesn't subject itself to that performance penalty seems like a devolution. – supercat Sep 27 '23 at 15:25
  • @supercat laughs in game developer, then cries in game developer – Russell Borogove Sep 27 '23 at 16:53
  • @RussellBorogove: Even if the total amount of RAM needed to manage all of the things that need to be done to make a game work would exceed 2GB, the actions that would need to be done each frame could be logically subdivided into many smaller tasks, which for many games would not involve that much storage. – supercat Sep 27 '23 at 17:11
  • The 68k ISA is not orthogonal. The 68k is not a DEC PDP-11 or VAX. The 68k doesn't even allow purely relocatable addressing, which even the 6809 does. – vollitwr Oct 13 '23 at 06:02

7 Answers

56

The 68k family instruction set, as elegant as it appeared to the casual assembler programmer (been there), had several flaws that made it very difficult to make fast in hardware. Out-of-order or superscalar execution was very, very difficult to implement.

  • Over-complex addressing modes, especially the memory-indirect ones introduced with the 68020: combined with virtual memory, they made it theoretically possible to get up to 16 page faults in 1 instruction (a long move, indirect with displacement and scaled index, from an odd address touching 2 pages, etc.; see the sketch after this list). These indirect addressing modes were the first to be removed when defining the ColdFire ISA.
  • Exposure of the pipeline internals on traps and exceptions: the idea was that on a trap, the instruction could be resumed after fixing the cause of the error. This made it very difficult to get performance out of the kernel, as the CPU wrote more and more data to the stack with each generation, and it also limited the internal state that could be saved. x86 was much more pragmatic and just restarted the cancelled instruction from the start.
  • Compatibility between successive family members was not as good as with Intel CPUs. If you want to compile a program that runs on the 68000 and on any of the 680[2346]0, you have to leave a lot of features aside.
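
To get a feel for that worst case, here is a rough, hypothetical sketch (Python; the addresses and the simplifications are mine, not the answer's) that just counts the distinct 4 KiB pages touched by one worst-case memory-indirect MOVE.L when every access happens to straddle a page boundary. It only tallies the program-visible accesses, so it stops short of the figure of 16 quoted above, which presumably also counts accesses such as page-table walks.

```python
# Hypothetical worst case for a 68020 "MOVE.L ([bd,An,Xn],od),([bd,Am,Xm],od)":
# the instruction with full extension words, two indirect pointer fetches and
# two misaligned long operands, each crossing a 4 KiB page boundary.
PAGE = 4096

def pages(start, size):
    """Set of page numbers covered by an access of `size` bytes at `start`."""
    return {addr // PAGE for addr in range(start, start + size)}

def straddling(boundary, size):
    """An access of `size` bytes placed so it crosses the page boundary at `boundary`."""
    return pages(boundary - size // 2, size)

touched = set()
touched |= straddling(0x10000, 22)  # instruction stream (up to 22 bytes with full extensions)
touched |= straddling(0x20000, 4)   # source indirect pointer read from memory
touched |= straddling(0x30000, 4)   # source long operand
touched |= straddling(0x40000, 4)   # destination indirect pointer
touched |= straddling(0x50000, 4)   # destination long operand

print(len(touched), "pages must all be resident for this one instruction")  # -> 10
```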

There is the famous newsgroup post from John Mashey explaining the fundamental issue with the 68k ISA in comparison to other ISAs of that time.

Patrick Schlüter
  • 3
    An excellent explanation! I'd like to argue a bit:
    1. x86 also has insns that can create lots of page faults, for example the block move insns. There's still a way to execute them slowly yet correctly, for example from within a microcode engine.
    2. Those incompatibilities between family members are at the same time a way to move forward while dropping lots of no-longer-necessary legacy en route.
    – lvd Sep 25 '23 at 13:20
  • 8
    @lvd: x86 rep movs is interruptible, though, so it can make partial progress through those page faults. (Same for AVX2 / AVX-512 gather / scatter instructions) Patrick is talking about the worst-case number of pages that all have to be present for a program to make forward progress. In x86's case, that's 6 (Do x86 instructions require their own encoding as well as all of their arguments to be present in memory at the same time?) for one step of a misaligned rep movsd with the 2-byte instruction spanning a page boundary itself. – Peter Cordes Sep 25 '23 at 13:27
  • 1
    Or does m68k save partial execution progress ("pipeline internals") on the stack to resume after an exception? x86 interrupts happen strictly on instruction boundaries, so execution restarts from scratch after any fault. – Peter Cordes Sep 25 '23 at 13:30
  • 3
    Perhaps worth pointing out that 386 made x86 much more orthogonal than before, at least in 32-bit mode. (But if you're writing 16-bit code that cares about compat with 8086 or even just 286, you're in the same boat as 68k in terms of leaving new features unused because 68000 doesn't have them. Also, if the first hypothetical m68k Windows PCs used 68020, that would be the baseline for m68k PC software, not 68000.) – Peter Cordes Sep 25 '23 at 13:34
  • Still any complex addressing mode move (or any other insn) is perfectly interruptible (or more specifically, cancellable) up until the final unaligned write, which would require at max. 2 page faults. A more interesting case here could be movem. – lvd Sep 25 '23 at 13:44
  • 1
    @PeterCordes yes, the 680x0 (1) saves intermediate state to the stack and thus allows an interrupted instruction to be continued. By "interrupted" I don't mean IRQs, which are, as you noted, handled between instructions, but exceptions like bus error, invalid instruction (A000/F000), etc. x86 was more pragmatic and in the long run did the right thing of restarting the instruction completely. (1) That resume mechanism didn't work on the 68000 as the stack dump was incomplete => 68010. – Patrick Schlüter Sep 25 '23 at 13:50
  • The MC68060 uses a restart exception processing model. Exceptions are recognized at the execution stage of the operand execution pipeline (OEP) and force later instructions that have not yet reached that stage to be aborted. -- so it was flexible across the full 68k family. And in this way, it is not that much more complex to be superscalar than for any other "CISC" cpu with lots of legacy. – lvd Sep 25 '23 at 13:59
  • 3
    @PeterCordes: On the 68000, if memory address 0x1234FFFC holds the value 0x1234FFFE, code performs an indirect store of some 32-bit number through the address held at 0x1234FFFC, and a page fault could occur when accessing the least-significant halfword of the destination (0x12350000), that fault wouldn't occur until the 0xFFFE part of the address had been overwritten and the target address didn't exist anywhere in the universe outside the internal CPU state. – supercat Sep 25 '23 at 15:34
  • This is interesting. I had always heard that we missed out on 68000's flat memory addressing model and were cursed with x86's complicated segmented memory. There is more to it than that apparently. I remember the convolutions I had to resort to, to get arrays > 64K on 16-bit systems. – Pierre Oct 07 '23 at 18:29
  • 2
    @Pierre: The 68000 had real advantages in ease of programming, until it was possible to use i386 flat mode in 1985-86. But before then, in 1984, Motorola had lumbered themselves with indirect addressing modes which made pipelining hard, limiting the performance of future 68000s. They then didn't do a very good job of the 88000, and joined the AIM alliance in 1991. – John Dallman Oct 07 '23 at 19:08
39

The Apple-IBM-Motorola alliance was created in 1991 to compete with the Windows/Intel market. Its main successes were the creation of the PowerPC instruction set, derived from IBM's POWER architecture, and Apple's Power Macintosh line of computers.

IBM originated the idea, having seen that Windows on Intel was out-competing OS/2, and wanting to avoid being dependent on Intel. Apple joined it, seeing the chance to grow out of their existing markets, and Motorola presumably saw it as a successor to 68000, having failed comprehensively with the MC88000.

While the 68000 was used in the Macintosh series, Atari STs and Amigas, all of which sold in large numbers, all the operating systems involved were quite different, so there was no unified software base. That meant there wasn't the sustained demand for 68000 that could have paid for chip development on the scale required to keep it competitive with x86. The engineering workstation market had started with the 68000, but had already switched to RISC before AIM was created.

John Dallman
  • 2
    Good answer, however I have to object to "home computers, plus the Macintosh". The "home computers" (I assume you mean the Amiga and Atari ST) were on the same level as the 68k Macs - OK, the Amiga was held back by only having "TV-compatible" video output (so its high-resolution mode was interlaced and flickering), and the Atari ST was derided as the "Jackintosh", but still, both of them were just as capable as the 68k Macs and also saw professional use. – rob74 Sep 26 '23 at 14:34
  • @rob74: They had the hardware capabilities, but did they have the range and quality of software that the Macs did? See what you think of this edit. – John Dallman Sep 26 '23 at 21:35
  • 1
    @JohnDallman It feels like one of the long, fruitless discussions we held back in the days :) High regard for Macintoshes was mostly an American thing. In Europe - Amiga and Atari ST were seen as both less expensive and better for professional use. The killer applications were music and DTP for Atari ST (it had built-in MIDI and was the host to the original Digital Audio Workstation - Cubase) and video for Amiga (due to availability of inexpensive Genlock hardware). – fdreger Sep 27 '23 at 13:30
  • Bingo. Basic capitalism wins every time. If there's no demand for your product, it will necessarily be supplanted by products that consumers do want (or are willing to pay for). – Ian Kemp Sep 27 '23 at 15:59
17

Motorola stopped investing in the MC68000 family when everyone thought that RISC was the future and that CISC CPUs would soon be uncompetitive. So it switched to the PowerPC.

Even Intel thought this and developed RISC CPUs (i860, i960...). Intel reluctantly continued investing in x86.

For Motorola, it was probably true: the last version, the MC68060, was competitive with the Pentium, but it was quickly surpassed because Intel's manufacturing superiority allowed lower dissipation and higher frequencies. Switching to simpler RISC CPUs was a way to stay in the race.

Now, the difference between RISC and CISC (e.g. x86) is less relevant performance-wise, due to the possibility of putting 100 times more transistors on a die.

Patrick Schlüter
Grabul
  • 3
    Before Motorola joined the AIM alliance it had a hand in creating its own RISC architecture, the 88000 line (m88k, consisting of two models shipped over the course of three years, before being discontinued in 1991). – njuffa Sep 25 '23 at 09:05
  • Yes, and AMD tried RISC with AMD29K. – Grabul Sep 25 '23 at 20:40
  • 3
    The main difference between CISC and RISC is in the engineering cost and complexity. As long as you have enough volume (sales), that cost can be made up. x86 had the volume and 68K did not. – Chris Dodd Sep 26 '23 at 21:51
  • 2
    @ChrisDodd: Another detail is that executing RISC code from cache is faster than executing CISC code from cache, but since CISC code is smaller than RISC code, it can be fetched from main memory faster. Having a system store CISC code in main memory but convert it to a RISC representation before storing it into the internal cache offers the best of both worlds. – supercat Sep 28 '23 at 16:03
11

Motorola lost the race to 32-bit computing through a simple engineering mistake. In the era when the 68k and x86 were very popular, the shift to 32-bit CPUs was a race to mass production. No question the 68020 was a cleaner design and almost destined to be the no. 1 choice for new machines. Friends of mine paid around $400 at the time for early 68020 chips to build test boards. x86 at the time was hopelessly behind. But the first iteration of the 68020 had pushed the design parameters of the chip process to the point that yields were appalling. Every chip sold at a loss. Motorola then had to redesign all the masks, which was an 18-month engineering exercise. Those 18 months were the window that let x86 get ahead in the market, and that was the end of the 68k family's dominance. A shame.

Alex Danilo
10

When the Macintosh line switched to PowerPC, the Motorola 68000 family began to disappear

It was rather the other way around. Apple switching was a result of Motorola losing the race.

New contenders, such as the PowerPC, ARM and MIPS, took the 68000 family's place.

Not really. Also, you forget the NS32k family, which went away at the same time - maybe less visible, but at least as successful as the 68k.

I cannot find anywhere on the internet why the 68000 family fell out of favor in the personal computer market.

Cost, on both sides:

  • Motorola wasn't able to keep up the investment to improve their CPUs the way Intel did.
  • The resulting CPUs were considerably more expensive than Intel's offerings.

This is true not only for upper-end offerings ('060 vs. Pentium) but even more so for embedded use. Basic 80(1)88-based systems could be delivered at considerably lower development and production cost.

Omar and Lorraine
Raffzahn
  • 3
    Let me be doubtful about the NS32032's success. At that time I was a fan of NS products and had been in the MCU/MPU market for some time, and I never saw a single design with it. I knew it had been employed in some high-end (for the time) laser printer, but no idea which. The MC68K was simply everywhere; its assembly language, friendly to high-level languages, was unbeatable. Only heavyweight IBM could have decided, for marketing reasons, on a technically freakish CPU from a company that didn't even believe in microprocessors. – LuC Sep 25 '23 at 16:02
  • 1
    Well, one example might be Siemens, which switched (ca. 1985) their Unix line (PC-MX) from Intel to NS. Those systems were, well into the late 1990s, the best-selling Unix systems in Europe, covering a range from a single 32032 CPU all the way to 16-way 32532 machines. For market share in that segment, 68k was at best a third. – Raffzahn Sep 25 '23 at 16:58
  • Well, starting from the mid-eighties I would recall the Apple Macintosh, Commodore Amiga, and Atari ST/TT lines, without forgetting the venerable Sinclair QL. I'm quite sure they sold a few more units than Siemens' devices. On the industrial side, I've witnessed a few VME-bus systems with the MC68K, running Sys V too. – LuC Sep 26 '23 at 10:44
  • 2
    @LuC Sure, but what's your point? You assumed there weren't any NS systems, which the MX family proves different. The same goes for embedded, especially in high-reliability environments. Also, do you really think National would have poured in the money needed to develop the series all the way to the 32641, a 1992 superscalar server CPU - not to be confused with the embedded 32x160 series, developed until 1997 - if it wasn't successful and generating positive ROI? – Raffzahn Sep 26 '23 at 10:59
  • I'm still convinced that, despite any great innovation in the NS32K series, they didn't have as much market success as you say. Out of curiosity, I found a historical fan site giving the sales figures of the Siemens MX 300 as 13,000 units over its life span. In the early '90s, I collaborated with a small company (let's say insignificant, compared to Siemens) that was selling that number of 68K systems every month, so I find it strange to read of a similarity in success. – LuC Sep 26 '23 at 12:30
6

The 68000 was a joy to program (compared to the segmented-memory Intel x86), but it simply didn't keep up in the clock-speed race.

ubfan1
3

This was purely economics. By about 1988 the ready availability of IBM AT clones had the effect of pushing the price of support hardware - including cases, discs, etc. - down enormously, and even on the '286 there were UNIX variants that demonstrated that such things were possible. The '386, when introduced, exploited that, and from that point onwards it became a race between what Intel - with a growing income - and Motorola et al. - with static incomes - could do with the available semiconductor technology.

By about 1995 Intel's price/performance ratio was unassailable, and graphics accelerators which could operate in conjunction with PC hardware were starting to erode the market for specialist workstations.

Mark Morgan Lloyd
  • 1
    Kinda misses the window. Apple and Motorola committed to PowerPC (thereby dooming the 68000) in 1991. With the AIM alliance (Apple, IBM, Motorola) backing PowerPC, it was not at all clear that Intel's price/performance ratio was unassailable. The first PowerPC Mac shipped with the then-state-of-the-art ATI Mach 8 on the motherboard - the ATI Mach 8 being essentially the first generation of graphics accelerators on both platforms. Plus the PowerPC Macs shared the same ISA bus with IBM machines. The Mac has never particularly been a "specialist workstation". – Robin Davies Oct 07 '23 at 00:25