22

I have an old Sharp PC-G830 pocket computer from the '80s that has 32K of RAM and 256K of ROM. I also have a simple single board computer I built with 128K of RAM and a few megabytes of ROM from a MicroSD card. Both of these computers, however, use a regular 40-pin Zilog Z80 processor with a 16-bit address bus and therefore a theoretical "limit" of 64K of memory. My understanding is that many other 8-bit computers like the Commodore 64 can utilize far more than 64K of memory as well.

How do these computers accomplish this? How can my SBC address 128K of RAM even though it should theoretically only be able to address 64K? I have been scouring the internet for an answer for a few hours now and have yet to find an explanation I can wrap my head around. I have tried examining the schematics of computers that do this but I don't understand the theory behind the setup or how it works.

Any help is much appreciated, thank you for your time.

EDIT Another post was brought up in the comments and I would like to say that I am not looking for instructions on how to make my own MMU or a computer revolving around one. Judging by the comments on this post, the presented schematic is poorly designed and I am not looking to use it as a reference. Additionally, such a question is not related to retro computing and as such doesn't really belong on this forum. I am only looking for an explanation of how old, already designed computers accomplished the task of having more than 64K of total memory. The post that has been brought up does not explain how old computers addressed more than 64K of RAM and only pertains to new, theoretical and poorly made designs.

Shades
  • 331
  • 2
  • 7
  • 26
The term you want is 'bank switching'. Google it and/or have a read of https://en.wikipedia.org/wiki/Bank_switching. There's loads on this on the internet you can read up on. – TonyM Jan 14 '22 at 20:03
  • 1
    @TonyM Aah, thank you. The term "Memory banking" turns up very little on my search engine that is relevant to the topic. Bank switching definitely turns up more relevant info, thank you for telling me what I can search instead of just saying "Google it" :-) – Shades Jan 14 '22 at 20:06
  • 1
I find it interesting that you don't seem to be confused as to how it could access several megabytes of ROM. Surely any method of accessing ROM can be used to access RAM (from a practical point of view, slower access speeds are accepted for accessing ROM, resulting in different methods actually being used, but from a purely theoretical point of view, is there a difference?) – Acccumulation Jan 15 '22 at 22:45
  • 1
    @Acccumulation From a theoretical viewpoint, banking ROM and RAM is the same. From a more practical viewpoint, ROM tends to contain more code than data (because data tends to change). And because you tend to know the code you put in ROM up-front, you can arrange it cleverly to achieve greater locality, thus less required bank switches (that need time) – tofro Jan 16 '22 at 00:09
  • 5
    There is an effort to close this question as a duplicate. However, the old question has issues and is already closed. This question is better written and should be the one left open. Vote to leave this open. – DrSheldon Jan 16 '22 at 17:08
  • @Raffzahn Thank you for your input, however, that post seems to be poorly written. "Having said that, my very first question would be why on earth any hardware control needs to be located within memory address space." is the top comment and from that I glean that the presented schematic is not one that should be followed. – Shades Jan 16 '22 at 22:47
What was this computer you built, and how is it that you're unaware of how it worked? Some kit that didn't explain how it arranged memory? – dave Jan 18 '22 at 02:48
  • @another-dave I built the Z-80 MBC-2 that was on Hackaday a while back along with the Uterm add on. It wasn’t really a kit per se I ordered all of the parts myself and there wasn’t any substantial documentation on the project page of how it works. The schematic is available but it seems to use an ATMega 328-PU as an MMU and there isn’t an explanation of the software driving this. – Shades Jan 18 '22 at 15:59

5 Answers

30

I don't know details of the Sharp PC-G830 specifically but the technique used to address more than 64K with a 16-bit address bus is called "bank switching".

This involves setting up some portions of memory to be switchable via an I/O port line; the application program then organizes things in memory so that different sections can be switched in and out as needed.
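To make this concrete, here is a minimal Python model of the idea. Note that the window address, bank size, and port behavior below are assumptions for illustration, not the layout of any specific machine: the low 48K stays fixed while the top 16K is a window into one of several physical banks, selected by writing a bank number to an I/O port.

```python
# Toy model of bank switching (illustrative layout, not a real machine):
# 64K CPU address space; 0xC000-0xFFFF is a 16K window into one of
# several banks, chosen by writing to a hypothetical bank-select port.

WINDOW_BASE = 0xC000   # start of the switchable window
BANK_SIZE = 0x4000     # 16K per bank

class BankedMemory:
    def __init__(self, num_banks):
        self.fixed = bytearray(WINDOW_BASE)                    # always-visible 48K
        self.banks = [bytearray(BANK_SIZE) for _ in range(num_banks)]
        self.current = 0                                       # bank-select register

    def out_bank_port(self, value):
        """Model of an OUT to the bank-select I/O port."""
        self.current = value % len(self.banks)

    def read(self, addr):
        if addr < WINDOW_BASE:
            return self.fixed[addr]
        return self.banks[self.current][addr - WINDOW_BASE]

    def write(self, addr, value):
        if addr < WINDOW_BASE:
            self.fixed[addr] = value
        else:
            self.banks[self.current][addr - WINDOW_BASE] = value

mem = BankedMemory(num_banks=8)   # 48K fixed + 8 x 16K = 176K of physical RAM
mem.out_bank_port(0)
mem.write(0xC000, 0xAA)           # lands in bank 0
mem.out_bank_port(1)
mem.write(0xC000, 0xBB)           # same CPU address, different physical chip
mem.out_bank_port(0)
assert mem.read(0xC000) == 0xAA   # bank 0's byte is still there
```

The key point is that the CPU still only ever issues 16-bit addresses; the extra address bits live in the bank register, which the program has to manage itself.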

It's nowhere near as clean as having a larger address bus, because you have to ensure that you don't switch out a block of memory that is in use, and avoid situations where you have to continually switch back and forth to complete the program.

Overall it was a scheme to provide more memory than the architecture of the CPU permitted. This was an interim step on the way to 16-bit CPUs such as the 8086 and 8088. Some would say that these two, for example, are really 8-bitters with bank switching built in, and there is some truth to that. If you compare the 8086 architecture to, say, the 68000 architecture, you can clearly see the difference.

jwh20
  • 3,039
  • 1
  • 10
  • 19
  • 2
    My understanding then, is that the memory banks must be switched for whatever block you want to address. Wouldn't this add a lot of time to memory access if you had to switch banks a lot, or is the time the operation takes negligible? I'm just wondering since, say, my SBC has the full 64K and then some taken up by RAM wouldn't it have to switch banks every time it wishes to read data from ROM for the next instruction to execute? Sorry if this is a dumb question, I'm fairly new to designing hardware – Shades Jan 14 '22 at 19:57
  • 7
    @Shades you’d organise your program so that it doesn’t need to switch banks very often. Some systems also had schemes where reads would read from ROM, and writes would write to RAM, so you could copy data into a bank with a ROM overlay while still running code in ROM (and then switch when you want to read from RAM). – Stephen Kitt Jan 14 '22 at 20:14
  • 1
@StephenKitt: In some cases, one could also exploit the fact that a system didn't fully decode I/O to allow zero-overhead bank switching. The menu program I did for the Atari 2600 cartridge "Stella's Stocking" did this. On many scan lines it needed direct access to about 3KB worth of wave tables and about 3KB worth of graphics data; writes to the TIA at addresses $00-$3F when bank 14 was enabled would switch to bank 15, and writes at $40-$7F when bank 15 was enabled would switch to bank 14, thus allowing bank-switching with literally zero overhead. – supercat Jan 14 '22 at 22:04
  • 4
    An alternative scheme to bank switching is address latching, which permits more address bits than there are address pins. However, since the question said 16 address bits, not 16 address pins, that approach seems to be off the table here. (The two schemes are very closely related, differing in whether the "bank register" is itself mapped into the I/O space as its own location vs considered to be the upper bits of an pointer/index register used for indirect addressing.) – Ben Voigt Jan 14 '22 at 22:19
  • 2
    Yet another schema is DMA transfer to and from external, non-addressable memory (e.g. Commodore REU) – tofro Jan 14 '22 at 23:26
  • @Shades In most cases you can switch banks with a single write to memory. If done relatively rarely, this isn't expensive. At the other extreme, if you need to switch banks before every read and write your memory throughput can be reduced by 50% or more. Note that on some systems it's possible to have an area of address space reading one bank while writing another, which can greatly mitigate the performance penalty in certain situations. (E.g., on a C64 you can copy ROM to RAM by simply reading the ROM address and writing that value back to the same address, which will write "through" to RAM.) – cjs Jan 15 '22 at 07:04
  • Re 8086/8088 what is 8 bit about the 8086? wouldn't it be more accurate to say they are "16 bitters" with 20 bit (with caveats) address space? – 640KB Jan 16 '22 at 18:15
  • @Shades I remember Jon Sachs (coder of Lotus 1-2-3) quipping that on other computers you just asked for the memory location you wanted, but on 8086 you needed to say "pretty please with sugar on it" ツ – John Doty Jan 16 '22 at 22:58
  • When I first saw a program with bank switching I confused it with segmented paging as on Windows and thought it was very inefficient to switch banks for a single function call. Actually, bank switching is not so terrible unless you are writing in assembly - a good compiler/linker can lay out banks for you and insert switches as needed. – user253751 Mar 06 '24 at 15:54
15

There are a number of approaches that can allow a CPU with a 16-bit address bus to address more than 64kBytes of memory:

  • Bank Switching - explained in another answer: basically switching, for example, 8- or 16-kByte blocks into and out of the addressable range, in effect exchanging them with the blocks that are currently paged in. Some computers could also bank in ROM, EPROM and Flash memory. The Cambridge Z88, for example, uses banked RAM, EPROMs and Flash memory up to an overall amount of 4MB as a mass storage medium directly addressed by the CPU and its banking chip (the Blink). So, you can "run" a program directly in-place "on" the storage medium without "loading" it first, or access a "file" on the storage medium by simply paging it in at a suitable address.
  • DMA transfer of "external" memory contents to and from the addressable range (that is, for example, what the Commodore REU for the C128 and C64 did). This is, in effect, a really fast external RAM disk that pages external memory contents into internal RAM by transferring them byte-wise through the DMA chip. How fast the memory is accessible depends primarily on the achievable speed of the DMA chip employed. Later memory expansions (the GEOS RAM expansions) used banking as above.
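The second approach can be sketched the same way. The sizes and addresses below are made up for illustration and are not the REU's actual register interface: the external RAM never appears in the CPU's address space at all; instead, a DMA engine copies blocks between it and main RAM.

```python
# Sketch of the REU-style approach (illustrative, not the real REU API):
# external RAM is invisible to the CPU; a DMA engine shuttles blocks
# between it and the 64K of main RAM.

main_ram = bytearray(64 * 1024)        # what the CPU can actually address
external_ram = bytearray(512 * 1024)   # never mapped into the address space

def dma_to_external(main_addr, ext_addr, length):
    """Copy a block out of main RAM into external RAM."""
    external_ram[ext_addr:ext_addr + length] = main_ram[main_addr:main_addr + length]

def dma_from_external(main_addr, ext_addr, length):
    """Copy a block from external RAM back into main RAM."""
    main_ram[main_addr:main_addr + length] = external_ram[ext_addr:ext_addr + length]

main_ram[0x2000:0x2004] = b'\x01\x02\x03\x04'
dma_to_external(0x2000, 0x40000, 4)     # stash a block out
main_ram[0x2000:0x2004] = b'\x00\x00\x00\x00'
dma_from_external(0x2000, 0x40000, 4)   # ...and pull it back
assert bytes(main_ram[0x2000:0x2004]) == b'\x01\x02\x03\x04'
```

Note the trade-off versus bank switching: nothing ever changes under the CPU's feet, but every access to the extra memory costs a copy.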
tofro
  • 34,832
  • 4
  • 89
  • 170
  • 1
    re external memory -- the CDC 6000 had extended core storage, to and from which main core storage would be swapped out, using some sort of specialized "block move" instruction. – dave Jan 14 '22 at 23:35
  • 1
    Both of these were (as you know) still in use well into the ’90s. On DOS computers, EMS used bank-switching into 16K banks, and XMS did transfers. By then, the CPU was capable of 32-bit addressing if the OS supported it. – Davislor Jan 15 '22 at 04:20
7

Another solution worth mentioning — although I doubt it was used by the systems you mention, and it merely allows addressing twice as much as 64K of memory — is "Split I&D", as rather famously used on later models of the PDP-11. A program could have 64K of memory for program text, and 64K of memory for data, for a total of 128K. The processor knew, based on whether it was fetching an instruction to execute or some data to work on, which half to access. In effect, the text-or-data distinction became a 17th address bit.
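A quick sketch of the idea, with hypothetical sizes and values rather than actual PDP-11 behavior: the same 16-bit address reaches different physical memory depending on whether the cycle is an instruction fetch or a data access.

```python
# Sketch of split I&D as "a 17th address bit" (illustrative values):
# the cycle type selects which of two 64K spaces a 16-bit address hits.

instruction_space = bytearray(64 * 1024)
data_space = bytearray(64 * 1024)

def read(addr16, is_fetch):
    # The cycle type acts as the extra address bit: same 16-bit address,
    # different physical memory.
    space = instruction_space if is_fetch else data_space
    return space[addr16]

instruction_space[0x1000] = 0x05   # an opcode lives here...
data_space[0x1000] = 0x99          # ...and unrelated data at the "same" address
assert read(0x1000, is_fetch=True) == 0x05
assert read(0x1000, is_fetch=False) == 0x99
```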

Although doubling the amount of memory you could access was a pretty big win, it placed limitations on certain exotic programming techniques, such as dynamically loaded (or self-modifying) code.

Split I&D is mentioned briefly (though not under that name) in the Wikipedia article on PDP-11 architecture.

(Fun historical anecdote: I heard a story, which I entirely believe, that when Unix was first modified to allow compiling and running split-I&D programs, some programs failed in strange ways. The problem was that in the initial implementation, variables were assigned starting at the beginning of data space, or address 0x0000. This meant that there was one poor variable whose address was the null pointer! This rather badly violated the fundamental notion that a null pointer is not the address of any object. The linker was modified to allocate a dummy, unused variable at address 0, and the problem went away, although the usable data space then dropped to 63.998K.)

Toby Speight
  • 1,611
  • 14
  • 31
Steve Summit
  • 297
  • 1
  • 5
  • 4
    Is "Split I&D" just a way to refer to a Harvard Architecture, or is there some subtle difference? – Toby Speight Jan 15 '22 at 13:51
  • 1
    @TobySpeight I don't think it's a Harvard Architecture, no. AIUI, a Harvard Architecture treated instructions and data completely separately, with separate pathways and storage areas for each. The PDP-11, OTOH, was a purely general von Neumann architecture, with main memory that held code or data. Split I&D was then this little wrinkle that squeezed a 17th address bit out of thin air. – Steve Summit Jan 15 '22 at 13:57
  • 1
    Thanks for the clarification. I think I understand it a little better (for a Z80 guy like myself, it would be akin to using M1 to identify instruction access, if all instructions were a single word). – Toby Speight Jan 15 '22 at 14:11
The point about Harvard is loading an instruction and its data in parallel, which needs two completely independent busses operating in parallel. Splitting I&D is just adding another address bit selecting between two spaces, one reserved for data, the other for instructions. – Raffzahn Jan 15 '22 at 14:13
  • 4
    @TobySpeight: It's Harvard in the sense that data and code are in separate address-spaces, and you can't use normal load/store instructions to access machine code. (e.g. to apply runtime relocations to absolute addresses in a page of machine code.) But normally Harvard also implies building the CPU to take advantage of that, as Raffzahn points out to allow independent parallel code-fetch and data load/store, making the Von Neumann bottleneck a little less bad. (Which this wouldn't do.) So it's the worst of both worlds, except for memory capacity. – Peter Cordes Jan 15 '22 at 17:04
  • Funny point: the NULL pointer doesn't need to be represented by a value zero in the memory, the compiler could store e.g. 0xbad00bad and then checks like if (!ptr) or even if (ptr == 0) would be converted to assembler that compared to 0xbad00bad. Of course the compiler needs to support doing that and the programs must not be relying on "NULL is represented by storing a value zero". So they must not be doing things like storing pointers in integer variables, or setting variables to NULL with bzero(). The standard doesn't allow them to, but it'd be easy to overlook. – Ángel Jan 15 '22 at 23:22
  • So yes, adding a dummy variable address would be indeed the quickest and more robust solution. :-) – Ángel Jan 15 '22 at 23:23
  • @Ángel: The Standard may specify that programs that do certain things aren't strictly conforming, but otherwise the Standard actually allows almost everything and forbids almost nothing. – supercat Jan 17 '22 at 15:40
  • @Ángel Funny point: the NULL pointer doesn't need to be represented by a value zero That's true, and there were even one or two machines where the null pointer wasn't zero, but the PDP-11 was never one of them, so pleeeeeease let's not get bogged down in yet another elaborate discussion of this obscure but old and tired topic! – Steve Summit Jan 17 '22 at 16:06
  • As well as split I/D space, some later PDP-11 OSes took advantage of the otherwise-unused supervisor mode. Commonly-used resident libraries (like the file system library - FCS) could be mapped into supervisor mode. A mode-change stub would be linked into the user-mode program image. I seem to recall that supervisor mode mapping had D-space (mostly?) identical to the D-space of user mode. – dave Jan 18 '22 at 13:34
1

There's also the TI-99/4A way of accessing more ROM and RAM than there is address space. The TMS-9900 has a 15-bit address bus and can therefore only address 32,768 16-bit words. When one looks at the memory map of the TI-99/4A, one can see that the console can (theoretically) address up to 32KiB+256 bytes of RAM, 16KiB VRAM, 8KiB + 2x8KiB ROM, and 24KiB + 16x40KiB GROM, which represents up to 736 KiB.

How does it do it?

In a quite complicated and diverse manner:

  • Direct access on the 16-bit bus: 8K of ROM and 256 bytes of RAM are directly mapped in the address range and are fast to access.
  • Direct access on the 8-bit bus: the 32K memory extension is directly mapped in the address space. Cartridge ROM can also be accessed via the 8-bit bus.
  • I/O memory access: the 16K video RAM is not directly accessible to the CPU, and I/O operations on the video chip are necessary for access (unfortunately, the base unit only had this memory to store BASIC programs, which seriously crippled the machine).
  • GROM access: GROMs were special ROM chips sold by Texas Instruments that had a very different addressing scheme from normal ROMs. Normal ROM chips have separate address and data buses with enough pins to address the capacity of the chip. GROMs, on the other hand, have an 8-bit multiplexed bus plus special control pins, and are read sequentially, in a streaming fashion: set a start address, then read byte after byte, with each read operation incrementing the address.

So, a very complex scheme with an architecture tuned for stream-oriented processing, accessing more memory than the 64 KiB address space would otherwise allow, but a system that does not (initially) use bank switching.
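The GROM access pattern described above can be modeled in a few lines. This is a simplified toy, not cycle-accurate: real GROMs take the address as two successive byte writes through the multiplexed bus, collapsed to one call here for brevity.

```python
# Toy model of GROM-style streaming access (simplified, illustrative):
# no full address bus; the host sets a start address through a narrow
# port, then reads bytes one at a time while the chip auto-increments
# its internal address.

class Grom:
    def __init__(self, contents):
        self.contents = contents
        self.addr = 0   # the chip's internal address counter

    def set_address(self, addr):
        # Real GROMs receive the address as two successive 8-bit writes;
        # a single call stands in for that sequence here.
        self.addr = addr

    def read_byte(self):
        b = self.contents[self.addr]
        self.addr = (self.addr + 1) % len(self.contents)  # auto-increment
        return b

g = Grom(bytes(range(256)) * 24)   # a 6K GROM image for demonstration
g.set_address(100)
assert [g.read_byte() for _ in range(3)] == [100, 101, 102]
```

Because the address lives inside the chip, the host needs only a handful of pins to reach the whole capacity, at the cost of random access being slower than sequential streaming.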

Patrick Schlüter
  • 4,120
  • 1
  • 15
  • 22
0

I wrote a rather large program for a DEC PDP-11/70 at the end of the 1970s, and once it broke through the 64K barrier, I had to go to overlays. This meant manually partitioning the program into pages, so that any one page was self-consistent. Very tedious and error-prone. I don't recall whether the entire 64K had to be swapped for a different one each time, or whether (say) a base 32K could stay resident while the other 32K swapped. All of that hassle went away when the company switched to a VAX.

Neil_UK
  • 101
  • 1
It still had only a 64KB virtual address space, and nothing "swapped" out; a new overlay (of exactly the size it was) would be read into the overlay area, or simply remapped there if you used memory-resident (PLAS) overlays. But this is not really pertinent to the question, since the 11/70 had 22-bit physical addresses, i.e., a physical address space of 4MB. – dave Jan 16 '22 at 04:22