Let's say you are running a 32-bit RISC system. What instructions would you use to access a 64-bit memory address?

In a CISC instruction set, you can simply pass the extra word using a multiword instruction. For example:

1a) JMP
1b) loAddress
1c) hiAddress

Given that RISC instructions are only one word each, how would you access a multi-word address?

Assume the ALU is 32-bit and has a carry flag.

Also, in a CISC system (for example the 8080) both the loAddress and hiAddress words would be stored in the program memory. I.e. the JMP instruction knows to look at the next item in program memory to retrieve the loAddress, and the item after that to retrieve the hiAddress. What happens in RISC?

phuclv
Jet Blue
  • If your registers are 32 bit only then you probably need specialized instruction that uses a register pair. Alternatively, you would select a memory bank in an architecture specific way, e.g. writing into a control register. Then again, your example uses a `JMP` so that means at least your `PC` is 64 bit so maybe you do have 64 bit registers where you can simply use an indirect jump. – Jester Jun 07 '18 at 19:06
  • Also your title doesn't make much sense to me, your address is obviously not greater than the address space. – Jester Jun 07 '18 at 19:10
  • With regards to `JMP`, I was thinking something like the `JMP` command in the 8bit 8080. What it does is load the hiAddress (one byte) into one register, and the loAddress (one byte) into another. – Jet Blue Jun 07 '18 at 19:11
  • I updated the title. Also, I'm asking in a general sense as I am more familiar with CISC instructions than RISC. – Jet Blue Jun 07 '18 at 19:13
  • @downvoter, an explanation for the downvote would be nice. I am more than willing to take any advice needed to improve the question – Jet Blue Jun 07 '18 at 20:37
  • Your question is based on a false assumption. RISC architectures can have variable-length instructions (e.g, ARM in Thumb mode), and CISC architectures can have fixed-length instructions. –  Jun 08 '18 at 01:02
  • @duskwuff I was not aware of this. I was under the impression that RISC put everything needed to execute an instruction (ex `JMP`) in one word (this being the key difference between it and CISC). – Jet Blue Jun 08 '18 at 02:14
  • @JetBlue No, not at all. While it is true that many RISC designs use a fixed-length instruction word, there is no requirement that this be the case. –  Jun 08 '18 at 05:36

2 Answers

Even on a CISC, what you describe is quite unusual. It's not because of being CISC, it's because of using addresses wider than registers. This is usually only found in 8-bit CPUs. (x86 segmentation qualifies too, though: indirect far jumps take a pointer to an m16:32 segment/offset pair, or m16:16 in 16-bit mode. Being little-endian, the offset is first. Outside 64-bit mode, jmp ptr16:32 is also encodeable, with the absolute segment:offset as part of the instruction stream.)

Normally when you want to design a CPU with a larger address space, you also make the registers wider so you can deal with addresses efficiently. It's only at the very low end, where you want to save transistors by using mostly 8-bit registers / ALUs but can't limit your address space to 256 bytes, that you find this kind of design.


There is a real issue here even when the address size matches the word size. Constructing arbitrary 32-bit (or 64-bit) constants is a problem that different ISAs solve different ways. ARM often uses PC-relative loads from a nearby "literal pool", while others often use a lui or equivalent to set the upper 16 bits and zero the rest, then ori with a 16-bit immediate. (ARM has some neat tricks for encoding immediates with only a few bits set, by using a shifted/rotated immediate.)
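
As a rough illustration of the lui / ori approach, here is a minimal Python model of the two instructions. The function names (`lui`, `ori`, `load_constant32`) are just labels for this sketch, not a real simulator API, and it only models the register value, not the instruction encoding.

```python
MASK32 = 0xFFFFFFFF

def lui(imm16):
    """Model of a MIPS-style lui: immediate goes in the upper 16 bits, low 16 bits are zeroed."""
    return ((imm16 & 0xFFFF) << 16) & MASK32

def ori(reg, imm16):
    """Model of ori: OR a zero-extended 16-bit immediate into a 32-bit register value."""
    return (reg | (imm16 & 0xFFFF)) & MASK32

def load_constant32(value):
    """Build an arbitrary 32-bit constant using two one-word instructions."""
    reg = lui(value >> 16)          # set the upper half, zero the lower half
    reg = ori(reg, value & 0xFFFF)  # fill in the lower half
    return reg

print(hex(load_constant32(0xDEADBEEF)))
```

Each instruction only needs room for a 16-bit immediate, which is why the pair fits in two fixed-width 32-bit instruction words.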

In general on a RISC, if you need to jump far away you may need to construct the address in a register using multiple instructions. Then use a jump-to-register instruction.

MIPS branch instructions are interesting: it has relative branches that add a signed displacement to the program counter, with a fairly large range, and absolute jump instructions that replace the low 28 bits of PC with a new address. (That address is constructed from a 26-bit immediate left-shifted by 2, because MIPS requires instructions to be 4-byte aligned, so the low 2 bits don't need to be stored. See "How to Calculate Jump Target Address and Branch Target Address?".) But when the target isn't reachable from the current location with those, you need jr with an address in a register.
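
The two target calculations can be sketched like this in Python. This assumes classic 32-bit MIPS, where both are computed relative to the address of the instruction after the branch or jump (PC + 4) and branches carry a 16-bit immediate; the helper names are made up for the example.

```python
MASK32 = 0xFFFFFFFF

def sign_extend16(imm16):
    """Interpret a 16-bit field as a signed two's-complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16

def branch_target(pc, imm16):
    """Relative branch: (PC + 4) plus the sign-extended immediate, scaled by 4."""
    return (pc + 4 + (sign_extend16(imm16) << 2)) & MASK32

def jump_target(pc, imm26):
    """Absolute jump: keep the top 4 bits of PC + 4, replace the low 28 bits."""
    return ((pc + 4) & 0xF0000000) | ((imm26 & 0x03FFFFFF) << 2)

print(hex(branch_target(0x00400000, 0xFFFF)))   # branch back by one instruction
print(hex(jump_target(0x90000000, 0x0000001)))  # stays inside the same 256 MiB region
```

A jump can therefore never leave the current 256 MiB region, which is exactly the case where jr with a full address in a register is needed.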

x86-64 also lacks a 64-bit relative jump instruction. If you need to jump farther than +-2GiB away (not far as in a new CS segment), you need an indirect jump. Normal jump/branch instructions still use rel8 or rel32 displacements, keeping the machine code compact. The only instruction that can take a 64-bit immediate is mov-to-register. The normal code model assumes that all code within the same library or executable is within 2GiB of each other, so the linker will be able to fill in 32-bit displacements.
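
To make the +-2GiB limit concrete, here is a small Python check of whether a target is reachable with a rel32 displacement. It assumes the 5-byte `jmp rel32` encoding and that the displacement is measured from the end of the instruction; `fits_rel32` is a hypothetical helper name for this sketch.

```python
def fits_rel32(jump_addr, target, insn_len=5):
    """Return True if target is reachable from a rel32 jump located at jump_addr.

    The displacement is relative to the address of the next instruction,
    and must fit in a signed 32-bit field.
    """
    disp = target - (jump_addr + insn_len)
    return -2**31 <= disp <= 2**31 - 1

print(fits_rel32(0x400000, 0x401000))          # nearby target
print(fits_rel32(0x400000, 0x7FFF00000000))    # far target, needs an indirect jump
```

When the check fails, the usual fallback is to put the 64-bit address in a register (or in memory) and jump indirectly.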


8-bit RISC

The only RISC ISA I'm aware of with a program counter wider than registers is AVR, a microcontroller with 8-bit registers. It can treat pairs of registers as 16-bit addresses, and its PC is 16-bit. Its IJMP (indirect jump) instruction sets PC = Z (where Z is a pair of 8-bit registers). On AVRs with 22-bit program counters instead of just 16, it zeros PC(21:16).

EIJMP (extended indirect jump) takes the EIND register from I/O space for the high bits of PC, with the low bits still coming from Z.
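
A minimal Python sketch of how the new PC is formed by the two instructions, assuming Z is the register pair r31:r30 and a 22-bit program counter (so only the low 6 bits of EIND matter). The function names are just for illustration.

```python
def z_pointer(r31, r30):
    """Combine the two 8-bit registers into the 16-bit Z pointer (r31 is the high byte)."""
    return ((r31 & 0xFF) << 8) | (r30 & 0xFF)

def ijmp(r31, r30):
    """IJMP: PC(15:0) = Z, and PC(21:16) = 0 on parts with a 22-bit PC."""
    return z_pointer(r31, r30)

def eijmp(eind, r31, r30):
    """EIJMP: PC(21:16) = EIND, PC(15:0) = Z (assuming a 22-bit PC)."""
    return ((eind & 0x3F) << 16) | z_pointer(r31, r30)

print(hex(ijmp(0xFF, 0xFF)))
print(hex(eijmp(0x01, 0x00, 0x10)))
```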

AVR instructions are almost all 2 bytes long, but some versions have a 4-byte jmp instruction which takes a 0..4M absolute address for the jump target.


Mainstream RISC machines with 32-bit registers also have 32-bit program counters and virtual address-spaces. (Having more than 4GiB of physical memory could be possible, but you couldn't map it all at the same time in one process).

Most of them are heavily word-oriented in their design, so all they need is jr reg (MIPS) or whatever equivalent to branch to any possible address, because it fits in one register. This is part of the reduced complexity that RISC literally stands for.


On a normal RISC like MIPS, SPARC, or PowerPC, 64-bit addresses are only available in the 64-bit ISA extension, where you have 64-bit integer registers. So you'd use instructions like MIPS ld $2, 0($3) to do a 64-bit (doubleword) load using $3 as the 64-bit base address. See this MIPS-IV ISA manual. (MIPS-III added 64-bit extensions, with instructions like ld and daddu. Apparently MIPS-I left a lot of its opcode coding space unused, so there was plenty of room for new opcodes to do full 64-bit ALU operations.)

Some 32-bit CPUs added extensions to support large physical addresses without increasing the virtual address space. For example, x86's PAE defined a new page-table format with 36-bit physical addresses. But even with segmentation, a single process can't address more than 4GiB of virtual memory at a time. (x86 segment base+offset happens before virt->phys translation, creating a 32-bit linear address. So it's still useful for thread-local storage, e.g. with [fs:0] being a different linear address depending on that thread's fs segment base.)


Extended addressing on 32-bit RISC ISAs

Paul Clayton comments:

PA-RISC had "space registers" which provided extended addressing. 32-bit PowerPC had segment registers which were selected based on the most significant 4 bits of the effective address from a 16-entry table (providing a 52-bit virtual address space). For PA-RISC "SRs 5 through 7 can be modified only by code executing at the most privileged level." For PowerPC, any segment register change required privilege.
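
Reading the PowerPC part of that comment literally, the translation could be sketched in Python as below. This is only a guess at the shape of the mechanism: the 24-bit width of the segment identifier is inferred from 52 - 28 bits rather than checked against the architecture manual, and `virtual_address` and `segment_regs` are names invented for this example.

```python
def virtual_address(segment_regs, ea):
    """Map a 32-bit effective address to a 52-bit virtual address.

    The top 4 bits of the effective address pick one of 16 segment registers;
    the selected segment ID (assumed 24 bits) replaces those 4 bits.
    """
    sr_index = (ea >> 28) & 0xF
    segment_id = segment_regs[sr_index] & 0xFFFFFF   # assumed 24-bit field
    return (segment_id << 28) | (ea & 0x0FFFFFFF)

segment_regs = [0] * 16
segment_regs[0xA] = 0x123456
print(hex(virtual_address(segment_regs, 0xA0001000)))
```

User code still only ever forms 32-bit effective addresses; the extra bits come from privileged state, which matches the point that changing a segment register requires privilege.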

So apparently some RISC ISAs did extend their addressing before going fully 64-bit. But I don't know the details and am not planning to take the time to research this. Other answers welcome!

Peter Cordes
  • Thanks for the answer. I am not familiar with the AVR's ISA. How would the Z register be set? Say for example to hold the value 65535 ( 0xFFFF on both bytes). Are there separate instructions to set each member of the pair, or a special instruction to set both simultaneously? – Jet Blue Jun 07 '18 at 22:25
  • Googling around yielded `LD ZH, hiByte` and `LD ZL, loByte` – Jet Blue Jun 07 '18 at 22:44
  • @JetBlue: AVR's doc pages have a table of contents. I looked around a bit, and [`LDI` (load-immediate)](https://www.microchip.com/webdoc/avrassembler/avrassembler.wb_LDI.html) shows an example of setting Z: `clr r31` (zhigh = 0) / `ldi r30, $F0` (zlow = 0xF0) / `lpm` (load constant from program memory, using Z as a pointer). So **Z is `r31:r30`**. A few instructions can operate on a whole word, like [`adiw ZH:ZL, 63`](https://www.microchip.com/webdoc/avrassembler/avrassembler.wb_ADIW.html). So it's a LOT like how 8080 can use a pair of 8-bit regs for some 16-bit ops, and as a pointer. – Peter Cordes Jun 07 '18 at 23:35
  • @JetBlue: I added some more stuff after thinking about what you were really asking. Even a full 32-bit address is a problem in a 32-bit instruction word! – Peter Cordes Jun 08 '18 at 01:39
  • Thank you for all that additional info! =) I'll have to read it a couple of times to understand all of it. Yes, that is the problem I was having. I am trying to access more memory than what the RISC-like (single word instructions) CPU (16-bit homebrew) I am using can directly address. – Jet Blue Jun 08 '18 at 02:28
  • PA-RISC had "space registers" which provided extended addressing. 32-bit PowerPC had segment registers which were selected based on the most significant 4 bits of the effective address from a 16-entry table (providing a 52-bit virtual address space). For PA-RISC "SRs 5 through 7 can be modified only by code executing at the most privileged level." For PowerPC, any segment register change required privilege. – Paul A. Clayton Jun 08 '18 at 02:33
  • @JetBlue: have a look at how AArch64 encodes immediates, with support for repeating a bit-pattern in 2/4/8/16/32 bit chunks, or for rotating a few bits to an arbitrary position. But presumably that needs significant hardware to shuffle / rotate, and 32-bit constants with 16-bit instructions is less of a big deal. But still, ARM32 immediates can be rotated, and `mvn` (mov-negated) lets you construct `0xff1fffff` in one 16-bit thumb instruction. Thumb mode immediate encodings (16-bit instructions) might be instructive, and maybe also MIPS16 (which still uses 32-bit regs) – Peter Cordes Jun 08 '18 at 02:44
  • @PaulA.Clayton: If you have time, that could be expanded into an answer. For now I copied your comment into my answer, but I know nothing about PA-RISC, and only a tiny bit of PowerPC (e.g. enough to wish that x86 had `rlwinm` for unpacking bitfields and narrow integers). – Peter Cordes Jun 08 '18 at 02:46

Given that RISC instructions are only one word each

This is not true. Most modern RISC architectures have a variable-width instruction set, or at least a special variable-width mode (ForwardCom, SuperH, MIPS16e, Thumb-2 in ARM, the C instruction set in RISC-V...), although these exist mainly to increase code density. That still means you can make your RISC architecture use multiword instructions.

Even then it won't help you unless you can use instructions that are wider than 64 bits (which are too big to be practical). With only two 32-bit words you'll still be limited to some offset around the base address instead of the full 64-bit address space. But that shouldn't be a problem, because almost no single program can utilize the vast 64-bit address space. That's why x86-64 has no jump or call instructions that take a 64-bit immediate address: a 32-bit offset is already enough. So you can do the same: use a small immediate offset for most situations, and use a 2-register pair when you need the full 64-bit address.
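
For the register-pair case, the question's assumption of a 32-bit ALU with a carry flag is enough to do 64-bit address arithmetic in two steps. Here is a small Python model of that add / add-with-carry sequence; `add32` and `add_offset_to_pair` are hypothetical names, and the offset is treated as an unsigned 32-bit value.

```python
MASK32 = 0xFFFFFFFF

def add32(a, b, carry_in=0):
    """Model of a 32-bit ALU add: returns (32-bit result, carry-out flag)."""
    total = (a & MASK32) + (b & MASK32) + carry_in
    return total & MASK32, total >> 32

def add_offset_to_pair(hi, lo, offset):
    """Add an unsigned 32-bit offset to a 64-bit address held in the pair hi:lo."""
    new_lo, carry = add32(lo, offset)    # plain add sets the carry flag
    new_hi, _ = add32(hi, 0, carry)      # add-with-carry propagates it to the high word
    return new_hi, new_lo

hi, lo = add_offset_to_pair(0x00000001, 0xFFFFFFF0, 0x20)
print(hex(hi), hex(lo))
```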

As Peter said, addresses wider than the word size are primarily seen only in 8-bit microcontrollers. Besides AVR, this is also used in 8-bit PIC, where the program counter is 13 or 14 bits long. Instructions generally contain only the low bits of the address; the high bits are taken from the PC or the PCLATH register. If you don't want to use an offset like above, then replacing the low bits directly like this is an alternative. Obviously you still need a separate register for the high bits. But if you don't care about orthogonality, then just use a dedicated huge register for addressing, like in 8051, 6502 or other older CISC architectures.
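
As a sketch of the "low bits from the instruction, high bits from a register" idea, here is how a jump target could be formed in Python. The layout is an assumption based on mid-range PIC parts (13-bit PC, 11-bit address field in GOTO, PC bits 12:11 taken from PCLATH bits 4:3); other PIC families differ, and `goto_target` is a name made up for this example.

```python
def goto_target(pclath, k11):
    """Form a 13-bit program counter from PCLATH<4:3> and an 11-bit instruction field."""
    high_bits = (pclath >> 3) & 0x3      # PCLATH<4:3> supplies PC<12:11>
    return (high_bits << 11) | (k11 & 0x7FF)

print(hex(goto_target(0b01000, 0x005)))
print(hex(goto_target(0b11000, 0x7FF)))
```

The program has to write PCLATH before jumping to a different 2K-word page, which is the price of keeping every instruction one word long.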

There are many other ways to support a wider address range than the register size, as I described in "How can 8-bit processor support more than 256 bytes of RAM?". One of them is to limit virtual addresses to the register size (like ARM LPAE or x86 PAE), while allowing physical addresses to be 64 bits wide. The pages will be mapped in the TLB, and you don't need to use 2 registers to form an address. If you want to access more than 4GB of memory in this mode, just use some API similar to Windows AWE, or use multiple processes (like Adobe Premiere CS4 did).

phuclv
  • I don't think ForwardCom makes any claim of being RISC. The idea is an ISA that can decode and pipeline easily while still maintaining the code density advantages of CISC (in bytes and in work per instruction). The first bullet point of highlights is that it's neither RISC nor CISC. So it definitely doesn't belong as your first entry in a list of RISC ISAs! But anyway, all of those compact variable-length encodings exist mostly because the market for RISCs shifted to embedded / microcontrollers, where code size matters and short narrow pipelines make variable-length decoding cheap. – Peter Cordes Feb 14 '20 at 14:02
  • So variable-length encodings for RISCs are basically a non-RISC feature that real-world originally-pure RISC CPUs added because RISC purity doesn't sell chips directly. (ARM32 is a good example of other non-RISC features, like a microcoded push/pop with a bitmap of registers.) Note that AArch64 does not (AFAIK) have a variable length mode; you can only get that if you run Thumb code in 32-bit mode (on chips that support 32-bit mode at all; some of Apple's dropped it IIRC.) So for relatively high performance, fixed-length was still the choice in ~2010 after years of Thumb experience. – Peter Cordes Feb 14 '20 at 14:04