What is the origin of different styles of assembly language mnemonics?

Question

As exemplified in answers to this question (I hope that closed questions that are linked to don't get purged), the instruction mnemonics in early assembly languages had a 1-to-1 correspondence to the machine instructions they denoted.

For example, if the memory access instructions have different opcodes depending on the direction of the data transfer, there will be assembly instructions with an abbreviation of "load" and of "store" in their names, like LDA for "load accumulator" and STA for "store accumulator".

On the other hand, one of the x86 assembly languages could use the same mnemonic MOV for the whole group of data transfer instructions which encompass several different bit patterns in their opcodes, be it a load or a store. The actual meaning of the instruction is then represented by the order and the type of the operands. This can be thought of as a step up in the level of the language, sparing the programmer from remembering the gory details of the instruction set architecture.

Which architecture/platform was the first to have this kind of a "smart" assembly language?

Also, at the bottom of this answer there seems to be an example of an assembly language program for the IAS machine with formulas serving as mnemonics, e. g. S(x)->R for loading from memory. It appears that this style was used in the IAS emulator, but it is unclear if there was an assembler program on the actual machine that understood these mnemonics (doubtful, as in the late 1940s or early 1950s it would be too wasteful to use a 7-character mnemonic where a 2-character one would do).

Were there actual assembly languages that used formula-style instruction mnemonics rather than abbreviations of words, e. g. something like MTR (Memory To register R) instead of what would be LDR as a "conventional" mnemonic, and RTM (register R To Memory) instead of what would be STR, etc.?

The x86 instruction set doesn't just have one MOV mnemonic. For simply copying one value from one location to another there are a number of "load"/"save" instructions: FLD, FLDCW, FST, FSTCW, LAHF, LAR, LDDQU, LDMXCSR, LDS, LES, LFS, LGS, LSS, LGDT, LIDT, LLDT, LMSW, LODS, LSL, LTR, SAHF, SGDT, SIDT, SLDT, SMSW, STOS, STMXCSR, STR. And a number of additional "move" mnemonics: MOVAPD, MOVAPS, MOVD, MOVQ, MOVDQA, MOVDQU, MOVDQ2Q, MOVHLPS, MOVHPD, MOVLHPS, MOVLPD, MOVLPS, MOVQ2DQ, MOVS, MOVSD, MOVSS, MOVUPD, MOVUPS. — , Jun 26 '17 at 22:42
And a few "read"/"write" mnemonics for good measure: RDFSBASE, RDGSBASE, RDMSR, RDPID, RDPKRU, RDPMC, RDTSC, RDTSCP, WRFSBASE, WRGSBASE, WRMSR, WRPKRU. — , Jun 26 '17 at 22:44
@RossRidge I never implied that the MOV instruction could denote all possible data transfers. There is a link in the question showing which specific group of instructions I'm talking about. — Leo B., Jun 26 '17 at 22:47

tofro · Answer 1 · 2017-06-26T22:19:04.673

15

Most probably it was the Z80 that actually tried to (and succeeded in) orthogonalizing an already existing assembler language, that of the Intel 8080.

8080 assembly language used mnemonics that were different all over the place:

MOV, STA and LDA for register moves, Accumulator Load and Store
LXI for loading immediate data into register pairs
MVI for moving immediate data into registers

On the Z80, this was all just flattened to LD (Load) plus appropriate arguments, while using the very same opcodes. Similar changes were done to other mnemonics.

Other quirks were

ADD and SUB looked different whether the operands were immediate or registers
and quite some more quirky and hard to remember mnemonics...

which were also "fixed" and straightened in Z80 assembly.

So LXI B,0FFH, MOV A,B, STAX , MVI A,0ffh were all different ways on the 8080 to move data into or from registers or register pairs.

Z80 assembly translated all of these to a simple LD <destination>,<source>

So for the very same opcode, compatible with the 8080, the Z80 used different (and much more systematically built) assembler mnemonics that could actually fulfil the promise of their name and be memorizable. This most probably made the assembler quite a bit more complicated (after all, it had to find out what the operand types were in order to find the proper opcode, while the 8080 had the operand types embedded in the mnemonics), but made life way easier for the programmer.
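
To illustrate the extra work the assembler takes on, here is a minimal Python sketch (not how any real Z80 assembler is written) that picks the 8080-compatible opcode for a Z80-style LD purely from the operand types. The function names (assemble_ld, parse_number, word) are made up for this example, and only a handful of operand combinations are covered.

```python
# Sketch only: choose an 8080-compatible opcode for Z80-style "LD dst,src"
# by looking at the operand types. All helper names are hypothetical.

REG8 = {"B": 0, "C": 1, "D": 2, "E": 3, "H": 4, "L": 5, "(HL)": 6, "A": 7}
REG16 = {"BC": 0, "DE": 1, "HL": 2, "SP": 3}


def parse_number(text):
    """Parse a literal such as 0FFH (hex) or 255 (decimal); None if not a number."""
    text = text.strip().upper()
    try:
        if text.endswith("H"):
            return int(text[:-1], 16)
        return int(text, 10)
    except ValueError:
        return None


def word(value):
    """Return a 16-bit value as little-endian operand bytes."""
    return [value & 0xFF, (value >> 8) & 0xFF]


def assemble_ld(dst, src):
    """Return the machine code bytes for LD dst,src (small subset only)."""
    dst, src = dst.strip().upper(), src.strip().upper()

    # register <- register (8080: MOV d,s), encoded as 01 ddd sss
    if dst in REG8 and src in REG8:
        if dst == "(HL)" and src == "(HL)":
            raise ValueError("LD (HL),(HL) does not exist; that encoding is HALT")
        return bytes([0x40 | (REG8[dst] << 3) | REG8[src]])

    # accumulator <-> memory addressed by BC or DE (8080: LDAX / STAX)
    if dst == "A" and src in ("(BC)", "(DE)"):
        return bytes([0x0A if src == "(BC)" else 0x1A])
    if src == "A" and dst in ("(BC)", "(DE)"):
        return bytes([0x02 if dst == "(BC)" else 0x12])

    # accumulator <-> absolute address (8080: LDA / STA)
    if dst == "A" and src.startswith("(") and src.endswith(")"):
        addr = parse_number(src[1:-1])
        if addr is not None:
            return bytes([0x3A] + word(addr))
    if src == "A" and dst.startswith("(") and dst.endswith(")"):
        addr = parse_number(dst[1:-1])
        if addr is not None:
            return bytes([0x32] + word(addr))

    # immediate loads (8080: MVI r,n and LXI rp,nn)
    value = parse_number(src)
    if value is not None and dst in REG8:
        return bytes([0x06 | (REG8[dst] << 3), value & 0xFF])
    if value is not None and dst in REG16:
        return bytes([0x01 | (REG16[dst] << 4)] + word(value))

    raise ValueError(f"unsupported operand combination: LD {dst},{src}")


if __name__ == "__main__":
    for dst, src in [("A", "B"), ("A", "0FFH"), ("BC", "0FFH"), ("(BC)", "A")]:
        print(f"LD {dst},{src}:", assemble_ld(dst, src).hex())
```

With this, LD A,B, LD A,0FFH, LD BC,0FFH and LD (BC),A all go through one entry point, where the 8080 assembler relied on the mnemonic itself (MOV, MVI, LXI, STAX) to say which encoding was meant.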

Beyond that, the Z80 did have quite some operations the 8080 didn't have, like relative jumps and quite a number of very useful block transfer commands (but that was not the question)

edited Jun 26 '17 at 22:19

answered Jun 26 '17 at 21:57

tofro


2

Thanks! It is likely that the PDP-11 instruction set was the source of the idea that all transfers can be denoted in the assembly language as MOV or an equivalent; so the timeframe for the birth of "smart" assembly languages shrinks to just a few years (1970-1976). – Leo B. Jun 26 '17 at 22:08
1

Yes, the PDP-11 instruction set was extremely orthogonal as well - But it was a CPU designed from scratch that didn't have to be compatible with anything - So even the opcodes were designed in an orthogonal and systematic way. I deliberately chose the Z80 over it, because I think its assembly language shows some very much improved design thinking over the 8080 - It does the same thing, is completely upward-compatible and uses the same opcodes as the 8080, but introduced a much more clear and straightforward assembly language. Was some kind of an embarrassment for Intel engineers, probably. – tofro Jun 26 '17 at 22:15
2

In PDP-11 there was only one opcode for all transfer instructions, so there was still a 1-to-1 correspondence of opcodes and instruction mnemonics. My question is about assembly languages that present a more orthogonal view of the instruction set than it actually is. So the Z80 assembly is an answer, but the PDP-11 assembly isn't. – Leo B. Jun 26 '17 at 22:21
4

If memory serves, the Z80 mnemonics differ from the 8080 because in those days it was clear that mnemonics were protected intellectual property, but everything else was up for grabs. So the Z80 could safely be a clone and extension of the 8080, from the same designers, but copying the assembly language would have been actionable. Necessity is the mother of invention. – Tommy Jun 27 '17 at 03:56
1

@LeoB. The opcodes for the PDP 11 included the address modes of each operand and the source and destination registers. There were literally hundreds of opcodes that meant "move something from one place to another". – JeremyP Jun 27 '17 at 09:44
@Tommy Zilog weren't afraid to copy the CPU architecture, the register structure and the opcodes from Intel. I severely doubt they would have been afraid to copy a bunch of three-letter-acronym mnemonics as well. I am pretty sure they were aiming to improve the assembly language. And history proved them right - while the 8080 is long obsolete, Z80 derivatives are still around. – tofro Jun 27 '17 at 10:32
3

@tofro there are quite a few people who have heard it my way around, although I wouldn't cite any of these as proof, e.g. https://www7.dict.cc/wp_examples.php?lp_id=1&lang=en&s=legal%20reasons or https://news.ycombinator.com/item?id=14520019 ; the story is also pretty convincing from a legal standpoint; written text definitely has copyright. In 1976 it won't have been clear whether independently-produced photolithographic masks could incur any liability through their functional similarity to other plates when printed onto silicon. – Tommy Jun 27 '17 at 13:30
@JeremyP In computing, an opcode (abbreviated from operation code, also known as instruction syllable, instruction parcel or opstring) is the portion of a machine language instruction that specifies the operation to be performed. There was only one encoding for the PDP-11 MOV opcode: 01xxyy. – Leo B. Jun 27 '17 at 15:34
1

@Tommy English Wikipedia supports your statement by citing an Intel manual that said something along the lines of "All Mnemonics @ Intel". – tofro Jun 28 '17 at 11:29
1

@Tommy: Reading the documentation for enhanced instructions on the NEC-V20 (an 8088 clone) was rather confusing until I realized that the registers it called IX and IY were the same ones Intel called SI and DI. I wonder if the same copyright-related forces were at work with that terminology. – supercat Nov 01 '17 at 16:36
@Main difference being with the V20 the original was straightforward and clear, and the copy was confusing, with Z80 and 8080 it was the other way round. – tofro Dec 01 '17 at 14:57
The relative jumps of the Z80 allowed me to shrink a CP/M bios enough by disassembling and reassembling to give space for additional functionality like interrupt driven serial ports and keyboard buffers, and a clock in the status line without having to decrease the amount of memory available to transient programs. Only took 5 minutes to assemble. – Thorbjørn Ravn Andersen Sep 17 '19 at 09:37

score 7 · Accepted Answer · answered Jun 27 '17 at 05:22

The CDC 6000 series, starting with the CDC 6600 introduced in September 1964 (about 12 years before the Z80), had a somewhat unique assembly format: the "opcode field" consisted of a prefix denoting a function followed by a register name, and the "operand field" consisted of one or two arguments (register names or constants), combined with +, *, - or / like a mathematical formula. There were three sets of registers, A (address), B (increment) and X (data).

Here is an example taken from page 67 (page 74 of the PDF) of Ralph Grishman: Assembly Language Programming for the Control Data 6000 and Cyber Series, which implements the standard Fortran function IDIM (truncated subtraction):

        IDENT   IDIM
        ENTRY   IDIM
IDIM    BSS     1
        SA2     A1+1      X2 = address of second argument
        SA1     X1        X1 = first argument
        SA2     X2        X2 = second argument
        IX6     X1-X2     X6 = ARG1-ARG2
        PL      X6,IDIM   if ARG1-ARG2 positive, return
        SX6     B0        else set result=0
        EQ      IDIM      and return
        END

And different combinations of symbols and registers would get assembled to different opcodes, e.g. (octal opcode in first column):

50   SAi  Aj+K      Set Ai to Aj + K
51   SAi  Bj+K      Set Ai to Bj + K
...
74   SXi  Aj+Bk     Set Xi to Aj + Bk
75   SXi  Aj-Bk     Set Xi to Aj - Bk

A full summary can be found on page 2 of the Reference Manual.
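
As a rough illustration of how the shape of the operand formula, and not just the mnemonic, selects the opcode, here is a small Python sketch. It is not how COMPASS itself worked internally; the function names (operand_shape, lookup) are invented for this example, and the table holds only the four rows quoted above, so anything else is rejected.

```python
import re

# Only the four rows quoted above; the real instruction table is larger.
# Keys are (opcode-field prefix, operand shape); values are octal opcodes.
OPCODES = {
    ("SA", "Aj+K"): 0o50,
    ("SA", "Bj+K"): 0o51,
    ("SX", "Aj+Bk"): 0o74,
    ("SX", "Aj-Bk"): 0o75,
}


def operand_shape(operand):
    """Reduce e.g. 'A1+1' to 'Aj+K' and 'A1-B2' to 'Aj-Bk'."""
    m = re.fullmatch(r"([ABX])[0-7]([+-])(.+)", operand)
    if m is None:
        raise ValueError(f"operand form not covered by this sketch: {operand}")
    left, sign, right = m.groups()
    if re.fullmatch(r"[ABX][0-7]", right):
        # second term is a register
        return f"{left}j{sign}{right[0]}k"
    # anything else is treated as a constant K (an assumption of this sketch)
    return f"{left}j{sign}K"


def lookup(opcode_field, operand):
    """Return the opcode for one instruction, e.g. lookup('SA2', 'A1+1')."""
    m = re.fullmatch(r"(S[ABX])[0-7]", opcode_field)
    if m is None:
        raise ValueError(f"opcode field not covered by this sketch: {opcode_field}")
    key = (m.group(1), operand_shape(operand))
    if key not in OPCODES:
        raise ValueError(f"no table entry for {opcode_field} {operand}")
    return OPCODES[key]


if __name__ == "__main__":
    # First SA2 line of the IDIM example above
    print(oct(lookup("SA2", "A1+1")))
```

The same SXi prefix ends up as opcode 74 or 75 depending only on whether the operand field reads Aj+Bk or Aj-Bk, which is the "formula as mnemonic" idea from the question.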

A complete assembly environment called COMPASS with macros and pseudo-instructions was available.
