Edit: Any addition or subtraction can be done with a 258-byte look-up table, using only mov, cmp and jne. There is no need for giant look-up tables: the lower 8 bits and the upper 8 bits are updated separately, using the same look-up table.
Here's the code that sums ax and bx using only one 258-byte look-up table and only mov, cmp and jne:
[bits 64]
; valid instructions: mov, cmp, jmp, je, jne
; used instructions: mov, cmp, jne
section .text
global _start
; this code sums ax & bx
_start:
; define the values to be summed (in ax & bx).
mov ax,12853 ; example summand 1.
mov bx,33276 ; example summand 2.
; summing is easy: decrement one summand (bx) until it becomes zero,
; and for each decrement, increment the other summand (ax), which becomes the sum.
cmp bx,0
jne start_summing ; normally this would be je ready and
; next 2 instructions would be deleted.
cmp bx,1 ; this shows that jne is sufficient.
jne ready ; this conditional jump branches always.
start_summing:
mov ecx,0
summing_loop:
mov cl,bl
mov bl,[rcx+(number_line-1)] ; decrement bl.
cmp bl,255
jne bl_not_carry
mov cl,bh
mov bh,[rcx+(number_line-1)] ; decrement bh.
bl_not_carry:
mov cl,al
mov al,[rcx+(number_line+1)] ; increment al.
cmp al,0
jne al_not_carry
mov cl,ah
mov ah,[rcx+(number_line+1)] ; increment ah.
al_not_carry:
cmp bx,0
jne summing_loop
ready:
; sum is now in ax (this code never writes the upper 16 bits of eax).
section .data
max_value equ 255
max_value_plus_1 equ (max_value + 1)
db max_value ; 0 - 1 = 255
number_line:
%assign myValue 0
%rep max_value_plus_1
db myValue
%assign myValue (myValue + 1)
%endrep
db 0
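The byte-wise counting scheme of this listing can be checked with a small model. The following is only a sketch in Python, not part of the assembly itself; the names TABLE, NUMBER_LINE, dec8, inc8 and add16 are made up for illustration, and list indexing stands in for the [rcx+...] memory operands.

```python
# Model of the 258-byte table: one guard byte (255), the values 0..255,
# and one guard byte (0), in the same order as the .data section.
TABLE = bytes([255]) + bytes(range(256)) + bytes([0])
NUMBER_LINE = 1  # index of the label number_line inside TABLE (assumed name)


def dec8(value):
    """Decrement modulo 256, like mov bl,[rcx+(number_line-1)]."""
    return TABLE[NUMBER_LINE - 1 + value]


def inc8(value):
    """Increment modulo 256, like mov al,[rcx+(number_line+1)]."""
    return TABLE[NUMBER_LINE + 1 + value]


def add16(ax, bx):
    """Add two 16-bit values by counting bx down and ax up, byte by byte."""
    al, ah = ax & 0xFF, ax >> 8  # splitting into bytes is only model setup
    bl, bh = bx & 0xFF, bx >> 8
    while (bh, bl) != (0, 0):    # cmp bx,0 / jne summing_loop
        bl = dec8(bl)
        if bl == 255:            # low byte wrapped: borrow from bh
            bh = dec8(bh)
        al = inc8(al)
        if al == 0:              # low byte wrapped: carry into ah
            ah = inc8(ah)
    return (ah << 8) | al
```

With the example summands from the listing, add16(12853, 33276) gives 46129, and a sum that overflows 16 bits wraps around just as add ax,bx would.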
Edit: The rest of the answer deals with other solutions that need more memory.
Edit: A one-dimensional 128 KiB look-up table is sufficient for any addition or subtraction of 16-bit operands. No need for giant look-up tables.
Edit: Fixed a bug that caused additions that would normally set the carry flag to produce an incorrect result.
Here's the code in x86-64 assembly; it assembles with YASM and probably with NASM too. It implements add ax,bx using only mov, cmp & jne.
[bits 64]
; valid commands: mov, cmp, jmp, je, jne
; used commands: mov, cmp, jne
section .text
global _start
; this code sums ax & bx
_start:
; define the values to be summed (in ax & bx).
mov ax,12853 ; example summand 1.
mov bx,33276 ; example summand 2.
; summing is easy: decrement one summand (ecx) until it becomes zero,
; and for each decrement, increment the other summand (eax), which becomes the sum.
mov edx,0
mov dx,ax
mov eax,edx ; eax = ax
mov ecx,0
mov cx,bx ; ecx = bx
summing_loop:
mov cx,[2*rcx+(number_line-2)] ; decrement ecx.
mov ax,[2*rax+(number_line+2)] ; increment eax.
cmp ecx,0
jne summing_loop
; sum is now in eax.
section .data
max_value equ 65535
dw max_value ; 0 - 1 = 65535
number_line:
%assign myValue 0
%rep (max_value + 1) ; emit all 65536 values, 0..65535
dw myValue
%assign myValue (myValue + 1)
%endrep
dw 0
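The same idea with word-sized entries can be modelled as follows. This is a sketch in Python under the assumption that the table holds every value from 0 to 65535 between the two guard words (65538 words, about 128 KiB); TABLE, NUMBER_LINE and add16 are illustrative names, and the indices count words, whereas the assembly scales them by 2 to get byte offsets.

```python
from array import array

# Guard word 65535, then the values 0..65535, then guard word 0.
TABLE = array("H", [65535] + list(range(65536)) + [0])
NUMBER_LINE = 1  # word index of the label number_line (assumed name)


def add16(ax, bx):
    """Model of summing_loop: count ecx down and eax up until ecx is 0."""
    eax, ecx = ax, bx
    while True:
        ecx = TABLE[NUMBER_LINE - 1 + ecx]  # mov cx,[2*rcx+(number_line-2)]
        eax = TABLE[NUMBER_LINE + 1 + eax]  # mov ax,[2*rax+(number_line+2)]
        if ecx == 0:                        # cmp ecx,0 / jne summing_loop
            break
    return eax
```

Because the loop decrements before it compares, bx = 0 makes it run 65536 times; eax then wraps all the way around and ends at its starting value, so the result is still correct.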
Edit: The rest of the answer deals with a more limited solution I first came up with.
It can be done with a two-dimensional look-up table.
For 8-bit registers, for example al & bl, it's easy. For 16-bit registers it can be done, but the look-up table will be huge (almost 1 tebibyte); see below for why. Each cell of the look-up table contains the sum of the corresponding X & Y coordinates (the X & Y coordinates are the summands).
For 8-bit sum the look-up table (a 256 * 256 matrix) is like this:
db 0, 1, 2, ... , 253, 254, 255
db 1, 2, 3, ... , 254, 255, 0
db 2, 3, 4, ... , 255, 0, 1
. . . . . . .
. . . . . . .
. . . . . . .
db 253, 254, 255, ... , 250, 251, 252
db 254, 255, 0, ... , 251, 252, 253
db 255, 0, 1, ... , 252, 253, 254
In x86 and x86-64 mov can be used for multiplication by 256^n, that is: 256, 65536, 16777216, ...
Multiplying by 256 with mov is easy. To compute ax = 256 * bl:
mov ax,0
mov ah,bl
To add e.g. al & bl, we need to get the right offset. It's 256 * al + bl, or 256 * bl + al (because the look-up table is a symmetric matrix, and it's symmetric because addition is a commutative operation).
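The offset computation and the symmetry claim can be illustrated with a short model. This is a sketch in Python; SUM_TABLE and add8 are made-up names, and the table is flattened row by row so that cell (y, x) sits at offset 256 * y + x.

```python
# 256 x 256 table flattened row by row: cell (y, x) holds (x + y) mod 256.
SUM_TABLE = bytes((x + y) % 256 for y in range(256) for x in range(256))


def add8(al, bl):
    """Fetch al + bl (mod 256) from the table at offset 256 * bl + al."""
    offset = 256 * bl + al  # the value that ends up in ecx: cl = al, ch = bl
    return SUM_TABLE[offset]


def is_symmetric():
    """Check that swapping the two summands never changes the fetched sum."""
    return all(
        SUM_TABLE[256 * a + b] == SUM_TABLE[256 * b + a]
        for a in range(256)
        for b in range(256)
    )
```

Here add8(47, 55) returns 102, and is_symmetric() confirms that either summand may serve as the row index.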
Multiplying by 65536 and bigger numbers using only mov in x86/x86-64 requires using memory, because there is no way to directly address the upper 16 bits of a 32-bit general register (such as eax) or the upper 32 bits of a 64-bit general register (such as rax).
To compute eax = 65536 * bx using only mov:
mov [temp], dword 0
mov [temp+2], bx
mov eax, [temp]
...
temp dd 0
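The trick of multiplying by a power of 256 by storing a value at a byte offset can be modelled like this. It is a sketch in Python using the struct module on a little-endian buffer; times_256_pow and the 8-byte temp buffer are illustrative assumptions, not part of the assembly above.

```python
import struct


def times_256_pow(value16, n):
    """Multiply a 16-bit value by 256**n (n = 0..6) using only stores and loads."""
    temp = bytearray(8)                       # zeroed scratch memory, like temp
    struct.pack_into("<H", temp, n, value16)  # like mov [temp+n], bx
    return struct.unpack_from("<Q", temp)[0]  # like mov rax, [temp]
```

For n = 2 this reproduces eax = 65536 * bx from the snippet above, and n = 3 gives the multiplication by 16777216.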
But the real problem with 16-bit values is that in x86/x86-64 memory is addressed using a byte offset, not a word/dword/qword offset, and we can only multiply by 256^n. But let's first see what the look-up table could look like if we didn't have this problem with multiplication and byte-offset addressing. The look-up table could then be like this:
dw 0, 1, 2, ... , 65533, 65534, 65535
dw 1, 2, 3, ... , 65534, 65535, 0
dw 2, 3, 4, ... , 65535, 0, 1
. . . . . . .
. . . . . . .
. . . . . . .
dw 65533, 65534, 65535, ... , 65530, 65531, 65532
dw 65534, 65535, 0, ... , 65531, 65532, 65533
dw 65535, 0, 1, ... , 65532, 65533, 65534
Here, each row has 65536 cells and each cell is a word (2 bytes), so each row takes 2 * 65536 bytes = 131072 bytes. There are 65536 rows, so it's a 65536 * 65536 matrix.
Word-sized cells are not a problem for X (the horizontal index, either of the summands), because x86 assembly allows scale factors of 1, 2, 4 and 8.
Edit: Corrected the text on array size; it's actually a little bit smaller than 1 TiB.
The problem here is that multiplying Y (the vertical index, the other summand) by 131072 is not possible using only mov. So each row of the look-up table must be repeated 128 times, or more precisely, there must be 127 unused filler rows between any two actual data rows. Why 127? Because mov can only be used to multiply by 256, 65536, 16777216, ..., so we need to multiply Y (the vertical index, the other summand) by 16777216. As each row takes 131072 bytes and 16777216 / 131072 = 128, to make new data rows start every 16777216 bytes, there must be 127 unused filler rows (each takes 131072 bytes) after every data row. After the last data row filler rows are not needed, so in total the array size would be:
65535 * 16777216 + 131072 = 1099494981632 bytes = about 1.0995 * 10^12 bytes = almost 1 tebibyte (1 TiB).
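The size arithmetic can be double-checked with a few lines. This is just a sketch in Python; the constant names are made up for illustration.

```python
ROW_BYTES = 2 * 65536   # one row of 65536 word-sized cells = 131072 bytes
ROW_STRIDE = 256 ** 3   # data rows start every 16777216 bytes
DATA_ROWS = 65536       # one data row per value of Y

# Rows that fit into one stride, minus the data row itself.
filler_rows_per_data_row = ROW_STRIDE // ROW_BYTES - 1

# Every data row except the last is followed by filler up to the next stride.
total_bytes = (DATA_ROWS - 1) * ROW_STRIDE + ROW_BYTES

fraction_of_tib = total_bytes / 2 ** 40
```

The stride holds 128 rows, so 127 of them are filler, and the total comes out slightly below 2^40 bytes.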
Unfortunately I don't have that much memory in my computer, but it's possible in x86-64.
Here's the code for the 8-bit addition using only mov and a 256 * 256 look-up table (tested with YASM, should assemble with NASM too):
[bits 64]
; valid instructions: mov, cmp, jmp, je, jne
; used instructions: mov
section .text
global _start
; al & bl must be preserved
; this code sums al & bl
_start:
; define the values to be summed (in al & bl).
mov al,47 ; example first summand
mov bl,55 ; example second summand
; the summing code starts here.
mov ecx,0
mov cl,al ; ecx = al
mov ch,bl ; ecx = 256 * bl + al
mov al,[rcx+sum_look_up_table] ; fetch the sum from look-up table.
; for 32-bit code, rcx -> ecx
; the sum is now in al.
section .data
y_times equ 256
x_times equ 256
sum_look_up_table:
%assign myY 0
%rep y_times
%assign myX 0
%rep x_times
%assign myValue (myX + myY)
%if myValue >= 256
%assign myValue (myValue - 256) ; myX + myY is at most 510, so one subtraction wraps it.
%endif
db myValue
%assign myX (myX + 1)
%endrep
%assign myY (myY + 1)
%endrep