
Applesoft BASIC programs start at location $0801 in memory (usually). If you put a nonzero value at address $0800, though, you get an error when you try to run the program:

?SYNTAX ERROR IN 65124

Why does this happen?

hippietrail
fadden

1 Answer


The Applesoft RUN command ($d912) begins by calling SETPTRS ($d665), which calls STXTPT ($d697) to initialize TXTPTR ($b8-b9) to the value in TXTTAB ($67-68) minus one. In simple terms, parsing of the program actually starts at $0800 when the program is loaded at $0801.

When the RUN command finishes, Applesoft falls back into its command execution loop NEWSTT ($d7d2), which had just finished calling EXECUTE_STATEMENT from $d820. When it jumps back to the top of the loop, it pulls the next byte from memory and evaluates it.

Normally, at the start of the program, it will read a zero, which causes it to behave as if it had reached the end of a line, and it will start processing the line that follows. If it doesn't see a zero, it acts as if it's in the middle of a line and looks for a colon (':'). If it doesn't see that either, it reports a syntax error (jump from SYNERR_1 at $d846), because statements must be separated by a colon or a line break.
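The three cases above can be modeled with a minimal Python sketch. This is not the ROM logic, only an illustration of the decision made on the byte at TXTTAB minus one; the function name `first_byte_outcome` and the result strings are made up for this example.

```python
def first_byte_outcome(memory, txttab=0x0801):
    """Model the decision made on the first byte read after RUN.

    Hypothetical helper: the real interpreter works through TXTPTR in
    zero page; this only mirrors the three cases described above.
    """
    txtptr = txttab - 1              # TXTPTR starts at TXTTAB minus one
    byte = memory[txtptr]
    if byte == 0x00:
        return "end of line: start the next line"
    if byte == ord(":"):
        return "statement separator: parse a statement from the next byte"
    return "syntax error"


memory = bytearray(0x10000)          # 64 KiB, all zero
print(first_byte_outcome(memory))    # $0800 holds $00

memory[0x0800] = 0x01                # any nonzero, non-colon value
print(first_byte_outcome(memory))
```

With `txttab=0x0801` the byte inspected is the one at $0800, which is why a stray nonzero value there is enough to trigger the error.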

Setting $0800 to $3a (':') doesn't generally work, because Applesoft will think it's mid-line, but the next things it finds in memory at $0801 are a 16-bit next-line address followed by a 16-bit line number. These are unlikely to form a valid Applesoft statement.
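To make the layout concrete, here is a small Python sketch of the bytes a colon at $0800 would be followed by. It assumes the two 16-bit fields are stored low byte first, as is usual on the 6502; `encode_line` is a hypothetical helper, and the one-byte body is a placeholder rather than real tokenized Applesoft.

```python
import struct


def encode_line(addr, line_number, body):
    """Lay out one program line: 16-bit next-line address, 16-bit line
    number, statement bytes, then a $00 terminator (little-endian assumed)."""
    next_addr = addr + 4 + len(body) + 1     # header + body + terminator
    return struct.pack("<HH", next_addr, line_number) + body + b"\x00"


# Hypothetical one-line program (line 10) stored at $0801.
line = encode_line(0x0801, 10, b"\xba")
header = line[:4]
print(" ".join(f"${b:02x}" for b in header))  # $07 $08 $0a $00
```

After a ':' at $0800, the interpreter would try to parse $07 $08 $0a $00 as statement text instead of skipping over it as a line header.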

The syntax error message uses the contents of CURLIN ($75-76), which was partially initialized: CURLIN+1 ($76) is set to $ff in "command" mode, and RUN decrements it to $fe to indicate that we're in "run" mode. The line number reported will thus be somewhere in the range $fe00-feff (65024-65279).
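The range is plain arithmetic on a 16-bit value whose high byte is fixed at $fe. A quick Python check follows; the function name is invented for this example.

```python
def reported_line_range(curlin_hi=0xFE):
    """Smallest and largest 16-bit values with the given high byte (CURLIN+1)."""
    low = (curlin_hi << 8) | 0x00
    high = (curlin_hi << 8) | 0xFF
    return low, high


lo, hi = reported_line_range()
print(lo, hi)        # 65024 65279
print(hex(65124))    # 0xfe64
```

The 65124 in the question is $fe64, so it falls inside this range, with whatever happened to be in the low byte of CURLIN ($64 here).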

See also the Applesoft disassembly.

Sep Roland
fadden
  • I wonder why TXTTAB is set to point to the byte following the zero byte, rather than pointing at the zero byte itself? Does that allow other parts of the code to be simplified? – supercat Jun 21 '21 at 18:00
  • @supercat TXTTAB points to the start of the program code, which is technically $0801, since the $00 is an end-of-line indicator, not a start-of-line indicator. This situation is an artifact of keeping the interpreter design simple: when it finishes with the RUN command it wants to find a statement separator, and relies on the coldstart code to have put one there (see $f1b0). No need to waste space for that byte on tape/disk since it's supposed to have been taken care of already. (DOS/ProDOS LOAD really ought to zero it out though.) – fadden Jun 21 '21 at 18:56
  • Saved BASIC programs have quite a bit of redundant information, such as line links (which, on the 6502, could have been reduced to one byte if they'd specified the length of input lines). Actually, if program lines and strings had been stored in reverse order, that would have made a lot of things more efficient on the 6502, since code to iterate through a string could simply output (stringPtr),y and decrement y until it hits zero, rather than having to check each index against the string length. – supercat Jun 21 '21 at 19:25
  • @supercat Regarding re-use of 8080 and 6800 techniques in 6502 MS-BASIC, see this question and answer(s), which I created in part to give a place for further comments about this. (I keep seeing comments like yours scattered throughout RCSE.) Many of the design decisions that were poor on the 6502 worked very well on the 8080 and 6800, which had 16-bit index registers. – cjs Jun 06 '22 at 07:21
  • The line links work especially well on the 6800; consider searching for a line number you have in the A and B registers, with X pointing to the start of a line. This is a very simple and efficient sequence along the lines of nextline: LDX ,X / BEQ notfound / search: CMP A,2,X / BNE nextline / CMP B,3,X / BNE nextline / .... (I don't know if this is exactly how they did it, but you get the idea.) – cjs Jun 06 '22 at 07:25
  • @cjs: Using 8-bit relative line links, I think the code would be: LDA line_msb/ bra testH / lp1: ldb ,x / beq oops / abx / testH: cmpa 2,X / bne lp1 / lda line_lsb / bra testL/ lp2: ldb ,x / beq oops / abx / testL: cmp 1,x / bne lp2 which I think would work out pretty well. – supercat Jun 06 '22 at 15:16
  • @supercat It might work ok on a processor with an abx instruction. That's not the 6800. :-) – cjs Jun 10 '22 at 05:31
  • @cjs: Hmm... I thought I'd looked that up. Though maybe I accidentally looked at some other 68xx derivative. Yeah, on the 6800 storing line links that way would be pretty much obligatory, but since neither the 6502 nor the 8080 has a particularly convenient way of retrieving a pair of consecutive bytes from a pointer, storing line lengths would have been faster to work with as well as being more compact [on the 8080, optimal would probably be for lines to store line number MSB plus one, length, and line number LSB in that order]. A line-number-MSB-plus-one of 255 would... – supercat Jun 10 '22 at 15:26
  • ...naturally trigger the end of the loop because it would be greater than any line number that could be searched for. I think if D is kept at zero and A holds the line number MSB, the find-MSB loop would be (z80 mnemonics for 8080 instructions) ld e,(hl)/inc e/dec hl/cmp (hl)/add hl,de/jp c,loop. Then code would need to see if (hl) is one greater than A, exit if not, and otherwise search for the LSB with a similar loop, and then recheck the MSB once it was done. I think that's faster than any approach using two-byte line links. – supercat Jun 10 '22 at 15:33
  • @supercat The 8080 is not so bad. It doesn't have the super-convenient LDX ,X instruction of the 6800, but emulating that is only a few instructions long: MOV E,M; INX H; MOV D,M; XCHG. – cjs Jun 10 '22 at 15:51
  • @cjs: That approach isn't terrible, but it makes the program bigger and requires that line links be regenerated when a program is edited. The equivalent portion of the line search routine using 8-bit relative links would be the instructions ld e,(hl) / inc (or dec) E / add hl,de. – supercat Jun 10 '22 at 16:09
  • @cjs: BTW, speaking of the LDX ,X concept, I wonder why the 6502 designers didn't use the same effective calculation logic for LDX, LDY, STX, and STY, as they used for LDA, STA, ADC, SBC, etc. While STX abs,X might not have been terribly useful, there are many cases where LDX abs,X would be nice, and I would think having the bit pattern for the abs,X addressing mode encode that mode even for the LDX instruction, and likewise abs,Y, would have been easier than making the abs,X bit pattern behave as ABS,Y and then not having abs,X. – supercat Jun 10 '22 at 20:22