30

According to the German C64 Wiki article about the POS() command, POS(π) is 20 % faster.
In my experience, though, it is about 28 % faster. Is there a specific reason for this immense speed boost?

  • 9
    Probably because the constant 0 needs to be converted into floating-point, while Pi is already stored in ROM in floating-point form. –  Sep 27 '17 at 15:13
  • 7
    How does it compare to POS(0.0)? – wizzwizz4 Sep 27 '17 at 15:51
  • It is even 17 % slower than POS(0). POS(.0) is the same speed and POS(.1) is 65 % slower –  Sep 27 '17 at 15:55
  • 2
    The fastest way to represent 0 in MS Basic V2 is with a "."; I hadn't thought of trying pos(π), but pos(.) should be faster than pos(0). – supercat Sep 27 '17 at 16:50
  • 17
    If it helps others with context: POS returns the cursor column. It takes an argument but doesn't use it — it's a dummy. – Tommy Sep 27 '17 at 17:28
  • 1
    @Tommy: Further, even though the value of the argument is ignored, the interpreter still goes through the trouble of calculating its value as a floating-point number. – supercat Sep 27 '17 at 17:54
  • @supercat why is .1 a lot slower than 0 then? –  Sep 27 '17 at 18:01
  • 1
    Why is the dummy argument needed? I guess to keep the stack from unwinding, but is that a bug in the BASIC interpreter? – snips-n-snails Sep 27 '17 at 18:12
  • 2
    @traal It's an implementation detail, also known as a feature. – wizzwizz4 Sep 27 '17 at 18:20
  • 2
    I think each digit to the left of the decimal point entails a multiply by ten, and each digit to the right requires a divide by ten. A bare decimal point doesn't require either. And π simply instructs the compiler to load a hard-coded constant with no further logic required. – supercat Sep 27 '17 at 18:23
  • @supercat The division operation is too slow to do it per digit, and it would lose too much precision, so it is done once. Sometimes a table of negative powers of 10 is used to do a multiplication instead of a division. – Leo B. Sep 27 '17 at 18:56

2 Answers

35

That interpreter apparently parses the source text, or at least the numerical literal values, at every execution. π is a single-byte magic token, therefore, as soon as it is recognized, it is immediately substituted with the value of Pi, and nothing else needs to be done.

When a byte that might occur in a number is parsed, the number parsing routine begins. Its execution time will depend on the number of characters in the string representing the number. When parsing a number, each digit in the integer part requires multiplying the number parsed so far by 10 and adding the value of the digit. The fractional part is usually parsed as an integer, then divided by the appropriate power of 10. The division operation is expensive, and it makes sense to check if the dividend is 0 before dividing — that's why .0 is much faster than .1.
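A rough Python model of the digit-by-digit parsing described above may make this concrete. This is a sketch of the idea only, not the actual BASIC ROM routine; the function name and structure are mine:

```python
def parse_number(text):
    """Rough model of digit-by-digit number parsing as described
    above -- not the actual BASIC ROM routine."""
    i = 0
    int_part = 0
    # Integer part: each digit costs a multiply-by-10 and an add.
    while i < len(text) and text[i].isdigit():
        int_part = int_part * 10 + int(text[i])
        i += 1
    frac = 0
    scale = 1
    if i < len(text) and text[i] == '.':
        i += 1
        # Fractional part is parsed as an integer, tracking the
        # matching power of 10 for the final division.
        while i < len(text) and text[i].isdigit():
            frac = frac * 10 + int(text[i])
            scale *= 10
            i += 1
    value = float(int_part)
    # The expensive division is skipped when the fraction is zero --
    # which is why ".0" parses faster than ".1".
    if frac != 0:
        value += frac / scale
    return value

print(parse_number("."))    # no digits at all: no work beyond skipping the period
print(parse_number(".0"))   # digit loop runs, but the division is skipped
print(parse_number(".1"))   # digit loop runs and the division is performed
```

The cost is proportional to the number of characters, plus one division whenever the fractional part is nonzero.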

That's why, as the German Wiki article says, the next-best execution time is for POS(.): the number-parsing routine starts, but there are no digits to parse, so it devolves to skipping the period and keeping the initial (zero) values of the integer and fractional parts. Next comes a variable with a single-character name (a name lookup), and then the number 0 (which causes the computation 0 * 10 + 0). Longer numbers take even longer.

Leo B.
  • 2
    I'm pretty sure there was an HP BASIC that, after each RUN, would tokenise each line the first time it is encountered, replacing the original in memory, then detokenise only when the program ended. It's a shame Microsoft et al didn't learn from that. – Tommy Sep 27 '17 at 18:23
  • @Tommy Keywords could be tokenized and detokenized at will, but numbers are literals that don't have a 1-to-1 tokenized representation. If the source has 000.000, you cannot just replace it with a floating point zero in the tokenized form. – Leo B. Sep 27 '17 at 18:27
  • 3
    you definitely can, and some BASICs do. But they're the sort that tokenise as you type, never keeping a plain-text copy around. – Tommy Sep 27 '17 at 18:29
  • 4
    @Tommy: Lines are tokenized in MS-BASIC, but sequences of digits are kept as written. In some BASIC dialects, if one writes 10 X=1.20 and then types LIST the program would list back as 10 X=1.2, but Commodore BASIC generally tries to list programs in such a way as to match what was typed. – supercat Sep 27 '17 at 18:29
  • @Tommy That is, if you type 000.000 and do LIST, you'll see 0.0? – Leo B. Sep 27 '17 at 18:32
  • @Tommy: Strings actually are a bigger issue--if the tokenizer represented strings as a "string" marker followed by a length and the text, then non-constant strings in memory could be stored with their length preceding them as well, which could in turn greatly expedite garbage collection. – supercat Sep 27 '17 at 18:32
  • @supercat my comment is: it's a shame Microsoft BASIC works in way X when way Y would be more efficient, assuming side effects are acceptable. 1.2 is less communicative than 1.20 but I doubt the common variety of home computer BASIC programmer would value that against their home-made game being a few percent faster. – Tommy Sep 27 '17 at 18:32
  • @Tommy: There are bigger missed opportunities. Optimizing calculations when both values are integers -32768..32767 is perhaps the biggest, though string handling missed some major chances as well. Interestingly, in many cases code may be able to massively improve string performance by doing a few pokes. – supercat Sep 27 '17 at 18:34
  • @LeoB. I'm struggling to find the BASIC I half remember, so I'll have to defer on that. Though I have eliminated my first guess of ZX81/Spectrum BASIC, which stores both a floating point version of literals and the text version you typed in. Who says it doesn't have a tokeniser? (EDIT: source for that claim is http://problemkaputt.de/zxdocs.htm#basicinterpreter under "BASIC Programs and Variables (ZX81/Spectrum)") – Tommy Sep 27 '17 at 18:36
  • 1
    @Tommy The need to store both forms for a good user experience would explain the design decision: Microsoft likely strived (strove?) for memory efficiency rather than speed. Of course, that made sense during Altair days, but progressively less as memory sizes grew. – Leo B. Sep 27 '17 at 18:42
  • @LeoB. right, with a desire for shared structure no doubt explaining how Altair decisions end up affecting Microsoft's 6502 BASIC. And, honestly, if you're going to improve Commodore-Microsoft BASIC specifically, I think you'd start with hardware support and worry about line structure later. – Tommy Sep 27 '17 at 18:45
  • "An interpreter parses the source text at every execution." No, a crap interpreter does that. Decent interpreters build, at the very least, an abstract syntax tree and execute that. Even among BASIC interpreters on 8-bit micros there existed implementations which tokenized the lines of code to avoid scanning text while executing. – Kaz Sep 27 '17 at 22:10
  • 4
    @Kaz Or an interpreter optimized to fit in the available memory, when the choice is between a slow ("crap") interpreter or no interpreter at all. I'll rephrase that statement. – Leo B. Sep 27 '17 at 22:31
  • 9
    Don't forget that Microsoft BASIC was designed to fit in a very small memory footprint because that's all there was. The original PET had 4 or 8K RAM and BASIC had to fit into 8K ROM. The Altair BASIC from which Commodore BASIC was derived had to fit in even less space. You can visualise the entire C64 memory map losslessly with a 256 x 256 GIF image. People today forget how constrained memory was back in the early '80s. – JeremyP Sep 28 '17 at 09:01
  • @LeoB. - note that the ZX81 came with the same amount of memory as the minimum spec for running BASIC on the Altair... – Jules Sep 28 '17 at 12:01
  • Sinclair's BASIC used an 8 KByte ROM (if I remember right). I wondered how a full BASIC interpreter with floating-point routines and essential I/O handling fits into that space. One major building block seems to be a P-code interpreter for the floating-point operations ... not comparable at all with MS BASIC's structure and internal design. – Johann Klasek Oct 01 '17 at 12:43
  • I just want to add that the English C64 Wiki page for POS now provides the same information that was previously available only on the German one. – Johann Klasek Oct 01 '17 at 19:59
  • @JohannKlasek: The Atari 2600 Basic Programming cartridge is 4K ROM. It's a really super wimpy dialect of BASIC, but that 4K includes all the code necessary for a CodeView-style user interface (highlighting the currently-executing code, showing the contents of variables, etc.). If the Atari 2600 had even a little more RAM (e.g. 256 bytes instead of 128), that cartridge could have been pretty neat. – supercat Oct 03 '17 at 21:15
1

C64 BASIC uses floating point as the basic format for everything. So even if a function requires an integer argument, the number is first parsed into a floating-point value and then converted to an integer. When the constant π is given, a pre-stored floating-point value of 3.14159265 is used, so the time to convert the text to floating point is saved entirely. For POS(), the argument is a dummy whose value does not affect the result, which makes POS(π) an easy substitution. But even for other functions such as POKE, giving π as an argument (which is then converted to 3) would be a bit faster than writing a 3.
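To illustrate the difference between the two evaluation paths, here is a toy Python sketch: a single-byte π token is an immediate lookup of a stored constant, while digit characters go through a parse loop. The token value $FF and the constant match Commodore BASIC's, but the evaluator itself is purely illustrative:

```python
# Toy model of why a pi token beats a digit literal: the token is a
# one-byte lookup, while digits must run through the whole parse loop.
PI_TOKEN = 0xFF          # pi is a single-byte token in tokenized CBM BASIC
STORED_PI = 3.14159265   # constant kept ready in ROM in floating-point form

def eval_numeric(tokens):
    """Evaluate a tokenized numeric expression: either the pi token
    or a run of ASCII digit codes."""
    if tokens[0] == PI_TOKEN:
        return STORED_PI          # immediate: no parsing at all
    # Otherwise fall back to character-by-character parsing,
    # multiply-by-10-and-add per digit.
    value = 0
    for ch in tokens:
        value = value * 10 + (ch - ord('0'))
    return float(value)

print(eval_numeric([PI_TOKEN]))          # the stored constant, instantly
print(eval_numeric([ord('4'), ord('2')]))  # 42.0, after a parse loop
```

The point of the sketch is that the π path does constant work regardless of precision, while the digit path grows with the length of the literal.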

Peter B.
  • I wonder how much extra code would have been required to say that whenever the exponent byte is zero, the next four bytes will be interpreted as a signed integer, and operations whose operands and results are both signed integers yield signed integer results? Operations that can't be handled that way would simply need to start with a call to a function that would convert operands to normalized format, but most operations would execute much faster because they could skip the normalization steps. – supercat Jan 14 '19 at 20:25