26

The BASIC I'm most familiar with is Atari BASIC since I had an Atari 800 way back when.

The Atari BASIC Source Book includes details about how Atari BASIC maintains variables. There is a Variable Name Table that essentially assigns an index number to each variable name. The interpreter uses the index to get the variable value from the Variable Value Table.

The Variable Value Table includes:

  • The type of the variable
  • The index number of the variable
  • 6 additional bytes of data that vary depending on the type of the variable (see the sketch after this list)
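For concreteness, here's a small sketch that peeks at that table after a RUN. It is only an illustration: it assumes the zero-page pointer to the Variable Value Table at locations 134/135 that the Source Book documents, and that the program is entered after NEW so that A and A$ occupy the first two entries.

10 A=1
20 DIM A$(3):A$="ABC"
30 VVT=PEEK(134)+256*PEEK(135)
40 FOR I=0 TO 1
50 PRINT "TYPE=";PEEK(VVT+8*I);" VARNUM=";PEEK(VVT+8*I+1)
60 NEXT I

The first byte of each 8-byte entry holds the type and the second holds the index number, so the two printed lines should show different type bytes for A and A$.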

My question is: Since the variable value includes the type, at least in Atari BASIC, why does the programmer need to also include the type in the name of the variable? In other words, why can't we just do LET X="HELLO" without making it X$? It knows the right-hand side of the assignment is a string, so it can just assign a string to X. And if the program later does LET X=1 then the value could then be an integer.

My thoughts:

  • Possibly this was important when variable names were limited to 1 letter since that would give 26 numeric variables and 26 string variables.
  • Maybe some BASICs have separate data structures for string and numeric variables and the suffix tells the interpreter which one to use.
  • BASIC could be doing type checking when parsing the line, avoiding the need to check types at runtime.
Willis Blackburn
    Re, "It knows the right-hand side of the assignment is a string." I don't actually know how Atari BASIC was implemented, but I would not make that assumption. A parser, scanning let x$=... from left to right already knows the type of the variable before it attempts to parse the given value. Having that information sooner, rather than later, might make a difference in the size and complexity of the interpreter. On a platform that might have as little as 8K bytes of RAM, that might be important. – Solomon Slow Sep 27 '21 at 14:33
  • 5
    Adding a link to The Atari BASIC Source Book for others to peruse: https://archive.org/details/ataribooks-the-atari-basic-source-book It's a lucid and concise explanation of the design & implementation of an 8-bit era BASIC. – Jim Nelson Sep 27 '21 at 17:10
  • 6
    Well, not all BASICs need to type variables. Sinclair QL's SuperBASIC, for example, can use typeless variables as arguments to functions and procedures - You can write a procedure that sorts integer, string and floating point arrays with the same piece of code. – tofro Sep 27 '21 at 22:27
  • 3
    In some ways it was the precursor of type-prefixing of variable names, such as intCounter and strAddress. I.e., despite some ridicule of it, the programming profession actually ended up largely adopting the practice in a more language-agnostic way (at least up until type-identifying tools like IntelliSense were widely available). – RBarryYoung Sep 28 '21 at 15:06
  • 3
    @RBarryYoung - nice to know I'm not part of that majority :-) – dave Sep 28 '21 at 22:52
  • 1
    @RBarryYoung: What you are referring to is "System Hungarian" and was an abomination that completely ignored what the actually useful "Apps Hungarian" was actually about or good for, and got "largely adopted" because too many people unthinkingly adopt what Microsoft tells them is the current bee's knees. Until even Microsoft realized that System's Hungarian serves absolutely no purpose whatsoever other than obfuscating your source, and discontinued its use. – DevSolar Sep 30 '21 at 15:52
  • 1
    From the ATARI book (in chapter 5, the pre-compiler): The variable is considered to be a string if it ends with $; otherwise it is a numeric variable. – No'am Newman Sep 30 '21 at 18:17

7 Answers

39

TL;DR:

Most straight answer: Because it's BASIC

A trailing '$'-sign is the syntax BASIC defined when adding strings.

Also, the suffix is not only a type marker, but part of the name. In BASIC A and A$ are two different variables. This is not a bug but a feature. The type is needed to distinguish them.

The BASIC language is defined to work that way. Changing this would make it a different language and add incompatibilities (*1), much the same way that C requires parentheses around function parameters, despite the fact that it would have been possible to go without them.


It's an Atari

... how Atari BASIC maintains variables. There is a Variable Name Table that essentially assigns an index number to each variable name.

I guess that's where much of the confusion originates. Atari BASIC is a very special implementation; the Variable Name Table is a prime oddity here. Atari BASIC keeps a static 128-entry table of all variables, using a unique indexing system for access. More than that, this table even stays static across many operations like editing - unlike variable handling in many other BASICs (*2).

But even Atari BASIC follows the rules of the BASIC language and handles A and A$ as different variables. Simply try this:

10 A=1
20 DIM A$(3) : REM Special case only needed for Atari BASIC
30 A$="ABC"
40 PRINT A,A$

Every BASIC compatible with its forefathers will print

RUN
1       ABC

READY

This example works only because A and A$ are two different variables, distinguished by both type and name, which is an inherent feature of BASIC.

That table and its behaviour are Atari-specific, not part of any generic BASIC definition. So the question might be not about BASIC, but rather about why Atari BASIC didn't diverge and automate type detection. The answer could be that, besides requiring a more complex parser and producing less readable and more ambiguous code, it came down to avoiding incompatibility with BASIC as a language.

It's a Language Predating Atari's Implementation

My question is: Since the variable value includes the type, why does the programmer need to also include the type in the name of the variable?

Because it's part of the name? Even in Atari BASIC, the '$' suffix is stored in the name entry, and Atari BASIC does distinguish between A and A$.

Maybe Atari could have left out the variable type, but then their programs would be incompatible with BASIC - i.e. it would be only a BASIC-like language.

Not to mention that in this case variable declaration would become mandatory, to tell the interpreter ahead of time which type a variable should have - a feature BASIC avoided on purpose for simplicity (unless one wants to increase complexity with sum types).

In other words, why can't we just do LET X="HELLO" without making it X$?

Because then the compiler/interpreter would not know what type X is supposed to be.

It knows the right-hand side of the assignment is a string, so it can just assign a string to X.

No, it doesn't, as it works simply left to right. No look-ahead and no backtracking. BASIC is intended to be a simple language, for usage as well as for implementation.

And if the program later does LET X=1 then the value could then be an integer.

Besides the fact that integers were a later addition (and that there is no look-ahead), how would one tell the compiler/interpreter that it's an integer, not a float?

Also, wouldn't that redefine the variable from string to integer? Polymorphism isn't a thing in BASIC; variables are strictly typed.
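As a quick illustration of that strictness (a Microsoft-style dialect is assumed here; as the comments note, Atari BASIC rejects such a line already when it is entered):

10 X=1
20 X="HELLO"
RUN

?TYPE MISMATCH ERROR IN 20

The interpreter does not quietly re-type X; it simply reports an error when line 20 executes.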

It's about Teaching

Besides simplicity of compiler/interpreter design, it all comes down to the fact that BASIC is intended to be basic. BASIC is a language meant for teaching people who had never ever touched a computer before, nor even seen one in their whole life (*3). It is about introducing concepts that were completely alien to students in the 1960s.

Teaching starts with introducing variables for numbers and giving them names. The idea that computers divide numbers into classes, like float or integer, is something to be learned much later. Then strings are added and marked accordingly, and much later integers may be added, and so on.

Requiring variable definition ahead of usage might seem natural to experienced users - and it is the way 'professional' languages worked even back then - but it's an additional hurdle in the process of learning, one BASIC avoided by allowing usage without declaration, which in turn implies that each occurrence has to be type-qualified.

And it's about Era

When trying to understand a language of the past, it helps to look at its intentions, and to remember that 1961 is not 2021. Computers in the '60s were worlds apart from today's machinery.


From the thoughts section:

Maybe some BASICs have separate data structures for string and numeric variables and the suffix tells the interpreter which one to use.

What data structure is used is implementation, not language definition, isn't it? Also, yes, other interpreters use other structures - but all of those used on microcomputers were created long after BASIC as a language was defined.

BASIC could be doing type checking when parsing the line, avoiding the need to check types at runtime.

That is exactly what BASIC does - just with no look-ahead. Also, parsing of a line, even in Atari BASIC, is done only for syntactical correctness and to determine variable types, as the structures to be used do not exist at that moment in time.

Variable tables are built only at runtime.
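A small experiment makes the runtime nature of those checks visible (a Microsoft-style interpreter is assumed here; as the comments point out, Atari BASIC instead flags the bad line as soon as it is entered):

10 GOTO 30
20 PRINT "X"+1
30 PRINT "LINE 20 STORED BUT NEVER CHECKED"

Line 20 is happily tokenized and stored, and since it is never executed, RUN completes without any type error; only executing the line would trigger the check.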


*1 - Which Atari did anyway.

*2 - Introducing quite some unexpected trouble when not starting with NEW

*3 - Maybe except in flicks like Colossus :)

Raffzahn
  • First BASIC I used even had three A: string, integer and float. (Ok, only the float was actually written A, the others were A¤ and A%). – UncleBod Sep 27 '21 at 13:36
  • 2
    @UncleBod some BASICs even allowed one to explicitly mark floats as A! – Raffzahn Sep 27 '21 at 13:37
  • 3
    I asked, why does BASIC need these variable name suffixes, and you answered, because it's BASIC. This answer begs the question, which isn't about why Atari BASIC chose to implement BASIC this way but rather why the language is defined this way. – Willis Blackburn Sep 27 '21 at 13:53
  • 2
    @WillisBlackburn But that's the whole point. BASIC is made to allow two different variables of different type with the same name. Further, BASIC was made to be simple: no prior data definition, no look-ahead for assignments or the like. It's meant to be simply interpreted left to right. – Raffzahn Sep 27 '21 at 14:13
  • @WillisBlackburn so would it be accurate to say that the real question is “Why did BASIC v4 add string variables with a $ suffix?” (The first versions of BASIC didn’t have string variables, they were added in v4.) – Stephen Kitt Sep 27 '21 at 14:26
  • So the key is “No, it doesn't, as it works simply left to right. No look ahead and no backtracking. BASIC is intended to be a simple language. For usage as well for implementation.” Note that Atari BASIC has a few deviations from “standard” BASIC, so using that as an argument doesn’t work all that well — in particular, string handling ;-). (Atari BASIC doesn’t support lists of strings.) – Stephen Kitt Sep 27 '21 at 14:48
  • 1
    @StephenKitt yes, by now I think the 'features' of Atari BASIC are the real reason the OP came to ask in the first place. It's easy to stray far afield when looking only at this (very special) implementation. – Raffzahn Sep 27 '21 at 15:07
  • 1
    A good study of the differences between Atari BASIC and its brethren: https://web.archive.org/web/20070524044410/http://www3.sympatico.ca/maury/other_stuff/atari_basic.html (written by @maury-markowitz, I think?) – Jim Nelson Sep 27 '21 at 17:26
  • 1
    Kind of looking for a more technical answer. I think you're saying, variables have to have a declared type, and just suffixing string variables with '$' saves the novice programmer from having to declare the variables as strings before using them. Except that variables don't have to have a declared type. They don't in Python, or in JavaScript, or in a lot of other languages. It would have been easier to just let the programmer assign values to variables without worrying about type. Atari BASIC at least was clearly capable of changing the type of a variable based on its assigned value. – Willis Blackburn Sep 27 '21 at 19:37
  • Maybe other BASICs had separate variable name tables for, say, integers and reals. That would allow the interpreter to reserve only 2 bytes for each integer instead of the 4 or 6 bytes (or in Atari BASIC's case, 8 byte) needed to store a value of any type. Can anyone identify a BASIC that worked that way? – Willis Blackburn Sep 27 '21 at 19:39
  • 1
    @WillisBlackburn - for any implementation except the original Dartmouth BASIC, the answer to why is "because that is the language". For Dartmouth, the answer is probably (and here we have to guess, unless there's a rationale written by Thomas Kurtz or John Kemeny hiding somewhere) "because it made the language easier to use". See my answer. – dave Sep 27 '21 at 21:16
  • 2
    @WillisBlackburn - dynamic typing requires a runtime system that static typing does not. BASIC was a compiled language when language decisions were being made. – dave Sep 27 '21 at 21:32
  • 1
    Also note that most dialects of BASIC treated arrays as a separate namespace, so A, A$, and A() would all denote different variables. – dan04 Sep 27 '21 at 21:56
  • 5
    @WillisBlackburn Yes, Python doesn't have typed variables. But the Python runtime (just the program named python2.7) on my system is bigger than 3 MB. I would have trouble fitting that into an 8-bit computer, even with overlays and memory banking. – UncleBod Sep 28 '21 at 04:30
  • 7
    @WillisBlackburn You're arguing again with Atari BASIC and in hindsight. So if you want a more technical answer: BASIC was developed in 1964 on a GE-225 computer with 8 KiWords of drum-based memory running in batch mode, which got user programs fed as virtual punch cards by a DN-30 front-end processor. There is no room for complex runtime management to support dynamic typing, or for a compiler analysing program flow to guess types. Not to mention that dynamic typing wasn't a thing in 1964. – Raffzahn Sep 28 '21 at 05:07
  • @WillisBlackburn Microsoft level 2 BASIC could declare variables and their types: DEFDBL A-Z or DEFINT I, etc. That said, the choice of sigils to distinguish variable types is a language design choice; Perl is another example besides BASIC. As for Python's variables, they are not typeless; they are so-called sum types, i.e. a union of all types. The variable has one type that is chosen when initialising the variable. – Patrick Schlüter Sep 28 '21 at 06:19
  • 1
    The answer that I think is becoming apparent is that BASIC type-checks program lines when they are parsed or compiled but does not check expression types at runtime, and the reason for this is that although a BASIC interpreter with an integrated editor like Atari BASIC obviously has to carry around type-checking code and could apply it at runtime, earlier BASIC implementations were compilers, and type-checking during compilation enabled those compilers to produce a smaller executable. – Willis Blackburn Sep 28 '21 at 14:44
  • @WillisBlackburn (caveat: there are many variations out there, even ones such as you'd like) No, it's just the other way around - most BASICs do NOT type-check anything when source lines get entered. They just crunch them into a more compact representation, which gets interpreted at runtime. Type checking is done during interpretation, but there is no polymorphism. A function, when called, simply checks whether the parameters supplied are of the required type and throws an error otherwise - during runtime and from left to right. It also has nothing to do with compilers, as back then compilers were just as simple. – Raffzahn Sep 28 '21 at 14:56
  • @Raffzahn Kind of confused by all the different answers. First you said there is "no room for complex runtime management to have dynamic typing" but then you said that no type-checking is performed at the point of program entry... it can't be both ways. If the code logic is "check if the operands to this plus operator are both integers, and if so then add them, otherwise generate an error," that's dynamic typing. A statically-typed program would just add the two numbers, knowing that they were integers. – Willis Blackburn Sep 28 '21 at 15:25
  • 1
    It's easy to demonstrate that Atari BASIC at least does do type-checking at program entry. Entering 10 PRINT "X"+1 immediately produces an error. – Willis Blackburn Sep 28 '21 at 15:29
  • 1
    As mentioned in my link above, one of Atari BASIC's features is that it performs tokenization at the time the line is entered, and even points to the character in the line where it first detected a problem. That was unusual at the time, however. – Jim Nelson Sep 28 '21 at 17:36
  • @WillisBlackburn Erm, it seems you're jumping at random between questions about specific implementations, like Atari BASIC, but insist on wanting a general answer about BASIC as a language, don't you? It would help if you could stay with one topic. Atari BASIC is in many ways a special case with implementation-specific details. So is your question not about BASIC and its syntax, but about why Atari stuck to standard syntax despite being able to do otherwise? If that's the case, then please say so. I am starting to believe that the question is not well thought through and may need to be more focused. – Raffzahn Sep 28 '21 at 19:28
  • 2
    @WillisBlackburn BASIC requires by definition that string variables are marked by a trailing '$' sign - despite the possibility of a compiler/interpreter that could do without. The same way C requires that all functions have parentheses and requires them to match, no matter whether a compiler could be written that would work without, accepting fu(bar);, fu (bar; or fu bar; all the same. A compiler not requiring parentheses would simply not be a C compiler. – Raffzahn Sep 28 '21 at 19:40
  • @Raffzahn The question was about why BASIC uses type suffixes instead of just implementing dynamic typing at runtime, which you pointed out that many BASICs actually do. I only used Atari BASIC as an example of a BASIC where the type of each variable is known at runtime. I think I understand the rationale, but your initial answer, "because it's BASIC," wasn't helpful. – Willis Blackburn Sep 28 '21 at 22:40
  • @WillisBlackburn But it's exactly that point. The suffix is not just a qualifier, but part of the name. A and A$ are two different variables (IIRC even in Atari BASIC - it's stored in the variable table entry as part of the name and compared accordingly). It's because BASIC is defined that way. – Raffzahn Sep 29 '21 at 16:07
  • "It is about introducing concepts that are completely alien to students in the 1960s." - it's worth noting that whilst general purpose computer hardware was new in the 1960s, the principles of mathematics and secretarial science were not, and it's very likely that the average university student in the 1960s had a much better conceptual grounding than the average modern student. – Steve Sep 29 '21 at 17:37
  • @Raffzahn: The Standard allows Conforming C Implementations to extend the language in arbitrary fashion, with only three caveats: (1) Programs where an #error directive survives preprocessing must be rejected ; (2) An implementation would need to issue a diagnostic upon receipt of some programs, but may issue such a diagnostic at arbitrary other times as well or even unconditionally; (3) extensions must not affect the behavior of Strictly Conforming C Programs. For a C compiler to accept code like what you show would be weird, but nothing in the Standard would forbid such a thing. – supercat Sep 29 '21 at 17:48
  • @supercat And the point is? In the same way one could drop the suffix notation in BASIC, which is what the OP had in mind (and Visual Basic did). It just wouldn't be BASIC any more, the same way that dropping the parentheses around function arguments would make it no longer C. – Raffzahn Sep 29 '21 at 21:50
24

To add to the "because it's BASIC" answers: BASIC came from the Dartmouth Timesharing System (DTSS) in the 1960s. Other implementations need to follow basic BASIC standards.

In 1964, the variables A and A() -- i.e., scalar and array -- were considered to be different. BASIC, Oct 1964, page 36. There were no string variables.

By 1968, string variables, and string arrays, had been added. These had names ending in a dollar sign, thus A$ and A$(). BASIC, Jan 1968, page 63.

So that's the original BASIC. And that's why every other BASIC has to follow suit.

Why? Because of the 'B' in BASIC - "beginner's". The designers, who were very capable but still feeling their way at this time, probably did not want to burden their users with either type declarations such as were found in ALGOL, or dynamic typing such as was found in LISP. A fixed name format seems clear.

And, note, DTSS BASIC was a compiled language, not an interpreted language. That too may colour some decisions, for example maybe you don't want to need runtime type checking on the validity of

100 LET X = Y + 1

which presumably is meaningless if Y is string-valued.
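As a hedged illustration (hypothetical lines, Dartmouth-style compilation assumed), the suffix convention lets both of the following be classified from the spelling of the names alone, without knowing what Y holds at run time:

100 LET X = Y + 1
110 LET X = Y$ + 1

Line 100 is well-typed by inspection, since an unsuffixed Y can only be numeric; line 110 can be rejected at compile time for mixing a string into an addition. Neither decision needs any runtime machinery.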

dave
  • 1
    Well, there are some languages that do let you mix types like that, and implicitly cast either Y to a number (with + as numeric addition) or 1 to a string (with + as string concatenation) to make it work. But this would presumably make the compiler/interpret more complex to implement, and thus be avoided in the days when RAM was measured in KB instead of in GB. – dan04 Sep 27 '21 at 22:09
  • 3
    @dan04 - you need either static typing, so the compiler can emit the right code, or you need a runtime mechanism to dispatch to the '+' implementation that is appropriate to the type that Y has at the time the statement is executed. And the possibility that X might end up as numeric 2 or string FOO1 is confusing to beginners: just go read Java questions on SO for proof. – dave Sep 27 '21 at 22:46
  • 2
    Thanks for this answer. I think it better describes the reason: the DTSS BASIC designers' desire to do static type-checking at compile time and make the generated code simpler and smaller. – Willis Blackburn Sep 28 '21 at 14:50
  • 2
    Other implementations need to follow basic BASIC standards. Well, not really. BASIC was not standardized in the sense that you could port source code without modification. Apple II BASIC was different from TRS-80 Level I BASIC, and so on. If you were typing in code from a magazine, part of the process was to make the appropriate changes for your own machine's dialect. –  Sep 28 '21 at 14:52
  • 1
    @BenCrowell wasn't BASIC standardized well enough that simple programs would work without modification? Sure peek and poke would need translation, but number and string input and output and manipulation should have been the same. – Mark Ransom Sep 28 '21 at 16:07
  • 3
    @MarkRansom - I suspect there was a de facto standard core and then everyone and his dog added their own "extensions", which are as deadly to portability as omissions. FIPS/ANSI and ECMA/ISO published standards, which I have never looked at. – dave Sep 28 '21 at 17:53
  • 1
    @another-dave: Indeed. If you stuck to the most basic forms of commonly-implemented statements (DATA, FOR...NEXT, GOSUB, GOTO, IF...THEN, INPUT, LET, ON...GOTO, PRINT, READ, REM, RESTORE); stuck to stdin/stdout-style I/O with no disk files, and no color, graphics, or sound; and kept the number of variables and GOSUB recursion levels small enough to not run into implementation limits, then you could write programs compatible with almost any BASIC interpreter. You can find compilations of "classic" BASIC games using this approach. – dan04 Sep 28 '21 at 20:19
  • ...You'd also have to avoid string variables due to the schism between whether DIM A$(10) declared an array of 11 dynamically-allocated strings (as Microsoft did), or a single string variable with a maximum of 10 chars (as Atari did). – dan04 Sep 28 '21 at 20:21
  • 2
    @dan04 - In BASIC from Dartmouth in 1968, it's 11 strings a-stringing. So any implementation that treats it as a single string is just defective. – dave Sep 28 '21 at 21:20
  • @another-dave The 1968 doc provides an explanation of the CHANGE statement, which I'd seen in the IMSAI 8K BASIC documentation but was mysterious to me. I don't think a BASIC is "defective" because it doesn't work like Dartmouth BASIC, because virtually no BASICs do. For example the CHANGE statement doesn't exist in Atari or any of the Microsoft BASICs. The ECMA-55 Minimal BASIC standard didn't support string arrays at all. The MAT statements are not supported by any microcomputer BASIC that I've ever used. Some BASICs don't even support floating point. – Willis Blackburn Sep 28 '21 at 22:27
  • Atari BASIC isn't defective, it's just a dialect of BASIC that's different from Dartmouth BASIC. And the same can be said for Apple Integer BASIC, Microsoft BASIC, etc. – Willis Blackburn Sep 28 '21 at 22:30
  • I'm a fundamentalist when it comes to language definition. Implement the standard or call it something else. As Barry Mailloux said to the Algol-68R implementors: "we have a bible and you are sinning". – dave Sep 28 '21 at 22:35
  • 1
    @another-dave There wasn't a standard to implement. According to Wikipedia, HP Time-Shared BASIC added support for string variables before Dartmouth BASIC and used the "array-slicing" approach in which DIM A$(10) allocates space for a 10-character string and A$(1) gives you the first character of that string. So this was an approach to string handling that was established before it was used in Atari BASIC. – Willis Blackburn Sep 28 '21 at 23:12
  • @WillisBlackburn that's funny, I learned computing on that HP BASIC. It didn't have integers, it only had floating point. – Mark Ransom Sep 30 '21 at 14:33
  • 1
    @MarkRansom- Right, that was standard BASIC at the time. Numbers were numbers. The first BASIC I saw that had explicit integer support was BASIC-PLUS on RSTS/E (one of the PDP-11 OSes). – dave Sep 30 '21 at 14:36
7

Akin to Raffzahn's answer, yes, because it's BASIC.

That said, later BASICs had DEFINT, DEFSTR, and such to set the types of variables upfront, so that you no longer had to use suffixes.

Similar to the IMPLICIT statement in FORTRAN.
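A short sketch of how those DEF statements look in a Microsoft-style dialect such as GW-BASIC (the exact behaviour shown is assumed from that family):

10 DEFINT I-N: DEFSTR S
20 I=3.7: PRINT I : REM I defaults to INTEGER, so the value is rounded to 4
30 S="HELLO": PRINT S : REM S defaults to STRING, no $ suffix needed
40 I!=3.7: PRINT I! : REM an explicit suffix still overrides the default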

Will Hartung
  • 1
    :) well, maybe less is more :)) (It may be worth adding that this also eliminates the ability to use the same name for variables of different types - which many programs did use). – Raffzahn Sep 27 '21 at 14:57
  • 1
    @Raffzahn No, I think it just affected the default type. I believe you could always still be explicit with the suffixes. I can't say with authority, I rarely used them. – Will Hartung Sep 27 '21 at 20:45
  • 1
    Will, sorry, I guess I should have been clearer with that little addition. Of course it may still be possible to override by using explicit typing again; I didn't want to deny this. The point was that using it the way the OP intended would eliminate the ability to use implied types. – Raffzahn Sep 27 '21 at 22:06
  • 2
    @WillHartung's comment matches my recollection from QBasic/QuickBasic: I tended to use DEFINT A-Z with types suffixes for everything that needed to not be an integer. Strings and numeric types that differed only in the suffix were OK (misusing would result in an error), but keeping track of which one of A%, A&, and A! you meant made debugging a pain, and wasn't necessary as variable names could be reasonably descriptive – Chris H Sep 28 '21 at 10:28
  • 1
    QBasic's type suffixes are % (INTEGER/16-bit), & (LONG/32-bit), ! (SINGLE), # (DOUBLE), and $ (STRING), and the modern clone QB64 has extensions for && (64-bit integer) and ## (extended-precision float). – dan04 Sep 28 '21 at 21:44
  • A decade before QBasic, BBC BASIC used a % suffix for 32-bit integers as well as the standard $ for strings. (The other characters had different meanings, though: & was a prefix for hex constants, ! the word indirection operator, and # indicated a file handle or, in the assembler, an immediate value.) – gidds Oct 19 '21 at 11:53
5

Section 7.1 of the ANSI X3.60-1978/ECMA-55 Minimal BASIC Standard specifies that string variable names end in the "$" character.

That particular string variable naming specification allowed the Standard to be backwards compatible with the 1968 implementation of Dartmouth BASIC (and with DEC BASIC, et al.).

So support for that style of string variable name is required (whether or not any particular implementation could do otherwise, due to symbol tables or whatever). There are hints on the web that specification compliance may even have been a government purchasing requirement.

And Shepardson Microsystems possibly didn't even start porting Atari BASIC until after the Minimal BASIC Standard was published in 1978.

hotpaw2
3

Any answer to this question is going to be somewhat a matter of opinion, since there certainly are languages that are designed with dynamic typing, and BASIC could have been done that way. Unless someone comes up with a memoir or historical article explaining some of these design decisions, we won't really have a conclusive answer.

I disagree with the answers saying that Dartmouth BASIC worked a certain way, so later versions of the language had to follow that. There was no source code compatibility between different implementations of BASIC. Apple II BASIC was different from TRS-80 Level I BASIC, and so on. If you were typing in code from a book or magazine, part of the process was to make the appropriate changes for your own machine's dialect.

Some versions of BASIC only allowed one-character variable names, while by the era of consumer machines, most dialects allowed two-character names, such as AB or A7. This was doubtless a memory-saving and efficiency device. A two-letter variable name could be encoded into two bytes, and manipulated by putting it in a 16-bit register. This created an artificial shortage of variable names, especially if you wanted to find variable names that would be memorable. Worsening this shortage were two other factors: (1) there were no local variables, and (2) source code was normally all upper-case. Because of this shortage, it was a big win to be able to have an N$ for the user's name, and have that be different from N which was the number of something.
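A hypothetical snippet of the kind that benefited (a Microsoft-style dialect is assumed, where strings need no DIM); the player count and the player name both use the letter N yet remain separate variables:

10 INPUT "HOW MANY PLAYERS";N
20 FOR I=1 TO N
30 INPUT "NAME";N$
40 PRINT "HELLO, ";N$
50 NEXT I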

I don't think the issue was purely the simplicity of parsers. Actually, the parsing and polymorphism were pretty complicated. You can play around and observe this in an emulator, e.g. http://trsjs.48k.ca/

A%=3:B%=4:PRINT A%/B%

This outputs .75, which shows that operands could automatically be promoted from integer to floating point. (Although this design decision probably made the interpreter slower, it would definitely be preferable, for beginners, compared to having the result be zero.)

A%=1.9:PRINT A%

This outputs 1, showing a demotion from float to int.

It's certainly not true that the interpreter could look at the left hand of an assignment and know the types of all the operands on the right-hand side.

A$=CHR$(42)'RIGHT-HAND SIDE INVOLVES AN INTEGER BUT RESULTS IN A STRING

There is type checking:

A%="HELLO"'PRODUCES ?TM ERROR

Although functions like LEN and STR$ had suffixes that indicated their types, the binary operators were polymorphic.

A$="HELLO":B$=" WORLD":PRINT A$+B$

Note that although there was stuff like DEFINT, it was optional, and in general there were no type declarations as in languages like Pascal. This was obviously a usability decision for beginners. Incorporating punctuation to show types is actually a pretty widespread idea, as in languages like the Unix shell and Perl.

I would guess that it helped quite a bit in simplifying the parser and making it more efficient that it could always tell the types of operands on syntactic grounds. Suppose that BASIC had had dynamic typing. Then what does the following code do?

A=B+C

In a hypothetical dynamically typed BASIC interpreter, we would have to look at the type of B, look at the type of C, then determine whether they're both strings, both numeric, or one string and one numeric. Then we'd have to have three different branches: concatenate, add, or throw a type error. The performance of the language on these machines was horrible. They certainly didn't want to make design decisions that would make it even worse by adding more complication at run-time.

If there was a dynamically typed interpreter, it would also automatically add some complication to the run-time, which had to be small (e.g. 16K of ROM). We would need functions for converting from one type to another. We would also need some provision in the user interface for inspecting the type of a variable.

user3840170
  • 1
    I don’t think Unix shell and Perl existed when this design decision in BASIC was made. – user3840170 Sep 28 '21 at 15:35
  • 2
    @user3840170: Shell dates to 1971, and one of the other answers says the first versions of basic with strings were 1968. So they were close contemporaries, and the idea of using these sigils as a way of avoiding explicit type declarations was clearly in the air at the time. Perl is of course much later, and was basically shell on steroids. –  Sep 28 '21 at 16:03
  • Thanks @BenCrowell. I also noticed that many BASICs will promote integers to reals. I didn't have much luck with concatenating strings with "+". It's interesting that TRS-80 BASIC can do that. But in each of these cases I think that the need to promote an integer to a real, or to truncate a real to an integer, or to concatenate rather than add, can be determined at the time the line is parsed. BASIC can include this information in the tokenized/compiled program and avoid working it out at runtime. But this only works if each variable has a known static type, thus the suffixes. – Willis Blackburn Sep 28 '21 at 22:51
  • 1
    re I disagree with the answers saying that Dartmouth BASIC worked a certain way - sure, you could implement FORTRAN and call it BASIC. But names have to mean something else it is madness. So the question is, at what point do we say that if it's not like existing BASICs, it's not actually BASIC? – dave Sep 28 '21 at 22:55
  • 2
    @another-dave I think you're assigning too much meaning to the term BASIC. Dartmouth BASIC is Dartmouth BASIC, Microsoft BASIC is Microsoft BASIC, etc. They're related but not the same. Nobody at the time was representing that the language was portable. It's similar to Unix family tree. Neither Linux nor macOS is AT&T Unix but so what? There are a lot of similarities and a lot of what we know about one carries over to the others, so they're all Unixes without specifically being AT&T Unix. – Willis Blackburn Sep 28 '21 at 23:19
  • 3
    I say the guys that designed it get to have the real version. – dave Sep 29 '21 at 00:21
  • 1
    @another-dave Fair enough. You'd get some support from the designers of Dartmouth BASIC, who tried to challenge Microsoft's dominance over microcomputer BASIC with a language called True BASIC. – Willis Blackburn Sep 29 '21 at 12:32
  • But trying to standardize BASIC on 8-bit microcomputers was a futile effort. Microcomputer designers had to fit an entire BASIC in 8K, and they were never going to agree on what features should be in or out. For example Atari BASIC has a SOUND statement that basically pokes directly into the sound-control registers of the POKEY chip. But that statement would have been too simplistic for the Commodore 64 and its SID chip, and too sophisticated for the Apple II, which only had 1-bit sound. So how to standardize that? – Willis Blackburn Sep 29 '21 at 12:34
  • Okay, so maybe it would have been enough to just standardize the core language features. But why bother when everyone had pretty much settled on Microsoft BASIC. Apple, Commodore, Tandy, and IBM licensed Microsoft BASIC, and it was available for the Atari too, so if you wanted a standard BASIC, that was it. – Willis Blackburn Sep 29 '21 at 12:38
2

Firstly, there is more than one way to assign a value to a variable, e.g.

INPUT X,Y
PRINT X+Y

If the input is 4 and 2, should the interpreter return 6 or "42"? You need to be able to tell strings and numbers apart.

Secondly, limited memory space was a major consideration back then. Strings usually reserved 256 bytes, whereas numbers required far fewer. This is particularly an issue with arrays: if DIM A(10,10) needlessly reserved 10 x 10 x 256 bytes (when all you wanted was numbers, not strings), that's 25,600 bytes consumed out of, say, 65,536 bytes of total memory. Even reserving 256 bytes for just one string took away 0.4% of your total memory, and there's no need to tie up that much of a resource if the variable is just a number.

Paul
-1

Symbol table space was the most likely culprit. Integers and floating point values were known sizes. Strings were problematic.

The interpreter (or P-code) had to keep track of all the variables and their contents. Floating-point and integer types were easy to store, as they had a known size. Strings were trickier and required additional information to store, such as the length of the string - some BASICs required you to declare the size of a string before you could use it.

The run-time would have to look up the value of a variable in the symbol table before using it. Knowing if it was a string, integer, or floating point value makes the lookup easier.

Tim Lovern