33

Machine language (and Assembly language) don't have the concept of data types, so if you want to add an int and a float variables in Assembly, you have to use the appropriate Assembly instruction that adds an int and a float.

But if you are working with a high level language (for example: C), all you have to do is "mark" one variable with the int keyword and mark the other variable with the float keyword, and then use the + operator to add the two variables together, and the compiler will generate the machine language instruction that adds an int and float.

But I am wondering, which was the first programming language that had data types?

user11869
  • Not sure if it's the earliest, but the initial version of FORTRAN had at least two "types": integers ("fixed point") and floating point -- see also Programmer's Reference Manual, The FORTRAN Automatic Coding System for the IBM 704 EDPM – Felix Palmen Feb 05 '19 at 14:01
  • I can't really answer with "the first" to have data types, although I suspect it's Fortran; note that you don't have to have manifest typing to have types. In C, you have to say int i; float f; to tag i as capable of holding an int and f as holding a float. In many other languages, such as Lisp, Perl, Python, many Basics, JavaScript, and so on, the variables can hold any type, but the operators know what to do with the values in them because the values know their types. – Dranon Feb 05 '19 at 14:42
  • Welcome to Retrocomputing! Since you use it as an example in your question -- not that it was the first -- the main difference between the C language and its own predecessor B was the introduction of data types. – DrSheldon Feb 05 '19 at 15:02
  • It's unclear to me what the actual question is here. Since even machine language has data types (bytes, words, doublewords, etc.) I think this question may actually be asking "Which was the first programming language that had implicit type coercion?" – Ken Gober Feb 05 '19 at 15:10
  • @Dranon Fortran's original behaviour was that any variable whose name started with I through N was integer, and all others were floating point. Later versions (Fortran 66, I think) allowed you to override this and explicitly specify the type of a variable. With my first "big" programming exercise (in Fortran 77) I soon learned to use the IMPLICIT statement to turn off the default typing based on a variable's name (just as I learnt to ensure OPTION EXPLICIT was turned on in VB). – TripeHound Feb 05 '19 at 15:18
  • @KenGober I would say bytes/words/etc. are different sizes more than different types. It's up to the programmer to know (and take into account) whether this bunch of 16 bits represents an integer, a floating point value, or two ASCII characters. – TripeHound Feb 05 '19 at 15:19
  • @KenGober there's not "the" assembly language, and some really just know a single "type": a "word" (or byte). Then, even if different "types" are known, many are just about the size in memory, nothing else (especially not about the meaning/interpretation of the content). What the OP probably means is types in a semantic sense. – Felix Palmen Feb 05 '19 at 15:20
  • @KenGober Or, possibly, which was the first programming language where the type was tracked either with the variable or the value. Which might also be some variety of assembly. – Dranon Feb 05 '19 at 15:28
  • All processors have data types that they work with naturally. If they couldn't load, store, and add some string of bits it would hardly be a processor. Some need software to handle additional data types like extended precision, floating point, or additional operations like multiply and divide, but of course, they do that by using the data types they already support, signed and/or unsigned words. – Erik Eidt Feb 05 '19 at 15:49
  • @FelixPalmen Well, while on a machine level a word is a word, already early assemblers offer type definitions to separate words from pointers - like F for a 32 bit integer word, and A for a word containing an address (and E for a float of the same memory size). They can be handled with the same instructions, but differ in meaning/interpretation in program context - thus enabling type mismatch warnings (or errors) during compile time. Like with any other HLL. Isn't it? – Raffzahn Feb 05 '19 at 16:25
  • @Raffzahn: I don't doubt such assemblers exist. But many assemblers just map mnemonics to opcodes and that's it. So, to be relevant for this question: does any such assembler predate e.g. Fortran? – Felix Palmen Feb 05 '19 at 16:45
  • @FelixPalmen Judging a topic is always a bit odd when viewing from an off position - or do you judge a Mercedes SLC by using a go-cart? And yes, they predate FORTRAN by decades. FORTRAN introduced type conversion in FORTRAN 77 by applying intrinsics (like INT() or DBLE()) to all parameters of an assignment (!). So no, Fortran is not an answer here. – Raffzahn Feb 05 '19 at 17:02
  • So why are you talking about conversion now? How is this asked in the question here? And how is this implemented in an assembler? – Felix Palmen Feb 05 '19 at 17:04
  • @KenGober: I interpreted it as "what was the first language that could be considered 'strongly typed'?" So far all of the answers are clearly incorrect. – BlueRaja - Danny Pflughoeft Feb 05 '19 at 18:04
  • There was a lot of exciting work on decidability and computability in the early/mid 20th century by people like Church, Turing, Kleene, Robinson, Mostowski, Gödel, Lambek, Curry (,Howard) to name a few. Church published about the Untyped Lambda Calculus (i.e. Typed Lambda Calculus with only one type) in 1936. This later materialized as Lisp in the late 50's, just younger than Fortran. I'm not sure this is exactly an answer, so just leaving this as a comment. – tolos Feb 05 '19 at 20:25
  • Being able to add an int and a float is not a good test of whether a language has datatypes. OCaml, which has one of the most comprehensive type systems, does not allow adding an int and a float. This is reasonable 1) because it is not clear what the result is supposed to be (an int? a float? depending on order? depending on the result?), and anyway 2) most processors cannot add an int and a float, so it is better to force the programmer to explicitly convert one of the operands first. Go is another one that disallows this. – P Varga Feb 06 '19 at 02:22
  • Are there any high-level languages that don't have data types in the manner you describe? The earliest popular languages Fortran and COBOL have them. Algol and PL/I have them. – Barmar Feb 06 '19 at 07:36
  • @Barmar, There was one language, arguably higher "level" than either Fortran or COBOL, that explicitly rejected the idea of data types. https://en.wikipedia.org/wiki/BLISS BLISS was a structured programming language, inspired by ALGOL 60, in which the only primitive data type was a native machine word. – Solomon Slow Feb 06 '19 at 14:08
  • "Machine language (and Assembly language) don't have the concept of data types, so if you want to add an int and a float variables in Assembly, you have to use the appropriate Assembly instruction that adds an int and a float." – This makes no sense to me. I draw the exact opposite conclusion from you: if you need separate instructions for adding integers and floats, then clearly the language has a concept of data types, since it clearly distinguishes between ints and floats. – Jörg W Mittag Feb 06 '19 at 16:16
  • Types were introduced into set theory in 1908 in order to solve Russell's Paradox, and their application to programming languages is simply an obvious transfer. The Simply-Typed λ-calculus was introduced in 1940, and it clearly is a programming language, even though it almost predates computers. – Jörg W Mittag Feb 06 '19 at 16:20
  • Re: BLISS being typeless. So too was BCPL, which might be said to be a more direct descendant of Algol 60 (Algol -> CPL -> BCPL) – dave Feb 06 '19 at 23:22
  • @JörgWMittag - I take your point, but I'd say the data are not typed, but the instructions are. One is free, for example, to apply integer-add to a floating value on many machines (excluding, of course, those which only do floating-point operations in dedicated floating-point registers). If we declare that any machine that has an integer-add instruction and a floating-add instruction has hardware data types, then the definition seems like it has become useless through ubiquity. Note that the aforementioned typeless BLISS would be a typed language by having floating-point builtins. – dave Feb 06 '19 at 23:32
  • As with all the other "What was the first language..." questions I've seen here, this one is unclear. Undefined terms. As a result, answers and comments all over the map. (This site needs some question criteria.) – Drew Jun 04 '19 at 22:31

6 Answers

43

The premise:

Machine language (and Assembly language) don't have the concept of data types

is not quite correct: a tagged architecture means exactly this, a machine language where the data is tagged with its "type" (even if not quite the types we know from higher-level languages).

Probably the first widespread tagged-architecture computer was the Burroughs B5000 (or 5500?) from the 1960s. But FORTRAN predates this.

Radovan Garabík
  • Except that Fortran did no implied type conversion - at least not prior to the introduction of intrinsics inserted into assignment evaluations in FORTRAN 77. Way after ALGOL 60. (Intrinsics themselves were introduced in Fortran 66.) – Raffzahn Feb 05 '19 at 17:06
  • Again here, how does "which was the first programming language that had data types?" ask anything about type conversions? – Felix Palmen Feb 05 '19 at 18:20
  • I agree with @FelixPalmen in that the posed question is not about type conversions, though type conversions are mentioned in the middle paragraph. However, with respect to whether Fortran had implied type conversion: Fortran II certainly did, across an assignment operator: both float = int expression and int = float expression were allowed. Exponentiation of a float to an integer power was also allowed. Generalized mixed-mode expressions were not allowed, however. – dave Feb 06 '19 at 00:15
  • @another-dave You're right. FORTRAN II did convert on expression assignments. I added that as a footnote. – Raffzahn Feb 06 '19 at 14:58
  • Further: per the manual linked by @FelixPalmen in his answer, the first 704 Fortran had intrinsics - although the term doesn't appear, they are built-in open subroutines, i.e., intrinsics. The relevant ones here are ABSF and XINTF. – dave Feb 07 '19 at 01:14
  • Both FORTRAN II and FORTRAN IV had intrinsics: in-line functions. Mixed-mode expressions were accepted by every FORTRAN IV compiler I ever used. (I learned FORTRAN II initially, but never used it. I used FORTRAN IV starting in 1970, on the CDC 6600 at UT Austin.) Programmers were warned that mixed-mode expressions could have interesting results. – John R. Strohm Feb 07 '19 at 20:26
  • Thirteen comments of being mean to each other‽ Come on, you're all better than that. – wizzwizz4 Feb 08 '19 at 06:47
34

Machine language (and Assembly language) don't have the concept of data types, so if you want to add an int and a float variable in Assembly, you have to use the appropriate Assembly instruction that adds an int and a float.

Erm... this sounds as if you're mixing up the idea of data types and the operations on them. Data types are memory structures; operations are an independent unit. And just because some languages provide operators that can be used with multiple data types doesn't mean they do so in general. For example, in C the sine function is defined as:

double sin(double x)

This means feeding it anything but a double, for example an integer, will screw it up, much like a floating-point instruction (FSIN, say) on an x87 will choke if an integer is handed to it as a parameter.

Long story short, Assembler does have data types and does obey them (*1). For example on a /360 (1964) that would be:

Type             Example    Alignment
Character        C'1234'    Byte
Binary           B'0101'    Byte 
Packed (BCD)     P'1234'    Byte
Decimal          Z'1234'    Byte
Char (hex)       X'1234'    Byte
Integer 16 Bit   H'1234'    Halfword
Integer 32 Bit   F'1234'    Word
Float (32 bit)   E'-12.34'  Word
Float (64 bit)   D'-12.34'  Doubleword
Float (128 bit)  L'-12.34'  Doubleword
Pointer (32 Bit) A(1234)    Word
Pointer (16 Bit) Y(1234)    Halfword

(There are also Q, S and V pointers, but that's extremely high-level stuff :))

Using the wrong data type in an instruction will make the assembler throw a warning, exactly the same way as a C compiler does.

But if you are working with a high level language (for example: C), all you have to do is "mark" one variable with the int keyword and mark the other variable with the float keyword, and then use the '+' operator to add the two variables together, and the compiler will generate the machine language instruction that adds an int and float.

As said before, C does this only for a handful of predefined operators, as a convenience, not in general. C99 resolved this in part by selecting one of several possible functions to fit the operand type(s), and C++ uses overloading. Still, not by default and not everywhere.

But I am wondering, which was the first programming language that had data types?

As shown, it's Assembly :))

Besides that, each and every programming language that was ever designed and implemented for a real machine includes data types. After all, without them it wouldn't operate, would it?

If the question is more about implied type conversion (and/or selection), then again Assembly will be a valid answer, as Assembly offers the same ways as C/C++ to write code that adapts to data types (*2). Now, if you insist on excluding Assembly for whatever ideological reason, then ALGOL 60 (*3) may be a good candidate. The sometimes-cited FORTRAN introduced it quite late (*4) with FORTRAN 77 (in 1978) (*5), using intrinsics (introduced with FORTRAN 66).


*1 - Or better: it can, as many - let's say less proficient - programmers decide to ignore or even disable that feature.

*2 - As usual, the secret lies within meta programming - aka macros - much like you do overloading in C++. Except that Assembler does not even force you to use existing operators.

*3 - In fact, ALGOL is a very nice example of the issues of automatic conversion and how to handle them. Where ALGOL 60 added arbitrary type conversion, like its descendant C, ALGOL 68 later restricted automatic type conversion to only work upward, to avoid program/data errors due to precision loss. So an INT could be implicitly converted to a FLOAT, but a downward conversion had to be explicit.

*4 - Which kept people using explicit conversions way into the '80s, making it hard to update programs even today. A great example of the advantages of a clear, stringent and centralized definition: the ability to switch from single to double or long with just a few changes, instead of debugging huge piles of old code to find each and every explicit conversion.

*5 - As another-dave pointed out in a comment, IBM's Fortran II (of 1958) did automatic type conversion between float and int when assigning the result of an expression (see p. 22, 'Mode of an Arithmetic Statement', in the manual). The expression itself had to be, in all parts, either integer or float, thus it might not fit the case made by the OP.

Raffzahn
  • I'm not the downvoter, but this shows a very inaccurate understanding of types. The C function prototype does not mean that "feeding anything but a double, for example an integer, will screw it up," it means that anything but a double--or something convertible to a double--is forbidden by the type system and won't compile. Likewise, having annotations in a specific assembler that the coder is free to turn off, and which give warnings, not errors, if your code describes a nonsensical operation, is not a static type system in the style of C or ALGOL. – Mason Wheeler Feb 05 '19 at 17:23
  • @MasonWheeler So am I right to understand your point is about the way the warnings are presented and how that was handled afterwards? Correct me, but are these not features of a language, but rather of its development environment? C is a really bad example, as it did allow type mismatches producing bad code - not to mention that several compilers did generate an a.out despite warnings and errors given. So it's as well up to the user (or his policies represented in IDE settings) how to handle it. Not really a difference, right? Not everything we assume being part of a language is :) – Raffzahn Feb 05 '19 at 17:42
  • This seems to be a snarky reply saying "assembly has types if you squint hard enough", which is more of a comment than an answer. – BlueRaja - Danny Pflughoeft Feb 05 '19 at 18:03
  • @BlueRaja-DannyPflughoeft No need to squint at all. It's all there. If people decide not to do it, it can't be made the fault of the language. Besides, the squint is maybe rather on the side of people looking at the puny capabilities of the GNU assembler and judging all the others with that knowledge - much like saying cars don't have air bags by looking at a Yugo. – Raffzahn Feb 05 '19 at 19:42
  • @Wilson Oops. Sorry, my fault, it should read L' ' with the quad word float. It's the type letter that tells the assembler what type it is. Thanks for noting that detail – Raffzahn Feb 06 '19 at 12:28
  • When you say that "Assembly [was the first language to have datatypes]", are you claiming that there is essentially one programming language called "Assembly", or are you claiming that IBM S/360 assembly is the first one? – Omar and Lorraine Feb 06 '19 at 14:01
  • Neither. Assembly is a family of languages with adaptations to various CPUs. They developed with the first machines. And while the /360 Assembler isn't the first to do so, it is a great example, as it is a rather advanced one, making it easy to show various features. Looking at a more primitive (early) one, this may not be as obvious. It also helps that the /360 is a machine with a broad variety of types. – Raffzahn Feb 06 '19 at 14:12
  • @Wilson I guess it would take a separate question (instead of a comment) about type usage in Assembly to better show the details - then again, such a question would not be strictly an RC.SE one, as the techniques used are anything but outdated. – Raffzahn Feb 06 '19 at 14:46
  • @Raffzahn Have you a source for your definition of „data type"? Mine includes operations being part of a data type and not just the memory structure. And of course I didn't come up with my definition myself – it was formed by books, lectures, and also reflected by Wikipedia on data type: https://en.wikipedia.org/wiki/Data_type – BlackJack Feb 06 '19 at 15:32
  • I think this is a valid answer for one definition of data types, and some assemblers, but fails to acknowledge that that is far from the only definition, and far from a universal property of assemblers. The statement "every programming language that was ever designed and implemented for a real machine does include data types" is also provably false, except in the facile sense of "has at least one data type"; a language is both possible and useful that simply allows any operation on any location in memory. – IMSoP Feb 06 '19 at 15:37
  • @BlackJack Hmm, besides that I didn't give a 'definition', but examples, thus not presenting any of 'mine'? Well, take the very first sentence of the wiki article linked: "[it] tells the compiler [...] how the programmer intends to use the data". That's what the above examples do - don't they? Now, if you say that operations are part of the type, then you not only imply a way wider definition than necessary (they are always contextual) but also deny portability of a type as soon as one partner does not offer the same operations (and on a side note, this would better fit CS.SE than RC.SE). – Raffzahn Feb 06 '19 at 15:46
  • @IMSoP Yes, the definition of what the term data type describes may vary greatly between environments. With this being RC.SE and the context of the question, it may be safe to stick to the basics. And again yes, Assemblers can be way different. Too many people are only exposed to the bare-minimum ones, thus underestimating the tools Assembler can offer is common. The above example could have been made with MASM and x86 in mind as well - alas less elegantly - but it's next to impossible on GNU. – Raffzahn Feb 06 '19 at 15:59
  • @IMSoP You do touch a core issue with the second part. For one, yes, a CPU with only one data type (memory location) has only this one, but still it's one. Early CPUs were exactly this way, with locations to store a word, often being FP. Doing non-FP was not provided and required (if possible) quite some code. Later CPUs added specialisation by subdividing these words. A de-specialization happened thereafter - up to the point where all memory was treated as bytes and all other data types were built from them upward. For historic use we should avoid using hindsight of later developments. – Raffzahn Feb 06 '19 at 16:07
  • @Raffzahn Even leaving aside the question of "what is a type", the question asked was which was the first language with types, not which is the simplest. So, yes, there are assembly languages with simple types, maybe even complex types; but are they older than other languages with types? Saying "every computer has at least one type" is pedantically true, but not very helpful; so the question isn't really answered, what was the first language which made a meaningful distinction between types? – IMSoP Feb 06 '19 at 16:07
  • @Raffzahn: By your definition I meant this quote from your answer: „Data types are memory structures. Operations are an independent unit." This doesn't sound like an example but a quite absolute statement, or definition, to me. If I just look at the first sentence of the Wikipedia article… - well, I didn't stop there. It says in the first paragraph that a data type „[…] defines the operations that can be done on the data". – BlackJack Feb 06 '19 at 16:18
  • @IMSoP Are you asking this (older languages) from a theoretical or a practical view? From a theoretical PoV one may be able to find papers predating computers using a typed notation. From a practical PoV an Assembler is always the first language implemented on a computer. The same way CPUs added new data types (and operations), Assemblers made them available in abstract form. At that point it may be worth noting that early HLLs didn't differ much from Assemblers one way or the other - like FLOW-MATIC vs. 1401 Symbolic Programming. – Raffzahn Feb 06 '19 at 16:19
  • @BlackJack Yes, you're right, the key word here is can. And I get the feeling we are at cross-purposes (pun intended). A data type is a way to define what kind of data structure memory holds (at that point) and, as a follow-up, how it is to be interpreted within the program context. It is to support consistent operations and avoid harm. Still, it does not limit what can be done. A pointer to int can still be handled as a pointer to char, or as an integer in itself. There is no underlying mechanic to prevent it. Compilers offer various levels of security measures to be obeyed when doing so. – Raffzahn Feb 06 '19 at 16:33
  • @Raffzahn If you want clarification of the question, ask the person who asked the question. If you look at the other answers on this page, you'll see that both theoretical and hardware answers have been proposed. – IMSoP Feb 06 '19 at 17:01
20

Perhaps Plankalkül (1942-45).

Plankalkül [has ...] floating point arithmetic, arrays, hierarchical record structures [...] The Plankalkül provides a data structure called generalized graph (verallgemeinerter Graph), which can be used to represent geometrical structures. [...] Some features of the Plankalkül: [...] composite types are arrays and tuples [...] The only primitive data type in the Plankalkül is a single bit, denoted by S0. Further data types can be built up from these.

Addition: from Stanford CS-76-562 Early development of programming languages (Knuth & Pardo).

Thus the Plankalkül included the important concept of hierarchically structured data, going all the way down to the bit level. Such advanced data structures did not enter again into programming languages until the late 1950's, in IBM's Commercial Translator. The idea eventually appeared in many other languages, such as FACT, COBOL, PL/I, and extensions of ALGOL 60

There are some details about numbers also:

Integer variables in the Plankalkül were represented by type A9. Another special type was used for floating-binary numbers, namely [...] The first three-bit component here was for signs and special markers -- indicating, for example, whether the number was real or imaginary or zero; the second was for a seven-bit exponent in two's complement notation; and the final 22 bits represented the 25-bit fraction part of a normalized number, with the redundant leading "11" bit suppressed.

Tomas By
  • Except it neither made it to real use nor had any influence. – Raffzahn Feb 05 '19 at 14:46
  • "The only primitive data type in the Plankalkül is a single bit, denoted by S0. Further data types can be built up from these." - so, no. No "built-in" primitive data types. – Radovan Garabík Feb 05 '19 at 14:49
  • The question just says "data types". – Tomas By Feb 05 '19 at 14:53
  • Well, it wasn't ever implemented until 1975. Does a language "exist" simply by having a specification? – Felix Palmen Feb 05 '19 at 15:04
  • @FelixPalmen: yes? – Tomas By Feb 05 '19 at 15:06
  • @TomasBy or maybe no? after all, you can't program in a language that doesn't have any implementations. Sure, it depends on what the OP is actually interested in ;) – Felix Palmen Feb 05 '19 at 15:09
  • You can write programs, and you can perform all the operations theoretically on paper. Have you never taken a programming course at university, and sat an exam? – Tomas By Feb 05 '19 at 15:16
  • @TomasBy So this is "programming"? Please keep your presumptions to yourself, all I'm saying here is there's reasonable doubt your answer is what the OP was looking for. Not that it's wrong or bad or anything. – Felix Palmen Feb 05 '19 at 15:18
  • @TomasBy Felix is onto something here :)) But there's an even more nitpicking point: the question asks for data typeS and conversion between them. If a language has only a single built-in data type, then it might not fit - and for this language conversion simply isn't a thing - or is it? – Raffzahn Feb 05 '19 at 15:24
  • It says "arrays, records, graphs" - all those are data types. And it also says floating point math, so presumably there are both integers and floats in some form. "primitive" is not the same as "built-in". – Tomas By Feb 05 '19 at 15:28
  • @FelixPalmen: writing programs is programming, correct. – Tomas By Feb 05 '19 at 15:29
  • @TomasBy Not really; without a compiler and a machine to run it, it's rather a kind of literature. Possibly of the fantasy kind. – Raffzahn Feb 05 '19 at 16:26
  • @Raffzahn: computer programming is closer to carpentry than it is to writing literature. – Tomas By Feb 05 '19 at 16:38
  • @TomasBy Maybe the way you think of it, but random words in arbitrary order on paper are still ink on paper, not threads in a loom. – Raffzahn Feb 05 '19 at 16:43
  • Whether or not you have built an interpreter or compiler for it, you can still define the intended semantics of the language. – chepner Feb 05 '19 at 17:28
  • Why not? 8008 assembler was designed, and coded in, before any processor capable of running it existed. That is how design works. Fortran, as a notably typed language, was designed and coded in before a Fortran compiler was ever written. This is a chicken/egg issue. The people designing this language did so fully anticipating building a real computer later. This was part of their justification for the money. Duh. – Harper - Reinstate Monica Feb 05 '19 at 18:14
  • @Harper don't you think 30 years is a bit much until a first implementation? Wouldn't anyone just move to something that is implemented? What do you think the OP's intention was? Would you program for an original Turing machine? – Felix Palmen Feb 05 '19 at 18:17
  • @FelixPalmen and how would they have known it would be 30 years? You are arguing "design vs implementation time matters, and implementation should apply here", in which case you should ask your own question, because this question does not say that. – Harper - Reinstate Monica Feb 05 '19 at 18:22
  • @Harper oh really? You know for sure the OP was asking for languages that just existed as a concept? This is impressive .... – Felix Palmen Feb 05 '19 at 18:22
  • Yes. I'm saying the OP decides his question. I'm also saying all languages exist as a concept until they're fully implemented. In 1942 most people did not carry powerful computers around in their pocket that could virtualize language the way modern computers virtualize C. Back in that age, language and hardware were tightly coupled, and hardware was expensive enough to nearly need a state actor behind it. As such, implementation trailed development by much longer than today. – Harper - Reinstate Monica Feb 05 '19 at 18:30
5

Let's get the answer to the question out of the way first. Limiting ourselves to high-level languages designed for electronic digital computers that are not really obscure, the answer is between COBOL and Fortran, depending on which was invented first.

Machine language (and Assembly language) don't have the concept of data types

This is not true. Many assembler languages have multiple different-sized words they can operate on, and some have floating-point types.

But if you are working with a high level language (for example: C), all you have to do is "mark" one variable with the int keyword and mark the other variable with the float keyword, and then use the + operator to add the two variables together, and the compiler will generate the machine language instruction that adds an int and float.

That's called implicit coercion, and while it's true that you need data types to do implicit coercion (so that the compiler knows how to do the conversion), coercion is not synonymous with data types, nor even a necessary consequence of having them. Swift, for example, has no implicit coercion - you always have to convert both operands to the same type when doing arithmetic.

There are three related concepts here that are being confused:

  • types assign a meaning to certain bit patterns in memory. They tell you and the compiler what kind of object a thing in memory is and what you can do with it.
  • type checking is where the compiler or the language runtime checks that an operation is valid for a particular type
  • implicit coercion is where the compiler or language runtime has a rule for automatically converting one type to another if needed.

Almost all computer languages are typed to some degree. Some languages are called "typeless", e.g. C's predecessor, which was called "B"; in reality these languages actually have exactly one type, often the machine word. What really distinguishes languages is not whether they are typed or not but how much type checking is done, when the type checking is done, and what happens when mismatched types are found.

Let's look at some examples:

One of the biggest complaints about Javascript is that its type system is very weak. This is not really true. When a program is running, the interpreter always knows exactly what type every object is. The problems with Javascript occur because type checking is done at run time (this is called "dynamic typing", compile time type checking is called "static typing") and if you perform an operation on an object of an incompatible type, Javascript will try to coerce it into a compatible type, sometimes with surprising results.

C is relatively strongly typed, with static type checking. This was not always the case: in the pre-ANSI days, not all conversions between types were checked. Perhaps the most egregious issue was that the compiler didn't check assignments between pointers and ints. On many architectures you got away with it because int and someType * were the same size. However, if they were not (as with my Atari ST Megamax C compiler), assigning a pointer to an int would lose bits, and hilarity would ensue.

The trend today seems to be towards statically typed languages but with type inference. Type inference allows the programmer to omit the type specification if the compiler can infer the type of the object from the context. For example, in Swift:

func square(_ x: Int) -> Int
{
    return x * x
}

let a = square(3)

defines a function square that takes an Int and returns an Int, and then applies it to 3 and assigns the result to a. The compiler infers the type of a from the return type of the function, and the type of the literal 3 from the type of the function's parameter. In C, I would have to declare the type of a, although C does have limited inference for literals.

Type inference seems to be a new trend although, as with all things in Computer Science, the concept probably dates back decades. Statically typed languages are as old as high level languages.

JeremyP
  • 11,631
  • 1
  • 37
  • 53
  • "Type inference seems to be a new trend although ... the concept probably dates back decades." Wikipedia claims the Hindley-Milner principal type algorithm goes back to 1969. Certainly Haskell had type inference from day 1, Haskell dates back to 1990, and Haskell is loosely based on Miranda, which dates to 1985... It seems real language implementations have been doing this for a while. – MathematicalOrchid Feb 07 '19 at 14:11
  • @MathematicalOrchid I tend to think of everyone after 1980 as being new - that was when I first came in contact with real computers. I am aware that type inference has been around in functional programming languages for a long time but as a popular trend for mainstream languages, it is quite new. – JeremyP Feb 08 '19 at 09:54
  • To be clear, I wasn't disputing that the trend is new, I was confirming that the original idea itself is quite old. – MathematicalOrchid Feb 08 '19 at 13:22
5

There's a lot of heated discussion about what the meaning of 'programming language that had data types' might be. In the absence of clarification from the OP, here's my opinion.

Definitions:

'Programming language' - any language in which programs are written. For the purposes of this answer, I'm restricting this to languages which are implemented 'soon' after design, for some vague value of 'soon' (I don't exclude languages which have programs written before the implementation). I wish to exclude Plankalkül for this answer, work of genius though it may be, simply because it did not become known to the world until after data-typing became commonplace. Until implementation, it's a theoretical idea, not a programming language - though perhaps as a workaday programmer I am prejudiced.

'Language with data types' - I think the bedrock requirements here are that the language defines more than one type, and there be some way to indicate what type a quantity has. I include explicit declaration, implicit typing (by denotation for literals, initial letters for variables, or weird sigil schemes), and runtime determination.

Lastly, if a language is said by its population of programmers to be 'untyped' then I think we should agree with those programmers. This means BLISS is typeless even though it has builtins that will treat a machine word as holding a floating-point value.

Having established my frame of reference, I say the answer is "early FORTRAN" (1954). FORTRAN is normally considered to have been born in 1957, but this survey of early programming languages by Knuth shows, on pages 62-63, an early implementation in which the familiar convention (variables beginning with I through N are integers, all others are reals) was already in place.

dave
  • 35,301
  • 3
  • 80
  • 160
  • Knuth & Pardo 1976, yes, if you check the list on p. 1, then Plankalkül is #1 and FORTRAN is #11. – Tomas By Feb 07 '19 at 01:44
  • @TomasBy - you're referring to the order of language appearance, right? I'm inclined to not consider Plankalkül for the purpose of this question since it was not implemented. As far as I can tell, no implemented language until Fortran had the data typing we're looking for. (I was quite disappointed that Glennie's Autocode didn't) – dave Feb 07 '19 at 02:08
  • Will do later. I admit it's a slippery slope since as observed, people program in a language before it's implemented (extreme case: compiler bootstrapping). However, feels like it was a paper exercise, even if not intended that way. I'll do a little more reading first. Thanks for the note. – dave Feb 07 '19 at 13:01
  • 1
    @IMSoP - I laid my cards on the table! – dave Feb 09 '19 at 01:15
1

Not the first, but Algol 60 deserves honorable mention. It had strong typing, and type mismatches caused compiler errors instead of automatic conversions.

The typing system of Algol 60 was better than that of Fortran or COBOL.

Incidentally, there was a period of four years when major languages were launched: 1957 Fortran; 1958 Lisp; 1959 COBOL; and 1960 Algol.

Walter Mitty
  • 6,128
  • 17
  • 36
  • Agreed, but there was that weird thing of the specification part (if I recall the terminology correctly) being optional for formal procedure parameters. Many implementations required it. But if you don't know the parameter data types (without exhaustive analysis of every potential execution path!) how can you compile the code for the procedure, in the general case? Hmm, maybe this should be a new question! – dave Feb 07 '19 at 13:05
  • I have forgotten too much Algol to follow your comment. Suffice it to say that Algol was a step in that direction. – Walter Mitty Feb 07 '19 at 13:16
  • Oh, Algol was a long way ahead of its contemporaries. I'm just commenting on what seems like a weird oversight to me. Per Revised Report, real procedure foo(a, b) begin foo := a + b end; is legal and complete. There is no type specification for a and b. – dave Feb 07 '19 at 23:38
  • To be fair to Algol, I think we'd have to refer to the first published version Algol 58 (id est, 1958). The better known Algol 60 is the revised version (1960) but the concepts of "the algorithmic language" were published in 1958 and established back in 1957. – NimbUs Oct 20 '19 at 12:57
  • My pairing of years with languages is subject to correction. The development of languages is sometimes a multi year process. – Walter Mitty Oct 20 '19 at 13:14