19

As a companion to the question "What was the first programming language to have 'printf'?", which language had the first scanf?

It doesn't have to be literally called scanf, but I am looking for the following:

  • A procedure, subroutine, function, or statement with the purpose of reading textual items from a file or input device.
  • Takes a variable number of parameters, at least one of which specifies the number and format of the items to read.
  • As this is a "first" question, you need to specify the year such a feature was introduced. I am not looking for every language with such a feature.

The C scanf appears to have been introduced sometime between 1972 and K&R first edition in 1978, so that's an upper bound to the answer.

DrSheldon
  • To narrow the dates down a little, in the Unix world scanf first appeared in Mike Lesk’s Portable C Library, which was added as iolib in V6 in 1975. – Stephen Kitt Aug 17 '21 at 09:40
  • Under what conditions does a library count as part of a language? (Honest question, I don't know the answer.) – LarsH Aug 17 '21 at 14:13
  • @LarsH: I would consider a library that is delivered or installed with an implementation of the language to count as part of the language. – DrSheldon Aug 17 '21 at 15:50
  • I don't know enough of the early history of COBOL to make this a definitive answer, but in COBOL you declare the format of a file declaratively in its data description, then you issue a READ statement, and it populates a record whose structure is defined by the data description. Arguably much more robust, though less dynamic, than what C has to offer. It doesn't do it with a variadic function call, but who needs that if you've got rich structured data types? – Michael Kay Aug 18 '21 at 23:26

3 Answers

31

Fortran: October 1956.

See The Programmer's Reference Manual for Fortran on the IBM 704, Chapter 5.

In the question about printf Fortran was explicitly excluded, but I don't see why it fails to meet the criteria of this question. The I/O statement certainly "has a parameter which specifies the format of the items to read," though the definition of the format was separated out, with the obvious advantage that identical format definitions can be used in multiple I/O statements without unnecessary repetition.

The format can be separated out in scanf by specifying it as a variable instead of a constant, of course.
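
A minimal C sketch of that point: the format lives in a variable and is shared by two input calls, much as one labelled FORMAT can serve several READ statements. The names POINT_FORMAT, read_point and read_size are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* One format definition, kept in a variable rather than written as a
   literal at each call site (the names here are illustrative only). */
static const char *const POINT_FORMAT = "%d %d";

/* Parse two integers from text using the shared format.
   Returns the number of items converted, exactly as sscanf reports it. */
int read_point(const char *text, int *x, int *y)
{
    assert(text != NULL && x != NULL && y != NULL);
    return sscanf(text, POINT_FORMAT, x, y);
}

/* A second input call that reuses the same format without repeating it. */
int read_size(const char *text, int *width, int *height)
{
    assert(text != NULL && width != NULL && height != NULL);
    return sscanf(text, POINT_FORMAT, width, height);
}
```

Calling read_point("3 4", &x, &y) converts two items; because the format is ordinary data, it could equally be built at run time before the call.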

Later versions of Fortran, which included character data types, allowed the format specification to be embedded within the I/O statement, if desired.

chicks
alephzero
  • Maybe FORMAT should be disqualified on the requirement that there be an argument that "specifies the number ... of the items to read." :-) If I recall correctly, it's the de facto number of variable/array elements that appear in the READ that determine how many items are read. "Not using all the FORMAT" is ok, and coming to the end of the FORMAT is governed by baroque rules of where to resume scanning. I suspect Algol 68 has similar features. – dave Aug 17 '21 at 14:10
  • @another-dave Well, scanf just replaces the "built-in baroque logic" with user-written baroque logic to loop around the scanf statement. IMO the real genius of the Fortran design was identifying a subset of the general formatting problem which was "good enough" to do something useable almost all of the time. – alephzero Aug 17 '21 at 14:33
  • @alephzero: Were the formatting operations in FORTRAN processed as calls to variadic functions, or would they have required compiler logic to translate what looked like variadic calls into sequences of calls to functions that would each expect a fixed set of arguments? – supercat Aug 17 '21 at 16:11
  • @supercat I don't think a compiler could have converted them into calls with a predefined fixed sequence of arguments, because (1) as another-dave pointed out, re-using the format items follows its own logic (depending on the format string itself) and (2) the number of input items can be read as part of the input data, e.g. 10 FORMAT(3I5) and then READ(*,10) M, (N(J),J=1,M) is valid Fortran. The first input data line contains M and the first two elements of array N; subsequent lines each contain the next three elements of N, ignoring any excess input items on the last line. – alephzero Aug 17 '21 at 17:32
  • ... AFAIK modern Fortran run-time libraries use a function with just two arguments, i.e. two data structures representing that format string and the variables to be input or output. (I don't know the history of how this evolved over time, but note that modern Fortran includes arbitrary user-defined code activated by the FORMAT specification, e.g. to output user-defined data types.) – alephzero Aug 17 '21 at 17:38
  • @alephzero: On a FORTRAN implementation which doesn't have a stack, it would be impossible for a function to accept a variable number of arguments; the only way to have a single function call take care of an arbitrary input/output task would be to use the approach you describe for modern implementations. FORTRAN has, so far as I can tell, always allowed a reference to an arbitrary-sized array to be passed as a single argument, but that's very different from how scanf would have worked in typical early C implementations which could read arguments off the stack without the compiler... – supercat Aug 17 '21 at 19:18
  • ...having to know or care how many words on the stack were in fact function arguments. – supercat Aug 17 '21 at 19:19
  • @supercat: re: "On a FORTRAN implementation which doesn't have a stack, it would be impossible for a function to accept a variable number of arguments" - I don't see this at all. If your calling sequence is, say, a jump-to-subroutine followed by a sequence of addresses of arguments (pass-by-ref), then I can imagine a number of ways to determine the number of actual-args, and adjust the return appropriately. – dave Aug 17 '21 at 19:29
  • @another-dave: I should perhaps have added "without having to be given information about how many arguments there are". An implementation could expect each call instruction to be followed by a zero-terminated list of argument addresses, but that would make it necessary either to include that extra zero word on all function calls and have all functions' return logic scan for it (which would I guess be possible, but impractical), or else to process calls to variadic functions differently from calls to other functions. – supercat Aug 17 '21 at 19:45
  • Re "...obvious advantage that identical format definitions can be used in multiple I/O statements...", you can do this in C scanf too. The format argument is a string, which doesn't have to be a literal string. It can be a string variable (that is, '\0' terminated array of char). You can even write the format to the string on the fly. – jamesqf Aug 17 '21 at 22:45
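
The READ(*,10) M, (N(J),J=1,M) example from the comments above can be approximated in C, which also shows the kind of user-written looping that scanf-style code needs. This is only a sketch under stated assumptions: the input is a string of newline-separated records, each field is exactly five characters wide (standing in for I5), and Fortran's detailed blank-handling and format-reversion rules are not reproduced. The names read_counted_array, parse_field, FIELD_WIDTH and FIELDS_PER_RECORD are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum { FIELD_WIDTH = 5, FIELDS_PER_RECORD = 3 };  /* stands in for 3I5 */

/* Convert one fixed-width field of up to FIELD_WIDTH characters.
   Assumption: padding is leading blanks only. */
static int parse_field(const char *p, size_t avail)
{
    char buf[FIELD_WIDTH + 1];
    size_t len = avail < FIELD_WIDTH ? avail : FIELD_WIDTH;

    memcpy(buf, p, len);
    buf[len] = '\0';
    return (int)strtol(buf, NULL, 10);
}

/* Read a count M followed by M values, three fields per record.
   The count itself comes from the data, so the number of items is not
   known until run time. Returns M, or -1 if the input runs out or M
   does not fit in max_n. */
int read_counted_array(const char *input, int *n, int max_n)
{
    int m = -1;          /* -1 means the count has not been read yet */
    int stored = 0;
    const char *line = input;

    assert(input != NULL && n != NULL);
    while (*line != '\0' && (m < 0 || stored < m)) {
        size_t line_len = strcspn(line, "\n");

        for (int f = 0; f < FIELDS_PER_RECORD; f++) {
            if (m >= 0 && stored >= m)
                break;   /* excess fields on the last record are ignored */

            size_t offset = (size_t)f * FIELD_WIDTH;
            size_t avail = offset < line_len ? line_len - offset : 0;
            /* A field missing from a short record is treated as zero. */
            int value = avail ? parse_field(line + offset, avail) : 0;

            if (m < 0) {
                m = value;
                if (m < 0 || m > max_n)
                    return -1;
            } else {
                n[stored++] = value;
            }
        }
        line += line_len;
        if (*line == '\n')
            line++;      /* the format is reused for the next record */
    }
    return (m >= 0 && stored == m) ? m : -1;
}
```

The loop over records is exactly the logic that Fortran's READ supplies implicitly and that a C programmer has to write around scanf by hand.
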
14

I think the FORTRAN answer is valid within the terms of this question, but for orthogonality with the 'printf' answer, I think we need to mention Algol 68 again.

Algol 68 provided 'readf', for formatted input from standard in, and 'getf', for formatted input from a specific file channel.

The names given here are from the Revised Report, published 1974, but the functionality was present in the original 1968 Report under different names, 'in' and 'inf'.

In both Reports, these facilities were implemented as procedure calls, with format as a mode defined in the language, with its own denotation (i.e., a format was not just a string).

dave
  • Here's a link to the definition of format in the Revised Report: https://jmvdveer.home.xs4all.nl/en.post.algol-68-revised-report.html#A34 – texdr.aft Aug 17 '21 at 21:32
6

Perhaps the real question shouldn't be which language "had" the first scanf function, but rather which language first allowed variadic functions to be written in user code. That would have been possible in B, and I am unaware of any earlier languages that would have supported it.

Other earlier languages allowed programmers to supply an arbitrary number of arguments for items to be read, but the compiler would have known how many arguments were being passed, and generated code to handle that many arguments. In Pascal, for example, ReadLn(A, B, C); would have been equivalent to Read(A); Read(B); Read(C); ReadLn; What would have made the behavior of scanf different is that a B or C compiler could (and typically would) have been completely agnostic to the fact that the function needed special handling to accommodate variable numbers of arguments, and would have processed calls to scanf no differently from calls to any other function.
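
To illustrate what "variadic functions written in user code" means, here is a minimal sketch of a scanf-like routine. It is written with modern C's <stdarg.h>, not with whatever mechanism early B or C code used, and mini_scan with its one-letter specification string is a hypothetical design, not any library's API. The compiler treats the call like any other; only the specification string tells the function how many pointers to fetch.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdlib.h>

/* Hypothetical mini-scanf. Each character of spec names one item to
   read from input: 'd' stores into an int *, 'f' into a double *.
   Returns the number of items successfully converted. */
int mini_scan(const char *input, const char *spec, ...)
{
    va_list args;
    int converted = 0;

    assert(input != NULL && spec != NULL);
    va_start(args, spec);
    for (; *spec != '\0'; spec++) {
        char *end;

        if (*spec == 'd') {
            long value = strtol(input, &end, 10);
            if (end == input)
                break;                  /* nothing converted: stop */
            *va_arg(args, int *) = (int)value;
        } else if (*spec == 'f') {
            double value = strtod(input, &end);
            if (end == input)
                break;
            *va_arg(args, double *) = value;
        } else {
            break;                      /* unknown specification character */
        }
        input = end;                    /* continue after the item just read */
        converted++;
    }
    va_end(args);
    return converted;
}
```

A call such as mini_scan("12 3.5 -7", "dfd", &a, &b, &c) passes three pointers, while mini_scan("42", "d", &a) passes one, and the function definition is the same in both cases.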

supercat
  • That sounds like a reasonable -- but separate -- question. – DrSheldon Aug 17 '21 at 15:52
  • I am unaware of languages prior to B that allowed functions to actually be passed arbitrary variable numbers of arguments. Instead, compilers for those languages would translate each construct that looked like a variadic function call into one or more calls to functions that each expected a fixed number of arguments. The reason I focus on the "user code" question is that compilers for C and likely B as well would have come bundled with an implementation of scanf well before such a function was formalized as part of a standard library, but I think it's fair to back-date language support... – supercat Aug 17 '21 at 16:04
  • ...to the time when a compiler vendor demonstrated for users how to achieve such functionality. Having to copy in a blob of source text is less convenient than just #include <stdio.h>, but the reason the function became part of the Standard library is that programmers would often include a definition of the function and then treat it as part of the language, rather than writing purpose-specific functions to handle input parsing. – supercat Aug 17 '21 at 16:09
  • I agree that it's a significant development in computer languages, but it merits a separate question. – DrSheldon Aug 17 '21 at 16:12
  • @DrSheldon: It seems strongly enough related that if it were asked as a separate question, it might get dupe-hammered with a reference to this one. – supercat Aug 17 '21 at 16:20
  • @supercat Lisp supported it since the beginning. – texdr.aft Aug 17 '21 at 20:32
  • @supercat Don't forget that in the 1950s (and 60s) nobody was too bothered about standards and portability. The focus was on getting something done. There was a lot of truth in the "real programmers don't eat quiche" opinion that (particularly in Fortran) "every data structure is just a special case of a one-dimensional array, so what exactly is the problem that needs solving here?" The concept of scanf and printf functions doesn't make much sense until you have file systems which abstract away the underlying hardware, which early versions of Fortran did not. – alephzero Aug 17 '21 at 21:02
  • @alephzero: Of course, today there's no consensus understanding as to whether the C or C++ Standard is intended to avoid characterizing as UB any construct that programmers might need to use, or to avoid defining the behavior of any construct that some compilers might have difficulty processing in a fashion consistent with sequential program execution, thus leaving a large number of constructs which should be expected to behave identically on most general-purpose implementations, but whose behavior cannot be defined by the Standard. – supercat Aug 17 '21 at 21:16
  • I still remember when I first encountered C after having learned Pascal that the ability to write printf-like functions in user code without special treatment in the compiler convinced me to drop Pascal and stay with C :-) This is really a powerful concept. – Hans-Martin Mosner Aug 19 '21 at 14:02
  • @Hans-MartinMosner: It is a powerful concept, but maintainers of free compilers completely lose sight of the Spirit of C, especially the principle "Don't prevent the programmer from doing what needs to be done". In Ritchie's Language, applications could implement their own type-agnostic memory management code. Even if the Standard wouldn't require that implementations allow programs to recycle memory without releasing and reacquiring it via free/malloc, the ability to do so is useful for many purposes, and implementations intended to be suitable for such purposes should support such code. – supercat Aug 20 '21 at 20:06