
I found the old C compiler from V6. Although its source looks a little different from good, idiomatic C to the modern eye, it evidently uses directives such as #include and #define, yet I cannot see where it implements them.

Was the preprocessor an external program at that time? If so, is it lost to obscurity, or am I missing something in this codebase?

user3840170
Omar and Lorraine
  • Not sure, but wasn't cpp, the preprocessor, an external tool anyway? – Raffzahn Mar 09 '21 at 10:49
  • @Raffzahn Yes, but nowadays (clang/gcc) the external cpp tool is only provided for compatibility and you should invoke cc -E on those compilers instead, otherwise some expansions might not succeed properly. – ljrk Mar 09 '21 at 18:40
  • My eyes! It seems to perform complete manual parsing. Must have been horrible to write. BTW, which compiler was used to compile this C compiler for the first pass? – Jean-François Fabre Mar 09 '21 at 20:36
  • @Jean-FrançoisFabre It was made in incremental steps from a B compiler, actually. I'm pretty sure there's a question on this site about that. – Omar and Lorraine Mar 10 '21 at 06:50

1 Answer


In V6, the C preprocessor is part of cc, the compiler driver; see the expand() function in cc.c. The directory you linked to contains the source code to the two passes of the C compiler, c0 and c1 (and their floating-point variants, fc0 and fc1), and the optional optimiser, c2. The passes are driven by cc, whose source code is available in the s1 directory.

As far as I can tell, the external preprocessor was introduced in V7.

In both cases, the preprocessor can be invoked using cc -P. The V6 and V7 cc(1) manpages provide more detail. Dennis M. Ritchie’s The Development of the C Language paper gives this context:

Many other changes occurred around 1972-3, but the most important was the introduction of the preprocessor, partly at the urging of Alan Snyder, but also in recognition of the utility of the file-inclusion mechanisms available in BCPL and PL/I. Its original version was exceedingly simple, and provided only included files and simple string replacements: #include and #define of parameterless macros. Soon thereafter, it was extended, mostly by Mike Lesk and then by John Reiser, to incorporate macros with arguments and conditional compilation. The preprocessor was originally considered an optional adjunct to the language itself. Indeed, for some years, it was not even invoked unless the source program contained a special signal at its beginning. This attitude persisted, and explains both the incomplete integration of the syntax of the preprocessor with the rest of the language and the imprecision of its description in early reference manuals.

The “special signal at its beginning” is # as the very first character; this test can be seen in V5’s cc.c (and in V6’s).
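To make the shape of that test concrete, here is a minimal sketch in modern C. It is not the V5 or V6 code; the function name wants_preprocessing and its return convention are my own assumptions, and the real cc.c is written in a much older dialect.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not taken from cc.c: returns 1 if the very first
 * byte of the file is '#', 0 if it is anything else (or the file is
 * empty), and -1 if the file cannot be opened. */
int wants_preprocessing(const char *path)
{
    assert(path != NULL);

    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    int first = getc(fp);   /* EOF on an empty file, which fails the test */
    fclose(fp);

    return first == '#';
}
```

A driver built along these lines would only run the expansion step when the function returns 1, so a file whose first directive appears on a later line would be passed to the compiler untouched.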

(I’m linking to Diomidis Spinellis’ Unix history repo because I’m familiar with it, but the repo you’ve found also has cc in s1.)
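For a feel of how little the “simple string replacements” in Ritchie’s description require, here is a rough sketch of parameterless #define handling in modern C. It is my own illustration, not a transcription of expand(): the names define_macro and expand_line, the fixed table sizes, and the choice to match whole identifiers are all assumptions.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define MAX_MACROS 32   /* arbitrary limits chosen for this sketch */
#define MAX_NAME   32
#define MAX_BODY   128

struct macro { char name[MAX_NAME]; char body[MAX_BODY]; };
static struct macro macros[MAX_MACROS];
static int nmacros;

/* Record a "#define NAME BODY" line. Returns 0 on success, -1 if the
 * line is not a #define or the table is full. An absent BODY is stored
 * as the empty string. */
int define_macro(const char *line)
{
    char name[MAX_NAME], body[MAX_BODY];

    body[0] = '\0';
    if (nmacros == MAX_MACROS)
        return -1;
    if (sscanf(line, "#define %31s %127[^\n]", name, body) < 1)
        return -1;
    strcpy(macros[nmacros].name, name);
    strcpy(macros[nmacros].body, body);
    nmacros++;
    return 0;
}

/* Find the body of the macro whose name is the len bytes at name. */
static const char *lookup(const char *name, size_t len)
{
    for (int i = 0; i < nmacros; i++)
        if (strlen(macros[i].name) == len &&
            strncmp(macros[i].name, name, len) == 0)
            return macros[i].body;
    return NULL;
}

/* Copy in to out, replacing every identifier that names a macro with
 * its body. Replacements are not rescanned, and string literals are
 * not treated specially. Returns 0, or -1 if out is too small. */
int expand_line(const char *in, char *out, size_t outsz)
{
    size_t o = 0;

    assert(in != NULL && out != NULL && outsz > 0);
    while (*in != '\0') {
        const char *src = in;
        size_t len = 1;

        if (isalnum((unsigned char)*in) || *in == '_') {
            /* Consume the whole word; words starting with a digit are
             * numbers and are copied through unchanged. */
            while (isalnum((unsigned char)in[len]) || in[len] == '_')
                len++;
            const char *body =
                isdigit((unsigned char)*in) ? NULL : lookup(in, len);
            in += len;
            if (body != NULL) {
                src = body;
                len = strlen(body);
            }
        } else {
            in++;
        }
        if (o + len >= outsz)
            return -1;
        memcpy(out + o, src, len);
        o += len;
    }
    out[o] = '\0';
    return 0;
}
```

After define_macro("#define NBUF 15"), calling expand_line on "int buf[NBUF];" yields "int buf[15];". Matching whole identifiers keeps a name such as NBUFS from being rewritten; whether the earliest preprocessor was that careful is not something this sketch claims.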

Leo B.
Stephen Kitt
  • It was internal, then external, and nowadays internal again? – dave Mar 09 '21 at 13:01
  • Internal, then external, then it depends :-). GCC still has an external cpp but it’s not used when compiling. – Stephen Kitt Mar 09 '21 at 13:25
  • (Once cpp became available, other tools started using it; xmkmf in particular needs a working cpp.) – Stephen Kitt Mar 09 '21 at 13:33
  • Did it really start internal though? Everything about Unix and the preprocessor specs suggests it was conceived as an external tool. – Euro Micelli Mar 09 '21 at 14:29
  • There is no cc.c file in the cited source. And the comments in c0.c et al. hint at each one being a separate program. – vonbrand Mar 09 '21 at 15:40
  • Given that C evolved at around the same time as the invention of pipes, I am not surprised that an early version of C would have an internal preprocessor. – DrSheldon Mar 09 '21 at 15:54
  • @vonbrand cc.c is part of the same V6 source, even though it’s not in the same directory; it drives c0, c1, and c2 (look for char *pass in cc.c). Look at the run file in the same directory as c0.c to see how the passes are constructed. – Stephen Kitt Mar 09 '21 at 16:52
  • The history I always learned was that, early on, the preprocessor was a separate program, and was only executed if the driver script saw a '#' as the very first byte of the source file. But what version that applied to, I can no longer say. By the time I wrote C, the preprocessor always ran (though it might still have been separate). – dave Mar 09 '21 at 17:02
  • @another-dave the # test is visible in cc.c as far back as V5 at least, but it is part of cc there (and in V6). – Stephen Kitt Mar 09 '21 at 17:11
  • @another-dave I think the argument boils down to this: It is external if you view cc as a "driver" that is separate from the actual compilation process, which in the early days consisted of passes run as different programs. It is internal if you consider "the C compiler" to include cc. – texdr.aft Mar 09 '21 at 17:17
  • @texdr.aft from the user’s perspective though the C compiler was cc; I don’t think the separate passes were run directly by users (or at least, by most users). – Stephen Kitt Mar 09 '21 at 17:18
  • @StephenKitt Right. I suppose my point was that cc and the compiler proper were not the same executable. – texdr.aft Mar 09 '21 at 17:28
  • @texdr.aft yup, I get that ;-). – Stephen Kitt Mar 09 '21 at 17:32
  • The ability to combine multiple source files in a compilation unit would seem to have obvious value, and I find it curious that it would have required using an external program to produce a file with all of the inputs concatenated. I wonder if there would have been any problem with saying that if, e.g., a command-line argument starts with a +, the compiler will treat the remainder of that argument as a source file name, but when it reaches the end of that file it will open the file specified by the next command-line argument and treat it as a continuation of the same compilation unit? – supercat Mar 09 '21 at 20:13
  • I find it very strange that c0, c1 and c2 are in a completely different directory from the actual compiler executable thing itself. That kind of shenanigan would not get past code review today would it? – Omar and Lorraine Mar 09 '21 at 20:33
  • @OmarL: Unless I'm confused about what you're talking about, the idea is they're not in your PATH as they're not user-invokable in any normal sense. – Joshua Mar 09 '21 at 23:44
  • @Joshua I mean, the source code is in a completely different directory – Omar and Lorraine Mar 10 '21 at 06:48