
By the early 1980s, C was using 0x as a prefix to indicate integer literals expressed in hexadecimal, e.g., 0xCAFE. This did not exist in B as of 1972, though B did support octal integer literals via a 0 prefix.

Where and when was this 0x prefix first used?

chicks
cjs

1 Answer


"This did not exist in B as of 1972, though B did support octal integer literals via a 0 prefix."

True, but B's predecessor, BCPL, had a notation of # for octal and #x for hexadecimal. So the idea 'jumped' a generation.


The history of C is one of removing features to be added later on again ... for better or worse.

  • CPL (Combined Programming Language) was developed in Cambridge and London as a simplified Algol for system programming (*1). It was implemented in 1965/66 for the EDSAC 2, Atlas and IBM 7094.

  • BCPL (Basic CPL) was, as the name suggests, a simplified version of CPL, first implemented in 1967 on an IBM 7094 (*2).

  • B was again a simplified BCPL (*3, *4), made to fit the PDP-7 in 1969.

  • C was developed via NB (New B) for the PDP-11 from 1972 onward, adding (back in) features.

CPL used # to denote octal constants. There was no real need for hex, as all machines it was implemented for had word sizes divisible by 3 and used 6-bit characters for output.

BCPL evolved over a (quite short) time. While # was used from the start to mark an octal number, it was soon supplemented by #b for binary, #x for hexadecimal and even #o for octal. These additions were time and implementation specific, but at least #x quickly became a standard.

B dropped #x again, along with the whole # notation (*5), in favour of a preceding zero, simplifying the parser. Since the PDP-7's 18-bit word size is a multiple of 3, octal was the only machine-specific notation needed (*6, *7).

C in turn was developed for the PDP-11, a 16-bit machine for which many machine-dependent constants come naturally in hex, not to mention the 8-bit byte and ASCII's segmentation into groups of 32. Reintroducing a hex notation was now considered useful, this time staying with the idea of a preceding zero.


*1 - CPL is really worth a look. While it already has many of the basics of C, like pointers to words as the basic element, it also contains several features that seem quite unconventional today. For example, a = bc doesn't assign the variable bc to a, but the product of b times c. Multi-character identifiers had to start with capital letters. This might well be the origin of the much-liked camel case.

*2 - And a Model 35 TTY, which at that time had no curly or square braces, thus digraphs were added. Similarly, there was no backslash, so * was used for special characters in strings.

*3 - Plus some funny switches: Algol 68 wrote augmented assignment as +:=, B wrote it as =+, and C later switched to +=.

*4 - Maybe some PL/I added.

*5 - It similarly dropped # as part of comparison operators.

*6 - It's always worth keeping in mind that the 8-bit byte and the corresponding hex notation were introduced only shortly before, with the IBM /360.

*7 - An interesting side note may be that CTC also used a preceding zero for octal constants in their assembler for the 1970 Datapoint 2200. So while I know of no direct relation, it's quite interesting that they came up with the same solution at the same time as Thompson did.

Raffzahn
  • BCPL is really worth a look, with odd 4-byte aligned pointers that you had to multiply by 4 to get the real address... nightmare :) – Jean-François Fabre Aug 16 '20 at 07:37
  • # is also used in Microsoft BASIC for hex numbers – Jean-François Fabre Aug 16 '20 at 07:39
  • My understanding was that CPL was never completely implemented (and it was not 'simplified' compared to Algol 60 -- the reason for non-implementation was complexity). Surely if it had been implemented, Richards would not have needed to invent BCPL. – dave Aug 16 '20 at 12:51
  • I certainly disagree that hex was useful for the PDP-11. Apart from dial strings for X.25, I never once wrote a hex digit in my time as a PDP-11 programmer (of the kernel-mode persuasion), and I bet you won't find a hex digit in DEC PDP-11 manuals. – dave Aug 16 '20 at 12:55
  • CPL overview paper - click on PDF link. "Based on concepts of Algol 60". Implementations were under way on Titan (Cambridge) and Atlas (Imperial); note that Titan was a sort of stripped-down Atlas. CPL indicated octal by a leading '8' -- boldfaced to distinguish it from '8', probably underlined when on papertape. – dave Aug 16 '20 at 13:05
  • @another-dave I was waiting for that PDP-11 comment of yours :)) For CPL, isn't that reference contradicting the claim that CPL was never implemented? So far I do not see the point you want to make. Also, the leading bold 8 was implementation specific. Before going into more of a hasty dialog, maybe read it in Richards' own words: How BCPL evolved from CPL – Raffzahn Aug 16 '20 at 13:15
  • The reference says that the language was "being implemented". It was never finished. BCPL was started because of difficulty implementing CPL. Originally the intent was to use BCPL to bootstrap a CPL implementation. My points? Firstly, that many details you give are wrong, and secondly, that I disbelieve the story that you extract by sequencing the languages thus. – dave Aug 16 '20 at 13:18
  • Re: hex and C. The first ed. of K&R mentions 3 machines other than PDP-11 (32-bit Interdata, S/370, and something else I forget). I'm inclined to think hex notation was added for one of those. – dave Aug 16 '20 at 13:31
  • @another-dave First, the reference cited is from 1963, right? They were still working on it, that's why I used the 1966/67 date, when the implementations were done. Next, no language is ever 'finished'. Otherwise no new versions/standards would ever come. Last, did you read the Richards paper I linked? "Integer constants were sequences of decimal digits, or represented in binary, octal or hex by preceding the appropriate sequence of digits by #b, #o or #x." I would take him as authoritative. And yes, hex was added back in C, to my understanding before any port to /370. – Raffzahn Aug 16 '20 at 13:37
  • @another-dave PS: The third one mentioned is the Honeywell 6000, a 36-bit machine using 9-bit chars. It should be noted that the first edition of the book is from 1978, so half a decade after their first versions of NB/C. In this 1978 paper Ritchie tells a bit about ports and what was done. – Raffzahn Aug 16 '20 at 14:36
  • @another-dave Yes, octal was DEC's standard for everything, including the PDP-11. I wrote a lot of assembly language for that machine, data acquisition stuff for plasma fusion and astronomy, using octal. But octal was clumsy on an 8/16 bit machine: I would have preferred hex if the assembler had supported it. – John Doty Aug 16 '20 at 15:08
  • Ah, yes. BCPL - I remember it well from Oxford in the 1970s. I think the original motivation was to write operating systems as well as CPL compilers. Its simplicity was that it was completely typeless, so anything was possible. Its basic problem was that it assumed that the unit of addressing was 1, which works fine on some machines but not on others. CPL (Cambridge/Combined/Common/Chris's [Prof. Strachey] Programming language) had good ideas but was too complex and ran out of impetus. I looked at the compiler but can't remember anything about it. – Peter Aug 17 '20 at 15:58
  • BCPL was kicking around Leeds in the mid 1970s, too. I think it was mainly the domain of CS post-grads, I think on a CTL Modular 1. – dave Aug 18 '20 at 23:26
  • @another-dave presumably 'CS' refers to Chris Strachey. His team wrote the OS/1 operating system for Modular 1 at Oxford in about 1969. – Peter Aug 23 '20 at 15:53
  • "CS" as used by me above means Computer Science. I don't know what OS was running on the Mod 1 at Leeds, I only ever did a little Logo programming on it. It could well have come from Oxford though. Strachey and Stoy's OS was called OS/6. – dave Aug 23 '20 at 16:39
  • Teletype 35, and 33, did have square brackets_US/braces_GB, but not curly, although I can imagine such special chars might not have translated correctly and reliably to and/or from IBM. They also did not have lowercase letters, a far greater impediment to use with C and Unix, and I believe B. – dave_thompson_085 Sep 06 '20 at 20:57