Why do PC DOS kernel files have the COM extension, even though they are not executable as COM files?

Question

The PC DOS kernel is stored in files named IBMBIO.COM and IBMDOS.COM. Although they have the COM extension like executable files, neither of these files could actually be run from the command line, and attempting to do so would crash the system. To prevent such crashes, the kernel files were given ‘hidden’ and ‘system’ attributes (apparently specifically invented for the purpose), so that the command interpreter would not try to execute them. (In some versions of DR-DOS, this was instead solved by simply making the files print version information and exit when launched from the command line.)

Given that those files are not regular executables, why were they given the COM extension in the first place? Why go to the trouble of inventing ‘hidden’ and ‘system’ attributes? Why not instead give them the SYS extension, like MS-DOS, (earlier versions of) DR DOS and even 86-DOS did?

Unless the person who named the files shows up here, I don't see how we can get a definitive answer. People often name files inappropriately. On the other hand, I'd guess that the OS boot files were named before the OS got the ability to run commands, since that particular OS egg needs to come before its chicken. — dave, May 03 '20 at 16:20
Maybe, but I’m willing to accept speculative answers, of which two kinds seem the most probable: that despite these complications the name served some purpose or that it was a haphazardly chosen default of some toolchain that stuck for the sake of compatibility. Evidence for either should be possible to find in public information. — user3840170, May 03 '20 at 16:43
There could be logic in OS that makes decisions or does checking based on the filenames, or at least the extensions. There's certainly no shortage of files with .com extensions that are internally at least MZ format files. Or it could be they needed binaries without the limitations that .com binaries have. — hippietrail, May 03 '20 at 17:03
If I remember correctly, MS-DOS gave the files the SYS extension whereas IBM DOS gave them the COM extension. — No'am Newman, May 04 '20 at 05:47
Arguably, for early versions of PC/MS-DOS at least, .COM is a more accurate extension than .SYS: IBMBIO.COM and IBMDOS.COM are memory images, loaded as-is by the boot sector (their contents are moved around by SYSINIT during boot, but that’s nothing a .COM file couldn’t do either). .SYS implies a device-driver-like structure, which IBMBIO.COM/IO.SYS don’t have. Execution starts at the beginning of IBMBIO.COM/IO.SYS, just like a .COM file, except that there is no PSP before the image. — Stephen Kitt, May 04 '20 at 14:01
PC-DOS 1.0 didn’t support installable drivers; those only arrived with 2.0, I believe. Therefore nothing used the .SYS extension yet; I don’t even think that CONFIG.SYS existed back then. So there was no reason to use .SYS (or any other extension). Hiding the file and calling it .COM seems pretty reasonable. — Euro Micelli, May 04 '20 at 15:07
Good point @EuroMicelli, CONFIG.SYS etc. were indeed added in 2.0. — Stephen Kitt, May 04 '20 at 15:46
For what it’s worth, Compaq’s version of MS-DOS 1.25 named these files IOSYS.COM and MSDOS.COM (but there are other versions of MS-DOS 1.25 with IO.SYS and MSDOS.SYS, so those names didn’t appear with version 2). — Stephen Kitt, May 04 '20 at 18:14
On the 8″ disk image of 86-DOS 1.14 available from WinWorldPC, the DOS kernel is contained in a file named 86DOS.SYS. The extension clearly existed before DOS 2.0. On the 1.0 image, there is no separate file with the kernel; it’s written directly to the first track of the image, so the claim that ‘the OS boot files were named before the OS got the ability to run commands’ seems to be false. — user3840170, Sep 05 '20 at 06:35

score 9 · Answer 1 · edited May 07 '20 at 16:59

Given that those files are not regular executables, why were they given the COM extension in the first place?

They are executable files. They are loadable binary images. In so far they are exactly like COM files, except, when loaded, they are not loaded at offset 0100h, after a prepared PSP, and started with CS:IP as segment:0100h, but segment:0000h. Programming/assembling sources works exactly like with a COM program, except where a 'regular' COM file source starts with an ORG 0100h directive, BIOS and DOS code is assembled with ORG 0000h instead (*1).
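To make the difference concrete, here is a minimal sketch in Python (not DOS code) of the real-mode address arithmetic involved. The load segment 1000h is an arbitrary value picked for the example, and the helper names are made up for illustration only.

```python
# Illustrative model of 8086 real-mode addressing, not actual DOS code.

PSP_SIZE = 0x100  # a COM program image is preceded by a 256-byte PSP


def linear(segment: int, offset: int) -> int:
    """Real-mode linear address: segment * 16 + offset."""
    return (segment << 4) + offset


def label_offset(origin: int, position_in_image: int) -> int:
    """Offset the assembler assigns to a label for a given ORG value."""
    return origin + position_in_image


seg = 0x1000  # hypothetical load segment, chosen only for this example

# Regular COM file: ORG 0100h, first byte of the image sits after the PSP.
com_entry = linear(seg, label_offset(0x0100, 0))

# Kernel-style image: ORG 0000h, first byte sits at the start of the segment.
kernel_entry = linear(seg, label_offset(0x0000, 0))

print(hex(com_entry), hex(kernel_entry), com_entry - kernel_entry)
```

The only difference between the two entry points in this model is the 256 bytes reserved for the PSP.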

[Insert:] Thanks to a hint by Stephen Kitt pointing out the availability of the original sources, it looks as if 86-DOS really used this, as the source starts like this (line 109):

ORG  0
PUT  100H

So while the address base is 0000h, the assembler output is moved to 0100h. This means the resulting file may have included some 256 bytes in front. But then again, there must have been some kind of merger utility, as with PC-DOS 1.0 IBMBIO and IBMDOS were not loaded as files, but as one continuous blob of 10 KiB from sector 8 onward.
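As a rough sanity check on those numbers, the following Python sketch works out how many sectors such a blob covers. It assumes 512-byte sectors (the size used on the PC's 5.25-inch floppies) and takes the start sector number from the text as-is; the variable names are only illustrative.

```python
SECTOR_SIZE = 512        # assumption: bytes per sector
BLOB_SIZE = 10 * 1024    # 10 KiB, as stated above
FIRST_SECTOR = 8         # start sector, taken from the text as-is

# Number of consecutive sectors the blob occupies, and the last one read.
sector_count = BLOB_SIZE // SECTOR_SIZE
last_sector = FIRST_SECTOR + sector_count - 1

print(sector_count, last_sector)
```

Under these assumptions the boot code has to read 20 consecutive sectors, without consulting the directory or FAT at all.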

86-DOS did not care about files at all. The directory was only checked to see if the entries were there, not to use any of the meta information. It wasn't until PC-DOS 2.0 that the file system really got observed.

So for all development and testing purposes they could just as well be assembled as COM files. While it doesn't matter for the very first version, as soon as a basic system is bootable, further development and debugging can be done in situ, using the (preloaded) OS environment and debugger. All that is necessary to do so is changing the ORG statement to 0100h before assembling.

Why not give them the SYS extension, like MS-DOS and (earlier versions of) DR DOS did?

Sure, it would have been possible - and would as well have made a lot of sense. Then again, it's always easy to see something as obvious in hindsight. Also, we all know how impossibly unreasonable users can be - who would have thought about them unhiding these files and trying to execute them? There is no sane reason to do so.

Well, I guess the MS-DOS team had to learn this lesson and it took about a year:

August 1980: SCP ships 86-DOS 0.1
September 1980: Microsoft licences 86-DOS for 'some' customer
April 1981: SCP finishes and ships 86-DOS 1.0 using the .COM type
August 1981: MS delivers 86-DOS 1.14 as PC-DOS 1.0 to IBM, both using .COM file types for IBMBIO and IBMDOS (*2)
May 1982: (MS/86)-DOS 1.24 is released by IBM as PC-DOS 1.1, both still with .COM types
June 1982: Microsoft releases MS-DOS 1.25, the first version to be used with (mostly) PC compatible machines (*3), as well as the first version where the .SYS file type was used.

IBM never followed that switch, and some manufacturers of compatible systems seem to have continued with a .COM type (IOSYS.COM and MSDOS.COM in the case of Compaq).

In the end, it's as with many 'why' questions, as Another_Dave already mentions:

"Unless the person who named the files shows up here, I don't see how we can get a definitive answer."

*1 - Today many would not care and make it the same, but back then, saving 256 bytes of RAM was a big thing. After all, the first 86-DOS could boot in as little as 12 KiB, leaving space for user code in systems with as little as 16 KiB of RAM. That is something the PC no longer supported, as here the BIOS loads the boot sector to 07C00h (that's 1 KiB below 32 KiB). Thus 32 KiB was the minimum system for DOS.

*2 - BTW, the whole wigwag is a bit more complex, as IBMBIO contains two parts:

the machine dependent BIOS interface and
the machine independent SYSINIT code, controlling the boot process.

In case of an IBM PC, the machine dependent BIOS interface is rather small and mostly a collection of device drivers, as it uses the ROM-BIOS. For non IBM machines it contained the whole BIOS.

*3 - I still have, somewhere, an original shrinkwrapped DOS 1.25 for Columbia Data Products' MPC.
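The address arithmetic in footnote *1 can be checked with a few lines of Python. This is only a sketch: the 512-byte boot sector size is an assumption added here, and the constant names are illustrative.

```python
KIB = 1024
BOOT_LOAD_ADDR = 0x7C00   # load address given in footnote *1
BOOT_SECTOR_SIZE = 512    # assumption: standard 512-byte boot sector

# First address past the loaded boot sector.
boot_end = BOOT_LOAD_ADDR + BOOT_SECTOR_SIZE

print(BOOT_LOAD_ADDR // KIB)       # whole KiB below the load address
print(32 * KIB - BOOT_LOAD_ADDR)   # distance from load address to 32 KiB
print(boot_end > 16 * KIB)         # True if it cannot fit in 16 KiB of RAM
```

With the boot sector sitting just below the 32 KiB mark, a machine with only 16 KiB of RAM has no memory at that address to load it into.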

We even have the source code to see how the binary was constructed (but not the assembler command, or details of any subsequent processing). — Stephen Kitt, May 05 '20 at 05:59
"they are exactly like COM files, except, when loaded, they are not loaded at offset 0100h, after a prepared PSP, and started with CS:IP as segment:0100h, but segment:0000h" The execution environment for each a normal .COM executable, the BIO file, and the DOS file, is completely different, not only in the origin offset. For instance, the (late) MS-/PC-DOS load protocols always load the BIO file's first 1536 or 2048 bytes to linear 00700h, and pass a first cluster, and a BPB pointed to by ss:bp. For .COM (or .EXE) files there's no BPB, but the entire executable image is loaded. — ecm, May 06 '20 at 22:10
@ecm And your point is? For one, the load address isn't relevant, as it is started with IP==0. But there are other things more important: With DOS 1 the boot sector did not load IBMBIO to 00700h, but 60:0h, there was no BPB pointed to, as none was used and last but not least, it did not load 3 (or 4) sectors, but 20 (10 KiB) starting at Sector 8 of track 0 side 0. Please do your homework before assuming. — Raffzahn, May 07 '20 at 01:35
@Raffzahn: I did specify that I was talking about later MS-DOS versions' protocols. Even so, 70h:0 or 60h:0 are both incompatible to application mode. The point is that you cannot just do this: "as soon as a basic system is bootable, further development and debugging can be done in situ, using the (preloaded) OS environment and debugger", because of the different load protocol of a kernel versus an application. Other than the specifics I already mentioned, the kernel also expects to have control over the entire base memory as given by int 12h. — ecm, May 07 '20 at 05:37
@ecm Just, the question isn't about some later version, but how it came into existence. This 'later version' reliance might as well be the reason for assumptions that are simply not true. So far, none of your 'other points' did hold up. There is no different 'kernel' load protocol. It's simply loading 20 consecutive sectors (8..27 or T1/S0.. T3/S3), like with a COM, which happen to be (at least) these two files. No difference from a load (L) via debug. DOS doesn't care about its location, as it is relocated anyway. Int 12 is not used for memory sizing. That's as well a later addition. — Raffzahn, May 07 '20 at 10:09
@ecm IO hands over a size of 1 in DX to DOS initialisation, which in turn lets DOS scan all memory, after its own memory, for being writable. Your insistence about 'cannot do' sounds like you never tried to get new software running on a new system with nothing other than the target itself and some older (Z80 in this case) system for cross compilation. In 1979 the luxury of a highly developed toolchain and emulators was non-existent. ICE and Logic Analyzers were out of range for most. Long story short, if you want to explain how something started, looking at the end of development isn't helpful. — Raffzahn, May 07 '20 at 10:11
@Raffzahn: Fair about this being applicable to the initial versions of the files, if it is, but it seemed as if you referred to IBMBIO.COM / IBMDOS.COM generally, across their entire history: "They are executable files. They are loadable binary images. In so far they are [...]" -- So you could more precisely refer to the fact that, even if it was sensible for 86-DOS 1.x, it isn't any more for the later versions. It is news to me that the kernel parts could be loaded anywhere, and only used memory starting from that point. A debugger'd still have to ensure it was located below that though. — ecm, May 07 '20 at 11:25
@Raffzahn: "There is no different 'kernel' load protocol. It's simply loading 20 consecutive sectors (8..27 or T1/S0.. T3/S3), [...] which happen to be (at least) these two files." That is a kernel load protocol. And it doesn't match .COM protocol, loading two files (generally) and loading them at origin zero. "No difference from a load (L) via debug." L without a parameter loads .COM files to origin 100h. With a parameter you need to manually allocate or figure out usable space for the load. And to load the *DOS file after the BIO you'd have to use N then L again, with the next address. — ecm, May 07 '20 at 11:30
@ecm Mind to note that I did list the history? It seems as if you're trying very hard to construct claims that have never been made, to prove some points that are either simply wrong or irrelevant. I think it's safe to have confidence in the developer of all these parts to handle it, don't you? I certainly would. Do yourself a favour and check the information given before insisting on issues that do not exist, keeping in mind that it's about how it came to be, not how it ended up many versions later. If it is about some wording, I'm quite fine with improving it - focused on the question — Raffzahn, May 07 '20 at 12:44
