Document format for a DOS word processor where control commands begin with a dot

Question

I managed to read several old 5.25″ floppies. They contain old documents from a retired lawyer.

We would like to read them properly for the sake of the memories, but I cannot find out which software was used, and he does not remember either.

The text files all look alike: the first five lines contain the following characters, and then the text follows:

.op
.mt1
.hm1
.pl88
..
†††††----!----------!------------------------------------------------
†††††    DÆ† ArmandÔ Sanche˙ D°az¨† mayoÚ dÂ edad¨† casado¨† GraduadÔ ç
ç
†††††Social¨† vecinÔ dÂ est· Ciuda‰ coÓ domiciliÔ · efectÔ dÂ notifiç
ç

Does anyone know how I can find out what software to use to read these files?

Kind regards, David.

What kind of answer do you expect? Do you just want to know which word processor was used? Or do you want to extract the contents? If so, with or without the formatting? Please [edit] your question and clarify. – the busybee, Apr 20 '23 at 05:56
@DavidSosa: You can try installing WordStar itself to see whether the files open properly. Various versions of the program are available at https://winworldpc.com/product/wordstar. I have not used any of those files and do not plan to, simply because I do not have any WordStar files. I got the link from this wiki page: http://justsolve.archiveteam.org/wiki/WordStar, which I found by searching for "import from wordstar"; that search returns quite a lot of results. – virolino, Apr 20 '23 at 12:55

score 33 · Answer 1 · answered Apr 20 '23 at 02:19


Almost definitely WordStar or a compatible program such as NewWord.

Dot commands at the top. From the WordStar 3.3 manual:
- .op - omit page numbers
- .mt1 - margin top 1 line
- .hm1 - heading margin 1 line
- .pl88 - page length 88 lines (likely 11" x 8 lines/inch)
- .. - next line is a comment line and not printed

The 8th bit of the last letter of each word is set high. Just clear the bit to get regular text.
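
As a rough illustration of both points, here is a minimal Python sketch that is not taken from the answer itself. It assumes the file is plain 7-bit text with the high bit used only as a flag, and that any line starting with a dot is a dot command. The `DOT_COMMANDS` table holds only the five commands listed above, and all function names are made up for this example.

```python
# Meanings as listed above (quoted there from the WordStar 3.3 manual).
DOT_COMMANDS = {
    ".op": "omit page numbers",
    ".mt": "margin top (lines)",
    ".hm": "heading margin (lines)",
    ".pl": "page length (lines)",
    "..": "comment (not printed)",
}


def clear_high_bit(data: bytes) -> bytes:
    """Clear bit 7 of every byte, leaving plain 7-bit ASCII."""
    return bytes(b & 0x7F for b in data)


def split_document(data: bytes):
    """Return (dot_command_lines, body_lines) for the raw bytes of a file."""
    text = clear_high_bit(data).decode("ascii")
    commands, body = [], []
    for line in text.split("\n"):
        line = line.rstrip("\r")
        # Assumption: a dot in column 1 always marks a dot command.
        if line.startswith("."):
            commands.append(line)
        else:
            body.append(line)
    return commands, body


def describe(command: str) -> str:
    """Look up a dot command such as '.pl88' in the small table above."""
    if command.startswith(".."):
        return DOT_COMMANDS[".."]
    name, arg = command[:3].lower(), command[3:].strip()
    meaning = DOT_COMMANDS.get(name, "unknown")
    return f"{meaning}: {arg}" if arg else meaning
```

For example, `describe(".pl88")` gives `page length (lines): 88`. A real converter would also have to deal with WordStar's in-line control characters, which this sketch leaves untouched.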

The only thing that doesn't make sense to me is the 5 characters at the beginning of each regular text line. That may be an artifact of the copy/paste process - i.e., actually some other character.


manassehkatz-Moving 2 Codidact


Maybe they are "let's center the table using tabs, spaces or some other character"... – virolino Apr 20 '23 at 06:20

The † is Unicode U+2020, which is the same as two spaces (0x20 0x20) run together as a 16-bit character. My guess is there has been some incorrect conversion between character sets along the way. – JeremyP Apr 20 '23 at 10:43
@virolino according to other answers/comments, that seems to be how Wordstar worked – Esther Apr 20 '23 at 16:47

camelccc · Answer 2 · 2023-04-19T16:20:20.757

19

Looks to me like a set of dot commands found at the beginning of a WordStar file: .mt1 sets the top margin, .hm the heading margin. Try WordStar 3.

Unfortunately, none of the modern word processors I know of has an import filter for WordStar.

edited Apr 19 '23 at 16:20

answered Apr 19 '23 at 16:14


Also evident from the use of high-ASCII characters as the last character of each word. WordStar only supports 7-bit ASCII; the 8th bit was used as a flag on the last character of words, presumably to indicate that the whitespace following the word could be adjusted (spaces or line breaks added or removed). WordStar didn’t format the paragraph on the fly; it literally changed the spaces and line breaks in the file’s content. – Euro Micelli Apr 19 '23 at 16:33

There is something weird, though, about the specific high-ASCII characters shown in the post. They don’t match what you would expect, even accounting for typical codepage conversion snafus between 437, 1252, 850, or even 860 at a stretch (none of the snafus I tried, anyway). – Euro Micelli Apr 19 '23 at 16:55

Not for the faint of heart, but I did find this on converting WS to MS Word: https://ataridogdaze.com/tech/wordstar-convert.html#:~:text=In%20a%20file%20browser%2C%20go,appended%20to%20the%20file%20name. – jwh20 Apr 19 '23 at 20:39
If he just wants the text, this is probably best. Otherwise, if he wants the documents' formatting, I'd set up WordStar in a VM and print to a file. Apple LaserWriter output is fairly generic PostScript, which can then be converted to PDF. – camelccc Apr 19 '23 at 20:54

In reference to @jwh20’s suggestion: if the OP doesn’t have Perl, installing Perl just for this purpose might be overkill. But the task is ultimately simple: “Set the 8th bit of every character to 0”. This can easily be reimplemented in just about any language the OP is comfortable with. – Euro Micelli Apr 19 '23 at 22:19
If I remember correctly, very old versions of MS Office could read Wordstar and other old formats. But no promise made :) By old, I mean Win XP era or earlier. Probably earlier. – virolino Apr 20 '23 at 06:19
FWIW, 'dot commands' are common to a whole family of text processors, starting from RUNOFF on CTSS and descending through almost every OS DEC ever wrote. So it would not surprise me if there were programs other than Wordstar using the convention. – dave Apr 20 '23 at 11:30

@virolino I'd like to know which version you mean. I know there are claims on the internet, but Word for Windows 1.00 definitely did not; I remember trying to bring WordStar files into it. Neither did Word 2.00. Office 2003 definitely does not, and that was before MS kindly started removing things they deemed not good for us, and I think I remember trying Office 4.3. OpenOffice does not; I just checked. There are various converter filters on the internet. I honestly can't think of another application that was once so common and yet is such a pain to import. – camelccc Apr 20 '23 at 12:36
@camelccc: As I said, it is just a memory from old times that some Office versions running on Win 3.1(1) or Win 95 had an option to import from WordStar. But I never used such an option (I never had WordStar docs), and I cannot guarantee that it really was WordStar in the list of options (maybe it was some similar name). After all, it is ~20-year-old info. If you have already tried those versions, then you have better information than I do. – virolino Apr 20 '23 at 12:44

Retail versions of WordPerfect (up to at least version 11) should be able to import WordStar files and then re-save them in something that most modern word processors can read (plain text, RTF, WordPerfect, Word). In really old DOS versions the function isn't in the WordPerfect executable itself but rather in a standalone converter called CONVERT.EXE. – Psychonaut Apr 20 '23 at 13:05
@virolino - MS Word 3 for DOS had a CONVWS tool, but it discarded most of the options after the dot commands. It's effectively useless. – scruss Apr 20 '23 at 18:31

score 19 · Accepted Answer · answered Apr 20 '23 at 18:36


WordTsar, a WordStar clone, might be able to read the files. It is open-source and cross-platform. Compatibility depends on the version of WordStar used to write the files.

It does implement an impressive number of WS's Dot Commands.


scruss


LOL, what a great name for a clone! – Jörg W Mittag Apr 21 '23 at 12:20

Awesome, thank you very much everyone for your valuable help. I am impressed by the power of the community of this forum. And many thanks to you, @scruss for giving me the direct solution. Have a lovely weekend =) – David Sosa Apr 21 '23 at 20:52

I'm really glad it worked for you, @DavidSosa. WordStar is one of the few legacy file formats that OpenOffice will not convert, so it's good to know that WordTsar can fill a gap – scruss Apr 22 '23 at 20:50

Tanner Swett · Answer 4 · 2023-04-21T11:18:48.423

10

manassehkatz's answer mentions that the 8th bit of the last letter of each word has been set high, so we should be able to clear the high bit in order to get the original text. However, it's not quite obvious what the character encoding of the text in the question is.

Investigation reveals that the text has been decoded as Mac OS Roman or a similar encoding. If we encode the text as Mac OS Roman, clear the high bits, and then decode it again, most of the characters now make sense.

The character † (dagger) changes to a space, and the character ç (c with cedilla) changes to a carriage return, so I've just left those as-is. If I replace all of the other characters, the result is:

††††† D.† Armando Sanchez D!az,† mayor de edad,† casado,† Graduado ç ç
†††††Social,† vecino de esta Ciudad con domicilio a efecto de notifiç ç

I have no explanation for why we end up with "D!az" instead of "Diaz" or "Díaz." Aside from that, this seems like perfectly normal Spanish text: "Mr. Armando Sánchez Díaz, of legal age, married, Graduado Social, resident of this City with address for purposes of notifi..."
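
The round trip described in this answer could look roughly like this in Python. This is a sketch, not the exact procedure used for the result above: it assumes Python's built-in `mac_roman` codec matches the encoding the text was viewed in, and `undo_mac_roman_view` and its `keep` parameter are names invented for the example.

```python
def undo_mac_roman_view(garbled: str, keep: str = "") -> str:
    """Re-encode text as Mac OS Roman, clear bit 7 of each byte, read as ASCII.

    Characters listed in `keep` are copied through unchanged. A character
    that does not exist in Mac OS Roman raises UnicodeEncodeError.
    """
    out = []
    for ch in garbled:
        if ch in keep:
            out.append(ch)
            continue
        byte = ch.encode("mac_roman")[0]  # one character maps to one byte
        out.append(chr(byte & 0x7F))      # drop the 8th bit
    return "".join(out)


if __name__ == "__main__":
    # Part of the first text line from the question.
    line = "DÆ† ArmandÔ Sanche˙ D°az¨†"
    # Leave the dagger and the c with cedilla as-is, as done in the answer.
    print(undo_mac_roman_view(line, keep="†ç"))
```

Called without `keep`, the same function turns each † into a space and each ç into a carriage return.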

edited Apr 21 '23 at 11:18

answered Apr 20 '23 at 22:07


If this is truly a WordStar file then there is no real concept of character sets - it is simply 7-bit ASCII. – manassehkatz-Moving 2 Codidact Apr 20 '23 at 22:49

Yes, but the route from 7-bit ASCII “with a sometimes-set 8th bit” to Unicode, necessarily via one or more encoding conversions, can be very tortuous indeed. – Euro Micelli Apr 21 '23 at 01:15
@manassehkatz-Moving2Codidact a Spaniard or Mexican using WordStar in their own country would have their own character set, no? – RonJohn Apr 21 '23 at 04:37
@RonJohn: Probably. But if WordStar was limited to 7-bit ASCII, then the codepage selected for Spanish (probably CP850/Latin-1) wouldn't be relevant: the way codepages worked was that they could alter the screen appearance of characters in the 0x80–0xff range, i.e. exactly the character range that WordStar couldn't handle in the first place. – Schmuddi Apr 21 '23 at 06:23

Why do you guess that this is Mac OS Roman? WordStar was never released on a Macintosh system, so why would that be a plausible guess? – Schmuddi Apr 21 '23 at 06:28

@Schmuddi It's not just plausible, it's nearly certain. Note that I'm not claiming that WordStar did anything related to Mac OS Roman; what I'm saying happened is that, in the year 2023, when the asker opened up this file to view it, they viewed it using either Mac OS Roman or a similar encoding. And given the way that WordStar uses the high bit, and the letters that those Spanish words end with, the encoding must be one where the characters ÆÔ˙ÚÂ·‰Ó are all encoded the way that they are in Mac OS Roman. – Tanner Swett Apr 21 '23 at 11:12
@TannerSwett: Thanks for the reply – now I get where you're coming from, and after checking the position of the characters in more detail, I agree: Mac OS Roman is a fairly safe guess. – Schmuddi Apr 21 '23 at 19:58
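
The reasoning in this exchange can be checked mechanically. The sketch below is an illustration under assumptions, not something posted in the thread: it takes the eight characters named above, the plain letters they should become once the 8th bit is cleared (read off the Spanish words), and tries the encodings mentioned in this discussion using Python's built-in codecs. `CANDIDATES` and `matching_codecs` are made-up names.

```python
# Encodings mentioned in the thread; extend the list to test others.
CANDIDATES = ["cp437", "cp850", "cp860", "cp1252", "mac_roman"]


def matching_codecs(seen: str, expected: str, codecs=CANDIDATES):
    """Return the codecs under which `seen`, with bit 7 cleared, is `expected`."""
    target = expected.encode("ascii")
    matches = []
    for name in codecs:
        try:
            raw = seen.encode(name)
        except UnicodeEncodeError:
            continue  # this codec cannot even represent the characters
        if bytes(b & 0x7F for b in raw) == target:
            matches.append(name)
    return matches


if __name__ == "__main__":
    # "D.", "Armando", "Sanchez", "mayor", "de", "esta", "Ciudad", "con"
    print(matching_codecs("ÆÔ˙ÚÂ·‰Ó", ".ozreadn"))
```

A match only shows that the encoding is consistent with the sample. Other encodings that share these eight positions would pass as well, which fits the "or a similar encoding" caveat.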
In Wordstar, diacritics were represented by . which is 08,39. The unicode conversion may have broken that? – david Apr 22 '23 at 23:36
