17

I recently learned about the existence of COMIT, which was the first string processing language. It's very old (1957), but it was used a lot in the first 10 years it was around, mainly in academic research. Eventually, it fell into obscurity.

It seems pretty cool, and I'd like to be able to play with it a bit more. It would help me improve my answer about COMIT, too.

Note that there are numerous things with the same name, but these seem to be unrelated to the COMIT language.

This paper describes a bit about how COMIT programs were run:

The programming of the compiler-interpreter by the MIT Computation Center Staff is well underway and may be completed by the time of the meeting.

I'm looking for anything (namely software) that will allow me to run COMIT programs. At this point I don't have any other requirements, as I have found nothing so far and I don't want to limit my options.

If no software is available, is there another way? I was thinking it could be fruitful to try the academic route. Most of what I've read about the language comes from research papers (none of which were written in the last 40 years, unfortunately). The authors obviously were able to run their COMIT programs, and it would be ideal for future researchers to have the ability to run COMIT programs too, even if for no other reason than to verify the original research.

Laurel
  • 2
    I don't think there is any way of running a COMIT program without a re-implementation of the language, which might be hard (considering the lack of information). However you might amuse yourself by answering this question - we are waiting for an answer in COMIT. I've been looking at doing one for about 6 months! – Brian Tompsett - 汤莱恩 May 31 '16 at 19:37
  • 2
    @BrianTompsett-汤莱恩 I wonder if I could just cheat and write something that translates to another language. The Lisp II COMIT feature could be particularly promising, if I can find a compiler for it. – Laurel May 31 '16 at 20:01

4 Answers

21

Old question, but: I've just shipped an interpreter for a large subset of COMIT. Here it is.

http://www.catb.org/esr/comit/

Full documentation and a pretty good suite of regression tests are included.

Some routing commands, subroutines, and subscripts are not yet implemented. This is mainly because it's not easy to tell from the manual what the designers' intent was.

ESR
9

Here's a quick, dirty and probably buggy implementation of COMIT in Haskell. The COMIT programmers' reference manual seems to be paywalled (I'm looking at you, ACM!), so I used the description in An introduction to COMIT programming. Numeral subscripts, shelves etc. are not implemented, and I don't know how the interpreter should behave in corner cases (like * A + $ + $ + B = 2 *).
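To make that corner case concrete: with two adjacent $ in the left half, there are several ways to divide the constituents found between A and B, and the right half 2 keeps only what the first $ matched. A minimal sketch that lists the candidates follows. adjacentAny is a hypothetical helper, not part of the interpreter below, and it says nothing about which candidate COMIT itself would pick.

```haskell
-- Hypothetical helper: every way two adjacent "$" could share the
-- constituents that lie between the literals A and B.
-- Each pair is (what the first $ matches, what the second $ matches).
adjacentAny :: [a] -> [([a], [a])]
adjacentAny between = [splitAt k between | k <- [0 .. length between]]
```

For a workspace A X Y B, adjacentAny ["X", "Y"] gives three candidates, so the rewritten workspace could be empty, X, or X Y depending on the rule chosen. As far as I can tell from reading it, matchShortest in the code below settles on the first candidate, where the first $ matches nothing.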

There's no parser (yet), so instead of LABEL A + $1 + $ = 2 + B NEXT, you need to write ("LABEL", [LLit "A", Repeat 1, Any], [Match 2, RLit "B"], "NEXT") etc.
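Until there is a real parser, a small helper can build these tuples from the simplified one-line notation used above. This is only a sketch under several assumptions of mine: fields are separated by whitespace, the first word is the label, the last word is the go-to, = separates the two halves, and + separates constituents. Real COMIT syntax is richer than this, and constituents containing spaces (such as " BILL" in the example below) cannot be written this way. parseRule, leftElem and rightElem are hypothetical names. The data types are repeated from the code below (with Eq added) so that the block stands alone; drop them if you paste this into the same file.

```haskell
import Data.Char (isDigit)

-- Types repeated from the interpreter so this block is self-contained.
data LElem a = LLit a | Repeat Int | Any deriving (Show, Eq)
data RElem a = RLit a | Match Int deriving (Show, Eq)
type Label = String
type Rule a = (Label, [LElem a], [RElem a], Label)

-- Parse one rule written as: LABEL left-half = right-half GOTO
-- Returns Nothing when the line has no "=" or is too short.
parseRule :: String -> Maybe (Rule String)
parseRule line =
  case words line of
    (label : rest@(_ : _)) ->
      let goto = last rest
          body = init rest
          (lhs, rhs) = break (== "=") body
      in case rhs of
           ("=" : rhs') ->
             Just (label, map leftElem (operands lhs), map rightElem (operands rhs'), goto)
           _ -> Nothing
    _ -> Nothing
  where
    -- "+" only separates constituents, so it is dropped.
    operands = filter (/= "+")

-- "$" is Any, "$n" is Repeat n, anything else is a literal.
leftElem :: String -> LElem String
leftElem "$" = Any
leftElem ('$' : ds) | not (null ds), all isDigit ds = Repeat (read ds)
leftElem w = LLit w

-- A bare number refers to a matched constituent, anything else is a literal.
rightElem :: String -> RElem String
rightElem w
  | not (null w), all isDigit w = Match (read w)
  | otherwise = RLit w
```

With this loaded, parseRule "LABEL A + $1 + $ = 2 + B NEXT" yields Just ("LABEL",[LLit "A",Repeat 1,Any],[Match 2,RLit "B"],"NEXT"), which is the tuple shown above wrapped in Just.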

The output is just a list of the workspace states. The workspace itself is a list of strings. Here are two examples from the manual:

ws1 = [" BILL", ",", " THEY", " SAY", ",", " IS", " RETIRED"]
ex1 = [("*", [LLit ",", Any, LLit ","], [], "*")]

*Main> run ex1 ws1
[[" BILL"," IS"," RETIRED"]]

and a 3-line program using the marker technique:

ws2 = ["A", "B", "C", "A", "B"]
ex2 = [("*", [Any], [RLit "*Q", Match 1], "*"),
  ("FIND", [LLit "*Q", Any, LLit "A"], [Match 2, Match 3, Match 1], "FIND"),
  ("*", [LLit "A", LLit "*Q"], [], "*")]

*Main> run ex2 ws2
[["*Q","A","B","C","A","B"],
 ["A","*Q","B","C","A","B"],
 ["A","B","C","A","*Q","B"],
 ["A","B","C","A","*Q","B"],
 ["A","B","C","B"]]

Code:

data LElem a = LLit a | Repeat Int | Any deriving (Show)
data RElem a = RLit a | Match Int deriving (Show)

type Label = String

type Workspace a = [a]
type Matching a = [[a]]

type Rule a = (Label, [LElem a], [RElem a], Label)

splits (x:xs) = ([], (x:xs)) : [(x:ys,zs) | (ys,zs) <- splits xs]
splits [] = [([],[])]

-- If the lhs doesn't start with Any, the zeroth match takes the part of the
-- workspace before the actual match. Otherwise, it's empty.
match :: (Show a, Eq a) => [LElem a] -> Workspace a -> [Matching a]
match rs@(Any:_) ws = do { m <- match' rs ws; return ([]:m) }
match rs ws = matchShortest rs ws

match' :: (Show a, Eq a) => [LElem a] -> Workspace a -> [Matching a]
match' [] ws = [[ws]] -- keep end of workspace after match
match' (LLit l:rs) (w:ws)
  | l == w    = do { m <- match' rs ws; return ([w]:m) }
  | otherwise = []
match' (Repeat n:rs) ws
  | length ws >= n = do { m <- match' rs (drop n ws); return ((take n ws):m) }
  | otherwise      = []
match' [Any] ws = [[ws,[]]] -- $ at end matches everything 
match' (Any:rs) ws = matchShortest rs ws
match' rs ws = [] -- fail

matchShortest rs ws = take 1 [xs:m | (xs, ys) <- splits ws, m <- match' rs ys]

-- keep first and last part of match, apply rhs to everything else
replace :: [RElem a] -> Matching a -> Workspace a
replace rs m = (head m) ++ concatMap f rs ++ (last m) where
  f (RLit w) = [w]
  f (Match k) = m !! k

next :: Label -> [Rule a] -> [Rule a]
next _ [] = []
next n rss@((n',_,_,_):rss') 
  | n == n'   = rss
  | otherwise = next n rss'

exec :: (Show a, Eq a) => [Rule a] -> [Rule a] -> Workspace a -> [Workspace a]
exec _ [] ws = []
exec rss css ws = ws' : exec rss css' ws' where
  (_, lhs, rhs, n):_ = css
  m = match lhs ws
  (ws', css') = case (m, n) of
    ([], _)  -> (ws,   tail css)   -- match failed
    (_, "*") -> (ws'', tail css)   -- next line
    (_, _)   -> (ws'', next n rss) -- goto label
  ws'' = replace rhs (head m)

run rss ws = exec rss rss ws

COMIT feels very similar to sed or awk, except that it works on a list of words ("constituents") instead of a list of characters, and so is obviously geared towards representing grammars and term rewriting systems.

dirkt
  • 3
    I found a link to the Reference Manual. It says public domain, so you should be able to see it. This is a fantastic start, so maybe I should learn some Haskell (I am completely unfamiliar with its syntax). – Laurel Jun 01 '16 at 15:19
5

It might not be the easiest way to go about it, but you could possibly try emulating the original IBM 700 / 7000 series mainframe that the code was written for.

There is an emulator for System/370, Hercules (the Hercules System/370, ESA/390, and z/Architecture Emulator), and System/370 has built-in backwards compatibility for the 7000 series. Getting all of that set up properly might be a great deal of work, though.

mnem
3

AIM-51, "METEOR: A LISP Interpreter for String Transformations" (Daniel G. Bobrow, April 1963), provides a detailed description and complete source code (about 250 lines) for a COMIT interpreter that takes input as LISP S-expressions. It's written for LISP 1.5. A probably slightly newer and much more readable version of the source code is on page 259 of The Programming Language LISP: Its Operation and Applications, though I think it's longer and I'm not sure how different it is.

How useful this is I'm not sure:

  • Getting LISP 1.5 up and running on an IBM 7090 emulator (or a Univac 1100 emulator) seems daunting, at best.
  • It might be portable without too much difficulty to LISP 1.5's direct successor, MACLISP. Maybe. Though you'd still need a (probably easier to handle) PDP-10 emulator. (Or you could try Multics.)
  • Perhaps it could be ported to a more modern LISP, though these tend to be substantially different to LISP 1.5.

All that said, the source itself might provide useful information about the workings of COMIT, particularly for other folks here writing their own COMIT interpreters.

cjs