What's the best way to create a parser in C++ from a file with grammar?
- What format is the 'file with grammar' in? – CB Bailey Dec 03 '09 at 22:38
- http://stackoverflow.com/questions/1669/learning-to-write-a-compiler is the canonical question around here on how to write compilers and interpreters. Many good links there. For a hand-built recursive descent approach, look at the Crenshaw tutorial. – dmckee --- ex-moderator kitten Dec 03 '09 at 22:58
6 Answers
I'd suggest using Boost.Spirit.
You also might want to have a look at these links:
- I would suggest not using `boost::spirit` if you plan on a compiler of any decent size - compile times for parsers built with `boost::spirit` tend to get very large, making even very small changes a PITA (because the whole thing is done with templates) – a_m0d Sep 02 '10 at 23:18
It depends heavily on the grammar. I tend to like recursive descent parsers, which are normally written by hand (though it's possible to generate one from a description of the grammar).
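To make "recursive descent, written by hand" concrete, here is a minimal sketch. The grammar (simple arithmetic) and the `ExprParser` class are assumptions chosen for illustration, not something from the question; the point is only that each grammar rule becomes one function that calls the functions for the rules it refers to.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical example grammar (an assumption, not from the question):
//   expr   := term   (('+' | '-') term)*
//   term   := factor (('*' | '/') factor)*
//   factor := NUMBER | '(' expr ')'
class ExprParser {
public:
    explicit ExprParser(std::string text) : src_(std::move(text)) {}

    // Parse the whole input and return the value of the expression.
    double parse() {
        double value = expr();
        skipSpaces();
        if (pos_ != src_.size()) throw std::runtime_error("unexpected trailing input");
        return value;
    }

private:
    // expr := term (('+' | '-') term)*
    double expr() {
        double value = term();
        for (;;) {
            if (accept('+')) value += term();
            else if (accept('-')) value -= term();
            else return value;
        }
    }

    // term := factor (('*' | '/') factor)*
    double term() {
        double value = factor();
        for (;;) {
            if (accept('*')) value *= factor();
            else if (accept('/')) value /= factor();
            else return value;
        }
    }

    // factor := NUMBER | '(' expr ')'
    double factor() {
        if (accept('(')) {
            double value = expr();
            if (!accept(')')) throw std::runtime_error("expected ')'");
            return value;
        }
        // accept() above has already skipped any leading whitespace.
        std::size_t start = pos_;
        while (pos_ < src_.size() &&
               std::isdigit(static_cast<unsigned char>(src_[pos_]))) {
            ++pos_;
        }
        if (pos_ == start) throw std::runtime_error("expected a number");
        return std::stod(src_.substr(start, pos_ - start));
    }

    // Consume the character c if it is next in the input (after whitespace).
    bool accept(char c) {
        skipSpaces();
        assert(pos_ <= src_.size());
        if (pos_ < src_.size() && src_[pos_] == c) {
            ++pos_;
            return true;
        }
        return false;
    }

    void skipSpaces() {
        while (pos_ < src_.size() &&
               std::isspace(static_cast<unsigned char>(src_[pos_]))) {
            ++pos_;
        }
    }

    std::string src_;
    std::size_t pos_ = 0;
};
```

With this sketch, `ExprParser("1+2*3").parse()` evaluates the multiplication first, because `term` sits below `expr` in the grammar. A real parser would more likely build a tree than compute a value on the fly, and would report error positions.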
If you're going to use a parser generator, there are really two good choices: Byacc and Antlr. If you want something that's (reasonably) compatible with yacc, Byacc is (by far) your best choice. If you're starting from the beginning, with neither existing code nor experience that favors using something compatible with yacc, then Antlr is almost certainly your best bet.
Since it's been mentioned, I'll also talk a bit about Bison. I'd avoid Bison like the plague that it is. Brooks's advice to "Plan to throw one away" applies here. Robert Corbett (the author of Byacc) wrote Bison as his first attempt at a parser generator. Unfortunately, he gave it to GNU instead of throwing it away. In a classic case of marketing beating technical excellence, Bison is widely used (and even recommended, by those who don't know better) while Byacc remains relatively obscure.
Edit: I hate to do it, but since it's also been mentioned, I'll also comment on Boost.Spirit. While this may be the coolest example of template metaprogramming around, it has a couple of problems that lead me to recommend against trying to put it to serious use.
- Compile times with it can get excruciating -- 10 minutes is common, and a larger/more complex grammar can take even longer (assuming it doesn't crash the compiler).
- If you make any mistake at all, it can and frequently will produce insanely long error messages that are virtually impossible to decipher. Error messages from template-heavy code are notoriously bad anyway, and Spirit stresses the system more than almost anything else.
Believe me: the fact that you can write something like Spirit at all is right on the border between impressive and amazing -- but I'd still only use it if I was sure the grammar I was dealing with was (and would always remain) quite small and simple.
Have you looked at Lex and Yacc? To quote from section 5 of the linked document:
My preferred way to make a C++ parser is to have Lex generate a plain C file, and to let YACC generate C++ code. When you then link your application, you may run into some problems because the C++ code by default won't be able to find C functions, unless you've told it that those functions are extern "C".
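As a rough illustration of the `extern "C"` point in that quote: the C++ side declares the scanner entry point with C linkage so the linker looks for the unmangled name. In the sketch below, the body of `yylex` is only a stand-in so the snippet compiles on its own (in a real build it would come from the Lex-generated C file), and the token codes and `countTokens` helper are made up for the example.

```cpp
#include <cassert>
#include <cstddef>

// Declaration as the C++ parser source would see it. extern "C" turns off
// C++ name mangling, so the linker looks for the plain C symbol "yylex".
extern "C" int yylex(void);

// Stand-in definition (assumption: in a real project this function lives in
// the Lex-generated C file, not here). It replays a fixed token sequence and
// then returns 0, the conventional end-of-input value for yylex.
extern "C" int yylex(void) {
    static const int tokens[] = {258, '+', 258, 0};  // hypothetical token codes
    static std::size_t next = 0;
    const std::size_t count = sizeof(tokens) / sizeof(tokens[0]);
    assert(next < count);
    int tok = tokens[next];
    if (tok != 0) ++next;  // stay on the end marker once it is reached
    return tok;
}

// C++ code calling the C scanner: count tokens until end of input.
int countTokens() {
    int n = 0;
    while (yylex() != 0) ++n;
    return n;
}
```

Without the `extern "C"` on the declaration, a C++ compiler would emit a reference to a mangled name that the C object file does not define, which is the link error the quote warns about.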
The best way to create a parser is to use lex and yacc.
- No one can answer the question about *best*, but you got really close - lex & yacc's cousins flex & bison do take C++ into account. – Michael Krelin - hacker Dec 03 '09 at 22:44
- I was assuming that the question was about how to write a parser by hand in C++. – Dima Dec 03 '09 at 22:49
I've used bison and found the examples just right for my level. I was able to create a simple calculator with it, though of course it can do much more.
The calculator took 1+2*3, for example, and built a syntax tree. The documentation did not describe how to build the tree, however, and that took me a little time to work out.
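For readers wondering what "building the tree" might look like, here is one possible shape for the node type, as a sketch only. The `Node` struct and the `makeNumber`, `makeBinary`, `evaluate` and `buildExample` helpers are hypothetical names, not taken from the bison documentation or from my calculator. The idea is that the action attached to each grammar rule calls a helper like these, so the tree grows bottom-up as rules are reduced; here the tree for 1+2*3 is assembled by hand in the order a parser would produce it.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <utility>

// One node of the syntax tree. op is 'n' for a number leaf, otherwise the
// operator character of a binary node.
struct Node {
    char op;
    double value;                 // only meaningful when op == 'n'
    std::unique_ptr<Node> left;   // null for leaves
    std::unique_ptr<Node> right;  // null for leaves
};

// Helper a "number" rule's action could call (hypothetical name).
std::unique_ptr<Node> makeNumber(double v) {
    return std::unique_ptr<Node>(new Node{'n', v, nullptr, nullptr});
}

// Helper a binary-operator rule's action could call (hypothetical name).
std::unique_ptr<Node> makeBinary(char op, std::unique_ptr<Node> l,
                                 std::unique_ptr<Node> r) {
    return std::unique_ptr<Node>(new Node{op, 0.0, std::move(l), std::move(r)});
}

// Walk the finished tree and compute its value.
double evaluate(const Node& n) {
    if (n.op == 'n') return n.value;
    assert(n.left && n.right);  // binary nodes always have two children
    double l = evaluate(*n.left);
    double r = evaluate(*n.right);
    switch (n.op) {
        case '+': return l + r;
        case '*': return l * r;
        default: throw std::logic_error("unknown operator");
    }
}

// The tree for 1+2*3: the 2*3 subtree is built first, then joined to 1.
std::unique_ptr<Node> buildExample() {
    return makeBinary('+', makeNumber(1),
                      makeBinary('*', makeNumber(2), makeNumber(3)));
}
```

The root of the example tree is the `+` node, with the `*` node as its right child, which is how operator precedence ends up encoded in the tree's shape.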
If I were doing it again, I'd look into ANTLR, as it looked good and well supported.
Martin.