Book review: Generating [hip] Parsers with JavaCC
Before the age of Disco, I once found myself in need of obtaining specific data elements from a series of log files generated by a large order processing system. Essentially, a chain of copasetic state machines would log their status while they processed various aspects of an order– line items, billing, notification, etc. It turned out that the log format was uniform– it followed a format that enabled one to understand who was writing, when it was written, and why. As you can probably imagine, the folks in operations would monitor these logs and when problems arose, they’d ping development.
Invariably, a hip developer would ssh onto a production box and literally tail -f said log file and watch things progress (in real time). Some developers were more savvy and would pipe the contents into a grep command looking for error messages, but ultimately, it varied from person to person how they’d actually assess the situation.
I, indubitably being of the lazy disco type, wanted to press a button (or run a simple command) and receive a report when I found myself in the hot seat. Of course, this problem has been solved the world over, but I had to do it my way because I’m a developer. So I started investigating parsing libraries and ran across what was, at the time, some WebGain documentation on JavaCC. Unfortunately for me, I didn’t have the attention span to figure things out and eventually went the regex route via Jakarta’s ORO library.
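For the curious, the regex route looks roughly like the sketch below. It is not the original utility: it uses the JDK's `java.util.regex` in place of Jakarta ORO, and the line layout (`<timestamp> [<component>] <LEVEL>: <message>`), the class name `LogReport`, and the method `errorsByComponent` are all made up for illustration, since the real log format isn't reproduced here.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LogReport {
    // Assumed (hypothetical) layout: "<date> <time> [<component>] <LEVEL>: <message>"
    // i.e. when it was written, who wrote it, and why.
    private static final Pattern LINE = Pattern.compile(
        "^(\\S+ \\S+) \\[([^\\]]+)\\] (\\w+): (.*)$");

    /** Counts ERROR lines per component; lines that do not match the layout are skipped. */
    public static Map<String, Integer> errorsByComponent(Iterable<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher m = LINE.matcher(line);
            if (m.matches() && "ERROR".equals(m.group(3))) {
                // group(2) is the component that wrote the line
                counts.merge(m.group(2), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Sample lines invented for this sketch
        List<String> sample = Arrays.asList(
            "2007-09-30 10:15:02 [billing] INFO: invoice created",
            "2007-09-30 10:15:03 [billing] ERROR: card declined",
            "2007-09-30 10:15:04 [notification] ERROR: smtp timeout",
            "a line that does not follow the layout");
        System.out.println(errorsByComponent(sample));
    }
}
```

A single pattern works fine while every line follows one layout; once lines start nesting or varying in structure, the regex gets brittle, which is where a generated parser starts to pay off.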
Tom Copeland’s “Generating Parsers with JavaCC” has, without a doubt, shown me the error of my ways all those years ago. His masterpiece on JavaCC serves as the reference for this handy library– indeed, a major portion of this book documents every detail of generating parsers by clearly unveiling the particulars of tokenizing, parsing, error handling, and even testing JavaCC parsers, just to name a few. I particularly enjoyed the chapter on JavaCC’s JJTree preprocessor as it tied in a lot of the details, for me personally, of writing custom PMD rules.
Indeed, now that DSLs are all the rage (I’d go so far as to label them hip, baby), “Generating Parsers with JavaCC” can easily enable adventurous types to assemble mini-languages (and obviously parse and handle them via JavaCC). Because it’s his bag, Tom does a great job in chapter 11 of walking through a few examples of doing so. What’s particularly amusing for me is that he shows an example of parsing Apache’s web logs.
I was eventually able to keep on truckin’ by running a single command to receive a detailed report of various goings on in the order processing application I mentioned earlier– my trippin’ little utility made heavy use of regular expressions and served its purpose well enough. But, after reading “Generating Parsers with JavaCC” I realize that my job could have been a bit easier had I just relied on JavaCC to do the heavy lifting of parsing the application’s log files. You can bet that if I find myself in a similar situation in the future, you’ll find me coding away with a well marked up, heavily worn copy of “Generating Parsers with JavaCC” by my side. Give this groovy book a read– you’ll find yourself smarter for it.
Sunday 30 Sep 2007 | Book Review
Thanks for the great, useful information. I hope I will find more about JavaCC. Very interesting!
JavaCC is indeed a very handy library. I enjoyed the resource a lot, thanks for this review!
Very good information and my favorite site. Great JavaCC information. Thanks
Very nice information but i don’t learn it.