Comments in regular expressions typically say how something matches, not why. When a regular expression is not matching as expected, these sorts of comments are worse than useless, because they taint the reader's expectations of the code at hand. Given the terseness and complexity of large regular expressions, it hardly matters whether the author of the expression and its comments is a different person from the reader or one and the same.
When looking at my old code, I regularly have to stop and think about a regex, but I believe that it is time well spent. Every time I find a bug, it reinforces that programming is a human endeavor, and as such is never perfect. Recently, I used one of my old scripts as an example in the SELF 2011 talk. This is a script I've used for years, and as I was looking at it on the slide, I saw that I had written s,^./,, when I meant s,^\./,,. The unescaped dot matches any character, so the first version strips any single leading character followed by a slash, not just a literal ./ prefix.
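To make the difference between the two substitutions concrete, here is a minimal sketch in Ruby rather than in the original script's language. The constant and method names (BUGGY, FIXED, strip_buggy, strip_fixed) and the sample paths are made up for illustration; only the two patterns come from the script described above.

```ruby
# Hypothetical names; only the two patterns mirror the substitutions in the text.
BUGGY = %r{^./}   # unescaped dot: any one character followed by a slash
FIXED = %r{^\./}  # escaped dot: a literal "./" prefix only

# Remove the first match of the buggy pattern from the path.
def strip_buggy(path)
  path.sub(BUGGY, "")
end

# Remove the first match of the corrected pattern from the path.
def strip_fixed(path)
  path.sub(FIXED, "")
end

# Both versions agree on the input the author had in mind.
strip_buggy("./notes.txt")  # => "notes.txt"
strip_fixed("./notes.txt")  # => "notes.txt"

# They disagree when the first character is not a dot.
strip_buggy("a/notes.txt")  # => "notes.txt" (the "a/" directory is lost)
strip_fixed("a/notes.txt")  # => "a/notes.txt"
```

A bug like this can survive for years because every input the script normally sees starts with ./, where the two patterns behave identically.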
Abstraction is a tool for managing complexity, and it can be used with regular expressions too. Here's an example from a Ruby class:
numrange = /\d+(?:-\d+)?/
numlist = /#{numrange}(?:,#{numrange})*/
step = /\/\d+/
numspec = /(?:\*|#{numlist})(?:#{step})?/
This code tries to be self-documenting, so the intent is explicit in the choice of names. This is akin to a comment, but each definition is simple enough to be validated at a glance. Since more complex patterns are built from the earlier abstractions, each one can be understood and validated with little effort.
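One way to act on that is to check each named piece on its own before trusting the composite. The sketch below is only an illustration: it copies the four patterns into uppercase constants so they are visible inside a method, and whole_match? is a hypothetical helper, not part of the original class. The sample strings are likewise assumptions about what each name is meant to accept.

```ruby
# Uppercase copies of the four patterns above, so methods can see them.
NUMRANGE = /\d+(?:-\d+)?/
NUMLIST  = /#{NUMRANGE}(?:,#{NUMRANGE})*/
STEP     = /\/\d+/
NUMSPEC  = /(?:\*|#{NUMLIST})(?:#{STEP})?/

# Hypothetical helper: true only if the pattern matches the entire string.
def whole_match?(pattern, text)
  /\A(?:#{pattern})\z/.match?(text)
end

# Each building block can be checked in isolation...
whole_match?(NUMRANGE, "1-5")     # => true
whole_match?(NUMRANGE, "1-")      # => false
whole_match?(NUMLIST, "1,3-5,9")  # => true
whole_match?(NUMLIST, "1,,2")     # => false
whole_match?(STEP, "/2")          # => true

# ...and then the composite built from them.
whole_match?(NUMSPEC, "*/15")     # => true
whole_match?(NUMSPEC, "1-5,7/2")  # => true
whole_match?(NUMSPEC, "*/")       # => false
```

Anchoring with \A and \z matters here: without it, a pattern such as NUMRANGE would report a match for "1-" simply because the leading "1" matches.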