You must escape the left brace in a regex

In v5.26 and later, you have to escape the left brace, {, in Perl regular expressions when you mean it literally. You probably never thought about this overworked character, but lots of other people have. This is an important change because an unescaped literal left brace is now a fatal error that may cause your modules and other tools (such as an old version of autoconf!) to stop working. But we’ve also known about this for a while, so if you are up-to-date, things may have already been fixed.

The short answer is to test your code with v5.26 before you deploy it. If you encounter an unescaped left brace that isn’t part of an existing feature, your test perl will tell you. Find that brace and escape it with a backslash in front of it: \{. You can also put it in a character class: [{]. You might have to do this in code for tools that you didn’t write (I did it for my local autotools, for instance). Although we shouldn’t have to be told, don’t push to master on Friday afternoon and leave for vacation.
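Here is a minimal sketch of both fixes. The sample string and variable names are made up for illustration; the last match uses \Q...\E (quotemeta), which is one more way to get a literal brace when the text comes from a variable.

```perl
use strict;
use warnings;

# Hypothetical input: a template-style string with literal braces.
my $text = 'Hello, {name}!';

# Fix 1: escape the braces with a backslash.
my ($escaped) = $text =~ /\{(\w+)\}/;

# Fix 2: put each brace in a character class.
my ($classed) = $text =~ /[{](\w+)[}]/;

# When the literal text is interpolated from a variable,
# \Q...\E escapes the brace (and other metacharacters) for you.
my $needle = '{name}';
my $quoted = $text =~ /\Q$needle\E/ ? 1 : 0;

print "escaped: $escaped\n";
print "classed: $classed\n";
print "quoted:  $quoted\n";
```

Both captures hold the text between the braces. Which style you pick is mostly taste; the character class can be easier to read when the pattern is already full of backslashes.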

Now, consider braces a bit more. First, think of the things that braces do or can do:

  • block delimiter
  • identifier delimiter, ${ident}
  • hash subscript
  • hexadecimal character code points, \x{ABCD}
  • octal character code points, \o{777}
  • Unicode character names, \N{UNICODE NAME}
  • delimiters for match, substitution, and transliteration operators
  • Perl special variables, such as ${^MATCH}

Inside regular expressions, they do even more (including the double-quoted features):

  • generalized quantifiers, a{N}, a{N,}, and a{M,N}
  • property names \p{PROPERTY NAME}
  • some zero width assertions, such as \b{BREAKNAME}
  • embedded code, (?{ ... })
  • postponed regular subexpression, (??{ ... })
  • relative references and backreferences, such as \g{-1}

I’ve probably missed some, but that’s already an impressive list. What’s not there? The literal braces. It’s starting to get crowded in there, and the list is growing.
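To make the crowding concrete, here is a small sketch that uses several of the brace features from the lists above in ordinary matches, next to a pair of escaped literal braces. The test strings and variable names are invented for this example.

```perl
use strict;
use warnings;

# Generalized quantifier: exactly four digits.
my $year_ok = '2017' =~ /\A\d{4}\z/ ? 1 : 0;

# Hexadecimal code point: U+0041 is "A".
my $hex_ok = 'A' =~ /\A\x{41}\z/ ? 1 : 0;

# Unicode property: the string starts with an uppercase letter.
my $prop_ok = 'Perl' =~ /\A\p{Lu}/ ? 1 : 0;

# Relative backreference: the most recent capture group, repeated.
my $dup_ok = 'the the end' =~ /\b(\w+)\s+\g{-1}\b/ ? 1 : 0;

# Literal braces, escaped so that none of the meanings above apply.
my $lit_ok = '{}' =~ /\A\{\}\z/ ? 1 : 0;

print join(' ', $year_ok, $hex_ok, $prop_ok, $dup_ok, $lit_ok), "\n";
```

Every one of these matches succeeds, and in each case the brace means something different.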

Perl has to figure out if that brace character in your pattern starts a feature (also making the preceding characters part of a feature). As Perl’s regexes get new features, that { is being used for much more.

To make it easier for future expansion, you now have to escape a literal {. You don’t want a preceding character that used to be literal turning into a feature that uses {. Conversely, you don’t want a missing right brace to turn an unescaped left brace into a literal.
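A quick sketch of why the distinction matters: the same three characters, {3}, are either a quantifier or literal text depending only on the backslash. The strings and variable names here are illustrative.

```perl
use strict;
use warnings;

# Unescaped: {3} is a quantifier, so the pattern means "aaa".
my $quantifier_hits_aaa = 'aaa'  =~ /\Aa{3}\z/ ? 1 : 0;

# The same pattern does not match the literal text "a{3}".
my $quantifier_hits_lit = 'a{3}' =~ /\Aa{3}\z/ ? 1 : 0;

# Escaped: the braces are plain characters, so "a{3}" matches.
my $escaped_hits_lit    = 'a{3}' =~ /\Aa\{3\}\z/ ? 1 : 0;

print "$quantifier_hits_aaa $quantifier_hits_lit $escaped_hits_lit\n";
# prints "1 0 1"
```

Escaping removes the guesswork: the pattern says what you mean no matter which brace features Perl adds later.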

If you want to read more about this, there’s a short thread, “On deprecating unescaped literal left brace”, from perl5-porters in 2012.
