Pragmatism in the real world

preg_last_error() returns No Error on preg_match() Failure

An interesting question came up on a mailing list that I subscribe to.

Consider this code:

<?php
$result = preg_match("/href='(.*)", 'blah');
$error = preg_last_error();

if($error === PREG_NO_ERROR) {
    echo "No Error\n";
} else {
    echo "An Error Occurred\n";
}
var_dump($result);

The output is:

Warning: preg_match(): No ending delimiter '/' found in /var/www/preg_test.php on line 2
No Error
bool(false)

As you can see, the pattern passed to preg_match() is invalid because it is missing the ending delimiter. Intuitively there is an error, but preg_last_error() says that there isn’t!

This comes about because preg_last_error() only tells you the last error returned from the PCRE library, rather than the last error from the last call to a preg_* function. The difference is subtle, but important.
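For contrast, here is a minimal sketch of a failure that does come from the PCRE library itself, and so is reported by preg_last_error(). It assumes a subject string containing an invalid UTF-8 byte, matched with the /u modifier; the pattern and subject are made up for illustration.

```php
<?php
// "\xff" is never a valid byte in UTF-8, so with the /u modifier the
// subject is rejected at match time, after the pattern has compiled.
$result = preg_match('/./u', "\xff");
$error = preg_last_error();

var_dump($result);                         // bool(false)
var_dump($error === PREG_BAD_UTF8_ERROR);  // bool(true)
```

Here the pattern itself is valid, so the PCRE library is actually called, and the error it reports is what preg_last_error() returns.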

To see why this is, we have to go to the source code, specifically ext/pcre/php_pcre.c in the PHP source tree.

Start at line 840 to discover that the PHP function preg_match() maps to php_do_pcre_match() in the C code. php_do_pcre_match() is at line 477. This function doesn’t do a lot: it runs some validation checks and then, if all is OK, calls php_pcre_match_impl(), which does the real work. To put it another way, if any of the validation checks fail, the PCRE library is never called.

One of the validation checks is a call to pcre_get_compiled_regex_cache(), which checks that the supplied pattern is valid. If it’s not valid, then an E_WARNING is generated and the function returns NULL (i.e. an error). This causes php_do_pcre_match() to return false. If you are interested, the actual check for an ending delimiter starts at line 256, but make sure you know a bit about C pointers!

So there you have it. An invalid pattern will cause preg_match() to return false and preg_last_error() to return PREG_NO_ERROR.

Thus, you should check the return value for false (using ===) as well as checking preg_last_error(), because preg_last_error() on its own will not tell you about an invalid pattern.
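One way to apply this advice is sketched below. The function name checked_preg_match() is hypothetical (it is not part of PHP), and throwing a RuntimeException is just one possible way to report the failure. The sketch treats a false return value as the primary failure signal and includes preg_last_error() only as extra detail.

```php
<?php
// Hypothetical helper for illustration only; not a built-in PHP function.
function checked_preg_match(string $pattern, string $subject, ?array &$matches = null): int
{
    // @ suppresses the E_WARNING raised for a malformed pattern;
    // the failure is reported through the exception below instead.
    $result = @preg_match($pattern, $subject, $matches);

    if ($result === false) {
        // false covers both cases: the pattern was rejected before the
        // PCRE library was called, or the library itself reported an error.
        throw new RuntimeException(
            'preg_match() failed; preg_last_error() returned ' . preg_last_error()
        );
    }

    return $result; // 1 if the pattern matched, 0 if it did not
}

try {
    checked_preg_match("/href='(.*)", 'blah');
} catch (RuntimeException $e) {
    echo "An Error Occurred\n";
}
```

With the invalid pattern from the example above, the helper takes the exception path regardless of what preg_last_error() reports.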