On Sat, Dec 20, 2008 at 9:06 AM, Richard Heyes <[email protected]> wrote:
> > i'm reading a book about PHP and i was wondering why regular expressions are
> > so often used to check format of variables or emails while the function
> > filter exists since version 5.2.
>
> That's not so long.
>
> > What are the advantages of regular expressions when checking variable format?
>
> They're more versatile. Far more in the case of PCRE.

To elaborate: in general, the filter extension should be faster than the
corresponding preg_* calls from userspace. Why? Because in many cases the
filters are essentially compiled calls to PCRE for specific cases, such as
email. Check out the C source for php_filter_validate_email(); it's pretty
simple:

void php_filter_validate_email(PHP_INPUT_FILTER_PARAM_DECL) /* {{{ */
{
    /* From http://cvs.php.net/co.php/pear/HTML_QuickForm/QuickForm/Rule/Email.php?r=1.4 */
    const char regexp[] =
"/^((\\\"[^\\\"\\f\\n\\r\\t\\b]+\\\")|([\\w\\!\\#\\$\\%\\&\\'\\*\\+\\-\\~\\/\\^\\`\\|\\{\\}]+(\\.[\\w\\!\\#\\$\\%\\&\\'\\*\\+\\-\\~\\/\\^\\`\\|\\{\\}]+)*))@((\\[(((25[0-5])|(2[0-4]
    pcre *re = NULL;
    pcre_extra *pcre_extra = NULL;
    int preg_options = 0;
    int ovector[150]; /* Needs to be a multiple of 3 */
    int matches;

    re = pcre_get_compiled_regex((char *)regexp, &pcre_extra, &preg_options TSRMLS_CC);
    if (!re) {
        RETURN_VALIDATION_FAILED
    }
    matches = pcre_exec(re, NULL, Z_STRVAL_P(value), Z_STRLEN_P(value), 0, 0, ovector, 3);

    /* 0 means that the vector is too small to hold all the captured
       substring offsets */
    if (matches < 0) {
        RETURN_VALIDATION_FAILED
    }
}

Basically, all it does is call pcre_exec() with a hard-coded email regex
against the string you pass in from userspace. The difference between that
and a call to preg_match() using the same regex and subject string should be
speed. The other tradeoff, as Richard mentioned, is flexibility: since the
extension can't possibly ship every call to the regex engine you might need,
it makes sense to expose something like the preg_* functions to userspace.
That being said, I'd recommend wrapping the filter_* calls in your validation
code when and where possible, which is essentially the mantra of PHP
programming in general anyway (stick to the native functions as much as
possible).

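
For anyone who wants to check the speed claim, here is a rough sketch of a
userspace comparison. The helper names (is_valid_email_filter,
is_valid_email_preg, time_calls) are made up for illustration, and the regex
in the preg version is a deliberately simplified stand-in, not the pattern the
filter extension uses, so the timings are only a ballpark and not an
apples-to-apples measurement.

```php
<?php
// Hypothetical wrapper: validate an email via the filter extension (PHP >= 5.2).
function is_valid_email_filter($email)
{
    // filter_var() returns the filtered value on success, or false on failure.
    return filter_var($email, FILTER_VALIDATE_EMAIL) !== false;
}

// Hypothetical wrapper: validate an email via preg_match().
// Assumption: this simplified pattern is NOT the one used by the filter extension.
function is_valid_email_preg($email)
{
    // preg_match() returns 1 on a match, 0 on no match, false on error.
    return preg_match('/^[^@\s]+@[^@\s]+\.[^@\s]+$/', $email) === 1;
}

// Hypothetical helper: call $fn on $input $iterations times, return elapsed seconds.
function time_calls($fn, $input, $iterations)
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn($input);
    }
    return microtime(true) - $start;
}

$iterations = 100000;
$input = 'user@example.com';

printf("filter_var: %.4fs\n", time_calls('is_valid_email_filter', $input, $iterations));
printf("preg_match: %.4fs\n", time_calls('is_valid_email_preg', $input, $iterations));
```

Keeping the check behind a small wrapper like this also means the validation
code calls one function, so swapping filter_var() for a custom preg_match()
later (or the other way around) is a one-line change.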
-nathan
P.S. I'll probably set up a test later to see if my half-baked theory is even
accurate :O