Hi All,
I'd very much appreciate some help building a regular expression for
preg_match_all that can differentiate between 'words' and 'phrases'.
For example, say I have a string that contains: 'this is an "example of a
phrase"'
I'd like to be able to break that down to:
this
is
an
example of a phrase
My current preg_match_all regex:
preg_match_all('([\w\-]+|[\(]|[\)])', "this is an \"example of a phrase\"", $arr);
returns the following:
Array
(
[0] => Array
(
[0] => this
[1] => is
[2] => an
[3] => example
[4] => of
[5] => a
[6] => phrase
)
)
Note: I'm using this to break elements of a string down to build an SQL
string, which is why I'm looking for "(" and ")" characters (i.e. the
"[\(]|[\)]" part of the regex) and maintaining them in the array. A
real-world example of the value being supplied to the regex might be
'completed and "January 2005" and not (store or online)', etc. I already have
the logic to handle "and", "or", "not" and "()" but haven't been able to
figure out how to maintain substrings in quotes as a single value in the
array.
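For illustration, here is a rough sketch of one way the behaviour described
above might be achieved. It is not a definitive solution. It assumes that
phrases are always wrapped in plain double quotes with no escaped quotes
inside them, and that the PCRE build in use supports the branch-reset group
(?|...). The function name tokenize_query is hypothetical, not part of PHP.

```php
<?php
// Hypothetical helper: split a search string into words, quoted phrases,
// and parentheses. Assumes quotes are balanced and never escaped.
function tokenize_query($input)
{
    // (?| ... ) is a PCRE branch-reset group, so both alternatives
    // store their capture in group 1:
    //   "([^"]*)"        a quoted phrase, captured without the quotes
    //   ([\w\-]+|[()])   a bare word, or a single "(" or ")"
    $pattern = '/(?|"([^"]*)"|([\w\-]+|[()]))/';
    preg_match_all($pattern, $input, $matches);

    // $matches[0] holds the full matches (quotes included);
    // $matches[1] holds the tokens with the quotes stripped.
    return $matches[1];
}

print_r(tokenize_query('this is an "example of a phrase"'));
print_r(tokenize_query('completed and "January 2005" and not (store or online)'));
```

With the first input this should give the four values this, is, an, and
"example of a phrase" as a single element. The quoted alternative is listed
first so that a phrase is consumed whole before the word alternative gets a
chance to split it.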
Any help appreciated!
Much warmth,
Murray