Jay Blanchard wrote:
> [snip]
> Assuming unix, I'd do the following from the root of the application
> to get a list
> of files that contain queries:
>
> $ egrep "=[:space:]*\".*\b(SELECT|INSERT|UPDATE)\b" * -ril
> ...
>
> Anyway, that's how I'd do it. Hope you got something out of this...
> :) [/snip]
>
> That is a good start, now all I need to do is get the whole query(s)
I guess I misunderstood your goal. You said before that you needed to LOCATE
the
queries, which the above will do (if certain assumptions are true). What
exactly
are you wanting to accomplish? Are you trying to write a script that will
extract
the entire query from a set of PHP files?
If so, and given this:
$variableName = "INSERT INTO bar (foo) ";
$variableName .= "VALUES ('".$foo."') ";
Would you expect:
INSERT INTO bar (foo) VALUES ('')
or something else?
At any rate, if you are trying to automatically extract entire queries, I'd say
that's pretty tricky, unless your code uses very strict idioms for defining
queries
that it doesn't stray from. For example, if the code always do this:
$query = "select .....";
$query .= "from .....";
$query .= "where ....";
That's one thing, but if it sometimes does this:
$query = "select ....".
"from .....".
"where .....";
Then that's something entirely different.
Assuming the first, you could perhaps use a two-pass approach where you first go
through a file and find the beginning of a query assignment (using a regex like
the
one above), then extract the variable name and line number. Then on a second
pass
start from each line number you saved and process line number + N lines (where
N is
the largest number of lines the code normally takes to define a query,
arbitrarily
chosen) and look for lines where the variable name is followed by an assignment
operator (= or .=). Then use a regex on those lines to extract everything
between
quotes (hopefully just double quotes, unless the code tends to switch between
double
and single). But then you run into problems when your N is too large and the
end of
one chunk overlaps into the beginning of the next. I would imagine something
like
this would take a lot of time to get right, and even then only be an 80%
solution,
if that.
It seems to get this 100% right you'd have to have a full-fledged PHP parser,
which
means hacking the parser that PHP itself uses. Even hacking the PHP parser
probably
wouldn't get you a 100% solution because the compile stage won't be
enough....there
is going to be code that may or may not be evaluated at runtime and I think it
would
be impossible to know without actually running it.
So...if you absolutely have to have this done programmatically (as opposed to
just
locating the beginning of the queries and manually extracting them) then I
would say
good luck to you. :) Hopefully someone else has some ideas...
--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php