Hi everyone,

I'm having a problem that may lie in either PostgreSQL or Perl, and
I'm not sure which.  I originally posted this on pgsql.general, and
only heard back that someone else has this problem too.  I'm running
Perl 5.10 on Windows 2003 Server, although I believe the version
embedded in pgsql at the moment is still 5.8.

The basic problem is that I generate the code of a function, and then
want to check whether this is the same code stored in the database or
not, but diff shows "differences" where there aren't any (or shouldn't
be any, since nothing in the database has changed), and worse, perl's
'eq' and 'ne' operators agree with diff that the strings are
different.  But at least visually the two texts being compared are
identical, and if I copy and paste the results from Firefox into
TextPad and then run "compare files", it says they're identical.

Here are all the clues I could find, although I still am not sure what
to do:

1) If I encode the generated code and the database code using
MIME::Base64, only the first few bytes ('REVDTEFSRQ', decoded that's
'DECLARE') are the same, and after that the files are different.

2) Diff, on the other hand, doesn't show any differences until after
this chunk (i.e., at the first newline and tab):
DECLARE
        c refcursor;
        n text;
BEGIN
        IF t =

...and then shows a series of differences on every line, following a
pattern I couldn't make sense of.

3) DBI's string reporter function (data_string_desc) says:

generated: UTF8 off, ASCII, 5123 characters 5123 bytes
database: UTF8 off, ASCII, 4164 characters 4164 bytes

...compared with 4208 bytes for both in TextPad when I paste and
compare.

4) pgsql uses UTF8 encoding for its postgres database, and for all of
my own databases too.

5) Encoding or decoding the generated string with the utf8 or Encode
packages never lowered its byte count.

6) This is my diff function:

CREATE OR REPLACE FUNCTION get_diff(t1 text, t2 text)
  RETURNS text AS
$BODY$use HTML::Diff;
my ($left, $right) = @_;
if ($left eq $right)
{
        return $left;
}
else
{
        my $result = html_word_diff($left, $right);
        my $diff = '';
        foreach (@$result)
        {
                if (($$_[0] eq '-') || ($$_[0] eq 'c'))
                {
                        $diff .= '<del>' . $$_[1] . '</del>';
                }
                if (($$_[0] eq '+') || ($$_[0] eq 'c'))
                {
                        $diff .= '<ins>' . $$_[2] . '</ins>';
                }
                elsif ($$_[0] eq 'u')
                {
                        $diff .= $$_[1];
                }
        }
        return '<div class="earlier_version">' . $diff . '</div><hr>'
             . '<div class="later_version">' . $diff . '</div>';
}$BODY$
  LANGUAGE 'plperlu' VOLATILE
  COST 100;

...which reports differences like this:

ELSIF t = <del>'aircraft_model' </del><ins>'aircraft_model' </ins>THEN
OPEN c FOR EXECUTE <del>'SELECT </del><ins>'SELECT </ins>model_year ||
<del>'' '' </del><ins>'' '' </ins>|| make || <del>'' '' </del><ins>''
'' </ins>|| common_name FROM aircraft_model WHERE <del>id=' </del><ins>id=' </ins>|| id;

What I can't fathom is that the generator pulls 'aircraft_model' from
the database, and 'model_year' too, but not 'SELECT'.  So why are
'aircraft_model' and 'SELECT' different after storing this code in the
database, and yet 'model_year' isn't?  And why does base64 encoding
indicate differences right after 'DECLARE' and yet diff doesn't?
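
One check I could still run is to find the first position where the two
strings differ and print the bytes there in hex, since the difference
seems to be in characters that don't show up on screen.  This is only a
sketch: first_diff_offset and hex_window are names I made up for it,
and the guess that the mismatch is an invisible character such as a
carriage return is an assumption, not something the clues above confirm.

```perl
use strict;
use warnings;

# Hypothetical helper: offset of the first differing character,
# or -1 if the two strings are identical.
sub first_diff_offset {
    my ($left, $right) = @_;
    my $min = length($left) < length($right) ? length($left) : length($right);
    for my $i (0 .. $min - 1) {
        return $i if substr($left, $i, 1) ne substr($right, $i, 1);
    }
    # No difference in the common prefix: either equal, or one is longer.
    return length($left) == length($right) ? -1 : $min;
}

# Hypothetical helper: hex codes of up to $width characters starting at $offset.
sub hex_window {
    my ($s, $offset, $width) = @_;
    return '' if $offset < 0 || $offset >= length($s);
    return join ' ', map { sprintf('%02x', ord($_)) } split //, substr($s, $offset, $width);
}
```

Calling first_diff_offset on the generated and database strings, and
then hex_window on each at that offset, would show whether one side has
something like 0d 0a where the other has only 0a.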

Most of all, what's the sane way to simply compare these strings for
equality?  I was just using diff for clues, but I don't really care
about the difference detail beyond whether the function code actually
needs updating because it has changed.
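
If the only differences turn out to be line endings or trailing
whitespace (an assumption on my part, nothing here proves it yet), then
a comparison along these lines might be enough.  normalize_source and
same_source are hypothetical names for this sketch, and it deliberately
leaves whitespace inside each line alone, so tab-versus-space
differences would still count as changes.

```perl
use strict;
use warnings;

# Hypothetical helper: put the text into a canonical form before comparing.
sub normalize_source {
    my ($text) = @_;
    $text =~ s/\r\n?/\n/g;     # CRLF or a lone CR becomes LF
    $text =~ s/[ \t]+$//mg;    # drop trailing spaces and tabs on every line
    $text =~ s/\n+\z//;        # drop newlines at the very end
    return $text;
}

# True when the two texts are equal after normalization.
sub same_source {
    my ($left, $right) = @_;
    return normalize_source($left) eq normalize_source($right);
}
```

The function would then need updating only when same_source returns
false for the generated code and the stored code.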

Help?

I have the encoded versions if that would help make sense of this.

Thanks,
Kev

