Re: Different results from identical matviews

2020-07-02 Thread Magnus Hagander
On Thu, Jul 2, 2020 at 2:02 AM Tom Lane  wrote:

> Anders Steinlein  writes:
> > We have a materialized view from which a customer reported some
> > confusing/invalid results, leading us to inspect the query and not
> finding
> > anything wrong. Running the query defining the matview manually, or
> > creating a new (identical) materialized view returns the correct result.
> > Obviously, we've done REFRESH MATERIALIZED VIEW just before doing the
> > comparison, and all runs are in the same schema.
>
> I suspect the query underlying the matviews is less deterministic than
> you think it is.  I did not study that query in any detail, but just
> from a quick eyeball: the array_agg() calls with no attempt to enforce a
> particular aggregation order are concerning, and so is grouping by
> a citext column (where you'll get some case-folding of a common value,
> but who knows which).
>

Also not having looked at the query in detail -- but are there concurrent
changes in the database? Because since you're creating your transaction in
READ COMMITTED, other transactions finishing in between your two REFRESH
commands can alter the data. To make sure that's not what's happening, you
may want to try doing the same thing with a BEGIN  TRANSACTION ISOLATION
LEVEL SERIALIZABLE instead, and see if the problem still occurs.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: Different results from identical matviews

2020-07-02 Thread Anders Steinlein
On Thu, Jul 2, 2020 at 2:02 AM Tom Lane  wrote:

> Anders Steinlein  writes:
> > We have a materialized view from which a customer reported some
> > confusing/invalid results, leading us to inspect the query and not
> finding
> > anything wrong. Running the query defining the matview manually, or
> > creating a new (identical) materialized view returns the correct result.
> > Obviously, we've done REFRESH MATERIALIZED VIEW just before doing the
> > comparison, and all runs are in the same schema.
>
> I suspect the query underlying the matviews is less deterministic than
> you think it is.  I did not study that query in any detail, but just
> from a quick eyeball: the array_agg() calls with no attempt to enforce a
> particular aggregation order are concerning, and so is grouping by
> a citext column (where you'll get some case-folding of a common value,
> but who knows which).


Thanks for the tip, but I'm having a hard time thinking that's the case,
seeing as I'm unable to trigger the wrong result no matter how hard I try
with a new definition/manual query. I've introduced random ordering to the
first CTE-clause (where the initial citext values comes from, and casing
thus could differ in some order) which doesn't change the result.

When the citext type is used throughout the query, shouldn't the grouping
result be deterministic? The citext values are first "rolled up" with
array_agg() and later unnested and finally grouped. Shouldn't the end
result be the same, regardless of what particular case-folded version of
the value it chooses to group on?

I've simplified the query for this particular customer case that, again,
always returns the correct result no matter how often I try:

mm_prod=> SELECT sid, count(*) FROM (
WITH tagged_contacts AS (
SELECT lid, email, cl.skip_preexisting_campaigns AS skip_subscribed,
ct.skip_preexisting_campaigns AS skip_tags, ladded,
tagname, created
FROM contacts_lists cl
LEFT JOIN contacts_tags ct USING (email)
WHERE lstatus = 'a'
ORDER BY random()
),
tagged_segments AS (
SELECT s.lid, cid, sid, sp.type, sp.mid, matchdelay, tagname, event,
count(*) OVER (PARTITION BY sid) AS requirements,
activated_at
FROM segments s
LEFT JOIN campaigns USING (cid)
INNER JOIN segments_predicates sp USING (sid)
WHERE
s.archived_at IS NULL
AND (cid IS NULL OR activated_at IS NOT NULL)
),
segments_contacts AS (
SELECT lid, sid, requirements,
CASE
WHEN type = 'has_tag' THEN (
SELECT array_agg(DISTINCT email::citext)
FROM tagged_contacts
WHERE
lid = s.lid
AND tagname = s.tagname
AND (matchdelay IS NULL OR created + matchdelay
< now())
AND (
cid IS NULL
OR (
created >= activated_at
AND NOT COALESCE(skip_tags, false)
)
)
)
END AS emails
FROM tagged_segments s
),
unnested AS (
SELECT lid, sid, requirements, unnest(emails) AS email
FROM segments_contacts
)
SELECT lid, sid, email
FROM unnested
GROUP BY lid, sid, email, requirements
HAVING count(email) = requirements
) x
WHERE sid = 42259
GROUP BY sid;
  sid  | count
---+---
 42259 |98
(1 row)

This is stale data for this customer, so no data changes are occurring to
change the results. I can REFRESH MATERIALIZED VIEW as many times I was on
the original segments_with_contacts matview, and I never see different
results. If it were not deterministic, shouldn't I expect to see different
results one in at least 100 times tried?

Thanks again for any insight to try and figure this out. Again, I could
just re-create the matview we use in production and it would likely work
(since I'm unable to get wrong results with a newly created case), but I
would rather try to find out the root cause here first.

Best,
-- a.


Re: Different results from identical matviews

2020-07-02 Thread Anders Steinlein
On Thu, Jul 2, 2020 at 12:12 PM Magnus Hagander  wrote:

>
> On Thu, Jul 2, 2020 at 2:02 AM Tom Lane  wrote:
>
>> Anders Steinlein  writes:
>> > We have a materialized view from which a customer reported some
>> > confusing/invalid results, leading us to inspect the query and not
>> finding
>> > anything wrong. Running the query defining the matview manually, or
>> > creating a new (identical) materialized view returns the correct result.
>> > Obviously, we've done REFRESH MATERIALIZED VIEW just before doing the
>> > comparison, and all runs are in the same schema.
>>
>> I suspect the query underlying the matviews is less deterministic than
>> you think it is.  I did not study that query in any detail, but just
>> from a quick eyeball: the array_agg() calls with no attempt to enforce a
>> particular aggregation order are concerning, and so is grouping by
>> a citext column (where you'll get some case-folding of a common value,
>> but who knows which).
>>
>
> Also not having looked at the query in detail -- but are there concurrent
> changes in the database?
>

Yes, the database is in production so changes do occur, however the data
involved for this particular customer and the tables involved in this query
is not changing while we've looked into the case these days. It's SELECTED
quite a bit though.


> Because since you're creating your transaction in READ COMMITTED, other
> transactions finishing in between your two REFRESH commands can alter the
> data. To make sure that's not what's happening, you may want to try doing
> the same thing with a BEGIN  TRANSACTION ISOLATION LEVEL SERIALIZABLE
> instead, and see if the problem still occurs.
>

Thanks for the suggestion, but the issue remains. The results are the same
(that is, invalid for the one matview and correct for new matviews or
manual queries) whether I run the queries in the same transaction or
separate transactions. After many many many attempts. I'm quite baffled,
really...

Best,
-- a.

>


Re: Different results from identical matviews

2020-07-02 Thread Anders Steinlein
On Thu, Jul 2, 2020 at 1:26 PM Anders Steinlein  wrote:

> On Thu, Jul 2, 2020 at 2:02 AM Tom Lane  wrote:
>
>> Anders Steinlein  writes:
>> > We have a materialized view from which a customer reported some
>> > confusing/invalid results, leading us to inspect the query and not
>> finding
>> > anything wrong. Running the query defining the matview manually, or
>> > creating a new (identical) materialized view returns the correct result.
>> > Obviously, we've done REFRESH MATERIALIZED VIEW just before doing the
>> > comparison, and all runs are in the same schema.
>>
>> I suspect the query underlying the matviews is less deterministic than
>> you think it is.  I did not study that query in any detail, but just
>> from a quick eyeball: the array_agg() calls with no attempt to enforce a
>> particular aggregation order are concerning, and so is grouping by
>> a citext column (where you'll get some case-folding of a common value,
>> but who knows which).
>
>
> Thanks for the tip, but I'm having a hard time thinking that's the case,
> seeing as I'm unable to trigger the wrong result no matter how hard I try
> with a new definition/manual query. I've introduced random ordering to the
> first CTE-clause (where the initial citext values comes from, and casing
> thus could differ in some order) which doesn't change the result.
>

I just wanted to add that we're on Postgres 12.3. This matview has been
with us since 9.4 days, and we have not experienced any such issues before
(could be customers who haven't noticed or reported it to us, of course...).
 version

-
 PostgreSQL 12.3 (Ubuntu 12.3-1.pgdg18.04+1) on x86_64-pc-linux-gnu,
compiled by gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0, 64-bit

Best,
-- a.

>


Re: Different results from identical matviews

2020-07-02 Thread Michael Lewis
Is now() computed at the time the view is defined and not at refresh? If
this were a function, I would be more suspicious of that but a plain view,
surely not. I hope.

>


Re: Different results from identical matviews

2020-07-02 Thread Jeremy Smith
It looks like you are using now() fairly often in that query.  That would,
of course, give different results in different transactions, but it could
also give different results if a) the things you are comparing now() to are
timestamp without time zone and b) the session time zone of the user doing
the refresh is different from the session time zone of the user running the
query.  I'd also be suspicious about any other timestamp comparisons that
aren't comparing timestamp to timestamp or timestamptz to timestamptz.


Re: Different results from identical matviews

2020-07-02 Thread Anders Steinlein
On Thu, Jul 2, 2020 at 3:01 PM Jeremy Smith  wrote:

> It looks like you are using now() fairly often in that query.  That would,
> of course, give different results in different transactions, but it could
> also give different results if a) the things you are comparing now() to are
> timestamp without time zone and b) the session time zone of the user doing
> the refresh is different from the session time zone of the user running the
> query.  I'd also be suspicious about any other timestamp comparisons that
> aren't comparing timestamp to timestamp or timestamptz to timestamptz.
>

Thanks, but as mentioned in another part of the thread this data is stale
(unmodified during these days we're looking into this). And the result is
different between the erroneous matview and identical manual query
regardless, whether I run them in the same transaction or not. And
the results were the same yesterday, same days before that when the issue
came up, and same today. So I can't see how times are the issue,
unfortunately. :-/

Best,
-- a.


Re: Different results from identical matviews

2020-07-02 Thread Paul Förster
Hi Michael,

> On 02. Jul, 2020, at 14:55, Michael Lewis  wrote:
> 
> Is now() computed at the time the view is defined and not at refresh? If this 
> were a function, I would be more suspicious of that but a plain view, surely 
> not. I hope.

now() returns the time at the beginning of the transaction:

see: https://www.postgresql.org/docs/current/functions-datetime.html

If you want the time whenever you call the function, use clock_timestamp() 
instead. There's also a good article at:

https://www.cybertec-postgresql.com/en/postgresql-now-vs-nowtimestamp-vs-clock_timestamp/

Hope this helps.

Cheers,
Paul



Re: Different results from identical matviews

2020-07-02 Thread Tom Lane
Anders Steinlein  writes:
> On Thu, Jul 2, 2020 at 2:02 AM Tom Lane  wrote:
>> I suspect the query underlying the matviews is less deterministic than
>> you think it is.

> Thanks for the tip, but I'm having a hard time thinking that's the case,
> seeing as I'm unable to trigger the wrong result no matter how hard I try
> with a new definition/manual query.

Well, another line of thought is that there actually is some difference
between the stored query for the original matview and the ones you enter
afresh.  You said they were the same, but I surely didn't attempt to
verify that.  Comparing pg_get_viewdef() output for equality would be
a good first step.  Even that perhaps isn't conclusive, so you could
also try comparing the pg_rewrite.ev_action fields for the views'
ON SELECT rules.  (That might be a bit frustrating because of likely
inconsistencies in node "location" fields; but any other difference
is cause for suspicion.)

regards, tom lane




Re: Different results from identical matviews

2020-07-02 Thread David G. Johnston
On Thursday, July 2, 2020, Anders Steinlein  wrote:

>
>
>> Thanks for the tip, but I'm having a hard time thinking that's the case,
>> seeing as I'm unable to trigger the wrong result no matter how hard I try
>> with a new definition/manual query. I've introduced random ordering to the
>> first CTE-clause (where the initial citext values comes from, and casing
>> thus could differ in some order) which doesn't change the result.
>>
>
> I just wanted to add that we're on Postgres 12.3. This matview has been
> with us since 9.4 days, and we have not experienced any such issues before
> (could be customers who haven't noticed or reported it to us, of course...).
>  version
>
> 
> -
>  PostgreSQL 12.3 (Ubuntu 12.3-1.pgdg18.04+1) on x86_64-pc-linux-gnu,
> compiled by gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0, 64-bit
>


 I concur that the determinism doesn’t seem like a problem - but not much
else does either.  As a shot in the dark does pg_depend show any
differences between the dependencies for the two views?

How did this migrate from 9.4 to 12?

It would be helpful if “Explain analyze refresh materialized view” were a
thing (is it?)

If you can backup and restore the existing database (basebackup is more
likely, but pg_dump would be more useful) and still observe the problem
then maybe I see hope for digging down into the cause.  Otherwise I’d limit
my decision to testing for the symptom with the solution being to rebuild
any problem views.

David J.


Re: Different results from identical matviews

2020-07-02 Thread Anders Steinlein
On Thu, Jul 2, 2020 at 3:44 PM Tom Lane  wrote:

> Anders Steinlein  writes:
> > On Thu, Jul 2, 2020 at 2:02 AM Tom Lane  wrote:
> >> I suspect the query underlying the matviews is less deterministic than
> >> you think it is.
>
> > Thanks for the tip, but I'm having a hard time thinking that's the case,
> > seeing as I'm unable to trigger the wrong result no matter how hard I try
> > with a new definition/manual query.
>
> Well, another line of thought is that there actually is some difference
> between the stored query for the original matview and the ones you enter
> afresh.  You said they were the same, but I surely didn't attempt to
> verify that.  Comparing pg_get_viewdef() output for equality would be
> a good first step.


I used a manual `diff` earlier, but this sure was easier. But yes, the
stored queries are identical:

mm_prod=> select pg_get_viewdef('aakpnews.segments_with_contacts') =
pg_get_viewdef('aakpnews.segments_with_contacts_2');
 ?column?
--
 t
(1 row)

Even that perhaps isn't conclusive, so you could
> also try comparing the pg_rewrite.ev_action fields for the views'
> ON SELECT rules.  (That might be a bit frustrating because of likely
> inconsistencies in node "location" fields; but any other difference
> is cause for suspicion.)
>

You're right, ev_action is indeed different:

mm_prod=> select x1.ev_type = x2.ev_type as ev_type_equal, x1.ev_enabled =
x2.ev_enabled as enabled_equal, x1.is_instead = x2.is_instead as
is_instead_equal, x1.ev_qual = x2.ev_qual as ev_qual_equal, x1.ev_action =
x2.ev_action as ev_action_equal
from
(select pr.* from pg_namespace pn inner join pg_class pc on pc.relnamespace
= pn.oid inner join pg_rewrite pr on pr.ev_class = pc.oid where pn.nspname
= 'aakpnews' and pc.relname = 'segments_with_contacts') x1,
(select pr.* from pg_namespace pn inner join pg_class pc on pc.relnamespace
= pn.oid inner join pg_rewrite pr on pr.ev_class = pc.oid where pn.nspname
= 'aakpnews' and pc.relname = 'segments_with_contacts_2') x2;
 ev_type_equal | enabled_equal | is_instead_equal | ev_qual_equal |
ev_action_equal
---+---+--+---+-
 t | t | t| t | f
(1 row)

Is there somehow I can format them to make it easier to compare? My basic
attempts didn't help me much. I put them up in all their glories in
pastebins, since they are rather large. Please let me know if there is
somehow I can make this easier to look into.

ev_action for segments_with_contacts - the origial matview:
https://pastebin.com/MBJ45prC
ev_action for segments_with_contacts_2 - the similar newly created matview:
https://pastebin.com/sL4WjzBj

Best,
-- a.


Catching errors with Perl DBI

2020-07-02 Thread stan
How can I catch the errors generated whne I call an INSERT that violates a
constraint? I have coded like this:

 my $sth = $dbh->prepare($stmt);
 my $rv = $sth->execute() or die $DBI::errstr;
 if ( $rv < 0 ) {
 print $DBI::errstr;
 }

But, if the INSERT violates a constraint, it never gets the the evaluation
of the $rv

Is this a setting for the DBI?

-- 
"They that would give up essential liberty for temporary safety deserve
neither liberty nor safety."
-- Benjamin Franklin




Re: Different results from identical matviews

2020-07-02 Thread Anders Steinlein
On Thu, Jul 2, 2020 at 3:55 PM David G. Johnston 
wrote:

> On Thursday, July 2, 2020, Anders Steinlein  wrote:
>>
>>
>> I just wanted to add that we're on Postgres 12.3. This matview has been
>> with us since 9.4 days, and we have not experienced any such issues before
>> (could be customers who haven't noticed or reported it to us, of course...).
>>  version
>>
>>
>> -
>>  PostgreSQL 12.3 (Ubuntu 12.3-1.pgdg18.04+1) on x86_64-pc-linux-gnu,
>> compiled by gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0, 64-bit
>>
>
>
>  I concur that the determinism doesn’t seem like a problem - but not much
> else does either.  As a shot in the dark does pg_depend show any
> differences between the dependencies for the two views?
>

Could be worth checking, yes. Could you give me any guidance as to how to
compare this? Never looked at pg_depend before -- which of the columns
should have the oid for the matview I want to look up dependencies for?

How did this migrate from 9.4 to 12?
>

pg_dump and pg_restore. It's been a few months, so unfortunately I can't
recall which pg_dump version was used. Another thing to possibly note is
that the citext extension has subsequently been updated as well; I'm unsure
if the matview has been recreated after that (if that could have any
effect).


> It would be helpful if “Explain analyze refresh materialized view” were a
> thing (is it?)
>

Yeah, I was looking for that too initially when investigating.
Unfortunately, "Utility statements have no plan structure" is the response
given.

If you can backup and restore the existing database (basebackup is more
> likely, but pg_dump would be more useful) and still observe the problem
> then maybe I see hope for digging down into the cause.  Otherwise I’d limit
> my decision to testing for the symptom with the solution being to rebuild
> any problem views.
>

I //think// I have the dump laying around, but if simply rebuilding the
view fixes the problem I'm inclined to just do that, although the issue is
a bit concerning. If anyone here suspects an actual bug with a possible
avenue for further investigation, in which case I would be happy to help.

Best,
-- a


Check Replication lag status

2020-07-02 Thread Brajendra Pratap Singh
Hi,

If my primary postgresql database will crash and there will be some lag
between primary and standby database then how to findout from standby side
that till what duration data will be available on standby postgresql
database?

Thanks,
Brajendra


restore_command for postgresql streaming replication

2020-07-02 Thread Brajendra Pratap Singh
Hi,

Please help us to configure the restore_command on postgresql replication
server's recovery.conf file incase of ERROR: requested wal segment has
already been removed on primary and when we have implemented pgbackrest on
primary database.

Primary database -

archive_command = 'pgbackrest --stanza=uatodb archive-push %p'

Thanks
Singh


Re: Catching errors with Perl DBI

2020-07-02 Thread Gianni Ceccarelli
On Thu, 2 Jul 2020 11:03:37 -0400
stan  wrote:

> my $sth = $dbh->prepare($stmt);
> my $rv = $sth->execute() or die $DBI::errstr;

that ``or die`` means: if the result of the ``execute`` is false
(which only happens on error), throw an exception (which, as you
noticed, terminates the process unless caught with an ``eval {}`` or
similar construct)

> if ( $rv < 0 ) { print $DBI::errstr; }

Notice that ``$rv`` would never be less than 0: for an ``INSERT``,
it's the number of rows inserted (or a special "0 but true" value in
case no rows were inserted).

So, you can do two things:

* keeping the same style::

my $rv = $sth->execute(@bind_values);
if (!$rv) {
print $sth->errstr;
# and probably do something useful here ☺
}

* switching to exceptions everywhere

  Tell DBI you want exceptions::

my $dbh = DBI->connect(
$dsn,$user,$password,
{
PrintError => 0,
RaiseErorr => 1,
PrintWarn  => 0,
RaiseWarn  => 1,
}
);

  then run statements like this::

eval { $dbh->prepare($stmt)->execute(@bind_values) }
or do {
print $@; # the exception is store in this variable
};

-- 
Dakkar - 
GPG public key fingerprint = A071 E618 DD2C 5901 9574
 6FE2 40EA 9883 7519 3F88
key id = 0x75193F88





Re: Catching errors with Perl DBI

2020-07-02 Thread Francisco Olarte
Stan:

On Thu, Jul 2, 2020 at 5:03 PM stan  wrote:

> How can I catch the errors generated whne I call an INSERT that violates a
> constraint? I have coded like this:
>
>  my $sth = $dbh->prepare($stmt);
>  my $rv = $sth->execute() or die $DBI::errstr;
>  if ( $rv < 0 ) {
>  print $DBI::errstr;
>  }
>
> But, if the INSERT violates a constraint, it never gets the the evaluation
> of the $rv

I assume you mean the if($rv<0) is what it is not executed.

In perl this happens because something died. I assume it is not the
one you coded. This means some of your handles have the RaiseError
attribute, lookit up in the perldoc.

> Is this a setting for the DBI?

I do not remember if it has a global setting, but it sure has a
database handle setting ( which percolates down ). I use it routinely
for easier error handling.

I'm not sure if you know how to from your message, but if something is
dying you can use the block eval construct:

eval {
# potentially dying code...
   my $sth = $dbh->prepare($stmt);
   my $rv = $sth->execute() or die $DBI::errstr;
   if ( $rv < 0 ) {
  print $DBI::errstr;
}
 1; # This forces the eval to return true if execution gets here.
} or do {
# Whatever you want, $@ has the codes.
}

to trap it in perl.

About RaiseError, it is common to set it to true in the handle, put
all your code in a sub() and catch it, in programs where you donot
have easy recovery of errors, and use local eval if needed to catch
this kind of prim. key violation things.

Francisco Olarte.




Re: Different results from identical matviews

2020-07-02 Thread David G. Johnston
On Thu, Jul 2, 2020 at 8:06 AM Anders Steinlein  wrote:

> On Thu, Jul 2, 2020 at 3:55 PM David G. Johnston <
> david.g.johns...@gmail.com> wrote:
>
>> On Thursday, July 2, 2020, Anders Steinlein  wrote:
>>>
>>>
>>> I just wanted to add that we're on Postgres 12.3. This matview has been
>>> with us since 9.4 days, and we have not experienced any such issues before
>>> (could be customers who haven't noticed or reported it to us, of course...).
>>>  version
>>>
>>>
>>> -
>>>  PostgreSQL 12.3 (Ubuntu 12.3-1.pgdg18.04+1) on x86_64-pc-linux-gnu,
>>> compiled by gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0, 64-bit
>>>
>>
>>
>>  I concur that the determinism doesn’t seem like a problem - but not much
>> else does either.  As a shot in the dark does pg_depend show any
>> differences between the dependencies for the two views?
>>
>
> Could be worth checking, yes. Could you give me any guidance as to how to
> compare this? Never looked at pg_depend before -- which of the columns
> should have the oid for the matview I want to look up dependencies for?
>

It would be an educational/trial-and-error experience for me as well.  That
you found a difference in pg_rewrite.ev_action probably provides a more
fruitful avenue of attack though I'm inexperienced there as well.  I do
believe that inspecting pg_depend will also highlight whatever difference
you are seeing in the ev_action.  What tickles my curiosity is why that
difference (whatever it is, I haven't looked) isn't manifesting in the \d+
output for the materialized view.

David J.


Re: restore_command for postgresql streaming replication

2020-07-02 Thread Stephen Frost
Greetings,

* Brajendra Pratap Singh (singh.bpratap...@gmail.com) wrote:
> Please help us to configure the restore_command on postgresql replication
> server's recovery.conf file incase of ERROR: requested wal segment has
> already been removed on primary and when we have implemented pgbackrest on
> primary database.
> 
> Primary database -
> 
> archive_command = 'pgbackrest --stanza=uatodb archive-push %p'

For this, the restore_command would be:

restore_command = 'pgbackrest --stanza=uatodb archive-get %f "%p"'

When you use pgbackrest to do a restore with 'pgbackrest restore',
pgbackrest will also write the correct restore_command into the
configuration for the restored system for you.

You might want to check out the user guide:

https://pgbackrest.org/user-guide.html

(that's for v10 on Debian, there's a couple other user guides if you're
on something different that you could look at)

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Different results from identical matviews

2020-07-02 Thread Tom Lane
Anders Steinlein  writes:
>> Even that perhaps isn't conclusive, so you could
>> also try comparing the pg_rewrite.ev_action fields for the views'
>> ON SELECT rules.  (That might be a bit frustrating because of likely
>> inconsistencies in node "location" fields; but any other difference
>> is cause for suspicion.)

> You're right, ev_action is indeed different:
> ...
> Is there somehow I can format them to make it easier to compare? My basic
> attempts didn't help me much. I put them up in all their glories in
> pastebins, since they are rather large. Please let me know if there is
> somehow I can make this easier to look into.

Yeah, expression trees are pretty unreadable :-(.  I downloaded these,
changed all the "location" fields to -1 to make them more comparable,
and behold there are still a bunch of diffs.  Here's one:

original view:

{JOINEXPR :jointype 0 :isNatural false :larg 
{RANGETBLREF :rtindex 1}
 :rarg 
{RANGETBLREF :rtindex 2}
 :usingClause ("email") :quals 
{OPEXPR :opno 98 :opfuncid 67 :opresulttype 16 :opretset false :opcollid 0 
:inputcollid 100 :args (
{RELABELTYPE :arg 
{VAR :varno 1 :varattno 2 :vartype 106893 :vartypmod -1 :varcollid 100 
:varlevelsup 0 :varnoold 1 :varoattno 2 :location -1}
 :resulttype 25 :resulttypmod -1 :resultcollid 100 :relabelformat 2 :location 
-1}
{RELABELTYPE :arg 
{VAR :varno 2 :varattno 2 :vartype 106893 :vartypmod -1 :varcollid 100 
:varlevelsup 0 :varnoold 2 :varoattno 2 :location -1}
 :resulttype 25 :resulttypmod -1 :resultcollid 100 :relabelformat 2 :location 
-1}
) :location -1}
 :alias <> :rtindex 3}

new view:

{JOINEXPR :jointype 0 :isNatural false :larg 
{RANGETBLREF :rtindex 1}
 :rarg 
{RANGETBLREF :rtindex 2}
 :usingClause ("email") :quals 
{OPEXPR :opno 106108 :opfuncid 106101 :opresulttype 16 :opretset false 
:opcollid 0 :inputcollid 100 :args (
{RELABELTYPE :arg 
{VAR :varno 1 :varattno 2 :vartype 106893 :vartypmod -1 :varcollid 100 
:varlevelsup 0 :varnoold 1 :varoattno 2 :location -1}
 :resulttype 106084 :resulttypmod -1 :resultcollid 100 :relabelformat 2 
:location -1}
{RELABELTYPE :arg 
{VAR :varno 2 :varattno 2 :vartype 106893 :vartypmod -1 :varcollid 100 
:varlevelsup 0 :varnoold 2 :varoattno 2 :location -1}
 :resulttype 106084 :resulttypmod -1 :resultcollid 100 :relabelformat 2 
:location -1}
) :location -1}
 :alias <> :rtindex 3}

This is the internal form of a "JOIN ... USING (email)" construct.
I didn't try to trace this back to exactly where it was in the source
queries.  The important thing here is that we have a couple of Vars
of type 106893, which I gather must be citext or a domain over it.
In the first tree, those are coerced via a no-op RelabelType operation
into plain text (type OID 25) and then compared with the built-in texteq
operator.  In the second tree, they are coerced to some other non-built-in
type (maybe plain citext?) and then compared with operator 106108.

I am betting that 106084 is citext, 106108 is citext's equality operator,
and the net implication of all this is that the original matview is doing
the JOIN using case-sensitive equality whereas the new one is using
case-insensitive equality.

A plausible explanation for how things got that way is that citext's
equality operator wasn't in your search_path when you created the original
matview, but it is in view when you make the new one, allowing that
equality operator to capture the interpretation of USING.  Unfortunately,
since the reverse-listing of this join is just going to say "USING
(email)", there's no way to detect from human-readable output that the
interpretation of the USING clauses is different.  (We've contemplated
introducing not-SQL-standard syntax to allow flagging such cases, but
haven't pulled the trigger on that.)

I count five places in the query with similar operator substitutions.
There are some other diffs in the trees that are a bit odd, but might be
explained if the new view was made by dump/reload rather than from the
identical SQL text the original view was made from; they all look like
they are references to JOIN output columns rather than the underlying
table columns or vice versa.  That's probably harmless, but the different
join operators certainly are not.

regards, tom lane




Re: Different results from identical matviews

2020-07-02 Thread David G. Johnston
On Thu, Jul 2, 2020 at 8:44 AM Tom Lane  wrote:

> A plausible explanation for how things got that way is that citext's
> equality operator wasn't in your search_path when you created the original
> matview, but it is in view when you make the new one, allowing that
> equality operator to capture the interpretation of USING.  Unfortunately,
> since the reverse-listing of this join is just going to say "USING
> (email)", there's no way to detect from human-readable output that the
> interpretation of the USING clauses is different.  (We've contemplated
> introducing not-SQL-standard syntax to allow flagging such cases, but
> haven't pulled the trigger on that.)
>

The citext extension seems to have been installed into the public schema as
well which could introduce the CVE-2018-1058 fix as a potential moving part.

It seems a bit odd though since the textual query does specify "DISTINCT
mails_contacts_opens.email::public.citext" so it does seem to be
search_path induced as the view couldn't exist if the extension was simply
missing not extension specific equality operator were present to match in
front of the default equality operator.  But then those casts also make me
question whether the source tables are defined using text instead of citext
in which case the joins using text equality would be expected and their
using citext equality in the new queries potentially suspect.

David J.


Re: survey: psql syntax errors abort my transactions

2020-07-02 Thread David G. Johnston
On Thu, Jul 2, 2020 at 8:54 AM Jeremy Schneider  wrote:

>
> https://www.postgresql.org/message-id/flat/CABTbUpiAOKZ405uArt8cJFtC72RhzthmvWETQK_6Qw0Ad-HquQ%40mail.gmail.com
>
> This thread on hackers actually seemed kindof short to me. Not nearly
> enough bike-shedding to call it a closed case.
>

Seemed about right:

"We should do this"
Yes
No - with a reason
No - with a reason
No - improve user education
No - emphatically
Yes - but ends ups deferring to the majority

No one else chooses to voice an opinion

The status quo prevailed since no-one chose to contribute further arguments
for change and the original patch was retracted.  What kind of
"bike-shedding" (which seems to be used incorrectly here) would you expect?

All I can speak for is personal usage but I don't find the current default
to be an issue.  I'm also generally opposed to changing this kind of
default even when I disagree with its current value.  If anything psql is a
bit too permissive by default IMO.  Default should be as safe as possible
even at the cost of user inconvenience - so that unknowledgeable people get
the most protection.  If options exist to trade safety for convenience that
is good - each user can make that trade-off for themselves and in the
process be aware of what exactly their decision entails.

David J.


Re: survey: psql syntax errors abort my transactions

2020-07-02 Thread Adrian Klaver

On 7/2/20 8:54 AM, Jeremy Schneider wrote:
Maybe it's just me, but I'm wondering if it's worth changing the default 
behavior of psql so it doesn't abort transactions in interactive mode 
when I mistakenly mis-spell "select" or something silly like that.  This 
is of course easily remedied in my psqlrc file by adding "\set 
ON_ERROR_ROLLBACK interactive". I don't know whether there are 
equivalent settings for pgAdmin and Toad and whatever other tools people 
are using for their interactive SQL sessions. But I do know that for all 
the new people coming to PostgreSQL right now (including lots at my 
company), none of them are going to know about this setting and 
personally I think the default is user-unfriendly.


https://www.postgresql.org/docs/current/app-psql.html#id-1.9.4.18.8.5.2.9.17.1.2

A couple years back, some hackers discussed changing the default, and it 
was decided against (IIUC) because of concerns about broken scripts 
suddenly causing damage rather than aborting out. (About which... I 
think if a script is sending broken SQL, then it might not be checking 
error return values either and will likely keep running even after 
PostgreSQL ignores a few SQL statements after the error in the current 
session...)


https://www.postgresql.org/message-id/flat/CABTbUpiAOKZ405uArt8cJFtC72RhzthmvWETQK_6Qw0Ad-HquQ%40mail.gmail.com

This thread on hackers actually seemed kindof short to me. Not nearly 
enough bike-shedding to call it a closed case. It also seems to me that 
the community has made significant changes across new major versions in 
the past, and this idea here might not be entirely off the table quite yet.


So...

Survey for the user community here on the pgsql-general list: it would 
be great if lots of people could chime in by answering two questions 
about your very own production environment:


question 1) are you worried about scripts in your production environment 
where damage could be caused by a different default in a future new 
major version of postgresql?  (not aborting transactions in interactive 
mode when syntax errors occur)


question 2) do you think the increased user-friendliness of changing 
this default behavior would be worthwhile for specific users in your 
organization who use postgresql?  (including both yourself and others 
you know of)


I would say just add a message to the ERROR that points out 
ON_ERROR_ROLLBACK = 'on' is available. For instance:


test(5432)=# begin ;
BEGIN
test(5432)=# select 1/0;
ERROR:  division by zero
test(5432)=# select 1;
ERROR:  current transaction is aborted, commands ignored until end of 
transaction block


New part of message
HINT: \set ON_ERROR_ROLLBACK on to rollback on error.



As someone working at a large company with an aweful lot of PostgreSQL, 
thinking of the internal production systems I'm aware of, I'd personally 
vote pretty strongly in favor of changing the default.


-Jeremy


--
Jeremy Schneider
Database Engineer
Amazon Web Services




--
Adrian Klaver
adrian.kla...@aklaver.com




Re: survey: psql syntax errors abort my transactions

2020-07-02 Thread Adrian Klaver

On 7/2/20 9:44 AM, Jeremy Schneider wrote:

On 7/2/20 09:28, David G. Johnston wrote:
The status quo prevailed since no-one chose to contribute further 
arguments for change and the original patch was retracted.  What kind 
of "bike-shedding" (which seems to be used incorrectly here) would you 
expect?


The bike-shedding comment reflects my sense of humor, which is evidently 
so dry that it can be entirely un-detectable.  :)  But overall I do 
think there's room for input from more people.  Might not change the 
outcome, I'm just curious if there are more people who'd have thoughts 
to offer.


I'm not sure hint text would be the right course, as the hint wouldn't 
make sense in the context of other clients and I don't know if we have a 
mechanism now for the server to change it's error response based on 
which client is being used.


But psql can. psql knows what the setting is and act accordingly e.g.:

test(5432)=# \x
Expanded display is on.

My suggestion is that the hint would only be shown in psql. Using  HINT: 
maybe not be the best choice as it shows up in server messages, but 
something along that line.




One other thought occurred to me after sending this email - does 
ON_ERROR_ROLLBACK=interactive differentiate between syntax errors and 
other errors? Thinking about how users approach SQL, this feels 
significant. I'd happily want to let users at my company retry after 
syntax errors, but I'd want them to inspect any other error more closely.


-Jeremy

--
Jeremy Schneider
Database Engineer
Amazon Web Services




--
Adrian Klaver
adrian.kla...@aklaver.com




Re: survey: psql syntax errors abort my transactions

2020-07-02 Thread Adrian Klaver

On 7/2/20 9:44 AM, Jeremy Schneider wrote:

On 7/2/20 09:28, David G. Johnston wrote:
The status quo prevailed since no-one chose to contribute further 
arguments for change and the original patch was retracted.  What kind 
of "bike-shedding" (which seems to be used incorrectly here) would you 
expect?


The bike-shedding comment reflects my sense of humor, which is evidently 
so dry that it can be entirely un-detectable.  :)  But overall I do 
think there's room for input from more people.  Might not change the 
outcome, I'm just curious if there are more people who'd have thoughts 
to offer.


I'm not sure hint text would be the right course, as the hint wouldn't 
make sense in the context of other clients and I don't know if we have a 
mechanism now for the server to change it's error response based on 
which client is being used.


One other thought occurred to me after sending this email - does 
ON_ERROR_ROLLBACK=interactive differentiate between syntax errors and 
other errors? Thinking about how users approach SQL, this feels 
significant. I'd happily want to let users at my company retry after 
syntax errors, but I'd want them to inspect any other error more closely.


An error is an error:

\set ON_ERROR_ROLLBACK interactive

test(5432)=# begin ;
BEGIN
test(5432)=# slect 1;
ERROR:  syntax error at or near "slect"
LINE 1: slect 1;
^
test(5432)=# select 1;
 ?column?
--
1
(1 row)

test(5432)=# select 1/0;
ERROR:  division by zero
test(5432)=# select 1;
 ?column?
--
1
(1 row)



-Jeremy

--
Jeremy Schneider
Database Engineer
Amazon Web Services




--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Different results from identical matviews

2020-07-02 Thread Anders Steinlein
On Thu, Jul 2, 2020 at 5:43 PM Tom Lane  wrote:

> Anders Steinlein  writes:
> >> Even that perhaps isn't conclusive, so you could
> >> also try comparing the pg_rewrite.ev_action fields for the views'
> >> ON SELECT rules.  (That might be a bit frustrating because of likely
> >> inconsistencies in node "location" fields; but any other difference
> >> is cause for suspicion.)
>
> > You're right, ev_action is indeed different:
> > ...
> > Is there somehow I can format them to make it easier to compare? My basic
> > attempts didn't help me much. I put them up in all their glories in
> > pastebins, since they are rather large. Please let me know if there is
> > somehow I can make this easier to look into.
>
> Yeah, expression trees are pretty unreadable :-(.  I downloaded these,
> changed all the "location" fields to -1 to make them more comparable,
> and behold there are still a bunch of diffs.  Here's one:
>
> [...]
>
> This is the internal form of a "JOIN ... USING (email)" construct.
> I didn't try to trace this back to exactly where it was in the source
> queries.  The important thing here is that we have a couple of Vars
> of type 106893, which I gather must be citext or a domain over it.
>

Yes, we have a domain called `email` over citext. The reason for the
multiple casts to citext, IIRC, was that i.e. array_agg() didn't accept the
email domain directly.


> In the first tree, those are coerced via a no-op RelabelType operation
> into plain text (type OID 25) and then compared with the built-in texteq
> operator.  In the second tree, they are coerced to some other non-built-in
> type (maybe plain citext?) and then compared with operator 106108.
>
> I am betting that 106084 is citext, 106108 is citext's equality operator,
> and the net implication of all this is that the original matview is doing
> the JOIN using case-sensitive equality whereas the new one is using
> case-insensitive equality.
>

This makes sense. We've narrowed the different results down to this exact
case; that is, for the rows that are missing in the old matview, they have
"email" entries in different case between some of the tables. So
case-sensitive equality checks on these will naturally be lost somewhere.
How this came to be is another question...

A plausible explanation for how things got that way is that citext's
> equality operator wasn't in your search_path when you created the original
> matview, but it is in view when you make the new one, allowing that
> equality operator to capture the interpretation of USING.


Possibly, since this view has existed for many years. However in general,
our multi-tenant migration system does SET search_path TO , public for
all DDL we do, so unless we had an issue whey back when it should've been
present.


> Unfortunately,
> since the reverse-listing of this join is just going to say "USING
> (email)", there's no way to detect from human-readable output that the
> interpretation of the USING clauses is different.  (We've contemplated
> introducing not-SQL-standard syntax to allow flagging such cases, but
> haven't pulled the trigger on that.)
>

If I'm reading this correctly, would this be a "reason" to be more explicit
when doing joins involving non-standard data types? I.e. would it be
"safer" to do ON x1.email::citext == x2.email::citext instead of USING (if
there is any difference at all...)?

I count five places in the query with similar operator substitutions.
> There are some other diffs in the trees that are a bit odd, but might be
> explained if the new view was made by dump/reload rather than from the
> identical SQL text the original view was made from; they all look like
> they are references to JOIN output columns rather than the underlying
> table columns or vice versa.  That's probably harmless, but the different
> join operators certainly are not.
>

I looked over the procedure we used for the upgrade, and it was this:
Postgres 9.4 server backup using WAL-E, restored into 9.4 on a new box,
where we did some data cleanup and upgraded extensions including citext.
Then used pg_dump from 12, directory format, and restored into 12. Since
that time no DDL or extension changes have been made on the tables/view
involved here. Could the citext upgrade have had any effect here, messing
with dependencies somehow?

Anyway, I think I'll just recreate these views for all tenants to be on the
safe side. Still curious about how this came to be, but since it's about 10
years of history in this database I guess it will be hard to figure
anything out for sure.

Thanks a lot for looking into this, let me know if there's any reason to
dig further.

Best,
-- a.


Re: Different results from identical matviews

2020-07-02 Thread Tom Lane
Anders Steinlein  writes:
> On Thu, Jul 2, 2020 at 5:43 PM Tom Lane  wrote:
>> Unfortunately,
>> since the reverse-listing of this join is just going to say "USING
>> (email)", there's no way to detect from human-readable output that the
>> interpretation of the USING clauses is different.  (We've contemplated
>> introducing not-SQL-standard syntax to allow flagging such cases, but
>> haven't pulled the trigger on that.)

> If I'm reading this correctly, would this be a "reason" to be more explicit
> when doing joins involving non-standard data types? I.e. would it be
> "safer" to do ON x1.email::citext == x2.email::citext instead of USING (if
> there is any difference at all...)?

Yes, it would be.  Of course then you don't get the "merging" of the two
join output columns into one, so you might have to qualify references a
bit more.

You might find this thread interesting:

https://www.postgresql.org/message-id/flat/ffefc172-a487-aa87-a0e7-472bf29735c8%40gmail.com

regards, tom lane




Does TOAST really compress the complete row?

2020-07-02 Thread Thomas Kellerer

I am confused about one claim in this blog post: 
https://www.2ndquadrant.com/en/blog/oracle-to-postgresql-binary-objects


All columns that come after data > 2000 bytes participate in The
Large Attribute Strorage Technique (TOAST). This storage is for the
row, not the column. Your id column comes as the last column in the
table? Whoopsie, your primary key just got shoved into blob storage

I always was under the impression that TOASTing only happens on column level, 
not on row level.
The manual does not mention anything about the whole row being TOASTed if one 
column exceeds the threshold.

Can someone clarify please?

Thomas





Re: Different results from identical matviews

2020-07-02 Thread Anders Steinlein
On Thu, Jul 2, 2020 at 11:44 PM Tom Lane  wrote:

> Anders Steinlein  writes:
> > I'm reading this correctly, would this be a "reason" to be more explicit
> > when doing joins involving non-standard data types? I.e. would it be
> > "safer" to do ON x1.email::citext == x2.email::citext instead of USING
> (if
> > there is any difference at all...)?
>
> Yes, it would be.  Of course then you don't get the "merging" of the two
> join output columns into one, so you might have to qualify references a
> bit more.
>
> You might find this thread interesting:
>
>
> https://www.postgresql.org/message-id/flat/ffefc172-a487-aa87-a0e7-472bf29735c8%40gmail.com


Indeed interesting, thanks!

Am I right in thinking that we should actually go over (i.e. re-create) all
functions and views defined before this dump/restore where we're using JOIN
... USING (citext_column)? We most definitely have many more such cases,
since this is the common (perhaps naive) way we've written joins (unless
there are obvious reasons to be explicit). :-/

Best,
-- a.


Re: Does TOAST really compress the complete row?

2020-07-02 Thread Tom Lane
Thomas Kellerer  writes:
> I am confused about one claim in this blog post: 
> https://www.2ndquadrant.com/en/blog/oracle-to-postgresql-binary-objects

>> All columns that come after data > 2000 bytes participate in The
>> Large Attribute Strorage Technique (TOAST). This storage is for the
>> row, not the column. Your id column comes as the last column in the
>> table? Whoopsie, your primary key just got shoved into blob storage

> I always was under the impression that TOASTing only happens on column level, 
> not on row level.

You're right, and the quoted text is wrong.  Not only does TOAST compress
fields not whole rows, but it selectively targets wider fields first.
If your pkey is getting toasted, you should likely rethink your choice
of pkey.  (Or, possibly, you just have so many fields there's no choice
but to compress all of them.  Then it might be time for a table redesign.)

The decision-making about this is concentrated in 
heap_toast_insert_or_update, which can be seen here:

https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/access/heap/heaptoast.c

regards, tom lane




Re: Different results from identical matviews

2020-07-02 Thread Tom Lane
Anders Steinlein  writes:
> Am I right in thinking that we should actually go over (i.e. re-create) all
> functions and views defined before this dump/restore where we're using JOIN
> ... USING (citext_column)? We most definitely have many more such cases,
> since this is the common (perhaps naive) way we've written joins (unless
> there are obvious reasons to be explicit). :-/

If it's not clear to you how this matview came to be different from
the rest, then it certainly seems likely that other ones might have
the same disease.

Note that functions don't really have this sort of issue, since they're
just stored as text.  Only a view or matview would preserve creation-time
decisions about interpretation.

regards, tom lane




Re: Does TOAST really compress the complete row?

2020-07-02 Thread Adam Brusselback
Another thing that was said I wasn't aware of and have not been able to
find any evidence to support:

> 10. Blobs don’t participate in Logical replication.


Re: Does TOAST really compress the complete row?

2020-07-02 Thread Adrian Klaver

On 7/2/20 4:29 PM, Adam Brusselback wrote:
Another thing that was said I wasn't aware of and have not been able to 
find any evidence to support:


https://www.postgresql.org/docs/12/logical-replication-restrictions.html

"Large objects (see Chapter 34) are not replicated. There is no 
workaround for that, other than storing data in normal tables."


Of course that does not apply to bytea:

https://www.postgresql.org/docs/12/datatype-binary.html



 > 10. Blobs don’t participate in Logical replication.



--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Does TOAST really compress the complete row?

2020-07-02 Thread Adam Brusselback
>
> > https://www.postgresql.org/docs/12/logical-replication-restrictions.html

> >

> > "Large objects (see Chapter 34) are not replicated. There is no

> > workaround for that, other than storing data in normal tables."

> >

> > Of course that does not apply to bytea:

> >
> https://www.postgresql.org/docs/12/datatype-binary.html

That makes sense now, I was reading that section as if it were talking
about bytea, not LO.

Thanks for pointing that out!
- Adam


Re: survey: psql syntax errors abort my transactions

2020-07-02 Thread Laurenz Albe
On Thu, 2020-07-02 at 09:31 -0700, Adrian Klaver wrote:
> I would say just add a message to the ERROR that points out 
> ON_ERROR_ROLLBACK = 'on' is available. For instance:
> 
> test(5432)=# begin ;
> BEGIN
> test(5432)=# select 1/0;
> ERROR:  division by zero
> test(5432)=# select 1;
> ERROR:  current transaction is aborted, commands ignored until end of 
> transaction block
> 
> New part of message
> HINT: \set ON_ERROR_ROLLBACK on to rollback on error.

-1

That would increase the already annoying spam of "current transaction is 
aborted"
messages when your transaction fails.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: survey: psql syntax errors abort my transactions

2020-07-02 Thread Laurenz Albe
On Thu, 2020-07-02 at 08:54 -0700, Jeremy Schneider wrote:
> Maybe it's just me, but I'm wondering if it's worth changing the default 
> behavior
> of psql so it doesn't abort transactions in interactive mode when I mistakenly
> mis-spell "select" or something silly like that.
> This is of course easily remedied in my psqlrc file by adding "\set 
> ON_ERROR_ROLLBACK interactive".
> [...]
> But I do know that for all the new people coming to PostgreSQL right now
> (including lots at my company), none of them are going to know about this 
> setting
> and personally I think the default is user-unfriendly.
> [...]
>
> So...
> 
> Survey for the user community here on the pgsql-general list: it would be 
> great if lots
> of people could chime in by answering two questions about your very own 
> production environment:
> 
> question 1) are you worried about scripts in your production environment 
> where damage
> could be caused by a different default in a future new major version of 
> postgresql?
> not aborting transactions in interactive mode when syntax errors occur)

I would dislike if interactive mode behaves differently from a non-interactive 
mode.

This is my favorite example why I like the way PostgreSQL does things:

/* poor man's VACUUM (FULL) */
BEGIN;
CREATTE TABLE t2 AS SELECT * FROM t1;
DROP TABLE t1;
ALTER TABLE t2 RENAME TO t1;
COMMIT;

> question 2) do you think the increased user-friendliness of changing this 
> default
> behavior would be worthwhile for specific users in your organization who use 
> postgresql?
> (including both yourself and others you know of)

I personally would benefit because I wouldn't have to repeat the whole 
transaction
while teaching a class when I made a typo inside a transaction.

Still I prefer the way things are currently.  Teaching classes is not the main
use case of psql.

Yours,
Laurenz Albe