Re: pg_basebackup

2019-12-23 Thread Kyotaro Horiguchi
Hello.

At Mon, 23 Dec 2019 03:38:12 +, Daulat Ram  
wrote in 
> Thanks Adrian. What about the
> postmaster.opts file? This file was also skipped in the backup.

The file is overwritten at startup, so there's no point in including it
in a backup; it is deliberately excluded.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




SQL operator '*='

2019-12-23 Thread Matthias Apitz


Hello,

I have a small problem here in our port of Sybase/Oracle/Informix
code to PostgreSQL; the code reads as follows for the mentioned DBMSes:


...
#ifdef DBSINF
EXEC SQL DECLARE land_cursor CURSOR FOR
SELECT stammprio, lkz, landbez, plkz, postbez, karenz1, karenz2,
karenz3, land.wkz, webez, we, kurs, land.del
FROM   land, OUTER devisen
WHERE  land.wkz = devisen.wkz AND land.brgroup = 
devisen.brgroup AND land.brgroup = :brgroupHost_for_helpland_cursor
ORDER  BY stammprio, landbez;
#endif

#ifdef DBSORA
EXEC SQL DECLARE land_cursor CURSOR FOR
SELECT stammprio, lkz, landbez, plkz, postbez, karenz1, karenz2,
karenz3, land.wkz, webez, we, kurs, land.del
FROM   land, devisen
WHERE  land.wkz = devisen.wkz (+) AND land.brgroup = 
devisen.brgroup (+) AND land.brgroup = :brgroupHost_for_helpland_cursor
ORDER  BY stammprio, landbez;
#endif

#ifdef DBSSYB
EXEC SQL DECLARE land_cursor CURSOR FOR 
SELECT stammprio, lkz, landbez, plkz, postbez, karenz1, 
karenz2, 
karenz3, land.wkz, webez, we, kurs, land.del
FROM   land, devisen
WHERE  land.wkz *= devisen.wkz AND land.brgroup *= 
devisen.brgroup AND land.brgroup = :brgroupHost_for_helpland_cursor
ORDER  BY stammprio, landbez;
#endif

#ifdef DBSPOS
EXEC SQL DECLARE land_cursor CURSOR FOR 
SELECT stammprio, lkz, landbez, plkz, postbez, karenz1, 
karenz2, 
karenz3, land.wkz, webez, we, kurs, land.del
FROM   land, devisen
WHERE  land.wkz *= devisen.wkz AND land.brgroup *= 
devisen.brgroup AND land.brgroup = :brgroupHost_for_helpland_cursor
ORDER  BY stammprio, landbez;
#endif

(the code for DBSPOS was just copied from Sybase). It compiles fine, but
on execution it raises an error saying the operator '*=' is not supported...

Any ideas about how to express this correctly in PostgreSQL?

Thanks

matthias



-- 
Matthias Apitz, ✉ g...@unixarea.de, http://www.unixarea.de/ +49-176-38902045
Public GnuPG key: http://www.unixarea.de/key.pub

"Glaube wenig, hinterfrage alles, denke selbst: Wie man Manipulationen 
durchschaut"
"Believe little, scrutinise all, think by your own: How see through 
manipulations"
ISBN-10: 386489218X




Re: SQL operator '*='

2019-12-23 Thread Thomas Kellerer

Matthias Apitz wrote on 23.12.2019 at 15:33:

I have a small problem here in our port of Sybase/Oracle/Informix
code to PostgreSQL; the code reads as follows for the mentioned DBMSes:


#ifdef DBSORA
EXEC SQL DECLARE land_cursor CURSOR FOR
SELECT stammprio, lkz, landbez, plkz, postbez, karenz1, karenz2,
karenz3, land.wkz, webez, we, kurs, land.del
FROM   land, devisen
WHERE  land.wkz = devisen.wkz (+) AND land.brgroup = 
devisen.brgroup (+) AND land.brgroup = :brgroupHost_for_helpland_cursor
ORDER  BY stammprio, landbez;
#endif

#ifdef DBSSYB
EXEC SQL DECLARE land_cursor CURSOR FOR
SELECT stammprio, lkz, landbez, plkz, postbez, karenz1, karenz2,
karenz3, land.wkz, webez, we, kurs, land.del
FROM   land, devisen
WHERE  land.wkz *= devisen.wkz AND land.brgroup *= 
devisen.brgroup AND land.brgroup = :brgroupHost_for_helpland_cursor
ORDER  BY stammprio, landbez;
#endif

(the code for DBSPOS was just copied from Sybase). It compiles fine, but
on execution it raises an error saying the operator '*=' is not supported...


T-SQL (Sybase and SQL Server) uses *= for outer joins, just as Oracle uses (+)

Haven't used either of those outdated operators in decades, but I think the 
equivalent would be:

FROM land
  LEFT JOIN devisen
         ON land.wkz = devisen.wkz
        AND land.brgroup = devisen.brgroup
        AND land.brgroup = :brgroupHost_for_helpland_cursor





Mixing greediness in regexp_matches

2019-12-23 Thread Daniel Verite
  Hi,

When looking into how to implement a global replace of multiple
substrings (each with their own replacement) in sql
or plpgsql, I'm wondering if/how an RE with an alternation
can be used.

The basic idea is to iterate on the rows produced by

   regexp_matches(string, '(.*?)(foo|bar|foobar)', 'g')

to break down the string into pairs of (non-matching segment,
matching segment) so that a final result can be assembled
from that (setting aside the last non-matching segment, that
can be retrieved in a final step).

The difficulty is that the longest strings in the alternation
should be prioritized, but the starting (.*?) makes the RE
non-greedy, so "foo" is chosen over "foobar". 

The doc at [1] leaves me unoptimistic when it mentions that:

 "...when an RE contains both greedy and non-greedy
 subexpressions, the total match length is either as long as possible
 or as short as possible, according to the attribute assigned to the
 whole RE. The attributes assigned to the subexpressions only affect
 how much of that match they are allowed to “eat” relative to each
 other."

Also it gives this example of forcing the RE as a whole to
be greedy despite it having a non-greedy sub-RE:
 regexp_match('abc01234xyz', '(?:(.*?)(\d+)(.*)){1,1}')
but it doesn't seem to be able to produce the desired result
in the case of the RE in the middle being an alternation
with strings of different lengths.

The ideal RE with a (foo|foobar|bar) alternation, when applied
globally to a string like 'the string has foo and foobar and bar and
more' would produce something like:

 {"the string has ","foo"}
 {" and ","foobar"}
 {" and ","bar"}

Is that possible with regexp_matches?


[1] https://www.postgresql.org/docs/current/functions-matching.html




Re: Mixing greediness in regexp_matches

2019-12-23 Thread Tom Lane
"Daniel Verite"  writes:
> The basic idea is to iterate on the rows produced by
>regexp_matches(string, '(.*?)(foo|bar|foobar)', 'g')
> to break down the string into pairs of (non-matching segment,
> matching segment) so that a final result can be assembled
> from that (setting aside the last non-matching segment, that
> can be retrieved in a final step).
> The difficulty is that the longest strings in the alternation
> should be prioritized, but the starting (.*?) makes the RE
> non-greedy, so "foo" is chosen over "foobar". 

I'd try forcing the match to be the whole string, ie

^(.*?)(foo|bar|foobar)(.*)$

which would also save some work for restarting the iteration,
since you'd have already captured the all-the-rest substring.

With the endpoints forced, it doesn't really matter whether
the engine deems the RE-as-a-whole to be greedy or not.
I think this would work without needing any explicit greediness
marking for the second and third parts, but I might be wrong
about that detail.

regards, tom lane




Re: Mixing greediness in regexp_matches

2019-12-23 Thread Tom Lane
"Daniel Verite"  writes:
>> The basic idea is to iterate on the rows produced by
>> regexp_matches(string, '(.*?)(foo|bar|foobar)', 'g')
>> to break down the string into pairs of (non-matching segment,
>> matching segment) so that a final result can be assembled
>> from that (setting aside the last non-matching segment, that
>> can be retrieved in a final step).

BTW, just to think outside the box a bit, I wonder whether you
couldn't build this out of regexp_split_to_array:

regression=# select regexp_split_to_array('junkfoolbarfoolishfoobarmore', 
'foo|bar|foobar');
 regexp_split_to_array 
-----------------------
 {junk,l,"",lish,more}
(1 row)

The idea would be to iterate over the array elements, tracking the
corresponding position in the source string, and re-discovering at
each break which of the original alternatives must've matched.

It's sort of annoying that we don't have a simple "regexp_location"
function that would give you back the starting position of the
first match.
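That split-and-rediscover loop can be sketched outside the database; this Python version is only an illustration (the function name and sample data are made up here). Note that the alternatives are sorted longest-first, because Python's re tries alternation branches in written order, whereas PostgreSQL's POSIX engine apparently prefers the longest match (as the output above shows):

```python
import re

def split_with_matches(string, alternatives):
    # Longest-first ordering approximates POSIX longest-match behaviour,
    # since Python's re tries alternation branches in written order.
    pattern = "|".join(map(re.escape,
                           sorted(alternatives, key=len, reverse=True)))
    pieces = re.split(pattern, string)
    pairs, pos = [], 0
    # Walk the split pieces, tracking the position in the source string,
    # and re-discover which alternative matched at each break.
    for piece in pieces[:-1]:  # the last piece has no separator after it
        pos += len(piece)
        m = re.match(pattern, string[pos:])
        pairs.append((piece, m.group(0)))
        pos += len(m.group(0))
    return pairs

print(split_with_matches("junkfoolbarfoolishfoobarmore",
                         ["foo", "bar", "foobar"]))
# -> [('junk', 'foo'), ('l', 'bar'), ('', 'foo'), ('lish', 'foobar')]
```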

regards, tom lane




Re: Mixing greediness in regexp_matches

2019-12-23 Thread Daniel Verite
Tom Lane wrote:

> I'd try forcing the match to be the whole string, ie
> 
> ^(.*?)(foo|bar|foobar)(.*)$
> 
> which would also save some work for restarting the iteration,
> since you'd have already captured the all-the-rest substring.

In that case regexp_matches will return 0 or 1 row. In the
above-mentioned example, that would be:

=> select regexp_matches('the string has foo and foobar and bar and more',
 '^(.*?)(foo|foobar|bar)(.*)$', 'g');
                     regexp_matches                      
---------------------------------------------------------
 {"the string has ",foo," and foobar and bar and more"}

So the next iteration would consist of calling regexp_matches()
on result[3], and so on until no match is found.
I think it would work as desired, but probably much less efficiently on
large strings/large number of matches than if a single call of
regexp_matches() could return all matches.
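The iteration described above can be sketched procedurally; this Python version only illustrates the control flow (the function name is made up), with the alternation written longest-first so the longer alternative wins:

```python
import re

def iterate_matches(string):
    # Anchored pattern: non-matching prefix, one alternative, then the
    # already-captured remainder. "foobar" is listed before "foo" so the
    # longer alternative wins.
    pattern = re.compile(r'^(.*?)(foobar|foo|bar)(.*)$', re.DOTALL)
    pairs = []
    while True:
        m = pattern.match(string)
        if not m:
            break
        pairs.append((m.group(1), m.group(2)))
        string = m.group(3)  # restart on the rest of the string
    return pairs

print(iterate_matches("the string has foo and foobar and bar and more"))
# -> [('the string has ', 'foo'), (' and ', 'foobar'), (' and ', 'bar')]
```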


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite




Re: Mixing greediness in regexp_matches

2019-12-23 Thread Daniel Verite
Tom Lane wrote:

> regression=# select regexp_split_to_array('junkfoolbarfoolishfoobarmore',
> 'foo|bar|foobar');
> regexp_split_to_array 
> -----------------------
> {junk,l,"",lish,more}
> (1 row)
> 
> The idea would be to iterate over the array elements, tracking the
> corresponding position in the source string, and re-discovering at
> each break which of the original alternatives must've matched.
> 
> It's sort of annoying that we don't have a simple "regexp_location"
> function that would give you back the starting position of the
> first match.

It occurred to me too that regexp_split_to_table or array would make
this problem really easy if only it had a mode to capture and return the
matched parts too.

FWIW, in plperl, there's a simple solution:

 $string =~ s/(foobar|foo|...)/$replace{$1}/g

when %replace is a hash of the substitutions %(foo=>baz,...).
The strings in the alternation are tested in their order of
appearance, so you can choose to be greedy or not by just sorting
them by length.
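For comparison, the same ordered-alternation trick works outside Perl as well; here is a Python sketch (the replacements dict and sample string are illustrative):

```python
import re

def replace_all(string, replacements):
    # Sort alternatives longest-first so "foobar" wins over "foo";
    # Python's re tries alternation branches in written order.
    pattern = "|".join(re.escape(k)
                       for k in sorted(replacements, key=len, reverse=True))
    return re.sub(pattern, lambda m: replacements[m.group(0)], string)

print(replace_all("the string has foo and foobar and bar and more",
                  {"foo": "F", "foobar": "FB", "bar": "B"}))
# -> the string has F and FB and B and more
```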


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite




Re: Mixing greediness in regexp_matches

2019-12-23 Thread Tom Lane
"Daniel Verite"  writes:
> FWIW, in plperl, there's a simple solution:
>  $string =~ s/(foobar|foo|...)/$replace{$1}/g

Well, our manual does suggest using plperl (or pltcl) when the
built-in pattern match functions aren't adequate ;-)

regards, tom lane




Re: Commit to primary with unavailable sync standby

2019-12-23 Thread Maksim Milyutin



On 19.12.2019 18:08, Fabio Ugo Venchiarutti wrote:



On 19/12/2019 13:58, Maksim Milyutin wrote:

On 19.12.2019 14:04, Andrey Borodin wrote:


Hi!



Hi!

FYI, this topic was up recently in -hackers 
https://www.postgresql.org/message-id/caeet0zhg5off7iecby6tzadh1moslmfz1hlm311p9vot7z+...@mail.gmail.com 




I cannot figure out a proper way to implement safe HA upsert. I will 
be very grateful if someone can help me.


Imagine we have primary server after failover. It is 
network-partitioned. We are doing INSERT ON CONFLICT DO NOTHING; 
that eventually timed out.


az1-grx88oegoy6mrv2i/db1 M > WITH new_doc AS (
    INSERT INTO t(pk, v, dt)
    VALUES (5, 'text', now())
    ON CONFLICT (pk) DO NOTHING
    RETURNING pk, v, dt)
SELECT new_doc.pk FROM new_doc;
^CCancel request sent
WARNING:  01000: canceling wait for synchronous replication due to 
user request
DETAIL:  The transaction has already committed locally, but might 
not have been replicated to the standby.

LOCATION:  SyncRepWaitForLSN, syncrep.c:264
Time: 2173.770 ms (00:02.174)

Here our driver decided that something goes wrong and we retry query.

az1-grx88oegoy6mrv2i/db1 M > WITH new_doc AS (
    INSERT INTO t(pk, v, dt)
    VALUES (5, 'text', now())
    ON CONFLICT (pk) DO NOTHING
    RETURNING pk, v, dt)
SELECT new_doc.pk FROM new_doc;
 pk 
----
(0 rows)

Time: 4.785 ms

Now we have split-brain, because we acknowledged that row to client.
How can I fix this?

There must be some obvious trick, but I cannot see it... Or maybe 
cancel of sync replication should be disallowed and termination 
should be treated as system failure?




I think the most appropriate way to handle such issues is for the client 
driver to catch such warnings (with the message about the local commit) and 
mark the status of the posted transaction as undetermined. If the connection 
with the sync replica comes back, this transaction eventually commits; but if 
autofailover triggers *without this commit having been replicated to the 
replica*, the commit is effectively aborted. Therefore the client has to 
wait some time (exceeding the duration of autofailover) and then check 
(logically, based on committed data) the status of the commit.


The problem here is that the locally committed data becomes visible to 
future transactions (before autofailover), which violates the property 
of consistent reads from the master. IMO the more correct behavior for 
PostgreSQL here would be to ignore any cancel/termination requests while 
the backend is waiting for a response from sync replicas.


However, there is another way to end up with locally applied commits: a 
restart of the master after initial recovery. This case is described in 
the doc 
https://www.postgresql.org/docs/current/warm-standby.html#SYNCHRONOUS-REPLICATION-HA 
. But here the HA orchestrator agent can close off access from external users 
(via pg_hba.conf manipulations) until the PostgreSQL instance synchronizes.


And this is where the unsafety lies: that assumes that the isolated 
master is in enough of a sane state to apply a self-ban (and that it 
can do so in near-zero time).



Although the retry logic in Andrey's case is probably not ideal (and 
you offered a more correct approach to synchronous commit), there are 
many "grey area" failure modes that in his scenario would prevent a 
given node from sealing itself off fast enough, if at all (eg: PID 
congestion causing fork()/system() to fail while backends are already 
up and happily flushing WAL).



This is particularly relevant to situations where only a subset of 
critical transactions set synchronous_commit to remote_*: it'd still 
be undesirable to sink "tier 2" data into a stale primary for any 
significant length of time.



Could you describe your thesis more concretely? In my proposal the 
self-ban on the master is applied after restarting it, so that changes from 
locally committed transactions are not visible to new incoming 
transactions.



In the case of postgres (or any RDBMS, really), all I can think of is 
either an inline proxy performing some validation as part of the 
forwarding (which is what we did internally but that has not been 
green lit for FOSS :( )



External validation unfortunately is not an option here. As I mentioned 
before, the local commits become visible to future transactions coming to 
the master, and even if some proxy reports to the client that the transaction 
is not completely committed, new incoming transactions that read the locally 
applied changes and make their own changes based on them implicitly confirm 
the status of those changes as committed.



or some logic in the backend that rejects asynchronous commits too if 
some condition is not met (eg:  synchronous standby nodes 
not present - a builtin version of the pg_stat_replication look-aside 
CTE I suggested earlier).



CTE with sub-query using pg_stat_replication is 

Re: SQL operator '*='

2019-12-23 Thread Alban Hertroys


> On 23 Dec 2019, at 15:33, Matthias Apitz  wrote:
> 
> #ifdef DBSPOS
>   EXEC SQL DECLARE land_cursor CURSOR FOR 
>   SELECT stammprio, lkz, landbez, plkz, postbez, karenz1, 
> karenz2, 
>   karenz3, land.wkz, webez, we, kurs, land.del
>   FROM   land

 LEFT JOIN devisen ON land.wkz = devisen.wkz AND land.brgroup = 
devisen.brgroup

>   WHERE  land.brgroup = :brgroupHost_for_helpland_cursor
>   ORDER  BY stammprio, landbez;
> #endif

Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.





Re: unanalyze a foreign table

2019-12-23 Thread Justin
I do not know of a way to undo an ANALYZE once it's committed, and I do not
know the danger of deleting an entry from pg_statistic.

What you can do in the future is make a copy of the statistics for this table,
analyze, and if that negatively affects results, put the copy back.

Another option is to do

begin ;
ANALYZE my_problem_table ;
explain select my_problem_query;
rollback ;

Foreign tables are not analyzed by default, so the statistics should have
been empty or absent, unless the table was previously analyzed.

https://www.postgresql.org/docs/current/sql-analyze.html

On Sun, Dec 22, 2019 at 2:22 PM Jeff Janes  wrote:

> I did a manual ANALYZE of a foreign table, to see if it would make a
> troublesome query better.  It did, but it also made some other queries that
> were previously fine to become far worse.  Is there a way to undo the
> analyze?  I can muck around in the catalogs like below, but seems really
> grotty.
>
> delete from pg_statistic where starelid=418911;
>
> The other option seems to be doing a `drop foreign table ... cascade`, but
> then recreating all the cascaded drops is quite annoying and error prone.
>
> I currently solved it by re-cloning my performance testing server from
> upstream, but that also is not very convenient.  Is directly manipulating
> the catalogs really the best way?
>
> Cheers,
>
> Jeff
>


Re: pgpool High Availability Issue

2019-12-23 Thread Bruce Momjian
The pgpool email lists are the right place to ask this question:

https://www.pgpool.net/mediawiki/index.php/Mailing_lists

---

On Fri, Nov 15, 2019 at 11:04:22AM -0800, a venkatesh wrote:
> Hi, 
> 
> I'm working on configuring high availability for pgpool using watchdog.
> Initially, I tried with two pgpool nodes (along with a pgmaster and pgslave).
> In this scenario, assuming pgpool node 1 was started first and became the
> leader. After sometime , the node got disconnected with  pgpool node 2 and
> pgpool node 2 as well declared itself as leader.                    
> 
> To handle this kind of scenario, I tried provisioning an additional pgpool 
> node
> and made a cluster with total 5 nodes (3 pgpool nodes, 1 pgmaster and 1
> pgslave), assuming it will create a quorum to handle such situations.
> Unfortunately, the situation still remains the same. (In case of any
> disconnection between node that became leader and the first stand by node, 
> both
> the nodes try to manage the pgmaster and slave simultaneously).
> 
> Please help me understand if this is expected behavior or some additional
> configurations are required to be made, so that two pgpool nodes don't become
> leader simultaneously.  If it's an expected behavior, how can we handle this 
> ? 
> 
> (A point to note is that I'm not using elastic IP address here, instead I have
> created a network load balancer in AWS, created a target group with all the
> three pgpool nodes as targets). 
> 
> Regards, 
> Venkatesh. 

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +




Re: pgpool High Availability Issue

2019-12-23 Thread Tatsuo Ishii
> I'm working on configuring high availability for pgpool using watchdog.
> Initially, I tried with two pgpool nodes (along with a pgmaster and
> pgslave). In this scenario, assuming pgpool node 1 was started first and
> became the leader. After sometime , the node got disconnected with  pgpool
> node 2 and pgpool node 2 as well declared itself as leader.
> 
> 
> To handle this kind of scenario, I tried provisioning an additional pgpool
> node and made a cluster with total 5 nodes (3 pgpool nodes, 1 pgmaster and
> 1 pgslave), assuming it will create a quorum to handle such situations.
> Unfortunately, the situation still remains the same. (In case of any
> disconnection between node that became leader and the first stand by node,
> both the nodes try to manage the pgmaster and slave simultaneously).
> 
> Please help me understand if this is expected behavior or some additional
> configurations are required to be made, so that two pgpool nodes don't
> become leader simultaneously.  If it's an expected behavior, how can we
> handle this ?

Definitely it's not expected behavior unless there's something
wrong with the Pgpool-II version or the Pgpool-II configuration.

To investigate the problem we need:

- exact Pgpool-II version
- pgpool.conf on all 3 pgpool nodes
- pgpool log when one of pgpool nodes went down

> (A point to note is that I'm not using elastic IP address here, instead I
> have created a network load balancer in AWS, created a target group with
> all the three pgpool nodes as targets).
> 
> Regards,
> Venkatesh.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp




Re: Date created for tables

2019-12-23 Thread Bruce Momjian
On Thu, Dec  5, 2019 at 05:10:20PM +, Chloe Dives wrote:
> Having moved to PostgreSQL from Oracle a few years ago I have been generally
> very impressed by Postgres, but there are a few things that I still miss. One
> of those is being able to see the created and last modified dates for database
> objects.
> 
>  
> 
> Is this something that has been considered for implementation?

I wrote a blog about this:

https://momjian.us/main/blogs/pgblog/2017.html#November_21_2017

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +




Re: Date created for tables

2019-12-23 Thread Ron

On 12/23/19 7:01 PM, Bruce Momjian wrote:

On Thu, Dec  5, 2019 at 05:10:20PM +, Chloe Dives wrote:

Having moved to PostgreSQL from Oracle a few years ago I have been generally
very impressed by Postgres, but there are a few things that I still miss. One
of those is being able to see the created and last modified dates for database
objects.

  


Is this something that has been considered for implementation?

I wrote a blog about this:

https://momjian.us/main/blogs/pgblog/2017.html#November_21_2017


You all are *grossly* over-complicating this.

By creation time, "we DBAs" mean the time we ran "CREATE object", not when 
pg_dump, pg_basebackup, or pg_upgrade ran.


Likewise, modification time is when we last ran an ALTER command, not 
when VACUUM ran (that's tracked elsewhere) or DML ran.


That's all.

--
Angular momentum makes the world go 'round.


Re: Date created for tables

2019-12-23 Thread Melvin Davidson
>You all are *grossly* over-complicating this.
Agree +1




-- 
*Melvin Davidson*
*Maj. Database & Exploration Specialist*
*Universe Exploration Command – UXC*
Employment by invitation only!