Re: Help troubleshooting SubtransControlLock problems

2018-03-06 Thread Rene Romero Benavides
For such issues, I find this view very useful (the first one):
https://wiki.postgresql.org/wiki/Lock_Monitoring

Examine the blocking_pid's and tell us what kind of operation is blocking
the other processes. Also, are there many long-running transactions on
your server?
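
For the lock side of things, something along these lines helps map who blocks whom
(a minimal sketch in the spirit of the wiki's first query; it needs 9.6+ for
pg_blocking_pids, and it only covers regular lock waits, so LWLock waits such as
SubtransControlLock won't appear directly, but it helps spot the sessions behind them):

SELECT blocked.pid    AS blocked_pid,
       blocked.query  AS blocked_query,
       blocking.pid   AS blocking_pid,
       blocking.query AS blocking_query,
       blocking.state AS blocking_state
FROM pg_stat_activity AS blocked
JOIN pg_stat_activity AS blocking
  ON blocking.pid = ANY (pg_blocking_pids(blocked.pid));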


2018-03-06 21:24 GMT-06:00 Scott Frazer :

> Hi, we have a Postgres 9.6 setup using replication that has recently
> started seeing a lot of processes stuck in "SubtransControlLock" as a
> wait_event on the read-replicas. Like this, only usually about 300-800 of
> them:
>
>
>  179706 | LWLockNamed | SubtransControlLock
>
>  186602 | LWLockNamed | SubtransControlLock
>
>  186606 | LWLockNamed | SubtransControlLock
>
>  180947 | LWLockNamed | SubtransControlLock
>
>  186621 | LWLockNamed | SubtransControlLock
>
> The server then begins to crawl, with some queries just never finishing
> until I finally shut the server down.
>
> Searching for that particular combo of wait_event_type and wait_event only
> seems to turn up the page about statistics collection, but no helpful
> information on troubleshooting this lock.
>
> Restarting the replica server clears the locks and allows us to start
> working again, but it's happened twice now in 12 hours and I'm worried it
> will happen again.
>
> Does anyone have any advice on where to start looking?
>
> Thanks,
> Scott
>
>


-- 
El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison
http://pglearn.blogspot.mx/


Re: dblink: give search_path

2018-04-11 Thread Rene Romero Benavides
What about setting the search path at the user level?
ALTER ROLE act SET search_path = act,logger;
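
If you'd rather not change the role default, you can also set it per dblink
connection through the libpq "options" parameter. A minimal sketch
(some_logger_function is a hypothetical function living in schema logger):

SELECT *
FROM dblink('dbname=act options=-csearch_path=logger',
            'SELECT some_logger_function()') AS t(result text);

Alternatively, with a named connection you can run SET search_path on the remote
session via dblink_exec before calling the function.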

Best.

2018-04-11 1:44 GMT-05:00 Thiemo Kellner :

> Hi all
>
> I try to execute a function that is not in the schema I connect to with dblink. Is
> there a way to tell dblink to set search_path in a specific way? I have not
> found a solution in the documentation. I tried with the set search_path
> definition in the function declarations, to no avail.
>
> Function Schema: logger
> Database: act
> User: act
> User Default Schema: act
>
> Kind regards
>
> Thiemo
>
> --
> Öffentlicher PGP-Schlüssel: http://pgp.mit.edu/pks/lookup?op=get&search=0x8F70EFD2D972CBEF
>
> 
> This message was sent using IMP, the Internet Messaging Program.
>
>


-- 
El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison
http://pglearn.blogspot.mx/


Re: pgq is one of the most underrated pg related stuff

2019-09-27 Thread Rene Romero Benavides
pgq is great; I use it to ship particular records to Kafka, but I'm
planning on switching to Debezium to take advantage of WAL logical decoding
with wal2json.
Cheers.

On Wed, Sep 25, 2019 at 5:18 AM Миша Тюрин  wrote:

>
> Ok! How are you doing on (with)without pgq?
>
> // another underrated was multicorn. Is it still alive?
>
> Thank you! And sorry for provocative way.
> — Misha
>
>

-- 
El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison
http://pglearn.blogspot.mx/


Re: Pg_auto_failover

2019-09-30 Thread Rene Romero Benavides
It seems it does; only 10 and 11 are supported as of this:
"We support Postgres 10 and Postgres 11. "
https://cloudblogs.microsoft.com/opensource/2019/05/06/introducing-pg_auto_failover-postgresql-open-source-extension-automated-failover-high-availability/

On Sat, Sep 28, 2019 at 8:44 AM Sonam Sharma  wrote:

> Thanks Paul, I was able to set it up. Does pg_auto_failover only work on v10
> and v11?
> I am running 9.5 in production.
>
> On Wed, Sep 25, 2019, 11:40 PM Paul Jungwirth 
> wrote:
>
>> On 9/25/19 8:55 AM, Sonam Sharma wrote:
>> > My user has permission to touch inside the Fs.
>> > I am not getting why it is checking for /backup.
>> >
>> > Any suggestions on this , what else I can check
>>
>> In addition to the code already linked to, I would look at
>>
>> https://github.com/citusdata/pg_auto_failover/blob/1290edd0bab54c627f577cf4462bd16a56b20a1a/src/bin/pg_autoctl/pgctl.c#L429-L437
>>
>> You see that pg_auto_failover is trying to `mkdir -p $pgdata/../backup`
>> (where pgdata is a C variable, not a shell/environment variable). You
>> might want to read through the rest of that code to see where pgdata is
>> coming from, and see if it's being set correctly. Do you really want
>> /backup at the root level?
>>
>> Clearly whatever user is running pg_basebackup doesn't have permission
>> to make that directory. But maybe that's okay and the directory should
>> be somewhere else.
>>
>> --
>> Paul  ~{:-)
>> p...@illuminatedcomputing.com
>>
>>
>>

-- 
El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison
http://pglearn.blogspot.mx/


Re: Tuple concurrency issue in large objects

2019-12-09 Thread Rene Romero Benavides
Hi Shalini. The usual diagnostic info starts with your PostgreSQL server
version, both major and minor: in 12.1, for example, the major version is 12
and the minor (patch) version is 1.


On Fri, Dec 6, 2019 at 9:26 AM Shalini  wrote:

> Hi all,
>
> I am working on a project which allows multiple users to work on single
> large text document. I am using lo_put to apply only the diff into the
> large object without replacing it with a new lob. While working on it, I
> encountered an error "Tuple concurrently updated".
> The error can be reproduced with two psql clients.
>
> Setup:
>
> mydb=# create table text_docs(id serial primary key, data oid);
> CREATE TABLE
> mydb=# insert into text_docs(data) select lo_import('./upload.txt');
> INSERT 0 1
> mydb=# select * from text_docs;
>  id |  data
> ----+---------
>   1 | 5810130
> (1 row)
>
> Now, if we open two psql clients and execute the following commands:
>
> Client 1:
>
> mydb=# begin;
> BEGIN
> mydb=# select lo_put(5810130, 10, '\xaa');
> UPDATE 1
>
> Client 2:
>
> mydb=# select lo_put(5810130, 10, '\xaa');
>
> Client 1:
> mydb=# commit;
> COMMIT
>
> Client 2:
> mydb=# select lo_put(5810130, 10, '\xaa');
> ERROR:  tuple concurrently updated
>
> Is there a workaround to this concurrency issue without creating a new
> large object?
>
> Regards
> Shalini
>
>
>
>
>

-- 
El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison
http://pglearn.blogspot.mx/


Re: Postgresql Data corruption

2020-01-14 Thread Rene Romero Benavides
On Tue, Jan 14, 2020 at 2:55 PM Tulqin Navruzov <
tulqin.navru...@finnetlimited.com> wrote:

>
>
> Hi team
>
> Could you help us to sort this out please?
>
> We had some hardware problems in the datacenter and could not use
> postgresql from the restored snapshots; it showed "Structure needs
> cleaning" during startup. On CentOS 7 we ran xfs_repair and postgresql
> started successfully, but we could not select data from it; it shows the
> message below.
>
> This is what happens when we want to log in to the db with the postgres user:
> psql: PANIC:  could not open critical system index 2662
>
> but we can log in with another user; trying to select from some tables
> shows this message:
>
> ERROR:  catalog is missing 11 attribute(s) for relid 113971
>
> Could you help us solve this problem, or could you give us a contact who
> can help with this?
>
> Thanks in advance.
>

First of all, please attend to these instructions:
https://wiki.postgresql.org/wiki/Corruption

From what you describe, it looks like you need to restore from backups, or
promote a slave to master if you have such a setup and the slave(s) were
not affected by these hardware problems.

-- 
El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison
http://pglearn.blogspot.mx/


Re: Postgresql Data corruption

2020-01-14 Thread Rene Romero Benavides
Can't help you personally right now due to work, but try these guys:
https://pgexperts.com/services/emergency_help/

On Tue, Jan 14, 2020 at 4:08 PM Rene Romero Benavides <
rene.romer...@gmail.com> wrote:

> On Tue, Jan 14, 2020 at 2:55 PM Tulqin Navruzov <
> tulqin.navru...@finnetlimited.com> wrote:
>
>>
>>
>> Hi team
>>
>> Could you help us to sort this out please?
>>
>> We had some hardware problems with Datacenter and could not using
>> postgresql from restored snapshots it showing "Structure needs to be
>> cleaning " during startup
>> , on centos7 did xfs_repair and postgresql started successfully. But
>> could not select datas from it . showing below message .
>>
>> This is when want to login to db with postgres user .
>> psql: PANIC:  could not open critical system index 2662
>>
>> but can logged in with another user and trying to select from some tables
>> , showing this message :
>>
>> ERROR:  catalog is missing 11 attribute(s) for relid 113971
>>
>> Could you help us to solve this problem? or could you give contact who
>> can help with this?
>>
>> Thanks in advance.
>>
>
> First of all, please attend to these instructions:
> https://wiki.postgresql.org/wiki/Corruption
>
> from what you describe looks like you need to restore from backups or
> promote a slave to master if you have such a setup and the slave(s) were
> not affected by these hardware problems.
>
> --
> El genio es 1% inspiración y 99% transpiración.
> Thomas Alva Edison
> http://pglearn.blogspot.mx/
>
>

-- 
El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison
http://pglearn.blogspot.mx/


Re: Log Unique Queries without Params?

2020-04-12 Thread Rene Romero Benavides
Also try pg_stat_statements; it does that on the fly and will give you very
interesting execution stats too.
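
A minimal sketch of the pg_stat_statements route (it ships as a contrib extension;
shared_preload_libraries = 'pg_stat_statements' must be set in postgresql.conf and
the server restarted first):

CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- normalized query text plus execution stats, most expensive first
-- (the column is total_time up to PG 12, total_exec_time from PG 13 on)
SELECT query, calls, total_time, rows
FROM pg_stat_statements
ORDER BY total_time DESC
LIMIT 20;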

On Mon, Apr 13, 2020 at 12:37 AM Chris Morris 
wrote:

> Thx!
>
> On Sat, Apr 11, 2020 at 11:55 PM Julien Rouhaud 
> wrote:
>
>> On Sun, Apr 12, 2020 at 6:51 AM Chris Morris 
>> wrote:
>> >
>> > I have a local script I've written that will scan a log of PG queries
>> to extract out unique queries without any specific parameter data. For
>> example, if these 2 queries are actually run:
>> >
>> > SELECT * FROM foo where bar = 1;
>> > SELECT * FROM foo where bar = 2;
>> >
>> > It will capture only:
>> >
> >> > SELECT * FROM foo where bar = :id;
>> >
>> > Are there any existing tools that do this already for me? I'm
>> considering setting up a server that can have logs forwarded to it and only
>> logging unique queries like this, but I don't want to build anything that
>> may already exist out there.
>>
>> pgbadger (http://pgbadger.darold.net/#about) will do that and much
>> more.  Depending on what you want to achieve maybe pg_stat_statements
>> (https://www.postgresql.org/docs/current/pgstatstatements.html) is
>> also an alternative.
>>
>

-- 
El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison
http://pglearn.blogspot.mx/


Re: Suggestion to Monitoring Tool

2020-05-31 Thread Rene Romero Benavides
Give pgwatch2 a try:
https://pgwatch.com/

On Wed, May 27, 2020 at 11:46 AM postgann2020 s 
wrote:

> Hi Team,
>
> Thanks for your support.
>
> Environment Details:
> OS: RHEL 7.2
> Postgres: 9.5.15
> Master-Slave with Streaming replication
>
> We are planning to implement the monitoring tool for our environment.
>
> Could someone please suggest the Monitoring Tool based on your experience.
>
> We are looking to cover the below areas.
>
> 1. Monitoring metrics and alerting.
> 2. Monitoring events and alerting.
> 3. Consolidate all the PROD DB logs and provide insights on log data.
> 4. logging explain plan and insights on explain plans. (Something like
> store explain plan and compare plans and send alerts on deviations)
> 5. Logging audit data and insights from audit data.
>
>
> Thanks & Regards,
> Postgann.
>


-- 
El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison
http://pglearn.blogspot.mx/


gdal, proj and other postgis dependencies missing in postgres repos

2020-06-07 Thread Rene Romero Benavides
Hi everybody, do you know what happened to gdal and other PostGIS
dependencies like proj in the official Postgres repos?
They appear to be missing from these repos:

https://yum.postgresql.org/12/redhat/rhel-7-x86_64/
https://yum.postgresql.org/11/redhat/rhel-7-x86_64/
https://yum.postgresql.org/10/redhat/rhel-7-x86_64/

They used to be there, right? Thank you.
-- 
El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison


Re: gdal, proj and other postgis dependencies missing in postgres repos

2020-06-07 Thread Rene Romero Benavides
On Sun, Jun 7, 2020 at 5:37 PM Rene Romero Benavides <
rene.romer...@gmail.com> wrote:

> Hi everybody, do you know what happened to gdal and other postgis
> dependencies like proj in the official postgres repos?
> they appear to be missing in these repos
>
> https://yum.postgresql.org/12/redhat/rhel-7-x86_64/
> https://yum.postgresql.org/11/redhat/rhel-7-x86_64/
> https://yum.postgresql.org/10/redhat/rhel-7-x86_64/
>
> they used to be there, right? Thank you.
> --
> El genio es 1% inspiración y 99% transpiración.
> Thomas Alva Edison
>
>
>
Btw, they still exist for 9.4
https://yum.postgresql.org/9.4/redhat/rhel-7-x86_64/

Thanks.
-- 
El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison


Re: gdal, proj and other postgis dependencies missing in postgres repos

2020-06-07 Thread Rene Romero Benavides
Thank you very much David.

On Sun, Jun 7, 2020 at 5:48 PM David G. Johnston 
wrote:

> On Sunday, June 7, 2020, Rene Romero Benavides 
> wrote:
>
>> On Sun, Jun 7, 2020 at 5:37 PM Rene Romero Benavides <
>> rene.romer...@gmail.com> wrote:
>>
>>> Hi everybody, do you know what happened to gdal and other postgis
>>> dependencies like proj in the official postgres repos?
>>> they appear to be missing in these repos
>>>
>>> https://yum.postgresql.org/12/redhat/rhel-7-x86_64/
>>> https://yum.postgresql.org/11/redhat/rhel-7-x86_64/
>>> https://yum.postgresql.org/10/redhat/rhel-7-x86_64/
>>>
>>> they used to be there, right? Thank you.
>>>
>>
>  https://yum.postgresql.org/news-newreporpmsreleased.php
>
> David J.
>


-- 
El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison
http://pglearn.blogspot.mx/


Re: ansible modules for postgresql installation/config

2020-06-12 Thread Rene Romero Benavides
I've had good experiences working with the roles published by geerlingguy;
he's also written several Ansible books:
https://galaxy.ansible.com/geerlingguy/postgresql

On Thu, Jun 11, 2020 at 1:28 PM Chris Stephens 
wrote:

> it looks like there are a number of roles available through ansible galaxy
> that support initial postgresql setup and configuration.
>
> i'm wondering if there are any that are more popular/capable that i should
> consider vs just picking a few and evaluating those.
>
> does anyone have any recommendations?
>


-- 
El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison
http://pglearn.blogspot.mx/


PostgreSQL 10.12. background writer not cleaning buffers, small working set, big shared_buffers

2020-08-13 Thread Rene Romero Benavides
Hello Postgres community.
In a few words, I'm trying to make sense of these stats from the background
writer, where you can see buffers_clean is 0 and buffers_backend = 44849371.
I would like it to be the other way around; that's more efficient, right?

-[ RECORD 1 ]-+--
checkpoints_timed | 97504
checkpoints_req   | 4
checkpoint_write_time | 21171407908
checkpoint_sync_time  | 684641
buffers_checkpoint| 273934008
buffers_clean | 0
maxwritten_clean  | 0
buffers_backend   | 44849371
buffers_backend_fsync | 0
buffers_alloc | 44744713
stats_reset   | 2019-09-10 08:42:34.490899-07
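
For reference, the same counters expressed as a share of all buffer writes
(a quick sketch against pg_stat_bgwriter):

SELECT buffers_checkpoint, buffers_clean, buffers_backend,
       round(100.0 * buffers_backend /
             nullif(buffers_checkpoint + buffers_clean + buffers_backend, 0), 1)
         AS backend_write_pct
FROM pg_stat_bgwriter;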

Using default settings:
postgres=# select name, setting, unit from pg_settings where name like '%bgwriter%';
          name           | setting | unit
-------------------------+---------+------
 bgwriter_delay          | 200     | ms
 bgwriter_flush_after    | 64      | 8kB
 bgwriter_lru_maxpages   | 100     |
 bgwriter_lru_multiplier | 2       |

shared_buffers = '6GB'
and the VM has a RAM of 24GB

The working set is about 160MB. I know shared_buffers is oversized for
this DB, and I know that the bgwriter is only supposed to do work when the
working set doesn't fit into shared_buffers, but backends cleaning their
own buffers produces undesirable sync waits, right?
Any advice?

Thank you.


-- 
El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison


Re: PostgreSQL 10.12. background writer not cleaning buffers, small working set, big shared_buffers

2020-08-14 Thread Rene Romero Benavides
Btw, these are its wal / checkpoint related settings:

activemq1=# select name,setting,unit from pg_settings where category =
'Write-Ahead Log / Checkpoints';
             name             | setting | unit
------------------------------+---------+------
 checkpoint_completion_target | 0.9     |
 checkpoint_flush_after       | 32      | 8kB
 checkpoint_timeout           | 300     | s
 checkpoint_warning           | 30      | s
 max_wal_size                 | 4096    | MB
 min_wal_size                 | 2048    | MB

Thank you.

On Fri, Aug 14, 2020 at 1:39 AM Rene Romero Benavides <
rene.romer...@gmail.com> wrote:

> Hello Postgres community.
> In a few words I'm figuring out this stats from the background writer
> where you can see buffers_clean is 0 and buffers_backend = 44849371, I
> would like it to be the other way around, that's more efficient, right?
>
> -[ RECORD 1 ]-+--
> checkpoints_timed | 97504
> checkpoints_req   | 4
> checkpoint_write_time | 21171407908
> checkpoint_sync_time  | 684641
> buffers_checkpoint| 273934008
> buffers_clean | 0
> maxwritten_clean  | 0
> buffers_backend   | 44849371
> buffers_backend_fsync | 0
> buffers_alloc | 44744713
> stats_reset   | 2019-09-10 08:42:34.490899-07
>
> Using default settings
> postgres=# select name,setting,unit from pg_settings where name like
> '%bgwriter%'
> ;
>   name   | setting | unit
> -+-+--
>  bgwriter_delay  | 200 | ms
>  bgwriter_flush_after| 64  | 8kB
>  bgwriter_lru_maxpages   | 100 |
>  bgwriter_lru_multiplier | 2   |
>
> shared_buffers = '6GB'
> and the VM has a RAM of 24GB
>
> The working set is about 160MB, I know, shared buffers are oversized for
> this DB, and I know that the bgwriter is supposed to do work when the
> working set doesn't fit into shared_buffers, but backends cleaning their
> buffers produces undesirable sync waits, right ?
> Any advice ?
>
> Thank you.
>
>
> --
> El genio es 1% inspiración y 99% transpiración.
> Thomas Alva Edison
>
>
>

-- 
El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison
http://pglearn.blogspot.mx/


Re: PostgreSQL 10.12. background writer not cleaning buffers, small working set, big shared_buffers

2020-08-14 Thread Rene Romero Benavides
Any ideas on this, guys?

Thank you.

On Fri, Aug 14, 2020 at 1:39 AM Rene Romero Benavides <
rene.romer...@gmail.com> wrote:

> Hello Postgres community.
> In a few words I'm figuring out this stats from the background writer
> where you can see buffers_clean is 0 and buffers_backend = 44849371, I
> would like it to be the other way around, that's more efficient, right?
>
> -[ RECORD 1 ]-+--
> checkpoints_timed | 97504
> checkpoints_req   | 4
> checkpoint_write_time | 21171407908
> checkpoint_sync_time  | 684641
> buffers_checkpoint| 273934008
> buffers_clean | 0
> maxwritten_clean  | 0
> buffers_backend   | 44849371
> buffers_backend_fsync | 0
> buffers_alloc | 44744713
> stats_reset   | 2019-09-10 08:42:34.490899-07
>
> Using default settings
> postgres=# select name,setting,unit from pg_settings where name like
> '%bgwriter%'
> ;
>   name   | setting | unit
> -+-+--
>  bgwriter_delay  | 200 | ms
>  bgwriter_flush_after| 64  | 8kB
>  bgwriter_lru_maxpages   | 100 |
>  bgwriter_lru_multiplier | 2   |
>
> shared_buffers = '6GB'
> and the VM has a RAM of 24GB
>
> The working set is about 160MB, I know, shared buffers are oversized for
> this DB, and I know that the bgwriter is supposed to do work when the
> working set doesn't fit into shared_buffers, but backends cleaning their
> buffers produces undesirable sync waits, right ?
> Any advice ?
>
> Thank you.
>
>
> --
> El genio es 1% inspiración y 99% transpiración.
> Thomas Alva Edison
>
>
>

-- 
El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison
http://pglearn.blogspot.mx/


Re: replication lag despite corrective config

2018-11-19 Thread Rene Romero Benavides
Not sure about the root cause, but I can make these observations and raise
some questions:
1) 9.6.6 is five bug-fix versions behind.
2) 300GB is a pretty big table; wouldn't it make sense to partition it?
2a) or, if it's already partitioned, doesn't the time of creating or dropping
partitions match the time of the conflict?
3) Can you track long-running transactions on the master?
4) What are the isolation levels on master / replica?
5) Check for active locks in the replica; I guess you should see some
blocked transactions during big delays. I've seen this in the past when
hot_standby_feedback is turned off.
6) Any out-of-the-ordinary messages in the replica's logs? Any evidence
that it has been canceling statements? (See the query sketch right after
this list.)
7) Are master and replica exactly the same in terms of resources and main
parameters?
8) How is performance on both nodes while the big delay is happening? IO /
CPU load / etc.
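
For point 6, a quick way to check whether the replica has been canceling queries
because of recovery conflicts (a minimal sketch, run on the standby):

SELECT datname, confl_tablespace, confl_lock, confl_snapshot,
       confl_bufferpin, confl_deadlock
FROM pg_stat_database_conflicts
WHERE datname = current_database();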

Also, check this out:
https://www.alibabacloud.com/forum/read-383


Am Mo., 19. Nov. 2018 um 21:46 Uhr schrieb Wyatt Alt :

> Sorry, I see now there was a similar question a few days ago:
>
> https://www.postgresql.org/message-id/CAJw4d1WtzOdYzd8Nq2=ufk+z0jy0l_pfg9tvcwprmt3nczq...@mail.gmail.com
>
> Two ideas proposed (aside from disconnects):
> * Autovacuum is truncating a page on the master and taking an
> AccessExclusiveLock on the table in use on the replica
> * A "pin conflict", which I'm still unfamiliar with.
>
> The user's response says they are in the first bucket, but the argument
> relies on max_standby_streaming_delay set to -1, while mine is 5 minutes. I
> need to understand pin conflicts better, but the likely scenario Andrew
> outlined doesn't apply to me. My offending queries were doing bitmap heap
> scans on a 300GB table.
>
> Reading the thread I see Andres ask for the "precise conflict" the user
> gets -- is there a way I can get that without a datadir? And to re-frame
> the original question, are there causes of replication lag that
> max_standby_streaming_delay would not be expected to prevent, that would be
> resolved by killing long standby queries? If so, what's the best way to
> confirm?
>
> Wyatt
>
> On Mon, Nov 19, 2018 at 5:46 PM Wyatt Alt  wrote:
>
>> I've been struggling to eliminate replication lag on a Postgres 9.6.6
>> instance on Amazon RDS. I believe the lag is caused by early cleanup
>> conflicts from vacuums on the master, because I can reliably resolve it by
>> killing long-running queries on the standby. I most recently saw ten hours
>> of lag on Saturday and addressed it this way.
>>
>> The standby is running with
>> hot_standby_feedback = on
>> max_standby_streaming_delay = 5min
>> max_standby_archive_delay = 30s
>>
>> I am not using replication slots on the primary due to reported negative
>> interactions with pg_repack on large tables.
>>
>> My rationale for the first two settings is that hot_standby_feedback
>> should address my issues almost all the time, but that
>> max_standby_streaming_delay would sometimes be necessary as a fallback, for
>> instance in cases of a transient connection loss between the standby and
>> primary. I believe these settings are mostly working, because lag is less
>> frequent than it was when I configured them.
>>
>> My questions are,
>> * Am I overlooking anything in my configuration?
>> * What would explain lag caused by query conflicts given the
>> max_standby_streaming_delay setting? Shouldn't those queries be getting
>> killed?
>> * Is there any particular diagnostic info I should be collecting on the
>> next occurrence, to help me figure out the cause? Note that as I'm on RDS,
>> I don't have direct access to the datadir -- just psql.
>>
>> Thanks for any advice!
>> Wyatt
>>
>

-- 
El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison
http://pglearn.blogspot.mx/


Re: replication lag despite corrective config

2018-11-21 Thread Rene Romero Benavides
You're welcome.
Since last Saturday, when you addressed the 10-hour delay with the new
settings, have you seen more such delay incidents? What were the previous
settings?
Beware that hot_standby_feedback = on combined with such long queries on the
replica can increase bloat on the master. Are you measuring bloat? If so, do
you notice a significant increase?
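
As a cheap first indicator of bloat (a rough sketch; pgstattuple or the queries
shipped with check_postgres give more precise numbers), run this on the master:

SELECT relname, n_live_tup, n_dead_tup,
       round(100.0 * n_dead_tup / nullif(n_live_tup + n_dead_tup, 0), 1) AS dead_pct
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;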

Am Di., 20. Nov. 2018 um 23:08 Uhr schrieb Wyatt Alt :

> Hi Rene,
> On 11/19/18 8:46 PM, Rene Romero Benavides wrote:
>
> Not sure about the root cause but I can make these observations and raise
> some questions:
> 1) 9.6.6 is five bug fix versions behind
>
> Valid point to raise.
>
> 2) 300GB is so big a table, wouldn't make sense to you to partition it ?
> 2a) or if it's partitioned, doesn't the time of creation or dropping of
> new partitions match the time of the conflict?
>
> Partitioning is in the works, but none at the moment.
>
>
> 3) can you track long running transactions on the master?
> 4) what are the isolation levels on master / replica?
>
> Transaction times on the master max out around two minutes. On the replica
> they are much longer -- numerous 1 - 2 hour transactions per day, and
> occasional ones as long as 10 - 20 hours. Isolation levels are read
> committed everywhere.
>
> 5) check for active locks in the replica, I guess you should see some
> blocked transactions during big delays, I've seen this in the past when
> standby_feedback is turned off.
> 6) any out of the ordinary messages in the replica's logs? any evidence
> that has been canceling statements?
>
> I'll make a note to record the active locks next time. I haven't seen
> anything unusual in the logs during these incidents, but have observed
> statements getting canceled at other times, which is why I think the config
> mostly works.
>
> 7) are master and replica exactly the same in terms of resources and main
> parameters?
> 8) how is performance in both nodes while the big delay is happening? IO /
> cpu load / etc.
>
> This brings up a good detail I forgot to mention originally. During the
> last incident, IO utilization on the replica was near 100%, and had been
> for several hours, which I believe was due to the long queries I canceled.
> Now that I think about it, I wonder if the lag may have arisen from IO
> contention between the query and WAL replay, rather than a query conflict.
>
>
> Also, check this out:
> https://www.alibabacloud.com/forum/read-383
>
> Thanks, interesting reading.
>
>
> Am Mo., 19. Nov. 2018 um 21:46 Uhr schrieb Wyatt Alt  >:
>
>> Sorry, I see now there was a similar question a few days ago:
>>
>> https://www.postgresql.org/message-id/CAJw4d1WtzOdYzd8Nq2=ufk+z0jy0l_pfg9tvcwprmt3nczq...@mail.gmail.com
>>
>> Two ideas proposed (aside from disconnects):
>> * Autovacuum is truncating a page on the master and taking an
>> AccessExclusiveLock on the table in use on the replica
>> * A "pin conflict", which I'm still unfamiliar with.
>>
>> The user's response says they are in the first bucket, but the argument
>> relies on max_standby_streaming_delay set to -1, while mine is 5 minutes. I
>> need to understand pin conflicts better, but the likely scenario Andrew
>> outlined doesn't apply to me. My offending queries were doing bitmap heap
>> scans on a 300GB table.
>>
>> Reading the thread I see Andres ask for the "precise conflict" the user
>> gets -- is there a way I can get that without a datadir? And to re-frame
>> the original question, are there causes of replication lag that
>> max_standby_streaming_delay would not be expected to prevent, that would be
>> resolved by killing long standby queries? If so, what's the best way to
>> confirm?
>>
>> Wyatt
>>
>> On Mon, Nov 19, 2018 at 5:46 PM Wyatt Alt  wrote:
>>
>>> I've been struggling to eliminate replication lag on a Postgres 9.6.6
>>> instance on Amazon RDS. I believe the lag is caused by early cleanup
>>> conflicts from vacuums on the master, because I can reliably resolve it by
>>> killing long-running queries on the standby. I most recently saw ten hours
>>> of lag on Saturday and addressed it this way.
>>>
>>> The standby is running with
>>> hot_standby_feedback = on
>>> max_standby_streaming_delay = 5min
>>> max_standby_archive_delay = 30s
>>>
>>> I am not using replication slots on the primary due to reported negative
>>> interactions with pg_repack on large tables.
>>>
>>> My rationale for the first two settings is that hot_standby_feedback
>>> shoul

Re: replication lag despite corrective config

2018-11-21 Thread Rene Romero Benavides
How big have the delays been after the new settings? I guess significantly
lower than before, right? How much have they decreased?

Am Mi., 21. Nov. 2018 um 13:18 Uhr schrieb Rene Romero Benavides <
rene.romer...@gmail.com>:

> You're welcome.
> Since last Saturday when you addressed the 10 hour delay, with the new
> settings, have you seen more of such delay incidents? what the previous
> settings were?
> Beware that hot_standby_feedback = on and such long queries in the replica
> can increase bloat in the master, are you measuring bloat? if so, do you
> notice a significant increase?
>
> Am Di., 20. Nov. 2018 um 23:08 Uhr schrieb Wyatt Alt  >:
>
>> Hi Rene,
>> On 11/19/18 8:46 PM, Rene Romero Benavides wrote:
>>
>> Not sure about the root cause but I can make these observations and raise
>> some questions:
>> 1) 9.6.6 is five bug fix versions behind
>>
>> Valid point to raise.
>>
>> 2) 300GB is so big a table, wouldn't make sense to you to partition it ?
>> 2a) or if it's partitioned, doesn't the time of creation or dropping of
>> new partitions match the time of the conflict?
>>
>> Partitioning is in the works, but none at the moment.
>>
>>
>> 3) can you track long running transactions on the master?
>> 4) what are the isolation levels on master / replica?
>>
>> Transaction times on the master max out around two minutes. On the
>> replica they are much longer -- numerous 1 - 2 hour transactions per day,
>> and occasional ones as long as 10 - 20 hours. Isolation levels are read
>> committed everywhere.
>>
>> 5) check for active locks in the replica, I guess you should see some
>> blocked transactions during big delays, I've seen this in the past when
>> standby_feedback is turned off.
>> 6) any out of the ordinary messages in the replica's logs? any evidence
>> that has been canceling statements?
>>
>> I'll make a note to record the active locks next time. I haven't seen
>> anything unusual in the logs during these incidents, but have observed
>> statements getting canceled at other times, which is why I think the config
>> mostly works.
>>
>> 7) are master and replica exactly the same in terms of resources and main
>> parameters?
>> 8) how is performance in both nodes while the big delay is happening? IO
>> / cpu load / etc.
>>
>> This brings up a good detail I forgot to mention originally. During the
>> last incident, IO utilization on the replica was near 100%, and had been
>> for several hours, which I believe was due to the long queries I canceled.
>> Now that I think about it, I wonder if the lag may have arisen from IO
>> contention between the query and WAL replay, rather than a query conflict.
>>
>>
>> Also, check this out:
>> https://www.alibabacloud.com/forum/read-383
>>
>> Thanks, interesting reading.
>>
>>
>> Am Mo., 19. Nov. 2018 um 21:46 Uhr schrieb Wyatt Alt > >:
>>
>>> Sorry, I see now there was a similar question a few days ago:
>>>
>>> https://www.postgresql.org/message-id/CAJw4d1WtzOdYzd8Nq2=ufk+z0jy0l_pfg9tvcwprmt3nczq...@mail.gmail.com
>>>
>>> Two ideas proposed (aside from disconnects):
>>> * Autovacuum is truncating a page on the master and taking an
>>> AccessExclusiveLock on the table in use on the replica
>>> * A "pin conflict", which I'm still unfamiliar with.
>>>
>>> The user's response says they are in the first bucket, but the argument
>>> relies on max_standby_streaming_delay set to -1, while mine is 5 minutes. I
>>> need to understand pin conflicts better, but the likely scenario Andrew
>>> outlined doesn't apply to me. My offending queries were doing bitmap heap
>>> scans on a 300GB table.
>>>
>>> Reading the thread I see Andres ask for the "precise conflict" the user
>>> gets -- is there a way I can get that without a datadir? And to re-frame
>>> the original question, are there causes of replication lag that
>>> max_standby_streaming_delay would not be expected to prevent, that would be
>>> resolved by killing long standby queries? If so, what's the best way to
>>> confirm?
>>>
>>> Wyatt
>>>
>>> On Mon, Nov 19, 2018 at 5:46 PM Wyatt Alt  wrote:
>>>
>>>> I've been struggling to eliminate replication lag on a Postgres 9.6.6
>>>> instance on Amazon RDS. I believe the lag is caused by early cleanup
>>>> conflicts from vacuums on the master, because I can 

Re: Vacuum and Materialized view refresh slow

2018-11-30 Thread Rene Romero Benavides
Have you benchmarked the new infrastructure's IO and network performance?
Why did you switch providers? Was it because of cost?


Re: Unused indexes

2018-12-01 Thread Rene Romero Benavides
I think your assumption is correct, as long as the statistics collector is
working correctly (I've never seen that not be the case) and the setting
"track_counts" is set to on.
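
A minimal sketch of the kind of query that lists candidates (the counters
accumulate since the last stats reset, so let them cover a full business cycle
before trusting them):

SELECT schemaname, relname, indexrelname,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
       idx_scan, idx_tup_read, idx_tup_fetch
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;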

Am Sa., 1. Dez. 2018 um 05:24 Uhr schrieb Ron :

> For indexes that existed before the cluster was last started, and for
> which
> REINDEX hasn't been run on them since the cluster was last started... is
> it
> valid to say that an index has not been used since the cluster was started
> if these three pg_stat_all_indexes fields all have a value of 0?
>
> idx_scan
> idx_tup_read
> idx_tup_fetch
>
> If it matters, the version is 9.6.6.
>
> Thanks
>
> --
> Angular momentum makes the world go 'round.
>
>

-- 
El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison
http://pglearn.blogspot.mx/


Re: postgis after pg_upgrade

2018-12-03 Thread Rene Romero Benavides
Are the PostGIS functions usable? Do they currently work? Any error
messages in the database server logs? Have you installed the PostGIS
binaries for Postgres 9.6? If you follow the advice given by Paul Ramsey,
what happens?
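
If the 9.6 binaries are in place, the SQL side of Paul's advice would look roughly
like this (a sketch; the exact target version depends on the upgrade paths your
installed scripts offer, so check those first):

SELECT * FROM pg_extension_update_paths('postgis') WHERE path IS NOT NULL;
-- then, for a same-version rebuild of the procs, something like:
-- ALTER EXTENSION postgis UPDATE TO '2.4.5next';
-- ALTER EXTENSION postgis UPDATE TO '2.4.5';
SELECT postgis_full_version();  -- the "procs need upgrade" note should be gone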

Am So., 2. Dez. 2018 um 14:24 Uhr schrieb Slavcho Trnkovski <
strnkov...@gmail.com>:

> Hi,
>
> This will not resolve the issue I have because the extension is already at
> the latest version, but it is using postgres 9.4 and it should use 9.6.
>
> Regards,
> Slavcho
>
> On Fri, Nov 30, 2018 at 6:01 PM Paul Ramsey 
> wrote:
>
>> If you install the new version, and then use ‘ALTER EXTENSION UPDATE’ to
>> update the SQL-side bindings, everything should improve.
>>
>> On Nov 30, 2018, at 5:11 AM, Slavcho Trnkovski 
>> wrote:
>>
>> Hi,
>>
>> I have postgresql 9.4 with postgis extension installed (latest version,
>> 2.4.5). I upgraded postgresql from 9.4 to 9.6.
>> After upgrading to 9.6 I get the following result
>> from PostGIS_full_version()
>>  select PostGIS_full_version();
>>
>>  postgis_full_version
>>
>> ---
>>  POSTGIS="2.4.5 r16765" *PGSQL="94" (procs need upgrade for use with
>> "96") *GEOS="3.6.2-CAPI-1.10.2 4d2925d6" PROJ="Rel. 4.9.3, 15 August
>> 2016" GDAL="GDAL 1.11.4, released 2016/01/25" LIBXML="2.9.1" LIBJSON="0.11"
>> RASTER
>> (1 row)
>>
>> Is there any way to resolve this besides recreating the extension?
>>
>> Regards,
>> Slavcho
>>
>>
>>

-- 
El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison
http://pglearn.blogspot.mx/


Re: Moving large table between servers: logical replication or postgres_fdw

2018-12-04 Thread Rene Romero Benavides
I tend to believe that a backup with pg_dump in directory format (-F d),
which is the format that supports multiple parallel jobs (-j), followed by a
pg_restore also with multiple concurrent jobs, would be better.

Am Di., 4. Dez. 2018 um 21:14 Uhr schrieb Rhys A.D. Stewart <
rhys.stew...@gmail.com>:

> Greetings Folks,
>
> I have a relatively large table (100m rows) that I want to move to a
> new box with more resources. The table isn't doing anything...i.e its
> not being updated or read from. Which approach would be faster to move
> the data over:
>
>   a). Use pg_fdw and do "create local_table as select * from
> foreign_table".
>   b). setup logical replication between the two servers.
>
> Regards,
>
> Rhys
> Peace & Love|Live Long & Prosper
>
>

-- 
El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison
http://pglearn.blogspot.mx/


Re: debugging intermittent slow updates under higher load

2018-12-05 Thread Rene Romero Benavides
Also read about HOT (heap-only tuple) updates and the storage parameter
named "fillfactor": by keeping some free space in each block, updated row
versions can be placed in the same block instead of new ones, as long as the
update doesn't touch any indexed columns.

Am Mi., 5. Dez. 2018 um 09:39 Uhr schrieb Alexey Bashtanov
:

>
> >
> > The table has around 1.5M rows which have been updated/inserted around
> > 121M times, the distribution of updates to row in alerts_alert will be
> > quite uneven, from 1 insert up to 1 insert and 0.5M updates.
> >
> > Under high load (200-300 inserts/updates per second) we see occasional
> > (~10 per hour) updates taking excessively long times (2-10s). These
> > updates are always of the form:
> >
> > UPDATE "alerts_alert" SET ...bunch of fields... WHERE
> > "alerts_alert"."id" = '...sha1 hash...';
> >
> > Here's a sample explain:
> >
> > https://explain.depesz.com/s/Fjq8
> >
> > What could be causing this? What could we do to debug? What config
> > changes could we make to alleviate this?
> >
>
> Hello Chris,
>
> One of the reasons could be the row already locked by another backend,
> doing the same kind of an update or something different.
> Are these updates performed in a longer transactions?
> Can they hit the same row from two clients at the same time?
> Is there any other write or select-for-update/share load on the table?
>
> Have you tried periodical logging of the non-granted locks?
> Try querying pg_stat_activity and pg_locks (possibly joined and maybe
> repeatedly self-joined, google for it)
> to get the backends that wait one for another while competing for to
> lock the same row or object.
>
> Best,
>   Alex
>
>

-- 
El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison
http://pglearn.blogspot.mx/


Re: debugging intermittent slow updates under higher load

2018-12-05 Thread Rene Romero Benavides
This parameter can be updated on a "per table" basis.
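
A minimal sketch, using the table name from the thread (fillfactor 70 leaves roughly
30% of each page free so updated row versions can stay on the same page; the setting
only affects newly written pages):

ALTER TABLE alerts_alert SET (fillfactor = 70);

-- afterwards, watch what fraction of updates are HOT
SELECT relname, n_tup_upd, n_tup_hot_upd
FROM pg_stat_user_tables
WHERE relname = 'alerts_alert';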

Am Mi., 5. Dez. 2018 um 09:47 Uhr schrieb Rene Romero Benavides <
rene.romer...@gmail.com>:

> Also read about hot updates and the storage parameter named "fill_factor",
> so, data blocks can be recycled instead of creating new ones if the updated
> fields don't update also indexes.
>
> Am Mi., 5. Dez. 2018 um 09:39 Uhr schrieb Alexey Bashtanov
> :
>
>>
>> >
>> > The table has around 1.5M rows which have been updated/inserted around
>> > 121M times, the distribution of updates to row in alerts_alert will be
>> > quite uneven, from 1 insert up to 1 insert and 0.5M updates.
>> >
>> > Under high load (200-300 inserts/updates per second) we see occasional
>> > (~10 per hour) updates taking excessively long times (2-10s). These
>> > updates are always of the form:
>> >
>> > UPDATE "alerts_alert" SET ...bunch of fields... WHERE
>> > "alerts_alert"."id" = '...sha1 hash...';
>> >
>> > Here's a sample explain:
>> >
>> > https://explain.depesz.com/s/Fjq8
>> >
>> > What could be causing this? What could we do to debug? What config
>> > changes could we make to alleviate this?
>> >
>>
>> Hello Chris,
>>
>> One of the reasons could be the row already locked by another backend,
>> doing the same kind of an update or something different.
>> Are these updates performed in a longer transactions?
>> Can they hit the same row from two clients at the same time?
>> Is there any other write or select-for-update/share load on the table?
>>
>> Have you tried periodical logging of the non-granted locks?
>> Try querying pg_stat_activity and pg_locks (possibly joined and maybe
>> repeatedly self-joined, google for it)
>> to get the backends that wait one for another while competing for to
>> lock the same row or object.
>>
>> Best,
>>   Alex
>>
>>
>
> --
> El genio es 1% inspiración y 99% transpiración.
> Thomas Alva Edison
> http://pglearn.blogspot.mx/
>
>

-- 
El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison
http://pglearn.blogspot.mx/


Re: order of reading the conf files

2018-12-06 Thread Rene Romero Benavides
Why do you need to know that?

Am Do., 6. Dez. 2018 um 01:21 Uhr schrieb bhargav kamineni <
kbn98...@gmail.com>:

>
> Hi,
>
> May I know the order in which postgres reads the configuration files
> (postgresql.conf, postgresql.auto.conf, pg_hba.conf)?
> And how does the postmaster fork postgres processes? Can we see that
> forking in the logfile?
>
>
>
> Thanks,
> Banu.
>


-- 
El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison
http://pglearn.blogspot.mx/


Re: What is the tuplestore?

2018-12-10 Thread Rene Romero Benavides
What if this error message pertains to something happening on the
application side?

Am Mo., 10. Dez. 2018 um 09:56 Uhr schrieb Ron :

> Hi,
>
> v9.6.6
>
>
> 2018-12-07 06:21:55.504 EST 10.140.181.89(35868) CDS CDSLBXW 13748 SELECT
> PostgreSQL JDBC Driver 53100 ERROR:  could not write to tuplestore
> temporary
> file: No space left on device
>
> I see this in the pg_log file, but #1 can't figure out what "tuplestore"
> is
> (Google doesn't help), and #2 there's lots of space on all my file
> systems.
> data/base, where pgsql_tmp lives, has 96GB free.)
>
> Thanks
>
> --
> Angular momentum makes the world go 'round.
>
>

-- 
El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison
http://pglearn.blogspot.mx/


Re: What is the tuplestore?

2018-12-10 Thread Rene Romero Benavides
Maybe the temp space got released right after the failure?
https://grokbase.com/t/postgresql/pgsql-general/02ag7k8gcr/tuplestore-write-failed
Do you have space usage charts for that partition? Don't they show a spike
during that time?

Am Mo., 10. Dez. 2018 um 15:54 Uhr schrieb Ron :

> There's certainly a problem with the application, but the error is in the
> pg_log, not the application log.
>
> On 12/10/2018 03:21 PM, Rene Romero Benavides wrote:
>
> What if this error message pertains to something happening on the
> application side?
>
> Am Mo., 10. Dez. 2018 um 09:56 Uhr schrieb Ron :
>
>> Hi,
>>
>> v9.6.6
>>
>>
>> 2018-12-07 06:21:55.504 EST 10.140.181.89(35868) CDS CDSLBXW 13748 SELECT
>> PostgreSQL JDBC Driver 53100 ERROR:  could not write to tuplestore
>> temporary
>> file: No space left on device
>>
>> I see this in the pg_log file, but #1 can't figure out what "tuplestore"
>> is
>> (Google doesn't help), and #2 there's lots of space on all my file
>> systems.
>> data/base, where pgsql_tmp lives, has 96GB free.)
>>
>> Thanks
>>
>>
> --
> Angular momentum makes the world go 'round.
>


-- 
El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison
http://pglearn.blogspot.mx/


Re: What is the tuplestore?

2018-12-10 Thread Rene Romero Benavides
Yes, pgsql_tmp. You ought to be looking for a sudden and drastic jump in
space utilization around the time of the error message. You're not
concerned with the current space utilization, but with the utilization
around that time, because it probably got freed right after the error was
raised.
How many times has this happened? What kind of queries were running at
that time? Can you identify something that could have required lots of
temporary space?
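
Two things that may help pin it down (a minimal sketch): cumulative temp-file usage
per database, and logging every temp file over a threshold so the offending queries
show up in the server log:

SELECT datname, temp_files, pg_size_pretty(temp_bytes) AS temp_bytes
FROM pg_stat_database
ORDER BY temp_bytes DESC;

ALTER SYSTEM SET log_temp_files = '100MB';  -- interpreted as kB if given as a bare number
SELECT pg_reload_conf();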

Am Mo., 10. Dez. 2018 um 17:33 Uhr schrieb Ron :

> Which file system (specifically, which directory)?  Is it
> data/base/pgsql_tmp?  There's 96GB free, which is 74% of the volume.
>
> On 12/10/2018 04:50 PM, Rene Romero Benavides wrote:
>
> Maybe the temp space got released right after the failure?
>
> https://grokbase.com/t/postgresql/pgsql-general/02ag7k8gcr/tuplestore-write-failed
> do you have space usage charts for that partition? doesn't it show a spike
> during that time?
>
> Am Mo., 10. Dez. 2018 um 15:54 Uhr schrieb Ron :
>
>> There's certainly a problem with the application, but the error is in the
>> pg_log, not the application log.
>>
>> On 12/10/2018 03:21 PM, Rene Romero Benavides wrote:
>>
>> What if this error message pertains to something happening on the
>> application side?
>>
>> Am Mo., 10. Dez. 2018 um 09:56 Uhr schrieb Ron :
>>
>>> Hi,
>>>
>>> v9.6.6
>>>
>>>
>>> 2018-12-07 06:21:55.504 EST 10.140.181.89(35868) CDS CDSLBXW 13748
>>> SELECT
>>> PostgreSQL JDBC Driver 53100 ERROR:  could not write to tuplestore
>>> temporary
>>> file: No space left on device
>>>
>>> I see this in the pg_log file, but #1 can't figure out what "tuplestore"
>>> is
>>> (Google doesn't help), and #2 there's lots of space on all my file
>>> systems.
>>> data/base, where pgsql_tmp lives, has 96GB free.)
>>>
>>> Thanks
>>>
>>>
>>
> --
> Angular momentum makes the world go 'round.
>


-- 
El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison
http://pglearn.blogspot.mx/


Re: shared_buffers on Big RAM systems

2018-12-13 Thread Rene Romero Benavides
This topic seems to be perpetually open to discussion. In my opinion, it
depends on how big your working dataset is; there's no use in sizing
shared_buffers beyond that. I think the most reasonable thing is to analyze
each case, as proposed here:
https://www.keithf4.com/a-large-database-does-not-mean-large-shared_buffers/
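
The pg_buffercache contrib extension makes that analysis concrete. A minimal sketch,
close to the example in its documentation, showing which relations sit in the cache
and how much of shared_buffers they take:

CREATE EXTENSION IF NOT EXISTS pg_buffercache;

SELECT c.relname,
       pg_size_pretty(count(*) * 8192) AS buffered,  -- 8192 = default block size
       round(100.0 * count(*) /
             (SELECT setting::bigint FROM pg_settings WHERE name = 'shared_buffers'), 1)
         AS pct_of_shared_buffers
FROM pg_buffercache b
JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid)
WHERE b.reldatabase IN (0, (SELECT oid FROM pg_database WHERE datname = current_database()))
GROUP BY c.relname
ORDER BY count(*) DESC
LIMIT 10;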


Re: Amazon Aurora

2018-12-21 Thread Rene Romero Benavides
It might be an Aurora issue, but AFAIK the only difference from the
mainstream PostgreSQL project is their storage implementation. What do you
mean by slow? Does each and every query take longer than it should? Have you
analyzed particular execution plans?

On Thu, Dec 20, 2018 at 3:03 PM Ravi Krishna  wrote:

> Glen,
>
> I think your question can be posted here for a better response:
>
> https://forums.aws.amazon.com/forum.jspa?forumID=227
>
>  Original Message 
>
>
> On Thu, Dec 20, 2018, at 3:57 PM, Glenn Schultz wrote:
>
>
> I have a Postgres database of about 1.5 terabytes on amazon aurora.  It
> runs super slow.  Has anyone experienced this and if so how was the problem
> addressed?
> Glenn
> Sent from my iPhone
>
>
>
>

-- 
El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison
http://pglearn.blogspot.mx/


Re: Function `set_config` doesn't work in with query?

2019-01-04 Thread Rene Romero Benavides
On Fri, Jan 4, 2019 at 3:37 AM Zexuan Luo  wrote:

> For instance:
> ```
> with t as (
> select set_config('blah', '1', false)
> )
> select current_setting('blah');
> select current_setting('blah');
> ```
>
> Execute queries above gets these error messages:
> psql:test-query-dump.sql:4: ERROR:  unrecognized configuration parameter
> "blah"
> psql:test-query-dump.sql:5: ERROR:  unrecognized configuration parameter
> "blah"
>
> Thank you for any responses.
>
>
With a single-part name like "blah", the only parameters you can set that
way are the built-in ones listed in:
SELECT name FROM pg_settings;

For user-defined parameters, which need a two-part, dotted name, check this:
https://dba.stackexchange.com/questions/29961/how-do-i-set-and-get-custom-database-variables
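
A minimal sketch with a hypothetical custom setting name (the dot is what makes it a
user-defined parameter; the third argument of set_config decides whether the value is
local to the transaction or kept for the session):

SELECT set_config('myapp.blah', '1', false);  -- false: keep for the rest of the session
SELECT current_setting('myapp.blah');         -- returns '1'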

-- 
El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison
http://pglearn.blogspot.mx/


Re: type int2vector

2019-01-29 Thread Rene Romero Benavides
What's the question?

On Mon, Jan 28, 2019 at 4:25 AM 吉成恒  wrote:

>
> select * from pg_partition where 2 =all(pg_partition.paratts);
>
> --
>
> 吉成恒 (Ji Chengheng)
>
> Everbright Securities Co., Ltd., Information Technology Headquarters (Data Center)
>
> Address: 7th floor, 1508 Xinzha Road, Jing'an District
>
> Tel: 021-22169287
>
> Mobile: 18717772189
>
> Email: j...@ebscn.com
>
>

-- 
El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison
http://pglearn.blogspot.mx/


Re: Table Replication

2019-01-29 Thread Rene Romero Benavides
You could also use Slony-I

On Tue, Jan 29, 2019 at 12:24 PM Thomas Endres 
wrote:

>
> On Jan 29, 2019, at 8:22 AM, Fabrízio de Royes Mello <
> fabri...@timbira.com.br> wrote:
>
>
>
> Em ter, 29 de jan de 2019 às 10:24, Sathish Kumar 
> escreveu:
> >
> > Hi,
> >
> > We are trying to replicate few tables from one postgresql server to
> another server. We are currently using Postgresql 9.5.x, is there any way
> to achieve it without Postgresql upgrade.
>
> Yeap... use pglogical [1] extension.
>
>
> I have had success with pgloader when migrating specific tables and/or
> schemas from one machine to another.
> 
>
>
> Best,
> Tom Endres
>


-- 
El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison
http://pglearn.blogspot.mx/


Re: Hot to model data in DB (PostgreSQL) for SNMP-like multiple configurations

2019-03-09 Thread Rene Romero Benavides
Try partitioning your table based on the device_id; that will give you a
considerable boost for queries whose WHERE clause includes it. For 9.6
(that's the version you're using, right?) there's pg_partman for that kind
of thing; in this case you would partition by ranges, and if the ids are
sequential it's pretty straightforward. Any chance of upgrading to a newer
PG version? Partitioning becomes native from PG10 onwards, so you don't
have to rely on particular extensions, and newer versions always bring
significant performance improvements for several use cases (like improved
parallelism).
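
A minimal sketch of the native (PG10+) approach, with a simplified version of the
table quoted below and illustrative partition bounds (note that PG10 itself does not
allow a primary key on the partitioned parent; PG11+ does, as long as it includes the
partition key):

CREATE TABLE configuration (
    id     integer NOT NULL,
    config jsonb   NOT NULL
) PARTITION BY RANGE (id);

CREATE TABLE configuration_p00 PARTITION OF configuration
    FOR VALUES FROM (0) TO (10000000);
CREATE TABLE configuration_p01 PARTITION OF configuration
    FOR VALUES FROM (10000000) TO (20000000);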



On Fri, Mar 8, 2019 at 10:40 AM Alexandru Lazarev <
alexandru.laza...@gmail.com> wrote:

> For now I do not see the strong reason, but i inherited this project from
> other developers,
> Originally there was MongoDB and structure was more complex, having SNMP
> like nested tables with OID.Instance1.Instance2.instance3 and in JSON it
> looked like:
> {
> "1.3.6.1.4.1..3.10.2.2.25.4.1.43.1": {
> "1": {
> "24": "vlan24",
> "25": "vlan25"
> },
> "2": {
> "24": "127.0.0.1",
> "25": "8.8.8.8"
> }
> }
> }
>
> Here we have table in table - How to model this in relational - with
> separate tables and JOINs only?
> I am not excluding in future I'll have such requirement
>
> the other reason is that devices request their config and some other tools
> requests devices configs as a single document/file - this a bit create
> overhead for composing document in JSON or XML or CSV format from
> relational table (I understand it is doable, but...)
>
> BTW in PG documentation:
> "
> *8.14.2. Designing JSON documents effectively*
>
>
>
>
> *Representing data as JSON can be considerably more flexible than the
> traditional relational data model, which is compelling in environments
> where requirements are fluid. It is quite possible for both approaches to
> co-exist and complement each other within the same application. However,
> even for applications where maximal flexibility is desired, it is still
> recommended that JSON documents have a somewhat fixed structure. The
> structure is typically unenforced (though enforcing some business rules
> declaratively is possible), but having a predictable structure makes it
> easier to write queries that usefully summarize a set of "documents"
> (datums) in a table.JSON data is subject to the same concurrency-control
> considerations as any other data type when stored in a table. Although
> storing large documents is practicable, keep in mind that any update
> acquires a row-level lock on the whole row. Consider limiting JSON
> documents to a manageable size in order to decrease lock contention among
> updating transactions. Ideally, JSON documents should each represent an
> atomic datum that business rules dictate cannot reasonably be further
> subdivided into smaller datums that could be modified independently."
> https://www.postgresql.org/docs/9.6/datatype-json.html
> *
>
>
>
> On Fri, Mar 8, 2019 at 5:15 PM Alban Hertroys  wrote:
>
>> Is there a reason not to use a relational model instead of json(b) here?
>> I think that is in fact considered best practice.
>>
>> On Fri, 8 Mar 2019 at 15:40, Alexandru Lazarev <
>> alexandru.laza...@gmail.com> wrote:
>>
>>> I am working on product managing and monitoring Network (NMS-like
>>> products).
>>>
>>> Product manages configuration of network devices, for now each device
>>> has stored its configuration in simple table - this was the original design.
>>>
>>> CREATE TABLE public.configuration(
>>>   id integer NOT NULL,
>>>   config json NOT NULL,
>>>   CONSTRAINT configuration_pkey PRIMARY KEY (id),)
>>>
>>> A config looks like:
>>>
>>> {
>>> "_id": 20818132,
>>> "type": "Modem",
>>> "data": [{
>>> "oid": "1.3.6.1.4.1..3.10.2.2.25.4.1.40",
>>> "instance": "24",
>>> "value": "null"
>>> },
>>> {
>>> "oid": "1.3.6.1.4.1..3.5.10.1.86",
>>> "instance": "0",
>>> "value": "562"
>>> },
>>> {
>>> "oid": "1.3.6.1.4.1..3.5.10.3.92.4.1",
>>> "instance": "0",
>>> "value": "0"
>>> },
>>> {
>>> "oid": "1.3.6.1.4.1..3.10.2.2.25.4.1.43",
>>> "instance": "24",
>>> "value": "vlan24"
>>> },
>>> {
>>> "oid": "1.3.6.1.4.1..3.10.2.2.25.4.1.43",
>>> "instance": "25",
>>> "value": "vlan25"
>>> }
>>> ]}
>>>
>>> And there are many plv8 (java script procedural language extension for
>>> PostgreSQL) stored procedures working on bulks of such config, reading some
>>> OIDs, changing them conditionally, removing some of them and adding others,
>>> especially in use cases like: There are some upper-level META-configuration
>>> of different level, which 

Re: PostgreSQL temp table blues

2019-03-13 Thread Rene Romero Benavides
Wow, thanks for sharing your experience. What kind of connection pooling
are we talking about? Some connection pools issue a DISCARD ALL statement
when a client releases its connection; that may help, if it's possible to
configure.

On Wed, Mar 13, 2019 at 4:21 AM Jahwan Kim  wrote:

> Hi all,
>
>
> I'd like to share my (painful) experience, in which temp tables caused
> PostgreSQL shutdown.
> TL;DR. Do not use temp tables in PostgreSQL with connection pool.
>
> * My app uses connection pool AND temp tables, with default setting of ON
> COMMIT PRESERVE ROWS.
> * I found out later that autovacuum doesn't deal with temp tables.
> * The database ages as long as the connection is not closed.
> * So when the database age reaches XID STOP LIMIT, the database refuses to
> process any new transaction requests, saying "database is not accepting
> commands to avoid wraparound data loss... HINT: Stop the postmaster and use
> a standalone backend to vacuum that database. "
>
> After reading the docs, I expected this much. What happens after this
> surprised me.
> * Now the database needs to be shutdown. When shutting down, it tries to
> remove temp tables (of course), but since the database is not accepting any
> commands, ... The temp tables are then ORPHANED, although there was no
> database crash!
> * Because of these orphan temp tables, vacuuming the database in single
> mode won't work, as suggested by HINT. The orphaned temp tables must be
> manually dropped in single mode, and only then the database can be vacuumed
> back to normal state. Without dropping temp tables, vacuuming just takes
> (quite possibly a long) time and do (almost) nothing.
>
> Well, that's all. All of the above facts are documented, albeit tersely.
> If anybody I know ask me about temp tables in PostgreSQL, I'd just say
> "DON'T."
>
>
> Best Regards,
> Jahwan
>
>
>
>
>

-- 
El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison
http://pglearn.blogspot.mx/


Re: PostgreSQL temp table blues

2019-03-13 Thread Rene Romero Benavides
In conjunction with that, a pool parameter that recycles idle connections
and those that have been open for too long will help you prevent this in
the future; it also helps keep server processes from growing too big
memory-wise.


On Wed, Mar 13, 2019 at 4:32 PM Rene Romero Benavides <
rene.romer...@gmail.com> wrote:

> Wow, thanks for sharing your experience. What kind of connection pooling
> are we talking about? some connection pools implement a DISCARD ALL
> statement after a session close, that may help if possible to configure.
>
> On Wed, Mar 13, 2019 at 4:21 AM Jahwan Kim  wrote:
>
>> Hi all,
>>
>>
>> I'd like to share my (painful) experience, in which temp tables caused
>> PostgreSQL shutdown.
>> TL;DR. Do not use temp tables in PostgreSQL with connection pool.
>>
>> * My app uses connection pool AND temp tables, with default setting of ON
>> COMMIT PRESERVE ROWS.
>> * I found out later that autovacuum doesn't deal with temp tables.
>> * The database ages as long as the connection is not closed.
>> * So when the database age reaches XID STOP LIMIT, the database refuses
>> to process any new transaction requests, saying "database is not accepting
>> commands to avoid wraparound data loss... HINT: Stop the postmaster and use
>> a standalone backend to vacuum that database. "
>>
>> After reading the docs, I expected this much. What happens after this
>> surprised me.
>> * Now the database needs to be shutdown. When shutting down, it tries to
>> remove temp tables (of course), but since the database is not accepting any
>> commands, ... The temp tables are then ORPHANED, although there was no
>> database crash!
>> * Because of these orphan temp tables, vacuuming the database in single
>> mode won't work, as suggested by HINT. The orphaned temp tables must be
>> manually dropped in single mode, and only then the database can be vacuumed
>> back to normal state. Without dropping temp tables, vacuuming just takes
>> (quite possibly a long) time and do (almost) nothing.
>>
>> Well, that's all. All of the above facts are documented, albeit tersely.
>> If anybody I know ask me about temp tables in PostgreSQL, I'd just say
>> "DON'T."
>>
>>
>> Best Regards,
>> Jahwan
>>
>>
>>
>>
>>
>
> --
> El genio es 1% inspiración y 99% transpiración.
> Thomas Alva Edison
> http://pglearn.blogspot.mx/
>
>

-- 
El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison
http://pglearn.blogspot.mx/


Re: stale WAL files?

2019-04-01 Thread Rene Romero Benavides
On Sat, Mar 30, 2019 at 5:03 PM Gmail  wrote:

>
>
> > On Mar 30, 2019, at 10:54 AM, Gmail  wrote:
> >
> >
>  On Mar 29, 2019, at 6:58 AM, Michael Paquier 
> wrote:
> >>>
> >>> On Thu, Mar 28, 2019 at 09:53:16AM -0600, Rob Sargent wrote:
> >>> This is pg10 so it's pg_wal.  ls -ltr
> >>>
> >>>
> >>> -rw---. 1 postgres postgres 16777216 Mar 16 16:33
> >>> 00010CEA00B1
> >>> -rw---. 1 postgres postgres 16777216 Mar 16 16:33
> >>> 00010CEA00B2
> >>>
> >>> ... 217 more on through to ...
> >>>
> >>> -rw---. 1 postgres postgres 16777216 Mar 16 17:01
> >>> 00010CEA00E8
> >>> -rw---. 1 postgres postgres 16777216 Mar 16 17:01
> >>> 00010CEA00E9
> >>> -rw---. 1 postgres postgres 16777216 Mar 28 09:46
> >>> 00010CEA000E
> > I’m now down to 208 Mar 16 WAL files so they are being processed (at
> least deleted).  I’ve taken a snapshot of the pg_wal dir such that I can
> see which files get processed. It’s none of the files I’ve listed previously
>
> Two more have been cleaned up.  001C and 001D generated at 16:38 Mar 16
>
>
>
Please share your complete postgresql.conf file and the results from this
query:
SELECT * FROM pg_settings;
Has someone in the past configured WAL archiving?
You've run out of disk space, as this log message you shared states:
No space left on device
What's the output of df -h?
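
If archiving was ever turned on and is now failing, pg_stat_archiver should
show it (just a quick check, not a diagnosis):

SELECT archived_count, last_archived_wal,
       failed_count, last_failed_wal, last_failed_time
FROM pg_stat_archiver;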

-- 
El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison
http://pglearn.blogspot.mx/


Re: stale WAL files?

2019-04-01 Thread Rene Romero Benavides
On Mon, Apr 1, 2019 at 6:30 PM Rene Romero Benavides <
rene.romer...@gmail.com> wrote:

>
> On Sat, Mar 30, 2019 at 5:03 PM Gmail  wrote:
>
>>
>>
>> > On Mar 30, 2019, at 10:54 AM, Gmail  wrote:
>> >
>> >
>> >>>> On Mar 29, 2019, at 6:58 AM, Michael Paquier 
>> wrote:
>> >>>
>> >>> On Thu, Mar 28, 2019 at 09:53:16AM -0600, Rob Sargent wrote:
>> >>> This is pg10 so it's pg_wal.  ls -ltr
>> >>>
>> >>>
>> >>> -rw---. 1 postgres postgres 16777216 Mar 16 16:33
>> >>> 00010CEA00B1
>> >>> -rw---. 1 postgres postgres 16777216 Mar 16 16:33
>> >>> 00010CEA00B2
>> >>>
>> >>> ... 217 more on through to ...
>> >>>
>> >>> -rw---. 1 postgres postgres 16777216 Mar 16 17:01
>> >>> 00010CEA00E8
>> >>> -rw---. 1 postgres postgres 16777216 Mar 16 17:01
>> >>> 00010CEA00E9
>> >>> -rw---. 1 postgres postgres 16777216 Mar 28 09:46
>> >>> 00010CEA000E
>> > I’m now down to 208 Mar 16 WAL files so they are being processed (at
>> least deleted).  I’ve taken a snapshot of the pg_wal dir such that I can
>> see which files get processed. It’s none of the files I’ve listed previously
>>
>> Two more have been cleaned up.  001C and 001D generated at 16:38 Mar 16
>>
>>
>>
>> Please share your complete postgresql.conf file and the results from this
> query:
> SELECT * FROM pg_settings;
> has someone in the past configured wal archiving?
> You've ran out of disk space as this log message you shared states:
> No space left on device
> what's the output of df -h
>
> --
> El genio es 1% inspiración y 99% transpiración.
> Thomas Alva Edison
> http://pglearn.blogspot.mx/
>
>
BTW, how spread apart are the checkpoints? Do you have stats on that? Maybe
they're too far apart, and that's why the WAL files can't be recycled
quickly enough.
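
Something like this gives a rough idea of how checkpoints have been
triggered since the statistics were last reset (timed vs. requested):

SELECT checkpoints_timed, checkpoints_req,
       checkpoint_write_time, checkpoint_sync_time, stats_reset
FROM pg_stat_bgwriter;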
-- 
El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison
http://pglearn.blogspot.mx/


Re: Table Export & Import

2019-04-01 Thread Rene Romero Benavides
Hi Sathish, as Michel Pelletier pointed out, a trigger-based approach (e.g.
Slony-I, Bucardo) or the pglogical extension (requires a server restart) is
the way to go. Personally, I've worked with Slony-I; the initial setup is
somewhat tricky, but it works. Depending on how write-heavy the table is,
and on storage and network speed, we're talking about seconds of sync
difference in the worst-case scenario.

On Mon, Apr 1, 2019 at 8:23 PM Sathish Kumar  wrote:

> Hi All,
>
> Can you tell me a way for table replication or sync or to achieve minimal
> downtime from dbserver1 to dbserver2 on Postgresql 9.5
>
> Table Size: 160gb
> 4VCPU, 16gb RAM
>
> On Tue, Apr 2, 2019, 12:19 AM Sathish Kumar  wrote:
>
>> The table size is 160gb. We would like to move/copy this table fro one db
>> server to another db server.
>>
>> On Tue, Apr 2, 2019, 12:17 AM Michel Pelletier <
>> pelletier.mic...@gmail.com> wrote:
>>
>>> On Mon, Apr 1, 2019 at 7:47 AM Sathish Kumar  wrote:
>>>
 Hi Adrian,
 We are exporting live table data to a new database, so we need to stop
 our application until the export/import is completed. We would like to
 minimise this downtime.

>>>
>>> It's more complicated if you want to keep your application running and
>>> writing to the db while migrating.  There are trigger-level replication
>>> tools, like slony that can be used to stream changes to the new database,
>>> and then you switch over once you get both of them to parity, but there are
>>> some gotchas.  You said the db is only 160GB, it depend a lot on what kind
>>> of schema we're talking about, but I imagine it wouldn't take long to just
>>> take the downtime and do a normal pg_upgrade.
>>>
>>>

 On Mon, Apr 1, 2019, 10:22 PM Adrian Klaver 
 wrote:

> On 3/31/19 11:09 PM, Sathish Kumar wrote:
> > Hi Team,
> >
> > We have a requirement to copy a table from one database server to
> > another database server. We are looking for a solution to achieve
> this
> > with lesser downtime on Prod. Can you help us with this?
>
> So what is creating the downtime now?
>
> In addition to other suggestions you might want to take a look at:
>
> https://www.postgresql.org/docs/9.5/postgres-fdw.html
>
>
> >
> > Table Size: 160GB
> > Postgresql Server Version: 9.5
> >
> >
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>


-- 
El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison
http://pglearn.blogspot.mx/


Re: Gigantic load average spikes

2019-04-01 Thread Rene Romero Benavides
On Mon, Apr 1, 2019 at 10:35 AM rihad  wrote:

> On 04/01/2019 08:30 PM, Michel Pelletier wrote:
>
>
>
> On Sun, Mar 31, 2019 at 10:49 PM David Rowley <
> david.row...@2ndquadrant.com> wrote:
>
>>
>> Perhaps a bunch of processes waiting on the access exclusive lock on
>> the materialized view being released?
>>
>> log_lock_waits might help you if the MV takes more than a second to
>> refresh, otherwise, you might need to have a look at ungranted locks
>> in pg_locks and see if the number of locks spikes during the refresh.
>>
>
> I think David's got the right idea here.  Like he said, investigate
> pg_locks, if it is the refresh materialized view, you can avoid the problem
> by doing 'REFRESH MATERIALIZED VIEW CONCURRENTLY'.  You will need at least
> one unique index on the table.
>
>
> It is actually refreshed concurrently.
>
>
>
>> --
>>  David Rowley   http://www.2ndQuadrant.com/
>>  PostgreSQL Development, 24x7 Support, Training & Services
>>
>>
>>
Hi. How many vcores does the server have? What's the load average we're
talking about? Do you mind sharing your PostgreSQL configuration?
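
While a spike is happening, a quick look at ungranted locks (as David
suggested) would also help confirm or rule out the refresh as the culprit;
something along these lines:

SELECT locktype, relation::regclass AS relation, mode, pid, granted
FROM pg_locks
WHERE NOT granted;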

--


Re: stale WAL files?

2019-04-03 Thread Rene Romero Benavides
On Wed, Apr 3, 2019 at 1:05 PM Rob Sargent  wrote:

>
>
> On Apr 1, 2019, at 9:20 PM, Rene Romero Benavides 
> wrote:
>
>
> On Mon, Apr 1, 2019 at 6:30 PM Rene Romero Benavides <
> rene.romer...@gmail.com> wrote:
>
>>
>> On Sat, Mar 30, 2019 at 5:03 PM Gmail  wrote:
>>
>>>
>>>
>>> > On Mar 30, 2019, at 10:54 AM, Gmail  wrote:
>>> >
>>> >
>>> >>>> On Mar 29, 2019, at 6:58 AM, Michael Paquier 
>>> wrote:
>>> >>>
>>> >>> On Thu, Mar 28, 2019 at 09:53:16AM -0600, Rob Sargent wrote:
>>> >>> This is pg10 so it's pg_wal.  ls -ltr
>>> >>>
>>> >>>
>>> >>> -rw---. 1 postgres postgres 16777216 Mar 16 16:33
>>> >>> 00010CEA00B1
>>> >>> -rw---. 1 postgres postgres 16777216 Mar 16 16:33
>>> >>> 00010CEA00B2
>>> >>>
>>> >>> ... 217 more on through to ...
>>> >>>
>>> >>> -rw---. 1 postgres postgres 16777216 Mar 16 17:01
>>> >>> 00010CEA00E8
>>> >>> -rw---. 1 postgres postgres 16777216 Mar 16 17:01
>>> >>> 00010CEA00E9
>>> >>> -rw---. 1 postgres postgres 16777216 Mar 28 09:46
>>> >>> 00010CEA000E
>>> > I’m now down to 208 Mar 16 WAL files so they are being processed (at
>>> least deleted).  I’ve taken a snapshot of the pg_wal dir such that I can
>>> see which files get processed. It’s none of the files I’ve listed previously
>>>
>>> Two more have been cleaned up.  001C and 001D generated at 16:38 Mar 16
>>>
>>>
>>>
>>> Please share your complete postgresql.conf file and the results from
>> this query:
>> SELECT * FROM pg_settings;
>> has someone in the past configured wal archiving?
>> You've ran out of disk space as this log message you shared states:
>> No space left on device
>> what's the output of df -h
>>
>> --
>> El genio es 1% inspiración y 99% transpiración.
>> Thomas Alva Edison
>> http://pglearn.blogspot.mx/
>>
>>
> BTW , how spread apart are checkpoints happening? do you have stats on
> that? maybe they're too spread apart and that's why WAL files cannot be
> recycled rapidly enough?
> --
> El genio es 1% inspiración y 99% transpiración.
> Thomas Alva Edison
> http://pglearn.blogspot.mx/
>
> two attempts (one in-line, one with attachement) at sending
> postgresql.conf and pg_settings report have been sent to a moderator.
>
>
>
As per your configuration:
max_wal_size = 50GB
This seems to be the cause of the WAL files piling up.

max_wal_size is declared twice in your file; the last occurrence is the one
taking effect.
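
You can confirm which declaration won, and from which line of the file,
with something like this (sourcefile/sourceline are only visible to
superusers):

SELECT name, setting, unit, sourcefile, sourceline
FROM pg_settings
WHERE name = 'max_wal_size';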
-- 
El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison
http://pglearn.blogspot.mx/


Re: Reg: Pg_Ctl command help

2019-04-03 Thread Rene Romero Benavides
How did you install it? It should be installed as a service.
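
If it wasn't registered by the installer, something like this from an
elevated command prompt should register and start it as a service (the
service name here is only an example):

pg_ctl register -N "postgresql-9.6" -D "C:\Program Files\PostgreSQL\9.6\data"
net start postgresql-9.6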

On Wed, Apr 3, 2019 at 4:24 AM Nadeem Akbar basha 
wrote:

> Hello,
>
>
>
> I have a query regarding starting the Postgres server using the ‘pg_ctl’
> command in the command prompt (Windows OS).
>
>
>
> I try to start  the postgres server using the following command,
>
> pg_ctl -D "C:\Program Files\PostgreSQL\9.6\data" start -w
>
>
>
> After the server gets started, I’m closing the command prompt. But as soon
> as I close the console, the server gets forcefully shutdown. Again I have
> to restart the server  through command prompt.
>
> Is there any way, where after starting the server, I have to exit the
> command prompt, still running the server at the background.
>
>
>
> Please help me to resolve this issue.
>
>
>
> *Thanks & Regards,*
>
> *A. Nadeem Ahmed*
>
>
>


-- 
El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison
http://pglearn.blogspot.mx/


Re: stale WAL files?

2019-04-10 Thread Rene Romero Benavides
What's your current max_wal_size setting?
SHOW max_wal_size;
If it's 8GB, as in the earlier value in your configuration, you would get a
steady share of up to 512 WAL files. If it's a development environment, set
it to the size you actually want; the smaller the value, the more frequent
the checkpoints. But your checkpoint_timeout is 300 (5 minutes), which is
likely to be reached first, and is therefore probably what's triggering
checkpoints that often.
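
If you want to see which of the two is actually firing, enabling checkpoint
logging for a while will show whether each checkpoint was triggered by the
timeout or by WAL volume (needs a superuser, no restart):

ALTER SYSTEM SET log_checkpoints = on;
SELECT pg_reload_conf();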

On Wed, Apr 10, 2019 at 1:12 PM Rob Sargent  wrote:

>
>
> As per your configuration :
> max_wal_size = 50GB
> this seems to be the cause for the WAL files piling up.
>
> this has been declared twice, the last one is taking effect.
> --
> El genio es 1% inspiración y 99% transpiración.
> Thomas Alva Edison
> http://pglearn.blogspot.mx/
>
> I've manage to generate another 359 WAL files in a 10 minute span
> yesterday (now only 357 remain and I suspect they will wither away as
> before).  Are these being held simply because of the high max_wal_size
> value?
>
> This is a development environment, wherein I'm loading 4M+ records, first
> into 41 staging tables 100K rows per.  In a loop over each staging table,
> the data is then placed into application tables via selects. First select *
> into "matching table" then select id into intersection record (id, fixed
> groupId).  Each such iteration is in it's own transaction.  I have dropped
> and recreate this same database numerous times working my way up from 100K
> to 4M records, dialing in application parameters according to number of
> primary records.  I have not, however, dropped the last incarnation.
>


-- 
El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison
http://pglearn.blogspot.mx/


Re: terminating walsender process due to replication timeout

2019-05-14 Thread Rene Romero Benavides
To detect network issues maybe you could monitor replication delay.
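
For example, on the primary, something like this (the lag columns exist in
10 and later) shows how far behind each standby is:

SELECT application_name, state, sync_state,
       write_lag, flush_lag, replay_lag
FROM pg_stat_replication;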

On Mon, May 13, 2019 at 6:42 AM  wrote:

> Hello PostgreSQL Community!
>
> I faced an issue on my linux machine using Postgres 11.3 .
> I have 2 nodes in db cluster: master and standby.
> I tried to perform a plenty of long-running  queries which lead to the
> databases desynchronization:
> terminating walsender process due to replication timeout
>
> Here is the output in debug mode:
> 2019-05-13 13:21:33 FET 0 DEBUG:  sending replication keepalive
> 2019-05-13 13:21:34 FET 0 DEBUG:  StartTransaction(1) name: unnamed;
> blockState: DEFAULT; state: INPROGRESS, xid/subid/cid: 0/1/0
> 2019-05-13 13:21:34 FET 0 DEBUG:  CommitTransaction(1) name: unnamed;
> blockState: END; state: INPROGRESS, xid/subid/cid: 0/1/0
> 2019-05-13 13:21:34 FET 0 DEBUG:  StartTransaction(1) name: unnamed;
> blockState: DEFAULT; state: INPROGRESS, xid/subid/cid: 0/1/0
> 2019-05-13 13:21:34 FET 0 DEBUG:  CommitTransaction(1) name: unnamed;
> blockState: END; state: INPROGRESS, xid/subid/cid: 0/1/0
> 2019-05-13 13:21:34 FET 0 DEBUG:  StartTransaction(1) name: unnamed;
> blockState: DEFAULT; state: INPROGRESS, xid/subid/cid: 0/1/0
> 2019-05-13 13:21:34 FET 0 DEBUG:  CommitTransaction(1) name: unnamed;
> blockState: END; state: INPROGRESS, xid/subid/cid: 0/1/0
> 2019-05-13 13:21:34 FET 0 DEBUG:  StartTransaction(1) name: unnamed;
> blockState: DEFAULT; state: INPROGRESS, xid/subid/cid: 0/1/0
> 2019-05-13 13:21:34 FET 0 DEBUG:  CommitTransaction(1) name: unnamed;
> blockState: END; state: INPROGRESS, xid/subid/cid: 0/1/0
> 2019-05-13 13:21:34 FET 0 DEBUG:  StartTransaction(1) name: unnamed;
> blockState: DEFAULT; state: INPROGRESS, xid/subid/cid: 0/1/0
> 2019-05-13 13:21:34 FET 0 DEBUG:  CommitTransaction(1) name: unnamed;
> blockState: END; state: INPROGRESS, xid/subid/cid: 0/1/0
> 2019-05-13 13:21:34 FET 0 DEBUG:  StartTransaction(1) name: unnamed;
> blockState: DEFAULT; state: INPROGRESS, xid/subid/cid: 0/1/0
> 2019-05-13 13:21:34 FET 0 DEBUG:  CommitTransaction(1) name: unnamed;
> blockState: END; state: INPROGRESS, xid/subid/cid: 0/1/0
> 2019-05-13 13:21:34 FET 0 LOG:  terminating walsender process due to
> replication timeout
>
>
> The issue is reproducible. I configure 2 nodes cluster, download
> demo_small.zip from https://edu.postgrespro.ru/ and run the following
> command:
> psql -U user1 -f demo_small.sql db1
> and I get the observed behaviour.
>
>
> I know that I can increase wal_sender_timeout value to avoid this
> behaviour (currently wal_sender_timeout is equal to 1 second.)
> To be honest I don't want to increase wal_sender_timeout because I would
> like to detect some network issues quickly.
>
> After having googled I found that someone faced a similar issue
> https://www.postgresql.org/message-id/e082a56a-fd95-a250-3bae-0fff93832...@2ndquadrant.com
> which was fixed in  PostgreSQL 9.4.16.
>
>
> Is my issue the same as described here
> https://www.postgresql.org/message-id/e082a56a-fd95-a250-3bae-0fff93832...@2ndquadrant.com
> ?
> Is there any  other chance to avoid it without increasing
> wal_sender_timeout?
>
>
> Thank you in advance.
> Regards,
> Andrei



-- 
El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison
http://pglearn.blogspot.mx/


Re: pgstattuple free_percent to high

2017-12-13 Thread Rene Romero Benavides
Check for long-running transactions modifying (update, insert) those
tables, using pg_stat_activity.
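
Something along these lines (the five-minute cutoff is just an example)
will surface them:

SELECT pid, usename, state,
       now() - xact_start AS xact_age, query
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
  AND now() - xact_start > interval '5 minutes'
ORDER BY xact_start;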

Tweak these storage parameters for such tables:
autovacuum_vacuum_cost_delay: decrease it (the autovacuum worker sleeps for
shorter periods when it hits the cost limit)
autovacuum_vacuum_threshold: decrease it (to trigger autovacuum after fewer
modified rows)
autovacuum_vacuum_cost_limit: increase it (to let the autovacuum worker do
more work before it has to sleep)
autovacuum_vacuum_scale_factor: decrease it (to trigger autovacuum when a
smaller percentage of the table has been modified)

For example, I've set these parameters on one table subject to long-running
transactions, and for its access patterns they have worked well:

autovacuum_vacuum_cost_delay=5, autovacuum_vacuum_threshold=50,
autovacuum_vacuum_cost_limit=3000, autovacuum_vacuum_scale_factor=0.01

but these settings are very particular to each usage pattern.
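
For reference, per-table settings like these are applied with ALTER TABLE
(the table name below is only a placeholder):

ALTER TABLE my_table SET (
    autovacuum_vacuum_cost_delay = 5,
    autovacuum_vacuum_threshold = 50,
    autovacuum_vacuum_cost_limit = 3000,
    autovacuum_vacuum_scale_factor = 0.01
);
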

Take into account that more activity from autovacuum means more IO, more
CPU usage, you might also benefit from setting autovacuum_work_mem to a
higher setting if the available RAM allows it, to give more RAM to the
autovacuum daemon.



2017-12-13 9:49 GMT-06:00 Nicola Contu :

> Hello,
> We are running postgres 9.6.6 on centos 7.
>
> We have a large DB (180GB) with about 1200 tables.
>
> We have autovacuum set with default values and we are seeing that for some
> tables the free percent goes really high (51%) and we need to daily full
> vacuum those tables.
>
> dbanme=# SELECT * FROM pgstattuple('tablename');
>  table_len  | tuple_count | tuple_len  | tuple_percent | dead_tuple_count
> | dead_tuple_len | dead_tuple_percent | free_space | free_percent
> +-++---+
> --+++---
> -+--
>  2119548928 |  526658 | 1023569149 | 48.29 |0
> |  0 |  0 | 1083485292 |51.12
> (1 row)
>
> I guess this is because of long queries but I'm not really sure.
> Do you know how to avoid this problem and what can cause it?
>
> Do you think that increasing the autovacuum settings for those tables
> would alleviate the issue?
>
> Thanks,
> Nicola
>



-- 
El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison
http://pglearn.blogspot.mx/


Re: Deadlock with 2 processes waiting on transactionids and locking unrelated tables

2018-02-18 Thread Rene Romero Benavides
Hi. Does either of the two tables have triggers? What's the database /
transaction isolation level? Do the updates run alongside other read/write
operations within the same transaction?
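
To answer the first two, something like this would do (assuming the table
names from your queries and the default schema):

SHOW default_transaction_isolation;

SELECT tgrelid::regclass AS table_name, tgname
FROM pg_trigger
WHERE tgrelid IN ('ticket'::regclass, 'planscheduleitem'::regclass)
  AND NOT tgisinternal;
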
Regards.

2018-02-18 23:28 GMT-06:00 David Wheeler :

> Hi,
>
> We’re seeing deadlock semi-regularly (0-2 per day) that I’m really having
> trouble getting to the bottom of.
>
> Process 7172 waits for ShareLock on transaction 4078724272; blocked by
> process 7186.
> Process 7186 waits for ShareLock on transaction 4078724210; blocked by
> process 7172.
>
> The two queries in question are updates on unrelated tables. Running the
> queries on their own shows no overlapping entries in pg_locks.
>
> Process 7172: update ticket set unread = true where ticketid = $1
> Process 7186: update "planscheduleitem" set "planschedule"=$1 where
> "psi"=$2
>
> How can I work out why Postgres has decided that the two processes are in
> deadlock? Is there an explainer somewhere on transaction level locks? I
> can’t see anything in the docs besides that they exist.
>
>
>
> Details below
>
> select version();
>   version
> 
> ---
>  PostgreSQL 10.2 (Ubuntu 10.2-1.pgdg14.04+1) on x86_64-pc-linux-gnu,
> compiled by gcc (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4, 64-bit
> (1 row)
>
> 
> 
> ---
>
>
> after running update "planscheduleitem" set "planschedule"=$1 where
> "psi"=$2
>
> SELECT locktype, virtualxid, transactionid, virtualtransaction, pid, mode,
> relname, page, tuple
> FROM pg_locks LEFT JOIN pg_class ON (relation=oid) where pid =41715;
>locktype| virtualxid | transactionid | virtualtransaction |  pid  |
>   mode   | relname | page | tuple
> ---++---+---
> -+---+--+-+--+---
>  relation  ||   | 56/2306863 | 41715 |
> RowExclusiveLock | ticket_parentticketid   |  |
>  relation  ||   | 56/2306863 | 41715 |
> RowExclusiveLock | ticket_originalticketid |  |
>  relation  ||   | 56/2306863 | 41715 |
> RowExclusiveLock | ticket_tickettypeid_idx |  |
>  relation  ||   | 56/2306863 | 41715 |
> RowExclusiveLock | ticket_subject_idx  |  |
>  relation  ||   | 56/2306863 | 41715 |
> RowExclusiveLock | ticket_closedtime_idx   |  |
>  relation  ||   | 56/2306863 | 41715 |
> RowExclusiveLock | ticket_assignedto_idx   |  |
>  relation  ||   | 56/2306863 | 41715 |
> RowExclusiveLock | ticket_serviceuid_idx   |  |
>  relation  ||   | 56/2306863 | 41715 |
> RowExclusiveLock | ticket_parentuid_idx|  |
>  relation  ||   | 56/2306863 | 41715 |
> RowExclusiveLock | ticket_createdtime_idx  |  |
>  relation  ||   | 56/2306863 | 41715 |
> RowExclusiveLock | ticket_txid |  |
>  relation  ||   | 56/2306863 | 41715 |
> RowExclusiveLock | ticket_tickettype   |  |
>  relation  ||   | 56/2306863 | 41715 |
> RowExclusiveLock | ticket_ticketpriority   |  |
>  relation  ||   | 56/2306863 | 41715 |
> RowExclusiveLock | ticket_idx_0|  |
>  relation  ||   | 56/2306863 | 41715 |
> RowExclusiveLock | ticket_pkey |  |
>  relation  ||   | 56/2306863 | 41715 |
> RowExclusiveLock | number_constraint   |  |
>  relation  ||   | 56/2306863 | 41715 |
> RowExclusiveLock | ticket  |  |
>  virtualxid| 56/2306863 |   | 56/2306863 | 41715 |
> ExclusiveLock| |  |
>  transactionid ||4089785154 |
> 56/2306863 | 41715 | ExclusiveLock| |
>|
>  relation  ||   | 56/2306863 | 41715 |
> RowExclusiveLock | ticket_fromuid_idx  |  |
> (19 rows)
>
> 
> 
> 
>
> after running update ticket set unread = true where ticketid = $1
>
> SELECT locktype, virtualxid, transactionid, virtualtra

Re: Deadlock with 2 processes waiting on transactionids and locking unrelated tables

2018-02-19 Thread Rene Romero Benavides
My guess is that the transaction doing:

update "planscheduleitem" set "planschedule"=$1 where "psi"=$2

has already updated ticket before reaching that point,

and that the transaction doing:

update ticket set unread = true where ticketid = $1

has already updated planscheduleitem before that.

Does that make sense to you? BTW, do the transactions use explicit locking?
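
One thing that might help confirm this: logging lock waits records which
statement each backend was running when it started waiting, which usually
exposes the earlier statements of each transaction (superuser, no restart
needed):

ALTER SYSTEM SET log_lock_waits = on;
SELECT pg_reload_conf();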

2018-02-18 23:28 GMT-06:00 David Wheeler :

> Hi,
>
> We’re seeing deadlock semi-regularly (0-2 per day) that I’m really having
> trouble getting to the bottom of.
>
> Process 7172 waits for ShareLock on transaction 4078724272; blocked by
> process 7186.
> Process 7186 waits for ShareLock on transaction 4078724210; blocked by
> process 7172.
>
> The two queries in question are updates on unrelated tables. Running the
> queries on their own shows no overlapping entries in pg_locks.
>
> Process 7172: update ticket set unread = true where ticketid = $1
> Process 7186: update "planscheduleitem" set "planschedule"=$1 where
> "psi"=$2
>
> How can I work out why Postgres has decided that the two processes are in
> deadlock? Is there an explainer somewhere on transaction level locks? I
> can’t see anything in the docs besides that they exist.
>
>
>
> Details below
>
> select version();
>   version
> 
> ---
>  PostgreSQL 10.2 (Ubuntu 10.2-1.pgdg14.04+1) on x86_64-pc-linux-gnu,
> compiled by gcc (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4, 64-bit
> (1 row)
>
> 
> 
> ---
>
>
> after running update "planscheduleitem" set "planschedule"=$1 where
> "psi"=$2
>
> SELECT locktype, virtualxid, transactionid, virtualtransaction, pid, mode,
> relname, page, tuple
> FROM pg_locks LEFT JOIN pg_class ON (relation=oid) where pid =41715;
>locktype| virtualxid | transactionid | virtualtransaction |  pid  |
>   mode   | relname | page | tuple
> ---++---+---
> -+---+--+-+--+---
>  relation  ||   | 56/2306863 | 41715 |
> RowExclusiveLock | ticket_parentticketid   |  |
>  relation  ||   | 56/2306863 | 41715 |
> RowExclusiveLock | ticket_originalticketid |  |
>  relation  ||   | 56/2306863 | 41715 |
> RowExclusiveLock | ticket_tickettypeid_idx |  |
>  relation  ||   | 56/2306863 | 41715 |
> RowExclusiveLock | ticket_subject_idx  |  |
>  relation  ||   | 56/2306863 | 41715 |
> RowExclusiveLock | ticket_closedtime_idx   |  |
>  relation  ||   | 56/2306863 | 41715 |
> RowExclusiveLock | ticket_assignedto_idx   |  |
>  relation  ||   | 56/2306863 | 41715 |
> RowExclusiveLock | ticket_serviceuid_idx   |  |
>  relation  ||   | 56/2306863 | 41715 |
> RowExclusiveLock | ticket_parentuid_idx|  |
>  relation  ||   | 56/2306863 | 41715 |
> RowExclusiveLock | ticket_createdtime_idx  |  |
>  relation  ||   | 56/2306863 | 41715 |
> RowExclusiveLock | ticket_txid |  |
>  relation  ||   | 56/2306863 | 41715 |
> RowExclusiveLock | ticket_tickettype   |  |
>  relation  ||   | 56/2306863 | 41715 |
> RowExclusiveLock | ticket_ticketpriority   |  |
>  relation  ||   | 56/2306863 | 41715 |
> RowExclusiveLock | ticket_idx_0|  |
>  relation  ||   | 56/2306863 | 41715 |
> RowExclusiveLock | ticket_pkey |  |
>  relation  ||   | 56/2306863 | 41715 |
> RowExclusiveLock | number_constraint   |  |
>  relation  ||   | 56/2306863 | 41715 |
> RowExclusiveLock | ticket  |  |
>  virtualxid| 56/2306863 |   | 56/2306863 | 41715 |
> ExclusiveLock| |  |
>  transactionid ||4089785154 |
> 56/2306863 | 41715 | ExclusiveLock| |
>|
>  relation  ||   | 56/2306863 | 41715 |
> RowExclusiveLock | ticket_fromuid_idx  |  |
> (19 rows)
>
> 
> 
> 
>
> after running 

Re: Performance issues during backup

2018-02-20 Thread Rene Romero Benavides
What about sending the backup to a different server through ssh/rsync or
something similar? That would save a lot of local I/O activity on the
database host.
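
Roughly along these lines, reusing your existing script (just a sketch; it
assumes passwordless ssh to a host called backuphost and a /backups
directory there):

psql -c "SELECT pg_start_backup('${DATE}');"
tar -cf - ${DATA_DIR} --exclude ${DATA_DIR}/pg_log | ssh backuphost "cat > /backups/${BACKUP_NAME}.tar"
psql -c "SELECT pg_stop_backup();"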

2018-02-20 2:02 GMT-06:00 Laurenz Albe :

> Dylan Luong wrote:
> > We perform nighty base backup of our production PostgreSQL instance. We
> have a script that basically puts the instance
> > into back mode and then backs up (tar) the /Data directory and then
> takes it out of backup mode.
> > Ie,
> > psql -c "SELECT pg_start_backup('${DATE}');"
> > tar -cvf - ${DATA_DIR} --exclude ${DATA_DIR}/pg_log | split -d -b
> $TAR_SPLIT_SIZE - ${BACKUP_DIR}/${BACKUP_NAME}
> > psql -c "SELECT pg_stop_backup();"
> >
> > The size of our database is about 250GB and it usually takes about 1
> hour to backup.
> > During this time, we have performance issue where queries can take up to
> 15secs to return where normally it takes 2 to 3 seconds.
> > During this time (1:30am) usage is low (less than 10 users) on the
> system.
> >
> > Has anyone experience the same problem and any suggestions where to look
> at to resolve the problem?
>
> The "tar" is probably taking up too much I/O bandwidth.
>
> Assuming this is Linux, you could run it with
>
>   ionice -c 2 -n 7 tar ...
>
> or
>
>   ionice -c 3 tar ...
>
> Of course then you can expect the backup to take more time.
>
> Yours,
> Laurenz Albe
> --
> Cybertec | https://www.cybertec-postgresql.com
>
>


-- 
El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison
http://pglearn.blogspot.mx/