pgbouncer with ldap
Hello, I am trying to set up pgbouncer with LDAP to connect to RDS PostgreSQL. I downloaded the latest version of pgbouncer (1.11) and am using the hba auth_type. I get the error below when starting the pgbouncer daemon. Am I missing something?

[centos@ip-x.x.x.x etc]$ pgbouncer -V
PgBouncer version 1.11.0

[centos@ip-x.x.x.x etc]$ cat pg_hba.conf
host all user1 0.0.0.0/0 md5
host all all 0.0.0.0/0 ldap ldapserver=ldap-server.com ldapprefix="cn=" ldapsuffix=", dc=group, dc=com"

[centos@ip-x.x.x.x etc]$ cat pgbouncer.ini | grep -v '^$' | grep -v '^;'
[databases]
db1 = host=xxx.rds.amazonaws.com port=5439 dbname=db1
[users]
[pgbouncer]
logfile = /var/log/pgbouncer/pgbouncer.log
pidfile = /var/run/pgbouncer/pgbouncer.pid
listen_addr = *
listen_port = 5439
auth_type = hba
auth_file = /home/centos/pgbouncer-1.11.0/etc/userlist.txt
auth_hba_file = /home/centos/pgbouncer-1.11.0/etc/pg_hba.conf

[centos@ip-x.x.x.x etc]$ pgbouncer -d pgbouncer.ini
2019-09-09 06:57:54.447 UTC [9465] WARNING hba line 2: unsupported method: buf=ldap
2019-09-09 06:57:54.448 UTC [9465] WARNING could not parse hba config line 2

Regards,
Ayub
Re: pgbouncer with ldap
PgBouncer has hba support, and via the hba file one can specify LDAP connections:
https://www.postgresql.org/docs/9.3/auth-pg-hba-conf.html

On Mon, Sep 9, 2019, 3:40 AM Christoph Moench-Tegeder wrote:
> ## Ayub M (hia...@gmail.com):
>
> > Hello, I am trying to setup pgbouncer with ldap to connect with rds
> > postgres. Downloaded latest version of pbbouncer (1.11) and using hba
> > auth_type. Getting below error when starting pgbouncer daemon. Am I
> > missing something?
>
> There's no "ldap" mentioned anywhere in the docs:
> https://pgbouncer.github.io/config.html
> That could be a hint that ldap isn't supported...
>
> Regards,
> Christoph
>
> --
> Spare Space
Re: pgbouncer with ldap
Thank you Achilleas and Laurenz for the pointers, will try this out.

On Mon, Sep 9, 2019, 6:46 AM Achilleas Mantzios <ach...@matrix.gatewaynet.com> wrote:
> On 9/9/19 12:41 μ.μ., Laurenz Albe wrote:
> > Christoph Moench-Tegeder wrote:
> >>> It has hba and via hba file one can specify ldap connections
> >>> https://www.postgresql.org/docs/9.3/auth-pg-hba-conf.html
> >> https://pgbouncer.github.io/config.html#hba-file-format
> >> "Auth-method field: Only methods supported by PgBouncer's auth_type
> >> are supported", and "ldap" is not supported.
> >> When there's no ldap support in pgbouncer, there's no ldap support
> >> in pgbouncer.
> > To throw in something less tautological:
> >
> > PgBouncer supports PAM authentication, so if you are on UNIX,
> > you could use PAM's LDAP module to do what you want.
> Right, I had written a blog about it:
> https://severalnines.com/database-blog/one-security-system-application-connection-pooling-and-postgresql-case-ldap
>
> However, I always wished (since my first endeavors with pgbouncer) it was
> less complicated.
>
> > Yours,
> > Laurenz Albe
>
> --
> Achilleas Mantzios
> IT DEV Lead
> IT DEPT
> Dynacom Tankers Mgmt
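[Editor's note] The PAM-plus-LDAP route suggested above needs a PAM service definition that PgBouncer's PAM plugin can use. A minimal sketch, assuming the pam_ldap module is installed (the file name, package source, and service name are assumptions to adapt; this is not from the thread):

```
# /etc/pam.d/pgbouncer -- hypothetical service file for PgBouncer's auth_type = pam
# pam_ldap.so typically comes from the pam_ldap or nss-pam-ldapd packages;
# the LDAP server details go in /etc/ldap.conf (or nslcd.conf), not here.
auth     required   pam_ldap.so
account  required   pam_ldap.so
```

The blog post linked in the thread walks through a complete setup, including the binary permissions the PAM stack requires.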
Re: pgbouncer with ldap
Achilleas, for this setup to work are changes to postgresql.conf and pg_hba.conf needed? I am trying to implement this for AWS RDS Aurora where these files are not accessible.
Re: pgbouncer with ldap
Okay, thanks for the response. Unfortunately Aurora does not expose these files, or I should say there is no concept of these files in the AWS managed Aurora DB service. Anyway I will give it a try and let you know.

On Thu, Sep 12, 2019 at 1:52 AM Achilleas Mantzios <ach...@matrix.gatewaynet.com> wrote:
> On 11/9/19 2:47 μ.μ., Ayub M wrote:
> > Achilleas, for this setup to work are changes to postgresql.conf and
> > pg_hba.conf needed? I am trying to implement this for AWS rds Aurora
> > where these files are not accessible.
>
> Those files are needed in any case if you work with postgresql.
> Unfortunately no experience with Aurora. We have been building from
> source for ages.

--
Regards,
Ayub
Re: pgbouncer with ldap
Stumbled at the first step: PAM authentication via pgbouncer. After compiling pgbouncer with the PAM plug-in, I am unable to log in to the db; it throws a PAM error message. Please help.

The db user was created with the same password as the linux user:
localhost:~$ psql -h dbhost -p 3306 -U admin -W db1
db1=> create user testuser password 'hello123';
CREATE ROLE

[ec2-user@ip-1.1.1.1 pam.d]$ psql -h localhost -p 5432 testdb -U testuser
Password for user testuser:
psql: ERROR: auth failed

Log entries - pgbouncer.log:
2019-09-13 06:51:47.180 UTC [5752] LOG C-0x1243020: testdb/testuser@[::1]:52408 login attempt: db=testdb user=testuser tls=no
2019-09-13 06:51:47.180 UTC [5752] NOISE safe_send(12, 9) = 9
2019-09-13 06:51:47.180 UTC [5752] NOISE resync(12): done=86, parse=86, recv=86
2019-09-13 06:51:47.180 UTC [5752] NOISE resync(12): done=0, parse=0, recv=0
2019-09-13 06:51:47.180 UTC [5752] NOISE safe_recv(12, 4096) = 14
2019-09-13 06:51:47.180 UTC [5752] NOISE C-0x1243020: testdb/testuser@[::1]:52408 read pkt='p' len=14
2019-09-13 06:51:47.180 UTC [5752] DEBUG C-0x1243020: testdb/testuser@[::1]:52408 pam_auth_begin(): pam_first_taken_slot=1, pam_first_free_slot=1
2019-09-13 06:51:47.180 UTC [5752] DEBUG pam_auth_worker(): processing slot 1
2019-09-13 06:51:47.180 UTC [5752] WARNING pam_authenticate() failed: Authentication failure
2019-09-13 06:51:47.181 UTC [5752] DEBUG pam_auth_worker(): authorization completed, status=3
2019-09-13 06:51:47.386 UTC [5752] LOG C-0x1243020: testdb/testuser@[::1]:52408 closing because: auth failed (age=0s)
2019-09-13 06:51:47.386 UTC [5752] WARNING C-0x1243020: testdb/testuser@[::1]:52408 pooler error: auth failed

Able to log in as testuser:
[ec2-user@ip-1.1.1.1 pam.d]$ su - testuser
Password:
Last login: Fri Sep 13 06:21:12 UTC 2019 on pts/1
[testuser@ip-1.1.1.1 ~]$ id
uid=1001(testuser) gid=1001(testuser) groups=1001(testuser) context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023

The user was created as follows:
[root@ip-1.1.1.1 ~]# adduser -p hello123 testuser
[root@ip-1.1.1.1 ~]# id testuser
uid=1001(testuser) gid=1001(testuser) groups=1001(testuser)

Here is the pgbouncer.ini config:
[ec2-user@ip-1.1.1.1 etc]$ less pgbouncer.ini | grep -v '^$' | grep -v '^;'
[databases]
testdb = host=dbhost port=3306 dbname=db1
[users]
[pgbouncer]
logfile = /var/log/pgbouncer/pgbouncer.log
pidfile = /var/run/pgbouncer/pgbouncer.pid
listen_addr = *
listen_port = 5432
auth_type = pam

Am I missing something? Any permissions?

--
Regards,
Ayub
Re: pgbouncer with ldap
Yes, I did set that; here is how the pgbouncer binary looks:
-rwsrwsr-x. 1 root root 2087504 Sep 13 00:45 pgbouncer

On Fri, Sep 13, 2019 at 6:50 AM Achilleas Mantzios <ach...@matrix.gatewaynet.com> wrote:
> On 13/9/19 10:19 π.μ., Ayub M wrote:
> > Stumbled in the first step - PAM authentication via pgbouncer. After
> > compiling pgbouncer with the pam plug-in, I am unable to login into the
> > db - throws PAM error message. Please help.
>
> ok, pgbouncer should be able to read /etc/pam* files.
> Did you miss the
> # chown root:staff ~pgbouncer/pgbouncer-1.9.0/pgbouncer
> # chmod +s ~pgbouncer/pgbouncer-1.9.0/pgbouncer
> part?
fetch time included in pg_stat_statements?
Does pg_stat_statements.total_time include the time it takes for all fetches of a cursor query? Or is it only the database time taken to execute the query?

--
Regards,
Ayub
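[Editor's note] For reference, on PostgreSQL 11 the view exposes a single cumulative total_time column per normalized statement; a query like the following shows it (in v13 and later the column is split into total_plan_time and total_exec_time):

```sql
-- requires pg_stat_statements in shared_preload_libraries, then:
-- CREATE EXTENSION pg_stat_statements;
SELECT query, calls, total_time, rows
FROM pg_stat_statements
ORDER BY total_time DESC
LIMIT 10;
```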
work_mem and shared_buffers question
1. shared_buffers - In a regular PostgreSQL installation, say I allocate 25% of my memory to shared_buffers; that leaves 75% for the rest, such as the OS, page cache, and work_mem allocations. Is my understanding correct? If so, since AWS Aurora for PostgreSQL uses 75% of memory for shared_buffers, would that leave just 25% for other things?

2. Does the memory specified for work_mem get fully allocated to all sessions, irrespective of whether they do any sorting or hashing operations?

--
Regards,
Ayub
Re: work_mem and shared_buffers question
Thanks. For q2 - if work_mem is a limit before spilling onto disk, is there a minimum amount of memory which gets allocated to each session when it starts?

On Mon, Feb 10, 2020 at 3:51 AM Naresh g wrote:
> 1. Yes, by default it is advisable to set 25% of RAM for shared buffers,
> because community-version Postgres depends heavily on the OS cache: when
> you fetch something from disk, it is first copied to the OS cache (75%)
> and then to shared buffers (25%). Two copies of the data will be in your
> system RAM. This is called buffered IO.
>
> Amazon Aurora eliminates this buffered IO, hence you are not required to
> stick to 25% of RAM.
>
> 2. work_mem is just a setting; if your session sorts something it uses
> that memory, otherwise the memory stays with the OS.

--
Regards,
Ayub
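[Editor's note] One way to see that work_mem is an on-demand ceiling rather than a per-session allocation is to watch the same sort spill or not depending on the setting; a sketch with illustrative sizes:

```sql
SET work_mem = '64kB';
EXPLAIN ANALYZE SELECT g FROM generate_series(1, 100000) g ORDER BY g DESC;
-- the plan typically reports: Sort Method: external merge  Disk: ...

SET work_mem = '16MB';
EXPLAIN ANALYZE SELECT g FROM generate_series(1, 100000) g ORDER BY g DESC;
-- the plan typically reports: Sort Method: quicksort  Memory: ...
```

A session that never sorts or hashes never allocates this memory, which is consistent with the reply above.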
aws sct/dms versus ora2pg
I would like to get suggestions and feedback on aws sct/dms vs ora2pg for an Oracle (on-prem) to PostgreSQL (aws rds) migration project. One big difference between them I see is the fact that dms supports incremental loads from oracle to postgres (dont think ora2pg supports that, but I could be wrong). Someone who researched or used these tools help list pros and cons of each approach and which is highly recommended.
postgres stats on the new primary
I have an RDS PostgreSQL v11.6 setup with a primary and a standby. On the primary, vacuum/analyze were running, and pg_stat_all_tables's last_vacuum/last_analyze/last_autovacuum/last_autoanalyze columns showed the times when vacuum/analyze ran.

1. A switchover to the standby happened. Now when I look at pg_stat_all_tables (on the active primary, which was the standby earlier), the last vacuum/analyze columns are all null. Does primary-standby replication not replicate system tables? Is this how it behaves?

2. On the new primary, do the tables have statistics on them? (I do see pg_stats entries for the tables on the new standby.) Are they reliable, or do I need to gather stats and run vacuum after a failover/switchover to the standby?
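[Editor's note] The two kinds of "stats" here live in different places and can be checked directly after a switchover: the pg_stat_* activity counters are per-instance, in-memory collector state and are not carried over replication, while planner statistics are stored in the pg_statistic catalog and replicate with the data. A sketch (table name hypothetical):

```sql
-- activity counters: per-instance; expect nulls/zeros on a freshly promoted standby
SELECT relname, last_vacuum, last_autovacuum, last_analyze, last_autoanalyze
FROM pg_stat_all_tables
WHERE relname = 'mytable';

-- planner statistics: stored in the database (pg_statistic, viewed via pg_stats),
-- so they survive promotion
SELECT attname, n_distinct, null_frac
FROM pg_stats
WHERE tablename = 'mytable';

-- if unsure, re-gathering planner stats is cheap relative to a full vacuum
ANALYZE mytable;
```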
hash joins are causing no space left error
This is PostgreSQL 11 on AWS. There is a mview query in this OLAP database; the tables involved are huge - 50-100m records on average, with hundreds of columns in most cases. The query runs for a while and then errors out saying "No space left on device". I can see it generating around 500gb of temp file data. At times it goes through and at times it fails - probably due to other queries running at the same time and causing the failure.

The tables are partitioned and indexed on the PKs and FKs. The query uses parallelism (4) with increased work_mem (4gb). The joins happen across around 10 tables, all joining on PK and FK columns. I see partition pruning happening, but the hash joins are killing the query.

Is there any way to avoid hash joins? If we create hash indexes on the joining columns, would PostgreSQL avoid the hashing operation and instead use the hash indexes on the tables to join them? That way, I feel, the resource-intensive hashing would be avoided and there would be no need for temp files. I tried, but it does not seem to work: when I query a table with specific values it uses the hash index, but when I join the tables it seems to do its own hash join.

My question is how to optimize massive table joins in PostgreSQL so as to avoid space failures and make the query run fast - it takes a couple of hours to complete now. Any best practices or suggestions?
Re: hash joins are causing no space left error
On Wed, Aug 12, 2020 at 8:06 PM Tom Lane wrote:
> Ayub M writes:
> > This is PostgreSQL 11 on AWS, there is a mview query in this OLAP
> > database, the tables involved are huge - 50-100m records on average
> > records hundreds of columns in most cases. The query runs for a while
> > and then errors out saying "No space left on device". I could see it
> > generating around 500gb of temp file data. At times it goes thru and at
> > times it fails - probably due to other queries running at the same time
> > and causing failure.
>
> Are you sure that these queries are actually producing the answers you
> want? It sounds suspiciously like you are computing underconstrained
> joins.

--> Yes, it is as per the business requirement.

> > The joins are happening on around 10 tables and all are joining on the
> > PK and FK columns. I see partition pruning happening but the hash joins
> > are killing the query. Is there any way to avoid hash joins?
>
> TBH, you are asking the wrong question. A merge join would take about as
> much temporary space, and a nestloop join over so much data would probably
> not finish in an amount of time you're willing to wait. Indexes are NOT
> a magic solution here. What you need to be thinking about is how to not
> need to process so much data.
>
> If you really need to have this proven to you, you can try "set
> enable_hashjoin = off", but I don't think you'll find that better.

--> You are right, neither merge join nor nested loop resolves the issue.

> regards, tom lane

--
Regards,
Ayub
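[Editor's note] For anyone reproducing Tom's experiment, the session-level knobs discussed in this thread can be set as below (values are illustrative, and the final query is a placeholder for the mview's defining query):

```sql
SET enable_hashjoin = off;      -- cost-disables (does not strictly forbid) hash joins
SET work_mem = '4GB';           -- applies per sort/hash node, per parallel worker
SET temp_file_limit = '300GB';  -- fail the query before it fills the temp volume
EXPLAIN (ANALYZE, BUFFERS)      -- "temp read/written" in the output shows spill volume
SELECT ...;                     -- placeholder: the mview's defining query
```

Note that work_mem multiplies quickly: a plan with several hash nodes and 4 parallel workers can legitimately use many times the configured value.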
postgres materialized view refresh in parallel
I created a materialized view in parallel by setting max_parallel_workers_per_gather to 4. I could see parallel sessions kicking off and creating the mview much faster than serial execution (10 mins in parallel vs over an hour serially).

Now I want the refresh of the mview to take around the same time as the create; I am okay with it taking a little longer than the create to execute the extra steps a refresh involves. But what I am noticing is that the refresh does not run in parallel at all, even with max_parallel_workers_per_gather set to 4. Does PostgreSQL 11.6 not support refreshing mviews in parallel the way it parallelizes creating them?
any issue with enable_partitionwise_join at instance level?
Hello, while running aggregate queries on a partitioned table there is an improvement when enable_partitionwise_join is turned on. By default this parameter is disabled, and the documentation (https://www.postgresql.org/docs/11/runtime-config-query.html) says:
```
Because partitionwise grouping or aggregation can use significantly more CPU time and memory during planning, the default is off.
```
Why would it take "significantly" more CPU and memory for the planner? Are there any foreseeable issues when enabling this at the instance level?
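[Editor's note] The planning cost comes from the planner considering a join (or aggregate) for every matching pair of partitions instead of one join between the parent tables, so planning effort grows with the partition count. A session-level trial before changing the instance default might look like this (table and column names are hypothetical):

```sql
SET enable_partitionwise_join = on;
SET enable_partitionwise_aggregate = on;  -- the related per-partition grouping setting, also off by default

EXPLAIN (ANALYZE, SUMMARY)
SELECT f.part_key, count(*)
FROM fact f
JOIN dim d USING (part_key)               -- assumes both tables are partitioned identically on part_key
GROUP BY f.part_key;
-- compare "Planning Time" and total time against a session with the settings off
```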
postgres materialized view refresh performance
There is a table t which is used in a mview mv; this is the only table in the mview definition.

create table t (c1 int, ..., c10 int);  -- there is a pk on, say, the c1 column
create materialized view mv as select c1, c2, ..., c10 from t;  -- there is a unique index on, say, c5 and a bunch of other indexes on the mview

The reason an mview is used instead of table t directly is that the table gets truncated and reloaded every couple of hours, and we don't want users to see an empty table at any point in time. The mview is refreshed with "refresh materialized view concurrently" and is used by APIs and end users.

A couple of questions I have:
1. Whenever a concurrent mview refresh happens, does pg create another set of table and indexes and switch it with the original? If not, does it update the existing data in place?
2. If the usage of the mview is pretty heavy, does that impact the performance of the refresh process? Vice versa, while the refresh is going on, does the performance seen by mview users take a hit?
3. The mview sometimes refreshes in a couple of minutes and sometimes takes hours. When it runs long, there are no locks and no resource shortage; the number of records in the base table is 6m (7.5gb), which is not huge, so why does the refresh take so long?
4. Does the mview need vacuum/analyze/reindex?
Re: postgres materialized view refresh performance
Thank you both.

As for the mview refresh taking long:

> • The mview gets refreshed in a couple of mins sometimes and sometimes it
> takes hours. When it runs for longer, there are no locks and no resource
> shortage, the number of recs in the base table is 6m (7.5gb) which is not
> huge so why does it take so long to refresh the mview?

Does the run time correlate with the number of changes being made?

-- Almost the same number of records are present in the base table (6 million records). The base table gets truncated and reloaded every time with almost the same number of records. And the mview is a simple select from this one base table.

The mview has around 10 indexes: 1 unique and 9 non-unique.

Population of the base table takes about 2 mins, using "insert into select from table", but when the mview is created for the first time it takes 16 minutes. Even when I remove all but the one unique index it takes about 7 minutes. Any clue as to why it takes so much longer than the create of the base table (which is 2 mins)?

On Fri, Oct 23, 2020 at 10:53 AM Philip Semanchuk <phi...@americanefficient.com> wrote:
> > On Oct 23, 2020, at 9:52 AM, Ravi Krishna wrote:
> >
> >> My understanding is that when CONCURRENTLY is specified, Postgres
> >> implements the refresh as a series of INSERT, UPDATE, and DELETE
> >> statements on the existing view. So the answer to your question is no,
> >> Postgres doesn't create another table and then swap it.
> >
> > The INSERTS/UPDATE/DELETE happens only for the difference. PG first
> > creates a new temp table and then compares it with the MV and detects
> > the difference. That is why for CONCURRENTLY, a unique index is
> > required on the MV.
>
> Yes, thank you, that's what I understand too but I expressed it very
> poorly.

--
Regards,
Ayub
Re: postgres materialized view refresh performance
It's a simple one-line sequential scan plan, just reading the base table sequentially.

On Mon, Oct 26, 2020, 9:21 AM Philip Semanchuk wrote:
> Do you know if it's executing a different plan when it takes a long time?
> auto_explain can help with that.
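[Editor's note] Putting the thread's explanation together, the difference between the two refresh modes can be sketched as follows (table and index names are hypothetical):

```sql
CREATE MATERIALIZED VIEW mv AS SELECT c1, c5 FROM t;
CREATE UNIQUE INDEX mv_c5_uq ON mv (c5);     -- a unique index is required for CONCURRENTLY

-- non-concurrent: recomputes the result and replaces the contents under an
-- exclusive lock; readers block for the duration
REFRESH MATERIALIZED VIEW mv;

-- concurrent: computes the result into a temp table, diffs it against mv, and
-- applies the difference as INSERT/UPDATE/DELETE; readers are not blocked, but
-- the diff-and-apply work is why it can be much slower than a plain refresh
REFRESH MATERIALIZED VIEW CONCURRENTLY mv;
```

The UPDATE/DELETE traffic from concurrent refreshes also means the mview accumulates dead tuples, so it does benefit from (auto)vacuum and analyze like an ordinary table.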
postgres auditing truncates and deletes on tables
For an RDS Aurora PostgreSQL v11.7 database I want to set up auditing of DDL plus truncates and deletes. I am setting log_statement to ddl to log all DDL (this excludes truncate statements). I also want to set up auditing only for deletes and truncates on tables; I don't want to set log_statement to all, as this would fill up my logs. Checking pgaudit, it supports the write class, which includes delete and truncate, but it also includes insert/update and copy, which I don't need. Is there any way to set up auditing for DDL plus deletes/truncates only, using pgaudit or native auditing?
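[Editor's note] For reference, the settings in question look like the following; the class granularity is as coarse as described above (whether RDS Aurora's parameter group exposes each of these is something to verify for your instance):

```
# postgresql.conf / RDS parameter group -- a sketch
shared_preload_libraries = 'pgaudit'
log_statement = 'ddl'          # native logging of DDL only
pgaudit.log = 'ddl, write'     # 'write' covers INSERT, UPDATE, DELETE, TRUNCATE, COPY;
                               # pgaudit offers no narrower class for just DELETE/TRUNCATE
```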
Partitioning an existing table - pg10.6
Hello, I am using Postgres 10.6 and have a huge table with ~30m records which needs to be partitioned. What's the best strategy for this with minimal downtime?

1. Rename the existing table and create a new partitioned table with the original name, then keep inserting data from the old renamed table into the new table. This will incur downtime for the apps.

2. Create the partitioned table with a new name and move the data from the current table (oldest to newest); when moving the current month's partition, shut down the app and move the last set of data. This avoids downtime, but if any data that was already copied over is then changed, those changes might be lost.

Any other approaches/alternatives?

--
Regards,
Ayub
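[Editor's note] A variant of option 1 that avoids copying the ~30m existing rows is to attach the renamed table as one big partition and create fresh partitions only for new data. A sketch assuming range partitioning by date (names, types, and bounds are illustrative; the ATTACH scans the old table once to validate the bound unless a matching CHECK constraint already proves it):

```sql
BEGIN;
ALTER TABLE events RENAME TO events_legacy;
CREATE TABLE events (id bigint, created_at date, payload text)
  PARTITION BY RANGE (created_at);

-- old data becomes a single partition covering everything before the cutover
ALTER TABLE events ATTACH PARTITION events_legacy
  FOR VALUES FROM ('1900-01-01') TO ('2019-03-01');

-- new data lands in conventional monthly partitions
CREATE TABLE events_2019_03 PARTITION OF events
  FOR VALUES FROM ('2019-03-01') TO ('2019-04-01');
COMMIT;
```

The legacy partition can later be split into smaller ones during a maintenance window if needed.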
postgres index usage count too high
There is a table in the db whose idx_scan count from pg_stat_all_tables seems too high; there are not that many queries being executed against this table. I am wondering how this count can be so high.

1. The db has been up for 80 days, so I assume these are cumulative stats since the last startup?

2. Could it be that a query joining this table with other tables probes this table multiple times in loops? Below is part of a plan (found via a web search) showing a parallel index-only scan that happened 5 times for the index. I am assuming something of this sort is happening, making the index scan count go so high. Please let me know if that might be the case.

   -> Parallel Index Only Scan using us_geonames_type_idx on us_geonames (cost=0.43..24401.17 rows=559758 width=4) (actual time=0.036..90.309 rows=447806 loops=5)

3. Is there any other possible explanation for this high count? I see that updates increase this count, but there are not that many updates. Inserts and deletes do not seem to touch this counter.

-[ RECORD 1 ]-------+------------------------------
relid               | 3029143981
schemaname          | myschema
relname             | mytable
seq_scan            | 196
seq_tup_read        | 2755962642
idx_scan            | 4362625959
idx_tup_fetch       | 3579773932
n_tup_ins           | 93821564
n_tup_upd           | 645310
n_tup_del           | 0
n_tup_hot_upd       | 21288
n_live_tup          | 31153237
n_dead_tup          | 0
n_mod_since_analyze | 0
last_vacuum         | 2021-04-24 05:06:56.481349+00
last_autovacuum     | 2021-03-04 00:27:26.705849+00
last_analyze        | 2021-04-24 05:07:37.589756+00
last_autoanalyze    | 2021-03-04 08:55:32.673118+00
vacuum_count        | 69
autovacuum_count    | 1
analyze_count       | 69
autoanalyze_count   | 55

db=> select * from pg_stat_all_indexes where relname = 'mytable' and indexrelname = 'mytable_pkey' order by idx_tup_fetch desc nulls last;
-[ RECORD 1 ]-+-----------
relid         | 3029143926
indexrelid    | 3029143974
schemaname    | myschema
relname       | mytable
indexrelname  | mytable_pkey
idx_scan      | 3806451145
idx_tup_read  | 97277555
idx_tup_fetch | 61522

Thanks.
postgres vacuum memory limits
Hello, when maintenance_work_mem and autovacuum_work_mem are set, my understanding is that vacuum and autovacuum sessions should be limited to the memory limits set by these parameters. But I am seeing autovacuum sessions use more memory than these limits; any reason why this would happen?

Please see the examples below, where maintenance_work_mem is set to 20MB and shared_buffers is 128MB. When I look at the memory for this session in top, it shows 162MB. But when default_statistics_target is increased to 3000, the session usage is 463MB, which is way more than the 20MB maintenance_work_mem plus the 128MB shared_buffers. Shouldn't the process memory be capped at 20+128MB?

postgres=# show maintenance_work_mem;
 maintenance_work_mem
----------------------
 20MB
(1 row)

postgres=# vacuum analyze mdm_context;
VACUUM

postgres=# show shared_buffers;
 shared_buffers
----------------
 128MB
(1 row)

  PID USER     PR NI   VIRT    RES    SHR S %CPU %MEM   TIME+   COMMAND
62246 postgres 20  0 422892 165996 139068 R 57.1 15.7 25:06.34 postgres: postgres postgres [local] VACUUM

postgres=# show default_statistics_target;
 default_statistics_target
---------------------------
 100
(1 row)

postgres=# set default_statistics_target=3000;
SET
postgres=# vacuum analyze mdm_context;
VACUUM

  PID USER     PR NI   VIRT    RES   SHR S %CPU %MEM   TIME+   COMMAND
62246 postgres 20  0 876132 474384 2976 R 62.9 47.6 25:11.41 postgres: postgres postgres [local] VACUUM
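[Editor's note] A possible explanation worth checking against the docs: maintenance_work_mem caps only VACUUM's dead-tuple TID array, while the ANALYZE half of "vacuum analyze" holds its row sample in backend-local memory, and that sample is roughly 300 x default_statistics_target rows per table, independent of maintenance_work_mem. Raising the target from 100 to 3000 grows the sample about 30x, which matches the jump seen in top (whose RES figure also counts the shared_buffers pages the backend has touched). A sketch isolating the variable:

```sql
-- sample size scales with the statistics target, not with maintenance_work_mem
SET default_statistics_target = 100;   -- sample of ~30,000 rows per table
ANALYZE mdm_context;

SET default_statistics_target = 3000;  -- sample of ~900,000 rows, held in process-local memory
ANALYZE mdm_context;
```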