Cross-site cookies warnings

2020-07-15 Thread Fabio
I'm using Python 3.7.4, Django 3.0.6, JavaScript, Postgres 12.3.1,
pgAdmin 4.21 on Windows 7. When my page loads, the console shows these
warnings:


Cookie "PGADMIN_KEY" will be soon treated as cross-site cookie against "http://127.0.0.1:8000/lists/list-name/" because the scheme does not match. list-name
Cookie "PGADMIN_LANGUAGE" will be soon treated as cross-site cookie against "http://127.0.0.1:8000/lists/list-name/" because the scheme does not match. list-name
Cookie "PGADMIN_KEY" will be soon treated as cross-site cookie against "http://127.0.0.1:8000/lists/list-name/" because the scheme does not match. list-name
Cookie "PGADMIN_LANGUAGE" will be soon treated as cross-site cookie against "http://127.0.0.1:8000/lists/list-name/" because the scheme does not match. list-name
Cookie "PGADMIN_KEY" will be soon treated as cross-site cookie against "http://127.0.0.1:8000/static/lists/js/lists.js" because the scheme does not match. lists.js
Cookie "PGADMIN_LANGUAGE" will be soon treated as cross-site cookie against "http://127.0.0.1:8000/static/lists/js/lists.js" because the scheme does not match. lists.js
Cookie "PGADMIN_KEY" will be soon treated as cross-site cookie against "http://127.0.0.1:8000/jsi18n/" because the scheme does not match. jsi18n
Cookie "PGADMIN_LANGUAGE" will be soon treated as cross-site cookie against "http://127.0.0.1:8000/jsi18n/" because the scheme does not match. jsi18n
Cookie "PGADMIN_KEY" will be soon treated as cross-site cookie against "http://127.0.0.1:8000/static/js/common.js" because the scheme does not match. common.js
Cookie "PGADMIN_LANGUAGE" will be soon treated as cross-site cookie against "http://127.0.0.1:8000/static/js/common.js" because the scheme does not match. common.js
Cookie "PGADMIN_KEY" will be soon treated as cross-site cookie against "http://127.0.0.1:8000/static/icons/favicon.png" because the scheme does not match. favicon.png
Cookie "PGADMIN_LANGUAGE" will be soon treated as cross-site cookie against "http://127.0.0.1:8000/static/icons/favicon.png" because the scheme does not match. favicon.png

They seem to come from the new policy about cookies. The problem is that
PGADMIN_KEY and PGADMIN_LANGUAGE are pgAdmin cookies and I don't use them
explicitly (I mean, evidently the database tool itself uses them, but they
don't appear anywhere in my code). Is this a bug in pgAdmin?


Then I upgraded to pgadmin 4.23 and now I have only these warnings:


Cookie “PGADMIN_KEY” will be soon treated as cross-site cookie against 
“http://127.0.0.1:8000/lists/list-name/” because the scheme does not 
match. 2 list-name 


Cookie “PGADMIN_KEY” will be soon treated as cross-site cookie against 
“http://127.0.0.1:8000/jsi18n/” because the scheme does not match. 
jsi18n 


Cookie “PGADMIN_KEY” will be soon treated as cross-site cookie against 
“http://127.0.0.1:8000/static/js/common.js” because the scheme does not 
match. common.js 


Cookie “PGADMIN_KEY” will be soon treated as cross-site cookie against 
“http://127.0.0.1:8000/static/lists/js/lists.js” because the scheme does 
not match. lists.js 


Cookie “PGADMIN_KEY” will be soon treated as cross-site cookie against 
“http://127.0.0.1:8000/static/icons/favicon.png” because the scheme does 
not match. favicon.png 


Also, I see a new cookie (PGADMIN_INT_KEY), but it doesn't cause problems.
Is there some way to solve this, or do I just have to wait for the next update?

thank you




Re: How to fetch data from tables in PostgreSQL

2018-04-18 Thread Fabio Pardi
Hi Vaibhav,

I think this might come of help:

https://www.postgresql.org/docs/current/static/tutorial-select.html
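If you are just getting started, here is a minimal example from the shell
(the database, table and column names are made up for illustration):

psql -d mydb -c "SELECT name, city FROM customers WHERE city = 'Rome' LIMIT 10;"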




On 04/18/2018 09:22 AM, vaibhav zaveri wrote:
> Hi, 
> 
> How to fetch data from tables in PostgreSQL.
> 
> Regards, 
> Vaibhav Zaveri



Re: pg_upgrade help

2018-04-18 Thread Fabio Pardi
Hi,

please avoid crossposting to multiple mailing lists.


You need to run both versions of the database, the old and the new.

They need to run on different ports (note that it is impossible to run two
different processes on the same port; that is not specific to PostgreSQL).
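For reference, pg_upgrade itself lets you choose the ports it will use when it
starts each cluster. A sketch, reusing the paths from your mail (the port
numbers are just an example):

/opt/rh/rh-postgresql94/root/usr/bin/pg_upgrade \
  --old-bindir=/usr/bin \
  --new-bindir=/opt/rh/rh-postgresql94/root/usr/bin \
  --old-datadir=$OLDCLUSTER --new-datadir=$NEWCLUSTER \
  --old-port=50432 --new-port=50433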



On 04/18/2018 09:30 AM, Akshay Ballarpure wrote:
> Hi all,
> I need help on pg_upgrade from 8.4 to 9.4 version. Appreciate urgent
> response.
> Installed both version and stopped it. Do i need to run both version or
> only one 8.4 or 9.4 . Both should run on 50432 ?
> 
> 
> -bash-4.2$ id
> uid=26(postgres) gid=26(postgres) groups=26(postgres)
> context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
> 
> -bash-4.2$ export OLDCLUSTER=/var/ericsson/esm-data/postgresql-data    
>                        -- 8.4 data
> -bash-4.2$ export NEWCLUSTER=/var/ericsson/esm-data/postgresql-data-9.4
>                   -- 9.4 data
> 
> 
> -bash-4.2$ /opt/rh/rh-postgresql94/root/usr/bin/pg_upgrade
> --old-bindir=/usr/bin --new-bindir=/opt/rh/rh-postgresql94/root/usr/bin
> --old-datadir=$OLDCLUSTER --new-datadir=$NEWCLUSTER
> 
> *connection to database failed: could not connect to server: No such
> file or directory*
>         Is the server running locally and accepting
>         connections on Unix domain socket
> "/var/run/postgresql/.s.PGSQL.50432"?
> 
> 
> could not connect to old postmaster started with the command:
> "/usr/bin/pg_ctl" -w -l "pg_upgrade_server.log" -D
> "/var/ericsson/esm-data/postgresql-data" -o "-p 50432 -c autovacuum=off
> -c autovacuum_freeze_max_age=20  -c listen_addresses='' -c
> unix_socket_permissions=0700" start
> Failure, exiting
> 
> 
> 
> 
> With Best Regards
> Akshay
> 



Re: pg_upgrade help

2018-04-18 Thread Fabio Pardi
Hi,

I was too quick with my reply (perhaps I should drink my morning coffee
before replying), so I will try to be more detailed:

both servers must be able to run at the moment you launch pg_upgrade; that
means the two servers should have been correctly stopped in advance, should
have their configuration files in place, and the new cluster should have been
initialized too.

Then, as Sergei highlights here below, pg_upgrade will take care of the
upgrade process, starting the servers.


Here is a step-by-step guide, which I considered my best ally when it was
time to upgrade:

https://www.postgresql.org/docs/9.4/static/pgupgrade.html

note point 7:

'stop both servers'


About the port the servers will run on, at point 9 there is some
clarification:

' pg_upgrade defaults to running servers on port 50432 to avoid
unintended client connections. You can use the same port number for both
clusters when doing an upgrade because the old and new clusters will not
be running at the same time. However, when checking an old running
server, the old and new port numbers must be different.'

Hope it helps,

Fabio Pardi


On 04/18/2018 10:34 AM, Akshay Ballarpure wrote:
> Thanks Fabio for instant reply.
> 
> I now started 8.4 with 50432 and 9.4 with default port but still its
> failing ...Can you please suggest what is wrong ?
> 
> -bash-4.2$ /opt/rh/rh-postgresql94/root/usr/bin/pg_upgrade
> --old-bindir=/usr/bin --new-bindir=/opt/rh/rh-postgresql94/root/usr/bin
> --old-datadir=$OLDCLUSTER --new-datadir=$NEWCLUSTER
> 
> *failure*
> Consult the last few lines of "pg_upgrade_server.log" for
> the probable cause of the failure.
> 
> There seems to be a postmaster servicing the old cluster.
> Please shutdown that postmaster and try again.
> Failure, exiting
> -bash-4.2$ ps -eaf | grep postgres
> root      8646  9365  0 08:07 pts/1    00:00:00 su - postgres
> postgres  8647  8646  0 08:07 pts/1    00:00:00 -bash
> postgres  9778     1  0 09:17 ?        00:00:00 /usr/bin/postgres -p
> 50432 -D /var/ericsson/esm-data/postgresql-data/
> postgres  9779  9778  0 09:17 ?        00:00:00 postgres: logger process
> postgres  9781  9778  0 09:17 ?        00:00:00 postgres: writer process
> postgres  9782  9778  0 09:17 ?        00:00:00 postgres: wal writer
> process
> postgres  9783  9778  0 09:17 ?        00:00:00 postgres: autovacuum
> launcher process
> postgres  9784  9778  0 09:17 ?        00:00:00 postgres: stats
> collector process
> postgres  9900     1  0 09:20 ?        00:00:00
> /opt/rh/rh-postgresql94/root/usr/bin/postgres -D
> /var/ericsson/esm-data/postgresql-data-9.4/
> postgres  9901  9900  0 09:20 ?        00:00:00 postgres: logger process
> postgres  9903  9900  0 09:20 ?        00:00:00 postgres: checkpointer
> process
> postgres  9904  9900  0 09:20 ?        00:00:00 postgres: writer process
> postgres  9905  9900  0 09:20 ?        00:00:00 postgres: wal writer
> process
> postgres  9906  9900  0 09:20 ?        00:00:00 postgres: autovacuum
> launcher process
> postgres  9907  9900  0 09:20 ?        00:00:00 postgres: stats
> collector process
> postgres  9926  8647  0 09:21 pts/1    00:00:00 ps -eaf
> postgres  9927  8647  0 09:21 pts/1    00:00:00 grep --color=auto postgres
> 
> 
> -bash-4.2$ netstat -antp | grep 50432
> (Not all processes could be identified, non-owned process info
>  will not be shown, you would have to be root to see it all.)
> tcp        0      0 127.0.0.1:50432         0.0.0.0:*              
> LISTEN      9778/postgres
> tcp6       0      0 ::1:50432               :::*                  
>  LISTEN      9778/postgres
> -bash-4.2$ netstat -antp | grep 5432
> (Not all processes could be identified, non-owned process info
>  will not be shown, you would have to be root to see it all.)
> tcp        0      0 127.0.0.1:5432          0.0.0.0:*              
> LISTEN      9900/postgres
> tcp6       0      0 ::1:5432                :::*                  
>  LISTEN      9900/postgres
> 
> -
>   pg_upgrade run on Wed Apr 18 09:24:47 2018
> -
> 
> command: "/usr/bin/pg_ctl" -w -l "pg_upgrade_server.log" -D
> "/var/ericsson/esm-data/postgresql-data" -o "-p 50432 -c autovacuum=off
> -c autovacuum_freeze_max_age=20  -c listen_addresses='' -c
> unix_socket_permissions=0700" start >> "pg_upgrade_server.log" 2>&1
> pg_ctl: another server might be running; trying to start server anyway
> FATAL:  lock file "postmaster.pid" already exists
> HINT:  Is another postmaster (PID 9778) running in data directory
> "/var/ericsson/esm-data/postgresql-data"?
> pg_ctl: could not sta

Re: pg_upgrade help

2018-04-18 Thread Fabio Pardi
did you run initdb on the new db?

what happens if you manually start the new db?

/opt/rh/rh-postgresql94/root/usr/bin/pg_ctl  start -o "-p 50432 -c
listen_addresses='' -c unix_socket_permissions=0700"  -D $NEWCLUSTER

after starting it, can you connect to it using psql?

psql -p 50432 -h /var/run/postgresql  -U your_user _db_



regards,

fabio pardi


On 04/18/2018 02:02 PM, Akshay Ballarpure wrote:
> Hi Fabio,
> sorry to bother you again, its still failing with stopping both server
> (8.4 and 9.4)
> 
> -bash-4.2$ /opt/rh/rh-postgresql94/root/usr/bin/pg_upgrade
> --old-bindir=/usr/bin --new-bindir=/opt/rh/rh-postgresql94/root/usr/bin
> --old-datadir=$OLDCLUSTER --new-datadir=$NEWCLUSTER
> 
> connection to database failed: could not connect to server: No such file
> or directory
>         Is the server running locally and accepting
>         connections on Unix domain socket
> "/var/run/postgresql/.s.PGSQL.50432"?
> 
> 
> could not connect to old postmaster started with the command:
> "/usr/bin/pg_ctl" -w -l "pg_upgrade_server.log" -D
> "/var/ericsson/esm-data/postgresql-data" -o "-p 50432 -c autovacuum=off
> -c autovacuum_freeze_max_age=20  -c listen_addresses='' -c
> unix_socket_permissions=0700" start
> Failure, exiting
> 
> 
> With Best Regards
> Akshay
> Ericsson OSS MON
> Tata Consultancy Services
> Mailto: akshay.ballarp...@tcs.com
> Website: http://www.tcs.com <http://www.tcs.com/>
> 
> Experience certainty.        IT Services
>                        Business Solutions
>                        Consulting
> 
> 
> 
> 
> 
> From:        Fabio Pardi 
> To:        Akshay Ballarpure ,
> pgsql-general@lists.postgresql.org
> Date:        04/18/2018 02:35 PM
> Subject:        Re: pg_upgrade help
> 
> 
> 
> 
> Hi,
> 
> i was too fast in reply (and perhaps i should drink my morning coffee
> before replying), I will try to be more detailed:
> 
> both servers should be able to run at the moment you run pg_upgrade,
> that means the 2 servers should have been correctly stopped in advance,
> should have their configuration files, and new cluster initialized too.
> 
> Then, as Sergei highlights here below, pg_upgrade will take care of the
> upgrade process, starting the servers.
> 
> 
> Here there is a step by step guide, i considered my best ally when it
> was time to upgrade:
> 
> https://www.postgresql.org/docs/9.4/static/pgupgrade.html
> 
> note point 7:
> 
> 'stop both servers'
> 
> 
> About the port the servers will run on, at point 9 there is some
> clarification:
> 
> ' pg_upgrade defaults to running servers on port 50432 to avoid
> unintended client connections. You can use the same port number for both
> clusters when doing an upgrade because the old and new clusters will not
> be running at the same time. However, when checking an old running
> server, the old and new port numbers must be different.'
> 
> Hope it helps,
> 
> Fabio Pardi
> 
> 
> On 04/18/2018 10:34 AM, Akshay Ballarpure wrote:
>> Thanks Fabio for instant reply.
>>
>> I now started 8.4 with 50432 and 9.4 with default port but still its
>> failing ...Can you please suggest what is wrong ?
>>
>> -bash-4.2$ /opt/rh/rh-postgresql94/root/usr/bin/pg_upgrade
>> --old-bindir=/usr/bin --new-bindir=/opt/rh/rh-postgresql94/root/usr/bin
>> --old-datadir=$OLDCLUSTER --new-datadir=$NEWCLUSTER
>>
>> *failure*
>> Consult the last few lines of "pg_upgrade_server.log" for
>> the probable cause of the failure.
>>
>> There seems to be a postmaster servicing the old cluster.
>> Please shutdown that postmaster and try again.
>> Failure, exiting
>> -bash-4.2$ ps -eaf | grep postgres
>> root      8646  9365  0 08:07 pts/1    00:00:00 su - postgres
>> postgres  8647  8646  0 08:07 pts/1    00:00:00 -bash
>> postgres  9778     1  0 09:17 ?        00:00:00 /usr/bin/postgres -p
>> 50432 -D /var/ericsson/esm-data/postgresql-data/
>> postgres  9779  9778  0 09:17 ?        00:00:00 postgres: logger process
>> postgres  9781  9778  0 09:17 ?        00:00:00 postgres: writer process
>> postgres  9782  9778  0 09:17 ?        00:00:00 postgres: wal writer
>> process
>> postgres  9783  9778  0 09:17 ?        00:00:00 postgres: autovacuum
>> launcher process
>> postgres  9784  9778  0 09:17 ?        00:00:00 postgres: stats
>> collector process
>> postgres  9900     

Re: pg_upgrade help

2018-04-19 Thread Fabio Pardi
Hi,

While trying to reproduce your problem, I noticed that on my CentOS 6
installations Postgres 8.4 and Postgres 9.6 (I do not have 9.4 readily
available) store the socket in different places:

Postgres 9.6.6 uses /var/run/postgresql/

Postgres 8.4 uses /tmp/

Therefore, using default settings, I can connect to 9.6 but not to 8.4
without specifying where the socket is.

Connect to 9.6

12:01 postgres@machine:~# psql
psql (8.4.20, server 9.6.6)
WARNING: psql version 8.4, server version 9.6.
 Some psql features might not work.
Type "help" for help.

-

Connect to 8.4

12:01 postgres@machine:~# psql
psql: could not connect to server: No such file or directory
Is the server running locally and accepting
connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?

12:04 postgres@machine:~# psql -h /tmp
psql (8.4.20)
Type "help" for help.




I think you might be running into the same problem.

Can you confirm it?
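A quick way to check which socket directory each of your servers is actually
using (just a sketch; adjust to your setup):

# list the sockets in the two usual locations
ls -l /tmp/.s.PGSQL.* /var/run/postgresql/.s.PGSQL.* 2>/dev/null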


regards,

fabio pardi 





On 04/19/2018 09:37 AM, Akshay Ballarpure wrote:
> Hi Fabio,
> Yes i ran initdb on new database and able to start as below.
> 
> [root@ms-esmon root]# su - postgres -c "/usr/bin/postgres -p 50432 -D 
> /var/ericsson/esm-data/postgresql-data/ 2>&1 &"
> [root@ms-esmon root]# su - postgres -c 
> "/opt/rh/rh-postgresql94/root/usr/bin/postgres -D 
> /var/ericsson/esm-data/postgresql-data-9.4/ 2>&1 &"
> [root@ms-esmon root]# 2018-04-19 08:17:53.553 IST  LOG:  redirecting log 
> output to logging collector process
> 2018-04-19 08:17:53.553 IST  HINT:  Future log output will appear in 
> directory "pg_log".
> 
> [root@ms-esmon root]#
> [root@ms-esmon root]# ps -eaf | grep postgre
> sroot      8646  9365  0 Apr18 pts/1    00:00:00 su - postgres
> postgres  8647  8646  0 Apr18 pts/1    00:00:00 -bash
> postgres 28009     1  2 08:17 ?        00:00:00 /usr/bin/postgres -p 50432 -D 
> /var/ericsson/esm-data/postgresql-data/  *--8.4*
> postgres 28010 28009  0 08:17 ?        00:00:00 postgres: logger process
> postgres 28012 28009  0 08:17 ?        00:00:00 postgres: writer process
> postgres 28013 28009  0 08:17 ?        00:00:00 postgres: wal writer process
> postgres 28014 28009  0 08:17 ?        00:00:00 postgres: autovacuum launcher 
> process
> postgres 28015 28009  0 08:17 ?        00:00:00 postgres: stats collector 
> process
> postgres 28048     1  0 08:17 ?        00:00:00 
> /opt/rh/rh-postgresql94/root/usr/bin/postgres -D 
> /var/ericsson/esm-data/postgresql-data-9.4/
> postgres 28049 28048  0 08:17 ?        00:00:00 postgres: logger process
> postgres 28051 28048  0 08:17 ?        00:00:00 postgres: checkpointer process
> postgres 28052 28048  0 08:17 ?        00:00:00 postgres: writer process
> postgres 28053 28048  0 08:17 ?        00:00:00 postgres: wal writer process
> postgres 28054 28048  0 08:17 ?        00:00:00 postgres: autovacuum launcher 
> process
> postgres 28055 28048  0 08:17 ?        00:00:00 postgres: stats collector 
> process
> root     28057  2884  0 08:17 pts/0    00:00:00 grep --color=auto postgre
> 
> 
> Also i am able to start db with the command provided by you and run psql.
> 
> /opt/rh/rh-postgresql94/root/usr/bin/pg_ctl  start -o "-p 50432 -c 
> listen_addresses='' -c unix_socket_permissions=0700"  -D 
> /var/ericsson/esm-data/postgresql-data-9.4/
> pg_ctl: another server might be running; trying to start server anyway
> server starting
> -bash-4.2$ 2018-04-19 08:22:46.527 IST  LOG:  redirecting log output to 
> logging collector process
> 2018-04-19 08:22:46.527 IST  HINT:  Future log output will appear in 
> directory "pg_log".
> 
> -bash-4.2$ ps -eaf | grep postg
> root      8646  9365  0 Apr18 pts/1    00:00:00 su - postgres
> postgres  8647  8646  0 Apr18 pts/1    00:00:00 -bash
> postgres 28174     1  0 08:22 pts/1    00:00:00 
> /opt/rh/rh-postgresql94/root/usr/bin/postgres -D 
> /var/ericsson/esm-data/postgresql-data-9.4 -p 50432 -c listen_addresses= -c 
> unix_socket_permissions=0700
> postgres 28175 28174  0 08:22 ?        00:00:00 postgres: logger process
> postgres 28177 28174  0 08:22 ?        00:00:00 postgres: checkpointer process
> postgres 28178 28174  0 08:22 ?        00:00:00 postgres: writer process
> postgres 28179 28174  0 08:22 ?        00:00:00 postgres: wal writer process
> postgres 28180 28174  0 08:22 ?        00:00:00 postgres: autovacuum launcher 
> process
> postgres 28181 28174  0 08:22 ?        00:00:00 postgres: stats collector 
> process
> postgres 28182  8647  0 08:22 pts/1    00:00:00 ps -eaf
> postgres 28183  8647  0 08:22 pts/1    00:00:00 grep --color=auto postg
> 
> -bash-4.2$ psql -p 50432 -h /var/run/postgresql -U rhqadmin -d rhq
> psql (8.

Re: pg_upgrade help

2018-04-20 Thread Fabio Pardi
Hi Akshay,

I'm glad it worked.

* Your new data folder will be on /var/ericsson/esm-data/postgresql-data-9.4/ 
therefore you should set PGDATA accordingly

* analyze_new_cluster.sh runs on the new cluster, 9.4. Indeed you should start 
the db first, as mentioned in the upgrade message.


If you are happy with your upgrade, you can cleanup the leftovers running:

 delete_old_cluster.sh
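A sketch of what that could look like on your machine (the paths are taken
from your earlier mails, the commands are an example):

# point the environment at the new 9.4 cluster and start it
export PGDATA=/var/ericsson/esm-data/postgresql-data-9.4
export PATH=/opt/rh/rh-postgresql94/root/usr/bin:$PATH
pg_ctl start

# collect optimizer statistics, then, only when you are happy, clean up
./analyze_new_cluster.sh
./delete_old_cluster.sh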



regards,

fabio pardi

On 04/20/2018 11:24 AM, Akshay Ballarpure wrote:
> Hi Fabio,
> *Thanks so much for figuring out an issue..!!! much appreciated.*
> i have stopped both postgres version (8.4 and 9.4)
> 
> -bash-4.2$ export PGDATA=/var/ericsson/esm-data/postgresql-data   - 
> postgresql 8.4
> -bash-4.2$ pg_ctl stop -mfast
> waiting for server to shut down done
> server stopped
> 
> 
> -bash-4.2$ export PGDATA=/var/ericsson/esm-data/postgresql-data-9.4/   - 
> postgresql 9.4
> -bash-4.2$ ps -eaf | grep postgre^C
> -bash-4.2$ /opt/rh/rh-postgresql94/root/usr/bin/pg_ctl stop -mfast
> waiting for server to shut down done
> server stopped
> 
> 
> And set below environment variables on terminal where i ran pg_upgrade. 
> and*its working fine. thanks so much for figuring out an issue..!!! much 
> appreciated.*
> 
> -bash-4.2$ echo $PGDATA
> /var/ericsson/esm-data/postgresql-data  - postgresql 8.4
> -bash-4.2$ echo $PGHOST
> /var/run/postgresql
> 
> 
> -bash-4.2$ env | grep PG
> PGHOST=/var/run/postgresql
> PGDATA=/var/ericsson/esm-data/postgresql-data
> 
> 
> /opt/rh/rh-postgresql94/root/usr/bin/pg_upgrade --old-bindir=/usr/bin 
> --new-bindir=/opt/rh/rh-postgresql94/root/usr/bin 
> --old-datadir=/var/ericsson/esm-data/postgresql-data 
> --new-datadir=/var/ericsson/esm-data/postgresql-data-9.4
> 
> 
> 
> Performing Consistency Checks
> -
> Checking cluster versions                                   ok
> Checking database user is a superuser                       ok
> Checking database connection settings                       ok
> Checking for prepared transactions                          ok
> Checking for reg* system OID user data types                ok
> Checking for contrib/isn with bigint-passing mismatch       ok
> Checking for invalid "line" user columns                    ok
> Checking for large objects                                  ok
> Creating dump of global objects                             ok
> Creating dump of database schemas
>                                                             ok
> Checking for presence of required libraries                 ok
> Checking database user is a superuser                       ok
> Checking for prepared transactions                          ok
> 
> If pg_upgrade fails after this point, you must re-initdb the
> new cluster before continuing.
> 
> Performing Upgrade
> --
> Analyzing all rows in the new cluster                       ok
> Freezing all rows on the new cluster                        ok
> Deleting files from new pg_clog                             ok
> Copying old pg_clog to new server                           ok
> Setting next transaction ID and epoch for new cluster       ok
> Deleting files from new pg_multixact/offsets                ok
> Setting oldest multixact ID on new cluster                  ok
> Resetting WAL archives                                      ok
> Setting frozenxid and minmxid counters in new cluster       ok
> Restoring global objects in the new cluster                 ok
> Adding support functions to new cluster                     ok
> Restoring database schemas in the new cluster
>                                                             ok
> Setting minmxid counter in new cluster                      ok
> Removing support functions from new cluster                 ok
> Copying user relation files
>                                                             ok
> Setting next OID for new cluster                            ok
> Sync data directory to disk                                 ok
> Creating script to analyze new cluster                      ok
> Creating script to delete old cluster                       ok
> Checking for large objects                                  ok
> 
> Upgrade Complete
> 
> Optimizer statistics are not transferred by pg_upgrade so,
> once you start the new server, consider running:
>     analyze_new_cluster.sh
> 
> Running this script will delete the old cluster's data files:
>     delete_old_cluster.sh
> 
> 
> 
> Now few more questions..
> 
> I migrated  export PGDATA=/var/ericsson/esm-data/postgresql-data - postgresql 
> 8.4
> I can start 9.4 with above PGDATA right ?
> analyze_new

Re: Wanted: postgresql-9.6-dbg_9.6.5-1.pgdg80+1_amd64.deb

2018-05-03 Thread Fabio Pardi
Hi Craig,


are those good?


http://mirror1.net.one.com/postgres/pool/main/p/postgresql-9.6/postgresql-9.6-dbg_9.6.5-1.pgdg80+1_amd64.deb

http://atalia.postgresql.org/morgue/p/postgresql-9.6/postgresql-9.6-dbg_9.6.5-1.pgdg80%2b1_amd64.deb


regards,

fabio pardi



On 03/05/18 10:23, Craig Ringer wrote:
> Hi all
> 
> I'm trying to debug a PostgreSQL install with a very hard to reproduce
> bug. The user did not install debug info, and apt.postgresql.org has
> purged the packages. 2ndQuadrant doesn't yet have a mirror of all
> historical packages up and running (though we're working on it).
> 
> So I need postgresql-9.6-dbg_9.6.5-1.pgdg80+1_amd64.deb .
> 
> If anyone has it in a mirror repo, their /var/cache/apt/archives/, or
> even has it installed and can supply the files it contains, I'd be
> extremely grateful.
> 



Re: Wanted: postgresql-9.6-dbg_9.6.5-1.pgdg80+1_amd64.deb

2018-05-03 Thread Fabio Pardi
oh! I was late!

well..you have one extra link now! :)

On 03/05/18 11:00, Fabio Pardi wrote:
> Hi Craig,
> 
> 
> are those good?
> 
> 
> http://mirror1.net.one.com/postgres/pool/main/p/postgresql-9.6/postgresql-9.6-dbg_9.6.5-1.pgdg80+1_amd64.deb
> 
> http://atalia.postgresql.org/morgue/p/postgresql-9.6/postgresql-9.6-dbg_9.6.5-1.pgdg80%2b1_amd64.deb
> 
> 
> regards,
> 
> fabio pardi
> 
> 
> 
> On 03/05/18 10:23, Craig Ringer wrote:
>> Hi all
>>
>> I'm trying to debug a PostgreSQL install with a very hard to reproduce
>> bug. The user did not install debug info, and apt.postgresql.org has
>> purged the packages. 2ndQuadrant doesn't yet have a mirror of all
>> historical packages up and running (though we're working on it).
>>
>> So I need postgresql-9.6-dbg_9.6.5-1.pgdg80+1_amd64.deb .
>>
>> If anyone has it in a mirror repo, their /var/cache/apt/archives/, or
>> even has it installed and can supply the files it contains, I'd be
>> extremely grateful.
>>
> 



Re: Long running DDL statements blocking all queries

2018-05-31 Thread Fabio Pardi
Hi Ashu,

when you say 'almost every query in our application starts getting blocked'...


'blocked' as in 'they are stuck and can be seen in pg_stat_activity'? (in this 
case, please post the full content of pg_stat_activity)

or 

'blocked' as in 'they are waiting in the pgbouncer pool'?


regards,

fabio pardi


On 31/05/18 12:38, Ashu Pachauri wrote:
> We have been using Postgres 9.5.12 behind PGBouncer and facing some weird 
> issues. Whenever we running long running DDL statements (e.g. 'add index 
> concurently' or 'Alter table alter column type'), after some time, we start 
> seeing that almost every query in our application starts getting blocked.
> I understand that the operations I mentioned cab be unsafe, but the queries 
> being blocked are on completely unrelated tables.  I used the instructions 
> given on Postgres wiki (https://wiki.postgresql.org/wiki/Lock_Monitoring) to 
> correlate the blocking and blocked statements and there seems to be 
> absolutely no correlation.
> 
> Thanks,
> Ashu



Re: Long running DDL statements blocking all queries

2018-05-31 Thread Fabio Pardi
Ashu, 

please, provide full output of:


\x
select * from pg_stat_activity ;



What you posted on github is only providing a list of blocked backends.

If I read it correctly, then PID 18317 is the root cause of all the locks, but 
it does not show up in the list, not being blocked by anything...
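To see the blocking side as well, something along these lines helps (a
simplified sketch of the wiki's lock-monitoring query; it only matches
relation-level locks, so treat the output as a hint):

psql <<'SQL'
\x
SELECT blocked.pid  AS blocked_pid,
       blocking.pid AS blocking_pid,
       a.query      AS blocking_query
FROM pg_locks blocked
JOIN pg_locks blocking
  ON  blocking.locktype = blocked.locktype
  AND blocking.database IS NOT DISTINCT FROM blocked.database
  AND blocking.relation IS NOT DISTINCT FROM blocked.relation
  AND blocking.pid <> blocked.pid
JOIN pg_stat_activity a ON a.pid = blocking.pid
WHERE NOT blocked.granted
  AND blocking.granted;
SQL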


regards,

fabio pardi 



On 31/05/18 14:26, Ashu Pachauri wrote:
> Thanks Fabio for the reply.  
> The queries are blocked in the sense that I can see them in pg_stat_activity.
> 
> Please find the query and its output for correlating the blocked activity 
> with blocking query from pg_state_activity and pg_locks: 
> https://gist.github.com/ashu210890/c39cd7a38ce37f4baab2f58e1ade1403
> 
> This output was captured after stopping all writes to our postgres database 
> and the only thing talking to postgres was our webserver that only does 
> metadata reads. As you can see from the above gist, even the 'SET' statements 
> are blocked waiting for the ALTER statement to finish.
> 
> 
> Thanks,
> Ashu
> 
> 
> On Thu, May 31, 2018 at 4:38 PM Fabio Pardi  <mailto:f.pa...@portavita.eu>> wrote:
> 
> Hi Ashu,
> 
> when you say 'almost every query in our application starts getting 
> blocked'...
> 
> 
> 'blocked' as in 'they are stuck and can be seen in pg_stat_activity'? (in 
> this case, please post the full content of pg_stat_activity)
> 
> or
> 
> 'blocked' as in 'they are waiting in pgbouncer pool?
> 
> 
> regards,
> 
> fabio pardi
> 
> 
> On 31/05/18 12:38, Ashu Pachauri wrote:
> > We have been using Postgres 9.5.12 behind PGBouncer and facing some 
> weird issues. Whenever we running long running DDL statements (e.g. 'add 
> index concurently' or 'Alter table alter column type'), after some time, we 
> start seeing that almost every query in our application starts getting 
> blocked.
> > I understand that the operations I mentioned cab be unsafe, but the 
> queries being blocked are on completely unrelated tables.  I used the 
> instructions given on Postgres wiki 
> (https://wiki.postgresql.org/wiki/Lock_Monitoring) to correlate the blocking 
> and blocked statements and there seems to be absolutely no correlation.
> >
> > Thanks,
> > Ashu
> 



Re: Question on disk contention

2018-05-31 Thread Fabio Pardi
As far as I know, the OS cache is shared, and shared_buffers too.

Back to the matter of contention: I think your statement might only be true
when you are querying data that does not fit in RAM.

Under those circumstances, the OS or Postgres might need to evict blocks from
RAM to make room for new blocks fetched from disk, while the concurrent query
is trying to do the same. That is when disk contention might happen.


regards,

fabio pardi

On 31/05/18 16:09, Melvin Davidson wrote:
> 
> 
> On Thu, May 31, 2018 at 10:04 AM, Ron  <mailto:ronljohnso...@gmail.com>> wrote:
> 
> On 05/31/2018 08:52 AM, Melvin Davidson wrote:
>>
>>
>> On Thu, May 31, 2018 at 1:13 AM, Charles Clavadetscher 
>> mailto:clavadetsc...@swisspug.org>> wrote:
>>
>> Hi Melvin
>>
>> As an answer to a previous post you wrote:
>>
>> "Also, your main problem is that when you have two exact same 
>> queries executing at the same time, they will cause contention in
>> the disk, and neither one will make much progress."
>>
>> Could you elaborate a little more on the meaning of "contention in 
>> the disk"?
>> What is it that happens?
>>
>> Thank you and have a good day.
>> Regards
>> Charles
>>
>>
>>
>> >Could you elaborate a little more on the meaning of "contention in the 
>> disk"?
>> >What is it that happens?
>>
>> To simplify, you have two users/jobs, both wanting the exact same 
>> information. So the system instructs the disk to get
>> that information from the disk, which causes the disk head to "seek" to 
>> the position of the first eligible row and
>> continues positioning to other eligible rows. Now the job is not 
>> exclusive, so the system temporarily switches to the
>> other job, which causes the disk to go back to the first row and work 
>> from there. The switching back and forth continues,
>> so that instead of one job finishing quickly, they both have to take 
>> turns waiting for needed information. That takes
>> a lot longer,
>>
>> Try this, Select a table that has a lot of rows, ideally 1M+. Then start 
>> a query with a WHERE clause and see how long
>> it takes. Then submit the same query from 5 separate connections 
>> simultaneously and see how long that takes.
> 
> Why isn't the OS caching the disk blocks, and why isn't Postgres using 
> the cached data?
> 
> 
> -- 
> Angular momentum makes the world go 'round.
> 
> 
> 
>> Why isn't the OS caching the disk blocks, and why isn't Postgres using the 
>> cached data?
> It does, but the cache is for each connection/job. They are not shared.
> -- 
> *Melvin Davidson**
> Maj. Database & Exploration Specialist**
> Universe Exploration Command – UXC***
> Employment by invitation only!



Re: Question on disk contention

2018-06-01 Thread Fabio Pardi
For the sake of completeness, I think I have to slightly correct myself.

While I cannot find it in the documentation (if somebody has pointers, please
provide them), my personal notes say that 'if a scan is already in progress,
and a new scan starts, then the new scan will start where the existing scan
is.'

In other words, the new scan will make use of the data retrieval already under
way, and then go back to retrieve the data it missed.
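For what it is worth, this matches my understanding of PostgreSQL's
synchronized sequential scans; whether that is really the mechanism at play
here is my assumption, but checking the setting that governs it is a
one-liner:

# synchronized scans are governed by this setting ('on' by default)
psql -c "SHOW synchronize_seqscans;"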

I did some tests, and the results suggest my notes are right.

I created a table:

 \dt+ big_series 
  List of relations
  Schema   |Name| Type  | Owner | Size  | Description 
---++---+---+---+-
 pv_public | big_series | table | user | 24 GB | 
(1 row)


 \d big_series ;
 Table "pv_public.big_series"
 Column |  Type   | Modifiers 
+-+---
 a  | integer | 


containing 700+ million records


On a 4 cores VM, with 10GB RAM running Postgres 9.6.3: 

If I run a single query, it takes around 3.5 minutes.

Running i=50 queries in parallel, each doing

select * from big_series where a = '$i' 

and starting 2 seconds apart from one another, the times are:

output.0: Execution time: 213519.445 ms
output.10: Execution time: 223256.190 ms
output.11: Execution time: 224113.198 ms
output.12: Execution time: 225147.653 ms
output.13: Execution time: 226101.934 ms
output.14: Execution time: 227028.992 ms
output.15: Execution time: 228622.188 ms
output.16: Execution time: 229584.695 ms
output.17: Execution time: 229795.122 ms
output.18: Execution time: 229949.648 ms
output.19: Execution time: 229974.346 ms
output.1: Execution time: 214274.906 ms
output.20: Execution time: 230001.553 ms
output.21: Execution time: 230138.985 ms
output.22: Execution time: 230080.078 ms
output.23: Execution time: 230135.255 ms
output.24: Execution time: 230393.351 ms
output.25: Execution time: 230467.203 ms
output.26: Execution time: 230651.052 ms
output.27: Execution time: 230603.229 ms
output.28: Execution time: 230502.608 ms
output.29: Execution time: 230692.864 ms
output.2: Execution time: 215348.711 ms
output.30: Execution time: 230630.022 ms
output.31: Execution time: 230501.983 ms
output.32: Execution time: 230213.728 ms
output.33: Execution time: 229992.756 ms
output.34: Execution time: 229515.631 ms
output.35: Execution time: 228769.224 ms
output.36: Execution time: 228341.051 ms
output.37: Execution time: 227881.148 ms
output.38: Execution time: 226883.588 ms
output.39: Execution time: 225697.683 ms
output.3: Execution time: 216344.473 ms
output.40: Execution time: 224860.213 ms
output.41: Execution time: 222169.478 ms
output.42: Execution time: 221259.683 ms
output.43: Execution time: 220444.522 ms
output.44: Execution time: 219244.326 ms
output.45: Execution time: 218170.045 ms
output.46: Execution time: 217346.484 ms
output.47: Execution time: 216860.823 ms
output.48: Execution time: 216431.684 ms
output.49: Execution time: 216286.271 ms
output.4: Execution time: 217365.770 ms
output.5: Execution time: 218376.528 ms
output.6: Execution time: 219395.868 ms
output.7: Execution time: 220501.178 ms
output.8: Execution time: 221297.149 ms
output.9: Execution time: 222157.996 ms




Please correct me if my notes are wrong, or provide pointers to the 
documentation.


regards,

fabio pardi


On 31/05/18 16:20, Fabio Pardi wrote:
> As far as I know, the OS cache is shared, and shared_buffers too.
> 
> Back to the matter of contention, your statement i think might be true only 
> in cases when you are querying data which does not fit in RAM. 
> 
> Under those circumstances, the OS or Postgres might need to evict blocks from 
> RAM to make room to new blocks fetched from disk, while the concurrent query 
> is trying to do the same. Then the disk contention might happen. 
> 
> 
> regards,
> 
> fabio pardi
> 
> On 31/05/18 16:09, Melvin Davidson wrote:
>>
>>
>> On Thu, May 31, 2018 at 10:04 AM, Ron > <mailto:ronljohnso...@gmail.com>> wrote:
>>
>> On 05/31/2018 08:52 AM, Melvin Davidson wrote:
>>>
>>>
>>> On Thu, May 31, 2018 at 1:13 AM, Charles Clavadetscher 
>>> mailto:clavadetsc...@swisspug.org>> wrote:
>>>
>>> Hi Melvin
>>>
>>> As an answer to a previous post you wrote:
>>>
>>> "Also, your main problem is that when you have two exact same 
>>> queries executing at the same time, they will cause contention in
>>> the disk, and neither one will make much progress."
>>>
>>> Could you elaborate a little more on the meaning of "contention in 
>>> the disk"?
>>> What is it that happens?
>>>
>>> Thank you and have a good day.
>>> 

Re: How to get postmaster shut down time in postgres?

2018-06-04 Thread Fabio Pardi
If I were you, I would parse the logfile, as others already mentioned. It
is easy and clean.
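For instance, on a clean shutdown the server logs a recognisable message, so
something like this gives you the last shutdown time (a sketch, assuming
logging_collector is on, a timestamp in log_line_prefix, and logs under
$PGDATA/pg_log):

grep "database system is shut down" $PGDATA/pg_log/*.log | tail -1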

Another option might be to modify the init script so that it writes a
timestamp to a file whenever it stops/starts your server. Or you could add to
the init script a step that fills in your table, but I'm not sure that is
best practice.

Otherwise, we use monitoring for that, but in most cases the timestamps will
not be that accurate.


regards,

fabio pardi

On 06/04/2018 03:17 PM, pavan95 wrote:
> Hi all,
> 
> Is there any way to get postmaster(postgres database) shutdown time?  Please
> help!!
> 
> Thanks in Advance.
> 
> 
> Regards,
> Pavan
> 
> 
> 
> --
> Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html
> 



Re: question on streaming replication

2018-06-14 Thread Fabio Pardi
Hi Atul,

Please do not cross-post over mailing lists.

As per your problem:

on a streaming replication setup, all changes applied to master are propagated 
to standby(s).

If the standby is stopped or temporarily cannot reach the master, it will pick
up the changes when it is started or when it can reach the master again,
provided that its initial state was in sync with the master.

What is stated above is always true, but there are cases in which too many
changes are pushed to the master and the standby is not able to pick them up.

For the standby to pick up the missing data, it is fundamental that the WAL
files containing the changes are still available to the standby, or,
alternatively, that wal_keep_segments is 'large enough'.

A good starting point to debug the situation are the logfiles of the standby
server, together with the pg_stat_replication view.

Also handy: run on the master

SELECT pg_current_xlog_location();

and on the standby

SELECT pg_last_xlog_receive_location();

to understand whether the standby is catching up.
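A sketch of the pg_stat_replication check from the shell (column names are
valid for 9.6; they were renamed in version 10):

# on the master: is the standby connected, and how far behind is it?
psql -c "SELECT client_addr, state, sent_location, replay_location
         FROM pg_stat_replication;"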
The documentation is always a good starting point to understand what is going
on. If you have not already, have a look here:

https://www.postgresql.org/docs/9.6/static/runtime-config-replication.html

Hope it helps,

Fabio


On 14/06/18 07:28, Atul Kumar wrote:
> Hi,
>
> I have postgres edb 9.6 version, i have below query to solve it out.
>
> i have configured streaming replication having master and slave node
> on same  server just to test it.
>
> All worked fine but when i made slave service stop, and create some
> test databases in master, after then i made slave service start, slave
> didn't pick the changes.
>
> The replication was on async state.
>
> Then after doing some search on google i tried to make it sync state
> but even making changes in postgresql.conf file I am neither getting
> sync state nor getting any changes on slave server.
>
> Please suggest the needful.
>
>
> Regards,
> Atul
>




Re: Faster way of estimating database size

2018-06-15 Thread Fabio Pardi
Hi Sam,

When at a terminal I use \l+ to show the size of the databases, since it is
easy to remember. It shows the database size in a "pretty" format.

Timing both commands, I see that \l+ takes more or less the same time as your
query, but I think your query better fits the monitoring purpose.

But the real point here is why are you running the command every ~3 seconds? 
(and perhaps, why 200 databases?)

I would focus on that issue, and keep using your query.


regards,

fabio pardi


On 15/06/18 02:29, Sam Saffron wrote:
> Hi there,
>
> At the moment we are using:
>
> SELECT pg_database.datname, pg_database_size(pg_database.datname) as
> size FROM pg_database
>
> To gather size of databases for monitoring purposes in Prometheus.
>
> Our pg stat logging now shows this is our number one query cause we
> seem to be running it for some crazy reason 19 times a minute. Clearly
> this is too much and we should (and will) cache results for a bit.
>
> Nonetheless, I notice it take 400ms to run on our cluster (with a few
> 200 dbs) and was wondering if there is a quicker way of estimating
> this number?
>
> Sam
>




pgbench on pglogical: bandwidth usage

2018-06-18 Thread Fabio Pardi
-i 

# Now a small hack, since pglogical only accepts tables that have a primary key.
# pgbench_history does not have one out of the box (maybe there is some room
# here for an improvement to propose for the pgbench code? what do you think?)

psql $pglog_db -c "ALTER TABLE pgbench_history ADD COLUMN id SERIAL PRIMARY 
KEY;"
psql $pglog_db -c "CREATEextension pglogical;"
psql $pglog_db -c "SELECT pglogical.create_node(node_name := 
'provider.$pglog_db', dsn := 'host=dbfabio port=5432 dbname=$pglog_db');"
psql $pglog_db -c "SELECT pglogical.replication_set_add_all_tables('default', 
ARRAY['public']);"


Subscriber:

Same as test 1



Streaming replication setup looks like:

Master:

wal_level = hot_standby

Standby:

hot_standby = on

+ recovery.conf



any help is appreciated.


regards,

fabio pardi




Re: Drop Default Privileges?

2018-06-19 Thread Fabio Pardi
Hi Louis,

I think 'alter user' can do the job for you.

https://www.postgresql.org/docs/current/static/sql-alteruser.html

Else, as an alternative: before running pg_restore, you could edit the dump and
replace the string 'ROLE postgres' with the correct user on the RDS instance.
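For example, with a plain-format dump (the file names and the RDS master user
'rds_master' are made up; pg_restore -f can turn a custom-format dump into
plain SQL first):

sed 's/ROLE postgres/ROLE rds_master/g' mydump.sql > mydump_rds.sql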

regards,

fabio pardi




On 19/06/18 17:20, Louis Battuello wrote:
> Is it possible to drop default privileges?
>
> I’m attempting to run a pg_restore into an RDS instance, which doesn’t have a 
> “postgres” user.
>
> I encounter many messages like so:
>
> ALTER DEFAULT PRIVILEGES...
>
> pg_restore: [archiver (db)] Error from TOC entry 10182; 826 253752252 
>  DEFAULT ACL DEFAULT PRIVILEGES FOR TABLES postgres
>
> pg_restore: [archiver (db)] could not execute query: ERROR:  role "postgres" 
> does not exist
>
>     Command was: ALTER DEFAULT PRIVILEGES FOR ROLE postgres IN SCHEMA 
> abc_schema REVOKE ALL ON TABLES  FROM PUBLIC;
>
>
> I’d like to remove these default privileges on the source database to avoid 
> this error message, but I can’t find the syntax in the documentation (or if 
> it’s possible). I only see GRANT/REVOKE options.
>
> Thanks,
> Louis



Re: How can I stop a long run pgAgent job?

2018-06-21 Thread Fabio Pardi
Hi Shore,

Have a look at:

https://www.postgresql.org/docs/current/static/functions-admin.html

'pg_terminate_backend' is probably what you are looking for
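A sketch of how that typically goes (the pid is obviously an example):

# find the long-running backend...
psql -c "SELECT pid, state, query_start, query FROM pg_stat_activity WHERE state = 'active';"

# ...and terminate it
psql -c "SELECT pg_terminate_backend(12345);"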

regards,

fabio pardi


On 21/06/18 11:32, a wrote:
> Hi 
>
> I'm using pgAdmin 4, pgAgent and postgresql 10 on windows server.
>
> I tried a job but due to some reasons, its running long time. Is there a way 
> that I can terminate it ??
>
> Thanks
>
> Shore



Re: Problem Postgres

2018-06-26 Thread Fabio Pardi
Hi Emanuele,


For next time, it is best to post to an Italian mailing list if your log
messages are in Italian, or else change the settings so that errors are
reported in English.

Back to your problem:

It looks like the permissions on your installation are not OK and have been
changed from the defaults. The fact that it happens every 20 minutes suggests
something 'scheduled' is occurring, e.g. a query (maybe from your monitoring
system) or a checkpoint.


Could you please:

* post the full logfile for the next occurrence, including the 'ISTRUZIONE'
line, and anything else you think is relevant in the logs

* tell us more about your installation (installed from package or compiled,
which version of Postgres and OS, whether you did anything special after
installation)

* post the permissions of the folder 'base' and those of base/16395/19739338
(see the commands sketched below)

* tell us which user postgres is running as
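Something like this, run as the OS user that owns the data directory, from
inside the data directory (just a sketch):

id
ls -ld base
ls -l base/16395/19739338*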


regards,

fabio pardi



On 26/06/18 10:05, Emanuele Musella wrote:
> Good morning,
>
> we have the following error:
>
> 2018-06-26 09:48:44 CEST ERRORE:  non è stato possibile ottenere informazioni 
> sul file "base/16395/19739338": Permission denied
> 2018-06-26 09:48:44 CEST ISTRUZIONE:  select 
> p.datname,pg_database_size(p.datname) from pg_database p
>
> This error repetly every 20 minutes.
>
> Can you help me?
>
> Regards 
>
> Emanuele Musella



Re: Single query uses all system memory

2018-06-26 Thread Fabio Pardi
Hi Ivar,

I agree with Magnus. In addition, I would also change:

 effective_cache_size

and

 maintenance_work_mem

Then disable the OOM killer, enable logging of temporary files to see if your
work_mem is too low and spills to disk, rethink your data structure and,
overall, have a look at this page, which I find a good starting point for
proper tuning:

https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server
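As a rough sketch for the 1GB test box you described (the numbers are
assumptions meant to show the direction, not a recommendation):

# postgresql.conf
shared_buffers       = 256MB
effective_cache_size = 512MB   # the 1536MB in your config is larger than the box's RAM
maintenance_work_mem = 64MB
work_mem             = 4MB     # per sort/hash node, per backend
log_temp_files       = 0       # log every temp file, to spot work_mem spilling to disk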


regards,

fabio pardi


On 26/06/18 12:13, Magnus Hagander wrote:
>
>
> On Tue, Jun 26, 2018 at 11:40 AM, Ivar Fredriksen  <mailto:ivar.fredrik...@gmail.com>> wrote:
>
> A single large query is able to spend all the system memory (as reported 
> by top), and the oom_reaper kills postgres. See bottom of email for an 
> example query and logs.
>
>  
>
> Expected behavior would be that postgres is not killed and the query is 
> limited to the shared_buffer memory, potentially failing, but not killing the 
> postgres process.
>
>
> Then your expectations are completely wrong. shared_buffers have nothing to 
> do with limiting the memory of a query. shared_buffers set the size of the 
> PostgreSQL cache, not the working set. That's controlled by work_mem (see 
> below).
>
>  
>
> Do I have some fundamental flaw in my understanding of this, or is there 
> some sort of configuration that should be in place that is missing? The 
> offending sample query has been
>
> rewritten to not use joins, and will no longer crash postgres. I am not 
> asking anyone to spend a lot of time analyzing the query itself, it is just 
> provided as an example for when the problem will occur.
>
> My question is more in a broader sense why one query is eating through 
> all system memory, and is there anything I can do to prevent this from 
> happening? 
>
> We have set shared_buffers to 256MB on a test-system that has 1GB memory. 
> Production machines have more resources, but the errors are happening in 
> exactly the same way, so I assume (perhaps wrongly) that using the test
>
> system specs and logs might give me the answers I need to figure out what 
> is happening.
>
>  
>
> Technical details are provided below, a big thanks to anyone who can 
> provide any help or insight to this.
>
>  
>
> Regards,
>
> Ivar Fredriksen
>
>  
>
> PostgreSQL version number:
>
> PostgreSQL 10.4 (Debian 10.4-2.pgdg90+1) on x86_64-pc-linux-gnu, compiled 
> by gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, 64-bit
>
> Installed with the debian packages for postgres found at: deb 
> http://apt.postgresql.org/pub/repos/apt/ 
> <http://apt.postgresql.org/pub/repos/apt/>
>
>  
>
> Changes made to the settings in the postgresql.conf file:
>
>  name | current_setting     | 
>    source
>
> 
> --+-+--
>
>  application_name | psql    | 
> client
>
>  checkpoint_completion_target | 0.9     | 
> configuration file
>
>  client_encoding  | UTF8    | 
> client
>
>  cluster_name | 10/main | 
> configuration file
>
>  DateStyle    | ISO, MDY    | 
> configuration file
>
>  default_statistics_target    | 100 | 
> configuration file
>
>  default_text_search_config   | pg_catalog.english  | 
> configuration file
>
>  dynamic_shared_memory_type   | posix   | 
> configuration file
>
>  effective_cache_size | 1536MB  | 
> configuration file
>
>  external_pid_file    | /var/run/postgresql/10-main.pid | 
> configuration file
>
>  lc_messages  | C   | 
> configuration file
>
>  lc_monetary  | C   | 
> configuration file
>
>  lc_numeric   | en_US.UTF-8     | 
> configuration file
>
>  lc_time  | C   | 
> configuration file
>
>  listen_addresses | *   | 
> configuration file
>
>  log_line_prefix  | %m [%p] %q%u@%d | 

Re: Not able to update some rows in a table

2018-07-02 Thread Fabio Pardi
Hi Marco,

I think more information would give us a better picture.

can you please post the following?

postgres version

your table structure

how you are updating the records

if the problem is reproducible directly from the postgres prompt

how you are fetching results after the update

anything else to make the situation more clear (like errors in the logfile, or 
anything else you think is relevant)


regards,

fabio pardi



On 02/07/18 16:51, Marco Fochesato wrote:
> Dear all,
> I have a table with 21 columns.
> Primary key is done with 20 of these colums.
>
> I have 3.313 records.
> I don't know why, but I'm not able to update some of these records.
> I don't understand, it seems that I'm not able to fetch.
> It happens with Pgadmin, with Python Psycopg2 and also through Libreoffice 
> Base.
>
> Looking to the properties of the table in Pgadmin, I can see only the 
> estimated number
> of rows. Counted rows are 'not counted'.
>
> -- 
> Marco Fochesato



Re: Query to monitor index bloat

2018-07-17 Thread Fabio Pardi
If you monitor using nagios (or if you want to make a wrapper around it):

https://bucardo.org/check_postgres/check_postgres.pl.html#bloat

works like a charm.
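A sketch of an invocation (flag names are from memory, so double-check them
against the page above):

./check_postgres.pl --action=bloat --dbname=mydb \
                    --warning='100 M' --critical='1 G' \
                    --include=my_big_index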

regards,

fabio pardi


On 07/16/2018 05:16 PM, Alessandro Aste wrote:
> Hello,  I am trying to put togheter a query to monitor the index bloat
> for a database I maintain.
> Is there a "SQL" way to obtain  bloated index ? I googled around but I
> found nothing working.
> 
> I'm currently running 9.6 but I'm looking for something compatible with
> version 10 too.
> 
> Thank you very much in advance,
> 
> 
> Alessandro. 



Re: Query to monitor index bloat

2018-07-17 Thread Fabio Pardi



On 07/17/2018 10:21 AM, Jehan-Guillaume (ioguix) de Rorthais wrote:
> On Tue, 17 Jul 2018 10:11:50 +0200

> ...and check this one: https://github.com/OPMDG/check_pgactivity/
> 
> It uses bloat queries for tables and btree indexes Adrien Nayrat was pointing
> earlier in this thread.
> 
> In fact, both queries in check_pgactivity were written because the bloat check
> in check_postgres was considering **all** fields were in **all***
> indexes..

Not exactly: it does exclude a few things.

from the docs:
'Tables must have at least 10 pages, and indexes at least 15, before
they can be considered by this test.'

+ you can include and exclude objects based on your taste, same as in
check_pgactivity.

The only 'drawback' of check_postgres.pl is that it checks index and table
bloat in one go (but if your object names are normalized, it should not be
difficult to include or exclude them).
I do not consider it a drawback, but you are free to pick your poison...


> ...which is quite a large approximation... I don't know if this is still
> the case though.

While I think both tools might fit Alessandro's purpose, please note that
check_pgactivity checks **only** btree indexes (which are the default ones,
and the ones proven to get bloated quickly).

If I were you (both), I would monitor **all** indexes (and yes! tables
too), since one day you might realize it was actually a good idea to do so.

regards,

fabio pardi



Re: Query to monitor index bloat

2018-07-17 Thread Fabio Pardi
you have your points, my friend.


On 07/17/2018 11:23 AM, Jehan-Guillaume (ioguix) de Rorthais wrote:
> On Tue, 17 Jul 2018 11:03:08 +0200
> Fabio Pardi  wrote:
> 
>> On 07/17/2018 10:21 AM, Jehan-Guillaume (ioguix) de Rorthais wrote:
>>> On Tue, 17 Jul 2018 10:11:50 +0200  
>>
>>> ...and check this one: https://github.com/OPMDG/check_pgactivity/
>>>
>>> It uses bloat queries for tables and btree indexes Adrien Nayrat was
>>> pointing earlier in this thread.
>>>
>>> In fact, both queries in check_pgactivity were written because the bloat
>>> check in check_postgres was considering **all** fields were in **all***
>>> indexes..  
>>
>> not accurately, since it is excluding a few things.
>>
>> from the docs:
>> 'Tables must have at least 10 pages, and indexes at least 15, before
>> they can be considered by this test.'
> 
> well I agree with this. What the point of computing bloat for small objects? I
> would raise this way higher.
> 
>> + you can include and exclude objects based on your taste, same as in
>> check_pgactivity.
>>
>> The only 'drawback' of check_postgres.pl is that it checks indexes and
>> tables's bloat in one go. (but: if your object's names are normalized,
>> it should not be difficult to include or exclude them)
>> I do not consider it a drawback, but you are free to pick your poison...
> 
> Well, again, the btree approximation is quite large in check_postgres. I would
> not rely on it detect bloat quickly. **If this is still true**, as it 
> considers
> all fields are in the index, the estimated index size might be very
> large compared to the real one.
> 
> But, again, this is a few years I did not digg in this query, I mmight be 
> wrong.
> 
>>> Which is quite a large approximation...I don't know if this is still
>>> the case though.  
>>
>> While i think both tools might fit Alessandro's purpose, please note
>> that check_pgactivity is **only** checking for btree indexes (which are
>> the default ones, and the proven-to-get-bloated-quickly)
>>
>> If I were you (both), I would monitor **all** indexes (and yes! tables
>> too), since one day you might realize it was actually a good idea to do so.
> 
> I agree, we should monitor all indexes. If you have some formula to quickly
> estimate ideal size of a GIN, GiST, hash or sp-gist indexes, please share. 
> But,
> unfortunately, as far as I know, this is way more complex than just summing 
> the
> average size of the fields in the index :/
> 



Re: Is it ok to run vacuum full verbose command for live database for the tables which has more dead tuples?

2018-07-18 Thread Fabio Pardi
Hi Raghavendra


answers in line here below:


On 18/07/18 11:13, Raghavendra Rao J S V wrote:
> We have thousands of tables. But out of those tables, around 20 to 40 tables 
> are always busy due to that those tables are bloating.
>
> In order to avoid this we are running a shell script which performs vacuum 
> full on the tables which has more than ten thousand dead tuples. While 
> running this we are stopping all application processors and running vacuum 
> full on the tables which has more dead tuples.
>
>  1. Is it ok to run *vacuum full verbose* command for live database for the 
> tables which has more dead tuples(greater than)?
>
nope

>  1. Does it cause any *adverse *effect?
>
>
It takes an exclusive lock on the table. A recipe for disaster.



What's wrong with the normal operations of autovacuum?
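If those 20 to 40 busy tables bloat faster than autovacuum keeps up, making
autovacuum more aggressive on just those tables is usually the safer route
(the table name and values are only an example):

psql -c "ALTER TABLE busy_table SET (autovacuum_vacuum_scale_factor = 0.01,
                                     autovacuum_vacuum_threshold    = 1000);"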


regards,

fabio pardi



> Please clarify me. Thanks in advance.
>
> -- 
> Regards,
> Raghavendra Rao J S V
>



Re: DB Backup from WAL Slave

2018-07-25 Thread Fabio Pardi
Hi Basti,


When you are dumping a big table with hot_standby_feedback=on, you are holding
the master back from cleaning up old records (as if the dump were running on
the master).

If too much time elapses while the conflicting lock is held (30 seconds by
default), the statement is cancelled.

At the expense of potentially accumulating more WAL segments on the primary,
you might want to look into the max_standby_archive_delay /
max_standby_streaming_delay parameters to have more control over query
cancellation.
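On the standby that would look something like this (the value is only an
example):

# standby's postgresql.conf
max_standby_streaming_delay = 600s   # let replay wait up to 10 minutes for the dump
max_standby_archive_delay   = 600s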

regards,

fabio pardi





On 25/07/18 15:14, basti wrote:
> Thanks a lot for this tip. I don't know is this is the right param.
> Becouse of: "... , but can cause database bloat on the primary for some
> workloads."
>
> This the an other way to backup the database and work around
>
>
> pg_dump: Error message from server: ERROR:  canceling statement due to
> conflict with recovery
> # DETAIL:  User was holding a relation lock for too long.
>
> this error?
>
> I have tried pg_basebackup but it fails with:
> pg_basebackup: could not connect to server: FATAL:  number of requested
> standby connections exceeds max_wal_senders (currently 0)
>
> Best regards,
>
> On 24.07.2018 15:33, Andreas Kretschmer wrote:
>> On 24 July 2018 14:44:45 CEST, basti  wrote:
>>> Hello,
>>>
>>> we have a db master and a slave.
>> -
>>> How can I do an backup with pg_dumpall from slave?
>> Set hot_standby_feedback to on.
>>
>>
>> Regards, Andreas
>>
>>



Re: Question on postgresql.conf

2018-07-31 Thread Fabio Pardi
Hi,

you can read this:

https://www.postgresql.org/docs/current/static/runtime-config-file-locations.html


Basically you will start Postgres as:

pg_ctl -D /nfs/global/

but it expects to find a file named postgresql.conf there (I do not know of any 
way to specify a different file name).


inside postgresql.conf you should then specify something like:

data_directory='/your/data/folder'


What probably fits you better is instead the following:

- use the original location of your data folder and postgresql.conf file

- specify inside the file:
include='/nfs/global/postgres-machine1.cnf'

I have doubts about the security implications of that setup, though.


regards,
fabio pardi




On 31/07/18 13:27, Lu, Dan wrote:
>
> Hi David,
>
>  
>
> Which command?  Can you give me an example?
>
>  
>
> Say, I want to put the config file on a share drive so all my postgres 
> configuration file can be seen in one location.
>
>  
>
> /nfs/global/postgres-.cnf
>
>  
>
> Example:
>
> /nfs/global/postgres-machine1.cnf
>
>  
>
> /nfs/global/postgres-machine2.cnf
>
>  
>
> /nfs/global/postgres-machine3.cnf
>
>  
>
>  
>
> *From:*David G. Johnston [mailto:david.g.johns...@gmail.com]
> *Sent:* Tuesday, July 31, 2018 1:48 AM
> *To:* Alvaro Aguayo Garcia-Rada 
> *Cc:* pgsql-gene...@postgresql.org; Lu, Dan 
> *Subject:* Re: Question on postgresql.conf
>
>  
>
> On Monday, July 30, 2018, Alvaro Aguayo Garcia-Rada  <mailto:aagu...@opensysperu.com>> wrote:
>
> As far as I know, it's not currently possible. 
>
>  
>
> That would be incorrect, you just need to change server startup commands.
>
>  
>
> https://www.postgresql.org/docs/10/static/runtime-config-file-locations.html
>
>  
>
> David J.
>
>  
>
>



A Timeseries Case Study: InfluxDB VS PostgreSQL

2018-09-07 Thread Fabio Pardi
Hi,

I recently published a blog article describing some research I did on
the usage of InfluxDB and PostgreSQL for time series, together with
Grafana, for my specific use case.


I think that some of you might find it interesting, maybe inspiring or
perhaps it can trigger some interesting discussion, given the high level
of expertise of people in this mailing list.

I reached out to the InfluxDB folks, but after an initial ack I did not hear
from them again.


https://portavita.github.io/2018-07-31-blog_influxdb_vs_postgresql


All comments, criticism, suggestions and corrections are very welcome.


Regards,

fabio pardi



Re: A Timeseries Case Study: InfluxDB VS PostgreSQL

2018-09-07 Thread Fabio Pardi
Hi Achilleas,

I'm glad you like the article.

I will probably find the time to come back to the topic once I'm done comparing 
MongoDB with PostgreSQL.

regards,

fabio pardi


On 07/09/18 11:18, Achilleas Mantzios wrote:
> Nice read! Wonder if you could repeat the tests on pgsql 10.5 and btree/BRIN.




Re: Pgbouncer and postgres

2018-09-18 Thread Fabio Pardi
Hi Nicola,

I tried to reproduce the problem using pgbouncer 1.8.1 from RPM on CentOS 7 and 
(an almost identical copy of) your configuration file, but I'm not able to bring 
it to the same error state.

Your problem might be tied either to the way you compiled it, or to some 
package installed/reconfigured with the updates (my CentOS is not up to date 
with the latest packages).

What if you try to use the RPM version of pgbouncer? Can you try that?

How many interfaces do you have on the system? Did you check if maybe some 
port/interface is in use? You can stop pgbouncer and then run 'lsof -n | grep 
6543'. No line should be returned.

Are you running pgbouncer from systemctl or the command line? Maybe it is a long 
shot, but one more thing you can try is to remove the line 'user=postgres' and 
let it run as the 'pgbouncer' user from root/systemctl.

Is selinux enabled or disabled on the system?


Regards,

fabio pardi



On 17/09/18 14:33, Nicola Contu wrote:
>
> Hello,
>
> I'm trying to get pgbouncer working but I'm getting this error :
>
>
> 2018-09-17 12:20:15.304 87772 NOISE safe_accept(12) = Resource temporarily 
> unavailable
> 2018-09-17 12:20:36.183 87772 NOISE safe_accept(11) = 13 (127.0.0.1:59100 
> <http://127.0.0.1:59100>)
> 2018-09-17 12:20:36.183 87772 NOISE new fd from accept=13
> 2018-09-17 12:20:36.183 87772 NOISE safe_accept(11) = Resource temporarily 
> unavailable
> 2018-09-17 12:21:14.412 88424 DEBUG parse_ini_file: 'logfile' = 
> '/var/log/pgbouncer.log' ok:1
> 2018-09-17 12:21:14.412 88424 DEBUG parse_ini_file: 'pidfile' = 
> '/home/postgres/pgbouncer.pid'
> 2018-09-17 12:21:14.412 88424 DEBUG parse_ini_file: 'pidfile' = 
> '/home/postgres/pgbouncer.pid' ok:1
> 2018-09-17 12:21:14.412 88424 DEBUG parse_ini_file: 'admin_users' = 'admin'
> 2018-09-17 12:21:14.412 88424 DEBUG parse_ini_file: 'admin_users' = 'admin' 
> ok:1
> 2018-09-17 12:21:14.412 88424 DEBUG parse_ini_file: 'user' = 'postgres'
> 2018-09-17 12:21:14.412 88424 DEBUG parse_ini_file: 'user' = 'postgres' ok:1
> 2018-09-17 12:21:14.412 88424 DEBUG parse_ini_file: 'max_db_connections' = 
> '220'
> 2018-09-17 12:21:14.412 88424 DEBUG parse_ini_file: 'max_db_connections' = 
> '220' ok:1
> 2018-09-17 12:21:14.412 88424 DEBUG parse_ini_file: 'log_connections' = '0'
> 2018-09-17 12:21:14.413 88424 DEBUG parse_ini_file: 'log_connections' = '0' 
> ok:1
> 2018-09-17 12:21:14.413 88424 DEBUG parse_ini_file: 'log_disconnections' = '0'
> 2018-09-17 12:21:14.413 88424 DEBUG parse_ini_file: 'log_disconnections' = 
> '0' ok:1
> 2018-09-17 12:21:14.413 88424 DEBUG parse_ini_file: 'verbose' = '3'
> 2018-09-17 12:21:14.413 88424 DEBUG parse_ini_file: 'verbose' = '3' ok:1
> 2018-09-17 12:21:14.413 88424 DEBUG loading auth_file: 
> "/etc/pgbouncer/users.txt"
> 2018-09-17 12:21:14.417 88424 NOISE event: 128, SBuf: 192, PgSocket: 400, 
> IOBuf: 4108
> 2018-09-17 12:21:14.417 88424 LOG File descriptor limit: 1024 (H:4096), 
> max_client_conn: 100, max fds possible: 230
> 2018-09-17 12:21:14.417 88424 DEBUG pktbuf_dynamic(128): 0x25277c0
> 2018-09-17 12:21:14.417 88424 DEBUG make_room(0x25277c0, 4): realloc 
> newlen=256
> 2018-09-17 12:21:14.417 88424 DEBUG pktbuf_dynamic(128): 0x252ebb0
> 2018-09-17 12:21:14.418 88424 NOISE connect(4, unix:/tmp/.s.PGSQL.6543) = 
> Connection refused
> 2018-09-17 12:21:14.418 88424 NOISE safe_close(4) = 0
> 2018-09-17 12:21:14.418 88424 LOG Stale pidfile, removing
> 2018-09-17 12:21:14.419 88424 DEBUG adns_create_context: evdns2
> 2018-09-17 12:21:14.421 88424 DEBUG add_listen: 127.0.0.1:6543 
> <http://127.0.0.1:6543>
> 2018-09-17 12:21:14.422 88424 NOISE old TCP_DEFER_ACCEPT on 11 = 0
> 2018-09-17 12:21:14.422 88424 NOISE install TCP_DEFER_ACCEPT on 11
> 2018-09-17 12:21:14.422 88424 LOG listening on 127.0.0.1:6543 
> <http://127.0.0.1:6543>
> 2018-09-17 12:21:14.422 88424 DEBUG add_listen: unix:/tmp/.s.PGSQL.6543
> 2018-09-17 12:21:14.422 88424 LOG listening on unix:/tmp/.s.PGSQL.6543
> 2018-09-17 12:21:14.423 88424 LOG process up: pgbouncer 1.8.1, libevent 
> 2.0.21-stable (epoll), adns: evdns2, tls: OpenSSL 1.0.2k-fips  26 Jan 2017
> 2018-09-17 12:21:14.756 88424 DEBUG loading auth_file: 
> "/etc/pgbouncer/users.txt"
> 2018-09-17 12:21:48.917 88424 NOISE safe_accept(12) = 13 (unix:)
> 2018-09-17 12:21:48.917 88424 NOISE new fd from accept=13
> 2018-09-17 12:21:48.917 88424 WARNING tune_socket(13) failed: Operation not 
> supported
> 2018-09-17 12:21:48.917 88424 

Re: how to know whether query data from memory after pg_prewarm

2018-09-19 Thread Fabio Pardi
@Thomas, this tool looks very interesting!

@Jimmy:

Back to the question, you might approach the problem from a different 
perspective.

If you run a query 2 consecutive times, from the second one on you should be at 
'full speed'.

Therefore, if the first run takes X seconds but the consecutive runs take only a 
fraction of that, then you have probably cached all the data.

It might be as simple as that. But also take into account:

- What is written above is true when the size of the table fits in RAM
- No other activity should be going on. If you have other activity on the 
disk, it will pollute your results
- If you read from disk, you will see read activity on the data disk, given 
that you are graphing it. If the data is cached, then it does not read from disk.
- Running the query with 'explain' will tell you in detail what is going on 
(maybe you forgot to create an index on Postgres that is there on Oracle?); see 
the sketch after this list
- If you are doing sorts and the data does not fit in work_mem, then you are 
spilling to disk, slowing down operations
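A minimal sketch of that 'explain' check, reusing the query from the original 
question; BUFFERS distinguishes blocks found in PostgreSQL's shared buffers 
("shared hit") from blocks fetched from outside, i.e. OS cache or disk ("read"):

EXPLAIN (ANALYZE, BUFFERS)
SELECT count(1) FROM table1 GROUP BY aa;

If the second run shows mostly "shared hit" and little or no "read", the data 
is being served from memory.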


regards,

fabio pardi


On 19/09/18 05:29, Thomas Munro wrote:
> On Wed, Sep 19, 2018 at 1:35 PM jimmy  wrote:
>> I use select pg_prewarm('table1','read','main')  to load data of table1 into 
>> the memory.
>> when I use select count(1) from table1 group by aa to query data.
>> I find the speed of query is not fast, I wonder whether it query data from 
>> memory.
>> And it is slower than Oracle, both of Oracle and Postgresql has same table 
>> and count of data.
>> when pg_prewarm use 'read' mode,  the data is put into the OS cache, how to 
>> examine the table which is pg_prewarmed into the OS cache .
>> I know pg_buffercache ,but it just examine the table in the shared buffer of 
>> Postgresql, not the table in the OS cache.
> This is a quick and dirty hack, but it might do what you want:
>
> https://github.com/macdice/pgdata_mincore
>
> Tested on FreeBSD, not sure how well it'll travel.
>



Re: Pgbouncer and postgres

2018-09-19 Thread Fabio Pardi


On 19/09/18 15:34, Adrian Klaver wrote:
> On 9/19/18 5:40 AM, ncontu wrote:
>> Hi Fabio,
>> It used to work before, even without installing updates.
>
> Exactly, the suspicion is that something in the updates changed the behavior 
> of your system.
>
Yes, that's the suspicion. Or at least it was.

I tried to update all the packages available as of today for 
CentOS7-OS and CentOS7-updates hosted at:


http://ftp.nluug.nl/ftp/pub/os/Linux/distr/CentOS/7/os/x86_64

and

http://ftp.nluug.nl/ftp/pub/os/Linux/distr/CentOS/7/updates/x86_64/


With an up-to-date system, I'm still able to use pgbouncer.

For the record, I'm also posting the log lines that show up when pgbouncer is 
asked to start:


2018-09-19 15:49:42.574 9979 DEBUG parse_ini_file: 'verbose' = '3' ok:1
2018-09-19 15:49:42.575 9979 DEBUG loading auth_file: "/etc/pgbouncer/users.txt"
2018-09-19 15:49:42.576 9979 NOISE event: 136, SBuf: 200, PgSocket: 408, IOBuf: 
4108
2018-09-19 15:49:42.577 9979 LOG File descriptor limit: 1024 (H:4096), 
max_client_conn: 100, max fds possible: 140
2018-09-19 15:49:42.577 9979 DEBUG pktbuf_dynamic(128): 0x82a790
2018-09-19 15:49:42.577 9979 DEBUG make_room(0x82a790, 4): realloc newlen=256
2018-09-19 15:49:42.577 9979 DEBUG pktbuf_dynamic(128): 0x82a970
2018-09-19 15:49:42.577 9979 NOISE connect(4, unix:/tmp/.s.PGSQL.6543) = No 
such file or directory
2018-09-19 15:49:42.577 9979 NOISE safe_close(4) = 0
2018-09-19 15:49:42.578 9981 DEBUG adns_create_context: c-ares 1.10.0
2018-09-19 15:49:42.578 9981 DEBUG add_listen: 0.0.0.0:6543
2018-09-19 15:49:42.578 9981 NOISE old TCP_DEFER_ACCEPT on 7 = 0
2018-09-19 15:49:42.578 9981 NOISE install TCP_DEFER_ACCEPT on 7
2018-09-19 15:49:42.578 9981 LOG listening on 0.0.0.0:6543
2018-09-19 15:49:42.578 9981 DEBUG add_listen: ::/6543
2018-09-19 15:49:42.578 9981 NOISE old TCP_DEFER_ACCEPT on 8 = 0
2018-09-19 15:49:42.578 9981 NOISE install TCP_DEFER_ACCEPT on 8
2018-09-19 15:49:42.578 9981 LOG listening on ::/6543
2018-09-19 15:49:42.578 9981 DEBUG add_listen: unix:/tmp/.s.PGSQL.6543
2018-09-19 15:49:42.579 9981 LOG listening on unix:/tmp/.s.PGSQL.6543
2018-09-19 15:49:42.579 9981 LOG process up: pgbouncer 1.8.1, libevent 
2.0.21-stable (epoll), adns: c-ares 1.10.0, tls: OpenSSL 1.0.2k-fips  26 Jan 
2017


and this part shows up when the user connects:

psql -h /tmp -p 6543 -U someuser mydb


2018-09-19 16:02:12.125 10457 NOISE C-0x1391120: 
(nodb)/(nouser)@unix(10465):6543 pkt='!' len=77
2018-09-19 16:02:12.125 10457 DEBUG C-0x1391120: 
(nodb)/(nouser)@unix(10465):6543 got var: user=someuser
2018-09-19 16:02:12.125 10457 DEBUG C-0x1391120: 
(nodb)/(nouser)@unix(10465):6543 got var: database=mydb
2018-09-19 16:02:12.125 10457 DEBUG C-0x1391120: 
(nodb)/(nouser)@unix(10465):6543 using application_name: psql
2018-09-19 16:02:12.125 10457 DEBUG C-0x1391120: 
(nodb)/(nouser)@unix(10465):6543 got var: client_encoding=UTF8
2018-09-19 16:02:12.125 10457 NOISE safe_send(10, 13) = 13
2018-09-19 16:02:12.125 10457 NOISE resync: done=77, parse=77, recv=77
2018-09-19 16:02:12.125 10457 NOISE resync: done=0, parse=0, recv=0
2018-09-19 16:02:12.125 10457 NOISE safe_recv(10, 4096) = 0
2018-09-19 16:02:12.126 10457 NOISE safe_close(10) = 0
2018-09-19 16:02:19.094 10457 NOISE safe_accept(9) = 10 (unix:)
2018-09-19 16:02:19.094 10457 NOISE new fd from accept=10
2018-09-19 16:02:19.094 10457 NOISE unix peer uid: 0
2018-09-19 16:02:19.094 10457 DEBUG C-0x1391120: 
(nodb)/(nouser)@unix(10470):6543 P: got connection: unix->unix
2018-09-19 16:02:19.094 10457 NOISE safe_accept(9) = Resource temporarily 
unavailable
2018-09-19 16:02:19.094 10457 NOISE resync: done=0, parse=0, recv=0
2018-09-19 16:02:19.094 10457 NOISE safe_recv(10, 4096) = 77
2018-09-19 16:02:19.094 10457 NOISE C-0x1391120: 
(nodb)/(nouser)@unix(10470):6543 pkt='!' len=77
2018-09-19 16:02:19.094 10457 DEBUG C-0x1391120: 
(nodb)/(nouser)@unix(10470):6543 got var: user=someuser
2018-09-19 16:02:19.094 10457 DEBUG C-0x1391120: 
(nodb)/(nouser)@unix(10470):6543 got var: database=mydb
2018-09-19 16:02:19.094 10457 DEBUG C-0x1391120: 
(nodb)/(nouser)@unix(10470):6543 using application_name: psql
2018-09-19 16:02:19.094 10457 DEBUG C-0x1391120: 
(nodb)/(nouser)@unix(10470):6543 got var: client_encoding=UTF8
2018-09-19 16:02:19.094 10457 NOISE safe_send(10, 13) = 13
2018-09-19 16:02:19.094 10457 NOISE resync: done=77, parse=77, recv=77
2018-09-19 16:02:19.094 10457 NOISE resync: done=0, parse=0, recv=0
2018-09-19 16:02:19.094 10457 NOISE safe_recv(10, 4096) = 0
2018-09-19 16:02:19.095 10457 NOISE safe_close(10) = 0



After providing the password, the connection works as usual.

As mentioned earlier by Adrian, the offending line is probably:

2018-09-17 12:21:48.917 88424 WARNING tune_socket(13) failed: Operation not 
supported


>>
>> I tried installing it from source, from git repo. Can't find the RPM, do you
>> have any link f

Re: Pgbouncer and postgres

2018-09-20 Thread Fabio Pardi
I tried to compile 1.8.1 on my (now) up-to-date CentOS 7 and everything
went ok. I can run pgbouncer and log in to it.


this is the relevant part of the diff between config.log files


> config.status:1112: lib/usual/config.h is unchanged
4676c4689
< CPPFLAGS=''
---
> CPPFLAGS=' -I/usr/local/lib/include'

4690c4703
< LDFLAGS='-g  -Wl,--as-needed'
---
> LDFLAGS='-g  -Wl,--as-needed -L/usr/local/lib/lib'


Together with the socket problem mentioned earlier, this makes me think that
we are overlooking something or missing some important piece of information.

regards,

fabio pardi


On 09/19/2018 06:11 PM, Nicola Contu wrote:
> No the old one is stopped, so not sure why it says the port is in use.
> 
> Attached the config.log
> 
> Thanks
> 
> Il giorno mer 19 set 2018 alle ore 18:02 Adrian Klaver
> mailto:adrian.kla...@aklaver.com>> ha scritto:
> 
> On 9/19/18 8:39 AM, ncontu1 wrote:
> > So unexpectedly with the rpms, it installed it on a different
> place and it is
> > working from command line
> >
> > [root@cmd-dev1 pgbouncer]# /bin/pgbouncer
> /etc/pgbouncer/pgbouncer.ini -u
> > postgres
> > 2018-09-19 15:38:36.081 46322 LOG File descriptor limit: 1024
> (H:4096),
> > max_client_conn: 100, max fds possible: 230
> > 2018-09-19 15:38:36.082 46322 LOG Stale pidfile, removing
> > 2018-09-19 15:38:36.082 46322 LOG listening on 0.0.0.0:6543
> <http://0.0.0.0:6543>
> > 2018-09-19 15:38:36.082 46322 LOG listening on ::/6543
> > 2018-09-19 15:38:36.083 46322 WARNING Cannot listen on
> > unix:/tmp/.s.PGSQL.6543: bind(): Address already in use
> 
> Is the compiled version still running?
> 
> Or is something else sitting on the 6543 socket?
> 
> > 2018-09-19 15:38:36.083 46322 LOG process up: pgbouncer 1.8.1,
> libevent
> > 2.0.21-stable (epoll), adns: c-ares 1.10.0, tls: OpenSSL
> 1.0.2k-fips  26 Jan
> > 2017
> >
> > [root@cmd-dev1 ~]# psql -U ncontu -p 6543 cmd3dev -h IP_ADD
> > Password for user ncontu:
> > psql (10.5)
> > Type "help" for help.
> >
> > cmd3dev=# \q
> >
> >
> > So at this point... I don't really know why compiled from source
> it does not
> > work.
> 
> Not sure.
> 
> For those that might know it would help to provide the complete
> ./configure used and the config.log(or equivalent)
> 
> >
> >
> >
> > --
> > Sent from:
> http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html
> >
> 
> 
> -- 
> Adrian Klaver
> adrian.kla...@aklaver.com <mailto:adrian.kla...@aklaver.com>
> 



Re: DB size difference after restore

2018-10-03 Thread Fabio Pardi
Hi Sonam,

it is common that after an export (pg_dump) and a consecutive import, the new 
database uses less space than the old one.

Under some circumstances, indexes can become unbalanced or bloated, and tables 
bloated. That uses more space than it should.

For the sake of trying it out: if you perform a 'VACUUM FULL' on your old db, 
then its size should be very close or equal to the newly imported one.
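A quick way to compare, to be run on both servers (the database name is just a 
placeholder):

SELECT pg_size_pretty(pg_database_size('mydb'));

-- and on the old database only, to reclaim the bloated space
-- (beware: VACUUM FULL takes an exclusive lock on each table it rewrites)
VACUUM FULL;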

hope it clarifies.


regards,

fabio pardi


On 03/10/18 14:51, Sonam Sharma wrote:
> Hello Ben,
>
> When we do \l+ , it is different than source, when we load backup from target 
> server
>
>
> Regards,
>
> Sonam
>
>
>  
>
>  
>
>
> On Wed, Oct 3, 2018 at 6:17 PM Benjamin Scherrey  <mailto:scher...@proteus-tech.com>> wrote:
>
> If you're talking about space on drive then you can expect the new one to 
> be smaller generally as it has been straight efficient writes rather than a 
> bunch of updates and deletes which create "holes" in the physical file space.
>
> It helps if you are more detailed as to what you've observed if you want 
> a more specific answer. 
>
>   - - Ben Scherrey 
>
> On Wed, Oct 3, 2018, 7:43 PM Sonam Sharma  <mailto:sonams1...@gmail.com>> wrote:
>
> I have restored the database from backup dump but the size of source 
> and target databases are different. What can be the reason for this ?
>
> Regards,
> Sonam
>



Re: Does postgreSQL community edition supports data distribution across nodes

2018-10-05 Thread Fabio Pardi
Hi,

I think you are looking for:

https://www.postgres-xl.org/

regards,

fabio pardi

On 05/10/18 11:54, deepikags wrote:
> Hi,
>
> Does postgreSQL community edition supports data distribution across
> nodes(distributed architecture) or do we need to buy licence for the same ?
>
>
> Regards,
> Deepika
>
>
>
> --
> Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html
>



Re: Pgbouncer discard all

2018-10-10 Thread Fabio Pardi
Hi Nicola,

I am under the impression that the problem of having waiting clients does not 
depend on server_idle_timeout.

How is the load on both ends of pgbouncer? High? Low? No activity? A lot of 
context switching perhaps?

I think that a pool_size of 120 is very high, unless you have an XXL database 
server. Values that are too high will actually lower performance.

Also, I think it would be of interest to know the values you are using for:

default_pool_size, min_pool_size and reserve_pool_size

Most important of all is to know what the active connections are actually 
doing. Are they all active? Did somebody perhaps forget to close the 
connection to the db in your application, so that connections are only waiting 
to be timed out?
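A quick way to check that on the PostgreSQL side (standard pg_stat_activity 
columns on Postgres 10):

SELECT state, count(*) AS connections,
       max(now() - state_change) AS longest_in_state
FROM pg_stat_activity
GROUP BY state;

Many long-lived 'idle' or 'idle in transaction' entries usually point to the 
application not releasing connections properly.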

About 'DISCARD ALL', that is the default 'server_reset_query'.

regards,

fabio pardi


On 10/10/18 17:03, Nicola Contu wrote:
> Hello,
> we are running pgbouncer 1.9.1 connected to postgres 10.5
>
> Sometimes we are seeing a lot of waiting connections with this query :
>
> DISCARD ALL
>
> This is our pgbouncer config :
>
> [databases]
> dev = host=10.10.10.1 port=5432 dbname=dev auth_user=pgbouncer pool_size=120
>
> [pgbouncer]
> listen_port = 6543
> listen_addr = *
> auth_type = md5
> auth_file = /etc/pgbouncer/users.txt
> auth_query = select uname,phash from user_lookup($1)
> logfile = /var/log/pgbouncer.log
> pidfile = /home/postgres/pgbouncer.pid
> admin_users = admin
> user = postgres
> max_db_connections = 220
> log_connections = 0
> log_disconnections = 0
>
>
> Do you think this can depend on the server_idle_timeout default config value?
>
> Thanks a lot,
> Nicola



Re: Pgbouncer discard all

2018-10-18 Thread Fabio Pardi
@Nicola, while I'm glad you got further help, I was asking myself if you maybe 
missed my answer to your question (mail from 10/10/2018 at 15:31)?

https://www.postgresql.org/message-id/a22f8385-2a49-30a7-b1d2-fc743c2f3245%40portavita.eu

regards,

fabio pardi



On 16/10/2018 14:12, Nicola Contu wrote:
> Hello,
> is this normal? can anyone help?
>
> Thanks a lot for your help in advance.
>
> Nicola
>
> Il giorno mer 10 ott 2018 alle ore 17:03 Nicola Contu  <mailto:nicola.co...@gmail.com>> ha scritto:
>
> Hello,
> we are running pgbouncer 1.9.1 connected to postgres 10.5
>
> Sometimes we are seeing a lot of waiting connections with this query :
>
> DISCARD ALL
>
> This is our pgbouncer config :
>
> [databases]
> dev = host=10.10.10.1 port=5432 dbname=dev auth_user=pgbouncer 
> pool_size=120
>
> [pgbouncer]
> listen_port = 6543
> listen_addr = *
> auth_type = md5
> auth_file = /etc/pgbouncer/users.txt
> auth_query = select uname,phash from user_lookup($1)
> logfile = /var/log/pgbouncer.log
> pidfile = /home/postgres/pgbouncer.pid
> admin_users = admin
> user = postgres
> max_db_connections = 220
> log_connections = 0
> log_disconnections = 0
>
>
> Do you think this can depend on the server_idle_timeout default config 
> value?
>
> Thanks a lot,
> Nicola
>



Re: help understanding pgbench results

2019-07-15 Thread Fabio Pardi
Hi Luca


(I tried to reproduce your tests, but I got similar results across different 
checkpoint_completion_target values.)

The rest is in line here below:

On 12/07/2019 12:04, Luca Ferrari wrote:

> 
> shared_buffers = 1 GB
> checkpoint_timeout = 5 min
> 
> I've created a pgbench database as follows (around 4.5 GB):
> % pgbench -i -s 300 -F 100 --foreign-keys --unlogged-tables -h
> 127.0.0.1 -U luca pgbench
> 
> and I've tested three times (each time after a restart) with the following:
> % pgbench  -T 600  -j 4 -c 4  -h 127.0.0.1 -U luca -P 60  pgbench
> 
> 
> Since tables are unlogged, I was expecting no much difference in
> setting checkpoint_completion_target, but I got (average results):
> - checkpoint_completion_target = 0.1  ==> 755 tps
> - checkpoint_completation_target = 0.5 ==> 767 tps
> - checkpoint_completion_target = 0.9 ==> 681 tps

Unlogged tables are not written to WAL, therefore checkpoints do not fit into 
the picture (unless something else is writing data).

> 
> so while there is not a big different in the first two cases, it seems
> throttling I/O reduces the tps, and I don't get why. Please note that
> there is some small activity while benchmarking, and that's why I ran
> at least three tests for each setting.

It is not a good idea to have anything running in the background.

Also, it is always a good idea to run tests multiple times, and I think that 3 
is the bare minimum.
You want to make sure your tests are as reliable as possible, meaning they give 
similar results between runs; therefore you might post all the results, not 
only the averages, so people can give their own interpretation of the data.


Back to your question: your tests run for 10 minutes, and checkpoints happen 
every 5, so we should expect to see 2 checkpoints per test, which might 
influence your results. How long a checkpoint is spread over time is given by 
checkpoint_completion_target.

Assuming that the 'background activity' writes data, a 
checkpoint_completion_target of 0.9 means that when your test starts, the system 
might still be busy writing data from the previous checkpoint (which started 
before your pgbench test was launched). That is less likely to happen with a 
value of 0.1.



Maybe looking at the graphs (CPU, disk) of your server might point to something.
Also, the postgres logs should be able to tell you more, e.g. when a checkpoint 
starts and finishes, and how much it wrote.
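To get that information into the logs, checkpoint logging has to be enabled (it 
is off by default); the cumulative counters in pg_stat_bgwriter can also help:

# postgresql.conf, a reload is enough
log_checkpoints = on

-- cumulative checkpoint activity since the last stats reset
SELECT checkpoints_timed, checkpoints_req, buffers_checkpoint
FROM pg_stat_bgwriter;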


I hope I gave you enough inputs to better understand what is going on.

regards,

fabio pardi




Re: help understanding pgbench results

2019-07-15 Thread Fabio Pardi



On 15/07/2019 15:14, Luca Ferrari wrote:

>> Assuming that the 'background activity' writes data, a value of 
>> (checkpoint_completion_target) 0.9 means that when your test starts, the 
>> system might be still busy in writing data from the previous checkpoint 
>> (which started before your pgbench test was launched). That is less likely 
>> to happen with a value of 0.1
> 
> Uhm...but in the logged table tests a value of 0.9 increases the tps,
> that as far as I understand is in contrast with what you are stating.

What I stated is valid for unlogged tables (a background checkpoint makes your 
pgbench results 'dirty').

When you talk about logged tables, you actually want to spread the checkpoint 
over time: the more it is spread, the better the performance. But here, 
probably, the checkpoint has a lot to write compared to the data produced by 
the background job (and maybe checkpoints are happening more frequently?).


> 
> Anyway, I'll test more and report back some more results.

good, let us know and do not forget to provide the log lines produced by the 
checkpoints too.

regards,

fabio pardi




Re:

2019-08-28 Thread Fabio Pardi
Hi,


if you have access to the OS, then you have plenty of options (nohup, screen, 
cron, ...).

Otherwise, I think pg_cron might do the job.
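A sketch with pg_cron, assuming the extension is installed and loaded via 
shared_preload_libraries (the schedule and table name are only examples):

-- run a vacuum every night at 03:00
SELECT cron.schedule('0 3 * * *', 'VACUUM my_big_table');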


regards,

fabio pardi



On 28/08/2019 08:58, Sonam Sharma wrote:
> Is there any option to run reindex or vaccum in background?
> Every time the session gets logged off in between.




Re: Calculate hardware requirements

2020-06-04 Thread Fabio Pardi
Hi,

in my opinion your question is too generic to get an accurate answer. Educating 
yourself by reading the Postgres docs or some good books would, in my opinion, 
be the best way to answer your own question. Then you can still post to the ML 
about some specific setting (the postgres-performance ML is the best place).

Much of the requirements depends on the expected load on the database and what 
kind of usage you will do, such as OLTP or DWH/BI. Also the database size is 
important to fit in the picture.

As a rule of thumb, you want all your installations to be identical in terms of 
hardware specs. The CPU should be able to serve your queries and your clients, 
so you must have enough cores to serve the expected number of connections 
without degrading performance.

About RAM, the more the better; if you have enough to fit your db (or the part 
of your db you actually use) in RAM, you will probably avoid many disk 
performance problems.

Do not forget disks, RAID controllers, networking, SLA, SLO, HA, DR..

OT: I would use a newer Postgres than 9.6 if I were you, unless you have good 
reasons to stick with 9.6.


regards,

fabio pardi




On 04/06/2020 11:36, Praveen Kumar K S wrote:
> Hello All,
>
> I'm looking for hardware configurations to set up 1 master and 2 hot-standby 
> slaves using 9.6 in one DC. Also, I need to have DR with the same setup with 
> cross-site replication enabled. I went through a lot of docs/blogs suggesting 
> 4cores and at least 4/8GB RAM. But I'm looking for help on how exactly one 
> can justify the hardware requirements, like a formula ? Please advise. 
>
> Regards,
> PK



Re: n_distinct off by a factor of 1000

2020-06-23 Thread Fabio Pardi

On 23/06/2020 14:42, Klaudie Willis wrote:
> I got my first hint of why this problem occurs when I looked at the 
> statistics.  For the column in question, "instrument_ref" the statistics 
> claimed it to be:
>
> The default_statistics_target=500, and analyze has been run.
> select * from pg_stats where attname like 'instr%_ref'; -- Result: *40.000*
> select count(distinct instrumentid_ref) from bigtable -- Result: *33 385 922 
> (!!)*
>
> That is an astonishing difference of almost a 1000X. 
>

I think you are counting 2 different things here.

The first query returns the statistics rows for all the columns "like 
'instr%_ref'" present in the whole database (not just bigtable), while the 
second counts the actual number of distinct values in bigtable.
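To look at the planner's estimate for just that column of that table, you can 
filter pg_stats explicitly, e.g.:

SELECT tablename, attname, n_distinct
FROM pg_stats
WHERE tablename = 'bigtable'
  AND attname LIKE 'instr%_ref';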


regards,

fabio pardi


Re: Network performance optimization

2020-09-09 Thread Fabio Pardi
Hi,

using a connection pooler usually helps in such cases. It will not reduce 
latency, but it will mitigate the problem when the app can benefit from 
recycling connections.

regards,

fabio pardi


On 09/09/2020 06:59, J . Mageshwaran wrote:
> Hi Team, I am performing some benchmark testing with application in aws east 
> 2 and dB in East 1. I am using pgbench for benchmark, the RTT Latency between 
> East 1 to east2 is some where between 12 to 16 ms on an average. Is there any 
> optimization that can be done to reduce the latency and increase the 
> throughout to the PostgreSQL layer?



Re: Doubt in pgbouncer

2020-10-01 Thread Fabio Pardi
Hi Rama,

On 02/10/2020 01:42, Rama Krishnan wrote:
> Hi Friends, 
>
> By using pg bouncer can we split read and queries
>

pgbouncer is just a connection pooler.

The logic of where to send the reads and where to send the writes should live 
in your application.

regards,

fabio pardi


Re: Why is Postgres only using 8 cores for partitioned count? [Parallel Append]

2021-02-15 Thread Fabio Pardi



On 14/02/2021 22:16, Gavin Flower wrote:
> On 14/02/2021 22:47, David Rowley wrote:
>> On Sun, 14 Feb 2021 at 13:15, Seamus Abshere
>>  wrote:
>>> The comment from Robert says: (src/backend/optimizer/path/allpaths.c)
>>>
>>>  /*
>>>   * If the use of parallel append is permitted, always 
>>> request at least
>>>   * log2(# of children) workers.
>>>
>>> In my case, every partition takes 1 second to scan, I have 64 cores, I have 
>>> 64 partitions, and the wall time is 8 seconds with 8 workers.
>>>
>>> I assume that if it it planned significantly more workers (16? 32? even 
>>> 64?), it would get significantly faster (even accounting for transaction 
>>> cost). So why doesn't it ask for more? Note that I've set 
>>> max_parallel_workers=512, etc. (postgresql.conf in my first message).
>> There's perhaps an argument for allowing ALTER TABLE > table> SET (parallel_workers=N); to be set on partitioned tables, but
>> we don't currently allow it.
> [...]
>> David
>
> Just wondering why there is a hard coded limit.
>
> While I agree it might be good to be able specify the number of workers, sure 
> it would be possible to derive a suitable default based on the number of 
> effective processors available?
>


I had the same problem and my conclusion was that it is not possible to go 
above 8 cores because of Amdahl's law on parallel computing. More here: 
https://en.wikipedia.org/wiki/Amdahl%27s_law

regards,

fabio pardi





Re: Idle connections / sessions

2018-12-12 Thread Fabio Pardi
Hi Josef,

please avoid cross posting to multiple lists.

I'm not a developer, but I think that if you do not want idle connections, you 
should terminate them on the side where they were created.

If your application leaves connections open, then you will see them as idle on 
the Postgres side while they are not in use.
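If a server-side stopgap is still wanted (the periodic cron job mentioned in 
the question), it could look roughly like the sketch below, although closing 
connections on the client side remains the cleaner solution:

-- terminate client backends that have been idle for more than one hour
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE state = 'idle'
  AND state_change < now() - interval '1 hour'
  AND pid <> pg_backend_pid();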

regards,

fabio pardi


On 12/12/2018 10:37, Oygun Josef wrote:
>
> Hi,
>
>  
>
> Is it possible to terminate idle connections/sessions automatically through a 
> timeout in AWS or do I need to run a periodical cron job for this?
>
>  
>
> Postgres version: 9.6.6
>
> Instance: db.t2.micro
>
> RAM : 1GB
>
>  
>
> We are running a microservice architecture using docker with kubernetes and I 
> can see that every pod on every node that has connected to the DB still has a 
> idle connection as long as the node is still active even.
>
>  
>
> It is both PGAdmin and the PostgreSQL JDBC Driver that leaves open idle 
> connections.
>
>  
>
>  
>
> *Josef Oygun*
>
>  
>



Re: Is there something wrong with my test case?

2019-01-07 Thread Fabio Pardi
Hi Thiemo,


On 07/01/2019 11:30, Thiemo Kellner wrote:
>
> Hi HP
>
> Thanks for your reply.
>
> Quoting "Peter J. Holzer" :
>
>> On 2018-12-25 11:54:11 +, Thiemo Kellner wrote:
>> [three different but functionally equivalent queries]
>>
>>> Explain analyze verbose showed for:
>>> A (cost=264.72..626.97 rows=31 width=90) (actual time=1.117..1.117 rows=0
>>> loops=1)
>>> C (cost=0.42..611.19 rows=31 width=52) (actual time=2.217..2.217 rows=0
>>> loops=1)
>>
>> 626.97 doesn't seem "much higher" to me than 611.19. I would call that
>> "about the same".
>>
>
> So would I, but the cost is given as a range. Taking the average, somewhat 
> 400 compared to somewhat 300. I do not know whether averaging is appropriate 
> here.


The cost is not a range. The 2 numbers you see are:

  * Estimated start-up cost. This is the time expended before the output phase 
    can begin, e.g., time to do the sorting in a sort node.

  * Estimated total cost. This is stated on the assumption that the plan node 
    is run to completion, i.e., all available rows are retrieved. In practice a 
    node's parent node might stop short of reading all available rows (see the 
    LIMIT example in the documentation).

As you can read here:

https://www.postgresql.org/docs/current/using-explain.html


regards,

fabio pardi




Re: Read consistency when using synchronous_commit=off

2019-01-15 Thread Fabio Pardi
Hi,

all clients will get the latest version of the row (from RAM, that is). The 
only thing is that in case of server crash, not-yet-written-to-disk commits 
will be lost.

detailed explanation can be found here:

https://www.postgresql.org/docs/current/wal-async-commit.html

regards,

fabio pardi 


On 15/01/2019 11:58, pshadangi wrote:
> To improve commit performance we are planning to use 
> "synchronous_commit=off", with this if multiple clients are reading the same 
> data/row will they always get the latest updated data/row ? (clients are 
> using committed read and we are not using clustered environment, we have just 
> one instance of postgres serving local clients running on the same machine).
> For example if client1 updates a row then the updated value is available to 
> client2 immediately after the commit or there is a delay as commit is now 
> asynchronous ?



Re: Read consistency when using synchronous_commit=off

2019-01-15 Thread Fabio Pardi
Hi,

On second thought, I have to reply to myself, since I'm not entirely sure of 
my previous answer. (I'm digging into the docs, but I do not want to mislead 
you in the meantime.)

If I recall correctly, written data is parked in the WAL buffers before being 
synced to disk (to the transaction log).

I'm not sure other clients are able to read from the WAL buffers, therefore I'm 
not sure the data is available to other clients at that specific point in time.

Maybe somebody else on the ML knows the details by heart?

regards,

fabio pardi



On 15/01/2019 12:15, Fabio Pardi wrote:
> Hi,
>
> all clients will get the latest version of the row (from RAM, that is). The 
> only thing is that in case of server crash, not-yet-written-to-disk commits 
> will be lost.
>
> detailed explanation can be found here:
>
> https://www.postgresql.org/docs/current/wal-async-commit.html
>
> regards,
>
> fabio pardi 
>
>
> On 15/01/2019 11:58, pshadangi wrote:
>> To improve commit performance we are planning to use 
>> "synchronous_commit=off", with this if multiple clients are reading the same 
>> data/row will they always get the latest updated data/row ? (clients are 
>> using committed read and we are not using clustered environment, we have 
>> just one instance of postgres serving local clients running on the same 
>> machine).
>> For example if client1 updates a row then the updated value is available to 
>> client2 immediately after the commit or there is a delay as commit is now 
>> asynchronous ?
>



Re: Postgres Automated Failover

2019-01-17 Thread Fabio Pardi
Hi,

In my opinion repmgr is worth a look. 

https://repmgr.org/

regards,

fabio pardi


On 17/01/2019 14:32, AI Rumman wrote:
> Hi,
> 
> I am planning to use Postgresql with TimescaleDb extension. I have to design 
> a system similar to AWS RDS which supports automated failover, transparent 
> minor version upgrades etc.
> In early days, I worked with PgPool II to enable heartbeat between the 
> Postgres servers. Is there any new tool to achieve the same by which I can 
> design a transparent failover system for Postgres?
> 
> Also is there any multi-master Postgresql solution? Is Postgres-XC production 
> ready?
> 
> Any suggestions is appreciated. 
> 
> Thanks.



Re: pgbouncer

2019-01-17 Thread Fabio Pardi
David,

Are you sure? The behaviour I experienced is different from what you described.

On 17/01/2019 16:32, David G. Johnston wrote:
> On Thu, Jan 17, 2019 at 8:18 AM Nicola Contu  wrote:
>>
>> Hello,
>> I am a bit confused about the settings in pgbouncer
>>
>> What's exactly the pool_size?
> 
> Roughly, the number of open connections pgbouncer will keep to PostgreSQL.
> 
>> If I set 3, and I tried to connect from 4 shells, I am still able to connect.
> 
> That would be the point - pgbouncer is sharing the 3 connections it
> keeps with PostgreSQL between the 4 client connections made to it.


pgbouncer will spawn connections until it reaches the maximum allowed. 
Connections go 1:1 between pgbouncer and the client, and are not 'shared'.

pool_size is the size of the pool that pgbouncer keeps open for you. If not 
set, default_pool_size is used.


If you check the open connections to your host, you can verify it:

netstat --tcp -n | grep ...

will show ESTABLISHED connections = pool_size.


Client connections will be accepted by pgbouncer until pool_size is reached.

After that, pgbouncer can spawn more connections only if reserve_pool_size 
allows it, and only after a wait of reserve_pool_timeout.

When your ceiling is reached (meaning pool_size + reserve_pool_size connections 
are open), client connections will enter a queue.

Entering the queue does not mean being serviced by the database; it means 
waiting your turn. If you connect manually, you will notice an open connection 
(the socket is opened) but no database shell.


regards,

fabio pardi

> 
>> Same thing for max_db_connections. I set this to 1 and I am able to connect 
>> from 2 shells.
> 
> Same as pool_size but basically a fail-safe since pools are
> per-user/per-database while this limit is per-database only.
> 
>> This is kind of confusing and I'm not really cleared reading the 
>> documentation.
> 
> For each setting you need to understand whether it limits
> pgbouncer->PostgreSQL or client->pgbouncer
> 
> Configurations in [databases] limit the former; [pgbouncer] options
> either provide defaults for the [databases] or limit clients.
> 
>> [databases]
>> cmdwela = host=127.0.0.1 port=5432 dbname=cmdwela auth_user=pgbouncer 
>> pool_size=120
>> cicciopasticcio = host=127.0.0.1 port=5432 dbname=cicciopasticcio 
>> auth_user=pgbouncer pool_size=2
>>
>> [pgbouncer]
>> listen_port = 6543
>> listen_addr = *
>> auth_type = md5
>> auth_file = /etc/pgbouncer/users.txt
>> auth_query = select uname,phash from user_lookup($1)
>> logfile = /var/log/pgbouncer.log
>> pidfile = /home/postgres/pgbouncer.pid
>> admin_users = admin
>> user=postgres
> 
>> max_db_connections=1
> So one open connection max per database/user pool but it is shared -
> i.e., actively executing queries running in parallel are limited to
> this number.
> 
>> max_client_conn=5
>> I just want to limit connections from the app etc.
> 
> That limit is 5
> 
> David J.
> 



Re: pgbouncer

2019-01-18 Thread Fabio Pardi
Nicola,

My previous mail was maybe misleading, because when I mentioned
'connections' I actually meant active connections to the db, as in:
doing a transaction.
In that case, yes, the connections are shared. But as soon as they
initiate a transaction, they are not shared any longer and a new
pgbouncer connection is spawned to the database. Alternatively, the client
cannot be served and has to wait.

I hope my example reported here below will clarify the situation:


---
pgbouncer setting:

test_db   = host=... dbname=... user=... pool_size=2

[pgbouncer]
pool_mode = session


max_client_conn = 10
default_pool_size = 5
min_pool_size = 4
reserve_pool_size = 3
---

With above configuration, this is what 'show pools' on pgbouncer shows
when clients are not using pgbouncer:

database   | test_db
user   | xx
cl_active  | 0
cl_waiting | 0
sv_active  | 0
sv_idle| 0
sv_used| 2
sv_tested  | 0
sv_login   | 0
maxwait| 0
maxwait_us | 0
pool_mode  | session

2 servers used, as specified by pool_size

means: 2 TCP connections open to the db

 netstat --tcp -n | grep 5432 | grep EST | wc -l
2


***

if you remove pool_size from the database string, then config becomes:

test_db   = host=... dbname=... user=...

---
[pgbouncer]
pool_mode = session


max_client_conn = 10
default_pool_size = 5
min_pool_size = 4
reserve_pool_size = 3
---

therefore:

database   | test_db
user   | xx
cl_active  | 0
cl_waiting | 0
sv_active  | 0
sv_idle| 0
sv_used| 4
sv_tested  | 0
sv_login   | 0
maxwait| 0
maxwait_us | 0
pool_mode  | session



netstat --tcp -n |grep EST | grep 5432 | wc -l
4


What we learned: pool_size overrides min_pool_size.

***


Now, I'm restoring pool_size to 2 as we had in the beginning, and
checking how many connections we can open to pgbouncer.


We now have 2 connections always available.

I therefore open 2 connections to the database. I only connect, and
leave them idle. This is confirmed by the following:

-[ RECORD 2 ]-
database   | test_db
user   | xx
cl_active  | 2
cl_waiting | 0
sv_active  | 0
sv_idle| 0
sv_used| 2
sv_tested  | 0
sv_login   | 0
maxwait| 0
maxwait_us | 0
pool_mode  | session


So, now, 2 server connections are open. If I open one more client connection, 
then you are sharing the connections, as David mentioned.

-[ RECORD 2 ]-
database   | test_db
user   | xx
cl_active  | 3
cl_waiting | 0
sv_active  | 0
sv_idle| 0
sv_used| 2
sv_tested  | 0
sv_login   | 0
maxwait| 0
maxwait_us | 0
pool_mode  | session

We can see 3 active clients, and 2 connections to the db,

and netstat will count only 2 TCP connections.


If instead you ask your clients to initiate a database transaction, for
instance by typing 'BEGIN;', then you will actually see what I meant in my
previous mail:

database   | test_db
user   | xx
cl_active  | 3
cl_waiting | 0
sv_active  | 3
sv_idle| 0
sv_used| 0
sv_tested  | 0
sv_login   | 0
maxwait| 0
maxwait_us | 0
pool_mode  | session

and netstat will show 3 open connections to your db


How many connections can you possibly open to the database?

pool_size + reserve_pool_size = 5 in my example.

How many connections can you make to pgbouncer? A total (globally,
including to the pgbouncer admin db) of max_client_conn.

That means 'cl_active' can go up to 10 in my example, but as soon as
the 6th client initiates a transaction, it cannot be served and has to
wait for a connection to be freed.


regards,

fabio pardi





On 1/17/19 5:15 PM, Nicola Contu wrote:
>> If I set 3, and I tried to connect from 4 shells, I am still able to connect.
> 
> That would be the point - pgbouncer is sharing the 3 connections it
> keeps with PostgreSQL between the 4 client connections made to it.
> 
> Mmh, my pool_mode is per session. The 4 sessions were active, not doing
> any query, but connected to the shell.
> So that's what my doubt 
> 
> Il giorno gio 17 gen 2019 alle ore 16:32 David G. Johnston
> mailto:david.g.johns...@gmail.com>> ha scritto:
> 
> On Thu, Jan 17, 2019 at 8:18 AM Nicola Contu  <mailto:nicola.co...@gmail.com>> wrote:
> >
> > Hello,
> > I am a bit confused about the settings in pgbouncer
> >
> > What's exactly the pool_size?
> 
> Roughly, the number of open connections pgbouncer will keep to
> PostgreSQL.
> 
> > If I set 3, and I tried to connect from 4 shells, I am still able
> to connect.
> 
> That would be the point - pgbouncer is sharing the 3 connections it
> keeps with PostgreSQL between the 4 client connections made to it.
> 
> > Same thing for max_db_connections. I set this to 1 and I am able
> to connect from 2 shells.
> 
> Same as pool_size but basically a fail-safe since pools 

Re: Transactions

2019-04-09 Thread Fabio Pardi
Hi Karl Martin,

you could set log_min_duration_statement=0 at the global level (in the config 
file) or at session level too. 
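A minimal sketch of both options (changing it for a session requires superuser 
privileges):

-- at session level
SET log_min_duration_statement = 0;   -- logs every statement together with its duration

-- or globally, without editing the file by hand
ALTER SYSTEM SET log_min_duration_statement = 0;
SELECT pg_reload_conf();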

regards,

fabio pardi




On 09/04/2019 11:26, Karl Martin Skoldebrand wrote:
> Hi,
> 
>  
> 
> Is there a way to track “transactions” by default (i.e. without anyone having 
> set up anything specific)? The problem I am facing is that users are claiming 
> that settings are disappearing without them doing anything to affect them. It 
> would be good to be able to see what postgresql thinks is going on.
> 
> /*Subscriber adds Severity/BU/Service by ticking the corresponding box in 
> subscriber configuration in WEBAPP. This works for some time./
> 
> /*Subscriber stops receiving selected [tickets]./
> 
> /*Upon checking settings the selected Severity/BU/Service has been 
> unselected./
> 
>  
> 
> /M.
> 
>  
> 
> 
> 
> 
> 
> 




Re: no matching entries in passwd file

2019-05-21 Thread Fabio Pardi
Hi Daulat,

I believe that the error message is referring to the system user, not
the database one.


docker exec --help | grep -- -u
  -u, --user string  Username or UID (format: <name|uid>[:<group|gid>])
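So the usual fix is to enter the container as an OS user that does exist in its 
passwd file, and only then connect with the database role. A sketch, assuming 
the image provides the usual 'postgres' OS user and 'mydb' stands in for your 
database name:

docker exec -it -u postgres b8e7ejb1e31d bash
psql -U test -d mydb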



regards,

fabio pardi


On 5/21/19 2:42 PM, Daulat Ram wrote:
> Hello team,
> 
>  
> 
> I have  database & users created inside the docker but we are getting
> connection issue while trying to connect to database using user created
> in postgres.
> 
>  
> 
> docker exec -it -u test b8e7ejb1e31d bash
> 
> unable to find user test: no matching entries in passwd file
> 
> Regards,
> 
> Daulat
> 
>  
> 




Re: One way replication in PostgreSQL

2019-06-03 Thread Fabio Pardi
Hi,


On 6/3/19 6:00 PM, PALAYRET Jacques wrote:
> Hello,
> 
> If, for security reasons, I can't create a connection or a flow from
> subscriber/secondary/slave towards provider/primary/master, which
> replication systems can I use?
> 

To perform replication, you need some form of connectivity between the
hosts (unless you only want to apply archived WAL files, as mentioned
by Stephen here above).
In streaming replication, your replica needs to be able to initiate the
connection to the master.


If you instead have, let's say, connectivity from master -> replica, or both
your hosts can reach a 3rd host, you might work around the problem using an
SSH tunnel.

I would discourage this latter option, anyway.

> If possible, I would prefer partial replication (only some tables) to
> full base replication (all instances).

you could have a look at pglogical

regards,

fabio pardi




Re: Can PostgreSQL create new WAL files instead of reusing old ones?

2018-04-27 Thread Fabio Ugo Venchiarutti
I was wondering the same. The WAL writer is overwriting the same inodes 
again and again, so block COW logic should only happen once: at allocation.


I'm no expert: does XFS track COW based on path (ugh?)


Maybe I'm crazy but here's a possible workaround if the problem is 
effectively at that level: OP could use the archive_command to 
deliberately allocate a new segment and switch the old one with it 
before returning zero to the archiver.


The WAL writer will then recycle what it thinks is the same inode, without 
realising it is actually your impostor.



I'm rather confident this should work ok, but you may want to make sure 
with the hackers that no file descriptors are open on a ready-to-archive 
segment while you shuffle things around in your command (or that there is no 
other weird implication I'm missing).
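For illustration only, a rough sketch of such an archive_command wrapper 
(untested, the paths are placeholders, and the file-descriptor caveat above 
applies in full):

#!/bin/sh
# called as: archive_command = '/usr/local/bin/archive_wal.sh %p %f'
set -e
cp "$1" /archive/"$2"      # the actual archiving step
cp "$1" "$1.fresh"         # same bytes, but a freshly allocated inode
mv "$1.fresh" "$1"         # atomically swap the impostor into place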






On 27/04/18 17:28, Bruce Momjian wrote:

On Wed, Apr 18, 2018 at 09:34:50AM -0400, Vick Khera wrote:

On Wed, Apr 18, 2018 at 12:35 AM, Michael Paquier  wrote:


 That looks like a rather difficult problem to solve in PostgreSQL
 itself, as the operator running the cluster is in charge of setting up
 the FS options which would control the COW behavior, so it seems to me


You cannot turn off CoW on ZFS. What other behavior would you refer to here?

I suppose one could make a dedicated data set for the WAL and have ZFS make a
reservation for about 2x the total expected WAL size. It would require careful
attention to detail if you increase WAL segments configuration, though, and if
you had any kind of hiccup with streaming replication that caused the segments
to stick around longer than expected (but that's no different from any other
file system).


Uh, at the risk of asking an obvious question, why is the WAL file COW
if it was renamed?  No one has the old WAL file open, as far as I know.



--
Regards

Fabio Ugo Venchiarutti
Data Services Department
Ocado Technology





Re: How do I get the SQL statement in a trigger?

2018-05-10 Thread Fabio Ugo Venchiarutti
Querying pg_stat_activity with filter clauses from inside a trigger can be 
quite expensive if your write transaction rate is high.



You may want to look into the current_query() function documented at 
https://www.postgresql.org/docs/current/static/functions-info.html
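A minimal sketch of how that could be used to record the statement that touched 
a row (the audit table, function and trigger names are made up for the example):

CREATE TABLE audit_log (
    logged_at  timestamptz DEFAULT now(),
    table_name text,
    query_text text
);

CREATE OR REPLACE FUNCTION log_statement() RETURNS trigger AS $$
BEGIN
    INSERT INTO audit_log (table_name, query_text)
    VALUES (TG_TABLE_NAME, current_query());
    RETURN NULL;  -- the return value is ignored for AFTER triggers
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER my_table_audit
    AFTER INSERT OR UPDATE OR DELETE ON my_table
    FOR EACH ROW EXECUTE PROCEDURE log_statement();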









On 10/05/18 11:44, Pavel Stehule wrote:



2018-05-10 12:42 GMT+02:00 a <372660...@qq.com <mailto:372660...@qq.com>>:

Thanks a lot, would you please be so kind to tell me more about what
is top level query??


for example - if you run some queries from PLpgSQL functions (triggers), 
then these queries are not top queries.


Regards

Pavel



-- Original message --
*From:* "Pavel Stehule"__;
*Sendtime:* Thursday, May 10, 2018 6:38 PM
*To:* "a"<372660...@qq.com <mailto:372660...@qq.com>>;
*Cc:* "pgsql-general"__;
*Subject:* Re: How do I get the SQL statement in a trigger?

Hi

2018-05-10 12:23 GMT+02:00 a <372660...@qq.com
<mailto:372660...@qq.com>>:

Hi I would like to write a trigger that recorded every sql
statement under the effected entry. So if it is possible to
retrieve the sql statement within a trigger??


You can read the top-level command from the pg_stat_activity table

  postgres=# select query from pg_stat_activity where pid =
pg_backend_pid();
+-------------------------------------------------------------------+
|                               query                                |
+-------------------------------------------------------------------+
| select query from pg_stat_activity where pid = pg_backend_pid();   |
+-------------------------------------------------------------------+
(1 row)

But this information is available only for top level query.

Regards

Pavel




--
Regards

Fabio Ugo Venchiarutti
Data Services Department
Ocado Technology





Re: Postgres Point in time Recovery (PITR),

2019-10-21 Thread Fabio Ugo Venchiarutti

On 21/10/2019 09:52, Luca Ferrari wrote:

On Sat, Oct 19, 2019 at 7:46 PM Daulat Ram  wrote:

One more questions is, how backups are useful if we have streaming replication 
. As I know, we can promote the standby as primary in case of disaster at 
primary side. Do we need to schedule backups if we have streaming replication?


Let's speculate a little on that: do you need backups if you have a
RAID-1 configuration?
Replication helps you reduce almost to zero the time to handle a
disaster, backups allow you to recover in a more large time window.

Luca





TBH I hear this argument more often than I wish.


Offline backups and data replication are nearly entirely orthogonal.


Any form of instantaneous redundancy (RAID, instantaneous replication 
and so on) is primarily a mitigation measure to protect data & 
availability against loss of infrastructure.



Backups (preferably with PITR) also do that, but that's not their 
primary purpose unless you can't afford live redundancy on top of them.



Offline backups address many failure scenarios that any form of live 
replication is defenseless against (eg: logical damage to the data as a 
result of human errors/bugs/vandalism would hit all your replicas, but 
you always can perform DR from a backup).



Delayed replicas, or any online rollback capability (such as pg_rewind 
off the server's own pg_wal or Oracle's flashback), somewhat live in a 
grey area in between, and their effectiveness varies depending on which 
level was compromised.






--
Regards

Fabio Ugo Venchiarutti
OSPCFC Network Engineering Dpt.
Ocado Technology






Re: Postgres Point in time Recovery (PITR),

2019-10-21 Thread Fabio Ugo Venchiarutti

On 21/10/2019 12:10, Avinash Kumar wrote:



On Mon, Oct 21, 2019 at 4:19 PM Fabio Ugo Venchiarutti 
mailto:f.venchiaru...@ocado.com>> wrote:


On 21/10/2019 09:52, Luca Ferrari wrote:
 > On Sat, Oct 19, 2019 at 7:46 PM Daulat Ram
mailto:daulat@exponential.com>> wrote:
 >> One more questions is, how backups are useful if we have
streaming replication . As I know, we can promote the standby as
primary in case of disaster at primary side. Do we need to schedule
backups if we have streaming replication?
 >
 > Let's speculate a little on that: do you need backups if you have a
 > RAID-1 configuration?
 > Replication helps you reduce almost to zero the time to handle a
 > disaster, backups allow you to recover in a more large time window.
 >
 > Luca
 >
 >


TBH I hear this argument more often than I wish.


Offline backups and data replication are nearly entirely orthogonal. 




Any form of instantaneous redundancy (RAID, instantaneous replication
and so on) primary is a mitigation measure to protect data &
availability against loss of infrastructure.


Backups (preferably with PITR) also do that, but that's not their
primary purpose unless you can't afford live redundancy on top of them.


Offline backups address many failure scenarios that any form of live
replication is defenseless against (eg: logical damage to the data as a
result of human errors/bugs/vandalism would hit all your replicas, but
you always can perform DR from a backup).

I think we are way ahead of the days where we need to live on Offline 
backups involving downtimes.
Today, we have several Online Consistent Backup solutions such as 
pg_basebackup (built-in with PG), pgBackRest, BARMAN, WAL-G etc 

Online Consistent Backups + Archiving of WALs are perfectly reliable.
We need to ensure that we have safe backup locations, for example, push 
them to AWS S3 and forget about redundancy.

Why do you think only Offline Backups are reliable today ?



Delayed replicas, or any online rollback capability (such as pg_rewind
off the server's own pg_wal or Oracle's flashback), somewhat live in a
grey area in between, and their effectiveness varies depending on which
level was compromised.

What if you delay your replica by a day, before you making a change 
tomorrow (that may need a rollback). Delayed Replica is for emergencies 
IMO and of course does not satisfy every possible scenario.





Trade-offs come down to individual implementers, so fair enough.


Our biases must come from different use cases/experience: a good 
fraction of recovery/sanitisation operations I had to carry out were 
very selective/application specific and involved buildup of anomalies 
over time, sometimes months.



That wouldn't have been possible without an old frozen reference for users to 
compare with, and that had to come from very cold storage.





--
Regards

Fabio Ugo Venchiarutti
OSPCFC Network Engineering Dpt.
Ocado Technology






Re: Is this a bug ?

2019-10-23 Thread Fabio Ugo Venchiarutti

On 23/10/2019 16:55, Ron wrote:

On 10/23/19 10:51 AM, Geoff Winkless wrote:

On Wed, 23 Oct 2019 at 16:42, Ravi Krishna  wrote:


select count(*) from bugtest where fld1 in ('a','b','c'
 'd','e');

Note the missing comma after 'c'.

PG takes it a syntactically right SQL and gives 3 as output.

In SQLServer it errors out SQL Error [102] [S0001]: Incorrect syntax 
near 'd'.


Can't believe this bug was never found before.  We ended up wasting 
lot of time to figure this out.

Simplify:

select 'a'
db-# 'b';
  ?column?
--
  ab
(1 row)

This is not a bug.

https://www.postgresql.org/docs/9.2/sql-syntax-lexical.html

Two string constants that are only separated by whitespace with at
least one newline are concatenated and effectively treated as if the
string had been written as one constant.


Then -- since the 'e' is separated from 'd' by a comma, the result 
should be "4", not "3".


No doubt: it's a bug, no matter what the Pg devs say.




Does any record in your test table contain the concatenated 'cd' in 
"fld1"? If not, it's working as per specification & documentation.



I can agree that this sounds like one of those idiosyncratic parts of 
the ANSI specification that exist only for compatibility with some 
ancient practice, but this behaviour is documented.
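
A self-contained way to see where the 3 came from (a VALUES list stands in 
for your bugtest table here):

-- the missing comma turns 'c' and 'd' into one literal, so the IN list is ('a','b','cd','e')
SELECT x,
       x IN ('a','b','c'
             'd','e') AS matches
FROM (VALUES ('a'),('b'),('c'),('d'),('e')) AS t(x);
-- only 'a', 'b' and 'e' match, hence count(*) = 3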



BTW, can parser behavior be affected by settings this early?


A GUC setting disabling this madness would make sense IMHO...




--
Regards

Fabio Ugo Venchiarutti
OSPCFC Network Engineering Dpt.
Ocado Technology






Re: Is this a bug ?

2019-10-23 Thread Fabio Ugo Venchiarutti

On 23/10/2019 17:30, Ron wrote:

On 10/23/19 11:27 AM, Geoff Winkless wrote:

On Wed, 23 Oct 2019 at 17:20, Geoff Winkless  wrote:

For what it's worth, I can see a value to having

SELECT 'this is quite a long string'
    'which I've joined together '
    'across multiple lines';

although the advantage of it vs using a concat operator is slim.

As an aside, Postgres isn't the only DB to follow the standard here.

mysql> select 'hello'
 -> ' there';
+-+
| hello   |
+-+
| hello there |
+-+


This is the kind of weird stuff that we mocked mysql for.

This too would have been roundly mocked if discovered in mysql:

postgres=# select to_date('2018150X','MMDD');
   to_date

  2019-03-03
(1 row)





The main difference being that we mocked MySQL because it BROKE the 
standard requirements in bespoke, arguably lazy/stupid/dangerous ways 
(and often undocumented as a result).



You ran into a gotcha of the SQL specification; in your shoes I'd 
instead appreciate the fact that the PG maintainers went out of their 
way and documented this spec weirdness as such in their own project when 
they could have just said "look into the specification for oddities".



MySQL silently truncated your overflowing string for years without so 
much as a warning, by default. In my book that broke the specification, the 
rule of least surprise, data safety and a lot more.




--
Regards

Fabio Ugo Venchiarutti
OSPCFC Network Engineering Dpt.
Ocado Technology






Re: Automatically parsing in-line composite types

2019-10-29 Thread Fabio Ugo Venchiarutti



On 29/10/2019 12:23, Dave Cramer wrote:



On Wed, 23 Oct 2019 at 15:50, Mitar <mmi...@gmail.com> wrote:


Hi!

Bump my previous question. I find it surprising that it seems this
information is not possible to be reconstructed by the client, when
the server has to have it internally. Is this a new feature request or
am I missing something?

 > I am trying to understand how could I automatically parse an in-line
 > composite type. By in-line composite type I mean a type corresponding
 > to ROW. For example, in the following query:
 >
 > SELECT _id, body, (SELECT array_agg(ROW(comments._id, comments.body))
 > FROM comments WHERE comments.post_id=posts._id) AS comments FROM
posts
 >
 > It looks like I can figure out that "comments" is an array of
records.
 > But then there is no way really to understand how to parse those
 > records? So what are types of fields in the record?
 >
 > I start the parsing process by looking at types returned in
 > RowDescription message and then reading descriptions in pg_type
table.
 >
 > Is there some other way to get full typing information of the
result I
 > am assuming is available to PostreSQL internally?



Reading the RowDescription is the only way I am aware of.


Dave Cramer

da...@postgresintl.com
www.postgresintl.com



Perhaps I misunderstood your question, but that sounds like my average 
use-case for the object-relational type system & JSON/JSONB 
functions/types: defining nested structured types as temporary relations 
in my queries and spewing out their hierarchical JSON representation - 
often as a single big field (ironically I hate storing JSON in 
relational databases unless I'm storing something really opaque like 
dashboard layouts).



EG:

SELECT
    t.relname AS t_name,
    array_to_json(ARRAY_AGG(ats)) AS fields_json
FROM
    pg_class AS t
    INNER JOIN (
        SELECT
            ia.attrelid AS table_id,
            ia.attnum AS column_number,
            ia.attname AS column_name
        FROM
            pg_attribute AS ia
    ) AS ats
    ON
        (t.relkind = 'r')
        AND
        (t.relname IN ('pg_type', 'pg_constraint'))
        AND
        (ats.table_id = t.oid)
GROUP BY
    t.relname


You can use subqueries and array_agg() to deepen your output tree all 
the way to a stack overflow; a single array_to_json()/row_to_json() call at the 
top will recursively traverse and convert whatever you feed it.



In your case you can just emit your composite type as a JSON object or 
array thereof (types and relations are the same thing).






--
Regards

Fabio Ugo Venchiarutti
OSPCFC Network Engineering Dpt.
Ocado Technology






Re: Automatically parsing in-line composite types

2019-10-30 Thread Fabio Ugo Venchiarutti




On 30/10/2019 16:15, Mitar wrote:

Hi!

On Wed, Oct 30, 2019 at 8:37 AM Merlin Moncure  wrote:

Check out libpqtypes: https://github.com/pgagarinov/libpqtypes


Interesting. I have looked at the code a bit but I do not find how it
determines the type for inline compound types, like the ones they
appear in my original SQL query example. Could you maybe point me to
the piece of code there handling that? Because to my
understanding/exploration that information is simply not exposed to
the client in any way. :-(


it does exactly what you want. It's a wrapper for libpq that provides
client side parsing for the binary protocol with array and composite
type parsing.


It looks to me that it does parsing of composite types only if they
are registered composite types. But not for example ones you get if
you project a subset of fields from a table in a subquery. That has no
registered composite type?

Also, how you are handling discovery of registered types, do you read
that on-demand from the database? They are not provided over the wire?


Virtually any
non-C client application really ought to be using json rather than the
custom binary structures libpqtyps would provide.


I thought that initially, too, but then found out that JSON has some
heavy limitations because the implementation in PostgreSQL is standard
based. There is also no hook to do custom encoding of non-JSON values.
So binary blobs are converted in an ugly way (base64 would be better).
You also loose a lot of meta-information, because everything non-JSON
gets converted to strings automatically. Like knowing what is a date.
I think MongoDB with BSON made much more sense here. It looks like
perfect balance between simplicity of JSON structure and adding few
more useful data types.

But yes, JSON is great also because clients often have optimized JSON
readers. Which can beat any other binary serialization format. In
node.js, it is simply the fastest there is to transfer data:

https://mitar.tnode.com/post/in-nodejs-always-query-in-json-from-postgresql/


Mitar




Then perhaps, as opposed to wedging this into the tabular paradigm, a 
transition to more targeted support for hierarchical result 
representation would be preferable, just done directly by the backend and 
rendered by libpq... (perhaps still encapsulated as a DataRow field not 
to break the traditional model. Or perhaps a special RowDescription-like 
message in the backend protocol? Not my place to strongly push proposals 
there).



There's a lot of room for optimisation if done natively (think label 
deduplication at the source. Not sure if BSON works this way too).



There's also the problem of independent implementations of the 
protocol...AFAIK the JDBC client is not a wrapper to libpq and they'd 
also have to break their result surfacing paradigms to make it work...



Sounds like an enormous risk & undertaking for the hackers TBH, and I 
currently see another limiting factor to the idea's popularity: as it 
stands, advanced SQL is daunting for much of the industry, and IMHO 
the queries to generate arbitrarily structured & lightweight inline 
types/relations are relatively verbose and deeply nested (eg: last time 
I checked, stripping/renaming some attributes from a relation required 
subselecting them, and - perhaps due to PEBCAK - I can't think of a way to 
create results as associative arrays indexed by attributes).


For this to gain traction, a more streamlined set of syntax/functions/operators 
for precision work may be necessary, or the result would only satisfy a 
narrow set of users who are already intimate with the state of affairs.



Can't help thinking that the current JSON-over-field pinhole may already 
be at the sweet spot between usefulness and inter-operability with 
existing systems. Just the SQL side of it could be less noisy and, yes, 
the data type pigeonhole problem could benefit from something like a GUC 
setting to electively break standard JSON compatibility, via function 
arguments or otherwise.





--
Regards

Fabio Ugo Venchiarutti
OSPCFC Network Engineering Dpt.
Ocado Technology






Re: tcp keep alive don't work when the backend is busy

2019-12-10 Thread Fabio Ugo Venchiarutti

On 10/12/2019 15:06, Tom Lane wrote:

Олег Самойлов  writes:

According to the documentation
https://www.postgresql.org/docs/12/runtime-config-connection.html
A backend must check connection to the client by tcp_keepalive messages. 
(Config option tcp_keepalives_idle).



But this is don't work if the backend is busy.


You're reading something into the documentation that isn't there.

The TCP keepalive mechanism is something that the OS does, independently
of backend processing.  The backend isn't going to notice loss of client
connection until it tries to read or write on the connection.

If it were free to improve this, we might do so.  But it would be
very much not free.

regards, tom lane





At what points does the backend bite the bullet to test the state of 
that file descriptor?


I'd expect select() and poll() to return immediately when keepalive 
probes timeout, so idling clients are covered (and that's the main use 
case); does any other code path go out of its way to ensure that there's 
still a client without actually needing to read()/write()/send()/recv()? 
(obviously at the cost you mentioned)









--
Regards

Fabio Ugo Venchiarutti
OSPCFC Network Engineering Dpt.
Ocado Technology






Re: Commit to primary with unavailable sync standby

2019-12-19 Thread Fabio Ugo Venchiarutti






On 19/12/2019 11:04, Andrey Borodin wrote:

Hi!

I cannot figure out proper way to implement safe HA upsert. I will be very 
grateful if someone would help me.

Imagine we have primary server after failover. It is network-partitioned. We 
are doing INSERT ON CONFLICT DO NOTHING; that eventually timed out.

az1-grx88oegoy6mrv2i/db1 M > WITH new_doc AS (
 INSERT INTO t(
 pk,
 v,
 dt
 )
 VALUES
 (
 5,
 'text',
 now()
 )
 ON CONFLICT (pk) DO NOTHING
 RETURNING pk,
   v,
   dt)
SELECT new_doc.pk from new_doc;
^CCancel request sent
WARNING:  01000: canceling wait for synchronous replication due to user request
DETAIL:  The transaction has already committed locally, but might not have been 
replicated to the standby.
LOCATION:  SyncRepWaitForLSN, syncrep.c:264
Time: 2173.770 ms (00:02.174)

Here our driver decided that something goes wrong and we retry query.

az1-grx88oegoy6mrv2i/db1 M > WITH new_doc AS (
 INSERT INTO t(
 pk,
 v,
 dt
 )
 VALUES
 (
 5,
 'text',
 now()
 )
 ON CONFLICT (pk) DO NOTHING
 RETURNING pk,
   v,
   dt)
SELECT new_doc.pk from new_doc;
  pk

(0 rows)

Time: 4.785 ms

Now we have split-brain, because we acknowledged that row to client.
How can I fix this?

There must be some obvious trick, but I cannot see it... Or maybe cancel of 
sync replication should be disallowed and termination should be treated as 
system failure?

Best regards, Andrey Borodin.



You're hitting the CAP theorem ( https://en.wikipedia.org/wiki/CAP_theorem )


You cannot do it with fewer than 3 nodes, as the moment you set your 
standby to synchronous to achieve consistency, both your nodes become 
single points of failure.



With 3 or more nodes you can perform what is called a quorum write 
against ( floor(total_nodes / 2) + 1 ) nodes.



With 3+ nodes, the "easy" strategy is to set a quorum number of 
standby nodes in synchronous_standby_names ( 
https://www.postgresql.org/docs/current/runtime-config-replication.html#GUC-SYNCHRONOUS-STANDBY-NAMES 
)
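
eg: a sketch asking for any 2 out of 3 named standbys to acknowledge each 
commit (node names are placeholders; the ANY quorum syntax needs PG 10+):

ALTER SYSTEM SET synchronous_standby_names = 'ANY 2 (node_a, node_b, node_c)';
SELECT pg_reload_conf();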



This however makes it tricky to pick the correct standby for promotions 
during auto-failovers, as you need to freeze all the standbys listed in 
the above setting in order to correctly determine which one has the 
highest WAL location without running into race conditions (as the 
operation is non-atomic, stateful and sticky).



I personally prefer to designate a fixed synchronous set at setup time 
and automatically set a static synchronous_standby_names on the master 
whenever a failover occurs. That allows for a simpler failover mechanism 
as you know they got the latest WAL location.




If you want an off-the-shelf solution, nowadays Patroni seems to be all 
the rage.





--
Regards

Fabio Ugo Venchiarutti
OSPCFC Network Engineering Dpt.
Ocado Technology






Re: Commit to primary with unavailable sync standby

2019-12-19 Thread Fabio Ugo Venchiarutti




On 19/12/2019 12:25, Andrey Borodin wrote:

Hi Fabio!

Thanks for looking into this.


On 19 Dec 2019, at 17:14, Fabio Ugo Venchiarutti  wrote:


You're hitting the CAP theorem ( https://en.wikipedia.org/wiki/CAP_theorem )


You cannot do it with fewer than 3 nodes, as the moment you set your standby to 
synchronous to achieve consistency, both your nodes become single points of 
failure.

We have 3 nodes, and the problem is reproducible with all standbys being 
synchronous.






With 3 or more nodes you can perform what is called a quorum write against ( 
floor(total_nodes / 2) + 1 ) nodes.

The problem seems to be reproducible in quorum commit too.





With 3+ nodes, the "easy" strategy is to set a quorum number of standby 
nodes in synchronous_standby_names ( 
https://www.postgresql.org/docs/current/runtime-config-replication.html#GUC-SYNCHRONOUS-STANDBY-NAMES
 )


This however makes it tricky to pick the correct standby for promotions during 
auto-failovers, as you need to freeze all the standbys listed in the above 
setting in order to correctly determine which one has the highest WAL location 
without running into race conditions (as the operation is non-atomic, stateful 
and sticky).

After promotion of any standby we still can commit to old primary with the 
combination of cancel and retry.



AFAICT this pseudo-idempotency issue can only be solved if every query 
is validated against the quorum.


A quick-and-dirty solution would be to wrap the whole thing in a CTE 
which also returns a count from pg_stat_replication (a stray/partitioned 
master would see fewer than (quorum - 1) standbys).
(It may be possible to do it directly in the RETURNING clause; I don't have 
a backend handy to test that.)



You can either look into the result at the client or force an error via 
some bad cast/zero division in the query.
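
A rough sketch of that look-aside CTE, reusing the t(pk, v, dt) table from 
your original example (the quorum threshold and what to do when the check 
fails are up to the client; the sync_state filter assumes synchronous 
standbys are configured):

WITH new_doc AS (
    INSERT INTO t(pk, v, dt)
    VALUES (5, 'text', now())
    ON CONFLICT (pk) DO NOTHING
    RETURNING pk
)
SELECT
    (SELECT pk FROM new_doc) AS pk,    -- NULL when the INSERT did nothing
    (SELECT count(*)
       FROM pg_stat_replication
      WHERE sync_state IN ('sync', 'quorum')) AS acknowledging_standbys;
-- treat acknowledging_standbys < (quorum - 1) as "not safely committed";
-- a zero-division or bad cast keyed on that count forces the error server-side instead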


All the above is however still subject to (admittedly tight) race 
conditions.



This problem is precisely why I don't use any of the off-the-shelf 
solutions: last time I checked none of them had a connection 
proxy/router to direct clients to the real master and not a node that 
thinks it is.







I personally prefer to designate a fixed synchronous set at setup time and 
automatically set a static synchronous_standby_names on the master whenever a 
failover occurs. That allows for a simpler failover mechanism as you know they 
got the latest WAL location.

No, synchronous standby does not necessarily own latest WAL. It has WAL point 
no earlier than all commits acknowledged to client.



You're right. I should have said "latest WAL holding an acknowledged 
transaction"




Thanks!

Best regards, Andrey Borodin.



--
Regards

Fabio Ugo Venchiarutti
OSPCFC Network Engineering Dpt.
Ocado Technology






Re: Commit to primary with unavailable sync standby

2019-12-19 Thread Fabio Ugo Venchiarutti




On 19/12/2019 13:58, Maksim Milyutin wrote:

On 19.12.2019 14:04, Andrey Borodin wrote:


Hi!



Hi!

FYI, this topic was up recently in -hackers 
https://www.postgresql.org/message-id/caeet0zhg5off7iecby6tzadh1moslmfz1hlm311p9vot7z+...@mail.gmail.com 




I cannot figure out proper way to implement safe HA upsert. I will be 
very grateful if someone would help me.


Imagine we have primary server after failover. It is 
network-partitioned. We are doing INSERT ON CONFLICT DO NOTHING; that 
eventually timed out.


az1-grx88oegoy6mrv2i/db1 M > WITH new_doc AS (
 INSERT INTO t(
 pk,
 v,
 dt
 )
 VALUES
 (
 5,
 'text',
 now()
 )
 ON CONFLICT (pk) DO NOTHING
 RETURNING pk,
   v,
   dt)
    SELECT new_doc.pk from new_doc;
^CCancel request sent
WARNING:  01000: canceling wait for synchronous replication due to 
user request
DETAIL:  The transaction has already committed locally, but might not 
have been replicated to the standby.

LOCATION:  SyncRepWaitForLSN, syncrep.c:264
Time: 2173.770 ms (00:02.174)

Here our driver decided that something goes wrong and we retry query.

az1-grx88oegoy6mrv2i/db1 M > WITH new_doc AS (
 INSERT INTO t(
 pk,
 v,
 dt
 )
 VALUES
 (
 5,
 'text',
 now()
 )
 ON CONFLICT (pk) DO NOTHING
 RETURNING pk,
   v,
   dt)
    SELECT new_doc.pk from new_doc;
  pk

(0 rows)

Time: 4.785 ms

Now we have split-brain, because we acknowledged that row to client.
How can I fix this?

There must be some obvious trick, but I cannot see it... Or maybe 
cancel of sync replication should be disallowed and termination should 
be treated as system failure?




I think the most appropriate way to handle such issues is to catch by 
client driver such warnings (with message about local commit) and mark 
the status of posted transaction as undetermined. If connection with 
sync replica will come back then this transaction eventually commits but 
after triggering of autofailover and *not replicating this commit to 
replica* this commit aborts. Therefore client have to wait some time 
(that exceeds the duration of autofailover) and check (logically based 
on committed data) the status of commit.


The problem here is the locally committed data becomes visible to future 
transactions (before autofailover) that violates the property of 
consistent reading from master. IMO the more correct behavior for 
PostgreSQL here is to ignore any cancel / termination queries when 
backend is in status of waiting response from sync replicas.


However, there is another way to get locally applied commits via restart 
of master after initial recovery. This case is described in doc 
https://www.postgresql.org/docs/current/warm-standby.html#SYNCHRONOUS-REPLICATION-HA 
. But here HA orchestrator agent can close access from external users 
(via pg_hba.conf manipulations) until PostgreSQL instance synchronizes



And this is where the unsafety lies: that assumes that the isolated 
master is in enough of a sane state to apply a self-ban (and that it can do 
it in near-zero time).



Although the retry logic in Andrey's case is probably not ideal (and you 
offered a more correct approach to synchronous commit), there are many 
"grey area" failure modes that in his scenario would prevent a given 
node from sealing up fast enough, if at all (eg: PID congestion 
causing fork()/system() to fail while backends are already up and 
happily flushing WAL).



This is particularly relevant to situations where only a subset of 
critical transactions set synchronous_commit to remote_*: it'd still be 
undesirable to sink "tier 2" data in a stale primary for any significant 
length of time.



Distributed systems like Etcd and Cassandra have a notion of 
"coordination node" in the context of a request (not having to deal with 
an "authoritative" transaction makes it easier).



In the case of postgres (or any RDBMS, really), all I can think of is 
either an inline proxy performing some validation as part of the 
forwarding (which is what we did internally but that has not been green 
lit for FOSS :( ) or some logic in the backend that rejects asynchronous 
commits too if some condition is not met (eg: the required number of 
synchronous standby nodes not present - a builtin version of the 
pg_stat_replication look-aside CTE I suggested earlier).






its changes with all sync replicas as it's implemented in Stolon 
https://github.com/sorintlab/stolon/blob/master/doc/syncrepl.md#handling-postgresql-sync-repl-limits-under-such-circumstances 
.



Best regards,
Maksim Milyutin






--
Regards

Fabio Ugo Venchiarutti
OSPCFC Network Engineering Dpt.
Ocado Technology


Re: a proposal for a new functionality: "SELECT * [EXCEPT col1 [,col2]]

2020-02-25 Thread Fabio Ugo Venchiarutti
That makes sense; however, for my general use case I'd also like the 
ability to mark some columns as not match-able by `SELECT * FROM 
<table>` and `TABLE <table>` at table definition, without having to 
create dedicated views (think of the way system attributes such as 
tableoid, cmax, cmin ... are handled).




Typical use case: manual inspection of rows containing a hefty payload 
field and some metadata ones; the payload more often than not isn't the 
bit I'm interested in, but its size dominates the output, making it 
hard to read (granted - psql has formatting options to handle that, but 
having a sane default at the table level would help a lot).




One may argue that such behaviour breaks the principle of least 
surprise, so all of this should probably be strictly opt-in (and perhaps 
queries could output some hints that such hidden columns exist).




Regards


F



On 25/02/2020 10:30, Josef Šimánek wrote:

Just to mention, similar concept can be found in Google BigQuery.

https://cloud.google.com/bigquery/docs/reference/standard-sql/query-syntax#select-except
https://cloud.google.com/bigquery/docs/reference/standard-sql/query-syntax#select-replace

On Tue, 25 Feb 2020 at 11:18, Stanislav Motyčka 
<stanislav.moty...@gmail.com> wrote:


Hello,

Sometimes (for tables with many columns) it would be better and
easier to write "SELECT" statement with clause "EXCEPT":
"SELECT * [EXCEPT col1 [,col2]] FROM ..."

It's easier to write "except" one or two columns from all (*) as to
write names of all columns besides one or two.
What do you thin about it?

Best regards
Stano Motycka



--
Regards

Fabio Ugo Venchiarutti
OSPCFC Network Engineering Dpt.
Ocado Technology






Re: Limit transaction lifetime

2020-03-06 Thread Fabio Ugo Venchiarutti



On 06/03/2020 14:24, Andrei Zhidenkov wrote:
Is there a way to limit a transaction lifetime in PostgreSQL? I could 
use `idle_in_transaction_session_timeout` parameter but it applies only 
to IDLE transactions. However, I want to rollback any transaction that 
executes more than specified period of time.


--

With best regards, Andrei Zhidenkov.


If it's a single command you're trying to limit, `SET statement_timeout 
TO <value>` should do the trick.



If you want it based on the session's cumulative statement time, off the 
top of my head I can't think of anything in vanilla PG without using 
executor hooks (that requires some coding).



If the queries that worry you are long-lived, you might be able to get 
by with a scheduled process checking against pg_stat_activity (eg: 
age(query_start)) and adding the current query's run-time to some 
per-session total, but it's a highly inaccurate process.
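
Something along these lines, run periodically from cron or similar (the 
threshold, and whether to cancel or terminate, are placeholders for your 
own policy):

SELECT pid,
       now() - query_start AS current_query_runtime,
       now() - xact_start  AS transaction_age,
       state
  FROM pg_stat_activity
 WHERE state <> 'idle'
   AND now() - query_start > interval '5 minutes';
-- feed the offending pids to pg_cancel_backend()/pg_terminate_backend()
-- once their accumulated total crosses your limit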





--
Regards

Fabio Ugo Venchiarutti
OSPCFC Network Engineering Dpt.
Ocado Technology






Re: Limit transaction lifetime

2020-03-06 Thread Fabio Ugo Venchiarutti

On 06/03/2020 16:14, Andrei Zhidenkov wrote:



If it's a single command you're trying to limit, `SET statement_timeout TO 
<value>` should do the trick.


This will set only statement timeout but won’t work for long transactions that 
contain a lot of short statements.


If you want it based on the session's cumulative statement time, off the top of 
my head I can't think of anything in vanilla PG without using executor hooks 
(that requires some coding).


Yes, that’s exactly I want to do.


If the queries that worry you are long-lived, you might be able to get by with 
a scheduled process checking against pg_stat_activity (eg: age(query_start)) 
and adding the current query's run-time to some per-session total, but it's an 
highly inaccurate process.


I think in my case I should check `xact_start`, because not every query 
initiates a new transaction.

That way you'd also be counting in the time a given transaction spent 
idling, which IME is one of the biggest sources of concurrency headaches 
if it's holding locks while doing nothing.

If your use case is unaffected by that, that'd be good news for you.





--
Regards

Fabio Ugo Venchiarutti
OSPCFC Network Engineering Dpt.
Ocado Technology






Re: Back Port Request for INVALID Startup Packet

2020-03-12 Thread Fabio Ugo Venchiarutti

On 12/03/2020 00:29, Tom Lane wrote:

Virendra Kumar  writes:

Can you please back port patch where if a 0 byte packet sent to PG instance 
(Health Checks), it starts complaining about invalid startup packet and flood 
the log which increases log size considerably if the health checks are every 3 
seconds or something like that.
Patch Requested - 
https://git.postgresql.org/gitweb/?p=postgresql.git;a=patch;h=342cb650e


We generally don't like to change behavior of back branches without
ironclad agreement that the existing behavior is a bug ... which this
surely isn't.  Also, the discussion leading up to that patch specifically
considered and rejected back-patching; so I'm disinclined to overrule
that decision now.

I would suggest that an every-three-second health check is not
appropriate, especially one that is so minimal that it only
detects whether the postmaster is alive.



Does the backend explicitly configure the kernel's TCP accept queue?

Unless Postgres sets it to impractically low numbers, the above test 
sounds weak from yet another angle; the postmaster might be "alive" as a 
process but completely seized for other reasons: at least in Linux the 
TCP accept queue progresses the 3-way handshake so long as the bound 
socket's file descriptor is still held, without even waiting for accept().



I'd recommend a better probe that at least sends a startup message and 
expects the backend to follow up with the authentication request (no 
need to craft messages manually, testing for more abstract 
messages/states out of psql or some libpq wrapper would do). Better 
still, if credentials/access are available, run a query.
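
eg: something as small as the query below already exercises the whole 
fork/authentication/executor path, and also tells you whether you're talking 
to a primary or a standby, which a bare TCP connect never will:

SELECT pg_is_in_recovery() AS in_recovery,
       now() - pg_postmaster_start_time() AS uptime;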









Re: vacuum full doubled database size

2020-03-13 Thread Fabio Ugo Venchiarutti

On 13/03/2020 15:15, Ron wrote:
This is why I'd VACUUM FULL in a planned manner, one or two tables at a 
time, and *locally* from crontab.


That's not really viable on any remotely busy system: VACUUM FULL claims 
exclusive table locks, causing queries to hang 
(https://www.postgresql.org/docs/current/sql-vacuum.html#NOTES mentions 
this too).


Tools like pg_repack can do some live shrinking.

I've also had some success at reclaiming space without large scale locks 
by carefully crafting some atomic DELETE + INSERT in order to force 
tuples from the tail end into gaps at lower CTIDs (physical page 
address) that were made available by previous plain VACUUMs - regular 
VACUUM will clip data files if all the tuples beyond a given offset are 
dead.
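
A sketch of that trick (table name and batch size are placeholders; verify 
how your constraints and triggers react before trying it on anything real):

-- move the tuples currently sitting on the tail pages; their new versions
-- land in free space left by earlier VACUUMs, after which a plain VACUUM
-- can truncate the now-empty tail of the data files
WITH tail AS (
    DELETE FROM some_table
    WHERE ctid = ANY (ARRAY(
        SELECT ctid FROM some_table ORDER BY ctid DESC LIMIT 1000
    ))
    RETURNING *
)
INSERT INTO some_table
SELECT * FROM tail;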






On 3/13/20 8:41 AM, Zwettler Markus (OIZ) wrote:


We did a "vacuum full" on a database which had been interrupted by a 
network outage.


We found the database size doubled afterwards.

Autovacuum also found a lot of orphaned tables afterwards.

The ophan temp objects went away after a cluster restart while the db 
size remained doubled.


Any idea?

Postgres 9.6.17



--
Angular momentum makes the world go 'round.


--
Regards

Fabio Ugo Venchiarutti
OSPCFC Network Engineering Dpt.
Ocado Technology






Re: Network performance optimization

2020-09-09 Thread Fabio Ugo Venchiarutti

On 09/09/2020 04:59, J . Mageshwaran wrote:
Hi Team, I am performing some benchmark testing with application in aws 
east 2 and dB in East 1. I am using pgbench for benchmark, the RTT 
Latency between East 1 to east2 is some where between 12 to 16 ms on an 
average. Is there any optimization that can be done to reduce the 
latency and increase the throughout to the PostgreSQL layer?


Even if network datagrams moved at the speed of light and with no 
serialisation/forwarding delay (which they don't), you're still going to 
deal with several thousand KMs of distance; I'm positively surprised 
you're getting such short round-trip times as it stands.



If you have any control over the application code, IMO your best shot is 
to coalesce DB interactions into as few smarter DB queries as you can 
for each higher order operation (eg: RESTful API calls), as you'll be 
paying for that fixed RTT overhead for each command.
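
eg: a hypothetical order-placement call that would otherwise take three 
round trips (INSERT the parent, INSERT the lines, SELECT back a summary) 
can be collapsed into one statement with data-modifying CTEs (table and 
column names are made up):

WITH new_order AS (
    INSERT INTO orders(customer_id, created_at)
    VALUES (42, now())
    RETURNING id
),
new_lines AS (
    INSERT INTO order_lines(order_id, sku, qty)
    SELECT new_order.id, x.sku, x.qty
    FROM new_order,
         (VALUES ('ABC-1', 2), ('XYZ-9', 1)) AS x(sku, qty)
    RETURNING order_id
)
SELECT o.id, count(l.order_id) AS lines_inserted
FROM new_order AS o LEFT JOIN new_lines AS l ON l.order_id = o.id
GROUP BY o.id;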





--
Regards

Fabio Ugo Venchiarutti
OSPCFC Network Engineering Dpt.
Ocado Technology






Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?

2020-09-22 Thread Fabio Ugo Venchiarutti



On 21/09/2020 17:53, Joshua Drake wrote:
3. The ability to embed PG to run in an automatic, quiet manner as part 
of something else. I know about SQLite, but it's extremely limited to 
the point of being virtually useless IMO, which is why I cannot use that 
for anything nontrivial. I want my familiar PostgreSQL, only not require 
it to be manually and separately installed on the machine where it is to 
run as part of some "application". If I could just "embed" it, this 
would allow me to create a single EXE which I can simply put on a 
different machine to run my entire "system" which otherwise takes *tons* 
of tedious, error-prone manual labor to install, set up and maintain. Of 
course, this is probably much easier said than done, but I don't 
understand why PG's architecture necessarily dictates that PG must be a 
stand-alone, separate thing. Or rather, why some "glue" cannot enable it 
to be used just like SQLite from a *practical* perspective, even if it 
still is a "server-client model" underneath the hood. (Which doesn't 
matter at all to me, nor should it matter to anyone else.)


It depends what you mean by "embedded".
If you want sqlite's linked library approach, where the database calls 
literally run in your process' address space, then that's a no go, as 
postgres is a multi-user database server with its own process hierarchy.



However, postgres also is a rather agnostic command that does not detach 
from the parent's terminal/stdio unless instructed to, so nothing stops 
your bespoke application from launching and managing its own postmaster 
as a directly managed child process - started as part of application 
initialisation - and having that listen on a local socket only available 
to the application itself; this is what we implemented in some of our 
installations where postgres is enslaved to the cluster control system - 
the configuration is re-generated at every restart (IIRC Patroni does 
something similar minus the unix socket part).



A plethora of systems are built around the notion of programs calling 
other programs and managing the process' life cycle. The limiting factor 
to such architecture tends to be the OS's process control semantics and 
API (notoriously weaker or more contrived on non-unix-like OSs), but 
that's not postgres' fault.






--
Regards

Fabio Ugo Venchiarutti
OSPCFC Network Engineering Dpt.
Ocado Technology






Re: Upgrading 9.1.17 to which version?

2019-05-16 Thread Fabio Ugo Venchiarutti




On 16/05/2019 18:20, Ron wrote:

On 5/16/19 4:36 AM, nigel.ander...@gmx.com wrote:

Hi,
I've just inherited an ancient install of 9.1.17 after our tech guy 
left, on what turns out to be a rapidly dying server and being a total 
newb to PostgreSQL (and not much more advanced on Linux) I'm a little 
stuck on the way ahead.
I've managed to secure a decent new server for a new install of 
PostgreSQL which runs CentOS 7.6 (Minimal). CentOS 7.6's standard 
PostgreSQL package seems to be 9.2.24 which is obviously no longer 
supported so probably doesn't get us much further ahead in the short 
term. As part of this upgrade we'd also like to implement support for 
pg_trgm which apparently needs >=9.6.
I spent most of yesterday trying to get 9.6.13 installed from the 
PostgreSQL Yum repository and finally got it working with the initdb 
stuff stored on a non-default dedicated partition (RAID10 array) only 
to find that psql didn't work and was complaining about a missing 
libpq.so.5. Not sure if that's a common problem?


What packages did you install?

My (admittedly loose) logic tells me that upgrading from 9.1.x to 
9.6.x is probably a safer option than making the leap up to 10.x or 11.x


No, not really.

but I wonder whether that might be an easier/more reliable option from 
an install and point of view and certainly preferable in the long 
term. Any advice on where to go?


11.x would be best, since it's EOL is furthest in the future.
9.6 would be best, because it's had more bug-fix releases.



Aren't all important bugfixes backported to every affected non-EOL 
major at once?



Correct me if I'm wrong but I thought that's the reason minors are 
released in unison for all majors.





:)

--
Angular momentum makes the world go 'round.


--
Regards

Fabio Ugo Venchiarutti
OSPCFC Network Engineering Dpt.
Ocado Technology






Re: distinguish update from insert (on conflict)

2019-05-22 Thread Fabio Ugo Venchiarutti




On 22/05/2019 03:37, Justin Pryzby wrote:

On Tue, May 21, 2019 at 06:57:36PM -0700, Adrian Klaver wrote:

On 5/21/19 6:34 PM, Justin Pryzby wrote:

Is it still impossible to distinguish whether a row was inserted vs updated ?


You will need to be more specific.


Sorry, I mean with UPSERT / "INSERT .. ON CONFLICT DO UPDATE", is it possible
to tell whether a row was inserted vs. updated ?

Thanks,
Justin




Here's my recipe for that:

RETURNING
/* whatever, */
(xmax = 0) AS is_new_record
;

I don't know if any of the hackers thought of a sleeker technique
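
Spelled out against a hypothetical t(pk primary key, v text) - keeping in 
mind that reading xmax like this leans on an implementation detail rather 
than documented behaviour:

INSERT INTO t(pk, v)
VALUES (5, 'text')
ON CONFLICT (pk) DO UPDATE
    SET v = EXCLUDED.v
RETURNING pk,
          (xmax = 0) AS is_new_record;  -- true on a fresh insert, false when an existing row was updated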



--
Regards

Fabio Ugo Venchiarutti
OSPCFC Network Engineering Dpt.
Ocado Technology






Re: Postgresql backup via LVM snapshot?

2019-06-03 Thread Fabio Ugo Venchiarutti
Are you trying to fit the postgres backups in some overarching LVM-based 
backup scheme you're already running or is this from scratch?



With the possible exception of rsync's deltas & hardlink trickeries 
between full backups, I personally don't see a point in using a 
block/file-level copy for physical backups of Postgres clusters (unless 
you're keeping the snapshot around as the "thin backup" but that would 
put the backup in the same volume group as the live data; meh).



pg_basebackup (or more specifically the underlying replication protocol 
command BASE_BACKUP) precisely wraps up all that you're trying to 
accomplish here; it is perfectly aware of the internal state of 
operations and guarantees operations 
synchronisation/correctness/completeness.



It roughly boils down to (with minor variations between types of 
physical backups):


- dropping a checkpoint and waiting for it to complete, or waiting for the 
next one to roll around (that will be the starting point of WAL recovery 
when you restore)

- remembering the WAL location of that checkpoint
- copying the data files "as is" without worrying about torn/corrupt pages 
due to changes. They will be overwritten with the amended version during 
WAL recovery
- optionally putting away a copy of all the transaction logs that were 
generated between the checkpoint and the end of the copy if you're not 
already performing WAL archival (or if you simply want your backup to be 
self-contained and recoverable without accessing the WAL archive).




The database will look to have crashed when you initially restore it 
(same as with any form of snapshotting, including LVM) and will have 
to go through WAL recovery anyway: the earliest version of the DB to 
which you can safely restore is the one at the moment the copy ended, so 
the benefits of the atomic low-level snapshot are negated.




By using the old exclusive start/stop_backup() & action synchronisation 
yourself you're just making it more manual and error prone.
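
If you really must drive it by hand to keep the LVM workflow, the 
non-exclusive variant of that API (available since 9.6) is the lesser evil - 
a sketch, issued from a session that must stay open for the whole copy:

SELECT pg_start_backup('lvm-snapshot', false, false);
-- ... take the LVM snapshot / copy PGDATA from the OS side ...
SELECT labelfile, spcmapfile FROM pg_stop_backup(false);
-- write labelfile out as backup_label inside the restored copy, and keep
-- the WAL generated in between alongside it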




Regards


F




On 30/05/2019 20:38, Lu, Dan wrote:

Hello,

Would you be able to confirm for me that a host level LVM snapshot of 
the PGDATA directory along with PG_WAL directly via LVM backup is 
supported way of backup and restore?


I read about this here: 
https://dba.stackexchange.com/questions/145361/backup-standby-database-using-lvm-snapshot



Is it as simple as:

- SELECT pg_start_backup('Begin LVM Backup At xyz');

- Do LVM backup from o/s end to backup PGDATA/PG_WAL

- SELECT pg_stop_backup();

Thanks.

Dan






--
Regards

Fabio Ugo Venchiarutti
OSPCFC Network Engineering Dpt.
Ocado Technology






Re: perf tuning for 28 cores and 252GB RAM

2019-06-18 Thread Fabio Ugo Venchiarutti

On 18/06/2019 00:45, Jeff Janes wrote:
On Mon, Jun 17, 2019 at 4:51 PM Michael Curry <cu...@cs.umd.edu> wrote:


I am using a Postgres instance in an HPC cluster, where they have
generously given me an entire node. This means I have 28 cores and
252GB RAM. I have to assume that the very conservative default
settings for things like buffers and max working memory are too
small here.

We have about 20 billion rows in a single large table. 



What is that in bytes?  Do you only have that one table?

The database is not intended to run an application but rather to
allow a few individuals to do data analysis, so we can guarantee the
number of concurrent queries will be small, and that nothing else
will need to use the server. Creating multiple different indices on
a few subsets of the columns will be needed to support the kinds of
queries we want.

What settings should be changed to maximize performance?


With 28 cores for only a few users, parallelization will probably be 
important.  That feature is fairly new to PostgreSQL and rapidly 
improving from version to version, so you will want to use the last 
version you can (v11).  And then increase the values for 
max_worker_processes, max_parallel_maintenance_workers, 
max_parallel_workers_per_gather, and max_parallel_workers.  With the 
potential for so many parallel workers running at once, you wouldn't 
want to go overboard on work_mem, maybe 2GB.  If you don't think all 
allowed users will be running large queries at the same time (because 
they are mostly thinking what query to run, or thinking about the 
results of the last one they ran, rather than actually running queries), 
then maybe higher than that.


If your entire database can comfortably fit in RAM, I would make 
shared_buffers large enough to hold the entire database.  If not, I 
would set the value small (say, 8GB) and let the OS do the heavy lifting 
of deciding what to keep in cache.



Does the backend mmap() data files when that's possible?


I've heard the "use the page cache" suggestion before, from users and 
hackers alike, but I never quite heard a solid argument dismissing 
potential overhead-related ill effects of the seek() & read() syscalls 
if they're needed, especially on many random page fetches.



Given that shmem-based shared_buffers are bound to be mapped into the 
backend's address space anyway, why isn't that considered always 
preferable/cheaper?




I'm aware that there are other benefits in counting on the page cache 
(eg: staying hot in the face of a backend restart), however I'm 
considering performance in steady state here.




TIA



If you go with the first option, you
probably want to use pg_prewarm after each restart to get the data into 
cache as fast as you can, rather than let it get loaded in naturally as 
you run queries;  Also, you would probably want to set random_page_cost 
and seq_page_cost quite low, like maybe 0.1 and 0.05.


You haven't described what kind of IO capacity and setup you have, 
knowing that could suggest other changes to make.  Also, seeing the 
results of `explain (analyze, buffers)`, especially with track_io_timing 
turned on, for some actual queries could provide good insight for what 
else might need changing.


Cheers,

Jeff






--
Regards

Fabio Ugo Venchiarutti
OSPCFC Network Engineering Dpt.
Ocado Technology
