Redacting params in PostgreSQL logs

2018-03-30 Thread Vijaykumar Jain
Hi,

I have a project that ships Postgres logs to an ELK stack. It is working fine.

Now there are concerns that the logged statements contain query parameters,
and then there are GDPR and other PII issues.

Is there a way I can redact the params in the statements in the logs, by some config
or extension?
Please note: I want the statements in the logs, but not the values :)

like insert into foo values ('xxx') etc.



Regards,
Vijay


Re: Redacting params in PostgreSQL logs

2018-03-30 Thread Vijaykumar Jain
https://github.com/lfittl/pg_query/blob/master/README.md#parsing-a-normalized-query

OK found this :)
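
For anyone finding this later: the idea with pg_query is to normalize the statement
before it is shipped, so constants are replaced by placeholders. Roughly, as a
before/after sketch (the exact placeholder format depends on the library version):

-- what currently ends up in the Postgres log
insert into foo values ('xxx');

-- the normalized form that would get indexed into ELK
insert into foo values ($1);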


Regards,
Vijay



Re: Redacting params in PostgreSQL logs

2018-03-31 Thread Vijaykumar Jain
FYI

This is what I had tried wrt logstash.

https://gist.github.com/cabecada/dd765a30f6946fdbf0bec0eb31fba047




Re: [External] Merging two database dumps

2018-06-13 Thread Vijaykumar Jain
You can try one option, although it's just a thought in the air 😊
Use postgres_fdw, e.g.
https://robots.thoughtbot.com/postgres-foreign-data-wrapper

Create foreign tables in a per-server schema,

and then union/union all 😊 or your custom constraint on the destination table
where you dump the rows.

For example:
You have server1, server2, server3,

and you have server4 as your new single server.

You create FDW servers for server1, server2, server3 on server4 and then import each table
into its respective server schema:

server1.table1, server2.table1, server3.table1

and then
insert into server4.table1 select * from (select * from server1.table1 union
select * from server2.table1 union select * from server3.table1) a;

something like that 😊 (rough sketch below)
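
A minimal sketch of the idea, run on server4 (host names, credentials, schema and
table names are all illustrative, and the on conflict clause assumes table1 has a
primary key to dedupe on):

create extension if not exists postgres_fdw;

create server server1_srv foreign data wrapper postgres_fdw
    options (host 'server1', port '5432', dbname 'appdb');
create user mapping for current_user server server1_srv
    options (user 'app_user', password 'secret');

create schema server1;
import foreign schema public limit to (table1)
    from server server1_srv into server1;

-- repeat for server2/server3, then merge while skipping duplicate primary keys
insert into table1
select * from server1.table1
union
select * from server2.table1
union
select * from server3.table1
on conflict do nothing;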


Thanks,
Vijay

From: Alex O'Ree 
Date: Wednesday, June 13, 2018 at 4:47 PM
To: "pgsql-general@lists.postgresql.org" 
Subject: [External] Merging two database dumps

I have a situation with multiple postgres servers all running the same
databases and table structure. I need to periodically export the data from each
of them and then merge it all into a single server. On occasion, it's feasible
for the same record (primary key) to be stored on two or more servers.

I was using pg_dump without the --inserts option, however I just noticed that
pg_restore will stop inserting into a table when a conflict occurs, leaving me
with an incomplete set.

The question is: what are my other options to skip over the conflicting records when
merging?

From the docs, it appears that making dumps with the --inserts option may be the
only way to go, however performance is an issue. In this case, would dropping all
indexes help?


Re: [External] Re: Schema/Data conversion opensource tools from MySQL to PostgreSQL

2018-06-25 Thread Vijaykumar Jain
I have not tested this, but I have read about it somewhere.
https://github.com/the4thdoctor/pg_chameleon

if it helps
  good
else
  ignore

Thanks,
Vijay
From: Pavan Teja 
Date: Monday, June 25, 2018 at 10:25 PM
To: chiru r 
Cc: "pgsql-gene...@postgresql.org >> PG-General Mailing List" 

Subject: [External] Re: Schema/Data conversion opensource tools from MySQL to 
PostgreSQL

Hi Chiru,

You can use MySQL foreign data wrapper to achieve this.

Regards,
Pavan
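
For context, a rough sketch of the mysql_fdw approach Pavan mentions (assuming the
mysql_fdw extension is installed on the Postgres side; host, credentials, schema and
table names are illustrative):

create extension if not exists mysql_fdw;

create server mysql_srv foreign data wrapper mysql_fdw
    options (host '127.0.0.1', port '3306');
create user mapping for current_user server mysql_srv
    options (username 'mysql_user', password 'secret');

-- pull the MySQL table definitions in as foreign tables, then copy the data locally
import foreign schema sourcedb from server mysql_srv into public;
create table customers_local as select * from customers;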

On Mon, Jun 25, 2018, 10:18 PM chiru r wrote:
Hi All,


Please suggest Schema/Data conversion opensource tools from MySQL to PostgreSQL.


Thanks,
Chiranjeevi


Re: [External] How to revoke privileged from PostgreSQL's superuser

2018-08-06 Thread Vijaykumar Jain
I am not sure a superuser can be selectively restricted at the query level; I have
not tried it.

But maybe you can try restricting the superuser's access to the db from all
hosts via pg_hba.conf.

For example, I have a user
monitor| Superuser

and
in my /etc/postgresql/10/main/pg_hba.conf

host pgtesting monitor 0.0.0.0/0   reject

and then
psql -U monitor -p 5432 -d pgtesting -h 127.0.0.1
psql: FATAL:  pg_hba.conf rejects connection for host "127.0.0.1", user 
"monitor", database "pgtesting", SSL on
FATAL:  pg_hba.conf rejects connection for host "127.0.0.1", user "monitor", 
database "pgtesting", SSL off

psql -U monitor -p 5432 -d pgtesting -h localhost
psql: FATAL:  pg_hba.conf rejects connection for host "127.0.0.1", user 
"monitor", database "pgtesting", SSL on
FATAL:  pg_hba.conf rejects connection for host "127.0.0.1", user "monitor", 
database "pgtesting", SSL off

psql -U monitor -p 5432 -d pgtesting -h 173.16.6.3
psql: FATAL:  pg_hba.conf rejects connection for host "173.16.6.3", user 
"monitor", database "pgtesting", SSL on
FATAL:  pg_hba.conf rejects connection for host "173.16.6.3", user "monitor", 
database "pgtesting", SSL off


https://stackoverflow.com/questions/38942868/revoke-superuser-connect-a-specific-database
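
As a side note, on PostgreSQL 10+ you can sanity-check how the server parsed the
rules without reconnecting (a sketch; column list trimmed for readability):

-- the parsed pg_hba.conf as the running server sees it
select line_number, type, database, user_name, address, auth_method, error
from pg_hba_file_rules;

-- reload after editing pg_hba.conf so the new reject rule takes effect
select pg_reload_conf();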

Thanks,
Vijay


From: "bejita0...@yahoo.co.jp" 
Reply-To: "bejita0...@yahoo.co.jp" 
Date: Monday, August 6, 2018 at 3:19 PM
To: "pgsql-ad...@lists.postgresql.org" , 
"pgsql-general@lists.postgresql.org" 
Subject: [External] How to revoke privileged from PostgreSQL's superuser

Hello,

I am a newbie DBA.

I have a request to revoke the DBA user's access to users' data.
I think the request is reasonable, because users should be the only ones who can
access their data.
But the DBA user also needs full access to everything else, which means the DBA user
still needs to be a superuser.

So the request comes down to: how do I revoke privileges from a superuser in
postgres?

As far as I know, the superuser in PostgreSQL bypasses all permission checks.
So there is no way to do this in PostgreSQL, is that right?

Have any DBAs faced this situation before?


Thanks,
--
bejita


Re: [External] Re: ERROR: cache lookup failed for function 125940

2018-08-19 Thread Vijaykumar Jain
Hey Tom,

Had a small query here.
If we have streaming replication enabled and we hit corruption like this
due to power loss etc.,
is it safe to assume we can fail over to the standby and be good (at least in
most cases)?
I wanted to understand: if the system catalog is corrupted for some reason,
will it or will it not be stream-replicated, corruption and all, to the standby?
Or is the catalog a property of the individual database server, rebuilt/updated
on pg_dump/restore?

I am sorry if I am diverting Drako's query; if this is not related I'll
ask it as a separate question,
but I guess the answer may help Drako too.

Thanks,
Vijay

On 8/20/18, 12:53 AM, "Tom Lane"  wrote:

DrakoRod  writes:
> I reindex the pg_proc table and reindex finished correctly but I try read
> the tables or make the dump and same error, reindexed the database and show
> me this error: 

> ERROR:  cache lookup failed for function 125999

This suggests you've actually lost some entries from pg_proc.  Have
you had any system crashes or suchlike?

> Any suggestions, If I run pg_upgrade to 10, will these errors be corrected? 

No, pg_upgrade can't magically restore data that's not there.

regards, tom lane





Re: [External] Multiple COPY on the same table

2018-08-20 Thread Vijaykumar Jain
Hey Ravi,

What is the goal you are trying to achieve here?
To make pg_dump/restore faster?
To make replication faster?
To make backups faster?

Also, no matter how small you split the files, if the network is your
bottleneck then I am not sure you can get an n-times benefit by simply
sending the files in parallel, though there may be some gain.
For parallel processing you also need to make sure the server has the
resources for it, or it will just be a lot of context switching, I guess.
pg_dump has an option to dump in parallel.
pg_basebackup is single-threaded as far as I have read, but pgbackrest allows better
parallel processing for backups.
There is also logical replication, where you can selectively replicate
tables to avoid bandwidth issues.
I might have said a lot and none of it may be relevant, but you need to let us
know the goal you want to achieve :)

Regards,
Vijay

From: Ravi Krishna 
Sent: Monday, August 20, 2018 8:24:35 PM
To: pgsql-general@lists.postgresql.org
Subject: [External] Multiple COPY on the same table

Can I split a large file into multiple files and then run COPY on each file?
The table does not contain any
serial or sequence column which might need serialization. Let us say I split a
large file into 4 files. Will the
performance improve by close to 4x?

P.S.: Please ignore my previous post, which went out without a subject (by mistake).


Re: [External] Multiple COPY on the same table

2018-08-20 Thread Vijaykumar Jain

I guess this should help you, Ravi.

https://www.postgresql.org/docs/10/static/populate.html
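
And for the original "split and load in parallel" idea, a rough shell sketch (file,
database and table names are illustrative; each \copy runs in its own session and
transaction):

# split the csv into 4 roughly equal pieces without breaking lines
split -n l/4 big_table.csv part_

# load the pieces from 4 parallel sessions
for f in part_*; do
    psql -d mydb -c "\copy big_table from '$f' csv" &
done
wait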




On 8/20/18, 10:30 PM, "Christopher Browne"  wrote:

On Mon, 20 Aug 2018 at 12:53, Ravi Krishna  wrote:

> > What is the goal you are trying to achieve here.
> > To make pgdump/restore faster?
> > To make replication faster?
> > To make backup faster ?
>
> None of the above.
>
> > We got csv files from an external vendor which are 880GB in total size, in
> > 44 files. Some of the large tables had COPY running for several hours. I was
> > just thinking of a faster way to load.


Seems like #4...

#4 - To Make Recovery faster

Using COPY pretty much *is* the "faster way to load"...

The main thing you should consider doing to make it faster is to drop
indexes and foreign keys from the tables, and recreate them
afterwards.
-- 
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"




Re: [External] RE: Estimate time without running the query

2018-09-13 Thread Vijaykumar Jain
explain analyze would *run* the query, which can be dangerous if it is a DML
statement like insert/update/delete 😊

If you still want to go with explain analyze,
you can wrap it in a transaction:

begin;
explain analyze <query>;
rollback;
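
For example, with a hypothetical table foo:

begin;
-- the delete really executes here, so you get actual timings
explain analyze delete from foo where created_at < now() - interval '30 days';
-- ...and its changes are then thrown away
rollback;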

thanks,
Vijay

From: Johnes Castro 
Date: Friday, September 14, 2018 at 3:12 AM
To: Neto pr , PostgreSQL General 

Subject: [External] RE: Estimate time without running the query

Hi netoprbr,

Use the explain analyse command.

Best Regards.
Johnes Castro

From: Neto pr 
Sent: Thursday, September 13, 2018 19:38
To: PostgreSQL General
Subject: Estimate time without running the query

Dear all,
Just a question.
The EXPLAIN command only estimates the cost of executing a query,
and does not estimate the execution time.
I would like to know if there is some way to estimate the time without running the
query.
Best Regards
[]`s Neto


Re: [External] Slot issues

2018-10-14 Thread Vijaykumar Jain
I guess max_replication_slots has to be > the current total slots in use (and
not >=).

https://www.postgresql.org/docs/10/static/runtime-config-replication.html
https://github.com/postgres/postgres/blob/d6e98ebe375e115c29028f9bd090f0f7e07e2527/src/backend/replication/slot.c#L1506

From the docs, the default is 10; any reason why you have set it to 2?
You need to set it to a value higher than the number of active slots.
Also, I guess changing this parameter requires a server restart.
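
For reference, a quick way to see what is in use and to raise the limit (a sketch;
the new value is illustrative, and it only takes effect after a restart):

-- what slots exist right now, and which are active
select slot_name, slot_type, active, restart_lsn from pg_replication_slots;

show max_replication_slots;

-- raise the limit (written to postgresql.auto.conf), then restart the server
alter system set max_replication_slots = 10;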




Regards,
Vijay


On Mon, Oct 15, 2018 at 12:16 AM bhargav kamineni 
wrote:

> Hi Team,
>
> I am getting this ERROR while starting my second slave server
>
> PANIC:  too many replication slots active before shutdown
> HINT:  Increase max_replication_slots and try again.
>
> max_replication_slots on my master is 2 and one of them is already active
> for another slave,
> do i need to increase this parameter for the need of working of another
> slave ? if so whats the reason ?
>
>
>
> Thanks
> Bhargav K
>


Re: [External] Slot issues

2018-10-14 Thread Vijaykumar Jain
ok my bad.

i just set one of the 9.6.9 versions to have max_replication_slots = 2 and
still I had two slots active.

postgres=# table pg_replication_slots;
 slot_name | plugin | slot_type | datoid | database | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn
-----------+--------+-----------+--------+----------+--------+------------+------+--------------+-------------+---------------------
 a02       |        | physical  |        |          | t      |      13719 |      |              | 2/D3D0      |
 a03       |        | physical  |        |          | t      |      13720 |      |              | 2/D3D0      |
(2 rows)

postgres=# show max_replication_slots;
 max_replication_slots
------------------------
 2
(1 row)


yep, it should work with 2.
Regards,
Vijay


On Mon, Oct 15, 2018 at 1:02 AM Vijaykumar Jain  wrote:

> I guess max_replication_slots has to be greater than active replication
> slots.
> maybe you need to increase max_replication_slots to 3 to have 2 active
> slots.
>
> Regards,
> Vijay
>
>
> On Mon, Oct 15, 2018 at 12:54 AM bhargav kamineni <
> bhargavpostg...@gmail.com> wrote:
>
>>  for 9.x. Default was 0.  we have set max_replication_slot =2 , one is
>> already being used , postgres should use  second slot for another server
>> but dono why its throwing error
>>
>> On Mon, 15 Oct 2018 at 00:32, Vijaykumar Jain 
>> wrote:
>>
>>> I guess max_replication_slots has to >  current total slots in use. (and
>>> not >= )
>>>
>>> https://www.postgresql.org/docs/10/static/runtime-config-replication.html
>>>
>>> https://github.com/postgres/postgres/blob/d6e98ebe375e115c29028f9bd090f0f7e07e2527/src/backend/replication/slot.c#L1506
>>>
>>> from the doc, it says defaults are 10, any reason why you have set it to
>>> 2. You need to set it at a value higher than number of slots active.
>>> also, i guess changing this parameter would require a server restart.
>>>
>>>
>>>
>>>
>>> Regards,
>>> Vijay
>>>
>>>
>>> On Mon, Oct 15, 2018 at 12:16 AM bhargav kamineni <
>>> bhargavpostg...@gmail.com> wrote:
>>>
>>>> Hi Team,
>>>>
>>>> I am getting this ERROR while starting my second slave server
>>>>
>>>> PANIC:  too many replication slots active before shutdown
>>>> HINT:  Increase max_replication_slots and try again.
>>>>
>>>> max_replication_slots on my master is 2 and one of them is already
>>>> active for another slave,
>>>> do i need to increase this parameter for the need of working of another
>>>> slave ? if so whats the reason ?
>>>>
>>>>
>>>>
>>>> Thanks
>>>> Bhargav K
>>>>
>>>


Re: [External] Slot issues

2018-10-14 Thread Vijaykumar Jain
from your master, can you give us the output of

select * from pg_replication_slots;

maybe some stray slot exists that you may not be aware of ?

Regards,
Vijay


On Mon, Oct 15, 2018 at 1:08 AM Vijaykumar Jain  wrote:

> ok my bad.
>
> i just set one of the 9.6.9 versions to have max_replication_slots = 2 and
> still I had two slots active.
>
> postgres=# table pg_replication_slots;
>  slot_name | plugin | slot_type | datoid | database | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn
> -----------+--------+-----------+--------+----------+--------+------------+------+--------------+-------------+---------------------
>  a02       |        | physical  |        |          | t      |      13719 |      |              | 2/D3D0      |
>  a03       |        | physical  |        |          | t      |      13720 |      |              | 2/D3D0      |
> (2 rows)
>
> postgres=# show max_replication_slots;
>  max_replication_slots
> ------------------------
>  2
> (1 row)
>
>
> yep, it should work with 2.
> Regards,
> Vijay
>
>
> On Mon, Oct 15, 2018 at 1:02 AM Vijaykumar Jain 
> wrote:
>
>> I guess max_replication_slots has to be greater than active replication
>> slots.
>> maybe you need to increase max_replication_slots to 3 to have 2 active
>> slots.
>>
>> Regards,
>> Vijay
>>
>>
>> On Mon, Oct 15, 2018 at 12:54 AM bhargav kamineni <
>> bhargavpostg...@gmail.com> wrote:
>>
>>>  for 9.x. Default was 0.  we have set max_replication_slot =2 , one is
>>> already being used , postgres should use  second slot for another server
>>> but dono why its throwing error
>>>
>>> On Mon, 15 Oct 2018 at 00:32, Vijaykumar Jain 
>>> wrote:
>>>
>>>> I guess max_replication_slots has to >  current total slots in use.
>>>> (and not >= )
>>>>
>>>>
>>>> https://www.postgresql.org/docs/10/static/runtime-config-replication.html
>>>>
>>>> https://github.com/postgres/postgres/blob/d6e98ebe375e115c29028f9bd090f0f7e07e2527/src/backend/replication/slot.c#L1506
>>>>
>>>> from the doc, it says defaults are 10, any reason why you have set it
>>>> to 2. You need to set it at a value higher than number of slots active.
>>>> also, i guess changing this parameter would require a server restart.
>>>>
>>>>
>>>>
>>>>
>>>> Regards,
>>>> Vijay
>>>>
>>>>
>>>> On Mon, Oct 15, 2018 at 12:16 AM bhargav kamineni <
>>>> bhargavpostg...@gmail.com> wrote:
>>>>
>>>>> Hi Team,
>>>>>
>>>>> I am getting this ERROR while starting my second slave server
>>>>>
>>>>> PANIC:  too many replication slots active before shutdown
>>>>> HINT:  Increase max_replication_slots and try again.
>>>>>
>>>>> max_replication_slots on my master is 2 and one of them is already
>>>>> active for another slave,
>>>>> do i need to increase this parameter for the need of working of
>>>>> another slave ? if so whats the reason ?
>>>>>
>>>>>
>>>>>
>>>>> Thanks
>>>>> Bhargav K
>>>>>
>>>>


Re: [External] Re: Slot issues

2018-10-14 Thread Vijaykumar Jain
i guess he is trying to confirm whether the value of max_replication_slots is the
same on the master and all the standby nodes.

also,

I am shooting in the dark here.

can you give the output of (on the master)
ls /var/lib/postgresql//main/pg_replslot/*

also the value of max_replication_slots on the master and on all the standbys.
also, how are you creating the replication slot?

can you share the query?

also please share the content of the recovery.conf on the standby nodes (both
the one currently running and the one currently not running).
you can scrub out any company-confidential values.


Regards,
Vijay


On Mon, Oct 15, 2018 at 1:58 AM bhargav kamineni 
wrote:

> I got his on standby,  could you please explain in detail about
>  --*but that *on the standby* haven't set max_replication_slots high
> enough*.
>
> Thanks,
> Bhargav
>
> On Mon, 15 Oct 2018 at 01:50, Andres Freund  wrote:
>
>> Hi,
>>
>> On 2018-10-15 00:15:53 +0530, bhargav kamineni wrote:
>> > Hi Team,
>> >
>> > I am getting this ERROR while starting my second slave server
>> >
>> > PANIC:  too many replication slots active before shutdown
>> > HINT:  Increase max_replication_slots and try again.
>> >
>> > max_replication_slots on my master is 2 and one of them is already
>> active
>> > for another slave,
>> > do i need to increase this parameter for the need of working of another
>> > slave ? if so whats the reason ?
>>
>> You're getting that error on the standby, not the primary, right?   In
>> all likelihood the problem is that you copied over replication slots to
>> your standby server, but that *on the standby* haven't set
>> max_replication_slots high enough.
>>
>> Greetings,
>>
>> Andres Freund
>>
>


Re: [External] Re: Slot issues

2018-10-14 Thread Vijaykumar Jain
ok wait, your data directory path may be different :)
i mean wherever your pg data dir is, find the pg_replslot folder there and
list its contents.

Regards,
Vijay


On Mon, Oct 15, 2018 at 2:10 AM Vijaykumar Jain  wrote:

> i guess he is trying to confirm if the value of max_replication_slot is
> the same on all the master and the standby nodes.
>
> also,
>
> I am trying to shoot in the dark.
>
> can you give the output of (on master)
> ls /var/lib/postgresql//main/pg_replslot/*
>
> also the value of max_replication_slot on the master and all the standby's.
> also, how are you creating the replication slot?
>
> can you share the query?
>
> also pls share the content of the recovery.conf on the standby nodes (both
> currently running and currently not running).
> you can scribble company secret stuff.
>
>
> Regards,
> Vijay
>
>
> On Mon, Oct 15, 2018 at 1:58 AM bhargav kamineni <
> bhargavpostg...@gmail.com> wrote:
>
>> I got his on standby,  could you please explain in detail about
>>  --*but that *on the standby* haven't set max_replication_slots high
>> enough*.
>>
>> Thanks,
>> Bhargav
>>
>> On Mon, 15 Oct 2018 at 01:50, Andres Freund  wrote:
>>
>>> Hi,
>>>
>>> On 2018-10-15 00:15:53 +0530, bhargav kamineni wrote:
>>> > Hi Team,
>>> >
>>> > I am getting this ERROR while starting my second slave server
>>> >
>>> > PANIC:  too many replication slots active before shutdown
>>> > HINT:  Increase max_replication_slots and try again.
>>> >
>>> > max_replication_slots on my master is 2 and one of them is already
>>> active
>>> > for another slave,
>>> > do i need to increase this parameter for the need of working of another
>>> > slave ? if so whats the reason ?
>>>
>>> You're getting that error on the standby, not the primary, right?   In
>>> all likelihood the problem is that you copied over replication slots to
>>> your standby server, but that *on the standby* haven't set
>>> max_replication_slots high enough.
>>>
>>> Greetings,
>>>
>>> Andres Freund
>>>
>>


Re: [External] Re: Slot issues

2018-10-14 Thread Vijaykumar Jain
Sorry Bhargav.

I tried to simulate the scenarios on a
3-node cluster:
1 primary, 2 hot standbys.

given:
max_replication_slots = 2 on the primary, and I try to create 2 hot standbys
-> runs fine.
max_replication_slots = 1 on the primary, and I try to create 2 hot standbys ->
gives PANIC, increase replication slots. (the standby itself only complains that the
replication slot does not exist, not that the number of replication slots should be increased)
max_replication_slots = 2 on the primary and max_replication_slots = 1 on the hot
standby, delete recovery.conf and recover from pg_basebackup -> works fine.
max_replication_slots = 2 on the primary, and on the standby I try to use a
replication slot that already exists -> it gives the correct "slot in use" error
and not the "increase replication slots" error.
my pg_replslot folder is consistent with the replication slots I have on the
master.


so i guess I'll leave it to the experts,
but i still do not understand why you would get that error on the standby,
unless you are doing cascading replication.

I'll watch this thread to see how this is resolved.

Thanks,
Vijay


Regards,
Vijay


On Mon, Oct 15, 2018 at 2:41 AM Andres Freund  wrote:

> Hi,
>
> Please try to quote properly.
>
> On 2018-10-15 01:57:53 +0530, bhargav kamineni wrote:
> > I got his on standby,  could you please explain in detail about
> >  --*but that *on the standby* haven't set max_replication_slots high
> enough*
> > .
>
> What is max_replication_slots set to on the new standby?
>
> If you created the new basebackup using rsync, and didn't exclude
> pg_replication_slot, it'll have copied the slots from the primary.  And
> thus needs a high enough max_replication_slots to work with them.
>
> - Andres
>
>


Re: [External] Re: Slot issues

2018-10-14 Thread Vijaykumar Jain
ah, that explains it.
Thanks Andres.

I do not use rsync, hence I was not able to reproduce it, I guess :)
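
For the archives, a rough sketch of the fix on the affected standby (the value and
paths are illustrative):

# in the standby's postgresql.conf: make room for the slots that were copied over,
# i.e. at least as high as on the primary, then restart the standby
max_replication_slots = 10

# and when seeding a standby with rsync by hand, the slot files can be skipped up
# front, since replication slots should not be carried over to a standby
rsync -a --exclude='pg_replslot/*' primary:/path/to/data/ /path/to/data/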

Regards,
Vijay


On Mon, Oct 15, 2018 at 2:46 AM bhargav kamineni 
wrote:

> Yeah i have used rsync , Got it now will increase the
> max_replication_slots to high enough , Thank you Andres Freund :-)
>
> On Mon, 15 Oct 2018 at 02:40, Andres Freund  wrote:
>
>> Hi,
>>
>> Please try to quote properly.
>>
>> On 2018-10-15 01:57:53 +0530, bhargav kamineni wrote:
>> > I got his on standby,  could you please explain in detail about
>> >  --*but that *on the standby* haven't set max_replication_slots high
>> enough*
>> > .
>>
>> What is max_replication_slots set to on the new standby?
>>
>> If you created the new basebackup using rsync, and didn't exclude
>> pg_replication_slot, it'll have copied the slots from the primary.  And
>> thus needs a high enough max_replication_slots to work with them.
>>
>> - Andres
>>
>


Re: [External] Re: Slot issues

2018-10-14 Thread Vijaykumar Jain
What was the whole point of this divergence? :)

We use replication slots, and it is for this reason that we are able to scale our
applications, by redirecting reads and writes backed by haproxy and
pgbouncer, with a health check on replication lag. It works great, and
at a much, much lower cost than what AWS used to charge us for. (1 master, 3
slaves... that goes to 7-8 during peak days)

With logical replication we were able to create a multimaster
mesh that allowed writes to certain tables across the ocean, with all
regions subscribed to them.
As a result, at any given time all 5 availability zones had the same data, and
we could sustain an occasional WAN outage between two DCs.

We run around 300 Postgres servers, everything managed via puppet and
foreman,
and I have not seen a single issue like the one reported in any of our
setups.
So I guess that is generalising based on just a few threads :)
I do not use DB2, but there has never been a need for it in my company, as
we have multiple Postgres architecture design patterns based on
requirements.




On Mon, 15 Oct 2018 at 3:46 AM Ravi Krishna  wrote:

> >
> > You're not forced to use slots.  Their purpose is to allow to force the
> > primary to keep necessary resources around. Which also allows to get rid
> > of the archive in some setups.
>
> Thanks.
>
> Disclaimer:  We don't use replication as we piggy back on AWS HA.
>
> The reason why I posted this is because majority of replication related
> messages in this forum
> is about slots :-)

-- 

Regards,
Vijay


Re: [External] Change in db size

2019-07-17 Thread Vijaykumar Jain
I guess the restore cleared the bloat from the tables.

\dt+
\di+

If you run the above commands from psql on both databases, you will see the difference in the
sizes of the objects.
Also, querying pg_stat_all_tables you might see "dead" tuples in the old tables,
which would have been cleaned up by the restore.
Also, any large temp objects, or temp files used by queries, would
have been cleared, and hence the disk reclaimed.
If none of these are true, then I guess the experts will pitch in.
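
For a quick comparison, something along these lines on both source and target (a
sketch):

-- overall database size
select pg_size_pretty(pg_database_size(current_database()));

-- tables with the most dead tuples (bloat that a dump/restore would not carry over)
select relname, n_live_tup, n_dead_tup
from pg_stat_all_tables
order by n_dead_tup desc
limit 10;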

On Wed, 17 Jul 2019 at 10:33 PM Sonam Sharma  wrote:

> I have restored database and the db size of source was around 55gb and
> after restore the db size of Target is 47gb.
>
> How to confirm if restore was successful or not ?
> The table count is also same.
>
-- 

Regards,
Vijay


wal_level logical for streaming replication

2019-08-28 Thread Vijaykumar Jain
Hello Team,

wal_level = logical
wal_level = replica

As per docs,
wal_level determines how much information is written to the WAL. The
default value is replica, which writes enough data to support WAL archiving
and replication, including running read-only queries on a standby server.
minimal removes all logging except the information required to recover from
a crash or immediate shutdown. Finally, logical adds information necessary
to support logical decoding. Each level includes the information logged at
all lower levels. This parameter can only be set at server start.
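
For reference, this is the change being discussed (a sketch; it only takes effect at
server start):

-- check the current setting
show wal_level;

-- switch back from logical to replica (written to postgresql.auto.conf), then restart
alter system set wal_level = 'replica';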

I had a situation where I had to upgrade a server using logical
replication, and when it was done I had replicas serving read queries from the
server with wal_level = logical.
All fine.
I have a few dumb questions.

If I change wal_level back to replica, will it corrupt the WAL? Because the WAL will
then carry different information (or a different format of data?).
What is the underlying reason why DDLs are not sent via logical replication,
while the hot standby has DDL changes reflected absolutely fine?
And there is one large limitation around large object support for logical
replication:
where can I see whether the limitation is about size, or just about certain data types?

I used this to generate dummy data in a table with all the data types; it was only
1000 rows and it replicated fine.
I could easily stress this more, but I did not want to do discovery and
spread rumours without known references.

Appreciate your help as always.

Thanks,
Vijay
-- 

Regards,
Vijay


Re: [External] Re: wal_level logical for streaming replication

2019-08-29 Thread Vijaykumar Jain
Thanks Laurenz,


Regards,
Vijay


On Thu, Aug 29, 2019 at 2:07 AM Laurenz Albe  wrote:
>
> On Wed, 2019-08-28 at 21:44 +0530, Vijaykumar Jain wrote:
> > If I change wal_level back to replica, will it corrupt wal? coz it
> > will then be having diff information ( r format of data ?)
>
> That's why you have to restart the server when you change that
> parameter.  This way, there will be a checkpoint marking the change.
>
> > What is the base reason as to why ddls are not sent via logical
> > replication but the hot standby has ddl changes reflected absolutely
> > fine ?
>
> Physical streaming replication just replicates the database files,
> so it also replicates DDL statements, which are just changes to the
> catalog tables.
>
> Basically, anything that can be recovered can be replicated.
>
> Logical replication has to perform "logical decoding", that is,
> it has to translate the (physical) WAL information into logical
> information (what row was modified in which table).
>
> So this is much more complicated.  It should be possible in theory,
> but nobody has got around to solving the difficulties involved yet.
>
> > and there is one large limitation on large object support for logical
> > replication?
> > Where can I see the limitation on size or is it just certain data
> > types ?
>
> This has nothing to do with the size; I guess the answer is the same as
> above.  One problem that I can see immediately is that primary and
> standby don't share the same OIDs, yet every large object is identified
> by its OID.  So I think this is a fundamental problem that cannot be
> solved.
>
> Yours,
> Laurenz Albe
> --
> Cybertec | https://www.cybertec-postgresql.com
>




Profile a db connection time?

2019-10-11 Thread Vijaykumar Jain
Sorry if this is silly, but I have a series of network outages and wanted to
understand if I can profile the various parts of initialising a psql
connection:
like the time to resolve DNS when using a hostname,
the time to actually establish the db connection, and then the time to parse
the query, run the query, and return the results.
I know a lot of the answers could be:
why not do a DNS check before the connection,
run explain analyse to measure only the db time without returning any rows, and
then run it returning the rows and diff the two, etc.
Am I asking something that makes sense?
Something like what Google Chrome does for page loading, etc.?
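
For a rough breakdown, something like the following already helps (a sketch; host and
database names are illustrative):

# time name resolution + connection establishment + a trivial query, end to end
time psql "host=db1.example.com dbname=postgres connect_timeout=5" -c "select 1"

# inside psql, \timing shows per-statement round-trip time as seen by the client
\timing on

# on the server side, log_connections / log_disconnections plus
# log_min_duration_statement = 0 in postgresql.conf log connection events and
# per-statement durations, which can be diffed against the client-side numbers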

-- 

Regards,
Vijay


Re: Could not read block 0 in file

2022-04-08 Thread Vijaykumar Jain
On Fri, 8 Apr 2022 at 14:36,  wrote:

> Hi,
>
> While we are looking for a suitable backup to recover from, I hope this
> community may have some other advice on forward steps in case we cannot
> restore.
>
> RCA: Unexpected shutdown due to critical power failure
>
> Current Issue: The file base/16509/17869 is zero bytes in size.
>
> Additional Information:
> Platform: Windows Server
> PostGres Version: 10.16 (64-bit)
>
> The database does start, and is otherwise functioning and working aside
> from a particular application feature that relies on the lookup of the
> values in the table that was held in the currently zero-bytes data file.
>
> The non-functioning table (ApprovalStageDefinition) is a relatively simple
> table with 5 rows of static data. The contents can easily be recovered with
> a query such as the following for each of the 5 records:
> insert into ApprovalStageDefinition values (1, 'Stage One', 'Stage One');
>
> The error message when running this query is:
> ERROR:  could not read block 0 in file "base/16509/17869": read only 0 of
> 8192 bytes
>
> The file does exist on the file system, with zero bytes, as do the
> associated fsm and vm files.
>
> PostGres does allow us to describe the table:
>  \d ApprovalStageDefinition;
>Table "public.approvalstagedefinition"
>        Column       |  Type  | Collation | Nullable | Default
> -------------------+--------+-----------+----------+---------
>  stageid           | bigint |           | not null |
>  stagename         | citext |           | not null |
>  internalstagename | citext |           | not null |
> Indexes:
> "approvalstagedef_pk" PRIMARY KEY, btree (stageid)
> "approvalstagedefinition_uk1" UNIQUE CONSTRAINT, btree (stagename)
> "approvalstagedefinition_uk2" UNIQUE CONSTRAINT, btree
> (internalstagename)
> Check constraints:
> "approvalstagedefinition_internalstagename_c" CHECK
> (length(internalstagename::text) <= 100)
> "approvalstagedefinition_stagename_c" CHECK (length(stagename::text)
> <= 100)
> Referenced by:
> TABLE "approvaldetails" CONSTRAINT "approvaldetails_fk5" FOREIGN KEY
> (stageid) REFERENCES approvalstagedefinition(stageid) ON DELETE CASCADE
> TABLE "currentapprovalstage" CONSTRAINT "currentapprovalst_fk1"
> FOREIGN KEY (stageid) REFERENCES approvalstagedefinition(stageid) ON DELETE
> CASCADE
> TABLE "serviceapprovermapping" CONSTRAINT "serviceapprovermapping_fk4"
> FOREIGN KEY (stageid) REFERENCES approvalstagedefinition(stageid) ON DELETE
> CASCADE
>
> Desired Solution:
> A way to recreate the data file based on the existing schema so that we
> can then insert the required records.
>
> Challenges/Apprehensions:
> I am a PostGres novice, and reluctant to try dropping the table and
> recreating it based on the existing schema as I don’t know what else it may
> break, especially with regards to foreign keys and references.
>
> Any constructive advice would be appreciated.
>
> Thank you
>
>

in the order of steps

1) Corruption - PostgreSQL wiki
<https://wiki.postgresql.org/wiki/Corruption>
2) PostgreSQL: Documentation: 14: F.2. amcheck
<https://www.postgresql.org/docs/current/amcheck.html>
3) df7cb/pg_filedump: pg_filedump provides facilities for low-level
examination of PostgreSQL tables and indexes (github.com)
<https://github.com/df7cb/pg_filedump>
4) Physical recovery with pg_filedump (alexey-n-chernyshov.github.io)
<https://alexey-n-chernyshov.github.io/blog/physical-recovery-with-pg_filedump.html>
(example usage)
5) Pgopen-Recovery_damaged_cluster(1).pdf (postgresql.org)
<https://wiki.postgresql.org/images/3/3f/Pgopen-Recovery_damaged_cluster%281%29.pdf>
 (using zero_damaged_pages  option to skip/zero error pages and move on)

Although I have never really dealt with disk corruption, so I am not hands-on
with these scenarios, I have tried to replicate some of them by injecting
disk faults with dmsetup on a local disk,
which may or may not be the same as power failure / RAID controller problems,
especially on Windows. But the above should be helpful to at least get the
data (if possible) out of the corrupt pages and also to scan through the entire
db to find out about further problems.
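
As a concrete starting point for item 2 above, a minimal amcheck session might look
like this (a sketch; the index names are taken from the \d output above, and on
PostgreSQL 10 amcheck only verifies B-tree indexes):

create extension if not exists amcheck;

select bt_index_check('approvalstagedef_pk'::regclass);
select bt_index_check('approvalstagedefinition_uk1'::regclass);
select bt_index_check('approvalstagedefinition_uk2'::regclass);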


-- 
Thanks,
Vijay
LinkedIn - Vijaykumar Jain <https://www.linkedin.com/in/vijaykumarjain/>


Re: Could not read block 0 in file

2022-04-08 Thread Vijaykumar Jain
On Fri, 8 Apr 2022 at 15:24, Magnus Hagander  wrote:

>
>
> On Fri, Apr 8, 2022 at 11:06 AM  wrote:
>
>> Hi,
>>
>> While we are looking for a suitable backup to recover from, I hope this
>> community may have some other advice on forward steps in case we cannot
>> restore.
>>
>> RCA: Unexpected shutdown due to critical power failure
>>
>> Current Issue: The file base/16509/17869 is zero bytes in size.
>>
>> The error message when running this query is:
ERROR:  could not read block 0 in file "base/16509/17869": read only 0 of
8192 bytes

i guess it is maybe just the page header/metadata, and not the whole file, that is zero bytes.
the data could then be recovered from the blocks?
i may be corrected :)


Re: Could not read block 0 in file

2022-04-08 Thread Vijaykumar Jain
On Fri, 8 Apr 2022 at 15:31, Vijaykumar Jain <
vijaykumarjain.git...@gmail.com> wrote:

>
>
> On Fri, 8 Apr 2022 at 15:24, Magnus Hagander  wrote:
>
>>
>>
>> On Fri, Apr 8, 2022 at 11:06 AM  wrote:
>>
>>> Hi,
>>>
>>> While we are looking for a suitable backup to recover from, I hope this
>>> community may have some other advice on forward steps in case we cannot
>>> restore.
>>>
>>> RCA: Unexpected shutdown due to critical power failure
>>>
>>> Current Issue: The file base/16509/17869 is zero bytes in size.
>>>
>>> The error message when running this query is:
> ERROR:  could not read block 0 in file "base/16509/17869": read only 0 of
> 8192 bytes
>
> i guess it is maybe page header/ metadata not the whole file is zero
> bytes. the data can be recovered then from the blogs?
> i may be corrected :)
>
>
My bad, sorry, I did not read the email properly.

*The file does exist on the file system, with zero bytes, as do the
associated fsm and vm files.*

As Magnus suggests, then.

-- 
Thanks,
Vijay
LinkedIn - Vijaykumar Jain
<https://www.linkedin.com/in/vijaykumarjain/>


Re: Setting up streaming replication on large database (20+ TB) for the first time

2022-08-17 Thread Vijaykumar Jain
I just did a backup and restore of a replica using pgbackrest.
db size: 28 TB

nvme/ssd storage
96 cpu, 380 gb mem

zst compression, 24 workers for the backup, 12 workers for the restore

2.5 hours to back up
2 hours to restore.
WAL replay is something I forgot to tune, but I could now use
https://pgbackrest.org/configuration.html#section-archive/option-archive-get-queue-max
to speed up pulls too.
Everything is on prem, no cloud FYI, and Gentoo.
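
For reference, the relevant bits of the configuration looked roughly like this (a
sketch; stanza name, paths and sizes are illustrative):

# /etc/pgbackrest/pgbackrest.conf
[global]
repo1-path=/backup/pgbackrest
compress-type=zst
process-max=24
archive-get-queue-max=1GiB

[main]
pg1-path=/var/lib/postgresql/data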



On Thu, Aug 18, 2022, 11:23 AM Ron  wrote:

> pg_backrest will certainly backup your data faster. It might be able to be
> used as a seed instead of pg_basebackup.
>
> On 8/17/22 15:06, Ivan N. Ivanov wrote:
> > I have a large database (~25 TB) and I want to set up streaming
> > replication for the first time.
> >
> > My problem is that after completion of the pg_basebackup (which
> completed
> > for 2 days with --wal-method=none) now PG is replaying the WAL files
> from
> > the WAL archive directory but it can not keep up. The replaying of WAL
> > files is the same as the physical time, for example:
> >
> > 2022-08-17 22:42:57 EEST [13507-6] [] DETAIL:  Last completed
> transaction
> > was at log time 2022-08-15 18:24:02.155289+03.
> > 2022-08-17 22:48:35 EEST [13507-12] [] DETAIL:  Last completed
> transaction
> > was at log time 2022-08-15 18:29:54.962822+03.
> > 2022-08-17 22:54:35 EEST [13507-16] [] DETAIL:  Last completed
> transaction
> > was at log time 2022-08-15 18:34:20.099468+03.
> >
> > From ~22:43 to ~22:48 there are 5 minutes. And completed transactions
> are
> > at ~18:24 and ~18:29 (5 minutes).
> >
> > I have even put all WAL files from the archive directly in the pg_wal
> > directory of the replica and now PostgreSQL skips the cp command from
> > restore_command, i.e. I have removed the restore_command and now the WAL
> > files are only recovering, this is the only operation, but it is slow:
> >
> > postgres: startup   recovering 00010003FC790013
> > postgres: startup   recovering 00010003FC790014
> > postgres: startup   recovering 00010003FC790015
> > ...
> >
> > And it cannot keep up and my replication cannot start since it is 2 days
> > behind the master... The replica has the same SSD disks as the master.
> >
> > Is there a better way to do this? How to speed up recovering of WAL
> files?
> > I have increased shared_buffers as much as I can... Is there something
> > that I miss from the recovery process?
> >
> > I do not have problems setting up replications for the first time for
> > small database (10 GB - 100 GB), but for 25 TB I can not set the
> > replication, because of this lag.
> >
>
> --
> Angular momentum makes the world go 'round.
>
>
>


Re: [(catch-ext)] Re: Setting up streaming replication on large database (20+ TB) for the first time

2022-08-18 Thread Vijaykumar Jain
Sorry for top posting, I'm on my phone.

But pgbackrest helped with exactly that. With compression and parallel
processes for the backup, the backup and restore were quick. I used it in a way where I
took a backup and immediately did the restore, so there were fewer WALs to replay; otherwise
WAL replay is indeed slow.

On Thu, Aug 18, 2022, 1:03 PM Ivan N. Ivanov 
wrote:

> Thank you, people. The big problem in my case, which I have not mentioned,
> is that I think the network is the bottleneck, because I am running
> pg_basebackup over the internet from my local country to an Amazon instance in
> Germany, and the copy speed is around 50 MB/sec max; that is why it
> takes 2 days to copy.
>
> I will try using high compression for the basebackup to reduce the time.
>
> pgbackrest is an alternative, too
>
> Thank you again!
>
>
>
> On Wed, Aug 17, 2022 at 11:13 PM Ivan N. Ivanov 
> wrote:
>
>> Thank you for your answer! I have found this tool and I will try it
>> tomorrow to see if this "read-ahead" feature will speed up the process.
>>
>> On Wed, Aug 17, 2022 at 11:09 PM Christophe Pettus 
>> wrote:
>>
>>>
>>>
>>> > On Aug 17, 2022, at 13:06, Ivan N. Ivanov 
>>> wrote:
>>> >
>>> > How to speed up recovering of WAL files?
>>>
>>> Since you are running on your own hardware, you might take a look at:
>>>
>>> https://github.com/TritonDataCenter/pg_prefaulter
>>>
>>


Multi master disjoint cluster

2022-10-26 Thread Vijaykumar Jain
Hi all,

tl;dr
I have a simple question,
given the choice, if I can write the same data to two databases in parallel,
should I opt for a primary/replica setup or a multi-writer/multi-master setup? This
setup has the ability to make use of Kafka consumer groups (like two
replication slots, each having its own LSN offsets) to write to both db
nodes of a pair in parallel via the application layer.

The churn of data is really high, there is a lot of wals generated, around
500gb/hr.

If I go with primary/replica (logical replication is not feasible):

I need to ensure both are on the same major version. Upgrades are tricky
(we don't have QA), so we only have the option of a schema dump for checking upgrade
compatibility. The data, we trust PostgreSQL with that :). (I wish we had ZFS
everywhere, but no.)

Any exclusive table-blocking operations (although with later versions there are
far fewer blocking operations) can impact queries on the replica as well
(delay settings aside).

Corruption can cause downtime (we have tons of corruption incidents, and RAID to protect
against them), so if a replica is having issues, we can zero the pages on the replica
and do some surgery if we can isolate the problem pages, else resync the
replica from the primary. But if the primary is having issues, we copy data
from the replica to disk and copy it into the primary after truncating etc. Some
downtime, but not a lot. (I am not an expert at data recovery) and we mostly rely
on amcheck, dd, and RAID checks.

We don't use PITR (too many WALs × 58) or delayed replication, as we can't
afford more servers.

DDL deploys are guaranteed by replication, so there is no need to try 2PC-like schemes
at the app layer. (Although apps use deploy tools to ensure eventually the DDLs
are consistent and idempotent.)

Basically, primary/replica relieves the app: whatever is on the
primary is eventually also on the replica, so there is a single source of truth.

But with multiple writers, any app mishandling, like a bug in exception handling,
can result in divergence and no more mirrored setup.
We need checks/reconciliation to ensure both write nodes in a pair
have almost the same data at the end of the day, so we can trust this setup
independently of any app mistakes.

But if the app layer is made robust, we have almost no downtime for reads and
writes, we can have the two nodes on different versions (without logical
replication), we can query both nodes in real time, there are no real replication lag issues,
conflicts etc., and we can upgrade blue/green and canary-test some changes on
one node if needed.

Am I making sense at all? Or am I sounding confused, as if I don't know the
difference between primary/replica and multi-writer? This is not a BDR-like
thing; the nodes don't really need each other unless we are into some recovery.

My point is, we have 58 such primary/replica shards (each 10TB+,
consistent hashing at the app layer, no FDW), and there is no scope for
downtime on reads, so any issue like post-upgrade performance degradation
(if any) gives me chills, and we have no QA to test with real data.

There are too many dimensions to shard on, and aggregations need to run
across the shards (yes, there is no scope for data isolation).


Re: Multi master disjoint cluster

2022-10-26 Thread Vijaykumar Jain
Actually, please ignore my email.

Re-reading my mail makes it look like I did not research this thoroughly and
just asked without actually implementing both options, or having a clear
goal on what I can compromise on, along with the no-read-downtime requirement.
I'll write better next time.

>


Re: postgres large database backup

2022-11-30 Thread Vijaykumar Jain
On Wed, Nov 30, 2022, 9:10 PM Atul Kumar  wrote:

> Hi,
>
> I have a 10TB database on PostgreSQL 11, running on CentOS 7
> "on premises". I need to schedule backups of this database in a faster
> way.
>
> The scheduled backup will be used for PITR purposes.
>
> So please let me know how I can take a quicker backup of my 10TB
> database. Is there any tool to take backups, and subsequently incremental
> backups, in a faster way, and to restore them for PITR in a faster way when
> required?
>
> What should be the exact approach for scheduling such backups so that they
> can be restored in a faster way?
>


We had a bare-metal server with 96 CPUs, 385 GB RAM, NVMe storage and a 10G network.
We used pgbackrest for full backups.
It supports PITR and differential backups.

A 28 TB db took 2.5 hours to back up to remote storage, and the restore from
remote storage took 3.5 hours when restored immediately (a lot of additional time goes
later into WAL replay to catch up).

pg_basebackup took 9 hours.
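
For reference, the commands look roughly like this (a sketch; the stanza name is
illustrative and assumes pgbackrest is already configured for the cluster):

# scheduled full and differential backups
pgbackrest --stanza=main --type=full backup
pgbackrest --stanza=main --type=diff backup

# point-in-time restore to a given timestamp
pgbackrest --stanza=main --delta --type=time "--target=2022-11-30 12:00:00" restore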


Re: postgres large database backup

2022-12-01 Thread Vijaykumar Jain
On Thu, Dec 1, 2022, 7:11 PM Mladen Gogala  wrote:

> On 11/30/22 20:41, Michael Loftis wrote:
>
>
> ZFS snapshots don’t typically have much if  any performance impact versus
> not having a snapshot (and already being on ZFS) because it’s already doing
> COW style semantics.
>
> Hi Michael,
>
> I am not sure that such statement holds water. When a snapshot is taken,
> the amount of necessary I/O requests goes up dramatically. For every block
> that snapshot points to, it is necessary to read the block, write it to the
> spare location and then overwrite it, if you want to write to a block
> pointed by snapshot. That gives 3 I/O requests for every block written.
> NetApp is trying to optimize it by using 64MB blocks, but ZFS on Linux
> cannot do that, they have to use standard CoW because they don't have the
> benefit of their own hardware and OS. And the standard CoW is tripling the
> number of I/O requests for every write to the blocks pointed to by the
> snapshot, for every snapshot. CoW is a very expensive animal, with horns.
>

I am not an expert in this area, but we have ZFS for specific instances
which hold timeseries/event-log data, and where we also need compression.
One day there was a need to snapshot a 35 TB ZFS pool and send it across
the network to a replica, because both disks in the mirror degraded around the
same time. I do not recall the ZFS snapshot taking anything resource-intensive,
and it was quick. I'll ask around for the actual time, though.

We have more than 500 nodes of this type with ZFS (each having 20 disks
in mirrors, each 8 TB) for event logs with compression, and ZFS works just
fine. This is a special setup where the data is assumed to be cold storage,
hence the compression, so it was designed for heavy writes and occasional read
queries only for debugging.

>


Re: postgres large database backup

2022-12-01 Thread Vijaykumar Jain
>  I do not recall zfs snapshots took anything resource intensive, and it
> was quick.ill ask around for actual time.
>

Ok, just a small note: our ingestion pattern is write anywhere, read
globally. So we did stop ingestion while the snapshot was taken, as we could
afford to do it that way. Maybe the story is different when a snapshot is taken on
a live system which generates a lot of delta.

>


incomplete startup packet messages in logs

2018-11-01 Thread Vijaykumar Jain
Hi Team,

I know this has been answered a lot on the internet with "just ignore it",
but I am unable to figure out why I get these messages in the logs:
"incomplete startup packet"

I have a server running PostgreSQL 10.5 on Ubuntu 16.04.

I do not have any active external connections other than streaming
Postgres replication.


postgres  7757 1  0 12:20 ?00:00:00
/usr/lib/postgresql/10/bin/postgres -D /var/lib/postgresql/10/main -c
config_file=/etc/postgresql/10/main/postgresql.conf

postgres  7759  7757  0 12:20 ?00:00:00 postgres: 10/main:
checkpointer process

postgres  7760  7757  0 12:20 ?00:00:00 postgres: 10/main:
writer process

postgres  7761  7757  0 12:20 ?00:00:00 postgres: 10/main: wal
writer process

postgres  7762  7757  0 12:20 ?00:00:00 postgres: 10/main:
autovacuum launcher process

postgres  7763  7757  0 12:20 ?00:00:00 postgres: 10/main:
archiver process

postgres  7764  7757  0 12:20 ?00:00:00 postgres: 10/main:
stats collector process

postgres  7765  7757  0 12:20 ?00:00:00 postgres: 10/main:
bgworker: logical replication launcher

postgres  7779  7757  0 12:20 ?00:00:00 postgres: 10/main: wal
sender process replicator x.x.x.x(47792) streaming 3/FA0001B0

postgres  7780  7757  0 12:20 ?00:00:00 postgres: 10/main: wal
sender process replicator x.x.x.x(50526) streaming 3/FA0001B0

postgres  7786  7757  0 12:21 ?00:00:00 postgres: 10/main: wal
sender process barman_streaming x.x.x.x(43566) streaming 3/FA0001B0



also,

postgres=# select backend_type, query from pg_stat_activity;

    backend_type     |                       query
---------------------+----------------------------------------------------
 autovacuum launcher |
 background worker   |
 walsender           |
 walsender           |
 walsender           |
 client backend      | select backend_type, query from pg_stat_activity;
 background writer   |
 checkpointer        |
 walwriter           |



but i still see


tail -1000 /var/log/postgresql/postgresql-10-main.log| grep
'incomplete startup packet'

2018-11-01 13:04:18 UTC LOG:  incomplete startup packet

2018-11-01 13:08:18 UTC LOG:  incomplete startup packet

2018-11-01 13:12:18 UTC LOG:  incomplete startup packet


Note: exactly at 4 min interval.

i do not know C, but i tried to understand something from
https://github.com/postgres/postgres/blob/197e4af9d5da180190a0f2be851b095dba57d9cd/src/backend/postmaster/postmaster.c#L1891

without any active connections, i would rule out any driver issue.

i did have ssl enabled but was not using it; I tried disabling it too.

postgres=# show ssl;

 ssl

-

 off

(1 row)



I do not know, then, why those log messages are showing up.

No errors on the network interfaces either:

ens160Link encap:Ethernet  HWaddr 00:50:56:a4:53:b6

  inet addr:x.x.x.x  Bcast:x.x.x.x Mask:x.x.x.x

  inet6 addr: x/64 Scope:Link

  UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

  RX packets:45403371 errors:0 dropped:0 overruns:0 frame:0

  TX packets:8834699 errors:0 dropped:0 overruns:0 carrier:0

  collisions:0 txqueuelen:1000

  RX bytes:6417380742 (6.4 GB)  TX bytes:56209193635 (56.2 GB)


loLink encap:Local Loopback

  inet addr:127.0.0.1  Mask:255.0.0.0

  inet6 addr: ::1/128 Scope:Host

  UP LOOPBACK RUNNING  MTU:65536  Metric:1

  RX packets:28985185 errors:0 dropped:0 overruns:0 frame:0

  TX packets:28985185 errors:0 dropped:0 overruns:0 carrier:0

  collisions:0 txqueuelen:1

  RX bytes:7078785550 (7.0 GB)  TX bytes:7078785550 (7.0 GB)







Regards,
Vijay



Re: incomplete startup packet messages in logs

2018-11-01 Thread Vijaykumar Jain
ok i enabled log_connections

2018-11-01 13:44:18 UTC LOG:  connection received: host=::1 port=47574
2018-11-01 13:44:18 UTC LOG:  incomplete startup packet

i see this in my /etc/hosts

::1 localhost ip6-localhost ip6-loopback


and

host   all all ::1/128 md5

in my pg_hba.conf


i'll check what is attempting to make a connection every 4 minutes.
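
A quick way to catch the culprit on the loopback (a sketch; run as root and wait for
the next 4-minute mark):

# watch for whoever opens a connection to 5432 over ::1
tcpdump -i lo -nn 'tcp port 5432'

# or show the owning process of any established connection to/from 5432
ss -tnp state established '( dport = :5432 or sport = :5432 )'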


Regards,
Vijay



Re: incomplete startup packet messages in logs

2018-11-01 Thread Vijaykumar Jain
i have disabled all the monitoring i am aware of for postgresql (sensu,
collectd, and the occasional queries via haproxy and pgbouncer).

i did a tcpdump (i am not a pro here), but ignored the IPv6 loopback.
i'll paste the output once i capture it again.



Regards,
Vijay

On Thu, Nov 1, 2018 at 7:17 PM Vijaykumar Jain  wrote:
>
> ok i enabled log_connections
>
> 2018-11-01 13:44:18 UTC LOG:  connection received: host=::1 port=47574
> 2018-11-01 13:44:18 UTC LOG:  incomplete startup packet
>
> i see this in my /etc/hosts
>
> ::1 localhost ip6-localhost ip6-loopback
>
>
> and
>
> host   all all ::1/128 md5
>
> in my hba,conf
>
>
> i'll check why there is an attempt to make a connection.
>
>
> Regards,
> Vijay
> On Thu, Nov 1, 2018 at 6:57 PM Vijaykumar Jain  wrote:
> >
> > Hi Team,
> >
> > I know this is has been answered a lot on the internet wrt ignoring,
> > but i am unable to figure out why I get these messages in logs
> > "incomplete startup packet"
> >
> > I have a server running postgresql 10.5 on ubuntu16.04
> >
> > i do not have any active external connections other than streaming
> > postgres replication.
> >
> >
> > postgres  7757 1  0 12:20 ?00:00:00
> > /usr/lib/postgresql/10/bin/postgres -D /var/lib/postgresql/10/main -c
> > config_file=/etc/postgresql/10/main/postgresql.conf
> >
> > postgres  7759  7757  0 12:20 ?00:00:00 postgres: 10/main:
> > checkpointer process
> >
> > postgres  7760  7757  0 12:20 ?00:00:00 postgres: 10/main:
> > writer process
> >
> > postgres  7761  7757  0 12:20 ?00:00:00 postgres: 10/main: wal
> > writer process
> >
> > postgres  7762  7757  0 12:20 ?00:00:00 postgres: 10/main:
> > autovacuum launcher process
> >
> > postgres  7763  7757  0 12:20 ?00:00:00 postgres: 10/main:
> > archiver process
> >
> > postgres  7764  7757  0 12:20 ?00:00:00 postgres: 10/main:
> > stats collector process
> >
> > postgres  7765  7757  0 12:20 ?00:00:00 postgres: 10/main:
> > bgworker: logical replication launcher
> >
> > postgres  7779  7757  0 12:20 ?00:00:00 postgres: 10/main: wal
> > sender process replicator x.x.x.x(47792) streaming 3/FA0001B0
> >
> > postgres  7780  7757  0 12:20 ?00:00:00 postgres: 10/main: wal
> > sender process replicator x.x.x.x(50526) streaming 3/FA0001B0
> >
> > postgres  7786  7757  0 12:21 ?00:00:00 postgres: 10/main: wal
> > sender process barman_streaming x.x.x.x(43566) streaming 3/FA0001B0
> >
> >
> >
> > also,
> >
> > postgres=# select backend_type, query from pg_stat_activity;
> >
> > backend_type |   query
> >
> > -+---
> >
> >  autovacuum launcher |
> >
> >  background worker   |
> >
> >  walsender   |
> >
> >  walsender   |
> >
> >  walsender   |
> >
> >  client backend  | select backend_type, query from pg_stat_activity;
> >
> >  background writer   |
> >
> >  checkpointer|
> >
> >  walwriter   |
> >
> >
> >
> > but i still see
> >
> >
> > tail -1000 /var/log/postgresql/postgresql-10-main.log| grep
> > 'incomplete startup packet'
> >
> > 2018-11-01 13:04:18 UTC LOG:  incomplete startup packet
> >
> > 2018-11-01 13:08:18 UTC LOG:  incomplete startup packet
> >
> > 2018-11-01 13:12:18 UTC LOG:  incomplete startup packet
> >
> >
> > Note: exactly at 4 min interval.
> >
> > i do not know C, but i tried to understand something from
> > https://github.com/postgres/postgres/blob/197e4af9d5da180190a0f2be851b095dba57d9cd/src/backend/postmaster/postmaster.c#L1891
> >
> > without any active connections, i would rule any driver issue out.
> >
> > i do not have ssl enabled, but not using it. I tried disabling it too.
> >
> > postgres=# show ssl;
> >
> >  ssl
> >
> > -
> >
> >  off
> >
> > (1 row)
> >
> >
> >
> > I do not know, why those log messages are then showing up.
> >
> > not errors in
> >
> > ens160Link encap:Ethernet  HWaddr 00:50:56:a4:53:b6
> >
> >   inet addr:x.x.x.x  Bcast:x.x.x.x Mask:x.x.x.x
> >
> >   inet6 addr: x/64 Scope:Link
> >

Re: incomplete startup packet messages in logs

2018-11-01 Thread Vijaykumar Jain
Thanks Steve,

It was indeed monitoring.

Unlike the other checks, I had left one enabled that does not query postgres
at all; it only checks that the TCP port is reachable:

check-ports.rb -H localhost -p 5432

which basically just opens a socket to verify something is listening, without
ever sending a startup packet or closing the connection cleanly.


$:~# irb


irb(main):001:0> require 'socket';

irb(main):002:0* a = TCPSocket.new('localhost', 5432)

=> #

irb(main):003:0> ^C

and in logs

2018-11-01 14:05:10 UTC LOG:  connection received: host=127.0.0.1 port=59786

2018-11-01 14:05:24 UTC LOG:  incomplete startup packet


so it was indeed monitoring, just as the answers all over the internet suggest :)

but thanks for helping out :)


Regards,
Vijay
On Thu, Nov 1, 2018 at 7:20 PM Vijaykumar Jain  wrote:
>
> i have disabled all monitoring i am aware of for postgresql. ( it is
> sensu/collectd/ and occasional queries via haproxy and pgbouncer) i
> have disabled them all.
>
> i did tcpdump (i am not a pro here), but ignored the ip6 loopback.
> i'll paste the output once i get again.
>
>
>
> Regards,
> Vijay
>
> On Thu, Nov 1, 2018 at 7:17 PM Vijaykumar Jain  wrote:
> >
> > ok i enabled log_connections
> >
> > 2018-11-01 13:44:18 UTC LOG:  connection received: host=::1 port=47574
> > 2018-11-01 13:44:18 UTC LOG:  incomplete startup packet
> >
> > i see this in my /etc/hosts
> >
> > ::1 localhost ip6-localhost ip6-loopback
> >
> >
> > and
> >
> > host   all all ::1/128 md5
> >
> > in my hba,conf
> >
> >
> > i'll check why there is an attempt to make a connection.
> >
> >
> > Regards,
> > Vijay
> > On Thu, Nov 1, 2018 at 6:57 PM Vijaykumar Jain  wrote:
> > >
> > > Hi Team,
> > >
> > > I know this is has been answered a lot on the internet wrt ignoring,
> > > but i am unable to figure out why I get these messages in logs
> > > "incomplete startup packet"
> > >
> > > I have a server running postgresql 10.5 on ubuntu16.04
> > >
> > > i do not have any active external connections other than streaming
> > > postgres replication.
> > >
> > >
> > > postgres  7757 1  0 12:20 ?00:00:00
> > > /usr/lib/postgresql/10/bin/postgres -D /var/lib/postgresql/10/main -c
> > > config_file=/etc/postgresql/10/main/postgresql.conf
> > >
> > > postgres  7759  7757  0 12:20 ?00:00:00 postgres: 10/main:
> > > checkpointer process
> > >
> > > postgres  7760  7757  0 12:20 ?00:00:00 postgres: 10/main:
> > > writer process
> > >
> > > postgres  7761  7757  0 12:20 ?00:00:00 postgres: 10/main: wal
> > > writer process
> > >
> > > postgres  7762  7757  0 12:20 ?00:00:00 postgres: 10/main:
> > > autovacuum launcher process
> > >
> > > postgres  7763  7757  0 12:20 ?00:00:00 postgres: 10/main:
> > > archiver process
> > >
> > > postgres  7764  7757  0 12:20 ?00:00:00 postgres: 10/main:
> > > stats collector process
> > >
> > > postgres  7765  7757  0 12:20 ?00:00:00 postgres: 10/main:
> > > bgworker: logical replication launcher
> > >
> > > postgres  7779  7757  0 12:20 ?00:00:00 postgres: 10/main: wal
> > > sender process replicator x.x.x.x(47792) streaming 3/FA0001B0
> > >
> > > postgres  7780  7757  0 12:20 ?00:00:00 postgres: 10/main: wal
> > > sender process replicator x.x.x.x(50526) streaming 3/FA0001B0
> > >
> > > postgres  7786  7757  0 12:21 ?00:00:00 postgres: 10/main: wal
> > > sender process barman_streaming x.x.x.x(43566) streaming 3/FA0001B0
> > >
> > >
> > >
> > > also,
> > >
> > > postgres=# select backend_type, query from pg_stat_activity;
> > >
> > > backend_type |   query
> > >
> > > -+---
> > >
> > >  autovacuum launcher |
> > >
> > >  background worker   |
> > >
> > >  walsender   |
> > >
> > >  walsender   |
> > >
> > >  walsender   |
> > >
> > >  client backend  | select backend_type, query from pg_stat_activity;
> > >
> > >  background writer   |
> > >
> > >  checkpointer|
> > >
> > >  walwriter   |
> > >
> > >
> > >
> > > but i still 

simple query on why a merge join plan got selected

2018-12-15 Thread Vijaykumar Jain
Hey Guys,

I was just playing around exploring joins and plans and I came across this:

create table t1(a int);
create table t2(a int);
insert into t1 select (x % 10) from generate_series(1, 10) x;
insert into t2 select (x % 100) from generate_series(1, 10) x;

pgtesting=> analyze t1;
ANALYZE

pgtesting=> analyze t2;
ANALYZE


this is reproducible

By default the query below uses a merge join, which takes roughly twice as long
to return rows as the hash join I get when I explicitly disable merge joins.
I am not able to figure out why, even though I have run ANALYZE on the tables.


pgtesting=> explain (analyze, buffers) select * from t1 join t2 using (a);

  QUERY PLAN

---

 Merge Join  (cost=19495.64..1039705.09 rows=97241600 width=4) (actual
time=124.153..22243.262 rows=1 loops=1)

   Merge Cond: (t1.a = t2.a)

   Buffers: shared hit=886, temp read=320384 written=616

   ->  Sort  (cost=9747.82..9997.82 rows=10 width=4) (actual
time=56.442..81.611 rows=10 loops=1)

 Sort Key: t1.a

 Sort Method: external merge  Disk: 1376kB

 Buffers: shared hit=443, temp read=172 written=173

 ->  Seq Scan on t1  (cost=0.00..1443.00 rows=10 width=4)
(actual time=0.030..10.003 rows=10 loops=1)

   Buffers: shared hit=443

   ->  Sort  (cost=9747.82..9997.82 rows=10 width=4) (actual
time=67.702..9469.366 rows=10001 loops=1)

 Sort Key: t2.a

 Sort Method: external sort  Disk: 1768kB

 Buffers: shared hit=443, temp read=220222 written=443

 ->  Seq Scan on t2  (cost=0.00..1443.00 rows=10 width=4)
(actual time=0.013..8.186 rows=10 loops=1)

   Buffers: shared hit=443

 Planning time: 0.402 ms

 Execution time: 26093.192 ms

(17 rows)


pgtesting=> set enable_mergejoin TO FALSE;

SET

pgtesting=> explain (analyze, buffers) select * from t1 join t2 using (a);

 QUERY PLAN

-

 Hash Join  (cost=3084.00..1117491.00 rows=97241600 width=4) (actual
time=26.893..10229.924 rows=1 loops=1)

   Hash Cond: (t1.a = t2.a)

   Buffers: shared hit=889, temp read=273 written=271

   ->  Seq Scan on t1  (cost=0.00..1443.00 rows=10 width=4)
(actual time=0.028..18.123 rows=10 loops=1)

 Buffers: shared hit=443

   ->  Hash  (cost=1443.00..1443.00 rows=10 width=4) (actual
time=26.255..26.255 rows=10 loops=1)

 Buckets: 131072  Batches: 2  Memory Usage: 2713kB

 Buffers: shared hit=443, temp written=152

 ->  Seq Scan on t2  (cost=0.00..1443.00 rows=10 width=4)
(actual time=0.017..9.163 rows=10 loops=1)

   Buffers: shared hit=443

 Planning time: 0.099 ms

 Execution time: 14095.975 ms

(12 rows)


pgtesting=> show work_mem;

 work_mem

--

 4MB

(1 row)


pgtesting=> show shared_buffers;

 shared_buffers



 1GB

(1 row)


pgtesting=> select version();


version

-

 PostgreSQL 10.5 (Ubuntu 10.5-1.pgdg16.04+1) on x86_64-pc-linux-gnu,
compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609,
64-bit

(1 row)


Maybe I am missing something obvious :), but my concern is that joins like
this, with lots of duplicate key values, may not be using the statistics correctly?

Regards,
Vijay



Re: [External] Re: simple query on why a merge join plan got selected

2018-12-17 Thread Vijaykumar Jain
Thanks a lot Tom, as always :)
We generally do not have so many duplicates in production, so maybe this is
an edge case but I am happy with the explanation and the code reference for
the analysis.
I’ll also play with default_statistics_target to see what changes when I
increase the value.
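
For reference, roughly what I mean by playing with the statistics target (the
values here are just examples, not a recommendation):

-- per-column, on the join keys from the test case above, followed by a fresh ANALYZE
ALTER TABLE t1 ALTER COLUMN a SET STATISTICS 1000;
ALTER TABLE t2 ALTER COLUMN a SET STATISTICS 1000;
ANALYZE t1;
ANALYZE t2;

-- or for the whole session, before re-running ANALYZE
SET default_statistics_target = 1000;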


On Sun, 16 Dec 2018 at 5:52 AM Tom Lane  wrote:

> Vijaykumar Jain  writes:
> > I was just playing with exploring joins and plans i came across this
> > create table t1(a int);
> > create table t2(a int);
> > insert into t1 select (x % 10) from generate_series(1, 10) x;
> > insert into t2 select (x % 100) from generate_series(1, 10) x;
> > ...
> > select * from t1 join t2 using (a);
>
> Hm.  This is a fairly extreme case for mergejoining.  In the first place,
> because of the disparity in the key ranges (t1.a goes from 0..9, t2.a
> from 0..99) the planner can figure out that a merge join can stop after
> scanning only 10% of t2.  That doesn't help much here, since we still
> have to sort all of t2, but nonetheless the planner is going to take
> that into account.  In the second place, because you have so many
> duplicate values, most rows in t1 will require "rescanning" 1000 rows
> that were already read and joined to the previous row of t1 (assuming
> t1 is on the left of the join; it's worse if t2 is on the left).
>
> The planner estimates each of those situations properly, but it looks
> to me like it is not handling the combination of both effects correctly.
> In costsize.c we've got
>
> /*
>  * The number of tuple comparisons needed is approximately number of
> outer
>  * rows plus number of inner rows plus number of rescanned tuples (can
> we
>  * refine this?).  At each one, we need to evaluate the mergejoin
> quals.
>  */
> startup_cost += merge_qual_cost.startup;
> startup_cost += merge_qual_cost.per_tuple *
> (outer_skip_rows + inner_skip_rows * rescanratio);
> run_cost += merge_qual_cost.per_tuple *
> ((outer_rows - outer_skip_rows) +
>  (inner_rows - inner_skip_rows) * rescanratio);
>
> where outer_rows and inner_rows are the numbers of rows we're predicting
> to actually read from each input, the xxx_skip_rows values are zero for
> this example, and rescanratio was previously computed as
>
> /* We'll inflate various costs this much to account for rescanning */
> rescanratio = 1.0 + (rescannedtuples / inner_path_rows);
>
> where inner_path_rows is the *total* size of the inner relation,
> including rows that we're predicting won't get read because of the
> stop-short effect.
>
> As far as I can tell, that comment's claim about the number of tuple
> comparisons needed is on-target ... but the code is computing a number
> of tuple comparisons 10x less than that.  The reason is that rescanratio
> is wrong: it should be
>
> rescanratio = 1.0 + (rescannedtuples / inner_rows);
>
> instead, so that it's something that makes sense to multiply inner_rows
> by.  In the existing uses of rescanratio, one multiplies it by
> inner_path_rows and needs to be changed to inner_rows to agree with
> this definition, but the other uses are already consistent with this.
>
> This doesn't make a significant difference if either rescannedtuples
> is small, or inner_rows isn't much less than inner_path_rows.  But
> when neither is true, we can greatly underestimate the number of tuple
> comparisons we'll have to do, as well as the number of re-fetches from
> the inner plan node.  I think in practice it doesn't matter that often,
> because in such situations we'd usually not have picked a mergejoin
> anyway.  But in your example the buggy mergejoin cost estimate is about
> 10% less than the hashjoin cost estimate, so we go with mergejoin.
>
> The attached proposed patch fixes this, raising the mergejoin cost
> estimate to about 35% more than the hashjoin estimate, which seems
> a lot closer to reality.  It doesn't seem to change any results in
> the regression tests, which I find unsurprising: there are cases
> like this in the tests, but as I just said, they pick hashjoins
> already.
>
> Also interesting is that after this fix, the estimated costs of a
> mergejoin for this example are about the same whether t1 or t2 is on
> the left.  I think that's right: t2-on-the-left has 10x more rescanning
> to do per outer tuple, but it stops after scanning only 10% of the
> outer relation, canceling that out.
>
> I'm not sure whether to back-patch this.  It's a pretty clear thinko,
> but there's the question of whether we'd risk destabilizing plan
> choices that are working OK in the real world.
>
> regards, tom lane
>
> --

Regards,
Vijay


Re: [External] Re: Geographical multi-master replication

2019-01-24 Thread Vijaykumar Jain
I do not know the use case but we did try the following.
We had a small requirement where some regional data is written to local tables
but needs to be available in all regions.
We used logical replication to publish each region's local table to all the
other regions (many to many).
In theory, if there were no network delay, every region would have all the data
at any point in time.
If there is a WAN outage between A and B,
they both still talk to C, so C can be considered the region with the most
complete data until the others have recovered.

basically given we had disjoint data, we were able to make use of this
feature.

And it is definitely not for HA; there are other options for that.
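
Roughly, the shape of that setup was the following (a simplified sketch with
made-up names; region A shown, B and C are symmetric):

-- on region A: publish A's locally-written table
CREATE PUBLICATION pub_region_a FOR TABLE region_a_events;

-- on regions B and C: the table definition must already exist there, then
CREATE SUBSCRIPTION sub_from_a
    CONNECTION 'host=region-a.example dbname=app user=replicator password=secret'
    PUBLICATION pub_region_a;

-- repeat symmetrically for B's and C's tables, so every region ends up
-- subscribed to every other region's publication (many-to-many).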

On Fri, 25 Jan 2019 at 11:16 AM Andreas Kretschmer 
wrote:

>
>
> Am 25.01.19 um 06:10 schrieb Jeremy Finzel:
> >
> > The problem is that the version for BDR 1.0.7, which has an
> > implementation for postgres 9.4, will be on end of live at the end
> > of this year. Unfortunately the paid solution is out of our
> > budget, so we currently have two options: find an alternative or
> > remove the multi-region implementation. We are currently looking
> > for alternatives.
> >
> >
> > You are missing all of the alternatives here.  Why don't you consider
> > upgrading from postgres 9.4 and with it to a supported version of
> > BDR?  There is nothing better you can do to keep your infrastructure
> > up to date, performant, secure, and actually meet your multi-master
> > needs than to upgrade to a newer version of postgres which does have
> > BDR support.
> >
> > Even "stock" postgres 9.4 is set for end of life soon. Upgrade!
>
> ACK!
>
> Sure, you have to pay for a support contract, and this isn't for free,
> but you will get a first-class support for BDR. If you really needs a
> worldwide distributed multi-master solution you should be able to buy that.
>
>
> Regards, Andreas
>
> --
> 2ndQuadrant - The PostgreSQL Support Company.
> www.2ndQuadrant.com
>
>
> --

Regards,
Vijay


FDW, too long to run explain

2019-02-03 Thread Vijaykumar Jain
Hi,

with pg v10.1

I have a setup enabled as below.
7 shards ( 1RW,  2 RO )
they all are fronted by FDW talking to each other.

we use writes directly to shards, and reads via FDW from all shards (RO)
our DB size is ~ 500GB each shard, and tables are huge too.
1 table ~ 200GB, 1 ~55GB, 1 ~40GB and a lot of small tables, but large
indices on large table.

the sharding was done based on a key to enable shard isolation at app layer
using a fact table.
select id,shard from fact_table;

server resources are,
32GB mem, 8 vcpu, 500GB SSD.

the shards connect to each other via FDW, fronted by haproxy -> pgbouncer
-> postgresql.
Hope this is good enough background :)
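
For context, the coordinator side is wired up roughly like this (a simplified
sketch; the host name, port, user and fetch_size value are placeholders, not
our real settings):

CREATE EXTENSION IF NOT EXISTS postgres_fdw;

CREATE SERVER sh01 FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'sh01-haproxy.example', port '6432', dbname 'app',
             use_remote_estimate 'true', fetch_size '10000');

CREATE USER MAPPING FOR app_user SERVER sh01
    OPTIONS (user 'app_user', password 'secret');

-- each shard's public schema is imported into its own schema on the coordinator
CREATE SCHEMA sh01;
IMPORT FOREIGN SCHEMA public FROM SERVER sh01 INTO sh01;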

Now we have some long-running queries via FDW that take minutes and get killed;
the EXPLAINs they issue show up as idle in transaction on the remote servers
(we set use_remote_estimate = true).
When the query is run on an individual shard directly, it runs pretty quickly,
but when run via FDW it takes very long.
I even altered fetch_size to 1, so that in case some filters do not get
pushed down, they can be applied quickly on the FDW side.
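
(For anyone curious, changing it is just an option change on the server or on a
single foreign table; the names below are placeholders:)

ALTER SERVER sh01 OPTIONS (SET fetch_size '100');           -- use ADD if it was never set
ALTER FOREIGN TABLE sh01.view1 OPTIONS (ADD fetch_size '100');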

But I am lost as to why EXPLAIN runs forever via FDW.
We have a view on each remote server; we import the public schema from each
remote server into a custom schema on the coordinator, and then UNION ALL:

select * from (
select * from sh01.view1
union all
select * from sh02.view1
...
) t where t.foo = 'bar' limit 10;


now the explain for
select * from sh01.view1  keeps running for minutes sometimes,

then fetch too keeps running for minutes, although the total rows are <
1 maybe.
idle in transaction | FETCH 1 FROM c1

we have very aggressive  settings for autovacuum and auto analyze.

autovacuum_naptime = '15s'
autovacuum_vacuum_scale_factor = '0.001'
autovacuum_analyze_scale_factor = '0.005'
log_autovacuum_min_duration = '0'
maintenance_work_mem = '2GB'
autovacuum_vacuum_cost_limit = '5000'
autovacuum_vacuum_cost_delay = '5ms'


other questions:
also, what is the cost of fetch_size?
we have in our settings => use_remote_estimate=true,fetch_size=1

I mean given we have a query

select * from foobar limit 1;  (via FDW)
the LIMIT 1 does not get pushed down,
so it seems all the rows come to the FDW node and then the limit is applied?


I currently do not have the queries, but I have a screenshot of a long-running
EXPLAIN via FDW.
Also, since the whole query does not show up in pg_stat_statements, I am not
sure it would be of great help, because the predicates, although applied, do
not show up in pg_stat_activity.


I know, there can be more info i can provide, but  if anyone has
experienced this, pls let me know.

BTW, i know citus is an option, but can we keep that option aside.

we see better ways to handle this in future, by sharding on ids and further
partitioning of tables and parallel execution of FDW queries, but we need
to know if this is a known issue of pg10 or i am doing something wrong
which will bite in pg11 too.


Appreciate your help, always.


Regards,
Vijay


Re: [External] logical replication

2019-02-14 Thread Vijaykumar Jain
Yes, we already do that; it works provided you take care of the restrictions of
logical replication as mentioned in the docs.
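
For the table split you describe, a minimal sketch would look something like
this (assuming the table definitions already exist on both servers; all names
are made up):

-- on server1: publish the tables that flow 1 -> 2
CREATE PUBLICATION pub_srv1 FOR TABLE table2, table3;

-- on server2: publish the tables that flow 2 -> 1
CREATE PUBLICATION pub_srv2 FOR TABLE table1, table5;

-- on server2:
CREATE SUBSCRIPTION sub_from_srv1
    CONNECTION 'host=server1 dbname=app user=replicator password=secret'
    PUBLICATION pub_srv1;

-- on server1:
CREATE SUBSCRIPTION sub_from_srv2
    CONNECTION 'host=server2 dbname=app user=replicator password=secret'
    PUBLICATION pub_srv2;

-- the important restriction: only ever write to a table on its publishing side,
-- otherwise the two directions will fight each other.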


On Thu, 14 Feb 2019 at 3:25 PM suganthi Sekar  wrote:

>
> Hi Team ,
>
>
>  Is it possible to do the logical replication  in 2 way (Synchronization)
>
>
> Example :
>
>
> server1  :  5 tables  (1,2,3,4,5)
>
> Server 2 :  5 tables  (1,2,3,4,5)
>
>
> 2,3 table data to be replicate from server 1 to  server2
>
> 1,5 table data to be replicate  from server 2 to server 1
>
>
> Regards,
>
> Suganthi Sekar
>
-- 

Regards,
Vijay


Re: [External] logical replication

2019-02-14 Thread Vijaykumar Jain
Sure, here it is.
https://www.postgresql.org/docs/10/logical-replication-restrictions.html

Regards,
Vijay


On Thu, Feb 14, 2019 at 5:36 PM suganthi Sekar 
wrote:

> Hi Vijay,
>
>
>  Can you please share the Link (or document)
>
>
> Regards
>
> Suganthi Sekar
> ------
> *From:* Vijaykumar Jain 
> *Sent:* 14 February 2019 17:15:28
> *To:* suganthi Sekar
> *Cc:* pgsql-general@lists.postgresql.org
> *Subject:* Re: [External] logical replication
>
> Yes we already do that, provided you take care restrictions of logical
> replication as mentioned in the doc.
>
>
> On Thu, 14 Feb 2019 at 3:25 PM suganthi Sekar 
> wrote:
>
>
> Hi Team ,
>
>
>  Is it possible to do the logical replication  in 2 way (Synchronization)
>
>
> Example :
>
>
> server1  :  5 tables  (1,2,3,4,5)
>
> Server 2 :  5 tables  (1,2,3,4,5)
>
>
> 2,3 table data to be replicate from server 1 to  server2
>
> 1,5 table data to be replicate  from server 2 to server 1
>
>
> Regards,
>
> Suganthi Sekar
>
> --
>
> Regards,
> Vijay
>


Re: [External] Re: FDW, too long to run explain

2019-02-17 Thread Vijaykumar Jain
I am yet to figure out the reason. What we have done is add fake sample
columns, populate them with random numbers, and filter on them so as to fake a
LIMIT push-down.

Most of the queries that were impacted were the ones that did not push
order by and limit to foreign servers.
I am also trying to upgrade to pg11 to make use of parallelisation.
For now I am making use of materialised view on each shard and using
predicates that get pushed directly to ensure a simple plan is created.
There is a compromise but this is what is reasonable for now.
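
Roughly, the trick looks like this (a sketch with made-up names; the cut-off
value is arbitrary):

-- on each shard: add and backfill a sample column (the backfill would be batched in practice)
ALTER TABLE big_table ADD COLUMN sample_bucket double precision;
UPDATE big_table SET sample_bucket = random();
ALTER TABLE big_table ALTER COLUMN sample_bucket SET DEFAULT random();
CREATE INDEX ON big_table (sample_bucket);

-- on the coordinator: the extra predicate is a plain comparison, so it gets
-- pushed down to the shard, and only a small slice of rows ever crosses the
-- wire even though the LIMIT itself is not pushed
SELECT *
FROM sh01.big_table
WHERE sample_bucket < 0.001
LIMIT 100;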

On Sun, 17 Feb 2019 at 4:27 PM auxsvr  wrote:

> Related to this question:
>
> Postgresql cursors are in most cases I've tried extremely slow. The cause
> is as described in my previous answer, in my experience. Is there any plan
> or way to improve this situation? For example, for FDW one would expect the
> plan on the remote side to be similar, if not identical, to the one
> locally, with the exception of the setup cost.
> --
> Regards,
> Peter
>
>
>
> --

Regards,
Vijay


Re: [External] Re: FDW, too long to run explain

2019-02-17 Thread Vijaykumar Jain
Taking your questions as 1, 2 and 3, please find my answers below.

1) The "EXPLAIN"s on the foreign servers show up as "idle in transaction". Because
they were running very long (in the order of minutes), the pgbouncer (transaction-level
pooling) settings kill them (idle-in-transaction time limit of 5 mins exceeded),
or else too many connections pile up.
2) Yes, I get those from pg_stat_activity; it truncates the full statement, but
it shows up as "EXPLAIN select col1, col2 ...  00:00:44 | idle in
transaction" (this is just one of the screenshots I have, on the foreign
side).
3) Yes, I think we kind of understood that part (fetch size and memory), but I am
not sure whether it is used as any kind of hint in plan generation too. I am sorry,
I did not turn auto_explain on on the foreign servers, as that required a restart
of the server.

(This is the real content of the screenshot, yes 13 minutes, with the column
and view names masked:)
20678 | 00:13:38.990025 | EXPLAIN SELECT cols from view | idle in
transaction

 the explain analyze of the same query on the foreign server is in ms.


I am sorry I am vague about the queries in this email. I cannot reproduce
it, as we do not have multiple shards of 500G in my QA environment, and I
cannot take a dump of prod to test in our test env because of GDPR :)
But as I said in the mail, we were speculating that since the LIMIT was not
passed down, the plans may have been bad. We tricked the foreign server by
using a sample column to fake LIMIT push-down, and now have improved response
times. We made vacuum/analyze very aggressive to ensure stats are never
stale after large updates or deletes.

Unless someone else can reproduce it, I guess I'll close this mail. (I'll try
to reproduce it myself again, but for now I have too little data to share to
convince anyone of what happened.)




Regards,
Vijay


On Sun, Feb 17, 2019 at 11:11 PM Jeff Janes  wrote:

> On Mon, Feb 4, 2019 at 2:15 AM Vijaykumar Jain 
> wrote:
>
>>
>> now we have some long running queries via FDW that take minutes and get
>> killed explain runs as idle in transaction on remote servers.
>>
>
> Are you saying the EXPLAIN itself gets killed, or execution of the plan
> generated based on the EXPLAIN (issued under use_remote_estimate = true)
> gets killed?  Who is doing the killing, the local side or the foreign
> side?  Can you include verbatim log entries for this?
>

explain on foreign servers run as "idle in transactions". coz they were
running very long (in the order of some minutes) , pgbouncer setting kill
them (as idle in tx time limit exceeded of 5 mins) or else results in too
many connections piling up.


>
>> now the explain for
>> select * from sh01.view1  keeps running for minutes sometimes,
>>
>> then fetch too keeps running for minutes, although the total rows are <
>> 1 maybe.
>> idle in transaction | FETCH 1 FROM c1
>>
>
> What is this?  Is it from some monitoring tool, or pg_stat_activity, or
> what?  And is it on the local side or the foreign side?
>
yes, pg_stat_activity, it truncates the full statement, but  it shows up as
* EXPLAIN select col1, col2  *  00:00:44 | idle in transaction (this is
just one of the screenshots i have). (on the foreign side)


>
>
>> other questions:
>> also, what is the cost of fetch_size?
>>
>
> It will always fetch rows from the foreign server in this sized chunks.  A
> larger fetch_size will have less network latency and computational overhead
> if many rows are going to be consumed, but also consume more memory on the
> local server as all rows are stored in memory per each chunk.  Also, in the
> case of a LIMIT, it reads a large number of rows even if most of them may
> be unneeded.  Conceptually, the LIMIT could be used to modify the FETCH
> downward to match the LIMIT, but that is not implemented.  In the case of a
> view over UNION ALL, I don't think the individual subqueries even know what
> the global LIMIT is.
>
Yep, I guess that is where I think the plan may have gone wrong.

>
>
>>
>> I mean given we have a query
>>
>> select * from foobar limit 1; via FDW
>> limit 1 does not get pushed.
>> so it seems all rows some to FDW node and then limit is applied?
>>
>
> It should not read all rows.  It should read as many multiples of
> fetch_size as needed, which should just be 1 multiple in this case.
>
Yep, i think we kind of understood that part, but i am not sure if that is
used to generate the plan too. i am sorry, i did not put auto explain on,
on foreign servers, as that required a restart of the server.

>
> Cheers,
>
> Jeff
>
>>


Re: [External] Re: FDW, too long to run explain

2019-02-17 Thread Vijaykumar Jain
Hey Jeff,

yes, we now relaxed the idle in transaction setting to 15 mins.

I was hesitant to increase the setting, as it blocks autovacuum. We also use
hot_standby_feedback = true, since we split reads/writes and allow long-running
queries on the read replicas, and this too affects autovacuum.
So overall, all the options I set to ensure autovacuum gets triggered are
impacted by the increased idle-in-tx timeout and hot_standby_feedback = true,
both of which seem to be necessary for the setup right now.

We have been trying to make sharding work on our own using a multi-coordinator
FDW setup (and have been successful, although with hiccups), using
directory-based sharding in pg10. (If we cannot handle the growth, it all goes
to mongo for its automatic sharding and failover.)

I have to admit we can do better here, though. We need to rebalance the data in
the shards when we come close to 90% disk; those are long delete/upsert
queries. We have very aggressive autovacuum settings to ensure we do not have a
lot of stale stats.
I have plans to rearchitect the whole setup with pg11 where we plan to
introduce time based sharding and then table partitioning in each shard
further by time and also use Materialized views, for day old data with pre
aggregated fields on each shard so that explain does not have to work too
hard :)

and then create foreign tables and attach them as partitions. similar to
https://github.com/MasahikoSawada/pgconf-asia-demo/tree/c47e25bf589c7d401c9d342329b400ec26eb61db
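
Something along these lines is what I have in mind for pg11 (a sketch, all
names invented):

-- on the coordinator (pg11): a time-partitioned parent table ...
CREATE TABLE events (
    event_time timestamptz NOT NULL,
    payload    jsonb
) PARTITION BY RANGE (event_time);

-- ... whose partitions are foreign tables living on the shards
CREATE FOREIGN TABLE events_2019_01
    PARTITION OF events FOR VALUES FROM ('2019-01-01') TO ('2019-02-01')
    SERVER shard01
    OPTIONS (schema_name 'public', table_name 'events_2019_01');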

i guess, i am diverting the query, but just saying :)
Thanks for suggestions and help Jeff. Appreciate it.

Regards,
Vijay


On Mon, Feb 18, 2019 at 12:39 AM Tom Lane  wrote:

> Jeff Janes  writes:
> > A question for the PostgreSQL hackers would be, Is it necessary and
> > desirable that the EXPLAIN be issued in the same transaction as the
> > eventual DECLARE and FETCHes?  I don't think it is.
>
> It seems like a good idea to me.  I certainly don't think "I've got
> an idle-in-transaction timeout on the remote that's shorter than my
> local transaction runtime" is a plausible argument for changing that.
> You could trip over that with a slow query regardless of whether we
> separated the EXPLAIN step, just because there's no guarantee how
> often we'll ask the FDW to fetch some rows.
>
> > I guess if the foreign
> > side table definition got changed between EXPLAIN and DECLARE it would
> > cause problems, but changing the foreign side definition out of sync with
> > the local side can cause problems anyway, so is that important to
> preserve?
>
> I believe that the EXPLAIN will leave the remote transaction holding
> AccessShareLock on the query's tables, meaning that doing it in one
> transaction provides some positive protection against such problems,
> which we'd lose if we changed this.
>
> regards, tom lane
>


Re: [External] Re: FDW, too long to run explain

2019-02-17 Thread Vijaykumar Jain
Regards,
Vijay


On Mon, Feb 18, 2019 at 12:56 AM Jeff Janes  wrote:

> On Sun, Feb 17, 2019 at 1:52 PM Vijaykumar Jain 
> wrote:
>
>> Assuming your questions as 1,2,3, please find my answers below.
>>
>> 1)"explain" on foreign servers run as "idle in transactions". coz they
>> were running very long (in the order of some minutes) , pgbouncer (in tx
>> level pooling) setting kill them (as idle in tx time limit exceeded of 5
>> mins) or else results in too many connections piling up.
>> 2)yes, i get those from pg_stat_activity, it truncates the full
>> statement, but  it shows up as * EXPLAIN select col1, col2  *  00:00:44
>> | idle in transaction (this is just one of the screenshots i have). (on the
>> foreign side)
>>
>
> You are misinterpreting that data.  The EXPLAIN is not currently running.
> It is the last statement that was running prior to the connection going
> idle-in-transaction.  See my just previous email--I think the reason it is
> idle is that the local is servicing some other part of the query (probably
> on a different FDW), and that is taking a long time.
>
OK, I raked this from the logs where I had enabled log_min_duration_statement =
10s:

2019-01-31 12:48:18 UTC LOG:  duration: 29863.311 ms  statement: EXPLAIN
SELECT blah, FROM public.view WHERE ((scheduled_bdt >= '2019-01-20'::date))
AND ((scheduled_bdt <= '2019-01-26'::date)) AND ((somekey = ANY
('{269029,123399,263164,261487}'::bigint[])))   (both the columns are
indexed)

>
> Are all the connections piling up from postgres_fdw, or are many of them
> from other applications?  I think your timeout is just shifting symptoms
> around without fixing the underlying problem, while also making that
> underlying problem hard to diagnose.
>
same application, but when more than one person is using the analytical
tool that runs the underlying query.

>
>
>
>> 3)yes, i think we kind of understood that part (fetch and memory), but i
>> am not sure if that is used as any hint in plan generation too.
>>
>
> The query is planned as part of a cursor.  As such, it will
> use cursor_tuple_fraction as the "hint".  Perhaps you could tweak this
> parameter on the foreign side.  I think that a low setting for this
> parameter should give similar plans as a small LIMIT would give you, while
> large settings would give the same plans as a large (or no) LIMIT would.
>
> I think postgres_fdw should pass does the LIMIT when it can do so, but it
> doesn't currently.
>
As i already said, we have overcome the limit issue with a fake sample
column in the huge tables. that way we limit the number of rows on the
foreign server itself before the fetch. this is not the best and has its
edge cases, but yeah, it works for now.
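
(That said, if anyone wants to try the cursor_tuple_fraction route Jeff
mentioned, I think something like this on the foreign/shard side would be the
place to experiment; the role name is just a placeholder:)

-- run on the shard, for the role the FDW user mapping connects as
ALTER ROLE app_user SET cursor_tuple_fraction = 1.0;
-- 1.0 optimizes cursors for fetching all rows; small values favour
-- fast-start plans, similar to what a small LIMIT would encourage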


>
> Cheers,
>
> Jeff
>
>>


Re: [External] Re: FDW, too long to run explain

2019-02-17 Thread Vijaykumar Jain
Ok.
I’ll try to work on it this week and see if i am able to reproduce anything.

On Mon, 18 Feb 2019 at 2:30 AM Jeff Janes  wrote:

>
>
> On Sun, Feb 17, 2019 at 2:37 PM Vijaykumar Jain 
> wrote:
>
>>
>> Ok, i raked this from the logs where enabled log_min_duration_statement =
>> 10s
>>
>> 2019-01-31 12:48:18 UTC LOG:  duration: 29863.311 ms  statement: EXPLAIN
>> SELECT blah, FROM public.view WHERE ((scheduled_bdt >= '2019-01-20'::date))
>> AND ((scheduled_bdt <= '2019-01-26'::date)) AND ((somekey = ANY
>> ('{269029,123399,263164,261487}'::bigint[])))   (both the columns are
>> indexed)
>>
>
> That is interesting.  Was that in the logs for the local or the foreign
> side?  And is it common, or rare?
>
> If on the local side, could it be that the EXPLAINs sent to the foreign
> side are being made to wait by the connection pooler, leading to long
> delays?  If that is from the foreign side, then it should be conceptually
> unrelated to FDW.  Any chance you could reproduce the slowness in your test
> environment?  Slowness in the planner is probably related to the schema
> structure, not the data itself.
>
> I don't think this would be related to the idle-in-transaction, except
> that one FDW connection maybe idle-in-transaction after its EXPLAIN is done
> because it is waiting for another FDW connection to slowly run its EXPLAIN.
>
> Cheers,
>
> Jeff
>
>> --

Regards,
Vijay


Re: [External] Re: FDW, too long to run explain

2019-02-18 Thread Vijaykumar Jain
Oh wow, I guess you are right.
I just ran an example where the local runs make use of a parallel plan, but the
FDW path does not.

i have three servers
2 x pg10
1 x pg11

i run queries on coordinator node ( pg11 ) which makes calls to foreign
server to do a simple count.
the individual nodes run the query in parallel, the setup  is repeatable.
but via FDW it runs a simple seq scan.
i guess this is for the same reason as you mentioned wrt declared cursors.


on pg11
create schema pg10;
create schema pg10_qa;
import foreign schema pg10  from server pg10 into pg10;
import foreign schema pg10_qa from server pg10_qa into pg10_qa;

explain (analyze,verbose) SELECT COUNT(1) FROM pg10.tbl_ItemTransactions;
this query is via FDW
 QUERY PLAN

 Foreign Scan  (cost=108.53..152.69 rows=1 width=8) (actual
time=6584.498..6584.500 rows=1 loops=1)
   Output: (count(1))
   Relations: Aggregate on (pg10.tbl_itemtransactions)
   Remote SQL: SELECT count(1) FROM pg10.tbl_itemtransactions
 Planning Time: 0.112 ms
 Execution Time: 6585.435 ms
(6 rows)

2019-02-18 09:56:48 UTC LOG:  duration: 6593.046 ms  plan:
Query Text: DECLARE c1 CURSOR FOR
SELECT count(1) FROM pg10.tbl_itemtransactions
Aggregate  (cost=768694.80..768694.81 rows=1 width=8) (actual
time=6593.039..6593.039 rows=1 loops=1)
  Output: count(1)
  Buffers: shared hit=259476
  ->  Seq Scan on pg10.tbl_itemtransactions  (cost=0.00..666851.04
rows=40737504 width=0) (actual time=0.024..3389.245 rows=40737601 loops=1)
Output: tranid, transactiondate, transactionname
Buffers: shared hit=259476



on pg10 (1) -- foreign server pg10
create schema pg10;
CREATE TABLE pg10.tbl_ItemTransactions
 (
 TranID SERIAL
 ,TransactionDate TIMESTAMPTZ
 ,TransactionName TEXT
 );
INSERT INTO pg10.tbl_ItemTransactions
(TransactionDate, TransactionName)
SELECT x, 'dbrnd'
FROM generate_series('2014-01-01 00:00:00'::timestamptz, '2016-08-01
00:00:00'::timestamptz,'2 seconds'::interval) a(x);

explain analyze SELECT count(1) FROM pg10.tbl_itemtransactions;  --this
query is local

QUERY PLAN
---
 Finalize Aggregate  (cost=472650.72..472650.73 rows=1 width=8) (actual
time=2576.053..2576.054 rows=1 loops=1)
   ->  Gather  (cost=472650.50..472650.71 rows=2 width=8) (actual
time=2575.721..2626.980 rows=3 loops=1)
 Workers Planned: 2
 Workers Launched: 2
 ->  Partial Aggregate  (cost=471650.50..471650.51 rows=1 width=8)
(actual time=2569.302..2569.302 rows=1 loops=3)
   ->  Parallel Seq Scan on tbl_itemtransactions
(cost=0.00..429215.60 rows=16973960 width=0) (actual time=0.048..1492.144
rows=13579200 loops=3)
 Planning time: 0.405 ms
 Execution time: 2627.455 ms
(8 rows)



on pg10 (2) -- foreign server pg10_qa
create schema pg10_qa;
CREATE TABLE pg10_qa.tbl_ItemTransactions
 (
 TranID SERIAL
 ,TransactionDate TIMESTAMPTZ
 ,TransactionName TEXT
 );
INSERT INTO pg10_qa.tbl_ItemTransactions
(TransactionDate, TransactionName)
SELECT x, 'dbrnd'
FROM generate_series('2014-01-01 00:00:00'::timestamptz, '2016-08-01
00:00:00'::timestamptz,'2 seconds'::interval) a(x);

explain analyze SELECT count(1) FROM pg10_qa.tbl_itemtransactions;  -- this
query is local

QUERY PLAN
---
 Finalize Aggregate  (cost=472650.72..472650.73 rows=1 width=8) (actual
time=2568.469..2568.469 rows=1 loops=1)
   ->  Gather  (cost=472650.50..472650.71 rows=2 width=8) (actual
time=2568.067..2613.006 rows=3 loops=1)
 Workers Planned: 2
 Workers Launched: 2
 ->  Partial Aggregate  (cost=471650.50..471650.51 rows=1 width=8)
(actual time=2563.893..2563.893 rows=1 loops=3)
   ->  Parallel Seq Scan on tbl_itemtransactions
(cost=0.00..429215.60 rows=16973960 width=0) (actual time=0.017..1388.417
rows=13579200 loops=3)
 Planning time: 0.048 ms
 Execution time: 2613.246 ms
(8 rows)


But I guess partition elimination still works across the shards (see attached);
at least we'll benefit from that in pg11 :)

Regards,
Vijay


On Mon, Feb 18, 2019 at 3:07 AM Jeff Janes  wrote:

> On Sun, Feb 17, 2019 at 6:32 AM Vijaykumar Jain 
> wrote:
>
>> I am yet to figure out the reason, what we have done is implement fake
>> columns to represent samples and giving them random numbers and keeping
>> other bulls to fake limit.
>>
>> Most of the queries that were impacted were the ones that did not push
>> order by and limit to foreign servers.
>> I 

Re: [External] LIMIT not showing all results

2019-03-05 Thread Vijaykumar Jain
Can you run both the queries with
“explain analyze select ” and paste the output.


On Tue, 5 Mar 2019 at 9:41 PM Casey Deccio  wrote:

> Okay, the subject is a little misleading because of course LIMIT isn't
> supposed to all results, but I've got an issue where LIMIT isn't showing
> the number of results I would expect.  For example:
>
> mydb=> select id,name,date from analysis where name = 'foo' order by date
> desc limit 3;
>id |name |  date
> ---+-+
> 195898786 | foo | 2019-03-05 06:45:29+00
> (1 row)
>
> mydb=> select id,name,date from analysis where name = 'foo' order by date
> desc limit 20;
>id |name |  date
> ---+-+
> 195898786 | foo | 2019-03-05 06:45:29+00
>
>
> But when I query without limit (or even with "limit all"), there are more:
>
> mydb=> select id,name,date from analysis where name = 'foo' order by date
> desc;
>id |name |  date
> ---+-+
> 195898786 | foo | 2019-03-05 06:45:29+00
> 195842671 | foo | 2019-01-24 14:31:45+00
> 195667475 | foo | 2018-12-30 23:40:11+00
> 195256709 | foo | 2018-10-29 18:33:07+00
> ...
> (Many more rows)
>
> psql version 9.6.11.
> Server version 9.4.20.
>
> We just migrated the database from one server to another (as far as I
> know, there was no version change), and it worked previously.
>
> Any ideas?
>
> Many thanks in advance.
> Casey
>
-- 

Regards,
Vijay


Re: [External] LIMIT not showing all results

2019-03-05 Thread Vijaykumar Jain
Thanks Tom.


I mean, if the instance is a test instance,
analysis_name_date_key could probably be dropped and the query run again,
to check whether it then returns the correct rows;
or an index could be created in parallel on the same columns as
analysis_name_date_key, to check whether the optimizer chooses the right
index, before concluding that the index is bad.

Also, is there an option to force a particular index to be used?

I read somewhere that the query below may help with detecting bad (invalid)
indexes; is this correct?

SELECT n.nspname, c.relname
FROM   pg_catalog.pg_class c, pg_catalog.pg_namespace n,
   pg_catalog.pg_index i
WHERE  (i.indisvalid = false OR i.indisready = false) AND
   i.indexrelid = c.oid AND c.relnamespace = n.oid AND
   n.nspname != 'pg_catalog' AND
   n.nspname != 'information_schema' AND
   n.nspname != 'pg_toast'


Regards,
Vijay

On Tue, Mar 5, 2019 at 10:16 PM Tom Lane  wrote:
>
> Casey Deccio  writes:
> >> On Mar 5, 2019, at 9:15 AM, Vijaykumar Jain  wrote:
> >> Can you run both the queries with
> >> “explain analyze select ” and paste the output.
>
> > dnsviz=> explain analyze select id,name,date from analysis where name = 
> > 'foo' order by date desc limit 20;
> > 
> >  QUERY PLAN
>
> > ---
> > --
> >  Limit  (cost=0.57..81.74 rows=20 width=31) (actual time=0.540..0.541 
> > rows=1 loops=1)
> >->  Index Scan Backward using analysis_name_date_key on analysis  
> > (cost=0.57..7760.25 rows=1912 width=31) (actual
> > time=0.539..0.540 rows=1 loops=1)
> >  Index Cond: ((name)::text = 'foo'::text)
> >  Planning time: 6.728 ms
> >  Execution time: 0.587 ms
> > (5 rows)
>
> Hm, so possibly corruption in that index?  REINDEX might help.
>
> regards, tom lane
>



xmin and very high number of concurrent transactions

2019-03-12 Thread Vijaykumar Jain
I was asked this question in one of my demos, and it was an interesting one.

Postgres stamps xmin on newly inserted rows with the current txid.
Now, in a very highly concurrent scenario where there are more than 2000
concurrent users trying to insert new data,
will assigning that xmin value be a bottleneck?

I know we should use pooling solutions to reduce concurrent connections, but
assume we have enough resources to take care of spawning a new process for
each new connection.
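
To make the question concrete, this is the behaviour I am asking about (a
scratch example):

CREATE TABLE demo (v int);

BEGIN;
SELECT txid_current();      -- the xid assigned to this transaction
INSERT INTO demo VALUES (1);
SELECT xmin, v FROM demo;   -- xmin of the new row matches that (32-bit) xid
COMMIT;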

Regards,
Vijay



Re: [External] Re: xmin and very high number of concurrent transactions

2019-03-12 Thread Vijaykumar Jain
No, I mean it is not us end users doing it; postgres does it (?) via the xmin
and xmax system fields on the tables :) if that is what you wanted with the
"why". Or are you asking whether postgres even updates those fields, and I am
wrong in assuming it works that way?

Since the values need to be assigned atomically, consider the analogy below:
assume I (postgres) am a person giving out tokens to
people (connections/transactions) standing in a queue.
If there is a single line (sequential), then it is easy for me to
simply hand each of them a token, incrementing the value as I go.
But if there are thousands of users in parallel lines, I am still only one
person delivering tokens, so I operate sequentially, and everyone else is
"blocked" for some time before they get the token with the
required value.
So if there are 1000s of users, won't that "delay" impact my
performance, since I need to maintain the token counter to be able
to know what token value I need to give to the next person?

I do not know if I am explaining it correctly; pardon my analogy.


Regards,
Vijay

On Wed, Mar 13, 2019 at 1:10 AM Adrian Klaver  wrote:
>
> On 3/12/19 12:19 PM, Vijaykumar Jain wrote:
> > I was asked this question in one of my demos, and it was interesting one.
> >
> > we update xmin for new inserts with the current txid.
>
> Why?
>
> > now in a very high concurrent scenario where there are more than 2000
> > concurrent users trying to insert new data,
> > will updating xmin value be a bottleneck?
> >
> > i know we should use pooling solutions to reduce concurrent
> > connections but given we have enough resources to take care of
> > spawning a new process for a new connection,
> >
> > Regards,
> > Vijay
> >
> >
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com



Re: [External] Re: xmin and very high number of concurrent transactions

2019-03-13 Thread Vijaykumar Jain
Thank you everyone for responding.
Appreciate your help.

Looks like I need to understand the concepts in a little more detail to
be able to ask the right questions, but at least now I can look at the
relevant docs.


On Wed, 13 Mar 2019 at 2:44 PM Julien Rouhaud  wrote:

> On Wed, Mar 13, 2019 at 9:50 AM Laurenz Albe 
> wrote:
> >
> > Vijaykumar Jain wrote:
> > > I was asked this question in one of my demos, and it was interesting
> one.
> > >
> > > we update xmin for new inserts with the current txid.
> > > now in a very high concurrent scenario where there are more than 2000
> > > concurrent users trying to insert new data,
> > > will updating xmin value be a bottleneck?
> > >
> > > i know we should use pooling solutions to reduce concurrent
> > > connections but given we have enough resources to take care of
> > > spawning a new process for a new connection,
> >
> > You can read the function GetNewTransactionId in
> > src/backend/access/transam/varsup.c for details.
> >
> > Transaction ID creation is serialized with a "light-weight lock",
> > so it could potentially be a bottleneck.
>
> Also I think that GetSnapshotData() would be the major bottleneck way
> before GetNewTransactionId() becomes problematic.  Especially with
> such a high number of active backends.
>
-- 

Regards,
Vijay


Re: [External] Re: PostgreSQL temp table blues

2019-03-13 Thread Vijaykumar Jain
Maybe I am wrong here, but is this not the classic case of connections left
open too long, idle in transaction, plus xid wraparound?
How is the connection pool (and which one?)
adding to the woes?
I mean, the same can be a problem with direct connections too, right?

We use pgbouncer, mostly with transaction-level pooling, which releases the
server connection back to the pool after a commit or a rollback.
We have idle-in-transaction timeouts on both the client side and the server
side of pgbouncer.
And we have monitoring of bloat, idle-in-transaction sessions and pgbouncer
connections.

We have dbs of various sizes, all less than 1TB,
so I do not know if I am comparing with the same set of resources, but we do
use temp tables with connection pooling, with the right monitoring and
reasonable constraints, and we have yet to bite that bullet.
So I guess we’ll add to the monitoring something like this too

https://aws.amazon.com/blogs/database/implement-an-early-warning-system-for-transaction-id-wraparound-in-amazon-rds-for-postgresql/
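
i.e. roughly these two checks wired into the alerting (thresholds to taste):

-- oldest transaction id age per database (alert long before ~2 billion)
SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY xid_age DESC;

-- sessions sitting idle in transaction (the ones that hold back vacuum)
SELECT pid, usename, now() - xact_start AS tx_age, query
FROM pg_stat_activity
WHERE state = 'idle in transaction'
ORDER BY tx_age DESC;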



But I guess you have had a long day; thanks anyway for sharing.


On Thu, 14 Mar 2019 at 11:45 AM Rene Romero Benavides <
rene.romer...@gmail.com> wrote:

> In conjunction with some parameter to renew idle connections and those
> that have been opened for too long will help you prevent this in the
> future, this also helps prevent server processes from becoming too big
> memory wise.
>
> On Wed, Mar 13, 2019 at 4:32 PM Rene Romero Benavides <
> rene.romer...@gmail.com> wrote:
>
>> Wow, thanks for sharing your experience. What kind of connection pooling
>> are we talking about? some connection pools implement a DISCARD ALL
>> statement after a session close, that may help if possible to configure.
>>
>> On Wed, Mar 13, 2019 at 4:21 AM Jahwan Kim  wrote:
>>
>>> Hi all,
>>>
>>>
>>> I'd like to share my (painful) experience, in which temp tables caused
>>> PostgreSQL shutdown.
>>> TL;DR. Do not use temp tables in PostgreSQL with connection pool.
>>>
>>> * My app uses connection pool AND temp tables, with default setting of
>>> ON COMMIT PRESERVE ROWS.
>>> * I found out later that autovacuum doesn't deal with temp tables.
>>> * The database ages as long as the connection is not closed.
>>> * So when the database age reaches XID STOP LIMIT, the database refuses
>>> to process any new transaction requests, saying "database is not accepting
>>> commands to avoid wraparound data loss... HINT: Stop the postmaster and use
>>> a standalone backend to vacuum that database. "
>>>
>>> After reading the docs, I expected this much. What happens after this
>>> surprised me.
>>> * Now the database needs to be shutdown. When shutting down, it tries to
>>> remove temp tables (of course), but since the database is not accepting any
>>> commands, ... The temp tables are then ORPHANED, although there was no
>>> database crash!
>>> * Because of these orphan temp tables, vacuuming the database in single
>>> mode won't work, as suggested by HINT. The orphaned temp tables must be
>>> manually dropped in single mode, and only then the database can be vacuumed
>>> back to normal state. Without dropping temp tables, vacuuming just takes
>>> (quite possibly a long) time and do (almost) nothing.
>>>
>>> Well, that's all. All of the above facts are documented, albeit tersely.
>>> If anybody I know ask me about temp tables in PostgreSQL, I'd just say
>>> "DON'T."
>>>
>>>
>>> Best Regards,
>>> Jahwan
>>>
>>>
>>>
>>>
>>>
>>
>> --
>> El genio es 1% inspiración y 99% transpiración.
>> Thomas Alva Edison
>> http://pglearn.blogspot.mx/
>>
>>
>
> --
> El genio es 1% inspiración y 99% transpiración.
> Thomas Alva Edison
> http://pglearn.blogspot.mx/
>
> --

Regards,
Vijay


postgresql-11 installation errors via deb package on ubuntu 16

2019-03-18 Thread Vijaykumar Jain
Hey Guys,

I do not know if this list is also for asking ubuntu package related queries.

We have recently been getting a lot of the errors shown below, as a result of
which the entire postgresql installation ends up broken:
the config folder /etc/postgresql/* is empty, initdb fails to
initialize the db, and we cannot move forward.

The problem is that this happens only on a certain set of servers, not all of
them, so I do not know whether this is an issue with the package or something
else. Maybe a bad mirror?

I came across a similar issue, but that was ages back:
https://stackoverflow.com/questions/2748607/how-to-thoroughly-purge-and-reinstall-postgresql-on-ubuntu
and it was marked closed for some reason.
I get the part that removing everything and reinstalling the package
resolves the issue:

   24  service postgresql stop
   25  rm -rf /etc/postgresql*
   26  rm -rf /var/lib/postgresql*
   27  rm -rf /var/run/postgresql*
   28  rm /var/cache/apt/archives/postgresql-*
   29  apt-get purge postgresql-11
   30  apt-get purge postgresql-client-common
   31  rm -rf /var/log/postgresql*
   32  puppet agent -t --debug --verbose  (this takes care of installation)

and then things are fine from here on.



error messages from /var/log/apt/term.log


Preparing to unpack .../postgresql-11_11.2-1.pgdg16.04+1_amd64.deb ...
Unpacking postgresql-11 (11.2-1.pgdg16.04+1) ...
Processing triggers for postgresql-common (199.pgdg16.04+1) ...
Building PostgreSQL dictionaries from installed myspell/hunspell packages...
Removing obsolete dictionary files:
Setting up postgresql-11 (11.2-1.pgdg16.04+1) ...
Unescaped left brace in regex is deprecated, passed through in regex;
marked by <-- HERE in m/(?http://apt.postgresql.org/pub/repos/apt/ xenial-pgdg main 11

This has been really painful to debug, because I do not know why we get these
errors randomly on some runs and not on others.



Regards,
Vijay



Re: [External] Re: postgresql-11 installation errors via deb package on ubuntu 16

2019-03-18 Thread Vijaykumar Jain
uname -a
Linux  4.4.0-116-generic #140-Ubuntu SMP Mon Feb 12 21:23:04
UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

lsb_release -a
Distributor ID: Ubuntu
Description:Ubuntu 16.04.4 LTS
Release:16.04
Codename:   xenial

No, these are fresh installations; existing installations do not have issues.
we have customization, but at the core, this is the part which is
installing the package.
https://github.com/puppetlabs/puppetlabs-postgresql/blob/master/manifests/server/install.pp
 (the puppet module version being 4.9.0 though)

Now, this never happens in my vagrant box (I am in India) and all my runs
are clean; it happens when we run installations on US machines, but because it
is not consistently reproducible, I do not know if this is a package issue or
something else.

I also tried checking for issues with the apt database etc., but none of the
other packages show installation errors, only this one.



From /var/lib/dpkg/status, both packages show as installed ok.


Package: postgresql-common
Status: install ok installed
Priority: optional
Section: database
Installed-Size: 627
Maintainer: Debian PostgreSQL Maintainers 
Architecture: all
Multi-Arch: foreign
Version: 199.pgdg16.04+1
Depends: adduser, debconf (>= 0.5.00) | debconf-2.0, lsb-base (>=
3.0-3), postgresql-client-common (= 199.pgdg16.04+1), procps, ssl-cert
(>= 1.0.11), ucf, init-system-helpers (>= 1.18~)
Recommends: e2fsprogs, logrotate
Suggests: libjson-perl
Breaks: postgresql-9.1 (<< 9.1.1-3~), systemd (<< 204)
Conflicts: postgresql-7.4, postgresql-8.0
Conffiles:
 /etc/apt/apt.conf.d/01autoremove-postgresql ce792928ef7a41f68842bb9380ddcc48
 /etc/init.d/postgresql 1b9da643d7a403248d70e1776fb98458
 /etc/logrotate.d/postgresql-common 101326ef5d138998692ece35109ef1a2
 /etc/sysctl.d/30-postgresql-shm.conf 9453b06c646f9033eada574154c31a9d
Description: PostgreSQL database-cluster manager
 The postgresql-common package provides a structure under which
 multiple versions of PostgreSQL may be installed and/or multiple
 clusters maintained at one time.
 .
 The commands provided are pg_conftool, pg_createcluster, pg_ctlcluster,
 pg_dropcluster, pg_lsclusters, pg_renamecluster, pg_upgradecluster,
 pg_virtualenv.
 .
 PostgreSQL is a fully featured object-relational database management
 system. It supports a large part of the SQL standard and is designed
 to be extensible by users in many aspects. Its features include ACID
 transactions, foreign keys, views, sequences, subqueries, triggers,
 outer joins, multiversion concurrency control, and user-defined types
 and functions.



Package: postgresql-11
Status: install ok installed
Priority: optional
Section: database
Installed-Size: 43901
Maintainer: Debian PostgreSQL Maintainers 
Architecture: amd64
Version: 11.2-1.pgdg16.04+1
Provides: postgresql-contrib-11
Depends: locales | locales-all, postgresql-client-11,
postgresql-common (>= 194~), ssl-cert, tzdata, debconf (>= 0.5) |
debconf-2.0, libc6 (>= 2.17), libgcc1 (>= 1:3.0), libgssapi-krb5-2 (>=
1.8+dfsg), libicu55 (>= 55.1-1~), libldap-2.4-2 (>= 2.4.7), libllvm6.0
(>= 1:6.0~svn298832-1~), libpam0g (>= 0.99.7.1), libpq5 (>= 9.3~),
libselinux1 (>= 2.1.12), libssl1.0.0 (>= 1.0.2~beta3), libstdc++6 (>=
5.2), libsystemd0, libuuid1 (>= 2.16), libxml2 (>= 2.7.4), libxslt1.1
(>= 1.1.25), zlib1g (>= 1:1.1.4)
Recommends: sysstat
Breaks: postgresql-11-citus (<< 8.0.0.PGDG-2~), postgresql-11-cron (<<
1.1.3-2~), postgresql-11-pgextwlist (<< 1.8-2~),
postgresql-11-pglogical (<< 2.2.1-4~), postgresql-11-plsh (<<
1.20171014-3~), postgresql-11-rum (<< 1.3.2-4~),
postgresql-11-wal2json (<< 1.0-5~)
Description: object-relational SQL database, version 11 server
 PostgreSQL is a powerful, open source object-relational database
 system. It is fully ACID compliant, has full support for foreign
 keys, joins, views, triggers, and stored procedures (in multiple
 languages). It includes most SQL:2008 data types, including INTEGER,
 NUMERIC, BOOLEAN, CHAR, VARCHAR, DATE, INTERVAL, and TIMESTAMP. It
 also supports storage of binary large objects, including pictures,
 sounds, or video. It has native programming interfaces for C/C++,
 Java, .Net, Perl, Python, Ruby, Tcl, ODBC, among others, and
 exceptional documentation.
 .
 This package provides the database server for PostgreSQL 11.
Homepage: http://www.postgresql.org/
Postgresql-Catversion: 201809051




let me know if you need more info.

Regards,
Vijay

On Mon, Mar 18, 2019 at 8:06 PM Adrian Klaver  wrote:
>
> On 3/18/19 7:24 AM, Vijaykumar Jain wrote:
> > Hey Guys,
> >
> > I do not know if this list is also for asking ubuntu package related 
> > queries.
> >
> > We have been recently getting a lot of below errors, as a result of
> > which the entire postgresql installation gets broken.
> > the config folder /etc/postgresql/* is empty, initdb fails to
> > initialize 

Re: [External] Re: postgresql-11 installation errors via deb package on ubuntu 16

2019-03-18 Thread Vijaykumar Jain
Thanks Tom,

I probably thought it was not the right forum, but thanks for being
polite and saying no.
I just wanted to take a chance if this was known to anyone, coz
googling did not show recent issues with the same.
But ok, I'll ask in ubuntu forums.


Regards,
Vijay

On Mon, Mar 18, 2019 at 8:15 PM Tom Lane  wrote:
>
> Vijaykumar Jain  writes:
> > I do not know if this list is also for asking ubuntu package related 
> > queries.
>
> Not really; you'd be better off filing a bug with ubuntu where their
> packager(s) will see it.
>
> The symptoms you describe definitely look like there is something
> wrong with a packager-supplied configuration adjustment script.
> More than that is hard to say without familiarity with the ubuntu
> postgres packages, which I lack.
>
> regards, tom lane



Re: [External] postgres 9.5 DB corruption: invalid byte sequence for encoding "UTF8"

2019-03-25 Thread Vijaykumar Jain
i think the experts will chime in soon,

but why do you think this is db corruption and not just bad input?
https://github.com/postgres/postgres/blob/master/src/pl/plperl/expected/plperl_lc_1.out

or it may also be encoding issue.
https://pganalyze.com/docs/log-insights/app-errors/U137

can you do a pg_dump and restore on a parallel instance? does it
result in failure?

we also ask the app to (temporarily) log the data being inserted, so that we
could figure out directly if there was bad data upstream, or add
validations to prevent inserts when there is bad data.

also, in our case the query was stuck at "PARSE"  (if you do ps aux |
grep postgres) and in some cases did result in oom.
but upgrading the client and using session mode pooling in pgbouncer
worked for us.




Regards,
Vijay

On Tue, Mar 26, 2019 at 12:17 AM Thomas Tignor  wrote:
>
> Hoping someone may be able to offer some guidance on this recurring problem. 
> I am providing this "problem report" to the general list as I understand the 
> bugs list requires a set of reproduction steps we do not yet have. Please 
> advise if I have the process wrong. I have tried to provide all known 
> relevant info here. Thanks in advance for any insights.
>
> --> A description of what you are trying to achieve and what results you 
> expect.:
>
> We are experiencing intermittent DB corruption in postgres 9.5.14. We are 
> trying to identify and eliminate all sources. We are using two independent 
> services for data replication, Slony-I v2.2.6 and a custom service developed 
> in-house. Both are based on COPY operations. DB corruption is observed when 
> COPY operations fail with an error of the form: 'invalid byte sequence for 
> encoding "UTF8"'. This occurs with a frequency ranging between a few days and 
> several weeks. Each incident is followed by a race to find and repair or 
> remove corrupted data, which we are getting good at. With well over a dozen 
> incidents, the great majority originally showed corruption in a single 
> VARCHAR(2000) column (value) in a single table (alert_attribute). In this 
> time, we read about suspected and real problems with TOAST functionality and 
> so made the decision to change alert_attribute.value to PLAIN storage. Since 
> that change was made, most new incidents show corruption in the 
> alert_attribute.name column instead (VARCHAR(200)). Another table 
> (alert_instance) has been impacted as well. See below for their schemas.
>
> We have looked high and low through system logs and device reporting utility 
> output for any sign of hardware failures. We haven't turned up anything yet. 
> We also tried rebuilding an entire DB from scratch. That did not seem to help.
>
> We have not been performing routine reindexing. This is a problem we are 
> working to correct. Normally our master DB serves for an 8-12 week period 
> without reindexing before we failover to a peer. Before assuming the master 
> role, the peer always begins by truncating the alert_instance and 
> alert_attribute tables and loading all data from the current master.
>
> Hardware specs are listed below. For storage, we have 8 INTEL SSDSA2BW12 
> direct-attached disks. We can provide additional info as needed.
>
> ams=# \d ams.alert_attribute
>Table "ams.alert_attribute"
>   Column   |  Type   | Modifiers
> ---+-+---
>  alert_instance_id | integer | not null
>  name  | character varying(200)  | not null
>  data_type | smallint| not null
>  value | character varying(2000) |
> Indexes:
> "pk_alert_attributes" PRIMARY KEY, btree (alert_instance_id, name), 
> tablespace "tbls5"
> "idx_aa_aval" btree (name, value)
> Foreign-key constraints:
> "fk_alert_attr_instance_id" FOREIGN KEY (alert_instance_id) REFERENCES 
> ams.alert_instance(alert_instance_id) ON DELETE CASCADE
> Triggers:
> _ams_cluster_logtrigger AFTER INSERT OR DELETE OR UPDATE ON 
> ams.alert_attribute FOR EACH ROW EXECUTE PROCEDURE 
> _ams_cluster.logtrigger('_ams_cluster', '2', 'kk')
> _ams_cluster_truncatetrigger BEFORE TRUNCATE ON ams.alert_attribute FOR 
> EACH STATEMENT EXECUTE PROCEDURE _ams_cluster.log_truncate('2')
> Disabled user triggers:
> _ams_cluster_denyaccess BEFORE INSERT OR DELETE OR UPDATE ON 
> ams.alert_attribute FOR EACH ROW EXECUTE PROCEDURE 
> _ams_cluster.denyaccess('_ams_cluster')
> _ams_cluster_truncatedeny BEFORE TRUNCATE ON ams.alert_attribute FOR EACH 
> STATEMENT EXECUTE PROCEDURE _ams_cluster.deny_truncate()
>
> ams=#
> ams=# \d ams.alert_instance
> Table "ams.alert_instance"
>Column|  Type  | Modifiers
> -++---
>  alert_instance_id   | integer| not null
>  alert_definition_id | integer| not null
>  alert_instance_key  | charac

Re: [External] Re: Import Database

2019-05-05 Thread Vijaykumar Jain
Yes.
I do bump up maintenance_work_mem temporarily during a restore;
it helps in rebuilding indexes a little faster.
Turning fsync off during the restore will help it go a little faster too,
but in case of a crash you may have to restart the restore from scratch.
Also, do you have the option to take a pg_dump and run pg_restore, or do you
just have the raw sql dump to work with?
if you have the option of taking a dump again, you can try pg_dump with -Fc
(custom format) or -Fd (directory format) and pg_restore with -j n (parallel
jobs), along with the temp bump in maintenance memory.
This should make the restore a little faster than a raw sql dump, I think.
If you are on pg10 or above, you can use logical replication to mirror the
database.
There are blogs by several people explaining how to do that, which may be
helpful.
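
a rough sketch of what i mean (paths and db names here are made up; note that
pg_dump itself can only parallelize with the directory format -Fd, while
pg_restore -j works with either -Fc or -Fd):

# dump in directory format so the dump itself can run with parallel jobs
pg_dump -Fd -j 4 -f /tmp/mydb.dump mydb

# restore in parallel, bumping maintenance_work_mem only for this session
PGOPTIONS='-c maintenance_work_mem=2GB' pg_restore -j 4 -d mydb_new /tmp/mydb.dump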


On Sun, 5 May 2019 at 10:29 PM Ravi Krishna  wrote:

> IMO you are using the slowest tool to import.
>
> Just one quick question: Why can't you take cluster backup using any of
> the tools available and then drop all
> unwanted databases after you import the cluster.
>
> pg_basebackup will do a good job.
>
> --

Regards,
Vijay


logical replication initiate via manual pg_dump

2019-05-10 Thread Vijaykumar Jain
Hey Guys,

tl;dr, but incase i missed something, i can follow up on this with more details.


I have a setup where i try to upgrade a cluster from pg10 to pg11 via
logical replication with minimum downtime.
its a database that is 500GB with 1 table having 350GB of data (+
bloat) and 100GB of indexes.

now when i triggered logical replication, it took more than 2 days
(and still around 10% remaining) to catch up on the data. the
publisher shows copy table to stdout still running.
i am not sure of the internals, but given a case that in a test prep
with no active connections and DMLs it took around 2 days, does seem
like i am missing something.

on both pg10 and pg11
ram 32GB
cpu 8
SSD
max_wal_size 100GB
checkpoint_timeout 30min
shared_buffers 8GB

on pg10
(copy still running on the huge table)
select (now() - query_start)::interval, query from pg_stat_activity;
2 days 19:03:12.799767 | COPY public. TO STDOUT

this is how disk looks like on pg10 (publisher)
du -sh /var/lib/postgresql/1*/main/{pg_wal*,base*}
78G /var/lib/postgresql/10/main/pg_wal
467G /var/lib/postgresql/10/main/base


on pg11
du -sh /var/lib/postgresql/1*/main/{pg_wal*,base*}
65G /var/lib/postgresql/11/main/pg_wal
417G /var/lib/postgresql/11/main/base

now although this kind of upgrade involves very little downtime,
it required around 2x the disk on the original server while the
replication was running, and there was no way to correctly estimate
the disk required for logical replication to finish and move over to
pg11.

-

now
given the above problem,
i tried to read thru with a goal of if i can do an initial sync via
pg_dump and start the restore from there.

https://www.slideshare.net/noriyoshishinoda/pgconfasia-2017-logical-replication-internals-english
https://medium.com/avitotech/recovery-use-cases-for-logical-replication-in-postgresql-10-a1e6bab03072

and setup a small test cluster as to test if i can pg_dump and
pg_restore and then start replication from the restart_lsn of the
primary/publisher.

***
demo lab
(all pg11 for now)
pg1 (primary/publisher on port 3000)
pg2  (hot_standby replica on port 3001)
pg3  (subscriber on port 3002)

***
on pg1 (create some tables and trigger to ensure trigger does not fire
on subscriber)
example=# CREATE TABLE public.company (
example(# id integer NOT NULL,
example(# name text NOT NULL,
example(# age integer NOT NULL,
example(# address character(50),
example(# salary real
example(# );
CREATE TABLE
example=# ALTER TABLE ONLY public.company
example-# ADD CONSTRAINT company_pkey PRIMARY KEY (id);
ALTER TABLE
example=# CREATE TABLE public.audit (
example(# emp_id integer NOT NULL,
example(# entry_date text NOT NULL
example(# );
CREATE TABLE
example=# ALTER TABLE ONLY public.audit
example-# ADD CONSTRAINT audit_pkey PRIMARY KEY (emp_id);
ALTER TABLE
example=# CREATE FUNCTION public.auditlogfunc() RETURNS trigger
example-# LANGUAGE plpgsql
example-# AS $$
example$#BEGIN
example$#   INSERT INTO AUDIT(EMP_ID, ENTRY_DATE) VALUES (new.ID,
current_timestamp);
example$#   RETURN NEW;
example$#END;
example$# $$;
CREATE FUNCTION
example=# CREATE TRIGGER example_trigger AFTER INSERT ON
public.company FOR EACH ROW EXECUTE PROCEDURE public.auditlogfunc();


***
setup pg2 as hot_standby replica
and dump the schema of pg1 example database on pg3.

***
on pg1 (insert some dummy data)
insert into company select x, x::text, x, 'address-' || x::text, x
from generate_series(1, 1500) x;

***
on pg1
create publication pg1 for all tables;

***
on pg2
verify replica is up and running and data replicated (fine)
example=# select count(1) from company;
 count
---
  1500
(1 row)


***
on pg3
*take a dump and restore the dump on pg3.

pg_dump -p 3000 -U postgres -Fc --serializable-deferrable
--no-subscriptions --no-publications -d example | pg_restore -p 3002
-U postgres -C -d example

*then create subscriptions to pg1

create subscription pg3 connection 'dbname=example port=3000
user=postgres' publication pg1 with (enabled = false, copy_data =
false);


***
on pg1
*get the last restart_lsn value from pg_replication_slots

select * from pg_replication_slots;
 slot_name | plugin | slot_type | datoid | database | temporary |
active | active_pid | xmin | catalog_xmin | restart_lsn |
confirmed_flush_lsn
---++---++--+---+++--+--+-+-
 pg2   || physical  ||  | f | t
  |  22724 |  |  | 13/A8133A68 |


***
and then on pg3

select * from pg_stat_subscription;
 subid | subname | pid | relid | received_lsn | last_msg_send_time |
last_msg_receipt_time | latest_end_lsn | latest_end_time
---+-+-+---+--++---++-
 17104 | pg3 | |   |  ||
   

Re: logical replication initiate via manual pg_dump

2019-05-13 Thread Vijaykumar Jain
update:
i dropped all the indexes on the dest db tables, except the primary key
and unique constraints on the base tables in logical replication, to
ensure replica identity.

and retriggered the logical replication from scratch.
it completed in 4 hours (which otherwise ran for 4 days).

so i guess it was the "too many indexes" that slowed the copy down way too much.
anyways, i got to explore pg_replication_origin_advance, which was a cool thing.
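
for the archives, a minimal sketch of that origin-advance step on the
subscriber (this assumes the subscription's origin is named 'pg_<subscription
oid>', which is how it shows up in pg_replication_origin, and reuses the
subid/restart_lsn values from my demo above):

-- on the subscriber: find the replication origin created for the subscription
SELECT roident, roname FROM pg_replication_origin;   -- e.g. roname = 'pg_17104'

-- mark everything up to the publisher's restart_lsn as already applied
-- (that data came over via pg_dump/pg_restore), so streaming starts from there
SELECT pg_replication_origin_advance('pg_17104', '13/A8133A68'::pg_lsn);

-- then enable the subscription that was created with copy_data = false
ALTER SUBSCRIPTION pg3 ENABLE;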


Regards,
Vijay

On Fri, May 10, 2019 at 8:59 PM Vijaykumar Jain  wrote:
>
> Hey Guys,
>
> tl;dr, but incase i missed something, i can follow up on this with more 
> details.
>
>
> I have a setup where i try to upgrade a cluster from pg10 to pg11 via
> logical replication with minimum downtime.
> its a database that is 500GB with 1 table having 350GB of data (+
> bloat) and 100GB of indexes.
>
> now when i triggered logical replication, it took more than 2 days
> (and still around 10% remaining) to catch up on the data. the
> publisher shows copy table to stdout still running.
> i am not sure of the internals, but given a case that in a test prep
> with no active connections and DMLs it took around 2 days, does seem
> like i am missing something.
>
> on both pg10 and pg11
> ram 32GB
> cpu 8
> SSD
> max_wal_size 100GB
> checkpoint_timeout 30min
> shared_buffers 8GB
>
> on pg10
> (copy still running on the huge table)
> select (now() - query_start)::interval, query from pg_stat_activity;
> 2 days 19:03:12.799767 | COPY public. TO STDOUT
>
> this is how disk looks like on pg10 (publisher)
> du -sh /var/lib/postgresql/1*/main/{pg_wal*,base*}
> 78G /var/lib/postgresql/10/main/pg_wal
> 467G /var/lib/postgresql/10/main/base
>
>
> on pg11
> du -sh /var/lib/postgresql/1*/main/{pg_wal*,base*}
> 65G /var/lib/postgresql/11/main/pg_wal
> 417G /var/lib/postgresql/11/main/base
>
> now although this is kind of upgrade involves very little downtime,
> but it required around 2x the disk on the original server while the
> replication was running, but there was not way to correctly estimate
> the disk required for logical replication to finish and move over to
> pg11.
>
> -
>
> now
> given the above problem,
> i tried to read thru with a goal of if i can do an initial sync via
> pg_dump and start the restore from there.
>
> https://www.slideshare.net/noriyoshishinoda/pgconfasia-2017-logical-replication-internals-english
> https://medium.com/avitotech/recovery-use-cases-for-logical-replication-in-postgresql-10-a1e6bab03072
>
> and setup a small test cluster as to test if i can pg_dump and
> pg_restore and then start replication from the restart_lsn of the
> primary/publisher.
>
> ***
> demo lab
> (all pg11 for now)
> pg1 (primary/publisher on port 3000)
> pg2  (hot_standby replica on port 3001)
> pg3  (subscriber on port 3002)
>
> ***
> on pg1 (create some tables and trigger to ensure trigger does not fire
> on subscriber)
> example=# CREATE TABLE public.company (
> example(# id integer NOT NULL,
> example(# name text NOT NULL,
> example(# age integer NOT NULL,
> example(# address character(50),
> example(# salary real
> example(# );
> CREATE TABLE
> example=# ALTER TABLE ONLY public.company
> example-# ADD CONSTRAINT company_pkey PRIMARY KEY (id);
> ALTER TABLE
> example=# CREATE TABLE public.audit (
> example(# emp_id integer NOT NULL,
> example(# entry_date text NOT NULL
> example(# );
> CREATE TABLE
> example=# ALTER TABLE ONLY public.audit
> example-# ADD CONSTRAINT audit_pkey PRIMARY KEY (emp_id);
> ALTER TABLE
> example=# CREATE FUNCTION public.auditlogfunc() RETURNS trigger
> example-# LANGUAGE plpgsql
> example-# AS $$
> example$#BEGIN
> example$#   INSERT INTO AUDIT(EMP_ID, ENTRY_DATE) VALUES (new.ID,
> current_timestamp);
> example$#   RETURN NEW;
> example$#END;
> example$# $$;
> CREATE FUNCTION
> example=# CREATE TRIGGER example_trigger AFTER INSERT ON
> public.company FOR EACH ROW EXECUTE PROCEDURE public.auditlogfunc();
>
>
> ***
> setup pg2 as hot_standby replica
> and dump the schema of pg1 example database on pg3.
>
> ***
> on pg1 (insert some dummy data)
> insert into company select x, x::text, x, 'address-' || x::text, x
> from generate_series(1, 1500) x;
>
> ***
> on pg1
> create publication pg1 for all tables;
>
> ***
> on pg2
> verify replica is up and running and data replicated (fine)
> example=# select count(1) from company;
>  count
> ---
>   1500
> (1 row)
>
>
> ***
> on pg3
> *take a dump and restore the dump on pg3.
>
> pg_dump -p 3000 -U postgres -Fc --serializable-

multiple nodes in FDW create server statement

2019-07-02 Thread Vijaykumar Jain
All,

We are glad that we have this feature that allows us to load balance reads.
that has helped us a lot.
https://paquier.xyz/postgresql-2/postgres-10-multi-host-connstr/

I would like to know if it is possible to request a similar enhancement to
FDWs too?
https://www.postgresql.org/docs/11/sql-createserver.html

unless i am missing something obvious, we wanted to use this option when
one of the read instances is down while there is an FDW query for reads.
the second instance is only a fallback, not round robin. and we have 2
foreign servers, one for writes and one for reads.

they are addressed by cnames, so it is just a matter of a cname switch,
but that takes some time to propagate and requires someone to be near
the machine.

i guess the concern would be that if the primary is down, writes could be
directed to another set of servers, which may end up in confusing errors.

pgtesting=> create table foo(id int);

ERROR:  cannot execute CREATE TABLE in a read-only transaction

pgtesting=> \q

Let me know if this is not clear.

Regards,
Vijay


Re: [External] Re: multiple nodes in FDW create server statement

2019-07-03 Thread Vijaykumar Jain
awesomeness.
Thanks Laurenz.
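
for anyone finding this in the archives, a minimal sketch of what that could
look like (host names here are made up; since postgres_fdw passes its options
through to libpq, a multi-host string and target_session_attrs should be
accepted):

CREATE SERVER read_cluster FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'replica1.example.com,replica2.example.com',
             port '5432,5432',
             dbname 'pgtesting',
             target_session_attrs 'any');   -- 'read-write' for the write server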

Regards,
Vijay


On Wed, Jul 3, 2019 at 12:48 PM Laurenz Albe 
wrote:

> Vijaykumar Jain wrote:
> > We are glad that we have this feature that allows us to load balance
> reads.
> > that has helped us a lot.
> > https://paquier.xyz/postgresql-2/postgres-10-multi-host-connstr/
> >
> > I would like to know if it is possible to request a similar enhancement
> to FDWs too?
> > https://www.postgresql.org/docs/11/sql-createserver.html
> >
> > unless i am missing something obvious, we wanted to use this option when
> one
> > of the read instances are down when there is a FDW query for reads.
> > the second instance is only a fallback, not round robin. and we have
> > 2 foreign servers, one for write and one for reads.
>
> The documentation of postgres_fdw says:
>
>   A foreign server using the postgres_fdw foreign data wrapper can have
> the same
>   options that libpq accepts in connection strings, as described in
> Section 34.1.2,
>   except that these options are not allowed:
>
> user and password (specify these in a user mapping, instead)
>
> client_encoding (this is automatically set from the local server
> encoding)
>
> fallback_application_name (always set to postgres_fdw)
>
> So there is nothing that keeps you from using multiple host names or ports,
> and you can also use "target_session_attrs".
>
> Yours,
> Laurenz Albe
> --
> Cybertec | https://www.cybertec-postgresql.com
>
>


Re: uncommitted xmin 3100586 from before xid cutoff 10339367 needs to be frozen

2024-03-22 Thread Vijaykumar Jain
On Fri, 22 Mar 2024 at 15:39, Laurenz Albe  wrote:

> On Fri, 2024-03-22 at 13:41 +0530, Daulat wrote:
> > We are unable to take the backup of our database. While taking backup we
> are getting the same error.
> >
> > psql: error: connection to server at "localhost" (::1), port 5014
> failed: FATAL:  pg_attribute catalog is missing 1 attribute(s) for relation
> OID 2662
>
> Then you got severe data corruption.  This is the index
> "pg_class_oid_index",
> and corrupted metadata make recovery difficult.
>
> If I were you, I would seek professional help.
> But first, stop working with this database immediately.
> Stop the server and take a backup of all the files in the data
> directory.


Is there a chance the OP has a replica running to which the bug has not
propagated, so the OP can fail over / take a backup off the replica?
i mean, i could simulate corruption of pg_catalog on my local vm using dd,
but ofc that is hardware-level corruption that did not propagate to the
replica, so i could fail over and back up from the replica just fine.
PS: if the bug propagates to the replica or corrupts the replica
too, then idk the solution. if you can log in and get the oids of the
objects (and have the field types ready externally), then you can run
pg_filedump and copy the data out.
First contact with the pg_filedump - Highgo Software Inc.
<https://www.highgo.ca/2021/07/14/first-contact-with-the-pg_filedump/>

i tried an example, but i had a lot of info for that:
corruption demo for blogs (github.com)
<https://gist.github.com/cabecada/8024d98024559e9fc97ccfcb5324c09f>
(if you don't understand this, then ignore)
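
a minimal sketch of that last-resort pg_filedump step (the path, relfilenode
and column types below are hypothetical; -D wants the attribute types in
table column order):

# dump the tuples of a damaged table straight from its data file
pg_filedump -D int,text /var/lib/postgresql/12/main/base/16384/16423 > recovered.txt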


> Vijay
>
LinkedIn - Vijaykumar Jain <https://www.linkedin.com/in/vijaykumarjain/>


Re: constant crashing

2024-04-14 Thread Vijaykumar Jain
On Sun, 14 Apr 2024 at 21:50, jack  wrote:

> The full error reads:
> server closed the connection expectantly
> This probably means the server terminated abnormally
> before or while processing the request.
> error: connection to server was lost
>
> PostgreSQL 16.2
>
> I also believe it is a resource issue which can be rectified with a
> setting, but which setting?
> If you were updating 100 million records what settings would you adjust?
>
> Here are the updates I am performing on the 100 million records:
> UPDATE table SET category_modified = UPPER(category);
> UPDATE table SET category_modified =
> REGEXP_REPLACE(REPLACE(REPLACE(category_modified, '''','-'), '`', '-'),
> '\s{2,}', ' ', 'g') WHERE  AND LENGTH(category_modified)>1 AND
> POSITION('--' IN category_modified)>0;
> UPDATE table SET category_modified = REPLACE(category_modified,' ','-');
> UPDATE table SET category_modified = CASE WHEN category_modified IS NOT
> NULL THEN regexp_replace(category_modified, '[^a-zA-Z]$', '') ELSE NULL END;
> UPDATE table SET category_modified = regexp_replace(category_modified,
> '-{2,}', '-', 'g');
> UPDATE table SET category_modified = SUBSTRING(category_modified FROM 1
> FOR LENGTH(category_modified) - 1) WHERE LENGTH(category_modified)>1 AND
> category_modified LIKE '%-';
>
>
independent of best practices, i just want to check if there is a leak.
I created a sample table with text data and ran updates like yours and I
could not see mem growth, but I have a small vm and ofc your
category_modified field might be more complex than simple text fields for
30-40 chars.

can you grab the pid of your psql backend and (if you have pidstat
installed) monitor resource usage for that pid

postgres@pg:~/udemy/16$ psql
psql (16.2 (Ubuntu 16.2-1.pgdg22.04+1))
Type "help" for help.

postgres=# select pg_backend_pid();
  pg_backend_pid

   1214
(1 row)

# pidstat 2 100 -rud -h -p 1214
(get all stats for that pid); that might help figure out if there is a
leak, or whether the server has other things competing for memory and your
updates were picked by the OOM killer.

Linux 5.15.0-101-generic (pg)   04/15/24_x86_64_(1 CPU)

# TimeUID   PID%usr %system  %guest   %wait%CPU   CPU
minflt/s  majflt/s VSZ RSS   %MEM   kB_rd/s   kB_wr/s kB_ccwr/s
iodelay  Command
00:40:25  113  12140.000.000.000.000.00 0
0.00  0.00  354112  220940  24.18  0.00  0.00  0.00
   0  postgres

# TimeUID   PID%usr %system  %guest   %wait%CPU   CPU
minflt/s  majflt/s VSZ RSS   %MEM   kB_rd/s   kB_wr/s kB_ccwr/s
iodelay  Command
00:40:27  113  12140.000.000.000.000.00 0
0.00  0.00  354112  220940  24.18  0.00  0.00  0.00
   0  postgres


ofc, if there is a genuine leak, then there might be more digging needed:
Finding memory leaks in Postgres C code (enterprisedb.com)
<https://www.enterprisedb.com/blog/finding-memory-leaks-postgres-c-code>
else the OOM killer may just kill the process requesting more mem than available:
Memory context: how PostgreSQL allocates memory - CYBERTEC (cybertec-postgresql.com)
<https://www.cybertec-postgresql.com/en/memory-context-for-postgresql-memory-management/>


-- 
Thanks,
Vijay
LinkedIn - Vijaykumar Jain <https://www.linkedin.com/in/vijaykumarjain/>


Re: constant crashing

2024-04-14 Thread Vijaykumar Jain
Ignore my thread, I guess there might be a bug given it segfaulted.

On Mon, Apr 15, 2024, 12:48 AM Vijaykumar Jain <
vijaykumarjain.git...@gmail.com> wrote:

>
>
> On Sun, 14 Apr 2024 at 21:50, jack  wrote:
>
>> The full error reads:
>> server closed the connection expectantly
>> This probably means the server terminated abnormally
>> before or while processing the request.
>> error: connection to server was lost
>>
>> PostgreSQL 16.2
>>
>> I also believe it is a resource issue which can be rectified with a
>> setting, but which setting?
>> If you were updating 100 million records what settings would you adjust?
>>
>> Here are the updates I am performing on the 100 million records:
>> UPDATE table SET category_modified = UPPER(category);
>> UPDATE table SET category_modified =
>> REGEXP_REPLACE(REPLACE(REPLACE(category_modified, '''','-'), '`', '-'),
>> '\s{2,}', ' ', 'g') WHERE  AND LENGTH(category_modified)>1 AND
>> POSITION('--' IN category_modified)>0;
>> UPDATE table SET category_modified = REPLACE(category_modified,' ','-');
>> UPDATE table SET category_modified = CASE WHEN category_modified IS NOT
>> NULL THEN regexp_replace(category_modified, '[^a-zA-Z]$', '') ELSE NULL END;
>> UPDATE table SET category_modified = regexp_replace(category_modified,
>> '-{2,}', '-', 'g');
>> UPDATE table SET category_modified = SUBSTRING(category_modified FROM 1
>> FOR LENGTH(category_modified) - 1) WHERE LENGTH(category_modified)>1 AND
>> category_modified LIKE '%-';
>>
>>
> independent of best practices, i just want to check if there is a leak.
> I created a sample table with text data and ran updates like yours and I
> could not see mem growth, but I have a small vm and ofc your
> category_modified field might be more complex than simple text fields for
> 30-40 chars.
>
> can you grab the pid of your psql backend and (if you have pidstat
> installed) monitor resource usage for that pid
>
> postgres@pg:~/udemy/16$ psql
> psql (16.2 (Ubuntu 16.2-1.pgdg22.04+1))
> Type "help" for help.
>
> postgres=# select pg_backend_pid();
>   pg_backend_pid
> 
>1214
> (1 row)
>
> # pidstat 2 100 -rud -h -p 1214
> (get all stats for that pid) that might help to figure out if there is a
> leak or the server has other things competing for memory and your updates
> were picked by the killer.
>
> Linux 5.15.0-101-generic (pg)   04/15/24_x86_64_(1 CPU)
>
> # TimeUID   PID%usr %system  %guest   %wait%CPU   CPU
> minflt/s  majflt/s VSZ RSS   %MEM   kB_rd/s   kB_wr/s kB_ccwr/s
> iodelay  Command
> 00:40:25  113  12140.000.000.000.000.00 0
> 0.00  0.00  354112  220940  24.18  0.00  0.00  0.00
>0  postgres
>
> # TimeUID   PID%usr %system  %guest   %wait%CPU   CPU
> minflt/s  majflt/s VSZ RSS   %MEM   kB_rd/s   kB_wr/s kB_ccwr/s
> iodelay  Command
> 00:40:27  113  12140.000.000.000.000.00 0
> 0.00  0.00  354112  220940  24.18  0.00  0.00  0.00
>0  postgres
> 
>
> ofc, if there is a genuine leak , then there might be more digging needed 
> Finding
> memory leaks in Postgres C code (enterprisedb.com)
> <https://www.enterprisedb.com/blog/finding-memory-leaks-postgres-c-code>
> just kill the process requesting more mem than available  Memory context:
> how PostgreSQL allocates memory - CYBERTEC (cybertec-postgresql.com)
> <https://www.cybertec-postgresql.com/en/memory-context-for-postgresql-memory-management/>
>
>
> --
> Thanks,
> Vijay
> LinkedIn - Vijaykumar Jain <https://www.linkedin.com/in/vijaykumarjain/>
>


Re: Backup_Long Running

2024-04-24 Thread Vijaykumar Jain
On Wed, Apr 24, 2024, 12:33 PM jaya kumar  wrote:

> Hi Team,
>
>
>
> Production database Backup is running very long hours. Any option to
> reduce backup time? Kindly advise me.
>
>
>
> DB size: 793 GB
>
>
>
> We are taking pg_basebackup backup.
>
>
do you see network saturation or io saturation?
generally faster hardware, i.e. striped and/or nvme disks, along with a robust
network link and capacity, should help get the backup done quickly.
where are you taking the backup from? is the server busy doing other work,
or is it a dedicated machine for backups?
basically, monitor for resource saturation. when all looked good, we could take
a basebackup of a 10tb db in 8 hours, and in another case, on slow remote
storage, a backup of 2tb took 1 day.

now, pgbackrest can speed up the backup process by spawning more workers for
archiving and stuff. we have taken a backup of 28tb on striped nvme disks in
3 hours, on bare metal servers with powerful cpus.

so, it's mostly down to hardware; else switch to pgbackrest, which can take
incremental/differential/full backups.
there are other tools too; I have used only these two.
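
a rough sketch of the pgbackrest side of that (stanza name and paths are made
up; process-max is what gives you the parallel workers):

# /etc/pgbackrest/pgbackrest.conf
[global]
repo1-path=/var/lib/pgbackrest
process-max=8

[mydb]
pg1-path=/var/lib/postgresql/16/main

# one-time stanza setup, then a full backup followed by cheaper incrementals
pgbackrest --stanza=mydb stanza-create
pgbackrest --stanza=mydb --type=full backup
pgbackrest --stanza=mydb --type=incr backup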

>
>


Re: I have ansible for postgres-etcd-patroni

2024-05-04 Thread Vijaykumar Jain
Hi Iman.

Thank you for sharing.


On Sun, May 5, 2024, 1:42 AM Iman Bakhtiari 
wrote:

> Hi i have written this ansible with https://github/sudoix together
> https://github.com/imanbakhtiari/postgres-ansible.git
> This ansible needs 5  virtual machine
> in 3 of them it install postgresql with patroni replication
> and in two of them it install haproxy and keepalived
> finally it became so stable idea for database with a SQL and also etcd
> NOSQL
>
> with this single command
> ansible-playbook -i inventory/db-servers.ini postgres.yml --become
> --become-method=sudo --tags "preinstall,postgres,haproxy" --ask-become-pass
>
> i just want to share this with others and your community
> and also i am living in Iran right now and eager to accept job offer for
> emigration or any job recommendations
> Here is my resume
> https://gitea.com/imanbakhtiari/resume
> Thank you
>


Re: AI for query-planning?

2024-06-22 Thread Vijaykumar Jain
On Sat, Jun 22, 2024, 5:20 PM Andreas Joseph Krogh 
wrote:

> Hi, are there any plans for using some kind of AI for query-planning?
>
> Can someone with more knowledge about this than I have please explain why
> it might, or not, be a good idea, and what the challenges are?
>

https://github.com/ossc-db/pg_plan_advsr

https://github.com/s-hironobu/pg_plan_inspector

not totally ai, but they can use data from your database to build some
metadata.

there are also plan analyzers online to which people paste their plans to get a
quick summary of problem areas; that data could be useful too for ai-based
plan optimisers.

but that said, postgresql has a lot of knobs to tune, so ...


Re: -1/0 virtualtransaction

2021-04-27 Thread Vijaykumar Jain
Hi,

I am just trying to jump in, but ignore if not relevant.

when you said "Eventually this results in an 'out of shared memory' error":

Can you rule out the below two scenarios (wrt /dev/shm being too low in docker,
or a query requesting too many locks, either due to parallelism or partitions
being involved)?
There have been multiple cases of "out of shared memory" I have read about
earlier due to the above.

PostgreSQL: You might need to increase max_locks_per_transaction
(cybertec-postgresql.com)

PostgreSQL at low level: stay curious! · Erthalion's blog
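
for the second scenario, a quick sanity check (rough sketch) is to compare the
locks currently held against the lock table capacity, which is roughly
max_locks_per_transaction * (max_connections + max_prepared_transactions):

SELECT count(*) AS held_locks FROM pg_locks;
SHOW max_locks_per_transaction;
SHOW max_connections;
SHOW max_prepared_transactions;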


also, is this repeatable (given you mention it happens and eventually leads
to "out of shared memory")?

I may be missing something, but i do not see a PID even though it has a
lock granted on a page; was the process terminated explicitly or
implicitly (and is an orphan lingering)?
ps auxww | grep postgres

I took the below from "src/test/regress/sql/tidscan.sql"  to simulate
SIReadLock with an orphan process (by killing the process), but it gets
reaped fine for me :(

postgres=# \d tidscan
  Table "public.tidscan"
 Column |  Type   | Collation | Nullable | Default
+-+---+--+-
 id | integer |   |  |

postgres=# INSERT INTO tidscan VALUES (1), (2), (3);

postgres=# BEGIN ISOLATION LEVEL SERIALIZABLE;
BEGIN
postgres=*# SELECT * FROM tidscan WHERE ctid = '(0,1)';
 id

  1
(1 row)

postgres=*# -- locktype should be 'tuple'
SELECT locktype, mode FROM pg_locks WHERE pid = pg_backend_pid() AND mode =
'SIReadLock';
 locktype |mode
--+
 tuple| SIReadLock
(1 row)

postgres=*# -- locktype should be 'tuple'
SELECT pid, locktype, mode FROM pg_locks WHERE mode = 'SIReadLock';
 pid  | locktype |mode
--+--+
 2831 | tuple| SIReadLock
(1 row)

i thought one could attach a gdb or strace to the pid to figure out what it
did before crashing.

As always, I have little knowledge on postgresql, feel free to ignore if
nothing relevant.

Thanks,
Vijay



On Tue, 27 Apr 2021 at 19:55, Mike Beachy  wrote:

> Hi Laurenz -
>
> On Tue, Apr 27, 2021 at 2:56 AM Laurenz Albe 
> wrote:
>
>> Not sure, but do you see prepared transactions in "pg_prepared_xacts"?
>>
>
> No, the -1 in the virtualtransaction (
> https://www.postgresql.org/docs/11/view-pg-locks.html) for
> pg_prepared_xacts was another clue I saw! But, it seems more or less a dead
> end as I have nothing in pg_prepared_xacts.
>
> Thanks for the idea, though.
>
> I still need to put more effort into Tom's idea about SIReadLock hanging
> out after the transaction, but some evidence pointing in this direction is
> that I've reduced the number of db connections and found that the '-1/0'
> locks will eventually go away! I interpret this as the db needing to find
> time when no overlapping read/write transactions are present. This doesn't
> seem completely correct, as I don't have any long lived transactions
> running while these locks are hanging out. Confusion still remains, for
> sure.
>
> Mike


Re: client backwards compatible with older servers

2021-04-30 Thread Vijaykumar Jain
I am not sure which postgresql client is being referred to.
If it is the psql command line, then the notes section of the docs below
says:

https://www.postgresql.org/docs/current/app-psql.html


psql works best with servers of the same or an older major version.
Backslash commands are particularly likely to fail if the server is of a
newer version than psql itself. However, backslash commands of the \d family
should work with servers of versions back to 7.4, though not necessarily
with servers newer than psql itself. The general functionality of running
SQL commands and displaying query results should also work with servers of
a newer major version, but this cannot be guaranteed in all cases.

If you want to use psql to connect to several servers of different major
versions, it is recommended that you use the newest version of psql.
Alternatively, you can keep around a copy of psql from each major version
and be sure to use the version that matches the respective server. But in
practice, this additional complication should not be necessary.
If it is language specific clients, I guess you may have to talk to the
maintainers of those libraries/ clients.

Thanks,
Vijay

On Fri, Apr 30, 2021, 1:21 PM Rajesh Madiwale 
wrote:

> Hi Team,
>
> Do we have any reference documentation which tells above postgres client's
> backward compatibility ?
> Example: I am having PG server installed is PG-11.8 and PG client
> installed is PG-13 , will I face any issue? or are there any limitations if
> I use it ?
>
> Regards,
>  Rajesh Madiwale.
>
>


Re: Streaming replica failure

2021-04-30 Thread Vijaykumar Jain
Were there any issues with hardware?
Memory/storage?
I am not sure, but it looks like data loss to me.

If you have admin access, can you check dmesg, system logs, hypervisor logs etc.
and see if there are any errors related to memory corruption?
Coz if there are hardware issues, this will happen again IMHO.
But if this is not a hardware issue, then I guess one of the experts may
have to chime in.

Thanks,
Vijay



On Tue, Apr 27, 2021, 3:31 PM Aleš Zelený  wrote:

> Hello,
>
> we are using PostgreSQL 12.4 on CentOS 7. The hot standby failed:
>
> 2021-04-24 09:19:27 CEST [20956]: [747-1] user=,db=,host=,app= LOG:
>  recovery restart point at 3D8C/352B4CE8
> 2021-04-24 09:19:27 CEST [20956]: [748-1] user=,db=,host=,app= DETAIL:
>  Last completed transaction was at log time 2021-04-24 09:19:27.221313+02.
> 2021-04-24 09:20:57 CEST [20956]: [749-1] user=,db=,host=,app= LOG:
>  restartpoint starting: time
> 2021-04-24 09:24:27 CEST [20956]: [750-1] user=,db=,host=,app= LOG:
>  restartpoint complete: wrote 171233 buffers (13.4%); 0 WAL file(s) added,
> 68 removed, 0 recycled; write
> =209.663 s, sync=0.012 s, total=209.963 s; sync files=283, longest=0.001
> s, average=0.000 s; distance=684762 kB, estimate=684762 kB
> 2021-04-24 09:24:27 CEST [20956]: [751-1] user=,db=,host=,app= LOG:
>  recovery restart point at 3D8C/5EF6B858
> 2021-04-24 09:24:27 CEST [20956]: [752-1] user=,db=,host=,app= DETAIL:
>  Last completed transaction was at log time 2021-04-24 09:24:27.288115+02.
> 2021-04-24 09:25:15 CEST [20955]: [11-1] user=,db=,host=,app= WARNING:
>  page 366603 of relation base/20955/10143636 is uninitialized
> 2021-04-24 09:25:15 CEST [20955]: [12-1] user=,db=,host=,app= CONTEXT:
>  WAL redo at 3D8C/D79F6500 for Heap2/VISIBLE: cutoff xid 3806260577 flags
> 0x01
> 2021-04-24 09:25:15 CEST [20955]: [13-1] user=,db=,host=,app= PANIC:  WAL
> contains references to invalid pages
> 2021-04-24 09:25:15 CEST [20955]: [14-1] user=,db=,host=,app= CONTEXT:
>  WAL redo at 3D8C/D79F6500 for Heap2/VISIBLE: cutoff xid 3806260577 flags
> 0x01
> 2021-04-24 09:25:16 CEST [20953]: [11-1] user=,db=,host=,app= LOG:
>  startup process (PID 20955) was terminated by signal 6: Neúspěšně ukončen
> (SIGABRT)
> 2021-04-24 09:25:16 CEST [20953]: [12-1] user=,db=,host=,app= LOG:
>  terminating any other active server processes
>
> The relation base/20955/10143636 is a standard table.
>
> Version details:
> postgres=# select version();
>  version
>
>
> -
>  PostgreSQL 12.4 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5
> 20150623 (Red Hat 4.8.5-39), 64-bit
> (1 row)
>
> Settings:
> postgres=# select name, setting, category from pg_settings where category
> = 'Write-Ahead Log / Settings';
>   name  |  setting  |  category
> +---+
>  commit_delay   | 0 | Write-Ahead Log / Settings
>  commit_siblings| 5 | Write-Ahead Log / Settings
>  fsync  | on| Write-Ahead Log / Settings
>  full_page_writes   | on| Write-Ahead Log / Settings
>  synchronous_commit | on| Write-Ahead Log / Settings
>  wal_buffers| 2048  | Write-Ahead Log / Settings
>  wal_compression| off   | Write-Ahead Log / Settings
>  wal_init_zero  | on| Write-Ahead Log / Settings
>  wal_level  | logical   | Write-Ahead Log / Settings
>  wal_log_hints  | off   | Write-Ahead Log / Settings
>  wal_recycle| on| Write-Ahead Log / Settings
>  wal_sync_method| fdatasync | Write-Ahead Log / Settings
>  wal_writer_delay   | 200   | Write-Ahead Log / Settings
>  wal_writer_flush_after | 128   | Write-Ahead Log / Settings
> (14 rows)
>
> pg_waldump output:
> -bash-4.2$ /usr/pgsql-12/bin/pg_waldump 00013D8C00D7 2>&1 |
> tail
> rmgr: Btree   len (rec/tot): 64/64, tx: 3812802559, lsn:
> 3D8C/D7CF5A98, prev 3D8C/D7CF5A58, desc: INSERT_LEAF off 360, blkref #0:
> rel 1663/20955/11280092 blk 19
> 509
> rmgr: Transaction len (rec/tot): 46/46, tx: 3812802557, lsn:
> 3D8C/D7CF5AD8, prev 3D8C/D7CF5A98, desc: COMMIT 2021-04-24 09:25:16.160687
> CEST
> rmgr: Heaplen (rec/tot):159/   159, tx: 3812802559, lsn:
> 3D8C/D7CF5B08, prev 3D8C/D7CF5AD8, desc: INSERT off 8 flags 0x08, blkref
> #0: rel 1663/20955/11280066 bl
> k 165603
> rmgr: Btree   len (rec/tot): 64/64, tx: 3812802559, lsn:
> 3D8C/D7CF5BA8, prev 3D8C/D7CF5B08, desc: INSERT_LEAF off 317, blkref #0:
> rel 1663/20955/11280073 blk 15
> 340
> rmgr: Btree   len (rec/tot): 64/64, tx: 3812802559, lsn:
> 3D8C/D7CF5BE8, prev 3D8C/D7CF5BA8, desc: INSERT_LEAF off 130, blkref #0:
> rel 1663/20955/11280091 blk 22
> 003
> rmgr: Heaplen (rec/tot): 80/80, tx: 38128025

Re: PostgreSQL, Asynchronous I/O, Buffered I/O and why did fsync-gate not affect Oracle or MySQL?

2021-05-02 Thread Vijaykumar Jain
This wiki page.
It has PR references for mysql and mongo for the fsync issue.

Fsync Errors - PostgreSQL wiki


I'd leave the more intellectual brainstorming to the experts.

Also, ask for concrete references / reproducible scenarios for opinions if
you care, else it leads to rumours :)

Thanks,
Vijay




On Sun, 2 May 2021 at 19:17, Pól Ua Laoínecháin  wrote:

> Hi all,
>
>
> On 2019/10/09, I posted a question here concerning PostgreSQL I/O, the
> (primarily) Linux fsync problem and my lecturer's attitude to
> PostgreSQL (text of that email is at the bottom of this post).
>
>
> I asked why the fsync issue didn't affect Oracle and/or MySQL for example?
>
> As far as I can see, this was because Oracle uses Direct I/O whereas
> PostgreSQL uses Buffered I/O. I know that the issue has been resolved
> and no longer affects currently supported versions - but I'm still
> curious...
>
>
> From here:
> https://www.percona.com/blog/2019/02/22/postgresql-fsync-failure-fixed-minor-versions-released-feb-14-2019/
>
> > Whereas, writing the modified/dirty buffers to datafiles from shared
> buffers is always through Buffered IO.
>
>
> Now, I'm not quite sure that I completely comprehend matters: Is there
> a difference between Asynchronous I/O and Buffered I/O?
>
> If so, could some kind person point me to referernces that explain the
> difference?
>
>
> But, my foggy brain aside, I read (can't find URL - paraphrasing):
> PostgreSQL is happy to let the kernel take the load off the server and
> look after I/O - that's fine, but you'd better be able to trust your
> kernel.
>
>
> However, MySQL also uses Asynchronous I/O by default - or does it?
>
> From here:
> https://dev.mysql.com/doc/refman/8.0/en/optimizing-innodb-diskio.html#:~:text=InnoDB%20uses%20the%20asynchronous%20I,which%20is%20enabled%20by%20default
> ,
> we have:
>
> > InnoDB uses the asynchronous I/O subsystem (native AIO) on Linux to
> perform read-ahead and write requests for data file pages. This behavior is
> controlled by the innodb_use_native_aio configuration option, which is
> enabled by default.
>
>
> Now, I haven't (AFAIK) seen references to problems with this fsync
> issue on MySQL. Maybe they're so used to losing data, nobody noticed?
> :-)
>
>
> Seriously though, it does beg the question - why did this cause a
> major issue for PostgreSQL but not for MySQL?
>
>
> Is it because of (a) difference(s) between Asynchronous I/O and
> Buffered I/O asked about above?
>
>
> A couple of pointers (excuse the pun!) about this issue would be
> great. Some stuff which shows the difference between Direct I/O and
> the others would also be helpful. I seem to remember there was mention
> of this for PostgreSQL but that it would be a mulit-year project. Why
> so if PostgreSQL can already use Direct I/O for the WAL logs?
>
> Obviously, I can search and I have been - but I'd appreciate material
> from people here who can sort the wheat from the chaff and point me to
> solid references. If here is not a suitable forum, then kindly
> redirect me.
>
> TIA and rgs,
>
>
> Pól Ua...
>
>
>
> =
>
> > 2019/10/09 Is my lecturer wrong about PostgreSQL? I think he is!
>
>
> I recently started a Masters in Computer Science (and not at the
> institution in my email address).
>
> One of my courses is "Advanced Databases" - yummy I thought - it's not
> even compulsory for me but I just *_had_* to take this module. The
> lecturer is a bit of an Oracle fan-boy (ACE director no less...
> hmmm...) and I want(ed) - becoming less enthusiasic by the minute - to
> do my dissertation with him. So, we're having a chat and I make plain
> my love of good 'ol PostgreSQL as my RDBMS of choice and he tells me
> that there are problems with random block corruption with PostgreSQL.
> I said "really" and before that conversation could go any further,
> another student came over and asked a question.
>
> So, I toddled off and did some research - I had heard something about
> this before (vague fuzzy memories) of a problem with the Linux kernel
> so I searched for a bit and duly dug up a couple of pages
>
> https://lwn.net/Articles/752063/ : PostgreSQL's fsync() surprise - and
>
> https://news.ycombinator.com/item?id=19238121 : Linux Fsync Issue for
> Buffered IO and Its Preliminary Fix for PostgreSQL
>
> So, this week I go back to my lecturer and say, yep, there was some
> issue but it was a Linux kernel problem and not PostgreSQL's fault and
> has been resolved.
>
> He tells me that he knew about that but that there was another issue
> (he had "spoken to people" at meetings!). I said "well, why isn't it
> fixed?" and he replied "where's the impetus?" to which I responded
> (quite shocked at this stage) something like "well, I know that the
> core team values correctness very highly" to which he came back with
> "yes, but they have no commercial imperative to fix anything - they
> have to wait until somebody is capable enough 

Re: trigger impacting insertion of records

2021-05-06 Thread Vijaykumar Jain
just simplified, but it works fine for me.

create table example(id int primary key, value text);

create or replace function trg_fn() returns trigger language plpgsql as $$
begin
RAISE NOTICE 'trigger_func(%) called: action = %, when = %, level =
%', TG_ARGV[0], TG_OP, TG_WHEN, TG_LEVEL;
RAISE NOTICE 'id=%, value=%', NEW.id, NEW.value;
update example set value=replace(value,'_',' ') where left(value,3)
= 'US_';
return new;
end; $$;


create trigger after_insert_trigger after insert ON example for each row
execute function  trg_fn();

insert into example select x, case when x % 2 = 0 then 'US_' || x::text
else x::text end from generate_series(1, 100) x;

NOTICE:  trigger_func() called: action = INSERT, when = AFTER, level
= ROW
NOTICE:  id=99, value=99
NOTICE:  Returned 0 rows
NOTICE:  trigger_func() called: action = INSERT, when = AFTER, level
= ROW
NOTICE:  id=100, value=US_100
NOTICE:  Returned 0 rows
INSERT 0 100

-- do not see any values with US_, although i inserted 50 of them.
postgres=# select count(*) from example where value like 'US\_%';
 count
---
 0
(1 row)

-- do see 50 "US" values as expected.
postgres=# select count(*) from example where value like 'US %';
 count
---
50
(1 row)

Can you verify the accountnumber field does not have any spaces etc. at the
beginning?

unless there is some conflicting stuff modifying rows, i think this should
be ok.
you can also try:
LOCK TABLE example IN EXCLUSIVE MODE;  -- DO NOT DO IT IF IT IMPACTS
ANYTHING IN PRODUCTION




On Thu, 6 May 2021 at 13:15, Atul Kumar  wrote:

> Hi,
>
> I have simple table having structure like given below:
>
> \d bp_ach_trans
>   Table "bonzipay.bp_ach_trans"
>Column   |  Type  |
> Modifiers
>
> ++---
> bptransid  | integer| not null default
> nextval('bp_ach_trans_bptransid_seq1'::regclass)
>
> filename   | character varying(50)  |
>  payment_status | character varying(30)  |
>  settledate | character varying(15)  |
>  payment_pastransid | bigint |
>  tname  | character varying(250) |
>  code   | character varying(5)   |
>  error_txt  | character varying(200) |
>  routingnumber  | character varying(15)  |
>  tracenumber| character varying(10)  |
>  accountnumber  | character varying(15)  |
>  bankaccountnumber  | character varying(17)  |
>  type   | character varying(1)   |
>  amount | numeric|
>  site   | character varying(30)  |
>  accountype | character varying(2)   |
>  tranid | character varying(15)  |
>
> Triggers:
> ins_ussf_rec AFTER INSERT ON bp_ach_trans FOR EACH ROW EXECUTE
> PROCEDURE ussf_accountnumber_update()
>
>
>
> the function definition is like below:
>
> CREATE OR REPLACE FUNCTION bonzipay.ussf_accountnumber_update()
>  RETURNS trigger
>  LANGUAGE plpgsql
> AS $function$ BEGIN update bonzipay.bp_ach_trans set
> accountnumber=replace(accountnumber,'_',' ') where
> left(accountnumber,3) = 'US_'; RETURN NEW; END; $function$
>
>
> my query is:
>
> when I am inserting around 1000 records in the table having
> accountnumber not having value 'US_', I am getting only 300 records
> insertion. remaining around 700 values are not getting inserted.
>
> why this strange behavior is happening, as I am not inserting any
> record having value 'US_' even after that all records are not
> inserting.
>
> Any suggestions are welcome.
>
>
>
> Regards,
> Atul
>
>
>

-- 
Thanks,
Vijay
Mumbai, India


Re: Optimizing search query with sorting by creation field

2021-05-07 Thread Vijaykumar Jain
What is your baseline expectation?
For what size of table should which query take how much time?
How many server resources can be used?


If this seems to be a timeseries db,
are the rows append-only, and is the insertion order random?

You can create partitions based on time, and sub-partitions on some other
field that can be used as a predicate, to exclude scanning other partitions
and speed up row retrieval.
Clustering the tables on the timestamp may help reduce sort time if order
by is always used.

Basically, a lot of the improvements I would list are used by
timescaledb, which is optimized for time-series queries.

Whether or not you want to use timescaledb,
I guess its material has enough content to help plan time-series data storage
and retrieval.
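
a minimal sketch of the time-based partitioning idea (table and column names
are made up, monthly ranges chosen arbitrarily; the index definitions cascade
to each partition):

CREATE TABLE posts (
    id        bigint GENERATED ALWAYS AS IDENTITY,
    created   timestamptz NOT NULL,
    post_tsv  tsvector
) PARTITION BY RANGE (created);

CREATE TABLE posts_2021_04 PARTITION OF posts
    FOR VALUES FROM ('2021-04-01') TO ('2021-05-01');
CREATE TABLE posts_2021_05 PARTITION OF posts
    FOR VALUES FROM ('2021-05-01') TO ('2021-06-01');

-- search and sort indexes; queries that filter on created can then prune
-- to the relevant partitions instead of touching everything
CREATE INDEX ON posts USING gin (post_tsv);
CREATE INDEX ON posts (created DESC);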






On Fri, May 7, 2021, 6:12 AM Droid Tools  wrote:

> Hi,
>
> I'm looking for tips on optimizing a search query where someone searches
> for content within a post and wants to sort the results by a timestamp. I
> have a table `posts` with a `created` field (timestamp) and a post_tsv
> column (TSVECTOR). Likewise I have a GIN Index on the `post_tsv` field and
> a separate index on the `created` field.
>
> My initial, naive, attempt was to simply to do something like:
>
> ```
> SELECT * FROM posts WHERE post_tsv @@ websearch_to_tsquery(?) ORDER BY
> created DESC
> ```
>
> However, I didn't realize in this case the `created` index would be
> ignored, which means if there was a large number of posts returned, this
> query would take several seconds to execute. Also the planner would do
> weird things even if the result set was small and still take several
> seconds to execute. Currently I've papered over the problem by issuing a
> subquery and sorting that instead.
>
>  ```
> SELECT * FROM
>   (SELECT * FROM posts WHERE post_tsv @@ websearch_to_tsquery(?) LIMIT
> 10,000) q
> ORDER BY created DESC
> ```
>
> In short I execute the search, limit that to 10,000 rows, and then order
> the 10,000 rows that were returned. This worked amazingly for queries that
> returned fewer than 10,000 rows as those queries went from taking several
> seconds to run down to a handful of milliseconds. The problem is for
> queries with more than 10,000 rows you essentially end up with random
> results. I'm still not using the created index, but sorting 10,000 rows in
> memory is relatively fast.
>
> I'm stuck where to go from here - what I would like, since I know I will
> only ever ORDER BY the created field is to build some index where the
> default ordering is by the created field. GIN, as I understand it, doesn't
> support indexing in this manner (using one of the columns as a sort field).
> Is there anything else I could try?
>
>


Re: idle_in_transaction_session_timeout

2021-05-08 Thread Vijaykumar Jain
Why do you want to increase that timeout?
I hope you are aware that long idle-in-transaction connections delay
vacuuming, result in much larger bloat and slow the db down over a period
of time.
You need to upgrade, and push for it using the reason that these versions are
no longer supported.
Asking for workarounds on an unsupported version will only increase
headaches in the future.
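
to see how bad it already is on 9.5, a quick check (rough sketch) of sessions
sitting idle in transaction and for how long:

SELECT pid, usename, now() - state_change AS idle_for, query
FROM pg_stat_activity
WHERE state = 'idle in transaction'
ORDER BY idle_for DESC;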


On Sat, May 8, 2021, 3:58 PM Atul Kumar  wrote:

> ok, But what is the workaround of this parameter in postgres 9.5, ,I
> need to increase the time of "idle in transaction" transactions.
>
>
>
>
>
> Regards,
> Atul
>
>
>
>
>
>
>
> On 5/7/21, luis.robe...@siscobra.com.br 
> wrote:
> >
> > - Mensagem original -
> >> De: "Atul Kumar" 
> >> Para: "pgsql-general" 
> >> Enviadas: Sexta-feira, 7 de maio de 2021 3:34:44
> >> Assunto: idle_in_transaction_session_timeout
> >
> >> Hi,
> >
> >> I have postgres 9.5 version running on my machine.
> >
> >> When I am trying to find out the parameter
> >> idle_in_transaction_session_timeout it is showing me below error:
> >
> >> postgres=# show idle_in_transaction_session_timeout;
> >> ERROR: unrecognized configuration parameter
> >> "idle_in_transaction_session_timeout"
> >
> >> I also checked postgresql.conf but even in this file there is no such
> >> parameter.
> >
> >> Please help me to find this parameter.
> >
> >> Regards,
> >> Atul
> >
> > idle_in_transaction_session_timeout first appears in v9.6[1]
> >
> > [1] https://www.postgresql.org/docs/9.6/runtime-config-client.html
> >
> > Luis R. Weck
> >
>
>
>


Copyright vs Licence

2021-05-10 Thread Vijaykumar Jain
Hi All,

I have been playing around with the pg_auto_failover extension by citus and
have really enjoyed playing chaos with it.
citusdata/pg_auto_failover: Postgres extension and service for automated
failover and high-availability (github.com)


Now I see this at the bottom of this extension.
This may be a stupid question, but i ask coz i have worked with OSS that
has been marked EOL or dead.
Some software has started asking for a fee (like oracle for supported java).
Some software which was completely open source for unlimited usage (like
sensu) now has a new version with limited/capped free usage.
Or the Google vs Oracle case.

I know I can make a city of postgresql clusters of various sharded
architectures, and it will still be free, and postgresql is not responsible
for any damage etc, i understand. but can the extensions later start charging
on a usage model?



Copyright (c) Microsoft Corporation. All rights reserved. This project is
licensed under the PostgreSQL License, see LICENSE file for details.

I have a lame query (but i have a concern wrt how licenses changed for
products oracle acquired):

What is the role of copyright in a license? (I am not sure if i am even
framing the question correctly; let me know if i am not.)
Can I be charged, for whatever reason, in the future for using this
extension?

-- 
Thanks,
Vijay
Mumbai, India


Re: force partition pruning

2021-05-10 Thread Vijaykumar Jain
I do not know how to put this in words,
but see below what happens when the predicate is applied explicitly to the
partitioned (main) table.

postgres=# \d+ prt1
   Partitioned table "public.prt1"
 Column |   Type| Collation | Nullable | Default | Storage  |
Stats target | Description
+---+---+--+-+--+--+-
 a  | integer   |   | not null | | plain|
|
 b  | integer   |   |  | | plain|
|
 c  | character varying |   |  | | extended |
|
Partition key: RANGE (a)
Partitions: prt1_p1 FOR VALUES FROM (0) TO (250),
prt1_p2 FOR VALUES FROM (250) TO (500),
prt1_p3 FOR VALUES FROM (500) TO (600)

postgres=# \d+ b
 Table "public.b"
 Column |  Type   | Collation | Nullable | Default | Storage | Stats target
| Description
+-+---+--+-+-+--+-
 id | integer |   | not null | | plain   |
|
Indexes:
"b_id_idx" btree (id)
Access method: heap

postgres=# table b;
 id
-
 200
 400
(2 rows)

-- basically, if the table is joined and the predicate can be applied to the
outer table whose constraints match, partition pruning takes place.

I do not know the theory, or even whether what I did is correct, but just FYI.

postgres=# explain analyze  select prt1.* from prt1 where a in ( select id
from b where id in (1, 100, 200) );
   QUERY PLAN
-
 Hash Semi Join  (cost=1.05..9.36 rows=2 width=13) (actual
time=0.034..0.074 rows=1 loops=1)
   Hash Cond: (prt1.a = b.id)
   ->  Append  (cost=0.00..7.50 rows=300 width=13) (actual
time=0.006..0.043 rows=300 loops=1)
 ->  Seq Scan on prt1_p1 prt1_1  (cost=0.00..2.25 rows=125
width=13) (actual time=0.005..0.013 rows=125 loops=1)
 ->  Seq Scan on prt1_p2 prt1_2  (cost=0.00..2.25 rows=125
width=13) (actual time=0.003..0.009 rows=125 loops=1)
 ->  Seq Scan on prt1_p3 prt1_3  (cost=0.00..1.50 rows=50 width=13)
(actual time=0.002..0.004 rows=50 loops=1)
   ->  Hash  (cost=1.03..1.03 rows=2 width=4) (actual time=0.005..0.005
rows=1 loops=1)
 Buckets: 1024  Batches: 1  Memory Usage: 9kB
 ->  Seq Scan on b  (cost=0.00..1.03 rows=2 width=4) (actual
time=0.003..0.003 rows=1 loops=1)
   Filter: (id = ANY ('{1,100,200}'::integer[]))
   Rows Removed by Filter: 1
 Planning Time: 0.181 ms
 Execution Time: 0.089 ms
(13 rows)

postgres=# explain analyze  select prt1.* from prt1 where a in ( select id
from b where b.id = prt1.a) and a in (1, 100, 200);
QUERY PLAN
---
 Index Scan using iprt1_p1_a on prt1_p1 prt1  (cost=0.14..14.03 rows=2
width=13) (actual time=0.024..0.025 rows=1 loops=1)
   Index Cond: (a = ANY ('{1,100,200}'::integer[]))
   Filter: (SubPlan 1)
   Rows Removed by Filter: 1
   SubPlan 1
 ->  Seq Scan on b  (cost=0.00..1.02 rows=1 width=4) (actual
time=0.003..0.003 rows=0 loops=2)
   Filter: (id = prt1.a)
   Rows Removed by Filter: 1
 Planning Time: 0.120 ms
 Execution Time: 0.041 ms
(10 rows)

postgres=# explain analyze  select prt1.* from prt1 where exists ( select 1
from b where b.id = prt1.a) and a in (1, 100, 200);
 QUERY PLAN
-
 Hash Semi Join  (cost=1.04..3.79 rows=2 width=13) (actual
time=0.024..0.028 rows=1 loops=1)
   Hash Cond: (prt1.a = b.id)
   ->  Seq Scan on prt1_p1 prt1  (cost=0.00..2.72 rows=3 width=13) (actual
time=0.011..0.017 rows=2 loops=1)
 Filter: (a = ANY ('{1,100,200}'::integer[]))
 Rows Removed by Filter: 123
   ->  Hash  (cost=1.02..1.02 rows=2 width=4) (actual time=0.004..0.004
rows=2 loops=1)
 Buckets: 1024  Batches: 1  Memory Usage: 9kB
 ->  Seq Scan on b  (cost=0.00..1.02 rows=2 width=4) (actual
time=0.001..0.002 rows=2 loops=1)
 Planning Time: 0.192 ms
 Execution Time: 0.043 ms
(10 rows)

postgres=# explain analyze  select prt1.* from prt1 inner join b on prt1.a
= b.id  where a in (1, 100, 200);
 QUERY PLAN
-
 Hash Join  (cost=1.04..3.79 rows=2 width=13) (actual time=0.024..0.028
rows=1 loops=1)
   Hash Cond: (prt1.a = b.id)
   ->  Seq Scan on prt1_p1 prt1  (cost=0.00..2.72 rows=3 wid

Re: force partition pruning

2021-05-11 Thread Vijaykumar Jain
create table tbl1(col1 int);
insert into tbl1 values (501), (505);

-- Basic table
create table tprt (col1 int) partition by range (col1);
create table tprt_1 partition of tprt for values from (1) to (501);
create table tprt_2 partition of tprt for values from (501) to (1001);
create table tprt_3 partition of tprt for values from (1001) to (2001);
create table tprt_4 partition of tprt for values from (2001) to (3001);
create table tprt_5 partition of tprt for values from (3001) to (4001);
create table tprt_6 partition of tprt for values from (4001) to (5001);

create index tprt1_idx on tprt_1 (col1);
create index tprt2_idx on tprt_2 (col1);
create index tprt3_idx on tprt_3 (col1);
create index tprt4_idx on tprt_4 (col1);
create index tprt5_idx on tprt_5 (col1);
create index tprt6_idx on tprt_6 (col1);

insert into tprt values (10), (20), (501), (502), (505), (1001), (4500);

-- make this similar to your query.

alter table tbl1 add column col2 int default 0;
update tbl1 set col2 =1 where col1 = 501;
table tbl1;
 col1 | col2
--+--
  501 |1
  505 |0


explain analyze select * from tprt where tprt.col1 in (select tbl1.col1
from tbl1 where tbl1.col2 in (1, 2) );
partition pruning does take place.

explain analyze select * from tprt where tprt.col1 in (select tbl1.col1
from tbl1 where tbl1.col2 in (1, 2) );
 QUERY PLAN

 *Nested Loop*  (cost=38.46..382.58 rows=1760 width=4) (actual
time=0.026..0.029 rows=1 loops=1)
   ->  HashAggregate  (cost=38.31..38.53 rows=22 width=4) (actual
time=0.012..0.013 rows=1 loops=1)
 Group Key: tbl1.col1
 Batches: 1  Memory Usage: 24kB
 ->  Seq Scan on tbl1  (cost=0.00..38.25 rows=23 width=4) (actual
time=0.008..0.009 rows=1 loops=1)
   Filter: (col2 = ANY ('{1,2}'::integer[]))
   Rows Removed by Filter: 1
   ->  Append  (cost=0.15..14.86 rows=78 width=4) (actual time=0.011..0.013
rows=1 loops=1)
 ->  Index Only Scan using tprt1_idx on tprt_1  (cost=0.15..2.47
rows=13 width=4) (never executed)
   Index Cond: (col1 = tbl1.col1)
   Heap Fetches: 0
 ->  Index Only Scan using tprt2_idx on tprt_2  (cost=0.15..2.47
rows=13 width=4) (actual time=0.009..0.010 rows=1 loops=1)
   Index Cond: (col1 = tbl1.col1)
   Heap Fetches: 1
 ->  Index Only Scan using tprt3_idx on tprt_3  (cost=0.15..2.47
rows=13 width=4) (never executed)
   Index Cond: (col1 = tbl1.col1)
   Heap Fetches: 0
 ->  Index Only Scan using tprt4_idx on tprt_4  (cost=0.15..2.30
rows=13 width=4) (never executed)
   Index Cond: (col1 = tbl1.col1)
   Heap Fetches: 0
 ->  Index Only Scan using tprt5_idx on tprt_5  (cost=0.15..2.30
rows=13 width=4) (never executed)
   Index Cond: (col1 = tbl1.col1)
   Heap Fetches: 0
 ->  Index Only Scan using tprt6_idx on tprt_6  (cost=0.15..2.47
rows=13 width=4) (never executed)
   Index Cond: (col1 = tbl1.col1)
   Heap Fetches: 0
 Planning Time: 0.211 ms
 Execution Time: 0.083 ms
(28 rows)

So, in short, I do not know how pruning works during execution time.
I hope someone else will be able to help you.


Thanks,
Vijay





On Mon, 10 May 2021 at 18:17, Vijaykumar Jain <
vijaykumarjain.git...@gmail.com> wrote:

>
>
> I do not know how to put this in words,
> but see below when the predicate is explicitly applied to the main table
> with partition.
>
> postgres=# \d+ prt1
>Partitioned table "public.prt1"
>  Column |   Type| Collation | Nullable | Default | Storage  |
> Stats target | Description
>
> +---+---+--+-+--+--+-
>  a  | integer   |   | not null | | plain|
> |
>  b  | integer   |   |  | | plain|
> |
>  c  | character varying |   |  | | extended |
> |
> Partition key: RANGE (a)
> Partitions: prt1_p1 FOR VALUES FROM (0) TO (250),
> prt1_p2 FOR VALUES FROM (250) TO (500),
> prt1_p3 FOR VALUES FROM (500) TO (600)
>
> (failed reverse-i-search)`': ^C
> postgres=# \d+ b
>  Table "public.b"
>  Column |  Type   | Collation | Nullable | Default | Storage | Stats
> target | Description
>
> +-+---+--+-+-+--+-
>  id | integer |   | not null | | plain   |
>   |
> Indexes:
> "b_id_idx" btree (id)
> A

Re: force partition pruning

2021-05-11 Thread Vijaykumar Jain
OK, maybe you are in a rush.

But I would keep the thread open to understand what I am not understanding;
otherwise it will become a habit of converting SQL to PL/pgSQL :)

Big Guys,
It seems that when the table is partitioned by range, the plan uses a nested
loop, which helps with partition pruning.
If the table is list partitioned, it scans all the partitions.

Is this expected?



LIST BASED PARTITION
**
postgres@go:~$ more p.sql
drop table tbl1;
drop table tprt;

create table tbl1(col1 int);
insert into tbl1 values (501), (505);

-- Basic table
create table tprt (col1 int) partition by list (col1);
create table tprt_1 partition of tprt for values in (501);
create table tprt_2 partition of tprt for values in (1001);
create table tprt_3 partition of tprt for values in  (2001);
create table tprt_4 partition of tprt for values in  (3001);
create table tprt_5 partition of tprt for values in  (4001);
create table tprt_6 partition of tprt for values in  (5001);

create index tprt1_idx on tprt_1 (col1);
create index tprt2_idx on tprt_2 (col1);
create index tprt3_idx on tprt_3 (col1);
create index tprt4_idx on tprt_4 (col1);
create index tprt5_idx on tprt_5 (col1);
create index tprt6_idx on tprt_6 (col1);

insert into tprt values (501), (1001), (2001), (3001), (4001), (5001),
(501);

alter table tbl1 add column col2 int default 0;
update tbl1 set col2 =1 where col1 = 501;

vacuum analyze tprt;
vacuum analyze tbl1;

explain analyze select * from tprt where tprt.col1 in (select tbl1.col1
from tbl1 where tbl1.col2 in (1, 2) );

 QUERY PLAN

 Hash Semi Join  (cost=1.05..7.20 rows=2 width=4) (actual time=0.028..0.034
rows=2 loops=1)
   Hash Cond: (tprt.col1 = tbl1.col1)
   ->  Append  (cost=0.00..6.10 rows=7 width=4) (actual time=0.003..0.008
rows=7 loops=1)
 ->  Seq Scan on tprt_1  (cost=0.00..1.02 rows=2 width=4) (actual
time=0.002..0.003 rows=2 loops=1)
 ->  Seq Scan on tprt_2  (cost=0.00..1.01 rows=1 width=4) (actual
time=0.001..0.001 rows=1 loops=1)
 ->  Seq Scan on tprt_3  (cost=0.00..1.01 rows=1 width=4) (actual
time=0.001..0.001 rows=1 loops=1)
 ->  Seq Scan on tprt_4  (cost=0.00..1.01 rows=1 width=4) (actual
time=0.000..0.000 rows=1 loops=1)
 ->  Seq Scan on tprt_5  (cost=0.00..1.01 rows=1 width=4) (actual
time=0.001..0.001 rows=1 loops=1)
 ->  Seq Scan on tprt_6  (cost=0.00..1.01 rows=1 width=4) (actual
time=0.001..0.001 rows=1 loops=1)
   ->  Hash  (cost=1.02..1.02 rows=2 width=4) (actual time=0.006..0.006
rows=1 loops=1)
 Buckets: 1024  Batches: 1  Memory Usage: 9kB
 ->  Seq Scan on tbl1  (cost=0.00..1.02 rows=2 width=4) (actual
time=0.003..0.003 rows=1 loops=1)
   Filter: (col2 = ANY ('{1,2}'::integer[]))
   Rows Removed by Filter: 1
 Planning Time: 0.237 ms
 Execution Time: 0.060 ms


*even if I set hashjoin off*

postgres=# set enable_hashjoin TO 0;
SET
postgres=# explain analyze select * from tprt where tprt.col1 in (select
tbl1.col1 from tbl1 where tbl1.col2 in (1, 2) );
 QUERY PLAN

 Nested Loop Semi Join  (cost=0.00..7.34 rows=2 width=4) (actual
time=0.013..0.023 rows=2 loops=1)
   Join Filter: (tprt.col1 = tbl1.col1)
   Rows Removed by Join Filter: 5
   ->  Append  (cost=0.00..6.10 rows=7 width=4) (actual time=0.004..0.010
rows=7 loops=1)
 ->  Seq Scan on tprt_1  (cost=0.00..1.02 rows=2 width=4) (actual
time=0.003..0.003 rows=2 loops=1)
 ->  Seq Scan on tprt_2  (cost=0.00..1.01 rows=1 width=4) (actual
time=0.001..0.001 rows=1 loops=1)
 ->  Seq Scan on tprt_3  (cost=0.00..1.01 rows=1 width=4) (actual
time=0.001..0.001 rows=1 loops=1)
 ->  Seq Scan on tprt_4  (cost=0.00..1.01 rows=1 width=4) (actual
time=0.001..0.001 rows=1 loops=1)
 ->  Seq Scan on tprt_5  (cost=0.00..1.01 rows=1 width=4) (actual
time=0.001..0.001 rows=1 loops=1)
 ->  Seq Scan on tprt_6  (cost=0.00..1.01 rows=1 width=4) (actual
time=0.001..0.001 rows=1 loops=1)
   ->  Materialize  (cost=0.00..1.03 rows=2 width=4) (actual
time=0.001..0.001 rows=1 loops=7)
 ->  Seq Scan on tbl1  (cost=0.00..1.02 rows=2 width=4) (actual
time=0.007..0.007 rows=1 loops=1)
   Filter: (col2 = ANY ('{1,2}'::integer[]))
   Rows Removed by Filter: 1
 Planning Time: 0.578 ms
 Execution Time: 0.038 ms
(16 rows)

**



RANGE BASED PARTITION
**
postgres@go:~$ more q.sql
drop table tbl1;
drop table tprt;
create table tbl1(col1 int);
insert into tbl1 values (501), (505);

-- Basic table
create table tprt (col1 int) partition by range(col1);
create table tprt_1 partition of tprt for values from (0) to (500);
create table tprt_2 partition of t

Re: force partition pruning

2021-05-11 Thread Vijaykumar Jain
ok,

partitioning - Can PostgreSQL 12 do partition pruning at execution time
with subquery returning a list? - Stack Overflow
<https://stackoverflow.com/questions/6395/can-postgresql-12-do-partition-pruning-at-execution-time-with-subquery-returning>

OK, forcing hash join off did not work, since the partitioned table was
selected as the outer table.


On Tue, 11 May 2021 at 22:42, Vijaykumar Jain <
vijaykumarjain.git...@gmail.com> wrote:

> Ok. maybe you are in a rush.
>
> But I would keep the thread open, to understand what I am not
> understanding or else, it'll become a habit of converting sql to plpgsql :)
>
> Big Guys,
> It seems, when the table is partitioned by range, it makes use of a nested
> loop which helps in partition pruning.
> if the table is list partitioned, it scans all the partitions.
>
> Is this expected ?
>
>
>
> LIST BASED PARTITION
> **
> postgres@go:~$ more p.sql
> drop table tbl1;
> drop table tprt;
>
> create table tbl1(col1 int);
> insert into tbl1 values (501), (505);
>
> -- Basic table
> create table tprt (col1 int) partition by list (col1);
> create table tprt_1 partition of tprt for values in (501);
> create table tprt_2 partition of tprt for values in (1001);
> create table tprt_3 partition of tprt for values in  (2001);
> create table tprt_4 partition of tprt for values in  (3001);
> create table tprt_5 partition of tprt for values in  (4001);
> create table tprt_6 partition of tprt for values in  (5001);
>
> create index tprt1_idx on tprt_1 (col1);
> create index tprt2_idx on tprt_2 (col1);
> create index tprt3_idx on tprt_3 (col1);
> create index tprt4_idx on tprt_4 (col1);
> create index tprt5_idx on tprt_5 (col1);
> create index tprt6_idx on tprt_6 (col1);
>
> insert into tprt values (501), (1001), (2001), (3001), (4001), (5001),
> (501);
>
> alter table tbl1 add column col2 int default 0;
> update tbl1 set col2 =1 where col1 = 501;
>
> vacuum analyze tprt;
> vacuum analyze tbl1;
>
> explain analyze select * from tprt where tprt.col1 in (select tbl1.col1
> from tbl1 where tbl1.col2 in (1, 2) );
>
>  QUERY PLAN
>
> 
>  Hash Semi Join  (cost=1.05..7.20 rows=2 width=4) (actual
> time=0.028..0.034 rows=2 loops=1)
>Hash Cond: (tprt.col1 = tbl1.col1)
>->  Append  (cost=0.00..6.10 rows=7 width=4) (actual time=0.003..0.008
> rows=7 loops=1)
>  ->  Seq Scan on tprt_1  (cost=0.00..1.02 rows=2 width=4) (actual
> time=0.002..0.003 rows=2 loops=1)
>  ->  Seq Scan on tprt_2  (cost=0.00..1.01 rows=1 width=4) (actual
> time=0.001..0.001 rows=1 loops=1)
>  ->  Seq Scan on tprt_3  (cost=0.00..1.01 rows=1 width=4) (actual
> time=0.001..0.001 rows=1 loops=1)
>  ->  Seq Scan on tprt_4  (cost=0.00..1.01 rows=1 width=4) (actual
> time=0.000..0.000 rows=1 loops=1)
>  ->  Seq Scan on tprt_5  (cost=0.00..1.01 rows=1 width=4) (actual
> time=0.001..0.001 rows=1 loops=1)
>  ->  Seq Scan on tprt_6  (cost=0.00..1.01 rows=1 width=4) (actual
> time=0.001..0.001 rows=1 loops=1)
>->  Hash  (cost=1.02..1.02 rows=2 width=4) (actual time=0.006..0.006
> rows=1 loops=1)
>  Buckets: 1024  Batches: 1  Memory Usage: 9kB
>  ->  Seq Scan on tbl1  (cost=0.00..1.02 rows=2 width=4) (actual
> time=0.003..0.003 rows=1 loops=1)
>Filter: (col2 = ANY ('{1,2}'::integer[]))
>Rows Removed by Filter: 1
>  Planning Time: 0.237 ms
>  Execution Time: 0.060 ms
>
>
> *even if i set hashjoin off*
>
> postgres=# set enable_hashjoin TO 0;
> SET
> postgres=# explain analyze select * from tprt where tprt.col1 in (select
> tbl1.col1 from tbl1 where tbl1.col2 in (1, 2) );
>  QUERY PLAN
>
> 
>  Nested Loop Semi Join  (cost=0.00..7.34 rows=2 width=4) (actual
> time=0.013..0.023 rows=2 loops=1)
>Join Filter: (tprt.col1 = tbl1.col1)
>Rows Removed by Join Filter: 5
>->  Append  (cost=0.00..6.10 rows=7 width=4) (actual time=0.004..0.010
> rows=7 loops=1)
>  ->  Seq Scan on tprt_1  (cost=0.00..1.02 rows=2 width=4) (actual
> time=0.003..0.003 rows=2 loops=1)
>  ->  Seq Scan on tprt_2  (cost=0.00..1.01 rows=1 width=4) (actual
> time=0.001..0.001 rows=1 loops=1)
>  ->  Seq Scan on tprt_3  (cost=0.00..1.01 rows=1 width=4) (actual
> time=0.001..0.001 rows=1 loops=1)
>  ->  Seq Scan on tprt_4  (cost=0.0

Re: force partition pruning

2021-05-11 Thread Vijaykumar Jain
=0.29..320.37
rows=17072 width=4) (never executed)*
*   Heap Fetches: 0*
* ->  Index Only Scan using tprt4_idx on tprt_4  (cost=0.29..306.20
rows=16394 width=4) (never executed)*
*   Heap Fetches: 0*
* ->  Index Only Scan using tprt5_idx on tprt_5  (cost=0.29..301.62
rows=16089 width=4) (never executed)*
*   Heap Fetches: 0*
* ->  Index Only Scan using tprt6_idx on tprt_6  (cost=0.12..4.14
rows=1 width=4) (never executed)*
   Heap Fetches: 0
   ->  Index Scan using tbl1_col1_idx on tbl1  (cost=0.13..12.16 rows=2
width=4) (actual time=0.006..0.008 rows=1 loops=1)
 Filter: (col2 = ANY ('{1,2}'::integer[]))
 Rows Removed by Filter: 1
 Planning Time: 0.239 ms
 Execution Time: 9.129 ms
(20 rows)

postgres=# set enable_indexonlyscan TO off;
SET
postgres=# explain analyze select * from tprt where tprt.col1 in (select
tbl1.col1 from tbl1 where tbl1.col2 in (1, 2) );
   QUERY PLAN

 Merge Semi Join  (cost=1.75..1372.21 rows=6 width=4) (actual
time=0.018..9.624 rows=33394 loops=1)
   Merge Cond: (tprt.col1 = tbl1.col1)
   ->  Append  (cost=1.56..2833.70 rows=17 width=4) (actual
time=0.012..6.048 rows=33395 loops=1)
 ->  Index Scan using tprt1_idx on tprt_1  (cost=0.29..768.20
rows=33394 width=4) (actual time=0.012..4.117 rows=33394 loops=1)
 ->  Index Scan using tprt2_idx on tprt_2  (cost=0.29..399.14
rows=17057 width=4) (actual time=0.007..0.007 rows=1 loops=1)
 ->  Index Scan using tprt3_idx on tprt_3  (cost=0.29..399.37
rows=17072 width=4) (never executed)
 ->  Index Scan using tprt4_idx on tprt_4  (cost=0.29..382.20
rows=16394 width=4) (never executed)
 ->  Index Scan using tprt5_idx on tprt_5  (cost=0.29..376.62
rows=16089 width=4) (never executed)
 ->  Index Scan using tprt6_idx on tprt_6  (cost=0.12..8.14 rows=1
width=4) (never executed)
   ->  Index Scan using tbl1_col1_idx on tbl1  (cost=0.13..12.16 rows=2
width=4) (actual time=0.004..0.005 rows=1 loops=1)
 Filter: (col2 = ANY ('{1,2}'::integer[]))
 Rows Removed by Filter: 1
 Planning Time: 0.237 ms
 Execution Time: 10.634 ms
(14 rows)

postgres=# set enable_indexscan TO off;
SET
postgres=# explain analyze select * from tprt where tprt.col1 in (select
tbl1.col1 from tbl1 where tbl1.col2 in (1, 2) );
  QUERY PLAN
--
 Hash Semi Join  (cost=1.05..2580.54 rows=6 width=4) (actual
time=0.034..21.374 rows=33394 loops=1)
   Hash Cond: (tprt.col1 = tbl1.col1)
   ->  Append  (cost=0.00..1946.11 rows=17 width=4) (actual
time=0.006..11.179 rows=17 loops=1)
 ->  Seq Scan on tprt_1  (cost=0.00..481.94 rows=33394 width=4)
(actual time=0.006..2.342 rows=33394 loops=1)
 ->  Seq Scan on tprt_2  (cost=0.00..246.57 rows=17057 width=4)
(actual time=0.003..0.886 rows=17057 loops=1)
 ->  Seq Scan on tprt_3  (cost=0.00..246.72 rows=17072 width=4)
(actual time=0.003..0.897 rows=17072 loops=1)
 ->  Seq Scan on tprt_4  (cost=0.00..236.94 rows=16394 width=4)
(actual time=0.003..0.876 rows=16394 loops=1)
 ->  Seq Scan on tprt_5  (cost=0.00..232.89 rows=16089 width=4)
(actual time=0.003..0.797 rows=16089 loops=1)
 ->  Seq Scan on tprt_6  (cost=0.00..1.01 rows=1 width=4) (actual
time=0.003..0.003 rows=1 loops=1)
   ->  Hash  (cost=1.02..1.02 rows=2 width=4) (actual time=0.006..0.006
rows=1 loops=1)
 Buckets: 1024  Batches: 1  Memory Usage: 9kB
 ->  Seq Scan on tbl1  (cost=0.00..1.02 rows=2 width=4) (actual
time=0.004..0.005 rows=1 loops=1)
   Filter: (col2 = ANY ('{1,2}'::integer[]))
   Rows Removed by Filter: 1
 Planning Time: 0.233 ms
 Execution Time: 22.428 ms
(16 rows)



So I guess it works. I should not try to be smarter than the optimizer :)


On Tue, 11 May 2021 at 22:59, Vijaykumar Jain <
vijaykumarjain.git...@gmail.com> wrote:

> ok,
>
> partitioning - Can PostgreSQL 12 do partition pruning at execution time
> with subquery returning a list? - Stack Overflow
> <https://stackoverflow.com/questions/6395/can-postgresql-12-do-partition-pruning-at-execution-time-with-subquery-returning>
>
> ok forcing hash join off, did not work as the outer table was the
> partitioned table selected.
>
>
> On Tue, 11 May 2021 at 22:42, Vijaykumar Jain <
> vijaykumarjain.git...@gmail.com> wrote:
>
>> Ok. maybe you are in a rush.
>>
>> But I would keep the thread open, to understand w

Re: force partition pruning

2021-05-11 Thread Vijaykumar Jain
Fair enough.

Many thanks for taking the time to follow up and clear up my misunderstanding.
I’ll not pollute the thread, since the OP got what he wanted.
But I’ll have to spend more time simulating it with data and rereading what
you are saying :).
But thanks again for clearing that up.
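
For my own reference, a minimal sketch of the two-step approach discussed
below, written as a table returning function and assuming the tbl1/tprt tables
from my earlier scripts: resolve the driving keys first, then plan the
partitioned-table query with those keys as literals so plan-time pruning can
kick in.

create or replace function pruned_lookup()
returns setof tprt
language plpgsql as
$func$
declare
  keys int[];
begin
  -- step 1: resolve the driving keys from the small table
  select array_agg(col1) into keys from tbl1 where col2 in (1, 2);

  if keys is null then
    return;  -- nothing to look up
  end if;

  -- step 2: plan the partitioned-table query with the keys as literals,
  -- so plan-time partition pruning can discard the other partitions
  return query execute format(
    'select * from tprt where col1 in (%s)',
    array_to_string(keys, ',')
  );
end;
$func$;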


On Wed, 12 May 2021 at 8:16 AM Niels Jespersen  wrote:

> Fra: David Rowley  Sendt: 12. maj 2021 02:34
>
> >>
> >> ok i think i just may be there is very less data , hence no index scan,
> no pruning.
> >>
> >> when i try to force seq_scan off,
> >>
> >
> >Unfortunately, no run-time pruning occurred in the above plan.
> >
> >The fact that the above plan uses Append made that possible.
> >
> >I think, for now, the only sure way to get run-time pruning working for
> this case is to run two separate queries so that the 2nd one can
> >perform plan-time pruning.
>
> This is the conclusion I'm taking from this thread and will base my
> further work on. I was the one asking the original question. A table
> returning function is my work-hypothesis for now.
>
> >
> >
> >I think if you try to make this work by trying to force the planner's
> hand, you'll just feel pain when the planner one day has a change of heart
> and decides to swap the join order on you.
> >
> >David
> >
> Thank you for the insights into the planner capabilities.
>
> Regards Niels
>
-- 
Thanks,
Vijay
Mumbai, India


Postgresql fdw tracing

2021-05-12 Thread Vijaykumar Jain
Hello Gentlemen,

I am trying to set up an array of pg shard clusters behind fdw, fronted by
haproxy/envoyproxy for load balancing.

Something like this, but including pgbouncer.
https://image.slidesharecdn.com/fdw-basedsharding-170321103514/95/fdwbased-sharding-update-and-future-12-638.jpg?cb=1490092596

Everything works perfectly well, but monitoring is becoming difficult for
remote queries.

If there are hundreds of queries, all hitting multiple shards behind the
proxy, it is very difficult to tell from the backend shards which server a
query originated from.

If you have followed me so far:

Has anyone worked with passing a unique request id (like a header on an http
request) that travels along with the queries through fdw, so that we can
identify and trace a query end to end? Even if it is via a sidecar.

Leaving the fdw context aside, this works for local queries, since
pg_stat_activity shows the parent pid when parallel workers are spawned.

If anyone has used pgbouncer: even pgbouncer maps front-end to backend
connections, so the client IP can be traced to the backend request.

I hope I am clear, but I can elaborate more if required.
I am trying to simulate a poor man's citus sharding :)
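
The closest I have got so far is tagging connections per origin node rather
than per request: as far as I can tell from the docs, a postgres_fdw foreign
server accepts libpq's application_name as an option (please verify on your
version), so pg_stat_activity and %a in log_line_prefix on the shards at
least show which router/fdw node a query came from. A sketch (server and
names are made up):

alter server shard01 options (add application_name 'router-az1');

-- then on the shard:
select pid, application_name, state, query
from pg_stat_activity
where application_name like 'router-%';

But that is still per node, not per request, which is what I am really after.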


Re: I have no idea why pg_dump isn't dumping all of my data

2021-05-21 Thread Vijaykumar Jain
Can you try dumping with the verbose flag (-v)?

Just want to confirm whether the user has the relevant permissions.


On Fri, May 21, 2021, 3:04 PM Simon Connah 
wrote:

> Hi,
>
> I'm running the following command to dump my database:
>
> /usr/bin/pg_dump
> --file=/home/simon/nanoscopic_db_127_0_0_1-2021_05_21_10_25_28-dump.sql
> --dbname=nanoscopic_db --clean --create --if-exists
> --username=nanoscopic_db_user --host=127.0.0.1 --port=5432
>
> and yet when I run that all I get in the SQL file is the following:
>
> https://gist.github.com/simonconnah/e1a15b1536b6e519b84481ae74f082bf
>
> I'm at a total loss. I've tried all the relevant looking command line
> switches and nothing seems to dump the actual contents of the database. It
> just dumps the extension command. Can anyone help me to figure this out
> please? It is probably something stupid that I am doing wrong.
>
> Simon.
>


Re: I have no idea why pg_dump isn't dumping all of my data

2021-05-21 Thread Vijaykumar Jain
I just did a dump of a db owned by postgres, with some tables owned by other
users, and it ran fine.
I am not sure about that nanoscopic extension though.



***
createdb -e foobar;

postgres=# \c foobar
You are now connected to database "foobar" as user "postgres".
foobar=# set role demo;
SET
foobar=> create table xx(id int);
CREATE TABLE
foobar=> \dt
   List of relations
 Schema | Name | Type  | Owner
+--+---+---
 public | xx   | table | demo
(1 row)

foobar=> insert into xx values (1);
INSERT 0 1
foobar=> \dt
   List of relations
 Schema | Name | Type  | Owner
+--+---+---
 public | xx   | table | demo
(1 row)

foobar=> \l
 List of databases
   Name|  Owner   | Encoding |  Collate   |   Ctype|   Access
privileges
---+--+--+++---
 demo  | demo_rw  | UTF8 | en_US.utf8 | en_US.utf8 |
 foobar| postgres | UTF8 | en_US.utf8 | en_US.utf8 |
 postgres  | postgres | UTF8 | en_US.utf8 | en_US.utf8 |
 template0 | postgres | UTF8 | en_US.utf8 | en_US.utf8 | =c/postgres
  +
   |  |  |||
postgres=CTc/postgres
 template1 | postgres | UTF8 | en_US.utf8 | en_US.utf8 | =c/postgres
  +
   |  |  |||
postgres=CTc/postgres
(5 rows)
***


***
pg_dump --file=nanoscopic_db_127_0_0_1-2021_05_21_10_25_28-dump.sql
--dbname=foobar --clean --create --if-exists --username=demo -v
--host=127.0.0.1 --port=5432

... last lines from the verbose dump

pg_dump: dropping DATABASE foobar
pg_dump: creating DATABASE "foobar"
pg_dump: connecting to new database "foobar"
pg_dump: creating TABLE "public.xx"
pg_dump: processing data for table "public.xx"
pg_dump: dumping contents of table "public.xx"


CREATE DATABASE foobar WITH TEMPLATE = template0 ENCODING = 'UTF8' LOCALE =
'en_US.utf8';


ALTER DATABASE foobar OWNER TO postgres;

\connect foobar

SET statement_timeout = 0;
SET lock_timeout = 0;
SET idle_in_transaction_session_timeout = 0;
SET client_encoding = 'UTF8';
SET standard_conforming_strings = on;
SELECT pg_catalog.set_config('search_path', '', false);
SET check_function_bodies = false;
SET xmloption = content;
SET client_min_messages = warning;
SET row_security = off;

SET default_tablespace = '';

SET default_table_access_method = heap;

--
-- TOC entry 200 (class 1259 OID 26105)
-- Name: xx; Type: TABLE; Schema: public; Owner: demo
--

CREATE TABLE public.xx (
id integer
);


ALTER TABLE public.xx OWNER TO demo;

--
-- TOC entry 2232 (class 0 OID 26105)
-- Dependencies: 200
-- Data for Name: xx; Type: TABLE DATA; Schema: public; Owner: demo
--

COPY public.xx (id) FROM stdin;
1
\.


-- Completed on 2021-05-21 15:54:08 IST

--
-- PostgreSQL database dump complete
--
***
Works fine.
I do not know that extension (nanoscopic) though.

In your case it is reading some tables in the public schema, but not even
dumping the schema.

Yep, that's odd if it does not throw any errors, because any permission
errors are thrown right away to the console.

Maybe someone with more experience will be able to help.


On Fri, 21 May 2021 at 15:32, Simon Connah 
wrote:

> ‐‐‐ Original Message ‐‐‐
> On Friday, May 21st, 2021 at 10:55, Vijaykumar Jain <
> vijaykumarjain.git...@gmail.com> wrote:
>
> Can you try dumping using verbose flag.
> -v
>
> Just want to confirm if the user has relevant permissions.
>
>
> On Fri, May 21, 2021, 3:04 PM Simon Connah 
> wrote:
>
>> Hi,
>>
>> I'm running the following command to dump my database:
>>
>> /usr/bin/pg_dump
>> --file=/home/simon/nanoscopic_db_127_0_0_1-2021_05_21_10_25_28-dump.sql
>> --dbname=nanoscopic_db --clean --create --if-exists
>> --username=nanoscopic_db_user --host=127.0.0.1 --port=5432
>>
>> and yet when I run that all I get in the SQL file is the following:
>>
>> https://gist.github.com/simonconnah/e1a15b1536b6e519b84481ae74f082bf
>>
>> I'm at a total loss. I've tried all the relevant looking command line
>> switches and nothing seems to dump the actual contents of the database. It
>> just dumps the extension command. Can anyone help me to figure this out
>> please? It is probably something stupid that I am doing wrong.
>>
>> Simon.
>>
>
> *pg_dump: *last built-in OID is 16383
> *pg_dump: *reading extensions
> *pg_dump: *identifying extension members
> *pg_dump: *reading schemas
> *pg_dump: *reading user-defined tables
> *pg_dump: *reading user-defined functions
> *pg_du

Re: I have no idea why pg_dump isn't dumping all of my data

2021-05-21 Thread Vijaykumar Jain
ok,

I think this is what it is.

I copied the files to the extensions folder.

ls /opt/postgresql-13/local/share/extension/nanoscopic*
/opt/postgresql-13/local/share/extension/nanoscopic--1.0.sql
/opt/postgresql-13/local/share/extension/nanoscopic.control

and loaded the extensions.
the relations are created as a result of the extension.

foobar=# create extension nanoscopic;
CREATE EXTENSION
foobar=# \dt
 List of relations
 Schema | Name  | Type  |  Owner
+---+---+--
 public | blog  | table | postgres
 public | blog_page | table | postgres
 public | blog_post | table | postgres
 public | blog_post_comment | table | postgres
 public | blog_user | table | postgres
 public | blog_user_permissions | table | postgres
(6 rows)

foobar=# drop extension nanoscopic;
DROP EXTENSION
foobar=# \dt
Did not find any relations.


When you dump the db, only the CREATE EXTENSION statement is dumped, not the
relations it owns.

When you reload the db from the dump file, the extension is created, and the
relations are recreated by that extension.

But I do not know the details of how pg_dump treats relations and data created
via extensions, both at load time and when they are modified later.

I'll do some reading on this.



















On Fri, 21 May 2021 at 16:29, Simon Connah 
wrote:

> This is the source code of the extension in question:
> https://github.com/xmrsoftware/nanoscopic/tree/master/sql/nanoscopic
> ‐‐‐ Original Message ‐‐‐
> On Friday, May 21st, 2021 at 11:29, Vijaykumar Jain <
> vijaykumarjain.git...@gmail.com> wrote:
>
> i just did a dump of a db which was owned by postgres but some tables
> owned by other users and it ran fine.
> I am not sure of that nanoscopic extension though.
>
>
>
> ***
> createdb -e foobar;
>
> postgres=# \c foobar
> You are now connected to database "foobar" as user "postgres".
> foobar=# set role demo;
> SET
> foobar=> create table xx(id int);
> CREATE TABLE
> foobar=> \dt
>List of relations
>  Schema | Name | Type  | Owner
> +--+---+---
>  public | xx   | table | demo
> (1 row)
>
> foobar=> insert into xx values (1);
> INSERT 0 1
> foobar=> \dt
>List of relations
>  Schema | Name | Type  | Owner
> +--+---+---
>  public | xx   | table | demo
> (1 row)
>
> foobar=> \l
>  List of databases
>Name|  Owner   | Encoding |  Collate   |   Ctype|   Access
> privileges
>
> ---+--+--+++---
>  demo  | demo_rw  | UTF8 | en_US.utf8 | en_US.utf8 |
>  foobar| postgres | UTF8 | en_US.utf8 | en_US.utf8 |
>  postgres  | postgres | UTF8 | en_US.utf8 | en_US.utf8 |
>  template0 | postgres | UTF8 | en_US.utf8 | en_US.utf8 | =c/postgres
> +
>|  |  |||
> postgres=CTc/postgres
>  template1 | postgres | UTF8 | en_US.utf8 | en_US.utf8 | =c/postgres
> +
>|  |  |||
> postgres=CTc/postgres
> (5 rows)
> ***
>
>
> ***
> pg_dump --file=nanoscopic_db_127_0_0_1-2021_05_21_10_25_28-dump.sql
> --dbname=foobar --clean --create --if-exists --username=demo -v
> --host=127.0.0.1 --port=5432
>
> ... last lines from the verbose dump
>
> pg_dump: dropping DATABASE foobar
> pg_dump: creating DATABASE "foobar"
> pg_dump: connecting to new database "foobar"
> pg_dump: creating TABLE "public.xx"
> pg_dump: processing data for table "public.xx"
> pg_dump: dumping contents of table "public.xx"
>
>
> CREATE DATABASE foobar WITH TEMPLATE = template0 ENCODING = 'UTF8' LOCALE
> = 'en_US.utf8';
>
>
> ALTER DATABASE foobar OWNER TO postgres;
>
> \connect foobar
>
> SET statement_timeout = 0;
> SET lock_timeout = 0;
> SET idle_in_transaction_session_timeout = 0;
> SET client_encoding = 'UTF8';
> SET standard_conforming_strings = on;
> SELECT pg_catalog.set_config('search_path', '', false);
> SET check_function_bodies = false;
> SET xmloption = content;
> SET client_min_messages = warning;
> SET row_security = off;
>
> SET default_tablespace = '';
>
> SET default_table_access_method = heap;
>
> --
> -- TOC entry 200 (class 1259 OID 26105)
> -- Name: xx; Type: TABLE; Schema: public; Owner: demo
> --
>
> CREATE TABLE public.xx (
> id integer
> );
&g

Re: I have no idea why pg_dump isn't dumping all of my data

2021-05-21 Thread Vijaykumar Jain
 xmloption = content;
SET client_min_messages = warning;
SET row_security = off;

--
-- TOC entry 2 (class 3079 OID 26997)
-- Name: nanoscopic; Type: EXTENSION; Schema: -; Owner: -
--

CREATE EXTENSION IF NOT EXISTS nanoscopic WITH SCHEMA public;


--
-- TOC entry 2239 (class 0 OID 0)
-- Dependencies: 2
-- Name: EXTENSION nanoscopic; Type: COMMENT; Schema: -; Owner:
--

COMMENT ON EXTENSION nanoscopic IS 'Database requirements for the
Nanoscopic blogging platform';


-- Completed on 2021-05-21 17:03:34 IST

--
-- PostgreSQL database dump complete
--


The dump only contains the CREATE EXTENSION statement.
So when you restore, it creates the extension, which in turn creates the table
foo and loads the single value from the extension's SQL script.

But the value 2 that was inserted afterwards is lost.

So that is what happens.
I do not know which part of the docs mentions this.

But FYI.




On Fri, 21 May 2021 at 16:56, Vijaykumar Jain <
vijaykumarjain.git...@gmail.com> wrote:

> ok,
>
> I think this is what it is.
>
> I copied the files to the extensions folder.
>
> ls /opt/postgresql-13/local/share/extension/nanoscopic*
> /opt/postgresql-13/local/share/extension/nanoscopic--1.0.sql
> /opt/postgresql-13/local/share/extension/nanoscopic.control
>
> and loaded the extensions.
> the relations are created as a result of the extension.
>
> foobar=# create extension nanoscopic;
> CREATE EXTENSION
> foobar=# \dt
>  List of relations
>  Schema | Name  | Type  |  Owner
> +---+---+--
>  public | blog  | table | postgres
>  public | blog_page | table | postgres
>  public | blog_post | table | postgres
>  public | blog_post_comment | table | postgres
>  public | blog_user | table | postgres
>  public | blog_user_permissions | table | postgres
> (6 rows)
>
> foobar=# drop extension nanoscopic;
> DROP EXTENSION
> foobar=# \dt
> Did not find any relations.
>
>
> when you dump the db, only the create extension statement is dumped, not
> its relations.
>
> when you reload the db from the dump file, the extension is created and
> relations too are created via that extension.
>
> But I do not know the theory of how pg_dump deals with relations and the
> data created via extensions at load time and further when they are modified.
>
> I'll do some lookup on this.
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
> On Fri, 21 May 2021 at 16:29, Simon Connah 
> wrote:
>
>> This is the source code of the extension in question:
>> https://github.com/xmrsoftware/nanoscopic/tree/master/sql/nanoscopic
>> ‐‐‐ Original Message ‐‐‐
>> On Friday, May 21st, 2021 at 11:29, Vijaykumar Jain <
>> vijaykumarjain.git...@gmail.com> wrote:
>>
>> i just did a dump of a db which was owned by postgres but some tables
>> owned by other users and it ran fine.
>> I am not sure of that nanoscopic extension though.
>>
>>
>>
>> ***
>> createdb -e foobar;
>>
>> postgres=# \c foobar
>> You are now connected to database "foobar" as user "postgres".
>> foobar=# set role demo;
>> SET
>> foobar=> create table xx(id int);
>> CREATE TABLE
>> foobar=> \dt
>>List of relations
>>  Schema | Name | Type  | Owner
>> +--+---+---
>>  public | xx   | table | demo
>> (1 row)
>>
>> foobar=> insert into xx values (1);
>> INSERT 0 1
>> foobar=> \dt
>>List of relations
>>  Schema | Name | Type  | Owner
>> +--+---+---
>>  public | xx   | table | demo
>> (1 row)
>>
>> foobar=> \l
>>  List of databases
>>Name|  Owner   | Encoding |  Collate   |   Ctype|   Access
>> privileges
>>
>> ---+--+--+++---
>>  demo  | demo_rw  | UTF8 | en_US.utf8 | en_US.utf8 |
>>  foobar| postgres | UTF8 | en_US.utf8 | en_US.utf8 |
>>  postgres  | postgres | UTF8 | en_US.utf8 | en_US.utf8 |
>>  template0 | postgres | UTF8 | en_US.utf8 | en_US.utf8 | =c/postgres
>> +
>>|  |  |||
>> postgres=CTc/postgres
>>  template1 | postgres | UTF8 | en_US.utf8 | en_US.utf8 | =c/postgres
>> +
>>|  |  |||
>> postgres=CTc/postgres
>> (5 rows)
>> ***
>>
>>
>> ***

Re: I have no idea why pg_dump isn't dumping all of my data

2021-05-21 Thread Vijaykumar Jain
PostgreSQL: Documentation: 13: 37.17. Packaging Related Objects into an
Extension <https://www.postgresql.org/docs/13/extend-extensions.html>

So it works as expected.
Someone would have to point me to the reference regarding modification of data
in objects created via an extension.

The main advantage of using an extension, rather than just running the
SQL script
to load a bunch of “loose” objects into your database, is that PostgreSQL will
then understand that the objects of the extension go together. You can drop
all the objects with a single DROP EXTENSION
<https://www.postgresql.org/docs/13/sql-dropextension.html> command (no
need to maintain a separate “uninstall” script). Even more useful,
pg_dump knows
that it should not dump the individual member objects of the extension — it
will just include a CREATE EXTENSION command in dumps, instead. This vastly
simplifies migration to a new version of the extension that might contain
more or different objects than the old version. Note however that you must
have the extension's control, script, and other files available when
loading such a dump into a new database.
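
On the data question: the same chapter also has a section on extension
configuration tables. If the extension's script marks a table with
pg_extension_config_dump, pg_dump will dump the rows users added after
CREATE EXTENSION. Roughly (this goes inside the extension's SQL script;
untested by me here):

create table foo (id int);
select pg_catalog.pg_extension_config_dump('foo', '');
-- the second argument is a WHERE-style filter; an empty string dumps the whole table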



On Fri, 21 May 2021 at 17:07, Vijaykumar Jain <
vijaykumarjain.git...@gmail.com> wrote:

> so this is the summary.
>
> I modified the extension to have a simple sql that created table and
> inserted a value.
>
> ***
>
> postgres@go:~$ cat
> /opt/postgresql-13/local/share/extension/nanoscopic--1.0.sql
>
>
> create table foo(id int);
> insert into foo values (1);
>
>
> postgres@go:~$ cat
> /opt/postgresql-13/local/share/extension/nanoscopic.control
> default_version = '1.0'
> comment = 'Database requirements for the Nanoscopic blogging platform'
> encoding = UTF8
> superuser = false
> trusted = false
>
> postgres@go:~$ psql foobar
> psql (13.2)
> Type "help" for help.
>
> foobar=# drop extension nanoscopic;
> DROP EXTENSION
> foobar=# \dt
> Did not find any relations.
> foobar=# \q
> postgres@go:~$ stoppg
> waiting for server to shut down done
> server stopped
> postgres@go:~$ startpg
> waiting for server to start done
> server started
> postgres@go:~$ psql foobar
> psql (13.2)
> Type "help" for help.
>
> foobar=# set role demo;
> SET
> foobar=> create extension nanoscopic;
> CREATE EXTENSION
> foobar=> \dt
>List of relations
>  Schema | Name | Type  | Owner
> +--+---+---
>  public | foo  | table | demo
> (1 row)
>
> foobar=> table foo;
>  id
> 
>   1
> (1 row)
>
> foobar=> insert into foo values (2);   -- i add more data to the table
> created via extension
> INSERT 0 1
> foobar=> table foo;
>  id
> 
>   1
>   2
> (2 rows)
>
> foobar=> \q
> postgres@go:~$ pg_dump
> --file=nanoscopic_db_127_0_0_1-2021_05_21_10_25_28-dump.sql --dbname=foobar
> --clean --create --if-exists --username=demo -v --host=127.0.0.1 --port=5432
> Password:
> pg_dump: last built-in OID is 16383
> pg_dump: reading extensions
> pg_dump: identifying extension members
> pg_dump: reading schemas
> pg_dump: reading user-defined tables
> pg_dump: reading user-defined functions
> pg_dump: reading user-defined types
> pg_dump: reading procedural languages
> pg_dump: reading user-defined aggregate functions
> pg_dump: reading user-defined operators
> pg_dump: reading user-defined access methods
> pg_dump: reading user-defined operator classes
> pg_dump: reading user-defined operator families
> pg_dump: reading user-defined text search parsers
> pg_dump: reading user-defined text search templates
> pg_dump: reading user-defined text search dictionaries
> pg_dump: reading user-defined text search configurations
> pg_dump: reading user-defined foreign-data wrappers
> pg_dump: reading user-defined foreign servers
> pg_dump: reading default privileges
> pg_dump: reading user-defined collations
> pg_dump: reading user-defined conversions
> pg_dump: reading type casts
> pg_dump: reading transforms
> pg_dump: reading table inheritance information
> pg_dump: reading event triggers
> pg_dump: finding extension tables
> pg_dump: finding inheritance relationships
> pg_dump: reading column info for interesting tables
> pg_dump: finding the columns and types of table "public.foo"
> pg_dump: flagging inherited columns in subtables
> pg_dump: reading indexes
> pg_dump: flagging indexes in partitioned tables
> pg_dump: reading extended statistics
> pg_dump: reading constraints
> pg_dump: reading triggers
> pg_dump: reading rewrite rules
> pg_dump: reading policies
> pg_dump: reading row security enabled for table "public.foo"
> pg_dump:

Re: Setting up replication

2021-05-26 Thread Vijaykumar Jain
core ref:
PostgreSQL: Documentation: 13: Part III. Server Administration

Although this is quite verbose, you will keep coming back to it to tune your
setup.


to understand basic setups, some walkthroughs are:
How to Set Up Streaming Replication in PostgreSQL 12 - Percona Database
Performance Blog

How to setup Postgres 13 WAL streaming replication on Ubuntu 18.04 - Highgo
Software Inc.


some other references.
dhamaniasad/awesome-postgres: A curated list of awesome PostgreSQL
software, libraries, tools and resources, inspired by awesome-mysql
(github.com) 

a typical setup

Primary ---streaming replication --->  (Replica1, Replica2 )

Primary - writes
Replicas R1, R2 - reads (depending on load, these can be put behind a load
balancer like haproxy and a connection pooler like pgbouncer)
Scaling PostgreSQL using Connection Poolers and Load Balancers for an
Enterprise Grade environment - Percona Database Performance Blog

https://tinyurl.com/f2zk76yc  (EDB link, but the link is too big)
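
To make the streaming replication piece concrete, a minimal sketch (hostnames,
user, slot and paths are placeholders; check the docs above for your version):

# primary: postgresql.conf (defaults on 10+ are already close to this)
wal_level = replica
max_wal_senders = 10

# primary: pg_hba.conf
host  replication  replicator  10.0.0.0/24  scram-sha-256

# replica: clone the primary and set up standby mode in one step (pg 12+)
pg_basebackup -h primary.internal -U replicator \
  -D /var/lib/postgresql/13/main -R -X stream -C -S replica1_slot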


backups:
vm snapshots (optional)
physical disk backups (optional)
pg_dumpall from the replica, saved to external storage daily (PostgreSQL:
Documentation: 13: pg_dumpall)
barman for point-in-time recovery, e.g. configured to keep 7 days of WALs, on
an external server (Barman Manual (pgbarman.org))
Implement backup with Barman. This tutorial is part of a multipage… | by
Sylvain | coderbunker | Medium
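
For the daily pg_dumpall from the replica, even a plain cron entry is enough
(host and paths below are placeholders):

# /etc/cron.d/pg-dumpall : nightly logical dump taken from the replica
0 2 * * * postgres pg_dumpall -h replica1.internal -U postgres | gzip > /backups/pg_dumpall_$(date +\%F).sql.gz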


Ideally, I would always go with the core docs, as many tutorials get stale,
but I mention these to help you get started quickly; then come back to the
core docs.

Things can get more complex (or simpler) if you go with auto-failover
solutions:
pg_auto_failover
patroni
enterprise solutions from EDB, Crunchy, etc.

This channel on YouTube is pretty neat too: Scaling Postgres - YouTube

I am not sure whether my reply is helpful or too loud for a simple setup.
Anyway :)


On Wed, 26 May 2021 at 23:28, Oliver Kohll  wrote:

> Hi,
>
> We currently have an app with the database on the same server as the app
> itself. I'd like to transition to a system where
>
> 1) in the short term, the db replicates to a different server. This will
> allow us to take the daily pg_dump backups from the replica rather than the
> primary server. They're currently slowing down the system too much as they
> run.
>
> 2) in the medium term, switch the replica to be the primary and connect to
> that from the app, i.e. app and db will be on separate servers, letting us
> resource each appropriately. A 3rd server can then be used to replicate to
> for backup purposes.
>
> 3) in the long run, depending on demand that also gives us the option of
> scaling the db horizontally e.g. with a distributed db like Citus.
>
> Are there any suggestions / good walkthroughs of how to do number 1? There
> are many options!
>
> All I know so far is we can probably use streaming replication as I can
> make sure the PostgreSQL versions on each server are the same.
>
> One thing I'm wondering is how often should a base backup be taken? Also
> should we set up everything manually with scripts or use a 3rd party backup
> tool like barman?
>
> Any suggestions appreciated.
>
> Oliver
>


-- 
Thanks,
Vijay
Mumbai, India


Re: TRUNCATE memory leak with temporary tables?

2021-05-28 Thread Vijaykumar Jain
Yes,
I too see growth when a text type is used, but not with int or even fixed-size
char(10).

I always thought truncate was similar to delete + vacuum full, but checking
your scenarios, I did not see the vacuum counters in pg_stat_user_tables
update on truncate.

Then I checked
PostgreSQL Internals: TRUNCATE (pykello.github.io)

to help understand truncation better.

But then I still do not understand how a column of type *text*, which is
variable-length, results in memory growth: I understand that for long strings
the db does work to compress them, move them to toast tables, etc., but here
there are no rows inserted at all.

Maybe someone else will be able to explain what is going on.
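
For anyone else reproducing this, one simple way to watch the backend's memory
while the loop runs (assuming Linux; 12345 below stands for whatever
pg_backend_pid() returns in the session under test):

-- in the psql session that will run the DO block
select pg_backend_pid();

# in another shell
watch -n 1 'grep VmRSS /proc/12345/status'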




On Fri, 28 May 2021 at 06:52, Nick Muerdter  wrote:

> I've been seeing what looks like unbounded memory growth (until the OOM
> killer kicks in and kills the postgres process) when running a pl/pgsql
> function that performs TRUNCATE statements against various temporary tables
> in a loop. I think I've been able to come up with some fairly simple
> reproductions of the issue in isolation, but I'm trying to figure out if
> this is a memory leak or of I'm perhaps doing something wrong with tuning
> or other settings.
>
> What I've observed:
>
> - The memory growth occurs if the temp table has indexes or a primary key
> set on it.
> - Alternatively, the memory growth also occurs if the temp table has
> certain column types on it (eg, "text" types).
> - If the table doesn't have indexes and only has integer columns present,
> then the memory growth does *not* occur.
> - I originally saw this against a PostgreSQL 12 server, but I've tested
> this against PostgreSQL 9.6.22, 12.7, and 13.3 Docker containers and
> reproduced it against all versions in the containers.
>
> Here are 2 separate examples that seem to show the memory growth on the
> server (the first being a table with a "text" column, the second example
> having no text column but a primary key index):
>
> DO $$
>   DECLARE
> i bigint;
>   BEGIN
> CREATE TEMPORARY TABLE pg_temp.foo (id integer, bar text);
>
> FOR i IN 1..2 LOOP
>   TRUNCATE pg_temp.foo;
> END LOOP;
>   END
> $$
>
> DO $$
>   DECLARE
> i bigint;
>   BEGIN
> CREATE TEMPORARY TABLE pg_temp.foo (id integer);
> ALTER TABLE pg_temp.foo ADD PRIMARY KEY (id);
>
> FOR i IN 1..2 LOOP
>   TRUNCATE pg_temp.foo;
> END LOOP;
>   END
> $$
>
> Compare that to this example (which doesn't have an index or any other
> column types that trigger this), which does *not* show any memory growth:
>
> DO $$
>   DECLARE
> i bigint;
>   BEGIN
> CREATE TEMPORARY TABLE pg_temp.foo (id integer);
>
> FOR i IN 1..2 LOOP
>   TRUNCATE pg_temp.foo;
> END LOOP;
>   END
> $$
>
> Any help in determining what's going on here (or if there are other ways
> to go about this) would be greatly appreciated!
>
> Thank you!
> Nick
>
>
>

-- 
Thanks,
Vijay
Mumbai, India


Re: TRUNCATE memory leak with temporary tables?

2021-05-28 Thread Vijaykumar Jain
I tried to reproduce this while tracking memory allocation.

demo=# DO $$
DECLAREi bigint;
BEGIN
CREATE TEMPORARY TABLE pg_temp.foo (id int) with (
AUTOVACUUM_ENABLED = 0, TOAST.AUTOVACUUM_ENABLED = 0 );
FOR i IN 1..2 LOOP
TRUNCATE pg_temp.foo;
END LOOP;
END
$$;

in a parallel tmux session.

strace -p 1620 --trace=memory

no movement/ no new output




When I replace the column with type *text*:

demo=# DO $$
DECLAREi bigint;
BEGIN
CREATE TEMPORARY TABLE pg_temp.foo (id *text*) with (
AUTOVACUUM_ENABLED = 0, TOAST.AUTOVACUUM_ENABLED = 0 );
FOR i IN 1..2 LOOP
TRUNCATE pg_temp.foo;
END LOOP;
END
$$;

strace -p 1620 --trace=memory
strace: Process 1620 attached
--- SIGINT {si_signo=SIGINT, si_code=SI_USER, si_pid=1878, si_uid=1001} ---
brk(0x556c502ad000) = 0x556c502ad000
brk(0x556c502ed000) = 0x556c502ed000
brk(0x556c5036d000) = 0x556c5036d000
brk(0x556c5046d000) = 0x556c5046d000
brk(0x556c5066d000) = 0x556c5066d000
brk(0x556c50a6d000) = 0x556c50a6d000
brk(0x556c5126d000) = 0x556c5126d000

brk(2) - Linux manual page (man7.org)
<https://www.man7.org/linux/man-pages/man2/brk.2.html>
It seems the backend does request memory allocation repeatedly.
I am not a C developer :), so please ignore me if I am digressing.
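
As Tom explains below, the text column means the temp table gets a toast
table, which in turn has an index; a quick way to see both for the foo table
from the DO block (a sketch):

select c.relname, t.relname as toast_table, i.relname as toast_index
from pg_class c
join pg_class t on t.oid = c.reltoastrelid
join pg_index ix on ix.indrelid = t.oid
join pg_class i on i.oid = ix.indexrelid
where c.relname = 'foo';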




On Fri, 28 May 2021 at 18:52, Tom Lane  wrote:

> Vijaykumar Jain  writes:
> > I too see growth when text type is used, but not when int or even fixed
> > size char(10) is used.
> > ...
> > but then i still do not understand how a col type *text* which is dynamic
> > results in mem growth (coz there are no rows inserted, i understand for
> > long strings db does work to compress, move them to toast tables etc) but
> > these are empty rows.
>
> The text column would cause the table to have an associated toast table
> [1],
> which in turn would have an index.  Both of those would be reallocated as
> new files on-disk during TRUNCATE, just like the table proper.
>
> A plausible theory here is that TRUNCATE leaks some storage associated
> with an index's relcache entry, but not any for a plain table.
>
> regards, tom lane
>
> [1] https://www.postgresql.org/docs/current/storage-toast.html
>


-- 
Thanks,
Vijay
Mumbai, India


Re: WARNING: oldest xmin is far in the past

2021-05-28 Thread Vijaykumar Jain
If the replication slot is still inactive, it will prevent vacuum from doing
the cleanup, no matter how much vacuum is run manually.

Did the slot show as active after the restart of the connector?

If it is active, then maybe increase maintenance_work_mem to a more aggressive
value and lower the autovacuum naptime, to make sure the cleanup gets priority
and finishes quickly.

https://www.cybertec-postgresql.com/en/reasons-why-vacuum-wont-remove-dead-rows/
https://www.cybertec-postgresql.com/en/tuning-autovacuum-postgresql/
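
Besides the slot, the other usual holders of the xmin horizon are prepared
transactions and long-running backends; two quick checks (a sketch, column
names as of recent versions):

select gid, prepared, owner, database from pg_prepared_xacts;

select pid, state, xact_start, backend_xmin
from pg_stat_activity
where backend_xmin is not null
order by age(backend_xmin) desc
limit 5;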


On Fri, May 28, 2021, 8:57 PM Alban Hertroys 
wrote:

> Good day,
>
> We have a PG 11.11 instance here that serves as a data-warehouse for us.
> This morning I was investigating an issue with our ETL's and discovered
> this error in the logs, that keeps repeating:
>
> 2021-05-28 15:01:54.094 CEST [20164]   WARNING:  oldest xmin is far in the
> past
> 2021-05-28 15:01:54.094 CEST [20164]   HINT:  Close open transactions soon
> to avoid wraparound problems.
> You might also need to commit or roll back old prepared
> transactions, or drop stale replication slots.
>
> We don't have any idle in transaction sessions, but we do have a
> replication slot that turns out to have been inactive for an unknown while.
>
> The current situation around our xid's is this:
>
> avbv=# select slot_name, slot_type, database, active, catalog_xmin,
> restart_lsn, confirmed_flush_lsn from pg_replication_slots ;
>slot_name   | slot_type | database | active | catalog_xmin |
> restart_lsn  | confirmed_flush_lsn
>
> ---+---+--++--+---+-
>  debezium_prod | logical   | avbv | t  |616648922 |
> 1166/C45B5140 | 1167/65C7AA0
> (1 row)
>
> avbv=# select datname, datfrozenxid from pg_database ;
> datname| datfrozenxid
> ---+--
>  postgres  |610128180
>  speeltuin |610128180
>  template1 |610128180
>  template0 |591773830
>  reportinfo|610128180
>  avbv_20190314 |610128180
>  avbv  |610128180
>  ensfocus-tst  |610128180
>  ensfocus  |610128180
>  ensfocuswf8   |610128180
>  portal_prd|610128180
>  portal_tst|610128180
> (12 rows)
>
> Clearly, the gap between the higher frozen xid's (610128180) and the
> replication slots xmin (616648922 ) is rather small; a mere 650k xid's
> apart.
>
> We have that single logical replication slot that Debezium subscribes to,
> to push committed records for some tables to Kafka. Those are tables that
> get frequent inserts, a batch of new records arrives about every 15
> minutes, 24/7.
>
> As mentioned, initially when I detected this problem, the Debezium
> connector (the subscriber) had failed to attach. Restarting it fixed that
> (that's a known issue that was recently discovered in the current version
> 1.4.0). I had hopes the xmin issue would be gone once it caught up, but it
> did catch up earlier today and the issue remains...
>
> I did already take several actions in attempts to solve the issue, so far
> to little avail:
>
> * I restarted the database, closing any idle in transaction sessions that
> might have gone unnoticed otherwise
> * I ran vacuum -a -U postgres, which printed a number of repetitions of
> the same error message on the console
> * I ran vacuum -a -F -U postgres
> * I added a heartbeat interval of 1ms (10s) to the Debezium connector,
> although I didn't think that was necessary
>
> Should I just wait for the replication slot xmin to increase into a safe
> area? It is slowly increasing, while the frozen xid's have remained the
> same while monitoring this issue.
> Or is there some action I should take?
>
>
>
> For the record:
>
> avbv=# select version();
>  version
>
> --
>  PostgreSQL 11.11 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5
> 20150623 (Red Hat 4.8.5-44), 64-bit
> (1 row)
>
>
> Regards,
>
> Alban Hertroys
>
> P.S. Sorry about below company disclaimer, there is nothing I can do about
> that.
>
>
>
> *Alban Hertroys *
> D: 8776 |M:  |T: +31 (0)53 4888 888 | E: alban.hertr...@apollotyres.com
> Apollo Vredestein B.V.| Ir. E.L.C. Schiffstraat 370, 7547 RD Enschede, The
> Netherlands
> Chamber of Commerce number: 34223268
>
>
>
>
> *The information contained in this e-mail is intended solely for the use
> of the individual or entity to whom it is addressed. If you are not the
> intended recipient, you are hereby notified that any disclosure, copying,
> distribution or action in relation to the contents of this information is
> strictly prohibited and may be unlawful and request you to delete this
> message and any attachments and advise the sender by return e-mail. The
> confidentiality of this message is not warranted. Apollo Vredestein and its
> subsidiaries rule out any and every li

Re: WAL accumulating, Logical Replication pg 13

2021-05-28 Thread Vijaykumar Jain
I am not too sure about 9.3.
I tried an upgrade from 9.6 to 11 using logical replication (the pglogical
extension).

One thing to note: logical replication starts with a copy from a snapshot, and
streams changes from then on.

I had a very high insert rate on my source tables (v9.6) and the destination
(v11) could not keep up (it had tons of indexes when I copied the schema); the
initial copy took around a day, as the table had around 12 indexes.

So at the destination (v11) I dropped all but the primary key index for each
table, started the subscription, and when it was almost caught up, rebuilt the
remaining indexes on the destination concurrently.
That completed in 4-5 hours without stopping the source.
The migration itself completed in a few minutes :)
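
In case the shape of it helps, what I did on the destination looked roughly
like this (table/index names here are made up):

-- before creating the subscription: keep only the primary key
drop index if exists big_table_col1_idx;
drop index if exists big_table_col2_idx;

-- once pg_stat_subscription shows it has caught up: rebuild without blocking reads/writes
create index concurrently big_table_col1_idx on big_table (col1);
create index concurrently big_table_col2_idx on big_table (col2);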

not sure if this would help, but just FYI.


On Sat, 29 May 2021 at 01:36, Willy-Bas Loos  wrote:

> Hi , I'm upgrading a 1.5TB database from postgres 9.3 to postgres 13 on
> Debian 10. This is now in an Acceptance stage (DTAP). I have encountered a
> problem: the WAL is not being deleted. I now have 1.4 TB of WAL in pg_wal
> and my disks are getting full. The oldest WAL file is 18 days old.
> I use Logical Replication from the new cluster to another new cluster with
> 1 subscriber and 1 subscription.
>
> pg_stat_subscription tells me all recent timestamps.
> and this:
> db=# select * from pg_replication_slots;
> -[ RECORD 1 ]---+-
> slot_name   | my_pub1
> plugin  | pgoutput
> slot_type   | logical
> datoid  | 16401
> database| db
> temporary   | f
> active  | t
> active_pid  | 9480
> xmin|
> catalog_xmin| 269168
> restart_lsn | D4/908BC268
> confirmed_flush_lsn | E1/25BF5710
> wal_status  | extended
> safe_wal_size   |
>
>
>
> I've had problems with diskspace on this server, with postgres crashing
> because of it, then added more diskspace and postgres recovered. This
> doesn't seem to be a problem now.
>
> The *publication* has the options publish = 'insert, update, delete,
> truncate', publish_via_partition_root = false
> The *subscription* has the options connect = true, enabled = true,
> create_slot = false, slot_name = my_pub1, synchronous_commit = 'off'
>
> The log on the publisher says:
> 2021-05-25 21:25:18.973 CEST [4584] user@db LOG:  starting logical
> decoding for slot "my_pub1"
> 2021-05-25 21:25:18.973 CEST [4584] user@db DETAIL:  Streaming
> transactions committing after D6/A82B5FE0, reading WAL from D4/908BC268.
> 2021-05-25 21:25:18.973 CEST [4584] user@db LOG:  logical decoding found
> consistent point at D4/908BC268
> 2021-05-25 21:25:18.973 CEST [4584] user@db DETAIL:  There are no running
> transactions.
> 2021-05-25 21:29:49.456 CEST [4614] user@db ERROR:  replication slot
> "my_pub1" is active for PID 4584
> 2021-05-25 21:29:54.474 CEST [4615] user@db ERROR:  replication slot
> "my_pub1" is active for PID 4584
>
> And on the subscriber:
> 2021-05-28 21:23:46.702 CEST [40039] LOG:  logical replication apply
> worker for subscription "my_pub1" has started
> 2021-05-28 21:23:46.712 CEST [40039] ERROR:  could not start WAL
> streaming: ERROR:  replication slot "my_pub1" is active for PID 730
> 2021-05-28 21:23:46.714 CEST [19794] LOG:  background worker "logical
> replication worker" (PID 40039) exited with exit code 1
>
> The postgres settings on the *publisher* are:
> max_connections = 100 # (change requires restart)
> tcp_keepalives_idle = 120 # TCP_KEEPIDLE, in seconds;
> shared_buffers = 50GB # min 128kB
> work_mem = 1GB # min 64kB
> maintenance_work_mem = 10GB # min 1MB
> logical_decoding_work_mem = 5GB # min 64kB
> dynamic_shared_memory_type = posix # the default is the first option
> max_worker_processes = 20 # (change requires restart)
> max_parallel_maintenance_workers = 10 # taken from max_parallel_workers
> max_parallel_workers_per_gather = 5 # taken from max_parallel_workers
> max_parallel_workers = 15 # maximum number of max_worker_processes that
> wal_level = logical # minimal, replica, or logical
> max_wal_size = 1GB
> min_wal_size = 80MB
> #archive_mode = off
> max_wal_senders = 10 # max number of walsender processes
> wal_sender_timeout = 1min # in milliseconds; 0 disables
> max_replication_slots = 7 # max number of replication slots
>
> On postgres settings on the *subscriber*:
> max_connections = 100 # (change requires restart)
> tcp_keepalives_idle = 120 # TCP_KEEPIDLE, in seconds;
> shared_buffers = 25GB # min 128kB
> work_mem = 1GB # min 64kB
> maintenance_work_mem = 10GB # min 1MB
> logical_decoding_work_mem = 5GB # min 64kB
> dynamic_shared_memory_type = posix # the default is the first option
> max_worker_processes = 20 # (change requires restart)
> max_parallel_maintenance_workers = 10 # taken from max_parallel_workers
> max_parallel_workers_per_gather = 5 # taken from max_parallel_workers
> max_parallel_workers = 15 # maximum number of max_worker_processes that
> wal_level = logical # minimal, replica, or logical
> max_wal_size = 

Re: WAL accumulating, Logical Replication pg 13

2021-05-29 Thread Vijaykumar Jain
WAL can build up for reasons like:
1) You have an inactive replication slot. I mean you had a streaming
replica which was turned off, but you did not remove the slot from the
primary.
2) Do you have archiving enabled? Are the archive commands running fine?
If just the archiving is broken, you can run the archive cleanup manually,
provided replication is all caught up fine.

3) Logical replication can be broken for multiple reasons, like conflicts
where the subscriber already has the data which the primary wants to push.
It will not proceed until the conflicts are resolved.
4) Poor connectivity, or the computer/network resources not being able to
keep up with the load, can result in WAL piling up.
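
A quick way to check 1) and 4) on the publisher (this is for pg13; older
versions use pg_current_xlog_location / pg_xlog_location_diff instead):

SELECT slot_name, slot_type, active,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn))
           AS retained_wal
FROM pg_replication_slots;

An inactive slot, or a slot whose retained_wal keeps growing, is usually
the culprit.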

There are many blogs about logical replication issues, but when it was new
in pg10, I read this:
Recovery use cases for Logical Replication in PostgreSQL 10 | by Konstantin
Evteev | AvitoTech | Medium
<https://medium.com/avitotech/recovery-use-cases-for-logical-replication-in-postgresql-10-a1e6bab03072>

BTW,
how are you doing logical replication with 9.3? Using the pglogical
extension?
I can try many things, but it would be wrong to make assumptions, since I
did not work with 9.3. For example:
Bug fix: Using ExecCopySlot during multi insert by bdrouvotAWS · Pull
Request #295 · 2ndQuadrant/pglogical (github.com)
<https://github.com/2ndQuadrant/pglogical/pull/295>
There are many issues posted there that may be relevant to your setup.





On Sat, 29 May 2021 at 19:22, Willy-Bas Loos  wrote:

> Yeah, indexes could slow things down, thanks. Btw I'm not using logical
> replication for the upgrade, that's not supported for 9.3.
> It was more complicated but that's beside the point.
>
> I could just delete the publication and all that belongs to it and start
> over. But since I'm trying out logical replication, I would like to be more
> in control than that. It's there anything that I can dig into to find out
> why the WAL is accumulating?
>
> On Fri, 28 May 2021 at 22:20, Vijaykumar Jain <
> vijaykumarjain.git...@gmail.com> wrote:
>
>> I am not too sure with 9.3
>> i tried an upgrade from 9.6 to 11 using logical replication (pg_logical
>> extension)
>>
>> one thing to note.
>> logical replication initiates a copy from a snapshot, then changes from
>> then on.
>>
>> I had a very high insert rate on my source tables (v9.6) and the
>> destination (v11) could not keep up (it had tons of indexes when I copied
>> the schema) and it took around a day as the table had around 12 indexes.
>>
>> So at the destination(v11), I dropped all but the primary index for each
>> table, started subscription and when it was almost caught up, rebuilt the
>> index on the destination concurrently.
>> it completed in 4-5 hours without stopping the source.
>> migration completed in a few mins :)
>>
>> not sure if this would help, but just FYI.
>>
>>
>> On Sat, 29 May 2021 at 01:36, Willy-Bas Loos  wrote:
>>
>>> Hi , I'm upgrading a 1.5TB database from postgres 9.3 to postgres 13 on
>>> Debian 10. This is now in an Acceptance stage (DTAP). I have encountered a
>>> problem: the WAL is not being deleted. I now have 1.4 TB of WAL in pg_wal
>>> and my disks are getting full. The oldest WAL file is 18 days old.
>>> I use Logical Replication from the new cluster to another new cluster
>>> with 1 subscriber and 1 subscription.
>>>
>>> pg_stat_subscription tells me all recent timestamps.
>>> and this:
>>> db=# select * from pg_replication_slots;
>>> -[ RECORD 1 ]---+-
>>> slot_name   | my_pub1
>>> plugin  | pgoutput
>>> slot_type   | logical
>>> datoid  | 16401
>>> database| db
>>> temporary   | f
>>> active  | t
>>> active_pid  | 9480
>>> xmin|
>>> catalog_xmin| 269168
>>> restart_lsn | D4/908BC268
>>> confirmed_flush_lsn | E1/25BF5710
>>> wal_status  | extended
>>> safe_wal_size   |
>>>
>>>
>>>
>>> I've had problems with diskspace on this server, with postgres crashing
>>> because of it, then added more diskspace and postgres recovered. This
>>> doesn't seem to be a problem now.
>>>
>>> The *publication* has the options publish = 'insert, update, delete,
>>> truncate', publish_via_partition_root = false
>>> The *subscription* has the options connect = true, enabled = true,
>>> create_slot = false, slot_name = my_pub1, synchronous_commit = 'off'
>>>

max_connections

2021-05-30 Thread Vijaykumar Jain
I have two dumb questions.

1)
I know the max_connections value change requires a restart.

I also read a thread which explains why this is the case, assuming it
still holds true.

Jean Arnaud  writes:
> I'm looking for a way to change the "max_connections" parameter without
> restarting the PostGreSQL database.

There is none. That's one of the parameters that determines shared
memory array sizes, and we can't change those on-the-fly.

regards, tom lane

Does that mean that if I set max_connections to 1000, but only intend to
use 10 at most (via per-role connection limit settings), there would be a
perf difference compared to max_connections = 100, still using only 10 at
most?
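
(By per-role connection limits I mean something like this, with a made-up
role name, while max_connections itself stays at 1000:

ALTER ROLE app_user CONNECTION LIMIT 10;
)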

2)
Can I, for test purposes, compile the source with a smaller xid max value,
e.g. 2^16, as if it were a uint16?

Basically, I was trying to simulate xid wraparound by creating an open
transaction and an inactive replication slot, and then running huge
parallel loops to bump txid_current() to see what happens at the 2B
crossover. It takes forever on my slow laptop.
I do not see this covered in the test scenarios, or could not find any.
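
For reference, roughly the kind of loop I mean, done with pgbench instead
(the script and file name are made up):

-- burn_xid.sql contains just this one statement:
SELECT txid_current();
-- run it with something like:
--   pgbench -n -c 8 -j 8 -t 1000000 -f burn_xid.sql postgres
-- each committed transaction consumes one xid; watch the distance to
-- wraparound with:
SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY xid_age DESC;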

As always, ignore if not relevant or already discussed.
I am preparing for interviews, hence asking myself dumb questions :)

-- 
Thanks,
Vijay
Mumbai, India


Re: WAL accumulating, Logical Replication pg 13

2021-05-31 Thread Vijaykumar Jain
So I got it all wrong, it seems :)
You upgraded to pg13 fine, but while on pg13 you have issues with logical
replication?

There is a path in the PostgreSQL source, under the subscription test
folder IIRC, which covers various logical replication scenarios.
That may help you, just in case.

I have tried all the known ways in which replication breaks, like the ones
above, and once I resolve the conflicts it starts fine.
I'll try to explore more scenarios.
Pardon my link to pglogical, I misunderstood.


On Mon, 31 May 2021 at 7:25 PM Willy-Bas Loos  wrote:

> Thank you for elaborating those possible causes and for the suggestions
> you made.
> 1) if you have an inactive replication slot.
> There is only 1 replication slot and it is active. So that is not the
> issue.
>
> 2) Do you have archiving enabled?
> No, i never turned it on and so this is in the settings of  both publisher
> and subscriber: #archive_mode = off (and show archive_mode; tells me the
> same)
>
> 3) logical replication can be broken for multiple reasons, like conflicts
> where the subscriber already has the data which primary wants to push. it
> will not proceed until the conflicts are resolved.
> That would have been in the log, but there isn't any conflict in the log.
> Only the messages that i posted with my first message.
>
> 4) poor connectivity or the computer/network resources not able to keep up
> with the load, can result in WAL pile up.
> This would be strange since there is a 10Gbps connection within the same
> rack. But it could theoretically be malfunctioning or the performance on
> the subscriber could be too low.
> If any of this is the case, shouldn't that be visible in
> pg_stat_subscription ?
>
> Thanks for the article, it's interesting to see how they transitioned from
> londiste, even if the article is about pglogical, not logical replication
> in the postgres core.
> I was using Londiste to transfer the data to the new server and minimize
> downtime, so the article might come in handy.
> I prepared by reading the documentation, which is very straightforward.
> >btw,
> >how are you doing logical replication with 9.3 ? using a pglogical
> extension ?
> No, I'm not using logical replication in postgres 9.3 . Only on postgres
> 13.
> About the link to the bug reports: Thanks for the suggestion. But first
> I'd like to get some better grip on what is going on before searching for
> bugs.
>
> Still, any help will be much appreciated
>
> On Sat, May 29, 2021 at 5:16 PM Vijaykumar Jain <
> vijaykumarjain.git...@gmail.com> wrote:
>
>> WAL can be built up for reasons like
>> 1) if you have an inactive replication slot. I mean you had a streaming
>> replica which was turned off, but you did not remote the slot from primary.
>> 2) Do you have archiving enabled? Are the archiving commands running fine
>> ? if just the archiving is broken, then you can manually run
>> archive cleanup provided, replication is all caught up fine.
>>
>> 3) logical replication can be broken for multiple reasons, like conflicts
>> where the subscriber already has the data which primary wants to push. it
>> will not proceed until the conflicts are resolved.
>> 4) poor connectivity or the computer/network resources not able to keep
>> up with the load, can result in WAL pile up.
>>
>> there are many blogs around logical replication issues, but when it was
>> new in pg10, I read this.
>> Recovery use cases for Logical Replication in PostgreSQL 10 | by
>> Konstantin Evteev | AvitoTech | Medium
>> <https://medium.com/avitotech/recovery-use-cases-for-logical-replication-in-postgresql-10-a1e6bab03072>
>>
>> btw,
>> how are you doing logical replication with 9.3 ? using a pglogical
>> extension ?
>> I can try many things, but it would be wrong to make assumptions since i
>> did not work with 9.3
>> for ex.
>> Bug fix: Using ExecCopySlot during multi insert by bdrouvotAWS · Pull
>> Request #295 · 2ndQuadrant/pglogical (github.com)
>> <https://github.com/2ndQuadrant/pglogical/pull/295>
>> there are many issues posted here that may be relevant to your setup.
>>
>>
>>
>>
>>
>> On Sat, 29 May 2021 at 19:22, Willy-Bas Loos  wrote:
>>
>>> Yeah, indexes could slow things down, thanks. Btw I'm not using logical
>>> replication for the upgrade, that's not supported for 9.3.
>>> It was more complicated but that's beside the point.
>>>
>>> I could just delete the publication and all that belongs to it and start
>>> over. But since I'm trying out logical replication, I would like to be more
>>> in control than that. It's th

Re: WAL accumulating, Logical Replication pg 13

2021-05-31 Thread Vijaykumar Jain
This part, if you can read Perl :)

https://github.com/postgres/postgres/tree/master/src/test/subscription/t

On Mon, May 31, 2021, 9:02 PM Willy-Bas Loos  wrote:

>
>
> On Mon, May 31, 2021 at 4:24 PM Vijaykumar Jain <
> vijaykumarjain.git...@gmail.com> wrote:
>
>> So I got it all wrong it seems :)
>>
> Thank you for taking the time to help me!
>
> You upgraded to pg13 fine? , but while on pg13 you have issues with
>> logical replication ?
>>
> Yes, the upgrade went fine. So here are some details:
> I already had londiste running on postgres 9.3, but londiste wouldn't run
> on Debian 10
> So i first made the new server Debian 9 with postgres 9.6 and i started
> replicating with londiste from 9.3 to 9.6
> When all was ready, i stopped the replication to the 9.6 server and
> deleted all londiste & pgq content with drop schema cascade.
> Then I upgraded the server to Debian  10. Then i user pg_upgrade to
> upgrade from postgres 9.6 to 13. (PostGIS versions were kept compatible).
> Then I added logical replication and a third server as a subscriber.
>
> I was going to write that replication is working fine (since the table
> contains a lot of data and there are no conflicts in the log), but it turns
> out that it isn't.
> The subscriber is behind and It looks like there hasn't been any incoming
> data after the initial data synchronization.
> So at least now i know that the WAL is being retained with a reason. The
> connection is working properly (via psql anyway)
>
> I will also look into how to diagnose this from the system tables, e.g.
> substracting LSN's to get some quantitative measure  for the lag.
>
>
>
>> There is a path in the postgresql source user subscription folder iirc
>> which covers various logical replication scenarios.
>> That may help you just in case.
>>
> OK, so comments in the source code you mean?
>
>


Re: CREATE/REFRESH MATERIALIZED VIEW planner difference?

2021-06-01 Thread Vijaykumar Jain
If you are not using it concurrently, can you confirm that there are *no
active* queries on the MV?
REFRESH requires an AccessExclusiveLock and will wait till it gets one.
Just asking if you can rule out that the extended time is due to waiting
for the lock.

Also, can you share the plans where you see the diff?
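
Something generic like this (nothing specific to your setup) should show
whether the refresh is simply stuck waiting on a lock:

SELECT pid, wait_event_type, wait_event, state, left(query, 60) AS query
FROM pg_stat_activity
WHERE wait_event_type = 'Lock';

SELECT pid, pg_blocking_pids(pid) AS blocked_by
FROM pg_stat_activity
WHERE cardinality(pg_blocking_pids(pid)) > 0;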





On Tue, 1 Jun 2021 at 23:30, Philip Semanchuk 
wrote:

> Hi all,
> Should I expect a planner difference between CREATE MATERIALIZED VIEW and
> REFRESH MATERIALIZED VIEW? We have a materialized view that uses 4 workers
> during CREATE but only one worker during REFRESH, and as a result the
> refresh takes much longer (~90 minutes vs. 30 minutes for the CREATE). So
> far this behavior has been 100% consistent.
>
> I'm running both the CREATE and REFRESH on the same server (Postgres 11.9
> on AWS Aurora). I don't think the refresh is using one worker in response
> to other things happening on the server because we’ve observed this
> happening when the server is not busy. We're not using the CONCURRENTLY
> option for REFRESH.
>
> THanks
> Philip
>
>

-- 
Thanks,
Vijay
Mumbai, India


Re: CREATE/REFRESH MATERIALIZED VIEW planner difference?

2021-06-01 Thread Vijaykumar Jain
OK, I see this.
I may be wrong, but even when I force the parallel costs to 0,
I only get workers for CREATE MATERIALIZED VIEW, but the REFRESH
MATERIALIZED VIEW plan does not use workers with the same conf params.

***
postgres=# create table if not exists t( id int primary key, value int );
CREATE TABLE
postgres=# insert into t select x,x from generate_series(1, 10) x;
INSERT 0 10
postgres=# analyze t;
ANALYZE
*

postgres=# drop materialized view mv;
DROP MATERIALIZED VIEW
postgres=# explain analyze create materialized view mv AS select
round(avg(id)), sum(id) from t, pg_sleep(10);
QUERY PLAN
---
 Aggregate  (cost=2943.02..2943.03 rows=1 width=40) (actual
time=10027.940..10027.941 rows=1 loops=1)
   ->  Nested Loop  (cost=0.00..2443.01 rows=10 width=4) (actual
time=10010.513..10022.985 rows=10 loops=1)
 ->  Function Scan on pg_sleep  (cost=0.00..0.01 rows=1 width=0)
(actual time=10010.497..10010.498 rows=1 loops=1)
 ->  Seq Scan on t  (cost=0.00..1443.00 rows=10 width=4)
(actual time=0.012..5.841 rows=10 loops=1)
 Planning Time: 0.245 ms
 Execution Time: 10039.621 ms
(6 rows)

postgres=# drop materialized view mv;
DROP MATERIALIZED VIEW
postgres=# set parallel_setup_cost=0;
SET
postgres=# set parallel_tuple_cost=0;
SET
postgres=# set min_parallel_table_scan_size=0;
SET
postgres=# set max_parallel_workers_per_gather=4;
SET
postgres=# explain analyze create materialized view mv AS select
round(avg(id)), sum(id) from t, pg_sleep(10);
  QUERY PLAN
--
 Finalize Aggregate  (cost=1318.04..1318.05 rows=1 width=40) (actual
time=10042.197..10042.457 rows=1 loops=1)
   ->  Gather  (cost=1318.00..1318.01 rows=4 width=40) (actual
time=10041.941..10042.450 rows=5 loops=1)
 Workers Planned: 4
 Workers Launched: 4
 ->  Partial Aggregate  (cost=1318.00..1318.01 rows=1 width=40)
(actual time=10035.167..10035.168 rows=1 loops=5)
   ->  Nested Loop  (cost=0.00..1193.00 rows=25000 width=4)
(actual time=10011.980..10033.456 rows=2 loops=5)
 ->  Parallel Seq Scan on t  (cost=0.00..693.00
rows=25000 width=4) (actual time=0.005..5.791 rows=2 loops=5)
 ->  Function Scan on pg_sleep  (cost=0.00..0.01 rows=1
width=0) (actual time=0.501..0.501 rows=1 loops=10)
 Planning Time: 0.105 ms
 Execution Time: 10059.992 ms
(10 rows)

postgres=# refresh materialized view mv;
REFRESH MATERIALIZED VIEW


*** auto explain in logs



2021-06-02 00:41:44.294 IST [2687] LOG:  statement: explain analyze create
materialized view mv AS select round(avg(id)), sum(id) from t, pg_sleep(10);
2021-06-02 00:41:54.361 IST [2687] LOG:  duration: 10059.566 ms  plan:
Query Text: explain analyze create materialized view mv AS select
round(avg(id)), sum(id) from t, pg_sleep(10);
Finalize Aggregate  (cost=1318.04..1318.05 rows=1 width=40) (actual
time=10042.197..10042.457 rows=1 loops=1)
  Output: round(avg(t.id), 0), sum(t.id)
  Buffers: shared hit=443
  ->  Gather  (cost=1318.00..1318.01 rows=4 width=40) (actual
time=10041.941..10042.450 rows=5 loops=1)
Output: (PARTIAL avg(t.id)), (PARTIAL sum(t.id))
Workers Planned: 4
Workers Launched: 4
Buffers: shared hit=443
->  Partial Aggregate  (cost=1318.00..1318.01 rows=1
width=40) (actual time=10035.167..10035.168 rows=1 loops=5)
  Output: PARTIAL avg(t.id), PARTIAL sum(t.id)
  Buffers: shared hit=443
  Worker 0:  actual time=10033.316..10033.316 rows=1
loops=1
Buffers: shared hit=62
  Worker 1:  actual time=10033.162..10033.163 rows=1
loops=1
Buffers: shared hit=55
  Worker 2:  actual time=10034.946..10034.946 rows=1
loops=1
Buffers: shared hit=117
  Worker 3:  actual time=10033.210..10033.211 rows=1
loops=1
Buffers: shared hit=103
  ->  Nested Loop  (cost=0.00..1193.00 rows=25000
width=4) (actual time=10011.980..10033.456 rows=2 loops=5)
Output: t.id
Buffers: shared hit=443
Worker 0:  actual time=10017.958..10032.681
rows=14012 loops=1
  Buffers: shared hit=62
Worker 1:  actual time=10014.150..10032.520
rows=12430 loops=1
  Buffers: shared hit=55
Work
