Re: pgbackrest: ERROR: [029]: unable to convert base 10 string '0000000B' to unsigned int

2020-01-21 Thread Michael Paquier
On Tue, Jan 21, 2020 at 08:10:39AM +0100, Eric Veldhuyzen wrote:
> We are using pgbackrest (2.21) to back up our postgresql (11) clusters.
> Last night our nightly diff backup gave me the error:
> ERROR: unable to convert base 10 string '0000000B' to unsigned int.
> I tried a full backup to see whether it would fix this, but it didn't.

Did you try to contact the maintainers?  Here is the link to the
project:
https://github.com/pgbackrest/pgbackrest
--
Michael




Re: Questions about how to streaming replication (pg 9.2)

2020-01-21 Thread Laurenz Albe
On Mon, 2020-01-20 at 14:59 +0100, Ekaterina Amez wrote:
> PS: Just in case anyone wants to know, this is part of the process of upgrading
> a server on version 9.2 that has no free space in PGDATA and that can't be
> stopped for long. After asking here, the strategy to upgrade will be: replicate
> this DB to a path with space to grow, switch clusters (the slave becoming master,
> and creating a new slave to have one just in case), and pg_upgrade the slave/new
> master with the --link option.

Do not upgrade to 9.6; upgrade to at least v11.

Set "wal_keep_segments" high enough on the primary for safety.

Run "pg_basebackup" on the standby.

Create a "recovery.conf" on the standby that has "primary_conninfo" set
and "standby_mode = on".

Start the standby and wait until it has caught up, then switch over.
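
For reference, a minimal sketch of the "recovery.conf" described above
(PostgreSQL 9.2 syntax; the connection values are placeholders, not settings
from this thread):

 # recovery.conf in the standby's data directory
 standby_mode = 'on'
 primary_conninfo = 'host=primary.example.com port=5432 user=replication'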

Yours,
Laurenz Albe





Re: Questions about how to streaming replication (pg 9.2)

2020-01-21 Thread Ekaterina Amez



On 21/1/20 at 10:14, Laurenz Albe wrote:

> On Mon, 2020-01-20 at 14:59 +0100, Ekaterina Amez wrote:
> > PS: Just in case anyone wants to know, this is part of the process of upgrading
> > a server on version 9.2 that has no free space in PGDATA and that can't be
> > stopped for long. After asking here, the strategy to upgrade will be: replicate
> > this DB to a path with space to grow, switch clusters (the slave becoming master,
> > and creating a new slave to have one just in case), and pg_upgrade the slave/new
> > master with the --link option.
>
> Do not upgrade to 9.6; upgrade to at least v11.

v9.6 is the target version because this is the last server remaining on an
unsupported PG version; the others have already been upgraded to 9.6. Once
all of them are on 9.6, I have to discuss what to do next, as at least one
of the servers is installed on 32-bit CentOS (facepalm...)





Set "wal_keep_segments" high enough on the primary for safety.



I'm not sure how to work out this value. Of course I could use a big
value and forget about it, but as the server is running out of space I'd
like to tune it so there is (almost) no wasted space. This is why I began
to think about WAL archiving, though it seems a bit more complicated.




Run "pg_basebackup" on the standby.

Create a "recovery.conf" on the standby that has "primary_conninfo" set
and "standby_mode = on".

Start the standby and wait until it has caught up, then switch over.
Yeah, I've been testing this part and it's the only part where I feel I 
know what I'm doing.


> Yours,
> Laurenz Albe


Regards,

Ekaterina





About deadlock for drop index command on big size table

2020-01-21 Thread emilu

Hello,

I tried to run DROP INDEX on a big table (around 3 million records)
from a single-threaded (not multi-threaded) script, and got a deadlock.

The same DROP INDEX command in the same script on a smaller table
returned no deadlock.

May I know what may cause the problem, please?

Thanks a lot.

[Block info]
 blocking_pid | blocked_pid | blocked_query
--------------+-------------+-----------------
          123 |        7298 | DROP index idx1
          123 |        7298 | DROP index idx1
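
For reference, a sketch of the kind of blocking-locks query that produces an
overview like the one above (simplified from the lock-monitoring queries on the
PostgreSQL wiki: it joins only on locktype/database/relation/transactionid, so
adapt it to your needs, and it needs 9.2+ for the pg_stat_activity.query column):

 SELECT blocking_activity.pid  AS blocking_pid,
        blocked_activity.pid   AS blocked_pid,
        blocked_activity.query AS blocked_query
 FROM pg_locks blocked_locks
 JOIN pg_stat_activity blocked_activity
   ON blocked_activity.pid = blocked_locks.pid
 JOIN pg_locks blocking_locks
   ON blocking_locks.locktype = blocked_locks.locktype
  AND blocking_locks.database IS NOT DISTINCT FROM blocked_locks.database
  AND blocking_locks.relation IS NOT DISTINCT FROM blocked_locks.relation
  AND blocking_locks.transactionid IS NOT DISTINCT FROM blocked_locks.transactionid
  AND blocking_locks.pid <> blocked_locks.pid
 JOIN pg_stat_activity blocking_activity
   ON blocking_activity.pid = blocking_locks.pid
 WHERE NOT blocked_locks.granted
   AND blocking_locks.granted;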





Re: ERROR: too many dynamic shared memory segments

2020-01-21 Thread Nicola Contu
Hey Thomas,
after a few months, we started having this issue again.
So we reverted the work_mem parameter to 600MB instead of 2GB.
But the issue is still there. A query hit a segmentation fault, the DB
went into recovery mode, and our app went read-only for a few minutes.

I understand we can increase max_connections so we can have many more
segments.

My question is: is there a way to see how many segments we have reached?
Currently we have max_connections = 220, so by your formula 64 + 2 * 220 we
have about 500 shared segments.
We would like to increase that number to 300 or 400, but it would be great
to know whether there is a way to make sure this will solve the issue, as it
requires a restart of the service.
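
In case it is useful, the limit from that formula can be read off the running
server with a quick query (a sketch; it only evaluates the arithmetic quoted
in this thread and does not show how many segments are currently in use):

 SELECT current_setting('max_connections')::int          AS max_connections,
        64 + 2 * current_setting('max_connections')::int AS dsm_segment_slots;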

I know you were also talking about redesigning this part of PostgreSQL. Do
you know if anything has changed in any of the newer versions after 11.5?

Thanks a lot,
Nicola





On Thu, Sep 12, 2019 at 01:01, Thomas Munro <thomas.mu...@gmail.com> wrote:

> On Wed, Sep 11, 2019 at 11:20 PM Nicola Contu wrote:
> > If the error persists I will try to revert the work_mem.
> > Thanks a lot
>
> Hi Nicola,
>
> It's hard to say exactly what the cause of the problem is in your case
> and how to avoid it, without knowing what your query plans look like.
> PostgreSQL allows 64 + 2 * max_connections segments to exist at a time,
> and it needs a number of them that depends on work_mem (in the case of
> Parallel Hash Join and Parallel Bitmap Index Scan), and also depends
> on the number of Gather nodes that appear in the plan, which in some
> unusual cases can result from partitioning.
>
> I've seen people reaching this error by running a lot of parallel
> queries concurrently.  If that's the cause, then you can definitely
> get some relief by turning work_mem down, or by turning
> max_connections up (even though you don't want to allow more
> connections -- because it influences the formula for deciding on the
> DSM segment limit).  We should probably adjust some of the internal
> constants to give us more slots, to avoid that problem, as discussed
> here:
>
>
> https://www.postgresql.org/message-id/flat/CA%2BhUKGL6H2BpGbiF7Lj6QiTjTGyTLW_vLR%3DSn2tEBeTcYXiMKw%40mail.gmail.com
>
> I've also seen people reaching this error by somehow coming up with
> plans that had a very large number of Gather nodes in them,
> corresponding to partitions; that's probably a bad plan (it'd
> presumably be better to terminate parallelism higher up in the plan,
> but these plans do seem to exist in the wild; I don't recall exactly
> why).  I think we need a bit of a redesign so that if there are
> multiple Gather nodes, they share the same main DSM segment, instead
> of blowing through this limit.
>
> --
> Thomas Munro
> https://enterprisedb.com
>


Re: pgbackrest: ERROR: [029]: unable to convert base 10 string '0000000B' to unsigned int

2020-01-21 Thread David Steele

On 1/21/20 12:55 AM, Eric Veldhuyzen wrote:
> Ron wrote:
> > On 1/21/20 1:10 AM, Eric Veldhuyzen wrote:
> > > Hi,
> > >
> > > We are using pgbackrest (2.21) to back up our postgresql (11) clusters.
> > > Last night our nightly diff backup gave me the error:
> > > ERROR: unable to convert base 10 string '0000000B' to unsigned int.
> > > I tried a full backup to see whether it would fix this, but it didn't.
> >
> > Maybe I'm missing something, but 000*B* looks *hexa*decimal, not decimal.
>
> You're not missing something, and this is most likely also the reason that
> this fails. I'm quite sure it should be trying to convert a base 16 string
> here to an unsigned int...


That's exactly what should be happening.  There has already been a bug 
filed at https://github.com/pgbackrest/pgbackrest/issues/910 and you can 
find more details there.


Regards,
--
-David
da...@pgmasters.net




Re: ERROR: too many dynamic shared memory segments

2020-01-21 Thread Nicola Contu
We also reverted this param:

cmdv3=# show max_parallel_workers_per_gather;
 max_parallel_workers_per_gather
---------------------------------
 2
(1 row)

It was set to 8.




Re: Questions about how to streaming replication (pg 9.2)

2020-01-21 Thread Laurenz Albe
On Tue, 2020-01-21 at 11:18 +0100, Ekaterina Amez wrote:

[about wal_keep_segments]

> I'm not sure how to work out this value. Of course I could use a big
> value and forget about it, but as the server is running out of space I'd
> like to tune it so there is (almost) no wasted space. This is why I began
> to think about WAL archiving, though it seems a bit more complicated.

See how many WAL files are generated in a day.

If you set wal_keep_segments to that value, the standby will be
able to survive a down time of a day.

If you don't need to cater for that much down time, use a smaller value.
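
One way to measure that on 9.2 is to note the WAL position at the same time
on two consecutive days and diff the two values (a sketch; the quoted location
below is a placeholder for the value recorded the day before, and on version 10
and later the functions are named pg_current_wal_lsn() and pg_wal_lsn_diff()):

 -- day 1: record the current WAL location
 SELECT pg_current_xlog_location();

 -- day 2: bytes of WAL written since then, and the number of 16 MB segments
 SELECT pg_xlog_location_diff(pg_current_xlog_location(), '3C/7A000028') AS wal_bytes,
        ceil(pg_xlog_location_diff(pg_current_xlog_location(), '3C/7A000028')
             / 16777216.0) AS wal_segments;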

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





After restart of our C++(postgres client) which is running in the kubernetes pod, connection is getting hanged.

2020-01-21 Thread M Tarkeshwar Rao
Hi all,

We are new to Kubernetes. Our application is written in C++ and connects to a
postgres server.
We are facing problems when sending traffic to the postgres server. Initially
it runs fine.

When we restart the container, it hangs. We took a tcpdump, and after analyzing
the dump we found that the connection is established but we get no response.

Initially, when everything is OK (without a restart of the container), we get a
1-byte response from the server after the connection is established.
But after restarting the container, the postgres server does not send this
1-byte response.

Regards
Tarkeshwar