Steps required for increasing disk size in EC2 instance with minimal downtime

2020-07-20 Thread RAJAMOHAN
Hello all,

We have configured a RAID 0 setup with 8 EBS volumes of 700G each and
combined those volumes under a single mount point for the data directory.
Archives are configured to be stored on different EC2 instances. Currently
the disk utilisation of the data directory has crossed 90%, and we need to
increase the disk size with as little downtime as possible.

Can anyone please share the steps to increase the disk size on EC2
instances with minimal downtime?

xvdf    202:80   0  700G  0 disk
└─md127   9:127  0  5.5T  0 raid0 /opt/postgres
xvdg    202:96   0  700G  0 disk
└─md127   9:127  0  5.5T  0 raid0 /opt/postgres
xvdh    202:112  0  700G  0 disk
└─md127   9:127  0  5.5T  0 raid0 /opt/postgres
xvdi    202:128  0  700G  0 disk
└─md127   9:127  0  5.5T  0 raid0 /opt/postgres
xvdj    202:144  0  700G  0 disk
└─md127   9:127  0  5.5T  0 raid0 /opt/postgres
xvdk    202:160  0  700G  0 disk
└─md127   9:127  0  5.5T  0 raid0 /opt/postgres
xvdl    202:176  0  700G  0 disk
└─md127   9:127  0  5.5T  0 raid0 /opt/postgres
xvdm    202:192  0  700G  0 disk
└─md127   9:127  0  5.5T  0 raid0 /opt/postgres


/dev/md127  5.5T  5.2T  306G  95% /opt/postgres


Thanks & Regards,
Rajamohan.J


Re: How to restore a dump containing CASTs into a database with a new user?

2020-07-20 Thread Thorsten Schöning
Good day Christophe Pettus,
on Monday, 20 July 2020 at 07:19 you wrote:

> No, you don't, and you (probably) can't change the ownership of
> "inet". "inet" is a built-in type.

I'm fairly sure I did, and I can answer my own questions now:

> dropdb ams_sm_mtg
> createdb --encoding=UTF-8 --locale=de_DE.UTF-8 --owner=ams_sm_mtg 
> --template=template0 ams_sm_mtg
> psql --dbname=ams_sm_mtg
> ALTER TYPE inet OWNER TO ams_sm_mtg;

This does make the type owned by the given user, but apparently only for
the current database! This can be seen in the catalog "pg_type", where the
column "typowner" really does change to the new user's ID:

> psql --dbname=ams_sm_mtg
> ALTER TYPE inet OWNER TO postgres;

> inet    11  10
> _inet   11  10

vs.

> psql --dbname=ams_sm_mtg
> ALTER TYPE inet OWNER TO ams_sm_mtg;

> inet    11  16389
> _inet   11  16389

"pg_type" contains some other user created types and those have the
same user-ID. The important thing is that doing this with another
database doesn't seem to influence the one of interest:

> psql --dbname=template1
> ALTER TYPE inet OWNER TO postgres;

This leaves the following in the catalog of the database of interest:

> inet    11  16389
> _inet   11  16389

Using other manually created databases and users results in the same:

> psql --dbname=ams_db_login
> ALTER TYPE inet OWNER TO ams_db_login;

This leads to the following in that database's catalog:

> inet    11  16390
> _inet   11  16390

But it leaves the other databases' catalogs unchanged:

> inet    11  16389
> _inet   11  16389

Additionally, when viewing "pg_type" while connected to "ams_db_login" as
that user, it shows different types than are visible when viewing
"pg_type" as and for "ams_sm_mtg". This is another strong hint that those
types are managed per database.
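A quick way to check this per database is to query pg_type directly; a
minimal sketch (the regrole cast simply resolves the owner OID to a role
name):

-- run while connected to the database in question
SELECT typname, typowner::regrole AS owner
FROM   pg_type
WHERE  typname IN ('inet', '_inet');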

So things seem to work as expected, with the only caveat that one needs to
make some types owned by the new user BEFORE actually restoring. The
problem, of course, is knowing which types those are; it seems one has to
restore, look for errors, ALTER, drop, restore again and so on.
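For reference, a minimal sketch of that loop with the standard tools,
using the database and role names from above and assuming a hypothetical
custom-format dump file named ams_sm_mtg.dump:

dropdb ams_sm_mtg
createdb --encoding=UTF-8 --locale=de_DE.UTF-8 --owner=ams_sm_mtg \
         --template=template0 ams_sm_mtg
# pre-own the problematic built-in types, as found from the errors of a
# previous restore attempt:
psql --dbname=ams_sm_mtg --command="ALTER TYPE inet OWNER TO ams_sm_mtg;"
# then restore, mapping everything else to the new role:
pg_restore --no-owner --role=ams_sm_mtg --dbname=ams_sm_mtg ams_sm_mtg.dump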

It would be far easier if Postgres did that automatically, as it seems to
do for most other objects. The important point is that ownership of those
types seems to be per database, so doing it automatically should be safe.

> [...]The issue is that you have
> user-defined objects which are owned by the user "postgres"; you
> should change those to the user that you want, leaving the CASTs owned by 
> "postgres".

The error messages and docs say otherwise, and changing the owner to a
user which doesn't exist at all in the source cluster doesn't make sense
either. When creating the dump, I can't know into which target database,
owned by which user, it will be restored at some point.

Kind regards,

Thorsten Schöning

-- 
Thorsten Schöning   E-Mail: thorsten.schoen...@am-soft.de
AM-SoFT IT-Systeme  http://www.AM-SoFT.de/

Telefon...05151-  9468- 55
Fax...05151-  9468- 88
Mobil..0178-8 9468- 04

AM-SoFT GmbH IT-Systeme, Brandenburger Str. 7c, 31789 Hameln
AG Hannover HRB 207 694 - Geschäftsführer: Andreas Muchow





PGBench on Windows - connections are subprocesses?

2020-07-20 Thread Durumdara
Dear Members!


I have a question about pgbench for Windows (9.6).

I want to understand how this tool works so that I can use it properly
in a test series.

It has several options, such as the number of connections (-c).

As far as I could tell, -c controls how many concurrent connections are
used in the test. For example, -c 10 runs with ten. That's fine.

But the number-of-threads option (-j, I think) confused me. At first I
thought the total number of connections was simply c multiplied by j
(sub-connections).
As I saw, that is not true.

So I don't know how this utility really works in the background.

a.)
Does it start C subprocesses, each with its own connection, all doing
their work concurrently?

b.)
Does it make N connections in one thread/process, but only one works at a
time, with J controlling how many work at the same time while the other
connections sleep?
1st connection starts/finishes.
2nd connection starts/finishes.
Nth connection starts/finishes.
End.
(This would make little sense.)

c.)
Does it start C threads, each with its own connection, all working
concurrently (like in case a)?
That could be it, but then why can I set the thread count with the "-j"
option, when the number of threads would have to equal the number of
connections?

Do you know how this utility works in the background?

We have to move to a new server which is different from the current one,
and I need to benchmark both to detect possible problems.
To compare the results I must understand how the tool operates.

Thank you very much for every answer you write!

Best regards
  DD


Re: PGBench on Windows - connections are subprocesses?

2020-07-20 Thread Tom Lane
Durumdara  writes:
> But the number of threads option (j I think) confused me. At first I
> thought the total connection number is simply the multiplication of c and j
> (subconnections).
> As I saw this is untrue.

> So I don't know how this utility works really in the background.

There are -j threads in the pgbench process, and -c connections to
the server (hence -c backend processes on the server side).  Each
of the pgbench threads is responsible for sending queries to a subset
of the connections.  Setting -j more than -c is useless (I forget
if it's actually an error).  If you set -j to, say, half of -c then
each thread has exactly two connections to manage.  If -j is too
small compared to -c then pgbench itself tends to become the bottleneck.
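In other words, a run like the following sketch (database name and
duration are placeholders) uses 16 client connections driven by 4 pgbench
worker threads:

# 16 connections (-c) handled by 4 pgbench threads (-j), for 60 seconds
pgbench -c 16 -j 4 -T 60 bench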

regards, tom lane




Re: Steps required for increasing disk size in EC2 instance with minimal downtime

2020-07-20 Thread Adrian Klaver

On 7/20/20 12:01 AM, RAJAMOHAN wrote:

Hello all,

We have configured RAID 0 setup, with 8 EBS volumes of size 700G. We 
combined those volumes under a single mount point for the data 
directory. Archives are configured to be stored on different EC2 
instances. Currently disk utilisation of data directory crossed 90%. We 
need to increase the disk with less downtime.


Can anyone please share with me the steps to increase the disk size in 
EC2 instances with minimal downtime.


Search term: aws ebs raid increase size

https://pracucci.com/growing-aws-ebs-raid-0-array-increasing-volumes-size.html
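The general shape of that procedure is sketched below (volume IDs and
sizes are placeholders, and the array-growing step depends on your RAID
level and mdadm version, so follow the article above for that part):

# 1. grow each EBS volume (repeat for all 8 volumes; the ID is a placeholder)
aws ec2 modify-volume --volume-id vol-0123456789abcdef0 --size 1000

# 2. wait for the volume modifications to complete, then confirm that the
#    kernel sees the new size
lsblk

# 3. grow the md array to use the larger components (version-dependent for
#    RAID 0, see the article), then grow the filesystem, e.g. for ext4:
sudo resize2fs /dev/md127
#    or for XFS:
# sudo xfs_growfs /opt/postgres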



xvdf    202:80   0  700G  0 disk
└─md127   9:127  0  5.5T  0 raid0 /opt/postgres
xvdg    202:96   0  700G  0 disk
└─md127   9:127  0  5.5T  0 raid0 /opt/postgres
xvdh    202:112  0  700G  0 disk
└─md127   9:127  0  5.5T  0 raid0 /opt/postgres
xvdi    202:128  0  700G  0 disk
└─md127   9:127  0  5.5T  0 raid0 /opt/postgres
xvdj    202:144  0  700G  0 disk
└─md127   9:127  0  5.5T  0 raid0 /opt/postgres
xvdk    202:160  0  700G  0 disk
└─md127   9:127  0  5.5T  0 raid0 /opt/postgres
xvdl    202:176  0  700G  0 disk
└─md127   9:127  0  5.5T  0 raid0 /opt/postgres
xvdm    202:192  0  700G  0 disk
└─md127   9:127  0  5.5T  0 raid0 /opt/postgres


/dev/md127      5.5T  5.2T  306G  95% /opt/postgres


Thanks & Regards,
Rajamohan.J




--
Adrian Klaver
adrian.kla...@aklaver.com




Re: PGBench on Windows - connections are subprocesses?

2020-07-20 Thread Durumdara
Let me extend the question to explain why I was confused about this.

In Delphi, connections are thread-based. From one thread you can use
several connections, but you can't use a single connection from two or
more threads concurrently! So you use a subprocess farm or a thread farm
to run parallel performance tests.

Durumdara  wrote (2020-07-20, Mon, 15:20):

> Dear Members!
>
>
> I have a question about PGBench for Windows (9,6).
>
> I want to understand the working method of this tool for use well
> in the test series.
>
> This has more options, like connections (c).
>
> As I tried the c controls how much concurrent connections must be used in
> the test. For example c 10 increases with ten. It's ok.
>
> But the number of threads option (j I think) confused me. At first I
> thought the total connection number is simply the multiplication of c and j
> (subconnections).
> As I saw this is untrue.
>
> So I don't know how this utility works really in the background.
>
> a.)
> It makes C subprocesses with connections and every of them makes his work
> concurrently?
>
> b.)
> It makes N connections on one thread/process. Hut only one works at the
> same time. J controls how much works at the same time. The other
> connections are sleeping.
> 1. c started/has finished.
> 2. c started/has finished.
> N. c started/has finished.
> End.
> (This is a little bit senseless)
>
> c.)
> It makes C subthreads and every thread has it's connection. They work
> concurrently (like in case a).
> It could be ok, but why can I set threads with the "j" option - when
> subthreads must be equal with the number of connections.
>
> Do you know something about the background of this utility?
>
> We have to move to the next server which is different from the actual one,
> and I need to measure them to detect the possible problems.
> To compare the result I must understand the operations.
>
> Very-very thank you for each answer you write!
>
> Best regards
>   DD


Re: PGBench on Windows - connections are subprocesses?

2020-07-20 Thread Durumdara
Dear Tom!

Tom Lane  wrote (2020-07-20, Mon, 15:38):

>
> There are -j threads in the pgbench process, and -c connections to
> the server (hence -c backend processes on the server side).  Each
> of the pgbench threads is responsible for sending queries to a subset
> of the connections.  Setting -j more than -c is useless (I forget
> if it's actually an error).  If you set -j to, say, half of -c then
> each thread has exactly two connections to manage.  If -j is too
> small compared to -c then pgbench itself tends to become the bottleneck.
>

Without -j, we have N connections but only one thread.
So does this thread serve all of the connections one by one - the 1st,
2nd and 3rd connection while the others do nothing (stand by)?
Or is it round-robin: a job from C1, the next job from C2, ... a job from
CN, then a job from C1 again, and so on?

And does J = C mean a truly parallel test?
Because each thread then has only one connection, and they all work
concurrently, as real users do?

Then if I want to simulate the heavy work of 500 users, do I need to set
J = C = 500?

Thank you for the information!

BW
   DD


Re: Logical replication from 11.x to 12.x and "unique key violations"

2020-07-20 Thread Thomas Kellerer
> I have a strange error when using logical replication between a 11.2
> source database and a 12.3 target.
>
> If I create the publication with all needed tables (about 50) at
> once, I get "duplicate key value violates unique constraint xxx_pkey"
> errors during the initial replication (when creating the
> subscription).
>
> When create the publication only with a few tables, the initial data
> sync works without problems. To replicate all tables, I add the
> tables incrementally to the publication, and refresh the
> subscription.
>
> If I do it like that (step-by-step) everything works fine. Tables
> that generated the "duplicate key value" error previously will
> replicate just fine. The tables are quite small, some of them less
> then 100 rows.
>

Any pointers where I should start looking to investigate this?





Re: how to "explain" some ddl

2020-07-20 Thread Michel Pelletier
Marc,

If you add a CHECK constraint that proves the new child partition has no
out-of-bounds rows, then ATTACH PARTITION can skip the validation scan and
will not block for long:
"Before running the ATTACH PARTITION command, it is recommended to create a
CHECK constraint on the table to be attached matching the desired partition
constraint. That way, the system will be able to skip the scan to validate
the implicit partition constraint. Without the CHECK constraint, the table
will be scanned to validate the partition constraint while holding an ACCESS
EXCLUSIVE lock on that partition and a SHARE UPDATE EXCLUSIVE lock on the
parent table. It may be desired to drop the redundant CHECK constraint
after ATTACH PARTITION is finished."

https://www.postgresql.org/docs/current/ddl-partitioning.html
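A minimal sketch of that recommendation (table, column and bound values
are invented for illustration):

-- add a constraint matching the future partition bounds, while the table
-- is still standalone, so ATTACH PARTITION can skip its own validation scan
ALTER TABLE measurements_2020_07
    ADD CONSTRAINT measurements_2020_07_bounds
    CHECK (measured_at >= DATE '2020-07-01' AND measured_at < DATE '2020-08-01');

ALTER TABLE measurements
    ATTACH PARTITION measurements_2020_07
    FOR VALUES FROM ('2020-07-01') TO ('2020-08-01');

-- optionally drop the now-redundant constraint afterwards
ALTER TABLE measurements_2020_07 DROP CONSTRAINT measurements_2020_07_bounds;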

As for your high-dimension table with lots of indexes, are you sure they
are all being used?  I almost always find my legacy customers have many
indexes that are constantly being updated but are never used by their
applications, due to either "framework confusion" or just overzealous
indexing.  Here's a good article by Laurenz Albe on the subject:

https://www.cybertec-postgresql.com/en/get-rid-of-your-unused-indexes/
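One quick way to get a first impression is the statistics view
pg_stat_user_indexes; a sketch (indexes backing primary keys or unique
constraints will also show up here, so check before dropping anything):

-- indexes never used since the last statistics reset
SELECT schemaname, relname AS table_name, indexrelname AS index_name,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM   pg_stat_user_indexes
WHERE  idx_scan = 0
ORDER  BY pg_relation_size(indexrelid) DESC;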

-Michel

On Tue, Jul 14, 2020 at 12:32 PM Marc Millas  wrote:

> Hi Tom,
> a few tests later.
> Looks like when you add a partition as default, all tuples of it are
> read, even if there is an index on the column that is the partition key.
> This does explain our attach time. We are going to clean the default
> partition...
>
> regards,
>
> Marc MILLAS
> Senior Architect
> +33607850334
> www.mokadb.com
>
>
>
> On Tue, Jul 14, 2020 at 7:05 PM Tom Lane  wrote:
>
>> Marc Millas  writes:
>> > We would like to understand where an alter table attach partition spend
>> its
>> > time.
>> > to my understanding, explain doesnt do this.
>>
>> Nope :-(.  As our DDL commands have gotten more complicated, there's
>> been some discussion of adding that, but nothing's really been done
>> yet.
>>
>> There is some progress-monitoring support for some DDL commands now,
>> but that's not quite the same thing.
>>
>> > for a BI job we have a partitionned table with 1800+ partitions.
>>
>> TBH I'd recommend scaling that down by at least a factor of ten.
>> We are not at a point where you can expect that all operations will
>> be cheap even with thousands of partitions.  We may never be at that
>> point, although people continue to chip away at the bottlenecks.
>>
>> regards, tom lane
>>
>


Re: How to restore a dump containing CASTs into a database with a new user?

2020-07-20 Thread Christophe Pettus



> On Jul 20, 2020, at 02:28, Thorsten Schöning  wrote:
> 
> Would be far easier if Postgres would do that automatically like it
> seems to do for most other objects. The important point is that owning
> those types seems to be per database, so things should be safe to do
> automatically.

I'm not sure I understand exactly how this "feature" would work.  It seems to 
be "in the case that I am using CASTs that include internal types and restoring 
to a different, non-superuser user than the original one in the database that 
was dumped from, change the owner of internal types to make sure that my CAST 
restores work."  That strikes me as a *very* ad hoc feature indeed.

--
-- Christophe Pettus
   x...@thebuild.com





Re: Improvement for query planner? (no, not about count(*) again ;-))

2020-07-20 Thread Francisco Olarte
Tobias, first some etiquette stuff.

- You have replied just to me, directly. I'm CCing the list. Remember
to use reply-all. Usual practice on the postgres lists is to reply to
the list and everyone involved in the thread (doing reply-all
achieves this normally).

- It's not a biggie in this particular mail, but please do not
top-post, especially if you want to get answers to any complex
question. Trim unnecessary parts from the quoted text and reply below.
Having to scroll past a big chunk which includes even my signature is
not something I like.

On Mon, Jul 20, 2020 at 2:23 PM  wrote:
> I have tried the queries
> select name1 from games union select name2 from games
> but not
> select distinct name1 from games union
> select distinct name2 from games
> since it's just the same and easy for the optimizer to realize (I thought?)

There are several other things you have not done. You have not
provided any info (statistics) on your table, just the cardinality.
You have not provided any EXPLAIN output, which is normally needed if
you really want people to help you.

On the subject, I'm not really sure both queries are identical or can
be optimized the same way, but when one does bulk queries like that it is
better to help the optimizer. Your query is a corner case, a one-off load,
and many optimizers do not catch such things properly, as adding code for
them means increasing the bug surface for a feature of dubious utility (I,
personally, would prefer to write that sample query by hand rather than
risk optimizer bugs OR pay the price of the optimizer trying to catch
that on every query I send).

Also, the optimizer may already be catching many things; an EXPLAIN output
would help to see what it's doing.

> I've given Postgres a few Gigs (I think 4?) as work_mem, having 16 GB in 
> total. Still it's not using them.

This does not seem correct. work_mem is per operation; it can be used
several times within a single query. See
https://www.postgresql.org/docs/12/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-MEMORY
. Again, some EXPLAIN/SHOW output may clear things up.
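A quick way to see whether a setting is actually being used by a given
plan is something like the following sketch (using the query from the
thread; the work_mem value is just an example):

SHOW work_mem;
SET work_mem = '256MB';   -- per session, applies per sort/hash operation
EXPLAIN (ANALYZE, BUFFERS)
SELECT name1 FROM games UNION SELECT name2 FROM games;
-- look for "Sort Method: external merge  Disk: ..." versus
-- "Sort Method: quicksort  Memory: ..." (or hash batches) in the output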

> Seems like my mistake was creating that table with a primary key. But the 
> query itself without inserting into anything should've been fast then, which 
> it wasn't. I'll remember your trick of creating the primary key afterwards 
> and will try just select name1 from games to see how it goes.

The PK stuff is bulk-loading 101. Try EXPLAIN and EXPLAIN ANALYZE on
some variants, and remember to ANALYZE your tables (this may seem
redundant, but the PK & redundant hash key stuff leads me to think you
are not too experienced with postgres usage).

Francisco Olarte.




Re: Logical replication from 11.x to 12.x and "unique key violations"

2020-07-20 Thread Adrian Klaver

On 7/20/20 7:22 AM, Thomas Kellerer wrote:

I have a strange error when using logical replication between a 11.2
source database and a 12.3 target.

If I create the publication with all needed tables (about 50) at
once, I get "duplicate key value violates unique constraint xxx_pkey"
errors during the initial replication (when creating the
subscription).

When create the publication only with a few tables, the initial data
sync works without problems. To replicate all tables, I add the
tables incrementally to the publication, and refresh the
subscription.

If I do it like that (step-by-step) everything works fine. Tables
that generated the "duplicate key value" error previously will
replicate just fine. The tables are quite small, some of them less
then 100 rows.



Any pointers where I should start looking to investigate this?


What are the PUBLICATION and SUBSCRIPTION commands being used?

Where is "xxx_pkey" coming from, e.g. sequence?

Where are source and target relative to each other in network/world?

Are there any other errors in log at around the same time that might apply?

Have you looked at 
https://www.postgresql.org/docs/12/logical-replication-architecture.html#LOGICAL-REPLICATION-SNAPSHOT?:


30.5.1. Initial Snapshot

Are the tables heavily used when the subscription is invoked?


--
Adrian Klaver
adrian.kla...@aklaver.com




Re: How to restore a dump containing CASTs into a database with a new user?

2020-07-20 Thread Thorsten Schöning
Good day Christophe Pettus,
on Monday, 20 July 2020 at 16:34 you wrote:

> I'm not sure I understand exactly how this "feature" would work.

Make internal types used in CASTs owned by the restoring user, just like
all other objects automatically become owned by it. Postgres obviously is
able to detect these problems already, and changing the owner
automatically would save people from having to research first and do
things manually afterwards. One could optionally introduce an additional
flag, but if things like "--role" and "--no-owner" are specified, the
user's intention seems pretty clear to me.

In the worst case this problem even makes people NOT use individual
DB users and instead use a superuser for everything.

Just search for the topic of backing up databases using a read-only
user: it is so difficult that people suggest creating a special
superuser set into read-only mode, from which it can recover itself,
and things like that. Simply because granting permissions on unknown
schemas and maintaining default permissions for objects nobody knows
will ever be needed isn't really straightforward.

Using one concrete owner for everything right from the start is the
best one can do, and that includes being able to change owners when
restoring backups. Postgres supports most of that already; only the
CAST case has not been thought through to the end, I guess.

Kind regards,

Thorsten Schöning

-- 
Thorsten Schöning   E-Mail: thorsten.schoen...@am-soft.de
AM-SoFT IT-Systeme  http://www.AM-SoFT.de/

Telefon...05151-  9468- 55
Fax...05151-  9468- 88
Mobil..0178-8 9468- 04

AM-SoFT GmbH IT-Systeme, Brandenburger Str. 7c, 31789 Hameln
AG Hannover HRB 207 694 - Geschäftsführer: Andreas Muchow





Re: DB Authentication with Label Security

2020-07-20 Thread Michel Pelletier
On Fri, Jul 17, 2020 at 12:08 PM Ailleen Pace 
wrote:

> Oracle has a product called Oracle Label Security using Oracle Internet
> Directory.  Does PostgreSQL have a similar capability?
>
> Thank you in advance!
>

Hi Ailleen,

Googling around, it appears the closest functionality in Postgres is
called Row-Level Security (RLS).

https://www.postgresql.org/docs/current/ddl-rowsecurity.html
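A minimal sketch of the idea (table, column and setting names are
invented; each user only sees rows at or below their own clearance):

CREATE TABLE documents (
    id     bigint PRIMARY KEY,
    body   text,
    label  int NOT NULL   -- e.g. 1 = public ... 4 = restricted
);

ALTER TABLE documents ENABLE ROW LEVEL SECURITY;

-- non-owner roles only see rows whose label is at or below their
-- clearance, here taken from a custom session setting
CREATE POLICY label_filter ON documents
    USING (label <= current_setting('app.clearance')::int);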

-Michel


Re: How to restore a dump containing CASTs into a database with a new user?

2020-07-20 Thread Christophe Pettus



> On Jul 20, 2020, at 08:10, Thorsten Schöning  wrote:
> Make internal types used in CASTs owned by the restoring user, like
> all other objects are owned automatically as well.

I don't think that we want to do that, or that we even have to.

Having a restore tool make automatic changes to the ownership of objects in the 
database it is restoring into seems like a bad idea, especially when those 
ownership changes are not part of the backup itself.  On a database with 
multiple users, you can't just get away with changing the ownership of the 
types; you have to make sure that the USAGE is granted appropriately to other 
users.

Again, this is to support a very specific use-case:

* A database has user-defined objects in it that only a superuser can create, 
and,
* The rest of the database objects are owned by that superuser, and,
* You want to change the ownership of the database objects that can be changed, 
and,
* You want to have a single backup that you can restore multiple times, 
changing the ownership in a different way each time, and,
* You want to use pg_restore to do it.

This would require a fair amount of surgery to pg_restore.  Right now, 
pg_restore doesn't really have a "remap these users" functionality.  --no-owner 
*looks* like it does that, and can be used for that in certain cases, but the 
user-remapping functionality of it is really a side-effect.  It happens to 
change the user because instead of altering the user to what it is in the 
backup, it just accepts the default ownership based on the user it is connected 
as.

You can accomplish the same thing by restoring as the superuser, not having to 
alter the ownership of any internal type, and then changing the ownership of 
the user-created objects in the new database once it is restored.  This can be 
done entirely with existing tools, and doesn't need any changes to pg_restore, 
or even having to do ownership changes of internal types (which I strongly 
suspect will bite you later).

--
-- Christophe Pettus
   x...@thebuild.com





Re: Logical replication from 11.x to 12.x and "unique key violations"

2020-07-20 Thread Tom Lane
Thomas Kellerer  writes:
>> I have a strange error when using logical replication between a 11.2
>> source database and a 12.3 target.
>> 
>> If I create the publication with all needed tables (about 50) at
>> once, I get "duplicate key value violates unique constraint xxx_pkey"
>> errors during the initial replication (when creating the
>> subscription).
>> 
>> When create the publication only with a few tables, the initial data
>> sync works without problems. To replicate all tables, I add the
>> tables incrementally to the publication, and refresh the
>> subscription.
>> 
>> If I do it like that (step-by-step) everything works fine. Tables
>> that generated the "duplicate key value" error previously will
>> replicate just fine. The tables are quite small, some of them less
>> then 100 rows.

I have not looked at the code, but it wouldn't surprise me if the initial
replication just copies all the specified tables in some random order.
If there are FK references involved, the replication would have to be
done with referenced tables first, and I bet there's no logic for that.
(Even if there was, it could not cope with circular references or
self-references.)

Best bet might be to not install the subscriber's foreign key
constraints till after the initial sync is done.

regards, tom lane




Re: Logical replication from 11.x to 12.x and "unique key violations"

2020-07-20 Thread Adrian Klaver

On 7/20/20 10:47 AM, Tom Lane wrote:

Thomas Kellerer  writes:

I have a strange error when using logical replication between a 11.2
source database and a 12.3 target.

If I create the publication with all needed tables (about 50) at
once, I get "duplicate key value violates unique constraint xxx_pkey"
errors during the initial replication (when creating the
subscription).

When create the publication only with a few tables, the initial data
sync works without problems. To replicate all tables, I add the
tables incrementally to the publication, and refresh the
subscription.

If I do it like that (step-by-step) everything works fine. Tables
that generated the "duplicate key value" error previously will
replicate just fine. The tables are quite small, some of them less
then 100 rows.


I have not looked at the code, but it wouldn't surprise me if the initial
replication just copies all the specified tables in some random order.
If there are FK references involved, the replication would have to be
done with referenced tables first, and I bet there's no logic for that.
(Even if there was, it could not cope with circular references or
self-references.)

Best bet might be to not install the subscriber's foreign key
constraints till after the initial sync is done.


I'm probably missing something, but would that not result in a 'key not 
found' type of error? The OP is seeing "duplicate key value violates 
unique constraint xxx_pkey". To me that indicates a doubling up of at 
least some of the data being replicated.




regards, tom lane





--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Logical replication from 11.x to 12.x and "unique key violations"

2020-07-20 Thread Tom Lane
Adrian Klaver  writes:
> I'm probably missing something, but would that not result in a 'key not 
> found' type of error. The OP is seeing "duplicate key value violates 
> unique constraint xxx_pkey". To me that indicates a doubling up of at 
> least some of the data replication.

[ squint... ]  Sorry, I completely misread the error condition.

Yeah, duplicate keys does seem odd here.  Can you provide a self
contained example?

regards, tom lane




Re: Logical replication from 11.x to 12.x and "unique key violations"

2020-07-20 Thread John Ashmead
I have had this problem with logical replication on PG 10 repeatedly. In a 
clean build, no problem.

But when I am restarting replication because of some problem, I have seen 
errors about rows already being present.

My own fix, which has worked in my shop, is to add replica triggers that check 
whether the row is already present. If it is, they drop the incoming row on the 
floor. This lets data come in in whatever order it happens to arrive.

Sample code:

if TG_OP = 'INSERT' then
    select id into id1 from table1 where id = new.id;
    if id1 is not null then
        -- log error for analysis
        return null;
    end if;
end if;

In an ideal world, this would probably not be necessary.  But it can be tricky 
to restart replication in an absolutely clean way across all tables and in this 
case it can be better to allow for a bit of overlap in the rows being 
replicated. 
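A more complete version of that idea might look like the following sketch
(table, column and function names are placeholders; note that the trigger
has to be enabled as a replica trigger so it fires while the subscription
worker applies changes, and on PostgreSQL 10 the trigger definition uses
EXECUTE PROCEDURE instead of EXECUTE FUNCTION):

CREATE OR REPLACE FUNCTION skip_duplicate_insert() RETURNS trigger AS $$
BEGIN
    IF EXISTS (SELECT 1 FROM table1 WHERE id = NEW.id) THEN
        RAISE NOTICE 'skipping duplicate row id=%', NEW.id;  -- log for analysis
        RETURN NULL;   -- drop the incoming row
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER table1_skip_dupes
    BEFORE INSERT ON table1
    FOR EACH ROW EXECUTE FUNCTION skip_duplicate_insert();

-- make the trigger fire during logical replication apply as well
ALTER TABLE table1 ENABLE REPLICA TRIGGER table1_skip_dupes;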

FWIW,

John

> On Jul 20, 2020, at 1:47 PM, Tom Lane  wrote:
> 
> Thomas Kellerer  writes:
>>> I have a strange error when using logical replication between a 11.2
>>> source database and a 12.3 target.
>>> 
>>> If I create the publication with all needed tables (about 50) at
>>> once, I get "duplicate key value violates unique constraint xxx_pkey"
>>> errors during the initial replication (when creating the
>>> subscription).
>>> 
>>> When create the publication only with a few tables, the initial data
>>> sync works without problems. To replicate all tables, I add the
>>> tables incrementally to the publication, and refresh the
>>> subscription.
>>> 
>>> If I do it like that (step-by-step) everything works fine. Tables
>>> that generated the "duplicate key value" error previously will
>>> replicate just fine. The tables are quite small, some of them less
>>> then 100 rows.
> 
> I have not looked at the code, but it wouldn't surprise me if the initial
> replication just copies all the specified tables in some random order.
> If there are FK references involved, the replication would have to be
> done with referenced tables first, and I bet there's no logic for that.
> (Even if there was, it could not cope with circular references or
> self-references.)
> 
> Best bet might be to not install the subscriber's foreign key
> constraints till after the initial sync is done.
> 
>   regards, tom lane
> 
> 
> 

John Ashmead
139 Montrose Avenue
Rosemont, PA, 19010-1508
(610) 527 9560 
mobile (610) 247 2323
john.ashm...@ashmeadsoftware.com










Extension vs Implementing Wire Protocol

2020-07-20 Thread Matthew Tamayo-Rios
Hi all,

We need to apply some data management policies on read. We are considering
two approaches

   1. Implement a postgres extension.
   2. Implement a proxy that implements the PostgreSQL wire protocol.

We're trying to decide which approach to take and are leaning towards
implementing it as an open source extension. Mostly what we need is to be
able to apply simple row level transformations after query execution before
data is returned to the client. Our hope is we can do this without messing
with cursors or execution as we don't want it to be brittle across postgres
versions. It would be super cool if it could apply to foreign data wrappers
as well.

Examples of behaviors we'd like to have:

   - 'SELECT * FROM table;' should return masked versions of the columns
   based on policy for that specific user.
   - 'SELECT * FROM table;' should return just the columns accessible to
   a specific user.

Questions:

   1. Are there easily extensible (ideally open-source) proxies that
   already implement the Postgres protocol that we could modify/extend to
   support this?
   2. Does the extension framework support post-query-execution
   transformations before returning the result set, such that it is
   transparently compatible with postgres clients?

This seems like a useful extension and we'd be happy to build it and open
source it so that other folks could benefit. Also, please let me know if I
got the wrong mailing list and should take discussion elsewhere.

Matthew


Re: Extension vs Implementing Wire Protocol

2020-07-20 Thread Thomas Kellerer

Matthew Tamayo-Rios wrote on 20.07.2020 at 22:13:

Examples of behaviors we'd like to have:

  * 'SELECT * FROM table;' should return masked versions of the columns based 
on policy for that specific user.
  * 'SELECT * FROM table;' should return just the columns accessible to a 
specific user.

Questions:

1. Are there easily extensible (ideally open-source) proxies that
already implement the Postgres protocol that we could modify/extend
to support this?


You might want to look this extension:

https://postgresql-anonymizer.readthedocs.io/en/latest/


2. Does the extension framework support post-query execution
transformations before returning the result set such that it is
compatible with postgres clients (transparently).

In general you could probably achieve both use cases with rewrite rules
and/or views (both being "pre-query" rather than post-query, though).
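For instance, the column-restriction and masking cases can often be
handled with a view plus privileges; a sketch (table, column and role
names are invented):

-- expose only a masked subset of the columns to the reporting role
CREATE VIEW customers_masked AS
SELECT id,
       left(email, 2) || '***' AS email_masked,   -- crude masking
       country
FROM   customers;

REVOKE ALL ON customers FROM reporting_role;
GRANT SELECT ON customers_masked TO reporting_role;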







Re: Extension vs Implementing Wire Protocol

2020-07-20 Thread Karsten Hilbert
On Mon, Jul 20, 2020 at 10:36:26PM +0200, Thomas Kellerer wrote:

> >   * 'SELECT * FROM table;' should return masked versions of the columns 
> > based on policy for that specific user.
> >   * 'SELECT * FROM table;' should return just the columns accessible to a 
> > specific user.
> >
> > Questions:
> >
> > 1. Are there easily extensible (ideally open-source) proxies that
> > already implement the Postgres protocol that we could modify/extend
> > to support this?
>
> You might want to look this extension:
>
> https://postgresql-anonymizer.readthedocs.io/en/latest/

or https://github.com/marcmunro/veil

Karsten
--
GPG  40BE 5B0E C98E 1713 AFA6  5BC0 3BEA AC80 7D4F C89B




Re: Could not open file pg_xact/0E97

2020-07-20 Thread Jeremy Schneider

> On Jul 18, 2020, at 14:18, Radoslav Nedyalkov  wrote:
> 
> 
> Well. the vacuum full failed with 
> 
> vacuumdb: vacuuming of table "olap.transactions_and_fees_2020_01" in database 
> "db" failed: ERROR:  found xmin 3916900817 from before relfrozenxid 80319533

Do you have checksums enabled for this database?

-Jeremy

Sent from my TI-83



Re: Logical replication from 11.x to 12.x and "unique key violations"

2020-07-20 Thread Thomas Kellerer
Adrian Klaver wrote on 20.07.2020 at 16:45:
> On 7/20/20 7:22 AM, Thomas Kellerer wrote:
>>> I have a strange error when using logical replication between a 11.2
>>> source database and a 12.3 target.
>>>
>>> If I create the publication with all needed tables (about 50) at
>>> once, I get "duplicate key value violates unique constraint xxx_pkey"
>>> errors during the initial replication (when creating the
>>> subscription).
>>>
>>> When create the publication only with a few tables, the initial data
>>> sync works without problems. To replicate all tables, I add the
>>> tables incrementally to the publication, and refresh the
>>> subscription.
>>>
>>> If I do it like that (step-by-step) everything works fine. Tables
>>> that generated the "duplicate key value" error previously will
>>> replicate just fine. The tables are quite small, some of them less
>>> then 100 rows.
>>>
>>
>> Any pointers where I should start looking to investigate this?
>
> What are the PUBLICATION and SUBSCRIPTION commands being used?

Pretty basic:

  create publication some_publication
     for table table1, table2, table3, ..., table50;

  create subscription foo
connection '...'
publication some_publication;

The approach that worked:

* create the publication with the first 10 tables
* create the subscription, wait until the initial sync is finished
* alter the publication add the next 10 tables
* refresh the subscription, wait until the sync is finished; repeat until all 
tables are replicated (roughly as sketched below)
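The incremental step itself looked roughly like this (table names are
placeholders):

  -- on the publisher (11.x):
  ALTER PUBLICATION some_publication ADD TABLE table11, table12, table13;

  -- on the subscriber (12.x):
  ALTER SUBSCRIPTION foo REFRESH PUBLICATION;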

Note that I did not pay attention to foreign key relationships when adding the 
tables
(I essentially added them in alphabetical order). I first thought that was 
going to
be a problem, but it seems Postgres can cope with that.

Once the initial sync was through (about a week ago now), no further problems 
have come up. It's been running smoothly since then.


> Where is "xxx_pkey" coming from, e.g. sequence?

No, as mentioned, those are varchar(20) columns.
The values are generated by the application (no default value defined for the 
column).

> Where are source and target relative to each other in network/world?

Same datacenter.


> Are there any other errors in log at around the same time that might apply?

No.

> Are the tables heavily used when the subscription is invoked?

Used, but not "heavily" (a few inserts per minute).

And I think the tables where the errors happened weren't written to at all.
Those were rather small lookup tables (a few hundred rows at most).

There are no triggers on the tables where the errors happened in case that's 
important.







Re: Logical replication from 11.x to 12.x and "unique key violations"

2020-07-20 Thread Thomas Kellerer
Tom Lane wrote on 20.07.2020 at 20:04:
> Adrian Klaver  writes:
>> I'm probably missing something, but would that not result in a 'key not
>> found' type of error. The OP is seeing "duplicate key value violates
>> unique constraint xxx_pkey". To me that indicates a doubling up of at
>> least some of the data replication.
>
> [ squint... ]  Sorry, I completely misread the error condition.
>
> Yeah, duplicate keys does seem odd here.  Can you provide a self
> contained example?

I'll try, but this is a production system.
Extracting the necessary data in anonymized form will be tricky.

Is there any chance the version difference might cause this?
And a slightly outdated 11.x at that?

Thomas




Re: Logical replication from 11.x to 12.x and "unique key violations"

2020-07-20 Thread Thomas Kellerer
John Ashmead wrote on 20.07.2020 at 20:11:
> I have had this problem with logical replication on PG 10 repeatedly.
> In a clean build no problem.
>
> But if I am restarting replication because of some problem I’ve seen
> problems with rows already present.
>
> My own fix, which has worked in my shop, is to add replica triggers
> to check for the row being already present.   If it is, they drop the
> row on the floor.  This lets stuff come in in whatever order it
> happens to come in.
>
> Sample code:
>
> if TG_OP = 'INSERT' then
>     select id into id1 from table1 where id = new.id;
>     if id1 is not null then
>         -- log error for analysis
>         return null;
>     end if;
> end if;
>
> In an ideal world, this would probably not be necessary.  But it can
> be tricky to restart replication in an absolutely clean way across
> all tables and in this case it can be better to allow for a bit of
> overlap in the rows being replicated.
>

Interesting idea, I'll keep this in mind.

For now everything works fine, so there is no pressing need.

Thomas