RE: Logical Replication - Single Destination Table With Multiple Source Tables - How to Handle Structure Changes

2021-07-05 Thread Avi Weinberg
Thanks for the reply,

My question was: what will happen if I have one destination table which gets
data from many source tables? What is the best way to handle changes in the
structure of SOME of the source tables, while other source tables remain in the
old format?
In some cases where a type was changed, it may be possible to work with
source tables of different types, but what if a column was renamed in one
source table while the column kept its old name in another source table?
What column name should the destination table have? Do I need to duplicate the
column to have both the old and new names?

From: Vijaykumar Jain [mailto:vijaykumarjain.git...@gmail.com]
Sent: Sunday, July 4, 2021 6:53 PM
To: Avi Weinberg 
Cc: pgsql-general@lists.postgresql.org
Subject: Re: Logical Replication - Single Destination Table With Multiple 
Source Tables - How to Handle Structure Changes

On Sun, 4 Jul 2021 at 15:53, Avi Weinberg <a...@gilat.com> wrote:
I'm using logical replication to copy data from multiple tables to a single
destination table.  At times the structure of a source table needs to change.
However, not all source tables will have their structure updated at the same
time.  Let's assume, for example, that a column type needs to be changed (but
the solution needs to work for column addition, deletion, rename, etc.).  What
is the preferable approach:


  1.  Add another column to the destination table whose type is the new type.
Source tables that have the new column type will write to the new column, while
source tables with the old column type will write to the old column.  A view
will do the proper casting and show a single column to the user.
  2.  Add a new table (versioned: table_name_v2): source tables that have the
new structure will write to the new destination table, while old source tables
will write to the old destination table.  A view with UNION and casting will
combine all tables (see the sketch below).
  3.  A better way?
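
For illustration, option 2 might look roughly like this (a sketch with
hypothetical table and column names, assuming an int-to-bigint change):

-- old and new destination tables, one per source structure
create table dest_v1 (id int primary key, val int);     -- old structure
create table dest_v2 (id int primary key, val bigint);  -- new structure

-- a view with UNION ALL and casting presents a single table to readers
create view dest as
select id, val::bigint as val from dest_v1
union all
select id, val from dest_v2;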

Does the below not work for all ALTER TABLE changes on the publisher?
I have been testing on a demo setup (pg14beta), and subscribers are able to
catch up fine.

on publisher (port 5001)
postgres=# alter table t alter COLUMN name type bigint using name::bigint;

on subscriber (port 5002)
postgres=# alter table t alter COLUMN name type bigint using name::bigint;
ALTER TABLE
postgres=# alter subscription mysub refresh publication;

This should work for all cases of DDL changes, right?

-- demo

-- create table on pub and sub
postgres@db:~/playground/logical_replication$ psql -p 5001
psql (14beta1)
Type "help" for help.
postgres=# create table t(id int primary key, name text);
CREATE TABLE
postgres=# \q

postgres@db:~/playground/logical_replication$ psql -p 5002
psql (14beta1)
Type "help" for help.
postgres=# create table t(id int primary key, name text);
CREATE TABLE
postgres=# \q

-- insert dummy data to check sub rx changes
postgres@db:~/playground/logical_replication$ psql -p 5001
psql (14beta1)
Type "help" for help.
postgres=# insert into t values (1, 1::text);
INSERT 0 1
postgres=# \q

postgres@db:~/playground/logical_replication$ psql -p 5002
psql (14beta1)
Type "help" for help.
postgres=# table t;
 id | name
----+------
(0 rows)
postgres=# alter subscription mysub refresh publication; -- this is because I
dropped the table with the publication enabled
ALTER SUBSCRIPTION
postgres=# table t;
 id | name
----+------
  1 | 1
(1 row)
postgres=# \q

-- alter table alter column change type on pub
postgres@db:~/playground/logical_replication$ psql -p 5001
psql (14beta1)
Type "help" for help.
postgres=# alter table t alter COLUMN name type bigint using name::bigint;
ALTER TABLE
postgres=# \q

-- alter table alter column change type on sub
postgres@db:~/playground/logical_replication$ psql -p 5002
psql (14beta1)
Type "help" for help.
postgres=# alter table t alter COLUMN name type bigint using name::bigint;
ALTER TABLE
postgres=# \q

-- insert new data based on new column type
postgres@db:~/playground/logical_replication$ psql -p 5001
psql (14beta1)
Type "help" for help.
postgres=# insert into t values (2, 100);
INSERT 0 1
postgres=# \q

-- check new data on sub
postgres@db:~/playground/logical_replication$ psql -p 5002
psql (14beta1)
Type "help" for help.
postgres=# table t;
 id |  name
----+------
  1 |   1
  2 | 100
(2 rows)
postgres=# \q


--alter table alter col type on pub and insert data
postgres@db:~/playground/logical_replication$ psql -p 5001
psql (14beta1)
Type "help" for help.
postgres=# alter table t alter COLUMN name type text using name::text;
ALTER TABLE
postgres=# insert into t values (3, 'three');
INSERT 0 1
postgres=# \q

--alter table alter col type on sub, changes will not come till refresh pub
postgres@db:~/playground/logical_replication$ psql -p 5002
psql (14beta1)
Type "help" for help.
postgres=# table t;
 id | name
----+------
  1 |   1
  2 | 100
(2 rows)
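
-- the remaining subscriber-side steps would presumably mirror the earlier ones
postgres@db:~/playground/logical_replication$ psql -p 5002
psql (14beta1)
Type "help" for help.
postgres=# alter table t alter COLUMN name type text using name::text;
ALTER TABLE
postgres=# alter subscription mysub refresh publication;
ALTER SUBSCRIPTION
postgres=# \q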

number of wal file is not reduced.

2021-07-05 Thread Atul Kumar
Hi,

I have a postgres 9.6 server running on CentOS 7. The number of WAL files
in the pg_xlog directory is above 4000, so to save some disk space I
thought of reducing it to 100.

But there is no reduction in their number.

Please note that there is no replication configured; the server is
standalone.

What am I missing here? My settings are:

•   max_wal_size = '3GB'
•   checkpoint_timeout = '15min'
•   checkpoint_completion_target=0.8
•   log_checkpoints=on
•   wal_keep_segments=100 (initially it was set to 4000)
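
With the default 16 MB WAL segment size, a rough back-of-the-envelope check
(an estimate, not an exact bound):

    max_wal_size      = 3 GB  ->  3072 MB / 16 MB = 192 segments
    wal_keep_segments = 100   ->  at least 100 segments always retained

so roughly 100-200 files would be expected in pg_xlog, far fewer than 4000.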




Please suggest when the number of wal files will be reduced to approx 100.





regards,
Atul




Re: number of wal file is not reduced.

2021-07-05 Thread hubert depesz lubaczewski
On Mon, Jul 05, 2021 at 03:18:14PM +0530, Atul Kumar wrote:
> I have a postgres 9.6 server running on CentOS 7. The number of WAL files
> in the pg_xlog directory is above 4000, so to save some disk space I
> thought of reducing it to 100.
...
> Please suggest when the number of wal files will be reduced to approx 100.

It's hard to say for sure. There are potential issues that might cause
pg to keep more wal files than configured.

Generally these kinds of things are easier solved in a more interactive
medium - like IRC, or Slack.

But, to give you some start, check if:
1. archiving doesn't fail
2. archiving doesn't lag
3. there are no prepared transactions
4. there are no lagging replication slots
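
For 9.6, a minimal sketch of those checks (standard catalog views, so adjust
to taste):

-- 1 & 2: archiver health and last failure
select last_archived_wal, last_failed_wal, failed_count from pg_stat_archiver;

-- 3: prepared transactions
select gid, prepared, owner from pg_prepared_xacts;

-- 4: replication slots and how far behind they are
select slot_name, active,
       pg_xlog_location_diff(pg_current_xlog_location(), restart_lsn) as bytes_behind
from pg_replication_slots;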

Best regards,

depesz





wal_keep_segments and max_wal_size

2021-07-05 Thread Atul Kumar
hi,

(on postgres 9.6) I was just practicing to make the number of WAL files in
the pg_xlog directory equal to the value I pass as the wal_keep_segments
parameter.

say for example
show wal_keep_segments;
 wal_keep_segments
-------------------
 125


So I want the number of WAL files not to grow beyond 125, and for that I
changed the max_wal_size parameter to 2 GB to limit the size of the pg_xlog
directory.

 show max_wal_size;
 max_wal_size
--------------
 2GB
(1 row)


But when I did some testing with bulk inserts, the number of WAL files
grew beyond 125,

and the size of the pg_xlog directory reached 2.7 GB.

/data/apps/edb/as96/data/pg_xlog 04:05:08]$ ls | wc -l
173

/data/apps/edb/as96/data 04:05:11]$ du -sh pg_xlog/
2.7G    pg_xlog/


I wonder why I saw this behavior, because I limited the size of the pg_xlog
directory by setting max_wal_size to 2GB, which should be equivalent to
around 125 WAL files in the pg_xlog directory.
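
For what it's worth, with 16 MB segments the 2 GB figure does line up with
the target:

    125 segments x 16 MB/segment = 2000 MB ~= 2 GB

but max_wal_size is a soft limit that heavy write bursts can overshoot, and
wal_keep_segments is a minimum number of segments to retain, not a cap, so
pg_xlog can legitimately grow past both during bulk inserts.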


Please suggest how I should make the two consistent (wal_keep_segments and
max_wal_size).





regards,
Atul




Re: number of wal file is not reduced.

2021-07-05 Thread Atul Kumar
Hi ,

As I mentioned in my mail, there is no replication configured and no
archiving is lagging.

My query is simple:

How do I reduce the number of WAL files from 4000 to 100 on a standalone
server?



Regards.

On 7/5/21, hubert depesz lubaczewski  wrote:
> On Mon, Jul 05, 2021 at 03:18:14PM +0530, Atul Kumar wrote:
>> I have a postgres 9.6 server running on CentOS 7. The number of WAL files
>> in the pg_xlog directory is above 4000, so to save some disk space I
>> thought of reducing it to 100.
> ...
>> Please suggest when the number of wal files will be reduced to approx
>> 100.
>
> It's hard to say for sure. There are potential issues that might cause
> pg to keep more wal files than configured.
>
> Generally these kinds of things are easier solved in more interactive
> medium - like IRC, or Slack.
>
> But, to give you some start, check if:
> 1. archiving doesn't fail
> 2. archiving doesn't lag
> 3. there are no prepared transactions
> 4. there are no lagging replication slots
>
> Best regards,
>
> depesz
>
>




Re: Logical Replication - Single Destination Table With Multiple Source Tables - How to Handle Structure Changes

2021-07-05 Thread Vijaykumar Jain
On Mon, 5 Jul 2021 at 14:29, Avi Weinberg  wrote:

> Thanks for the reply,
>
>
>
> My question was: what will happen if I have one destination table which
> gets data from many source tables? What is the best way to handle changes
> in the structure of SOME of the source tables, while other source tables
> remain in the old format?
>
> In some cases where a type was changed, it may be possible to work with
> source tables of different types, but what if a column was renamed in one
> source table while the column kept its old name in another source table?
> What column name should the destination table have?  Do I need to
> duplicate the column to have both the old and new names?
>

ADD NEW column: [1]
Add a new column to the destination first, and then to the source(s).
The WALs would be retained by the publisher until it can start publishing
again, so there is no data loss (a sketch follows).
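
A sketch of that ordering, using a hypothetical column on the table t from
the demo above:

-- on the destination (subscriber) first, so apply never sees a missing column
alter table t add column extra text;

-- then on each source (publisher)
alter table t add column extra text;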

ALTER COL: [2]
IMHO, add a new column of the target datatype to both destination and source,
as in [1] above. Then write a trigger on the source tables to sync the column
whose type needs changing into the new column added on the same source tables,
and let those changes get published to the destination (for example, see the
int-to-bigint migration approach in PG, like int_to_bigint_migration); a
sketch follows below. Then drop the old column at source and destination.
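
A sketch of such a sync trigger on a source table (hypothetical names, for
the int-to-bigint case):

-- new column added at destination and source, as in [1]
alter table t add column name_new bigint;

create or replace function t_sync_name() returns trigger as $$
begin
  -- keep the new column in step with the old one on every write
  new.name_new := new.name::bigint;
  return new;
end;
$$ language plpgsql;

create trigger t_sync_name_trg
before insert or update on t
for each row execute procedure t_sync_name();

-- backfill existing rows so the new column is complete before the old is dropped
update t set name_new = name::bigint where name_new is null;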

DROP COL [3]:
dropping-columns

If there are apps making use of that column, you first have to ensure the
column is no longer used in queries. Then you can start by dropping the
column at the source(s) and then at the destination.

I have a setup that I have used with three source DBs and one destination DB
for the above, but things can get more complicated with FK constraints etc.,
so I am not sure I have the best answer, as I have not done this in
production. I have only used LR for migration, and was considering it for a
federated setup / to unshard (many shards -> LR -> one shard) where sharding
was a bad decision, but in both cases I did not allow DDL changes at the
source until the migration was completed.


The Curious Case of the Table-Locking UPDATE Query

2021-07-05 Thread Emiliano Saenz
Hello!
We have a huge POSTGRES 9.4 database in the production environment (several
tables have more than 100.000.00 records). For the last two months we have had
problems with CPU utilization. Debugging the locks (in pg_locks) we noticed
that sometimes a simple UPDATE (by primary key) operation takes an ACCESS
EXCLUSIVE lock on these huge tables, so the POSTGRES DB collapses and
generates excessive CPU consumption. My question is: how is it possible that
an UPDATE operation takes an ACCESS EXCLUSIVE lock?
More information: this system never manifested this behavior before, and we
have not made software changes in the last 2 years.


Re: The Curious Case of the Table-Locking UPDATE Query

2021-07-05 Thread Adrian Klaver

On 7/5/21 4:22 PM, Emiliano Saenz wrote:

Hello!
We have a huge POSTGRES 9.4 database in the production environment
(several tables have more than 100.000.00 records). For the last two months
we have had problems with CPU utilization. Debugging the locks (in
pg_locks) we noticed that sometimes a simple UPDATE (by primary key)
operation takes an ACCESS EXCLUSIVE lock on these huge tables, so the
POSTGRES DB collapses and generates excessive CPU consumption. My
question is: how is it possible that an UPDATE operation takes an ACCESS
EXCLUSIVE lock?
More information: this system never manifested this behavior before, and
we have not made software changes in the last 2 years.


FYI, 9.4 is ~1.5 years past EOL.

Please don't post images. It would have been just as easy to copy and paste
the output, and it would have saved hand-building the below.


Where is temp.querys_ejecutandose.csv coming from?

Above you mention querying  pg_locks.

What is the query you are using?

From here:

https://www.postgresql.org/docs/9.4/explicit-locking.html

"ACCESS EXCLUSIVE

Conflicts with locks of all modes (ACCESS SHARE, ROW SHARE, ROW 
EXCLUSIVE, SHARE UPDATE EXCLUSIVE, SHARE, SHARE ROW EXCLUSIVE, 
EXCLUSIVE, and ACCESS EXCLUSIVE). This mode guarantees that the holder 
is the only transaction accessing the table in any way.


Acquired by the DROP TABLE, TRUNCATE, REINDEX, CLUSTER, VACUUM 
FULL, and REFRESH MATERIALIZED VIEW (without CONCURRENTLY) commands. 
Many forms of ALTER TABLE also acquire a lock at this level (see ALTER 
TABLE). This is also the default lock mode for LOCK TABLE statements 
that do not specify a mode explicitly.

"



--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Damaged (during upgrade?) table, how to repair?

2021-07-05 Thread Sam Gendler
On Sun, Jul 4, 2021 at 1:20 PM Adrian Klaver 
wrote:

>
> In any case I don't see you getting a 9.5 version on the laptop from the
> package directories. Pretty sure the Fedora 30 repos will not have 9.5,
> and the Postgres repos don't go back to Fedora 30. So if you want a 9.5
> instance you will need to build it from source, in order to get a server
> that works long enough to restore the 9.5 dump, so you can then dump with
> the 11 instance's pg_dump in order to move to the 11 instance.
>
>
There are 9.5 docker images available from Docker Hub.  Easy enough to run
postgres inside one of those with the postgres data dir mounted into the
container, I would think. You could even use an 11 image to connect to it for
pg_dump.
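
For example (a sketch; the path, password, port, and database name are
placeholders, and file ownership on the mounted directory may need adjusting):

# run a throwaway 9.5 server against the existing 9.5 data directory
docker run -d --name pg95 \
  -v /path/to/9.5/data:/var/lib/postgresql/data \
  -e POSTGRES_PASSWORD=postgres \
  -p 5433:5432 \
  postgres:9.5

# then dump from it, e.g. with a newer client's pg_dump
pg_dump -h localhost -p 5433 -U postgres mydb > mydb.sql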

--sam