Changing locale of an existing database

2025-06-17 Thread Marcin Gozdalik
Hi

I am using PostgreSQL 17 and would like to take advantage of performance
and stability across OS updates of builtin C.UTF-8 locale.
I have a cluster with a DB created with en_US.UTF-8 libc locale. I would
like to migrate the DB to C.UTF-8. Ideally there'd be an "ALTER DATABASE
... SET LOCALE ..." command that would take care of it but it seems it
doesn't exist. I was thinking that I could change the collation of all
TEXT/CHAR/VARCHAR columns in all the tables to pg_c_utf8, REINDEX all those
columns and change the default locale in the pg_database table.

Is it a sensible plan? Am I missing some steps? I can't find any reference
to anybody doing that before or discouraging it.

Thanks,
Marcin

-- 
Marcin Gozdalik


Re: Performance issues during pg_restore -j with big partitioned table

2025-06-17 Thread Dimitrios Apostolou
For the record, I haven't seen this deadlock again. I guess it was a bug 
on the master branch that got fixed, because I've been testing later 
versions a few times.


Dimitris



On Fri, 4 Apr 2025, Dimitrios Apostolou wrote:


On Wed, 2 Apr 2025, Dimitrios Apostolou wrote:


 Hello list.

 My database includes one table with 1000 partitions, all of them rather


I was not clear here: my database dump has all that, and the database is 
brand new and empty.



 sizeable. I run:

 pg_restore -j12 --no-tablespaces --disable-triggers --exit-on-error
 --no-owner --no-privileges -n public -d newdb custom_format_dump.pgdump

 Right now after 24h of restore, I notice weird behaviour, so I have
 several questions about it:

 + 11 postgres backend processes are sleeping as "TRUNCATE TABLE waiting".
   I see that they are waiting to issue a TRUNCATE for one of the
   partitions and then COPY data to it.  Checking the log I see that
   several partitions have already been copied finished, but many more
   are left to start.

   Why is a TRUNCATE needed at the start of a partition's COPY phase? I
   didn't issue a --clean on the command line (I don't need it as my
   database is newly created), and I don't see a mention of related
   TRUNCATE in the pg_restore manual.

 + 1 postgres backend process is doing:

   ALTER TABLE the_master_partitioned_table
 ADD CONSTRAINT ...
   FOREIGN KEY (columnX) REFERENCES another_table(columnX)

   According to my logs this started right after COPY DATA for
   another_table was finished. And apparently it has a lock on
   the_master_partitioned_table that all other TRUNCATE have to wait for.

   Is this a bug in the dependency resolution? Wouldn't it make sense for
   this to wait until all 1000 partitions have finished their COPY DATA
   phase?


Trying again, pg_restore exited with error after almost 24h:

pg_restore: while PROCESSING TOC:
pg_restore: from TOC entry 8904; 2606 16529 CONSTRAINT test_runs_raw 
test_runs_raw_partitioned_pkey

pg_restore: error: could not execute query: ERROR:  deadlock detected
DETAIL:  Process 465409 waits for AccessExclusiveLock on relation 44437 of 
database 44090; blocked by process 465408.
Process 465408 waits for AccessShareLock on relation 44383 of database 44090; 
blocked by process 465409.

HINT:  See server log for query details.

From the logs I see that:

+ Process 465409 waits for AccessExclusiveLock on relation 44437 of database 
44090; blocked by process 465408.

 -->  44437 is test_runs_raw__part_max10120k (a single partition)
+ Process 465408 waits for AccessShareLock on relation 44383 of database 
44090; blocked by process 465409.

 -->  44383 is test_runs_raw  (the master partitioned table)

Process 465409:
  ALTER TABLE ONLY public.test_runs_raw
ADD CONSTRAINT test_runs_raw_partitioned_pkey PRIMARY KEY (workitem_n,
run_n);

Process 465408:
  COPY public.test_runs_raw__part_max10120k(...) FROM stdin;


Bug? This happened on a postgres compiled from last week's master branch. The 
dump I'm trying to restore is from postgres 17.4.



Thanks
Dimitris







Re: Changing locale of an existing database

2025-06-17 Thread Adrian Klaver

On 6/17/25 10:30, Marcin Gozdalik wrote:

Around 100 DBs, ranging from 1TB to 50TB.
Initial testing showed that it takes 1.5h to dump the DB and 3h to 
restore it for a DB of around 3TB.


Just recreating the indexes takes around 30 minutes on the same DBs.

I understand that pg_dump/pg_restore is the safe route but it's too slow 
for me so I'd like to understand if I have any alternative.


Alright I see the issue. I don't know enough about this to say whether 
directly altering the system catalog would be a safe operation or not. 
An answer to that is going to need to come from someone with more 
knowledge on Postgres internals then I.




--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Changing locale of an existing database

2025-06-17 Thread Adrian Klaver

On 6/17/25 09:20, Marcin Gozdalik wrote:

Hi

I am using PostgreSQL 17 and would like to take advantage of performance 
and stability across OS updates of builtin C.UTF-8 locale.
I have a cluster with a DB created with en_US.UTF-8 libc locale. I would 
like to migrate the DB to C.UTF-8. Ideally there'd be an "ALTER DATABASE 
... SET LOCALE ..." command that would take care of it but it seems it 
doesn't exist. I was thinking that I could change the collation of all 
TEXT/CHAR/VARCHAR columns in all the tables to pg_c_utf8, REINDEX all 
those columns and change the default locale in the pg_database table.


Is it a sensible plan? Am I missing some steps? I can't find any 
reference to anybody doing that before or discouraging it.


How big a database are we talking about?

To me it would seem easier to create a new database with new locale and 
do either a pg_dump/pg_restore or logical replication to the new 
instance. Of course this may depend on the answer to the question above.




Thanks,
Marcin

--
Marcin Gozdalik


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Logical Replication Memory Allocation Error - "invalid memory alloc request size"

2025-06-17 Thread Masahiko Sawada
On Wed, Jun 11, 2025 at 7:36 PM Hayato Kuroda (Fujitsu)
 wrote:
>
> Dear Max,
>
> > We have rewritten as many of our transactions as possible to avoid using
> > temporary tables, and so far, that seems to have resolved the problem.
>
> Good to know. We try to fix as soon as possible.
>

I pushed the fix for this issue[1].

Regards,

[1] 
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=d87d07b7ad3b782cb74566cd771ecdb2823adf6a


--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: Stably escaping an identifier

2025-06-17 Thread Phillip Diffley
Thanks!

On Sun, Jun 15, 2025 at 10:11 PM Tom Lane  wrote:

> Phillip Diffley  writes:
> > Is there a reliable way to determine if an identifier has already been
> > escaped, or alternatively is there a function that will stably escape an
> > identifier such that the identifier will not change if the function is
> > called repeatedly?
>
> This is impossible in general, because you can't know if the
> double-quotes are meant to be part of the identifier value.
>
> My advice here would be to flat-out reject input identifiers that
> contain double quotes.  I'd suggest banning newlines too while
> at it, as those are known to create security issues in some
> contexts.
>
> regards, tom lane
>