Re: Help with row estimate problem
On 29/7/2024 22:51, Jon Zeppieri wrote:
> Of course, I'd prefer not to have to materialize this relation explicitly. This particular query, for this particular user, benefits from it, but similar queries, or queries for different users, may not.
>
> I think the root of the problem is that population size (i.e., the number of members in a given population) has a high variance, and the planner is basing its estimates on the average population size (and maybe the average number of populations to which a user has access?), which is not especially useful.
>
> Is there anything I can do about this? Would any extended statistics be useful here?

Thanks for the report. I see such cases frequently enough, and the key problem here is data skew, as you already mentioned. Extended statistics don't help here. Also, because we can't estimate the specific values coming from the outer side of the NestLoop, we can't use the MCV list to estimate the selectivity of the population. That's the reason the optimiser uses the ndistinct value.

What can you do? I see only one option: split the table into partitions where the data is distributed more or less uniformly, and invent a criterion for pruning unnecessary partitions. Of course, you can also try pg_hint_plan and force the planner to use a MergeJoin or HashJoin in that suspicious case.

--
regards, Andrei Lepikhov
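As an illustration of the two suggestions above, a minimal sketch under stated assumptions: the table, column, and alias names are hypothetical, and pg_hint_plan is a separate extension that must be installed and loaded (the hint comment goes immediately before the query).

    -- Hash-partition the skewed table so members are spread more evenly
    -- (all names here are hypothetical):
    CREATE TABLE population_members (
        population_id bigint NOT NULL,
        member_id     bigint NOT NULL
    ) PARTITION BY HASH (population_id);

    CREATE TABLE population_members_p0 PARTITION OF population_members
        FOR VALUES WITH (MODULUS 4, REMAINDER 0);
    -- ...and likewise for remainders 1 through 3.

    -- Alternatively, with pg_hint_plan, force the join method by alias:
    /*+ HashJoin(u m) */
    SELECT u.user_id, count(*)
    FROM users u
    JOIN population_members m ON m.population_id = u.population_id
    GROUP BY u.user_id;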
Re: Help with row estimate problem
On Tue, Jul 30, 2024 at 11:34 AM Andrei Lepikhov wrote:
> Thanks for the report. I see such cases frequently enough, and the key problem here is data skew, as you already mentioned. Extended statistics don't help here. Also, because we can't estimate the specific values coming from the outer side of the NestLoop, we can't use the MCV list to estimate the selectivity of the population. That's the reason the optimiser uses the ndistinct value.
> What can you do? I see only one option: split the table into partitions where the data is distributed more or less uniformly, and invent a criterion for pruning unnecessary partitions.
> Of course, you can also try pg_hint_plan and force the planner to use a MergeJoin or HashJoin in that suspicious case.

Thanks for the reply, Andrei, and for the advice about partitioning.

- Jon
logical replication out of memory
2024-07-31 00:01:02.795 UTC:10.240.6.139(33068):repl13801@pgpodb:[3603770]:[10-1]:pgreps_13801ERROR: out of memory
2024-07-31 00:01:02.795 UTC:10.240.6.139(33068):repl13801@pgpodb:[3603770]:[11-1]:pgreps_13801DETAIL: Cannot enlarge string buffer containing 378355896 bytes by 756711422 more bytes.
2024-07-31 00:01:02.795 UTC:10.240.6.139(33068):repl13801@pgpodb:[3603770]:[12-1]:pgreps_13801CONTEXT: slot "pgreps_13801", output plugin "pgoutput", in the change callback, associated LSN 3D/318438E0
2024-07-31 00:01:02.795 UTC:10.240.6.139(33068):repl13801@pgpodb:[3603770]:[13-1]:pgreps_13801STATEMENT: START_REPLICATION SLOT "pgreps_13801" LOGICAL 3C/F24C74D0 (proto_version '1', publication_names 'pgreps_13801')

We use the built-in pgoutput plugin, and a client application did a HOT update to a text column whose real length is 756711422 bytes. But this table is NOT on the publication list. Is it possible to make logical decoding ignore WAL records that belong to tables not in the publication list?

Thanks,
James
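The DETAIL line can be sanity-checked with simple arithmetic: PostgreSQL refuses to grow a single string buffer past MaxAllocSize, which is about 1 GB in the source (0x3fffffff = 1073741823 bytes; treat the exact constant as an assumption about the server's build). For example:

    SELECT 378355896 + 756711422 AS requested_bytes,
           1073741823            AS max_alloc_size,
           (378355896 + 756711422) > 1073741823 AS exceeds_limit;
    -- requested_bytes = 1135067318, so the buffer cannot be enlarged.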
Re: logical replication out of memory
We use the built-in pgoutput plugin, and a client application did a HOT update to a text column whose real length is 756711422 bytes. But this table is NOT on the publication list. Is it possible to make logical decoding ignore WAL records that belong to tables not in the publication list? Or do we have to drop the replication slot, or make it start from a new LSN position?

2024-07-31 00:01:02.795 UTC:xxx..xxx.xxx(33068):repl13801@pgpodb:[3603770]:[10-1]:pgreps_13801ERROR: out of memory
2024-07-31 00:01:02.795 UTC:xxx..xxx.xxx(33068):repl13801@pgpodb:[3603770]:[11-1]:pgreps_13801DETAIL: Cannot enlarge string buffer containing 378355896 bytes by 756711422 more bytes.
2024-07-31 00:01:02.795 UTC:xxx..xxx.xxx(33068):repl13801@pgpodb:[3603770]:[12-1]:pgreps_13801CONTEXT: slot "pgreps_13801", output plugin "pgoutput", in the change callback, associated LSN 3D/318438E0
2024-07-31 00:01:02.795 UTC:xxx..xxx.xxx(33068):repl13801@pgpodb:[3603770]:[13-1]:pgreps_13801STATEMENT: START_REPLICATION SLOT "pgreps_13801" LOGICAL 3C/F24C74D0 (proto_version '1', publication_names 'pgreps_13801')

James Pang (chaolpan) wrote on Wednesday, July 31, 2024 at 10:28 AM:
> From: James Pang
> Sent: Wednesday, July 31, 2024 10:18 AM
> To: [email protected]
> Subject: logical replication out of memory
>
> We use the built-in pgoutput plugin, and a client application did a HOT update to a text column whose real length is 756711422 bytes. But this table is NOT on the publication list. Is it possible to make logical decoding ignore WAL records that belong to tables not in the publication list?
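A minimal sketch of the two recovery options asked about above, using the slot name from the log. Caution: both discard the pending changes for the subscriber, the slot must not be in active use when these run, and the target LSN below is purely hypothetical; it would need to lie beyond the failing record (shown in the log at 3D/318438E0).

    -- Option 1: drop the slot entirely (the subscription must then be rebuilt):
    SELECT pg_drop_replication_slot('pgreps_13801');

    -- Option 2: advance the slot past the problematic change without decoding it
    -- (the LSN here is hypothetical):
    SELECT pg_replication_slot_advance('pgreps_13801', '3D/40000000'::pg_lsn);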
Re: logical replication out of memory
Hi Pang,

The text column is exceptionally large, and the server process serving the replication connection ran out of memory while handling the large text column update. I suggest storing such files in an S3 bucket instead. You could also consider increasing the memory-related configuration parameters, like work_mem and maintenance_work_mem, or even the server's overall memory allocation if possible, or increase the shared buffer size. If none of that works, use physical replication to cope with it. 😄

Lastly, let me know the data type you are using for this column.

On Wed, Jul 31, 2024 at 7:18 AM James Pang wrote:
> We use the built-in pgoutput plugin, and a client application did a HOT update to a text column whose real length is 756711422 bytes. But this table is NOT on the publication list. Is it possible to make logical decoding ignore WAL records that belong to tables not in the publication list?
>
> Thanks,
> James
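A sketch of the suggested parameter changes, with purely illustrative values; note that the "Cannot enlarge string buffer" error comes from a fixed per-allocation cap of roughly 1 GB, so raising these settings may not be enough by itself for this particular failure.

    ALTER SYSTEM SET work_mem = '256MB';
    ALTER SYSTEM SET maintenance_work_mem = '1GB';
    ALTER SYSTEM SET shared_buffers = '8GB';  -- takes effect only after a server restart
    SELECT pg_reload_conf();                  -- applies the reloadable settings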
Re: logical replication out of memory
We use the built-in pgoutput plugin, and a client application did a HOT update to a text column whose real data length is 756711422 bytes. PostgreSQL's logical decoding throws an out-of-memory error when it decodes the WAL records belonging to that table, because the string buffer's total size would exceed 1 GB. But this table is NOT on the publication list and is not in replication either. Is it possible to make logical decoding ignore WAL records that belong to tables not in the replication list? That could help avoid this kind of error.

Thanks,
James

> From: khan Affan
> Sent: Wednesday, July 31, 2024 1:42 PM
> To: James Pang
> Cc: [email protected]
> Subject: Re: logical replication out of memory
>
> Hi Pang,
>
> The text column is exceptionally large, and the server process serving the replication connection ran out of memory while handling the large text column update. I suggest storing such files in an S3 bucket instead. You could also consider increasing the memory-related configuration parameters, like work_mem and maintenance_work_mem, or even the server's overall memory allocation if possible, or increase the shared buffer size. If none of that works, use physical replication to cope with it. 😄
>
> Lastly, let me know the data type you are using for this column.
>
> On Wed, Jul 31, 2024 at 7:18 AM James Pang wrote:
> > We use the built-in pgoutput plugin, and a client application did a HOT update to a text column whose real length is 756711422 bytes. But this table is NOT on the publication list. Is it possible to make logical decoding ignore WAL records that belong to tables not in the publication list?
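To double-check that the table really is outside the publication, membership can be inspected directly; the publication name below comes from the log, while the table name is hypothetical.

    SELECT pubname, schemaname, tablename
    FROM pg_publication_tables
    WHERE pubname = 'pgreps_13801';

    -- Or test one specific table (the name is hypothetical):
    SELECT EXISTS (
        SELECT 1
        FROM pg_publication_tables
        WHERE pubname = 'pgreps_13801'
          AND tablename = 'big_text_table'
    ) AS is_published;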
