Re: update faster way

2024-09-14 Thread Laurenz Albe
On Sat, 2024-09-14 at 08:43 +0530, yudhi s wrote:
> We have to update a column value (from numbers like '123' to codes like 'abc',
> by looking into a reference table) in a partitioned table with billions
> of rows in it, with each partition having hundreds of millions of rows. As we
> tested, for ~30 million rows it's taking ~20 minutes to update. So if we go by
> this calculation, it's going to take days to update all the values. So my
> question is
>
> 1) Is there any inbuilt way of running the update query in parallel
>    (e.g. using parallel hints etc.) to make it run faster?
> 2) Should we run each individual partition in a separate session (e.g. five
>    partitions will have the updates done at the same time from 5 different
>    sessions)? And will it have any locking effect, or can we just start the
>    sessions and let them run without impacting our live transactions?

Option 1 doesn't exist.
Option 2 is possible, and you can even have more than one session working
on a single partition.

However, the strain on your system's resources and particularly the row
locks will impair normal database work.

Essentially, you can either take an extended down time or perform the updates
in very small chunks with a very low "lock_timeout" over a very long period
of time.  If any of the batches fails because of locking conflicts, it has
to be retried.
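
As a rough sketch, one such small chunk could look like this (table and column
names are taken from the UPDATE in the original post; the 1000-row batch size
and the 100ms timeout are arbitrary placeholders, and picking which rows go
into each chunk is left out):

  DO
  $$BEGIN
     -- give up quickly instead of queueing behind application locks
     SET LOCAL lock_timeout = '100ms';

     UPDATE tab_part1 t
     SET column1 = r.code
     FROM reference_tab r
     WHERE t.column1 = r.column1
       AND t.ctid IN (SELECT ctid FROM tab_part1 LIMIT 1000);
  EXCEPTION
     WHEN lock_not_available THEN
        -- this batch hit a lock; remember it and retry it later
        RAISE NOTICE 'batch skipped because of a lock conflict';
  END;$$;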

Investigate with EXPLAIN (ANALYZE) why the updates take that long.  It could
be a lame disk, tons of (unnecessary?) indexes or triggers, but it might as
well be the join with the lookup table, so perhaps there is room for
improvement (more "work_mem" for a hash join?).

Yours,
Laurenz Albe




Re: re-novice coming back to pgsql: porting an SQLite update statement to postgres

2024-09-14 Thread Dan Kortschak
On Wed, 2024-07-24 at 00:23 +, Dan Kortschak wrote:
> On 7/23/24 13:11, Vincent Veyron wrote:
> > On Mon, 15 Jul 2024 20:31:13 +
> > 
> > This is the goto page for anything SQL :
> > https://www.postgresql.org/docs/current/sql-commands.html
> > 
> > For DateTime types :
> > https://www.postgresql.org/docs/current/datatype-datetime.html
> > 
> > For JSON types :
> > https://www.postgresql.org/docs/current/datatype-json.html
> 
> Thanks, I will work through those.

I've had a chance to attack this. The first part of the problem was
that I could not figure out how to get the multiple-statement
transaction that I was using in SQLite to work with PostgreSQL. The
solution was to use the transaction functions of the host language's
Postgres binding and send the statements separately.

The first part, ensuring the JSON array exists, is solved with

update events
set datastr = jsonb_set(datastr, '{amend}', '[]')
where
    starttime < $3 and
    endtime > $2 and
    not datastr::jsonb ? 'amend' and
    bucketrow = (
        select rowid from buckets where id = $1
    );

I'm still having difficulties with the second part, which is to update
the contents of the amend array in the JSON.

So far I'm able to append the relevant details to the amend array, but
I'm unable to select the correct elements from the $6 argument, which is
in the form [{"start":,"end":,"data":}, ...]. The first update statement
gives me broadly what I want, but it includes elements of the array that
it shouldn't.

update events
set datastr = jsonb_set(
    datastr,
    '{amend}',
    datastr->'amend' || jsonb_build_object(
        'time', $2::TEXT,
        'msg', $3::TEXT,
        'replace', (
            -- This select is for comparison with the code below.
            select * from jsonb($6::TEXT)
        )
    )
)
where
    starttime < $5 and
    endtime > $4 and
    bucketrow = (
        select rowid from buckets where id = $1
    );

If I filter on the start and end time, I end up with no element coming
through at all and the "replace" field ends up null.

update events
set datastr = jsonb_set(
    datastr,
    '{amend}',
    datastr->'amend' || jsonb_build_object(
        'time', $2::TEXT,
        'msg', $3::TEXT,
        'replace', (
            select *
            from jsonb($6::TEXT) as replacement
            where
                (replacement->>'start')::TIMESTAMP WITH TIME ZONE < endtime and
                (replacement->>'end')::TIMESTAMP WITH TIME ZONE > starttime
        )
    )
)
where
    starttime < $5 and
    endtime > $4 and
    bucketrow = (
        select rowid from buckets where id = $1
    );

Can anyone suggest what I might be missing? One thing that occurs to me
is that, due to the host language, the timezone in starttime and endtime
is the local timezone, while the timestamps in the elements of the $6
argument are in UTC. I've tried forcing the timezones to match, and this
does not appear to be the issue.

This can be seen in the case of output from the first, non-filtering
update statement above:

{
    ...
    "start": "2023-06-13T05:24:50+09:30",
    "end": "2023-06-13T05:24:55+09:30",
    "data": {
        ...
        "amend": [
            {
                "endtime": "2023-06-13T05:24:55+09:30",
                "msg": "testing",
                "replace": [
                    {
                        "data": {
                            ...
                        },
                        "end": "2023-06-12T19:54:51Z",
                        "start": "2023-06-12T19:54:39Z"
                    }

Re: re-novice coming back to pgsql: porting an SQLite update statement to postgres

2024-09-14 Thread Alban Hertroys


> On 14 Sep 2024, at 10:33, Dan Kortschak  wrote:

(…)

> I'm still having difficulties with the second part, which is to update
> the contents of the amend array in the JSON.
> 
> So far I'm able to append the relevant details to the amend array, but
> I'm unable to select the correct elements from the $6 argument, which is
> in the form [{"start":,"end":,"data":}, ...]. The first update statement
> gives me broadly what I want, but it includes elements of the array that
> it shouldn't.

(…)

> If I filter on the start and end time, I end up with no element coming
> through at all and the "replace" field ends up null.
> 
> update
> events
> set
> datastr = jsonb_set(
> datastr,
> '{amend}',
> datastr->'amend' || jsonb_build_object(
> 'time', $2::TEXT,
> 'msg', $3::TEXT,
> 'replace', (
> select *
> from
> jsonb($6::TEXT) as replacement
> where
> (replacement->>'start')::TIMESTAMP WITH TIME ZONE < endtime and
> (replacement->>'end')::TIMESTAMP WITH TIME ZONE > starttime
> )
> )
> )
> where
> starttime < $5 and
> endtime > $4 and
> bucketrow = (
> select rowid from buckets where id = $1
> );

That’s because the replacement data is an array of objects, not a single object.

You need to iterate through the array elements to build your replacement data, 
something like what I do here with a select (because that’s way easier to play 
around with):

with dollar6 as (
    select jsonb($$[
        {
            "data": { "foo": 1, "bar": 2 },
            "end": "2023-06-12T19:54:51Z",
            "start": "2023-06-12T19:54:39Z"
        }
    ]$$::text) replacement
)
select *
from dollar6
cross join lateral jsonb_array_elements(replacement) r
where (r->>'start')::timestamptz <= current_timestamp;
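
If you then need the filtered set back as a single jsonb array for the
'replace' key, one way could look roughly like this (a sketch on top of the
same dollar6 CTE; the coalesce is an assumption on my part so that you get an
empty array instead of NULL when nothing matches):

with dollar6 as (
    select jsonb($$[
        {
            "data": { "foo": 1, "bar": 2 },
            "end": "2023-06-12T19:54:51Z",
            "start": "2023-06-12T19:54:39Z"
        }
    ]$$::text) replacement
)
select coalesce(jsonb_agg(r), '[]'::jsonb) as replace_value
from dollar6
cross join lateral jsonb_array_elements(replacement) r
where (r->>'start')::timestamptz <= current_timestamp;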


There are probably other ways to attack this problem; this is the one I came up
with.


Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.





Re: re-novice coming back to pgsql: porting an SQLite update statement to postgres

2024-09-14 Thread Dan Kortschak
On Sat, 2024-09-14 at 12:05 +0200, Alban Hertroys wrote:
> 
> That’s because the replacement data is an array of objects, not a
> single object.
> 
> You need to iterate through the array elements to build your
> replacement data, something like what I do here with a select
> (because that’s way easier to play around with):
> 
> with dollar6 as (
>     select jsonb($$[
>         {
>             "data": { "foo": 1, "bar": 2 },
>             "end": "2023-06-12T19:54:51Z",
>             "start": "2023-06-12T19:54:39Z"
>         }
>     ]$$::text) replacement
> )
> select *
> from dollar6
> cross join lateral jsonb_array_elements(replacement) r
> where (r->>'start')::timestamptz <= current_timestamp;


Thanks






Re: Manual query vs trigger during data load

2024-09-14 Thread Peter J. Holzer
On 2024-09-14 00:54:49 +0530, yudhi s wrote:
> As "thiemo" mentioned , it can be done as below method, but if we have
> multiple lookup tables to be populated for multiple columns , then , how can
> the INSERT query be tweaked to cater the need here?

Just use a join:
insert into target(val1, val2, val3, val4)
select :param1, cfgA.substA, :param3, cfgB.substB
from cfgA, cfgB
where cfgA.keyA = :param2 and cfgB.keyB = :param4

Or use a CTE per lookup which might be more readable:

with cA as ( select substA from cfgA where keyA = :param2 ),
 cB as ( select substB from cfgB where keyB = :param4 )
insert into target(val1, val2, val3, val4)
select :param1, cA.substA, :param3, cB.substB
from cA, cB

However, I agree with Rob here. It's probably better to do the
substitution in Java.

hp


-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"


signature.asc
Description: PGP signature


Re: update faster way

2024-09-14 Thread yudhi s
On Sat, 14 Sept, 2024, 1:09 pm Laurenz Albe, 
wrote:

> On Sat, 2024-09-14 at 08:43 +0530, yudhi s wrote:
> > We have to update a column value (from numbers like '123' to codes like 'abc',
> > by looking into a reference table) in a partitioned table with billions of
> > rows in it, with each partition having hundreds of millions of rows. As we
> > tested, for ~30 million rows it's taking ~20 minutes to update. So if we go
> > by this calculation, it's going to take days to update all the values. So my
> > question is
> >
> > 1) Is there any inbuilt way of running the update query in parallel
> >    (e.g. using parallel hints etc.) to make it run faster?
> > 2) Should we run each individual partition in a separate session (e.g. five
> >    partitions will have the updates done at the same time from 5 different
> >    sessions)? And will it have any locking effect, or can we just start the
> >    sessions and let them run without impacting our live transactions?
>
> Option 1 doesn't exist.
> Option 2 is possible, and you can even have more than one session working
> on a single partition.
>
> However, the strain on your system's resources and particularly the row
> locks will impair normal database work.
>
> Essentially, you can either take an extended down time or perform the
> updates
> in very small chunks with a very low "lock_timeout" over a very long period
> of time.  If any of the batches fails because of locking conflicts, it has
> to be retried.
>
> Investigate with EXPLAIN (ANALYZE) why the updates take that long.  It
> could
> be a lame disk, tons of (unnecessary?) indexes or triggers, but it might as
> well be the join with the lookup table, so perhaps there is room for
> improvement (more "work_mem" for a hash join?)
>


Thank you so much Laurenz.

We have mostly inserts/updates happening on the current day/live partition. So
considering that, if we run batch updates (with a batch size of 1000) from five
different sessions in parallel on different historical partitions, at any time
they will lock 5000 rows and then commit. And those rows will not collide with
each other. So do you think that approach can still cause locking issues? We
will ensure the update of the live partition occurs when we have the least
activity. So in that way we will not need extended downtime. Please correct me
if I am wrong.

I have never used lock_timeout though; in the above case, do we need
lock_timeout?

Regarding batch updates with a batch size of 1000, is there any method in
Postgres (say, like the FORALL statement in Oracle) which will do the batch
DML? Can you please guide me here on how we can do it in Postgres?

And yes, we will need to see what happens in the update using EXPLAIN ANALYZE.
I was trying to see if we can run EXPLAIN ANALYZE without doing the actual
update, but it seems that is not possible.





Re: update faster way

2024-09-14 Thread Peter J. Holzer
On 2024-09-14 16:10:15 +0530, yudhi s wrote:
> On Sat, 14 Sept, 2024, 1:09 pm Laurenz Albe,  wrote:
> 
> On Sat, 2024-09-14 at 08:43 +0530, yudhi s wrote:
> > We have to update a column value (from numbers like '123' to codes like 'abc',
> > by looking into a reference table) in a partitioned table with billions of
> > rows in it, with each partition having hundreds of millions of rows. As we
> > tested, for ~30 million rows it's taking ~20 minutes to update.
[...]
> > 2) Should we run each individual partition in a separate session (e.g. five
> >    partitions will have the updates done at the same time from 5 different
> >    sessions)? And will it have any locking effect, or can we just start the
> >    sessions and let them run without impacting our live transactions?
> 
> Option 2 is possible, and you can even have more than one session working
> on a single partition.
> 
> However, the strain on your system's resources and particularly the row
> locks will impair normal database work.
> 
> Essentially, you can either take an extended down time or perform the
> updates
> in very small chunks with a very low "lock_timeout" over a very long 
> period
> of time.  If any of the batches fails because of locking conflicts, it has
> to be retried.
> 
> Investigate with EXPLAIN (ANALYZE) why the updates take that long.  It
> could
> be a lame disk, tons of (unnecessary?) indexes or triggers, but it might 
> as
> well be the join with the lookup table, so perhaps there is room for
> improvement (more "work_mem" for a hash join?)
> 
> 
> 
> Thank you so much Laurenz.
> 
> We have mostly insert/update happen on current day/live partition. So
> considering that, if we will run batch updates(with batch size of 1000) from
> five different sessions in parallel on different historical partition, at any
> time they will lock 5000 rows and then commit.

If you are updating billions of rows in batches of 5000, that means you
are executing hundreds of thousands or millions of update statements.

Which in turn means that you want as little overhead as possible per
batch, which means finding those 5000 rows should be quick. Which brings
us back to Igor's question: do you have any indexes in place which speed
up finding those 5000 rows? (The primary key almost certainly won't help
with that.) EXPLAIN (ANALYZE) (as suggested by Laurenz) will certainly
help answer that question.
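
For example, a quick check could look like this (just a sketch, borrowing the
part_tab / part_date names that appear elsewhere in this thread; adjust to your
schema). A sequential scan over the whole partition in that plan would mean the
batches cannot be located cheaply:

EXPLAIN (ANALYZE, BUFFERS)
SELECT ctid
FROM part_tab
WHERE part_date >= DATE '2024-09-01' AND part_date < DATE '2024-09-05'
LIMIT 5000;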

> And also those rows will not collide with each other. So do you think
> that approach can anyway cause locking issues?

No, I don't think so. With a batch size that small I wouldn't expect
problems even on the live partition. But of course many busy parallel
sessions will put additional load on the system which may or may not be
noticeable by users (you might saturate the disks writing WAL entries
for example, which would slow down other sessions trying to commit).


> Regarding batch update with batch size of 1000, do we have any method exists 
> in
> postgres (say like forall statement in Oracle) which will do the batch dml. 
> Can
> you please guide me here, how we can do it in postgres.

Postgres offers several server side languages. As an Oracle admin you
will probably find PL/pgSQL most familiar. But you could also use Perl
or Python or several others. And of course you could use any
programming/scripting language you like on the client side.

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"


signature.asc
Description: PGP signature


Re: update faster way

2024-09-14 Thread Ron Johnson
On Fri, Sep 13, 2024 at 11:59 PM yudhi s 
wrote:

>
>>
>> Do you have any indexes?
>> If not - you should, if yes - what are they?
>>
>>
>>
> Yes, we have a primary key on this table which is on a UUID type column, and
> we also have other indexes on other timestamp columns. But how is this going
> to help, as we are going to update almost all the rows in the table?
>

But do you have an index on tab_part1.column1?

And how slow is the subquery?


Re: update faster way

2024-09-14 Thread yudhi s
On Sat, Sep 14, 2024 at 4:55 PM Peter J. Holzer  wrote:

>
> Which in turn means that you want as little overhead as possible per
> batch which means finding those 5000 rows should be quick. Which brings
> us back to Igor's question: Do you have any indexes in place which speed
> up finding those 5000 rows (the primary key almost certainly won't help
> with that). EXPLAIN (ANALYZE) (as suggested by Laurenz) will certainly
> help answering that question.
>
> > And also those rows will not collide with each other. So do you think
> > that approach can anyway cause locking issues?
>
> No, I don't think so. With a batch size that small I wouldn't expect
> problems even on the live partition. But of course many busy parallel
> sessions will put additional load on the system which may or may not be
> noticeable by users (you might saturate the disks writing WAL entries
> for example, which would slow down other sessions trying to commit).
>
>
> > Regarding batch update with batch size of 1000, do we have any method
> exists in
> > postgres (say like forall statement in Oracle) which will do the batch
> dml. Can
> > you please guide me here, how we can do it in postgres.
>
> Postgres offers several server side languages. As an Oracle admin you
> will probably find PL/pgSQL most familiar. But you could also use Perl
> or Python or several others. And of course you could use any
> programming/scripting language you like on the client side.
>
>
When you said "(the primary key almost certainly won't help with that)", I am
trying to understand why that is so.
I was thinking of using that column as an incrementing filter and driving the
eligible rows based on that filter. And if it had been a sequence, I think it
would have helped, but in this case it's a UUID, so I may not be able to do the
batch DML using that as the filter criteria. But in that case, will it be fine
to drive the update based on ctid, something like below? Each session will have
a range of five days of data, i.e. five partitions, and will execute a query
something like the one below, which will update in batches of 10K and then
commit. Is this fine? Or is there some better way of doing batch DML in
Postgres PL/pgSQL?

DO $$
DECLARE
    l_rowid_array ctid[];
    l_ctid ctid;
    l_array_size INT := 10000;
    l_processed INT := 0;
BEGIN
    FOR l_cnt IN 0..(SELECT COUNT(*) FROM part_tab
                     WHERE part_date > '1-sep-2024' AND part_date < '5-sep-2024'
                    ) / l_array_size LOOP
        l_rowid_array := ARRAY(
            SELECT ctid
            FROM part_tab
            WHERE part_date > '1-sep-2024' AND part_date < '5-sep-2024'
            LIMIT l_array_size OFFSET l_cnt * l_array_size
        );

        FOREACH l_ctid IN ARRAY l_rowid_array LOOP
            UPDATE part_tab
            SET column1 = reftab.code
            FROM reference_tab reftab
            WHERE part_tab.column1 = reftab.column1
              AND part_tab.ctid = l_ctid;
            l_processed := l_processed + 1;
        END LOOP;

        COMMIT;
    END LOOP;
END $$;


Re: Manual query vs trigger during data load

2024-09-14 Thread yudhi s
On Sat, Sep 14, 2024 at 4:17 PM Peter J. Holzer  wrote:

> On 2024-09-14 00:54:49 +0530, yudhi s wrote:
> > As "thiemo" mentioned , it can be done as below method, but if we have
> > multiple lookup tables to be populated for multiple columns , then , how
> can
> > the INSERT query be tweaked to cater the need here?
>
> Just use a join:
> insert into target(val1, val2, val3, val4)
> select :param1, cfgA.substA, :param3, cfgB.substB
> from cfgA, cfgB
> where cfgA.keyA = :param2 and cfgB.keyB = :param4
>
> Or use a CTE per lookup which might be more readable:
>
> with cA as ( select substA from cfgA where keyA = :param2 ),
>  cB as ( select substB from cfgB where keyB = :param4 )
> insert into target(val1, val2, val3, val4)
> select :param1, cA.substA, :param3, cB.substB
> from cA, cB
>
>
Thank you. I will try these options.
Also, we are trying something like the statement below, which keeps each lookup
in a scalar subquery for its specific target column. That looks simpler to read
than putting the reference tables in the FROM clause, which may make the code
harder to follow, and I am not sure whether it could cause a cartesian product.
Please correct me if I'm wrong.

INSERT INTO tab_part1 (column1, column2, column3, column4, column5, part_date)
VALUES (:v_col1,
        (SELECT lookup_value FROM reference_tab1 WHERE lookup_key = :v_col2),
        :v_col3, :v_col4, :v_col5, CURRENT_DATE);


Failing to allocate memory when I think it shouldn't

2024-09-14 Thread Thomas Ziegler
Hello Community!

I have had my database killed by the kernel OOM killer. After that I turned
off memory overcommitting, and that is where things got weird.

I have PostgreSQL 16.2 running on Debian Bookworm. The machine has 64GB of
RAM and runs in a VM in an OpenStack environment.
I have `shared_buffers` at `16000MB`, `work_mem` at `80MB`, `temp_buffers` at
`8MB`, `max_connections` at `300` and `maintenance_work_mem` at `1GB`.
So all in all, I get to roughly 42GB of maximum memory usage
(`16000+(80+8)*300=42400`).

But now I have connections failing because they can't allocate enough memory. I
thought they should start creating temporary files in that case (and I tested
this with trivial examples in psql), but the connections from my applications
sometimes fail. I suspect that the queries are so large that the execution
plans etc. are too large to fit into the 80MB of RAM. This is the "simpler" of
my two problems, but it would be nice if somebody could tell me whether my
suspicion makes sense.

The harder issue is that in multiple instances those memory issues actually
led to the database "crashing" and having to recover from the redo log. To be
precise, the master process doesn't crash and the database automatically starts
the recovery (in contrast to the OOM kill, where I had to restart the database
manually).
I can't get to the bottom of why that happens. At the point in time when it
happens, there is more than enough usable memory on the system (>17GB), and I
don't understand how it can kill the whole database if a worker process breaks
because it can't get more RAM. Also, why wasn't it able to get more RAM? The
kernel should have been more than happy to give it more, as there was so much
usable RAM available. (By "usable" I mean free + cache - shared.)

Here are logs of one of those instances:

TopMemoryContext: 166384 total in 7 blocks; 33128 free (10 chunks); 133256 used
  smgr relation table: 16384 total in 2 blocks; 4640 free (2 chunks); 11744 used
  PgStat Shared Ref Hash: 7216 total in 2 blocks; 688 free (0 chunks); 6528 used
  PgStat Shared Ref: 1024 total in 1 blocks; 328 free (0 chunks); 696 used
  PgStat Pending: 4096 total in 3 blocks; 2168 free (6 chunks); 1928 used
  TopTransactionContext: 8192 total in 1 blocks; 4536 free (9 chunks); 3656 used
  TransactionAbortContext: 32768 total in 1 blocks; 32504 free (0 chunks); 264 
used
  Portal hash: 8192 total in 1 blocks; 592 free (0 chunks); 7600 used
  TopPortalContext: 8192 total in 1 blocks; 7928 free (0 chunks); 264 used
  Relcache by OID: 16384 total in 2 blocks; 7712 free (2 chunks); 8672 used
  CacheMemoryContext: 262144 total in 6 blocks; 160 free (2 chunks); 261984 used
index info: 2048 total in 2 blocks; 760 free (2 chunks); 1288 used: 
pg_rewrite_rel_rulename_index
[ ... ]
  WAL record construction: 50200 total in 2 blocks; 6376 free (0 chunks); 43824 
used
  PrivateRefCount: 8192 total in 1 blocks; 2648 free (0 chunks); 5544 used
  MdSmgr: 8192 total in 1 blocks; 7928 free (0 chunks); 264 used
  LOCALLOCK hash: 8192 total in 1 blocks; 592 free (0 chunks); 7600 used
  GUCMemoryContext: 24576 total in 2 blocks; 5288 free (5 chunks); 19288 used
GUC hash table: 32768 total in 3 blocks; 12704 free (5 chunks); 20064 used
  Timezones: 104112 total in 2 blocks; 2648 free (0 chunks); 101464 used
  Postmaster: 8192 total in 1 blocks; 6496 free (10 chunks); 1696 used
ident parser context: 1024 total in 1 blocks; 760 free (0 chunks); 264 used
hba parser context: 17408 total in 5 blocks; 6944 free (7 chunks); 10464 
used
  ErrorContext: 8192 total in 1 blocks; 7928 free (6 chunks); 264 used
Grand total: 889064 bytes in 124 blocks; 180840 free (113 chunks); 708224 used
2024-09-12 05:18:36.064 UTC [3808080] _@_ FATAL:  out of memory
2024-09-12 05:18:36.064 UTC [3808080] _@_ DETAIL:  Failed on request of size 
190 in memory context "CacheMemoryContext".
TopMemoryContext: 166384 total in 7 blocks; 33128 free (10 chunks); 133256 used
[ ... ]
2024-09-12 05:18:36.065 UTC [3808075] _@_ FATAL:  out of memory
2024-09-12 05:18:36.065 UTC [3808075] _@_ DETAIL:  Failed on request of size 
190 in memory context "CacheMemoryContext".
2024-09-12 05:18:36.065 UTC [3808076] ERROR:  out of memory
2024-09-12 05:18:36.065 UTC [3808076] DETAIL:  Failed on request of size 32800 
in memory context "HashBatchContext".
2024-09-12 05:18:36.065 UTC [3805724] _@_ ERROR:  out of memory
2024-09-12 05:18:36.065 UTC [3805724] _@_ DETAIL:  Failed on request of size 
32800 in memory context "HashBatchContext".
2024-09-12 05:18:36.065 UTC [3805724] _@_ CONTEXT:  parallel worker
2024-09-12 05:18:36.065 UTC [3808078] _@_ FATAL:  out of memory
2024-09-12 05:18:36.065 UTC [3808078] _@_ DETAIL:  Failed on request of size 
190 in memory context "CacheMemoryContext".
2024-09-12 05:18:36.065 UTC [3808081] _@_ FATAL:  out of memory
2024-09-12 05:18:36.065 UTC [3808081] _@_ DETAIL:  Failed on request of size 
40992 in memory context "TopMemoryConte

Reg: Size difference

2024-09-14 Thread Vinay Oli
Hi Team

I have been using PostgreSQL for the past 6 years. PostgreSQL has
significantly impacted my life, providing me with great opportunities for
knowledge and self-development.

I'm currently facing a strange issue with PostgreSQL 15.0. I have a
primary-standby setup that is in sync, with a replication slot in place.
There are 18 databases, and one of the databases on the primary side is 104
GB, while the same database on the standby side is 216 GB. Both are in sync
with zero delay.

Could this be a bug? If so, has it been resolved in newer releases? If it
is not a bug, how can this issue be fixed? Is there a solution or any
supporting documentation available?

WAL and log files are being rotated properly. The issue is with a database
named services_mfs. On the primary cluster, the services_mfs database is
104GB, but on the standby cluster, it is 216GB, even though both clusters
are in sync. The standby database is only used in case of a crash, which is
managed by a Patroni cluster with etcd.



Thanks,

Vinay Kumar


Re: update faster way

2024-09-14 Thread Laurenz Albe
On Sat, 2024-09-14 at 16:10 +0530, yudhi s wrote:
> > However, the strain on your system's resources and particularly the row
> > locks will impair normal database work.
> > 
> > Essentially, you can either take an extended down time or perform the 
> > updates
> > in very small chunks with a very low "lock_timeout" over a very long period
> > of time.  If any of the batches fails because of locking conflicts, it has
> > to be retried.
> > 
> > Investigate with EXPLAIN (ANALYZE) why the updates take that long.  It could
> > be a lame disk, tons of (unnecessary?) indexes or triggers, but it might as
> > well be the join with the lookup table, so perhaps there is room for
> > improvement (more "work_mem" for a hash join?)
> 
> We have mostly insert/update happen on current day/live partition. So
> considering that, if we will run batch updates(with batch size of 1000) from
> five different sessions in parallel on different historical partition, at any
> time they will lock 5000 rows and then commit. And also those rows will not
> collide with each other. So do you think that approach can anyway cause 
> locking
> issues?

The updates won't lock with each other.  I thought that other database activity
might modify rows in these partitions.  If that is not the case, you don't need
to worry about locks.

In that case I would also choose a much higher batch size.

You should make sure to back off every now and then and VACUUM the partition,
so that you avoid excessive table bloat.

> We will ensure the update of live partition occurs when we have least 
> activity.
> So in that way we will not need extended down time. Please correct me if 
> wrong.

That sounds right.

> Never used lock_timeout though, but in above case do we need lock_timeout?

It can be useful if your updating process is blocked by a lock from the
application.  Setting the parameter to a low value will keep your update
from hanging for a long time and will throw an error instead.
Erroring out early reduces the danger of a deadlock.

> Regarding batch update with batch size of 1000, do we have any method exists
> in postgres (say like forall statement in Oracle) which will do the batch dml.
> Can you please guide me here, how we can do it in postgres.

I would certainly not perform the update row for row in PL/pgSQL code.
Perhaps something like this:

  DO
  $$DECLARE
     i bigint;
  BEGIN
     FOR i IN 1..1000000 BY 100000 LOOP
        UPDATE tab SET ...
        WHERE id >= i AND id < i + 100000;

        COMMIT;
     END LOOP;
  END;$$;

  VACUUM tab;

Then repeat for the next million rows, and so on.

> And yes will need to see what happens in the update using explain analyze.
> And I was trying to see, if we can run explain analyze without doing
> actual update , but seems that is not possible. 

You can do it in a transaction and roll the transaction back.
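
For example (a minimal sketch, reusing the table names from the original post):

  BEGIN;

  EXPLAIN (ANALYZE, BUFFERS)
  UPDATE tab_part1
  SET column1 = reftab.code
  FROM reference_tab reftab
  WHERE tab_part1.column1 = reftab.column1;

  ROLLBACK;

The plan and timings are reported, but the rollback discards the actual changes.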

Yours,
Laurenz Albe




Re: Reg: Size difference

2024-09-14 Thread Adrian Klaver

On 9/14/24 10:19, Vinay Oli wrote:

Hi Team

I have been using PostgreSQL for the past 6 years. PostgreSQL has 
significantly impacted my life, providing me with great opportunities 
for knowledge and self-development.


I'm currently facing a strange issue with PostgreSQL 15.0. I have a 
primary-standby setup that is in sync, with a replication slot in place. 
There are 18 databases, and one of the databases on the primary side is 
104 GB, while the same database on the standby side is 216 GB. Both are 
in sync with zero delay.


How are you measuring the size?

If you are measuring on the file system, what is the size of the base/
and pg_wal/ sub-directories in the $PGDATA directory?




Could this be a bug? If so, has it been resolved in newer releases? If 
it is not a bug, how can this issue be fixed? Is there a solution or any 
supporting documentation available?


WAL and log files are being rotated properly. The issue is with a 
database named services_mfs. On the primary cluster, the services_mfs 
database is 104GB, but on the standby cluster, it is 216GB, even though 
both cluster are in sync. The standby database is only used in case of a 
crash, which is managed by a Patroni cluster with etcd.




Thanks,

Vinay Kumar





--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Reg: Size difference

2024-09-14 Thread Laurenz Albe
On Sat, 2024-09-14 at 22:49 +0530, Vinay Oli wrote:
> I'm currently facing a strange issue with PostgreSQL 15.0. I have a
> primary-standby setup that is in sync, with a replication slot in place.
> There are 18 databases, and one of the databases on the primary side
> is 104 GB, while the same database on the standby side is 216 GB.
> Both are in sync with zero delay.

Try and identify if any of the database objects are different in size.
That shouldn't happen.

If all the database objects have the same size on both systems, the
explanation is likely some unrelated crap sitting in the data directory
on the standby.  Try to identify files that exist on one system, but
not on the other.
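
A per-object size listing that you can run on both servers (connected to the
affected database) and then compare could be a starting point - just a sketch:

  SELECT n.nspname, c.relname, c.relkind,
         pg_total_relation_size(c.oid) AS total_bytes
  FROM pg_class c
  JOIN pg_namespace n ON n.oid = c.relnamespace
  WHERE c.relkind IN ('r', 'i', 'm', 't')
  ORDER BY pg_total_relation_size(c.oid) DESC
  LIMIT 50;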

Yours,
Laurenz Albe




Re: Reg: Size difference

2024-09-14 Thread Ron Johnson
On Sat, Sep 14, 2024 at 1:19 PM Vinay Oli  wrote:

> Hi Team
>
> I have been using PostgreSQL for the past 6 years. PostgreSQL has
> significantly impacted my life, providing me with great opportunities for
> knowledge and self-development.
>
> I'm currently facing a strange issue with PostgreSQL 15.0.
>
That's two years old, and eight patch updates.

> Could this be a bug? If so, has it been resolved in newer releases?
>
https://www.postgresql.org/docs/release/

-- 
Death to ... butter sauce.
Don't boil me.
I'm still alive!
Iraq crustacean!


Re: Reg: Size difference

2024-09-14 Thread Vinay Oli
Hi

I've checked the database size with the \l+ meta-command, and I also checked at
the file system level with du -sh on the 49181 folder. 49181 is the db OID.

The pg_wal directory is the same 40GB on the primary and standby servers.

All the directories are the same size; only the 49181 folder (OID) has a
different size.



Thanks,
Vinay kumar

On Sat, Sep 14, 2024, 10:59 PM Adrian Klaver 
wrote:

> On 9/14/24 10:19, Vinay Oli wrote:
> > Hi Team
> >
> > I have been using PostgreSQL for the past 6 years. PostgreSQL has
> > significantly impacted my life, providing me with great opportunities
> > for knowledge and self-development.
> >
> > I'm currently facing a strange issue with PostgreSQL 15.0. I have a
> > primary-standby setup that is in sync, with a replication slot in place.
> > There are 18 databases, and one of the databases on the primary side is
> > 104 GB, while the same database on the standby side is 216 GB. Both are
> > in sync with zero delay.
>
> How are you measuring the size?
>
> If you are measuring on the files system what is the size of the base/
> and pg_wal/ sub-directories in the %PG_DATA directoty?
>
> >
> > Could this be a bug? If so, has it been resolved in newer releases? If
> > it is not a bug, how can this issue be fixed? Is there a solution or any
> > supporting documentation available?
> >
> > WAL and log files are being rotated properly. The issue is with a
> > database named services_mfs. On the primary cluster, the services_mfs
> > database is 104GB, but on the standby cluster, it is 216GB, even though
> > both cluster are in sync. The standby database is only used in case of a
> > crash, which is managed by a Patroni cluster with etcd.
> >
> >
> >
> > Thanks,
> >
> > Vinay Kumar
> >
> >
> >
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>
>


Re: Reg: Size difference

2024-09-14 Thread Vinay Oli
Hi ,


I've verified there's no unrelated crap sitting there. I've checked the
database size with the \l+ meta-command, and I also checked at the file system
level with du -sh on the 49181 folder. 49181 is the db OID.



Thanks,
Vinay kumar

On Sat, Sep 14, 2024, 11:00 PM Laurenz Albe 
wrote:

> On Sat, 2024-09-14 at 22:49 +0530, Vinay Oli wrote:
> > I'm currently facing a strange issue with PostgreSQL 15.0. I have a
> > primary-standby setup that is in sync, with a replication slot in place.
> > There are 18 databases, and one of the databases on the primary side
> > is 104 GB, while the same database on the standby side is 216 GB.
> > Both are in sync with zero delay.
>
> Try and identify if any of the database objects are different in size.
> That shouldn't happen.
>
> If all the database objects have the same size on both systems, the
> explanation is likely some unrelated crap sitting in the data directory
> on the standby.  Try to identify files that exist on one system, but
> not on the other.
>
> Yours,
> Laurenz Albe
>


Re: Failing to allocate memory when I think it shouldn't

2024-09-14 Thread Christoph Moench-Tegeder
Hi,

## Thomas Ziegler (thomas.zieg...@holmsecurity.com):

There's a lot of information missing here. Let's start from the top.

> I have had my database killed by the kernel oom-killer. After that I
> set turned off memory over-committing and that is where things got weird.

What exactly did you set? When playing with vm.overcommit, did you
understand "Committed Address Space" and the workings of the
overcommit accounting? This is the document:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/Documentation/mm/overcommit-accounting.rst
Hint: when setting overcommit_memory=2 you might end up with way
less available address space than you thought you would. Also keep
an eye on /proc/meminfo - it's sometimes hard to estimate off the
cuff what's in memory and how it's mapped. (Also, is there anything
else on that machine which might hog memory?)

> I have `shared_buffers` at `16000MB`, `work_mem` at `80MB`, `temp_buffers`
> at `8MB`, `max_connections` at `300` and `maintenance_work_mem` at `1GB`.
> So all in all, I get to roughly 42GB of max memory usage
> (`16000+(80+8)*300=42400`).

That work_mem is "per query operation"; you can have multiple of
those in a single query:
https://www.postgresql.org/docs/current/runtime-config-resource.html#GUC-WORK-MEM
Also, there's hash_mem_multiplier:
https://www.postgresql.org/docs/current/runtime-config-resource.html#GUC-HASH-MEM-MULTIPLIER
Then I've seen that your query uses parallel workers; remember that
each worker requests memory.
Next, maintenance_work_mem is a per-process limit, and depending on
what's running at any given time, that can add up.
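
A quick way to see what those knobs are set to on the affected server (a
sketch, nothing more):

SELECT name, setting, unit
FROM pg_settings
WHERE name IN ('shared_buffers', 'work_mem', 'hash_mem_multiplier',
               'maintenance_work_mem', 'max_parallel_workers_per_gather',
               'max_connections');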

Finally, there's this:
> 2024-09-12 05:18:36.073 UTC [1932776] LOG:  background worker "parallel 
> worker" (PID 3808076) exited with exit code 1
> terminate called after throwing an instance of 'std::bad_alloc'
>   what():  std::bad_alloc
> 2024-09-12 05:18:36.083 UTC [1932776] LOG:  background worker "parallel 
> worker" (PID 3808077) was terminated by signal 6: Aborted

That "std::bad_alloc" sounds a lot like C++ and not like the C our
database is written in. My first suspicion would be that you're using
LLVM-JIT (unless you have other - maybe even your own - C++ extensions
in the database?) and that in itself can use a good chunk of memory.
And it looks like that exception bubbled up as a signal 6 (SIGABRT),
which made the process terminate immediately without any cleanup,
and after that the server has no choice other than to crash-restart.

I recommend starting with understanding the actual memory limits
as set by your configuration (personally I believe that memory
overcommit is less evil than some people think). Have a close look
at /proc/meminfo, and if possible disable JIT and check whether it
changes anything. Also, if possible, try starting with only a few
active connections and increase the load carefully once a steady
state (in terms of memory usage) has been reached.
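
If you want to try the JIT angle, that part at least is cheap to test (a
sketch; jit can also be set per session or per database):

SHOW jit;
ALTER SYSTEM SET jit = off;
SELECT pg_reload_conf();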

Regards,
Christoph

-- 
Spare Space




Re: Reg: Size difference

2024-09-14 Thread Adrian Klaver

On 9/14/24 11:10, Vinay Oli wrote:

Hi

  I've checked the database size by meta command \l+ and even I checked 
from file system level du -sh 49181 folder. 49181 is the db oid.


Minimal information is not going to solve this issue.

Provide the exact output of commands from:

1) The Primary database.

2) The Standby database.

3) Per post from Laurenz Albe, any objects in 49181/ that differ in size 
between the primary and standby.


Also:

1) Describe the exact type of replication you are using?

2) The replication settings you are using.

3) The exact Postgres versions on both sides.

4) The OS and version on both sides.

5) Have you checked the Postgres logs on the standby side to see if the 
server is showing any relevant errors?





Pgwal directory is same 40gb at primary and standby servers.

All the directories are of same size,  49181 folder (oid)  is only 
having different size.




Thanks,
Vinay kumar

On Sat, Sep 14, 2024, 10:59 PM Adrian Klaver > wrote:


On 9/14/24 10:19, Vinay Oli wrote:
 > Hi Team
 >
 > I have been using PostgreSQL for the past 6 years. PostgreSQL has
 > significantly impacted my life, providing me with great
opportunities
 > for knowledge and self-development.
 >
 > I'm currently facing a strange issue with PostgreSQL 15.0. I have a
 > primary-standby setup that is in sync, with a replication slot in
place.
 > There are 18 databases, and one of the databases on the primary
side is
 > 104 GB, while the same database on the standby side is 216 GB.
Both are
 > in sync with zero delay.

How are you measuring the size?

If you are measuring on the files system what is the size of the base/
and pg_wal/ sub-directories in the %PG_DATA directoty?

 >
 > Could this be a bug? If so, has it been resolved in newer
releases? If
 > it is not a bug, how can this issue be fixed? Is there a solution
or any
 > supporting documentation available?
 >
 > WAL and log files are being rotated properly. The issue is with a
 > database named services_mfs. On the primary cluster, the
services_mfs
 > database is 104GB, but on the standby cluster, it is 216GB, even
though
 > both cluster are in sync. The standby database is only used in
case of a
 > crash, which is managed by a Patroni cluster with etcd.
 >
 >
 >
 > Thanks,
 >
 > Vinay Kumar
 >
 >
 >

-- 
Adrian Klaver

adrian.kla...@aklaver.com 



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: update faster way

2024-09-14 Thread Juan Rodrigo Alejandro Burgos Mella
The only way I see it as plausible to use a subquery, both in the query and in
setting the value, is if the relationship is one-to-one and there is an index
that matches the predicate:

UPDATE table1 t1
SET column_value = (SELECT  FROM table2 t2
                    WHERE t2.column_relation = t1.column_relation)
WHERE (column_relation) IN (SELECT column_relation FROM table2)

PS: table2 should have an index on column_relation

Regards,
JRBM

On Sat, 14 Sept 2024 at 0:22, yudhi s ()
wrote:

> Hello,
> We have to update a column value (from numbers like '123' to codes like
> 'abc', by looking into a reference table) in a partitioned table with
> billions of rows in it, with each partition having hundreds of millions of
> rows. As we tested, for ~30 million rows it's taking ~20 minutes to update.
> So if we go by this calculation, it's going to take days to update all the
> values. So my question is
>
> 1) Is there any inbuilt way of running the update query in parallel
> (e.g. using parallel hints etc.) to make it run faster?
> 2) Should we run each individual partition in a separate session (e.g.
> five partitions will have the updates done at the same time from 5 different
> sessions)? And will it have any locking effect, or can we just start the
> sessions and let them run without impacting our live transactions?
>
> UPDATE tab_part1
> SET column1 = reftab.code
> FROM reference_tab reftab
> WHERE tab_part1.column1 = reftab.column1;
>
> Regards
> Yudhi
>