Replication lag due to lagging restart_lsn
Hello,

I wish to use logical replication in Postgres to capture transactions as CDC and forward them to a custom sink. To understand the overhead of the logical replication workflow I created a toy subscriber using the V3PGReplicationStream that acknowledges LSNs after every 16k reads by calling setAppliedLSN, setFlushedLSN, and forceUpdateStatus. The toy subscriber is set up as a subscriber to a master Postgres instance that publishes changes through a Publication. I then run a write-heavy workload on this setup that generates transaction logs at approximately 235 MB/s. Postgres runs on a beefy machine with a 10+ GBps network link between Postgres and the toy subscriber.

My expectation with this setup was that the replication lag on the master would be minimal, since the subscriber acks the LSN almost immediately. However, I observe the replication lag increasing continuously for the duration of the test. Statistics in pg_replication_slots show that restart_lsn lags significantly behind confirmed_flush_lsn. Cursory reading on restart_lsn suggests that a growing gap between restart_lsn and confirmed_flush_lsn means the server is still retaining WAL it may need for the slot: that disk space can only be reclaimed once restart_lsn advances to catch up with confirmed_flush_lsn.

With that context, I am looking for answers to two questions:
1. What work needs to happen in the database to advance restart_lsn to confirmed_flush_lsn?
2. What is the recommendation on tuning the database to improve the replication lag in such scenarios?

Regards,
Satyam
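For reference, the gap described above can be watched from the primary with a query along these lines (nothing beyond an existing logical slot is assumed):

    SELECT slot_name,
           restart_lsn,
           confirmed_flush_lsn,
           pg_size_pretty(pg_wal_lsn_diff(confirmed_flush_lsn, restart_lsn)) AS restart_gap,
           pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn)) AS unconfirmed_wal
    FROM pg_replication_slots
    WHERE slot_type = 'logical';

A steadily growing restart_gap while unconfirmed_wal stays small matches the situation described: the subscriber confirms quickly, but the slot's restart point is not moving forward.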
CPU hogged by concurrent SELECT..FOR UPDATE SKIP LOCKED
Using PostgreSQL v12, Linux [Ubuntu 16.04 LTS].

I have a system which implements a message queue with the basic pattern that a process selects a group of rows, for example 250, for processing via SELECT .. LIMIT 250 FOR UPDATE SKIP LOCKED.

When there is a small number of concurrent connections processing the queue, this works as expected and connections quickly obtain a unique block of 250 rows for processing. However, as I scale up the number of concurrent connections, I see a spike in CPU (to 100% across 80 cores) when the SELECT FOR UPDATE SKIP LOCKED executes, and the select processes wait for multiple minutes (10-20 minutes) before completing. My use case requires around 256 concurrent processors for the queue, but I've been unable to scale beyond 128 without everything grinding to a halt.

The queue table itself fits in RAM (with 2M hugepages), and during the wait all the performance counters drop to almost 0 - no disk read or write (semi-expected since the table fits in memory), 100% buffer hit rate in pg_top, and rows read around 100/s, which is much lower than expected.

After processes complete the select and the number of waiting selects starts to fall, CPU load falls, then suddenly the remaining processes all complete within a few seconds and things perform normally until the next time a group of SELECT FOR UPDATE statements bunch together, at which point the cycle repeats.

I found that performing extremely frequent vacuum analyze (every 30 minutes) helps a little, but not enough; the problem is still very apparent. I've exhausted all the performance tuning and analysis suggestions I can find that seem even slightly relevant but cannot get this cracked.

Is anyone on the list able to help with suggestions of what I can do to track down why this CPU hogging happens, as it does seem to be the root of the problem?

Thanks in advance,
Jim
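For context, the consumer side of the pattern described above looks roughly like this (table and column names are placeholders, not the actual schema from this thread):

    BEGIN;
    SELECT q_id, payload                 -- placeholder columns
    FROM   queue                         -- placeholder table name
    ORDER  BY q_id
    LIMIT  250
    FOR UPDATE SKIP LOCKED;
    -- ... application processes the claimed rows and deletes them ...
    COMMIT;

SKIP LOCKED makes each worker step over rows already row-locked by another transaction, so concurrent workers should each end up with a disjoint batch.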
Re: CPU hogged by concurrent SELECT..FOR UPDATE SKIP LOCKED
Message queue...

Are rows deleted? Are they updated once or many times? Have you adjusted fillfactor on the table or indexes? How many rows are in the table currently, or on average? Is there any ordering to which rows you update?

It seems likely that one of the experts/code contributors will chime in and explain how locking that many rows across that many concurrent connections overruns some resource, so that you escalate to a table lock instead of actually locking only the 250 rows you wanted.

On the other hand, you say 80 cores and you are trying to push the number of concurrent processes well beyond that without (much) disk I/O being involved. I wouldn't expect that to perform well.

Is there a chance to modify the code to let each process lock 1000 rows at a time and be content with 64 concurrent processes?
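If fillfactor turns out to be worth testing, the change itself is small; for example (object names here are hypothetical):

    ALTER TABLE queue SET (fillfactor = 70);          -- leaves free space on heap pages
    ALTER INDEX queue_q_id_idx SET (fillfactor = 70);
    -- applies to newly written pages; VACUUM FULL / REINDEX rewrites existing data with the new setting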
Re: CPU hogged by concurrent SELECT..FOR UPDATE SKIP LOCKED
Thank you for the quick response.

No adjustments of fill factors. Hadn't thought of that - I'll investigate and try some options to see if I can measure an effect.

There is some ordering on the select [ORDER BY q_id] so each block of 250 is sequential-ish queue items; I just need them more or less in the order they were queued, so as near FIFO as possible without being totally strict on absolute sequential order.

The table has around 192K rows. As a row is processed it is deleted as part of the transaction, with a commit at the end after all 250 are processed [partitioned table; the state changes and the row migrates to a different partition], and as the queue drops to 64K rows it is topped up with 128K rows at a time.

I've tuned the LIMIT value both up and down. As I move the limit up, the problem becomes substantially worse: 300 swamps it and the selects take > 1 hour to complete; at 600 they all lock everything up and processing stops. I did try 1,000 but it basically resulted in nothing being processed.

Fewer processes do not give the throughput required, because the queue sends data elsewhere which has a long round-trip time but does permit over 1K concurrent connections as their workaround for throughput. I'm stuck having to scale up my concurrent processes to compensate for the long processing time of an individual queue item.
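A purely illustrative sketch of the layout described above (all object names invented, details certainly differ):

    CREATE TABLE queue (
        q_id    bigint NOT NULL,
        state   text   NOT NULL,
        payload jsonb
    ) PARTITION BY LIST (state);

    CREATE TABLE queue_pending PARTITION OF queue FOR VALUES IN ('pending');
    CREATE TABLE queue_done    PARTITION OF queue FOR VALUES IN ('done');
    -- an UPDATE that changes state moves the row to the other partition (supported since PG 11)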
Re: CPU hogged by concurrent SELECT..FOR UPDATE SKIP LOCKED
Did you try using NOWAIT instead of SKIP LOCKED to see if the behavior still shows up?
Re: CPU hogged by concurrent SELECT..FOR UPDATE SKIP LOCKED
On Tue, Aug 18, 2020 at 6:22 PM Jim Jarvie wrote:
> There is some ordering on the select [ORDER BY q_id] so each block of 250 is sequential-ish queue items; I just need them more or less in the order they were queued so as near FIFO as possible without being totally strict on absolute sequential order.

How long does each process take in total? How strict does that FIFO really need to be when you are already doing SKIP LOCKED anyway?

> Table has around 192K rows, as a row is processed it is deleted as part of the transaction with a commit at the end after all 250 are processed [partitioned table, state changes and it migrates to a different partition] and as the queue drops to 64K it is added to with 128K rows at a time.

Can you expound on the partitioning? Are all consumers of the queue always hitting one active partition, and any time a row is processed, does it always move to one of many "archived"-type partitions?

> Less processes does not give the throughput required because the queue sends data elsewhere which has a long round trip time

Is that done via FDW or otherwise within the same database transaction? Are you connecting some queue consumer application code to Postgres, doing the SELECT FOR UPDATE, doing work on some remote system that is slow, and then coming back and committing the DB work?

By the way, top-posting is discouraged here and partial quotes with interspersed comments are common practice.
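One variant that sometimes comes up for the slow-remote-work case (not something suggested in this thread, purely a sketch with invented names): claim the rows in a short transaction so no locks or open transaction are held during the long round trip.

    BEGIN;
    UPDATE queue SET claimed_by = 'worker-17'          -- hypothetical claim column / worker id
    WHERE  q_id IN (
        SELECT q_id
        FROM   queue
        WHERE  claimed_by IS NULL
        ORDER  BY q_id
        LIMIT  250
        FOR UPDATE SKIP LOCKED
    )
    RETURNING q_id;
    COMMIT;
    -- do the slow remote work with no transaction open,
    -- then DELETE ... WHERE claimed_by = 'worker-17' in a second short transaction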
Re: CPU hogged by concurrent SELECT..FOR UPDATE SKIP LOCKED
Also, have you checked how bloated your indexes are getting? Do you run default autovacuum settings? Did you pick up the new default autovacuum_vacuum_cost_delay of 2ms? With a destructive queue, it would be very important to ensure autovacuum is keeping up with the churn. Share your basic table structure and indexes, sanitized if need be.
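To make autovacuum keep up with a destructive queue, per-table settings are the usual lever; for example (names and values purely illustrative, and they need to go on the leaf partitions, since autovacuum processes those rather than the partitioned parent):

    ALTER TABLE queue_pending SET (
        autovacuum_vacuum_scale_factor = 0.0,
        autovacuum_vacuum_threshold    = 5000,
        autovacuum_vacuum_cost_delay   = 0
    );

With scale_factor at 0, vacuum triggers after a fixed number of dead rows (the threshold) rather than a percentage of the table.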
Re: CPU hogged by concurrent SELECT..FOR UPDATE SKIP LOCKED
On Tue, 2020-08-18 at 19:52 -0400, Jim Jarvie wrote:
> Is anyone on the list able to help with suggestions of what I can do to track
> why this CPU hogging happens as this does seem to be the root of the problem?

You should
- check with "pgstattuple" if the table is bloated.
- use "perf" to see where the CPU time is spent.
- look at "pg_stat_activity" for wait events (unlikely if the CPU is busy).

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com
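For reference, the first and third of those checks might look like the following (the table name is a stand-in):

    CREATE EXTENSION IF NOT EXISTS pgstattuple;
    SELECT * FROM pgstattuple('queue_pending');   -- dead_tuple_percent / free_percent indicate bloat

    SELECT pid, state, wait_event_type, wait_event, query
    FROM   pg_stat_activity
    WHERE  state <> 'idle';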
