Re: pg_checksums?
Hi Peter,

> On Oct 29, 2023, at 02:43, Peter J. Holzer wrote:
> I don't think so. AFAIK Replication keeps the data files in sync on a
> bit-for-bit level and turning on checksums changes the data layout.
> Running a cluster where one node has checksums and the other doesn't
> would result in a complete mess.

I agree with the last sentence. This is why I asked if it is safe to enable checksums on a replica, switch over, and then do it again on the ex-primary, i.e. the now-new replica, without doing a reinit.

Cheers,
Paul
Re: pg_checksums?
On 2023-10-29 10:11:07 +0100, Paul Förster wrote:
> On Oct 29, 2023, at 02:43, Peter J. Holzer wrote:
> > I don't think so. AFAIK Replication keeps the data files in sync on a
> > bit-for-bit level and turning on checksums changes the data layout.
> > Running a cluster where one node has checksums and the other doesn't
> > would result in a complete mess.
>
> I agree with the last sentence. This is why I asked if it is safe to
> enable checksums on a replica, switch over and then do it again on the
> ex primary, i.e. now new replica without doing a reinit.

It *might* work if there are zero writes on the primary during the downtime of the replica (because those writes couldn't be replicated), but that seems hard to ensure. Even if you could get away with making the primary read-only (is this even possible?) I wouldn't have much confidence in the result and would reinit the (new) replica anyway.

hp
[SOLVED?] Re: Disk wait problem... not hardware...
On Sat, 28 Oct 2023 18:34:50 -0400 Jim Mlodgenski wrote:

Looking like a GOLD star for Jim...

> On Fri, Oct 27, 2023 at 7:46 PM wrote:
>
>> Memory: 125.5 GiB of RAM
>
> It looks like you have a large amount of memory allocated to the server,
> but your plans are doing reads instead of pulling things from shared
> buffers.

Assuming you are referring to "Buffers: shared hit=248 read=25022"?

>> explain (analyze, buffers) select count(*) from ncvhis_2016_12_03;
>>  Finalize Aggregate  (cost=404669.65..404669.66 rows=1 width=8) (actual time=844.158..847.309 rows=1 loops=1)
>>    Buffers: shared hit=248 read=25022
>>    ->  Gather  (cost=404669.43..404669.65 rows=2 width=8) (actual time=844.133..847.301 rows=3 loops=1)
>>          Workers Planned: 2
>>          Workers Launched: 2
>>          Buffers: shared hit=248 read=25022
>>          ->  Partial Aggregate  (cost=403669.43..403669.45 rows=1 width=8) (actual time=838.772..838.772 rows=1 loops=3)
>>                Buffers: shared hit=248 read=25022
>>                ->  Parallel Index Only Scan using ncvhis_2016_12_03_voted_party_cd_idx on ncvhis_2016_12_03  (cost=0.44..372735.05 rows=12373755 width=0) (actual time=18.277..592.473 rows=9900389 loops=3)
>>                      Heap Fetches: 0
>>                      Buffers: shared hit=248 read=25022
>>  Planning Time: 0.069 ms
>>  JIT:
>>    Functions: 8
>>    Options: Inlining false, Optimization false, Expressions true, Deforming true
>>    Timing: Generation 0.284 ms, Inlining 0.000 ms, Optimization 0.268 ms, Emission 3.590 ms, Total 4.143 ms
>>  Execution Time: 847.498 ms

>> data/postgresql.conf:
>> max_connections = 100
>> shared_buffers = 128MB

I semi-misquoted; it was running on 128MB when first reported, then 512MB when I sent this message.

> It looks like you are running with the stock config for shared_buffers.
> With only 128MB dedicated for shared_buffers and such a big database,
> you'll be thrashing the cache pretty hard. With 125GB on the server, try
> upping shared_buffers to something more like 32GB.

WOW!
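For reference, Jim's suggested change can be sketched with ALTER SYSTEM rather than editing postgresql.conf by hand. This is only an illustration: the 32GB value comes from this thread's 125GB server and must be sized to your own hardware, and a server restart is required for shared_buffers to take effect.

```sql
-- Sketch: raise shared_buffers via ALTER SYSTEM (written to
-- postgresql.auto.conf; takes effect only after a restart).
ALTER SYSTEM SET shared_buffers = '32GB';

-- After restarting the server (e.g. pg_ctl restart -D <datadir>),
-- confirm the new setting:
SHOW shared_buffers;
```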
Some tables now have a barely noticeable delay; on the whole, it's looking MUCH better. Testing with a just-created copy[1] on the NVMe, it's better; but it's slowly going back to longer delays. Looks like count(*) may have easily-missed side effects (more below).

My team members were suggesting a possible motherboard issue, since the problem happened on both the NVMe M.2 SSD and on a backup SATA-3 platter drive. Your suggestion arrived just in time, as I was about to wipe the SSD and re-import everything fresh.

Another thing that bothers me: when I copied the DB with

  cp -a ./var ./var2   # same drive, same partition

replicating the DB onto free space on the same SSD, it was experiencing slow reads (18-20KB/sec) while the writes were quick, large blocks; so I was suspecting something with the SSD... System buffers..?

How "autoload table row count" (or count(*)) can hurt:

The queries which triggered this issue are all the same (WB built-in):

  SELECT count(*) FROM table;   # WB = SQL Workbench/J

These are all static tables. Does PG maintain a table row count so as to avoid having to count each time?

WB is set up to:
* autoload table row count
* autoload table data (restricted with LIMIT)

Walking through these 8M and 33M row tables (3.8GB and 5.2GB respectively) must be filling the shared_buffers and thrashing even 32GB...

Ah-ha! Turning off "autoload table row count" has "resolved" this issue. I can now scroll through all the tables at high speed (holding down the arrow key to walk through all the tables). Not very useful; but walking through the tables with only 128MB shared_buffers would have been the ultimate buffer thrashing... ;p So would walking through the tables just to collect their row counts (whether manually or programmatically).

That said, the 32GB shared_buffers may now be masking NVMe drive issues...

Thanks!!!,
Pierre

Time and complexity to find the cause of an issue is inversely proportional to the simplicity of the issue...
Re: Disk wait problem... may not be hardware...
On 2023-10-27 19:46:09 -0400, p...@pfortin.com wrote:
> On Fri, 27 Oct 2023 19:07:11 +0200 Peter J. Holzer wrote:
> > Have you looked at the query plans as I recommended? (You might also
> > want to enable track_io_timing to get extra information, but comparing
> > just the query plans of fast and slow queries would be a first step)
>
> I didn't see how that would help since other than the table name the
> queries are identical. Curious: are you implying PG stores tables
> differently?

No, but Postgres decides on the query plan depending on the statistics stored about that table. If those statistics are off, the query plan can be wildly different and very inefficient. So checking whether the plans are plausible should be one of the first things you do when performance is not what you expect. Indeed, on https://wiki.postgresql.org/wiki/Slow_Query_Questions it is the very first bullet point in the section "Things to Try Before You Post".

When you have established that the plan looks fine even when the performance is poor, then you have to look elsewhere. But even then it helps to know what the database is actually trying to do.

hp
Re: pg_checksums?
Hi Peter,

> On Oct 29, 2023, at 11:49, Peter J. Holzer wrote:
>
> It *might* work if there are zero writes on the primary during the
> downtime of the replica (because those writes couldn't be replicated),
> but that seems hard to ensure. Even if you could get away with making
> the primary read-only (is this even possible?) I wouldn't have much
> confidence in the result and reinit the (new) replica anyway.

As soon as I stop the replica to enable checksums, writes can't get replicated anymore. So while checksums are being enabled, the replica is definitely protected against modifications by its primary, simply because it's down. The modifications on the primary are applied to the replica when it comes back online. So I don't see a problem at this particular stage.

My fear is merely that enabling checksums does something to the physical state of the data files which is not compatible with the other side, like, for example, manipulating the file headers in some way.

Maybe this question is better suited for the admin list than this general list?

Cheers,
Paul
Re: [SOLVED?] Re: Disk wait problem... not hardware...
On 2023-10-29 09:21:46 -0400, p...@pfortin.com wrote:
> These are all static tables. Does PG maintain a table row count so as to
> avoid having to count each time?

No. To count the rows in a table, Postgres has to actually read the whole table (or an index, if a suitable index (e.g. a primary key) exists).

However, the table statistics contain an estimate for the number of rows:

hjp=> select schemaname, relname, n_live_tup from pg_stat_user_tables order by 3 desc;
╔════════════╤═══════════════════╤════════════╗
║ schemaname │ relname           │ n_live_tup ║
╟────────────┼───────────────────┼────────────╢
║ public     │ ncvhis_2016_12_03 │        977 ║
║ public     │ random_test       │        100 ║
║ public     │ beislindex        │        351 ║
║ public     │ delivery          │          6 ║
...

This is an estimate, not the actual count, so it might be wrong (sometimes very wrong), but accessing it is fast and it might be good enough for your purposes.

> WB is setup to:
> * autoload table row count
> * autoload table data (restricted with LIMIT)

Maybe WB can be set up to get n_live_tup instead of the real row count?

hp
Re: Disk wait problem... may not be hardware...
On Sun, 29 Oct 2023 16:00:46 +0100 Peter J. Holzer wrote:
> On 2023-10-27 19:46:09 -0400, p...@pfortin.com wrote:
> > On Fri, 27 Oct 2023 19:07:11 +0200 Peter J. Holzer wrote:
> > > Have you looked at the query plans as I recommended? (You might also
> > > want to enable track_io_timing to get extra information, but comparing
> > > just the query plans of fast and slow queries would be a first step)
> >
> > I didn't see how that would help since other than the table name the
> > queries are identical. Curious: are you implying PG stores tables
> > differently?
>
> No, but Postgres decides on the query depending on the statistics stored
> about that table. If those statistics are off, the query plan can be
> wildly different and very inefficient. So checking whether the plans are
> plausible should be one of the first things you do when performance is
> not what you expect. Indeed, on
> https://wiki.postgresql.org/wiki/Slow_Query_Questions it is the very
> first bullet point in the section "Things to Try Before You Post".
>
> When you have established that the plan looks fine even when the
> performance is poor then you have to look elsewhere. But even then it
> helps to know what the database is actually trying to do.
>
> hp

Peter,

Thanks for this and the link... Thanks to Jim's post, I came to a similar conclusion; but the above link could use a bit more emphasis on shared_buffers.

I'm no longer enabling "autoload table row count", and I'm pointing out to our team that SELECT count(*) FROM table; is OK *unless* done in a loop across many tables, where shared_buffers may never be enough to prevent thrashing...

Thanks,
Pierre
Re: pg_checksums?
On 10/29/23 04:11, Paul Förster wrote:
> Hi Peter
>
> > On Oct 29, 2023, at 02:43, Peter J. Holzer wrote:
> > I don't think so. AFAIK Replication keeps the data files in sync on a
> > bit-for-bit level and turning on checksums changes the data layout.
> > Running a cluster where one node has checksums and the other doesn't
> > would result in a complete mess.
>
> I agree with the last sentence. This is why I asked if it is safe to
> enable checksums on a replica, switch over and then do it again on the
> ex primary, i.e. now new replica without doing a reinit.

For that to work, the secondary's files would have to remain identical to the primary's files. Theoretically that _should_ happen, but it might not, or whatever command enables checksums after the fact might have a sanity check.

As for safety, what do you mean by "safe"?

--
Born in Arizona, moved to Babylonia.
Re: pg_checksums?
On 2023-10-29 16:15:37 +0100, Paul Förster wrote:
> On Oct 29, 2023, at 11:49, Peter J. Holzer wrote:
> > It *might* work if there are zero writes on the primary during the
> > downtime of the replica (because those writes couldn't be replicated),
> > but that seems hard to ensure. Even if you could get away with making
> > the primary read-only (is this even possible?) I wouldn't have much
> > confidence in the result and reinit the (new) replica anyway.
>
> As soon as I stop the replica to enable checksums, even writes can't
> get replicated anymore. So during enabling checksums, a replica is
> definitely protected against modifications by its primary, simply
> because it's down. The modifications of the primary are applied to the
> replica when it comes back online.

And this is where it would break down. The modifications can't be applied to the replica any more because the replica now contains checksums and the modifications don't. In the best case the replica would catch the discrepancy and refuse to apply the modifications, which would lead to the loss of those modifications. In the worst case it would apply them anyway, causing severe data corruption.

> So, I don't see a problem at this particular stage.
>
> My fear is merely that enabling checksums does something to the
> physical state of the data files which are not compatible with the
> other side.

Exactly. Those checksums have to be stored somewhere.

> Like for example manipulate the file headers in some way.

Not just the file headers. Every single data block.

(Ok, it looks like the space for the checksum is reserved even if checksums aren't enabled[1]. So at least pg_checksums doesn't have to move data around to enable them. But overwriting a page with a checksum with one without one would still be bad.)

hp

[1] https://www.postgresql.org/docs/current/storage-page-layout.html#PAGEHEADERDATA-TABLE
Re: [SOLVED?] Re: Disk wait problem... not hardware...
On Sun, 29 Oct 2023 16:16:05 +0100 Peter J. Holzer wrote:
> On 2023-10-29 09:21:46 -0400, p...@pfortin.com wrote:
> > These are all static tables. Does PG maintain a table row count so as to
> > avoid having to count each time?
>
> No. To count the rows in a table, Postgres has to actually read the
> whole table (or an index, if a suitable index (e.g. a primary key)
> exists).

Am I correct to assume count(fieldname) would only load that column for counting? In other words: we should be specific (avoid "*") in general?

> However, the table statistics contain an estimate for the number of
> rows:
>
> hjp=> select schemaname, relname, n_live_tup from pg_stat_user_tables order by 3 desc;
> ╔════════════╤═══════════════════╤════════════╗
> ║ schemaname │ relname           │ n_live_tup ║
> ╟────────────┼───────────────────┼────────────╢
> ║ public     │ ncvhis_2016_12_03 │        977 ║
> ║ public     │ random_test       │        100 ║
> ║ public     │ beislindex        │        351 ║
> ║ public     │ delivery          │          6 ║
> ...

Are there prerequisites for this query? Every (264) n_live_tup returned was zero... though VERY fast. :)

> This is an estimate, not the actual count, so it might be wrong
> (sometimes very wrong), but accessing it is fast and it might be good
> enough for your purposes.

Thanks for sharing... We're learning fast... :)

> > WB is setup to:
> > * autoload table row count
> > * autoload table data (restricted with LIMIT)
>
> Maybe WB can be set up to get n_live_tup instead of the real row count?

It appears this is hard-coded [on/off only]; but I thought I saw the WB author post on this list recently...
Re: [SOLVED?] Re: Disk wait problem... not hardware...
On 10/29/23 11:45, p...@pfortin.com wrote:
> On Sun, 29 Oct 2023 16:16:05 +0100 Peter J. Holzer wrote:
> > On 2023-10-29 09:21:46 -0400, p...@pfortin.com wrote:
> > > These are all static tables. Does PG maintain a table row count so as to
> > > avoid having to count each time?
> >
> > No. To count the rows in a table, Postgres has to actually read the
> > whole table (or an index, if a suitable index (e.g. a primary key)
> > exists).
>
> Am I correct to assume count(fieldname) would only load that column for
> counting?

The OS reads (chunks of) blocks, not individual bytes, and PostgreSQL is record-oriented, not column-oriented.

> In other words: we should be specific (avoid "*") in general?

No.

--
Born in Arizona, moved to Babylonia.
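Worth noting when choosing between the two forms: count(*) counts rows, while count(column) counts only the non-NULL values of that column, so they are not interchangeable regardless of what gets read from disk. A minimal sketch (the table name is made up for illustration):

```sql
CREATE TEMP TABLE count_demo (x int);
INSERT INTO count_demo VALUES (1), (NULL), (3);

-- count(*) counts all 3 rows; count(x) skips the NULL and returns 2:
SELECT count(*) AS all_rows,
       count(x) AS non_null_x
FROM count_demo;
```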
Re: [SOLVED?] Re: Disk wait problem... not hardware...
On 10/29/23 09:45, p...@pfortin.com wrote:
> On Sun, 29 Oct 2023 16:16:05 +0100 Peter J. Holzer wrote:
> > On 2023-10-29 09:21:46 -0400, p...@pfortin.com wrote:
> > > These are all static tables. Does PG maintain a table row count so as to
> > > avoid having to count each time?
> >
> > No. To count the rows in a table, Postgres has to actually read the
> > whole table (or an index, if a suitable index (e.g. a primary key)
> > exists).
>
> Am I correct to assume count(fieldname) would only load that column for
> counting? In other words: we should be specific (avoid "*") in general?

Read: https://wiki.postgresql.org/wiki/Slow_Counting

> > > WB is setup to:
> > > * autoload table row count
> > > * autoload table data (restricted with LIMIT)
> >
> > Maybe WB can be set up to get n_live_tup instead of the real row count?

So basically you are dealing with a client issue.

> It appears this is hard-coded [on/off only]; but I thought I saw the WB
> author post on this list recently...

--
Adrian Klaver
adrian.kla...@aklaver.com
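The Slow_Counting wiki page also documents a per-table shortcut: the planner's row estimate in pg_class.reltuples, which is maintained by VACUUM/ANALYZE and can be stale (or zero on never-analyzed tables). A sketch, using a table name from this thread:

```sql
-- Fast, approximate row count from planner statistics (not exact;
-- run ANALYZE first if the table has never been analyzed):
SELECT reltuples::bigint AS estimated_rows
FROM pg_class
WHERE relname = 'ncvhis_2016_12_03';
```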
Re: pg_checksums?
Hi Ron,

> On Oct 29, 2023, at 16:37, Ron wrote:
>
> As for safety, what do you mean by "safe"?

Safe in the sense of: if I enable checksums on a replica, switch over, and then enable checksums on the other side, is this OK, or will future mutations on the primary corrupt the replica? That's why I asked if I need to perform a patronictl reinit.

Cheers,
Paul
Re: pg_checksums?
Hi Ron,

> On Oct 29, 2023, at 16:38, Peter J. Holzer wrote:
> And this is where it would break down. The modifications can't be
> applied to the replica any more because the replica now contains
> checksums and the modifications don't. In the best case the replica
> would catch the discrepancy and refuse to apply the modifications which
> would lead to the loss of these modifications. In the worst case it
> would apply them anyway causing severe data corruption.
...
> Not just the file headers. Every single data block.
>
> (Ok, it looks like the space for the checksum is reserved even if
> checksums aren't enabled[1]. So at least pg_checksums doesn't have to
> move data around to enable them. But overwriting a page with a checksum
> with one without one would still be bad.)

Those are the kind of answers and insights I was looking for. Thank you very much. OK, I will do a reinit then.

Cheers,
Paul
Re: pg_checksums?
On 10/29/23 12:57, Paul Förster wrote:
> Hi Ron,
>
> > On Oct 29, 2023, at 16:37, Ron wrote:
> > As for safety, what do you mean by "safe"?
>
> Safe in the sense that, if I enable checksums on a replica, switch over
> and then enable checksums on the other side, if this is ok, or whether
> future mutations on the primary will corrupt the replica.

Trying it would tell you something.

> That's why I asked if I need to perform a patronictl reinit.

Best to ask Percona.

--
Born in Arizona, moved to Babylonia.
Differences between database objects (tables, triggers, sequences, functions) and a sql file
I have a master sql file used to create all the database objects for an application. Is there an application/script to compare what's in the actual database and my sql file and display the differences?
Re: Differences between database objects (tables, triggers, sequences, functions) and a sql file
On Sun, Oct 29, 2023, 13:59 Ravi Malghan wrote:
> I have a master sql file used to create all the database objects for an
> application. Is there an application/script to compare what's in the actual
> database and my sql file and display the differences?

Not that I know of. You can find tools where, once you've executed the script, you can compare that database to some other and find differences.

David J.
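One way to implement David's suggestion is to load the master script into a scratch database and diff schema-only dumps. This is only a sketch; the database names and file names are placeholders:

```shell
# Build a scratch database from the master script (names are examples):
createdb schema_check
psql -d schema_check -f master.sql

# Dump only the schema (no data) from both databases and compare:
pg_dump --schema-only schema_check > expected_schema.sql
pg_dump --schema-only my_real_db   > actual_schema.sql
diff -u expected_schema.sql actual_schema.sql
```

Note that pg_dump's output ordering can differ between databases with different creation histories, so some diff noise is to be expected.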
Re: pg_checksums?
On Sun, Oct 29, 2023 at 11:49:11AM +0100, Peter J. Holzer wrote:
> On 2023-10-29 10:11:07 +0100, Paul Förster wrote:
>> On Oct 29, 2023, at 02:43, Peter J. Holzer wrote:
>>> I don't think so. AFAIK Replication keeps the data files in sync on a
>>> bit-for-bit level and turning on checksums changes the data layout.
>>> Running a cluster where one node has checksums and the other doesn't
>>> would result in a complete mess.
>>
>> I agree with the last sentence. This is why I asked if it is safe to
>> enable checksums on a replica, switch over and then do it again on the
>> ex primary, i.e. now new replica without doing a reinit.
>
> It *might* work if there are zero writes on the primary during the
> downtime of the replica (because those writes couldn't be replicated),
> but that seems hard to ensure. Even if you could get away with making
> the primary read-only (is this even possible?) I wouldn't have much
> confidence in the result and reinit the (new) replica anyway.

Hm? Page checksums are written when a page is flushed to disk; we don't set them for dirty buffers or full-page writes included in WAL, so it should be OK to do something like the following:
- Stop the standby cleanly.
- Run pg_checksums on the standby to enable them.
- Restart the standby.
- Catch up with the latest changes.
- Stop the primary cleanly, letting the shutdown checkpoint be replicated to the standby.
- Promote the standby.
- Enable checksums on the previous primary.
- Start the previous primary as a standby of the node you failed over to.
--
Michael
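Michael's sequence might look roughly like this with pg_ctl and pg_checksums. This is a sketch only: the data directory paths are placeholders, the standby-configuration step (primary_conninfo etc.) is not shown, and pg_checksums requires the server to be cleanly shut down before it will run.

```shell
# On the standby (assumed data directory /pg/standby):
pg_ctl stop -D /pg/standby -m fast
pg_checksums --enable -D /pg/standby
pg_ctl start -D /pg/standby
# ...wait until the standby has caught up with the primary...

# Fail over: stop the primary cleanly so its shutdown checkpoint
# replicates, then promote the standby:
pg_ctl stop -D /pg/primary -m fast
pg_ctl promote -D /pg/standby

# Enable checksums on the old primary, then start it as the new
# standby (standby configuration not shown):
pg_checksums --enable -D /pg/primary
pg_ctl start -D /pg/primary
```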
make pg_ctl start more friendly
Hi all:

Last week I tried a PITR recovery with the 'recovery_target_action' GUC set to 'shutdown'. pg_ctl reported a failure and told me to check the log, and only there did I find that the shutdown was caused by the GUC. I think pg_ctl should print some information telling users that recovery has completed.
Server process exited with exit code 4
I encountered a problem while using PostgreSQL (version 15.2) on Windows 10. The following error messages are often printed in the logs, about once a day:

---
LOG: server process (PID 38540) exited with exit code 4.
DETAIL: Failed process was running: COMMIT
LOG: terminating any other active server processes
LOG: all server processes terminated; reinitializing
LOG: database system was interrupted; last known up at 2023-10-17 00:47:17 HKT
LOG: connection received: host=127.0.0.1 port=58402
FATAL: the database system is in recovery mode
LOG: connection received: host=127.0.0.1 port=58407
FATAL: the database system is in recovery mode
...
LOG: connection received: host=127.0.0.1 port=58648
FATAL: the database system is in recovery mode
LOG: database system was not properly shut down; automatic recovery in progress
LOG: redo starts at 9/BDC88988
FATAL: the database system is in recovery mode
LOG: invalid record length at 9/BF49CF28: wanted 24, got 0
LOG: redo done at 9/BF49CF00 system usage: CPU: user: 0.10 s, system: 0.04 s, elapsed: 0.29 s
LOG: checkpoint starting: end-of-recovery immediate wait
LOG: checkpoint complete: wrote 1947 buffers (0.4%); 0 WAL files added; 0 removed, 2 recycled; write=0.057s, …
LOG: database system is ready to accept connections
---

Sometimes it doesn't start with "server process (PID xxx) exited with exit code 4", but with "database system was interrupted".

The computer has relatively abundant resources: 192GB of memory and a 12TB hard disk. The timescaledb plugin (version 2.11.1) is installed for compression and aggregation, but it doesn't seem like a problem with timescaledb either, as this issue has only recently emerged.

Has anyone encountered a similar problem or know of a solution? Thanks.
Re: Server process exited with exit code 4
On 2023-10-30 14:02, yangsr3411 wrote:
> Has anyone encountered similar problem or may know a solution?

Just to rule out hardware problems, does the server hardware have some way of showing things like ECC memory errors and similar?

Most official server hardware (HPE, Dell, etc.) has utilities that can show a log of any recent weirdness that occurred at a hardware level. If yours can, take a look for things like ECC errors or any other strange stuff. :)

Regards and best wishes,
Justin Clift