Re: pg_checksums?

2023-10-29 Thread Paul Förster
Hi Peter

> On Oct 29, 2023, at 02:43, Peter J. Holzer  wrote:
> I don't think so. AFAIK Replication keeps the data files in sync on a
> bit-for-bit level and turning on checksums changes the data layout.
> Running a cluster where one node has checksums and the other doesn't
> would result in a complete mess.

I agree with the last sentence. This is why I asked whether it is safe to enable 
checksums on a replica, switch over, and then do it again on the ex-primary, 
i.e. the new replica, without doing a reinit.

Cheers
Paul



Re: pg_checksums?

2023-10-29 Thread Peter J. Holzer
On 2023-10-29 10:11:07 +0100, Paul Förster wrote:
> On Oct 29, 2023, at 02:43, Peter J. Holzer  wrote:
> > I don't think so. AFAIK Replication keeps the data files in sync on a
> > bit-for-bit level and turning on checksums changes the data layout.
> > Running a cluster where one node has checksums and the other doesn't
> > would result in a complete mess.
> 
> I agree with the last sentence. This is why I asked whether it is safe to
> enable checksums on a replica, switch over, and then do it again on the
> ex-primary, i.e. the new replica, without doing a reinit.

It *might* work if there are zero writes on the primary during the
downtime of the replica (because those writes couldn't be replicated),
but that seems hard to ensure. Even if you could get away with making
the primary read-only (is this even possible?) I wouldn't have much
confidence in the result and would reinit the (new) replica anyway.

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"




[SOLVED?] Re: Disk wait problem... not hardware...

2023-10-29 Thread pf
On Sat, 28 Oct 2023 18:34:50 -0400 Jim Mlodgenski wrote:

Looking like a GOLD star for Jim...

>On Fri, Oct 27, 2023 at 7:46 PM  wrote:
>
>> Memory: 125.5 GiB of RAM
>>  
>It looks like you have a large amount of memory allocated to the server
>
>But your plans are doing reads instead of pulling things from shared
>buffers

Assuming you are referring to "Buffers: shared hit=248 read=25022" ?

>>explain (analyze, buffers) select count(*) from ncvhis_2016_12_03;
>> Finalize Aggregate  (cost=404669.65..404669.66 rows=1 width=8) (actual time=844.158..847.309 rows=1 loops=1)
>>   Buffers: shared hit=248 read=25022
>>   ->  Gather  (cost=404669.43..404669.65 rows=2 width=8) (actual time=844.133..847.301 rows=3 loops=1)
>>         Workers Planned: 2
>>         Workers Launched: 2
>>         Buffers: shared hit=248 read=25022
>>         ->  Partial Aggregate  (cost=403669.43..403669.45 rows=1 width=8) (actual time=838.772..838.772 rows=1 loops=3)
>>               Buffers: shared hit=248 read=25022
>>               ->  Parallel Index Only Scan using ncvhis_2016_12_03_voted_party_cd_idx on ncvhis_2016_12_03  (cost=0.44..372735.05 rows=12373755 width=0) (actual time=18.277..592.473 rows=9900389 loops=3)
>>                     Heap Fetches: 0
>>                     Buffers: shared hit=248 read=25022
>> Planning Time: 0.069 ms
>> JIT:
>>   Functions: 8
>>   Options: Inlining false, Optimization false, Expressions true, Deforming true
>>   Timing: Generation 0.284 ms, Inlining 0.000 ms, Optimization 0.268 ms, Emission 3.590 ms, Total 4.143 ms
>> Execution Time: 847.498 ms
>>
>>  
>
>
>data/postgresql.conf:
>> max_connections = 100
>> shared_buffers = 128MB

I semi-misquoted; it was running on 128MB when first reported; then 512MB
when I sent this message.

>It looks like you are running with the stock config for shared_buffers.
>With only 128MB dedicated for shared_buffers and such a big database,
>you'll be thrashing the cache pretty hard. With 125GB on the server, try
>upping shared_buffers to something more like 32GB.
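For reference, a minimal sketch of that change (the 32GB value is Jim's
suggestion above; shared_buffers only takes effect at server restart):

   # postgresql.conf
   shared_buffers = 32GB   # was 128MB (stock); requires a restart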

WOW!  Some tables now have a barely noticeable delay; on the whole it's
looking MUCH better. Testing with a just-created copy[1] on the NVMe,
it's better; but it's slowly going back to longer delays. It looks like
count(*) may have easily-missed side effects (more below). My team
members were suggesting a possible motherboard issue, since the problem
happened on both the NVMe M.2 SSD and on a backup SATA-3 platter drive.
Your suggestion arrived just in time, as I was about to wipe the SSD
and re-import everything fresh.

Another thing that bothers me: when I copied the DB with
   cp -a ./var ./var2   # same drive, same partition
replicating it onto free space on the same SSD, reads were slow
(18-20 KB/sec) while writes were quick, large blocks; so I was
suspecting something with the SSD...  System buffers..?


How autoload row count (or count(*)) can hurt:

The queries which triggered this issue are all the same (WB built-in):
   SELECT count(*) FROM table;   # WB = SQL-workbench/J

These are all static tables. Does PG maintain a table row count so as to
avoid having to count each time?  WB is set up to:
* autoload table row count
* autoload table data (restricted with LIMIT)

Walking through these 8M & 33M row tables (3.8GB & 5.2GB respectively)
must be filling the shared_buffers and thrashing even 32GB...

Ah-ha! Turning off "autoload table row count" has "resolved" this issue. I
can now scroll through all the tables at high speed (holding down the arrow
key to walk through all of them). Not that that's very useful; but walking
through the tables with only 128MB shared_buffers would have been the
ultimate buffer thrashing... ;p  The same goes for walking through the
tables to collect their row counts (whether manually or programmatically).

That said, the 32GB shared_buffers may now be masking NVMe drive issues...

Thanks!!!,
Pierre

Time and complexity to find the cause of an issue is inversely
proportional to the simplicity of the issue...




Re: Disk wait problem... may not be hardware...

2023-10-29 Thread Peter J. Holzer
On 2023-10-27 19:46:09 -0400, p...@pfortin.com wrote:
> On Fri, 27 Oct 2023 19:07:11 +0200 Peter J. Holzer wrote:
> >Have you looked at the query plans as I recommended? (You might also
> >want to enable track_io_timing to get extra information, but comparing
> >just the query plans of fast and slow queries would be a first step)
> 
> I didn't see how that would help since other than the table name the
> queries are identical.  Curious: are you implying PG stores tables
> differently?

No, but Postgres decides on the query plan based on the statistics stored
about that table. If those statistics are off, the query plan can be
wildly different and very inefficient. So checking whether the plans are
plausible should be one of the first things you do when performance is
not what you expect. Indeed, on
https://wiki.postgresql.org/wiki/Slow_Query_Questions it is the very
first bullet point in the section "Things to Try Before You Post".

When you have established that the plan looks fine even when the
performance is poor then you have to look elsewhere. But even then it
helps to know what the database is actually trying to do.
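For example, a minimal session sketch (the table name is a placeholder;
setting track_io_timing per session may require superuser):

   SET track_io_timing = on;
   EXPLAIN (ANALYZE, BUFFERS) SELECT count(*) FROM some_table;
   -- with track_io_timing on, plan nodes additionally report "I/O Timings",
   -- separating time spent on block reads from everything else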

hp



-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"




Re: pg_checksums?

2023-10-29 Thread Paul Förster
Hi Peter,

> On Oct 29, 2023, at 11:49, Peter J. Holzer  wrote:
> 
> It *might* work if there are zero writes on the primary during the
> downtime of the replica (because those writes couldn't be replicated),
> but that seems hard to ensure. Even if you could get away with making
> the primary read-only (is this even possible?) I wouldn't have much
> confidence in the result and would reinit the (new) replica anyway.

As soon as I stop the replica to enable checksums, writes can't get
replicated anymore anyway. So while checksums are being enabled, the replica
is definitely protected against modifications by its primary, simply because
it's down. The modifications on the primary are applied to the replica when
it comes back online. So, I don't see a problem at this particular stage.

My fear is merely that enabling checksums does something to the physical
state of the data files which is not compatible with the other side. Like,
for example, manipulating the file headers in some way.

Maybe this question is better suited for the admin list than this general list?

Cheers
Paul



Re: [SOLVED?] Re: Disk wait problem... not hardware...

2023-10-29 Thread Peter J. Holzer
On 2023-10-29 09:21:46 -0400, p...@pfortin.com wrote:
> These are all static tables. Does PG maintain a table row count so as to
> avoid having to count each time?

No. To count the rows in a table, Postgres has to actually read the
whole table (or an index, if a suitable index (e.g. a primary key)
exists).

However, the table statistics contain an estimate for the number of
rows:

hjp=> select schemaname, relname, n_live_tup from pg_stat_user_tables order by 3 desc;
╔════════════╤═══════════════════╤════════════╗
║ schemaname │      relname      │ n_live_tup ║
╟────────────┼───────────────────┼────────────╢
║ public     │ ncvhis_2016_12_03 │        977 ║
║ public     │ random_test       │        100 ║
║ public     │ beislindex        │        351 ║
║ public     │ delivery          │          6 ║
...


This is an estimate, not the actual count, so it might be wrong
(sometimes very wrong), but accessing it is fast and it might be good
enough for your purposes.
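If you only care about a single table, a minimal variant of the query
above:

hjp=> select n_live_tup from pg_stat_user_tables
      where relname = 'ncvhis_2016_12_03';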

> WB is set up to:
> * autoload table row count
> * autoload table data (restricted with LIMIT)

Maybe WB can be set up to get n_live_tup instead of the real row count?

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"




Re: Disk wait problem... may not be hardware...

2023-10-29 Thread pf
On Sun, 29 Oct 2023 16:00:46 +0100 Peter J. Holzer wrote:

>On 2023-10-27 19:46:09 -0400, p...@pfortin.com wrote:
>> On Fri, 27 Oct 2023 19:07:11 +0200 Peter J. Holzer wrote:  
>> >Have you looked at the query plans as I recommended? (You might also
>> >want to enable track_io_timing to get extra information, but comparing
>> >just the query plans of fast and slow queries would be a first step)  
>> 
>> I didn't see how that would help since other than the table name the
>> queries are identical.  Curious: are you implying PG stores tables
>> differently?  
>
>No, but Postgres decides on the query plan based on the statistics stored
>about that table. If those statistics are off, the query plan can be
>wildly different and very inefficient. So checking whether the plans are
>plausible should be one of the first things you do when performance is
>not what you expect. Indeed, on
>https://wiki.postgresql.org/wiki/Slow_Query_Questions it is the very
>first bullet point in the section "Things to Try Before You Post".
>
>When you have established that the plan looks fine even when the
>performance is poor then you have to look elsewhere. But even then it
>helps to know what the database is actually trying to do.
>
>hp

Peter,

Thanks for this and the link... Thanks to Jim's post, I came to a similar
conclusion; but the above link could use a bit more emphasis on
shared_buffers.  I'm no longer enabling "autoload table row count", and I'm
pointing out to our team that SELECT count(*) FROM table; is OK *unless*
it's done in a loop across many tables, where shared_buffers may never be
big enough to prevent thrashing...

Thanks,
Pierre




Re: pg_checksums?

2023-10-29 Thread Ron

On 10/29/23 04:11, Paul Förster wrote:

Hi Peter


On Oct 29, 2023, at 02:43, Peter J. Holzer  wrote:
I don't think so. AFAIK Replication keeps the data files in sync on a
bit-for-bit level and turning on checksums changes the data layout.
Running a cluster where one node has checksums and the other doesn't
would result in a complete mess.

I agree with the last sentence. This is why I asked whether it is safe to enable 
checksums on a replica, switch over, and then do it again on the ex-primary, 
i.e. the new replica, without doing a reinit.


For that to work, the secondary files would have to remain identical to the 
primary files.  Theoretically that _should_ happen, but it might not, or 
whatever command enables checksums after the fact might have a sanity 
check.


As for safety, what do you mean by "safe"?

--
Born in Arizona, moved to Babylonia.




Re: pg_checksums?

2023-10-29 Thread Peter J. Holzer
On 2023-10-29 16:15:37 +0100, Paul Förster wrote:
> On Oct 29, 2023, at 11:49, Peter J. Holzer  wrote:
> > It *might* work if there are zero writes on the primary during the
> > downtime of the replica (because those writes couldn't be replicated),
> > but that seems hard to ensure. Even if you could get away with making
> > the primary read-only (is this even possible?) I wouldn't have much
> > confidence in the result and would reinit the (new) replica anyway.
> 
> As soon as I stop the replica to enable checksums, writes can't get
> replicated anymore anyway. So while checksums are being enabled, the
> replica is definitely protected against modifications by its primary,
> simply because it's down. The modifications on the primary are applied
> to the replica when it comes back online.

And this is where it would break down. The modifications can't be
applied to the replica any more because the replica now contains
checksums and the modifications don't. In the best case the replica
would catch the discrepancy and refuse to apply the modifications which
would lead to the loss of these modifications. In the worst case it
would apply them anyway causing severe data corruption.

> So, I don't see a problem at this particular stage.
> 
> My fear is merely that enabling checksums does something to the
> physical state of the data files which is not compatible with the
> other side.

Exactly. Those checksums have to be stored somewhere.

> Like, for example, manipulating the file headers in some way.

Not just the file headers. Every single data block.

(Ok, it looks like the space for the checksum is reserved even if
checksums aren't enabled[1]. So at least pg_checksums doesn't have to
move data around to enable them. But overwriting a page with a checksum
with one without one would still be bad.)

hp

[1] 
https://www.postgresql.org/docs/current/storage-page-layout.html#PAGEHEADERDATA-TABLE
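For the curious, that reserved field can be inspected with the pageinspect
extension (a sketch; the extension must be installed, this needs superuser,
and the table name is a placeholder):

   CREATE EXTENSION IF NOT EXISTS pageinspect;
   SELECT checksum FROM page_header(get_raw_page('some_table', 0));
   -- pd_checksum from the page header; typically 0 while data checksums
   -- are disabled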

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"




Re: [SOLVED?] Re: Disk wait problem... not hardware...

2023-10-29 Thread pf
On Sun, 29 Oct 2023 16:16:05 +0100 Peter J. Holzer wrote:

>On 2023-10-29 09:21:46 -0400, p...@pfortin.com wrote:
>> These are all static tables. Does PG maintain a table row count so as to
>> avoid having to count each time?  
>
>No. To count the rows in a table, Postgres has to actually read the
>whole table (or an index, if a suitable index (e.g. a primary key)
>exists).

Am I correct to assume count(fieldname) would only load that column for
counting?  In other words: we should be specific (avoid "*") in general?

>However, the table statistics contain an estimate for the number of
>rows:
>
>hjp=> select schemaname, relname, n_live_tup from pg_stat_user_tables order by 3 desc;
>╔════════════╤═══════════════════╤════════════╗
>║ schemaname │      relname      │ n_live_tup ║
>╟────────────┼───────────────────┼────────────╢
>║ public     │ ncvhis_2016_12_03 │        977 ║
>║ public     │ random_test       │        100 ║
>║ public     │ beislindex        │        351 ║
>║ public     │ delivery          │          6 ║
>...

Are there prerequisites for this query?
All 264 n_live_tup values returned were zero...  though VERY fast.. :)

>This is an estimate, not the actual count, so it might be wrong
>(sometimes very wrong), but accessing it is fast and it might be good
>enough for your purposes.

Thanks for sharing...  We're learning fast... :)

>> WB is set up to:
>> * autoload table row count
>> * autoload table data (restricted with LIMIT)  
>
>Maybe WB can be set up to get n_live_tup instead of the real row count?

It appears this is hard-coded [on/off only]; but I thought I saw the WB
author post on this list recently...

>hp
>




Re: [SOLVED?] Re: Disk wait problem... not hardware...

2023-10-29 Thread Ron

On 10/29/23 11:45, p...@pfortin.com wrote:

On Sun, 29 Oct 2023 16:16:05 +0100 Peter J. Holzer wrote:


On 2023-10-29 09:21:46 -0400, p...@pfortin.com wrote:

These are all static tables. Does PG maintain a table row count so as to
avoid having to count each time?

No. To count the rows in a table, Postgres has to actually read the
whole table (or an index, if a suitable index (e.g. a primary key)
exists).

Am I correct to assume count(fieldname) would only load that column for
counting?


The OS reads (chunks of) blocks, not individual bytes, and PostgreSQL is 
row-oriented, not column-oriented.



In other words: we should be specific (avoid "*") in general?


No.

--
Born in Arizona, moved to Babylonia.




Re: [SOLVED?] Re: Disk wait problem... not hardware...

2023-10-29 Thread Adrian Klaver

On 10/29/23 09:45, p...@pfortin.com wrote:

On Sun, 29 Oct 2023 16:16:05 +0100 Peter J. Holzer wrote:


On 2023-10-29 09:21:46 -0400, p...@pfortin.com wrote:

These are all static tables. Does PG maintain a table row count so as to
avoid having to count each time?


No. To count the rows in a table, Postgres has to actually read the
whole table (or an index, if a suitable index (e.g. a primary key)
exists).


Am I correct to assume count(fieldname) would only load that column for
counting?  In other words: we should be specific (avoid "*") in general?


Read:

https://wiki.postgresql.org/wiki/Slow_Counting
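That page's quick estimate, for reference (the table name is a placeholder;
reltuples is only as fresh as the last VACUUM/ANALYZE):

SELECT reltuples::bigint AS estimate FROM pg_class WHERE relname = 'table_name';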



WB is set up to:
* autoload table row count
* autoload table data (restricted with LIMIT)


Maybe WB can be set up to get n_live_tup instead of the real row count?


So basically you are dealing with a client issue.



It appears this is hard-coded [on/off only]; but I thought I saw the WB
author post on this list recently...


hp






--
Adrian Klaver
adrian.kla...@aklaver.com





Re: pg_checksums?

2023-10-29 Thread Paul Förster
Hi Ron,

> On Oct 29, 2023, at 16:37, Ron  wrote:
> 
> As for safety, what do you mean by "safe"?

Safe in the sense that, if I enable checksums on a replica, switch over, and 
then enable checksums on the other side, whether this is ok or whether future 
mutations on the primary will corrupt the replica.

That's why I asked if I need to perform a patronictl reinit.

Cheers
Paul



Re: pg_checksums?

2023-10-29 Thread Paul Förster
Hi Peter,

> On Oct 29, 2023, at 16:38, Peter J. Holzer  wrote:

> And this is where it would break down. The modifications can't be
> applied to the replica any more because the replica now contains
> checksums and the modifications don't. In the best case the replica
> would catch the discrepancy and refuse to apply the modifications which
> would lead to the loss of these modifications. In the worst case it
> would apply them anyway causing severe data corruption.
...

> Not just the file headers. Every single data block.
> 
> (Ok, it looks like the space for the checksum is reserved even if
> checksums aren't enabled[1]. So at least pg_checksums doesn't have to
> move data around to enable them. But overwriting a page with a checksum
> with one without one would still be bad.)

Those are the kind of answers and insights I was looking for. Thank you very 
much.

Ok, I will do a reinit then.

Cheers
Paul





Re: pg_checksums?

2023-10-29 Thread Ron

On 10/29/23 12:57, Paul Förster wrote:

Hi Ron,


On Oct 29, 2023, at 16:37, Ron  wrote:

As for safety, what do you mean by "safe"?

Safe in the sense that, if I enable checksums on a replica, switch over, and 
then enable checksums on the other side, whether this is ok or whether future 
mutations on the primary will corrupt the replica.


Trying it would tell you something.


That's why I asked if I need to perform a patronictl reinit.


Best to ask Percona.

--
Born in Arizona, moved to Babylonia.




Differences between database objects (tables, triggers, sequences, functions) and a sql file

2023-10-29 Thread Ravi Malghan
I have a master sql file used to create all the database objects for an 
application. Is there an application/script that compares what's in the actual 
database with my sql file and displays the differences?


Re: Differences between database objects (tables, triggers, sequences, functions) and a sql file

2023-10-29 Thread David G. Johnston
On Sun, Oct 29, 2023, 13:59 Ravi Malghan  wrote:

> I have a master sql file used to create all the database objects for an
> application. Is there an application/script to compare what's in the actual
> database and my sql file and display the differences?
>

Not that I know of.  There are tools with which, once you've executed the
script into a scratch database, you can compare that database to another
and find the differences.
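A hedged sketch of that approach (database names and file names are
assumptions):

   createdb scratch
   psql -d scratch -f master.sql
   pg_dump --schema-only scratch > from_script.sql
   pg_dump --schema-only mydb > from_db.sql
   diff from_script.sql from_db.sql   # plain line diff; ordering causes noise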

David J.



Re: pg_checksums?

2023-10-29 Thread Michael Paquier
On Sun, Oct 29, 2023 at 11:49:11AM +0100, Peter J. Holzer wrote:
> On 2023-10-29 10:11:07 +0100, Paul Förster wrote:
>> On Oct 29, 2023, at 02:43, Peter J. Holzer  wrote:
>>> I don't think so. AFAIK Replication keeps the data files in sync on a
>>> bit-for-bit level and turning on checksums changes the data layout.
>>> Running a cluster where one node has checksums and the other doesn't
>>> would result in a complete mess.
>> 
>> I agree with the last sentence. This is why I asked whether it is safe to
>> enable checksums on a replica, switch over, and then do it again on the
>> ex-primary, i.e. the new replica, without doing a reinit.
> 
> It *might* work if there are zero writes on the primary during the
> downtime of the replica (because those writes couldn't be replicated),
> but that seems hard to ensure. Even if you could get away with making
> the primary read-only (is this even possible?) I wouldn't have much
> confidence in the result and reinit the (new) replica anyway.

Hm?  Page checksums are written when a page is flushed to disk; we
don't set them for dirty buffers or for full-page writes included in WAL,
so it should be OK to do something like the following:
- Stop cleanly a standby.
- Run pg_checksums on the standby to enable them.
- Restart the standby.
- Catch up with the latest changes.
- Stop cleanly the primary, letting the shutdown checkpoint be
replicated to the standby.
- Promote the standby.
- Enable checksums on the previous primary.
- Start the previous primary to be a standby of the node you failed
over to.
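A shell sketch of that sequence (a sketch only -- the data directory paths
are assumptions, and pg_checksums requires a cleanly shut down cluster):

   # on the standby
   pg_ctl -D /path/to/standby stop -m fast
   pg_checksums --enable -D /path/to/standby
   pg_ctl -D /path/to/standby start
   # ...wait for the standby to catch up, then cleanly stop the primary...
   pg_ctl -D /path/to/primary stop -m fast
   pg_ctl -D /path/to/standby promote
   # enable checksums on the old primary and bring it back as a standby
   pg_checksums --enable -D /path/to/primary
   # (reconfigure replication to point at the new primary before starting)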
--
Michael




make pg_ctl start more friendly

2023-10-29 Thread Crisp Lee
Hi all:
  Last week I tried a PITR recovery with the 'recovery_target_action' GUC
set to 'shutdown'. pg_ctl then reported a failure and told me to check the
log, and I finally found that the "failure" was due to that GUC: the server
had shut down as requested. I think pg_ctl should print some information
telling users that recovery has completed.
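For reference, a sketch of the GUC involved (values per the documentation):

   # postgresql.conf, used together with a recovery target
   recovery_target_action = 'shutdown'   # 'pause' (default), 'promote', 'shutdown'
   # with 'shutdown', the server stops once the target is reached, so a
   # "failed" pg_ctl start may simply mean recovery completed as requested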


Server process exited with exit code 4

2023-10-29 Thread yangsr3411
I encountered a problem while using PostgreSQL (version 15.2) on Windows 10. The 
following error messages are often printed in the logs, about once a day.
---
LOG: server process (PID 38540) exited with exit code 4.
DETAIL: Failed process was running: COMMIT
LOG: terminating any other active server processes
LOG: all server process terminated; reinitializing
LOG: database system was interrupted; last known up at 2023-10-17 00:47:17 HKT
LOG: connection received: host=127.0.0.1 port=58402
FATAL: the database system is in recovery mode
LOG: connection received: host=127.0.0.1 port=58407
FATAL: the database system is in recovery mode
...
LOG: connection received: host=127.0.0.1 port=58648
FATAL: the database system is in recovery mode
LOG: database system was not properly shut down; automatic recovery in progress
LOG: redo starts at 9/BDC88988
FATAL: the database system is in recovery mode
LOG: invalid record length at 9/BF49CF28: wanted 24, got 0
LOG: redo done at 9/BF49CF00 system usage: CPU: user: 0.10 s, system: 0.04 s, elapsed: 0.29 s
LOG: checkpoint starting: end-of-recovery immediate wait
LOG: checkpoint complete: wrote 1947 buffers (0.4%); 0 WAL files added; 0 removed, 2 recycled; write=0.057s, …
LOG: database system is ready to accept connections
--

Sometimes it doesn't start with "server process (PID xxx) exited with exit
code 4", but with "database system was interrupted".

These computers have relatively abundant resources: 192 GB of memory and a 12 TB hard disk.

The timescaledb plugin (version 2.11.1) is installed for compression and 
aggregation, but it doesn't seem to be a problem with timescaledb either, as 
this issue has only recently emerged.

Has anyone encountered a similar problem, or does anyone know a solution?
Thanks.








Re: Server process exited with exit code 4

2023-10-29 Thread Justin Clift

On 2023-10-30 14:02, yangsr3411 wrote:


Has anyone encountered similar problem or may know a solution?


Just to rule out hardware problems, does the server hardware have
some way of showing things like ECC memory errors and similar?

Most official server hardware (HPE, Dell, etc.) has utilities
that can show a log of any recent weirdness that occurred at the
hardware level.

If yours can, take a look for things like ECC errors or any other
strange stuff. :)

Regards and best wishes,

Justin Clift