PostgreSQL container crash trouble.

2021-03-08 Thread Daisuke Ikeda
Dear all.

I'm running PostgreSQL (with the timescaledb extension enabled) in the
following environment, and the DB pod has crashed several times at
irregular intervals.

I want to know the cause of this failure.

Environment:

  - Base: Kubernetes(microk8s)
  - DB: PostgreSQL 12.4 (TimescaleDB) Container
  - DB OS: Alpine Linux
  - microk8s host OS: CentOS 7.6, Amazon Linux 2 (occurred on some hosts)
  - DB data location: Mounted host directory (for data persistence)
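
For reference, the filesystem behind that mounted directory can be inspected
from inside the container with something like the following (a sketch; the
in-container path is an assumption, and findmnt requires util-linux):

---
# Show the filesystem type and mount options behind the data directory.
findmnt -T /var/lib/postgresql/data
---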

1) PostgreSQL crashed and the Pod cannot start

I noticed that the timescaledb pod was restarted repeatedly with the
following error.

---
PostgreSQL Database directory appears to contain a database; Skipping
initialization
[1] LOG:  starting PostgreSQL 12.4 on x86_64-pc-linux-musl, compiled by gcc
(Alpine 9.3.0) 9.3.0, 64-bit
[1] LOG:  listening on IPv4 address "0.0.0.0", port 5432
[1] LOG:  listening on IPv6 address "::", port 5432
[1] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
[20] LOG:  database system was shut down at 2021-02-15 21:15:12 UTC
[20] LOG:  invalid primary checkpoint record
[20] PANIC:  could not locate a valid checkpoint record
[1] LOG:  startup process (PID 20) was terminated by signal 6: Aborted
[1] LOG:  aborting startup due to startup process failure
[1] LOG:  database system is shut down
---

I thought it might be WAL trouble and tried to check with pg_controldata
and pg_waldump.

---
$ pg_controldata
pg_control version number:            1201
Catalog version number:               201909212
Database system identifier:           6909006008117780509
Database cluster state:               in production
pg_control last modified:             Mon Feb 15 21:11:21 2021
Latest checkpoint location:           15/8FB002C0
Latest checkpoint's REDO location:    15/8FAF3018
Latest checkpoint's REDO WAL file:    00010015008F
Latest checkpoint's TimeLineID:       1
Latest checkpoint's PrevTimeLineID:   1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID:          0:33988723
Latest checkpoint's NextOID:          117637
Latest checkpoint's NextMultiXactId:  1
Latest checkpoint's NextMultiOffset:  0
Latest checkpoint's oldestXID:        480
Latest checkpoint's oldestXID's DB:   1
Latest checkpoint's oldestActiveXID:  33988723
Latest checkpoint's oldestMultiXid:   1
Latest checkpoint's oldestMulti's DB: 1
Latest checkpoint's oldestCommitTsXid:0
Latest checkpoint's newestCommitTsXid:0
Time of latest checkpoint:            Mon Feb 15 21:11:14 2021
Fake LSN counter for unlogged rels:   0/3E8
Minimum recovery ending location:     0/0
Min recovery ending loc's timeline:   0
Backup start location:                0/0
Backup end location:                  0/0
End-of-backup record required:        no
wal_level setting:                    replica
wal_log_hints setting:                off
max_connections setting:              50
max_worker_processes setting:         15
max_wal_senders setting:              10
max_prepared_xacts setting:           0
max_locks_per_xact setting:           64
track_commit_timestamp setting:       off
Maximum data alignment:               8
Database block size:                  8192
Blocks per segment of large relation: 131072
WAL block size:                       8192
Bytes per WAL segment:                16777216
Maximum length of identifiers:        64
Maximum columns in an index:          32
Maximum size of a TOAST chunk:        1996
Size of a large-object chunk:         2048
Date/time type storage:               64-bit integers
Float4 argument passing:              by value
Float8 argument passing:              by value
Data page checksum version:           0
Mock authentication nonce:            871e705d9393cdb1b161d5562656e20db314779af25942fa10d840983dc4ba84
---
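
For reference, the REDO segment file name follows from the REDO LSN shown
above (15/8FAF3018) and the default 16 MB segment size; a minimal sketch of
the calculation:

---
# File name = <timeline><high 32 bits of LSN><low 32 bits / segment size>,
# each printed as 8 hex digits.
printf '%08X%08X%08X\n' 1 "$((0x15))" "$((0x8FAF3018 / 0x1000000))"
# -> 00010015008F, matching "Latest checkpoint's REDO WAL file" above
---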

I checked the latest checkpoint's REDO WAL file.

---
$ pg_waldump -n 10 pg_wal/00010015008F
pg_waldump: fatal: WAL segment size must be a power of two between 1 MB and
1 GB, but the WAL file "00010015008F" header specifies 0 bytes
---

I cannot read the WAL data.
The file is 16 MB, matching the configured WAL segment size, but its
content is all zeroes, which I confirmed with the "od" command.

---
$ od -N 40 -A d -v pg_wal/00010015008F
000 00 00 00 00 00 00 00 00
016 00 00 00 00 00 00 00 00
032 00 00 00 00
040
---
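
To confirm that the whole 16 MB segment (not just the first 40 bytes) is
zero-filled, something like this could be used (a sketch, assuming GNU cmp
is available in the container):

---
# Exit status 0 means the file contains nothing but zero bytes.
cmp -n $((16 * 1024 * 1024)) pg_wal/00010015008F /dev/zero \
  && echo "segment is entirely zero-filled"
---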


The previous WAL segment could still be read with pg_waldump.
However, I could not find any record corresponding to the latest checkpoint
location (15/8FB002C0) reported by pg_controldata.
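
To look for checkpoint records in the segments that are still readable, a
sketch like the following could be used (the previous segment name
00010015008E and the start LSN are assumptions based on the values above):

---
# Dump the preceding, readable segment and search for checkpoint records.
pg_waldump pg_wal/00010015008E | grep -i checkpoint

# Or start from an explicit LSN and limit the number of records shown.
pg_waldump -s 15/8E000000 -n 100 pg_wal/00010015008E
---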


2) Ran pg_resetwal and the Pod started successfully

I ran the pg_resetwal command to work around the WAL trouble, and the DB
pod then started successfully.
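
The reset step was essentially the following (a minimal sketch, assuming the
container's PGDATA points at the mounted data directory; note that running
pg_resetwal on a cluster that did not shut down cleanly discards unreplayed
WAL, so some data loss is expected):

---
# -f forces the reset even though the cluster did not shut down cleanly.
pg_resetwal -f "$PGDATA"
---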

However, SELECT queries now fail for some tables.

3) Failed SELECT query

---
sampledb1=# select * from table1 limit 1;
ERROR:  missing chunk number 0 for toast value 27678 in pg_toast_2619
---
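
The number in pg_toast_2619 is the OID of the owning table, which can be
resolved like this (a sketch):

---
# Map the toast table's suffix OID back to the table it belongs to.
psql -d sampledb1 -c 'SELECT 2619::regclass;'
# -> pg_statistic
---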

I thought that this error was raised for the pg_statistic table (and its
related toast table, pg_toast_2619). So I deleted the broken rows in
pg_statistic and ran an ANALYZE query; after that, the table was recovered.
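
A blunter variant of that cleanup also works, because pg_statistic only
holds planner statistics that ANALYZE can rebuild (a sketch, run as a
superuser; the database name is taken from the example above):

---
# Clear the partly unreadable statistics and let ANALYZE rebuild them.
psql -d sampledb1 -c 'DELETE FROM pg_statistic;' -c 'ANALYZE;'
---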

Re: PostgreSQL container crash trouble.

2021-03-09 Thread Daisuke Ikeda
Thank you for your reply!
It was helpful.

I'll check more about the state of writing on the disk storage.
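
One way to do that is PostgreSQL's pg_test_fsync utility, pointed at a file
on the same host-mounted volume as the data directory (a sketch; the path is
an assumption):

---
# Check whether fsync/fdatasync on this volume behave sanely.
pg_test_fsync -f /var/lib/postgresql/data/pg_test_fsync.tmp
---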

And I will reinforce the backup & restore strategy.
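
As a starting point, a minimal sketch of the dump-and-reload that Laurenz
recommends below (the backup path and the new data directory are
assumptions):

---
pg_dumpall -U postgres -f /backup/cluster.sql         # salvage what is still readable
initdb -D /var/lib/postgresql/data-new                # create a fresh, uncorrupted cluster
pg_ctl -D /var/lib/postgresql/data-new -l /tmp/pg.log start
psql -U postgres -d postgres -f /backup/cluster.sql   # reload; fix any errors by hand
---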



On Tue, Mar 9, 2021 at 0:28, Laurenz Albe wrote:

> On Mon, 2021-03-08 at 22:30 +0900, Daisuke Ikeda wrote:
> > I'm running PostgreSQL (enabled timescaledb extension) in the following
> environment.
> > And the db pod has crashed several times at irregular intervals.
> >
> > I want to know the cause of this failure.
> >
> > Environment:
> >
> >   - Base: Kubernetes(microk8s)
> >   - DB: PostgreSQL 12.4 (TimescaleDB) Container
> >   - DB OS: Alpine Linux
> >   - microk8s host OS: CentOS 7.6, Amazon Linux 2 (occurred on some
> hosts)
> >   - DB data location: Mounted host directory (for data persistence)
> >
> > 1) PostgreSQL crashed and the Pod cannot start
> >
> > I noticed that the timescaledb pod was restarted repeatedly with the
> following error.
> >
> > ---
> > PostgreSQL Database directory appears to contain a database; Skipping
> initialization
> > [1] LOG:  starting PostgreSQL 12.4 on x86_64-pc-linux-musl, compiled by
> gcc (Alpine 9.3.0) 9.3.0, 64-bit
> > [1] LOG:  listening on IPv4 address "0.0.0.0", port 5432
> > [1] LOG:  listening on IPv6 address "::", port 5432
> > [1] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
> > [20] LOG:  database system was shut down at 2021-02-15 21:15:12 UTC
> > [20] LOG:  invalid primary checkpoint record
> > [20] PANIC:  could not locate a valid checkpoint record
> > [1] LOG:  startup process (PID 20) was terminated by signal 6: Aborted
> > [1] LOG:  aborting startup due to startup process failure
> > [1] LOG:  database system is shut down
> > ---
> >
> > I thought it might be WAL trouble and tried to check with pg_controldata
> and pg_waldump.
>
> It is WAL trouble.
> WAL does not contain the checkpoint from before the crash.
>
> >
> > ---
> > $ pg_controldata 
> > [...]
> > Latest checkpoint location:   15/8FB002C0
> > Latest checkpoint's REDO location:15/8FAF3018
> > Latest checkpoint's REDO WAL file:00010015008F
> > [...]
> >
> > I checked the latest checkpoint's REDO WAL file.
> >
> > ---
> > $ pg_waldump -n 10 pg_wal/00010015008F
> > pg_waldump: fatal: WAL segment size must be a power of two between 1 MB
> and 1 GB, but the WAL file "00010015008F" header specifies 0
> bytes
> > ---
> >
> > I cannot read wal data.
> > This file size is 16MB (according to the wal size setting).
> > But the content is all zero data. I checked this situation with "od"
> command.
> >
> > ---
> > $ od -N 40 -A d -v pg_wal/00010015008F
> > 000 00 00 00 00 00 00 00 00
> > 016 00 00 00 00 00 00 00 00
> > 032 00 00 00 00
> > 040
> > ---
>
> Looks like modifications to this file were lost.
>
> > 2) Run pg_resetwal and successful start Pod
> >
> > I ran the pg_resetwal command to repair the WAL trouble.
> > And the DB pod successfully started.
>
> Yes, but "pg_resetwal" on a crashed cluster leads to data corruption.
> The best you can do now is salvage what you can.
>
> > But, I received select query fail for some tables.
> >
> > 3) failed select query
> >
> > ---
> > sampledb1=# select * from table1 limit 1;
> > ERROR:  missing chunk number 0 for toast value 27678 in pg_toast_2619
> > ---
> >
> > I thought that this error is raised for pg_statistic table (and related
> toast table (pg_toast_2619)).
> > So, I deleted the broken rows in pg_statistic table and ran ANALYZE
> query.
> > After that, the table was recovered.
>
> Lucky you!
>
> > 4) another data trouble
> >
> > I had another data problem after pg_resetwal.
> >
> > In one table, a SELECT query did not return any rows,
> > and an INSERT query failed (no response, waiting forever...).
> > This table has only a primary key index.
> > I thought it might have been caused by a pkey index problem after
> > pg_resetwal.
> > I didn't know how to repair the primary key index, and finally I
> > dropped the table and restored it.
>
> You should "pg_dumpall" the cluster and load it into a cluster
> that was newly created with "initdb".
>
> Any problems loading the data have to be resolved manually.
>
> > I thought this WAL trouble was caused by disk I/O problems, but no
> > error was raised in the OS syslog.
> > I want to know about any other possible causes.
>
> One cause might be unreliable storage that doesn't honor fsync requests
> correctly.  But given that your checkpoint location is pretty late in
> the file, it seems unlikely that none of the data did make it to disk.
>
> Somehow you lost the data for a WAL segment, and you cannot recover.
>
> Of course, if you have a backup, you know what to do.
>
> Yours,
> Laurenz Albe
> --
> Cybertec | https://www.cybertec-postgresql.com
>
>