[Bug 1042556] Re: Critical data loss bug in postgresql-common initscript

Martin Pitt Thu, 06 Sep 2012 06:21:05 -0700

** Description changed:

  Hi
  
  The Debian packages for PostgreSQL (and thus the Ubuntu packages because
  of the shared use of pg_wrapper) are subject to a potentially critical
  data loss bug because of an unsafe procedure for restarting PostgreSQL.
  
  This issue has been recognised and patched in Debian:
  
-     http://anonscm.debian.org/loggerhead/pkg-postgresql/postgresql-common/trunk/revision/1181
-     http://archives.postgresql.org/pgsql-general/2012-07/msg00501.php
+     http://anonscm.debian.org/loggerhead/pkg-postgresql/postgresql-common/trunk/revision/1181
+     http://archives.postgresql.org/pgsql-general/2012-07/msg00501.php
  
  but should be urgently included in Ubuntu and backported.
  
  I quote Tom Lane (key PostgreSQL dev):
  
-         [The] forced unlink on the postmaster.pid file [...] (a) is entirely
-         unnecessary, and (b) defeats the safety interlock against starting a
-         new postmaster before all the old backends have flushed out.
+         [The] forced unlink on the postmaster.pid file [...] (a) is entirely
+         unnecessary, and (b) defeats the safety interlock against starting a
+         new postmaster before all the old backends have flushed out.
  
  It is VITAL that pg_wrapper NEVER unlink the postmaster.pid file. The
  postmaster will do that itself if it finds the PID to be stale, but
  only after performing some checks to make sure there are no backends
  still running and to ensure that there's no other postmaster running
  against the database.
  
  See:
-     http://archives.postgresql.org/pgsql-general/2012-07/msg00475.php
+     http://archives.postgresql.org/pgsql-general/2012-07/msg00475.php
  
  Context here:
  
-     http://archives.postgresql.org/pgsql-general/2012-07/msg00350.php
-     http://dba.stackexchange.com/questions/20959/recover-postgresql-database-from-wal-errors-on-startup/20961
+     http://archives.postgresql.org/pgsql-general/2012-07/msg00350.php
+     http://dba.stackexchange.com/questions/20959/recover-postgresql-database-from-wal-errors-on-startup/20961
+ 
+ SRU INFORMATION:
+  * Impact: Severe data loss in rare corner cases.
+ 
+  * Regression potential: Very low. The change has been in Debian,
+ Quantal, and my very popular PostgreSQL backports repository for quite
+ some time. pg_ctlcluster has a function start_check_pid_file() which
+ cleans up a stale PID file on startup. If the file still exists after
+ pg_ctlcluster stop --force resorts to kill -9 on the postmaster, it
+ therefore does not block a subsequent startup. The test suite
+ (t/030_errors.t) explicitly covers scenarios with missing, broken, and
+ stale PID files and ensures that they are handled properly.
+ 
+  * Test case: I do not know a realistic and reliable test case to cause
+ the data loss, but the analysis of the bug in the mailing-list thread
+ above is very clear. I suggest regression-testing the change only, i.e.
+ running the postgresql-common test suite and manually checking that
+ starting a cluster still works with a stale PID file present:
+ 
+   sudo pg_createcluster 9.1 test --start
+   sudo cp /var/lib/postgresql/9.1/test/postmaster.pid{,.save}
+   sudo pg_ctlcluster 9.1 test stop
+   # now cause a stale pid file
+   sudo cp /var/lib/postgresql/9.1/test/postmaster.pid{.save,}
+   
+   # this should succeed and say "Removed stale pid file."
+   sudo pg_ctlcluster 9.1 test start
+   
+   # this should say that 9.1/test is online
+   pg_lsclusters
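
To make "stale PID file" concrete, here is a minimal sketch of the
liveness part of such a check. The function name pid_file_is_stale is
hypothetical and is not part of postgresql-common or PostgreSQL; the
sketch only assumes that the first line of postmaster.pid holds the
postmaster PID. It is not how start_check_pid_file() or the postmaster
itself is implemented: the postmaster additionally verifies that no
backends are still running, which is the interlock a forced unlink
defeats.

```shell
#!/bin/sh
# Hypothetical helper (illustration only, not from postgresql-common).
# Reports whether the PID on the first line of a PID file refers to a
# process that no longer exists.
# Return status: 0 = stale (no such process)
#                1 = a process with that PID is alive
#                2 = file unreadable or first line is not a positive integer
pid_file_is_stale() {
    pidfile=$1
    [ -r "$pidfile" ] || return 2

    # Assumption: the PID is stored on the first line of the file.
    pid=$(head -n 1 "$pidfile")

    # Reject empty or non-numeric content, and PID 0.
    case $pid in
        ''|*[!0-9]*) return 2 ;;
    esac
    [ "$pid" -gt 0 ] || return 2

    # kill -0 sends no signal; it only tests whether the process can be
    # signalled. It also fails for a live process owned by another user,
    # so fall back to ps -p, which does not depend on permissions.
    if kill -0 "$pid" 2>/dev/null || ps -p "$pid" >/dev/null 2>&1; then
        return 1
    fi
    return 0
}
```

A status of 0 only means the recorded PID is gone. It says nothing about
orphaned backends that may still be attached to the data directory, so
this check alone is not a safe reason to delete postmaster.pid by hand.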


-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1042556

