Re: Best options for new PG instance

2018-03-05 Thread Michael Paquier
On Mon, Mar 05, 2018 at 09:51:53AM -0800, Steve Atkins wrote:
> I've been running postgresql instances on ESXi VMs for years with no
> issues. I've not benchmarked them, but performance has been good
> enough despite their running on fairly wimpy hardware. Performance
> relative to bare metal is probably going to be dominated by disk IO,
> and depending on how you're hosting VMs that can be anywhere between
> pretty good and terrible - in a large corporation I'd expect it to be
> pretty good. Just don't skimp on RAM - having your hot data in the
> filesystem cache is always good and can make high latency storage
> tolerable.

One thing to be very careful about is the backup strategy of your
PostgreSQL instances.  I would recommend primarily using PostgreSQL
in-core tools like pg_basebackup to do the work and make sure that
things are consistent.  Users tend to rely a lot on VM snapshots,
particularly quiesced snapshots without the memory state, but those
can cause data corruption unless appropriate pre-freeze and post-thaw
scripts freeze the partitions while the snapshot is taken (use
separate partitions for the data folder, pg_wal and logs as well!), so
this requires extra work on your side.  I am talking about VMware
technology here, but you can find a lot of supposedly-useful VM-level
backup technologies elsewhere.  Be careful with those as well when it
comes to database backups.  You may think that the backups taken are
safe, until you see a corruption which has been hidden for weeks.
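
As an illustration only, a minimal base backup with the in-core
tooling could look like this (host, user and target directory are
placeholders to adapt):

  pg_basebackup -h primary-host -U backup_user \
      -D /backups/base_20180305 -X stream -P

The resulting backup is guaranteed to be consistent once recovery
completes, with no dependency on snapshot quiescing.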
--
Michael




Re: Resync second slave to new master

2018-03-05 Thread Michael Paquier
On Tue, Mar 06, 2018 at 04:45:10AM +, Dylan Luong wrote:
> After a failover (promote) to Slave1, is it easy to resync the
> Slave2 to the new master (old Slave1)? Do we need to do a full rebuild
> of the Slave2 from the new master every time we fail over to Slave1
> from the Master? Can we use pg_rewind on Slave2 to resync it with the
> new master (old Slave1)?

After promoting slave 1, it is possible that some records have
slipped to slave 2 from the primary.  In this case, a rewind would be
recommended.  You should be careful that your slave 2 has not received
WAL to a position newer than where WAL has forked because of the
promotion.  If that happened, then a rewind would be necessary before
reconnecting slave 2 to the newly-promoted server.  Be very careful
with your failover flow here.  pg_rewind also does nothing if it finds
that the target server does not need a rewind, so you could stop
slave 2 and run pg_rewind unconditionally to keep things simple.
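
A rough sketch of that simple flow (data directory and connection
string are placeholders; pg_rewind also requires wal_log_hints or data
checksums to be enabled on the target):

  pg_ctl -D /data/slave2 stop -m fast
  pg_rewind --target-pgdata=/data/slave2 \
      --source-server='host=slave1 port=5432 user=postgres dbname=postgres'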
--
Michael




Re: Resync second slave to new master

2018-03-05 Thread Michael Paquier
On Tue, Mar 06, 2018 at 06:00:40AM +, Dylan Luong wrote:
> So every time after promoting the slave to become master (either
> manually or automatically), just stop Slave2 and run pg_rewind on it
> against the new master (old Slave1). And when the old master server is
> available again, use pg_rewind on that server as well against the new
> master to return to the original configuration.

Yes.  That's exactly the idea.  Running pg_rewind on the old master
will be necessary anyway, because you need to stop it cleanly once,
which will cause it to generate WAL records at least for the shutdown
checkpoint.  Doing it on slave 2 may be optional, but it is still
safer to do.
--
Michael




Re: Resync second slave to new master

2018-03-08 Thread Michael Paquier
On Thu, Mar 08, 2018 at 10:48:29AM +0300, Yavuz Selim Sertoğlu wrote:
> If not set, could you add the parameter
> recovery_target_timeline='latest'
> to your recovery.conf file?
> https://www.postgresql.org/docs/devel/static/recovery-target-settings.html

Yes, that's visibly the issue here.
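
For reference, a minimal recovery.conf for the rebuilt standby could
look like this (the connection string is a placeholder):

  standby_mode = 'on'
  primary_conninfo = 'host=new-master port=5432 user=replicator'
  recovery_target_timeline = 'latest'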
--
Michael




Re: wiki Disk Usage, table size: ERROR: could not open relation with OID 0

2018-03-14 Thread Michael Paquier
On Wed, Mar 14, 2018 at 04:17:54PM +0100, Hans Schou wrote:
> I got the message
>   ERROR: could not open relation with OID 0
> when running the "General Table Size Information" from
> https://wiki.postgresql.org/wiki/Disk_Usage

I cannot see this failure on latest HEAD running the regression
database.  This is an elog() message by the way, which is used to
report internal errors, so users should never face it.

> But I guess it was supposed to give size of all tables.
> 
> I'm running version 9.1.9 so it should be working according to the
> wiki.

You should update and upgrade.  9.1 fell out of community support a
year and a half ago, and 9.1.9 is utterly outdated.
--
Michael




Re: changing my mail address

2018-03-18 Thread Michael Paquier
On Sun, Mar 18, 2018 at 09:38:22AM -0400, Stephen Frost wrote:
> These days, each email from the mailing list includes a link which can
> be used to unsubscribe from that list without having any account.

Yes. Ron, if you take the time to look at a raw email, just look for
the value of the List-Unsubscribe header field and you are good to go.
I don't know much about how email clients parse that, but Gmail
actually shows it as a small button you can click on, if I recall
correctly.
--
Michael




Re: FDW Foreign Table Access: strange LOG message

2018-03-22 Thread Michael Paquier
On Thu, Mar 22, 2018 at 06:55:12PM -0400, Tom Lane wrote:
> Anyway, it's clearly not very nice that postgres_fdw makes no attempt
> to do a graceful shutdown of its remote connection.  I don't know that
> this rises to the level of a bug, exactly, but if somebody wanted to
> send in a patch it'd probably get accepted.

I wouldn't qualify that as a bug; this is mainly noise.  Still, I
agree that it would be cleaner to make more of an effort to finish the
connections when the session goes down.  You would roughly just need
to register an on_shmem_exit callback which loops through all the
connections to finish them.  Now, would we want to slow down session
shutdown just for that?  I am less sure, particularly if there is lag
between the remote and local servers.
--
Michael




Re: pg_stat_statements: password in command is not obfuscated

2018-03-25 Thread Michael Paquier
On Sat, Mar 24, 2018 at 12:17:30PM +1300, David Rowley wrote:
> If it is, then it's not a bug in pg_stat_statements. log_statement =
> 'ddl' would have kept a record of the same thing.
> 
> Perhaps the best fix would be a documentation improvement to mention
> the fact and that it's best not to use plain text passwords in
> CREATE/ALTER ROLE. Passwords can be md5 encrypted.

Yeah, this is bad practice.  That's one of the reasons why the storage
of plain-text passwords was removed in Postgres 10 (they can still be
passed in a command), and also why PQencryptPasswordConn and
PQencryptPassword are useful.  Using psql's \password is a good habit
to have.
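
As an example of the safer habit, from psql (the role name is a
placeholder):

  -- sends the password to the server already hashed, unlike e.g.
  -- ALTER ROLE app_user PASSWORD 'secret';  -- visible in DDL logs
  \password app_user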
--
Michael




Re: PostgreSQL version on VMware vSphere ESXI 6.5 and harware version 13

2018-03-26 Thread Michael Paquier
On Mon, Mar 26, 2018 at 01:49:30PM -0700, David G. Johnston wrote:
> You should probably ask this question on the VMware forums, since it's
> their software that would or would not have the performance improvements.
> PostgreSQL will benefit from any memory and disk-related virtualization
> enhancements that they made.  I would have to assume that the answer is a
> favorable yes.

(VMware human here).

Upgrading ESX or other VMware-related components has nothing to do
with PostgreSQL.  Some of the products may embed a version of
PostgreSQL to store some of their meta-data or other things, in which
case the upgrade to a newer PostgreSQL version, if need be, will be
taken care of by the product itself.  If you are using your own set of
PostgreSQL instances, then the version is whatever you have installed
at the OS level.
--
Michael




Re: [GENERAL] missing public on schema public

2018-03-26 Thread Michael Paquier
On Mon, Mar 26, 2018 at 12:46:38PM -0300, Alvaro Herrera wrote:
> Was this ever fixed?

Ugh.  I have added a reminder on the open item page for v11 as an older
bug: 
https://wiki.postgresql.org/wiki/PostgreSQL_11_Open_Items#Older_Bugs
--
Michael




Re: Warning of .partial wal file in PITR and Replication Environment

2018-03-26 Thread Michael Paquier
On Mon, Mar 26, 2018 at 11:52:53AM +, Amee Sankhesara - Quipment India wrote:
> Warning : The failed archive command was: copy
> "D:\PostgreSQL\9.6\data\pg_xlog\00010A8800F8.partial"
> "\\10.0.0.35\Archive_Replication\00010A8800F8.partial" |
> copy "D:\PostgreSQL\9.6\data\pg_xlog\00010A8800F8.partial"
> "\\10.0.0.35\Archive_PITR\00010A8800F8.partial"

Am I reading that correctly, or are you trying to copy the same file
twice?  Why?  Also, using plain copy may cause corruption if you are
not careful, as the copied data can sit in the OS cache on Windows
rather than being flushed directly to disk.

> Due to this it has stopped taking the PITR backup.
> 
> So we would like to know how to clean up the ".partial" file from
> pg_xlog and run the PITR backup smoothly.
> We would also like to know whether there is any database consistency
> issue?

This is the last, partial WAL segment from the previous timeline.
Normally such segments are archived correctly, so you may want to
consider a more advanced archive command able to handle duplicates.
It is impossible to give a strict answer without knowing what you are
looking for in terms of WAL archive redundancy.
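
As a sketch only, chaining the two copies so that a failure of either
is reported back to the server (using the doubled backslashes the
documentation recommends for Windows paths; note this still overwrites
duplicates, which a smarter script should refuse to do):

  archive_command = 'copy "%p" "\\\\10.0.0.35\\Archive_Replication\\%f" && copy "%p" "\\\\10.0.0.35\\Archive_PITR\\%f"'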

You may need to use recovery_target_timeline = 'latest' in your
recovery.conf settings as well.
--
Michael




Re: pg_ctl promote causes error "could not read block" (PG 9.5.0 and 9.5.4)

2018-03-28 Thread Michael Paquier
On Wed, Mar 28, 2018 at 09:36:11AM -0700, raj 1988 wrote:
> Are we hitting some bug? I tried to look around but was not able to
> confirm whether we are hitting a bug or not.  For us this is happening
> consistently on different servers whenever we do pg_ctl promote, and
> then it blocks WRITEs on that table.

This has the strong smell of the FSM bug fixed in 9.5.5:
https://www.postgresql.org/docs/devel/static/release-9-5-5.html

So, in order to get things right:
1) Update to the latest version of Postgres 9.5.
2) Make sure that your cluster gets in a clean state.  There are
instructions here:
https://wiki.postgresql.org/wiki/Free_Space_Map_Problems

> As of now we get rid of the error by doing either VACUUM FULL or CTAS,
> but I am afraid of what we will do in case this happens to our few-TB
> tables.

This rebuilds the free space map, which is why the error goes away.
You really want to do the work I am mentioning above to get back to a
clean state.
--
Michael




Re: PostgreSQL Cascade streaming replication problem

2018-04-02 Thread Michael Paquier
On Sun, Apr 01, 2018 at 06:26:51PM +, Jakub Janeček wrote:
> What did i do wrong? I need stop comulating WAL files and remove old WAL
> files, which are processed and are only "waiting"..

Perhaps wal_keep_segments is set and you forgot about it, or you used
a replication slot that you forgot to drop on slave 1?
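
Both are easy to check from slave 1 (the DROP is a sketch and the slot
name is hypothetical):

  SHOW wal_keep_segments;
  SELECT slot_name, active, restart_lsn FROM pg_replication_slots;
  SELECT pg_drop_replication_slot('forgotten_slot');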
--
Michael




Re: difference between checkpoint_segments and wal_keep_segments in postgres configuration file

2018-04-10 Thread Michael Paquier
On Tue, Apr 10, 2018 at 11:06:54PM +0530, Raghavendra Rao J S V wrote:
> I am not clear on the difference between checkpoint_segments and
> wal_keep_segments.
> 
> I would like to know the things below. Please explain. Thanks in advance.
> 
>    - The difference between the checkpoint_segments and
>    wal_keep_segments values
>    - The role of checkpoint_segments and wal_keep_segments
>    - Which one should have a higher value.

Documentation is king here.  For checkpoint_segments:
https://www.postgresql.org/docs/9.4/static/runtime-config-wal.html#RUNTIME-CONFIG-WAL-CHECKPOINTS

  Maximum number of log file segments between automatic WAL
  checkpoints (each segment is normally 16 megabytes). The default is
  three segments. Increasing this parameter can increase the amount of
  time needed for crash recovery. This parameter can only be set in
  the postgresql.conf file or on the server command line.

For wal_keep_segments:
https://www.postgresql.org/docs/9.4/static/runtime-config-replication.html#RUNTIME-CONFIG-REPLICATION-SENDER

  Specifies the minimum number of past log file segments kept in the
  pg_xlog directory, in case a standby server needs to fetch them for
  streaming replication. Each segment is normally 16 megabytes. If a
  standby server connected to the sending server falls behind by more than
  wal_keep_segments segments, the sending server might remove a WAL
  segment still needed by the standby, in which case the replication
  connection will be terminated. Downstream connections will also
  eventually fail as a result. (However, the standby server can recover by
  fetching the segment from archive, if WAL archiving is in use.)

Mentioning checkpoint_segments implies that you are using PostgreSQL 9.4
or older versions as this has been removed and replaced by max_wal_size
in 9.5.  You should consider upgrading to a newer version.

Hence the first is used in the context of normal operations to decide
the frequency of checkpoints when those are triggered by volume.  The
second can be used with streaming replication to give a standby a
larger catch-up window.  What value to give to one or the other
depends on the context; they are used in completely different
circumstances.
--
Michael




Re: Can a broken Postgresql data directory be reconstructed without losing data?

2018-04-15 Thread Michael Paquier
On Sun, Apr 15, 2018 at 08:25:05AM +, A A wrote:
> I'm trying to restore a database from a broken data directory that
> have lost many of its files accidentally.

Let me guess, a wild set of rm commands?

> I have tried unsuccessfully to install new instance of the same
> Postgresql version and then copy (with overwrite) the broken datadir
> to the new one.here the remaining datadir that I'm working on :

This is the kind of thing I have found myself doing sometimes, as have
some people on this list, and it falls into the category of horror
stories.

So the answer to the question "Can I work on a broken data folder and
save as much data as possible?" is "yes".  However, this enters the
field of dark magic, as it requires advanced skills and knowledge of
PostgreSQL, including its internals, its folder structure, and ways to
bypass some protocols, or even perhaps patching the upstream code to
bypass some checks to reach the data you are looking for...
Additionally, this is usually a step-by-step, case-by-case problem.

Please note that deleted data is gone.  You may be able to get back
some deleted data from the file system, though this requires a special
set of skills and luck.  Hiring a professional is recommended for such
operations.
--
Michael




Re: Old active connections?

2018-04-17 Thread Michael Paquier
On Tue, Apr 17, 2018 at 05:11:10PM -0700, David G. Johnston wrote:
> Long-lived non-idle statements would likely be waiting for a lock to be
> released.

Be very careful with transactions marked as "idle in transaction" for
a long time.  Long-running transactions prevent VACUUM from doing its
work, as the oldest XID in view does not advance, causing performance
to go down and bloat to go up.
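
A sketch to spot such sessions (the ten-minute threshold is
arbitrary):

  SELECT pid, usename, xact_start, state
  FROM pg_stat_activity
  WHERE state = 'idle in transaction'
    AND xact_start < now() - interval '10 minutes';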
--
Michael




Re: Can PostgreSQL create new WAL files instead of reusing old ones?

2018-04-17 Thread Michael Paquier
On Tue, Apr 17, 2018 at 02:57:03PM -0400, Tom Lane wrote:
> I think possibly the OP doesn't understand why it's designed that way.
> The point is not really to "recycle old WAL files", it's to avoid having
> disk space allocation occur during the critical section where we must
> PANIC on failure.  Now, of course, that doesn't really work if the
> filesystem is COW underneath, because it's allocating fresh disk space
> anyway even though semantically we're overwriting existing data.
> But what I'd like to see is a fix that deals with that somehow, rather
> than continue to accept the possibility of ENOSPC occurring inside WAL
> writes on these file systems.  I have no idea what such a fix would
> look like :-(

That looks like a rather difficult problem to solve in PostgreSQL
itself, as the operator running the cluster is in charge of setting up
the FS options which control the COW behavior.  So it seems to me that
there is room for an in-core option to tell the checkpointer to
enforce the removal of past files instead of simply recycling them,
because this actually breaks max_wal_size.

max_wal_size is of course a soft limit, and it has been discussed a
couple of times that it would be nice to turn it into a hard limit,
but it is a genuinely hard problem to keep the system from slowing
down or even stopping its I/O when the hard limit is close or
reached...
--
Michael




Re: Pulling initial physical replication pg_basebackup from a downstream server

2018-04-25 Thread Michael Paquier
On Wed, Apr 25, 2018 at 09:52:47AM -0500, Scott Frazer wrote:
> I'm changing out a data center and I need to setup a new replicated server.
> The bandwidth speeds between the new data center and the master are slower
> than the speeds between the new data center and the current replica.
> 
> Can I get the pg_basebackup from the current replica and then tell
> the new server to connect to the master?

Taking a base backup from a replica is a supported operation.  The base
backup taken can then be used to connect to the root primary, even if
that's not the server you took the base backup from.  Being part of the
same cluster is enough (same system ID generated at initialization
time). 

> Would I need to do anything special to
> keep the master from discarding the transaction logs needed to come up to
> speed?

You could create a physical replication slot on the primary, and then
take the backup after making sure that the standby has fetched enough
WAL to be ahead of the LSN position at which the replication slot was
created.  Finally, create a recovery.conf on the new replica so that
it uses the slot created on the master.  You will end up retaining a
bit more WAL than necessary, but a client can request WAL from a
replication slot at any position at or after the oldest one retained.
Asking for a newer position simply causes the slot to stop retaining
the data older than the requested point.
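
A sketch of that sequence (slot name and connection string are
examples):

  -- on the master, before taking the base backup from the replica
  SELECT pg_create_physical_replication_slot('new_dc_standby');

  # then in recovery.conf on the new replica
  primary_conninfo = 'host=master port=5432 user=replicator'
  primary_slot_name = 'new_dc_standby'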
--
Michael




Re: relkind='p' has no pg_stat_user_tables

2018-05-04 Thread Michael Paquier
On Thu, May 03, 2018 at 01:24:59PM -0500, Justin Pryzby wrote:
> Yes, I was surprised about the difference between ANALYZE relkind_p
> and relkind_r.
> 
> But I see that's a documented behavior I'd missed until now:
> 
> https://www.postgresql.org/docs/current/static/sql-analyze.html
> |If the specified table is a partitioned table, both the inheritance 
> statistics
> |of the partitioned table as a whole and statistics of the individual 
> partitions
> |are updated.

When I read this thread, it seems to me that more user-friendly system
functions able to extract a sub-tree of child relations (by
inheritance and/or partitioning) would help.  For now, users are
limited to things like large WITH RECURSIVE queries when willing to
extract a full tree.  While that's easily done with a custom function,
there is room for an in-core function as well.  I recall that Amit
Langote has sent a patch which introduces a wrapper function on top of
find_all_inheritors; perhaps that will get into v12.
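
For reference, a sketch of the WITH RECURSIVE approach mentioned above
(the parent table name is a placeholder):

  WITH RECURSIVE tree AS (
      SELECT 'parent_tbl'::regclass::oid AS relid
    UNION ALL
      SELECT i.inhrelid
      FROM pg_inherits i
      JOIN tree t ON i.inhparent = t.relid
  )
  SELECT relid::regclass FROM tree;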
--
Michael




Re: recovery_target_time and WAL fetch with streaming replication

2018-05-12 Thread Michael Paquier
On Sun, May 13, 2018 at 01:39:48AM +0200, Hannes Erven wrote:
> what is Postgresql's strategy when to fetch WAL from the master while in
> streaming replication, and could it be tweaked?
>
> I'm using a physical streaming replication slave to have a database
> lagging about one month behind the primary, by setting
> "recovery_target_time" to the desired point in time.
> This setting is periodically advanced by a cron job to allow the
> replica to roll forward. It's a 10.3-1 install on Debian.

I think that you are complicating your life here.

Fetching WAL from a primary (or another standby) cannot be directly
controlled per se.  By definition, any WAL present will be
automatically fetched and synced on the standby.  Now, it is usually
not the moment WAL is fetched and synced on a standby that matters,
but the moment it is replayed.  For example, what people usually want
to be protected from is an unfortunate DROP TABLE run on the primary
by an administrator being immediately replayed on the standby, losing
the data.  Hence delaying when WAL is replayed can offer some
protection, and this can be achieved by setting
recovery_min_apply_delay in recovery.conf.  This makes WAL records
replaying transaction commits wait for the amount of time specified by
this parameter, giving you the time to recover from any failures with
a standby which has a controlled sync delta.
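
For example, a minimal recovery.conf sketch for a standby kept one
month behind (connection values are placeholders):

  standby_mode = 'on'
  primary_conninfo = 'host=primary port=5432 user=replicator'
  recovery_min_apply_delay = '30d'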

> One option of course would be to use some transfer mechanism external to
> Postgresql... but so far I'm thinking there must be any easier way?

Another option I can think of here is to use a specific
restore_command instead of streaming replication.  Simply archive a
WAL segment on the primary with some meta-data like the time it was
archived, and then allow the standby to recover the segment only after
a delta has passed.  This can allow a more even distribution of
segments.
--
Michael




Re: recovery_target_time and WAL fetch with streaming replication

2018-05-13 Thread Michael Paquier
On Sun, May 13, 2018 at 09:42:42AM +0200, Hannes Erven wrote:
> But when new WAL is needed, the standby will fetch /all/ WAL present on the
> master.

Fetching as much WAL as possible when recovery happens is wanted by
design, so that it recovers as much as possible.  And that's
documented.

> I'd say, the standby should either:
> - always connect to the primary and fetch any WAL present

This is what a hot standby does.  It keeps waiting for WAL to become
available from whichever source is used (archive, local pg_xlog or
streaming) and switches between one and the other.  You can look at
WaitForWALToBecomeAvailable to get an idea of the details.

> - stop fetching/streaming WAL when it is not needed for the current
> recovery_target

The startup process is in charge of recovery (WAL replay and deciding
from where to get the available WAL), and is the one which decides if
streaming is necessary or not.  If streaming is used, then it starts a
WAL receiver.  If a switch from streaming to another WAL source (local
pg_xlog or archives) is done, then it shuts down the WAL receiver,
which consists of sending SIGTERM to the WAL receiver and stopping it
immediately with a FATAL message (this stops the process immediately).
The key point is that the WAL receiver is designed to be a
light-weight transport layer for WAL data.  In short, it receives a
set of WAL bytes and writes them.  It does not include any logic to
decode WAL records, so it cannot know when a stop point happens.  It
also has no idea of the configuration within recovery.conf, which is
loaded by the startup process.

> Yes, but that's far less simple than just setting recovery_target_time.

It seems to me that archiving provides the control you are looking for.
--
Michael




Re: 10.4 upgrade, function markings, and template0

2018-05-14 Thread Michael Paquier
On Mon, May 14, 2018 at 05:22:39PM -0400, Tom Lane wrote:
> Maybe it'd be worth building some sort of infrastructure that would
> allow this to be done at a lower level.  It's not hard to imagine
> an autovacuum-like or bgworker-based thingy that could run around
> and apply a given SQL script in every database, bypassing the usual
> worries about authentication and connections-disabled databases.

A portion of the infrastructure is already available for background
workers, which can use BGWORKER_BYPASS_ALLOWCONN since Postgres 11 to
force connections to databases even if an administrator has disabled
connections to them.
--
Michael




Re: Out of memory error with PG10.3, 10.4 but not 9.3.19

2018-05-16 Thread Michael Paquier
On Tue, May 15, 2018 at 03:02:48PM +, ChatPristi wrote:
> I got the error with PG 10.3 and 10.4 on CentOS 7 up-to-date.  The
> command works with a smaller database.
> The command works with the same database with PG 9.3.19 on RHEL 6.9
> up-to-date.
> 
> I attach the EXPLAIN SELECT command.

Well, instead of an explain output which takes 2.4MB compressed and
9.6MB uncompressed (read: unreadable), could you produce a
self-contained test case with a glimpse of the schema you are using?
Where does the OOM happen, and how did you change your partitioned
table schema?  Are you using native partitioning instead?
--
Michael




Re: Problem compiling PostgreSQL.

2018-05-17 Thread Michael Paquier
On Thu, May 17, 2018 at 08:31:48AM +0100, Paul Linehan wrote:
> I'm having problems compiling PostgreSQL.

On which platform and/or distribution are you trying the code
compilation?
--
Michael




Re: PostgreSQL backup issue

2018-05-22 Thread Michael Paquier
On Tue, May 22, 2018 at 10:35:46PM -0700, David G. Johnston wrote:
> I would conclude that pg_basebackup is placing its output in stderr
> instead of stdout then...

The output of pg_basebackup's verbose mode goes to stderr (look, for
example, at the verbose flags in pg_basebackup.c).
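
So if you want to capture the progress output in a script, redirect
stderr explicitly, for example:

  pg_basebackup -D /backups/base -v -P 2> basebackup.log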
--
Michael




Re: Streaming Replication between PostGreSQL 9.2.2 on Red Hat and PostGreSQL 9.2.24 on Debian

2018-05-24 Thread Michael Paquier
On Thu, May 24, 2018 at 10:21:33AM +0200, talk to ben wrote:
> - Doing Streaming Replication between different minor version of PG is
>   possible but not recommended [2]

Standbys need to be updated first, so be careful that the primary is
not updated before the standbys, or WAL generated on the primary may
not be able to be replayed on them.  Note however that you won't get
support for such configurations on the community lists, so just make
sure that all nodes in a cluster are on the same version and that it
is the latest one.
--
Michael




Re: Microsoft buys GitHub, is this a threat to open-source

2018-06-04 Thread Michael Paquier
On Mon, Jun 04, 2018 at 08:44:37PM +0200, Pavel Stehule wrote:
> I have few projects there - Orafce, plpgsql_check, and pspg. I hope so
> these projects are well protected by BSD licence - and distributed
> redundant nature of git. I hope so there is not reason for panic this
> moment. I have not a big data in non git sources - issues, and others.

Not sure myself if there is any need to worry about this stuff (which
is not related to the Postgres lists, as there is no dependency on
GitHub, yeah!).  Still, one thing that anybody hosting projects on
remote places should do anyway is take automatic backups of what they
have on those places and keep a copy of them locally.  This way, you
have an exit door if something happens to the place where the code is
located.  I do so for all my stuff on GitHub, for example.  And git
makes that really easy to do.
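
For example, a bare mirror kept fresh with a periodic fetch (the URL
is a placeholder):

  git clone --mirror https://github.com/user/project.git
  cd project.git
  git remote update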
--
Michael




Re: Microsoft buys GitHub, is this a threat to open-source

2018-06-07 Thread Michael Paquier
On Thu, Jun 07, 2018 at 11:57:06AM +0200, Jehan-Guillaume (ioguix) de Rorthais wrote:
> How do you backup your projects issues on github? Using the API with some
> loops?

I personally don't care much about this part.  As long as the code
survives...
--
Michael




Re: Print pg_lsn as a number?

2018-06-12 Thread Michael Paquier
On Tue, Jun 12, 2018 at 10:39:43AM -0700, Andres Freund wrote:
> On 2018-06-12 09:31:49 -0700, Scott Stroupe wrote:
>> According to the documentation[1], pg_lsn is a 64-bit integer that's
>> printed as two hex numbers separated by a slash, e.g. 68/1225BB70. Is
>> there a way to get the 64-bit integer in a common numeric
>> representation instead of the peculiar hex-slash-hex representation?
> 
> What do you want to do with the LSN?

While fixing the slot advance stuff, I would have liked to get an
automatic test able to reproduce the crash where the slot is first
advanced to a page boundary, and then moved forward again.  However,
it happens that this is not that easy to do, so I would be rather
supportive of at least the following operations:
pg_lsn + numeric = pg_lsn.
pg_lsn % numeric = numeric.

We map the pg_wal_lsn_diff result to a numeric, so it could make sense
to use numeric here as well.  That is not the greatest choice, by the
way, as the value is an int64 internally, but it is more portable for
any (unlikely-to-happen) future changes.

Using the segment size value in pg_settings, you could also advance
the LSN by a full segment, for example...
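
In the meantime, a common workaround to get a plain number out of a
pg_lsn is pg_wal_lsn_diff() against 0/0, for example:

  SELECT pg_wal_lsn_diff('68/1225BB70'::pg_lsn, '0/0'::pg_lsn) AS lsn_as_numeric;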
--
Michael




Re: Load data from a csv file without using COPY

2018-06-19 Thread Michael Paquier
On Tue, Jun 19, 2018 at 02:32:10PM -0700, David G. Johnston wrote:
> ​You really need to describe what you consider to be a "real life​
> scenario"; and probably give a better idea of creation and number of these
> csv files.  In addition to describing the relevant behavior of the
> application you are testing.
> 
> If you want maximum realism you should probably write integration tests for
> your application and then execute those at high volume.
> 
> Or at minimum give an example of the output you would want from this
> unknown program...

Hard to say what you are especially looking for that psql's \copy
cannot do, but perhaps you would be interested in pg_bulkload?  Here
is a link to the project:
https://github.com/ossc-db/pg_bulkload/

It has a couple of fancy features as well, like tolerating row-level
failures when loading a large file, etc.
--
Michael




Re: using pg_basebackup for point in time recovery

2018-06-19 Thread Michael Paquier
Hi Pierre,

On Tue, Jun 19, 2018 at 12:03:58PM +, Pierre Timmermans wrote:
> Here is the doc, the sentence that I find misleading is "There are
> backups that cannot be used for point-in-time recovery", also
> mentioning that they are faster than pg_dumps add to confusion (since
> pg_dumps cannot be used for PITR):
> https://www.postgresql.org/docs/current/static/continuous-archiving.html

Yes, it is indeed perfectly possible to use such backups to do a PITR
as long as you have a WAL archive able to replay up to the point where
you want the replay to happen, so I agree that this is a bit
confusing.  This part of the documentation has been there since the
beginning of time, well, commit 6559c4a2 to be exact.  Perhaps we
could reword this sentence as follows:
"These are backups that could be used for point-in-time recovery if
combined with a WAL archive able to recover up to the wanted recovery
point.  These backups are typically much faster to backup and restore
than pg_dump for large deployments but can result as well in larger
backup sizes, so the speed of one method or the other is to evaluate
carefully first."

I am open to better suggestions of course.
--
Michael




Re: using pg_basebackup for point in time recovery

2018-06-20 Thread Michael Paquier
Hi Pierre,

On Wed, Jun 20, 2018 at 08:06:31AM +, Pierre Timmermans wrote:
> Hi Michael

You should avoid top-posting on the Postgres lists; it is not the
usual style used by people around here :)

> Thanks for the confirmation. Your rewording removes the confusion. I
> would maybe take the opportunity to re-instate that pg_dump cannot be
> used for PITR, so in the line of 
> "These are backups that could be used for point-in-time recovery if
> combined with a WAL archive able to recover up to the wanted recovery
> point.  These backups are typically much faster to backup and restore
> than pg_dump for large deployments but can result as well in larger
> backup sizes, so the speed of one method or the other is to evaluate
> carefully first. Consider also that pg_dump backups cannot be used for
> point-in-time recovery."

Attached is a patch which includes your suggestion.  What do you think?
As that's an improvement, only HEAD would get that clarification.

>  Maybe the confusion stems from the fact that if you restore a
> standalone (self-contained) pg_basebackup then - by default - recovery
> is done with the recovery_target immediate option, so if one needs
> point-in-time recovery he has to edit the recovery.conf and brings the
> archives..

Perhaps.  There is really nothing preventing one from adding a
recovery.conf afterwards, which is also why pg_basebackup -R exists.
I do that as well for some of the frameworks I work with and maintain.
--
Michael
diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml
index 982776ca0a..ccc0a66bf3 100644
--- a/doc/src/sgml/backup.sgml
+++ b/doc/src/sgml/backup.sgml
@@ -1430,12 +1430,15 @@ restore_command = 'cp /mnt/server/archivedir/%f %p'
  Standalone Hot Backups
 
  
-  It is possible to use PostgreSQL's backup facilities to
-  produce standalone hot backups. These are backups that cannot be used
-  for point-in-time recovery, yet are typically much faster to backup and
-  restore than pg_dump dumps.  (They are also much larger
-  than pg_dump dumps, so in some cases the speed advantage
-  might be negated.)
+  It is possible to use PostgreSQL's backup
+  facilities to produce standalone hot backups.  These are backups that
+  could be used for point-in-time recovery if combined with a WAL
+  archive able to recover up to the wanted recovery point.  These backups
+  are typically much faster to backup and restore than pg_dump for large
+  deployments but can result as well in larger backup sizes, so the
+  speed of one method or the other is to evaluate carefully first.  Note
+  also that pg_dump backups cannot be used
+  for point-in-time recovery.
  
 
  




Re: using pg_basebackup for point in time recovery

2018-06-21 Thread Michael Paquier
On Thu, Jun 21, 2018 at 04:42:00PM -0400, Ravi Krishna wrote:
> Same here, even though I use Mac mail. But it is not Yahoo alone.
> Most of the web email clients have resorted to top posting.  I miss
> the old days of Outlook Express, which was so '>' friendly.  I think
> Gmail allows '>' when you click on the dots to expand the mail you
> are replying to, but it messes up the justification and formatting.

Those products have good practices when it comes to breaking and
redefining what the concept behind email is...
--
Michael




Re: using pg_basebackup for point in time recovery

2018-06-21 Thread Michael Paquier
On Thu, Jun 21, 2018 at 04:50:38PM -0700, David G. Johnston wrote:
> Generally only actual bug fixes get back-patched; but I'd have to say
> this looks like it could easily be classified as one.

Everybody is against me here ;)

> Some comments on the patch itself:
> 
> "recover up to the wanted recovery point." - "desired recovery point" reads
> better to me
> 
> 
> "These backups are typically much faster to backup and restore" - "These
> backups are typically much faster to create and restore"; avoid repeated
> use of the word backup

Okay.

> "but can result as well in larger backup sizes" - "but can result in larger
> backup sizes", drop the unnecessary 'as well'

Okay.

> I like adding "cold backup" here to help contrast and explain why a base
> backup is considered a "hot backup".  The rest is style to make that flow
> better.

Indeed.  The section uses hot backups a lot.

What do folks here think about the updated version attached?
--
Michael
diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml
index 982776ca0a..af48aa64c2 100644
--- a/doc/src/sgml/backup.sgml
+++ b/doc/src/sgml/backup.sgml
@@ -1430,12 +1430,15 @@ restore_command = 'cp /mnt/server/archivedir/%f %p'
  Standalone Hot Backups
 
  
-  It is possible to use PostgreSQL's backup facilities to
-  produce standalone hot backups. These are backups that cannot be used
-  for point-in-time recovery, yet are typically much faster to backup and
-  restore than pg_dump dumps.  (They are also much larger
-  than pg_dump dumps, so in some cases the speed advantage
-  might be negated.)
+  It is possible to use PostgreSQL's backup
+  facilities to produce standalone hot backups.  These are backups that
+  could be used for point-in-time recovery if combined with a WAL
+  archive able to recover up to the wanted recovery point.  These backups
+  are typically much faster to create and restore than
+  pg_dump for large deployments but can result
+  in larger backup sizes.  Note also that
+  pg_dump backups cannot be used for
+  point-in-time recovery.
  
 
  




Re: using pg_basebackup for point in time recovery

2018-06-25 Thread Michael Paquier
On Mon, Jun 25, 2018 at 12:51:10PM -0400, Bruce Momjian wrote:
> FYI, in recent discussions on the docs list:
> 
>   
> https://www.postgresql.org/message-id/CABUevEyumGh3r05U3_mhRrEU=dfacdrr2hew140mvn7fsbm...@mail.gmail.com

I did not recall this one.  Thanks for the reminder, Bruce.

> There was the conclusion that:
> 
>   If it's a clean backpatch I'd say it is -- people who are using
>   PostgreSQL 9.6 will be reading the documentation for 9.6 etc, so they
>   will not know about the fix then.
>   
>   If it's not a clean backpatch I can certainly see considering it, but if
>   it's not a lot of effort then I'd say it's definitely worth it.
> 
> so the rule I have been using for backpatching doc stuff has changed
> recently.

In the case of this thread, I think that the patch applies cleanly
anyway, as this text comes from the period where hot standbys were
introduced.  So that would not be a lot of work...  Speaking of which,
it would be nice to be sure about the wording folks here would prefer
before fixing anything ;p
--
Michael




Re: Specifying WAL Location in Streaming Replication

2018-07-08 Thread Michael Paquier
On Sun, Jul 08, 2018 at 09:51:47AM -0400, Matt Dee wrote:
> In the documentation for START_REPLICATION, a required argument is the WAL
> location to begin streaming at, and I'm not sure what to use here.  I have
> been using 0, and it seems to work fine.  Additionally, it seems that when
> --startpos is not provided to pg_recvlogical, it defaults to just sending a
> wal location of 0.
> 
> While this seems to work, I feel a bit uneasy about it since I can't find
> any documentation explaining why this works or what this argument does when
> using a replication slot.  Any clarification would be appreciated.

0/0 can be used in the case where the client does not know which
position it should try to use, in which case the server chooses by
itself, starting from the oldest LSN position where changes have been
left unconsumed.  If you have an idea for a documentation improvement,
I think that it would be welcome, say in the replication protocol page
for the START_REPLICATION command.  This level of detail is really
internal, but fresh ideas may bring more clarity in this area; nowhere
in the docs are references made to either 0/0 or InvalidXLogRecPtr for
WAL positions.
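
For what it's worth, this is what pg_recvlogical does implicitly when
--startpos is omitted; the following is equivalent to passing
--startpos=0/0 (database and slot names are placeholders, and the slot
must already exist):

  pg_recvlogical -d mydb --slot=myslot --start -f -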
--
Michael




Re: Reconnecting a slave to a newly-promoted master

2018-07-10 Thread Michael Paquier
On Mon, Jul 09, 2018 at 05:58:53PM -0700, Shawn Mulloney wrote:
> There are three PostgreSQL machines: A, B, and C. B and C are slaves off of
> the master, A. A fails, and B is promoted to being the new master. Can C
> just be pointed at A and have it "just work"?

In your question I am pretty sure that you mean "B" instead of "A", as
"A" has died and is off the grid.  The answer to that question is
perhaps no: C could have replayed WAL ahead of B, in which case you
could end up with a corrupted C instance when trying to reconnect it
directly to the promoted B.
--
Michael




Re: Waiting connections postgres 10

2018-07-23 Thread Michael Paquier
On Mon, Jul 23, 2018 at 04:41:59PM +, Nicola Contu wrote:
> we used to monitor waiting connections with postgres 9.6.6 via this
> query: 
> 
> select count (*) from pg_stat_activity where wait_event IS NOT NULL

Please note that a wait event becomes NULL once it is reset, so what
this query does is monitor backends currently waiting on something,
not backends which are idle.  Hence you would also include backends
doing some active work.

> select count (*) from pg_stat_activity where wait_event_type ='Client'
> and wait_event IN ('ClientRead','ClientWrite') and state='idle'.
>
> Can anyone help me?

You are visibly looking for a WHERE clause defined with state IN
('idle', 'idle in transaction') when looking for connections waiting
for some activity to be generated by the application; that works with
9.6 as well.
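
In other words, something like this:

  SELECT count(*)
  FROM pg_stat_activity
  WHERE state IN ('idle', 'idle in transaction');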
--
Michael




Re: Pg_rewind cannot load history wal

2018-08-03 Thread Michael Paquier
On Wed, Aug 01, 2018 at 09:09:30PM +, Richard Schmidt wrote:
> Our procedure that runs on machine A and B is as follows:
> 
>   1.  Build new databases on A and B, and configure A as the primary
>   and B as the standby database.
>   2.  Make some changes to A (the primary) and check that they are
>   replicated to B (the standby).
>   3.  Promote B to be the new primary.
>   4.  Switch off A (the original primary).
>   5.  Add the replication slot to B (the new primary) for A (soon to
>   be standby).
>   6.  Add a recovery.conf to A (soon to be standby). The file contains
>   recovery_target_timeline = 'latest' and restore_command = 'cp
>   /ice-dev/wal_archive/%f "%p"'.
>   7.  Run pg_rewind on A - this appears to work as it returns the
>   message 'source and target cluster are on the same timeline, no
>   rewind required';
>   8.  Start up server A (now a slave)

Step 7 is incorrect here; after the promotion of B you should see
pg_rewind actually do its work.  The problem is that you are missing a
piece in your flow: a checkpoint on the promoted standby, run after
step 3 and before step 7.  This makes the promoted standby update its
timeline number in the on-disk control file, which is used by
pg_rewind to check if a rewind needs to happen or not.
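
Concretely, between steps 3 and 7, simply run this on the promoted
standby (B) so that the control file reflects the new timeline:

  CHECKPOINT;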

We see too many reports of such mistakes, I am going to propose a patch
on the -hackers mailing list to mention that in the documentation...
--
Michael




Re: Sv: WAL-dir filling up with wal_level = logical

2018-08-03 Thread Michael Paquier
On Fri, Aug 03, 2018 at 12:40:16PM +0200, Andreas Kretschmer wrote:
> On 3 August 2018 12:33:26 CEST, Andreas Joseph Krogh wrote:
>> Hm, a co-worker noticed that we have max_wal_size = 16GB, that might
>> be the reason :-)
> 
> Yes ;-)

Worth mentioning that this is a soft limit, and not a hard limit;
hence, depending on your workload, you may see more WAL segments than
what the 16GB setting implies.  The docs mention that, so no need to
be surprised.
--
Michael




Re: Pg_rewind cannot load history wal

2018-08-03 Thread Michael Paquier
On Sat, Aug 04, 2018 at 07:44:59AM +0100, Simon Riggs wrote:
> I think the problem is that writing the online checkpoint is deferred
> after promotion, so this is a timing issue that probably doesn't show
> in our regression tests.

Somewhat.  It is a performance improvement from 9.3 to let the startup
process request a checkpoint from the checkpointer process instead of
doing it itself.

> Sounds like we should write a pending timeline change to the control
> file and have pg_rewind check that instead.
> 
> I'd call this a timing bug, not a doc issue.

Well, having pg_rewind enforce a checkpoint on the promoted standby
could cause a performance hit as well if we do it unconditionally: if
there is a delay between the promotion and the rewind, a checkpoint
could already have happened.  So for me it is first a documentation
bug regarding the failover workflow, and potentially a patch for a new
feature which makes pg_rewind trigger a checkpoint directly.
--
Michael




Re: Pg_rewind cannot load history wal

2018-08-04 Thread Michael Paquier
On Sat, Aug 04, 2018 at 04:59:45AM -0700, Andres Freund wrote:
> On 2018-08-04 10:54:22 +0100, Simon Riggs wrote:
>> pg_rewind doesn't work correctly. Documenting a workaround doesn't change 
>> that.
> 
> Especially because most people will only understand this after they've
> been hit, as test scenarios will often just be quick enough.

Well, the tool has behaved this way since its creation.  I am not sure
either that we can have pg_rewind create a checkpoint on the source
node each time a rewind is done, as it may not be necessary, and it
would force WAL segment recycling more than necessary.  So if we were
to back-patch something like that, I am pretty much convinced that we
would get complaints from people already using the tool with existing
failover flows, which would be broken.  Making this work without a
checkpoint is actually possible.  When the source is offline, the
control file can be relied on, as the shutdown checkpoint updates the
on-disk control file.  When the source is online, pg_rewind only needs
to know the new timeline number from the source, which we could
provide via a SQL function, but that would work only on HEAD (look at
ControlFile_source; you would see that only the new TLI matters, and
that getTimelineHistory does not really need to know the contents of
the control file).
--
Michael




Re: Pg_rewind cannot load history wal

2018-08-04 Thread Michael Paquier
On Sat, Aug 04, 2018 at 07:54:36AM -0700, Christophe Pettus wrote:
> Would having pg_rewind do a checkpoint on the source actually cause
> anything to break, as opposed to a delay while the checkpoint
> completes?

Users relying only on streaming without archives would be impacted, as
potentially two checkpoints could have run on the promoted standby,
meaning that the past segments needed from the divergence point may no
longer be around.  That's a problem which exists in v11, where only
WAL segments worth one checkpoint are kept around, not in 9.5, 9.6 and
10.

> The current situation can create a corrupted target, which seems far
> worse than just slowing down pg_rewind.

Hm?  pg_rewind requires the target to be stopped properly, meaning
that the divergence point is known to both nodes.  If the source is
online and has not created the first post-recovery checkpoint, then
pg_rewind would be a no-op, and you would get a failure when
restarting the old master with a recovery.conf.  If you stop the old
master in such a way that it needs crash recovery at the next startup,
then there is indeed a risk of a corrupted instance, but that would be
the same problem even if pg_rewind were used.
--
Michael




Re: upgrading from pg 9.3 to 10

2018-08-15 Thread Michael Paquier
On Tue, Aug 14, 2018 at 04:15:12PM -0300, Martín Marqués wrote:
> I'd recommend testing with a clone of the server to verify that it works
> properly (not only pg_upgrade, but your application with the new version
> of postgres). Also to time the window you'll need and see if there are
> things to be aware of, like extensions which are upgraded and might
> break the upgrade.
> 
> Now if you are going to first jump to 9.4, I'd recommend using pglogical
> after getting to 9.4 and upgrade straight from 9.4 to 10 (always after
> testing your application against 10)

Well, pglogical has the advantage of reducing the downtime, which may
or may not matter depending on your application; you may be able to
accept a bit of downtime, and pg_upgrade --link can be pretty quick at
its job.  Test it beforehand, as --link is a no-return trip.
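
A sketch of such an upgrade with hard links (binary and data paths are
placeholders for your own layout):

  pg_upgrade --link \
      -b /usr/lib/postgresql/9.3/bin -B /usr/lib/postgresql/10/bin \
      -d /var/lib/postgresql/9.3/main -D /var/lib/postgresql/10/main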
--
Michael




Re: pg_basebackup failed to read a file

2018-08-15 Thread Michael Paquier
On Tue, Aug 14, 2018 at 12:14:59PM -0400, Tom Lane wrote:
> That seems like a pretty expensive thing to do, if there are lots of
> files ... and you'd still end up failing, so it's not moving the ball
> very far.

Yeah, I would think that with many small relations it is going to have a
measurable performance impact if we scan the whole data directory a
second time.

> More generally, this seems closely related to bug #14999 [1]
> which concerned pg_rewind's behavior in the face of unexpected file
> permissions within the data directory.  We ended up not doing anything
> about that except documenting it, which I wasn't very satisfied with,
> but the costs of doing better seemed to exceed the benefits.

Please feel free to read the end of the thread about details on the
matter.  There are many things you could do, all have drawbacks.

> It'd be nice to have a more coherent theory about what needs to be copied
> or not, and not fail on files that could simply be ignored.  Up to now
> we've resisted having any centrally defined knowledge of what can be
> inside a PG data directory, but maybe that bullet needs to be bitten.

Yeah, I have not really come up with a nice idea yet, especially as
custom files that some users have been deploying sometimes move
around, so I am not completely sure that we need to do something
anyway, nor that it is worth the trouble.  One saner strategy may be
to move your custom files into a directory outside the main data
folder...
--
Michael




Re: Reeving an error while taking the backup using "pg_basebackup" utility.

2018-08-27 Thread Michael Paquier
On Tue, Aug 28, 2018 at 07:19:12AM +0530, Raghavendra Rao J S V wrote:
> pg_basebackup: could not get transaction log end position from server:
> FATAL:  requested WAL segment 00010285008F has already been
> removed
> 
> Please guide me on why this happens and how to handle this error. If
> you want me to change any of the options in my pg_basebackup command,
> let me know.

This means that while taking the backup, a checkpoint has come in and
recycled past segments.  It is bad luck; depending on the load you may
see such failures.  One way to avoid the problem is to take the base
backup without WAL segments included and rely on a WAL archive for the
backup instead.  A second is to use a physical replication slot, which
guarantees the presence of the wanted segments.
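
As a sketch of the second approach (the slot name is an example;
pg_basebackup supports --slot with -X stream since 9.6):

  -- on the primary, create the slot first
  SELECT pg_create_physical_replication_slot('basebackup_slot');

  # then stream WAL through that slot while taking the backup
  pg_basebackup -D /backup/dir -X stream -S basebackup_slot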
--
Michael




Re: pg_basebackup + SSL error: bad length

2018-08-27 Thread Michael Paquier
On Mon, Aug 27, 2018 at 04:40:34PM -0700, Adrian Klaver wrote:
> Is there more then one copy of pg_basebackup on the machines?

Or this user has created a tablespace directly inside the main data
folder, which can cause pg_basebackup to fail because of recursion
issues.  It is no wonder that a WARNING is raised during CREATE
TABLESPACE if you do that.
--
Michael




Re: WAL replay issue from 9.6.8 to 9.6.10

2018-08-29 Thread Michael Paquier
On Wed, Aug 29, 2018 at 08:31:50AM +0200, Alexander Kukushkin wrote:
> 2018-08-29 6:02 GMT+02:00 Dave Peticolas :
>> Hello, I'm seeing some issues with WAL replay on a test server running
>> 9.6.10 using WAL archived from a 9.6.8 primary server. It reliably PANICs
>> during replay with messages like so:
>>
>> WARNING:  page 1209270272 of relation base/16422/47496599 does not exist
>> CONTEXT:  xlog redo at 4810/C84F8A0 for Btree/DELETE: 88 items
>> PANIC:  WAL contains references to invalid pages
> 
> 
> it looks like you are hitting pretty much the same problem as I:
> https://www.postgresql.org/message-id/flat/153492341830.1368.3936905691758473953%40wrigleys.postgresql.org
> The only major difference is that you are restoring from a backup,
> while in my case the host running the replica has crashed.
> Also in my case, the primary was already running 9.6.10.
> 
> In my case, it also panics during "Btree/DELETE: XYZ items" and page
> number of relation is insanely huge.

That would be the same problem.  Dave, do you have a background worker
running in parallel, or some read-only workload with backends doing
read-only operations on a standby once it has reached a consistent
point?
--
Michael




Re: WAL replay issue from 9.6.8 to 9.6.10

2018-08-29 Thread Michael Paquier
On Wed, Aug 29, 2018 at 09:15:29AM -0700, Dave Peticolas wrote:
> Oh, perhaps I do, depending on what you mean by worker. There are a couple
> of periodic processes that connect to the server to obtain metrics. Is that
> what is triggering this issue? In my case I could probably suspend them
> until the replay has reached the desired point.

That would be it.  How do you decide when those begin to run and
connect to Postgres?  Do you use pg_isready or similar in a loop for
sanity checks?

> I have noticed this behavior in the past but prior to 9.6.10 restarting the
> server would fix the issue. And the replay always seemed to reach a point
> past which the problem would not re-occur.

You are piquing my interest here.  Did you actually see the same
problem?  What happens in 9.6.10 is that I have tightened the
consistent point checks and logic so that inconsistent page issues
actually show up when they should, and become reproducible, so that we
can track down any rogue WAL record or inconsistent behavior.
Michael




Re: WARNING: could not flush dirty data: Function not implemented

2018-09-03 Thread Michael Paquier
On Sun, Sep 02, 2018 at 04:31:18PM -0700, Andres Freund wrote:
> Please note that nobody has verified that postgres works correctly via
> the emulation stuff MS is doing.  There is a native version of postgres
> for windows however, and that is tested (and exercised by a lot of
> installations).

If there are folks willing to put enough effort into getting this to
work, it could work, assuming that a buildfarm animal is able to go
down this road.  From what I can see on this thread, we are not at
that stage yet though.
--
Michael




Re: scram-sha-256 authentication broken in FIPS mode

2018-09-04 Thread Michael Paquier
On Wed, Sep 05, 2018 at 03:29:31AM +, Alessandro Gherardi wrote:
> It looks like scram-sha-256 doesn't work when postgres is linked
> against FIPS-enabled OpenSSL and FIPS mode is turned on.
> 
> Specifically, all login attempts fail with an OpenSSL error saying
> something along the lines of "Low level API call to digest SHA256
> forbidden in fips mode".

The error comes from libc, right?  Postgres can of course be configured
to work with FIPS without patching it, it just needs to be enabled
system-wide, which is what RedHat does, and what you are doing I guess?

> I think this issue could be solved by refactoring the code in
> sha2_openssl.c to use the OpenSSL EVP interface
> (see https://wiki.openssl.org/index.php/EVP_Message_Digests ). 
> Any thoughts? Is this a known issue?

This report is the first of its kind since Postgres 10, which is where
the SHA2 interface for OpenSSL was introduced.  So likely we'd need to
look into that more deeply...  This has the strong smell of a bug.  If
your system is new enough, you should have sha256() & co as system
functions, so you would see the failure there as well?  The regression
tests would likely have complained.
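
For example, a quick check from SQL, assuming a version new enough to
have it (sha256() exists as an in-core function in Postgres 11 and
newer):

  SELECT sha256('fips-check'::bytea);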
--
Michael




Re: scram-sha-256 authentication broken in FIPS mode

2018-09-05 Thread Michael Paquier
On Wed, Sep 05, 2018 at 01:19:39PM +, Alessandro Gherardi wrote:
> Hi Michael,I'm actually running postgres on Windows.

First, you may want to avoid top-posting.  This is not the style of
the community lists, and it breaks the logic of a thread.

> I added code to fe-secure-openssl.c and be-secure-openssl.c that reads
> the Windows "standard" FIPS registry entry, and if FIPS is enabled
> calls FIPS_mode_set(1). This is to mimic the behavior of the .NET
> framework.

That's rather uncharted territory, as you are patching both the
backend *and* the client.  If we could prove that sha2-openssl.c is
actually unreliable when FIPS is enabled system-wide, with either
SCRAM authentication or any of the other hashing functions, then I
would be ready to accept a patch.  Now, as far as I can see, and have
heard from other folks for at least Linux, if FIPS is enabled at the
OS level then Postgres uses it automatically and SCRAM is able to
work.  I have yet to hear that this part is broken.  As far as I know
from companies within the community which have worked on STIG
requirements, the thing works.

> Below is the code I added to fe-secure-openssl.c, the code in
> be-secure-openssl.c is similar: 
> Thoughts? I can try to fix the scram-sha-256 issue by using EVP and
> send you a merge request for the patch and the code below if you think
> my approach is correct.

That's a bit unreadable I am afraid :)
You may want to attach a patch after producing it with for example "git
format-patch -1".
--
Michael




Re: how to know current xlog location on standby after primary is down

2018-09-09 Thread Michael Paquier
On Sun, Sep 09, 2018 at 10:29:08PM +0800, magodo wrote:
> So I want to know what is the correct way to do it. Thank you in
> advance!

There are pg_last_wal_receive_lsn() and pg_last_wal_replay_lsn(), which
tell you the last LSN received and replayed on a standby.  Those can be
used when an instance is in recovery.
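For example, run on the standby itself (a sketch; both functions return
NULL on a server started normally without recovery):

SELECT pg_last_wal_receive_lsn() AS received_lsn,
       pg_last_wal_replay_lsn()  AS replayed_lsn,
       pg_last_wal_receive_lsn() = pg_last_wal_replay_lsn() AS caught_up;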
--
Michael




Re: scram-sha-256 authentication broken in FIPS mode

2018-09-10 Thread Michael Paquier
On Mon, Sep 10, 2018 at 02:52:00PM +, Alessandro Gherardi wrote:
> I changed the implementation of the other SHA digests to use EVP
> also.  I verified that, with these changes, scram-sha-256 works when
> FIPS is enabled.

Hm...  I have spent a couple of hours setting up a CentOS 7 VM with FIPS
enabled to track down if there is actually a problem.  Here is what I
have done to enable it:
1) yum install dracut-fips
dracut -v -f
2) Update the boot loader, where it is necessary to update
GRUB_CMDLINE_LINUX in /etc/default/grub by adding "fips=1
boot=UUID=$PARTITION_UUID" to it.  PARTITION_UUID can be found with
"blkid" for the /boot partition (depends on the partition layout by the
way).
3) Disable prelinking (requires installation of the package prelink) by
adding PRELINKING=no to /etc/sysconfig/prelink, then remove existing
prelinks with "prelink -u -a".

After a reboot, it is possible to see /proc/sys/crypto/fips_enabled set
to 1.  Once I did that, unfortunately I have not been able to spot
deficiencies when calling the low-level SHA APIs from OpenSSL: both
SCRAM and all the in-core SSL functions prove to work correctly.
Calling FIPS_mode() directly within Postgres backends also proves that
FIPS is effectively enabled.  Anyway, on top of the remark Alessandro
has made above, this line from the OpenSSL docs has caught my eye:
https://www.openssl.org/docs/man1.1.0/crypto/SHA512_Init.html
"Applications should use the higher level functions EVP_DigestInit
etc. instead of calling the hash functions directly."

This is present in OpenSSL docs for some time:
commit: 4facdbb5fa9d791fc72dc78b9c3512ea1384df33
author: Ulf Möller 
date: Sun, 6 Feb 2000 23:26:31 +

Hence, intrinsically, we are in contradiction with the upstream docs.  I
have worked on the problem and come up with a patch, which works down to
OpenSSL 0.9.8 and should fix your issue.  This is based on what you sent
previously, except that I was not able to apply what was sent, so I
reworked the whole thing.  Alessandro, does this fix your problems?  I
would like to apply that down to v10, where SCRAM was introduced.
--
Michael




Re: scram-sha-256 authentication broken in FIPS mode

2018-09-10 Thread Michael Paquier
On Tue, Sep 11, 2018 at 12:02:50PM +0900, Michael Paquier wrote:
> Hence, intrinsically, we are in contradiction with the upstream docs.  I
> have worked on the problem and come up with a patch, which works down to
> OpenSSL 0.9.8 and should fix your issue.  This is based on what you sent
> previously, except that I was not able to apply what was sent, so I
> reworked the whole thing.  Alessandro, does this fix your problems?  I
> would like to apply that down to v10, where SCRAM was introduced.

With the actual patch attached things are better.  So here it is. 
--
Michael
From 1b450dee61855f4fd8b9e4a37d2f95c07f26db55 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Tue, 11 Sep 2018 11:34:48 +0900
Subject: [PATCH] Change SHA algorithms to use EVP_CTX from OpenSSL

This seems to fix issues with FIPS mode on Windows.
---
 src/common/sha2_openssl.c | 45 +++
 src/include/common/sha2.h | 10 -
 2 files changed, 36 insertions(+), 19 deletions(-)

diff --git a/src/common/sha2_openssl.c b/src/common/sha2_openssl.c
index 362e1318db..e80dec7b4d 100644
--- a/src/common/sha2_openssl.c
+++ b/src/common/sha2_openssl.c
@@ -20,83 +20,100 @@
 #include "postgres_fe.h"
 #endif
 
-#include 
-
 #include "common/sha2.h"
 
+static void
+digest_init(EVP_MD_CTX **ctx, const EVP_MD *type)
+{
+	*ctx = EVP_MD_CTX_create();
+	EVP_DigestInit_ex(*ctx, type, NULL);
+}
+
+static void
+digest_update(EVP_MD_CTX **ctx, const uint8 *data, size_t len)
+{
+	EVP_DigestUpdate(*ctx, data, len);
+}
+
+static void
+digest_final(EVP_MD_CTX **ctx, uint8 *dest)
+{
+	EVP_DigestFinal_ex(*ctx, dest, 0);
+	EVP_MD_CTX_destroy(*ctx);
+}
 
 /* Interface routines for SHA-256 */
 void
 pg_sha256_init(pg_sha256_ctx *ctx)
 {
-	SHA256_Init((SHA256_CTX *) ctx);
+	digest_init(ctx, EVP_sha256());
 }
 
 void
 pg_sha256_update(pg_sha256_ctx *ctx, const uint8 *data, size_t len)
 {
-	SHA256_Update((SHA256_CTX *) ctx, data, len);
+	digest_update(ctx, data, len);
 }
 
 void
 pg_sha256_final(pg_sha256_ctx *ctx, uint8 *dest)
 {
-	SHA256_Final(dest, (SHA256_CTX *) ctx);
+	digest_final(ctx, dest);
 }
 
 /* Interface routines for SHA-512 */
 void
 pg_sha512_init(pg_sha512_ctx *ctx)
 {
-	SHA512_Init((SHA512_CTX *) ctx);
+	digest_init(ctx, EVP_sha512());
 }
 
 void
 pg_sha512_update(pg_sha512_ctx *ctx, const uint8 *data, size_t len)
 {
-	SHA512_Update((SHA512_CTX *) ctx, data, len);
+	digest_update(ctx, data, len);
 }
 
 void
 pg_sha512_final(pg_sha512_ctx *ctx, uint8 *dest)
 {
-	SHA512_Final(dest, (SHA512_CTX *) ctx);
+	digest_final(ctx, dest);
 }
 
 /* Interface routines for SHA-384 */
 void
 pg_sha384_init(pg_sha384_ctx *ctx)
 {
-	SHA384_Init((SHA512_CTX *) ctx);
+	digest_init(ctx, EVP_sha384());
 }
 
 void
 pg_sha384_update(pg_sha384_ctx *ctx, const uint8 *data, size_t len)
 {
-	SHA384_Update((SHA512_CTX *) ctx, data, len);
+	digest_update(ctx, data, len);
 }
 
 void
 pg_sha384_final(pg_sha384_ctx *ctx, uint8 *dest)
 {
-	SHA384_Final(dest, (SHA512_CTX *) ctx);
+	digest_final(ctx, dest);
 }
 
 /* Interface routines for SHA-224 */
 void
 pg_sha224_init(pg_sha224_ctx *ctx)
 {
-	SHA224_Init((SHA256_CTX *) ctx);
+	digest_init(ctx, EVP_sha224());
 }
 
 void
 pg_sha224_update(pg_sha224_ctx *ctx, const uint8 *data, size_t len)
 {
-	SHA224_Update((SHA256_CTX *) ctx, data, len);
+	digest_update(ctx, data, len);
 }
 
 void
 pg_sha224_final(pg_sha224_ctx *ctx, uint8 *dest)
 {
-	SHA224_Final(dest, (SHA256_CTX *) ctx);
+	digest_final(ctx, dest);
 }
diff --git a/src/include/common/sha2.h b/src/include/common/sha2.h
index f3fd0d0d28..701647713f 100644
--- a/src/include/common/sha2.h
+++ b/src/include/common/sha2.h
@@ -51,7 +51,7 @@
 #define _PG_SHA2_H_
 
 #ifdef USE_SSL
-#include 
+#include 
 #endif
 
 /*** SHA224/256/384/512 Various Length Definitions ***/
@@ -70,10 +70,10 @@
 
 /* Context Structures for SHA-1/224/256/384/512 */
 #ifdef USE_SSL
-typedef SHA256_CTX pg_sha256_ctx;
-typedef SHA512_CTX pg_sha512_ctx;
-typedef SHA256_CTX pg_sha224_ctx;
-typedef SHA512_CTX pg_sha384_ctx;
+typedef EVP_MD_CTX *pg_sha256_ctx;
+typedef EVP_MD_CTX *pg_sha512_ctx;
+typedef EVP_MD_CTX *pg_sha224_ctx;
+typedef EVP_MD_CTX *pg_sha384_ctx;
 #else
 typedef struct pg_sha256_ctx
 {
-- 
2.19.0.rc2





Re: scram-sha-256 authentication broken in FIPS mode

2018-09-11 Thread Michael Paquier
On Tue, Sep 11, 2018 at 04:32:27PM +0200, Peter Eisentraut wrote:
> I recommend letting this bake in the master branch for a while.  There
> are a lot weirdly patched and alternative OpenSSL versions out there
> that defy any documentation.

Good point.  Such things have bitten in the past.  Okay, then let's do
something about sha2_openssl.c only on HEAD for now then, which I am
fine to finish wrapping.

> Of course, we should also see if this actually fixes the reported problem.

It seems to me that addressing FIPS concerns on Windows and getting our
hashing functions plugged into OpenSSL correctly are two separate
issues.  The second one also says that we are in a grey area based on
the OpenSSL docs, which worries me.  And EVP_DigestInit has been used in
pgcrypto for ages, where I don't recall seeing reports about that.
--
Michael




Re: scram-sha-256 authentication broken in FIPS mode

2018-09-12 Thread Michael Paquier
On Wed, Sep 12, 2018 at 07:24:24AM +0900, Michael Paquier wrote:
> Good point.  Such things have bitten in the past.  Okay, then let's do
> something about sha2_openssl.c only on HEAD for now then, which I am
> fine to finish wrapping.

I was looking at trying to commit this patch, however more needs to be
done in terms of error handling, as the proposed patch would happily
crash if EVP_MD_CTX cannot be allocated (understand: OOM) in
EVP_DigestInit_ex, if I read the OpenSSL code correctly (see
crypto/evp/digest.c).  Our lives would be easier if it were possible to
use EVP_MD_CTX directly with EVP_MD_CTX_init so that no allocation is
done, but that's not doable as of 1.0.2.
--
Michael




Re: scram-sha-256 authentication broken in FIPS mode

2018-09-17 Thread Michael Paquier
On Mon, Sep 17, 2018 at 02:55:55PM +, Alessandro Gherardi wrote:
> Therefore, I believe the best option, at least for now, is calling
> FIPS_mode_set(1) in the application. 

I am not so sure about that.  As you rightly mention, CentOS and RedHat
patch OpenSSL to allow FIPS to work.  Per my research, Ubuntu can also
enable FIPS, but that's not the case for Debian, which is very popular
(I may be wrong about the last one, but I use it daily).

One question I have is how you are actually able to use FIPS on Windows
with OpenSSL.  Is that from one of the tarballs available on
openssl.org, which are more than a year old?  Pure upstream code does
not give this option, and CentOS/RHEL use a custom-made patch, with
which Postgres does not complain when calling the low-level hashing
functions, and we now rely on FIPS being enabled system-wide.  And that
actually works.  It seems to me that you are yourself using a custom
patch for OpenSSL, and that it is actually a different flavor from the
Linux version, as in your case the low-level hashing functions complain
if called directly in FIPS mode.

At the end, I think that we ought to wait and see if upstream OpenSSL
comes up with support for FIPS and how it integrates it, on both Linux
*and* Windows, and then consider if Postgres needs to do more.  There is
little point in merging a patch now for something which may or may not
be supported by OpenSSL later.  My bet, as things stand, is that we
could end up with something similar to what happens on Linux, with a
system-wide switch that Postgres knows nothing about.  Perhaps that will
not be the case, but let's think about that once we know for sure.
--
Michael




Re: help with startup slave after pg_rewind

2018-09-20 Thread Michael Paquier
On Wed, Sep 19, 2018 at 10:29:44PM +, Dylan Luong wrote:
> After promoting slave to master, I completed a pg_rewind of the slave
> (old master) to the new master. But when I try to start the slave I am
> getting the following error.
>
> I tried to run pg_rewind again, but now it says I cannot do it as its
> already same timeline.

What did pg_rewind tell you after the first run?  If you remove the set
of WAL segments on the rewound instance and let it replay only segments
from the archive, are you able to get past?

There is an inconsistency in the WAL records you are trying to replay.
In this case a contrecord refers to a WAL record split across multiple
pages.  The WAL reader is expecting one, and cannot find it.  And that's
not normal.  My bet is that something is wrong in your failover flow
which you think is right.  It is hard to get that right.
--
Michael




Re: How to maintain the csv log files in pg_log directory only for past 30 days

2018-09-27 Thread Michael Paquier
On Fri, Sep 28, 2018 at 10:33:30AM +0530, Raghavendra Rao J S V wrote:
> Log file will be generated in *csv* format at *pg_log* directory in our
> PostgreSQL. Every day we are getting one log file. We would like to
> maintain only max 30 days. Which setting need to modify by us in
> “postgresql.conf” in order to recycle the log files after 30 days.

If you use for example log_filename = 'postgresql-%d.log', then the
server switches to a new file every day.  As %d is the day of the month,
the file names cycle after a month, and the contents from the last month
get truncated automatically.
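A minimal sketch of that configuration, assuming superuser access and
logging_collector already enabled (a direct edit of postgresql.conf
works just as well):

ALTER SYSTEM SET log_filename = 'postgresql-%d.log';
ALTER SYSTEM SET log_rotation_age = '1d';
ALTER SYSTEM SET log_truncate_on_rotation = on;  -- required for the truncation
SELECT pg_reload_conf();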
--
Michael




Re: How to maintain the csv log files in pg_log directory only for past 30 days

2018-09-28 Thread Michael Paquier
On Fri, Sep 28, 2018 at 06:19:16AM -0700, Adrian Klaver wrote:
> If log_truncate_on_rotation = 'on', correct?

Yup, thanks for clarifying.
--
Michael




Re: pg9.6: no backup history file (*.backup) created on hot standby

2018-10-08 Thread Michael Paquier
On Mon, Oct 08, 2018 at 03:23:47PM +0800, magodo wrote:
> Is this as expected?

Yes.

> If so, in which case should I do backup on primary and in which case
> should I do it on standby?

This depends on your use cases: sometimes you don't want the production
server, i.e. the primary, to use more CPU than necessary, so you can
leverage the activity on a standby instead.
--
Michael




Re: pg9.6: no backup history file (*.backup) created on hot standby

2018-10-08 Thread Michael Paquier
(Please do not forget to add the community mailing list in CC.)

On Tue, Oct 09, 2018 at 10:33:56AM +0800, magodo wrote:
> Since the backup history aims to aid administrator to identify the
> point from which wal archive should be kept and before which the
> archive could be cleaned. It is very helpful in kinds of use cases.
> 
> Why not also create it when do basebackup on standby?

The backup_label file is equally helpful, and backup history files are
not mandatory for backups.  One of the reasons why we cannot have those
on standbys is that there is no actual way to ensure the uniqueness of
this file's name, as two backups could be started in parallel with the
same start location and the *same* file name.  If an archive command is
not able to handle the duplicates correctly, you could bloat pg_wal.
And that's a real problem.
--
Michael




Re: pg9.6: no backup history file (*.backup) created on hot standby

2018-10-08 Thread Michael Paquier
On Tue, Oct 09, 2018 at 11:45:50AM +0800, magodo wrote:
> Yet, I am still not so clear how does the bloat of pg_wal happen? Do
> you mean pg_wal will be filled up by many .backup(s)?

If your archive_command is for example a simple cp (which it should not
be, by the way), and you try to archive the same file twice, then the
archive command would continuously fail and prevent existing WAL
segments from being archived.  Segments are continuously created, and
pg_wal grows in size.
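For reference, the pattern suggested in the docs refuses to overwrite an
already-archived file, which is safer than a plain cp but is precisely
what keeps failing on a duplicated name (a sketch; /mnt/archive is a
placeholder path):

ALTER SYSTEM SET archive_command =
    'test ! -f /mnt/archive/%f && cp %p /mnt/archive/%f';
SELECT pg_reload_conf();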
--
Michael




Re: pg9.6: no backup history file (*.backup) created on hot standby

2018-10-09 Thread Michael Paquier
On Tue, Oct 09, 2018 at 03:26:35PM +0800, magodo wrote:
> Yes, but does this differ whether I'm archiving a general WAL or
> archiving the backup history? I mean if user doesn't handle duplicate
> archive, then pg_wal will still be filled up when archiving WAL.

A WAL segment has a unique name, and is archived only once, when it is
finished.  The problem with backup history files on standbys is that the
*same* file could end up being generated *multiple* times with base
backups taken in parallel.  That's a completely different story, and the
window for those backup history files to have the same name grows the
longer the time between two checkpoints is.  That's a problem I studied
a couple of months back.
--
Michael




Re: Compile psql 9.6 with SSL Version 1.1.0

2018-10-13 Thread Michael Paquier
On Fri, Oct 12, 2018 at 12:17:40PM -0400, Tom Lane wrote:
> So the immediate fix for building on Windows is you need to manually
> define all of those in pg_config.h.  We probably ought to try to make
> that less painful, though.  At the very least it'd be good if there
> were just one symbol you had to set manually ...

The root of the issue is that we have no way to change dynamically the
set of flags used in pg_config.h.win32 with the MSVC scripts, depending
on the version of OpenSSL compiled against.  We could tweak the scripts
to do that automatically.  Patches welcome.
--
Michael




Re: Active connections are terminated because of small wal_sender_timeout

2019-07-08 Thread Michael Paquier
On Fri, Jul 05, 2019 at 10:03:16AM -0400, Tom Lane wrote:
> ayaho...@ibagroup.eu writes:
>> Do you have any thoughts regarding this issue? 
> 
> I do not think anybody thinks this is a bug.  Setting wal_sender_timeout
> too small is a configuration mistake.

Yeah.  I don't see any bug here.  Please note that setting too high a
value can also be a problem in some configurations.  The lack of
flexibility in this area is why wal_sender_timeout has been switched to
be user-settable in v12.  In short, you can configure it in the
connection string to enforce a custom value per standby.
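As a sketch for v12, where these settings are plain configuration
parameters, a standby could enforce its own value through the
connection string (host and user below are placeholders):

ALTER SYSTEM SET primary_conninfo =
    'host=primary.example.com user=replicator options=''-c wal_sender_timeout=60s''';

A restart of the standby is then needed, as primary_conninfo cannot be
reloaded in v12.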
--
Michael




Re: Download link for postgres 10.8 edb software

2019-07-11 Thread Michael Paquier
On Fri, Jul 12, 2019 at 02:55:55AM +, Nanda Kumar wrote:
> Currently we are using postgres database 9.7 version.

You may be mistaken.  There is a major release of PostgreSQL called
9.6, but after that we jumped directly to 10, moving from three-part to
two-part version numbers, so there is no 9.7.
--
Michael




Re: Download link for postgres 10.8 edb software

2019-07-11 Thread Michael Paquier
On Fri, Jul 12, 2019 at 05:57:10AM +, Nanda Kumar wrote:
> Its my bad. Yes, currently  we are using 9.6 version. Now we are
> looking for 10.8 edb software for testing purpose . Kindly advise
> where we can download the 10.8 edb software ?

10.9 is the latest version in the 10.X series:
https://www.postgresql.org/download/windows/
--
Michael




Re: question about client/server version mismatches

2019-08-12 Thread Michael Paquier
On Tue, Aug 13, 2019 at 12:45:35PM +1000, raf wrote:
> Since the backup itself was from a 9.5.12 server, it
> seems that the 9.6 parameter, idle_in_transaction_session_timeout,
> must have been set by the 9.6 client even though it was
> connected to a 9.5 server. Is that expected behaviour?

Yes, the compatibility of pg_dump is mentioned in the docs, and things
are arranged so that pg_dump can be used to transfer data to newer
versions:
https://www.postgresql.org/docs/devel/app-pgdump.html
Please see from "Because pg_dump is used to transfer data to newer
versions of PostgreSQL...".

So doing a dump from a 9.5 instance using pg_dump from 9.6 would have
set the parameter.
--
Michael




Re: Redis 16 times faster than Postgres?

2019-09-29 Thread Michael Paquier
On Sun, Sep 29, 2019 at 04:52:15PM -0500, Ron wrote:
> On 9/29/19 4:42 PM, Colin 't Hart wrote:
> Redis is an in-memory key-value database. PostgreSQL... isn't.

Well, I think that you have never heard about the urban legend of
running Postgres on scissors, then.  This reminds me of this blog post:
http://www.databasesoup.com/2015/02/running-with-scissors-mode.html

Note that sometimes I have run Postgres on a tmpfs as well to test
some specific patches.  So that can be done, and of course that's
unsafe.
--
Michael




Re: Version 10.7 of postgres

2019-10-09 Thread Michael Paquier
On Thu, Oct 10, 2019 at 10:22:22AM +0530, Shankar Bhaskaran wrote:
> We are planning to use postgres 10.7 version as that is the latest
> version supported on Aurora DB. Since we have an on premise installation
> also , i was trying to download the same version of postgres for windows
> and linux.
> Unfortunately that version is not available in the download site as well as
> the ftp site .Is there any reason why an installer for 10.7 version in
> windows and linux is not added to archives?

I cannot speak about Aurora DB and its compatibilities or
incompatibilities, but the latest minor version of 10 is 10.10, and a
lot of bugs have been fixed since 10.7, hence it is logical in my
opinion to only have an installer for the latest version available;
that also eases maintenance a lot, as the Windows installer likely
needs to bundle OpenSSL and such dependencies.  I would also recommend
that you do not use a version which has known bugs.
--
Michael




Re: PostgreSQL - unrecognized win32 error code: 38

2019-10-26 Thread Michael Paquier
On Sat, Oct 26, 2019 at 11:02:26AM -0700, Adrian Klaver wrote:
> Not sure how that can be answered without knowing what ComputeComputer is
> doing?

Yes, there is nothing of this kind in the PostgreSQL code.  So you may
want to check your own extension code if a module is involved here, or
perhaps you are just using a fork of Postgres...
--
Michael




Re: PostgreSQL - unrecognized win32 error code: 38

2019-10-28 Thread Michael Paquier
On Mon, Oct 28, 2019 at 09:51:07AM -0700, ZhenHua Cai wrote:
> The following is the SQL statement of that function.
> 
> DROP TABLE IF EXISTS "GenericReadList" ;
> CREATE TEMP TABLE "GenericReadList"(
> "ComputerProfileId" int NOT NULL,
> "Trustee" uuid NOT NULL,
> "AccessControlType" smallint NULL,
> "AceOrder" int NULL
> );

The original complaint comes from a function called ComputeComputer.
What does it do and where does it come from?  Does it call any in-core
code which could cause the failure?  If the function is a SQL
function, could you provide a self-contained test case?
--
Michael




Re: root page 3 of index "pg_class_oid_index" has level 0, expected 1

2019-11-13 Thread Michael Paquier
On Thu, Nov 14, 2019 at 08:26:48AM +0100, Laurenz Albe wrote:
> If you have data corruption, you shouldn't use pg_upgrade to upgrade.
> pg_dumpall / psql is the way to go.

Please refer to this wiki page:
https://wiki.postgresql.org/wiki/Corruption

If you have a cluster in such a state, you risk having corruption which
has spread, and you should not reuse this data folder as-is.  You can
think of this REINDEX as a short-term solution to retrieve as much of
your data as you can so that you can dump it, and then restore it into
a safer location.  For corruption of system indexes, you also have the
option of using ignore_system_indexes.
--
Michael




Re: REINDEX VERBOSE unknown option

2019-11-17 Thread Michael Paquier
On Sun, Nov 17, 2019 at 04:41:59AM +0100, Pavel Stehule wrote:
> Documentation patch is good idea.

The documentation is rather clear about the need for parentheses when
using the VERBOSE option, and that the option itself is not mandatory:
REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } name
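For instance, with my_table being a placeholder name:

REINDEX (VERBOSE) TABLE my_table;  -- reports progress as INFO messages
REINDEX TABLE my_table;            -- same operation, without the chatter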
--
Michael




Re: REINDEX VERBOSE unknown option

2019-11-18 Thread Michael Paquier
On Mon, Nov 18, 2019 at 10:27:24AM +0100, Josef Šimánek wrote:
> This is clear once you understand what does it mean. I was aware of VERBOSE
> option of EXPLAIN and tried to use it without needed parentheses (the same
> way EXPLAIN can understand it). In the parameter list of REINDEX, it is
> still called VERBOSE (not "( VERBOSE )") and there's no info
> that parentheses are needed.

What would you do in the documentation once there is support for more
than one option then?

The grammar is done this way to remain extensible.  If you look at any
command in the documentation, there are two things to be aware of:
1) Clauses within square brackets are optional.
2) Multiple clauses within braces and separated by '|' mean that at
least one is mandatory.

The parentheses are not meta-characters here; they are part of the
query syntax itself.

> PS: AFAIK VERBOSE is option for EXPLAIN, but parameter for REINDEX. Is that
> the reason for different syntax?

Mainly historical reasons.  REINDEX VERBOSE was added in 9.5, while
EXPLAIN VERBOSE has been around since at least 7.1.  Using options
within parentheses has been preferred lately because it makes the
grammar much easier to extend for future purposes and it eases the
option parsing.
--
Michael




Re: Authentication: MD5 to SCRAM-SHA-256 error

2019-11-18 Thread Michael Paquier
On Fri, Nov 15, 2019 at 03:17:17PM -0500, Dave Hughes wrote:
> Thanks for replying guys!  You both led me down the right path.  I didn't
> realize it, but looks like we had 2 directories where psql was installed.
> Once I gave the path specifically to PostgreSQL10, it worked like a charm.
> 
> I'm reaching out to our server admins now to see why there are 2
> instances.

Maybe remnants of packages from a past major-version upgrade where
pg_upgrade was involved?  You need both the old and new binaries when
doing that, and most distributions allow different major versions of
Postgres to be installed in parallel.
--
Michael




Re: REINDEX VERBOSE unknown option

2019-11-19 Thread Michael Paquier
On Tue, Nov 19, 2019 at 11:37:04AM +, Geoff Winkless wrote:
> It's bad enough that you have the inconsistency that REINDEX VERBOSE
> requires parentheses while the more recent REINDEX CONCURRENTLY does
> not (presumably to match the syntax of CREATE INDEX CONCURRENTLY),
> without insisting that the user parse the difference between { and (
> in the manual (not the easiest difference to scan) before they can use
> the command.

The first implementations of REINDEX CONCURRENTLY used the existing
grammar.  There was also a discussion about exactly that when the
feature was close to commit, and I preferred the parenthesized option.
What won at the end?  Consistency with the existing grammar for
CREATE/DROP INDEX.

>> How about this?
>>
>> * Terminals (stuff that has to be typed as shown) in bold.
>>
>> * Non-Terminals (stuff which has to be replaced) in italic.
>>
>> * Meta-characters ([, ], |, ...) in regular type.
> 
> Even if you do that you're still requiring the user to parse syntax
> according to esoteric rules. I'm not sure that changing the rules
> helps that much.

This does not concern only the page for REINDEX.  Perhaps this could be
improved, but I am not sure how, and particularly whether changing it
is worth it, as many people are used to the existing way of presenting
the command synopses.
--
Michael




Re: ON COMMIT options for non temporary tables

2019-11-21 Thread Michael Paquier
On Thu, Nov 21, 2019 at 05:13:31PM +0100, Laurenz Albe wrote:
> How should that work for tables other than temporary tables?
> Should COMMIT lock if somebody else accesses the table?

Postgres does not support read uncommitted, so the table would not be
visible to other sessions until the transaction that created it is
committed.  Anyway, the goal is to use the table just within a
transaction without having it produce WAL, right?

> What keeps you from explicitly dropping the table right before
> you commit?

Yeah, that would be the most simple.  Another idea I can think of here
is to use a combination of an event trigger and a deferred trigger on a
dummy constraint: register the table creation using the event trigger
on CREATE TABLE, and then drop it using the deferred trigger.  Using a
DROP TABLE before the commit, once your transaction is done with the
table, is much simpler of course :)
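A sketch of that simple variant, with made-up names:

BEGIN;
CREATE TABLE scratch_work (id int, payload text);
-- ... fill and use scratch_work here ...
DROP TABLE scratch_work;
COMMIT;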
--
Michael




Re: sql query for postgres replication check

2019-11-24 Thread Michael Paquier
On Fri, Nov 22, 2019 at 01:20:59PM +, Zwettler Markus (OIZ) wrote:
> I came up with the following query which should return any apply lag in 
> seconds.
> 
> select coalesce(replay_delay, 0) replication_delay_in_sec
> from (
>select datname,
>   (
> select case
>  when received_lsn = latest_end_lsn then 0
>  else extract(epoch
> from now() - latest_end_time)
>end
> from pg_stat_wal_receiver
>   ) replay_delay
>from pg_database
>where datname = current_database()
>  ) xview;
> 
> 
> I would expect delays >0 in case SYNC or ASYNC replication is
> somehow behind. We will do a warning at 120 secs and critical at 300
> secs.

pg_stat_wal_receiver is available only on the receiver, aka the
standby, so it would not really be helpful on a primary.  On top of
that, streaming replication is system-wide, so there is no actual point
in looking at individual databases either.

> Would this do the job or am I missing something here?

Here is a suggestion for Nagios: hot_standby_delay, as documented in
https://github.com/bucardo/check_postgres/blob/master/check_postgres.pl
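If you prefer keeping a plain SQL probe, a minimal standby-side variant
could look like this sketch, leaving thresholds to the monitoring
layer:

SELECT CASE
         WHEN pg_last_wal_receive_lsn() = pg_last_wal_replay_lsn() THEN 0
         ELSE COALESCE(EXTRACT(epoch
                FROM now() - pg_last_xact_replay_timestamp()), 0)
       END AS replay_delay_in_sec;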
--
Michael




Re: jsonb_set() strictness considered harmful to data

2019-11-27 Thread Michael Paquier
On Fri, Nov 15, 2019 at 09:45:59PM +0100, Pavel Stehule wrote:
> Maybe ERRCODE_NULL_VALUE_NOT_ALLOWED, and "NULL is not allowed",
> errdetail - a exception due setting "null_value_treatment" =>
> raise_exception
> and maybe some errhint - "Maybe you would to use Jsonb NULL - "null"::jsonb"
> 
> I don't know, but in this case, the exception should be verbose. This is
> "rich" function with lot of functionality

@Andrew: This patch has been waiting on input from you for a couple of
days now.
--
Michael




Re: pg_basebackup + incremental base backups

2019-11-29 Thread Michael Paquier
On Fri, Nov 29, 2019 at 04:57:11PM -0300, Christopher Pereira wrote:
> Our stream replication slave server got out of sync so we need to base
> backup again.
> 
> In case of big databases, can we do incremental backups with pg_basebackup?

I know of two ways to define such backups, one being actually
incremental and the other differential:
- In the shape of a range of WAL segments, which is actually
incremental.  A full base backup is taken, and then a range of WAL
segments is backed up periodically to make sure that you are able to
recover up to the point you are looking for.  Postgres core can help to
do that by itself.
- Using a combination of full backups and differential backups
(differential), the latter containing only (well, mostly) relation
pages which have changed since the last full backup or differential
backup.  When restoring, you then need to merge the last full backup
and one or more differential backups, followed by a replay of WAL
segments up to the point you are willing to recover to.  In this case
some external tools offer solutions to that problem: pgBackRest and
pg_rman are two I know of.

Other backup solutions besides the two cited above, like barman, all
have implementations to handle the first type of incremental backup.

> Is there any alternative?
> 
> Here was a proposal: https://wiki.postgresql.org/wiki/Incremental_backup

Yeah, nothing has been done upstream in this area though.  That's more
about differential backups.
--
Michael




Re: upgrade and migrate

2019-12-03 Thread Michael Paquier
On Tue, Dec 03, 2019 at 10:32:22PM +, Julie Nishimura wrote:
> Hello, what is the best way to migrate from PostgreSQL 8.3.11 on
> x86_64-redhat-linux-gnu to PostgreSQL 9.6.16 on x86_64-pc-linux-gnu
> server, with minimal downtime?
> The caveat is the source has about 80 databases overall almost 30
> TB. I could migrate the smallest ones (up to 1 tb) using pg_dump and
> pg_restore, but the largest hot database is almost 17 tb, and I am
> not sure how to approach this effort in a better and efficient way?

pg_upgrade could be one way to go here.  That's not a scale pg_dump
would be very good at; I would personally avoid using pg_dump above
10~20GB.  Depending on the downtime you are ready to accept, a
migration based on Slony could be something to investigate.
--
Michael




Re: upgrade and migrate

2019-12-04 Thread Michael Paquier
On Wed, Dec 04, 2019 at 08:38:01AM +0100, Thomas Kellerer wrote:
> But pg_upgrade only supports 8.4+

Indeed.  You're right here.
--
Michael




Re: Date created for tables

2019-12-05 Thread Michael Paquier
On Thu, Dec 05, 2019 at 07:12:22PM -0600, Ron wrote:
> On 12/5/19 1:01 PM, Tom Lane wrote:
>> It's been considered, and rejected, many times.  Aside from the overhead
>> involved, there are too many different ideas of what such dates ought to
>> mean (e.g., what should happen during dump/restore? does a failed
>> transaction update last-modified? etc etc).  You can search the
>> project's mailing list archives if you want to read the prior discussions.
> 
> All the other RDBMSs seem to have figured it out.

It does not necessarily mean that Postgres has to do it.  FWIW, you
can track that using an event trigger for CREATE TABLE or other objects
which inserts the following in a table of your choice for a given
database (see the sketch below):
- The timestamp of the transaction.
- The object name.
- Its class ID, say pg_class::regclass for a table, etc.
--
Michael




Re: archiving question

2019-12-05 Thread Michael Paquier
On Thu, Dec 05, 2019 at 03:04:55PM +, Zwettler Markus (OIZ) wrote:
> What do you mean hear?
> 
> Afaik, Postgres runs the archive_command per log, means log by log by log.
> 
> How should we parallelize this?

You can, in theory, skip the archiving for a couple of segments and
then do the operation at once without the need to patch Postgres.
--
Michael




Re: pg_repack failure

2020-01-06 Thread Michael Paquier
On Tue, Jan 07, 2020 at 06:15:09AM +, Nagaraj Raj wrote:
> and this error is occurring in large tables only, and current table
> size which is running about 700GB
> 
> /pg_repack --version
> pg_repack 1.4.3
> 
> DB version: PostgreSQL 9.6.11 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 
> 4.9.3, 64-bit

I think that you had better report that directly to the maintainers of
the tool here:
https://github.com/reorg/pg_repack/
--
Michael




Re: what to do after a failover

2020-01-08 Thread Michael Paquier
On Wed, Jan 08, 2020 at 11:06:28PM -0500, Rita wrote:
> I run a master and standby setup with Postgresql 11. The systems are
> identical from a hardware and software setup.  If the master goes down I
> can do a pg_ctl promote on the standby and point my applications to use the
> standby (new master).
>
> Once the original master is online, when is an appropriate time to fail
> back over? And are there any other things besides promote after the
> failover is done?

Make sure that you still have an HA configuration able to handle
multiple degrees of failure, with standbys always available after a
promotion.

The options available to rebuild your HA configuration after a
failover depend on the version of PostgreSQL you are using.  After a
failover the most simple solution would be to always recreate a new
standby from a base backup taken from the freshly-promoted primary,
though it can be costly depending on your instance.  You could also
use pg_rewind (available in core since 9.5) to recycle the previous
primary and reuse it as a standby of the newly-promoted cluster.  Note
that there are community-based solutions for such things, like
pg_auto_failover or Pacemaker-based stacks, just to name two.  These
rely on more complex architectures, where a third node is present to
monitor the others (any sane HA infra ought to do at least that, to be
honest).
--
Michael




Re: Postgres streaming replication

2020-01-08 Thread Michael Paquier
On Wed, Jan 08, 2020 at 05:04:21AM +, Daulat Ram wrote:
> I have to implement the streaming replication for our prod environment.
> Can you please share the list of parameters to setup the PostgreSQL
> 11 streaming replication with continuous archiving and give clarity
> on the below.
> 
>   *   Do we need to enable the archive_mode and archive_command
> parameters at standby side if implementing PostgreSQL 11
> streaming replication with continuous archiving  ?
>   *   What is the benefits for implementing streaming replication
> with continuous archiving over the PostgreSQL Streaming
> replication? 

I think that you need to do two things and take the necessary time to
do both:
1) List on a sheet of paper all your requirements, your expectations
and what kind of tradeoffs you are ready to make for your cluster
configuration when it comes to potential data loss, and the amount of
data retention you are willing to have around.  There is no way to
tell what's good or not for your product if you don't know that
first. 
2) Read the documentation:
https://www.postgresql.org/docs/current/high-availability.html
--
Michael




Re: what to do after a failover

2020-01-09 Thread Michael Paquier
On Thu, Jan 09, 2020 at 03:14:59PM +0100, Jehan-Guillaume de Rorthais wrote:
> If you can afford that, this is the cleanest and easiest procedure you could
> find.
>
> Note that pg_basebackup need an empty PGDATA, so it will have to transfert the
> whole instance from new promoted primary to the original one.

Simple is easier to understand.  Now, the larger your instance, the
longer it takes to copy a base backup, and the longer you reduce the
availability of your cluster.  So be careful with what you choose.
--
Michael




Re: pg_stat_statements extension

2020-01-13 Thread Michael Paquier
On Mon, Jan 13, 2020 at 11:41:36AM -0800, Adrian Klaver wrote:
> How did the above get installed, from source, RPM. other?

Rushikesh, depending on your environment, the way to install
pg_stat_statements' libraries may differ.  On most Linux distributions,
extension modules are shipped in a package separate from the main ones
for the server and client binaries and libraries.  Then, you would most
likely look for a package named postgresql-11-contrib or similar.
--
Michael




Re: Fwd: Postgresql Data corruption

2020-01-14 Thread Michael Paquier
On Tue, Jan 14, 2020 at 02:34:04PM -0800, Adrian Klaver wrote:
> On 1/14/20 12:44 PM, Tulqin Navruzov wrote:
>> but can logged in with another user and trying to select from some
>> tables , showing this message :
>> 
>> ERROR:  catalog is missing 11 attribute(s) for relid 113971
>> 
>> Could you help us to solve this problem? or could you give contact who
>> can help with this?
> 
> Do you have space to back up the $DATADIR or do you have a recent backup?
> 
> Asking because that would be helpful before you try the below.
> 
> I would take a look at:
> https://www.postgresql.org/docs/12/sql-reindex.html

That's a bad data corruption pattern, and restoring from a backup is
recommended (make sure that your backups are actually safe to use!).  A
REINDEX should help you to retrieve data and move it somewhere else if
you don't have a backup, but nothing more, as there is no guarantee
that other parts of the system are not broken.  As others have already
pointed out, take the time necessary to read this:
https://wiki.postgresql.org/wiki/Corruption

But first, you should take three, slow, deep breaths.  Rushing can
only make things worse.
--
Michael




Re: pgbackrest: ERROR: [029]: unable to convert base 10 string '0000000B' to unsigned int

2020-01-21 Thread Michael Paquier
On Tue, Jan 21, 2020 at 08:10:39AM +0100, Eric Veldhuyzen wrote:
> We are using pgbackrest (2.21) to backup out postgresql (11) clusters.
> Last night our nightly diff backup gave me the
> ERROR: unable to convert base 10 string '0000000B' to unsigned int.
> I tried if a full backup would fix this, but it didn't.

Did you try to contact the maintainers?  Here is the link to the
project:
https://github.com/pgbackrest/pgbackrest
--
Michael




Re: POLL: Adding transaction status to default psql prompt

2020-02-05 Thread Michael Paquier
On Thu, Feb 06, 2020 at 03:54:48AM +0100, Vik Fearing wrote:
> I proposed a patch to add %x to PROMPT1 and PROMPT2 by default in psql.
> 
> The effect of this is:
> 
> - nothing at all when not in a transaction,
> - adding a '*' when in a transaction or a '!' when in an aborted
>   transaction.
> 
> Before making a change to a long-time default, a poll in this group was
> requested.

Thanks Vik for starting a new thread.  For reference here is the
thread where the patch is being discussed:
https://www.postgresql.org/message-id/09502c40-cfe1-bb29-10f9-4b3fa7b2b...@2ndquadrant.com
--
Michael




Re: JIT on Windows with Postgres 12.1

2020-02-12 Thread Michael Paquier
On Wed, Feb 12, 2020 at 12:05:11PM -0800, Andres Freund wrote:
> Yes, LLVM would work on windows. I'd not even be surprised if one could
> make it work on windows already when using a mingw based build, instead
> of msvc.

For MSVC, assuming that you have the proper dependencies for JIT
compilation and LLVM available, and just for the reference if anybody
is interesting in making that happen (Tom and Andres are aware of that
already), one would need to do roughly an update of src/tools/msvc/ to
add a new option path to LLVM, and update install-windows.sgml to add
the new optional dependency.  Note that including LLVM in the builds
means adding conditionally the files part of src/backend/jit/llvm/,
but one can take for example OpenSSL as a model of how it happens.
That would take care of the build.  There may be some
Windows-dependent stuff needed to make JIT work properly on Windows
though.
--
Michael




Re: JIT on Windows with Postgres 12.1

2020-02-12 Thread Michael Paquier
On Wed, Feb 12, 2020 at 07:32:08PM -0800, Andres Freund wrote:
> That's not really the hard part. That's integrating the generation of
> LLVM bitcode files into the buildsystem. As the absolute minimum
> llvmjit_types.bc needs to be generated, but to be meaningfully supported
> we'd need to generate all the .bc files we're generating on make based
> builds.

Oh, I see.  That's indeed...  not straightforward.
--
Michael




Re: V9.5

2020-03-10 Thread Michael Paquier
On Tue, Mar 10, 2020 at 12:23:49PM +0530, Sonam Sharma wrote:
> We have pg_read_all_stats role from v9.6. do we have a similar role for
> v9.5 and lower versions ?

No, and pg_read_all_stats has been introduced in Postgres 10, not 9.6:
https://www.postgresql.org/docs/10/release-10.html
--
Michael




Re: Mixed Locales and Upgrading

2020-03-17 Thread Michael Paquier
On Tue, Mar 17, 2020 at 10:45:50AM -0400, Tom Lane wrote:
> Don Seiler  writes:
>> What are the ramifications of changing collation like that? Should we
>> consider rebuilding indexes ASAP after that?
> 
> Text indexes would definitely be at risk here.  I'm not really certain
> how bad the problem would be.  Do you have a feeling for how much of
> the data is 100% ASCII?  If you could be sure of that for any given
> column, you wouldn't have to reindex indexes on that column.

There is no way to know how many indexes would get broken without
having a look.  Anything ASCII-only should be no problem.  Reindexing
every index which includes text column data is the best course of
action in my opinion if you have any doubts, because that's safe even
if it has a higher cost.
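To narrow the work down, a catalog query like this sketch could list
the indexes which contain collatable columns, hence the candidates for
reindexing:

SELECT DISTINCT i.indexrelid::regclass AS index_name
FROM pg_index i
JOIN pg_attribute a ON a.attrelid = i.indexrelid
WHERE a.attcollation <> 0
  AND i.indexrelid >= 16384;  -- leave system catalogs out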
--
Michael




Re: PostgreSQL 10 not archiving some WAL files

2020-03-18 Thread Michael Paquier
On Wed, Mar 18, 2020 at 10:57:22AM -0300, Norberto Dellê wrote:
> This setup worked very well for a long time, but since we upgraded
> PostgreSQL to newer versions (10.x), sometimes it just skips archiving some
> wal files. This seems to happen mainly when the server is shut down.
> There's no messages about errors in the logs, and also no corresponding
> .done file in archive_status, it just goes to the next wal file.
> 
> I would like to know if any of you had or has a problem like this, and if
> you had, how you solved it.

Hmm.  I don't recall seeing any of that.  This would mean that we are
either missing the generation of some .ready files, or that some .done
files get generated when they should not in archive_status/.  What kind
of server shutdown are you doing?  An immediate one, so that recovery
happens at the follow-up startup?  Or is that a clean service shutdown?
--
Michael



