Re: Is there a continuous backup for pg ?

2018-03-02 Thread David Steele
Hi Gary,

On 3/2/18 2:05 PM, Gary M wrote:
> Hi,
> 
> I have an unusual requirement for schema based, live backup of a 24/7
> database processing 100K inserts/updates per hour. The data store is
> around 100TB.
> 
> The requirement is supporting an incremental backup of 10 minute
> windows.  Replication is not considered backup from malicious action.
> 
> Are there any best practices or solutions that can meet these
> requirements ? 
pgBackRest is specifically designed to handle very large clusters and
high WAL rates.  Backup, restore, and archiving can be run in parallel
to speed operations.  The new version 2 has been optimized to make
archive-push even faster than version 1 and we will be releasing an
optimized archive-get soon.

Your best approach to the 10-minute windows is daily incremental
backups combined with PITR recovery to the required time.  PITR allows
you to recover to any point in time covered by the WAL archive.
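
For example, a minimal restore sketch (the stanza name and timestamp are
placeholders; option names are as in recent pgBackRest releases):

pgbackrest --stanza=main --delta --type=time \
   --target="2018-03-02 13:50:00" restore

This restores a backup and writes the recovery settings needed to replay
WAL up to the target time (use --set to choose a backup taken before
that time).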


Regards,
-- 
-David
da...@pgmasters.net



Re: Barman versus pgBackRest

2018-03-09 Thread David Steele
Hi Thomas,

On 3/6/18 2:53 PM, Thomas Poty wrote:
> Hello Community,
> 
> I am hesitating between Barman and pgBackRest.  I have found a lot of
> common points between them and a few differences:

To be clear, I'm the original author and primary maintainer of
pgBackRest.  I'll let the Barman folks speak to their strengths, but I'm
happy to address your points below.

> About pgBarman, I like:
> - being able to restore on a remote server from the backup server

This is a good feature, and one that has been requested for pgBackRest.
You can do this fairly trivially with ssh, however, so it generally
hasn't been a big deal for people.  Is there a particular reason you need
this feature?

> - using replication slots for backing up WAL on the backup server.

Another good feature.  We have not added it yet because pgBackRest was
originally written for very high-volume clusters (100K+ WAL per day) and
our parallel async feature answers that need much better.  We recommend
a replicated standby for more up-to-date data.

Even so, we are looking at adding support for replication slots to
pgBackRest.  We are considering a hybrid scheme that will use
replication to keep the WAL archive as up to date as possible, while
doing bulk transfer with archive_command.
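
For reference, the parallel async archiving mentioned above is enabled
with a configuration along these lines (a sketch only -- paths and
process counts are examples, and option names are as in current
pgBackRest releases):

[global]
archive-async=y
spool-path=/var/spool/pgbackrest

[global:archive-push]
process-max=4

[global:archive-get]
process-max=4

With this, archive-push acknowledges WAL quickly and ships segments to
the repository in parallel in the background.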

> About pgBackRest, I like:
> 
> - real differential backups
> - lots of options
> - the option to back up even if PostgreSQL is already in backup mode
> 
> 
> I would like to have:
> - advice or feedback about using pgBackRest or Barman
> - pros and cons of these solutions

I'll stick with some of the major pgBackRest pros:

- Parallel backup including compression and checksums
- Encryption
- S3 support
- Parallel archive
- Delta restore
- Page checksum validation
- Backup resume

More about features here: https://pgbackrest.org

> - differences that I would not have seen.

pgBackRest is used in some very demanding environments and we are
constantly answering the needs of our users with features and
performance improvements, e.g. the enormous improvements to archive-push
speed in the 2.0 release.

I'd be happy to answer any specific questions you have about pgBackRest.

Regards,
-- 
-David
da...@pgmasters.net



Re: Barman versus pgBackRest

2018-04-13 Thread David Steele

Hi Thomas,

On 4/11/18 3:14 AM, Thomas Poty wrote:


> Sorry for answering only now, but I just saw your answer.
>
>> To be clear, I'm the original author and primary maintainer of
>> pgBackRest.
>
> I am very happy to see guys like you take the time to answer me.
> Thank you.

You are welcome.  Users are the reason I work on this project.

>> This is a good feature, and one that has been requested for pgBackRest.
>> You can do this fairly trivially with ssh, however, so it generally
>> hasn't been a big deal for people.  Is there a particular reason you
>> need this feature?
>
> The reason is probably a psychological matter, but I like the idea of a
> single point of connection for restoring DBs in different locations.
>
> I am very impatient to see "replication slot" support and the "remote
> restore" feature added.


Remote restore is likely to land first, though neither feature is 
currently at the top of the list.  Unfortunately, we have limited 
resources and must prioritize.


Regards,
--
-David
da...@pgmasters.net



Re: During promotion, new master tries to archive same segment twice

2018-08-15 Thread David Steele
Hi Phil,

On 8/15/18 4:25 PM, Phil Endecott wrote:
> 
> Questions:
> 
> - Can all of this be blamed on the failure of the first attempt 
> to promote due to the wrong permissions on recovery.conf?

Yes, it looks that way.

> - Should my archive_command detect the case where it is asked to 
> write the same file again with the same contents, and report 
> success in that case?

Yes.  pgBackRest has done this for years and it saves a *lot* of headaches.

> - Is this a bug?

I don't think so.  There are a number of cases where the same WAL
segment can be pushed more than once, especially after failures where
Postgres is not sure that the command completed.  The archive command
should handle this gracefully.
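
As a rough illustration only -- this is not pgBackRest code, just a
hypothetical wrapper called as
archive_command = '/usr/local/bin/archive-wrapper.sh %p %f', and it is
still not race-free if two servers archive concurrently:

#!/bin/sh
# archive-wrapper.sh SRC FNAME -- succeed if an identical copy already
# exists, fail if a different copy exists, otherwise copy and flush.
SRC="$1"
DEST="/mnt/archive/$2"                # destination directory is an example

if [ -f "$DEST" ]; then
    cmp -s "$SRC" "$DEST" && exit 0   # same contents: report success
    exit 1                            # different contents: never overwrite
fi

cp "$SRC" "$DEST.tmp" || exit 1
sync "$DEST.tmp" || exit 1            # needs a sync that accepts a file arg
mv "$DEST.tmp" "$DEST" || exit 1
sync /mnt/archive                     # flush the directory entry as well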

Regards,
-- 
-David
da...@pgmasters.net



Re: During promotion, new master tries to archive same segment twice

2018-08-16 Thread David Steele
On 8/16/18 4:37 AM, Phil Endecott wrote:
> David Steele wrote:
>> On 8/15/18 4:25 PM, Phil Endecott wrote:
>>> - Should my archive_command detect the case where it is asked to
>>> write the same file again with the same contents, and report success
>>> in that case?
>>
>> Yes.
> 
>> There are a number of cases where the same WAL
>> segment can be pushed more than once, especially after failures where
>> Postgres is not sure that the command completed.  The archive command
>> should handle this gracefully.
> 
> Hmm, OK.  Here's what the current docs say:
> 
> Section 25.3.1:
> 
> "The archive command should generally be designed to refuse to
> overwrite any pre-existing archive file. This is an important
> safety feature to preserve the integrity of your archive in case
> of administrator error (such as sending the output of two
> different servers to the same archive directory).
> 
> It is advisable to test your proposed archive command to ensure
> that it indeed does not overwrite an existing file, and that it
> returns nonzero status in this case."
> 
> And section 26.2.9:
> 
> "When continuous WAL archiving is used in a standby, there
> are two different scenarios: the WAL archive can be shared
> between the primary and the standby, or the standby can
> have its own WAL archive.  When the standby has its own WAL
> archive, set archive_mode to always, and the standby will call
> the archive command for every WAL segment it receives, whether
> it's by restoring from the archive or by streaming replication.
> The shared archive can be handled similarly, but the
> archive_command must test if the file being archived exists
> already, and if the existing file has identical contents.
> This requires more care in the archive_command, as it must be
> careful to not overwrite an existing file with different contents,
> but return success if the exactly same file is archived twice.
> And all that must be done free of race conditions, if two
> servers attempt to archive the same file at the same time."
> 
> So you're saying that that's wrong, and that I must always
> handle the case when the same WAL segment is written twice.

Seems like an omission in section 25.3.1 rather than a problem in 26.2.9.

Duplicate WAL is possible in *all* cases.  A trivial example is that
Postgres calls archive_command and it succeeds but an error happens
(e.g. network) right before Postgres is notified.  It will wait a bit
and try the same WAL segment again.

> I'll file a bug against the documentation.

OK.

>> pgBackRest has done this for years and it saves a *lot* of headaches.
> 
> The system to which I am sending the WAL files is a rsync.net
> account.  I use it because of its reliability, but methods for
> transferring files are limited largely to things like scp and
> rsync.

Rsync and scp are not good tools to use for backup because there is no
guarantee of durability, i.e. the file is not synced to disk before
success is returned.  rsync.net may have durability guarantees but you
should verify that with them.

Even so, crafting a safe archive_command using these tools is going to
be very tricky.

Regards,
-- 
-David
da...@pgmasters.net



Re: Barman versus pgBackRest

2018-09-04 Thread David Steele
On 9/4/18 11:24 AM, Joshua D. Drake wrote:
> On 09/04/2018 07:52 AM, Ron wrote:
>> On 09/04/2018 09:24 AM, Joshua D. Drake wrote:
>>> On 09/04/2018 07:14 AM, Ron wrote:

 That was about barman, in the barman group.  This is asking about
 pgbackrest...  :)

 So: does pgbackrest have this ability which barman does not have?
 The "--db-include" option seems to indicate that you can restore a
 single db, but does not indicate whether or not you can rename it.
>>>
>>> https://pgbackrest.org/configuration.html#section-restore/option-db-include
>>>
>>>
>>
>> Which implies that you can't do it?
> 
> You can restore a single database and then issue a simple ALTER DATABASE
> command to change the DB name.

This will work, but I don't think it's what Ron is getting at.

To be clear, it is not possible to restore a database into an *existing*
cluster using pgBackRest selective restore.  This is a limitation of
PostgreSQL file-level backups.

To do what Ron wants you would need to restore it to a new cluster, then
use pg_dump to logically dump and restore it to whatever cluster you
want it in.  This still saves time since there is less to restore but is
obviously not ideal.

Regards,
-- 
-David
da...@pgmasters.net



Re: Barman versus pgBackRest

2018-09-04 Thread David Steele
On 9/4/18 11:55 AM, Ron wrote:
> On 09/04/2018 10:51 AM, David Steele wrote:
> [snip]
>> This will work, but I don't think it's what Ron is getting at.
>>
>> To be clear, it is not possible to restore a database into an *existing*
>> cluster using pgBackRest selective restore.  This is a limitation of
>> PostgreSQL file-level backups.
>>
>> To do what Ron wants you would need to restore it to a new cluster, then
>> use pg_dump to logically dump and restore it to whatever cluster you
>> want it in.  This still saves time since there is less to restore but is
>> obviously not ideal.
> 
> That's exactly what I'm referring to.
> 
> Presumably I could restore it to a new cluster on the same VM via initdb
> on a different port and PGDATA directory?

Definitely.  No need to initdb since all the required files will be
restored by pgBackRest.  You'll just need to create an empty directory
to restore into.
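
For example, a sketch (stanza, database, paths, and ports are
placeholders; pg1-path is the pgBackRest 2 option name):

mkdir -m 700 /var/lib/pgsql/9.6/restore
pgbackrest --stanza=main --db-include=mydb \
   --pg1-path=/var/lib/pgsql/9.6/restore restore
pg_ctl start -D /var/lib/pgsql/9.6/restore -o "-p 5433"
# once recovery completes, dump/reload into the target cluster
# (create mydb_copy there first)
pg_dump -p 5433 mydb | psql -p 5432 -d mydb_copy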

Regards,
-- 
-David
da...@pgmasters.net



Re: Barman versus pgBackRest

2018-09-04 Thread David Steele
Hi Ron,

On 9/4/18 7:41 AM, Ron wrote:
> On 03/09/2018 08:56 AM, David Steele wrote:
> [snip]
>>> About pgBarman, I like:
>>> - being able to restore on a remote server from the backup server
>> This is a good feature, and one that has been requested for pgBackRest. You
>> can do this fairly trivially with ssh, however, so it generally hasn't
>> been a big deal for people.  Is there a particular reason you need this
>> feature?
> 
> (Sorry to dredge up this old thread.)
> 
> Do you just change the IP address of the "restore target"?

[I'll assume you wanted to hear about pgBackRest here since we discussed
it down thread.]

Generally restores are done from the database server, but if you want to
run a restore from the backup server you can run it via ssh:

ssh user@pg-server pgbackrest [...]

> 
>>> - using replication slots for backing up WAL on the backup server.
>> Another good feature.  We have not added it yet because pgBackRest was
>> originally written for very high-volume clusters (100K+ WAL per day) and
>> our parallel async feature answers that need much better.  We recommend
>> a replicated standby for more up-to-date data.
> 
> Every N minutes you copy the WAL files to the backup server?
> 
> 

-- 
-David
da...@pgmasters.net



Re: Barman versus pgBackRest

2018-09-04 Thread David Steele
On 9/4/18 7:41 AM, Ron wrote:
> On 03/09/2018 08:56 AM, David Steele wrote:
> 
>>> - using replication slots for backing up WAL on the backup server.
>> Another good feature.  We have not added it yet because pgBackRest was
>> originally written for very high-volume clusters (100K+ WAL per day) and
>> our parallel async feature answers that need much better.  We recommend
>> a replicated standby for more up-to-date data.
> 
> Every N minutes you copy the WAL files to the backup server?

[Accidentally hit send on the previous post, here's the rest...]

WAL segments are transferred whenever Postgres indicates that a segment
is finished via the archive_command.  Async archiving "looks ahead" to
find WAL segments that are ready to archive.

You can use archive_timeout to force Postgres to push a WAL segment
every N seconds for clusters that have idle time.
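
For example (the value is only an illustration):

# postgresql.conf
archive_timeout = 60   # force a segment switch after 60s if any WAL has
                       # been written since the last switch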


Regards,
-- 
-David
da...@pgmasters.net



Re: pgbackrest when data/base is symlinked to another volume

2018-09-07 Thread David Steele

Hi Ron,

On 9/6/18 11:21 PM, Ron wrote:


Will pgbackrest properly backup and restore the cluster if data/base, 
data/pg_xlog and data/pg_log are symlinks?


PGDATA=/var/lib/pgsql/9.6/data
$PGDATA/base -> /Database/9.6/base
$PGDATA/pg_log -> /Database/9.6/pg_log
$PGDATA/pg_xlog -> /Database/9.6/pg_xlog


Yes, this will work.  Note that restore does not recreate symlinks by 
default so you'll need to specify --link-all to enable symlink creation.


See 
https://pgbackrest.org/configuration.html#section-restore/option-link-all 
for details.
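
For example (the stanza name is assumed):

pgbackrest --stanza=main --delta --link-all restore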


Using symlinks in this way will make management of your clusters more 
difficult, mostly because systems need more provisioning before restores 
can be performed.  In general I'd recommend against it unless there are 
performance considerations.


Also, you might consider using log_directory to relocate log files 
rather than a symlink.  This will exclude log files from your backup 
which is usually preferable -- primary logs restored to a standby are 
out of context and can cause confusion.
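
For example (the path is just an illustration):

# postgresql.conf
logging_collector = on
log_directory = '/var/log/postgresql'   # outside PGDATA, so not backed up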


Regards,
--
-David
da...@pgmasters.net



Re: pgbackrest when data/base is symlinked to another volume

2018-09-08 Thread David Steele

On 9/7/18 8:47 PM, Ron wrote:

On 09/07/2018 05:22 PM, David Steele wrote:


On 9/6/18 11:21 PM, Ron wrote:


Will pgbackrest properly backup and restore the cluster if data/base, 
data/pg_xlog and data/pg_log are symlinks?


PGDATA=/var/lib/pgsql/9.6/data
$PGDATA/base -> /Database/9.6/base
$PGDATA/pg_log -> /Database/9.6/pg_log
$PGDATA/pg_xlog -> /Database/9.6/pg_xlog


Yes, this will work.  Note that restore does not recreate symlinks by 
default so you'll need to specify --link-all to enable symlink creation.


See 
https://pgbackrest.org/configuration.html#section-restore/option-link-all 
for details.


Using symlinks in this way will make management of your clusters more 
difficult, mostly because systems need more provisioning before 
restores can be performed.  In general I'd recommend against it unless 
there are performance considerations.


Now that I'm thinking more about what you wrote... "data" isn't on its
own partition.  data/base has its own partition.


What's the recommended method for putting base/ on a partition
different from data/?  Or is that not recommended?


All the user data goes in base so there's really no need to separate it 
out of data.  Typically pg_wal and tablespaces are relocated onto 
different devices for performance (or to get more space).  If the 
partitions are on the same device then there's no performance benefit, 
just admin hassle.


--
-David
da...@pgmasters.net



Re: Volume partitioning (was Re: pgbackrest when data/base is symlinked to another volume)

2018-09-10 Thread David Steele

Hi Ron

On 9/8/18 4:26 PM, Ron wrote:


Googled "postgresql disk partitioning" and "postgresql volume 
partitioning" without much success.


Is the best practice volume partitioning:
/Database/9.6/data
/Database/9.6/data/pg_log
/Database/9.6/data/pg_xlog

where /var/lib/pgsql/9.6 (on RHEL6) is a symlink to /Database/9.6/data 
and PGDATA=/Database/9.6/data


or

/Database/9.6/data/base
/Database/9.6/data/pg_log
/Database/9.6/data/pg_xlog

where PGDATA=/var/lib/pgsql/9.6/data and base, pg_log and pg_xlog are 
symlinks to the partitions?


This is really a matter of preference, but simpler is better, so the 
first one looks better to me.


In general I prefer to keep everything in /var/lib/pgsql/9.6:

/var/lib/pgsql/9.6/data
/var/lib/pgsql/9.6/data/pg_xlog -> /var/lib/pgsql/9.6/wal

Then use the log_directory setting to put logs in:

/var/log/pgsql or /var/lib/pgsql/9.6/log

pgBackRest will expect to be pointed at a real directory (pg-path) and 
expect the data_directory in PostgreSQL to match.
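
For example, a sketch (the stanza name is a placeholder; pg1-path is the
pgBackRest 2 option name):

# pgbackrest.conf on the database host
[main]
pg1-path=/var/lib/pgsql/9.6/data

# postgresql.conf
data_directory = '/var/lib/pgsql/9.6/data'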


Regards,
--
-David
da...@pgmasters.net



Re: Setting up continuous archiving

2018-09-26 Thread David Steele

On 9/26/18 8:20 AM, Yuri Kanivetsky wrote:


I'm trying to compile a basic set of instructions needed to set up
continuous archiving and to recover from a backup.  I'm running
PostgreSQL 9.3 on a Debian Stretch system.


This is an incredibly complex topic and it is very difficult to 
implement correctly.


My advice is that you use a mature backup solution like pgBackRest or 
barman rather than rolling your own.


--
-David
da...@pgmasters.net



Re: Advice on logging strategy

2018-10-12 Thread David Steele

On 10/11/18 11:26 AM, Mike Martin wrote:


This results in enormous log files which take ages to import using COPY
because each execute statement is logged with the parameters chosen.


Is there any way around this?

I can't find any way to filter DML statements.


pgAudit (https://github.com/pgaudit/pgaudit) gives you fine-grained 
control over what is logged by command type, table, or user, as well as a 
lot more detail.
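
For example, a minimal sketch of pgAudit settings in postgresql.conf (the
audit classes chosen are only an illustration):

shared_preload_libraries = 'pgaudit'
pgaudit.log = 'write, ddl'      # audit only data-modifying and DDL statements
pgaudit.log_relation = on       # log each relation referenced by a statement
pgaudit.log_parameter = on      # include the statement parameters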


--
-David
da...@pgmasters.net



Re: Setting up continuous archiving

2018-10-15 Thread David Steele

On 10/15/18 5:09 PM, Benoit Lobréau wrote:


By the way, do/can they both use streaming to receive WAL records?  Or
is streaming only for standby servers, and for backups you have only
file-based log shipping?

barman supports streaming but it's not as magical as one might think.
See pgbarman's documentation for how to manage .partial files.

pgbackrest only archives WAL files when Postgres invokes the
archive_command.  You also have the option to do parallel async WAL
push/get.
It can be useful if you write WAL quicker than you can archive it or
if you want to restore more quickly.


We have focused on archive_command because the performance is much 
better: it can be parallelized.



Then, I suppose they both don't support partial PITR
(http://docs.pgbarman.org/release/2.4/#scope), where there are
standalone backups that extend to points in time for which there are
no WAL files.  I'm not sure if this matters, but I assume that it might
be effective in terms of disk space.

It's a hot backup, so you have to have the WAL files for your backup to
be consistent at the end of the backup.
You can build something like what you describe with pgbackrest,
archive-copy, and a copy of the backup directory to another place.


The --repo1-retention-archive-type and --repo1-retention-archive options 
allow you to keep WAL for a smaller number of backups in order to save 
space.


https://pgbackrest.org/configuration.html#section-repository/option-repo-retention-archive

So, for example:

[global]
repo1-retention-full=4
# The following option is the default but included for clarity
repo1-retention-archive-type=full
repo1-retention-archive=2

This configuration will retain 4 full backups but only keep PITR WAL for 
2 of them, i.e. the WAL generated between backups.  The WAL required to 
make a backup consistent is always retained so the 2 older backups can 
be played to consistency but no further.



I'd like to be able to handle db failure as fast as possible. Ideally,
automatically. Which probably means either repmgr, or corosync +
pacemaker + PAF. Is that what you mean by HA here? 



Yes.  You can also look into Patroni as an alternative.  It's a matter of
preference and requirements.


pgBackRest certainly works with Pacemaker/Corosync and Patroni.

If your standby is lagging too far behind (and you don't use slots) you
can use pgbackrest's archive to fill the gap in WAL files and catch up
with the master.


Recovering WAL out of the archive is safer than using slots.  Since the 
WAL is transferred over the network compressed it can also save a lot of 
bandwidth.
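
For example, a standby can pull WAL from the repository with (stanza
name assumed):

# recovery.conf
restore_command = 'pgbackrest --stanza=main archive-get %f "%p"'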


Regards,
--
-David
da...@pgmasters.net



Re: PGbackrest capacity related question

2019-07-22 Thread David Steele

On 7/22/19 5:08 PM, chiru r wrote:

Hi  All,

I am using pgbackrest on my production workloads and it is working as 
expected.

However, our DB keeps growing.
We currently have two disks of 4 TB each.  About 3.8 TB of data is
loaded into the DB and our /pgback mount point is 4 TB.
So if the DB grows beyond 4 TB, how do we add another file system to
store backups under "repo-path"?


pgBackRest does not have multi-volume support because there are a number 
of ways to accomplish this using ZFS, Linux volumes, etc.  You'll need to 
expand the storage for the repo using one of these methods or reduce 
your retention settings.
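
For example, if the repo lives on an LVM logical volume, one way to grow
it (device and volume group names are purely illustrative):

lvextend -r -L +2T /dev/vg_backup/lv_pgback   # -r resizes the filesystem too

Alternatively, lowering repo1-retention-full in pgbackrest.conf will free
space in the existing repo at the next expire.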


Regards,
--
-David
da...@pgmasters.net




Re: pgbackrest restore to new location?

2019-09-17 Thread David Steele
On 9/17/19 7:23 AM, Luca Ferrari wrote:
> On Tue, Sep 17, 2019 at 12:00 PM Ron  wrote:
>> The real problem is that after doing that, "pg_ctl start -D
>> /path/to/new/data" fails with "PANIC: could not locate a valid checkpoint
>> record".

Sounds like backup_label is missing or has been deleted.

The easiest way to restore to a new location is just to copy
pgbackrest.conf from the primary (or create a new one) with the same
stanza and then alter pg1-path in pgbackrest.conf or at the command line.
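
For example (stanza name and target path are placeholders):

pgbackrest --stanza=main --pg1-path=/path/to/new/data restore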

Regards,

-- 
-David
da...@pgmasters.net




Re: pgbackrest restore to new location?

2019-09-18 Thread David Steele
On 9/17/19 10:03 PM, Stephen Frost wrote:
> 
> That said- it brings up a pretty serious issue that should be discussed,
> and that's nuking this:
> 
> HINT:  If you are not restoring from a backup, try removing the file 
> ".../backup_label".
> 
> That hint is absolutely wrong these days when many tools have been
> updated to use the non-exclusive backup method and it just ends up
> getting people into trouble and, worse, can result in them having
> corrupted clusters.
> 
> I'll get a patch into the next commitfest to remove it.  The exclusive
> method has been deprecated for quite a few releases and we should stop
> giving bad advice on the assumption that people are using it.

We updated the error message and hints in c900c152, but it was not
back-patched.

-- 
-David
da...@pgmasters.net





Re: pgbackrest - question about restoring cluster to a new cluster on same server

2019-09-18 Thread David Steele
On 9/18/19 6:59 PM, Ron wrote:
> 
> Scenario: there's data corruption on production server, so we need to do
> a PITR restore from "a few days ago" of the cluster holding the prod
> databases to a second cluster on that same VM in order to try and find
> the missing data and load it back into the prod cluster.
> 
> Other than putting a high I/O load on the LUN where repo-path is located
> (from both writing WALs to it and reading the backed up files), will
> there be any problems when "pg_ctl start" processes recovery.conf and
> applies the WAL files to the new cluster while the prod cluster is
> writing new WAL files.
> 
> Does my question make sense?

It does, but the answer lies outside of pgBackRest.  "Can the repo
storage handle the load of archive-push and archive-get at the same
time" is really a question of storage and network throughput.

pgBackRest compresses everything by default which goes a long way
towards increasing throughput, but ultimately we don't control the
bandwidth.

Having said that, if the storage and network throughput are sufficient,
restoring and recovering a standby using pgBackRest will not impact the
primary the way a direct pg_basebackup would.

Regards,

-- 
-David
da...@pgmasters.net




Re: pgbackrest - question about restoring cluster to a new cluster on same server

2019-09-18 Thread David Steele
On 9/18/19 9:40 PM, Ron wrote:
> 
> I'm concerned with one pgbackrest process stepping over another one and
> the restore (or the "pg_ctl start" recovery phase) accidentally
> corrupting the production database by writing WAL files to the original
> cluster.

This is not an issue unless you seriously game the system.  When a
cluster is promoted it selects a new timeline and all WAL will be
archived to the repo on that new timeline.  It's possible to promote a
cluster without a timeline switch by tricking it but this is obviously a
bad idea.

So, if you promote the new cluster and forget to disable archive_command
there will be no conflict because the clusters will be generating WAL on
separate timelines.

In the case of a future failover a higher timeline will be selected so
there still won't be a conflict.

Unfortunately, that dead WAL from the rogue cluster will persist in the
repo until a PostgreSQL upgrade because expire doesn't know when it can
be removed since it has no context.  We're not quite sure how to handle
this but it seems a relatively minor issue, at least as far as
consistency is concerned.

If you do have a split-brain situation where two primaries are archiving
on the same timeline then first-in wins.  WAL from the losing primary
will be rejected.

Regards,
-- 
-David
da...@pgmasters.net




Re: pgbackrest - question about restoring cluster to a new cluster on same server

2019-09-18 Thread David Steele
On 9/18/19 10:18 PM, Jerry Sievers wrote:
> David Steele  writes:
> 
>> This is not an issue unless you seriously game the system.  When a
> 
> And/or your recovery system is running archive_mode=always :-)
> 
> I don't know how popular that setting value is but that plus an
> identical archive_command as the origin...  duplicate archival with
> whatever consequences.
> 
> Disclaimer: I don't know if pgbackrest guards against such a
> configuration.

We currently disallow archive_mode=always because the locking issues are
complex.  Also, standbys will not always push WAL which is binary
identical to the primary (even though they are logically the same) so it
can be really tricky to tell who is authoritative.

We have plans in this area but they are currently on the back-burner.

Regards,
-- 
-David
da...@pgmasters.net




Re: pgbackrest with PAF(corosync and pacmaker)

2019-10-17 Thread David Steele

On 10/17/19 9:48 AM, Ajay Pratap wrote:


I am using Pacemaker and Corosync to set up a two-node high-availability
cluster for PostgreSQL 10.  In this scenario I want to set up scheduled
backups with pgbackrest.
Using the backup-from-standby feature I am able to take backups from my
secondary Postgres.  But if the secondary Postgres is down, it does not
take a backup from the primary.


This is entirely on purpose.  Backup from standby is designed to reduce 
load on the primary so we believe it is counter-intuitive to put load 
back on the primary when the standby is down.  Running backups *at all* 
when degraded is usually a bad idea.


> Is there any way to achieve this?

You can run more than one standby and pgBackRest will automatically 
select the first one it finds that is up.  If you must take a backup 
when degraded then you can specify --no-backup-standby on the command 
line to disable standby for that one backup.
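
For example, a one-off backup while the standby is down (stanza and
backup type are placeholders):

pgbackrest --stanza=main --type=diff --no-backup-standby backup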


Regards,
--
-David
da...@pgmasters.net




Re: Postgres Point in time Recovery (PITR),

2019-10-18 Thread David Steele

On 10/18/19 11:29 AM, Luca Ferrari wrote:

On Fri, Oct 18, 2019 at 10:30 AM Andreas Joseph Krogh
 wrote:

We use barman (https://www.pgbarman.org/) for continuous streaming backup and I 
had to restore from it once, and it went like this:


Just for the records, here's an example of restore with pgbackrest:

% sudo -u postgres pgbackrest --stanza=miguel \
--log-level-console=info --delta restore
...
INFO: restore backup set 20190916-125652F
INFO: remove invalid files/paths/links from /postgres/pgdata/11
INFO: cleanup removed 148 files, 3 paths
...
INFO: write /postgres/pgdata/11/recovery.conf
INFO: restore global/pg_control (performed last
 to ensure aborted restores cannot be started)
INFO: restore command end: completed successfully (5113ms)


pgBackRest also has a tutorial on PITR:
https://pgbackrest.org/user-guide.html#pitr

--
-David
da...@pgmasters.net




Re: pgbackrest concerns and doubts.

2019-12-02 Thread David Steele

Hi Ajay,

On 12/2/19 1:52 AM, Ajay Pratap wrote:


I am using PostgreSQL 10 and pgbackrest 2.18 on CentOS 7.6.
I have a few doubts regarding pgbackrest.
a) What is the log rotation policy for the pgbackrest logs?  I see they go
to the default path /var/log/pgbackrest/-.log.  What is
the log rotation policy for each log?
Use case: if I am writing a script to parse the logs and gather some
info, I should be aware of log rotation; if the logs don't rotate, a
single file could be huge to parse.

Or should I simply use logrotate?


pgBackRest does not have any built-in log rotation policies since this 
is best implemented per OS.  Some packages have logrotate scripts and 
others don't.  RHEL doesn't, but you can see a logrotate example in the 
Debian/Ubuntu package at: 
https://salsa.debian.org/postgresql/pgbackrest/blob/master/debian/pgbackrest.logrotate
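
For example, a minimal sketch of an /etc/logrotate.d/pgbackrest entry
(the retention values are only an illustration):

/var/log/pgbackrest/*.log {
   weekly
   rotate 8
   compress
   delaycompress
   missingok
   notifempty
}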



b) Since pgbackrest takes physical backups, what is the impact if I
upgrade the Postgres minor version (10.5 to 10.10), and what is the
impact of a major version upgrade (10.10 to 12.X)?


Minor PostgreSQL upgrades require no special action in pgBackRest.  We 
test with each minor upgrade to ensure there are no regressions.  Unless 
you have a specific reason not to, it is always best to be running the 
most recent PostgreSQL minor version.


Major version upgrades will require a pgBackRest stanza-upgrade to be 
run after the PostgreSQL upgrade is complete.  For more information see: 
https://pgbackrest.org/user-guide-centos7.html#upgrade-stanza.
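
For example, after the new cluster is up and pg1-path points at it (the
stanza name is a placeholder):

pgbackrest --stanza=main stanza-upgrade
pgbackrest --stanza=main --type=full backup   # a fresh full backup is a
                                              # good idea after the upgrade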


Regards,
--
-David
da...@pgmasters.net




Re: pgbackrest: ERROR: [029]: unable to convert base 10 string '0000000B' to unsigned int

2020-01-21 Thread David Steele

On 1/21/20 12:55 AM, Eric Veldhuyzen wrote:

Ron wrote:

On 1/21/20 1:10 AM, Eric Veldhuyzen wrote:

Hi,

We are using pgbackrest (2.21) to back up our PostgreSQL (11) clusters.
Last night our nightly diff backup gave me the
ERROR: unable to convert base 10 string '000B' to unsigned int.
I tried a full backup to see if it would fix this, but it didn't.


Maybe I'm missing something, but 000*B* looks *hexa*decimal, not decimal.


You're not missing something, and this is most likely also the reason that this
fails. I'm quite sure it should try to convert a base 16 string here to an
unsigned int...


That's exactly what should be happening.  There has already been a bug 
filed at https://github.com/pgbackrest/pgbackrest/issues/910 and you can 
find more details there.


Regards,
--
-David
da...@pgmasters.net




Re: pgbackrest: ERROR: [029]: unable to convert base 10 string '0000000B' to unsigned int

2020-01-23 Thread David Steele

On 1/21/20 8:45 AM, David Steele wrote:

On 1/21/20 12:55 AM, Eric Veldhuyzen wrote:

Ron wrote:

On 1/21/20 1:10 AM, Eric Veldhuyzen wrote:

Hi,

We are using pgbackrest (2.21) to back up our PostgreSQL (11) clusters.
Last night our nightly diff backup gave me the
ERROR: unable to convert base 10 string '000B' to unsigned int.
I tried a full backup to see if it would fix this, but it didn't.


Maybe I'm missing something, but 000*B* looks *hexa*decimal, not 
decimal.


You're not missing something, and this is most likely also the reason 
that this
fails. I'm quite sure it should try to convert a base 16 string here 
to an

unsigned int...


That's exactly what should be happening.  There has already been a bug 
filed at https://github.com/pgbackrest/pgbackrest/issues/910 and you can 
find more details there.


This issue was fixed in 2.22, released on 21 JAN.

--
-David
da...@pgmasters.net




Re: It is possible to force periodically switch xlog?

2020-04-01 Thread David Steele

On 4/1/20 7:19 PM, Edson Richter wrote:


IIRC archive_timeout does precisely that.

I would assume that "archive_mode" must be "on" or "always" - but I do 
have "archive_mode=off".


I'm not sure what the point of switching the WAL segment would be with 
archive_mode=off.


What are you trying to accomplish?

Regards,
--
-David
da...@pgmasters.net




Re: Pgbackrest restore options

2020-05-01 Thread David Steele

On 4/17/20 2:38 AM, Konireddy Rajashekar wrote:


A doubt regarding the pgbackrest restore options:

Is there any way to ensure the backups are good, as expected?  Instead of
restoring the whole cluster or database, is there any other way to verify
them, for example by restoring a single file, or some other mechanism?


Currently there is no command to verify a backup without restoring it.

However, we are planning a verify command in the near future -- most 
likely within the next few releases.


Regards,
--
-David
da...@pgmasters.net




Re: [pgbackrest] Expiring the last backup?

2019-01-24 Thread David Steele

On 1/24/19 5:07 PM, Ron wrote:


Are these the steps, or am I missing something?

$ pgbackrest stop
$ pgbackrest stanza-delete --stanza=mystanza --force


That looks right but no need for --force.  That's what the `stop` is for 
-- to let pgBackRest know you really mean to do this.


See documentation for more information:
https://pgbackrest.org/user-guide.html#delete-stanza

--
-David
da...@pgmasters.net



Re: [pgbackrest] Expiring the last backup?

2019-01-24 Thread David Steele

On 1/25/19 8:02 AM, Ron wrote:

On 1/24/19 11:22 PM, David Steele wrote:

On 1/24/19 5:07 PM, Ron wrote:


Are these the steps, or am I missing something?

$ pgbackrest stop
$ pgbackrest stanza-delete --stanza=mystanza --force


That looks right but no need for --force.  That's what the `stop` is 
for -- to let pgBackRest know you really mean to do this.


See documentation for more information:
https://pgbackrest.org/user-guide.html#delete-stanza


Turns out that --force was required.  stanza-delete failed, complained 
that the postmaster was running, and told me to use "--force" if I 
really wanted to drop it.


Whoops -- I guess I should have read the docs, too.  It's very unusual 
to be deleting a stanza for a running cluster and we made it difficult 
on purpose.


--
-David
da...@pgmasters.net



Re: Barman disaster recovery solution

2019-02-27 Thread David Steele

On 2/27/19 2:31 PM, Achilleas Mantzios wrote:

On 27/2/19 1:58 μ.μ., rich...@simkorp.com.br wrote:
Just to note, I do use backup from standby and WAL archive from the
standby.  It is possible, but you have to configure the standby with the
WAL archive option "always".


I guess there are issues with it. If this was so easy then pgbarman and 
pgbackrest would support it out of the box.


There are a few issues with it:

1) If you allow the primary and standby to archive to the same 
repository then there needs to be some conflict resolution if they write 
at the same time.  If they write to different repositories then you need 
to decide which one to use for a restore, or have some kind of conflict 
resolution between them.  It gets complicated.


2) Writing only from the standby reduces load on the primary but if the 
connection to the primary is down then you can get behind on archiving. 
If something then happens to the primary then your recovery point will 
be limited.


Regards,
--
-David
da...@pgmasters.net



Re: Barman disaster recovery solution

2019-02-27 Thread David Steele

On 2/27/19 4:48 PM, Achilleas Mantzios wrote:

On 27/2/19 4:16 μ.μ., David Steele wrote:

On 2/27/19 2:31 PM, Achilleas Mantzios wrote:

On 27/2/19 1:58 μ.μ., rich...@simkorp.com.br wrote:
Just to note, I do use backup from standby and WAL archive from the
standby.  It is possible, but you have to configure the standby with
the WAL archive option "always".


I guess there are issues with it. If this was so easy then pgbarman 
and pgbackrest would support it out of the box.


There are a few issues with it:

1) If you allow the primary and standby to archive to the same 
repository then there needs to be some conflict resolution if they 
write at the same time.  If they write to different repositories then 
you need to decided which one to use for a restore, or have some kind 
of conflict resolution between them.  It gets complicated.


2) Writing only from the standby reduces load on the primary but if 
the connection to the primary is down then you can get behind on 
archiving. If something then happens to the primary then your recovery 
point will be limited.


David to quote an older email from you:
"pgBackRest currently requires some files and all WAL to be sent from 
the primary even when doing backup from standby.  We may improve this in 
the future but it's not on the road map right now. "
So, I had the impression that receiving WALs from the standby was a 
greater technical problem.


No, it just increases the risk of being behind on archiving.

One of the things pgBackRest does well is move a *lot* of WAL and it is 
orders of magnitude faster than streaming replication, which is 
single-threaded and uncompressed.  So, in spite of the additional load 
it's generally safest to archive from the primary, especially on high 
write volume clusters.


--
-David
da...@pgmasters.net



Re: Barman disaster recovery solution

2019-02-28 Thread David Steele

On 2/28/19 9:21 AM, Achilleas Mantzios wrote:

On 28/2/19 1:08 π.μ., Ahmed, Nawaz wrote:


Hi,

I believe the "file copy" method (listed in the table) in pgbackrest 
is based on pg_basebackup, so I think it should be "pg_basebackup over 
ssh" as pgbackrest internally calls pg_basebackup. David Steele can 
correct me.



No, apparently pgbackrest does not rely on pg_basebackup.


pgBackRest implements file copy internally and does not rely on any 
command-line tools such as rsync, tar, pg_basebackup, etc.


Regards,
--
-David
da...@pgmasters.net



Re: Pgbackrest Comparability issue

2019-03-01 Thread David Steele

On 3/1/19 1:37 AM, Adrian Klaver wrote:

On 2/28/19 12:28 PM, chiru r wrote:


The below steps we followed.
yum -y install perl-parent
yum -y install perl-Time-HiRes
yum -y install perl-JSON
yum -y install perl-Digest-SHA
yum -y install perl-Digest
yum -y install perl-DBD-Pg

Downloaded pgbackrest-release-1.24.zip from github and unpacked.
https://github.com/pgbackrest/pgbackrest

cp -r pgbackrest-release-1.24/lib/pgBackRest  /usr/share/perl5/
cp -r pgbackrest-release-1.24/bin/pgbackrest  /usr/bin/pgbackrest


You appear to have an old version at /bin/pgbackrest.

pgBackRest is available as a package from yum.postgresql.org.  If you 
installed Postgres from there then you just need to:


yum install pgbackrest

If this is not a fresh install and you have old copies of pgbackrest 
lying around it would be best to follow the uninstall directions here:


https://pgbackrest.org/prior/1.29/user-guide.html#installation

You may have files in other locations as well since I don't believe 
/bin/pgbackrest has ever been in our install directions.


Regards,
--
-David
da...@pgmasters.net



Re: Barman disaster recovery solution

2019-03-01 Thread David Steele

Achilleas,

On 2/27/19 11:39 AM, Achilleas Mantzios wrote:

On 21/2/19 9:28 π.μ., Achilleas Mantzios wrote:

On 21/2/19 9:17 π.μ., Julie Nishimura wrote:

Does anyone use this solution? any recommenations?

Thanks!


Barman will fit most requirements.  PgBackRest excels when WAL traffic
is 10 files/day or more.  I have written an article, not yet
published, comparing the 3 best-known solutions.  Will post a
link as soon as it gets published.


Hello, as promised here is my blog :
https://severalnines.com/blog/current-state-open-source-backup-management-postgresql


Indeed, the pgBackRest user guide is a bit out of date.  I've been 
meaning to update to a newer version of Postgres but haven't had the 
chance.  This gave me the extra nudge I needed.


The docs have been updated to PG10 for the next release, though the only 
visible change is to remove the `stop-auto` option since it is not 
relevant to PG10.


https://github.com/pgbackrest/pgbackrest/commit/6ce3310f8a2900d1af717da8d4c3345a9016933b

Thanks!
--
-David
da...@pgmasters.net



Re: Regarding pgaudit log_directory

2019-03-30 Thread David Steele

On 3/29/19 3:32 PM, Durgamahesh Manne wrote:


   I could not find a parameter related to a pgaudit log_directory.


pgAudit does not support logging outside the standard PostgreSQL logging 
facility and there are no plans for such a feature.


The general solution is to use Splunk, ELK, etc. to do manipulation of 
the PostgreSQL logs.


Regards,
--
-David
da...@pgmasters.net




Re: Regarding pgaudit log_directory

2019-03-31 Thread David Steele

On 3/31/19 8:01 PM, Durgamahesh Manne wrote:


On Saturday, March 30, 2019, David Steele <da...@pgmasters.net> wrote:


On 3/29/19 3:32 PM, Durgamahesh Manne wrote:

    I could not find parameter related to pgaudit log_directory  .

pgAudit does not support logging outside the standard PostgreSQL
logging facility and there are no plans for such a feature.

The general solution is to use Splunk, ELK, etc. to do manipulation
of the PostgreSQL logs.

Regards,
-- 
-David

da...@pgmasters.net

Please let me know of an open source application/interface for
monitoring pgaudit logs only,

as I have installed pgaudit successfully.


For open source, ELK is probably your best bet.  This may help get you 
started: 
https://prezi.com/f2dvt6m9tbf9/integrating-postgresql-with-logstash-for-real-time-monitoring/


--
-David
da...@pgmasters.net




Re: perl path issue

2019-05-09 Thread David Steele
On 5/9/19 9:39 AM, Adrian Klaver wrote:
> On 5/9/19 4:53 AM, Prakash Ramakrishnan wrote:
>>
>> We are having a Perl issue during a restore operation using the
>> pgbackrest backup tool; can you please help.
> 
> Looks to me like the Postgres client library libpq is not installed or
> cannot be found on this machine.

If pgBackRest was built from source here's the complete dependency list
for RHEL7:

yum install perl perl-Time-HiRes perl-Digest-SHA perl-DBD-Pg perl-JSON-PP

Regards,
-- 
-David
da...@pgmasters.net




Re: perl path issue

2019-05-09 Thread David Steele
On 5/9/19 10:57 AM, Prakash Ramakrishnan wrote:
> 
> David, I am trying to restore the backup on a dev server; please find
> the details below,

libpq should be installed with Postgres.  Have you installed Postgres yet?

-- 
-David
da...@pgmasters.net




Re: perl path issue

2019-05-09 Thread David Steele
On 5/9/19 11:12 AM, Prakash Ramakrishnan wrote:
> Yes, I did. 

OK, then I would say something is wrong with your package configuration,
but I don't know what it might be.

Regards,
-- 
-David
da...@pgmasters.net




Re: perl path issue

2019-05-09 Thread David Steele
On 5/9/19 12:51 PM, Prakash Ramakrishnan wrote:
> Hi Adrian,
> 
> I don't know about those client libraries.  Can you please explain or
> share the command?  I will show you the output.  We need a solution for
> this pgbackrest issue.

yum install postgresql10

should do it, but I thought libpq was installed with the server as well.
 Perhaps not.

-- 
-David
da...@pgmasters.net




Re: perl path issue

2019-05-09 Thread David Steele
On 5/9/19 12:58 PM, Adrian Klaver wrote:
> On 5/9/19 9:54 AM, David Steele wrote:
>> On 5/9/19 12:51 PM, Prakash Ramakrishnan wrote:
>>> Hi Adrian,
>>>
>>> I don't know about that client libraries can you please explain or share
>>> me the command I will show you the output and we need the solution for
>>> this pgbackrest issue .
>>
>> yum install postgresql10
> 
> Do you really want to do this as the server is already installed?

Perhaps not, but I was hoping it might fix a broken package.  Since he's
running EDB it almost certainly won't help.

-- 
-David
da...@pgmasters.net




Re: perl path issue

2019-05-09 Thread David Steele
On 5/9/19 12:58 PM, Prakash Ramakrishnan wrote:
> 
> I did the installation using the PostgreSQL 10.4 version EDB standard
> edition.  So do I need to run the yum command again?  Please advise me.

It really depends on how you've installed EDB: packages or their
custom installer.

The custom installer installs stuff in non-standard locations so likely
won't work without a lot of tweaking.  I don't have any experience with
the EDB packages.

Devrim?

-- 
-David
da...@pgmasters.net




Re: pgbackrest info of encrypted seems broken

2021-05-19 Thread David Steele

On 5/19/21 9:47 AM, Ron wrote:


This is on Postgresql 9.6, if it matters.  Attached is a text file 
showing the pgbackrest version, config, backup log and "pgbackrest info" 
output.


I followed the instructions in 
https://pgbackrest.org/user-guide.html#quickstart/configure-encryption, 
and successfully ran these commands before the first encrypted backup, 
but they aren't in the text file:

rm -r /Database/9.6/backups/pgbackrest
pgbackrest stanza-create --stanza=localhost

Have I misconfigured something?


If the cipher is configured per stanza then the info command must be run 
per stanza, e.g.


pgbackrest --stanza=localhost info

If the cipher settings were in [global] this would work as expected, but 
of course there may be valid reasons to set the cipher at the stanza 
level. Essentially, if you run the info command without --stanza it has 
no knowledge of stanza-level settings.
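
For example, a sketch with the cipher settings at the global level (the
passphrase and data path are placeholders):

# pgbackrest.conf
[global]
repo1-cipher-type=aes-256-cbc
repo1-cipher-pass=xxx

[localhost]
pg1-path=/var/lib/pgsql/9.6/data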


We've known about this internally for a while, but have never seen it 
expressed in the field. If this is a blocker for you (i.e. you can't put 
the cipher settings in [global]) then feel free to open a GitHub issue 
(https://github.com/pgbackrest/pgbackrest/issues) so it is on our radar.


Regards,
--
-David
da...@pgmasters.net




Re: pgbackrest info of encrypted seems broken

2021-05-19 Thread David Steele

On 5/19/21 1:42 PM, Ron wrote:

On 5/19/21 12:08 PM, David Steele wrote:

On 5/19/21 9:47 AM, Ron wrote:


This is on Postgresql 9.6, if it matters.  Attached is a text file 
showing the pgbackrest version, config, backup log and "pgbackrest 
info" output.


I followed the instructions in 
https://pgbackrest.org/user-guide.html#quickstart/configure-encryption, 
and successfully ran these commands before the first encrypted 
backup, but they aren't in the text file:

rm -r /Database/9.6/backups/pgbackrest
pgbackrest stanza-create --stanza=localhost

Have I misconfigured something?


If the cipher is configured per stanza then the info command must be 
run per stanza, e.g.


pgbackrest --stanza=localhost info

If the cipher settings were in [global] this would work as expected, 
but of course there may be valid reasons to set the cipher at the 
stanza level. Essentially, if you run the info command without 
--stanza it has no knowledge of stanza-level settings.


Both of those solved the problem.  I wound up moving it up to [global].



We've known about this internally for a while, but have never seen it 
expressed in the field. If this is a blocker for you (i.e. you can't 
put the cipher settings in [global]) then feel free to open a GitHub 
issue (https://github.com/pgbackrest/pgbackrest/issues) so it is on 
our radar.


What I suggest is documenting this. :)    Issue #1407.


Documenting it sounds like a good plan.

Regards,
--
-David
da...@pgmasters.net




Re: pgbackrest - hiding the encryption password

2021-05-19 Thread David Steele

On 5/19/21 1:49 PM, Ron wrote:


Currently on our RHEL 7.8 system, /etc/pgbackrest.conf is root:root and 
633 perms.  Normally, that's ok, but is a horrible idea when it's a 
plaintext file, and stores the pgbackrest encryption password.


Would pgbackrest (or something else) break if I change it to 
postgres:postgres 600 perms?


Nothing will break as far as I know. As long as pgbackrest can read the 
file it will be happy.


Is there a better way of hiding the password so that only user postgres 
can see it?


You could use an environment variable in postgres' environment, see 
https://pgbackrest.org/command.html#introduction.


In this case it would be PGBACKREST_REPO1_CIPHER_PASS=xxx
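
For example (a sketch -- 'xxx' is a placeholder, and how you protect
whatever file sets the value is up to you):

# e.g. in ~postgres/.bash_profile, or a systemd EnvironmentFile for the
# service that runs pgbackrest
export PGBACKREST_REPO1_CIPHER_PASS='xxx'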

Regards,
--
-David
da...@pgmasters.net




Re: pgbackrest - hiding the encryption password

2021-05-19 Thread David Steele

On 5/19/21 2:48 PM, Ron wrote:

On 5/19/21 1:34 PM, David Steele wrote:

On 5/19/21 1:49 PM, Ron wrote:


Currently on our RHEL 7.8 system, /etc/pgbackrest.conf is root:root 
and 633 perms.  Normally, that's ok, but is a horrible idea when it's 
a plaintext file, and stores the pgbackrest encryption password.


Would pgbackrest (or something else) break if I change it to 
postgres:postgres 600 perms?


Nothing will break as far as I know. As long as pgbackrest can read 
the file it will be happy.


Is there a better way of hiding the password so that only user 
postgres can see it?


You could use an environment variable in postgres' environment, see 
https://pgbackrest.org/command.html#introduction.


In this case it would be PGBACKREST_REPO1_CIPHER_PASS=xxx


Similarly there's PGBACKREST_REPO1_CIPHER_TYPE?


All options can be set through the environment. See the link for details.

Regards,
--
-David
da...@pgmasters.net




Re: PgBackRest question?

2017-12-16 Thread David Steele
On 12/15/17 4:36 PM, chiru r wrote:
> 
> Thanks,I am thinking about a specific recovery case.
> 
> Lets assume Heavy transactional system we configured.
> It is generating WAL 2000/hr and recycling automatically in pg_wal
> directory. 
> 
> QA :
> 
> Sunday -- 11 PM  -- Full backup done.
> Monday -- 11 PM  -- Differential Backup done
> Tuesday-- 10 AM incremental backup is done
> Note :  Every  2 hrs  incremental backup scheduled on system.
> 
> For example, if we want to restore to DEV server:
> 
> We want to recover database as of 11:30 AM Tuesday on DEV server using
> QA backups.
> 
> Is it possible to restore using pgbackrest tool for this scenario?.

Yes.  See the documentation here on how to create a standby:

http://pgbackrest.org/user-guide.html#replication/hot-standby

And to recover to a specific point in time:

http://pgbackrest.org/user-guide.html#pitr

Then simply promote the standby and you have a dev server.

> How pgbackrest keeps track of transactions since the last backup? Where
> it stores transaction information for recovery ?.

The transactions since the last backup are tracked in the WAL.  Here's
how to setup WAL archiving:

http://pgbackrest.org/user-guide.html#quickstart/configure-archiving

It looks like you have a lot of WAL volume so you'll want to use async
archiving:

http://pgbackrest.org/user-guide.html#backup-host/async-archiving

You mentioned in your original post:

> We are looking for an option, Can we store online backups and wal archive
> files separately in different directories?.

But it's not clear to me why you would need to do that based on your use
case.

Regards,
-- 
-David
da...@pgmasters.net



Re: PgBackRest question?

2017-12-18 Thread David Steele
On 12/17/17 7:10 PM, chiru r wrote:
> Thanks David for the reply.
> 
> we are not interested to use replication/Standby configuration at this
> moment with pgbackrest.

Database restores all work the same way -- the only difference is a few
configuration parameters.

> We are looking to restore the QA backups on Dev systems.  Please point
> us to any blogs/documentation for remote database recovery scenarios.

I pointed you at the standby documentation because it describes a remote
database recovery scenario.  The only thing you need to do is exclude
standby_mode=on and you will have your dev system.

Regards,
-- 
-David
da...@pgmasters.net



Re: Does PostgreSQL check database integrity at startup?

2017-12-26 Thread David Steele

On 12/26/17 10:57 AM, Pavel Stehule wrote:
2017-12-26 16:50 GMT+01:00 Edson Carlos Ericksson Richter 



I'm rebuilding the standby server for two days already, with 23% of
completion status...
If lost the database and backups because of that failure, it would
be a giant disaster.
Few hours checking integrity would be acceptable... Specially if I
can run it on standby only.


very simple check

pg_dumpall > /dev/null

but this doesn't check indexes.


pgBackRest will validate all page checksums (including indexes, etc.) in 
the cluster during backup.  Full backups check everything, 
incr/differential backups check only the files that have changed.


--
-David
da...@pgmasters.net



Re: Does PostgreSQL check database integrity at startup?

2017-12-27 Thread David Steele

On 12/26/17 4:58 PM, Alvaro Herrera wrote:



pgBackRest will validate all page checksums (including indexes, etc.) in the
cluster during backup.  Full backups check everything, incr/differential
backups check only the files that have changed.


If a table or index file is of zero length when backed up, as in the
described case, nothing will be checked, right?  I mean, there is
nothing externally indicating that the file ought to be of a different
size.  Am I wrong?  


Yes - that is how it works.


So Edson's situation here would not raise any red
flags.


It wasn't clear to me from the OP that the primary and standby were 
different - I thought there was just a zero file in general.


In any case, my reply was more directed at Pavel's reply about using 
pg_dump to validate checksums.  There are better ways...


--
-David
da...@pgmasters.net



Re: PgBackRest question?

2017-12-28 Thread David Steele

On 12/28/17 3:38 PM, chiru r wrote:

Thank you for the details, David and Stephen.

I am unable to recover a database which is associated with a user tablespace.

Please see the test case below and let me know if anyone has had issues
while recovering a single database.


Test case:

1) Created tablespace tblsp1.
2) Created database db1 with tablespace tblsp1.
3) Created another database db2 (without a tablespace, meaning it is
created under the pg_default tablespace).

4) Took a full backup.
5) Created table t1 in both the db1 and db2 databases.
6) Took an incremental backup.
7)   a) We tried to restore db1 to a different location, both by
mapping the tablespace and without mapping it.


This all looks reasonable.


However we got the below Error
  pgbackrest --stanza=online_backups  --db-include=db1 
  --tablespace-map-all=/u02/pgdata02/report2  restore

ERROR [080]: : database to include 'db1' does not exist


It appears that db1 is not listed as a valid database in the backup 
manifest.


b) We tried to restore db1 to a different location and we got
the error below:


pgbackrest --stanza=online_backups  --delta --db-include=db1  restore
ERROR [080]: : database to include 'db1' does not exist

Please help me: can we restore a single database if it is assigned to a
tablespace?

And also let me know if anything is missing in my test case.

Note: if I restore db2, which is associated with the pg_default
tablespace, it works.

pgbackrest --stanza=online_backups  --delta --db-include=db2  restore


Can you run this query on the cluster where the backups are running and 
attach the result:


select datname, oid from pg_database;

Thanks,
--
-David
da...@pgmasters.net



Re: PgBackRest question?

2017-12-28 Thread David Steele

On 12/28/17 5:15 PM, chiru r wrote:


Please find the below details.

postgres=# select datname, oid from pg_database;
   datname  |  oid
---+
  template0 |  13289
  postgres  |  13294
  template1 |      1
  db1       | 770161
  db2       | 770162
  db3       | 770169
(6 rows)

That looks OK.  The next thing to look at is the backup that you are 
trying to restore.  Based on your commands this should be the most 
recent backup.


Please attach the backup.manifest file from:

[repo-path]/backup/online_backups/latest/backup.manifest

It's probably best if you compress it before attaching.

Thanks,
--
-David
da...@pgmasters.net



Re: PgBackRest question?

2017-12-29 Thread David Steele
On 12/28/17 6:15 PM, chiru r wrote:
> 
> I am unable to copy the complete backup.manifest file due to security
> reasons . please find the below contents.
> 
> [backup:target]
> pg_data={"path":"/u02/pgdata01/9.5/data","type":"path"}
> pg_tblspc/721349={"path":"/u02/pgdata02/report1","tablespace-id":"721349","tablespace-name":"report1","type":"link"}
> 
> [db]
> db1={"db-id":770161,"db-last-system-id":13289}
> db2={"db-id":770162,"db-last-system-id":13289}
> db3={"db-id":770169,"db-last-system-id":13289}
> postgres={"db-id":13294,"db-last-system-id":13289}
> template0={"db-id":13289,"db-last-system-id":13289}
> template1={"db-id":1,"db-last-system-id":13289}

OK -- it looks like this is a bug.  pgBackRest is validating the
database mappings against the files in the manifest but does not
recognize databases that are assigned to a tablespace.  It's OK if
tables are assigned to a tablespace, but not the entire database.

We'll fix this in the next release.  I've opened an issue on github to
track it: https://github.com/pgbackrest/pgbackrest/issues

Thanks!
-- 
-David
da...@pgmasters.net



Re: PgBackRest question?

2017-12-29 Thread David Steele
On 12/29/17 1:53 PM, Stephen Frost wrote:
> Greetings,
> 
> * David Steele (da...@pgmasters.net) wrote:
>> On 12/28/17 6:15 PM, chiru r wrote:
>>> I am unable to copy the complete backup.manifest file due to security
>>> reasons . please find the below contents.
>>>
>>> [backup:target]
>>> pg_data={"path":"/u02/pgdata01/9.5/data","type":"path"}
>>> pg_tblspc/721349={"path":"/u02/pgdata02/report1","tablespace-id":"721349","tablespace-name":"report1","type":"link"}
>>>
>>> [db]
>>> db1={"db-id":770161,"db-last-system-id":13289}
>>> db2={"db-id":770162,"db-last-system-id":13289}
>>> db3={"db-id":770169,"db-last-system-id":13289}
>>> postgres={"db-id":13294,"db-last-system-id":13289}
>>> template0={"db-id":13289,"db-last-system-id":13289}
>>> template1={"db-id":1,"db-last-system-id":13289}
>>
>> OK -- it looks like this is a bug.  pgBackRest is validating the
>> database mappings against the files in the manifest but does not
>> recognize databases that are assigned to a tablespace.  It's OK if
>> tables are assigned to a tablespace, but not the entire database.
>>
>> We'll fix this in the next release.  I've opened an issue on github to
>> track it: https://github.com/pgbackrest/pgbackrest/issues
> 
> Of course, a full restore should work just fine, this issue occurs
> only if you're doing a single-database restore from a cluster.
> 
> Presuming you're doing more than just testing, doing a full restore
> (and then dropping the databases that you don't want) would be a
> workaround until 1.28 is out.

Stephen is correct -- if you have the space to do full restores then
that will work until this is fixed.
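
For example, the workaround could look something like this (the stanza name 
is taken from the commands above; the data directory, database names, and 
start command are illustrative and will depend on your environment):

  # full restore of the latest backup over the existing cluster files
  pgbackrest --stanza=online_backups --delta restore

  # start the cluster, let recovery complete, then drop the databases
  # that are not needed (names here are just examples)
  pg_ctl start -D /u02/pgdata01/9.5/data
  psql -c "DROP DATABASE db2;"
  psql -c "DROP DATABASE db3;"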

-- 
-David
da...@pgmasters.net





Re: pgBackRest backup from standby

2018-02-19 Thread David Steele
Hi Don,

On 2/18/18 7:34 PM, Don Seiler wrote:
> 
> Looking to use pgBackRest to take a backup from a hot standby. I'm
> reading that pgBackRest still needs to connect to the primary and copy
> some files. My questions are:
> 
>  1. What files does it need to copy? Config files? WAL files?

It copies files that are not replicated from the primary so that a
primary-style backup is created. Anything that is replicated (which is
by far the bulk of the data) is copied from the standby.

>  2. How does it connect? SSH?

Yes.

>  3. Does pgBackRest need to be installed and configured on the primary
> as well?

Yes.  In any case, it's best to archive from the primary so a replication
failure does not affect your archiving.

Configuring pgBackRest, SSH, standby, and backup from standby are all
covered in the user guide.

http://pgbackrest.org/user-guide.html

In particular:

http://pgbackrest.org/user-guide.html#backup-host
http://pgbackrest.org/user-guide.html#replication
http://pgbackrest.org/user-guide.html#standby-backup

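As a rough sketch only (host names and paths are placeholders, the option 
names assume the version 2 naming, and the user guide above is the 
authoritative reference), the repository host configuration for backup 
from standby looks something like:

  [global]
  repo1-path=/var/lib/pgbackrest
  backup-standby=y

  [demo]
  pg1-host=primary.example.com
  pg1-path=/var/lib/pgsql/9.6/data
  pg2-host=standby.example.com
  pg2-path=/var/lib/pgsql/9.6/data
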
Regards,
-- 
-David
da...@pgmasters.net



Re: pgBackRest backup from standby

2018-02-19 Thread David Steele
On 2/18/18 10:20 PM, Michael Paquier wrote:
> On Sun, Feb 18, 2018 at 06:48:30PM -0600, Don Seiler wrote:
>> On Sun, Feb 18, 2018 at 6:43 PM, Michael Paquier  wrote:
>>> You may want to contact the maintainers directly through github where
>>> the project is maintained:
>>> https://github.com/pgbackrest/pgbackrest
>>
>> Is that the place to just ask questions? I wasn't sure that "Issues" meant
>> just questions versus problems or bugs. I didn't see any mention of a forum
>> or list on their website, and there have been lots of pgBackRest questions
>> on this list in the past so I settled on this one.
> 
> Stephen and David are around all the time, so it is not really an issue
> to discuss things related to pgBackRest on this list I guess :)
> Attaching related folks directly in CC: also usually helps.
> 
> You may get faster feedback by opening an issue directly on github
> though, as there are a lot of emails on -general so it is easy to get
> things lost.  My 2c.

Either is fine with me, but as Michael says I might miss postings to
-general.  I'm sure somebody else would catch it, though.

-- 
-David
da...@pgmasters.net





Re: pgBackRest backup from standby

2018-02-19 Thread David Steele
Hi Don,

On 2/19/18 9:25 AM, Don Seiler wrote:
> On Mon, Feb 19, 2018 at 8:18 AM, David Steele <da...@pgmasters.net> wrote:
> 
> It copies files that are not replicated from the primary so that a
> primary-style backup is created. Anything that is replicated (which is
> by far the bulk of the data) is copied from the standby.
> 
> OK so all data files would be copied from standby. Can you give me an
> example of the types of files that need to be copied from primary?
>  

Anything *not* in global (except pg_control), base, pg_tblspc,
pg_xact/pg_clog, and pg_multixact is copied from the primary.

For example, pg_stat is copied from the primary so these stats are
preserved on a standby backup.

pgBackRest uses all the same exclusions as pg_basebackup, so many
dirs/files are not copied at all: pg_dynshmem, pg_notify, pg_replslot,
pg_serial, pg_snapshots, pg_stat_tmp, pg_subtrans, etc.

Full list here
https://www.postgresql.org/docs/10/static/protocol-replication.html.

> it's best to archive from the primary so a replication
> failure does not affect your archiving.
> 
> Understood, just not something I can change in production primary at the
> moment. Hence looking to see about a quick one-off backup from standby.

For a quick one-off, pg_basebackup is your friend.
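
For instance, run against the standby, something like this would do it 
(host, user, and target directory are placeholders):

  pg_basebackup -h standby.example.com -U replication_user \
      -D /backup/oneoff -F tar -z -X stream -P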

Regards,
-- 
-David
da...@pgmasters.net



Re: pgBackRest backup from standby

2018-02-19 Thread David Steele
Hi Don,

On 2/19/18 10:01 AM, Don Seiler wrote:
> On Mon, Feb 19, 2018 at 8:53 AM, David Steele <da...@pgmasters.net> wrote:
> 
> Anything *not* in global (except pg_control), base, pg_tblspc,
> pg_xact/pg_clog, and pg_multixact are copied from the primary.
> 
> For example, pg_stat is copied from the primary so these stats are
> preserved on a standby backup.
> 
> So if I have tablespaces outside of $PGDATA (but symlinked from within
> pg_tblspc, of course), those will still be backed up from the standby,
> right?

Correct.

> Is it right to say that the files that would be copied from primary are
> very small, typically? So it isn't a huge transfer over the WAN (in my
> case)?

Yes, they are typically very small.  The general exception to this rule
is if logs are stored in pg_log.  I recommend storing logs out of the
PGDATA dir as they can be quite large and don't really make sense to
restore to another server.

Files copied from the master will be marked as such in backup.manifest
(master:true) so you can check for yourself.
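
For example, something along these lines will list them (the repository 
path and stanza are placeholders, and the exact key format in the manifest 
may differ between versions):

  grep '"master":true' \
      /var/lib/pgbackrest/backup/demo/latest/backup.manifest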

> I did come up with a sort of Rube Goldberg-esque workaround for now
> involving using a clone of the prod standby VM from Veeam backup to use
> as the backup source (after stopping recovery and opening it as a
> standalone DB).

You don't get PITR that way, of course, but at least it's a backup.  As
long as your clone is consistent.

-- 
-David
da...@pgmasters.net



Re: pgBackRest backup from standby

2018-02-19 Thread David Steele
On 2/19/18 10:32 AM, Don Seiler wrote:
> On Mon, Feb 19, 2018 at 9:21 AM, David Steele <da...@pgmasters.net> wrote:
> 
> 
> Yes, they are typically very small.  The general exception to this rule
> is if logs are stored in pg_log.  I recommend storing logs out of the
> PGDATA dir as they can be quite large and don't really make sense to
> restore to another server.
> 
> Files copied from the master will be marked as such in backup.manifest
> (master:true) so you can check for yourself.
> 
> 
> Good to know. And fortunately for this DB we do have pg_log (and
> pg_xlog) symlinked to different volumes outside of $PGDATA.

If pg_log is symlinked into PGDATA it will be copied.  pg_xlog is not
copied in any backup.

> 
> > I did come up with a sort of Rube Goldberg-esque workaround for now
> > involving using a clone of the prod standby VM from Veeam backup to use
> > as the backup source (after stopping recovery and opening it as a
> > standalone DB).
> 
> You don't get PITR that way, of course, but at least it's a backup.  As
> long as your clone is consistent.
> 
> 
> Yes it's a crash-consistent snapshot-based backup. I've done quite a few
> restores from it and it works great. It can do PITR as well since I
> would have all the WAL files from prod needed to keep recovering. But
> for these cases I just recover it to the first consistent point and open
> it for testing (or backups in this case). 

I don't think it would be safe to do PITR on a backup taken in this way.
 The WAL diverges even if you suppress a timeline switch.

-- 
-David
da...@pgmasters.net



Re: pgBackRest backup from standby

2018-02-19 Thread David Steele
On 2/19/18 11:29 AM, Simon Riggs wrote:
> On 19 February 2018 at 16:17, David Steele  wrote:
>>> > I did come up with a sort of Rube Goldberg-esque workaround for now
>>> > involving using a clone of the prod standby VM from Veeam backup to 
>>> use
>>> > as the backup source (after stopping recovery and opening it as a
>>> > standalone DB).
>>>
>>> You don't get PITR that way, of course, but at least it's a backup.  As
>>> long as your clone is consistent.
>>>
>>>
>>> Yes it's a crash-consistent snapshot-based backup. I've done quite a few
>>> restores from it and it works great. It can do PITR as well since I
>>> would have all the WAL files from prod needed to keep recovering. But
>>> for these cases I just recover it to the first consistent point and open
>>> it for testing (or backups in this case).
>>
>> I don't think it would be safe to do PITR on a backup taken in this way.
> 
> If you have all the WAL files, then it would be safe.

I read "open it for testing (or backups in this case)" as letting
recovery complete and promoting the cluster to a master before taking
the backup.

Don, is that the case?  If it is, I think there's a problem with or
without a timeline switch.  If you confirm the backup is being taken as
above then I'll detail my concerns.

-- 
-David
da...@pgmasters.net



Re: pgBackRest backup from standby

2018-02-19 Thread David Steele
On 2/19/18 2:05 PM, Don Seiler wrote:
> On Mon, Feb 19, 2018 at 12:39 PM, David Steele <da...@pgmasters.net> wrote:
> 
> 
> I read "open it for testing (or backups in this case)" as letting
> recovery complete and promoting the cluster to a master before taking
> the backup.
> 
> Don, is that the case?  If it is, I think there's a problem with or
> without a timeline switch.  If you confirm the backup is being taken as
> above then I'll detail my concerns.
> 
> 
> Note that this is just for creating a couple of one-off backups to
> restore for our dev and pre-prod environments. Given that, I was going
> to open a new clone as its own cluster and take backups from that. The
> data would be the same though and suit purposes of the dev and pre-prod
> refreshes.
> 
> If I were taking backups for the purpose of production backups, I would
> not do things this way. That is the eventual plan but right now we
> aren't ready to make the changes necessary in the production environment.

OK, that's fine then.  You can play these to consistency and they'll be
fine.  I just wouldn't try to do any PITR using the production WAL archive.

-- 
-David
da...@pgmasters.net



Re: pgBackRest backup from standby

2018-02-19 Thread David Steele
On 2/19/18 3:41 PM, Don Seiler wrote:
> On Mon, Feb 19, 2018 at 10:17 AM, David Steele <da...@pgmasters.net> wrote:
> 
> If pg_log is symlinked to PGDATA it will be copied.  pg_xlog is not
> copied in any backup.
> 
> 
> So an external pg_log directory symlinked into $PGDATA will have its log
> contents copied?

Yes.

> I'm curious, why even copy the pg_log logs? They aren't needed for
> database restore or recovery.

The general philosophy is to copy everything except what we know for
sure can be excluded.  In practice, this means sticking to what
pg_basebackup excludes because that list is vetted by the community.

Also, relocating the log directory is easy using the log_directory
setting, so that's what I recommend if it's an issue.  Some users do
want to backup their logs.
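
For example, in postgresql.conf (the path is just an illustration):

  logging_collector = on
  log_directory = '/var/log/postgresql'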

-- 
-David
da...@pgmasters.net



Re: initdb when data/ folder has mount points

2018-02-21 Thread David Steele

On 2/21/18 7:01 PM, Tom Lane wrote:

Ron Johnson  writes:

Apparently, initdb assumes that data/ is one big mount point. However, we
have four mount points:
/var/lib/pgsql/9.6/data/backup
/var/lib/pgsql/9.6/data/base
/var/lib/pgsql/9.6/data/pg_log
/var/lib/pgsql/9.6/data/pg_xlog


Don't do that.


Agreed.


There's no reason for backup storage to be under the data directory (and
lots of good reasons for it not to be).  Just put it somewhere else.


Yes -- in this configuration your backups would be backed up with every 
backup.  It's pretty obvious where that would go.



The supported way to put pg_xlog on a separate volume is to put that
mount point somewhere else, and make $PGDATA/pg_xlog be a symlink to
it.  IIRC, there's an initdb option to help with that, though you can
also make it so manually after initdb.


initdb supports linking pg_xlog/pg_wal with the --xlogdir/--waldir option.
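
For example (paths are illustrative; use --waldir on v10 and later, 
--xlogdir on older versions):

  initdb -D /var/lib/pgsql/9.6/data --xlogdir=/pg_xlog_mount/9.6/pg_xlog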


For pg_log, just put it somewhere else and set the appropriate
configuration option to say where to write the postmaster log files.
Or you could use a symlink, like the solution for pg_xlog, but
I don't see any advantage there.


Symlinking pg_log is not ideal because the logs end up in the backup. 
It gets pretty weird when those logs get restored to a standby and 
somebody starts reading them.



I don't see any point in making base/ be its own mount point.  Once
you get rid of those other subdirectories there's not going to be
enough "global" storage left to justify its own volume.


Agreed.

--
-David
da...@pgmasters.net



Re: initdb when data/ folder has mount points

2018-02-22 Thread David Steele
On 2/22/18 1:16 AM, Michael Paquier wrote:
> On Wed, Feb 21, 2018 at 07:56:38PM -0500, David Steele wrote:
>> On 2/21/18 7:01 PM, Tom Lane wrote:
>>> For pg_log, just put it somewhere else and set the appropriate
>>> configuration option to say where to write the postmaster log files.
>>> Or you could use a symlink, like the solution for pg_xlog, but
>>> I don't see any advantage there.
>>
>> Symlinking pg_log is not ideal because the logs end up in the backup. It
>> gets pretty weird when those logs get restored to a standby and somebody
>> starts reading them.
> 
> log_directory in postgresql.conf can be set up with an absolute
> directory value.  So there is no actual need for a symlink with pg_log.
> This also reduces the amount of data transferred as part of base
> backups, since the logs are not needed there anyway.

Yes, I really should have said having pg_log in the data directory at
all is not ideal, symlinked or no.

-- 
-David
da...@pgmasters.net


