On Sat, Dec 30, 2017 at 09:19:05AM -0500, Robert Blayzor wrote:
> On Dec 30, 2017, at 12:38 AM, Justin Pryzby wrote:
> > BTW depending on your requirements, it may be possible to make pg_dump much
> > more efficient. For our data, it's reasonable to assume that a table is
> > "final" if its constraints exclude data older than a few days ago, and it
> > can
> > be permanently dumped and excluded from future, daily backups, which makes
> > the
> > backups smaller and faster, and probably causes less cache churn, etc. But
> > I
> > imagine you might have different requirements, so that may be infeasible, or
> > you'd maybe have to track insertions, either via p
>
> The idea is only only keep a # of months available for searching over a
> period of months. Those months could be 3 or more, up to a year, etc. But
> being able to just drop and entire child table for pruning is very
> attractive. Right now the average months data is about 2-3 million rows each.
> Data is just inserted and then only searched. Never updated…
>
> I also like the idea of skipping all this older data from a PGdump. We
> archive records inserted into these tables daily into cold storage. ie:
> export and compressed. So the data is saved cold. We dump the DB nightly
> also, but probably would make sense to skip anything outside of the newest
> child table. Just not sure how to make that happen, yet….
For us, I classify the tables as "partitioned" or "not partitioned" and
subdivide "partitioned" into "recent" or "historic" based on table names; but
if you design it from scratch then you'd have the opportunity to keep a list of
partitioned tables, their associated date range, date of most recent insertion,
and most recent "final" backup.
This is the essence of it:
snap= ... SELECT pg_export_snapshot();
pg_dump --snap "$snap" -T "$ptnreg" -f nonpartitioned.new
pg_dump --snap "$snap" -t "$recent" -f recent.new
loop around historic partitioned tables and run "final" pg_dump if it's been
INSERTed more recently than it's been dumped.
remove any "final" pg_dump not included in any existing backup (assuming you
keep multiple copies on different rotation).
Note that pg_dump -t/-T is different from "egrep" in a few special ways..
Justin