Re: Purging few months old data and vacuuming in production

2023-01-06 Thread Ranjith Paliyath
Thank you for the details, experience shared and the suggestions.
Apologies for the delay in collecting the response for the queries.

(1)Are the tables tied together by FK?
  - Overall there are 9 tables (sorry, not 6 as mentioned originally) that are 
being purged. Only 4 of the tables have FK relationships.

(2)How big are the rows?
  - The 9 tables now occupy almost 2TB space. Below is the rowsize (in bytes) 
and record-count details -
  
  rowsize(bytes)   record-count
  236              188,055,675
  297              296,941,261
  371               58,673,649
   95               57,477,553
  904              296,743,680
  234              188,161,891
  414              430,411,653
  707              735,895,015
  128              155,104,922

(3)Is there an index on the date field?
  - Yes, but only in one table, which is the main table (the records to purge 
in the rest of the tables are identified based on this table). 

(4)Can you drop unneeded indices during the window, and then rebuild them 
afterward?
  - Not sure if the time window within which the purge process must complete 
would be sufficient to drop and rebuild the indices.

(5)How beefy is your hardware?
  - No. of cores - 64
Memory - 128GB
Disk - SSD, Total capacity - 8.5TB

Thank you...This electronic mail (including any attachment thereto) may be 
confidential and privileged and is intended only for the individual or entity 
named above. Any unauthorized use, printing, copying, disclosure or 
dissemination of this communication may be subject to legal restriction or 
sanction. Accordingly, if you are not the intended recipient, please notify the 
sender by replying to this email immediately and delete this email (and any 
attachment thereto) from your computer system...Thank You.




Re: best practice to patch a postgresql version?

2023-01-06 Thread Ron

On 1/5/23 23:43, Laurenz Albe wrote:

On Tue, 2022-12-27 at 00:48 -0600, Ron wrote:

If it really is a critical production database, you will have a CAT/UAT 
(customer/user acceptance testing)
server on which you rigorously run regression tests on a point release for a 
month before updating the production server.

Otherwise, it's a hope-and-pray database.

No, that is wrong.

You should not test your application when you install a minor update.  The 
reason is that
few people are willing to test the application thoroughly every few months, and 
the outcome
is that minor releases are *not* applied regularly, as they should be.

You are supposed to trust PostgreSQL development that they don't introduce new 
bugs.
Sure, this can happen, even though all possible care is taken with backpatches. 
 I have
seen it happen once or twice in the 15+ years I have been dealing with 
PostgreSQL.
In that case, a new minor release will come out soon afterwards.


It's absolutely standard practice "in the enterprise" to install the latest 
patch on the UAT (and possibly Dev and Staging) servers before rolling out 
to production.


Have I ever seen a problem in PostgreSQL?  No.  But I've seen problems with 
other RDBMSs.  If a problem did happen and caused, for example, an important 
report to suddenly take 3 hours instead of 3 minutes, the client will 
scream; there might even be SLA penalties.


Thus, we're cautious with "critical production databases".

--
Born in Arizona, moved to Babylonia.




Re: Purging few months old data and vacuuming in production

2023-01-06 Thread Ron

On 1/6/23 02:44, Ranjith Paliyath wrote:

Thank you for the details, experience shared and the suggestions.
Apologies for the delay in collecting the response for the queries.

(1)Are the tables tied together by FK?
   - Overall there are 9 tables (sorry, not 6 as mentioned originally) that are 
being purged. Only 4 of the tables have FK relationships.

(2)How big are the rows?
   - The 9 tables now occupy almost 2TB space. Below is the rowsize (in bytes) 
and record-count details -
   
 rowsize(bytes)   record-count
 236              188,055,675
 297              296,941,261
 371               58,673,649
  95               57,477,553
 904              296,743,680
 234              188,161,891
 414              430,411,653
 707              735,895,015
 128              155,104,922

(3)Is there an index on the date field?
   - Yes. But only in one table, which is the main table (records to purge in 
rest of the tables is based on this table).


Can you do online purging?

For example, get a list of the main table's primary keys to be deleted, and 
then nibble away at them all day: in one transaction delete all the records 
for one logically related set of records.  Do that N million times, and 
you've purged the data without impacting production.
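
A rough sketch of that nibbling pattern (hypothetical table/column names - 
main_table, pk, event_date are placeholders, and the batch size is arbitrary):

```sql
-- Stage the PKs once, then delete in small batches so each
-- transaction stays short and locks are held only briefly.
CREATE TEMP TABLE purge_keys AS
SELECT pk FROM main_table
WHERE event_date < now() - interval '3 months';

-- Repeat this statement until it reports DELETE 0:
WITH batch AS (
    DELETE FROM purge_keys
    WHERE pk IN (SELECT pk FROM purge_keys LIMIT 1000)
    RETURNING pk
)
DELETE FROM main_table WHERE pk IN (SELECT pk FROM batch);
```

Deleting the staged keys and the live rows in the same statement keeps the 
batch query from re-selecting keys it has already processed.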



--
Born in Arizona, moved to Babylonia.




Re: Purging few months old data and vacuuming in production

2023-01-06 Thread Ranjith Paliyath
Thank you very much for the response.

> Can you do online purging?

> For example, get a list of the main table's primary keys to be deleted, 
and
> then nibble away at them all day: in one transaction delete all the 
records
> for one logically related set of records.  Do that N million times, and
> you've purged the data without impacting production.

So, with this approach, is the advantage like, manual vacuuming worry may be 
set aside, because auto-vacuuming would deal with the dead rows? This is 
because the deletion step is executed record by record in main table, with its 
connected record(s) delete executions in rest of tables? Due to the infra 
capability that is there in this instance, the impact could be almost none!!??

Thank you...





Re: Purging few months old data and vacuuming in production

2023-01-06 Thread Ron

On 1/6/23 08:27, Ranjith Paliyath wrote:

Thank you very much for the response.

 > Can you do online purging?

 > For example, get a list of the main table's primary keys to be deleted, 
and
 > then nibble away at them all day: in one transaction delete all the 
records
 > for one logically related set of records.  Do that N million times, and
 > you've purged the data without impacting production.

So, with this approach, is the advantage like, manual vacuuming worry may be 
set aside, because auto-vacuuming would deal with the dead rows?


Theoretically, manual vacuuming is never necessary.  I'd occasionally do 
manual vacuums (after purging a couple of weeks of data, for example).


Disable autovacuum on a table, vacuum it, then reenable autovacuum.

ALTER TABLE table_name SET (autovacuum_enabled = false);
VACUUM table_name;
ALTER TABLE table_name SET (autovacuum_enabled = true);



  This is because the deletion step is executed record by record in main table, 
with its connected record(s) delete executions in rest of tables?


I don't know if you have ON DELETE CASCADE.  Even if you do, you'll have to 
manually delete the tables not linked by FK.  I'd write a PL/pgSQL 
procedure: pass in a PK and then delete records from the 9 tables in the 
proper order so as to not throw FK constraint errors.
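
As a hedged sketch of such a procedure (placeholder names throughout; the 
child tables and their FK-safe order would come from your actual schema):

```sql
-- Hypothetical: delete one logically related set of records,
-- children first, so no FK constraint errors are thrown.
CREATE PROCEDURE purge_one(p_pk bigint)
LANGUAGE plpgsql AS $$
BEGIN
    DELETE FROM child_a WHERE main_pk = p_pk;
    DELETE FROM child_b WHERE main_pk = p_pk;
    -- ... the remaining child tables, in FK-safe order ...
    DELETE FROM main_table WHERE pk = p_pk;
END $$;

-- One short transaction per logical set:
CALL purge_one(12345);
```

Each CALL in its own transaction keeps the delete footprint small, at the 
cost of one round of WAL and index maintenance per set.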



Due to the infra capability that is there in this instance,


What is "infra capability"?


the impact could be almost none!!??


It'll use /some/ resources, because it's a thread deleting records, but most 
of the records and index nodes won't be where new records are being inserted.


Note, though, that this will generate a lot of WAL records.

--
Born in Arizona, moved to Babylonia.

Re: Updating column default values in code

2023-01-06 Thread Brad White
On Wednesday, December 28, 2022, Brad White  wrote:

> On timestamp fields, I need to update the column default from the current
> "Now()" to "LOCALTIMESTAMP(0)"
>
> I could just manually make the change on every table, but they want the
> existing backups to still work. So I need something that I can build into
> my restore script.
>
> I've gotten the list of tables and columns.
>
> How do I update the relation?
>
> SELECT a.attrelid, a.attname,
>pg_get_expr(d.adbin, d.adrelid) AS default_value
> FROM   pg_catalog.pg_attribute AS a
> JOIN   pg_catalog.pg_attrdef d
>ON (a.attrelid, a.attnum) = (d.adrelid, d.adnum)
> WHERE  NOT a.attisdropped   -- no dropped (dead) columns
> AND    a.attnum > 0         -- no system columns
> AND    pg_get_expr(d.adbin, d.adrelid) = 'now()'
>
>
>
Extend the query so it returns all inputs needed to manually write an ALTER
TABLE command (chiefly, you need textual names, not OIDs).  Then use format()
to actually write those commands using the query as input.  You can use psql
\gexec to actually execute those queries, though other options exist
depending on what tools you are comfortable with.
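
For instance, a sketch of that generation query (untested; it targets only
defaults whose stored expression is exactly now()):

```sql
SELECT format(
    'ALTER TABLE %I.%I ALTER COLUMN %I SET DEFAULT LOCALTIMESTAMP(0);',
    n.nspname, c.relname, a.attname)
FROM pg_catalog.pg_attribute a
JOIN pg_catalog.pg_class     c ON c.oid = a.attrelid
JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
JOIN pg_catalog.pg_attrdef   d
  ON (d.adrelid, d.adnum) = (a.attrelid, a.attnum)
WHERE NOT a.attisdropped
  AND a.attnum > 0
  AND pg_get_expr(d.adbin, d.adrelid) = 'now()'
\gexec
```

Note that format()'s %I specifier double-quotes mixed-case identifiers
automatically, which sidesteps hand-quoting table and column names.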

David J.

I'm still suffering here.
I have 95 column defaults to update across most of my data tables.

I can generate ALTER statements, per David's sensible suggestion, but they
fail because you have to drop all the views.
I created a script that would drop and re-create all the views, but all the
table/column names are non-quoted, which fails because all my tables/columns
are mixed-case.

So I either need to

-- generate a script to re-create the views that works,

-- or parse my script, recognize all the relation names, force them to
proper casing, and wrap in quotes, so I can drop and regenerate the views
properly,

-- or alter the definition of the column defaults inplace
in pg_catalog.pg_attrdef.

Altering the defaults seems safe because the default value shouldn't affect
the view at all.

Thanks for any suggestions,
Brad.


Re: Updating column default values in code

2023-01-06 Thread Ken Tanzer
On Fri, Jan 6, 2023 at 3:32 PM Brad White  wrote:

> I can generate ALTER statements, per David's sensible suggestion, but they
> fail because you have to drop all the views.
>


> Altering the defaults seems safe because the default value
> shouldn't affect the view at all.
>

Are you sure those alter statements will fail?  I do that frequently.  Plus
I would expect it to work because of your second statement.

Here's an example, running on 9.6:

CREATE TABLE foo (f1 integer DEFAULT 1, f2 integer);
CREATE VIEW foo_view AS SELECT * FROM foo;
ALTER TABLE foo ALTER COLUMN f1 SET DEFAULT 3;
ALTER TABLE foo ALTER COLUMN f2 SET DEFAULT 2;

agency=> BEGIN;
BEGIN
agency=> CREATE TABLE foo (f1 integer DEFAULT 1, f2 integer);
CREATE TABLE
agency=> CREATE VIEW foo_view AS SELECT * FROM foo;
CREATE VIEW
agency=> ALTER TABLE foo ALTER COLUMN f1 SET DEFAULT 3;
ALTER TABLE
agency=> ALTER TABLE foo ALTER COLUMN f2 SET DEFAULT 2;
ALTER TABLE

Cheers,
Ken

-- 
AGENCY Software
A Free Software data system
By and for non-profits
http://agency-software.org/
https://demo.agency-software.org/client
ken.tan...@agency-software.org
(253) 245-3801

Subscribe to the mailing list to learn more about AGENCY or follow the
discussion.


Re: Regd. the Implementation of Wallet (in Oracle) config equivalent in postgreSQL whilst the database migration

2023-01-06 Thread Bruce Momjian
On Thu, Dec 22, 2022 at 11:15:57AM +0100, Rainer Duffner wrote:
> I wasn’t involved in setting it up here, but AFAIK you need to „enroll“ the
> client to the HSM.
> 
> That is a one-time process that requires HSM credentials (via certificates and
> pass-phrases).
> 
> Then, that client can talk to the HSM. 
> 
> The HSM-client is (or should be) engineered in such a way that you can’t
> extract the encryption-secret easily.
> 
> I am not sure, but IIRC, you should not even be able to clone the VM without
> the HSM noticing or the clone not working at all to begin with (for lack of
> enrollment). Though most production databases are too large to just „clone“.
> 
> Maybe someone who knows more about this subject can chime in before I make a
> fool of myself?

This wiki should help:

https://wiki.postgresql.org/wiki/Transparent_Data_Encryption

Also, the first two patches in this email are doc patches which explain
the benefits:

https://www.postgresql.org/message-id/20210625212204.ga7...@momjian.us

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

Embrace your flaws.  They make you human, rather than perfect,
which you will never be.




Re: Updating column default values in code

2023-01-06 Thread Brad White

On 1/6/2023 7:44 PM, Ken Tanzer wrote:



On Fri, Jan 6, 2023 at 3:32 PM Brad White  wrote:

I can generate ALTER statements, per David's sensible suggestion,
but they fail because you have to drop all the views.


Altering the defaults seems safe because the default value
shouldn't affect the view at all.


Are you sure those alter statements will fail?  I do that frequently.  
Plus I would expect it to work because of your second statement.


Here's an example, running on 9.6:

CREATE TABLE foo (f1 integer DEFAULT 1, f2 integer);
CREATE VIEW foo_view AS SELECT * FROM foo;
ALTER TABLE foo ALTER COLUMN f1 SET DEFAULT 3;
ALTER TABLE foo ALTER COLUMN f2 SET DEFAULT 2;

agency=> BEGIN;
BEGIN
agency=> CREATE TABLE foo (f1 integer DEFAULT 1, f2 integer);
CREATE TABLE
agency=> CREATE VIEW foo_view AS SELECT * FROM foo;
CREATE VIEW
agency=> ALTER TABLE foo ALTER COLUMN f1 SET DEFAULT 3;
ALTER TABLE
agency=> ALTER TABLE foo ALTER COLUMN f2 SET DEFAULT 2;
ALTER TABLE

Cheers,
Ken

--


That's a good point.

It was failing when I was trying to change the field type, before I 
stumbled on the real issue of the default values.

I realize now that I assumed it would also fail for defaults and didn't try 
updating just the defaults.

Thanks,
Brad.

--
Quote Signature I talk with clients, find out where their pain points 
are, and solve those.

On-call IT Management for small companies and non-profits.
SCP, Inc.
bwh...@inebraska.com
402-601-7990


Quote of the Day
   There is a huge difference between fathering a child and being a 
father.

   One produces a child. The other produces an adult.
    -- John Eldredge

Re: Updating column default values in code

2023-01-06 Thread Tom Lane
Brad White  writes:
> On Wednesday, December 28, 2022, Brad White  wrote:
>> On timestamp fields, I need to update the column default from the current
>> "Now()" to "LOCALTIMESTAMP(0)"

> I'm still suffering here.

I don't understand why a script to generate
ALTER TABLE t ALTER COLUMN c SET DEFAULT LOCALTIMESTAMP(0)
commands doesn't do what you need?

regards, tom lane




RE: Purging few months old data and vacuuming in production

2023-01-06 Thread Ranjith Paliyath
Thank you very much again.


> So, with this approach, is the advantage like, manual 
vacuuming worry may be set aside, because auto-vacuuming would deal with the 
dead rows?

> Theoretically, manual vacuuming is never necessary.  I'd occasionally 
do manual vacuums (after purging a couple of weeks of data, for example).

> Disable autovacuum on a table, vacuum it, then reenable autovacuum.
>
> ALTER TABLE table_name SET (autovacuum_enabled = false);
> VACUUM table_name;
> ALTER TABLE table_name SET (autovacuum_enabled = true);

Ok. For the record-by-record delete approach, autovacuuming could be the 
natural option.

> This is because the deletion step is executed record by 
record in main table, with its connected record(s) delete executions in rest of 
tables? 

> I don't know if you have ON DELETE CASCADE.  Even if you do, you'll 
have to manually delete the tables not linked by FK.  I'd write a PL/pgSQL 
procedure: pass in a PK and then delete records from the 9 tables in the proper 
order so as to not throw FK constraint errors. 

Ok, in the case of our specific 9 tables it would mean finding and deleting 
linked records in 8 tables based on the record chosen in the main table - 
that is, deleting records one by one.


> Due to the infra capability that is there in this instance,

> What is "infra capability"?

You had a query about how beefy the hardware is - I was trying to refer to 
the hardware capability.

> the impact could be almost none!!??

> It'll use some resources, because it's a thread deleting records, but 
most of the records and index nodes won't be where new records are being 
inserted.

> Note, though, that this will generate a lot of WAL records.

Ok, thanks. 
We were weighing the pros and cons of the table partitioning approach, but 
your input on your experience with the partitioned approach is something 
we'll very much need to consider. We'll try the per-record delete approach 
once and see how it affects the DB load with the present WAL setting.

Thank you...