Large pkey index on insert-only table

2023-06-26 Thread Devin Ivy
Hi all,
I have a suspiciously large index, and I could use a hand finding a root
cause for its size.  This index supports the primary key for a closure
table that models threaded comments with columns `(id, ancestor_id,
depth)`.  The primary key is composite: `(id, ancestor_id)`.  The id
columns are varchars which are a bit long for identifiers, around 70
bytes.  This table is insert-only: the application never performs updates
or deletes.

The table has grown to 200GB, and the unique index supporting the primary
key is nearly double that at around 360GB, which stood out to me as rather
large compared to the table itself.  The index uses the default fillfactor
of 90.  I would not anticipate very much bloat since updates and deletes
never occur on this table, and according to pg_stat_all_tables autovacuum
has been running regularly.  I've used the btree bloat estimator from
https://github.com/ioguix/pgsql-bloat-estimation, and it estimates the
bloat percentage at 47%.

Any thoughts on why this may be, or where to go next to continue tracking
this down?  Also, could the primary key column order `(id, ancestor_id)` vs
`(ancestor_id, id)` significantly affect the index size depending on the
column cardinalities?  I appreciate your time and input, thanks!

--
Devin Ivy


Effects of dropping a large table

2023-07-18 Thread Devin Ivy
Hi all,
I'm hoping to ensure I understand the implications of dropping a large
table and the space being reclaimed by the database and/or OS.  We're using
pg v14.

This table is quite large with a primary key and one additional index—all
together these are on the order of 1TB.  The primary key and index take-up
quite a bit more space than the table itself.  Our hope is to discontinue
use of this table and then eventually drop it.  However, the database is
under constant load and we'd like to avoid (or at least anticipate)
downtime or degraded performance.  The database also replicates to a
standby instance.

So in short, what can we expect if we drop this table?  Will the strategy
that pg takes to give that space back to the rest of the database and/or OS
have significant effects on availability or performance?  Finally, are
there any other considerations that we should take into account?  I
appreciate your time and input, thanks!

--
Devin Ivy