e noticed the effects of the problem.
That sounds plausible.
--
Peter Geoghegan
es with a generic table name
alias?
--
Peter Geoghegan
NFLICT DO UPDATE.
--
Peter Geoghegan
tually make any promises about not getting unique
violations. Only ON CONFLICT DO UPDATE (and ON CONFLICT DO NOTHING)
make such a promise. That's the main reason why Postgres supports
both.
--
Peter Geoghegan
ind by fillfactor in each case. In
general page splits tend to come in distinct "waves" after CREATE
INDEX is run.
--
Peter Geoghegan
still 15.7) on Ubuntu 22.04 LTS.
That shouldn't matter, as far as this theory of mine is concerned.
Anything after 12 could be affected by the issue I'm thinking of.
--
Peter Geoghegan
sistently:
https://www.youtube.com/watch?v=p5RaATILoiE&t=2079s
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=f21668f3
Of course, this is only a guess. I vaguely recall a complaint that
sounded vaguely like yours, also involving partitioning.
--
Peter Geoghegan
On Mon, Sep 2, 2024 at 4:58 PM Peter Geoghegan wrote:
> On Mon, Sep 2, 2024 at 4:35 PM Pavel Luzanov wrote:
> > If it helps, without creating index on id column, the numbers will be
> > much closer:
>
> Yes, avoiding all index vacuuming seems useful. It makes the test cas
're writing extra FPIs to set hint
bits. But that explanation only works if you assume that page-level
checksums are in use (or that wal_log_hints is turned on).
--
Peter Geoghegan
ally cause an increase in the number of WAL records written?
I'd have thought that that was simply impossible.
--
Peter Geoghegan
inated in
pages that only contained existing LP_UNUSED items when scanned by
VACUUM?
--
Peter Geoghegan
ed to WAL, buffers, and
CPU time that changed.
Perhaps I'm not thinking of something obvious. Maybe it's extra
VISIBILITY records? But I'd expect the number of VISIBILITY records to
match the number of pages frozen, given these particulars. VACUUM
VERBOSE at least shows that that hasn't changed.
--
Peter Geoghegan
s
> and the total size of the WAL. Instead, WAL numbers have significantly
> degraded.
>
> What am I doing wrong?
That does seem weird.
CC'ing the authors of the relevant VACUUM enhancements.
--
Peter Geoghegan
On Thu, Sep 7, 2023 at 3:48 AM David Rowley wrote:
> On Thu, 7 Sept 2023 at 19:17, Peter Geoghegan wrote:
> > It seems likely that the problem here is that some of the predicates
> > appear as so-called "Filter:" conditions, as opposed to true index
> > quals.
>
1)
Index Cond: (a = 1)
Filter: (b <> 42)
Rows Removed by Filter: 1
Heap Fetches: 0
Buffers: shared hit=11
Planning Time: 0.076 ms
Execution Time: 3.204 ms
(8 rows)
(There are lots of index tuples matching (a, b) = "(1, NULL)" here, you see.)
--
Peter Geoghegan
ing is something that theoretically can be avoided by the
implementation (if there were deletes involved then that wouldn't
apply, but there aren't).
--
Peter Geoghegan
7;t necessarily going to
stop autovacuum/autoanalyze from doing pending list cleanup.
The ANALYZE pending list cleanup path has some problems:
https://postgr.es/m/cah2-wzkjrk556envtflmyxedw91xguwiyzvep2kp5yqt_-3...@mail.gmail.com
--
Peter Geoghegan
PG15 in theory can be used to do UPSERT
> properly that is void of the aforementioned limitation.
> The downside is it is rather verbose.
The MERGE command has various race conditions that are particularly
relevant to UPSERT type use cases. See the wiki page you referenced
for a huge amoun
x27;d have frozen that
one tuple and then set the page all-visible. The page would likely be
frozen again by the next aggressive VACUUM, which is usually much more
expensive.
--
Peter Geoghegan
at version). And if you were on 14+, things in
this area would be much better still.
--
Peter Geoghegan
Can you run amcheck's bt_index_check() routine against some of the
indexes you've shown? There is perhaps some chance that index
corruption exists and causes VACUUM to take a very long time to delete
index pages. This is pretty much a wild guess, though.
--
Peter Geoghegan
ust about possible with an access method like GIN. Do you have
any non-btree indexes on the table? Can you show us the details of the
table, including all of its indexes? In other words, can you show "\d
applications" output from psql?
--
Peter Geoghegan
interface for
running amcheck routine, so maybe look into that once you upgrade.
--
Peter Geoghegan
look into, in particular anything I
> should check before upgrading the remaining 13.8 DB to 13.9?
I recommend running amcheck on all indexes, or at least all
possibly-affected text indexes.
--
Peter Geoghegan
relative ordering of each script
can be changed this way. There is also something called merged
tailorings.
The OP should see the Postgres ICU docs for hints on how to use these
facilities to make a custom collation that matches whatever their
requirements are:
https://www.postgresql.org/docs/
. This is a highly contrived example.
FWIW I think that it could be a lot less bad, even with indexes that
you'd think would be almost as bad as the bad one from your test case.
Even things that appear to be random aren't usually nearly as random
as what you've shown.
--
Peter Geoghegan
usual to have a perfectly
random and uniformly distributed clustering of index tuples, which is
what the index built via hashing exhibits. Even a v4 UUID index could
easily have plenty of duplicates, which would probably do
significantly better on the metric you've focussed on.
--
Peter Geoghegan
mud, too. What does "inferred" mean here? I think it means "chosen as
> arbiter index", but maybe I misunderstand.)
Unique index/constraint inference is the process by which we choose an
arbiter index. See the second paragraph of the "ON CONFLICT Clause"
section of the INSERT docs.
--
Peter Geoghegan
interlock at all in the case of
nbtree index scans with MVCC snapshots -- but *not* with index-only
scans. See "Making concurrent TID recycling safe" in the nbtree
README. I only got around to documenting all of the details here quite
recently. The index-only scan thing dates back to 9.5.
--
Peter Geoghegan
On Tue, Oct 11, 2022 at 10:04 PM Tom Lane wrote:
> Do we propagate visibility-map bits to standbys?
Yes.
--
Peter Geoghegan
ll be
set to false on the standby.
--
Peter Geoghegan
more sense for *your* workload, where huge differences in the rate of
MultiXact consumption among tables is likely the norm. This still
isn't perfect (far from it), but it has the potential to make things
far better here.
--
Peter Geoghegan
ion is in dire need of an
overhaul. :-(
--
Peter Geoghegan
vacuum, but that's not really true (apart from VACUUM FULL, which
really is quite different). The difference between aggressive and
non-aggressive can be big in practice due to an accumulation of
unfrozen pages over multiple non-aggressive vacuums.
--
Peter Geoghegan
nxid) reached vacuum_freeze_table_age).
See my recent response to a similar question here:
https://postgr.es/m/CAH2-WzkFQ-okvVXizpy4dCEVq75N-Qykh=crhzao-eajflv...@mail.gmail.com
--
Peter Geoghegan
On Mon, Apr 18, 2022 at 11:37 PM bhargav kamineni wrote:
> executing the vacuum on the entire cluster is also giving the same HINTS and
> WARNING's
You're using Aurora, not PostgreSQL. Perhaps this is actually a bug,
but there is no way for anybody here to know.
--
Peter Geoghegan
g because of how they allow an
organization to deploy a program in a production environment, complete
with version control? Does it have something to do with decoupling the
mutable business data stored in tables from the programs contained/run
in the same database?
--
Peter Geoghegan
communicated with him online, and
I've worked on Postgres more or less full time for a full decade now.
As far as I'm aware he hasn't ever publicly posting to any of the
mailing lists.
--
Peter Geoghegan
hen how can you expect it for yourself?
--
Peter Geoghegan
CONFLICT. And
because the only alternative interpretation is that Alex intends to
update those columns using their current values (not new values),
which won't really change anything -- that seems unlikely to have been
the intent.
--
Peter Geoghegan
do just pass a string some other way), but it's
definitely considered the way to go.
Note that you can do stuff like this from SQL:
pg@regression:5432 [2999218]=# select 'pg_class'::regclass::oid;
oid
───
1,259
(1 row)
--
Peter Geoghegan
e:
https://www.postgresql.org/message-id/flat/CAL9smLC%3DSxYiN7yZ4HDyk0RnZyXoP2vaHD-Vg1JskOEHyhMXug%40mail.gmail.com#e79eca5922789de828314e296fdcb82d
--
Peter Geoghegan
On Mon, Sep 6, 2021 at 7:52 AM Daniel Westermann (DWE)
wrote:
> >Try running vacuum with index cleanup = on.
>
> Thank you, Peter
Thanks for testing!
--
Peter Geoghegan
eds to do work that is proportional to
the number of unset-in-vm heap pages.
I believe that that trade-off makes a lot of sense. Autovacuum has
little chance of keeping anything like 100% of all pages set in the VM
anyway. But it can get a lot closer to it in some cases now.
--
Peter Geoghegan
ing alone, the number of vacuum operations that do
unnecessary index vacuuming doesn't increase at all (except perhaps
due to rounding effects).
--
Peter Geoghegan
It's a non-hot update, and so there is a single dead index tuple. You're
seeing the new optimization that makes vacuum skip indexes in marginal
cases.
Try running vacuum with index cleanup = on.
Peter Geoghegan
(Sent from my phone)
ll block
write DML. You might prefer to just use the first query if this is
running in a production environment.
--
Peter Geoghegan
dexes won't grow at all due to
garbage index tuples, especially on Postgres 14.
--
Peter Geoghegan
ly rare in
practice.
--
Peter Geoghegan
ave lots of duplicates, which did
perform rather badly prior to Postgres 12. I bet that you'd benefit
from upgrading to Postgres 12, or especially to Postgres 13. The
B-Tree space management is a lot better now. (Actually, it'll be
better again in Postgres 14.)
--
Peter Geoghegan
On Thu, Jan 21, 2021 at 12:55 PM Tommy Li wrote:
> Is there any way to configure autovacuum to log the same information as
> VACUUM VERBOSE?
No. Though there really should be.
--
Peter Geoghegan
. That will make the table larger initially, but leaving enough
space behind on the same heap pages for successor tuples makes it
possible to use HOT updates.
--
Peter Geoghegan
test. And it might defeat certain future optimizations based on heap
TID being the only tiebreaker. Having two types of equality might have
to bleed into the optimizer.
It's a question of engineering trade-offs. I don't think that it's worth it.
--
Peter Geoghegan
=
operator will be broken in the presence of different display scales.
It's a non-starter.
The numeric2 design that I sketched is a bit ugly, but I can see no
better way. A three-way posting list split (i.e. the other design that
you sketched) is a special case that is very hard to test, very
complicated, and of little value in the grand scheme of things.
--
Peter Geoghegan
split like this then
you need three copies of the key -- the original, the new, and a
second copy of the original. That's much more complicated.
--
Peter Geoghegan
On Tue, Aug 18, 2020 at 9:44 AM Peter Geoghegan wrote:
> If we wanted to fix this for numeric, we'd have to invent a new
> numeric datatype (called numeric2, say). That probably isn't as hard
> as it sounds, since it could be part of the same B-Tree operator
> family as num
anted to fix this for numeric, we'd have to invent a new
numeric datatype (called numeric2, say). That probably isn't as hard
as it sounds, since it could be part of the same B-Tree operator
family as numeric. It could also be implicitly cast to numeric.
--
Peter Geoghegan
is approach to dumping out a single page image
easier, since it doesn't involve relfilenodes or filesystem files:
https://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD#contrib.2Fpageinspect_page_dump
--
Peter Geoghegan
rites didn't restored the page to the state before the
> index-item deletion happened(that is, if full_page_writes were set to
> off.). (If it found to be the cause, I'm not sure why that didn't
> happen on 9.5.)
There is also a Heap/HOT_UPDATE log line with similar errors.
--
Peter Geoghegan
segments that become ready to be archived during crash recovery
were potentially recycled without being archived."
Sorry that you were affected by this bug -- it really sucks.
--
Peter Geoghegan
processes in a free of any other load system.
> It is postgres 11.7
Try increasing maintenance_work_mem from the default of 64MB. MWM
constrains the number of parallel workers used.
--
Peter Geoghegan
ss and until
you REINDEX. This includes cases where you're running pg_upgrade
against a restored physical backup.
--
Peter Geoghegan
4:
regression=# create extension pageinspect;
CREATE EXTENSION
regression=# select version from bt_metap('pg_aggregate_fnoid_index');
version
-
4
(1 row)
--
Peter Geoghegan
have that for the ICU
collations, but it currently isn't possible to use ICU as the default
collation provider. You really have to go out of your way to use ICU
collations.
--
Peter Geoghegan
ck('my_index', true);
If that doesn't show any errors, then perhaps try this:
SELECT bt_index_parent_check('my_index', true);
If you're on Postgres 10, then you should leave out the second
argument, "true", since that version doesn't have the extra
heapallindexed check.
Let us know what you see.
--
Peter Geoghegan
ry here, modified to verify all B-Tree indexes (not
just those indexes in the pg_catalog schema):
https://www.postgresql.org/docs/10/amcheck.html
--
Peter Geoghegan
sort code received many improvements over the years, really
starting in 9.5, and continuing in 9.6, 10 and 11. FWIW, I think that
that was probably the biggest factor here. Though parallel CREATE
INDEX will have helped as well.
--
Peter Geoghegan
mit applies to a tuple *after*
TOAST compression has been applied.
--
Peter Geoghegan
earlier than pg93. So thank you for
> your hard work and dedication to this awesome piece of software.
How long did it take on 9.3?
I am the author of the parallel CREATE INDEX feature. It's good to get
feedback like this.
--
Peter Geoghegan
led almost
immediately. If it happened with any regularity, somebody would have
complained by now.
As Tom said, it's hard to give a useful answer without more context --
how you define "immediate"?
--
Peter Geoghegan
checked, restoring this database will take about 30GB of disk
space on top of the dump file itself.
--
Peter Geoghegan
to
corrupt B-Tree indexes (not Geometry):
https://trac.osgeo.org/postgis/ticket/3841
Though I also see what could be a comparable bug in Geometry:
https://trac.osgeo.org/postgis/ticket/3777
These bugs are from about 3 years ago. If I'm right you should be able
to isolate the bug using amc
On Wed, Jul 17, 2019 at 10:27 AM Peter Geoghegan wrote:
> > It's possible that amcheck would have given you an accurate diagnosis
> > of the problem -- especially if you used bt_index_parent_check():
> >
> > https://www.postgresql.org/docs/current/amcheck.htm
ng can break when somebody
creates an index on a non-immutable expression (including VACUUM),
provided that all the problems only affect the table with the broken
index. OTOH, taking down the entire Postgres cluster as an indirect
consequence of one person's ill-considered CREATE INDEX really sucks.
That distinction seems important to me.
--
Peter Geoghegan
ore aggressive than give up
when there is a "failed to re-find parent key" or similar condition.
Perhaps it would make more sense to make the index inactive (for some
value of "inactive") instead of just complaining. That might be the
least worst option, all things considered.
--
Peter Geoghegan
On Wed, Jul 17, 2019 at 9:21 AM Peter Geoghegan wrote:
> It's possible that amcheck would have given you an accurate diagnosis
> of the problem -- especially if you used bt_index_parent_check():
>
> https://www.postgresql.org/docs/current/amcheck.html
BTW, be sure to use th
a corrupt index) ultimately risks the availability of every
database in the cluster. Many installations receive little to no
supervision, so it may just be a matter of time there. That is
certainly a bad thing.
--
Peter Geoghegan
//www.postgresql.org/docs/current/amcheck.html
--
Peter Geoghegan
here,
so it seems worth comparing index size in detail.
--
Peter Geoghegan
On Tue, Jul 9, 2019 at 9:04 PM Peter Geoghegan wrote:
> ISTM that the simplest explanation here is that index fragmentation
> (and even index size) is a red herring, and the real issue is that
> you're suffering from problems similar to those that are described in
> these old t
ins the heap/table sort order
among duplicates by treating heap TID as a tiebreaker column, which
may make REINDEXing totally unnecessary for you. It's harder to model
this issue because the problem with heap TID order will only be seen
when there is at least a moderate amount of churn.
--
Peter Geoghegan
me existing index tuple, causing many
page splits despite there being no real change in the logical contents
of the index. Even then, the table will itself lose much of its
original order, so the index will become "unclustered" as it becomes
fragmented.
--
Peter Geoghegan
;fragmentation" isn't usually much of a problem when using Postgres.
--
Peter Geoghegan
r a REINDEX, then why bother at all? There is no reason
to think that that will be more effective than simple vacuuming.
--
Peter Geoghegan
On Mon, Jul 8, 2019 at 12:19 PM Peter Geoghegan wrote:
> Well, you're still running autovacuum very aggressively here. It'll
> easily keep up when run on a relatively small table such as this.
Also, an exactly equal number of insertions and deletions is rather
likely to result in
nt release of 9.4 -- not
9.4.6. You're missing years of bug fixes by sticking to such an old
point release, including some rather nasty ones -- 9.4.23 is the
current 9.4 point release. Actually, 9.4 is going to lose support this
year, as the oldest stable version that's currently supported by the
community.
--
Peter Geoghegan
beta2 run, but it sounds as though maybe it could be relevant to
> this kind of workload - Is that so?
You seem to be worried about keeping indexes as small as possible.
vacuum_cleanup_index_scale_factor won't help with that.
--
Peter Geoghegan
e?
https://wiki.postgresql.org/wiki/Generating_a_stack_trace_of_a_PostgreSQL_backend
Thanks
--
Peter Geoghegan
nce.
It's easy to show problems with very low cardinality indexes in the
old code. You'll definitely notice a difference there.
> Is there a pdf or text version?
Just the talk slides:
https://www.pgcon.org/2019/schedule/attachments/518_nbtree-arch-pgcon.pdf
--
Peter Geoghegan
ey're both causes of index bloat that
VACUUM cannot usually prevent.
--
Peter Geoghegan
which reports a avg_leaf_density for the index. Though I agree that
that's not likely to help matters. Apart from anything else, the
steady state of an index is embodied by more than just its
avg_leaf_density. Especially following the v12 enhancements to B-Tree
indexes.
--
Peter Geoghegan
an "immediate" mode shutdown is appropriate, then. That will
mean that the primary will need to go through crash recovery if and
when you bring it back up, though.
--
Peter Geoghegan
anything now, but I think that it's possible
that the page split thing will very effective. Perhaps even so
effective that it won't make much sense to vacuum global indexes just
because there is a pending dropped partition.
--
Peter Geoghegan
as hard as it sounds. You don't have to be Claude
Shannon to realize that it's kind of silly to reserve 16 bits for the
offset number component of a TID/ItemPointer. We need to continue to
support offset numbers that go that high, but the implementation would
optimize for the common case where offset numbers are less than 512
(or maybe less than 1024).
--
Peter Geoghegan
could make
that happen. Maybe this same representation could be used for all
nbtree indexes, not just global nbtree indexes.
--
Peter Geoghegan
bare minimum
amount of space is used for partition numbers. Maybe it won't matter
that much if partition numbers cannot be recycled due to this
asynchronous processing.
--
Peter Geoghegan
make a good case for
asynchronously cleaning up the dead entries that point to a dropped
partition (probably within VACUUM). Perhaps we should offer *both* as
options.
--
Peter Geoghegan
I'd be particularly concerned
about the complexity of VACUUM there, whereas that doesn't seem all
that bad in the case of global indexes.
--
Peter Geoghegan
lations for
global indexes, though maybe global indexes could focus our attention
on that problem.
--
Peter Geoghegan
hin GIN, where
it's impossible for the main entry tree to have duplicates without at
least storing them in a posting list.
--
Peter Geoghegan
1 - 100 of 155 matches
Mail list logo