Re: Queries containing ORDER BY and LIMIT started to work slowly

2023-09-01 Thread Rondat Flyag
Hi David,

Thank you so much for your help. The problem was the dropped asins_statistics(asin_id) index. I had created it, but it was somehow dropped during the dump. I created it again and everything works fine now. Thank you again.

P.S. There are two similar terms: ASIN and ISBN. I use ASIN in my tables, but ISBN is better known, so I changed ASIN to ISBN in the text files but forgot to replace it the last time. This is why the names didn't correspond.

Cheers,
Serg

31.08.2023, 00:43, "David Rowley":
> On Thu, 31 Aug 2023 at 06:32, Rondat Flyag wrote:
>> I tried VACUUM ANALYZE for three tables, but without success. I also
>> tried to set enable_seqscan=off and the query took even more time. If I
>> set enable_sort=off then the query takes a lot of time and I cancel it.
>> Please see the attached query plans.
>
> It's a little hard to comment here as I don't see what the plan was
> before, when you were happy with the performance. I also see that the
> queries you mentioned in the initial email don't match the plans.
> There's no table called "isbns" in the query. I guess this is "asins"?
>
> Likely you could get a faster plan if there were an index on
> asins_statistics (asin_id). That would allow a query plan that scans
> the isbns_value_key index and performs a parameterised nested loop on
> asins_statistics using the asins_statistics (asin_id) index. Looking
> at your schema, I don't see that index, so it's pretty hard to guess
> why the plan used to be faster. Even if the books/asins merge join
> used to take place first, there'd have been no efficient way to join
> to the asins_statistics table and preserve the Merge Join's order (I'm
> assuming non-parameterised nested loops would be inefficient in this
> case). Doing that would have also required the asins_statistics
> (asin_id) index. Are you sure that index wasn't dropped?
>
> However, it's likely a waste of time to try to figure out what the
> plan used to be. Better to focus on trying to make it faster. I
> suggest you create the asins_statistics (asin_id) index. However, I
> can't say with any level of confidence that the planner would opt to
> use that index if it did exist. Lowering random_page_cost or
> increasing effective_cache_size would increase the chances of that.
>
> David
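David's suggestions can be sketched as plain SQL. The table and column names come from the thread; the index name and the cost settings are illustrative choices of mine, not values from the discussion:

```sql
-- Recreate the index the plan needs (the index name is hypothetical).
-- CONCURRENTLY avoids blocking writes to the table while it builds.
CREATE INDEX CONCURRENTLY asins_statistics_asin_id_idx
    ON asins_statistics (asin_id);

-- Nudge the planner toward index scans; try per-session first.
-- The values below are illustrative, not recommendations.
SET random_page_cost = 1.1;        -- default is 4.0; lower suits SSDs
SET effective_cache_size = '8GB';  -- roughly the RAM available for caching

-- Then re-check the plan with:
--   EXPLAIN (ANALYZE, BUFFERS) <the slow ORDER BY ... LIMIT query>;
```

If the lowered random_page_cost helps, it can be made permanent with ALTER SYSTEM or in postgresql.conf.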

Re: Queries containing ORDER BY and LIMIT started to work slowly

2023-09-01 Thread Rondat Flyag
Hello Jeff,

Thank you too for your efforts and help. The problem was the dropped index on asins_statistics(asin_id). It existed, but I suppose it was dropped during the dump. I created it again and everything is fine now.

Cheers,
Serg

31.08.2023, 19:52, "Jeff Janes":
> On Wed, Aug 30, 2023 at 1:31 PM Rondat Flyag wrote:
>> Hi and thank you for the response. I tried VACUUM ANALYZE for three
>> tables, but without success. I also tried to set enable_seqscan=off
>> and the query took even more time. If I set enable_sort=off then the
>> query takes a lot of time and I cancel it.
>
> Maybe you could restore (to a temp server, not the production) a
> physical backup taken from before the change happened, and get an old
> plan that way. I'm guessing that somehow an index got dropped around
> the same time you took the dump. That might be a lot of work, and
> maybe it would just be easier to optimize the current query while
> ignoring the past. But you seem to be interested in a root-cause
> analysis, and I don't see any other way to do one of those.
>
> What I would expect to be the winning plan would be something
> sort-free like:
>
> Limit
>   merge join
>     index scan yielding books in asin order (already being done)
>     nested loop
>       index scan yielding asins in value order
>       index scan probing asins_statistics driven by
>         asins_statistics.asin_id = asins.id
>
> Or possibly a 2nd nested loop rather than the merge join just below
> the limit, but with the rest the same.
>
> In addition to the "books" index already evident in your current plan,
> you would also need an index leading with asins_statistics.asin_id,
> and one leading with asins.value. But if all those indexes existed, it
> is hard to see why setting enable_seqscan=off wouldn't have forced
> them to be used.
>
> Cheers, Jeff
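Before rebuilding anything, it is worth confirming which of the indexes Jeff describes actually exist. A sketch using the standard pg_indexes catalog view; the table names come from the thread, the index name in the CREATE statement is hypothetical:

```sql
-- List every index on the three tables involved in the join.
SELECT tablename, indexname, indexdef
FROM pg_indexes
WHERE tablename IN ('books', 'asins', 'asins_statistics')
ORDER BY tablename, indexname;

-- If no index leads with asin_id, recreate one without blocking writes:
CREATE INDEX CONCURRENTLY asins_statistics_asin_id_idx
    ON asins_statistics (asin_id);
```

Comparing this listing against a pre-dump schema dump (pg_dump --schema-only) is a quick way to spot an index lost during a dump/restore.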

Re: Index bloat and REINDEX/VACUUM optimization for partial index

2023-09-01 Thread Maxim Boguk
>
> But anyway, PostgreSQL has features to prevent the index bloat from
> becoming too severe of a problem, and you should figure out why they are
> not working for you.  The most common ones I know of are 1) long open
> snapshots preventing clean up, 2) all index scans being bitmap index scans,
> which don't do micro-vacuuming/index hinting the way ordinary btree
> index scans do, and 3) running the queries on a hot-standby, where index
> hint bits must be ignored.  If you could identify and solve this issue,
> then you wouldn't need to twist yourself into knots avoiding non-HOT
> updates.
>

I am not sure that kill bits can be a complete fix for indexes with tens
of millions of dead entries and only a handful of live ones. As I
understand the mechanics of kill bits, they help to avoid excessive heap
visibility checks for dead tuples, but tuples with the kill bit set still
have to be read from the index first. And with many millions of dead
entries, that isn't free.

PS: ignoring kill bits on hot standby replicas is a source of endless
pain in many cases.

--
Maxim Boguk
Senior Postgresql DBA

Phone UA: +380 99 143 
Phone AU: +61  45 218 5678