Re: Slow query, possibly not using index

2023-08-28 Thread Les
> All right, I started pgstattuple() and I'll also do pgstatindex(), but it
> takes a while. I'll get back with the results.

=# select * from pgstattuple('media.block');
-[ RECORD 1 ]------+-------------
table_len          | 372521984000
tuple_count        | 39652836
tuple_len          | 299148572428
tuple_percent      | 80.3
dead_tuple_count   | 3578319
dead_tuple_len     | 26977942540
dead_tuple_percent | 7.24
free_space         | 44638265312
free_percent       | 11.98

=# select * from pgstatindex('media.idx_block_unused');
-[ RECORD 1 ]------+----------
version            | 4
tree_level         | 2
index_size         | 389677056
root_block_no      | 546
internal_pages     | 114
leaf_pages         | 23069
empty_pages        | 0
deleted_pages      | 24384
avg_leaf_density   | 90.03
leaf_fragmentation | 0

As far as I understand these numbers, the media.block table itself is in
good shape, but the index is not. Should I vacuum the whole table, or would
it be better to REINDEX INDEX media.idx_block_unused CONCURRENTLY?
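
For reference, a sketch of the two options as plain statements (illustrative only, not a recommendation):

REINDEX INDEX CONCURRENTLY media.idx_block_unused;  -- rebuilds only this index, without blocking writes
VACUUM (VERBOSE) media.block;  -- removes dead tuples and their index entries, but does not rebuild the index structure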

A more important question: how can I find out why the index was not
autovacuumed?

Thank you,

   Laszlo


Re: Slow query, possibly not using index

2023-08-28 Thread Les
> =# select * from pgstatindex('media.idx_block_unused');
> -[ RECORD 1 ]------+----------
> version            | 4
> tree_level         | 2
> index_size         | 389677056
> root_block_no      | 546
> internal_pages     | 114
> leaf_pages         | 23069
> empty_pages        | 0
> deleted_pages      | 24384
> avg_leaf_density   | 90.03
> leaf_fragmentation | 0

After reindex:

=# select * from pgstatindex('media.idx_block_unused');
-[ RECORD 1 ]------+-----
version            | 4
tree_level         | 0
index_size         | 8192
root_block_no      | 0
internal_pages     | 0
leaf_pages         | 0
empty_pages        | 0
deleted_pages      | 0
avg_leaf_density   | NaN
leaf_fragmentation | NaN

explain analyze select id from media.block b where nrefs = 0 limit 1

                                                              QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.14..0.46 rows=1 width=16) (actual time=0.010..0.011 rows=0 loops=1)
   ->  Index Only Scan using idx_block_unused on block b  (cost=0.14..698.91 rows=2231 width=16) (actual time=0.008..0.009 rows=0 loops=1)
         Heap Fetches: 0
 Planning Time: 0.174 ms
 Execution Time: 0.030 ms

It is actually empty.

Now I only need to figure out why autovacuum did not work on the index.

Thank you

Laszlo


Re: Slow query, possibly not using index

2023-08-28 Thread Pavel Stehule
On Mon, 28 Aug 2023 at 13:00, Les  wrote:

> It is actually empty.
>
> Now I only need to figure out why autovacuum did not work on the index.

Autovacuum doesn't reindex.

Regards

Pavel


>
> Thank you
>
> Laszlo
>
>


Re: Slow query, possibly not using index

2023-08-28 Thread David Rowley
On Mon, 28 Aug 2023 at 19:21, Les  wrote:
> A more important question: how can I find out why the index was not
> autovacuumed?

You should have a look at pg_stat_user_tables.  It'll let you know if
the table is being autovacuumed and how often.  If you're concerned
about autovacuum not running properly, then you might want to lower
log_autovacuum_min_duration.  Generally, anything that takes a
conflicting lock will cause autovacuum to cancel so that the
conflicting locker can get through.  Things like ALTER TABLE or even
an ANALYZE running will cancel most autovacuum runs on tables.
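
For example (0 means log every autovacuum run, so cancellations and
durations show up in the server log; illustrative only):

ALTER SYSTEM SET log_autovacuum_min_duration = 0;
SELECT pg_reload_conf();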

Also, this is a fairly large table and you do have the standard
autovacuum settings.  Going by pgstattuple, the table has 39652836
tuples.  Autovacuum will trigger when the statistics indicate that 20%
of tuples are dead, which is about 8 million tuples. Perhaps that's
enough for the index scan to have to skip over a large enough number
of dead tuples to make it slow. You might want to consider lowering
the autovacuum scale factor for this table.
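
A per-table override along these lines is one way to do that (the numbers
are only an example):

ALTER TABLE media.block SET (
  autovacuum_vacuum_scale_factor = 0.01,
  autovacuum_vacuum_threshold = 10000
);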

Also, ensure you're not doing anything like calling pg_stat_reset();

It might be worth showing us the output of:

select * from pg_stat_user_tables where relid = 'media.block'::regclass;

David




Re: Slow query, possibly not using index

2023-08-28 Thread Les
> It might be worth showing us the output of:
>
> select * from pg_stat_user_tables where relid = 'media.block'::regclass;
Thank you for your suggestion; this is really very helpful.

select * from pg_stat_user_tables where relid = 'media.block'::regclass;



Name   |Value|
---+-+
relid  |25872|
schemaname |media|
relname|block|
seq_scan   |8|
seq_tup_read   |139018370|
idx_scan   |45023556 |
idx_tup_fetch  |37461539 |
n_tup_ins  |7556051  |
n_tup_upd  |7577720  |
n_tup_del  |0|
n_tup_hot_upd  |0|
n_live_tup |39782042 |
n_dead_tup |5938057  |
n_mod_since_analyze|1653427  |
n_ins_since_vacuum |5736676  |
last_vacuum| |
last_autovacuum|2023-08-17 22:39:29.383 +0200|
last_analyze   | |
last_autoanalyze   |2023-08-22 16:02:56.093 +0200|
vacuum_count   |0|
autovacuum_count   |1|
analyze_count  |0|
autoanalyze_count  |4|
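
For what it's worth, plugging these numbers into the defaults David
mentioned (autovacuum_vacuum_threshold = 50, autovacuum_vacuum_scale_factor
= 0.2, and using n_live_tup as an approximation of reltuples) gives a
trigger point of about 50 + 0.2 * 39,782,042 ≈ 7.96 million dead tuples,
while n_dead_tup is only 5,938,057, so another autovacuum run is not yet
due.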

Regards,

Laszlo


Index bloat and REINDEX/VACUUM optimization for partial index

2023-08-28 Thread jayaprabhakar k
Hi,

TL;DR:
Observations:

1. REINDEX requires a full table scan.
   - Roughly: create a new index, rename it, drop the old index.
   - REINDEX is not incremental; running it frequently does not reduce the
     time of future reindexes.
2. REINDEX does not use the index itself.
3. VACUUM does not clean up the indexes (relpages >> reltuples). I
   understand vacuum is supposed to remove a page only if it contains no
   live tuples, but even immediately after a vacuum I see relpages
   significantly greater than reltuples, where I would have assumed
   relpages <= reltuples.
4. The query planner does not consider index bloat, so it uses a highly
   bloated partial index that is terribly slow instead of another index.

Question: Is there a way to optimize PostgreSQL vacuum/reindex when using
partial indexes?

We have a large table (tasks) that keeps track of all the tasks that are
created and their statuses. Around 1.4 million tasks are created per day
(~15 inserts per second).

One of the columns is an int `status` that can be one of (1 - Init, 2 -
InProgress, 3 - Success, 4 - Aborted, 5 - Failure). (There are actually
more statuses, but this gives the idea.)

On average, a task completes in around a minute with some outliers that can
go as long as a few weeks. There is a periodic heartbeat that updates the
last updated time in the table.

At any moment, there are around 1000-1500 tasks in pending statuses (Init
+ InProgress) out of around 500 million tasks.

Now, we have a task monitoring query that will look for all pending tasks
that have not received any update in the last n minutes.

```
SELECT [columns list]
  FROM tasks
 WHERE status NOT IN (3, 4, 5)
   AND created > NOW() - INTERVAL '30 days'
   AND updated < NOW() - INTERVAL '30 minutes'
```

Since we are only interested in the pending tasks, I created a partial
index: `"tasks_pending_status_created_type_idx" btree (status, created,
task_type) WHERE status <> ALL (ARRAY[3, 4, 5])`.
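
Spelled out as DDL, that definition would look roughly like this:

```
CREATE INDEX tasks_pending_status_created_type_idx
    ON tasks (status, created, task_type)
    WHERE status <> ALL (ARRAY[3, 4, 5]);
```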

This worked great initially, but it started to bloat very quickly: every
task starts in a pending state, gets multiple updates (many of them not HOT
updates; I am working on optimizing the fill factor now), and is eventually
deleted from the index (as its status changes to Success).
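
A minimal way to keep an eye on that bloat, assuming the pgstattuple
extension is available and using the index name described above, is
something like:

```
CREATE EXTENSION IF NOT EXISTS pgstattuple;

SELECT index_size, leaf_pages, deleted_pages, avg_leaf_density
  FROM pgstatindex('tasks_pending_status_created_type_idx');
```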


```

\d+ tasks

                                          Table "public.tasks"
    Column     |           Type           | Nullable |              Default               | Storage
---------------+--------------------------+----------+------------------------------------+----------
 id            | bigint                   | not null | nextval('tasks_id_seq'::regclass)  | plain
 client_id     | bigint                   | not null |                                    | plain
 status        | integer                  | not null |                                    | plain
 description   | character varying(128)   | not null |                                    | extended
 current_count | bigint                   | not null |                                    | plain
 target_count  | bigint                   | not null |                                    | plain
 status_msg    | character varying(4096)  |          |                                    | extended
 blob_key      | bigint                   |          |                                    | plain
 created       | timestamp with time zone | not null |                                    | plain
 updated       | timestamp with time zone | not null |                                    | plain
 idle_time     | integer                  | not null | 0                                  | plain
 started       | timestamp with time zone |          |                                    | plain
Indexes:
    "tasks_pkey" PRIMARY KEY, btree (id)
    "tasks_created_idx" btree (created)
    "tasks_pending_status_created_idx" btree (status, created) WHERE status <> ALL (ARRAY[3, 4, 5])
    "tasks_client_id_status_created_idx" btree (client_id, status, created DESC)
    "tasks_status_idx" btree (status)
Access method: heap
Options: autovacuum_vacuum_scale_factor=0.02, autovacuum_analyze_scale_factor=0.02, fillfactor=70
```

Immediately aft

Re: Index bloat and REINDEX/VACUUM optimization for partial index

2023-08-28 Thread Peter Geoghegan
On Mon, Aug 28, 2023 at 5:33 PM jayaprabhakar k  wrote:
> REINDEX requires a full table scan
>
> Roughly create a new index, rename index, drop old index.
> REINDEX is not incremental. running reindex frequently does not reduce the 
> future reindex time.

You didn't say which Postgres version you're on. Note that Postgres 14
can deal with index bloat a lot better than earlier versions could.
This is known to work well with partial indexes. See:

https://www.postgresql.org/message-id/flat/CAL9smLAjt9mZC2%3DqBeJwuNPq7KMAYGTWWQw_hvA-Lfo0b3ycow%40mail.gmail.com

-- 
Peter Geoghegan