Postgres 15 SELECT query doesn't use index under RLS

2023-10-13 Thread Alexander Okulovich

Hello everyone!


Recently, we upgraded the AWS RDS instance from Postgres 12.14 to 15.4 
and noticed extremely high disk consumption on the following query 
execution:


select (exists (select 1 as "one" from "public"."indexed_commit" where 
"public"."indexed_commit"."repo_id" in (964992,964994,964999, ...);


For some reason, the query planner starts using Seq Scan instead of the 
index on the "repo_id" column when requesting under user limited with 
RLS. On prod, it happens when there are more than 316 IDs in the IN part 
of the query, on stage - 3. If we execute the request from Superuser, 
the planner always uses the "repo_id" index.


Luckily, we can easily reproduce this on our stage database (which is 
smaller). If we add a multicolumn "repo_id, tenant_id" index, the 
planner uses it (Index Only Scan) with any IN params count under RLS.


Could you please clarify if this is a Postgres bug or not? Should we 
include the "tenant_id" column in all our indexes to make them work 
under RLS?



 Postgres version / Operating system+version


PostgreSQL 15.4 on aarch64-unknown-linux-gnu, compiled by gcc (GCC) 
7.3.1 20180712 (Red Hat 7.3.1-6), 64-bit



 Full Table and Index Schema

\d indexed_commit
    Table "public.indexed_commit"
    Column |    Type | Collation | Nullable | 
Default

---+-+---+--+-
 id    | bigint  |   | not null |
 commit_hash   | character varying(40)   |   | not null |
 parent_hash   | text    | |  |
 created_ts    | timestamp without time zone |   | not null |
 repo_id   | bigint  |   | not null |
 lines_added   | bigint  | |  |
 lines_removed | bigint  | |  |
 tenant_id | uuid    |   | not null |
 author_id | uuid    |   | not null |
Indexes:
    "indexed-commit-repo-idx" btree (repo_id)
    "indexed_commit_commit_hash_repo_id_key" UNIQUE CONSTRAINT, btree 
(commit_hash, repo_id) REPLICA IDENTITY
    "indexed_commit_repo_id_without_loc_idx" btree (repo_id) WHERE 
lines_added IS NULL OR lines_removed IS NULL

Policies:
    POLICY "commit_isolation_policy"
  USING ((tenant_id = 
(current_setting('app.current_tenant_id'::text))::uuid))



 Table Metadata

SELECT relname, relpages, reltuples, relallvisible, relkind, relnatts, 
relhassubclass, reloptions, pg_table_size(oid) FROM pg_class WHERE 
relname='indexed_commit';
    relname | relpages |  reltuples   | relallvisible | relkind | 
relnatts | relhassubclass | reloptions | pg_table_size

+--+--+---+-+--++-+---
 indexed_commit | 18170522 | 7.451964e+08 |  18104744 | r |    
9 | f  | 
{autovacuum_vacuum_scale_factor=0,autovacuum_analyze_scale_factor=0,autovacuum_vacuum_threshold=20,autovacuum_analyze_threshold=10} 
|  148903337984



 EXPLAIN (ANALYZE, BUFFERS), not just EXPLAIN

Production queries:

316 ids under RLS limited user


392 ids under RLS limited user 

392 ids under Superuser 


 History

It became slow after the upgrade to 15.4. We never had any issues before.


 Hardware

AWS DB class db.t4g.large + GP3 400GB disk


 Maintenance Setup

Are you running autovacuum? Yes

If so, with what settings?

autovacuum_vacuum_scale_factor=0,autovacuum_analyze_scale_factor=0,autovacuum_vacuum_threshold=20,autovacuum_analyze_threshold=10

SELECT * FROM pg_stat_user_tables WHERE relname='indexed_commit';
 relid | schemaname |    relname | seq_scan | seq_tup_read | 
idx_scan  | idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del | 
n_tup_hot_upd | n_live_tup | n_dead_tup | n_mod_since_analyze | 
n_ins_since_vacuum | last_vacuum |    last_autovacuum    | 
last_analyze |   last_autoanalyze    | vacuum_count | 
autovacuum_count | analyze_count | autoanalyze_count

---+++--+--+---+---+---+---+---+---+++-++-+---+--+---+--+--+---+---
 24662 | public | indexed_commit | 2485 |  49215378424 | 
374533865 |    4050928807 | 764089750 |   2191615 |  18500311 
| 0 |  745241398 |    383 |   46018 
|  45343 | | 2023-10-11 23:51:29.17

Re: Postgres 15 SELECT query doesn't use index under RLS

2023-10-13 Thread Tom Lane
Alexander Okulovich  writes:
> Recently, we upgraded the AWS RDS instance from Postgres 12.14 to 15.4 
> and noticed extremely high disk consumption on the following query 
> execution:
> select (exists (select 1 as "one" from "public"."indexed_commit" where 
> "public"."indexed_commit"."repo_id" in (964992,964994,964999, ...);
> For some reason, the query planner starts using Seq Scan instead of the 
> index on the "repo_id" column when requesting under user limited with 
> RLS. On prod, it happens when there are more than 316 IDs in the IN part 
> of the query, on stage - 3. If we execute the request from Superuser, 
> the planner always uses the "repo_id" index.

The superuser bypasses the RLS policy.  When that's enforced, the
query can no longer use an index-only scan (because it needs to fetch
tenant_id too).  Moreover, it may be that only a small fraction of the
rows fetched via the index will satisfy the RLS condition.  So the
estimated cost of an indexscan query could be high enough to persuade
the planner that a seqscan is a better idea.

> Luckily, we can easily reproduce this on our stage database (which is 
> smaller). If we add a multicolumn "repo_id, tenant_id" index, the 
> planner uses it (Index Only Scan) with any IN params count under RLS.

Yeah, that would be the obvious way to ameliorate both problems.

If in fact you were getting decent performance from an indexscan plan
before, the only explanation I can think of is that the repo_ids you
are querying for are correlated with the tenant_id, so that the RLS
filter doesn't eliminate very many rows from the index result.  The
planner wouldn't realize that by default, but if you create extended
statistics on repo_id and tenant_id then it might do better.  Still,
you probably want the extra index.

> Could you please clarify if this is a Postgres bug or not?

You haven't shown any evidence suggesting that.

> Should we 
> include the "tenant_id" column in all our indexes to make them work 
> under RLS?

Adding tenant_id is going to bloat your indexes quite a bit,
so I wouldn't do that except in cases where you've demonstrated
it's important.

regards, tom lane