Inconsistent query performance based on relation hit frequency

2024-06-27 Thread Laura Hausmann
Heya, I hope the title is somewhat descriptive. I'm working on a
decentralized social media platform and have encountered the following
performance issue/quirk, and would like to ask for input, since I'm not
sure I missed anything.

I'm running PostgreSQL 16.2 on x86_64-pc-linux-gnu, compiled by gcc (GCC)
13.2.1 20230801, 64-bit, running on an Arch Linux box with 128GB of RAM &
an 8c16t Ryzen 3700X CPU. Disk is a NVME RAID0.

Postgres configuration: https://paste.depesz.com/s/iTv

I'm using autovacuum defaults & am running a manual VACUUM ANALYZE on the
entire database nightly.

The relevant database parts consist of a table with posts (note), a table
with users (user), and a table with follow relationships (following). The
query in question takes the most recent n (e.g. 50) posts, filtered by the
user's follow relations.

The note table on my main production instance grows by about 200k entries
per week.

Schema & tuple counts: https://paste.depesz.com/s/cfI

Here's the shortest query I can reproduce the issue with:
https://paste.depesz.com/s/RoC
Specifically, it works well for users that follow a relatively large number
of users (https://explain.depesz.com/s/tJnB), and is very slow for users
that follow a small number of users, or whose followees post infrequently (
https://explain.depesz.com/s/Mtyr).

From what I can tell, this is because this query causes postgres to scan
the note table from the bottom (most recent posts first), discarding
anything by users that are not followed.
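
For readers who can't open the paste, the query is roughly of this shape (a
sketch only; table and column names are taken from the schema description
above, and the exact query behind the link may differ):

 SELECT n.* FROM note n
 WHERE n."userId" = :userid
    OR n."userId" IN (SELECT f."followeeId"
                      FROM following f
                      WHERE f."followerId" = :userid)
 ORDER BY n.id DESC
 LIMIT 50;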

Curiously, rewriting the query like this (https://paste.depesz.com/s/8rN)
causes the opposite problem, this query is fast for users with a low
following count (https://explain.depesz.com/s/yHAz#query), and slow for
users with a high following count (https://explain.depesz.com/s/1v6L,
https://explain.depesz.com/s/yg3N).

These numbers are even further apart (to the point of 10-30s query
timeouts) in the most extreme outlier cases I've observed, and on lower-end
hardware.

I've sidestepped the issue by running either of these queries based on a
heuristic that checks whether there are more than 250 matching posts in the
past 7 days, recomputed once per day for every user, but it feels more like
a hack than a proper solution.
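
The heuristic boils down to a count of roughly this shape (assuming a
"createdAt" timestamp column on note; the real check may differ):

 SELECT count(*) > 250 AS use_fast_variant
 FROM note n
 WHERE n."createdAt" > now() - interval '7 days'
   AND (n."userId" = :userid
        OR n."userId" IN (SELECT f."followeeId"
                          FROM following f
                          WHERE f."followerId" = :userid));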

I'm able to make the planner make a sensible decision in both cases by
setting enable_sort = off, but that tanks performance for the rest of my
application, is even more of a hack, and doesn't seem to work in all cases.

I've been able to reproduce this issue with mock data (
https://paste.depesz.com/s/CnY), though it's not generating quite the same
query plans and is behaving a bit differently.

I'd appreciate any and all input on the situation. If I've left out any
information that would be useful in figuring this out, please tell me.

Thanks in advance,
Laura Hausmann


Re: Inconsistent query performance based on relation hit frequency

2024-06-27 Thread Andrei Lepikhov

On 6/27/24 07:50, Laura Hausmann wrote:
I'd appreciate any and all input on the situation. If I've left out any 
information that would be useful in figuring this out, please tell me.

Thanks for this curious case, I like it!
At first, you can try to avoid "OR" expressions - PostgreSQL has a quite 
limited set of optimisation/prediction tricks for such expressions.
Second - I see postgres predicts the wrong number of tuples. But using my 
typical tool [1] to get more precise estimations, I don't see a 
significant profit:


 Limit  (cost=10832.85..10838.69 rows=50 width=21)
   ->  Gather Merge  (cost=10832.85..10838.92 rows=52 width=21)
         Workers Planned: 2
         Workers Launched: 2
         ->  Sort  (cost=9832.83..9832.90 rows=26 width=21)
               Sort Key: objects.id DESC
               Sort Method: top-N heapsort  Memory: 32kB
               Worker 0:  Sort Method: quicksort  Memory: 32kB
               Worker 1:  Sort Method: quicksort  Memory: 32kB
               ->  Parallel Seq Scan on objects
                     Filter: ((hashed SubPlan 1) OR ("userId" = 1))
                     Rows Removed by Filter: 183372
                     SubPlan 1
                       ->  Nested Loop
                             ->  Index Only Scan using users_pkey on
                                   Index Cond: (id = 1)
                                   Heap Fetches: 0
                             ->  Index Only Scan using "relationships_followerId_followeeId_idx" on relationships
                                   Index Cond: ("followerId" = 1)
                                   Heap Fetches: 0
 Planning Time: 0.762 ms
 Execution Time: 43.816 ms

 Limit  (cost=10818.83..10819.07 rows=2 width=21)
   ->  Gather Merge  (cost=10818.83..10819.07 rows=2 width=21)
         Workers Planned: 2
         Workers Launched: 2
         ->  Sort  (cost=9818.81..9818.81 rows=1 width=21)
               Sort Key: objects.id DESC
               Sort Method: quicksort  Memory: 25kB
               Worker 0:  Sort Method: quicksort  Memory: 25kB
               Worker 1:  Sort Method: quicksort  Memory: 25kB
               ->  Parallel Seq Scan on objects
                     Filter: ((hashed SubPlan 1) OR ("userId" = 4))
                     Rows Removed by Filter: 183477
                     SubPlan 1
                       ->  Nested Loop  (cost=0.56..8.61 rows=1 width=4)
                             ->  Index Only Scan using "relationships_followerId_followeeId_idx" on relationships
                                   Index Cond: ("followerId" = 4)
                                   Heap Fetches: 0
                             ->  Index Only Scan using users_pkey
                                   Index Cond: (id = 4)
                                   Heap Fetches: 0
 Planning Time: 0.646 ms
 Execution Time: 30.824 ms

But this was achieved just because of parallel worker utilisation. 
Disabling them, we get:


 Limit  (cost=14635.07..14635.08 rows=2 width=21) (actual time=75.941..75.943 rows=0 loops=1)
   ->  Sort  (cost=14635.07..14635.08 rows=2 width=21) (actual time=75.939..75.940 rows=0 loops=1)
         Sort Key: objects.id DESC
         Sort Method: quicksort  Memory: 25kB
         ->  Seq Scan on objects  (cost=8.61..14635.06 rows=2 width=21) (actual time=75.931..75.932 rows=0 loops=1)
               Filter: ((hashed SubPlan 1) OR ("userId" = 4))
               Rows Removed by Filter: 550430
               SubPlan 1
                 ->  Nested Loop  (cost=0.56..8.61 rows=1 width=4) (actual time=0.039..0.040 rows=0 loops=1)
                       ->  Index Only Scan using "relationships_followerId_followeeId_idx" on relationships  (cost=0.28..4.29 rows=1 width=8) (actual time=0.038..0.038 rows=0 loops=1)
                             Index Cond: ("followerId" = 4)
                             Heap Fetches: 0
                       ->  Index Only Scan using users_pkey on users  (cost=0.29..4.31 rows=1 width=4) (never executed)
                             Index Cond: (id = 4)
                             Heap Fetches: 0
 Planning Time: 0.945 ms
 Execution Time: 76.123 ms

So, from the optimiser's point of view, it has done the best it could.
Theoretically, if you have a big table with indexes and must select a 
small number of tuples, the ideal query plan will include parameterised 
NestLoop JOINs. Unfortunately, parameterisation in PostgreSQL can't be 
pushed inside a subquery. It could be a reason for new development, since 
MSSQL can do such a trick, but that is a long way off.
You can try to rewrite your schema and query to avoid subqueries in 
expressions altogether.
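
One way to sketch such a rewrite, using the mock-data schema 
(objects/relationships), is a parameterised per-followee scan via LATERAL. 
This is untested against your real data, so treat it as an idea, not a fix:

 SELECT o.*
 FROM (SELECT r."followeeId" AS uid
       FROM relationships r
       WHERE r."followerId" = :userid
       UNION ALL
       SELECT :userid) f
 JOIN LATERAL (SELECT *
               FROM objects o
               WHERE o."userId" = f.uid
               ORDER BY o.id DESC
               LIMIT 50) o ON true
 ORDER BY o.id DESC
 LIMIT 50;

Whether it wins depends on the follower count, since it scans the top 50 
objects once per followee.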

I hope this message gave you some insights.

[1] https://github.com/postgrespro/aqo

--
regards, Andrei Lepikhov





Re: Row level security

2024-06-27 Thread Andrew Okhmat

Hello Jim,

Your approach of using table inheritance in PostgreSQL for implementing 
row-level security (RLS) has some interesting aspects, but there are 
potential pitfalls and alternatives that you should consider. Below, 
I'll outline some key points to


Table Inheritance and Performance Concerns

* RLS and Inheritance -  In PostgreSQL, RLS policies are applied per 
table. If you use inheritance, RLS policies defined on the parent table 
won’t automatically apply to the child tables. You’ll have to set up RLS 
policies on each child table separately.


* Growing Base Table  - The base table, getting a new row for every row 
inserted in the child tables, will grow really fast. Managing a table 
with hundreds of millions of rows per year could become a serious 
performance problem.


* Partitioning - Partitioning can help manage big tables by breaking 
them into smaller parts. But if your base table becomes a bottleneck, 
partitioning the child tables alone might not solve the problem.


Alternative Approach: Use Partitioned Tables Directly with RLS

Given your needs, here's a different approach that leverages 
PostgreSQL's partitioning and indexing features along with RLS:


* Directly Partitioned Tables - Instead of inheritance, create 
partitioned tables directly for each type of data. Partition these 
tables based on a logical key (like time, site ID, or customer ID) so 
each partition stays manageable:

Example:
 CREATE TABLE data (
 id SERIAL,
 site_id INT,
 customer_id INT,
 division_id INT,
 department_id INT,
 data_payload JSONB,
 created_at TIMESTAMPTZ,
 PRIMARY KEY (id, created_at)
 ) PARTITION BY RANGE (created_at);

(Note that a unique constraint on a partitioned table must include the 
partition key, so the primary key here covers both id and created_at.)
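
Each time range then gets its own partition, for example (names and 
boundaries illustrative):

 CREATE TABLE data_2024_h2 PARTITION OF data
 FOR VALUES FROM ('2024-07-01') TO ('2025-01-01');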


* RLS Policies on Partitions - Set up RLS policies on each partition. 
Since partitions are smaller, RLS policy checks should be more efficient.

Example:
 CREATE POLICY rls_policy ON data
 USING (site_id = current_setting('app.current_site_id')::INT);
 ALTER TABLE data ENABLE ROW LEVEL SECURITY;

* Session Variables - Using PostgreSQL session variables to store 
user-specific info (like app.current_site_id) is convenient, but has 
potential security risks. If a client can set these variables, they 
could manipulate them to gain unauthorized access. To mitigate this, 
ensure that only trusted parts of your application can set these 
variables. Consider using server-side functions or application logic to 
securely set these variables based on the authenticated user's information.

Example:
 SET app.current_site_id = '123';
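
A sketch of the server-side setter idea (function name and validation are 
illustrative; passing true as the third argument of set_config scopes the 
value to the current transaction):

 CREATE OR REPLACE FUNCTION set_site_context(p_site_id INT)
 RETURNS VOID AS $$
 BEGIN
 -- validate p_site_id against the authenticated user here
 PERFORM set_config('app.current_site_id', p_site_id::text, true);
 END;
 $$ LANGUAGE plpgsql SECURITY DEFINER;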

* Indexing  - Make sure you index columns used in RLS policies and 
queries, like site_id and customer_id.

 CREATE INDEX idx_site_id ON data (site_id);
 CREATE INDEX idx_customer_id ON data (customer_id);
 CREATE INDEX idx_site_customer ON data (site_id, customer_id);
 CREATE INDEX idx_created_at ON data (created_at);

* Using Stored Procedures - using stored procedures can centralize 
security logic, but it can also add complexity. Here's a brief look:


Advantages:
   - Centralized security logic.
   - Additional layer of security as logic is hidden from end-users.
   - Can include data validation and business logic.

Disadvantages:
   - Increased complexity in development and maintenance.
   - Potential performance overhead for complex procedures.
   - Less flexibility for ad-hoc queries.

Example:
   CREATE OR REPLACE FUNCTION insert_data(
       p_site_id INT,
       p_customer_id INT,
       p_division_id INT,
       p_data_payload JSONB,
       p_created_at TIMESTAMPTZ
   ) RETURNS VOID AS $$
   BEGIN
       IF current_setting('app.current_site_id')::INT = p_site_id THEN
           INSERT INTO data (site_id, customer_id, division_id,
                             department_id, data_payload, created_at)
           VALUES (p_site_id, p_customer_id, p_division_id,
                   p_department_id, p_data_payload, p_created_at);
       ELSE
           RAISE EXCEPTION 'Access Denied';
       END IF;
   END;
   $$ LANGUAGE plpgsql;


* Final Thoughts

Using direct partitioning and applying RLS policies to each partition 
should help with performance issues linked to a growing base table. This 
approach also keeps things flexible for future expansions and avoids the 
hassle of managing inheritance hierarchies. Proper indexing with RLS 
policies in mind can greatly improve query performance in large tables. 
Just make sure to handle session variables securely to avoid potential 
security issues.


If you have more questions or need further advice on implementation, 
just let me know!


Cheers,
Andy

On 25-Jun-24 00:28, Thomas Simpson wrote:


Hi,

I'm trying to implement a system which requires row level security on 
some key data tables (most do not require RLS).  The data tables will 
grow substantially (rows likely > +100M/year - the system is > 80% 
data insert plus < 20% updates and by design, no deletes).


Some queries are likely to brush past many rows before being 
eliminated by the RLS policy, so I

Re: Inconsistent query performance based on relation hit frequency

2024-06-27 Thread Achilleas Mantzios - cloud


On 6/27/24 03:50, Laura Hausmann wrote:
Heya, I hope the title is somewhat descriptive. I'm working on a 
decentralized social media platform and have encountered the following 
performance issue/quirk, and would like to ask for input, since I'm 
not sure I missed anything.


I'm running PostgreSQL 16.2 on x86_64-pc-linux-gnu, compiled by gcc 
(GCC) 13.2.1 20230801, 64-bit, running on an Arch Linux box with 128GB 
of RAM & an 8c16t Ryzen 3700X CPU. Disk is a NVME RAID0.


Postgres configuration: https://paste.depesz.com/s/iTv

I'm using autovacuum defaults & am running a manual VACUUM ANALYZE on 
the entire database nightly.


The relevant database parts consist of a table with posts (note), a 
table with users (user), and a table with follow relationships 
(following). The query in question takes the most recent n (e.g. 50) 
posts, filtered by the users follow relations.


The note table on my main production instance grows by about 200k 
entries per week.


Schema & tuple counts: https://paste.depesz.com/s/cfI

Here's the shortest query I can reproduce the issue with: 
https://paste.depesz.com/s/RoC
Specifically, it works well for users that follow a relatively large 
amount of users (https://explain.depesz.com/s/tJnB), and is very slow 
for users that follow a low amount of users / users that post 
infrequently (https://explain.depesz.com/s/Mtyr).


From what I can tell, this is because this query causes postgres to 
scan the note table from the bottom (most recent posts first), 
discarding anything by users that are not followed.


Curiously, rewriting the query like this 
(https://paste.depesz.com/s/8rN) causes the opposite problem, this 
query is fast for users with a low following count 
(https://explain.depesz.com/s/yHAz#query), and slow for users with a 
high following count (https://explain.depesz.com/s/1v6L, 
https://explain.depesz.com/s/yg3N).


These numbers are even further apart (to the point of 10-30s query 
timeouts) in the most extreme outlier cases I've observed, and on 
lower-end hardware.


I've sidestepped the issue by running either of these queries based on 
a heuristic that checks whether there are more than 250 matching posts 
in the past 7 days, recomputed once per day for every user, but it 
feels more like a hack than a proper solution.


I'm able to make the planner make a sensible decision in both cases by 
setting enable_sort = off, but that tanks performance for the rest of 
my application, is even more of a hack, and doesn't seem to work in 
all cases.


I've been able to reproduce this issue with mock data 
(https://paste.depesz.com/s/CnY), though it's not generating quite the 
same query plans and is behaving a bit differently.


Before a deep dive into everybody's favorite topic, you may simplify your 
query:


select o.* from objects o where o."userId" = :userid
UNION
select o.* from objects o where o."userId" IN
    (SELECT r."followeeId" FROM relationships r WHERE r."followerId" = :userid)

postgres@[local]/laura=# explain (analyze, buffers) select o.* from objects o where o."userId" = 1 UNION select o.* from objects o where o."userId" IN (SELECT r."followeeId" FROM relationships r WHERE r."followerId"=1) ORDER BY id DESC ;

QUERY PLAN

--------------------------------------------------------------------------------
Sort  (cost=8622.04..8767.98 rows=58376 width=40) (actual time=1.041..1.053 rows=314 loops=1)
  Sort Key: o.id DESC
  Sort Method: quicksort  Memory: 39kB
  Buffers: shared hit=1265
  ->  HashAggregate  (cost=3416.92..4000.68 rows=58376 width=40) (actual time=0.900..1.006 rows=314 loops=1)
        Group Key: o.id, o."userId", o.data
        Batches: 1  Memory Usage: 1585kB
        Buffers: shared hit=1265
        ->  Append  (cost=0.42..2979.10 rows=58376 width=40) (actual time=0.024..0.816 rows=314 loops=1)
              Buffers: shared hit=1265
              ->  Index Scan using "objects_userId_idx" on objects o  (cost=0.42..3.10 rows=17 width=21) (actual time=0.003..0.003 rows=0 loops=1)
                    Index Cond: ("userId" = 1)
                    Buffers: shared hit=3
              ->  Nested Loop  (cost=0.70..2684.12 rows=58359 width=21) (actual time=0.020..0.794 rows=314 loops=1)
                    Buffers: shared hit=1262
                    ->  Index Only Scan using "relationships_followerId_followeeId_idx" on relationships r  (cost=0.28..7.99 rows=315 width=4) (actual time=0.011..0.030 rows=315 loops=1)
                          Index Cond: ("followerId" = 1)
                          Heap Fetches: 0
                          Buffers: shared hit=3
                    ->  Index Scan using "objects_userId_idx" on objects o_1  (cost=0.42..6.65 rows=185 width=21) (actual time=0.002..0.002 rows=1 loops=315)

Re: Inconsistent query performance based on relation hit frequency

2024-06-27 Thread Laura Hausmann
Heya & thank you for the response!

That makes a lot of sense. I'm glad to hear it's on the radar of the team,
but I understand that this is a complex task and won't happen anytime soon.

For the meantime, I've tried a couple ways of rewriting the query, sadly
none of which seem to translate to the production database:

Simply dropping the or/union clause (and adding a relationship to the user
themselves) fixes the problem in the test database (both user 1 (
https://explain.depesz.com/s/ZY8l) and user 4 (
https://explain.depesz.com/s/Q2Wk) run in 1~15ms, which isn't perfect but
good enough), but not the production one (still fast for high frequency (
https://explain.depesz.com/s/DixF) and slow for low frequency (
https://explain.depesz.com/s/fIKm) users).

I also tried rewriting it as a join (https://explain.depesz.com/s/36Ve),
but that also didn't seem to have an effect.

It's very possible I missed one or more ways the query could be rewritten.

I'm sadly not sure how I could generate a test dataset that more closely
resembles the production workload. In case that would be helpful in
debugging this further, any tips on that would be greatly appreciated.

Thanks in advance,
Laura Hausmann


On Thu, Jun 27, 2024 at 12:31 PM Andrei Lepikhov  wrote:

> On 6/27/24 07:50, Laura Hausmann wrote:
> > I'd appreciate any and all input on the situation. If I've left out any
> > information that would be useful in figuring this out, please tell me.
> Thanks for this curious case, I like it!
> At first, you can try to avoid "OR" expressions - PostgreSQL has quite
> limited set of optimisation/prediction tricks on such expressions.
> Second - I see, postgres predicts wrong number of tuples. But using my
> typical tool [1] and getting more precise estimations i don't see
> significant profit:
>
>   Limit  (cost=10832.85..10838.69 rows=50 width=21)
> ->  Gather Merge  (cost=10832.85..10838.92 rows=52 width=21)
>   Workers Planned: 2
>   Workers Launched: 2
>   ->  Sort  (cost=9832.83..9832.90 rows=26 width=21)
> Sort Key: objects.id DESC
> Sort Method: top-N heapsort  Memory: 32kB
> Worker 0:  Sort Method: quicksort  Memory: 32kB
> Worker 1:  Sort Method: quicksort  Memory: 32kB
> ->  Parallel Seq Scan on objects
>   Filter: ((hashed SubPlan 1) OR ("userId" = 1))
>   Rows Removed by Filter: 183372
>   SubPlan 1
> ->  Nested Loop
>   ->  Index Only Scan using users_pkey on
> Index Cond: (id = 1)
> Heap Fetches: 0
>   ->  Index Only Scan using
> "relationships_followerId_followeeId_idx" on relationships
> Index Cond: ("followerId" = 1)
> Heap Fetches: 0
>   Planning Time: 0.762 ms
>   Execution Time: 43.816 ms
>
>   Limit  (cost=10818.83..10819.07 rows=2 width=21)
> ->  Gather Merge  (cost=10818.83..10819.07 rows=2 width=21)
>   Workers Planned: 2
>   Workers Launched: 2
>   ->  Sort  (cost=9818.81..9818.81 rows=1 width=21)
> Sort Key: objects.id DESC
> Sort Method: quicksort  Memory: 25kB
> Worker 0:  Sort Method: quicksort  Memory: 25kB
> Worker 1:  Sort Method: quicksort  Memory: 25kB
> ->  Parallel Seq Scan on objects
>   Filter: ((hashed SubPlan 1) OR ("userId" = 4))
>   Rows Removed by Filter: 183477
>   SubPlan 1
> ->  Nested Loop  (cost=0.56..8.61 rows=1 width=4)
>   ->  Index Only Scan using
> "relationships_followerId_followeeId_idx" on relationships
> Index Cond: ("followerId" = 4)
> Heap Fetches: 0
>   ->  Index Only Scan using users_pkey
> Index Cond: (id = 4)
> Heap Fetches: 0
>   Planning Time: 0.646 ms
>   Execution Time: 30.824 ms
>
> But this was achieved just because of parallel workers utilisation.
> Disabling them we get:
>
>   Limit  (cost=14635.07..14635.08 rows=2 width=21) (actual
> time=75.941..75.943 rows=0 loops=1)
> ->  Sort  (cost=14635.07..14635.08 rows=2 width=21) (actual
> time=75.939..75.940 rows=0 loops=1)
>   Sort Key: objects.id DESC
>   Sort Method: quicksort  Memory: 25kB
>   ->  Seq Scan on objects  (cost=8.61..14635.06 rows=2 width=21)
> (actual time=75.931..75.932 rows=0 loops=1)
> Filter: ((hashed SubPlan 1) OR ("userId" = 4))
> Rows Removed by Filter: 550430
> SubPlan 1
>   ->  Nested Loop  (cost=0.56..8.61 rows=1 

Re: Inconsistent query performance based on relation hit frequency

2024-06-27 Thread Achilleas Mantzios

On 27/6/24 17:58, Laura Hausmann wrote:

Heya & thank you for the response!

That makes a lot of sense. I'm glad to hear it's on the radar of the 
team, but I understand that this is a complex task and won't happen 
anytime soon.


For the meantime, I've tried a couple ways of rewriting the query, 
sadly none of which seem to translate to the production database:


Simply dropping the or/union clause (and adding a relationship to the 
user themselves) fixes the problem in the test database (both user 1 
(https://explain.depesz.com/s/ZY8l) and user 4 
(https://explain.depesz.com/s/Q2Wk) run in 1~15ms, which isn't perfect 
but good enough), but not the production one (still fast for high 
frequency (https://explain.depesz.com/s/DixF) and slow for low 
frequency (https://explain.depesz.com/s/fIKm) users).


I also tried rewriting it as a join 
(https://explain.depesz.com/s/36Ve), but that also didn't seem to have 
an effect.


It's very possible I missed one or multiple ways the query could be 
rewritten in.


I'm sadly not sure how I could generate a test dataset that more 
closely resembles the production workload. In case that would be 
helpful in debugging this further, any tips on that would be greatly 
appreciated.


I am not sure my message made it through to you; I don't know if you are 
subscribed to the list. Here is an idea:


select o.* from objects o where o."userId" = :userid
UNION
select o.* from objects o where o."userId" IN
    (SELECT r."followeeId" FROM relationships r WHERE r."followerId" = :userid)
ORDER BY id DESC ;


With your test data I get <= 1ms answers with all inputs.



Thanks in advance,
Laura Hausmann


On Thu, Jun 27, 2024 at 12:31 PM Andrei Lepikhov  
wrote:


On 6/27/24 07:50, Laura Hausmann wrote:
> I'd appreciate any and all input on the situation. If I've left
out any
> information that would be useful in figuring this out, please
tell me.
Thanks for this curious case, I like it!
At first, you can try to avoid "OR" expressions - PostgreSQL has
quite
limited set of optimisation/prediction tricks on such expressions.
Second - I see, postgres predicts wrong number of tuples. But
using my
typical tool [1] and getting more precise estimations i don't see
significant profit:

  Limit  (cost=10832.85..10838.69 rows=50 width=21)
    ->  Gather Merge  (cost=10832.85..10838.92 rows=52 width=21)
          Workers Planned: 2
          Workers Launched: 2
          ->  Sort  (cost=9832.83..9832.90 rows=26 width=21)
                Sort Key: objects.id  DESC
                Sort Method: top-N heapsort  Memory: 32kB
                Worker 0:  Sort Method: quicksort  Memory: 32kB
                Worker 1:  Sort Method: quicksort  Memory: 32kB
                ->  Parallel Seq Scan on objects
                      Filter: ((hashed SubPlan 1) OR ("userId" = 1))
                      Rows Removed by Filter: 183372
                      SubPlan 1
                        ->  Nested Loop
                              ->  Index Only Scan using users_pkey on
                                    Index Cond: (id = 1)
                                    Heap Fetches: 0
                              ->  Index Only Scan using
"relationships_followerId_followeeId_idx" on relationships
                                    Index Cond: ("followerId" = 1)
                                    Heap Fetches: 0
  Planning Time: 0.762 ms
  Execution Time: 43.816 ms

  Limit  (cost=10818.83..10819.07 rows=2 width=21)
    ->  Gather Merge  (cost=10818.83..10819.07 rows=2 width=21)
          Workers Planned: 2
          Workers Launched: 2
          ->  Sort  (cost=9818.81..9818.81 rows=1 width=21)
                Sort Key: objects.id  DESC
                Sort Method: quicksort  Memory: 25kB
                Worker 0:  Sort Method: quicksort  Memory: 25kB
                Worker 1:  Sort Method: quicksort  Memory: 25kB
                ->  Parallel Seq Scan on objects
                      Filter: ((hashed SubPlan 1) OR ("userId" = 4))
                      Rows Removed by Filter: 183477
                      SubPlan 1
                        ->  Nested Loop  (cost=0.56..8.61 rows=1
width=4)
                              ->  Index Only Scan using
"relationships_followerId_followeeId_idx" on relationships
                                    Index Cond: ("followerId" = 4)
                                    Heap Fetches: 0
                              ->  Index Only Scan using users_pkey
                                    Index Cond: (id = 4)
                                    Heap Fetches: 0
  Planning Time: 0.646 ms
  Execution Time: 30.824 ms

But this was achieved just because of parallel workers utilisation.
Disabling them we