Re: FDW, too long to run explain

2019-02-09 Thread auxsvr
On Monday, 4 February 2019 09:14:14 EET Vijaykumar Jain wrote:
> Hi,

Hi,

> With PostgreSQL v10.1:

> We write directly to the shards and read via FDW across all shards (read-only).
> Each shard's DB is ~500 GB, and the tables are huge too:
> one table is ~200 GB, one ~55 GB, one ~40 GB, plus a lot of small tables, with
> large indices on the large tables.
> 
> The sharding was done on a key, to enable shard isolation at the app layer,
> using a fact table:
> select id, shard from fact_table;
> 
> Server resources are:
> 32 GB memory, 8 vCPUs, 500 GB SSD.
> 
> Each shard connects to the other shards via FDW, fronted by haproxy ->
> pgbouncer -> postgresql.
> Hope this is good enough background :)
> 
> Now we have some long-running queries via FDW that take minutes and get
> killed; the EXPLAIN runs show up as idle in transaction on the remote
> servers. (We set use_remote_estimate = true.)
> When the query is run on an individual shard directly, it runs pretty
> quickly, but when run via FDW it takes very long.
> I even altered fetch_size to 1, so that any filters that do not get pushed
> down can be applied on the FDW side quickly.

In general, the plans used via FDW are not the same as the ones for queries run 
locally. We're having similar issues, and the reason seems to be that queries 
via FDW are optimized for startup cost, i.e. for fetching only a few rows.
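
For reference, the knobs in question live on the postgres_fdw server and 
foreign table objects; a rough sketch, with made-up server and table names:

  -- use_remote_estimate asks the remote server for EXPLAIN-based costs;
  -- fetch_size controls how many rows each remote FETCH pulls (default 100).
  ALTER SERVER shard2_srv OPTIONS (SET use_remote_estimate 'true');
  ALTER SERVER shard2_srv OPTIONS (ADD fetch_size '1000');
  -- fetch_size can also be overridden per foreign table:
  ALTER FOREIGN TABLE big_table_ft OPTIONS (ADD fetch_size '10000');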

> Regards,
> Vijay

-- 
Regards,
Peter





Shared hosting with FDW on AWS RDS

2019-02-10 Thread auxsvr
Hi,

We'd like to configure an RDS server for shared hosting. The idea is that every 
customer will use a different database, with FDW configured so that the remote 
tables have access to the full data while materialized views pull from them the 
data specific to each customer. So far everything seems to work fine and be 
secure, as we've revoked access to the remote tables for the customer users, but 
I'm feeling a bit uneasy considering that the credentials for full access are 
stored in each database. My understanding is that remote user mapping is 
designed so that this is not an issue, but I was wondering whether access to the 
metadata schema might make it possible to circumvent this restriction. I was 
also wondering whether anyone has experience hardening databases on RDS, as the 
so-called superuser does not have the right to revoke access to the metadata 
schema.
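
To make the setup concrete, here is roughly what each customer database 
contains (all names are made up):

  -- The unease is about the user mapping below, which lives inside each
  -- customer database and carries the full-access credentials.
  CREATE SERVER central_srv FOREIGN DATA WRAPPER postgres_fdw
      OPTIONS (host 'central.example.com', dbname 'central');
  CREATE USER MAPPING FOR mview_owner SERVER central_srv
      OPTIONS (user 'full_access', password 'secret');
  -- Customers only ever see the materialized view, not the remote tables.
  REVOKE ALL ON remote_orders FROM customer1;
  GRANT SELECT ON orders_mv TO customer1;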

Comments and suggestions are welcome.
-- 
Regards,
Peter





Re: FDW, too long to run explain

2019-02-17 Thread auxsvr
Related to this question:

PostgreSQL cursors are, in most cases I've tried, extremely slow. In my 
experience the cause is as described in my previous answer. Is there any plan or 
way to improve this situation? For example, with FDW one would expect the plan 
on the remote side to be similar, if not identical, to the local one, apart from 
the setup cost.
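
To see what I mean outside of FDW, compare the plan for a plain query with the 
plan for the same query behind a cursor (table and column names are made up):

  EXPLAIN SELECT * FROM big_table WHERE category = 42 ORDER BY created_at;
  EXPLAIN DECLARE c CURSOR FOR
      SELECT * FROM big_table WHERE category = 42 ORDER BY created_at;
  -- The cursor plan is costed for retrieving only a fraction of the rows
  -- (cursor_tuple_fraction, default 0.1), so it favours low startup cost.
  -- This makes the planner optimize cursors for total runtime again:
  SET cursor_tuple_fraction = 1.0;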
-- 
Regards,
Peter





Re: FDW, too long to run explain

2019-02-17 Thread auxsvr
On Sunday, 17 February 2019 20:58:47 EET Jeff Janes wrote:
> A question for the PostgreSQL hackers would be: is it necessary and
> desirable that the EXPLAIN be issued in the same transaction as the
> eventual DECLARE and FETCHes?  I don't think it is.  I guess if the
> foreign-side table definition got changed between EXPLAIN and DECLARE it
> would cause problems, but changing the foreign-side definition out of sync
> with the local side can cause problems anyway, so is that important to
> preserve?

Won't separate transactions cause issues if the statistics of the table change 
in the meantime in a way that affects the plan?
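
For reference, with use_remote_estimate the conversation postgres_fdw has with 
the remote server looks roughly like this (cursor name and query are 
illustrative), which is why the EXPLAIN currently shares a transaction with the 
DECLARE and FETCHes:

  START TRANSACTION ISOLATION LEVEL REPEATABLE READ;
  EXPLAIN SELECT id, payload FROM big_table WHERE category = 42;  -- at plan time
  DECLARE c1 CURSOR FOR
      SELECT id, payload FROM big_table WHERE category = 42;      -- at execution
  FETCH 100 FROM c1;    -- repeated, fetch_size rows at a time
  CLOSE c1;
  COMMIT TRANSACTION;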

> Cheers,
> 
> Jeff
-- 
Regards,
Peter





FDW and remote cursor_tuple_fraction

2019-03-03 Thread auxsvr
Hi,

One reason the remote plans via FDW differ from the plans of queries run 
directly on the remote server is that a cursor is used, which is optimized for a 
low number of rows according to cursor_tuple_fraction. Is there any way to set 
parameters that affect the queries run on the remote side via FDW? If not, is 
there any plan to introduce such a setting?
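
For concreteness, the parameter I'd like to influence on the remote side is the 
one below; the only workaround I can think of is to set it per role on the 
remote server, assuming the user mapping points at a dedicated role there (role 
name is made up):

  -- On the remote server: plan cursors opened by this role for full retrieval.
  ALTER ROLE fdw_reader SET cursor_tuple_fraction = 1.0;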
-- 
Regards,
Peter





Re: Design of a reliable task processing queue

2025-01-19 Thread auxsvr
On Saturday, 18 January 2025 12:44:07 EET Alex Burkhart wrote:
> Hey team,
> 
> I'm looking for help to organize locks and transaction for a reliable task
> queue.
> 
> REQUIREMENTS
> 
> 1. Pending actions are persisted to a database. There's a trace once they
> are done.
> 2. Application workers pick actions one by one. At any given time, each
> action can be assigned to at most one worker (transaction).
> 3. If multiple actions have the same "lock_id", only one of them is
> processed at a time. That has to be the action with the smallest id.

Why reinvent the wheel and not use production-ready code from projects such as 
que (Ruby) or pgqueuer (Python)?
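
For what it's worth, the core pattern those libraries build on is a SKIP LOCKED 
queue, roughly as below (table and column names are made up); the per-lock_id 
serialization from requirement 3 still has to be layered on top, e.g. with 
advisory locks:

  BEGIN;
  -- Claim one pending task; rows locked by other workers are skipped.
  SELECT id, payload
  FROM tasks
  WHERE status = 'pending'
  ORDER BY id
  LIMIT 1
  FOR UPDATE SKIP LOCKED;
  -- ... process the claimed row, then mark it done ...
  UPDATE tasks SET status = 'done' WHERE id = 42;  -- id returned above
  COMMIT;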
-- 
Regards,
Peter