Implementing foreign data wrappers and avoiding n+1 querying

2022-12-21 Thread David Gilman
When a fdw table participates in query planning and finds itself as
part of a join it can output a parameterized path. If chosen, Postgres
will dutifully call the fdw over and over via IterateForeignScan to
fetch matching tuples. Many fdw extensions do network traffic, though,
and it would be beneficial to reduce the total number of queries done
or network connections established.

Is there some path that can be emitted by the fdw, or some other
technique, to get the query planner and everything else to handle
batching the tuples returned by the outer relation? For example, once
batched the fdw extension could send the equivalent of a WHERE row IN
(a, b, c), or maybe even WHERE row BETWEEN a AND c to the foreign
system, and either the fdw callback or a subplan does the rechecking
to match up the returned foreign tuples with the local ones.

One thought is that it might be possible to abuse the async support
for fdws to accomplish this. Your fdw could accept async requests, sit
on them until some threshold is crossed, do the actual query and feed
them back into the executor when the results are back. However, from
what I can tell the async interface has no way to tell the ForeignScan
that it won't get any more async requests, so there's no way to force
flush the final batch of queries.




Re: Implementing foreign data wrappers and avoiding n+1 querying

2022-12-22 Thread David Gilman
I apologize that my post was not super clear, I am thinking about
implementing a fdw from scratch, and the target database is one of those
NoSQL databases where you have to send JSON over a HTTP connection for each
query.

I have reviewed the postgres fdw code to see how it works and to see what's
possible. Although it probably wouldn't benefit as much from this sort of
thing (yay to postgres' design!) It could possibly still benefit a bit,
which makes me wonder if it can't be done with the current planner nodes it
might be a worthy improvement to add support for this.

On Wed, Dec 21, 2022, 10:57 PM David Rowley  wrote:

> On Thu, 22 Dec 2022 at 13:31, David Gilman  wrote:
> >
> > When a fdw table participates in query planning and finds itself as
> > part of a join it can output a parameterized path. If chosen, Postgres
> > will dutifully call the fdw over and over via IterateForeignScan to
> > fetch matching tuples. Many fdw extensions do network traffic, though,
> > and it would be beneficial to reduce the total number of queries done
> > or network connections established.
>
> Sounds like you might be looking for fdw_startup_cost [1].
>
> David
>
> [1] https://www.postgresql.org/docs/current/postgres-fdw.html
>


Query plan regression between CTE and views

2023-08-14 Thread David Gilman
I have a query that was originally written as a handful of CTEs out of
convenience. It is producing a reasonable query plan because the CTE
materialization was kicking in at an appropriate place. The CTEs
aren't totally linear. The graph looks like this, where A, B, C and D
are CTEs, and B -> A means B selects from A. In Graphviz format:

G {
   B -> A;
   C -> A;
   C -> B;
   D -> C;
}

Out of curiosity I tried turning the query into a series of views and
ran that query. The query plan is vastly different, there is no
materialization and it runs much slower.

My question is: is this a valid bug? I am not sure if I should expect
the view version to find a way to materialize and produce a comparable
query plan. Also, making a minimal test case is going to take a bit
and I don't want to start unless this smells like a genuine bug.

-- 
David Gilman
:DG<




Re: Query plan regression between CTE and views

2023-08-15 Thread David Gilman
I'm on PostgreSQL 15 with essentially a stock configuration.

On Tue, Aug 15, 2023 at 8:58 AM Ron  wrote:
>
> On 8/14/23 09:54, David Gilman wrote:
> > I have a query that was originally written as a handful of CTEs out of
> > convenience. It is producing a reasonable query plan because the CTE
> > materialization was kicking in at an appropriate place. The CTEs
> > aren't totally linear. The graph looks like this, where A, B, C and D
> > are CTEs, and B -> A means B selects from A. In Graphviz format:
> >
> > G {
> > B -> A;
> > C -> A;
> > C -> B;
> > D -> C;
> > }
> >
> > Out of curiosity I tried turning the query into a series of views and
> > ran that query. The query plan is vastly different, there is no
> > materialization and it runs much slower.
> >
> > My question is: is this a valid bug? I am not sure if I should expect
> > the view version to find a way to materialize and produce a comparable
> > query plan. Also, making a minimal test case is going to take a bit
> > and I don't want to start unless this smells like a genuine bug.
>
>
> What version of Postgresql?
>
> (Also, back before, I think, v12, CTEs were optimizer fences.  You were
> better using views or sub-queries.)
>
> --
> Born in Arizona, moved to Babylonia.
>
>
>
>


-- 
David Gilman
:DG<