Custom FDW - the results of a nested query/join not being passed as qual to the outer query

2021-01-27 Thread Kai Daguerre
Hi!

(First post. If this is not the appropriate list, please feel free to move
or let me know. )

I am developing an FDW which allows various data sources to act as virtual
tables, allowing various different APIs to be queried using a consistent
SQL interface - a similar concept to Osquery but using Postgres instead of
SQLite. It is working pretty well, but we have hit a bit of a roadblock (or
bump in the road at least).

We often have virtual tables where a list operation is not viable/possible
without providing quals. For example we have implemented a 'whois' table,
which will retrieve whois information for specified domains. It is clearly
not practical to do an unqualified 'list' of *all* domains.

The problem we have is that the results of nested subqueries/joins are not
being passed as quals to the outer query.

So for example
*   select * from whois_domain where domain in ('google.com
<http://google.com>', 'yahoo.co.uk <http://yahoo.co.uk>')*
works fine, and a qual is passed to the fdw with a value of ['google.com', '
yahoo.co.uk']

However the following (assuming a 'domains table containing required
domains) does not work:
   *select * from whois_domain where domain in (select domain from domains)*

In this case, no quals are passed to the fdw, so the *select * from
whois_domain* query therefore fails. What we would like is to ensure the
subquery runs first, and for the results to be available to the outer query.

---

Using SQLite, this could be accomplished using cross-joins (
https://sqlite.org/optoverview.html#crossjoin). Is there an equivalent (or
similar) mechanism in Postgres to ensure query ordering?

Within the FDW, I have tried using the GetForeignPaths function to return a
path which returns a single row when the 'key' column is used. This does
provide a qual, however it is of type T_Var - I believe I need a constant
qual.

Any suggestion welcome as to either a different way to structure the query
or whether the FDW can request/enforce the ordering by returning specific
planning results.

Many thanks,
Kai Daguerre

FDW source: https://github.com/turbot/steampipe-postgres-fdw
Product page: https://steampipe.io


Re: Custom FDW - the results of a nested query/join not being passed as qual to the outer query

2021-01-28 Thread Kai Daguerre
Many thanks for the fast response.

Using an SRF is an interesting idea, I'll have a play and see if we can
make that work.

Cheers,
Kai

On Wed, Jan 27, 2021 at 3:27 PM Tom Lane  wrote:

> Kai Daguerre  writes:
> > We often have virtual tables where a list operation is not
> viable/possible
> > without providing quals. For example we have implemented a 'whois' table,
> > which will retrieve whois information for specified domains. It is
> clearly
> > not practical to do an unqualified 'list' of *all* domains.
>
> In that case you're going to have to resign yourself to some queries
> failing.  This is unavoidable, consider "select * from whois".  But
> just because the query has a WHERE condition doesn't mean that a useful
> restriction clause can be extracted for any particular table.
>
> I think the best you can do is (1) fail at runtime if there's no qual
> and (2) at plan time, return an extremely high cost estimate for a
> qual-less scan, in hopes of discouraging the planner from choosing
> that.  (Instead of (2), you could perhaps not generate a scan path
> at all, but that's likely to lead to an unintelligible error message.)
>
> Perhaps you should rethink whether you really want a foreign table
> rather than a set-returning function.  With the SRF approach it's
> automatic that the user must supply the restricting argument(s) you need.
>
> regards, tom lane
>