I am not sure if this belongs in pgsql-general or pgsql-hackers, I am
trying first in psgl-general.
I am trying to understand setting a selectivity function that gets
applied to an operator (to hopefully
provide better information for the planner/optmizer). This is for the
q3c extension, source code found at
https://github.com/segasai/q3c.
There are functions for selectivity, and for an operator.
-- A dummy type used in the selectivity operator
create type q3c_type as (ra double precision, dec double precision,
ra1 double precision, dec1 double precision);
-- A dummy operator function (always returns true)
CREATE OR REPLACE FUNCTION q3c_seloper(double precision, q3c_type)
RETURNS bool
AS 'MODULE_PATHNAME', 'pgq3c_seloper'
LANGUAGE C STRICT IMMUTABLE COST 1000;
-- A selectivity function for the q3c operator
CREATE OR REPLACE FUNCTION q3c_sel(internal, oid, internal, int4)
RETURNS float8
AS 'MODULE_PATHNAME', 'pgq3c_sel'
LANGUAGE C IMMUTABLE STRICT ;
-- A selectivity function for the q3c operator
CREATE OR REPLACE FUNCTION q3c_seljoin(internal, oid, internal, int2,
internal)
RETURNS float8
AS 'MODULE_PATHNAME', 'pgq3c_seljoin'
LANGUAGE C IMMUTABLE STRICT ;
-- distance operator with correct selectivity
CREATE OPERATOR ==<<>>== (
LEFTARG = double precision,
RIGHTARG = q3c_type,
PROCEDURE = q3c_seloper,
RESTRICT = q3c_sel,
JOIN = q3c_seljoin
);
The C portions are declared as:
/* The actual selectivity function, it returns the ratio of the
* search circle to the whole sky area
*/
PG_FUNCTION_INFO_V1(pgq3c_sel);
Datum pgq3c_sel(PG_FUNCTION_ARGS)
where the actual calculation portion is (not showing the setup portion):
ratio = 3.14 * rad * rad / 41252.; /* pi*r^2/whole_sky_area */
/* clamp at 0, 1*/
CLAMP_PROBABILITY(ratio);
elog(WARNING, "HERE0 pgq3c_sel.... %e", ratio);
PG_RETURN_FLOAT8(ratio);
}
The join function is declared as:
PG_FUNCTION_INFO_V1(pgq3c_seljoin);
Datum pgq3c_seljoin(PG_FUNCTION_ARGS)
{
where the meat portion is:
ratio = 3.14 * rad * rad / 41252.; /* pi*r^2/whole_sky_area */
/* clamp at 0, 1*/
CLAMP_PROBABILITY(ratio);
elog(WARNING, "HERE0 pgq3c_seljoin.... %e", ratio);
PG_RETURN_FLOAT8(ratio);
}
The two elog statements aren't in the orig code, I've added them to help
me trace
the code. As far as I can tell, the these selectivity functions are
called in
src/backend/optimizer/path/clausesel.c in the routine
clause_selectivity_ext.
If I add similar elog statements, at about line 836, the code says:
if (treat_as_join_clause(root, clause, rinfo, varRelid,
sjinfo))
{
/* Estimate selectivity for a join clause. */
if (opno > 6000)
elog(WARNING, "clause_selectivity: join_selectivity
opno %d",opno);
s1 = join_selectivity(root, opno,
opclause->args,
opclause->inputcollid,
jointype,
sjinfo);
if (opno > 6000){
elog(WARNING, "join_selectivity: s1 %f", s1);
}
}
else
{
/* Estimate selectivity for a restriction clause. */
if (opno > 6000)
elog(WARNING, "clause_selectivity:
restriction_selectivity opno %d", opno);
s1 = restriction_selectivity(root, opno,
opclause->args,
opclause->inputcollid,
varRelid);
if (opno > 6000){
elog(WARNING, "restriction_selectivity: s1 %lf", s1);
}
}
When I actually execute this, I get output to the terminal of the form:
WARNING: join_selectivity: operator id 16818 jointype 0 0
WARNING: HERE0 pgq3c_seljoin.... 5.873266e-12
WARNING: datum result 4438812783922730423 0.000000
WARNING: HERE0 pgq3c_seljoin.... 5.873266e-12
WARNING: join_selectivity: 0.000000 16818 jointype 0
WARNING: join_selectivity: s1 0.000000
WARNING: clause_selectivity: s1 0.000000
where it seems to me the q3c code is returning a non zero value, but in
the guts of
postgres what is found is a zero value. If I want to verify I have the
correct opr,
which is 16818, I can verify via:
q3c_test=# select
oid,oprname,oprnamespace,oprowner,oprkind,oprleft,oprright,oprresult,oprcode
from pg_operator where oid = 16818;
oid | oprname | oprnamespace | oprowner | oprkind | oprleft |
oprright | oprresult | oprcode
-------+----------+--------------+----------+---------+---------+----------+-----------+-------------
16818 | ==<<>>== | 2200 | 16391 | b | 701 |
16814 | 16 | q3c_seloper
which yeilds what I expect.
The join_selectivity is essentially a call in
src/backend/optimizer/util/plancat.c of:
result = DatumGetFloat8(OidFunctionCall5Coll(oprjoin,
inputcollid,
PointerGetDatum(root),
ObjectIdGetDatum(operatorid),
PointerGetDatum(args),
Int16GetDatum(jointype),
PointerGetDatum(sjinfo)));
if (result < 0.0 || result > 1.0)
elog(ERROR, "invalid join selectivity: %f", result);
while restriction_selectivity is a call to:
result = DatumGetFloat8(OidFunctionCall4Coll(oprrest,
inputcollid,
PointerGetDatum(root),
ObjectIdGetDatum(operatorid),
PointerGetDatum(args),
Int32GetDatum(varRelid)));
This is the point where I run out of steam. The basic issue I have is
that q3c code is attempting
to return a small, but non-zero value for the selectivity in two
functions, but the guts of
postgresql has both the join_selectivity and restriction_selectivity
function return zero where
I think they shouldn't.
Any advice in how to make progress on this is welcome. I'm using
19devel (I can probably do a
git merge to move to a more up to date version), and I'm running Fedora
release 43 in case which
exact OS I'm using is relavent.
Greg