Custom opclass for column statistics?

2019-07-06 Thread Ancoron Luciferis
Hi,

I've been wondering whether it is possible somehow to have the standard
column statistics to respect a certain operator class?

The reason why I am asking for this is that I have a UUID column with a
unique index at it using a custom operator class which implies a
different sort order than for the default UUID operator class.

This results into planner mistakes when determining whether to use the
index for row selection or not. Too often it falls back into sequential
scan due to this.


Thanx and Cheers,

Ancoron




Re: Custom opclass for column statistics?

2019-07-06 Thread Tomas Vondra

On Sat, Jul 06, 2019 at 11:02:27AM +0200, Ancoron Luciferis wrote:

Hi,

I've been wondering whether it is possible somehow to have the standard
column statistics to respect a certain operator class?

The reason why I am asking for this is that I have a UUID column with a
unique index at it using a custom operator class which implies a
different sort order than for the default UUID operator class.

This results into planner mistakes when determining whether to use the
index for row selection or not. Too often it falls back into sequential
scan due to this.



Can you share an example demonstrating the issue?


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 





Re: Custom opclass for column statistics?

2019-07-06 Thread Ancoron Luciferis
On 06/07/2019 15:38, Tomas Vondra wrote:
> On Sat, Jul 06, 2019 at 11:02:27AM +0200, Ancoron Luciferis wrote:
>> Hi,
>>
>> I've been wondering whether it is possible somehow to have the standard
>> column statistics to respect a certain operator class?
>>
>> The reason why I am asking for this is that I have a UUID column with a
>> unique index at it using a custom operator class which implies a
>> different sort order than for the default UUID operator class.
>>
>> This results into planner mistakes when determining whether to use the
>> index for row selection or not. Too often it falls back into sequential
>> scan due to this.
>>
> 
> Can you share an example demonstrating the issue?
> 
> 
> regards
> 

Yes, I have an opclass as follows:

CREATE OPERATOR CLASS uuid_timestamp_ops FOR TYPE uuid
USING btree AS
OPERATOR1   <*,
OPERATOR1   <~ (uuid, timestamp with time zone),
OPERATOR2   <=*,
OPERATOR2   <=~ (uuid, timestamp with time zone),
OPERATOR3   =,
OPERATOR3   =~ (uuid, timestamp with time zone),
OPERATOR4   >=*,
OPERATOR4   >=~ (uuid, timestamp with time zone),
OPERATOR5   >*,
OPERATOR5   >~ (uuid, timestamp with time zone),
FUNCTION1   uuid_timestamp_cmp(uuid, uuid),
FUNCTION1   uuid_timestamp_only_cmp(uuid, timestamp
with time zone),
FUNCTION2   uuid_timestamp_sortsupport(internal)
;

...and e.g. operator "<*" is defined as:

CREATE FUNCTION uuid_timestamp_lt(uuid, uuid)
RETURNS bool
AS 'MODULE_PATHNAME', 'uuid_timestamp_lt'
LANGUAGE C
IMMUTABLE
LEAKPROOF
STRICT
PARALLEL SAFE;

COMMENT ON FUNCTION uuid_timestamp_lt(uuid, uuid) IS 'lower than';

CREATE OPERATOR <* (
LEFTARG = uuid,
RIGHTARG = uuid,
PROCEDURE = uuid_timestamp_lt,
COMMUTATOR = '>*',
NEGATOR = '>=*',
RESTRICT = scalarltsel,
JOIN = scalarltjoinsel
);


The function "uuid_timestamp_lt" is basically defined as follows:
1. if not version 1 UUID fallback to standard uuid compare
2. extract timestamp values and compare
3. if equal timestamps fallback to standard uuid compare

...so that a chronological order is established.


The test table is created as follows:

CREATE TABLE uuid_v1_ext (id uuid);
CREATE UNIQUE INDEX idx_uuid_v1_ext ON uuid_v1_ext (id uuid_timestamp_ops);


The values for "histogram_bounds" of the test table look like this (due
to the default sort order for standard type UUID):

3789-97bf-11e9-b6bb-e03f49f7f733
008b88f8-6deb-11e9-901a-e03f4947f477
010a8b22-586a-11e9-8258-e03f49ce78f3
...
6f682e68-978d-11e9-901a-e03f4947f477
6ff412ee-926f-11e9-901a-e03f4947f477
7079ffe2-642f-11e9-b0cc-e03f49e7fd3b
70ffaeca-4645-11e9-adf9-e03f494677fb
...
fef26b41-9b9d-11e9-b0cc-e03f49e7fd3b
ff779ce8-9e52-11e9-8258-e03f49ce78f3
6bfc-4de4-11e9-b0d4-e03f49d6f6bf

...and I think that's where the planner gets the decision for a query
such as:

DELETE FROM uuid_v1_ext WHERE id <* '4bdf6f81-56ad-11e9-8258-e03f49ce78f3';

...which then get's executed as sequential scan instead of an index scan.

I was also thinking about changing the selectivity function used by the
custom operator, but I didn't find any hints how to implement that
without duplicating a lot of internal code.


Cheers,

Ancoron





Re: Custom opclass for column statistics?

2019-07-06 Thread Tomas Vondra

On Sat, Jul 06, 2019 at 05:35:33PM +0200, Ancoron Luciferis wrote:

On 06/07/2019 15:38, Tomas Vondra wrote:

On Sat, Jul 06, 2019 at 11:02:27AM +0200, Ancoron Luciferis wrote:

Hi,

I've been wondering whether it is possible somehow to have the standard
column statistics to respect a certain operator class?

The reason why I am asking for this is that I have a UUID column with a
unique index at it using a custom operator class which implies a
different sort order than for the default UUID operator class.

This results into planner mistakes when determining whether to use the
index for row selection or not. Too often it falls back into sequential
scan due to this.



Can you share an example demonstrating the issue?


regards



Yes, I have an opclass as follows:

CREATE OPERATOR CLASS uuid_timestamp_ops FOR TYPE uuid
   USING btree AS
   OPERATOR1   <*,
   OPERATOR1   <~ (uuid, timestamp with time zone),
   OPERATOR2   <=*,
   OPERATOR2   <=~ (uuid, timestamp with time zone),
   OPERATOR3   =,
   OPERATOR3   =~ (uuid, timestamp with time zone),
   OPERATOR4   >=*,
   OPERATOR4   >=~ (uuid, timestamp with time zone),
   OPERATOR5   >*,
   OPERATOR5   >~ (uuid, timestamp with time zone),
   FUNCTION1   uuid_timestamp_cmp(uuid, uuid),
   FUNCTION1   uuid_timestamp_only_cmp(uuid, timestamp
with time zone),
   FUNCTION2   uuid_timestamp_sortsupport(internal)
;

...and e.g. operator "<*" is defined as:

CREATE FUNCTION uuid_timestamp_lt(uuid, uuid)
RETURNS bool
AS 'MODULE_PATHNAME', 'uuid_timestamp_lt'
LANGUAGE C
IMMUTABLE
LEAKPROOF
STRICT
PARALLEL SAFE;

COMMENT ON FUNCTION uuid_timestamp_lt(uuid, uuid) IS 'lower than';

CREATE OPERATOR <* (
   LEFTARG = uuid,
   RIGHTARG = uuid,
   PROCEDURE = uuid_timestamp_lt,
   COMMUTATOR = '>*',
   NEGATOR = '>=*',
   RESTRICT = scalarltsel,
   JOIN = scalarltjoinsel
);


The function "uuid_timestamp_lt" is basically defined as follows:
1. if not version 1 UUID fallback to standard uuid compare
2. extract timestamp values and compare
3. if equal timestamps fallback to standard uuid compare

...so that a chronological order is established.


The test table is created as follows:

CREATE TABLE uuid_v1_ext (id uuid);
CREATE UNIQUE INDEX idx_uuid_v1_ext ON uuid_v1_ext (id uuid_timestamp_ops);


The values for "histogram_bounds" of the test table look like this (due
to the default sort order for standard type UUID):

3789-97bf-11e9-b6bb-e03f49f7f733
008b88f8-6deb-11e9-901a-e03f4947f477
010a8b22-586a-11e9-8258-e03f49ce78f3
...
6f682e68-978d-11e9-901a-e03f4947f477
6ff412ee-926f-11e9-901a-e03f4947f477
7079ffe2-642f-11e9-b0cc-e03f49e7fd3b
70ffaeca-4645-11e9-adf9-e03f494677fb
...
fef26b41-9b9d-11e9-b0cc-e03f49e7fd3b
ff779ce8-9e52-11e9-8258-e03f49ce78f3
6bfc-4de4-11e9-b0d4-e03f49d6f6bf

...and I think that's where the planner gets the decision for a query
such as:

DELETE FROM uuid_v1_ext WHERE id <* '4bdf6f81-56ad-11e9-8258-e03f49ce78f3';

...which then get's executed as sequential scan instead of an index scan.

I was also thinking about changing the selectivity function used by the
custom operator, but I didn't find any hints how to implement that
without duplicating a lot of internal code.



Not sure, I'm not very familiar with this code, so I'd have to play with
it and try things. But that's hard when I don't have any code. Would it
be possible to share a small self-contained test case?

I wonder what does uuid_timestamp_cmp do? I suppose it first compares by
a timestamp extracted from the UUID, right?

It'd be interesting to see

(a) statistics for the column from pg_stats, both for the table and
index (which should have been built using the custom opclass, I think).

(b) EXPLAIN ANALYZE for queries with your opclass, and perhaps with the
default one (that can't use the timestamp condition, but it should be
possible to generate smallers/largest uuid for a timestamp).

BTW which PostgreSQL version is this?

regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 





Re: Custom opclass for column statistics?

2019-07-06 Thread Ancoron Luciferis
On 06/07/2019 17:57, Tomas Vondra wrote:
> On Sat, Jul 06, 2019 at 05:35:33PM +0200, Ancoron Luciferis wrote:
>> On 06/07/2019 15:38, Tomas Vondra wrote:
>>> On Sat, Jul 06, 2019 at 11:02:27AM +0200, Ancoron Luciferis wrote:
 Hi,

 I've been wondering whether it is possible somehow to have the standard
 column statistics to respect a certain operator class?

 The reason why I am asking for this is that I have a UUID column with a
 unique index at it using a custom operator class which implies a
 different sort order than for the default UUID operator class.

 This results into planner mistakes when determining whether to use the
 index for row selection or not. Too often it falls back into sequential
 scan due to this.

>>>
>>> Can you share an example demonstrating the issue?
>>>
>>>
>>> regards
>>>
>>
>> Yes, I have an opclass as follows:
>>
>> CREATE OPERATOR CLASS uuid_timestamp_ops FOR TYPE uuid
>>    USING btree AS
>>    OPERATOR    1   <*,
>>    OPERATOR    1   <~ (uuid, timestamp with time zone),
>>    OPERATOR    2   <=*,
>>    OPERATOR    2   <=~ (uuid, timestamp with time zone),
>>    OPERATOR    3   =,
>>    OPERATOR    3   =~ (uuid, timestamp with time zone),
>>    OPERATOR    4   >=*,
>>    OPERATOR    4   >=~ (uuid, timestamp with time zone),
>>    OPERATOR    5   >*,
>>    OPERATOR    5   >~ (uuid, timestamp with time zone),
>>    FUNCTION    1   uuid_timestamp_cmp(uuid, uuid),
>>    FUNCTION    1   uuid_timestamp_only_cmp(uuid, timestamp
>> with time zone),
>>    FUNCTION    2   uuid_timestamp_sortsupport(internal)
>> ;
>>
>> ...and e.g. operator "<*" is defined as:
>>
>> CREATE FUNCTION uuid_timestamp_lt(uuid, uuid)
>> RETURNS bool
>> AS 'MODULE_PATHNAME', 'uuid_timestamp_lt'
>> LANGUAGE C
>> IMMUTABLE
>> LEAKPROOF
>> STRICT
>> PARALLEL SAFE;
>>
>> COMMENT ON FUNCTION uuid_timestamp_lt(uuid, uuid) IS 'lower than';
>>
>> CREATE OPERATOR <* (
>>    LEFTARG = uuid,
>>    RIGHTARG = uuid,
>>    PROCEDURE = uuid_timestamp_lt,
>>    COMMUTATOR = '>*',
>>    NEGATOR = '>=*',
>>    RESTRICT = scalarltsel,
>>    JOIN = scalarltjoinsel
>> );
>>
>>
>> The function "uuid_timestamp_lt" is basically defined as follows:
>> 1. if not version 1 UUID fallback to standard uuid compare
>> 2. extract timestamp values and compare
>> 3. if equal timestamps fallback to standard uuid compare
>>
>> ...so that a chronological order is established.
>>
>>
>> The test table is created as follows:
>>
>> CREATE TABLE uuid_v1_ext (id uuid);
>> CREATE UNIQUE INDEX idx_uuid_v1_ext ON uuid_v1_ext (id
>> uuid_timestamp_ops);
>>
>>
>> The values for "histogram_bounds" of the test table look like this (due
>> to the default sort order for standard type UUID):
>>
>> 3789-97bf-11e9-b6bb-e03f49f7f733
>> 008b88f8-6deb-11e9-901a-e03f4947f477
>> 010a8b22-586a-11e9-8258-e03f49ce78f3
>> ...
>> 6f682e68-978d-11e9-901a-e03f4947f477
>> 6ff412ee-926f-11e9-901a-e03f4947f477
>> 7079ffe2-642f-11e9-b0cc-e03f49e7fd3b
>> 70ffaeca-4645-11e9-adf9-e03f494677fb
>> ...
>> fef26b41-9b9d-11e9-b0cc-e03f49e7fd3b
>> ff779ce8-9e52-11e9-8258-e03f49ce78f3
>> 6bfc-4de4-11e9-b0d4-e03f49d6f6bf
>>
>> ...and I think that's where the planner gets the decision for a query
>> such as:
>>
>> DELETE FROM uuid_v1_ext WHERE id <*
>> '4bdf6f81-56ad-11e9-8258-e03f49ce78f3';
>>
>> ...which then get's executed as sequential scan instead of an index scan.
>>
>> I was also thinking about changing the selectivity function used by the
>> custom operator, but I didn't find any hints how to implement that
>> without duplicating a lot of internal code.
>>
> 
> Not sure, I'm not very familiar with this code, so I'd have to play with
> it and try things. But that's hard when I don't have any code. Would it
> be possible to share a small self-contained test case?

My test uses historically generated UUID's and is currently not
automated (generates ~120M UUID's over a period of ~4 months). The tool
I am using to generate the UUID's is a Java tool, so not very automation
friendly (using Maven, which downloads a lot at first build time).

The resulting data is ~4 GiB in size, but here is a guide I use for my
manual testing:

https://gist.github.com/ancoron/b08ac4b1ceafda2a38ff12030c011385

Please note that my testing involves 4 SSD's (to avoid too much mixed
I/O but not for functionality):
1.) system + WAL
2.) generated UUID files (for reading)
3.) table data (tablespace "fast")
4.) index data (tablespace "faster")


> 
> I wonder what does uuid_timestamp_cmp do? I suppose it first compares by
> a timestamp extracted from the UUID, right?

exactly:

https://github.com/ancoron/pg-uuid-ext/blob/master/uuid_ext.c#L234

> 
> It'd be interesting to see
> 
> (a) statistics for the column from pg_stats, both for the table and
> index (which should have been built using the custom