multicolumn partitioning help

2023-03-14 Thread James Robertson
Hey folks,
I am having issues with multicolumn partitioning. For reference I am using
the following link as my guide:
https://www.postgresql.org/docs/devel/sql-createtable.html

To demonstrate my problem, I created a simple table called humans. I want
to partition by the year of the human birth and then the first character of
the hash. So for each year I'll have year*16 partitions. (hex)

CREATE TABLE humans (
hash bytea,
fname text,
dob date
)PARTITION BY RANGE (EXTRACT(YEAR FROM dob),substring(hash::text, 1,
1));

Reading the documentation:   "When creating a range partition, the lower
bound specified with FROM is an inclusive bound, whereas the upper bound
specified with TO is an exclusive bound".

However I can't insert any of the following after the first one, because it
says it overlaps. Do I need to do anything different when defining
multi-column partitions?


This works:
CREATE TABLE humans_1968_0 PARTITION OF humans FOR VALUES FROM (1968, '0')
TO (1969, '1');


These fail:
CREATE TABLE humans_1968_1 PARTITION OF humans FOR VALUES FROM (1968, '1')
TO (1969, '2');
CREATE TABLE humans_1968_2 PARTITION OF humans FOR VALUES FROM (1968, '2')
TO (1969, '3');
CREATE TABLE humans_1968_3 PARTITION OF humans FOR VALUES FROM (1968, '3')
TO (1969, '4');
CREATE TABLE humans_1968_4 PARTITION OF humans FOR VALUES FROM (1968, '4')
TO (1969, '5');
CREATE TABLE humans_1968_5 PARTITION OF humans FOR VALUES FROM (1968, '5')
TO (1969, '6');
CREATE TABLE humans_1968_6 PARTITION OF humans FOR VALUES FROM (1968, '6')
TO (1969, '7');
CREATE TABLE humans_1968_7 PARTITION OF humans FOR VALUES FROM (1968, '7')
TO (1969, '8');
CREATE TABLE humans_1968_8 PARTITION OF humans FOR VALUES FROM (1968, '8')
TO (1969, '9');
CREATE TABLE humans_1968_9 PARTITION OF humans FOR VALUES FROM (1968, '9')
TO (1969, 'a');
CREATE TABLE humans_1968_a PARTITION OF humans FOR VALUES FROM (1968, 'a')
TO (1969, 'b');
CREATE TABLE humans_1968_b PARTITION OF humans FOR VALUES FROM (1968, 'b')
TO (1969, 'c');
CREATE TABLE humans_1968_c PARTITION OF humans FOR VALUES FROM (1968, 'c')
TO (1969, 'd');
CREATE TABLE humans_1968_d PARTITION OF humans FOR VALUES FROM (1968, 'd')
TO (1969, 'e');
CREATE TABLE humans_1968_e PARTITION OF humans FOR VALUES FROM (1968, 'e')
TO (1969, 'f');
CREATE TABLE humans_1968_f PARTITION OF humans FOR VALUES FROM (1968, 'f')
TO (1969, 'g');
CREATE TABLE humans_1969_0 PARTITION OF humans FOR VALUES FROM (1969, '0')
TO (1970, '1');
CREATE TABLE humans_1969_1 PARTITION OF humans FOR VALUES FROM (1969, '1')
TO (1970, '2');
CREATE TABLE humans_1969_2 PARTITION OF humans FOR VALUES FROM (1969, '2')
TO (1970, '3');
CREATE TABLE humans_1969_3 PARTITION OF humans FOR VALUES FROM (1969, '3')
TO (1970, '4');
CREATE TABLE humans_1969_4 PARTITION OF humans FOR VALUES FROM (1969, '4')
TO (1970, '5');
CREATE TABLE humans_1969_5 PARTITION OF humans FOR VALUES FROM (1969, '5')
TO (1970, '6');
CREATE TABLE humans_1969_6 PARTITION OF humans FOR VALUES FROM (1969, '6')
TO (1970, '7');
CREATE TABLE humans_1969_7 PARTITION OF humans FOR VALUES FROM (1969, '7')
TO (1970, '8');
CREATE TABLE humans_1969_8 PARTITION OF humans FOR VALUES FROM (1969, '8')
TO (1970, '9');
CREATE TABLE humans_1969_9 PARTITION OF humans FOR VALUES FROM (1969, '9')
TO (1970, 'a');
CREATE TABLE humans_1969_a PARTITION OF humans FOR VALUES FROM (1969, 'a')
TO (1970, 'b');
CREATE TABLE humans_1969_b PARTITION OF humans FOR VALUES FROM (1969, 'b')
TO (1970, 'c');
CREATE TABLE humans_1969_c PARTITION OF humans FOR VALUES FROM (1969, 'c')
TO (1970, 'd');
CREATE TABLE humans_1969_d PARTITION OF humans FOR VALUES FROM (1969, 'd')
TO (1970, 'e');
CREATE TABLE humans_1969_e PARTITION OF humans FOR VALUES FROM (1969, 'e')
TO (1970, 'f');
CREATE TABLE humans_1969_f PARTITION OF humans FOR VALUES FROM (1969, 'f')
TO (1970, 'g');

Thank you for reviewing this problem.


Re: multicolumn partitioning help

2023-03-15 Thread James Robertson
Laurenz, Justin,
Thank you both for thinking of this problem.

Laurenz your solution is how I thought I would work around my (lack of)
understanding of partitioning. (nested partitions).
I was hesitant because I didn't know what sort of performance problems I
would create for myself.

If we have true multi-column don't we get the benefit of:

TopLevelTable
|
|> worker-thread 1
|
|> worker-thread 2
|
|> worker-thread n

Doesn't that give me more performance than:

TopLevelTable
|
|> worker-thread 1
|> sub-table 1.1
|> sub-table 1.2
|> sub-table 1.n
|
|> worker-thread 2
|> sub-table 2.1
|> sub-table 2.2
|> sub-table 2.n

or do we get?

TopLevelTable
|
|> worker-thread 1 (default catch)
|> worker thread 2 -> sub-table 1.1
|> worker thread 3 -> sub-table 1.2
|> worker thread 4 -> sub-table 1.n
|
|> worker-thread 5 (default catch)
|> worker thread 6 -> sub-table 2.1
|> worker thread 7 -> sub-table 2.2
|> worker thread 8 -> sub-table 2.n


Summary:
1) if we create nested partitions, do we create performance issues:
2) if nested partitions are the solutions, what is the point of
multi-column partitioning?


wish list) wouldn't it be neat if we can do mult-mode multi-column? like
PARTITION BY RANGE (EXTRACT(YEAR FROM dob)) LIST (SUBSTRING(hash, 1, 1));

On Tue, Mar 14, 2023 at 5:41 PM Laurenz Albe 
wrote:

> On Sun, 2023-03-12 at 13:59 -0400, James Robertson wrote:
> > I am having issues with multicolumn partitioning. For reference I am
> using the following link as my guide:
> > https://www.postgresql.org/docs/devel/sql-createtable.html
> >
> > To demonstrate my problem, I created a simple table called humans. I
> want to partition by the year
> > of the human birth and then the first character of the hash. So for each
> year I'll have year*16 partitions. (hex)
> >
> > CREATE TABLE humans (
> > hash bytea,
> > fname text,
> > dob date
> > )PARTITION BY RANGE (EXTRACT(YEAR FROM dob),substring(hash::text, 1,
> 1));
> >
> > Reading the documentation:   "When creating a range partition, the lower
> bound specified with
> > FROM is an inclusive bound, whereas the upper bound specified with TO is
> an exclusive bound".
> >
> > However I can't insert any of the following after the first one, because
> it says it overlaps.
> > Do I need to do anything different when defining multi-column partitions?
> >
> >
> > This works:
> > CREATE TABLE humans_1968_0 PARTITION OF humans FOR VALUES FROM (1968,
> '0') TO (1969, '1');
> >
> >
> > These fail:
> > CREATE TABLE humans_1968_1 PARTITION OF humans FOR VALUES FROM (1968,
> '1') TO (1969, '2');
>
> Justin has explained what the problem is, let me supply a solution.
>
> I think you want subpartitioning, like
>
>   CREATE TABLE humans (
>  hash bytea,
>  fname text,
>  dob date
>   ) PARTITION BY LIST (EXTRACT (YEAR FROM dob));
>
>   CREATE TABLE humans_2002
>  PARTITION OF humans FOR VALUES IN (2002)
>  PARTITION BY HASH (hash);
>
>   CREATE TABLE humans_2002_0
>  PARTITION OF humans_2002 FOR VALUES WITH (MODULUS 26, REMAINDER 0);
>
>   [...]
>
>   CREATE TABLE humans_2002_25
>  PARTITION OF humans_2002 FOR VALUES WITH (MODULUS 26, REMAINDER 25);
>
> and so on for the other years.
>
> Yours,
> Laurenz Albe
>