"Would probably look at a nested partitioning"
I'm not the original poster, but I have a schema with nested
(composite) partitions, and on tables with 1,000+ total partitions I
do run into some significant inefficiencies in various schema metadata
operations (queries to get the list of tables, creating foreign keys,
etc.) compared to flat partitions.
One example:
https://www.postgresql.org/message-id/CAE%2BE%3DSQacy6t_3XzCWnY1eiRcNWfz4pp02FER0N7mU_F%2Bo8G_Q%40mail.gmail.com
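To make the "queries to get the list of tables" part concrete, this is
roughly the shape of query involved in just enumerating a nested
hierarchy (the parent table name "orders" below is only a placeholder):

    -- List every partition, at any nesting level, under one parent.
    -- pg_partition_tree() walks the whole hierarchy (PostgreSQL 12+).
    SELECT relid, parentrelid, level, isleaf
    FROM pg_partition_tree('orders')
    ORDER BY level, relid::text;

With flat partitioning the same list is a single pg_inherits lookup;
with composite partitioning anything that enumerates the partitions has
to walk the whole tree, and at 1,000+ partitions that shows up.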
Alec
On Sun, Feb 11, 2024 at 8:25 AM Justin wrote:
>
> Hi Marc,
>
> Nested partitioning still allows for simple data deletion by dropping the
> partition (table) that falls in that date range.
>
> You are probably thinking of partitioning by multicolumn rules, which is
> very complex to set up.
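For what it's worth, the layout being described would look roughly like
this; every name, range, and date below is only a placeholder:

    -- Parent, partitioned first by customer id range...
    CREATE TABLE txn (
        customer_id  bigint  NOT NULL,
        txn_date     date    NOT NULL,
        amount       numeric
    ) PARTITION BY RANGE (customer_id);

    -- ...each customer range subpartitioned by transaction date.
    CREATE TABLE txn_c1 PARTITION OF txn
        FOR VALUES FROM (1) TO (1000)
        PARTITION BY RANGE (txn_date);

    CREATE TABLE txn_c1_2024_01 PARTITION OF txn_c1
        FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');

    -- Retiring a month is still just a drop (or detach) per customer
    -- range, with no DELETE and no VACUUM of live partitions.
    DROP TABLE txn_c1_2024_01;
    -- or: ALTER TABLE txn_c1 DETACH PARTITION txn_c1_2024_01;

so the bulk-delete-and-vacuum concern Marc raises further down does not
apply, at the cost of one drop per customer range per month.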
>
> On Fri, Feb 9, 2024, 10:29 AM Marc Millas wrote:
>>
>>
>>
>>
>> On Thu, Feb 8, 2024 at 10:25 PM Justin wrote:
>>>
>>> Hi Sud,
>>>
>>> I would not look at HASH partitioning, as it is very expensive to add or
>>> remove partitions.
>>>
>>> Would probably look at a nested partitioning using customer ID, with a
>>> range or list of IDs, then by transaction date. It's easy to add partitions
>>> and balance the partition segments.
>>
>>
>> I won't do that, because then, when getting rid of obsolete data, you must
>> delete a huge number of records and vacuum each partition.
>> If partitioning by date, you greatly ease the cleaning by just getting rid
>> of obsolete partitions, which is quite speedy (no delete, no vacuum, no
>> index updates, ...).
>> Marc
>>
>>>
>>> Keep in mind that SELECT queries run against the partitioned table must
>>> use the partitioning KEY in the WHERE clause, or performance will
>>> suffer.
>>>
>>> I suggest doing a query analysis before deploying partitioning, to confirm
>>> that the queries' WHERE clauses match the planned partition rule. At least
>>> 80% of the executed queries should match the partition rule; if not, don't
>>> deploy partitioning, or change the queries in the application to match the
>>> partition rule.
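As a concrete check against the placeholder schema sketched earlier in
this mail, EXPLAIN shows whether pruning happens:

    -- Prunes: the WHERE clause constrains both partition keys, so the
    -- plan should touch only txn_c1_2024_01.
    EXPLAIN (COSTS OFF)
    SELECT * FROM txn
    WHERE customer_id = 42
      AND txn_date >= '2024-01-01' AND txn_date < '2024-02-01';

    -- Does not prune: no partition key in the WHERE clause, so every
    -- leaf partition gets scanned.
    EXPLAIN (COSTS OFF)
    SELECT * FROM txn WHERE amount > 100;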
>>>
>>>
>>> On Thu, Feb 8, 2024 at 3:51 PM Greg Sabino Mullane
>>> wrote:
>
> Out of curiosity: as the OP mentioned that there will be joins and also
> filters on the customer_id column, why don't you think that subpartitioning
> by customer_id would be a good option? I understand LIST subpartitioning may
> not be an option, considering that new customer_ids get added slowly over
> time (and a default list may not be allowed), and the OP also mentioned that
> the data distribution for customer_id is skewed. However, what is the
> problem if the OP opts for a HASH subpartition on customer_id in this
> situation?
It doesn't really gain you much, given that you would be hashing it, the
customers are unevenly distributed, and the OP talked about filtering on
the customer_id column. A hash partition would just be a lot more work and
complexity for us humans and for Postgres. Partitioning for the sake of
partitioning is not a good thing. Yes, smaller tables are better, but they
have to be smaller targeted tables.
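For concreteness, the extra moving parts would look something like this,
assuming a parent already partitioned by RANGE on the transaction date
(every name and the modulus below are placeholders):

    -- One date partition, itself subpartitioned by HASH(customer_id)...
    CREATE TABLE trans_2024_01 PARTITION OF trans
        FOR VALUES FROM ('2024-01-01') TO ('2024-02-01')
        PARTITION BY HASH (customer_id);

    -- ...needs one child per hash bucket, repeated for every date
    -- partition you ever create.
    CREATE TABLE trans_2024_01_h0 PARTITION OF trans_2024_01
        FOR VALUES WITH (MODULUS 4, REMAINDER 0);
    CREATE TABLE trans_2024_01_h1 PARTITION OF trans_2024_01
        FOR VALUES WITH (MODULUS 4, REMAINDER 1);
    CREATE TABLE trans_2024_01_h2 PARTITION OF trans_2024_01
        FOR VALUES WITH (MODULUS 4, REMAINDER 2);
    CREATE TABLE trans_2024_01_h3 PARTITION OF trans_2024_01
        FOR VALUES WITH (MODULUS 4, REMAINDER 3);

And since the hash only routes rows rather than balancing them, a skewed
customer still lands entirely in one bucket.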
sud wrote:
> 130GB of storage space as we verified using the "pg_relation_size"
> function, for a sample data set.
You might also want to closely examine your schema. At that scale, every
byte saved per row can add up.
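For example, something along these lines (table and column names below
are made up) shows where the bytes are going:

    -- Average on-disk width per column, from the planner statistics.
    SELECT attname, avg_width
    FROM pg_stats
    WHERE schemaname = 'public' AND tablename = 'transactions'
    ORDER BY avg_width DESC;

    -- Total size of one partition, indexes and TOAST included.
    SELECT pg_size_pretty(pg_total_relation_size('transactions_2024_01'));

Type choices (int vs bigint, for example) and column ordering to reduce
alignment padding are the usual places those bytes hide.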
Cheers,
Greg