gene-bordegaray commented on issue #21992:
URL: https://github.com/apache/datafusion/issues/21992#issuecomment-4378732353

   I am in favor of a general trait as the long-term goal of this work. 
Allowing users to implement their own partitioning types will make DF more 
powerful in production use cases, since I am sure there will be partitioning 
schemes that neither `Hash` nor `Range` captures (just as `Hash` did not fully 
work for us). Off the top of my head, something like a `Value` partitioning 
would also be useful:
   
   ```text
   p0: col in ('a', 'd')
   p1: col in ('b')
   p2: col in ('c')
   ```
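   
   To make the layout above concrete, here is a minimal sketch of the lookup 
such a scheme implies. `ValuePartitioning`, `new`, and `partition_for` are 
hypothetical names used only for illustration and are not part of DataFusion's 
API; the sketch also assumes string values and that each value is listed in at 
most one partition.
   
   ```rust
   use std::collections::HashMap;

   /// Hypothetical value-based partitioning: every listed value is pinned
   /// to exactly one partition. Illustration only, not a DataFusion type.
   struct ValuePartitioning {
       value_to_partition: HashMap<String, usize>,
   }

   impl ValuePartitioning {
       /// Build from one list of values per partition; the position of each
       /// inner list is the partition index.
       /// Assumption: a value listed twice keeps the last partition seen.
       fn new(partitions: Vec<Vec<&str>>) -> Self {
           let mut value_to_partition = HashMap::new();
           for (index, values) in partitions.iter().enumerate() {
               for value in values {
                   value_to_partition.insert(value.to_string(), index);
               }
           }
           Self { value_to_partition }
       }

       /// Partition index for `value`, or `None` if the value is not listed.
       fn partition_for(&self, value: &str) -> Option<usize> {
           self.value_to_partition.get(value).copied()
       }
   }
   ```
   
   Built from `vec![vec!["a", "d"], vec!["b"], vec!["c"]]`, this maps `'a'` and 
`'d'` to p0, `'b'` to p1, and `'c'` to p2. How unlisted values should be handled 
is left open here.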
   
   Because of this, I think providing another extension point for people (the 
trait) will be very high value, even if it takes extra effort.
   
   That said, I do think we can land mergeable commits now by extending the 
enum to support `Range` partitioning, as @adriangb has described, while 
modeling it after what our trait will look like. We can treat the trait as the 
final goal and let `Range` help us define its requirements as we 
pseudo-implement it.
   
   > If we are going to go with a trait, it might be good to declare that we 
eventually want to shoot to remove all special cases for 
   > hash partitioning 🤔
   
   Along with this: yes, I agree that we should aim to encapsulate all 
partitioning logic behind this trait, with no special cases. The optimizer 
rules and other components should ask whether two partitionings are compatible 
or satisfy one another, not just "is this Hash partitioned" 👍 
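   
   As a rough sketch of that idea, an optimizer-style check could go through a 
trait method instead of matching on a specific variant. Everything here 
(`PartitioningScheme`, `Requirement`, `HashScheme`, `ValueScheme`, 
`needs_repartition`) is a hypothetical, simplified stand-in and not 
DataFusion's actual API; the `ValueScheme` rule additionally assumes every 
value belongs to exactly one partition.
   
   ```rust
   /// What a downstream operator needs from its input layout (hypothetical).
   enum Requirement {
       /// Any layout is acceptable.
       Unspecified,
       /// Rows with equal values in these columns must share a partition.
       KeysColocated(Vec<String>),
   }

   /// Hypothetical partitioning trait: callers ask what a scheme satisfies
   /// and never inspect which concrete scheme it is.
   trait PartitioningScheme {
       fn satisfies(&self, required: &Requirement) -> bool;
   }

   struct HashScheme {
       columns: Vec<String>,
   }

   struct ValueScheme {
       column: String,
   }

   impl PartitioningScheme for HashScheme {
       fn satisfies(&self, required: &Requirement) -> bool {
           match required {
               Requirement::Unspecified => true,
               // Hashing on a non-empty subset of the required keys keeps
               // rows with equal keys in the same partition.
               Requirement::KeysColocated(keys) => {
                   !self.columns.is_empty()
                       && self.columns.iter().all(|c| keys.contains(c))
               }
           }
       }
   }

   impl PartitioningScheme for ValueScheme {
       fn satisfies(&self, required: &Requirement) -> bool {
           match required {
               Requirement::Unspecified => true,
               // Assumes each value maps to exactly one partition, so equal
               // keys co-locate whenever the column is one of the keys.
               Requirement::KeysColocated(keys) => keys.contains(&self.column),
           }
       }
   }

   /// Optimizer-style check with no special case for any one scheme.
   fn needs_repartition(
       current: &dyn PartitioningScheme,
       required: &Requirement,
   ) -> bool {
       !current.satisfies(required)
   }
   ```
   
   With this shape, adding a new scheme means implementing `satisfies` once, 
and callers such as `needs_repartition` stay unchanged.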
   
   I see this path as actually being more intuitive once done well.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

