barry3406 commented on issue #21507:
URL: https://github.com/apache/datafusion/issues/21507#issuecomment-4222788641

   Thanks for the triage, @asolimando. The Postgres reference makes the 
expected semantics clear. The bug looks like `functional_dependencies.rs` 
treats a `UNIQUE` constraint the same way as a `PRIMARY KEY` for FD inference. 
Per the SQL standard, though, a `UNIQUE` column can hold multiple NULLs, so 
`(a)` is not actually a unique row identifier when NULLs are present. That's 
why the GROUP BY collapses the two NULL rows into one.
   
   I can put up a fix that distinguishes `UNIQUE` from `PRIMARY KEY` in FD 
derivation (only emit a functional dependency from a `UNIQUE` constraint when 
every column it covers is also `NOT NULL`), with a sqllogictest regression 
covering the NULL case. Does that sound right?
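
   To make the proposed rule concrete, here is a minimal sketch. The names 
`ConstraintKind` and `yields_functional_dependency` are hypothetical and are 
not DataFusion's actual types or API. Columns are plain indices and 
nullability is a slice of flags, purely for illustration.

```rust
/// Hypothetical, simplified constraint kind (an assumption for this sketch,
/// not the type used in `functional_dependencies.rs`).
#[derive(Debug, Clone, Copy, PartialEq)]
enum ConstraintKind {
    PrimaryKey,
    Unique,
}

/// Decides whether a constraint over `columns` may be used as the
/// determinant of a functional dependency.
///
/// `nullable[i]` says whether column `i` admits NULL. A column index that is
/// out of range is treated as nullable, which is the conservative choice.
fn yields_functional_dependency(
    kind: ConstraintKind,
    columns: &[usize],
    nullable: &[bool],
) -> bool {
    match kind {
        // A primary key implies NOT NULL on every key column.
        ConstraintKind::PrimaryKey => true,
        // UNIQUE only identifies a row if no covered column can be NULL.
        ConstraintKind::Unique => columns
            .iter()
            .all(|&c| !nullable.get(c).copied().unwrap_or(true)),
    }
}

fn main() {
    // Column 0 (`a`) is nullable, column 1 (`b`) is NOT NULL.
    let nullable = [true, false];

    // UNIQUE (a) with nullable `a`: no functional dependency.
    assert!(!yields_functional_dependency(ConstraintKind::Unique, &[0], &nullable));
    // UNIQUE (b) with NOT NULL `b`: functional dependency is safe.
    assert!(yields_functional_dependency(ConstraintKind::Unique, &[1], &nullable));
    // UNIQUE (a, b): one nullable column is enough to rule it out.
    assert!(!yields_functional_dependency(ConstraintKind::Unique, &[0, 1], &nullable));
}
```

   The multi-column case is why the check uses "all columns NOT NULL" rather 
than looking at a single column.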


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

