somandal commented on PR #10636: URL: https://github.com/apache/pinot/pull/10636#issuecomment-1537572602
> Thinking more on my previous comment..... > > May be one way to workaround the NULL business is to output array when we have duplicates where the min and max is happening ? > > This query > > ``` > SELECT > argmin(intCol, **stringCol**), > argmin(intCol, **doubleCol**), > sum(doubleCol) > FROM table > ``` > > can output > > argmin(intCol, stringCol) argmin(intCol, doubleCol) sum(doubleCol) > ["a2", "a11"] [2.0, 3.0] 9.0 > Similarly, the following query > > ``` > SELECT > intCol, > argmin(longCol, **doubleCol**), > argmin(longCol, **longCol**) > FROM table > GROUP BY intCol > ``` > > Can output > > intCol argmin(longCol, doubleCol) argmin(longCol, longCol) > 1 [2.0, 3.0] 2 > 2 4.0 1 > This is probably a more intuitive way to reason about response and is more SQL friendly imo and avoids populating NULLs. > > @jasperjiaguo wdyt ? IMO we should allow other aggregation functions with argmin and argmax. +1 on using group by key instead of null if we need to keep the output on separate rows. The array output idea looks good to me, but the array can get quite large in some scenarios with a large number of duplicates. Perhaps that should be okay? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org