rdblue commented on code in PR #9176: URL: https://github.com/apache/iceberg/pull/9176#discussion_r1468947657
########## api/src/main/java/org/apache/iceberg/expressions/ValueAggregate.java: ########## @@ -30,13 +30,16 @@ protected ValueAggregate(Operation op, BoundTerm<T> term) { @Override public T eval(StructLike struct) { - return term().eval(struct); + if (struct.size() > 1) { + throw new UnsupportedOperationException("Expected struct like of size 1"); + } + + return (T) struct.get(0, term().type().typeId().javaClass()); } @Override public T eval(DataFile file) { - valueStruct.setValue(evaluateRef(file)); - return term().eval(valueStruct); + return (T) evaluateRef(file); Review Comment: The case that I think is missing is when the term is not a simple reference and is instead a transformed value. That's why the `SingleValueStruct` class is here. The idea is to evaluate the original term (which could be a transform of a field) on the value that was retrieved from the `DataFile` metadata. Also, it's good to keep in mind what each `eval` method is for. The `eval(StructLike)` method is for evaluating the aggregate on individual rows. That's because this is generic and we can use the aggregate framework to calculate aggregates of normal rows. That means the change to that method isn't correct. That eval method should call the accessor that was bound to the schema of incoming rows. This eval method, `eval(DataFile)` is called to produce a value for a set of rows. As I said, that value may be then transformed by the expression so we need to run the expression term on it. To be able to run the term, we need a row that can supply the value that came from the `DataFile` and that's where `SingleValueStruct` comes in: it returns the `DataFile` value for _any_ position that is queried. It should work with any accessor. The problem is that I didn't consider nested cases when writing it. It always returns the value, but it should return itself if the caller expected the value to be nested. Here's an alternative fix for `SingleValueStruct` that keeps the original `eval` implementations but handles nested values: ```java public <T> T get(int pos, Class<T> javaClass) { if (javaClass.isAssignableFrom(StructLike.class)) { return (T) this; } else { return (T) value; } } ``` The tests from this PR pass with that change. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org