rdblue commented on code in PR #9176:
URL: https://github.com/apache/iceberg/pull/9176#discussion_r1468947657


##########
api/src/main/java/org/apache/iceberg/expressions/ValueAggregate.java:
##########
@@ -30,13 +30,16 @@ protected ValueAggregate(Operation op, BoundTerm<T> term) {
 
   @Override
   public T eval(StructLike struct) {
-    return term().eval(struct);
+    if (struct.size() > 1) {
+      throw new UnsupportedOperationException("Expected struct like of size 
1");
+    }
+
+    return (T) struct.get(0, term().type().typeId().javaClass());
   }
 
   @Override
   public T eval(DataFile file) {
-    valueStruct.setValue(evaluateRef(file));
-    return term().eval(valueStruct);
+    return (T) evaluateRef(file);

Review Comment:
   The case that I think is missing is when the term is not a simple reference 
and is instead a transformed value. That's why the `SingleValueStruct` class is 
here. The idea is to evaluate the original term (which could be a transform of 
a field) on the value that was retrieved from the `DataFile` metadata.
   
   Also, it's good to keep in mind what each `eval` method is for. The 
`eval(StructLike)` method is for evaluating the aggregate on individual rows. 
That's because this is generic and we can use the aggregate framework to 
calculate aggregates of normal rows. That means the change to that method isn't 
correct. That eval method should call the accessor that was bound to the schema 
of incoming rows.
   
   This eval method, `eval(DataFile)` is called to produce a value for a set of 
rows. As I said, that value may be then transformed by the expression so we 
need to run the expression term on it. To be able to run the term, we need a 
row that can supply the value that came from the `DataFile` and that's where 
`SingleValueStruct` comes in: it returns the `DataFile` value for _any_ 
position that is queried. It should work with any accessor. The problem is that 
I didn't consider nested cases when writing it. It always returns the value, 
but it should return itself if the caller expected the value to be nested.
   
   Here's an alternative fix for `SingleValueStruct` that keeps the original 
`eval` implementations but handles nested values:
   
   ```java
       public <T> T get(int pos, Class<T> javaClass) {
         if (javaClass.isAssignableFrom(StructLike.class)) {
           return (T) this;
         } else {
           return (T) value;
         }
       }
   ```
   
   The tests from this PR pass with that change.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Reply via email to