RussellSpitzer commented on code in PR #10953: URL: https://github.com/apache/iceberg/pull/10953#discussion_r1803845635
########## arrow/src/main/java/org/apache/iceberg/arrow/vectorized/VectorHolder.java: ########## @@ -140,12 +141,21 @@ public static class ConstantVectorHolder<T> extends VectorHolder { private final int numRows; public ConstantVectorHolder(int numRows) { + super(new NullVector("_dummy_", numRows), null, new NullabilityHolder(numRows)); + nullabilityHolder().setNulls(0, numRows); this.numRows = numRows; this.constantValue = null; } public ConstantVectorHolder(Types.NestedField icebergField, int numRows, T constantValue) { - super(icebergField); + super( Review Comment: Ok having gone through the code I think I may have a strategy here but please let me know what i'm missing in this interpretation. ConstantVectorHolder is used as a dummy holder prior to this commit. It has no *vector* , column descriptor, etc ... everything is null. The current approach sometimes makes a ConstantVectorHolder which sometimes *does* have something inside of it. This is done so that GenericArrowVectorAccessorFactory will have something to work with. From what I can see in GenericArrowVectorAccessorFactory is that it doesn't work with ConstantVectorHolders at all. It currently can only handle cases in which holder.vector() is a non null type and matches some known vector type. Spark on the other hand, does not actually use this path and when it sees a ConstantVectorHolder it instead creates a o.a.i.s.ConstantColumnVector. This is why I don't think Spark has any issue with this. (cc @amogh-jahagirdar ) Now this makes me think what we really need is to just fix GenericArrowVectorAccessorFactory to handle the case where the "vector" is null and in that circumstance attempt to cast the VectorHolder to a ConstantVectorHolder and create the appropriate vector type or accessor. I think we have a few options to do this. I think the easiest may be to just create an accessor that looks like the Spark ConstantColumnVector and just use that as our generic accessor. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org