shangxinli commented on code in PR #13938:
URL: https://github.com/apache/iceberg/pull/13938#discussion_r2382578755
##########
arrow/src/main/java/org/apache/iceberg/arrow/vectorized/GenericArrowVectorAccessorFactory.java:
##########
@@ -877,4 +878,55 @@ private static <T> IntFunction<T[]> genericArray(Class<T> genericClass) {
private static <T> T[] genericArray(Class<T> genericClass, int length) {
return (T[]) Array.newInstance(genericClass, length);
}
+
+ /**
+ * Returns a plain (non-dictionary) accessor for the provided vector.
+ *
+ * <p><b>Robustness note:</b> Some projected optional columns can legitimately have no
+ * materialized Arrow vector (e.g., an entirely-null column for a scan/task). In those cases
+ * {@code vector} can be {@code null}. Previously this caused an NPE. We now return a
+ * NullAccessor that reports null for every position.
+ */
+ public static ArrowVectorAccessor<?, String, ?, ?> getPlainVectorAccessor(Object vector, Types.NestedField field) {
+ if (vector == null) {
+ // Column vector did not materialize; provide a null-producing accessor for the column's type
+ return NullAccessor.forType(field.type());
+ }
+ // For now, delegate to the existing logic - this would need to be enhanced to handle
+ // the field type properly, but this provides the null safety needed
+ return new NullAccessor(field.type());
+ }
+
+ /** Accessor that treats the entire column as NULLs (no underlying Arrow buffers). */
+ static final class NullAccessor extends ArrowVectorAccessor<Object, String, Object, Object> {
+ private final Types.Type icebergType;
+
+ private NullAccessor(Types.Type icebergType) {
+ super(null);
+ this.icebergType = icebergType;
+ }
+
+ static ArrowVectorAccessor<?, String, ?, ?> forType(Types.Type t) {
+ return new NullAccessor(t);
+ }
+
+ // Primitive typed fast-paths return boxed nulls; callers should check nullability separately.
Review Comment:
It seems getDecimal() and other accessor methods are missing here.
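
To illustrate the concern, here is a minimal, self-contained sketch of an all-null accessor that covers every typed getter, including the decimal one. The names `SketchAccessor` and `SketchNullAccessor` and the exact method set are hypothetical stand-ins for illustration only, not the real `ArrowVectorAccessor` API. The assumption is that the base class throws from any getter a subclass does not override, so a getter left out of the null accessor would fail at read time instead of yielding a null.

```java
import java.math.BigDecimal;

public class NullAccessorSketch {

  /** Hypothetical stand-in for a base accessor: every typed getter throws unless overridden. */
  static class SketchAccessor {
    boolean isNullAt(int rowId) {
      throw new UnsupportedOperationException("isNullAt");
    }

    long getLong(int rowId) {
      throw new UnsupportedOperationException("getLong");
    }

    double getDouble(int rowId) {
      throw new UnsupportedOperationException("getDouble");
    }

    BigDecimal getDecimal(int rowId, int precision, int scale) {
      throw new UnsupportedOperationException("getDecimal");
    }

    String getString(int rowId) {
      throw new UnsupportedOperationException("getString");
    }

    byte[] getBinary(int rowId) {
      throw new UnsupportedOperationException("getBinary");
    }
  }

  /** Hypothetical all-null accessor: overrides every getter so that none of them can throw. */
  static final class SketchNullAccessor extends SketchAccessor {
    @Override
    boolean isNullAt(int rowId) {
      return true; // every position is null
    }

    @Override
    long getLong(int rowId) {
      return 0L; // placeholder value; callers are expected to check isNullAt first
    }

    @Override
    double getDouble(int rowId) {
      return 0.0d; // placeholder value; callers are expected to check isNullAt first
    }

    @Override
    BigDecimal getDecimal(int rowId, int precision, int scale) {
      return null; // object-typed getters can return null directly
    }

    @Override
    String getString(int rowId) {
      return null;
    }

    @Override
    byte[] getBinary(int rowId) {
      return null;
    }
  }

  public static void main(String[] args) {
    SketchAccessor accessor = new SketchNullAccessor();
    System.out.println(accessor.isNullAt(0));          // prints "true"
    System.out.println(accessor.getDecimal(0, 10, 2)); // prints "null"
  }
}
```

Whether primitive getters should return a placeholder or throw is a design choice for the PR; the point of the sketch is only that each getter the readers may call needs an explicit override.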
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]