shubham19may commented on code in PR #14499:
URL: https://github.com/apache/iceberg/pull/14499#discussion_r2495555914
##########
arrow/src/main/java/org/apache/iceberg/arrow/vectorized/VectorizedArrowReader.java:
##########
@@ -541,31 +541,41 @@ public Optional<LogicalTypeVisitorResult> visit(
@Override
public Optional<LogicalTypeVisitorResult> visit(
LogicalTypeAnnotation.TimestampLogicalTypeAnnotation
timestampLogicalType) {
-        FieldVector vector = arrowField.createVector(rootAlloc);
         switch (timestampLogicalType.getUnit()) {
           case MILLIS:
-            ((BigIntVector) vector).allocateNew(batchSize);
+            Field bigIntField =
+                new Field(
+                    icebergField.name(),
+                    new FieldType(
+                        icebergField.isOptional(), new ArrowType.Int(Long.SIZE, true), null, null),
+                    null);
+            FieldVector millisVector = bigIntField.createVector(rootAlloc);
+            ((BigIntVector) millisVector).allocateNew(batchSize);
Review Comment:
I did try `TimeStampMilliTZVector` first, but it fails because
`arrowField.createVector()` creates a `TimeStampMicroTZVector` (from the
Iceberg schema), which causes a `TimeStampMicroTZVector` ->
`TimeStampMilliTZVector` cast exception. Even if we explicitly created a
`TimeStampMilliTZVector`, the values would still be written as raw longs via
`getDataBuffer().setLong()`, not as Arrow timestamp values.
Moreover, Parquet stores `TIMESTAMP_MILLIS` as physical `INT64`, so a
`BigIntVector` (a raw long container) is the right fit. The caller knows it's
timestamp data via `ReadType.TIMESTAMP_MILLIS`, not via the vector type.
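
For reference, a minimal standalone sketch of the same idea (illustrative names like `event_ts`, not the PR code): a field typed as signed 64-bit `Int` yields a `BigIntVector` that just carries raw epoch-millis longs, and nothing in the Arrow type itself marks it as a timestamp:

```java
import org.apache.arrow.memory.BufferAllocator;
import org.apache.arrow.memory.RootAllocator;
import org.apache.arrow.vector.BigIntVector;
import org.apache.arrow.vector.types.pojo.ArrowType;
import org.apache.arrow.vector.types.pojo.Field;
import org.apache.arrow.vector.types.pojo.FieldType;

public class TimestampMillisVectorSketch {
  public static void main(String[] args) {
    try (BufferAllocator allocator = new RootAllocator()) {
      // Same shape as the PR: a nullable signed 64-bit Int field, i.e. the
      // physical INT64 layout Parquet uses for TIMESTAMP_MILLIS.
      Field millisField =
          new Field("event_ts", FieldType.nullable(new ArrowType.Int(64, true)), null);
      try (BigIntVector vector = (BigIntVector) millisField.createVector(allocator)) {
        vector.allocateNew(2);
        vector.set(0, 1_700_000_000_000L); // raw epoch millis
        vector.set(1, 1_700_000_060_000L);
        vector.setValueCount(2);
        // Nothing about the vector says "timestamp"; the caller interprets the
        // longs as millis because the read type is TIMESTAMP_MILLIS.
      }
    }
  }
}
```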
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]