wgtmac commented on PR #9772: URL: https://github.com/apache/iceberg/pull/9772#issuecomment-1999964002
Not yet. My rough plan is as follows:

1. Add a new VectorizedValuesReader base class to support different encodings. This is similar to what Spark does, but it reads into Arrow field vectors: https://github.com/apache/spark/blob/b7aa9740249b50ad9db254626c530ff5bc33d385/sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedValuesReader.java#L30
2. Extend VectorizedValuesReader to add the v2 encodings one by one.
3. Support vectorized readers for nested types.

This patch is step 1 above and only adds vectorized reading interfaces for the float/double/int32/int64 physical types. It already shows the big picture of how the following steps will be done. More interfaces for other physical types, and for reading logical types into Arrow field vectors, will be added progressively for a better review experience.

Does this make sense to you? @nastra
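To make step 1 more concrete, here is a minimal sketch of what an encoding-agnostic values reader could look like. It is not the code in this PR: the names `ValuesReaderSketch`, `PlainValuesReaderSketch`, and the `read*` method signatures are hypothetical, and plain Java arrays stand in for Arrow field vectors so the example has no dependencies. The only encoding shown is Parquet PLAIN, which stores fixed-width values back to back in little-endian order.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Hypothetical interface: one batch-read method per physical type.
// Assumption: arrays stand in for Arrow field vectors in this sketch.
interface ValuesReaderSketch {
  void readIntegers(int total, int[] out, int rowId);

  void readLongs(int total, long[] out, int rowId);

  void readFloats(int total, float[] out, int rowId);

  void readDoubles(int total, double[] out, int rowId);
}

// PLAIN encoding: fixed-width little-endian values stored contiguously.
// Further encodings (step 2 of the plan) would be additional implementations.
final class PlainValuesReaderSketch implements ValuesReaderSketch {
  private final ByteBuffer page;

  PlainValuesReaderSketch(byte[] pageBytes) {
    this.page = ByteBuffer.wrap(pageBytes).order(ByteOrder.LITTLE_ENDIAN);
  }

  @Override
  public void readIntegers(int total, int[] out, int rowId) {
    for (int i = 0; i < total; i++) {
      out[rowId + i] = page.getInt();
    }
  }

  @Override
  public void readLongs(int total, long[] out, int rowId) {
    for (int i = 0; i < total; i++) {
      out[rowId + i] = page.getLong();
    }
  }

  @Override
  public void readFloats(int total, float[] out, int rowId) {
    for (int i = 0; i < total; i++) {
      out[rowId + i] = page.getFloat();
    }
  }

  @Override
  public void readDoubles(int total, double[] out, int rowId) {
    for (int i = 0; i < total; i++) {
      out[rowId + i] = page.getDouble();
    }
  }
}

final class ValuesReaderDemo {
  public static void main(String[] args) {
    // Build a tiny PLAIN-encoded page holding three int32 values.
    ByteBuffer buf = ByteBuffer.allocate(12).order(ByteOrder.LITTLE_ENDIAN);
    buf.putInt(7).putInt(11).putInt(13);

    ValuesReaderSketch reader = new PlainValuesReaderSketch(buf.array());
    int[] batch = new int[4];
    reader.readIntegers(3, batch, 1); // write 3 values starting at row 1
    System.out.println(Arrays.toString(batch)); // [0, 7, 11, 13]
  }
}
```

The point of the shape is that callers depend only on the interface, so each v2 encoding can arrive as its own implementation without touching the code that fills the batch.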