laskoviymishka opened a new issue, #986:
URL: https://github.com/apache/iceberg-go/issues/986

   Parent: #589
   
   #929 / PR #932 cover the non-shredded variant path
(`struct<metadata: binary, value: binary>`). The shredded path is the next
piece: a Parquet variant column appears as
`struct<metadata: binary, value: binary, typed_value: STRUCT>`, where
`typed_value` mirrors the shredded subtree. The reader reconstructs the full
variant by preferring `typed_value` per field and falling back to the residual
`value` slice.
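
   The per-field preference described above can be sketched on an already-decoded model. This is only an illustration under stated assumptions: the `shredded` type and the `reassemble` function are hypothetical names, values are plain Go values rather than variant-encoded bytes, and none of this uses the arrow-go `parquet/variant` API. Returning an error when a field appears in both `typed_value` and the residual `value` is a choice made for this sketch, not something the issue prescribes.

```go
package main

import (
	"errors"
	"fmt"
)

// shredded is a hypothetical, simplified model of one shredded group.
// Real data would carry variant-encoded bytes; plain Go values stand in here.
type shredded struct {
	residual any                  // decoded `value`; nil when the column is null
	typed    any                  // primitive `typed_value`; nil when null
	fields   map[string]*shredded // object-typed `typed_value`; nil when not an object
}

// reassemble returns the reconstructed value and whether it is present at all.
func reassemble(s *shredded) (any, bool, error) {
	if s == nil {
		return nil, false, nil
	}
	if s.fields != nil {
		out := map[string]any{}
		// Start from the residual object, which holds the unshredded fields.
		if s.residual != nil {
			res, ok := s.residual.(map[string]any)
			if !ok {
				return nil, false, errors.New("residual of a shredded object must be an object")
			}
			for k, v := range res {
				out[k] = v
			}
		}
		// Then add every shredded field that is actually present.
		for name, f := range s.fields {
			v, present, err := reassemble(f)
			if err != nil {
				return nil, false, fmt.Errorf("field %q: %w", name, err)
			}
			if !present {
				continue // both value and typed_value null: field is missing
			}
			if _, dup := out[name]; dup {
				return nil, false, fmt.Errorf("field %q is in both typed_value and value", name)
			}
			out[name] = v
		}
		return out, true, nil
	}
	if s.typed != nil {
		if s.residual != nil {
			return nil, false, errors.New("non-object typed_value with non-null value")
		}
		return s.typed, true, nil // prefer typed_value
	}
	if s.residual != nil {
		return s.residual, true, nil // fall back to the residual value
	}
	return nil, false, nil
}

func main() {
	col := &shredded{
		residual: map[string]any{"extra": "x"},
		fields: map[string]*shredded{
			"id":   {typed: int64(7)},
			"name": {residual: "fallback"},
			"gone": {},
		},
	}
	v, present, err := reassemble(col)
	fmt.Println(v, present, err)
}
```

   In this model, `id` comes from `typed_value`, `name` falls back to its own residual `value`, `gone` is dropped as missing, and `extra` is carried over from the residual object.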
   
   The Parquet codec lives in arrow-go's `parquet/variant` (already in
`go.mod`); the spec is in
[Parquet Variant shredding](https://github.com/apache/parquet-format/blob/master/VariantShredding.md).
Add a new file `table/internal/variant_shredded.go` carrying a pure
`ReassembleShreddedVariant(metadata, value, typedValue) (variant.Value, error)`
that walks the shredded struct per spec, then call it from
`table/internal/parquet_files.go` so a shredded variant column reads
identically to a non-shredded one. A shredded column should be invisible to the
scanner.
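
   One way the call site could decide which path to take is to classify the Parquet group by its child field names. The sketch below is a hypothetical helper, not existing iceberg-go code: `variantLayout` and `classifyVariantGroup` are made-up names, and it assumes the two layouts are exactly the ones named in this issue (`metadata` + `value`, optionally with `typed_value`).

```go
package main

import "fmt"

// variantLayout is a hypothetical tag for how a variant column is stored.
type variantLayout int

const (
	layoutUnshredded variantLayout = iota // struct<metadata, value>
	layoutShredded                        // struct<metadata, value, typed_value>
)

// classifyVariantGroup inspects the child names of a Parquet group and
// reports which read path applies. Assumption: both layouts carry
// `metadata` and `value`, as described in this issue.
func classifyVariantGroup(children []string) (variantLayout, error) {
	seen := make(map[string]bool, len(children))
	for _, c := range children {
		seen[c] = true
	}
	if !seen["metadata"] || !seen["value"] {
		return 0, fmt.Errorf("not a variant group: children %v", children)
	}
	if seen["typed_value"] {
		return layoutShredded, nil
	}
	return layoutUnshredded, nil
}

func main() {
	for _, g := range [][]string{
		{"metadata", "value"},
		{"metadata", "value", "typed_value"},
		{"a", "b"},
	} {
		layout, err := classifyVariantGroup(g)
		fmt.Println(g, layout, err)
	}
}
```

   With a check like this at the read boundary, only the shredded layout would be routed through the reassembly function, and everything above that boundary would see the same `variant.Value` in both cases.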
   
   Cross-client coverage: a Java-produced shredded variant fixture committed 
under `table/internal/testdata/` and a golden test asserting iceberg-go reads 
the same `variant.Value`. Java apache/iceberg PR landing the writer: 
[apache/iceberg#11500](https://github.com/apache/iceberg/pull/11500) and 
follow-ups.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]