MichielMortier opened a new issue, #673:
URL: https://github.com/apache/arrow-go/issues/673

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   **Package:** `github.com/apache/arrow/go/v18/parquet/pqarrow`
   
   **Summary:**  
   `SchemaField.IsLeaf()` is documented/implemented as `ColIndex != -1`, but 
the library never sets `ColIndex` to `-1` for non-leaf (nested) fields. As a 
result, non-leaf nodes can have a non-negative `ColIndex`, so `IsLeaf()` 
returns `true` when it should return `false`.
   
   **Expected behavior:**  
   - Leaf fields (actual Parquet columns): `ColIndex >= 0`, `IsLeaf() == true`.
   - Non-leaf fields (nested/parent nodes): `ColIndex == -1` (or similar sentinel), `IsLeaf() == false`.
   
   **Actual behavior:**  
   Non-leaf fields often keep a default or otherwise incorrect `ColIndex` (e.g. 0 or 
another non-negative value). As a result, `IsLeaf()` can return `true` for parent 
nodes, and code that relies on `IsLeaf()` to decide “is this a physical column?” 
gets wrong results.
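
A minimal sketch of the failure mode, not a reproduction against the library itself. The `field` type and its `isLeaf` method are hypothetical stand-ins that copy only the `Children` and `ColIndex` members and the `ColIndex != -1` predicate described above. It assumes the parent's `ColIndex` is simply never assigned, so it keeps Go's zero value:

```go
package main

import "fmt"

// field is a hypothetical stand-in for pqarrow.SchemaField, reduced to
// the members relevant to this report.
type field struct {
	Name     string
	Children []field
	ColIndex int
}

// isLeaf mirrors the predicate described in this report: ColIndex != -1.
func (f field) isLeaf() bool { return f.ColIndex != -1 }

func main() {
	// ColIndex is not assigned on the parent, so it holds the zero value 0.
	parent := field{
		Name: "address",
		Children: []field{
			{Name: "street", ColIndex: 0},
			{Name: "city", ColIndex: 1},
		},
	}
	fmt.Println(parent.isLeaf(), len(parent.Children)) // prints "true 2"

	// With an explicit -1 sentinel the same predicate gives the expected answer.
	parent.ColIndex = -1
	fmt.Println(parent.isLeaf()) // prints "false"
}
```

The first line of output shows a node with two children being reported as a leaf, which is the mismatch described here.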
   
   **Workaround:**  
   Determine “leaf” by structure instead of `IsLeaf()`: treat a field as a leaf 
when it has no children, e.g. `len(f.Children) == 0`. Only use `f.ColIndex` for 
such nodes when collecting column indices.
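
The workaround could look roughly like the sketch below. `schemaField`, `isStructuralLeaf`, and `leafColumnIndices` are hypothetical names introduced for illustration. The struct mirrors only the `Children` and `ColIndex` members of `pqarrow.SchemaField`, so the example runs without the library. With the real type, the same loop would apply to whatever slice of fields the caller already holds.

```go
package main

import "fmt"

// schemaField is a hypothetical stand-in that mirrors only the members
// of pqarrow.SchemaField the workaround relies on.
type schemaField struct {
	Name     string
	Children []schemaField
	ColIndex int
}

// isStructuralLeaf treats a field as a leaf when it has no children,
// ignoring ColIndex entirely.
func isStructuralLeaf(f schemaField) bool {
	return len(f.Children) == 0
}

// leafColumnIndices walks the tree depth-first and reads ColIndex only
// from childless nodes.
func leafColumnIndices(fields []schemaField) []int {
	var out []int
	for _, f := range fields {
		if isStructuralLeaf(f) {
			out = append(out, f.ColIndex)
			continue
		}
		out = append(out, leafColumnIndices(f.Children)...)
	}
	return out
}

func main() {
	fields := []schemaField{
		{Name: "id", ColIndex: 0},
		{
			// Parent with a misleading non-negative ColIndex.
			Name:     "address",
			ColIndex: 0,
			Children: []schemaField{
				{Name: "street", ColIndex: 1},
				{Name: "city", ColIndex: 2},
			},
		},
	}
	fmt.Println(leafColumnIndices(fields)) // prints "[0 1 2]"
}
```

The parent's `ColIndex` is never read, so a stale or default value on a nested node cannot add a duplicate or wrong column index to the result.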
   
   
   ### Component(s)
   
   Parquet

