hypsakata opened a new issue, #48478:
URL: https://github.com/apache/arrow/issues/48478

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   When creating an `Arrow::Table` from a Ruby Hash, if a column contains 
nested arrays consisting solely of non-negative Integer values, the column is 
incorrectly inferred as `string` (utf8) instead of a list of integers.
   
   However, if a negative integer is present in the data, the column is 
correctly inferred as a list type.
   
   #### Analysis (suspected root cause)
   
   It appears the issue lies within `red-arrow/lib/arrow/array-builder.rb`.
   `detect_builder_info()` returns `UIntArrayBuilder` with `detected: false` 
for non-negative Integers (presumably to allow upgrading to a signed type if a 
negative value appears later).
   
   In the case of Arrays, a `ListArrayBuilder` seems to be constructed only 
when `sub_builder_info[:detected]` is `true`. Consequently, nested arrays 
containing only non-negative integers fail to produce a list type, causing the 
column to fall back to `string` (utf8).
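   
   The suspected flow can be sketched in plain Ruby. This is a simplified, 
hypothetical model and not the red-arrow source: the symbols `:uint`, `:int`, 
`:list`, and `:string`, the `infer_column_type` helper, and the `strict:` 
keyword are all assumptions made for illustration. Only the 
`detected: false` result for non-negative Integers and the 
`sub_builder_info[:detected]` check come from the analysis above.
   
   ```ruby
   # Hypothetical model of the inference described above (NOT red-arrow code).
   # strict: true mimics the suspected behavior, where a list builder is only
   # created when the element builder reports detected: true.
   def detect_builder_info(value, strict: true)
     case value
     when Integer
       if value.negative?
         { builder: :int, detected: true }
       else
         # Tentative result: a later negative value could still upgrade it.
         { builder: :uint, detected: false }
       end
     when Array
       sub_builder_info = nil
       value.each do |element|
         sub_builder_info = detect_builder_info(element, strict: strict)
         break if sub_builder_info[:detected]
       end
       usable = sub_builder_info && sub_builder_info[:builder] &&
                (sub_builder_info[:detected] || !strict)
       if usable
         { builder: [:list, sub_builder_info[:builder]],
           detected: sub_builder_info[:detected] }
       else
         { builder: nil, detected: false }
       end
     else
       { builder: nil, detected: false }
     end
   end

   # Hypothetical column-level helper: falls back to :string when no builder
   # was produced for any value in the column.
   def infer_column_type(values, strict: true)
     info = nil
     values.each do |value|
       info = detect_builder_info(value, strict: strict)
       break if info[:detected]
     end
     (info && info[:builder]) || :string
   end

   p infer_column_type([1, 2])
   p infer_column_type([[0, 1, 2], [3, 4]])
   p infer_column_type([[0, -1, 2], [3, 4]])
   p infer_column_type([[0, 1, 2], [3, 4]], strict: false)
   ```
   
   In this model, the flat column keeps its tentative unsigned builder, while 
the nested non-negative column drops to `:string` because the tentative 
element builder is discarded. Passing `strict: false` shows one possible 
relaxation (accepting the tentative element builder); whether that matches a 
suitable fix in red-arrow is not verified here.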
   
   ### Steps to reproduce the bug
   
   ```ruby
   require "arrow"
   
   # Case 1: Only non-negative integers (Bug)
   p Arrow::Table.new({ id: [1, 2], values: [[0, 1, 2], [3, 4]] }).schema
   # Actual: values is inferred as string (utf8)
   # Output:
   # #<Arrow::Schema:... id: uint8 
   # values: string>
   ```
   
   ```ruby
   require "arrow"
   
   # Case 2: Contains a negative integer (Works as expected)
   p Arrow::Table.new({ id: [1, 2], values: [[0, -1, 2], [3, 4]] }).schema
   # Actual: values is inferred as list<int8>
   # Output:
   # #<Arrow::Schema:... id: uint8
   # values: list<item: int8>>
   ```
   
   ### Expected behavior
   
   `values` should be inferred as a list of integers (e.g. `list<item: int*>`), 
not `string`, even when all integers are non-negative. (The exact integer bit 
width may vary.)
   
   ### Actual behavior
   
   When all integers are non-negative, `values` is inferred as `string` (utf8). 
Adding a negative integer results in the correct list type inference.
   
   ### Environment
   
   - OS: macOS 26.1
   - CPU arch: Apple M4 Pro
   - Ruby: 3.4.7
   - Gems: red-arrow 22.0.0
   - Arrow installation method: Homebrew
   
   
   ### Component(s)
   
   Ruby

