This is an automated email from the ASF dual-hosted git repository.

kou pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow.git


The following commit(s) were added to refs/heads/main by this push:
     new d89c14b5d5 GH-48478: [Ruby] Fix Ruby list inference for nested non-negative integer arrays (#48584)
d89c14b5d5 is described below

commit d89c14b5d5203bc403fb62060fdf1ef2c0a49339
Author: hypsakata <[email protected]>
AuthorDate: Sat Dec 20 21:49:01 2025 +0900

    GH-48478: [Ruby] Fix Ruby list inference for nested non-negative integer arrays (#48584)
    
    ### Rationale for this change
    
    When building an `Arrow::Table` from a Ruby Hash passed to `Arrow::Table.new`, nested `Integer` arrays are incorrectly inferred as `string` (utf8) if all values are non-negative. This behavior is unexpected; nested integer arrays should be consistently represented as a list type (e.g., `list<item: uint*>` or `list<item: int*>`) rather than falling back to UTF-8 strings.
    
    ### What changes are included in this PR?
    
    This PR modifies the logic in `detect_builder_info()`, specifically the `when ::Array` block, to correctly identify nested non-negative integer arrays as list arrays.
    
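    The change ensures that if `sub_builder_info` contains a valid `:builder`, it will be used even if `sub_builder_info` does not yet indicate that the type has been "detected." The resulting list builder info now carries over the `:detected` value of `sub_builder_info` instead of always setting it to `true`.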
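
The effect of the change can be sketched with a simplified, pure-Ruby model of the inference. This is not the red-arrow code: `detect_info`, `infer`, and the `fixed:` flag are hypothetical names, symbols such as `:uint` stand in for real builders, integer width tracking is omitted, and the `:string` fallback for "no builder found" is an assumption based on the behavior described above.

```ruby
# Hypothetical model of builder inference; symbols stand in for builders.
def detect_info(value, info, fixed:)
  case value
  when Integer
    if value < 0 || (info && info[:builder] == :int)
      {builder: :int, detected: true}
    else
      # A later negative value could still require :int,
      # so the type is not marked as detected yet.
      {builder: :uint}
    end
  when Array
    sub_info = nil
    value.each do |sub_value|
      sub_info = detect_info(sub_value, sub_info, fixed: fixed)
      break if sub_info && sub_info[:detected]
    end
    if fixed
      # New behavior: use the sub-builder even if it is not "detected",
      # and carry its :detected value over to the list info.
      return info unless sub_info && sub_info[:builder]
      {builder: [:list, sub_info[:builder]], detected: sub_info[:detected]}
    elsif sub_info && sub_info[:detected]
      # Old behavior: only a "detected" sub-builder yields a list.
      {builder: [:list, sub_info[:builder]], detected: true}
    else
      info
    end
  else
    info
  end
end

def infer(values, fixed:)
  info = nil
  values.each do |value|
    info = detect_info(value, info, fixed: fixed)
    break if info && info[:detected]
  end
  # Assumed fallback: no builder information means a string column.
  info ? info[:builder] : :string
end

p infer([[0, 1, 2], [3, 4]], fixed: false)  # old logic, all non-negative
p infer([[0, 1, 2], [3, 4]], fixed: true)   # new logic, all non-negative
p infer([[0, -1, 2], [3, 4]], fixed: true)  # new logic, one negative value
```

In this model the old logic never produces a list for all non-negative input, because an unsigned sub-builder is never marked as detected; the new logic returns a list of unsigned integers for the same input.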
    
    ### Are these changes tested?
    
    Yes. (`ruby ruby/red-arrow/test/run-test.rb`)
    
    ### Are there any user-facing changes?
    
    Yes.
    
    GitHub Issue: Closes #48478
    * GitHub Issue: #48478
    
    Authored-by: hypsakata <[email protected]>
    Signed-off-by: Sutou Kouhei <[email protected]>
---
 ruby/red-arrow/lib/arrow/array-builder.rb |  8 ++++---
 ruby/red-arrow/test/test-array-builder.rb | 40 +++++++++++++++++++++++++++++++
 2 files changed, 45 insertions(+), 3 deletions(-)

diff --git a/ruby/red-arrow/lib/arrow/array-builder.rb b/ruby/red-arrow/lib/arrow/array-builder.rb
index 876fd71120..2ccf50f3c1 100644
--- a/ruby/red-arrow/lib/arrow/array-builder.rb
+++ b/ruby/red-arrow/lib/arrow/array-builder.rb
@@ -155,12 +155,14 @@ module Arrow
             sub_builder_info = detect_builder_info(sub_value, sub_builder_info)
             break if sub_builder_info and sub_builder_info[:detected]
           end
-          if sub_builder_info and sub_builder_info[:detected]
-            sub_value_data_type = sub_builder_info[:builder].value_data_type
+          if sub_builder_info
+            sub_builder = sub_builder_info[:builder]
+            return builder_info unless sub_builder
+            sub_value_data_type = sub_builder.value_data_type
             field = Field.new("item", sub_value_data_type)
             {
               builder: ListArrayBuilder.new(ListDataType.new(field)),
-              detected: true,
+              detected: sub_builder_info[:detected],
             }
           else
             builder_info
diff --git a/ruby/red-arrow/test/test-array-builder.rb b/ruby/red-arrow/test/test-array-builder.rb
index fb48aba8a4..7a2d42e54b 100644
--- a/ruby/red-arrow/test/test-array-builder.rb
+++ b/ruby/red-arrow/test/test-array-builder.rb
@@ -146,6 +146,46 @@ class ArrayBuilderTest < Test::Unit::TestCase
                        ["Apache Arrow"],
                      ])
       end
+
+      test("list<uint>s") do
+        values = [
+          [0, 1, 2],
+          [3, 4],
+        ]
+        array = Arrow::Array.new(values)
+        data_type = Arrow::ListDataType.new(Arrow::UInt8DataType.new)
+        assert_equal({
+                       data_type: data_type,
+                       values: [
+                         [0, 1, 2],
+                         [3, 4],
+                       ],
+                     },
+                     {
+                       data_type: array.value_data_type,
+                       values: array.to_a,
+                     })
+      end
+
+      test("list<int>s") do
+        values = [
+          [0, -1, 2],
+          [3, 4],
+        ]
+        array = Arrow::Array.new(values)
+        data_type = Arrow::ListDataType.new(Arrow::Int8DataType.new)
+        assert_equal({
+                       data_type: data_type,
+                       values: [
+                         [0, -1, 2],
+                         [3, 4],
+                       ],
+                     },
+                     {
+                       data_type: array.value_data_type,
+                       values: array.to_a,
+                     })
+      end
     end
 
     sub_test_case("specific builder") do
