This is an automated email from the ASF dual-hosted git repository.
kou pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow.git
The following commit(s) were added to refs/heads/main by this push:
new d89c14b5d5 GH-48478: [Ruby] Fix Ruby list inference for nested
non-negative integer arrays (#48584)
d89c14b5d5 is described below
commit d89c14b5d5203bc403fb62060fdf1ef2c0a49339
Author: hypsakata <[email protected]>
AuthorDate: Sat Dec 20 21:49:01 2025 +0900
GH-48478: [Ruby] Fix Ruby list inference for nested non-negative integer
arrays (#48584)
### Rationale for this change
When building an `Arrow::Table` from a Ruby Hash passed to
`Arrow::Table.new`, nested `Integer` arrays are incorrectly inferred as
`string` (utf8) if all values are non-negative. This behavior is unexpected;
nested integer arrays should be consistently represented as a list type (e.g.,
`list<item: uint*>` or `list<item: int*>`) rather than falling back to UTF-8
strings.
### What changes are included in this PR?
This PR modifies the logic in `detect_builder_info()`, specifically the
`when ::Array` block, to correctly identify nested non-negative integer arrays
as list arrays.
With this change, a `sub_builder_info` that carries a `:builder` is used even
when its `:detected` flag is not yet set. The resulting list builder info
passes that flag through (`detected: sub_builder_info[:detected]`) instead of
always reporting `detected: true`.
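
The effect of the change can be sketched with a toy model in plain Ruby. This is not the red-arrow implementation: symbols such as `:uint` stand in for real builder objects, the `legacy:` keyword and the `infer` helper are hypothetical names added only to compare the old and new `when ::Array` logic, and the string fallback in `infer` is an assumption based on the behavior described in the rationale above.

```ruby
# Toy model of list inference; symbols stand in for real Arrow builders.
# `legacy: true` mimics the old check that required `:detected` to be set.
def detect_builder_info(value, builder_info, legacy: false)
  case value
  when nil
    builder_info
  when String
    {builder: :string, detected: true}
  when Integer
    if value < 0
      # A negative value settles the type as signed.
      {builder: :int, detected: true}
    else
      # A later negative value could still change the type,
      # so this result is not marked as detected.
      {builder: :uint}
    end
  when Array
    sub_info = nil
    value.each do |sub_value|
      sub_info = detect_builder_info(sub_value, sub_info, legacy: legacy)
      break if sub_info && sub_info[:detected]
    end
    return builder_info unless sub_info && sub_info[:builder]
    return builder_info if legacy && !sub_info[:detected]
    {builder: [:list, sub_info[:builder]], detected: sub_info[:detected]}
  else
    builder_info
  end
end

# Hypothetical driver: scan values until a type is settled,
# falling back to :string when nothing was inferred (an assumption).
def infer(values, legacy: false)
  info = nil
  values.each do |value|
    info = detect_builder_info(value, info, legacy: legacy)
    break if info && info[:detected]
  end
  info ? info[:builder] : :string
end

p infer([[0, 1, 2], [3, 4]], legacy: true)
p infer([[0, 1, 2], [3, 4]])
p infer([[0, -1, 2], [3, 4]])
```

In this model the old check discards the undetected `:uint` result, so non-negative nested arrays end up at the string fallback, while the new check keeps the builder and leaves room for a later negative value to switch the item type to `:int`.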
### Are these changes tested?
Yes. (`ruby ruby/red-arrow/test/run-test.rb`)
### Are there any user-facing changes?
Yes. Nested non-negative integer arrays are now inferred as a list type rather
than `string` (utf8).
GitHub Issue: Closes #48478
* GitHub Issue: #48478
Authored-by: hypsakata <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>
---
ruby/red-arrow/lib/arrow/array-builder.rb | 8 ++++---
ruby/red-arrow/test/test-array-builder.rb | 40 +++++++++++++++++++++++++++++++
2 files changed, 45 insertions(+), 3 deletions(-)
diff --git a/ruby/red-arrow/lib/arrow/array-builder.rb b/ruby/red-arrow/lib/arrow/array-builder.rb
index 876fd71120..2ccf50f3c1 100644
--- a/ruby/red-arrow/lib/arrow/array-builder.rb
+++ b/ruby/red-arrow/lib/arrow/array-builder.rb
@@ -155,12 +155,14 @@ module Arrow
sub_builder_info = detect_builder_info(sub_value, sub_builder_info)
break if sub_builder_info and sub_builder_info[:detected]
end
- if sub_builder_info and sub_builder_info[:detected]
- sub_value_data_type = sub_builder_info[:builder].value_data_type
+ if sub_builder_info
+ sub_builder = sub_builder_info[:builder]
+ return builder_info unless sub_builder
+ sub_value_data_type = sub_builder.value_data_type
field = Field.new("item", sub_value_data_type)
{
builder: ListArrayBuilder.new(ListDataType.new(field)),
- detected: true,
+ detected: sub_builder_info[:detected],
}
else
builder_info
diff --git a/ruby/red-arrow/test/test-array-builder.rb b/ruby/red-arrow/test/test-array-builder.rb
index fb48aba8a4..7a2d42e54b 100644
--- a/ruby/red-arrow/test/test-array-builder.rb
+++ b/ruby/red-arrow/test/test-array-builder.rb
@@ -146,6 +146,46 @@ class ArrayBuilderTest < Test::Unit::TestCase
["Apache Arrow"],
])
end
+
+ test("list<uint>s") do
+ values = [
+ [0, 1, 2],
+ [3, 4],
+ ]
+ array = Arrow::Array.new(values)
+ data_type = Arrow::ListDataType.new(Arrow::UInt8DataType.new)
+ assert_equal({
+ data_type: data_type,
+ values: [
+ [0, 1, 2],
+ [3, 4],
+ ],
+ },
+ {
+ data_type: array.value_data_type,
+ values: array.to_a,
+ })
+ end
+
+ test("list<int>s") do
+ values = [
+ [0, -1, 2],
+ [3, 4],
+ ]
+ array = Arrow::Array.new(values)
+ data_type = Arrow::ListDataType.new(Arrow::Int8DataType.new)
+ assert_equal({
+ data_type: data_type,
+ values: [
+ [0, -1, 2],
+ [3, 4],
+ ],
+ },
+ {
+ data_type: array.value_data_type,
+ values: array.to_a,
+ })
+ end
end
sub_test_case("specific builder") do