darcysaum-toast opened a new issue, #131: URL: https://github.com/apache/arrow-java/issues/131
### Describe the bug, including details regarding any error messages, version, and platform.

Hello Arrow team,

I believe the Avro adapter fails on Avro union types nested in records and arrays. The allocator tries to allocate increasingly more memory until it reaches the configured maximum. For example, you can pass `-Darrow.vector.max_allocation_bytes=8589934592` to raise the limit to 8 GiB, and it will try to allocate all of it for an empty array field with a nested union type field. With the limit at 1048576 bytes, the error is:

`Memory required for vector capacity 508400 is (2097152), which is more than max allowed (1048576)`

I believe this is because the while loop [here](https://github.com/apache/arrow/blob/main/java/adapter/avro/src/main/java/org/apache/arrow/adapter/avro/consumers/AvroStructConsumer.java#L69-L75) never terminates, as `UnionVector::getValueCapacity` always returns 0 when the union is nested in either a record or an array.

I have a branch [here](https://github.com/darcysaum-toast/arrow/tree/avro-nested-union) with a [failing test](https://github.com/darcysaum-toast/arrow/blob/avro-nested-union/java/adapter/avro/src/test/java/org/apache/arrow/adapter/avro/AvroToArrowIteratorTest.java#L171).
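To illustrate the suspected failure mode without depending on Arrow itself, here is a minimal, self-contained sketch. It assumes the linked loop has the general shape "reallocate while the reported capacity is below the target". `StuckVector`, its 1024-byte starting size, and `reallocsUntilFailure` are hypothetical stand-ins invented for this illustration, not Arrow classes or the adapter's actual code.

```java
class CapacityLoopSketch {

    /** Hypothetical stand-in for a vector whose reported capacity never grows. */
    static final class StuckVector {
        long allocatedBytes = 1024; // assumed starting size, for illustration only

        /** Mimics the reported symptom: the capacity is always 0. */
        int getValueCapacity() {
            return 0;
        }

        /** Doubles the allocation, failing once the next size would exceed maxBytes. */
        void reAlloc(long maxBytes) {
            long next = allocatedBytes * 2;
            if (next > maxBytes) {
                throw new IllegalStateException(
                    "Memory required (" + next + ") is more than max allowed (" + maxBytes + ")");
            }
            allocatedBytes = next;
        }
    }

    /** Runs the capacity loop and returns how many reallocations succeeded. */
    static int reallocsUntilFailure(long targetCapacity, long maxBytes) {
        StuckVector v = new StuckVector();
        int reallocs = 0;
        try {
            // Because getValueCapacity() stays at 0, this condition never becomes
            // false for a positive target; only the allocation limit stops the loop.
            while (v.getValueCapacity() < targetCapacity) {
                v.reAlloc(maxBytes);
                reallocs++;
            }
        } catch (IllegalStateException e) {
            // The limit was hit; fall through and report the count.
        }
        return reallocs;
    }

    public static void main(String[] args) {
        // 1024 bytes doubled 10 times reaches 1048576; the 11th doubling would need 2097152.
        System.out.println(reallocsUntilFailure(1, 1L << 20)); // prints 10
    }
}
```

In this sketch, raising `maxBytes` only postpones the failure, which would be consistent with the allocator consuming whatever limit is configured.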
The test is as follows:

```
@Test
public void testNestedUnion() throws Exception {
  Schema schema = getSchema("test_nested_union.avsc");
  Schema topChildSchema = schema
      .getField("f0")
      .schema();

  // Build a parent record whose only field is a nested record
  // containing the ["null", "int"] union field.
  GenericRecord parent = new GenericData.Record(schema);
  GenericRecord topChild = new GenericData.Record(topChildSchema);
  topChild.put("fNestedUnion", 1);
  parent.put("f0", topChild);

  List<VectorSchemaRoot> roots = new ArrayList<>();
  try (AvroToArrowVectorIterator iterator = convert(schema, singletonList(parent))) {
    while (iterator.hasNext()) {
      roots.add(iterator.next());
      System.out.println(roots.get(roots.size() - 1).contentToTSVString());
    }
  }
}
```

The [schema](https://github.com/darcysaum-toast/arrow/blob/avro-nested-union/java/adapter/avro/src/test/resources/schema/test_nested_union.avsc) is as follows:

```
{
  "namespace": "org.apache.arrow.avro",
  "type": "record",
  "name": "testArrayOfRecords",
  "fields": [
    {
      "name": "f0",
      "type": {
        "type": "record",
        "name": "NestedRecord",
        "fields": [
          { "name": "fNestedUnion", "type": ["null", "int"], "default": null }
        ]
      }
    }
  ]
}
```

Changing the type of field `fNestedUnion` from `["null", "int"]` to `"int"` makes the test pass (and the iterator correctly prints out the Arrow data).

### Component(s)

Java