ellisandrews-toast opened a new issue, #137: URL: https://github.com/apache/arrow-java/issues/137
### Describe the bug, including details regarding any error messages, version, and platform.

I am trying to convert from avro format to arrow using the [AvroToArrow](https://github.com/apache/arrow/blob/main/java/adapter/avro/src/main/java/org/apache/arrow/AvroToArrow.java) class. Here is the documentation I am following: https://arrow.apache.org/cookbook/java/avro.html

There is a bug where this does not work for nested avro schemas. Here is an example:

`NestedRecordExample.avsc`

```
{
  "namespace": "example.nested",
  "type": "record",
  "name": "NestedRecordExample",
  "fields": [
    { "name": "id", "type": "int" },
    {
      "name": "NestedRecord1",
      "type": {
        "name": "NestedRecord1",
        "type": "record",
        "doc": "One layer deep nested record",
        "fields": [
          { "name": "id", "type": "int" }
        ]
      }
    },
    {
      "name": "NestedRecord2",
      "type": {
        "name": "NestedRecord2",
        "type": "record",
        "doc": "One layer deep nested record that contains a sub record",
        "fields": [
          { "name": "id", "type": "int" },
          {
            "name": "NestedRecord3",
            "type": {
              "name": "NestedRecord3",
              "type": "record",
              "doc": "two layer deep nested record",
              "fields": [
                { "name": "id", "type": "int" }
              ]
            }
          }
        ]
      }
    },
    {
      "name": "NestedRecord4",
      "type": {
        "name": "NestedRecord4",
        "type": "record",
        "doc": "One layer deep nested record that contains an array of sub records",
        "fields": [
          { "name": "id", "type": "int" },
          {
            "name": "NestedRecords",
            "type": {
              "name": "NestedRecords",
              "type": "array",
              "items": {
                "name": "NestedRecord5",
                "type": "record",
                "fields": [
                  { "name": "id", "type": "int" }
                ]
              }
            }
          }
        ]
      }
    }
  ]
}
```

I wrote a single record using the above schema to a file `nested_record_example.avro` with the following data:

```json
{
  "id": 1,
  "NestedRecord1": {
    "id": 2
  },
  "NestedRecord2": {
    "id": 3,
    "NestedRecord3": {
      "id": 4
    }
  },
  "NestedRecord4": {
    "id": 5,
    "NestedRecords": [
      {
        "id": 6
      }
    ]
  }
}
```

I then copied the boilerplate AvroToArrow code from the [documentation](https://arrow.apache.org/cookbook/java/avro.html):

```java
// Copied code from arrow docs: https://arrow.apache.org/cookbook/java/avro.html
BinaryDecoder decoder = new DecoderFactory().binaryDecoder(new FileInputStream("nested_record_example.avro"), null);
Schema schema = new Schema.Parser().parse(new File("NestedRecordExample.avsc"));

try (BufferAllocator allocator = new RootAllocator()) {
    AvroToArrowConfig config = new AvroToArrowConfigBuilder(allocator).build();
    try (AvroToArrowVectorIterator avroToArrowVectorIterator = AvroToArrow.avroToArrowIterator(schema, decoder, config)) {
        while(avroToArrowVectorIterator.hasNext()) {
            try (VectorSchemaRoot root = avroToArrowVectorIterator.next()) {
                System.out.print(root.contentToTSVString());
            }
        }
    }
}
```

This results in an error from the `AvroToArrow.avroToArrowIterator()` call:

```
java.lang.RuntimeException: Error occurs while creating iterator.
    at org.apache.arrow.AvroToArrowVectorIterator.create(AvroToArrowVectorIterator.java:85)
    at org.apache.arrow.AvroToArrow.avroToArrowIterator(AvroToArrow.java:65)
    at [ My Project calling AvroToArrow.avroToArrowIterator() ]
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
    at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
    at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
    at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
    at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
    at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:108)
    at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
    at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:40)
    at org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:60)
    at org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:52)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:36)
    at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
    at org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:33)
    at org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:94)
    at com.sun.proxy.$Proxy5.processTestClass(Unknown Source)
    at org.gradle.api.internal.tasks.testing.worker.TestWorker$2.run(TestWorker.java:176)
    at org.gradle.api.internal.tasks.testing.worker.TestWorker.executeAndMaintainThreadName(TestWorker.java:129)
    at org.gradle.api.internal.tasks.testing.worker.TestWorker.execute(TestWorker.java:100)
    at org.gradle.api.internal.tasks.testing.worker.TestWorker.execute(TestWorker.java:60)
    at org.gradle.process.internal.worker.child.ActionExecutionWorker.execute(ActionExecutionWorker.java:56)
    at org.gradle.process.internal.worker.child.SystemApplicationClassLoaderWorker.call(SystemApplicationClassLoaderWorker.java:113)
    at org.gradle.process.internal.worker.child.SystemApplicationClassLoaderWorker.call(SystemApplicationClassLoaderWorker.java:65)
    at worker.org.gradle.process.internal.worker.GradleWorkerMain.run(GradleWorkerMain.java:69)
    at worker.org.gradle.process.internal.worker.GradleWorkerMain.main(GradleWorkerMain.java:74)
Caused by: java.lang.RuntimeException: Error occurs while consuming data.
    at org.apache.arrow.AvroToArrowVectorIterator.consumeData(AvroToArrowVectorIterator.java:128)
    at org.apache.arrow.AvroToArrowVectorIterator.load(AvroToArrowVectorIterator.java:147)
    at org.apache.arrow.AvroToArrowVectorIterator.initialize(AvroToArrowVectorIterator.java:98)
    at org.apache.arrow.AvroToArrowVectorIterator.create(AvroToArrowVectorIterator.java:81)
    ... 45 more
Caused by: org.apache.arrow.memory.OutOfMemoryException: Failure allocating buffer.
    at io.netty.buffer.PooledByteBufAllocatorL.allocate(PooledByteBufAllocatorL.java:67)
    at org.apache.arrow.memory.NettyAllocationManager.<init>(NettyAllocationManager.java:77)
    at org.apache.arrow.memory.NettyAllocationManager.<init>(NettyAllocationManager.java:84)
    at org.apache.arrow.memory.NettyAllocationManager$1.create(NettyAllocationManager.java:34)
    at org.apache.arrow.memory.BaseAllocator.newAllocationManager(BaseAllocator.java:355)
    at org.apache.arrow.memory.BaseAllocator.newAllocationManager(BaseAllocator.java:350)
    at org.apache.arrow.memory.BaseAllocator.bufferWithoutReservation(BaseAllocator.java:338)
    at org.apache.arrow.memory.BaseAllocator.buffer(BaseAllocator.java:316)
    at org.apache.arrow.memory.RootAllocator.buffer(RootAllocator.java:29)
    at org.apache.arrow.memory.BaseAllocator.buffer(BaseAllocator.java:280)
    at org.apache.arrow.memory.RootAllocator.buffer(RootAllocator.java:29)
    at org.apache.arrow.vector.BaseValueVector.allocFixedDataAndValidityBufs(BaseValueVector.java:224)
    at org.apache.arrow.vector.BaseFixedWidthVector.reAlloc(BaseFixedWidthVector.java:442)
    at org.apache.arrow.vector.complex.AbstractStructVector.reAlloc(AbstractStructVector.java:144)
    at org.apache.arrow.vector.complex.StructVector.reAlloc(StructVector.java:459)
    at org.apache.arrow.consumers.AvroArraysConsumer.ensureInnerVectorCapacity(AvroArraysConsumer.java:71)
    at org.apache.arrow.consumers.AvroArraysConsumer.consume(AvroArraysConsumer.java:48)
    at org.apache.arrow.consumers.AvroStructConsumer.consume(AvroStructConsumer.java:48)
    at org.apache.arrow.consumers.CompositeAvroConsumer.consume(CompositeAvroConsumer.java:48)
    at org.apache.arrow.AvroToArrowVectorIterator.consumeData(AvroToArrowVectorIterator.java:105)
    ... 48 more
Caused by: java.lang.OutOfMemoryError: Direct buffer memory
    at java.base/java.nio.Bits.reserveMemory(Bits.java:175)
    at java.base/java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:118)
    at java.base/java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:317)
    at io.netty.buffer.UnpooledDirectByteBuf.allocateDirect(UnpooledDirectByteBuf.java:104)
    at io.netty.buffer.UnpooledByteBufAllocator$InstrumentedUnpooledUnsafeDirectByteBuf.allocateDirect(UnpooledByteBufAllocator.java:215)
    at io.netty.buffer.UnpooledDirectByteBuf.<init>(UnpooledDirectByteBuf.java:64)
    at io.netty.buffer.UnpooledUnsafeDirectByteBuf.<init>(UnpooledUnsafeDirectByteBuf.java:41)
    at io.netty.buffer.UnpooledByteBufAllocator$InstrumentedUnpooledUnsafeDirectByteBuf.<init>(UnpooledByteBufAllocator.java:210)
    at io.netty.buffer.UnpooledByteBufAllocator.newDirectBuffer(UnpooledByteBufAllocator.java:91)
    at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:188)
    at io.netty.buffer.PooledByteBufAllocatorL$InnerAllocator.newDirectBufferL(PooledByteBufAllocatorL.java:171)
    at io.netty.buffer.PooledByteBufAllocatorL$InnerAllocator.directBuffer(PooledByteBufAllocatorL.java:214)
    at io.netty.buffer.PooledByteBufAllocatorL.allocate(PooledByteBufAllocatorL.java:58)
    ... 67 more

Error occurs while consuming data.
java.lang.RuntimeException: Error occurs while consuming data.
    at org.apache.arrow.AvroToArrowVectorIterator.consumeData(AvroToArrowVectorIterator.java:128)
    at org.apache.arrow.AvroToArrowVectorIterator.load(AvroToArrowVectorIterator.java:147)
    at org.apache.arrow.AvroToArrowVectorIterator.initialize(AvroToArrowVectorIterator.java:98)
    at org.apache.arrow.AvroToArrowVectorIterator.create(AvroToArrowVectorIterator.java:81)
    at org.apache.arrow.AvroToArrow.avroToArrowIterator(AvroToArrow.java:65)
    at [ My Project calling AvroToArrow.avroToArrowIterator() ]
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
    at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
    at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
    at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
    at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
    at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:108)
    at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
    at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:40)
    at org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:60)
    at org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:52)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:36)
    at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
    at org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:33)
    at org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:94)
    at com.sun.proxy.$Proxy5.processTestClass(Unknown Source)
    at org.gradle.api.internal.tasks.testing.worker.TestWorker$2.run(TestWorker.java:176)
    at org.gradle.api.internal.tasks.testing.worker.TestWorker.executeAndMaintainThreadName(TestWorker.java:129)
    at org.gradle.api.internal.tasks.testing.worker.TestWorker.execute(TestWorker.java:100)
    at org.gradle.api.internal.tasks.testing.worker.TestWorker.execute(TestWorker.java:60)
    at org.gradle.process.internal.worker.child.ActionExecutionWorker.execute(ActionExecutionWorker.java:56)
    at org.gradle.process.internal.worker.child.SystemApplicationClassLoaderWorker.call(SystemApplicationClassLoaderWorker.java:113)
    at org.gradle.process.internal.worker.child.SystemApplicationClassLoaderWorker.call(SystemApplicationClassLoaderWorker.java:65)
    at worker.org.gradle.process.internal.worker.GradleWorkerMain.run(GradleWorkerMain.java:69)
    at worker.org.gradle.process.internal.worker.GradleWorkerMain.main(GradleWorkerMain.java:74)
Caused by: org.apache.arrow.memory.OutOfMemoryException: Failure allocating buffer.
    at app//io.netty.buffer.PooledByteBufAllocatorL.allocate(PooledByteBufAllocatorL.java:67)
    at app//org.apache.arrow.memory.NettyAllocationManager.<init>(NettyAllocationManager.java:77)
    at app//org.apache.arrow.memory.NettyAllocationManager.<init>(NettyAllocationManager.java:84)
    at app//org.apache.arrow.memory.NettyAllocationManager$1.create(NettyAllocationManager.java:34)
    at app//org.apache.arrow.memory.BaseAllocator.newAllocationManager(BaseAllocator.java:355)
    at app//org.apache.arrow.memory.BaseAllocator.newAllocationManager(BaseAllocator.java:350)
    at app//org.apache.arrow.memory.BaseAllocator.bufferWithoutReservation(BaseAllocator.java:338)
    at app//org.apache.arrow.memory.BaseAllocator.buffer(BaseAllocator.java:316)
    at app//org.apache.arrow.memory.RootAllocator.buffer(RootAllocator.java:29)
    at app//org.apache.arrow.memory.BaseAllocator.buffer(BaseAllocator.java:280)
    at app//org.apache.arrow.memory.RootAllocator.buffer(RootAllocator.java:29)
    at app//org.apache.arrow.vector.BaseValueVector.allocFixedDataAndValidityBufs(BaseValueVector.java:224)
    at app//org.apache.arrow.vector.BaseFixedWidthVector.reAlloc(BaseFixedWidthVector.java:442)
    at app//org.apache.arrow.vector.complex.AbstractStructVector.reAlloc(AbstractStructVector.java:144)
    at app//org.apache.arrow.vector.complex.StructVector.reAlloc(StructVector.java:459)
    at app//org.apache.arrow.consumers.AvroArraysConsumer.ensureInnerVectorCapacity(AvroArraysConsumer.java:71)
    at app//org.apache.arrow.consumers.AvroArraysConsumer.consume(AvroArraysConsumer.java:48)
    at app//org.apache.arrow.consumers.AvroStructConsumer.consume(AvroStructConsumer.java:48)
    at app//org.apache.arrow.consumers.CompositeAvroConsumer.consume(CompositeAvroConsumer.java:48)
    at app//org.apache.arrow.AvroToArrowVectorIterator.consumeData(AvroToArrowVectorIterator.java:105)
    ... 48 more
Caused by: java.lang.OutOfMemoryError: Direct buffer memory
    at java.base/java.nio.Bits.reserveMemory(Bits.java:175)
    at java.base/java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:118)
    at java.base/java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:317)
    at io.netty.buffer.UnpooledDirectByteBuf.allocateDirect(UnpooledDirectByteBuf.java:104)
    at io.netty.buffer.UnpooledByteBufAllocator$InstrumentedUnpooledUnsafeDirectByteBuf.allocateDirect(UnpooledByteBufAllocator.java:215)
    at io.netty.buffer.UnpooledDirectByteBuf.<init>(UnpooledDirectByteBuf.java:64)
    at io.netty.buffer.UnpooledUnsafeDirectByteBuf.<init>(UnpooledUnsafeDirectByteBuf.java:41)
    at io.netty.buffer.UnpooledByteBufAllocator$InstrumentedUnpooledUnsafeDirectByteBuf.<init>(UnpooledByteBufAllocator.java:210)
    at io.netty.buffer.UnpooledByteBufAllocator.newDirectBuffer(UnpooledByteBufAllocator.java:91)
    at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:188)
    at io.netty.buffer.PooledByteBufAllocatorL$InnerAllocator.newDirectBufferL(PooledByteBufAllocatorL.java:171)
    at io.netty.buffer.PooledByteBufAllocatorL$InnerAllocator.directBuffer(PooledByteBufAllocatorL.java:214)
    at io.netty.buffer.PooledByteBufAllocatorL.allocate(PooledByteBufAllocatorL.java:58)
    ... 67 more
```

### Component(s)

Java
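
For anyone triaging the trace above: the exception chain is several levels deep and ends in `java.lang.OutOfMemoryError: Direct buffer memory`. The following is only a minimal sketch of a helper that walks `Throwable.getCause()` to surface that innermost cause; it is not part of Arrow or of the reporter's project. The class name `RootCause` and the stand-in exceptions are hypothetical, and Arrow's `OutOfMemoryException` is replaced by a plain `RuntimeException` so the example needs only the JDK.

```java
public class RootCause {

    // Follow getCause() links until the innermost throwable is reached.
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        // Stand-in chain mirroring the messages in the reported trace.
        // The third level is Arrow's OutOfMemoryException in the real trace;
        // a RuntimeException is used here to avoid the Arrow dependency.
        Throwable chain = new RuntimeException("Error occurs while creating iterator.",
            new RuntimeException("Error occurs while consuming data.",
                new RuntimeException("Failure allocating buffer.",
                    new OutOfMemoryError("Direct buffer memory"))));

        Throwable root = rootCause(chain);
        System.out.println(root.getClass().getName() + ": " + root.getMessage());
    }
}
```

Wrapping the `AvroToArrow.avroToArrowIterator()` call in a `try`/`catch` and passing the caught exception to such a helper would presumably print only the last `Caused by` entry instead of the full trace.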