ejf-ibm opened a new issue, #44888: URL: https://github.com/apache/arrow/issues/44888
### Describe the bug, including details regarding any error messages, version, and platform.

I'm currently hitting a segfault when reading in AWS CUR 2.0 billing files and writing out partitioned Arrow files. The segfault is reproducible in our containerized dev environment (k8s deployment), but I have not been able to reproduce it through a synthetic functional test. I'm not able to include the entire heap dump, but the call stack and some relevant information are included below.

```
mediation-awscloudbilling-58b58d Unhandled exception
mediation-awscloudbilling-58b58d Type=Segmentation error vmState=0x00040000
mediation-awscloudbilling-58b58d J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000001
mediation-awscloudbilling-58b58d Handler1=00007F4FD22916B0 Handler2=00007F4FD21E6730 InaccessibleAddress=00007F4958F20A2F
mediation-awscloudbilling-58b58d RDI=00007F4958F20A27 RSI=0000000000000002 RAX=3A5987A4DE9A52E2 RBX=00007F4958F209F7
mediation-awscloudbilling-58b58d RCX=3A5987A4DE9A52E2 RDX=0000000000000000 R8=648C56E80D97FBD5 R9=00007F4F6C00A200
mediation-awscloudbilling-58b58d R10=00000000C70F6907 R11=00007F4F6C000020 R12=00007F4F798F3700 R13=00007F4F798F3940
mediation-awscloudbilling-58b58d R14=00007F4F798F3F40 R15=00007F4F798F3700
mediation-awscloudbilling-58b58d RIP=00007F4E9D9D718F GS=0000 FS=0000 RSP=00007F4F798F3590
mediation-awscloudbilling-58b58d EFlags=0000000000010246 CS=0033 RBP=00007F4F798F3940 ERR=0000000000000004
mediation-awscloudbilling-58b58d TRAPNO=000000000000000E OLDMASK=0000000000000000 CR2=00007F4958F20A2F
mediation-awscloudbilling-58b58d xmm0 0000000000000000 (f: 0.000000, d: 0.000000e+00)
mediation-awscloudbilling-58b58d xmm1 0000000000000000 (f: 0.000000, d: 0.000000e+00)
mediation-awscloudbilling-58b58d xmm2 00007f4f6c00b570 (f: 1811985792.000000, d: 6.915886e-310)
mediation-awscloudbilling-58b58d xmm3 0000000000000000 (f: 0.000000, d: 0.000000e+00)
mediation-awscloudbilling-58b58d xmm4 00007f4f6c00a220 (f: 1811980800.000000, d: 6.915886e-310)
mediation-awscloudbilling-58b58d xmm5 00007f4f6c006fe0 (f: 1811968000.000000, d: 6.915886e-310)
mediation-awscloudbilling-58b58d xmm6 00007f4eac051650 (f: 2886014464.000000, d: 6.915727e-310)
mediation-awscloudbilling-58b58d xmm7 00007f4eac051650 (f: 2886014464.000000, d: 6.915727e-310)
mediation-awscloudbilling-58b58d xmm8 ffffffac00000000 (f: 0.000000, d: -nan)
mediation-awscloudbilling-58b58d xmm9 ffff00ff000000ff (f: 255.000000, d: -nan)
mediation-awscloudbilling-58b58d xmm10 0000000000000000 (f: 0.000000, d: 0.000000e+00)
mediation-awscloudbilling-58b58d xmm11 2bedb0c3811e1752 (f: 2166232832.000000, d: 4.343789e-97)
mediation-awscloudbilling-58b58d xmm12 0000000000000001 (f: 1.000000, d: 4.940656e-324)
mediation-awscloudbilling-58b58d xmm13 08090a0b0c0d0e0f (f: 202182160.000000, d: 5.924543e-270)
mediation-awscloudbilling-58b58d xmm14 0094257f6bf44044 (f: 1811169408.000000, d: 7.172383e-306)
mediation-awscloudbilling-58b58d xmm15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
mediation-awscloudbilling-58b58d Module=/tmp/jnilib-12024434061044301193.tmp
mediation-awscloudbilling-58b58d Module_base_address=00007F4E9C5ED000 Symbol=_ZNK5arrow7compute16FunctionRegistry20FunctionRegistryImpl11GetFunctionERKSs
mediation-awscloudbilling-58b58d Symbol_address=00007F4E9D9D7160
mediation-awscloudbilling-58b58d Target=2_90_20240416_760 (Linux 5.15.0-125-generic)
mediation-awscloudbilling-58b58d CPU=amd64 (8 logical CPUs) (0xfb2c63000 RAM)
mediation-awscloudbilling-58b58d ----------- Stack Backtrace -----------
mediation-awscloudbilling-58b58d _ZNK5arrow7compute16FunctionRegistry20FunctionRegistryImpl11GetFunctionERKSs+0x2f (0x00007F4E9D9D718F [jnilib-12024434061044301193.tmp+0x13ea18f])
mediation-awscloudbilling-58b58d _ZNK5arrow7compute16FunctionRegistry11GetFunctionERKSs+0xd (0x00007F4E9D9CD91D [jnilib-12024434061044301193.tmp+0x13e091d])
mediation-awscloudbilling-58b58d _ZN5arrow7compute12_GLOBAL__N_116BindNonRecursiveENS0_10Expression4CallEbPNS0_11ExecContextE+0x1a8 (0x00007F4E9D9F8338 [jnilib-12024434061044301193.tmp+0x140b338])
mediation-awscloudbilling-58b58d _ZN5arrow7compute12_GLOBAL__N_18BindImplINS_6SchemaEEENS_6ResultINS0_10ExpressionEEES5_RKT_PNS0_11ExecContextE+0x68f (0x00007F4E9D9F9BAF [jnilib-12024434061044301193.tmp+0x140cbaf])
mediation-awscloudbilling-58b58d _ZN5arrow7compute12_GLOBAL__N_18BindImplINS_6SchemaEEENS_6ResultINS0_10ExpressionEEES5_RKT_PNS0_11ExecContextE+0x64d (0x00007F4E9D9F9B6D [jnilib-12024434061044301193.tmp+0x140cb6d])
mediation-awscloudbilling-58b58d _ZNK5arrow7compute10Expression4BindERKNS_6SchemaEPNS0_11ExecContextE+0x33 (0x00007F4E9D9F9DF3 [jnilib-12024434061044301193.tmp+0x140cdf3])
mediation-awscloudbilling-58b58d _ZN5arrow7dataset15ProjectionDescr20FromStructExpressionERKNS_7compute10ExpressionERKNS_6SchemaE+0x25 (0x00007F4E9D24AB55 [jnilib-12024434061044301193.tmp+0xc5db55])
mediation-awscloudbilling-58b58d _ZN5arrow7dataset15ProjectionDescr15FromExpressionsESt6vectorINS_7compute10ExpressionESaIS4_EES2_ISsSaISsEERKNS_6SchemaE+0x41f (0x00007F4E9D24DFAF [jnilib-12024434061044301193.tmp+0xc60faf])
mediation-awscloudbilling-58b58d _ZN5arrow7dataset15ProjectionDescr9FromNamesESt6vectorISsSaISsEERKNS_6SchemaEb+0x620 (0x00007F4E9D272910 [jnilib-12024434061044301193.tmp+0xc85910])
mediation-awscloudbilling-58b58d _ZN5arrow7dataset14ScannerBuilder7ProjectESt6vectorISsSaISsEE+0x57 (0x00007F4E9D274817 [jnilib-12024434061044301193.tmp+0xc87817])
mediation-awscloudbilling-58b58d Java_org_apache_arrow_dataset_jni_JniWrapper_createScanner+0x208 (0x00007F4E9D22EB28 [jnilib-12024434061044301193.tmp+0xc41b28])
mediation-awscloudbilling-58b58d ffi_call_unix64+0x52 (0x00007F4FD2459D3A [libj9vm29.so+0x20bd3a])
mediation-awscloudbilling-58b58d ffi_call_int+0x1a1 (0x00007F4FD2458ED1 [libj9vm29.so+0x20aed1])
mediation-awscloudbilling-58b58d _ZN37VM_DebugBytecodeInterpreterCompressed3runEP10J9VMThread+0x184be (0x00007F4FD234B52E [libj9vm29.so+0xfd52e])
mediation-awscloudbilling-58b58d debugBytecodeLoopCompressed+0xd2 (0x00007F4FD2333062 [libj9vm29.so+0xe5062])
mediation-awscloudbilling-58b58d (0x00007F4FD23E6AC2 [libj9vm29.so+0x198ac2])
mediation-awscloudbilling-58b58d ---------------------------------------
mediation-awscloudbilling-58b58d JVMDUMP039I Processing dump event "gpf", detail "" at 2024/11/30 13:54:52 - please wait.
mediation-awscloudbilling-58b58d JVMDUMP032I JVM requested System dump using '//core.20241130.135452.1.0005.dmp' in response to an event
mediation-awscloudbilling-58b58d JVMDUMP030W Cannot write dump to file //core.20241130.135452.1.0005.dmp: Read-only file system
mediation-awscloudbilling-58b58d JVMPORT030W /proc/sys/kernel/core_pattern setting "|/usr/share/apport/apport -p%p -s%s -c%c -d%d -P%P -u%u -g%g -- %E" specifies that the core dump is to be piped to an external program. Attempting to rename either core or core.155. Review the manual for the external program to find where the core dump is written and ensure the program does not truncate it.
mediation-awscloudbilling-58b58d
mediation-awscloudbilling-58b58d JVMPORT049I The core file created by child process with pid = 155 was not found. Review the documentation for the /proc/sys/kernel/core_pattern program "|/usr/share/apport/apport -p%p -s%s -c%c -d%d -P%P -u%u -g%g -- %E" to find where the core file is written and ensure that program does not truncate it.
mediation-awscloudbilling-58b58d
mediation-awscloudbilling-58b58d JVMDUMP012E Error in System dump: /tmp/core.20241130.135452.1.0005.dmp
mediation-awscloudbilling-58b58d JVMDUMP032I JVM requested Java dump using '/STDOUT/' in response to an event
mediation-awscloudbilling-58b58d 0SECTION TITLE subcomponent dump routine
mediation-awscloudbilling-58b58d NULL ===============================
mediation-awscloudbilling-58b58d 1TICHARSET ANSI_X3.4-1968
mediation-awscloudbilling-58b58d 1TISIGINFO Dump Event "gpf" (00002000) received
mediation-awscloudbilling-58b58d 1TIDATETIMEUTC Date: 2024/11/30 at 13:54:58:059 (UTC)
mediation-awscloudbilling-58b58d 1TIDATETIME Date: 2024/11/30 at 13:54:58:059
mediation-awscloudbilling-58b58d 1TITIMEZONE Timezone: UTC (UTC)
mediation-awscloudbilling-58b58d 1TINANOTIME System nanotime: 1107936331581339
mediation-awscloudbilling-58b58d 1TIFILENAME Javacore filename: /STDOUT/
mediation-awscloudbilling-58b58d 1TIREQFLAGS Request Flags: 0x41 (exclusive+preempt)
mediation-awscloudbilling-58b58d 1TIPREPSTATE Prep State: 0x80 (trace_disabled)
mediation-awscloudbilling-58b58d 1TIPREPINFO Exclusive VM access not taken: data may not be consistent across javacore sections
mediation-awscloudbilling-58b58d NULL ------------------------------------------------------------------------
mediation-awscloudbilling-58b58d 0SECTION GPINFO subcomponent dump routine
mediation-awscloudbilling-58b58d NULL ================================
mediation-awscloudbilling-58b58d 2XHOSLEVEL OS Level : Linux 5.15.0-125-generic
mediation-awscloudbilling-58b58d 2XHCPUS Processors -
mediation-awscloudbilling-58b58d 3XHCPUARCH Architecture : amd64
mediation-awscloudbilling-58b58d 3XHNUMCPUS How Many : 8
mediation-awscloudbilling-58b58d 3XHNUMASUP NUMA is either not supported or has been disabled by user
mediation-awscloudbilling-58b58d NULL
mediation-awscloudbilling-58b58d 1XHEXCPCODE J9Generic_Signal_Number: 00000018
mediation-awscloudbilling-58b58d 1XHEXCPCODE Signal_Number: 0000000B
mediation-awscloudbilling-58b58d 1XHEXCPCODE Error_Value: 00000000
mediation-awscloudbilling-58b58d 1XHEXCPCODE Signal_Code: 00000001
mediation-awscloudbilling-58b58d 1XHEXCPCODE Handler1: 00007F4FD22916B0
mediation-awscloudbilling-58b58d 1XHEXCPCODE Handler2: 00007F4FD21E6730
mediation-awscloudbilling-58b58d 1XHEXCPCODE InaccessibleAddress: 00007F4958F20A2F
mediation-awscloudbilling-58b58d NULL
mediation-awscloudbilling-58b58d 1XHEXCPMODULE Module: /tmp/jnilib-12024434061044301193.tmp
mediation-awscloudbilling-58b58d 1XHEXCPMODULE Module_base_address: 00007F4E9C5ED000
mediation-awscloudbilling-58b58d 1XHEXCPMODULE Symbol: _ZNK5arrow7compute16FunctionRegistry20FunctionRegistryImpl11GetFunctionERKSs
mediation-awscloudbilling-58b58d 1XHEXCPMODULE Symbol_address: 00007F4E9D9D7160
mediation-awscloudbilling-58b58d NULL
mediation-awscloudbilling-58b58d 1XHREGISTERS Registers:
mediation-awscloudbilling-58b58d 2XHREGISTER RDI: 00007F4958F20A27
mediation-awscloudbilling-58b58d 2XHREGISTER RSI: 0000000000000002
mediation-awscloudbilling-58b58d 2XHREGISTER RAX: 3A5987A4DE9A52E2
mediation-awscloudbilling-58b58d 2XHREGISTER RBX: 00007F4958F209F7
mediation-awscloudbilling-58b58d 2XHREGISTER RCX: 3A5987A4DE9A52E2
mediation-awscloudbilling-58b58d 2XHREGISTER RDX: 0000000000000000
mediation-awscloudbilling-58b58d 2XHREGISTER R8: 648C56E80D97FBD5
mediation-awscloudbilling-58b58d 2XHREGISTER R9: 00007F4F6C00A200
mediation-awscloudbilling-58b58d 2XHREGISTER R10: 00000000C70F6907
mediation-awscloudbilling-58b58d 2XHREGISTER R11: 00007F4F6C000020
mediation-awscloudbilling-58b58d 2XHREGISTER R12: 00007F4F798F3700
mediation-awscloudbilling-58b58d 2XHREGISTER R13: 00007F4F798F3940
mediation-awscloudbilling-58b58d 2XHREGISTER R14: 00007F4F798F3F40
mediation-awscloudbilling-58b58d 2XHREGISTER R15: 00007F4F798F3700
mediation-awscloudbilling-58b58d 2XHREGISTER xmm0: 0000000000000000
mediation-awscloudbilling-58b58d 2XHREGISTER xmm1: 0000000000000000
mediation-awscloudbilling-58b58d 2XHREGISTER xmm2: 00007F4F6C00B570
mediation-awscloudbilling-58b58d 2XHREGISTER xmm3: 0000000000000000
mediation-awscloudbilling-58b58d 2XHREGISTER xmm4: 00007F4F6C00A220
mediation-awscloudbilling-58b58d 2XHREGISTER xmm5: 00007F4F6C006FE0
mediation-awscloudbilling-58b58d 2XHREGISTER xmm6: 00007F4EAC051650
mediation-awscloudbilling-58b58d 2XHREGISTER xmm7: 00007F4EAC051650
mediation-awscloudbilling-58b58d 2XHREGISTER xmm8: FFFFFFAC00000000
mediation-awscloudbilling-58b58d 2XHREGISTER xmm9: FFFF00FF000000FF
mediation-awscloudbilling-58b58d 2XHREGISTER xmm10: 0000000000000000
mediation-awscloudbilling-58b58d 2XHREGISTER xmm11: 2BEDB0C3811E1752
mediation-awscloudbilling-58b58d 2XHREGISTER xmm12: 0000000000000001
mediation-awscloudbilling-58b58d 2XHREGISTER xmm13: 08090A0B0C0D0E0F
mediation-awscloudbilling-58b58d 2XHREGISTER xmm14: 0094257F6BF44044
mediation-awscloudbilling-58b58d 2XHREGISTER xmm15: 0000000000000000
mediation-awscloudbilling-58b58d 2XHREGISTER RIP: 00007F4E9D9D718F
mediation-awscloudbilling-58b58d 2XHREGISTER GS: 0000
mediation-awscloudbilling-58b58d 2XHREGISTER FS: 0000
mediation-awscloudbilling-58b58d 2XHREGISTER RSP: 00007F4F798F3590
mediation-awscloudbilling-58b58d 2XHREGISTER EFlags: 0000000000010246
mediation-awscloudbilling-58b58d 2XHREGISTER CS: 0033
mediation-awscloudbilling-58b58d 2XHREGISTER RBP: 00007F4F798F3940
mediation-awscloudbilling-58b58d 2XHREGISTER ERR: 0000000000000004
mediation-awscloudbilling-58b58d 2XHREGISTER TRAPNO: 000000000000000E
mediation-awscloudbilling-58b58d 2XHREGISTER OLDMASK: 0000000000000000
mediation-awscloudbilling-58b58d 2XHREGISTER CR2: 00007F4958F20A2F
mediation-awscloudbilling-58b58d NULL
mediation-awscloudbilling-58b58d 1XHFLAGS VM flags:0000000000040000
mediation-awscloudbilling-58b58d NULL
mediation-awscloudbilling-58b58d NULL ------------------------------------------------------------------------
mediation-awscloudbilling-58b58d 0SECTION ENVINFO subcomponent dump routine
mediation-awscloudbilling-58b58d NULL =================================
mediation-awscloudbilling-58b58d 1CIJAVAVERSION JRE 17 Linux amd64-64 (build 17.0.11+9)
mediation-awscloudbilling-58b58d 1CIVMVERSION 20240416_760
mediation-awscloudbilling-58b58d 1CIJ9VMTAG openj9-0.44.0
mediation-awscloudbilling-58b58d 1CIJ9VMVERSION b0699311c7
mediation-awscloudbilling-58b58d 1CIJITVERSION j9jit_20240522_2205_
mediation-awscloudbilling-58b58d 1CIOMRVERSION 254af5a04_CMPRSS
mediation-awscloudbilling-58b58d 1CIJCLVERSION 5d7d758b682 based on jdk-17.0.11+9
mediation-awscloudbilling-58b58d 1CIVENDOR IBM Corporation
mediation-awscloudbilling-58b58d 1CIPRODUCT IBM Semeru Runtime Open Edition
mediation-awscloudbilling-58b58d 1CIEXTVERSION 17.0.11.0
mediation-awscloudbilling-58b58d 1CIJITMODES JIT enabled, AOT enabled, FSD enabled, HCR enabled
mediation-awscloudbilling-58b58d 1CIRUNNINGAS Running as an embedded JVM
mediation-awscloudbilling-58b58d 1CIVMIDLESTATE VM Idle State: ACTIVE
mediation-awscloudbilling-58b58d 1CICONTINFO Running in container : TRUE
mediation-awscloudbilling-58b58d 1CICGRPINFO JVM support for cgroups enabled : TRUE
mediation-awscloudbilling-58b58d 1CISTARTTIME JVM start time: 2024/11/30 at 13:50:50:425
mediation-awscloudbilling-58b58d 1CISTARTNANO JVM start nanotime: 1107688697153833
mediation-awscloudbilling-58b58d 1CIPROCESSID Process ID: 1 (0x1)
```

Any suggestions on debug options would be very much appreciated. Thanks!

### Component(s)

Java

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org.apache.org

For queries about this service, please contact Infrastructure at: us...@infra.apache.org