http://bugzilla.gdcproject.org/show_bug.cgi?id=97
Bug #: 97
Summary: Experiencing intermittent crash in rt_init() when loading DLL
Classification: Unclassified
Product: GDC
Version: development
Platform: x86
OS/Version: Other
Status: NEW
Severity: critical
Priority: Normal
Component: gdc
AssignedTo: ibuc...@gdcproject.org
ReportedBy: slavo5...@yahoo.com

Created attachment 57
  --> http://bugzilla.gdcproject.org/attachment.cgi?id=57
rt_init-crash.png

This was migrated from https://bitbucket.org/goshawk/gdc/issue/351/experiencing-intermittent-crash-in-rt_init

Manu Evans created an issue 2012-06-18
****************************************

I have a crash that only happens occasionally when loading a GDC-64 DLL.
The same DLL may work or not work depending on the direction of the wind, though it seems to crash far less often than it loads successfully.

The call stack and various other details are visible in the attached image.

It appears to crash fetching __blkcache_storage, a TLS variable. The code that loads it looks odd to me.

A few points of interest:

* How can the final mov refer to rbx when only ebx was loaded? Who's to say the top bits will be zero?
* The magic address doesn't appear to be a valid offset to me...
* rsi is a good pointer, but it points to a bunch of string data, including source code snippets. Not what I expected... moduleinfo of some sort? debuginfo?
* The same pattern of loading ebx and using rbx is repeated above with eax->rax, except the wild absolute magic number is dereferenced this time... (how does that even work?)

I don't follow the code GDC is generating here :/ Does it look okay to anyone else?

This is affecting our whole team daily. Any input or ideas about what might be going on would be much appreciated!
See attachment <rt_init-crash.png>

Manu Evans - 2012-06-18
****************************************

edited description

Iain Buclaw - 2012-06-19
****************************************

* changed status to open
* assigned issue to Daniel Green

I'm not sure rt.lifetime is well suited for shared libraries on Windows yet. Daniel, could you look into this?

Manu Evans - 2012-06-19
****************************************

I can probably supply a binary, but I think the precise context when loading the DLL is critical: it usually loads without problems, so a binary may not be of any use.

Daniel Green - 2012-06-20
****************************************

Looking at that, it's definitely crashing while loading a TLS variable.

* The use of EBX/RBX is acceptable, as 32-bit operations are implicitly zero-extended to 64 bits.
* The issue looks to be with the value being loaded: 0xACEE47F8 (roughly 3 billion).
* RSI should be OK, as it doesn't require a TLS-relative location to function.

The value loaded into EBX should be relative to the TLS section, which means it should be significantly smaller.

Did you custom build this? The value loaded into EBX is determined at link time; without a TLS-aware assembler/linker you wouldn't get the relative offset.

Can you output a map file for the DLL with -Wl,-Map=output.map, and the output of the following command on lifetime.o? To extract it, run

    ar x libgphobos.a lifetime.o

Then compare it with this? Line 2ea is where the magic happens that generates the relative offset.

I'll work on getting the assembly output from a DLL dump as well, to ensure it's linking properly.
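
The zero-extension point and the scale of the bad offset can be checked with a small model. This Python sketch is not taken from GDC or druntime: the register model, the stale register contents, the TLS block base address, and the small offset 0x10 are assumptions chosen for illustration. Only 0xACEE47F8 comes from the crash report.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def write_reg32(old_reg64: int, value: int) -> int:
    """Model a 32-bit register write on x86-64 (e.g. `mov ebx, imm32`).

    The previous 64-bit contents are discarded: the upper 32 bits of the
    full register become zero, so using rbx afterwards is well defined.
    """
    return value & MASK32


def effective_address(base: int, offset: int) -> int:
    """Model the address used by `mov rax, [rsi + rbx*1]`."""
    return (base + offset) & MASK64


# Hypothetical values, for illustration only.
stale_rbx = 0xDEADBEEF00000000      # garbage left in the upper half of rbx
tls_block_base = 0x0000000000300000  # assumed base of this thread's TLS block

small = write_reg32(stale_rbx, 0x10)        # a plausible section-relative offset
huge = write_reg32(stale_rbx, 0xACEE47F8)   # the value seen in the crash

print(hex(small))                                   # upper bits are gone
print(hex(effective_address(tls_block_base, small)))
print(hex(effective_address(tls_block_base, huge) - tls_block_base))
```

With a small offset the load stays inside the thread's TLS block. With 0xACEE47F8 the load lands gigabytes past the block base, which may or may not be mapped at the time of the access; that would be consistent with a crash that only shows up occasionally.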
$ /c/MinGW64/bin/objdump.exe -d -r -M Intel lifetime.o

00000000000002d0 <_D2rt8lifetime10__blkcacheFNdZPS2rt8lifetime7BlkInfo>:
 2d0:   56                      push   rsi
 2d1:   53                      push   rbx
 2d2:   48 83 ec 28             sub    rsp,0x28
 2d6:   8b 04 25 00 00 00 00    mov    eax,DWORD PTR ds:0x0
                        2d9: R_X86_64_32S       _tls_index
 2dd:   65 48 8b 34 25 58 00    mov    rsi,QWORD PTR gs:0x58
 2e4:   00 00
 2e6:   48 8b 34 c6             mov    rsi,QWORD PTR [rsi+rax*8]
 2ea:   bb 08 00 00 00          mov    ebx,0x8
                        2eb: secrel32   .tls$GCC
 2ef:   48 8b 04 1e             mov    rax,QWORD PTR [rsi+rbx*1]
 2f3:   48 85 c0                test   rax,rax
 2f6:   74 08                   je     300 <_D2rt8lifetime10__blkcacheFNdZPS2rt8lifetime7BlkInfo+0x30>
 2f8:   48 83 c4 28             add    rsp,0x28
 2fc:   5b                      pop    rbx
 2fd:   5e                      pop    rsi
 2fe:   c3                      ret

Daniel Green - 2012-06-24
****************************************

Here's the map information from a DLL that was built.

 .tls            0x000000006fa62000      0x200
                 0x000000006fa62010                _D2rt8lifetime18__blkcache_storagePS2rt8lifetime7BlkInfo

Subtracting the .tls base address from the address of _D2rt8lifetime18__blkcache_storagePS2rt8lifetime7BlkInfo gives a secrel32 offset of 0x10. This is the same value as shown in the assembly dump from the DLL, and it is in contrast with the value 0xACEE47F8 shown in your assembly dump.
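
The subtraction above can be repeated on any map file with a few lines of Python. This is a minimal sketch that assumes the simplified two-line layout quoted above (a section line with base and size, then an address/symbol line); real ld map files contain more line types, and the function name secrel_offset is made up for this example.

```python
MAP_EXCERPT = """\
.tls            0x000000006fa62000      0x200
                0x000000006fa62010                _D2rt8lifetime18__blkcache_storagePS2rt8lifetime7BlkInfo
"""

SYMBOL = "_D2rt8lifetime18__blkcache_storagePS2rt8lifetime7BlkInfo"


def secrel_offset(map_text: str, section: str, symbol: str) -> int:
    """Return (symbol address - section base) from map-file text.

    Assumes the section line looks like `<section> <base> <size>` and the
    symbol line looks like `<address> <symbol>`, as in the excerpt above.
    """
    section_base = None
    for line in map_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == section:
            section_base = int(fields[1], 16)
        elif len(fields) == 2 and fields[1] == symbol:
            if section_base is None:
                raise ValueError(f"{symbol} appears before section {section}")
            return int(fields[0], 16) - section_base
    raise ValueError(f"{symbol} not found in section {section}")


offset = secrel_offset(MAP_EXCERPT, ".tls", SYMBOL)
print(hex(offset))

# A valid section-relative offset has to fall inside the section (size 0x200
# in the excerpt); the value from the crashing build does not.
print(offset < 0x200, 0xACEE47F8 < 0x200)
```

Running the same check against the map file of the failing DLL would show whether the linker was given a sensible offset, or whether the immediate in the code was never relocated as section-relative.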
000000006fa0972b <_D2rt8lifetime10__blkcacheFNdZPS2rt8lifetime7BlkInfo>:
        int __nextRndNum = 0;
    }
    int __nextBlkIdx;
}

@property BlkInfo *__blkcache()
    6fa0972b:   55                      push   rbp
    6fa0972c:   48 89 e5                mov    rbp,rsp
    6fa0972f:   48 83 ec 30             sub    rsp,0x30
{
    if(!__blkcache_storage)
    6fa09733:   8b 04 25 4c c4 a5 6f    mov    eax,DWORD PTR ds:0x6fa5c44c
    6fa0973a:   65 48 8b 14 25 58 00    mov    rdx,QWORD PTR gs:0x58
    6fa09741:   00 00
    6fa09743:   48 8b 14 c2             mov    rdx,QWORD PTR [rdx+rax*8]
    6fa09747:   b8 10 00 00 00          mov    eax,0x10
    6fa0974c:   48 8b 04 02             mov    rax,QWORD PTR [rdx+rax*1]
    6fa09750:   48 85 c0                test   rax,rax
    6fa09753:   0f 95 c0                setne  al
    6fa09756:   83 f0 01                xor    eax,0x1
    6fa09759:   84 c0                   test   al,al
    6fa0975b:   74 5f                   je     6fa097bc <_D2rt8lifetime10__blkcacheFNdZPS2rt8lifetime7BlkInfo+0x91>

Manu Evans - 2012-06-24
****************************************

The objdump gave us the same thing you pasted. So you think it's just a bad toolchain? Any chance of a new 2.059 toolchain with that patch applied? Will that fix the problem?

It's very strange that this only occurs occasionally. You'd think this would cause the DLL to fail to load every time, but we only see it fail occasionally. Other times it loads just fine.

Does LoadLibrary actually patch the offsets in the loaded binary with the absolute addresses as it loads, or something?

Daniel Green - 2012-06-24
****************************************

Can you generate a map file for the library? I'd like to compare the offsets with what's in your assembly code.

If the objdump produced the secrel32 output, then it's probably something else. With the data I had, that was the most likely scenario.

The TLS patch fixes a bug in the linker as well as giving access to the secrel32 relocation in assembly. If for some reason the compile/link phase was using binutils (gas or ld) not included with GDC, this type of issue would occur. It's still possible a different linker is being used.
ld bug

In order to figure out what else it could be, it's necessary to see the map file and raw assembly for your DLL. Compile or link with -Wl,-Map=output.map.

    objdump.exe -S -M intel mydll.dll > mydll.asm

can be used to generate Intel-formatted assembly.

Random failures on accessing invalid memory are not as strange as you might think. That's actually the first clue: you're accessing an invalid memory location.

I'll look into the runtime behavior of LoadLibrary after I've checked the map and assembly output of the DLL.

--
Configure bugmail: http://bugzilla.gdcproject.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are watching all bug changes.