> On 03/29/2012 03:05 AM, Stephan Bergmann wrote:
> > Hi all,
> >
> > In LibreOffice's ever-beloved low-level code to synthesize calls to
> > C++ virtual functions, I'm having the following problem (on Linux
> > x86_64).  The function callVirtualMethod at
> > <http://cgit.freedesktop.org/libreoffice/core/tree/bridges/source/cpp_uno/gcc3_linux_x86-64/uno2cpp.cxx?id=571876c1234ae55aab0198c7e2caf9049fcd230e#n61>
> > effectively does the following:
>
> Quoting:
>
>     asm volatile (
>
>         // Fill the xmm registers
>         "movq %6, %%rax\n\t"
>
>         "movsd   (%%rax), %%xmm0\n\t"
>         "movsd  8(%%rax), %%xmm1\n\t"
>         "movsd 16(%%rax), %%xmm2\n\t"
>         "movsd 24(%%rax), %%xmm3\n\t"
>         "movsd 32(%%rax), %%xmm4\n\t"
>         "movsd 40(%%rax), %%xmm5\n\t"
>         "movsd 48(%%rax), %%xmm6\n\t"
>         "movsd 56(%%rax), %%xmm7\n\t"
>
>         // Fill the general purpose registers
>         "movq %5, %%rax\n\t"
>
>         "movq    (%%rax), %%rdi\n\t"
>         "movq   8(%%rax), %%rsi\n\t"
>         "movq  16(%%rax), %%rdx\n\t"
>         "movq  24(%%rax), %%rcx\n\t"
>         "movq  32(%%rax), %%r8\n\t"
>         "movq  40(%%rax), %%r9\n\t"
>
>         // Perform the call
>         "movq %4, %%r11\n\t"
>         "movq %7, %%rax\n\t"
>         "call *%%r11\n\t"
>
>         // Fill the return values
>         "movq   %%rax, %0\n\t"
>         "movq   %%rdx, %1\n\t"
>         "movsd %%xmm0, %2\n\t"
>         "movsd %%xmm1, %3\n\t"
>         : "=m" ( rax ), "=m" ( rdx ), "=m" ( xmm0 ), "=m" ( xmm1 )
>         : "m" ( pMethod ), "m" ( pGPR ), "m" ( pFPR ), "m" ( nFPR )
>         : "rax", "rdi", "rsi", "rdx", "rcx", "r8", "r9", "r11"
>     );
>
> Semi-off-topic, I think this asm can be better done with only the
> call inside the asm, and the rest handled by the compiler.
>
> {
>   register sal_uInt64 hard_rax __asm__("rax"); // etc
>   register double hard_xmm0 __asm__("xmm0"); // etc
>
>   hard_rdi = pGPR[0]; //etc
>   hard_xmm0 = pFPR[0]; //etc
>   hard_rax = nFPR;
>
>   asm volatile ("call *%[method]"
>                 : "+r"(hard_rax), //etc
>                   "+x"(hard_xmm0), //etc
>                 : [method] "g" (pMethod)
>                 : "memory");
>
>   rax = hard_rax;
>   rdx = hard_rdx;
>   xmm0 = hard_xmm0;
>   xmm1 = hard_xmm1;
> }
>
>
> Of course, there's still the problem of getting the unwind data correct at
> the point of the asm.  I commented about that in the PR you filed.
I think i386 still has the problem that it is a small register class target,
and if you set rdi/rax and friends as hard registers, you risk reload
failures.

Do we prevent code motion of hard register sets, i.e. at the GIMPLE level?
(I know the pre-reload scheduler was improved here, but I would not rely on
it either.)  As soon as the hard_rdi/hard_rax set is moved upwards or
downwards across a memset/division or other stuff requiring rax or rdx,
reload will ICE.

Honza
>
>
> r~
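
To make the ordering concern concrete, here is a minimal sketch. It is not the LibreOffice code: add_in_hard_regs is a hypothetical name, and a single addq stands in for the call so that the example avoids the red zone and unwind questions. The point is only the placement of the hard register sets: anything that may itself need rax or rdx (the division here) is computed into an ordinary temporary first, and the register variables are assigned immediately before the asm. The GCC manual supports local register variables only as operands of extended asm, so this narrows the window but does not answer whether the compiler keeps the sets in place.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the thread): returns a / d + b, doing the
// final add in rax/rdx on x86-64 GCC/Clang and in plain C++ elsewhere.
std::uint64_t add_in_hard_regs(std::uint64_t a, std::uint64_t d, std::uint64_t b)
{
    assert(d != 0);

    // Do all work that may itself need rax/rdx (here: a division) into an
    // ordinary temporary *before* any hard register variable is assigned.
    const std::uint64_t quotient = a / d;

#if defined(__GNUC__) && defined(__x86_64__)
    // Bind the variables to hard registers and assign them immediately
    // before the asm, with nothing in between at the source level.
    register std::uint64_t hard_rax __asm__("rax") = quotient;
    register std::uint64_t hard_rdx __asm__("rdx") = b;

    // %0 is rax (read and written), %1 is rdx (read only).
    __asm__ volatile ("addq %1, %0"
                      : "+r"(hard_rax)
                      : "r"(hard_rdx)
                      : "cc");
    return hard_rax;
#else
    // Assumption: on other targets a plain C++ fallback is acceptable.
    return quotient + b;
#endif
}
```

With the division written between the two register assignments instead, the source itself would already ask for rax/rdx while hard_rax is live, which is the situation described above.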