Sorry for the delay.
Here is the C function (from the reconstructed bug example):
double c_fn(double a, double b) {
    return a * b;
}
and the UFFI definition:
(def-function ("c_fn" c_fn) ((a :double) (b :double)) :returning :double)
The resulting asm code for the C function:
<c_fn>:
0: 55 push %ebp
1: 89 e5 mov %esp,%ebp
3: dd 45 10 fldl 0x10(%ebp)
6: dc 4d 08 fmull 0x8(%ebp)
9: 5d pop %ebp
a: c3 ret
and for the UFFI wrapper:
<L1c_fn>:
a60: 56 push %esi
a61: 53 push %ebx
a62: e8 f4 fe ff ff call 95b <__x86.get_pc_thunk.bx>
a67: 81 c3 2d 07 00 00 add $0x72d,%ebx
a6d: 83 ec 34 sub $0x34,%esp
a70: e8 4b fd ff ff call 7c0 <ecl_process_env@plt>
a75: 89 c6 mov %eax,%esi
a77: 8d 44 24 2c lea 0x2c(%esp),%eax
a7b: 39 86 58 01 00 00 cmp %eax,0x158(%esi)
a81: 73 49 jae acc <L1c_fn+0x6c>
a83: 8b 44 24 44 mov 0x44(%esp),%eax
a87: 89 04 24 mov %eax,(%esp)
a8a: e8 a1 fd ff ff call 830 <ecl_to_double@plt>
a8f: 8b 44 24 40 mov 0x40(%esp),%eax
a93: 89 04 24 mov %eax,(%esp)
a96: dd 5c 24 10 fstpl 0x10(%esp)
a9a: e8 91 fd ff ff call 830 <ecl_to_double@plt>
a9f: dd 44 24 10 fldl 0x10(%esp)
aa3: dd 5c 24 08 fstpl 0x8(%esp)
aa7: dd 1c 24 fstpl (%esp)
aaa: e8 21 fd ff ff call 7d0 <c_fn@plt>
aaf: 89 44 24 1c mov %eax,0x1c(%esp)
ab3: db 44 24 1c fildl 0x1c(%esp)
ab7: dd 1c 24 fstpl (%esp)
aba: e8 21 fd ff ff call 7e0 <ecl_make_double_float@plt>
abf: c7 46 04 01 00 00 00 movl $0x1,0x4(%esi)
ac6: 83 c4 34 add $0x34,%esp
ac9: 5b pop %ebx
aca: 5e pop %esi
acb: c3 ret
acc: e8 1f fd ff ff call 7f0 <ecl_cs_overflow@plt>
ad1: eb b0 jmp a83 <L1c_fn+0x23>
ad3: 8d b6 00 00 00 00 lea 0x0(%esi),%esi
ad9: 8d bc 27 00 00 00 00 lea 0x0(%edi,%eiz,1),%edi
I've also tried the same C function, but casting the result to an integer
and declaring an int return type, and that works fine. Compiling the C code
with -mno-fp-ret-in-387 doesn't help either.
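For reference, the integer variant looks roughly like the sketch below. The
name c_fn_int is hypothetical and used only for illustration; the exact code
that was tested is not shown in this message.

```c
/* Hypothetical integer-returning variant of c_fn (name assumed).
 * On 32-bit x86 an int result comes back in %eax, which is where the
 * wrapper listing above reads its result from. */
int c_fn_int(double a, double b) {
    return (int)(a * b);   /* truncates toward zero */
}
```

The matching UFFI definition would declare :returning :int instead of
:returning :double.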
With a C function that calls c_fn:
double ref(double a, double b) {
    return 2 * c_fn(a, b);
}
the asm listing is:
<ref>:
33: 55 push %ebp
34: 89 e5 mov %esp,%ebp
36: dd 45 10 fldl 0x10(%ebp)
39: dc 4d 08 fmull 0x8(%ebp)
3c: 5d pop %ebp
3d: d9 e8 fld1
3f: de c1 faddp %st,%st(1)
41: c3 ret
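So I believe the UFFI wrapper should take the result from the x87 ST(0)
register, where c_fn leaves its double return value, instead of reading
%eax after the call and converting it with the fildl/fstpl sequence.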
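One possible explanation, not verified against the C code that ECL actually
generates, is that the wrapper is compiled with no prototype for c_fn in
scope, so the C compiler assumes an int return value. The sketch below shows
the call with a visible prototype; the name call_c_fn is hypothetical and
stands in for the generated wrapper.

```c
/* Prototype visible before the call: the compiler knows the result is a
 * double and, on 32-bit x86, takes it from ST(0) rather than from %eax. */
double c_fn(double a, double b);

/* Hypothetical stand-in for the generated wrapper (name assumed). */
double call_c_fn(double a, double b) {
    double result = c_fn(a, b);
    return result;
}

/* Definition from the bug example, repeated to keep the sketch
 * self-contained. */
double c_fn(double a, double b) {
    return a * b;
}
```

If that guess is right, getting the prototype into the generated C file
should be enough, with no change needed in c_fn itself.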