On Fri, Oct 11, 2013 at 12:48 PM, Richard Sandiford <rdsandif...@googlemail.com> wrote: > Richard Biener <richard.guent...@gmail.com> writes: >> On Fri, Oct 11, 2013 at 11:43 AM, Richard Sandiford >> <rdsandif...@googlemail.com> wrote: >>> Jakub Jelinek <ja...@redhat.com> writes: >>>> On Fri, Oct 11, 2013 at 10:17:41AM +0200, Richard Biener wrote: >>>>> asm(".alias __sync_synchronize sync_synchronize"); >>>> >>>> It is .set, but not everywhere. >>>> /* The MIPS assembler has different syntax for .set. We set it to >>>> .dummy to trap any errors. */ >>>> #undef SET_ASM_OP >>>> #define SET_ASM_OP "\t.dummy\t" >>>> But perhaps it would require fewer variants than providing inline asm >>>> of the __sync_* builtin by hand for all the targets that need it. >>> >>> Yeah, that's why I prefer the sed approach. GCC knows how to do exactly >>> what we want, and what syntax to use. We just need to take its output and >>> change the function name. >>> >>> And like Richard says, parsing top-level asms would be fair game, >>> especially if GCC and binutils ever were integrated (for libgccjit.so). >>> So using top-level asms seems like putting off the inevitable >>> (albeit putting it off further than __asm renaming). >>> >>> Do either of you object to the sed thing? >> >> Well, ideally there would be a way to not lie about symbol names to GCC. >> That is, have a "native" way of telling GCC what to do here (which is >> as far as I understand to emit the code for the builtin-handled $SYM >> in a function with $SYM - provide the out-of-line copy for a builtin). >> >> For the __sync functions it's unfortunate that the library function has >> the same 'name' as the builtin and the builtin doesn't have an alternate >> spelling. So - can't we just add __builtin__sync_... spellings and use >> >> __sync_synchronize () >> { >> __builtin_sync_syncronize (); >> } >> >> ? (what if __builtin_sync_syncronize expands to a libcall? if it can't, >> what's the point of the library function?) > > It can't expand to a libcall in nomips16 mode. It always expands to a > libcall in mips16 mode. The point is to provide out-of-line nomips16 > functions that the mips16 code can call. > >> Why does a simple >> >> __sync_synchronize () >> { >> __sync_synchronize (); >> } >> >> not work? On x86_64 I get from that: >> >> __sync_synchronize: >> .LFB0: >> .cfi_startproc >> mfence >> ret >> .cfi_endproc >> >> (similar to how you can have a definition of memcpy () and have >> another memcpy inside it inline-expanded) > > Is that with optimisation enabled? -O2 gives me: > > __sync_synchronize: > .LFB0: > .cfi_startproc > .p2align 4,,10 > .p2align 3 > .L2: > jmp .L2 > .cfi_endproc > .LFE0: > > We do want to compile this stuff with optimisation enabled.
I compiled with -O1 only. Yes, at -O2 I get the infinite loop. Eventually we should simply not build cgraph edges _from_ calls to builtins? Or disable tail recursion for builtin calls (tail-recursion is what does this optimization). Honza? For tailr this boils down to symtab_semantically_equivalent_p (). I suppose we don't want to change that but eventually special case builtins in tailr, thus Index: gcc/tree-tailcall.c =================================================================== --- gcc/tree-tailcall.c (revision 203409) +++ gcc/tree-tailcall.c (working copy) @@ -446,7 +446,9 @@ find_tail_calls (basic_block bb, struct /* We found the call, check whether it is suitable. */ tail_recursion = false; func = gimple_call_fndecl (call); - if (func && recursive_call_p (current_function_decl, func)) + if (func + && ! DECL_BUILT_IN (func) + && recursive_call_p (current_function_decl, func)) { tree arg; which makes -O2 not turn it into an infinite loop (possibly also applies to the original code with the alias declaration?) Thanks, Richard. > Thanks, > Richard