------- Comment #7 from mattst88 at gmail dot com  2010-04-08 17:53 -------
(In reply to comment #4)
> (In reply to comment #0)
> > When this testcase, using inline assembly, is compiled with -Os, -O2, or -O3
> > it segfaults. -O0 and -O1 allow it to run correctly.
> > 
> > Moving the inline assembly into a separate file and including it in the
> > compilation allow the program to run correctly at all -O levels.
> 
> From these symptoms, it is practically certain that you have done something
> wrong with the asm inputs and outputs.  I don't have an Alpha compiler to hand,
> but just from looking at your code, I bet it will work correctly if you rewrite
> it like so:
> 
> unsigned long rewritten(const unsigned long b[2]) {
>         unsigned long ofs, output;
> 
>         asm(
>                 "cmoveq %0,64,%1        # ofs    = (b[0] ? ofs : 64);\n"
>                 "cmoveq %0,%2,%0        # temp   = (b[0] ? b[0] : b[1]);\n"
>                 "cttz   %0,%0           # output = cttz(temp);\n"
>                 : "=r" (output), "=r" (ofs)
>                 : "r" (b[1]), "0" (b[0]), "1" (0)
>         );
>         return output + ofs;
> }

Yep, your code works.

> (I've assumed that the semantics of "cmoveq a,b,c" are "if (a==0) c=b;")
> 
> The trick with asm() is to do as little as possible.  I assume that the reason
> the assembly version beats the pure-C version is the cmoveq's, so I stripped
> the setup code and the addition.  This allows me to express the _real_ argument
> constraints rather than fake ones, which lets me be confident that the
> optimizers will do what you want.  Note that this also means "volatile" is
> unnecessary.
> 
> As a general principle, if you find yourself writing an asm() with a big long
> list of earlyclobber outputs but no inputs, you are doing it wrong.
> 
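
For anyone who wants to check the asm against plain C, here is a minimal
sketch of what the rewritten version computes. It assumes the cmoveq
semantics quoted above ("if (a==0) c=b;") and that cttz returns 64 for a
zero input. The names reference_count and self_check are made up for
illustration, and __builtin_ctzl is GCC's builtin, not part of the testcase.

```c
#include <assert.h>

/* Hypothetical pure-C model of rewritten(); not the code from the testcase. */
unsigned long reference_count(const unsigned long b[2])
{
        /* cmoveq %0,64,%1: ofs starts at 0 and becomes 64 only if b[0] == 0 */
        unsigned long ofs = (b[0] == 0) ? 64 : 0;

        /* cmoveq %0,%2,%0: use b[1] only if b[0] == 0 */
        unsigned long temp = (b[0] == 0) ? b[1] : b[0];

        /* cttz %0,%0: __builtin_ctzl(0) is undefined, so handle zero by hand.
         * Returning 64 for zero is an assumption about the Alpha instruction. */
        unsigned long output = temp ? (unsigned long)__builtin_ctzl(temp) : 64;

        return output + ofs;
}

/* Small sanity check of the model on a few hand-picked inputs. */
void self_check(void)
{
        const unsigned long low_set[2]  = { 8, 0 };   /* bit 3 of b[0] */
        const unsigned long high_set[2] = { 0, 1 };   /* bit 0 of b[1] */

        assert(reference_count(low_set) == 3);
        assert(reference_count(high_set) == 64);
}
```

Comparing rewritten() against a model like this at each -O level is a cheap
way to confirm that the constraints say what the asm actually does.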

Thanks a ton for the advice. You knocked that out of the park.

Marking as INVALID.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43691
