------- Comment #7 from mattst88 at gmail dot com  2010-04-08 17:53 -------
(In reply to comment #4)
> (In reply to comment #0)
> > When this testcase, using inline assembly, is compiled with -Os, -O2, or -O3
> > it segfaults. -O0 and -O1 allow it to run correctly.
> >
> > Moving the inline assembly into a separate file and including it in the
> > compilation allow the program to run correctly at all -O levels.
>
> From these symptoms, it is practically certain that you have done something
> wrong with the asm inputs and outputs.  I don't have an Alpha compiler to hand,
> but just from looking at your code, I bet it will work correctly if you rewrite
> it like so:
>
> unsigned long rewritten(const unsigned long b[2]) {
>     unsigned long ofs, output;
>
>     asm(
>         "cmoveq %0,64,%1 # ofs = (b[0] ? ofs : 64);\n"
>         "cmoveq %0,%2,%0 # temp = (b[0] ? b[0] : b[1]);\n"
>         "cttz %0,%0 # output = cttz(temp);\n"
>         : "=r" (output), "=r" (ofs)
>         : "r" (b[1]), "0" (b[0]), "1" (0)
>     );
>     return output + ofs;
> }

Yep, your code works.

> (I've assumed that the semantic of "cmoveq a,b,c" is "if (a==0) c=b;")
>
> The trick with asm() is to do as little as possible.  I assume that the reason
> the assembly version beats the pure-C version is the cmoveq's, so I stripped
> the setup code and the addition.  This allows me to express the _real_ argument
> constraints rather than fake ones, which lets me be confident that the
> optimizers will do what you want.  Note that this also means "volatile" is
> unnecessary.
>
> As a general principle, if you find yourself writing an asm() with a big long
> list of earlyclobber outputs but no inputs, you are doing it wrong.
>

Thanks a ton for the advice. You knocked that out of the water. Marking as
INVALID.

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43691
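
For anyone wanting to sanity-check the quoted rewrite without Alpha hardware, the behaviour it is meant to have can be sketched in portable C. This is only an illustration and not code from the report: the name reference_cttz2 is hypothetical, __builtin_ctzll is the GCC/Clang builtin standing in for the cttz instruction, and the sketch assumes both that "cmoveq a,b,c" means "if (a==0) c=b;" (as stated in comment #4) and that cttz yields 64 for a zero operand.

```c
#include <stdint.h>

/* Hypothetical portable reference for what the quoted rewritten() computes:
 * the index of the lowest set bit in a 128-bit bitmap held in two 64-bit
 * words. Not part of the original testcase. */
static uint64_t reference_cttz2(const uint64_t b[2])
{
    /* First cmoveq: ofs starts at 0 and becomes 64 only when b[0] == 0. */
    uint64_t ofs = (b[0] == 0) ? 64 : 0;

    /* Second cmoveq: scan b[1] only when b[0] has no bits set. */
    uint64_t temp = (b[0] == 0) ? b[1] : b[0];

    /* __builtin_ctzll(0) is undefined, so the zero case is handled
     * explicitly. ASSUMPTION: Alpha's cttz returns 64 for a zero input. */
    uint64_t output = (temp == 0) ? 64 : (uint64_t)__builtin_ctzll(temp);

    return output + ofs;
}
```

Under those assumptions, a bitmap of {8, 0} gives 3 and {0, 1} gives 64. Comparing this against the asm version at each -O level is one way to confirm that the constraints, rather than the instructions, were at fault.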