RTL representation of i386 shrdl instruction is incorrect?
Hello,

I was studying the i386 machine description for my research, and I stumbled upon the following MD entry for the x86 'shrdl' instruction. It is taken from the most recent i386.md file.

    (define_insn "x86_shrd"
      [(set (match_operand:SI 0 "nonimmediate_operand" "+r*m")
            (ior:SI (ashiftrt:SI (match_dup 0)
                      (match_operand:QI 2 "nonmemory_operand" "Ic"))
                    (ashift:SI (match_operand:SI 1 "register_operand" "r")
                      (minus:QI (const_int 32) (match_dup 2)))))
       (clobber (reg:CC FLAGS_REG))]
      ""
      "shrd{l}\t{%s2%1, %0|%0, %1, %2}"
      [(set_attr "type" "ishift")
       (set_attr "prefix_0f" "1")
       (set_attr "mode" "SI")
       (set_attr "pent_pair" "np")
       (set_attr "athlon_decode" "vector")
       (set_attr "amdfam10_decode" "vector")
       (set_attr "bdver1_decode" "vector")])

It seems to me that the RTL representation of 'shrdl' is incorrect.

The semantics of the shrdl instruction as per the Intel manual are:
"The instruction shifts the first operand (destination operand) to the right the number of bits specified by the third operand (count operand). The second operand (source operand) provides bits to shift in from the left (starting with the most significant bit of the destination operand)."

The RTL expresses this as the inclusive-or of the arithmetically right-shifted destination and the left-shifted source operand.

The problem is that when the destination (reg/mem) contains a negative value, the arithmetically right-shifted destination has its top bits set to 1. Inclusive-or with such a value generates a result with the top bits set to 1 instead of moving the contents of the source into the top bits of the destination.

E.g., when ebx = b72f60d0, ebp = bfcbd2c8
    shrdl $16, %ebp, %ebx    (ebx is dest, ebp is src)
produces 0xd2c8b72f in ebx.
But the corresponding RTL produces 0xffffb72f in ebx.

So it seems to me that the RTL should have 'lshiftrt' instead of 'ashiftrt'. Can anyone help me with this confusion?

--
Thanks,
Niranjan Hasabnis,
PhD student,
Secure Systems Lab,
Stony brook University,
Stony brook, NY.
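The discrepancy described in this message can be checked numerically. The following is a minimal sketch in Python, assuming 32-bit operands and a shift count strictly between 0 and 32; the function names are hypothetical and exist only for this illustration. It evaluates the Intel description of shrd and the ior/ashiftrt/ashift expression read literally, on the operands from the message.

```python
MASK32 = 0xFFFFFFFF

def lshiftrt32(x, n):
    """Logical right shift of a 32-bit value: zeros enter from the left."""
    return (x & MASK32) >> n

def ashiftrt32(x, n):
    """Arithmetic right shift of a 32-bit value: the sign bit is replicated."""
    x &= MASK32
    if x & 0x80000000:
        x -= 1 << 32          # reinterpret the bit pattern as a negative integer
    return (x >> n) & MASK32  # Python's >> on negative integers is arithmetic

def ashift32(x, n):
    """Left shift truncated to 32 bits."""
    return (x << n) & MASK32

def shrd_intel(dest, src, n):
    """shrd as described in the Intel manual (illustrative, 0 < n < 32)."""
    return lshiftrt32(dest, n) | ashift32(src, 32 - n)

def shrd_rtl_as_written(dest, src, n):
    """The ior/ashiftrt/ashift expression from the pattern, read literally."""
    return ashiftrt32(dest, n) | ashift32(src, 32 - n)

ebx, ebp = 0xB72F60D0, 0xBFCBD2C8
print(hex(shrd_intel(ebx, ebp, 16)))
print(hex(shrd_rtl_as_written(ebx, ebp, 16)))
```

The two functions agree whenever the destination has its sign bit clear, and differ in the upper bits otherwise, which is the case the message singles out.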
Re: RTL representation of i386 shrdl instruction is incorrect?
Hi Richard,

Thanks for your reply. I looked into some of the details of how that particular RTL template is used. It seems to me that the template is used only when shifting a 64-bit data type on a 32-bit machine. This is the underlying assumption encoded in the i386.c file, which generates that particular RTL only when the instruction mode is DImode. If that is the case, then it won't matter whether one uses an arithmetic shift or a logical shift to right-shift the lower 4 bytes of an 8-byte value. In other words, the mapping between the RTL template and shrdl is incorrect, but the underlying assumption in i386.c guards the bug.

On Thu, Jun 5, 2014 at 3:51 AM, Richard Biener wrote:
> On Thu, Jun 5, 2014 at 12:03 AM, Niranjan Hasabnis wrote:
>> Hello,
>>
>> I was studying i386 machine description for my research purpose,
>> and I stumbled upon following MD entry for 'shrdl' x86 instruction.
>> It is obtained from the most recent i386.md file.
>>
>> (define_insn "x86_shrd"
>>   [(set (match_operand:SI 0 "nonimmediate_operand" "+r*m")
>>         (ior:SI (ashiftrt:SI (match_dup 0)
>>                   (match_operand:QI 2 "nonmemory_operand" "Ic"))
>>                 (ashift:SI (match_operand:SI 1 "register_operand" "r")
>>                   (minus:QI (const_int 32) (match_dup 2)))))
>>    (clobber (reg:CC FLAGS_REG))]
>>   ""
>>   "shrd{l}\t{%s2%1, %0|%0, %1, %2}"
>>   [(set_attr "type" "ishift")
>>    (set_attr "prefix_0f" "1")
>>    (set_attr "mode" "SI")
>>    (set_attr "pent_pair" "np")
>>    (set_attr "athlon_decode" "vector")
>>    (set_attr "amdfam10_decode" "vector")
>>    (set_attr "bdver1_decode" "vector")])
>>
>> It seems to me that the RTL representation for 'shrdl' is incorrect.
>>
>> Semantics of shrdl instruction as per Intel manual is:
>> "The instruction shifts the first operand (destination operand) to the right
>> the number of bits specified by the third operand (count operand).
>> The second operand (source operand) provides bits to shift in from the
>> left (starting with the most significant bit of the destination operand)."
>>
>> And the way RTL does it is by inclusive-or of arithmetically
>> right-shifted destination and left-shifted source operand.
>>
>> But the problem is that: in case of a destination (reg/mem) containing
>> negative value, arithmetically right-shifted destination will have top bits
>> set to 1. Inclusive-or with such a value is going to generate a
>> result with top bits set to 1 instead of moving contents of source
>> into top bits of destination.
>>
>> E.g., when ebx = b72f60d0, ebp = bfcbd2c8
>> shrdl $16, %ebp, %ebx (ebx is dest, ebp is src)
>> produces 0xd2c8b72f in ebx.
>> But the corresponding RTL produces 0xffffb72f in ebx.
>>
>> So it seems to me that instead of 'ashiftrt', RTL should have 'lshiftrt'.
>> Can anyone help me with this confusion?
>
> The way I read your explanation you are correct. It should be possible
> to write a testcase that is miscompiled - just try to produce the
> matched RTL pattern in C and feed it with operands at runtime that
> end up producing a bogus value when shrdl is used.
>
> Oh, and you might want to file a bugreport then ;)
>
> Richard.
>
>> --
>>
>> Thanks,
>> Niranjan Hasabnis,
>> PhD student,
>> Secure Systems Lab,
>> Stony brook University,
>> Stony brook, NY.

--
Thanks,
Niranjan Hasabnis,
PhD student,
Secure Systems Lab,
Stony brook University,
Stony brook, NY.
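The double-word use of shrd mentioned in this reply can be illustrated with a small model. This is only a sketch in Python under stated assumptions: it is not the code in i386.c, the function names are hypothetical, and it covers a logical 64-bit right shift by a count strictly between 0 and 32. The low word is produced by a shrd-style operation that takes its incoming bits from the high word, and the high word is shifted on its own.

```python
MASK32 = 0xFFFFFFFF

def shrd32(dest, src, n):
    """Model of shrd for 0 < n < 32: logical shift of dest, filled from src."""
    return ((dest & MASK32) >> n) | ((src << (32 - n)) & MASK32)

def lshr64_via_halves(value, n):
    """Logical 64-bit right shift by 0 < n < 32 using only 32-bit operations."""
    lo = value & MASK32
    hi = (value >> 32) & MASK32
    new_lo = shrd32(lo, hi, n)   # low word receives the bits leaving the high word
    new_hi = hi >> n             # high word is shifted on its own
    return (new_hi << 32) | new_lo

value = 0xBFCBD2C8B72F60D0
print(hex(lshr64_via_halves(value, 16)))
print(hex(value >> 16))
```

In this model the low word is always shifted logically, as in the Intel description, because the bits entering it come from the high word rather than from its own sign bit.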
Executing already executed optimization passes again
Hello,

I'm using a GCC plugin to do some analysis and modification on strict RTL. Both are done after all optimization passes are over, just before the strict RTL is converted into assembly. I would be happy to know whether it is possible to execute all the already executed optimization passes again after my analysis.

I read the GCC online documentation, but I could not find an answer to this question. It seems to me that the pass manager also does not allow something like this.

I went through the mailing list archive to find out whether this question has already been answered, but I couldn't find anything. Please forgive me if it has.

--
Thanks,
Niranjan Hasabnis,
PhD student,
Stony brook University.
Re: Executing already executed optimization passes again
I knew that doing the transformations at an earlier stage would be a cleaner solution, but for some reason I'm not able to do them at earlier stages. I was thinking that some optimizations, such as CSE and loop optimizations, should be reusable, and hence was exploring the idea of running them after my transformation. But the current GCC pass manager doesn't seem to allow this. Specifically, it seems that I cannot ask the pass manager to execute pass Pi if I have already executed pass Pj, where i < j. That's why I posted this question.

~ Niranjan

On Fri, Mar 18, 2011 at 9:23 PM, Paul Koning wrote:
> I don't know the answer to your specific question, but I was wondering: if
> you think it is useful to do optimization again, I think that means that the
> transformations you have in mind should be done at an earlier stage.
>
> By the time you hit register allocation, it's almost too late for anything.
>
> paul
>
> On Mar 18, 2011, at 9:03 PM, Niranjan Hasabnis wrote:
>
>> Hello,
>>
>> I'm using GCC plugin to do some analysis and modification on strict-RTL.
>> Both things are done after all optimization passes are over, and just
>> before strict-RTL is converted into assembly.
>> I would be happy to know if it is possible to execute all already
>> executed optimization passes again after my analysis.
>>
>> I read GCC online documentation, but I could not find answer to the question.
>> It seems to me that pass manager also does not allow something like this.
>>
>> I went through the mailing list archive to find out if this question
>> is already answered, but I couldn't find any.
>> Please forgive me if this is already answered.
>>
>> --
>>
>> Thanks,
>> Niranjan Hasabnis,
>> PhD student,
>> Stony brook University.

--
Thanks,
Niranjan Hasabnis,
PhD student,
Secure Systems Lab,
Stony brook University,
Stony brook, NY.
Testing machine descriptions
Hello,

I am very curious to know what kinds of testing techniques are applied to machine description files. To be precise, is there a unit-testing environment for when instructions are added to or removed from an MD file, or is some other testing technique applied? Also, when an MD file is patched to fix a bug, what kinds of tests are run on it? I'm very much interested in compiler internals, hence this question. Thank you.

--
Regards,
Niranjan Hasabnis.
Re: Testing machine descriptions
Hi DJ Delorie,

Thank you for your answer. It is useful. One more question: so does the main testsuite cover all md entries? Meaning all possible assembly instructions that could be generated by that md are checked by the main testsuite? Thank you again.

On Thu, Mar 27, 2014 at 7:22 PM, DJ Delorie wrote:
>
> I've thought about making a dejagnu testsuite specifically for helping
> with new ports, which would mean lots of md-specific tests, but
> really, the main testsuite probably covers everything you'd need to
> test. All patches are supposed to be regression tested anyway, which
> means running the full dejagnu testsuite before and after your change,
> to make sure you didn't break anything.

--
Regards,
Niranjan Hasabnis.
Re: Testing machine descriptions
Hi DJ Delorie,

Thank you for your answers. It is very much the information that I was looking for.

On Thu, Mar 27, 2014 at 7:57 PM, DJ Delorie wrote:
>
> The main testsuite doesn't have tests specifically to cover all the md
> entries. What I meant was, I suspect it covers enough plain C test
> cases to happen to use all the usual md entries.
>
> Since each target has different md entries (both "which are used" and
> "how each is used"), it would be nearly impossible to write an md
> testsuite. If you have target-specific patterns you want to test,
> you'd have to add a target-specific testsuite for them.

--
Regards,
Niranjan Hasabnis.
Re: Testing machine descriptions
Thanks for the info. I will try it out.

On Fri, Mar 28, 2014 at 12:07 AM, Senthil Kumar Selvaraj wrote:
> On Thu, Mar 27, 2014 at 07:51:06PM -0400, Niranjan Hasabnis wrote:
>> Hi DJ Delorie,
>>
>> Thank you for your answer. It is useful. One more question: so does the
>> main testsuite cover all md entries? Meaning all possible assembly
>> instructions that could be generated by that md are checked by the
>> main testsuite? Thank you again.
>>
>
> You can check it out yourself - just build gcc with coverage
> instrumentation turned on, and then use gcov.
>
> http://tryout.senthilthecoder.com/view/coverage/gcc/config/avr/avr.md.gcov.html
> shows coverage information for the AVR target's md file.
>
> Regards
> Senthil

--
Regards,
Niranjan Hasabnis.
Regarding x86 'sete' instruction and its corresponding RTL
Hello,

I'm a student and am currently studying compilers. I was studying GCC's i386 MD, and I found that the RTL insn mapped to the 'sete' assembly instruction seems to have exactly the opposite semantics to the 'sete' instruction itself. More details are below. If someone could clarify the issue, or let me know if I am wrong, that would be great.

RTL:
    (set (reg:QI 0 ax)
         (eq:QI (reg:CCZ 17 flags) (const_int 0)))

Assembly:
    sete %al

The semantics of the sete instruction (as per the Intel manual) are:
    if zero flag = 1, (reg:QI ax) = 1
    else (reg:QI ax) = 0

Whereas (I believe) the RTL semantics seem to say:
    if zero flag = 0, (reg:QI ax) = 1
    else (reg:QI ax) = 0

This is because the 'eq' operator returns STORE_FLAG_VALUE when both operands of 'eq' are equal. Otherwise, it returns 0. This is exactly the opposite of the assembly semantics. Am I missing something? Please let me know.

Thanks.
Re: Regarding x86 'sete' instruction and its corresponding RTL
Hi Eric,

Thank you for your reply. I referred to section 13.10, and the description there does not precisely specify the result of a comparison with the CC register. Yes, you are right that, as per the description, a comparison with CC may not have anything to do with STORE_FLAG_VALUE. But it clearly says that when the comparison fails, the result is 0. And this seems to be exactly the opposite of the semantics of the 'sete' instruction. So the problem is still not solved. Am I misreading something? Please let me know.

On Fri, Apr 4, 2014 at 3:58 AM, Eric Botcazou wrote:
>>
>> RTL: (set (reg:QI 0 ax)
>> (eq:QI (reg:CCZ 17 flags) (const_int 0)))
>>
>> Assembly: sete %al
>>
>>
>> Semantics of sete instruction is (as per Intel manual):
>> if zero flag = 1, (reg:QI ax) = 1
>> else (reg:QI ax) = 0
>>
>> Where as (I believe) RTL semantics seems to say that:
>> - if zero flag = 0, (reg:QI ax) = 1
>> else (reg:QI ax) = 0
>>
>> This is because 'eq' operator returns STORE_FLAG_VALUE when both
>> operands of 'eq' are equal. Otherwise, it returns 0. This is exactly
>> opposite of what assembly semantics is.
>
> No, that's wrong, the semantics of the comparison operators applied to the CC
> register have nothing to do with STORE_FLAG_VALUE (see manual section 13.10).
>
>
> Eric Botcazou

--
Regards,
Niranjan Hasabnis.
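One reading that appears to reconcile the two descriptions is to treat (reg:CCZ flags) not as the value of the ZF bit, but as standing for the result of the comparison that set the flags, so that comparing it with (const_int 0) asks whether the compared operands were equal. The following Python sketch illustrates that reading only. It is an interpretation, not something stated in the thread; modelling the flags register by a 32-bit difference is an assumption made for illustration, and all function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def compare(a, b):
    """Model of a compare that sets the flags from operands a and b.

    Returns the abstract comparison result that the flags register is taken
    to stand for in this sketch, together with the zero flag cmp would set.
    """
    diff = (a - b) & MASK32
    zf = 1 if diff == 0 else 0
    return diff, zf

def rtl_eq_flags_zero(cc_value):
    """(eq:QI (reg:CCZ flags) (const_int 0)) under this reading:
    1 when the compared operands were equal, else 0."""
    return 1 if cc_value == 0 else 0

def sete(zf):
    """sete per the Intel description: 1 if ZF is set, else 0."""
    return 1 if zf == 1 else 0

for a, b in [(5, 5), (5, 7)]:
    cc_value, zf = compare(a, b)
    print(a, b, rtl_eq_flags_zero(cc_value), sete(zf))
```

Under this reading the RTL expression and sete yield the same value for every pair of operands. The apparent inversion arises only if the flags register in the RTL is read as the literal ZF bit.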