Re: [i386] scalar ops that preserve the high part of a vector

Marc Glisse Fri, 30 Nov 2012 04:35:03 -0800

On Sun, 14 Oct 2012, Marc Glisse wrote:

On Sun, 14 Oct 2012, Uros Bizjak wrote:
On Sat, Oct 13, 2012 at 10:52 AM, Marc Glisse <[email protected]> wrote:
Hello,

this patch provides an alternate pattern to let combine recognize scalar
operations that preserve the high part of a vector. If the strategy is all
right, I could do the same for more operations (mul, div, ...). Something
similar is also possible for V4SF (different pattern though), but probably
not as useful.
But, we _do_ have vec_merge pattern that describes the operation.
Adding another one to each operation just to satisfy combine is IMO
not correct approach.
At some point I wondered about _replacing_ the existing pattern, so there
would only be one ;-)
The vec_merge pattern takes as argument 2 vectors instead of a vector and a
scalar, and describes the operation as a vector operation where we drop half
of the result, instead of a scalar operation where we re-add the top half of
the vector. I don't know if that's the most convenient choice. Adding code in
simplify-rtx to replace vec_merge with vec_concat / vec_select might be
easier than the other way around.
If the middle-end somehow gave us:
(plus X (vec_concat Y 0))
it would seem a bit strange to add an optimization that turns it into:
(vec_merge (plus X (subreg:V2DF Y)) X 1)
but then producing:
(vec_concat (plus (vec_select X 0) Y) (vec_select X 1))
would be strange as well.
(ignoring the signed zero issues here)
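To make the two descriptions concrete, here is a rough C model of both forms for an add on the low lane. It is only a sketch: the type v2df and both function names are made up for illustration and are not GCC code, and the model treats a vector as a plain pair of doubles.

```c
/* Hypothetical model of a V2DF value: e[0] is the low lane, e[1] the high lane. */
typedef struct { double e[2]; } v2df;

/* vec_merge form: add both lanes as a vector, then keep lane 0 of the sum
   and lane 1 of x, roughly (vec_merge (plus x y) x 1). */
static v2df add_low_vec_merge(v2df x, v2df y)
{
    v2df sum = { { x.e[0] + y.e[0], x.e[1] + y.e[1] } };
    v2df r = { { sum.e[0], x.e[1] } };
    return r;
}

/* vec_concat form: scalar add on lane 0, then re-attach lane 1 of x, roughly
   (vec_concat (plus (vec_select x 0) s) (vec_select x 1)). */
static v2df add_low_vec_concat(v2df x, double s)
{
    v2df r = { { x.e[0] + s, x.e[1] } };
    return r;
}
```

Both functions return the same pair whenever s equals the low lane of y; the high lane of y never influences the result, which is why the second vector operand of the vec_merge form carries more than the operation needs.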
I'd rather see generic RTX simplification that
simplifies your proposed pattern to vec_merge pattern.
Ok, I'll see what I can do.
Also, as you mention in PR54855, Comment #5, the approach is too fragile...
I am not sure I can make the RTX simplification much less fragile... Whenever
I see (vec_concat X (vec_select Y 1)), I would have to check whether X is
some (possibly large) tree of scalar computations involving Y[0], move it all
to vec_merge computations, and fix other users of some of those scalars to
now use S[0]. Seems too hard, I would stop at single-operation X that is used
only once. Besides, the gain is larger in proportion when there is a single
operation :-)
Thank you for your comments,


Hello,

I experimented with the simplify-rtx transformation you suggested, see:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54855

It works when the argument is a register, but not for memory (which is
where the constant is in the testcase). And the description of the
operation in sse.md does seem problematic. It says the second argument is:


            (match_operand:VF_128 2 "nonimmediate_operand" "xm,xm"))

but Intel's documentation says "The source operand can be an XMM register
or a 64-bit memory location", not quite the same.

Do you think the .md description should really stay this way, or could we
change it to something that better reflects "64-bit memory location"?
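As a rough illustration of the size mismatch, here is a small C model. It is a sketch under assumptions: vec2d, add_low_from_mem and the two enum constants are made-up names, and the model only mirrors the documented behaviour of the memory form, it is not how the .md pattern is written.

```c
/* Hypothetical model of a 128-bit vector of two doubles. */
typedef struct { double e[2]; } vec2d;

/* Model of the memory form of the scalar add: only the 8 bytes at *p are
   read, and the high lane of x is passed through unchanged. */
static vec2d add_low_from_mem(vec2d x, const double *p)
{
    vec2d r = { { x.e[0] + *p, x.e[1] } };
    return r;
}

/* Sizes of the two candidate memory operands on a target with 8-byte double. */
enum {
    SCALAR_BYTES = sizeof(double), /* what the instruction reads */
    VECTOR_BYTES = sizeof(vec2d)   /* what a full-vector operand would cover */
};
```

In this model the source can be a lone double, such as a constant-pool entry, whereas an operand described as a full vector in memory would cover twice as many bytes.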


--
Marc Glisse
