On Wed, May 2, 2012 at 11:46 PM, Marc Glisse <marc.gli...@inria.fr> wrote:
> Hello,
>
> I definitely don't expect the attached patch to be accepted, but I would
> like some advice on the direction to go, and a patch that passes the
> testsuite and does the optimization I want on a couple of testcases seems
> like it may help start the conversation. This is the first time I have even
> looked at .md files...
>
> The goal is to optimize: v8sf x; v4sf y=*(v4sf*)&x; so the compiler doesn't
> copy x to memory (yes, I know there is an intrinsic to do that).
>
> If I understood Richard Guenther's comment in the PR, it can be optimized in
> the back-end.

Expand simply uses a subreg to access the lower half.  I suppose we are
missing some simplify_rtx/combine logic that ends up using a vec_select
(can that even be used to select the lower/upper half?).  Uros?  We want
movq %ymm0, %xmm1 (or rather simply use xmm0 in consumers) here.
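
For reference, a minimal sketch of the kind of testcase under discussion,
built from the expression quoted above. The function name low_half is
hypothetical, and this is not the pr53101.c from the patch. It assumes the
GCC vector extensions; compiling with something like "gcc -O2 -mavx -S"
should show whether x is still spilled to memory.

```c
/* GCC vector extensions: 8 and 4 packed floats. */
typedef float v8sf __attribute__((vector_size(32)));
typedef float v4sf __attribute__((vector_size(16)));

/* Hypothetical helper: return the low 128 bits of a 256-bit vector.
   The cast mirrors "v4sf y=*(v4sf*)&x;" from the original mail; it
   relies on GCC's handling of vector types and is not portable C. */
v4sf low_half(v8sf x)
{
    return *(v4sf *)&x;
}
```

Ideally the generated code contains no store of x followed by a reload of
its lower half.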

Richard.

> The only place I found to put this kind of transformation is
> define_peephole2. I couldn't figure out how to test whether 2 memory
> operands refer to the same address when they have different types (so
> match_dup is unhappy). For some reason the XEXP(*,0) comparison said yes on
> my test and no when using an unrelated piece of memory, but it looks like a
> nonsense test that just got lucky on a couple of trivial examples.
>
> Any help?
>
>
> 2012-05-02  Marc Glisse  <marc.gli...@inria.fr>
>        PR target/53101
>
> gcc/
>        * config/i386/sse.md: New peephole2 for subvectors.
>
> gcc/testsuite/
>        * gcc.target/i386/pr53101.c: New test.
>
>
> --
> Marc Glisse
