On Wed, May 2, 2012 at 11:46 PM, Marc Glisse <marc.gli...@inria.fr> wrote:
> Hello,
>
> I definitely don't expect the attached patch to be accepted, but I would
> like some advice on the direction to go, and a patch that passes the
> testsuite and does the optimization I want on a couple testcases seems like
> it may help start the conversation. This is the first time I even look at
> .md files...
>
> The goal is to optimize: v8sf x; v4sf y=*(v4sf*)&x; so the compiler doesn't
> copy x to memory (yes, I know there is an intrinsic to do that).
>
> If I understood Richard Guenther's comment in the PR, it can be optimized in
> the back-end.
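
For reference, the testcase under discussion can be written as a small
standalone C file.  This is only a sketch: the typedefs use GCC's
vector_size extension, the function name low_half is made up here, and
the actual pr53101.c from the patch may look different.

```c
/* Sketch of the testcase; names are illustrative, not taken from the patch.  */
typedef float v8sf __attribute__ ((vector_size (32)));
typedef float v4sf __attribute__ ((vector_size (16)));

/* Return the low 128 bits of a 256-bit vector through a pointer cast.
   Ideally the compiler reads the low half straight from the register
   holding x instead of spilling x to the stack first.  */
v4sf
low_half (v8sf x)
{
  return *(v4sf *) &x;
}
```

Compiling this with something like "gcc -O2 -mavx -S" and checking
whether x is stored to the stack and then reloaded should show whether
the copy through memory is still there.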

Expand simply uses a subreg to access the lower half.  I suppose we miss
some simplify_rtx/combine logic that ends up using a vec_select (can that
even be used to select the lower/upper half?).

Uros?  We want movq %ymm0, %xmm1 (or rather simply use xmm0 in consumers)
here.

Richard.

> The only way I found to place this kind of transformation is
> with define_peephole2. And I couldn't figure out how to test if 2 memory
> operands correspond to the same address, with different types (so match_dup
> is unhappy), and for some reason the XEXP(*,0) comparison said yes on my
> test and no when using an unrelated piece of memory, but it looks like a
> nonsense test that is just lucky on a couple trivial examples.
>
> Any help?
>
>
> 2012-05-02  Marc Glisse  <marc.gli...@inria.fr>
>         PR target/53101
>
> gcc/
>         * config/i386/sse.md: New peephole2 for subvectors.
>
> gcc/testsuite/
>         * gcc.target/i386/pr53101.c: New test.
>
>
> --
> Marc Glisse