On Mon, 3 Dec 2012, Uros Bizjak wrote:

On Sun, Dec 2, 2012 at 1:30 PM, Marc Glisse <marc.gli...@inria.fr> wrote:

here is a patch. If it is accepted, I'll extend it to other vm patterns
(mul, div, min, max are likely candidates, but I need to check the doc).
It
passed bootstrap+testsuite on x86_64-linux.


2012-12-01  Marc Glisse  <marc.gli...@inria.fr>

        PR target/54855
gcc/
        * config/i386/sse.md (<sse>_vm<plusminus_insn><mode>3): Rewrite
        pattern.
        * config/i386/i386-builtin-types.def: New function types.
        * config/i386/i386.c (ix86_expand_args_builtin): Likewise.
        (bdesc_args) <__builtin_ia32_addss, __builtin_ia32_subss,
        __builtin_ia32_addsd, __builtin_ia32_subsd>: Change prototype.
        * config/i386/xmmintrin.h: Adapt to new builtin prototype.
        * config/i386/emmintrin.h: Likewise.
        * doc/extend.texi (X86 Built-in Functions): Document changed
prototype.

testsuite/
        * gcc.target/i386/pr54855-1.c: New testcase.
        * gcc.target/i386/pr54855-2.c: New testcase.


Yes, the approach looks correct to me, but I wonder why we have
different representations for v4sf and v2df cases? I'd say that we
should canonicalize patterns somewhere in the middle end (probably to
vec_merge variant, as IMO vec_dup looks like degenerated vec_merge
variant), otherwise we will have pattern explosion.


(I assume s/vec_dup/vec_concat/ above)

Ah, yes.

However, looking a bit more into the usage cases for these patterns,
they are only used through intrinsics with _m128 operands. While your
proposed patch makes these patterns more general (they can use 64bit
aligned memory), this is not their usual usage, and for their intended
usage, your proposed improvement complicates these patterns
unnecessarily. Following on these facts, I'd say that we leave these
special patters (since they serve their purpose well) and rather
introduce new patterns for "other" uses.

You mean like in the original patch?
http://gcc.gnu.org/ml/gcc-patches/2012-10/msg01279.html

(it only had the V2DF version, not the V4SF one)

Funny how we switched sides, now I am the one who would rather have a single pattern instead of having one for the builtin and one for recog. It seems that once we add the new pattern, keeping the old one is a waste of maintenance time, and the few extra rtx from the slightly longer pattern for these seldomly used builtins should be negligible.

But I don't mind, if that's the version you prefer, I'll update the patch.

Thanks,

--
Marc Glisse

Reply via email to