Richard Henderson <r...@redhat.com> writes:
> On 12/22/2011 11:33 AM, Richard Sandiford wrote:
>> was it simply the mode punning from HI to V4HI that stops us from
>> using the loongson_pinsr<V_suffix>_N insns?  E.g. something like:
>
> Mostly...
>
>> I realise this isn't exactly simpler in code terms, but it avoids the
>> unspec and avoids the dual mode on match_operand 2 (HImode for matching,
>> SImode for generation).  And with HImode not being valid for FPRs,
>> I'd have expected:
>> 
>>     (subreg:V4HI (reg:HI R) 0)
>> 
>> to be as "bad" as:
>> 
>>     (subreg:SI (reg:HI R) 0)
>
> Well, for one thing, note that I actually copy that subreg into a real
> SImode pseudo.  So that sub-reggy-ness is fairly well forced to happen
> in the integer registers.  At which point the SImode value is perfectly
> happy to be copied into the fp regs.

You mean the register allocator will prefer a GPR for the SImode pseudo,
so the need for one won't be hidden in an input reload of the subreg?
Not sure -- maybe it would get a GPR, but with one definition as "d"
and one use as "f", I'm not sure we can really rely on that.

> The only other alternative that I see is to do the same with DImode
> and then subreg from there to V4HImode.  While I only tested mips64el,
> I would guess that using DImode would perform worse in 32-bit mode.

Well, I wasn't thinking of using DImode, more going direct from
HImode to V4HImode, but I suppose the same thing would apply for
the subreg reload.  Never mind then.

>>>  /* Vector modes.  */
>>> -VECTOR_MODES (INT, 8);        /*       V8QI V4HI V2SI */
>>> -VECTOR_MODES (FLOAT, 8);      /*            V4HF V2SF */
>>> -VECTOR_MODES (INT, 4);        /*            V4QI V2HI */
>>> +VECTOR_MODES (INT, 8);        /*       V8QI  V4HI V2SI */
>>> +VECTOR_MODES (FLOAT, 8);      /*             V4HF V2SF */
>>> +VECTOR_MODES (INT, 4);        /*             V4QI V2HI */
>> 
>> Not sure about this bit.
>
> That's just re-aligning the comment columns so that V16QI fits.

OK, but there wasn't a V16QI comment that I could see.

>>> +/* Double-sized vector modes for vec_concat.  */
>>> +VECTOR_MODE (INT, QI, 16);
>>> +VECTOR_MODE (INT, HI, 8);
>>> +VECTOR_MODE (INT, SI, 4);
>>> +VECTOR_MODE (FLOAT, SF, 4);
>> 
>> It occured to me later that there might be an ABI impact with this
>> on n32 and n64, due to the historical mistake of defining things
>> in terms of modes (partly for the sake of libcalls):
>
> Hmm.  Except that since mips_vector_mode_supported_p disallows
> V4SFmode, you'll never wind up with variables of that type.
>
> If the mips folk ever come up with another isa extension that
> operates on larger register sets, this will make a difference...

I'd still rather be safe than sorry when it comes to ABI stuff.
We've defined the mode, and we've long allowed users to define:

    typedef float v4sf __attribute__((vector_size(16)));

regardless of backend support.  I realise we shouldn't be matching
the two up in the absence of bugs, but...

>> Is there an explicit target-independent requirement that
>> TARGET_VECTORIZE_VEC_PERM_CONST_OK is only called in a
>> context where insns can be emitted?
>
> We start a sequence inside vec_perm_const_ok that prevents emit_insn
> from actually doing anything.  It works everwhere else...

OK.

>>> +      /* The backend (vec_select (vec_concat)) patterns are not duplicated
>>> +    for single-operand.  Try once with the original un-folded selector. */
>>> +      if (mips_expand_vec_perm_const_1 (&d))
>>> +   return true;
>> 
>> What sort of input does this handle, and why don't we want the same thing
>> for case 1 & 2 single vectors?
>
>   V4HI a;
>   __builtin_select (a, a, (V4HI){ 0, 4, 1, 5 })
>
> When we called vec_perm_const_ok, we looked at the elements of the selector
> and saw that they come from two different operands.  So we performed the test
> with op0 != op1 and one_operand_p = false.
>
> Here in expand_vec_perm_const, we see that a == a and want to fold to
> { 0, 0, 1, 1 }.  Well, this caused us problems in the i386 port where
> we had some double-operand patterns that didn't match a single-operand
> pattern.

OK, but just so that I understand: does that mean we're missing some
single-operand cases that ought to be there, and this code is just
stopping us from ICEing when we hit them?  Would it be worth having
a gcc_checking_assert there?

Richard

Reply via email to