On 12/22/2011 11:33 AM, Richard Sandiford wrote:
> was it simply the mode punning from HI to V4HI that stops us from
> using the loongson_pinsr<V_suffix>_N insns?  E.g. something like:

Mostly...

> I realise this isn't exactly simpler in code terms, but it avoids the
> unspec and avoids the dual mode on match_operand 2 (HImode for matching,
> SImode for generation).  And with HImode not being valid for FPRs,
> I'd have expected:
> 
>     (subreg:V4HI (reg:HI R) 0)
> 
> to be as "bad" as:
> 
>     (subreg:SI (reg:HI R) 0)

Well, for one thing, note that I actually copy that subreg into a real
SImode pseudo.  So that sub-reggy-ness is fairly well forced to happen
in the integer registers.  At which point the SImode value is perfectly
happy to be copied into the fp regs.

The only other alternative that I see is to do the same with DImode
and then subreg from there to V4HImode.  While I only tested mips64el,
I would guess that using DImode would perform worse in 32-bit mode.


>>  /* Vector modes.  */
>> -VECTOR_MODES (INT, 8);        /*       V8QI V4HI V2SI */
>> -VECTOR_MODES (FLOAT, 8);      /*            V4HF V2SF */
>> -VECTOR_MODES (INT, 4);        /*            V4QI V2HI */
>> +VECTOR_MODES (INT, 8);        /*       V8QI  V4HI V2SI */
>> +VECTOR_MODES (FLOAT, 8);      /*             V4HF V2SF */
>> +VECTOR_MODES (INT, 4);        /*             V4QI V2HI */
> 
> Not sure about this bit.

That's just re-aligning the comment columns so that V16QI fits.

>> +/* Double-sized vector modes for vec_concat.  */
>> +VECTOR_MODE (INT, QI, 16);
>> +VECTOR_MODE (INT, HI, 8);
>> +VECTOR_MODE (INT, SI, 4);
>> +VECTOR_MODE (FLOAT, SF, 4);
> 
> It occured to me later that there might be an ABI impact with this
> on n32 and n64, due to the historical mistake of defining things
> in terms of modes (partly for the sake of libcalls):

Hmm.  Except that since mips_vector_mode_supported_p disallows
V4SFmode, you'll never wind up with variables of that type.

If the mips folk ever come up with another isa extension that
operates on larger register sets, this will make a difference...

>> +/* Construct (set target (vec_select op0 (parallel perm))) and
>> +   return true if that's a valid instruction in the active ISA.  */
>> +
>> +static bool
>> +expand_vselect (rtx target, rtx op0, const unsigned char *perm, unsigned 
>> nelt)
> 
> Stupid nitpick, but there's a mixture of statics without "mips_" and
> statics with.  Would be nice to have them all as all "mips_", just in case.

Ok.

> Is there an explicit target-independent requirement that
> TARGET_VECTORIZE_VEC_PERM_CONST_OK is only called in a
> context where insns can be emitted?

We start a sequence inside vec_perm_const_ok that prevents emit_insn
from actually doing anything.  It works everwhere else...

>> +      /* The backend (vec_select (vec_concat)) patterns are not duplicated
>> +     for single-operand.  Try once with the original un-folded selector. */
>> +      if (mips_expand_vec_perm_const_1 (&d))
>> +    return true;
> 
> What sort of input does this handle, and why don't we want the same thing
> for case 1 & 2 single vectors?

  V4HI a;
  __builtin_select (a, a, (V4HI){ 0, 4, 1, 5 })

When we called vec_perm_const_ok, we looked at the elements of the selector
and saw that they come from two different operands.  So we performed the test
with op0 != op1 and one_operand_p = false.

Here in expand_vec_perm_const, we see that a == a and want to fold to  { 0, 0, 
1, 1 }.
Well, this caused us problems in the i386 port where we had some double-operand
patterns that didn't match a single-operand pattern.

... of course on the i386 port we try the simplified version first.  Which 
really
makes more sense than this.  Not sure what I was thinking...


r~

Reply via email to