Re: Vector registers on MIPS arch

2016-04-06 Thread Ilya Enkovich
2016-04-06 1:50 GMT+03:00 David Guillen Fandos :
>
> Thanks again Ilya,
>
> That seems to help to solve the problem. Now I'm facing another issue.
> It seems the tree-vec-generic pass is promoting my vector operations to
> BLKmode and therefore the VECTOR_MODE_P macro evaluates to false,
> falling back to scalar mode.
> I thought I had it working for a moment when I forgot to fix the
> HARD_REGNO_MODE_OK macro, which evaluated to false for vector registers.
> In that case I managed to dodge this issue, but I see another issue
> regarding register allocation (obviously! :P)
>
> So the bottom line would be, how do I make sure that my "compute_type"
> is V4SF instead of BLKmode? Where does this promotion happen?

TYPE_MODE macro for vectors is actually a call to vector_type_mode.  You
should probably look at it first.  You may also check what mode_for_vector
returns for float vector in your case.
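
A rough model of the fallback being pointed at, as a sketch only.  It
assumes (from memory of the GCC sources of that period, so please check
tree.c) that vector_type_mode keeps the vector mode only when the target
reports the mode as supported and some hard register can hold it, and
returns BLKmode otherwise.  Every toy_* name below is hypothetical and is
not a GCC identifier; the real function also tries a same-sized integer
mode for integer vectors, which is left out here.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for GCC machine modes; not the real enum.  */
enum toy_mode
{
  TOY_BLKmode,
  TOY_SFmode,
  TOY_V4SFmode,
  TOY_NUM_MODES
};

/* Hypothetical target summary: which modes the back end accepts as
   vector modes, and which modes at least one hard register can hold.  */
struct toy_target
{
  bool vector_mode_supported[TOY_NUM_MODES];
  bool have_regs_of_mode[TOY_NUM_MODES];
};

/* Only V4SF counts as a vector mode in this toy model.  */
static bool
toy_vector_mode_p (enum toy_mode mode)
{
  return mode == TOY_V4SFmode;
}

/* Keep the natural vector mode only if the target supports it and has
   registers for it; otherwise demote the type to BLKmode.  */
static enum toy_mode
toy_vector_type_mode (const struct toy_target *t, enum toy_mode natural)
{
  assert (toy_vector_mode_p (natural));
  if (!t->vector_mode_supported[natural] || !t->have_regs_of_mode[natural])
    return TOY_BLKmode;
  return natural;
}
```

Under this model, a register-mode check that rejects V4SF for every
register is enough to turn a V4SF compute type into BLKmode, even when
the vector mode itself is declared as supported.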

Ilya

>
> Thanks a lot!
> David


Unnecessary check on phi node in tree if-conversion?

2016-04-06 Thread Bin.Cheng
Hi,
Function if_convertible_phi_p has the check below on virtual PHI nodes:


  if (any_mask_load_store)
    return true;

  /* When there were no if-convertible stores, check
     that there are no memory writes in the branches of the loop to be
     if-converted.  */
  if (virtual_operand_p (gimple_phi_result (phi)))
    {
      imm_use_iterator imm_iter;
      use_operand_p use_p;

      if (bb != loop->header)
        {
          if (dump_file && (dump_flags & TDF_DETAILS))
            fprintf (dump_file, "Virtual phi not on loop->header.\n");
          return false;
        }

      FOR_EACH_IMM_USE_FAST (use_p, imm_iter, gimple_phi_result (phi))
        {
          if (gimple_code (USE_STMT (use_p)) == GIMPLE_PHI
              && USE_STMT (use_p) != phi)
            {
              if (dump_file && (dump_flags & TDF_DETAILS))
                fprintf (dump_file, "Difficult to handle this virtual phi.\n");
              return false;
            }
        }
    }

Since the check is short-circuited by any_mask_load_store, the check
being triggered means there is a virtual PHI node but no conditional
memory store.  Is this possible?  Also, any_mask_load_store is set by
the code below in if_convertible_gimple_assign_stmt_p:
  if (ifcvt_can_use_mask_load_store (stmt))
    {
      gimple_set_plf (stmt, GF_PLF_2, true);
      *any_mask_load_store = true;
      return true;
    }

So in theory the aforementioned check can be skipped even when only a
mask load is encountered.
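
To make the concern concrete, here is a small stand-alone model of the
quoted fragment only (not of the whole of if_convertible_phi_p, which
has further checks).  The struct and the toy_* names are hypothetical
and only summarize the three properties the fragment inspects.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of one PHI node; not GCC's gphi.  */
struct toy_phi
{
  bool is_virtual;        /* Result is a virtual operand (.MEM).  */
  bool in_loop_header;    /* PHI sits in loop->header.  */
  bool used_by_other_phi; /* Some other PHI uses the result.  */
};

/* Mirrors the control flow of the quoted fragment: the flag returns
   early, so the virtual PHI checks run only when it is false.  */
static bool
toy_if_convertible_phi_p (const struct toy_phi *phi,
                          bool any_mask_load_store)
{
  if (any_mask_load_store)
    return true;

  if (phi->is_virtual)
    {
      if (!phi->in_loop_header)
        return false;
      if (phi->used_by_other_phi)
        return false;
    }
  return true;
}
```

In this model a virtual PHI outside the loop header is rejected when the
flag is false and accepted when it is true, regardless of whether the
flag was set by a masked store or only by a masked load.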

Any ideas?

Thanks,
bin


Re: Unnecessary check on phi node in tree if-conversion?

2016-04-06 Thread Bin.Cheng
On Wed, Apr 6, 2016 at 5:07 PM, Bin.Cheng wrote:
> Hi,
> Function if_convertible_phi_p has the check below on virtual PHI nodes:
>
>
>   if (any_mask_load_store)
>     return true;
>
>   /* When there were no if-convertible stores, check
>      that there are no memory writes in the branches of the loop to be
>      if-converted.  */
>   if (virtual_operand_p (gimple_phi_result (phi)))
>     {
>       imm_use_iterator imm_iter;
>       use_operand_p use_p;
>
>       if (bb != loop->header)
>         {
>           if (dump_file && (dump_flags & TDF_DETAILS))
>             fprintf (dump_file, "Virtual phi not on loop->header.\n");
>           return false;
>         }
>
>       FOR_EACH_IMM_USE_FAST (use_p, imm_iter, gimple_phi_result (phi))
>         {
>           if (gimple_code (USE_STMT (use_p)) == GIMPLE_PHI
>               && USE_STMT (use_p) != phi)
>             {
>               if (dump_file && (dump_flags & TDF_DETAILS))
>                 fprintf (dump_file, "Difficult to handle this virtual phi.\n");
>               return false;
>             }
>         }
>     }
>
> Since the check is short-circuited by any_mask_load_store, the check
> being triggered means there is a virtual PHI node but no conditional
> memory store.  Is this possible?  Also, any_mask_load_store is set by
> the code below in if_convertible_gimple_assign_stmt_p:
>   if (ifcvt_can_use_mask_load_store (stmt))
>     {
>       gimple_set_plf (stmt, GF_PLF_2, true);
>       *any_mask_load_store = true;
>       return true;
>     }
>
> So in theory the aforementioned check can be skipped even when only a
> mask load is encountered.
>
> Any ideas?
It's possible to have a loop like:


  .MEM_2232 = PHI <.MEM_574(179), .MEM_1247(183)>
  ...
  if (cond)
goto 
  else
goto 

:  //empty
:
  .MEM_1247 = PHI <.MEM_2232(180), .MEM_2232(181)>
  if (cond2)
goto 
  else
goto 

:
  goto 

So can if-cvt handle a PHI like .MEM_1247 above, which is degenerate
(all of its arguments are the same name)?
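
A minimal sketch of what "degenerate" means here, independent of GCC's
data structures: a PHI whose arguments all name the same SSA version
merges nothing and could be replaced by that single argument.  The
function name and the use of plain integers for SSA versions are
assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true when every PHI argument is the same SSA version.
   ARGS holds the version numbers, e.g. 2232 for .MEM_2232.  */
static bool
toy_phi_degenerate_p (const int *args, size_t nargs)
{
  assert (nargs > 0);
  for (size_t i = 1; i < nargs; i++)
    if (args[i] != args[0])
      return false;
  return true;
}
```

With the dump above, the arguments of .MEM_1247 are {2232, 2232}, which
this test reports as degenerate, while the header PHI .MEM_2232 with
arguments {574, 1247} is not.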

Thanks,
bin


Re: Vector registers on MIPS arch

2016-04-06 Thread David Guillen Fandos
On 06/04/16 10:44, Ilya Enkovich wrote:
> 2016-04-06 1:50 GMT+03:00 David Guillen Fandos :
>>
>> Thanks again Ilya,
>>
>> That seems to help to solve the problem. Now I'm facing another issue.
>> It seems the tree-vec-generic pass is promoting my vector operations to
>> BLKmode and therefore the VECTOR_MODE_P macro evaluates to false,
>> falling back to scalar mode.
>> I thought I had it working for a moment when I forgot to fix the
>> HARD_REGNO_MODE_OK macro, which evaluated to false for vector registers.
>> In that case I managed to dodge this issue, but I see another issue
>> regarding register allocation (obviously! :P)
>>
>> So the bottom line would be, how do I make sure that my "compute_type"
>> is V4SF instead of BLKmode? Where does this promotion happen?
> 
> TYPE_MODE macro for vectors is actually a call to vector_type_mode.  You
> should probably look at it first.  You may also check what mode_for_vector
> returns for float vector in your case.
> 
> Ilya
> 
>>
>> Thanks a lot!
>> David

Thanks a lot Ilya!

I managed to get it working. There were some bugs regarding register
allocation that ended up promoting the class to be BLKmode instead of
V4SFmode. I had to debug it a bit, which is tricky, but in the end I
found my way through it.

Just to finish this: do you think, from your experience, that it is
difficult to implement vector instructions that have variable sizes?
This particular VFU has 4, 3, 2 and 1 element operations with arbitrary
swizzling. That is, we can load a V3SF and perform a dot product
operation with another V3SF to get a V1SF, for instance. Of course the
elements might overlap, so if a vreg is A B C D we can have a 4-element
vector ABCD or a pair of 3-element vregs ABC and BCD; the same logic
gives 3 registers of V2SF type, and so forth. It is very flexible. It
also allows column and row arranging, so we can load 4 vectors in a 4x4
matrix and multiply them with another matrix, transposing them on the
fly.
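
The overlapping views can be modelled in plain C as follows. This is only
a sketch of the addressing idea (a 4-lane register seen as contiguous
windows of 1 to 4 lanes), not of the unit's instruction set; TOY_LANES
and the toy_* functions are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_LANES 4

/* Number of contiguous WIDTH-lane windows in one 4-lane register:
   1 for V4SF (ABCD), 2 for V3SF (ABC, BCD), 3 for V2SF, 4 for V1SF.  */
static size_t
toy_num_views (size_t width)
{
  assert (width >= 1 && width <= TOY_LANES);
  return TOY_LANES - width + 1;
}

/* Dot product of two WIDTH-lane windows that start at lanes OFF_A
   and OFF_B of registers A and B; the result is a single scalar.  */
static float
toy_dot (const float a[TOY_LANES], size_t off_a,
         const float b[TOY_LANES], size_t off_b, size_t width)
{
  assert (off_a + width <= TOY_LANES && off_b + width <= TOY_LANES);
  float sum = 0.0f;
  for (size_t i = 0; i < width; i++)
    sum += a[off_a + i] * b[off_b + i];
  return sum;
}
```

For example, with a = {1, 2, 3, 4} the window at offset 0 with width 3
is ABC and the window at offset 1 is BCD; both read lanes B and C of the
same register, which is the aliasing a register allocator would have to
understand.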

I guess this is too difficult to expose to gcc, which is more used to
Intel SIMD stuff. In the past I wrote most of the kernels in assembly
and wrapped them in C functions, but if you use classes and inline
functions, having gcc on your side helps a lot (register allocation and
therefore fewer loads/stores to memory).

Thanks a lot for your help!

David




gcc-4.9-20160406 is now available

2016-04-06 Thread gccadmin
Snapshot gcc-4.9-20160406 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.9-20160406/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.9 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_9-branch 
revision 234800

You'll find:

 gcc-4.9-20160406.tar.bz2 Complete GCC

  MD5=de1a3f88aefbb0bc76c88a5691c2c93e
  SHA1=bf9d3e27848d8b1c0be965168cc64ffb25c79897

Diffs from 4.9-20160330 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.9
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.