RE: vectorization cost macro TARGET_VECTORIZE_ADD_STMT_COST

2015-08-04 Thread Ajit Kumar Agarwal


-----Original Message-----
From: Richard Biener [mailto:richard.guent...@gmail.com] 
Sent: Monday, August 03, 2015 2:59 PM
To: Ajit Kumar Agarwal
Cc: Jeff Law; gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta; Vidhumouli 
Hunsigida; Nagaraju Mekala
Subject: Re: vectorization cost macro TARGET_VECTORIZE_ADD_STMT_COST

On Sun, Aug 2, 2015 at 4:13 PM, Ajit Kumar Agarwal wrote:
> All:
>
> The following macro determines the statement cost that is added to the 
> vectorization cost.
>
> #define TARGET_VECTORIZE_ADD_STMT_COST.
>
> The implementations of this macro for many architectures that support 
> vectorization, such as i386 and ARM, contain the following:
>
> if (where == vect_body && stmt_info && stmt_in_inner_loop_p (stmt_info))
>   count *= 50;  /* FIXME.  */
>
> I have the following questions.
>
> 1. Why was the multiplication factor of 50 chosen?

>>It's a wild guess.  See 
>>tree-vect-loop.c:vect_get_single_scalar_iteration_cost.

> 2. The comment says that statements in an inner loop relative to the loop 
> being vectorized are given more weight. If more weight is given to the 
> inner loop, the chances of vectorizing the inner loop decrease. Why is the 
> inner-loop cost increased relative to the loop being vectorized?

>>In fact, adding more weight to the inner loop increases the chance of 
>>vectorizing it (if vectorizing the inner loop is profitable).
>>Both scalar and vector cost get biased by a factor of 50 (we assume 50 
>>iterations of the inner loop for one iteration of the outer loop), so a 
>>non-profitable vectorization in the outer loop can be offset by 
>>profitable inner loop vectorization.

>>Yes, '50' can be improved if we actually know the iteration count of the 
>>inner loop or if we have profile-feedback.
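
The arithmetic behind this explanation can be sketched roughly as follows.
This is only an illustration, not GCC's actual cost model: the function
names, the (cost, in_inner_loop) representation and the per-statement costs
are all hypothetical. Only the factor of 50 comes from the thread.

```python
# Illustrative only: a toy model of biasing inner-loop statement costs.
# All names and numbers except the factor 50 are assumptions.

def weighted_cost(stmts, inner_weight):
    """Sum statement costs, scaling inner-loop statements by inner_weight."""
    return sum(cost * (inner_weight if in_inner else 1)
               for cost, in_inner in stmts)

def vectorization_profitable(scalar_stmts, vector_stmts, inner_weight):
    """Vectorization wins if the weighted vector cost is strictly lower."""
    return (weighted_cost(vector_stmts, inner_weight)
            < weighted_cost(scalar_stmts, inner_weight))

# (cost, in_inner_loop): the outer-loop statement gets more expensive when
# vectorized (4 -> 7), the inner-loop statement gets cheaper (10 -> 8).
scalar_stmts = [(4, False), (10, True)]
vector_stmts = [(7, False), (8, True)]

with_bias = vectorization_profitable(scalar_stmts, vector_stmts, 50)
without_bias = vectorization_profitable(scalar_stmts, vector_stmts, 1)
```

With the weight of 50 the inner-loop saving dominates (407 against 504), so
the loss on the outer-loop statement is offset. With a weight of 1 the same
statements compare as 15 against 14 and vectorization looks unprofitable.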

Thanks for the valuable information and suggestions.

Thanks & Regards
Ajit

Richard.


> Thanks & Regards
> Ajit


Loop distribution for nested Loops.

2015-08-04 Thread Ajit Kumar Agarwal

All:

The loop given in Fig(1) cannot be distributed, because S1 and S2 both 
depend on the outer loop index k.

The loop can be distributed after the dependency hoisting transformation 
shown in Fig(2). Dependency hoisting inserts a new outer loop and shifts the 
dependency onto it, so that the dependency is carried by the inserted outer 
loop. This allows the k loop (S1) and the j loop (S2) to be distributed, 
with the dependency between S1 and S2 transferred to the inserted outer loop.

Do k = 1, n-1
  Do i = k+1, n
    S1:  a(i,k) = a(i,k) / a(k,k)
  Enddo
  Do j = k+1, n
    Do i = k+1, n
      S2:  a(i,j) = a(i,j) - a(i,k) * a(k,j)
    Enddo
  Enddo
Enddo

Fig(1)

Do x = 1, n
  Do k = 1, x-1
    Do i = k+1, n
      S2:  a(i,x) = a(i,x) - a(i,k) * a(k,x)
    Enddo
  Enddo
  Do i = x+1, n
    S1:  a(i,x) = a(i,x) / a(x,x)
  Enddo
Enddo

Fig(2).
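
As a sanity check, the two loop nests can be transcribed directly and run on
the same input. The sketch below is an assumption-laden illustration rather
than anything from GCC: it uses 0-based indices instead of the 1-based ones
in the figures, exact rational arithmetic to avoid rounding questions, and
hypothetical function names and test matrix.

```python
from fractions import Fraction

def lu_fig1(a):
    """Transcription of Fig(1); works on a copy of the matrix a."""
    a = [row[:] for row in a]
    n = len(a)
    for k in range(n - 1):
        for i in range(k + 1, n):
            a[i][k] = a[i][k] / a[k][k]                # S1
        for j in range(k + 1, n):
            for i in range(k + 1, n):
                a[i][j] = a[i][j] - a[i][k] * a[k][j]  # S2
    return a

def lu_fig2(a):
    """Transcription of Fig(2), with the inserted outer loop x."""
    a = [row[:] for row in a]
    n = len(a)
    for x in range(n):
        for k in range(x):
            for i in range(k + 1, n):
                a[i][x] = a[i][x] - a[i][k] * a[k][x]  # S2
        for i in range(x + 1, n):
            a[i][x] = a[i][x] / a[x][x]                # S1
    return a

# Hypothetical test matrix, chosen so that no pivot a(k,k) becomes zero.
m = [[Fraction(v) for v in row] for row in [[4, 3, 2], [2, 5, 1], [6, 1, 7]]]
same_result = lu_fig1(m) == lu_fig2(m)
```

On this input both versions produce the same matrix, which is consistent
with Fig(2) being a reordering of the computation in Fig(1).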

The above transformation looks interesting: it makes nested loops candidates 
for loop distribution in the presence of a dependency by shifting the 
dependency to a newly inserted outer loop.

It would be useful to have a dependency hoisting transformation that makes 
loop distribution possible for nested loops.

My question is: does the partitioning-based loop distribution transformation 
distribute nested loops?

Thanks & Regards
Ajit







Re: Loop distribution for nested Loops.

2015-08-04 Thread Richard Biener
On Tue, Aug 4, 2015 at 4:04 PM, Ajit Kumar Agarwal wrote:
>
> All:
>
> The loop given in Fig(1) cannot be distributed, because S1 and S2 both 
> depend on the outer loop index k.
>
> The loop can be distributed after the dependency hoisting transformation 
> shown in Fig(2). Dependency hoisting inserts a new outer loop and shifts 
> the dependency onto it, so that the dependency is carried by the inserted 
> outer loop. This allows the k loop (S1) and the j loop (S2) to be 
> distributed, with the dependency between S1 and S2 transferred to the 
> inserted outer loop.
>
> Do k = 1, n-1
>   Do i = k+1, n
>     S1:  a(i,k) = a(i,k) / a(k,k)
>   Enddo
>   Do j = k+1, n
>     Do i = k+1, n
>       S2:  a(i,j) = a(i,j) - a(i,k) * a(k,j)
>     Enddo
>   Enddo
> Enddo
>
> Fig(1)
>
> Do x = 1, n
>   Do k = 1, x-1
>     Do i = k+1, n
>       S2:  a(i,x) = a(i,x) - a(i,k) * a(k,x)
>     Enddo
>   Enddo
>   Do i = x+1, n
>     S1:  a(i,x) = a(i,x) / a(x,x)
>   Enddo
> Enddo
>
> Fig(2).
>
> The above transformation looks interesting: it makes nested loops 
> candidates for loop distribution in the presence of a dependency by 
> shifting the dependency to a newly inserted outer loop.
>
> It would be useful to have a dependency hoisting transformation that makes 
> loop distribution possible for nested loops.
>
> My question is: does the partitioning-based loop distribution 
> transformation distribute nested loops?

No, the patch implementing it
(https://gcc.gnu.org/ml/gcc-patches/2013-09/msg01293.html - maybe that
wasn't the latest version)
was never applied.

Richard.

> Thanks & Regards
> Ajit


gcc-5-20150804 is now available

2015-08-04 Thread gccadmin
Snapshot gcc-5-20150804 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/5-20150804/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 5 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-5-branch 
revision 226596

You'll find:

 gcc-5-20150804.tar.bz2   Complete GCC

  MD5=057f839bacbe387bdfd6c3ad7bbc1bf2
  SHA1=101efc341bc034560b1a08dc1c187c25e37fc6e5

Diffs from 5-20150728 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-5
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.
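
The MD5 and SHA1 values listed above can be compared against a downloaded
tarball. The following is a minimal sketch using Python's standard hashlib
module. The helper names are hypothetical and not part of any GCC tooling.

```python
import hashlib

def file_digests(path, chunk_size=1 << 20):
    """Return (md5_hex, sha1_hex) of a file, reading it in chunks."""
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return md5.hexdigest(), sha1.hexdigest()

def matches_announcement(path, expected_md5, expected_sha1):
    """True if both digests of the file equal the announced values."""
    md5, sha1 = file_digests(path)
    return md5 == expected_md5.lower() and sha1 == expected_sha1.lower()

# Example call (assumes the tarball is in the current directory):
# matches_announcement("gcc-5-20150804.tar.bz2",
#                      "057f839bacbe387bdfd6c3ad7bbc1bf2",
#                      "101efc341bc034560b1a08dc1c187c25e37fc6e5")
```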