https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88767

--- Comment #4 from rguenther at suse dot de <rguenther at suse dot de> ---
On Wed, 9 Jan 2019, wschmidt at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88767
> 
> Bill Schmidt <wschmidt at gcc dot gnu.org> changed:
> 
>            What    |Removed                     |Added
> ----------------------------------------------------------------------------
>              Status|WAITING                     |UNCONFIRMED
>      Ever confirmed|1                           |0
> 
> --- Comment #2 from Bill Schmidt <wschmidt at gcc dot gnu.org> ---
> Hi Richard -- This was reported to us internally.  The performance of this test
> case on a P8 server indicates that disabling complete unrolling and applying
> unroll-and-jam could produce about a 1.5x speedup.  I am going to have our
> performance team verify that this is the case using just the options that Li
> Jia used; the original report modified the source to provide the results of
> unroll-and-jam since the reporter didn't know how to disable cunrolli.  I'll
> post the results here when we have them.

Note that for cases like this, it would be nice to extend our set of loop
pragmas so you could say

#pragma GCC loop unroll-and-jam [factor]

on the outer loop, which should then disable unrolling of the inner loop.

If source modification is possible, that is.  Using
-fdisable-tree-cunrolli isn't meant to be a "production thing".
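
For readers unfamiliar with the transformation, here is a minimal hand-written
sketch of what unroll-and-jam by a factor of 2 does to a loop nest.  The kernel,
the names kernel_plain and kernel_uaj2, and the size N are hypothetical and are
not taken from the test case in this bug; N is assumed to be even so that no
remainder loop is needed.

```c
#include <stddef.h>

#define N 8  /* hypothetical size, assumed divisible by the factor 2 */

/* Original nest: one accumulator per outer iteration. */
void kernel_plain(double out[N], double a[N][N], const double x[N])
{
    for (size_t i = 0; i < N; i++) {
        double sum = 0.0;
        for (size_t j = 0; j < N; j++)
            sum += a[i][j] * x[j];
        out[i] = sum;
    }
}

/* Unroll-and-jam by 2: the outer loop is unrolled twice and the two
   copies of the inner loop are fused ("jammed") into a single inner
   loop, so each x[j] is loaded once and used for two rows. */
void kernel_uaj2(double out[N], double a[N][N], const double x[N])
{
    for (size_t i = 0; i < N; i += 2) {
        double sum0 = 0.0, sum1 = 0.0;
        for (size_t j = 0; j < N; j++) {
            double xj = x[j];
            sum0 += a[i][j] * xj;
            sum1 += a[i + 1][j] * xj;
        }
        out[i] = sum0;
        out[i + 1] = sum1;
    }
}
```

Each row is still summed in the same order, so both versions compute the same
values.  The jammed form only exists while the inner loop is still a loop; if
the inner loop has already been completely unrolled, there is nothing left to
fuse, which is why the order of the two transformations matters here.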
