Re: input address reload issue

2017-01-06 Thread Segher Boessenkool
On Thu, Jan 05, 2017 at 05:18:46PM +0100, Aurelien Buhrig wrote:
> The issue happens when reloading:
> 
> (set (reg:QI 47 [ _9 ])
>   (mem:QI (plus:SI (reg/v/f:SI 68 [orig:51 in ] [51])
> (const_int 1 [0x1])
> 
> My understanding is that IRA allocates hardregs to allocno which are
> replaced by the reload pass which generates new reloads or spills regs
> when needed, right?

Yes.  IRA chooses what hard registers to use where, or memory (i.e. no
register) if that seems best (for example if it ran out of registers).
Then afterwards reload / LRA fixes up everything that isn't valid.

> Here the IRA chooses a reg (named r2)which makes the mem address not
> legitimate. Is it valid to allocate a reg which makes non legitimate
> address?

reload will (or should ;-) ) fix it, but it would of course be better if
IRA could make valid code immediately.

> Assuming it is, my understanding is that the reload chooses a legitimate
> reg (named a0 here) and shall emit insns (in emit_reload_insns) to set
> a0 correctly (RELOAD_FOR_INPUT_ADDRESS). Right?
> 
> So the insn:
> (set (reg:QI 0 r0) (mem:QI (plus:SI (reg:SI 2 r2)(const_int 1))
> 
> is transformed into:
> (set (reg:SI 8 a0) (reg:SI 2 r2))
> (set (reg:SI 8 a0) (const_int 1))
> (set (reg:SI 8 a0) (plus:SI (reg:SI 8 a0) (reg:SI 8 a0)))
> (set (reg:QI 0 r0) (mem:QI (reg:SI 8 a0))

That second instruction kills the result of the first, that is bad.  It
doesn't do anything useful in the first place.  Maybe the first and the
third instructions could be combined as well, or the third and the fourth,
but I don't know your target.

> This "basic" transformation requires two reload regs, but only one is
> given/used/possible from the rl structure (in emit_reload_insns).
> 
> So where does the issue comes from? The need for 2 reload regs, the
> transformation which is too "basic" and could be optimized to use only
> one reload reg, or any wrong/missing reload target hook?

Look at the dump file for reload to see where things come from.  Also
everything Jeff said; you really want LRA.


Segher


un-optimal code because of forwprop after gcc-5?

2017-01-06 Thread Pitchumani Sivanupandi
Found a code size regression for AVR target in gcc-5 and higher. Looks like
it is applicable to x86_64 also.

Test case (options: -Os):
typedef unsigned int uint8_t __attribute__((__mode__(__QI__)));
typedef unsigned int uint32_t __attribute__ ((__mode__ (__SI__)));
typedef struct rpl_instance rpl_instance_t;
struct rpl_instance {
  uint8_t dio_intcurrent;
  uint32_t dio_next_delay;
};
unsigned short random_rand(void);

void
new_dio_interval(rpl_instance_t *instance)
{
  uint32_t time;
  uint32_t ticks;
  time = 1UL << instance->dio_intcurrent;
  ticks = (time * 128) / 1000;
  instance->dio_next_delay = ticks;
  ticks = ticks / 2 + (ticks / 2 * (uint32_t)random_rand()) / 65535U;
  instance->dio_next_delay -= ticks;
}

ssa dump

_3 = instance_2(D)->dio_intcurrent
_4 = (int) _3
_5 = 1 << _4
time_6  = (uint32_t)_5
_7 = time_6 * 128
ticks_8 = _7 / 1000
instance_2(D)->dio_next_delay = ticks_8
_10 = ticks_8 / 2
_11 = ticks_8 / 2
_13 = random_rand()
_14 = (unsigned int) _13
_15 = _11 * _14
_16 = _15 / 65535
ticks_17 = _11 + _16
_18 = instance_2(D)->dio_next_delay
_19 = _18 - ticks_17
instance_2(D)->dio_next_delay = _19
return

gcc-5 and higher generate suboptimal code for the definition of _10, as below:
  _10 = _7 / 2000
whereas gcc-4 generates _10 = ticks_8 >> 1.
Below are a few differences in the passes that lead to this suboptimal code.

pass      | gcc 4                         | gcc 5 and higher            |
----------+-------------------------------+-----------------------------+
ccp1      | _10 definition removed as dce | No change                   |
          |                               |                             |
forwprop1 | No change                     | gimple_simplified _10 & _11 |
          |                               | _10 = _7 / 2000             |
          |                               | _11 = _7 / 2000             |
          |                               |                             |
cddce1    | No change                     | _10 definition removed      |
          |                               |                             |
ccp2      | No change                     | No change                   |
          |                               |                             |
vrp1      | _10 = ticks_8 / 2             | No change                   |
          | changed to                    |                             |
          | _10 = ticks_8 >> 1            |                             |
----------+-------------------------------+-----------------------------+

Forward propagation in gcc-4 doesn't propagate ticks_8 to _11, whereas
gcc-5 propagates it to two expressions (_10 and _11). Is that valid? This
prevents the vrp pass from optimizing the rhs expression of _11 in this testcase.
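
The two shapes of the halving step can be compared in isolation. This is a
minimal sketch, assuming 32-bit unsigned arithmetic as in the test case; the
function names are hypothetical and only mirror the SSA names in the dump:

```c
#include <stdint.h>

/* gcc-4 shape: divide by 1000 once, then halve with a shift
   (ticks_8 = _7 / 1000; _10 = ticks_8 >> 1). */
uint32_t half_ticks_shift(uint32_t t7)
{
    uint32_t ticks_8 = t7 / 1000u;
    return ticks_8 >> 1;
}

/* gcc-5+ shape after forwprop1: both divisions folded into one
   (_10 = _7 / 2000). */
uint32_t half_ticks_folded(uint32_t t7)
{
    return t7 / 2000u;
}

/* Returns 1 if both shapes agree for every input in [0, limit], else 0. */
int shapes_agree_up_to(uint32_t limit)
{
    for (uint32_t x = 0; ; x++) {
        if (half_ticks_shift(x) != half_ticks_folded(x))
            return 0;
        if (x == limit)   /* test here so limit == UINT32_MAX terminates */
            break;
    }
    return 1;
}
```

Both shapes compute the same value, so the folding itself is valid. The
difference is cost: ticks_8 is still needed for the store to dio_next_delay,
so the folded form presumably pays for a second 32-bit division where a shift
of the already computed ticks_8 would do.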

Regards,
Pitchumani


Re: input address reload issue

2017-01-06 Thread Aurelien Buhrig

>> So the insn:
>> (set (reg:QI 0 r0) (mem:QI (plus:SI (reg:SI 2 r2)(const_int 1))
>>
>> is transformed into:
>> (set (reg:SI 8 a0) (reg:SI 2 r2))
>> (set (reg:SI 8 a0) (const_int 1))
>> (set (reg:SI 8 a0) (plus:SI (reg:SI 8 a0) (reg:SI 8 a0)))
>> (set (reg:QI 0 r0) (mem:QI (reg:SI 8 a0))
>>
>> This "basic" transformation requires two reload regs, but only one is
>> given/used/possible from the rl structure (in emit_reload_insns).
>>
>> So where does the issue comes from? The need for 2 reload regs, the
>> transformation which is too "basic" and could be optimized to use only
>> one reload reg, or any wrong/missing reload target hook?
> Sounds like you need secondary or intermediate reloads.
Thank you Jeff for your help.
Currently, the TARGET_SECONDARY_RELOAD does not check if the address of
a mem rtx is legitimate address and it returns NO_REG with (mem:QI r2).
Do you suggest the secondary reload must implement a scratch reg & md
pattern to implement this reload?




Re: input address reload issue

2017-01-06 Thread Aurelien Buhrig

> Look at the dump file for reload to see where things come from.  Also
> everything Jeff said; you really want LRA.

I will try switching to LRA in a second step, but I think I need first to 
remove the old cc0...
BTW, in which way the LRA is better than IRA? Is there any benchmarks?

Thanks for your help. 
Aurélien



Re: input address reload issue

2017-01-06 Thread Segher Boessenkool
On Fri, Jan 06, 2017 at 11:26:40AM +0100, Aurelien Buhrig wrote:
> > Look at the dump file for reload to see where things come from.  Also
> > everything Jeff said; you really want LRA.
> 
> I will try switching to LRA in a second step, but I think I need first to 
> remove the old cc0...

:-)

> BTW, in which way the LRA is better than IRA? Is there any benchmarks?

LRA is a replacement for reload, not for IRA.

LRA already usually creates better performing code than reload does, but
its big advantage is that it is a much more maintainable codebase, so
that we can improve it over time.


Segher


.../lib/gcc//7.1.1/ vs. .../lib/gcc//7/

2017-01-06 Thread Jakub Jelinek
Hi!

SUSE and some other distros use a hack that omits the minor and patchlevel
versions from the directory layout and uses just the major number.  It is very
uncommon to have more than one compiler for the same major number installed
in the same prefix, now that the major version is bumped every year and the
distinction between minor and patchlevel is just the amount of bugfixes it got
after the initial release.

Dunno if the following is the latest version.

The question is, do we want something like this upstream too, and
unconditionally or based on a configure option (--enable-major-version-only
?) and in the latter case what the default should be.

I must say I don't understand the cppbuiltin.c part in the patch,
CFLAGS-cppbuiltin.o += $(PREPROCESSOR_DEFINES) -DBASEVER=$(FULLVER_s)
cppbuiltin.o: $(FULLVER)
should already provide it with the full version.  And libjava bit is
obviously no longer needed.

If we apply the patch as is (sans those last two files?), the change would
be unconditional, and we'd have to adjust maintainer scripts etc. so that
if there is FULL-VER file, the full version is in there and needs to be
bumped and BASE-VER is then just the major from that.  The patch doesn't
seem to be complete though, e.g. gcc/configure.ac uses gcc_BASEVER
var for plugins and expects it to be the full version.  Or do we want
GCCPLUGIN_VERSION to be also solely the major version?

Another possibility for still unconditional change would be to sed
the major out from BASE-VER in all the places that read it from BASE-VER
file.  Files to look at are:
config/acx.m4
fixincludes/Makefile.in
gcc/configure.ac
gcc/Makefile.in
libada/Makefile.in
libatomic/Makefile.am
libcc1/configure.ac
libcilkrts/Makefile.am
libgcc/Makefile.in
libgfortran/Makefile.am
libgomp/Makefile.am
libitm/Makefile.am
libmpx/Makefile.am
libobjc/Makefile.in
liboffloadmic/Makefile.am
libquadmath/Makefile.am
libsanitizer/Makefile.am
libssp/Makefile.am
libstdc++-v3/fragment.am
libvtv/Makefile.am
lto-plugin/Makefile.am
maintainer-scripts/gcc_release
maintainer-scripts/update_web_docs_svn

Yet another option is to introduce AC_ARG_ENABLE into all those configure
scripts (some macro in config/*.m4) and do the sed conditionally.

But the first and primary question is if we want to change anything in this
area.

Index: gcc/Makefile.in
===
--- gcc/Makefile.in.orig	2015-05-08 17:10:12.068697540 +0200
+++ gcc/Makefile.in	2015-05-08 17:25:31.831833081 +0200
@@ -810,12 +810,14 @@ GTM_H = tm.h  $(tm_file_list) in
 TM_H  = $(GTM_H) insn-flags.h $(OPTIONS_H)
 
 # Variables for version information.
-BASEVER     := $(srcdir)/BASE-VER  # 4.x.y
+BASEVER     := $(srcdir)/BASE-VER  # 5
+FULLVER     := $(srcdir)/FULL-VER  # 5.x.y
 DEVPHASE    := $(srcdir)/DEV-PHASE # experimental, prerelease, ""
 DATESTAMP   := $(srcdir)/DATESTAMP # MMDD or empty
 REVISION    := $(srcdir)/REVISION  # [BRANCH revision XX]
 
 BASEVER_c   := $(shell cat $(BASEVER))
+FULLVER_c   := $(shell cat $(FULLVER))
 DEVPHASE_c  := $(shell cat $(DEVPHASE))
 DATESTAMP_c := $(shell cat $(DATESTAMP))
 
@@ -839,6 +841,7 @@ PATCHLEVEL_c := \
 # immediately after the comma in the $(if ...) constructs is
 # significant - do not remove it.
 BASEVER_s   := "\"$(BASEVER_c)\""
+FULLVER_s   := "\"$(FULLVER_c)\""
 DEVPHASE_s  := "\"$(if $(DEVPHASE_c), ($(DEVPHASE_c)))\""
 DATESTAMP_s := \
   "\"$(if $(DEVPHASE_c)$(filter-out 0,$(PATCHLEVEL_c)), $(DATESTAMP_c))\""
@@ -2028,7 +2031,7 @@ s-options-h: optionlist $(srcdir)/opt-fu
 
 dumpvers: dumpvers.c
 
-CFLAGS-version.o += -DBASEVER=$(BASEVER_s) -DDATESTAMP=$(DATESTAMP_s) \
+CFLAGS-version.o += -DBASEVER=$(FULLVER_s) -DDATESTAMP=$(DATESTAMP_s) \
 	-DREVISION=$(REVISION_s) \
 	-DDEVPHASE=$(DEVPHASE_s) -DPKGVERSION=$(PKGVERSION_s) \
 	-DBUGURL=$(BUGURL_s)
@@ -2038,10 +2041,10 @@ version.o: $(REVISION) $(DATESTAMP) $(BA
 CFLAGS-lto-compress.o += $(ZLIBINC)
 
 bversion.h: s-bversion; @true
-s-bversion: BASE-VER
-	echo "#define BUILDING_GCC_MAJOR `echo $(BASEVER_c) | sed -e 's/^\([0-9]*\).*$$/\1/'`" > bversion.h
-	echo "#define BUILDING_GCC_MINOR `echo $(BASEVER_c) | sed -e 's/^[0-9]*\.\([0-9]*\).*$$/\1/'`" >> bversion.h
-	echo "#define BUILDING_GCC_PATCHLEVEL `echo $(BASEVER_c) | sed -e 's/^[0-9]*\.[0-9]*\.\([0-9]*\)$$/\1/'`" >> bversion.h
+s-bversion: FULL-VER
+	echo "#define BUILDING_GCC_MAJOR `echo $(FULLVER_c) | sed -e 's/^\([0-9]*\).*$$/\1/'`" > bversion.h
+	echo "#define BUILDING_GCC_MINOR `echo $(FULLVER_c) | sed -e 's/^[0-9]*\.\([0-9]*\).*$$/\1/'`" >> bversion.h
+	echo "#define BUILDING_GCC_PATCHLEVEL `echo $(FULLVER_c) | sed -e 's/^[0-9]*\.[0-9]*\.\([0-9]*\)$$/\1/'`" >> bversion.h
 	echo "#define BUILDING_GCC_VERSION (BUILDING_GCC_MAJOR * 1000 + BUILDING_GCC_MINOR)" >> bversion.h
 	$(STAMP) s-bversion
 
@@ -2410,9 +2413,9 @@ build/%.o :  # dependencies provided by
 ## build/version.o is compiled by the $(COMPILER_F

Re: .../lib/gcc//7.1.1/ vs. .../lib/gcc//7/

2017-01-06 Thread Szabolcs Nagy
On 06/01/17 12:48, Jakub Jelinek wrote:
> SUSE and some other distros use a hack that omits the minor and patchlevel
> versions from the directory layout, just uses the major number, it is very

what is the benefit?



Re: .../lib/gcc//7.1.1/ vs. .../lib/gcc//7/

2017-01-06 Thread Jakub Jelinek
On Fri, Jan 06, 2017 at 01:07:23PM +, Szabolcs Nagy wrote:
> On 06/01/17 12:48, Jakub Jelinek wrote:
> > SUSE and some other distros use a hack that omits the minor and patchlevel
> > versions from the directory layout, just uses the major number, it is very
> 
> what is the benefit?

Various packages use the paths to gcc libraries/includes etc. in various
places (e.g. libtool, *.la files, etc.).  So any time you upgrade gcc
(say from 6.1.0 to 6.2.0 or 6.2.0 to 6.2.1), everything that has those paths
needs to be rebuilt.  By having only the major number in the paths (which is
pretty much all that matters), you only have to rebuild when the major
version of gcc changes (at which time one usually wants to mass rebuild
everything anyway).
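
The effect on stored paths can be sketched as follows. This is only an
illustration, assuming the usual <libdir>/gcc/<target>/<version> layout; the
helper names gcc_major and gcc_major_only_dir are hypothetical and are not
part of GCC or of the patch:

```c
#include <stdio.h>
#include <stdlib.h>

/* Parse the leading decimal major number of a "MAJOR.MINOR.PATCH" string.
   Returns -1 if the string does not start with a non-negative number. */
long gcc_major(const char *version)
{
    char *end;
    long major = strtol(version, &end, 10);
    if (end == version || major < 0)
        return -1;
    return major;
}

/* Format "<libdir>/gcc/<target>/<major>" into buf.
   Returns the value of snprintf, or -1 if the version cannot be parsed. */
int gcc_major_only_dir(char *buf, size_t size, const char *libdir,
                       const char *target, const char *version)
{
    long major = gcc_major(version);
    if (major < 0)
        return -1;
    return snprintf(buf, size, "%s/gcc/%s/%ld", libdir, target, major);
}
```

With this scheme 6.2.0 and 6.2.1 both map to the same directory, so a path
recorded by another package stays valid across minor and patchlevel updates.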

Jakub


Re: Converting to LRA (calling all maintainers)

2017-01-06 Thread Claudiu Zissulescu

On 16/09/2016 22:37, Segher Boessenkool wrote:

> Hi!
>
> Since a few days TARGET_LRA_P defaults to returning "true".  I changed
> all in-tree ports to still behave the same as before, which for most
> ports means they use old reload always.  All the primary platforms (see
> the release criteria, ) now
> default to LRA though.
>
> Since one day (hopefully not too very far in the future) we want to
> deprecate and eventually remove old reload, all ports that do not want
> to risk being removed should be adapted to work with LRA.  New ports
> should use LRA always.
>
> I started a wiki page at 
> that gives hints on how to go about moving to LRA.  Please add any and
> all details and experiences you think can help others, there!
>
> Thanks for listening to this public service announcement,
>
>
> Segher



Thank you Segher for the wiki. ARC has some experimental support for LRA
but there are still some issues which need to be cleaned up. Hopefully
in the coming period I'll be able to address most of them, leading us to
make LRA the default for our port.



Best,
Claudiu


Re: .../lib/gcc//7.1.1/ vs. .../lib/gcc//7/

2017-01-06 Thread Szabolcs Nagy
On 06/01/17 13:11, Jakub Jelinek wrote:
> On Fri, Jan 06, 2017 at 01:07:23PM +, Szabolcs Nagy wrote:
>> On 06/01/17 12:48, Jakub Jelinek wrote:
>>> SUSE and some other distros use a hack that omits the minor and patchlevel
>>> versions from the directory layout, just uses the major number, it is very
>>
>> what is the benefit?
> 
> Various packages use the paths to gcc libraries/includes etc. in various
> places (e.g. libtool, *.la files, etc.).  So any time you upgrade gcc

it is a bug that gcc installs libtool la files,
because a normal cross toolchain is relocatable
but the la files have abs path in them.

that would be nice to fix, so build scripts don't
have to manually delete the bogus la files.

> (say from 6.1.0 to 6.2.0 or 6.2.0 to 6.2.1), everything that has those paths
> needs to be rebuilt.  By having only the major number in the paths (which is
> pretty much all that matters), you only have to rebuild when the major
> version of gcc changes (at which time one usually want to mass rebuild
> everything anyway).

i thought only the gcc driver needs to know
these paths because there are no shared libs
there that are linked into binaries so no binary
references those paths so nothing have to be
rebuilt.



Re: .../lib/gcc//7.1.1/ vs. .../lib/gcc//7/

2017-01-06 Thread Jakub Jelinek
On Fri, Jan 06, 2017 at 02:13:05PM +, Szabolcs Nagy wrote:
> On 06/01/17 13:11, Jakub Jelinek wrote:
> > On Fri, Jan 06, 2017 at 01:07:23PM +, Szabolcs Nagy wrote:
> >> On 06/01/17 12:48, Jakub Jelinek wrote:
> >>> SUSE and some other distros use a hack that omits the minor and patchlevel
> >>> versions from the directory layout, just uses the major number, it is very
> >>
> >> what is the benefit?
> > 
> > Various packages use the paths to gcc libraries/includes etc. in various
> > places (e.g. libtool, *.la files, etc.).  So any time you upgrade gcc
> 
> it is a bug that gcc installs libtool la files,
> because a normal cross toolchain is relocatable
> but the la files have abs path in them.

I'm not talking about *.la files installed by gcc.  I'm talking about any
*.la files created by libtool or libtool scripts themselves.
There are libtool hacks floating around that avoid the gcc internal paths,
but I think it is not in upstream libtool.

> > (say from 6.1.0 to 6.2.0 or 6.2.0 to 6.2.1), everything that has those paths
> > needs to be rebuilt.  By having only the major number in the paths (which is
> > pretty much all that matters), you only have to rebuild when the major
> > version of gcc changes (at which time one usually want to mass rebuild
> > everything anyway).
> 
> i thought only the gcc driver needs to know
> these paths because there are no shared libs
> there that are linked into binaries so no binary
> references those paths so nothing have to be
> rebuilt.

That is not the case, various programs just store the gcc
-print-file-name= paths in various locations.
Some query it at runtime (not sure what e.g. clang++ does when it wants to
use libstdc++ headers), others store it.

Jakub


ICE on using -floop-nest-optimize

2017-01-06 Thread Toon Moene
On the attached (Fortran) source, the following version of gfortran 
draws an ICE:


$ gfortran -v
Using built-in specs.
COLLECT_GCC=gfortran
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/6/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 6.2.1-5' 
--with-bugurl=file:///usr/share/doc/gcc-6/README.Bugs 
--enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ 
--prefix=/usr --program-suffix=-6 --program-prefix=x86_64-linux-gnu- 
--enable-shared --enable-linker-build-id --libexecdir=/usr/lib 
--without-included-gettext --enable-threads=posix --libdir=/usr/lib 
--enable-nls --with-sysroot=/ --enable-clocale=gnu 
--enable-libstdcxx-debug --enable-libstdcxx-time=yes 
--with-default-libstdcxx-abi=new --enable-gnu-unique-object 
--disable-vtable-verify --enable-libmpx --enable-plugin 
--enable-default-pie --with-system-zlib --disable-browser-plugin 
--enable-java-awt=gtk --enable-gtk-cairo 
--with-java-home=/usr/lib/jvm/java-1.5.0-gcj-6-amd64/jre 
--enable-java-home 
--with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-6-amd64 
--with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-6-amd64 
--with-arch-directory=amd64 
--with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-gc=auto 
--enable-multiarch --with-arch-32=i686 --with-abi=m64 
--with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic 
--enable-checking=release --build=x86_64-linux-gnu 
--host=x86_64-linux-gnu --target=x86_64-linux-gnu

Thread model: posix
gcc version 6.2.1 20161124 (Debian 6.2.1-5)

using the following command line arguments:

gfortran -S -g -Ofast -fprotect-parens -fbacktrace -march=native 
-mtune=native -floop-nest-optimize corr_to_spec_2D.F


The error message is:

corr_to_spec_2D.F:3:0:

   subroutine corr_to_spec_2D(nx_local,ny_local,

internal compiler error: in create_pw_aff_from_tree, at 
graphite-sese-to-poly.c:445

Please submit a full bug report,
with preprocessed source if appropriate.
See  for instructions.

I will retry this with trunk gfortran as soon as my automatic builds 
have constructed that compiler.


In the mean time - anyone has a clue ?

Thanks,

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
c Library: hljb  $Id: corr_to_spec_2D.F 8416 2010-09-08 08:52:33Z ovignes $
c
      subroutine corr_to_spec_2D(nx_local,ny_local,
     x   ny_global,
     x   nxl_global,nyl_global,
     x   kmax_local,
     x   kmax_global,lmax_global,
     x   mype,nproc,nydim,kdim,
     x   lmaxe,
     x   jmin_list,ny_list,
     x   kmin_list,kmax_list,
     x   levmin_list,levmax_list,
     x   gridsize,lscale,
     x   spec_dens_2D, corr_index)
c
      implicit none
c
c---
c
      integer nx_local,ny_local,
     x   ny_global,
     x   nxl_global,nyl_global,
     x   kmax_local,
     x   kmax_global,lmax_global,
     x   mype,nproc,nydim,kdim,
     x   lmaxe(0:kmax_local),
     x   jmin_list(nproc),ny_list(nproc),
     x   kmin_list(nproc),kmax_list(nproc),
     x   levmin_list(0:kdim,nproc),
     x   levmax_list(0:kdim,nproc)
      real gridsize,lscale
      real
     x   spec_dens_2D(-lmax_global:lmax_global,0:kmax_local)
      integer corr_index
c
c---
c
c Local work space
c
      integer nextended,kextended,i,j,j_global,jlev,k,l,kwave
      real dist,dx,dy, sum_spec
      real, allocatable :: corr_extended(:,:,:)
      complex, allocatable :: spec_corr(:,:)
c
      complex spec_dens_cmpx(-lmax_global:lmax_global,0:kmax_local)
      real phys_corr_appr(nx_local,ny_local)
      real phys_corr_orig(nx_local,ny_local)
c
      real spec_eps
      parameter (spec_eps = 0.01)
c
c---
c
c Allocate space for correlation in physical space
c including extension zone and in spectral space
c
      nextended = nyl_global/ny_global+1
      if (mype.eq.1) write(6,*)' nextended=',nextended
      allocate(corr_extended(nxl_global,ny_local,nextended))
      allocate(spec_corr(-lmax_global:lmax_global,0:kmax_local))
c
c---
c
c Construct correlation in physical space
      call corr_ext( nextended, nxl_global,
     x   ny_local,ny_global,nyl_global,
     x   gridsize, lscale,
     x   corr_extended, corr_index,
     x   mype,nproc,jmin_list )
c
      if (mype.eq.1) wr

Re: ICE on using -floop-nest-optimize

2017-01-06 Thread Kyrill Tkachov


On 06/01/17 14:22, Toon Moene wrote:
> On the attached (Fortran) source, the following version of gfortran draws an
> ICE:
> [...]
> internal compiler error: in create_pw_aff_from_tree, at
> graphite-sese-to-poly.c:445
> [...]
> In the mean time - anyone has a clue ?

Looks like PR 69823 ?

Kyrill





Re: ICE on using -floop-nest-optimize

2017-01-06 Thread Toon Moene

On 01/06/2017 03:28 PM, Kyrill Tkachov wrote:
> On 06/01/17 14:22, Toon Moene wrote:
>> On the attached (Fortran) source, the following version of gfortran
>> draws an ICE:
>> [...]
>> internal compiler error: in create_pw_aff_from_tree, at
>> graphite-sese-to-poly.c:445
>> [...]
>> In the mean time - anyone has a clue ?
>
> Looks like PR 69823 ?


Yep - thanks.

So I don't have to put it into Bugzilla - even if the trunk still fails.

Saves some work - thanks again !

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news


Re: input address reload issue

2017-01-06 Thread Jeff Law

On 01/06/2017 03:20 AM, Aurelien Buhrig wrote:



>>> So the insn:
>>> (set (reg:QI 0 r0) (mem:QI (plus:SI (reg:SI 2 r2)(const_int 1))
>>>
>>> is transformed into:
>>> (set (reg:SI 8 a0) (reg:SI 2 r2))
>>> (set (reg:SI 8 a0) (const_int 1))
>>> (set (reg:SI 8 a0) (plus:SI (reg:SI 8 a0) (reg:SI 8 a0)))
>>> (set (reg:QI 0 r0) (mem:QI (reg:SI 8 a0))
>>>
>>> This "basic" transformation requires two reload regs, but only one is
>>> given/used/possible from the rl structure (in emit_reload_insns).
>>>
>>> So where does the issue comes from? The need for 2 reload regs, the
>>> transformation which is too "basic" and could be optimized to use only
>>> one reload reg, or any wrong/missing reload target hook?
>>
>> Sounds like you need secondary or intermediate reloads.
>
> Thank you Jeff for your help.
> Currently, the TARGET_SECONDARY_RELOAD does not check if the address of
> a mem rtx is legitimate address and it returns NO_REG with (mem:QI r2).

So first you have to distinguish between an intermediate and scratch
register.


Intermediates say that to copy from a particular register class to a 
particular destination will require the source to first be copied into 
the intermediate register, then the intermediate to the final 
destination.   So for example if you have a register that you can not 
directly store to memory, you might use an intermediate.  The source 
register would be copied to the intermediate and the intermediate then 
stored to memory.


Scratch registers are different -- essentially the backend tells reload 
that another register from a particular class is needed *and* the 
pattern to use for generating the reload.  So as an example, you might 
have a two-register indexed address,


For example, loading a constant into an FP register during PIC code 
generation may require a scratch register to hold intermediate address 
computations.


You can have cases where you need both a scratch and an intermediate. 
You can also have cases where you need an intermediate memory location.




> Do you suggest the secondary reload must implement a scratch reg & md
> pattern to implement this reload?

Perhaps.  I don't know enough about your architecture to be 100% sure 
about how all the pieces interact with each other -- reload, and 
secondary reloads in particular are a complex area.  I'm largely going 
on your comment that you need 2 reload registers.


Presumably you don't have an instruction for
(set (reg) (plus (reg) (const_int)))

Thus you need two scratch reload regs IIUC.  One to hold r2 another to 
hold (const_int 1).  So you'd want to generate


(set (areg1) (reg r2))
(set (areg2) (const_int 1))
(set (areg1) (plus (areg1) (areg2)))
(set (r0) (mem (areg1)))

Or something along those lines.  If you're going to stick with reload, 
you'll likely want to dig into find_reloads_address and its children to 
see what reloads it generates and why (debug_reload can be helpful here).
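
The difference between the sequence reload emitted and the two-scratch
sequence above can be replayed on a toy register file. This is a minimal
sketch; the regfile struct and both function names are hypothetical and do not
correspond to anything in GCC:

```c
#include <stdint.h>

/* Toy register file: r2 holds the base address, a0 and a1 stand for two
   address registers (areg1 and areg2 above). */
struct regfile {
    uint32_t r2;
    uint32_t a0;
    uint32_t a1;
};

/* Sequence from the reload dump: a single reload register (a0) is used
   for both operands of the plus. */
uint32_t addr_one_reload_reg(struct regfile rf)
{
    rf.a0 = rf.r2;          /* (set a0 r2)                              */
    rf.a0 = 1;              /* (set a0 1): overwrites the copy of r2    */
    rf.a0 = rf.a0 + rf.a0;  /* (set a0 (plus a0 a0))                    */
    return rf.a0;           /* address used by (mem a0): always 2       */
}

/* Two-scratch sequence: the constant goes into a second register. */
uint32_t addr_two_reload_regs(struct regfile rf)
{
    rf.a0 = rf.r2;          /* (set areg1 r2)                           */
    rf.a1 = 1;              /* (set areg2 1)                            */
    rf.a0 = rf.a0 + rf.a1;  /* (set areg1 (plus areg1 areg2))           */
    return rf.a0;           /* address used by (mem areg1): r2 + 1      */
}
```

With r2 = 0x1000 the first sequence yields address 2 regardless of r2, while
the second yields 0x1001, which is the address the original insn asked for.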


jeff




Re: input address reload issue

2017-01-06 Thread Jeff Law

On 01/06/2017 03:26 AM, Aurelien Buhrig wrote:



>> Look at the dump file for reload to see where things come from.  Also
>> everything Jeff said; you really want LRA.
>
> I will try switching to LRA in a second step, but I think I need first to
> remove the old cc0...
> BTW, in which way the LRA is better than IRA? Is there any benchmarks?

I would suggest moving away from cc0 first.  cc0 is an abomination
and should have been abolished years ago -- the only reason it hasn't been is
that many old ports would break and nobody's taken the time to convert them or
propose them for deprecation.


While we try to keep the cc0-target paths working, they're not exercised 
all that much and can easily break.


jeff



Re: un-optimal code because of forwprop after gcc-5?

2017-01-06 Thread Jeff Law

On 01/06/2017 03:09 AM, Pitchumani Sivanupandi wrote:

Found a code size regression for AVR target in gcc-5 and higher. Looks
like it
is applicable to x86_64 also.

Please file a bug.

http://gcc.gnu.org/bugzilla


Jeff


Re: input address reload issue

2017-01-06 Thread Eric Botcazou
> I would suggesting moving away from cc0 first.  cc0 is an abomination
> and should have been abolished years ago -- the only reason is many old
> ports would break and nobody's taken the time to convert them or propose
> them for deprecation.

It's 8 out of 47 ports, most of them old indeed, with the exception of AVR.
You're probably thinking of m68k, but AVR seems to be more blocking here.

-- 
Eric Botcazou


Re: input address reload issue

2017-01-06 Thread Jeff Law

On 01/06/2017 09:37 AM, Eric Botcazou wrote:

>> I would suggesting moving away from cc0 first.  cc0 is an abomination
>> and should have been abolished years ago -- the only reason is many old
>> ports would break and nobody's taken the time to convert them or propose
>> them for deprecation.
>
> It's 8 out of 47 ports, most of them old indeed, with the exception of AVR.
> You're probably thinking of m68k, but AVR seems to be more blocking here.

I didn't have any particular port in mind, but yes, of the cc0 ports AVR
is probably the most active.


jeff


Re: .../lib/gcc//7.1.1/ vs. .../lib/gcc//7/

2017-01-06 Thread Richard Biener
On January 6, 2017 2:11:51 PM GMT+01:00, Jakub Jelinek  wrote:
>On Fri, Jan 06, 2017 at 01:07:23PM +, Szabolcs Nagy wrote:
>> On 06/01/17 12:48, Jakub Jelinek wrote:
>> > SUSE and some other distros use a hack that omits the minor and
>patchlevel
>> > versions from the directory layout, just uses the major number, it
>is very
>> 
>> what is the benefit?
>
>Various packages use the paths to gcc libraries/includes etc. in
>various
>places (e.g. libtool, *.la files, etc.).  So any time you upgrade gcc
>(say from 6.1.0 to 6.2.0 or 6.2.0 to 6.2.1), everything that has those
>paths
>needs to be rebuilt.  By having only the major number in the paths
>(which is
>pretty much all that matters), you only have to rebuild when the major
>version of gcc changes (at which time one usually want to mass rebuild
>everything anyway).

RPMs from ISVs having this issue made us change this for SUSE.   Note another
workaround is to provide symlinks from old provided versions to the actual one.

It's really all a packaging issue so not sure if upstream should change 
anything by default.  Providing a way to do it would be nice of course.

Richard.

>
>   Jakub