Melt-building problem
Hello,

I tried to compile the gcc-melt branch from svn, but I get the following error:

make warmelt1
make[4]: Entering directory `/home/wolfgang/gcc-melt/objects/gcc'
date +"/* empty-file-for-melt.c %c */" > empty-file-for-melt.c-tmp
/bin/bash ../../melt-branch/gcc/../move-if-change empty-file-for-melt.c-tmp empty-file-for-melt.c
make -f ../../melt-branch/gcc/melt-module.mk VPATH=../../melt-branch/gcc:. meltmodule \
  GCCMELT_CFLAGS="-g -O2 -fomit-frame-pointer -g -g -O2 -fomit-frame-pointer -gtoggle -DIN_GCC -DHAVE_CONFIG_H -I melt-private-build-include -I." \
  GCCMELT_MODULE_SOURCE=../../melt-branch/gcc/melt/generated/warmelt-first.0.c GCCMELT_MODULE_BINARY=warmelt-first.0.so
make[5]: Entering directory `/home/wolfgang/gcc-melt/objects/gcc'
gcc -g -O2 -fomit-frame-pointer -g -g -O2 -fomit-frame-pointer -gtoggle -DIN_GCC -DHAVE_CONFIG_H -I melt-private-build-include -I. -fPIC -c -o warmelt-first.0.pic.o ../../melt-branch/gcc/melt/generated//warmelt-first.0.c
cc1: error: unrecognised debug output level "toggle"
make[5]: *** [warmelt-first.0.pic.o] Fehler 1
make[5]: Leaving directory `/home/wolfgang/gcc-melt/objects/gcc'
make[4]: *** [warmelt-first.0.so] Fehler 2
make[4]: Leaving directory `/home/wolfgang/gcc-melt/objects/gcc'
make[3]: *** [melt.encap] Fehler 2
make[3]: Leaving directory `/home/wolfgang/gcc-melt/objects/gcc'
make[2]: *** [all-stage2-gcc] Fehler 2
make[2]: Leaving directory `/home/wolfgang/gcc-melt/objects'
make[1]: *** [stage2-bubble] Fehler 2
make[1]: Leaving directory `/home/wolfgang/gcc-melt/objects'
make: *** [all] Fehler 2
wolfg...@debian:~/gcc-melt/objects$

Any advice?

Thanks
Wolfgang
Re: Melt-building problem
Original message:
> Date: Tue, 25 May 2010 16:20:14 +0200
> From: Basile Starynkevitch
> To: Wolfgang
> CC: gcc@gcc.gnu.org
> Subject: Re: Melt-building problem
>
> On Tue, 2010-05-25 at 12:03 +0200, Wolfgang wrote:
> > Hello,
> >
> > I tried to compile the gcc-melt branch from svn,
>
> A big thanks for testing GCC MELT! What svn revision of the MELT branch
> are you testing? Did you configure gcc-melt with --enable-bootstrap?
>
> Can you reproduce the bug inside a clean build tree? (I mean, removing
> all your build tree, and starting the configure & the build again.)

Yes, it is reproducible... I cleaned everything and tried to build anew.

> > but i get the following error:
> >
> > make warmelt1
> > make[4]: Entering directory `/home/wolfgang/gcc-melt/objects/gcc'
> > date +"/* empty-file-for-melt.c %c */" > empty-file-for-melt.c-tmp
> > /bin/bash ../../melt-branch/gcc/../move-if-change empty-file-for-melt.c-tmp empty-file-for-melt.c
> > make -f ../../melt-branch/gcc/melt-module.mk VPATH=../../melt-branch/gcc:. meltmodule \
> >   GCCMELT_CFLAGS="-g -O2 -fomit-frame-pointer -g -g -O2 -fomit-frame-pointer -gtoggle -DIN_GCC -DHAVE_CONFIG_H -I melt-private-build-include -I." \
> >   GCCMELT_MODULE_SOURCE=../../melt-branch/gcc/melt/generated/warmelt-first.0.c GCCMELT_MODULE_BINARY=warmelt-first.0.so
> > make[5]: Entering directory `/home/wolfgang/gcc-melt/objects/gcc'
> > gcc -g -O2 -fomit-frame-pointer -g -g -O2 -fomit-frame-pointer -gtoggle -DIN_GCC -DHAVE_CONFIG_H -I melt-private-build-include -I. -fPIC -c -o warmelt-first.0.pic.o ../../melt-branch/gcc/melt/generated//warmelt-first.0.c
> > cc1: error: unrecognised debug output level "toggle"
>
> Which gcc is this one? (What does gcc -v tell you?)

At the moment, I have gcc 4.4 installed:

wolfg...@debian:~/gcc-melt/objects$ gcc -v
Using built-in specs.
Target: i486-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 4.4.4-1' --with-bugurl=file:///usr/share/doc/gcc-4.4/README.Bugs --enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr --enable-shared --enable-multiarch --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.4 --program-suffix=-4.4 --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug --enable-objc-gc --enable-targets=all --with-arch-32=i486 --with-tune=generic --enable-checking=release --build=i486-linux-gnu --host=i486-linux-gnu --target=i486-linux-gnu
Thread model: posix
gcc version 4.4.4 (Debian 4.4.4-1)

The MELT branch was configured with:

../melt-branch/configure --enable-languages=c,c++ --enable-shared --enable-threads=posix --enable-nls --enable-objc-gc --enable-mpfr --prefix=/usr/lib/test2 --enable-plugin --enable-lto --with-ppl --enable-bootstrap

I am now trying to compile it with the new gcc-4.5.

Thanks
Wolfgang

> I hoped to have corrected this bug by adding the MELTHERE_CFLAGS in GCC
> MELT's gcc/Makefile.in near line 5076. I am not a guru in autoconf + GNU
> make tricks. Apparently, something is still wrong.
>
> A dirty workaround might be to replace every -gtoggle occurrence in the
> build tree gcc/Makefile with -g.
>
> I will try to reproduce that bug!
>
> Thanks for reporting it.
>
> BTW, I am surprised that GCC (even a plain 4.4 or 4.5) issues an error
> for an unrecognised debug output level. I would imagine it would in that
> case issue a warning, and try to do what -g does...
>
> Cheers.
>
> --
> Basile STARYNKEVITCH http://starynkevitch.net/Basile/
> email: basilestarynkevitchnet mobile: +33 6 8501 2359
> 8, rue de la Faiencerie, 92340 Bourg La Reine, France
> *** opinions {are only mines, sont seulement les miennes} ***
Re: Melt-building problem
Hello,

I built gcc-melt successfully with a new gcc-4.5 compiler from scratch. The svn revision of melt is:

URL: svn://gcc.gnu.org/svn/gcc/branches/melt-branch
Repository Root: svn://gcc.gnu.org/svn/gcc
Repository UUID: 138bc75d-0d04-0410-961f-82ee72b054a4
Revision: 159823
Node Kind: directory
Schedule: normal
Last Changed Author: bstarynk
Last Changed Rev: 159667
Last Changed Date: 2010-05-21 16:44:05 +0200 (Fri, 21 May 2010)

The configuration of melt is:

Using built-in specs.
COLLECT_GCC=./gcc
COLLECT_LTO_WRAPPER=/usr/lib/test/libexec/gcc/i686-pc-linux-gnu/4.6.0/lto-wrapper
Target: i686-pc-linux-gnu
Configured with: ../melt-branch/configure --enable-languages=c,c++ --enable-shared --enable-threads=posix --enable-nls --enable-objc-gc --enable-mpfr --prefix=/usr/lib/test --enable-plugin --enable-lto --enable-checking --enable-tree-browse --enable-tree-checking --with-ppl --disable-bootstrap
Thread model: posix
gcc version 4.6.0 20100406 (experimental) (GCC)

Now I cannot reproduce the error any more.

Thanks a lot for your help
Wolfgang

> > > > but i get the following error:
> > > >
> > > > make warmelt1
> > > > make[4]: Entering directory `/home/wolfgang/gcc-melt/objects/gcc'
> > > > date +"/* empty-file-for-melt.c %c */" > empty-file-for-melt.c-tmp
> > > > /bin/bash ../../melt-branch/gcc/../move-if-change empty-file-for-melt.c-tmp empty-file-for-melt.c
> > > > make -f ../../melt-branch/gcc/melt-module.mk VPATH=../../melt-branch/gcc:. meltmodule \
> > > > GCCMELT_CFLAGS="-g -O2 -fomit-frame-pointer -g -g -O2 -fomit-frame-pointer -gtoggle -DIN_GCC -DHAVE_CONFIG_H -I melt-private-build-include -I."
> > > > \
> > > > GCCMELT_MODULE_SOURCE=../../melt-branch/gcc/melt/generated/warmelt-first.0.c GCCMELT_MODULE_BINARY=warmelt-first.0.so
> > > > make[5]: Entering directory `/home/wolfgang/gcc-melt/objects/gcc'
> > > > gcc -g -O2 -fomit-frame-pointer -g -g -O2 -fomit-frame-pointer -gtoggle -DIN_GCC -DHAVE_CONFIG_H -I melt-private-build-include -I. -fPIC -c -o warmelt-first.0.pic.o ../../melt-branch/gcc/melt/generated//warmelt-first.0.c
> > > > cc1: error: unrecognised debug output level "toggle"

> Perhaps re-merging the current MELT branch into your private branch
> (assuming you have a private MELT variant) might help, because on my
> side with GCCMELT_CC set to gcc-4.5 the make log contains
>
> make warmelt1
> make[4]: Entering directory `/usr/src/Lang/_MeltBoot/Obj/gcc'
> date +"/* empty-file-for-melt.c %c */" > empty-file-for-melt.c-tmp
> /bin/bash ../../melt-branch/gcc/../move-if-change empty-file-for-melt.c-tmp empty-file-for-melt.c
> make -f ../../melt-branch/gcc/melt-module.mk VPATH=../../melt-branch/gcc:. meltmodule \
>   GCCMELT_CFLAGS="-g -fkeep-inline-functions -g -fkeep-inline-functions -DIN_GCC -DHAVE_CONFIG_H -I melt-private-build-include -I." \
>   GCCMELT_MODULE_SOURCE=../../melt-branch/gcc/melt/generated/warmelt-first.0.c GCCMELT_MODULE_BINARY=warmelt-first.0.so
> make[5]: Entering directory `/usr/src/Lang/_MeltBoot/Obj/gcc'
> gcc-4.5 -g -fkeep-inline-functions -g -fkeep-inline-functions -DIN_GCC -DHAVE_CONFIG_H -I melt-private-build-include -I. -fPIC -c -o warmelt-first.0.pic.o ../../melt-branch/gcc/melt/generated//warmelt-first.0.c
> gcc-4.5 -g -fkeep-inline-functions -g -fkeep-inline-functions -DIN_GCC -DHAVE_CONFIG_H -I melt-private-build-include -I. -fPIC -c -o warmelt-first.0+01.pic.o ../../melt-branch/gcc/melt/generated//warmelt-first.0+01.c
> echo '/*' generated file ./warmelt-first.0-stamp.c '*/' > warmelt-first.0-stamp.c-tmp
> date "+const char melt_compiled_timestamp[]=\"%c \";" >> warmelt-first.0-stamp.c-tmp
> echo "const char melt_md5[]=\"\\" >> warmelt-first.0-stamp.c-tmp
> for f in ../../melt-branch/gcc/melt/generated/warmelt-first.0.c ../../melt-branch/gcc/melt/generated/warmelt-first.0+01.c; do \
>   md5line=`md5sum $f` ; \
>   printf "%s\\\n" $md5line >> warmelt-first.0-stamp.c-tmp; \
> done
> echo "\";" >> warmelt-first.0-stamp.c-tmp
> echo "const char melt_csource[]= \"../../melt-branch/gcc/melt/generated/warmelt-first.0.c ../../melt-branch/gcc/melt/generated/warmelt-first.0+01.c\";" >> warmelt-first.0-stamp.c-tmp
> mv warmelt-first.0-stamp.c-tmp warmelt-first.0-stamp.c
> gcc-4.5 -g -fkeep-inline-functio
Code Instrumentation
Hello,

I would like to instrument some existing code. For example, after an ADD-EXPR:

int main() {
  int a = 5;
  int b = 5;
  int c = a + b;
  ...
}

should become:

  ...
  int c = a + b;
  puts("ADD-EXPR");
  ...

I thought writing a GIMPLE pass would be best, but I don't know exactly where to start. I'm able to walk through the GIMPLE statements and debug them, but I'm not able to insert something. Is there any source code available?

Thanks
Wolfgang
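[For the archives: a minimal, untested sketch of what the body of such a pass could look like against the GCC 4.5-era GIMPLE API. It compiles only as part of GCC or a plugin, not standalone; the names used here (FOR_EACH_BB, built_in_decls[], build_string_literal, gsi_insert_after) are from memory of that API and should be checked against the sources. Note that GIMPLE represents additions as PLUS_EXPR.]

```c
/* Untested sketch: walk every statement of the current function and
   insert a call to puts("ADD-EXPR") after each addition.  */
static unsigned int
instrument_adds (void)
{
  basic_block bb;
  FOR_EACH_BB (bb)
    {
      gimple_stmt_iterator gsi;
      for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
        {
          gimple stmt = gsi_stmt (gsi);
          if (is_gimple_assign (stmt)
              && gimple_assign_rhs_code (stmt) == PLUS_EXPR)
            {
              /* Build the string argument and the call to puts.  */
              tree msg = build_string_literal (sizeof ("ADD-EXPR"),
                                               "ADD-EXPR");
              gimple call
                = gimple_build_call (built_in_decls[BUILT_IN_PUTS], 1, msg);
              /* Insert after the addition; GSI_NEW_STMT leaves the
                 iterator on the new call, so it is not revisited.  */
              gsi_insert_after (&gsi, call, GSI_NEW_STMT);
            }
        }
    }
  return 0;
}
```

This would be registered as a GIMPLE pass in the usual way (struct gimple_opt_pass plus a pass-manager registration); the plugin examples shipped with GCC show the boilerplate.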
What is the right usage of SAVE_EXPR?
What is the policy concerning the usage of SAVE_EXPRs? Who is responsible for inserting them? I thought the respective language front end was responsible for enclosing any expression with side effects this way, so that later parts of GCC know how to treat these expressions right. However, some of the code translating tree nodes into rtxes, like some functions found in builtins.c, also worries about the re-evaluation of arguments and inserts plenty of SAVE_EXPRs. Why is that necessary?

With best regards,
Wolfgang Gellerich

---
Dr. Wolfgang Gellerich
IBM Deutschland Entwicklung GmbH
Schönaicher Strasse 220
71032 Böblingen, Germany
Tel. +49 / 7031 / 162598
[EMAIL PROTECTED]

===
IBM Deutschland Entwicklung GmbH
Vorsitzender des Aufsichtsrats: Johann Weihen
Geschäftsführung: Herbert Kircher
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294
Mainline bootstrap failure in tree-ssa-pre.c:create_value_expr_from
For the last few days, since April 8th, I get bootstrap failures on mainline like this:

stage1/xgcc -Bstage1/ -B/ices/bangerth/tmp/build-gcc/gcc-install/i686-pc-linux-gnu/bin/ -c -g -O2 -DIN_GCC -W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -pedantic -Wno-long-long -Wno-variadic-macros -Wold-style-definition -Werror -fno-common -DHAVE_CONFIG_H -I. -I. -I../../gcc/gcc -I../../gcc/gcc/. -I../../gcc/gcc/../include -I../../gcc/gcc/../libcpp/include ../../gcc/gcc/tree-ssa-pre.c -o tree-ssa-pre.o
../../gcc/gcc/tree-ssa-pre.c: In function 'execute_pre':
../../gcc/gcc/tree-ssa-pre.c:1812: sorry, unimplemented: inlining failed in call to 'create_value_expr_from': recursive inlining
../../gcc/gcc/tree-ssa-pre.c:1853: sorry, unimplemented: called from here
make[1]: *** [tree-ssa-pre.o] Error 1

The failure happens in a piece of code that was added here:
http://gcc.gnu.org/ml/gcc-bugs/2005-04/msg00337.html
by Dan Berlin (it introduced the recursive calls), though it certainly isn't the cause, just the trigger. It was also added already on April 4th. Since this has been happening for the last 10 days for me, I start to believe that I may be the only one seeing this. Anyone have any explanations?

Thanks
Wolfgang

PS: The system is "Linux terra.ices.utexas.edu 2.4.25-13mdkenterprise #1 SMP Tue Jan 18 14:02:17 MST 2005 i686 unknown unknown GNU/Linux"; the bootstrap compiler is Mandrake's gcc 3.3.2.

-----
Wolfgang Bangerth email:[EMAIL PROTECTED]
www: http://www.ices.utexas.edu/~bangerth/
Re: Mainline bootstrap failure in tree-ssa-pre.c:create_value_expr_from
> Isn't this the normal always_inline problem from the kernel headers?

Yes, good spot. Thanks for the help!
W.

-----
Wolfgang Bangerth email:[EMAIL PROTECTED]
www: http://www.ices.utexas.edu/~bangerth/
Re: Proposed resolution to aliasing issue.
> In short, the issue is, when given the following code:
>
> struct A {...};
> struct B { ...; struct A a; ...; };
>
> void f() {
>   B b;
>   g(&b.a);
> }
>
> does the compiler have to assume that "g" may access the parts of "b"
> outside of "a".

I understand that you are talking about ISO C, but one relevant case (in C++) to look out for that is similar is this one, which certainly constitutes legitimate and widespread use of language features:

class A {...};
class B : public A { ... };

void f() {
  B b;
  g (static_cast<A*> (&b));
}

void g(A *a) {
  B *b = dynamic_cast<B*>(a);
  // do what you please with the full object B
}

dynamic_cast<> was invented for the particular reason of allowing such constructs. I admit ignorance as to how exactly the C++ front end describes base class information (as opposed to structure member information), but the aliasing code would have to know about the difference.

Best
Wolfgang

-----
Wolfgang Bangerth email:[EMAIL PROTECTED]
www: http://www.ices.utexas.edu/~bangerth/
Re: Proposed resolution to aliasing issue.
Mark,

it occurred to me that asking the question you pose may use language that is more unfamiliar than necessary. How about this question instead -- assume

struct S { int s; };
struct X { int i; struct S s; };

void g(struct S*);

void f() {
  struct X x;
  g(&x.s);
}

Would the compiler be allowed to realize that X::i is never referenced and is therefore a dead variable? I assume the compiler doesn't do that right now, but it would be straightforward for a scalar replacement algorithm to not even allocate stack space for X::i, but only X::s, and hand the address of the only remaining stack object, of type S, to g().

The community at large may have more experience with such "as-if" related questions. It would be interesting to know whether the scalarizers in gcc realize, for example, whether they can/can't get rid of X::i...

Best
Wolfgang

-----
Wolfgang Bangerth email:[EMAIL PROTECTED]
www: http://www.ices.utexas.edu/~bangerth/
No documentation of -rdynamic
Hi all,

in order for the glibc function backtrace() to return something useful, its documentation says one has to use the -rdynamic flag. However, as has been mentioned before here

http://gcc.gnu.org/ml/gcc-help/2002-11/msg00196.html
http://gcc.gnu.org/ml/libstdc++/2002-04/msg00100.html

and probably some other places, there doesn't seem to be any documentation of what this flag does, etc. Is there someone who can give me the gist of its meaning? If I get a reasonable explanation, I may even be willing to write a blurb for the manual...

Best
Wolfgang

-----
Wolfgang Bangerth email:[EMAIL PROTECTED]
www: http://www.ices.utexas.edu/~bangerth/
Bug or feature: symbol names of global/extern variables
Hello,

I don't know whether this is a bug or a feature, and I searched through the mailing lists without success; therefore I write my question this way:

If you have a global variable inside a cpp file and create a library out of that, the symbol name for that global variable does in no way take the variable type into account. A user of that variable can "make" it any type with its extern declaration and thus produce subtle errors. An example:

--- lib.cpp ---
int maximum;
int minimum;

static bool init ( )
{
  maximum = 2;
  minimum = -7;
  return true;
}

static bool initialized = init ( );
---

Create a library out of that lib.cpp file. Then compile the following main.cpp and link it against the library:

--- main.cpp ---
#include <assert.h>

extern double maximum;
extern int    minimum;

int main (int, char**)
{
  // Assume you are on a machine where the sizeof (int) is 4 bytes
  // and the sizeof (double) is 8 bytes.
  assert (minimum == -7);
  maximum = 2342343242343.3;
  assert (minimum == -7);
  return 0;
}
---

The main.o will perfectly link with the library although main.o needs a double variable named maximum and the lib only offers an int variable named maximum. Because the symbol name does in no way reflect the variable type, everything links fine, but in fact the variable minimum gets scrambled in this example because maximum is accessed as if it were a double variable, thus overwriting 4 additional bytes (in this case the 4 bytes of the variable minimum). The assertion will show that.

I tested that on Windows with Visual C++ as well, and there main.obj won't link because the variable type is part of the symbol name, and everything is fine. I think it would be very, very important for the binary interface (ELF here, or?) to have that feature as well. What do you think?

Regards,
Wolfgang Roemer
Re: Bug or feature: symbol names of global/extern variables
Hello Michael,

first of all: thanks for the fast reply!

On Thu Oct 06, 2005 10:33, you wrote:
> [..]
> It's a feature. It is undefined behavior to have conflicting declarations
> in different translation units.
> [...]

Well, but shouldn't there at least be a warning during linking!?

> [..]
> In that case, how does VC++ implement cout, cin construction?
> In libstdc++ (well, at least in gcc-3.4) it is implemented by doing
> something like:
>
> namespace std{
>   ...
>   // Note that this is different from <iostream>'s definition of cin
>   // (it's declared as "extern istream cin" in there).
>   char cin[ sizeof(istream) ];
>   ...
>   ios::Init::Init()
>   {
>     if (count++ == 0)
>       new (&cin) istream(cin_construction_flags);
>   }

I don't know how VC++ implements cout, cin. I just checked the symbol names with the dumpbin.exe tool that is part of the VC++ suite, and there it is clearly marked as "maximum (int)". And during the attempt to link you get an unresolved symbol error saying that main.o needs "maximum (double)" but lib only offers "maximum (int)", and that's very helpful.

I encountered this behaviour on Linux because of a very strange SEGV, and I was finally able to track that down to an extern variable that was used in the wrong way; thus I found the mentioned behaviour. I did not take a look at the VC++ libc implementation etc. I just checked it from the user perspective.

Thanks,
WR
Re: Bug or feature: symbol names of global/extern variables
Hello,

so it seems as if it would be best if I post that to the binutils mailing list. Agreed?

WR

On Thu Oct 06, 2005 11:57, Robert Dewar wrote:
> Michael Veksler wrote:
> > It sounds as if the symbol is still "maximum" and it is annotated with
> > its type (something like debug information). It should be possible to
> > hack the linker to emit a warning for symbols with conflicting debug
> > information.
>
> Nice idea!
>
> > This is the wrong list for linker enhancements. You should look for
> > binutils mailing lists. However "collect2", which is part of gcc and is
> > called before the linker (for C++), could also detect this and give
> > the same warning. I would bet that collect2 is the wrong place for
> > this enhancement because it will work only for C++, not for C.
>
> If the linker did this, then it would even work across languages,
> e.g. importing a C symbol from an Ada unit, and vice versa.
>
> > Michael
Bug or feature: symbol names of global/extern variables
Hello,

I encountered a subtle SEGV in a program and was able to track the problem down to symbol names concerning global/extern variables. I discussed that with some guys from the GCC project (see recipient list) and we came to the conclusion that it would make more sense to share our thoughts with you. Here is the problem:

If you have a global variable inside a cpp file and create a library out of that, the symbol name for that global variable does in no way take the type of the variable into account. A user of that variable can "make" it any type with an "extern" declaration and thus produce subtle errors. An example:

--- lib.cpp ---
int maximum;
int minimum;

static bool init ( )
{
  maximum = 2;
  minimum = -7;
  return true;
}

static bool initialized = init ( );
---

Create a library out of that lib.cpp file. Then compile the following main.cpp and link it against the library:

--- main.cpp ---
#include <assert.h>

extern double maximum;
extern int    minimum;

int main (int, char**)
{
  // Assume you are on a machine where the sizeof (int) is 4 bytes
  // and the sizeof (double) is 8 bytes.
  assert (minimum == -7);
  maximum = 2342343242343.3;
  assert (minimum == -7);
  return 0;
}
---

The main.o will perfectly link with the library although main.o needs a double variable named maximum and the lib only offers an int variable named maximum. Because the symbol name does in no way reflect the variable type, everything links fine, but in fact the variable named "minimum" gets scrambled in this example because "maximum" is accessed as if it were a double variable, thus overwriting 4 additional bytes (in this case the 4 bytes of the variable minimum). The assertion will show that.

I tested that on Windows with Visual C++ as well, and there main.obj doesn't link because the variable type is part of the symbol name, and everything is fine. I think it would be very, very important for the binary interface to have that feature as well.

Regards,
Wolfgang Roemer
Re: Bug or feature: symbol names of global/extern variables
On Thu Oct 06, 2005 14:50, Robert Dewar wrote:
> [..]
> I actually disagree with this, I think attempting to make the link fail
> here would be a mistake.

Why do you think that this would be a mistake?

WR
Re: Bug or feature: symbol names of global/extern variables
Hello Michael,

On Thu Oct 06, 2005 15:54, Michael Veksler wrote:
[..]
> 2. I think that it will break C. As I remember, it is sometimes
>    legal in C (or in some dialects of C) to have conflicting types.
>    You may define in one translation unit:
>        char var[5];
>    and then go on and define in a different translation unit:
>        char var[10];
>    The linker will merge both declarations and allocate at least
>    10 bytes for 'var' (ld's --warn-common will detect this).

That is interesting: if the linker behaved that way, I wouldn't get the error, because the 8 bytes needed for a double would be allocated.

WR
devbranches: ambiguous characterisation of branches
Dear Sir or Madam,

in the repository contents description at <https://gcc.gnu.org/svn.html#olddevbranches>, numerous branch names are listed as inactive, with some further comments. Right at the start there is the longest list of such names, followed by "These branches have been merged into the mainline." Without "preceding" or "following", or at least a leading dash or a trailing colon, I'm at a loss whether that refers to the branches named before or after.

(The somewhat formal address contributed to landing this message in the SPAM pit? Dear me.)

Yours faithfully
Wolfgang Hospital
gcc version 4.4.3 (GCC)
config.guess: i686-pc-linux-gnu

gcc -v:
Using built-in specs.
Target: i686-pc-linux-gnu
Configured with: ./configure
Thread model: posix
gcc version 4.4.3 (GCC)

packages:
gcc-4.4.3.tar.bz2
gcc-core-4.4.3.tar.bz2
gcc-g++-4.4.3.tar.bz2

linux distribution: Ubuntu 8.04.4 LTS

kernel version: Linux HDHN2432 2.6.24-26-generic #1 SMP Tue Dec 1 18:37:31 UTC 2009 i686 GNU/Linux

glibc version:
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Installed/Config-f/Unpacked/Failed-cfg/Half-inst/t-aWait/T-pend
|/ Err?=(none)/Hold/Reinst-required/X=both-problems (Status,Err: uppercase=bad)
||/ Name    Version        Description
+++-==-==-===
ii  libc6   2.7-10ubuntu5  GNU C Library: Shared libraries
Tree Browser
Hi,

I've tried to use the tree browser described at http://gcc.gnu.org/projects/tree-ssa/tree-browser.html

I configured gcc-4.5.0 with ... --enable-checking --enable-tree-browse --enable-tree-checking ...

Compilation was OK, and a gcc/tree-browser.o exists. Now I'm able to launch gdb and step through the code, etc. When I type the suggested command, I get

(gdb) p browse_tree (current_function_decl)
No symbol "browse_tree" in current context.
(gdb)

What am I doing wrong? Any ideas?

Thanks,
Wolfgang
Modifying ARM code generator for elimination of 8bit writes - need help
Hello,

I am trying to port big C/C++ programs (see www.dslinux.org) to the Nintendo DS console. The console has 4 MBytes of internal memory, and 32 MBytes of external memory which is *not* 8-bit writable (only 16 and 32 bits). The CPU is an ARM 946.

Using the external memory for ROM (XIP) and the internal memory for data, linux in console mode is possible, but graphical environments are very limited...

The idea to overcome this problem is to
a) activate the data cache in writeback mode for the external memory.
b) modify the gcc code generator: the "strb" opcode is transformed to "swpb". swpb will load the cache because of the read-modify-write, and at cache writeback time, the whole cached half-line will be written back, eliminating the 8-bit write problem.

I have proven the solution with an assembler program, but I think I need some help modifying the compiler.

I found arm.md and the movqi insns, but because of the different addressing modes of strb and swpb, it's not easy to make the change. And there must be a compiler option for this, too.

Could somebody please tell me how to implement this change?

regards
Wolfgang

--
We're back to the times when men were men and wrote their own device drivers. (Linus Torvalds)
Re: Modifying ARM code generator for elimination of 8bit writes - need help
On Tuesday 30 May 2006 23:47, Daniel Jacobowitz wrote:
> On Tue, May 30, 2006 at 09:03:54PM +0100, Paul Brook wrote:
> > > I found arm.md and the movqi insns, but because of the different
> > > addressing modes of strb and swpb, its not easy to make the
> > > change. And there must be a compiler option for this, too.
> > >
> > > Could somebody please tell me how to implement this change?
> >
> > Short answer is probably not.
> >
> > There are a couple of complications that spring to mind. The
> > different addressing modes and the fact that swp clobbers a
> > register are the most immediate ones.
> >
> > You'll need to modify at least the movqi insn patterns, memory
> > constraints and the legitimate address stuff. I'm not sure about
> > the clobber, that might need additional reload-related machinery.
>
> I suspect it would be better to make GCC do halfword stores instead
> (read/modify/write).

Hmmm... I have thought about that. But how does the compiler know if the byte address is even or odd? Testing the LSB of the address every time and generating conditional code is no joke...

regards
Wolfgang

--
We're back to the times when men were men and wrote their own device drivers. (Linus Torvalds)
Re: Modifying ARM code generator for elimination of 8bit writes - need help
Paul,

thank you for commenting...

On Tuesday 30 May 2006 22:03, Paul Brook wrote:
> > I found arm.md and the movqi insns, but because of the different
> > addressing modes of strb and swpb, its not easy to make the change.
> > And there must be a compiler option for this, too.
> >
> > Could somebody please tell me how to implement this change?
>
> Short answer is probably not.
>
> There are a couple of complications that spring to mind. The
> different addressing modes and the fact that swp clobbers a register
> are the most immediate ones.
>
> You'll need to modify at least the movqi insn patterns, memory
> constraints and the legitimate address stuff. I'm not sure about the
> clobber, that might need additional reload-related machinery.

For the first shot, I have changed

(define_insn "*arm_movqi_insn"
  [(set (match_operand:QI 0 "nonimmediate_operand" "=r,r,r,m")
        (match_operand:QI 1 "general_operand" "rI,K,m,r"))]
  "TARGET_ARM
   && ( register_operand (operands[0], QImode)
       || register_operand (operands[1], QImode))"
  "@
   mov%?\\t%0, %1
   mvn%?\\t%0, #%B1
   ldr%?b\\t%0, %1
   str%?b\\t%1, %0"
  [(set_attr "type" "*,*,load1,store1")
   (set_attr "predicable" "yes")]
)

into

(define_insn "*arm_movqi_insn"
  [(set (match_operand:QI 0 "nonimmediate_operand" "=r,r,r,Q")
        (match_operand:QI 1 "general_operand" "rI,K,m,+r"))]
  "TARGET_ARM
   && ( register_operand (operands[0], QImode)
       || register_operand (operands[1], QImode))"
  "@
   mov%?\\t%0, %1
   mvn%?\\t%0, #%B1
   ldr%?b\\t%0, %1
   swp%?b\\t%1, %1, [%M0]"
  [(set_attr "type" "*,*,load1,store1")
   (set_attr "predicable" "yes")]
)

changing "m" to "Q" (narrowing the address modes), changing "r" to "+r" (the register is clobbered), and of course making the swpb call.

GCC compiles, but segfaults while compiling ARM programs.

regards
Wolfgang

--
We're back to the times when men were men and wrote their own device drivers. (Linus Torvalds)
Re: Modifying ARM code generator for elimination of 8bit writes - need help
Rask,

On Thursday 01 June 2006 16:13, Rask Ingemann Lambertsen wrote:
> I think you will need to remove the '+' as already suggested and add
> (clobber (match_scratch:QI "=X,X,X,1")) to tell GCC that the register
> allocated to operand 1 is clobbered by the instruction for this
> particular alternative.

Using

(define_insn "*arm_movqi_insn"
  [(set (match_operand:QI 0 "nonimmediate_operand" "=r,r,r,m")
        (match_operand:QI 1 "general_operand" "rI,K,m,r"))
   (clobber (match_scratch:QI 2 "=X,X,X,1"))]
  "TARGET_ARM
   && ( register_operand (operands[0], QImode)
       || register_operand (operands[1], QImode))"
  "@
   mov%?\\t%0, %1
   mvn%?\\t%0, #%B1
   ldr%?b\\t%0, %1
   str%?b\\t%1, %0"
  [(set_attr "type" "*,*,load1,store1")
   (set_attr "predicable" "yes")]
)

(_only_ adding the clobber statement), I get

> /data1/home/wolfgang/Projekte/DSO/devkitpro/buildscripts/newlib-1.14.0/newlib/libc/argz/argz_create_sep.c: In function 'argz_create_sep':
> /data1/home/wolfgang/Projekte/DSO/devkitpro/buildscripts/newlib-1.14.0/newlib/libc/argz/argz_create_sep.c:60: error: unrecognizable insn:
> (insn 192 21 24 0 /data1/home/wolfgang/Projekte/DSO/devkitpro/buildscripts/newlib-1.14.0/newlib/libc/argz/argz_create_sep.c:29 (set (reg:QI 1 r1)
>         (reg:QI 4 r4)) -1 (nil)
>     (nil))
> /data1/home/wolfgang/Projekte/DSO/devkitpro/buildscripts/newlib-1.14.0/newlib/libc/argz/argz_create_sep.c:60: internal compiler error: in extract_insn, at recog.c:2020

What do you mean with

> You will also have to modify any code which
> expands this pattern accordingly.

?

I will use this weekend to dig deeper into the documentation... thank you for your help so far.

Wolfgang

--
We're back to the times when men were men and wrote their own device drivers. (Linus Torvalds)
Re: Modifying ARM code generator for elimination of 8bit writes - need help
Hello Rask,

On Friday 02 June 2006 09:24, Rask Ingemann Lambertsen wrote:
> There may be a faster way of seeing if the modification is going to
> work for the DS at all. I noticed from the output template
> "swp%?b\\t%1, %1, [%M0]" that "swp" takes three operands. I don't
> know ARM assembler, but you may be able to choose to always clobber a
> specific register. Make it a fixed register (see FIXED_REGISTERS),
> refer to this register directly in the output template and don't add
> a clobber to the movqi patterns. IMHO, that's an acceptable hack at
> an experimental stage. If the resulting code runs correctly on the
> DS, you can then undo the FIXED_REGISTERS change and add the clobber
> statements.

I have tried this. No luck. The problem is the lack of addressing modes for the swp instruction: only a simple pointer in a register is allowed (no offset, no auto-increment).

After reading most of the gcc rtl documentation (and forgetting way too much...) I came to the following conclusion:

Splitting the insn

(define_insn "*arm_movqi_insn"
  [(set (match_operand:QI 0 "nonimmediate_operand" "=r,r,r,m")
        (match_operand:QI 1 "general_operand" "rI,K,m,r"))]

into 4 different insns:

(define_insn "*arm_movqi_insn"
  [(set (match_operand:QI 0 "register_operand" "")
        (match_operand:QI 1 "register_operand" ""))]

(define_insn "*arm_movnqi_insn"
  [(set (match_operand:QI 0 "register_operand" "")
        (match_operand:QI 1 "constant_operand" ""))]

(define_insn "*arm_loadqi_insn"
  [(set (match_operand:QI 0 "register_operand" "")
        (match_operand:QI 1 "memory_operand" ""))]

(define_insn "*arm_storeqi_insn"
  [(set (match_operand:QI 0 "memory_operand" "")
        (match_operand:QI 1 "register_operand" ""))]

This should give the same function as before, but then I can do

(define_insn "*arm_storeqi_insn"
  [(set (match_operand:QI 0 "simple_memory_operand" "")
        (match_operand:QI 1 "register_operand" ""))]

etc. to limit the addressing modes of the store insn to the limits of the swpb instruction.
And then I can recode the

(define_expand "movqi"
  [(set (match_operand:QI 0 "general_operand" "")
        (match_operand:QI 1 "general_operand" ""))]

to cope with the movqi requirements defined in the gcc manual.

Hmmm... I am wondering where all these xxx_operand functions are
defined, and where they are documented... Is this the right way to go?

regards
Wolfgang
--
We're back to the times when men were men and wrote their own device
drivers. (Linus Torvalds)
Re: Modifying ARM code generator for elimination of 8bit writes - need help
Paul,

On Sunday 04 June 2006 13:24, Paul Brook wrote:
> On Sunday 04 June 2006 11:31, Wolfgang Mües wrote:
> > Splitting the insn
> >
> > (define_insn "*arm_movqi_insn"
> >   [(set (match_operand:QI 0 "nonimmediate_operand" "=r,r,r,m")
> >         (match_operand:QI 1 "general_operand" "rI,K,m,r"))]
> >
> > into 4 different insns:
>
> No. This is completely the wrong approach.

Why? I am learning.

> You should just change the valid QImode memory addresses, adding a new
> constraint if necessary.

Hmmm... I have tried this. I have changed the operand constraint from
"m" to "Q". But these constraints are only used to select the right
alternative inside the insn, not which insn is invoked.

It might be possible to modify "nonimmediate_operand" into something
else, to select this insn only if the address fits in a single
register, without offset or increment. But this will not give me the
freedom to allocate a temporary register. According to the manual, mov
insns are not supposed to clobber a register. I suppose I will have to
allocate these registers in

(define_expand "movqi"
  [(set (match_operand:QI 0 "general_operand" "")
        (match_operand:QI 1 "general_operand" ""))]

So I have to narrow down the constraint "nonimmediate_operand", so that
any memory address not fitting in a single register will not invoke
arm_movqi_insn.

Please correct me if I'm wrong. This is my first encounter with the
inner workings of gcc, and I may have completely missed your point.

> You also need to tweak the reload legitimate address bits to obey the
> new restrictions.

Can you show me what you mean here? What to do where?

> For the record, these hacks are unlikely to ever be acceptable in
> mainline gcc. They're relatively invasive changes whose only purpose
> is to support fundamentally broken hardware.

Paul, this is clear to me. Homebrew software on the DS is not important
enough to justify such a change in mainline gcc. A patch will be fine.
It's a big amount of - sometimes frustrating - work for a gcc newbie to
make this change. I am doing this only because I know it's the only
solution, and to turn the command-line-only DS Linux into a nice
PDA/browser/wireless client machine.

regards
Wolfgang
--
We're back to the times when men were men and wrote their own device
drivers. (Linus Torvalds)
Re: Modifying ARM code generator for elimination of 8bit writes - need help
Paul,

On Sunday 04 June 2006 17:57, Paul Brook wrote:
> Because then you have several different patterns for the same
> operation. The different variants of movsi should be part of the same
> pattern so that the compiler can change its mind about which variant
> it wants to use.

Together with the comments of Rask Ingemann (thanks, Rask!), I now
understand what you mean. But given that swpb needs a temporary
register - or, alternatively, clobbers the input register - how can I
model this behaviour in a single insn?

> You're confusing constraints and predicates. general_operand is the
> predicate. The predicate says under which conditions the insn will
> match. The constraints tell regalloc/reload how to make sure the
> operands of the instruction are valid.

Yes, my wording was incorrect. But I already knew the difference from
the manual.

> Tightening the predicates isn't sufficient (and may not even be
> necessary). You need to set the constraints so that the compiler
> knows *how* to fix invalid instructions.

And if I have 4 different constraint alternatives in a single insn, and
only one of them needs a temporary register, how do I model this? This
may be the biggest problem. And because byte writes are so common, it
deserves a good implementation: I can't waste a temporary register for
every load/store.

> The complication is that while constraints give sufficient
> information for the compiler to generate correct code, they don't
> help generate good code. There are often non-obvious target-specific
> ways of reloading invalid addresses. So reload has additional hooks
> (eg. GO_IF_LEGITIMATE_ADDRESS) to provide clever ways of fixing
> invalid operands.

I will look into this region of code to understand what's going on
there. Thanks, Paul.

regards
Wolfgang
--
We're back to the times when men were men and wrote their own device
drivers. (Linus Torvalds)
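[A note on the "only one alternative needs a scratch" question above:
the usual machine-description idiom is a (match_scratch ...) clobber
whose constraint is "X" (nothing required) in the alternatives that do
not need it. The following is a hypothetical sketch only - the insn
name and output templates are made up for illustration and are not
taken from any patch in this thread:]

```lisp
;; Hypothetical sketch: one movqi insn where only the store (mem <- reg)
;; alternative ties up a temporary register.  The "X" entries in the
;; scratch's constraint string mean "no register needed" for the first
;; three alternatives; "&r" requests an earlyclobber scratch only for
;; the store alternative.
(define_insn "*arm_movqi_insn_sketch"
  [(set (match_operand:QI 0 "nonimmediate_operand" "=r,r,r,Q")
        (match_operand:QI 1 "general_operand"       "rI,K,m,r"))
   (clobber (match_scratch:QI 2                     "=X,X,X,&r"))]
  "TARGET_ARM"
  "@
   mov%?\\t%0, %1
   mvn%?\\t%0, #%B1
   ldr%?b\\t%0, %1
   mov%?\\t%2, %1\;swp%?b\\t%2, %2, %0")
```

With this shape, the register allocator only allocates a scratch when
the store alternative is chosen, so the common register-to-register and
load cases pay nothing.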
Re: Modifying ARM code generator for elimination of 8bit writes - need help
Hello Dave ;-)

On Monday 05 June 2006 02:12, Dave Murphy wrote:
> I was just about to ask about this very thing since I'm quite sure
> that there would be interest in adding this to devkitARM.

You are following the process in dslinux, aren't you? In fact,
devkitARM is my current build environment. The first thing that will
happen is a patch to devkitARM.

> How much work would it be to implement these switches?

Good question ;-)

> I assume that the toolchain would need multilibs for these options in
> order to use newlib etc.

I have not looked into library issues yet. The compiler comes first. We
will need an asm macro for 8-bit writes to the hardware registers. And
the devkitARM libraries *must* implement writeback caching for the GBA
slot ROM area.

regards
Wolfgang
--
We're back to the times when men were men and wrote their own device
drivers. (Linus Torvalds)
Re: Modifying ARM code generator for elimination of 8bit writes - need help
Richard,

On Monday 05 June 2006 12:06, Richard Earnshaw wrote:
> I'm confident right now that these will be too invasive to include in
> mainline.

As said before, this is OK for me.

> The changes that tend to get incorporated into the compiler are to
> work around bugs in the CPU, not bugs in some H/W developer's use of
> the CPU. The former affect all users of the processor, the latter
> only that one case.
>
> If we started putting in hacks for the latter the compiler back-ends
> would become unmaintainable in almost no time at all.

Agreed.

> PS. Using swp is a bad idea IMO, this instruction is *very* slow on
> some CPU implementations because of the way it interacts with caches.

Yes, swp forces a cache load. But in this particular case, forcing a
cache load is the ONLY way to circumvent the hardware problem. For a
block of writes, cache loads are forced only once per 32 bytes.

Other possible solutions:

a) Code a 16-bit read-modify-write. This will also cause a cache load,
and will need much more code, because it has to look at the LSB of the
address to know where to insert the byte into the halfword.

b) Use the protection unit and take a data abort for a write to that
memory region. This has the advantage of affecting ONLY the critical
memory region (not all the others), but the disadvantages are big: all
memory writes are affected, and a data abort handler is very slow. This
solution was implemented before; it was 100 times slower than native
access. Unusable.

regards
Wolfgang
--
We're back to the times when men were men and wrote their own device
drivers. (Linus Torvalds)
Re: Modifying ARM code generator for elimination of 8bit writes - need help
Hello,

my first little success... in arm.h, I have changed

> /* Output the address of an operand.  */
> #define ARM_PRINT_OPERAND_ADDRESS(STREAM, X)          \
> {                                                     \
>   int is_minus = GET_CODE (X) == MINUS;               \
>                                                       \
>   if (GET_CODE (X) == REG)                            \
>     asm_fprintf (STREAM, "[%r, #0]", REGNO (X));      \

into

> /* Output the address of an operand.  */
> #define ARM_PRINT_OPERAND_ADDRESS(STREAM, X)          \
> {                                                     \
>   int is_minus = GET_CODE (X) == MINUS;               \
>                                                       \
>   if (GET_CODE (X) == REG)                            \
>     asm_fprintf (STREAM, "[%r]", REGNO (X));          \

I don't know why the form "[%r, #0]" was used before, because the
assembler understands "[%r]" perfectly well for all instructions. The
form "[%r]" has wider applicability because it covers swp too.

On Sunday 04 June 2006 23:36, Rask Ingemann Lambertsen wrote:
> On Wed, May 31, 2006 at 10:49:35PM +0200, Wolfgang Mües wrote:
> > > (define_insn "*arm_movqi_insn"
> > >   [(set (match_operand:QI 0 "nonimmediate_operand" "=r,r,r,m")
> > >         (match_operand:QI 1 "general_operand" "rI,K,m,r"))]
>
> I think you should go back to this (i.e. the unmodified version) and
> only change the "m" into "Q" in the fourth alternative of operand 0.
> See if that works, i.e. generates addresses that are valid for the
> swp instruction.
No, that doesn't work:

> ../../../gcc-4.0.2/gcc/unwind-dw2-fde.c: In function
> '__register_frame_info_table_bases':
> ../../../gcc-4.0.2/gcc/unwind-dw2-fde.c:146: error: insn does not
> satisfy its constraints:
> (insn 63 28 29 0 ../../../gcc-4.0.2/gcc/unwind-dw2-fde.c:136
>     (set (mem/s/j:QI (plus:SI (reg/v/f:SI 1 r1 [orig:102 ob ] [102])
>                               (const_int 16 [0x10])) [0 S1 A32])
>          (reg:QI 12 ip)) 155 {*arm_movqi_insn} (nil)
>     (nil))
> ../../../gcc-4.0.2/gcc/unwind-dw2-fde.c:146: internal compiler error:
> in reload_cse_simplify_operands, at postreload.c:391

Also, I wonder what the "Q" constraint really means. From the GCC
manual:

> Q
>   A memory reference where the exact address is in a single register
>   ("m" is preferable for asm statements)

but in arm.h:

> /* For the ARM, `Q' means that this is a memory operand that is just
>    an offset from a register.  */
> #define EXTRA_CONSTRAINT_STR_ARM(OP, C, STR)          \
>   ((C) == 'Q') ? (GET_CODE (OP) == MEM                \
>                   && GET_CODE (XEXP (OP, 0)) == REG) : \

Obviously, GCC tries to implement REG+CONSTANT with Q. Maybe I must
define a new constraint?

regards
Wolfgang
--
We're back to the times when men were men and wrote their own device
drivers. (Linus Torvalds)
Re: Modifying ARM code generator for elimination of 8bit writes - need help
Rask,

On Monday 05 June 2006 16:16, Rask Ingemann Lambertsen wrote:
> On Mon, Jun 05, 2006 at 01:47:10PM +0200, Wolfgang Mües wrote:
> Does GCC happen to accept "[%r, #0]" for swp?

No. But it's no problem to change that here.

> I think the comment in arm.h is wrong. The manual seems to agree with
> the code.

Just to make it easy for beginners...

> I tried 'V' instead, but it looks as if reload completely ignores the
> meaning of the constraint. There is already a comment in arm.md about
> that. It should be investigated further.

Hmmm... I have searched for 'Q' in the arm files. It is not used in
arm.md, only for some variants of arm (cirrus). Maybe it is only
implemented for them?

> Meanwhile, I changed arm_legitimate_address_p() to enforce the
> correct address form. This hurts byte loads too, though.

I assume there is no way to tell the direction of the transfer in
arm_legitimate_address_p()? Hmmm.

> Index: gcc/config/arm/arm.opt
> ===
> --- gcc/config/arm/arm.opt    (revision 114119)
> +++ gcc/config/arm/arm.opt    (working copy)
> @@ -153,3 +153,7 @@
>  mwords-little-endian
>  Target Report RejectNegative Mask(LITTLE_WORDS)
>  Assume big endian bytes, little endian words
> +
> +mswp-byte-writes
> +Target Report Mask(SWP_BYTE_WRITES)
> +Use the swp instruction for byte writes

In my environment (gcc 4.0.2), this is different. But I was able to
find the definitions in arm.h and implement these changes. Easier than
expected... (The DSLINUX team is not using gcc 4.1 because of compile
problems with the 2.6.14 kernel.)

> +   swp%?b\\t%1, %1, %0\;ldr%?b\\t%1, %0"

You should get a prize for cleverness here!

> +; Avoid reading the stored value back if we have a spare register.
> +(define_peephole2
> +  [(match_scratch:QI 2 "r")
> +   (set (match_operand:QI 0 "memory_operand" "")
> +        (match_operand:QI 1 "register_operand" ""))]
> +  "TARGET_ARM && TARGET_SWP_BYTE_WRITES"
> +  [(parallel [
> +     (set (match_dup 0) (match_dup 1))
> +     (clobber (match_dup 2))]
> +  )]
> +)

As far as I can tell so far, this works well.
But I think there are many cases in which the source operand is not
needed after the store. Is there a possibility to clobber the source
operand instead of using another register? Hmmm. Most of the code I
have seen in the first tests has no problem with this extra
register... it's available.

> With -O2 -mswp-byte-writes:
>
> bytewritetest:
>     @ args = 0, pretend = 0, frame = 0
>     @ frame_needed = 0, uses_anonymous_args = 0
>     str     lr, [sp, #-4]!
>     add     r2, r0, #4
>     add     lr, r0, #5
>     ldrb    r3, [lr, #0]    @ zero_extendqisi2
>     ldrb    r1, [r2, #0]    @ zero_extendqisi2
>     eor     r2, r1, r3
>     add     r3, r3, r1
>     ldr     ip, [r0, #0]
>     str     r3, [r0, #0]
>     swpb    r3, r2, [lr, #0]
>     str     ip, [r0, #8]
>     ldr     pc, [sp], #4
>
> The register allocator chooses to use the lr register, in turn
> causing link register save elimination to fail, which doesn't help.

I can't understand this without an explanation... is it bad?

Rask, thank you very much for your work.

regards
Wolfgang
--
We're back to the times when men were men and wrote their own device
drivers. (Linus Torvalds)
Re: Modifying ARM code generator for elimination of 8bit writes - need help
Rask,

On Tuesday 06 June 2006 21:33, Rask Ingemann Lambertsen wrote:
> > > +   swp%?b\\t%1, %1, %0\;ldr%?b\\t%1, %0"
> >
> > You should get a prize for cleverness here!
>
> Thanks! Indeed it looks good until you think of volatile variables.

Because volatile variables can change their values from another thread,
and the value read back would be wrong. Oh. gcc honours the volatile
attribute here, I assume?

> > As far as I can tell now, this works well. But I think there are
> > many cases in which the source operand is not needed after the
> > store. Is there a possibility to clobber the source operand and not
> > use another register?
>
> I don't know if (match_scratch ...) might reuse the source operand.
> It can be attempted more specifically with an additional peephole
> definition:
>
> (define_peephole2
>   [(set (match_operand:QI 0 "memory_operand" "")
>         (match_operand:QI 1 "register_operand" ""))]
>   "TARGET_ARM && TARGET_SWP_BYTE_WRITES
>    && peep2_reg_dead_p (1, operands[1])"
>   [(parallel
>      [(set (match_dup 0) (match_dup 1))
>       (clobber (match_dup 1))]
>   )]
> )

I will try this.

> Yet another register which stands a good chance of being reusable is
> the register containing the address.

Yes, but that is not allowed according to the specification of the swp
instruction: the address register must be different from the other two
registers. Is there any chance of gcc violating this constraint?

regards
Wolfgang
--
We're back to the times when men were men and wrote their own device
drivers. (Linus Torvalds)
Re: Modifying ARM code generator for elimination of 8bit writes - need help
Rask,

On Thursday 08 June 2006 20:12, Rask Ingemann Lambertsen wrote:
> Also, undo the change to arm_legitimate_address_p() in arm.c.

Hmmm:

> arm-elf-gcc -g -mswp-byte-writes -Wall -O2 -fomit-frame-pointer
> -ffast-math -mthumb-interwork -isystem
> /usr/lib/devkitpro/libnds/include -mcpu=arm9tdmi -mtune=arm9tdmi
> -DARM9 -S arm9_main.c -o arm9_main.S
> arm9_main.c: In function 'test':
> arm9_main.c:20: error: unable to generate reloads for:
> (insn:HI 20 21 22 1 arm9_main.c:16 (set (mem/v:QI (post_inc:SI
> (reg/v/f:SI 3 r3 [orig:102 p ] [102])) [0 S1 A8]) (subreg/s/u:QI
> (reg:SI 2 r2 [orig:103 c.36 ] [103]) 0)) 157 {*arm_movqi_insn_swp}
> (nil) (expr_list:REG_INC (reg/v/f:SI 3 r3 [orig:102 p ] [102])
> (nil)))
> arm9_main.c:20: internal compiler error: in find_reloads, at
> reload.c:3720

The test case:

void test(void)
{
    static unsigned char c = 20;
    volatile unsigned char *p;
    int i;

    p = (volatile unsigned char *) 0x0800;
    for (i = 0; i < 1000; i++)
        *p++ = c;
    c = 40;
    c = c;
}

Without the change in arm_legitimate_address_p, we get a post-increment
pointer into swpb. The non-working 'Q' constraint...

regards
Wolfgang
--
We're back to the times when men were men and wrote their own device
drivers. (Linus Torvalds)
Re: Modifying ARM code generator for elimination of 8bit writes - need help
Hello,

after getting a "working" version of gcc 4.0.2 with the Nintendo
8-bit-write problem, I have been busy the last weeks trying to adapt
the Linux system (replacing I/O with writeb() macros, removing strb
assembler calls). However, it turned out that the sources of the Linux
kernel are a far more demanding test than any single small test case.

I have tried my very best to implement the last patch from Rask (thank
you very much!). There was one place where I was not sure I had coded
the right solution.

Rask's patch (gcc 4.2.x):

> +;; Match register operands or memory operands of the form (mem (reg ...)),
> +;; as permitted by the "Q" memory constraint.
> +(define_predicate "reg_or_Qmem_operand"
> +  (ior (match_operand 0 "register_operand")
> +       (and (match_code "mem")
> +            (match_code "reg" "0")))
> +)

My patch (without the second operand for match_code):

> ;; Match register operands or memory operands of the form (mem (reg ...)),
> ;; as permitted by the "Q" memory constraint.
> (define_predicate "reg_or_Qmem_operand"
>   (ior (match_operand 0 "register_operand")
>        (and (match_code "mem")
>             (match_test "GET_CODE (XEXP (op, 0)) == REG")))
> )

Is this the right substitution?

If I compile the Linux kernel with this patch, many files get compiled
without problems, but in fs/vfat/namei.c I get:

> fs/vfat/namei.c: In function 'vfat_add_entry':
> fs/vfat/namei.c:694: error: unrecognizable insn:
> (insn 2339 2338 2340 188 (set (mem/s/j:QI (reg:SI 14 lr)
>         [0 .attr+0 S1 A8]) (reg:QI 12 ip)) -1 (nil)
>     (nil))
> fs/vfat/namei.c:694: internal compiler error: in extract_insn, at
> recog.c:2020
> Please submit a full bug report,

I can't see what is going on here...

regards
Wolfgang

The full patch from Rask is appended below:

> Index: gcc/config/arm/arm.h
> ===
> --- gcc/config/arm/arm.h (revision 114119)
> +++ gcc/config/arm/arm.h (working copy)
> @@ -1094,6 +1094,8 @@
>      ? vfp_secondary_reload_class (MODE, X)                      \
>      : TARGET_ARM                                                \
>      ? (((MODE) == HImode && ! arm_arch4 && true_regnum (X) == -1) \
> +       || ((MODE) == QImode && TARGET_ARM && TARGET_SWP_BYTE_WRITES \
> +           && true_regnum (X) == -1)                            \
>      ? GENERAL_REGS : NO_REGS)                                   \
>      : THUMB_SECONDARY_OUTPUT_RELOAD_CLASS (CLASS, MODE, X))
>
> Index: gcc/config/arm/arm.opt
> ===
> --- gcc/config/arm/arm.opt    (revision 114119)
> +++ gcc/config/arm/arm.opt    (working copy)
> @@ -153,3 +153,7 @@
>  mwords-little-endian
>  Target Report RejectNegative Mask(LITTLE_WORDS)
>  Assume big endian bytes, little endian words
> +
> +mswp-byte-writes
> +Target Report Mask(SWP_BYTE_WRITES)
> +Use the swp instruction for byte writes. The default is to use str
>
> Index: gcc/config/arm/predicates.md
> ===
> --- gcc/config/arm/predicates.md (revision 114119)
> +++ gcc/config/arm/predicates.md (working copy)
> @@ -125,6 +125,14 @@
>        || (GET_CODE (op) == REG
>            && REGNO (op) >= FIRST_PSEUDO_REGISTER)))")))
>
> +;; Match register operands or memory operands of the form (mem (reg ...)),
> +;; as permitted by the "Q" memory constraint.
> +(define_predicate "reg_or_Qmem_operand"
> +  (ior (match_operand 0 "register_operand")
> +       (and (match_code "mem")
> +            (match_code "reg" "0")))
> +)
> +
> ;; True for valid operands for the rhs of an floating point insns.
> ;; Allows regs or certain consts on FPA, just regs for everything
> ;; else. (define_predicate "arm_float_rhs_operand"
>
> Index: gcc/config/arm/arm.md
> ===
> --- gcc/config/arm/arm.md (revision 114119)
> +++ gcc/config/arm/arm.md (working copy)
> @@ -5151,6 +5151,16 @@
>        emit_insn (gen_movsi (operands[0], operands[1]));
>        DONE;
>      }
> +  if (TARGET_ARM && TARGET_SWP_BYTE_WRITES)
> +    {
> +      /* Ensure that operands[0] is (mem (reg ...)) if a memory
> +         operand. */
> +      if (MEM_P (operands[0]) && !REG_P (XEXP (operands[0], 0)))
> +        operands[0]
> +          = replace_equiv_address (operands[0],
> +
Re: Modifying ARM code generator for elimination of 8bit writes - need help
Hello Rask,

On Wednesday 19 July 2006 13:24, Rask Ingemann Lambertsen wrote:
> I've spotted a function named emit_set_insn() in arm.c. It might be
> the problem, because it uses gen_rtx_SET() directly.

But it's not the only function which uses gen_rtx_SET. There are also
many places with

> emit_constant_insn (cond,
>                     gen_rtx_SET (VOIDmode, target, source));

Isn't it better to replace gen_rtx_SET?

regards
Wolfgang
--
We're back to the times when men were men and wrote their own device
drivers. (Linus Torvalds)
Re: Modifying ARM code generator for elimination of 8bit writes - need help
Rask,

On Friday 21 July 2006 15:26, Rask Ingemann Lambertsen wrote:
> I found that this peephole optimization improves the code a whole
> lot:

Done.

> Another way of improving the code was to swap the order of the two
> last alternatives of _arm_movqi_insn_swp.

Done.

Anyway, the problems with reload continue... error: unrecognizable
insn.

First, I had a problem with loading a register with a constant (no
clobber). I have solved this problem by adding

> (define_insn "_arm_movqi_insn_const"
>   [(set (match_operand:QI 0 "register_operand" "=r")
>         (match_operand:QI 1 "const_int_operand" ""))]
>   "TARGET_ARM && TARGET_SWP_BYTE_WRITES
>    && (register_operand (operands[0], QImode))"
>   "@
>    mov%?\\t%0, %1"
>   [(set_attr "type" "*")
>    (set_attr "predicable" "yes")]
> )

I am quite sure that this only cures the symptoms, and that it would be
better to fix this in the reload stage, but at least it worked, and I
was able to compile the whole Linux kernel!

After testing that the kernel runs, I tried to compile uClinux. And
there is the next problem:

> ../ncurses/./base/lib_set_term.c: In function '_nc_setupscreen':
> ../ncurses/./base/lib_set_term.c:470: error: unrecognizable insn:
> (insn 1199 1198 696 37 ../ncurses/./base/lib_set_term.c:429 (parallel [
>         (set (mem/s/j:QI (reg/f:SI 3 r3 [491]) [0 ._clear+0 S1 A8])
>              (reg:QI 0 r0))
>         (clobber (subreg:QI (reg:DI 11 fp) 0))
>     ]) -1 (nil)
>     (nil))
> ../ncurses/./base/lib_set_term.c:470: internal compiler error: in
> extract_insn, at recog.c:2020

The source code line is:

> newscr->_clear = TRUE;

Obviously, TRUE is loaded into r0, but I don't know why this construct
(storing a byte into a struct member referenced by a pointer) is not
handled.

I fear that these problems are becoming an endless story - sorry for
generating traffic on this list, since I'm still no gcc expert... On
the other hand, the compiler has now generated code from hundreds of
files, and maybe I'm very near to success now.
regards
Wolfgang
--
We're back to the times when men were men and wrote their own device
drivers. (Linus Torvalds)
Re: Modifying ARM code generator for elimination of 8bit writes - need help
Rask,

On Sunday 06 August 2006 02:05, Rask Ingemann Lambertsen wrote:
> Yes, it only cures the symptom, but it could take a lot of time to
> find the cause, and the gain is small, so I think it is OK to leave
> it like this for now.

OK.

> This insn was generated from the "reload_outqi" pattern. I don't
> completely understand why it isn't recognized. The (subreg:QI (reg:DI
> 11 fp) 0) part won't be matched by (match_scratch ...), but
> simplify_gen_subreg() should have simplified it to (reg:QI 11 fp)
> since this is one of the main purposes of having
> simplify_(gen_)subreg() in the first place. Try changing
>
>    operands[3] = simplify_gen_subreg (QImode, operands[2], DImode, 0);
>
> into
>
>    operands[3] = gen_rtx_REG (QImode, REGNO (operands[2]));
>
> (in "reload_outqi") and see if that works.

Yes, it works. Kernel and userland are compiling now. I can't find any
errors in the generated code. Many thanks!

regards
Wolfgang
--
We're back to the times when men were men and wrote their own device
drivers. (Linus Torvalds)
Bootstrap failure on SuSE 10.1
I must be doing something extraordinarily stupid, but I can't figure
out what it is: I can't bootstrap anymore with subversion revisions
from early January this year onwards, on a system as widely available
as stock SuSE 10.1.

Here's what's happening: starting with revision 109241

---
config:
2006-01-02  Paolo Bonzini  <[EMAIL PROTECTED]>

        PR target/25259
        * stdint.m4: New.

gcc:
2006-01-02  Paolo Bonzini  <[EMAIL PROTECTED]>

        PR target/25259
        * Makefile.in (DECNUMINC): Include libdecnumber's build
        directory.
[...]
---

I get bootstrap failures like this:

gcc -c -g -DENABLE_CHECKING -DENABLE_ASSERT_CHECKING -DIN_GCC -W -Wall
-Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -pedantic
-Wno-long-long -Wno-variadic-macros -Wold-style-definition
-Wmissing-format-attribute -fno-common -DHAVE_CONFIG_H -I. -I.
-I../../svn-mainline/gcc -I../../svn-mainline/gcc/.
-I../../svn-mainline/gcc/../include
-I../../svn-mainline/gcc/../libcpp/include
-I../../svn-mainline/gcc/../libdecnumber -I../libdecnumber
../../svn-mainline/gcc/c-lang.c -o c-lang.o
In file included from ../../svn-mainline/gcc/input.h:25,
                 from ../../svn-mainline/gcc/tree.h:26,
                 from ../../svn-mainline/gcc/c-lang.c:27:
../../svn-mainline/gcc/../libcpp/include/line-map.h:56: error:
'CHAR_BIT' undeclared here (not in a function)

I can prevent the failure if I remove the -I../libdecnumber from the
command line. The reason is that c-lang.c contains #include "config.h"
at the beginning, and for some reason the preprocessor decides to pick
up the config.h from ../libdecnumber instead of from ./ .
If I run the above command line with -E instead of -c, here is the top
of the preprocessor output with -I../libdecnumber:

# 1 "../../svn-mainline/gcc/c-lang.c"
# 1 "<built-in>"
# 1 "<command line>"
# 1 "../../svn-mainline/gcc/c-lang.c"
# 23 "../../svn-mainline/gcc/c-lang.c"
# 1 "../libdecnumber/config.h" 1
# 24 "../../svn-mainline/gcc/c-lang.c" 2

On the other hand, when I omit -I../libdecnumber, I get the output that
was probably expected:

# 1 "../../svn-mainline/gcc/c-lang.c"
# 1 "<built-in>"
# 1 "<command line>"
# 1 "../../svn-mainline/gcc/c-lang.c"
# 23 "../../svn-mainline/gcc/c-lang.c"
# 1 "./config.h" 1 3
# 1 "./auto-host.h" 1 3

This must be something that someone has seen before and knows how to
deal with. Any ideas?

Best
Wolfgang

PS: Just in case, this is how I build:
../svn-mainline/configure --prefix=/home/bangerth/bin/gcc-4.2-pre
--enable-languages=c,c++ && make bootstrap

-
Wolfgang Bangerth    email: [EMAIL PROTECTED]
                     www:   http://www.math.tamu.edu/~bangerth/
Re: Modifying ARM code generator for elimination of 8bit writes - need help
Now it's time to say a big "thank you" to all the people involved,
especially Rask Ingemann Lambertsen for his invaluable help.

When I started this project, I feared that I would never succeed, and
now... the modified compiler has been in use for about 3 months, and
DSLINUX with this crude modification is working fine with 36 MBytes of
RAM available, and has a good future now.

During the last months, 3 issues have come up, all with invalid insns,
but with my newly developed knowledge of the arm code generator I was
able to resolve them.

So, many thanks to all involved, and keep up the good work!

Wolfgang Mües
Re: S/390 as GCC 4.3 secondary platform?
Hello everyone,

> In the criteria for primary platforms I've read that primary
> platforms have to be "popular systems". Reading this as "widely
> used", I think that this is a requirement which mainframes are
> unlikely to meet in the near future, so I propose to make s390 and
> s390x secondary platforms for now. I think this can be important to
> show users that gcc works reliably on S/390 and that it can be
> expected to do so in the future as well.

I agree, and would like to add that with respect to the s390 platform
one should consider that "popular" and "widely used" cannot have the
same meaning as, for example, in the context of computers for personal
use. The s390 back end does not only compile Linux-related software on
IBM System z, but also the system's firmware (the software layer
between operating system and hardware). So every System z machine uses
code generated by gcc, even if the system does not yet run Linux.

Regards,
Wolfgang
Problem with optimization passes management
There is a conflict between the command-line switches that turn off
individual optimization passes and their preconditions. Compiling a
"hello world" with the following options:

-O1 -fno-tree-salias

causes gcc to fail with an internal consistency check. The pass
return_slot has PROP_alias in its preconditions, but alias information
is not generated due to the second option.

Regards,
Wolfgang

---
Dr. Wolfgang Gellerich
IBM Deutschland Entwicklung GmbH
Schönaicher Strasse 220
71032 Böblingen, Germany
Tel. +49 / 7031 / 162598
[EMAIL PROTECTED]

===
IBM Deutschland Entwicklung GmbH
Vorsitzender des Aufsichtsrats: Johann Weihen
Geschäftsführung: Herbert Kircher
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294
Re: Problem with optimization passes management
> On 10/10/07, Wolfgang Gellerich <[EMAIL PROTECTED]> wrote:
> >
> > There is a conflict between the command-line switches that turn off
> > individual optimization steps and their preconditions. Compiling a
> > "hello world" with the following options:
>
> This issue is known; it was reported as PR 33092.

Sorry for the duplicate!

Regards,
Wolfgang
gcc bootstrap failure with libcody
I'm seeing a bootstrap failure when I try to build the latest gcc
version (8833eab4461b4b7050f06a231c3311cc1fa87523):

checking whether time.h and sys/time.h may both be included...
checking whether gcc supports -Wmissing-prototypes...
i686-pc-linux-gnu
checking host system type...
make[3]: *** [buffer.o] Error 1
make[3]: Leaving directory `/ssd/fsf/tst-gcc11--base/libcody'
make[2]: *** [all-stage1-libcody] Error 2
make[2]: *** Waiting for unfinished jobs

Going back to a version from before libcody was added seems to build
fine so far.
Re: ARC length attribute patch
On 20/03/15 16:02, Claudiu Zissulescu wrote:
> Hi Joern,
>
> I have a small patch for the ARC backend that fixes the value of the
> instruction length attribute when the instruction is predicated. Ok
> to apply?

Why would the arc_bdr_iscond test have any effect?
arc_predicate_delay_insns should render the issue moot.

Moreover:

- Your patch has no ChangeLog entry.

+extern bool arc_bdr_iscond (rtx);

- New code should use const rtx_insn * .

+  conditionally. */
                 ^
- The GNU coding standard requires two spaces here.

-          (const_int 2))
+          (match_test "GET_CODE (PATTERN (insn)) == COND_EXEC || arc_bdr_iscond (insn)")
+          (const_int 4)]
+         (const_int 2))

- You are mis-formatting the code. (const_int 2) is part of the cond
  clause.
Re: ARC length attribute patch
On 20/03/15 16:02, Claudiu Zissulescu wrote:
> Hi Joern,
>
> I have a small patch for the ARC backend that fixes the value of the
> instruction length attribute when the instruction is predicated. Ok
> to apply?

Assuming you tested it, this patch is OK.
Suitable regression test for vectorizer patches?
I want to submit some vectorizer patches; what would be a suitable
regression test? Preferably some native or cross test that can run on
an i7 x86_64 GNU/Linux machine.

To give an idea what code I'm patching, here are the patches I have so
far:

* tree-vect-patterns.c (vect_recog_dot_prod_pattern): Recognize
  unsigned dot product pattern. Allow widening multiply-add to be
  used for DOT_PROD_EXPR reductions.

* tree-vect-data-refs.c (vect_get_smallest_scalar_type): Treat
  WIDEN_MULT_PLUS_EXPR like WIDEN_SUM_EXPR.
* tree-vect-loop.c (get_initial_def_for_reduction): Likewise.
  Get VECTYPE from STMT_VINFO_VECTYPE.
  (vect_determine_vectorization_factor): Allow vector size
  input/output mismatch for reduction.
  (vect_analyze_scalar_cycles_1): When we find a phi for a reduction,
  put the reduction statement into the phi's STMT_VINFO_RELATED_STMT.

* tree-vect-patterns.c (vect_pattern_recog_1): If DOT_PROD_EXPR
  can't be expanded directly, try to use WIDEN_MULT_PLUS_EXPR
  instead.

Fix bug where a vectorizer reduction split (from
TARGET_VECTORIZE_SPLIT_REDUCTION) would end up not being used.
* tree-vect-loop.c (vect_create_epilog_for_reduction): If we split
  the reduction, use the result in Case 3 too.
Re: Suitable regression test for vectorizer patches? - (need {u,}madd* pattern)
On 30/10/18 08:36, Richard Biener wrote:
> On Mon, Oct 29, 2018 at 7:03 PM Joern Wolfgang Rennecke wrote:
> > I want to submit some vectorizer patches, what would be a suitable
> > regression test?
>
> I am sure you have testcases, no? For new features please make them
> dg-do run ones by checking correctness.

For the dot product / widen_sum -> madd transformations to trigger, I
need an in-tree port with a named pattern matched by smadd_widen_optab
or umadd_widen_optab, with an input matching
PREFERRED_SIMD_VECTOR_MODE, and hence an output twice that size (and
that pattern must not be eclipsed by existing [us]sum_widen_optab and
[us]dot_prod_optab matches). I can't find any such port in the tree -
indeed, not any {u,}madd4 pattern at all.

I've heard that ARM Cortex-M4 hardware actually supports a madd vector
operation (V2HI -> V2SI); is that true? Would the test be suitable if
it made the arm target, with a patch added to add a suitable madd
pattern, and my vectorizer patch added, use that madd pattern?

Or could I add an imaginary madd vector extension instruction to the
arc for that purpose? But then it wouldn't actually execute, as it's
just a made-up instruction; nor would the vectorization test be
included in a test run for an actual.
Garbage collection bugs
We've been running builds/regression tests for GCC 8.2 configured with --enable-checking=all, and have observed some failures related to garbage collection.

First problem: the g++.dg/pr85039-2.C tests (I've looked in detail at -std=c++98, but -std=c++11 and -std=c++14 appear to follow the same pattern) see gcc garbage-collecting a live vector. A subsequent access to the vector with vec_quick_push causes a segmentation fault, as m_vecpfx.m_num is 0xa5a5a5a5. The vec data is also being freed / poisoned.

The vector in question is an auto variable of cp_parser_parenthesized_expression_list, which is declared as:

vec<tree, va_gc> *expression_list;

According to doc/gty.texi: "you should reference all your data from static or external @code{GTY}-ed variables, and it is advised to call @code{ggc_collect} with a shallow call stack." In this case, cgraph_node::finalize_function calls the garbage collector, as we are finishing a member function of a struct. gdb shows a backtrace of 34 frames, which is not really much as far as C++ parsing goes. The caller of finalize_function is expand_or_defer_fn, which uses the expression "function_depth > 1" to compute the no_collect parameter to finalize_function. cp_parser_parenthesized_expression_list is in frame 21 of the backtrace at this point.

So, if we consider this shallow, cp_parser_parenthesized_expression_list either has to refrain from using a vector with garbage-collected allocation, or it has to make the pointer reachable from a GC root - at least if function_depth <= 1. Is the attached patch the right approach?

When looking at regression test results for gcc version 9.0.0 20181028 (experimental), the excess errors test for g++.dg/pr85039-2.C seems to pass, yet I can see no definite reason in the source code why that is so. I tried running the test by hand in order to check if maybe the patch for PR c++/84455 plays a role, but running the test by hand, it crashes again, and gdb shows the telltale a5 pattern in a pointer register.
#0 vec::quick_push (obj=, this=0x705ece60) at /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/vec.h:974
#1 vec_safe_push (obj=, v=@0x7fffd038: 0x705ece60) at /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/vec.h:766
#2 cp_parser_parenthesized_expression_list (parser=parser@entry=0x77ff83f0, is_attribute_list=is_attribute_list@entry=0, cast_p=cast_p@entry=false, allow_expansion_p=allow_expansion_p@entry=true, non_constant_p=non_constant_p@entry=0x7fffd103, close_paren_loc=close_paren_loc@entry=0x0, wrap_locations_p=false) at /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/cp/parser.c:7803
#3 0x006e910d in cp_parser_initializer (parser=parser@entry=0x77ff83f0, is_direct_init=is_direct_init@entry=0x7fffd102, non_constant_p=non_constant_p@entry=0x7fffd103, subexpression_p=subexpression_p@entry=false) at /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/cp/parser.c:22009
#4 0x0070954e in cp_parser_init_declarator (parser=parser@entry=0x77ff83f0, decl_specifiers=decl_specifiers@entry=0x7fffd1c0, checks=checks@entry=0x0, function_definition_allowed_p=function_definition_allowed_p@entry=true, member_p=member_p@entry=false, declares_class_or_enum=, function_definition_p=0x7fffd250, maybe_range_for_decl=0x0, init_loc=0x7fffd1ac, auto_result=0x7fffd2e0) at /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/cp/parser.c:19827
#5 0x00711c5d in cp_parser_simple_declaration (parser=0x77ff83f0, function_definition_allowed_p=, maybe_range_for_decl=0x0) at /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/cp/parser.c:13179
#6 0x00717bb5 in cp_parser_declaration (parser=0x77ff83f0) at /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/cp/parser.c:12876
#7 0x0071837d in cp_parser_translation_unit (parser=0x77ff83f0) at /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/cp/parser.c:4631
#8 c_parse_file () at
/data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/cp/parser.c:39108
#9 0x00868db1 in c_common_parse_file () at /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/c-family/c-opts.c:1150
#10 0x00e0aaaf in compile_file () at /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/toplev.c:455
#11 0x0059248a in do_compile () at /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/toplev.c:2172
#12 toplev::main (this=this@entry=0x7fffd54e, argc=, argc@entry=100, argv=, argv@entry=0x7fffd648) at /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/toplev.c:2307
#13 0x00594b5b in main (argc=100, argv=0x7fffd648) at /data/hudson/jobs/gcc-9.0.0-linu
Build report gcc 4.6.1 on Sparc Solaris 10
...function it appears in
../../../mpc/src/get.c: In function 'mpc_get_ldc':
../../../mpc/src/get.c:39:11: error: 'I' undeclared (first use in this function)

I fixed this with a modification in /usr/include/complex.h (yes, this needs root permissions):

#if !defined(__GNUC__) /* wke mod for mpc 0.9 */
#undef I
#define I _Imaginary_I
#else /* native cc */
#undef I
#define I (__extension__ 1.0iF)
#endif /* end __GNUC__ */

But I do not know if this fix may break stuff in some places. Another restart of make yields a full build, and I am able to install the compiler and use it; it seems to generate proper results (at least for C). In case you have further questions, do not hesitate to contact me via email.

Best regards
--
Wolfgang Kechel mailto:wolfgang.kec...@prs.de
tsvc test iteration count during check-gcc
The tsvc tests take just too long on simulators, particularly if there is little or no vectorization of the test because of compiler limitations, target limitations, or the chosen options. Having 151 tests time out at a quarter of an hour is not fun, and making the timeout go away by upping it might make for better-looking results, but not for better turn-around times.

So I thought to just change the iteration count (which, as currently defined in tsvc.h, results in billions of operations for a single test) to something small, like 10. This requires new expected results, but these were pretty straightforward to auto-generate. The lack of a separate number for s3111 caused me some puzzlement, but it can indeed share a value with s3.

But then, if I want to specifically change the iteration count for simulators, I have to change 151 individual test files to add another dg-additional-options stanza. I can leave the job to grep / bash / ed, but then I get 151 locally changed files, which is a pain to merge. So I wonder if tsvc.h shouldn't really default to a low iteration count.

Is there actually any reason to run the regression tests with a huge iteration count on any host? I mean, if you wanted some regression check on performance, you'd really want something more exact than "wall-clock time doesn't exceed whatever timeout is set". You could set a ulimit for cpu time and fine-tune that for a proper benchmark regression test - but for the purposes of an ordinary gcc regression test, you generally just want the optimizations performed (as in the dump-file tests already present) and the computation performed correctly. And for these, it makes little difference how many iterations you use for the test, as long as you convince GCC that the code is 'hot'.
c-c++-common/analyzer/out-of-bounds-diagram-11.c written type vs toolchain
While on x86_64-pc-linux-gnu the second diagram shows the type written as 'int', as expected, on 16- and 32-bit newlib-based toolchains it is being output as int32_t. And all the formatting is also a bit different, probably due to the change in how the int32_t is displayed.

What do other people see on toolchains where the regression tests actually have I/O functionality? Would it make sense to handle this with one multi-line pattern for newlib-based toolchains, ending with

{ dg-end-multiline-output "" { target *-*-elf } } */

and one for glibc-based toolchains, ending in

{ dg-end-multiline-output "" { target !*-*-elf } } */

? I have no idea what toolchains with different libraries (and hence header files) would see.
Re: c-c++-common/analyzer/out-of-bounds-diagram-11.c written type vs toolchain
On 22/07/2024 16:44, David Malcolm wrote:
> Does it help to hack this change into prune.exp:
>
> diff --git a/gcc/testsuite/lib/prune.exp b/gcc/testsuite/lib/prune.exp
> index d00d37f015f7..f467d1a97bc6 100644
> --- a/gcc/testsuite/lib/prune.exp
> +++ b/gcc/testsuite/lib/prune.exp
> @@ -109,7 +109,7 @@ proc prune_gcc_output { text } {
>      # Many tests that use visibility will still pass on platforms that don't support it.
>      regsub -all "(^|\n)\[^\n\]*lto1: warning: visibility attribute not supported in this configuration; ignored\[^\n\]*" $text "" text
> -#send_user "After:$text\n"
> +send_user "After:$text\n"
>      return $text
> }

I'm baffled. Isn't that statement there just to debug prune_gcc_output?

I suppose we could prune the whitespace from the diagram, but prune_gcc_output does not know about types. If there's 'int', that could be int32_t, int16_t, int64_t, ptrdiff_t, or whatever - unless you want to make all integer types be considered equivalent for dejagnu purposes if they appear somewhere between vertical bars.

>> Would it make sense to handle this with one multi-line pattern for newlib based toolchains, ending with { dg-end-multiline-output "" { target *-*-elf } } */ and one for glibc based toolchain, ending in { dg-end-multiline-output "" { target !*-*-elf } } */ ?
> Presumably the only difference is in the top-right hand box of the diagram,

Unfortunately, there's also a lot of whitespace change in the rest of the diagram. I have attached the patch I'm currently using for your perusal.

> whereas my objective for those tests was more about the lower part of the diagram - I wanted to verify how we handle symbolic buffer sizes (e.g. (size * 4) + 3, and other run-time-computed sizes). It's rather awkward to test the diagrams with DejaGnu, alas. Would it make sense to split out that file into three separate tests -11a, -11b, and -11c, and be more aggressive about only running the 2nd test on targets that we know generate "int" in the top-right box?
No, each dg-end-multiline-output stanza already can have its separate target selector; there is no point in putting them in separate files. I guess you could reduce the differences between platforms if you didn't use types as defined by header files directly, as they might be #defines or typedefs or whatever, and instead used your own typedef or struct types.

Index: c-c++-common/analyzer/out-of-bounds-diagram-8.c
===
--- c-c++-common/analyzer/out-of-bounds-diagram-8.c (revision 6640)
+++ c-c++-common/analyzer/out-of-bounds-diagram-8.c (revision 6642)
@@ -17,6 +17,24 @@
 /* { dg-begin-multiline-output "" }
+ ┌───┐
+ │write of '(int32_t) 42'│
+ └───┘
+ │
+ │
+ v
+ ┌───┐ ┌───┐
+ │buffer allocated on heap at (1)│ │ after valid range │
+ └───┘ └───┘
+ ├───┬───┤├─┬──┤├───┬───┤
+ │ │ │
+╭─┴╮ ╭───┴───╮ ╭─┴─╮
+│capacity: 'size * 4' bytes│ │4 bytes│ │overflow of 4 bytes│
+╰──╯ ╰───╯ ╰───╯
+
+ { dg-end-multiline-output "" { target *-*-elf } } */
+/* { dg-begin-multiline-output "" }
+ ┌───┐
 │write of '(int) 42'│
 └───┘
@@ -32,4 +50,4 @@
 │capacity: 'size * 4' bytes│ │4 bytes│ │overflow of 4 bytes│
 ╰──╯ ╰───╯ ╰───╯
- { dg-end-multiline-output "" } */
+ { dg-end-multiline-output "" { target !*-*-elf } } */

Index: c-c++-common/analyzer/out-of-bounds-diagram-11.c
===
--- c-c++-common/analyzer/out-of-bounds-diagram-11.c (revision 6640)
+++ c-c++-common/analyzer/out-of-bounds-diagram-11.c (revision 6642)
@@ -45,8 +45,30 @@
 buf[size] = 42; /* { dg-warning "stack-based buffer overflow" } */
 }
+/* With a newlib toolchain (at least for esirisc), we end up with int32_t
+ being shown as itself. */
 /* { dg-begin-multiline-output "" }
+┌┐
+│write of '(int32_t) 42' │
+
Re: c-c++-common/analyzer/out-of-bounds-diagram-11.c written type vs toolchain
On 22/07/2024 17:13, Joern Wolfgang Rennecke wrote:
> I guess you could reduce the differences between platforms if you didn't use types as defined by header files directly, as they might be #defines or typedefs or whatever, and instead used your own typedef or struct types.

It seems a typedef to int is seen through, even if you chain two of them together. After preprocessing, newlib has:

typedef long int __int32_t;
typedef __int32_t int32_t ;

So the crucial point seems to be to have 'long int', but that is of course not portable for int32_t. So to get portable code and consistent messages, I suppose we should use a struct:

typedef struct { int32_t i; } my_int32;
my_int32 s42 = { 42 };
my_int32 *buf = (my_int32 *) __builtin_alloca (4 * size + 3); /* { dg-warning "allocated buffer size is not a multiple of the pointee's size" } */
buf[size] = s42; /* { dg-warning "stack-based buffer overflow" } */

Now suddenly the diagram is made *more* verbose, with the struct keyword added:

┌─┐
│write of ‘struct my_int32’ (4 bytes) │
└─┘
 │ │
 │ │
 v v
┌───┐ ┌┐
│ buffer allocated on stack at (1)│ │ after valid range│
└───┘ └┘
├───┬───┤ ├───┬┤
 │ │
╭┴───╮ ╭─┴╮
│capacity: ‘(size * 4) + 3’ bytes│ │overflow of 1 byte│
╰╯ ╰──╯
RFD: switch/case statement dispatch using hash
This has come up several times over the years:
https://gcc.gnu.org/legacy-ml/gcc/2006-07/msg00158.html
https://gcc.gnu.org/legacy-ml/gcc/2006-07/msg00155.html
https://gcc.gnu.org/pipermail/gcc/2010-March/190234.html
but maybe now (or maybe a while ago) is the right time to do this, considering the changes in relative costs of basic operations. Multiply and barrel shift are cheap on many modern microarchitectures; control flow and non-linear memory access are expensive. FWIW, [Dietz92] https://docs.lib.purdue.edu/cgi/viewcontent.cgi?article=1312&context=ecetr mentions multiply in passing as impractical for SPARC because of cost, but modern CPUs can often do a multiply in a single cycle.

Approximating division and scaling, for a case value x, we can calculate an index or offset into a table as

  f(x) = x*C1 >> C2 & M

For an index, M masks off the upper bits so that the index fits into a table whose number of elements is a power of two. For architectures where a non-scaled index is cheaper to use than a scaled one, we compute an offset by having M also mask off the lower bits.

Each table entry contains a jump address (or offset) and a key - at least if both are the same size; for different sizes, it might be cheaper to have two tables. If we have found values for C1 and C2 that give a perfect hash, we can immediately dispatch to the default case for a non-match; otherwise, we can have decision trees at the jump destinations, each using the comparison with the key from the table for the first decision. No separate range check is necessary, so if the multiply is fast enough, this should be close in performance to an ordinary tablejump.

This dispatch method can be used for tables that are too sparse for a tablejump, but have enough cases to justify the overhead (depending on multiple-conditional-branch vs. single-indirect-branch costs, the latter might be a low bar).
I suppose we could make tree-switch-conversion.cc use rtx costs to compare the hash implementation to a decision tree, or have a hook make the decision - and the default for the hook might use rtx costs...
Re: RFD: switch/case statement dispatch using hash
On 23/06/2025 12:31, Florian Weimer wrote:
> Also carry-less multiply presumably. It's challenging to use those instructions for compiling switch statements because they would then be used all over the place.

Not necessarily; you can hide them in an UNSPEC if you are worried that exposing the exact semantics leads to inappropriate uses.
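In a machine description, that could look something like the following sketch (the insn name, UNSPEC tag, and assembly mnemonic are all invented; a real port would pick its own). Because the operation is wrapped in an UNSPEC, the middle end cannot simplify or reuse it as an ordinary multiply:

```
;; Sketch: hide the carry-less multiply used for switch hashing in an
;; UNSPEC so it is only emitted where switch expansion asks for it.
(define_insn "switch_hash_dispatch"
  [(set (match_operand:SI 0 "register_operand" "=r")
        (unspec:SI [(match_operand:SI 1 "register_operand" "r")
                    (match_operand:SI 2 "register_operand" "r")]
                   UNSPEC_SWITCH_HASH))]
  ""
  "clmul\t%0,%1,%2")
```

The switch expansion code would then generate this insn directly, rather than a MULT rtx that combine could redirect elsewhere.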
Re: scan-*-dump-times across multiple functions considered harmful
On 02/07/2025 18:59, David Malcolm wrote:
> ... Brainstorming some ideas on other possible approaches on making our tests less brittle; for context I did some investigation back in 2018 about implementing "optimization remarks" like clang does: diagnostics about optimization decisions, so you could have a dg directive like this on a particular line:
> foo (); /* { dg-remark "inlined call to 'foo' into 'bar'" } */

I like the idea. However, it seems unlikely that we can make a clean switchover in this decade, unless you find one or more corporate sponsors.

We probably always want dump files without a rigid structure, because that makes it easier to add debug output when you flesh out a new pass or a change to an existing one. We can make the calls that generate the json output also emit output in the dump file, so we won't carry a doubled maintenance burden; however, this means the current ad-hoc messages would become more unified, and thus the testsuite will have to be adjusted. FWIW, even if you were to get rid of the current dump files (which I think would be stifling for GCC development, for the above reasons), you would have to adjust the testsuite.

So, we could use the json framework for new dump output that is contributed before or along with the parts of the testsuite that scan for it, but any legacy dump output that is scanned for in the testsuite requires adjusting the testsuite. There are more than 26K dejagnu scan-*-dump* directives in the gcc15 testsuite. And you'll have a big flag day, or a ton of small ones, plus all the friction that this will create with porting patches up and down gcc versions. That is a lot of thankless work, which I can't imagine doing as a hobby.
And considering people at the start of their career who might think of doing some unpaid drudge work in hope of getting recognition that'll get them some paid work: with paying work for GCC drying up, they would more likely do something for LLVM, which also seems to better align with the skills of recent graduates.

So, unless/until you have (a) corporate sponsor(s) to pay for the work on the existing testsuite - and that work is successfully concluded - we will have to find a way to make the scans of the dump files more maintainable. In fact, if we can solve the maintenance hassle of having multiple scan patterns in a test by making those patterns more specific, so we don't have to split the tests up, that will put us in a better position if/when the transition to a more organized optimization-records system is made.
scan-*-dump-times across multiple functions considered harmful
Quite often I see a test quickly written to test some new feature (bug fix, extension or optimization) that has a couple of functions to cover various aspects of the feature, checked all together with a single scan-tree-dump-times, scan-rtl-dump-times etc. check, using the expected value for the target of the test writer. Or worse, it's all packed into one giant function, with unpredictable interactions between the different pieces of code. I think we have fewer of those recently, but please don't interpret this post as a suggestion to fall back to this practice.

Quite often it turns out that the feature applies only to some of the functions / sites on some targets. The first reaction is often to create multiple copies of the scan-*-dump-times stanza, with mutually exclusive conditions for each copy, which might look harmless when there are only two cases, but as more are added, it quickly turns into an unmaintainable mess of lots of dejagnu directives with complicated conditions. This can get even worse if different targets can get the compiler to emit the pattern multiple times for the same piece of source, as for vectorization that is tried with different vectorization factors.

I think we should discuss what is best practice to address these problems efficiently, and to preferably write new tests avoiding them in the first place.

When each function has a single site per feature, where success is given if the pattern appears at least once, a straightforward solution that has already been used a number of times is to split the test into multiple smaller tests. The main disadvantages of this approach are that a large set of small files can clutter the directory where they appear, making it less maintainable, and that the compiler is invoked more often, generally with the same set of include files read each time, thus making the test runs slower.
Another approach would be to use source line numbers, where present and distinctive, and add them to the scan pattern to make it specific to the site under concern. That should, for instance, work for vectorization scan-tree-dump-times tests. The disadvantage of that approach is that the tests become more brittle, as the line numbers have to be adjusted whenever the line numbers of the source site change, like when new include files, dejagnu directives at the file start, or typedefs are needed.

Maybe we could get the best of both worlds if we add a new dump option? Say, if we make that option add the (for polymorphic languages like C++: mangled) name of the current function to each dumped line that is interesting to scan for. Or just to every line, if that's simpler.
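With such an option, scans could be tagged per function rather than per line; as a sketch (the pass name, function names, counts, and message text here are all invented for illustration):

```
/* { dg-final { scan-tree-dump-times "foo:.*loop vectorized" 1 "vect" } } */
/* { dg-final { scan-tree-dump-times "bar:.*loop vectorized" 2 "vect" } } */
```

Unlike line numbers, the function-name tags survive edits to includes, directives, and declarations earlier in the file.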
P.S. to: scan-*-dump-times across multiple functions considered harmful
P.S.: to get the specificity of line numbers without the brittleness, we could have a pragma for the extra dump-line tag instead - effective either until the next such pragma / EOF, or (if that comes first) until the end of the function.

Disadvantages: It is not actually more specific when the source describes a template, or gets cloned - unless we add the mangled function name too, either unconditionally, per option, or per format (complex, bug-prone). Also, the tag would have to be associated with or included in the source location; that might get complex and be a source of bugs in itself.