Re: Might a -native-semantics switch, forcing native target optimization semantics, be reasonable?

2006-01-01 Thread Gabriel Dos Reis
Robert Dewar <[EMAIL PROTECTED]> writes:

| Gabriel Dos Reis wrote:
| 
| >Maybe that is the case for Ada; for the C or C++ standards, you'll
| > have to define "good reason". -- Gaby
| >
| Again, I suggest that vague high level discussion is a waste of time
| here, 

I wholeheartedly agree; that is why I find you need to back up your
universal claim.

-- Gaby



Re: Might a -native-semantics switch, forcing native target optimization semantics, be reasonable?

2006-01-01 Thread Robert Dewar

Gabriel Dos Reis wrote:

> Robert Dewar <[EMAIL PROTECTED]> writes:
>
> | Gabriel Dos Reis wrote:
> |
> | >Maybe that is the case for Ada; for the C or C++ standards, you'll
> | > have to define "good reason". -- Gaby
> | >
> | Again, I suggest that vague high level discussion is a waste of time
> | here,
>
> I wholeheartedly agree; that is why I find you need to back up your
> universal claim.

You mean my claim that the standards group knows what it is doing and is
not stupid? :-)
Well, I could give many examples, but I still think it is more useful to
get back to the original intent of Paul's thread, which is to describe
specific semantics for some of these cases. Otherwise we are really
discussing entirely irrelevant matters.


-- Gaby
 







Re: Might a -native-semantics switch, forcing native target optimization semantics, be reasonable?

2006-01-01 Thread Gabriel Dos Reis
Robert Dewar <[EMAIL PROTECTED]> writes:

| Gabriel Dos Reis wrote:
| 
| >Robert Dewar <[EMAIL PROTECTED]> writes:
| >
| >| Gabriel Dos Reis wrote:
| >|
| >| >Maybe that is the case for Ada; for the C or C++ standards, you'll
| >| > have to define "good reason". -- Gaby
| >| >
| >| Again, I suggest that vague high level discussion is a waste of time
| >| here,
| >
| >I wholeheartedly agree; that is why I find you need to back up your
| >universal claim.
| >
| You mean my claim that the standards group knows what it is doing and
| is not stupid? :-)

Your claim was this:

  # In every case where the standard specifies undefined behavior, it
  # has a very good reason for doing so.

| Well I could give many examples,

unless "many" == "exhaustive", you'll just have wasted time and
bandwidth :-)

| but I still think it is more useful to get back to the original intent
| of Paul's thread, 

but that is exactly the point!  We can't usefully discuss his
intent with unsupported claims.

I predict we will end up changing nothing for GCC; I also predict that
he'll be unsatisfied with unsupported claims; then, there is a high
chance the discussion will be restarted again in a different form
(hey, this is not the first time you, Paul and me are involved in this
kind of undefined discussion).  If we have good answers with backed up
claims, we could just reference it later. 

| which is to describe specific semantics for some of
| these cases. Otherwise we
| are really discussing entirely irrelevant matters.

I think supported answers will help diminish undefined discussion frequency.
If you don't have a definition for "good reason", that is fine.  Let's move
on to something else; this is a new year.

-- Gaby



Re: Might a -native-semantics switch, forcing native target optimization semantics, be reasonable?

2006-01-01 Thread Robert Dewar

Gabriel Dos Reis wrote:

> Robert Dewar <[EMAIL PROTECTED]> writes:
>
> | Gabriel Dos Reis wrote:
> |
> | >Robert Dewar <[EMAIL PROTECTED]> writes:
> | >
> | >| Gabriel Dos Reis wrote:
> | >|
> | >| >Maybe that is the case for Ada; for the C or C++ standards, you'll
> | >| > have to define "good reason". -- Gaby
> | >| >
> | >| Again, I suggest that vague high level discussion is a waste of time
> | >| here,
> | >
> | >I wholeheartedly agree; that is why I find you need to back up your
> | >universal claim.
> | >
> | You mean my claim that the standards group knows what it is doing and
> | is not stupid? :-)
>
> Your claim was this:
>
>  # In every case where the standard specifies undefined behavior, it
>  # has a very good reason for doing so.
>
> | Well I could give many examples,
>
> unless "many" == "exhaustive", you'll just have wasted time and
> bandwidth :-)

Right, which is why I decline :-)

> | but I still think it is more useful to get back to the original intent
> | of Paul's thread,
>
> but that is exactly the point!  We can't usefully discuss his
> intent with unsupported claims.

I agree, we can only discuss his intent if he gives specific examples.

> I predict we will end up changing nothing for GCC; I also predict that
> he'll be unsatisfied with unsupported claims; then, there is a high
> chance the discussion will be restarted again in a different form
> (hey, this is not the first time you, Paul and me are involved in this
> kind of undefined discussion).  If we have good answers with backed up
> claims, we could just reference it later.

I agree with your prediction.

> | which is to describe specific semantics for some of
> | these cases. Otherwise we
> | are really discussing entirely irrelevant matters.
>
> I think supported answers will help diminish undefined discussion frequency.
> If you don't have a definition for "good reason", that is fine.  Let's move
> on to something else; this is a new year.


My definition of good reason does not differ from the dictionary
definition, nothing special. My point is that, at least for the cases I
am aware of, the decision to make something undefined in the standard
makes sense, and it is not "ab initio" obvious what it would mean to say
that the semantics should be those of the native target. That does not
mean it is inappropriate to define them further in specific cases.

As you know, my general view is that GCC is a little too ready to take
advantage of undefined behavior in its optimization approach, so I am not
unsympathetic to discussing specific cases in which this should at least
optionally be constrained. We have already discussed the wrap situation
in detail, though Paul's mention of saturating arithmetic seems way out
of left field to me, since very few common architectures implement
saturating arithmetic at the hardware level.
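
To make the two candidate semantics for signed overflow concrete, here is
a minimal sketch of a wrapping add and a saturating add written so that
neither ever executes an overflowing signed operation. The names add_wrap
and add_sat are made up for illustration; nothing here is taken from GCC or
from any target definition. The wrapping version assumes the
implementation-defined unsigned-to-int conversion reduces modulo 2^N, which
is what GCC documents.

```c
#include <limits.h>

/* Hypothetical helper: two's-complement wrapping add.  The arithmetic is
   done in unsigned, where overflow is defined to wrap.  Converting an
   out-of-range result back to int is implementation-defined (assumed
   here to reduce modulo 2^N). */
int add_wrap(int a, int b)
{
    return (int)((unsigned)a + (unsigned)b);
}

/* Hypothetical helper: saturating add.  The range checks run before the
   addition, so the final a + b can never overflow. */
int add_sat(int a, int b)
{
    if (b > 0 && a > INT_MAX - b)
        return INT_MAX;   /* clamp at the top */
    if (b < 0 && a < INT_MIN - b)
        return INT_MIN;   /* clamp at the bottom */
    return a + b;
}
```

For the same inputs, say INT_MAX and 1, the two functions give different
answers, which is the sense in which "the semantics of the native target"
has to be spelled out per operation before an optimizer can preserve it.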


-- Gaby
 







Re: Might a -native-semantics switch, forcing native target optimization semantics, be reasonable?

2006-01-01 Thread Michael Veksler
I am not sure if the original poster meant the same as I do. What I have
in mind is optimization opportunities like the following one for a
linked list:

  struct list_element
  {
    struct list_element *p_next;
    int *p_value;
  };
  int sum_list_values(const struct list_element *p_list)
  {
    int sum = 0;
    for ( ; p_list ; p_list = p_list->p_next)
      if (p_list->p_value)
        sum += *p_list->p_value;
    return sum;
  }

Now, if the programmer wants to gain 5% performance (or more), he
might mmap /dev/zero to address 0x00 such that
   *(int *)0 == 0

Then, the compiler or the coder could unroll the loop
to minimize conditional branches:

  int sum_list_values(const struct list_element *p_list)
  {
    int sum = 0;
    while (p_list)
    {
      sum += *p_list->p_value; // undefined if p_value == NULL
      p_list = p_list->p_next;
      sum += *p_list->p_value; // and undefined if p_list == NULL
      p_list = p_list->p_next;
    }
    return sum;
  }

It could be a big gain on architectures that can't do effective
predication of the load from p_list->p_value. 

I once heard that xlC did something like that on AIX, automatically.
Does GCC take advantage of, e.g., AIX's mapping of address 0x
to zeros? Can GCC be easily taught to do so?
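
For comparison, the inner branch can also be removed without relying on
anything mapped at address zero. This is only a sketch of an alternative,
not what xlC or GCC does: it assumes the code building the list can point
"no value" elements at a shared zero object (the hypothetical zero_value
below) instead of storing NULL, and it changes p_value to a pointer to
const for that reason.

```c
#include <stddef.h>

struct list_element {
    struct list_element *p_next;
    const int *p_value;   /* assumption: never NULL in this variant */
};

/* Hypothetical sentinel: elements without a value point here. */
static const int zero_value = 0;

int sum_list_values(const struct list_element *p_list)
{
    int sum = 0;
    for (; p_list != NULL; p_list = p_list->p_next)
        sum += *p_list->p_value;   /* no per-element NULL test */
    return sum;
}
```

This stays within defined behavior, at the cost of changing the data
structure's invariant rather than the compiler's assumptions.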


-- 
Michael


Quoting Mike Stump <[EMAIL PROTECTED]>:

> On Dec 31, 2005, at 10:51 AM, Paul Schlie wrote:
> > As although C/C++ define some expressions as having undefined  
> > semantics;
> 
> I'd rather it be called --do-what-i-mean.  :-)
> 
> Could you give us a hint at what all the semantics you would want to  
> change with this option?  Are there any code bases that you're trying  
> to compile?  Compilers that you're trying to be compatible with?
> 



RTL alias analysis

2006-01-01 Thread Steven Bosscher
Hi rth,

The stack space sharing you added to cfgexpand.c breaks RTL alias
analysis.

For example, the attached test case breaks for pentiumpro at -O2.
The problem apparently is that the second store to c is moved up
before the load.
This looks like a serious problem to me...

Many thanks to Honza for crafting this test case.

Gr.
Steven



extern void abort (void) __attribute__((noreturn));
extern int printf (const char *, ...);

union setconflict
{
  short a[20];
  int b[10];
};

int
main ()
{
  int sum = 0;
  {
    union setconflict a;
    short *c;
    c = a.a;
    asm ("": "=r" (c):"0" (c));
    *c = 0;
    asm ("": "=r" (c):"0" (c));
    sum += *c;
  }
  {
    union setconflict a;
    int *c;
    c = a.b;
    asm ("": "=r" (c):"0" (c));
    *c = 1;
    asm ("": "=r" (c):"0" (c));
    sum += *c;
  }

  printf ("%d\n",sum);
  if (sum != 1)
    abort();
  return 0;
}



.file   "t.c"
# GNU C version 4.2.0 20060101 (experimental) (x86_64-unknown-linux-gnu)
#   compiled by GNU C version 4.0.2 20050901 (prerelease) (SUSE Linux).
# GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
# options passed:  -iprefix -isystem -m32 -march=pentiumpro -auxbase -O2
# -fdump-tree-vars -fomit-frame-pointer -fverbose-asm
# options enabled:  -falign-loops -fargument-alias -fbranch-count-reg
# -fcaller-saves -fcommon -fcprop-registers -fcrossjumping
# -fcse-follow-jumps -fcse-skip-blocks -fdefer-pop
# -fdelete-null-pointer-checks -fearly-inlining
# -feliminate-unused-debug-types -fexpensive-optimizations -ffunction-cse
# -fgcse -fgcse-lm -fguess-branch-probability -fident -fif-conversion
# -fif-conversion2 -finline-functions-called-once -fipa-pure-const
# -fipa-reference -fipa-type-escape -fivopts -fkeep-static-consts
# -fleading-underscore -floop-optimize -floop-optimize2 -fmath-errno
# -fmerge-constants -fomit-frame-pointer -foptimize-register-move
# -foptimize-sibling-calls -fpcc-struct-return -fpeephole -fpeephole2
# -fregmove -freorder-blocks -freorder-functions -frerun-cse-after-loop
# -frerun-loop-opt -fsched-interblock -fsched-spec
# -fsched-stalled-insns-dep -fschedule-insns2 -fshow-column
# -fsplit-ivs-in-unroller -fstrength-reduce -fstrict-aliasing
# -fthread-jumps -ftrapping-math -ftree-ccp -ftree-ch -ftree-copy-prop
# -ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse -ftree-fre
# -ftree-loop-im -ftree-loop-ivcanon -ftree-loop-optimize -ftree-lrs
# -ftree-pre -ftree-salias -ftree-sink -ftree-sra -ftree-store-ccp
# -ftree-store-copy-prop -ftree-ter -ftree-vect-loop-version -ftree-vrp
# -funit-at-a-time -fverbose-asm -fzero-initialized-in-bss -m32 -m80387
# -m96bit-long-double -maccumulate-outgoing-args -malign-stringops
# -mfancy-math-387 -mfp-ret-in-387 -mieee-fp -mno-red-zone -mpush-args
# -mtls-direct-seg-refs

# Compiler executable checksum: 6d90f1c30ff8027bc6976ab2dbfe2320

        .section        .rodata.str1.1,"aMS",@progbits,1
.LC0:
        .string "%d\n"
        .text
        .p2align 4,,15
.globl main
        .type   main, @function
main:
        leal    4(%esp), %ecx   #,
        andl    $-16, %esp      #,
        pushl   -4(%ecx)        #
        subl    $76, %esp       #,
        leal    28(%esp), %eax  #, tmp64
        movl    %ecx, 68(%esp)  #,
        movl    %eax, %edx      # tmp64, c
        movl    %ebx, 72(%esp)  #,
        movw    $0, (%edx)      #,* c
        movl    $1, (%eax)      #,* c
        movswl  (%edx),%ebx     #* c, sum
        movl    (%eax), %edx    #* c,
        movl    $.LC0, (%esp)   #,
        addl    %edx, %ebx      #, sum
        movl    %ebx, 4(%esp)   # sum,
        call    printf  #
        decl    %ebx    # sum
        jne     .L6     #,
        movl    68(%esp), %ecx  #,
        xorl    %eax, %eax      #
        movl    72(%esp), %ebx  #,
        addl    $76, %esp       #,
        leal    -4(%ecx), %esp  #,
        ret
.L6:
        call    abort   #
        .size   main, .-main
        .ident  "GCC: (GNU) 4.2.0 20060101 (experimental)"
        .section        .note.GNU-stack,"",@progbits




-fpic no optimization...

2006-01-01 Thread Frediano Ziglio

Happy 2006!

I was compiling the LZMA SDK (http://www.7-zip.org/, LzmaDecode.c) and,
just out of curiosity, I looked at the assembler output. I noticed that
when PIC is enabled (-fpic, Linux Intel) ebx is reserved for the global
pointer. However, LzmaDecode does not access any global data and does not
call other functions (no relocations at all), so why not use the ebx
register? -fpic makes the compiler simply not use ebx. I tried different
versions (gcc 3.4.4 from Fedora Core 3 and 4.0.2 from Fedora Core 4) with
the same result.

Frediano Ziglio (aka freddy77)





Re: Might a -native-semantics switch, forcing native target optimization semantics, be reasonable?

2006-01-01 Thread Paul Schlie
> Gabriel Dos Reis wrote:
> I predict we will end up changing nothing for GCC; I also predict that
> he'll be unsatisfied with unsupported claims; then, there is a high
> chance the discussion will be restarted again in a different form
> (hey, this is not the first time you, Paul and me are involved in this
> kind of undefined discussion).  If we have good answers with backed up
> claims, we could just reference it later.
> ...
> I think supported answers will help diminish undefined discussion frequency.
> If you don't have a definition for "good reason", that is fine.  Let's move
> on to something else; this is a new year.

Gentlemen, I honestly don't want to promote a frivolous debate either. If
it is generally agreed that there is no valid conceivable reason to let a
target specify the semantics of its particular implementation, so that
optimizers can use that information to preserve the target's otherwise
non-optimized behavior during optimization, then I simply accept that I'm
alone in my belief in its significance and utility. If not, then it seems
that the only remaining question is how this may be done relatively easily
within the current optimization and target definition framework.

If specific examples are required, although likely considered obvious,
here are a few:

- x[y] = 0;           // may also be undefined if y is one past the
  if (x[y]) y = y+1;  // extent of x[], and/or result in an overflow
                      // of x[y] one past its upper bound if
                      // allocated at the upper range of the address
                      // space, and/or if allocated at the base
                      // address of 0 and y is 0 or negative; but
                      // nonetheless the machine evaluating the code
                      // will do something which is likely well
                      // defined, and must be defined if that logical
                      // behavior is to be predictable and thereby
                      // preferable during optimization.

- x = 5 << z;         // where although implementation specific, must
  if (x > 0) z = 2;   // be definable if it is to be utilized as the
                      // basis of a target specific behavior
                      // preserving optimization.
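
As a sketch of what "defining" the second example could look like at the
source level, the helper below pins down the cases that plain `5 << z`
leaves undefined in C (a negative count, a count at or beyond the width of
int, or a result not representable in int). The name shl_defined and the
"out-of-range counts yield 0" policy are assumptions chosen for
illustration; a real target might shift modulo the register width instead.

```c
#include <limits.h>

/* Hypothetical helper: left shift with every input given a result. */
int shl_defined(int value, int count)
{
    const int width = (int)(sizeof(int) * CHAR_BIT);

    if (count < 0 || count >= width)
        return 0;   /* chosen policy: everything is shifted out */

    /* Shift in unsigned, where dropping high bits is well defined.
       Converting back to int is implementation-defined when the high
       bit ends up set (assumed here to reduce modulo 2^N). */
    return (int)((unsigned)value << count);
}
```

With such a definition, `x = shl_defined(5, z); if (x > 0) z = 2;` has one
meaning for every z, so an optimizer has a single behavior to preserve.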





Re: Might a -native-semantics switch, forcing native target optimization semantics, be reasonable?

2006-01-01 Thread Paul Schlie
> From: Robert Dewar <[EMAIL PROTECTED]>
> though Paul's mention of saturating arithmetic seems way out of left field to
> me, since very few common architectures implement saturating arithmetic at the
> hardware level.

- Only every single dsp, and an increasing number of otherwise conventional
  and/or vector extended architectures targeted to improve the support for
  the same? (not to mention corresponding modular and/or specific algorithm
  addressing mode support, which may ideally be the target of optimized code
  mapping to improve performance, which correspondingly would ideally
  require their generic definition)

(But acknowledge these machines tend not to be targeted as the CPU of a PC,
if that is the intended focus of GCC)






Re: RTL alias analysis

2006-01-01 Thread Mark Mitchell
Steven Bosscher wrote:
> Hi rth,
> 
> The stack space sharing you added to cfgexpand.c breaks RTL alias
> analysis.
> 
> For example, the attached test case breaks for pentiumpro at -O2. 
> The problem apparently is that the second store to c is moved up
> before the load.

My guess at a solution is that when A (with alias set S_a) and B (with
alias set S_b) are given the same stack slot, we should create a new
alias set S_c which is a subset of both S_a and S_b, and give the
combined stack slot that alias set.

-- 
Mark Mitchell
CodeSourcery, LLC
[EMAIL PROTECTED]
(650) 331-3385 x713



Re: RTL alias analysis

2006-01-01 Thread Jan Hubicka
> Steven Bosscher wrote:
> > Hi rth,
> > 
> > The stack space sharing you added to cfgexpand.c breaks RTL alias
> > analysis.
> > 
> > For example, the attached test case breaks for pentiumpro at -O2. 
> > The problem apparently is that the second store to c is moved up
> > before the load.
> 
> My guess at a solution is that when A (with alias set S_a) and B (with
> alias set S_b) are given the same stack slot, we should create a new
> alias set S_c which is a subset of both S_a and S_b, and give the
> combined stack slot that alias set.

This won't work for the indirect accesses via pointers, like in the
testcase.

Honza
> 
> -- 
> Mark Mitchell
> CodeSourcery, LLC
> [EMAIL PROTECTED]
> (650) 331-3385 x713



Re: Might a -native-semantics switch, forcing native target optimization semantics, be reasonable?

2006-01-01 Thread Paul Schlie
> From: Michael Veksler wrote:
> I am not sure if the original poster meant the same as I do. What I have
> in mind is optimization opportunities like the following one for a
> linked list:

 (for reference: http://gcc.gnu.org/ml/gcc/2006-01/msg7.html )

- although not specifically what I was concerned about, it certainly
  seems reasonable as a non-portable target specific definition that
  may correspondingly be utilized as a basis of a target specific
  behavior preserving optimization.





Successful Build: gcc-4.1-20051230 i686-pc-mingw32

2006-01-01 Thread Parag Warudkar
E:\msys\1.0\home\gcc>gcc -v
Using built-in specs.
Target: mingw32
Configured with: ../gcc-4.1-20051223/configure --host=mingw32 --build=mingw32 --target=mingw32 --enable-threads --disable-nls --enable-optimize --enable-languages=c,c++ --prefix=e:/mingw4
Thread model: win32
gcc version 4.1.0 20051230 (prerelease)

w32api 3.5
mingw-runtime 3.9
binutils 2.16.91-20050827-1

Build required manual correction of gcc-obj\gcc\Makefile:
ORIGINAL_LD_FOR_TARGET gets set to "./E:/mingw/bin/ld.exe", unlike the
other settings such as ORIGINAL_AS_FOR_TARGET=/mingw/bin/as. make doesn't
like the "E:/" and the build fails with the error "Makefile:1277 : target
pattern contains no '%'. Stop." Setting
ORIGINAL_LD_FOR_TARGET=/mingw/bin/ld.exe makes it build successfully.

The built GCC and G++ successfully compile Trolltech Qt-4.1 for Windows.
Everything compiled works fine and fast at -O2 -mtune=athlon64 -mmmx
-msse2!

Thanks

Parag



Re: Might a -native-semantics switch, forcing native target optimization semantics, be reasonable?

2006-01-01 Thread Robert Dewar

Paul Schlie wrote:

> > Gabriel Dos Reis wrote:
> > I predict we will end up changing nothing for GCC; I also predict that
> > he'll be unsatisfied with unsupported claims; then, there is a high
> > chance the discussion will be restarted again in a different form
> > (hey, this is not the first time you, Paul and me are involved in this
> > kind of undefined discussion).  If we have good answers with backed up
> > claims, we could just reference it later.
> > ...
> > I think supported answers will help diminish undefined discussion frequency.
> > If you don't have a definition for "good reason", that is fine.  Let's move
> > on to something else; this is a new year.
>
> Gentlemen, I honestly don't want to promote a frivolous debate either; as
> such, if it is generally agreed there are no valid conceivable reasons to
> enable GCC to allow a target to specify the semantics of its particular
> implementation to enable optimizers to utilize that information such that
> its otherwise non-optimized behavior may be preserved during optimization,
> then I simply accept that I'm alone in the belief of its significance and
> utility; if not, then it seems that the only questions remaining are related
> to how may this be done relatively easily within the current optimization
> and target definition framework?
>
> If specific examples are required, although likely considered obvious,
> here's a few:
>
> - x[y] = 0;           // may also be undefined if y is one past the
>   if (x[y]) y = y+1;  // extent of x[], and/or result in an overflow
>                       // of x[y] one past its upper bound if
>                       // allocated at the upper range of the address
>                       // space, and/or if allocated at the base
>                       // address of 0 and y is 0 or negative; but
>                       // nonetheless the machine evaluating the code
>                       // will do something which is likely well
>                       // defined, and must be defined if that logical
>                       // behavior is to be predictable and thereby
>                       // preferable during optimization.

We have been through this before: there is no requirement that
optimization preserve the behavior of undefined programs (indeed that is
often one of the primary motivations in making things undefined). It is
fine to argue that defining the semantics is useful in a particular case,
but arguing solely from the point of view of trying to preserve observed
behavior is a poor argument. Indeed the point is that optimization is not
changing the behavior; the behavior is non-deterministic, and can change
from one compilation to the next even if optimization does not change.

> - x = 5 << z;         // where although implementation specific, must
>   if (x > 0) z = 2;   // to be definable if to be utilized as the
>                       // basis of a target specific behavior
>                       // preserving optimization.

 


ditto
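
A small illustration of the point about undefined programs, using signed
overflow rather than either of Paul's examples. This is a sketch with
made-up function names. The first function only "works" if signed overflow
happens to wrap; because that overflow is undefined in C, a compiler may
assume it never happens and reduce the test to a constant. The second asks
the same question without overflowing, so its result is the same with or
without optimization.

```c
#include <limits.h>
#include <stdbool.h>

/* Fragile: x + 1 overflows when x == INT_MAX, which is undefined
   behavior, so the observed result is not something an optimizer is
   required to preserve. */
bool next_overflows_fragile(int x)
{
    return x + 1 < x;
}

/* Defined for every x: no overflowing operation is ever evaluated. */
bool next_overflows(int x)
{
    return x == INT_MAX;
}
```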





Re: RTL alias analysis

2006-01-01 Thread Daniel Berlin
On Sun, 2006-01-01 at 10:22 -0800, Mark Mitchell wrote:
> Steven Bosscher wrote:
> > Hi rth,
> > 
> > The stack space sharing you added to cfgexpand.c breaks RTL alias
> > analysis.
> > 
> > For example, the attached test case breaks for pentiumpro at -O2. 
> > The problem apparently is that the second store to c is moved up
> > before the load.
> 
> My guess at a solution is that when A (with alias set S_a) and B (with
> alias set S_b) are given the same stack slot, we should create a new
> alias set S_c which is a subset of both S_a and S_b, and give the
> combined stack slot that alias set.

Won't work here, sadly, AFAIK.
This is because it's not TBAA that gets you here, and in fact, it won't
help (In fact, they already should be in the same alias set because they
are union'd together).

Take a look at true_dependence, or canon_true_dependence, in alias.c,
and you'll see that there are a bunch of times that even if the alias
sets say they conflict, we will return that they don't conflict.  This
is one of those times.  

If this is the same testcase Steven was discussing on IRC, the real
solution is to transfer the information that the stack space sharing
knows into some simple set form, and use *that directly* in alias.c, and
check it *first*, so that if they have the same stack slot, we say there
is a dependence, even if the memory expressions/types/etc look
different.

This also lets you say for sure whether things have a different stack
slot or not, which we seem to try to fathom using the reg_*_value stuff
(why guess when we could just ask where it put them?)
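
A toy model of that ordering, to show the shape of the check rather than
anything in alias.c: every name below (mem_ref, toy_sets_conflict,
may_depend) is hypothetical, the slot number stands in for whatever the
stack space sharing code records, and the type-based query is reduced to
"same set number".

```c
#include <stdbool.h>

/* Toy memory reference: the stack slot it was assigned (-1 if it is not
   a stack temporary) plus a number standing in for its alias set. */
struct mem_ref {
    int stack_slot;
    int alias_set;
};

/* Stand-in for a type-based query; here only equal sets conflict. */
bool toy_sets_conflict(int a, int b)
{
    return a == b;
}

/* Consult the slot assignment first.  Only when it cannot decide do we
   fall back to the type-based answer. */
bool may_depend(const struct mem_ref *x, const struct mem_ref *y)
{
    if (x->stack_slot >= 0 && y->stack_slot >= 0)
        return x->stack_slot == y->stack_slot;
    return toy_sets_conflict(x->alias_set, y->alias_set);
}
```

In this model the short store and the int store from the test case would
both carry the same slot number, so may_depend reports a dependence even
though their alias-set numbers differ.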



 




The Extension IDE - EXTEIDE

2006-01-01 Thread EXTEIDE (sent by Nabble.com)

EXTEIDE is freeware. Anyone can download now.

Please visit http://www.exteide.com

Thanks.

--
Sent from the gcc - Dev forum at Nabble.com:
http://www.nabble.com/The-Extension-IDE---EXTEIDE-t835749.html#a2166859