Re: GCC optimizes integer overflow: bug or feature?

2006-12-22 Thread Paolo Bonzini



  int j;
  for (j = 1; 0 < j; j *= 2)
    if (! bigtime_test (j))
      return 1;

Here it is obvious to a programmer that the comparison is
intended to do overflow checking, even though the test
controls the loop.


Well, it's not to me. :-)


Another question for the GCC experts: would it fix the bug
if we replaced "j *= 2" with "j <<= 1" in this sample code?


Yes, it will actually compile the code as this:

  int i, j;
  for (i = 0, j = 1; i < 31; i++)
    j <<= 1;

Or you can do, since elsewhere in the code you compute time_t_max:

  for (j = 1; j <= time_t_max / 2 + 1; j *= 2)

Which is IMO more intention revealing.

Paolo


Re: GCC optimizes integer overflow: bug or feature?

2006-12-22 Thread Roberto Bagnara

Paul Eggert wrote:

Roberto Bagnara <[EMAIL PROTECTED]> writes:


(The platform I'm thinking of is Tandem NSK/OSS.)

Is this correct?  Doesn't C99's 6.2.5#6 mandate that...


This is straying from the subject of GCC and into the
problems of writing portable C code, but since you asked

The Tandem NSK/OSS environment does not claim full
conformance to C99.  The NSK/OSS community is conservative
(fault-tolerance does that to you :-) and has introduced
only some C99 features, more as time progresses.  The
NSK/OSS developers did not introduce 64-bit unsigned int
until last year.  I'm no expert in the area, but I'd guess
that most NSK/OSS production shops are still running older
releases, which have 64-bit signed int but only 32-bit
unsigned int.


I now understand that Tandem NSK/OSS is not conformant, thanks.

But the reason I asked is that I interpreted what you wrote,
i.e.,

> Also, such an approach assumes that unsigned long long int
> has at least as many bits as long long int.  But this is an
> unportable assumption; C99 does not require this.

as "C99 does not require that unsigned long long int
has at least as many bits as long long int."  My reading,
instead, is that C99 requires unsigned long long int
to have exactly the same number of bits as long long int.
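
For reference, C99 6.2.5#6 says that each signed integer type has a
corresponding unsigned type that uses the same amount of storage, and
6.2.6.2 requires the unsigned type to have at least as many value bits
as the signed one.  A minimal sketch of checking both properties on a
given implementation follows.  The helper names are made up for
illustration, and an implementation that does not claim C99 conformance
(as described above for NSK/OSS) may fail these checks.

```c
#include <assert.h>   /* for the checks in the usage note below */
#include <limits.h>

/* Hypothetical helper: nonzero if both types occupy the same number of
   storage bits, as C99 6.2.5#6 requires of corresponding types.  */
static int same_storage_bits (void)
{
  return sizeof (unsigned long long) * CHAR_BIT
         == sizeof (long long) * CHAR_BIT;
}

/* Hypothetical helper: nonzero if every nonnegative long long value is
   representable as unsigned long long.  */
static int unsigned_covers_signed (void)
{
  return ULLONG_MAX >= (unsigned long long) LLONG_MAX;
}
```

A configure-style test could call both helpers, for example
assert (same_storage_bits ()), and fall back to a different overflow
check when either one fails.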
All the best,

 Roberto

--
Prof. Roberto Bagnara
Computer Science Group
Department of Mathematics, University of Parma, Italy
http://www.cs.unipr.it/~bagnara/
mailto:[EMAIL PROTECTED]


Re: GCC optimizes integer overflow: bug or feature?

2006-12-22 Thread Paolo Bonzini



Or you can do, since elsewhere in the code you compute time_t_max:

  for (j = 1; j <= time_t_max / 2 + 1; j *= 2)


No, this does not work.  It would work to have:

  for (j = 1;;)
    {
      if (j > time_t_max / 2)
        break;
      j *= 2;
    }

Oops.
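
The reason the "j <= time_t_max / 2 + 1" form fails is that the test
still passes when j equals time_t_max / 2 + 1, so the following j *= 2
overflows.  Checking before multiplying avoids that.  A self-contained
sketch of the corrected loop follows; the function name and the "max"
parameter (standing in for time_t_max) are assumptions made for
illustration.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: returns the largest power of two that is <= max,
   for max >= 1, without ever computing a value larger than max.  */
static int largest_power_of_two (int max)
{
  int j;

  assert (max >= 1);
  for (j = 1;;)
    {
      if (j > max / 2)   /* doubling j would exceed max: stop first */
        break;
      j *= 2;
    }
  return j;
}
```

The multiplication only happens when j <= max / 2, so 2 * j <= max and
the signed overflow the optimizer is allowed to assume away cannot
occur.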

Paolo



needed headerfiles in .c

2006-12-22 Thread Markus Franke
Dear GCC Developers / Users,

I am trying to port a GCC backend from GCC 2.7.2.3 to GCC 4.1.1. After
having had a look at some existing backends like the PDP11, I
found out that a lot of new header files have been added to
".c" as includes.

My question now is whether there is some kind of standard set of header
files that every backend needs. Can somebody give me a list or
something like that? I already had a look at the Internals Manual, but
without finding anything about it.

Thanks in advance,

Markus Franke



Fixme in driver-i386.c

2006-12-22 Thread Uros Bizjak

Hello!

There is a fixme in config/i386/driver-i386.c:

--cut here--
  if (arch)
    {
      /* FIXME: i386 is wrong for 64bit compiler.  How can we tell if
         we are generating 64bit or 32bit code?  */
      cpu = "i386";
    }
  else
--cut here--

Couldn't a simple "sizeof(long)" do the trick here, i.e.:

--cut here--
#include <stdio.h>
#include <stdlib.h>

int main()
{
  int i = sizeof (long);

  switch (i)
    {
    default:
      abort();
    case 4:
    case 8:
      printf ("%i\n", i);
    }
  return 0;
}
--cut here--

gcc -m32
./a.out
4
gcc -m64
./a.out
8

Uros.


Re Fixme in driver-i386.c

2006-12-22 Thread Kai Tietz
Hello Uros,

No, sizeof (long) is not always different. E.g., for the future 64-bit
mingw target the long type remains 4 bytes in size. But maybe we can use
the pointer size?
Because on an i386 32-bit system sizeof(void *)==4 and on an x86_64 64-bit
system sizeof(void *)==8!
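
A small sketch of the difference between the two probes follows.  The
helper names are made up for illustration, and the values only describe
the data model the snippet itself is compiled for, which is an
assumption that may or may not match what the driver needs to know.

```c
#include <assert.h>   /* for the checks in the usage note below */
#include <limits.h>

/* Hypothetical helper: width of long in bits.  Stays 32 on a 64-bit
   target whose data model keeps long at 4 bytes.  */
static int long_bits (void)
{
  return (int) (sizeof (long) * CHAR_BIT);
}

/* Hypothetical helper: width of a data pointer in bits.  */
static int pointer_bits (void)
{
  return (int) (sizeof (void *) * CHAR_BIT);
}
```

On a target where long stays 4 bytes while pointers are 8 bytes,
long_bits () would report 32 and pointer_bits () would report 64, so
only the second probe distinguishes the two code models.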

Regards,
 i.A. Kai Tietz










Saving the Tree declaration node in GCC 4.1.1.

2006-12-22 Thread Rohit Arul Raj

Hi all,

I am working with GCC 4.1.1. I need some information on the following.

Before emitting a call instruction, I need to check for function
attributes. Based on that I need to emit the corresponding call
instruction. For that, before emitting the call instruction, I check
for the attributes of the called function through the declaration
node.

tree fn_id, fn_decl;
fn_id = get_identifier(name);
fn_decl = lookup_name(fn_id);

This code works fine if none of the optimizations are enabled. But
it fails at all levels of optimization, as I am not able to get the
function decl tree (lookup_name returns NULL).

When compiling the code with the -fno-unit-at-a-time switch, I am able to
get the function declaration node.

The ASTs are converted into SSA and eventually into the RTL
representation after parsing each function (-O0, no optimization) or
when the whole file is parsed (with optimization).

When the whole file is parsed, are the function declaration
nodes removed intentionally by GCC?

In this context (with -funit-at-a-time), how will I be able to access
the function declaration node of the called function just before
emitting the assembly?

Or should I save the function declaration nodes for further use?

Can anyone suggest a workaround?

Thanks in advance,

Regards,
Rohit


Re: SSA_NAMES: should there be an unused, un-free limbo?

2006-12-22 Thread Zdenek Dvorak
Hello,

> On Thu, 2006-12-21 at 20:18 +0100, Zdenek Dvorak wrote:
> 
> > I think this might be a good idea.  I think that this requires
> > a lot of changes (basically going through all uses of bsi_remove
> > and remove_phi_node and checking them), but it would be cleaner
> > than the current situation.
> Agreed.  Tedious work, but it shouldn't be terribly difficult (famous
> last words).

I will give it a try (i.e., if it can be done in one afternoon, I will
send a patch tomorrow :-).

Zdenek


Re: SSA_NAMES: should there be an unused, un-free limbo?

2006-12-22 Thread Diego Novillo

Jeffrey Law wrote on 12/22/06 01:09:

On Thu, 2006-12-21 at 14:05 -0500, Diego Novillo wrote:
In any case, that is not important.  I agree that every SSA name in the 
SSA table needs to have a DEF_STMT that is either (a) an empty 
statement, or, (b) a valid statement still present in the IL.

Just to be 100% clear.  This is not true at the current time; see the
discussion about the sharing of a single field for TREE_CHAIN and
SSA_NAME_DEF_STMT.  If you want to make that statement true, then
you need to fix both the orphan problem and the sharing of a field
for SSA_NAME_DEF_STMT and TREE_CHAIN.


I think we are agreeing violently.


RE: Char shifts promoted to int. Why?

2006-12-22 Thread Dave Korn
On 21 December 2006 21:54, Ayal Zaks wrote:

>> Something along these lines may be useful to do in the vectorizer when we
>> get code like this:
>>   ((char)x) = ((char)( ((int)((char)x)) << ((int)c) ) )
>> and don't feel like doing all the unpacking of chars to ints and then
>> packing the ints to chars after the shift. An alternative could be to
>> transform the above pattern to:
>>  char_x1 = 0
>>  char_x2 = char_x << c
>>  char_x = ((int)c < size_of_char) ? char_x2 : char_x1
>> and vectorize that (since we already know how to vectorize selects).
> 
> Alternatively, do
>   char_c2 = (c < size_of_char ? c : 0)
>   char_x2 = char_x << char_c2
> which is like saturating the shift amount.

  You don't really mean zero as the third operand of that ternary operator,
you want size_of_char.
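
To make the point concrete, here is a scalar sketch of the select form
and the saturated-shift-amount form next to a promote, shift, truncate
reference.  All names are made up for illustration; size_of_char is
taken to be CHAR_BIT, unsigned arithmetic is used so that every shift is
well defined, and shift counts are assumed to be below 16.  The
saturated form does its shift in the promoted type here; whether a real
vector shift of char lanes by size_of_char yields zero depends on the
target and is an assumption.

```c
#include <assert.h>
#include <limits.h>

/* Reference: promote, shift, truncate back to char.  */
static unsigned char shift_reference (unsigned char x, unsigned c)
{
  assert (c < 16);
  return (unsigned char) ((unsigned) x << c);
}

/* Select form: shift by an in-range amount, then pick 0 when c is too
   large for a char.  */
static unsigned char shift_select (unsigned char x, unsigned c)
{
  unsigned char x1 = 0;
  unsigned char x2 = (unsigned char) ((unsigned) x << (c % CHAR_BIT));
  return c < CHAR_BIT ? x2 : x1;
}

/* Saturated form: replace an out-of-range shift amount by `fallback`
   and shift once.  */
static unsigned char shift_saturated (unsigned char x, unsigned c,
                                      unsigned fallback)
{
  unsigned c2 = c < CHAR_BIT ? c : fallback;

  assert (c2 < 16);
  return (unsigned char) ((unsigned) x << c2);
}
```

With fallback == CHAR_BIT the saturated form agrees with the reference
for out-of-range counts (the result is 0); with fallback == 0 it returns
x unchanged, which is the mismatch being pointed out.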


cheers,
  DaveK
-- 
Can't think of a witty .sigline today



Re: Reload Pass

2006-12-22 Thread Vladimir N. Makarov

Rajkishore Barik wrote:


Hi,

Thanks very much. I still have doubts on your suggestion:

AFAIK, the back-end pass consists of (in order) : reg move -> insn sched 
-> reg class -> local alloc -> global alloc -> reload -> post-reload.
There are edges from reg move to reg class and reload back to global 
alloc.


In case I want to implement a linear scan which may split live ranges 
(pseudos) into live intervals (smaller pseudos) and allocate different 
registers for each of them, this process would break the whole loop. 

So, what did you mean by --- "run this pass in between the register 
allocator and reload, that would probably be doable."?


 

Sorry, you were not specific.  When I read your first message, I also 
thought that you were going to rewrite the reload pass, in other words to 
use another algorithm to produce strict RTL, where there are only hard 
registers and all insn constraints are satisfied.


As I understand correctly now, you are going to implement a linear scan 
register allocator which also does what reload in gcc does.


First of all, you did not state what your project goals are.  Is it 
a better understanding of how the gcc register allocator and reload work 
(that is a good project then), or do you want to write a better register 
allocator which will be used in gcc?  The second goal is hard to achieve 
because a linear scan register allocator generates worse code than 
Chaitin-Briggs and the current gcc register allocator (Chow's 
priority-based coloring).  Another problem is that linear scan register 
allocation is patented.  Do you have permission to use it in gcc?  
Currently we have permission for Chaitin's (IBM) and Briggs's (Rice 
University) patents.  We cannot ignore patents the way LLVM does, which 
uses the patented linear scan and Callahan-Koblenz register allocators.


IMHO, Ian wrote correctly that you can write a decent register allocator 
which also does the reload work for one target (although even in this 
case the number of details to keep in mind will be overwhelming), but it 
is hard to write decent code which works for all gcc targets.  So much 
effort has gone into improving reload for all gcc processors (even weird 
ones like SH with many registers but very small displacements, or mcore 
with few registers and small displacements); you would have to do the 
same to be successful.


Reload does a lot of things not mentioned in Zack's document, like 
register rematerialization, virtual register elimination, address 
inheritance (which is important for processors with small address 
displacements) and a few others.  There is an opinion that doing some 
reload things before register allocation will permit generating better 
code, because register allocation will not have to worry about the 
constraints and reload will be more predictable (more accurately, it 
will have work to do in fewer cases).  Such a project already exists 
(see svn branches).


In any case, your work on register allocation and reload will be 
appreciated by the community; only please be prepared that it will not 
be an easy road.


Vlad


RE: GCC optimizes integer overflow: bug or feature?

2006-12-22 Thread Dave Korn
On 22 December 2006 00:59, Denis Vlasenko wrote:


> Or this, absolutely typical C code. i386 arch can compare
> 16 bits at a time here (luckily, no alignment worries on this arch):

  Whaddaya mean, no alignment worries?  Misaligned accesses *kill* your
performance!

  I know this doesn't affect correctness, but the coder might well have known
that the pointer is unaligned and written two separate byte-sized accesses on
purpose; volatile isn't the answer because it's too extreme, there's nothing
wrong with caching these values in registers and they don't spontaneously
change on us.

cheers,
  DaveK
-- 
Can't think of a witty .sigline today



RE: GCC optimizes integer overflow: bug or feature?

2006-12-22 Thread Andrew Pinski
On Fri, 2006-12-22 at 17:08 +, Dave Korn wrote:
> Misaligned accesses *kill* your performance! 

Maybe on x86, but on PPC, at least for the (current) Cell's PPU,
misaligned accesses are optimal in most cases.

Thanks,
Andrew Pinski



Re: GCC optimizes integer overflow: bug or feature?

2006-12-22 Thread Robert Dewar

Dave Korn wrote:

On 22 December 2006 00:59, Denis Vlasenko wrote:



Or this, absolutely typical C code. i386 arch can compare
16 bits at a time here (luckily, no alignment worries on this arch):


  Whaddaya mean, no alignment worries?  Misaligned accesses *kill* your
performance!


is it really worse to do one unaligned 16-bit read, than two separate
8-bit reads? I am surprised ... and of course you have the gain from
shorter code, reducing i-cache pressure.


  I know this doesn't affect correctness, but the coder might well have known
that the pointer is unaligned and written two separate byte-sized accesses on
purpose; volatile isn't the answer because it's too extreme, there's nothing
wrong with caching these values in registers and they don't spontaneously
change on us.




Re: GCC optimizes integer overflow: bug or feature?

2006-12-22 Thread Robert Dewar

Andrew Pinski wrote:

On Fri, 2006-12-22 at 17:08 +, Dave Korn wrote:
Misaligned accesses *kill* your performance! 


Maybe on x86, but on PPC, at least for the (current) Cell's PPU,
misaligned accesses are optimal in most cases.


is that true across cache boundaries?


Thanks,
Andrew Pinski




Re: GCC optimizes integer overflow: bug or feature?

2006-12-22 Thread Andrew Pinski
On Fri, 2006-12-22 at 12:30 -0500, Robert Dewar wrote:
> 
> > Maybe on x86, but on PPC, at least for the (current) Cell's PPU,
> > misaligned accesses are optimal in most cases.
> 
> is that true across cache boundaries? 

For Cell, crossing a 32-byte boundary causes the access to be handled in
microcode.  But the question is how often that happens compared to
non-crossing accesses; I am willing to bet hardly at all.  Yes, I need to
test this, and I am going to have to anyway for my job :).
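
As a rough way to reason about "how often", the sketch below counts how
many start offsets within a 32-byte block make an access of a given
size cross the boundary.  The function names are made up for
illustration, and treating all start offsets as equally likely is an
assumption; real code is usually skewed toward aligned addresses, so
this is not a measurement.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: nonzero if the access [addr, addr + size) spans
   two different `boundary`-sized blocks.  */
static int crosses_boundary (size_t addr, size_t size, size_t boundary)
{
  assert (size > 0 && boundary > 0);
  return addr / boundary != (addr + size - 1) / boundary;
}

/* Hypothetical helper: number of start offsets in [0, boundary) at
   which an access of `size` bytes crosses into the next block.  */
static size_t crossing_offsets (size_t size, size_t boundary)
{
  size_t off, n = 0;

  for (off = 0; off < boundary; off++)
    n += crosses_boundary (off, size, boundary);
  return n;
}
```

For example, crossing_offsets (2, 32) is 1 and crossing_offsets (4, 32)
is 3, so under the uniform assumption a 2-byte access would cross in 1
of 32 cases and a 4-byte access in 3 of 32.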

-- Pinski



[mem-ssa] Updated documentation

2006-12-22 Thread Diego Novillo


I've updated the document describing Memory SSA.  The section on mixing 
static and dynamic partitioning is still being implemented, so it's a 
bit sparse on details and things will probably shift somewhat before I'm 
done.


http://gcc.gnu.org/wiki/mem-ssa

Feedback welcome.  Thanks.


gcc-4.1-20061222 is now available

2006-12-22 Thread gccadmin
Snapshot gcc-4.1-20061222 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.1-20061222/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.1 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_1-branch 
revision 120156

You'll find:

gcc-4.1-20061222.tar.bz2  Complete GCC (includes all of below)

gcc-core-4.1-20061222.tar.bz2 C front end and core compiler

gcc-ada-4.1-20061222.tar.bz2  Ada front end and runtime

gcc-fortran-4.1-20061222.tar.bz2  Fortran front end and runtime

gcc-g++-4.1-20061222.tar.bz2  C++ front end and runtime

gcc-java-4.1-20061222.tar.bz2 Java front end and runtime

gcc-objc-4.1-20061222.tar.bz2 Objective-C front end and runtime

gcc-testsuite-4.1-20061222.tar.bz2 The GCC testsuite

Diffs from 4.1-20061215 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.1
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Re: GCC optimizes integer overflow: bug or feature?

2006-12-22 Thread Denis Vlasenko
On Friday 22 December 2006 03:03, Paul Brook wrote:
> On Friday 22 December 2006 00:58, Denis Vlasenko wrote:
> > On Tuesday 19 December 2006 23:39, Denis Vlasenko wrote:
> > > There are a lot of 100.00% safe optimizations which gcc
> > > can do. Value range propagation for bitwise operations, for one
> >
> > Or this, absolutely typical C code. i386 arch can compare
> > 16 bits at a time here (luckily, no alignment worries on this arch):
> >
> > int f(char *p)
> > {
> >     if (p[0] == 1 && p[1] == 2) return 1;
> >     return 0;
> > }
> 
> Definitely not 100% safe. p may point to memory that is sensitive to the 
> access width and/or number of accesses. (ie. memory mapped IO).

Take a look at what Linux does when you need to touch MMIO
or PIO areas. In short: it wraps the access in macros/inlines
which do all the required magic (which may be rather different
on different architectures; for i386, they amount to
*(volatile char*)p).

"Simple" access to such areas with *p will never work
safely across the whole spectrum of hardware.
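
For reference, the 16-bit comparison suggested for f() can be written by
hand in portable C.  This is only a sketch with made-up names (f_bytes,
f_word): memcpy is used so that no alignment assumption is needed, and
the expected pattern is built the same way so the code does not depend
on byte order.

```c
#include <assert.h>   /* for the checks in the usage note below */
#include <stdint.h>
#include <string.h>

/* The byte-wise version, as in the quoted f().  */
static int f_bytes (const char *p)
{
  if (p[0] == 1 && p[1] == 2)
    return 1;
  return 0;
}

/* Hand-merged version: load both bytes as one 16-bit value and compare
   against the same two bytes loaded the same way.  */
static int f_word (const char *p)
{
  static const char pattern[2] = { 1, 2 };
  uint16_t got, want;

  memcpy (&got, p, sizeof got);          /* alignment-safe 16-bit load */
  memcpy (&want, pattern, sizeof want);
  return got == want;
}
```

One difference worth noting: f_word always reads p[1], whereas f_bytes
reads it only when p[0] == 1, so the two are interchangeable only when
both bytes are known to be readable.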


Ok, next example of real-world code I recently saw.
i >= N comparisons are completely superfluous -
programmer probably overlooked that fact.

But gcc didn't notice that either and generated 16 bytes extra
for first function:

# cat t3.c
int i64c(int i) {
    if (i <= 0) return '.';
    if (i == 1) return '/';
    if (i >= 2 && i < 12) return ('0' - 2 + i);
    if (i >= 12 && i < 38) return ('A' - 12 + i);
    if (i >= 38 && i < 63) return ('a' - 38 + i);
    return 'z';
}

int i64c_2(int i) {
    if (i <= 0) return '.';
    if (i == 1) return '/';
    if (i < 12) return ('0' - 2 + i);
    if (i < 38) return ('A' - 12 + i);
    if (i < 63) return ('a' - 38 + i);
    return 'z';
}
# gcc -O2 -c -fomit-frame-pointer t3.c
# nm --size-sort t3.o
0038 T i64c_2
0048 T i64c
# gcc -O2 -S -fomit-frame-pointer t3.c
# cat t3.s
        .file   "t3.c"
        .text
        .p2align 2,,3
.globl i64c
        .type   i64c, @function
i64c:
        movl    4(%esp), %edx
        testl   %edx, %edx
        jle     .L15
        cmpl    $1, %edx
        je      .L16
        leal    -2(%edx), %eax
        cmpl    $9, %eax
        jbe     .L17
        leal    -12(%edx), %eax
        cmpl    $25, %eax
        jbe     .L18
        leal    -38(%edx), %eax
        cmpl    $24, %eax
        ja      .L19
        leal    59(%edx), %eax
        ret
        .p2align 2,,3
.L17:
        leal    46(%edx), %eax
        ret
        .p2align 2,,3
.L16:
        movl    $47, %eax
        ret
        .p2align 2,,3
.L19:
        movl    $122, %eax
        ret
.L18:
        leal    53(%edx), %eax
        ret
.L15:
        movl    $46, %eax
        ret
        .size   i64c, .-i64c
        .p2align 2,,3
.globl i64c_2
        .type   i64c_2, @function
i64c_2:
        movl    4(%esp), %eax
        testl   %eax, %eax
        jle     .L33
        cmpl    $1, %eax
        je      .L34
        cmpl    $11, %eax
        jle     .L35
        cmpl    $37, %eax
        jle     .L36
        cmpl    $62, %eax
        jg      .L37
        addl    $59, %eax
        ret
        .p2align 2,,3
.L35:
        addl    $46, %eax
        ret
        .p2align 2,,3
.L34:
        movb    $47, %al
        ret
        .p2align 2,,3
.L37:
        movl    $122, %eax
        ret
.L36:
        addl    $53, %eax
        ret
.L33:
        movl    $46, %eax
        ret
        .size   i64c_2, .-i64c_2
        .ident  "GCC: (GNU) 4.2.0 20061128 (prerelease)"
        .section        .note.GNU-stack,"",@progbits
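
The claim that the lower-bound comparisons are superfluous can be
checked by brute force.  The sketch below repeats the two functions from
t3.c and adds a made-up helper, count_mismatches, that compares them
over a range; it says nothing about code size, only that the two
versions return the same values.

```c
#include <assert.h>

static int i64c (int i)
{
  if (i <= 0) return '.';
  if (i == 1) return '/';
  if (i >= 2 && i < 12) return ('0' - 2 + i);
  if (i >= 12 && i < 38) return ('A' - 12 + i);
  if (i >= 38 && i < 63) return ('a' - 38 + i);
  return 'z';
}

static int i64c_2 (int i)
{
  if (i <= 0) return '.';
  if (i == 1) return '/';
  if (i < 12) return ('0' - 2 + i);
  if (i < 38) return ('A' - 12 + i);
  if (i < 63) return ('a' - 38 + i);
  return 'z';
}

/* Hypothetical helper: number of inputs in [lo, hi] on which the two
   versions disagree.  The loop exits before incrementing past hi, so it
   is safe even when hi is INT_MAX.  */
static int count_mismatches (int lo, int hi)
{
  int i, n = 0;

  assert (lo <= hi);
  for (i = lo;; i++)
    {
      if (i64c (i) != i64c_2 (i))
        n++;
      if (i == hi)
        break;
    }
  return n;
}
```

Each lower bound is implied by the earlier returns: once i <= 0 and
i == 1 have been handled, i >= 2 holds, and once i < 12 has returned,
i >= 12 holds, and so on.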

--
vda


Re: Saving the Tree declaration node in GCC 4.1.1.

2006-12-22 Thread Ian Lance Taylor
"Rohit Arul Raj" <[EMAIL PROTECTED]> writes:

> Before emitting a call instruction, i need to check for function
> attributes. Based on that i need to emit the corresponding call
> instruction. For that, before emitting the call instruction, i check
> for the attributes of the called function through the declaration
> node.
> 
> tree fn_id, fn_decl;
> fn_id = get_identifier(name);
> fn_decl = lookup_name(fn_id);

I don't understand where you are trying to emit the call instruction.
Why do you only have the name?  What language are you compiling?

Certainly calling lookup_name seems wrong.

Given a CALL_EXPR, you can use get_callee_fndecl to find the callee.

At the RTL level, you can usually use TARGET_ENCODE_SECTION_INFO and
SYMBOL_REF_FLAG to good effect.  Or look at how ARM handles
ENCODED_LONG_CALL_ATTR_P via TARGET_STRIP_NAME_ENCODING.

Ian