Re: GCC 4.3.0 Status Report (2007-08-09)

2007-08-10 Thread Diego Novillo
On 8/10/07 9:49 AM, Diego Novillo wrote:

> Zadeck has the parloop branch patches [ ... ]

Sorry, I meant Zdenek.


Re: [RFC] Migrate pointers to members to the middle end

2007-08-10 Thread Mark Mitchell
Ollie Wild wrote:

> Offhand, I don't remember what happened with the various other cases,
> but my testing at the time wasn't particularly thorough.  The feedback
> I've gotten so far seems overwhelmingly negative, so I think the next
> step is to revisit the lowering approach, exercise the hell out of it,
> and see what, if any, limitations pop up.

Yes, I agree.  Again, thank you for being patient with the process.

Let me know when you're at the point where you'd like me to review the
front-end lowering patch again; send me a URL, and I'll be happy to do so.

Thanks,

-- 
Mark Mitchell
CodeSourcery
[EMAIL PROTECTED]
(650) 331-3385 x713


Re: mips gcc -O1: Address exception error on store doubleword

2007-08-10 Thread Andrew Haley
Alex Gonzalez writes:
 > Hi, trying to come up with a testcase we figured out what the problem
 > could be.
 > 
 > When the optimizer is on and memcpy sees that it is copying a
 > struct with double words in it, it will assume that the struct
 > starts on an 8 byte boundary and use double word loads and
 > stores. This is a safe assumption, as gcc will always ensure that
 > structs containing doubles start on an 8 byte boundary when the
 > memory is mallocced.
 > 
 > However we managed to trick gcc by mallocing a large chunk of
 > memory and then assigning a pointer to a user data (unsigned int
 > user[0]) without first ensuring that the user data was 8 byte
 > aligned. Since the structure does contain a double, this resulted
 > in a crash in memcpy.
 > 
 > The fix for this was to inform the compiler that this "void"
 > pointer should be 8 byte aligned by changing the "unsigned int
 > user[0]" to a "unsigned long long user[0]". This will cause gcc to
 > pad this entry out to ensure that it starts on an 8 byte boundary.
 > 
 > Does this make sense?

Yes.  In general, if you lie to the compiler you lose.  :-)

It's a very good idea to read what the language standards actually say
about this.  In particular, casting pointers between types doesn't
work except in some well-defined cases.  You should read the standard
to find out what works and what doesn't.

Andrew.
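
The effect Alex describes can be sketched in a few lines of C. This is only an illustration under stated assumptions: the struct names `hdr_uint` and `hdr_ull` and the `len` field are made up, and a C99 flexible array member (`user[]`) stands in for the GNU zero-length array (`user[0]`) from the thread. The element type of the trailing array determines its alignment, and therefore the padding the compiler inserts in front of it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header placed in front of caller-owned payload bytes.
   The names are invented for this sketch, not taken from the thread. */
struct hdr_uint {
    unsigned int len;
    unsigned int user[];        /* payload only needs int alignment */
};

struct hdr_ull {
    unsigned int len;
    unsigned long long user[];  /* payload aligned for unsigned long long */
};

/* Byte offset of the payload within each header layout. */
size_t payload_offset_uint(void) { return offsetof(struct hdr_uint, user); }
size_t payload_offset_ull(void)  { return offsetof(struct hdr_ull, user); }

/* Returns 1 if p is a multiple of align, 0 otherwise. */
int is_aligned(const void *p, size_t align)
{
    return ((uintptr_t)p % align) == 0;
}
```

With the `unsigned int` payload the offset equals `sizeof(unsigned int)`, so a struct with 8-byte alignment copied to or from `user` may be misaligned. With the `unsigned long long` payload the offset is a multiple of that type's alignment (8 on the n32 ABI discussed here). This only helps if the header itself starts at a suitably aligned address, as memory returned by malloc does.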


Re: Very Fast: Directly Coded Lexical Analyzer

2007-08-10 Thread Robert Dewar

Ronny Peine wrote:

> Hi,
>
> my question is, why not use the element construction algorithm? The
> Thompson algorithm creates an epsilon-NFA, which needs quite a lot of
> memory. The element construction creates an NFA directly and therefore
> has fewer states. Well, this is only interesting for scanner creation,
> which is not as important as the scanner itself, but it can reduce the
> memory footprint of the generator. It's a pity I can't find a URL for
> a description of the algorithm; maybe I even have the wrong name for
> it. I have only read about it in the Compiler Construction lecture
> notes at the university.


To me, very fast (millions of lines a second) lexical analyzers are
trivial to write by hand, and I really don't see the point of tools,
and certainly not the utility of any theory in writing such code.
If anything the formalism of a finite state machine just gets in the
way, since it is more efficient to encode the state in the code
location than in data.
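
A minimal sketch of what "encoding the state in the code location" can look like, assuming a toy token set (identifiers, decimal numbers, single-character punctuation). The names `next_token` and `count_tokens` and the token kinds are invented for illustration; this is not code from GCC or from any tool discussed in the thread. There is no transition table: which branch or loop is executing *is* the automaton state.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

enum token { TOK_EOF, TOK_IDENT, TOK_NUMBER, TOK_PUNCT };

/* Scan one token starting at *pp and advance *pp past it.
   Each branch below plays the role of one state of the scanner. */
enum token next_token(const char **pp)
{
    const char *p = *pp;
    enum token kind;

    /* "Skipping whitespace" state. */
    while (*p == ' ' || *p == '\t' || *p == '\n')
        p++;

    if (*p == '\0') {
        kind = TOK_EOF;
    } else if (isalpha((unsigned char)*p) || *p == '_') {
        /* "Inside identifier" state. */
        do
            p++;
        while (isalnum((unsigned char)*p) || *p == '_');
        kind = TOK_IDENT;
    } else if (isdigit((unsigned char)*p)) {
        /* "Inside number" state. */
        do
            p++;
        while (isdigit((unsigned char)*p));
        kind = TOK_NUMBER;
    } else {
        /* Any other character is a one-character token. */
        p++;
        kind = TOK_PUNCT;
    }

    *pp = p;
    return kind;
}

/* Count the tokens in a NUL-terminated buffer, excluding the final TOK_EOF. */
int count_tokens(const char *s)
{
    int n = 0;

    assert(s != NULL);
    while (next_token(&s) != TOK_EOF)
        n++;
    return n;
}
```

For the input `x1 = 42;` successive calls return TOK_IDENT, TOK_PUNCT, TOK_NUMBER, TOK_PUNCT and then TOK_EOF.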


GCC "make" errors

2007-08-10 Thread mandeep singh bhambra
Hi,

I wanted to update my GCC compiler to 4.2.1 in order to install an updated
version of the C libraries (glibc), but the build is failing. ./configure works
fine, but when I type "make" it runs for a while and then gives the following
errors:

/tmp/ccacyMlE.s: Assembler messages:
/tmp/ccacyMlE.s:72: Error: no such 386 instruction: `stmxcsr'
/tmp/ccacyMlE.s:90: Error: no such 386 instruction: `ldmxcsr'
/tmp/ccacyMlE.s:119: Error: no such 386 instruction: `fxsave'
make[3]: *** [crtfastmath.o] Error 1
make[3]: Leaving directory `/usr/src/gcc-4.2.1/host-i686-pc-linux-gnu/gcc'
make[2]: *** [all-stage1-gcc] Error 2
make[2]: Leaving directory `/usr/src/gcc-4.2.1'
make[1]: *** [stage1-bubble] Error 2
make[1]: Leaving directory `/usr/src/gcc-4.2.1'
make: *** [all] Error 2

I have the latest versions of make (3.81), binutils, coreutils and texinfo
installed. I am running Linux JDS 2003, which I have been told is SUSE Linux,
on an Athlon 1.6 GHz. It seems users on the Linux forums have limited knowledge
of this, as I have not received any assistance from them, so your help would be
really appreciated.

Thanks

Mandeep






Re: GCC "make" errors

2007-08-10 Thread Tim Prince

[EMAIL PROTECTED] wrote:

> Hi,
>
> I wanted to update my GCC compiler to 4.2.1 in order to install an updated
> version of the C libraries (glibc), but the build is failing. ./configure
> works fine, but when I type "make" it runs for a while and then gives the
> following errors:
>
> /tmp/ccacyMlE.s: Assembler messages:
> /tmp/ccacyMlE.s:72: Error: no such 386 instruction: `stmxcsr'
> /tmp/ccacyMlE.s:90: Error: no such 386 instruction: `ldmxcsr'
> /tmp/ccacyMlE.s:119: Error: no such 386 instruction: `fxsave'
> make[3]: *** [crtfastmath.o] Error 1
> make[3]: Leaving directory `/usr/src/gcc-4.2.1/host-i686-pc-linux-gnu/gcc'
> make[2]: *** [all-stage1-gcc] Error 2
> make[2]: Leaving directory `/usr/src/gcc-4.2.1'
> make[1]: *** [stage1-bubble] Error 2
> make[1]: Leaving directory `/usr/src/gcc-4.2.1'
> make: *** [all] Error 2
>
> I have the latest versions of make (3.81), binutils, coreutils and texinfo
> installed. I am running Linux JDS 2003, which I have been told is SUSE
> Linux, on an Athlon 1.6 GHz. It seems users on the Linux forums have limited
> knowledge of this, as I have not received any assistance from them, so your
> help would be really appreciated.

Either you don't have a binutils from the last 8 years, or you have 
somehow crossed up your march= options, which you didn't divulge.


Re: [RFC] Migrate pointers to members to the middle end

2007-08-10 Thread Tom Tromey
> "Dan" == Daniel Berlin <[EMAIL PROTECTED]> writes:

Dan> Just to be clear, we *already* have the class hierarchies in the
Dan> middle end.

Dan> They have been there for a few years now :)

Good point, thanks.

I don't think that is enough though, because I don't think the BINFO
slots mean the same thing in g++ and gcj.

Anyway, I don't want to derail this conversation.  If we really want
to strength reduce interface dispatch to virtual dispatch in LTO then
we'll need to find some relatively language neutral way to express
that.

Tom


Re: GCC 4.3.0 Status Report (2007-08-09)

2007-08-10 Thread Diego Novillo
On 8/9/07 6:19 PM, Mark Mitchell wrote:

> Are there any folks out there who have projects for Stage 1 or Stage 2
> that they are having trouble getting reviewed?  Any comments
> re. timing for Stage 3?

Zadeck has the parloop branch patches, which I've been reviewing.  I am
not sure how many other patches are left, but at least a couple.  Zdenek,
are the remaining patches submitted already?  I have one in my review
list, but I don't know if there are others.  I could go over them next week.


Re: [RFC] Migrate pointers to members to the middle end

2007-08-10 Thread Michael Matz

Hi,

On Thu, 9 Aug 2007, Tom Tromey wrote:


> Michael> Yes, devirtualization.  But I wonder if you really need class
> Michael> hierarchies for this (actually I'm fairly sure you don't).
>
> However, I'm not sure I agree with the above assertion.  Specifically,
> for Java I think it is sometimes possible to strength reduce interface
> calls to virtual calls, but I don't see how this could be done without
> class hierarchy information.


Okay, I suppose there are transformations that could make use of class 
hierarchies.  Luckily we do have that via the BINFO machinery already.



Ciao,
Michael.


RE: Very Fast: Directly Coded Lexical Analyzer

2007-08-10 Thread Dave Korn
On 10 August 2007 12:49, Robert Dewar wrote:

On 01 June 2007 11:27, Ronny Peine wrote:

>> Hi,
>> 
>> my question is, why not use the element construction algorithm?

> To me, very fast (millions of lines a second) lexical analyzers are
> trivial to write by hand,


  I think you need one to lex the dates in the old back-dated emails in your
mailbox for you! :-)


cheers,
  DaveK
-- 
Can't think of a witty .sigline today



reload question

2007-08-10 Thread Pat Haugen

I'm looking into a few cases where we're still getting the base/index
operand ordering wrong on PowerPC for an indexed load/store instruction,
even after the PTR_PLUS merge and fix for PR28690.  One of the cases I
observed was caused by reload picking r0 to use for the base reg opnd as a
result of spilling.  Since r0 is not a valid register for the base reg
position, we end up switching the order of the operands before emitting the
instruction which then causes the performance hit on Power6.  r0 is not a
valid BASE_REG_CLASS register, only INDEX_REG_CLASS, but the following
section of code from reload.c:find_reloads_address_1() dealing with
PLUS(REG REG) may try assigning the base reg opnd to the INDEX_REG class in
a couple situations.  This then allows r0 to be picked for the base reg
opnd.  Is this being done on purpose (going on assumption that operands are
commutative), such as to allow more opportunities for a successful
allocation with reduced spill?  If it's not wise for me to modify this
code, possibly due to effect on other architectures, what are some other
options (maybe introduce a new HONOR_BASE_INDEX_ORDER target macro)?

else if (code0 == REG && code1 == REG)
  {
    if (REGNO_OK_FOR_INDEX_P (REGNO (op0))
        && regno_ok_for_base_p (REGNO (op1), mode, PLUS, REG))
      return 0;
    else if (REGNO_OK_FOR_INDEX_P (REGNO (op1))
             && regno_ok_for_base_p (REGNO (op0), mode, PLUS, REG))
      return 0;
    else if (regno_ok_for_base_p (REGNO (op1), mode, PLUS, REG))
      find_reloads_address_1 (mode, orig_op0, 1, PLUS, SCRATCH,
                              &XEXP (x, 0), opnum, type, ind_levels,
                              insn);
    else if (regno_ok_for_base_p (REGNO (op0), mode, PLUS, REG))
      find_reloads_address_1 (mode, orig_op1, 1, PLUS, SCRATCH,
                              &XEXP (x, 1), opnum, type, ind_levels,
                              insn);
    else if (REGNO_OK_FOR_INDEX_P (REGNO (op1)))
      find_reloads_address_1 (mode, orig_op0, 0, PLUS, REG,
                              &XEXP (x, 0), opnum, type, ind_levels,
                              insn);
    else if (REGNO_OK_FOR_INDEX_P (REGNO (op0)))
      find_reloads_address_1 (mode, orig_op1, 0, PLUS, REG,
                              &XEXP (x, 1), opnum, type, ind_levels,
                              insn);
    else
      {
        find_reloads_address_1 (mode, orig_op0, 1, PLUS, SCRATCH,
                                &XEXP (x, 0), opnum, type, ind_levels,
                                insn);
        find_reloads_address_1 (mode, orig_op1, 0, PLUS, REG,
                                &XEXP (x, 1), opnum, type, ind_levels,
                                insn);
      }
  }


I've also seen the same situation come up during register renaming
(regrename.c), but not too surprising since the code there says it's based
off find_reloads_address_1() and is coded similarly.
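
The cascade quoted above can be modelled as a small standalone function, which makes it easier to see when the first operand ends up in the index class. This is a toy model, not GCC code: `plus_reg_reg`, `ok_for_base`, `ok_for_index` and the `reload_class` enum are invented names, the predicates are a simplified PowerPC-like assumption (r0 is valid only as an index, r1..r31 as either), and a negative register number stands for an operand that is not yet in a usable hard register. The third argument of find_reloads_address_1 is read here as the context flag, 1 for index and 0 for base.

```c
#include <assert.h>

enum reload_class { RELOAD_NONE, RELOAD_BASE, RELOAD_INDEX };

/* Which class, if any, each operand of (plus (reg) (reg)) is reloaded into. */
struct decision {
    enum reload_class op0;
    enum reload_class op1;
};

/* Simplified PowerPC-like predicates (an assumption of this sketch). */
static int ok_for_index(int regno) { return regno >= 0 && regno <= 31; }
static int ok_for_base(int regno)  { return regno >= 1 && regno <= 31; }

/* Mirrors the branch order of the quoted code, one branch per case. */
struct decision plus_reg_reg(int op0, int op1)
{
    struct decision d = { RELOAD_NONE, RELOAD_NONE };

    if (ok_for_index(op0) && ok_for_base(op1))
        return d;                 /* accepted: op0 as index, op1 as base */
    else if (ok_for_index(op1) && ok_for_base(op0))
        return d;                 /* accepted: op1 as index, op0 as base */
    else if (ok_for_base(op1))
        d.op0 = RELOAD_INDEX;     /* op1 is the base, so op0 becomes index */
    else if (ok_for_base(op0))
        d.op1 = RELOAD_INDEX;
    else if (ok_for_index(op1))
        d.op0 = RELOAD_BASE;
    else if (ok_for_index(op0))
        d.op1 = RELOAD_BASE;
    else {
        d.op0 = RELOAD_INDEX;     /* neither operand usable: op0 -> index, */
        d.op1 = RELOAD_BASE;      /* op1 -> base                           */
    }
    return d;
}
```

In this model `plus_reg_reg(-1, 5)` and `plus_reg_reg(-1, -1)` both put the first operand in the index class, which is the class that admits r0. Whichever operand is already a valid base decides the roles, regardless of operand position.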


-Pat



Re: mips gcc -O1: Address exception error on store doubleword

2007-08-10 Thread Alex Gonzalez
Hi, trying to come up with a testcase we figured out what the problem could be.

When the optimizer is on and memcpy sees that it is copying a struct
with double words in it, it will assume that the struct starts on an 8
byte boundary and use double word loads and stores. This is a safe
assumption, as gcc will always ensure that structs containing doubles
start on an 8 byte boundary when the memory is mallocced.

However we managed to trick gcc by mallocing a large chunk of memory
and then assigning a pointer to a user data (unsigned int user[0])
without first ensuring that the user data was 8 byte aligned. Since
the structure does contain a double, this resulted in a crash in
memcpy.

The fix for this was to inform the compiler that this "void" pointer
should be 8 byte aligned by changing the "unsigned int  user[0]" to a
"unsigned long long user[0]". This will cause gcc to pad this entry
out to ensure that it starts on an 8 byte boundary.

Does this make sense?

Alex

On 8/9/07, Alex Gonzalez <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I'll try to come up with a short test.
>
> I have narrowed it a bit more. The PVAR structure contains a long long
> variable ( with a sizeof 8 and an alignof 8 for my architecture). If I
> take out the long long variable, the compiler uses sdl instructions
> instead of sd and the exception doesn't happen.
>
> Also, if I do
>
> static void varcopy(void *pvar1, void *pvar2)
>
> the compiler uses sdl and avoids the crash.
>
> I am compiling for n32 ABI, so the register size is 64bits.
>
> Any ideas?
>
> On 8/9/07, David Daney <[EMAIL PROTECTED]> wrote:
> > Alex Gonzalez wrote:
> > > Hi,
> > >
> > > I am seeing an address error exception caused by the gcc optimizer -O1.
> > >
> > > I have narrowed it down to the following function:
> > >
> > > static void varcopy(PVAR *pvar1, PVAR *pvar2) {
> > > memcpy(pvar1,pvar2,sizeof(PVAR));
> > > }
> > >
> > > Being the sizeof(PVAR) 160 bytes.
> > >
> > > The exception is caused on an sd instruction when the input is not
> > > aligned on a doubleword boundary.
> > >
> > > I was under the assumption that the compiler made sure that it doesn't
> > > store a doubleword that is not aligned on a doubleword boundary. Is
> > > this a bug in the optimizer?
> > >
> > > I am using a gcc mips64 cross-compiler,
> > >
> > > mips64-linux-gnu-gcc (GCC) 3.3-mips64linux-031001
> > >
> > > Has anyone experienced this problem before?
> > >
> > In order to investigate we would need a self contained test case (i.e.
> > the definition of PVAR must be included).  Also it would be nice if you
> > could try it on a current version of GCC (4.2.1 perhaps).
> >
> > David Daney
> >
>


gcc-4.3-20070810 is now available

2007-08-10 Thread gccadmin
Snapshot gcc-4.3-20070810 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.3-20070810/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.3 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/trunk revision 127352

You'll find:

gcc-4.3-20070810.tar.bz2            Complete GCC (includes all of below)

gcc-core-4.3-20070810.tar.bz2       C front end and core compiler

gcc-ada-4.3-20070810.tar.bz2        Ada front end and runtime

gcc-fortran-4.3-20070810.tar.bz2    Fortran front end and runtime

gcc-g++-4.3-20070810.tar.bz2        C++ front end and runtime

gcc-java-4.3-20070810.tar.bz2       Java front end and runtime

gcc-objc-4.3-20070810.tar.bz2       Objective-C front end and runtime

gcc-testsuite-4.3-20070810.tar.bz2  The GCC testsuite

Diffs from 4.3-20070803 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.3
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Re: reload question

2007-08-10 Thread Ian Lance Taylor
Pat Haugen <[EMAIL PROTECTED]> writes:

> I'm looking into a few cases where we're still getting the base/index
> operand ordering wrong on PowerPC for an indexed load/store instruction,
> even after the PTR_PLUS merge and fix for PR28690.  One of the cases I
> observed was caused by reload picking r0 to use for the base reg opnd as a
> result of spilling.  Since r0 is not a valid register for the base reg
> position, we end up switching the order of the operands before emitting the
> instruction which then causes the performance hit on Power6.  r0 is not a
> valid BASE_REG_CLASS register, only INDEX_REG_CLASS, but the following
> section of code from reload.c:find_reloads_address_1() dealing with
> PLUS(REG REG) may try assigning the base reg opnd to the INDEX_REG class in
> a couple situations.  This then allows r0 to be picked for the base reg
> opnd.  Is this being done on purpose (going on assumption that operands are
> commutative), such as to allow more opportunities for a successful
> allocation with reduced spill?  If it's not wise for me to modify this
> code, possibly due to effect on other architectures, what are some other
> options (maybe introduce a new HONOR_BASE_INDEX_ORDER target macro)?

I'm not entirely clear: how do you propose changing the code?

Ian


RFC: Simplify rules for ctz/clz patterns and RTL

2007-08-10 Thread Zack Weinberg

During development of the patch I just posted for double-word clz, I
went through all the back ends and audited their use of the bit-scan
named patterns and RTL.  It appears to me that our current handling of
C[LT]Z_DEFINED_VALUE_AT_ZERO is much more complicated than it needs to
be, and also that between my patch and Sandra's earlier patch for
synthetic ctz/ffs, we have an opportunity to delete a bunch of code from
the back ends.

In this message, I'll use the word "instruction" when I am talking about
an actual hardware operation on a particular architecture; the word
"pattern" when I am talking about a named define_insn or define_expand
in a machine description; and the word "expression" when I am talking
about RTL.  The word "port" refers to the GCC back-end for a particular
CPU architecture.

There are eleven ports that make use of a clz instruction.  That use is
not necessarily in a clz pattern or with clz expressions - some only
define ffs patterns, and some use UNSPECs.  This is mostly irrelevant to
what I want to talk about, though.

  alpha arm i386 m68k mips rs6000 s390 score sh sparc xtensa

Of these, the majority have instructions that, when the input is zero,
write to the output a value equal to the number of bits in the input
(i.e. GET_MODE_BITSIZE of the mode of the input).  I'll refer to this as
canonical behavior.  Furthermore, these ports set
CLZ_DEFINED_VALUE_AT_ZERO to reflect that fact.

  alpha arm m68k mips rs6000 s390 xtensa
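
As a concrete reading of "canonical behavior", here is a portable sketch of a 32-bit count-leading-zeros that returns the bit width of its input for a zero argument. The function name `clz32_canonical` is made up for illustration, and the loop is deliberately naive; it models the semantics only and says nothing about how any of the listed ports implement the instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Count leading zero bits of x.  For x == 0 the result is 32, the number
   of bits in the input, which is the "canonical" value at zero. */
unsigned clz32_canonical(uint32_t x)
{
    unsigned n = 0;

    if (x == 0)
        return 32;
    while ((x & UINT32_C(0x80000000)) == 0) {
        x <<= 1;
        n++;
    }
    return n;
}
```

By contrast, GCC's `__builtin_clz` is documented as undefined for a zero argument, so portable callers need an explicit zero check like the one above.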

The score, sh and sparc instructions may or may not display canonical
behavior; their ports do not define CLZ_DEFINED_VALUE_AT_ZERO and I was
not able to find documentation of the relevant instruction.

i386, as is well known, has a clz instruction that does not write a
predictable value to the output when the input is zero, and so correctly
does not define CLZ_DEFINED_VALUE_AT_ZERO.  (Actually, when TARGET_ABM
is true, we are using a new instruction that *does* display canonical
behavior, and my aforementioned patch sets CLZ_DEFINED_VALUE_AT_ZERO to
reflect that; but again this is mostly irrelevant.)

No port needs CLZ_DEFINED_VALUE_AT_ZERO to be a tristate.  Either both
or neither of the clz pattern and the clz expression produce a defined
value at zero.

No port defines CLZ_DEFINED_VALUE_AT_ZERO to set the 'val' argument to
anything other than GET_MODE_BITSIZE (mode).  [Some of them hardcode the
constant instead of using that expression.]



There are two ports that make use of a ctz instruction:

  alpha i386

alpha's instruction displays canonical behavior; i386's instruction does
not write a predictable value to the output when the input is zero
(TARGET_ABM does not help here).  Both ports have correct definitions or
non-definitions of CTZ_DEFINED_VALUE_AT_ZERO.

In addition, four ports define ctz patterns that expand to
multi-instruction sequences.

  arm ia64 rs6000 xtensa

Of these, all except ia64 are presently redundant with the generic
expander Sandra added to optabs.c.  ia64 generates a different sequence
involving a popcount instruction; it would be easy enough to add that to
optabs.c.

Not all four of these ports define CTZ_DEFINED_VALUE_AT_ZERO, though all
could.  Those that do define it do so correctly.

No port needs CTZ_DEFINED_VALUE_AT_ZERO to be a tristate.  There are
three cases: both the pattern and the expression have a defined value at
zero (alpha); neither the pattern nor the expression has a defined value
at zero (i386); the pattern has a defined value at zero and the
expression is never emitted so its value at zero is moot (arm, rs6000,
xtensa, ia64).

If optabs.c were taught to synthesize ctz in terms of popcount, the arm,
rs6000, xtensa, and ia64 definitions of ctz patterns could all be
removed.  There would then be no port that defined
CTZ_DEFINED_VALUE_AT_ZERO to set 'val' to anything other than
GET_MODE_BITSIZE (mode).  [My patch removes the arm ctz pattern.  rs6000
and xtensa could be removed now.]
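
One well-known way to express ctz in terms of popcount is ctz(x) = popcount((x & -x) - 1). The sketch below shows that identity in portable C; it is an illustration only, and the exact sequence ia64 emits or that optabs.c would use is not stated in this message. The names `popcount32` and `ctz32_via_popcount` are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Number of set bits in x (simple loop; real ports use one instruction). */
unsigned popcount32(uint32_t x)
{
    unsigned n = 0;

    while (x != 0) {
        x &= x - 1;             /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* x & -x isolates the lowest set bit; subtracting 1 turns it into a mask
   of exactly the trailing zero positions, which popcount then counts. */
unsigned ctz32_via_popcount(uint32_t x)
{
    uint32_t lowest = x & (uint32_t)(0u - x);

    return popcount32(lowest - 1u);
}
```

For x == 0 the mask becomes all ones, so the result is 32, i.e. the bit width of the input, which matches the canonical value at zero without any special case.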



There is no port that makes use of an ffs instruction.  However, there
are nine architectures that define ffs patterns.

  alpha arm i386 ia64 rs6000 score sh sparc xtensa

All except ia64's are redundant with optabs.c after Sandra's patch plus
my patch.  ia64's would be redundant if the aforementioned popcount
sequence were added to optabs.c.

There is no port that uses the ffs expression.  ffs always has a defined
value at zero, so there is no FFS_DEFINED_VALUE_AT_ZERO macro nor any
need for one.
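
For reference, the semantics of ffs can be sketched as follows: the result is the 1-based position of the least significant set bit, and zero for a zero input, which is why no FFS_DEFINED_VALUE_AT_ZERO is needed. The name `ffs32` is made up for this example, and the loop is a plain model of the semantics rather than what any port or optabs.c generates.

```c
#include <assert.h>
#include <stdint.h>

/* Find first set: 1-based index of the lowest set bit, or 0 if x == 0.
   For nonzero x this equals ctz(x) + 1. */
int ffs32(uint32_t x)
{
    int pos = 1;

    if (x == 0)
        return 0;
    while ((x & 1u) == 0) {
        x >>= 1;
        pos++;
    }
    return pos;
}
```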



The machine-independent uses of C[LT]Z_DEFINED_VALUE_AT_ZERO are quite
limited:

 * builtins.c (fold_builtin_bitop): Uses them to determine the value of
 __builtin_clz* and __builtin_ctz* for a zero argument.  Interestingly,
 if the macros are false for a given mode, it folds the builtins as if
 they displayed canonical behavior.

 * optabs.c: Uses them in strategies for expanding ctz and ffs.

 * rtlanal.c (nonzero_bits1): Uses them to decide what bits can be
 nonzero in the result of a clz or ctz expression.

 * simplify-rtx.