Re: Use of FLAGS_REGNUM clashes with generates insn

2011-09-23 Thread Joern Rennecke

Quoting "Paulo J. Matos":


My addition instruction sets all the flags. So I have:


This is annoying, but can be handled.  Been there, done that.
dse.c needs a small patch, which I intend to submit sometime in the future.


And all my (define_insn "*mov..." are tagged with a (clobber (reg:CC
RCC)). This generates all kinds of trouble since GCC generates moves
internally without the clobber that fail to match.


I don't think that can be overcome without cc0.  Unless you want to
hide your flags register altogether.


Question on _GLIBCXX_HOSTED macro libstdc++ and libsupc++

2011-09-23 Thread Amker.Cheng
Hi,

In libstdc++-v3/libsupc++/eh_term_handler.cc, a comment says that by
default the demangler and related code are pulled in, depending on
whether _GLIBCXX_HOSTED is defined. The demangler and the exception
termination handler are really big, especially for an embedded system.

Secondly, _GLIBCXX_HOSTED is now defined if --enable-hosted-libstdcxx
is given (which it is by default).
This option also controls whether libstdc++.a itself is built for the target system.

So, for an embedded system, how can I get the earlier "silent
death" handler while _GLIBCXX_HOSTED is defined and libstdc++ is
still built?

Any suggestions? Thanks in advance.
FYI, all of the above concerns a cross-toolchain.

-- 
Best Regards.


Re: Question on _GLIBCXX_HOSTED macro libstdc++ and libsupc++

2011-09-23 Thread Jonathan Wakely
On 23 September 2011 09:14, Amker.Cheng wrote:
> Hi,
>
> In libstdc++-v3/libsupc++/eh_term_handler.cc, it says by default the
> demangler things are pulled in,
> according to whether _GLIBCXX_HOSTED is defined. the demangler
> exception terminating handler
> are really big, especially for embedded system.
>
> Secondly, _GLIBCXX_HOSTED is now defined if --enable-hosted-libstdcxx
> is given(by default it is).
> This option also controls whether libstdc++.a itself is built for target 
> system.
>
> So, for an embedded system, how could I provide the earlier "silent
> death" handler by defining _GLIBCXX_HOSTED,
> also with libstdc++ built?
>
> Any suggestion? Thanks in advance.
> FYI, all above are talking about cross-toolchain.

(Any reason this wasn't sent to the libstdc++ list?)

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43852 proposes a "quiet
mode" which would reduce code size by disabling some of the code in
eh_term_handler.cc and pure.cc - would that do what you want?

I've not had time to do anything about it, but I think Sebastian
(CC'd) has a copyright assignment in place now, and he's provided a
patch implementing it.


Re: Question on _GLIBCXX_HOSTED macro libstdc++ and libsupc++

2011-09-23 Thread Amker.Cheng
> (Any reason this wasn't sent to the libstdc++ list?)
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43852 proposes a "quiet
> mode" which would reduce code size by disabling some of the code in
> eh_term_handler.cc and pure.cc - would that do what you want?
>
> I've not had time to do anything about it, but I think Sebastian
> (CC'd) has a copyright assignment in place now, and he's provided a
> patch implementing it.
>
Sorry for missing the list; CC'd now.

It is exactly what I meant, thanks very much.

-- 
Best Regards.


Re: Volatile qualification on pointer and data

2011-09-23 Thread Paulo J. Matos

On 22/09/11 22:15, Richard Guenther wrote:


Btw, I think this is an old bug that has been resolved.  Did you make sure to
test a recent 4.6 branch snapshot or svn head?



Should have tested git head. Compiling git head now to check the current 
status of this issue.



--
PMatos



Re: Volatile qualification on pointer and data

2011-09-23 Thread Paulo J. Matos

On 23/09/11 12:33, Paulo J. Matos wrote:

On 22/09/11 22:15, Richard Guenther wrote:


Btw, I think this is an old bug that has been resolved. Did you make
sure to
test a recent 4.6 branch snapshot or svn head?



Should have tested git head. Compiling git head now to check the current
status of this issue.




Git head 36181f98f doesn't compile (x86_64, --enable-checking=all, GCC 
4.5.2):
gcc -c   -g -fkeep-inline-functions -DIN_GCC   -W -Wall -Wwrite-strings 
-Wcast-qual -Wstrict-prototypes -Wmissing-prototypes 
-Wmissing-format-attribute -pedantic -Wno-long-long -Wno-variadic-macros 
-Wno-overlength-strings -Wold-style-definition -Wc++-compat -fno-common 
 -DHAVE_CONFIG_H -I. -I. -I../../../repositories/gcc/gcc 
-I../../../repositories/gcc/gcc/. 
-I../../../repositories/gcc/gcc/../include 
-I../../../repositories/gcc/gcc/../libcpp/include 
-I../../../repositories/gcc/gcc/../libdecnumber 
-I../../../repositories/gcc/gcc/../libdecnumber/bid -I../libdecnumber 
 ../../../repositories/gcc/gcc/fold-const.c -o fold-const.o
../../../repositories/gcc/gcc/fold-const.c: In function 
‘fold_overflow_warning’:
../../../repositories/gcc/gcc/fold-const.c:326:5: warning: format not a 
string literal and no format arguments
../../../repositories/gcc/gcc/fold-const.c: In function 
‘fold_checksum_tree’:
../../../repositories/gcc/gcc/fold-const.c:13803:3: error: invalid 
application of ‘sizeof’ to incomplete type ‘struct tree_type’


--
PMatos



Re: Use of FLAGS_REGNUM clashes with generates insn

2011-09-23 Thread Paulo J. Matos

On 23/09/11 08:21, Joern Rennecke wrote:

Quoting "Paulo J. Matos":


My addition instruction sets all the flags. So I have:


This is annoying, but can be handled. Been there, done that.
dse.c needs a small patch, which I intend to submit sometime in the future.



Ok. Actually I was quite happy with my solution too which avoided having 
to change the core. However, it was not heavily tested.



And all my (define_insn "*mov..." are tagged with a (clobber (reg:CC
RCC)). This generates all kinds of trouble since GCC generates moves
internally without the clobber that fail to match.


I don't think that can be overcome without cc0. Unless you want to
hide your flags register altogether.



That's seriously annoying. The idea was to ditch cc0 and explicitly
represent CC in a register to perform optimizations like splitting add
and addc for a double word addition. If hiding my flags register means
going back to cc0, then that seems to be the only way to go unless I
get this to work somehow. If you have anything else in mind to get it to
work, let me know.


What I currently have in mind is a backend macro listing all the moves
that clobber CC_REG; then, whenever GCC generates a move, it queries the
macro to find out whether the move requires the clobber and emits it if
so. However, I am unsure how deep the rabbit hole goes.


--
PMatos



Re: RFC: Improving support for known testsuite failures

2011-09-23 Thread Diego Novillo
On Thu, Sep 22, 2011 at 20:06, Hans-Peter Nilsson  wrote:
> On Thu, 8 Sep 2011, Diego Novillo wrote:
>
>> On Thu, Sep 8, 2011 at 04:31, Richard Guenther
>>  wrote:
>>
>> > I think it would be more useful to have a script parse gcc-testresults@
>> > postings from the various autotesters and produce a nice webpage
>> > with revisions and known FAIL/XPASSes for the target triplets that
>> > are tested.
>>
>> Sure, though that describes a different tool.  I'm after a tool that
>> will 'exit 0' if the testsuite finished with nominal results.
>
> Not to stop you from (partly) reinventing the wheel, but that's
> pretty much what contrib/regression/btest-gcc.sh already does,
> though you have to feed it a baseline a set of processed .sum
> files which could (for a calling script or a modified
> btest-gcc.sh) live in, say, contrib/target-results/.
> It handles "duplicate" test names by marking it as failing if
> any of them has failed.  Works good enough.

Yeah, I actually considered using it by extracting the actual .sum
file processing out of it (I was not interested in it running the
build nor the tests).

However, I also needed to add support for marking flaky tests and
putting an expiration date on failures.  Additionally, I needed
versioned failure manifests, and I could not justify storing in SVN
multiple directories with 12MB worth of .sum files in them.

The small manifest file also has the local advantage of serving as
release documentation for what we expect to fail and why.
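
The check described above could be sketched roughly as follows. This is
only an illustration under stated assumptions, not the actual tool: the
struct layout, the YYYYMMDD expiry encoding, the function names and the
rule that a non-flaky known failure must still fail are all hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical manifest entry describing one known failure.  */
struct known_failure {
  const char *test;   /* test name as it appears in the .sum file */
  int flaky;          /* nonzero: the test may either pass or fail */
  long expires;       /* YYYYMMDD; 0 means the entry never expires */
};

/* Return the manifest entry for TEST, or NULL if it is not listed.  */
static const struct known_failure *
lookup (const struct known_failure *m, size_t n, const char *test)
{
  for (size_t i = 0; i < n; i++)
    if (strcmp (m[i].test, test) == 0)
      return &m[i];
  return NULL;
}

/* Return nonzero if TEST is among the N_FAILED observed failures.  */
static int
did_fail (const char *const *failed, size_t n_failed, const char *test)
{
  for (size_t i = 0; i < n_failed; i++)
    if (strcmp (failed[i], test) == 0)
      return 1;
  return 0;
}

/* Return 0 if the results are nominal, 1 otherwise.  Assumed rules:
   every observed failure needs an unexpired manifest entry, and every
   non-flaky manifest entry must actually have failed.  */
static int
check_failures (const struct known_failure *m, size_t n,
                const char *const *failed, size_t n_failed, long today)
{
  int status = 0;

  for (size_t i = 0; i < n_failed; i++)
    {
      const struct known_failure *k = lookup (m, n, failed[i]);
      if (k == NULL)
        {
          printf ("unexpected failure: %s\n", failed[i]);
          status = 1;
        }
      else if (k->expires != 0 && today > k->expires)
        {
          printf ("expired manifest entry: %s\n", failed[i]);
          status = 1;
        }
    }

  for (size_t i = 0; i < n; i++)
    if (!m[i].flaky && !did_fail (failed, n_failed, m[i].test))
      {
        printf ("known failure no longer fails: %s\n", m[i].test);
        status = 1;
      }

  return status;
}
```

A driver would parse the .sum files into the list of failed test names
and use the return value of check_failures as its exit status.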


Diego.


Re: Volatile qualification on pointer and data

2011-09-23 Thread Paulo J. Matos

On 22/09/11 22:15, Richard Guenther wrote:


Btw, I think this is an old bug that has been resolved.  Did you make sure to
test a recent 4.6 branch snapshot or svn head?



My hopes were high but unfortunately it is not fixed yet.
git head 36181f98 still generates the same unexpected code.

Cheers,
--
PMatos



Re: Use of FLAGS_REGNUM clashes with generates insn

2011-09-23 Thread amylaar

Quoting "Paulo J. Matos":


That's seriously annoying. The idea was to ditch cc0 and explicitly
represent CC in a register to perform optimizations like splitting add
and addc for a double word addition. If by hiding my register flags
means going back to cc0, then it seems that the only way to go unless I
get it to work somehow. If you having anything else in mind to get it
to work let me know.


Hiding the flags register would mean it is not represented in the rtl at
all.  You can have combined compare-branch instructions.
Of course, going that route would mean that the model you present to
GCC is even further from the hardware than one that uses cc0.


What I currently have in mind is to have a backend macro listing all
the move for which a move clobber CC_REG, then whenever GCC generates a
move, it queries the macro to know if the move requires clobbering and
emits the clobber if required. However, I am unsure how deep the rabbit
hole goes.


Oh, so you do have variants that can do without the clobber.
If you can make all the reloads without introducing explicit flag
clobbers, then it should work.
But you can't just pull a flag clobber out of thin air.  You should
have some way to generate valid code when the flags register is
unavailable / must be saved.  Then you can use peephole2 to add
flag clobbers where the flags register is available.

Or you can use machine_dependent_reorg or another machine-specific pass
inserted with the pass manager to rewrite clobber-free instructions into
ones that have a hardware equivalent; but you must make sure that your
data flow remains sound in the process.


Re: RFC: DWARF Extensions for Separate Debug Info Files ("Fission")

2011-09-23 Thread Jan Kratochvil
On Fri, 23 Sep 2011 02:21:44 +0200, Cary Coutant wrote:
> * .debug_pubtypes - Public types for use in building the
>   .gdb_index section at link time. This section will have an
>   extended format to allow it to represent both types in the
>   .debug_dwo_info section and type units in .debug_types.
^^^
= .dwo_info , maybe both .debug_info and .dwo_info


> * .dwo_abbrev - Defines the abbreviation codes used by the
>   .debug_dwo_info section.
^^^
= .dwo_info


I find this .dwo_* setup is great for rapid development rebuilds but it should
remain optional as the currently used DWARF final separate .debug info file is
smaller than all the .dwo files together.  In the case of the final linked
.debug builds (rpm/deb/...) one does not consider the build speed as important.
It probably does not make sense to merge + convert .dwo files back to a single
.debug file for the rpm/deb/... build performance reasons.


Thanks,
Jan


Re: Use of FLAGS_REGNUM clashes with generates insn

2011-09-23 Thread Paulo J. Matos
On Fri, 23 Sep 2011 09:30:48 -0400, amylaar wrote:

> Hiding the flags register would mean it is not represented in the rtl at
> all.  You can have combined compare-branch instructions. Of course,
> going that route would mean that the model you present to GCC is even
> further from the hardware than one that uses cc0.
>

Got it! It seems that would go against the whole point of replacing
cc0 with CC_REGNUM in my specific case. Oh well...
 
>> What I currently have in mind is to have a backend macro listing all
>> the move for which a move clobber CC_REG, then whenever GCC generates a
>> move, it queries the macro to know if the move requires clobbering and
>> emits the clobber if required. However, I am unsure how deep the rabbit
>> hole goes.
> 
> Oh, so you do have variants that can do without the clobber. 

Actually I don't... My explanation was supposed to be referring to a 
general solution. In my case, the macro would list all moves since all 
moves clobber CC.

> If you can
> make all the reloads without introducing explicit flag clobbers, that it
> should work.

Unfortunately I can't.

> But you can't just pull a flag clobber out of thin air.  

Understood.

> You should have
> some way to generate valid code when the flags register is unavailable /
> must be saved.  Then you can use peephole2 to add flag clobbers where
> the flags register is available.
> 
> Or you can use machine_dependent_reorg or another machine-specific pass
> inserted with the pass manager to rewrite clobber-free instructions into
> ones that have a hardware equivalent; but you must make sure that your
> data flow remains sound in the process.

I think your last suggestion, a pass that rewrites the clobber-free
instructions into ones with a hardware equivalent, is the one to go for
in my case.

Thanks for the suggestions,

-- 
PMatos



misbehaviour with md5_process_bytes and maybe in optimization

2011-09-23 Thread Pierre Vittet
Hello,

I recently asked for some help as I got a problem when using
md5_process_bytes (in libiberty/md5.c):
http://gcc.gnu.org/ml/gcc-help/2011-09/msg00126.html,
http://gcc.gnu.org/ml/gcc-help/2011-09/msg00127.html and it appears that
there is a bug in md5_process_bytes.

The bug can lead to a miscomputed MD5 result.

It took me some time to make the bug reproducible, but I was finally able
to do so. It only appears in a very particular situation.
I have written a small GCC plugin that reproduces it (see
attachment).
The bad news is that the bug only appears when using libiberty compiled
with -g -O0 (it works well with -O2). That is worrying, because it could
mean there is another bug in an optimization pass.

I have attached a README which details how to use the plugin and
explains the bug. I have tried to explain it as well as possible (and I
apologize for my very bad English).

The bug appears when:
1) We use libiberty compiled with -O0.
2) We first call md5_process_bytes with a buffer of less than 64 bytes (we
call its size len1).
3) We then call md5_process_bytes again with a buffer whose size len2 is
such that:
len2 > 127 + 65 (so the test on line 228 of md5.c is true)
128 - len1 is not a multiple of __alignof__ (md5_uint32) (so the
condition on line 238 is true)
len2 - (128 - len1) = Mul64, with Mul64 % 64 = 0 (so the loop on
line 239 exits with len = 64; this leads to the bug because, on line 249,
(len & ~63) = 64 and we shift the buffer without processing the data).
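
To make the arithmetic in the last condition concrete, here is a small
sketch of the two mask expressions involved. The helper names are made up
for illustration and do not exist in md5.c.

```c
#include <stddef.h>

/* Bytes covered by whole 64-byte blocks: the value of `len & ~63'.  */
static size_t
whole_blocks (size_t len)
{
  return len & ~(size_t) 63;
}

/* Bytes left over after the whole blocks: the value of `len & 63'.  */
static size_t
tail_bytes (size_t len)
{
  return len & 63;
}
```

With len = 64, whole_blocks returns 64 and tail_bytes returns 0, so the
pointer is advanced by a full block and nothing is left to be copied into
the internal buffer. For any len below 64, whole_blocks returns 0 and the
pointer does not move.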


Can you reproduce the bug? Is there any useful information I
can add? Should I contact somebody from libiberty? (I don't know the
status of this library: is it part of GCC or of another project?)

I already sent a patch correcting this issue (it does not address the
fact that we don't get the bug with an optimised libiberty):
http://gcc.gnu.org/ml/gcc-patches/2011-09/msg01098.html. It has not been
reviewed; could someone review it?

Thanks!

Pierre Vittet


md5sum_plugin.tar.gz
Description: application/gzip


Re: misbehaviour with md5_process_bytes and maybe in optimization

2011-09-23 Thread Ian Lance Taylor
Pierre Vittet  writes:

> The bug appears when:
>   1) We use libiberty compiled with -O0
>   2) We first call md5_process_bytes with a less than 64 bytes buffer (we
> call his size len1).
>   3) We make a new call of md5_process_bytes with a buffer which has a
> size len2 such as:
>   len2 > 127 + 65 (so test in line 228 of md5.c will be true)
>   128 -len1 != Mulint with Mulint %  __alignof__ (md5_uint32) != 0 (so
> condition on line 238 is true)
>   len2 - (128 - len1) = Mul64 and Mul64 such as Mul %64=0 (so the loop of
> line 239 is broken with len = 64, this leads to the bug as, line 249,
> (len & ~63) = 64 and we shift the buffer without processing the data).

The line numbers you mention do not correspond to any version of
libiberty/md5.c that I can see.  Can you list the exact line for each
line number you mention, so that your explanation is easier to follow?
Thanks.

Ian


Re: passing arguments to gcc build in eclipse

2011-09-23 Thread pankajsejwal

OK, sorry. Thanks for replying!

Andrew Haley wrote:
> 
> On 09/16/2011 11:30 AM, pankajsejwal wrote:
>>
>> I have build gcc and imported it on eclipse and started to debug it from
>> main
>> but after a few steps it stops and sends "malloc.c" not found error and
>> asks
>> to give a source path to it.
>> I believe the problem is because of the arguments that it requires to
>> proceed for example "" as gcc takes some arguments to work on
>> in
>> terminal.
>>
>> Can someone please tell me the error I am facing and it i am correct can
>> u
>> tell me how to pass arguments to the built code that it can recognize it
>> as
>> a .C file.
> 
> This is not an appropriate message for gcc@gcc.gnu.org, which is
> only about the development of gcc itself.
> 
> Most of us don't use Eclipse.  I think you'd be much better advised
> to direct this to an Eclipse-specific list, where the experts will be.
> 
> Andrew.
> 
> 




Wrong documentation of TARGET_ADDR_SPACE_SUBSET_P

2011-09-23 Thread Bingfeng Mei
Hi, 
I notice the following description is different from how spu & m32c use it. 

In internal manual:

bool TARGET_ADDR_SPACE_SUBSET_P (addr_space_t superset, addr_space_t subset)   [Target Hook]
Define this to return whether the subset named address space is contained
within the superset named address space. Pointers to a named address space
that is a subset of another named address space will be converted
automatically without a cast if used together in arithmetic operations.
Pointers to a superset address space can be converted to pointers to a
subset address space via explicit casts.

In spu & m32c ports:
m32c_addr_space_subset_p (addr_space_t subset, addr_space_t superset)
spu_addr_space_subset_p (addr_space_t subset, addr_space_t superset)

I believe the document is wrong. The first argument is subset and the second
one is superset.  Should I submit a patch?
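
For illustration, a hook written with the argument order used by the two
ports might look like the sketch below. The typedef, the constants and
the function name are stand-ins chosen for this example (ADDR_SPACE_FAR
in particular is hypothetical); it is not copied from either port.

```c
#include <stdbool.h>

/* Stand-ins for GCC's own type and constants, for this sketch only.  */
typedef unsigned char addr_space_t;
#define ADDR_SPACE_GENERIC 0
#define ADDR_SPACE_FAR 1   /* hypothetical named address space */

/* Return true if SUBSET is contained in SUPERSET.  The subset comes
   first and the superset second, as in the spu and m32c prototypes.  */
static bool
example_addr_space_subset_p (addr_space_t subset, addr_space_t superset)
{
  /* Every address space is a subset of itself.  */
  if (subset == superset)
    return true;

  /* Assumption for this example: generic is contained in far.  */
  return subset == ADDR_SPACE_GENERIC && superset == ADDR_SPACE_FAR;
}
```

Swapping the two arguments changes the answer for any pair of distinct
spaces, which is why the order given in the manual matters.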

Cheers,
Bingfeng Mei



Re: RFC: DWARF Extensions for Separate Debug Info Files ("Fission")

2011-09-23 Thread Cary Coutant
>> * .debug_pubtypes - Public types for use in building the
>>   .gdb_index section at link time. This section will have an
>>   extended format to allow it to represent both types in the
>>   .debug_dwo_info section and type units in .debug_types.
>    ^^^
>    = .dwo_info , maybe both .debug_info and .dwo_info
>
>
>> * .dwo_abbrev - Defines the abbreviation codes used by the
>>   .debug_dwo_info section.
>    ^^^
>    = .dwo_info

Thanks, I've fixed the wiki page.

> I find this .dwo_* setup is great for rapid development rebuilds but it should
> remain optional as the currently used DWARF final separate .debug info file is
> smaller than all the .dwo files together.  In the case of the final linked
> .debug builds (rpm/deb/...) one does not consider the build speed as 
> important.
> It probably does not make sense to merge + convert .dwo files back to a single
> .debug file for the rpm/deb/... build performance reasons.

Yes, we'll definitely make this a compile-time option.

While I haven't finished designing the package format for collecting
all the .dwo files, I do plan on having the packaging tool do at least
duplicate type elimination to reduce the size of the package file.

-cary


Re: RFC: DWARF Extensions for Separate Debug Info Files ("Fission")

2011-09-23 Thread Cary Coutant
> The Apple approach has both the features of the Sun/HP implementation as well 
> as the ability to create a standalone debug info file.

Thanks for the clarifications. I based my comments on a description
you sent me a couple of years ago, and I apologize for any
oversimplifications I introduced.

> The compiler puts DWARF in the .o file, the linker adds some records in the 
> executable which help us to understand where files/function/symbols landed in 
> the final executable[1].

Did you intend to add a footnote?

>  If the user runs our gdb or lldb on one of these binaries, the debugger will 
> read the DWARF directly out of the .o files on the fly.  Because the linker 
> doesn't need to copy around/update/modify the DWARF, link times are very 
> fast.  If the developer decides to debug the program, no extra steps are 
> required - the debugger can be started up & used with the debug info still in 
> the .o files.

We're trying to achieve something very similar, but we have the
additional goal of separating the info from the .o files because of
our distributed build environment. I also wanted to attempt to
standardize the approach, instead of having each vendor go in separate
directions.

Thanks,

-cary


Re: misbehaviour with md5_process_bytes and maybe in optimization

2011-09-23 Thread Pierre Vittet

Thanks for your interest,

I just checked revision 179127 of GCC. The last revision of md5.c is
177700; it has not been changed for 6 weeks.

My file is the same as this one:
http://gcc.gnu.org/viewcvs/trunk/libiberty/md5.c?revision=177700&view=markup

In libiberty/md5.c, the function md5_process_bytes starts at line 203.

On 23/09/2011 17:13, Ian Lance Taylor wrote:
> Pierre Vittet  writes:
> 
>> The bug appears when:
>>  1) We use libiberty compiled with -O0
>>  2) We first call md5_process_bytes with a less than 64 bytes buffer (we
>> call his size len1).
>>  3) We make a new call of md5_process_bytes with a buffer which has a
>> size len2 such as:
>>  len2 > 127 + 65 (so test in line 228 of md5.c will be true)
line 228 is the following:if (len > 64)
>>  128 -len1 != Mulint with Mulint %  __alignof__ (md5_uint32) != 0 (so
>> condition on line 238 is true)
line 238 is the following: if (UNALIGNED_P (buffer))
>>  len2 - (128 - len1) = Mul64 and Mul64 such as Mul %64=0 (so the loop of
>> line 239 is broken with len = 64, this leads to the bug as, line 249,
>> (len & ~63) = 64 and we shift the buffer without processing the data).

line 239 is the following: while (len > 64)
line 249: buffer = (const void *) ((const char *) buffer + (len & ~63));
> 
> The line numbers you mention do not correspond to any version of
> libiberty/md5.c that I can see.  Can you list the exact line for each
> line number you mention, so that your explanation is easier to follow?
> Thanks.

I give about the same explanation in the README (which is in the
archive attached to my previous mail), but there I do not use line
numbers; I quote the code directly. It might be easier to try the plugin
with gdb, but that requires compiling libiberty.a with -O0.
> 
> Ian
> 






Re: misbehaviour with md5_process_bytes and maybe in optimization

2011-09-23 Thread Ian Lance Taylor
Pierre Vittet  writes:

> Thanks for your interest,
>
> I just checked revision 179127 of GCC. Last revision is 177700, it has
> not been change for 6 weeks.
>
> My file is the same as this one:
> http://gcc.gnu.org/viewcvs/trunk/libiberty/md5.c?revision=177700&view=markup
>
> in libiberty/md5.c, function md5_process_bytes start line 203.
>
> On 23/09/2011 17:13, Ian Lance Taylor wrote:
>> Pierre Vittet  writes:
>> 
>>> The bug appears when:
>>> 1) We use libiberty compiled with -O0
>>> 2) We first call md5_process_bytes with a less than 64 bytes buffer (we
>>> call his size len1).
>>> 3) We make a new call of md5_process_bytes with a buffer which has a
>>> size len2 such as:
>>> len2 > 127 + 65 (so test in line 228 of md5.c will be true)
> line 228 is the following:if (len > 64)
>>> 128 -len1 != Mulint with Mulint %  __alignof__ (md5_uint32) != 0 (so
>>> condition on line 238 is true)
> line 238 is the following: if (UNALIGNED_P (buffer))
>>> len2 - (128 - len1) = Mul64 and Mul64 such as Mul %64=0 (so the loop of
>>> line 239 is broken with len = 64, this leads to the bug as, line 249,
>>> (len & ~63) = 64 and we shift the buffer without processing the data).
>
> line 239 is the following: while (len > 64)
> line 249: buffer = (const void *) ((const char *) buffer + (len & ~63));
>> 
>> The line numbers you mention do not correspond to any version of
>> libiberty/md5.c that I can see.  Can you list the exact line for each
>> line number you mention, so that your explanation is easier to follow?
>> Thanks.
>
> I give about the same explanation in the README (which is in the
> attached archive of my previous mail) but I does not use line number but
> direct quote of the code. It mights be more easy to try the plugin with
> gdb but it needs to compile libiberty.a with -O0.

Thanks, I think I have it sorted out now.

It does not happen on x86 glibc-based systems at -O2 because at -O2
<string.h> #defines _STRING_ARCH_unaligned, so the problematic code is
not compiled or executed.

The error was introduced by this change:

2005-07-03  Steve Ellcey  

PR other/13906
* md5.c (md5_process_bytes): Check alignment.

Thanks for noticing this problem, analyzing it, and reporting it.

I committed this patch to mainline to fix the problem.  Bootstrapped on
x86_64-unknown-linux-gnu.

Ian


2011-09-23  Ian Lance Taylor  

* md5.c (md5_process_bytes): Correct handling of unaligned
buffer.


Index: md5.c
===================================================================
--- md5.c	(revision 179127)
+++ md5.c	(working copy)
@@ -1,6 +1,6 @@
 /* md5.c - Functions to compute MD5 message digest of files or memory blocks
    according to the definition of MD5 in RFC 1321 from April 1992.
-   Copyright (C) 1995, 1996 Free Software Foundation, Inc.
+   Copyright (C) 1995, 1996, 2011 Free Software Foundation, Inc.
 
    NOTE: This source is derived from an old version taken from the GNU C
    Library (glibc).
@@ -245,9 +245,11 @@ md5_process_bytes (const void *buffer, s
   }
   else
 #endif
-  md5_process_block (buffer, len & ~63, ctx);
-  buffer = (const void *) ((const char *) buffer + (len & ~63));
-  len &= 63;
+	{
+	  md5_process_block (buffer, len & ~63, ctx);
+	  buffer = (const void *) ((const char *) buffer + (len & ~63));
+	  len &= 63;
+	}
 }
 
   /* Move remaining bytes in internal buffer.  */
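
The effect of the missing braces can be modelled in isolation. The sketch
below is not the libiberty code: both function names are made up, and
each function only counts the bytes that would be either handed to
md5_process_block or left over for the internal buffer.

```c
#include <stddef.h>

/* Pre-patch shape: only the first statement after `else' belongs to
   it, so the length is masked on the unaligned path as well.  */
static size_t
accounted_prepatch (size_t len, int unaligned)
{
  size_t processed = 0;

  if (len > 64)
    {
      if (unaligned)
        while (len > 64)
          {
            processed += 64;   /* stands in for md5_process_block */
            len -= 64;
          }
      else
        processed += len & ~(size_t) 63;
      /* Runs on both paths; md5.c also advances the buffer here.  */
      len &= 63;
    }
  return processed + len;
}

/* Patched shape: the braces keep all of it on the aligned path.  */
static size_t
accounted_patched (size_t len, int unaligned)
{
  size_t processed = 0;

  if (len > 64)
    {
      if (unaligned)
        while (len > 64)
          {
            processed += 64;
            len -= 64;
          }
      else
        {
          processed += len & ~(size_t) 63;
          len &= 63;
        }
    }
  return processed + len;
}
```

For an unaligned input of 192 bytes the first function accounts for only
128 bytes, because the loop stops at len = 64 and the mask then discards
that last block. The second function accounts for all 192.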


Re: Incorrect optimized (-O2) linked list code with 4.3.2

2011-09-23 Thread Richard Guenther
On Mon, Sep 12, 2011 at 10:10 AM, pavan tc  wrote:
> Hi,
>
> I would like to know if there have been issues with optimized linked
> list code with GCC 4.3.2. [optimization flag: -O2]
>
> The following is the inlined code that has the problem:
>
> static inline void
> list_add_tail (struct list_head *new, struct list_head *head)
> {
>        new->next = head;
>        new->prev = head->prev;
>
>        new->prev->next = new;
>        new->next->prev = new;
> }
>
> The above code has been used in the loop as below:
>
>        pool = GF_CALLOC (count, padded_sizeof_type, gf_common_mt_long);
>        if (!pool) {
>                GF_FREE (mem_pool);
>                return NULL;
>        }
>
>        for (i = 0; i < count; i++) {
>                list = pool + (i * (padded_sizeof_type));
>                INIT_LIST_HEAD (list);
>                list_add_tail (list, &mem_pool->list);
>                
>        }
>
> '&mem_pool-> list' is used as the list head. mem_pool is a pointer to type :
> struct mem_pool {
>        struct list_head  list;
>        int               hot_count;
>        int               cold_count;
>        gf_lock_t         lock;
>        unsigned long     padded_sizeof_type;
>        void             *pool;
>        void             *pool_end;
>        int               real_sizeof_type;
> };
>
> 'list' is the new member being added to the tail of the list pointed to by 
> head.
> It is a pointer to type:
> struct list_head {
>        struct list_head *next;
>        struct list_head *prev;
> };
>
> The generated assembly for the loop (with the inlined list_add_tail())
> is as below:
>
>   40a1c:       e8 0f 03 fd ff         callq  10d30 <__gf_calloc@plt>
>   40a21:       48 85 c0               test   %rax,%rax
>   40a24:       48 89 c7               mov    %rax,%rdi
>   40a27:       0f 84 bf 00 00 00   je     40aec 
>   40a2d:       48 8b 73 08          mov    0x8(%rbx),%rsi
>   40a31:       4d 8d 44 24 01      lea    0x1(%r12),%r8
>   40a36:       31 c0                   xor    %eax,%eax
>   40a38:       b9 01 00 00 00     mov    $0x1,%ecx
>   40a3d:       0f 1f 00                nopl   (%rax)
>   40a40:       49 0f af c5            imul   %r13,%rax          # <=== loop start
>   40a44:       48 8d 04 07          lea    (%rdi,%rax,1),%rax
>   40a48:       48 89 18              mov    %rbx,(%rax)        # list->next = head
>   40a4b:       48 89 06              mov    %rax,(%rsi)        # head->prev->next = list
>   40a4e:       48 8b 10              mov    (%rax),%rdx        # rdx holds list->next
>   40a51:       48 89 70 08         mov    %rsi,0x8(%rax)     # list->prev = head->prev;
>   40a55:       48 89 42 08         mov    %rax,0x8(%rdx)     # list->next->prev = list
>   40a59:       48 89 c8              mov    %rcx,%rax
>   40a5c:       48 83 c1 01          add    $0x1,%rcx
>   40a60:       4c 39 c1               cmp    %r8,%rcx
>   40a63:       75 db                   jne    40a40 
>
> In the assembly above, %rbx  holds the address of 'head'.
> %rsi holds the value of head->prev. This is assigned outside the loop and the
> compiler classifies it as a loop invariant, which is where, I think,
> the problem is.
> This line of code should have been inside the loop.
>  - %rsi still holds the value of head->prev that was assigned
> outside the loop.
>
> The following experiments eliminate the problem:
>
> 1. Using 'volatile' on the address that 'head' points to.
> 2. Using a function call (logging calls, for example) inside the loop.
> 3. Using the direct libc calloc instead of the GF_CALLOC.
> [GF_CALLOC does some accounting when accounting is enabled. Calls vanilla
> libc calloc() otherwise].
>
> So, anything that necessitates a different usage of %rsi seems to be 
> correcting
> the behaviour.
>
> 4. Using gcc 4.4.3 [ The obvious solution would then be to use 4.4.3,
> but I would
> like to understand if this is a known problem with 4.3.2. Small
> programs written to
> emulate this problem do not exhibit the erroneous behaviour.]
>
> Please let me know if any more details about this behaviour are required.
> I'll be glad to provide them.

Use -fno-strict-aliasing.  Your code invokes undefined behavior.

> TIA,
> Pavan
>
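
For reference, the result the quoted loop is meant to produce can be
sketched with a typed array of nodes instead of a byte pool. This does
not reproduce the reported behaviour, and the thread does not say which
access breaks the aliasing rules; init_list_head, build_list and
list_length are names made up for this example.

```c
#include <stddef.h>

struct list_head {
  struct list_head *next;
  struct list_head *prev;
};

/* Make HEAD an empty circular list.  */
static void
init_list_head (struct list_head *head)
{
  head->next = head;
  head->prev = head;
}

/* Same body as the quoted list_add_tail.  */
static void
list_add_tail (struct list_head *new_node, struct list_head *head)
{
  new_node->next = head;
  new_node->prev = head->prev;

  new_node->prev->next = new_node;
  new_node->next->prev = new_node;
}

/* Append COUNT nodes to HEAD.  head->prev changes on every call, so it
   has to be read again in each iteration.  */
static void
build_list (struct list_head *head, struct list_head *nodes, size_t count)
{
  init_list_head (head);
  for (size_t i = 0; i < count; i++)
    {
      init_list_head (&nodes[i]);
      list_add_tail (&nodes[i], head);
    }
}

/* Count the nodes by walking the next pointers.  */
static size_t
list_length (const struct list_head *head)
{
  size_t n = 0;
  for (const struct list_head *p = head->next; p != head; p = p->next)
    n++;
  return n;
}
```

After build_list, head->prev points at the last node added and each
node's prev points at the node added before it. That is the chain the
hoisted head->prev in the generated code fails to maintain.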


Re: A case that PRE optimization hurts performance

2011-09-23 Thread Richard Guenther
On Fri, Sep 16, 2011 at 4:00 AM, Jiangning Liu  wrote:
> Hi Richard,
>
> I slightly changed the case to be like below,
>
> int f(char *t) {
>    int s=0;
>
>    while (*t && s != 1) {
>        switch (s) {
>        case 0:   /* path 1 */
>            s = 2;
>            break;
>        case 2:   /* path 2 */
>            s = 3; /* changed */
>            break;
>        default:  /* path 3 */
>            if (*t == '-')
>                s = 2;
>            break;
>        }
>        t++;
>    }
>
>    return s;
> }
>
> "-O2" is still worse than "-O2 -fno-tree-pre".
>
> "-O2 -fno-tree-pre" result is
>
> f:
>        pushl   %ebp
>        xorl    %eax, %eax
>        movl    %esp, %ebp
>        movl    8(%ebp), %edx
>        movzbl  (%edx), %ecx
>        jmp     .L14
>        .p2align 4,,7
>        .p2align 3
> .L5:
>        movl    $2, %eax
> .L7:
>        addl    $1, %edx
>        cmpl    $1, %eax
>        movzbl  (%edx), %ecx
>        je      .L3
> .L14:
>        testb   %cl, %cl
>        je      .L3
>        testl   %eax, %eax
>        je      .L5
>        cmpl    $2, %eax
>        .p2align 4,,5
>        je      .L17
>        cmpb    $45, %cl
>        .p2align 4,,5
>        je      .L5
>        addl    $1, %edx
>        cmpl    $1, %eax
>        movzbl  (%edx), %ecx
>        jne     .L14
>        .p2align 4,,7
>        .p2align 3
> .L3:
>        popl    %ebp
>        .p2align 4,,2
>        ret
>        .p2align 4,,7
>        .p2align 3
> .L17:
>        movb    $3, %al
>        .p2align 4,,3
>        jmp     .L7
>
> While "-O2" result is
>
> f:
>        pushl   %ebp
>        xorl    %eax, %eax
>        movl    %esp, %ebp
>        movl    8(%ebp), %edx
>        pushl   %ebx
>        movzbl  (%edx), %ecx
>        jmp     .L14
>        .p2align 4,,7
>        .p2align 3
> .L5:
>        movl    $1, %ebx
>        movl    $2, %eax
> .L7:
>        addl    $1, %edx
>        testb   %bl, %bl
>        movzbl  (%edx), %ecx
>        je      .L3
> .L14:
>        testb   %cl, %cl
>        je      .L3
>        testl   %eax, %eax
>        je      .L5
>        cmpl    $2, %eax
>        .p2align 4,,5
>        je      .L16
>        cmpb    $45, %cl
>        .p2align 4,,5
>        je      .L5
>        cmpl    $1, %eax
>        setne   %bl
>        addl    $1, %edx
>        testb   %bl, %bl
>        movzbl  (%edx), %ecx
>        jne     .L14
>        .p2align 4,,7
>        .p2align 3
> .L3:
>        popl    %ebx
>        popl    %ebp
>        ret
>        .p2align 4,,7
>        .p2align 3
> .L16:
>        movl    $1, %ebx
>        movb    $3, %al
>        jmp     .L7
>
> You may notice that register ebx is introduced, and some more instructions
> around ebx are generated as well. i.e.
>
>        setne   %bl
>        testb   %bl, %bl
>
> I agree with you that in theory PRE does the right thing to minimize the
> computation cost at the GIMPLE level. However, the cost of converting a
> comparison result to a bool value is not considered, so PRE actually makes
> the binary code worse. For this case, as I summarize below, "With PRE" is
> worse than "Without PRE" on all three paths for the same functionality:
>
> * Without PRE,
>
> Path1:
>        movl    $2, %eax
>        cmpl    $1, %eax
>        je      .L3
>
> Path2:
>        movb    $3, %al
>        cmpl    $1, %eax
>        je      .L3
>
> Path3:
>        cmpl    $1, %eax
>        jne     .L14
>
> * With PRE,
>
> Path1:
>        movl    $1, %ebx
>        movl    $2, %eax
>        testb   %bl, %bl
>        je      .L3
>
> Path2:
>        movl    $1, %ebx
>        movb    $3, %al
>        testb   %bl, %bl
>        je      .L3
>
> Path3:
>        cmpl    $1, %eax
>        setne   %bl
>        testb   %bl, %bl
>        jne     .L14
>
> Do you have any more thoughts?

It seems to me that with PRE all the testb %bl, %bl
should be evaluated at compile-time considering the
preceding movl $1, %ebx.  Am I missing something?

Richard.
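
For reference, here is a source-level approximation of what the "-O2"
assembly quoted below appears to do. It is a hand-written sketch, not actual
GIMPLE or compiler output: the names f_original, f_pre_like and keep_going
are invented here, and the parameter is const-qualified for convenience. The
flag keep_going plays the role of %ebx: on paths 1 and 2 it is set to a
known constant, and only on the default path is (s != 1) really computed.

```c
/* The modified test case from the mail, unchanged apart from the name
 * and the const qualifier. */
int f_original(const char *t)
{
    int s = 0;

    while (*t && s != 1) {
        switch (s) {
        case 0:   /* path 1 */
            s = 2;
            break;
        case 2:   /* path 2 */
            s = 3;
            break;
        default:  /* path 3 */
            if (*t == '-')
                s = 2;
            break;
        }
        t++;
    }
    return s;
}

/* Approximation of the loop after PRE: the exit test (s != 1) is cached
 * in keep_going on every path instead of being recomputed at the top. */
int f_pre_like(const char *t)
{
    int s = 0;
    int keep_going = 1;              /* s == 0 on entry, so s != 1 holds */

    while (*t && keep_going) {
        switch (s) {
        case 0:
            s = 2;
            keep_going = 1;          /* known constant: 2 != 1 */
            break;
        case 2:
            s = 3;
            keep_going = 1;          /* known constant: 3 != 1 */
            break;
        default:
            if (*t == '-') {
                s = 2;
                keep_going = 1;      /* known constant again */
            } else {
                keep_going = (s != 1);   /* the setne %bl in the assembly */
            }
            break;
        }
        t++;
    }
    return s;
}
```

Written this way, the test of keep_going directly after a constant store is
the redundancy in question: on those paths the branch outcome is already
known at compile time.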

> Thanks,
> -Jiangning
>
>> -Original Message-
>> From: Richard Guenther [mailto:richard.guent...@gmail.com]
>> Sent: Tuesday, August 02, 2011 5:23 PM
>> To: Jiangning Liu
>> Cc: gcc@gcc.gnu.org
>> Subject: Re: A case that PRE optimization hurts performance
>>
>> On Tue, Aug 2, 2011 at 4:37 AM, Jiangning Liu wrote:
>> > Hi,
>> >
>> > For the following simple test case, PRE optimization hoists computation
>> > (s!=1) into the default branch of the switch statement, and finally causes
>> > very poor code generation. This problem occurs in both X86 and ARM, and I
>> > believe it is also a problem for other targets.
>> >
>> > int f(char *t) {
>> >    int s=0;
>> >
>> >    while (*t && s != 1) {
>> >        switch (s) {
>> >        case 0:
>> >            s = 2;
>> >            break;
>> >        case 2:
>> >            s = 1;
>> >            break;
>> >        default:
>> >            if (*t == '-')
>> >                s = 1;
>> >            break;
>> >        }
>> >        t++;
>> >    }
>> >
>> >    return s;
>> > }
>> >
>

Re: RFC: DWARF Extensions for Separate Debug Info Files ("Fission")

2011-09-23 Thread Jason Molenda

On Sep 23, 2011, at 10:58 AM, Cary Coutant wrote:

>> The compiler puts DWARF in the .o file, the linker adds some records in the 
>> executable which help us to understand where files/function/symbols landed 
>> in the final executable[1].
> 
> Did you intend to add a footnote?

Yeah, I realized after I sent the email - it didn't seem interesting enough to 
warrant a separate followup.

The records that our linker puts in the executable are in the form of stabs 
entries.  There are a handful of stabs records created - file start, file end, 
function start, function end, symbol, pointer to a .o file, maybe one or two 
others.  We chose that format because it was trivial to support and we already 
had tools for stripping these records out of the executable once the dSYM had 
been created.

Once a dSYM has been created with all of the DWARF collected in a single file, 
our DWARF is parseable by any debug info consumer with minimal changes -- they 
need to know to look in a separate file for the DWARF from the main executable, 
but the format itself is unchanged.  Supporting the 
debug-information-in-.o-files is more involved, I don't know if any of the 
third-party debuggers on our platform work with it.


> We're trying to achieve something very similar, but we have the
> additional goal of separating the info from the .o files because of
> our distributed build environment. I also wanted to attempt to
> standardize the approach, instead of having each vendor go in separate
> directions.


Yeah, if your regular build environment involves distributed compilation, and 
the .o files need to be copied to a central system for the linker, then I can 
see why you're pursuing this approach.  For us, the most common usage is 
single-computer compilation & linking -- where the linker never pages in the 
debug info sections from the .o files so their size is not particularly important.

J


Re: [Dwarf-Discuss] RFC: DWARF Extensions for Separate Debug Info Files ("Fission")

2011-09-23 Thread John DelSignore
Hi Jason,

Jason Molenda wrote:
> On Sep 23, 2011, at 10:58 AM, Cary Coutant wrote:
> 
>>> The compiler puts DWARF in the .o file, the linker adds some records in the 
>>> executable which help us to understand where files/function/symbols landed 
>>> in the final executable[1].
>> Did you intend to add a footnote?
> 
> Yeah, I realized after I sent the email - it didn't seem interesting enough 
> to warrant a separate followup.
> 
> The records that our linker puts in the executable are in the form of stabs 
> entries.  There are a handful of stabs records created - file start, file 
> end, function start, function end, symbol, pointer to a .o file, maybe one or 
> two others.  We chose that format because it was trivial to support and we 
> already had tools for stripping these records out of the executable once the 
> dSYM had been created.

I don't remember the exact details, but the problem I recall with the Darwin 
scheme is that it builds an incomplete index in the Mach-O symbol table. IIRC, 
it was missing things that a user might want to look up by name in the debugger, 
like static functions or variables, and type names with external linkage. 
Without a reasonably complete index, the debugger can't know where to find the 
definitions of certain things, and that forces the user to navigate using other 
information, like source file name or global function definitions to force the 
debug information in the object to be read.

Of course, the current DWARF indexes (like pubnames/pubtypes) have the same 
problem, and some compilers do a really bad job at generating those sections. 
But at least when there's a single .debug_info section, the debugger can decide 
to ignore the indexes and "skim" the full debug information. The compilers on 
IRIX did a better job at generating indexes, so the debugger could find by-name 
static functions/objects.
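
To make the index-versus-skim distinction concrete, here is a deliberately
simplified sketch. Nothing in it is real DWARF or Mach-O parsing: struct
debug_entry, the in_index flag and all three function names are hypothetical
stand-ins. The point is only that an incomplete by-name index misses some
entries, and that a consumer with all the debug information in one place can
fall back to scanning everything.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-in for one debug-info entry. */
struct debug_entry {
    const char *name;
    int in_index;            /* 0 models an entry the by-name index omits */
    unsigned long offset;    /* where the full definition would live */
};

/* Fast path: consult only what the (possibly incomplete) index covers. */
const struct debug_entry *lookup_in_index(const struct debug_entry *entries,
                                          size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (entries[i].in_index && strcmp(entries[i].name, name) == 0)
            return &entries[i];
    return NULL;
}

/* Slow path: ignore the index and "skim" every entry. */
const struct debug_entry *lookup_by_skimming(const struct debug_entry *entries,
                                             size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(entries[i].name, name) == 0)
            return &entries[i];
    return NULL;
}

/* Try the index first, then fall back to the full scan. */
const struct debug_entry *lookup(const struct debug_entry *entries,
                                 size_t n, const char *name)
{
    const struct debug_entry *e = lookup_in_index(entries, n, name);
    return e ? e : lookup_by_skimming(entries, n, name);
}
```

With the debug information spread over many .o files, the fallback path is
the expensive part, since the consumer first has to know which object to
open; that is the gap a reasonably complete index is meant to close.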

> Once a dSYM has been created with all of the DWARF collected in a single 
> file, our DWARF is parseable by any debug info consumer with minimal changes 
> -- they need to know to look in a separate file for the DWARF from the main 
> executable, but the format itself is unchanged.  Supporting the 
> debug-information-in-.o-files is more involved, I don't know if any of the 
> third-party debuggers on our platform work with it.

TotalView supports debug information in .o files on Darwin, and has since day 
one. Perhaps you recall all those email exchanges you and I had several years 
back. It was a modest amount of work, given that we already supported debug 
information in .o files on the Sun and HP platforms.

I seem to recall one of the sore spots for us on Darwin was getting good 
address information for certain DWARF location operations, like DW_OP_addr. 
Fortran was particularly messy because some compilers didn't supply a linkage 
name attribute, so the debugger had to make several guesses at the name, and 
look things up by trial and error.

Cheers, John D.

>> We're trying to achieve something very similar, but we have the
>> additional goal of separating the info from the .o files because of
>> our distributed build environment. I also wanted to attempt to
>> standardize the approach, instead of having each vendor go in separate
>> directions.
> 
> 
> Yeah, if your regular build environment involves distributed compilation, and 
> the .o files need to be copied to a central system for the linker, then I can 
> see why you're pursuing this approach.  For us, the most common usage is 
> single-computer compilation & linking -- where the linker never pages in the 
> debug info sections from the .o files so their size is not particularly 
> important.
> 
> J
> ___
> Dwarf-Discuss mailing list
> dwarf-disc...@lists.dwarfstd.org
> http://lists.dwarfstd.org/listinfo.cgi/dwarf-discuss-dwarfstd.org
> 


AIX libstdc++ missing symbols

2011-09-23 Thread David Edelsohn
My latest bootstrap of GCC on AIX failed due to missing symbols in
libstdc++ expected by libgmpxx:

exec(): 0509-036 Cannot load program /tmp/20110922/./gcc/cc1plus
because of the following errors:
0509-130 Symbol resolution failed for
/usr/gnu/lib/libgmpxx.a(libgmpxx.so.4) because:
0509-136   Symbol _ZNSt6localeD1Ev (number 4) is not exported from
   dependent module
/tmp/20110922/powerpc-ibm-aix5.3.0.0/libstdc++-v3/src/.libs/libstdc++.a(libstdc++.so.6).
0509-136   Symbol _ZNSt6localeC1ERKS_ (number 6) is not exported from
   dependent module
/tmp/20110922/powerpc-ibm-aix5.3.0.0/libstdc++-v3/src/.libs/libstdc++.a(libstdc++.so.6).
0509-136   Symbol _ZNSt8ios_base4InitD1Ev (number 10) is not exported from
   dependent module
/tmp/20110922/powerpc-ibm-aix5.3.0.0/libstdc++-v3/src/.libs/libstdc++.a(libstdc++.so.6).
0509-136   Symbol _ZNSt8ios_base4InitC1Ev (number ) is not exported from
   dependent module
/tmp/20110922/powerpc-ibm-aix5.3.0.0/libstdc++-v3/src/.libs/libstdc++.a(libstdc++.so.6).

Any idea what has changed and why those symbols no longer are exported
by libstdc++?  This seems like a libstdc++ ABI change if they really
disappeared.

Thanks, David


Re: AIX libstdc++ missing symbols

2011-09-23 Thread Paolo Carlini

On 09/24/2011 12:23 AM, David Edelsohn wrote:

> My latest bootstrap of GCC on AIX failed due to missing symbols in
> libstdc++ expected by libgmpxx:

On x86_64-linux both are still exported, and for sure nobody worked on
the code itself. I would say it's a compiler issue.


Paolo.


gcc-4.6-20110923 is now available

2011-09-23 Thread gccadmin
Snapshot gcc-4.6-20110923 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.6-20110923/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.6 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_6-branch 
revision 179133

You'll find:

 gcc-4.6-20110923.tar.bz2 Complete GCC

  MD5=85f2513ed81259e02029c7b20e0a53bb
  SHA1=bdef841f21d3e2753bc7f5fad8505eef500456b3

Diffs from 4.6-20110916 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.6
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.