Re: [buildrobot] sparc64-linux broken

2014-04-21 Thread Jan-Benedict Glaw
On Sat, 2014-04-19 21:54:07 +0200, Jan-Benedict Glaw  wrote:
> Hi!
> 
> I noticed that between 704db68e45..f2de45326 (209463..r209495),
> probably in e2ec52cad85e (r209484), building for arm-eabi breaks:

For sparc64-linux, things look like this (cf.
http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=203150):

g++ -c   -g -O2 -DIN_GCC  -DCROSS_DIRECTORY_STRUCTURE  -fno-exceptions 
-fno-rtti -fasynchronous-unwind-tables -W -Wall -Wwrite-strings -Wcast-qual 
-Wmissing-format-attribute -Woverloaded-virtual -pedantic -Wno-long-long 
-Wno-variadic-macros -Wno-overlength-strings -fno-common  -DHAVE_CONFIG_H -I. 
-I. -I/home/jbglaw/repos/gcc/gcc -I/home/jbglaw/repos/gcc/gcc/. 
-I/home/jbglaw/repos/gcc/gcc/../include 
-I/home/jbglaw/repos/gcc/gcc/../libcpp/include  
-I/home/jbglaw/repos/gcc/gcc/../libdecnumber 
-I/home/jbglaw/repos/gcc/gcc/../libdecnumber/dpd -I../libdecnumber 
-I/home/jbglaw/repos/gcc/gcc/../libbacktrace -o tree.o -MT tree.o -MMD -MP 
-MF ./.deps/tree.TPo /home/jbglaw/repos/gcc/gcc/tree.c
/home/jbglaw/repos/gcc/gcc/tree.c: In function ‘void 
tree_range_check_failed(const tree_node*, const char*, int, const char*, 
tree_code, tree_code)’:
/home/jbglaw/repos/gcc/gcc/tree.c:9246: warning: comparison between signed and 
unsigned integer expressions
/home/jbglaw/repos/gcc/gcc/tree.c:9253: warning: comparison between signed and 
unsigned integer expressions
/home/jbglaw/repos/gcc/gcc/tree.c: In function ‘void 
omp_clause_range_check_failed(const tree_node*, const char*, int, const char*, 
omp_clause_code, omp_clause_code)’:
/home/jbglaw/repos/gcc/gcc/tree.c:9307: warning: comparison between signed and 
unsigned integer expressions
/home/jbglaw/repos/gcc/gcc/tree.c:9314: warning: comparison between signed and 
unsigned integer expressions
/home/jbglaw/repos/gcc/gcc/tree.c: In function ‘void 
warn_deprecated_use(tree_node*, tree_node*)’:
/home/jbglaw/repos/gcc/gcc/tree.c:12076: warning: unknown conversion type 
character ‘r’ in format
/home/jbglaw/repos/gcc/gcc/tree.c:12076: warning: format ‘%d’ expects type 
‘int’, but argument 5 has type ‘const char*’
/home/jbglaw/repos/gcc/gcc/tree.c:12076: warning: unknown conversion type 
character ‘R’ in format
/home/jbglaw/repos/gcc/gcc/tree.c:12076: warning: format ‘%s’ expects type 
‘char*’, but argument 6 has type ‘int’
/home/jbglaw/repos/gcc/gcc/tree.c:12076: warning: too many arguments for format
/home/jbglaw/repos/gcc/gcc/tree.c:12080: warning: unknown conversion type 
character ‘r’ in format
/home/jbglaw/repos/gcc/gcc/tree.c:12080: warning: format ‘%d’ expects type 
‘int’, but argument 5 has type ‘const char*’
/home/jbglaw/repos/gcc/gcc/tree.c:12080: warning: unknown conversion type 
character ‘R’ in format
/home/jbglaw/repos/gcc/gcc/tree.c:12080: warning: too many arguments for format
/home/jbglaw/repos/gcc/gcc/tree.c:12105: warning: unknown conversion type 
character ‘E’ in format
/home/jbglaw/repos/gcc/gcc/tree.c:12105: warning: unknown conversion type 
character ‘r’ in format
/home/jbglaw/repos/gcc/gcc/tree.c:12105: warning: format ‘%s’ expects type 
‘char*’, but argument 3 has type ‘tree_node*’
/home/jbglaw/repos/gcc/gcc/tree.c:12105: warning: format ‘%d’ expects type 
‘int’, but argument 4 has type ‘const char*’
/home/jbglaw/repos/gcc/gcc/tree.c:12105: warning: unknown conversion type 
character ‘R’ in format
/home/jbglaw/repos/gcc/gcc/tree.c:12105: warning: too many arguments for format
/home/jbglaw/repos/gcc/gcc/tree.c:12109: warning: unknown conversion type 
character ‘E’ in format
/home/jbglaw/repos/gcc/gcc/tree.c:12109: warning: unknown conversion type 
character ‘r’ in format
/home/jbglaw/repos/gcc/gcc/tree.c:12109: warning: format ‘%s’ expects type 
‘char*’, but argument 3 has type ‘tree_node*’
/home/jbglaw/repos/gcc/gcc/tree.c:12109: warning: format ‘%d’ expects type 
‘int’, but argument 4 has type ‘const char*’
/home/jbglaw/repos/gcc/gcc/tree.c:12109: warning: unknown conversion type 
character ‘R’ in format
/home/jbglaw/repos/gcc/gcc/tree.c:12109: warning: too many arguments for format
/home/jbglaw/repos/gcc/gcc/tree.c:12116: warning: unknown conversion type 
character ‘r’ in format
/home/jbglaw/repos/gcc/gcc/tree.c:12116: warning: format ‘%d’ expects type 
‘int’, but argument 4 has type ‘const char*’
/home/jbglaw/repos/gcc/gcc/tree.c:12116: warning: unknown conversion type 
character ‘R’ in format
/home/jbglaw/repos/gcc/gcc/tree.c:12116: warning: format ‘%s’ expects type 
‘char*’, but argument 5 has type ‘int’
/home/jbglaw/repos/gcc/gcc/tree.c:12116: warning: too many arguments for format
/home/jbglaw/repos/gcc/gcc/tree.c:12120: warning: unknown conversion type 
character ‘r’ in format
/home/jbglaw/repos/gcc/gcc/tree.c:12120: warning: format ‘%d’ expects type 
‘int’, but argument 4 has type ‘const char*’
/home/jbglaw/repos/gcc/gcc/tree.c:12120: warning: unknown conversion type 
character ‘R’ in format
/home/jbglaw/repos/gcc/gcc/tree.c:12120: warning: too many arguments for format
/home/jbglaw/repos/gcc/gcc/tree.c:12129: warni

Re: [buildrobot] sparc64-linux broken

2014-04-21 Thread Richard Henderson
On 04/21/2014 09:53 AM, Jan-Benedict Glaw wrote:
> /home/jbglaw/repos/gcc/gcc/config/sparc/sparc.c:4858: error: invalid 
> conversion from ‘int’ to ‘machine_mode’


Yes, something has changed recently in the build flags to (I believe) remove
-fpermissive.  Quite a few backends are affected by this change.

I've already fixed a similar error in the aarch64 backend.  I've got a patch
fixing sparc, but haven't committed or posted it yet.



r~





Re: [buildrobot] sparc64-linux broken

2014-04-21 Thread Jakub Jelinek
On Mon, Apr 21, 2014 at 10:15:19AM -0700, Richard Henderson wrote:
> On 04/21/2014 09:53 AM, Jan-Benedict Glaw wrote:
> > /home/jbglaw/repos/gcc/gcc/config/sparc/sparc.c:4858: error: invalid 
> > conversion from ‘int’ to ‘machine_mode’
> 
> 
> Yes, something has changed recently in the build flags to (I believe) remove
> -fpermissive.  Quite a few backends are affected by this change.
> 
> I've already fixed a similar error in the aarch64 backend.  I've got a patch
> fixing sparc, but haven't committed or posted it yet.

Actually, the change was in GET_MODE_SIZE (and a few other macros):
previously they would happily accept an int argument, now they (in C++) require
enum machine_mode.
Previously it was:
#define GET_MODE_SIZE(MODE)    ((unsigned short) mode_size[MODE])
and now for recent GCC it is:
#define GET_MODE_SIZE(MODE) \
  ((unsigned short) (__builtin_constant_p (MODE) \
                     ? mode_size_inline (MODE) : mode_size[MODE]))
Sure, we could change this to use mode_size_inline ((enum machine_mode) (MODE))
in the macro instead, but I'd say for GCC codebase it is better if we fix
the few users of these macros that pass an int rather than enum machine_mode
to these macros.
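
The effect can be reproduced outside GCC. What follows is a minimal sketch,
not GCC's actual machmode.h: the enum values, the contents of mode_size and
the three DEMO_ macro names are made up for illustration, and the
__builtin_constant_p selection is left out to keep it short. It contrasts
plain array indexing, a call to a typed helper, and the cast-inside-the-macro
alternative mentioned above.

```cpp
#include <type_traits>

// Hypothetical stand-in for GCC's generated mode enum.
enum machine_mode { VOIDmode, QImode, HImode, SImode, DImode, NUM_MACHINE_MODES };

// Hypothetical size table, indexed by mode.
static const unsigned char mode_size[NUM_MACHINE_MODES] = { 0, 1, 2, 4, 8 };

// Stand-in for the inline helper: it takes the enum, not an int.
static inline unsigned short mode_size_inline (enum machine_mode mode)
{
  return mode_size[mode];
}

// Old style: plain array indexing, so any integer argument is accepted.
#define DEMO_OLD_GET_MODE_SIZE(MODE) ((unsigned short) mode_size[MODE])

// New style (simplified): calls the typed helper, so passing an int
// fails to compile in C++.
#define DEMO_NEW_GET_MODE_SIZE(MODE) (mode_size_inline (MODE))

// Alternative: cast inside the macro.  This compiles for int arguments
// again, but it also hides callers that pass the wrong kind of value.
#define DEMO_CAST_GET_MODE_SIZE(MODE) \
  (mode_size_inline ((enum machine_mode) (MODE)))

// C++ has no implicit int -> enum conversion; this is what the error reports.
static_assert (!std::is_convertible<int, machine_mode>::value,
               "int does not implicitly convert to machine_mode");

// The reverse direction is implicit for unscoped enums, which is why
// indexing the table with a mode always worked.
static_assert (std::is_convertible<machine_mode, int>::value,
               "machine_mode implicitly converts to int");
```

With an int variable m, DEMO_OLD_GET_MODE_SIZE (m) and
DEMO_CAST_GET_MODE_SIZE (m) both compile, while DEMO_NEW_GET_MODE_SIZE (m)
is rejected with the same kind of "invalid conversion" error as in the log.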

Jakub


Re: [buildrobot] sparc64-linux broken

2014-04-21 Thread Richard Henderson
On 04/21/2014 11:02 AM, Jakub Jelinek wrote:
> but I'd say for GCC codebase it is better if we fix
> the few users of these macros that pass an int rather than enum machine_mode
> to these macros.

I agree.  In the aarch64 backend it turned out that we were passing a
reg_class_t and not a mode at all.  Oops.  ;-)


r~


multiple rtems targets __dso_handle not found

2014-04-21 Thread Joel Sherrill
Hi

The cut and paste is from an sh-rtems C++ application link failure.
But the failure is happening on some h8300, m68k, powerpc, and
sh BSPs. Each BSP has its own linker script so if there is a mistake
in that due to age, then we could be missing some newer magic.
This is all with gcc 4.8.2 and binutils 2.24?

What are we missing that would have introduced this?

sh-rtems4.11-g++ -B../../../../../gensh1/lib/ -specs bsp_specs -qrtems
-m1 -O2 -g -Wall -Wmissing-prototypes -Wimplicit-function-declaration
-Wstrict-prototypes -Wnested-externs -m1 -o cdtest.exe init.o main.o
main.o: In function `_static_initialization_and_destruction_0':
/users/joel/rtems-4.11-work/rtems-testing/rtems/build-sh-gensh1-rtems/sh-rtems4.11/c/gensh1/testsuites/samples/cdtest/../../../../../../../rtems/c/src/../../testsuites/samples/cdtest/main.cc:141:
undefined reference to `__dso_handle'
/users/joel/rtems-4.11-work/tools/lib/gcc/sh-rtems4.11/4.8.2/libstdc++.a(eh_globals.o):
In function `_static_initialization_and_destruction_0':
/users/joel/rtems-4.11-work/rtems-source-builder/rtems/build/sh-rtems4.11-gcc-4.8.2-newlib-cvs-head-1/build/sh-rtems4.11/libstdc++-v3/libsupc++/../../../../gcc-4.8.2/libstdc++-v3/libsupc++/eh_globals.cc:109:
undefined reference to `__dso_handle'
/users/joel/rtems-4.11-work/tools/lib/gcc/sh-rtems4.11/4.8.2/../../../../sh-rtems4.11/bin/ld:
cdtest.exe: hidden symbol `___dso_handle' isn't defined
/users/joel/rtems-4.11-work/tools/lib/gcc/sh-rtems4.11/4.8.2/../../../../sh-rtems4.11/bin/ld:
final link failed: Bad value

Thanks.

-- 
Joel Sherrill, Ph.D.             Director of Research & Development
joel.sherr...@oarcorp.com        On-Line Applications Research
Ask me about RTEMS: a free RTOS  Huntsville AL 35805
Support Available                (256) 722-9985



Re: Performance gain through dereferencing?

2014-04-21 Thread David Brown

On 16/04/14 17:57, Peter Schneider wrote:

> Hi David,
>
> Sorry, I had included more information in an earlier draft which I
> edited out for brevity.


(Sorry for the late reply - Easter is a /serious/ holiday in Norway.)



>> You cannot learn useful timing information from a single run of a
>> short test like this - there are far too many other factors that come
>> into play.

> I didn't mention that I have run it dozens of times. I know that blunt
> runtime measurements on a non-realtime system tend to be
> non-reproducible, and that they are inadequate for exact measurements.
> But the difference here is so large that the result is highly
> significant, in spite of the "amateurish" setup. The run I am showing
> here is typical. One of my four cores is surely idle at any given
> moment, and there is no I/O, so the variations are small.



The differences you have noted are not large - you need to be much 
clearer about what you are actually measuring here.  Typically, you will 
want to look at the time taken for an empty loop, and the time for the 
working loops - and you will want to compare for loops of different 
sizes, and with large enough loops that things such as startup time are 
negligible.
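
A minimal sketch of that kind of measurement, assuming std::chrono's
steady_clock is good enough as a timer. The function names and the choice of
a single volatile store as the "work" are illustrative only and are not taken
from the original test program.

```cpp
#include <chrono>
#include <cstddef>

// Times a loop with an empty body.  The volatile counter keeps the
// compiler from deleting the loop when optimisation is enabled.
static double time_empty_loop (std::size_t iterations)
{
  const auto start = std::chrono::steady_clock::now ();
  for (volatile std::size_t i = 0; i < iterations; i = i + 1)
    {
    }
  const auto stop = std::chrono::steady_clock::now ();
  return std::chrono::duration<double> (stop - start).count ();
}

// Times the same loop with one store through a pointer per iteration.
static double time_store_loop (std::size_t iterations)
{
  volatile int target = 0;
  volatile int *p = &target;
  const auto start = std::chrono::steady_clock::now ();
  for (volatile std::size_t i = 0; i < iterations; i = i + 1)
    *p = 1;
  const auto stop = std::chrono::steady_clock::now ();
  return std::chrono::duration<double> (stop - start).count ();
}

// Estimated cost of the work alone, in seconds per iteration: the
// loop overhead measured by the empty loop is subtracted first.
static double net_seconds_per_iteration (std::size_t iterations)
{
  if (iterations == 0)
    return 0.0;
  const double empty = time_empty_loop (iterations);
  const double work = time_store_loop (iterations);
  return (work - empty) / static_cast<double> (iterations);
}
```

Calling net_seconds_per_iteration with several iteration counts (say a
thousand, a million and a hundred million) shows whether the per-iteration
figure settles down once startup cost becomes negligible.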



>> You cannot learn useful timing information from unoptimised code.


> I beg to disagree. While in this case the problem (and indeed eventually
> the whole program ;-) ) goes away with optimization that may not be the
> case in less trivial scenarios. And optimization or not -- I would
> always contend that *p = n is **not slower** than i = n. But it is.
> Something is wrong ;-).


For preference, use optimisation to ensure the compiler and code are not 
doing anything extra, and "volatile" to control how the test code is 
generated.
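
As a rough sketch of what that could look like for the two stores under
discussion (i = n versus *p = n): the function names below are invented for
illustration, and this is not the original test program. Because the target
is volatile, each store has to be performed even when the code is optimised,
so the loops survive at -O2.

```cpp
#include <cstddef>

// Direct store: "i = n", repeated `reps` times.
static int store_direct (int n, std::size_t reps)
{
  volatile int i = 0;           // volatile: every store must be performed
  for (std::size_t r = 0; r < reps; ++r)
    i = n;
  return i;
}

// Store through a pointer: "*p = n", repeated `reps` times.
// The pointer itself is not volatile here, so an optimising compiler is
// free to keep it in a register.  Declaring it as
// "volatile int *volatile p" would force a reload of the pointer on
// every iteration, which is closer to what unoptimised code does.
static int store_via_pointer (int n, std::size_t reps)
{
  volatile int i = 0;
  volatile int *p = &i;
  for (std::size_t r = 0; r < reps; ++r)
    *p = n;
  return i;
}
```

Comparing the generated assembly for the two functions (for example with
g++ -O2 -S) shows directly whether the compiler emits different instructions
for the two forms.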




> So I'd like to direct our attention to the generated code and its
> performance (because such code conceivably could appear as the result of
> an optimized compiler run as well, in less trivial scenarios). What
> puzzles me is: How can it be that two instructions are slower than a
> very similar pair of instructions plus another one? (And that question
> is totally unrelated to optimization.)


It is far from easy to understand what /actually/ happens inside a 
modern x86 processor.  The incoming instructions get translated into 
internal RISC instructions, which are scheduled, pipelined, and executed 
in different ways at different points in the pipeline.  It is certainly 
conceivable that by some quirk the three instruction version ends up 
faster than the two instruction version.


I am not trying to tell you that your measurements or discovery are 
wrong here - I am just trying to be sure that measurement errors are 
eliminated before jumping to any conclusions.


I would also like to see a series of code snippets that produce 
different code showing similar effects.  If these start to show a clear 
pattern, then it will be worth looking at and testing on different 
processors - it could be that there is scope for a peephole optimisation 
here.





>> Otherwise the result could be nothing more than a quirk of the way
>> caching worked out.


> Could you explain how caching could play a role here if all variables
> and addresses are on the stack and are likely to be in the same memory
> page? (I'm not being sarcastic -- I may miss something obvious).


It may be /likely/ that they are on the same memory page, but it is not 
guaranteed.  It is also quite likely that they are on different cache 
lines - sometimes the split by cache line can have an effect.  I am not 
trying to give absolute answers here - merely suggesting other possible 
semi-random effects that might make this case a one-off rather than 
common behaviour.
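
One way to check this for a given run is to compare the addresses directly.
The sketch below assumes 64-byte cache lines and 4096-byte pages, which are
common on current x86 systems but are not guaranteed; the constant and
function names are made up for illustration. It also assumes that converting
a pointer to std::uintptr_t yields the plain address, as it does on
mainstream x86 platforms.

```cpp
#include <cstddef>
#include <cstdint>

// Assumed sizes, not queried from the hardware.
constexpr std::uintptr_t kAssumedCacheLineBytes = 64;
constexpr std::uintptr_t kAssumedPageBytes = 4096;

// True if both addresses fall into the same aligned block of
// `block_bytes` bytes (a cache line or a page, depending on the size).
static bool same_block (const void *a, const void *b,
                        std::uintptr_t block_bytes)
{
  const std::uintptr_t addr_a = reinterpret_cast<std::uintptr_t> (a);
  const std::uintptr_t addr_b = reinterpret_cast<std::uintptr_t> (b);
  return addr_a / block_bytes == addr_b / block_bytes;
}
```

In a test program, same_block (&i, &p, kAssumedCacheLineBytes) tells you
whether the two stack variables happen to share a line in that particular
build; the answer can change with compiler version, flags and stack layout.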




> I can imagine that somehow the processor architecture is better utilized
> by the faster version (e.g. because short inner loops pipeline worse or
> whatever). For what it's worth, the programs were running on an i7-3632QM.



Usually that's the case - shorter loops run faster.  But it is hard to 
be sure, and sometimes a loop that is too short will lead to bigger 
delays from stalls in the pipeline.  Getting the fastest possible speed 
out of a modern x86 processor is more of an art than a science, and it 
is a long time since simple logic gave all the right answers.


But by all means, investigate the effect further - it is certainly 
possible that you have come across a quirk in the processor that can be 
used to generate faster code.




Re: [buildrobot] sparc64-linux broken

2014-04-21 Thread Michael Meissner
On Mon, Apr 21, 2014 at 08:02:39PM +0200, Jakub Jelinek wrote:
> Sure, we could change this to use mode_size_inline ((enum machine_mode) 
> (MODE))
> in the macro instead, but I'd say for GCC codebase it is better if we fix
> the few users of these macros that pass an int rather than enum machine_mode
> to these macros.

I fixed the powerpc (PR 60876), and while it would have been nice to have
tested this against more backends, it was fairly simple to go through the
GET_MODE_SIZE's and make them type correct.  For the PowerPC, it tended to be
in code of the form:

for (m = 0; m < NUM_MACHINE_MODES; ++m)
  {
    // ...
    if (GET_MODE_SIZE (m)) ...
  }

and the fix was to do something like:

for (m = 0; m < NUM_MACHINE_MODES; ++m)
  {
    enum machine_mode m2 = (enum machine_mode) m;
    // ...
    if (GET_MODE_SIZE (m2)) ...
  }
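
For reference, here is a self-contained version of that pattern which
compiles as C++ on its own. It is only a sketch: the enum, the size table,
get_mode_size and count_sized_modes are hypothetical stand-ins for the
generated GCC definitions and for the elided loop body.

```cpp
// Hypothetical stand-ins; GCC generates the real enum and table.
enum machine_mode { VOIDmode, QImode, HImode, SImode, DImode, NUM_MACHINE_MODES };

static const unsigned char mode_size[NUM_MACHINE_MODES] = { 0, 1, 2, 4, 8 };

// Typed accessor standing in for GET_MODE_SIZE: it only accepts the enum.
static inline unsigned short get_mode_size (enum machine_mode mode)
{
  return mode_size[mode];
}

// Counts the modes whose size is nonzero.  The loop counter stays an
// int, and is converted once at the top of the loop body.
static int count_sized_modes ()
{
  int count = 0;
  for (int m = 0; m < NUM_MACHINE_MODES; ++m)
    {
      enum machine_mode m2 = (enum machine_mode) m;  // explicit int -> enum
      if (get_mode_size (m2))
        ++count;
    }
  return count;
}
```

Writing get_mode_size (m) inside the loop instead of get_mode_size (m2)
gives the "invalid conversion from 'int' to 'machine_mode'" error.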

It reminds me of when I was on the original ANSI C committee that produced the
1989 ANSI and 1990 ISO C standards: we debated making enums more first-class
citizens, so you could do ++/-- on them, but the committee tended to be divided
into 3 camps: one that wanted to delete enums altogether, one that wanted them
as int constants, and one that wanted more type checking.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797