rnreg and vliw

2015-01-23 Thread shmeel gutl
It seems that in gcc 4.7, the rnreg pass for renaming registers after 
reload is not vliw aware. In particular I saw it reassign a register 
that is in use in the same vliw.


To be more concrete, I saw it change the following pseudo code
DI:a30 = v0
SI:a14 = -a14

to
DI:a30 = v0
SI:a31 = -a14

since a31 was never referenced again. This won't work inside a vliw,
since it causes two instructions to set a31. Even though rnreg runs
before sched2, it runs after software pipelining, which creates its own
vliws.
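
The hazard described above can be modelled outside the compiler. What follows is only a sketch under stated assumptions, not GCC code: it assumes a DImode value in a30 occupies the pair a30/a31, that register names map to plain integers, and that a bundle is just a list of destination register ranges. The names `Def`, `overlaps`, and `rename_is_safe` are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Registers written by one instruction in a bundle (hypothetical model).
// A DImode write to a30 is assumed to cover two registers: a30 and a31.
struct Def {
    int first_reg;
    int nregs;
};

// True if the two register ranges share at least one register.
inline bool overlaps(const Def& a, const Def& b) {
    return a.first_reg < b.first_reg + b.nregs &&
           b.first_reg < a.first_reg + a.nregs;
}

// True if renaming the destination of bundle[idx] to `candidate` leaves
// every write in the bundle on a distinct register.
inline bool rename_is_safe(const std::vector<Def>& bundle, std::size_t idx,
                           int candidate) {
    const Def renamed{candidate, bundle[idx].nregs};
    for (std::size_t i = 0; i < bundle.size(); ++i) {
        if (i != idx && overlaps(renamed, bundle[i])) {
            return false;
        }
    }
    return true;
}

// The example from the post: DI:a30 = v0 and SI:a14 = -a14 in one bundle.
inline void demo() {
    const std::vector<Def> bundle = {{30, 2}, {14, 1}};
    assert(!rename_is_safe(bundle, 1, 31));  // a31 collides with DI:a30
    assert(rename_is_safe(bundle, 1, 15));   // a15 is not written elsewhere
}
```

A renaming pass that knew about bundles would need some check of this shape before accepting a candidate register; where such a check would hook into the real pass is not something this sketch addresses.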


Is there any easy fix for this?

Thanks,
Shmeel




Contributing to GCC and question about PR64744

2015-01-23 Thread Alexander Basov
Dear GCC team,

I would like to contribute to the project.

I have a background in embedded systems programming, but little
experience in compiler development.

I'd like to start by fixing PR64744.
Would someone help me understand what the correct compiler
behaviour should be for the example below:

__attribute__((naked))
void foo()
{
  char buf[2] = {0};
}

Currently gcc (trunk, aarch64 target) hits an ICE when compiling this code:
cc1 -O0 test.c

But I think that it should report something like: "local frame
unavailable (naked function?)"


Thanks in advance

-- 
Alexander


Help with reload bug, please

2015-01-23 Thread Andrew Stubbs
How does reload ensure that an SImode value (re)loaded into an FP 
register has a valid stack index?


The FP load instruction allows a smaller index range than the integer 
equivalent, but nothing checks the destination register, only the source 
mode.


I'm trying to solve a problem in which GCC 4.1 gets this wrong, but 
AFAICT this code works exactly the same now as then (although I don't 
have a testcase). IOW, unless I'm missing something, the only reason 
this doesn't fail all the time is that it's quite rare for the register 
pressure to cause just this value to spill in a function that has a 
stack frame >1KB and the index happens to be beyond the limit.


My target is ARMv7a with VFP. The code is trying to cast an int to a 
float. The machine description is such that the preferred route is to 
load the value first into a general register, transfer it to the VFP 
register, and then convert. It's only possible to get it to load 
directly to the VFP register if all the general registers are in use. 
This makes it very hard to write a synthetic testcase.


I can "fix" the problem by rewriting arm_legitimate_index_p such that it
assumes all SImode accesses might be VFP loads, but that seems suboptimal.
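
To make the mismatch concrete, here is a small standalone model of the two index ranges. It is only a sketch, not the actual arm_legitimate_index_p logic: the limits are the ARM-state LDR immediate range (12 bits, so up to 4095 bytes either way) and the VLDR range (8 bits scaled by 4, so up to 1020 bytes either way and a multiple of 4), and the function and enum names are invented for illustration.

```cpp
#include <cassert>
#include <cstdlib>

// Where the loaded value ends up (hypothetical model, not a GCC type).
enum class Dest { CoreReg, VfpReg };

// ARM-state LDR: 12-bit immediate offset.
inline bool core_index_ok(long off) {
    return std::labs(off) <= 4095;
}

// VLDR: 8-bit immediate scaled by 4.
inline bool vfp_index_ok(long off) {
    return std::labs(off) <= 1020 && off % 4 == 0;
}

// The behaviour described in the post: for SImode only the mode is
// consulted, so the integer range is always assumed.
inline bool simode_index_ok_mode_only(long off) {
    return core_index_ok(off);
}

// What a destination-aware check would have to look like.
inline bool simode_index_ok(long off, Dest dest) {
    return dest == Dest::VfpReg ? vfp_index_ok(off) : core_index_ok(off);
}

inline void demo() {
    // A stack slot just past 1KB passes the mode-only check...
    assert(simode_index_ok_mode_only(1024));
    // ...but cannot be encoded when the destination is a VFP register.
    assert(!simode_index_ok(1024, Dest::VfpReg));
}
```

The "fix" mentioned above corresponds to making `simode_index_ok_mode_only` return `vfp_index_ok(off)` unconditionally, which is safe but gives up the larger range for ordinary integer loads.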


Any suggestions would be appreciated!

Thanks

Andrew


Re: Help with reload bug, please

2015-01-23 Thread Joern Rennecke
On 23 January 2015 at 13:46, Andrew Stubbs wrote:
> How does reload ensure that an SImode value (re)loaded into an FP register
> has a valid stack index?

You could use CANNOT_CHANGE_MODE_CLASS, or request a secondary reload.
For the latter, you can look at the memory/pseudo to decide if the
address requires a secondary reload for your register (class). We had
a few bugs previously where reload wouldn't re-calculate frame
addresses when something relevant changed, though, so you might have
to backport some fixes when working with an old compiler.


Re: C++ Standard Question

2015-01-23 Thread Jonathan Wakely

On 22/01/15 16:07 -0600, Joel Sherrill wrote:


On 1/22/2015 3:44 PM, Marc Glisse wrote:

On Thu, 22 Jan 2015, Joel Sherrill wrote:


I think this is a glibc issue but since this method is defined in the C++
standards, I thought there were plenty of language lawyers here. :)

s/glibc/libstdc++/ and they have their own ML.

 Thank you.



That's deprecated, isn't it?

Yes. There is also a warning about that coming from the test code.
I don't know how long it has been deprecated since even with
-std=c++03, the warning is present.


Those types were deprecated even in the first C++ standard in 1998.
They were born deprecated, and have remained so ever since.


  class strstreambuf : public basic_streambuf<char>
  ISSUE > int pcount() const;   <= ISSUE

My reading of the C++03 and draft C++14 says that the int pcount() method
in this class is not const. glibc has it const in the glibc shipped with
Fedora 20 and CentOS 6.

This is a simple test case:

   #include <strstream>

   int main() {
     int (std::strstreambuf::*dummy)() = &std::strstreambuf::pcount;
     /*-- pcount is conformant --*/
     return 0;
   }

What's the consensus?

The exact signature of member functions is not mandated by the standard,
implementations are allowed to make the function const if that works (or
provide both a const and a non-const version). Your code is not guaranteed
to work. Lambdas usually provide a fine workaround.


This code is actually from the Open Group FACE Conformance Test Suite.
It uses code like this to check the presence of methods from the C Standard
Library, POSIX APIs, and the C++ Standard Library. It would be really
helpful if you could cite the place in the C++ standard so I can provide
that as feedback to the authors of the test suite.


17.6.5.5 [member.functions] gives implementors certain freedoms to
provide slightly different signatures to those described in the
standard, including this:

-2- It is unspecified whether any member functions in the C++ standard
   library are defined as inline (7.1.2). An implementation may
   declare additional non-virtual member function signatures within a
   class:
  
   — by adding arguments with default values to a member function
     signature;(187)
   — ...

   187) Hence, the address of a member function of a class in the C++
   standard library has an unspecified type.

The footnote makes it clear that we could declare strstreambuf::pcount
as a function with one or more parameters, as long as we give them
default values, meaning that the type of &std::strstreambuf::pcount is
not guaranteed to be int (strstreambuf::*)() in a conforming
implementation, so the conformance test could fail on an
implementation that conforms 100% to the standard.
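
A small illustration of the footnote's consequence, using made-up stand-in classes rather than the real std::strstreambuf (so nothing here says how any actual library declares pcount). It sketches how an exact pointer-to-member type check breaks as soon as a defaulted parameter or a const qualifier is present, while a plain call, here wrapped in a lambda, keeps working.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-ins for the same member as three libraries might
// declare it. None of these is the real std::strstreambuf.
struct BufPlain   { int pcount() { return 7; } };
struct BufDefault { int pcount(int unused = 0) { return 7 + unused; } };
struct BufConst   { int pcount() const { return 7; } };

// Exact-type probes in the style of the conformance test: does the
// address of pcount have type int (T::*)()?
constexpr bool plain_matches =
    std::is_same<decltype(&BufPlain::pcount), int (BufPlain::*)()>::value;
constexpr bool default_matches =
    std::is_same<decltype(&BufDefault::pcount), int (BufDefault::*)()>::value;
constexpr bool const_matches =
    std::is_same<decltype(&BufConst::pcount), int (BufConst::*)()>::value;

// The lambda workaround: call the member with no arguments on a
// non-const object instead of naming its exact type.
template <class Buf>
int call_pcount(Buf& b) {
    auto f = [](Buf& x) { return x.pcount(); };
    return f(b);
}

inline void demo() {
    assert(plain_matches);
    assert(!default_matches);  // type is int (BufDefault::*)(int)
    assert(!const_matches);    // type is int (BufConst::*)() const
    BufDefault d;
    BufConst c;
    assert(call_pcount(d) == 7);
    assert(call_pcount(c) == 7);
}
```

Only the first probe succeeds, yet every variant is usable by an ordinary caller, which is the point the footnote makes about member function addresses.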

I'm not sure if that gives us the liberty to add 'const' where the
standard doesn't have it, so this might be a real non-conformance
issue, but even if we fixed it their test is not guaranteed to work
for other reasons.



On a positive note, the test suite isn't flagging much using this
technique. This may be the only method. But I would like to provide the
correct feedback to them so no one else deals with this.


IMHO the correct feedback for these deprecated types might be "Hmm,
you're right. Oh well, nevermind, we're not going to bother fixing it
now."

Any real program using the pcount() member will work correctly with
our implementation. In practice only conformance tests are likely to
notice the difference.



Re: Problem with extremely large procedures and 64-bit code

2015-01-23 Thread Ricardo Telichevesky

Thanks Richard for your input, much appreciated.

I followed up on your suggestions; unfortunately, the -Wdisabled-optimization
option you suggested did not produce any warnings. I am still trying the
--param options one by one, without success. I did get a new hint, though:
running the same examples on a MacBook, I don't see the issue at all. The
time difference between 64-bit and 32-bit in each of the optimize/debug
versions is small, and 64-bit is always about 10% faster in each class. I
guess the compiler flags are somehow different; perhaps you or someone else
knows which flags are set differently by default between the two, though it
is hard to compare the actual speeds because the hardware is different. Here
are the specs on the Mac:


gcc: Apple LLVM version 5.1 (clang-503.0.40) (based on LLVM 3.4svn) - don't 
know what that means expected  a number like 4.2.1 or something like that,  
2.53 GHz Intel Core 2 Duo

Does anything come to mind?

Thanks again for your help,

Ricardo


On 1/20/15 1:21 AM, Richard Biener wrote:

On Tue, Jan 20, 2015 at 4:57 AM, Ricardo Telichevesky wrote:

Hi,

 I have a strange problem with extremely large procedures when generating
64-bit code
I am using gcc 4.9.2 on  RHEL6.3 on a 64-thread 4-socket  Xeon E7 4820 with
256GB of memory. No avx extensions, using sse option when building the
compiler. This particular code is serial. I made measurements with 32- and
64- bit both debug -g and optimize -O3 for two different examples (this is a
circuit simulator and each example is a different circuit that uses
different transistors).

 Example A is the one the effect is more acute. I listed at the bottom of
the e-mail the 3 procedures that consume 90% of the execution time:

a) As a counter-example, the factor code listed is heavily optimized
hand-written 300-lines of C++ code that behaves as expected: 64-bit optimize
is way faster than any other, up to 15x faster than 32-bit debug (btw great
job in the compiler, it is really shining here).

b) evalTran has 18000 lines of auto-generated code and behaves very
counter-intuitively: 64-bit optimize code is 3x slower than 32-bit optimize
code.

c) evalTranRhs has 5000 lines and is even worse: 64-bit is 4x slower than 32-bit.
Notice that all the data structures in 32-bit code and 64-bit code are
identical and most variables are identical - in fact all integers used are
64-bit, and most operations are floating-point ops. Initially I thought the
64-bit code was a lot bigger than 32-bit code and the cache was overwhelmed.
In fact the difference in code sizes is not even 10% (at least for debug;
notice I calculated the size of each procedure in bytes), so my
trash-the-I-cache conjecture seems to be wrong. The overall execution time
is causing us a lot of problems: 64-bit optimize takes 16 seconds, even more
than 32-bit debug (10 seconds) and 32-bit optimize (4.8 seconds). Considering
we only care about 64-bit optimize, we have a big problem here.

 Example B is not so bad, and in fact 64-bit code is slightly faster than
32-bit code, would be nice if went even faster, but if I got A to behave
like that I'd be pretty happy already.

 I tried to look at the wide array of optimizing options for the code; it
is a dizzying task and I could not get any kind of intuition beyond
-O3 ... Would you have any suggestions for the proper flags for those
ridiculously large auto-generated codes that might be able to alleviate this
32-bit vs 64-bit problem? Would you think that the fact this code is in a
dynamic linked library (-fPIC) plays a role?

It's hard to tell without a testcase but GCC has various limits on
code sizes passes deal with so you might trip one of these which
effectively would disable optimizations.  For example loop dependence
analysis has a limit on the number of memory references it considers
(--param loop-max-datarefs-for-datadeps, default 1000).  Note that not
all such limits are controlled by --params.  We have
-Wdisabled-optimization that should warn if you run into any such
case (but the warning is unfortunately not correctly implemented by
all passes having such limits).

Thanks,
Richard.


 Thanks very much for your help,
 Ricardo


All times are wall clock in micro-seconds - the main was checked against the
reported UNIX time and is exact.

example  A
==
evalTran has 18000 lines of C code   (two for loops around 99% of the code)
evalTranRhs has 5000 lines of C code (two for loops around 99% of the code)

32 bit debug -g -m32 -fPIC -Wall -Winvalid-pch -msse2
%time   elapsed(us)   #calls   per call(us)   timer name @DN@
-----   -----------   ------   ------------   ----------
2.503        254536     8335             30   numerical TRAN factor
56.01       5695065     8335            683   evalTran bytes=231791
35.41       3600646    13924            258   evalTranRhs bytes=57501
100        10168242        1       10168242   main @DT@

32 bit optimize -O3 -m32 -fPIC -Wall -Winvalid-pch -msse2
%ti

Re: Problem with extremely large procedures and 64-bit code

2015-01-23 Thread Renato Golin
On 23 January 2015 at 16:07, Ricardo Telichevesky wrote:
> gcc: Apple LLVM version 5.1 (clang-503.0.40) (based on LLVM 3.4svn) - don't
> know what that means expected  a number like 4.2.1 or something like that,
> 2.53 GHz Intel Core 2 Duo

Hi Ricardo,

This is not gcc at all, it's Clang+LLVM. :/

I'm not sure why Apple decided to call the binary "gcc", which
obviously causes more confusion than it solves, but that's
beside the point. You should try Richard's suggestions again on the
Linux/GCC setup that you originally started with.

cheers,
--renato


Re: Help with reload bug, please

2015-01-23 Thread Jeff Law

On 01/23/15 06:46, Andrew Stubbs wrote:

How does reload ensure that an SImode value (re)loaded into an FP
register has a valid stack index?

The FP load instruction allows a smaller index range than the integer
equivalent, but nothing checks the destination register, only the source
mode.



Unfortunately, GCC is designed with the assumption that the validity of
an address is independent of whether it's used in a load or a store: for
a load, validity is independent of the target register, and for a store,
it is independent of the source register.


This is deeply baked into the compiler in a variety of places.  When I 
had responsibility for the PA I bumped up against this often.


Just for reference, the PA allows a 14 bit displacement in memory 
addresses which use integer registers, but only a 5 bit displacement for 
FP registers.  Other than the displacement amounts, I suspect this is 
the same core problem you have on your port.


Ultimately all I could do was layer hack on hack.  I can't remember them
all.  The most significant ones were to first reject the larger offsets 
for FP modes in GO_IF_LEGITIMATE_ADDRESS.  While it's still valid (and 
relatively common on the PA) to access integer registers in FP modes or 
vice-versa, this change was a huge help.


Secondary reloads are critical.  When you detect a situation that won't 
work, you have to allocate a secondary reload register so that you can 
build up the address as well as all the reload_in/reload_out handling. 
This is how you ensure that if the compiler did something like try to 
load from memory using an integer mode into an FP register you've got a
scratch register for reloading the address if it is an out-of-range reg+
address.
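
The decision described above can be sketched as a standalone model. This is not the GCC secondary-reload interface; it only mirrors the reasoning, using the PA displacement widths quoted earlier in this message (14 bits for integer registers, 5 bits for FP registers). Treating the displacements as signed is an assumption, and `RegClass`, `address_ok`, and `secondary_reload_class` are illustrative names.

```cpp
#include <cassert>

// Hypothetical register classes; None means no scratch register needed.
enum class RegClass { None, General, Float };

// True if `value` fits in a signed field of `bits` bits (assumed encoding).
constexpr bool fits_signed(long value, int bits) {
    return value >= -(1L << (bits - 1)) && value < (1L << (bits - 1));
}

// Whether reg+disp can be used directly, judged by the register that is
// actually loaded or stored rather than by the mode of the value.
constexpr bool address_ok(RegClass reg, long disp) {
    return fits_signed(disp, reg == RegClass::Float ? 5 : 14);
}

// Ask for a general scratch register whenever the displacement cannot be
// encoded, so the address can be built in the scratch register first.
constexpr RegClass secondary_reload_class(RegClass reg, long disp) {
    return address_ok(reg, disp) ? RegClass::None : RegClass::General;
}

inline void demo() {
    // An integer-mode value at disp 100 loads fine into a general register...
    assert(secondary_reload_class(RegClass::General, 100) == RegClass::None);
    // ...but the same access into an FP register needs a scratch register.
    assert(secondary_reload_class(RegClass::Float, 100) == RegClass::General);
}
```

The key design point is the first parameter: the answer depends on the register class being reloaded, which is exactly the information a mode-only address check does not have.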


We may have used special constraints as well to allow loads/stores of 
integer registers in FP modes to use the larger offset.


Jeff





Re: C++ Standard Question

2015-01-23 Thread Joel Sherrill

On 1/23/2015 9:55 AM, Jonathan Wakely wrote:
> On 22/01/15 16:07 -0600, Joel Sherrill wrote:
>> On 1/22/2015 3:44 PM, Marc Glisse wrote:
>>> On Thu, 22 Jan 2015, Joel Sherrill wrote:
>>>
>>>> I think this is a glibc issue but since this method is defined in the C++
>>>> standards, I thought there were plenty of language lawyers here. :)
>>> s/glibc/libstdc++/ and they have their own ML.
>>  Thank you.
 
>>> That's deprecated, isn't it?
>> Yes. There is also a warning about that coming from the test code.
>> I don't know how long it has been deprecated since even with
>> -std=c++03, the warning is present.
> Those types were deprecated even in the first C++ standard in 1998.
> They were born deprecated, and have remained so ever since.
I don't mind the deprecated warning but at least I know now how long they
have been that way. :)
>>>>   class strstreambuf : public basic_streambuf<char>
>>>>   ISSUE > int pcount() const;   <= ISSUE
>>>>
>>>> My reading of the C++03 and draft C++14 says that the int pcount() method
>>>> in this class is not const. glibc has it const in the glibc shipped with
>>>> Fedora 20 and CentOS 6.
>>>>
>>>> This is a simple test case:
>>>>
>>>>    #include <strstream>
>>>>
>>>>    int main() {
>>>>      int (std::strstreambuf::*dummy)() = &std::strstreambuf::pcount;
>>>>      /*-- pcount is conformant --*/
>>>>      return 0;
>>>>    }
>>>>
>>>> What's the consensus?
>>> The exact signature of member functions is not mandated by the standard,
>>> implementations are allowed to make the function const if that works (or
>>> provide both a const and a non-const version). Your code is not guaranteed
>>> to work. Lambdas usually provide a fine workaround.
>>>
>> This code is actually from the Open Group FACE Conformance Test Suite.
>> It uses code like this to check the presence of methods from the C Standard
>> Library, POSIX APIs, and the C++ Standard Library. It would be really
>> helpful
>> if you could cite the place in the C++ standard so I can provide that as
>> feedback
>> to the authors of the test suite.
> 17.6.5.5 [member.functions] gives implementors certain freedoms to
> provide slightly different signatures to those described in the
> standard, including this:
>
> -2- It is unspecified whether any member functions in the C++ standard
> library are defined as inline (7.1.2). An implementation may
> declare additional non-virtual member function signatures within a
> class:
>
> — by adding arguments with default values to a member function
> signature;(187)
> — ...
>
> 187) Hence, the address of a member function of a class in the C++
> standard library has an unspecified type.
>
> The footnote makes it clear that we could declare strstreambuf::pcount
> as a function with one or more parameters, as long as we give them
> default values, meaning that the type of &std::strstreambuf::pcount is
> not guaranteed to be int (strstreambuf::*)() in a conforming
> implementation, so the conformance test could fail on an
> implementation that conforms 100% to the standard.
>
> I'm not sure if that gives us the liberty to add 'const' where the
> standard doesn't have it, so this might be a real non-conformance
> issue, but even if we fixed it their test is not guaranteed to work
> for other reasons.
>
Thank you for the explanation.

Sounds like every place this technique flags a method, it cannot be an
automatic fail but will need manual examination to at least check for
defaulted arguments.

I am curious what the ruling is on the const, since the answer will
impact FACE's guidance for its test suite.

Is there a better way to automate a signature compliance? To tweak
what they have done?

>> On a positive note, the test suite isn't flagging much using this
>> technique. This
>> may be the only method. But I would like to provide the correct feedback to
>> them so no one else deals with this.
> IMHO the correct feedback for these deprecated types might be "Hmm,
> you're right. Oh well, nevermind, we're not going to bother fixing it
> now."
>
> Any real program using the pcount() member will work correctly with
> our implementation. In practice only conformance tests are likely to
> notice the difference.
>

-- 
Joel Sherrill, Ph.D.             Director of Research & Development
joel.sherr...@oarcorp.com        On-Line Applications Research
Ask me about RTEMS: a free RTOS  Huntsville AL 35805
Support Available                (256) 722-9985




Re: C++ Standard Question

2015-01-23 Thread Jonathan Wakely

On 23/01/15 10:53 -0600, Joel Sherrill wrote:

Is there a better way to automate a signature compliance? To tweak
what they have done?


Testing member function signatures for compliance is inherently
flawed, they just shouldn't do it.

I would say they should be testing that the function can be called on
a non-const object and that it behaves as specified, rather than
testing for a specific signature.
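
One way such a call-based test could be automated is sketched below. It uses stand-in classes instead of the real std::strstreambuf, and the trait name `callable_pcount` is invented for illustration; the idea is only to check that `obj.pcount()` compiles on a non-const object and yields something convertible to int, whatever the exact declaration is.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical stand-ins; a real test would use the library type instead.
struct ConstBuf   { int pcount() const { return 3; } };
struct DefaultBuf { int pcount(int unused = 0) { return 3 + unused; } };
struct NoPcount   {};

// Primary template: assume the call is not possible.
template <class T, class = void>
struct callable_pcount : std::false_type {};

// Chosen only when `t.pcount()` is valid for a non-const T and the result
// converts to int. The exact signature is deliberately not inspected.
template <class T>
struct callable_pcount<
    T, decltype(static_cast<void>(
           static_cast<int>(std::declval<T&>().pcount())))>
    : std::true_type {};

inline void demo() {
    assert(callable_pcount<ConstBuf>::value);
    assert(callable_pcount<DefaultBuf>::value);
    assert(!callable_pcount<NoPcount>::value);

    // The behavioural half of the test: call it and check the result.
    ConstBuf b;
    assert(b.pcount() == 3);
}
```

This accepts a const member, a member with defaulted parameters, or an overload set, all of which an exact pointer-to-member comparison would reject.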


Re: C++ Standard Question

2015-01-23 Thread Joel Sherrill

On 1/23/2015 10:59 AM, Jonathan Wakely wrote:
> On 23/01/15 10:53 -0600, Joel Sherrill wrote:
>> Is there a better way to automate a signature compliance? To tweak
>> what they have done?
> Testing member function signatures for compliance is inherently
> flawed, they just shouldn't do it.
>
> I would say they should be testing that the function can be called on
> a non-const object and that it behaves as specified, rather than
> testing for a specific signature.
That's more or less how the RTEMS API signature tests work for C. We declare
a variable of each type and pass it into the method, including only the
.h files POSIX says you should.

Thanks.

-- 
Joel Sherrill, Ph.D.             Director of Research & Development
joel.sherr...@oarcorp.com        On-Line Applications Research
Ask me about RTEMS: a free RTOS  Huntsville AL 35805
Support Available                (256) 722-9985



Is there a way to dump LTO IR?

2015-01-23 Thread H.J. Lu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64754

is an LTO bug where stage 1 and stage 2 compilers generate
different LTO IR.  Is there a way to dump LTO IR to see the
actual difference in LTO IR?

Thanks.

-- 
H.J.


Re: Help with reload bug, please

2015-01-23 Thread Andrew Stubbs

On 23/01/15 16:34, Jeff Law wrote:

Just for reference, the PA allows a 14 bit displacement in memory
addresses which use integer registers, but only a 5 bit displacement for
FP registers.  Other than the displacement amounts, I suspect this is
the same core problem you have on your port.


Yes, that seems similar.


Ultimately all I could do was layer hack on hack.  I can't remember them
all.  The most significant ones were to first reject the larger offsets
for FP modes in GO_IF_LEGITIMATE_ADDRESS.  While it's still valid (and
relatively common on the PA) to access integer registers in FP modes or
vice-versa, this change was a huge help.


This is already the case; it does the right thing when the mode is SFmode.


Secondary reloads are critical.  When you detect a situation that won't
work, you have to allocate a secondary reload register so that you can
build up the address as well as all the reload_in/reload_out handling.
This is how you ensure that if the compiler did something like try to
load from memory using an integer mode into an FP register you've got a
scratch register for reloading the address if it is an out-of-range reg+
address.


SECONDARY_INPUT_RELOAD_CLASS is another missed opportunity. Just like 
the legitimate address stuff, this has checks for the various VFP 
classes, but reload detects the class in the same flawed way, so an 
integer reload gives GENERAL_REGS even when the destination is VFP. 
Within the macro there's no way to see the whole insn.



We may have used special constraints as well to allow loads/stores of
integer registers in FP modes to use the larger offset.


Do you have an example?

Thanks

Andrew