Re: RFC: Doc update for attribute

2014-05-20 Thread David Wohlferd
After thinking about this some more, I believe I have some better text.  
Previously I used the word "discouraged" to describe this practice.  The 
existing docs use the term "avoid."  I believe what you want is 
something more like the attached.  Direct and clear, just like docs 
should be.


If you are ok with this, I'll send it to gcc-patches.

dw


>> +While it
>> +is discouraged, it is possible to write your own prologue/epilogue code
>> +using asm and use ``C'' code in the middle.

> I wouldn't remove the last sentence since IMO it's not the intent of
> the feature to ever support that and the compiler doesn't guarantee it
> and may result in wrong code given that `naked' is a fragile low-level
> feature.


I'm assuming you meant "would remove."

I wasn't comfortable including that sentence, but I was following the 
existing docs.  Since they said you could "only" use basic asm, 
following that with a warning to "avoid" locals/if/etc was really 
confusing without this text.


Also, as ugly as this is, apparently some people really do this 
(comment 6): https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43404#c6


We don't have to doc every crazy thing people try to do with gcc. But 
since it's out there, maybe we should this time?  If only to 
discourage it.


I'm *slightly* more in favor of keeping it.  But if you still feel it 
should go, it's gone.


Index: extend.texi
===================================================================
--- extend.texi	(revision 210624)
+++ extend.texi	(working copy)
@@ -3332,16 +3332,15 @@
 
 @item naked
 @cindex function without a prologue/epilogue code
-Use this attribute on the ARM, AVR, MCORE, MSP430, NDS32, RL78, RX and SPU
-ports to indicate that the specified function does not need prologue/epilogue
-sequences generated by the compiler.
-It is up to the programmer to provide these sequences. The
-only statements that can be safely included in naked functions are
-@code{asm} statements that do not have operands.  All other statements,
-including declarations of local variables, @code{if} statements, and so
-forth, should be avoided.  Naked functions should be used to implement the
-body of an assembly function, while allowing the compiler to construct
-the requisite function declaration for the assembler.
+This attribute is available on the ARM, AVR, MCORE, MSP430, NDS32,
+RL78, RX and SPU ports.  It allows the compiler to construct the
+requisite function declaration, while allowing the body of the
+function to be assembly code. The specified function will not have
+prologue/epilogue sequences generated by the compiler. Only Basic
+@code{asm} statements can safely be included in naked functions
+(@pxref{Basic Asm}). While using Extended @code{asm} or a mixture of
+Basic @code{asm} and ``C'' code may appear to work, they cannot be
+depended upon to work reliably and are not supported.
 
 @item near
 @cindex functions that do not handle memory bank switching on 68HC11/68HC12
@@ -6269,6 +6268,8 @@
 efficient code, and in most cases it is a better solution. When writing 
 inline assembly language outside of C functions, however, you must use Basic 
 @code{asm}. Extended @code{asm} statements have to be inside a C function.
+Functions declared with the @code{naked} attribute also require Basic 
+@code{asm} (@pxref{Function Attributes}).
 
 Under certain circumstances, GCC may duplicate (or remove duplicates of) your 
 assembly code when optimizing. This can lead to unexpected duplicate 
@@ -6388,6 +6389,8 @@
 
 Note that Extended @code{asm} statements must be inside a function. Only 
 Basic @code{asm} may be outside functions (@pxref{Basic Asm}).
+Functions declared with the @code{naked} attribute also require Basic 
+@code{asm} (@pxref{Function Attributes}).
 
 While the uses of @code{asm} are many and varied, it may help to think of an 
 @code{asm} statement as a series of low-level instructions that convert input 


Re: RFC: Doc update for attribute

2014-05-20 Thread Georg-Johann Lay

On 05/16/2014 07:16 PM, Carlos O'Donell wrote:

On 05/12/2014 11:13 PM, David Wohlferd wrote:

After updating gcc's docs about inline asm, I'm trying to improve
some of the related sections. One that I feel has problems with
clarity is __attribute__ naked.

I have attached my proposed update. Comments/corrections are
welcome.

In a related question:

To better understand how this attribute is used, I looked at the
Linux kernel. While the existing docs say "only ... asm statements
that do not have operands" can safely be used, Linux routinely uses
asm WITH operands.


That's a bug. Period. You must not use naked with an asm that has
operands. Any kind of operand might inadvertently cause the compiler
to generate code and that would violate the requirements of the
attribute and potentially generate an ICE.


There is a target hook TARGET_ALLOCATE_STACK_SLOTS_FOR_ARGS that is intended to 
cater for that case.  For example, the documentation indicates it only works with 
optimization turned off.  But I don't know how reliable it is in general.  For 
the avr target it works as expected.


https://gcc.gnu.org/onlinedocs/gccint/Misc.html#index-TARGET_005fALLOCATE_005fSTACK_005fSLOTS_005fFOR_005fARGS-4969

Johann




Roadmap for 4.9.1, 4.10.0 and onwards?

2014-05-20 Thread Bruce Adams
Hi,
    I've been tracking the latest releases of gcc since 4.7 or so (variously 
interested in C++1y support, cilk and openmp).
One thing I've found hard to locate is information about planned inclusions for 
future releases. 
As much relies on unpredictable community contributions, I don't expect there to 
be a concrete or reliable plan. However, I'm equally sure the steering 
committee has some idea of what ought to be in upcoming releases. Is this 
published anywhere?


For example if I look at:


https://gcc.gnu.org/projects/cxx1y.html

There are 3 items marked "no" under C++14 support. Which, if any, are tabled for 
4.10.0?  More generally, what targets (obviously subject to change) are there for 
4.10.0? or 4.9.1?

Regards,

Bruce.


Supported targets

2014-05-20 Thread Bruce Adams
Hi,
   Slightly related to my previous question about the roadmap. I have two
quite old targets based on (so far as I know) standard linux distributions. 
Should they still be supported?


RHEL4 (kernel 2.6.9-55.ELsmp):


I was able to compile 4.8.1 successfully when it was released. 4.9.0 fails as 
below.
RHEL4 is end of life (but not extended life).

My feeling is this ought to work and is probably a regression I should report?


SUSE LINUX Enterprise Server 9 (i586) (kernel 2.6.5-7.111-smp)  


I was able to compile gcc 4.7.0 successfully when it was released. I had less 
luck with 4.8.0. 4.9.0 fails as below. However, this machine/distribution is 
so old it is not unreasonable to say it should be scrapped.


My main targets are RHEL5 and RHEL6 which work perfectly.

I also tried bootstrapping using 4.8.1 to build 4.9.0 on RHEL4 

and 4.7.0 to build 4.9.0 on the Suse box rather than the ancient
system installed versions (RHEL4 = gcc 3.4.6, Suse 9 = 3.3.3) but without 
success.


Regards,

Bruce.



RHEL4 (kernel 2.6.9-55.ELsmp):


[snip]
../../../../gcc-4.9.0/libsanitizer/include/system/linux/aio_abi.h:2:32: fatal error: linux/aio_abi.h: No such file or directory
 #include_next 
    ^
compilation terminated.
make[3]: *** [sanitizer_platform_limits_linux.lo] Error 1
make[3]: Leaving directory 
`/development/brucea/gcc/build/build/x86_64-unknown-linux-gnu/libsanitizer/sanitizer_common'
make[2]: *** [install-recursive] Error 1
[snip]

SUSE LINUX Enterprise Server 9 (i586) (kernel 2.6.5-7.111-smp)   


[snip]
/development/dev1/brucea/gcc4.7/bin/../lib/gcc/i686-pc-linux-gnu/4.7.0/../../../../i686-pc-linux-gnu/bin/ld: /home/brucea/gcc4.9/lib/libmpfr.so: undefined reference to symbol '___tls_get_addr@@GLIBC_2.3'
/development/dev1/brucea/gcc4.7/bin/../lib/gcc/i686-pc-linux-gnu/4.7.0/../../../../i686-pc-linux-gnu/bin/ld: note: '___tls_get_addr@@GLIBC_2.3' is defined in DSO /lib/ld-linux.so.2 so try adding it to the linker command line
/lib/ld-linux.so.2: could not read symbols: Invalid operation
collect2: error: ld returned 1 exit status
make[3]: *** [cc1] Error 1

[snip]


Requires a later version of glibc?


Re: Supported targets

2014-05-20 Thread Eric Botcazou
> [snip]
> /development/dev1/brucea/gcc4.7/bin/../lib/gcc/i686-pc-linux-gnu/4.7.0/../..
> /../../i686-pc-linux-gnu/bin/ld: /home/brucea/gcc4 .9/lib/libmpfr.so:
> undefined reference to symbol '___tls_get_addr@@GLIBC_2.3'
> /development/dev1/brucea/gcc4.7/bin/../lib/gcc/i686-pc-linux-gnu/4.7.0/../.
> ./../../i686-pc-linux-gnu/bin/ld: note: '___tls_get _addr@@GLIBC_2.3' is
> defined in DSO /lib/ld-linux.so.2 so try adding it to the linker command
> line /lib/ld-linux.so.2: could not read symbols: Invalid operation
> collect2: error: ld returned 1 exit status
> make[3]: *** [cc1] Error 1
> 
> [snip]
> 
> 
> Requires a later version of glibc?

Yes, glibc 2.4 is required for GCC 4.9 because of this.

-- 
Eric Botcazou


Re: Supported targets

2014-05-20 Thread Jonathan Wakely
On 20 May 2014 11:26, Bruce Adams wrote:
>
> RHEL4 (kernel 2.6.9-55.ELsmp):
>
>
> I was able to compile 4.8.1 successfully when it was released. 4.9.0 fails as 
> below.
> RHEL4 is end of life (but not extended life).
>
> My feeling is this ought to work and is probably a regression I should report?

Yes, I think it should be reported if it isn't in Bugzilla yet.

You can use  --disable-libsanitizer to build GCC without the failing library.
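
A minimal configure sketch of that workaround (directory and prefix names are illustrative, not from the thread):

```
# Build GCC 4.9.0 without libsanitizer, which fails against very old
# kernel headers (no linux/aio_abi.h).
mkdir objdir && cd objdir
../gcc-4.9.0/configure --prefix=/opt/gcc-4.9 --disable-libsanitizer
make && make install
```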


Re: Supported targets

2014-05-20 Thread Jonathan Wakely
On 20 May 2014 11:55, Eric Botcazou wrote:
>> [snip]
>> /development/dev1/brucea/gcc4.7/bin/../lib/gcc/i686-pc-linux-gnu/4.7.0/../..
>> /../../i686-pc-linux-gnu/bin/ld: /home/brucea/gcc4 .9/lib/libmpfr.so:
>> undefined reference to symbol '___tls_get_addr@@GLIBC_2.3'
>> /development/dev1/brucea/gcc4.7/bin/../lib/gcc/i686-pc-linux-gnu/4.7.0/../.
>> ./../../i686-pc-linux-gnu/bin/ld: note: '___tls_get _addr@@GLIBC_2.3' is
>> defined in DSO /lib/ld-linux.so.2 so try adding it to the linker command
>> line /lib/ld-linux.so.2: could not read symbols: Invalid operation
>> collect2: error: ld returned 1 exit status
>> make[3]: *** [cc1] Error 1
>>
>> [snip]
>>
>>
>> Requires a later version of glibc?
>
> Yes, glibc 2.4 is required for GCC 4.9 because of this.


Should that be noted at
https://gcc.gnu.org/install/specific.html#x-x-linux-gnu ?


Re: Supported targets

2014-05-20 Thread Eric Botcazou
> > Yes, glibc 2.4 is required for GCC 4.9 because of this.
> 
> Should that be noted at
> https://gcc.gnu.org/install/specific.html#x-x-linux-gnu ?

Probably, unless someone knows how to work around it.  We traced it to the 
missing AS_NEEDED in /usr/lib/libc.so:

/* GNU ld script
   Use the shared library, but some functions are only in
   the static library, so try that secondarily.  */
OUTPUT_FORMAT(elf32-i386)
GROUP ( /lib/libc.so.6 /usr/lib/libc_nonshared.a  AS_NEEDED ( /lib/ld-linux.so.2 ) )

-- 
Eric Botcazou


Re: Supported targets

2014-05-20 Thread Jakub Jelinek
On Tue, May 20, 2014 at 01:14:24PM +0200, Eric Botcazou wrote:
> > > Yes, glibc 2.4 is required for GCC 4.9 because of this.
> > 
> > Should that be noted at
> > https://gcc.gnu.org/install/specific.html#x-x-linux-gnu ?
> 
> Probably, unless someone knows how to work around it.  We traced it to the 
> missing AS_NEEDED in /usr/lib/libc.so:
> 
> /* GNU ld script
>Use the shared library, but some functions are only in
>the static library, so try that secondarily.  */
> OUTPUT_FORMAT(elf32-i386)
> GROUP ( /lib/libc.so.6 /usr/lib/libc_nonshared.a  AS_NEEDED ( /lib/ld-linux.so.2 ) )

But that should be generally needed only when linking with -Wl,-z,defs ,
without it the linker shouldn't care.

Jakub


Re: Supported targets

2014-05-20 Thread Eric Botcazou
> But that should be generally needed only when linking with -Wl,-z,defs ,
> without it the linker shouldn't care.

Yet using a local libc.so with the missing AS_NEEDED is a (poor) workaround.

-- 
Eric Botcazou


Re: [GSoC] writing test-case

2014-05-20 Thread Richard Biener
On Mon, May 19, 2014 at 5:51 PM, Michael Matz  wrote:
> Hi,
>
> On Thu, 15 May 2014, Richard Biener wrote:
>
>> To me predicate (and capture without expression or predicate)
>> differs from expression in that predicate is clearly a leaf of the
>> expression tree while we have to recurse into expression operands.
>>
>> Now, if we want to support applying predicates to the midst of an
>> expression, like
>>
>> (plus predicate(minus @0 @1)
>> @2)
>> (...)
>>
>> then this would no longer be true.  At the moment you'd write
>>
>> (plus (minus@3 @0 @1)
>> @2)
>>   if (predicate (@3))
>> (...)
>>
>> which makes it clearer IMHO (with the decision tree building
>> you'd apply the predicates after matching the expression tree
>> anyway I suppose, so code generation would be equivalent).
>
> Syntaxwise I had this idea for adding generic predicates to expressions:
>
> (plus (minus @0 @1):predicate
>   @2)
> (...)

So you'd write

 (plus @0 :integer_zerop)

instead of

 (plus @0 integer_zerop)

?

> If prefix or suffix doesn't matter much, but using a different syntax
> to separate expression from predicate seems to make things clearer.
> Optionally adding things like and/or for predicates might also make sense:
>
> (plus (minus @0 @1):positive_p(@0) || positive_p(@1)
>   @2)
> (...)

negation would be more useful I guess.  You open up a can of
worms with ordering though:

(plus (minus @0 @1) @2:operand_equal_p (@1, @2, 0))

which might be declared invalid or is equivalent to

(plus (minus @0 @1) @2):operand_equal_p (@1, @2, 0)

?

Note that your predicate placement doesn't match placement of
captures for non-innermost expressions.  capturing the outer
plus would be

(plus@3 (minus @0 @1) @2)

not

(plus (minus @0 @1) @2)@3

so maybe apply predicates there as well:

(plus:operand_equal_p (@1, @2, 0) (minus @0 @1)  @2)

But I still think that doing all predicates within a if-expr makes
the pattern less convoluted.
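
For concreteness, the capture-then-if-expr style described above can be sketched outside of genmatch as a toy matcher (the tuple representation and names are mine, not genmatch's): structural matching fills the capture table first, and the predicate runs afterwards on the captures.

```python
# Toy sketch of "match structurally first, apply predicates afterwards".
# Trees are tuples like ("plus", lhs, rhs); "@N" strings are captures.

def match(pattern, expr, caps):
    """Match pattern against expr, recording captures in caps."""
    if isinstance(pattern, str) and pattern.startswith("@"):
        idx = int(pattern[1:])
        if idx in caps:                      # repeated capture: must be equal
            return caps[idx] == expr
        caps[idx] = expr
        return True
    if not isinstance(expr, tuple) or pattern[0] != expr[0] \
            or len(pattern) != len(expr):
        return False
    return all(match(p, e, caps) for p, e in zip(pattern[1:], expr[1:]))

# (plus (minus @0 @1) @2), then the if-expr predicate on the captures:
caps = {}
ok = match(("plus", ("minus", "@0", "@1"), "@2"),
           ("plus", ("minus", 5, 3), 3), caps)
ok = ok and caps[1] == caps[2]   # like: if (operand_equal_p (@1, @2, 0))
print(ok, caps)                  # True {0: 5, 1: 3, 2: 3}
```

Attaching predicates inline would instead mean running them inside `match` while the rest of the tree is still unmatched, which is where the ordering questions above come from.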

Enabling/disabling a whole set of patterns with a common condition
might still be a worthwhile addition.

Richard.

>
> Ciao,
> Michael.


Re: [GSoC] first phase

2014-05-20 Thread Richard Biener
On Mon, May 19, 2014 at 7:30 PM, Prathamesh Kulkarni
 wrote:
> Hi,
>Unfortunately I shall need to take this week off, due to university exams,
> which are up-to 27th May. I will start working from 28th on pattern
> matching with decision tree, and try to cover up for the first week. I
> am extremely sorry about this.
> I thought I would be able to do both during exam week, but the exam
> load has become too much -:(

Ok.

> In the first phase (up-to 23rd June), I hope to get genmatch ready:
> a) pattern matching with decision tree.
> b) Add patterns to test genmatch.
> c) Depending upon the patterns, extending the meta-description
> d) Other fixes:
>
> * capturing outermost expressions.
> For example this pattern does not get simplified
> (match_and_simplify
>   (plus@2 (negate @0) @1)
>   if (!TYPE_SATURATING (TREE_TYPE (@2)))
>   (minus @1 @0))
> I guess this happens because in write_nary_simplifiers:
>   if (s->match->type != OP_EXPR)
> continue;

Yeah.

> Maybe this is not correct way to fix this, should we also pass lhs to
> generated gimple_match_and_simplify ? I guess that would be the capture
> for outermost expression.

Unfortunately it is not available for all API entries.  The type of the
expression is, though.

I lean towards rejecting the capture at parsing time and providing
a "special" capture (for example @@, or just @0, or @T to denote
it's a type, or just refer "magically" to 'type').  That is,

(match_and_simplify
  (plus (negate @0) @1)
  if (!TYPE_SATURATING (type))
  (minus @1 @0))

works for me.

> For above pattern, I guess @2 represents lhs.
>
> So for this test-case:
> int foo (int x, int y)
> {
>   int t1 = -x;
>   int t2 = t1 + y;
>   return t2;
> }
> t2 would be @2, t1 would be @0 and y would be @1.
> Is that correct ?
> This would create issues when lhs is NULL, for example,
> in call to built-in functions ?

Yeah, or if the machinery is called via gimple_build () where
there is no existing lhs.

> * avoid using statement expressions for code gen of expression
> * rewriting code-generator using visitor classes, and other refactoring
> (using std::string for example), etc.
>
> I have a very rough time-line in mind, for completing tasks:
> 28th may - 31st may
> a) Have test-case for each pattern present (except COND_EXPR) in match.pd
> I guess most of it is already done, a few patterns are remaining.

Good.

> b) Small fixes (for example, those mentioned above).

Good.

> c) Have an initial idea/prototype for implementing decision tree
>
> 1st June - 15th June
> a) Implementing decision tree
> b) Adding patterns in match.pd to test the decision tree in match.pd,
> and accompanying test-cases in tree-ssa/match-*.c
>
> 16th June - 23rd June
> a) Support for GENERIC code generation.
> b) Refactoring and backup time for backlog.
>
> GENERIC code generation:
> I am a bit confused about this. Currently, pattern matching is
> implemented for GENERIC. However I believe simplification is done on
> GIMPLE.
> For example:
> (match_and_simplify
>   (plus (negate @0) @1)
>   (minus @0 @1))
> If given input is GENERIC , it would do matching on GENERIC, but shall
> transform (minus @0 @1) to it's GIMPLE equivalent.
> Is that correct ?

Correct.  Err, not sure what it will do - I implemented it only to support
the weird cases where GENERIC is nested inside GIMPLE, like for
a_2 = b_3 < 0 ? c_4 : d_5;  thus the comment in match.pd:

/* Due to COND_EXPRs weirdness in GIMPLE the following won't work
   without some hacks in the code generator.  */
(match_and_simplify
  (cond (bit_not @0) @1 @2)
  (cond @0 @2 @1))

the code generator would need to know that COND_EXPR has
a GENERIC op0 ... same applies to REALPART_EXPR, but there
the hacks are already in place ;)

>
> * Should we have a separate GENERIC match-and-simplify API like for gimple
> instead of having GENERIC matching in gimple_match_and_simplify ?

Yes.  The GENERIC API follows the API of fold_{unary,binary,ternary}.
I suppose we simply provide a slightly different name for them
(but use the original API for recursing and call ourselves from the original
API).

> * Do we add another pattern type, something like
> generic_match_and_simplify that will do the transform on GENERIC
> for example:
> (generic_match_and_simplify
>   (plus (negate @0) @1)
>   (minus @0 @1))
> would produce GENERIC equivalent of (minus @0 @1).
>
> or maybe keep match_and_simplify, and tell the transform operand
> to produce GENERIC.
> Something like:
> (match_and_simplify
>   (plus (negate @0) @1)
>   GENERIC: (minus @0 @1))

we simply process each pattern twice, once we generate the
GIMPLE match-and-simplify routine and once we generate the
GENERIC match-and-simplify routine.  The patterns are supposed
to be the same for both and always apply to both.

> Another thing I would like to do in first phase is figure out dependencies
> of tree-ssa-forwprop on GENERIC folding (for instance fold_comparison 
> patterns).

Yeah.  Having patterns for comparison simpli

Re: [GSoC] writing test-case

2014-05-20 Thread Michael Matz
Hi,

On Tue, 20 May 2014, Richard Biener wrote:

> > Syntaxwise I had this idea for adding generic predicates to expressions:
> >
> > (plus (minus @0 @1):predicate
> >   @2)
> > (...)
> 
> So you'd write
> 
>  (plus @0 :integer_zerop)
> 
> instead of
> 
>  (plus @0 integer_zerop)
> 
> ?

plus is binary, where is your @1?  If you want to not capture the second 
operand but still have it tested for a predicate, then yes, it would be 
the first form.

> 
> > If prefix or suffix doesn't matter much, but using a different syntax
> > to separate expression from predicate seems to make things clearer.
> > Optionally adding things like and/or for predicates might also make sense:
> >
> > (plus (minus @0 @1):positive_p(@0) || positive_p(@1)
> >   @2)
> > (...)
> 
> negation would be more useful I guess.  You open up a can of
> worms with ordering though:
> 
> (plus (minus @0 @1) @2:operand_equal_p (@1, @2, 0))
> 
> which might be declared invalid or is equivalent to

It wouldn't necessarily be invalid; the predicate would apply to @2
but check operands 1 and 0 as well, which might be surprising.  In this 
case it might indeed be equivalent to:

> (plus (minus @0 @1) @2):operand_equal_p (@1, @2, 0)



> Note that your predicate placement doesn't match placement of
> captures for non-innermost expressions.  capturing the outer
> plus would be
> 
> (plus@3 (minus @0 @1) @2)


You're right, I'd allow placing the predicate directly behind the capture, 
i.e.:

(plus@3:predicate (minus @0 @1) @2)

> But I still think that doing all predicates within a if-expr makes the 
> pattern less convoluted.

I think it simply depends on the scope of the predicate.  If it's a 
predicate applying to multiple operands from different nesting levels, an 
if-expr is clearer (IMHO).  If it applies to one operand, it seems more 
natural to place it directly next to that operand.  I.e.:

(minus @0 @1:non_negative) // better

vs.

(minus @0 @1)
  (if (non_negative (@1))

But:

(plus@3 (minus @0 @1) @2)  // better
  (if (operand_equal_p (@1, @2, 0))

vs:

(plus@3:operand_equal_p (@1, @2, 0) (minus @0 @1) @2)

That is we could require that predicates that are applied with ':' need to 
be unary and apply to the one expression to which they are bound.
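
A toy sketch of that restriction (syntax and names are mine, purely illustrative): a unary predicate bound to one operand with ':' is checked at the moment that operand is matched.

```python
# Sketch: "@1:non_negative" means capture operand 1 only if the named
# unary predicate accepts it.  Not genmatch's real syntax or machinery.

def non_negative(x):
    return x >= 0

PREDICATES = {"non_negative": non_negative}

def match_leaf(spec, value, caps):
    """Match one capture spec like "@1" or "@1:non_negative" against value."""
    name, _, pred = spec.partition(":")
    if pred and not PREDICATES[pred](value):
        return False                 # predicate rejects this operand
    caps[int(name[1:])] = value
    return True

caps = {}
print(match_leaf("@1:non_negative", 7, caps), caps)   # True {1: 7}
print(match_leaf("@1:non_negative", -2, {}))          # False
```

Requiring ':' predicates to be unary keeps the check local to one operand; multi-operand checks like operand_equal_p would still need the if-expr form.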

> Enabling/disabling a whole set of patterns with a common condition
> might still be a worthwhile addition.

Right, but that seems orthogonal to the above?


Ciao,
Michael.


Re: [GSoC] first phase

2014-05-20 Thread Prathamesh Kulkarni
On Tue, May 20, 2014 at 5:46 PM, Richard Biener
 wrote:
> On Mon, May 19, 2014 at 7:30 PM, Prathamesh Kulkarni
>  wrote:
>> Hi,
>>Unfortunately I shall need to take this week off, due to university exams,
>> which are up-to 27th May. I will start working from 28th on pattern
>> matching with decision tree, and try to cover up for the first week. I
>> am extremely sorry about this.
>> I thought I would be able to do both during exam week, but the exam
>> load has become too much -:(
>
> Ok.
>
>> In the first phase (up-to 23rd June), I hope to get genmatch ready:
>> a) pattern matching with decision tree.
>> b) Add patterns to test genmatch.
>> c) Depending upon the patterns, extending the meta-description
>> d) Other fixes:
>>
>> * capturing outermost expressions.
>> For example this pattern does not get simplified
>> (match_and_simplify
>>   (plus@2 (negate @0) @1)
>>   if (!TYPE_SATURATING (TREE_TYPE (@2)))
>>   (minus @1 @0))
>> I guess this happens because in write_nary_simplifiers:
>>   if (s->match->type != OP_EXPR)
>> continue;
>
> Yeah.
>
>> Maybe this is not correct way to fix this, should we also pass lhs to
>> generated gimple_match_and_simplify ? I guess that would be the capture
>> for outermost expression.
>
> Unfortunately it is not available for all API entries.  The type of the
> expression is, though.
>
> I lean towards rejecting the capture at parsing time and providing
> a "special" capture (for example @@, or just @0, or @T to denote
> it's a type, or just refer "magically" to 'type').  That is,
>
> (match_and_simplify
>   (plus (negate @0) @1)
>   if (!TYPE_SATURATING (type))
>   (minus @1 @0))
>
> works for me.
>
>> For above pattern, I guess @2 represents lhs.
>>
>> So for this test-case:
>> int foo (int x, int y)
>> {
>>   int t1 = -x;
>>   int t2 = t1 + y;
>>   return t2;
>> }
>> t2 would be @2, t1 would be @0 and y would be @1.
>> Is that correct ?
>> This would create issues when lhs is NULL, for example,
>> in call to built-in functions ?
>
> Yeah, or if the machinery is called via gimple_build () where
> there is no existing lhs.
>
>> * avoid using statement expressions for code gen of expression
>> * rewriting code-generator using visitor classes, and other refactoring
>> (using std::string for example), etc.
>>
>> I have a very rough time-line in mind, for completing tasks:
>> 28th may - 31st may
>> a) Have test-case for each pattern present (except COND_EXPR) in match.pd
>> I guess most of it is already done, a few patterns are remaining.
>
> Good.
>
>> b) Small fixes (for example, those mentioned above).
>
> Good.
>
>> c) Have an initial idea/prototype for implementing decision tree
>>
>> 1st June - 15th June
>> a) Implementing decision tree
>> b) Adding patterns in match.pd to test the decision tree in match.pd,
>> and accompanying test-cases in tree-ssa/match-*.c
>>
>> 16th June - 23rd June
>> a) Support for GENERIC code generation.
>> b) Refactoring and backup time for backlog.
>>
>> GENERIC code generation:
>> I am a bit confused about this. Currently, pattern matching is
>> implemented for GENERIC. However I believe simplification is done on
>> GIMPLE.
>> For example:
>> (match_and_simplify
>>   (plus (negate @0) @1)
>>   (minus @0 @1))
>> If given input is GENERIC , it would do matching on GENERIC, but shall
>> transform (minus @0 @1) to it's GIMPLE equivalent.
>> Is that correct ?
>
> Correct.  Err, not sure what it will do - I implemented it only to support
> the weird cases where GENERIC is nested inside GIMPLE, like for
> a_2 = b_3 < 0 ? c_4 : d_5;  thus the comment in match.pd:
>
> /* Due to COND_EXPRs weirdness in GIMPLE the following won't work
>without some hacks in the code generator.  */
> (match_and_simplify
>   (cond (bit_not @0) @1 @2)
>   (cond @0 @2 @1))
>
> the code generator would need to know that COND_EXPR has
> a GENERIC op0 ... same applies to REALPART_EXPR, but there
> the hacks are already in place ;)
>
>>
>> * Should we have a separate GENERIC match-and-simplify API like for gimple
>> instead of having GENERIC matching in gimple_match_and_simplify ?
>
> Yes.  The GENERIC API follows the API of fold_{unary,binary,ternary}.
> I suppose we simply provide a slightly different name for them
> (but use the original API for recursing and call ourselves from the original
> API).
>
>> * Do we add another pattern type, something like
>> generic_match_and_simplify that will do the transform on GENERIC
>> for example:
>> (generic_match_and_simplify
>>   (plus (negate @0) @1)
>>   (minus @0 @1))
>> would produce GENERIC equivalent of (minus @0 @1).
>>
>> or maybe keep match_and_simplify, and tell the transform operand
>> to produce GENERIC.
>> Something like:
>> (match_and_simplify
>>   (plus (negate @0) @1)
>>   GENERIC: (minus @0 @1))
>
> we simply process each pattern twice, once we generate the
> GIMPLE match-and-simplify routine and once we generate the
> GENERIC match-and-simplify routine.  The patterns are supposed
> to be the same for bo

Re: [GSoC] first phase

2014-05-20 Thread Richard Biener
On Tue, May 20, 2014 at 2:59 PM, Prathamesh Kulkarni
 wrote:
> On Tue, May 20, 2014 at 5:46 PM, Richard Biener
>  wrote:
>> On Mon, May 19, 2014 at 7:30 PM, Prathamesh Kulkarni
>>  wrote:
>>> Hi,
>>>Unfortunately I shall need to take this week off, due to university 
>>> exams,
>>> which are up-to 27th May. I will start working from 28th on pattern
>>> matching with decision tree, and try to cover up for the first week. I
>>> am extremely sorry about this.
>>> I thought I would be able to do both during exam week, but the exam
>>> load has become too much -:(
>>
>> Ok.
>>
>>> In the first phase (up-to 23rd June), I hope to get genmatch ready:
>>> a) pattern matching with decision tree.
>>> b) Add patterns to test genmatch.
>>> c) Depending upon the patterns, extending the meta-description
>>> d) Other fixes:
>>>
>>> * capturing outermost expressions.
>>> For example this pattern does not get simplified
>>> (match_and_simplify
>>>   (plus@2 (negate @0) @1)
>>>   if (!TYPE_SATURATING (TREE_TYPE (@2)))
>>>   (minus @1 @0))
>>> I guess this happens because in write_nary_simplifiers:
>>>   if (s->match->type != OP_EXPR)
>>> continue;
>>
>> Yeah.
>>
>>> Maybe this is not correct way to fix this, should we also pass lhs to
>>> generated gimple_match_and_simplify ? I guess that would be the capture
>>> for outermost expression.
>>
>> Unfortunately it is not available for all API entries.  The type of the
>> expression is, though.
>>
>> I lean towards rejecting the capture at parsing time and providing
>> a "special" capture (for example @@, or just @0, or @T to denote
>> it's a type, or just refer "magically" to 'type').  That is,
>>
>> (match_and_simplify
>>   (plus (negate @0) @1)
>>   if (!TYPE_SATURATING (type))
>>   (minus @1 @0))
>>
>> works for me.
>>
>>> For above pattern, I guess @2 represents lhs.
>>>
>>> So for this test-case:
>>> int foo (int x, int y)
>>> {
>>>   int t1 = -x;
>>>   int t2 = t1 + y;
>>>   return t2;
>>> }
>>> t2 would be @2, t1 would be @0 and y would be @1.
>>> Is that correct ?
>>> This would create issues when lhs is NULL, for example,
>>> in call to built-in functions ?
>>
>> Yeah, or if the machinery is called via gimple_build () where
>> there is no existing lhs.
>>
>>> * avoid using statement expressions for code gen of expression
>>> * rewriting code-generator using visitor classes, and other refactoring
>>> (using std::string for example), etc.
>>>
>>> I have a very rough time-line in mind, for completing tasks:
>>> 28th may - 31st may
>>> a) Have test-case for each pattern present (except COND_EXPR) in match.pd
>>> I guess most of it is already done, a few patterns are remaining.
>>
>> Good.
>>
>>> b) Small fixes (for example, those mentioned above).
>>
>> Good.
>>
>>> c) Have an initial idea/prototype for implementing decision tree
>>>
>>> 1st June - 15th June
>>> a) Implementing decision tree
>>> b) Adding patterns in match.pd to test the decision tree in match.pd,
>>> and accompanying test-cases in tree-ssa/match-*.c
>>>
>>> 16th June - 23rd June
>>> a) Support for GENERIC code generation.
>>> b) Refactoring and backup time for backlog.
>>>
>>> GENERIC code generation:
>>> I am a bit confused about this. Currently, pattern matching is
>>> implemented for GENERIC. However I believe simplification is done on
>>> GIMPLE.
>>> For example:
>>> (match_and_simplify
>>>   (plus (negate @0) @1)
>>>   (minus @0 @1))
>>> If given input is GENERIC , it would do matching on GENERIC, but shall
>>> transform (minus @0 @1) to it's GIMPLE equivalent.
>>> Is that correct ?
>>
>> Correct.  Err, not sure what it will do - I implemented it only to support
>> the weird cases where GENERIC is nested inside GIMPLE, like for
>> a_2 = b_3 < 0 ? c_4 : d_5;  thus the comment in match.pd:
>>
>> /* Due to COND_EXPRs weirdness in GIMPLE the following won't work
>>without some hacks in the code generator.  */
>> (match_and_simplify
>>   (cond (bit_not @0) @1 @2)
>>   (cond @0 @2 @1))
>>
>> the code generator would need to know that COND_EXPR has
>> a GENERIC op0 ... same applies to REALPART_EXPR, but there
>> the hacks are already in place ;)
>>
>>>
>>> * Should we have a separate GENERIC match-and-simplify API like for gimple
>>> instead of having GENERIC matching in gimple_match_and_simplify ?
>>
>> Yes.  The GENERIC API follows the API of fold_{unary,binary,ternary}.
>> I suppose we simply provide a slightly different name for them
>> (but use the original API for recursing and call ourselves from the original
>> API).
>>
>>> * Do we add another pattern type, something like
>>> generic_match_and_simplify that will do the transform on GENERIC
>>> for example:
>>> (generic_match_and_simplify
>>>   (plus (negate @0) @1)
>>>   (minus @0 @1))
>>> would produce GENERIC equivalent of (minus @0 @1).
>>>
>>> or maybe keep match_and_simplify, and tell the transform operand
>>> to produce GENERIC.
>>> Something like:
>>> (match_and_simplify
>>>   (plus (negate @0) @1)
>>>   GENERIC: (minus @0 @1))
>>

Re: [GSoC] writing test-case

2014-05-20 Thread Richard Biener
On Tue, May 20, 2014 at 2:20 PM, Michael Matz  wrote:
> Hi,
>
> On Tue, 20 May 2014, Richard Biener wrote:
>
>> > Syntaxwise I had this idea for adding generic predicates to expressions:
>> >
>> > (plus (minus @0 @1):predicate
>> >   @2)
>> > (...)
>>
>> So you'd write
>>
>>  (plus @0 :integer_zerop)
>>
>> instead of
>>
>>  (plus @0 integer_zerop)
>>
>> ?
>
> plus is binary, where is your @1?

I know it's zero so I don't need it captured.

(match_and_simplify
   (plus @0 integer_zerop)
   @0)

mind that all predicates apply to leaves only at the moment.

>  If you want to not capture the second
> operand but still have it tested for a predicates, then yes, the first
> form it would be.

Ok.

>>
>> > If prefix or suffix doesn't matter much, but using a different syntax
>> > to separate expression from predicate seems to make things clearer.
>> > Optionally adding things like and/or for predicates might also make sense:
>> >
>> > (plus (minus @0 @1):positive_p(@0) || positive_p(@1)
>> >   @2)
>> > (...)
>>
>> negation whould be more useful I guess.  You open up a can of
>> worms with ordering though:
>>
>> (plus (minus @0 @1) @2:operand_equal_p (@1, @2, 0))
>>
>> which might be declared invalid or is equivalent to
>
> It wouldn't necessarily be invalid, the predicate would apply to @2;
> but check operands 1 and 0 as well, which might be surprising.  In this
> case it might indeed be equivalent to :
>
>> (plus (minus @0 @1) @2):operand_equal_p (@1, @2, 0)
>
>
>
>> Note that your predicate placement doesn't match placement of
>> captures for non-innermost expressions.  capturing the outer
>> plus would be
>>
>> (plus@3 (minus @0 @1) @2)
>
>
> You're right, I'd allow placing the predicate directly behind the capture,
> i.e.:
>
> (plus@3:predicate (minus @0 @1) @2)
>
>> But I still think that doing all predicates within an if-expr makes the
>> pattern less convoluted.
>
> I think it simply depends on the scope of the predicate.  If it's a
> predicate applying to multiple operands from different nesting levels an
> if-expr is clearer (IMHO).  If it applies to one operand it seems more
> natural to place it directly next to that operand.  I.e.:
>
> (minus @0 @1:non_negative) // better
>
> vs.
>
> (minus @0 @1)
>   (if (non_negative (@1)))
>
> But:
>
> (plus@3 (minus @0 @1) @2)  // better
>   (if (operand_equal_p (@1, @2, 0)))
>
> vs:
>
> (plus@3:operand_equal_p (@1, @2, 0) (minus @0 @1) @2)
>
> That is we could require that predicates that are applied with ':' need to
> be unary and apply to the one expression to which they are bound.

Your example applies to leaves, which we already support as

(minus @0 non_negative@1)

any good example that is not convoluted where that applies to a
non-leaf position?

>> Enabling/disabling a whole set of patterns with a common condition
>> might still be a worthwhile addition.
>
> Right, but that seems orthogonal to the above?

Right.

Richard.

>
> Ciao,
> Michael.
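
The two predicate placements debated above — attached to an operand versus a trailing if-expr — can be mimicked in a toy matcher (a hypothetical sketch, not the actual genmatch code generator):

```python
def non_negative(x):
    return x >= 0

# Style 1: (minus @0 @1:non_negative) -- predicate bound to one operand.
def match_operand_pred(expr):
    if expr[0] == 'minus' and non_negative(expr[2]):
        return expr[1], expr[2]             # captures @0 and @1
    return None

# Style 2: (minus @0 @1) guarded by (if (non_negative (@1))) -- the
# same test expressed as an outer guard after matching.
def match_if_expr(expr):
    captures = (expr[1], expr[2]) if expr[0] == 'minus' else None
    if captures is not None and non_negative(captures[1]):
        return captures
    return None
```

Both behave identically; the debate is purely about which form reads better when the predicate touches one operand versus several from different nesting levels.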


Re: RFC: Doc update for attribute

2014-05-20 Thread Carlos O'Donell
On 05/20/2014 03:02 AM, David Wohlferd wrote:
> After thinking about this some more, I believe I have some better
> text. Previously I used the word "discouraged" to describe this
> practice. The existing docs use the term "avoid." I believe what you
> want is something more like the attached. Direct and clear, just like
> docs should be.

David, Thanks for the new patch.

> If you are ok with this, I'll send it to gcc-patches.

Looks good to me.

Cheers,
Carlos.

> Index: extend.texi
> ===
> --- extend.texi   (revision 210624)
> +++ extend.texi   (working copy)
> @@ -3332,16 +3332,15 @@
>  
>  @item naked
>  @cindex function without a prologue/epilogue code
> -Use this attribute on the ARM, AVR, MCORE, MSP430, NDS32, RL78, RX and SPU
> -ports to indicate that the specified function does not need prologue/epilogue
> -sequences generated by the compiler.
> -It is up to the programmer to provide these sequences. The
> -only statements that can be safely included in naked functions are
> -@code{asm} statements that do not have operands.  All other statements,
> -including declarations of local variables, @code{if} statements, and so
> -forth, should be avoided.  Naked functions should be used to implement the
> -body of an assembly function, while allowing the compiler to construct
> -the requisite function declaration for the assembler.
> +This attribute is available on the ARM, AVR, MCORE, MSP430, NDS32,
> +RL78, RX and SPU ports.  It allows the compiler to construct the
> +requisite function declaration, while allowing the body of the
> +function to be assembly code. The specified function will not have
> +prologue/epilogue sequences generated by the compiler. Only Basic
> +@code{asm} statements can safely be included in naked functions
> +(@pxref{Basic Asm}). While using Extended @code{asm} or a mixture of
> +Basic @code{asm} and ``C'' code may appear to work, they cannot be
> +depended upon to work reliably and are not supported.
>  
>  @item near
>  @cindex functions that do not handle memory bank switching on 68HC11/68HC12
> @@ -6269,6 +6268,8 @@
>  efficient code, and in most cases it is a better solution. When writing 
>  inline assembly language outside of C functions, however, you must use Basic 
>  @code{asm}. Extended @code{asm} statements have to be inside a C function.
> +Functions declared with the @code{naked} attribute also require Basic 
> +@code{asm} (@pxref{Function Attributes}).
>  
>  Under certain circumstances, GCC may duplicate (or remove duplicates of) 
> your 
>  assembly code when optimizing. This can lead to unexpected duplicate 
> @@ -6388,6 +6389,8 @@
>  
>  Note that Extended @code{asm} statements must be inside a function. Only 
>  Basic @code{asm} may be outside functions (@pxref{Basic Asm}).
> +Functions declared with the @code{naked} attribute also require Basic 
> +@code{asm} (@pxref{Function Attributes}).
>  
>  While the uses of @code{asm} are many and varied, it may help to think of an 
>  @code{asm} statement as a series of low-level instructions that convert 
> input 
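
As a concrete illustration of the rule the new text states (Basic asm only, entire body supplied in assembly), here is a minimal hypothetical ARM fragment — not compiled here, and the exact instructions are illustrative only:

```c
/* Hypothetical ARM sketch: with the naked attribute the compiler emits
   only the label and directives; the programmer supplies the whole
   body, including the return, since no epilogue is generated.  */
__attribute__ ((naked))
int
return_zero (void)
{
  __asm__ ("mov r0, #0\n\t"   /* return value in r0 per the AAPCS */
           "bx lr");          /* explicit return -- no epilogue exists */
}
```

Declaring locals, writing `if` statements, or using Extended asm with operands in such a function may appear to work but is unsupported, which is exactly what the patch text says.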



Re: RFC: Doc update for attribute

2014-05-20 Thread Carlos O'Donell
On 05/20/2014 03:59 AM, Georg-Johann Lay wrote:
> On 05/16/2014 07:16 PM, Carlos O'Donell wrote:
>> On 05/12/2014 11:13 PM, David Wohlferd wrote:
>>> After updating gcc's docs about inline asm, I'm trying to
>>> improve some of the related sections. One that I feel has
>>> problems with clarity is __attribute__ naked.
>>> 
>>> I have attached my proposed update. Comments/corrections are 
>>> welcome.
>>> 
>>> In a related question:
>>> 
>>> To better understand how this attribute is used, I looked at the 
>>> Linux kernel. While the existing docs say "only ... asm
>>> statements that do not have operands" can safely be used, Linux
>>> routinely uses asm WITH operands.
>> 
>> That's a bug. Period. You must not use naked with an asm that has 
>> operands. Any kind of operand might inadvertently cause the
>> compiler to generate code and that would violate the requirements
>> of the attribute and potentially generate an ICE.
> 
> There is a target hook TARGET_ALLOCATE_STACK_SLOTS_FOR_ARGS that is
> intended to cater for that case.  For example, the documentation
> indicates it only works with optimization turned off.  But I don't
> know how reliable it is in general.  For avr target it works as
> expected.
> 
> https://gcc.gnu.org/onlinedocs/gccint/Misc.html#index-TARGET_005fALLOCATE_005fSTACK_005fSLOTS_005fFOR_005fARGS-4969

It's still a bug for now. That hook is there because we've allowed 
bad code to exist for so long that at this point we must for legacy
reasons allow some type of input arguments in the asm. However, that
doesn't mean we should actively promote this feature or let users
use it (until we fix it).

Ideally you do want to use the named input arguments as "r" types to
avoid needing to know the exact registers used in the call sequence.
Referencing the variables by name and letting gcc emit the right
register is useful, but only if it works consistently and today it
doesn't.

Features that fail to work depending on the optimization level should
not be promoted in the documentation. We should document what works
and file bugs or fix what doesn't work.

Cheers,
Carlos.


Weird startup issue with -fsplit-stack

2014-05-20 Thread Dmitry Antipov

Hello,

I'm trying to support -fsplit-stack in GNU Emacs. The most important problem is 
that
GC uses conservative scanning of a C stack, so I need to iterate over stack 
segments.
I'm doing this by using  __splitstack_find, as described in 
libgcc/generic-morestack.c;
but now I'm facing the weird issue with startup:

Core was generated by `./temacs --batch --load loadup bootstrap'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  __morestack () at ../../../gcc-4.9.0/libgcc/config/i386/morestack.S:486
486 pushq   %rax
(gdb) bt 10
#0  __morestack () at ../../../gcc-4.9.0/libgcc/config/i386/morestack.S:486
#1  0x005f15df in __morestack () at 
../../../gcc-4.9.0/libgcc/config/i386/morestack.S:502
#2  0x005f15df in __morestack () at 
../../../gcc-4.9.0/libgcc/config/i386/morestack.S:502
#3  0x005f15df in __morestack () at 
../../../gcc-4.9.0/libgcc/config/i386/morestack.S:502
#4  0x005f15df in __morestack () at 
../../../gcc-4.9.0/libgcc/config/i386/morestack.S:502
#5  0x005f15df in __morestack () at 
../../../gcc-4.9.0/libgcc/config/i386/morestack.S:502
#6  0x005f15df in __morestack () at 
../../../gcc-4.9.0/libgcc/config/i386/morestack.S:502
#7  0x005f15df in __morestack () at 
../../../gcc-4.9.0/libgcc/config/i386/morestack.S:502
#8  0x005f15df in __morestack () at 
../../../gcc-4.9.0/libgcc/config/i386/morestack.S:502
#9  0x005f15df in __morestack () at 
../../../gcc-4.9.0/libgcc/config/i386/morestack.S:502
(More stack frames follow...)
(gdb) bt -10
#87310 0x005f15df in __morestack () at 
../../../gcc-4.9.0/libgcc/config/i386/morestack.S:502
#87311 0x005f15df in __morestack () at 
../../../gcc-4.9.0/libgcc/config/i386/morestack.S:502
#87312 0x005f15df in __morestack () at 
../../../gcc-4.9.0/libgcc/config/i386/morestack.S:502
#87313 0x005f15df in __morestack () at 
../../../gcc-4.9.0/libgcc/config/i386/morestack.S:502
#87314 0x005f15df in __morestack () at 
../../../gcc-4.9.0/libgcc/config/i386/morestack.S:502
#87315 0x005f15df in __morestack () at 
../../../gcc-4.9.0/libgcc/config/i386/morestack.S:502
#87316 0x005f15df in __morestack () at 
../../../gcc-4.9.0/libgcc/config/i386/morestack.S:502
#87317 0x005f15df in __morestack () at 
../../../gcc-4.9.0/libgcc/config/i386/morestack.S:502
#87318 0x003791a21d65 in __libc_start_main (main=0x4d111d , argc=5, 
argv=0x7fffacc868d8, init=,
fini=, rtld_fini=, stack_end=0x7fffacc868c8) 
at libc-start.c:285
#87319 0x00405f69 in _start ()
(gdb)

Unfortunately I was unable to reproduce this issue with small test programs, so
there is no simple and easy-to-use recipe. Anyway, if someone would like to try:

bzr branch bzr://bzr.savannah.gnu.org/emacs/trunk
cd trunk
cat /path/to/emacs_split_stack.patch | patch -p0
# 'configure' options for 'smallest possible' configuration
CPPFLAGS='-DSPLIT_STACK=1' CFLAGS='-O0 -g3 -fsplit-stack' ./configure 
--prefix=/some/dir --without-all --without-x --disable-acl
make

I'm using (homebrew) GCC 4.9.0 and (stock) gold 2.24 on a Fedora 20 system.

Dmitry

=== modified file 'src/alloc.c'
--- src/alloc.c	2014-05-19 19:19:05 +
+++ src/alloc.c	2014-05-20 14:01:56 +
@@ -4932,11 +4932,28 @@
 #endif /* not GC_SAVE_REGISTERS_ON_STACK */
 #endif /* not HAVE___BUILTIN_UNWIND_INIT */
 
-  /* This assumes that the stack is a contiguous region in memory.  If
- that's not the case, something has to be done here to iterate
- over the stack segments.  */
+#ifdef SPLIT_STACK
+
+  /* This assumes gcc >= 4.6.0 with -fsplit-stack
+ and corresponding support in libgcc.  */
+  {
+size_t stack_size;
+extern void * __splitstack_find (void *, void *, size_t *,
+ void **, void **, void **);
+void *next_segment = NULL, *next_sp = NULL, *initial_sp = NULL, *stack;
+
+while ((stack = __splitstack_find (next_segment, next_sp, &stack_size,
+   &next_segment, &next_sp, &initial_sp)))
+  mark_memory (stack, (char *) stack + stack_size);
+  }
+
+#else /* not SPLIT_STACK */
+
+  /* This assumes that the stack is a contiguous region in memory.  */
   mark_memory (stack_base, end);
 
+#endif /* SPLIT_STACK */
+
   /* Allow for marking a secondary stack, like the register stack on the
  ia64.  */
 #ifdef GC_MARK_SECONDARY_STACK



Re: soft-fp functions support without using libgcc

2014-05-20 Thread Sheheryar Zahoor Qazi
>>If you have a working compiler that is missing some functions
>>provided by libgcc, that should be sufficient to build libgcc.
Meaning that even if I am unable to build libgcc for my new architecture,
I should be able to provide soft-fp support to the
architecture?

Btw, I get the following error when I build gcc:
configure:2627: error: in
`/target-arch/target-arch-gcc/builddir/target-arch/libgcc':
configure:2630: error: cannot compute suffix of object files: cannot compile


And regarding soft-fp, I get the following error when I use soft-fp
functions in a test program:
: In function `test':
(.text+0x0): undefined reference to `__floatsisf'
 In function `test':
: In function `test':
(.text+0x2c): undefined reference to `__mulsf3'
: In function `test':
(.text+0x2e): undefined reference to `__fixsfsi'

Is this due to the libgcc build failing, or is it just a linking error?


>>In other words, if you want soft-fp for IEEE float, the job should be very 
>>simple because that has already been done.  If you want soft-fp for CDC 6000 
>>float, you have to do a full implementation of that.
Actually I want soft-fp for the standard IEEE 754.

Sheheryar


On Fri, May 16, 2014 at 6:34 PM,   wrote:
>
> On May 16, 2014, at 12:25 PM, Ian Bolton  wrote:
>
>>> On Fri, May 16, 2014 at 6:34 AM, Sheheryar Zahoor Qazi
>>>  wrote:

 I am trying to provide soft-fp support to an 18-bit soft-core
 processor architecture at my university. But the problem is that
 libgcc has not been cross-compiled for my target architecture and
>>> some
 functions are missing so I cannot build libgcc.  I believe soft-fp is
 compiled in libgcc, so I am unable to invoke soft-fp functions from
 libgcc.
 Is it possible for me to provide soft-fp support without using
>>> libgcc?
 How should I proceed in defining the functions? Any idea? And does
>>> any
 architecture provide floating point support without using libgcc?
>>>
>>> I'm sorry, I don't understand the premise of your question.  It is not
>>> necessary to build libgcc before building libgcc.  That would not make
>>> sense.  If you have a working compiler that is missing some functions
>>> provided by libgcc, that should be sufficient to build libgcc.
>>
>> If you replace "cross-compiled" with "ported", I think it makes senses.
>> Can one provide soft-fp support without porting libgcc for their
>> architecture?
>
> By definition, in soft-fp you have to implement the FP operations in 
> software.  That’s not quite the same as porting libgcc to the target 
> architecture.  It should translate to porting libgcc (the FP emulation part) 
> to the floating point format being used.
>
> In other words, if you want soft-fp for IEEE float, the job should be very 
> simple because that has already been done.  If you want soft-fp for CDC 6000 
> float, you have to do a full implementation of that.
>
> paul
>


Re: negative latencies

2014-05-20 Thread Vladimir Makarov
On 05/19/2014 02:13 AM, shmeel gutl wrote:
> Are there hooks in gcc to deal with negative latencies? In other
> words, an architecture that permits an instruction to use a result
> from an instruction that will be issued later.
>

Could you explain more, on *an example*, what you are trying to achieve
with the negative latency?

The scheduler is based on a critical-path algorithm.  Generally speaking,
latency time can be negative for this algorithm.  But I guess that is
not what you are asking.

> At first glance it seems that it will break a few things.
> 1) The definition of dependencies cannot come from the simple ordering
> of rtl.
> 2) The scheduling problem starts to look like "get off the train 3
> stops before me".
> 3) The definition of live ranges needs to use actual instruction
> timing information, not just instruction sequencing.
>
> The hooks in the scheduler seem to be enough to stop damage but not
> enough to take advantage of this "feature".
>



Re: Roadmap for 4.9.1, 4.10.0 and onwards?

2014-05-20 Thread Basile Starynkevitch
On Tue, 2014-05-20 at 11:09 +0100, Bruce Adams wrote:
> Hi,
> I've been tracking the latest releases of gcc since 4.7 or so (variously 
> interested in C++1y support, cilk and openmp).
> One thing I've found hard to locate is information about planned inclusions 
> for future releases. 
> As much relies on unpredictable community contributions I don't expect there 
> to be a concrete or reliable plan. 

> However, equally I'm sure the steering committee have some ideas over what 
> ought
> to be upcoming releases. 

As a whole, the steering committee does not have any idea, because GCC
development is based upon volunteer contributions.

However, some members of the steering committee might work in large
organization having a team of GCC contributors. That team might have its
own (private) agenda. But every patch has to be approved by someone
else.

So I don't think that the steering committee knows a lot more than you
and me.

Regards.
-- 
Basile STARYNKEVITCH http://starynkevitch.net/Basile/
email: basilestarynkevitchnet mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mine, sont seulement les miennes} ***




RE: Roadmap for 4.9.1, 4.10.0 and onwards?

2014-05-20 Thread Paulo Matos


> -Original Message-
> From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf
> Of Basile Starynkevitch
> Sent: 20 May 2014 16:29
> To: Bruce Adams
> Cc: gcc@gcc.gnu.org
> Subject: Re: Roadmap for 4.9.1, 4.10.0 and onwards?
> 
> On Tue, 2014-05-20 at 11:09 +0100, Bruce Adams wrote:
> > Hi,
> > I've been tracking the latest releases of gcc since 4.7 or so
> (variously interested in C++1y support, cilk and openmp).
> > One thing I've found hard to locate is information about planned
> inclusions for future releases.
> > As much relies on unpredictable community contributions I don't
> expect there to be a concrete or reliable plan.
> 
> > However, equally I'm sure the steering committee have some ideas
> over
> > what ought to be upcoming releases.
> 
> As a whole, the steering committee does not have any idea, because GCC
> development is based upon volunteer contributions.
>

I understand the argument but I am not sure it's the way to go. Even if the 
project is based on volunteer contributions it would be interesting to have a 
tentative roadmap. This, I would think, would also help possible beginner 
volunteers know where to start if they wanted to contribute to the project. So 
the roadmap could be a list of features (big or small) or bug fixes that we 
would like fixed for a particular version. Even if we don't want to name it 
a roadmap it would still be interesting to have a list of things that are being 
worked on or in the process of being merged into mainline and therefore will 
make it to the next major version.

That being said, I know it's hard to set some time apart to write this kind of 
thing given most of us prefer to be hacking on GCC. From a newcomer point of 
view, however, not having things like a roadmap makes it look like the project 
is heading nowhere.


Re: soft-fp functions support without using libgcc

2014-05-20 Thread Ian Lance Taylor
On Tue, May 20, 2014 at 7:37 AM, Sheheryar Zahoor Qazi
 wrote:
>>>If you have a working compiler that is missing some functions
>>>provided by libgcc, that should be sufficient to build libgcc.
> Meaning that even if i am unable build libgcc to my new architecture,
> I should be able to able to provide soft-fp support to the
> architecture?

You need to build soft-fp as part of libgcc.  What I am saying is that
you don't need soft-fp support in order to build libgcc.


> Btw i get the following error when i build gcc:
> configure:2627: error: in
> `/target-arch/target-arch-gcc/builddir/target-arch/libgcc':
> configure:2630: error: cannot compute suffix of object files: cannot compile

You need to look in target-arch/libgcc/config.log to see what the
problem is.


> And regarding soft-fp, I get the following error when i use soft-fp
> functions in a test program:
> : In function `test':
> (.text+0x0): undefined reference to `__floatsisf'
>  In function `test':
> : In function `test':
> (.text+0x2c): undefined reference to `__mulsf3'
> : In function `test':
> (.text+0x2e): undefined reference to `__fixsfsi'
>
> Is this due to libgcc build fail or it just linking error?

It's because libgcc was not built.

Ian
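
For context on what the missing routines do: __floatsisf, __mulsf3 and __fixsfsi implement single-precision IEEE 754 arithmetic in software by manipulating the sign/exponent/fraction fields of the format. A minimal sketch of unpacking those fields (illustrative only; libgcc's soft-fp code is far more involved):

```python
import struct

def unpack_sf(f):
    """Split a C 'float' into its IEEE 754 single-precision fields."""
    bits = struct.unpack('<I', struct.pack('<f', f))[0]
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF      # biased by 127
    fraction = bits & 0x7FFFFF          # 23 explicit significand bits
    return sign, exponent, fraction
```

For example, unpack_sf(1.0) yields (0, 127, 0) and unpack_sf(-2.0) yields (1, 128, 0); a routine like __mulsf3 adds exponents, multiplies significands, then rounds and repacks.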


Re: Roadmap for 4.9.1, 4.10.0 and onwards?

2014-05-20 Thread Bruce Adams




- Original Message -
> From: Paulo Matos 
> To: Basile Starynkevitch ; Bruce Adams 
> 
> Cc: "gcc@gcc.gnu.org" 
> Sent: Tuesday, May 20, 2014 5:04 PM
> Subject: RE: Roadmap for 4.9.1, 4.10.0 and onwards?
> 
>>  -Original Message-
>>  From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf
>>  Of Basile Starynkevitch
>>  Sent: 20 May 2014 16:29
>>  To: Bruce Adams
>>  Cc: gcc@gcc.gnu.org
>>  Subject: Re: Roadmap for 4.9.1, 4.10.0 and onwards?
>> 
>>  On Tue, 2014-05-20 at 11:09 +0100, Bruce Adams wrote:
>>  > Hi,
>>  >     I've been tracking the latest releases of gcc since 4.7 or so
>>  (variously interested in C++1y support, cilk and openmp).
>>  > One thing I've found hard to locate is information about planned
>>  inclusions for future releases.
>>  > As much relies on unpredictable community contributions I don't
>>  expect there to be a concrete or reliable plan.
>> 
>>  > However, equally I'm sure the steering committee have some ideas
>>  over
>>  > what ought to be upcoming releases.
>> 
>>  As a whole, the steering committee does not have any idea, because GCC
>>  development is based upon volunteer contributions.
>> 
> 
> I understand the argument but I am not sure it's the way to go. Even if the 
> project is based on volunteer contributions it would be interesting to have a 
> tentative roadmap. This, I would think, would also help possible beginner 
> volunteers know where to start if they wanted to contribute to the project. 
> So 
> the roadmap could be a list of features (big or small) or bug fixes that we 
> would like fixed for a particular version. Even if we don't want to name it 
> a roadmap it would still be interesting to have a list of things that are being 
> worked on or in the process of being merged into mainline and therefore will 
> make it to the next major version.
> 
> That being said, I know it's hard to set some time apart to write this kind of 
> thing given most of us prefer to be hacking on GCC. From a newcomer point of 
> view, however, not having things like a roadmap makes it look like the 
> project 
> is heading nowhere.
>
If you think of gcc as a large distributed agile project the road map may be 
buried
somewhere in the bug database. Perhaps it's a matter of mining the relevant 
details
or encouraging practices that make them mineable?
The bugzilla has fields for assignee, priority and target milestone that could 
be used as hints.
The trouble is it's very low level. 
The intent is buried in the community's subjective interpretation of priority. 
I don't know
how well that mirrors the actual values in the priority fields. I wouldn't 
expect it to without
a conscious effort.

If I search for "ALL cilk 4.9" or "ALL cilk" it is still not obvious that the 
cilk branch 
was merged into main prior to release 4.9.0. Though that could be down to my 
unfamiliarity with more complex queries in bugzilla.

Regards,

Bruce.


Re: Roadmap for 4.9.1, 4.10.0 and onwards?

2014-05-20 Thread Jeff Law

On 05/20/14 04:09, Bruce Adams wrote:

> Hi, I've been tracking the latest releases of gcc since 4.7 or so
> (variously interested in C++1y support, cilk and openmp). One thing
> I've found hard to locate is information about planned inclusions for
> future releases. As much relies on unpredictable community
> contributions I don't expect there to be a concrete or reliable plan.
> However, equally I'm sure the steering committee have some ideas over
> what ought to be upcoming releases. Is this published anywhere?

The steering committee doesn't get involved in that aspect of 
development.  It's just not in the committee's charter.


There is no single roadmap for the GCC project and that's a direct 
result of the decentralized development.


Looking forward to the next major GCC release (4.10 or 5.0):

At a high level, wrapping up the C++11 ABI transition is high on the 
list for the next major GCC release.  As is the ongoing efforts to clean 
up the polymorphism in gimple (and maybe RTL).  Those aren't really user 
visible features, but they're a ton of work.


I'm hoping the Intel team can push the last remaining Cilk+ feature 
through (Cilk_for).  Jakub is working on Fortran support for OpenMP4. 
Others are working on OpenACC support.


Richi's work on folding looks promising, but I'm not sure of its 
relative priority.  There's work to bring AArch64 and Power 8 to first 
class support...  Honza's work on IPA, etc etc.


C++14 support will continue to land as bits are written.

I'm certainly missing lots of important stuff...


WRT to gcc-4.9.1, like most (all?) point releases, it's primarily meant 
to address bugs in the prior release.  I wouldn't expect significant 
features to be appearing in 4.9.x releases.


Jeff


Re: Weird startup issue with -fsplit-stack

2014-05-20 Thread Ian Lance Taylor
On Tue, May 20, 2014 at 7:18 AM, Dmitry Antipov  wrote:
>
> I'm trying to support -fsplit-stack in GNU Emacs. The most important problem
> is that
> GC uses conservative scanning of a C stack, so I need to iterate over stack
> segments.
> I'm doing this by using  __splitstack_find, as described in
> libgcc/generic-morestack.c;
> but now I'm facing the weird issue with startup:
>
> Core was generated by `./temacs --batch --load loadup bootstrap'.
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0  __morestack () at ../../../gcc-4.9.0/libgcc/config/i386/morestack.S:486
> 486 pushq   %rax
> (gdb) bt 10
> #0  __morestack () at ../../../gcc-4.9.0/libgcc/config/i386/morestack.S:486
> #1  0x005f15df in __morestack () at
> ../../../gcc-4.9.0/libgcc/config/i386/morestack.S:502
> #2  0x005f15df in __morestack () at
> ../../../gcc-4.9.0/libgcc/config/i386/morestack.S:502
> #3  0x005f15df in __morestack () at
> ../../../gcc-4.9.0/libgcc/config/i386/morestack.S:502

This is the call to __morestack_block_signals in morestack.S.  It
should only be possible if __morestack_block_signals or something it
calls directly has a split stack.  __morestack_block_signals has the
no_split_stack attribute, meaning that it should never call
__morestack.  __morestack_block_signals only calls pthread_sigmask or
sigprocmask, neither of which should be compiled with -fsplit-stack.
So something has gone wrong, but I don't know what.

I would recommend tracing the code instruction by instruction to see
why __morestack_block_signals calls back into __morestack.  Or, if
that analysis is wrong, see what else is happening.

I can advise but I don't have time to look at this in detail.  Sorry.

Ian


Re: Roadmap for 4.9.1, 4.10.0 and onwards?

2014-05-20 Thread Jonathan Wakely
> If I search for "ALL cilk 4.9" or "ALL cilk" it is still not obvious that the 
> cilk branch
> was merged into main prior to release 4.9.0. Though that could be down to my 
> unfamiliarity with more complex queries in bugzilla.

Our bugzilla is usually used for tracking bugs, not merging of feature branches.

https://gcc.gnu.org/gcc-4.9/changes.html#c-family announces the
addition of Cilk Plus.

Merges of major new features should probably also be announced on
https://gcc.gnu.org/news.html


Re: negative latencies

2014-05-20 Thread shmeel gutl

On 20-May-14 06:13 PM, Vladimir Makarov wrote:

> On 05/19/2014 02:13 AM, shmeel gutl wrote:
>> Are there hooks in gcc to deal with negative latencies? In other
>> words, an architecture that permits an instruction to use a result
>> from an instruction that will be issued later.
>
> Could you explain more, on *an example*, what you are trying to achieve
> with the negative latency?
>
> The scheduler is based on a critical-path algorithm.  Generally speaking,
> latency time can be negative for this algorithm.  But I guess that is
> not what you are asking.

The architecture has an exposed pipeline where instructions read 
registers during the required cycle. So if one instruction produces its 
results in the third pipeline stage and a second instruction reads the 
register in the sixth pipeline stage, the second instruction can read 
the results of the first instruction even if it is issued three cycles 
earlier.


The problem that I see is that the haifa scheduler schedules one cycle 
at a time, in a forward order, by picking from a list of instructions 
that can be scheduled without delays. So, in the above example, if 
instruction one is scheduled during cycle 3, it can't schedule 
instruction two during cycle 0, 1, or 2 because its producer dependency 
(instruction one) hasn't been scheduled yet. It won't be able to 
schedule it until cycle 3. So I am asking if there is an existing 
mechanism to back schedule instruction two once instruction one is issued.


Thanks,
Shmeel

> At first glance it seems that it will break a few things.
> 1) The definition of dependencies cannot come from the simple ordering
> of rtl.
> 2) The scheduling problem starts to look like "get off the train 3
> stops before me".
> 3) The definition of live ranges needs to use actual instruction
> timing information, not just instruction sequencing.
>
> The hooks in the scheduler seem to be enough to stop damage but not
> enough to take advantage of this "feature".
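
The scheduling asymmetry described here can be made concrete with a toy forward list scheduler (a hypothetical model, not the haifa scheduler itself): a positive latency simply delays the consumer, but a negative latency asks for an issue cycle *before* the producer's, which a forward, cycle-at-a-time scheduler has already passed.

```python
def forward_schedule(insns, deps, latency):
    """insns listed in dependence order; deps maps insn -> producers;
    latency maps (producer, consumer) -> cycles.  Each insn is placed
    at the earliest cycle satisfying all scheduled producers."""
    cycle = {}
    for insn in insns:
        earliest = 0
        for prod in deps.get(insn, []):
            # A negative latency would permit earlier issue, but the
            # producer must already be scheduled and the scheduler
            # never revisits cycles it has already filled.
            earliest = max(earliest, cycle[prod] + latency[(prod, insn)])
        cycle[insn] = max(earliest, 0)
    return cycle
```

With latency 2, the consumer lands at cycle 2 as expected; with latency -3 the model clamps it to cycle 0, i.e. the three cycles of "negative slack" are simply lost unless the scheduler can back-schedule.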







Re: Zero/Sign extension elimination using value ranges

2014-05-20 Thread Kugan

On 20/05/14 16:52, Jakub Jelinek wrote:
> On Tue, May 20, 2014 at 12:27:31PM +1000, Kugan wrote:
>> 1.  Handling NOP_EXPR or CONVERT_EXPR that are in the IL because they
>> are required for type correctness. We have two cases here:
>>
>> A) Mode is smaller than word_mode. This is usually from where the
>> zero/sign extensions are showing up in final assembly.
>> For example :
>> int = (int) short
>> which usually expands to
>>  (set (reg:SI )
>>   (sext:SI (subreg:HI (reg:SI 
>> We can expand  this
>>  (set (reg:SI ) (((reg:SI 
>>
>> If following is true:
>> 1. Value stored in RHS and LHS are of the same signedness
>> 2. Type can hold the value. i.e., In cases like char = (char) short, we
>> check that the value in short is representable char type. (i.e. look at
>> the value range in RHS SSA_NAME and see if that can be represented in
>> types of LHS without overflowing)
>>
>> Subreg here is not a paradoxical subreg. We are removing the subreg and
>> zero/sign extend here.
>>
>> I am assuming here that QI/HI registers are represented in SImode
>> (basically word_mode) with zero/sign extend is used as in
>> (zero_extend:SI (subreg:HI (reg:SI 117))).
> 
> Wouldn't it be better to just set proper flags on the SUBREG based on value
> range info (SUBREG_PROMOTED_VAR_P and SUBREG_PROMOTED_UNSIGNED_P)?
> Then not only the optimizers could eliminate in zext/sext when possible, but
> all other optimizations could benefit from that.

Thanks for the comments. Here is an attempt (attached) that sets
SUBREG_PROMOTED_VAR_P based on value range info. Is this a good place
to do this?

Thanks,
Kugan
diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index b7f6360..d23ae76 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -3120,6 +3120,60 @@ expand_return (tree retval)
 }
 }
 
+
+/* Return true if the promotion (zero/sign extension) performed for
+   the conversion described by OPS is redundant, based on the value
+   range of its operand.  */
+
+static bool
+is_assign_promotion_redundant (struct separate_ops *ops)
+{
+  double_int type_min, type_max;
+  double_int min, max;
+  bool uns = TYPE_UNSIGNED (ops->type);
+  double_int msb;
+
+  /* We remove extensions only for integral stmts.  */
+  if (!INTEGRAL_TYPE_P (ops->type))
+    return false;
+
+  if (TREE_CODE_CLASS (ops->code) == tcc_unary)
+    {
+      switch (ops->code)
+        {
+        case CONVERT_EXPR:
+        case NOP_EXPR:
+
+          /* Get the value range.  */
+          if (TREE_CODE (ops->op0) != SSA_NAME
+              || POINTER_TYPE_P (TREE_TYPE (ops->op0))
+              || get_range_info (ops->op0, &min, &max) != VR_RANGE)
+            return false;
+
+          /* The most significant bit of the operand's type.  */
+          msb = double_int_one.lshift (TYPE_PRECISION (TREE_TYPE (ops->op0)) - 1);
+          if (!uns && min.cmp (msb, uns) == 1
+              && max.cmp (msb, uns) == 1)
+            {
+              min = min.sext (TYPE_PRECISION (TREE_TYPE (ops->op0)));
+              max = max.sext (TYPE_PRECISION (TREE_TYPE (ops->op0)));
+            }
+
+          /* The signedness of LHS and RHS should match, or the value
+             range of the RHS should be all positive values, to make
+             zero/sign extension redundant.  */
+          if (uns != TYPE_UNSIGNED (TREE_TYPE (ops->op0))
+              && min.cmp (double_int_zero,
+                          TYPE_UNSIGNED (TREE_TYPE (ops->op0))) == -1)
+            return false;
+
+          type_max = tree_to_double_int (TYPE_MAX_VALUE (ops->type));
+          type_min = tree_to_double_int (TYPE_MIN_VALUE (ops->type));
+
+          /* If the RHS value range fits the LHS type, zero/sign
+             extension is redundant.  */
+          if (max.cmp (type_max, uns) != 1
+              && type_min.cmp (min, uns) != 1)
+            return true;
+
+        default:
+          break;
+        }
+    }
+
+  return false;
+}
+
 /* A subroutine of expand_gimple_stmt, expanding one gimple statement
STMT that doesn't require special handling for outgoing edges.  That
is no tailcalls and no GIMPLE_COND.  */
@@ -3240,6 +3294,12 @@ expand_gimple_stmt_1 (gimple stmt)
  }
ops.location = gimple_location (stmt);
 
+        if (promoted && is_assign_promotion_redundant (&ops))
+          {
+            promoted = false;
+            SUBREG_PROMOTED_VAR_P (target) = 0;
+          }
+
/* If we want to use a nontemporal store, force the value to
   register first.  If we store into a promoted register,
   don't directly expand to target.  */


Re: Roadmap for 4.9.1, 4.10.0 and onwards?

2014-05-20 Thread Jan Hubicka
> On 05/20/14 04:09, Bruce Adams wrote:
> >Hi, I've been tracking the latest releases of gcc since 4.7 or so
> >(variously interested in C++1y support, cilk and openmp). One thing
> >I've found hard to locate is information about planned inclusions for
> >future releases. As much relies on unpredictable community
> >contributions I don't expect there to be a concrete or reliable plan.
> >However, equally I'm sure the steering committee have some ideas over
> >what ought to be upcoming releases. Is this published anywhere?
> The steering committee doesn't get involved in that aspect of
> development.  It's just not in the committee's charter.
> 
> There is no single roadmap for the GCC project and that's a direct
> result of the decentralized development.
> 
> Looking forward to the next major GCC release (4.10 or 5.0):
> 
> At a high level, wrapping up the C++11 ABI transition is high on the
> list for the next major GCC release.  As is the ongoing efforts to
> clean up the polymorphism in gimple (and maybe RTL).  Those aren't
> really user visible features, but they're a ton of work.
> 
> I'm hoping the Intel team can push the last remaining Cilk+ feature
> through (Cilk_for).  Jakub is working on Fortran support for
> OpenMP4. Others are working on OpenACC support.
> 
> Richi's work on folding looks promising, but I'm not sure of its
> relative priority.  There's work to bring AArch64 and Power 8 to
> first class support...  Honza's work on IPA, etc etc.

For IPA/FDO I think we are on track to merge some of the more interesting
Google changes (autoFDO, perhaps LIPO and other FDO improvements) and
Martin's pass for merging identical code.

I am personally trying to focus on two things.  The first is to clean up the
APIs of the symbol table and IPA infrastructure after the C++ conversion, and
to get things working well for LTO of large binaries - this is an important
change for the optimizers, since we go from units consisting of hundreds of
functions to units consisting of millions of functions, and the heuristics
need to be retuned.
I also hope we will continue pushing bits that make LTO more transparent
and reliable (command line arguments, debug info, etc.)

Honza
> 
> C++14 support will continue to land as bits are written.
> 
> I'm certainly missing lots of important stuff...
> 
> 
> WRT to gcc-4.9.1, like most (all?) point releases, it's primarily
> meant to address bugs in the prior release.  I wouldn't expect
> significant features to be appearing in 4.9.x releases.
> 
> Jeff


Reducing Register Pressure through Live range Shrinking through Loops!!

2014-05-20 Thread Ajit Kumar Agarwal

Hello All:

Simpson performs live range shrinking and reduces register pressure by moving
arithmetic computations, rather than loads and stores.  When the operands and
result of a computation are live at the entry and exit of a basic block but
are not touched inside the block, the computation is moved to the end of the
block, reducing the register pressure inside the block by one.  I propose
extending Simpson's work from basic blocks to loops: if a live range spans a
loop, is live at the entry and exit of the loop, and the computation is not
touched inside the loop, then the computation is moved after the exit of the
loop.
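A hedged source-level C sketch of the transformation just described (the identifiers are illustrative; the actual move would happen at the RTL level): sinking an untouched computation past a loop shrinks its live range, so only the operands stay live inside the loop.

```c
#include <assert.h>

/* Before: t is computed ahead of the loop and stays live across it,
   occupying a register for the whole loop even though the loop never
   touches t, a or b.  */
long
before (long a, long b, const long *v, int n)
{
  long t = a + b;          /* live through the whole loop */
  long s = 0;
  for (int i = 0; i < n; i++)
    s += v[i];
  return s + t;
}

/* After: the add is sunk past the loop; only a and b are live inside,
   shrinking t's live range and freeing a register for the loop body.  */
long
after (long a, long b, const long *v, int n)
{
  long s = 0;
  for (int i = 0; i < n; i++)
    s += v[i];
  long t = a + b;          /* recomputed at the loop exit */
  return s + t;
}
```

Both functions compute the same value; only the register pressure inside the loop differs.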

REDUCTION OF REGISTER PRESSURE THROUGH LIVE RANGE SHRINKING INSIDE THE LOOPS

for each loop, from innermost to outermost, do the following
begin
  RELIEFIN(i)  = null, if i is the entry of the CFG;
                 otherwise, the intersection over all predecessors j of RELIEFOUT(j)
  RELIEFOUT(i) = RELIEFIN(i) union (locally exposed relief)
  INSERT(i,j)  = (RELIEFOUT(i) - RELIEFIN(i)) intersect Live(i)
end
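The fixed-point structure of such a data-flow problem can be sketched in C (a generic sketch only: sets are bitmasks, IN is the meet (intersection) of the predecessors' OUT, and iteration repeats until nothing changes; the exact RELIEF transfer functions are Simpson's and are not reproduced here).

```c
#include <stdint.h>
#include <stdbool.h>

#define NBLOCKS 4

/* Per-block data-flow sets as bitmasks over at most 32 facts.  */
static uint32_t in[NBLOCKS], out[NBLOCKS], exposed[NBLOCKS];
static int preds[NBLOCKS][NBLOCKS], npreds[NBLOCKS];

/* Iterate to a fixed point: IN(i) is the intersection of the
   predecessors' OUT sets (empty for the entry block), OUT(i) adds the
   facts locally exposed in block i.  */
void
solve (void)
{
  bool changed = true;
  while (changed)
    {
      changed = false;
      for (int i = 0; i < NBLOCKS; i++)
        {
          uint32_t newin = npreds[i] ? ~0u : 0;   /* entry: empty set */
          for (int p = 0; p < npreds[i]; p++)
            newin &= out[preds[i][p]];
          uint32_t newout = newin | exposed[i];
          if (newin != in[i] || newout != out[i])
            {
              in[i] = newin;
              out[i] = newout;
              changed = true;
            }
        }
    }
}
```

For a loop-relative analysis as proposed above, the same iteration would simply be restricted to the blocks of one loop at a time, innermost first.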

Simpson's approach takes the nesting depth into consideration when placing the
computation to relieve register pressure.  It does not, however, consider
computations that span an entire loop, whose operands and results are live at
the entry and exit of the loop but are not touched inside it; moving these can
reduce register pressure inside loops.  This approach will also be useful in a
region-based register allocator, for live range splitting at the region
boundaries.

The extension of Simpson's approach performs the data flow analysis with
respect to a given loop rather than the entire control flow graph.  The
analysis starts from the innermost loop and extends outwards; if the
computation is not referenced at a given nesting depth, it can be placed
accordingly.  For a graph-coloring register allocator, the live ranges of the
operands and results of such computations are pushed onto the stack during the
simplification phase, giving these live ranges a better chance of being
colorable and thus reducing register pressure.  This can be further extended
to a splitting approach based on containment of live ranges.

OPTIMAL PLACEMENT OF THE COMPUTATION FOR SINGLE ENTRY AND MULTIPLE EXIT LOOPS

Using Simpson's approach to place computations for single-entry, multiple-exit
loops leads to a suboptimal solution, because the exit node of such a loop
does not post-dominate all the basic blocks inside the loop.  Placing the
computation just after the tail block of the loop would therefore produce
incorrect results.  To place the computation correctly, it needs to go in the
block just after all the exit paths of the loop reconverge, which
post-dominates all the blocks of the loop.  This handles the reduction of
register pressure for single-entry, multiple-exit loops.  Irreducible loops
are converted to reducible form before register allocation, so the
transformation applies to structured control flow and reduces register
pressure there as well.

Existing live range shrinking to reduce register pressure takes loads and
stores into consideration, but not arithmetic computations as proposed by
Simpson.  I am proposing to extend GCC to move computations as described
above, for both single-entry/single-exit and single-entry/multiple-exit
loops.

Please let me know what you think.

Thanks & Regards
Ajit


Re: Weird startup issue with -fsplit-stack

2014-05-20 Thread Dmitry Antipov

On 05/20/2014 10:16 PM, Ian Lance Taylor wrote:


This is the call to __morestack_block_signals in morestack.S.  It
should only be possible if __morestack_block_signals or something it
calls directly has a split stack.  __morestack_block_signals has the
no_split_stack attribute, meaning that it should never call
__morestack.  __morestack_block_signals only calls pthread_sigmask or
sigprocmask, neither of which should be compiled with -fsplit-stack.
So something has gone wrong, but I don't know what.


Thanks - that was an application's own copy of pthread_sigmask (compiled
with -fsplit-stack) linked into the binary due to a subtle configuration
issue.

The next major problem is that -fsplit-stack code randomly crashes with a
useless gdb backtrace, usually pointing to the very beginning of the function
(plus occasional "Cannot access memory at..." messages), e.g.:

(gdb) bt 1
#0  0x005a615b in mark_object (arg=0) at ../../trunk/src/alloc.c:6039

 6037  void
 6038  mark_object (Lisp_Object arg)
==>  6039  {

IIUC this usually (with a traditional stack) happens due to stack overflow.
But what may be the cause with -fsplit-stack? I do not receive any error
messages from libgcc, and there is plenty of free heap memory. If that matters,
mark_object is recursive, and the recursion depth may be very high, up to a few
tens of thousands of calls.

Dmitry