Re: Overwhelmed by GCC frustration

2017-08-02 Thread Richard Biener
On Tue, Aug 1, 2017 at 6:00 PM, James Greenhalgh
 wrote:
> On Tue, Aug 01, 2017 at 11:12:12AM -0400, Eric Gallager wrote:
>> On 8/1/17, Jakub Jelinek  wrote:
>> > On Tue, Aug 01, 2017 at 07:08:41AM -0400, Eric Gallager wrote:
>> >> > Heh.  I suspect -Os would benefit from a separate compilation pipeline
>> >> > such as -Og.  Nowadays the early optimization pipeline is what you
>> >> > want (mostly simple CSE & jump optimizations, focused on code
>> >> > size improvements).  That doesn't get you any loop optimizations but
>> >> > loop optimizations always have the chance to increase code size
>> >> > or register pressure.
>> >> >
>> >>
>> >> Maybe in addition to the -Os optimization level, GCC mainline could
>> >> also add the -Oz optimization level like Apple's GCC had, and clang
>> >> still has? Basically -Os is -O2 with additional code size focus,
>> >> whereas -Oz is -O0 with the same code size focus. Adding it to the
>> >> FSF's GCC, too, could help reduce code size even further than -Os
>> >> currently does.
>> >
>> > No, lack of optimizations certainly doesn't reduce the code size.
>> > For small code, you need lots of optimizations, but preferably code-size
>> > aware ones.  For RTL that is usually easier, because you can often compare
>> > the sizes of the old and new sequences and choose smaller, for GIMPLE
>> > optimizations it is often just a wild guess on what optimizations generally
>> > result in smaller and what optimizations generally result in larger code.
>> > There are too many following passes to know for sure, and finding the right
>> > heuristics is hard.
>> >
>> > Jakub
>> >
>>
>> Upon rereading of the relevant docs, I guess it was a mistake to
>> compare -Oz to -O0. Let me quote from the apple-gcc "Optimize Options"
>> page:
>>
>> -Oz
>> (APPLE ONLY) Optimize for size, regardless of performance. -Oz
>> enables the same optimization flags that -Os uses, but -Oz also
>> enables other optimizations intended solely to reduce code size.
>> In particular, instructions that encode into fewer bytes are
>> preferred over longer instructions that execute in fewer cycles.
>> -Oz on Darwin is very similar to -Os in FSF distributions of GCC.
>> -Oz employs the same inlining limits and avoids string instructions
>> just like -Os.
>>
>> Meanwhile, their description of -Os as contrasted to -Oz reads:
>>
>> -Os
>> Optimize for size, but not at the expense of speed. -Os enables all
>> -O2 optimizations that do not typically increase code size.
>> However, instructions are chosen for best performance, regardless
>> of size. To optimize solely for size on Darwin, use -Oz (APPLE
>> ONLY).
>>
>> And the clang docs for -Oz say:
>>
>> -Oz Like -Os (and thus -O2), but reduces code size further.
>>
>> So -Oz does actually still optimize, so it's more like -O2 than -O0
>> after all, just even more size-focused than -Os.
>
> The relationship between -Os and -Oz is like the relationship between -O2
> and -O3.
>
> If -O3 says, try everything you can to increase performance even at the
> expense of code-size and compile time, then -Oz says, try everything you
> can to reduce the code size, even at the expense of performance and
> compile time.

Note that for GCC, -Os has historically been exactly this.  I'd say that compared to
other compilers -O2 is what they do at -Os -- balance speed and size
with GCC being much more conservative on the size side than other
compilers.  Recently we've "weakened" -Os by for example allowing
integer division to expand to mul/add sequences but IIRC that was based
on the costs the target provides.
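
As an illustration of the expansion mentioned above, here is a minimal
sketch of an unsigned division by a constant next to a hand-written
multiply/shift equivalent.  The function names and the divisor 10 are
assumptions made for the example; which form GCC actually emits at -Os
or -O2 depends on the costs the target reports.

```c
#include <assert.h>
#include <stdint.h>

/* Straightforward form.  The compiler may emit a divide instruction
   (usually the shortest encoding) or a multiply/shift sequence
   (usually faster), depending on the optimization level and the
   target's cost model.  */
uint32_t
div10_plain (uint32_t x)
{
  return x / 10;
}

/* Hand-written equivalent of the multiply/shift expansion.
   0xCCCCCCCD is ceil (2^35 / 10); the 64-bit product shifted right
   by 35 equals x / 10 for every 32-bit unsigned x.  */
uint32_t
div10_mulshift (uint32_t x)
{
  uint64_t product = (uint64_t) x * UINT64_C (0xCCCCCCCD);
  return (uint32_t) (product >> 35);
}

/* Spot-check that both forms agree on a few boundary values.  */
void
div10_selfcheck (void)
{
  static const uint32_t samples[] = { 0, 9, 10, 99, 100, UINT32_MAX };
  for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
    assert (div10_plain (samples[i]) == div10_mulshift (samples[i]));
}
```

Comparing the output of "gcc -Os -S" and "gcc -O2 -S" for div10_plain
is one way to see which choice a given target makes.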

Richard.

> Thanks,
> James
>


GCC 7.2 Status Report (2017-08-02)

2017-08-02 Thread Richard Biener

Status
======

The GCC 7 branch is now frozen for the upcoming release candidate
and release.  All changes require release manager approval.


broken link on this page https://gcc.gnu.org/gcc-7/changes.html for link to "Profile Mode" page:

2017-08-02 Thread Sergei Kurenkov
Link on this page https://gcc.gnu.org/gcc-7/changes.html for "Profile Mode":

* The libstdc++ Profile Mode has been deprecated and will be removed
in a future version.

gives:

Not Found

The requested URL
/onlinedocs/gcc-7.1.0/libstdc++/manual/profile_mode.html was not found
on this server.


RFC: C extension to support variable-length vector types

2017-08-02 Thread Richard Sandiford
Summary
=======

This is an RFC about some C language changes to support ARM's Scalable
Vector Extension (SVE).  A detailed description of SVE is available here:

https://static.docs.arm.com/ddi0584/a/DDI0584A_a_SVE_supp_armv8A.pdf

but the only feature that really matters for this RFC is that SVE has
no fixed or preferred vector length.  Implementations can instead choose
from a range of possible vector lengths, with 128 bits being the minimum
and 2048 bits being the maximum.

SVE code will generally be written in a "vector-length agnostic" way;
i.e. it generally won't (need to) assume a particular vector length.
The practical upshot of this for compilers is that the size of an SVE
vector is not normally known until runtime.
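
To make "vector-length agnostic" concrete, here is a minimal sketch in
plain, portable C that models the idea without using any SVE
intrinsics.  The names runtime_lanes, set_runtime_lanes and
add_arrays_vla are hypothetical; real SVE code would use the ACLE
types and intrinsics instead of the inner scalar loop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the hardware vector length: the number of
   32-bit lanes per vector, which is only known at run time.  */
static size_t runtime_lanes = 4;

void
set_runtime_lanes (size_t lanes)
{
  assert (lanes > 0);
  runtime_lanes = lanes;
}

/* dst[i] = a[i] + b[i], written without assuming any particular
   number of lanes.  Each outer iteration models one vector operation;
   the "i + lane < n" test models the predicate that masks off lanes
   beyond the end of the arrays.  */
void
add_arrays_vla (float *dst, const float *a, const float *b, size_t n)
{
  for (size_t i = 0; i < n; i += runtime_lanes)
    for (size_t lane = 0; lane < runtime_lanes; lane++)
      if (i + lane < n)
        dst[i + lane] = a[i + lane] + b[i + lane];
}
```

The same function gives the same result whether runtime_lanes is 4 or
64, which is the property that vector-length agnostic code relies on.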

ARM has defined a set of types and intrinsic functions (known as the
"ACLE") for using SVE operations directly in C and C++ code.
For reference, the ACLE specification is available here:

https://static.docs.arm.com/100987//acle_sve_100987__00_en.pdf

but I'll try to keep the RFC self-contained.

Since the length of an SVE vector is not normally known until runtime,
the sizes of these ACLE types are likewise not normally known until runtime.
The ACLE handles this by treating the vector types as a new form of
"incomplete" type, with rules that are more relaxed than for normal
incomplete types.  The RFC is specifically about this approach, which
I'll describe in more detail below.  The main questions are:

  (1) Does the approach seem reasonable?

  (2) Would it be acceptable in principle to add this extension to the
  GCC C frontend (only enabled where necessary)?

  (3) Should we submit this to the standards committee?


Scope
=====

The RFC only discusses the C semantics.  The ACLE has a similar set of
changes for C++, but the fundamental approach is very similar, so we
thought it would be better to concentrate on C to start with.

Ideally this would be discussed in parallel with the GCC and clang
communities, but since cfe-dev is subscriber-only, it wouldn't really
be appropriate to cross-post.  We'll therefore ask on the clang lists
later, folding in any outcome of this RFC.

As far as question (3) above goes, we'd be happy to turn this into a
formal submission to the standards committee if that seems appropriate.
We just thought it would be good to get some feedback here first.


Contents
========

1. The types in more detail
2. Requirements
3. Possible approaches
4. Outline of the type system changes
5. Rationale for choosing this approach
6. Edits to the C standard
7. User-defined sizeless types
8. Implementation


1. The types in more detail
===========================

The ACLE defines a vector type sv<base>_t for each supported element type
<base>_t, so that the complete set is:

svint8_t   svint16_t    svint32_t    svint64_t
svuint8_t  svuint16_t   svuint32_t   svuint64_t
           svfloat16_t  svfloat32_t  svfloat64_t

The types in each column have the same number of lanes and have twice
as many lanes as those in the column to the right.

These types can be combined into tuples of 2, 3 or 4 vectors using
sv<base>xN_t, with the individual vectors being fields with the names
"v0", "v1", etc.  For example, svint8x4_t contains four separate vectors
of type svint8_t, with the vectors being in fields named "v0", "v1",
"v2" and "v3".

The ACLE also defines a single predicate type:

svbool_t

that has the same number of lanes as svint8_t and svuint8_t.


2. Requirements
===============

One of the main questions that we needed to answer for the ACLE was:
how do we add the variable-length types above to the type system?
The key requirements were:

  * The approach must work in both C and C++.

  * It must be possible to define automatic variables with these types.

  * It must be possible to pass and return objects of these types
(since that's what intrinsics and vector library routines need to do).

  * It must be possible to use the types in _Generic associations
(since the ACLE uses _Generic to provide tgmath.h-style overloads).

  * It must be possible to create pointers or references to the types
(for passing or returning by pointer or reference, and because not
allowing references would be semantically difficult in C++).
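
The _Generic requirement above refers to tgmath.h-style overloading.
As a minimal sketch of that mechanism in standard C11, the macro below
dispatches on the type of its first argument.  The names add, add_s32,
add_f32 and add_f64 are invented for this example and are not ACLE
functions; the example also assumes that int32_t is a typedef for int.

```c
#include <assert.h>
#include <stdint.h>

static inline int32_t add_s32 (int32_t a, int32_t b) { return a + b; }
static inline float   add_f32 (float a, float b)     { return a + b; }
static inline double  add_f64 (double a, double b)   { return a + b; }

/* Select an implementation from the type of the first argument, in
   the same way tgmath.h selects between sinf, sin and sinl.  */
#define add(a, b)                 \
  _Generic ((a),                  \
            int32_t: add_s32,     \
            float:   add_f32,     \
            double:  add_f64) (a, b)

/* Exercise each association once.  */
void
check_add (void)
{
  assert (add ((int32_t) 1, (int32_t) 2) == 3);
  assert (add (1.5f, 2.0f) == 3.5f);
  assert (add (1.5, 2.25) == 3.75);
}
```

For this to work with the ACLE types, those types have to be usable as
the controlling expression's type and in the association list, which
is why the requirement is listed.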


3. Possible approaches
======================

It seems that any approach to defining the ACLE types would fall into
one of three categories:

  (1) Limit the types in such a way that there is no concept of size.

  (2) Define the size of the types to be variable.

  (3) Define the size of the types to be constant, either with the
  constant being large enough for all possible vector lengths or
  with the types pointing to separate memory (as for C++ classes
  like std::string).

The approach we chose comes under (1).  The next sections describe this
approach informally in more detail, explain the rationale for choosing it,
and then give a more formal definition, as an edit to the standard.
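
For comparison with category (1), here is a minimal sketch of how an
ordinary incomplete type behaves in standard C today.  The struct
counter type and its functions are invented for this example.  Pointers
to the type work everywhere, but automatic variables and sizeof are
rejected until the type is completed, which is why the RFC describes
its rules as more relaxed than those for normal incomplete types.

```c
#include <assert.h>
#include <stdlib.h>

/* Incomplete type: the name is declared but the size is not known
   at this point.  */
struct counter;

/* Pointers to an incomplete type are fine.  */
struct counter *counter_create (void);
int counter_next (struct counter *c);
void counter_destroy (struct counter *c);

/* While the type is incomplete, these would be rejected:

     struct counter c;            (automatic variable)
     sizeof (struct counter)      (size query)  */

/* Completing the type makes its size known from here on.  */
struct counter
{
  int value;
};

struct counter *
counter_create (void)
{
  struct counter *c = malloc (sizeof *c);
  if (c != NULL)
    c->value = 0;
  return c;
}

int
counter_next (struct counter *c)
{
  assert (c != NULL);
  return ++c->value;
}

void
counter_destroy (struct counter *c)
{
  free (c);
}
```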


4. O

Re: Overwhelmed by GCC frustration

2017-08-02 Thread Segher Boessenkool
On Tue, Aug 01, 2017 at 01:50:14PM +0200, David Brown wrote:
> I would not expect that to be good at all.  With no optimisation (-O0),
> gcc produces quite poor code - local variables are not put in registers
> or "optimised away", there is no strength reduction, etc.  For an
> architecture like the AVR with a fair number of registers (32, albeit
> 8-bit registers) and relatively inefficient stack access, -O0 produces
> /terrible/ code.

-Og is better though (better than any other -O for this test at least).

The regression happened before 4.7, it seems the big jump was with 4.6?
So what happened there?  This seems to happen on x86 as well, maybe
on everything.


Segher


Re: Overwhelmed by GCC frustration

2017-08-02 Thread Richard Biener
On Wed, Aug 2, 2017 at 3:54 PM, Segher Boessenkool
 wrote:
> On Tue, Aug 01, 2017 at 01:50:14PM +0200, David Brown wrote:
>> I would not expect that to be good at all.  With no optimisation (-O0),
>> gcc produces quite poor code - local variables are not put in registers
>> or "optimised away", there is no strength reduction, etc.  For an
>> architecture like the AVR with a fair number of registers (32, albeit
>> 8-bit registers) and relatively inefficient stack access, -O0 produces
>> /terrible/ code.
>
> -Og is better though (better than any other -O for this test at least).
>
> The regression happened before 4.7, it seems the big jump was with 4.6?
> So what happened there?  This seems to happen on x86 as well, maybe
> on everything.

And one function (of the two) shrinks compared to 3.4 and the other increases
so the jumps are probably mis-bisected anyway.

Richard.

>
> Segher


Re: RFC: C extension to support variable-length vector types

2017-08-02 Thread Joseph Myers
On Wed, 2 Aug 2017, Richard Sandiford wrote:

>   (1) Does the approach seem reasonable?
> 
>   (2) Would it be acceptable in principle to add this extension to the
>   GCC C frontend (only enabled where necessary)?
> 
>   (3) Should we submit this to the standards committee?

I think this only makes sense for WG14 in the context of a proposed 
standard feature that would use the new type properties.

The CPLEX group, having produced the first draft TS 21938-1 (Extensions 
for parallel programming: Thread-based parallelism), may go on to other 
related work including SIMD-based parallelism.  That might be the natural 
context for proposals related to vector types, both fixed-size and 
variable-size (see the discussion on the CPLEX list in June 2013, 
especially  
regarding types for native vector length rather than fixed vector length.  
(Or some kind of vector proposal could be made directly for C2x, given the 
widespread implementation experience with vector types.)

I don't see how this proposal deals with initialization for such types 
(either by disallowing it, or by providing a suitable way to initialize 
them).  A fixed-size vector can be initialized with a brace-enclosed 
initializer.  But if you use a brace-enclosed initializer for a variable 
(with automatic storage duration) with one of these types, the compiler 
can't tell if the size is large enough at compile time.  You allow 
compound literals and initialization because you want to allow aggregates 
of such types to be initialized, but that leaves unresolved the question 
of what's valid for initializing the vectors themselves.
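
For reference, this is what brace-enclosed initialization of a
fixed-size vector looks like with the GNU C vector extension (accepted
by GCC and clang).  It is a minimal sketch that assumes a 32-bit int;
the names v4si, sum_lanes and example_sum are chosen for the example.
The size is a compile-time constant, so the compiler can check the
initializer against it, which is the step that has no obvious
counterpart for a type whose size is only known at run time.

```c
#include <assert.h>

/* GNU C vector extension: four ints in a 16-byte vector.  */
typedef int v4si __attribute__ ((vector_size (16)));

int
sum_lanes (v4si v)
{
  return v[0] + v[1] + v[2] + v[3];
}

int
example_sum (void)
{
  /* Brace-enclosed initializers: one value per lane.  */
  v4si a = { 1, 2, 3, 4 };
  v4si b = { 10, 20, 30, 40 };
  v4si c = a + b;               /* element-wise addition */

  /* sizeof is a constant expression for fixed-size vectors.  */
  assert (sizeof (c) == 16);
  return sum_lanes (c);
}
```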

(The lack of a size from sizeof means memset and memcpy can't readily be 
used for initialization either; likewise e.g. reading data from a file 
into such a variable.  Presumably you have other extensions to cover such 
things?)

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: RFC: C extension to support variable-length vector types

2017-08-02 Thread Richard Sandiford
Hi Joseph,

Thanks for the quick feedback.

Joseph Myers  writes:
> On Wed, 2 Aug 2017, Richard Sandiford wrote:
>
>>   (1) Does the approach seem reasonable?
>> 
>>   (2) Would it be acceptable in principle to add this extension to the
>>   GCC C frontend (only enabled where necessary)?
>> 
>>   (3) Should we submit this to the standards committee?
>
> I think this only makes sense for WG14 in the context of a proposed 
> standard feature that would use the new type properties.
>
> The CPLEX group, having produced the first draft TS 21938-1 (Extensions 
> for parallel programming: Thread-based parallelism), may go on to other 
> related work including SIMD-based parallelism.  That might be the natural 
> context for proposals related to vector types, both fixed-size and 
> variable-size (see the discussion on the CPLEX list in June 2013, 
> especially  
> regarding types for native vector length rather than fixed vector length.  
> (Or some kind of vector proposal could be made directly for C2x, given the 
> widespread implementation experience with vector types.)
>
> I don't see how this proposal deals with initialization for such types 
> (either by disallowing it, or by providing a suitable way to initialize 
> them).  A fixed-size vector can be initialized with a brace-enclosed 
> initializer.  But if you use a brace-enclosed initializer for a variable 
> (with automatic storage duration) with one of these types, the compiler 
> can't tell if the size is large enough at compile time.  You allow 
> compound literals and initialization because you want to allow aggregates 
> of such types to be initialized, but that leaves unresolved the question 
> of what's valid for initializing the vectors themselves.

Yeah, sorry, I forgot to say that types like svuint8_t are opaque
built-in types.  They aren't vector types in the sense of the GNU
vector extensions (or other such extensions).  This means that the
only valid use of compound literals for vector types is:

(sv<base>_t) { x }

where x also has type sv<base>_t.  Obviously this case isn't useful;
it's only the aggregate case that's useful.

The ACLE types are only intended to be used with the associated
intrinsics.  There are intrinsics for creating vectors from scalars,
loading from scalars, storing to scalars, reinterpreting one type as
another, etc.

> (The lack of a size from sizeof means memset and memcpy can't readily be 
> used for initialization either; likewise e.g. reading data from a file 
> into such a variable.  Presumably you have other extensions to cover such 
> things?)

The idea is that the vector types would only be used for working data
(i.e. data that's intended to be stored in registers where possible).
Longer-term data would be stored out to arrays.  So things like memcpy,
memset and file operations would be done on arrays as normal.

Thanks,
Richard


Re: RFC: C extension to support variable-length vector types

2017-08-02 Thread Torvald Riegel
On Wed, 2017-08-02 at 14:09 +0100, Richard Sandiford wrote:
>   (1) Does the approach seem reasonable?
> 
>   (2) Would it be acceptable in principle to add this extension to the
>   GCC C frontend (only enabled where necessary)?
> 
>   (3) Should we submit this to the standards committee?

I haven't had time to look at the proposal in detail.  I think it would
be good to have the standards committees review this.  I doubt you could
find consensus in the C++ committee for type system changes unless you have a
really good reason.  Have you considered how you could use the ARM
extensions from http://wg21.link/p0214r4 ?



Re: RFC: C extension to support variable-length vector types

2017-08-02 Thread Richard Sandiford
Torvald Riegel  writes:
> On Wed, 2017-08-02 at 14:09 +0100, Richard Sandiford wrote:
>>   (1) Does the approach seem reasonable?
>> 
>>   (2) Would it be acceptable in principle to add this extension to the
>>   GCC C frontend (only enabled where necessary)?
>> 
>>   (3) Should we submit this to the standards committee?
>
> I haven't had time to look at the proposal in detail.  I think it would
> be good to have the standards committees review this.  I doubt you could
> find consensus in the C++ committee for type system changes unless you have a
> really good reason.  Have you considered how you could use the ARM
> extensions from http://wg21.link/p0214r4 ?

Yeah, we've been following that proposal, but I don't think it helps
as-is with SVE.  datapar is "an array of target-specific size,
with elements of type T, ..." and for SVE the natural target-specific
size would be a runtime value.  The core language would still need to
provide a way of creating that array.

Similarly to other vector architectures (including AdvSIMD), the SVE
intrinsics and their types are more geared towards people who want
to optimise specifically for SVE without having to resort to assembly.
That's an important use case for us, and I think there's always going to
be a need for it alongside generic SIMD and parallel-programming models
(which of course are a good thing to have too).

Being able to use SVE features from C is also important.  Not all
projects are prepared to convert to C++.

Thanks,
Richard


default function alignment

2017-08-02 Thread Martin Sebor

I'm writing a test to verify that multiple attribute aligned
specifiers on a function declaration are handled correctly
(bug 81566).  In the test I need to know the default function
alignment for the target[*].  I've found the FUNCTION_BOUNDARY macro
used to set the default alignment for a function (IIUC).  If
that is the right macro, or if there is a more appropriate
one, is there a way to get at its value in a unit test?

If there is no way, would enhancing target-supports.exp to
include a header that defines the macro be acceptable? (I assume that
would be gcc/target.h for FUNCTION_BOUNDARY).

Thanks
Martin

[*] I believe I need this to avoid test failures on targets
where the minimum function alignment is greater than 1 byte.
I might be able to use the largest known alignment (AFAICS,
that's 16 bytes on IA64 and the FR-V) as the minimum unless
there is also a target-specific maximum that I don't know
about.  Is there?  (And if so, what's the macro for it?)
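
As a minimal sketch of what such a test has to cope with, the code
below requests a 64-byte alignment for a function and checks the
resulting address at run time.  The names aligned_fn and is_aligned_to
are invented for the example, and the value 64 is an arbitrary choice.
Converting a function pointer to uintptr_t is implementation-defined,
and some targets keep mode bits in the low bits of function addresses,
so this is an illustration rather than a portable testsuite idiom.

```c
#include <assert.h>
#include <stdint.h>

/* Ask for an alignment that is larger than the default function
   alignment on common targets.  */
__attribute__ ((aligned (64))) int
aligned_fn (void)
{
  return 42;
}

/* Return nonzero if FN's address is a multiple of BOUNDARY bytes.  */
int
is_aligned_to (int (*fn) (void), uintptr_t boundary)
{
  assert (boundary != 0);
  return ((uintptr_t) fn % boundary) == 0;
}
```

A request smaller than the target's default function alignment cannot
be checked this way, because the address will still be aligned to the
larger default; that is the failure mode the footnote worries about.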


Re: default function alignment

2017-08-02 Thread Joseph Myers
On Wed, 2 Aug 2017, Martin Sebor wrote:

> If there is no way, would enhancing target-supports.exp to
> include a header that defines the macro? (I assume that would
> be gcc/target.h for FUNCTION_BOUNDARY).

target.h is for target hooks, not target macros, and we want to move away 
from target macros, so adding dependencies on the existence of a macro 
with this value doesn't seem like a good idea (anyway, FUNCTION_BOUNDARY 
can depend on command-line options passed to the compiler, so getting it 
from a compiler header can't possibly work).

We have -fbuilding-libgcc to pass configuration to target library builds 
that shouldn't otherwise be needed in normal user code built by GCC (via 
defining extra predefined macros if that option is passed).  I suppose you 
could have something like that to provide such information to the 
testsuite.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: default function alignment

2017-08-02 Thread Martin Sebor

On 08/02/2017 11:37 AM, Joseph Myers wrote:
> On Wed, 2 Aug 2017, Martin Sebor wrote:
>
>> If there is no way, would enhancing target-supports.exp to
>> include a header that defines the macro? (I assume that would
>> be gcc/target.h for FUNCTION_BOUNDARY).
>
> target.h is for target hooks, not target macros, and we want to move away
> from target macros, so adding dependencies on the existence of a macro
> with this value doesn't seem like a good idea (anyway, FUNCTION_BOUNDARY
> can depend on command-line options passed to the compiler, so getting it
> from a compiler header can't possibly work).
>
> We have -fbuilding-libgcc to pass configuration to target library builds
> that shouldn't otherwise be needed in normal user code built by GCC (via
> defining extra predefined macros if that option is passed).  I suppose you
> could have something like that to provide such information to the
> testsuite.


Thanks.  That's good to know about.  To keep the patch from getting
too big I think I might just take a chance and hope there is no maximum
function alignment.  If there is I'll deal with the fallout later.

Martin


Re: GCC Runtime Library Exception in gcc/config/* files?

2017-08-02 Thread Jeff Law
On 07/21/2017 12:14 PM, Georg-Johann Lay wrote:
> Sebastian Huber schrieb:
>> Hello,
>>
>> there are some files in gcc/config/* that contain the GCC Runtime
>> Library Exception
>>
>> grep -r --include='*.[ch]' 'GCC Runtime Library Exception' -l
>> gcc/config | wc
>> 186 186 5361
>>
>> and some files that don't include it
>>
>> grep -r --include='*.[ch]' 'GCC Runtime Library Exception' -l
>> gcc/config -v | wc
>> 753 753   20927
>>
>> Does it matter? What should be used for new files?
> 
> Some machine headers are needed by libgcc.  Not all information is
> available by means of built-in macros, so that compile-time decisions
> in libgcc might need the target headers.
But just because we make a compile time decision doesn't mean the tm.h
file needs to have the exception clause.  IMHO what matters is whether
or not code gets embedded.

For example, if there's a macro in a tm.h file that generates a blob of
assembly code and that assembly code is embedded into libgcc.  Then we
probably need an exception clause.

> 
> Likely, this dates back to the time when machine specific libgcc bits
> were in gcc/config/$target.
> 
> Some users of (lib)gcc which compile their (proprietary) software
> with gcc are paranoid about being infected by GPL due to using
> libgcc which uses headers without runtime library exception.
Yes, but that doesn't mean we just blindly put the exception clause on
all the target headers.

> 
> https://gcc.gnu.org/ml/gcc-help/2012-08/msg00235.html
> 
> See also
> 
> https://gcc.gnu.org/PR61152
IMHO the proper thing to do here is identify what parts potentially
introduce *code* into the library.  Those need to have the exception
clause.  Adding the exception clause to the tm.h files blindly seems
wrong to me.

jeff



Re: [patch] RFC: Hook for insn costs?

2017-08-02 Thread Richard Earnshaw
On 26/07/17 18:54, Jeff Law wrote:
> On 07/17/2017 02:35 PM, Richard Henderson wrote:
>> On 07/17/2017 12:20 AM, Richard Biener wrote:
>>> On Sun, Jul 16, 2017 at 12:51 AM, Segher Boessenkool
>>>> Now what should it take as input?  An rtx_insn, or just the pattern
>>>> (as insn_rtx_cost does)?
>>>
>>> Is there any useful info on the other operands of an rtx_insn?  If not
>>> then passing in the pattern (a rtx) might be somewhat more flexible.
>>> Of course it's then way easier to confuse rtx_cost and insn_cost ...
>>
>> A lot of really complex by-hand pattern matching goes away if you know
>> the instruction is valid, and you can look up an insn attribute.  That
>> suggests passing the insn and not the PATTERN.
> Good point.  In fact, it opens the possibility that costing could be
> attached to the insn itself as just another attribute if it made sense
> for the target to describe costing in that manner.
> 
> Jeff
> 

I'm not sure if that's a good or a bad thing.  Currently the mid-end
depends on some rtx constructs having sensible costs even if there's no
rtl pattern to match them (IIRC plus:QI is one such construct - RISC
type machines usually lack such an instruction).  Also, costs tend to be
micro-architecture specific so attaching costs directly to patterns
would be extremely painful, adding support would require touching the
entirety of the MD files.  The best bet would be a level of indirection
from the patterns to cost tables, much like scheduler attributes.

But that still leaves the issue of what to do with the cost of MEM vs
REG operands - in a pattern they may both be matched by general_operand
but the cost of each is quite distinct and the normal attributes system
probably won't (maybe can't) disambiguate the sub-types until after
register allocation.

R.


Re: [patch] RFC: Hook for insn costs?

2017-08-02 Thread Richard Henderson
On 08/02/2017 12:34 PM, Richard Earnshaw wrote:
> I'm not sure if that's a good or a bad thing.  Currently the mid-end
> depends on some rtx constructs having sensible costs even if there's no
> rtl pattern to match them (IIRC plus:QI is one such construct - RISC
> type machines usually lack such an instruction). 

I hadn't considered this... but there are several possible workarounds.

The simplest of which is to fall back to using rtx_cost if the insn_cost hook
returns a failure indication, e.g. -1.
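
A minimal sketch of that fallback, using a toy model rather than GCC's
real data structures: every toy_* name below is invented for the
example, and only the COSTS_N_INSNS scaling mirrors the macro in GCC's
rtl.h.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors GCC's definition: costs are scaled so that one simple
   instruction costs 4.  */
#define COSTS_N_INSNS(N) ((N) * 4)

enum toy_kind { TOY_ADD, TOY_MUL, TOY_UNRECOGNIZED };

struct toy_insn
{
  enum toy_kind kind;
};

/* Models the new hook: returns -1 when it has no answer.  */
int
toy_insn_cost (const struct toy_insn *insn)
{
  switch (insn->kind)
    {
    case TOY_ADD:
      return COSTS_N_INSNS (1);
    case TOY_MUL:
      return COSTS_N_INSNS (3);
    default:
      return -1;
    }
}

/* Models the older expression-based estimate, which always produces
   some answer.  */
int
toy_rtx_cost (const struct toy_insn *insn)
{
  (void) insn;
  return COSTS_N_INSNS (2);
}

/* What a caller would do: prefer the insn-based cost and fall back
   to the expression-based one on failure.  */
int
toy_cost (const struct toy_insn *insn)
{
  assert (insn != NULL);
  int cost = toy_insn_cost (insn);
  return cost >= 0 ? cost : toy_rtx_cost (insn);
}
```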

> Also, costs tend to be
> micro-architecture specific so attaching costs directly to patterns
> would be extremely painful, adding support would require touching the
> entirety of the MD files.  The best bet would be a level of indirection
> from the patterns to cost tables, much like scheduler attributes.

I was never thinking of adding costs directly to the md files, but rather
structuring the insn_cost hook like

  if (recog_memoized (insn) < 0)
    return -1;
  switch (get_attr_type (insn))
    {
    case TYPE_iadd:
    case TYPE_ilog:
    case TYPE_mvi:
      return COSTS_N_INSNS (1);

    case TYPE_fadd:
      return cost_data->fp_add;
    }

etc.  This would be especially important when it comes to costing for simd-type
insns.  Matching many of those any other way would be fraught with peril.

> But that still leaves the issue of what to do with the cost of MEM vs
> REG operands - in a pattern they may both be matched by general_operand
> but the cost of each is quite distinct and the normal attributes system
> probably won't (maybe can't) disambiguate the sub-types until after
> register allocation.

Pre-reload we'll probably have a pseudo and assume a register operand, or have
an unambiguous memory.  I don't think that's really a problem.  We're producing
a cost estimate based on what we are given at that moment.

It does make x86 more complex than RISC-ier targets, but still not
impossible.  There is at least a "memory" attribute with which one could adjust
the base cost selected by the "type" attribute.

One could also walk the pattern to count the number of mems, at least for a
given subset of insn types.


r~


Re: [patch] RFC: Hook for insn costs?

2017-08-02 Thread Segher Boessenkool
On Wed, Aug 02, 2017 at 08:34:20PM +0100, Richard Earnshaw wrote:
> >> A lot of really complex by-hand pattern matching goes away if you know
> >> the instruction is valid, and you can look up an insn attribute.  That
> >> suggests passing the insn and not the PATTERN.
> > Good point.  In fact, it opens the possibility that costing could be
> > attached to the insn itself as just another attribute if it made sense
> > for the target to describe costing in that manner.

I am doing this for rs6000 now, and it is totally practical.

> I'm not sure if that's a good or a bad thing.  Currently the mid-end
> depends on some rtx constructs having sensible costs even if there's no
> rtl pattern to match them (IIRC plus:QI is one such construct - RISC
> type machines usually lack such an instruction).

Yes, but in many cases a pass wants to create valid insns anyway.

I changed combine to always use insn_cost (instead of pattern_cost,
what used to be named insn_rtx_cost).  Implementation of this (for
rs6000) is much simpler: most insns just cost COSTS_N_INSNS (N) with
N the number of machine insns generated, and most others can easily
use a new attribute "cost", which for example can use the existing
rs6000_cost structure (that gives costs for various cpu models).
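
A minimal sketch of that split, assuming a hypothetical per-CPU table:
simple instructions cost COSTS_N_INSNS (N), and the remaining classes
look their cost up in the table for the selected CPU.  The toy_* names
and all numbers are invented for the example and are not taken from
the rs6000 port; only the COSTS_N_INSNS scaling mirrors GCC's rtl.h.

```c
#include <assert.h>
#include <stddef.h>

#define COSTS_N_INSNS(N) ((N) * 4)

/* Hypothetical per-CPU cost table.  */
struct toy_cpu_costs
{
  int mul;
  int div;
  int fp_add;
};

const struct toy_cpu_costs toy_cpu_a
  = { COSTS_N_INSNS (3), COSTS_N_INSNS (20), COSTS_N_INSNS (4) };
const struct toy_cpu_costs toy_cpu_b
  = { COSTS_N_INSNS (5), COSTS_N_INSNS (35), COSTS_N_INSNS (6) };

/* Hypothetical values of a "cost" attribute.  */
enum toy_cost_class
{
  TOY_COST_SIMPLE,
  TOY_COST_MUL,
  TOY_COST_DIV,
  TOY_COST_FP_ADD
};

/* Cost of an insn of class CLS that expands to N_MACHINE_INSNS
   machine instructions, when tuning for CPU.  */
int
toy_attr_cost (enum toy_cost_class cls, int n_machine_insns,
               const struct toy_cpu_costs *cpu)
{
  assert (cpu != NULL && n_machine_insns > 0);
  switch (cls)
    {
    case TOY_COST_MUL:
      return cpu->mul;
    case TOY_COST_DIV:
      return cpu->div;
    case TOY_COST_FP_ADD:
      return cpu->fp_add;
    case TOY_COST_SIMPLE:
    default:
      return COSTS_N_INSNS (n_machine_insns);
    }
}
```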

> Also, costs tend to be
> micro-architecture specific so attaching costs directly to patterns
> would be extremely painful, adding support would require touching the
> entirety of the MD files.  The best bet would be a level of indirection
> from the patterns to cost tables, much like scheduler attributes.

We already use the same in the rtx_costs hook; it is quite a bit nicer
to have it in the machine description itself.

I am dumping out the "old" cost and "new" cost whenever they differ.
It turns out that more than half of the differences are where the *old*
cost was bad!

> But that still leaves the issue of what to do with the cost of MEM vs
> REG operands - in a pattern they may both be matched by general_operand
> but the cost of each is quite distinct and the normal attributes system
> probably won't (maybe can't) disambiguate the sub-types until after
> register allocation.

I currently handle this directly in the insn_cost hook; it could also be
handled in the md files.  Of course this problem is easy for rs6000
because Power is purely a load-store architecture.


Segher


Re: [patch] RFC: Hook for insn costs?

2017-08-02 Thread Segher Boessenkool
On Wed, Aug 02, 2017 at 12:56:58PM -0700, Richard Henderson wrote:
> On 08/02/2017 12:34 PM, Richard Earnshaw wrote:
> > I'm not sure if that's a good or a bad thing.  Currently the mid-end
> > depends on some rtx constructs having sensible costs even if there's no
> > rtl pattern to match them (IIRC plus:QI is one such construct - RISC
> > type machines usually lack such an instruction). 
> 
> I hadn't considered this... but there are several possible workarounds.
> 
> The simplest of which is to fall back to using rtx_cost if the insn_cost hook
> returns a failure indication, e.g. -1.

I think it is simpler if the insn_cost hook implementation itself calls
rtx_cost (or pattern_cost etc.) where it wants to.

> > Also, costs tend to be
> > micro-architecture specific so attaching costs directly to patterns
> > would be extremely painful, adding support would require touching the
> > entirety of the MD files.  The best bet would be a level of indirection
> > from the patterns to cost tables, much like scheduler attributes.
> 
> I was never thinking of adding costs directly to the md files, but rather
> structuring the insn_cost hook like
> 
>   if (recog_memoized (insn) < 0)
>     return -1;
>   switch (get_attr_type (insn))
>     {
>     case TYPE_iadd:
>     case TYPE_ilog:
>     case TYPE_mvi:
>       return COSTS_N_INSNS (1);
> 
>     case TYPE_fadd:
>       return cost_data->fp_add;
>     }
> 
> etc.  This would be especially important when it comes to costing for simd-type
> insns.  Matching many of those any other way would be fraught with peril.

Yep.  Most of this can be handled inside just some platform-specific
cost attribute(s).

> It does make x86 more complex than RISC-ier targets, but still not
> impossible.  There is at least a "memory" attribute with which one could
> adjust the base cost selected by the "type" attribute.

Yes, I don't know if the new hook will make life much simpler for highly
CISC, or irregular architectures.  So far it looks very promising for
"RISC", more regular architectures.  Anything that doesn't want to
implement it can just use the old way of things.

> One could also walk the pattern to count the number of mems, at least for a
> given subset of insn types.

Right.


Segher


void function declared attribute const

2017-08-02 Thread Martin Sebor

Hi Honza,

While testing improvements to GCC attribute handling I came
across the warning below:

In file included from /ssd/src/gcc/81544/libstdc++-v3/src/c++98/mt_allocator.cc:31:0:
/ssd/build/gcc-81544/x86_64-pc-linux-gnu/libstdc++-v3/include/ext/mt_allocator.h:359:43: warning: ‘const’ attribute on function returning ‘void’ [-Wattributes]

   _M_destroy_thread_key(void*) throw ();
  ^

Git log shows you added the attribute in r146330 and there was
a lively debate preceding the change on the libstdc++ mailing
list:
  https://gcc.gnu.org/ml/libstdc++/2009-04/msg00053.html

A patch including this function was posted here:
  https://gcc.gnu.org/ml/libstdc++/2009-04/msg00111.html

I've re-read much of the discussion but couldn't find my answer
there so I'm wondering if you could help me understand the
rationale for it.  (Btw., I understand what the attribute does,
I'm just not sure why it's helpful or even used in this instance.)

The function is declared like this in the header:

  // XXX GLIBCXX_ABI Deprecated
  _GLIBCXX_CONST void
  _M_destroy_thread_key(void*) throw ();

and defined like so in mt_allocator.cc:

  void
  __pool::_M_destroy_thread_key(void*) throw () { }

So the definition trivially meets the requirements on a const
function, and calls to it will be eliminated (with optimization).

My question is: since the function is deprecated and its
definition only provided for ABI compatibility, should calls
to it ever actually be compiled with future revisions of GCC?

I mean, since it's an implementation-only function, there should
be no code outside of libstdc++ that makes direct calls to it.
Code that did make calls to it (e.g., as a result of macro or
inline expansion) was presumably compiled into object files by
older versions of GCC (before the deprecation and before the
addition of the attribute) and won't benefit from the const
attribute now.  And since the function is deprecated, no code
newly compiled with GCC should make calls to it, either directly
or otherwise.  (Would using attribute deprecated on the function
make sense?)

The reason for my question is to understand if the warning is
justified (it's based on the documentation of attribute const:
"It does not make sense for a const function to return void.")

If it does make sense to declare a function const that returns
void then I'll remove the warning and update the manual and
mention this use case.  Otherwise, if you confirm that the
function shouldn't be called in new code I'll submit a patch
to remove the const attribute.

Thanks
Martin


gcc-6-20170802 is now available

2017-08-02 Thread gccadmin
Snapshot gcc-6-20170802 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/6-20170802/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 6 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-6-branch 
revision 250837

You'll find:

 gcc-6-20170802.tar.xz           Complete GCC

  SHA256=6da53170cb1fd50fc49cae567e741a0e6334babcc1d23e7302221b18adcce34c
  SHA1=63afc5a12cf17f32cd185d98f751590a6d96cbe1

Diffs from 6-20170726 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-6
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


How to migrate struct rtl_opt_pass to class for GCC v6.x?

2017-08-02 Thread Leslie Zhai

Hi GCC developers,

As ChangeLog-2013 mentioned:

2013-08-05  David Malcolm 

This is the automated part of the conversion of passes from C
structs to C++ classes.

...

* auto-inc-dec.c (pass_inc_dec): Convert from a global struct to a
subclass of rtl_opt_pass along with...


so I migrated the struct rtl_opt_pass:
https://github.com/llvm-mirror/dragonegg/blob/master/src/Backend.cpp#L1731



static struct rtl_opt_pass pass_rtl_emit_function = { {
  RTL_PASS, "rtl_emit_function",         /* name */
#if (GCC_MINOR > 7)
  OPTGROUP_NONE,                         /* optinfo_flags */
#endif
  NULL,                                  /* gate */
  rtl_emit_function,                     /* execute */
  NULL,                                  /* sub */
  NULL,                                  /* next */
  0,                                     /* static_pass_number */
  TV_NONE,                               /* tv_id */
  PROP_ssa | PROP_gimple_leh | PROP_cfg, /* properties_required */
  0,                                     /* properties_provided */
  PROP_ssa | PROP_trees,                 /* properties_destroyed */
  TODO_verify_ssa | TODO_verify_flow
    | TODO_verify_stmts,                 /* todo_flags_start */
  TODO_ggc_collect                       /* todo_flags_finish */
} };


to the GCC v6.x C++ class style:
https://github.com/xiangzhai/dragonegg/blob/gcc-6_3-branch/src/Backend.cpp#L2072



const pass_data pass_data_rtl_emit_function = {
  RTL_PASS,  /* type */
  "rtl_emit_function",   /* name */
  OPTGROUP_NONE, /* optinfo_flags */
  TV_NONE,   /* tv_id */
  PROP_ssa | PROP_gimple_leh | PROP_cfg, /* properties_required */
  0, /* properties_provided */
  PROP_ssa | PROP_trees, /* properties_destroyed */
  0, /* todo_flags_start */
  0, /* todo_flags_finish */
};

class pass_rtl_emit_function : public rtl_opt_pass {
public:
  pass_rtl_emit_function(gcc::context *ctxt)
  : rtl_opt_pass(pass_data_rtl_emit_function, ctxt) {}

  unsigned int execute(function *) { return rtl_emit_function(); }

  opt_pass *clone() { return this; }
};


GCC v4.6 calls the callback `rtl_emit_function`, but GCC v6.x does not.
Did I forget something?  Are there some options that need to be set?
Please give me a hint, thanks a lot!


PS: GCC v6.x *does* call the callback `rtl_emit_function` in this testcase:
https://github.com/xiangzhai/dragonball/blob/master/tests/plugin.cpp#L67
I will check the source code of register_callback, rtl_opt_pass, and
opt_pass.


--
Regards,
Leslie Zhai - a LLVM developer https://reviews.llvm.org/p/xiangzhai/




RFC [testsuite] Obey --load-average

2017-08-02 Thread Daniel Santos
I'm working on a patch to modify the testsuite to obey the
--load-average value if one is passed to make.  It seems to work pretty
well, except for libstdc++, which doesn't load
gcc/testsuite/lib/gcc-defs.exp since it defines its own
${tool}_functions.  I haven't dug too deeply into libstdc++'s testsuite
yet, but how does it manage parallelization if it isn't using the
routines in gcc-defs.exp?  I'm thinking I will need a separate
load-limit.exp file or some such.

This feature would be very helpful since you cannot interrupt a test run
and restart from where you left off.  Also, if you suspend the job, then
you will get timeouts.  So it would be helpful to have a way to have it
yield when you need to do something else on your machine, or if you're
using a shared test machine and you want to use all available CPU, but
not crowd out other users.

Due to not having actual interprocess communication, the check_cpu_load
procedure uses an algo that gives lower numbered jobs slightly more
priority than higher numbered jobs.  When the load average is exceeded,
the job sleeps an amount of time based upon the "priority" (lower
numbered jobs sleep less) and a random number - this helps prevent feast
or famine cycles where all jobs stop when the load is too high and then
all jobs start again and saturate the CPUs, bouncing back and forth.

gcc/testsuite/ChangeLog

* gcc/Makefile.in (check-parallel-%): Export job number to environment.
* gcc/testsuite/lib/gcc-defs.exp (num_jobs, load_max, getloadavg_exe):
New global variables.
(check_cpu_load): New proc to check speed limit.
(gcc_parallel_test_run_p): Modify to call check_cpu_load before
acquiring a new lock file.

Thanks,
Daniel
diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index efca9169671..f26ff3840b8 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -4039,6 +4039,7 @@ check-parallel-% : site.exp
 	@test -d $(TESTSUITEDIR)/$(check_p_subdir) || mkdir $(TESTSUITEDIR)/$(check_p_subdir)
 	-$(if $(check_p_subno),@)(rootme=`${PWD_COMMAND}`; export rootme; \
 	srcdir=`cd ${srcdir}; ${PWD_COMMAND}` ; export srcdir ; \
+	GCC_RUNTEST_JOBNO=$(check_p_subno) ; export GCC_RUNTEST_JOBNO ; \
 	if [ -n "$(check_p_subno)" ] \
 	   && [ -n "$$GCC_RUNTEST_PARALLELIZE_DIR" ] \
 	   && [ -f $(TESTSUITEDIR)/$(check_p_tool)-parallel/finished ]; then \
diff --git a/gcc/testsuite/lib/gcc-defs.exp b/gcc/testsuite/lib/gcc-defs.exp
index d5fde7ce5e3..e18025018d2 100644
--- a/gcc/testsuite/lib/gcc-defs.exp
+++ b/gcc/testsuite/lib/gcc-defs.exp
@@ -20,6 +20,8 @@ load_lib wrapper.exp
 
 load_lib target-utils.exp
 
+global num_jobs load_max getloadavg_exe
+
 #
 # ${tool}_check_compile -- Reports and returns pass/fail for a compilation
 #
@@ -148,6 +150,107 @@ proc ${tool}_exit { } {
 }
 }
 
+if { [info exists env(MAKEFLAGS)] } then {
+    # Attempt to get the --load-average from make
+    set load_max [regsub "^(?:|.*? -)l(\\d+(\\.\\d+)?).*?$" \
+		  $env(MAKEFLAGS) "\\1" ]
+    if [regexp "^\\d+(\\.\\d+)?$" $load_max match] then {
+	verbose "load_max = $load_max" 0
+    } else {
+	unset load_max
+    }
+
+    # Attempt to get the number of make -j
+    set num_jobs [regsub "^(?:|.*? -)?j(\\d+).*?$" $env(MAKEFLAGS) "\\1" ]
+    if [regexp "^\\d+$" $num_jobs match] then {
+	verbose "num_jobs = $num_jobs" 0
+    } else {
+	set num_jobs 1
+    }
+}
+
+# If a --load-average was specified, try to build getloadavg_exe.
+if [info exists load_max] then {
+    set src "$tmpdir/getloadavg.[pid].c"
+    set getloadavg_exe "$tmpdir/getloadavg.exe"
+    set f [open $src "w"]
+    puts $f {
+	#include <stdlib.h>
+	#include <stdio.h>
+
+	int main (int argc, char *argv[])
+	{
+	  double load;
+	  if (getloadavg (&load, 1) == -1)
+	    return -1;
+
+	  printf ("%f", load);
+	  return 0;
+	}
+    }
+    close $f
+
+    # Temporarily switch to the environment for the host compiler.
+    #restore_ld_library_path_env_vars
+    set cc "$HOSTCC $HOSTCFLAGS $TEST_ALWAYS_FLAGS -O2"
+    set status [remote_exec host "$cc -o $getloadavg_exe $src"]
+    #set_ld_library_path_env_vars
+    file delete $src
+    if [lindex $status 0] {
+	verbose "Failed to build $src, will not attempt to enforce CPU load limit." 0
+	unset getloadavg_exe
+    } else {
+	verbose "$getloadavg_exe built and appears to be working..." 0
+    }
+}
+
+#
+# check_cpu_load -- If make was passed --load-average and libc supports
+#		getloadavg, then check CPU load and regulate job execution.
+#
+
+proc check_cpu_load {} {
+    global num_jobs load_max getloadavg_exe load_last_checked
+
+    # Only recheck CPU load every 4 seconds.
+    set now [clock seconds]
+    if {[info exists load_last_checked] && \
+	    $now < [expr {$load_last_checked + 4}]} then {
+	return;
+    }
+    set load_last_checked $now
+    set jobno [getenv GCC_RUNTEST_JOBNO]
+
+    # First job always runs
+    if {$jobno == 0 || $jobno == ""} then {
+	return
+    }
+
+set sleep_time [expr {rand() * 8 * $jobno / $nu