Re: [WWWDocs] Deprecate support for non-thumb ARM devices

2016-02-25 Thread Richard Earnshaw (lists)
On 24/02/16 17:38, Joseph Myers wrote:
> On Wed, 24 Feb 2016, Richard Earnshaw (lists) wrote:
> 
>> After discussion with the ARM port maintainers we have decided that now
>> is probably the right time to deprecate support for versions of the ARM
>> Architecture prior to ARMv4t.  This will allow us to clean up some of
> 
> Should this include -march=armv5 and -march=armv5e (the theoretical 
> no-Thumb versions of v5, which may never have had any corresponding 
> processors)?
> 

It's a fair question, but the answer is no, this isn't necessary.

The point is to permit the compiler to use interworking compatible
sequences of code when generating ARM code, not to force users to use
Thumb code.  The necessary instruction (BX) is available in armv5 and
armv5e, even though Thumb is not supported in those architecture variants.

It might be worth deprecating v5 and v5e at some point in the future: to
the best of my knowledge no v5 class device without Thumb has ever
existed - but it's not a decision that needs to be related to this proposal.

R.


Re: [WWWDocs] Deprecate support for non-thumb ARM devices

2016-02-25 Thread Stefan Ring
On Thu, Feb 25, 2016 at 10:20 AM, Richard Earnshaw (lists)
 wrote:
> The point is to permit the compiler to use interworking compatible
> sequences of code when generating ARM code, not to force users to use
> Thumb code.  The necessary instruction (BX) is available in armv5 and
> armv5e, even though Thumb is not supported in those architecture variants.
>
> It might be worth deprecating v5 and v5e at some point in the future: to
> the best of my knowledge no v5 class device without Thumb has ever
> existed - but it's not a decision that needs to be related to this proposal.

Slightly off topic, but related: What does the "e" stand for? Also,
what does "l" stand for in armv5tel, which is what I usually get --
little endian?

I have no idea if there is an authoritative source for these host
specifications and cannot find any. config.guess seems to just rely on
uname -m.


Re: [WWWDocs] Deprecate support for non-thumb ARM devices

2016-02-25 Thread Richard Earnshaw (lists)
On 25/02/16 13:32, Stefan Ring wrote:
> On Thu, Feb 25, 2016 at 10:20 AM, Richard Earnshaw (lists)
>  wrote:
>> The point is to permit the compiler to use interworking compatible
>> sequences of code when generating ARM code, not to force users to use
>> Thumb code.  The necessary instruction (BX) is available in armv5 and
>> armv5e, even though Thumb is not supported in those architecture variants.
>>
>> It might be worth deprecating v5 and v5e at some point in the future: to
>> the best of my knowledge no v5 class device without Thumb has ever
>> existed - but it's not a decision that needs to be related to this proposal.
> 
> Slightly off topic, but related: What does the "e" stand for? Also,
> what does "l" stand for in armv5tel, which is what I usually get --
> little endian?

The 'e' represented some extensions to the original v5 ISA (you can make
your own mind up as to what the 'e' stands for).

The 'l' isn't anything to do with the architecture per se.  In the Linux
context it simply means a little-endian device, as opposed to a
'b'ig-endian device.  Most ARM-based systems are little-endian, so you'll
see that far more often than 'b'.
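
A trivial sketch (not part of the original mail) of what the 'l'/'b'
distinction means in practice:

  #include <stdio.h>

  int main(void)
  {
    unsigned int x = 1;
    /* On a little-endian ("...l") system the low-order byte is stored
       first; on a big-endian ("...b") system it comes last. */
    printf("%s-endian\n", *(unsigned char *)&x == 1 ? "little" : "big");
    return 0;
  }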


> I have no idea if there is an authoritative source for these host
> specifications and cannot find any. config.guess seems to just rely on
> uname -m.
> 

For AArch32 it's extremely ad hoc.  There's a bit more sanity in the
AArch64 world, but it relies on people following some conventions and
not just creating anarchy.
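
For illustration (not part of the original mail), the string config.guess
picks up is just whatever the kernel reports; a minimal sketch of reading
it directly:

  #include <stdio.h>
  #include <sys/utsname.h>

  int main(void)
  {
    struct utsname u;
    if (uname(&u) == 0)
      /* e.g. "armv5tel" on a little-endian v5TE Linux system, or
         "aarch64" on a 64-bit one. */
      printf("machine: %s\n", u.machine);
    return 0;
  }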

R.


Re: [WWWDocs] Deprecate support for non-thumb ARM devices

2016-02-25 Thread David Brown
On 25/02/16 14:32, Stefan Ring wrote:
> On Thu, Feb 25, 2016 at 10:20 AM, Richard Earnshaw (lists)
>  wrote:
>> The point is to permit the compiler to use interworking compatible
>> sequences of code when generating ARM code, not to force users to use
>> Thumb code.  The necessary instruction (BX) is available in armv5 and
>> armv5e, even though Thumb is not supported in those architecture variants.
>>
>> It might be worth deprecating v5 and v5e at some point in the future: to
>> the best of my knowledge no v5 class device without Thumb has ever
>> existed - but it's not a decision that needs to be related to this proposal.
> 
> Slightly off topic, but related: What does the "e" stand for? Also,
> what does "l" stand for in armv5tel, which is what I usually get --
> little endian?





The "t" is thumb, "e" means "DSP-like extensions", and I suspect the "l"
is a misprint for "j", meaning the Jazelle (Java) acceleration instructions.

> 
> I have no idea if there is an authoritative source for these host
> specifications and cannot find any. config.guess seems to just rely on
> uname -m.
> 



Re: [WWWDocs] Deprecate support for non-thumb ARM devices

2016-02-25 Thread Richard Earnshaw (lists)
On 25/02/16 14:15, David Brown wrote:
> On 25/02/16 14:32, Stefan Ring wrote:
>> On Thu, Feb 25, 2016 at 10:20 AM, Richard Earnshaw (lists)
>>  wrote:
>>> The point is to permit the compiler to use interworking compatible
>>> sequences of code when generating ARM code, not to force users to use
>>> Thumb code.  The necessary instruction (BX) is available in armv5 and
>>> armv5e, even though Thumb is not supported in those architecture variants.
>>>
>>> It might be worth deprecating v5 and v5e at some point in the future: to
>>> the best of my knowledge no v5 class device without Thumb has ever
>>> existed - but it's not a decision that needs to be related to this proposal.
>>
>> Slightly off topic, but related: What does the "e" stand for? Also,
>> what does "l" stand for in armv5tel, which is what I usually get --
>> little endian?
> 
> 
> 
> 
> 
> The "t" is thumb,

Correct.

> "e" means "DSP-like extensions",

Correct.  But there were other bits as well.

> and I suspect the "l"
> is a misprint for "j", meaning the Jazelle (Java) acceleration instructions.

No.  As I said earlier, it's nothing to do with the architecture, but
means the system is running little-endian.

R.



Re: [WWWDocs] Deprecate support for non-thumb ARM devices

2016-02-25 Thread Stefan Ring
On Thu, Feb 25, 2016 at 3:15 PM, David Brown  wrote:
> The "t" is thumb, "e" means "DSP-like extensions", and I suspect the "l"
> is a misprint for "j", meaning the Jazelle (Java) acceleration instructions.

I doubt that, as "armv5tejl" is also quite common.


Re: [WWWDocs] Deprecate support for non-thumb ARM devices

2016-02-25 Thread Stefan Ring
On Thu, Feb 25, 2016 at 3:15 PM, David Brown  wrote:
> 

Great link, thanks!


Re: Importance of transformations that turn data dependencies into control dependencies?

2016-02-25 Thread Torvald Riegel
On Wed, 2016-02-24 at 13:14 +0100, Richard Biener wrote:
> On Tue, Feb 23, 2016 at 8:38 PM, Torvald Riegel  wrote:
> > I'd like to know, based on the GCC experience, how important we consider
> > optimizations that may turn data dependencies of pointers into control
> > dependencies.  I'm thinking about all optimizations or transformations
> > that guess that a pointer might have a specific value, and then create
> > (specialized) code that assumes this value that is only executed if the
> > pointer actually has this value.  For example:
> >
> > int d[2] = {23, compute_something()};
> >
> > int compute(int v) {
> >   if (likely(v == 23)) return 23;
> >   else ;
> > }
> >
> > int bar() {
> >   int *p = ptr.load(memory_order_consume);
> >   size_t reveal_that_p_is_in_d = p - &d[0];
> >   return compute(*p);
> > }
> >
> > Could be transformed to (after inlining compute(), and specializing for
> > the likely path):
> >
> > int bar() {
> >   int *p = ptr.load(memory_order_consume);
> >   if (p == d) return 23;
> >   else ;
> > }
> 
> Note that if a user writes
> 
>   if (p == d)
>{
>  ... do lots of stuff via p ...
>}
> 
> GCC might rewrite accesses to p as accesses to d and thus expose
> those opportunities.  Is that a transform that isn't valid then or is
> the code written by the user (establishing the equivalency) to blame?

In the context of this memory_order_consume proposal, this transform
would be valid because the program has already "revealed" what value p
has after the branch has been taken.
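
A self-contained sketch of that pattern (names and details here are
illustrative, not taken from the proposal or the mail above):

  #include <stdatomic.h>

  int d[2] = {23, 42};
  _Atomic(int *) ptr;

  static void use(int v) { (void) v; }

  void reader(void)
  {
    int *p = atomic_load_explicit(&ptr, memory_order_consume);
    /* The consume load carries a dependency into accesses through p... */
    if (p == d) {
      /* ...but once the program itself has compared p against d and
         branched, accesses through p may be rewritten as accesses
         through d; ordering then rests on the branch (a control
         dependency) rather than on the loaded address (a data
         dependency). */
      use(d[1]);
    }
  }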

> There's a PR where this kind of equivalency leads to unexpected (wrong?)
> points-to results for example.
> 
> > Other potential examples that come to mind are de-virtualization, or
> > feedback-directed optimizations that have observed at runtime that a
> > certain pointer is likely to be always equal to some other pointer (e.g.,
> > if p is almost always d[0], and specializing for that).
> 
> Those are the cases that are quite important in practice.

Could you quantify this somehow, even if it's a very rough estimate?
I'm asking because if it's significant and widely used, then this would
require users or compiler implementors to make a difficult trade-off
(i.e., do you want mo_consume performance or performance through those
other optimizations?).

> > Also, it would be interesting to me to know how often we may turn data
> > dependencies into control dependencies in cases where this doesn't
> > affect performance significantly.
> 
> I suppose we try to avoid that but can we ever know for sure?  Like
> speculative devirtualization does this (with the intent that it _does_ matter,
> of course).
> 
> I suppose establishing non-dependence isn't an issue, like with the
> vectorizer adding runtime dependence checks and applying versioning
> to get a vectorized and a not vectorized loop (in case there are dependences)?

I'm not sure I understand you correctly.  Do you have a brief example,
perhaps?  For mo_consume and its data dependencies, if there might be a
dependence, the compiler would have to preserve it; but I guess that
both a vectorized loop and one that accesses each element separately
would preserve dependences because they're doing those accesses, and they
depend on the input data.
OTOH, perhaps HW vector instructions don't get the ordering guarantees
from data dependences -- Paul, do you know of any such cases?

> > The background for this question is Paul McKenney's recently updated
> > proposal for a different memory_order_consume specification:
> > http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0190r0.pdf
> >
> > In a nutshell, this requires a compiler to either prove that a pointer
> > value is not carrying a dependency (simplified, its value somehow
> > originates from a memory_order_consume load), or it has to
> > conservatively assume that it does; if it does, the compiler must not
> > turn data dependencies into control dependencies in generated code.
> > (The data dependencies, in contrast to control dependencies, enforce
> > memory ordering on archs such as Power and ARM; these orderings then
> > allow for not having to use an acquire HW barrier in the generated
> > code.)
> >
> > Given that such a proof will likely be hard for a compiler (dependency
> > chains propagate through assignments to variables on the heap and stack,
> > chains are not marked in the code, and points-to analysis can be hard),
> > a compiler faces a trade-off between either:
> > (1) trying to support this memory_order_consume specification and likely
> > disallowing all transformations that change data dependencies into
> > control dependencies, or
> > (2) not supporting the proposal, simply emitting memory_order_acquire
> > code, but getting no new constraints on transformations in return
> > (i.e., what we do for memory_order_consume today).
> >
> > A compiler could let users make this choice, but this will be hard for
> > users too, and the compiler would still have to pick a default.
> >
> > Therefore, it would 

Re: Importance of transformations that turn data dependencies into control dependencies?

2016-02-25 Thread Torvald Riegel
On Thu, 2016-02-25 at 18:33 +0100, Torvald Riegel wrote:
> On Wed, 2016-02-24 at 13:14 +0100, Richard Biener wrote:
> > On Tue, Feb 23, 2016 at 8:38 PM, Torvald Riegel  wrote:
> > > I'd like to know, based on the GCC experience, how important we consider
> > > optimizations that may turn data dependencies of pointers into control
> > > dependencies.  I'm thinking about all optimizations or transformations
> > > that guess that a pointer might have a specific value, and then create
> > > (specialized) code that assumes this value that is only executed if the
> > > pointer actually has this value.  For example:
> > >
> > > int d[2] = {23, compute_something()};
> > >
> > > int compute(int v) {
> > >   if (likely(v == 23)) return 23;
> > >   else ;
> > > }
> > >
> > > int bar() {
> > >   int *p = ptr.load(memory_order_consume);
> > >   size_t reveal_that_p_is_in_d = p - &d[0];
> > >   return compute(*p);
> > > }
> > >
> > > Could be transformed to (after inlining compute(), and specializing for
> > > the likely path):
> > >
> > > int bar() {
> > >   int *p = ptr.load(memory_order_consume);
> > >   if (p == d) return 23;
> > >   else ;
> > > }
> > 
> > Note that if a user writes
> > 
> >   if (p == d)
> >{
> >  ... do lots of stuff via p ...
> >}
> > 
> > GCC might rewrite accesses to p as accesses to d and thus expose
> > those opportunities.  Is that a transform that isn't valid then or is
> > the code written by the user (establishing the equivalency) to blame?
> 
> In the context of this memory_order_consume proposal, this transform
> would be valid because the program has already "revealed" what value p
> has after the branch has been taken.
> 
> > There's a PR where this kind of equivalency leads to unexpected (wrong?)
> > points-to results for example.
> > 
> > > Other potential examples that come to mind are de-virtualization, or
> > > feedback-directed optimizations that have observed at runtime that a
> > > certain pointer is likely to be always equal to some other pointer (e.g.,
> > > if p is almost always d[0], and specializing for that).
> > 
> > Those are the cases that are quite important in practice.
> 
> Could you quantify this somehow, even if it's a very rough estimate?
> I'm asking because if it's significant and widely used, then this would
> require users or compiler implementors to make a difficult trade-off
> (i.e., do you want mo_consume performance or performance through those
> other optimizations?).
> 
> > > Also, it would be interesting to me to know how often we may turn data
> > > dependencies into control dependencies in cases where this doesn't
> > > affect performance significantly.
> > 
> > I suppose we try to avoid that but can we ever know for sure?  Like
> > speculative devirtualization does this (with the intent that it _does_ 
> > matter,
> > of course).

Due to a think-o on my part, I need to add that the transformations
that turn data into control dependencies would need to operate on data
that is not considered "constant" during the lifetime of the
application; IOW, all modifications to the data accessed through the
original data dependence would always have to happen-before any of the
accesses that get turned into a control dependency.

In the de-virtualization case I suppose this would be the case, because
the vtables won't change, so if the compiler turns this:
  func = p->vtable[23];
into this:
  if (p->vtable == structA)
    func = structA.vtable[23];  // or inlines func directly...
then this would not matter for the memory_order_consume load because all
the vtables wouldn't get modified concurrently.
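
A more concrete (purely hypothetical) rendering of that sketch, with
made-up names:

  struct vtbl { void (*fn[32])(void); };
  extern const struct vtbl structA_vtable;   /* never modified after startup */

  struct obj { const struct vtbl *vtable; };

  void call23(struct obj *p)
  {
    void (*func)(void);
    if (p->vtable == &structA_vtable)
      func = structA_vtable.fn[23];   /* specialized path, candidate for inlining */
    else
      func = p->vtable->fn[23];       /* generic path keeps the data dependency */
    func();
  }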



Re: How to use _Generic with bit-fields

2016-02-25 Thread Joseph Myers
On Wed, 24 Feb 2016, Wink Saville wrote:

> Furthermore, things like printing of "big" bit-fields such as
> unsigned long long int b:33 don't issue any warnings with -Wall on clang

Of course, printing such a bit-field with %llu etc. isn't fully portable 
even with the C++ semantics for bit-field types.  With the C++ semantics, 
if int is 34 bits or more then it gets promoted to int; otherwise, if 
unsigned int is 33 bits or more then it gets promoted to unsigned int.  So 
as with many cases of using variadic functions, you need to include a cast 
to get the desired types.  It just so happens it's hard for warnings to 
tell how portable you want the code to be, or to tell whether an 
unportable format for a type was e.g. autoconf-detected to be correct.
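
A minimal sketch (not part of the original mail) of the cast-based
workaround described above:

  #include <stdio.h>

  struct s { unsigned long long b : 33; };

  int main(void)
  {
    struct s x = { .b = 0x1FFFFFFFFULL };
    /* Cast first, so the variadic argument has a known type regardless
       of how the bit-field itself is typed and promoted. */
    printf("%llu\n", (unsigned long long) x.b);
    return 0;
  }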

> If someone were to supply a patch that changed the behavior to match
> what clang and apparently other compilers are doing, would you be likely
> to accept it?

Not without a clear direction from WG14 to require the C++ rules (in which 
case conditionals on the C standard version would be appropriate, given 
the previous direction from the C90 DRs).  You'd need to track the 
declared type for each bit-field alongside the reduced-width type, apply 
C++-style promotions and conversions to the declared type for relevant 
rvalue uses, and handle the types appropriately in _Generic and typeof - 
other changes could be needed if e.g. conversion from floating-point to 
bit-fields were defined to convert to the declared type and then convert 
from that to the bit-field.  (A narrower direction only defining types in 
_Generic might still require tracking declared types but not require so 
many other changes.)
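
For illustration only (again not part of the original mail), the kind of
usage at issue; which association is selected for the bit-field, or
whether the declared or reduced-width type is seen at all, is exactly
what differs between implementations, so no particular result is claimed
here:

  #include <stdio.h>

  struct s { unsigned long long b : 33; };

  #define TYPE_NAME(x) _Generic((x),             \
      unsigned int:       "unsigned int",         \
      unsigned long long: "unsigned long long",   \
      default:            "something else")

  int main(void)
  {
    struct s x = { .b = 1 };
    puts(TYPE_NAME(x.b));
    return 0;
  }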

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: How to use _Generic with bit-fields

2016-02-25 Thread Wink Saville
Thanks for the info. What I'll probably do is file a bug and reply to this
thread and the other one when I do.

