Re: question about -ffast-math implementation

2014-06-02 Thread Andrew Pinski
On Sun, Jun 1, 2014 at 11:09 PM, Janne Blomqvist
 wrote:
> On Sun, Jun 1, 2014 at 9:52 AM, Mike Izbicki  wrote:
>> I'm trying to copy gcc's behavior with the -ffast-math compiler flag
>> into haskell's ghc compiler.  The only documentation I can find about
>> it is at:
>>
>> https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
>>
>> I understand how floating point operations work and have come up with
>> a reasonable list of optimizations to perform.  But I doubt it is
>> exhaustive.
>>
>> My question is: where can I find all the gory details about what gcc
>> will do with this flag?  I'm perfectly willing to look at source code
>> if that's what it takes.
>
> In addition to the official documentation, a nice overview is at
>
> https://gcc.gnu.org/wiki/FloatingPointMath
>
> Though for the gory details and authoritative answers I suppose you'd
> have to look into the source code.
>
>> Also, are there any optimizations that you wish -ffast-math could
>> perform, but for various architectural reasons they don't fit into
>> gcc?
>
> There are of course a (nearly endless?) list of optimizations that
> could be done but aren't (lack of manpower, impractical, whatnot). I'm
> not sure there are any interesting optimizations that would be
> dependent on loosening -ffast-math further?
>
> (One thing I wish wouldn't be included in -ffast-math is
> -fcx-limited-range; the naive complex division algorithm can easily
> lead to comically poor results.)

Which is kinda interesting because the Google folks have been trying
to turn on  -fcx-limited-range for C++ a few times now.

Thanks,
Andrew Pinski

>
> --
> Janne Blomqvist


Re: question about -ffast-math implementation

2014-06-02 Thread Mike Izbicki
> Though for the gory details and authoritative answers I suppose you'd
> have to look into the source code.

Where would I find the code for this?


GCC 4.7.4 Status Report, branch frozen for release (candidate)

2014-06-02 Thread Richard Biener

Status
======

The GCC 4.7 branch is now frozen as I am preparing a first release
candidate for GCC 4.7.4.  All changes from now on require release
manager approval.  After GCC 4.7.4 is released the branch will be
closed.


Previous Report
===============

https://gcc.gnu.org/ml/gcc/2013-04/msg00121.html

I will send the next status report once GCC 4.7.4 RC1 is ready.


lib{atomic, itm}/configure.tgt uses -mcpu=v9 as default for sparc

2014-06-02 Thread Carlos Sánchez de La Lama
Hi all,

I have seen this in 4.8.2, but it seems to be present since 4.7.0 (it was added
by revision 184177 in libitm/configure.tgt):



# Map the target cpu to an ARCH sub-directory.  At the same time,
# work out any special compilation flags as necessary.

case "${target_cpu}" in

...

  sparc)
case " ${CC} ${CFLAGS} " in
  *" -m64 "*)
;;
  *)
if test -z "$with_cpu"; then
  XCFLAGS="${XCFLAGS} -mcpu=v9"
fi
esac
ARCH=sparc
;;



This prevents compilation of gcc-4.8.2 under NetBSD 6.4.1: an Illegal
Instruction is raised during the final-stage configure of lib{atomic, itm}.

Removing "-mcpu=v9" allows the build to complete.

System is a QEMUlated SparcStation 5. Not the best target for testing,
but I understand this is a bug nonetheless.

Thoughts?

Carlos



GCC 4.7.4 Release Candidate available from gcc.gnu.org

2014-06-02 Thread Richard Biener

GCC 4.7.4 Release Candidate available from gcc.gnu.org

The first release candidate for GCC 4.7.4 is available from

 ftp://gcc.gnu.org/pub/gcc/snapshots/4.7.4-RC-20140602

and shortly from its mirrors.  It has been generated from SVN revision 211126.

I have so far bootstrapped and tested the release candidate on
x86_64-linux.  Please test it and report any issues to bugzilla.

If all goes well, I'd like to release 4.7.4 on Wednesday, 11th.


Re: [RFC] PR61300 K&R incoming args

2014-06-02 Thread Florian Weimer

On 05/31/2014 08:56 AM, Alan Modra wrote:


It's fine to change ABI when compiling an old-style function
definition for which a prototype exists (relative to the
non-prototype case).  It happens on i386, too.


That might be so, but when compiling the function body you must assume
the worst case, whatever that might be, at the call site.  For K&R
code, our error was to assume the call was unprototyped (which
paradoxically is the best case) when compiling the function body.


Is this really a supported use case?  I think I remember tracking down a 
bug which was related to a lack of float -> double promotion because the 
call was prototyped, and the old-style function definition wasn't.  This 
would have been on, ugh, SPARC.  I think this happened only in certain 
cases (float arguments, probably).


Does this trigger more often on ppc64 ELFv2, to the extent that it becomes a
quality-of-implementation issue?  I'm pretty sure the standards do not
require a particular behavior in such cases.


--
Florian Weimer / Red Hat Product Security Team


Re: lib{atomic, itm}/configure.tgt uses -mcpu=v9 as default for sparc

2014-06-02 Thread Eric Botcazou
> # Map the target cpu to an ARCH sub-directory.  At the same time,
> # work out any special compilation flags as necessary.
> 
> case "${target_cpu}" in
> 
> ...
> 
>   sparc)
> case " ${CC} ${CFLAGS} " in
>   *" -m64 "*)
> ;;
>   *)
> if test -z "$with_cpu"; then
>   XCFLAGS="${XCFLAGS} -mcpu=v9"
> fi
> esac
> ARCH=sparc
> ;;
> 
> 
> 
> Prevents compilation of gcc-4.8.2 under NetBSD 6.4.1, throws Illegal
> Instruction during those lib{atomic, itm} final stage configure.
> 
> Removing "-mcpu=v9" allows the build to finalize.
> 
> System is a QEMUlated SparcStation 5. Not the best target for testing,
> but I understand this is a bug nonetheless.

IIRC both libraries require the V9 architecture to work properly/efficiently.

-- 
Eric Botcazou


[GSoC] decision tree first steps

2014-06-02 Thread Prathamesh Kulkarni
I have a few questions regarding genmatch:

a) Why is 4 hard-coded here, in write_nary_simplifiers?
 fprintf (f, "  tree captures[4] = {};\n");

b) Should we add syntax for a symbol to denote multiple operators?
For example, in simplify_rotate:
(X << CNT1) OP (X >> CNT2) with OP being +, |, ^  (CNT1 + CNT2 ==
bitsize of type of X).

c) Remove the code for parsing a capture in parse_expr, since we reject
outermost captured expressions?

d) I am not able to follow this comment in match.pd:
/* Patterns required to avoid SCCVN testsuite regressions.  */

/* (x >> 31) & 1 -> (x >> 31).  Folding in fold-const is more
   complicated here, it does
 Fold (X << C1) & C2 into (X << C1) & (C2 | ((1 << C1) - 1))
 (X >> C1) & C2 into (X >> C1) & (C2 | ~((type) -1 >> C1))
 if the new mask might be further optimized.  */
(match_and_simplify
  (bit_and (rshift@0 @1 INTEGER_CST_P@2) integer_onep)
  if (compare_tree_int (@2, TYPE_PRECISION (TREE_TYPE (@1)) - 1) == 0)
  @0)


Decision Tree.
I have tried to come up with a prototype for the decision tree (patch attached).
For simplicity, it handles patterns involving only unary operators and no
predicates, and it returns false when the pattern fails to match (no goto to
match another pattern).
I meant to post it only for illustration, and I have not really paid
attention to code quality (bad formatting, memory leaks, etc.).

* Basic Idea
A pattern consists of the following parts: match, ifexpr and result.
Let's call the (ifexpr, result) pair the "simplification" operand.
The common prefix between different match operands would be represented
by the same nodes in the decision tree.

Example:
(negate (bit_not @0))
S1

(negate (negate @0))
S2

S1, S2 denote simplifications for the above patterns respectively.

The decision tree would look something like
(that's the way it gets constructed with the patch):

         dummy/root
             |
        NEGATE_EXPR
         /        \
   BIT_NOT     NEGATE_EXPR
      |             |
      @0            @0
      |             |
      S1            S2

a) The children of an internal node are the number of decisions that
can be taken at that node. In the above case it's 2 for the outer NEGATE_EXPR.
b) The simplification operand represents leaves of the decision tree.
c) Instead of having a list of heads, I have added one dummy node,
and the heads become children of this dummy root node.
d) Code-gen for non-simplification operands involves generating
"matching" code, and for simplification operands it involves generating
"transform" code (a rough sketch of what such generated code might look
like follows below).

* Overall Flow:
I guess we would build the decision tree from the AST.
So the flow would be like:
source -> struct simplify (match, ifexpr, result) -> decision tree -> c code.

Something like (in main):
decision_tree dt;
while (there is another pattern)
{
  simplify *s = parse_match_and_simplify ();
  insert s into decision tree;
};
So parsing routines are concerned with parsing and building the AST (operand),
and not with the decision tree. Is that fine ?

* Representation of decision tree.
A decision tree would need a way to represent language constructs
like capture, predicate, etc., so in some ways it would be similar to the AST.
In the patch, I have created the following hierarchy:
dt_operand: represents a general base "operand" in the decision tree
dt_expr: for representing an expression. An expression contains the operation
to be performed (e_operation).
dt_capture: analogous to capture.
dt_simplify: for representing the "simplification" operand;
a simplification consists of ifexpr and result.
dt_head: to represent the "dummy" root. Maybe a separate class is not needed.

* Constructing decision tree from AST
The algorithm I have used is similar to inserting a string into a trie,
as outlined here: http://en.wikipedia.org/wiki/Trie
The difference is that we traverse the AST depth-first rather than
traversing a string.
Apart from that I guess it would be the same (naturally the find and
compare operations would be different).
I haven't given this much thought yet. Currently, I construct the
decision tree only for patterns with unary operators, in an
ugly way: by "flattening" the AST, walking it in pre-order and
storing the nodes in a vector in the order they are visited.

So to construct decision tree from the following AST:
  negate
  |
  bit_not
  |
 @0

The AST is flattened by walking it in preorder (by walk_operand_preorder) and
stored as: [ expr, expr, capture ],
and this vector is used to construct the decision tree.
I did it in a "quick and dirty" way; it has to be changed,
and it won't work for patterns with n-ary operators.

We should visit each node of the AST in preorder, and add that node to the
decision tree during the traversal. I am not yet clear on the way of doing that.
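As a self-contained toy illustration of that insertion step (the names and the
string-labelled nodes are mine, not the patch's), preorder insertion into the
decision tree is essentially trie insertion: at each AST node either reuse a
matching child or create a new one, then descend:

#include <memory>
#include <string>
#include <vector>

/* Toy decision-tree node: a label such as "negate", "bit_not" or "@0",
   plus children.  In the patch these would be the dt_* classes.  */
struct node
{
  std::string label;
  std::vector<std::unique_ptr<node>> kids;

  node *find_or_add (const std::string &l)
  {
    for (auto &k : kids)
      if (k->label == l)
        return k.get ();                /* shared prefix: reuse the child */
    kids.emplace_back (new node{l, {}});
    return kids.back ().get ();
  }
};

/* Insert one pattern, given as its preorder label sequence, under the
   dummy root; the simplification becomes the leaf.  */
static void
insert_pattern (node &root, const std::vector<std::string> &preorder,
                const std::string &simplification)
{
  node *cur = &root;
  for (const auto &l : preorder)
    cur = cur->find_or_add (l);
  cur->find_or_add (simplification);
}

int main ()
{
  node root{"root", {}};
  insert_pattern (root, {"negate", "bit_not", "@0"}, "S1");
  insert_pattern (root, {"negate", "negate", "@0"}, "S2");
  /* The outer "negate" node is shared and now has two children.  */
  return (root.kids.size () == 1 && root.kids[0]->kids.size () == 2) ? 0 : 1;
}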

* Comparing operands
How do we compare operands? We shall need to do this while inserting
into the decision tree, since if the operand is already present we do not
create a new node.
In the

Re: question about -ffast-math implementation

2014-06-02 Thread Tim Prince

On 6/2/2014 3:00 AM, Andrew Pinski wrote:

On Sun, Jun 1, 2014 at 11:09 PM, Janne Blomqvist
 wrote:

On Sun, Jun 1, 2014 at 9:52 AM, Mike Izbicki  wrote:

I'm trying to copy gcc's behavior with the -ffast-math compiler flag
into haskell's ghc compiler.  The only documentation I can find about
it is at:

https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html

I understand how floating point operations work and have come up with
a reasonable list of optimizations to perform.  But I doubt it is
exhaustive.

My question is: where can I find all the gory details about what gcc
will do with this flag?  I'm perfectly willing to look at source code
if that's what it takes.


In addition to the official documentation, a nice overview is at

https://gcc.gnu.org/wiki/FloatingPointMath

Useful, thanks for the pointer


Though for the gory details and authoritative answers I suppose you'd
have to look into the source code.


Also, are there any optimizations that you wish -ffast-math could
perform, but for various architectural reasons they don't fit into
gcc?


There are of course a (nearly endless?) list of optimizations that
could be done but aren't (lack of manpower, impractical, whatnot). I'm
not sure there are any interesting optimizations that would be
dependent on loosening -ffast-math further?
I find it difficult to remember how to reconcile the differing treatments by 
gcc and gfortran under -ffast-math; in particular, with respect to 
-fprotect-parens and -freciprocal-math.  The latter appears to comply 
with the Fortran standard.


(One thing I wish wouldn't be included in -ffast-math is
-fcx-limited-range; the naive complex division algorithm can easily
lead to comically poor results.)


Which is kinda interesting because the Google folks have been trying
to turn on  -fcx-limited-range for C++ a few times now.

Intel tried to add -complex-limited-range as a default under -fp-model 
fast=1 but that was shown to be unsatisfactory.
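As a concrete, self-contained illustration of the complex-division point quoted
above (a minimal sketch; built with default options, i.e. without -ffast-math
or -fcx-limited-range, so std::complex division typically goes through a scaled
runtime routine):

#include <complex>
#include <cstdio>

int main ()
{
  std::complex<double> num (1.0, 1.0);
  std::complex<double> den (1e200, 1e200);   /* true quotient is 1e-200 */

  /* The textbook formula, which -fcx-limited-range allows the compiler
     to emit: the denominator c*c + d*d overflows to +inf even though the
     quotient itself is perfectly representable.  */
  double a = num.real (), b = num.imag ();
  double c = den.real (), d = den.imag ();
  double s = c * c + d * d;
  std::complex<double> naive ((a * c + b * d) / s, (b * c - a * d) / s);

  std::complex<double> careful = num / den;  /* scaled division, no overflow */

  std::printf ("naive:   (%g, %g)\n", naive.real (), naive.imag ());
  std::printf ("careful: (%g, %g)\n", careful.real (), careful.imag ());
  /* prints naive: (0, 0) and careful: (1e-200, 0) */
  return 0;
}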


Now, with the introduction of omp simd directives and pragmas, we have 
disagreement among various compilers on the relative roles of the 
directives and the fast-math options.
I've submitted PR60117 hoping to get some insight on whether omp simd 
should disable optimizations otherwise performed by -ffast-math.


Intel made the directives override the command-line fast (or 
"no-fast") settings locally, so that complex-limited-range might be in 
effect inside the scope of the directive (whether or not you want 
it).  They made changes in the current beta compiler, so it's no longer 
practical to set standard-compliant options and then discard them by pragma 
in individual for loops.



--
Tim Prince


Re: lib{atomic, itm}/configure.tgt uses -mcpu=v9 as default for sparc

2014-06-02 Thread Carlos Sánchez de La Lama
Hi Eric

>> Removing "-mcpu=v9" allows the build to finalize.
>
> IIRC both libraries require the V9 architecture to work
> properly/efficiently.

I have successfully built without the switch, but I am not sure of the
effects at runtime.

If V9 is indeed required, is there a way to build without those libs? Or
has pre V9 support been dropped at some point?

IMHO an efficiency enhancement should not prevent running less
efficiently on a supported architecture. If target triple is
sparcv9-*-*, the next case will match and will add the "-mcpu=v9" to
XCFLAGS, but adding it for non-v9 sparc-*-* targets is at least weird.

Carlos


Re: lib{atomic, itm}/configure.tgt uses -mcpu=v9 as default for sparc

2014-06-02 Thread Jonathan Wakely
On 2 June 2014 12:38, Carlos Sánchez de La Lama wrote:
> If V9 is indeed required, is there a way to build without those libs? Or
> has pre V9 support been dropped at some point?

--disable-libitm --disable-libatomic


Re: [GSoC] decision tree first steps

2014-06-02 Thread Ludovic Courtès
Hi,

Prathamesh Kulkarni  skribis:

> Example:
> /* x & 0 -> 0 */
> (match_and_simplify
>   (bit_and @0 @1)
>   if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && (@1 == integer_zero_node))
>   { integer_zero_node; })
>
> /* x & -1 -> x */
> (match_and_simplify
>   (bit_and @0 @1)
>   if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
>  && (@1 == integer_minus_one_node)
>   @0)

(Apologies if this has already been discussed/decided before.)

With my Scheme background, two things come to mind:

  1. Wouldn’t it be nicer to have a single language, rather than
 intermingle C expressions in the middle of the s-expression
 language?

  2. Wouldn’t it be useful to allow multiple clauses in a single
 ‘match’?

Here’s a hypothetical simplification pass one could write:

  ;; Match tree node ‘exp’ against a series of patterns, and return
  ;; a (possibly identical) tree node.
  (match exp
    ((bitwise-and first second)
     (if (and (integral-type? (tree-type first))
              (eq? integer-zero-node second))
         integer-zero-node              ; simplify to zero
         exp))                          ; return ‘exp’ unchanged
    ((bitwise-and first second)
     (if (and (integral-type? (tree-type first))
              (eq? integer-minus-one-node second))
         first
         exp))
    (else                               ; no simplification pattern matched
     exp))

The language for expression rewriting (the ‘if’ expressions above) could
consist of very few constructs directly translatable to C + tree.

As an example, Guile’s compiler simplification passes look very much
like this [0], built around a generic pattern matcher [1].  Pattern
matching in MELT should also be a good source of inspiration, obviously [2].

Thanks,
Ludo’.

[0] 
http://git.savannah.gnu.org/cgit/guile.git/tree/module/language/tree-il/peval.scm#n1082
[1] http://www.gnu.org/software/guile/manual/html_node/Pattern-Matching.html
[2] http://gcc-melt.org/tutomeltimplem.html


Re: [GSoC] decision tree first steps

2014-06-02 Thread Richard Biener
On Mon, Jun 2, 2014 at 1:16 PM, Prathamesh Kulkarni
 wrote:
> I have few questions regarding genmatch:
>
> a) Why is 4 hard-coded here: ?
> in write_nary_simplifiers:
>  fprintf (f, "  tree captures[4] = {};\n");

Magic number (this must be big enough for all cases ...).  Honestly
this should be improved (but requires another scan over the matcher IL
to figure out the max N used in @N).

> b) Should we add syntax for a symbol to denote multiple operators ?
> For exampleim in simplify_rotate:
> (X << CNT1) OP (X >> CNT2) with OP being +, |, ^  (CNT1 + CNT2 ==
> bitsize of type of X).

Something to enhance the IL with, yes.  I'd say we support

(define_op additive PLUS_EXPR MINUS_EXPR POINTER_PLUS_EXPR)

thus,

(define_op <name> op...)
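In the generated matcher such an operator set could simply turn the single
tree-code comparison into a multi-case test; a rough sketch (GCC-internals
style, not actual genmatch output):

static bool
additive_op_p (enum tree_code code)
{
  /* An "additive" define_op would match any of these codes where a plain
     operator matches exactly one.  */
  switch (code)
    {
    case PLUS_EXPR:
    case MINUS_EXPR:
    case POINTER_PLUS_EXPR:
      return true;
    default:
      return false;
    }
}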

> c) Remove for parsing capture in parse_expr since we reject outermost
> captured expressions ?

but parse_expr is also used for inner expressions, no?

(plus (minus@2 @0 @1) @3)

should still work

> d) I am not able to follow this comment in match.pd:
> /* Patterns required to avoid SCCVN testsuite regressions.  */
>
> /* (x >> 31) & 1 -> (x >> 31).  Folding in fold-const is more
>complicated here, it does
>  Fold (X << C1) & C2 into (X << C1) & (C2 | ((1 << C1) - 1))
>  (X >> C1) & C2 into (X >> C1) & (C2 | ~((type) -1 >> C1))
>  if the new mask might be further optimized.  */
> (match_and_simplify
>   (bit_and (rshift@0 @1 INTEGER_CST_P@2) integer_onep)
>   if (compare_tree_int (@2, TYPE_PRECISION (TREE_TYPE (@1)) - 1) == 0)
>   @0)

The comment is literally copied from the case I extracted the
(simplified) variant from fold-const.c.  See lines 11961-12056 in fold-const.c.
It'll be a challenge to implement the equivalent in a pattern ;)

>
> Decision Tree.
> I have tried to come up with a prototype for decision tree (patch 
> attached).
> For simplicity, it handles patterns involving only unary operators and
> no predicates
> and returns false when the pattern fails to match (no goto to match
> another pattern).
> I meant to post it only for illustration, and I have not really paid
> attention to code quality (bad formatting, memory leaks, etc.).
>
> * Basic Idea
> A pattern consists of following parts: match, ifexpr and result.
> Let's call  as "simplification" operand.
> The common prefix between different match operands would be represented
> by same nodes in the decision tree.
>
> Example:
> (negate (bit_not @0))
> S1
>
> (negate (negate @0))
> S2
>
> S1, S2 denote simplifications for the above patterns respectively.
>
> The decision tree would look something like
> (that's the way it gets constructed with the patch):
>
> dummy/root
> |
>NEGATE_EXPR
>  /  \
>  BIT_NOT   NEGATE_EXPR
>| |
>  @0 @0
>| |
>  S1  S2
>
> a) The children of an internal node are number of decisions that
> can be taken at that node. In the above case it's 2 for outer NEGATE_EXPR.
> b) Simplification operand represents leaves of the decision tree
> c) Instead of having list of heads, I have added one dummy node,
> and heads become children of these dummy root node.
> d) Code-gen for non-simplification operands involves generating,
> "matching" code and for simplification operands involves generating
> "transform" code
>
> * Overall Flow:
> I guess we would build the decision tree from the AST.
> So the flow would be like:
> source -> struct simplify (match, ifexpr, result) -> decision tree -> c code.
>
> Something like (in main):
> decision_tree dt;
> while (there is another pattern)
> {
>   simplify *s = parse_match_and_simplify ();
>   insert s into decision tree;
> };
> So parsing routines are concerned with parsing and building the AST (operand),
> and not with the decision tree. Is that fine ?

Yes, that's good.

> * Representation of decision tree.
> A decision tree would need a way to represent language constructs
> like capture, predicate, etc. so in some ways it would be similar to AST.
> It
> In the patch, I have created the following heirarchy:
> dt_operand: represents a general base "operand" of in decision tree
> dt_expr: for representing expression. Expression contains operation
> to be performed (e_operation).
> dt_capture: analogous to capture.
> dt_simplify: for representing "simplification" operand.
> simplification consists of ifexpr and result
> dt_head: to represent "dummy" root. Maybe a separate class is not needed.
>
> * Constructing decision tree from AST
> The algorithm i have used is similar to inserting string in a trie
> outlined here: http://en.wikipedia.org/wiki/Trie
> The difference shall be to traverse AST depth-first rather than
> traversing the string.
> Apart from that I guess it would be the same (naturally find and
> compare operations would be different).
> I haven't given much thought about this. Currently, I construct
> d

Re: lib{atomic, itm}/configure.tgt uses -mcpu=v9 as default for sparc

2014-06-02 Thread Carlos Sánchez de La Lama
Thanks Jonathan,

>> If V9 is indeed required, is there a way to build without those libs? Or
>> has pre V9 support been dropped at some point?
>
> --disable-libitm --disable-libatomic

Ok, so those two switches will be required for non-v9 SPARC targets.

I still think this should be taken care of in the makefiles (i.e. enable
libitm & libatomic only for sparcv9 targets), but it can be solved on the
pkgsrc side then. I will report it on their list.

Carlos


Re: [GSoC] decision tree first steps

2014-06-02 Thread Richard Biener
On Mon, Jun 2, 2014 at 2:12 PM, Ludovic Courtès  wrote:
> Hi,
>
> Prathamesh Kulkarni  skribis:
>
>> Example:
>> /* x & 0 -> 0 */
>> (match_and_simplify
>>   (bit_and @0 @1)
>>   if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && (@1 == integer_zero_node))
>>   { integer_zero_node; })
>>
>> /* x & -1 -> x */
>> (match_and_simplify
>>   (bit_and @0 @1)
>>   if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
>>  && (@1 == integer_minus_one_node)
>>   @0)
>
> (Apologies if this has already been discussed/decided before.)
>
> With my Scheme background, two things come to mind:
>
>   1. Wouldn’t it be nicer to have a single language, rather than
>  intermingle C expressions in the middle of the s-expression
>  language?
>
>   2. Wouldn’t it be useful to allow multiple clauses in a single
>  ‘match’?
>
> Here’s a hypothetical simplification pass one could write:
>
>   ;; Match tree node ‘exp’ against a series of patterns, and return
>   ;; a (possibly identical) tree node.
>   (match exp
> ((bitwise-and first second)
>  (if (and (integral-type? (tree-type first))
>   (eq? integer-zero-node second))
>  integer-zero-node; simplify to zero
>  exp)); return ‘exp’ unchanged
> ((bitwise-and first second)
>  (if (and (integral-type? (tree-type first))
>   (eq? integer-minus-one-node second))
>  first
>  exp))
> (else ; no simplification pattern matched
>  exp))
>
> The language for expression rewriting (the ‘if’ expressions above) could
> consist of very few constructs directly translatable to C + tree.
>
> As an example, Guile’s compiler simplification passes look very much
> like this [0], built around a generic pattern matcher [1].  Pattern
> matching in MELT should also be a good source of inspiration, obviously [2].

The language is designed to be easy for the implementors ... and at
the same time match what we are used to seeing (thus lispy).  It's
a bit less lispy than the RTL machine descriptions, which allow
RTL as their "if expressions", but I'm not sure your example is
easier to parse.

Yes, being able to combine two cases is nice - but it's also easy
to write obfuscated patterns.

(match-and-simplify
  (bit_and @0 integer_zerop@1)
  @1)
(match-and-simplify
  (bit_and @0 integer_all_onesp@1)
  @0)

is IMHO easier to parse, while your version more closely matches
what the code generator creates.

Richard.



> Thanks,
> Ludo’.
>
> [0] 
> http://git.savannah.gnu.org/cgit/guile.git/tree/module/language/tree-il/peval.scm#n1082
> [1] http://www.gnu.org/software/guile/manual/html_node/Pattern-Matching.html
> [2] http://gcc-melt.org/tutomeltimplem.html


FloatingPointMath and transformations

2014-06-02 Thread Vincent Lefevre
I've looked at

  https://gcc.gnu.org/wiki/FloatingPointMath

and there may be some mistakes or missing info.

First, it is said that x / C is replaced by x * (1.0 / C) when C is
a power of two. But this condition is not sufficient: if 1.0 / C
overflows, the transformation is incorrect. From some testing,
it seems that GCC detects the overflow case, so that it behaves
correctly. In this case I think that the wiki should say:
"When C is a power of two and 1.0 / C doesn't overflow."

It is also said that x / 1.0 and x / -1.0 are respectively replaced
by x and -x. But what about x * 1.0 and x * -1.0?

Ditto with -(a / b) -> a / -b and -(a / b) -> -a / b. Is there
anything similar with multiplication?
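For the first point, a small self-contained example (a sketch assuming IEEE
binary64 doubles; it spells out both forms by hand rather than relying on the
-freciprocal-math transformation, and uses volatile so nothing is folded at
compile time):

#include <cfloat>
#include <cstdio>

int main ()
{
  /* C is a (subnormal) power of two, so x / C is exact,
     but the reciprocal 1.0 / C overflows to +inf.  */
  volatile double C = DBL_MIN / 4;   /* 2^-1024 */
  volatile double x = DBL_MIN;       /* 2^-1022 */

  std::printf ("x / C         = %g\n", x / C);           /* 4 */
  std::printf ("x * (1.0 / C) = %g\n", x * (1.0 / C));   /* inf */
  return 0;
}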

-- 
Vincent Lefèvre  - Web: 
100% accessible validated (X)HTML - Blog: 
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)


Re: FloatingPointMath and transformations

2014-06-02 Thread Geert Bosch

On Jun 2, 2014, at 10:06 AM, Vincent Lefevre  wrote:

> I've looked at
> 
>  https://gcc.gnu.org/wiki/FloatingPointMath
> 
> and there may be some mistakes or missing info.

That’s quite possible. I created the page many years ago, based on my
understanding of GCC at that time. 
> 
> First, it is said that x / C is replaced by x * (1.0 / C) when C is
> a power of two. But this condition is not sufficient: if 1.0 / C
> overflows, the transformation is incorrect. From some testing,
> it seems that GCC detects the overflow case, so that it behaves
> correctly. In this case I think that the wiki should say:
> "When C is a power of two and 1.0 / C doesn't overflow.”
Yes, that was implied, but should indeed be made explicit.
> 
> It is also said that x / 1.0 and x / -1.0 are respectively replaced
> by x and -x. But what about x * 1.0 and x * -1.0?
> 
> Ditto with -(a / b) -> a / -b and -(a / b) -> -a / b. Is there
> anything similar with multiplication?

It should, or it would be a bug. Please feel free to add/correct anything on 
this page.

  -Geert

Re: FloatingPointMath and transformations

2014-06-02 Thread Jakub Jelinek
On Mon, Jun 02, 2014 at 10:33:37AM -0400, Geert Bosch wrote:
> > First, it is said that x / C is replaced by x * (1.0 / C) when C is
> > a power of two. But this condition is not sufficient: if 1.0 / C
> > overflows, the transformation is incorrect. From some testing,
> > it seems that GCC detects the overflow case, so that it behaves
> > correctly. In this case I think that the wiki should say:
> > "When C is a power of two and 1.0 / C doesn't overflow.”
> Yes, that was implied, but should indeed be made explicit.

If C is a power of two, then 1.0 / C should IMHO never overflow; do you mean
underflow to zero?

Jakub


Re: [RFC] PR61300 K&R incoming args

2014-06-02 Thread Alan Modra
On Mon, Jun 02, 2014 at 12:00:41PM +0200, Florian Weimer wrote:
> On 05/31/2014 08:56 AM, Alan Modra wrote:
> 
> >>It's fine to change ABI when compiling an old-style function
> >>definition for which a prototype exists (relative to the
> >>non-prototype case).  It happens on i386, too.
> >
> >That might be so, but when compiling the function body you must assume
> >the worst case, whatever that might be, at the call site.  For K&R
> >code, our error was to assume the call was unprototyped (which
> >paradoxically is the best case) when compiling the function body.
> 
> Is this really a supported use case?

Of course!  We still have K&R code lying around, as evidenced by the
PR.

>  I think I remember tracking
> down a bug which was related to a lack of float -> double promotion
> because the call was prototyped, and the old-style function
> definition wasn't.  This would have been on, ugh, SPARC.  I think
> this happened only in certain cases (float arguments, probably).

Yes, there are some limitations on parameter types that may be used
with unprototyped functions.

> Does this trigger more often on ppc64 ELFv2, to the extend it
> becomes a quality-of-implementation issue?  I'm pretty sure the
> standards do not require a particular behavior in such cases.

The PR isn't about the sort of parameter mismatch that you seem to be
thinking about.  The code in question is perfectly legal old-style
K&R where there is no float/double or int/long/void * trouble.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [GSoC] decision tree first steps

2014-06-02 Thread Ludovic Courtès
Richard Biener  skribis:

> (match-and-simplify
>   (bit_and @0 integer_zerop@1)
>   @1)
> (match-and-simplify
>   (bit_and @0 integer_all_onesp@1)
>   @0)
>
> is IMHO easier to parse while your version more like matches
> what the code generator creates.

Ah yes, the ability to specify predicates for pattern variables as above
is even better (and nicer than the examples that use inline C conditionals.)

Thanks,
Ludo’.


Cross-testing libsanitizer

2014-06-02 Thread Christophe Lyon
Hi,

I am updating my (small) patch to enable libsanitizer on AArch64, but
I am wondering about the testing.

Indeed, when testing on my laptop, execution tests fail because
libsanitizer wants to allocate 8GB of memory (I am using qemu as the
execution engine).
When running on servers with more RAM, the tests pass.

I suspect this is going to be a problem, and I am wondering about the
best approach. When I enabled libsanitizer for ARM, I already had to
introduce check_effective_target_hw to avoid libsanitizer tests
involving threads because of qemu's inability to handle them properly.

I could probably change this function into
check_effective_target_qemu, but that might not be acceptable (and it
would be used for 2 different purposes: threads and too much memory
allocation).

Thoughts?

Thanks,

Christophe.


Re: FloatingPointMath and transformations

2014-06-02 Thread Andreas Schwab
Jakub Jelinek  writes:

> If C is a power of two, then 1.0 / C should IMHO never overflow,

It does if C is subnormal.

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: PowerPC IEEE 128-bit floating point: Language standards

2014-06-02 Thread Michael Meissner
I have not been following the language standards recently.  For any of the
recent language standards, or the standards that are being worked on, will
there be requirements for an IEEE 128-bit binary floating point type?  For the
C/C++ languages, is this type long double, or is another type being proposed
(such as __float128)?

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797



Re: PowerPC IEEE 128-bit floating point: Language standards

2014-06-02 Thread Jason Merrill

On 06/02/2014 11:45 AM, Michael Meissner wrote:

I have not been following the language standards recently.  For any of the
recent language standards, or the standards that are being worked on, will
there be requirements for an IEEE 128-bit binary floating point type?  For the
C/C++ languages, is this type long double, or is another type being proposed
(such as __float128)?


C++ doesn't require any particular value representation for arithmetic 
types.  But std::numeric_limits<long double>::is_iec559 will let you 
know whether it conforms or not.
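For example, a small self-contained probe (a sketch; which types report
conformance, and with how many digits, of course depends on the target):

#include <cstdio>
#include <limits>

int main ()
{
  /* is_iec559 reports IEC 60559 (IEEE 754) conformance; a binary128
     long double would additionally report 113 radix-2 digits.  */
  std::printf ("float:       is_iec559=%d digits=%d\n",
               (int) std::numeric_limits<float>::is_iec559,
               std::numeric_limits<float>::digits);
  std::printf ("double:      is_iec559=%d digits=%d\n",
               (int) std::numeric_limits<double>::is_iec559,
               std::numeric_limits<double>::digits);
  std::printf ("long double: is_iec559=%d digits=%d\n",
               (int) std::numeric_limits<long double>::is_iec559,
               std::numeric_limits<long double>::digits);
  return 0;
}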


Jason



Re: PowerPC IEEE 128-bit floating point: Language standards

2014-06-02 Thread Jonathan Wakely
On 2 June 2014 16:45, Michael Meissner wrote:
> I have not been following the language standards recently.  For any of the
> recent language standards, or the standards that are being worked on, will
> there be requirements for an IEEE 128-bit binary floating point type?

The recent http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3626.pdf
proposal wanted to define convenience typedefs iff the implementation
provides IEEE floating point types, but I don't think it got much
support. As Jason says, there is no requirement for C++ floating-point
types to be IEEE, so those typedefs would not have been required
anyway.


Re: [RFC]Better support for big-endian targets in GCC vectorizer

2014-06-02 Thread Joseph S. Myers
On Tue, 27 May 2014, Bin.Cheng wrote:

> There are some other similar cases in vectorizer and all of them look
> suspicious since intuitively, vectorizer should neither care about
> target endianess nor do such shuffle.  Anyway, this is how we do
> vectorization currently.

Agreed.  The semantics of GIMPLE and RTL operations (and 
architecture-independent built-in functions / generic vectors extensions 
in GNU C) are meant to be architecture-independent and 
endianness-independent, generally (including that in GIMPLE and RTL, 
vector lane numbers always use array ordering).  I don't see anything in 
the definition of VEC_WIDEN_MULT_* in tree.def that would make those 
definitions endian-dependent.

Fixes should be on the basis of:

* Ensure the architecture-independent, endianness-independent semantics of 
the relevant GIMPLE and RTL operations are well-defined.

* Fix any code, whether in architecture-independent or 
architecture-dependent parts of the compiler, that deviates from those 
architecture-independent, endianness-independent semantics.  It's the back 
end's responsibility to map from those semantics to the semantics of the 
actual machine instructions.

It may be the case that on some architectures endianness affects what 
vector operations are available.  (Once you define how a particular vector 
machine mode is represented in registers, it could be the case that a load 
instruction means, in terms of GCC IR, "load" for one endianness but 
"permuting load" for the other endianness, for example - see the various 
past discussions of issues with big-endian NEON.)  But this isn't a case 
for the vectorizer caring about endianness - rather, it simply needs to 
ask the back end about the available operations, and adapt to what's 
available.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: question about -ffast-math implementation

2014-06-02 Thread Ian Lance Taylor
On Mon, Jun 2, 2014 at 12:18 AM, Mike Izbicki  wrote:
>> Though for the gory details and authoritative answers I suppose you'd
> have to look into the source code.
>
> Where would I find the code for this?

This is the GCC source code.  There isn't one file that implements
-ffast-math.  -ffast-math expands to a set of other options, and then
various bits and pieces of code test whether those options are
enabled.

Ian


Re: PowerPC IEEE 128-bit floating point: Two 128-bit floating point types

2014-06-02 Thread Joseph S. Myers
[resending without the long CC list, since the mailing list doesn't like 
it]

On Fri, 30 May 2014, Michael Meissner wrote:

> I assume the way forward is to initially have a __float128 type that gives 
> IEEE
> 128-bit support for those that need/want it, and keep long double to be the
> current format.  When all the bits and pieces are in place, we can think about
> flipping the switch.  However, there we have to think about appropriate times
> for distributions to change over.
> 
> In terms of calling sequence, there are 2 ways to go: Either pass/return the
> IEEE 128-bit value in 2 registers (like long double is now) or treat it like a
> 128-bit vector.  The v2 ELF abi explicitly says that it is treated like a
> vector object, and I would prefer the v1 ELF on big endian server PowerPC's
> also treat it like a vector.  If we are building a compiler for a server
> target, to prevent confusion, I think it should be a requirement that you must
> have -mvsx (or at least -maltivec -mabi=altivec) to use __float128.  Or do I
> need to implement two sets of conversion functions, one if the user builds
> his/her programs with -mcpu=power5 and the other for more recent customers?
> 
> I don't have a handle on the need for IEEE 128-bit floating point in 
> non-server
> platforms.  I assume in these environments, if we need IEEE 128-bit, it will 
> be
> passed as two floating point values.  Do we need this support?

Well, binary128 has all the usual advantages over the existing long double 
for anything wanting something like IEEE semantics; that's not limited to 
server.  I'm not sure why passing in two FPRs would make sense; I'd have 
thought two GPRs would be better for the non-AltiVec case, given that it's 
entirely integer operations that get carried out on the representations of 
these values.

If you have different ABIs for this type depending on AltiVec / 
non-AltiVec, *or* if support for this type depends on AltiVec / 
non-AltiVec, as soon as you have any glibc support for this type (as 
opposed to just supporting __float128 through libquadmath as on x86) 
you've bifurcated the glibc ABI into two incompatible variants: either 
with different calling conventions, or if only one case supports the type 
at all, then one with more symbols than the other.  That is, where 
 lists "64-bit, 
hard-float, BE", you have two ABIs (and if they vary in whether support 
is present at all, you also need two sets of ABI test baselines in glibc).

Now you can't distinguish them by dynamic linker name (because both are 
compatible with old binaries), and symbol versioning won't reliably stop 
binaries / shared libraries built with one from being used with the other 
(because the same symbol versions will be present, just with different 
contents).  So I think you need something in the ELF header to indicate 
use of this type, and which ABI is being used for it, with corresponding 
binutils changes to prevent static linking of incompatible objects, and 
glibc changes to ensure incompatible objects are rejected for dynamic 
linking.

Since glibc's ABI mustn't depend on the compiler version used to build it, 
there are implications for the minimum GCC version that can build any 
glibc version with this support added: glibc must refuse to build with 
older GCC if the result would not provide one of the supported glibc ABIs.  
(If the ABIs are "not supported" and "supported with VSX / AltiVec ABI", 
as opposed to "supported using GPRs/FPRs" and "supported with VSX / 
AltiVec", then the "not supported" case could continue to be built with 
older GCC - but you'd still be increasing the minimum GCC version for LE 
builds.)

-- 
Joseph S. Myers
jos...@codesourcery.com


GNU Tools Cauldron 2014 - Presentation schedule

2014-06-02 Thread Diego Novillo

I have posted the presentation schedule at

https://gcc.gnu.org/wiki/cauldron2014

Presenters, please make sure that your talk is listed and
at a time slot that does not conflict with your travel or
other restrictions.

We are also finalizing details on the Reception and Dinner
venues (Fri night and Sat night). We will send details later.

If you have any special access needs (mobility, wheelchair
access, sight and hearing impairments) and/or special dietary
requirements, please contact us at tools-cauldron-ad...@googlegroups.com


Thanks. Diego.


Re: lib{atomic, itm}/configure.tgt uses -mcpu=v9 as default for sparc

2014-06-02 Thread Eric Botcazou
> I have successfully built without the switch, but I am not sure of the
> effects at runtime.

For sure libitm cannot work, there is a 'flushw' in config/sparc/sjlj.S.

> If V9 is indeed required, is there a way to build without those libs? Or
> has pre V9 support been dropped at some point?

No, V8 is still supported, but nobody has ported the libraries to it.

> IMHO an efficiency enhancement should not prevent running less
> efficiently on a supported architecture. If target triple is
> sparcv9-*-*, the next case will match and will add the "-mcpu=v9" to
> XCFLAGS, but adding it for non-v9 sparc-*-* targets is at least weird.

Well, V9 is about 20 years old now so defaulting to it is not unreasonable, 
especially for all the native OSes.  But patches are of course welcome.

-- 
Eric Botcazou


Re: PowerPC IEEE 128-bit floating point: Internal GCC types

2014-06-02 Thread Joseph S. Myers
On Fri, 30 May 2014, Michael Meissner wrote:

> One issue is the current mode setup is when you create new floating point
> types, the widening system kicks in and the compiler will generate all sorts 
> of
> widening from one 128-bit floating point format to another (because internally
> the precision for IBM extended double is less than the precision of IEEE
> 128-bit, due to the size of the mantissas).  Ideally we need a different way to
> create an alternate floating point mode than FRACTIONAL_FLOAT_MODE that does no
> automatic widening.  If there is a way under the current system, I am not 
> aware
> of it.

When you support both types (under different names) in one compiler, you 
do of course need to support conversions between them - but the compiler 
shouldn't generate such conversions automatically.

Furthermore, if the usual arithmetic conversions are applied to find a 
common type, you have the issue that neither type's values are a subset of 
the other's (__float128 has wider range, but __ibm128 can represent values 
with discontiguous mantissa bits spanning more than 113 bits).  DTS 
18661-3 (N1834) says "If both operands have floating types and neither of 
the sets of values of their corresponding real types is a subset of (or 
equivalent to) the other, the behavior is undefined.".  I'd suggest making 
this (mixed arithmetic or conditional expressions between __float128 and 
__ibm128) an error for both C and C++, so people need to use an explicit 
cast, or implicit conversion by assignment etc., if they wish to mix the 
two types in arithmetic.

(Conversion from __ibm128 to __float128 is a matter of converting the two 
halves and adding them - except for signed zero you must just convert the 
top half to avoid getting a zero of the wrong sign, and for NaNs you must 
also just convert the top half to avoid a spurious exception if the top 
half is a quiet NaN (meaning the whole long double is a quiet NaN) but the 
low half is a signaling NaN.  Conversion from __float128 to __ibm128 would 
presumably be done in the usual way of converting to double, and, if the 
result is finite, subtracting the double from the __float128 value, 
converting the remainder, and renormalizing in case the low part you get 
that way is exactly 0.5ulp of the high part and the high part has its low 
bit set.)
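A rough sketch of the easy direction described above, assuming a hypothetical
high/low-pair view of the IBM long double and GCC's __float128 extension (this
is illustrative only, not the proposed implementation):

#include <cmath>
#include <cstdio>

/* Hypothetical view of an IBM extended-double value as its two halves.  */
struct ibm128_parts { double hi, lo; };

/* Convert the two halves and add them.  For zeros and NaNs only the high
   half is used, so the sign of zero is preserved and a signaling NaN in
   the low half cannot raise a spurious exception.  */
static __float128
ibm128_to_binary128 (ibm128_parts x)
{
  if (x.hi == 0.0 || std::isnan (x.hi))
    return (__float128) x.hi;
  return (__float128) x.hi + (__float128) x.lo;   /* exact in binary128 */
}

int main ()
{
  ibm128_parts v = { 1.0, 1e-20 };   /* value not representable in one double */
  __float128 q = ibm128_to_binary128 (v);
  std::printf ("%g\n", (double) (q - 1));   /* ~1e-20: the low half survived */
  return 0;
}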

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [RFC] PR61300 K&R incoming args

2014-06-02 Thread Joseph S. Myers
On Mon, 2 Jun 2014, Florian Weimer wrote:

> On 05/31/2014 08:56 AM, Alan Modra wrote:
> 
> > > It's fine to change ABI when compiling an old-style function
> > > definition for which a prototype exists (relative to the
> > > non-prototype case).  It happens on i386, too.
> > 
> > That might be so, but when compiling the function body you must assume
> > the worst case, whatever that might be, at the call site.  For K&R
> > code, our error was to assume the call was unprototyped (which
> > paradoxically is the best case) when compiling the function body.
> 
> Is this really a supported use case?  I think I remember tracking down a bug
> which was related to a lack of float -> double promotion because the call was
> prototyped, and the old-style function definition wasn't.  This would have
> been on, ugh, SPARC.  I think this happened only in certain cases (float
> arguments, probably).

ISO C (right back to C90) requires a prototype in scope if a variadic 
function, or a function whose definition has prototyped argument types 
changed by the default argument promotions (such as float and short), is 
called.

It probably makes sense by now to enable -Wimplicit-function-declaration 
by default, though that won't catch cases where the file with the 
unprototyped call has a non-prototype declaration such as "int foo();".

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: PowerPC IEEE 128-bit floating point: Language standards

2014-06-02 Thread Joseph S. Myers
On Mon, 2 Jun 2014, Michael Meissner wrote:

> I have not been following the language standards recently.  For any of the
> recent language standards, or the standards that are being worked on, will
> there be requirements for an IEEE 128-bit binary floating point type?  For the
> C/C++ languages, is this type long double, or is another type being proprosed
> (such as __float128)?

See DTS 18661-3 (WG14 N1834) - supposed to be going for PDTS ballot soon.  
This provides a standard set of C bindings for types corresponding to 
particular IEEE formats, without requiring such types to be provided.  
(Thus, it provides a standard way of doing what libquadmath does - saying 
the type is _Float128, that that's a keyword so it can be used with 
_Complex (cf. bug 32187), that the library functions are *f128, that 
constants can be suffixed with f128, that _Float64 is a distinct type from 
double although it has the same representation and alignment, and so on.  
Obviously it's not your responsibility to implement any of this.  But you 
may need to do some of the same disentangling in glibc that would be 
needed for implementing 18661-3 - the ldbl-128 and ldbl-128ibm directories 
presently have the mutually exclusive meanings that long double is a 
particular type, but you'll want to build code from both of them into the 
same glibc if the transition is to be achieved without having two separate 
sets of incompatible glibc libraries.  Though separate libraries, with 
some dynamic linker magic to pick the right ones and avoid duplicating 
any libraries that don't involve long double in their interfaces, does 
have some attraction.)
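For reference, a minimal self-contained example of the libquadmath route
mentioned above, i.e. the *q naming rather than the *f128 naming of DTS
18661-3 (link with -lquadmath; __float128 availability depends on the target):

#include <quadmath.h>
#include <cstdio>

int main ()
{
  __float128 r = sqrtq (2);                           /* *q math functions */
  char buf[64];
  quadmath_snprintf (buf, sizeof buf, "%.33Qg", r);   /* Q length modifier */
  std::printf ("sqrt(2) = %s\n", buf);
  return 0;
}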

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: FloatingPointMath and transformations

2014-06-02 Thread Vincent Lefevre
On 2014-06-02 17:34:31 +0200, Andreas Schwab wrote:
> Jakub Jelinek  writes:
> 
> > If C is a power of two, then 1.0 / C should IMHO never overflow,
> 
> It does if C is subnormal.

More precisely, in the case of double precision, if C = DBL_MIN / 2,
1.0 / C doesn't overflow, but if C = DBL_MIN / 4 (or smaller),
1.0 / C overflows.
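A quick self-contained check of that threshold (assuming IEEE binary64, where
DBL_MIN is 2^-1022):

#include <cfloat>
#include <cmath>
#include <cstdio>

int main ()
{
  double c1 = DBL_MIN / 2;   /* 2^-1023: 1.0 / c1 == 2^1023, still finite */
  double c2 = DBL_MIN / 4;   /* 2^-1024: 1.0 / c2 would be 2^1024 > DBL_MAX */
  std::printf ("%d %d\n", (int) std::isinf (1.0 / c1),    /* 0 */
                          (int) std::isinf (1.0 / c2));   /* 1 */
  return 0;
}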

-- 
Vincent Lefèvre  - Web: 
100% accessible validated (X)HTML - Blog: 
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)


Re: question about -ffast-math implementation

2014-06-02 Thread Mike Izbicki
Right, but I've never taken a look at the gcc codebase.  Where would I
start looking for the relevant files?  Is there a general introduction
to the codebase anywhere that I should start with?

On Mon, Jun 2, 2014 at 11:20 AM, Ian Lance Taylor  wrote:
> On Mon, Jun 2, 2014 at 12:18 AM, Mike Izbicki  wrote:
>>> Though for the gory details and authoritative answers I suppose you'd
>> have to look into the source code.
>>
>> Where would I find the code for this?
>
> This is the GCC source code.  There isn't one file that implements
> -ffast-math.  -ffast-math expands to a set of other options, and then
> various bits and pieces of code tests whether those options are
> enabled.
>
> Ian


Re: question about -ffast-math implementation

2014-06-02 Thread Andi Kleen
Mike Izbicki  writes:

> Right, but I've never taken a look at the gcc codebase.  Where would I
> start looking for the relevant files?  Is there a general introduction
> to the codebase anywhere that I should start with?

grep for all the flags set in the two functions below (from gcc/opts.c),
without the x_ to catch all:

/* The following routines are useful in setting all the flags that
   -ffast-math and -fno-fast-math imply.  */
static void
set_fast_math_flags (struct gcc_options *opts, int set)
{
  if (!opts->frontend_set_flag_unsafe_math_optimizations)
    {
      opts->x_flag_unsafe_math_optimizations = set;
      set_unsafe_math_optimizations_flags (opts, set);
    }
  if (!opts->frontend_set_flag_finite_math_only)
    opts->x_flag_finite_math_only = set;
  if (!opts->frontend_set_flag_errno_math)
    opts->x_flag_errno_math = !set;
  if (set)
    {
      if (!opts->frontend_set_flag_signaling_nans)
        opts->x_flag_signaling_nans = 0;
      if (!opts->frontend_set_flag_rounding_math)
        opts->x_flag_rounding_math = 0;
      if (!opts->frontend_set_flag_cx_limited_range)
        opts->x_flag_cx_limited_range = 1;
    }
}

/* When -funsafe-math-optimizations is set the following
   flags are set as well.  */
static void
set_unsafe_math_optimizations_flags (struct gcc_options *opts, int set)
{
  if (!opts->frontend_set_flag_trapping_math)
    opts->x_flag_trapping_math = !set;
  if (!opts->frontend_set_flag_signed_zeros)
    opts->x_flag_signed_zeros = !set;
  if (!opts->frontend_set_flag_associative_math)
    opts->x_flag_associative_math = set;
  if (!opts->frontend_set_flag_reciprocal_math)
    opts->x_flag_reciprocal_math = set;
}

-- 
a...@linux.intel.com -- Speaking for myself only


Re: Zero/Sign extension elimination using value ranges

2014-06-02 Thread Kugan
On 23/05/14 07:23, Richard Henderson wrote:
> On 05/22/2014 03:12 AM, Jakub Jelinek wrote:
>> No way.  SUBREG_PROMOTED_UNSIGNED_P right now resides in two separate bits,
>> volatil and unchanging.  Right now volatile != 0, unchanging ignored
>> is -1, volatile == 0, then the value is unchanging.
>> What I meant is change this representation, e.g. to
>> x->volatil * 2 + x->unchanging - 1
>> so you can represent the values -1, 0, 1, 2 in there.
>> Of course, adjust SUBREG_PROMOTED_UNSIGNED_SET correspondingly too.
>> As SUBREG_PROMOTED_UNSIGNED_P is only valid if SUBREG_PROMOTED_VAR_P,
>> I'd hope that you don't need to care about what 0, 0 in those bits
>> means, because everything should actually SUBREG_PROMOTED_UNSIGNED_SET
>> around setting SUBREG_PROMOTED_VAR_P to non-zero.
> 
> It would be helpful to redo these, now that we don't simply have a tri-state 
> value.
> 
> const unsigned int SRP_POINTER  = 0;
> const unsigned int SRP_SIGNED   = 1;
> const unsigned int SRP_UNSIGNED = 2;
> 
> #define SUBREG_PROMOTED_SET(RTX, VAL)  \
> do {   \
>   rtx const _rtx = RTL_FLAG_CHECK1 ("SUBREG_PROMOTED_SET", \
> (RTX), SUBREG);\
>   unsigned int _val = (VAL);   \
>   _rtx->volatil = _val;\
>   _rtx->unchanging = _val >> 1;\
> } while (0)
> 
> #define SUBREG_PROMOTED_GET(RTX) \
>   ({ const rtx _rtx = RTL_FLAG_CHECK1 ("SUBREG_PROMOTED_GET",  \
>(RTX), SUBREG); \
>  _rtx->volatil + _rtx->unchanging * 2;  \
>   })
> 
> The bits are arranged such that e.g.
> 
>   SUBREG_PROMOTED_GET (x) & SRP_UNSIGNED
> 
> is meaningful.  For conciseness, we'd probably want
> 
> SUBREG_PROMOTED_POINTER_P
> SUBREG_PROMOTED_UNSIGNED_P
> SUBREG_PROMOTED_SIGNED_P
> 
> as boolean macros.  I dunno if "both" (whatever you want to call that) is used
> enough to warrant its own macro.  I can more often see this being used when
> examining a given ZERO_/SIGN_EXTEND rtx, so "both" probably won't come up.


Thanks for the information.

#define SUBREG_PROMOTED_UNSIGNED_P(RTX) \
  ({ const rtx _rtx = RTL_FLAG_CHECK1 ("SUBREG_PROMOTED_UNSIGNED_P", \
                                       (RTX), SUBREG); \
     (((_rtx->volatil + _rtx->unchanging) == 0) ? -1 : (_rtx->volatil == 1)); })

When I tried this macro, I started getting "warning: ISO C++ forbids
braced-groups within expressions [-Wpedantic]", so I changed it to

#define SUBREG_PROMOTED_UNSIGNED_P(RTX) \
  ((((RTL_FLAG_CHECK1 ("SUBREG_PROMOTED_UNSIGNED_P", (RTX), \
                       SUBREG)->volatil) \
     + (RTX)->unchanging) == 0) ? -1 : ((RTX)->volatil == 1))

I also kept the SRP_POINTER, SRP_SIGNED, etc. values as below. This is similar
to the sign values we get from trees and the same as what is used currently,
so we don't have to translate between them.

I am, however, changing the internal values (of volatil and unchanging)
to be similar to what we will get with SRP_SIGNED = 1 and SRP_UNSIGNED = 2.
The attached patch has the rest of the modifications. Regression tested on
arm and x86_64. On arm there is still one failure
(gcc.dg/fixed-point/convert-sat.c).

const unsigned int SRP_POINTER  = -1;
const unsigned int SRP_SIGNED   = 0;
const unsigned int SRP_UNSIGNED = 1;
const unsigned int SRP_SIGNED_AND_UNSIGNED = 2;


#define SUBREG_PROMOTED_SET(RTX, VAL)   \
do {\
  rtx const _rtx = RTL_FLAG_CHECK1 ("SUBREG_PROMOTED_SET",  \
(RTX), SUBREG); \
  switch ((VAL))\
  { \
case SRP_POINTER:   \
  _rtx->volatil = 0;\
  _rtx->unchanging = 0; \
  break;\
case SRP_SIGNED:\
  _rtx->volatil = 0;\
  _rtx->unchanging = 1; \
  break;\
case SRP_UNSIGNED:  \
  _rtx->volatil = 1;\
  _rtx->unchanging = 0; \
  break;\
case SRP_SIGNED_AND_UNSIGNED:   \
  _rtx->volatil = 1;\
  _rtx->unchanging = 1; \
  break;\
   }\
 } while (0)


#define SUBREG_PROMOTED_GET(RTX)

Re: PowerPC IEEE 128-bit floating point: Internal GCC types

2014-06-02 Thread Vincent Lefevre
On 2014-06-02 21:20:57 +, Joseph S. Myers wrote:
> ([...] Conversion from __float128 to __ibm128 would presumably be
> done in the usual way of converting to double, and, if the result is
> finite, subtracting the double from the __float128 value, converting
> the remainder, and renormalizing in case the low part you get that
> way is exactly 0.5ulp of the high part and the high part has its low
> bit set.)

This is not as simple, depending on how you decide to handle the
largest values. This is important if FP-model related data are
provided, such as LDBL_MANT_DIG (the precision) and LDBL_MAX_EXP
(the maximum exponent). The __ibm128 implementation as long double
is currently buggy / inconsistent, and I've reported the following
bug:

  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61399

-- 
Vincent Lefèvre  - Web: 
100% accessible validated (X)HTML - Blog: 
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)


Re: FloatingPointMath and transformations

2014-06-02 Thread Vincent Lefevre
On 2014-06-02 10:33:37 -0400, Geert Bosch wrote:
> It should, or it would be a bug. Please feel free to add/correct
> anything on this page.

I am not a member of EditorGroup.

-- 
Vincent Lefèvre  - Web: 
100% accessible validated (X)HTML - Blog: 
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)


Re: Cross-testing libsanitizer

2014-06-02 Thread Yury Gribov

Christophe,

> Indeed, when testing on my laptop, execution tests fail because
> libsanitizer wants to allocated 8GB of memory (I am using qemu as
> execution engine).

Is this 8G of RAM? If yes, I'd be curious to know which part of 
libsanitizer needs so much memory.


-Y