Re: weird optimization in sin+cos, x86 backend

2012-02-14 Thread Andrew Haley
On 02/13/2012 08:00 PM, Geert Bosch wrote:
> GNU Linux is quite good, but has issues with the "pow" function for
> large exponents, even in current versions

Really?  Even on 64-bit?  I know this is a problem for the 32-bit
legacy architecture, but I thought the 64-bit pow() was OK.

Andrew.



[ARM] EABI and the default to short enums

2012-02-14 Thread Sebastian Huber

Hello,

the default ARM EABI configuration uses short enums (from
"gcc/config/arm/arm.c"):


/* AAPCS based ABIs use short enums by default.  */

static bool
arm_default_short_enums (void)
{
  return TARGET_AAPCS_BASED && arm_abi != ARM_ABI_AAPCS_LINUX;
}

This causes a major headache for me since some libraries assume that sizeof(any 
enum) > 1, e.g. the standard XDR library.  Is the only possible way to disable 
short enums to set the ABI to ARM_ABI_AAPCS_LINUX?  Which side effects does 
this have?


Have a nice day!

--
Sebastian Huber, embedded brains GmbH

Address : Obere Lagerstr. 30, D-82178 Puchheim, Germany
Phone   : +49 89 18 90 80 79-6
Fax : +49 89 18 90 80 79-9
E-Mail  : sebastian.hu...@embedded-brains.de
PGP : Public key available on request.

This message is not a business communication within the meaning of the EHUG.


Re: weird optimization in sin+cos, x86 backend

2012-02-14 Thread Vincent Lefevre
On 2012-02-13 15:00:54 -0500, Geert Bosch wrote:
> Properties:
> 
>   [ ]  Conforms to C99 for exceptional values 
>(accepting/producing NaNs, infinities)
> 
>   [ ]  Handles non-default rounding modes,
>trapping math, errno, etc.
> 
>   [ ]  Requires IEEE compliant binary64 arithmetic
>(no implicit extended range or precision)
> 
>   [ ]  Requires IEEE compliant binary80 arithmetic
>(I know, not official format, but YKWIM)

Please do not use the term binary80, as it is confusing (and
there is a difference between this format and the formats of
the IEEE binary{k} class concerning the implicit bit). You'd
rather say: Intel (or x86/x87) extended precision. As some
platforms use binary128 instead of the Intel extended precision,
you may want to add:

  [ ]  Requires IEEE compliant binary128 arithmetic

> Accuracy level:
> 
>   0 - Correctly rounded
> 
>   1 - Faithfully rounded, preserving symmetry and monotonicity

Between 1 and 2, I would add:

  Faithfully rounded, preserving symmetry

The reason is that ensuring monotonicity can be difficult for some
functions, where the scaled derivative can be very small (IIRC, if
you don't control monotonicity in the algorithm and only rely on
the error bound, it can be as difficult as correct rounding). That's
why, in the IEEE 754R discussions, when I proposed it as a fallback
when correct rounding wasn't supported, it was rejected.

However there are functions for which statically-proved correct
rounding is very difficult but monotonicity can easily be proved:
sin, cos, tan.

On the other hand, symmetry is always trivial to support and generally
is a direct consequence of the first simplifications in an algorithm.
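
A minimal sketch of why symmetry comes almost for free: if the first step
of an implementation reduces to a non-negative argument and restores the
sign afterwards, f(-x) = -f(x) holds exactly, whatever the accuracy of the
core.  The names core_nonneg and odd_func are hypothetical, and the core is
only a short Taylor polynomial for sin near 0, not a real libm kernel.

```c
#include <assert.h>

/* Hypothetical core, only ever called with a non-negative argument.
   A degree-5 Taylor polynomial for sin, usable near 0 only.  */
static double
core_nonneg (double x)
{
  assert (!(x < 0.0));
  double x2 = x * x;
  return x * (1.0 - x2 / 6.0 * (1.0 - x2 / 20.0));
}

/* Odd wrapper: negation is exact in IEEE arithmetic, so
   odd_func (-x) == -odd_func (x) bit for bit.  */
double
odd_func (double x)
{
  return x < 0.0 ? -core_nonneg (-x) : core_nonneg (x);
}
```

Monotonicity, by contrast, depends on the core itself and cannot be
obtained with a wrapper of this kind.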

>   2 - Tightly approximated, meeting prescribed relative error
>   bounds. Conforming to OpenCL and Ada Annex G "strict mode"
>   numerical bounds.
> 
>   3 - Unspecified error bounds
> 
> Note that currently of all different operating systems we (AdaCore)
> support for the GNAT compiler, I don't know of any where we can rely
> on the system math library to meet level 2 requirements for all
> functions and over all of the range! Much of this is due to us 
> supporting OS versions as long as there is vendor support, so while 
> SPARC Solaris 9 and later are fine, Solaris 8 had some issues. 
> GNU Linux is quite good, but has issues with the "pow" function for
> large exponents, even in current versions,

pow(x,y) where x is close to 1 and y is large:

  http://sourceware.org/bugzilla/show_bug.cgi?id=706

(waiting for a patch).

> and even though Ada
> allows a relative error of (4.0 + |Right · log(Left)| / 32.0) 
> for this function, or up to 310 ulps at the end of the range.
> Similarly, for trigonometric functions, the relative error for 
> arguments larger than some implementation-defined angle threshold
> is not specified, though the angle threshold needs to be at
> least radix**(mantissa bits / 2), so 2.0**12 for binary32 or 2.0**32
> for binary80. OpenCL doesn't specify an angle threshold, but I
> doubt they intend to require sufficiently accurate reduction over
> the entire range to have a final error of 4 ulps: that doesn't fit
> with the rest of the requirements.
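
As a worked instance of the threshold formula quoted above, the sketch
below evaluates radix**(mantissa bits / 2) from the <float.h> constants.
The function name angle_threshold is hypothetical, and the use of integer
division for odd digit counts (such as the 53 bits of binary64) is an
assumption, not something stated in the thread.

```c
#include <assert.h>
#include <float.h>

/* radix ** (mant_digits / 2), built by repeated multiplication so that
   no libm call is needed.  Integer division is an assumption here.  */
double
angle_threshold (int radix, int mant_digits)
{
  assert (radix >= 2 && mant_digits > 0);
  double t = 1.0;
  for (int i = 0; i < mant_digits / 2; i++)
    t *= radix;
  return t;
}

/* Threshold for the C type float (binary32 on IEEE platforms).  */
double
float_angle_threshold (void)
{
  return angle_threshold (FLT_RADIX, FLT_MANT_DIG);
}
```

With 24 mantissa bits this gives 2**12 = 4096, and with the 64 bits of the
Intel extended format 2**32, matching the figures in the quoted text.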

P1788 (interval arithmetic standard) will also have definitions for
accuracy levels, so that sin([x,x]) for x large can still be very
fast (except in the tightest level, which basically corresponds to
correct rounding). There will not be a threshold, but the accuracy
will be defined by considering something like [prev(x),succ(x)].

In any case, the spec should be sufficiently clear so that one can
do a proved error analysis.
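
To illustrate why an accuracy definition based on something like
[prev(x),succ(x)] lets sin stay cheap for large x, the sketch below
computes the neighbours of a positive binary64 value by stepping its bit
pattern.  It assumes IEEE binary64 doubles and positive, finite inputs;
succ_pos, pred_pos, neighbour_width and covers_full_period are hypothetical
names, and the sketch does not claim to reflect the eventual P1788 wording.

```c
#include <assert.h>
#include <float.h>
#include <stdint.h>
#include <string.h>

static uint64_t
to_bits (double d)
{
  uint64_t u;
  memcpy (&u, &d, sizeof u);
  return u;
}

static double
from_bits (uint64_t u)
{
  double d;
  memcpy (&d, &u, sizeof d);
  return d;
}

/* Next representable double above x; x must be positive and below DBL_MAX.  */
double
succ_pos (double x)
{
  assert (x > 0.0 && x < DBL_MAX);
  return from_bits (to_bits (x) + 1);
}

/* Next representable double below x; x must be positive and finite.  */
double
pred_pos (double x)
{
  assert (x > 0.0 && x <= DBL_MAX);
  return from_bits (to_bits (x) - 1);
}

/* Width of the interval [pred(x), succ(x)].  */
double
neighbour_width (double x)
{
  return succ_pos (x) - pred_pos (x);
}

/* Nonzero if [pred(x), succ(x)] spans at least one full period of sin,
   in which case the image of that interval under sin is [-1, 1].  */
int
covers_full_period (double x)
{
  return neighbour_width (x) >= 6.283185307179586;
}
```

For x = 2**60 the neighbours are 384 apart, far more than 2*pi, so an
interval version of sin could return [-1,1] without any argument reduction;
for x = 1.0 the width is about 3.3e-16.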

> The Ada test suite (ACATS) already has quite extensive tests for (2),
> which are automatically parameterized for any type (including those
> with non-binary radix). That, and the fact that this level apparently
> is still a challenge for most system math libraries seems like a good
> reason for aiming for this level as a base level.
> 
> We should consider any libm function not meeting these minimal 
> criteria to be buggy (or: only usable with -ffast-math, if actually fast),
> and use a replacement. We could have -fstrict-math select the most
> accurate functions we have, with the actual level being more for 
> documentation purposes than anything else.

IEEE 754 recommends correct rounding (which should not be much slower
than a function accurate to a few ulp's, on average) in the full range.
I think this should be the default. The best compromise between speed
and accuracy depends on the application, and the compiler can't guess
anyway.
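
Since the discussion measures accuracy in ulps, here is a small sketch of
how the distance between a computed result and a reference value could be
counted in steps of representable binary64 values.  It assumes IEEE
binary64 and finite, non-negative arguments; ulp_distance is a hypothetical
helper, not part of any libm.

```c
#include <assert.h>
#include <float.h>
#include <stdint.h>
#include <string.h>

/* Number of steps between a and b along the representable doubles
   (0 if they are equal).  For finite non-negative doubles the bit
   patterns are ordered like the values themselves.  */
uint64_t
ulp_distance (double a, double b)
{
  assert (a >= 0.0 && a <= DBL_MAX && b >= 0.0 && b <= DBL_MAX);
  /* Adding 0.0 turns -0.0 into +0.0 in round-to-nearest.  */
  a += 0.0;
  b += 0.0;
  uint64_t ua, ub;
  memcpy (&ua, &a, sizeof ua);
  memcpy (&ub, &b, sizeof ub);
  return ua > ub ? ua - ub : ub - ua;
}
```

Against a correctly rounded reference, a faithfully rounded result is at
distance 0 or 1, since both are among the two representable values that
bracket the exact result.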

-- 
Vincent Lefèvre  - Web: 
100% accessible validated (X)HTML - Blog: 
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)


Re: weird optimization in sin+cos, x86 backend

2012-02-14 Thread Vincent Lefevre
On 2012-02-14 09:51:28 +, Andrew Haley wrote:
> On 02/13/2012 08:00 PM, Geert Bosch wrote:
> > GNU Linux is quite good, but has issues with the "pow" function for
> > large exponents, even in current versions
> 
> Really?  Even on 64-bit?  I know this is a problem for the 32-bit
> legacy architecture, but I thought the 64-bit pow() was OK.

According to http://sourceware.org/bugzilla/show_bug.cgi?id=706
the 32-bit pow() can be completely wrong, and the 64-bit pow()
is just very inaccurate.



Re: weird optimization in sin+cos, x86 backend

2012-02-14 Thread Tim Prince

 On 02/14/2012 04:51 AM, Andrew Haley wrote:

> On 02/13/2012 08:00 PM, Geert Bosch wrote:
> > GNU Linux is quite good, but has issues with the "pow" function for
> > large exponents, even in current versions
>
> Really?  Even on 64-bit?  I know this is a problem for the 32-bit
> legacy architecture, but I thought the 64-bit pow() was OK.
>
> Andrew.


No problems seen under elefunt with glibc 2.12 x86_64.

--
Tim Prince



Re: weird optimization in sin+cos, x86 backend

2012-02-14 Thread Vincent Lefevre
On 2012-02-14 14:26:05 +0100, Vincent Lefevre wrote:
> On 2012-02-14 09:51:28 +, Andrew Haley wrote:
> > On 02/13/2012 08:00 PM, Geert Bosch wrote:
> > > GNU Linux is quite good, but has issues with the "pow" function for
> > > large exponents, even in current versions
> > 
> > Really?  Even on 64-bit?  I know this is a problem for the 32-bit
> > legacy architecture, but I thought the 64-bit pow() was OK.
> 
> According to http://sourceware.org/bugzilla/show_bug.cgi?id=706
> the 32-bit pow() can be completely wrong, and the 64-bit pow()
> is just very inaccurate.

Sorry, that's only for a 32-bit x86 machine, with the rounding
precision set to double and to extended. The x86_64 version
doesn't seem to have any problem in rounding to nearest (IIRC,
IBM's implementation is used, thus has problems with the directed
rounding modes).



Re: weird optimization in sin+cos, x86 backend

2012-02-14 Thread Tim Prince

 On 02/14/2012 08:26 AM, Vincent Lefevre wrote:

> On 2012-02-14 09:51:28 +, Andrew Haley wrote:
> > On 02/13/2012 08:00 PM, Geert Bosch wrote:
> > > GNU Linux is quite good, but has issues with the "pow" function for
> > > large exponents, even in current versions
> >
> > Really?  Even on 64-bit?  I know this is a problem for the 32-bit
> > legacy architecture, but I thought the 64-bit pow() was OK.
>
> According to http://sourceware.org/bugzilla/show_bug.cgi?id=706
> the 32-bit pow() can be completely wrong, and the 64-bit pow()
> is just very inaccurate.


That bugzilla brings up paranoia, but with gfortran 4.7 on glibc 2.12 I get

TESTING X**((X+1)/(X-1))  VS.  EXP(2) =   7.3890561  AS  X -> 1.
 ACCURACY SEEMS ADEQUATE.
 TESTING POWERS Z**Q  AT FOUR NEARLY EXTREME VALUES:
  NO DISCREPANCIES FOUND.
.
NO FAILURES, DEFECTS NOR FLAWS HAVE BEEN DISCOVERED.
 ROUNDING APPEARS TO CONFORM TO THE PROPOSED IEEE STANDARD  P754
 THE ARITHMETIC DIAGNOSED APPEARS TO BE EXCELLENT!

Historically, glibc for i386 used the raw x87 built-ins without any of 
the recommended precautions.  Paranoia still shows, as it always did:

TESTING X**((X+1)/(X-1))  VS.  EXP(2) =   7.3890561  AS  X -> 1.
 DEFECT: Calculated (1-0.11102230E-15)**(-0.18014399E+17)
 differs from correct value by -0.34413050E-08
 This much error may spoil calculations such as compounded interest.
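
The Paranoia case above can be checked by hand.  The operands printed are
x = 1 - 2**-53 and y = -2**54 (0.11102230E-15 and 0.18014399E+17 match
those powers of two to the digits shown).  The derivation below is a sketch
of why the exact value is close to exp(2) and how an error in the logarithm
propagates; it says nothing about which instruction in the i386 code is
responsible.

```latex
\begin{align*}
x &= 1 - 2^{-53}, \qquad y = -2^{54},\\
\log x &= -2^{-53} - \tfrac{1}{2}\,2^{-106} - \dots,\\
y \log x &= 2 + 2^{-53} + \dots,\\
x^{y} &= e^{y \log x} \approx e^{2}\,(1 + 2^{-53}) \approx 7.3890561.
\end{align*}
If the logarithm is obtained with relative error $\delta$, i.e. as
$(1+\delta)\log x$, then
\begin{align*}
e^{(1+\delta)\,y \log x} = x^{y}\,e^{\delta\,y \log x}
  \approx x^{y}\,(1 + \delta\,y \log x),
\end{align*}
so the relative error of the result is about
$|y \log x|\,|\delta| \approx 2\,|\delta|$ in this case.
```

The reported absolute error of 0.34413050E-08 corresponds to a relative
error of roughly 4.7e-10, i.e. |delta| of roughly 2.3e-10, many orders of
magnitude above the 2**-53 (about 1.1e-16) of binary64.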


--
Tim Prince



Re: [ARM] EABI and the default to short enums

2012-02-14 Thread Ian Lance Taylor
Sebastian Huber  writes:

> the default ARM EABI configuration uses short enums by default (from
> "gcc/config/arm/arm.c":
>
> /* AAPCS based ABIs use short enums by default.  */
>
> static bool
> arm_default_short_enums (void)
> {
>   return TARGET_AAPCS_BASED && arm_abi != ARM_ABI_AAPCS_LINUX;
> }
>
> This causes a major headache for me since some libraries assume that
> sizeof(any enum) > 1, e.g. the standard XDR library.  Is the only
> possible way to disable short enums to set the ABI to
> ARM_ABI_AAPCS_LINUX?  Which side effects does this have?

This question would be better asked on the mailing list
gcc-h...@gcc.gnu.org rather than gcc@gcc.gnu.org.  The gcc@ mailing list
is for issues related to the development of gcc itself.  Please take any
followups to gcc-help.  Thanks.

You can use -fno-short-enums.  However, see the note about ABI
compatibility in the -fshort-enums doc.
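
A small sketch of what is at stake, for illustration only.  With
-fshort-enums GCC allocates to an enum only as many bytes as its declared
range needs, so enum color below may occupy a single byte, while
-fno-short-enums keeps it int-sized.  The type and function names are
hypothetical, and the wide_color trick (an enumerator that needs the full
range of int) is one possible source-level workaround, not something
proposed in this thread.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Small range: may shrink to one byte under -fshort-enums.  */
enum color { RED, GREEN, BLUE };

/* An enumerator equal to INT_MAX forces a type at least as wide as int,
   whichever way the short-enums default is set.  */
enum wide_color { W_RED, W_GREEN, W_BLUE, W_FORCE_INT = INT_MAX };

/* Compile-time guard for code that moves enums through int-sized slots.  */
static_assert (sizeof (enum wide_color) >= sizeof (int),
               "enum wide_color is narrower than int");

/* Size of the small enum as seen by this translation unit.  */
size_t
color_size (void)
{
  return sizeof (enum color);
}

/* Size of the widened enum as seen by this translation unit.  */
size_t
wide_color_size (void)
{
  return sizeof (enum wide_color);
}
```

Objects compiled with different settings disagree on sizeof (enum color),
which is the ABI compatibility concern mentioned above.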

Ian


Re: [ARM] EABI and the default to short enums

2012-02-14 Thread Sebastian Huber

On 02/14/2012 04:05 PM, Ian Lance Taylor wrote:

> Sebastian Huber  writes:
>
>> the default ARM EABI configuration uses short enums by default (from
>> "gcc/config/arm/arm.c":
>>
>> /* AAPCS based ABIs use short enums by default.  */
>>
>> static bool
>> arm_default_short_enums (void)
>> {
>>   return TARGET_AAPCS_BASED && arm_abi != ARM_ABI_AAPCS_LINUX;
>> }
>>
>> This causes a major headache for me since some libraries assume that
>> sizeof(any enum) > 1, e.g. the standard XDR library.  Is the only
>> possible way to disable short enums to set the ABI to
>> ARM_ABI_AAPCS_LINUX?  Which side effects does this have?
>
> This question would be better asked on the mailing list
> gcc-h...@gcc.gnu.org rather than gcc@gcc.gnu.org.  The gcc@ mailing list
> is for issues related to the development of gcc itself.  Please take any
> followups to gcc-help.  Thanks.
>
> You can use -fno-short-enums.  However, see the note about ABI
> compatibility in the -fshort-enums doc.


The problem is that I need a proper GCC ARM configuration for the RTEMS tool 
chain.  To do this I have to provide the right definitions in


gcc/config/arm/rtems-eabi.h
gcc/config/arm/t-rtems-eabi

and this is clearly not a GCC user problem.  The so called ARM ELF 
configuration didn't use short enums by default.  It seems that Linux faced 
this problem before and now we have this exception in the 
arm_default_short_enums() function above.  I want to preserve the ARM ELF 
behavior with respect to enums also in the ARM EABI configuration.  The 
question is now how to achieve this.  One option is to set the ABI to 
ARM_ABI_AAPCS_LINUX also for the RTEMS tool chain, but I am not sure that this 
is the right thing.




Re: weird optimization in sin+cos, x86 backend

2012-02-14 Thread Geert Bosch

On Feb 14, 2012, at 08:22, Vincent Lefevre wrote:
> Please do not use the term binary80, as it is confusing (and
> there is a difference between this format and the formats of
> the IEEE binary{k} class concerning the implicit bit).
Yes, I first wrote extended precision, though that really is
a general term that could denote many different formats.
I'll write Intel extended precision in the future. :)
>   
> IEEE 754 recommends correct rounding (which should not be much slower
> than a function accurate to a few ulp's, in average) in the full range.
> I think this should be the default. The best compromise between speed
> and accuracy depends on the application, and the compiler can't guess
> anyway.

While I'm sympathetic to that sentiment, the big issue is that we
don't have a correctly rounded math library supporting all formats
and suitable for all targets. We can't default to something we
don't have.

Right now we don't have a library either that conforms to C99
and meets the far more relaxed accuracy criteria of OpenCL and
Ada.

However, the glibc math library comes very close, and we can
surely fix any remaining issues there may be. So, if we can
use that as base, or as "fallback" library, we suddenly
achieve some minimal accuracy guarantees across a wide
range of platforms. If we can get this library with
GPL+exception, we can even generate optimized variants
and use a static library with LTO byte code allowing for
inlining etc.

Then we can collect/write code that improves on this libm.

  -Geert


Re: weird optimization in sin+cos, x86 backend

2012-02-14 Thread Andrew Haley
On 02/14/2012 04:41 PM, Geert Bosch wrote:
> Right now we don't have a library either that conforms to C99

Are you sure?  As far as I know we do.  We might not meet
C99 Annex F, but that's not required.

> and meets the far more relaxed accuracy criteria of OpenCL and
> Ada.

Andrew.


Re: weird optimization in sin+cos, x86 backend

2012-02-14 Thread Geert Bosch

On Feb 14, 2012, at 11:44, Andrew Haley wrote:

> On 02/14/2012 04:41 PM, Geert Bosch wrote:
>> Right now we don't have a library either that conforms to C99
> 
> Are you sure?  As far as I know we do.  We might not meet
> C99 Annex F, but that's not required.
> 
>> and meets the far more relaxed accuracy criteria of OpenCL and
>> Ada.
Note the conjunctive "and" here. I was just replying to Vincent
that it doesn't make sense to default to correctly rounded math
yet, as we don't have such a thing.

I think it is feasible to integrate a libm meeting minimal
accuracy requirements, as well as variations that additionally
give much improved performance when non-default rounding modes,
trapping and errno setting are not needed. It still seems
like glibc's libm is the best candidate to use a base.

  -Geert


Re: weird optimization in sin+cos, x86 backend

2012-02-14 Thread Andrew Haley
On 02/14/2012 04:54 PM, Geert Bosch wrote:
> 
> On Feb 14, 2012, at 11:44, Andrew Haley wrote:
> 
>> On 02/14/2012 04:41 PM, Geert Bosch wrote:
>>> Right now we don't have a library either that conforms to C99
>>
>> Are you sure?  As far as I know we do.  We might not meet
>> C99 Annex F, but that's not required.
>>
>>> and meets the far more relaxed accuracy criteria of OpenCL and
>>> Ada.
> Note the conjunctive "and" here. I was just replying to Vincent
> that it doesn't make sense to default to correctly rounded math
> yet, as we don't have such a thing.

I was confused: "either X and Y" is a very odd construct.  I don't
know what it means.  But to be absolutely clear, glibc's libm
doesn't have a problem meeting C99, AFAIK.

> I think it is feasible to integrate a libm meeting minimal
> accuracy requirements, as well as variations that additionally
> give much improved performance when non-default rounding modes,
> trapping and errno setting are not needed.

Probably.

> It still seems
> like glibc's libm is the best candidate to use a base.

That depends, because glibc's libm has such a wildly varying bunch
of implementations, particularly between 32- and 64-bit x86.

Andrew.


Re: [ARM] EABI and the default to short enums

2012-02-14 Thread Ian Lance Taylor
Sebastian Huber  writes:

> On 02/14/2012 04:05 PM, Ian Lance Taylor wrote:
>> Sebastian Huber  writes:
>>
>>> the default ARM EABI configuration uses short enums by default (from
>>> "gcc/config/arm/arm.c":
>>>
>>> /* AAPCS based ABIs use short enums by default.  */
>>>
>>> static bool
>>> arm_default_short_enums (void)
>>> {
>>>   return TARGET_AAPCS_BASED && arm_abi != ARM_ABI_AAPCS_LINUX;
>>> }
>>>
>>> This causes a major headache for me since some libraries assume that
>>> sizeof(any enum) > 1, e.g. the standard XDR library.  Is the only
>>> possible way to disable short enums to set the ABI to
>>> ARM_ABI_AAPCS_LINUX?  Which side effects does this have?
>>
>> This question would be better asked on the mailing list
>> gcc-h...@gcc.gnu.org rather than gcc@gcc.gnu.org.  The gcc@ mailing list
>> is for issues related to the development of gcc itself.  Please take any
>> followups to gcc-help.  Thanks.
>>
>> You can use -fno-short-enums.  However, see the note about ABI
>> compatibility in the -fshort-enums doc.
>
> The problem is that I need a proper GCC ARM configuration for the
> RTEMS tool chain.  To do this I have to provide the right definitions
> in
>
> gcc/config/arm/rtems-eabi.h
> gcc/config/arm/t-rtems-eabi
>
> and this is clearly not a GCC user problem.  The so called ARM ELF
> configuration didn't use short enums by default.  It seems that Linux
> faced this problem before and now we have this exception in the
> arm_default_short_enums() function above.  I want to preserve the ARM
> ELF behavior with respect to enums also in the ARM EABI configuration.
> The question is now how to achieve this.  One option is to set the ABI
> to ARM_ABI_AAPCS_LINUX also for the RTEMS tool chain, but I am not
> sure that this is the right thing.


I would recommend that RTEMS change to the ARM EABI if possible.  That
is the current standard ABI on ARM platforms.  It's true that the ARM
EABI is different from the previous ABIs in some respects.  I believe
that would mean using ARM_ABI_AAPCS.

However, if you want to retain GNU/Linux compatibility, then using
ARM_ABI_AAPCS_LINUX is likely to be correct.

But I am not an ARM expert.

Ian


RE: spill failure after IF-CASE-2 transformation

2012-02-14 Thread Henderson, Stuart
>spill_failure does return for asms since we don't want to ICE on bad
>user code. That's all that's going on here.

ahh, thanks.

>It sounds like ifcvt needs to be fixed. Your example:
>> block 44:
>> set cc = x;
>> set cc = y; (*)
>> if cc jump;
>
>looks like an invalid transformation, but I suspect rather than setting
>the CC register, the (*) insn is setting a pseudo (more accurate RTL
>would be useful). There are some cases in ifcvt.c which check
>targetm.small_register_classes_for_mode already, this is probably what
>should be done to prevent this transformation.

You suspect correctly, cc=x sets CC whereas cc=y is a pseudo which can only 
match CC.

Presumably I must check all instructions in the else_bb for modifications to 
small_register_classes_for_mode_p?  e.g. see below.

Does this seem reasonable?
Thanks,
Stu

diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index 8d81c89..b605a63 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -3924,6 +3924,7 @@ find_if_case_2 (basic_block test_bb, edge then_edge, edge else_edge)
   basic_block else_bb = else_edge->dest;
   edge else_succ;
   int then_prob, else_prob;
+  rtx insn;

   /* If we are partitioning hot/cold basic blocks, we don't want to
      mess up unconditional or indirect jumps that cross between hot
@@ -3957,6 +3958,25 @@ find_if_case_2 (basic_block test_bb, edge then_edge, edge else_edge)
   /* ELSE has one predecessor.  */
   if (!single_pred_p (else_bb))
     return FALSE;
+
+  /* Avoid small_register_classes_for_mode_p dests. */
+  FOR_BB_INSNS (else_bb, insn)
+    {
+      rtx set, dest;
+
+      if (!NONDEBUG_INSN_P (insn) || JUMP_P (insn))
+        continue;
+      set = single_set (insn);
+      if (!set)
+        return FALSE;
+
+      dest = SET_DEST (set);
+      if (!REG_P (dest))
+        continue;
+      if (targetm.small_register_classes_for_mode_p (GET_MODE (dest)))
+        return FALSE;
+    }

   /* THEN is not EXIT.  */
   if (then_bb->index < NUM_FIXED_BLOCKS)



Re: weird optimization in sin+cos, x86 backend

2012-02-14 Thread Geert Bosch

On Feb 13, 2012, at 09:59, Vincent Lefevre wrote:

> On 2012-02-09 15:49:37 +, Andrew Haley wrote:
>> I'd start with INRIA's crlibm.
> 
> A point I'd like to correct. GNU MPFR has mainly (> 95%) been
> developed by researchers and engineers paid by INRIA. But this
> is not the case of CRlibm. I don't know its copyright status
> (apparently, mainly ENS Lyon, and the rights have not been
> transferred to the FSF).
> 
> Also, from what I've heard, CRlibm is more or less regarded as
> dead, because there are new tools that do a better job, and new
> functions could be generated in a few hours. I suppose a member
> (not me since I don't work on these tools) or ex-member of our
> team will contact some of you about this.

Ideally, we would need to include both the generated functions
as well as identification of the tools used and source code or
scripts used with those tools.

  -Geert


Re: weird optimization in sin+cos, x86 backend

2012-02-14 Thread Christoph Lauter

Hello,

first of all, let me apologize for my late answer to this very exciting 
email thread.


As pointed out several times, the current libm suffers from several 
disadvantages:


* The current libm code is a mix of codes coming from different sources, 
with tables generated by different people. The knowledge on how to 
maintain these codes (in the sense of extending them) has mainly been lost.


* The current libm is the result of a difficult compromise between 
speed, size, accuracy, architecture dependent optimization. It is 
principally targeted to general-purpose use, with some optimizations. 
These optimizations, as the one for speed in sin_cos, may fit the 
expectations of a certain group of users and not the ones of others.
There seems to be some consensus that in an ideal world, all libm 
functions should exist in different flavors (latency or throughput 
optimized, vectorized or not, with different accuracy levels up to 
correctly rounded, etc.)


* Any libm should actually be regularly performance-optimized for the 
different hardware architectures. This is tedious and as far as I know, 
it has not often been done on GNU libm.


So a re-design of the current libm seems to be desirable.

crlibm has been proposed as a good starting point for such a new and 
optimized libm. As one of the people having contributed to crlibm, I'd 
say it actually isn't and we can do much better:


* crlibm has proven that the correct rounding of (univariate) libm 
functions is possible and that the performance slow-down can absolutely 
be neglected. The correct rounding of libm functions yields perfect 
portability with bitwise comparable results.
However, correct rounding is not a panacea for all libm problems. There 
is a huge number of libm use cases where people actually do want to 
sacrifice a certain amount of accuracy (and portability) for speed.

A libm rewrite should, IMHO, address their needs, too.

* Besides all issues due to intellectual property transfer, crlibm has 
never been designed as one-to-one replacement for the standard libm:
   - the library must be initialized, an operation which might change 
the state of the FPU unit on some architectures (x86)
   - all functions come in 4 flavors, according to the rounding modes: 
e.g. exp_rn(), exp_rd(), exp_ru(), exp_rz(). Hard work would be required 
to make this fit with a more IEEE 754-2008/C99 view of what rounding 
modes are (i.e. modes in the environment).
   - the quality of the code in crlibm has itself suffered from crlibm 
having been written by a bunch of people (with different coding 
conventions) and with different technologies for intermediate format 
handling.
In a word, crlibm has been a great place to do experiments and to 
learn how to do correct rounding in a performance-sensible fashion. It 
will be hard to clean it up as a starting point for a new libm, though.


The challenge of starting over with libm is hence quite a huge one. We 
might also see it as an opportunity though:


* A well-structured design of the new library might bring better code 
maintainability and first of all, better code documentation. IMHO, the 
libm sources should come with all well-documented scripts that have been 
used to generate all tables and approximation polynomials.


* Recent advances in automatic code optimization for libms [1] and 
polynomial approximation [2] seem to indicate that a libm could be 
written in a fully parameterized way. That way, it would be possible

  - to support different accuracy flavors, traded for speed,
  - to support or neglect flag and errno handling as appropriate,
  - to auto-tune the code for the different, ever-changing architectures,
  - propose latency and throughput optimized flavors,
  - to go for correct rounding where it is possible and needed.

* The new library would come with a freshly-developed test bench (also 
generated by scripts distributed along with the code) and might even be 
proven formally, at least partly.


Hence, by instantiation of the parameterization, it might actually be 
possible to fill in all cases in the hypercube accuracy * 
latency/throughput optimization * flag handling * etc. and to enable the 
user to choose the flavor suiting their needs by setting of a couple of 
flags. A vanilla libm for -O0 would be a corollary of that project, of 
course.


My colleagues Jean-Michel Muller and Florent de Dinechin at École Normale 
Supérieure de Lyon/ INRIA/ CNRS and I, at Université Pierre et Marie 
Curie, Paris Sorbonne, have been thinking of starting such a project 
for quite a while. We've got everything we need to get started; 
even a possible Ph.D. student for doing the work is in the pipeline. She 
could get started by September (if we get the financing done).


As a matter of course, we'd be more than happy to get your input (and 
even guidance w.r.t. copyright management and coding conventions) on and 
during that project.
If (parts of) GNU or gcc could officially endorse it, we'd

Re: weird optimization in sin+cos, x86 backend

2012-02-14 Thread Andrew Haley
On 02/14/2012 06:54 PM, Christoph Lauter wrote:
> My colleagues Jean-Michel Muller, Florent de Dinechin at École Normale 
> Supérieure de Lyon/ INRIA/ CNRS and myself at Université Pierre et Marie 
> Curie, Paris Sorbonne, we have been thinking of starting such a project 
> for quite a while. We've got everything what we need to get started, 
> even a possible Ph.D. student for doing the work is in the pipeline. She 
> could get started by September (if we get the financing done).

Wonderful.  That is absolutely wonderful.  :-)

Andrew.


Compilation fails on Debian Wheezy - cannot find gnu/stubs-32.h [multiarch]

2012-02-14 Thread Witold Baryluk
Hi,

I was trying to compile gcc-4.7 snapshots on Debian GNU/Linux wheezy
(testing/unstable) i386, and found a problem related to multiarch.

This is my configure and compile script:

unset LC_ALL
unset LANG

export TEMP=/scratch/baryluk/gcc/tmp
export TMP=/scratch/baryluk/gcc/tmp
export TMPDIR=/scratch/baryluk/gcc/tmp

export CPPFLAGS="-I/usr/include/i386-linux-gnu"

cd /
rm -rf /scratch/baryluk/gcc/obiekty
mkdir /scratch/baryluk/gcc/obiekty || exit 1
cd /scratch/baryluk/gcc/obiekty
/scratch/baryluk/gcc/gcc-4.7-20120211/configure \
--enable-shared \
--enable-multiarch \
--enable-nls \
--enable-linker-build-id \
--enable-checking=yes \
--enable-stage1-checking=all \
--enable-languages=c,c++,objc,go,fortran,ada,java \
--with-fpmath=sse \
--with-build-config=bootstrap-lto \
--enable-lto \
--enable-libssp \
--enable-libada \
--enable-objc-gc \
--enable-plugin \
--enable-gold \
--with-system-zlib \
--without-included-gettext \
--enable-threads=posix \
--enable-cloog-backend \
--prefix=/usr/local --program-suffix=-4.7
...
...

make bootstrap
...
...

And this is the error I got after 10 minutes of compilation:


make[4]: Leaving directory 
`/scratch/baryluk/gcc/obiekty/i686-pc-linux-gnu/libgcc'
DEFINES='' 
HEADERS='/scratch/baryluk/gcc/gcc-4.7-20120211/libgcc/config/i386/value-unwind.h'
 \
/scratch/baryluk/gcc/gcc-4.7-20120211/libgcc/mkheader.sh > 
tmp-libgcc_tm.h
/bin/bash /scratch/baryluk/gcc/gcc-4.7-20120211/libgcc/../move-if-change 
tmp-libgcc_tm.h libgcc_tm.h
echo timestamp > libgcc_tm.stamp
/scratch/baryluk/gcc/obiekty/./gcc/xgcc -B/scratch/baryluk/gcc/obiekty/./gcc/ 
-B/usr/local/i686-pc-linux-gnu/bin/ -B/usr/local/i686-pc-linux-gnu/lib/ 
-isystem /usr/local/i686-pc-linux-gnu/include -isystem 
/usr/local/i686-pc-linux-gnu/sys-include -g -O2 -O2  -g -O2 -DIN_GCC   -W 
-Wall -Wwrite-strings -Wcast-qual -Wstrict-prototypes -Wmissing-prototypes 
-Wold-style-definition  -isystem ./include   -fpic -g -DIN_LIBGCC2 
-fbuilding-libgcc -fno-stack-protector   -fpic -I. -I. -I../.././gcc 
-I/scratch/baryluk/gcc/gcc-4.7-20120211/libgcc 
-I/scratch/baryluk/gcc/gcc-4.7-20120211/libgcc/. 
-I/scratch/baryluk/gcc/gcc-4.7-20120211/libgcc/../gcc 
-I/scratch/baryluk/gcc/gcc-4.7-20120211/libgcc/../include 
-I/scratch/baryluk/gcc/gcc-4.7-20120211/libgcc/config/libbid 
-DENABLE_DECIMAL_BID_FORMAT -DHAVE_CC_TLS  -DUSE_TLS -o _muldi3.o -MT _muldi3.o 
-MD -MP -MF _muldi3.dep -DL_muldi3 -c 
/scratch/baryluk/gcc/gcc-4.7-20120211/libgcc/libgcc2.c -fvisibility=hidden 
-DHIDE_EXPORTS
In file included from /usr/include/features.h:388:0,
 from /usr/include/stdio.h:28,
 from 
/scratch/baryluk/gcc/gcc-4.7-20120211/libgcc/../gcc/tsystem.h:88,
 from /scratch/baryluk/gcc/gcc-4.7-20120211/libgcc/libgcc2.c:29:
/usr/include/gnu/stubs.h:7:27: fatal error: gnu/stubs-32.h: No such file or 
directory
compilation terminated.
make[3]: *** [_muldi3.o] Error 1
make[3]: Leaving directory 
`/scratch/baryluk/gcc/obiekty/i686-pc-linux-gnu/libgcc'
make[2]: *** [all-stage1-target-libgcc] Error 2
make[2]: Leaving directory `/scratch/baryluk/gcc/obiekty'
make[1]: *** [stage1-bubble] Error 2
make[1]: Leaving directory `/scratch/baryluk/gcc/obiekty'
make: *** [bootstrap] Error 2
Exit 2


The file we are looking for exists as
/usr/include/i386-linux-gnu/gnu/stubs-32.h, but the bootstrap does not
use it.

I tried setting CPPFLAGS, but it doesn't help.

I can create a few symlinks, but this is far from a nice solution.

I really like Debian's approach to multiarch, because it addresses not
only the 32/64-bit split, but many other problems. I understand that this
leads to some problems, like the one I encountered here, but it would
be good to have some way to specify the transition to Debian's multiarch.

A precise overview of the directories to be used by GCC and other tools can
be found at

http://wiki.debian.org/Multiarch/LibraryPathOverview


Generally there is no multiarch separation for header files, because
most header files are portable and work without modification on
bi-arch (32/64), and often even on all platforms (where necessary, header
files have their own conditional compilation mechanisms for portability).
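
As a rough illustration of the conditional-compilation point above, here
is a minimal C sketch of a single header-style fragment that adapts itself
to a 32-bit or 64-bit target at preprocessing time, much as
/usr/include/gnu/stubs.h in the log above pulls in gnu/stubs-32.h. The
names EXAMPLE_WORD_BITS and example_word_bits are hypothetical and exist
only for this sketch; it is not a copy of any glibc header.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical macro: pick the word size while preprocessing, so the
   same file can be shared by 32-bit and 64-bit builds. */
#if UINTPTR_MAX == 0xffffffffu
# define EXAMPLE_WORD_BITS 32
#elif UINTPTR_MAX == 0xffffffffffffffffu
# define EXAMPLE_WORD_BITS 64
#else
# error "this sketch only covers 32-bit and 64-bit pointers"
#endif

/* Compile-time check that the selection matches the real pointer width. */
static_assert(EXAMPLE_WORD_BITS == sizeof(void *) * CHAR_BIT,
              "word size selection does not match the pointer width");

/* Hypothetical accessor so callers can see which branch was taken. */
int example_word_bits(void)
{
    return EXAMPLE_WORD_BITS;
}
```

A header written this way needs no per-architecture copy. Headers that
cannot be written this way are the ones that end up in an
architecture-specific directory.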

However, a few header files need special treatment. This is the full list
on my computer:

$ /usr/include/i386-linux-gnu> ls -R

.:
asm  bits  ffi.h  ffitarget.h  fpu_control.h  gnu  jconfig.h  sys

./asm:
a.out.h        ioctl.h     mtrr.h            setup.h         termios.h
auxvec.h       ioctls.h    param.h           shmbuf.h        types.h
bitsperlong.h  ipcbuf.h    poll.h            sigcontext32.h  ucontext.h
boot.h         ist.h       posix_types_32.h  sigcontext.h    unistd_32.h
bootparam.h    kvm.h       posix_types_64.h  siginfo.h       unistd_64.h
byteorder.h    kvm_para.h  posix_types.h     signal.h        unistd.h
de

Re: Compilation fails on Debian Wheezy - cannot find gnu/stubs-32.h [multiarch]

2012-02-14 Thread Jonathan Wakely
On 14 February 2012 22:26, Witold Baryluk wrote:
>
> I was trying to compile gcc-4.7 snapshots on Debian GNU/Linux wheezy
> (testing/unstable) i386, and found a problem related to multiarch.

This is a known issue with Debian that has been discussed several
times on the gcc and gcc-help lists, most recently here:

http://gcc.gnu.org/ml/gcc/2012-02/msg00181.html


gcc-4.4-20120214 is now available

2012-02-14 Thread gccadmin
Snapshot gcc-4.4-20120214 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.4-20120214/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.4 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_4-branch 
revision 184237

You'll find:

 gcc-4.4-20120214.tar.bz2 Complete GCC

  MD5=8aa06316f6f01e75fcf77e4a5baf44d7
  SHA1=1ac99aa500643bec0e688c198fc7927d3aca4e17

Diffs from 4.4-20120207 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.4
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Re: Compilation fails on Debian Wheezy - cannot find gnu/stubs-32.h [multiarch]

2012-02-14 Thread Witold Baryluk
On Tue, Feb 14, 2012 at 10:35:16PM +, Jonathan Wakely wrote:
> On 14 February 2012 22:26, Witold Baryluk wrote:
> >
> > I was trying to compile gcc-4.7 snapshots on Debian GNU/Linux wheezy
> > (testing/unstable) i386, and found a problem related to multiarch.
> 
> This is a known issue with Debian that has been discussed several
> times on the gcc and gcc-help lists, most recently here:
> 
> http://gcc.gnu.org/ml/gcc/2012-02/msg00181.html

Actually, I read the entire thread, and my problem is quite different from
a linking problem. Can you actually look at more than just the first
paragraph?


Thanks,
Witek


-- 
Witold Baryluk


Re: weird optimization in sin+cos, x86 backend

2012-02-14 Thread Joseph S. Myers
On Tue, 14 Feb 2012, Geert Bosch wrote:

> However, the glibc math library comes very close, and we can
> surely fix any remaining issues there may be. So, if we can
> use that as base, or as "fallback" library, we suddenly
> achieve some minimal accuracy guarantees across a wide
> range of platforms. If we can get this library with
> GPL+exception, we can even generate optimized variants
> and use a static library with LTO byte code allowing for
> inlining etc.

LGPL+exception - the soft-fp license - is what I've suggested before as 
what would definitely be safe for any LGPL glibc code moving to be more 
permissively licensed for use as a library distributed with GCC.  (I think 
the GPL+exception used for libgcc etc. would also be safe - but it's more 
obviously safe for all uses to add an exception to the license currently 
in use.)

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: Compilation fails on Debian Wheezy - cannot find gnu/stubs-32.h [multiarch]

2012-02-14 Thread Ian Lance Taylor
Witold Baryluk  writes:

> On Tue, Feb 14, 2012 at 10:35:16PM +, Jonathan Wakely wrote:
>> On 14 February 2012 22:26, Witold Baryluk wrote:
>> >
>> > I was trying to compile gcc-4.7 snapshots on Debian GNU/Linux wheezy
>> > (testing/unstable) i386, and found a problem related to multiarch.
>> 
>> This is a known issue with Debian that has been discussed several
>> times on the gcc and gcc-help lists, most recently here:
>> 
>> http://gcc.gnu.org/ml/gcc/2012-02/msg00181.html
>
> Actually, I read the entire thread, and my problem is quite different from
> a linking problem. Can you actually look at more than just the first
> paragraph?

I read your message, and it sounds like exactly the same problem: Debian
moved files around, and gcc doesn't know about it.

Ian


Re: Compilation fails on Debian Wheezy - cannot find gnu/stubs-32.h [multiarch]

2012-02-14 Thread Witold Baryluk
On 02-14 17:34, Ian Lance Taylor wrote:
> Witold Baryluk  writes:
> 
> > On Tue, Feb 14, 2012 at 10:35:16PM +, Jonathan Wakely wrote:
> >> On 14 February 2012 22:26, Witold Baryluk wrote:
> >> >
> >> > I was trying to compile gcc-4.7 snapshots on Debian GNU/Linux wheezy
> >> > (testing/unstable) i386, and found a problem related to multiarch.
> >> 
> >> This is a known issue with Debian that has been discussed several
> >> times on the gcc and gcc-help lists, most recently here:
> >> 
> >> http://gcc.gnu.org/ml/gcc/2012-02/msg00181.html
> >
> > Actually, I read the entire thread, and my problem is quite different from
> > a linking problem. Can you actually look at more than just the first
> > paragraph?
> 
> I read your message, and it sounds like exactly the same problem: Debian
> moved files around, and gcc doesn't know about it.
> 
> Ian

Thanks,

I just wanted to note that the problem is caused not only by the changed
location of shared libraries, but also by that of header files.

I have a few other ideas for workarounds, but I need to test them.

Regards,
Witek.

-- 
Witold Baryluk


Re: [ARM] EABI and the default to short enums

2012-02-14 Thread Ralf Corsepius

On 02/14/2012 06:51 PM, Ian Lance Taylor wrote:
> Sebastian Huber writes:
>
>> On 02/14/2012 04:05 PM, Ian Lance Taylor wrote:
>>> Sebastian Huber writes:
>>>
>>>> the default ARM EABI configuration uses short enums by default (from
>>>> "gcc/config/arm/arm.c"):
>>>>
>>>> /* AAPCS based ABIs use short enums by default.  */
>>>>
>>>> static bool
>>>> arm_default_short_enums (void)
>>>> {
>>>>   return TARGET_AAPCS_BASED && arm_abi != ARM_ABI_AAPCS_LINUX;
>>>> }
>>>>
>>>> This causes a major headache for me since some libraries assume that
>>>> sizeof(any enum) > 1, e.g. the standard XDR library.  Is the only
>>>> possible way to disable short enums to set the ABI to
>>>> ARM_ABI_AAPCS_LINUX?  Which side effects does this have?
>>>
>>> This question would be better asked on the mailing list
>>> gcc-h...@gcc.gnu.org rather than gcc@gcc.gnu.org.  The gcc@ mailing list
>>> is for issues related to the development of gcc itself.  Please take any
>>> followups to gcc-help.  Thanks.
>>>
>>> You can use -fno-short-enums.  However, see the note about ABI
>>> compatibility in the -fshort-enums doc.
>>
>> The problem is that I need a proper GCC ARM configuration for the
>> RTEMS tool chain.  To do this I have to provide the right definitions
>> in
>>
>> gcc/config/arm/rtems-eabi.h
>> gcc/config/arm/t-rtems-eabi
>>
>> and this is clearly not a GCC user problem.  The so-called ARM ELF
>> configuration didn't use short enums by default.  It seems that Linux
>> faced this problem before and now we have this exception in the
>> arm_default_short_enums() function above.  I want to preserve the ARM
>> ELF behavior with respect to enums also in the ARM EABI configuration.
>> The question now is how to achieve this.  One option is to set the ABI
>> to ARM_ABI_AAPCS_LINUX also for the RTEMS tool chain, but I am not
>> sure that this is the right thing.
>
> I would recommend that RTEMS change to the ARM EABI if possible.  That
> is the current standard ABI on ARM platforms.

That's what Sebastian is trying to do.

> It's true that the ARM
> EABI is different from the previous ABIs in some respects.  I believe
> that would mean using ARM_ABI_AAPCS.
>
> However, if you want to retain GNU/Linux compatibility, then using
> ARM_ABI_AAPCS_LINUX is likely to be correct.

So you would recommend that RTEMS throw away ARM_ABI_AAPCS and use
ARM_ABI_AAPCS_LINUX, which as far as I can see is a
Linux-specific/proprietary deviation from the EABI?

To me, this seems like "hacking" - I am actually leaning towards considering
the issues Sebastian mentions to be portability bugs in the non-GCC
components where he encounters them.



> But I am not an ARM expert.

Neither am I.

Ralf
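
The portability concern discussed in this thread can be sketched in C.
This is only an illustration under stated assumptions: the enum and
function names are hypothetical, and the code does not reproduce anything
from the XDR library. It shows two ways for library code to avoid depending
on the size of an enum: adding a sentinel enumerator that forces the type
to be at least as wide as int, and converting by value through an int
instead of writing through a casted pointer.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical enum. With -fshort-enums GCC may store it in one byte;
   without the flag it is typically the size of an int. */
enum color { COLOR_RED, COLOR_GREEN, COLOR_BLUE };

/* Same values plus a sentinel that needs the full range of int, so the
   type cannot shrink below int even when short enums are in effect. */
enum wide_color {
    WIDE_RED,
    WIDE_GREEN,
    WIDE_BLUE,
    WIDE_COLOR_FORCE_INT = INT_MAX
};

static_assert(sizeof(enum wide_color) >= sizeof(int),
              "sentinel should force the enum to at least int width");

/* Size-independent conversions: copy by value through an int instead of
   reading or writing the enum object through an int pointer. */
int color_to_wire(enum color c)
{
    return (int)c;
}

enum color color_from_wire(int v)
{
    return (enum color)v;
}

/* Helpers that report the sizes chosen by the current ABI and flags. */
size_t color_size(void)
{
    return sizeof(enum color);
}

size_t wide_color_size(void)
{
    return sizeof(enum wide_color);
}
```

Compiling this once with -fshort-enums and once with -fno-short-enums and
comparing the result of color_size() shows the difference under
discussion, while wide_color_size() should not drop below sizeof(int) in
either case.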