[PATCH] D46926: [Fixed Point Arithmetic] Conversion between Fixed Point and Floating Point Numbers

2018-05-22 Thread Steve Canon via Phabricator via cfe-commits
scanon requested changes to this revision.
scanon added a comment.
This revision now requires changes to proceed.

IIRC the optimization of divide-by-power-of-two --> multiply-by-inverse does 
not occur at -O0, so it would be better to multiply by 2^(-fbits) instead.


Repository:
  rC Clang

https://reviews.llvm.org/D46926



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D57188: Disable _Float16 for non ARM/SPIR Targets

2019-01-25 Thread Steve Canon via Phabricator via cfe-commits
scanon added a subscriber: ab.
scanon added a comment.

> do we want to support _Float16 anywhere else?

ARM is the only in-tree target with a defined ABI that I'm aware of.

> Do we need to lock down an ABI here for i386/x86_64 in advance of those gears 
> turning in the outer world?

We definitely want to push for one to be defined (and make sure that it makes 
sense), but I don't think we don't need to rush ahead of everyone, rather get 
to preliminary agreement. I think @ab was going to follow-up on that?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D57188/new/

https://reviews.llvm.org/D57188



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D48342: [libcxx] Optimize vectors construction of trivial types from an iterator range with const-ness mismatch.

2018-12-07 Thread Steve Canon via Phabricator via cfe-commits
scanon added inline comments.



Comment at: 
libcxx/test/std/containers/sequences/vector/vector.cons/construct_iter_iter.pass.cpp:187
+  assert(std::fabs(v[0] - 0.0f) < FLT_EPSILON);
+  assert(std::fabs(v[1] - 1.0f) < FLT_EPSILON);
+  assert(std::fabs(v[2] - 2.0f) < FLT_EPSILON);

These comparisons with `FLT_EPSILON` are doing nothing but adding noise. There 
is no value other than 2 that can possibly satisfy `fabs(v[2] - 2.0f) < 
FLT_EPSILON`, so this test can simply be `v[2] == 2`, for example.

If you find yourself using `FLT_EPSILON` as a tolerance, it's almost always 
wrong.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D48342/new/

https://reviews.llvm.org/D48342



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D52835: [Diagnostics] Check integer to floating point number implicit conversions

2018-10-03 Thread Steve Canon via Phabricator via cfe-commits
scanon added inline comments.



Comment at: lib/Sema/SemaChecking.cpp:110
+/// are usually useless
+static unsigned AdjustPrecision(unsigned precision) {
+  return (precision * 59 + 195) / 196;

erichkeane wrote:
> Hmm I don't terribly understand how this function works.  Also, comment 
> above needs to end in a period.  
> 
> Can you elaborate further as to how this works?  Those are 3 pretty suspect 
> looking magic numbers...
It's attempting to compute the number of good base-10 digits (59/196 ~= 
log2(10)). We should really just make APFloat print the shortest 
round-trippable digit sequence instead. Yes, this is tricky to implement, but 
we don't need to implement it. There are two recent high-quality 
implementations available, which are both significantly faster than previous 
algorithms: Ryu and Swift's 
(https://github.com/apple/swift/blob/master/stdlib/public/runtime/SwiftDtoa.cpp).
 Swift's has the virtue of already being used in LLVM-family languages and 
having a tidy single-file implementation, but either would be perfectly usable, 
I think.

Neither supports float128 yet, but we could simply drop them in for float, 
double, and float80.


https://reviews.llvm.org/D52835



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D34695: _Float16 preprocessor macro definitions

2017-09-12 Thread Steve Canon via Phabricator via cfe-commits
scanon added inline comments.



Comment at: lib/Headers/float.h:137
 
+#ifdef __STDC_WANT_IEC_60559_TYPES_EXT__
+#  define FLT16_MANT_DIG  __FLT16_MANT_DIG__

rogfer01 wrote:
> My understanding is that, given that we support TS18661-2 by default, this 
> macro should be predefined by clang and then there is no need to protect 
> these macros.
> 
> You may want to add a test for this in `test/Preprocessor/init.c`.
Where do you see that the `__STDC_WANT_IEC_60559_TYPES_EXT__` macro should be 
predefined by clang?


https://reviews.llvm.org/D34695



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D34695: _Float16 preprocessor macro definitions

2017-09-12 Thread Steve Canon via Phabricator via cfe-commits
scanon added inline comments.



Comment at: lib/Headers/float.h:137
 
+#ifdef __STDC_WANT_IEC_60559_TYPES_EXT__
+#  define FLT16_MANT_DIG  __FLT16_MANT_DIG__

rogfer01 wrote:
> scanon wrote:
> > rogfer01 wrote:
> > > My understanding is that, given that we support TS18661-2 by default, 
> > > this macro should be predefined by clang and then there is no need to 
> > > protect these macros.
> > > 
> > > You may want to add a test for this in `test/Preprocessor/init.c`.
> > Where do you see that the `__STDC_WANT_IEC_60559_TYPES_EXT__` macro should 
> > be predefined by clang?
> Hi Steve,
> 
> certainly you're right, the TS says
> 
> > The new identifiers added to C11 library headers by this part of ISO/IEC 
> > TS-18661 are defined or declared by their respective headers only if 
> > `__STDC_WANT_IEC_60559_TYPES_EXT__` is defined as a macro at the point in 
> > the source file where the appropriate header is first included.
> 
> so (if I read this right) these identifiers are only available if such macro 
> is defined when including `float.h`. 
> 
> Can I assume from your comment that someone else should define it? Perhaps 
> the `float.h` header itself, some other file in the C-library implementation 
> or the user of the compiler via some `-D__STDC_WANT_IEC_60559_TYPES_EXT__`, 
> but not be predefined by the compiler? If this is the case, then the macros 
> still have to be guarded conditionally (as they were in the original patch).
> 
> Does this make sense? Thanks.
I think we could justify defining it ourselves under non-strict compilation 
modes; alternatively, system headers might define it for users in non-strict 
modes.

My reading of the TS is that in strict mode, these types and macros should be 
hidden unless the user explicitly requests them by defining 
`__STDC_WANT_IEC_60559_TYPES_EXT__` themselves.


https://reviews.llvm.org/D34695



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D34695: _Float16 preprocessor macro definitions

2017-09-13 Thread Steve Canon via Phabricator via cfe-commits
scanon accepted this revision.
scanon added a comment.

LGTM as well.


https://reviews.llvm.org/D34695



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D64666: [Sema] Enable -Wimplicit-float-conversion for integral to floating point precision loss

2019-07-15 Thread Steve Canon via Phabricator via cfe-commits
scanon requested changes to this revision.
scanon added inline comments.
This revision now requires changes to proceed.



Comment at: clang/lib/Sema/SemaChecking.cpp:11429
+  S.Context.getFloatTypeSemantics(QualType(TargetBT, 0)));
+TargetFloatValue.convertFromAPInt(SourceInt,
+  SourceBT->isSignedInteger(), llvm::APFloat::rmNearestTiesToEven);

Why don't we just check that the result of the first conversion is opOK? I 
don't think doing a round-trip check is required here.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64666/new/

https://reviews.llvm.org/D64666



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D64666: [Sema] Enable -Wimplicit-float-conversion for integral to floating point precision loss

2019-07-16 Thread Steve Canon via Phabricator via cfe-commits
scanon accepted this revision.
scanon added a comment.
This revision is now accepted and ready to land.

LGTM. Please get at least one additional reviewer's approval before merging, as 
this is a corner of clang that I don't work on often.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64666/new/

https://reviews.llvm.org/D64666



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D65753: Builtins: Add some v2f16 variants

2019-08-06 Thread Steve Canon via Phabricator via cfe-commits
scanon added a comment.

Strongly agree with what @rjmccall said. If we can make these generic builtins 
instead of ending up with O(100) variants of each math operation, that would 
make life immensely nicer.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D65753/new/

https://reviews.llvm.org/D65753



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D65753: Builtins: Add some v2f16 variants

2019-08-06 Thread Steve Canon via Phabricator via cfe-commits
scanon added a comment.

(Ideally we would just call them e.g. __builtin_floor, but that would be 
source-breaking. __builtin_tgmath_floor seems like a good compromise.)


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D65753/new/

https://reviews.llvm.org/D65753



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D46135: [Driver, CodeGen] add options to enable/disable an FP cast optimization

2018-04-26 Thread Steve Canon via Phabricator via cfe-commits
scanon added a comment.

I like Chandler's wording. Something like:

"... this flag will attempt to cause "


https://reviews.llvm.org/D46135



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D46042: Cap vector alignment at 16 for all Darwin platforms

2018-05-04 Thread Steve Canon via Phabricator via cfe-commits
scanon added a comment.

In https://reviews.llvm.org/D46042#1088044, @ab wrote:

> So, this makes sense to me, but on x86, should we also be worried about the 
> fact that the calling convention is based on which features are available?  
> (>128bit ext_vector_types are passed in AVX/AVX-512 registers, if available). 
>  Presumably swift is also affected, no?


Swift's calling conventions (will?) always divide larger vectors into 128b 
pieces. When interacting with C conventions, yes, this is still an issue.


Repository:
  rC Clang

https://reviews.llvm.org/D46042



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D78190: Add Bfloat IR type

2020-04-23 Thread Steve Canon via Phabricator via cfe-commits
scanon requested changes to this revision.
scanon added inline comments.
This revision now requires changes to proceed.



Comment at: llvm/docs/LangRef.rst:3240
+double are represented using the 16-digit form shown above (which matches the
+IEEE754 representation for double); half and float values must, however, be
+exactly representable as IEEE 754 half and single precision,

bfloat, half, and float



Comment at: llvm/docs/LangRef.rst:3241
+IEEE754 representation for double); half and float values must, however, be
+exactly representable as IEEE 754 half and single precision,
+respectively. Hexadecimal format is always used for long double, and there are

bfloat, IEEE 754 half, and IEEE 754 single precision



Comment at: llvm/docs/LangRef.rst:2896
+   * - ``bfloat``
+ - 16-bit brain floating-point value (8-bit mantissa)
+

bfloat and fp128 should agree w.r.t. whether or not the implicit bit counts. 
Either 7 and 112 or 8 and 113. Also, we should use "significand" instead of 
"mantissa". "Mantissa" has slipped in in a bunch of places, but "significand" 
is the IEEE 754 terminology, and we should follow it.



Comment at: llvm/include/llvm-c/Core.h:149
   LLVMHalfTypeKind,/**< 16 bit floating point type */
+  LLVMBfloatTypeKind,  /**< 16 bit brain floating point type */
   LLVMFloatTypeKind,   /**< 32 bit floating point type */

Throughout this, I think `BFloat` would be more consistent with other types 
than `Bfloat` is.



Comment at: llvm/include/llvm/IR/Type.h:59
 HalfTyID,///<  1: 16-bit floating point type
-FloatTyID,   ///<  2: 32-bit floating point type
-DoubleTyID,  ///<  3: 64-bit floating point type
-X86_FP80TyID,///<  4: 80-bit floating point type (X87)
-FP128TyID,   ///<  5: 128-bit floating point type (112-bit mantissa)
-PPC_FP128TyID,   ///<  6: 128-bit floating point type (two 64-bits, 
PowerPC)
-LabelTyID,   ///<  7: Labels
-MetadataTyID,///<  8: Metadata
-X86_MMXTyID, ///<  9: MMX vectors (64 bits, X86 specific)
-TokenTyID,   ///< 10: Tokens
+BfloatTyID,  ///<  2: 16-bit floating point type
+FloatTyID,   ///<  3: 32-bit floating point type

Please add a parenthetical to disambiguate from HalfTyID, like the 
larger-than-double types have.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78190/new/

https://reviews.llvm.org/D78190



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D78190: Add Bfloat IR type

2020-04-27 Thread Steve Canon via Phabricator via cfe-commits
scanon added inline comments.



Comment at: llvm/docs/LangRef.rst:2896
+   * - ``bfloat``
+ - 16-bit brain floating-point value (8-bit mantissa)
+

rjmccall wrote:
> rjmccall wrote:
> > scanon wrote:
> > > bfloat and fp128 should agree w.r.t. whether or not the implicit bit 
> > > counts. Either 7 and 112 or 8 and 113. Also, we should use "significand" 
> > > instead of "mantissa". "Mantissa" has slipped in in a bunch of places, 
> > > but "significand" is the IEEE 754 terminology, and we should follow it.
> > I agree with Steve.  In general, there's no reason for these descriptions 
> > to be as terse as they are, especially for the non-standard formats.  
> > Someone reading IR and seeing `bfloat` for the first time is going to come 
> > here and not be any wiser unless they figure out the right web search.
> Hmm, now this reads more like a rationale than documentation.  I would 
> suggest:
> 
> > 16-bit "brain" floating-point value (7-bit significand).  Provides the same 
> > number of exponent bits as ``float``, so that it matches its dynamic range, 
> > just with greatly reduced precision.  Used in Intel's AVX-512 BF16 
> > extensions and ARM's ARMv8.6-A extensions, among others.
Yup, I agree. The important thing here is that someone can figure out what it 
is (the top half of a float); it's ok for them to have to do some reading to 
figure out *why* it is.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78190/new/

https://reviews.llvm.org/D78190



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D78827: Add support for #pragma clang fp reassociate(on|off) -- floating point control of associative math transformations

2020-05-04 Thread Steve Canon via Phabricator via cfe-commits
scanon added inline comments.



Comment at: clang/docs/LanguageExtensions.rst:3182
+enabled for the translation unit with the ``-fassociative-math`` flag.
+The pragma can take two values: ``on`` and ``off``.
+

rjmccall wrote:
> Do you want to add an example here?
Is the intention that this allows reassociation only within statements (like fp 
contract on)? Based on the ir in the tests, I think the answer is "no".

If so, should "on" be called "fast" instead, since its semantics match the 
"fast" semantics for contract, and reserve "on" for reassociation within a 
statement (that seems like it would also be useful, potentially)?

Otherwise, please add some tests with multiple statements.

I agree with John that `pragma clang fp reassociate` is a reasonable spelling.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78827/new/

https://reviews.llvm.org/D78827



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D78827: Add support for #pragma clang fp reassociate(on|off) -- floating point control of associative math transformations

2020-05-04 Thread Steve Canon via Phabricator via cfe-commits
scanon added inline comments.



Comment at: clang/docs/LanguageExtensions.rst:3182
+enabled for the translation unit with the ``-fassociative-math`` flag.
+The pragma can take two values: ``on`` and ``off``.
+

mibintc wrote:
> scanon wrote:
> > rjmccall wrote:
> > > Do you want to add an example here?
> > Is the intention that this allows reassociation only within statements 
> > (like fp contract on)? Based on the ir in the tests, I think the answer is 
> > "no".
> > 
> > If so, should "on" be called "fast" instead, since its semantics match the 
> > "fast" semantics for contract, and reserve "on" for reassociation within a 
> > statement (that seems like it would also be useful, potentially)?
> > 
> > Otherwise, please add some tests with multiple statements.
> > 
> > I agree with John that `pragma clang fp reassociate` is a reasonable 
> > spelling.
> The intention is that the pragma can be placed at either file scope or at the 
> start of a compound statement, if at file scope it affects the functions 
> following the pragma.  If at compound statement it is effective for the 
> compound statement, i can modify the test to have multiple statements.  I 
> disagree with the suggestion of "fast" instead of on/off.  at the command 
> line you can use -f[no-]associative-math; of course that command line option 
> isn't specific to floating point, but the point is it's on/off ; i can't 
> speak to whether "fast" would be a useful setting.  Thanks for your review
I don't think you understood my comment, so I'll try to explain more clearly: I 
am not asking if it applies to multiple statements or a single statement; I am 
asking within what scope reassociation is allowed.

An example: `fp contract on` allows:
```
double r = a*b + c;
```
to be contracted to an fma, but does not allow:
```
double t = a*b;
double r = t + c;
```
to be contracted. `fp contract fast` allows both to be contracted.

Your reassociate pragma, if I am understanding correctly, allows:
```
double t = a + b;
double r = t + c;
```
to be reassociated; this matches the behavior of `fp contract fast`. I am 
asking if, for the sake of understandability and orthogonality of the 
floating-point model, it would make more sense for this option to be named 
`fast`, reserving `on` for a mode that only allows reassociation within 
expressions.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78827/new/

https://reviews.llvm.org/D78827



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D78827: Add support for #pragma clang fp reassociate(fast|off) -- floating point control of associative math transformations

2020-05-05 Thread Steve Canon via Phabricator via cfe-commits
scanon added a comment.

> I don't think the C standard is likely to ever bless reassociative FP math 
> with an expression-local restriction. Steve, do you actually think that would 
> be a useful optimization mode?

I think it's pretty unlikely that C would do this, as well. It is plausibly a 
useful semantic mode, but I don't know that we need to reserve the name for it. 
I only want to raise the question, and be sure that we're aware that we're 
making a decision here (also, either way we need to document it clearly).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78827/new/

https://reviews.llvm.org/D78827



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D78827: Add support for #pragma clang fp reassociate(on|off) -- floating point control of associative math transformations

2020-05-05 Thread Steve Canon via Phabricator via cfe-commits
scanon accepted this revision.
scanon added a comment.
This revision is now accepted and ready to land.

My concerns have been addressed. Thanks for bearing with me, Melanie!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78827/new/

https://reviews.llvm.org/D78827



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D78827: Add support for #pragma clang fp reassociate(on|off) -- floating point control of associative math transformations

2020-05-05 Thread Steve Canon via Phabricator via cfe-commits
scanon added a comment.

(Please get one additional sign off before committing; I'm mainly signing off 
on the numerics model aspect).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78827/new/

https://reviews.llvm.org/D78827



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D78827: Add support for #pragma clang fp reassociate(on|off) -- floating point control of associative math transformations

2020-05-06 Thread Steve Canon via Phabricator via cfe-commits
scanon added a comment.

TS18661-5 is quite vague on what the intended semantics for the pragma are.

These pragmas are intended to be bindings of clause 10.4 of IEEE 754, which is 
also pretty wishy-washy on the whole, but it's worth noting that clause 10 is 
titled *expression evaluation* specifically. The relevant text here is:

> A language standard should also define, and require implementations to 
> provide, attributes that allow and disallow value-changing optimizations, 
> separately or collectively, for a block. These optimizations might include, 
> but are not limited to:
>  ― Applying the associative or distributive laws.
>  ― Synthesis of a fusedMultiplyAdd operation from a multiplication and an 
> addition.
>  ― Synthesis of a formatOf operation from an operation and a conversion of 
> the result of the 40 operation.
>  ― Use of wider intermediate results in expression evaluation.

So IEEE-754 appears to view this as being "just like" contraction. (Note that 
this is all under a "should", so #yolo).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78827/new/

https://reviews.llvm.org/D78827



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D72841: Add support for pragma float_control, to control precision and exception behavior at the source level

2020-05-12 Thread Steve Canon via Phabricator via cfe-commits
scanon added a comment.

> Prior to this change contract was never generated in the case of in-statement 
> contraction only, instead clang was emitting llvm.fmuladd to inform the 
> backend that only those were eligible for contraction. From a correctness 
> perspective I think this was perfectly fine.
> 
> Currently I don't see any logic to generate "blocking intrinsics" (I guess to 
> define a region around the instructions emitted for the given statement). 
> Until such mechanism is in place, I think that generating the contract 
> fast-math flag also for in-statement contraction is wrong because it breaks 
> the original program semantic.

This is exactly right. If we are going to have new in-statement optimizations, 
then we probably do need to add some blocking intrinsic (which would be 
elidable given suitable fast-math flags); the system of generating fmuladd 
works well for FMA contraction, but doesn't really generalize to other 
optimizations of this sort.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72841/new/

https://reviews.llvm.org/D72841



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D113779: [Clang] Add mfp16, mfp16fml and mdotprod flags for ARM target features.

2022-09-07 Thread Steve Canon via Phabricator via cfe-commits
scanon added a comment.

Wearing my compiler user hat, I would much rather use additive -mfeature than 
have to specify these as -march+feature, even when using a build system that 
nominally handles this stuff, because I frequently want to be able to compile 
one specific file with "whatever the prevailing options are, but also enable 
this one feature." Most build systems make this possible somehow (track down 
the arch variable, append +feature to it, etc), but it's considerably simpler 
if you can just append -mfeature to the list of flags and call it a day.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D113779/new/

https://reviews.llvm.org/D113779

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D129231: [Builtins] Do not claim all libfuncs are readnone with trapping math.

2022-08-05 Thread Steve Canon via Phabricator via cfe-commits
scanon added inline comments.



Comment at: clang/include/clang/Basic/Builtins.def:1409
 
-LIBBUILTIN(round, "dd", "fnc", "math.h", ALL_LANGUAGES)
-LIBBUILTIN(roundf, "ff", "fnc", "math.h", ALL_LANGUAGES)
-LIBBUILTIN(roundl, "LdLd", "fnc", "math.h", ALL_LANGUAGES)
+LIBBUILTIN(round, "dd", "fng", "math.h", ALL_LANGUAGES)
+LIBBUILTIN(roundf, "ff", "fng", "math.h", ALL_LANGUAGES)

`round` is just like `trunc` or `floor` or `ceil`, and should be "fnc" if they 
are.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D129231/new/

https://reviews.llvm.org/D129231

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D129231: [Builtins] Do not claim all libfuncs are readnone with trapping math.

2022-08-08 Thread Steve Canon via Phabricator via cfe-commits
scanon added a comment.

The CFP working group and C23 have since clarified this in Annex F:

> The returned value is exact and is independent of the current rounding 
> direction mode.

They never set inexact on an implementation that claims 60559 conformance. The 
only flag that these operations may set is invalid on signaling NaN input.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D129231/new/

https://reviews.llvm.org/D129231

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D66836: [libc++] Add `__truncating_cast` for safely casting float types to integers

2019-08-28 Thread Steve Canon via Phabricator via cfe-commits
scanon added a comment.

I would tend to write this function in the following form:

  // set up lower bound and upper bound
  if (r > upperBound) r = upperBound;
  if (!(r >= lowerBound)) r = lowerBound; // NaN is mapped to l.b.
  return static_cast(r);

I prefer to avoid the explicit trunc call, since that's the defined behavior of 
the `static_cast` once the value is in-range, anyway.




Comment at: include/math.h:1573
+  enum { _Bits = numeric_limits<_IntT>::digits - 
numeric_limits<_FloatT>::digits };
+  static const _IntT value = numeric_limits<_IntT>::max() >> _Bits << _Bits;
+};

zoecarver wrote:
> What's the reasoning behind shifting something forward and back? Shouldn't 
> this always negate the other operation? 
This function doesn't quite do what it says on the tin; it considers the number 
of significand bits used for the floating-point type, but not the exponent 
range. This doesn't matter for double, because double's exponent range is much, 
much larger than any integer type, but it does matter for types like float16 
(largest representable value is 65504)--when it's added as a standard 
floating-point type at some future point, this will introduce subtle bugs.

You should be able to work around this by converting `value` to  `_FloatT`, 
taking the minimum of the result and numeric_limits::max, and converting back.

This also assumes that _FloatT has radix == 2, which I do not believe is 
actually implied by `is_floating_point == true`. Please add a static assert for 
that so that future decimal types don't use this template.



Comment at: include/math.h:1582
+  const _RealT __trunc_r = __builtin_trunc(__r);
+  if (__trunc_r >= ::nextafter(static_cast<_RealT>(_MaxVal), INFINITY)) {
+return _Lim::max();

zoecarver wrote:
> Maybe change `INFINITY` to `std::numeric_limits< _RealT >::infinity()`
Why isn't this just `__trunc_r > _MaxVal`?



Comment at: include/math.h:1584
+return _Lim::max();
+  } else if (__trunc_r <= _Lim::lowest()) {
+return _Lim::min();

This has a subtle assumption that `_IntT` is two's-complement and `_FloatT` has 
`radix=2`, so that the implicit conversion that occurs in the comparison is 
exact. The radix should be a static assert; does libc++ care about 
non-two's-complement at all?

Just from a clarity perspective, I would personally make the conversion 
explicit.



Comment at: include/math.h:1586
+return _Lim::min();
+  }
+  return static_cast<_IntT>(__trunc_r);

If I'm reading right, NaNs will fall through the above two comparisons and 
invoke UB on the static_cast below. I suspect that's not the desired behavior. 
What is the intended result for NaN?


Repository:
  rCXX libc++

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66836/new/

https://reviews.llvm.org/D66836



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D66836: [libc++] Add `__truncating_cast` for safely casting float types to integers

2019-08-28 Thread Steve Canon via Phabricator via cfe-commits
scanon added inline comments.



Comment at: test/libcxx/numerics/truncating_cast.pass.cpp:36
+  {static_cast(Lim::max()) + 1, Lim::max(), false},
+  {static_cast(Lim::max()) + 1024, Lim::max(), false},
+  };

Probably should test `nextafter(static_cast(Lim::max()), INFINITY)` 
here instead.


Repository:
  rCXX libc++

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66836/new/

https://reviews.llvm.org/D66836



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D66836: [libc++] Add `__truncating_cast` for safely casting float types to integers

2019-08-28 Thread Steve Canon via Phabricator via cfe-commits
scanon added inline comments.



Comment at: include/math.h:1582
+  const _RealT __trunc_r = __builtin_trunc(__r);
+  if (__trunc_r >= ::nextafter(static_cast<_RealT>(_MaxVal), INFINITY)) {
+return _Lim::max();

EricWF wrote:
> scanon wrote:
> > zoecarver wrote:
> > > Maybe change `INFINITY` to `std::numeric_limits< _RealT >::infinity()`
> > Why isn't this just `__trunc_r > _MaxVal`?
> Consider `long long` and `double`. `MaxVal - numeric_limits::max() 
> == 1024`, and we want values between `MaxVal` and `::max()` to round down. So 
> instead we essentially check for `__r >= numeric_limits::max() + 
> 1`. This approach seems more accurate.
> Consider long long and double. MaxVal - numeric_limits::max() == 
> 1024, and we want values between MaxVal and ::max() to round down. So instead 
> we essentially check for __r >= numeric_limits::max() + 1

Yes, but there are no values between MaxVal and ::max() in the floating-point 
format; if there were, they would be MaxVal instead. So you can ditch the 
nextafter and just use `> static_cast<_FloatT>(MaxVal)`.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66836/new/

https://reviews.llvm.org/D66836



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D66836: [libc++] Add `__truncating_cast` for safely casting float types to integers

2019-08-28 Thread Steve Canon via Phabricator via cfe-commits
scanon requested changes to this revision.
scanon added inline comments.
This revision now requires changes to proceed.



Comment at: include/math.h:1586
+return _Lim::min();
+  }
+  return static_cast<_IntT>(__trunc_r);

EricWF wrote:
> scanon wrote:
> > If I'm reading right, NaNs will fall through the above two comparisons and 
> > invoke UB on the static_cast below. I suspect that's not the desired 
> > behavior. What is the intended result for NaN?
> I didn't want to treat `NaN` as a valid input, so I want to allow UBSAN to 
> catch it rather than to provide a valid output.
Please document this clearly; otherwise someone will assume that this is a 
UB-free conversion and use it for that purpose.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66836/new/

https://reviews.llvm.org/D66836



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D66836: [libc++] Add `__truncating_cast` for safely casting float types to integers

2019-08-28 Thread Steve Canon via Phabricator via cfe-commits
scanon added a comment.

> Eric showed me this link https://godbolt.org/z/AjBHYqv

Dead link.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66836/new/

https://reviews.llvm.org/D66836



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D66836: [libc++] Add `__truncating_cast` for safely casting float types to integers

2019-08-28 Thread Steve Canon via Phabricator via cfe-commits
scanon added a comment.

In D66836#1649846 , @zoecarver wrote:

> > Dead link.
>
> Here: https://godbolt.org/z/AjBHYq


Yes, conversion of `numeric_limits::max` to `double` rounds to a 
value out of range for `long long`. That's not what I'm talking about. Very 
specifically, in this line:

`if (__r >= ::nextafter(static_cast<_RealT>(_MaxVal), INFINITY))`

`_MaxVal`, by construction, is representable both as `_RealT` and as `_IntT`, 
so the static_cast does not change the value (so the rounding demonstrated in 
your godbolt link doesn't create a bug). `a >= nextafter(b, INFINITY)` is 
equivalent to `a > b` for any finite floating-point `a` and `b`. So this 
condition can simply be `if (__r > static_cast<_RealT>(_MaxVal))`.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66836/new/

https://reviews.llvm.org/D66836



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D66836: [libc++] Add `__truncating_cast` for safely casting float types to integers

2019-09-04 Thread Steve Canon via Phabricator via cfe-commits
scanon accepted this revision.
scanon added a comment.
This revision is now accepted and ready to land.

I believe that the code can still be simplified somewhat, but that it's correct 
as-is for `float`, `double`, and `long double`. I'll take an AI to follow-up on 
future improvements, and let's get this in.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66836/new/

https://reviews.llvm.org/D66836



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D85031: [builtins] Unify the softfloat division implementation

2020-08-31 Thread Steve Canon via Phabricator via cfe-commits
scanon added inline comments.



Comment at: compiler-rt/lib/builtins/fp_div_impl.inc:99
+  //   0 < x0(b) < 1
+  //   abs(x0(b) - 1/b) <= 3/4 - 1/sqrt(2)
+

sepavloff wrote:
> atrosinenko wrote:
> > sepavloff wrote:
> > > This estimation is absent from the original comment. Do you have 
> > > reference where it came from? Also the original comment states `This is 
> > > accurate to about 3.5 binary digits`. Is still true? If yes, it could be 
> > > worth copying here.
> > This approximation was deduced by writing down the derivative of `f ` "in 
> > infinite precision" and finding its root. Then the values of `f` applied to 
> > its root, 1.0 and 2.0 were calculated -- as far as I remember, **all of 
> > them** were `3/4 - 1/sqrt(2)` or negated - //this is what "minimax 
> > polynomial" probably means, that term was just copied from the original 
> > implementation :)//.
> IIUC, you don't want to put this statement here because you are us not sure 
> it is true? Sounds reasonable.
To be precise, what _minimax polynomial_ means is that p(x) = 3/4 + 1/sqrt(2) - 
x/2 is the first-order polynomial that minimizes the error term max(|1/x - 
p(x)|) on the interval [1,2]. I.e. every other linear polynomial would achieve 
a larger maximum error.

The bound of a minimax approximation to a well-behaved function is always 
achieved at the endpoints, so we can just evaluate at 1 to get the max error: 
|1/1 - 3/4 - 1/sqrt(2) + 1/2| = 3/4 - 1/sqrt(2) = 0.04289... (which is actually 
about _4.5_ bits).



Comment at: compiler-rt/lib/builtins/fp_div_impl.inc:101-102
+
+  // Then, refine the reciprocal estimate using a Newton-Raphson iteration:
+  // x_{n+1} = x_n * (2 - x_n * b)
+  // Let e_n := x_n - 1/b_hw

atrosinenko wrote:
> sepavloff wrote:
> > The original comment states:
> > ```
> >   // This doubles the number of correct binary digits in the approximation
> >   // with each iteration.
> > ```
> > It is true in this implementation? If yes, it could be worth copying here.
> For me this looks too vague. This is probably //approximately true// but I 
> don't know how exactly this should be interpreted.
N-R is quadratically convergent under a bunch of assumptions on how good the 
initial guess is and bounds on the second derivative, which are all satisfied 
here, but probably not worth going into in the comments. IIRC the usual 
reference here is Montuschi and Mezzalama's "Survey of square rooting 
algorithms" (1990).



Comment at: compiler-rt/lib/builtins/fp_div_impl.inc:109
+
+#if NUMBER_OF_HALF_ITERATIONS > 0
+  // Starting with (n-1) half-width iterations

sepavloff wrote:
> atrosinenko wrote:
> > sepavloff wrote:
> > > It is good optimization. Could you please put a comment shortly 
> > > describing the idea of using half-sized temporaries?
> > The idea is just "I guess this takes less CPU time and I have managed to 
> > prove error bounds for it". :) Specifically, for float128, the rep_t * 
> > rep_t multiplication will be emulated with lots of CPU instructions while 
> > the lower half contain some noise at that point. This particular 
> > optimization did exist in the original implementation for float64 and 
> > float128. For float32 it had not much sense, I guess. Still, estimations 
> > were calculated for the case of float32 with half-size iterations as it may 
> > be useful for MSP430 and other 16-bit targets.
> The idea is clear but it require some study of the sources. I would propose 
> to add a comment saying:
> ```
> At the first iterations number of significant digits is small, so we may use 
> shorter type values. Operations on them are usually faster.
> ```
> or something like that.
This is absolutely standard in HW construction of pipelined iterative dividers 
and square root units, so I'm not sure how much explanation is really needed =)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D85031/new/

https://reviews.llvm.org/D85031

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D90174: [HIP] Fix regressions due to fp contract change

2020-11-05 Thread Steve Canon via Phabricator via cfe-commits
scanon added a comment.

I do not much like faststd, as there's nothing "standard" about it. I do not, 
however, have a better suggestion off the top of my head. Let's pause and 
consider the name a little bit longer, please?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D90174/new/

https://reviews.llvm.org/D90174

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D90174: [HIP] Fix regressions due to fp contract change

2020-11-10 Thread Steve Canon via Phabricator via cfe-commits
scanon added a comment.

Strictly speaking, fp-contract=fast probably should have been a separate flag 
entirely (since there's no _expression_ being contracted in fast). 
Unfortunately, that ship has sailed, and it does constrain our ability to 
choose an accurate name somewhat.

What if we just spell it out? fast-respect-pragma? fast-when-unspecified? I 
don't think that we really need to try to be as brief as possible with this one.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D90174/new/

https://reviews.llvm.org/D90174

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D90174: [HIP] Fix regressions due to fp contract change

2020-11-19 Thread Steve Canon via Phabricator via cfe-commits
scanon accepted this revision.
scanon added a comment.
This revision is now accepted and ready to land.

I'm fine with this.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D90174/new/

https://reviews.llvm.org/D90174

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D89699: [ExtVector] Make .even/.odd/.lo/.hi return vectors for single elements.

2020-10-19 Thread Steve Canon via Phabricator via cfe-commits
scanon added a comment.

I'm fairly certain that this will cause some breaks internally at Apple, but 
I'm also pretty sure that it's a step in the right direction, and we should 
just sign up to fix any issues it causes.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D89699/new/

https://reviews.llvm.org/D89699

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D89699: [ExtVector] Make .even/.odd/.lo/.hi return vectors for single elements.

2020-10-19 Thread Steve Canon via Phabricator via cfe-commits
scanon added a comment.

I guess the counterargument here would be that `.x` does not produce an 
extvector(1), and there is at least a plausible argument that `.x` should be 
the same as `.lo` for a two-element vector. I'm not really convinced by this, 
but it's not totally outrageous.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D89699/new/

https://reviews.llvm.org/D89699

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D90174: [HIP] Fix regressions due to fp contract change

2020-11-03 Thread Steve Canon via Phabricator via cfe-commits
scanon added a comment.

GCC doesn't respect the pragma, so "what other compilers do" is not a 
particularly useful metric.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D90174/new/

https://reviews.llvm.org/D90174

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D90174: [HIP] Fix regressions due to fp contract change

2020-11-03 Thread Steve Canon via Phabricator via cfe-commits
scanon added a comment.

(If you tell GCC to respect the pragma via -std=c17 or similar, then 
-ffp-contract=fast overrides it just like clang's current behavior: 
https://godbolt.org/z/5dxxGb)


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D90174/new/

https://reviews.llvm.org/D90174

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D102494: [Clang, Driver] Default to Darwin_libsystem_m veclib on iOS based targets.

2021-05-14 Thread Steve Canon via Phabricator via cfe-commits
scanon added inline comments.



Comment at: clang/lib/Driver/ToolChains/Darwin.cpp:2492
+  // Darwin_libsystem_m for iOS based targets.
+  if (isTargetIOSBased() && !DriverArgs.hasArgNoClaim(options::OPT_fveclib))
+CC1Args.push_back("-fveclib=Darwin_libsystem_m");

fhahn wrote:
> arphaman wrote:
> > Is this applicable to the watchOS targets as well? The iOS based check 
> > doesn't cover it.
> `libsystem_m`'s vector functions should be available on all Darwin platforms 
> I think. I'd gradually opt-in additional platforms, once we verified it is 
> clearly beneficial for each platform individually.
> 
> Should I add a TODO?
Correct, available on all Darwin systems. These APIs were introduced in macOS 
10.13, iOS 11.0, watchOS 4.0, tvOS 11.0, and driverkit 19.0. I think we need a 
check that the target is at least those versions somewhere?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D102494/new/

https://reviews.llvm.org/D102494

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D97854: [RFC][nsan] A Floating-point numerical sanitizer.

2021-03-10 Thread Steve Canon via Phabricator via cfe-commits
scanon added a comment.

Is there a mechanism to instruct the sanitizer to ignore a specific expression 
or function? From a cursory reading, I am mildly concerned about a deluge of 
false positives from primitives that compute exact (or approximate) residuals; 
these are acting to eliminate or precisely control floating-point errors, but 
tend to show up as "unstable" in a naive analysis that isn't aware of them.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D97854/new/

https://reviews.llvm.org/D97854

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D113107: Support of expression granularity for _Float16.

2021-11-17 Thread Steve Canon via Phabricator via cfe-commits
scanon added a comment.

In D113107#3138415 , @rjmccall wrote:

> I think we keep dancing around this in this review, so let me go back and 
> start from the basics.  There are four approaches I know of for evaluating a 
> homogeneous `_Float16` expression like `a + b + c`:
>
> 1. You can perform each operation with normal `_Float16` semantics.  Ideally, 
> you would have hardware support for this.  If that isn't available, you can 
> emulate the operations in software.  It happens to be true that, for the 
> operations (`+`, `-`, `*`, `/`, `sqrt`) on `_Float16`, this emulation can 
> just involve converting to e.g. `float`, doing the operation, and immediately 
> converting back.  The core property of this approach is that there are no 
> detectable differences from hardware support.
>
> 2. As a slight twist on approach #1, you can ignore the differences between 
> native `_Float16` and emulation with `float`; instead, you just always do 
> arithmetic in `float`.  This potentially changes the result in some cases; 
> e.g. Steve Canon tells me that FMAs on `float` avoid some rounding errors 
> that FMAs on `_Float16` fall subject to.
>
> 3. Approaches #1 and #2 require a lot of intermediate conversions when 
> hardware isn't available.  In our example, `a + b + c` has to be calculated 
> as `(_Float16) ((float) (_Float16) ((float) a + (float) b) + (float) c)`, 
> where the result of one addition is converted down and then converted back 
> again.  You can avoid this by specifically recognizing this pattern and 
> eliminating the conversion from sub-operations that happen to be of type 
> `float`, so that in our example, `a + b + c` would be calculated as 
> `(_Float16) ((float) a + (float) b + (float) c)`.  This is actually allowed 
> by the C standard by default as a form of FP contraction; in fact, I believe 
> C's rules for FP contraction were originally designed for exactly this kind 
> of situation, except that it was emulating `float` with `double` on hardware 
> that only provided arithmetic on the latter.  Obviously, this can change 
> results.
>
> 4. The traditional language rule for `__fp16` is superficially similar to 
> Approach #3 in terms of generated code, but it has some subtle differences in 
> terms of the language.  `__fp16` is immediately promoted to `float` whenever 
> it appears as an arithmetic operand.  What this means is that operations are 
> performed in `float` but then not formally converted back (unless they're 
> used in a context which requires a value of the original type, which entails 
> a normal conversion, just as if you assigned a `double` into a `float` 
> variable).  Thus, for example, `a + b + c` would actually have type `float`, 
> not type `__fp16`.
>
> What this patch is doing to `_Float16` is approach #4, basically treating it 
> like `__fp16`.  That is non-conformant, and it doesn't seem to be what GCC 
> does.  You can see that quite clearly here: https://godbolt.org/z/55oaajoax
>
> What I believe GCC is doing (when not forbidden by `-fexcess-precision`) is 
> approach #3: basically, FP contraction on expressions of `_Float16` type.

Basically agree with everything John said, with a note that #3 is not quite 
FP_CONTRACT, which allows evaluating an expression as if intermediate steps 
were infinitely-precise, but rather `FLT_EVAL_METHOD == 32` as defined in 
ISO/IEC TS 18661-3: "evaluate operations and constants, whose semantic type has 
at most the range and precision of the _Float32 type, to the range and 
precision of the _Float32 type; evaluate all other operations and constants to 
the range and precision of the semantic type".


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D113107/new/

https://reviews.llvm.org/D113107

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D113107: Support of expression granularity for _Float16.

2021-11-17 Thread Steve Canon via Phabricator via cfe-commits
scanon added a comment.

In D113107#3138671 , @rjmccall wrote:

>> Basically agree with everything John said, with a note that #3 is not quite 
>> FP_CONTRACT, which allows evaluating an expression as if intermediate steps 
>> were infinitely-precise, but rather FLT_EVAL_METHOD == 32 as defined in 
>> ISO/IEC TS 18661-3: "evaluate operations and constants, whose semantic type 
>> has at most the range and precision of the _Float32 type, to the range and 
>> precision of the _Float32 type; evaluate all other operations and constants 
>> to the range and precision of the semantic type".
>
> Ah, thank you, I wasn't aware that the intent was for FP_CONTRACT to only 
> allow evaluation as if infinitely precise.  That's much more limited; 
> wouldn't it preclude things like doing `float` arithmetic with `double`-only 
> hardware, or with the x87 operations without intermediate rounding?

Yes; the C definition of "contracted" is "A floating expression may be 
contracted, that is, evaluated as though it were a single operation, thereby 
omitting rounding errors implied by the source code and the expression 
evaluation method."

All the things you mention are covered by FLT_EVAL_METHOD, except that it can't 
actually represent all the combinations that one might choose to implement 
because it's a single scalar value. E.g. when targeting no-sse2, one might want 
to evaluate _Float16 and float using SSE, but double and long double (80-bit) 
using x87. There's no way to express those semantics. Fortunately, that's the 
only combination that I can think of someone wanting to use that's not 
expressible, and it's rather niche.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D113107/new/

https://reviews.llvm.org/D113107

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D129231: [Builtins] Do not claim all libfuncs are readnone with trapping math.

2022-08-11 Thread Steve Canon via Phabricator via cfe-commits
scanon added a comment.

>> Looking at implementations of these functions, it looks like GNU libm 
>> doesn't raise inexact, but the bionic libm does. I think I'm leaning towards 
>> marking all of them as "fng" as it's the more cautious of the two.
>
> Hmm, bionic's behavior sounds a bit surprising. In the committed version, I 
> kept them fnc to keep the previous behavior for now. I'll double-check with 
> @scanon and potentially upload a follow-up patch.

As noted, this is only clarified in C23. Previous C standards did not specify 
whether or not these functions should set inexact, leading to some 
implementations choosing one behavior and others choosing the other. That said, 
the TS that clarified the behavior has been available for about a decade, so 
I'm mildly surprised that Bionic kept their existing behavior for so long.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D129231/new/

https://reviews.llvm.org/D129231

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D129231: [Builtins] Do not claim all libfuncs are readnone with trapping math.

2022-08-11 Thread Steve Canon via Phabricator via cfe-commits
scanon added a comment.

In any event, from the standpoint of the C(23) language, these operations do 
not set inexact, so I believe that it is appropriate to optimize them as if 
they do not set inexact.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D129231/new/

https://reviews.llvm.org/D129231

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D135011: Add builtin_elementwise_sin and builtin_elementwise_cos

2022-10-05 Thread Steve Canon via Phabricator via cfe-commits
scanon added inline comments.



Comment at: clang/docs/LanguageExtensions.rst:607
+ T __builtin_elementwise_cos(T x)return the ratio of the adjacent 
side length over thefloating point types
+ hypoteneuse side length, given 
the angle x in radians
  T __builtin_elementwise_floor(T x)  return the largest integral value 
less than or equal to xfloating point types

craig.topper wrote:
> hypotenuse*
As long as we're tweaking descriptions, please just call these "cosine" and 
"sine" instead of a cumbersome ratio description. "The cosine of x interpreted 
as an angle in radians" or similar.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D135011/new/

https://reviews.llvm.org/D135011

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69878: Consoldiate internal denormal flushing controls

2019-12-04 Thread Steve Canon via Phabricator via cfe-commits
scanon added inline comments.



Comment at: llvm/docs/LangRef.rst:1834
+``"denormal-fp-math-f32"``
+   Same as ``"denorm-fp-math-f32"``, except for float types. If both
+   are present, this overrides ``"denorm-fp-math"``.

Can you clarify this a little bit? I'd prefer something like "Same as 
``"denorm-fp-math"``, but only controls the behavior of the 32-bit float type.".


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69878/new/

https://reviews.llvm.org/D69878



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D74183: [IRGen] Add an alignment attribute to underaligned sret parameters

2020-02-07 Thread Steve Canon via Phabricator via cfe-commits
scanon added inline comments.



Comment at: clang/lib/CodeGen/CGCall.cpp:2077
+getDataLayout().getABITypeAlignment(getTypes().ConvertType(RetTy)))
+  SRETAttrs.addAlignmentAttr(Align);
 ArgAttrs[IRFunctionArgs.getSRetArgNo()] =

rjmccall wrote:
> Why only when under-aligned?  Just to avoid churning tests?  I think we 
> should apply this unconditionally.
On mainstream architectures today, there's rarely a benefit to knowing about 
higher alignment (e.g. MOVUPS is just as fast as MOVAPS if the address is 
actually aligned), so we won't see significant perf wins from preserving 
over-alignment in most cases, but it also doesn't cost us anything AFAICT and 
could deliver wins in some specific cases (e.g. AVX on SNB and IVB, where I 
think we split underaligned 256b stores into two 128b chunks).

So, yeah, I think we ought to simply unconditionally add the alignment to the 
sret.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74183/new/

https://reviews.llvm.org/D74183



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D27898: [compiler-rt] [builtins] Implement __floattitf() & __floatuntitf()

2016-12-22 Thread Steve Canon via Phabricator via cfe-commits
scanon added a comment.

Please s/mantissa/significand/. I know that "mantissa" is used in a number of 
places in llvm sources, but it's incorrect terminology. Significand is the 
IEEE-754 nomenclature, which any new work should follow.


https://reviews.llvm.org/D27898



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D27898: [compiler-rt] [builtins] Implement __floattitf() & __floatuntitf()

2016-12-22 Thread Steve Canon via Phabricator via cfe-commits
scanon added inline comments.



Comment at: lib/builtins/floattitf.c:65
+if (a & ((tu_int)1 << LDBL_MANT_DIG)) {
+a >>= 1;
+++e;

Strictly speaking there's no need to adjust `a` here. If we rounded up into a 
new binade, then `a` is necessarily `0b1000...0`, and the leading 1 bit will 
get killed by the mask when we assemble `fb.u.high.all` regardless of its 
position. Same comment applies to floatuntitf.


https://reviews.llvm.org/D27898



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D27898: [compiler-rt] [builtins] Implement __floattitf() & __floatuntitf()

2016-12-22 Thread Steve Canon via Phabricator via cfe-commits
scanon added inline comments.



Comment at: lib/builtins/floatuntitf.c:73
+long_double_bits fb;
+fb.u.high.all = (du_int)(e + 16383) << 48/* exponent */
+  | ((a >> 64) & 0xLL);  /* mantissa */

mgorny wrote:
> rengolin wrote:
> > rengolin wrote:
> > > mgorny wrote:
> > > > rengolin wrote:
> > > > > nit: an hexadecimal pattern here would be clearer. same above.
> > > > Do you mean something like: `(foo << 48) & 0x`?
> > > No, just the `16383` to `0x3FFF`
> > Though, now I think your proposal may be better. :)
> > 
> > Whatever works, I'm not too concerned about that.
> Well, I used the decimal form because that's how the IEEE 754 standard 
> specifies it, so I guessed the correlation would be clearer that way.
FWIW, I think both are pretty obvious, but if you want to be totally explicit:

const int bias = 0x3fff;
... (e + bias) ...


https://reviews.llvm.org/D27898



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D27898: [compiler-rt] [builtins] Implement __floattitf() & __floatuntitf()

2016-12-23 Thread Steve Canon via Phabricator via cfe-commits
scanon added inline comments.



Comment at: lib/builtins/floattitf.c:65
+if (a & ((tu_int)1 << LDBL_MANT_DIG)) {
+a >>= 1;
+++e;

mgorny wrote:
> scanon wrote:
> > Strictly speaking there's no need to adjust `a` here. If we rounded up into 
> > a new binade, then `a` is necessarily `0b1000...0`, and the leading 1 bit 
> > will get killed by the mask when we assemble `fb.u.high.all` regardless of 
> > its position. Same comment applies to floatuntitf.
> I'm sorry but I don't feel confident changing that. AFAIU if the 
> LDBL_MANT_DIG+1 bit is set, this code shifts it lower, so it won't actually 
> be killed by the mask.
In binary128, as in all IEEE 754 binary interchange format encodings, the 
leading bit of the significand is implicit. The only way to end up in this code 
path is `0b111...1` rounding up to `0b100...00`, meaning that the significand 
is 1.0, which is stored as all-zeros (i.e. the leading bit is necessarily 
masked).

To be more explicit, LDBL_MANT_DIG is 113. If this shift happens, after the 
shift bit 112 is set, and bits 111:0 are zero. The mask `((a >> 64) & 
0xLL)` discards bit 112 (= 64 + 48).


https://reviews.llvm.org/D27898



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D27898: [compiler-rt] [builtins] Implement __floattitf() & __floatuntitf()

2016-12-23 Thread Steve Canon via Phabricator via cfe-commits
scanon requested changes to this revision.
scanon added a reviewer: scanon.
scanon added a comment.
This revision now requires changes to proceed.

I don't particularly care about the shift, since the code is completely 
equivalent either way, though it would be nice to clean up. However, please 
replace "mantissa" with "significand" before committing.


https://reviews.llvm.org/D27898



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D27898: [compiler-rt] [builtins] Implement __floattitf() & __floatuntitf()

2017-01-06 Thread Steve Canon via Phabricator via cfe-commits
scanon added inline comments.



Comment at: lib/builtins/floattitf.c:65
+if (a & ((tu_int)1 << LDBL_MANT_DIG)) {
+a >>= 1;
+++e;

mgorny wrote:
> scanon wrote:
> > mgorny wrote:
> > > scanon wrote:
> > > > Strictly speaking there's no need to adjust `a` here. If we rounded up 
> > > > into a new binade, then `a` is necessarily `0b1000...0`, and the 
> > > > leading 1 bit will get killed by the mask when we assemble 
> > > > `fb.u.high.all` regardless of its position. Same comment applies to 
> > > > floatuntitf.
> > > I'm sorry but I don't feel confident changing that. AFAIU if the 
> > > LDBL_MANT_DIG+1 bit is set, this code shifts it lower, so it won't 
> > > actually be killed by the mask.
> > In binary128, as in all IEEE 754 binary interchange format encodings, the 
> > leading bit of the significand is implicit. The only way to end up in this 
> > code path is `0b111...1` rounding up to `0b100...00`, meaning that the 
> > significand is 1.0, which is stored as all-zeros (i.e. the leading bit is 
> > necessarily masked).
> > 
> > To be more explicit, LDBL_MANT_DIG is 113. If this shift happens, after the 
> > shift bit 112 is set, and bits 111:0 are zero. The mask `((a >> 64) & 
> > 0xLL)` discards bit 112 (= 64 + 48).
> Well, I've tried removing this and it causes one of the tests to fail:
> 
> `error in __floatuntitf(0x) = 0X1P+127, 
> expected 0X1P+128`
This sounds like you removed the exponent adjustment (`++e`) as well as (or 
instead of) the shift (`a >>= 1`). Without seeing the change, I can't be 
certain, of course.


https://reviews.llvm.org/D27898



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D27898: [compiler-rt] [builtins] Implement __floattitf() & __floatuntitf()

2017-01-06 Thread Steve Canon via Phabricator via cfe-commits
scanon accepted this revision.
scanon added a comment.
This revision is now accepted and ready to land.

OK, I'm satisfied with that.


https://reviews.llvm.org/D27898



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D28862: [compiler-rt] [test] Use approximate comparison on float types

2017-01-19 Thread Steve Canon via Phabricator via cfe-commits
scanon added a comment.

These tests should either be exact, or should have a tolerance that's 
mathematically sound. +/-1ulp is not sound; the allowed error should be 
proportional to the magnitude of the larger of the real and imaginary 
components of the result -- e.g. if one component is very small compared to the 
other, the smaller one may have *no* "correct" bits and still be acceptable.


Repository:
  rL LLVM

https://reviews.llvm.org/D28862



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D120395: [X86] Prohibit arithmetic operations on type `__bfloat16`

2022-02-25 Thread Steve Canon via Phabricator via cfe-commits
scanon added a comment.

There's a lot of churn around proposed "solutions" on this and related PR, but 
not a very clear analysis of what the problem we're trying to solve is.

Concretely, what are the semantics that we want for the BF16 types and 
intrinsics? Unlike the other floating-point types, there's no standard to guide 
this, so it's even more important to clearly specify how these types are to be 
used, instead of having an ad-hoc semantics of whatever someone happens to 
implement.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120395/new/

https://reviews.llvm.org/D120395

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D111529: Specify Clang vector builtins.

2021-10-11 Thread Steve Canon via Phabricator via cfe-commits
scanon added inline comments.



Comment at: clang/docs/LanguageExtensions.rst:538
+ T __builtin_elementwise_rint(T x)  return the integral value nearest to x 
(according to the  floating point types
+prevailing rounding mode) in 
floating-point format
+ T __builtin_elementwise_round(T x) return the integral value nearest to x 
rounding half-way casesfloating point types

"Prevailing rounding mode" is not super-useful, other than as a spelling for 
round-to-nearest-ties-to-even (IEEE 754 default rounding). Outside of a 
`FENV_ACCESS ON` context, there's not even really a notion of "prevailing 
rounding mode" to appeal to. I assume the intent is for this to lower to e.g. 
x86 ROUND* with the dynamic rounding-mode immediate.

I would recommend adding `__builtin_elementwise_roundeven(T x)` instead, which 
would statically bind IEEE default rounding (following TS 18661-1 naming) 
without having to appeal to prevailing rounding mode, and can still lower to 
ROUND* on x86 outside of FENV_ACCESS ON contexts, which is the norm for vector 
code  (and FRINTN unconditionally on armv8). I think we can punt on 
rint/nearbyint for now, and add them in the future if there's a need.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D111529/new/

https://reviews.llvm.org/D111529

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D111529: Specify Clang vector builtins.

2021-10-11 Thread Steve Canon via Phabricator via cfe-commits
scanon added inline comments.



Comment at: clang/docs/LanguageExtensions.rst:565
+ NaNs, fmax() return a NaN.
+ ET __builtin_reduce_add(VT a)   \+
   integer and floating point types
+ ET __builtin_reduce_and(VT a)   & 
   integer types

Should be restricted to integer types.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D111529/new/

https://reviews.llvm.org/D111529

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D111529: Specify Clang vector builtins.

2021-10-11 Thread Steve Canon via Phabricator via cfe-commits
scanon added inline comments.



Comment at: clang/docs/LanguageExtensions.rst:565
+ NaNs, fmax() return a NaN.
+ ET __builtin_reduce_add(VT a)   \+
   integer and floating point types
+ ET __builtin_reduce_and(VT a)   & 
   integer types

scanon wrote:
> Should be restricted to integer types.
(Never mind, somehow read this as `&` instead of `\+`.)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D111529/new/

https://reviews.llvm.org/D111529

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D111529: Specify Clang vector builtins.

2021-10-13 Thread Steve Canon via Phabricator via cfe-commits
scanon accepted this revision.
scanon added a comment.
This revision is now accepted and ready to land.

I'm happy with this now.




Comment at: clang/docs/LanguageExtensions.rst:552
+operation(x, y) as pairwise tree reduction to the input. The pairs are formed
+by concatenating both inputs and pairing adjacent elements.
+

fhahn wrote:
> craig.topper wrote:
> > I'm not sure I understand what is being concatenated here.
> I tried to spell it out more clearly. I'm still not sure if that spells it 
> out as clearly as possibly and I'd appreciate any suggestions on how to 
> improve the wording.
It's unclear because there's no apparent "first" or "second" vector; there's 
just a single argument, and the result isn't a vector, it's a scalar. I think 
you want to say something like: "the operation is repeatedly applied to 
adjacent pairs of elements until the result is a scalar" and then provide a 
worked example.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D111529/new/

https://reviews.llvm.org/D111529

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D111986: [Clang] Add elementwise abs builtin.

2021-10-19 Thread Steve Canon via Phabricator via cfe-commits
scanon added a comment.

What's the rationale for making abs undefined on the minimum value? AFAIK every 
actual simd implementation defines the result and they agree (and even if one 
didn't, it would be pretty easy to get the "right" result. Introducing UB here 
just seems like punishing users for no reason.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D111986/new/

https://reviews.llvm.org/D111986

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D111986: [Clang] Add elementwise abs builtin.

2021-10-19 Thread Steve Canon via Phabricator via cfe-commits
scanon accepted this revision.
scanon added a comment.
This revision is now accepted and ready to land.

Two minor questions, but also LGTM as is.




Comment at: clang/lib/Sema/SemaChecking.cpp:16667
+
+  if (!TyA->getAs() && 
!ConstantMatrixType::isValidElementType(TyA))
+return Diag(A->getBeginLoc(), diag::err_elementwise_math_invalid_arg_type)

Given that I expect this particular test to occur fairly frequently, maybe 
worth abstracting into some sort of get-elementwise-type operation.



Comment at: clang/test/Sema/builtins-elementwise-math.c:25
+  u = __builtin_elementwise_abs(u);
+  // expected-error@-1 {{argument must have a signed integer or floating point 
type, but was an unsigned integer type}}
+

For the purposes of C++ templates it might be nice to allow `abs` on unsigned 
(as the identity function). I don't have strong feelings though, and a library 
wrapping the builtins can do this themselves.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D111986/new/

https://reviews.llvm.org/D111986

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits