[PATCH] D49403: More aggressively complete RecordTypes with Function Pointers

2018-08-01 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added a comment.

I've talked to Olga about this, and I think we have a way to handle the problem 
she was working on without this change.

However,  I think this change is worth considering anyway. The test case shows 
an example where clang is clearly not producing the output it intends to 
produce. In most cases that probably doesn't matter, and I can't come up with 
any case where it will result in incorrect code generation. One place that I 
think it has the potential to introduce trouble is with LTO.

Modifying the example from the test case slightly, suppose you have that code 
broken up into two source files like this.

f1.c:

  struct has_fp;
  typedef void (*fp)(const struct has_fp *f);
  struct has_fp {
fp i;
  };
  void func(const struct has_fp *f) {}

f2.c:

  struct has_fp;
  typedef void (*fp)(const struct has_fp *f);
  struct has_fp {
fp i;
  };
  void func(const struct has_fp *f);
  void func2(const struct has_fp *f, int n) {
if (n == 0)
  func(f);
  }

Now if I compile both of these files with "-c -flto -O2" and then use 
"llvm-link -S -o - f1.o f2.o" I'll see the following:

  %struct.has_fp = type { {}* }
  %struct.has_fp.0 = type { void (%struct.has_fp.0*)* }
  
  define void @func(%struct.has_fp* %f) {
  entry:
ret void
  }
  
  define void @func2(%struct.has_fp.0* %f, i32 %i) {
  entry:
%cmp = icmp eq i32 %i, 0
br i1 %cmp, label %if.end, label %if.then
  
  if.then:
tail call void bitcast (void (%struct.has_fp*)* @func to void 
(%struct.has_fp.0*)*)(%struct.has_fp.0* %f)
br label %if.end
  
  if.end:
ret void
  }

Granted, this will ultimately produce correct code, and I don't have an example 
handy that shows how the extra type and the function bitcast might inhibit 
optimizations, but I think we're at least a step closer to being able to 
imagine it.


https://reviews.llvm.org/D49403



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D49403: More aggressively complete RecordTypes with Function Pointers

2018-08-01 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added a comment.

The LLVM linker also seems to have a bunch of problems with resolving pointer 
types in structures.  I'm looking into a couple of those now.

I am aware of the opaque pointer effort, though as you say it seems to be 
stalled. I agree that there aren't a lot of things you can do based on pointer 
type information, but there are some things that you might like to do which can 
be proven to be unsafe based on pointer type information that might be 
incorrectly flagged as unsafe if the pointer type information is incorrect. An 
example would be the sorts of data layout optimizations described here: 
https://llvm.org/devmtg/2014-10/Slides/Prashanth-DLO.pdf  Note, in particular, 
on slide 11 ("Legality") the reference to "cast to/from a given struct type". I 
would imagine that includes pointer casts. It should be possible to do the same 
sort of analysis with opaque pointers by following their uses very carefully, 
and maybe not having these infernal pointer type casts all over the place would 
make that less prone to false negatives. In the meantime, the pointer casts are 
there and have to be dealt with.

Getting back to the current review, Erich explained to me that this patch 
involves a certain amount of risk in that it changes the way clang processes 
things, but it has the merit of getting the correct answer.


https://reviews.llvm.org/D49403



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D49403: More aggressively complete RecordTypes with Function Pointers

2018-08-02 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added a comment.

I hope I'm not coming across as too argumentative here. I don't really have 
strong feelings about this function pointer type patch and ultimately I see 
that you are right, but the objections you are raising here would equally apply 
to what I'm working on with the IR linker and if I find a way to fix that, I'll 
care a bit more about that case. (If you'd like a preview, here's the bug I'm 
trying to fix: https://bugs.llvm.org/show_bug.cgi?id=38408)

You say "there is no semantic meaning for pointer types in LLVM IR" and that's 
fine, but there is a function PointerType::getElementType() and if I modify 
that function to always return the i8 type it will break a lot of things. So 
while there may be some sense in which the the pointer type cannot be the 
"correct" type, there is most definitely a sense in which it can be "incorrect" 
even if that sense isn't the strict semantic sense. I haven't looked at all the 
uses of  PointerType::getElementType() [or the related 
Type::getPointerElementType()]. I know a lot of them are just tests and 
assertions. Others are trivially walking through something that they know to be 
true by other means. A few seem (at least on first glance) to actually be doing 
something with it. For instance, llvm::getPointerStride() contains this line of 
code: "int64_t Size = DL.getTypeAllocSize(PtrTy->getElementType());"

I'm not arguing against opaque pointer types. I just feel like we're probably 
at least a couple of years away from having that. I'm also not arguing for 
robust and exhaustive correction of all cases where pointer types are not 
currently "working" in the sense that I am describing in my prior comments. I'm 
just saying that if someone runs into a specific case that is causing them 
problems and they submit a fix for that case, maybe we should let it through 
unless we have more of a reason than strict semantics rendering it unimportant.


https://reviews.llvm.org/D49403



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D49403: More aggressively complete RecordTypes with Function Pointers

2018-08-02 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added a comment.

Fair enough.

FYI, I've spent most of the day poking at the IR linker and I've found all 
sorts of ways that I can get it to make a complete mess of structure types and 
pointers to them, including some simple cases where it will mess it up in 
different ways depending on the order in which you link files, but I think I've 
convinced myself that there is no way to get the linker to cause incorrect code 
to be generated -- just lots of extra types, pointer casts, and function casts. 
This seems to be entirely consistent with what you are saying and sounds like a 
pretty solid argument for getting rid of typed pointers.

I guess maybe I'll take a step back and think about whether or not I could 
solve my current problems by pretending that all pointers are i8* and reasoning 
from there based on uses.

Thanks for taking the time to share your thoughts on this with me.


https://reviews.llvm.org/D49403



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D53157: Teach the IRBuilder about constrained fadd and friends

2018-11-08 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added a comment.

Craig Topper also raised some concerns with me (in person, his desk is right by 
mine) about the potential effect this might have on code size. He tells me that 
IRBuilder calls are intended to always be inlined and if we grow the 
implementation of these functions too much it could lead to noticeable bloat. 
It still seems to me like it might be worthwhile for the simplification it 
would allow in the front end, but I'm not really a front end guy so I 
definitely agree that we should get some input from front end people about what 
they want.


https://reviews.llvm.org/D53157



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D45616: [X86] Lower _mm[256|512]_cmp[.]_mask intrinsics to native llvm IR

2018-11-08 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added a comment.

In https://reviews.llvm.org/D45616#1290897, @nastafev wrote:

> >   can trigger arbitrary floating-point exceptions anywhere in your code
>
> I believe this statement reflects the current state of many compilers on the 
> market, I guess I just don't see the reason why making things worse. It seems 
> the original intent of the commit was to add support for masked compares, and 
> that could have been achieved without breaking what already worked.
>
> I hope the patch is ultimately helping some performance optimization, but it 
> is breaking the original intent of some legitimate programs that worked 
> before, and introduces correctness regression. So to me it must be at least 
> guarded by a flip-switch.
>
> The reference to constrained floating-point intrinsics work is relevant, but 
> it will obviously take time and extra effort to enable and then to unbreak 
> all the cases that are made broken here. Instead one could postpone lowering 
> of the particular instructions until it was possible without violation of the 
> semantics...


That seems like a legitimate concern to me.

I believe this patch was part of a larger effort to get rid of the dependence 
on intrinsics. We have a very broad preference for expressing things using 
native IR whenever possible because (among other reasons) intrinsics block most 
optimizations and we don't want to teach optimizations to reason about 
target-specific intrinsics. In this particular case we may have overreached 
because it isn't strictly possible to express all the semantics of this 
built-in accurately using native IR.

There is a patch currently active to add constrained fcmp intrinsics, but it 
doesn't have a masked variant.


Repository:
  rC Clang

https://reviews.llvm.org/D45616



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D45616: [X86] Lower _mm[256|512]_cmp[.]_mask intrinsics to native llvm IR

2018-11-08 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added a comment.

Agreed. Reverting this patch wouldn't move us forward on constrained FP 
handling. What I'm saying (and what I think @nastafev  is saying) is that this 
patch is taking a built-in that allows the user to express specific signalling 
behavior and throwing away that aspect of its semantics. Granted we do say that 
FP environment access is unsupported, so technically unmasking FP exceptions or 
even reading the status flag is undefined behavior. But it seems like there's a 
pretty big difference between using that claim to justify enabling some 
optimization that might do constant folding in a way that assumes the default 
rounding mode versus using that claim to disregard part of the semantics of a 
built-in that is modeling a target-specific instruction.

I'm not insisting that we have to revert this patch or anything like that. I'm 
just saying that we should think about it. My personal preference would 
actually be to have this code re-implemented using the new constrained fcmp 
intrinsic when it lands. That still leaves the masking part of this unsettled, 
but as you say that's probably not a priority right now.

BTW, here's a link to the constrained fcmp review: 
https://reviews.llvm.org/D54121


Repository:
  rC Clang

https://reviews.llvm.org/D45616



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D53157: Teach the IRBuilder about constrained fadd and friends

2018-11-16 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added a comment.

In https://reviews.llvm.org/D53157#1291978, @cameron.mcinally wrote:

> In https://reviews.llvm.org/D53157#1291964, @kpn wrote:
>
> > I don't expect anyone would need to switch between constrained and regular 
> > floating point at an instruction level of granularity.
>
>
> The Standard allows for this (to some degree). E.g. FENV_ACCESS can be 
> toggled between nested compound statements.
>
> Also, some AVX512 hardware instructions have explicit SAE and RM operands.


Just because FENV_ACCESS can be toggled on that granularity doesn't mean we 
have to represent it that way. We've previously agreed (I think) that if 
FENV_ACCESS is enabled anywhere in a function we will want to use the 
constrained intrinsics for all FP operations in the function, not just the ones 
in the scope where it was specified. I think this works because FENV_ACCESS ON 
needs to be respected (i.e. we need to assume that the user may change the FP 
environment) but FENV_ACCESS OFF doesn't need to be respected (i.e. we are not 
required to assume that the user will not change the environment). For 
instance, the spec does give us liberty to assume the default rounding mode in 
FENV_ACCESS OFF regions, but if we make no assumptions about the rounding mode 
that will be conservatively correct.

As for instructions that take an explicit rounding mode argument, that's a 
separate issue (and one that is relevant for multiple architectures). The 
constrained intrinsics do not enforce a rounding mode (i.e. they are 
descriptive rather than prescriptive), but if we have concrete rounding mode 
arguments for any given instruction we could use that to encode the rounding 
mode in the instruction. I'm not sure how user code would express this apart 
from use of intrinsics.


https://reviews.llvm.org/D53157



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D53157: Teach the IRBuilder about constrained fadd and friends

2018-11-19 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added a comment.

In https://reviews.llvm.org/D53157#1302724, @uweigand wrote:

> A couple of comments on the previous discussion:
>
> 1. Instead of defining a new command line option, I'd prefer to use the 
> existing options -frounding-math and -ftrapping-math to set the default 
> behavior of math operations w.r.t. rounding modes and exception status.  (For 
> compatibility with GCC if nothing else.)


I agree that it's preferable to re-use these existing options if possible. I 
have some concerns that -ftrapping-math has a partial implementation in place 
that doesn't seem to be well aligned with the way fast-math flags are handled, 
so it might require some work to have that working as expected without breaking 
existing users. In general though these seem like they should do what we need.

Regarding GCC compatibility, I notice that GCC defaults to trapping math being 
enabled and I don't think that's what we want with clang. It also seems to 
imply something more than I think we need for constrained handling. For 
example, the GCC documentation says that -fno-trapping-math "can result in 
incorrect output for programs that depend on an exact implementation of IEEE or 
ISO rules/specifications for math functions" so it sounds like maybe it also 
implies (for GCC)  something like LLVM's "afn" fast math flag.

So if we are going to use these options, I think we need to have a discussion 
about whether or not it's OK to diverge from GCC's interpretation of them.

In https://reviews.llvm.org/D53157#1302724, @uweigand wrote:

> 2. I also read the C standard to imply that it is a requirement of **user 
> code** to reset the status flag to default before switching back to 
> FENV_ACCESS OFF.  The fundamental characterization of the pragma says "The 
> FENV_ACCESS pragma provides a means **to inform the implementation** when a 
> program might access the floating-point environment to test floating-point 
> status flags or run under non-default floating-point control modes."  There 
> is no mention anywhere that using the pragma, on its own, will ever 
> **change** those control modes.   The last sentence about "... the 
> floating-point control modes have their default setting", while indeed a bit 
> ambiguous, is still consistent with an interpretation that it is the 
> responsibility of user code to ensure that state, there is no explicit 
> statement that the implementation will do so.


I definitely agree with this interpretation of the standard. My understanding 
is that behavior is undefined if the user has not left the FP environment in 
the default state when transitioning to an FENV_ACCESS OFF region.

In https://reviews.llvm.org/D53157#1302724, @uweigand wrote:

> 3. I agree that we need to be careful about intermixing "normal" 
> floating-point operations with strict ones.  However, I'm still not convinced 
> that the pragma itself must be the scheduling barrier.  It seems to me that 
> the compiler already knows where FP control flags are ever modified directly 
> (this can only happen with intrinsics or the like), so the main issue is 
> whether function calls need to be considered.  This is where the pragma comes 
> in: in my mind, the primary difference between FENV_ACCESS ON and FENV_ACCESS 
> OFF regions is that where the pragma is ON, function calls need to be 
> considered (unless otherwise known for sure) to access FP control flags, 
> while where the pragma is OFF, function calls can be considered to never 
> touch FP control flags.  So the real scheduling barrier would be any 
> **function call within a FENV_ACCESS ON region**.  Those would have to be 
> marked by the front-end in the IR, presumably using a function attribute.  
> The common LLVM optimizers would then need to respect that scheduling barrier 
> (here is where we likely still have an open issue, there doesn't appear to be 
> any way to express that at the IR level for regular floating-point operations 
> ...), and likewise the back-ends (but that looks straightforward: a back-end 
> typically will model FP status as residing in a register or in a 
> pseudo-memory slot, and those can simply be considered used/clobbered by 
> function calls marked as within FENV_ACCESS ON regions).


I'm a bit confused by this. The constrained intrinsics will cause all calls to 
act as barriers to motion of the FP operations represented by the intrinsics 
(at least before instruction selection). So I'm not clear what you are saying 
is needed here.


https://reviews.llvm.org/D53157



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D53157: Teach the IRBuilder about constrained fadd and friends

2018-11-20 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added a comment.

In https://reviews.llvm.org/D53157#1304895, @cameron.mcinally wrote:

> The problem I was missing is when a FENV_ACCESS=OFF operation, like a FDIV, 
> is hoisted into a FENV_ACCESS=ON region. I see it now, but still think that 
> forcing FENV_ACCESS=OFF operations into constrained intrinsics is a big 
> hammer. If there is a way to add barriers around function calls in a 
> FENV_ACCESS=ON region, that would be better.


It may be a big hammer, but I don't think we'll need to use it terribly often 
(that is, I don't expect the mixing of FENV_ACCESS regions within a function to 
be common -- I could be wrong). In any event, we've been clear that choosing to 
enable FP environment access will mean sacrificing performance in the near 
term. Once we've had time to teach all the relevant optimizations how to handle 
the constrained intrinsics the hammer will start to feel much smaller.

In https://reviews.llvm.org/D53157#1304895, @cameron.mcinally wrote:

> Stepping way back, the *best* solution is to have the FPEnv implementation 
> shut down unsafe optimizations on an individual basis. Perhaps we should be 
> tagging functions that contain FENV_ACCESS=ON regions. That way unsafe 
> optimization passes, e.g. hoisting, can query the tag to see if they should 
> skip these functions.


One of the reasons we went with the approach we did is that it provides 
conservatively correct results by default. Optimizations don't need to be 
taught to leave the intrinsics alone. They do that as a default behavior when 
they don't recognize the intrinsic. If we had required optimizations to opt out 
as needed, we'd have problems chasing down all the places where that needed to 
happen. So we chose the opposite challenge of using the big hammer for initial 
implementation and then having to go looking for the places that needed to be 
taught how to interpret the intrinsics for cases when the optimization can 
still be performed.

I think at this point we're all on the same page in this regard. I just wanted 
to make sure we also understand why we're on that page. I still believe it was 
the correct choice.


https://reviews.llvm.org/D53157



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc

2017-08-25 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added a comment.

In https://reviews.llvm.org/D37042#850676, @hfinkel wrote:

> I'd really like to see this done in some way such that we can issue a warning 
> along with generating well-defined code. The warning can specifically state 
> that the code is using an extension, etc.


Should it warn on any nullptr arithmetic, or just the cases that are being 
handled by this change?


https://reviews.llvm.org/D37042



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc

2017-08-29 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor updated this revision to Diff 113134.
andrew.w.kaylor added a comment.

Added warnings for null pointer arithmetic.


https://reviews.llvm.org/D37042

Files:
  include/clang/Basic/DiagnosticGroups.td
  include/clang/Basic/DiagnosticSemaKinds.td
  lib/CodeGen/CGExprScalar.cpp
  lib/Sema/SemaExpr.cpp
  test/CodeGen/nullptr-arithmetic.c
  test/Sema/pointer-addition.c

Index: lib/CodeGen/CGExprScalar.cpp
===
--- lib/CodeGen/CGExprScalar.cpp
+++ lib/CodeGen/CGExprScalar.cpp
@@ -2671,6 +2671,37 @@
   unsigned width = cast(index->getType())->getBitWidth();
   auto &DL = CGF.CGM.getDataLayout();
   auto PtrTy = cast(pointer->getType());
+
+  // Some versions of glibc and gcc use idioms (particularly in their malloc
+  // routines) that add a pointer-sized integer (known to be a pointer value)
+  // to a null pointer in order to cast the value back to an integer or as
+  // part of a pointer alignment algorithm.  This is undefined behavior, but
+  // we'd like to be able to compile programs that use it.
+  //
+  // Normally, we'd generate a GEP with a null-pointer base here in response
+  // to that code, but it's also UB to dereference a pointer created that
+  // way.  Instead (as an acknowledged hack to tolerate the idiom) we will
+  // generate a direct cast of the integer value to a pointer.
+  //
+  // The idiom (p = nullptr + N) is not met if any of the following are true:
+  //
+  //   The operation is subtraction.
+  //   The index is not pointer-sized.
+  //   The pointer type is not byte-sized.
+  //   The index operand is a constant.
+  //
+  if (isa(pointer) && !isSubtraction && 
+  (width == DL.getTypeSizeInBits(PtrTy)) && 
+  !isa(index)) {
+// The pointer type might come back as null, so it's deferred until here.
+const PointerType *pointerType 
+  = pointerOperand->getType()->getAs();
+if (pointerType && pointerType->getPointeeType()->isCharType()) { 
+  // (nullptr + N) -> inttoptr N to 
+  return CGF.Builder.CreateIntToPtr(index, pointer->getType());
+}
+  }
+
   if (width != DL.getTypeSizeInBits(PtrTy)) {
 // Zero-extend or sign-extend the pointer value according to
 // whether the index is signed or not.
Index: lib/Sema/SemaExpr.cpp
===
--- lib/Sema/SemaExpr.cpp
+++ lib/Sema/SemaExpr.cpp
@@ -8490,6 +8490,18 @@
 << 0 /* one pointer */ << Pointer->getSourceRange();
 }
 
+/// \brief Diagnose invalid arithmetic on a null pointer.
+///
+/// If \p IsGNUIdiom is true, the operation is using the 'p = (i8*)nullptr + n'
+/// idiom, which we recognize as a GNU extension.
+///
+static void diagnoseArithmeticOnNullPointer(Sema &S, SourceLocation Loc,
+Expr *Pointer, bool IsGNUIdiom) {
+  S.Diag(Loc, IsGNUIdiom ? diag::ext_gnu_null_ptr_arith
+ : diag::warn_pointer_arith_null_ptr)
+<< Pointer->getSourceRange();
+}
+
 /// \brief Diagnose invalid arithmetic on two function pointers.
 static void diagnoseArithmeticOnTwoFunctionPointers(Sema &S, SourceLocation Loc,
 Expr *LHS, Expr *RHS) {
@@ -8784,6 +8796,21 @@
   if (!IExp->getType()->isIntegerType())
 return InvalidOperands(Loc, LHS, RHS);
 
+  // Adding to a null pointer is never an error, but should warn.
+  if (PExp->IgnoreParenCasts()->isNullPointerConstant(Context, 
+ Expr::NPC_ValueDependentIsNull)) {
+// Check the conditions to see if this is the 'p = (i8*)nullptr + n' idiom.
+bool IsGNUIdiom = false;
+const PointerType *pointerType = PExp->getType()->getAs();
+if (!IExp->isIntegerConstantExpr(Context) &&
+pointerType->getPointeeType()->isCharType() &&
+Context.getTypeSize(pointerType) == 
+Context.getTypeSize(IExp->getType()))
+  IsGNUIdiom = true;
+
+diagnoseArithmeticOnNullPointer(*this, Loc, PExp, IsGNUIdiom);
+  }
+
   if (!checkArithmeticOpPointerOperand(*this, Loc, PExp))
 return QualType();
 
@@ -8845,6 +8872,13 @@
 
 // The result type of a pointer-int computation is the pointer type.
 if (RHS.get()->getType()->isIntegerType()) {
+  // Subtracting from a null pointer should produce a warning.
+  // The last argument to the diagnose call says this doesn't match the
+  // GNU int-to-pointer idiom.
+  if (LHS.get()->IgnoreParenCasts()->isNullPointerConstant(Context,
+   Expr::NPC_ValueDependentIsNull))
+diagnoseArithmeticOnNullPointer(*this, Loc, LHS.get(), false);
+
   if (!checkArithmeticOpPointerOperand(*this, Loc, LHS.get()))
 return QualType();
 
Index: include/clang/Basic/DiagnosticSemaKinds.td
===
--- include/clang/Basic/DiagnosticSemaKinds.td
+++ include/clang/Basic/DiagnosticSemaKi

[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc

2017-08-29 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added inline comments.



Comment at: include/clang/Basic/DiagnosticSemaKinds.td:6032
+def ext_gnu_null_ptr_arith : Extension<
+  "inttoptr casting using arithmetic on a null pointer is a GNU extension">,
+  InGroup;

rsmith wrote:
> efriedma wrote:
> > The keyword "inttoptr" is part of LLVM, not something we expect users to 
> > understand.
> How about something like "arithmetic on null pointer treated as cast from 
> integer to pointer as a GNU extension"?
I like that.  Thanks for the suggestion.



Comment at: lib/Sema/SemaExpr.cpp:8805
+const PointerType *pointerType = PExp->getType()->getAs();
+if (!IExp->isIntegerConstantExpr(Context) &&
+pointerType->getPointeeType()->isCharType() &&

rsmith wrote:
> It seems strange to me to disable this when the RHS is an ICE. If we're going 
> to support this as an extension, we should make the rules for it as simple as 
> we reasonably can; this ICE check seems like an unnecessary complication.
You're probably right.  My thinking was that I wanted to keep the extension 
idiom as narrow as was reasonable, so these were cases that I felt like I could 
rule out, but the argument for simplicity is compelling since the alternative 
isn't doing anything particularly desirable in most cases.


https://reviews.llvm.org/D37042



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc

2017-08-29 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added inline comments.



Comment at: lib/Sema/SemaExpr.cpp:8808
+Context.getTypeSize(pointerType) == 
+Context.getTypeSize(IExp->getType()))
+  IsGNUIdiom = true;

efriedma wrote:
> Please make sure you use exactly the same check in Sema and CodeGen (probably 
> easiest to stick a helper into lib/AST/).
That's a good idea, but I'm not really familiar enough with the structure of 
clang to be sure I'm not doing it in a ham-fisted way.  Can you clarify?  Are 
you suggesting adding something like ASTContext::isPointeeTypeCharSize() and 
ASTContext::isIntegerTypePointerSize()?  Or maybe a single specialized function 
somewhere that does both checks like 
ASTContext::areOperandNullPointerArithmeticCompatible()?


https://reviews.llvm.org/D37042



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc

2017-08-29 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added inline comments.



Comment at: lib/Sema/SemaExpr.cpp:8808
+Context.getTypeSize(pointerType) == 
+Context.getTypeSize(IExp->getType()))
+  IsGNUIdiom = true;

efriedma wrote:
> rsmith wrote:
> > andrew.w.kaylor wrote:
> > > efriedma wrote:
> > > > Please make sure you use exactly the same check in Sema and CodeGen 
> > > > (probably easiest to stick a helper into lib/AST/).
> > > That's a good idea, but I'm not really familiar enough with the structure 
> > > of clang to be sure I'm not doing it in a ham-fisted way.  Can you 
> > > clarify?  Are you suggesting adding something like 
> > > ASTContext::isPointeeTypeCharSize() and 
> > > ASTContext::isIntegerTypePointerSize()?  Or maybe a single specialized 
> > > function somewhere that does both checks like 
> > > ASTContext::areOperandNullPointerArithmeticCompatible()?
> > I would suggest something even more specific, such as 
> > `isNullPointerPlusIntegerExtension`
> I'm most concerned about the difference between 
> "isa(pointer)" and 
> "PExp->IgnoreParenCasts()->isNullPointerConstant()"... they're different in 
> important ways.
> 
> I was thinking something like 
> "BinaryOperator::isNullPointerArithmeticExtension()"
At this point in Sema the BinaryOperator for the addition hasn't been created 
yet, right?  So a static function that takes the opcode and operands?


https://reviews.llvm.org/D37042



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc

2017-08-30 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor updated this revision to Diff 113311.
andrew.w.kaylor added a comment.

Refactored the GNU idiom check to be shared  by Sema and CodeGen.
Refined the checks so that nullptr+0 doesn't warn in C++.
Added the zero offset qualification in the warning produced with C++.


https://reviews.llvm.org/D37042

Files:
  include/clang/AST/Expr.h
  include/clang/Basic/DiagnosticGroups.td
  include/clang/Basic/DiagnosticSemaKinds.td
  lib/AST/Expr.cpp
  lib/CodeGen/CGExprScalar.cpp
  lib/Sema/SemaExpr.cpp
  test/CodeGen/nullptr-arithmetic.c
  test/Sema/pointer-addition.c
  test/SemaCXX/nullptr-arithmetic.cpp

Index: lib/CodeGen/CGExprScalar.cpp
===
--- lib/CodeGen/CGExprScalar.cpp
+++ lib/CodeGen/CGExprScalar.cpp
@@ -2671,6 +2671,30 @@
   unsigned width = cast(index->getType())->getBitWidth();
   auto &DL = CGF.CGM.getDataLayout();
   auto PtrTy = cast(pointer->getType());
+
+  // Some versions of glibc and gcc use idioms (particularly in their malloc
+  // routines) that add a pointer-sized integer (known to be a pointer value)
+  // to a null pointer in order to cast the value back to an integer or as
+  // part of a pointer alignment algorithm.  This is undefined behavior, but
+  // we'd like to be able to compile programs that use it.
+  //
+  // Normally, we'd generate a GEP with a null-pointer base here in response
+  // to that code, but it's also UB to dereference a pointer created that
+  // way.  Instead (as an acknowledged hack to tolerate the idiom) we will
+  // generate a direct cast of the integer value to a pointer.
+  //
+  // The idiom (p = nullptr + N) is not met if any of the following are true:
+  //
+  //   The operation is subtraction.
+  //   The index is not pointer-sized.
+  //   The pointer type is not byte-sized.
+  //
+  if (BinaryOperator::isNullPointerArithmeticExtension(CGF.getContext(),
+   op.Opcode,
+   expr->getLHS(), 
+   expr->getRHS()))
+return CGF.Builder.CreateIntToPtr(index, pointer->getType());
+
   if (width != DL.getTypeSizeInBits(PtrTy)) {
 // Zero-extend or sign-extend the pointer value according to
 // whether the index is signed or not.
Index: lib/AST/Expr.cpp
===
--- lib/AST/Expr.cpp
+++ lib/AST/Expr.cpp
@@ -1829,6 +1829,45 @@
   return OverOps[Opc];
 }
 
+bool BinaryOperator::isNullPointerArithmeticExtension(ASTContext &Ctx,
+  Opcode Opc,
+  Expr *LHS, Expr *RHS) {
+  if (Opc != BO_Add)
+return false;
+
+  // Check that we have one pointer and one integer operand.
+  Expr *PExp;
+  Expr *IExp;
+  if (LHS->getType()->isPointerType()) {
+if (!RHS->getType()->isIntegerType())
+  return false;
+PExp = LHS;
+IExp = RHS;
+  } else if (RHS->getType()->isPointerType()) {
+if (!LHS->getType()->isIntegerType())
+  return false;
+PExp = RHS;
+IExp = LHS;
+  } else {
+return false;
+  }
+
+  // Check that the pointer is a nullptr.
+  if (!PExp->IgnoreParenCasts()
+  ->isNullPointerConstant(Ctx, Expr::NPC_ValueDependentIsNull))
+return false;
+
+  // Check that the pointee type is char-sized.
+  const PointerType *PTy = PExp->getType()->getAs();
+  if (!PTy || !PTy->getPointeeType()->isCharType())
+return false;
+
+  // Check that the integer type is pointer-sized.
+  if (Ctx.getTypeSize(IExp->getType()) != Ctx.getTypeSize(PExp->getType()))
+return false;
+
+  return true;
+}
 InitListExpr::InitListExpr(const ASTContext &C, SourceLocation lbraceloc,
ArrayRef initExprs, SourceLocation rbraceloc)
   : Expr(InitListExprClass, QualType(), VK_RValue, OK_Ordinary, false, false,
Index: lib/Sema/SemaExpr.cpp
===
--- lib/Sema/SemaExpr.cpp
+++ lib/Sema/SemaExpr.cpp
@@ -8490,6 +8490,21 @@
 << 0 /* one pointer */ << Pointer->getSourceRange();
 }
 
+/// \brief Diagnose invalid arithmetic on a null pointer.
+///
+/// If \p IsGNUIdiom is true, the operation is using the 'p = (i8*)nullptr + n'
+/// idiom, which we recognize as a GNU extension.
+///
+static void diagnoseArithmeticOnNullPointer(Sema &S, SourceLocation Loc,
+Expr *Pointer, bool IsGNUIdiom) {
+  if (IsGNUIdiom)
+S.Diag(Loc, diag::ext_gnu_null_ptr_arith)
+  << Pointer->getSourceRange();
+  else
+S.Diag(Loc, diag::warn_pointer_arith_null_ptr)
+  << S.getLangOpts().CPlusPlus << Pointer->getSourceRange();
+}
+
 /// \brief Diagnose invalid arithmetic on two function pointers.
 static void diagnoseArithmeticOnTwoFunctionPointers(Sema &S, SourceLocation Loc,
 Expr *LHS, E

[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc

2017-08-30 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added inline comments.



Comment at: lib/AST/Expr.cpp:1857
+  if (!PExp->IgnoreParenCasts()
+  ->isNullPointerConstant(Ctx, Expr::NPC_ValueDependentIsNull))
+return false;

rsmith wrote:
> If we get here with a value-dependent expression, we should treat it as 
> non-null (do not warn on `(char*)PtrTemplateParameter + N`).
OK.  I wasn't sure about that.

So how do I test that?  Is this right?
```
template
T* f(intptr_t offset) {
  return P + offset;
}

char *test(intptr_t offset) {
  return f(offset);
}
```



Comment at: lib/Sema/SemaExpr.cpp:8877
 if (RHS.get()->getType()->isIntegerType()) {
+  // Subtracting from a null pointer should produce a warning.
+  // The last argument to the diagnose call says this doesn't match the

rsmith wrote:
> Subtracting zero from a null pointer should not warn in C++.
> 
> (Conversely, subtracting a non-null pointer from a pointer should warn in 
> C++, and subtracting any pointer from a null pointer should warn in C.)
Is it OK if I just add a FIXME in the section below that handles 
pointer-pointer?


https://reviews.llvm.org/D37042



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc

2017-08-30 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor updated this revision to Diff 113343.
andrew.w.kaylor added a comment.

Fixed value-dependent argument in isNullPointerConstant checks.
Added check for C++ zero offset in subtraction.
Added value-dependent test cases.


https://reviews.llvm.org/D37042

Files:
  include/clang/AST/Expr.h
  include/clang/Basic/DiagnosticGroups.td
  include/clang/Basic/DiagnosticSemaKinds.td
  lib/AST/Expr.cpp
  lib/CodeGen/CGExprScalar.cpp
  lib/Sema/SemaExpr.cpp
  test/CodeGen/nullptr-arithmetic.c
  test/Sema/pointer-addition.c
  test/SemaCXX/nullptr-arithmetic.cpp

Index: lib/CodeGen/CGExprScalar.cpp
===
--- lib/CodeGen/CGExprScalar.cpp
+++ lib/CodeGen/CGExprScalar.cpp
@@ -2671,6 +2671,30 @@
   unsigned width = cast(index->getType())->getBitWidth();
   auto &DL = CGF.CGM.getDataLayout();
   auto PtrTy = cast(pointer->getType());
+
+  // Some versions of glibc and gcc use idioms (particularly in their malloc
+  // routines) that add a pointer-sized integer (known to be a pointer value)
+  // to a null pointer in order to cast the value back to an integer or as
+  // part of a pointer alignment algorithm.  This is undefined behavior, but
+  // we'd like to be able to compile programs that use it.
+  //
+  // Normally, we'd generate a GEP with a null-pointer base here in response
+  // to that code, but it's also UB to dereference a pointer created that
+  // way.  Instead (as an acknowledged hack to tolerate the idiom) we will
+  // generate a direct cast of the integer value to a pointer.
+  //
+  // The idiom (p = nullptr + N) is not met if any of the following are true:
+  //
+  //   The operation is subtraction.
+  //   The index is not pointer-sized.
+  //   The pointer type is not byte-sized.
+  //
+  if (BinaryOperator::isNullPointerArithmeticExtension(CGF.getContext(),
+   op.Opcode,
+   expr->getLHS(), 
+   expr->getRHS()))
+return CGF.Builder.CreateIntToPtr(index, pointer->getType());
+
   if (width != DL.getTypeSizeInBits(PtrTy)) {
 // Zero-extend or sign-extend the pointer value according to
 // whether the index is signed or not.
Index: lib/AST/Expr.cpp
===
--- lib/AST/Expr.cpp
+++ lib/AST/Expr.cpp
@@ -1829,6 +1829,45 @@
   return OverOps[Opc];
 }
 
+bool BinaryOperator::isNullPointerArithmeticExtension(ASTContext &Ctx,
+  Opcode Opc,
+  Expr *LHS, Expr *RHS) {
+  if (Opc != BO_Add)
+return false;
+
+  // Check that we have one pointer and one integer operand.
+  Expr *PExp;
+  Expr *IExp;
+  if (LHS->getType()->isPointerType()) {
+if (!RHS->getType()->isIntegerType())
+  return false;
+PExp = LHS;
+IExp = RHS;
+  } else if (RHS->getType()->isPointerType()) {
+if (!LHS->getType()->isIntegerType())
+  return false;
+PExp = RHS;
+IExp = LHS;
+  } else {
+return false;
+  }
+
+  // Check that the pointer is a nullptr.
+  if (!PExp->IgnoreParenCasts()
+  ->isNullPointerConstant(Ctx, Expr::NPC_ValueDependentIsNotNull))
+return false;
+
+  // Check that the pointee type is char-sized.
+  const PointerType *PTy = PExp->getType()->getAs();
+  if (!PTy || !PTy->getPointeeType()->isCharType())
+return false;
+
+  // Check that the integer type is pointer-sized.
+  if (Ctx.getTypeSize(IExp->getType()) != Ctx.getTypeSize(PExp->getType()))
+return false;
+
+  return true;
+}
 InitListExpr::InitListExpr(const ASTContext &C, SourceLocation lbraceloc,
ArrayRef initExprs, SourceLocation rbraceloc)
   : Expr(InitListExprClass, QualType(), VK_RValue, OK_Ordinary, false, false,
Index: lib/Sema/SemaExpr.cpp
===
--- lib/Sema/SemaExpr.cpp
+++ lib/Sema/SemaExpr.cpp
@@ -8490,6 +8490,21 @@
 << 0 /* one pointer */ << Pointer->getSourceRange();
 }
 
+/// \brief Diagnose invalid arithmetic on a null pointer.
+///
+/// If \p IsGNUIdiom is true, the operation is using the 'p = (i8*)nullptr + n'
+/// idiom, which we recognize as a GNU extension.
+///
+static void diagnoseArithmeticOnNullPointer(Sema &S, SourceLocation Loc,
+Expr *Pointer, bool IsGNUIdiom) {
+  if (IsGNUIdiom)
+S.Diag(Loc, diag::ext_gnu_null_ptr_arith)
+  << Pointer->getSourceRange();
+  else
+S.Diag(Loc, diag::warn_pointer_arith_null_ptr)
+  << S.getLangOpts().CPlusPlus << Pointer->getSourceRange();
+}
+
 /// \brief Diagnose invalid arithmetic on two function pointers.
 static void diagnoseArithmeticOnTwoFunctionPointers(Sema &S, SourceLocation Loc,
 Expr *LHS, Expr *RHS) {
@@ -8784,6 +8799,21 @@
   if (!IE

[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc

2017-09-05 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added a comment.

Ping.


https://reviews.llvm.org/D37042



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc

2017-09-13 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added a comment.

Does anything else need to be done for this to be ready to land?


https://reviews.llvm.org/D37042



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc

2017-09-13 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added inline comments.



Comment at: include/clang/Basic/DiagnosticSemaKinds.td:6031
   InGroup;
+def ext_gnu_null_ptr_arith : Extension<
+  "arithmetic on a null pointer treated as a cast from integer to pointer is a 
GNU extension">,

efriedma wrote:
> "extension" isn't really right here; this shouldn't be an error in 
> -pedantic-errors mode.  Probably best to just stick this into the 
> NullPointerArithmetic, like the other new warning.
So how should a word the warning?  Just this:

"arithmetic on a null pointer treated as a cast from integer to pointer"?


https://reviews.llvm.org/D37042



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc

2017-09-13 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added inline comments.



Comment at: include/clang/Basic/DiagnosticSemaKinds.td:6031
   InGroup;
+def ext_gnu_null_ptr_arith : Extension<
+  "arithmetic on a null pointer treated as a cast from integer to pointer is a 
GNU extension">,

efriedma wrote:
> andrew.w.kaylor wrote:
> > efriedma wrote:
> > > "extension" isn't really right here; this shouldn't be an error in 
> > > -pedantic-errors mode.  Probably best to just stick this into the 
> > > NullPointerArithmetic, like the other new warning.
> > So how should a word the warning?  Just this:
> > 
> > "arithmetic on a null pointer treated as a cast from integer to pointer"?
> That wasn't what I meant; the current wording is fine. I meant this should be 
> something like `def warn_gnu_null_ptr_arith : Warning<`.
OK.  I think I understand the behavior you wanted.  I just thought maybe the 
current wording might be technically incorrect.  I wasn't sure how precisely 
defined we consider "extension" to be in this context.


https://reviews.llvm.org/D37042



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc

2017-09-14 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor updated this revision to Diff 115262.
andrew.w.kaylor added a comment.

-Changed GNU idiom from extension to warning.
-Updated to ToT.


https://reviews.llvm.org/D37042

Files:
  include/clang/AST/Expr.h
  include/clang/Basic/DiagnosticGroups.td
  include/clang/Basic/DiagnosticSemaKinds.td
  lib/AST/Expr.cpp
  lib/CodeGen/CGExprScalar.cpp
  lib/Sema/SemaExpr.cpp
  test/CodeGen/nullptr-arithmetic.c
  test/Sema/pointer-addition.c
  test/SemaCXX/nullptr-arithmetic.cpp

Index: lib/CodeGen/CGExprScalar.cpp
===
--- lib/CodeGen/CGExprScalar.cpp
+++ lib/CodeGen/CGExprScalar.cpp
@@ -2678,6 +2678,30 @@
   unsigned width = cast(index->getType())->getBitWidth();
   auto &DL = CGF.CGM.getDataLayout();
   auto PtrTy = cast(pointer->getType());
+
+  // Some versions of glibc and gcc use idioms (particularly in their malloc
+  // routines) that add a pointer-sized integer (known to be a pointer value)
+  // to a null pointer in order to cast the value back to an integer or as
+  // part of a pointer alignment algorithm.  This is undefined behavior, but
+  // we'd like to be able to compile programs that use it.
+  //
+  // Normally, we'd generate a GEP with a null-pointer base here in response
+  // to that code, but it's also UB to dereference a pointer created that
+  // way.  Instead (as an acknowledged hack to tolerate the idiom) we will
+  // generate a direct cast of the integer value to a pointer.
+  //
+  // The idiom (p = nullptr + N) is not met if any of the following are true:
+  //
+  //   The operation is subtraction.
+  //   The index is not pointer-sized.
+  //   The pointer type is not byte-sized.
+  //
+  if (BinaryOperator::isNullPointerArithmeticExtension(CGF.getContext(),
+   op.Opcode,
+   expr->getLHS(), 
+   expr->getRHS()))
+return CGF.Builder.CreateIntToPtr(index, pointer->getType());
+
   if (width != DL.getTypeSizeInBits(PtrTy)) {
 // Zero-extend or sign-extend the pointer value according to
 // whether the index is signed or not.
Index: lib/AST/Expr.cpp
===
--- lib/AST/Expr.cpp
+++ lib/AST/Expr.cpp
@@ -1829,6 +1829,45 @@
   return OverOps[Opc];
 }
 
+bool BinaryOperator::isNullPointerArithmeticExtension(ASTContext &Ctx,
+  Opcode Opc,
+  Expr *LHS, Expr *RHS) {
+  if (Opc != BO_Add)
+return false;
+
+  // Check that we have one pointer and one integer operand.
+  Expr *PExp;
+  Expr *IExp;
+  if (LHS->getType()->isPointerType()) {
+if (!RHS->getType()->isIntegerType())
+  return false;
+PExp = LHS;
+IExp = RHS;
+  } else if (RHS->getType()->isPointerType()) {
+if (!LHS->getType()->isIntegerType())
+  return false;
+PExp = RHS;
+IExp = LHS;
+  } else {
+return false;
+  }
+
+  // Check that the pointer is a nullptr.
+  if (!PExp->IgnoreParenCasts()
+  ->isNullPointerConstant(Ctx, Expr::NPC_ValueDependentIsNotNull))
+return false;
+
+  // Check that the pointee type is char-sized.
+  const PointerType *PTy = PExp->getType()->getAs();
+  if (!PTy || !PTy->getPointeeType()->isCharType())
+return false;
+
+  // Check that the integer type is pointer-sized.
+  if (Ctx.getTypeSize(IExp->getType()) != Ctx.getTypeSize(PExp->getType()))
+return false;
+
+  return true;
+}
 InitListExpr::InitListExpr(const ASTContext &C, SourceLocation lbraceloc,
ArrayRef initExprs, SourceLocation rbraceloc)
   : Expr(InitListExprClass, QualType(), VK_RValue, OK_Ordinary, false, false,
Index: lib/Sema/SemaExpr.cpp
===
--- lib/Sema/SemaExpr.cpp
+++ lib/Sema/SemaExpr.cpp
@@ -8486,6 +8486,21 @@
 << 0 /* one pointer */ << Pointer->getSourceRange();
 }
 
+/// \brief Diagnose invalid arithmetic on a null pointer.
+///
+/// If \p IsGNUIdiom is true, the operation is using the 'p = (i8*)nullptr + n'
+/// idiom, which we recognize as a GNU extension.
+///
+static void diagnoseArithmeticOnNullPointer(Sema &S, SourceLocation Loc,
+Expr *Pointer, bool IsGNUIdiom) {
+  if (IsGNUIdiom)
+S.Diag(Loc, diag::warn_gnu_null_ptr_arith)
+  << Pointer->getSourceRange();
+  else
+S.Diag(Loc, diag::warn_pointer_arith_null_ptr)
+  << S.getLangOpts().CPlusPlus << Pointer->getSourceRange();
+}
+
 /// \brief Diagnose invalid arithmetic on two function pointers.
 static void diagnoseArithmeticOnTwoFunctionPointers(Sema &S, SourceLocation Loc,
 Expr *LHS, Expr *RHS) {
@@ -8780,6 +8795,21 @@
   if (!IExp->getType()->isIntegerType())
 return InvalidOperands(Loc, LHS, RHS);
 
+  /

[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc

2017-09-19 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added a comment.

This is breaking buildbots for 32-bit targets because the offset in 'nullptr + 
int8_t_N' is being implicitly cast to an int.  That makes the sizeof(offset) == 
sizeof(ptr) check turn out differently than my tests were assuming.

I can get the buildbots green quickly by taking out parts of the tests, but I 
just talked to Erich Keane about this and we think the right way to fix this 
long term is to stop checking for sizeof(offset) == sizeof(ptr) in the code 
that identifies the idiom, since that check is of dubious value and would be 
difficult to always get to behave the way I intended.


Repository:
  rL LLVM

https://reviews.llvm.org/D37042



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D38060: Remove offset size check in nullptr arithmetic handling

2017-09-19 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor created this revision.

This patch amends the changes that were committed from 
https://reviews.llvm.org/D37042 to remove the offset size check in the idiom 
identification routine.

That check was behaving inconsistently (from target to target) because the 
offset was implicitly promoted to an integer before the idiom identification 
routine was called.  The result was that when a smaller-than-int offset was 
added to a nullptr, the idiom was reported as having been matched on platforms 
where the int size is the same as the pointer size but the idiom was reported 
as not having been matched on platforms where the int size and the pointer size 
were different.

Because this check adds little value, and because the behavior of nullptr 
arithmetic is undefined in all cases (and therefore at our discretion to 
implement as we choose), it seems best to remove this check in order to 
guarantee consistent behavior across all platforms.


https://reviews.llvm.org/D38060

Files:
  lib/AST/Expr.cpp
  test/CodeGen/nullptr-arithmetic.c
  test/Sema/pointer-addition.c
  test/SemaCXX/nullptr-arithmetic.cpp


Index: lib/AST/Expr.cpp
===
--- lib/AST/Expr.cpp
+++ lib/AST/Expr.cpp
@@ -1862,10 +1862,6 @@
   if (!PTy || !PTy->getPointeeType()->isCharType())
 return false;
 
-  // Check that the integer type is pointer-sized.
-  if (Ctx.getTypeSize(IExp->getType()) != Ctx.getTypeSize(PExp->getType()))
-return false;
-
   return true;
 }
 InitListExpr::InitListExpr(const ASTContext &C, SourceLocation lbraceloc,
Index: test/SemaCXX/nullptr-arithmetic.cpp
===
--- test/SemaCXX/nullptr-arithmetic.cpp
+++ test/SemaCXX/nullptr-arithmetic.cpp
@@ -1,4 +1,6 @@
 // RUN: %clang_cc1 %s -fsyntax-only -verify -pedantic -Wextra -std=c++11
+// RUN: %clang_cc1 %s -fsyntax-only -triple i686-unknown-unknown -verify 
-pedantic -Wextra -std=c++11
+// RUN: %clang_cc1 %s -fsyntax-only -triple x86_64-unknown-unknown -verify 
-pedantic -Wextra -std=c++11
 
 #include 
 
Index: test/CodeGen/nullptr-arithmetic.c
===
--- test/CodeGen/nullptr-arithmetic.c
+++ test/CodeGen/nullptr-arithmetic.c
@@ -1,4 +1,6 @@
 // RUN: %clang_cc1 -S %s -emit-llvm -o - | FileCheck %s
+// RUN: %clang_cc1 -S %s -emit-llvm -triple i686-unknown-unknown -o - | 
FileCheck %s
+// RUN: %clang_cc1 -S %s -emit-llvm -triple x86_64-unknown-unknown -o - | 
FileCheck %s
 
 #include 
 
@@ -32,3 +34,14 @@
 // CHECK-LABEL: test3
 // CHECK: getelementptr
 // CHECK-NOT: inttoptr
+
+// This checks the case where the offset isn't pointer-sized.
+// The front end will implicitly cast the offset to an integer, so we need to
+// make sure that doesn't cause problems on targets where integers and pointers
+// are not the same size.
+int8_t *test4(int8_t b) {
+  return NULLPTRI8 + b;
+}
+// CHECK-LABEL: test4
+// CHECK: inttoptr
+// CHECK-NOT: getelementptr
Index: test/Sema/pointer-addition.c
===
--- test/Sema/pointer-addition.c
+++ test/Sema/pointer-addition.c
@@ -1,4 +1,6 @@
 // RUN: %clang_cc1 %s -fsyntax-only -verify -pedantic -Wextra -std=c11
+// RUN: %clang_cc1 %s -fsyntax-only -triple i686-unknown-unknown -verify 
-pedantic -Wextra -std=c11
+// RUN: %clang_cc1 %s -fsyntax-only -triple x86_64-unknown-unknown -verify 
-pedantic -Wextra -std=c11
 
 #include 
 


Index: lib/AST/Expr.cpp
===
--- lib/AST/Expr.cpp
+++ lib/AST/Expr.cpp
@@ -1862,10 +1862,6 @@
   if (!PTy || !PTy->getPointeeType()->isCharType())
 return false;
 
-  // Check that the integer type is pointer-sized.
-  if (Ctx.getTypeSize(IExp->getType()) != Ctx.getTypeSize(PExp->getType()))
-return false;
-
   return true;
 }
 InitListExpr::InitListExpr(const ASTContext &C, SourceLocation lbraceloc,
Index: test/SemaCXX/nullptr-arithmetic.cpp
===
--- test/SemaCXX/nullptr-arithmetic.cpp
+++ test/SemaCXX/nullptr-arithmetic.cpp
@@ -1,4 +1,6 @@
 // RUN: %clang_cc1 %s -fsyntax-only -verify -pedantic -Wextra -std=c++11
+// RUN: %clang_cc1 %s -fsyntax-only -triple i686-unknown-unknown -verify -pedantic -Wextra -std=c++11
+// RUN: %clang_cc1 %s -fsyntax-only -triple x86_64-unknown-unknown -verify -pedantic -Wextra -std=c++11
 
 #include 
 
Index: test/CodeGen/nullptr-arithmetic.c
===
--- test/CodeGen/nullptr-arithmetic.c
+++ test/CodeGen/nullptr-arithmetic.c
@@ -1,4 +1,6 @@
 // RUN: %clang_cc1 -S %s -emit-llvm -o - | FileCheck %s
+// RUN: %clang_cc1 -S %s -emit-llvm -triple i686-unknown-unknown -o - | FileCheck %s
+// RUN: %clang_cc1 -S %s -emit-llvm -triple x86_64-unknown-unknown -o - | FileCheck %s
 
 #include 
 
@@ -32,3 +34,14 @@
 // CHECK-LABEL: test3
 // CHECK: getelement

[PATCH] D38060: Remove offset size check in nullptr arithmetic handling

2017-09-20 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor closed this revision.
andrew.w.kaylor added a comment.

This was committed as r313784.  I put the wrong differential revision number in 
the comment for that check-in.


https://reviews.llvm.org/D38060



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D62731: [RFC] Add support for options -fp-model= and -fp-speculation= : specify floating point behavior

2019-08-07 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added a comment.

I'm not entirely caught up on this review. I've only read the most recent 
comments, but I think I've got enough context to comment on the metadata 
arguments.

Based only on the fp-model command line options, the front end should only ever 
use "round.dynamic" (for strict mode) or "round.tonearest" (where full fenv 
access isn't permitted but we want to enable strict exception semantics). There 
are some pragmas, which I believe are in some draft of a standards document but 
not yet approved, which can declare any of the other rounding modes, but I 
don't know that we have plans to implement those yet. I'm also hoping that at 
some point we'll have a pass that finds calls to fesetround() and changes the 
metadata argument when it can prove what the rounding mode will be.

For the fp exception argument, the only values that can be implied by the 
-fp-model option are "fpexcept.strict" and "fpexcept.ignore". In icc, we have a 
separate option that can prevent speculation (equivalent to 
"fpexcept.maytrap"). I think gcc's, -ftrapping-math has a similar function 
(though the default may be reversed). I don't think we've talked about how (or 
if) clang should ever get into the "fpexcept.maytrap" state.

So for now, I think both arguments only need to support one of two states, 
depending on the -fp-model arguments.


Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D62731/new/

https://reviews.llvm.org/D62731



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D62731: [RFC] Add support for options -fp-model= and -fp-speculation= : specify floating point behavior

2019-08-09 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added a comment.

In D62731#1623114 , @mibintc wrote:

> @andrew.w.kaylor Thanks Andy.  Reminder -- in a private document you 
> indicated to me that -fp-speculation=safe corresponds to the maytrap setting 
> for the exception argument.  This patch is implemented with those semantics.


Right. This is a clear indication of why I shouldn't try to squeeze in a review 
just before heading out the door at the end of the day. The "separate option" 
in icc that I was referring to is -fp-speculation. Somehow I overlooked the 
fact that you were adding that option here. I saw your note and read the recent 
comments without looking over even the patch description. Sorry about that.


Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D62731/new/

https://reviews.llvm.org/D62731



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D66092: [CodeGen] Generate constrained fp intrinsics depending on FPOptions

2019-08-14 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added a comment.

In D66092#1627339 , @sepavloff wrote:

> In D66092#1625380 , @kpn wrote:
>
> > Also, if any constrained intrinsics are used in a function then the entire 
> > function needs to be constrained. Is this handled anywhere?
>
>
> If we decided to make the entire function constrained, it should be done 
> somewhere in IR transformations, because inlining may mix function bodies 
> with different fp options.


Kevin is right. We have decided that if constrained intrinsics are used 
anywhere in a function they must be used throughout the function. Otherwise, 
there would be nothing to prevent the non-constrained FP operations from 
migrating across constrained operations and the handling could get botched. The 
"relaxed" arguments ("round.tonearest" and "fpexcept.ignore") should be used 
where the default settings would apply. The front end should also be setting 
the "strictfp" attribute on calls within a constrained scope and, I think, 
functions that contain constrained intrinsics.

We will need to teach the inliner to enforce this rule if it isn't already 
doing so, but if things aren't correct coming out of the front end an incorrect 
optimization could already happen before we get to the inliner. We always rely 
on the front end producing IR with fully correct semantics.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66092/new/

https://reviews.llvm.org/D66092



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D65997: Add options rounding and exceptions to pragma fp

2019-08-15 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added inline comments.



Comment at: clang/docs/LanguageExtensions.rst:3127
 
 The pragma can take three values: ``on``, ``fast`` and ``off``.  The ``on``
 option is identical to using ``#pragma STDC FP_CONTRACT(ON)`` and it allows

This part of the documentation is ambiguous in light of the new options. I 
suppose "[t]he pragma" here refers to "#pragma clang fp contract" but the first 
paragraph refers to "#pragma clang fp" as "the pragma."



Comment at: clang/docs/LanguageExtensions.rst:3152
+rounding mode. This option is experimental; the compiler may ignore an explicit
+rounding mode in some situations.
+

You should say something about whether the rounding mode is applied or assumed. 
The rounding mode argument in the constrained FP intrinsics is assumed. The 
compiler is not required to do anything to force the hardware into that 
rounding mode. I think we want that behavior here as well. I believe this 
follows the semantics of the STDC FENV_ROUND pragmas that were introduced by 
ISO TS-18661.



Comment at: clang/docs/LanguageExtensions.rst:3159
+This option is experimental; the compiler may ignore an explicit exception
+behavior in some situations.
+

I'm not sure what this is supposed to mean. What are the circumstances under 
which the compiler may ignore an explicit exception behavior? Do you mean that 
it isn't supposed to happen but it might while the feature is under development?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D65997/new/

https://reviews.llvm.org/D65997



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D66092: [CodeGen] Generate constrained fp intrinsics depending on FPOptions

2019-08-15 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added a comment.

In D66092#1630997 , @sepavloff wrote:

> Replacement of floating point operations with constrained intrinsics seems 
> more an optimization helper then a semantic requirement. IR where constrained 
> operations are mixed with unconstrained is still valid in sense of IR 
> specification.


The thing that makes the IR semantically incomplete is that there is nothing 
there to prevent incorrect code motion of the non-constrained operations. 
Consider this case:

  if (someCondition) {
#pragma clang fp rounding(downward)
fesetround(FE_DOWNWARD);
x = y/z;
fesetround(FE_TONEAREST);
  }
  a = b/c;

If you generate a regular fdiv instruction for the 'a = b/c;' statement, there 
is nothing that would prevent it from being hoisted above the call to 
fesetround() and so it might be rounded incorrectly.

In D66092#1630997 , @sepavloff wrote:

> Another issue is non-standard rounding. It can be represented by constrained 
> intrinsics only. The rounding does not require restrictions on code motion, 
> so mixture of constrained and unconstrained operation is OK. Replacement of 
> all operations with constrained intrinsics would give poorly optimized code, 
> because compiler does not optimize them. It would be a bad thing if a user 
> adds the pragma to execute a statement with specific rounding mode and loses 
> optimization.


I agree that loss of optimization would be a bad thing, but I think it's 
unavoidable. By using non-default rounding modes the user is implicitly 
accepting some loss of optimization. This may be more than they would have 
expected, but I can't see any way around it.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66092/new/

https://reviews.llvm.org/D66092



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D65997: Add options rounding and exceptions to pragma fp

2019-08-16 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added inline comments.



Comment at: clang/docs/LanguageExtensions.rst:3152
+rounding mode. This option is experimental; the compiler may ignore an explicit
+rounding mode in some situations.
+

sepavloff wrote:
> andrew.w.kaylor wrote:
> > You should say something about whether the rounding mode is applied or 
> > assumed. The rounding mode argument in the constrained FP intrinsics is 
> > assumed. The compiler is not required to do anything to force the hardware 
> > into that rounding mode. I think we want that behavior here as well. I 
> > believe this follows the semantics of the STDC FENV_ROUND pragmas that were 
> > introduced by ISO TS-18661.
> Hm. I understood the proposal as if pragma FENV_ROUND is applied. If I am 
> wrong, and that pragma is only a hint to compiler, it is not suitable for our 
> purposes. We need a mean to generate constrained intrinsics from C/C++ code. 
> it would facilitate adaptation of LLVM passes for specific floating point 
> options, including rounding, exception behavior, FENV_ACCESS an so on. It 
> also would allow users to tune code generation. In this case `pragma 
> FENV_ROUND` is a different functionality, which should be developed 
> separately.
Sorry, I didn't mean to introduce confusion by bringing up FENV_ROUND, and 
after reading the description you linked on the other review I'm not sure it 
does what I was previously told it would anyway. Let's ignore that and just 
talk about what you're intending to do here.

If you only generate constrained intrinsics and do not generate an explicit 
rounding mode control change instruction (such as a call to fesetround) then 
the declared rounding mode may not actually be used. According to the 
definition in the LLVM Language Reference, when you set a rounding mode in a 
constrained FP intrinsic, the optimizer will assume that rounding mode is in 
effect and it may use it to perform transformations like constant folding. 
However, there is no requirement for the compiler to generate code (in response 
to the constrained intrinsic) to set the rounding mode. Architectures that 
encode the rounding mode in the instructions may use the rounding mode 
information (though the LLVM back end isn't currently well structured to enable 
that), but in cases where the rounding mode is controlled by processor state, 
the constrained intrinsic will not change it.

Hopefully that clarifies what I was asking about here. Do you intend for use of 
these pragmas to actually change the rounding mode or merely describe it? If 
the pragmas are intended to change the rounding mode, you will need to generate 
instructions to do that.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D65997/new/

https://reviews.llvm.org/D65997



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D66092: [CodeGen] Generate constrained fp intrinsics depending on FPOptions

2019-08-16 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added a comment.

In D66092#1632642 , @sepavloff wrote:

> - What is the issue with moving `a = b/c`? If it moves ahead of `if` 
> statement it seems OK, because the rounding mode is the same in that point. 
> It cannot be moved inside the block (where rounding mode is different) 
> because it breaks semantics.


It may be that the optimizer can prove that 'someCondition' is always true and 
it will eliminate the if statement and there is nothing to prevent the 
operation from migrating between the calls that change the rounding mode.

This is my main point -- "call i32 @fesetround" does not act as a barrier to an 
fdiv instruction (for example), but it does act as a barrier to a constrained 
FP intrinsic. It is not acceptable, for performance reasons in the general 
case, to have calls act as barriers to unconstrained FP operations. Therefore, 
to keep everything semantically correct, it is necessary to use constrained 
intrinsics in any function where the floating point environment may be changed.

I agree that impact on performance must be minimized, but this is necessary for 
correctness.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66092/new/

https://reviews.llvm.org/D66092



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D66092: [CodeGen] Generate constrained fp intrinsics depending on FPOptions

2019-08-16 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added a comment.

It took some digging, but I finally found the e-mail thread where we initially 
agreed that we can't mix constrained FP intrinsics and non-constrained FP 
operations within a function. Here it is: 
http://lists.llvm.org/pipermail/cfe-dev/2017-August/055325.html


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66092/new/

https://reviews.llvm.org/D66092



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D62731: [RFC] Add support for options -fp-model= and -fp-speculation= : specify floating point behavior

2019-08-16 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added inline comments.



Comment at: clang/lib/CodeGen/CodeGenFunction.cpp:126
+  case LangOptions::FPM_Precise:
+  case LangOptions::FPM_Fast:
+break;

mibintc wrote:
> mibintc wrote:
> > kpn wrote:
> > > Wait, so "fast" and "precise" are the same thing? That doesn't sound like 
> > > where the documentation you put in the ticket says "the compiler 
> > > preserves the source expression ordering and rounding properties of 
> > > floating-point".
> > > 
> > > (Yes, I saw below where "fast" turns on the fast math flags but "precise" 
> > > doesn't. That doesn't affect my point here.)
> > "precise" doesn't necessitate the use of Constrained Intrinsics, And 
> > likewise for "fast".   The words "compiler preserves the source expression 
> > ordering" were copied from the msdn documentation for /fp:precise as you 
> > explained it would be useful to have the msdn documentation for the option 
> > in case it goes offline in, say, 30 years.  The ICL Intel compiler also 
> > provides equivalent floating point options. The Intel documentation for 
> > precise is phrased differently "Disables optimizations that are not 
> > value-safe on floating-point data."  
> > 
> > fp-model=precise should enable contractions, if that's not true at default 
> > (I mean, clang -c) then this patch is missing that.
> > 
> > fp-model=fast is the same as requesting ffast-math 
> Well, we haven't heard from Andy yet, but he told me some time ago that 
> /fp:precise corresponds more or less (there was wiggle room) to clang's 
> default behavior.  It sounds like you think the description in the msdn of 
> /fp:precise isn't describing clang's default behavior, @kpn can you say more 
> about that, and do you think that ConstrainedIntrinsics should be created to 
> provide the semantics of /fp:precise? 
"Precise" means that no value unsafe optimizations will be performed. That's 
what LLVM does by default. As long as no fast math flags are set, we will not 
perform optimizations that are not value safe.


Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D62731/new/

https://reviews.llvm.org/D62731



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D64067: [X86][PPC] Support -mlong-double-64

2019-07-03 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added a comment.

In this review (https://reviews.llvm.org/D6260) @rsmith mentions that this 
should also have an effect on name mangling.

What will this do if the user calls a library function that expects a long 
double? What does gcc do in that case?


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64067/new/

https://reviews.llvm.org/D64067



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D64067: [X86][PPC] Support -mlong-double-64

2019-07-03 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added a comment.

In D64067#1568895 , @hfinkel wrote:

> One thing to realize about these flags is that they're ABI-altering flags. If 
> the user provides the flag to alter the platform defaults, this only works if 
> the user also ensures that matches the ABI of the relevant system libraries 
> that the compiler is using (e.g., because the user is explicitly linking with 
> a suitable libc).


Right. That's what I was trying to figure out. Are we relying on the users of 
this option not to shoot themselves in the foot? It sounds like yes. I believe 
that's the way we handled it in icc also, but gcc and icc do a lot of sketchy 
things that we wouldn't want to do in clang. In this case I don't object to the 
sketchiness, just so everyone realizes that's what we're doing.


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64067/new/

https://reviews.llvm.org/D64067



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D53157: Teach the IRBuilder about constrained fadd and friends

2019-05-20 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added inline comments.



Comment at: include/llvm/IR/IRBuilder.h:234
+  /// Set the exception handling to be used with constrained floating point
+  void setDefaultConstrainedExcept(MDNode *NewExcept) { 
+DefaultConstrainedExcept = NewExcept; 

I think it would be better to add some enumerated type to describe the 
exception semantic and rounding modes. The MDNode is an implementation detail 
and an awkward one for the front end to have to deal with.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D53157/new/

https://reviews.llvm.org/D53157



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69272: Restricted variant of '#pragma STDC FENV_ACCESS'

2019-10-21 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added a comment.

Thanks for the patch! I don't have time to review this in detail this week, but 
I'm very happy to see this functionality.




Comment at: clang/include/clang/Basic/DiagnosticSemaKinds.td:890
+  "cannot apply to inline functions, ignoring pragma">,
+   InGroup;
 

rjmccall wrote:
> What's the purpose of this restriction?  Whether `inline` really has much to 
> do with inlining depends a lot on the exact language settings.  (Also, even 
> if this restriction were okay, the diagnostic is quite bad given that there 
> are three separate conditions that can lead to it firing.)
> 
> Also, I thought we were adding instruction-level annotations for this kind of 
> thing to LLVM IR.  Was that not in service of implementing this pragma?
> 
> I'm not categorically opposed to taking patches that only partially implement 
> a feature, but I do want to feel confident that there's a reasonable 
> technical path forward to the full implementation.  In this case, it feels 
> like the function-level attribute is a dead end technically.
I'm guessing this is intended to avoid the optimization problems that would 
occur (currently) if a function with strictfp were inlined into a function 
without it. I'm just guessing though, so correct me if I'm wrong.

As I've said elsewhere, I hope this is a temporary problem. It is a real 
problem though (as is the fact that the inliner isn't currently handling this 
case correctly).

What would you think of a new command line option that caused us to mark 
functions with strictfp as noinline? We'd still need an error somewhat like 
this, but I feel like that would be more likely to accomplish what we want on a 
broad scale.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69272/new/

https://reviews.llvm.org/D69272



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D121410: Have cpu-specific variants set 'tune-cpu' as an optimization hint

2022-03-10 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added a comment.

This example illustrates the problem this patch intends to fix: 
https://godbolt.org/z/j445sxPMc

For Intel microarchitectures before Skylake, the LLVM cost model says that 
vector fsqrt is slow, so if fast-math is enabled, we'll use an approximation 
rather than the vsqrtps instruction when vectorizing a call to sqrtf(). If the 
code is compiled with -march=skylake or -mtune=skylake, we'll choose the 
vsqrtps instruction, but with any earlier base target, we'll choose the 
approximation even if there is a cpu_specific(skylake) implementation in the 
source code.

For example

  __attribute__((cpu_specific(skylake))) void foo(void) {
for (int i = 0; i < 8; ++i)
  x[i] = sqrtf(y[i]);
  }

compiles to

  foo.b:
  vmovaps ymm0, ymmword ptr [rip + y]
  vrsqrtpsymm1, ymm0
  vmulps  ymm2, ymm0, ymm1
  vbroadcastssymm3, dword ptr [rip + .LCPI2_0] # ymm3 = 
[-3.0E+0,-3.0E+0,-3.0E+0,-3.0E+0,-3.0E+0,-3.0E+0,-3.0E+0,-3.0E+0]
  vfmadd231ps ymm3, ymm2, ymm1# ymm3 = (ymm2 * ymm1) + ymm3
  vbroadcastssymm1, dword ptr [rip + .LCPI2_1] # ymm1 = 
[-5.0E-1,-5.0E-1,-5.0E-1,-5.0E-1,-5.0E-1,-5.0E-1,-5.0E-1,-5.0E-1]
  vmulps  ymm1, ymm2, ymm1
  vmulps  ymm1, ymm1, ymm3
  vbroadcastssymm2, dword ptr [rip + .LCPI2_2] # ymm2 = 
[NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN]
  vandps  ymm0, ymm0, ymm2
  vbroadcastssymm2, dword ptr [rip + .LCPI2_3] # ymm2 = 
[1.17549435E-38,1.17549435E-38,1.17549435E-38,1.17549435E-38,1.17549435E-38,1.17549435E-38,1.17549435E-38,1.17549435E-38]
  vcmplepsymm0, ymm2, ymm0
  vandps  ymm0, ymm0, ymm1
  vmovaps ymmword ptr [rip + x], ymm0
  vzeroupper
  ret

but it should compile to

  foo.b:
  vsqrtps ymm0, ymmword ptr [rip + y]
  vmovaps ymmword ptr [rip + x], ymm0
  vzeroupper
  ret


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D121410/new/

https://reviews.llvm.org/D121410

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D121410: Have cpu-specific variants set 'tune-cpu' as an optimization hint

2022-03-10 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added inline comments.



Comment at: clang/lib/CodeGen/CodeGenModule.cpp:2067
+  // favor this processor.
+  TuneCPU = SD->getCPUName(GD.getMultiVersionIndex())->getName();
+}

Unfortunately, I don't think it's this easy. The list of names used for 
cpu_specific doesn't come from the same place as the list of names used by 
"tune-cpu". For one thing, the cpu_specific names can't contain the '-' 
character, so we have names like "skylake_avx512" in cpu_specific that would 
need to be translated to "skylake-avx512" for "tune-cpu". I believe the list of 
valid names for "tune-cpu" comes from here: 
https://github.com/llvm/llvm-project/blob/26cd258420c774254cc48330b1f4d23d353baf05/llvm/lib/Support/X86TargetParser.cpp#L294

Also, some of the aliases supported by cpu_specific don't have any 
corresponding "tune-cpu" name. You happen to have picked one of these for the 
test. I believe "core_4th_gen_avx" should map to "haswell".



Comment at: clang/test/CodeGen/attr-cpuspecific-avx-abi.c:28
 // CHECK: attributes #[[V]] = 
{{.*}}"target-features"="+avx,+avx2,+bmi,+cmov,+crc32,+cx8,+f16c,+fma,+lzcnt,+mmx,+movbe,+popcnt,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+ssse3,+x87,+xsave"
+// CHECK-SAME: "tune-cpu"="core_4th_gen_avx"

As noted above, this isn't a valid setting for "tune-cpu". I think it would 
just be ignored.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D121410/new/

https://reviews.llvm.org/D121410

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D121410: Have cpu-specific variants set 'tune-cpu' as an optimization hint

2022-03-11 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor accepted this revision.
andrew.w.kaylor added a comment.
This revision is now accepted and ready to land.

This looks good to me. Thanks for the patch!


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D121410/new/

https://reviews.llvm.org/D121410

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D122155: Add warning when eval-method is set in the presence of value unsafe floating-point calculations.

2022-03-22 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added inline comments.



Comment at: clang/include/clang/Basic/DiagnosticFrontendKinds.td:50-53
+def warn_eval_method_setting_via_option_in_value_unsafe_context : Warning<
+"setting the eval method via '-ffp-eval-method' has not effect when 
numeric "
+"results of floating-point calculations aren't value-safe.">,
+InGroup;

zahiraam wrote:
> aaron.ballman wrote:
> > Unless you have a strong reason for this to be a warning, this seems like a 
> > situation we should diagnose as an error with a much clearer message.
> May  be @andrew.w.kaylor would weigh in on this?
I was going to say that for the command line option we could just issue a 
warning saying that the later option overrides the earlier, but it's a bit 
complicated to sort out what that would mean if the eval method follows a 
fast-math option and it might not always be what the user intended. So, I guess 
I'd agree that it should be an error.

For the case with pragmas, the model I'd follow is the mixing of #pragma 
float_control(except, on) with a fast-math mode or #pragma 
float_control(precise, off) with a non-ignore exception mode. In both those 
cases we issue an error.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122155/new/

https://reviews.llvm.org/D122155

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D62731: [RFC] Add support for options -fp-model= and -fp-speculation= : specify floating point behavior

2019-08-27 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added inline comments.



Comment at: clang/docs/UsersManual.rst:1305
+   and ``noexcept``. Note that -fp-model=[no]except can be combined with the
+   other three settings for this option. Details:
+

rjmccall wrote:
> mibintc wrote:
> > rjmccall wrote:
> > > Combined how?  With a comma?
> > > 
> > > This option seems to have two independent dimensions.  Is that necessary 
> > > for command-line compatibility with ICC, or can we separate it into two 
> > > options?
> > > 
> > > The documentation should mention the default behavior along both 
> > > dimensions.  Is it possible to override a prior instance of this option 
> > > to get this default behavior back?
> > > 
> > > You mention that this `-fp-model=fast` is equivalent to `-ffast-math`.  
> > > How does this option interact with that one if both are given on a 
> > > command line?
> > > 
> > > Please put option text in backticks wherever it appears.
> > > 
> > > Most of these comments apply to `-fp-speculation` as well.
> > > Combined how? With a comma?
> > 
> > > This option seems to have two independent dimensions. Is that necessary 
> > > for command-line compatibility with ICC, or can we separate it into two 
> > > options?
> > Yes that's right, there are 2 dimensions.  I wrote it like this for 
> > identical compatibility with icc, and cl.exe also defines the option this 
> > way, to specify multiple values simultaneously. However I think it would be 
> > reasonable and good to split them into separate options.  I will discuss 
> > this with the folks back home.
> > 
> > > The documentation should mention the default behavior along both 
> > > dimensions. 
> > I added this info into the doc
> > 
> > > Is it possible to override a prior instance of this option to get this 
> > > default behavior back?
> > The 3 values along one dimension, precise, strict, fast if they appear 
> > multiple times in the command line, the last value will be the setting 
> > along that dimension.  Ditto with the other dimension, the rightmost 
> > occurrence of except or noexcept will be the setting. 
> > 
> > > You mention that this -fp-model=fast is equivalent to -ffast-math. How 
> > > does this option interact with that one if both are given on a command 
> > > line?
> > The idea is that they are synonyms so if either or both appeared on the 
> > command line, the effect would be identical. 
> > 
> > I'll upload another patch with a few documentation updates and get back to 
> > you about splitting the fp-model option into multiple options.  (Longer 
> > term, there are 2 other dimensions to fp-model)
> > 
> > And thanks for the review
> > Yes that's right, there are 2 dimensions. I wrote it like this for 
> > identical compatibility with icc, and cl.exe also defines the option this 
> > way, to specify multiple values simultaneously. However I think it would be 
> > reasonable and good to split them into separate options. I will discuss 
> > this with the folks back home.
> 
> Okay.  There's certainly some value in imitating existing compilers, but it 
> sounds like a lot has been forced into one option, so maybe we should take 
> the opportunity to split it up.  If we do split it, though, I think the 
> different dimensions should have different base spellings, rather than being 
> repeated uses of `-fp-model`.
> 
> > The 3 values along one dimension, precise, strict, fast if they appear 
> > multiple times in the command line, the last value will be the setting 
> > along that dimension.
> 
> Okay.  This wasn't clear to me from the code, since the code also has an 
> "off" option.
> 
> > The idea is that they are synonyms so if either or both appeared on the 
> > command line, the effect would be identical.
> 
> Right, but compiler options are allowed to conflict with each other, with the 
> general rule being that the last option "wins".  So what I'm asking is if 
> that works correctly with this option and `-ffast-math`, so that e.g. 
> `-ffast-math -fp-model=strict` leaves you with strict FP but 
> `-fp-model=strict -ffast-math` leaves you with fast FP.  (That is another 
> reason why it's best to have one aspect settled in each option: because you 
> don't have to merge information from different uses of the option.)
> 
> At any rate, the documentation should be clear about how this interacts with 
> `-ffast-math`.  You might even consider merging this into the documentation 
> for `-ffast-math`, or at least revising that option's documentation.  Does 
> `-fp-model=fast` cause `__FAST_MATH__` to be defined?
> 
> Also, strictly speaking, this should be `-ffp-model`, right?
I think the ICC interface includes the exception option for 
compatibility/consistency with Microsoft's  /fp option. We can handle that in 
clang-cl. So, I agree that it makes sense to split that out in clang.

ICC's implementation of this actually has four dimensions, only two of which 
are being taken on here. Frankly, I think it's a bit 

[PATCH] D62731: [RFC] Add support for options -fp-model= and -fp-speculation= : specify floating point behavior

2019-08-27 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added inline comments.



Comment at: clang/docs/UsersManual.rst:1309
+
+   * ``precise`` Disables optimizations that are not value-safe on 
floating-point data, although FP contraction (FMA) is enabled.
+   * ``strict`` Enables precise and except, and disables contractions (FMA).

There's a bit of ambiguity here because FP contraction isn't an on/off switch 
in LLVM. It has three settings: on, off, and fast. What you've done in this 
patch sets it to 'on' for precise, 'off' for strict, and 'fast' for fast. That 
sounds reasonable, but it's not what ICC and MSVC do. ICC and MSVC both have a 
behavior equivalent to -ffp-contract=fast in the precise model.

The idea behind this is that FMA operations are actually more precise than the 
non-contracted operations. They don't always give the same result, but they 
give a more precise result. The problem with this is that if we adopt this 
approach it leaves us with no fp model that corresponds to the default compiler 
behavior if you don't specify an -fp-model  at all.


Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D62731/new/

https://reviews.llvm.org/D62731



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc

2017-08-22 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor created this revision.

This patch adds a hack to clang's emitPointerArithmetic() function to get it to 
emit an inttoptr instruction rather than a GEP with a null base pointer when 
the 'p = nullptr + n' idiom is used to convert a pointer-sized integer to a 
pointer.  This idiom is used by some versions of glibc and gcc.

This was previously discussed here: 
http://lists.llvm.org/pipermail/llvm-dev/2017-July/115053.html


https://reviews.llvm.org/D37042

Files:
  lib/CodeGen/CGExprScalar.cpp
  test/CodeGen/nullptr-arithmetic.c


Index: lib/CodeGen/CGExprScalar.cpp
===
--- lib/CodeGen/CGExprScalar.cpp
+++ lib/CodeGen/CGExprScalar.cpp
@@ -2671,6 +2671,37 @@
   unsigned width = cast(index->getType())->getBitWidth();
   auto &DL = CGF.CGM.getDataLayout();
   auto PtrTy = cast(pointer->getType());
+
+  // Some versions of glibc and gcc use idioms (particularly in their malloc
+  // routines) that add a pointer-sized integer (known to be a pointer value)
+  // to a null pointer in order to cast the value back to an integer or as
+  // part of a pointer alignment algorithm.  This is undefined behavior, but
+  // we'd like to be able to compile programs that use it.
+  //
+  // Normally, we'd generate a GEP with a null-pointer base here in response
+  // to that code, but it's also UB to dereference a pointer created that
+  // way.  Instead (as an acknowledged hack to tolerate the idiom) we will
+  // generate a direct cast of the integer value to a pointer.
+  //
+  // The idiom (p = nullptr + N) is not met if any of the following are true:
+  //
+  //   The operation is subtraction.
+  //   The index is not pointer-sized.
+  //   The pointer type is not byte-sized.
+  //   The index operand is a constant.
+  //
+  if (isa(pointer) && !isSubtraction && 
+  (width == DL.getTypeSizeInBits(PtrTy)) && 
+  !isa(index)) {
+// The pointer type might come back as null, so it's deferred until here.
+const PointerType *pointerType 
+  = pointerOperand->getType()->getAs();
+if (pointerType && pointerType->getPointeeType()->isCharType()) { 
+  // (nullptr + N) -> inttoptr N to 
+  return CGF.Builder.CreateIntToPtr(index, pointer->getType());
+}
+  }
+
   if (width != DL.getTypeSizeInBits(PtrTy)) {
 // Zero-extend or sign-extend the pointer value according to
 // whether the index is signed or not.
Index: test/CodeGen/nullptr-arithmetic.c
===
--- test/CodeGen/nullptr-arithmetic.c
+++ test/CodeGen/nullptr-arithmetic.c
@@ -0,0 +1,51 @@
+// RUN: %clang_cc1 -S %s -emit-llvm -o - | FileCheck %s
+
+#include 
+
+// This test is meant to verify code that handles the 'p = nullptr + n' idiom
+// used by some versions of glibc and gcc.  This is undefined behavior but
+// it is intended there to act like a conversion from a pointer-sized integer
+// to a pointer, and we would like to tolerate that.
+
+#define NULLPTRI8 ((int8_t*)0)
+
+// This should get the inttoptr instruction.
+int8_t *test1(intptr_t n) {
+  return NULLPTRI8 + n;
+}
+// CHECK-LABEL: test1
+// CHECK: inttoptr
+// CHECK-NOT: getelementptr
+
+// This doesn't meet the idiom because the offset type isn't pointer-sized.
+int8_t *test2(int16_t n) {
+  return NULLPTRI8 + n;
+}
+// CHECK-LABEL: test2
+// CHECK: getelementptr
+// CHECK-NOT: inttoptr
+
+// This doesn't meet the idiom because the offset is constant.
+int8_t *test3() {
+  return NULLPTRI8 + 16;
+}
+// CHECK-LABEL: test3
+// CHECK: getelementptr
+// CHECK-NOT: inttoptr
+
+// This doesn't meet the idiom because the element type is larger than a byte.
+int16_t *test4(intptr_t n) {
+  return (int16_t*)0 + n;
+}
+// CHECK-LABEL: test4
+// CHECK: getelementptr
+// CHECK-NOT: inttoptr
+
+// This doesn't meet the idiom because the offset is subtracted.
+int8_t* test5(intptr_t n) {
+  return NULLPTRI8 - n;
+}
+// CHECK-LABEL: test5
+// CHECK: getelementptr
+// CHECK-NOT: inttoptr
+


Index: lib/CodeGen/CGExprScalar.cpp
===
--- lib/CodeGen/CGExprScalar.cpp
+++ lib/CodeGen/CGExprScalar.cpp
@@ -2671,6 +2671,37 @@
   unsigned width = cast(index->getType())->getBitWidth();
   auto &DL = CGF.CGM.getDataLayout();
   auto PtrTy = cast(pointer->getType());
+
+  // Some versions of glibc and gcc use idioms (particularly in their malloc
+  // routines) that add a pointer-sized integer (known to be a pointer value)
+  // to a null pointer in order to cast the value back to an integer or as
+  // part of a pointer alignment algorithm.  This is undefined behavior, but
+  // we'd like to be able to compile programs that use it.
+  //
+  // Normally, we'd generate a GEP with a null-pointer base here in response
+  // to that code, but it's also UB to dereference a pointer created that
+  // way.  Instead (as an acknowledged hack to tolerate the idiom) we will
+  // generate a direct cast of

[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc

2017-08-22 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added a comment.

I'm not sure why the svn attributes got attached to the file I added.  I'll 
remove them before committing.


https://reviews.llvm.org/D37042



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc

2017-08-23 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added a comment.

My intention here was to make this as strict/limited as possible while still 
handling the cases of interest.  There are two different implementations I want 
to handle.  You can see the first variation in the __BPTR_ALIGN definition 
being added here:

https://github.com/lattera/glibc/commit/2fd4de4b15a66f821057af90714145d2c034a609#diff-cd0c2b2693d632ad8843ae58a6ea7ffaR125

You can see the other in the __INT_TO_PTR definition that was being deleted in 
the same change set here:

https://github.com/lattera/glibc/commit/2fd4de4b15a66f821057af90714145d2c034a609#diff-cd0c2b2693d632ad8843ae58a6ea7ffaL122

This second case shows up in some benchmark code, so it's important even though 
glibc phased it out.

I'm really not sure what this looks like in the front end, but it seems like it 
could be relatively open-ended as to where the value came from.

Basically, my intention is to eliminate anything I know isn't what I'm looking 
for and then handle whatever remains being added to a null pointer.  Given that 
adding to a null pointer is undefined behavior, whatever we do should be 
acceptable in all cases.  What I found in practice was that even with the 
existing handling of this it took an awful lot of optimizations before the 
null-based GEP and the load associated with it got close enough together for us 
to recognize that we could optimize it away, so I don't think we'd be losing 
much by tolerating additional cases in the way this patch proposes.


https://reviews.llvm.org/D37042



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69878: Consoldiate internal denormal flushing controls

2020-01-06 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added inline comments.



Comment at: llvm/docs/LangRef.rst:1837
+   are present, this overrides ``"denormal-fp-math"``. Not all targets
+   support separately setting the denormal mode per type.
+

Can you document which targets do support the option? What happens if I try to 
use the option on a target where it is not supported?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69878/new/

https://reviews.llvm.org/D69878



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69878: Consoldiate internal denormal flushing controls

2020-01-06 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added inline comments.



Comment at: llvm/docs/LangRef.rst:1837
+   are present, this overrides ``"denormal-fp-math"``. Not all targets
+   support separately setting the denormal mode per type.
+

arsenm wrote:
> andrew.w.kaylor wrote:
> > Can you document which targets do support the option? What happens if I try 
> > to use the option on a target where it is not supported?
> I'm not sure where to document this, or if/how/where to diagnose it. I don't 
> think the high level LangRef description is the right place to discuss 
> specific target handling.
> 
> Currently it won't error or anything. Code checking the denorm mode will see 
> the f32 specific mode, even if the target in the end isn't really going to 
> respect this.
> 
> One problem is this potentially does require coordination with other 
> toolchain components. For AMDGPU, the compiler can directly tell the driver 
> what FP mode to set on each entry point, but for x86 it requires linking in 
> crtfastmath to set the default mode bits. If another target had a similar 
> runtime environment requirement, I don't think we can be sure the attribute 
> is correct or not.
There is precedent for describing target-specific behavior in LangRef. It just 
doesn't seem useful to say that not all targets support the attribute without 
saying which ones do. We should also say what is expected if a target doesn't 
support the attribute. It seems reasonable for the function attribute to be 
silently ignored.

> One problem is this potentially does require coordination with other 
> toolchain components. For AMDGPU, the compiler can directly tell the driver 
> what FP mode to set on each entry point, but for x86 it requires linking in 
> crtfastmath to set the default mode bits.

This is a point I'm interested in. I don't like the current crtfastmath.o 
handling. It feels almost accidental when FTZ works as expected. My 
understanding is we link crtfastmath.o if we find it but if not everything just 
goes about its business. The Intel compiler injects code into main() to 
explicitly set the FTZ/DAZ control modes. That obviously has problems too, but 
it's at least consistent and predictable. As I understand it, crtfastmath.o 
sets these modes from a static initializer, but I'm not sure anything is done 
to determine the order of that initializer relative to others.

How does the compiler identify entry points for AMDGPU? And does it emit code 
to set FTZ based on the function attribute here?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69878/new/

https://reviews.llvm.org/D69878



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69878: Consoldiate internal denormal flushing controls

2020-01-09 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added inline comments.



Comment at: llvm/docs/LangRef.rst:1837
+   are present, this overrides ``"denormal-fp-math"``. Not all targets
+   support separately setting the denormal mode per type.
+

arsenm wrote:
> andrew.w.kaylor wrote:
> > arsenm wrote:
> > > andrew.w.kaylor wrote:
> > > > Can you document which targets do support the option? What happens if I 
> > > > try to use the option on a target where it is not supported?
> > > I'm not sure where to document this, or if/how/where to diagnose it. I 
> > > don't think the high level LangRef description is the right place to 
> > > discuss specific target handling.
> > > 
> > > Currently it won't error or anything. Code checking the denorm mode will 
> > > see the f32 specific mode, even if the target in the end isn't really 
> > > going to respect this.
> > > 
> > > One problem is this potentially does require coordination with other 
> > > toolchain components. For AMDGPU, the compiler can directly tell the 
> > > driver what FP mode to set on each entry point, but for x86 it requires 
> > > linking in crtfastmath to set the default mode bits. If another target 
> > > had a similar runtime environment requirement, I don't think we can be 
> > > sure the attribute is correct or not.
> > There is precedent for describing target-specific behavior in LangRef. It 
> > just doesn't seem useful to say that not all targets support the attribute 
> > without saying which ones do. We should also say what is expected if a 
> > target doesn't support the attribute. It seems reasonable for the function 
> > attribute to be silently ignored.
> > 
> > > One problem is this potentially does require coordination with other 
> > > toolchain components. For AMDGPU, the compiler can directly tell the 
> > > driver what FP mode to set on each entry point, but for x86 it requires 
> > > linking in crtfastmath to set the default mode bits.
> > 
> > This is a point I'm interested in. I don't like the current crtfastmath.o 
> > handling. It feels almost accidental when FTZ works as expected. My 
> > understanding is we link crtfastmath.o if we find it but if not everything 
> > just goes about its business. The Intel compiler injects code into main() 
> > to explicitly set the FTZ/DAZ control modes. That obviously has problems 
> > too, but it's at least consistent and predictable. As I understand it, 
> > crtfastmath.o sets these modes from a static initializer, but I'm not sure 
> > anything is done to determine the order of that initializer relative to 
> > others.
> > 
> > How does the compiler identify entry points for AMDGPU? And does it emit 
> > code to set FTZ based on the function attribute here?
> The entry points are a specific calling convention. There's no real concept 
> of main. Each kernel has an associated blob of metadata the driver uses to 
> set up various config registers on dispatch.
> 
> I don't think specially recognizing main in the compiler is fundamentally 
> different than having it done in a static constructor. It's still a construct 
> not associated with any particular function or anything.
The problem with having it done in a static constructor is that you have no 
certainty of when it will be done relative to other static constructors. If 
it's in main you can at least say that it's after all the static constructors 
(assuming main is your entry point).


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69878/new/

https://reviews.llvm.org/D69878



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69878: Consoldiate internal denormal flushing controls

2020-01-10 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added inline comments.



Comment at: llvm/docs/LangRef.rst:1837
+   are present, this overrides ``"denormal-fp-math"``. Not all targets
+   support separately setting the denormal mode per type.
+

cameron.mcinally wrote:
> andrew.w.kaylor wrote:
> > arsenm wrote:
> > > andrew.w.kaylor wrote:
> > > > arsenm wrote:
> > > > > andrew.w.kaylor wrote:
> > > > > > Can you document which targets do support the option? What happens 
> > > > > > if I try to use the option on a target where it is not supported?
> > > > > I'm not sure where to document this, or if/how/where to diagnose it. 
> > > > > I don't think the high level LangRef description is the right place 
> > > > > to discuss specific target handling.
> > > > > 
> > > > > Currently it won't error or anything. Code checking the denorm mode 
> > > > > will see the f32 specific mode, even if the target in the end isn't 
> > > > > really going to respect this.
> > > > > 
> > > > > One problem is this potentially does require coordination with other 
> > > > > toolchain components. For AMDGPU, the compiler can directly tell the 
> > > > > driver what FP mode to set on each entry point, but for x86 it 
> > > > > requires linking in crtfastmath to set the default mode bits. If 
> > > > > another target had a similar runtime environment requirement, I don't 
> > > > > think we can be sure the attribute is correct or not.
> > > > There is precedent for describing target-specific behavior in LangRef. 
> > > > It just doesn't seem useful to say that not all targets support the 
> > > > attribute without saying which ones do. We should also say what is 
> > > > expected if a target doesn't support the attribute. It seems reasonable 
> > > > for the function attribute to be silently ignored.
> > > > 
> > > > > One problem is this potentially does require coordination with other 
> > > > > toolchain components. For AMDGPU, the compiler can directly tell the 
> > > > > driver what FP mode to set on each entry point, but for x86 it 
> > > > > requires linking in crtfastmath to set the default mode bits.
> > > > 
> > > > This is a point I'm interested in. I don't like the current 
> > > > crtfastmath.o handling. It feels almost accidental when FTZ works as 
> > > > expected. My understanding is we link crtfastmath.o if we find it but 
> > > > if not everything just goes about its business. The Intel compiler 
> > > > injects code into main() to explicitly set the FTZ/DAZ control modes. 
> > > > That obviously has problems too, but it's at least consistent and 
> > > > predictable. As I understand it, crtfastmath.o sets these modes from a 
> > > > static initializer, but I'm not sure anything is done to determine the 
> > > > order of that initializer relative to others.
> > > > 
> > > > How does the compiler identify entry points for AMDGPU? And does it 
> > > > emit code to set FTZ based on the function attribute here?
> > > The entry points are a specific calling convention. There's no real 
> > > concept of main. Each kernel has an associated blob of metadata the 
> > > driver uses to set up various config registers on dispatch.
> > > 
> > > I don't think specially recognizing main in the compiler is fundamentally 
> > > different than having it done in a static constructor. It's still a 
> > > construct not associated with any particular function or anything.
> > The problem with having it done in a static constructor is that you have no 
> > certainty of when it will be done relative to other static constructors. If 
> > it's in main you can at least say that it's after all the static 
> > constructors (assuming main is your entry point).
> Yes and no. The linker should honor static constructor priorities. But, yeah, 
> there's no guarantee that this constructor will run before other priority 101 
> constructors.
> 
> The performance penalty for setting denormal flushing in main could be 
> significant (think C++). Also, there's precedent for using static 
> constructors, like GCC's crtfastmath.o.
Fair enough. I don't necessarily like how icc handles this. I don't have a 
problem with how gcc handles it. I just really don't like how LLVM does it. If 
we want to take the static constructor approach we should define our own, not 
depend on whether or not the GNU object file happens to be around.

Static initialization doesn't help for AMDGPU, and I suppose that's likely to 
be the case for any offload execution model. Since this patch is moving us 
toward a more consistent implementation I'm wondering if we can define some 
general rules for how this is supposed to work. Like when the function 
attribute will result in injected instructions setting the control flags and 
when it won't.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69878/new/

https://reviews.llvm.org/D69878



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mail

[PATCH] D74436: Change clang option -ffp-model=precise to select ffp-contract=on

2021-08-05 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added a comment.

FWIW, fp-contract=on has been the documented default for clang since version 5.

https://releases.llvm.org/5.0.1/tools/clang/docs/ClangCommandLineReference.html#cmdoption-clang-ffp-contract

This change just brought the behavior into conformance with the documentation.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74436/new/

https://reviews.llvm.org/D74436

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D112094: Add support for floating-point option `ffp-eval-method` and for `pragma clang fp eval_method`.

2021-11-01 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added a comment.
Herald added a subscriber: luke957.

I agree that the pragma should also control the evaluation of constant 
expressions.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D112094/new/

https://reviews.llvm.org/D112094

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D107994: Making the code compliant to the documentation about Floating Point support default values for C/C++. FPP-MODEL=PRECISE enables FFP-CONTRACT (FMA is enabled).

2021-11-09 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added inline comments.



Comment at: clang/test/CodeGen/ffp-model.c:4
+// RUN: | FileCheck %s \
+// RUN: --check-prefixes=CHECK,CHECK-FAST --strict-whitespace
+

Why did you add the strict-whitespace option?



Comment at: clang/test/CodeGen/ffp-model.c:10
+
+// RUN: %clang -S -emit-llvm -ffp-model=strict %s -o - \
+// RUN: | FileCheck %s \

This is still going to have problems with targets that don't support strictfp 
unless you add an explicit target to the run line.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D107994/new/

https://reviews.llvm.org/D107994

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D107994: Making the code compliant to the documentation about Floating Point support default values for C/C++. FPP-MODEL=PRECISE enables FFP-CONTRACT (FMA is enabled).

2021-11-10 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor accepted this revision.
andrew.w.kaylor added a comment.

lgtm


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D107994/new/

https://reviews.llvm.org/D107994

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D107994: Making the code compliant to the documentation about Floating Point support default values for C/C++. FPP-MODEL=PRECISE enables FFP-CONTRACT (FMA is enabled).

2021-11-11 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor closed this revision.
andrew.w.kaylor added a comment.

Closed by commit by rGf04e387055e495e3e14570087d68e93593cf2918 



CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D107994/new/

https://reviews.llvm.org/D107994

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D114564: Fix the use of -fno-approx-func along with -Ofast or -ffast-math

2021-11-24 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added a comment.

Thanks for the patch! This looks mostly good. I have just a few suggestions.

Could you add test cases in clang/test/Driver/clang_f_opts.c to verify that the 
various driver inputs get overridden in the expected way? Without such a test, 
this behavior is likely to be broken in the future.




Comment at: clang/lib/Driver/ToolChains/Clang.cpp:2760
 case options::OPT_fno_honor_nans:   HonorNaNs = false;break;
 case options::OPT_fapprox_func: ApproxFunc = true;break;
 case options::OPT_fno_approx_func:  ApproxFunc = false;   break;

Should this also imply "MathErrno = false"?



Comment at: clang/lib/Driver/ToolChains/Clang.cpp:2762
 case options::OPT_fno_approx_func:  ApproxFunc = false;   break;
 case options::OPT_fmath_errno:  MathErrno = true; break;
 case options::OPT_fno_math_errno:   MathErrno = false;break;

Should this conflict with -fapprox-func?



Comment at: clang/lib/Driver/ToolChains/Clang.cpp:2880
   break;
 case options::OPT_fno_unsafe_math_optimizations:
   AssociativeMath = false;

I think you need "AppoxFunc = false" here.



Comment at: clang/lib/Driver/ToolChains/Clang.cpp:2937
   // subsequent options conflict then emit warning diagnostic.
   if (HonorINFs && HonorNaNs && !AssociativeMath && !ReciprocalMath &&
   SignedZeros && TrappingMath && RoundingFPMath &&

You need a check for "!ApproxFunc" here.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D114564/new/

https://reviews.llvm.org/D114564

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D136786: Fix `unsafe-fp-math` attribute emission.

2022-10-28 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added inline comments.



Comment at: clang/lib/CodeGen/CGCall.cpp:1865
   FuncAttrs.addAttribute("approx-func-fp-math", "true");
-if ((LangOpts.FastMath ||
- (!LangOpts.FastMath && LangOpts.AllowFPReassoc &&
-  LangOpts.AllowRecip && !LangOpts.FiniteMathOnly &&
-  LangOpts.NoSignedZero && LangOpts.ApproxFunc)) &&
-LangOpts.getDefaultFPContractMode() != 
LangOptions::FPModeKind::FPM_Off)
+if (LangOpts.UnsafeFPMath &&
+(LangOpts.getDefaultFPContractMode() ==

michele.scandale wrote:
> michele.scandale wrote:
> > zahiraam wrote:
> > > zahiraam wrote:
> > > > michele.scandale wrote:
> > > > > michele.scandale wrote:
> > > > > > I've found quite confusing the `(FastMath || (!FastMath && ... ))`.
> > > > > > 
> > > > > > Using directly `UnsafeFPMath` seems more compact, however it also 
> > > > > > causes to taking into account the value for `MathErrno` -- which it 
> > > > > > might be not relevant.
> > > > > > 
> > > > > > If `MathErrno` shouldn't affect this, then I would rewrite this as:
> > > > > > ```
> > > > > > if (LangOpts.AllowFPReassoc && LangOpts.AllowRecip &&
> > > > > >  LangOpts.NoSignedZero && LangOpts.ApproxFunc &&
> > > > > >  (LangOpts.getDefaultFPContractMode() ==
> > > > > >  LangOptions::FPModeKind::FPM_Fast ||
> > > > > >  LangOpts.getDefaultFPContractMode() ==
> > > > > >  LangOptions::FPModeKind::FPM_FastHonorPragmas))
> > > > > >   FuncAttrs.addAttribute("unsafe-fp-math", "true");
> > > > > > ```
> > > > > > so that only the relevant options are considered and there is no 
> > > > > > need to think about what is implied by `FastMath`.
> > > > > I do wonder if in 
> > > > > https://github.com/llvm/llvm-project/blob/main/clang/lib/Driver/ToolChains/Clang.cpp#L3089-L3091
> > > > >  is actually correct to have `!MathErrno`. In the GCC documentation 
> > > > > of `-funsafe-math-optimizations` I don't see any connection to the 
> > > > > `-fmath-errno` or `-fno-math-errno` options.
> > > > Interesting! Using on the command line 'funsafe-math-optimization' 
> > > > implies 'math-errno'. But 'funsafe-math-optimizations' is triggered 
> > > > when this is true:
> > > >  !MathErrno && AssociativeMath && ReciprocalMath && !SignedZeros && 
> > > >  ApproxFunc && !TrappingMath
> > > > 
> > > > See here: 
> > > > https://github.com/llvm/llvm-project/blob/main/clang/lib/Driver/ToolChains/Clang.cpp#L3089
> > > >  
> > > When the ‘funsafe-math-optimizations’ is used on the command line 
> > > LangOpts.UnsafeFP is ‘0’. 
> > > The function attribute ‘funnsafe-fp-math’ is not set. Is that what’s 
> > > wanted? I would have thought that we want the attribute enabled for 
> > > unsafe mode.
> > > 
> > > If we call 
> > > SpecialOperations = AllowFPReassoc && NoSignedZero && AllowRecip && 
> > > ApproxFunc && !RoundingMath.
> > > 
> > > ‘ffast-math’ = !MathErrno && FiniteMathOnly && SpecialOperations
> > > ‘funsafe-math-optimizations’ = MathErrno && !FiniteMathOnly && 
> > > SpecialOperations 
> > > (it's true that there is no mention of MathErrno in the GCC doc, but the 
> > > default is fmath-errno so presumably using this option on the command 
> > > line implies MathErrno).
> > > 
> > > 
> > > The function attribute UnsafeFPMath is enabled for 
> > > (‘ffast-math’ || ‘funsafe-math-optimization’). 
> > > That will lead to this condition: 
> > > (SpecialOperations  && MathErrNo && "-ffast-math" && -ffp-contract=fast") 
> > > || (SpecialOperations && "-fmath-errno" && "-ffp-contract=on") .
> > > 
> > > 
> > The compiler driver has a default value for `MathErrno` which depends on 
> > the toolchain properties. When `-funsafe-math-optimizations` is processed 
> > `MathErrno` is not affected, but then the compiler driver specify 
> > `-funsafe-math-optimizations` to the CC1 command line only if `MathErrno` 
> > is false.
> > The way the compiler driver mutate the floating point options state when 
> > processing `-funsafe-math-optimizations` seems to match the GCC 
> > documentation, but then the condition for the forwarding expect something 
> > more.
> > It is not clear to me if the intention is to match GCC documented behavior, 
> > or if instead the Clang definition of `-funsafe-math-optimizations` implies 
> > `-fno-math-errno`.
> >When the ‘funsafe-math-optimizations’ is used on the command line 
> >LangOpts.UnsafeFP is ‘0’. 
> The function attribute ‘funnsafe-fp-math’ is not set. Is that what’s wanted? 
> I would have thought that we want the attribute enabled for unsafe mode.
> 
> I believe this is caused by `MathErrno` state in the compiler driver being 
> not affected when processing `-funsafe-math-optimizations`, but considered 
> for the forwarding to the CC1.
> 
> >If we call 
> >SpecialOperations = AllowFPReassoc && NoSignedZero && AllowRecip && 
> >ApproxFunc && !RoundingMath.
> >
> >‘ffast-math’ = !MathErrno && FiniteMathOnly && SpecialOperations
> >‘funsafe-math-o

[PATCH] D136786: Fix `unsafe-fp-math` attribute emission.

2022-10-28 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added a comment.

In D136786#3892177 , @zahiraam wrote:

>>> I'm not following entirely, but -funsafe-math-optimizations is just a 
>>> subset of -ffast-math, so checking only the properties that contribute to 
>>> -funsafe-math-optimizations should be enough. 
>>>  I think it is way simpler to reason in these terms than enumerating all 
>>> the possible scenarios.
>
> Exactly my point Since -funsafe-math-optimizations is a subset of 
> -ffast-math, shouldn't it be the "minimal" option to trigger the 
> unsafe-fp-math attribute for functions? 
> I agree with the second part of the condition; it should be
> updated with (LangOpts.getDefaultFPContractMode() ==  
> LangOptions::FPModeKind::FPM_Fast ||   LangOpts.getDefaultFPContractMode() == 
> LangOptions::FPModeKind::FPM_FastHonorPragmas) 
> but I am still hesitant about the change you are proposing for the first part 
> of the condition.
> @andrew.w.kaylor Can you please weigh in on this?

I talked to Zahira about this offline, and the current state is even messier 
than I realized. I think we need to start by making sure we agree on the 
behavior we want and then figure out how to get there. There are some messy 
complications in the different ways things are handled in the driver, the front 
end, and the backend. There are more overlapping settings than I would like, 
but I guess it's best to look for the best way we can incrementally improve 
that situation.

As a first principle, I'd like to clarify a "big picture" item that I was 
trying to get at in my earlier comment. I'd like to be able to reason about 
this from the starting point of individual semantic modes. There is a set of 
modes defined in the clang user's manual 
(https://clang.llvm.org/docs/UsersManual.html#controlling-floating-point-behavior).
 I think this is reasonably complete:

  exception_behavior
  fenv_access
  rounding_mode
  fp-contract
  denormal_fp_math
  denormal_fp_math_32
  support_math_errno
  no_honor_nans
  no_honor_infinities
  no_signed_zeros
  allow_reciprocal
  allow_approximate_fns
  allow_reassociation

These are the modes from clang's perspective. They get communicated to the back 
end in different ways. The backend handling of this isn't nearly as consistent 
as I'd like, but that's a different problem. I think these basic modes can be 
thought of as language-independent and should be used as the basic building 
blocks for reasoning about floating point behavior across all LLVM-based 
compilers.

On top of these modes, we have two concepts "fast-math" and 
"unsafe-math-optimizations" which are essentially composites of one or more of 
the above semantic modes. I say "essentially" because up to this point there 
has been some fuzzy handling of that. I'd like to say they are absolutely 
composites of the above modes, and I think we can make it so, starting here.

If we agree that fast-math and unsafe-math-optimizations/unsafe-fp-math are 
composites of some of the other modes, then we can apply a symmetry property 
that I think will be helpful in the problem we're trying to solve here and will 
lead to better reasoning about these settings in the future.

I am proposing that passing "-ffast-math" should have *exactly* the same 
effects as passing all of the individual command line options that are implied 
by -ffast-math. Likewise, passing "-funsafe-math-optimizations" should have 
*exactly* the same effects as passing all of the individual options that are 
implied by -funsafe-math-optimizations. And, of course, any combination of 
options that turn various states on and off can be strung together and we can 
just evaluate the final state those options leave us in to see if we are in the 
"fast-math" state or the "unsafe-fp-math" state.

I'm going to ignore fast-math right now, because I think the current handling 
is mostly OK, and this review is about unsafe-fp-math. The unsafe-fp-math case 
is a little simpler in concept, but I think we've got more problems with it.

Another fundamental principle I'd like to declare here is that 
LangOpts.UnsafeFPMath, TargetOptions.UnsafeFPMath, and the "unsafe-fp-math" 
function attribute should all mean exactly the same thing. The function 
attribute has a different scope, so it might not always have the same value as 
the other two, but it should at least mean the same thing.

My understanding is this:

  unsafe-fp-math = exception_behavior(ignore) +
   fenv_access(off) +
   no_signed_zeros(on) +
   allow_reciprocal(on) +
   allow_approximate_fns(on) +
   allow_reassociation(on) +
   fp_contract(fast)

The first two of those settings are the default behavior. The others can be 
turned on by individual command line options. If all of those semantic modes 
are in the states above, then we are in "unsafe-fp-math" mode. That's my 
proposal.

There are a couple of things we need

[PATCH] D136786: Fix `unsafe-fp-math` attribute emission.

2022-10-31 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added a comment.

In D136786#3893559 , 
@michele.scandale wrote:

> The fact that `"unsafe-fp-math"="true"` implies `-ffp-contract=fast` is quite 
> unfortunate, and it is something that in principle should be addressed.
> My understanding is that the floating point math function attributes are a 
> quite old concept that predates the per-instruction fast-math flags. Ideally 
> we should get rid of the flooding point math function attribute since we do 
> have a finer grain representation.
> A while back the main issue was that all the backends code (e.g. DAGCombiner) 
> were only using the `TargetOptions` properties (hence the function 
> attributes) to control the floating point related transformations, hence 
> there would have been regressions.

Yes, the function attributes are a vestige of the time before the fast-math 
flags were introduced, but the use of them hasn't been entirely eliminated and 
I think there are a couple of cases where some people think it's necessary. I 
posted an RFC and a patch about a year ago to start cleaning this up, but I got 
pulled away before it landed and honestly I had forgotten about it. 
https://discourse.llvm.org/t/rfc-eliminating-non-ir-floating-point-controls-in-the-selection-dag/59173

> At high level, I think that we need to understand how important is to match 
> GCC behavior in this particular case. We can change the way Clang defines 
> `-funsafe-math-optimizations` to imply `-ffp-contract=fast`, but that seems 
> the wrong solution as it feels like promoting a bug to a feature.

I wouldn't consider the fact that unsafe-fp-math (as a concept) implies 
fp-contract=fast to be a bug. Yes, it may appear to deviate from the gcc 
behavior, but the reasons for that are complicated. Mostly they stem from the 
fact that gcc doesn't support fp-contract in the way that the C standard 
describes it.  In gcc, fp-contract is fast or off, and it defaults to fast. If 
you pass -ffp-contract=on to gcc, it behaves exactly like -ffp-contract=off. If 
you pass -std=c99, the default for fp-contract changes to on, but because gcc 
doesn't support fp-contract=on, FMA isn't formed. 
https://godbolt.org/z/K86Kv8W7h

My take on this is that fp-contract=on is a mode that conforms to the language 
standard and fp-contract=fast allows value changing optimizations that do not 
conform to the standard. Value-changing optimizations that do not conform to 
the standard are exactly what the definition of -funsafe-math-optimizations 
allow, so I can't see any reason that -funsafe-math-optimizations shouldn't 
imply fp-contract=fast.

So I was going to say that gcc is wrong, even by its own rules, to not allow 
fp-contract=fast behavior under -funsafe-math-optimizations, but then I 
double-checked my understanding and made a very happy discovery ... 
-funsafe-math-optimizations DOES imply fp-contract=fast in gcc! You just need 
to follow a convoluted path to see that (i.e. change the default to something 
other than fast using -std=c99 and then add the unsafe math option). 
https://godbolt.org/z/K1GonvGdT


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136786/new/

https://reviews.llvm.org/D136786

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D136176: Implement support for option 'fexcess-precision'.

2022-11-15 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added a comment.

Can you update the clang user's manual to describe the behavior of this option? 
The excess precision issue is also mentioned in the clang language extensions 
document 
(https://clang.llvm.org/docs/LanguageExtensions.html#half-precision-floating-point).




Comment at: clang/include/clang/Basic/LangOptions.def:320
 BENIGN_ENUM_LANGOPT(FPEvalMethod, FPEvalMethodKind, 2, FEM_UnsetOnCommandLine, 
"FP type used for floating point arithmetic")
+BENIGN_ENUM_LANGOPT(FPPrecisionMode, FPExcessPrecisionModeKind, 2, 
FPP_Standard, "FP precision used for floating point arithmetic")
 LANGOPT(NoBitFieldTypeAlign , 1, 0, "bit-field type alignment")

A lot of people mix up precision and accuracy, so I think this description is 
likely to be ambiguous. I'd suggest "Intermediate truncation behavior for 
floating point arithmetic"



Comment at: clang/include/clang/Basic/LangOptions.h:300
+FPP_Fast,
+FPP_None
+  };

Is FPP_None somehow the same as fexcess-precision=16? If not, what does it mean?



Comment at: clang/include/clang/Driver/Options.td:1564
+def fexcess_precision_EQ : Joined<["-"], "fexcess-precision=">, 
Group, Flags<[CC1Option]>,
+  HelpText<"Specifies the precision in which this floating-point operations 
will be calculated.">,
+  Values<"standard,fast,16">, NormalizedValuesScope<"LangOptions">,

Again, I think this is ambiguous. My understanding is that the option is more 
about the rules for when intermediate values are truncated to source precision 
than when what precision is prescribed for the calculation. For targets that 
have native support for half precision types, the calculations should always be 
done at source precision and this option will have no effect. It's only when 
the native ISA doesn't support the source precision that this is needed and 
even then we will always perform calculations at the nearest precision to 
source precision that is available, right?



Comment at: clang/lib/Driver/ToolChains/Clang.cpp:2995
+  StringRef Val = A->getValue();
+   if (TC.getTriple().getArch() == llvm::Triple::x86 && Val.equals("16"))
+D.Diag(diag::err_drv_unsupported_opt_for_target)

Why is 16 only supported for x86? Is it only here for gcc compatibility?



Comment at: clang/test/CodeGen/X86/fexcess-precise.c:89
+// CHECK-EXT-NEXT:[[EXT1:%.*]] = fpext half [[TMP1]] to float
+// CHECK-EXT-NEXT:[[MUL:%.*]] = fmul float [[EXT]], [[EXT1]]
+// CHECK-EXT-NEXT:[[TMP2:%.*]] = load half, ptr [[C_ADDR]], align 2

This is not what I'd expect to see. I'd expect the operations to use the half 
type with explicit truncation inserted where needed.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136176/new/

https://reviews.llvm.org/D136176

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D151834: Include math-errno with fast-math

2023-09-06 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor accepted this revision.
andrew.w.kaylor added a comment.

lgtm


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D151834/new/

https://reviews.llvm.org/D151834

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D97854: [RFC][nsan] A Floating-point numerical sanitizer.

2022-11-30 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added a comment.

This looks very interesting. I had a discussion with someone at the recent LLVM 
Dev Meeting about the possibility of something like this. However, rather than 
tracking error based on data precision, I am interested in tracking errors 
introduced by fast-math based optimizations. For instance, someone has a 
program with has been verified within accuracy requirements, but they want it 
to run faster so they enable fast-math. Sometimes this works, but other times 
the fast-math optimizations introduce an unacceptable amount of error. What I'd 
like is to be able to trace the problem back to which part of the original 
source was sensitive to this error so that I can disable fast-math locally for 
just that operation.

Another potential use for this sort of technology relates to an RFC that I just 
posted (https://reviews.llvm.org/D138867). There I'm trying to introduce a 
mechanism that allows the compiler to replace builtin math operations (calls to 
math library functions or equivalent builtins in platforms like SYCL or CUDA) 
based on a specified accuracy requirement. One of the challenges with this is 
verifying that the substitute call is really as accurate as it claims. For 
example, let's say I want to call cosf() and I need 4 ulp accuracy. The 
standard GNU libm implementation claims 1 ulp accuracy, so it's not necessarily 
useful as a point of comparison. But if we had a shadow computation that 
converted the input value to double and called cos() then converted that result 
back to float, that should give me the correctly rounded result, right? Or I 
could use a shadow call to one of the various correctly rounded implementations 
that are becoming available. It would be great to use nsan to verify the 
results from these alternate library calls.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D97854/new/

https://reviews.llvm.org/D97854

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D74500: clang: Treat ieee mode as the default for denormal-fp-math

2020-03-04 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor accepted this revision.
andrew.w.kaylor added a comment.
This revision is now accepted and ready to land.

lgtm


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74500/new/

https://reviews.llvm.org/D74500



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69598: Work on cleaning up denormal mode handling

2019-11-08 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added a comment.

I'm unclear as to the expectations surrounding this option. I suppose this is 
somewhat beyond the scope of the current changes, but I'm confused by clang's 
current behavior with regard to denormals.

The -fdenromal-fp-math option causes a function attribute to be set indicating 
the desired flushing behavior, and I guess in some cases that has an effect on 
instruction selection, but it seems to be orthogonal to whether or not we're 
actually setting the processor controls to do flushing (at least for most 
targets). I really only know what happens in the x86 case, and I don't know if 
this behavior is consistent across architectures, but in the x86 case setting 
or not setting the processor  control flags depends on the fast math flags and 
whether or not we find crtfastmath.o when we link.

This leads me to my other point of confusion. Setting the "denormal-fp-math" 
option on a per-function basis seems wrong for targets that have a global FP 
control.




Comment at: clang/include/clang/Basic/CodeGenOptions.h:167
   /// The floating-point denormal mode to use.
-  std::string FPDenormalMode;
+  llvm::DenormalMode FPDenormalMode = llvm::DenormalMode::Invalid;
 

Why is "Invalid" the default here? If you don't use the "fdenormal-fp-math" 
option, shouldn't you get IEEE?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69598/new/

https://reviews.llvm.org/D69598



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69598: Work on cleaning up denormal mode handling

2019-11-12 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor accepted this revision.
andrew.w.kaylor added a comment.

Thanks. I understand your direction for denormal handling now, and I'm OK with 
this patch apart from the remaining references to subnormal that Sanjay 
mentioned.

In D69598#1739723 , @arsenm wrote:

> I do think the floating point environment bits should be a considered a 
> property of the calling convention, with attributes that override them. A 
> function which calls a function with a different mode would be responsible 
> for switching the mode before the call. This would require people actually 
> caring about getting this right to really implement


Do you mean the compiler should insert code to restore the FP environment on 
function transitions? Or do you mean that the function itself (i.e. the user's 
code) is responsible for switching the mode? I have some reservations about 
this, but I think the C standard specification for FENV_ACCESS is clear that 
the it is the programmer's responsibility to manage the floating point 
environment correctly. Yes, that's a sure recipe for broken code, but that's 
what it says. Obviously LLVM IR is not bound by the C standard and we could 
take a different approach, but I have concerns about the performance 
implications because in general the compiler won't know when the environment 
needs to be restored so it would have to take a conservative approach.

I've been meaning to write some documentation on the expected behavior at 
function boundaries of the FP environment. Perhaps we can continue this 
discussion there.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69598/new/

https://reviews.llvm.org/D69598



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69978: Separately track input and output denormal mode

2019-11-12 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added inline comments.



Comment at: llvm/docs/LangRef.rst:1822
 
 ``"denorm-fp-mode"``
+   This indicates the subnormal handling that may be assumed for the

I don't like the definition of this attribute. It's not reader-friendly. The 
comma-separated pair format has no indication which value refers to inputs and 
which refers to outputs. Also, while this predates your changes, I think the 
meanings of the current choices are unclear. 

What would you think of a comma-separated list with the following possibilities?


```
allow-denormals (default)
inputs-are-zero (outputs not flushed)
inputs-are-zero, outputs-are-zero
inputs-are-zero, outputs-are-positive-zero
inputs-are-positivezero (outputs not flushed)
inputs-are-positivezero, outputs-are-zero
inputs-are-positivezero, outputs-are-positive-zero
denormal-outputs-are-zero (inputs are unchanged)
denormal-outputs-are-positive-zero (inputs are unchanged)

```
I'd also be open to abbreviations. I don't know if "daz" and "ftz" are readable 
to everyone, but I'm more comfortable with them. That would make the options 
something like this.


```
allow-denormals
daz
daz, ftz
daz, ftz+
daz+
daz+, ftz
daz+, ftz+
ftz
ftz+
```


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69978/new/

https://reviews.llvm.org/D69978



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D62731: Add support for options -frounding-math, -ftrapping-math, -ffp-model=, and -ffp-exception-behavior=, : Specify floating point behavior

2019-11-19 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added inline comments.



Comment at: llvm/include/llvm/IR/IRBuilder.h:262
 Function *F = BB->getParent();
-if (!F->hasFnAttribute(Attribute::StrictFP)) {
+if (F && !F->hasFnAttribute(Attribute::StrictFP)) {
   F->addFnAttr(Attribute::StrictFP);

kpn wrote:
> rjmccall wrote:
> > kpn wrote:
> > > This looks reasonable to me. 
> > > 
> > > It smells like there's a larger strictfp IRBuilder issue, but that's not 
> > > an issue for this patch here. The larger issue won't be hit since the new 
> > > options affect the entire compilation. It therefore shouldn't block this 
> > > patch.
> > Does IRBuilder actually support inserting into an unparented basic block?  
> > I feel like this is exposing a much more serious mis-use of IRBuilder.
> I suspect you are correct. If we let this "F && " change go in we'll have a 
> situation where whether or not a block is currently in a function when a 
> function call is emitted will affect whether or not the eventual function 
> definition gets the strictfp attribute. That seems like an unfortunate 
> inconsistency.
> 
> I'm still looking into it. I hope to have an IRBuilder review up today or 
> tomorrow.
As I just commented on the related patch @kpn posted, it appears that IRBuilder 
doesn't entirely support inserting into an unparented block. I was surprised by 
this, but there are places that need to be able to get to the Module from the 
BasicBlock. So, I think something problematic may be happening in the failing 
case.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D62731/new/

https://reviews.llvm.org/D62731



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D62731: Add support for options -frounding-math, -ftrapping-math, -ffp-model=, and -ffp-exception-behavior=, : Specify floating point behavior

2019-12-12 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added inline comments.



Comment at: clang/include/clang/Basic/DiagnosticDriverKinds.td:444
 
+def warn_drv_experimental_fp_control_incomplete_opt : Warning<
+  "Support for floating point control option %0 is incomplete and 
experimental">,

rupprecht wrote:
> mibintc wrote:
> > @kpn thought it would be a good idea to add a Warning that the 
> > implementation of float control is experimental and partially implemented.  
> > That's what this is for.
> Instead of adding a warning, I'd like to propose `-frounding-math` not be 
> enabled unless an additional flag (e.g. `-fexperimental-float-control`) is 
> passed. Or maybe this feature should be called 
> `-f[no-]experimental-rounding-math` instead.
> 
> There are plenty of builds that are already specifying `-frounding-math` 
> (e.g. they also support building w/ a compiler such as gcc that implements 
> this), and adding this experimental/incomplete implementation is a surprise 
> to those builds.
> 
> If I'm wrong and it's completely safe to ignore the warning (maybe it's 
> incomplete but not in any unsafe way), we should just not have it at all.
You raise an interesting point about people who might be using -frounding-math 
already. However, if they are using this flag because they also sometimes build 
with a compiler such as gcc that supports the flag, they are currently getting 
incorrect behavior from clang. Without this patch, clang silently ignores the 
option and the optimizer silently ignores the fact that the program may be 
changing the rounding mode dynamically. The user may or may not be aware of 
this.

With this patch such a user is likely to observe two effects: (1) their code 
will suddenly get slower, and (2) it will probably start behaving correctly 
with regard to rounding mode changes. The rounding behavior will definitely not 
get worse. I think the warning is useful as an indication that something has 
changed. I don't think requiring an additional option should be necessary.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D62731/new/

https://reviews.llvm.org/D62731



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D62731: Add support for options -frounding-math, -ftrapping-math, -ffp-model=, and -ffp-exception-behavior=, : Specify floating point behavior

2019-12-12 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added a comment.

In D62731#1782762 , @rjmccall wrote:

> Currently we emit a warning if you use `-frounding-math`, saying that the 
> option is ignored.  That should have alerted users that they're not getting 
> the correct behavior now.  This patch removes the warning and (IIUC) 
> implements the correct behavior but is over-conservative.  If that's correct, 
> I think that's acceptable and we don't need an "experimental" flag or a 
> warning.


Oh, I didn't realize we were already warning about that. In theory, we should 
handle rounding math correctly with this change. It's possible we've missed 
some things, but I suppose that's always true. I think there are a few general 
intrinsics left that need constrained versions but don't have them, and we 
don't have any solution yet for target-specific intrinsics. If any of those 
have special handling that assumes the default rounding mode we will get it 
wrong. I don't think most users would be likely to encounter a problem.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D62731/new/

https://reviews.llvm.org/D62731



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D62731: Add support for options -frounding-math, -ftrapping-math, -ffp-model=, and -ffp-exception-behavior=, : Specify floating point behavior

2019-12-13 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added a comment.

In D62731#1782912 , @rjmccall wrote:

> Hmm.  The target-specific intrinsics thing does concern me, since I assume 
> many targets have a bunch of vector intrinsics that may be sensitive to 
> rounding.  Do we have an idea of how we'd fix that?  If it's a short-term 
> incorrectness that we have a plan to fix, I don't mind the risk, but if we 
> don't know how we'd even start to address it...


I see two potential problem cases with the target-specific intrinsics:

1. Some optimization pass recognizes the intrinisic and uses its semantics to 
perform some optimization such as constant folding
2. Some optimization performs code motion that moves the intrinsic (or, in the 
backend, the instruction it represents) across an operation that changes the 
rounding mode

I don't know if there are any instances of the first case in the public 
repository. Downstream users could be doing it. Those will need special 
handling if they exist (checking for the the strictfp attribute).

The second case should be handled in IR by fesetround() or other such 
intrinsics being marked as having side effects. It's possible that there are 
target-specific intrinsics to change the rounding mode that aren't marked as 
having side effects, but if so that's simply a bug. The other part of this is 
that the intrinsic might be lowered to MC and the MC instructions in a way that 
neglects rounding mode. Many targets have instructions with forms that take an 
explicit rounding mode argument and the backends may be using that with the 
default rounding mode. I am not aware of any such case, but it's definitely 
possible.

Finally, our design for handling strict fp mode in the backends is that 
rounding mode control will be handled by explicitly modeling the dependency 
between the relevant control registers and instructions that implicitly use the 
rounding mode controled by those registers. X86 only recently started doing 
this. There may be other backends that have not implemented it. Some may never 
do so.

I don't have a strong preference about what to do with the warning. I have a 
slight preference for replacing the existing warning with a more specific 
warning saying that floating math support is a work in progress. Eventually we 
need a way for backends to indicate that they believe their support is complete.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D62731/new/

https://reviews.llvm.org/D62731



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D62731: Add support for options -frounding-math, -ftrapping-math, -ffp-model=, and -ffp-exception-behavior=, : Specify floating point behavior

2019-12-13 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added a comment.

In D62731#1784225 , @rjmccall wrote:

> ... The big problem is that intrinsic calls are not arbitrary code: the vast 
> majority of intrinsics (e.g. all the ARM vector intrinsics, many of which can 
> be floating-point) are marked `IntrNoMem` (which I believe corresponds to 
> `readnone`), which means calls to those intrinsics can be reordered with 
> other code whether or not that code has arbitrary side-effects.


Oh, you're right. With the constrained intrinsics we are currently handling 
that by using IntrInaccessibleMemOnly as a proxy for access to the FP 
environment, but that's stronger than we'd want for architecture-specific 
intrinsics in the default case. We have talked about an fpenv-specific 
attribute, but nothing has been done. So, I guess that does leave us in the 
situation where rounding controls might not be correctly respected if 
target-specific intrinsics are used.

> It's good that people are looking at achieving better modeling for the x86 
> backend, but we need to have a plan that doesn't require heroic effort just 
> to get basic correctness.

Do you mean in the backend? If so, I don't think that's possible. The backends 
just don't have any sort of feature that could be used to get conservatively 
correct behavior for cheap the way intrinsics give it to us in the middle end. 
Once you go into instruction selection things get very low level in a hurry.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D62731/new/

https://reviews.llvm.org/D62731



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D62731: Add support for options -frounding-math, -ftrapping-math, -ffp-model=, and -ffp-exception-behavior=, : Specify floating point behavior

2019-12-13 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added inline comments.



Comment at: clang/include/clang/Basic/DiagnosticDriverKinds.td:444
 
+def warn_drv_experimental_fp_control_incomplete_opt : Warning<
+  "Support for floating point control option %0 is incomplete and 
experimental">,

rupprecht wrote:
> andrew.w.kaylor wrote:
> > rupprecht wrote:
> > > mibintc wrote:
> > > > @kpn thought it would be a good idea to add a Warning that the 
> > > > implementation of float control is experimental and partially 
> > > > implemented.  That's what this is for.
> > > Instead of adding a warning, I'd like to propose `-frounding-math` not be 
> > > enabled unless an additional flag (e.g. `-fexperimental-float-control`) 
> > > is passed. Or maybe this feature should be called 
> > > `-f[no-]experimental-rounding-math` instead.
> > > 
> > > There are plenty of builds that are already specifying `-frounding-math` 
> > > (e.g. they also support building w/ a compiler such as gcc that 
> > > implements this), and adding this experimental/incomplete implementation 
> > > is a surprise to those builds.
> > > 
> > > If I'm wrong and it's completely safe to ignore the warning (maybe it's 
> > > incomplete but not in any unsafe way), we should just not have it at all.
> > You raise an interesting point about people who might be using 
> > -frounding-math already. However, if they are using this flag because they 
> > also sometimes build with a compiler such as gcc that supports the flag, 
> > they are currently getting incorrect behavior from clang. Without this 
> > patch, clang silently ignores the option and the optimizer silently ignores 
> > the fact that the program may be changing the rounding mode dynamically. 
> > The user may or may not be aware of this.
> > 
> > With this patch such a user is likely to observe two effects: (1) their 
> > code will suddenly get slower, and (2) it will probably start behaving 
> > correctly with regard to rounding mode changes. The rounding behavior will 
> > definitely not get worse. I think the warning is useful as an indication 
> > that something has changed. I don't think requiring an additional option 
> > should be necessary.
> > However, if they are using this flag because they also sometimes build with 
> > a compiler such as gcc that supports the flag, they are currently getting 
> > incorrect behavior from clang
> 
> Incorrect, yes, but also likely valid behavior. "experimental" seems to imply 
> a miscompile when using this option should not be unexpected.
> 
> As I suggested before: if I'm wrong, and this behavior is only going to make 
> the code more correct (as you suggest), can we remove the warning that this 
> must be ack'd explicitly by adding `-Wno-experimental-float-control` to 
> builds? I don't understand the motivation for the warning.
The "experimental" code won't be incorrect in any way that the code generated 
when we ignore the option is. The things that have been implemented will work 
correctly. The things that are not implemented will have the potential to 
disregard runtime changes to the rounding mode. Currently, dynamic changes to 
the rounding mode always have the potential of being ignored.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D62731/new/

https://reviews.llvm.org/D62731



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D62731: Add support for options -frounding-math, -ftrapping-math, -ffp-model=, and -ffp-exception-behavior=, : Specify floating point behavior

2019-12-17 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added a comment.

In D62731#1788838 , @rupprecht wrote:

> It seems the discussion of whether or not this is incomplete died out -- I'd 
> prefer to assume it is incomplete if there is no consensus. Mailed D71635 
>  to rename `-frounding-math` to 
> `-fexperimental-rounding-math`.
>
> Alternatively we could remove the warning. I still don't see a good argument 
> for the middle ground of having it called `-frounding-math` but also generate 
> a warning.


It's definitely incomplete but the results will not be any worse than you get 
when -frounding-math is ignored.

My preference would be to change the text of the warning that is issued but 
allow -frounding-math to be enabled by this commit without requiring an 
additional option.

I would also very much like to see this patch re-committed. It's currently in 
the "approved" state. If anyone objects to this being committed, please use the 
"request changes" action to indicate this.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D62731/new/

https://reviews.llvm.org/D62731



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D71635: [clang] Rename -frounding-math to -fexperimental-rounding-math and add -frounding-math back as a gcc-compat arg.

2019-12-18 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added a comment.

In D71635#1790611 , @MaskRay wrote:

> Before:
>
>   % clang -frounding-math -fsyntax-only -x c /dev/null
>   clang-10: warning: Support for floating point control option frounding-math 
> is incomplete and experimental [-Wexperimental-float-control]
>
>
> CC1 will do rounding math things.
>
> After
>
>   % clang -frounding-math -fsyntax-only -x c /dev/null
>   clang-10: warning: optimization flag '-frounding-math' is not supported 
> [-Wignored-optimization-argument]
>
>
> CC1 will not do rounding math things. -fexperimental-rounding-math if the 
> user really wants to use the feature.
>
> Is my understanding correct? If yes, this patch seems pretty reasonable to 
> me, because -frounding-math is currently incomplete/unsafe.
>
> You may consider not changing CC1 options as they are not user facing.


My understanding is this:

Before D62731 :

  % clang -frounding-math -fsyntax-only -x c /dev/null
  clang-10: warning: optimization flag '-frounding-math' is not supported 
[-Wignored-optimization-argument]

After D62731 :

  % clang -frounding-math -fsyntax-only -x c /dev/null
  clang-10: warning: Support for floating point control option frounding-math 
is incomplete and experimental [-Wexperimental-float-control]

After D71671 :

  % clang -frounding-math -fsyntax-only -x c /dev/null
  

I suppose this patch would be updated to put us back where we were before 
D62731  in terms of the warning and require an 
additional option for anyone who wants to use the rounding math implementation 
as it is currently implemented.

The support for -frounding-math is incomplete, but I'm not sure it's unsafe. 
It's no more unsafe than not using the flag at all. The same reasoning applies 
to -ftrapping-math which was accepted without warning well before D62731 
 and has never been completely implemented.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71635/new/

https://reviews.llvm.org/D71635



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D72675: [Clang][Driver] Fix -ffast-math/-ffp-contract interaction

2020-03-12 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added a comment.

In D72675#1920309 , @lebedev.ri wrote:

> I may be wrong, but i suspect those failures aren't actually due to the fact
>  that we pessimize optimizations with this change, but that the whole 
> execution
>  just fails. Can you try running test-suite locally? Do tests themselves 
> actually pass,
>  ignoring the question of their performance?


I find the LNT output very hard to decipher, but I thought that all of the 
failures on the Broadwell (x86) LNT bot were just performance regressions. 
There were many perf improvements and also many regressions. I investigated the 
top regression and found that the loop unroller made a different decision when 
the llvm.fmuladd intrinsic was used than it did for separate mul and add 
operations. The central loop of the test was unrolled eight times rather than 
four. Broadwell gets less benefit from FMA than either Haswell or Skylake, so 
any other factors that might drop performance are less likely to be mitigated 
by having fused these operations. In a more general sense, I don't see any 
reason apart from secondary changes in compiler behavior like this that 
allowing FMA would cause performance to drop.

At least one other target had execution failures caused by Melanie's change, 
but I understood it to be simply exposing a latent problem in the 
target-specific code.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72675/new/

https://reviews.llvm.org/D72675



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69978: Separately track input and output denormal mode

2020-01-31 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor accepted this revision.
andrew.w.kaylor added a comment.
This revision is now accepted and ready to land.

lgtm


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69978/new/

https://reviews.llvm.org/D69978



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D74436: Change clang option -ffp-model=precise to select ffp-contract=on

2020-02-11 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added inline comments.



Comment at: clang/docs/UsersManual.rst:1388
 
-   * ``precise`` Disables optimizations that are not value-safe on 
floating-point data, although FP contraction (FMA) is enabled 
(``-ffp-contract=fast``).  This is the default behavior.
* ``strict`` Enables ``-frounding-math`` and 
``-ffp-exception-behavior=strict``, and disables contractions (FMA).  All of 
the ``-ffast-math`` enablements are disabled.

lebedev.ri wrote:
> I'm confused. Where in this patch the `This patch establishes the default 
> option for -ffp-model to select "precise".` happens? LHS of diff says it is 
> already default
The comments said that it was the default, but the actual default was something 
that didn't quite match any of the fp-models -- precise but with fp contraction 
off.



Comment at: clang/lib/Driver/ToolChains/Clang.cpp:2548
   DenormalFPMath = DefaultDenormalFPMath;
   FPContract = "";
   StringRef Val = A->getValue();

I think this always gets changed to fast, on, or off below, but making it empty 
here looks wrong.



Comment at: clang/lib/Driver/ToolChains/Clang.cpp:2661
 // the FPContract value has already been set to a string literal
 // and the Val string isn't a pertinent value.
 ;

Does this mean that "-ffp-model=precise -ffp-contract=off" will leave FP 
contraction on? That doesn't seem right.



Comment at: clang/lib/Driver/ToolChains/Clang.cpp:2769
 DenormalFPMath != llvm::DenormalMode::getIEEE() &&
-FPContract.empty())
+(FPContract.equals("off") || FPContract.empty()))
 // OK: Current Arg doesn't conflict with -ffp-model=strict

We should never get here with FPContract empty.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74436/new/

https://reviews.llvm.org/D74436



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D74436: Change clang option -ffp-model=precise to select ffp-contract=on

2020-02-12 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added a comment.

lgtm


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74436/new/

https://reviews.llvm.org/D74436



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D74436: Change clang option -ffp-model=precise to select ffp-contract=on

2020-02-13 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added a subscriber: MaskRay.
andrew.w.kaylor added a comment.

In D74436#1875386 , @thakis wrote:

> The revert of this breaks tests everywhere, as far as I can tell.


It looks like something strange happened with the revert:

> clang-11: warning: overriding '-ffp-model=strict' option with 
> '-ffp-model=strict' [-Woverriding-t-option]

I believe the problem is that the original change that was being reverted 
contained this:

  clang/lib/Driver/ToolChains/Clang.cpp 
  @@ -2768,7 +2766,7 @@ static void RenderFloatingPointOptions(const ToolChain 
&TC, const Driver &D,
  !AssociativeMath && !ReciprocalMath &&
  SignedZeros && TrappingMath && RoundingFPMath &&
  DenormalFPMath != llvm::DenormalMode::getIEEE() &&
  +FPContract.empty())
  -(FPContract.equals("off") || FPContract.empty()))

But sometime in the land-revert-land-revert cycle the line above that changed, 
causing the merge to miss this change in the most recent revert. I see that 
@MaskRay has since re-landed this change set, but it's going to cause problems 
for PowerPC. If someone needs to revert this yet again, I think it can be 
safely done by recovering the change above.

Apologies for the mess!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74436/new/

https://reviews.llvm.org/D74436



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D74436: Change clang option -ffp-model=precise to select ffp-contract=on

2020-02-14 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added a subscriber: scanon.
andrew.w.kaylor added a comment.

In D74436#1875320 , @mibintc wrote:

> However you are right we don't want the frontend to create FMA when 
> optimizations are disabled -O0


I've been discussing this with @scanon on the cfe-dev mailing list, and he has 
convinced me that we should create FMA options at -O0 if the fp-contract 
setting is "on" -- including when it is on by default. The reasoning that 
persuaded me was, "preserving FMA formation at O0 _helps_ debuggability, 
because it means that numerical behavior is more likely to match what a user 
observed at Os, allowing them to debug the problem."


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74436/new/

https://reviews.llvm.org/D74436



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D74436: Change clang option -ffp-model=precise to select ffp-contract=on

2020-02-14 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added a comment.

In D74436#1875315 , @nemanjai wrote:

> > You're right, -O0 shouldn't generate FMA. I'm preparing to revert this now 
> > -- just verifying the build.
>
> Perhaps this should be
>  `off` with no optimization
>  `on` with `-O1/-O2/-O3/-Os/-Oz`
>  `fast` with fast math
>
> Just a suggestion, I'm not sure whether that would be the best breakdown. 
> Perhaps we can also see what the defaults are for GCC and unify with those?


GCC doesn't support "on" so I'm not sure it's possible to distinguish their 
intentions. I think it would be better to define what we think is best for 
clang and encourage GCC to unify with that.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74436/new/

https://reviews.llvm.org/D74436



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D72820: [FPEnv] Add pragma FP_CONTRACT support under strict FP.

2020-01-16 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added inline comments.



Comment at: clang/lib/CodeGen/CGExprScalar.cpp:3381
+ Addend->getType()),
+{MulOp0, MulOp1, Addend, MulOp->getOperand(2), MulOp->getOperand(3)});
+  else

You shouldn't just assume that MulOp is a constrained intrinsic. Cast to 
ConstrainedFPIntrinsic and use ConstrainedFPIntrinsic::getRoundingMode() and 
ConstrainedFPIntrinsic::getExceptionBehavior(). The cast will effectively 
assert that MulOp is a constrained intrisic. I think that should always be true.



Comment at: clang/lib/CodeGen/CGExprScalar.cpp:3423
 
+  if (Builder.getIsFPConstrained()) {
+if (auto *LHSBinOp = dyn_cast(op.LHS)) {

I don't think we should ever non-constrained create FMul instructions if 
Builder is in FP constrained mode, but you should assert that somewhere above. 
Maybe move this block above line 3409 and add:

assert(LHSBinOp->getOpcode() != llvm::Instruction::FMul && 
RHSBinOp->getOpcode() != llvm::Instruction::FMul);



Comment at: clang/test/CodeGen/constrained-math-builtins.c:160
+// CHECK: declare x86_fp80 
@llvm.experimental.constrained.fmuladd.f80(x86_fp80, x86_fp80, x86_fp80, 
metadata, metadata)
+};

I'd like to see a test that verifies the calls generated in the function and 
specifically a test that verifies that the constrained fneg is generated if 
needed.



Comment at: llvm/docs/LangRef.rst:16094
+
+The fourth and fifth arguments specifie the exception behavior as described
+above.

s/specifie/specify

s/the exception behavior/the rounding mode and exception behavior



Comment at: llvm/docs/LangRef.rst:16104
+
+  %0 = call float @llvm.experimental.constrained.fmuladd.f32(%a, %b, %c)
+

missing metadata arguments



Comment at: llvm/include/llvm/CodeGen/BasicTTIImpl.h:1515
  ConcreteTTI->getArithmeticInstrCost(BinaryOperator::FAdd, RetTy);
+// FIXME: Is constrained intrinsic' cost equal to it's no strict one?
+if (IID == Intrinsic::experimental_constrained_fmuladd)

I don't think that matters. The cost calculation here is a conservative 
estimate based on the cost if we are unable to generate an FMA instruction. So 
a constrained fmuladd that can't be lowered to FMA will be lower the same way a 
contrained mul followed by a constrained add would be.



Comment at: llvm/include/llvm/CodeGen/ISDOpcodes.h:355
 
+/// FMULADD/STRICT_FMULADD - A intermediate node, made functions handle
+/// constrained fmuladd the same as other constrained intrinsics.

Something is wrong with this comment. I'm not sure what it's trying to say but 
the grammar is wrong.

After looking through the rest of the code, I think I understand what's going 
on. I think we need a verbose comment to explain it. Here's my suggestion

```
FMULADD/STRICT_FMULADD -- These are intermediate opcodes used to handle the 
constrained.fmuladd intrinsic. The FMULADD opcode only exists because it is 
required for correct macro expansion and default handling (which is never 
reached). There should never be a node with ISD::FMULADD. The STRICT_FMULADD 
opcode is used to allow selectionDAGBuilder::visitConstrainedFPIntrinsic to 
determine (based on TargetOptions and target cost information) whether the 
constrained.fmuladd intrinsic should be lowered to FMA or separate FMUL and 
FADD operations.
```
Having thought through that, however, it strikes me as a lot of overhead. Can 
we just add special handling for the constrained.fmuladd intrinsic and make the 
decision then to create either a STRICT_FMA node or separate STRICT_FMUL and 
STRICT_FADD?

The idea that ISD::FMULADD is going to exist as a defined opcode but we never 
intend to add any support for handling it is particularly bad.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72820/new/

https://reviews.llvm.org/D72820



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69878: Consoldiate internal denormal flushing controls

2020-01-16 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added inline comments.



Comment at: clang/lib/Driver/ToolChains/Clang.cpp:2311
+  bool TrappingMath = true;
 // overriden by ffp-exception-behavior?
   bool RoundingFPMath = false;

arsenm wrote:
> cameron.mcinally wrote:
> > Last line of comment was not removed.
> > 
> > Also, is it safe to remove `TrappingMathPresent`? Is that part of the 
> > work-in-progress to support `ffp-exception-behavior`?
> I think this is a rebase gone bad. The patch changing the strict math was 
> revered and recommitted and I probably broke this
Looks like this is still wrong. You didn't intend to change either TrappingMath 
flag, did you?



Comment at: clang/lib/Driver/ToolChains/Clang.cpp:2657
   // -fno_unsafe_math_optimizations restores default denormal handling
-  DenormalFPMath = "";
+  DenormalFPMath = DefaultDenormalFPMath;
   break;

Shouldn't this also restore DenormalFP32Math to its default value?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69878/new/

https://reviews.llvm.org/D69878



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D72820: [FPEnv] Add pragma FP_CONTRACT support under strict FP.

2020-01-16 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added inline comments.



Comment at: clang/lib/CodeGen/CGExprScalar.cpp:3436
+}
+  }
+

cameron.mcinally wrote:
> cameron.mcinally wrote:
> > cameron.mcinally wrote:
> > > I don't think it's safe to fuse a FMUL and FADD if the intermediate 
> > > rounding isn't exactly the same as those individual operations. FMULADD 
> > > doesn't guarantee that, does it?
> > To be clear, we could miss very-edge-case overflow/underflow exceptions.
> Ah, but I see C/C++ FP_CONTRACT allows the exceptions to be optimized away. 
> Sorry for the noise.
We've talked about this before but I don't think we ever documented a decision 
as to whether we want to allow constrained intrinsics and fast math flags to be 
combined. This patch moves that decision into clang's decision to generate this 
intrinsic or not.

I think it definitely makes sense in the case of fp contraction, because even 
if a user cares about value safety they might want FMA, which is theorectically 
more accurate than the separate values even though it produces a different 
value. This is consistent with gcc (which produces FMA under 
"-ffp-contract=fast -fno-fast-math") and icc (which produced FMA under 
"-fp-model strict -fma").

For the record, I also think it makes sense to use nnan, ninf, and nsz with 
constrained intrinsics.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72820/new/

https://reviews.llvm.org/D72820



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69878: Consoldiate internal denormal flushing controls

2020-01-17 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor accepted this revision.
andrew.w.kaylor added a comment.
This revision is now accepted and ready to land.

I don't know if there were other reviewers who haven't commented on how you 
addressed their concerns, but this looks good to me.

Thanks for taking the time to improve our handling of this!


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69878/new/

https://reviews.llvm.org/D69878



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D72820: [FPEnv] Add pragma FP_CONTRACT support under strict FP.

2020-01-17 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added inline comments.



Comment at: clang/lib/CodeGen/CGExprScalar.cpp:3436
+}
+  }
+

cameron.mcinally wrote:
> andrew.w.kaylor wrote:
> > cameron.mcinally wrote:
> > > cameron.mcinally wrote:
> > > > cameron.mcinally wrote:
> > > > > I don't think it's safe to fuse a FMUL and FADD if the intermediate 
> > > > > rounding isn't exactly the same as those individual operations. 
> > > > > FMULADD doesn't guarantee that, does it?
> > > > To be clear, we could miss very-edge-case overflow/underflow exceptions.
> > > Ah, but I see C/C++ FP_CONTRACT allows the exceptions to be optimized 
> > > away. Sorry for the noise.
> > We've talked about this before but I don't think we ever documented a 
> > decision as to whether we want to allow constrained intrinsics and fast 
> > math flags to be combined. This patch moves that decision into clang's 
> > decision to generate this intrinsic or not.
> > 
> > I think it definitely makes sense in the case of fp contraction, because 
> > even if a user cares about value safety they might want FMA, which is 
> > theorectically more accurate than the separate values even though it 
> > produces a different value. This is consistent with gcc (which produces FMA 
> > under "-ffp-contract=fast -fno-fast-math") and icc (which produced FMA 
> > under "-fp-model strict -fma").
> > 
> > For the record, I also think it makes sense to use nnan, ninf, and nsz with 
> > constrained intrinsics.
> You had me until:
> 
> >For the record, I also think it makes sense to use nnan, ninf, and nsz with 
> >constrained intrinsics.
> 
> To be clear, we'd need them for the `fast` case, but I don't see a lot of 
> value for the `strict` case.
> 
> We definitely want reassoc/recip/etc for the `optimized but trap-safe` case, 
> so that's enough to require FMF flags on constrained intrinsics alone.
> 
> We should probably break this conversation out into an llvm-dev thread...
I agree about starting an llvm-dev thread. I'll send something out unless 
you've already done so by the time I finish typing it.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72820/new/

https://reviews.llvm.org/D72820



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D72675: [Clang][Driver] Fix -ffast-math/-ffp-contract interaction

2020-01-24 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added a comment.

In D72675#1839492 , @wristow wrote:

> 1. Should we enable FMA "by default" at (for example) '-O2'?


We recently introduced a new option "-ffp-model=[precise|fast|strict], which is 
supposed to serve as an umbrella option for the most common combination of 
options. I think our default behavior should be equivalent to 
-ffp-model=precise, which enables contraction. It currently seems to enable 
-ffp-contract=fast in the precise model (https://godbolt.org/z/c6w8mJ). I'm not 
sure that's what it should be doing, but whatever the precise model does should 
be our default.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72675/new/

https://reviews.llvm.org/D72675



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D72675: [Clang][Driver] Fix -ffast-math/-ffp-contract interaction

2020-01-24 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added inline comments.



Comment at: clang/lib/Driver/ToolChains/Clang.cpp:2792
   if (!MathErrno && AssociativeMath && ReciprocalMath && !SignedZeros &&
-  !TrappingMath)
+  !TrappingMath && !(FPContract.equals("off") || FPContract.equals("on")))
 CmdArgs.push_back("-menable-unsafe-fp-math");

I think this would read better as "... && !FPContract.equals("off") && 
!FPContract.equals("on")" The '||' in the middle of all these '&&'s combined 
with the extra parens from the function call trips me up.



Comment at: clang/lib/Driver/ToolChains/Clang.cpp:2846
   ReciprocalMath && !SignedZeros && !TrappingMath && !RoundingFPMath) {
-CmdArgs.push_back("-ffast-math");
+if (!(FPContract.equals("off") || FPContract.equals("on")))
+  CmdArgs.push_back("-ffast-math");

As above, I'd prefer "!off && !on".



Comment at: clang/lib/Driver/ToolChains/Clang.cpp:2854
 // Enable -ffp-contract=fast
 CmdArgs.push_back(Args.MakeArgString("-ffp-contract=fast"));
   else

This is a bit of an oddity in our handling. The FPContract local variable 
starts out initialized to an empty string. So, what we're saying here is that 
if you've used all the individual controls to set the rest of the fast math 
flags, we'll turn on fp-contract=fast also. That seems wrong. If you use 
"-ffast-math", FPContract will have been set to "fast" so that's not applicable 
here.

I believe this means

```
-fno-honor-infinities -fno-honor-nans -fno-math-errno -fassociative-math 
-freciprocal-math -fno-signed-zeros -fno-trapping-math -fno-rounding-math

```

will imply "-ffp-contract=fast". Even worse:

```
-ffp-contract=off -fno-fast-math -fno-honor-infinities -fno-honor-nans 
-fno-math-errno -fassociative-math -freciprocal-math -fno-signed-zeros 
-fno-trapping-math -fno-rounding-math

```
will imply "-ffp-contract=fast" because "-fno-fast-math" resets FPContract to 
empty.

I think we should initialize FPContract to whatever we want to be the default 
(on?) and remove the special case for empty here. Also, -fno-fast-math should 
either not change FPContract at all or set it to "off". Probably the latter 
since we're saying -ffast-math includes -ffp-contract=on.



Comment at: clang/test/Driver/fast-math.c:196
+// RUN: %clang -### -ffast-math -ffp-contract=fast -c %s 2>&1 \
+// RUN:   | FileCheck --check-prefix=CHECK-UNSAFE-MATH %s
+//

What about "-ffp-contract=off -ffast-math"? The way the code is written that 
will override the -ffp-contract option. That's probably what we want, though it 
might be nice to have a warning.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72675/new/

https://reviews.llvm.org/D72675



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D72841: [RFC] Add support for pragma float_control, to control precision and exception behavior at the source level

2020-01-27 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added a comment.

It's not clear to me from reading this how the "precise" control is going to 
work with relation to the fast math flags. I don't think MSVC allows the 
granularity of control over these flags that we have in clang, so maybe they 
don't have this problem.

Consider this code: https://godbolt.org/z/mHiLCm

With the options "-ffp-model=precise -fno-honor-infinities -fno-honor-nans" the 
math operations here are generated with the "nnan ninf contract" flags. That's 
correct. What will happen when I use the pragma to turn precise off? Does it 
enable all fast math flags? Will the subsequent "pop" leave the "ninf nnan" 
fast math flags enabled?

As I said, I don't think you can get into this situation with MSVC. I believe 
that icc will go into full fast math mode with the "precise, off, push" pragma 
but will go back to "nnan ninf contract" mode with the pop. At least, that's 
what the design docs say. I haven't looked at the code to verify this. It seems 
like the correct behavior in any case. I think the clang FPOptions needs 
individual entries for all of the fast math flags to handle this case.




Comment at: clang/docs/LanguageExtensions.rst:3042
+by the pragma behaves as though the command-line option ``ffp-model=precise``
+is enabled.  That is, fast-math is disabled and fp-contract=on (fused
+multiple add) is enabled.

Re  "fp-contraction=on": I agree that this is what it should do, but I don't 
think that's what fp-model=precise currently does. I think it's setting 
fp-contraction=fast.



Comment at: clang/docs/LanguageExtensions.rst:3043
+is enabled.  That is, fast-math is disabled and fp-contract=on (fused
+multiple add) is enabled.
+

s/multiple/multiply



Comment at: clang/docs/LanguageExtensions.rst:3050
+
+The full syntax this pragma supports is ``float_control(except|precise, 
on|off, [push])`` and ``float_control(push|pop)``.  The ``push`` and ``pop`` 
forms can only occur at file scope.
+

Looks like you need a line break here.



Comment at: clang/docs/LanguageExtensions.rst:3050
+
+The full syntax this pragma supports is ``float_control(except|precise, 
on|off, [push])`` and ``float_control(push|pop)``.  The ``push`` and ``pop`` 
forms can only occur at file scope.
+

andrew.w.kaylor wrote:
> Looks like you need a line break here.
Are the precise and except stacks independent?



Comment at: clang/include/clang/Basic/DiagnosticParseKinds.td:1112
   InGroup;
 // - #pragma fp_contract
+def err_pragma_file_or_compound_scope : Error<

This comment is wrong after your change.



Comment at: clang/include/clang/Basic/DiagnosticParseKinds.td:1135
+def err_pragma_float_control_malformed : Error<
+  "pragma float_control is malformed; it requires one or two comma-separated "
+  "arguments">;

This isn't quite accurate. The pop case has no comma-separated arguments. It 
might be better to print the full syntax here if that's feasible.



Comment at: clang/include/clang/Basic/LangOptions.h:363
+exceptions(LangOptions::FPE_Ignore),
+fp_precise(false)
 {}

It seems like fp_precise describes too many things to be a single option. Even 
within this set of options it overlaps with fp_contract.



Comment at: clang/test/CodeGen/fp-floatcontrol-stack.cpp:124
+#if DEFAULT
+//CHECK-DDEFAULT: fmul float
+//CHECK-DDEFAULT: fadd float

Are there also fast-math flags set here? If not, why not?



Comment at: clang/test/CodeGen/fp-floatcontrol-stack.cpp:2
+// RUN: %clang_cc1 -DDEFAULT=1 -emit-llvm -o - %s | FileCheck 
--check-prefix=CHECK-DDEFAULT %s
+// RUN: %clang_cc1 -DEBSTRICT=1 -ffp-exception-behavior=strict -emit-llvm -o - 
%s | FileCheck --check-prefix=CHECK-DEBSTRICT %s
+

Can you add run lines for -ffast-math and (separately) "-fno-honor-nans 
-fno-honor-infinities"?



Comment at: clang/test/CodeGen/fp-floatcontrol-stack.cpp:17
+// for exception behavior and rounding mode.
+//CHECK-DEBSTRICT: 
llvm.experimental.constrained.fmul{{.*}}tonearest{{.*}}strict
+#endif

There should be a constrained fadd here also, right?



Comment at: clang/test/CodeGen/fp-floatcontrol-stack.cpp:52
+#if EBSTRICT
+//CHECK-DEBSTRICT: 
llvm.experimental.constrained.fmul{{.*}}tonearest{{.*}}ignore
+#endif

Why is the constrained intrinsic generated in this case? If we've got both 
constraints set to the defaults at the file scope I would have expected that to 
turn off the constrained mode.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72841/new/

https://reviews.llvm.org/D72841



___
cfe-commits mailing list
cfe-commit

[PATCH] D72906: [X86] Improve X86 cmpps/cmppd/cmpss/cmpsd intrinsics with strictfp

2020-01-29 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor accepted this revision.
andrew.w.kaylor added a comment.
This revision is now accepted and ready to land.

lgtm

I have a couple of comments, but nothing that couldn't be addressed in a later 
patch.




Comment at: clang/lib/CodeGen/CGBuiltin.cpp:12363
+Cmp = Builder.CreateFCmp(Pred, Ops[0], Ops[1]);
   return EmitX86MaskedCompareResult(*this, Cmp, NumElts, Ops[3]);
 }

How hard would it be to generate a select with known safe values ahead of the 
compare in the constrained case? 



Comment at: clang/test/CodeGen/avx-builtins-constrained.c:170
+  // CHECK-LABEL: test_mm256_cmp_pd_false_os
+  // CHECK: call <4 x double> @llvm.x86.avx.cmp.pd.256(<4 x double> %{{.*}}, 
<4 x double> %{{.*}}, i8 27)
+  return _mm256_cmp_pd(a, b, _CMP_FALSE_OS);

Does this have the strictfp attribute here? I don't think we do anything with 
that, but it will likely be useful when we start handling strictfp for 
target-specific intrinsics.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72906/new/

https://reviews.llvm.org/D72906



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69978: Separately track input and output denormal mode

2020-01-29 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added inline comments.



Comment at: llvm/docs/LangRef.rst:1829
+   operations. The second indicates the handling of denormal inputs to
+   floating point instructions.
+

Based on the changes below, if the second value is omitted the input mode will 
be assumed to be the same as the output mode. That should probably be 
documented. I guess you intend for that not to happen, but the documentation 
here leaves the result ambiguous if it does happen.



Comment at: llvm/docs/LangRef.rst:1848
+   should be converted to 0 as if by ``@llvm.canonicalize`` during
+   lowering.
+

Is this saying that if a backend generates an instruction that doesn't handle 
the hardware daz mode then it must insert instructions to check for normals and 
convert them to zero? If so, do you intend this to apply to all such 
instructions or only instructions that aren't able to accept denormal inputs?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69978/new/

https://reviews.llvm.org/D69978



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D147733: Set rounding_mode to tonearest in presence of a #pragma STDC FENV_ACCESS OFF.

2023-04-07 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added inline comments.



Comment at: clang/lib/Sema/SemaAttr.cpp:1340
+  if (!IsEnabled)
+NewFPFeatures.setDisallowFenvAccess(IsEnabled);
   FpPragmaStack.Act(Loc, PSK_Set, StringRef(), NewFPFeatures);

Why is this only needed for "!IsEnabled"? Where is the rounding mode set in the 
IsEnabled case? It looks like setAllowFEnvAccessOverride() is defined by a 
macro and just sets a bit in the OverrideMask, but somehow it seems to be 
setting the rounding mode to Round.Dynamic. I'm just concerned that there is a 
disconnect in the implementation here.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D147733/new/

https://reviews.llvm.org/D147733

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D147733: Set rounding_mode to tonearest in presence of a #pragma STDC FENV_ACCESS OFF.

2023-04-07 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added inline comments.



Comment at: clang/test/CodeGen/pragma-fenv_access.c:239
+// CHECK-LABEL: @func_20
+// STRICT: call float @llvm.experimental.constrained.fadd.f32(float {{.*}}, 
float {{.*}}, metadata !"round.tonearest", metadata !"fpexcept.strict")
+// DEFAULT: fadd float

pengfei wrote:
> Should this be `ignore`?
This is a tricky case. By a strict reading of the C standard, this could be 
ignore, because the standard says the compiler can assume the default floating 
point environment when FENV_ACCESS is OFF and that if code compiled with 
FENV_ACCESS OFF is executed with anything other than the default environment 
the behavior is undefined. However, in this case strict exception semantics 
have been enabled elsewhere in the compilation unit, so floating point 
exceptions may be unmasked. The standard allows us to ignore exceptions, but 
raising a spurious exception may be bad for users.

I'm unsure about this case. I lean towards leaving it as Zahira has it here 
because I don't think the use of strict exception semantics will be common 
enough to justify the less conservative behavior.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D147733/new/

https://reviews.llvm.org/D147733

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D147733: Set rounding_mode to tonearest in presence of a #pragma STDC FENV_ACCESS OFF.

2023-04-10 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added inline comments.



Comment at: clang/test/CodeGen/pragma-fenv_access.c:4
 // RUN: %clang_cc1 -fexperimental-strict-floating-point 
-ffp-exception-behavior=strict -triple %itanium_abi_triple -emit-llvm %s -o - 
-fms-extensions -DMS | FileCheck --check-prefixes=CHECK,STRICT %s
+// RUN %clang_cc1 -fexperimental-strict-floating-point -frounding-math 
-ffp-exception-behavior=strict -triple %itanium_abi_triple -emit-llvm %s -o - 
-fms-extensions -DMS | FileCheck --check-prefixes=CHECK,STRICT %s
 // RUN: %clang_cc1 -fexperimental-strict-floating-point -triple 
%itanium_abi_triple -emit-llvm %s -o - | FileCheck 
--check-prefixes=CHECK,DEFAULT %s

Why doesn't this case use STRICT-RND? round.tonearest, but I don't understand 
why.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D147733/new/

https://reviews.llvm.org/D147733

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D147733: Set rounding_mode to tonearest in presence of a #pragma STDC FENV_ACCESS OFF.

2023-04-11 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor accepted this revision.
andrew.w.kaylor added a comment.

Looks good to me.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D147733/new/

https://reviews.llvm.org/D147733

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D151834: Include math-errno with fast-math

2023-06-02 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added inline comments.



Comment at: clang/include/clang/Basic/FPOptions.def:30
 OPTION(BFloat16ExcessPrecision, LangOptions::ExcessPrecisionKind, 2, 
FPEvalMethod)
+OPTION(MathErrno, bool, 1, BFloat16ExcessPrecision)
 #undef OPTION

Does this mean MathErrno is tracked in both LangOpts and FPOptions?



Comment at: clang/lib/CodeGen/CGBuiltin.cpp:2248
   FD->hasAttr() ? 0 : BuiltinID;
+  bool MathErrnoOverrride = false;
+  if (E->hasStoredFPFeatures()) {

You should add a comment here explaining what this is doing. I don't think it's 
obvious apart from the description of this patch.



Comment at: clang/lib/CodeGen/CGCall.cpp:2384
 
+  if (HasOptnone && !getLangOpts().MathErrno)
+OptNone = false;

I don't understand what this is doing.



Comment at: clang/lib/CodeGen/CodeGenModule.h:611
   void clear();
+  bool OptNone = true;
 

Why is this a module level flag, and why does it default to true?



Comment at: clang/test/CodeGen/math-errno.c:2
+// Tests that at -O2 math-errno is taken into account. Same than MSVC.
+// RUN: %clang_cc1 -Wno-implicit-function-declaration -fmath-errno \
+// RUN: -O2 -emit-llvm -o - %s \

Isn't math-errno true by default when fast-math isn't used?



Comment at: clang/test/CodeGen/math-errno.c:33
+
+__attribute__((optnone))
+float f4(float x) { 

Can you add a runline with -O0. That should prevent all instances of the 
intrinsics, right?



Comment at: clang/test/CodeGen/math-errno.c:49
+// FAST-LABEL: define dso_local nofpclass(nan inf) float @f4(float noundef 
nofpclass(nan inf) {{.*}})
+// FAST: call reassoc nnan ninf nsz arcp afn nofpclass(nan inf) float 
@sqrtf(float noundef nofpclass(nan inf) %0) #[[ATTR0:[0-9]+]]
+

I think the 'afn' flag here is a problem. The backend has no concept of errno, 
so 'afn' will be treated as allowing the function to be replaced.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D151834/new/

https://reviews.llvm.org/D151834

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D142907: LangRef: Add "dynamic" option to "denormal-fp-math"

2023-02-10 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor added a comment.

In general, it seems like the denormal mode should be considered part of the 
floating point environment (though as far as I know the C standard, at least, 
doesn't document it as such). If it were considered part of the floating point 
environment, the LLVM rules would tell us we could assume the default setting, 
which I'd assume to be IEEE, and it would only be legal to change this mode in 
strict mode. However, for your use case preserving the behavior of fpclass 
seems like what users would want, even in fast-math modes. In this sense, this 
is a lot like the problem we have with preserving isnan() behavior when 
fast-math is enabled. Our rules allow it, but it's not what most people would 
want.

I think the new denormal mode is a good addition.

Do you need to do something with the inliner to handle the case where functions 
with different denormal modes are inlined into one another? We don't seem to 
handle that case correctly now (https://godbolt.org/z/PEsWaMEq6), but with the 
dynamic mode we could handle it without blocking inlining completely.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D142907/new/

https://reviews.llvm.org/D142907

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D144454: Add builtin for llvm set rounding

2023-02-21 Thread Andy Kaylor via Phabricator via cfe-commits
andrew.w.kaylor accepted this revision.
andrew.w.kaylor added a comment.
This revision is now accepted and ready to land.

This looks good to me. I'm adding @rjmccall as a reviewer to make sure someone 
with front end expertise sees this.

There seems to be a potential type mismatch between the intrinsic and the 
builtin in the case of __builtin_flt_rounds(). Based on D24461 
, I wonder if that's a target-specific 
difference. I'm not sure if a similar issue exists in this case since we aren't 
copying a gcc builtin.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144454/new/

https://reviews.llvm.org/D144454

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


  1   2   >