[clang] [Driver][BoundsSafety] Add -fbounds-safety-experimental flag (PR #70480)

2023-11-07 Thread Yeoul Na via cfe-commits


@@ -3618,6 +3618,27 @@ void CompilerInvocationBase::GenerateLangArgs(const 
LangOptions &Opts,
 GenerateArg(Consumer, OPT_frandomize_layout_seed_EQ, Opts.RandstructSeed);
 }
 
+static void CheckBoundsSafetyLang(InputKind IK, DiagnosticsEngine &Diags) {

rapidsna wrote:

> I think it's better to handle this language check in Driver, similar to 
> OPT_fminimize_whitespace but for all Inputs.

@MaskRay If we do it in Driver, wouldn't the option be silently ignored for 
unsupported languages when we invoke `clang -cc1` directly?

https://github.com/llvm/llvm-project/pull/70480
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Driver][BoundsSafety] Add -fbounds-safety-experimental flag (PR #70480)

2023-11-07 Thread Yeoul Na via cfe-commits


@@ -3618,6 +3618,27 @@ void CompilerInvocationBase::GenerateLangArgs(const 
LangOptions &Opts,
 GenerateArg(Consumer, OPT_frandomize_layout_seed_EQ, Opts.RandstructSeed);
 }
 
+static void CheckBoundsSafetyLang(InputKind IK, DiagnosticsEngine &Diags) {

rapidsna wrote:

IMHO, that might be problematic because people would reasonably expect 
bounds-safety to work for C++/Obj-C/Obj-C++.

https://github.com/llvm/llvm-project/pull/70480


[clang] [Driver][BoundsSafety] Add -fbounds-safety-experimental flag (PR #70480)

2023-11-07 Thread Yeoul Na via cfe-commits

https://github.com/rapidsna updated 
https://github.com/llvm/llvm-project/pull/70480

From 99ec6e055dd32a86bf6d589a6895658dcbe1d7bd Mon Sep 17 00:00:00 2001
From: Yeoul Na 
Date: Fri, 27 Oct 2023 08:34:37 -0700
Subject: [PATCH 1/8] [Driver][BoundsSafety] Add -fbounds-safety-experimental
 flag

-fbounds-safety-experimental is an experimental flag for
-fbounds-safety, which is a bounds-safety extension for C.
-fbounds-safety will require substantial changes across the Clang
codebase, so we introduce this experimental flag to gate our
incremental patches until we push the essential functionality of
the extension.

-fbounds-safety-experimental currently does nothing except
report an error when the flag is used with an unsupported
source language (only C is currently supported).
---
 .../clang/Basic/DiagnosticFrontendKinds.td|  3 +++
 clang/include/clang/Basic/LangOptions.def |  2 ++
 clang/include/clang/Driver/Options.td |  8 +++
 clang/lib/Driver/ToolChains/Clang.cpp |  3 +++
 clang/lib/Frontend/CompilerInvocation.cpp | 23 +++
 clang/test/BoundsSafety/Driver/driver.c   |  9 
 .../Frontend/only_c_is_supported.c| 15 
 7 files changed, 63 insertions(+)
 create mode 100644 clang/test/BoundsSafety/Driver/driver.c
 create mode 100644 clang/test/BoundsSafety/Frontend/only_c_is_supported.c

diff --git a/clang/include/clang/Basic/DiagnosticFrontendKinds.td 
b/clang/include/clang/Basic/DiagnosticFrontendKinds.td
index 715e0c0dc8fa84e..edcbbe992377e12 100644
--- a/clang/include/clang/Basic/DiagnosticFrontendKinds.td
+++ b/clang/include/clang/Basic/DiagnosticFrontendKinds.td
@@ -330,6 +330,9 @@ def warn_alias_with_section : Warning<
   "as the %select{aliasee|resolver}2">,
   InGroup;
 
+def error_bounds_safety_lang_not_supported : Error<
+  "bounds safety is only supported for C">;
+
 let CategoryName = "Instrumentation Issue" in {
 def warn_profile_data_out_of_date : Warning<
   "profile data may be out of date: of %0 function%s0, %1 
%plural{1:has|:have}1"
diff --git a/clang/include/clang/Basic/LangOptions.def 
b/clang/include/clang/Basic/LangOptions.def
index c0ea4ecb9806a5b..222812d876a65f8 100644
--- a/clang/include/clang/Basic/LangOptions.def
+++ b/clang/include/clang/Basic/LangOptions.def
@@ -470,6 +470,8 @@ VALUE_LANGOPT(FuchsiaAPILevel, 32, 0, "Fuchsia API level")
 // on large _BitInts.
 BENIGN_VALUE_LANGOPT(MaxBitIntWidth, 32, 128, "Maximum width of a _BitInt")
 
+LANGOPT(BoundsSafety, 1, 0, "Bounds safety extension for C")
+
 LANGOPT(IncrementalExtensions, 1, 0, " True if we want to process statements"
 "on the global scope, ignore EOF token and continue later on (thus "
 "avoid tearing the Lexer and etc. down). Controlled by "
diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 7f3f5125d42e7a9..3eb98c8ee2950a1 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -1732,6 +1732,14 @@ def fswift_async_fp_EQ : Joined<["-"], 
"fswift-async-fp=">,
 NormalizedValues<["Auto", "Always", "Never"]>,
 MarshallingInfoEnum, "Always">;
 
+defm bounds_safety : BoolFOption<
+  "bounds-safety-experimental",
+  LangOpts<"BoundsSafety">, DefaultFalse,
+  PosFlag,
+  NegFlag,
+  BothFlags<[], [ClangOption, CC1Option],
+  " experimental bounds safety extension for C">>;
+
 defm addrsig : BoolFOption<"addrsig",
   CodeGenOpts<"Addrsig">, DefaultFalse,
   PosFlag,
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp 
b/clang/lib/Driver/ToolChains/Clang.cpp
index 43a92adbef64ba8..7482b852fb37958 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -6689,6 +6689,9 @@ void Clang::ConstructJob(Compilation &C, const JobAction 
&JA,
   Args.addOptOutFlag(CmdArgs, options::OPT_fassume_sane_operator_new,
  options::OPT_fno_assume_sane_operator_new);
 
+  Args.addOptInFlag(CmdArgs, options::OPT_fbounds_safety,
+options::OPT_fno_bounds_safety);
+
   // -fblocks=0 is default.
   if (Args.hasFlag(options::OPT_fblocks, options::OPT_fno_blocks,
TC.IsBlocksDefault()) ||
diff --git a/clang/lib/Frontend/CompilerInvocation.cpp 
b/clang/lib/Frontend/CompilerInvocation.cpp
index fd6c250efeda2a8..f785bd504d63a81 100644
--- a/clang/lib/Frontend/CompilerInvocation.cpp
+++ b/clang/lib/Frontend/CompilerInvocation.cpp
@@ -3618,6 +3618,23 @@ void CompilerInvocationBase::GenerateLangArgs(const 
LangOptions &Opts,
 GenerateArg(Consumer, OPT_frandomize_layout_seed_EQ, Opts.RandstructSeed);
 }
 
+static bool SupportsBoundsSafety(Language Lang) {
+  // Currently, bounds safety is only supported for C. However, it's also
+  // possible to pass assembly files and LLVM IR through Clang, and
+  // those should be trivially supported. This is especially important because
+  // some build systems, like xcbuild and somewhat clumsy Makefiles, will pass
+ 

[clang] [Driver][BoundsSafety] Add -fexperimental-bounds-safety flag (PR #70480)

2023-11-07 Thread Yeoul Na via cfe-commits

https://github.com/rapidsna edited 
https://github.com/llvm/llvm-project/pull/70480


[clang] [Driver][BoundsSafety] Add -fexperimental-bounds-safety flag (PR #70480)

2023-11-07 Thread Yeoul Na via cfe-commits


@@ -1412,6 +1412,9 @@ def FunctionMultiVersioning
 
 def NoDeref : DiagGroup<"noderef">;
 
+// Bounds safety specific warnings
+def IgnoredBoundsSafety : DiagGroup<"ignored-bounds-safety">;

rapidsna wrote:

Removed it!

https://github.com/llvm/llvm-project/pull/70480


[clang] [Driver][BoundsSafety] Add -fexperimental-bounds-safety flag (PR #70480)

2023-11-07 Thread Yeoul Na via cfe-commits

rapidsna wrote:

> > [Driver][BoundsSafety] Add -fbounds-safety-experimental flag #70480
> 
> The patch implements `-fexperimental-bounds-safety`

I just updated the title. I'll make sure I update the text when I squash the 
commits.


https://github.com/llvm/llvm-project/pull/70480


[llvm] [clang-tools-extra] [clang] [CodeGen] Revamp counted_by calculations (PR #70606)

2023-11-08 Thread Yeoul Na via cfe-commits


@@ -859,53 +860,93 @@ CodeGenFunction::emitBuiltinObjectSize(const Expr *E, 
unsigned Type,
   }
 
   if (IsDynamic) {
-LangOptions::StrictFlexArraysLevelKind StrictFlexArraysLevel =
-getLangOpts().getStrictFlexArraysLevel();
+// The code generated here calculates the size of a struct with a flexible
+// array member that uses the counted_by attribute. There are two instances
+// we handle:
+//
+//   struct s {
+// unsigned long flags;
+// int count;
+// int array[] __attribute__((counted_by(count)));
+//   }
+//
+//   1) bdos of the flexible array itself:
+//
+// __builtin_dynamic_object_size(p->array, 1) ==
+// p->count * sizeof(*p->array)
+//
+//   2) bdos of a pointer into the flexible array:
+//
+// __builtin_dynamic_object_size(&p->array[42], 1) ==
+// (p->count - 42) * sizeof(*p->array)
+//
+//   2) bdos of the whole struct, including the flexible array:
+//
+// __builtin_dynamic_object_size(p, 1) ==
+//max(sizeof(struct s),
+//offsetof(struct s, array) + p->count * sizeof(*p->array))
+//
 const Expr *Base = E->IgnoreParenImpCasts();
-
-if (FieldDecl *FD = FindCountedByField(Base, StrictFlexArraysLevel)) {
-  const auto *ME = dyn_cast(Base);
-  llvm::Value *ObjectSize = nullptr;
-
-  if (!ME) {
-const auto *DRE = dyn_cast(Base);
-ValueDecl *VD = nullptr;
-
-ObjectSize = ConstantInt::get(
-ResType,
-getContext().getTypeSize(DRE->getType()->getPointeeType()) / 8,
-true);
-
-if (auto *RD = DRE->getType()->getPointeeType()->getAsRecordDecl())
-  VD = RD->getLastField();
-
-Expr *ICE = ImplicitCastExpr::Create(
-getContext(), DRE->getType(), CK_LValueToRValue,
-const_cast(cast(DRE)), nullptr, VK_PRValue,
-FPOptionsOverride());
-ME = MemberExpr::CreateImplicit(getContext(), ICE, true, VD,
-VD->getType(), VK_LValue, OK_Ordinary);
+const Expr *Idx = nullptr;
+if (const auto *UO = dyn_cast(Base);
+UO && UO->getOpcode() == UO_AddrOf) {
+  if (const auto *ASE = dyn_cast(UO->getSubExpr())) {

rapidsna wrote:

Maybe we should do 
`dyn_cast<ArraySubscriptExpr>(UO->getSubExpr()->IgnoreParens())`, for cases like 
`&(f->fam[0])`?
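
For illustration only, here is a toy model of why skipping parentheses matters before the cast. The `Expr`, `ignoreParens`, and `asArraySubscript` names are hypothetical stand-ins, not Clang's classes or APIs. The sketch only assumes that a parenthesized expression is its own wrapper node, so a checked downcast applied to the wrapper fails until the parentheses are stripped.

```cpp
// Toy stand-in for an AST node; this is NOT Clang's Expr hierarchy.
struct Expr {
  enum Kind { Paren, ArraySubscript, Other };
  Kind kind;
  const Expr *sub; // wrapped expression, only meaningful for Paren nodes
};

// Skip any number of enclosing parentheses (hypothetical helper that
// plays the role of an IgnoreParens()-style accessor).
inline const Expr *ignoreParens(const Expr *e) {
  while (e && e->kind == Expr::Paren)
    e = e->sub;
  return e;
}

// Return the node if it is an array subscript, otherwise nullptr.
// This mimics what a checked downcast would report.
inline const Expr *asArraySubscript(const Expr *e) {
  return (e && e->kind == Expr::ArraySubscript) ? e : nullptr;
}
```

With `&(f->fam[0])` modeled as a `Paren` node wrapping an `ArraySubscript` node, `asArraySubscript` on the wrapper yields `nullptr`, while `asArraySubscript(ignoreParens(...))` finds the subscript.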

https://github.com/llvm/llvm-project/pull/70606


[llvm] [clang-tools-extra] [clang] [CodeGen] Revamp counted_by calculations (PR #70606)

2023-11-08 Thread Yeoul Na via cfe-commits


@@ -859,53 +860,93 @@ CodeGenFunction::emitBuiltinObjectSize(const Expr *E, 
unsigned Type,
   }
 
   if (IsDynamic) {
-LangOptions::StrictFlexArraysLevelKind StrictFlexArraysLevel =
-getLangOpts().getStrictFlexArraysLevel();
+// The code generated here calculates the size of a struct with a flexible
+// array member that uses the counted_by attribute. There are two instances
+// we handle:
+//
+//   struct s {
+// unsigned long flags;
+// int count;
+// int array[] __attribute__((counted_by(count)));
+//   }
+//
+//   1) bdos of the flexible array itself:
+//
+// __builtin_dynamic_object_size(p->array, 1) ==
+// p->count * sizeof(*p->array)
+//
+//   2) bdos of a pointer into the flexible array:
+//
+// __builtin_dynamic_object_size(&p->array[42], 1) ==
+// (p->count - 42) * sizeof(*p->array)
+//
+//   2) bdos of the whole struct, including the flexible array:
+//
+// __builtin_dynamic_object_size(p, 1) ==
+//max(sizeof(struct s),
+//offsetof(struct s, array) + p->count * sizeof(*p->array))
+//
 const Expr *Base = E->IgnoreParenImpCasts();
-
-if (FieldDecl *FD = FindCountedByField(Base, StrictFlexArraysLevel)) {
-  const auto *ME = dyn_cast(Base);
-  llvm::Value *ObjectSize = nullptr;
-
-  if (!ME) {
-const auto *DRE = dyn_cast(Base);
-ValueDecl *VD = nullptr;
-
-ObjectSize = ConstantInt::get(
-ResType,
-getContext().getTypeSize(DRE->getType()->getPointeeType()) / 8,
-true);
-
-if (auto *RD = DRE->getType()->getPointeeType()->getAsRecordDecl())
-  VD = RD->getLastField();
-
-Expr *ICE = ImplicitCastExpr::Create(
-getContext(), DRE->getType(), CK_LValueToRValue,
-const_cast(cast(DRE)), nullptr, VK_PRValue,
-FPOptionsOverride());
-ME = MemberExpr::CreateImplicit(getContext(), ICE, true, VD,
-VD->getType(), VK_LValue, OK_Ordinary);
+const Expr *Idx = nullptr;
+if (const auto *UO = dyn_cast(Base);
+UO && UO->getOpcode() == UO_AddrOf) {
+  if (const auto *ASE = dyn_cast(UO->getSubExpr())) {
+Base = ASE->getBase();
+Idx = ASE->getIdx()->IgnoreParenImpCasts();
+if (const auto *IL = dyn_cast(Idx);
+IL && !IL->getValue().getZExtValue()) {
+  Idx = nullptr;
+}
   }
+}
 
-  // At this point, we know that \p ME is a flexible array member.
-  const auto *ArrayTy = getContext().getAsArrayType(ME->getType());
-  unsigned Size = getContext().getTypeSize(ArrayTy->getElementType());
+if (const ValueDecl *CountedByFD = FindCountedByField(Base)) {
+  const RecordDecl *OuterRD =
+  CountedByFD->getDeclContext()->getOuterLexicalRecordContext();
+  ASTContext &Ctx = getContext();
 
-  llvm::Value *CountField =
-  EmitAnyExprToTemp(MemberExpr::CreateImplicit(
-getContext(), const_cast(ME->getBase()),
-ME->isArrow(), FD, FD->getType(), VK_LValue,
-OK_Ordinary))
-  .getScalarVal();
+  // Load the counted_by field.
+  const Expr *CountedByExpr = BuildCountedByFieldExpr(Base, CountedByFD);
+  llvm::Value *CountedByInst =
+  EmitAnyExprToTemp(CountedByExpr).getScalarVal();
 
-  llvm::Value *Mul = Builder.CreateMul(
-  CountField, llvm::ConstantInt::get(CountField->getType(), Size / 8));
-  Mul = Builder.CreateZExtOrTrunc(Mul, ResType);
+  if (Idx) {
+llvm::Value *IdxInst = EmitAnyExprToTemp(Idx).getScalarVal();
+IdxInst = Builder.CreateZExtOrTrunc(IdxInst, CountedByInst->getType());

rapidsna wrote:

I think the index can be negative, e.g. `&(fp->fam[-1])`, and doing `ZExt` on an 
`int` value of `-1` may give us an unwanted result here.
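
A small standalone sketch of the concern, assuming a 32-bit signed index that is widened to a 64-bit count type. The widths are an assumption for illustration, the helper names are hypothetical, and this is not the CodeGen code itself.

```cpp
#include <cstdint>

// Zero-extension: reinterpret the 32-bit pattern as unsigned, then widen.
// The upper 32 bits are filled with zeros regardless of the sign bit.
inline std::uint64_t zeroExtend(std::int32_t v) {
  return static_cast<std::uint64_t>(static_cast<std::uint32_t>(v));
}

// Sign-extension: widen while preserving the numeric value.
// The upper 32 bits are copies of the sign bit.
inline std::int64_t signExtend(std::int32_t v) {
  return static_cast<std::int64_t>(v);
}
```

Under these assumptions `zeroExtend(-1)` is 4294967295 while `signExtend(-1)` stays -1, so a `count - idx` computed from the zero-extended value would be far from the intended result.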

https://github.com/llvm/llvm-project/pull/70606


[clang-tools-extra] [llvm] [clang] [CodeGen] Revamp counted_by calculations (PR #70606)

2023-11-08 Thread Yeoul Na via cfe-commits


@@ -859,53 +860,93 @@ CodeGenFunction::emitBuiltinObjectSize(const Expr *E, 
unsigned Type,
   }
 
   if (IsDynamic) {
-LangOptions::StrictFlexArraysLevelKind StrictFlexArraysLevel =
-getLangOpts().getStrictFlexArraysLevel();
+// The code generated here calculates the size of a struct with a flexible
+// array member that uses the counted_by attribute. There are two instances
+// we handle:
+//
+//   struct s {
+// unsigned long flags;
+// int count;
+// int array[] __attribute__((counted_by(count)));
+//   }
+//
+//   1) bdos of the flexible array itself:
+//
+// __builtin_dynamic_object_size(p->array, 1) ==
+// p->count * sizeof(*p->array)
+//
+//   2) bdos of a pointer into the flexible array:
+//
+// __builtin_dynamic_object_size(&p->array[42], 1) ==
+// (p->count - 42) * sizeof(*p->array)
+//
+//   2) bdos of the whole struct, including the flexible array:
+//
+// __builtin_dynamic_object_size(p, 1) ==
+//max(sizeof(struct s),
+//offsetof(struct s, array) + p->count * sizeof(*p->array))
+//
 const Expr *Base = E->IgnoreParenImpCasts();
-
-if (FieldDecl *FD = FindCountedByField(Base, StrictFlexArraysLevel)) {
-  const auto *ME = dyn_cast(Base);
-  llvm::Value *ObjectSize = nullptr;
-
-  if (!ME) {
-const auto *DRE = dyn_cast(Base);
-ValueDecl *VD = nullptr;
-
-ObjectSize = ConstantInt::get(
-ResType,
-getContext().getTypeSize(DRE->getType()->getPointeeType()) / 8,
-true);
-
-if (auto *RD = DRE->getType()->getPointeeType()->getAsRecordDecl())
-  VD = RD->getLastField();
-
-Expr *ICE = ImplicitCastExpr::Create(
-getContext(), DRE->getType(), CK_LValueToRValue,
-const_cast(cast(DRE)), nullptr, VK_PRValue,
-FPOptionsOverride());
-ME = MemberExpr::CreateImplicit(getContext(), ICE, true, VD,
-VD->getType(), VK_LValue, OK_Ordinary);
+const Expr *Idx = nullptr;
+if (const auto *UO = dyn_cast(Base);
+UO && UO->getOpcode() == UO_AddrOf) {
+  if (const auto *ASE = dyn_cast(UO->getSubExpr())) {
+Base = ASE->getBase();
+Idx = ASE->getIdx()->IgnoreParenImpCasts();
+if (const auto *IL = dyn_cast(Idx);
+IL && !IL->getValue().getZExtValue()) {
+  Idx = nullptr;
+}
   }
+}
 
-  // At this point, we know that \p ME is a flexible array member.
-  const auto *ArrayTy = getContext().getAsArrayType(ME->getType());
-  unsigned Size = getContext().getTypeSize(ArrayTy->getElementType());
+if (const ValueDecl *CountedByFD = FindCountedByField(Base)) {
+  const RecordDecl *OuterRD =
+  CountedByFD->getDeclContext()->getOuterLexicalRecordContext();
+  ASTContext &Ctx = getContext();
 
-  llvm::Value *CountField =
-  EmitAnyExprToTemp(MemberExpr::CreateImplicit(
-getContext(), const_cast(ME->getBase()),
-ME->isArrow(), FD, FD->getType(), VK_LValue,
-OK_Ordinary))
-  .getScalarVal();
+  // Load the counted_by field.
+  const Expr *CountedByExpr = BuildCountedByFieldExpr(Base, CountedByFD);
+  llvm::Value *CountedByInst =
+  EmitAnyExprToTemp(CountedByExpr).getScalarVal();
 
-  llvm::Value *Mul = Builder.CreateMul(
-  CountField, llvm::ConstantInt::get(CountField->getType(), Size / 8));
-  Mul = Builder.CreateZExtOrTrunc(Mul, ResType);
+  if (Idx) {
+llvm::Value *IdxInst = EmitAnyExprToTemp(Idx).getScalarVal();
+IdxInst = Builder.CreateZExtOrTrunc(IdxInst, CountedByInst->getType());
+CountedByInst = Builder.CreateSub(CountedByInst, IdxInst);
+  }
 
-  if (ObjectSize)
-return Builder.CreateAdd(ObjectSize, Mul);
+  // Get the size of the flexible array member's base type.
+  const ValueDecl *FAM = FindFlexibleArrayMemberField(Ctx, OuterRD);
+  const ArrayType *ArrayTy = Ctx.getAsArrayType(FAM->getType());
+  CharUnits Size = Ctx.getTypeSizeInChars(ArrayTy->getElementType());
+  llvm::Constant *ElemSize =
+  llvm::ConstantInt::get(CountedByInst->getType(), Size.getQuantity());
+
+  llvm::Value *FAMSize = Builder.CreateMul(CountedByInst, ElemSize);

rapidsna wrote:

What happens if `i > count`, so that `CountedByInst` (`count - i`) becomes 
negative? More generally, what should `__bdos` return if `&a[i]` points outside 
the array bounds (whether it's a FAM or an ordinary array)? Is that defined?
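
To make the question concrete, here is a minimal sketch of the unchecked arithmetic, assuming 64-bit unsigned operands. The function name and types are hypothetical, chosen only to show the wraparound; the patch itself operates on LLVM IR values.

```cpp
#include <cstdint>

// Remaining bytes computed as (count - idx) * elemSize with no range
// check. In unsigned arithmetic, idx > count makes the subtraction wrap.
inline std::uint64_t naiveRemainingBytes(std::uint64_t count,
                                         std::uint64_t idx,
                                         std::uint64_t elemSize) {
  return (count - idx) * elemSize;
}
```

With `count = 4`, `idx = 5`, and a 4-byte element, the subtraction wraps and the result is 18446744073709551612 rather than anything resembling a remaining size.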

https://github.com/llvm/llvm-project/pull/70606

[clang] [clang-tools-extra] [llvm] [CodeGen] Revamp counted_by calculations (PR #70606)

2023-11-08 Thread Yeoul Na via cfe-commits


@@ -859,53 +860,93 @@ CodeGenFunction::emitBuiltinObjectSize(const Expr *E, 
unsigned Type,
[...]
+  if (Idx) {
+llvm::Value *IdxInst = EmitAnyExprToTemp(Idx).getScalarVal();
+IdxInst = Builder.CreateZExtOrTrunc(IdxInst, CountedByInst->getType());

rapidsna wrote:

And I'm curious what `__bdos` is supposed to return for `__bdos(&fp->fam[-1], 
1)`.

https://github.com/llvm/llvm-project/pull/70606


[llvm] [clang-tools-extra] [clang] [CodeGen] Revamp counted_by calculations (PR #70606)

2023-11-08 Thread Yeoul Na via cfe-commits


@@ -859,53 +860,93 @@ CodeGenFunction::emitBuiltinObjectSize(const Expr *E, 
unsigned Type,
[...]
+  // Get the size of the flexible array member's base type.
+  const ValueDecl *FAM = FindFlexibleArrayMemberField(Ctx, OuterRD);
+  const ArrayType *ArrayTy = Ctx.getAsArrayType(FAM->getType());

rapidsna wrote:

Should we handle the case where `FAM == nullptr`?

https://github.com/llvm/llvm-project/pull/70606


[clang] [clang-tools-extra] [llvm] [CodeGen] Revamp counted_by calculations (PR #70606)

2023-11-08 Thread Yeoul Na via cfe-commits


@@ -859,53 +860,93 @@ CodeGenFunction::emitBuiltinObjectSize(const Expr *E, 
unsigned Type,
[...]
+  // Get the size of the flexible array member's base type.
+  const ValueDecl *FAM = FindFlexibleArrayMemberField(Ctx, OuterRD);
+  const ArrayType *ArrayTy = Ctx.getAsArrayType(FAM->getType());

rapidsna wrote:

I see. So it sounds like this relies on the fact that counted_by can currently 
be added only to a flexible array member? Should we add an assertion here to 
make it easier to deal with when that assumption breaks in the future?
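
A rough sketch of the kind of assertion meant here, using toy `Field` and `Record` types that are hypothetical and unrelated to Clang's `FieldDecl`/`RecordDecl`. It assumes the flexible array member, if present, is the last field of the record.

```cpp
#include <cassert>
#include <vector>

// Toy stand-ins; NOT Clang's declarations.
struct Field {
  bool isFlexibleArray;
};
struct Record {
  std::vector<Field> fields;
};

// Return the trailing flexible array member, or nullptr if there is none.
inline const Field *findFlexibleArrayMember(const Record &r) {
  if (!r.fields.empty() && r.fields.back().isFlexibleArray)
    return &r.fields.back();
  return nullptr;
}

// Document the assumption at the point of use: if counted_by is ever
// allowed on something other than a flexible array member, this fires
// in assertion-enabled builds instead of dereferencing a null pointer.
inline const Field &requireFlexibleArrayMember(const Record &r) {
  const Field *fam = findFlexibleArrayMember(r);
  assert(fam && "counted_by is expected only on a flexible array member");
  return *fam;
}
```

The point of the wrapper is only that the assumption is stated where the pointer is used, so a later relaxation of the attribute's rules shows up as a clear assertion failure.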

https://github.com/llvm/llvm-project/pull/70606


[clang-tools-extra] [llvm] [clang] [CodeGen] Revamp counted_by calculations (PR #70606)

2023-11-08 Thread Yeoul Na via cfe-commits


@@ -859,53 +860,93 @@ CodeGenFunction::emitBuiltinObjectSize(const Expr *E, 
unsigned Type,
   }
 
   if (IsDynamic) {
-LangOptions::StrictFlexArraysLevelKind StrictFlexArraysLevel =
-getLangOpts().getStrictFlexArraysLevel();
+// The code generated here calculates the size of a struct with a flexible
+// array member that uses the counted_by attribute. There are two instances
+// we handle:
+//
+//   struct s {
+// unsigned long flags;
+// int count;
+// int array[] __attribute__((counted_by(count)));
+//   }
+//
+//   1) bdos of the flexible array itself:
+//
+// __builtin_dynamic_object_size(p->array, 1) ==
+// p->count * sizeof(*p->array)
+//
+//   2) bdos of a pointer into the flexible array:
+//
+// __builtin_dynamic_object_size(&p->array[42], 1) ==
+// (p->count - 42) * sizeof(*p->array)
+//
+//   2) bdos of the whole struct, including the flexible array:
+//
+// __builtin_dynamic_object_size(p, 1) ==
+//max(sizeof(struct s),
+//offsetof(struct s, array) + p->count * sizeof(*p->array))
+//
 const Expr *Base = E->IgnoreParenImpCasts();
-
-if (FieldDecl *FD = FindCountedByField(Base, StrictFlexArraysLevel)) {
-  const auto *ME = dyn_cast(Base);
-  llvm::Value *ObjectSize = nullptr;
-
-  if (!ME) {
-const auto *DRE = dyn_cast(Base);
-ValueDecl *VD = nullptr;
-
-ObjectSize = ConstantInt::get(
-ResType,
-getContext().getTypeSize(DRE->getType()->getPointeeType()) / 8,
-true);
-
-if (auto *RD = DRE->getType()->getPointeeType()->getAsRecordDecl())
-  VD = RD->getLastField();
-
-Expr *ICE = ImplicitCastExpr::Create(
-getContext(), DRE->getType(), CK_LValueToRValue,
-const_cast(cast(DRE)), nullptr, VK_PRValue,
-FPOptionsOverride());
-ME = MemberExpr::CreateImplicit(getContext(), ICE, true, VD,
-VD->getType(), VK_LValue, OK_Ordinary);
+const Expr *Idx = nullptr;
+if (const auto *UO = dyn_cast(Base);
+UO && UO->getOpcode() == UO_AddrOf) {
+  if (const auto *ASE = dyn_cast(UO->getSubExpr())) {
+Base = ASE->getBase();
+Idx = ASE->getIdx()->IgnoreParenImpCasts();
+if (const auto *IL = dyn_cast(Idx);
+IL && !IL->getValue().getZExtValue()) {
+  Idx = nullptr;
+}
   }
+}
 
-  // At this point, we know that \p ME is a flexible array member.
-  const auto *ArrayTy = getContext().getAsArrayType(ME->getType());
-  unsigned Size = getContext().getTypeSize(ArrayTy->getElementType());
+if (const ValueDecl *CountedByFD = FindCountedByField(Base)) {
+  const RecordDecl *OuterRD =
+  CountedByFD->getDeclContext()->getOuterLexicalRecordContext();
+  ASTContext &Ctx = getContext();
 
-  llvm::Value *CountField =
-  EmitAnyExprToTemp(MemberExpr::CreateImplicit(
-getContext(), const_cast<Expr *>(ME->getBase()),
-ME->isArrow(), FD, FD->getType(), VK_LValue,
-OK_Ordinary))
-  .getScalarVal();
+  // Load the counted_by field.
+  const Expr *CountedByExpr = BuildCountedByFieldExpr(Base, CountedByFD);
+  llvm::Value *CountedByInst =
+  EmitAnyExprToTemp(CountedByExpr).getScalarVal();
 
-  llvm::Value *Mul = Builder.CreateMul(
-  CountField, llvm::ConstantInt::get(CountField->getType(), Size / 8));
-  Mul = Builder.CreateZExtOrTrunc(Mul, ResType);
+  if (Idx) {
+llvm::Value *IdxInst = EmitAnyExprToTemp(Idx).getScalarVal();
+IdxInst = Builder.CreateZExtOrTrunc(IdxInst, CountedByInst->getType());
+CountedByInst = Builder.CreateSub(CountedByInst, IdxInst);
+  }
 
-  if (ObjectSize)
-return Builder.CreateAdd(ObjectSize, Mul);
+  // Get the size of the flexible array member's base type.
+  const ValueDecl *FAM = FindFlexibleArrayMemberField(Ctx, OuterRD);
+  const ArrayType *ArrayTy = Ctx.getAsArrayType(FAM->getType());
+  CharUnits Size = Ctx.getTypeSizeInChars(ArrayTy->getElementType());
+  llvm::Constant *ElemSize =
+  llvm::ConstantInt::get(CountedByInst->getType(), Size.getQuantity());
+
+  llvm::Value *FAMSize = Builder.CreateMul(CountedByInst, ElemSize);

rapidsna wrote:

Hmm, it seems like `__bdos` returns `0` when it's pointing to an OOB object. 
https://godbolt.org/z/eT4c7aPWr
At least from what I'm seeing here. WDYT?
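
For reference, the size cases listed in the patch's code comment can be written out by hand in plain C. This is only a rough sketch of the arithmetic, not what the patch emits: the helper names `clamped_count`, `fam_bytes_from`, and `whole_struct_bytes` are hypothetical, the `counted_by` attribute is left off so it builds anywhere, and returning `0` for an index at or past `count` is an assumption modeled on the `__bdos` behavior observed above.

```c
#include <assert.h>
#include <stddef.h>

/* Same layout as the struct in the patch comment; the counted_by attribute
 * is omitted so the sketch builds with any C99 compiler. */
struct s {
  unsigned long flags;
  int count;
  int array[];
};

/* Treat a negative count as zero (an assumption of this sketch). */
static size_t clamped_count(const struct s *p) {
  assert(p != NULL);
  return p->count > 0 ? (size_t)p->count : 0;
}

/* Flexible array cases: bytes left in the array starting at index idx.
 * idx == 0 corresponds to the size of p->array itself. */
static size_t fam_bytes_from(const struct s *p, size_t idx) {
  size_t count = clamped_count(p);
  if (idx >= count)
    return 0; /* out of bounds: assumed to report 0, as observed above */
  return (count - idx) * sizeof(p->array[0]);
}

/* Whole struct case:
 * max(sizeof(struct s), offsetof(struct s, array) + count * element size). */
static size_t whole_struct_bytes(const struct s *p) {
  size_t with_fam =
      offsetof(struct s, array) + clamped_count(p) * sizeof(p->array[0]);
  return with_fam > sizeof(struct s) ? with_fam : sizeof(struct s);
}
```

With `count == 100`, `fam_bytes_from(p, 42)` gives `58 * sizeof(int)`, and any index of 100 or more gives `0` rather than a wrapped-around unsigned value, which is the case the unclamped `CreateSub` in the patch would need to guard against.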


https://github.com/llvm/llvm-project/pull/70606
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[llvm] [clang] [clang-tools-extra] [CodeGen] Revamp counted_by calculations (PR #70606)

2023-11-09 Thread Yeoul Na via cfe-commits

https://github.com/rapidsna approved this pull request.

Thank you! LGTM.

https://github.com/llvm/llvm-project/pull/70606


[clang] [Driver][BoundsSafety] Add -fexperimental-bounds-safety flag (PR #70480)

2023-11-09 Thread Yeoul Na via cfe-commits

https://github.com/rapidsna updated 
https://github.com/llvm/llvm-project/pull/70480

>From 99ec6e055dd32a86bf6d589a6895658dcbe1d7bd Mon Sep 17 00:00:00 2001
From: Yeoul Na 
Date: Fri, 27 Oct 2023 08:34:37 -0700
Subject: [PATCH 1/9] [Driver][BoundsSafety] Add -fbounds-safety-experimental
 flag

-fbounds-safety-experimental is an experimental flag for
-fbounds-safety, which is a bounds-safety extension for C.
-fbounds-safety will require substantial changes across the Clang
codebase. So we introduce this experimental flag to gate our
incremental patches until we push the essential functionality of
the extension.

-fbounds-safety-experimental currently does nothing except
report an error when the flag is used with an unsupported
source language (only C is currently supported).
---
 .../clang/Basic/DiagnosticFrontendKinds.td|  3 +++
 clang/include/clang/Basic/LangOptions.def |  2 ++
 clang/include/clang/Driver/Options.td |  8 +++
 clang/lib/Driver/ToolChains/Clang.cpp |  3 +++
 clang/lib/Frontend/CompilerInvocation.cpp | 23 +++
 clang/test/BoundsSafety/Driver/driver.c   |  9 
 .../Frontend/only_c_is_supported.c| 15 
 7 files changed, 63 insertions(+)
 create mode 100644 clang/test/BoundsSafety/Driver/driver.c
 create mode 100644 clang/test/BoundsSafety/Frontend/only_c_is_supported.c

diff --git a/clang/include/clang/Basic/DiagnosticFrontendKinds.td 
b/clang/include/clang/Basic/DiagnosticFrontendKinds.td
index 715e0c0dc8fa84e..edcbbe992377e12 100644
--- a/clang/include/clang/Basic/DiagnosticFrontendKinds.td
+++ b/clang/include/clang/Basic/DiagnosticFrontendKinds.td
@@ -330,6 +330,9 @@ def warn_alias_with_section : Warning<
   "as the %select{aliasee|resolver}2">,
   InGroup;
 
+def error_bounds_safety_lang_not_supported : Error<
+  "bounds safety is only supported for C">;
+
 let CategoryName = "Instrumentation Issue" in {
 def warn_profile_data_out_of_date : Warning<
   "profile data may be out of date: of %0 function%s0, %1 
%plural{1:has|:have}1"
diff --git a/clang/include/clang/Basic/LangOptions.def 
b/clang/include/clang/Basic/LangOptions.def
index c0ea4ecb9806a5b..222812d876a65f8 100644
--- a/clang/include/clang/Basic/LangOptions.def
+++ b/clang/include/clang/Basic/LangOptions.def
@@ -470,6 +470,8 @@ VALUE_LANGOPT(FuchsiaAPILevel, 32, 0, "Fuchsia API level")
 // on large _BitInts.
 BENIGN_VALUE_LANGOPT(MaxBitIntWidth, 32, 128, "Maximum width of a _BitInt")
 
+LANGOPT(BoundsSafety, 1, 0, "Bounds safety extension for C")
+
 LANGOPT(IncrementalExtensions, 1, 0, " True if we want to process statements"
 "on the global scope, ignore EOF token and continue later on (thus "
 "avoid tearing the Lexer and etc. down). Controlled by "
diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 7f3f5125d42e7a9..3eb98c8ee2950a1 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -1732,6 +1732,14 @@ def fswift_async_fp_EQ : Joined<["-"], 
"fswift-async-fp=">,
 NormalizedValues<["Auto", "Always", "Never"]>,
 MarshallingInfoEnum, "Always">;
 
+defm bounds_safety : BoolFOption<
+  "bounds-safety-experimental",
+  LangOpts<"BoundsSafety">, DefaultFalse,
+  PosFlag,
+  NegFlag,
+  BothFlags<[], [ClangOption, CC1Option],
+  " experimental bounds safety extension for C">>;
+
 defm addrsig : BoolFOption<"addrsig",
   CodeGenOpts<"Addrsig">, DefaultFalse,
   PosFlag,
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp 
b/clang/lib/Driver/ToolChains/Clang.cpp
index 43a92adbef64ba8..7482b852fb37958 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -6689,6 +6689,9 @@ void Clang::ConstructJob(Compilation &C, const JobAction 
&JA,
   Args.addOptOutFlag(CmdArgs, options::OPT_fassume_sane_operator_new,
  options::OPT_fno_assume_sane_operator_new);
 
+  Args.addOptInFlag(CmdArgs, options::OPT_fbounds_safety,
+options::OPT_fno_bounds_safety);
+
   // -fblocks=0 is default.
   if (Args.hasFlag(options::OPT_fblocks, options::OPT_fno_blocks,
TC.IsBlocksDefault()) ||
diff --git a/clang/lib/Frontend/CompilerInvocation.cpp 
b/clang/lib/Frontend/CompilerInvocation.cpp
index fd6c250efeda2a8..f785bd504d63a81 100644
--- a/clang/lib/Frontend/CompilerInvocation.cpp
+++ b/clang/lib/Frontend/CompilerInvocation.cpp
@@ -3618,6 +3618,23 @@ void CompilerInvocationBase::GenerateLangArgs(const 
LangOptions &Opts,
 GenerateArg(Consumer, OPT_frandomize_layout_seed_EQ, Opts.RandstructSeed);
 }
 
+static bool SupportsBoundsSafety(Language Lang) {
+  // Currently, bounds safety is only supported for C. However, it's also
+  // possible to pass assembly files and LLVM IR through Clang, and
+  // those should be trivially supported. This is especially important because
+  // some build systems, like xcbuild and somewhat clumsy Makefiles, will pass
+ 

[clang] [Driver][BoundsSafety] Add -fexperimental-bounds-safety flag (PR #70480)

2023-11-09 Thread Yeoul Na via cfe-commits


@@ -3618,6 +3618,27 @@ void CompilerInvocationBase::GenerateLangArgs(const 
LangOptions &Opts,
 GenerateArg(Consumer, OPT_frandomize_layout_seed_EQ, Opts.RandstructSeed);
 }
 
+static void CheckBoundsSafetyLang(InputKind IK, DiagnosticsEngine &Diags) {

rapidsna wrote:

> I think it's better to handle this language check in Driver, similar to 
> `OPT_fminimize_whitespace` but for all `Inputs`.

@MaskRay Thanks for the review! I made it a driver error here: 
[f125532](https://github.com/llvm/llvm-project/pull/70480/commits/f125532235d5120027ec261204ae09d315d0fa14)

We also keep the `-cc1` check to prevent it from running in an 
unsupported mode (we need some checks in cc1 anyway in case we want to 
silently ignore the option instead of triggering the broken mode).

https://github.com/llvm/llvm-project/pull/70480


[clang] [Driver][BoundsSafety] Add -fexperimental-bounds-safety flag (PR #70480)

2023-11-09 Thread Yeoul Na via cfe-commits

https://github.com/rapidsna updated 
https://github.com/llvm/llvm-project/pull/70480


[clang] [Driver][BoundsSafety] Add -fexperimental-bounds-safety flag (PR #70480)

2023-11-09 Thread Yeoul Na via cfe-commits

rapidsna wrote:

> However, implementing the checking in clang/lib/Driver is much more common. 
> clang/lib/Frontend/CompilerInvocation.cpp has some checking, but the majority 
> is in clang/lib/Driver.

@MaskRay To be clear, the check is now in Driver as you suggested. It's just 
that the frontend also has some extra checks. Do you want me to remove the 
extra checks in the frontend?

https://github.com/llvm/llvm-project/pull/70480


[clang] [Driver][BoundsSafety] Add -fexperimental-bounds-safety flag (PR #70480)

2023-11-09 Thread Yeoul Na via cfe-commits

rapidsna wrote:

> Sometimes, it is actually useful to expose some experimental features that we 
> don't feel comfortable surfacing as driver options as cc1 options. The cc1 
> options give users a way to experiment with the feature. It's the users' 
> responsibility to adapt when the feature changes.

Ok. I'll fix this.

https://github.com/llvm/llvm-project/pull/70480


[clang] [BoundsSafety] Initial documentation for -fbounds-safety (PR #70749)

2023-12-11 Thread Yeoul Na via cfe-commits


@@ -0,0 +1,362 @@
+==================================================
+``-fbounds-safety``: Enforcing bounds safety for C
+==================================================
+
+.. contents::
+   :local:
+
+Overview
+========
+
+``-fbounds-safety`` is a C extension to enforce bounds safety to prevent 
out-of-bounds (OOB) memory accesses, which remain a major source of security 
vulnerabilities in C. ``-fbounds-safety`` aims to eliminate this class of bugs 
by turning OOB accesses into deterministic traps.
+
+The ``-fbounds-safety`` extension offers bounds annotations that programmers 
can use to attach bounds to pointers. For example, programmers can add the 
``__counted_by(N)`` annotation to parameter ``ptr``, indicating that the 
pointer has ``N`` valid elements:
+
+.. code-block:: c
+
+   void foo(int *__counted_by(N) ptr, size_t N);
+
+Using this bounds information, the compiler inserts bounds checks on every 
pointer dereference, ensuring that the program does not access memory outside 
the specified bounds. The compiler requires programmers to provide enough 
bounds information so that the accesses can be checked at either run time or 
compile time — and it rejects code if it cannot.
+
+The most important contribution of ``-fbounds-safety`` is how it reduces the 
programmer’s annotation burden by reconciling bounds annotations at ABI 
boundaries with the use of implicit wide pointers (a.k.a. “fat” pointers) that 
carry bounds information on local variables without the need for annotations. 
We designed this model so that it preserves ABI compatibility with C while 
minimizing adoption effort.
+
+The ``-fbounds-safety`` extension has been adopted on millions of lines of 
production C code and proven to work in a consumer operating system setting. 
The extension was designed to enable incremental adoption — a key requirement 
in real-world settings where modifying an entire project and its dependencies 
all at once is often not possible. It also addresses multiple other 
practical challenges that have made existing approaches to safer C dialects 
difficult to adopt, offering these properties that make it widely adoptable in 
practice:
+
+* It is designed to preserve the Application Binary Interface (ABI).
+* It interoperates well with plain C code.
+* It can be adopted partially and incrementally while still providing safety 
benefits.
+* It is syntactically and semantically compatible with C.
+* Consequently, source code that adopts the extension can continue to be 
compiled by toolchains that do not support the extension.
+* It has a relatively low adoption cost.
+* It can be implemented on top of Clang.
+
+This document discusses the key designs of ``-fbounds-safety``. The document 
will be actively updated with a more detailed specification. The 
implementation plan can be found in `Implementation plans for -fbounds-safety 
`_.
+
+Programming Model
+=================
+
+Overview
+--------
+
+``-fbounds-safety`` ensures that pointers are not used to access memory beyond 
their bounds by performing bounds checking. If a bounds check fails, the 
program will deterministically trap before out-of-bounds memory is accessed.
+
+In our model, every pointer has an explicit or implicit bounds attribute that 
determines its bounds and ensures guaranteed bounds checking. Consider the 
example below where the ``__counted_by(count)`` annotation indicates that 
parameter ``p`` points to a buffer of integers containing ``count`` elements. 
An off-by-one error is present in the loop condition, leading to ``p[i]`` being 
an out-of-bounds access during the loop’s final iteration. The compiler inserts a 
bounds check before ``p`` is dereferenced to ensure that the access remains 
within the specified bounds.
+
+.. code-block:: c
+
+   void fill_array_with_indices(int *__counted_by(count) p, unsigned count) {
+      // off-by-one error: the condition should be i < count
+      for (unsigned i = 0; i <= count; ++i) {
+         // bounds check inserted:
+         //   if (i >= count) trap();
+         p[i] = i;
+      }
+   }
+
+A bounds annotation defines an invariant for the pointer type, and the model 
ensures that this invariant remains true. In the example below, pointer ``p`` 
annotated with ``__counted_by(count)`` must always point to a memory buffer 
containing at least ``count`` elements of the pointee type. Increasing the 
value of ``count``, like in the example below, would violate this invariant and 
permit out-of-bounds access to the pointer. To avoid this, the compiler emits 
either a compile-time error or a run-time trap. Section `Maintaining 
correctness of bounds annotations`_ provides more details about the programming 
model.
+
+.. code-block:: c
+
+   void foo(int *__counted_by(count) p, size_t count) {
+      count++; // violates the invariant of __counted_by
+   }
+
+The requirement to annotate all pointers with explicit bounds information 
could present a significant adoption burden. To tackle this


[clang] [BoundsSafety] Initial documentation for -fbounds-safety (PR #70749)

2023-12-11 Thread Yeoul Na via cfe-commits

https://github.com/rapidsna edited 
https://github.com/llvm/llvm-project/pull/70749


[clang] [BoundsSafety] Initial documentation for -fbounds-safety (PR #70749)

2023-12-11 Thread Yeoul Na via cfe-commits


@@ -0,0 +1,362 @@
+==================================================
+``-fbounds-safety``: Enforcing bounds safety for C
+==================================================
+
+.. contents::
+   :local:
+
+Overview
+========
+
+``-fbounds-safety`` is a C extension to enforce bounds safety to prevent 
out-of-bounds (OOB) memory accesses, which remain a major source of security 
vulnerabilities in C. ``-fbounds-safety`` aims to eliminate this class of bugs 
by turning OOB accesses into deterministic traps.
+
+The ``-fbounds-safety`` extension offers bounds annotations that programmers 
can use to attach bounds to pointers. For example, programmers can add the 
``__counted_by(N)`` annotation to parameter ``ptr``, indicating that the 
pointer has ``N`` valid elements:
+
+.. code-block:: c
+
+   void foo(int *__counted_by(N) ptr, size_t N);
+
+Using this bounds information, the compiler inserts bounds checks on every 
pointer dereference, ensuring that the program does not access memory outside 
the specified bounds. The compiler requires programmers to provide enough 
bounds information so that the accesses can be checked at either run time or 
compile time — and it rejects code if it cannot.
+
+The most important contribution of ``-fbounds-safety`` is how it reduces the 
programmer’s annotation burden by reconciling bounds annotations at ABI 
boundaries with the use of implicit wide pointers (a.k.a. “fat” pointers) that 
carry bounds information on local variables without the need for annotations. 
We designed this model so that it preserves ABI compatibility with C while 
minimizing adoption effort.
+
+The ``-fbounds-safety`` extension has been adopted on millions of lines of 
production C code and proven to work in a consumer operating system setting. 
The extension was designed to enable incremental adoption — a key requirement 
in real-world settings where modifying an entire project and its dependencies 
all at once is often not possible. It also addresses multiple other 
practical challenges that have made existing approaches to safer C dialects 
difficult to adopt, offering these properties that make it widely adoptable in 
practice:
+
+* It is designed to preserve the Application Binary Interface (ABI).
+* It interoperates well with plain C code.
+* It can be adopted partially and incrementally while still providing safety 
benefits.
+* It is syntactically and semantically compatible with C.
+* Consequently, source code that adopts the extension can continue to be 
compiled by toolchains that do not support the extension.
+* It has a relatively low adoption cost.
+* It can be implemented on top of Clang.
+
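The claim that annotated source can still be compiled by toolchains without the extension can be sketched with a macro that expands to nothing. The macro name ``EXAMPLE_COUNTED_BY`` and the function ``sum_ints`` are hypothetical and exist only for this illustration. It is an assumption of the sketch that a supporting toolchain would map the macro to ``__counted_by(N)``.

```c
#include <stddef.h>

/* Hypothetical project-level macro. Assumption: on a toolchain that supports
   the extension, the build would define this as __counted_by(N); here it
   expands to nothing, so the declaration below is ordinary C. */
#ifndef EXAMPLE_COUNTED_BY
#define EXAMPLE_COUNTED_BY(N)
#endif

/* The annotation documents that `values` has `count` valid elements. */
long sum_ints(const int *EXAMPLE_COUNTED_BY(count) values, size_t count) {
    long total = 0;
    for (size_t i = 0; i < count; ++i)
        total += values[i];
    return total;
}
```

With the empty definition the function has the same signature and behavior as unannotated C, so the annotation adds no checks on such a toolchain.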
+This document discusses the key design decisions of ``-fbounds-safety``. It 
will be actively updated with a more detailed specification. The 
implementation plan can be found in `Implementation plans for -fbounds-safety 
`_.
+
+Programming Model
+=================
+
+Overview
+--------
+
+``-fbounds-safety`` ensures that pointers are not used to access memory beyond 
their bounds by performing bounds checking. If a bounds check fails, the 
program will deterministically trap before out-of-bounds memory is accessed.
+
+In our model, every pointer has an explicit or implicit bounds attribute that 
determines its bounds and ensures guaranteed bounds checking. Consider the 
example below where the ``__counted_by(count)`` annotation indicates that 
parameter ``p`` points to a buffer of integers containing ``count`` elements. 
An off-by-one error in the loop condition makes ``p[i]`` an 
out-of-bounds access during the loop’s final iteration. The compiler inserts a 
bounds check before ``p`` is dereferenced to ensure that the access remains 
within the specified bounds.
+
+.. code-block:: c
+
+   void fill_array_with_indices(int *__counted_by(count) p, unsigned count) {
+     // off-by-one error: the condition should be (i < count)
+     for (unsigned i = 0; i <= count; ++i) {
+       // bounds check inserted:
+       //   if (i >= count) trap();
+       p[i] = i;
+     }
+   }
+
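To make the effect of the inserted check concrete, here is a minimal sketch in plain C that spells the check out by hand for the same buggy loop. The helpers ``checked_store`` and ``fill_array_with_indices_checked`` are hypothetical. Unlike the real extension, which traps, this model returns the index at which the check fired so the behavior can be observed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper modelling the inserted check `if (i >= count) trap();`.
   It reports the violation instead of trapping. */
bool checked_store(int *p, size_t count, size_t i, int value) {
    if (i >= count)
        return false; /* the real extension would trap here */
    p[i] = value;
    return true;
}

/* Same off-by-one loop as above (i <= count). Returns the index at which
   the check fired, or count + 1 if every store was in bounds. */
size_t fill_array_with_indices_checked(int *p, size_t count) {
    for (size_t i = 0; i <= count; ++i) {
        if (!checked_store(p, count, i, (int)i))
            return i;
    }
    return count + 1;
}
```

For a buffer of four elements the first four stores succeed and the check fires at index 4, before any out-of-bounds memory is written.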
+A bounds annotation defines an invariant for the pointer type, and the model 
ensures that this invariant remains true. In the example below, pointer ``p`` 
annotated with ``__counted_by(count)`` must always point to a memory buffer 
containing at least ``count`` elements of the pointee type. Increasing the 
value of ``count``, as in the example below, would violate this invariant and 
permit out-of-bounds access through the pointer. To avoid this, the compiler emits 
either a compile-time error or a run-time trap. Section `Maintaining 
correctness of bounds annotations`_ provides more details about the programming 
model.
+
+.. code-block:: c
+
+   void foo(int *__counted_by(count) p, size_t count) {
+     count++; // violates the invariant of __counted_by
+   }
+
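The invariant above can be modelled in plain C as a pointer/count pair whose count may only be updated through a checked setter. The type ``counted_ints`` and the function ``set_count`` are hypothetical names for this sketch. The real extension enforces the rule with a compile-time error or a run-time trap rather than a return value.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a pointer annotated with __counted_by(count):
   `ptr` must always point to at least `count` valid elements. */
struct counted_ints {
    int *ptr;
    size_t count;
};

/* Update the count while preserving the invariant. Shrinking is always
   safe; growing is rejected because nothing proves the buffer is larger
   than the current count. */
bool set_count(struct counted_ints *buf, size_t new_count) {
    if (new_count > buf->count)
        return false; /* the real extension would diagnose or trap */
    buf->count = new_count;
    return true;
}
```

In this model the only bound known for ``ptr`` is the current ``count``, so once the count has been reduced it cannot be raised again without also providing a pointer whose bounds are known.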
+The requirement to annotate all pointers with explicit bounds information 
could present a significant adoption burden. To tackle this

[clang] [BoundsSafety] Initial documentation for -fbounds-safety (PR #70749)

2023-12-11 Thread Yeoul Na via cfe-commits

https://github.com/rapidsna commented:

Thanks @AaronBallman for your feedback! I left inlined comments to answer these 
and updated documents.

https://github.com/llvm/llvm-project/pull/70749
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [BoundsSafety] Initial documentation for -fbounds-safety (PR #70749)

2023-12-11 Thread Yeoul Na via cfe-commits


@@ -0,0 +1,362 @@
+==
+``-fbounds-safety``: Enforcing bounds safety for C
+==
+
+.. contents::
+   :local:
+
+Overview
+
+
+``-fbounds-safety`` is a C extension to enforce bounds safety to prevent 
out-of-bounds (OOB) memory accesses, which remain a major source of security 
vulnerabilities in C. ``-fbounds-safety`` aims to eliminate this class of bugs 
by turning OOB accesses into deterministic traps.
+
+The ``-fbounds-safety`` extension offers bounds annotations that programmers 
can use to attach bounds to pointers. For example, programmers can add the 
``__counted_by(N)`` annotation to parameter ``ptr``, indicating that the 
pointer has ``N`` valid elements:
+
+.. code-block:: c
+
+   void foo(int *__counted_by(N) ptr, size_t N);
+
+Using this bounds information, the compiler inserts bounds checks on every 
pointer dereference, ensuring that the program does not access memory outside 
the specified bounds. The compiler requires programmers to provide enough 
bounds information so that the accesses can be checked at either run time or 
compile time — and it rejects code if it cannot.
+
+The most important contribution of ``-fbounds-safety`` is how it reduces the 
programmer’s annotation burden by reconciling bounds annotations at ABI 
boundaries with the use of implicit wide pointers (a.k.a. “fat” pointers) that 
carry bounds information on local variables without the need for annotations. 
We designed this model so that it preserves ABI compatibility with C while 
minimizing adoption effort.
+
+The ``-fbounds-safety`` extension has been adopted on millions of lines of 
production C code and proven to work in a consumer operating system setting. 
The extension was designed to enable incremental adoption — a key requirement 
in real-world settings where modifying an entire project and its dependencies 
all at once is often not possible. It also addresses multiple of other 
practical challenges that have made existing approaches to safer C dialects 
difficult to adopt, offering these properties that make it widely adoptable in 
practice:
+
+* It is designed to preserve the Application Binary Interface (ABI).
+* It interoperates well with plain C code.
+* It can be adopted partially and incrementally while still providing safety 
benefits.
+* It is syntactically and semantically compatible with C.
+* Consequently, source code that adopts the extension can continue to be 
compiled by toolchains that do not support the extension.
+* It has a relatively low adoption cost.
+* It can be implemented on top of Clang.
+
+This document discusses the key designs of ``-fbounds-safety``. The document 
is subject to be actively updated with a more detailed specification. The 
implementation plan can be found in `Implementation plans for -fbounds-safety 
`_.
+
+Programming Model
+=
+
+Overview
+
+
+``-fbounds-safety`` ensures that pointers are not used to access memory beyond 
their bounds by performing bounds checking. If a bounds check fails, the 
program will deterministically trap before out-of-bounds memory is accessed.
+
+In our model, every pointer has an explicit or implicit bounds attribute that 
determines its bounds and ensures guaranteed bounds checking. Consider the 
example below where the ``__counted_by(count)`` annotation indicates that 
parameter ``p`` points to a buffer of integers containing ``count`` elements. 
An off-by-one error is present in the loop condition, leading to ``p[i]`` being 
out-of-bounds access during the loop’s final iteration. The compiler inserts a 
bounds check before ``p`` is dereferenced to ensure that the access remains 
within the specified bounds.
+
+.. code-block:: c
+
+   void fill_array_with_indices(int *__counted_by(count) p, unsigned count) {
+   // off-by-one error (i < count)
+  for (unsigned i = 0; i <= count; ++i) {
+ // bounds check inserted:
+ //   if (i >= count) trap();
+ p[i] = i;
+  }
+   }
+
+A bounds annotation defines an invariant for the pointer type, and the model 
ensures that this invariant remains true. In the example below, pointer ``p`` 
annotated with ``__counted_by(count)`` must always point to a memory buffer 
containing at least ``count`` elements of the pointee type. Increasing the 
value of ``count``, like in the example below, would violate this invariant and 
permit out-of-bounds access to the pointer. To avoid this, the compiler emits 
either a compile-time error or a run-time trap. Section `Maintaining 
correctness of bounds annotations`_ provides more details about the programming 
model.
+
+.. code-block:: c
+
+   void foo(int *__counted_by(count) p, size_t count) {
+  count++; // violates the invariant of __counted_by
+   }
+
+The requirement to annotate all pointers with explicit bounds information 
could present a significant adoption burden. To tackle this

[clang] [BoundsSafety] Initial documentation for -fbounds-safety (PR #70749)

2023-12-11 Thread Yeoul Na via cfe-commits


@@ -0,0 +1,362 @@
+==
+``-fbounds-safety``: Enforcing bounds safety for C
+==
+
+.. contents::
+   :local:
+
+Overview
+
+
+``-fbounds-safety`` is a C extension to enforce bounds safety to prevent 
out-of-bounds (OOB) memory accesses, which remain a major source of security 
vulnerabilities in C. ``-fbounds-safety`` aims to eliminate this class of bugs 
by turning OOB accesses into deterministic traps.
+
+The ``-fbounds-safety`` extension offers bounds annotations that programmers 
can use to attach bounds to pointers. For example, programmers can add the 
``__counted_by(N)`` annotation to parameter ``ptr``, indicating that the 
pointer has ``N`` valid elements:
+
+.. code-block:: c
+
+   void foo(int *__counted_by(N) ptr, size_t N);
+
+Using this bounds information, the compiler inserts bounds checks on every 
pointer dereference, ensuring that the program does not access memory outside 
the specified bounds. The compiler requires programmers to provide enough 
bounds information so that the accesses can be checked at either run time or 
compile time — and it rejects code if it cannot.
+
+The most important contribution of ``-fbounds-safety`` is how it reduces the 
programmer’s annotation burden by reconciling bounds annotations at ABI 
boundaries with the use of implicit wide pointers (a.k.a. “fat” pointers) that 
carry bounds information on local variables without the need for annotations. 
We designed this model so that it preserves ABI compatibility with C while 
minimizing adoption effort.
+
+The ``-fbounds-safety`` extension has been adopted on millions of lines of 
production C code and proven to work in a consumer operating system setting. 
The extension was designed to enable incremental adoption — a key requirement 
in real-world settings where modifying an entire project and its dependencies 
all at once is often not possible. It also addresses multiple of other 
practical challenges that have made existing approaches to safer C dialects 
difficult to adopt, offering these properties that make it widely adoptable in 
practice:
+
+* It is designed to preserve the Application Binary Interface (ABI).
+* It interoperates well with plain C code.
+* It can be adopted partially and incrementally while still providing safety 
benefits.
+* It is syntactically and semantically compatible with C.
+* Consequently, source code that adopts the extension can continue to be 
compiled by toolchains that do not support the extension.
+* It has a relatively low adoption cost.
+* It can be implemented on top of Clang.
+
+This document discusses the key designs of ``-fbounds-safety``. The document 
is subject to be actively updated with a more detailed specification. The 
implementation plan can be found in `Implementation plans for -fbounds-safety 
`_.
+
+Programming Model
+=
+
+Overview
+
+
+``-fbounds-safety`` ensures that pointers are not used to access memory beyond 
their bounds by performing bounds checking. If a bounds check fails, the 
program will deterministically trap before out-of-bounds memory is accessed.
+
+In our model, every pointer has an explicit or implicit bounds attribute that 
determines its bounds and ensures guaranteed bounds checking. Consider the 
example below where the ``__counted_by(count)`` annotation indicates that 
parameter ``p`` points to a buffer of integers containing ``count`` elements. 
An off-by-one error is present in the loop condition, leading to ``p[i]`` being 
out-of-bounds access during the loop’s final iteration. The compiler inserts a 
bounds check before ``p`` is dereferenced to ensure that the access remains 
within the specified bounds.
+
+.. code-block:: c
+
+   void fill_array_with_indices(int *__counted_by(count) p, unsigned count) {
+   // off-by-one error (i < count)
+  for (unsigned i = 0; i <= count; ++i) {
+ // bounds check inserted:
+ //   if (i >= count) trap();
+ p[i] = i;
+  }
+   }
+
+A bounds annotation defines an invariant for the pointer type, and the model 
ensures that this invariant remains true. In the example below, pointer ``p`` 
annotated with ``__counted_by(count)`` must always point to a memory buffer 
containing at least ``count`` elements of the pointee type. Increasing the 
value of ``count``, like in the example below, would violate this invariant and 
permit out-of-bounds access to the pointer. To avoid this, the compiler emits 
either a compile-time error or a run-time trap. Section `Maintaining 
correctness of bounds annotations`_ provides more details about the programming 
model.
+
+.. code-block:: c
+
+   void foo(int *__counted_by(count) p, size_t count) {
+  count++; // violates the invariant of __counted_by
+   }
+
+The requirement to annotate all pointers with explicit bounds information 
could present a significant adoption burden. To tackle this

[clang] [BoundsSafety] Initial documentation for -fbounds-safety (PR #70749)

2023-12-11 Thread Yeoul Na via cfe-commits


@@ -0,0 +1,362 @@
+==
+``-fbounds-safety``: Enforcing bounds safety for C
+==
+
+.. contents::
+   :local:
+
+Overview
+
+
+``-fbounds-safety`` is a C extension to enforce bounds safety to prevent 
out-of-bounds (OOB) memory accesses, which remain a major source of security 
vulnerabilities in C. ``-fbounds-safety`` aims to eliminate this class of bugs 
by turning OOB accesses into deterministic traps.
+
+The ``-fbounds-safety`` extension offers bounds annotations that programmers 
can use to attach bounds to pointers. For example, programmers can add the 
``__counted_by(N)`` annotation to parameter ``ptr``, indicating that the 
pointer has ``N`` valid elements:
+
+.. code-block:: c
+
+   void foo(int *__counted_by(N) ptr, size_t N);
+
+Using this bounds information, the compiler inserts bounds checks on every 
pointer dereference, ensuring that the program does not access memory outside 
the specified bounds. The compiler requires programmers to provide enough 
bounds information so that the accesses can be checked at either run time or 
compile time — and it rejects code if it cannot.
+
+The most important contribution of ``-fbounds-safety`` is how it reduces the 
programmer’s annotation burden by reconciling bounds annotations at ABI 
boundaries with the use of implicit wide pointers (a.k.a. “fat” pointers) that 
carry bounds information on local variables without the need for annotations. 
We designed this model so that it preserves ABI compatibility with C while 
minimizing adoption effort.
+
+The ``-fbounds-safety`` extension has been adopted on millions of lines of 
production C code and proven to work in a consumer operating system setting. 
The extension was designed to enable incremental adoption — a key requirement 
in real-world settings where modifying an entire project and its dependencies 
all at once is often not possible. It also addresses multiple of other 
practical challenges that have made existing approaches to safer C dialects 
difficult to adopt, offering these properties that make it widely adoptable in 
practice:
+
+* It is designed to preserve the Application Binary Interface (ABI).
+* It interoperates well with plain C code.
+* It can be adopted partially and incrementally while still providing safety 
benefits.
+* It is syntactically and semantically compatible with C.
+* Consequently, source code that adopts the extension can continue to be 
compiled by toolchains that do not support the extension.
+* It has a relatively low adoption cost.
+* It can be implemented on top of Clang.
+
+This document discusses the key designs of ``-fbounds-safety``. The document 
is subject to be actively updated with a more detailed specification. The 
implementation plan can be found in `Implementation plans for -fbounds-safety 
`_.
+
+Programming Model
+=
+
+Overview
+
+
+``-fbounds-safety`` ensures that pointers are not used to access memory beyond 
their bounds by performing bounds checking. If a bounds check fails, the 
program will deterministically trap before out-of-bounds memory is accessed.
+
+In our model, every pointer has an explicit or implicit bounds attribute that 
determines its bounds and ensures guaranteed bounds checking. Consider the 
example below where the ``__counted_by(count)`` annotation indicates that 
parameter ``p`` points to a buffer of integers containing ``count`` elements. 
An off-by-one error is present in the loop condition, leading to ``p[i]`` being 
out-of-bounds access during the loop’s final iteration. The compiler inserts a 
bounds check before ``p`` is dereferenced to ensure that the access remains 
within the specified bounds.
+
+.. code-block:: c
+
+   void fill_array_with_indices(int *__counted_by(count) p, unsigned count) {
+   // off-by-one error (i < count)
+  for (unsigned i = 0; i <= count; ++i) {
+ // bounds check inserted:
+ //   if (i >= count) trap();
+ p[i] = i;
+  }
+   }
+
+A bounds annotation defines an invariant for the pointer type, and the model 
ensures that this invariant remains true. In the example below, pointer ``p`` 
annotated with ``__counted_by(count)`` must always point to a memory buffer 
containing at least ``count`` elements of the pointee type. Increasing the 
value of ``count``, like in the example below, would violate this invariant and 
permit out-of-bounds access to the pointer. To avoid this, the compiler emits 
either a compile-time error or a run-time trap. Section `Maintaining 
correctness of bounds annotations`_ provides more details about the programming 
model.
+
+.. code-block:: c
+
+   void foo(int *__counted_by(count) p, size_t count) {
+  count++; // violates the invariant of __counted_by
+   }
+
+The requirement to annotate all pointers with explicit bounds information 
could present a significant adoption burden. To tackle this

[clang] [BoundsSafety] Initial documentation for -fbounds-safety (PR #70749)

2023-12-11 Thread Yeoul Na via cfe-commits


@@ -0,0 +1,362 @@
+==
+``-fbounds-safety``: Enforcing bounds safety for C
+==
+
+.. contents::
+   :local:
+
+Overview
+
+
+``-fbounds-safety`` is a C extension to enforce bounds safety to prevent 
out-of-bounds (OOB) memory accesses, which remain a major source of security 
vulnerabilities in C. ``-fbounds-safety`` aims to eliminate this class of bugs 
by turning OOB accesses into deterministic traps.
+
+The ``-fbounds-safety`` extension offers bounds annotations that programmers 
can use to attach bounds to pointers. For example, programmers can add the 
``__counted_by(N)`` annotation to parameter ``ptr``, indicating that the 
pointer has ``N`` valid elements:
+
+.. code-block:: c
+
+   void foo(int *__counted_by(N) ptr, size_t N);
+
+Using this bounds information, the compiler inserts bounds checks on every 
pointer dereference, ensuring that the program does not access memory outside 
the specified bounds. The compiler requires programmers to provide enough 
bounds information so that the accesses can be checked at either run time or 
compile time — and it rejects code if it cannot.
+
+The most important contribution of ``-fbounds-safety`` is how it reduces the 
programmer’s annotation burden by reconciling bounds annotations at ABI 
boundaries with the use of implicit wide pointers (a.k.a. “fat” pointers) that 
carry bounds information on local variables without the need for annotations. 
We designed this model so that it preserves ABI compatibility with C while 
minimizing adoption effort.
+
+The ``-fbounds-safety`` extension has been adopted on millions of lines of 
production C code and proven to work in a consumer operating system setting. 
The extension was designed to enable incremental adoption — a key requirement 
in real-world settings where modifying an entire project and its dependencies 
all at once is often not possible. It also addresses multiple of other 
practical challenges that have made existing approaches to safer C dialects 
difficult to adopt, offering these properties that make it widely adoptable in 
practice:
+
+* It is designed to preserve the Application Binary Interface (ABI).
+* It interoperates well with plain C code.
+* It can be adopted partially and incrementally while still providing safety 
benefits.
+* It is syntactically and semantically compatible with C.
+* Consequently, source code that adopts the extension can continue to be 
compiled by toolchains that do not support the extension.
+* It has a relatively low adoption cost.
+* It can be implemented on top of Clang.
+
+This document discusses the key designs of ``-fbounds-safety``. The document 
is subject to be actively updated with a more detailed specification. The 
implementation plan can be found in `Implementation plans for -fbounds-safety 
`_.
+
+Programming Model
+=
+
+Overview
+
+
+``-fbounds-safety`` ensures that pointers are not used to access memory beyond 
their bounds by performing bounds checking. If a bounds check fails, the 
program will deterministically trap before out-of-bounds memory is accessed.
+
+In our model, every pointer has an explicit or implicit bounds attribute that 
determines its bounds and ensures guaranteed bounds checking. Consider the 
example below where the ``__counted_by(count)`` annotation indicates that 
parameter ``p`` points to a buffer of integers containing ``count`` elements. 
An off-by-one error is present in the loop condition, leading to ``p[i]`` being 
out-of-bounds access during the loop’s final iteration. The compiler inserts a 
bounds check before ``p`` is dereferenced to ensure that the access remains 
within the specified bounds.
+
+.. code-block:: c
+
+   void fill_array_with_indices(int *__counted_by(count) p, unsigned count) {
+   // off-by-one error (i < count)
+  for (unsigned i = 0; i <= count; ++i) {
+ // bounds check inserted:
+ //   if (i >= count) trap();
+ p[i] = i;
+  }
+   }
+
+A bounds annotation defines an invariant for the pointer type, and the model 
ensures that this invariant remains true. In the example below, pointer ``p`` 
annotated with ``__counted_by(count)`` must always point to a memory buffer 
containing at least ``count`` elements of the pointee type. Increasing the 
value of ``count``, as the function body does, would violate this invariant and 
permit out-of-bounds access through the pointer. To avoid this, the compiler emits 
either a compile-time error or a run-time trap. Section `Maintaining 
correctness of bounds annotations`_ provides more details about the programming 
model.
+
+.. code-block:: c
+
+   void foo(int *__counted_by(count) p, size_t count) {
+      count++; // violates the invariant of __counted_by
+   }
+
+The requirement to annotate all pointers with explicit bounds information 
could present a significant adoption burden. To tackle this

[clang] [BoundsSafety] Initial documentation for -fbounds-safety (PR #70749)

2023-12-11 Thread Yeoul Na via cfe-commits


@@ -0,0 +1,362 @@
+==================================================
+``-fbounds-safety``: Enforcing bounds safety for C
+==================================================
+
+.. contents::
+   :local:
+
+Overview
+========
+
+``-fbounds-safety`` is a C extension to enforce bounds safety to prevent 
out-of-bounds (OOB) memory accesses, which remain a major source of security 
vulnerabilities in C. ``-fbounds-safety`` aims to eliminate this class of bugs 
by turning OOB accesses into deterministic traps.
+
+The ``-fbounds-safety`` extension offers bounds annotations that programmers 
can use to attach bounds to pointers. For example, programmers can add the 
``__counted_by(N)`` annotation to parameter ``ptr``, indicating that the 
pointer has ``N`` valid elements:
+
+.. code-block:: c
+
+   void foo(int *__counted_by(N) ptr, size_t N);
+
+Using this bounds information, the compiler inserts bounds checks on every 
pointer dereference, ensuring that the program does not access memory outside 
the specified bounds. The compiler requires programmers to provide enough 
bounds information so that the accesses can be checked at either run time or 
compile time — and it rejects code if it cannot.
+
+The most important contribution of ``-fbounds-safety`` is how it reduces the 
programmer’s annotation burden by reconciling bounds annotations at ABI 
boundaries with the use of implicit wide pointers (a.k.a. “fat” pointers) that 
carry bounds information on local variables without the need for annotations. 
We designed this model so that it preserves ABI compatibility with C while 
minimizing adoption effort.
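
To make the idea of a wide pointer concrete, here is a minimal plain-C sketch. It is only an illustration, not the compiler's representation: the struct ``demo_wide_ptr``, its field layout, and both helper functions are hypothetical names introduced for this example. The sketch assumes a wide pointer carries the current pointer plus a lower and an upper bound, and that every load is checked against them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a wide ("fat") pointer: the pointer itself plus
 * the bounds of the object it points into. The layout is an assumption. */
struct demo_wide_ptr {
    int *ptr;    /* current position */
    int *lower;  /* first valid element */
    int *upper;  /* one past the last valid element */
};

/* Build a wide pointer covering `count` elements starting at `base`. */
static struct demo_wide_ptr demo_wide_from_array(int *base, size_t count) {
    assert(base != NULL);
    struct demo_wide_ptr w = { base, base, base + count };
    return w;
}

/* Checked load of w.ptr[index]. Returns false where the real extension
 * would trap; the index is validated before any address is formed. */
static bool demo_wide_load(struct demo_wide_ptr w, ptrdiff_t index, int *out) {
    ptrdiff_t size = w.upper - w.lower;
    ptrdiff_t pos = (w.ptr - w.lower) + index;
    if (pos < 0 || pos >= size) {
        return false;
    }
    *out = w.lower[pos];
    return true;
}
```

Because the bounds travel with the pointer value, a local variable modeled this way needs no annotation of its own, which is the property the paragraph above describes.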
+
+The ``-fbounds-safety`` extension has been adopted on millions of lines of 
production C code and proven to work in a consumer operating system setting. 
The extension was designed to enable incremental adoption — a key requirement 
in real-world settings where modifying an entire project and its dependencies 
all at once is often not possible. It also addresses multiple other 
practical challenges that have made existing approaches to safer C dialects 
difficult to adopt, offering these properties that make it widely adoptable in 
practice:
+
+* It is designed to preserve the Application Binary Interface (ABI).
+* It interoperates well with plain C code.
+* It can be adopted partially and incrementally while still providing safety 
benefits.
+* It is syntactically and semantically compatible with C.
+* Consequently, source code that adopts the extension can continue to be 
compiled by toolchains that do not support the extension.
+* It has a relatively low adoption cost.
+* It can be implemented on top of Clang.
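
One way the source-compatibility property in the list above might be used is sketched below. The macro ``DEMO_HAVE_BOUNDS_SAFETY`` is a hypothetical build-configuration switch invented for this example, not a macro that Clang predefines, and ``DEMO_COUNTED_BY`` and ``demo_sum`` are likewise illustrative names. The sketch assumes the annotation can simply expand to nothing on a toolchain without the extension.

```c
#include <assert.h>
#include <stddef.h>

/* DEMO_HAVE_BOUNDS_SAFETY is a hypothetical switch set by the build
 * system; it is an assumption of this sketch, not a Clang builtin. */
#if defined(DEMO_HAVE_BOUNDS_SAFETY)
#define DEMO_COUNTED_BY(N) __counted_by(N)
#else
#define DEMO_COUNTED_BY(N) /* expands to nothing on other toolchains */
#endif

/* The same declaration compiles as plain C when the macro is empty. */
static long demo_sum(const int *DEMO_COUNTED_BY(n) ptr, size_t n) {
    assert(ptr != NULL || n == 0);
    long total = 0;
    for (size_t i = 0; i < n; ++i) {
        total += ptr[i];
    }
    return total;
}
```

When the macro expands to nothing the function has its ordinary C meaning, so callers built with another compiler see an unchanged signature.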
+
+This document discusses the key design of ``-fbounds-safety``. The document 
will be actively updated with a more detailed specification. The 
implementation plan can be found in `Implementation plans for -fbounds-safety 
`_.
+
+Programming Model
+=================
+
+Overview
+--------
+
+``-fbounds-safety`` ensures that pointers are not used to access memory beyond 
their bounds by performing bounds checking. If a bounds check fails, the 
program will deterministically trap before out-of-bounds memory is accessed.
+
+In our model, every pointer has an explicit or implicit bounds attribute that 
determines its bounds and guarantees bounds checking. Consider the 
example below, where the ``__counted_by(count)`` annotation indicates that 
parameter ``p`` points to a buffer of integers containing ``count`` elements. 
An off-by-one error in the loop condition makes ``p[i]`` an 
out-of-bounds access during the loop’s final iteration. The compiler inserts a 
bounds check before ``p`` is dereferenced to ensure that the access remains 
within the specified bounds.
+
+.. code-block:: c
+
+   void fill_array_with_indices(int *__counted_by(count) p, unsigned count) {
+      // off-by-one error: the condition should be i < count
+      for (unsigned i = 0; i <= count; ++i) {
+         // bounds check inserted:
+         //   if (i >= count) trap();
+         p[i] = i;
+      }
+   }
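
The inserted check can be modeled in plain C without the extension. The following is a sketch under stated assumptions: ``demo_in_bounds`` and ``demo_fill_array_with_indices`` are hypothetical names, and where the real extension would trap, this model stops the loop and reports how many elements were written so the behavior can be observed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true when index i lies inside a buffer of
 * `count` elements. This mirrors the `i >= count` check shown above. */
static bool demo_in_bounds(size_t i, size_t count) {
    return i < count;
}

/* Plain-C model of the loop above. Returns the number of elements
 * written before the first out-of-bounds index is reached. */
static size_t demo_fill_array_with_indices(int *p, size_t count) {
    assert(p != NULL || count == 0);
    size_t written = 0;
    for (size_t i = 0; i <= count; ++i) {  /* same off-by-one bug */
        if (!demo_in_bounds(i, count)) {
            break;  /* the real extension would trap here instead */
        }
        p[i] = (int)i;
        ++written;
    }
    return written;
}
```

The check runs before the store, so the element just past the buffer is never touched even though the loop condition is wrong.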
+
+A bounds annotation defines an invariant for the pointer type, and the model 
ensures that this invariant remains true. In the example below, pointer ``p`` 
annotated with ``__counted_by(count)`` must always point to a memory buffer 
containing at least ``count`` elements of the pointee type. Increasing the 
value of ``count``, as the function body does, would violate this invariant and 
permit out-of-bounds access through the pointer. To avoid this, the compiler emits 
either a compile-time error or a run-time trap. Section `Maintaining 
correctness of bounds annotations`_ provides more details about the programming 
model.
+
+.. code-block:: c
+
+   void foo(int *__counted_by(count) p, size_t count) {
+      count++; // violates the invariant of __counted_by
+   }
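
The invariant can also be pictured with a small plain-C model. This is an illustrative sketch only: ``demo_counted_buf``, its ``capacity`` field, and ``demo_set_count`` are hypothetical, and the model tracks the true buffer size explicitly, which ordinary C code does not do. It assumes an update to the count is accepted only when the buffer still holds at least that many elements.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a pointer annotated with __counted_by(count).
 * `capacity` records how many elements the buffer really holds. */
struct demo_counted_buf {
    int *ptr;
    size_t count;
    size_t capacity;
};

/* Update the count only if the invariant "ptr points to at least
 * `count` elements" still holds. Returns false where the real
 * extension would report an error or trap. */
static bool demo_set_count(struct demo_counted_buf *b, size_t new_count) {
    assert(b != NULL);
    if (new_count > b->capacity) {
        return false;  /* would break the __counted_by invariant */
    }
    b->count = new_count;
    return true;
}
```

Shrinking the count keeps the invariant intact, while growing it past the real buffer size is rejected, which matches the `count++` case above.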
+
+The requirement to annotate all pointers with explicit bounds information 
could present a significant adoption burden. To tackle this

[clang] [BoundsSafety] Initial documentation for -fbounds-safety (PR #70749)

2023-12-11 Thread Yeoul Na via cfe-commits


@@ -0,0 +1,362 @@
+==================================================
+``-fbounds-safety``: Enforcing bounds safety for C
+==================================================
+
+.. contents::
+   :local:
+
+Overview
+========
+
+``-fbounds-safety`` is a C extension to enforce bounds safety to prevent 
out-of-bounds (OOB) memory accesses, which remain a major source of security 
vulnerabilities in C. ``-fbounds-safety`` aims to eliminate this class of bugs 
by turning OOB accesses into deterministic traps.
+
+The ``-fbounds-safety`` extension offers bounds annotations that programmers 
can use to attach bounds to pointers. For example, programmers can add the 
``__counted_by(N)`` annotation to parameter ``ptr``, indicating that the 
pointer has ``N`` valid elements:
+
+.. code-block:: c
+
+   void foo(int *__counted_by(N) ptr, size_t N);
+
+Using this bounds information, the compiler inserts bounds checks on every 
pointer dereference, ensuring that the program does not access memory outside 
the specified bounds. The compiler requires programmers to provide enough 
bounds information so that the accesses can be checked at either run time or 
compile time — and it rejects code if it cannot.
+
+The most important contribution of ``-fbounds-safety`` is how it reduces the 
programmer’s annotation burden by reconciling bounds annotations at ABI 
boundaries with the use of implicit wide pointers (a.k.a. “fat” pointers) that 
carry bounds information on local variables without the need for annotations. 
We designed this model so that it preserves ABI compatibility with C while 
minimizing adoption effort.
+
+The ``-fbounds-safety`` extension has been adopted on millions of lines of 
production C code and proven to work in a consumer operating system setting. 
The extension was designed to enable incremental adoption — a key requirement 
in real-world settings where modifying an entire project and its dependencies 
all at once is often not possible. It also addresses multiple other 
practical challenges that have made existing approaches to safer C dialects 
difficult to adopt, offering these properties that make it widely adoptable in 
practice:
+
+* It is designed to preserve the Application Binary Interface (ABI).
+* It interoperates well with plain C code.
+* It can be adopted partially and incrementally while still providing safety 
benefits.
+* It is syntactically and semantically compatible with C.
+* Consequently, source code that adopts the extension can continue to be 
compiled by toolchains that do not support the extension.
+* It has a relatively low adoption cost.
+* It can be implemented on top of Clang.
+
+This document discusses the key designs of ``-fbounds-safety``. The document 
will be actively updated with a more detailed specification. The 
implementation plan can be found in `Implementation plans for -fbounds-safety 
`_.
+
+Programming Model
+=================
+
+Overview
+--------
+
+``-fbounds-safety`` ensures that pointers are not used to access memory beyond 
their bounds by performing bounds checking. If a bounds check fails, the 
program will deterministically trap before out-of-bounds memory is accessed.
+
+In our model, every pointer has an explicit or implicit bounds attribute that 
determines its bounds and ensures guaranteed bounds checking. Consider the 
example below where the ``__counted_by(count)`` annotation indicates that 
parameter ``p`` points to a buffer of integers containing ``count`` elements. 
An off-by-one error is present in the loop condition, leading to ``p[i]`` being 
an out-of-bounds access during the loop’s final iteration. The compiler inserts a 
bounds check before ``p`` is dereferenced to ensure that the access remains 
within the specified bounds.
+
+.. code-block:: c
+
+   void fill_array_with_indices(int *__counted_by(count) p, unsigned count) {
+      // off-by-one error: the condition should be i < count
+      for (unsigned i = 0; i <= count; ++i) {
+         // bounds check inserted:
+         //   if (i >= count) trap();
+         p[i] = i;
+      }
+   }
+
+A bounds annotation defines an invariant for the pointer type, and the model 
ensures that this invariant remains true. In the example below, pointer ``p`` 
annotated with ``__counted_by(count)`` must always point to a memory buffer 
containing at least ``count`` elements of the pointee type. Increasing the 
value of ``count``, like in the example below, would violate this invariant and 
permit out-of-bounds access to the pointer. To avoid this, the compiler emits 
either a compile-time error or a run-time trap. Section `Maintaining 
correctness of bounds annotations`_ provides more details about the programming 
model.
+
+.. code-block:: c
+
+   void foo(int *__counted_by(count) p, size_t count) {
+      count++; // violates the invariant of __counted_by
+   }
+
+The requirement to annotate all pointers with explicit bounds information 
could present a significant adoption burden. To tackle this

[clang] [Driver][BoundsSafety] Add -fexperimental-bounds-safety flag (PR #70480)

2023-11-14 Thread Yeoul Na via cfe-commits

https://github.com/rapidsna edited 
https://github.com/llvm/llvm-project/pull/70480
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Driver][BoundsSafety] Add -fexperimental-bounds-safety flag (PR #70480)

2023-11-14 Thread Yeoul Na via cfe-commits

https://github.com/rapidsna updated 
https://github.com/llvm/llvm-project/pull/70480

>From 99ec6e055dd32a86bf6d589a6895658dcbe1d7bd Mon Sep 17 00:00:00 2001
From: Yeoul Na 
Date: Fri, 27 Oct 2023 08:34:37 -0700
Subject: [PATCH 01/11] [Driver][BoundsSafety] Add -fbounds-safety-experimental
 flag

-fbounds-safety-experimental is an experimental flag for
-fbounds-safety, which is a bounds-safety extension for C.
-fbounds-safety will require substantial changes across the Clang
codebase, so we introduce this experimental flag to gate our
incremental patches until we push the essential functionality of
the extension.

-fbounds-safety-experimental currently does nothing except
report an error when the flag is used with an unsupported
source language (only C is currently supported).
---
 .../clang/Basic/DiagnosticFrontendKinds.td|  3 +++
 clang/include/clang/Basic/LangOptions.def |  2 ++
 clang/include/clang/Driver/Options.td |  8 +++
 clang/lib/Driver/ToolChains/Clang.cpp |  3 +++
 clang/lib/Frontend/CompilerInvocation.cpp | 23 +++
 clang/test/BoundsSafety/Driver/driver.c   |  9 
 .../Frontend/only_c_is_supported.c| 15 
 7 files changed, 63 insertions(+)
 create mode 100644 clang/test/BoundsSafety/Driver/driver.c
 create mode 100644 clang/test/BoundsSafety/Frontend/only_c_is_supported.c

diff --git a/clang/include/clang/Basic/DiagnosticFrontendKinds.td 
b/clang/include/clang/Basic/DiagnosticFrontendKinds.td
index 715e0c0dc8fa84e..edcbbe992377e12 100644
--- a/clang/include/clang/Basic/DiagnosticFrontendKinds.td
+++ b/clang/include/clang/Basic/DiagnosticFrontendKinds.td
@@ -330,6 +330,9 @@ def warn_alias_with_section : Warning<
   "as the %select{aliasee|resolver}2">,
   InGroup;
 
+def error_bounds_safety_lang_not_supported : Error<
+  "bounds safety is only supported for C">;
+
 let CategoryName = "Instrumentation Issue" in {
 def warn_profile_data_out_of_date : Warning<
   "profile data may be out of date: of %0 function%s0, %1 
%plural{1:has|:have}1"
diff --git a/clang/include/clang/Basic/LangOptions.def 
b/clang/include/clang/Basic/LangOptions.def
index c0ea4ecb9806a5b..222812d876a65f8 100644
--- a/clang/include/clang/Basic/LangOptions.def
+++ b/clang/include/clang/Basic/LangOptions.def
@@ -470,6 +470,8 @@ VALUE_LANGOPT(FuchsiaAPILevel, 32, 0, "Fuchsia API level")
 // on large _BitInts.
 BENIGN_VALUE_LANGOPT(MaxBitIntWidth, 32, 128, "Maximum width of a _BitInt")
 
+LANGOPT(BoundsSafety, 1, 0, "Bounds safety extension for C")
+
 LANGOPT(IncrementalExtensions, 1, 0, " True if we want to process statements"
 "on the global scope, ignore EOF token and continue later on (thus "
 "avoid tearing the Lexer and etc. down). Controlled by "
diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 7f3f5125d42e7a9..3eb98c8ee2950a1 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -1732,6 +1732,14 @@ def fswift_async_fp_EQ : Joined<["-"], 
"fswift-async-fp=">,
 NormalizedValues<["Auto", "Always", "Never"]>,
 MarshallingInfoEnum, "Always">;
 
+defm bounds_safety : BoolFOption<
+  "bounds-safety-experimental",
+  LangOpts<"BoundsSafety">, DefaultFalse,
+  PosFlag,
+  NegFlag,
+  BothFlags<[], [ClangOption, CC1Option],
+  " experimental bounds safety extension for C">>;
+
 defm addrsig : BoolFOption<"addrsig",
   CodeGenOpts<"Addrsig">, DefaultFalse,
   PosFlag,
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp 
b/clang/lib/Driver/ToolChains/Clang.cpp
index 43a92adbef64ba8..7482b852fb37958 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -6689,6 +6689,9 @@ void Clang::ConstructJob(Compilation &C, const JobAction 
&JA,
   Args.addOptOutFlag(CmdArgs, options::OPT_fassume_sane_operator_new,
  options::OPT_fno_assume_sane_operator_new);
 
+  Args.addOptInFlag(CmdArgs, options::OPT_fbounds_safety,
+options::OPT_fno_bounds_safety);
+
   // -fblocks=0 is default.
   if (Args.hasFlag(options::OPT_fblocks, options::OPT_fno_blocks,
TC.IsBlocksDefault()) ||
diff --git a/clang/lib/Frontend/CompilerInvocation.cpp 
b/clang/lib/Frontend/CompilerInvocation.cpp
index fd6c250efeda2a8..f785bd504d63a81 100644
--- a/clang/lib/Frontend/CompilerInvocation.cpp
+++ b/clang/lib/Frontend/CompilerInvocation.cpp
@@ -3618,6 +3618,23 @@ void CompilerInvocationBase::GenerateLangArgs(const 
LangOptions &Opts,
 GenerateArg(Consumer, OPT_frandomize_layout_seed_EQ, Opts.RandstructSeed);
 }
 
+static bool SupportsBoundsSafety(Language Lang) {
+  // Currently, bounds safety is only supported for C. However, it's also
+  // possible to pass assembly files and LLVM IR through Clang, and
+  // those should be trivially supported. This is especially important because
+  // some build systems, like xcbuild and somewhat clumsy Makefiles, will pass

[clang] [Driver][BoundsSafety] Add -fexperimental-bounds-safety flag (PR #70480)

2023-11-14 Thread Yeoul Na via cfe-commits

rapidsna wrote:

> > -fbounds-safety-experimental is an experimental flag for -fbounds-safety,
> 
> -fexperimental-bounds-safety

Changed the description in the PR! I'll adjust the commit message too when I 
squash all the changes once I get your approval.

https://github.com/llvm/llvm-project/pull/70480


[clang] [Driver][BoundsSafety] Add -fexperimental-bounds-safety flag (PR #70480)

2023-11-14 Thread Yeoul Na via cfe-commits

rapidsna wrote:

> > @MaskRay To be clear, the check is now in Driver as you suggested. It's 
> > just that the frontend also has some extra checks too. So, you want me to 
> > remove the extra checks in the frontend?
> 
> Yes, otherwise it's a duplicated check.

I just removed the check in the frontend!



https://github.com/llvm/llvm-project/pull/70480


[clang] [Driver][BoundsSafety] Add -fexperimental-bounds-safety flag (PR #70480)

2023-11-14 Thread Yeoul Na via cfe-commits

rapidsna wrote:

> In addition, I think our convention is to add the option when the feature is 
> ready, rather than add a no-op option, then build functional patches on top 
> of it.

@MaskRay This feature will involve a lot of incremental patches, and we will 
still need an experimental flag to test the incremental functionality as it is 
added. I can make it a CC1-only flag for now. Would that work better for you?

https://github.com/llvm/llvm-project/pull/70480


[clang] [Driver][BoundsSafety] Add -fbounds-safety-experimental flag (PR #70480)

2023-10-30 Thread Yeoul Na via cfe-commits

https://github.com/rapidsna updated 
https://github.com/llvm/llvm-project/pull/70480

>From 99ec6e055dd32a86bf6d589a6895658dcbe1d7bd Mon Sep 17 00:00:00 2001
From: Yeoul Na 
Date: Fri, 27 Oct 2023 08:34:37 -0700
Subject: [PATCH 1/2] [Driver][BoundsSafety] Add -fbounds-safety-experimental
 flag

-fbounds-safety-experimental is an experimental flag for
-fbounds-safety, which is a bounds-safety extension for C.
-fbounds-safety will require substantial changes across the Clang
codebase, so we introduce this experimental flag to gate our
incremental patches until we push the essential functionality of
the extension.

-fbounds-safety-experimental currently does nothing except
report an error when the flag is used with an unsupported
source language (only C is currently supported).
---
 .../clang/Basic/DiagnosticFrontendKinds.td|  3 +++
 clang/include/clang/Basic/LangOptions.def |  2 ++
 clang/include/clang/Driver/Options.td |  8 +++
 clang/lib/Driver/ToolChains/Clang.cpp |  3 +++
 clang/lib/Frontend/CompilerInvocation.cpp | 23 +++
 clang/test/BoundsSafety/Driver/driver.c   |  9 
 .../Frontend/only_c_is_supported.c| 15 
 7 files changed, 63 insertions(+)
 create mode 100644 clang/test/BoundsSafety/Driver/driver.c
 create mode 100644 clang/test/BoundsSafety/Frontend/only_c_is_supported.c

diff --git a/clang/include/clang/Basic/DiagnosticFrontendKinds.td 
b/clang/include/clang/Basic/DiagnosticFrontendKinds.td
index 715e0c0dc8fa84e..edcbbe992377e12 100644
--- a/clang/include/clang/Basic/DiagnosticFrontendKinds.td
+++ b/clang/include/clang/Basic/DiagnosticFrontendKinds.td
@@ -330,6 +330,9 @@ def warn_alias_with_section : Warning<
   "as the %select{aliasee|resolver}2">,
   InGroup;
 
+def error_bounds_safety_lang_not_supported : Error<
+  "bounds safety is only supported for C">;
+
 let CategoryName = "Instrumentation Issue" in {
 def warn_profile_data_out_of_date : Warning<
   "profile data may be out of date: of %0 function%s0, %1 
%plural{1:has|:have}1"
diff --git a/clang/include/clang/Basic/LangOptions.def 
b/clang/include/clang/Basic/LangOptions.def
index c0ea4ecb9806a5b..222812d876a65f8 100644
--- a/clang/include/clang/Basic/LangOptions.def
+++ b/clang/include/clang/Basic/LangOptions.def
@@ -470,6 +470,8 @@ VALUE_LANGOPT(FuchsiaAPILevel, 32, 0, "Fuchsia API level")
 // on large _BitInts.
 BENIGN_VALUE_LANGOPT(MaxBitIntWidth, 32, 128, "Maximum width of a _BitInt")
 
+LANGOPT(BoundsSafety, 1, 0, "Bounds safety extension for C")
+
 LANGOPT(IncrementalExtensions, 1, 0, " True if we want to process statements"
 "on the global scope, ignore EOF token and continue later on (thus "
 "avoid tearing the Lexer and etc. down). Controlled by "
diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 7f3f5125d42e7a9..3eb98c8ee2950a1 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -1732,6 +1732,14 @@ def fswift_async_fp_EQ : Joined<["-"], 
"fswift-async-fp=">,
 NormalizedValues<["Auto", "Always", "Never"]>,
 MarshallingInfoEnum, "Always">;
 
+defm bounds_safety : BoolFOption<
+  "bounds-safety-experimental",
+  LangOpts<"BoundsSafety">, DefaultFalse,
+  PosFlag,
+  NegFlag,
+  BothFlags<[], [ClangOption, CC1Option],
+  " experimental bounds safety extension for C">>;
+
 defm addrsig : BoolFOption<"addrsig",
   CodeGenOpts<"Addrsig">, DefaultFalse,
   PosFlag,
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp 
b/clang/lib/Driver/ToolChains/Clang.cpp
index 43a92adbef64ba8..7482b852fb37958 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -6689,6 +6689,9 @@ void Clang::ConstructJob(Compilation &C, const JobAction 
&JA,
   Args.addOptOutFlag(CmdArgs, options::OPT_fassume_sane_operator_new,
  options::OPT_fno_assume_sane_operator_new);
 
+  Args.addOptInFlag(CmdArgs, options::OPT_fbounds_safety,
+options::OPT_fno_bounds_safety);
+
   // -fblocks=0 is default.
   if (Args.hasFlag(options::OPT_fblocks, options::OPT_fno_blocks,
TC.IsBlocksDefault()) ||
diff --git a/clang/lib/Frontend/CompilerInvocation.cpp 
b/clang/lib/Frontend/CompilerInvocation.cpp
index fd6c250efeda2a8..f785bd504d63a81 100644
--- a/clang/lib/Frontend/CompilerInvocation.cpp
+++ b/clang/lib/Frontend/CompilerInvocation.cpp
@@ -3618,6 +3618,23 @@ void CompilerInvocationBase::GenerateLangArgs(const 
LangOptions &Opts,
 GenerateArg(Consumer, OPT_frandomize_layout_seed_EQ, Opts.RandstructSeed);
 }
 
+static bool SupportsBoundsSafety(Language Lang) {
+  // Currently, bounds safety is only supported for C. However, it's also
+  // possible to pass assembly files and LLVM IR through Clang, and
+  // those should be trivially supported. This is especially important because
+  // some build systems, like xcbuild and somewhat clumsy Makefiles, will pass
+ 

[clang] [Driver][BoundsSafety] Add -fbounds-safety-experimental flag (PR #70480)

2023-10-30 Thread Yeoul Na via cfe-commits


@@ -330,6 +330,9 @@ def warn_alias_with_section : Warning<
   "as the %select{aliasee|resolver}2">,
   InGroup;
 
+def error_bounds_safety_lang_not_supported : Error<

rapidsna wrote:

Done!

https://github.com/llvm/llvm-project/pull/70480


[clang] [Driver][BoundsSafety] Add -fbounds-safety-experimental flag (PR #70480)

2023-10-30 Thread Yeoul Na via cfe-commits


@@ -3618,6 +3618,23 @@ void CompilerInvocationBase::GenerateLangArgs(const 
LangOptions &Opts,
 GenerateArg(Consumer, OPT_frandomize_layout_seed_EQ, Opts.RandstructSeed);
 }
 
+static bool SupportsBoundsSafety(Language Lang) {
+  // Currently, bounds safety is only supported for C. However, it's also
+  // possible to pass assembly files and LLVM IR through Clang, and
+  // those should be trivially supported. This is especially important because
+  // some build systems, like xcbuild and somewhat clumsy Makefiles, will pass
+  // C_FLAGS to Clang while building assembly files.
+  switch (Lang) {
+  case Language::Unknown:
+  case Language::Asm:

rapidsna wrote:

Introduced a warning for ASM. I don't think LLVM IR/Unknown should be reachable 
when parsing the language arguments, so I now treat them as unreachable.
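
For readers following the thread, the behavior described above can be sketched as a small standalone C fragment. This is only an illustration under stated assumptions, not the code in the PR: the `lang_kind` and `bounds_safety_status` enums and the `classify_bounds_safety` function are hypothetical names, and the real check is written in C++ against Clang's `Language` enum. The mapping follows the diagnostics and tests quoted in this thread (C is accepted, assembly gets an "ignored" warning, and C++/Objective-C/Objective-C++ are rejected).

```c
#include <assert.h>

/* Hypothetical stand-ins for the input languages discussed in this thread. */
enum lang_kind { LANG_C, LANG_CXX, LANG_OBJC, LANG_OBJCXX, LANG_ASM };

/* Hypothetical result type: what the flag check decides for an input. */
enum bounds_safety_status {
  BS_SUPPORTED,            /* flag is honored */
  BS_IGNORED_WITH_WARNING, /* flag is ignored, a warning is reported */
  BS_ERROR                 /* "bounds safety is only supported for C" */
};

static enum bounds_safety_status classify_bounds_safety(enum lang_kind lang) {
  switch (lang) {
  case LANG_C:
    return BS_SUPPORTED;
  case LANG_ASM:
    /* Build systems may pass C flags when assembling, so only warn. */
    return BS_IGNORED_WITH_WARNING;
  case LANG_CXX:
  case LANG_OBJC:
  case LANG_OBJCXX:
    return BS_ERROR;
  }
  /* Anything else (e.g. Unknown or LLVM IR) is assumed not to reach here. */
  assert(0 && "unexpected input language");
  return BS_ERROR;
}
```

A caller would map `BS_IGNORED_WITH_WARNING` to a warning diagnostic and `BS_ERROR` to an error diagnostic. Keeping the decision in one function makes the "unreachable" assumption explicit in a single place.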

https://github.com/llvm/llvm-project/pull/70480


[clang] [Driver][BoundsSafety] Add -fbounds-safety-experimental flag (PR #70480)

2023-10-30 Thread Yeoul Na via cfe-commits


@@ -3618,6 +3618,23 @@ void CompilerInvocationBase::GenerateLangArgs(const 
LangOptions &Opts,
 GenerateArg(Consumer, OPT_frandomize_layout_seed_EQ, Opts.RandstructSeed);
 }
 
+static bool SupportsBoundsSafety(Language Lang) {
+  // Currently, bounds safety is only supported for C. However, it's also
+  // possible to pass assembly files and LLVM IR through Clang, and
+  // those should be trivially supported. This is especially important because
+  // some build systems, like xcbuild and somewhat clumsy Makefiles, will pass
+  // C_FLAGS to Clang while building assembly files.
+  switch (Lang) {
+  case Language::Unknown:
+  case Language::Asm:
+  case Language::LLVM_IR:

rapidsna wrote:

Done!

https://github.com/llvm/llvm-project/pull/70480


[clang] [Driver][BoundsSafety] Add -fbounds-safety-experimental flag (PR #70480)

2023-10-30 Thread Yeoul Na via cfe-commits


@@ -3835,6 +3852,12 @@ bool CompilerInvocation::ParseLangArgs(LangOptions 
&Opts, ArgList &Args,
   Opts.Trigraphs =
   Args.hasFlag(OPT_ftrigraphs, OPT_fno_trigraphs, Opts.Trigraphs);
 
+  Opts.BoundsSafety = Args.hasFlag(OPT_fbounds_safety, OPT_fno_bounds_safety,

rapidsna wrote:

I think you're right. Removed the change.

https://github.com/llvm/llvm-project/pull/70480


[clang] [Driver][BoundsSafety] Add -fbounds-safety-experimental flag (PR #70480)

2023-10-30 Thread Yeoul Na via cfe-commits


@@ -0,0 +1,9 @@
+// RUN: %clang -c %s -### 2>&1 | not grep fbounds-safety-experimental

rapidsna wrote:

Done!

https://github.com/llvm/llvm-project/pull/70480



2023-10-30 Thread Yeoul Na via cfe-commits


@@ -0,0 +1,5 @@
+
+// RUN: %clang -fbounds-safety-experimental -fsyntax-only %s 2>&1 | FileCheck 
%s
+// RUN: %clang_cc1 -fbounds-safety-experimental -fsyntax-only %s 2>&1 | 
FileCheck %s
+
+// CHECK: warning: '-fbounds-safety' is ignored for assembly

rapidsna wrote:

Done!

https://github.com/llvm/llvm-project/pull/70480
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [BoundsSafety] Initial documentation for -fbounds-safety (PR #70749)

2023-10-30 Thread Yeoul Na via cfe-commits

https://github.com/rapidsna created 
https://github.com/llvm/llvm-project/pull/70749

The document is mostly an exact copy of the RFC: Enforcing Bounds Safety in C, 
except for some minor adjustments in tone to make it more suitable for 
documentation:
https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854

Further changes and clarifications for the programming model will be done as 
separate patches to make it easier to track history of changes.

>From a49a652c689438c919b3897c97560b05b3c232d7 Mon Sep 17 00:00:00 2001
From: Yeoul Na 
Date: Mon, 30 Oct 2023 16:48:36 -0700
Subject: [PATCH] [BoundsSafety] Initial documentation for -fbounds-safety

The document is mostly an exact copy of the RFC: Enforcing Bounds
Safety in C, except for some minor adjustments in tone to make
it more suitable for documentation:
https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854

Further changes and clarifications for the programming model will
be done as separate patches to make it easier to track history of
changes.
---
 clang/docs/BoundsSafety.rst | 480 
 clang/docs/index.rst|   1 +
 2 files changed, 481 insertions(+)
 create mode 100644 clang/docs/BoundsSafety.rst

diff --git a/clang/docs/BoundsSafety.rst b/clang/docs/BoundsSafety.rst
new file mode 100644
index 000..00ef9a2a41b8de0
--- /dev/null
+++ b/clang/docs/BoundsSafety.rst
@@ -0,0 +1,480 @@
+==============================================
+-fbounds-safety: Enforcing bounds safety for C
+==============================================
+
+.. contents::
+   :local:
+
+Overview
+========
+
+-fbounds-safety is a C extension to enforce bounds safety to prevent 
out-of-bounds (OOB) memory accesses, which remain a major source of security 
vulnerabilities in C. -fbounds-safety aims to eliminate this class of bugs by 
turning OOB accesses into deterministic traps.
+
+The -fbounds-safety extension offers bounds annotations that programmers can 
use to attach bounds to pointers. For example, programmers can add the 
__counted_by(N) annotation to parameter ptr, indicating that the pointer has N 
valid elements:
+
+.. code-block:: c
+
+   void foo(int *__counted_by(N) ptr, size_t N);
+
+Using this bounds information, the compiler inserts bounds checks on every 
pointer dereference, ensuring that the program does not access memory outside 
the specified bounds. The compiler requires programmers to provide enough 
bounds information so that the accesses can be checked at either run time or 
compile time — and it rejects code if it cannot.
+
+The most important contribution of “-fbounds-safety” is how it reduces the 
programmer’s annotation burden by reconciling bounds annotations at ABI 
boundaries with the use of implicit wide pointers (a.k.a. “fat” pointers) that 
carry bounds information on local variables without the need for annotations. 
We designed this model so that it preserves ABI compatibility with C while 
minimizing adoption effort.
+
+The -fbounds-safety extension has been adopted on millions of lines of 
production C code and proven to work in a consumer operating system setting. 
The extension was designed to enable incremental adoption — a key requirement 
in real-world settings where modifying an entire project and its dependencies 
all at once is often not possible. It also addresses multiple other 
practical challenges that have made existing approaches to safer C dialects 
difficult to adopt, offering these properties that make it widely adoptable in 
practice:
+
+* It is designed to preserve the Application Binary Interface (ABI)
+* It interoperates well with plain C code
+* It can be adopted partially and incrementally while still providing safety 
benefits
+* It is syntactically and semantically compatible with C
+* Consequently, source code that adopts the extension can continue to be 
compiled by toolchains that do not support the extension.
+* It has a relatively low adoption cost
+* It can be implemented on top of Clang
+
+
+Programming Model
+=================
+
+Overview
+--------
+
+-fbounds-safety ensures that pointers are not used to access memory beyond 
their bounds by performing bounds checking. If a bounds check fails, the 
program will deterministically trap before out-of-bounds memory is accessed.
+
+In our model, every pointer has an explicit or implicit bounds attribute that 
determines its bounds and ensures guaranteed bounds checking. Consider the 
example below where the __counted_by(count) annotation indicates that parameter 
p points to a buffer of ints containing count elements. An off-by-one error is 
present in the loop condition, leading to p[i] being an out-of-bounds access during 
the loop’s final iteration. The compiler inserts a bounds check before p is 
dereferenced to ensure that the access remains within the specified bounds.
+
+.. code-block:: c
+
+   void fill_array_with_indices(int *__counted_by

[clang] [CodeGen] Revamp counted_by calculations (PR #70606)

2023-10-30 Thread Yeoul Na via cfe-commits

https://github.com/rapidsna edited 
https://github.com/llvm/llvm-project/pull/70606


[clang] [Driver][BoundsSafety] Add -fbounds-safety-experimental flag (PR #70480)

2023-10-31 Thread Yeoul Na via cfe-commits

rapidsna wrote:

> The other experimental flags I see start with experimental, this one ends 
> with it. Why isn't this called `-fexperimental-bounds-safety`?

Oh, I can see that most of them start with `-fexperimental`, though not all of them. 
I can fix this. Is there a formal convention?

https://github.com/llvm/llvm-project/pull/70480


[clang] [Driver][BoundsSafety] Add -fbounds-safety-experimental flag (PR #70480)

2023-10-31 Thread Yeoul Na via cfe-commits


@@ -330,6 +330,14 @@ def warn_alias_with_section : Warning<
   "as the %select{aliasee|resolver}2">,
   InGroup;
 
+let CategoryName = "Bounds Safety Issue" in {
+def err_bounds_safety_lang_not_supported : Error<
+  "bounds safety is only supported for C">;
+def warn_bounds_safety_asm_ignored : Warning<
+  "'-fbounds-safety' is ignored for assembly">,

rapidsna wrote:

@MaskRay Thank you! I can use `warning: argument unused during compilation` for 
the driver. Do you know what the convention is for unused flags in cc1?

https://github.com/llvm/llvm-project/pull/70480


[clang] [Driver][BoundsSafety] Add -fbounds-safety-experimental flag (PR #70480)

2023-10-31 Thread Yeoul Na via cfe-commits


@@ -0,0 +1,11 @@
+// RUN: %clang -c %s -### 2>&1 | FileCheck -check-prefix T0 %s

rapidsna wrote:

@nickdesaulniers I was hoping that we could keep -fbounds-safety tests under a 
separate folder, clang/test/BoundsSafety, to keep them from getting mixed up 
with other tests and to make it easier to run only the bounds-safety-related 
tests. But I can see this is not what other extensions do, so I can move 
them to the existing layout.

https://github.com/llvm/llvm-project/pull/70480
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Driver][BoundsSafety] Add -fbounds-safety-experimental flag (PR #70480)

2023-10-31 Thread Yeoul Na via cfe-commits


@@ -0,0 +1,25 @@
+// RUN: not %clang -fbounds-safety-experimental -x c++ %s 2>&1 | FileCheck -check-prefix ERR %s
+
+// RUN: not %clang -fbounds-safety-experimental -x objective-c %s 2>&1 | FileCheck -check-prefix ERR %s
+
+// RUN: not %clang -fbounds-safety-experimental -x objective-c++ %s 2>&1 | FileCheck -check-prefix ERR %s
+
+// RUN: not %clang -fbounds-safety-experimental -x cuda -nocudalib -nocudainc %s 2>&1 | FileCheck -check-prefix ERR %s
+
+// RUN: not %clang -fbounds-safety-experimental -x renderscript %s 2>&1 | FileCheck -check-prefix ERR %s
+
+// RUN: not %clang_cc1 -fbounds-safety-experimental -x c++ %s 2>&1 | FileCheck -check-prefix ERR %s
+
+// RUN: not %clang_cc1 -fbounds-safety-experimental -x objective-c %s 2>&1 | FileCheck -check-prefix ERR %s
+
+// RUN: not %clang_cc1 -fbounds-safety-experimental -x objective-c++ %s 2>&1 | FileCheck -check-prefix ERR %s
+
+// RUN: not %clang_cc1 -fbounds-safety-experimental -x cuda %s 2>&1 | FileCheck -check-prefix ERR %s
+
+// RUN: not %clang_cc1 -fbounds-safety-experimental -x renderscript %s 2>&1 | FileCheck -check-prefix ERR %s

rapidsna wrote:

I think at least `-x ir` had to be in a separate file because of how it 
recognizes comments (`//` vs `;`). I'll double check.

https://github.com/llvm/llvm-project/pull/70480
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Driver][BoundsSafety] Add -fbounds-safety-experimental flag (PR #70480)

2023-10-31 Thread Yeoul Na via cfe-commits


@@ -330,6 +330,14 @@ def warn_alias_with_section : Warning<
   "as the %select{aliasee|resolver}2">,
   InGroup;
 
+let CategoryName = "Bounds Safety Issue" in {
+def err_bounds_safety_lang_not_supported : Error<
+  "bounds safety is only supported for C">;
+def warn_bounds_safety_asm_ignored : Warning<
+  "'-fbounds-safety' is ignored for assembly">,

rapidsna wrote:

@nickdesaulniers They are separate because we want an error for the languages 
(C++/Obj-C/etc.) that we don't properly support yet, versus a warning for 
assembly (or maybe other inputs) where the flag is meant to be ignored. For the 
latter, I think we can use `warning: argument unused during compilation`, as 
@MaskRay suggested.

https://github.com/llvm/llvm-project/pull/70480
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [CodeGen] Revamp counted_by calculations (PR #70606)

2023-10-31 Thread Yeoul Na via cfe-commits


@@ -859,53 +859,60 @@ CodeGenFunction::emitBuiltinObjectSize(const Expr *E, 
unsigned Type,
   }
 
   if (IsDynamic) {
-LangOptions::StrictFlexArraysLevelKind StrictFlexArraysLevel =
-getLangOpts().getStrictFlexArraysLevel();
-const Expr *Base = E->IgnoreParenImpCasts();
-
-if (FieldDecl *FD = FindCountedByField(Base, StrictFlexArraysLevel)) {
-  const auto *ME = dyn_cast(Base);
-  llvm::Value *ObjectSize = nullptr;
-
-  if (!ME) {
-const auto *DRE = dyn_cast(Base);
-ValueDecl *VD = nullptr;
-
-ObjectSize = ConstantInt::get(
-ResType,
-getContext().getTypeSize(DRE->getType()->getPointeeType()) / 8,
-true);
-
-if (auto *RD = DRE->getType()->getPointeeType()->getAsRecordDecl())
-  VD = RD->getLastField();
-
-Expr *ICE = ImplicitCastExpr::Create(
-getContext(), DRE->getType(), CK_LValueToRValue,
-const_cast(cast(DRE)), nullptr, VK_PRValue,
-FPOptionsOverride());
-ME = MemberExpr::CreateImplicit(getContext(), ICE, true, VD,
-VD->getType(), VK_LValue, OK_Ordinary);
-  }
-
-  // At this point, we know that \p ME is a flexible array member.
-  const auto *ArrayTy = getContext().getAsArrayType(ME->getType());
+// The code generated here calculates the size of a struct with a flexible
+// array member that uses the counted_by attribute. There are two instances
+// we handle:
+//
+//   struct s {
+// unsigned long flags;
+// int count;
+// int array[] __attribute__((counted_by(count)));
+//   }
+//
+//   1) bdos of the flexible array itself:
+//
+// __builtin_dynamic_object_size(p->array, 1) ==
+// p->count * sizeof(*p->array)
+//
+//   2) bdos of the whole struct, including the flexible array:
+//
+// __builtin_dynamic_object_size(p, 1) ==
+//sizeof(*p) + p->count * sizeof(*p->array)

rapidsna wrote:

This is about the semantics.

`int array[__counted_by(count)]` means you have `count` elements starting right 
at the offset of the flexible array member `array`, which is 
`offsetof(struct s, array)`.

Whereas `sizeof(*p) + p->count * sizeof(*p->array)` indicates you have `count` 
elements starting at the end of the struct, which isn't always the same as 
the offset of `array` because of padding and alignment rules.
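
A minimal sketch of the two starting points, reusing `struct s` from the diff 
comment. The helper names `fam_start`, `struct_end` and `tail_padding` are 
hypothetical, the attribute is left out so that any C99 compiler accepts the 
code, and the actual values depend on the target ABI:

```c
#include <stddef.h>

struct s {
  unsigned long flags;
  int count;
  int array[]; /* counted_by(count) omitted in this sketch */
};

/* Byte offset at which the `count` elements actually begin. */
static size_t fam_start(void) { return offsetof(struct s, array); }

/* Byte offset at which the fixed part ends, tail padding included. */
static size_t struct_end(void) { return sizeof(struct s); }

/* Padding that sizeof-based arithmetic counts on top of the array. */
static size_t tail_padding(void) { return struct_end() - fam_start(); }
```

On a typical LP64 target I would expect `fam_start()` to be 12 and 
`struct_end()` to be 16, so the two starting points differ by 4 bytes.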


https://github.com/llvm/llvm-project/pull/70606
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [CodeGen] Revamp counted_by calculations (PR #70606)

2023-10-31 Thread Yeoul Na via cfe-commits

https://github.com/rapidsna edited 
https://github.com/llvm/llvm-project/pull/70606
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Driver][BoundsSafety] Add -fbounds-safety-experimental flag (PR #70480)

2023-10-31 Thread Yeoul Na via cfe-commits


@@ -3618,6 +3618,30 @@ void CompilerInvocationBase::GenerateLangArgs(const 
LangOptions &Opts,
 GenerateArg(Consumer, OPT_frandomize_layout_seed_EQ, Opts.RandstructSeed);
 }
 
+static void CheckBoundsSafetyLang(InputKind IK, DiagnosticsEngine &Diags) {
+  // Currently, bounds safety is only supported for C. However, it's also
+  // possible to pass assembly files and LLVM IR through Clang, and
+  // those should be trivially supported. This is especially important because
+  // some build systems, like xcbuild and somewhat clumsy Makefiles, will pass
+  // C_FLAGS to Clang while building assembly files.
+  switch (IK.getLanguage()) {
+  case Language::Asm:
+Diags.Report(diag::warn_bounds_safety_asm_ignored);
+break;

rapidsna wrote:

Right, this is something we could potentially allow in the future if we see 
actual use cases.

Until we support this, `__has_feature(bounds_safety)` or such will return 
`false`, and we should have the compiler report the warning that 
`-fbounds-safety` is ignored for assembly.
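
For illustration, a guard along these lines is what I have in mind. This is 
only a sketch: the feature name `bounds_safety` is an assumption (hence "or 
such" above), and `BOUNDS_SAFETY_ENABLED` and `bounds_safety_enabled` are 
hypothetical names:

```c
/* __has_feature is a Clang extension; give other compilers a fallback. */
#ifndef __has_feature
#define __has_feature(x) 0
#endif

/* Assumption: the feature is spelled `bounds_safety`. */
#if __has_feature(bounds_safety)
#define BOUNDS_SAFETY_ENABLED 1
#else
#define BOUNDS_SAFETY_ENABLED 0
#endif

/* Lets ordinary code ask whether the extension was active for this file. */
static int bounds_safety_enabled(void) { return BOUNDS_SAFETY_ENABLED; }
```

Code that checks the feature this way keeps compiling when the flag is ignored 
or unsupported for the input.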

https://github.com/llvm/llvm-project/pull/70480
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [CodeGen] Revamp counted_by calculations (PR #70606)

2023-10-31 Thread Yeoul Na via cfe-commits


@@ -859,53 +859,60 @@ CodeGenFunction::emitBuiltinObjectSize(const Expr *E, 
unsigned Type,
   }
 
   if (IsDynamic) {
-LangOptions::StrictFlexArraysLevelKind StrictFlexArraysLevel =
-getLangOpts().getStrictFlexArraysLevel();
-const Expr *Base = E->IgnoreParenImpCasts();
-
-if (FieldDecl *FD = FindCountedByField(Base, StrictFlexArraysLevel)) {
-  const auto *ME = dyn_cast(Base);
-  llvm::Value *ObjectSize = nullptr;
-
-  if (!ME) {
-const auto *DRE = dyn_cast(Base);
-ValueDecl *VD = nullptr;
-
-ObjectSize = ConstantInt::get(
-ResType,
-getContext().getTypeSize(DRE->getType()->getPointeeType()) / 8,
-true);
-
-if (auto *RD = DRE->getType()->getPointeeType()->getAsRecordDecl())
-  VD = RD->getLastField();
-
-Expr *ICE = ImplicitCastExpr::Create(
-getContext(), DRE->getType(), CK_LValueToRValue,
-const_cast(cast(DRE)), nullptr, VK_PRValue,
-FPOptionsOverride());
-ME = MemberExpr::CreateImplicit(getContext(), ICE, true, VD,
-VD->getType(), VK_LValue, OK_Ordinary);
-  }
-
-  // At this point, we know that \p ME is a flexible array member.
-  const auto *ArrayTy = getContext().getAsArrayType(ME->getType());
+// The code generated here calculates the size of a struct with a flexible
+// array member that uses the counted_by attribute. There are two instances
+// we handle:
+//
+//   struct s {
+// unsigned long flags;
+// int count;
+// int array[] __attribute__((counted_by(count)));
+//   }
+//
+//   1) bdos of the flexible array itself:
+//
+// __builtin_dynamic_object_size(p->array, 1) ==
+// p->count * sizeof(*p->array)
+//
+//   2) bdos of the whole struct, including the flexible array:
+//
+// __builtin_dynamic_object_size(p, 1) ==
+//sizeof(*p) + p->count * sizeof(*p->array)

rapidsna wrote:

That was why I suggested having `max` here: `max(offsetof(struct s, array) + 
p->count * sizeof(*p->array), sizeof(*p))`, so that it returns the correct 
size when the struct size is bigger than the end of the array due to 
padding.
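
Written out as plain C, the suggestion would look roughly like this. It is a 
sketch only, not the CodeGen change itself: `whole_object_size` is a 
hypothetical helper, and treating a negative count as zero is my assumption:

```c
#include <stddef.h>

struct s {
  unsigned long flags;
  int count;
  int array[]; /* counted_by(count) omitted in this sketch */
};

/* max(offsetof(struct s, array) + count * sizeof(element), sizeof(struct s)) */
static size_t whole_object_size(const struct s *p) {
  /* Assumption for this sketch: a negative count is treated as zero. */
  size_t count = p->count > 0 ? (size_t)p->count : 0;
  size_t fam_end = offsetof(struct s, array) + count * sizeof(p->array[0]);
  size_t fixed = sizeof(struct s);
  return fam_end > fixed ? fam_end : fixed;
}
```

With a small count the result is `sizeof(struct s)`; once the array extends 
past the tail padding, the result is the end of the array.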

https://github.com/llvm/llvm-project/pull/70606
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [CodeGen] Revamp counted_by calculations (PR #70606)

2023-10-31 Thread Yeoul Na via cfe-commits


@@ -859,53 +859,60 @@ CodeGenFunction::emitBuiltinObjectSize(const Expr *E, 
unsigned Type,
   }
 
   if (IsDynamic) {
-LangOptions::StrictFlexArraysLevelKind StrictFlexArraysLevel =
-getLangOpts().getStrictFlexArraysLevel();
-const Expr *Base = E->IgnoreParenImpCasts();
-
-if (FieldDecl *FD = FindCountedByField(Base, StrictFlexArraysLevel)) {
-  const auto *ME = dyn_cast(Base);
-  llvm::Value *ObjectSize = nullptr;
-
-  if (!ME) {
-const auto *DRE = dyn_cast(Base);
-ValueDecl *VD = nullptr;
-
-ObjectSize = ConstantInt::get(
-ResType,
-getContext().getTypeSize(DRE->getType()->getPointeeType()) / 8,
-true);
-
-if (auto *RD = DRE->getType()->getPointeeType()->getAsRecordDecl())
-  VD = RD->getLastField();
-
-Expr *ICE = ImplicitCastExpr::Create(
-getContext(), DRE->getType(), CK_LValueToRValue,
-const_cast(cast(DRE)), nullptr, VK_PRValue,
-FPOptionsOverride());
-ME = MemberExpr::CreateImplicit(getContext(), ICE, true, VD,
-VD->getType(), VK_LValue, OK_Ordinary);
-  }
-
-  // At this point, we know that \p ME is a flexible array member.
-  const auto *ArrayTy = getContext().getAsArrayType(ME->getType());
+// The code generated here calculates the size of a struct with a flexible
+// array member that uses the counted_by attribute. There are two instances
+// we handle:
+//
+//   struct s {
+// unsigned long flags;
+// int count;
+// int array[] __attribute__((counted_by(count)));
+//   }
+//
+//   1) bdos of the flexible array itself:
+//
+// __builtin_dynamic_object_size(p->array, 1) ==
+// p->count * sizeof(*p->array)
+//
+//   2) bdos of the whole struct, including the flexible array:
+//
+// __builtin_dynamic_object_size(p, 1) ==
+//sizeof(*p) + p->count * sizeof(*p->array)

rapidsna wrote:

```
struct flex {
double dummy;
char c;
char fam[__counted_by(c)];
};
```

[0~8)  : double dummy
[8~9)  : char c = 7
[9~16) : char fam[__counted_by(c)]

In the above case, I think `__builtin_dynamic_object_size` should return `16`, 
but `sizeof(*p) + p->count * sizeof(*p->array)` will return `16 + 7 = 23`. 
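
The two numbers can be reproduced with a small sketch. The helper names 
`end_by_offsetof` and `end_by_sizeof` are hypothetical, and the attribute is 
left out so that any C99 compiler accepts the struct:

```c
#include <stddef.h>

struct flex {
  double dummy;
  char c;
  char fam[]; /* __counted_by(c) omitted in this sketch */
};

/* End of the counted elements, measured from the start of the struct. */
static size_t end_by_offsetof(const struct flex *p) {
  return offsetof(struct flex, fam) + (size_t)p->c * sizeof(p->fam[0]);
}

/* The formula from the diff comment: the count is added on top of sizeof. */
static size_t end_by_sizeof(const struct flex *p) {
  return sizeof(*p) + (size_t)p->c * sizeof(p->fam[0]);
}
```

With `c = 7` on an ABI where `double` is 8-byte aligned, the first helper gives 
16 and the second gives 23.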

https://github.com/llvm/llvm-project/pull/70606
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [CodeGen] Revamp counted_by calculations (PR #70606)

2023-10-31 Thread Yeoul Na via cfe-commits

https://github.com/rapidsna edited 
https://github.com/llvm/llvm-project/pull/70606
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [CodeGen] Revamp counted_by calculations (PR #70606)

2023-10-31 Thread Yeoul Na via cfe-commits


@@ -859,53 +859,60 @@ CodeGenFunction::emitBuiltinObjectSize(const Expr *E, 
unsigned Type,
   }
 
   if (IsDynamic) {
-LangOptions::StrictFlexArraysLevelKind StrictFlexArraysLevel =
-getLangOpts().getStrictFlexArraysLevel();
-const Expr *Base = E->IgnoreParenImpCasts();
-
-if (FieldDecl *FD = FindCountedByField(Base, StrictFlexArraysLevel)) {
-  const auto *ME = dyn_cast(Base);
-  llvm::Value *ObjectSize = nullptr;
-
-  if (!ME) {
-const auto *DRE = dyn_cast(Base);
-ValueDecl *VD = nullptr;
-
-ObjectSize = ConstantInt::get(
-ResType,
-getContext().getTypeSize(DRE->getType()->getPointeeType()) / 8,
-true);
-
-if (auto *RD = DRE->getType()->getPointeeType()->getAsRecordDecl())
-  VD = RD->getLastField();
-
-Expr *ICE = ImplicitCastExpr::Create(
-getContext(), DRE->getType(), CK_LValueToRValue,
-const_cast(cast(DRE)), nullptr, VK_PRValue,
-FPOptionsOverride());
-ME = MemberExpr::CreateImplicit(getContext(), ICE, true, VD,
-VD->getType(), VK_LValue, OK_Ordinary);
-  }
-
-  // At this point, we know that \p ME is a flexible array member.
-  const auto *ArrayTy = getContext().getAsArrayType(ME->getType());
+// The code generated here calculates the size of a struct with a flexible
+// array member that uses the counted_by attribute. There are two instances
+// we handle:
+//
+//   struct s {
+// unsigned long flags;
+// int count;
+// int array[] __attribute__((counted_by(count)));
+//   }
+//
+//   1) bdos of the flexible array itself:
+//
+// __builtin_dynamic_object_size(p->array, 1) ==
+// p->count * sizeof(*p->array)
+//
+//   2) bdos of the whole struct, including the flexible array:
+//
+// __builtin_dynamic_object_size(p, 1) ==
+//sizeof(*p) + p->count * sizeof(*p->array)

rapidsna wrote:

> Note: sizeof(*p) + p->count * sizeof(*p->array) returning 23 is exactly what 
> I want here. :-) At least with regard to __builtin_dynamic_object_size().

Hmm, I see. So, that's not how `-fbounds-safety` is meant to work: 
`__counted_by` on a FAM in -fbounds-safety means there are exactly `count` 
trailing elements starting from `&s->fam[0]`. So I think this semantic 
difference between the two modes is going to be a problem (for example, 
accessing offset 22 of the object will be considered OOB in -fbounds-safety).

Is there a reason you want it to be this way? I know people tend to do 
`malloc(sizeof(struct s) + p->count * sizeof(int));` to allocate an object with 
a FAM. But people also use `offsetof`, and in such cases 
`__builtin_dynamic_object_size` can be bigger than it should be.

```
struct flex {
  double dummy;
  char c;
  char fam [__counted_by(7)];
};
```

If `__builtin_dynamic_object_size` computes `sizeof(*p) + p->count * 
sizeof(*p->array)`, it effectively considers `&s->fam[13]` part of 
the object.
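
To make the offset-22 case concrete, here is a rough model of the two views. 
This is not how -fbounds-safety is implemented; `in_bounds_counted` and 
`in_bounds_sizeof` are hypothetical helpers that only compare a byte offset 
against each notion of the object's end:

```c
#include <stdbool.h>
#include <stddef.h>

struct flex {
  double dummy;
  char c;
  char fam[]; /* __counted_by(c) omitted in this sketch */
};

/* In bounds if the byte lies before the end of the `c` counted elements. */
static bool in_bounds_counted(const struct flex *p, size_t byte_offset) {
  return byte_offset < offsetof(struct flex, fam) + (size_t)p->c;
}

/* In bounds if the byte lies before sizeof(*p) + count. */
static bool in_bounds_sizeof(const struct flex *p, size_t byte_offset) {
  return byte_offset < sizeof(*p) + (size_t)p->c;
}
```

With `c = 7` and the layout from my earlier comment, offset 22 is rejected by 
the first check and accepted by the second.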

https://github.com/llvm/llvm-project/pull/70606
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [CodeGen] Revamp counted_by calculations (PR #70606)

2023-10-31 Thread Yeoul Na via cfe-commits


@@ -859,53 +859,60 @@ CodeGenFunction::emitBuiltinObjectSize(const Expr *E, 
unsigned Type,
   }
 
   if (IsDynamic) {
-LangOptions::StrictFlexArraysLevelKind StrictFlexArraysLevel =
-getLangOpts().getStrictFlexArraysLevel();
-const Expr *Base = E->IgnoreParenImpCasts();
-
-if (FieldDecl *FD = FindCountedByField(Base, StrictFlexArraysLevel)) {
-  const auto *ME = dyn_cast(Base);
-  llvm::Value *ObjectSize = nullptr;
-
-  if (!ME) {
-const auto *DRE = dyn_cast(Base);
-ValueDecl *VD = nullptr;
-
-ObjectSize = ConstantInt::get(
-ResType,
-getContext().getTypeSize(DRE->getType()->getPointeeType()) / 8,
-true);
-
-if (auto *RD = DRE->getType()->getPointeeType()->getAsRecordDecl())
-  VD = RD->getLastField();
-
-Expr *ICE = ImplicitCastExpr::Create(
-getContext(), DRE->getType(), CK_LValueToRValue,
-const_cast(cast(DRE)), nullptr, VK_PRValue,
-FPOptionsOverride());
-ME = MemberExpr::CreateImplicit(getContext(), ICE, true, VD,
-VD->getType(), VK_LValue, OK_Ordinary);
-  }
-
-  // At this point, we know that \p ME is a flexible array member.
-  const auto *ArrayTy = getContext().getAsArrayType(ME->getType());
+// The code generated here calculates the size of a struct with a flexible
+// array member that uses the counted_by attribute. There are two instances
+// we handle:
+//
+//   struct s {
+// unsigned long flags;
+// int count;
+// int array[] __attribute__((counted_by(count)));
+//   }
+//
+//   1) bdos of the flexible array itself:
+//
+// __builtin_dynamic_object_size(p->array, 1) ==
+// p->count * sizeof(*p->array)
+//
+//   2) bdos of the whole struct, including the flexible array:
+//
+// __builtin_dynamic_object_size(p, 1) ==
+//sizeof(*p) + p->count * sizeof(*p->array)

rapidsna wrote:

> __builtin_dynamic_object_size only adds the full struct size to the 
> calculation when the full struct pointer is specified: 
> __builtin_dynamic_object_size(p, 1) == 23. When it's specified on the fam 
> itself (__builtin_dynamic_object_size(p->array, 1)) it returns only the size 
> of the fam (7 in this example). This seems entirely reasonable to me and fits 
> the definition of __builtin_dynamic_object_size:

I know, but the question is why "the full struct size" should include the part 
up to `&s->fam[13]`. That doesn't even conform to how the size of a statically 
initialized struct is determined in C (as @apple-fcloutier's example also 
indicates).

https://github.com/llvm/llvm-project/pull/70606
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [CodeGen] Revamp counted_by calculations (PR #70606)

2023-10-31 Thread Yeoul Na via cfe-commits

https://github.com/rapidsna edited 
https://github.com/llvm/llvm-project/pull/70606
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [CodeGen] Revamp counted_by calculations (PR #70606)

2023-10-31 Thread Yeoul Na via cfe-commits


@@ -859,53 +859,60 @@ CodeGenFunction::emitBuiltinObjectSize(const Expr *E, 
unsigned Type,
   }
 
   if (IsDynamic) {
-LangOptions::StrictFlexArraysLevelKind StrictFlexArraysLevel =
-getLangOpts().getStrictFlexArraysLevel();
-const Expr *Base = E->IgnoreParenImpCasts();
-
-if (FieldDecl *FD = FindCountedByField(Base, StrictFlexArraysLevel)) {
-  const auto *ME = dyn_cast(Base);
-  llvm::Value *ObjectSize = nullptr;
-
-  if (!ME) {
-const auto *DRE = dyn_cast(Base);
-ValueDecl *VD = nullptr;
-
-ObjectSize = ConstantInt::get(
-ResType,
-getContext().getTypeSize(DRE->getType()->getPointeeType()) / 8,
-true);
-
-if (auto *RD = DRE->getType()->getPointeeType()->getAsRecordDecl())
-  VD = RD->getLastField();
-
-Expr *ICE = ImplicitCastExpr::Create(
-getContext(), DRE->getType(), CK_LValueToRValue,
-const_cast(cast(DRE)), nullptr, VK_PRValue,
-FPOptionsOverride());
-ME = MemberExpr::CreateImplicit(getContext(), ICE, true, VD,
-VD->getType(), VK_LValue, OK_Ordinary);
-  }
-
-  // At this point, we know that \p ME is a flexible array member.
-  const auto *ArrayTy = getContext().getAsArrayType(ME->getType());
+// The code generated here calculates the size of a struct with a flexible
+// array member that uses the counted_by attribute. There are two instances
+// we handle:
+//
+//   struct s {
+// unsigned long flags;
+// int count;
+// int array[] __attribute__((counted_by(count)));
+//   }
+//
+//   1) bdos of the flexible array itself:
+//
+// __builtin_dynamic_object_size(p->array, 1) ==
+// p->count * sizeof(*p->array)
+//
+//   2) bdos of the whole struct, including the flexible array:
+//
+// __builtin_dynamic_object_size(p, 1) ==
+//sizeof(*p) + p->count * sizeof(*p->array)

rapidsna wrote:

> Why wouldn't it include the FAM part if the full struct pointer is specified 
> to __bdos? This usage is something that shows up in the Linux kernel (a 
> motivating factor for this feature).

I meant the full struct size should include the FAM part that is specified by 
`__counted_by(7)`, but not more than that (e.g., `&s->fam[13]`).

https://github.com/llvm/llvm-project/pull/70606
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [CodeGen] Revamp counted_by calculations (PR #70606)

2023-10-31 Thread Yeoul Na via cfe-commits


@@ -859,53 +859,60 @@ CodeGenFunction::emitBuiltinObjectSize(const Expr *E, 
unsigned Type,
   }
 
   if (IsDynamic) {
-LangOptions::StrictFlexArraysLevelKind StrictFlexArraysLevel =
-getLangOpts().getStrictFlexArraysLevel();
-const Expr *Base = E->IgnoreParenImpCasts();
-
-if (FieldDecl *FD = FindCountedByField(Base, StrictFlexArraysLevel)) {
-  const auto *ME = dyn_cast(Base);
-  llvm::Value *ObjectSize = nullptr;
-
-  if (!ME) {
-const auto *DRE = dyn_cast(Base);
-ValueDecl *VD = nullptr;
-
-ObjectSize = ConstantInt::get(
-ResType,
-getContext().getTypeSize(DRE->getType()->getPointeeType()) / 8,
-true);
-
-if (auto *RD = DRE->getType()->getPointeeType()->getAsRecordDecl())
-  VD = RD->getLastField();
-
-Expr *ICE = ImplicitCastExpr::Create(
-getContext(), DRE->getType(), CK_LValueToRValue,
-const_cast(cast(DRE)), nullptr, VK_PRValue,
-FPOptionsOverride());
-ME = MemberExpr::CreateImplicit(getContext(), ICE, true, VD,
-VD->getType(), VK_LValue, OK_Ordinary);
-  }
-
-  // At this point, we know that \p ME is a flexible array member.
-  const auto *ArrayTy = getContext().getAsArrayType(ME->getType());
+// The code generated here calculates the size of a struct with a flexible
+// array member that uses the counted_by attribute. There are two instances
+// we handle:
+//
+//   struct s {
+// unsigned long flags;
+// int count;
+// int array[] __attribute__((counted_by(count)));
+//   }
+//
+//   1) bdos of the flexible array itself:
+//
+// __builtin_dynamic_object_size(p->array, 1) ==
+// p->count * sizeof(*p->array)
+//
+//   2) bdos of the whole struct, including the flexible array:
+//
+// __builtin_dynamic_object_size(p, 1) ==
+//sizeof(*p) + p->count * sizeof(*p->array)

rapidsna wrote:

> As for @apple-fcloutier 's example, The compiler would complain that 
> `__counted_by(c)` is applied to a non-flexible array member. If we want that 
> capability, we'll have to add it in at a later date.

I think we might have to, because some structs are both statically and 
dynamically initialized. But the point of the example was to show how struct 
sizes with trailing arrays are normally calculated.

https://github.com/llvm/llvm-project/pull/70606
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [CodeGen] Revamp counted_by calculations (PR #70606)

2023-10-31 Thread Yeoul Na via cfe-commits

https://github.com/rapidsna edited 
https://github.com/llvm/llvm-project/pull/70606
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [CodeGen] Revamp counted_by calculations (PR #70606)

2023-10-31 Thread Yeoul Na via cfe-commits


@@ -859,53 +859,60 @@ CodeGenFunction::emitBuiltinObjectSize(const Expr *E, 
unsigned Type,
   }
 
   if (IsDynamic) {
-LangOptions::StrictFlexArraysLevelKind StrictFlexArraysLevel =
-getLangOpts().getStrictFlexArraysLevel();
-const Expr *Base = E->IgnoreParenImpCasts();
-
-if (FieldDecl *FD = FindCountedByField(Base, StrictFlexArraysLevel)) {
-  const auto *ME = dyn_cast(Base);
-  llvm::Value *ObjectSize = nullptr;
-
-  if (!ME) {
-const auto *DRE = dyn_cast(Base);
-ValueDecl *VD = nullptr;
-
-ObjectSize = ConstantInt::get(
-ResType,
-getContext().getTypeSize(DRE->getType()->getPointeeType()) / 8,
-true);
-
-if (auto *RD = DRE->getType()->getPointeeType()->getAsRecordDecl())
-  VD = RD->getLastField();
-
-Expr *ICE = ImplicitCastExpr::Create(
-getContext(), DRE->getType(), CK_LValueToRValue,
-const_cast(cast(DRE)), nullptr, VK_PRValue,
-FPOptionsOverride());
-ME = MemberExpr::CreateImplicit(getContext(), ICE, true, VD,
-VD->getType(), VK_LValue, OK_Ordinary);
-  }
-
-  // At this point, we know that \p ME is a flexible array member.
-  const auto *ArrayTy = getContext().getAsArrayType(ME->getType());
+// The code generated here calculates the size of a struct with a flexible
+// array member that uses the counted_by attribute. There are two instances
+// we handle:
+//
+//   struct s {
+// unsigned long flags;
+// int count;
+// int array[] __attribute__((counted_by(count)));
+//   }
+//
+//   1) bdos of the flexible array itself:
+//
+// __builtin_dynamic_object_size(p->array, 1) ==
+// p->count * sizeof(*p->array)
+//
+//   2) bdos of the whole struct, including the flexible array:
+//
+// __builtin_dynamic_object_size(p, 1) ==
+//sizeof(*p) + p->count * sizeof(*p->array)

rapidsna wrote:

> @rapidsna That's what this feature does. This is why I'm so confused by this 
> argument. :-)

Sorry, I'm a bit confused. Could you clarify what you meant by "what this 
feature does"?

https://github.com/llvm/llvm-project/pull/70606
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [CodeGen] Revamp counted_by calculations (PR #70606)

2023-10-31 Thread Yeoul Na via cfe-commits


@@ -859,53 +859,60 @@ CodeGenFunction::emitBuiltinObjectSize(const Expr *E, 
unsigned Type,
   }
 
   if (IsDynamic) {
-LangOptions::StrictFlexArraysLevelKind StrictFlexArraysLevel =
-getLangOpts().getStrictFlexArraysLevel();
-const Expr *Base = E->IgnoreParenImpCasts();
-
-if (FieldDecl *FD = FindCountedByField(Base, StrictFlexArraysLevel)) {
-  const auto *ME = dyn_cast(Base);
-  llvm::Value *ObjectSize = nullptr;
-
-  if (!ME) {
-const auto *DRE = dyn_cast(Base);
-ValueDecl *VD = nullptr;
-
-ObjectSize = ConstantInt::get(
-ResType,
-getContext().getTypeSize(DRE->getType()->getPointeeType()) / 8,
-true);
-
-if (auto *RD = DRE->getType()->getPointeeType()->getAsRecordDecl())
-  VD = RD->getLastField();
-
-Expr *ICE = ImplicitCastExpr::Create(
-getContext(), DRE->getType(), CK_LValueToRValue,
-const_cast(cast(DRE)), nullptr, VK_PRValue,
-FPOptionsOverride());
-ME = MemberExpr::CreateImplicit(getContext(), ICE, true, VD,
-VD->getType(), VK_LValue, OK_Ordinary);
-  }
-
-  // At this point, we know that \p ME is a flexible array member.
-  const auto *ArrayTy = getContext().getAsArrayType(ME->getType());
+// The code generated here calculates the size of a struct with a flexible
+// array member that uses the counted_by attribute. There are two instances
+// we handle:
+//
+//   struct s {
+// unsigned long flags;
+// int count;
+// int array[] __attribute__((counted_by(count)));
+//   }
+//
+//   1) bdos of the flexible array itself:
+//
+// __builtin_dynamic_object_size(p->array, 1) ==
+// p->count * sizeof(*p->array)
+//
+//   2) bdos of the whole struct, including the flexible array:
+//
+// __builtin_dynamic_object_size(p, 1) ==
+//sizeof(*p) + p->count * sizeof(*p->array)

rapidsna wrote:

Okay, I think we are talking past each other a little bit. My comment was 
responding to this:

> Why wouldn't it include the FAM part if the full struct pointer is specified 
> to __bdos? 

I'm saying "the full struct size" isn't exactly `struct + fam`, because the fam 
doesn't always start exactly at sizeof(struct) when there is padding in the 
struct due to alignment. 

As we saw in the previous examples, the full struct size is 
`offsetof(struct s, fam) + sizeof(*p->array) * p->count`, with the result 
aligned to `alignof(struct s)`.
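
As a sketch of that calculation, using the `struct flex` layout from my earlier 
comment (`align_up` and `full_struct_size` are hypothetical helpers, and 
`_Alignof` needs C11):

```c
#include <stddef.h>

struct flex {
  double dummy;
  char c;
  char fam[]; /* __counted_by(c) omitted in this sketch */
};

/* Round n up to a multiple of align; align must be a power of two. */
static size_t align_up(size_t n, size_t align) {
  return (n + align - 1) & ~(align - 1);
}

/* offsetof(fam) + count * sizeof(element), aligned to the struct alignment. */
static size_t full_struct_size(size_t count) {
  size_t end = offsetof(struct flex, fam) + count * sizeof(char);
  return align_up(end, _Alignof(struct flex));
}
```

For `count = 7` and an 8-byte-aligned `double`, this is 9 + 7 = 16, already a 
multiple of 8, so the result stays at 16 rather than 23.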


https://github.com/llvm/llvm-project/pull/70606
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [CodeGen] Revamp counted_by calculations (PR #70606)

2023-10-31 Thread Yeoul Na via cfe-commits

https://github.com/rapidsna edited 
https://github.com/llvm/llvm-project/pull/70606
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Driver][BoundsSafety] Add -fbounds-safety-experimental flag (PR #70480)

2023-11-01 Thread Yeoul Na via cfe-commits


@@ -330,6 +330,14 @@ def warn_alias_with_section : Warning<
   "as the %select{aliasee|resolver}2">,
   InGroup;
 
+let CategoryName = "Bounds Safety Issue" in {
+def err_bounds_safety_lang_not_supported : Error<
+  "bounds safety is only supported for C">;
+def warn_bounds_safety_asm_ignored : Warning<
+  "'-fbounds-safety' is ignored for assembly">,

rapidsna wrote:

@MaskRay Thanks. I still need the error for `-fbounds-safety` not being 
supported in C++/Obj-C/etc. in the frontend because it needs to be an error for 
`cc1`.

I will change it so that the unused warning for assembly is handled in the 
driver, like the rest of the unused options for assembly.

https://github.com/llvm/llvm-project/pull/70480
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Driver][BoundsSafety] Add -fbounds-safety-experimental flag (PR #70480)

2023-11-01 Thread Yeoul Na via cfe-commits


@@ -330,6 +330,14 @@ def warn_alias_with_section : Warning<
   "as the %select{aliasee|resolver}2">,
   InGroup;
 
+let CategoryName = "Bounds Safety Issue" in {
+def err_bounds_safety_lang_not_supported : Error<
+  "bounds safety is only supported for C">;
+def warn_bounds_safety_asm_ignored : Warning<
+  "'-fbounds-safety' is ignored for assembly">,

rapidsna wrote:


> Conventionally the language compatibility checking and other checking is 
> performed in Driver, not in Frontend.. If you move the language check to 
> Driver, the diagnostic will be natural since clang integrated assembler uses 
> `ClangAs` instead of `Clang`.

@MaskRay It seems both `Clang` and `ClangAs` are invoked for the `clang` 
command with an assembly input (i.e., `ClangAs` is invoked after `Clang`), so 
most options are already `claimed` in `Clang` and `warning: argument unused 
during compilation` doesn't fire for `clang` in most cases, including 
`-fsanitize=address`.

Also, we need the input language (`InputKind`) to report the right 
diagnostics, so the frontend still seems like the right place to handle this.

https://github.com/llvm/llvm-project/pull/70480
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Driver][BoundsSafety] Add -fbounds-safety-experimental flag (PR #70480)

2023-11-01 Thread Yeoul Na via cfe-commits


@@ -330,6 +330,14 @@ def warn_alias_with_section : Warning<
   "as the %select{aliasee|resolver}2">,
   InGroup;
 
+let CategoryName = "Bounds Safety Issue" in {
+def err_bounds_safety_lang_not_supported : Error<
+  "bounds safety is only supported for C">;
+def warn_bounds_safety_asm_ignored : Warning<
+  "'-fbounds-safety' is ignored for assembly">,

rapidsna wrote:

@MaskRay Sorry, it seems we can still do it in Driver with 
`clang::driver::types`. I'm going to make a change to report the "unused" 
warning from the driver and the "unsupported language" error from the 
frontend. Does that sound okay to you?

https://github.com/llvm/llvm-project/pull/70480


[clang] [Driver][BoundsSafety] Add -fbounds-safety-experimental flag (PR #70480)

2023-11-01 Thread Yeoul Na via cfe-commits


@@ -330,6 +330,14 @@ def warn_alias_with_section : Warning<
   "as the %select{aliasee|resolver}2">,
   InGroup;
 
+let CategoryName = "Bounds Safety Issue" in {
+def err_bounds_safety_lang_not_supported : Error<
+  "bounds safety is only supported for C">;
+def warn_bounds_safety_asm_ignored : Warning<
+  "'-fbounds-safety' is ignored for assembly">,

rapidsna wrote:

> @MaskRay It seems both `Clang` and `ClangAs` are invoked for the `clang` 
> command with an assembly input (i.e., `ClangAs` is invoked after `Clang`) so 
> it seems to me that most options are already `claimed` in `Clang` and so 
> `warning: argument unused during compilation` doesn't seem to fire for 
> `clang` in most cases including `-fsanitize=address`.

I just confirmed that this happens when the input file has a `.s` extension and 
no `-x` is given on the command line. With `-x assembler`, only `ClangAs` is 
launched and the unused option warning fires as expected.

https://github.com/llvm/llvm-project/pull/70480


[clang] [Driver][BoundsSafety] Add -fbounds-safety-experimental flag (PR #70480)

2023-11-01 Thread Yeoul Na via cfe-commits

https://github.com/rapidsna edited 
https://github.com/llvm/llvm-project/pull/70480


[clang] [Driver][BoundsSafety] Add -fbounds-safety-experimental flag (PR #70480)

2023-11-02 Thread Yeoul Na via cfe-commits

https://github.com/rapidsna updated 
https://github.com/llvm/llvm-project/pull/70480

>From 99ec6e055dd32a86bf6d589a6895658dcbe1d7bd Mon Sep 17 00:00:00 2001
From: Yeoul Na 
Date: Fri, 27 Oct 2023 08:34:37 -0700
Subject: [PATCH 1/5] [Driver][BoundsSafety] Add -fbounds-safety-experimental
 flag

-fbounds-safety-experimental is an experimental flag for
-fbounds-safety, which is a bounds-safety extension for C.
-fbounds-safety will require substantial changes across the Clang
codebase, so we introduce this experimental flag to gate our
incremental patches until we push the essential functionality of
the extension.

-fbounds-safety-experimental currently does nothing except
report an error when the flag is used with an unsupported
source language (only C is currently supported).
---
 .../clang/Basic/DiagnosticFrontendKinds.td|  3 +++
 clang/include/clang/Basic/LangOptions.def |  2 ++
 clang/include/clang/Driver/Options.td |  8 +++
 clang/lib/Driver/ToolChains/Clang.cpp |  3 +++
 clang/lib/Frontend/CompilerInvocation.cpp | 23 +++
 clang/test/BoundsSafety/Driver/driver.c   |  9 
 .../Frontend/only_c_is_supported.c| 15 
 7 files changed, 63 insertions(+)
 create mode 100644 clang/test/BoundsSafety/Driver/driver.c
 create mode 100644 clang/test/BoundsSafety/Frontend/only_c_is_supported.c

diff --git a/clang/include/clang/Basic/DiagnosticFrontendKinds.td 
b/clang/include/clang/Basic/DiagnosticFrontendKinds.td
index 715e0c0dc8fa84e..edcbbe992377e12 100644
--- a/clang/include/clang/Basic/DiagnosticFrontendKinds.td
+++ b/clang/include/clang/Basic/DiagnosticFrontendKinds.td
@@ -330,6 +330,9 @@ def warn_alias_with_section : Warning<
   "as the %select{aliasee|resolver}2">,
   InGroup;
 
+def error_bounds_safety_lang_not_supported : Error<
+  "bounds safety is only supported for C">;
+
 let CategoryName = "Instrumentation Issue" in {
 def warn_profile_data_out_of_date : Warning<
   "profile data may be out of date: of %0 function%s0, %1 
%plural{1:has|:have}1"
diff --git a/clang/include/clang/Basic/LangOptions.def 
b/clang/include/clang/Basic/LangOptions.def
index c0ea4ecb9806a5b..222812d876a65f8 100644
--- a/clang/include/clang/Basic/LangOptions.def
+++ b/clang/include/clang/Basic/LangOptions.def
@@ -470,6 +470,8 @@ VALUE_LANGOPT(FuchsiaAPILevel, 32, 0, "Fuchsia API level")
 // on large _BitInts.
 BENIGN_VALUE_LANGOPT(MaxBitIntWidth, 32, 128, "Maximum width of a _BitInt")
 
+LANGOPT(BoundsSafety, 1, 0, "Bounds safety extension for C")
+
 LANGOPT(IncrementalExtensions, 1, 0, " True if we want to process statements"
 "on the global scope, ignore EOF token and continue later on (thus "
 "avoid tearing the Lexer and etc. down). Controlled by "
diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 7f3f5125d42e7a9..3eb98c8ee2950a1 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -1732,6 +1732,14 @@ def fswift_async_fp_EQ : Joined<["-"], 
"fswift-async-fp=">,
 NormalizedValues<["Auto", "Always", "Never"]>,
 MarshallingInfoEnum, "Always">;
 
+defm bounds_safety : BoolFOption<
+  "bounds-safety-experimental",
+  LangOpts<"BoundsSafety">, DefaultFalse,
+  PosFlag,
+  NegFlag,
+  BothFlags<[], [ClangOption, CC1Option],
+  " experimental bounds safety extension for C">>;
+
 defm addrsig : BoolFOption<"addrsig",
   CodeGenOpts<"Addrsig">, DefaultFalse,
   PosFlag,
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp 
b/clang/lib/Driver/ToolChains/Clang.cpp
index 43a92adbef64ba8..7482b852fb37958 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -6689,6 +6689,9 @@ void Clang::ConstructJob(Compilation &C, const JobAction 
&JA,
   Args.addOptOutFlag(CmdArgs, options::OPT_fassume_sane_operator_new,
  options::OPT_fno_assume_sane_operator_new);
 
+  Args.addOptInFlag(CmdArgs, options::OPT_fbounds_safety,
+options::OPT_fno_bounds_safety);
+
   // -fblocks=0 is default.
   if (Args.hasFlag(options::OPT_fblocks, options::OPT_fno_blocks,
TC.IsBlocksDefault()) ||
diff --git a/clang/lib/Frontend/CompilerInvocation.cpp 
b/clang/lib/Frontend/CompilerInvocation.cpp
index fd6c250efeda2a8..f785bd504d63a81 100644
--- a/clang/lib/Frontend/CompilerInvocation.cpp
+++ b/clang/lib/Frontend/CompilerInvocation.cpp
@@ -3618,6 +3618,23 @@ void CompilerInvocationBase::GenerateLangArgs(const 
LangOptions &Opts,
 GenerateArg(Consumer, OPT_frandomize_layout_seed_EQ, Opts.RandstructSeed);
 }
 
+static bool SupportsBoundsSafety(Language Lang) {
+  // Currently, bounds safety is only supported for C. However, it's also
+  // possible to pass assembly files and LLVM IR through Clang, and
+  // those should be trivially supported. This is especially important because
+  // some build systems, like xcbuild and somewhat clumsy Makefiles, will pass
+ 

[clang] [Driver][BoundsSafety] Add -fbounds-safety-experimental flag (PR #70480)

2023-11-02 Thread Yeoul Na via cfe-commits


@@ -0,0 +1,11 @@
+// RUN: %clang -c %s -### 2>&1 | FileCheck -check-prefix T0 %s

rapidsna wrote:

@nickdesaulniers @MaskRay Thank you! I removed the new directory and moved the 
tests to conform to the existing layout.

https://github.com/llvm/llvm-project/pull/70480


[clang] [Driver][BoundsSafety] Add -fbounds-safety-experimental flag (PR #70480)

2023-11-02 Thread Yeoul Na via cfe-commits


@@ -330,6 +330,14 @@ def warn_alias_with_section : Warning<
   "as the %select{aliasee|resolver}2">,
   InGroup;
 
+let CategoryName = "Bounds Safety Issue" in {
+def err_bounds_safety_lang_not_supported : Error<
+  "bounds safety is only supported for C">;
+def warn_bounds_safety_asm_ignored : Warning<
+  "'-fbounds-safety' is ignored for assembly">,

rapidsna wrote:

@nickdesaulniers I managed to move the `-x ir` test into the `.c` test. I'm not 
sure why, but adding the `-###` option seemed to make it work.

https://github.com/llvm/llvm-project/pull/70480


[clang] [Driver][BoundsSafety] Add -fbounds-safety-experimental flag (PR #70480)

2023-11-02 Thread Yeoul Na via cfe-commits

rapidsna wrote:

> The other experimental flags I see start with experimental, this one ends 
> with it. Why isn't this called `-fexperimental-bounds-safety`?

@tbaederr Thank you! I just renamed the flag to `-fexperimental-bounds-safety`.

https://github.com/llvm/llvm-project/pull/70480


[clang] [Driver][BoundsSafety] Add -fbounds-safety-experimental flag (PR #70480)

2023-11-02 Thread Yeoul Na via cfe-commits


@@ -0,0 +1,25 @@
+// RUN: not %clang -fbounds-safety-experimental -x c++ %s 2>&1 | FileCheck 
-check-prefix ERR %s
+

rapidsna wrote:

@MaskRay Thank you! I removed the blank lines.

https://github.com/llvm/llvm-project/pull/70480


[clang] [Driver][BoundsSafety] Add -fbounds-safety-experimental flag (PR #70480)

2023-11-02 Thread Yeoul Na via cfe-commits


@@ -330,6 +330,14 @@ def warn_alias_with_section : Warning<
   "as the %select{aliasee|resolver}2">,
   InGroup;
 
+let CategoryName = "Bounds Safety Issue" in {
+def err_bounds_safety_lang_not_supported : Error<
+  "bounds safety is only supported for C">;
+def warn_bounds_safety_asm_ignored : Warning<
+  "'-fbounds-safety' is ignored for assembly">,

rapidsna wrote:

@MaskRay I removed the frontend warning so that this follows how the driver 
handles unused flags. Now `warning: argument unused during compilation` fires 
only for `-x assembler` and not for `-x assembler-with-cpp`, which matches the 
behavior of most other options.

https://github.com/llvm/llvm-project/pull/70480


[clang] [Driver][BoundsSafety] Add -fbounds-safety-experimental flag (PR #70480)

2023-11-02 Thread Yeoul Na via cfe-commits

https://github.com/rapidsna edited 
https://github.com/llvm/llvm-project/pull/70480


[clang] [Driver][BoundsSafety] Add -fbounds-safety-experimental flag (PR #70480)

2023-11-02 Thread Yeoul Na via cfe-commits


@@ -0,0 +1,5 @@
+; RUN: %clang -fbounds-safety-experimental -x ir -S %s -o /dev/null 2>&1 | 
FileCheck %s
+; RUN: %clang_cc1 -fbounds-safety-experimental -x ir -S %s -o /dev/null 2>&1 | 
FileCheck %s
+

rapidsna wrote:

@MaskRay fixed!

https://github.com/llvm/llvm-project/pull/70480


[clang] [Driver][BoundsSafety] Add -fbounds-safety-experimental flag (PR #70480)

2023-11-02 Thread Yeoul Na via cfe-commits


@@ -0,0 +1,12 @@
+// This reports a warning to follow the default behavior of ClangAs.
+// RUN: %clang -fexperimental-bounds-safety -x assembler -c %s -o /dev/null 
2>&1 | FileCheck -check-prefix WARN %s
+
+
+// WARN: warning: argument unused during compilation: 
'-fexperimental-bounds-safety'
+
+// expected-no-diagnostics
+// RUN: %clang -fexperimental-bounds-safety -Xclang -verify -c -x c %s -o 
/dev/null
+// Unlike '-x assembler', '-x assembler-with-cpp' silently ignores unused 
options by default.

rapidsna wrote:

@nickdesaulniers The unused warning doesn't appear for `-x assembler-with-cpp`, 
which follows the common behavior. Instead, I added a note here that we need to 
add a targeted warning in the future for when the assembler tries to use 
preprocessor directives to check whether bounds safety is enabled. WDYT?

https://github.com/llvm/llvm-project/pull/70480


[clang] [Driver][BoundsSafety] Add -fbounds-safety-experimental flag (PR #70480)

2023-11-02 Thread Yeoul Na via cfe-commits

https://github.com/rapidsna updated 
https://github.com/llvm/llvm-project/pull/70480

>From 99ec6e055dd32a86bf6d589a6895658dcbe1d7bd Mon Sep 17 00:00:00 2001
From: Yeoul Na 
Date: Fri, 27 Oct 2023 08:34:37 -0700
Subject: [PATCH 1/6] [Driver][BoundsSafety] Add -fbounds-safety-experimental
 flag

-fbounds-safety-experimental is an experimental flag for
-fbounds-safety, which is a bounds-safety extension for C.
-fbounds-safety will require substantial changes across the Clang
codebase, so we introduce this experimental flag to gate our
incremental patches until we push the essential functionality of
the extension.

-fbounds-safety-experimental currently does nothing except
report an error when the flag is used with an unsupported
source language (only C is currently supported).
---
 .../clang/Basic/DiagnosticFrontendKinds.td|  3 +++
 clang/include/clang/Basic/LangOptions.def |  2 ++
 clang/include/clang/Driver/Options.td |  8 +++
 clang/lib/Driver/ToolChains/Clang.cpp |  3 +++
 clang/lib/Frontend/CompilerInvocation.cpp | 23 +++
 clang/test/BoundsSafety/Driver/driver.c   |  9 
 .../Frontend/only_c_is_supported.c| 15 
 7 files changed, 63 insertions(+)
 create mode 100644 clang/test/BoundsSafety/Driver/driver.c
 create mode 100644 clang/test/BoundsSafety/Frontend/only_c_is_supported.c

diff --git a/clang/include/clang/Basic/DiagnosticFrontendKinds.td 
b/clang/include/clang/Basic/DiagnosticFrontendKinds.td
index 715e0c0dc8fa84e..edcbbe992377e12 100644
--- a/clang/include/clang/Basic/DiagnosticFrontendKinds.td
+++ b/clang/include/clang/Basic/DiagnosticFrontendKinds.td
@@ -330,6 +330,9 @@ def warn_alias_with_section : Warning<
   "as the %select{aliasee|resolver}2">,
   InGroup;
 
+def error_bounds_safety_lang_not_supported : Error<
+  "bounds safety is only supported for C">;
+
 let CategoryName = "Instrumentation Issue" in {
 def warn_profile_data_out_of_date : Warning<
   "profile data may be out of date: of %0 function%s0, %1 
%plural{1:has|:have}1"
diff --git a/clang/include/clang/Basic/LangOptions.def 
b/clang/include/clang/Basic/LangOptions.def
index c0ea4ecb9806a5b..222812d876a65f8 100644
--- a/clang/include/clang/Basic/LangOptions.def
+++ b/clang/include/clang/Basic/LangOptions.def
@@ -470,6 +470,8 @@ VALUE_LANGOPT(FuchsiaAPILevel, 32, 0, "Fuchsia API level")
 // on large _BitInts.
 BENIGN_VALUE_LANGOPT(MaxBitIntWidth, 32, 128, "Maximum width of a _BitInt")
 
+LANGOPT(BoundsSafety, 1, 0, "Bounds safety extension for C")
+
 LANGOPT(IncrementalExtensions, 1, 0, " True if we want to process statements"
 "on the global scope, ignore EOF token and continue later on (thus "
 "avoid tearing the Lexer and etc. down). Controlled by "
diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 7f3f5125d42e7a9..3eb98c8ee2950a1 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -1732,6 +1732,14 @@ def fswift_async_fp_EQ : Joined<["-"], 
"fswift-async-fp=">,
 NormalizedValues<["Auto", "Always", "Never"]>,
 MarshallingInfoEnum, "Always">;
 
+defm bounds_safety : BoolFOption<
+  "bounds-safety-experimental",
+  LangOpts<"BoundsSafety">, DefaultFalse,
+  PosFlag,
+  NegFlag,
+  BothFlags<[], [ClangOption, CC1Option],
+  " experimental bounds safety extension for C">>;
+
 defm addrsig : BoolFOption<"addrsig",
   CodeGenOpts<"Addrsig">, DefaultFalse,
   PosFlag,
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp 
b/clang/lib/Driver/ToolChains/Clang.cpp
index 43a92adbef64ba8..7482b852fb37958 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -6689,6 +6689,9 @@ void Clang::ConstructJob(Compilation &C, const JobAction 
&JA,
   Args.addOptOutFlag(CmdArgs, options::OPT_fassume_sane_operator_new,
  options::OPT_fno_assume_sane_operator_new);
 
+  Args.addOptInFlag(CmdArgs, options::OPT_fbounds_safety,
+options::OPT_fno_bounds_safety);
+
   // -fblocks=0 is default.
   if (Args.hasFlag(options::OPT_fblocks, options::OPT_fno_blocks,
TC.IsBlocksDefault()) ||
diff --git a/clang/lib/Frontend/CompilerInvocation.cpp 
b/clang/lib/Frontend/CompilerInvocation.cpp
index fd6c250efeda2a8..f785bd504d63a81 100644
--- a/clang/lib/Frontend/CompilerInvocation.cpp
+++ b/clang/lib/Frontend/CompilerInvocation.cpp
@@ -3618,6 +3618,23 @@ void CompilerInvocationBase::GenerateLangArgs(const 
LangOptions &Opts,
 GenerateArg(Consumer, OPT_frandomize_layout_seed_EQ, Opts.RandstructSeed);
 }
 
+static bool SupportsBoundsSafety(Language Lang) {
+  // Currently, bounds safety is only supported for C. However, it's also
+  // possible to pass assembly files and LLVM IR through Clang, and
+  // those should be trivially supported. This is especially important because
+  // some build systems, like xcbuild and somewhat clumsy Makefiles, will pass
+ 

[clang] [CodeGen] Revamp counted_by calculations (PR #70606)

2023-11-02 Thread Yeoul Na via cfe-commits

rapidsna wrote:

> @kees, @nickdesaulniers, @rapidsna, and @apple-fcloutier Should this feature 
> support a `__bdos` to an address inside the FAM?
> 
> ```
> #include <stdio.h>
> #include <stdlib.h>
> 
> struct flex {
>   double dummy;
>   char count;
>   char fam[] __attribute__((counted_by(count)));
> };
> 
> int main() {
>   struct flex *f = malloc(sizeof(struct flex) + 42 * sizeof(char));
> 
>   f->count = 42;
>   printf("__bdos(&f->fam[3], 1) == %lu\n",
>          __builtin_dynamic_object_size(&f->fam[3], 1));
>   return 0;
> }
> ```

Supporting it in the same way that constant-sized arrays are currently handled 
makes sense to me.

https://github.com/llvm/llvm-project/pull/70606


[clang] [Driver][BoundsSafety] Add -fbounds-safety-experimental flag (PR #70480)

2023-11-02 Thread Yeoul Na via cfe-commits


@@ -0,0 +1,12 @@
+// This reports a warning to follow the default behavior of ClangAs.
+// RUN: %clang -fexperimental-bounds-safety -x assembler -c %s -o /dev/null 
2>&1 | FileCheck -check-prefix WARN %s

rapidsna wrote:

I think `-x assembler` and `-x assembler-with-cpp` are more specific and should 
cover what I want. 

A `.s` input is treated as `-x assembler` (except on Darwin, which uses 
`-x assembler-with-cpp` for `.s`), but that specific behavior should be tested 
separately, not in this PR.

https://github.com/llvm/llvm-project/pull/70480


[clang] [Driver][BoundsSafety] Add -fbounds-safety-experimental flag (PR #70480)

2023-11-02 Thread Yeoul Na via cfe-commits

https://github.com/rapidsna edited 
https://github.com/llvm/llvm-project/pull/70480


[clang] [CodeGen] Revamp counted_by calculations (PR #70606)

2023-11-02 Thread Yeoul Na via cfe-commits

rapidsna wrote:

```
#include <stdio.h>
#include <stdlib.h>

struct flex {
  int c;
  int fam[] __attribute__((counted_by(c)));
};

int main() {
  struct flex *p = (struct flex *)malloc(sizeof(struct flex) + sizeof(int) * 10);
  p->c = 100;
  // 40: size from malloc, but it only contains the array part. Shouldn't it
  // be 44 to include the entire object size?
  printf("%lu\n", __builtin_dynamic_object_size(&p->fam[0], 0));
  // 40: size from malloc
  printf("%lu\n", __builtin_dynamic_object_size(&p->fam[0], 1));
  // 404: size from counted_by
  printf("%lu\n", __builtin_dynamic_object_size(p, 0));
}
```

@bwendling It could be tracked as a separate issue, but there seem to be some 
inconsistencies in where bdos is derived from. For `p`, the `counted_by` 
attribute seems to win over malloc, but for `&p->fam[0]`, malloc seems to win. 

https://godbolt.org/z/G7WfY4faE

https://github.com/llvm/llvm-project/pull/70606

