[clang] [llvm] [IR] Fix GEP offset computations for vector GEPs (PR #75448)

2023-12-14 Thread Jannik Silvanus via cfe-commits

https://github.com/jasilvanus created 
https://github.com/llvm/llvm-project/pull/75448

Vectors are always bit-packed and don't respect the elements' alignment
requirements. This is different from arrays. This means offsets of vector GEPs
need to be computed differently than offsets of array GEPs.

This PR fixes many places that rely on an incorrect pattern
that always relies on `DL.getTypeAllocSize(GTI.getIndexedType())`.
We replace these by usages of  `GTI.getSequentialElementStride(DL)`, 
which is a new helper function added in this PR.

This changes behavior for GEPs into vectors with element types for which the
(bit) size and alloc size is different. This includes two cases:

* Types with a bit size that is not a multiple of a byte, e.g. i1.
  GEPs into such vectors are questionable to begin with, as some elements
  are not even addressable.
* Overaligned types, e.g. i16 with 32-bit alignment.

Existing tests are unaffected, but a miscompilation of a new precommitted test 
is fixed.

>From 2c367fba42b716d803ee088af45c1b57fe4bcbcd Mon Sep 17 00:00:00 2001
From: Jannik Silvanus 
Date: Thu, 14 Dec 2023 09:24:51 +0100
Subject: [PATCH 1/3] [InstCombine] Precommit test exhibiting miscompile

InstCombine is determining incorrect byte offsets for GEPs
into vectors of overaligned elements.
Add a new testcase showing this behavior, serving as precommit
for a fix.
---
 .../test/Transforms/InstCombine/getelementptr.ll | 16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/llvm/test/Transforms/InstCombine/getelementptr.ll 
b/llvm/test/Transforms/InstCombine/getelementptr.ll
index bc7fdc9352df6c..7c0d95973d5cf3 100644
--- a/llvm/test/Transforms/InstCombine/getelementptr.ll
+++ b/llvm/test/Transforms/InstCombine/getelementptr.ll
@@ -1,7 +1,7 @@
 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
 ; RUN: opt < %s -passes=instcombine -S | FileCheck %s
 
-target datalayout = "e-p:64:64-p1:16:16-p2:32:32:32-p3:64:64:64"
+target datalayout = "e-p:64:64-p1:16:16-p2:32:32:32-p3:64:64:64-f16:32"
 
 %intstruct = type { i32 }
 %pair = type { i32, i32 }
@@ -111,6 +111,20 @@ define void @test_evaluate_gep_as_ptrs_array(ptr 
addrspace(2) %B) {
   ret void
 }
 
+define void @test_overaligned_vec(i8 %B) {
+; This should be turned into a constexpr instead of being an 
instruction
+; CHECK-LABEL: @test_overaligned_vec(
+; TODO: In this test case, half is overaligned to 32 bits.
+;   Vectors are bit-packed and don't respect alignment.
+;   Thus, the byte offset of the second half in <2 x half> is 2 bytes, not 
4 bytes:
+; CHECK-NEXT:store i8 [[B:%.*]], ptr getelementptr inbounds ([10 x i8], 
ptr @Global, i64 0, i64 4), align 1
+; CHECK-NEXT:ret void
+;
+  %A = getelementptr <2 x half>, ptr @Global, i64 0, i64 1
+  store i8 %B, ptr %A
+  ret void
+}
+
 define ptr @test7(ptr %I, i64 %C, i64 %D) {
 ; CHECK-LABEL: @test7(
 ; CHECK-NEXT:[[A:%.*]] = getelementptr i32, ptr [[I:%.*]], i64 [[C:%.*]]

>From 2e80e5846d23946304bbb9930d0a12098a2f16bd Mon Sep 17 00:00:00 2001
From: Jannik Silvanus 
Date: Thu, 14 Dec 2023 09:29:59 +0100
Subject: [PATCH 2/3] [IR]: Add
 generic_gep_type_iterator::getSequentialElementStride

This prepares a fix to GEP offset computations on vectors
of overaligned elements.

We have many places that analyze GEP offsets using GEP iterators
with the following pattern:

  GTI = gep_type_begin(ElemTy, Indices),
  GTE = gep_type_end(ElemTy, Indices);
  for (; GTI != GTE; ++GTI) {
if (StructType *STy = GTI.getStructTypeOrNull()) {
   // handle struct
   [..]
} else {
  // handle sequential (outmost index, array, vector):
  auto Stride = DL.getTypeAllocSize(GTI.getIndexedType());
  Offset += Index * Size;
}
  }

This is incorrect for vectors of types whose bit size does not
equal its alloc size (e.g. overaligned types), as vectors
always bit-pack their elements.

This patch introduces new functions generic_gep_type_iterator::isVector()
and generic_gep_type_iterator::getSequentialElementStride(const DataLayout &)
to fix these patterns without having to teach all these places about
the specifics of vector bit layouts. With these helpers, the pattern
above can be fixed by replacing the stride computation:

  auto Stride = GTI.getSequentialElementStride(DL);
---
 .../llvm/IR/GetElementPtrTypeIterator.h   | 57 ++-
 1 file changed, 54 insertions(+), 3 deletions(-)

diff --git a/llvm/include/llvm/IR/GetElementPtrTypeIterator.h 
b/llvm/include/llvm/IR/GetElementPtrTypeIterator.h
index f3272327c3f8b2..5b63ccb182a842 100644
--- a/llvm/include/llvm/IR/GetElementPtrTypeIterator.h
+++ b/llvm/include/llvm/IR/GetElementPtrTypeIterator.h
@@ -16,6 +16,7 @@
 
 #include "llvm/ADT/ArrayRef.h"
 #include "llvm/ADT/PointerUnion.h"
+#include "llvm/IR/DataLayout.h"
 #include "llvm/IR/DerivedTypes.h"
 #include "llvm/IR/Operator.h"
 #include "llvm/IR/User.h"
@@ -30,7 +31,39 @@ template 
 class generic_gep_type_iterat

[clang] [llvm] [IR] Fix GEP offset computations for vector GEPs (PR #75448)

2023-12-14 Thread Jannik Silvanus via cfe-commits

https://github.com/jasilvanus updated 
https://github.com/llvm/llvm-project/pull/75448

>From 2c367fba42b716d803ee088af45c1b57fe4bcbcd Mon Sep 17 00:00:00 2001
From: Jannik Silvanus 
Date: Thu, 14 Dec 2023 09:24:51 +0100
Subject: [PATCH 1/3] [InstCombine] Precommit test exhibiting miscompile

InstCombine is determining incorrect byte offsets for GEPs
into vectors of overaligned elements.
Add a new testcase showing this behavior, serving as precommit
for a fix.
---
 .../test/Transforms/InstCombine/getelementptr.ll | 16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/llvm/test/Transforms/InstCombine/getelementptr.ll 
b/llvm/test/Transforms/InstCombine/getelementptr.ll
index bc7fdc9352df6c..7c0d95973d5cf3 100644
--- a/llvm/test/Transforms/InstCombine/getelementptr.ll
+++ b/llvm/test/Transforms/InstCombine/getelementptr.ll
@@ -1,7 +1,7 @@
 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
 ; RUN: opt < %s -passes=instcombine -S | FileCheck %s
 
-target datalayout = "e-p:64:64-p1:16:16-p2:32:32:32-p3:64:64:64"
+target datalayout = "e-p:64:64-p1:16:16-p2:32:32:32-p3:64:64:64-f16:32"
 
 %intstruct = type { i32 }
 %pair = type { i32, i32 }
@@ -111,6 +111,20 @@ define void @test_evaluate_gep_as_ptrs_array(ptr 
addrspace(2) %B) {
   ret void
 }
 
+define void @test_overaligned_vec(i8 %B) {
+; This should be turned into a constexpr instead of being an 
instruction
+; CHECK-LABEL: @test_overaligned_vec(
+; TODO: In this test case, half is overaligned to 32 bits.
+;   Vectors are bit-packed and don't respect alignment.
+;   Thus, the byte offset of the second half in <2 x half> is 2 bytes, not 
4 bytes:
+; CHECK-NEXT:store i8 [[B:%.*]], ptr getelementptr inbounds ([10 x i8], 
ptr @Global, i64 0, i64 4), align 1
+; CHECK-NEXT:ret void
+;
+  %A = getelementptr <2 x half>, ptr @Global, i64 0, i64 1
+  store i8 %B, ptr %A
+  ret void
+}
+
 define ptr @test7(ptr %I, i64 %C, i64 %D) {
 ; CHECK-LABEL: @test7(
 ; CHECK-NEXT:[[A:%.*]] = getelementptr i32, ptr [[I:%.*]], i64 [[C:%.*]]

>From 2e80e5846d23946304bbb9930d0a12098a2f16bd Mon Sep 17 00:00:00 2001
From: Jannik Silvanus 
Date: Thu, 14 Dec 2023 09:29:59 +0100
Subject: [PATCH 2/3] [IR]: Add
 generic_gep_type_iterator::getSequentialElementStride

This prepares a fix to GEP offset computations on vectors
of overaligned elements.

We have many places that analyze GEP offsets using GEP iterators
with the following pattern:

  GTI = gep_type_begin(ElemTy, Indices),
  GTE = gep_type_end(ElemTy, Indices);
  for (; GTI != GTE; ++GTI) {
if (StructType *STy = GTI.getStructTypeOrNull()) {
   // handle struct
   [..]
} else {
  // handle sequential (outmost index, array, vector):
  auto Stride = DL.getTypeAllocSize(GTI.getIndexedType());
  Offset += Index * Size;
}
  }

This is incorrect for vectors of types whose bit size does not
equal its alloc size (e.g. overaligned types), as vectors
always bit-pack their elements.

This patch introduces new functions generic_gep_type_iterator::isVector()
and generic_gep_type_iterator::getSequentialElementStride(const DataLayout &)
to fix these patterns without having to teach all these places about
the specifics of vector bit layouts. With these helpers, the pattern
above can be fixed by replacing the stride computation:

  auto Stride = GTI.getSequentialElementStride(DL);
---
 .../llvm/IR/GetElementPtrTypeIterator.h   | 57 ++-
 1 file changed, 54 insertions(+), 3 deletions(-)

diff --git a/llvm/include/llvm/IR/GetElementPtrTypeIterator.h 
b/llvm/include/llvm/IR/GetElementPtrTypeIterator.h
index f3272327c3f8b2..5b63ccb182a842 100644
--- a/llvm/include/llvm/IR/GetElementPtrTypeIterator.h
+++ b/llvm/include/llvm/IR/GetElementPtrTypeIterator.h
@@ -16,6 +16,7 @@
 
 #include "llvm/ADT/ArrayRef.h"
 #include "llvm/ADT/PointerUnion.h"
+#include "llvm/IR/DataLayout.h"
 #include "llvm/IR/DerivedTypes.h"
 #include "llvm/IR/Operator.h"
 #include "llvm/IR/User.h"
@@ -30,7 +31,39 @@ template 
 class generic_gep_type_iterator {
 
   ItTy OpIt;
-  PointerUnion CurTy;
+  // We use two different mechanisms to store the type a GEP index applies to.
+  // In some cases, we need to know the outer aggregate type the index is
+  // applied within, e.g. a struct. In such cases, we store the aggregate type
+  // in the iterator, and derive the element type on the fly.
+  //
+  // However, this is not always possible, because for the outermost index 
there
+  // is no containing type. In such cases, or if the containing type is not
+  // relevant, e.g. for arrays, the element type is stored as Type* in CurTy.
+  //
+  // If CurTy contains a Type* value, this does not imply anything about the
+  // type itself, because it is the element type and not the outer type.
+  // In particular, Type* can be a struct type.
+  //
+  // Consider this example:
+  //
+  //%my.struct = type { i32, [ 4 x float ] }
+  //[...]
+  //%gep = 

[clang] [llvm] [IR] Fix GEP offset computations for vector GEPs (PR #75448)

2023-12-14 Thread Jannik Silvanus via cfe-commits

https://github.com/jasilvanus updated 
https://github.com/llvm/llvm-project/pull/75448

>From 2c367fba42b716d803ee088af45c1b57fe4bcbcd Mon Sep 17 00:00:00 2001
From: Jannik Silvanus 
Date: Thu, 14 Dec 2023 09:24:51 +0100
Subject: [PATCH 1/3] [InstCombine] Precommit test exhibiting miscompile

InstCombine is determining incorrect byte offsets for GEPs
into vectors of overaligned elements.
Add a new testcase showing this behavior, serving as precommit
for a fix.
---
 .../test/Transforms/InstCombine/getelementptr.ll | 16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/llvm/test/Transforms/InstCombine/getelementptr.ll 
b/llvm/test/Transforms/InstCombine/getelementptr.ll
index bc7fdc9352df6c..7c0d95973d5cf3 100644
--- a/llvm/test/Transforms/InstCombine/getelementptr.ll
+++ b/llvm/test/Transforms/InstCombine/getelementptr.ll
@@ -1,7 +1,7 @@
 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
 ; RUN: opt < %s -passes=instcombine -S | FileCheck %s
 
-target datalayout = "e-p:64:64-p1:16:16-p2:32:32:32-p3:64:64:64"
+target datalayout = "e-p:64:64-p1:16:16-p2:32:32:32-p3:64:64:64-f16:32"
 
 %intstruct = type { i32 }
 %pair = type { i32, i32 }
@@ -111,6 +111,20 @@ define void @test_evaluate_gep_as_ptrs_array(ptr 
addrspace(2) %B) {
   ret void
 }
 
+define void @test_overaligned_vec(i8 %B) {
+; This should be turned into a constexpr instead of being an 
instruction
+; CHECK-LABEL: @test_overaligned_vec(
+; TODO: In this test case, half is overaligned to 32 bits.
+;   Vectors are bit-packed and don't respect alignment.
+;   Thus, the byte offset of the second half in <2 x half> is 2 bytes, not 
4 bytes:
+; CHECK-NEXT:store i8 [[B:%.*]], ptr getelementptr inbounds ([10 x i8], 
ptr @Global, i64 0, i64 4), align 1
+; CHECK-NEXT:ret void
+;
+  %A = getelementptr <2 x half>, ptr @Global, i64 0, i64 1
+  store i8 %B, ptr %A
+  ret void
+}
+
 define ptr @test7(ptr %I, i64 %C, i64 %D) {
 ; CHECK-LABEL: @test7(
 ; CHECK-NEXT:[[A:%.*]] = getelementptr i32, ptr [[I:%.*]], i64 [[C:%.*]]

>From 2e80e5846d23946304bbb9930d0a12098a2f16bd Mon Sep 17 00:00:00 2001
From: Jannik Silvanus 
Date: Thu, 14 Dec 2023 09:29:59 +0100
Subject: [PATCH 2/3] [IR]: Add
 generic_gep_type_iterator::getSequentialElementStride

This prepares a fix to GEP offset computations on vectors
of overaligned elements.

We have many places that analyze GEP offsets using GEP iterators
with the following pattern:

  GTI = gep_type_begin(ElemTy, Indices),
  GTE = gep_type_end(ElemTy, Indices);
  for (; GTI != GTE; ++GTI) {
if (StructType *STy = GTI.getStructTypeOrNull()) {
   // handle struct
   [..]
} else {
  // handle sequential (outmost index, array, vector):
  auto Stride = DL.getTypeAllocSize(GTI.getIndexedType());
  Offset += Index * Size;
}
  }

This is incorrect for vectors of types whose bit size does not
equal its alloc size (e.g. overaligned types), as vectors
always bit-pack their elements.

This patch introduces new functions generic_gep_type_iterator::isVector()
and generic_gep_type_iterator::getSequentialElementStride(const DataLayout &)
to fix these patterns without having to teach all these places about
the specifics of vector bit layouts. With these helpers, the pattern
above can be fixed by replacing the stride computation:

  auto Stride = GTI.getSequentialElementStride(DL);
---
 .../llvm/IR/GetElementPtrTypeIterator.h   | 57 ++-
 1 file changed, 54 insertions(+), 3 deletions(-)

diff --git a/llvm/include/llvm/IR/GetElementPtrTypeIterator.h 
b/llvm/include/llvm/IR/GetElementPtrTypeIterator.h
index f3272327c3f8b2..5b63ccb182a842 100644
--- a/llvm/include/llvm/IR/GetElementPtrTypeIterator.h
+++ b/llvm/include/llvm/IR/GetElementPtrTypeIterator.h
@@ -16,6 +16,7 @@
 
 #include "llvm/ADT/ArrayRef.h"
 #include "llvm/ADT/PointerUnion.h"
+#include "llvm/IR/DataLayout.h"
 #include "llvm/IR/DerivedTypes.h"
 #include "llvm/IR/Operator.h"
 #include "llvm/IR/User.h"
@@ -30,7 +31,39 @@ template 
 class generic_gep_type_iterator {
 
   ItTy OpIt;
-  PointerUnion CurTy;
+  // We use two different mechanisms to store the type a GEP index applies to.
+  // In some cases, we need to know the outer aggregate type the index is
+  // applied within, e.g. a struct. In such cases, we store the aggregate type
+  // in the iterator, and derive the element type on the fly.
+  //
+  // However, this is not always possible, because for the outermost index 
there
+  // is no containing type. In such cases, or if the containing type is not
+  // relevant, e.g. for arrays, the element type is stored as Type* in CurTy.
+  //
+  // If CurTy contains a Type* value, this does not imply anything about the
+  // type itself, because it is the element type and not the outer type.
+  // In particular, Type* can be a struct type.
+  //
+  // Consider this example:
+  //
+  //%my.struct = type { i32, [ 4 x float ] }
+  //[...]
+  //%gep = 

[llvm] [clang] [IR] Fix GEP offset computations for vector GEPs (PR #75448)

2023-12-14 Thread Jannik Silvanus via cfe-commits

jasilvanus wrote:

> Can this solve #68566 too?

I don't think it solves it, as it only fixes offset computations within GEPs 
and doesn't teach code in general about the correct vector layout. However, it 
is reducing the amount of code assuming the wrong layout.

There seem to be some clang test failures though that I'm currently looking 
into.

https://github.com/llvm/llvm-project/pull/75448
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [IR] Fix GEP offset computations for vector GEPs (PR #75448)

2023-12-14 Thread Jannik Silvanus via cfe-commits

jasilvanus wrote:

> There seem to be some clang test failures though that I'm currently looking 
> into.

Solved: The clang failures were caused by me having `dxv` in `$PATH`, which 
implicitly enabled some extra `dxv`-based tests which also fail without this 
patch for me. 

Clang was invoking `dxv - -o -`, and `dxv` complained about `-` not being a 
file. Not sure whether automatically enabling extra tests based on `$PATH` is 
desirable behavior, but that's a different topic.

https://github.com/llvm/llvm-project/pull/75448
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[llvm] [clang] [IR] Fix GEP offset computations for vector GEPs (PR #75448)

2023-12-14 Thread Jannik Silvanus via cfe-commits

https://github.com/jasilvanus updated 
https://github.com/llvm/llvm-project/pull/75448

>From 2c367fba42b716d803ee088af45c1b57fe4bcbcd Mon Sep 17 00:00:00 2001
From: Jannik Silvanus 
Date: Thu, 14 Dec 2023 09:24:51 +0100
Subject: [PATCH 1/4] [InstCombine] Precommit test exhibiting miscompile

InstCombine is determining incorrect byte offsets for GEPs
into vectors of overaligned elements.
Add a new testcase showing this behavior, serving as precommit
for a fix.
---
 .../test/Transforms/InstCombine/getelementptr.ll | 16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/llvm/test/Transforms/InstCombine/getelementptr.ll 
b/llvm/test/Transforms/InstCombine/getelementptr.ll
index bc7fdc9352df6c..7c0d95973d5cf3 100644
--- a/llvm/test/Transforms/InstCombine/getelementptr.ll
+++ b/llvm/test/Transforms/InstCombine/getelementptr.ll
@@ -1,7 +1,7 @@
 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
 ; RUN: opt < %s -passes=instcombine -S | FileCheck %s
 
-target datalayout = "e-p:64:64-p1:16:16-p2:32:32:32-p3:64:64:64"
+target datalayout = "e-p:64:64-p1:16:16-p2:32:32:32-p3:64:64:64-f16:32"
 
 %intstruct = type { i32 }
 %pair = type { i32, i32 }
@@ -111,6 +111,20 @@ define void @test_evaluate_gep_as_ptrs_array(ptr 
addrspace(2) %B) {
   ret void
 }
 
+define void @test_overaligned_vec(i8 %B) {
+; This should be turned into a constexpr instead of being an 
instruction
+; CHECK-LABEL: @test_overaligned_vec(
+; TODO: In this test case, half is overaligned to 32 bits.
+;   Vectors are bit-packed and don't respect alignment.
+;   Thus, the byte offset of the second half in <2 x half> is 2 bytes, not 
4 bytes:
+; CHECK-NEXT:store i8 [[B:%.*]], ptr getelementptr inbounds ([10 x i8], 
ptr @Global, i64 0, i64 4), align 1
+; CHECK-NEXT:ret void
+;
+  %A = getelementptr <2 x half>, ptr @Global, i64 0, i64 1
+  store i8 %B, ptr %A
+  ret void
+}
+
 define ptr @test7(ptr %I, i64 %C, i64 %D) {
 ; CHECK-LABEL: @test7(
 ; CHECK-NEXT:[[A:%.*]] = getelementptr i32, ptr [[I:%.*]], i64 [[C:%.*]]

>From 2e80e5846d23946304bbb9930d0a12098a2f16bd Mon Sep 17 00:00:00 2001
From: Jannik Silvanus 
Date: Thu, 14 Dec 2023 09:29:59 +0100
Subject: [PATCH 2/4] [IR]: Add
 generic_gep_type_iterator::getSequentialElementStride

This prepares a fix to GEP offset computations on vectors
of overaligned elements.

We have many places that analyze GEP offsets using GEP iterators
with the following pattern:

  GTI = gep_type_begin(ElemTy, Indices),
  GTE = gep_type_end(ElemTy, Indices);
  for (; GTI != GTE; ++GTI) {
if (StructType *STy = GTI.getStructTypeOrNull()) {
   // handle struct
   [..]
} else {
  // handle sequential (outmost index, array, vector):
  auto Stride = DL.getTypeAllocSize(GTI.getIndexedType());
  Offset += Index * Size;
}
  }

This is incorrect for vectors of types whose bit size does not
equal its alloc size (e.g. overaligned types), as vectors
always bit-pack their elements.

This patch introduces new functions generic_gep_type_iterator::isVector()
and generic_gep_type_iterator::getSequentialElementStride(const DataLayout &)
to fix these patterns without having to teach all these places about
the specifics of vector bit layouts. With these helpers, the pattern
above can be fixed by replacing the stride computation:

  auto Stride = GTI.getSequentialElementStride(DL);
---
 .../llvm/IR/GetElementPtrTypeIterator.h   | 57 ++-
 1 file changed, 54 insertions(+), 3 deletions(-)

diff --git a/llvm/include/llvm/IR/GetElementPtrTypeIterator.h 
b/llvm/include/llvm/IR/GetElementPtrTypeIterator.h
index f3272327c3f8b2..5b63ccb182a842 100644
--- a/llvm/include/llvm/IR/GetElementPtrTypeIterator.h
+++ b/llvm/include/llvm/IR/GetElementPtrTypeIterator.h
@@ -16,6 +16,7 @@
 
 #include "llvm/ADT/ArrayRef.h"
 #include "llvm/ADT/PointerUnion.h"
+#include "llvm/IR/DataLayout.h"
 #include "llvm/IR/DerivedTypes.h"
 #include "llvm/IR/Operator.h"
 #include "llvm/IR/User.h"
@@ -30,7 +31,39 @@ template 
 class generic_gep_type_iterator {
 
   ItTy OpIt;
-  PointerUnion CurTy;
+  // We use two different mechanisms to store the type a GEP index applies to.
+  // In some cases, we need to know the outer aggregate type the index is
+  // applied within, e.g. a struct. In such cases, we store the aggregate type
+  // in the iterator, and derive the element type on the fly.
+  //
+  // However, this is not always possible, because for the outermost index 
there
+  // is no containing type. In such cases, or if the containing type is not
+  // relevant, e.g. for arrays, the element type is stored as Type* in CurTy.
+  //
+  // If CurTy contains a Type* value, this does not imply anything about the
+  // type itself, because it is the element type and not the outer type.
+  // In particular, Type* can be a struct type.
+  //
+  // Consider this example:
+  //
+  //%my.struct = type { i32, [ 4 x float ] }
+  //[...]
+  //%gep = 

[llvm] [clang] [IR] Fix GEP offset computations for vector GEPs (PR #75448)

2023-12-14 Thread Jannik Silvanus via cfe-commits


@@ -108,7 +143,23 @@ class generic_gep_type_iterator {
   // that.
 
   bool isStruct() const { return isa(CurTy); }
-  bool isSequential() const { return isa(CurTy); }
+  bool isVector() const { return isa(CurTy); }
+  bool isSequential() const { return !isStruct(); }
+
+  // For sequential GEP indices (all except those into structs), the index 
value
+  // can be translated into a byte offset by multiplying with an element 
stride.
+  // This function returns this stride, which both depends on the element type,
+  // and the containing aggregate type, as vectors always tightly bit-pack 
their
+  // elements.
+  TypeSize getSequentialElementStride(const DataLayout &DL) const {
+assert(isSequential());
+Type *ElemTy = getIndexedType();
+TypeSize ElemSizeInBits = isVector() ? DL.getTypeSizeInBits(ElemTy)
+ : DL.getTypeAllocSizeInBits(ElemTy);
+// Check for invalid GEPs that are not byte-addressable.
+assert(ElemSizeInBits.isKnownMultipleOf(8));
+return ElemSizeInBits.divideCoefficientBy(8);

jasilvanus wrote:

Thanks, that's better indeed.

https://github.com/llvm/llvm-project/pull/75448
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[llvm] [clang] [IR] Fix GEP offset computations for vector GEPs (PR #75448)

2023-12-14 Thread Jannik Silvanus via cfe-commits

https://github.com/jasilvanus updated 
https://github.com/llvm/llvm-project/pull/75448

>From 2c367fba42b716d803ee088af45c1b57fe4bcbcd Mon Sep 17 00:00:00 2001
From: Jannik Silvanus 
Date: Thu, 14 Dec 2023 09:24:51 +0100
Subject: [PATCH 1/5] [InstCombine] Precommit test exhibiting miscompile

InstCombine is determining incorrect byte offsets for GEPs
into vectors of overaligned elements.
Add a new testcase showing this behavior, serving as precommit
for a fix.
---
 .../test/Transforms/InstCombine/getelementptr.ll | 16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/llvm/test/Transforms/InstCombine/getelementptr.ll 
b/llvm/test/Transforms/InstCombine/getelementptr.ll
index bc7fdc9352df6c..7c0d95973d5cf3 100644
--- a/llvm/test/Transforms/InstCombine/getelementptr.ll
+++ b/llvm/test/Transforms/InstCombine/getelementptr.ll
@@ -1,7 +1,7 @@
 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
 ; RUN: opt < %s -passes=instcombine -S | FileCheck %s
 
-target datalayout = "e-p:64:64-p1:16:16-p2:32:32:32-p3:64:64:64"
+target datalayout = "e-p:64:64-p1:16:16-p2:32:32:32-p3:64:64:64-f16:32"
 
 %intstruct = type { i32 }
 %pair = type { i32, i32 }
@@ -111,6 +111,20 @@ define void @test_evaluate_gep_as_ptrs_array(ptr 
addrspace(2) %B) {
   ret void
 }
 
+define void @test_overaligned_vec(i8 %B) {
+; This should be turned into a constexpr instead of being an 
instruction
+; CHECK-LABEL: @test_overaligned_vec(
+; TODO: In this test case, half is overaligned to 32 bits.
+;   Vectors are bit-packed and don't respect alignment.
+;   Thus, the byte offset of the second half in <2 x half> is 2 bytes, not 
4 bytes:
+; CHECK-NEXT:store i8 [[B:%.*]], ptr getelementptr inbounds ([10 x i8], 
ptr @Global, i64 0, i64 4), align 1
+; CHECK-NEXT:ret void
+;
+  %A = getelementptr <2 x half>, ptr @Global, i64 0, i64 1
+  store i8 %B, ptr %A
+  ret void
+}
+
 define ptr @test7(ptr %I, i64 %C, i64 %D) {
 ; CHECK-LABEL: @test7(
 ; CHECK-NEXT:[[A:%.*]] = getelementptr i32, ptr [[I:%.*]], i64 [[C:%.*]]

>From 2e80e5846d23946304bbb9930d0a12098a2f16bd Mon Sep 17 00:00:00 2001
From: Jannik Silvanus 
Date: Thu, 14 Dec 2023 09:29:59 +0100
Subject: [PATCH 2/5] [IR]: Add
 generic_gep_type_iterator::getSequentialElementStride

This prepares a fix to GEP offset computations on vectors
of overaligned elements.

We have many places that analyze GEP offsets using GEP iterators
with the following pattern:

  GTI = gep_type_begin(ElemTy, Indices),
  GTE = gep_type_end(ElemTy, Indices);
  for (; GTI != GTE; ++GTI) {
if (StructType *STy = GTI.getStructTypeOrNull()) {
   // handle struct
   [..]
} else {
  // handle sequential (outmost index, array, vector):
  auto Stride = DL.getTypeAllocSize(GTI.getIndexedType());
  Offset += Index * Size;
}
  }

This is incorrect for vectors of types whose bit size does not
equal its alloc size (e.g. overaligned types), as vectors
always bit-pack their elements.

This patch introduces new functions generic_gep_type_iterator::isVector()
and generic_gep_type_iterator::getSequentialElementStride(const DataLayout &)
to fix these patterns without having to teach all these places about
the specifics of vector bit layouts. With these helpers, the pattern
above can be fixed by replacing the stride computation:

  auto Stride = GTI.getSequentialElementStride(DL);
---
 .../llvm/IR/GetElementPtrTypeIterator.h   | 57 ++-
 1 file changed, 54 insertions(+), 3 deletions(-)

diff --git a/llvm/include/llvm/IR/GetElementPtrTypeIterator.h 
b/llvm/include/llvm/IR/GetElementPtrTypeIterator.h
index f3272327c3f8b2..5b63ccb182a842 100644
--- a/llvm/include/llvm/IR/GetElementPtrTypeIterator.h
+++ b/llvm/include/llvm/IR/GetElementPtrTypeIterator.h
@@ -16,6 +16,7 @@
 
 #include "llvm/ADT/ArrayRef.h"
 #include "llvm/ADT/PointerUnion.h"
+#include "llvm/IR/DataLayout.h"
 #include "llvm/IR/DerivedTypes.h"
 #include "llvm/IR/Operator.h"
 #include "llvm/IR/User.h"
@@ -30,7 +31,39 @@ template 
 class generic_gep_type_iterator {
 
   ItTy OpIt;
-  PointerUnion CurTy;
+  // We use two different mechanisms to store the type a GEP index applies to.
+  // In some cases, we need to know the outer aggregate type the index is
+  // applied within, e.g. a struct. In such cases, we store the aggregate type
+  // in the iterator, and derive the element type on the fly.
+  //
+  // However, this is not always possible, because for the outermost index 
there
+  // is no containing type. In such cases, or if the containing type is not
+  // relevant, e.g. for arrays, the element type is stored as Type* in CurTy.
+  //
+  // If CurTy contains a Type* value, this does not imply anything about the
+  // type itself, because it is the element type and not the outer type.
+  // In particular, Type* can be a struct type.
+  //
+  // Consider this example:
+  //
+  //%my.struct = type { i32, [ 4 x float ] }
+  //[...]
+  //%gep = 

[llvm] [clang] [IR] Fix GEP offset computations for vector GEPs (PR #75448)

2023-12-14 Thread Jannik Silvanus via cfe-commits


@@ -111,6 +111,20 @@ define void @test_evaluate_gep_as_ptrs_array(ptr 
addrspace(2) %B) {
   ret void
 }
 
+define void @test_overaligned_vec(i8 %B) {
+; This should be turned into a constexpr instead of being an 
instruction
+; CHECK-LABEL: @test_overaligned_vec(
+; TODO: In this test case, half is overaligned to 32 bits.
+;   Vectors are bit-packed and don't respect alignment.
+;   Thus, the byte offset of the second half in <2 x half> is 2 bytes, not 
4 bytes:

jasilvanus wrote:

Thanks, I moved the first comment away. The TODO is already removed in the main 
commit of this PR.

https://github.com/llvm/llvm-project/pull/75448
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[llvm] [clang] [IR] Fix GEP offset computations for vector GEPs (PR #75448)

2023-12-14 Thread Jannik Silvanus via cfe-commits

jasilvanus wrote:

> Alternative would be to forbid GEP indexing into vectors entirely.

I agree that it would be better if there just were no vector GEPs, but that 
breaks importing older modules, and that cannot be easily auto-upgraded 
(convert to byte-GEPs?). Even worse, DXIL uses different vector semantics where 
alignment in vectors is respected, so importing DXIL already is a challenge, 
but if vector GEPs in DXIL were to be replaced by byte-GEPs as part of 
auto-upgrade, preserving DXIL semantics becomes difficult.

https://github.com/llvm/llvm-project/pull/75448
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[llvm] [clang] [IR] Fix GEP offset computations for vector GEPs (PR #75448)

2023-12-19 Thread Jannik Silvanus via cfe-commits

jasilvanus wrote:

I'm now on vacation for two weeks, returning Jan 4th.
I'll merge the PR then (if it is approved by then).

https://github.com/llvm/llvm-project/pull/75448
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[llvm] [clang] [IR] Fix GEP offset computations for vector GEPs (PR #75448)

2024-01-04 Thread Jannik Silvanus via cfe-commits

https://github.com/jasilvanus closed 
https://github.com/llvm/llvm-project/pull/75448
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[llvm] [clang] [IR] Fix GEP offset computations for vector GEPs (PR #75448)

2024-01-10 Thread Jannik Silvanus via cfe-commits

jasilvanus wrote:

Interesting, here it is online: https://alive2.llvm.org/ce/z/Bm5gP2
The complaint goes away if `half` is naturally aligned (`f16:16`).

Looks like an Alive2 bug with respect to vector bit layout? 



https://github.com/llvm/llvm-project/pull/75448
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] 9741ac5 - [clang-format] vim integration: Mention python3 variant of bindings

2023-06-21 Thread Jannik Silvanus via cfe-commits

Author: Jannik Silvanus
Date: 2023-06-21T17:18:36+02:00
New Revision: 9741ac5b3b3e7942e3332e37af829e58e5e2bb82

URL: 
https://github.com/llvm/llvm-project/commit/9741ac5b3b3e7942e3332e37af829e58e5e2bb82
DIFF: 
https://github.com/llvm/llvm-project/commit/9741ac5b3b3e7942e3332e37af829e58e5e2bb82.diff

LOG: [clang-format] vim integration: Mention python3 variant of bindings

The instructions in the documentation only mentioned how to include
bindings for clang-format into vim using python2. Add the instructions
for python3 which were already present in the source comments.

Differential Revision: https://reviews.llvm.org/D153338

Change-Id: I25fdbd36f0c7e745061908be8e26f68cb31c7dd5

Added: 


Modified: 
clang/docs/ClangFormat.rst

Removed: 




diff  --git a/clang/docs/ClangFormat.rst b/clang/docs/ClangFormat.rst
index 98350a04ccda4..129d5f101a0c4 100644
--- a/clang/docs/ClangFormat.rst
+++ b/clang/docs/ClangFormat.rst
@@ -145,8 +145,13 @@ This can be integrated by adding the following to your 
`.vimrc`:
 
 .. code-block:: vim
 
-  map  :pyf /clang-format.py
-  imap  :pyf /clang-format.py
+  if has('python')
+map  :pyf /clang-format.py
+imap  :pyf /clang-format.py
+  elseif has('python3')
+map  :py3f /clang-format.py
+imap  :py3f /clang-format.py
+  endif
 
 The first line enables :program:`clang-format` for NORMAL and VISUAL mode, the
 second line adds support for INSERT mode. Change "C-K" to another binding if



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang-format] Remove YAML hack to emit a BasedOnStyle comment (PR #89228)

2024-04-18 Thread Jannik Silvanus via cfe-commits

https://github.com/jasilvanus created 
https://github.com/llvm/llvm-project/pull/89228

When serializing a formatting style to YAML, we were emitting a comment `# 
BasedOnStyle: 

[clang] [clang-format] Remove YAML hack to emit a BasedOnStyle comment (PR #89228)

2024-04-19 Thread Jannik Silvanus via cfe-commits


@@ -807,12 +807,18 @@ template <> struct MappingTraits {
 FormatStyle PredefinedStyle;
 if (getPredefinedStyle(StyleName, Style.Language, &PredefinedStyle) &&
 Style == PredefinedStyle) {
-  IO.mapOptional("# BasedOnStyle", StyleName);
+  // For convenience, emit the info which style this matches. However,
+  // setting BasedOnStyle will override all other keys when importing,
+  // so we set a helper key that is ignored when importing.
+  // Ideally, we'd want a YAML comment here, but that's not supported.
+  IO.mapOptional("OrigBasedOnStyle", StyleName);

jasilvanus wrote:

> I think clang-format will baff if there is styles in the .clang-format file 
> we don't understand, its very unforgiving of new styles

This will set a key that is never read just for reference, as the old comment 
did. 

> Why is using # not a valid YAML comment?

Using `#` in the textual yaml is a valid comment and perfectly fine. 
However, here we are using a function to set a key named `# BasedOnStyle`, which
just happened to emit a correct YAML comment because the key is not properly 
quoted/escaped.

This breaks when fixing key quoting. 

https://github.com/llvm/llvm-project/pull/89228
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang-format] Remove YAML hack to emit a BasedOnStyle comment (PR #89228)

2024-04-19 Thread Jannik Silvanus via cfe-commits

jasilvanus wrote:

> This doesn't feel correct to me...

I'd also be fine with just removing the comment.

https://github.com/llvm/llvm-project/pull/89228
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang-format] Remove YAML hack to emit a BasedOnStyle comment (PR #89228)

2024-04-19 Thread Jannik Silvanus via cfe-commits


@@ -807,12 +807,18 @@ template <> struct MappingTraits {
 FormatStyle PredefinedStyle;
 if (getPredefinedStyle(StyleName, Style.Language, &PredefinedStyle) &&
 Style == PredefinedStyle) {
-  IO.mapOptional("# BasedOnStyle", StyleName);

jasilvanus wrote:

I don't think our YAML writer supports comments at all.

https://github.com/llvm/llvm-project/pull/89228
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang-format] Remove YAML hack to emit a BasedOnStyle comment (PR #89228)

2024-04-23 Thread Jannik Silvanus via cfe-commits

https://github.com/jasilvanus updated 
https://github.com/llvm/llvm-project/pull/89228

>From 5d4a3b0f922b0c28960f610c695c92da7d3538c1 Mon Sep 17 00:00:00 2001
From: Jannik Silvanus 
Date: Thu, 18 Apr 2024 14:56:47 +0200
Subject: [PATCH 1/2] [clang-format] Remove YAML hack to emit a BasedOnStyle
 comment

When serializing a formatting style to YAML, we were emitting a
comment `# BasedOnStyle: 

[clang] [clang-format] Remove YAML hack to emit a BasedOnStyle comment (PR #89228)

2024-04-23 Thread Jannik Silvanus via cfe-commits


@@ -807,12 +807,18 @@ template <> struct MappingTraits {
 FormatStyle PredefinedStyle;
 if (getPredefinedStyle(StyleName, Style.Language, &PredefinedStyle) &&
 Style == PredefinedStyle) {
-  IO.mapOptional("# BasedOnStyle", StyleName);
+  // For convenience, emit the info which style this matches. However,
+  // setting BasedOnStyle will override all other keys when importing,
+  // so we set a helper key that is ignored when importing.
+  // Ideally, we'd want a YAML comment here, but that's not supported.
+  IO.mapOptional("OrigBasedOnStyle", StyleName);
   BasedOnStyle = StyleName;
   break;
 }
   }
 } else {
+  StringRef OrigBasedOnStyle; // ignored
+  IO.mapOptional("OrigBasedOnStyle", OrigBasedOnStyle);

jasilvanus wrote:

This allows the key being present in the input, avoiding an unknown key error. 
But now with all of it removed, this is no longer relevant.

https://github.com/llvm/llvm-project/pull/89228
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang-format] Remove YAML hack to emit a BasedOnStyle comment (PR #89228)

2024-04-23 Thread Jannik Silvanus via cfe-commits

jasilvanus wrote:

I've removed the dummy field and the comment now. This should prevent confusion.
I don't know whether anyone actually uses the emitted comment, there is no 
explanation,
and tests pass without it.

I suggest that if we later learn that there are valid uses that require the 
comment, we should add proper comment
support in the YAML writer, and add a test for it.

https://github.com/llvm/llvm-project/pull/89228
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang-format] Remove YAML hack to emit a BasedOnStyle comment (PR #89228)

2024-04-23 Thread Jannik Silvanus via cfe-commits


@@ -807,12 +807,18 @@ template <> struct MappingTraits {
 FormatStyle PredefinedStyle;
 if (getPredefinedStyle(StyleName, Style.Language, &PredefinedStyle) &&
 Style == PredefinedStyle) {
-  IO.mapOptional("# BasedOnStyle", StyleName);

jasilvanus wrote:

I don't know, but I'd prefer to avoid supporting it if it isn't even needed. We 
can still add it if we want this back.

https://github.com/llvm/llvm-project/pull/89228
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang-format] Remove YAML hack to emit a BasedOnStyle comment (PR #89228)

2024-04-25 Thread Jannik Silvanus via cfe-commits

https://github.com/jasilvanus closed 
https://github.com/llvm/llvm-project/pull/89228
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits