[llvm-bugs] [Bug 145861] [MLIR][linalg] `GeneralizeOuterUnitDimsPackOpPattern` not checking trailing dimension for tiling

2025-06-26 Thread LLVM Bugs via llvm-bugs


Issue: 145861
Summary: [MLIR][linalg] `GeneralizeOuterUnitDimsPackOpPattern` not checking trailing dimension for tiling
Labels: mlir
Assignees: (none)
Reporter: rYm-A

While compiling an ML model with IREE, I found that an incorrect linalg.transpose op is generated after running `iree-codegen-decompose-pack-unpack-ops`.

The linalg.pack op I'm attempting to lower, before `iree-codegen-decompose-pack-unpack-ops` runs, is:

```
%pack = linalg.pack %extracted_slice_1 
	outer_dims_perm = [1, 0, 2] 
	inner_dims_pos = [0, 2] 
	inner_tiles = [8, 1] 
	into %extracted_slice_2 
	{lowering_config = #iree_codegen.lowering_config<
		tile_sizes = [[1, 48, 64], [1, 1, 1]]>} 
	: tensor<8x1x1xf32> -> tensor<1x1x1x8x1xf32>
```

And the result is:

```
%18 = "tensor.empty"() : () -> tensor<1x1x1xf32>
%19 = "linalg.transpose"(%16, %18) <{permutation = array}> ({
 ^bb0(%arg7: f32, %arg8: f32):
  "linalg.yield"(%arg7) : (f32) -> ()
}) : (tensor<8x1x1xf32>, tensor<1x1x1xf32>) -> tensor<1x1x1xf32>
```

Note that the permutation array is incorrect, which triggers an assertion failure:

``Assertion `permutationMap.isPermutation() && "Invalid permutation vector"' failed.``
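
As a rough illustration of what the assertion guards against, the Python sketch below checks permutation validity and computes the shape a transpose would produce. It assumes the `linalg.transpose` convention that result dimension `i` takes its size from input dimension `permutation[i]`; the helper names are hypothetical and not part of MLIR. It also shows that the types in the generated op cannot be related by any transpose: no reordering of `8x1x1` yields `1x1x1`.

```python
from itertools import permutations


def is_permutation(perm):
    """True if perm contains each index 0..len(perm)-1 exactly once."""
    return sorted(perm) == list(range(len(perm)))


def transposed_shape(shape, perm):
    """Shape after a transpose where result dim i = input dim perm[i]."""
    if len(perm) != len(shape) or not is_permutation(perm):
        raise ValueError(f"invalid permutation vector: {perm}")
    return [shape[p] for p in perm]


# Source type from the failing op: tensor<8x1x1xf32>.
src_shape = [8, 1, 1]
reachable = {
    tuple(transposed_shape(src_shape, list(p))) for p in permutations(range(3))
}
print((1, 1, 1) in reachable)  # prints False
```

Whatever permutation the pattern computed, a rank-3 transpose from `tensor<8x1x1xf32>` to `tensor<1x1x1xf32>` is ill-formed on its types alone.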

Checking this [PR](https://github.com/llvm/llvm-project/pull/115312) by @banach-space, I found that this op shouldn't be matched by `GeneralizeOuterUnitDimsPackOpPattern`, since the `inner_dims_pos` of this linalg.pack op are not the trailing dimensions of the source, which that PR assumes.

The check at https://github.com/llvm/llvm-project/blame/237b8de2c0d9ee50c6a744e95c0706c8cdea70e1/mlir/lib/Dialect/Linalg/Transforms/Transforms.cpp#L1183 does not seem to catch this particular example: the `-1` at the end may make it accept the N + 1 trailing dims rather than the N trailing dims. Could you confirm whether this is intended?

Changing the check to `return dimPos >= (srcRank - numTiles);` makes the pattern fail to match for this example, and as a result this linalg.pack op is eventually lowered to:

```
%expanded = tensor.expand_shape %arg0 [[0, 1], [2], [3, 4]] output_shape [1, 8, 1, 1, 1] : tensor<8x1x1xf32> into tensor<1x8x1x1x1xf32>
%transposed = linalg.transpose ins(%expanded : tensor<1x8x1x1x1xf32>) outs(%arg1 : tensor<1x1x1x8x1xf32>) permutation = [2, 0, 3, 1, 4]  {lowering_config = #config}
```
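
To make the suspected off-by-one concrete, here is a minimal Python sketch of the two predicates. It assumes the existing check has the form `dimPos >= srcRank - numTiles - 1`, which is inferred from the description of the trailing `-1` above and not copied from `Transforms.cpp`; `all_trailing` and its `slack` parameter are hypothetical names used only for illustration.

```python
def all_trailing(inner_dims_pos, src_rank, slack):
    """True if every tiled dim is at or beyond src_rank - num_tiles - slack.

    slack=1 models the assumed current check (with the trailing -1);
    slack=0 models the proposed check.
    """
    num_tiles = len(inner_dims_pos)
    threshold = src_rank - num_tiles - slack
    return all(pos >= threshold for pos in inner_dims_pos)


# Values from the reported op: tensor<8x1x1xf32>, inner_dims_pos = [0, 2].
src_rank = 3
reported = [0, 2]

print(all_trailing(reported, src_rank, slack=1))  # prints True
print(all_trailing(reported, src_rank, slack=0))  # prints False

# A pack whose tiled dims really are the two trailing ones passes both.
print(all_trailing([1, 2], src_rank, slack=0))  # prints True
```

Under these assumptions the relaxed threshold is 0 for this op, so every dimension position passes, while the stricter threshold of 1 rejects position 0.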

### How to reproduce
IREE v3.5.0.

```
iree-opt packOp.mlir \
--pass-pipeline="builtin.module(func.func(iree-codegen-decompose-pack-unpack-ops))" \
--debug \
--mlir-disable-threading 
```

Trace: 
```

Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it):
0 libIREECompiler.so 0x7fec4b0a85b8 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) + 40
1 libIREECompiler.so 0x7fec4b0a635e llvm::sys::RunSignalHandlers() + 238
2 libIREECompiler.so 0x7fec4b0a8cb6
3  libc.so.6 0x7fec454e6520
4  libIREECompiler.so 0x7fec4b0d50b3 mlir::AffineMap::getContext() const + 3
5  libIREECompiler.so 0x7fec4b102bc4 mlir::AffineMapAttr::get(mlir::AffineMap) + 20
6 libIREECompiler.so 0x7fec4b0ff979 mlir::Builder::getAffineMapArrayAttr(llvm::ArrayRef) + 137
7  libIREECompiler.so 0x7fec4f9b71fb mlir::linalg::TransposeOp::getIndexingMaps() + 475
8  libIREECompiler.so 0x7fec4cb25b76
9  libIREECompiler.so 0x7fec4fc51f67
10 libIREECompiler.so 0x7fec4f99da41 mlir::linalg::LinalgOp::getIndexingMapsArray() + 17
11 libIREECompiler.so 0x7fec4fb1cb0d
12 libIREECompiler.so 0x7fec4ff8e3cd
13 libIREECompiler.so 0x7fec4ff8b664 mlir::PatternApplicator::matchAndRewrite(mlir::Operation*, mlir::PatternRewriter&, llvm::function_ref, llvm::function_ref, llvm::function_ref) + 820
14 libIREECompiler.so 0x7fec4ff729e7
15 libIREECompiler.so 0x7fec4ff707bb mlir::applyPatternsGreedily(mlir::Region&, mlir::FrozenRewritePatternSet const&, mlir::GreedyRewriteConfig, bool*) + 1819
16 libIREECompiler.so 0x7fec4dc12c0b
17 libIREECompiler.so 0x7fec4dc1195c
18 libIREECompiler.so 0x7fec4b31e5cb mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int) + 635
19 libIREECompiler.so 0x7fec4b31f1b9 mlir::detail::OpToOpPassAdaptor::runPipeline(mlir::OpPassManager&, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int, mlir::PassInstrumentor*, mlir::PassInstrumentation::PipelineParentInfo const*) + 329
20 libIREECompiler.so 0x7fec4b321907 mlir::detail::OpToOpPassAdaptor::runOnOperationImpl(bool) + 2311
21 libIREECompiler.so 0x7fec4b31ea8c mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int) + 1852
22 libIREECompiler.so 0x7fec4b3224db mlir::PassManager::run(mlir::Operation*) + 1531
23 libIREECompiler.so 0x7fec4b3150be
24 libIREECompiler.so 0x7fe
```

[llvm-bugs] [Bug 145851] lld warning `DWARF unit at offset 0x00000000 has unsupported version 0`

2025-06-26 Thread LLVM Bugs via llvm-bugs


Issue: 145851
Summary: lld warning `DWARF unit at offset 0x00000000 has unsupported version 0`
Labels: lld
Assignees: (none)
Reporter: zmodem

This warning accompanies undefined-symbol errors when certain flags are passed (at commit 54953b922d114de5a539f32071650d9a8ab6d78c):

```
$ cat /tmp/a.c
void f();  // Intentionally left undefined.
int main() { f(); }

$ build/bin/clang -fuse-ld=lld -g2 -Wa,--crel,--allow-experimental-crel -Wl,--gdb-index /tmp/a.c
ld.lld: warning: /tmp/a-0657c0.o: DWARF unit at offset 0x00000000 has unsupported version 0, supported are 2-5
ld.lld: error: undefined symbol: f
>>> referenced by a.c
>>> /tmp/a-0657c0.o:(main)
clang: error: linker command failed with exit code 1 (use -v to see invocation)
```

The real error is of course the undefined symbol. The message about unsupported DWARF unit version is confusing to users.


___
llvm-bugs mailing list
llvm-bugs@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs


[llvm-bugs] [Bug 146005] [clang-cl] missing `__threadid` `_threadid` and `__threadhandle` in `stddef.h`

2025-06-26 Thread LLVM Bugs via llvm-bugs


Issue: 146005
Summary: [clang-cl] missing `__threadid` `_threadid` and `__threadhandle` in `stddef.h`
Labels: new issue
Assignees: (none)
Reporter: QianNangong

MSVC provides two functions and one macro inside `ucrt`'s `stddef.h`:
```cpp
_ACRTIMP extern unsigned long  __cdecl __threadid(void);
#define _threadid (__threadid())
_ACRTIMP extern uintptr_t __cdecl __threadhandle(void);
```
However, when I select the `clang-cl` toolset, they are no longer available.




[llvm-bugs] [Bug 146015] [MLIR][Affine] affine-pipeline-data-transfer-pass erases op attributes

2025-06-26 Thread LLVM Bugs via llvm-bugs


Issue: 146015
Summary: [MLIR][Affine] affine-pipeline-data-transfer-pass erases op attributes
Labels: mlir
Assignees: (none)
Reporter: sherylll

version: 20.1.7

input:
```
// RUN: mlir-opt %s -affine-pipeline-data-transfer 
func.func @loop_nest_dma() {

  %A = memref.alloc() : memref<256 x f32, affine_map<(d0) -> (d0)>, 0>
  %Ah = memref.alloc() {alignment = 1024} : memref<32 x f32, affine_map<(d0) -> (d0)>, 1>

  %tag = memref.alloc() : memref<1 x f32>

  %zero = arith.constant 0 : index
  %num_elts = arith.constant 32 : index

  affine.for %i = 0 to 8 {
affine.dma_start %A[%i], %Ah[%i], %tag[%zero], %num_elts : memref<256 x f32>, memref<32 x f32, 1>, memref<1 x f32>
affine.dma_wait %tag[%zero], %num_elts : memref<1 x f32>
%v = affine.load %Ah[%i] : memref<32 x f32, affine_map<(d0) -> (d0)>, 1>
  }
  memref.dealloc %tag : memref<1 x f32>
  memref.dealloc %Ah : memref<32 x f32, affine_map<(d0) -> (d0)>, 1>
 return
}
```
output:
```
#map = affine_map<(d0) -> (d0 - 1)>
#map1 = affine_map<(d0) -> (d0 mod 2)>
module {
  func.func @loop_nest_dma() {
 ...
%alloc_1 = memref.alloc() : memref<2x32xf32, 1> // attributes get removed
%alloc_2 = memref.alloc() : memref<2x1xf32>
 affine.dma_start %alloc[%c0], %alloc_1[%c0 mod 2, %c0], %alloc_2[%c0 mod 2, 0], %c32 : memref<256xf32>, memref<2x32xf32, 1>, memref<2x1xf32>
 affine.for %arg0 = 1 to 8 {
  affine.dma_start %alloc[%arg0], %alloc_1[%arg0 mod 2, %arg0], %alloc_2[%arg0 mod 2, 0], %c32 : memref<256xf32>, memref<2x32xf32, 1>, memref<2x1xf32>
  %4 = affine.apply #map(%arg0)
  %5 = affine.apply #map1(%4)
  %6 = affine.apply #map1(%4)
  affine.dma_wait %alloc_2[%4 mod 2, 0], %c32 : memref<2x1xf32>
  %7 = affine.load %alloc_1[%4 mod 2, %4] : memref<2x32xf32, 1>
}
...
return
  }
}
```

The alignment attribute is lost during this pass. Is this expected behavior? If not, passing an additional argument `cast(oldMemRef.getDefiningOp())->getAttrs()` at [this line](https://github.com/llvm/llvm-project/blob/main/mlir/lib/Dialect/Affine/Transforms/PipelineDataTransfer.cpp#L105) should fix the issue.
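
The effect can be pictured with a toy Python model. This is not the MLIR API: `Alloc` and `double_buffer` are hypothetical stand-ins that only mirror what the output above shows, namely that the buffer gains a leading dimension of size 2 (`memref<32xf32, 1>` becomes `memref<2x32xf32, 1>`) and that the attribute dictionary of the old op is either dropped or forwarded.

```python
from dataclasses import dataclass, field


@dataclass
class Alloc:
    """Toy stand-in for a memref.alloc: shape, memory space, op attributes."""
    shape: tuple
    memory_space: int = 0
    attrs: dict = field(default_factory=dict)


def double_buffer(old, keep_attrs):
    """Create the double-buffered allocation (leading dimension of size 2).

    keep_attrs=False models the reported behavior (attributes dropped);
    keep_attrs=True models forwarding the old op's attributes.
    """
    new_attrs = dict(old.attrs) if keep_attrs else {}
    return Alloc((2,) + old.shape, old.memory_space, new_attrs)


# %Ah from the input: memref<32 x f32, 1> with {alignment = 1024}.
ah = Alloc((32,), memory_space=1, attrs={"alignment": 1024})

print(double_buffer(ah, keep_attrs=False))
print(double_buffer(ah, keep_attrs=True))
```

Whether forwarding every attribute verbatim is the right policy for the real pass is a separate question that this sketch does not answer.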





[llvm-bugs] [Bug 146019] [clang-doc] ClangFormatStyleOptions.rst: Description for `AlignConsecutiveStyle` special options appear in places where they are not applicable

2025-06-26 Thread LLVM Bugs via llvm-bugs


Issue: 146019
Summary: [clang-doc] ClangFormatStyleOptions.rst: Description for `AlignConsecutiveStyle` special options appear in places where they are not applicable
Labels: new issue
Assignees: (none)
Reporter: Next-Door-Tech

There are a small handful of flags under `struct AlignConsecutiveStyle` which are only applicable to one of the seven `AlignConsecutiveXYZ` options; however, their descriptions appear under each instance of AlignConsecutiveStyle (6 extraneous places).

The offending flags are `AlignCompound`, `AlignFunctionDeclarations`, `AlignFunctionPointers`, and `PadOperators`.

As generated:
https://github.com/llvm/llvm-project/blob/569fcac4584ad555b9b57d09e3535260a8634429/clang/docs/ClangFormatStyleOptions.rst#L383-L447

In source header:
https://github.com/llvm/llvm-project/blob/569fcac4584ad555b9b57d09e3535260a8634429/clang/include/clang/Format/Format.h#L217-L277




[llvm-bugs] [Bug 146030] [mlir][Vector] Turn Vector Linearization from a conversion to an IR rewrite?

2025-06-26 Thread LLVM Bugs via llvm-bugs


Issue: 146030
Summary: [mlir][Vector] Turn Vector Linearization from a conversion to an IR rewrite?
Labels: mlir
Assignees: (none)
Reporter: dcaballe

Hey,

I've been looking into the Vector Linearization transformation and hit a few issues. The current implementation uses the Conversion pattern driver and mostly makes every operation with a multi-dimensional vector type illegal using [this target method](https://github.com/dcaballe/llvm-project/blob/vector-to-elements-llvm-lowering/mlir/lib/Dialect/Vector/Transforms/VectorLinearize.cpp#L702-L710). I see a couple of issues with this:

1. If an operation is set as illegal but we don't have a conversion pattern for it, or the existing conversion patterns fail for some reason, the whole linearization conversion will fail. This will happen even when using `applyPartialConversion` because the target method mentioned above explicitly sets operations as illegal. To give you a concrete example, I'm currently hitting a conversion error because scalar `vector.insert` operations are not supported:

```
test2.mlir:13:12: error: failed to legalize operation 'vector.insert' that was explicitly marked illegal
  %8 = vector.insert %7, %cst [0, 0] : f32 into vector<16x8xf32>
```

I hit more issues with other ops. Note that not linearizing an n-D operation shouldn't be a big deal, as we have lowering patterns to LLVM that can handle multi-dimensional vectors, so it should be fine to continue the compilation regardless of any implementation gaps in this pass.

2. The linearization "conversion" is similar to other transformations that we have in the Vector dialect, which are implemented as simple IR rewrites that can be applied using the greedy pattern rewriter driver. However, since we are using the conversion driver for vector linearization, it can't be composed with these IR rewrites. I think the idea of using the type converter and target to model the n-D -> 1-D vector type conversion is fine, but these kinds of vector "reshape" transformations are common in different vector rewrites and we don't use the conversion infrastructure for them.
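
For readers less familiar with the n-D -> 1-D mapping, the Python sketch below computes the row-major flat offset that a position in a multi-dimensional vector would correspond to. It only illustrates the index arithmetic; `linearize` is a hypothetical helper, not something in `VectorLinearize.cpp`, and the assumption is that a linearized `vector<16x8xf32>` would be addressed as a flat vector of 16 * 8 = 128 elements.

```python
def linearize(position, shape):
    """Row-major flat offset of `position` within a vector of `shape`."""
    if len(position) != len(shape):
        raise ValueError("position and shape must have the same rank")
    flat = 0
    for idx, dim in zip(position, shape):
        if not 0 <= idx < dim:
            raise IndexError(f"index {idx} out of bounds for dimension {dim}")
        flat = flat * dim + idx
    return flat


shape = (16, 8)  # vector<16x8xf32> from the error message above
print(linearize((0, 0), shape))   # prints 0
print(linearize((1, 0), shape))   # prints 8
print(linearize((15, 7), shape))  # prints 127
```

So the scalar insert at `[0, 0]` in the failing example would correspond to flat offset 0.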

All in all, I'm really excited to see the progress we are making in this transformation. I think it's reaching a point where we can start using it more broadly and make it more composable with other existing vector rewrites. As such, we may want to consider moving the current Conversion based implementation to a simpler IR rewrite. I'd love to hear what others think!

cc: @newling, @Hardcode84, @banach-space 




[llvm-bugs] [Bug 145879] [mlir][linalg] using outputDimSize in `getPackOpSourceOrPaddedSource` to lower linalg.pack for sources with dynamic leading dims

2025-06-26 Thread LLVM Bugs via llvm-bugs


Issue: 145879
Summary: [mlir][linalg] using outputDimSize in `getPackOpSourceOrPaddedSource` to lower linalg.pack for sources with dynamic leading dims
Labels: mlir
Assignees: (none)
Reporter: rYm-A

Lowering the following linalg.pack op fails:

```
func.func @main(%arg0: tensor, %arg1:  tensor, %cst: f32 ) -> tensor {
  %result = linalg.pack %arg0
padding_value(%cst : f32)
outer_dims_perm = [0, 1, 2]
inner_dims_pos = [1, 2]
inner_tiles = [8, 1]
into %arg1
{lowering_config = #iree_codegen.lowering_config}
: tensor -> tensor
  return %result : tensor
}
```

...using 

```
iree-opt packOp.mlir \
--pass-pipeline="builtin.module(func.func(iree-llvmcpu-tile{tiling-level=1}, iree-codegen-decompose-pack-unpack-ops))"  \
--debug \
--mlir-disable-threading 
```

...will generate the following IR after `iree-llvmcpu-tile{tiling-level=1}`:

```
func.func @main(%arg0: tensor, %arg1: tensor, %arg2: f32) -> tensor {
  %c8 = arith.constant 8 : index
  %c1 = arith.constant 1 : index
  %c0 = arith.constant 0 : index
  %dim = tensor.dim %arg1, %c0 : tensor
  %dim_0 = tensor.dim %arg1, %c1 : tensor
  %0 = scf.for %arg3 = %c0 to %dim step %c1 iter_args(%arg4 = %arg1) -> (tensor) {
%1 = scf.for %arg5 = %c0 to %dim_0 step %c1 iter_args(%arg6 = %arg4) -> (tensor) {
  %2 = scf.for %arg7 = %c0 to %c8 step %c1 iter_args(%arg8 = %arg6) -> (tensor) {
%dim_1 = tensor.dim %arg0, %c0 : tensor
%dim_2 = tensor.dim %arg0, %c1 : tensor
%3 = affine.min affine_map<(d0)[s0] -> (-d0 + s0, 1)>(%arg3)[%dim_1]
%4 = affine.apply affine_map<(d0) -> (d0 * 8)>(%arg5)
%5 = affine.min affine_map<(d0)[s0] -> (d0 * -8 + s0, 8)>(%arg5)[%dim_2]
%extracted_slice = tensor.extract_slice %arg0[%arg3, %4, %arg7] [%3, %5, 1] [1, 1, 1] : tensor to tensor
%extracted_slice_3 = tensor.extract_slice %arg8[%arg3, %arg5, %arg7, 0, 0] [1, 1, 1, 8, 1] [1, 1, 1, 1, 1] : tensor to tensor<1x1x1x8x1xf32>
%pack = linalg.pack %extracted_slice_1 padding_value(%cst : f32) 
			outer_dims_perm = [0, 1, 2] 
			inner_dims_pos = [1, 2] 
			inner_tiles = [8, 1] 
			into %extracted_slice_2 {lowering_config = #iree_codegen.lowering_config} : 
			tensor -> tensor<1x1x1x8x1xf32>
%inserted_slice = tensor.insert_slice %pack into %arg8[%arg3, %arg5, %arg7, 0, 0] [1, 1, 1, 8, 1] [1, 1, 1, 1, 1] : tensor<1x1x1x8x1xf32> into tensor
scf.yield %inserted_slice : tensor
  }
  scf.yield %2 : tensor
}
scf.yield %1 : tensor
  }
  return %0 : tensor
}
```
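
The two `affine.min` maps in the tiled IR clamp each slice to what is left of the source dimension. A small Python sketch of that arithmetic follows; `tile_extent` is a hypothetical helper and the dimension size 20 is an arbitrary value chosen for illustration, since the real sizes are dynamic.

```python
def tile_extent(iv, dim_size, tile):
    """Evaluate affine.min (d0)[s0] -> (d0 * -tile + s0, tile).

    iv is the loop induction variable (d0), dim_size the dynamic source
    dimension (s0), and tile the static tile size.
    """
    return min(dim_size - iv * tile, tile)


# %5 uses tile = 8; with an assumed source extent of 20 the loop runs 3 times.
dim_2 = 20
print([tile_extent(iv, dim_2, 8) for iv in range(3)])  # prints [8, 8, 4]

# %3 uses tile = 1, so it evaluates to 1 whenever the iteration is in bounds.
print(tile_extent(5, 6, 1))  # prints 1
```

The last slice along the tiled dimension can be shorter than the inner tile (4 instead of 8 here), which is the case `padding_value` exists to cover. The slice size along the leading dimension is likewise a dynamic SSA value in the IR, even though it evaluates to 1 on every in-bounds iteration.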

The `iree-codegen-decompose-pack-unpack-ops` pass then triggers this [assert](https://github.com/llvm/llvm-project/blob/66f84c8b8a762832af39e91370018f8f8307a0fc/mlir/lib/Dialect/Linalg/Transforms/Transforms.cpp#L1057) in `getPackOpSourceOrPaddedSource`, since the outermost dim of the linalg.pack op's source isn't 1 but dynamic.

@banach-space, why not use packOp.SourceType() instead? The `iree-llvmcpu-tile{tiling-level=1}` pass already ensures that the non-tiled outer dimension of the linalg.pack result is set to 1.

This time, not a duplicated issue 🙂





[llvm-bugs] [Bug 145881] [Clang] Crash in CodeGen when using explicit template parameters in a lambda

2025-06-26 Thread LLVM Bugs via llvm-bugs


Issue: 145881
Summary: [Clang] Crash in CodeGen when using explicit template parameters in a lambda
Labels: clang:frontend, crash-on-valid
Assignees: (none)
Reporter: philnik777

```
void func() {
  [] {}();
}
```
causes Clang to crash when in `-std=c++11` mode.




[llvm-bugs] [Bug 145970] [libc++][sanitizers][arm] Setup libc++ CI with Arm sanitizers

2025-06-26 Thread LLVM Bugs via llvm-bugs


Issue: 145970
Summary: [libc++][sanitizers][arm] Setup libc++ CI with Arm sanitizers
Labels: libc++, github:workflow
Assignees: qinkunbao
Reporter: vitalybuka

HWASan is of particular interest, since the other sanitizers are already covered by x86 bots as well.

https://github.com/llvm/llvm-project/pull/130145#issuecomment-3008791522




[llvm-bugs] [Bug 145975] Unexpected "reference to 'std' is ambiguous" error with C++20 modules

2025-06-26 Thread LLVM Bugs via llvm-bugs


Issue: 145975
Summary: Unexpected "reference to 'std' is ambiguous" error with C++20 modules
Labels: new issue
Assignees: (none)
Reporter: MikailBag

Reproducer: https://gist.github.com/MikailBag/1759e1c092649ee749fa0f620805fd62

Adding `namespace std {}` to the global module fragment (GMF) in connection.cpp fixes the error.

Compiler version:
```
$ /home/mb/projects/llvm-project/build/bin/clang++ --version
clang version 21.0.0git (https://github.com/llvm/llvm-project.git 1276a5b368493cc73ef0febee35a6591b92464d5)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/mb/projects/llvm-project/build/bin
Build config: +assertions
```

Originally reported in https://github.com/llvm/llvm-project/issues/118137#issuecomment-3005759000; this is a minimized version.




[llvm-bugs] [Bug 145969] Request Commit Access For Ralender

2025-06-26 Thread LLVM Bugs via llvm-bugs


Issue: 145969
Summary: Request Commit Access For Ralender
Labels: infra:commit-access-request
Assignees: tstellar
Reporter: Ralender

### Why Are you requesting commit access ?
To be able to revert/fix PR if buildbots break.




[llvm-bugs] [Bug 145992] [Offload] `device.unittests` is failing

2025-06-26 Thread LLVM Bugs via llvm-bugs


Issue: 145992
Summary: [Offload] `device.unittests` is failing
Labels: new issue
Assignees: (none)
Reporter: leandrolcampos

```bash
leandro@Zephyrus:~/llvm-project$ ./build/runtimes/runtimes-bins/offload/unittests/OffloadAPI/device.unittests
[==========] Running 22 tests from 3 test suites.
[----------] Global test environment set-up.
[----------] 2 tests from olIterateDevicesTest
[ RUN      ] olIterateDevicesTest.SuccessEmptyCallback
[       OK ] olIterateDevicesTest.SuccessEmptyCallback (0 ms)
[ RUN      ] olIterateDevicesTest.SuccessGetDevice
[       OK ] olIterateDevicesTest.SuccessGetDevice (0 ms)
[----------] 2 tests from olIterateDevicesTest (0 ms total)

[----------] 12 tests from olGetDeviceInfoTest
[ RUN      ] olGetDeviceInfoTest.SuccessType/CUDA___F__V
[       OK ] olGetDeviceInfoTest.SuccessType/CUDA___F__V (0 ms)
[ RUN      ] olGetDeviceInfoTest.HostSuccessType/CUDA___F__V
[       OK ] olGetDeviceInfoTest.HostSuccessType/CUDA___F__V (0 ms)
[ RUN      ] olGetDeviceInfoTest.SuccessPlatform/CUDA___F__V
[       OK ] olGetDeviceInfoTest.SuccessPlatform/CUDA___F__V (0 ms)
[ RUN      ] olGetDeviceInfoTest.SuccessName/CUDA___F__V
[       OK ] olGetDeviceInfoTest.SuccessName/CUDA___F__V (0 ms)
[ RUN      ] olGetDeviceInfoTest.HostName/CUDA___F__V
[       OK ] olGetDeviceInfoTest.HostName/CUDA___F__V (0 ms)
[ RUN      ] olGetDeviceInfoTest.SuccessVendor/CUDA___F__V
[       OK ] olGetDeviceInfoTest.SuccessVendor/CUDA___F__V (0 ms)
[ RUN      ] olGetDeviceInfoTest.SuccessDriverVersion/CUDA___F__V
 #0 0x5614dc87c382 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (./build/runtimes/runtimes-bins/offload/unittests/OffloadAPI/device.unittests+0x5e382)
 #1 0x5614dc87989f llvm::sys::RunSignalHandlers() (./build/runtimes/runtimes-bins/offload/unittests/OffloadAPI/device.unittests+0x5b89f)
 #2 0x5614dc8799e4 SignalHandler(int, siginfo_t*, void*) Signals.cpp:0:0
 #3 0x7fd1c2ea8330 (/lib/x86_64-linux-gnu/libc.so.6+0x45330)
 #4 0x7fd1c2f01b2c pthread_kill (/lib/x86_64-linux-gnu/libc.so.6+0x9eb2c)
 #5 0x7fd1c2ea827e raise (/lib/x86_64-linux-gnu/libc.so.6+0x4527e)
 #6 0x7fd1c2e8b8ff abort (/lib/x86_64-linux-gnu/libc.so.6+0x288ff)
 #7 0x7fd1c36455e9 llvm::offload::olGetDeviceInfoImplDetail(ol_device_impl_t*, ol_device_info_t, unsigned long, void*, unsigned long*)::$_0::operator()(std::vector, std::allocator>, std::allocator, std::allocator>>>) const OffloadImpl.cpp:0:0
 #8 0x7fd1c3644e85 llvm::offload::olGetDeviceInfoImplDetail(ol_device_impl_t*, ol_device_info_t, unsigned long, void*, unsigned long*) (/home/leandro/llvm-project/build/lib/libLLVMOffload.so.21.0git+0x214e85)
 #9 0x7fd1c3645661 llvm::offload::olGetDeviceInfoSize_impl(ol_device_impl_t*, ol_device_info_t, unsigned long*) (/home/leandro/llvm-project/build/lib/libLLVMOffload.so.21.0git+0x215661)
#10 0x7fd1c363c298 olGetDeviceInfoSize_val(ol_device_impl_t*, ol_device_info_t, unsigned long*) (/home/leandro/llvm-project/build/lib/libLLVMOffload.so.21.0git+0x20c298)
#11 0x7fd1c363c3cc olGetDeviceInfoSize (/home/leandro/llvm-project/build/lib/libLLVMOffload.so.21.0git+0x20c3cc)
#12 0x5614dc8374fc olGetDeviceInfoTest_SuccessDriverVersion_Test::TestBody() (./build/runtimes/runtimes-bins/offload/unittests/OffloadAPI/device.unittests+0x194fc)
#13 0x5614dc893aa0 testing::Test::Run() (./build/runtimes/runtimes-bins/offload/unittests/OffloadAPI/device.unittests+0x75aa0)
#14 0x5614dc894f80 testing::TestInfo::Run() (./build/runtimes/runtimes-bins/offload/unittests/OffloadAPI/device.unittests+0x76f80)
#15 0x5614dc895b93 testing::TestSuite::Run() (./build/runtimes/runtimes-bins/offload/unittests/OffloadAPI/device.unittests+0x77b93)
#16 0x5614dc8a6dc4 testing::internal::UnitTestImpl::RunAllTests() (./build/runtimes/runtimes-bins/offload/unittests/OffloadAPI/device.unittests+0x88dc4)
#17 0x5614dc8a6189 testing::UnitTest::Run() (./build/runtimes/runtimes-bins/offload/unittests/OffloadAPI/device.unittests+0x88189)
#18 0x5614dc87f43c main (./build/runtimes/runtimes-bins/offload/unittests/OffloadAPI/device.unittests+0x6143c)
#19 0x7fd1c2e8d1ca (/lib/x86_64-linux-gnu/libc.so.6+0x2a1ca)
#20 0x7fd1c2e8d28b __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a28b)
#21 0x5614dc8310e5 _start (./build/runtimes/runtimes-bins/offload/unittests/OffloadAPI/device.unittests+0x130e5)
Aborted (core dumped)
```

**Environment:**
* LLVM version: Custom build from git
* LLVM Project Commit Hash: `6bdfecaea837a07d034b1598a3af38c6f64044f4`
* Operating System: Ubuntu 24.04.2 LTS (WSL2)

```bash
leandro@Zephyrus:~/llvm-project$ nvidia-smi
Thu Jun 26 20:35:46 2025
+-+
| NVIDIA-SMI 575.51.02  Driver Ver
```

[llvm-bugs] [Bug 145953] [HLSL][SPIR-V] Invalid SPIRV generated for WaveGetLaneIndex

2025-06-26 Thread LLVM Bugs via llvm-bugs


Issue: 145953
Summary: [HLSL][SPIR-V] Invalid SPIRV generated for WaveGetLaneIndex
Labels: (none)
Assignees: (none)
Reporter: spall

`OpCapability Int8` and `OpCapability Kernel` are generated incorrectly.
The output is also missing `OpCapability GroupNonUniform`, which is required for wave ops
(https://godbolt.org/z/GT7jTTM8a).




[llvm-bugs] [Bug 145934] Request Commit Access For qxy11

2025-06-26 Thread LLVM Bugs via llvm-bugs


Issue: 145934
Summary: Request Commit Access For qxy11
Labels: infra:commit-access-request
Assignees: (none)
Reporter: qxy11

### Why Are you requesting commit access ?

I'm working on LLDB at Meta with @clayborg, @jeffreytan81, and others, and would like commit access to make it easier to contribute to the OSS project in the future. 





[llvm-bugs] [Bug 145937] [opt][SROA] Long time spent in the SROA opt pass

2025-06-26 Thread LLVM Bugs via llvm-bugs


Issue: 145937
Summary: [opt][SROA] Long time spent in the SROA opt pass
Labels: (none)
Assignees: (none)
Reporter: efwright

I have CUDA code that I am attempting to compile for an AMD MI300A GPU. I noticed that one specific file was taking over an hour to compile at -O3. I isolated that file and profiled the compilation, which showed that the time was being spent in the SROA optimization pass. Specifically, the portion of SROA that takes most of the time appears to be the handling of debug records.

I've run llvm-reduce with a timeout to reduce the IR while ensuring it still takes a significant amount of time. The attached file is the reduced IR just before it got to the debug records pass (after this pass the time taken seems to drop greatly). I'm trying to continue the reduction while skipping this pass to see if I can reduce the IR further without reducing the runtime too much. Since I'm using a timeout, this reduction process takes a significant amount of time, so I'll try to keep this issue up to date as I get a better test case.

Right now, the test case I've attached here is about 100,000 lines and takes about 15 minutes on my system to run the SROA pass using the command `opt -p sroa -disable-output sroa_long.ll`
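
For anyone who wants to repeat the timeout-based reduction, an interestingness test along the following lines might work. This is a sketch under assumptions, not the script used for this report: the 60-second threshold and the function names are hypothetical, and it relies on llvm-reduce invoking the test with the candidate IR file as its argument and treating exit code 0 as "still interesting".

```python
import subprocess
import sys


def still_slow(cmd, threshold_s):
    """Return True if `cmd` is still running after `threshold_s` seconds."""
    try:
        subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=threshold_s,  # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return True
    return False


def main(argv):
    ir_file = argv[1]  # candidate IR file supplied by llvm-reduce
    cmd = ["opt", "-p", "sroa", "-disable-output", ir_file]
    # Hypothetical threshold: candidates that finish sooner are rejected.
    return 0 if still_slow(cmd, threshold_s=60) else 1


if __name__ == "__main__" and len(sys.argv) == 2:
    sys.exit(main(sys.argv))
```

Saved as an executable script, it could be passed to llvm-reduce through its `--test` option. One caveat of this approach is that a candidate on which `opt` hangs for an unrelated reason would also count as interesting.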

I'm having trouble attaching a file of that size to this issue, so I have tar'd it for now.

[sroa_long.tar.gz](https://github.com/user-attachments/files/20930668/sroa_long.tar.gz)




[llvm-bugs] [Bug 145940] [HLSL][DirectX][NFC] Move validations from `DXILRootSignature` pass to `Frontend/HLSL/RootSignature`

2025-06-26 Thread LLVM Bugs via llvm-bugs


Issue: 145940
Summary: [HLSL][DirectX][NFC] Move validations from `DXILRootSignature` pass to `Frontend/HLSL/RootSignature`
Labels: backend:DirectX, HLSL
Assignees: (none)
Reporter: inbelic

This issue tracks a non-functional change to move all of the currently static validation functions into a common interface in `Frontend/HLSL/RootSignature`.

This will allow the re-use of these verification functions for https://github.com/llvm/llvm-project/issues/129940.

AC:
- [ ] Extend `HLSLRootSignature.[h|cpp]` to provide an interface for the basic validations of root signatures
- [ ] `DXILRootSignature` should be updated to depend on `Frontend/HLSL/RootSignature` and use the validation interface provided there




[llvm-bugs] [Bug 145942] [NFC][HLSL][DirectX] Move parsing of root signature metadata to `RootSignatureMetadata`

2025-06-26 Thread LLVM Bugs via llvm-bugs


Issue: 145942
Summary: [NFC][HLSL][DirectX] Move parsing of root signature metadata to `RootSignatureMetadata`
Labels: backend:DirectX, HLSL
Assignees: (none)
Reporter: inbelic

This issue tracks the non-functional change of moving all parsing logic of root signature metadata and infrastructure from `DXILRootSignature` to `Frontend/HLSL/RootSignatureMetadata`.

AC:
- [ ] Move the current logic of parsing root signature metadata into a new class interface in `RootSignatureMetadata`
- [ ] Update `DXILRootSignature` to depend on and use the metadata parsing interface provided there.




[llvm-bugs] [Bug 145946] [NFC][HLSL][RootSignature] Split up `HLSLRootSignatureUtils`

2025-06-26 Thread LLVM Bugs via llvm-bugs


Issue: 145946
Summary: [NFC][HLSL][RootSignature] Split up `HLSLRootSignatureUtils`
Labels: HLSL
Assignees: inbelic
Reporter: inbelic

Utils files tend to end up as a dumping ground for many unrelated parts.

This issue tracks the non-functional change to move the metadata-related logic into `RootSignatureMetadata` and the serialization logic into `RootSignature`.

AC:
- [ ] Remove the `RootSignatureUtils` library
- [ ] Move metadata logic into the `RootSignatureMetadata` library
- [ ] Move serialization logic into the `HLSLRootSignature` library




[llvm-bugs] [Bug 145888] [flang][openmp] undefined symbols with device runtime on amdgpu

2025-06-26 Thread LLVM Bugs via llvm-bugs


Issue: 145888
Summary: [flang][openmp] undefined symbols with device runtime on amdgpu
Labels: flang
Assignees: (none)
Reporter: VeeEM

I'm getting some undefined symbols with the flang offload runtime if I try to assign to an array in a target region. Array assignments in target regions have worked previously.

Here's a reproducer:
```
program assign
implicit none

integer, dimension(10) :: xs

!$omp target map(xs)
xs = 1
!$omp end target

print *, xs

end program
```

Trying to build it with
`$ flang -fopenmp --offload-arch=gfx1100 assign.f90 -L$RUNTIME_DIR`
outputs
```
flang-21: warning: OpenMP support in flang is still experimental [-Wexperimental-option]
flang-21: warning: OpenMP support in flang is still experimental [-Wexperimental-option]
flang-21: warning: OpenMP support in flang is still experimental [-Wexperimental-option]
ld.lld: error: undefined symbol: std::in_place_index<0ul>
>>> referenced by /tmp/a.out.amdgcn.gfx1100-542f42.img.lto.o:(void std::_Construct, std::in_place_index_t<0ul> const&, Fortran::runtime::Descriptor&, Fortran::runtime::Descriptor const&, int&, void* (*&)(void*, void const*, unsigned long), Fortran::runtime::typeInfo::DerivedType const*&>(std::__detail::__variant::_Uninitialized*, std::in_place_index_t<0ul> const&, Fortran::runtime::Descriptor&, Fortran::runtime::Descriptor const&, int&, void* (*&)(void*, void const*, unsigned long), Fortran::runtime::typeInfo::DerivedType const*&))
>>> referenced by /tmp/a.out.amdgcn.gfx1100-542f42.img.lto.o:(void std::_Construct, std::in_place_index_t<0ul> const&, Fortran::runtime::Descriptor&, Fortran::runtime::Descriptor const&, int&, void* (*&)(void*, void const*, unsigned long), Fortran::runtime::typeInfo::DerivedType const*&>(std::__detail::__variant::_Uninitialized*, std::in_place_index_t<0ul> const&, Fortran::runtime::Descriptor&, Fortran::runtime::Descriptor const&, int&, void* (*&)(void*, void const*, unsigned long), Fortran::runtime::typeInfo::DerivedType const*&))
>>> referenced by /tmp/a.out.amdgcn.gfx1100-542f42.img.lto.o:(void std::_Construct, std::in_place_index_t<0ul> const&, Fortran::runtime::Descriptor const&, Fortran::runtime::typeInfo::DerivedType const&>(std::__detail::__variant::_Uninitialized*, std::in_place_index_t<0ul> const&, Fortran::runtime::Descriptor const&, Fortran::runtime::typeInfo::DerivedType const&))
>>> referenced 11 more times

ld.lld: error: undefined symbol: Fortran::runtime::DerivedAssignTicket::Begin(Fortran::runtime::WorkQueue&)
>>> referenced by /tmp/a.out.amdgcn.gfx1100-542f42.img.lto.o:(_ZZN7Fortran7runtime6Ticket8ContinueERNS0_9WorkQueueEENK3$_0clINS0_19DerivedAssignTicketILb1EDaRT_)
>>> referenced by /tmp/a.out.amdgcn.gfx1100-542f42.img.lto.o:(_ZZN7Fortran7runtime6Ticket8ContinueERNS0_9WorkQueueEENK3$_0clINS0_19DerivedAssignTicketILb1EDaRT_)

ld.lld: error: undefined symbol: Fortran::runtime::io::descr::DescriptorIoTicket<(Fortran::runtime::io::Direction)0>::Begin(Fortran::runtime::WorkQueue&)
>>> referenced by /tmp/a.out.amdgcn.gfx1100-542f42.img.lto.o:(_ZZN7Fortran7runtime6Ticket8ContinueERNS0_9WorkQueueEENK3$_0clINS0_2io5descr18DescriptorIoTicketILNS6_9DirectionE0EDaRT_)
>>> referenced by /tmp/a.out.amdgcn.gfx1100-542f42.img.lto.o:(_ZZN7Fortran7runtime6Ticket8ContinueERNS0_9WorkQueueEENK3$_0clINS0_2io5descr18DescriptorIoTicketILNS6_9DirectionE0EDaRT_)

ld.lld: error: undefined symbol: Fortran::runtime::io::descr::DescriptorIoTicket<(Fortran::runtime::io::Direction)1>::Begin(Fortran::runtime::WorkQueue&)
>>> referenced by /tmp/a.out.amdgcn.gfx1100-542f42.img.lto.o:(_ZZN7Fortran7runtime6Ticket8ContinueERNS0_9WorkQueueEENK3$_0clINS0_2io5descr18DescriptorIoTicketILNS6_9DirectionE1EDaRT_)
>>> referenced by /tmp/a.out.amdgcn.gfx1100-542f42.img.lto.o:(_ZZN7Fortran7runtime6Ticket8ContinueERNS0_9WorkQueueEENK3$_0clINS0_2io5descr18DescriptorIoTicketILNS6_9DirectionE1EDaRT_)

ld.lld: error: undefined symbol: Fortran::runtime::DerivedAssignTicket::Begin(Fortran::runtime::WorkQueue&)
>>> referenced by /tmp/a.out.amdgcn.gfx1100-542f42.img.lto.o:(_ZZN7Fortran7runtime6Ticket8ContinueERNS0_9WorkQueueEENK3$_0clINS0_19DerivedAssignTicketILb0EDaRT_)
>>> referenced by /tmp/a.out.amdgcn.gfx1100-542f42.img.lto.o:(_ZZN7Fortran7runtime6Ticket8ContinueERNS0_9WorkQueueEENK3$_0clINS0_19DerivedAssignTicketILb0EDaRT_)

ld.lld: error: undefined symbol: Fortran::runtime::DerivedAssignTicket::Continue(Fortran::runtime::WorkQueue&)
>>> referenced by /tmp/a.out.amdgcn.gfx1100-542f42.img.lto.o:(_ZZN7Fortran7runtime6Ticket8ContinueERNS0_9WorkQueueEENK3$_1clINS0_19DerivedAssignTicketILb1EDaRT_)
>>> referenced by /tmp/a.out.amdgcn.gfx1100-542f42.img.lto.o:(_ZZN7Fortran7runtime6Ticket8ContinueERNS0_9WorkQueueEENK3$_1clINS0_19DerivedAssignTicketILb1EDaRT_)

ld.lld: error: undefined sy

[llvm-bugs] [Bug 145874] Request Commit Access For @kaadam,

2025-06-26 Thread LLVM Bugs via llvm-bugs


Issue

145874




Summary

Request Commit Access For @kaadam,




  Labels
  
infra:commit-access-request
  



  Assignees
  
paschalis-mpeis,
ilinpv
  



  Reporter
  
  paschalis-mpeis
  




### Why are you requesting commit access?
@kaadam recently merged important work for BOLT:
- https://github.com/llvm/llvm-project/pull/129231
- and previously: https://github.com/llvm/llvm-project/pull/83394

A handful of important patches are also planned for the future.

Thanks in advance,
Paschalis (AArch64 BOLT maintainer)


___
llvm-bugs mailing list
llvm-bugs@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs


[llvm-bugs] [Bug 145919] Miscompiling error resulting to go into a forloop whereas the condition is false

2025-06-26 Thread LLVM Bugs via llvm-bugs


Issue

145919




Summary

Miscompiling error resulting to go into a forloop whereas the condition is false




  Labels
  
new issue
  



  Assignees
  
  



  Reporter
  
  alliaces
  




Hi!
I'm encountering a problem starting with clang 19.1.0 (clang 20.1.0 also has it) with this kind of code:
https://godbolt.org/z/WWdqcvj6h
```c++
#include <iostream>
#include <vector>

double recursiveEval(int n, const double a) {
  return n <= 0 ? 0 : a + recursiveEval(n - 1, a);
}

double eval(const std::vector<double>& emptyVector) {
  std::vector<double> vec;
  for (const double& it : emptyVector) {
    std::cerr << "BIG ERROR !! I should not be here" << std::endl;
    vec.push_back(recursiveEval(it, it));
  }
  return recursiveEval(vec.size(), vec[0]);
}

int main(int argc, char **argv) {
  std::vector<double> coefs;

  eval(coefs);
  return 0;
}
```

It segfaults, but that is not the problem. The problem is that execution enters the for loop even though the vector _emptyVector_ is empty.

The problem disappears when any of the following holds:
- `vec` or `emptyVector` is not empty
- `vec[0]` is replaced by a prvalue
- the line `return recursiveEval(vec.size(), vec[0]);` is deleted (replaced by any other return).
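
One possible reading of these observations, not confirmed in the report, is that indexing `vec[0]` on an empty `std::vector` is undefined behavior, so the optimizer may assume `vec` is non-empty on every path that reaches the `return`. A minimal sketch of a variant that guards that access follows; the name `evalGuarded` and the fallback value `0.0` are assumptions made for illustration.

```cpp
#include <vector>

// Same helper as in the report: adds `a` to itself `n` times.
double recursiveEval(int n, const double a) {
  return n <= 0 ? 0 : a + recursiveEval(n - 1, a);
}

// Hypothetical variant of eval() that never indexes an empty vector.
double evalGuarded(const std::vector<double>& input) {
  std::vector<double> vec;
  for (const double& it : input) {
    vec.push_back(recursiveEval(static_cast<int>(it), it));
  }
  // Assumption: returning 0.0 for empty input is acceptable to the caller.
  if (vec.empty()) {
    return 0.0;
  }
  return recursiveEval(static_cast<int>(vec.size()), vec[0]);
}
```

With this guard, an empty input never reaches `vec[0]`, so the compiler has no basis for assuming the loop body ran.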


Thanks for your help




[llvm-bugs] [Bug 145929] [mlir][Vector] Add SPIR-V lowering for `vector.to_elements`

2025-06-26 Thread LLVM Bugs via llvm-bugs


Issue

145929




Summary

[mlir][Vector] Add SPIR-V lowering for `vector.to_elements`




  Labels
  
mlir
  



  Assignees
  
kuhar
  



  Reporter
  
  dcaballe
  




Similar to the [lowering to LLVM](https://github.com/llvm/llvm-project/pull/145766), we need to add the corresponding lowering to SPIR-V for `vector.to_elements`.




[llvm-bugs] [Bug 145924] [DirectX] Scalarizer is producing GEP chains that fail validation

2025-06-26 Thread LLVM Bugs via llvm-bugs


Issue

145924




Summary

[DirectX] Scalarizer is producing GEP chains that fail validation




  Labels
  
bug
  



  Assignees
  
  



  Reporter
  
  Icohedron
  




The validator doesn't like GEP chains where an array GEP is followed by a scalar GEP.

If you compile the following LLVM IR to DXIL using `llc -filetype=obj -mtriple=dxil-pc-shadermodel6.3-library`
```llvm
define i32 @gep_chain() #0 {

%1 = alloca [4 x i32], align 4
%2 = getelementptr inbounds nuw [4 x i32], ptr %1, i32 0, i32 2
%3 = getelementptr inbounds nuw i32, ptr %2, i32 1
%4 = load i32, ptr %3
ret i32 %4
}

attributes #0 = { convergent norecurse nounwind "hlsl.export"}
```
and then validate the resulting DXIL using dxv, you get the following validation error (#140417):
```
Function: gep_chain: error: Access to out-of-bounds memory is disallowed.
note: at '%3 = getelementptr inbounds i32, i32* %2, i32 1' in block '#0' of function 'gep_chain'.
Validation failed.
```

These GEP chains can appear when compiling HLSL shaders such as this one: https://godbolt.org/z/ExeGKb7MG
The scalarizer pass is responsible for emitting the scalar GEP in this case, which itself occurs after the dxil-flatten-arrays pass that had attempted to collapse GEP chains. Therefore, we need a later pass, such as the dxil-legalize pass, or a new pass to collapse the GEP chains again.
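
When every index is constant, collapsing such a chain is plain index arithmetic. The sketch below models it outside of LLVM; `GepStep`, `collapseChain`, and the element-count representation are hypothetical names for illustration and are not part of any DXIL pass.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical model of one constant-index GEP step, measured in elements
// of the array's base type (for example i32).
struct GepStep {
  std::int64_t elementOffset;
};

// Folds a chain of constant steps into a single index into the flat array.
// Returns std::nullopt when the combined index falls outside the array.
std::optional<std::int64_t> collapseChain(const std::vector<GepStep>& chain,
                                          std::int64_t arraySize) {
  std::int64_t index = 0;
  for (const GepStep& step : chain) {
    index += step.elementOffset;
  }
  if (index < 0 || index >= arraySize) {
    return std::nullopt;
  }
  return index;
}
```

For the chain in the IR above (array index 2, then scalar offset 1, on `[4 x i32]`), the combined index is 3, which corresponds to a single GEP of the form `getelementptr inbounds [4 x i32], ptr %1, i32 0, i32 3`.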

I propose that to solve this cleanly, we should shift the responsibility of collapsing GEP chains from the dxil-flatten-arrays pass over to the dxil-legalize pass.

Another reason to move GEP-chain collapsing over to the dxil-legalize pass is to consolidate i8 GEP transformations as well. Currently, i8 GEPs can appear during dxil-data-scalarization and dxil-flatten-arrays (see #145780), both of which run before the dxil-legalize pass that is responsible for i8 GEP legalization. By moving the GEP-chain-collapsing logic from the dxil-flatten-arrays pass over to the dxil-legalize pass, the dxil-legalize pass can simultaneously collapse GEP chains and legalize any i8 and scalar GEPs in those GEP chains.

The dxil-flatten-arrays pass's `visitGetElementPtrInst` can be simplified down to the following:
```c++
/// This visitor simply converts GEPs on multidimensional arrays into GEPs on
/// flattened arrays. A later "GEP chain collapser" pass will be used to combine
/// GEP chains into single flat GEPs, including cases of i8 GEPs, and scalar
/// GEPs introduced by the scalarizer.
bool DXILFlattenArraysVisitor::visitGetElementPtrInst(GetElementPtrInst &GEP) {
  if (!isMultiDimensionalArray(GEP.getSourceElementType())) {
    return false;
  }

  ArrayType *ArrType = cast<ArrayType>(GEP.getSourceElementType());
  IRBuilder<> Builder(&GEP);
  auto [TotalElements, BaseType] = getElementCountAndType(ArrType);
  ArrayType *FlattenedArrayType = ArrayType::get(BaseType, TotalElements);
  Value *PtrOperand = GEP.getPointerOperand();

  Value *FlattenedIndex;

  const DataLayout &DL = GEP.getDataLayout();
  unsigned BitWidth = DL.getIndexTypeSizeInBits(GEP.getType());
  APInt ConstantOffset(BitWidth, 0);
  SmallMapVector<Value *, APInt, 4> VariableOffsets;
  bool Success = GEP.collectOffset(DL, BitWidth, VariableOffsets, ConstantOffset);
  (void)Success;
  assert(Success && "Failed to collect offsets for GEP");

  // GEP.collectOffset returns the offset in bytes. So we need to divide its
  // offsets by the size in bytes of the BaseType
  unsigned BaseTypeSizeInBytes = BaseType->getPrimitiveSizeInBits() / 8;

  Value *ZeroIndex = Builder.getInt({BitWidth, 0});
  FlattenedIndex = Builder.getInt(ConstantOffset.udiv(BaseTypeSizeInBytes));
  for (auto [VarIndex, Multiplier] : VariableOffsets) {
    Value *ConstIntMul = Builder.getInt(Multiplier.udiv(BaseTypeSizeInBytes));
    Value *MulVarIndex = Builder.CreateMul(VarIndex, ConstIntMul);
    FlattenedIndex = Builder.CreateAdd(FlattenedIndex, MulVarIndex);
  }

  Value *NewGEP = Builder.CreateGEP(
      FlattenedArrayType, PtrOperand, {ZeroIndex, FlattenedIndex},
      GEP.getName(), GEP.getNoWrapFlags());

  GEP.replaceAllUsesWith(NewGEP);
  GEP.eraseFromParent();
  return true;
}
```
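
The byte-offset division in the snippet above can be checked by hand. As a sketch under assumed dimensions (a `[2 x [3 x i32]]` array, chosen for illustration and not taken from the report), the byte offset of element `(i, j)` is `i * 12 + j * 4`, and dividing by the 4-byte element size gives the flat index `i * 3 + j`. The helper names below are hypothetical.

```cpp
#include <cstdint>

// Assumed shape for illustration: [2 x [3 x i32]].
constexpr std::uint64_t kCols = 3;
constexpr std::uint64_t kElemBytes = 4;  // size of i32 in bytes

// Byte offset of element (i, j); bytes are the unit that the snippet's
// comment attributes to GEP.collectOffset.
constexpr std::uint64_t byteOffset(std::uint64_t i, std::uint64_t j) {
  return i * (kCols * kElemBytes) + j * kElemBytes;
}

// Flat index into the equivalent [6 x i32] array: each byte multiplier is
// divided by the element size, mirroring the udiv calls in the visitor.
constexpr std::uint64_t flatIndex(std::uint64_t i, std::uint64_t j) {
  return i * ((kCols * kElemBytes) / kElemBytes) +
         j * (kElemBytes / kElemBytes);
}

static_assert(byteOffset(1, 2) / kElemBytes == flatIndex(1, 2),
              "byte offset divided by element size equals the flat index");
```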
This will flatten all GEPs, but not collapse any GEP chains. Again, the collapsing of GEP chains could be handled by the dxil-legalize pass (or a new pass).







[llvm-bugs] [Bug 145932] [libc] add basic lifetime annotations

2025-06-26 Thread LLVM Bugs via llvm-bugs


Issue

145932




Summary

[libc] add basic lifetime annotations




  Labels
  
libc
  



  Assignees
  
  



  Reporter
  
  SchrodingerZhu
  




Reject the following cases

```
cpp::string_view test() {
  char data[4] = "123";
  return cpp::string_view {data};
}
```

with

```
[1/2] Building CXX object libc/src/__support/StringUtil/CMakeFiles/libc.src.__support.StringUtil.error_to_string.dir/error_to_string.cpp.o
FAILED: libc/src/__support/StringUtil/CMakeFiles/libc.src.__support.StringUtil.error_to_string.dir/error_to_string.cpp.o 
sccache /usr/bin/clang++ -DLIBC_NAMESPACE=__llvm_libc_20_0_0_git -I/home/schrodingerzy/Documents/llvm-project/libc -isystem /home/schrodingerzy/Documents/llvm-project/build/libc/include -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -O2 -g -DNDEBUG -std=gnu++17 -DLIBC_QSORT_IMPL=LIBC_QSORT_QUICK_SORT -DLIBC_ADD_NULL_CHECKS -DLIBC_ERRNO_MODE=LIBC_ERRNO_MODE_DEFAULT -fpie -ffreestanding -DLIBC_FULL_BUILD -nostdlibinc -idirafter/usr/include -ffixed-point -fno-builtin -fno-exceptions -fno-lax-vector-conversions -fno-unwind-tables -fno-asynchronous-unwind-tables -fno-rtti -ftrivial-auto-var-init=pattern -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -Wall -Wextra -Werror -Wconversion -Wno-sign-conversion -Wdeprecated -Wno-c99-extensions -Wno-gnu-imaginary-constant -Wno-pedantic -Wimplicit-fallthrough -Wwrite-strings -Wextra-semi -Wnewline-eof -Wnonportable-system-include-path -Wstrict-prototypes -Wthread-safety -Wglobal-constructors -MD -MT libc/src/__support/StringUtil/CMakeFiles/libc.src.__support.StringUtil.error_to_string.dir/error_to_string.cpp.o -MF libc/src/__support/StringUtil/CMakeFiles/libc.src.__support.StringUtil.error_to_string.dir/error_to_string.cpp.o.d -o libc/src/__support/StringUtil/CMakeFiles/libc.src.__support.StringUtil.error_to_string.dir/error_to_string.cpp.o -c /home/schrodingerzy/Documents/llvm-project/libc/src/__support/StringUtil/error_to_string.cpp
/home/schrodingerzy/Documents/llvm-project/libc/src/__support/StringUtil/error_to_string.cpp:87:28: error: address of stack memory associated with local variable 'data' returned [-Werror,-Wreturn-stack-address]
   87 |   return cpp::string_view {data};
      |                            ^~~~
1 error generated.
ninja: build stopped: subcommand failed.
```
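
One way to obtain a diagnostic like the one above is Clang's `[[clang::lifetimebound]]` attribute on the pointer parameter of the view's constructor. The sketch below is an assumption about how such an annotation could look, not the actual `cpp::string_view` change; `demo::StringView` and `DEMO_LIFETIMEBOUND` are hypothetical names.

```cpp
#include <cstddef>
#include <cstring>

// Use the attribute only where the compiler knows it, so that other
// compilers build the code unchanged.
#if defined(__has_cpp_attribute)
#if __has_cpp_attribute(clang::lifetimebound)
#define DEMO_LIFETIMEBOUND [[clang::lifetimebound]]
#endif
#endif
#ifndef DEMO_LIFETIMEBOUND
#define DEMO_LIFETIMEBOUND
#endif

namespace demo {

// Hypothetical stand-in for a non-owning string view.
class StringView {
  const char* data_;
  std::size_t size_;

public:
  // The annotation states that the view refers to the memory behind `str`.
  explicit StringView(const char* str DEMO_LIFETIMEBOUND)
      : data_(str), size_(std::strlen(str)) {}

  std::size_t size() const { return size_; }
  const char* data() const { return data_; }
};

// Fine: the string literal has static storage duration.
inline StringView ok() { return StringView{"123"}; }

// A function returning StringView{data} for a local `char data[4]` is the
// case the report wants rejected; it is left out so that this sketch still
// builds with -Werror.

}  // namespace demo
```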




[llvm-bugs] [Bug 145875] [SCEV] Repeated max not hoisted out of loop

2025-06-26 Thread LLVM Bugs via llvm-bugs


Issue

145875




Summary

[SCEV] Repeated max not hoisted out of loop




  Labels
  
llvm:optimizations,
llvm:SCEV,
missed-optimization
  



  Assignees
  
  



  Reporter
  
  aengelke
  




The repeated computation of a maximum/minimum is not hoisted outside of a loop, preventing the elimination of the loop (instead, the loop gets pointlessly vectorized).

Example (https://godbolt.org/z/faM4od4dM; see also: https://alive2.llvm.org/ce/z/YqPC82)
```c++
#include <cstdint>
uint8_t f(uint8_t n, uint8_t a, uint8_t b) {
  for (uint8_t i = 0; i < n; i++)
    a = a < b ? b : a;
  return a;
}
```

To me this looks like something that SCEV should recognize (I might be wrong here, but in the end, the loop should get eliminated).
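
The expected simplification can be stated directly: the update `a = max(a, b)` is idempotent, so every iteration after the first leaves `a` unchanged. A sketch of the loop-free form follows; `f_closed` and `equivalent` are hypothetical names, and this illustrates the equivalence only, it is not output from LLVM.

```cpp
#include <cstdint>

// The loop from the report.
std::uint8_t f(std::uint8_t n, std::uint8_t a, std::uint8_t b) {
  for (std::uint8_t i = 0; i < n; i++)
    a = a < b ? b : a;
  return a;
}

// Loop-free equivalent: one max if the loop runs at all, otherwise `a`.
std::uint8_t f_closed(std::uint8_t n, std::uint8_t a, std::uint8_t b) {
  if (n == 0)
    return a;
  return a < b ? b : a;
}

// Compares both versions for all values of a and b and a few trip counts.
bool equivalent() {
  const int counts[] = {0, 1, 2, 255};
  for (int n : counts)
    for (int a = 0; a < 256; ++a)
      for (int b = 0; b < 256; ++b) {
        const auto un = static_cast<std::uint8_t>(n);
        const auto ua = static_cast<std::uint8_t>(a);
        const auto ub = static_cast<std::uint8_t>(b);
        if (f(un, ua, ub) != f_closed(un, ua, ub))
          return false;
      }
  return true;
}
```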




[llvm-bugs] [Bug 145956] [clang] For-range with structured bindings during constant evaluation

2025-06-26 Thread LLVM Bugs via llvm-bugs


Issue

145956




Summary

[clang] For-range with structured bindings during constant evaluation




  Labels
  
clang
  



  Assignees
  
  



  Reporter
  
  katzdm
  




As far as I can tell, the following should be well-formed:

```cpp
consteval auto f() -> int {
  struct Pair { int first; int second; };

  Pair arr[] = {{1, 1}};
  for (auto const& [key, value] : arr) {
    [=] { [[maybe_unused]] int s = key; }();
  }
  return 0;
}

int p = f();
```
https://compiler-explorer.com/z/5cGfrMTd7

GCC and MSVC accept; Clang and EDG reject, but EDG has acknowledged their behavior as a bug.

I did some preliminary digging, and the issue looks to occur while computing the lvalue-to-rvalue conversion of the `DeclRefExpr` that names `key` in the initializer of `s`; the `findCompleteObject` routine seems to be tripping over the structured binding and returns an empty `CompleteObject` [here](https://github.com/llvm/llvm-project/blob/a0c5f1992d2188dd58987445aa00a55edad2357f/clang/lib/AST/ExprConstant.cpp#L4488).




[llvm-bugs] [Bug 145981] [clang-tidy] Check request: readability-unnecessary-unique-release

2025-06-26 Thread LLVM Bugs via llvm-bugs


Issue

145981




Summary

[clang-tidy] Check request: readability-unnecessary-unique-release




  Labels
  
clang-tidy
  



  Assignees
  
  



  Reporter
  
  denzor200
  




`std::shared_ptr`'s constructor has many overloads, one of which takes a `unique_ptr` parameter.
Many people don't know that this overload exists.
I've seen code like this all over the place:
```
std::shared_ptr<Foo> process(std::unique_ptr<Foo> foo) {
   // ...
   return std::shared_ptr<Foo>(foo.release());
}
```

A check is needed that finds such patterns and rewrites them to use the constructor overload that takes a `unique_ptr` parameter:
```
std::shared_ptr<Foo> process(std::unique_ptr<Foo> foo) {
   // ...
   return std::shared_ptr<Foo>(std::move(foo));
}
```
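
Besides readability, the two forms differ when the `unique_ptr` carries a custom deleter: `release()` hands over a raw pointer and drops the deleter, while the converting constructor of `std::shared_ptr` takes the deleter along. The sketch below shows the suggested form; `Foo`, `CountingDeleter`, `deleterCalls`, and `deleterRunsAfterProcess` are hypothetical names used only for illustration.

```cpp
#include <memory>
#include <utility>

struct Foo {
  int value = 0;
};

// Counts how often the custom deleter runs.
int deleterCalls = 0;

struct CountingDeleter {
  void operator()(Foo* p) const {
    ++deleterCalls;
    delete p;
  }
};

// Suggested form: move the unique_ptr into the shared_ptr so that the
// deleter is transferred together with the pointer.
std::shared_ptr<Foo> process(std::unique_ptr<Foo, CountingDeleter> foo) {
  return std::shared_ptr<Foo>(std::move(foo));
}

// Creates a Foo, converts it, drops the last owner, and reports how many
// times the custom deleter ran.
int deleterRunsAfterProcess() {
  deleterCalls = 0;
  {
    std::shared_ptr<Foo> sp =
        process(std::unique_ptr<Foo, CountingDeleter>(new Foo{42}));
    (void)sp;
  }
  return deleterCalls;
}
```

With `std::shared_ptr<Foo>(foo.release())` in `process`, the object would instead be destroyed with plain `delete`, bypassing `CountingDeleter`.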

