[llvm-bugs] [Bug 145559] # Explore options for lowering of vector.contract to FEAT_I8MM, unified between Neon and SVE
Issue 145559
Summary # Explore options for lowering of vector.contract to FEAT_I8MM, unified between Neon and SVE
Labels mlir, mlir:neon, mlir:sve
Assignees
Reporter momchil-velikov

MLIR contains two patterns for lowering of `vector.contract` to FEAT_I8MM "instructions":

* `LowerContractionToNeonI8MMPattern`
* `LowerContractionToSVEI8MMPattern`

It may be possible and beneficial to develop a unified pattern, able to generate code for either Neon or SVE.

There are some differences in the functionality between the patterns:

* The Neon pattern handles "arbitrary"[1] indexing maps; the SVE pattern handles only the usual "identities + transposed RHS" one.
* The Neon pattern can handle input operands of types `iN, N <= 8`; the SVE pattern only handles `i8`.
* In the Neon pattern, the left-hand side, right-hand side, and accumulator/output must be evenly breakable into `2x8`, `8x2`, and `2x2` tiles, respectively; a one-dimensional left-hand side is also supported. In the SVE pattern, the left-hand side, right-hand side, and accumulator/output must have shapes ``+, `<8x[N]>`, and ``, respectively, with `M` and `N` even. Notably, the `K` dimension is fixed to 8 and only the `N` dimension is allowed to be scalable.

[1] "arbitrary" in the sense that it does not impose explicit requirements on the maps and handles them in a generic manner; however, the pattern does not trigger for `<4x8> * <8x4>` with canonical/textbook matrix multiplication maps, whereas it does trigger for `<4x8> * <4x8>` with maps for a transposed right-hand side. It is unclear whether this is a bug or by design. (Also, indexing maps are not entirely "arbitrary": they need to make sense in the context of `vector.contract`.)

Before any unification, it would be nice if the functionality of both patterns converged to a common point.

A. Indexing maps

* Restrict the Neon indexing maps

  This is straightforward.

* Support "arbitrary" indexing maps with SVE (are there any variants other than straight and transposed?)

  This is a bit more involved, but still doable, under the assumption that one would need at most one extra transpose op to accommodate the data layout expected by the FEAT_I8MM instructions.

B. "Small" integer types (`i4`, `i6`, etc.)

* It does not seem reasonable to remove this from Neon.
* It should not be a problem to add this to SVE. It may or may not expose the need to add codegen elsewhere (i.e. sign-/zero-extend with scalable vector types).

C. Input/output shapes

This needs a lot of thought. The restriction `K == 8` is fairly fundamental to the SVE pattern and provides a number of adjacency guarantees (in the context of FEAT_I8MM). It won't be easy to lift that restriction.

In the context of tiled matrix multiplication (where the operands to the `vector.contract` do not represent the whole matrix, but just tiles of a bigger one) the ability to have tile dimensions many multiples of 8 is unlikely to be very valuable: even an 8x8 output tile would require 16 SIMD registers, and bigger tiles may exceed the number of available registers and introduce spills in something that is likely to be an inner loop.

___
llvm-bugs mailing list
llvm-bugs@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
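As a reference for the `2x8`, `8x2`, and `2x2` tile shapes mentioned in Bug 145559, the following is a minimal scalar sketch of what one FEAT_I8MM signed matrix multiply-accumulate step computes, plus the register arithmetic behind the "8x8 output tile needs 16 SIMD registers" remark. It is not taken from either MLIR pattern; the names `mmlaTile`, `filledTile`, and `accRegisters` are hypothetical, and the right-hand side is assumed to be passed transposed (two rows of eight `i8` values).

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical scalar model, for illustration only.
using Row8 = std::array<std::int8_t, 8>;
using Tile2x8 = std::array<Row8, 2>;
using Acc2x2 = std::array<std::array<std::int32_t, 2>, 2>;

// acc (2x2, i32) += lhs (2x8, i8) * rhs (8x2, i8), with rhs given transposed
// as rhsT (2x8), so each output element is a dot product of two 8-element rows.
inline Acc2x2 mmlaTile(Acc2x2 acc, const Tile2x8 &lhs, const Tile2x8 &rhsT) {
  for (int m = 0; m < 2; ++m)
    for (int n = 0; n < 2; ++n)
      for (int k = 0; k < 8; ++k)
        acc[m][n] += std::int32_t(lhs[m][k]) * std::int32_t(rhsT[n][k]);
  return acc;
}

// Build a 2x8 tile with every element set to v.
inline Tile2x8 filledTile(std::int8_t v) {
  Tile2x8 t{};
  for (auto &row : t) row.fill(v);
  return t;
}

// Number of 128-bit registers needed to hold an MxN accumulator of i32 values.
constexpr int accRegisters(int m, int n) { return m * n * 32 / 128; }

// Small usage check: an 8x8 i32 accumulator occupies 16 128-bit registers.
inline void checkTileModel() {
  assert(accRegisters(8, 8) == 16);
  assert(mmlaTile(Acc2x2{}, filledTile(1), filledTile(2))[0][0] == 16);
}
```

A `2x2` accumulator fits exactly one 128-bit register, which is why larger output tiles are counted in multiples of `2x2` blocks.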
[llvm-bugs] [Bug 145479] flang fails to compile with: ../flang/lib/Evaluate/intrinsics-library.cpp:225:26: error: address of overloaded function 'acos' does not match required type '__float128 (__floa
Issue 145479 Summary flang fails to compile with: ../flang/lib/Evaluate/intrinsics-library.cpp:225:26: error: address of overloaded function 'acos' does not match required type '__float128 (__float128)' Labels flang Assignees Reporter stefson hello everyone, I tried to compile flang-21 on an x86_64 linux system, with musl instead of glibc, and got this error: ``` 186443-FAILED: lib/Evaluate/CMakeFiles/FortranEvaluate.dir/intrinsics-library.cpp.o 186530-/usr/lib/llvm/21/bin/x86_64-pc-linux-musl-clang++ -DHAS_QUADMATHLIB -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/var/tmp/portage/llvm-core/flang-21.0.0_pre20250607/work/flang_build/lib/Evaluate -I/var/tmp/portage/llvm-core/flang-21.0.0_pre20250607/work/flang/lib/Evaluate -I/var/tmp/portage/llvm-core/flang-21.0.0_pre20250607/work/flang/include -I/var/tmp/portage/llvm-core/flang-21.0.0_pre20250607/work/flang_build/include -isystem /usr/lib/llvm/21/include -DNDEBUG -O2 -pipe -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -Wno-deprecated-copy -Wno-string-conversion -Wno-ctad-maybe-unsupported -Wno-unused-command-line-argument -Wstring-conversion -Wcovered-switch-default -Wno-nested-anon-types -std=c++17 -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -MD -MT lib/Evaluate/CMakeFiles/FortranEvaluate.dir/intrinsics-library.cpp.o -MF lib/Evaluate/CMakeFiles/FortranEvaluate.dir/intrinsics-library.cpp.o.d -o lib/Evaluate/CMakeFiles/FortranEvaluate.dir/intrinsics-library.cpp.o -c 
/var/tmp/portage/llvm-core/flang-21.0.0_pre20250607/work/flang/lib/Evaluate/intrinsics-library.cpp 188145:/var/tmp/portage/llvm-core/flang-21.0.0_pre20250607/work/flang/lib/Evaluate/intrinsics-library.cpp:225:26: error: address of overloaded function 'acos' does not match required type '__float128 (__float128)' 188381- 225 | FolderFactory::Create("acos"), 188458- | ^ -- 192906- 2208 | acos(const std::complex<_Tp>& __z) 192968- | ^ 192991:/var/tmp/portage/llvm-core/flang-21.0.0_pre20250607/work/flang/lib/Evaluate/intrinsics-library.cpp:225:26: error: address of overloaded function 'acos' does not match required type '_Complex __float128 (_Complex __float128)' 193249- 225 | FolderFactory::Create("acos"), 193326- | ^ ``` I'm using the main branch, the commit I've checked out is https://github.com/llvm/llvm-project/commit/23d0c7348aacdfcb145a69e533a14131bae830cc
[llvm-bugs] [Bug 145512] [VectorCombine] Expand Intrinsic::vector_insert calls into shufflevectors
Issue 145512
Summary [VectorCombine] Expand Intrinsic::vector_insert calls into shufflevectors
Labels good first issue, llvm::vectorcombine
Assignees
Reporter RKSimon

This is already performed in InstCombinerImpl.visitCallInst, but this occurs AFTER vector-combine has run, so further cost-driven vector-combine shuffle folds are missed.

We should only need to handle vector_insert calls with FixedVectorTypes and there might not be any reason for a cost check, but we must ensure the shuffles are added back to the worklist to allow further expansion.
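To make the expansion requested in Bug 145512 concrete, here is a rough sketch of the two shuffle masks involved when a fixed-width subvector is inserted at a constant index: one mask widens the subvector to the destination length, and a second mask blends it into the destination. This uses plain `std::vector<int>` rather than LLVM types; `widenMask`, `insertMask`, and `kPoison` are hypothetical names, and the code is an illustration of the idea, not the InstCombine or VectorCombine implementation.

```cpp
#include <cassert>
#include <vector>

// Mask element meaning "don't care" (assumed here to be -1, as in LLVM's
// poison mask element).
constexpr int kPoison = -1;

// Step 1: widen the subvector to the destination length:
//   <0, 1, ..., subLen-1, poison, ..., poison>
inline std::vector<int> widenMask(int dstLen, int subLen) {
  assert(subLen >= 0 && subLen <= dstLen);
  std::vector<int> mask(dstLen, kPoison);
  for (int i = 0; i < subLen; ++i)
    mask[i] = i;
  return mask;
}

// Step 2: blend. Indices [0, dstLen) select from the destination vector and
// indices [dstLen, 2*dstLen) select from the widened subvector.
inline std::vector<int> insertMask(int dstLen, int subLen, int idx) {
  assert(idx >= 0 && idx + subLen <= dstLen);
  std::vector<int> mask(dstLen);
  for (int i = 0; i < dstLen; ++i) {
    bool inSub = i >= idx && i < idx + subLen;
    mask[i] = inSub ? dstLen + (i - idx) : i;
  }
  return mask;
}
```

For example, inserting a 2-element subvector at index 4 of an 8-element vector gives the blend mask `<0, 1, 2, 3, 8, 9, 6, 7>`.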
[llvm-bugs] [Bug 145511] [SLP] Avoid -passes=instcombine stages in SLP tests
Issue 145511
Summary [SLP] Avoid -passes=instcombine stages in SLP tests
Labels good first issue, llvm:SLPVectorizer
Assignees
Reporter RKSimon

As discussed on #144933, we shouldn't have multiple pass stages in SLP tests, as that obscures the effect of the vectorizer, especially as in most cases slp-vectorizer and instcombine are not sequential.

The SLP tests should only run `-passes=slp-vectorizer` in nearly all cases, and if there is a need to check for additional IR then we should consider copying / moving the test to PhaseOrdering.
[llvm-bugs] [Bug 145508] memcmp() returns unexpected value
Issue 145508
Summary memcmp() returns unexpected value
Labels new issue
Assignees
Reporter 8ss-boop

```
#include <stdio.h>
#include <string.h>

int foo(const char *a, const char *b) {
  return memcmp(a, b, 5);
}

int main() {
  int x = foo("HELLO C23", "C23 STD");
  int y = memcmp("HELLO C23", "C23 STD", 5);
  printf("%d = %d\n", x, y);
}
```

At the -O0 optimization level, when called through the function foo(), it incorrectly returns 5 instead of 1.

```
5 = 1
```

Reproduce link: https://godbolt.org/z/4aqY8x3TT
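As context for triaging Bug 145508: ISO C specifies only the sign of the value returned by `memcmp` (greater than, equal to, or less than zero), not its magnitude. The value 5 matches the byte difference `'H' - 'C'` (72 - 67 in ASCII), and both 5 and 1 are positive. A minimal sketch of a comparison that does not depend on the magnitude follows; `signOf` and `memcmpSign` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstring>

// Collapse any integer to -1, 0, or +1 so that only its sign is observed.
constexpr int signOf(int v) { return (v > 0) - (v < 0); }

// Compare the first n bytes of a and b and report only the ordering.
inline int memcmpSign(const void *a, const void *b, std::size_t n) {
  return signOf(std::memcmp(a, b, n));
}

// The two values from the report carry the same sign.
inline void checkReportedValues() {
  assert(signOf(5) == signOf(1));
}
```

With this wrapper, the direct call and the call through `foo()` from the report would both yield 1 regardless of which `memcmp` implementation or folding is used.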
[llvm-bugs] [Bug 145514] [MLIR] Inconsistent output when executing MLIR program with and without `-tosa-layerwise-constant-fold`
Issue 145514
Summary [MLIR] Inconsistent output when executing MLIR program with and without `-tosa-layerwise-constant-fold`
Labels
Assignees
Reporter Lambor24

My git version is [e04c938](https://github.com/llvm/llvm-project/commit/e04c938cc08a90ae60440ce22d072ebc69d67ee8).

## Description:

I am experiencing an inconsistent result when executing the same MLIR program with and without `-tosa-layerwise-constant-fold`.

## Steps to Reproduce:

### 1. **MLIR Program (test.mlir)**:

test.mlir:
```
module {
  func.func private @printMemrefI64(tensor<*xi64>)
  func.func @main() {
    %0 = "tosa.const"() <{values = dense : tensor<7xi1>}> : () -> tensor<7xi1>
    %1 = tosa.reduce_any %0 {axis = 0 : i32} : (tensor<7xi1>) -> tensor<1xi1>
    %2 = tosa.cast %1 : (tensor<1xi1>) -> tensor<1xi64>
    %cast = tensor.cast %2 : tensor<1xi64> to tensor<*xi64>
    call @printMemrefI64(%cast) : (tensor<*xi64>) -> ()
    return
  }
}
```

### 2. **Command to Run Without `-tosa-layerwise-constant-fold`:**

```
/path/llvm-project/build/bin/mlir-opt test.mlir -pass-pipeline='builtin.module(func.func(tosa-to-linalg))' | \
/path/llvm-project/build/bin/mlir-opt -tosa-to-arith -one-shot-bufferize="bufferize-function-boundaries" -convert-linalg-to-loops -expand-strided-metadata -lower-affine -convert-scf-to-cf -convert-cf-to-llvm -convert-arith-to-llvm -finalize-memref-to-llvm -convert-func-to-llvm -reconcile-unrealized-casts | \
/path/llvm-project/build/bin/mlir-runner -e main -entry-point-result=void \
-shared-libs=/path/llvm-project/build/lib/libmlir_runner_utils.so \
-shared-libs=/path/llvm-project/build/lib/libmlir_c_runner_utils.so
```

### 3. **Output Without `-tosa-layerwise-constant-fold`:**

```
[[1]]
```

### 4. **Command to Run With `-tosa-layerwise-constant-fold`:**

```
/path/llvm-project/build/bin/mlir-opt test.mlir -tosa-layerwise-constant-fold | \
/path/llvm-project/build/bin/mlir-opt -pass-pipeline='builtin.module(func.func(tosa-to-linalg))' | \
/path/llvm-project/build/bin/mlir-opt -tosa-to-arith -one-shot-bufferize="bufferize-function-boundaries" -convert-linalg-to-loops -expand-strided-metadata -lower-affine -convert-scf-to-cf -convert-cf-to-llvm -convert-arith-to-llvm -finalize-memref-to-llvm -convert-func-to-llvm -reconcile-unrealized-casts | \
/path/llvm-project/build/bin/mlir-runner -e main -entry-point-result=void \
-shared-libs=/path/llvm-project/build/lib/libmlir_runner_utils.so \
-shared-libs=/path/llvm-project/build/lib/libmlir_c_runner_utils.so
```

### 5. **Output With `-tosa-layerwise-constant-fold`:**

```
[[-1]]
```

I'm not sure whether there is a bug in my program or whether incorrect usage of the above passes caused this result.
[llvm-bugs] [Bug 145589] [HLSL] Add -Gis option from DXC to clang-dxc
Issue 145589
Summary [HLSL] Add -Gis option from DXC to clang-dxc
Labels new issue
Assignees bob80905
Reporter bob80905

The dxc driver mode for clang, or clang-dxc.exe, needs to accept the -Gis compiler option, which dxc.exe accepts. The -Gis compiler option enables IEEE strictness mode, constraining the behavior for NaNs and infs. Enabling -Gis in clang-dxc will help to get more consistency across tests in the offload test suite that use these special float values for float16s.
[llvm-bugs] [Bug 145499] [Flang] Hang / infinite loop regression in OpenMP tests (in last ~2 weeks)
Issue 145499
Summary [Flang] Hang / infinite loop regression in OpenMP tests (in last ~2 weeks)
Labels flang
Assignees
Reporter mgorny

While retesting Flang last Saturday, I noticed a few tests were hanging. Today I rechecked, and they still are. The hanging tests are:

```
Flang :: Integration/OpenMP/atomic-capture-complex.f90
Flang :: Lower/OpenMP/atomic-implicit-cast.f90
Flang :: Lower/OpenMP/atomic-privatize.f90
Flang :: Lower/OpenMP/common-atomic-lowering.f90
Flang :: Lower/OpenMP/dump-atomic-analysis.f90
```

This is a Gentoo amd64 standalone build. Confirmed with 437346378fd4d40af30e6969621a605cbd6215d1 and c5972da34a08e5568e2b14e4c6f82c86e25a452a.

I'm going to try bisecting, but it's going to take a fair while.
[llvm-bugs] [Bug 145525] CGHLSLRuntime::initializeBufferFromBinding unconditionally dereferencing RBA after a nullptr check
Issue 145525
Summary CGHLSLRuntime::initializeBufferFromBinding unconditionally dereferencing RBA after a nullptr check
Labels new issue
Assignees
Reporter shafik

On this line we have a `nullptr` check on `RBA`:

https://github.com/llvm/llvm-project/blob/12ba75145efe3ada44374036d7d5b5e94e855283/clang/lib/CodeGen/CGHLSLRuntime.cpp#L592-L593

but soon after we unconditionally dereference `RBA`:

https://github.com/llvm/llvm-project/blob/12ba75145efe3ada44374036d7d5b5e94e855283/clang/lib/CodeGen/CGHLSLRuntime.cpp#L596-L600

So one of these lines is a bug; I am not sure which it is.

This change was brought in by: https://github.com/llvm/llvm-project/pull/139022
[llvm-bugs] [Bug 145571] [DXIL] IsSpecialFloat for 16-bit types must be emulated
Issue 145571
Summary [DXIL] IsSpecialFloat for 16-bit types must be emulated
Labels backend:DirectX
Assignees
Reporter llvm-beanz

Due to a [bug in DXC](https://github.com/microsoft/DirectXShaderCompiler/issues/7496), the `IsSpecialFloat` DXIL operations were never generated for 16-bit types. As a result, drivers are not able to handle the operation, so we need to update Clang to emulate the functions using integers.

We are also tracking a change for SM 6.9 to address this issue (https://github.com/microsoft/hlsl-specs/issues/521).
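To illustrate what "emulate the functions using integers" in Bug 145571 could look like, here is a small sketch that classifies an IEEE 754 binary16 value from its raw 16-bit pattern (1 sign bit, 5 exponent bits, 10 mantissa bits). It is written as ordinary C++ for readability and is only an assumption about the general approach; the actual lowering would be expressed in the compiler's IR, and the names `isInf16`, `isNaN16`, `isFinite16`, and `isNormal16` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Exponent field of a binary16 value (bits 14:10).
constexpr std::uint16_t kExpMask = 0x7C00;
// Everything except the sign bit.
constexpr std::uint16_t kAbsMask = 0x7FFF;

// Infinity: exponent all ones, mantissa zero.
constexpr bool isInf16(std::uint16_t bits) {
  return (bits & kAbsMask) == kExpMask;
}

// NaN: exponent all ones, mantissa non-zero.
constexpr bool isNaN16(std::uint16_t bits) {
  return (bits & kAbsMask) > kExpMask;
}

// Finite: exponent not all ones.
constexpr bool isFinite16(std::uint16_t bits) {
  return (bits & kExpMask) != kExpMask;
}

// Normal: exponent neither all zeros (zero/subnormal) nor all ones (inf/NaN).
constexpr bool isNormal16(std::uint16_t bits) {
  return (bits & kExpMask) != 0 && (bits & kExpMask) != kExpMask;
}

// Usage check on a few well-known bit patterns.
inline void checkHalfClassification() {
  assert(isInf16(0x7C00));    // +inf
  assert(isNaN16(0x7E00));    // quiet NaN
  assert(isNormal16(0x3C00)); // 1.0
}
```

Each predicate needs only a mask and an integer comparison, which is the property that makes this kind of emulation practical when the floating-point classification operation itself is unavailable.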
[llvm-bugs] [Bug 145521] nested class with out-of-class-definition and requires clause fails to compile
Issue 145521 Summary nested class with out-of-class-definition and requires clause fails to compile Labels new issue Assignees Reporter bernd5 The following code fails to compile: ```c++ template concept is_valid = true; template class Nesting { public: template requires is_valid class Inner; }; template template requires is_valid class Nesting::Inner {}; ``` with the error message: ``` :13:31: error: requires clause differs in template redeclaration 13 | template requires is_valid | ^ :8:35: note: previous template declaration is here 8 | template requires is_valid | ^ 1 error generated. Compiler returned: 1 ```
[llvm-bugs] [Bug 145618] [RISCV] Hints and Custom Extensions (specifically Xqci)
Issue 145618
Summary [RISCV] Hints and Custom Extensions (specifically Xqci)
Labels backend:RISC-V
Assignees
Reporter lenary

The description of the C/Zca hints has recently been updated: https://github.com/riscv/riscv-isa-manual/pull/2093. This has made some clarifications and minor updates to how hints are defined.

This issue is to look at how this affects Xqci, which uses some of the custom hint space for some of its instructions. I've noted the ones in the custom space, and those in the HINT space. I've also noted where the operation of an instruction is "hint-compatible", i.e. it has microarchitectural implications but no register state is updated. Xqci is rv32-only.

Relevant Instructions:

- `C.SRLI` with `shamt[5] == 1` (custom on rv32)
  - `QC.C.BEXTI` (`shamt[4:0] != 0`) ✅
  - `QC.C.SYNCWF` (`shamt[4:0] == 0`) (hint-compatible) ✅
- `C.SRLI` with `shamt == 0` (HINT)
  - `QC.C.SYNC` (hint-compatible) 🔺Encoding not designated for custom use.
- `C.SRAI` with `shamt[5] == 1` (custom on rv32)
  - `QC.C.BSETI` (`shamt[4:0] != 0`) ✅
  - `QC.C.SYNCWL` (`shamt[4:0] == 0`) (hint-compatible) ✅
- `C.SRAI` with `shamt == 0` (HINT)
  - `QC.C.SYNC` (hint-compatible) 🔺Encoding not designated for custom use.
- `C.SLLI` with `shamt[5] == 1` and `rd != x0` (custom on rv32 ✅)
  - `QC.C.BEXTI`
  - `QC.C.BSETI`
  - `QC.C.EXTU`
  - `QC.C.DIR`
  - `QC.C.EIR`
  - `QC.C.SETINT`
  - `QC.C.CLRINT`
  - `QC.C.MIENTER`
  - `QC.C.MIENTER.NEST`
  - `QC.C.MRET`
  - `QC.C.MNRET`
  - `QC.C.MILEAVERET`
  - `QC.C.DI`
  - `QC.C.EI`
- `C.SLLI` with `rd == x0` (`shamt[5] == 0`) (HINT)
  - `QC.C.DELAY` (hint-compatible) ✅
  - `QC.C.PTRACE` (hint-compatible) ✅

We are using some of the standard space, but in a hint-compatible way, and there are other places where we already overlap, e.g. `C.FSDSP`, `C.FLDSP`, and the `Zcmp` instructions. So I think broadly we're fine.
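The categories used in Bug 145618 for `C.SRLI` and `C.SRAI` can be read straight off the 16-bit encoding. The sketch below assumes the standard compressed-instruction layout (quadrant in bits 1:0, `funct3` in bits 15:13, `shamt[5]` in bit 12, `shamt[4:0]` in bits 6:2, and bits 11:10 selecting SRLI/SRAI) and covers only the right-shift rows of the list; `ShiftSpace`, `shamtOf`, and `classifyRightShift` are hypothetical names, not LLVM APIs.

```cpp
#include <cassert>
#include <cstdint>

enum class ShiftSpace { NotShift, Standard, CustomRv32, Hint };

// 6-bit shift amount: shamt[5] is bit 12, shamt[4:0] is bits 6:2.
constexpr unsigned shamtOf(std::uint16_t insn) {
  return (((insn >> 12) & 0x1u) << 5) | ((insn >> 2) & 0x1Fu);
}

// Classify a 16-bit encoding on rv32 as C.SRLI/C.SRAI in the standard,
// custom (shamt[5] == 1), or HINT (shamt == 0) space.
constexpr ShiftSpace classifyRightShift(std::uint16_t insn) {
  bool quadrant1 = (insn & 0x3) == 0x1;
  bool funct3Is100 = ((insn >> 13) & 0x7) == 0x4;
  bool srliOrSrai = ((insn >> 11) & 0x1) == 0; // bits 11:10 are 00 or 01
  if (!(quadrant1 && funct3Is100 && srliOrSrai))
    return ShiftSpace::NotShift;
  if (shamtOf(insn) & 0x20)
    return ShiftSpace::CustomRv32;
  if (shamtOf(insn) == 0)
    return ShiftSpace::Hint;
  return ShiftSpace::Standard;
}

// Usage check: C.SRLI x8, 1 is a plain standard shift.
inline void checkShiftClassification() {
  assert(classifyRightShift(0x8005) == ShiftSpace::Standard);
}
```

Under these assumptions, `0x8001` (shift amount 0) lands in the HINT category and `0x9001` (`shamt[5]` set) lands in the rv32 custom category, mirroring the first four top-level entries of the list.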
[llvm-bugs] [Bug 145548] [RISCV][Instrumentation] Instrumentation for mysql failed
Issue 145548 Summary [RISCV][Instrumentation] Instrumentation for mysql failed Labels new issue Assignees Reporter Xiangtingwei here is my gcc version: > debian@laptop-yjn:~/work-bolt/cross-tool/riscv-gcc/bin$ ./riscv64-unknown-linux-gnu-gcc --version riscv64-unknown-linux-gnu-gcc (g1b306039ac) 15.1.0 Copyright (C) 2025 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. and the log: > sipeed@revyos-lpi4a:~/work_bolt/cross-mysql/install/bin$ llvm-bolt -riscv-uleb128-reloc=false --conservative-instrumentation --instrument --instrumentation-file=mysqld.fdata_github -o mysqld.instr_github mysqld BOLT-INFO: Target architecture: riscv64 BOLT-INFO: BOLT version: 9d491bc602c2d9730cb42fe25f0753471a3af389 BOLT-INFO: first alloc address is 0x1 BOLT-INFO: creating new program header table at address 0x400, offset 0x3ff BOLT-INFO: enabling relocation mode BOLT-INFO: forcing -jump-tables=move for instrumentation BOLT-WARNING: Failed to analyze 4 relocations BOLT-INFO: 0 out of 60782 functions in the binary (0.0%) have non-empty execution profile BOLT-INSTRUMENTER: Number of indirect call site descriptors: 93961 BOLT-INSTRUMENTER: Number of indirect call target descriptors: 58419 BOLT-INSTRUMENTER: Number of function descriptors: 58419 BOLT-INSTRUMENTER: Number of branch counters: 1983743 BOLT-INSTRUMENTER: Number of ST leaf node counters: 0 BOLT-INSTRUMENTER: Number of direct call counters: 0 BOLT-INSTRUMENTER: Total number of counters: 1983743 BOLT-INSTRUMENTER: Total size of counters: 15869944 bytes (static alloc memory) BOLT-INSTRUMENTER: Total size of string table emitted: 6238083 bytes in file BOLT-INSTRUMENTER: Total size of descriptors: 59100604 bytes in file BOLT-INSTRUMENTER: Profile will be saved to file mysqld.fdata_github BOLT-INFO: padding code to 0xb80 to accommodate hot text BOLT-INFO: output linked against instrumentation runtime 
library, lib entry point is 0xd08c42e BOLT-INFO: clear procedure is 0xd08b4f2 BOLT-INFO: patched build-id (flipped last bit) BOLT-INFO: setting _end to 0xd17b704 BOLT-INFO: setting _end to 0xd17b704 BOLT-INFO: setting __bolt_runtime_start to 0xd08c42e BOLT-INFO: setting __bolt_runtime_fini to 0xd08c4fc BOLT-INFO: setting __hot_start to 0x420 BOLT-INFO: setting __hot_end to 0xb55811c > sipeed@revyos-lpi4a: ~/work_bolt/cross-mysql/install/bin$ ./mysqld --version /home/sipeed/work_bolt/cross-mysql/install/bin/mysqld Ver 8.0.33 for Linux on riscv64 (Source distribution) sipeed@revyos-lpi4a: ~ /work_bolt/cross-mysql/install/bin$ ./mysqld.instr_github --version Segmentation fault and the gdb log > (gdb) r Starting program: /home/sipeed/work_bolt/cross-mysql/install/bin/mysqld.instr_github --version [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/riscv64-linux-gnu/libthread_db.so.1". Program received signal SIGSEGV, Segmentation fault. 0x003ff7fe6e32 in __GI___tls_get_addr (ti=0x1b8) at ./elf/dl-tls.c:1030 warning: 1030 ./elf/dl-tls.c: No such file or directory (gdb) x/10i 0x003ff7fe6e20 0x3ff7fe6e20 <__GI___tls_get_addr+2>:sd zero,440(sp) 0x3ff7fe6e22 <__GI___tls_get_addr+4>:auipc a4,0x18 0x3ff7fe6e26 <__GI___tls_get_addr+8>:addia4,a4,590 0x3ff7fe6e2a <__GI___tls_get_addr+12>: ld a5,0(a4) 0x3ff7fe6e2c <__GI___tls_get_addr+14>: ld a3,0(a1) 0x3ff7fe6e2e <__GI___tls_get_addr+16>: bne a3,a5,0x3ff7fe6e50 <__GI___tls_get_addr+50> => 0x3ff7fe6e32 <__GI___tls_get_addr+20>: ld a5,0(a0) 0x3ff7fe6e34 <__GI___tls_get_addr+22>: slli a5,a5,0x4 0x3ff7fe6e36 <__GI___tls_get_addr+24>: add a5,a5,a1 0x3ff7fe6e38 <__GI___tls_get_addr+26>: ld a3,0(a5) (gdb) info register a0 a0 0x1b8440 ___ llvm-bugs mailing list llvm-bugs@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
[llvm-bugs] [Bug 145558] OpenMP TARGET DATA crash with IF(.FALSE.)
Issue 145558 Summary OpenMP TARGET DATA crash with IF(.FALSE.) Labels new issue Assignees Reporter kparzysz A modified version of use_device_ptr-1.f90 from the gfortran test suite: ``` use iso_c_binding, only: c_ptr implicit none (external, type) interface subroutine bar(x) import type(c_ptr), value :: x end end interface type(c_ptr) :: x !$omp target data map(alloc: x) if(.false.) !$omp target data use_device_ptr(x) if(.false.) call bar(x) !$omp end target data !$omp end target data end ``` flang -fc1 -emit-llvm -module-dir -fopenmp use_device_ptr-1.f90 where "module-dir" = install_prefix/include/flang ``` #0 0x7e68dbde3840 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/work2/kparzysz/c/org/bin/../lib/libLLVMSupport.so.21.0git+0x1e3840) #1 0x7e68dbde0c4f llvm::sys::RunSignalHandlers() (/work2/kparzysz/c/org/bin/../lib/libLLVMSupport.so.21.0git+0x1e0c4f) #2 0x7e68dbde0d9a SignalHandler(int, siginfo_t*, void*) Signals.cpp:0:0 #3 0x7e68db442520 (/lib/x86_64-linux-gnu/libc.so.6+0x42520) #4 0x7e68db4969fc __pthread_kill_implementation ./nptl/pthread_kill.c:44:76 #5 0x7e68db4969fc __pthread_kill_internal ./nptl/pthread_kill.c:78:10 #6 0x7e68db4969fc pthread_kill ./nptl/pthread_kill.c:89:10 #7 0x7e68db442476 gsignal ./signal/../sysdeps/posix/raise.c:27:6 #8 0x7e68db4287f3 abort ./stdlib/abort.c:81:7 #9 0x7e68db42871b _nl_load_domain ./intl/loadmsgcat.c:1177:9 #10 0x7e68db439e96 (/lib/x86_64-linux-gnu/libc.so.6+0x39e96) #11 0x7e68d879e943 llvm::TargetFolder::FoldGEP(llvm::Type*, llvm::Value*, llvm::ArrayRef, llvm::GEPNoWrapFlags) const (/work2/kparzysz/c/org/bin/../lib/../lib/libLLVMAnalysis.so.21.0git+0x19e943) #12 0x7e68d1fa36cd llvm::IRBuilderBase::CreateGEP(llvm::Type*, llvm::Value*, llvm::ArrayRef, llvm::Twine const&, llvm::GEPNoWrapFlags) (/work2/kparzysz/c/org/bin/../lib/../lib/../lib/libMLIRLLVMToLLVMIRTranslation.so.21.0git+0x266cd) #13 0x7e68d1fc41cb convertOperationImpl(mlir::Operation&, llvm::IRBuilderBase&, mlir::LLVM::ModuleTranslation&) 
LLVMToLLVMIRTranslation.cpp:0:0 #14 0x7e68dafc7636 mlir::LLVM::ModuleTranslation::convertOperation(mlir::Operation&, llvm::IRBuilderBase&, bool) (/work2/kparzysz/c/org/bin/../lib/../lib/libMLIRTargetLLVMIRExport.so.21.0git+0x2e636) #15 0x7e68dafd2892 mlir::LLVM::ModuleTranslation::convertBlockImpl(mlir::Block&, bool, llvm::IRBuilderBase&, bool) (/work2/kparzysz/c/org/bin/../lib/../lib/libMLIRTargetLLVMIRExport.so.21.0git+0x39892) #16 0x7e68d35cedcb inlineConvertOmpRegions(mlir::Region&, llvm::StringRef, llvm::IRBuilderBase&, mlir::LLVM::ModuleTranslation&, llvm::SmallVectorImpl*) OpenMPToLLVMIRTranslation.cpp:0:0 #17 0x7e68d35d1eb5 llvm::Expected llvm::function_ref (llvm::IRBuilderBase::InsertPoint, llvm::OpenMPIRBuilder::BodyGenTy)>::callback_fn(long, llvm::IRBuilderBase::InsertPoint, llvm::OpenMPIRBuilder::BodyGenTy) OpenMPToLLVMIRTranslation.cpp:0:0 #18 0x7e68d7fc70c0 llvm::OpenMPIRBuilder::createTargetData(llvm::OpenMPIRBuilder::LocationDescription const&, llvm::IRBuilderBase::InsertPoint, llvm::IRBuilderBase::InsertPoint, llvm::Value*, llvm::Value*, llvm::OpenMPIRBuilder::TargetDataInfo&, llvm::function_ref, llvm::function_ref (unsigned int)>, llvm::omp::RuntimeFunction*, llvm::function_ref (llvm::IRBuilderBase::InsertPoint, llvm::OpenMPIRBuilder::BodyGenTy)>, llvm::function_ref, llvm::Value*) (/work2/kparzysz/c/org/bin/../lib/../lib/libLLVMFrontendOpenMP.so.21.0git+0x930c0) #19 0x7e68d35c4279 convertOmpTargetData(mlir::Operation*, llvm::IRBuilderBase&, mlir::LLVM::ModuleTranslation&) OpenMPToLLVMIRTranslation.cpp:0:0 #20 0x7e68d35e5559 convertHostOrTargetOperation(mlir::Operation*, llvm::IRBuilderBase&, mlir::LLVM::ModuleTranslation&) OpenMPToLLVMIRTranslation.cpp:0:0 #21 0x7e68dafc7636 mlir::LLVM::ModuleTranslation::convertOperation(mlir::Operation&, llvm::IRBuilderBase&, bool) (/work2/kparzysz/c/org/bin/../lib/../lib/libMLIRTargetLLVMIRExport.so.21.0git+0x2e636) #22 0x7e68dafd2892 mlir::LLVM::ModuleTranslation::convertBlockImpl(mlir::Block&, bool, 
llvm::IRBuilderBase&, bool) (/work2/kparzysz/c/org/bin/../lib/../lib/libMLIRTargetLLVMIRExport.so.21.0git+0x39892) #23 0x7e68d35cedcb inlineConvertOmpRegions(mlir::Region&, llvm::StringRef, llvm::IRBuilderBase&, mlir::LLVM::ModuleTranslation&, llvm::SmallVectorImpl*) OpenMPToLLVMIRTranslation.cpp:0:0 #24 0x7e68d35d1eb5 llvm::Expected llvm::function_ref (llvm::IRBuilderBase::InsertPoint, llvm::OpenMPIRBuilder::BodyGenTy)>::callback_fn(long, llvm::IRBuilderBase::InsertPoint, llvm::OpenMPIRBuilder::BodyGenTy) OpenMPToLLVMIRTranslation.cpp:0:0 #25 0x7e68d7fc70c0 llvm::OpenMPIRBuilder::createTargetData(llvm::OpenMPIRBuilder::LocationDescription const&, llvm::IRBuil
[llvm-bugs] [Bug 145510] different bewteen gcc -fPIC -mlong-calls and clang -fPIC -mlong-calls
Issue 145510
Summary different bewteen gcc -fPIC -mlong-calls and clang -fPIC -mlong-calls
Labels clang
Assignees
Reporter LukeSTM

demo:

```
#include <stdio.h>

int main () {
  printf("hello");
  return 0;
}
```

https://godbolt.org/z/MzdPY3oTs

Hi, I want to know why gcc uses the GOT and clang does not.
[llvm-bugs] [Bug 145593] [flang] ASIND and ACOSD give incorrect results
Issue 145593
Summary [flang] ASIND and ACOSD give incorrect results
Labels flang
Assignees
Reporter tacaswell

There appears to be a bug with the acosd and asind intrinsics(?). The following program compiles correctly under flang:

```fortran
      PROGRAM ACOSD_TEST
C     Program to compare ACOSD with ACOS conversion
      IMPLICIT NONE
      INTEGER I
      REAL X, Y1, Y2, STEP
      REAL PI
      PARAMETER (PI = 3.14159265359)

C     Step size for 15 evenly spaced values
      STEP = 2.0 / 14.0

      DO 10 I = 0, 14
          X = -1.0 + I * STEP
          Y1 = ACOSD(X)
          Y2 = ACOS(X) * 180.0 / PI
          WRITE(*,100) X, Y1, Y2
   10 CONTINUE

  100 FORMAT('For x = ',F8.4,': ACOSD = ',F8.4,
     &       ', ACOS(rad->deg) = ',F8.4)

      END
```

but when run it gives incorrect results:

```
For x = -1.: ACOSD = 1.5883, ACOS(rad->deg) = 180.
For x = -0.8571: ACOSD = 1.5858, ACOS(rad->deg) = 148.9973
For x = -0.7143: ACOSD = 1.5833, ACOS(rad->deg) = 135.5847
For x = -0.5714: ACOSD = 1.5808, ACOS(rad->deg) = 124.8499
For x = -0.4286: ACOSD = 1.5783, ACOS(rad->deg) = 115.3769
For x = -0.2857: ACOSD = 1.5758, ACOS(rad->deg) = 106.6015
For x = -0.1429: ACOSD = 1.5733, ACOS(rad->deg) = 98.2132
For x = 0.: ACOSD = 1.5708, ACOS(rad->deg) = 90.
For x = 0.1429: ACOSD = 1.5683, ACOS(rad->deg) = 81.7868
For x = 0.2857: ACOSD = 1.5658, ACOS(rad->deg) = 73.3984
For x = 0.4286: ACOSD = 1.5633, ACOS(rad->deg) = 64.6231
For x = 0.5714: ACOSD = 1.5608, ACOS(rad->deg) = 55.1501
For x = 0.7143: ACOSD = 1.5583, ACOS(rad->deg) = 44.4153
For x = 0.8571: ACOSD = 1.5558, ACOS(rad->deg) = 31.0027
For x = 1.: ACOSD = 1.5533, ACOS(rad->deg) = 0.
```

A similar program with ASIND gives:

```
For x = -1.: ASIND = -0.0175, ASIN(rad->deg) = -90.
For x = -0.8571: ASIND = -0.0150, ASIN(rad->deg) = -58.9973
For x = -0.7143: ASIND = -0.0125, ASIN(rad->deg) = -45.5847
For x = -0.5714: ASIND = -0.0100, ASIN(rad->deg) = -34.8499
For x = -0.4286: ASIND = -0.0075, ASIN(rad->deg) = -25.3769
For x = -0.2857: ASIND = -0.0050, ASIN(rad->deg) = -16.6015
For x = -0.1429: ASIND = -0.0025, ASIN(rad->deg) = -8.2132
For x = 0.: ASIND = 0., ASIN(rad->deg) = 0.
For x = 0.1429: ASIND = 0.0025, ASIN(rad->deg) = 8.2132
For x = 0.2857: ASIND = 0.0050, ASIN(rad->deg) = 16.6016
For x = 0.4286: ASIND = 0.0075, ASIN(rad->deg) = 25.3769
For x = 0.5714: ASIND = 0.0100, ASIN(rad->deg) = 34.8499
For x = 0.7143: ASIND = 0.0125, ASIN(rad->deg) = 45.5847
For x = 0.8571: ASIND = 0.0150, ASIN(rad->deg) = 58.9973
For x = 1.: ASIND = 0.0175, ASIN(rad->deg) = 90.
```

f77 and f90 versions of both are in https://github.com/tacaswell/flang-intrinsics-bug

This code gives the correct results with gfortran.

I've tested this with:

```
$ flang --version
flang version 20.1.7 (Fedora 20.1.7-1.fc42)
Target: x86_64-redhat-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
```

but saw the bug on Windows with (I think) flang 19 from conda-forge. I can systematically test if needed.
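One observation on the numbers in Bug 145593, offered as a hypothesis rather than a diagnosis: the reported `ACOSD` column agrees to four decimals with `acos(x * pi / 180)`, i.e. the degree conversion applied to the argument instead of to the result. The sketch below contrasts the two formulas in C++; `acosdExpected` and `acosOfScaledArgument` are hypothetical names, and nothing here is taken from flang's sources.

```cpp
#include <cassert>
#include <cmath>

const double kPi = std::acos(-1.0);

// Expected semantics: inverse cosine with the result converted to degrees.
inline double acosdExpected(double x) { return std::acos(x) * 180.0 / kPi; }

// Formula that happens to reproduce the reported values: the argument is
// scaled as if it were in degrees, and the result is left in radians.
inline double acosOfScaledArgument(double x) {
  return std::acos(x * kPi / 180.0);
}

// Usage check against the first row of the reported output (x = -1).
inline void checkAcosdObservation() {
  assert(std::fabs(acosdExpected(-1.0) - 180.0) < 1e-9);
  assert(std::fabs(acosOfScaledArgument(-1.0) - 1.5883) < 1e-4);
}
```

The same pattern holds for the `ASIND` column: `asin(-1 * pi / 180)` is approximately -0.0175, matching the first reported row.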
[llvm-bugs] [Bug 145628] [clang-tidy] [Modules] Avoid duplicated checkings
Issue 145628
Summary [clang-tidy] [Modules] Avoid duplicated checkings
Labels clang-tidy
Assignees
Reporter ChuanqiXu9

Reproducer:

```
// RUN: rm -fr %t
// RUN: mkdir %t
// RUN: split-file %s %t
// RUN: mkdir %t/tmp
//
// RUN: %clang -std=c++20 -x c++-module %t/a.cpp --precompile -o %t/a.pcm
//
// RUN: %check_clang_tidy -std=c++20 -check-suffix=DEFAULT %t/a.cpp \
// RUN:   cppcoreguidelines-narrowing-conversions %t/a.cpp -- \
// RUN:   -config='{}'

// RUN: clang-tidy %t/use.cpp -fix --checks=-*,cppcoreguidelines-narrowing-conversions \
// RUN:   -config={} -- \
// RUN:   -fmodule-file=a=%t/a.pcm -std=c++20 -nostdinc++

//--- a.cpp
export module a;
export void most_narrowing_is_not_ok() {
  int i;
  long long ui;
  i = ui;
  // CHECK-MESSAGES-DEFAULT: :[[@LINE-1]]:7: warning: narrowing conversion from 'long long' to signed type 'int' is implementation-defined [cppcoreguidelines-narrowing-conversions]
}

//--- use.cpp
import a;
void use() {
  most_narrowing_is_not_ok();
  // CHECK-MESSAGES-DEFAULT:
}
```

In this example, we would hope not to see the warning again for use.cpp. Concretely, avoiding the duplicated checks can save a lot of time.
[llvm-bugs] [Bug 145635] Miscompile with __arm_locally_streaming with -march=armv8-a+sme
Issue 145635
Summary Miscompile with __arm_locally_streaming with -march=armv8-a+sme
Labels new issue
Assignees
Reporter efriedma-quic

Consider:

```
__arm_locally_streaming void f(int *p, int n) {
#pragma clang loop vectorize_width(4, scalable)
  for (int i = 0; i < n; ++i) {
    p[i]++;
  }
}
```

Compile with `-march=armv8-a+sme`. The resulting assembly starts with:

```
sub sp, sp, #96
rdsvl x9, #1
lsr x9, x9, #3
str x9, [sp, #16]
mov x9, x0
bl __arm_get_current_vg
[...]
```

The bl corrupts x30.

Demo at https://godbolt.org/z/Ga9zaWEen

Workarounds:

- Use `-march=armv8-a+sve+sme` or similar
- Use -mno-omit-leaf-frame-pointer

CC @sdesmalen-arm @MacDue
[llvm-bugs] [Bug 145467] while using clang-format with --lines option, it will not respect WhitespaceSensitiveMacros
Issue 145467
Summary while using clang-format with --lines option, it will not respect WhitespaceSensitiveMacros
Labels clang-format
Assignees
Reporter fantouch

In `.clang-format` I have added `WhitespaceSensitiveMacros: ['EM_ASM']`, and the diff is:

```
EM_ASM({
  let thisPromiseIndex = Module.AllFramesCollectorPromiseIndex;
  Module.AllFramesCollectorPromise = Module.AllFramesCollectorPromise
    .then((val) => { // only this line#33 changed
      return [ 0, val, thisPromiseIndex ];
    })
});
```

With the `--lines=33:33` option (as passed by `git clang-format`), `.then((val) => {` becomes `.then((val) = > {`.

It seems that `--lines` breaks the WhitespaceSensitiveMacros rules. I think clang-format should check whether the lines are inside the `EM_ASM` macro.