[Touch-packages] [Bug 1993800] Re: LLVM ERROR: Cannot select: 0x2f689c8: v4i32 = ARMISD::VCMPZ 0x2f696b8, Constant:i32<2>

Bug Watch Updater Fri, 21 Oct 2022 05:16:18 -0700

Launchpad has imported 10 comments from the remote bug at
https://bugzilla.opensuse.org/show_bug.cgi?id=1204267.


If you reply to an imported comment from within Launchpad, your comment
will be sent to the remote bug automatically. Read more about
Launchpad's inter-bugtracker facilities at
https://help.launchpad.net/InterBugTracking.

------------------------------------------------------------------------
On 2022-10-13T09:20:11+00:00 Guillaume Gardet wrote:

Created attachment 862142
journal.log

Plasmashell crashes on armv7 since snapshot 20221006.

Oct 13 04:54:05 localhost.localdomain plasmashell[2046]: QFont::setPointSizeF: Point size <= 0 (0.000000), must be greater than 0
Oct 13 04:54:06 localhost.localdomain plasmashell[2046]: LLVM ERROR: Cannot select: 0x3003d38: v4i32 = ARMISD::VCMPZ 0x2f64940, Constant:i32<2>
Oct 13 04:54:06 localhost.localdomain plasmashell[2046]:   0x2f64940: v4i32,ch = ARMISD::VLD1DUP<(load (s32) from %ir.326)> 0x2fcdea8, 0x2fb0848, Constant:i32<4>
Oct 13 04:54:06 localhost.localdomain plasmashell[2046]:     0x2fb0848: i32 = add 0x3006648, Constant:i32<64>
Oct 13 04:54:06 localhost.localdomain plasmashell[2046]:       0x3006648: i32,ch = CopyFromReg 0x2e4bf5c, Register:i32 %35
Oct 13 04:54:06 localhost.localdomain plasmashell[2046]:         0x2fed080: i32 = Register %35
Oct 13 04:54:06 localhost.localdomain plasmashell[2046]:       0x2e65f08: i32 = Constant<64>
Oct 13 04:54:06 localhost.localdomain plasmashell[2046]:     0x2fd93c0: i32 = Constant<4>
Oct 13 04:54:06 localhost.localdomain plasmashell[2046]:   0x2fd9690: i32 = Constant<2>
Oct 13 04:54:06 localhost.localdomain plasmashell[2046]: In function: fs_variant_partial
Oct 13 04:54:06 localhost.localdomain plasmashell[2046]: KCrash: Application 'plasmashell' crashing...
Oct 13 04:54:06 localhost.localdomain plasmashell[2046]: KCrash: Attempting to start /usr/libexec/drkonqi
Oct 13 04:54:06 localhost.localdomain plasmashell[2077]: libEGL warning: DRI2: failed to authenticate
Oct 13 04:54:06 localhost.localdomain kded5[1600]: Service  "org.kde.StatusNotifierHost-2046" unregistered
Oct 13 04:54:06 localhost.localdomain plasmashell[2046]: Unable to start Dr. Konqi
Oct 13 04:54:06 localhost.localdomain plasmashell[2046]: Re-raising signal for core dump handling.
Oct 13 04:54:06 localhost.localdomain systemd[1284]: plasma-plasmashell.service: Main process exited, code=dumped, status=6/ABRT

Reply at: https://bugs.launchpad.net/ubuntu/+source/llvm-toolchain-15/+bug/1993800/comments/0

------------------------------------------------------------------------
On 2022-10-13T09:44:07+00:00 Fvogt-a wrote:

It was probably introduced in 20221003, which didn't reach openQA for
ARM.

Unfortunately that snapshot contained both Mesa and LLVM updates, so it
is hard to tell which one caused this. Reassigning.

Reply at: https://bugs.launchpad.net/ubuntu/+source/llvm-toolchain-15/+bug/1993800/comments/1

------------------------------------------------------------------------
On 2022-10-13T12:47:27+00:00 Sndirsch-u wrote:

Honestly. No clue ...

We switched to Mesa 22.2 and llvm15 lately.

Reply at: https://bugs.launchpad.net/ubuntu/+source/llvm-toolchain-15/+bug/1993800/comments/2

------------------------------------------------------------------------
On 2022-10-17T12:25:23+00:00 3-christophe wrote:

The issue can be reproduced with one pyside test on armv7l:

[ 3286s] 460/478 Test #461: QtDataVisualization_datavisualization_test ......Subprocess aborted***Exception:   0.82 sec
[ 3286s] .LLVM ERROR: Cannot select: 0x1e6a2c0: v4i32 = ARMISD::VCMPZ 0x1d68a08, Constant:i32<2>
[ 3286s]   0x1d68a08: v4i32,ch = ARMISD::VLD1DUP<(load (s32) from %ir.235)> 0x11ebd5c, 0x1d68660:1, Constant:i32<4>
[ 3286s]     0x1d68660: i32,i32,ch = load<(load (s32) from %ir.232, align 8), <post-inc>> 0x11ebd5c, 0x1d52518, Constant:i32<64>
[ 3286s]       0x1d52518: i32,ch = CopyFromReg 0x11ebd5c, Register:i32 %30
[ 3286s]         0x1d68858: i32 = Register %30
[ 3286s]       0x1d67100: i32 = Constant<64>
[ 3286s]     0x1afd1a8: i32 = Constant<4>
[ 3286s]   0x1d66ec0: i32 = Constant<2>
[ 3286s] In function: fs_variant_partial

Reply at: https://bugs.launchpad.net/ubuntu/+source/llvm-toolchain-15/+bug/1993800/comments/3

------------------------------------------------------------------------
On 2022-10-18T21:04:21+00:00 Aaronpuchert wrote:

In my understanding, "Cannot select" is always an LLVM bug, specifically
in the backend. Early stages of the backend should "legalize" data types
and instructions, sending to instruction selection only what the target
supports.

So I can have a look, but it would be appreciated if someone could
extract the IR that Mesa sends to LLVM. Otherwise I'll have to reverse-
engineer a reproducer.

Nevertheless, some initial remarks: ARMISD::VCMPZ is a "Vector compare
to zero." [1] It should correspond to "vcmpe" in assembly [2]. The first
argument being a v4i32 is slightly suspicious. I would have expected a
v4f32, but since they live in the same registers maybe the backend
doesn't care. The second is a Constant:i32<2> = ARMCC::CondCodes::HS,
corresponding to conditional execution only if the carry flag is set, if
I understand this correctly. [3,4]

Inside we have ARMISD::VLD1DUP, which is a "Vector load N-element
structure to all lanes" (same file as [1], different line), and seems to
correspond to "vld1.N" in assembly. [5] The Constant:i32<4> could be an
alignment, but I'm not sure.

[1] https://github.com/llvm/llvm-project/blob/llvmorg-15.0.2/llvm/lib/Target/ARM/ARMISelLowering.h#L148
[2] https://developer.arm.com/documentation/ddi0406/c/Application-Level-Architecture/Instruction-Details/Alphabetical-list-of-instructions/VCMP--VCMPE
[3] https://developer.arm.com/documentation/ddi0406/c/Application-Level-Architecture/Instruction-Details/Conditional-execution
[4] https://github.com/llvm/llvm-project/blob/llvmorg-15.0.2/llvm/lib/Target/ARM/Utils/ARMBaseInfo.h#L33
[5] https://developer.arm.com/documentation/ddi0406/c/Application-Level-Architecture/Instruction-Details/Alphabetical-list-of-instructions/VLD1--single-element-to-all-lanes-

Reply at: https://bugs.launchpad.net/ubuntu/+source/llvm-toolchain-15/+bug/1993800/comments/4

------------------------------------------------------------------------
On 2022-10-19T12:32:52+00:00 Fvogt-a wrote:

Created attachment 862276
Various IR dumps of LLVM failure

It can be reproduced by running
/usr/lib/qt6/examples/datavisualization/bars/bars from the
qt6-datavis3d-examples package as well. That has a slightly more complex
shader though.

I attached the full output of running it inside xvfb-run with
GALLIVM_DEBUG=tgsi,ir,asm LP_DEBUG=fs, which dumps all kind of info,
including LLVM IR.

Reply at: https://bugs.launchpad.net/ubuntu/+source/llvm-toolchain-15/+bug/1993800/comments/5

------------------------------------------------------------------------
On 2022-10-19T21:34:01+00:00 Aaronpuchert wrote:

Thanks, that should help. This isn't my area of expertise, but at least
we can use this to file a bug upstream.

(In reply to Aaron Puchert from comment #4)
> Nevertheless, some initial remarks: ARMISD::VCMPZ is a "Vector compare to
> zero." [1] It should correspond to "vcmpe" in assembly [2]. The first
> argument being a v4i32 is slightly suspicious. I would have expected a
> v4f32, but since they live in the same registers maybe the backend doesn't
> care. The second is a Constant:i32<2> = ARMCC::CondCodes::HS, corresponding
> to conditional execution only if the carry flag is set, if I understand this
> correctly. [3,4]

It seems I was misreading that: the condition code is for the comparison
itself. For floating-point, ARMCC::CondCodes::HS means ">, ==, or
unordered", so we're doing a !(... < 0.0f) comparison. This likely
corresponds to one of the

    fcmp ..., zeroinitializer

in the IR.
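
As a rough reading aid for the Constant:i32<2> operand, the numbering of
ARMCC::CondCodes (reference [4]) can be written out as a small lookup
table. This is only a sketch: the table and the helper name
describe_cond_code are illustrative and not part of any LLVM API, and the
descriptions give the integer reading of each code. For floating-point
compares the ARM manual assigns HS the ">, ==, or unordered" meaning
quoted above.

```python
# Illustrative lookup table (not an LLVM API). The numbering mirrors the
# ARMCC::CondCodes enum in llvm/lib/Target/ARM/Utils/ARMBaseInfo.h.
# Descriptions are the integer interpretation of each condition.
ARM_COND_CODES = {
    0: ("EQ", "equal"),
    1: ("NE", "not equal"),
    2: ("HS", "unsigned higher or same (carry set)"),
    3: ("LO", "unsigned lower (carry clear)"),
    4: ("MI", "negative"),
    5: ("PL", "positive or zero"),
    6: ("VS", "overflow"),
    7: ("VC", "no overflow"),
    8: ("HI", "unsigned higher"),
    9: ("LS", "unsigned lower or same"),
    10: ("GE", "signed greater than or equal"),
    11: ("LT", "signed less than"),
    12: ("GT", "signed greater than"),
    13: ("LE", "signed less than or equal"),
    14: ("AL", "always"),
}


def describe_cond_code(value):
    """Return 'NAME: meaning' for a numeric condition-code operand."""
    name, meaning = ARM_COND_CODES[value]
    return f"{name}: {meaning}"


# The second operand of the failing VCMPZ node is Constant:i32<2>.
print(describe_cond_code(2))
```

Read this way, a v4i32 operand with code 2 looks like an unsigned ">="
compare rather than a floating-point one.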

Reply at: https://bugs.launchpad.net/ubuntu/+source/llvm-toolchain-15/+bug/1993800/comments/6

------------------------------------------------------------------------
On 2022-10-20T08:17:13+00:00 Fvogt-a wrote:

(In reply to Aaron Puchert from comment #6)
> Thanks, that should help. This isn't my area of expertise, but at least we
> can use this to file a bug upstream.

Will you do that or should one of us take care of that?

Reply at: https://bugs.launchpad.net/ubuntu/+source/llvm-toolchain-15/+bug/1993800/comments/7

------------------------------------------------------------------------
On 2022-10-21T02:32:58+00:00 Aaronpuchert wrote:

Since I'm not sure what the precise target machine is, I've used flags
similar to how we build LLVM itself (see the specfile):

    llc -march=arm --float-abi=hard -mattr=+armv7-a,+vfp3d16

This reproduces the crash, just with a slightly different message:

LLVM ERROR: Cannot select: t933: v4i32 = ARMISD::VCMPZ t1307, Constant:i32<2>
  t1307: v4i32,ch = ARMISD::VLD1DUP<(load (s32) from %ir.584)> t0, t1429:1, Constant:i32<4>
    t1429: i32,i32,ch = load<(load (s32) from %ir."&context.constants_ptr[]5618", align 8), <post-inc>> t0, t2, Constant:i32<64>
      t2: i32,ch = CopyFromReg t0, Register:i32 %45
        t1: i32 = Register %45
      t212: i32 = Constant<64>
    t49: i32 = Constant<4>
  t28: i32 = Constant<2>

What's different is the added IR names, but they're not immediately
helpful: there is a "&context.constants_ptr[]56" in the source, maybe
there was disambiguation.

The crash is reproducible on the current main branch, so it's still not
fixed. With

    bugpoint --run-llc <input-file> --tool-args <options as above>

we can reduce it to this:

define void @fs_variant_partial() {
entry:
  %output = alloca <4 x float>, align 16
  br label %loop_begin

loop_begin:                                       ; preds = %skip, %entry
  br i1 undef, label %skip, label %0

0:                                                ; preds = %loop_begin
  %1 = icmp uge <4 x i32> zeroinitializer, undef
  %2 = sext <4 x i1> %1 to <4 x i32>
  %3 = load i32, i32* undef, align 4
  %4 = insertelement <4 x i32> undef, i32 %3, i32 3
  %5 = trunc <4 x i32> %2 to <4 x i1>
  %6 = select <4 x i1> %5, <4 x i32> zeroinitializer, <4 x i32> %4
  %7 = insertvalue [4 x <4 x i32>] undef, <4 x i32> %6, 0
  %8 = insertvalue [4 x <4 x i32>] %7, <4 x i32> undef, 1
  %9 = insertvalue [4 x <4 x i32>] %8, <4 x i32> undef, 2
  %10 = insertvalue [4 x <4 x i32>] %9, <4 x i32> undef, 3
  %11 = extractvalue [4 x <4 x i32>] %10, 0
  %12 = bitcast <4 x i32> %11 to <4 x float>
  %13 = fmul <4 x float> zeroinitializer, %12
  %14 = fadd <4 x float> %13, zeroinitializer
  %15 = fadd <4 x float> %14, zeroinitializer
  %16 = bitcast <4 x float> %15 to <4 x i32>
  %17 = insertvalue [4 x <4 x i32>] undef, <4 x i32> %16, 0
  %18 = insertvalue [4 x <4 x i32>] %17, <4 x i32> undef, 1
  %19 = insertvalue [4 x <4 x i32>] %18, <4 x i32> undef, 2
  %20 = insertvalue [4 x <4 x i32>] %19, <4 x i32> undef, 3
  %21 = extractvalue [4 x <4 x i32>] %20, 0
  %22 = bitcast <4 x i32> %21 to <4 x float>
  store <4 x float> %22, <4 x float>* %output, align 16
  br label %skip

skip:                                             ; preds = %0, %loop_begin
  br label %loop_begin
}

Crash is slightly different now:

LLVM ERROR: Cannot select: t48: v4i32 = ARMISD::VCMPZ undef:v4i32, Constant:i32<2>
  t3: v4i32 = undef
  t47: i32 = Constant<2>

This obviously corresponds to the

  %1 = icmp uge <4 x i32> zeroinitializer, undef

With that knowledge we can reduce further:

define <4 x i32> @fs_variant_partial() {
  %1 = icmp uge <4 x i32> zeroinitializer, undef
  %2 = sext <4 x i1> %1 to <4 x i32>
  ret <4 x i32> %2
}

or

define <4 x i32> @fs_variant_partial(<4 x i32> %0) {
  %2 = icmp uge <4 x i32> zeroinitializer, %0
  %3 = sext <4 x i1> %2 to <4 x i32>
  ret <4 x i32> %3
}

I'll see if I can spot where we're missing something, but likely I'll
just file a bug and let the ARM people figure out where this should be
fixed. From the looks of it we're simply not able to lower
"icmp uge <4 x i32> zeroinitializer, ...", and the nested instructions
have nothing to do with it.

Reply at: https://bugs.launchpad.net/ubuntu/+source/llvm-toolchain-15/+bug/1993800/comments/8

------------------------------------------------------------------------
On 2022-10-21T09:11:33+00:00 Fvogt-a wrote:

The rabbit hole is deep!

I noticed two oddities:

The bitcode triggers the error in llc-{13,14,15}, so it's not a change
in LLVM.

I built Mesa 22.2.1 with LLVM14 (building old Mesa with LLVM15 does not
work) and the bitcode produced also triggers the error in llc-{13, 14,
15}. I built Mesa 22.1.7 with LLVM14 as well and the bitcode also
triggers the error!

So the difference has to be somewhere in how Mesa invokes LLVM. I added
LLVMPassBuilderOptionsSetDebugLogging(opts, true); to print the passes.
Output with Mesa 22.2.1 + LLVM 15:

ir_fs322_variant0.bc written
Invoke as "opt -sroa -early-cse -simplifycfg -reassociate -mem2reg -constprop -instcombine -gvn ir_fs322_variant0.bc | llc -O2 [-mcpu=<-mcpu option>] [-mattr=<-mattr option(s)>]"
Running pass: AlwaysInlinerPass on [module]
Running analysis: InnerAnalysisManagerProxy<llvm::FunctionAnalysisManager, llvm::Module> on [module]
Running analysis: ProfileSummaryAnalysis on [module]
Running pass: CoroConditionalWrapper on [module]
Running pass: AnnotationRemarksPass on fs_variant_partial (1387 instructions)
Running analysis: TargetLibraryAnalysis on fs_variant_partial
LLVM ERROR: Cannot select: 0x10505b0: v4i32 = ARMISD::VCMPZ 0x1287c98, Constant:i32<2>

Mesa with LLVM 14 did not output that at all, which was caused by this
conditional:

#if LLVM_VERSION_MAJOR >= 15
#define GALLIVM_HAVE_CORO 0
#define GALLIVM_USE_NEW_PASS 1
#elif LLVM_VERSION_MAJOR >= 8
#define GALLIVM_HAVE_CORO 1
#define GALLIVM_USE_NEW_PASS 0
#else
#define GALLIVM_HAVE_CORO 0
#define GALLIVM_USE_NEW_PASS 0
#endif

So with LLVM >= 15 it uses the new pass manager and everything is
different.
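
To make the version split concrete, here is a minimal Python sketch that
mirrors the preprocessor chain quoted above. The function name
gallivm_config is made up for illustration and does not exist in Mesa;
the returned pair stands for (GALLIVM_HAVE_CORO, GALLIVM_USE_NEW_PASS).

```python
def gallivm_config(llvm_version_major):
    """Mirror the #if chain above.

    Returns (have_coro, use_new_pass) as the macros would be defined for
    the given LLVM major version. Hypothetical helper, not Mesa code.
    """
    if llvm_version_major >= 15:
        return (0, 1)  # new pass manager path
    elif llvm_version_major >= 8:
        return (1, 0)  # legacy pass manager, coroutine passes available
    return (0, 0)      # legacy pass manager, no coroutine passes


for major in (7, 14, 15):
    print(major, gallivm_config(major))
```

Under this reading, a build against LLVM 14 and a build against LLVM 15
take entirely different optimisation paths even for identical shader IR.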

Some experiments with opt + llc proved to be very helpful:

opt -passes=always-inline,instcombine ir_fs322_variant0.bc | llc -mcpu=generic -> works
opt -passes=always-inline ir_fs322_variant0.bc | llc -mcpu=generic -> fails!

So the "instcombine" pass makes all the difference here to avoid the
"Cannot select" error.

Question is, why is the instcombine pass not used? Mesa hardcodes it in
the list of passes after all:

   if (!(gallivm_perf & GALLIVM_PERF_NO_OPT))
      strcpy(passes, "sroa,early-cse,simplifycfg,reassociate,mem2reg,constprop,instcombine,");
   else
      strcpy(passes, "mem2reg");

   LLVMRunPasses(gallivm->module, passes,
                 LLVMGetExecutionEngineTargetMachine(gallivm->engine), opts);

opt can actually answer that quickly:

e06e5d2ccf7e:~/mesa/build # opt-15.0.2 -passes=sroa,early-cse,simplifycfg,reassociate,mem2reg,constprop,instcombine, ir_fs322_variant0.bc | llc-14.0.6 -mcpu=generic
opt-15.0.2: unknown function pass 'constprop'
(failure)

Next try:

e06e5d2ccf7e:~/mesa/build # opt-15.0.2 -passes=sroa,early-cse,simplifycfg,reassociate,mem2reg,instcombine, ir_fs322_variant0.bc | llc-14.0.6 -mcpu=generic
opt-15.0.2: unknown function pass ''
(failure)

Next try:

e06e5d2ccf7e:~/mesa/build # opt-15.0.2 -passes=sroa,early-cse,simplifycfg,reassociate,mem2reg,instcombine ir_fs322_variant0.bc | llc-14.0.6 -mcpu=generic
        .text
        .syntax unified
(success!)

So the missing "instcombine" pass causes the "Cannot select" error, and
the pass is missing because Mesa passes an invalid list of passes to
LLVMRunPasses and ignores the error. This means that if Mesa was built
with LLVM >= 15, only the "default<O2>" passes were actually run, so the
code was not really optimized...
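
The three opt attempts above can be replayed with a small Python sketch
that checks a flat, comma-separated pass list for entries opt would
reject. KNOWN_PASSES is a hand-picked, hypothetical subset of names; the
real registry lives inside LLVM and is far larger. The helper
invalid_pass_names is likewise invented for illustration, and nested
pipelines such as function(...) are not handled.

```python
# Hypothetical subset of pass names; the real list lives inside LLVM.
# "constprop" is deliberately absent, matching the opt-15 error above.
KNOWN_PASSES = {
    "sroa", "early-cse", "simplifycfg", "reassociate",
    "mem2reg", "instsimplify", "instcombine",
}


def invalid_pass_names(pipeline):
    """Return entries of a flat comma-separated pipeline that are unknown.

    A trailing comma yields an empty entry, which is reported as ''.
    """
    return [name for name in pipeline.split(",") if name not in KNOWN_PASSES]


first_try = "sroa,early-cse,simplifycfg,reassociate,mem2reg,constprop,instcombine,"
second_try = "sroa,early-cse,simplifycfg,reassociate,mem2reg,instcombine,"
third_try = "sroa,early-cse,simplifycfg,reassociate,mem2reg,instcombine"

for attempt in (first_try, second_try, third_try):
    print(invalid_pass_names(attempt))
```

The first string reports both 'constprop' and the empty name, the second
only the empty name from the trailing comma, and the third nothing, in
line with the three opt runs. A caller that checked the result of the
real parse step would have surfaced the same problem at runtime.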

With this patch, the "Cannot select" error is gone:

    if (!(gallivm_perf & GALLIVM_PERF_NO_OPT))
-      strcpy(passes, "sroa,early-cse,simplifycfg,reassociate,mem2reg,constprop,instcombine,");
+      strcpy(passes, "sroa,early-cse,simplifycfg,reassociate,mem2reg,instsimplify,instcombine");
    else
       strcpy(passes, "mem2reg");

I'll send that to mesa upstream.

Reply at: https://bugs.launchpad.net/ubuntu/+source/llvm-toolchain-15/+bug/1993800/comments/9


** Changed in: llvm-toolchain-15 (openSUSE)
   Importance: Unknown => Medium

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to mesa in Ubuntu.
https://bugs.launchpad.net/bugs/1993800

Title:
  LLVM ERROR: Cannot select: 0x2f689c8: v4i32 = ARMISD::VCMPZ 0x2f696b8,
  Constant:i32<2>

Status in LLVM:
  Unknown
Status in llvm-toolchain-15 package in Ubuntu:
  Confirmed
Status in mesa package in Ubuntu:
  Confirmed
Status in llvm-toolchain-15 package in openSUSE:
  Unknown

Bug description:
  LLVM ERROR: Cannot select: 0x2f689c8: v4i32 = ARMISD::VCMPZ 0x2f696b8, Constant:i32<2>
    0x2f696b8: v4i32,ch = ARMISD::VLD1DUP<(load (s32) from %ir.212)> 0x2aad434, 0x2f84090:1, Constant:i32<4>
      0x2f84090: i32,i32,ch = load<(load (s32) from %ir.209, align 8), <post-inc>> 0x2aad434, 0x2f63a30, Constant:i32<64>
        0x2f63a30: i32,ch = CopyFromReg 0x2aad434, Register:i32 %23
          0x2f51c10: i32 = Register %23
        0x2f82500: i32 = Constant<64>
      0x2f81b28: i32 = Constant<4>
    0x2f81e40: i32 = Constant<2>
  In function: fs_variant_partial

  [https://launchpadlibrarian.net/629171689/buildlog_ubuntu-kinetic-armhf.mutter_43.0-1ubuntu3_BUILDING.txt.gz]

  Although "LLVM ERROR: Cannot select" seems to be from LLVM, I can't
  determine what project "fs_variant_partial" is in. Sounds like it
  might be in some old version of Mesa? The start of the log suggests
  it's running on focal.

To manage notifications about this bug go to:
https://bugs.launchpad.net/llvm/+bug/1993800/+subscriptions


-- 
Mailing list: https://launchpad.net/~touch-packages
Post to     : touch-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~touch-packages
More help   : https://help.launchpad.net/ListHelp
