https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119210
--- Comment #4 from xiezhiheng at huawei dot com ---
(In reply to Andrew Pinski from comment #3)
> So:
>         mrs     x16, tpidr2_el0
>         cbnz    x16, .L22    <== it will branch to .L22, and miss 'smstart za'
>         mov     x0, x3
>         smstart za
>         bl      __arm_tpidr2_restore
> .L22:
>
>
> This means it is already started.
>
>
> What kernel version are you using? Could this be a bug in the kernel not
> saving/restoring tpidr2_el0 correctly or setting tpidr2_el0 to zero
> originally.

In the main function:

main:
        stp     x29, x30, [sp, -144]!
        rdsvl   x0, #1
        cntd    x16
        mov     x29, sp
        mul     x1, x0, x0
        stp     x23, x16, [sp, 48]
        stp     x19, x20, [sp, 16]
        stp     x21, x22, [sp, 32]
        stp     d8, d9, [sp, 64]
        stp     d10, d11, [sp, 80]
        stp     d12, d13, [sp, 96]
        stp     d14, d15, [sp, 112]
        sub     sp, sp, x1
        mov     x1, sp
        stp     x1, x0, [x29, 128]
        mrs     x0, tpidr2_el0    <== in gdb, it is zero
        cbz     x0, .L19
        bl      __arm_tpidr2_save
.L19:
        add     x0, x29, 128
        msr     tpidr2_el0, x0    <== then it writes 0xfffffffff410 to tpidr2_el0
        mov     x0, 512
        bl      malloc
        mov     x1, 1
        mov     x20, x0
        mov     x0, 512
        bl      calloc
        mov     x1, 1
        mov     x21, x0
        mov     x0, 512
        bl      calloc
        mov     x22, x0
        mov     x0, 0
        index   z31.d, #0, #1
        mov     w1, 64
        cntd    x2
        whilelo p6.d, wzr, w1
        ptrue   p7.b, all
        mov     z30.s, w2
.L11:
        movprfx z29, z31
        sxtw    z29.d, p7/m, z31.d
        scvtf   z29.d, p7/m, z29.d
        st1d    z29.d, p6, [x20, x0, lsl 3]
        add     z31.s, z31.s, z30.s
        incd    x0
        whilelo p6.d, w0, w1
        b.any   .L11
        adrp    x23, .LC0
        add     x23, x23, :lo12:.LC0
        mov     x19, 0
.L12:
        ldr     d1, [x21, x19, lsl 3]
        mov     w1, w19
        ldr     d0, [x20, x19, lsl 3]
        mov     x0, x23
        add     x19, x19, 1
        bl      printf
        cmp     x19, 64
        bne     .L12
        mov     x1, x21
        add     x3, x29, 128
        mrs     x16, tpidr2_el0    <== then it reads 0xfffffffff410 from tpidr2_el0
        cbnz    x16, .L22
        mov     x0, x3
        smstart za
        bl      __arm_tpidr2_restore
.L22:
        adrp    x23, .LC1
        add     x23, x23, :lo12:.LC1
        msr     tpidr2_el0, xzr    <== here reset to zero, but seems useless
        mov     x19, 0
        mov     x0, x20
        smstart sm
        bl      example(double*, double*)
        smstop  sm
        mov     x0, x22
        smstart sm
        bl      example0(double*)
        smstop  sm
        smstop  za

I do not see other operations on tpidr2_el0.
I am not very familiar with the SME system registers. Is there somewhere else
that is expected to reset tpidr2_el0 to zero?
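
To make the branch in question concrete, here is a rough C model of the sequence from "mrs x16, tpidr2_el0" through "msr tpidr2_el0, xzr". It is only a sketch under assumptions: the names za_model and after_private_za_call are made up for illustration, the booleans merely stand in for hardware state, and nothing here executes real SME instructions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the state involved; not real SME code. */
struct za_model {
    uint64_t tpidr2;    /* stands in for tpidr2_el0 */
    bool za_enabled;    /* stands in for PSTATE.ZA */
    bool restored;      /* set where __arm_tpidr2_restore would be called */
};

/* Mirrors the instructions around .L22 in the listing above. */
static void after_private_za_call(struct za_model *m)
{
    if (m->tpidr2 == 0) {        /* cbnz x16, .L22 falls through */
        m->za_enabled = true;    /* smstart za */
        m->restored = true;      /* bl __arm_tpidr2_restore */
    }
    m->tpidr2 = 0;               /* .L22: msr tpidr2_el0, xzr */
}
```

In this model, entering with tpidr2 = 0xfffffffff410 and ZA off leaves ZA off, which is the path the listing takes; only when tpidr2 reads as zero does the 'smstart za' and restore sequence run.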