On 14.03.2025 11:12, Roger Pau Monné wrote:
> On Fri, Mar 14, 2025 at 10:13:07AM +0100, Jan Beulich wrote:
>> On 14.03.2025 10:05, Andrew Cooper wrote:
>>> On 14/03/2025 8:44 am, Jan Beulich wrote:
>>>> On 14.03.2025 09:30, Roger Pau Monné wrote:
>>>>> On Fri, Mar 14, 2025 at 09:24:09AM +0100, Jan Beulich wrote:
>>>>>> On 13.03.2025 16:30, Roger Pau Monne wrote:
>>>>>>> When enabling UBSAN with clang, the following error is triggered
>>>>>>> during the build:
>>>>>>>
>>>>>>> common/wait.c:154:9: error: symbol '.L_wq_resume' is already defined
>>>>>>>   154 |         "push %%rbx; push %%rbp; push %%r12;"
>>>>>>>       |         ^
>>>>>>> <inline asm>:1:121: note: instantiated into assembly here
>>>>>>>     1 | push %rbx; push %rbp; push %r12;push %r13; push %r14; push %r15;sub %esp,%ecx;cmp $4096, %ecx;ja .L_skip;mov %rsp,%rsi;.L_wq_resume: rep movsb;mov %rsp,%rsi;.L_skip:pop %r15; pop %r14; pop %r13;pop %r12; pop %rbp; pop %rbx
>>>>>>>       |                                                                                                         ^
>>>>>>> common/wait.c:154:9: error: symbol '.L_skip' is already defined
>>>>>>>   154 |         "push %%rbx; push %%rbp; push %%r12;"
>>>>>>>       |         ^
>>>>>>> <inline asm>:1:159: note: instantiated into assembly here
>>>>>>>     1 | push %rbx; push %rbp; push %r12;push %r13; push %r14; push %r15;sub %esp,%ecx;cmp $4096, %ecx;ja .L_skip;mov %rsp,%rsi;.L_wq_resume: rep movsb;mov %rsp,%rsi;.L_skip:pop %r15; pop %r14; pop %r13;pop %r12; pop %rbp; pop %rbx
>>>>>>>       |                                                                                                                                               ^
>>>>>>> 2 errors generated.
>>>>>>>
>>>>>>> The inline assembly block in __prepare_to_wait() is duplicated, thus
>>>>>>> leading to multiple definitions of the otherwise unique labels inside
>>>>>>> the assembly block. GCC extended-asm documentation notes the
>>>>>>> possibility of duplicating asm blocks:
>>>>>>>
>>>>>>>> Under certain circumstances, GCC may duplicate (or remove duplicates
>>>>>>>> of) your assembly code when optimizing. This can lead to unexpected
>>>>>>>> duplicate symbol errors during compilation if your asm code defines
>>>>>>>> symbols or labels. Using ‘%=’ (see AssemblerTemplate) may help
>>>>>>>> resolve this problem.
>>>>>>>
>>>>>>> Move the assembly blocks that deal with saving and restoring the
>>>>>>> current CPU context into their own explicitly non-inline functions.
>>>>>>> This prevents clang from duplicating the assembly blocks. Just using
>>>>>>> the noinline attribute seems to be enough to prevent assembly
>>>>>>> duplication; in the future noclone might also be required if asm
>>>>>>> block duplication issues arise again.
>>>>>> Wouldn't it be a far easier / less intrusive change to simply append
>>>>>> %= to the label names?
>>>>> That won't work AFAICT, as the inline asm in check_wakeup_from_wait()
>>>>> won't be able to make a jump to the .L_wq_resume label defined in the
>>>>> __prepare_to_wait() assembly block if the label is declared as
>>>>> .L_wq_resume%=.
>>>>>
>>>>> Also we want to make sure there's a single .L_wq_resume seeing how
>>>>> check_wakeup_from_wait() uses it as the restore entry point?
>>>> Hmm, yes on both points; the %= would only work for .L_skip. Have you gained
>>>> understanding why there is this duplication? The breaking out of the asm()
>>>> that you do isn't going to be reliable, as in principle the compiler is
>>>> still permitted to duplicate stuff. Afaict the only reliable way is to move
>>>> the code to a separate assembly file (with the asm() merely JMPing there,
>>>> providing a pseudo-return-address by some custom means). Or to a file-scope
>>>> asm(), as those can't be duplicated.
>>>
>>> See the simplified example in
>>> https://github.com/llvm/llvm-project/issues/92161
>>>
>>> When I debugged this a while back, the multiple uses of wqv->esp (one
>>> explicit after the asm, one as an asm parameter) gain pointer
>>> sanitisation, so the structure looks like:
>>>
>>> ...
>>> if ( bad pointer )
>>> __ubsan_report();
>>> asm volatile (...);
>>> if ( bad pointer )
>>> __ubsan_report();
>>> ...
>>>
>>> which then got transformed to:
>>>
>>> if ( bad pointer )
>>> {
>>> __ubsan_report();
>>> asm volatile (...);
>>> __ubsan_report();
>>> }
>>> else
>>> asm volatile (...);
>>
>> But isn't it then going to be enough to latch &wqv->esp into a local
>> variable, and use that in the asm() and in the subsequent if()?
>
> I have the following diff which seems to prevent the duplication,
> would you both be OK with this approach?
Yes (with a brief comment added as to the need for the local). And thanks.
Jan
> --- a/xen/common/wait.c
> +++ b/xen/common/wait.c
> @@ -124,6 +124,7 @@ static void __prepare_to_wait(struct waitqueue_vcpu *wqv)
>      struct cpu_info *cpu_info = get_cpu_info();
>      struct vcpu *curr = current;
>      unsigned long dummy;
> +    void *esp = NULL;
>  
>      ASSERT(wqv->esp == NULL);
>  
> @@ -166,12 +167,12 @@ static void __prepare_to_wait(struct waitqueue_vcpu *wqv)
>          ".L_skip:"
>          "pop %%r15; pop %%r14; pop %%r13;"
>          "pop %%r12; pop %%rbp; pop %%rbx"
> -        : "=&S" (wqv->esp), "=&c" (dummy), "=&D" (dummy)
> +        : "=&S" (esp), "=&c" (dummy), "=&D" (dummy)
>          : "0" (0), "1" (cpu_info), "2" (wqv->stack),
>            [sz] "i" (PAGE_SIZE)
>          : "memory", "rax", "rdx", "r8", "r9", "r10", "r11" );
>  
> -    if ( unlikely(wqv->esp == NULL) )
> +    if ( unlikely(esp == NULL) )
>      {
>          gdprintk(XENLOG_ERR, "Stack too large in %s\n", __func__);
>          domain_crash(curr->domain);
> @@ -179,6 +180,7 @@ static void __prepare_to_wait(struct waitqueue_vcpu *wqv)
>          for ( ; ; )
>              do_softirq();
>      }
> +    wqv->esp = esp;
>  }
>  
>  static void __finish_wait(struct waitqueue_vcpu *wqv)
> 