On Sat, 2020-06-27 at 13:38 +0200, Marc Glisse wrote:
> On Sat, 27 Jun 2020, Ilya Leoshkevich via Gcc-patches wrote:
>
> > Is there something specific that a compiler user should look out
> > for?
> > For example, here is the kernel code, from which the test was
> > derived:
> >
> > static inline void atomic_add(int i, atomic_t *v)
> > {
> > #ifdef CONFIG_HAVE_MARCH_Z196_FEATURES
> > if (__builtin_constant_p(i) && (i > -129) && (i < 128)) {
> > __atomic_add_const(i, &v->counter);
> > return;
> > }
> > #endif
> > __atomic_add(i, &v->counter);
> > }
> >
> > It looks very straightforward - can there still be something wrong
> > with its usage of b_c_p?
> >
> > > I'd recommend looking at the .ssa dump and walk forward from
> > > there if
> > > the .ssa dump looks correct.
> >
> > Well, 021t.ssa already has:
> >
> > __attribute__((gnu_inline))
> > __atomic_add_const (intD.6 valD.2269, intD.6 * ptrD.2270)
> > {
> > intD.6 val_3(D) = valD.2269;
> > intD.6 * ptr_2(D) = ptrD.2270;
> > ;; basic block 2, loop depth 0, maybe hot
> > ;; prev block 0, next block 1, flags: (NEW)
> > ;; pred: ENTRY (FALLTHRU)
> > # .MEM_4 = VDEF <.MEM_1(D)>
> > __asm__ __volatile__("asi %0,%1
> > " : "ptr" "=Q" *ptr_2(D) : "val" "i" val_3(D), "Q" *ptr_2(D) :
> > "memory", "cc");
> > # VUSE <.MEM_4>
> > return;
> > ;; succ: EXIT
> >
> > }
> >
> > which is, strictly speaking, not correct, because val_3(D) and
> > valD.2269 are not constant. But as far as I understand we are
> > willing
> > to tolerate trees like this until a certain point.
> >
> > What is this point supposed to be? If I understood you right,
> > 106t.thread1 is already too late - why is it so?
>
> Small remark: shouldn't __atomic_add_const be marked with the
> always_inline attribute, since it isn't usable when it isn't inlined?
I agree, this would be a good improvement (from both readability and
correctness perspectives).
Still, I just tried it, and unfortunately it did not help with the
issue at hand, most likely because the function is inlined either way.
021t.ssa still contains:
__attribute__((always_inline, gnu_inline))
__atomic_add_const (intD.6 valD.2269, intD.6 * ptrD.2270)
{
and threading still eliminates its inlined version.