On 2023/11/2 17:30, wuqiang.matt wrote:
On 2023/11/2 12:53, Vineet Gupta wrote:


On 10/29/23 20:41, wuqiang.matt wrote:

arch_cmpxchg_relaxed:
...
          switch(sizeof((_p_))) {
          case 4:
....

arch_cmpxchg:
...
    BUILD_BUG_ON(sizeof(_p_) != 4);
...

_p is the address pointer, so I'm thinking it's a typo but I couldn't
yet confirm. There is not much about arc processors in the web :(
Hmm, indeed. This seems like a bug but it depends on the 'llock  %0, [%1]'
can take a 32bit address or 32bit data register. Usually it should
check the size of data, but need to check with ISA manual.

Vineet, can you check this suspicious bug?

ARCv2 is a 32-bit ISA and LLOCK/SCOND work on 32-bit data.
So the pointers will be 32-bit anyways. Is the issue that pointer/cmpxchg operation could be on a smaller data type ?

For ARCv2 with CONFIG_ARC_HAS_LLSC, better add the data size checking and
only permit 32bit data size. Even for 32-bit system, data should can be
64bit 'long long'.

Right, makes sense. Can you send a patch ?

Okay, I will.


And In the case that CONFIG_ARC_HAS_LLSC is undefined, in arch_cmpxchg: the
pointer size checking is unnecessary, since it's using spinlock internally:

Correct, I had the same thought.


https://elixir.bootlin.com/linux/v6.6-rc7/source/arch/arc/include/asm/cmpxchg.h#L60:
    BUILD_BUG_ON(sizeof(_p_) != 4);                    \
                                    \
    /*                                \
     * spin lock/unlock provide the needed smp_mb() before/after    \
     */                                \
    atomic_ops_lock(__flags);                    \
    _prev_ = *_p_;                            \
    if (_prev_ == _o_)                        \
        *_p_ = _n_;                        \
    atomic_ops_unlock(__flags);

Can you do that fix in same patch as well ?

Sure.

Another question about the naming: arch_cmpxchg_relaxed() implemented if
CONFIG_ARC_HAS_LLSC is configured and arch_cmpxchg() defined for the rest.
Are there any reasons for difference names ?

Yes the atomics API consists of _relaxed, _acquire, _release and unadorned.
_relaxed is the most efficient, with rest having some extra barriers.
If arch provides _relaxed, generic code can create the rest - assuming arch hardware can support the relaxed ones.

Definitely Yes! The generic cmpxchg will prefer arch_cmpxchg_relaxed rather
than arch_cmpxchg.

It should be arch_cmpxchg (if defined) rather than arch_cmpxchg_relaxed. I
just double checked with v6.6:

        ...
        #if defined(arch_cmpxchg)
        #define raw_cmpxchg arch_cmpxchg
        #elif defined(arch_cmpxchg_relaxed)
                #define raw_cmpxchg(...) \
                __atomic_op_fence(arch_cmpxchg, __VA_ARGS__)
        #else
        extern void raw_cmpxchg_not_implemented(void);
        #define raw_cmpxchg(...) raw_cmpxchg_not_implemented()
        #endif
        ...
        #if defined(arch_cmpxchg_relaxed)
        #define raw_cmpxchg_relaxed arch_cmpxchg_relaxed
        #elif defined(arch_cmpxchg)
        #define raw_cmpxchg_relaxed arch_cmpxchg
        #else
        extern void raw_cmpxchg_relaxed_not_implemented(void);
        #define raw_cmpxchg_relaxed(...) raw_cmpxchg_relaxed_not_implemented()
        #endif
        ...

I made the mistake since the definitions of v5.2 confused me. The followings
are for v5.2, and these conditions are removed from v5.14:

        ...
        #if !defined(arch_cmpxchg_relaxed) || defined(arch_cmpxchg)
        #define cmpxchg(ptr, ...) \
        ({ \
                typeof(ptr) __ai_ptr = (ptr); \
                instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
                arch_cmpxchg(__ai_ptr, __VA_ARGS__); \
        })
        #endif
        ...
        #if defined(arch_cmpxchg_relaxed)
        #define cmpxchg_relaxed(ptr, ...) \
        ({ \
                typeof(ptr) __ai_ptr = (ptr); \
                instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
                arch_cmpxchg_relaxed(__ai_ptr, __VA_ARGS__); \
        })
        #endif
        ...

Sorry for the noise.

That is true for LLSC.

However for !LLSC, spinlock versions already have full barriers due to spinlock, so relaxed variant doesn't make sense.


As I checked, Synopsys has released 64bit ARC processors (HS66/HS68), but
I don't know the status of Linux kernel support.

Yes there's an internal ARC64 port running but it is not ready for upstreaming yet.

Thanks for the update.

-Vineet

Regards,
wuqiang


_______________________________________________
linux-snps-arc mailing list
linux-snps-arc@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-snps-arc

Reply via email to