https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118019

--- Comment #8 from JuzheZhong <juzhe.zhong at rivai dot ai> ---
(In reply to Robin Dapp from comment #7)
> > The problem is GCC-15 has performance regression compare to GCC-14 on both
> > strict align and we should fix it, we can't specify use no strict align in
> > GCC-15 to pretend that we don't have such performance regression.
> 
> The problem is that in GCC 15 we're now vectorizing the first loop and don't
> cost it properly, most likely the vec_init/vec_construct is too inexpensive.
> A solution could be two-fold:
>  - Increase those costs (but we can never get them really correct in case a
> vec_init is just a broadcast or so)
>  - Enhance our vec_init expander to also allow construction from sub vectors
> so the initialization becomes cheaper and we don't need to clumsily load
> every uint8 element separately.
> 
> I can start with the second one which should also help other workloads. 
> There are several follow-up items, though, including proper subreg handling
> for those cases.

Thanks. The second solution looks good to me. Looking forward to your patches.
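
For illustration only, the difference between the two construction strategies discussed in the quoted text can be sketched in plain C. This is not GCC's vec_init expander and not the test case from this bug; the function names, the 16-byte vector size, the 4-byte group size, and the strided source layout are all assumptions made for the example. The first function fills the vector one uint8 element at a time, while the second fills it from four 4-byte sub-chunks, which is the cheaper shape the second proposal is after.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sizes: a 16-byte vector built from four 4-byte groups. */
enum { VEC_BYTES = 16, GROUP_BYTES = 4, NUM_GROUPS = 4 };

/* Element-wise construction: one scalar load and one store per uint8 lane,
   i.e. 16 separate element accesses for a single vector. */
static void build_elementwise(uint8_t dst[VEC_BYTES],
                              const uint8_t *src, size_t stride)
{
    for (int g = 0; g < NUM_GROUPS; g++)
        for (int i = 0; i < GROUP_BYTES; i++)
            dst[g * GROUP_BYTES + i] = src[g * stride + i];
}

/* Sub-vector construction: each group of four bytes is moved as a single
   32-bit chunk, so only four loads are needed for the same result. */
static void build_from_subvectors(uint8_t dst[VEC_BYTES],
                                  const uint8_t *src, size_t stride)
{
    for (int g = 0; g < NUM_GROUPS; g++) {
        uint32_t chunk;
        /* memcpy keeps the access valid even if src is not 4-byte aligned. */
        memcpy(&chunk, src + g * stride, sizeof chunk);
        memcpy(dst + g * GROUP_BYTES, &chunk, sizeof chunk);
    }
}
```

Both functions produce the same bytes; only the number and width of the memory accesses differ, which is what the vec_init/vec_construct cost is supposed to reflect.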
