I have a fix for the testcases: the buffer size just needs to increase from
32 to 1024. With 32 bytes, a vector load/store is used for the memcpy (on
x86-64-v3 and up, per the dump below), so the copy still gets optimized, just
differently, and not in the way this test is checking for. I am hoping 1024
is big enough that no other target does a 1024-byte memcpy with a single
register load/store.
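
For reference, a rough sketch of the shape of the testcase with the larger
size, reconstructed from the dump quoted below. This is not the actual
gcc.dg/pr78408-3.c source; BUF_SIZE, storage, and the local definition of
aaa() are assumptions added only so that the snippet is self-contained (in
the dump, aaa() is an opaque call).

```c
#include <string.h>

/* Assumed name for the size under discussion: 32 before the fix, 1024 after. */
#define BUF_SIZE 1024

/* Hypothetical backing storage so that the sketch can run on its own. */
static char storage[BUF_SIZE];

/* Hypothetical stand-in for aaa(); it fills the destination with nonzero
   bytes so that the effect of the copy is observable. */
void *aaa(void)
{
    memset(storage, 0xff, sizeof storage);
    return storage;
}

void *bbb(void)
{
    void *ret = aaa();
    char buf[BUF_SIZE] = "";    /* zero-filled local, as in the dump */
    memcpy(ret, buf, BUF_SIZE); /* the copy the forwprop1 dump is about */
    return ret;
}
```

With BUF_SIZE at 32, this memcpy is what shows up as a vector(32) load/store
in the x86-64-v3 dump; the hope is that at 1024 no target lowers it to a
single register move.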

I will push a fix tomorrow.

Thanks,
Andrew

> -----Original Message-----
> From: Sam James <s...@gentoo.org>
> Sent: Friday, April 18, 2025 5:28 PM
> To: Andrew Pinski (QUIC) <quic_apin...@quicinc.com>
> Cc: gcc-regression@gcc.gnu.org; haochen.ji...@intel.com
> Subject: Re: Regressions on native/master at commit r16-29 vs
> commit r16-21 on Linux/x86_64
> 
> This builder uses --with-arch=native. The (a) difference starts
> at x86-64-v3:
> 
>  $ diff -u <(gcc -O2 -fdump-tree-forwprop1-details=- -O2 gcc.dg/pr78408-3.c -c -march=x86-64-v2) \
>      <(gcc -O2 -fdump-tree-forwprop1-details=- -O2 gcc.dg/pr78408-3.c -c -march=x86-64-v3)
> --- /dev/fd/63  2025-04-19 01:27:31.676852279 +0100
> +++ /dev/fd/62  2025-04-19 01:27:31.651851999 +0100
> @@ -1,15 +1,17 @@
> 
> -;; Function bbb (bbb, funcdef_no=0, decl_uid=2939, cgraph_uid=1, symbol_order=0)
> +;; Function bbb (bbb, funcdef_no=0, decl_uid=3312, cgraph_uid=1, symbol_order=0)
> 
>  void * bbb ()
>  {
>    char buf[32];
>    void * ret;
> +  vector(32) unsigned char _5;
> 
>    <bb 2> :
>    ret_3 = aaa ();
>    buf = "";
> -  MEM <unsigned char[32]> [(char * {ref-all})ret_3] = MEM <unsigned char[32]> [(char * {ref-all})&buf];
> +  _5 = MEM <vector(32) unsigned char> [(char * {ref-all})&buf];
> +  MEM <vector(32) unsigned char> [(char * {ref-all})ret_3] = _5;
>    buf ={v} {CLOBBER(eos)};
>    return ret_3;
