Hi, Richard. I still don't understand it. Sorry about that.

>>  loop_len_48 = MIN_EXPR <loop_len_34 * 2, 4>;
>>  _74 = loop_len_34 * 2 - loop_len_48;

I have already run the tests.
We have a MIN_EXPR to calculate the total elements:
loop_len_34 = MIN_EXPR <ivtmp_72, 8>;
Isn't the "8" already multiplied by 2?

Why do we need loop_len_34 * 2?
Could you give me more information? We already have execution tests similar
to the ones you presented, and they pass. I am not sure whether this patch
has an issue that I didn't notice.

Thanks.


juzhe.zh...@rivai.ai
 
From: Richard Sandiford
Date: 2023-05-24 23:31
To: 钟居哲
CC: gcc-patches; rguenther
Subject: Re: [PATCH V14] VECT: Add decrement IV iteration loop control by variable amount support
钟居哲 <juzhe.zh...@rivai.ai> writes:
> Hi, the .optimized dump is like this:
>
>   <bb 2> [local count: 21045336]:
>   ivtmp.26_36 = (unsigned long) &x;
>   ivtmp.27_3 = (unsigned long) &y;
>   ivtmp.30_6 = (unsigned long) &MEM <int[200]> [(void *)&y + 16B];
>   ivtmp.31_10 = (unsigned long) &MEM <int[200]> [(void *)&y + 32B];
>   ivtmp.32_14 = (unsigned long) &MEM <int[200]> [(void *)&y + 48B];
>
>   <bb 3> [local count: 273589366]:
>   # ivtmp_72 = PHI <ivtmp_73(3), 100(2)>
>   # ivtmp.26_41 = PHI <ivtmp.26_37(3), ivtmp.26_36(2)>
>   # ivtmp.27_1 = PHI <ivtmp.27_2(3), ivtmp.27_3(2)>
>   # ivtmp.30_4 = PHI <ivtmp.30_5(3), ivtmp.30_6(2)>
>   # ivtmp.31_8 = PHI <ivtmp.31_9(3), ivtmp.31_10(2)>
>   # ivtmp.32_12 = PHI <ivtmp.32_13(3), ivtmp.32_14(2)>
>   loop_len_34 = MIN_EXPR <ivtmp_72, 8>;
>   loop_len_48 = MIN_EXPR <loop_len_34, 4>;
>   _74 = loop_len_34 - loop_len_48;
 
Yeah, I think this needs to be:
 
  loop_len_48 = MIN_EXPR <loop_len_34 * 2, 4>;
  _74 = loop_len_34 * 2 - loop_len_48;
  
(as valid gimple).  The point is that...
 
>   loop_len_49 = MIN_EXPR <_74, 4>;
>   _75 = _74 - loop_len_49;
>   loop_len_50 = MIN_EXPR <_75, 4>;
>   loop_len_51 = _75 - loop_len_50;
 
...there are 4 lengths capped to 4, for a total element count of 16.
But loop_len_34 is never greater than 8.
 
So for this case we either need to multiply, or we need to create
a fresh IV for the second rgroup.  Both approaches are fine.
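
To make the arithmetic concrete, here is a rough Python sketch that mirrors
the MIN_EXPR/subtraction chain from the dump. It only models the numbers, not
the vectorizer itself; the helper name four_lengths is made up for
illustration, and the starting value 100 is taken from the PHI in the dump.

```python
def four_lengths(total, cap=4):
    """Mirror the chain in the dump: three MIN_EXPRs, then a final remainder."""
    loop_len_48 = min(total, cap)
    t74 = total - loop_len_48
    loop_len_49 = min(t74, cap)
    t75 = t74 - loop_len_49
    loop_len_50 = min(t75, cap)
    loop_len_51 = t75 - loop_len_50
    return [loop_len_48, loop_len_49, loop_len_50, loop_len_51]


ivtmp_72 = 100                      # initial value from the PHI node
loop_len_34 = min(ivtmp_72, 8)      # never greater than 8

as_dumped = four_lengths(loop_len_34)       # chain fed with loop_len_34
scaled = four_lengths(loop_len_34 * 2)      # chain fed with loop_len_34 * 2

print(as_dumped, sum(as_dumped))
print(scaled, sum(scaled))
```

Fed with loop_len_34 directly, the last two lengths come out as zero, so the
four lengths can never add up to 16; fed with loop_len_34 * 2, all four
lengths can reach their cap of 4.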
 
Thanks,
Richard
 
