Hi, Richard.

I still don't understand it. Sorry about that.

>> loop_len_48 = MIN_EXPR <loop_len_34 * 2, 4>;
>> _74 = loop_len_34 * 2 - loop_len_48;

I have already run the tests. We have a MIN_EXPR to calculate the total number of elements:

  loop_len_34 = MIN_EXPR <ivtmp_72, 8>;

I think the "8" is already multiplied by 2? Why do we need loop_len_34 * 2?

Could you give me more information? The similar tests you presented already have execution checks, and they passed. I am not sure whether this patch has an issue that I didn't notice.

Thanks.

juzhe.zh...@rivai.ai

From: Richard Sandiford
Date: 2023-05-24 23:31
To: 钟居哲
CC: gcc-patches; rguenther
Subject: Re: [PATCH V14] VECT: Add decrement IV iteration loop control by variable amount support

钟居哲 <juzhe.zh...@rivai.ai> writes:
> Hi, the .optimized dump is like this:
>
> <bb 2> [local count: 21045336]:
>   ivtmp.26_36 = (unsigned long) &x;
>   ivtmp.27_3 = (unsigned long) &y;
>   ivtmp.30_6 = (unsigned long) &MEM <int[200]> [(void *)&y + 16B];
>   ivtmp.31_10 = (unsigned long) &MEM <int[200]> [(void *)&y + 32B];
>   ivtmp.32_14 = (unsigned long) &MEM <int[200]> [(void *)&y + 48B];
>
> <bb 3> [local count: 273589366]:
>   # ivtmp_72 = PHI <ivtmp_73(3), 100(2)>
>   # ivtmp.26_41 = PHI <ivtmp.26_37(3), ivtmp.26_36(2)>
>   # ivtmp.27_1 = PHI <ivtmp.27_2(3), ivtmp.27_3(2)>
>   # ivtmp.30_4 = PHI <ivtmp.30_5(3), ivtmp.30_6(2)>
>   # ivtmp.31_8 = PHI <ivtmp.31_9(3), ivtmp.31_10(2)>
>   # ivtmp.32_12 = PHI <ivtmp.32_13(3), ivtmp.32_14(2)>
>   loop_len_34 = MIN_EXPR <ivtmp_72, 8>;
>   loop_len_48 = MIN_EXPR <loop_len_34, 4>;
>   _74 = loop_len_34 - loop_len_48;

Yeah, I think this needs to be:

  loop_len_48 = MIN_EXPR <loop_len_34 * 2, 4>;
  _74 = loop_len_34 * 2 - loop_len_48;

(as valid gimple). The point is that...

>   loop_len_49 = MIN_EXPR <_74, 4>;
>   _75 = _74 - loop_len_49;
>   loop_len_50 = MIN_EXPR <_75, 4>;
>   loop_len_51 = _75 - loop_len_50;

...there are 4 lengths capped to 4, for a total element count of 16.
But loop_len_34 is never greater than 8.

So for this case we either need to multiply, or we need to create
a fresh IV for the second rgroup. Both approaches are fine.

Thanks,
Richard
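
To make the arithmetic in the reply above concrete, here is a minimal Python sketch that models the four length computations from the dump. It is only an illustration, not GCC code: the function name rgroup_lengths and its factor parameter are hypothetical, and the variable names are borrowed from the GIMPLE dump. It assumes a full iteration, where loop_len_34 = MIN_EXPR <ivtmp_72, 8> evaluates to 8.

```python
# Illustrative model only, not GCC code. Variable names follow the GIMPLE
# dump; rgroup_lengths and its `factor` parameter are hypothetical.

def rgroup_lengths(loop_len_34, factor):
    """Split loop_len_34 * factor elements across 4 lengths, as in the dump."""
    total = loop_len_34 * factor
    loop_len_48 = min(total, 4)          # MIN_EXPR <total, 4>
    _74 = total - loop_len_48
    loop_len_49 = min(_74, 4)            # MIN_EXPR <_74, 4>
    _75 = _74 - loop_len_49
    loop_len_50 = min(_75, 4)            # MIN_EXPR <_75, 4>
    loop_len_51 = _75 - loop_len_50      # remainder, no MIN_EXPR in the dump
    return [loop_len_48, loop_len_49, loop_len_50, loop_len_51]

# Full iteration: loop_len_34 = MIN_EXPR <ivtmp_72, 8> = 8.
print(rgroup_lengths(8, factor=1))  # as in the dump: [4, 4, 0, 0]
print(rgroup_lengths(8, factor=2))  # with the multiply: [4, 4, 4, 4]
```

Without the multiply, the total handed to the four lengths never exceeds 8, so in this model the last two lengths are always zero. With the factor of 2, a full iteration yields 4 + 4 + 4 + 4 = 16 elements, matching the "4 lengths capped to 4" in the reply.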