On Thu, 25 May 2023, juzhe.zh...@rivai.ai wrote:

> Yeah. I see. Removing it will cause testcase run fail.
> Now I found the issue, since you want to store the step in the iv_rgroup.
> 
> After I tried, the IR looks correct but create ICE:
> 0x18c8d41 process_bb
>         ../../../riscv-gcc/gcc/tree-ssa-sccvn.cc:7933
> 0x18cb6d9 do_rpo_vn_1
>         ../../../riscv-gcc/gcc/tree-ssa-sccvn.cc:8544
> 0x18cbd35 do_rpo_vn(function*, edge_def*, bitmap_head*, bool, bool, 
> vn_lookup_ki
>         ../../../riscv-gcc/gcc/tree-ssa-sccvn.cc:8646
> 0x19d42d2 execute
>         ../../../riscv-gcc/gcc/tree-vectorizer.cc:1385
> 
> This is the IR:
> 
> loop_len_76 = MIN_EXPR <ivtmp_98, 8>;
> 
>   loop_len_66 = MIN_EXPR <ivtmp_101, 16>;   -> store the step in rgroup 
> instead of LOOP_VINFO
>   _103 = loop_len_66;  ---->reuse the MIN VALUE
> 
>   loop_len_66 = MIN_EXPR <loop_len_66, 4>;

you have two defs of loop_len_66, that's not allowed

> 
>   _104 = _103 - loop_len_66;  ->use MIN - loop_len_66
> 
>   loop_len_65 = MIN_EXPR <_104, 4>;
>   _105 = _104 - loop_len_65;
>   loop_len_64 = MIN_EXPR <_105, 4>;
>   loop_len_63 = _105 - loop_len_64;
> 
> Since previously I store the "MIN_EXPR <ivtmp_101, 16>;" in the LOOP_VINFO, 
> not the rgroup.
> 
> So previously is correct and no ICE:
> 
>   loop_len_76 = MIN_EXPR <ivtmp_98, 8>;
> 
>  _103 = MIN_EXPR <ivtmp_101, 16>;    -> Step store in the LOOP_VINFO (S)
> 
>   loop_len_66 = MIN_EXPR <_103, 4>; 
> 
>   _104 = _103 - loop_len_66;  ->  use MIN - loop_len_66
> 
>   loop_len_65 = MIN_EXPR <_104, 4>;
>   _105 = _104 - loop_len_65;
>   loop_len_64 = MIN_EXPR <_105, 4>;
>   loop_len_63 = _105 - loop_len_64;
> 
> Could you help me with this ?
> Thanks.
> 
> 
> juzhe.zh...@rivai.ai
>  
> From: Richard Sandiford
> Date: 2023-05-25 18:19
> To: juzhe.zhong\@rivai.ai
> CC: gcc-patches; rguenther
> Subject: Re: [PATCH V15] VECT: Add decrement IV iteration loop control by 
> variable amount support
> "juzhe.zh...@rivai.ai" <juzhe.zh...@rivai.ai> writes:
> > Hi? Richard. Thanks for the comments.
> >
> >>> if (!LOOP_VINFO_USING_DECREMENTING_IV_P (loop_vinfo)
> >>>     || !iv_rgc
> >>>     || (iv_rgc->max_nscalars_per_iter * iv_rgc->factor
> >>> != rgc->max_nscalars_per_iter * rgc->factor))
> >>>           {
> >   >>           /* See whether zero-based IV would ever generate all-false 
> > masks
> >    >>             or zero length before wrapping around.  */
> >    >>          bool might_wrap_p = vect_rgroup_iv_might_wrap_p (loop_vinfo, 
> > rgc);
> >  
> >    >>          /* Set up all controls for this group.  */
> >      >>        test_ctrl = vect_set_loop_controls_directly (loop, 
> > loop_vinfo,
> >     >>                                                      &preheader_seq,
> >         >>                                                  &header_seq,
> >     >>                                                      loop_cond_gsi, 
> > rgc,
> >     >>                                                      niters, 
> > niters_skip,
> >     >>                                                      might_wrap_p);
> >  
> >    >>  iv_rgc = rgc;
> >   >> }
> >
> >
> > Could you tell me why you add:
> > (iv_rgc->max_nscalars_per_iter * iv_rgc->factor
> >>> != rgc->max_nscalars_per_iter * rgc->factor) ?
>  
> The patch creates IVs with the following step:
>  
>       gimple_seq_add_stmt (header_seq, gimple_build_assign (step, MIN_EXPR,
>     index_before_incr,
>     nitems_step));
>  
> If nitems_step is the same for two IVs, those IVs will always be equal.
>  
> So having multiple IVs with the same nitems_step is redundant.
>  
> nitems_step is calculated as follows:
>  
>   unsigned int nitems_per_iter = rgc->max_nscalars_per_iter * rgc->factor;
>   ...
>   poly_uint64 vf = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
>   ...
>  
>   if (nitems_per_iter != 1)
>     {
>       ...
>       tree iv_factor = build_int_cst (iv_type, nitems_per_iter);
>       ...
>       nitems_step = gimple_build (preheader_seq, MULT_EXPR, iv_type,
>   nitems_step, iv_factor);
>       ...
>     }
>  
> so nitems_per_step is equal to:
>  
>   rgc->max_nscalars_per_iter * rgc->factor * VF
>  
> VF is fixed for a loop, so nitems_step is equal for two different
> rgroup_controls if:
>  
>   rgc->max_nscalars_per_iter * rgc->factor
>  
> is the same for those rgroup_controls.
>  
> Please try the example I posted earlier today. I think you'll see that,
> without the:
>  
>   (iv_rgc->max_nscalars_per_iter * iv_rgc->factor
>    != rgc->max_nscalars_per_iter * rgc->factor)
>  
> you'll have two IVs with the same step (because their MIN_EXPRs have
> the same bound).
>  
> Thanks,
> Richard
>  
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)

Reply via email to