On Tue, Mar 24, 2020 at 9:30 AM Kewen.Lin <li...@linux.ibm.com> wrote:
>
> Hi,
>
> on 2020/3/18 11:10 PM, Richard Biener wrote:
> > On Wed, Mar 18, 2020 at 2:56 PM Kewen.Lin <li...@linux.ibm.com> wrote:
> >>
> >> Hi Richi,
> >>
> >> Thanks for your comments.
> >>
> >> on 2020/3/18 6:39 PM, Richard Biener wrote:
> >>> On Wed, Mar 18, 2020 at 11:06 AM Kewen.Lin <li...@linux.ibm.com> wrote:
> >>>>
> >> This path can set overrun_p to false, so some cases can fall into
> >> the "no peeling for gaps" hunk in vectorizable_load.  Since I used
> >> DR_GROUP_HALF_MODE to save the half mode, if some case matches
> >> this condition, the vectorizable_load hunk can get an uninitialized
> >> DR_GROUP_HALF_MODE.  But even with the proposed recomputing approach,
> >> I think we still need to check the vec_init optab here if the
> >> known_eq half size conditions hold?
> >
> > Hmm, but for the above case it's fine to access the excess elements.
> >
> > I guess the vectorizable_load code needs to be amended with
> > the alignment check or we do need to store somewhere our
> > decision to use smaller loads.
> >
>
> OK, thanks.  I'll investigate it separately.
>
> >>
> >>> I don't like storing DR_GROUP_HALF_MODE very much, later
> >>> you need a vector type and it looks cheap enough to recompute
> >>> it where you need it?  If at all, it doesn't belong to DR_GROUP
> >>> but to the stmt-info.
> >>>
> >>
> >> OK, I intended to avoid recomputing it to save time; I will
> >> throw it away.
> >>
> >>> I realize the original optimization was kind of a hack (and I was too
> >>> lazy to implement the integer mode construction path ...).
> >>>
> >>> So, can you factor out the existing code into a function returning
> >>> the vector type for construction for a vector type and a
> >>> pieces size?  So for V16QI and a pieces-size of 4 we'd
> >>> get either V16QI back (then construction from V4QI pieces
> >>> should work) or V4SI (then construction from SImode pieces
> >>> should work)?
> >>> Possibly also provide that piece type (SI / V4QI)
> >>> as a secondary output.
> >>
> >> Sure.  I'm poor at coming up with function names; does the name
> >> suitable_vector_and_pieces sound good?
> >>   i.e. tree suitable_vector_and_pieces (tree vtype, tree *ptype);
> >
> > tree vector_vector_composition_type (tree vtype, poly_uint64 nelts,
> > tree *ptype);
> >
> > where nelts specifies the number of vtype elements in a piece.
> >
>
> Thanks, yep, I forgot about "nelts".
>
> The new version with refactoring has been attached.
> Bootstrapped/regtested on powerpc64le-linux-gnu (LE) P8 and P9.
>
> Is it ok for trunk?

Yes.

Thanks,
Richard.

> BR,
> Kewen
> ---------
> gcc/ChangeLog
>
> 2020-MM-DD  Kewen Lin  <li...@gcc.gnu.org>
>
>     PR tree-optimization/90332
>     * gcc/tree-vect-stmts.c (vector_vector_composition_type): New
>     function.
>     (get_group_load_store_type): Adjust to call
>     vector_vector_composition_type,
>     extend it to construct with scalar types.
>     (vectorizable_load): Likewise.
>
> -------------
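
The composition-type lookup discussed in the thread (for V16QI with 4-element pieces, return either V16QI built from V4QI pieces or V4SI built from SImode pieces) can be illustrated with a small toy model in C. This is only a sketch under stated assumptions, not the code from the attached patch: `struct vec_type`, `struct target_caps`, `int_type_exists` and `composition_type` are hypothetical names, the two capability flags stand in for whatever vec_init optab queries the real function performs, and `piece_nelts` follows the thread's description of nelts as the number of vtype elements in a piece.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a vector type: NELTS elements of ELT_BITS bits each.
   A scalar piece is modelled as a "vector" with nelts == 1.  */
struct vec_type
{
  unsigned elt_bits;
  unsigned nelts;
};

/* Hypothetical target capabilities, standing in for vec_init optab
   queries (an assumption of this sketch, not a GCC interface).  */
struct target_caps
{
  bool init_from_vector_pieces;  /* e.g. V16QI from four V4QI pieces.  */
  bool init_from_int_pieces;     /* e.g. V4SI from four SImode pieces.  */
};

/* True if an integer type of exactly BITS bits exists in this toy model.  */
static bool
int_type_exists (unsigned bits)
{
  return bits == 8 || bits == 16 || bits == 32 || bits == 64;
}

/* For VTYPE split into pieces of PIECE_NELTS elements each, choose the
   vector type to construct (*RESULT) and the type of one piece (*PIECE).
   Return false if no supported composition exists.  */
static bool
composition_type (struct vec_type vtype, unsigned piece_nelts,
                  struct target_caps caps,
                  struct vec_type *result, struct vec_type *piece)
{
  assert (piece_nelts > 0);
  if (vtype.nelts % piece_nelts != 0)
    return false;

  /* Preferred: build VTYPE itself from sub-vector pieces.  */
  if (caps.init_from_vector_pieces)
    {
      *result = vtype;
      piece->elt_bits = vtype.elt_bits;
      piece->nelts = piece_nelts;
      return true;
    }

  /* Fallback: view each piece as one integer element of the same size.  */
  unsigned piece_bits = vtype.elt_bits * piece_nelts;
  if (caps.init_from_int_pieces && int_type_exists (piece_bits))
    {
      result->elt_bits = piece_bits;
      result->nelts = vtype.nelts / piece_nelts;
      piece->elt_bits = piece_bits;
      piece->nelts = 1;
      return true;
    }

  return false;
}
```

In this model, calling `composition_type` with `{8, 16}` (V16QI) and `piece_nelts = 4` yields `{8, 16}` with a `{8, 4}` piece when vector pieces are supported, and `{32, 4}` (V4SI) with a `{32, 1}` piece when only integer pieces are supported.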