Hi Richard, > -----Original Message----- > From: Richard Biener [mailto:richard.guent...@gmail.com] > Sent: Thursday, May 19, 2016 4:08 PM > To: Kumar, Venkataramanan <venkataramanan.ku...@amd.com> > Cc: gcc-patches@gcc.gnu.org > Subject: Re: [Patch V2] Fix SLP PR58135. > > On Wed, May 18, 2016 at 5:29 PM, Kumar, Venkataramanan > <venkataramanan.ku...@amd.com> wrote: > > Hi Richard, > > > >> -----Original Message----- > >> From: Richard Biener [mailto:richard.guent...@gmail.com] > >> Sent: Tuesday, May 17, 2016 5:40 PM > >> To: Kumar, Venkataramanan <venkataramanan.ku...@amd.com> > >> Cc: gcc-patches@gcc.gnu.org > >> Subject: Re: [Patch V2] Fix SLP PR58135. > >> > >> On Tue, May 17, 2016 at 1:56 PM, Kumar, Venkataramanan > >> <venkataramanan.ku...@amd.com> wrote: > >> > Hi Richard, > >> > > >> > I created the patch by passing -b option to git. Now the patch is > >> > more > >> readable. > >> > > >> > As per your suggestion I tried to fix the PR by splitting the SLP > >> > store group at > >> vector boundary after the SLP tree is built. > >> > > >> > Boot strap PASSED on x86_64. > >> > Checked the patch with check_GNU_style.sh. > >> > > >> > The gfortran.dg/pr46519-1.f test now does SLP vectorization. Hence > >> > it > >> generated 2 more vzeroupper. > >> > As recommended I adjusted the test case by adding > >> > -fno-tree-slp-vectorize > >> to make it as expected after loop vectorization. > >> > > >> > The following tests are now passing. > >> > > >> > ------ Snip----- > >> > Tests that now work, but didn't before: > >> > > >> > gcc.dg/vect/bb-slp-19.c -flto -ffat-lto-objects > >> > scan-tree-dump-times > >> > slp2 "basic block vectorized" 1 > >> > > >> > gcc.dg/vect/bb-slp-19.c scan-tree-dump-times slp2 "basic block > >> > vectorized" 1 > >> > > >> > New tests that PASS: > >> > > >> > gcc.dg/vect/pr58135.c (test for excess errors) > >> > gcc.dg/vect/pr58135.c -flto -ffat-lto-objects (test for excess > >> > errors) > >> > > >> > ------ Snip----- > >> > > >> > ChangeLog > >> > > >> > 2016-05-14 Venkataramanan Kumar > >> <venkataramanan.ku...@amd.com> > >> > PR tree-optimization/58135 > >> > * tree-vect-slp.c: When group size is not multiple of vector size, > >> > allow splitting of store group at vector boundary. > >> > > >> > Test suite ChangeLog > >> > 2016-05-14 Venkataramanan Kumar > >> <venkataramanan.ku...@amd.com> > >> > * gcc.dg/vect/bb-slp-19.c: Remove XFAIL. > >> > * gcc.dg/vect/pr58135.c: Add new. > >> > * gfortran.dg/pr46519-1.f: Adjust test case. > >> > > >> > The attached patch Ok for trunk? > >> > >> > >> Please avoid the excessive vertical space around the vect_build_slp_tree > call. > > Yes fixed in the attached patch. > >> > >> + /* Calculate the unrolling factor. */ > >> + unrolling_factor = least_common_multiple > >> + (nunits, group_size) / group_size; > >> ... > >> + else > >> { > >> /* Calculate the unrolling factor based on the smallest type. */ > >> if (max_nunits > nunits) > >> - unrolling_factor = least_common_multiple (max_nunits, group_size) > >> - / group_size; > >> + unrolling_factor > >> + = least_common_multiple (max_nunits, > >> + group_size)/group_size; > >> > >> please compute the "correct" unroll factor immediately and move the > >> "unrolling of BB required" error into the if() case by post-poning > >> the nunits < group_size check (and use max_nunits here). > >> > > Yes fixed in the attached patch. > > > >> + if (is_a <bb_vec_info> (vinfo) > >> + && nunits < group_size > >> + && unrolling_factor != 1 > >> + && is_a <bb_vec_info> (vinfo)) > >> + { > >> + dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, > >> + "Build SLP failed: store group " > >> + "size not a multiple of the vector size " > >> + "in basic block SLP\n"); > >> + /* Fatal mismatch. */ > >> + matches[nunits] = false; > >> > >> this is too pessimistic - you want to add the extra 'false' at > >> group_size / max_nunits * max_nunits. > > Yes fixed in attached patch. > > > >> > >> It looks like you leak 'node' in the if () path as well. You need > >> > >> vect_free_slp_tree (node); > >> loads.release (); > >> > >> thus treat it as a failure case. > > > > Yes fixed. I added an else part before scalar_stmts.release call for the > > case > when SLP tree is not built. This avoids double freeing. > > Bootstrapped and reg tested on X86_64. > > > > Ok for trunk ? > > + /*Calculate the unrolling factor based on the smallest type. */ > + unrolling_factor > > Space after /* missing. > > + unrolling_factor > + = least_common_multiple (max_nunits, group_size)/group_size; > > spaces before/after the '/' > > + if (max_nunits > nunits > + && unrolling_factor != 1 > + && is_a <bb_vec_info> (vinfo)) > { > if (dump_enabled_p ()) > - dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, > - "Build SLP failed: unrolling required in basic" > - " block SLP\n"); > + dump_printf_loc (MSG_MISSED_OPTIMIZATION, > + vect_location, > + "Build SLP failed: unrolling " > + "required in basic block SLP\n"); > vect_free_slp_tree (node); > loads.release (); > return false; > } > > + if (is_a <bb_vec_info> (vinfo) > + && max_nunits < group_size > + && unrolling_factor != 1) > > please merge these. I think you have a not covered case of max_nunits == > nunits, max_nunits > group_size. > So do > > if (unrolling_factor != 1 > && is_a <...>) > { > if (max_nunits >= group_size) > { > first error > } > ... signal fatal mismatch instead... > } > else > { > > Ok with that change.
Thank you for the review. I updated and patch and up streamed it to trunk. Ref: https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=236582 Regards, Venkat. > > Thanks, > Richard. > > >> > >> Thanks, > >> Richard. > >> > >> > Regards, > >> > Venkat. > >> > > > > > Regards, > > Venkat.