Are the i386 changes OK? Patches with the corresponding changes and new tests are attached.

Thanks,
Evgeny

On Thu, Jun 12, 2014 at 12:14 PM, Richard Biener
<richard.guent...@gmail.com> wrote:
> On Thu, Jun 12, 2014 at 6:04 AM, Evgeny Stupachenko <evstu...@gmail.com>
> wrote:
>> Testing finished. No new regressions.
>> Is the following patch ok?
>
> +  if (targetm.sched.reassociation_width (VEC_PERM_EXPR, mode) > 1 ||
> +      !vect_shift_permute_load_chain (dr_chain, size, stmt, gsi, &result_chain))
>
> ||s and &&s go to the next line.
>
> I miss testcases that make sure the vectorizer/backend code-paths are
> both exercised.  Put them in gcc.target/i386 and provide an appropriate
> -march.
>
> The vectorizer changes are ok with the above fixed, I defer to backend
> maintainers for the i386 changes.
>
> Richard.
>
>> 2014-06-11  Evgeny Stupachenko  <evstu...@gmail.com>
>>
>>         * config/i386/i386.c (ix86_reassociation_width): Add alternative for
>>         vector case.
>>         * config/i386/i386.h (TARGET_VECTOR_PARALLEL_EXECUTION): New.
>>         * config/i386/x86-tune.def (X86_TUNE_VECTOR_PARALLEL_EXECUTION): New.
>>         * tree-vect-data-refs.c (vect_shift_permute_load_chain): New.
>>         Introduces alternative way of loads group permutations.
>>         (vect_transform_grouped_load): Try alternative way of permutations.
>>
>> Thanks,
>> Evgeny
>>
>> On Tue, Jun 10, 2014 at 4:43 PM, Evgeny Stupachenko <evstu...@gmail.com>
>> wrote:
>>> ix86_reassociation_width checks INTEGRAL_MODE_P and FLOAT_MODE_P which
>>> include vector mode.
>>> I'll try to separate this into scalar and vector part, but it will
>>> require more testing (under the testing now).
>>> What about the rest of the patch?
>>>
>>> Thanks,
>>> Evgeny
>>>
>>> On Thu, Jun 5, 2014 at 3:54 PM, Ramana Radhakrishnan
>>> <ramana.radhakrish...@arm.com> wrote:
>>>> On 06/05/14 12:43, Evgeny Stupachenko wrote:
>>>>>
>>>>> New hook is related to vector instructions only. Vector instructions
>>>>> could be sequential in pipeline, but scalar - parallel. For x86
>>>>> architectures TARGET_SCHED_REASSOC_WIDTH does not give required
>>>>> differentiation.
>>>>> General hooks could be potentially reused in other algorithms/by other
>>>>> architectures.
>>>>
>>>> It already takes a "mode" argument. Couldn't you use a vector mode to work
>>>> this out ?
>>>>
>>>> If it is not enough then please be more specific about the documentation of
>>>> this hook about where it is useful so that it's easy for people reading the
>>>> documentation to understand at a glance what purpose it serves.
>>>>
>>>> Ramana
>>>>
>>>>>
>>>>> Thanks,
>>>>> Evgeny
>>>>>
>>>>> On Thu, Jun 5, 2014 at 2:04 PM, Ramana Radhakrishnan
>>>>> <ramana....@googlemail.com> wrote:
>>>>>>
>>>>>> On Wed, May 28, 2014 at 2:09 PM, Evgeny Stupachenko <evstu...@gmail.com>
>>>>>> wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> The patch introduces alternative way of permutations for load groups
>>>>>>> of size 2 and 3 which should be faster on architectures with low
>>>>>>> parallelism.
>>>>>>> The patch gives 2 times gain on Silvermont to the test from PR52252
>>>>>>> (in addition to already committed 3 times gain).
>>>>>>>
>>>>>>> Patch passes bootstrap on x86. Make check is in progress.
>>>>>>
>>>>>> Why do we need a new hook ? Can't you derive this information from
>>>>>> something which is equally badly named TARGET_SCHED_REASSOC_WIDTH
>>>>>> though used in the reassociation logic but also serves a similar
>>>>>> purpose ?
>>>>>>
>>>>>> Also the documentation of this hook is incomplete at best and wrong at
>>>>>> worst as this is not applied everywhere in the vectorizer but just for
>>>>>> this special case for load store permuting. Implying this is useful
>>>>>> everywhere in the vectorizer does not appear to be correct.
>>>>>>
>>>>>> regards
>>>>>> Ramana
>>>>>>
>>>>>>>
>>>>>>> ChangeLog:
>>>>>>>
>>>>>>> 2014-05-28  Evgeny Stupachenko  <evstu...@gmail.com>
>>>>>>>
>>>>>>>         * config/i386/i386.c (ix86_have_vector_parallel_execution):
>>>>>>>         New.
>>>>>>>         (TARGET_VECTORIZE_HAVE_VECTOR_PARALLEL_EXECUTION): New.
>>>>>>>         * config/i386/i386.h (TARGET_VECTOR_PARALLEL_EXECUTION): New.
>>>>>>>         * config/i386/x86-tune.def
>>>>>>>         (X86_TUNE_VECTOR_PARALLEL_EXECUTION): New.
>>>>>>>         * target.def (have_vector_parallel_execution): New.
>>>>>>>         * doc/tm.texi.in (have_vector_parallel_execution): New.
>>>>>>>         * doc/tm.texi: Regenerate.
>>>>>>>         * targhooks.c (default_have_vector_parallel_execution): New.
>>>>>>>         * tree-vect-data-refs.c (vect_shift_permute_load_chain): New.
>>>>>>>         Introduces alternative way of loads group permutations.
>>>>>>>         (vect_transform_grouped_load): Try alternative way of
>>>>>>>         permutations.
>>>>>>>
>>>>>>> Evgeny
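
For reference, the two points discussed above (a reassociation-width hook that answers separately for vector modes, and the "||s and &&s go to the next line" formatting) can be sketched in isolation. This is only a toy model, not the patch: every toy_* name, the mode enum and the width values are made up for illustration, and I assume the condition guards a fallback to the existing permutation path, which the quoted fragment does not show. The only parts taken from the thread are that the width is queried per mode and that the alternative permutation covers load groups of size 2 and 3.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for machine modes; not GCC's definitions.  */
enum toy_mode { TOY_SImode, TOY_DFmode, TOY_V4SImode, TOY_V2DFmode };

static bool
toy_vector_mode_p (enum toy_mode mode)
{
  return mode == TOY_V4SImode || mode == TOY_V2DFmode;
}

/* Toy reassociation-width hook that treats vector modes separately from
   scalar ones.  VECTOR_PARALLEL models a tuning flag in the spirit of
   TARGET_VECTOR_PARALLEL_EXECUTION; the widths 1 and 2 are invented.  */
static int
toy_reassociation_width (enum toy_mode mode, bool vector_parallel)
{
  if (toy_vector_mode_p (mode))
    return vector_parallel ? 2 : 1;
  return 2;
}

/* Toy stand-in for vect_shift_permute_load_chain: reports success only
   for load groups of size 2 and 3.  */
static bool
toy_shift_permute_load_chain (unsigned int size)
{
  return size == 2 || size == 3;
}

/* Return true when the generic permutation path should be used (an
   assumption about what the condition guards).  Note the layout: the
   line breaks before "||", so the operator starts the next line.  */
static bool
toy_use_generic_permute (enum toy_mode mode, unsigned int size,
                         bool vector_parallel)
{
  return (toy_reassociation_width (mode, vector_parallel) > 1
          || !toy_shift_permute_load_chain (size));
}

/* Small self-check of the decision table.  */
void
toy_self_check (void)
{
  /* Low vector parallelism, group of 2: the alternative path is taken.  */
  assert (!toy_use_generic_permute (TOY_V4SImode, 2, false));
  /* Group of 4 is not handled by the alternative path.  */
  assert (toy_use_generic_permute (TOY_V4SImode, 4, false));
  /* Parallel vector execution: keep the generic path.  */
  assert (toy_use_generic_permute (TOY_V4SImode, 2, true));
  /* Scalar modes always report a width above 1 in this toy.  */
  assert (toy_reassociation_width (TOY_SImode, false) == 2);
}
```

Calling toy_self_check () from a small main is enough to exercise the sketch. Passing the tuning flag as a parameter rather than reading a global is only to keep the example easy to test.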
Attachment: vect_groups2.patch
Attachment: i386tests.patch