Are the i386 changes ok?
Patches with corresponding changes and new tests are attached.

Thanks,
Evgeny

On Thu, Jun 12, 2014 at 12:14 PM, Richard Biener
<richard.guent...@gmail.com> wrote:
> On Thu, Jun 12, 2014 at 6:04 AM, Evgeny Stupachenko <evstu...@gmail.com> 
> wrote:
>> Testing finished. No new regressions.
>> Is the following patch ok?
>
> +  if (targetm.sched.reassociation_width (VEC_PERM_EXPR, mode) > 1 ||
> +      !vect_shift_permute_load_chain (dr_chain, size, stmt, gsi, &result_chain))
>
> ||s and &&s go to the next line.
>
> I miss testcases that make sure the vectorizer/backend code-paths are
> both exercised.  Put them in gcc.target/i386 and provide an appropriate
> -march.
>
> The vectorizer changes are ok with the above fixed, I defer to backend
> maintainers for the i386 changes.
>
> Richard.
>
>> 2014-06-11  Evgeny Stupachenko  <evstu...@gmail.com>
>>
>>         * config/i386/i386.c (ix86_reassociation_width): Add alternative for
>>         vector case.
>>         * config/i386/i386.h (TARGET_VECTOR_PARALLEL_EXECUTION): New.
>>         * config/i386/x86-tune.def (X86_TUNE_VECTOR_PARALLEL_EXECUTION): New.
>>         * tree-vect-data-refs.c (vect_shift_permute_load_chain): New.
>>         Introduces alternative way of load group permutations.
>>         (vect_transform_grouped_load): Try alternative way of permutations.
>>
>> Thanks,
>> Evgeny
>>
>> On Tue, Jun 10, 2014 at 4:43 PM, Evgeny Stupachenko <evstu...@gmail.com> 
>> wrote:
>>> ix86_reassociation_width checks INTEGRAL_MODE_P and FLOAT_MODE_P, which
>>> also match vector modes.
>>> I'll try to separate this into a scalar part and a vector part, but it
>>> will require more testing (testing is under way now).
>>> What about the rest of the patch?
>>>
>>> Thanks,
>>> Evgeny
>>>
>>> On Thu, Jun 5, 2014 at 3:54 PM, Ramana Radhakrishnan
>>> <ramana.radhakrish...@arm.com> wrote:
>>>> On 06/05/14 12:43, Evgeny Stupachenko wrote:
>>>>>
>>>>> The new hook relates to vector instructions only. Vector instructions
>>>>> may execute sequentially in the pipeline while scalar ones execute in
>>>>> parallel. For x86 architectures TARGET_SCHED_REASSOC_WIDTH does not
>>>>> give the required differentiation.
>>>>> General hooks could potentially be reused in other algorithms or by
>>>>> other architectures.
>>>>
>>>>
>>>> It already takes a "mode" argument. Couldn't you use a vector mode to work
>>>> this out ?
>>>>
>>>> If it is not enough then please be more specific about the documentation of
>>>> this hook about where it is useful so that it's easy for people reading the
>>>> documentation to understand at a glance what purpose it serves.
>>>>
>>>>
>>>> Ramana
>>>>
>>>>
>>>>>
>>>>> Thanks,
>>>>> Evgeny
>>>>>
>>>>> On Thu, Jun 5, 2014 at 2:04 PM, Ramana Radhakrishnan
>>>>> <ramana....@googlemail.com> wrote:
>>>>>>
>>>>>> On Wed, May 28, 2014 at 2:09 PM, Evgeny Stupachenko <evstu...@gmail.com>
>>>>>> wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> The patch introduces an alternative way of permuting load groups
>>>>>>> of size 2 and 3, which should be faster on architectures with low
>>>>>>> parallelism.
>>>>>>> The patch gives a 2x gain on Silvermont for the test from PR52252
>>>>>>> (in addition to the already committed 3x gain).
>>>>>>>
>>>>>>> Patch passes bootstrap on x86. Make check is in progress.
>>>>>>
>>>>>>
>>>>>> Why do we need a new hook ? Can't you derive this information from
>>>>>> something which is equally badly named TARGET_SCHED_REASSOC_WIDTH
>>>>>> though used in the reassociation logic but also serves a similar
>>>>>> purpose ?
>>>>>>
>>>>>> Also the documentation of this hook is incomplete at best and wrong at
>>>>>> worst as this is not applied everywhere in the vectorizer but just for
>>>>>> this special case for load store permuting. Implying this is useful
>>>>>> everywhere in the vectorizer does not appear to be correct.
>>>>>>
>>>>>> regards
>>>>>> Ramana
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> ChangeLog:
>>>>>>>
>>>>>>> 2014-05-28  Evgeny Stupachenko  <evstu...@gmail.com>
>>>>>>>
>>>>>>>          * config/i386/i386.c (ix86_have_vector_parallel_execution): New.
>>>>>>>          (TARGET_VECTORIZE_HAVE_VECTOR_PARALLEL_EXECUTION): New.
>>>>>>>          * config/i386/i386.h (TARGET_VECTOR_PARALLEL_EXECUTION): New.
>>>>>>>          * config/i386/x86-tune.def (X86_TUNE_VECTOR_PARALLEL_EXECUTION): New.
>>>>>>>          * target.def (have_vector_parallel_execution): New.
>>>>>>>          * doc/tm.texi.in (have_vector_parallel_execution): New.
>>>>>>>          * doc/tm.texi: Regenerate.
>>>>>>>          * targhooks.c (default_have_vector_parallel_execution): New.
>>>>>>>          * tree-vect-data-refs.c (vect_shift_permute_load_chain): New.
>>>>>>>          Introduces alternative way of load group permutations.
>>>>>>>          (vect_transform_grouped_load): Try alternative way of
>>>>>>> permutations.
>>>>>>>
>>>>>>> Evgeny
>>>>>
>>>>>
>>>>

Attachment: vect_groups2.patch
Description: Binary data

Attachment: i386tests.patch
Description: Binary data
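
For readers following the thread without the attachments, here is a rough, standalone sketch of the two points discussed above: giving the reassociation-width query a separate vector case, and writing the vectorizer condition with the `||` at the start of the continuation line as Richard asked. This is not the attached patch. Every `toy_` name, the mode list, and the returned widths are made up for illustration, and the fallback to the existing interleaving permutation when the condition holds is my reading of the quoted hunk, not something the thread states outright.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for GCC machine modes; not the real enum.  */
enum toy_mode { TOY_SImode, TOY_DFmode, TOY_V4SImode, TOY_V2DFmode };

/* The two permutation strategies the thread talks about.  */
enum toy_permute { TOY_INTERLEAVE_PERMUTE, TOY_SHIFT_PERMUTE };

static bool
toy_vector_mode_p (enum toy_mode mode)
{
  return mode == TOY_V4SImode || mode == TOY_V2DFmode;
}

/* Toy analogue of a reassociation-width query with a separate vector
   case.  VECTOR_PARALLEL models a tuning flag in the spirit of
   X86_TUNE_VECTOR_PARALLEL_EXECUTION; the widths are invented.  */
static int
toy_reassociation_width (enum toy_mode mode, bool vector_parallel)
{
  if (toy_vector_mode_p (mode))
    return vector_parallel ? 2 : 1;
  return 2;
}

/* Toy analogue of the shift-based permutation attempt: it only claims
   success for the group sizes named in the thread (2 and 3).  */
static bool
toy_shift_permute_ok (int group_size)
{
  return group_size == 2 || group_size == 3;
}

/* Pick a strategy.  Note the GNU layout: when a condition is broken
   across lines, the operator starts the continuation line.  */
static enum toy_permute
toy_choose_permute (enum toy_mode mode, int group_size, bool vector_parallel)
{
  assert (group_size > 0);

  if (toy_reassociation_width (mode, vector_parallel) > 1
      || !toy_shift_permute_ok (group_size))
    return TOY_INTERLEAVE_PERMUTE;

  return TOY_SHIFT_PERMUTE;
}
```

With these toy definitions, a vector mode on a target without parallel vector execution and a group of size 3 selects the shift-based strategy; any other combination keeps the interleaving one.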
