On Mon, Aug 28, 2017 at 10:57 AM, Dominik Inführ
<[email protected]> wrote:
> Hi,
>
> As discussed on IRC: This patches fixes the calculation of the scalar costs
> of SLP vectorization. I’ve added a test case and the auto_vec allocation is
> now reused for all children of a node.
Looks good to me and thanks for the testcase. I'll include this in my
next round of testing and make sure it's fixed for GCC 8.
Thanks,
Richard.
> diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-vect-slp.c
> b/gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-vect-slp.c
> new file mode 100644
> index 0000000..5121a88
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-vect-slp.c
> @@ -0,0 +1,28 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-fdump-tree-slp-details" } */
> +
> +#define N 4
> +
> +int s1[N], s2[N], s3[N];
> +void escape(int, int, int, int);
> +
> +void
> +foo ()
> +{
> + int t1, t2, t3, t4;
> +
> + t1 = s1[0] + s2[0] + s3[0];
> + t2 = s1[1] + s2[1] + s3[1];
> + t3 = s1[2] + s2[2] + s3[2];
> + t4 = s1[3] + s2[3] + s3[3];
> +
> + s3[0] = t1 - s1[0] * s2[0];
> + s3[1] = t2 - s1[1] * s2[1];
> + s3[2] = t3 - s1[2] * s2[2];
> + s3[3] = t4 - s1[3] * s2[3];
> +
> + escape (t1, t2, t3, t4);
> +}
> +
> +/* { dg-final { scan-tree-dump-not "vectorization is not profitable" "slp2"
> } } */
> +/* { dg-final { scan-tree-dump "basic block vectorized" "slp2" } } */
> diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
> index 04ecaab..cffd19b 100644
> --- a/gcc/tree-vect-slp.c
> +++ b/gcc/tree-vect-slp.c
> @@ -2690,9 +2690,17 @@ vect_bb_slp_scalar_cost (basic_block bb,
> scalar_cost += stmt_cost;
> }
>
> + auto_vec<bool, 20> subtree_life;
> +
> FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
> if (SLP_TREE_DEF_TYPE (child) == vect_internal_def)
> - scalar_cost += vect_bb_slp_scalar_cost (bb, child, life);
> + {
> + /* Do not directly pass LIFE to the recursive call, copy it to
> + confine changes in the callee to the current child/subtree. */
> + subtree_life.safe_splice (*life);
> + scalar_cost += vect_bb_slp_scalar_cost (bb, child, &subtree_life);
> + subtree_life.truncate (0);
> + }
>
> return scalar_cost;
> }
>
>> On 04 Aug 2017, at 12:19, Richard Biener <[email protected]> wrote:
>>
>> On Fri, Aug 4, 2017 at 12:08 PM, Dominik Inführ
>> <[email protected]> wrote:
>>> Hi,
>>>
>>> vect_bb_slp_scalar_cost computes the scalar cost of a SLP node. If there
>>> are non-scalar uses of a definition, the costs for it and its operands
>>> (children) are ignored. The vector LIFE is used to keep track of this and
>>> an element is set to true, such that the def and its children are ignored.
>>> But as soon as an element is set to true it is never undone, that means the
>>> following sibling and parent nodes of the current node also stay ignored.
>>
>> Yes, that's intended. They are live as well because they are needed
>> to compute the live scalar.
>>
>> This seems wrong to me, a simple fix would be to clone LIFE for every
>> vector, such that changes to LIFE stay in the subtree:
>>>
>>> diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
>>> index 032a9444a5a..f919645f28b 100644
>>> --- a/gcc/tree-vect-slp.c
>>> +++ b/gcc/tree-vect-slp.c
>>> @@ -2590,6 +2590,11 @@ vect_bb_slp_scalar_cost (basic_block bb,
>>> gimple *stmt;
>>> slp_tree child;
>>>
>>> + auto_vec<bool, 20> subtree_life;
>>> + subtree_life.safe_splice (*life);
>>> +
>>> + life = &subtree_life;
>>> +
>>> FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (node), i, stmt)
>>> {
>>> unsigned stmt_cost;
>>>
>>>
>