On Tue, Sep 17, 2013 at 10:20 AM, Richard Biener
<richard.guent...@gmail.com> wrote:
> On Mon, Sep 16, 2013 at 10:24 PM, Xinliang David Li <davi...@google.com> 
> wrote:
>> On Mon, Sep 16, 2013 at 3:13 AM, Richard Biener
>> <richard.guent...@gmail.com> wrote:
>>> On Fri, Sep 13, 2013 at 5:16 PM, Xinliang David Li <davi...@google.com> 
>>> wrote:
>>>> On Fri, Sep 13, 2013 at 1:30 AM, Richard Biener
>>>> <richard.guent...@gmail.com> wrote:
>>>>> On Thu, Sep 12, 2013 at 10:31 PM, Xinliang David Li <davi...@google.com> 
>>>>> wrote:
>>>>>> Currently -ftree-vectorize turns on both loop and slp vectorizations,
>>>>>> but there is no simple way to turn on loop vectorization alone. The
>>>>>> logic for default O3 setting is also complicated.
>>>>>>
>>>>>> In this patch, two new options are introduced:
>>>>>>
>>>>>> 1) -ftree-loop-vectorize
>>>>>>
>>>>>> This option is used to turn on loop vectorization only. option
>>>>>> -ftree-slp-vectorize also becomes a first class citizen, and no funny
>>>>>> business of Init(2) is needed.  With this change, -ftree-vectorize
>>>>>> becomes a simple alias to -ftree-loop-vectorize +
>>>>>> -ftree-slp-vectorize.
>>>>>>
>>>>>> For instance, to turn on only slp vectorize at O3, the old way is:
>>>>>>
>>>>>>      -O3 -fno-tree-vectorize -ftree-slp-vectorize
>>>>>>
>>>>>> With the new change it becomes:
>>>>>>
>>>>>>     -O3 -fno-loop-vectorize
>>>>>>
>>>>>>
>>>>>> To turn on only loop vectorize at O2, the old way is
>>>>>>
>>>>>>     -O2 -ftree-vectorize -fno-slp-vectorize
>>>>>>
>>>>>> The new way is
>>>>>>
>>>>>>     -O2 -ftree-loop-vectorize
>>>>>>
>>>>>>
>>>>>>
>>>>>> 2) -ftree-vect-loop-peeling
>>>>>>
>>>>>> This option is used to turn on/off loop peeling for alignment.  In the
>>>>>> long run, this should be folded into the cheap cost model proposed by
>>>>>> Richard.  This option is also useful in scenarios where peeling can
>>>>>> introduce runtime problems:
>>>>>> http://gcc.gnu.org/ml/gcc/2005-12/msg00390.html  which happens to be
>>>>>> common in practice.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Patch attached. Compiler boostrapped. Ok after testing?
>>>>>
>>>>> I'd like you to split 1) and 2), mainly because I agree on 1) but not on 
>>>>> 2).
>>>>
>>>> Ok. Can you also comment on 2) ?
>>>
>>> I think we want to decide how granular we want to control the vectorizer
>>> and using which mechanism.  My cost-model re-org makes
>>> ftree-vect-loop-version a no-op (basically removes it), so 2) looks like
>>> a step backwards in this context.
>>
>> Using cost model to do a coarse grain control/configuration is
>> certainly something we want, but having a fine grain control is still
>> useful.
>>
>>>
>>> So, can you summarize what pieces (including versioning) of the vectorizer
>>> you'd want to be able to disable separately?
>>
>> Loop peeling seems to be the main one. There is also a correctness
>> issue related. For instance, the following code is common in practice,
>> but loop peeling wrongly assumes initial base-alignment and generates
>> aligned mov instruction after peeling, leading to SEGV.  Peeling is
>> not something we can blindly turned on -- even when it is on, there
>> should be a way to turn it off explicitly:
>>
>> char a[10000];
>>
>> void foo(int n)
>> {
>>   int* b = (int*)(a+n);
>>   int i = 0;
>>   for (; i < 1000; ++i)
>>     b[i] = 1;
>> }
>>
>> int main(int argn, char** argv)
>> {
>>   foo(argn);
>> }
>
> But that's just a bug that should be fixed (looking into it).

Bug in the testcase.  b[i] asserts that b is aligned to 'int', so this invokes
undefined behavior if peeling cannot reach an alignment of 16.

Richard.

Reply via email to