On Sep 19, 2017, at 4:21 PM, Bill Schmidt <wschm...@linux.vnet.ibm.com> wrote:
> 
> On Sep 19, 2017, at 3:58 PM, Richard Sandiford <richard.sandif...@linaro.org> wrote:
>> 
>> Bill Schmidt <wschm...@linux.vnet.ibm.com> writes:
>>> Index: gcc/tree-vect-stmts.c
>>> ===================================================================
>>> --- gcc/tree-vect-stmts.c   (revision 252760)
>>> +++ gcc/tree-vect-stmts.c   (working copy)
>>> @@ -1091,8 +1091,19 @@ vect_model_load_cost (stmt_vec_info stmt_info, int
>>>                     prologue_cost_vec, body_cost_vec, true);
>>>  if (memory_access_type == VMAT_ELEMENTWISE
>>>      || memory_access_type == VMAT_STRIDED_SLP)
>>> -    inside_cost += record_stmt_cost (body_cost_vec, ncopies, vec_construct,
>>> -                                     stmt_info, 0, vect_body);
>>> +    {
>>> +      int group_size = GROUP_SIZE (stmt_info);
>>> +      int nunits = TYPE_VECTOR_SUBPARTS (STMT_VINFO_VECTYPE (stmt_info));
>>> +      if (group_size < nunits)
>>> +        {
>>> +          if (dump_enabled_p ())
>>> +            dump_printf_loc (MSG_NOTE, vect_location,
>>> +                             "vect_model_load_cost: vec_construct required");
>>> +          inside_cost += record_stmt_cost (body_cost_vec, ncopies,
>>> +                                           vec_construct, stmt_info, 0,
>>> +                                           vect_body);
>>> +        }
>>> +    }
>>> 
>>>  if (dump_enabled_p ())
>>>    dump_printf_loc (MSG_NOTE, vect_location,
>> 
>> This feels like we've probably got the wrong memory_access_type.
>> If it's a just a contiguous load then it should be VMAT_CONTIGUOUS
>> instead.
> 
> I tend to agree -- that will take more surgery to the code that detects
> strided loads in both the cost model analysis and in vectorizable_load.
> 
It's just a little less clear how to clean this up in vect_analyze_data_refs.
The too-simplistic test there now is:

      else if (is_a <loop_vec_info> (vinfo)
               && TREE_CODE (DR_STEP (dr)) != INTEGER_CST)
        {
          if (nested_in_vect_loop_p (loop, stmt))
            {
              if (dump_enabled_p ())
                {
                  dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
                                   "not vectorized: not suitable for strided "
                                   "load ");
                  dump_gimple_stmt (MSG_MISSED_OPTIMIZATION, TDF_SLIM, stmt, 0);
                }
              return false;
            }
          STMT_VINFO_STRIDED_P (stmt_info) = true;
        }

Furthermore, STMT_VINFO_STRIDED_P is being set while each data reference is
processed individually, before all the information about the access has been
gathered and analyzed.  So the decision is made too early.
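
To make the problem concrete, here is a hypothetical example of the kind of
loop at issue.  The step of the loads from "in" is "stride" elements, which
is not an INTEGER_CST, so the test above sets STMT_VINFO_STRIDED_P; yet the
four loads form a contiguous group that fills one V4SF vector, so no
vec_construct is needed:

/* Hypothetical example only.  The loads from "in" have a variable
   step, so vect_analyze_data_refs marks them strided, but each group
   of four loads is itself contiguous and fills a whole V4SF vector.  */
void
f (float *restrict out, float *restrict in, int n, int stride)
{
  for (int i = 0; i < n; i++)
    {
      out[4 * i + 0] = in[stride * i + 0];
      out[4 * i + 1] = in[stride * i + 1];
      out[4 * i + 2] = in[stride * i + 2];
      out[4 * i + 3] = in[stride * i + 3];
    }
}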

The next place to fix this would be in vect_analyze_slp_cost_1, where

      vect_memory_access_type memory_access_type
        = (STMT_VINFO_STRIDED_P (stmt_info)
           ? VMAT_STRIDED_SLP
           : VMAT_CONTIGUOUS);

is too simplistic.  I guess we could do something here, but it would
require unpacking all the grouped accesses and duplicating all the
logic from scratch to determine whether it's a single load or not.
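
If we did attack it there, I'd guess the fix would have roughly this
shape -- purely a sketch built from the macros used in the patch above,
and whether the group information is even reliable at this point in
vect_analyze_slp_cost_1 is exactly the problem:

/* Sketch only; assumes stmt_info is the first statement of its group,
   as in the patch above.  A strided group that spans a whole number
   of vectors is loaded with plain contiguous vector loads, so it
   could be costed as VMAT_CONTIGUOUS.  */
vect_memory_access_type memory_access_type = VMAT_CONTIGUOUS;
if (STMT_VINFO_STRIDED_P (stmt_info))
  {
    memory_access_type = VMAT_STRIDED_SLP;
    if (STMT_VINFO_GROUPED_ACCESS (stmt_info))
      {
        int group_size = GROUP_SIZE (stmt_info);
        int nunits = TYPE_VECTOR_SUBPARTS (STMT_VINFO_VECTYPE (stmt_info));
        /* Each vector load is contiguous when a group fills whole
           vectors; no vec_construct is required then.  */
        if (group_size >= nunits && group_size % nunits == 0)
          memory_access_type = VMAT_CONTIGUOUS;
      }
  }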

So it's going to be a little painful to do it "right."  I submitted
this patch because I felt it would be the simplest solution.

Thanks,
Bill


> Bill
> 
>> 
>> Thanks,
>> Richard
