On 8 December 2013 16:53, Uros Bizjak <ubiz...@gmail.com> wrote:
> On Fri, Dec 6, 2013 at 8:33 PM, Sriraman Tallam <tmsri...@google.com> wrote:
>> Patch updated with two more tests to check if the vfmadd insn is being
>> produced when possible.
>>
>> Thanks
>> Sri
>>
>> On Fri, Dec 6, 2013 at 11:12 AM, Sriraman Tallam <tmsri...@google.com> wrote:
>>> Hi,
>>>
>>>     I have attached a patch to fix
>>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59390. Please review.
>>>
>>> Here is the problem. GCC adds target-specific builtins on demand. The
>>> FMA target-specific builtin __builtin_ia32_vfmaddpd gets added via
>>> this declaration:
>>>
>>> void fun() __attribute__((target("fma")));
>>>
>>> Specifically, the builtin __builtin_ia32_vfmaddpd gets added when
>>> ix86_add_new_builtins is called from ix86_valid_target_attribute_tree
>>> when processing this target attribute.
>>>
>>> Now, when the vectorizer is processing the builtin "__builtin_fma" in
>>> function other_fn(), it checks to see if this function is vectorizable
>>> and calls ix86_builtin_vectorized_function in i386.c. That returns the
>>> builtin stored here:
>>>
>>>
>>> case BUILT_IN_FMA:
>>> if (out_mode == DFmode && in_mode == DFmode)
>>> {
>>>  if (out_n == 2 && in_n == 2)
>>>    return ix86_builtins[IX86_BUILTIN_VFMADDPD];
>>>           ....
>>>
>>> ix86_builtins[IX86_BUILTIN_VFMADDPD] would have contained NULL_TREE
>>> had the builtin not been added by the previous target attribute. That
>>> is why the code works if we remove the previous declaration.
>>>
>>> The fix is to not just return the builtin but to also check if the
>>> current function's isa allows the use of the builtin. For instance,
>>> this patch would solve the problem:
>>>
>>> @@ -33977,7 +33977,13 @@ ix86_builtin_vectorized_function (tree fndecl, tre
>>>        if (out_mode == DFmode && in_mode == DFmode)
>>>   {
>>>    if (out_n == 2 && in_n == 2)
>>> -    return ix86_builtins[IX86_BUILTIN_VFMADDPD];
>>> +    {
>>> +      if (ix86_builtins_isa[IX86_BUILTIN_VFMADDPD].isa
>>> +  & global_options.x_ix86_isa_flags)
>>> +        return ix86_builtins[IX86_BUILTIN_VFMADDPD];
>>> +      else
>>> + return NULL_TREE;
>>> +    }
>>>
>>>
>>> but there are many instances of this usage in
>>> ix86_builtin_vectorized_function. This patch covers all the cases.
>
>>     PR target/59390
>>     * gcc.target/i386/pr59390.c: New test.
>>     * gcc.target/i386/pr59390_1.c: New test.
>>     * gcc.target/i386/pr59390_2.c: New test.
>>     * config/i386/i386.c (get_builtin): New function.
>>     (ix86_builtin_vectorized_function): Replace all instances of
>>     ix86_builtins[...] with get_builtin(...).
>>     (ix86_builtin_reciprocal): Ditto.
>
> OK, with a couple of nits:
>
> +static tree get_builtin (enum ix86_builtins code)
>
> Please name this function ix86_get_builtin.
>
> +  if (current_function_decl)
> +    target_tree = DECL_FUNCTION_SPECIFIC_TARGET (current_function_decl);
> +  if (target_tree)
> +    opts = TREE_TARGET_OPTION (target_tree);
> +  else
> +    opts = TREE_TARGET_OPTION (target_option_default_node);
> +
>
> opts = TREE_TARGET_OPTION (target_tree ? target_tree :
> target_option_default_node);
Just curious:
> +static tree get_builtin (enum ix86_builtins code)
> +{
[]
> +>
[]
> @@ -33677,9 +33701,9 @@ ix86_builtin_vectorized_function (tree fndecl, tre
>        if (out_mode == DFmode && in_mode == DFmode)
> {
>   if (out_n == 2 && in_n == 2)
> -    return ix86_builtins[IX86_BUILTIN_SQRTPD];
> +    get_builtin (IX86_BUILTIN_SQRTPD);
>   else if (out_n == 4 && in_n == 4)
> -    return ix86_builtins[IX86_BUILTIN_SQRTPD256];
> +    get_builtin (IX86_BUILTIN_SQRTPD256);
> }
>        break;

I must be missing something?
Don't you have to return ix86_get_builtin(...) ?
thanks,

Reply via email to