On Wed, Dec 11, 2013 at 2:42 AM, Uros Bizjak <ubiz...@gmail.com> wrote: > On Wed, Dec 11, 2013 at 10:15 AM, Bernhard Reutner-Fischer > <rep.dot....@gmail.com> wrote: > >>>> Patch updated with two more tests to check if the vfmadd insn is being >>>> produced when possible. >>>> >>>> Thanks >>>> Sri >>>> >>>> On Fri, Dec 6, 2013 at 11:12 AM, Sriraman Tallam <tmsri...@google.com> >>>> wrote: >>>>> Hi, >>>>> >>>>> I have attached a patch to fix >>>>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59390. Please review. >>>>> >>>>> Here is the problem. GCC adds target-specific builtins on demand. The >>>>> FMA target-specific builtin __builtin_ia32_vfmaddpd gets added via >>>>> this declaration: >>>>> >>>>> void fun() __attribute__((target("fma"))); >>>>> >>>>> Specifically, the builtin __builtin_ia32_vfmaddpd gets added when >>>>> ix86_add_new_builtins is called from ix86_valid_target_attribute_tree >>>>> when processing this target attribute. >>>>> >>>>> Now, when the vectorizer is processing the builtin "__builtin_fma" in >>>>> function other_fn(), it checks to see if this function is vectorizable >>>>> and calls ix86_builtin_vectorized_function in i386.c. That returns the >>>>> builtin stored here: >>>>> >>>>> >>>>> case BUILT_IN_FMA: >>>>> if (out_mode == DFmode && in_mode == DFmode) >>>>> { >>>>> if (out_n == 2 && in_n == 2) >>>>> return ix86_builtins[IX86_BUILTIN_VFMADDPD]; >>>>> .... >>>>> >>>>> ix86_builtins[IX86_BUILTIN_VFMADDPD] would have contained NULL_TREE >>>>> had the builtin not been added by the previous target attribute. That >>>>> is why the code works if we remove the previous declaration. >>>>> >>>>> The fix is to not just return the builtin but to also check if the >>>>> current function's isa allows the use of the builtin. For instance, >>>>> this patch would solve the problem: >>>>> >>>>> @@ -33977,7 +33977,13 @@ ix86_builtin_vectorized_function (tree fndecl, >>>>> tre >>>>> if (out_mode == DFmode && in_mode == DFmode) >>>>> { >>>>> if (out_n == 2 && in_n == 2) >>>>> - return ix86_builtins[IX86_BUILTIN_VFMADDPD]; >>>>> + { >>>>> + if (ix86_builtins_isa[IX86_BUILTIN_VFMADDPD].isa >>>>> + & global_options.x_ix86_isa_flags) >>>>> + return ix86_builtins[IX86_BUILTIN_VFMADDPD]; >>>>> + else >>>>> + return NULL_TREE; >>>>> + } >>>>> >>>>> >>>>> but there are many instances of this usage in >>>>> ix86_builtin_vectorized_function. This patch covers all the cases. >>> >>>> PR target/59390 >>>> * gcc.target/i386/pr59390.c: New test. >>>> * gcc.target/i386/pr59390_1.c: New test. >>>> * gcc.target/i386/pr59390_2.c: New test. >>>> * config/i386/i386.c (get_builtin): New function. >>>> (ix86_builtin_vectorized_function): Replace all instances of >>>> ix86_builtins[...] with get_builtin(...). >>>> (ix86_builtin_reciprocal): Ditto. >>> >>> OK, with a couple of nits: >>> >>> +static tree get_builtin (enum ix86_builtins code) >>> >>> Please name this function ix86_get_builtin. >>> >>> + if (current_function_decl) >>> + target_tree = DECL_FUNCTION_SPECIFIC_TARGET (current_function_decl); >>> + if (target_tree) >>> + opts = TREE_TARGET_OPTION (target_tree); >>> + else >>> + opts = TREE_TARGET_OPTION (target_option_default_node); >>> + >>> >>> opts = TREE_TARGET_OPTION (target_tree ? target_tree : >>> target_option_default_node); >> >> Just curious: >>> +static tree get_builtin (enum ix86_builtins code) >>> +{ >> [] >>> +> >> [] >>> @@ -33677,9 +33701,9 @@ ix86_builtin_vectorized_function (tree fndecl, tre >>> if (out_mode == DFmode && in_mode == DFmode) >>> { >>> if (out_n == 2 && in_n == 2) >>> - return ix86_builtins[IX86_BUILTIN_SQRTPD]; >>> + get_builtin (IX86_BUILTIN_SQRTPD); >>> else if (out_n == 4 && in_n == 4) >>> - return ix86_builtins[IX86_BUILTIN_SQRTPD256]; >>> + get_builtin (IX86_BUILTIN_SQRTPD256); >>> } >>> break; >> >> I must be missing something? >> Don't you have to return ix86_get_builtin(...) ?
Yes, I noticed it last evening when the vect tests broke. I fixed it with adding the return (which was a typo when I did a bulk edit) and running tests. I will include Uros's changes and commit the patch. Thanks Sri > > Huh, I didn't even notice this mistake. > > Uros.