On Wed, 11 Dec 2013, Bernhard Reutner-Fischer wrote: > On 8 December 2013 16:53, Uros Bizjak <ubiz...@gmail.com> wrote: > > On Fri, Dec 6, 2013 at 8:33 PM, Sriraman Tallam <tmsri...@google.com> wrote: > >> Patch updated with two more tests to check if the vfmadd insn is being > >> produced when possible. > >> > >> Thanks > >> Sri > >> > >> On Fri, Dec 6, 2013 at 11:12 AM, Sriraman Tallam <tmsri...@google.com> > >> wrote: > >>> Hi, > >>> > >>> I have attached a patch to fix > >>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59390. Please review. > >>> > >>> Here is the problem. GCC adds target-specific builtins on demand. The > >>> FMA target-specific builtin __builtin_ia32_vfmaddpd gets added via > >>> this declaration: > >>> > >>> void fun() __attribute__((target("fma"))); > >>> > >>> Specifically, the builtin __builtin_ia32_vfmaddpd gets added when > >>> ix86_add_new_builtins is called from ix86_valid_target_attribute_tree > >>> when processing this target attribute. > >>> > >>> Now, when the vectorizer is processing the builtin "__builtin_fma" in > >>> function other_fn(), it checks to see if this function is vectorizable > >>> and calls ix86_builtin_vectorized_function in i386.c. That returns the > >>> builtin stored here: > >>> > >>> > >>> case BUILT_IN_FMA: > >>> if (out_mode == DFmode && in_mode == DFmode) > >>> { > >>> if (out_n == 2 && in_n == 2) > >>> return ix86_builtins[IX86_BUILTIN_VFMADDPD]; > >>> .... > >>> > >>> ix86_builtins[IX86_BUILTIN_VFMADDPD] would have contained NULL_TREE > >>> had the builtin not been added by the previous target attribute. That > >>> is why the code works if we remove the previous declaration. > >>> > >>> The fix is to not just return the builtin but to also check if the > >>> current function's isa allows the use of the builtin. For instance, > >>> this patch would solve the problem: > >>> > >>> @@ -33977,7 +33977,13 @@ ix86_builtin_vectorized_function (tree fndecl, > >>> tre > >>> if (out_mode == DFmode && in_mode == DFmode) > >>> { > >>> if (out_n == 2 && in_n == 2) > >>> - return ix86_builtins[IX86_BUILTIN_VFMADDPD]; > >>> + { > >>> + if (ix86_builtins_isa[IX86_BUILTIN_VFMADDPD].isa > >>> + & global_options.x_ix86_isa_flags) > >>> + return ix86_builtins[IX86_BUILTIN_VFMADDPD]; > >>> + else > >>> + return NULL_TREE; > >>> + } > >>> > >>> > >>> but there are many instances of this usage in > >>> ix86_builtin_vectorized_function. This patch covers all the cases. > > > >> PR target/59390 > >> * gcc.target/i386/pr59390.c: New test. > >> * gcc.target/i386/pr59390_1.c: New test. > >> * gcc.target/i386/pr59390_2.c: New test. > >> * config/i386/i386.c (get_builtin): New function. > >> (ix86_builtin_vectorized_function): Replace all instances of > >> ix86_builtins[...] with get_builtin(...). > >> (ix86_builtin_reciprocal): Ditto. > > > > OK, with a couple of nits: > > > > +static tree get_builtin (enum ix86_builtins code) > > > > Please name this function ix86_get_builtin. > > > > + if (current_function_decl) > > + target_tree = DECL_FUNCTION_SPECIFIC_TARGET (current_function_decl); > > + if (target_tree) > > + opts = TREE_TARGET_OPTION (target_tree); > > + else > > + opts = TREE_TARGET_OPTION (target_option_default_node); > > + > > > > opts = TREE_TARGET_OPTION (target_tree ? target_tree : > > target_option_default_node); > > Just curious: > > +static tree get_builtin (enum ix86_builtins code) > > +{ > [] > > +> > [] > > @@ -33677,9 +33701,9 @@ ix86_builtin_vectorized_function (tree fndecl, tre > > if (out_mode == DFmode && in_mode == DFmode) > > { > > if (out_n == 2 && in_n == 2) > > - return ix86_builtins[IX86_BUILTIN_SQRTPD]; > > + get_builtin (IX86_BUILTIN_SQRTPD); > > else if (out_n == 4 && in_n == 4) > > - return ix86_builtins[IX86_BUILTIN_SQRTPD256]; > > + get_builtin (IX86_BUILTIN_SQRTPD256); > > } > > break; > > I must be missing something? > Don't you have to return ix86_get_builtin(...) ?
Of course you have to. Richard.