Hello. During compilation of Chromium project, I noticed (perf top) that ix86_valid_target_attribute is utilized by about 0.8-1.9%. Following patch introduces simple optimization that reduces the utilization to ~0.05%.
Tests have been running on x86_64-linux-pc. Ready for trunk? Martin
>From d20232455993882e4b7f769725a0c1589747b8a3 Mon Sep 17 00:00:00 2001 From: marxin <marxin.li...@gmail.com> Date: Sun, 8 Mar 2015 19:39:55 -0500 Subject: [PATCH] def_builtin_const: speed-up. gcc/ChangeLog: 2015-03-09 Martin Liska <marxin.li...@gmail.com> * config/i386/i386.c (def_builtin): Collect union of all possible masks. (ix86_add_new_builtins): Do not iterate over all builtins in cases that isa value has no intersection with possible masks and(or) last passed value is equal to the provided. --- gcc/config/i386/i386.c | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index ab8f03a..5f180b6 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -30592,6 +30592,8 @@ struct builtin_isa { static struct builtin_isa ix86_builtins_isa[(int) IX86_BUILTIN_MAX]; +/* Union of all masks that are part of builtin_isa structures. */ +static HOST_WIDE_INT defined_isa_values = 0; /* Add an ix86 target builtin function with CODE, NAME and TYPE. Save the MASK of which isa_flags to use in the ix86_builtins_isa array. Stores the @@ -30619,6 +30621,7 @@ def_builtin (HOST_WIDE_INT mask, const char *name, if (!(mask & OPTION_MASK_ISA_64BIT) || TARGET_64BIT) { ix86_builtins_isa[(int) code].isa = mask; + defined_isa_values |= mask; mask &= ~OPTION_MASK_ISA_64BIT; if (mask == 0 @@ -30670,6 +30673,14 @@ def_builtin_const (HOST_WIDE_INT mask, const char *name, static void ix86_add_new_builtins (HOST_WIDE_INT isa) { + /* Last cached isa value. */ + static HOST_WIDE_INT last_tested_isa_value = 0; + + if ((isa & defined_isa_values) == 0 || isa == last_tested_isa_value) + return; + + last_tested_isa_value = isa; + int i; tree saved_current_target_pragma = current_target_pragma; current_target_pragma = NULL_TREE; -- 2.1.2