Alfie Richards <alfie.richa...@arm.com> writes: > On 04/02/2025 10:03, Richard Sandiford wrote: >> Alfie Richards <alfie.richa...@arm.com> writes: >>> + return id; >>> + else if (cgraph_node::get_create >>> (decl)->dispatcher_resolver_function) >>> + id = clone_identifier (id, "resolver"); >>> + else if (DECL_FUNCTION_VERSIONED (decl)) >>> { >>> - name += ".default"; >>> - return get_identifier (name.c_str()); >>> - } >>> - >>> - name += "._"; >>> + aarch64_fmv_feature_mask feature_mask >>> + = get_feature_mask_for_version (decl); >>> >>> - int num_features = ARRAY_SIZE (aarch64_fmv_feature_data); >>> - for (int i = 0; i < num_features; i++) >>> - { >>> - if (feature_mask & aarch64_fmv_feature_data[i].feature_mask) >>> + if (feature_mask == 0ULL) // && not implemented! >> Could you explain the // comment? > Apologies, this was a temporary marker for myself that slipped through. >>> { >>> - name += "M"; >>> - name += aarch64_fmv_feature_data[i].name; >>> + if (!DECL_INITIAL (decl)) >>> + return id; >>> + return clone_identifier (id, "default"); >> This different handling of defined vs. undefined functions feels a >> bit weird. It's not clear to me whether: >> >> (1) The .default version is effectively internal to the translation unit, >> and therefore other translation units cannot assume it exists. >> >> (2) Other translation units can assume that the .default version exists >> if they can see a target_version("default") or target_clones definition. >> >> (3) Other translation units can assume that the .default version exists >> if they can see a non-default target_version or a target_clones >> definition. >> >> (4) Something else.
Argh, sorry, I meant declaration rather than definition for both (2) and (3). My question doesn't make sense as written. >> (2) would create a difference between implicit and explicit defaults >> and doesn't seem to be what the series implements (mv-symbols6.C from >> patch 14). (3) seems indirect, and IMO it's dangerous to assume that >> the mere existence of a non-default version X in one TU implies that >> version X will be visible in the TU that contains the resolver. I would >> have expected that it would be valid for version X not to be visible >> in the TU that contains the resolver, with the consequence that the >> resolver won't use it. >> >> (1) seems more appealing on the face of it, so that .default is more >> like .resolver. But that doesn't seem to be what the spec says. > I would say its (4) Something else. > > This code is the result of that and a discussion we had where we decided we > can avoid mangling the default version symbol if it is external. As we > know that > in the TU where the default is defined, the default > version will be mangled, and the dispatched symbol will take its name. > > As the default is the only resolvable version, then all calls and > references > will already have the correct target. > > This allows us to avoid making the dispatched symbol and redirecting > default decl calls/references to the dispatched symbol. That sounds like what I meant by (1): i.e. that the only TU that can reference .default is the one that creates it (i.e. the one with the definition). A declaration with __attribute__((target_version("default")) doesn't itself imply that a .default version exists. The reason for asking is that the ACLE says: The "default" version is mangled with ".default" on top of the language-specific name mangling. All versioned functions with their mangled names are always resolvable. which made it sound like other TUs could rely on ".default" existing. > It's a bit of a hack, and we can instead always mangle the default, > create the dispatched symbol? Apologies if I misunderstood the earlier > conversation. > >>> } >>> - } >>> >>> - if (DECL_ASSEMBLER_NAME_SET_P (decl)) >>> - SET_DECL_RTL (decl, NULL); >>> + std::string suffix = "_"; >>> + >>> + int num_features = ARRAY_SIZE (aarch64_fmv_feature_data); >>> + for (int i = 0; i < num_features; i++) >>> + if (feature_mask & aarch64_fmv_feature_data[i].feature_mask) >>> + { >>> + suffix += "M"; >>> + suffix += aarch64_fmv_feature_data[i].name; >>> + } >>> + >>> + if (DECL_ASSEMBLER_NAME_SET_P (decl)) >>> + SET_DECL_RTL (decl, NULL); >> Why is it necessary to conditionally reset the DECL_RTL for the >> non-default case but not necessary when adding ".default"? >> >> Thanks, >> Richard > To be honest I don’t understand this code particularly well. This was here > previously, and I understood it to imply that the function needs to be > recompiled if it is non-default as it's architecture features will > have changed.I think it only really applies in the target_clones case > where the new > decl from expand_clones could have carried over the RTL of the default? Resetting DECL_RTL doesn't AFAIK trigger any recompilation. IIRC, the DECL_RTL for a function is usually: (mem (symbol_ref "<mangled name>")) and so the purpose of clearing it out is to force a new symbol_ref (with a new mangling) to be created. It would be interesting to see whether any failures are visible with the SET_DECL_RTL removed. If so, it would be interesting to know whether there's a particular reason why the same failure couldn't trigger for the default version. If not, then maybe the code can be removed. (I'd expect a failure, but I'm certainly not 100% sure.) Thanks, Richard