Andrew Carlotti <[email protected]> writes:
> On Sat, Dec 09, 2023 at 07:22:49PM +0000, Richard Sandiford wrote:
>> Andrew Carlotti <[email protected]> writes:
>> > ...
>>
>> This is the only use of native_detect_p, so it'd be good to remove
>> the field itself.
>
> Done
>
>> > ...
>> >
>> > @@ -447,6 +451,13 @@ host_detect_local_cpu (int argc, const char **argv)
>> > if (tune)
>> > return res;
>> >
>> > + if (!processed_exts)
>> > + goto not_found;
>>
>> Could you explain this part? It seems like more of a parsing change
>> (i.e. being more strict about what we accept).
>>
>> If that's the intention, it probably belongs in:
>>
>> if (n_cores == 0
>> || n_cores > 2
>> || (n_cores == 1 && n_variants != 1)
>> || imp == INVALID_IMP)
>> goto not_found;
>>
>> But maybe it should be a separate patch.
>
> I added it because I realised that the parsing behaviour didn't make sense in
> that case, and my patch happens to change the behaviour as well (the outcome
> without the check would be no enabled features, whereas previously it would
> enable only the features with no native detection).
Ah, OK, thanks for the explanation.
> I agree that it makes sense to put it with the original check, so I've made
> that change.
>
>> Looks good otherwise, thanks.
>>
>> Richard
>
> New patch version below, ok for master?
>
> ---
>
> For native cpu feature detection, certain features have no entry in
> /proc/cpuinfo, so have to be assumed to be present whenever the detected
> cpu is supposed to support that feature.
>
> However, the logic for this was mistakenly implemented by excluding
> these features from part of aarch64_get_extension_string_for_isa_flags.
> This function is also used elsewhere when canonicalising explicit
> feature sets, which may require removing features that are normally
> implied by the specified architecture version.
>
> This change reenables generation of +nopredres, +nols64 and +nomops
> during canonicalisation, by relocating the misplaced native cpu
> detection logic.
>
> gcc/ChangeLog:
>
> * common/config/aarch64/aarch64-common.cc
> (struct aarch64_option_extension): Remove unused field.
> (all_extensions): Ditto.
> (aarch64_get_extension_string_for_isa_flags): Remove filtering
> of features without native detection.
> * config/aarch64/driver-aarch64.cc (host_detect_local_cpu):
> Explicitly add expected features that lack cpuinfo detection.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/aarch64/options_set_28.c: New test.
OK, thanks.
Richard
> diff --git a/gcc/common/config/aarch64/aarch64-common.cc
> b/gcc/common/config/aarch64/aarch64-common.cc
> index
> c2a6d357c0bc17996a25ea5c3a40f69d745c7931..4d0431d3a2cad5414790646bce0c09877c0366b2
> 100644
> --- a/gcc/common/config/aarch64/aarch64-common.cc
> +++ b/gcc/common/config/aarch64/aarch64-common.cc
> @@ -149,9 +149,6 @@ struct aarch64_option_extension
> aarch64_feature_flags flags_on;
> /* If this feature is turned off, these bits also need to be turned off.
> */
> aarch64_feature_flags flags_off;
> - /* Indicates whether this feature is taken into account during native cpu
> - detection. */
> - bool native_detect_p;
> };
>
> /* ISA extensions in AArch64. */
> @@ -159,10 +156,9 @@ static constexpr aarch64_option_extension
> all_extensions[] =
> {
> #define AARCH64_OPT_EXTENSION(NAME, IDENT, C, D, E, FEATURE_STRING) \
> {NAME, AARCH64_FL_##IDENT, feature_deps::IDENT ().explicit_on, \
> - feature_deps::get_flags_off (feature_deps::root_off_##IDENT), \
> - FEATURE_STRING[0]},
> + feature_deps::get_flags_off (feature_deps::root_off_##IDENT)},
> #include "config/aarch64/aarch64-option-extensions.def"
> - {NULL, 0, 0, 0, false}
> + {NULL, 0, 0, 0}
> };
>
> struct processor_name_to_arch
> @@ -358,8 +354,7 @@ aarch64_get_extension_string_for_isa_flags
> /* If either crypto flag needs removing here, then both do. */
> flags = flags_crypto;
>
> - if (opt.native_detect_p
> - && (flags & current_flags & ~isa_flags))
> + if (flags & current_flags & ~isa_flags)
> {
> current_flags &= ~opt.flags_off;
> outstr += "+no";
> diff --git a/gcc/config/aarch64/driver-aarch64.cc
> b/gcc/config/aarch64/driver-aarch64.cc
> index
> 8e318892b10aa2288421fad418844744a2f5a3b4..c18f065aa41e7328d71b45a53c82a3b703ae44d5
> 100644
> --- a/gcc/config/aarch64/driver-aarch64.cc
> +++ b/gcc/config/aarch64/driver-aarch64.cc
> @@ -262,6 +262,7 @@ host_detect_local_cpu (int argc, const char **argv)
> unsigned int n_variants = 0;
> bool processed_exts = false;
> aarch64_feature_flags extension_flags = 0;
> + aarch64_feature_flags unchecked_extension_flags = 0;
> aarch64_feature_flags default_flags = 0;
> std::string buf;
> size_t sep_pos = -1;
> @@ -348,7 +349,10 @@ host_detect_local_cpu (int argc, const char **argv)
> /* If the feature contains no HWCAPS string then ignore it for the
> auto detection. */
> if (val.empty ())
> - continue;
> + {
> + unchecked_extension_flags |= aarch64_extensions[i].flag;
> + continue;
> + }
>
> bool enabled = true;
>
> @@ -380,7 +384,8 @@ host_detect_local_cpu (int argc, const char **argv)
> if (n_cores == 0
> || n_cores > 2
> || (n_cores == 1 && n_variants != 1)
> - || imp == INVALID_IMP)
> + || imp == INVALID_IMP
> + || !processed_exts)
> goto not_found;
>
> /* Simple case, one core type or just looking for the arch. */
> @@ -447,6 +452,10 @@ host_detect_local_cpu (int argc, const char **argv)
> if (tune)
> return res;
>
> + /* Add any features that should be be present, but can't be verified using
> + the /proc/cpuinfo "Features" list. */
> + extension_flags |= unchecked_extension_flags & default_flags;
> +
> {
> std::string extension
> = aarch64_get_extension_string_for_isa_flags (extension_flags,
> diff --git a/gcc/testsuite/gcc.target/aarch64/options_set_28.c
> b/gcc/testsuite/gcc.target/aarch64/options_set_28.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..9e63768581e9d429e9408863942051b1b04761ac
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/options_set_28.c
> @@ -0,0 +1,9 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-march=armv9.3-a+nopredres+nols64+nomops" } */
> +
> +int main ()
> +{
> + return 0;
> +}
> +
> +/* { dg-final { scan-assembler-times {\.arch
> armv9\.3\-a\+crc\+nopredres\+nols64\+nomops\n} 1 } } */