Julian Brown <jul...@codesourcery.com> wrote on 05/11/2010 12:58:14 PM:
> I think it's probably fine to default to 128-bit vectors, and fall back
> to 64-bits when necessary (where access patterns block usage of wider
> vectors, or similar). AIUI, ARM were quite keen to get rid of
> -mvectorize-with-neon-quad altogether, so I'm not sure it makes sense
> to add a new -double option also: particularly since with
> widening/narrowing operations, both vector sizes are generally needed
> simultaneously.

Right, mixed vector sizes make it irrelevant.

>
> > The best solution would be to evaluate costs for both size options.
> > And it is a reasonable amount of work to do that. But the unknown
> > loop bound case will require versioning between two vector options in
> > addition to possible versioning between vector/scalar loops.
> >
> > I don't know if we can make a decision without tuning, especially
> > since
> >
> > > 1. NEON hardware available at the time (Cortex-A8) only processed
> > > data in 64-bit chunks, so Q-reg operations weren't necessarily any
> > > faster than D-reg operations (that may still be true).
> >
> > This is why I thought that starting from the option to switch to 64
> > if 128 fails (with -mvectorize-with-neon-quad flag) is the least
> > intrusive.
>
> I'm not sure. The best option may well depend on the particular core
> (A8 vs A9 vs A15), and users will generally want to have the right
> option (whatever that turns out to be) as the default, without having
> to grub around in the documentation.
>
> (Maybe if we make -mvectorize-with-neon-quad "wired-on" but otherwise a
> no-op,

Since TARGET_NEON_VECTORIZE_QUAD is only used in arm_preferred_simd_mode
and arm_autovectorize_vector_sizes, we can simply remove it, making 128
the default. (I am not sure I fully understand "'wired-on' but otherwise
a no-op"...).
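
To illustrate the fallback order discussed here (128-bit first, then
64-bit), a minimal standalone sketch follows. It assumes the vector-sizes
hook returns a bitmask of vector sizes in bytes, such as 16 | 8, and that
the widest size is tried first. The helpers largest_vector_size and
vector_size_order are hypothetical names, not GCC functions, and this is
not the vectorizer's actual code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of GCC): return the largest single bit
   set in MASK, i.e. the widest vector size in bytes, or 0 if MASK is
   empty.  */
static unsigned int
largest_vector_size (unsigned int mask)
{
  unsigned int size = 0;

  /* Walk the bits from low to high; the last one found is the largest.
     The loop ends when BIT is shifted out of the unsigned range.  */
  for (unsigned int bit = 1; bit != 0; bit <<= 1)
    if (mask & bit)
      size = bit;

  return size;
}

/* Hypothetical helper (not part of GCC): store in ORDER the sizes found
   in MASK, widest first, writing at most MAX entries.  Returns the
   number of entries written.  */
static size_t
vector_size_order (unsigned int mask, unsigned int *order, size_t max)
{
  size_t n = 0;

  assert (order != NULL);

  while (mask != 0 && n < max)
    {
      unsigned int size = largest_vector_size (mask);

      order[n++] = size;
      mask &= ~size;	/* Drop this size and fall back to the next one.  */
    }

  return n;
}
```

With a mask of 16 | 8, vector_size_order yields 16 followed by 8, that
is, Q registers first and D registers as the fallback.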

Index: config/arm/arm.c
===================================================================
--- config/arm/arm.c    (revision 166032)
+++ config/arm/arm.c    (working copy)
@@ -246,6 +246,7 @@ static bool arm_builtin_support_vector_misalignmen
                                                      const_tree type,
                                                      int misalignment,
                                                      bool is_packed);
+static unsigned int arm_autovectorize_vector_sizes (void);
 
 /* Table of machine attributes.  */
 
@@ -391,6 +392,9 @@ static const struct default_options arm_option_opt
 #define TARGET_VECTOR_MODE_SUPPORTED_P arm_vector_mode_supported_p
 #undef TARGET_VECTORIZE_PREFERRED_SIMD_MODE
 #define TARGET_VECTORIZE_PREFERRED_SIMD_MODE arm_preferred_simd_mode
+#undef TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_SIZES
+#define TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_SIZES \
+  arm_autovectorize_vector_sizes
 
 #undef TARGET_MACHINE_DEPENDENT_REORG
 #define TARGET_MACHINE_DEPENDENT_REORG arm_reorg
@@ -22025,15 +22029,14 @@ arm_preferred_simd_mode (enum machine_mode mode)
     switch (mode)
       {
       case SFmode:
-        return TARGET_NEON_VECTORIZE_QUAD ? V4SFmode : V2SFmode;
+        return V4SFmode;
       case SImode:
-        return TARGET_NEON_VECTORIZE_QUAD ? V4SImode : V2SImode;
+        return V4SImode;
       case HImode:
-        return TARGET_NEON_VECTORIZE_QUAD ? V8HImode : V4HImode;
+        return V8HImode;
       case QImode:
-        return TARGET_NEON_VECTORIZE_QUAD ? V16QImode : V8QImode;
+        return V16QImode;
       case DImode:
-        if (TARGET_NEON_VECTORIZE_QUAD)
           return V2DImode;
         break;
 
@@ -23223,6 +23226,12 @@ arm_expand_sync (enum machine_mode mode,
     }
 }
 
+static unsigned int
+arm_autovectorize_vector_sizes (void)
+{
+  return 16 | 8;
+}
+

> we could add e.g. a --param to say "prefer 64-bit vectors" or
> "prefer 128-bit vectors" (falling back to 64-bit as necessary), for
> benchmarking purposes and/or intrepid users.)

ARM specific param?

Thanks,
Ira

>
> CC'ing Richard E., in case he has any input.
>
> Cheers,
>
> Julian

_______________________________________________
linaro-toolchain mailing list
linaro-toolchain@lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-toolchain