On Wed, 21 Jul 2021, liuhongt via Gcc-patches wrote:
> @@ -23254,13 +23337,15 @@ ix86_get_excess_precision (enum excess_precision_type type)
> provide would be identical were it not for the unpredictable
> cases. */
> if (!TARGET_80387)
> - return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT;
> + return TARGET_SSE2
> + ? FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16
> + : FLT_EVAL_METHOD_PROMOTE_TO_FLOAT;
> else if (!TARGET_MIX_SSE_I387)
> {
> if (!(TARGET_SSE && TARGET_SSE_MATH))
> return FLT_EVAL_METHOD_PROMOTE_TO_LONG_DOUBLE;
> else if (TARGET_SSE2)
> - return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT;
> + return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16;
> }
>
> /* If we are in standards compliant mode, but we know we will
This patch is not changing the default "fast" mode at all; that's
promoting to float, unconditionally. But you have a subsequent change
there in patch 4 to make the promotions in the default "fast" mode depend
on hardware support for the new instructions; it's unhelpful for the
documentation not to correspond exactly to the code changes in the same
patch.
Rather than using FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16 whenever TARGET_SSE2
(i.e. whenever the type is available), it might make more sense to follow
AArch64 and use it only when the hardware instructions are available. In
any case, it seems peculiar to use a different threshold in the "fast"
case from the "standard" case. -fexcess-precision=standard is not "avoid
excess precision", it's "implement excess precision in the front end".
Whenever "fast" is implementing excess precision in the front end,
"standard" should be doing the same thing as "fast".
> +Soft-fp keeps the intermediate result of the operation at 32-bit precision by defaults,
> +which may lead to inconsistent behavior between soft-fp and avx512fp16 instructions,
> +using @option{-fexcess-precision=standard} will force round back after every operation.
"soft-fp" is, as the name of some code within GCC, an internal
implementation detail, which should not be referenced in the user manual.
What results in intermediate results being in a wider precision is not
soft-fp; it's promotions inserted by the front end as a result of how the
above hook is defined (promotions inserted by the optabs/expand code are
an implementation detail that should always be followed automatically by a
truncation of the result and so not be user-visible).
As far as I know, the official name of "avx512fp16" is "AVX512-FP16" and
text in the manual should use the official capitalization, hyphenation
etc. in such names unless literally referring to command-line options
inside @option or similar.
--
Joseph S. Myers
[email protected]