zahiraam added inline comments.
================
Comment at: clang/lib/CodeGen/CGExprScalar.cpp:1315
+ if ((SrcType->isHalfType() || iSFloat16Allowed) &&
+ !CGF.getContext().getLangOpts().NativeHalfType) {
// Cast to FP using the intrinsic if the half type itself isn't supported.
----------------
pengfei wrote:
> rjmccall wrote:
> > pengfei wrote:
> > > rjmccall wrote:
> > > > pengfei wrote:
> > > > > rjmccall wrote:
> > > > > > Okay, this condition is pretty ridiculous to be repeating in three
> > > > > > different places across the compiler. Especially since you're
> > > > > > going to change it when you implement the new option, right?
> > > > > >
> > > > > > Can we state this condition more generally? I'm not sure why this
> > > > > > is so narrowly restricted, and the variable name isn't telling me
> > > > > > anything, since `_Float16` must by definition be "allowed" if we
> > > > > > have an expression of `_Float16` type.
> > > > > > since _Float16 must by definition be "allowed" if we have an
> > > > > > expression of _Float16 type.
> > > > >
> > > > > _Float16 is allowed only on a few targets.
> > > > > https://clang.llvm.org/docs/LanguageExtensions.html#half-precision-floating-point
> > > > > By the way, we should update for X86 since it's not limited to
> > > > > avx512fp16 now.
> > > > > _Float16 is allowed only on a few targets.
> > > >
> > > > Yes, I know that. But if `SrcType->isFloat16Type()` is true, we must
> > > > be on one of those targets, because the type doesn't otherwise exist.
> > > I see your point now. The problem here is we want to allow the `_Float16`
> > > to be used more broadly. But the target doesn't really support it
> > > sometime. Currently full arithmatic operations are supported only on
> > > target with AVX512FP16.
> > > We should cast for those targets without AVX512FP16 while avoid to do on
> > > AVX512FP16.
> > I agree that many targets don't natively support arithmetic on this format,
> > but x86 is not the first one that does. Unless I'm misunderstanding, we
> > already track this property in Clang's TargetInfo as `hasLegalHalfType()`.
> > `+avx512fp16` presumably ought to set this.
> >
> > I'm not sure what the interaction with the `NativeHalfType` LangOpt is
> > supposed to be here. My understanding is that that option is just supposed
> > to affect `__fp16`, basically turning it into a proper arithmetic type,
> > i.e. essentially `_Float16`. Whatever effect you want to apply to
> > `_Float16` should presumably happen even if that option not set.
> >
> > More broadly, I don't think your approach in this patch is correct. The
> > type of operations on `_Float16` should not change based on whether the
> > target natively supports `_Float16`. If we need to emulate those
> > operations on targets that don't provide them natively, we should do that
> > at a lower level than the type system.
> >
> > The most appropriate place to do that is going to depend on the exact
> > semantics we want.
> >
> > If we want to preserve `half` semantics exactly regardless of target, we
> > should have Clang's IR generation actually emit `half` operations. Targets
> > that don't support those operations natively will have to lower at least
> > some of those operations into compiler-rt calls, but that's not at all
> > unprecedented.
> >
> > If we're okay with playing loose for performance reasons, we can promote to
> > `float` immediately around individual arithmetic operations. IR generation
> > is probably the most appropriate place to do that. But I'm quite concerned
> > about that making `_Float16` feel like an unpredictable/unportable type; it
> > seems to me that software emulation is much better.
> >
> > If you're proposing the latter, I think you need to raise that more widely
> > than a code review; please make a post on llvm-dev.
> > we already track this property in Clang's TargetInfo as `hasLegalHalfType()`
>
> That sounds a good approch. Thank you.
>
> > The type of operations on `_Float16` should not change based on whether the
> > target natively supports `_Float16`. If we need to emulate those operations
> > on targets that don't provide them natively, we should do that at a lower
> > level than the type system.
>
> Unfortunately, we can't do it at low level. The reason is (I'm not expert in
> frontend, just recalled from last disscussion with GCC folks) we have to do
> expresssion emulation to respect C/C++ semantics. GCC has option
> `-fexcess-precision=16` to match the same result with native instructions,
> but the default is `-fexcess-precision=fast` according to language semantics.
>
> > The most appropriate place to do that is going to depend on the exact
> > semantics we want...
>
> Note, we are not simply doing emulation in the frontend. It's backend's
> responsibility to emulate a single `half` operation. But it's frontend's
> responsibility to choose whether to emit several `half` operations or emit
> promote + several `float` operations + truncate. As described in the title,
> this patch is doing for the latter.
> Okay, this condition is pretty ridiculous to be repeating in three different
> places across the compiler. Especially since you're going to change it when
> you implement the new option, right?
>
> Can we state this condition more generally? I'm not sure why this is so
> narrowly restricted, and the variable name isn't telling me anything, since
> `_Float16` must by definition be "allowed" if we have an expression of
> `_Float16` type.
I agree :) I wasn't aware of this flag. I am still in the process of figuring
out how float16 work and what is exactly required for it.
Yes may be having 2 separate patches is the right way to go. One that turns
Float16 on for x86 without the use of +avx512fp16 feature and another one that
does the emulation ( half to float -> float arithmetic-> truncation to half).
@pengfei ?
================
Comment at: clang/lib/CodeGen/CGExprScalar.cpp:1315
+ if ((SrcType->isHalfType() || iSFloat16Allowed) &&
+ !CGF.getContext().getLangOpts().NativeHalfType) {
// Cast to FP using the intrinsic if the half type itself isn't supported.
----------------
zahiraam wrote:
> pengfei wrote:
> > rjmccall wrote:
> > > pengfei wrote:
> > > > rjmccall wrote:
> > > > > pengfei wrote:
> > > > > > rjmccall wrote:
> > > > > > > Okay, this condition is pretty ridiculous to be repeating in
> > > > > > > three different places across the compiler. Especially since
> > > > > > > you're going to change it when you implement the new option,
> > > > > > > right?
> > > > > > >
> > > > > > > Can we state this condition more generally? I'm not sure why
> > > > > > > this is so narrowly restricted, and the variable name isn't
> > > > > > > telling me anything, since `_Float16` must by definition be
> > > > > > > "allowed" if we have an expression of `_Float16` type.
> > > > > > > since _Float16 must by definition be "allowed" if we have an
> > > > > > > expression of _Float16 type.
> > > > > >
> > > > > > _Float16 is allowed only on a few targets.
> > > > > > https://clang.llvm.org/docs/LanguageExtensions.html#half-precision-floating-point
> > > > > > By the way, we should update for X86 since it's not limited to
> > > > > > avx512fp16 now.
> > > > > > _Float16 is allowed only on a few targets.
> > > > >
> > > > > Yes, I know that. But if `SrcType->isFloat16Type()` is true, we must
> > > > > be on one of those targets, because the type doesn't otherwise exist.
> > > > I see your point now. The problem here is we want to allow the
> > > > `_Float16` to be used more broadly. But the target doesn't really
> > > > support it sometime. Currently full arithmatic operations are supported
> > > > only on target with AVX512FP16.
> > > > We should cast for those targets without AVX512FP16 while avoid to do
> > > > on AVX512FP16.
> > > I agree that many targets don't natively support arithmetic on this
> > > format, but x86 is not the first one that does. Unless I'm
> > > misunderstanding, we already track this property in Clang's TargetInfo as
> > > `hasLegalHalfType()`. `+avx512fp16` presumably ought to set this.
> > >
> > > I'm not sure what the interaction with the `NativeHalfType` LangOpt is
> > > supposed to be here. My understanding is that that option is just
> > > supposed to affect `__fp16`, basically turning it into a proper
> > > arithmetic type, i.e. essentially `_Float16`. Whatever effect you want
> > > to apply to `_Float16` should presumably happen even if that option not
> > > set.
> > >
> > > More broadly, I don't think your approach in this patch is correct. The
> > > type of operations on `_Float16` should not change based on whether the
> > > target natively supports `_Float16`. If we need to emulate those
> > > operations on targets that don't provide them natively, we should do that
> > > at a lower level than the type system.
> > >
> > > The most appropriate place to do that is going to depend on the exact
> > > semantics we want.
> > >
> > > If we want to preserve `half` semantics exactly regardless of target, we
> > > should have Clang's IR generation actually emit `half` operations.
> > > Targets that don't support those operations natively will have to lower
> > > at least some of those operations into compiler-rt calls, but that's not
> > > at all unprecedented.
> > >
> > > If we're okay with playing loose for performance reasons, we can promote
> > > to `float` immediately around individual arithmetic operations. IR
> > > generation is probably the most appropriate place to do that. But I'm
> > > quite concerned about that making `_Float16` feel like an
> > > unpredictable/unportable type; it seems to me that software emulation is
> > > much better.
> > >
> > > If you're proposing the latter, I think you need to raise that more
> > > widely than a code review; please make a post on llvm-dev.
> > > we already track this property in Clang's TargetInfo as
> > > `hasLegalHalfType()`
> >
> > That sounds a good approch. Thank you.
> >
> > > The type of operations on `_Float16` should not change based on whether
> > > the target natively supports `_Float16`. If we need to emulate those
> > > operations on targets that don't provide them natively, we should do that
> > > at a lower level than the type system.
> >
> > Unfortunately, we can't do it at low level. The reason is (I'm not expert
> > in frontend, just recalled from last disscussion with GCC folks) we have to
> > do expresssion emulation to respect C/C++ semantics. GCC has option
> > `-fexcess-precision=16` to match the same result with native instructions,
> > but the default is `-fexcess-precision=fast` according to language
> > semantics.
> >
> > > The most appropriate place to do that is going to depend on the exact
> > > semantics we want...
> >
> > Note, we are not simply doing emulation in the frontend. It's backend's
> > responsibility to emulate a single `half` operation. But it's frontend's
> > responsibility to choose whether to emit several `half` operations or emit
> > promote + several `float` operations + truncate. As described in the title,
> > this patch is doing for the latter.
> > Okay, this condition is pretty ridiculous to be repeating in three
> > different places across the compiler. Especially since you're going to
> > change it when you implement the new option, right?
> >
> > Can we state this condition more generally? I'm not sure why this is so
> > narrowly restricted, and the variable name isn't telling me anything, since
> > `_Float16` must by definition be "allowed" if we have an expression of
> > `_Float16` type.
>
> I agree :) I wasn't aware of this flag. I am still in the process of
> figuring out how float16 work and what is exactly required for it.
> Yes may be having 2 separate patches is the right way to go. One that turns
> Float16 on for x86 without the use of +avx512fp16 feature and another one
> that does the emulation ( half to float -> float arithmetic-> truncation to
> half). @pengfei ?
> > we already track this property in Clang's TargetInfo as `hasLegalHalfType()`
>
> That sounds a good approch. Thank you.
>
> > The type of operations on `_Float16` should not change based on whether the
> > target natively supports `_Float16`. If we need to emulate those operations
> > on targets that don't provide them natively, we should do that at a lower
> > level than the type system.
>
> Unfortunately, we can't do it at low level. The reason is (I'm not expert in
> frontend, just recalled from last disscussion with GCC folks) we have to do
> expresssion emulation to respect C/C++ semantics. GCC has option
> `-fexcess-precision=16` to match the same result with native instructions,
> but the default is `-fexcess-precision=fast` according to language semantics.
>
> > The most appropriate place to do that is going to depend on the exact
> > semantics we want...
>
> Note, we are not simply doing emulation in the frontend. It's backend's
> responsibility to emulate a single `half` operation. But it's frontend's
> responsibility to choose whether to emit several `half` operations or emit
> promote + several `float` operations + truncate. As described in the title,
> this patch is doing for the latter.
The _Float16 type is supported for both C and C++, on x86 systems with SSE2
enabled.
From https://gcc.gnu.org/onlinedocs/gcc/Half-Precision.html:
"On x86 targets with SSE2 enabled, **without -mavx512fp16, all operations will
be emulated by software emulation** and the float instructions. The default
behavior for FLT_EVAL_METHOD is to keep the intermediate result of the
operation as 32-bit precision. This may lead to inconsistent behavior between
software emulation and AVX512-FP16 instructions. Using -fexcess-precision=16
will force round back after each operation.
Using -mavx512fp16 will generate AVX512-FP16 instructions instead of software
emulation. The default behavior of FLT_EVAL_METHOD is to round after each
operation. The same is true with -fexcess-precision=standard and -mfpmath=sse.
If there is no -mfpmath=sse, -fexcess-precision=standard alone does the same
thing as before, It is useful for code that does not have _Float16 and runs on
the x87 FPU."
This is what we are trying to reach out with this patch.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D113107/new/
https://reviews.llvm.org/D113107
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits