Ramana Radhakrishnan wrote:
I haven't seen the patch yet but here are my thoughts on where this should be
going.
Thus in summary -
1. -mfpu=neon implies the presence of the float16x(4/8) types and all the
intrinsics that treat these values as bags of bits.
2. -mfpu=neon-fp16 implies the presence of the vcvt* intrinsics that are needed
for the float16 types.
So I think the "problems" are statements in ACLE that
(a) we should only have float16x(4/8)_t types, when we have scalar types as
well;
(b) whenever we have a scalar __fp16 type, we should have one or other of the
__ARM_FP16_FORMAT_(IEEE/ALTERNATIVE) macros defined to indicate the format
that's in use.
Sadly these seem to forbid the current situation whereby we expose hardware
conversion instructions (that work with either fp16 format, according to the
status of the FPSCR bit) and allow compiling a binary that will work with either
format :(.
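To make (b) concrete, here is a minimal sketch of how user code would tell the
formats apart using the two ACLE macros named above. The enum and function names
are hypothetical, invented purely for illustration. When neither macro is
defined (no format selected, or a non-ARM target) it reports "none", which is
exactly the "works with either format" case that ACLE has no way to describe.

```c
#include <assert.h>

/* Hypothetical names, for illustration only. */
enum compiled_fp16 {
    COMPILED_FP16_NONE,        /* neither ACLE macro is defined */
    COMPILED_FP16_IEEE,        /* __ARM_FP16_FORMAT_IEEE is defined */
    COMPILED_FP16_ALTERNATIVE  /* __ARM_FP16_FORMAT_ALTERNATIVE is defined */
};

/* Report which __fp16 format this translation unit was compiled for,
   judged only by the ACLE feature macros. */
enum compiled_fp16 compiled_fp16_format(void)
{
#if defined(__ARM_FP16_FORMAT_IEEE)
    return COMPILED_FP16_IEEE;
#elif defined(__ARM_FP16_FORMAT_ALTERNATIVE)
    return COMPILED_FP16_ALTERNATIVE;
#else
    return COMPILED_FP16_NONE;
#endif
}
```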
The situation is further complicated by three things: GCC supports the
alternative format (which ACLE does not mandate); we can support either format
in the absence of any hardware (as we have software emulation routines for
scalar conversions in either format, as long as we know which at compile time);
and object files compiled with different -mfp16-format settings cannot be
linked together (the ABI attributes conflict).
However, I think we can still go with Ramana's point 1, albeit _only_when_ a
-mfp16-format is specified. Point 2 similarly (i.e. -mfpu=neon-fp16 will not
provide any additional intrinsics unless an -mfp16-format is specified).
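As a rough model of that rule (a sketch only: every name below is hypothetical
and none of it corresponds to actual GCC internals or to the patch itself), the
gating would look something like this:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the -mfpu= and -mfp16-format= settings. */
enum fpu_kind    { FPU_VFP, FPU_NEON, FPU_NEON_FP16 };
enum fp16_format { FP16_FORMAT_NONE, FP16_FORMAT_IEEE, FP16_FORMAT_ALTERNATIVE };

/* Point 1: float16x(4/8) types and the "bag of bits" intrinsics need
   Neon plus an explicit fp16 format. */
bool have_float16_vector_types(enum fpu_kind fpu, enum fp16_format fmt)
{
    bool has_neon = (fpu == FPU_NEON || fpu == FPU_NEON_FP16);
    return has_neon && fmt != FP16_FORMAT_NONE;
}

/* Point 2: the vcvt* conversion intrinsics need -mfpu=neon-fp16 plus an
   explicit fp16 format. */
bool have_fp16_vcvt_intrinsics(enum fpu_kind fpu, enum fp16_format fmt)
{
    return fpu == FPU_NEON_FP16 && fmt != FP16_FORMAT_NONE;
}
```

So -mfpu=neon-fp16 with no -mfp16-format gives neither set, which keeps us
within what ACLE says about the format macros.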
I'll repost the patch series shortly with those changes implemented. In the
meantime: are patches 1 & 2 (ARM __builtin_arm_neon_lane_bounds and
qualifier_lane_index) OK to commit? These contain nothing float16-specific, and
would unblock Charles Baylis' work on PR63870
(https://gcc.gnu.org/ml/gcc-patches/2015-07/msg00545.html). I'll ping the
AArch64 changes separately.
Cheers, Alan