Ramana Radhakrishnan wrote:
I haven't seen the patch yet but here are my thoughts on where this should be 
going.

Thus in summary -
1. -mfpu=neon implies the presence of the float16x(4/8) types and all the 
intrinsics that treat these values as bags of bits.
2. -mfpu=neon-fp16 implies the presence of the vcvt* intrinsics that are needed 
for the float16 types.

So I think the "problems" are statements in ACLE that

(a) we should only have the float16x(4/8)_t types when we also have the scalar __fp16 type;

(b) whenever we have a scalar __fp16 type, we should have one or the other of the __ARM_FP16_FORMAT_(IEEE/ALTERNATIVE) macros defined to indicate the format that's in use.
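
To illustrate (b), user code can test these macros to find out which format, if any, the compiler has committed to. This is only a sketch: the two __ARM_FP16_FORMAT_* macros are the ACLE ones named above, while the enum and function names are invented for the example.

```c
/* Hypothetical names for illustration; only the __ARM_FP16_FORMAT_*
   macros come from ACLE.  */
enum fp16_format { FP16_NONE, FP16_IEEE, FP16_ALTERNATIVE };

static enum fp16_format
fp16_format_in_use (void)
{
#if defined (__ARM_FP16_FORMAT_IEEE)
  return FP16_IEEE;
#elif defined (__ARM_FP16_FORMAT_ALTERNATIVE)
  return FP16_ALTERNATIVE;
#else
  /* Neither macro is defined, so no scalar __fp16 format is in use.  */
  return FP16_NONE;
#endif
}
```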

Sadly these seem to forbid the current situation whereby we expose hardware conversion instructions (that work with either fp16 format, according to the status of the FPSCR bit) and allow compiling a binary that will work with either format :(.

The situation is further complicated by three things: GCC supports the alternative format (which ACLE does not mandate); we can support either format in the absence of any hardware (we have software emulation routines for scalar conversions in either format, as long as we know which one at compile time); and object files compiled with different -mfp16-format settings cannot be linked together (the ABI attributes conflict).
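
To make the difference between the two formats concrete, here is a rough sketch of decoding the 16 bits by hand. It is not the emulation code GCC actually uses, just an illustration under the assumption that the formats differ only in how the maximum exponent is treated (Inf/NaN in IEEE, ordinary normalized values in the alternative format). The function names are invented.

```c
#include <math.h>    /* Only for the INFINITY and NAN macros.  */
#include <stdint.h>

/* Multiply X by 2^E exactly, without needing a libm call.  */
static float
scale_pow2 (float x, int e)
{
  for (; e > 0; e--)
    x *= 2.0f;
  for (; e < 0; e++)
    x *= 0.5f;
  return x;
}

/* Decode half-precision bits H.  ALTERNATIVE nonzero selects the
   alternative format, zero selects IEEE.  */
static float
half_bits_to_float (uint16_t h, int alternative)
{
  int sign = (h >> 15) & 1;
  int exp = (h >> 10) & 0x1f;
  int frac = h & 0x3ff;
  float value;

  if (exp == 0)
    /* Zero or subnormal: frac * 2^-24.  */
    value = scale_pow2 ((float) frac, -24);
  else if (exp == 31 && !alternative)
    /* IEEE reserves the top exponent for Inf and NaN.  */
    value = frac ? NAN : INFINITY;
  else
    /* Normal number: (1024 + frac) * 2^(exp - 25).  */
    value = scale_pow2 ((float) (frac | 0x400), exp - 25);

  return sign ? -value : value;
}
```

With these definitions the bit pattern 0x7C00 decodes to +Inf in IEEE mode but to 65536.0 in the alternative format, which is why the format has to be known at compile time.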

However, I think we can still go with Ramana's point 1, albeit _only_when_ a -mfp16-format is specified, and similarly with point 2 (i.e. -mfpu=neon-fp16 will not provide any additional intrinsics unless an -mfp16-format is specified).

I'll repost the patch series shortly with those changes implemented. In the meantime: are patches 1 & 2 (ARM __builtin_arm_neon_lane_bounds and qualifier_lane_index) OK to commit? These contain nothing float16-specific, and would unblock Charles Baylis' work on PR63870 (https://gcc.gnu.org/ml/gcc-patches/2015-07/msg00545.html). I'll ping the AArch64 changes separately.

Cheers, Alan
