Re: [PATCH 3/16][ARM] Add float16x4_t intrinsics

Alan Lawrence Mon, 27 Jul 2015 05:45:40 -0700

Ramana Radhakrishnan wrote:

I haven't seen the patch yet but here are my thoughts on where this should be 
going.

Thus in summary -

1. -mfpu=neon implies the presence of the float16x(4/8) types and all the 
intrinsics that treat these values as bags of bits.
2. -mfpu=neon-fp16 implies the presence of the vcvt* intrinsics that are needed 
for the float16 types.


So I think the "problems" are statements in ACLE that

(a) we should only have float16x(4/8)_t types, when we have scalar types as 
well;

(b) whenever we have a scalar __fp16 type, we should have one or other of the
__ARM_FP16_FORMAT_(IEEE/ALTERNATIVE) macros defined to indicate the format
that's in use.

Sadly these seem to forbid the current situation whereby we expose hardware
conversion instructions (that work with either fp16 format, according to the
status of the FPSCR bit) and allow compiling a binary that will work with either
format :(.

The situation is further complicated by GCC's support for the alternative format
(not mandated by ACLE), that we can support either format in the absence of any
hardware (as we have software emulation routines for scalar conversions in
either format, as long as we know which at compile time), and that object files
compiled with different -mfp16-format cannot be linked together (the ABI
attributes conflict).

However, I think we can still go with Ramana's point 1. albeit _only_when_ a
-mfp16-format is specified. _2_ similarly (i.e. -mfpu=neon-fp16 will not provide
any additional intrinsics unless an -mfp16-format is specified).
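
To make that rule concrete, here is a rough sketch of the decision as a small
C function. This is only an illustration, not code from the patch series: the
enum, struct and function names are all hypothetical, and it assumes that
-mfpu=neon-fp16 provides everything -mfpu=neon does.

```c
#include <stdbool.h>

/* Hypothetical model of the relevant command-line state. */
enum fpu_kind { FPU_OTHER, FPU_NEON, FPU_NEON_FP16 };
enum fp16_format { FP16_FORMAT_NONE, FP16_FORMAT_IEEE, FP16_FORMAT_ALTERNATIVE };

struct fp16_features {
    bool vector_types;     /* float16x(4/8)_t and the "bag of bits" intrinsics */
    bool vcvt_intrinsics;  /* the vcvt* conversion intrinsics */
};

/* Which float16 features would be exposed under the rule described above. */
static struct fp16_features
fp16_features_for(enum fpu_kind fpu, enum fp16_format fmt)
{
    struct fp16_features f = { false, false };

    /* Without an -mfp16-format, nothing float16-related is provided. */
    if (fmt == FP16_FORMAT_NONE)
        return f;

    /* Point 1: any NEON fpu gives the vector types. */
    f.vector_types = (fpu == FPU_NEON || fpu == FPU_NEON_FP16);

    /* Point 2: only neon-fp16 adds the conversion intrinsics. */
    f.vcvt_intrinsics = (fpu == FPU_NEON_FP16);

    return f;
}
```

So, for instance, -mfpu=neon-fp16 with no -mfp16-format would give neither the
types nor the conversions, while -mfpu=neon with -mfp16-format=ieee would give
the types only.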

I'll repost the patch series shortly with those changes implemented. In the
meantime: are patches 1 & 2 (ARM __builtin_arm_neon_lane_bounds and
qualifier_lane_index) OK to commit? These contain nothing float16-specific, and
would unblock Charles Baylis' work on PR63870
(https://gcc.gnu.org/ml/gcc-patches/2015-07/msg00545.html). I'll ping the
AArch64 changes separately.


Cheers, Alan
