From: Matthew Malcomson <mmalcom...@nvidia.com>

I'm not sure who to Cc for this, so I'm honestly guessing a bit here.
Please do redirect me if anyone knows of a better set of people to ask.
-------------- >8 ------- 8< -----------
This commit only defines the new names; it does not implement them yet.
I am keeping it as a separate commit because it captures one decision, and
records what that decision was and why:

Add new floating point builtins for each floating point type that is
defined in the general code *except* f128x (which would have a size greater
than 16 bytes -- the largest integral atomic operation we currently support).

We have to base our naming on floating point *types* rather than sizes,
since different types can have the same size and the operations need to be
distinguished based on type.  N.b. one could make size-suffixed builtins
that are still overloaded based on types, but I thought that this was the
cleaner approach.  (The actual requirement is distinction based on mode; this
is how I choose which internal function to use in a later patch.  I believe
that defining the function in terms of types and internally mapping to modes
is a sensible split between user interface and internal implementation.)

I have checked with the clang developers that they're happy with those names.
https://discourse.llvm.org/t/atomic-floating-point-operations-and-libstdc/81461

N.b. in order to choose whether these operations are available or not in
something like libstdc++ we use SFINAE on the type.  This is already
available in clang; the link below has the patch where I add this ability
to GCC:
https://gcc.gnu.org/pipermail/gcc-patches/2024-October/664999.html

N.b. I used the type suffixes below (following what seems like the existing
convention for builtins):
  - float       -> f
  - double      -> <no suffix>
  - long double -> l
  - _FloatN     -> fN  (for N <- (16, 32, 64, 128))
  - _FloatNx    -> fNx (for N <- (32, 64))
In the macros added to sync-builtins.def each suffix is appended after
`_fp` (e.g. `__atomic_fetch_add_fpf` for float), and the bfloat16 type
uses `f16b`.

Richi suggested on IRC doing this expansion generally for all these
builtins following Cxy _Atomic semantics.
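As an illustration of the naming scheme, here is a minimal sketch of how the user-visible names fall out of the nested macros in sync-builtins.def. It assumes a simplified, hypothetical DEF_SYNC_BUILTIN that keeps only the name string (the real macro also takes an enum, a function type and attributes), and the table and helper names are made up for this example.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for GCC's DEF_SYNC_BUILTIN: it keeps only the
   user-visible name and drops the enum, type and attribute arguments.  */
#define DEF_SYNC_BUILTIN(NAME) NAME,

/* Same shape as DEF_SYNC_FLOATN_NX_BUILTINS in the patch: one entry per
   _FloatN/_FloatNx type, relying on string literal concatenation.  */
#define DEF_SYNC_FLOATN_NX_BUILTINS(NAME) \
  DEF_SYNC_BUILTIN (NAME "f16")  \
  DEF_SYNC_BUILTIN (NAME "f32")  \
  DEF_SYNC_BUILTIN (NAME "f64")  \
  DEF_SYNC_BUILTIN (NAME "f128") \
  DEF_SYNC_BUILTIN (NAME "f32x") \
  DEF_SYNC_BUILTIN (NAME "f64x")

/* Same shape as DEF_SYNC_FLOAT_BUILTINS in the patch.  */
#define DEF_SYNC_FLOAT_BUILTINS(NAME) \
  DEF_SYNC_BUILTIN (NAME "_fpf")    \
  DEF_SYNC_BUILTIN (NAME "_fp")     \
  DEF_SYNC_BUILTIN (NAME "_fpl")    \
  DEF_SYNC_BUILTIN (NAME "_fpf16b") \
  DEF_SYNC_FLOATN_NX_BUILTINS (NAME "_fp")

/* Illustrative table only; GCC does not keep the names in such an array.  */
static const char *const fp_fetch_add_names[] = {
  DEF_SYNC_FLOAT_BUILTINS ("__atomic_fetch_add")
};

enum {
  N_FP_FETCH_ADD_NAMES
    = sizeof fp_fetch_add_names / sizeof fp_fetch_add_names[0]
};

/* float, double, long double, bfloat16, four _FloatN and two _FloatNx.  */
static_assert (N_FP_FETCH_ADD_NAMES == 10, "unexpected number of names");

/* Return nonzero if NAME is one of the generated fetch_add names.  */
static int
is_fp_fetch_add_name (const char *name)
{
  for (int i = 0; i < N_FP_FETCH_ADD_NAMES; i++)
    if (strcmp (fp_fetch_add_names[i], name) == 0)
      return 1;
  return 0;
}
```

The same expansion applies to the add_fetch, sub_fetch and fetch_sub families, which differ only in the base name passed to the outer macro.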
Since C hasn't specified any fetch_<op> semantics for floating point types,
C++ has only specified `atomic<>::fetch_{add,sub}`, and the operations other
than these are all bitwise operations (which do not map well to floating
point), I believe I have followed that suggestion by implementing all
fetch_{sub,add}/{add,sub}_fetch operations.

I have not implemented anything for the __sync_* builtins, on the belief
that these are legacy and new code should use the __atomic_* builtins.
Happy to adjust if that is a bad choice.

Only the new function types were needed for most cases.  The Fortran
frontend does not use `builtin-types.def` but does include
`sync-builtins.def`.  Since the new definitions in `sync-builtins.def` use
new enums describing their type, the Fortran `types.def` file needed to be
updated to avoid a build error.

Some of the floating point types that these new functions use may not be
available depending on the target.  `builtin-types.def` defines the
associated type enum to `error_mark_node` when unavailable.  This cannot be
done with `types.def`, since the Fortran frontend has no handling of
`error_mark_node` being the type defined with these macros.  Similarly, the
Fortran frontend cannot handle functions defined in `sync-builtins.def`
with `error_mark_node` as a type (while other frontends can).

The new functions will not automatically be exposed in Fortran by simply
defining them.  If/when they are exposed, that will have to be done with
knowledge of the floating point semantics of Fortran, in order to correctly
handle floating point exceptions when these builtins are expanded as a CAS
loop.  I.e. the current definitions in `sync-builtins.def` are essentially
dead code from the gfortran user's perspective.  Since there is no
functionality to maintain here, I have introduced a new macro
DEF_DUMMY_FUNCTION_TYPE in the Fortran types.def which defines a type to
match BT_FN_VOID_INT.
That is used as the type of each of the new specialist floating point atomic
functions.

gcc/ChangeLog:

	* builtin-types.def (BT_FN_FLOAT_VPTR_FLOAT_INT): New type.
	(BT_FN_DOUBLE_VPTR_DOUBLE_INT): New type.
	(BT_FN_LONGDOUBLE_VPTR_LONGDOUBLE_INT): New type.
	(BT_FN_BFLOAT16_VPTR_BFLOAT16_INT): New type.
	(BT_FN_FLOAT16_VPTR_FLOAT16_INT): New type.
	(BT_FN_FLOAT32_VPTR_FLOAT32_INT): New type.
	(BT_FN_FLOAT64_VPTR_FLOAT64_INT): New type.
	(BT_FN_FLOAT128_VPTR_FLOAT128_INT): New type.
	(BT_FN_FLOAT32X_VPTR_FLOAT32X_INT): New type.
	(BT_FN_FLOAT64X_VPTR_FLOAT64X_INT): New type.
	* sync-builtins.def (DEF_SYNC_FLOATN_NX_BUILTINS): New.
	(DEF_SYNC_FLOAT_BUILTINS): New.
	(ADD_FETCH_TYPE): New.
	(BUILT_IN_ATOMIC_ADD_FETCH): New.
	(SUB_FETCH_TYPE): New.
	(BUILT_IN_ATOMIC_SUB_FETCH): New.
	(FETCH_ADD_TYPE): New.
	(BUILT_IN_ATOMIC_FETCH_ADD): New.
	(FETCH_SUB_TYPE): New.
	(BUILT_IN_ATOMIC_FETCH_SUB): New.

gcc/fortran/ChangeLog:

	* f95-lang.cc (DEF_DUMMY_FUNCTION_TYPE): New macro.
	* types.def (BT_FLOAT): New dummy type.
	(BT_DOUBLE): New dummy type.
	(BT_LONGDOUBLE): New dummy type.
	(BT_FN_FLOAT_VPTR_FLOAT_INT): New dummy type.
	(BT_FN_DOUBLE_VPTR_DOUBLE_INT): New dummy type.
	(BT_FN_LONGDOUBLE_VPTR_LONGDOUBLE_INT): New dummy type.
	(BT_FN_BFLOAT16_VPTR_BFLOAT16_INT): New dummy type.
	(BT_FN_FLOAT16_VPTR_FLOAT16_INT): New dummy type.
	(BT_FN_FLOAT32_VPTR_FLOAT32_INT): New dummy type.
	(BT_FN_FLOAT64_VPTR_FLOAT64_INT): New dummy type.
	(BT_FN_FLOAT128_VPTR_FLOAT128_INT): New dummy type.
	(BT_FN_FLOAT32X_VPTR_FLOAT32X_INT): New dummy type.
	(BT_FN_FLOAT64X_VPTR_FLOAT64X_INT): New dummy type.

Signed-off-by: Matthew Malcomson <mmalcom...@nvidia.com>
---
 gcc/builtin-types.def   | 20 ++++++++++++++++++++
 gcc/fortran/f95-lang.cc |  5 +++++
 gcc/fortran/types.def   | 17 +++++++++++++++++
 gcc/sync-builtins.def   | 40 ++++++++++++++++++++++++++++++++++++++++
 4 files changed, 82 insertions(+)

diff --git a/gcc/builtin-types.def b/gcc/builtin-types.def
index 25da582ce58..d0aac6a3e7c 100644
--- a/gcc/builtin-types.def
+++ b/gcc/builtin-types.def
@@ -802,6 +802,26 @@ DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I2_INT, BT_VOID, BT_VOLATILE_PTR, BT_I2, BT
 DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I4_INT, BT_VOID, BT_VOLATILE_PTR, BT_I4, BT_INT)
 DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I8_INT, BT_VOID, BT_VOLATILE_PTR, BT_I8, BT_INT)
 DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I16_INT, BT_VOID, BT_VOLATILE_PTR, BT_I16, BT_INT)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT_VPTR_FLOAT_INT, BT_FLOAT, BT_VOLATILE_PTR,
+                     BT_FLOAT, BT_INT)
+DEF_FUNCTION_TYPE_3 (BT_FN_DOUBLE_VPTR_DOUBLE_INT, BT_DOUBLE, BT_VOLATILE_PTR,
+                     BT_DOUBLE, BT_INT)
+DEF_FUNCTION_TYPE_3 (BT_FN_LONGDOUBLE_VPTR_LONGDOUBLE_INT, BT_LONGDOUBLE,
+                     BT_VOLATILE_PTR, BT_LONGDOUBLE, BT_INT)
+DEF_FUNCTION_TYPE_3 (BT_FN_BFLOAT16_VPTR_BFLOAT16_INT, BT_BFLOAT16, BT_VOLATILE_PTR,
+                     BT_BFLOAT16, BT_INT)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT16_VPTR_FLOAT16_INT, BT_FLOAT16, BT_VOLATILE_PTR,
+                     BT_FLOAT16, BT_INT)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT32_VPTR_FLOAT32_INT, BT_FLOAT32, BT_VOLATILE_PTR,
+                     BT_FLOAT32, BT_INT)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT64_VPTR_FLOAT64_INT, BT_FLOAT64, BT_VOLATILE_PTR,
+                     BT_FLOAT64, BT_INT)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT128_VPTR_FLOAT128_INT, BT_FLOAT128, BT_VOLATILE_PTR,
+                     BT_FLOAT128, BT_INT)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT32X_VPTR_FLOAT32X_INT, BT_FLOAT32X, BT_VOLATILE_PTR,
+                     BT_FLOAT32X, BT_INT)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT64X_VPTR_FLOAT64X_INT, BT_FLOAT64X, BT_VOLATILE_PTR,
+                     BT_FLOAT64X, BT_INT)
 DEF_FUNCTION_TYPE_3 (BT_FN_INT_PTRPTR_SIZE_SIZE, BT_INT, BT_PTR_PTR, BT_SIZE, BT_SIZE)
 DEF_FUNCTION_TYPE_3 (BT_FN_PTR_CONST_PTR_CONST_PTR_SIZE, BT_PTR, BT_CONST_PTR, BT_CONST_PTR, BT_SIZE)
 DEF_FUNCTION_TYPE_3 (BT_FN_BOOL_INT_INT_INTPTR, BT_BOOL, BT_INT, BT_INT,
diff --git a/gcc/fortran/f95-lang.cc b/gcc/fortran/f95-lang.cc
index 30043cf2f92..55df698bfae 100644
--- a/gcc/fortran/f95-lang.cc
+++ b/gcc/fortran/f95-lang.cc
@@ -648,6 +648,7 @@ gfc_init_builtin_functions (void)
   enum builtin_type
   {
 #define DEF_PRIMITIVE_TYPE(NAME, VALUE) NAME,
+#define DEF_DUMMY_FUNCTION_TYPE(NAME) NAME,
 #define DEF_FUNCTION_TYPE_0(NAME, RETURN) NAME,
 #define DEF_FUNCTION_TYPE_1(NAME, RETURN, ARG1) NAME,
 #define DEF_FUNCTION_TYPE_2(NAME, RETURN, ARG1, ARG2) NAME,
@@ -676,6 +677,7 @@ gfc_init_builtin_functions (void)
 #define DEF_POINTER_TYPE(NAME, TYPE) NAME,
 #include "types.def"
 #undef DEF_PRIMITIVE_TYPE
+#undef DEF_DUMMY_FUNCTION_TYPE
 #undef DEF_FUNCTION_TYPE_0
 #undef DEF_FUNCTION_TYPE_1
 #undef DEF_FUNCTION_TYPE_2
@@ -1068,6 +1070,8 @@ gfc_init_builtin_functions (void)
 
 #define DEF_PRIMITIVE_TYPE(ENUM, VALUE) \
   builtin_types[(int) ENUM] = VALUE;
+#define DEF_DUMMY_FUNCTION_TYPE(ENUM) \
+  builtin_types[(int) ENUM] = builtin_types[(int) BT_FN_VOID_INT];
 #define DEF_FUNCTION_TYPE_0(ENUM, RETURN) \
   builtin_types[(int) ENUM] \
     = build_function_type_list (builtin_types[(int) RETURN], \
@@ -1231,6 +1235,7 @@ gfc_init_builtin_functions (void)
     = build_pointer_type (builtin_types[(int) TYPE]);
 #include "types.def"
 #undef DEF_PRIMITIVE_TYPE
+#undef DEF_DUMMY_FUNCTION_TYPE
 #undef DEF_FUNCTION_TYPE_0
 #undef DEF_FUNCTION_TYPE_1
 #undef DEF_FUNCTION_TYPE_2
diff --git a/gcc/fortran/types.def b/gcc/fortran/types.def
index a69e25206f1..5e622a41040 100644
--- a/gcc/fortran/types.def
+++ b/gcc/fortran/types.def
@@ -59,6 +59,10 @@ DEF_PRIMITIVE_TYPE (BT_I4, builtin_type_for_size (BITS_PER_UNIT*4, 1))
 DEF_PRIMITIVE_TYPE (BT_I8, builtin_type_for_size (BITS_PER_UNIT*8, 1))
 DEF_PRIMITIVE_TYPE (BT_I16, builtin_type_for_size (BITS_PER_UNIT*16, 1))
 
+DEF_PRIMITIVE_TYPE (BT_FLOAT, float_type_node)
+DEF_PRIMITIVE_TYPE (BT_DOUBLE, double_type_node)
+DEF_PRIMITIVE_TYPE (BT_LONGDOUBLE, long_double_type_node)
+
 DEF_PRIMITIVE_TYPE (BT_PTR, ptr_type_node)
 DEF_PRIMITIVE_TYPE (BT_CONST_PTR, const_ptr_type_node)
 DEF_PRIMITIVE_TYPE (BT_VOLATILE_PTR,
@@ -143,6 +147,19 @@ DEF_FUNCTION_TYPE_3 (BT_FN_I2_VPTR_I2_INT, BT_I2, BT_VOLATILE_PTR, BT_I2, BT_INT
 DEF_FUNCTION_TYPE_3 (BT_FN_I4_VPTR_I4_INT, BT_I4, BT_VOLATILE_PTR, BT_I4, BT_INT)
 DEF_FUNCTION_TYPE_3 (BT_FN_I8_VPTR_I8_INT, BT_I8, BT_VOLATILE_PTR, BT_I8, BT_INT)
 DEF_FUNCTION_TYPE_3 (BT_FN_I16_VPTR_I16_INT, BT_I16, BT_VOLATILE_PTR, BT_I16, BT_INT)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT_VPTR_FLOAT_INT, BT_FLOAT, BT_VOLATILE_PTR,
+                     BT_FLOAT, BT_INT)
+DEF_FUNCTION_TYPE_3 (BT_FN_DOUBLE_VPTR_DOUBLE_INT, BT_DOUBLE, BT_VOLATILE_PTR,
+                     BT_DOUBLE, BT_INT)
+DEF_FUNCTION_TYPE_3 (BT_FN_LONGDOUBLE_VPTR_LONGDOUBLE_INT, BT_LONGDOUBLE,
+                     BT_VOLATILE_PTR, BT_LONGDOUBLE, BT_INT)
+DEF_DUMMY_FUNCTION_TYPE (BT_FN_BFLOAT16_VPTR_BFLOAT16_INT)
+DEF_DUMMY_FUNCTION_TYPE (BT_FN_FLOAT16_VPTR_FLOAT16_INT)
+DEF_DUMMY_FUNCTION_TYPE (BT_FN_FLOAT32_VPTR_FLOAT32_INT)
+DEF_DUMMY_FUNCTION_TYPE (BT_FN_FLOAT64_VPTR_FLOAT64_INT)
+DEF_DUMMY_FUNCTION_TYPE (BT_FN_FLOAT128_VPTR_FLOAT128_INT)
+DEF_DUMMY_FUNCTION_TYPE (BT_FN_FLOAT32X_VPTR_FLOAT32X_INT)
+DEF_DUMMY_FUNCTION_TYPE (BT_FN_FLOAT64X_VPTR_FLOAT64X_INT)
 DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I1_INT, BT_VOID, BT_VOLATILE_PTR, BT_I1, BT_INT)
 DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I2_INT, BT_VOID, BT_VOLATILE_PTR, BT_I2, BT_INT)
 DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I4_INT, BT_VOID, BT_VOLATILE_PTR, BT_I4, BT_INT)
diff --git a/gcc/sync-builtins.def b/gcc/sync-builtins.def
index b4ec3782799..89cc564a8f6 100644
--- a/gcc/sync-builtins.def
+++ b/gcc/sync-builtins.def
@@ -28,6 +28,30 @@ along with GCC; see the file COPYING3.  If not see
    is supposed to be using.  It's overloaded, and is resolved to one of the
    "_1" through "_16" versions, plus some extra casts.  */
 
+
+/* Same as DEF_GCC_FLOATN_NX_BUILTINS, except for sync builtins.
+   N.b. we do not define the f128x type because this would be larger than the
+   16 byte integral types that we have atomic support for.  That would mean
+   we couldn't implement them without adding special extra handling --
+   especially because to act atomically on such large sizes all architectures
+   would require locking implementations added in libatomic.  */
+#undef DEF_SYNC_FLOATN_NX_BUILTINS
+#define DEF_SYNC_FLOATN_NX_BUILTINS(ENUM, NAME, TYPE_MACRO, ATTRS) \
+  DEF_SYNC_BUILTIN (ENUM ## F16, NAME "f16", TYPE_MACRO (FLOAT16), ATTRS) \
+  DEF_SYNC_BUILTIN (ENUM ## F32, NAME "f32", TYPE_MACRO (FLOAT32), ATTRS) \
+  DEF_SYNC_BUILTIN (ENUM ## F64, NAME "f64", TYPE_MACRO (FLOAT64), ATTRS) \
+  DEF_SYNC_BUILTIN (ENUM ## F128, NAME "f128", TYPE_MACRO (FLOAT128), ATTRS) \
+  DEF_SYNC_BUILTIN (ENUM ## F32X, NAME "f32x", TYPE_MACRO (FLOAT32X), ATTRS) \
+  DEF_SYNC_BUILTIN (ENUM ## F64X, NAME "f64x", TYPE_MACRO (FLOAT64X), ATTRS)
+
+#undef DEF_SYNC_FLOAT_BUILTINS
+#define DEF_SYNC_FLOAT_BUILTINS(ENUM, NAME, TYPE_MACRO, ATTRS) \
+  DEF_SYNC_BUILTIN (ENUM ## _FPF, NAME "_fpf", TYPE_MACRO (FLOAT), ATTRS) \
+  DEF_SYNC_BUILTIN (ENUM ## _FP, NAME "_fp", TYPE_MACRO (DOUBLE), ATTRS) \
+  DEF_SYNC_BUILTIN (ENUM ## _FPL, NAME "_fpl", TYPE_MACRO (LONGDOUBLE), ATTRS) \
+  DEF_SYNC_BUILTIN (ENUM ## _FPF16B, NAME "_fpf16b", TYPE_MACRO (BFLOAT16), ATTRS) \
+  DEF_SYNC_FLOATN_NX_BUILTINS (ENUM ## _FP, NAME "_fp", TYPE_MACRO, ATTRS)
+
 DEF_SYNC_BUILTIN (BUILT_IN_SYNC_FETCH_AND_ADD_N, "__sync_fetch_and_add",
                   BT_FN_VOID_VAR, ATTR_NOTHROWCALL_LEAF_LIST)
 DEF_SYNC_BUILTIN (BUILT_IN_SYNC_FETCH_AND_ADD_1, "__sync_fetch_and_add_1",
@@ -378,6 +402,10 @@ DEF_SYNC_BUILTIN (BUILT_IN_ATOMIC_ADD_FETCH_8,
 DEF_SYNC_BUILTIN (BUILT_IN_ATOMIC_ADD_FETCH_16,
                   "__atomic_add_fetch_16",
                   BT_FN_I16_VPTR_I16_INT, ATTR_NOTHROWCALL_LEAF_LIST)
+#define ADD_FETCH_TYPE(F) BT_FN_##F##_VPTR_##F##_INT
+DEF_SYNC_FLOAT_BUILTINS (BUILT_IN_ATOMIC_ADD_FETCH, "__atomic_add_fetch",
+                         ADD_FETCH_TYPE, ATTR_NOTHROWCALL_LEAF_LIST)
+#undef ADD_FETCH_TYPE
 
 DEF_SYNC_BUILTIN (BUILT_IN_ATOMIC_SUB_FETCH_N,
                   "__atomic_sub_fetch",
@@ -397,6 +425,10 @@ DEF_SYNC_BUILTIN (BUILT_IN_ATOMIC_SUB_FETCH_8,
 DEF_SYNC_BUILTIN (BUILT_IN_ATOMIC_SUB_FETCH_16,
                   "__atomic_sub_fetch_16",
                   BT_FN_I16_VPTR_I16_INT, ATTR_NOTHROWCALL_LEAF_LIST)
+#define SUB_FETCH_TYPE(F) BT_FN_##F##_VPTR_##F##_INT
+DEF_SYNC_FLOAT_BUILTINS (BUILT_IN_ATOMIC_SUB_FETCH, "__atomic_sub_fetch",
+                         SUB_FETCH_TYPE, ATTR_NOTHROWCALL_LEAF_LIST)
+#undef SUB_FETCH_TYPE
 
 DEF_SYNC_BUILTIN (BUILT_IN_ATOMIC_AND_FETCH_N,
                   "__atomic_and_fetch",
@@ -492,6 +524,10 @@ DEF_SYNC_BUILTIN (BUILT_IN_ATOMIC_FETCH_ADD_8,
 DEF_SYNC_BUILTIN (BUILT_IN_ATOMIC_FETCH_ADD_16,
                   "__atomic_fetch_add_16",
                   BT_FN_I16_VPTR_I16_INT, ATTR_NOTHROWCALL_LEAF_LIST)
+#define FETCH_ADD_TYPE(F) BT_FN_##F##_VPTR_##F##_INT
+DEF_SYNC_FLOAT_BUILTINS (BUILT_IN_ATOMIC_FETCH_ADD, "__atomic_fetch_add",
+                         FETCH_ADD_TYPE, ATTR_NOTHROWCALL_LEAF_LIST)
+#undef FETCH_ADD_TYPE
 
 DEF_SYNC_BUILTIN (BUILT_IN_ATOMIC_FETCH_SUB_N,
                   "__atomic_fetch_sub",
@@ -511,6 +547,10 @@ DEF_SYNC_BUILTIN (BUILT_IN_ATOMIC_FETCH_SUB_8,
 DEF_SYNC_BUILTIN (BUILT_IN_ATOMIC_FETCH_SUB_16,
                   "__atomic_fetch_sub_16",
                   BT_FN_I16_VPTR_I16_INT, ATTR_NOTHROWCALL_LEAF_LIST)
+#define FETCH_SUB_TYPE(F) BT_FN_##F##_VPTR_##F##_INT
+DEF_SYNC_FLOAT_BUILTINS (BUILT_IN_ATOMIC_FETCH_SUB, "__atomic_fetch_sub",
+                         FETCH_SUB_TYPE, ATTR_NOTHROWCALL_LEAF_LIST)
+#undef FETCH_SUB_TYPE
 
 DEF_SYNC_BUILTIN (BUILT_IN_ATOMIC_FETCH_AND_N,
                   "__atomic_fetch_and",
-- 
2.43.0
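
For readers unfamiliar with the "CAS loop" expansion mentioned in the commit message, below is a rough sketch of what a floating point fetch_add amounts to when written by hand with the existing generic __atomic builtins. The function names are hypothetical and are not the builtins defined by this patch. The sketch assumes a 4-byte float and makes no attempt to handle the floating point environment (an addition on a failed iteration may still raise exceptions), which is exactly the part a real expansion has to treat with care.

```c
#include <assert.h>

/* The generic __atomic builtins operate on objects of 1, 2, 4, 8 or 16
   bytes; this sketch assumes the common 4-byte float.  */
static_assert (sizeof (float) == 4, "sketch assumes a 4-byte float");

/* Hypothetical helper: atomically add VAL to *PTR and return the value
   that was stored before the addition.  */
static float
fetch_add_float_cas (float *ptr, float val, int memorder)
{
  float expected, desired;

  /* Take an initial snapshot of the current value.  */
  __atomic_load (ptr, &expected, __ATOMIC_RELAXED);
  do
    /* Compute the new value from the snapshot.  On a failed exchange the
       builtin refreshes EXPECTED with the value currently in memory, so
       the next iteration recomputes from up-to-date data.  */
    desired = expected + val;
  while (!__atomic_compare_exchange (ptr, &expected, &desired,
                                     1 /* weak */, memorder,
                                     __ATOMIC_RELAXED));
  return expected;
}

/* Hypothetical helper: same operation, but return the updated value.  */
static float
add_fetch_float_cas (float *ptr, float val, int memorder)
{
  return fetch_add_float_cas (ptr, val, memorder) + val;
}
```

The compare-exchange compares object representations rather than floating point values, so the loop only succeeds when memory still holds exactly the bits that were read.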