On 11.07.2023 08:45, Liu, Hongtao wrote: >> -----Original Message----- >> From: Jan Beulich <jbeul...@suse.com> >> Sent: Tuesday, July 11, 2023 2:08 PM >> >> There's nothing AVX512BW-ish in here, so no reason to use Yw as the >> constraints for the AVX alternative. Furthermore by using the 512-bit form of >> VPSSLD (in a new alternative) all 32 registers can be used directly by the >> insn >> without AVX512VL needing to be enabled. > Yes, the instruction vpslld doesn't need AVX512BW, the patch LGTM.
Thanks. >> --- >> The corresponding expander, "extendbfsf2", looks to have been dead since >> its introduction in a1ecc5600464 ("Fix incorrect _mm_cvtsbh_ss"): The builtin >> references the insn (extendbfsf2_1), not the expander. Can't the expander >> be deleted and the name of the insn then pruned of the _1 suffix? If so, that >> further raises the question of the significance of the "!HONOR_NANS >> (BFmode)" that the expander has, but the insn doesn't have. Which may >> instead suggest the builtin was meant to reference the expander. Yet then I >> can't see what would the builtin would expand to when HONOR_NANS >> (BFmode) it true. > > Quote from what Jakub said in [1]. > ------- > This is not correct. > While using such code for _mm_cvtsbh_ss is fine if it is documented not to > raise exceptions and turn a sNaN into a qNaN, it is not fine for HONOR_NANS > (i.e. when -ffast-math is not on), because a __bf16 -> float conversion > on sNaN should raise invalid exception and turn it into a qNaN. > We could have extendbfsf2 expander that would FAIL; if HONOR_NANS and > emit extendbfsf2_1 otherwise. > ------- > [1] https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607108.html I'm not sure I understand: It sounds like what Jakub said matches my observation, yet then it seems unlikely that the issue wasn't fixed in over half a year. Also having the expander FAIL when HONOR_NANS (matching what I was thinking) still doesn't clarify to me what then would happen to uses of the builtin. Is there any (common code) fallback for such a case? I didn't think there would be, in which case wouldn't this result in an internal compiler error? Jan