On Tue, 2014-12-30 at 16:22 +0100, Richard Biener wrote:
> On December 29, 2014 7:44:13 PM CET, Oleg Endo <oleg.e...@t-online.de> wrote:
> >On Mon, 2014-12-29 at 17:53 +0000, Thomas Preud'homme wrote:
> >> > From: Richard Biener [mailto:rguent...@suse.de]
> >> > Sent: Monday, December 29, 2014 5:09 PM
> >> >
> >> > OK, but what about targets without a rotation optab? Is the fallback
> >> > expansion reasonable in all cases?
> >>
> >> To be honest I haven't checked. I thought being a treecode means it
> >> can always be expanded, using a sequence of shift and bitwise or if
> >> necessary. Isn't there some language that GCC support with rotate
> >> operators?
> >>
> >> Given your question I guess I was wrong assuming this. Is there a list
> >> of gimple construct that are necessary supported? What about a list
> >> of insn pattern that a backend must necessarily provide?
> >>
> >
> >__builtin_bswap16 expansion uses the 'rotlhi3' pattern to do a 16 bit
> >bswap as a fallback when there's no 'bswaphi2' pattern in the backend
> >(like on SH at the moment. I haven't added bswaphi2, as
> >__builtin_bswap16 has been working without it).
> >
> >I've just tried disabling the 'rotlhi3' pattern and __builtin_bswap16
> >expands into shift + and + or (as expected).
> >Thus, I don't think the patch will make something worse (than it
> >already
.L42:
> >is) on some backends. If the bswap detection bails out at the tree
> >level, the expanded ops will be shift + and + or -- as written in the
> >original code. So probably, that will be the same as the fallback
> >expansion for __builtin_bswap16, and we're back to sqrt (1).
>
> OK, that is what I was asking - are there cases where using rot is worse
> (like forcing a libcall or so).

See also comment in expmed.c (expand_shift_1):

      /* It is theoretically possible that the target machine might
         not be able to perform either shift and hence we would
         be making two libcalls rather than just the one for the
         shift (similarly if IOR could not be done).  We will allow
         this extremely unlikely lossage to avoid complicating the
         code below.  */

then goto .L42 ;)

Cheers,
Oleg
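
For readers following the thread, the equivalence being discussed can be sketched in plain C. This is only an illustration of the two forms (a 16 bit bswap as a rotate by 8, and the same thing as shift + and + or); it is not GCC's expander code, and all function names below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: 16 bit rotate left spelled as the two shifts
   plus IOR that a target without a rotate pattern would fall back to.
   n must be in the range 1..15. */
uint16_t rotl16_fallback(uint16_t x, unsigned n)
{
    return (uint16_t)((x << n) | (x >> (16 - n)));
}

/* 16 bit byte swap expressed as a rotate by 8. */
uint16_t bswap16_via_rot(uint16_t x)
{
    return rotl16_fallback(x, 8);
}

/* 16 bit byte swap expressed as shift + and + or. */
uint16_t bswap16_shift_and_or(uint16_t x)
{
    return (uint16_t)(((x & 0x00ffu) << 8) | ((x >> 8) & 0x00ffu));
}

/* Check that both forms agree for every 16 bit input. */
void check_all_inputs(void)
{
    for (uint32_t v = 0; v <= 0xffffu; v++)
        assert(bswap16_via_rot((uint16_t)v) == bswap16_shift_and_or((uint16_t)v));
}
```

Both forms compute the same value for every input, so the question in the thread is only about which sequence a given backend expands more cheaply, not about correctness.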