On Tue, Feb 26, 2013 at 2:59 AM, Steven Bosscher <stevenb....@gmail.com> wrote:
> On Tue, Feb 26, 2013 at 2:12 AM, Wei Mi wrote:
>> But it is not a good transformation unless we know insn split will
>> change a << (b & 63) to a << b; Here we want to see what the rtl looks
>> like after insn splitting in fwprop cost estimation (We call
>> split_insns in estimate_split_and_peephole(), but not to do insn
>> splitting actually in this phase).
>
> So you're splitting to find out that the shift is truncated to 5 or 6
> bits. That looks like what you really want is to have
> SHIFT_COUNT_TRUNCATED working for your target. It isn't defined for
> i386:
>
> /* Define if shifts truncate the shift count which implies one can
>    omit a sign-extension or zero-extension of a shift count.
>
>    On i386, shifts do truncate the count. But bit test instructions
>    take the modulo of the bit offset operand. */
>
> /* #define SHIFT_COUNT_TRUNCATED */
>
> Perhaps SHIFT_COUNT_TRUNCATED should be turned into a target hook, and
> take an rtx_code (or a pattern) to let the target decide whether a
> truncation is applicable or not.
>
> This is a target thing, so perhaps Uros has some ideas about this.
>
> I'm guessing cse.c would then handle your code transformation already,
> or can be made to do so without a lot of extra work, e.g. teach
> fold_rtx about such (shift (...) (and (...))) transformations that are
> really truncations.
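
For concreteness, the a << (b & 63) case from the quoted text can be modelled in C roughly as follows. This is only a sketch: hw_shl64 is a hypothetical stand-in for a 64-bit shift instruction that looks at just the low 6 bits of the count, and none of these names come from GCC.

```c
#include <stdint.h>

/* Hypothetical model of a 64-bit left shift on a target whose shift
   instruction only uses the low 6 bits of the count.  */
static uint64_t
hw_shl64 (uint64_t a, uint64_t count)
{
  return a << (count & 63);
}

/* The form with an explicit mask: a << (b & 63).  */
static uint64_t
shl_masked (uint64_t a, uint64_t b)
{
  return hw_shl64 (a, b & 63);
}

/* The form without the mask: a << b.  */
static uint64_t
shl_unmasked (uint64_t a, uint64_t b)
{
  return hw_shl64 (a, b);
}
```

Under that assumption the two forms agree for every count, so the explicit mask is redundant once the shift is known to truncate.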

Thanks for pointing out fold_rtx. I took a look at it and cse
yesterday, and I agree that fold_rtx could be extended to handle the
motivating case. But I still think the fwprop extension is worthwhile
in general.

1. fold_rtx doesn't handle all propagation-simplification tasks; it
only handles some typical cases. I don't think cse should become
cumbersome by absorbing all of fwprop's functionality. The fwprop
extension tries to handle the propagation-simplification problem in
general. I think cse contains fold_rtx partly because the existing
fwprop and combine are not ideal. If fwprop could handle the general
case, cse could simply concentrate on finding common subexpressions.

2. fold_rtx simplifies based only on the current insn, while the fwprop
extension considers the def-use group as a whole. When all the uses
could be propagated, we have two choices:
a) do all the propagations and then delete the def insn, even if some
propagations may not be beneficial.
b) do only the beneficial propagations and leave the def insn in place.
The fwprop extension has a cost model to choose which way to go.

What do you think?

Thanks,
Wei.
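
To illustrate the choice between a) and b) in point 2, here is a minimal sketch in C. It is not the cost model from the patch: the per-use gain values, the def_cost parameter, and all names are assumptions made up for illustration.

```c
#include <stddef.h>

/* Hypothetical names for the two choices in point 2.  */
enum prop_choice { PROP_ALL_AND_DELETE_DEF, PROP_BENEFICIAL_KEEP_DEF };

/* gain[i] is an assumed benefit of propagating the def into use i
   (negative when propagation makes that use more expensive).
   def_cost is the assumed saving from deleting the def insn.  */
enum prop_choice
choose_propagation (const int *gain, size_t n_uses, int def_cost)
{
  int total_all = def_cost;  /* choice a): every use, def removed */
  int total_good = 0;        /* choice b): only the profitable uses */

  for (size_t i = 0; i < n_uses; i++)
    {
      total_all += gain[i];
      if (gain[i] > 0)
        total_good += gain[i];
    }

  return total_all >= total_good
         ? PROP_ALL_AND_DELETE_DEF
         : PROP_BENEFICIAL_KEEP_DEF;
}
```

In this toy model, propagating into every use wins only when the saving from deleting the def outweighs the losses on the unprofitable uses.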