On Thu, Aug 4, 2016 at 4:30 AM, Patrick Palka <patr...@parcs.ath.cx> wrote:
> On Wed, 3 Aug 2016, David Malcolm wrote:
>
>> On Wed, 2016-08-03 at 15:47 +0200, Richard Biener wrote:
>> > On Wed, Aug 3, 2016 at 6:00 AM, Patrick Palka <patr...@parcs.ath.cx>
>> > wrote:
>> > > VRP currently has functionality to eliminate case labels that lie
>> > > completely outside of the switch operand's value range.  This patch
>> > > complements this functionality by teaching VRP to also truncate the
>> > > case label ranges that partially overlap with the operand's value
>> > > range.
>> > >
>> > > Bootstrapped and regtested on x86_64-pc-linux-gnu.  Does this look
>> > > like a reasonable optimization?  Admittedly, its effect will almost
>> > > always be negligible except in cases where a case label range spans
>> > > a large number of values, which is a pretty rare thing.  The
>> > > optimization triggered about 250 times during bootstrap.
>> >
>> > I think it's most useful when the range collapses to a single value.
>> >
>> > Ok.
>>
>> Is this always an improvement?  I can see that it can simplify things,
>> eliminate dead code, etc., but could it make evaluating the switch
>> less efficient?
>>
>> Consider e.g.
>>
>> void
>> test (char ch)
>> {
>>   if (ch > 17)
>>     return;
>>
>>   switch (ch)
>>     {
>>     case 0:
>>       foo (); break;
>>
>>     case 1 ... 255:
>>       bar (); break;
>>     }
>> }
>>
>> which (assuming this could survive this far in this form) previously
>> could be implemented as a simple "if (ch == 0)", but with this it
>> would get simplified to:
>>
>> void
>> test (char ch)
>> {
>>   if (ch > 17)
>>     return;
>>
>>   switch (ch)
>>     {
>>     case 0:
>>       foo (); break;
>>
>>     case 1 ... 17:
>>       bar (); break;
>>     }
>> }
>>
>> which presumably introduces a compare against 17 in the implementation
>> of the switch; does the new compare get optimized away by jump
>> threading?
>
> In this particular example the final code does get worse with the
> patch, for the reason you mentioned:
>
>   Before:                        After:
>   test:                          test:
>   .LFB0:                         .LFB0:
>       .cfi_startproc                 .cfi_startproc
>       cmpb    $17, %dil              cmpb    $17, %dil
>       ja      .L1                    ja      .L1
>       xorl    %eax, %eax             subl    $1, %edi
>       cmpb    $1, %dil               xorl    %eax, %eax
>       jb      .L7                    cmpb    $16, %dil
>       jmp     bar                    ja      .L7
>       .p2align 4,,10                 jmp     bar
>       .p2align 3                     .p2align 4,,10
>   .L7:                               .p2align 3
>       jmp     foo                .L7:
>       .p2align 4,,10                 jmp     foo
>       .p2align 3                     .p2align 4,,10
>   .L1:                               .p2align 3
>       rep ret                    .L1:
>       .cfi_endproc                   rep ret
>                                      .cfi_endproc
>
> What's weird is that during gimplification the switch gets simplified
> to
>
>   switch (ch)
>     {
>     default: foo (); break;
>     case 1 ... 255: bar (); break;
>     }
>
> but if anything I would have expected it to get simplified to
>
>   switch (ch)
>     {
>     case 0: foo (); break;
>     default: bar (); break;
>     }
>
> In general, when case labels are exhaustive, maybe it would be better
> to designate the case label that has the widest range as the default
> label?  (Currently preprocess_case_label_vec_for_gimple() just
> designates the very first label to be the default label.)  That would
> fix this particular regression at least.
Yes, that looks useful - though I wonder how easy it is to detect for
the cases where there is more than one case/default.

Richard.