On 13/02/18 05:09, John Fastabend wrote: > I have some code that is getting close to working on bounded loops. Here > is how I think it should work, (with some annotations on what I have). Thanks for this! For comparison, since my code is also 'getting close to working' here's how my approach works.
First, we get rid of (most of) check_cfg(). Replace it with a function that just puts the state-list marks on jump targets (including following- insns of conditional jumps and calls), then does a quick check that every insn following any JMP-class insn has a state-list mark (this suffices to detect statically-unreachable code). 2*O(n) in insn_cnt, and much easier to understand than check_cfg (I only recently figured out that it's also a topological sort algorithm). Now of course at this point there is nothing to prevent loops at all, so in is_state_visited we now add a check where we search backwards through the parentage chain for a state with the same insn_idx as us. (We have to change a couple of things for this to work: store the insn_idx of the current frame instead of the callsite (insn_idx of the calling frame), and thanks to function calls it turns out that the chain we want doesn't match that for any of the registers. So I gave registers their own parentage chains (that's a nice simplification and worth doing in itself) and added a new chain to func_states that does what I want.) If we find such a state, we have looped, so we reject the prog. So far the diffstat has 159 insertions, 353 deletions. The downside of course is that walking the parentage chain all the time to look for loops is going to hurt performance. I haven't yet come up with a way around that but some time spent meditating on data structures might turn something up ;) Next we want to start allowing the loop as long as it's bounded. So when we detect the loop in is_state_visited, we look at the last jump we did. (With a slight tweak we can make this always available in prev_insn_idx, even if we elided the push/pop_stack in check_cond_jmp_op.) We then look at the jump insn, the old state (conveniently returned from our loop detection routine!) and the new state, to decide whether the loop is moving towards termination. The test I'm using for now shares some "it's just one case" limits as your point 3 :-) For now it's: a) Is the jump a JLT with a constant? (Same reason as yours.) b) Are we in the 'jump true' branch? (I haven't quite figured out how to make sure of this in general, checking insn_idx < prev_insn_idx works but rules out back-edges with forward jumps. Fortunately those only occur in unnatural loops so we don't have to support them all.) c) Does the jump reg hold a SCALAR_VALUE in both old and new states? d) Did the jump reg->umin_value increase from old to new states? e) How _fast_ did it increase? Rule for now is to take the delta, multiply it by 16, and if that exceeds the jump compare constant, we're happy. This doesn't actually limit loops to 16 times, because the delta could decay exponentially - the worst case comes out as 687 iterations. But we could do something cleverer with recording the delta so we can insist on linear induction variables (i = i + c) so that we can give those more iterations without giving too many to the worst case exponential. If all of these checks pass, then the loop is (so far) behaving itself. So we mark the old state with a "this is being looped on" flag (which prevents it from being used for pruning. I haven't yet come up with a way to clear the flag when all the state's descendants have reached an exit, so this will unfortunately reduce pruning effectiveness) and then we carry on the walk. do_check() will ultimately walk around the loop as many times as the program will. Of course, there's one problem here: when do_check walks the loop for the last time, and knows that the loop test has to be false... it will still follow the branch anyway and keep walking forever. So I also had to add a patch to not follow 'impossible' branches, determined by the jump register getting inconsistent bounds. This is similar to what check_cond_jmp_op() already does with JEQ/JNE of const reg against const imm. > The worst case loop count would > be, `c - min_value` then `(c - min_value) * max(loop_cfg(i))` would be the > the worst case instruction that might be executed. Where loop_cfg(i) is > the > max insns that can be run through the loop with worst case CFG path. Currently I don't have any equivalent of that loop_cfg(i) factor, so I'm applying the same max iteration count to all loops regardless of the size of the loop body. But I guess I could do something like counting insns since parent state and recording that in the explored_states, so that when I have my detected loop I can count how long it is. Anyway, that's my design, I hope it wasn't _too_ horrifying, and I hope to have some RFC patches in the next week or so. -Ed