On Tue, Jul 2, 2019 at 3:48 AM Martin Sebor <mse...@gmail.com> wrote: > > Attached is a more complete solution that fully resolves the bug > report by avoiding a warning in cases like: > > char a[32], b[8]; > > void f (void) > { > if (strlen (a) < sizeof b - 2) > snprintf (b, sizeof b, "b=%s", a); // no -Wformat-truncation > } > > It does that by having get_range_strlen_dynamic use the EVRP > information for non-constant strlen calls: EVRP has recorded > that the result is less sizeof b - 2 and so the function > returns this limited range of lengths to snprintf which can > then avoid warning. It also improves the checking and can > find latent bugs it missed before (like the off-by-one in > print-rtl.c). > > Besides improving the accuracy of the -Wformat-overflow and > truncation warnings this can also result in better code. > So far this only benefits snprintf but there may be other > opportunities to string functions as well (e.g., strcmp or > memcmp). > > Jeff, I looked into your question/suggestion for me last week > when we spoke, to introduce some sort of a recursion limit for > get_range_strlen_dynamic. It's easily doable but before we go > down that path I did some testing to see how bad it can get and > to compare it with get_range_strlen. Here are the results for > a few packages. The dept is the maximum recursion depth, success > and fail are the numbers of successful and failed calls to > the function: > > binutils-gdb: > depth success fail > get_range_strlen: 319 8302 21621 > get_range_strlen_dynamic: 41 1503 161 > gcc: > get_range_strlen: 46 7211 11365 > get_range_strlen_dynamic: 23 10272 12 > glibc: > get_range_strlen: 76 2840 11422 > get_range_strlen_dynamic: 51 1186 46 > elfutils: > get_range_strlen: 33 1198 2516 > get_range_strlen_dynamic: 31 685 36 > kernel: > get_range_strlen: 25 5299 11698 > get_range_strlen_dynamic: 31 9911 122 > > Except the first case of get_range_strlen (I haven't looked into > it yet), it doesn't seem too bad, and with just one exception it's > better than get_range_strlen. Let me know if you think it's worth > adding a parameter (I assume we'd use it for both functions) and > what to set it to.
I think we want to avoid adding code to GCC that might turn out quadratic for artificial testcases. History tells us that eventually somebody will find a real-world testcase that hits it. I would suggest to not place a limit on the depth but on the number of SSA_NAME_DEF_STMTs visited. Not sure to what this would turn out but I guess using a limit around 500 would work? For the data above what's the biggest depth / highest number of stmts we visit when we are able to compute a useful limit and what is it when including failure? The individual number of successes/fails are not too interesting. Maybe even provide a histogram for the successful cases depth/stmt count? Otherwise I like the merging - can you annotate the passes.def entries with a comment indicating what the pass parameter designates? Like /* do optimize */? Since we have a parallelization GSoC project running not adding more global state to passes would be nice as well, strlen has a lot of it that could be in a per invocation state class. Not a requirement for this patch though. Besides the limiting no further comments from me - I'll leave the details to Jeff. Thanks a lot for doing this! Richard. > > On 6/11/19 5:26 PM, Martin Sebor wrote: > > The sprintf and strlen passes both work with strings but > > run independently of one another and don't share state. As > > a result, lengths of strings dynamically created by functions > > that are available to the strlen pass are not available to > > sprintf. Conversely, lengths of strings formatted by > > the sprintf functions are not made available to the strlen > > pass. The result is less efficient code, poor diagnostics, > > and ultimately less than optimal user experience. > > > > The attached patch is the first step toward rectifying this > > design problem. It integrates the two passes into one and > > exposes the string length data managed by the strlen pass to > > the sprintf "module." (It does not expose any sprintf data > > to the strlen pass yet.) > > > > The sprintf pass invocations in passes.def have been replaced > > with those of strlen. The first "early" invocation is only > > effective for the sprintf module to enable warnings without > > optimization. The second invocation is "late" and enables > > both warnings and the sprintf and strlen optimizations unless > > explicitly disabled via -fno-optimize-strlen. > > > > Since the strlen optimization is the second invocation of > > the pass tests that scan the strlen dump have been adjusted > > to look for the "strlen2" dump file. > > > > The changes in the patch are mostly mechanical. The one new > > "feature" worth calling out is the get_range_strlen_dynamic > > function. It's analogous to get_range_strlen in gimple-fold.c > > except that it makes use of the strlen "dynamically" obtained > > string length info. In cases when the info is not available > > the former calls the latter. > > > > The other new functions in tree-ssa-strlen.c are > > check_and_optimize_call and handle_integral_assign: I added > > them only to keep the check_and_optimize_stmt function from > > getting excessively long and hard to follow. Otherwise, > > the code in the functions is unchanged. > > > > There are a number of further enhancements to consider as > > the next steps: > > * enhance the strlen module to determine string length range > > information from integer variable to which it was previously > > assigned (necessary to fully resolve pr83431 and pr90625), > > * make the sprintf/snprintf string length results directly > > available to the strlen module, > > * enhance the sprintf module to optimize snprintf(0, 0, fmt) > > calls with fmt strings consisting solely of plain characters > > and %s directives to series of strlen calls on the arguments, > > * and more. > > > > Martin >