erichkeane wrote: > Hi @erichkeane, thanks for your detailed review and comments. > > I had some thoughts about your optimized approach which I would like to share. > > I agree that falling back to full string comparisons is "safe" but > non-optimal. However, performing "first different character" comparison might > get complicated if you have more than 2 different spelling names with the > same size. > > An example case is `AT_NoSanitizeSpecific` with - "no_sanitize_thread" and > "no_sanitize_memory" (size 18). In this case it is easy to check for the > differing character by using `std::mismatch`. However, a case where there are > 3 such strings (like "aaaba", "aaacb", "aaabc") would involve generation of a > much more complex conditional. Although rare, I'm not sure if we can ignore > this situation. We may sacrifice readability of the emitter code / > debuggability of the `.inc` file at the cost of minimal optimizations (full > string comparisons now happen in only 6 cases). > > I'm happy to add the FIXME note, but would like to hear what you think.
So the conditional actually doesn't have to be all that difficult. You need N-1 comparisons to differentiate (though you m ight end up with duplicates). For example, in your 3 strings example if differentiating the first one, my algorithm would generate Str.size() == 5 && Str[3] == 'b' && Str[3] == 'b' (since that is the difference character for each). It is simple enough join with an 'and' and a loop, as long as we don't care about duplicates. As far as debugability/readability of emitted code: it isn't really a concern here. Once we get the tablegen 'right' (which traditionally doesn't take much effort/testing), no one ever gets into here again. The spelling-identification one uses a trie, and it is a giant mess of undebugability :) > > To address a previous comment - > > > This seems to have missed a bunch, see alloc_size and allocating and > > alloc_align, all of which are the same spellings, but you're checking the > > name anyway (in the .inc file you linked). > > I'm confident that my earlier approach did not perform string comparisons for > "alloc_size", "allocating" and "alloc_align". The 2 tests that were failing > were because I assumed that spellings with the same name would have > consecutive indices. This failed for `AT_unused`. But I concede that the > earlier loop did not leverage the `.size()` for a more optimal comparison :) > This way is definitely better. > Got it! It wasn't clear based on the old implementation what was happening, just from the GIST that something wasn't working. > [Gist](https://gist.github.com/chinmaydd/1733a9a84932a45678c47f31be4d64d7) > contains updated output. That GIST looks about right. I see as you mentioned, ~6 different comparisons, which isn't too bad. Definitely an improvement. https://github.com/llvm/llvm-project/pull/114899 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits