tra added a comment. In D110089#3014388 <https://reviews.llvm.org/D110089#3014388>, @jlebar wrote:
>> One alternative would be to use run-time dispatch, but, given that texture >> lookup is a single instruction, the overhead would be >> substantial-to-prohibitive. > > I guess I'm confused... Is the parameter value that we're "overloading" on > usually/always a constant? In that case, there's no overhead with runtime > dispatch. Or, is it not a constant? In which case, how does nvcc generate a > single instruction for this idiom at all? It's a string literal. And you're actually right, clang does manage to optimize strcmp with a known value. https://godbolt.org/z/h351hfsMf However, it's only part of the problem. Depending on which particular operation is used, the arguments vary, too. I still need to use templates that effectively need to be parameterized by that string literal argument and I can't easily do it until C++20. I'd need to push strcmp-based runtime dispatch down to the implementation of the texture lookups with the same operand signature. That's harder to generalize, as I'd have to implement string-based dispatch for quite a few subsets of the operations -- basically for each variant of cartesian product of `{dimensionality, Lod, Level, Sparse}`. Another downside is that the string comparison code will result in functions being much larger than necessary. Probably not a big thing overall, but why add overhead that would be paid for by all users and which does not buy us anything? Having one trivial compiler builtin that simplifies things a lot is a better trade-off, IMO. > But then I see `switch` statements in the code, so now I'm extra confused. :) That switch is for a special case of texture lookup which may result in one of four texture instruction variants. All others map 1:1. > Overall, I am unsure of why we need all of this magic. We can rely on LLVM > to optimize away constant integer comparisons, and also even comparisons > between string literals. It makes it possible to usa a string literal to parameterize templates, which allows to generate variants of `__nv_tex_surf_handler` in a relatively concise way. > What specifically would be inefficient if this were a series of "real" > overloaded functions, with none of the macros, templates, or builtins? > (Assuming efficiency is the concern here?) It's both efficiency and avoidance of typos in repetitive nearly identical code. There are ~2500 variants of high-level texture lookup variants. They end up calling about 600 different `__nv_tex_surf_handler` overloads that, in turn, end up generating ~70 unique inline assembly variants. The current code structure reflects that hierarchy. This is essentially the reason for the parameterization by the operation name happening early, instead of being used as a key for runtime dispatch at the end. ================ Comment at: clang/lib/AST/ExprConstant.cpp:11097 +static int EvaluateTextureOp(const CallExpr *E) { + // Sorted list of known operations stuuported by '__nv_tex_surf_handler' ---------------- jlebar wrote: > Write a comment explaining what this function does? > > (It seems to...translate a string into an integer? If so, to me, it's > strange that it uses a sorted list for this because...what if I add another > function? Won't that mess up all the numbers? Anyway, to be clarified in > the comment.) > > Now that I read more, I see that you don't care about this being a stable > mapping etc etc... > > I don't really get why this has to be a builtin at all, though. If it's > always a string literal, a simple strcmp will do the job, LLVM can optimize > this? And I'm almost sure you can assert that the char* is always a string > literal, so you can guarantee that it's always optimized away. Yes, it's just a 1:1 map. We do not care about specific values as they only matter within one TU. I'll document that. I can't easily use string literal to parameterize a template. Hmm. Perhaps I can implement a `constexpr perfect_hash(literal)` in a header. This would eliminate the need for the builtin. E.g. https://godbolt.org/z/bzzMbaKhe Let me give it a try. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D110089/new/ https://reviews.llvm.org/D110089 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits