yaxunl marked an inline comment as done. yaxunl added inline comments.
================ Comment at: clang/lib/Driver/ToolChains/HIP.cpp:121-123 + auto Pos = SubArchName.find_first_of("+-"); + if (Pos != SubArchName.npos) + SubArchName = SubArchName.substr(0, Pos); ---------------- tra wrote: > yaxunl wrote: > > tra wrote: > > > Parsing should probably be extracted into a separate function to avoid > > > replicating it all over the place. > > > > > > I'd also propose use a different syntax for the properties. > > > * use explicit character to separate individual elements. This way > > > splitting the properties becomes independent of what those properties > > > are. If you decide to make properties with values or change their meaning > > > some other way, it would not affect how you compose them. > > > * use `name=value` or `name[+-]` for individual properties. This makes it > > > easy to parse individual properties and normalize their names. This makes > > > property map creation independent of the property values. > > > > > > Right now `[+-]` serves as both a separator and as the value, which would > > > present problems if you ever need more flexible parametrization of > > > properties. What if a property must be a number or a string. Granted, you > > > can always encode them as a set of bools, but that's rather unpractical. > > > > > > E.g. something like this would work a bit better: > > > `gfx111:foo+:bar=33:buz=string`. > > > > > I discussed this with our team. > > > > The target id features are not raw GPU properties. They are distilled to > > become a few target features to decide what the compiler should do. > > > > Each target feature has 3 values: on, off, and default, which are encoded > > as +feature, -feature, and not present. > > > > For runtime, it is simple and clear how to choose device binaries based on > > the target features: it will try exact match, otherwise choose the default. > > > > For compiler, it is also simple and clear what to do for each target > > feature value, since they corresponding to backend target features. > > > > Basically we expect the target id feature to be like flags, not key/value > > pairs. In case we do need key/value pairs, they can still use + as > > delimiter. > > > > Another reason we use +/- format is that it is more in line with the format > > of existing clang-offload-bundler id and target triple, which uses - as > > delimiter. > > > > Since the target id is an extension of offload arch and users will put it > > into command line, we want to make it short, concise and aesthetically > > appealing, we would avoid using non-alpha-numeric characters in the target > > id features. Target triple components have similar requirements. Using : as > > delimiter seems unnecessary, longer, and more difficult to read. > > > > Consider the following example > > > > > > ``` > > clang -offload-id gfx908+xnack-sramecc a.hip > > > > clang -offload-id gfx908:xnack+:sramecc- a.hip > > ``` > > > > We are more inclined to keep the original format. > You're thinking in terms what's needed by AMDGPU *now*. The scheme you're > proposing is sufficient for your use case and I'm fine with that. I'm > suggesting that you should consider what happens once this change lands. > > The functionality you're implementing is exposed to end-users via top-level > clang driver argument. This is visible to users and will be relied on. > This will make it hard to change in the future without breaking someone. It's > worth making sure we're not painting ourselves in the corner here. > > Also, the functionality may be useful/applicable beyond the scope of amdgpu > and the binary flags will not be sufficient for everyone. The scheme you're > proposing would be somewhat restrictive if I need to pass an integer value or > string. We could use something like `gfx123+foo=456-bar=789` but it would > look rather odd, IMO. > > Granted, none of the above is a showstopper. I guess we could support > multiple formats if it comes to that, but I'd rather not multiply things > later because we didn't think of them earlier. > > > Another reason we use +/- format is that it is more in line with the format > > of existing clang-offload-bundler id and target triple, which uses - as > > delimiter. > > The point was that commingling field separator and the field value is not the > cleanest approach, IMO. I'd be fine fine with some other character. > > > Since the target id is an extension of offload arch and users will put it > > into command line, we want to make it short, concise and aesthetically > > appealing, we would avoid using non-alpha-numeric characters in the target > > id features. Target triple components have similar requirements. Using : as > > delimiter seems unnecessary, longer, and more difficult to read. > > The current use of `gfxXXX` seems to fit the 'short, concise & aesthetically > pleasing' part of your argument much better than the proposed scheme. > > Is the end user allowed to specify an arbitrary set of the features? Or is > the offload-id set restricted to a smaller number of combinations (i.e. tied > to particular hardware variants). I vaguely recall that in the past the > problem was that AMD needed to create multiple device compilations for one > GPU architecture and that didn't fit in the model used by CUDA compilation. > > Would it make sense to keep user-visible GPU arch argument as is and map each > known one internally into a set of `offload-id` parameters used to create > driver device-side compilations? For CUDA it will be a pass-through, for HIP > it will translate single user-specified arch into multiple offload-ids. This > would leave AMDGPU free to choose the way internally-used offload-id is > structured and can change it if/when it's necessary without worrying about > existing users. It also keeps user-visible parameters short. The translation > from gpu-arch to offload-id should be simple enough to maintain. > > After discussion, we decided to adopt the format you proposed. The rationale is that we want target id to be treated as an extended `--offload-arch` option, which means it needs to be able to accept all existing and future CUDA arch names. Using `:` as delimiter should be tolerant enough whereas `+/-` is not. Also I will try introducing -offload-target-id for this option. The features that can be used in target id are restricted to a few predefined features for each GPU arch, because both compiler and runtime needs to know how to handle them. I am not sure if I understand your last question. With the new format we should be able to use any CUDA arch names as target id, therefore we no longer need a map. Also we need to pass each target id as a whole option since we need to use it as an id for the device binary for each device compilation. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D60620/new/ https://reviews.llvm.org/D60620 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits