ChuanqiXu added inline comments.
================ Comment at: clang/lib/AST/ODRHash.cpp:479-480 + + llvm::copy_if(D->attrs(), std::back_inserter(HashableAttrs), + [](const Attr *A) { return !A->isImplicit(); }); +} ---------------- aaron.ballman wrote: > ChuanqiXu wrote: > > aaron.ballman wrote: > > > ChuanqiXu wrote: > > > > aaron.ballman wrote: > > > > > ChuanqiXu wrote: > > > > > > vsapsai wrote: > > > > > > > vsapsai wrote: > > > > > > > > erichkeane wrote: > > > > > > > > > aaron.ballman wrote: > > > > > > > > > > ChuanqiXu wrote: > > > > > > > > > > > vsapsai wrote: > > > > > > > > > > > > I'm not sure `isImplicit` is the best indicator of > > > > > > > > > > > > attributes to check, so suggestions in this area are > > > > > > > > > > > > welcome. I think we can start strict and relax some of > > > > > > > > > > > > the checks if needed. > > > > > > > > > > > > > > > > > > > > > > > > If people have strong opinions that some attributes > > > > > > > > > > > > shouldn't be ignored, we can add them to the tests to > > > > > > > > > > > > avoid regressions. Personally, I believe that alignment > > > > > > > > > > > > and packed attributes should never be silently ignored. > > > > > > > > > > > Agreed. I feel `isImplicit` is enough for now. > > > > > > > > > > The tricky part is -- sometimes certain attributes add > > > > > > > > > > additional implicit attributes and those implicit > > > > > > > > > > attributes matter > > > > > > > > > > (https://github.com/llvm/llvm-project/blob/main/clang/lib/Sema/SemaDeclAttr.cpp#L9380). > > > > > > > > > > And some attributes seem to just do the wrong thing > > > > > > > > > > entirely: > > > > > > > > > > https://github.com/llvm/llvm-project/blob/main/clang/lib/Sema/SemaDeclAttr.cpp#L7344 > > > > > > > > > > > > > > > > > > > > So I think `isImplicit()` is a good approximation, but I'm > > > > > > > > > > more wondering what the principle is for whether an > > > > > > > > > > attribute should or should not be considered part of the > > > > > > > > > > ODR hash. Type attributes, attributes that impact object > > > > > > > > > > layout, etc all seem like they should almost certainly be > > > > > > > > > > part of ODR hashing. Others are a bit more questionable > > > > > > > > > > though. > > > > > > > > > > > > > > > > > > > > I think this is something that may need a per-attribute > > > > > > > > > > flag in Attr.td so attributes can opt in or out of it > > > > > > > > > > because I'm not certain what ODR issues could stem from > > > > > > > > > > `[[maybe_unused]]` or `[[deprecated]]` disagreements across > > > > > > > > > > module boundaries. > > > > > > > > > I don't think 'isImplicit' is particularly good. I think the > > > > > > > > > idea of auto-adding 'type' attributes and having the 'rest' > > > > > > > > > be analyzed to figure out which are important. > > > > > > > > > > > > > > > > > > Alternatively, I wonder if we're better off just adding ALL > > > > > > > > > attributes and seeing what the fallout is. We can later > > > > > > > > > decide when we don't want them to be ODR significant (which, > > > > > > > > > might be OTHERWISE meaningful later!). > > > > > > > > One criteria to decide which attributes should be hashed is if > > > > > > > > they affect IRGen. But that's not exhaustive and I'm not sure > > > > > > > > how practical it is. > > > > > > > > > > > > > > > > The rule I'm trying to follow right now is if declarations with > > > > > > > > different attributes can be substituted instead of each other. > > > > > > > > For example, structs with different memory layout cannot be > > > > > > > > interchanged, so it is reasonable to reject them. > > > > > > > > > > > > > > > > But maybe we should review attributes on case-by-case basis. > > > > > > > > For example, for `[[deprecated]]` I think the best for > > > > > > > > developers is not to complain about it but merge the attributes > > > > > > > > and then have regular diagnostic about using a deprecated > > > > > > > > entity. > > > > > > > One option was to hash `isa<InheritedAttr>` attributes as they > > > > > > > are "sticky" and should be more significant, so shouldn't be > > > > > > > ignored. But I don't think this argument is particularly > > > > > > > convincing. > > > > > > > > > > > > > > At this stage I think we can still afford to add all attributes > > > > > > > and exclude some as-needed because modules adoption is still > > > > > > > limited. Though I don't have strong feelings about this approach, > > > > > > > just getting more restrictive later can be hard. > > > > > > It looks not a bad idea to add all attributes as experiments, > > > > > > @vsapsai how do you feel about this? > > > > > My intuition is that we're going to want this to be controlled from > > > > > Attr.td on a case-by-case basis and to automatically generate the ODR > > > > > hash code for attribute arguments. > > > > > > > > > > We can either force every attribute to decide explicitly (this seems > > > > > pretty disruptive as a final design but would be a really good way to > > > > > ensure we audit all attributes if done as an experiment), or we can > > > > > pick a default behavior to ODR hash/not. I suspect we're going to > > > > > want to pick a default behavior based on whatever the audit tells us > > > > > the most common situation is. > > > > > > > > > > I think we're going to need something a bit more nuanced than "this > > > > > attribute matters for ODR" because there are attribute arguments that > > > > > don't always contribute to the entity (for example, we have "fake" > > > > > arguments that are used to give the semantic attribute more > > > > > information, there may also be cases where one argument matter and > > > > > another argument does not such as `enable_if` where the condition > > > > > matters greatly but the message doesn't matter much). So we might > > > > > need a marking on the attribute level *and* on the parameter level to > > > > > determine what all factors into attribute identity for generating the > > > > > ODR hashing code. Hopefully we can avoid needing more granularity > > > > > than that. > > > > I feel the default behavior would be to do ODR hashes. I just took a > > > > quick look in Attr.td. And I feel most of them affecting the code > > > > generation. For example, there're a lot of attributes which is hardware > > > > related. > > > > > > > > > because there are attribute arguments that don't always contribute to > > > > > the entity > > > > > > > > When you're talking about "entities", I'm not sure if you're talking > > > > the "entities" in the language spec or a general abstract ones. I mean, > > > > for example, the `always_inline` attribute doesn't contribute to the > > > > declaration from the language perspective. But it'll be super odd if we > > > > lost it. So I feel it may be better to change the requirement to be > > > > "affecting code generation" > > > > > > > > Also, to be honest, I am even not sure if we should take "affecting > > > > code generation " as the requirement. For example, for the > > > > `preferred_name` attribute, this is used for better printing names. It > > > > doesn't affect code generation nor the entity you mentioned. But if we > > > > drop it, then the printing name may not be what users want. I'm not > > > > sure if this is desired. > > > > I feel the default behavior would be to do ODR hashes. I just took a > > > > quick look in Attr.td. And I feel most of them affecting the code > > > > generation. For example, there're a lot of attributes which is hardware > > > > related. > > > > > > Hmmm, I'm seeing a reasonably even split. We definitely have a ton of > > > attributes like multiversioning, interrupts, calling conventions, > > > availability, etc that I think all should be part of an ODR hash. We also > > > have a ton of other ones that don't seem like they should matter for ODR > > > hashing like hot/cold, consumed/returns retained, constructor/destructor, > > > deprecated, nodiscard, maybe_unused, etc. > > > > > > Then we have the really fun ones where I think we'll want them to be in > > > the ODR hash but maybe we don't, like `[[clang::annotate]]`, restrict, > > > etc. > > > > > > So I think we really do need an actual audit of the attributes to decide > > > what the default should be (if there should even be one). > > > > > > > When you're talking about "entities", I'm not sure if you're talking > > > > the "entities" in the language spec or a general abstract ones. I mean, > > > > for example, the always_inline attribute doesn't contribute to the > > > > declaration from the language perspective. But it'll be super odd if we > > > > lost it. So I feel it may be better to change the requirement to be > > > > "affecting code generation" > > > > > > Not in a language spec meaning -- I meant mostly that there's some notion > > > of identify for the one definition rule and not all attribute arguments > > > contribute to that identity the same way. > > > > > > I don't know that "affecting code gen" really captures the whole of it. > > > For example, whether a function is/is not marked hot or cold affects > > > codegen, but really has nothing to do with the identity of the function > > > (it's the same function whether the hint is there or not). > > > `preferred_name` is another example -- do we really think a structure > > > with that attribute is fundamentally a different thing than one without > > > that attribute? I don't think so. > > > > > > To me, it's more about what cross-TU behaviors you'll get if the > > > attribute is only present on one side. Things like ABI tags, calling > > > conventions, attributes that behave like qualifiers, etc all contribute > > > to the identity of something where a mismatch will impact the correctness > > > of the program. But things like optimization hints and diagnostic hints > > > don't seem like they contribute to the identity of something because they > > > don't impact correctness. > > I got your point. But from my developing experience, I want to say the > > optimization hints matters too. And I am wondering how about checking the > > ODR for all attributes. It would emit error if the attributes that > > affecting the correctness mismatches. And if the other attributes > > mismatches, a warning enough. (If we worry about the compile time > > performance, I feel it won't take too long. And we can have an option to > > control it actually.) > > I got your point. But from my developing experience, I want to say the > > optimization hints matters too. > > That's why I'd like to have an idea about what our policy is. To me, ODR > hashing should be restricted to finding ODR violations. It sounds like you'd > like ODR hashing to also optionally find other kinds of differences that > aren't necessarily ODR violations but still surprises nonetheless. I think > that's reasonable, but I think we should ensure the diagnostics distinguish > between "this is definitely UB" and "this might not have the optimization or > diagnostic properties you expect but is still correct". > > > And I am wondering how about checking the ODR for all attributes. It would > > emit error if the attributes that affecting the correctness mismatches. And > > if the other attributes mismatches, a warning enough. (If we worry about > > the compile time performance, I feel it won't take too long. And we can > > have an option to control it actually.) > > I don't think we should ODR-check all attributes blindly. There's no ODR > violation for one function having `[[nodiscard]]` in a TU and another > function not having it. In fact, attributes are sometimes used additively so > that you can put extra constraints locally that aren't there globally. e.g., > I've seen real code doing: > ``` > // Header.h > int some_func(); > > // Source.cpp > #include "Header.h" > > // We've had too many bugs from people ignoring the results, > // which really matters in this particular file. Redeclare the API > // with extra diagnostic checking for this TU. > [[nodiscard]] int some_func(); > > ... > ``` > It shouldn't cause ODR violation diagnostics if `some_func()` is defined in > another source file without the attribute. Oh, yeah, you're right. Let's try to make the policy clearly when we started to it further. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D135472/new/ https://reviews.llvm.org/D135472 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits