dblaikie added a comment. In D135551#3847607 <https://reviews.llvm.org/D135551#3847607>, @aaron.ballman wrote:
> In D135551#3847444 <https://reviews.llvm.org/D135551#3847444>, @dblaikie > wrote: > >> I thought this was settled quite a while ago and enshrined in the style >> guide: https://llvm.org/docs/CodingStandards.html#assert-liberally >> >> `assert(0)` should not be used if something is reachable. We shouldn't have >> a "this violates an invariant, but if you don't have asserts enabled you do >> get some maybe-acceptable behavior". >> >> I feel fairly strongly that any cases of "reachable asserts" should be >> changed to valid error handling or `llvm_report_error` and remaining >> `assert(0)` should be transformed to `llvm_unreachable`. (also, ideally, >> don't use branch-to-`unreachable` where possible, instead assert the >> condition - in cases where the `if` has side effects, sometimes that's the >> nicest way to write it, but might be clearer, if more verbose to use a local >> variable for the condition, then assert that the variable is true (and have >> the requisite "variable might be unused" cast)) > > I would be okay with that, but that's not what this patch was doing -- it was > changing `assert(0)` into an `llvm_unreachable` more mechanically, and I > don't think that's a valid transformation. The key, to me, is not losing the > distinction between "reaching here is a programming mistake that you'd make > during active development" vs "we never expect to reach this patch and want > to optimize accordingly." I don't really think those are different things, though. Violating invariants is UB and there's no discussion to be had about how the program (in a non-asserts build) behaves when those invariants are violated - all bets are off, whether it's assert or unreachable. > `__builtin_unreachable` changes the debugging landscape far too much for me > to want to see folks using it for "reaching here is a programming mistake" > situations, *especially* in RelWithDebInfo builds where optimizations are > enabled and may result in surprising call stacks and time travel debugging. Generally LLVM's pretty hard to fathom in a non-asserts build anyway, right? (that's the first thing any of us do is reproduce with an assertions build that may fail miles away from where a crash occurred because an invariant was violated much earlier) - that `cast` won't crash/will continue on happily in a non-asserts build seems like a much larger hole to debuggability of a non-asserts build than any unreachable? >>> Historically, we've usually used llvm_unreachable for situations where >>> we're saying "this code cannot be reached; if it can, something else has >>> gone seriously wrong." For example, in code like: int foo(SomeEnum E) { >>> switch (E) { case One: return 1; default: return 2; } >>> llvm_unreachable("getting here would be mysterious"); } and we've used >>> assert(0) for situations where we're saying "this code is possible to reach >>> only when there were mistakes elsewhere which broke invariants we were >>> relying on." >> >> I don't think those are different things though - violating invariants is ~= >> something going seriously wrong. >> >> (sorry, I guess I should debate this on the thread instead of here - but I >> think most of the folks on that thread did agree with the LLVM style >> guide/the direction here) >> >> This, I think was also discussed about a decade ago in the LLVM community >> and resulted in r166821/2962d9599e463265edae599285bbc6351f1cc0ef which >> specifically "Suggests llvm_unreachable over assert(0)" and is the policy of >> LLVM - this change is consistent with that policy. > > I don't have the context for those changes in my email either, but regardless > of what we thought ten years ago, we have code in the code base today that > assumes a difference in severity between kinds of unreachable statements so > we need to be careful when correcting mistakes. I think we're still in > agreement that `llvm_unreachable` should be preferred over `assert(0)` in > situations where the code is expected to be impossible to reach. I think > we're also still in agreement that "correct error reporting" is preferred > over `assert(0)`. Where we might still have daylight are the occasional > situations where `assert(0)` gives a better experience -- when the code is > possible to reach but reaching it signifies a developer (not user) mistake > that is plausible to make when doing new development, such as when misusing > the C interface (where there isn't always an error that should be reported > via the API). I think these uses should generally be rare, but I don't think > it's a "never do this" kind of situation. That still seems like something that should be caught by an asserts build and is UB otherwise. If you're developing against LLVM's APIs in a non-assertions build, there's a lot of other invariants that won't be checked (`cast` being a pretty core example) and will make debugging really difficult. > `llvm_unreachable` asserts in debug mode, so it has the nice property we need > when doing new development. But it doesn't convey the distinction in > semantics. Maybe another approach is: `#define developer_bugcheck(msg) > llvm_unreachable(msg)` (or something along those lines). We still get the > assertion in debug mode, we then get the better optimization in release mode, > but we don't lose the semantic information in the code base. It doesn't > really help the RelWithDebInfo case though (where call stacks may be > unrecognizable as a result of time travel optimizations) but maybe that's a > tradeoff worth making? I don't really see the distinction, though - it's a violated invariant, which is a developer bug, whether it's from an assert on unreachable. They're both expressing invariants - not of different strength. "Valid code should never reach this branch" and "valid code should never produce a false result here". I'd really like to avoid what I see as cleanup - replacing one construct (assert(false)) with another (llvm_unreachable) of identical (to me) contractual strength - being slowed because of this distinction. Replacing reachable-unreachables, or reachable-false-assertions with llvm_report_error, to me, is orthogonal to this cleanup. (& for the C API - I think assertions/unreachable are the right thing, not llvm_report_error - I'm sure there's many more ways to violate the C API contract that would result in inexplicable/confusing behavior and that will only ever be protected by assertions, than we might cover by a few llvm_report_errors on the interface & that should be the direction we encourage people to go down - develop with assertions enabled) Basically llvm_report_error ends up/should be a "we couldn't find a better way to do error handling here, but it is really a reachable error we have to do something with" - each one is a bug in the library-ness of LLVM. (or it's used up in a tool implementation (above the library layer) and then it's fine/just a convenient way to error/exit, though probably still isn't so good for diagnostic quality from such tools). If it's not reachable by a correct usage (be it from a command line tool or API) then it should be an assertion, not an llvm_report_error. But I know this issue comes up from time to time and I've yet to figure out a way to come to shared understandings on it :/ so I don't expect my rather narrow and firm definition to necessarily carry the day on this. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D135551/new/ https://reviews.llvm.org/D135551 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits