courbet added inline comments.

================
Comment at: clang/include/clang/Basic/DiagnosticGroups.td:1046
 def ThreadSafetyPrecise    : DiagGroup<"thread-safety-precise">;
 def ThreadSafetyReference  : DiagGroup<"thread-safety-reference">;
+def ThreadSafetyReturn     : DiagGroup<"thread-safety-return">;
----------------
aaronpuchert wrote:
> courbet wrote:
> > aaronpuchert wrote:
> > > courbet wrote:
> > > > aaronpuchert wrote:
> > > > > Why not under `-Wthread-safety-reference`, as it's 
> > > > > return-by-reference that you're warning on? This seems too small for 
> > > > > a separate flag to me.
> > > > The main reason is to provide a soft transition period for users: if 
> > > > we put this in `-Wthread-safety-reference`, we'll start breaking the 
> > > > build for people who use `-Werror`, while a separate flag allows a 
> > > > transition period where people opt into the new feature.
> > > Transition flags can end up causing more churn, assuming that we 
> > > eventually want to put this under `-Wthread-safety-reference`, because 
> > > then you have two categories of users:
> > > * those that opted in will eventually have to remove the flag again, and
> > > * those that didn't will get hard errors on updating the compiler at that 
> > > point.
> > > Of course you might argue that the latter case can be prevented by 
> > > carefully reading the release notes, but we know how often that happens.
> > > 
> > > I'd argue that if you're using `-Wthread-safety-reference`, you're 
> > > already opting into warnings on escaping references, and not warning on 
> > > `return` is a false negative.
> > > 
> > > A separate flag would make sense to me if we want to keep it, for example 
> > > because this produces a substantial amount of false positives under some 
> > > circumstances. Did you try this on a larger code base that's using the 
> > > annotations? I could try it on our code, and maybe we can get some 
> > > Googler to test it on theirs, which is also heavily using Thread Safety 
> > > Analysis. (Though I'm not sure whether they use 
> > > `-Wthread-safety-reference`.)
> > I don't have a strong opinion for where the warning should go.  We are 
> > indeed using `-Wthread-safety-reference`, though we're not enabling 
> > `-Werror` on these, so adding more warnings is fine for us.
> > 
> > I've run the check on a small sample of our codebase (which I don't claim 
> > to be representative, I can do a larger analysis if needed). The warnings 
> > are more or less evenly split between missing annotations and actual bugs. 
> > I don't think any of the things I've seen qualify as false positives.
> > 
> > Among the missing annotations, most of the warnings are missing 
> > `ABSL_EXCLUSIVE_LOCKS_REQUIRED` on the function. In a small number of 
> > cases, the pattern is that a variable is lazily initialized under a lock 
> > and then returned by reference:
> > 
> > ```
> > struct LazyImmutableThing {
> >   const Thing& Get() {
> >     {
> >       MutexLock lock(&mutex_);
> >       thing_.Initialize();
> >     }
> >     
> >     return thing_;
> >   }
> >   
> >   Mutex mutex_;
> >   Thing thing_ GUARDED_BY(mutex_);
> > };
> > ```
> > 
> > I consider this to be a missing annotation: since the returned value is 
> > dynamically immutable, the proper fix would be `return 
> > TS_UNCHECKED_READ(thing_)`.
> > 
> > 
> > Most actual bugs are along the lines of:
> > 
> > ```
> > struct S {
> >   T& Get() const {
> >     MutexLock lock(&mutex_);
> >     return obj_;
> >   }
> > 
> >   Mutex mutex_;
> >   T obj_ GUARDED_BY(mutex_);
> > };
> > ```
> > 
> > though some are missing the guard altogether (`T& Get() const { return 
> > obj_; }`).
> > 
> > There are a few possible fixes. In rough order of occurrence:
> >  - Return by value as the copy is not too expensive and memory ordering 
> > does not matter.
> >  - Let the caller take the lock and annotate with 
> > `ABSL_EXCLUSIVE_LOCKS_REQUIRED` when the `Get` method is not called too 
> > often.
> >  - Let `Get` take a callback and process the value under the lock instead 
> > of returning it (when ordering matters).
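> > 
> > For example, the second and third fixes look roughly like this (just a 
> > sketch; `WithObj` and the `FunctionRef` parameter are made up for 
> > illustration):
> > 
> > ```
> > struct S {
> >   // The second fix: the caller is responsible for taking the lock.
> >   T& Get() ABSL_EXCLUSIVE_LOCKS_REQUIRED(mutex_) { return obj_; }
> > 
> >   // The third fix: process the value under the lock instead of returning it.
> >   void WithObj(absl::FunctionRef<void(const T&)> fn) const {
> >     MutexLock lock(&mutex_);
> >     fn(obj_);
> >   }
> > 
> >   mutable Mutex mutex_;
> >   T obj_ GUARDED_BY(mutex_);
> > };
> > ```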
> > 
> > In a small number of cases, the pattern is that a variable is lazily 
> > initialized under a lock and then returned by reference:
> 
> I wonder why that's safe: is the initialization guarded so that it happens 
> only once? Some kind of double-checked locking pattern perhaps? Otherwise it 
> seems that reads could happen in parallel to writes. If it's a checked 
> initialization, then I think the proper way to model this is:
> * The initialization acquires a lock to exclude other initializations running 
> in parallel. Reads cannot happen, because the reference has not yet escaped.
> * After initialization, we essentially acquire an implicit shared lock. This 
> is not tracked as a proper lock, but it doesn't need to be: there are no more 
> writes until the end of lifetime, so nobody will acquire another exclusive 
> lock.
> One could model this by creating a mutex wrapper that can be locked once in 
> exclusive mode, and after that hands out shared locks to everybody who wants 
> one without keeping track. (As this is slightly off-topic, we don't need to 
> discuss this here though.)
> 
> Other than that, this matches what I'm seeing in our code.
> I wonder why that's safe: is the initialization guarded so that it happens 
> only once? Some kind of double-checked locking pattern perhaps? 

Yes, it looks like this:

```
const T& get() const {
  if (!value_set_.load(std::memory_order_acquire)) {
    MutexLock lock(&lock_);
    if (!value_set_.load(std::memory_order_relaxed)) {
      value_ = ComputeValue();
      value_set_.store(true, std::memory_order_release);
    }
  }
  return value_;
}
```

I ended up silencing the return with a comment:

```
// `value_` is set once and for all; it won't change after this function returns.
return ABSL_TS_UNCHECKED_READ(value_);
```

I agree that this is not very principled, but it is simple :)

> One could model this by creating a mutex wrapper that can be locked once in 
> exclusive mode, and after that hands out shared locks to everybody who wants 
> one without keeping track. 

This is a cool idea. If I understand correctly, it does mean that the *caller* 
of `get` has to grab an untracked shared lock?
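
For concreteness, here is roughly how I picture the wrapper (just a sketch: 
`InitOnceMutex` is a made-up name, I'm spelling out the raw Clang attributes 
rather than the Abseil macros, and the underlying lock is an unannotated 
`std::mutex` so that the wrapper's own implementation stays out of the 
analysis):

```
#include <atomic>
#include <mutex>
#include <string>

using T = std::string;  // stand-in for the real value type

// Made-up wrapper: locked exclusively once, around the lazy initialization;
// afterwards it hands out an untracked "shared lock" via a no-op assertion.
class __attribute__((capability("mutex"))) InitOnceMutex {
 public:
  void LockForInit() __attribute__((acquire_capability())) { impl_.lock(); }
  void UnlockAfterInit() __attribute__((release_capability())) { impl_.unlock(); }

  // No-op: once initialized, the guarded data is immutable, so this only
  // tells the analysis that a shared capability is held.
  void AssertInitialized() const __attribute__((assert_shared_capability(this))) {}

 private:
  std::mutex impl_;  // unannotated, so the inline bodies above need no extra annotations
};

struct LazyImmutableThing {
  const T& Get() const {
    if (!value_set_.load(std::memory_order_acquire)) {
      mu_.LockForInit();
      if (!value_set_.load(std::memory_order_relaxed)) {
        value_ = ComputeValue();
        value_set_.store(true, std::memory_order_release);
      }
      mu_.UnlockAfterInit();
    }
    mu_.AssertInitialized();  // the "grab" of the untracked shared lock
    return value_;            // shared lock held, so returning a const ref should be OK
  }

  T ComputeValue() const;

  mutable InitOnceMutex mu_;
  mutable std::atomic<bool> value_set_{false};
  mutable T value_ __attribute__((guarded_by(mu_)));
};
```

In this sketch `Get` does the `AssertInitialized()` call itself right before 
returning; I suppose a caller that keeps the reference around would have to do 
the same.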


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153131/new/

https://reviews.llvm.org/D153131
