steakhal wrote:

> So for me either solution would work:
> a) remove strlen() as a propagator and note it in the checker doc
> b) remove malloc() as a sink and note it in the checker doc
> c) don't do anything and live with the false positives

TBH I would prefer (b). I see removing the whole `MsgTaintedBufferSize` is a 
bit too wild. And `alloca` with unconstrained tainted size should be still 
detected.
(c) is definitely not an option, assuming that there were honest motivations 
behind this patch.



> By the way, could you show an OOBV2 true positive that involves taint 
> propagated by `strlen` call? My impression was that the "index is tainted" 
> errors that involve `strlen` are typically false positives where the analyzer 
> cannot deduce that `s[strlen(s)]` is valid and e.g. `s[strlen(s)-1]` is also 
> valid when the string is nonempty.

I'm sure I've seen it in the Juliet suite. I believe something similar must 
have been the motivation for me proposing `wcslen` as a taint propagator 
because it was missing from the taint-dataflow. Since the test cases are 
generated, and there are `char` counterparts, I'd bet there would be a missing 
taint-flow there as well, if we didn't propagate on `strlen`.
I also understand that the Juliet is an artificial benchmark, that might or 
might not represent real-world scenarios.



> > Consequently, I agree with the raised problems, but I disagree with the 
> > approach. I would rather remove the `MsgTaintedBufferSize` diagnostic to 
> > resolve those FPs. Alternatively, we can also think of creating a heuristic 
> > to reduce such FPs. For e.g. check if the most significant bit of the 
> > allocation size is proven to be unset (aka. checked some meaningful 
> > upperbounds) and suppress reports in that case, but report otherwise. Would 
> > it be okay with you to proceed by not removing taint propagation?
> 
> I agree that `MsgTaintedBufferSize` reports should be limited to cases where 
> the tainted value _can be large_ and I would suggest a threshold that's 
> significantly lower than the SIZE_MAX/2 implied by your "most significant 
> bit" suggestion (ideally the threshold value would be configurable). However, 
> I think this eliminates the false positives related to `strlen` only if we 
> also add some logic that can reliably "tag" the return value of `strlen` with 
> reasonable upper bounds.

Yea, upperbounding `strlen` result I'm not sure if it really makes sense, just 
to fulfill the heuristic imposed on `malloc`.

Let's discuss then how much benefit warning on tainted malloc allocations 
provide.
If it turns out the tainted malloc does not really work, we should just remove 
that from sinks.

https://github.com/llvm/llvm-project/pull/66086
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to