steakhal wrote: > So for me either solution would work: > a) remove strlen() as a propagator and note it in the checker doc > b) remove malloc() as a sink and note it in the checker doc > c) don't do anything and live with the false positives
TBH I would prefer (b). I see removing the whole `MsgTaintedBufferSize` is a bit too wild. And `alloca` with unconstrained tainted size should be still detected. (c) is definitely not an option, assuming that there were honest motivations behind this patch. > By the way, could you show an OOBV2 true positive that involves taint > propagated by `strlen` call? My impression was that the "index is tainted" > errors that involve `strlen` are typically false positives where the analyzer > cannot deduce that `s[strlen(s)]` is valid and e.g. `s[strlen(s)-1]` is also > valid when the string is nonempty. I'm sure I've seen it in the Juliet suite. I believe something similar must have been the motivation for me proposing `wcslen` as a taint propagator because it was missing from the taint-dataflow. Since the test cases are generated, and there are `char` counterparts, I'd bet there would be a missing taint-flow there as well, if we didn't propagate on `strlen`. I also understand that the Juliet is an artificial benchmark, that might or might not represent real-world scenarios. > > Consequently, I agree with the raised problems, but I disagree with the > > approach. I would rather remove the `MsgTaintedBufferSize` diagnostic to > > resolve those FPs. Alternatively, we can also think of creating a heuristic > > to reduce such FPs. For e.g. check if the most significant bit of the > > allocation size is proven to be unset (aka. checked some meaningful > > upperbounds) and suppress reports in that case, but report otherwise. Would > > it be okay with you to proceed by not removing taint propagation? > > I agree that `MsgTaintedBufferSize` reports should be limited to cases where > the tainted value _can be large_ and I would suggest a threshold that's > significantly lower than the SIZE_MAX/2 implied by your "most significant > bit" suggestion (ideally the threshold value would be configurable). However, > I think this eliminates the false positives related to `strlen` only if we > also add some logic that can reliably "tag" the return value of `strlen` with > reasonable upper bounds. Yea, upperbounding `strlen` result I'm not sure if it really makes sense, just to fulfill the heuristic imposed on `malloc`. Let's discuss then how much benefit warning on tainted malloc allocations provide. If it turns out the tainted malloc does not really work, we should just remove that from sinks. https://github.com/llvm/llvm-project/pull/66086 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits