NoQ added inline comments.
================ Comment at: lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:494 + SymbolManager &SM = C.getSymbolManager(); + return SM.getDerivedSymbol(Sym, LCV.getRegion()); } ---------------- vlad.tsyrklevich wrote: > NoQ wrote: > > vlad.tsyrklevich wrote: > > > NoQ wrote: > > > > I'd think about this a bit more and come back. > > > > > > > > I need to understand how come that constructing a symbol manually is > > > > the right thing to do; that doesn't happen very often, but it seems > > > > correct here. > > > Indeed it is odd. The best justification I could come up with: LCVs are > > > meant to be optimizations, their 'purpose' is to expose an SVal that > > > hides SymbolRef values so that we can have a split store. We don't have > > > to copy all of a compound values SymbolRef mappings because LCVs are kept > > > distinct. Hence to set/query/constrain region values you use SVals so > > > that RegionStore can differentiate between LCVs and SymbolRef backed > > > SVals for the two different stores it contains. > > > > > > The taint interface however requires you taint a SymbolRef, not an SVal. > > > If we wanted, instead of doing this logic here, we could change > > > getPointedToSymbol() to return an SVal and update usages of it > > > accordingly since that value is only passed on to > > > ProgramState.isTainted()/ProgramState.addTaint() anyway. Then we could > > > update addTaint/isTainted to perform this logic, hiding it from the > > > checker. > > > > > > This still requires manually constructing a symbol, now it's just > > > performed in the analyzer instead of in a checker. Not sure if that > > > addresses the issue you were considering, but the idea that we need to > > > 'undo' the LCV optimization hiding the SymbolRef to have a value to taint > > > seems somewhat convincing to me. What do you think? > > Hmm (!) I suggest adding a new function to the program state, that we'd > > call `addPartialTaint()` or something like that, and this function would > > accept a symbol and a region and would act identically to passing a derived > > symbol (from this symbol and that region) to `addTaint()` (but we wouldn't > > need to actually construct a derived symbol here). > > > > Such API would be easier to understand and use than the current approach > > that forces the user to construct a derived symbol manually in the checker > > code. Unfortunately, this checker's `getLCVSymbol()` would become a bit > > more complicated (having various return types depending on circumstances), > > but this misfortune seems more random than systematic to me. > > > > Since we're having this new kind of partial taint, why don't we expose it > > in the API. > I'm happy to implement it this way, but figured I'd ask why you prefer this > approach first in the interest of keeping the TaintChecker simple! The > optimal approach to me seems to be changing `getPointedToSymbol()` to > `getPointedToSVal()` and having `addTaint(SVal)` call `addPartialTaint()` > when it's passed an LCV sub-region. That way users of the taint interface > like the TaintChecker have a clean way to add & check regardless of whether > it's a SymbolRef or an LCV but the partial taint functionality is still > exposed and documented for those who might want to use it in new ways. > > Just curious to understand your rationale. Thanks for the feedback! Your idea actually looks good to me! I'd approve going this way. With this change to `addTaint(SVal)`, i suspect it'd need some extra documentation to explain what it does now. ================ Comment at: lib/StaticAnalyzer/Core/ProgramState.cpp:676 + if (const SymbolDerived *SD = dyn_cast<SymbolDerived>(Sym)) { + TaintedSymRegionsRef SymRegions(0, TSRFactory.getTreeFactory()); + ---------------- The `SymRegions` name is a bit confusing because we often shorten `SymbolicRegion` to `SymRegion` (eg. in dumps), which is not what we mean. ================ Comment at: lib/StaticAnalyzer/Core/ProgramState.cpp:682 + + SymRegions = SymRegions.add(SD->getRegion()); + NewState = NewState->set<DerivedSymTaint>(SD->getParentSymbol(), SymRegions); ---------------- I wonder if it's worth it to check if a super-region of this region is already tainted, and avoid adding the region in this scenario. I guess in practice it won't happen very often, because this code would most likely be executed just once per taint source. This probably deserves a comment though. https://reviews.llvm.org/D30909 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits