NoQ added inline comments.

================
Comment at: clang/lib/Analysis/UnsafeBufferUsage.cpp:2293
+    //      search of connected components.
+    if (!ParmsNeedFix.empty()) {
+      auto First = ParmsNeedFix.begin(), Last = First;
----------------
ziqingluo-90 wrote:
> NoQ wrote:
> > ziqingluo-90 wrote:
> > > NoQ wrote:
> > > > ziqingluo-90 wrote:
> > > > > NoQ wrote:
> > > > > > What about the situation where params aren't seen as unsafe yet, 
> > > > > > but they're discovered to be unsafe later? Eg.
> > > > > > ```
> > > > > > void foo(int *p, int *q) {
> > > > > >   int *p2 = p;
> > > > > >   p2[5] = 7;
> > > > > >   int *q2 = q;
> > > > > >   q2[5] = 7;
> > > > > > }
> > > > > > ```
> > > > > > Will we notice that `p` and `q` need to be fixed simultaneously?
> > > > > > 
> > > > > > ---
> > > > > > 
> > > > > > I suspect that the problem of parameter grouping can't be solved by 
> > > > > > pre-populating strategy implications between all parameters. It'll 
> > > > > > either cause you to implicate all parameters (even the ones that 
> > > > > > will never need fixing), or cause you to miss connections between 
> > > > > > parameters that do need fixing (as the connection is discovered 
> > > > > > later).
> > > > > > 
> > > > > > Connection between parameters needs to be added *after* the 
> > > > > > implications algorithm has finished. And once it's there (might be 
> > > > > > already there if I missed something), then this part of code 
> > > > > > probably becomes unnecessary.
> > > > > Yes.  They will be noticed.  Here is why:
> > > > > 
> > > > > The code here first forms a ring over `p` and  `q` in the assignment 
> > > > > (directed) graph:  
> > > > > ```
> > > > > p <--> q
> > > > > ```
> > > > > Then the two initialization statements (implemented in [[ 
> > > > > https://reviews.llvm.org/D150489 | D150489 ]]) add two more edges to 
> > > > > the graph:
> > > > > ```
> > > > > p2 --> p <--> q <-- q2
> > > > > ```
> > > > > The algorithm later searches the graph starting from unsafe variables 
> > > > > `p2` and `q2`, respectively,  Starting from `p2`, the algorithm 
> > > > > discovers that `p2` and `p`, as well as `p` and `q` depend on each 
> > > > > other.  Starting from `q2`,  the algorithm discovers that `q2` and 
> > > > > `q`, as well as `p` and `q` depend on each other.   The dependency 
> > > > > graph is sort of undirected. So eventually, the four variables `p2`, 
> > > > > `p`, `q`, `q2` are in the same group.
> > > > > 
> > > > > The code here first forms a ring over `p` and  `q` in the assignment 
> > > > > (directed) graph
> > > > 
> > > > I don't think it does. The ring is formed over the `ParamsNeedFix` 
> > > > list, which is empty because none of these parameters are listed in 
> > > > `UnsafeOps.byVar`.
> > > Actually `p` and `q` will be in `ParamsNeedFix`.  Because they are 
> > > implicated by `p2` and `q2`, who are unsafe variables.   `ParamsNeedFix` 
> > > include parameters that need fix, either because they are used in unsafe 
> > > operations or they are implicated by some Gadgets.
> > > 
> > > But anyway, I now think it's not a good idea to lump parameters with 
> > > variables grouped by implication, as you pointed out that variable 
> > > implication also implies bounds propagation. 
> > > 
> > > Instead, I think it might be more appropriate to make a variable group 
> > > structured:  
> > >   - Let's call it a //connected components// for a set of variables that 
> > > need to be fixed together and sharing bounds information. (The term 
> > > //connected components// has been used in the code to refer to a group of 
> > > variables. It keeps its meaning here.)
> > >   - Let's redefine a Variable Group to be a set of mutually disjoint 
> > > //connected components//s (CCs).  If there are more than one CCs in a 
> > > variable group,  each of the CCs contains at least one parameter.
> > > 
> > > For generating fix-its for a variable group, this patch still works as it 
> > > only requires that variables in a variable group are either all fixed or 
> > > given up as a whole.
> > > For generating diagnostic messages, we now have more information about 
> > > relations among grouped variables.
> > > Actually `p` and `q` will be in ParamsNeedFix. Because they are 
> > > implicated by `p2` and `q2`, who are unsafe variables.  `ParamsNeedFix` 
> > > include parameters that need fix, either because they are used in unsafe 
> > > operations or they are implicated by some Gadgets.
> > 
> > Ohh, right, it's populated twice, I even commented on the other part but 
> > didn't notice it populates the same list!
> > 
> > But in this case you'll probably join `p` and `q` in
> > ```
> > void foo(int *p, int *q) {
> >   int *p2 = p;
> >   p2[5] = 7;
> >   int *q2 = q;
> > }
> > ```
> > right? (In this case `q` is implicated by `q2` so it gets added to the 
> > list, but `q2` itself isn't unsafe or implicated by anything, so it doesn't 
> > need fixing, but connecting it to `p` would cause it to be fixed 
> > unnecessarily). So I still think this can't be correct unless you do it 
> > after the crawler finishes.
> > 
> > > But anyway, I now think it's not a good idea to lump parameters with 
> > > variables grouped by implication, as you pointed out that variable 
> > > implication also implies bounds propagation.
> > 
> > Yeah, I think it makes a lot more sense. Just look at variable groups 
> > already built by the existing crawler, and if more than one of them 
> > contains parameters, report all groups that contain parameters through one 
> > combined warning.
> Though I'm going to remove this part,  achieving some agreement on this 
> algorithm may help me understand the problem better.
> 
> > right? (In this case q is implicated by q2 so it gets added to the list, 
> > but q2 itself isn't unsafe or implicated by anything, so it doesn't need 
> > fixing, but connecting it to p would cause it to be fixed unnecessarily)
> Right.  the code in this patch does not form `ParmsNeedFix` correctly.  
> `ParmsNeedFix` should be the set of parameters that are either unsafe or 
> being implicated (transitively) by unsafe variables.   Then the group would 
> be "correct" I think.
> 
> 
Aha ok, whew, gotcha, makes sense then!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153059/new/

https://reviews.llvm.org/D153059

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to