martong added a comment. I am coping my comments that I've sent in mail, just in case.
> However, in the checker, we don't check `byte offset < 0` directly. > We //simplify// the left part of the `byte offset < 0` inequality by > substracting/dividing both sides with the constant `C`, if the `byte offset` > is a `SymintExpr` of `SymExpr OP C` over the plus or multiplication operator > (OP). > We do this //simplification// recursively. > In the end, we will have a //symbolic root// (call it `RootSym`) expression > and a `C'` constant over the right-hand side of the original relational > operator. > So, after the //simplification// process we get the `RootSym < C'` > inequality, which we check. Just for the record, this is in `getSimplifiedOffsets`. > //Note: I don't know what `RegionRawOffset` is, or what that does.// Seems like that is just a pair of a `MemRegion` plus a concrete int `Offset`. And this is the return type for `ElementRegion::getAsArrayOffset()` where we handle only `nonloc::ConcreteInt`s. So, this is similar to RegionRawOffsetV2, but without the possibility of the symbolic index value. > Calculation of the RegionRawOffsetV2 > ------------------------------------ > > Let's see how does these subscript expressions used during the calculation of > the `RegionRawOffsetV2`: > For multi-dimension arrays we //fold// the index expressions of the nested > `ElementRegion`s. > We are doing that by simply calculating the byte-offset of the nested region, > and adding the current level's //index*sizeof(Type)//. > So, the `ByteOffset` that we build up will contain mostly multiplications by > a constant OR additions of symbolic expressions (like the `x+1` in the > example). We have these lines in the case of handling the ElementRegion: if (!index.getAs<NonLoc>()) return RegionRawOffsetV2(); So, if the index is symbolic we give up, don't we? So, how could this work with `x+1`? What am I missing here? > The resulting `RegionRawOffsetV2` for `p`, `q` will be: > p: {BaseRegion: `buf`, ByteOffset: `20 S64b`} > q: {BaseRegion: `buf`, ByteOffset: `(((reg_$0<int x>) + 1) * 12) + 24`} > ^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^ > | | > `@a` This is an //object-language// expression. --/ | > / > `@b` The rest should be //meta-language// expression. -------/ > > SPOILER: This distinction is really important here. > > So, we made an expression, we should not evaluate in either of the worlds. > We should not evaluate it using the semantics of the //object-language// > since only the `@a` is of that world. > We should model overflow in `@a` if that expression is of //unsigned// type > (OR signed but `-fwrapv`...) etc. according to the //abstract machine//. > BUT modeling the possibility of an overflow of the rest of the expression > (`@b`) would be a flaw. Why? I'd expect that the value of the whole expression `@a@b` could overflow. > Simplify step, again > -------------------- > > Simplification of the `(((reg_$0<int x>) + 1) * 12) + 24` < `0` > inequality... > ^^^^^^^^^^ > `@b` > Result: `reg_$0<int x> < -3 S64b` > ^^^^^^^^^^^^^ ^^^^^^^ > `RootSym` --/ | > / > `C'` ----------------------/ > > `#1`: This step supposed to fold **ONLY** the //meta-language// expression > parts (`@b`) of the previously aquired `ByteOffset`. > `#2`: This step requires the expression under simplified to be of > //meta-language// to be able to reorder the constants. (aka. to fold into the > right hand side's constant). > `#3`: This also requires the right-hand side to be of the //meta-language//. Do I understand this correctly, that none of the binary operations can have a symbolic RHS, because that would mean we have a VLA? > We treat the complete expression under //simplification// as an expression of > the //meta-language//. > I'm not changing this, but I probably should though. Again, I don't understand why you are sure that the value of //whole// expression cannot overflow. > Ideally, after //simplification// we should have this inequality: `x+1 < -2` > That way we would not fold any subexpression of the //object-language//, so > the `#1` requirement would be preserved. > The right-hand side is of an expression of the //meta-language//. > It makes sense, to evaluate the `operator<` as the semantics of the > //meta-language//. > But the left-hand side is an expression of the //object-language//. > We need some sort of //logical// conversion here. What if the second index is also symbolic? E.g `buf[x+1][y+1]`? This would result in `(((reg_$0<int x>) + 1) * 12) + (reg_$1<int y>) + 1)` < `0` . And after simplification, the RHS cannot be interpreted as //meta//. > Check if the resulting //folded constant// (`C'`) is negative or not. What happens if the symbolic index is not the most inner index? E.g. `buf[6][x+1]`. Then `C` is no longer constant, and no longer //meta// ... Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D86874/new/ https://reviews.llvm.org/D86874 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits