martong added a comment.

I am coping my comments that I've sent in mail, just in case.

> However, in the checker, we don't check `byte offset < 0` directly.
> We //simplify// the left part of the `byte offset < 0` inequality by 
> substracting/dividing both sides with the constant `C`, if the `byte offset` 
> is a `SymintExpr` of `SymExpr OP C` over the plus or multiplication operator 
> (OP).
> We do this //simplification// recursively.
> In the end, we will have a //symbolic root// (call it `RootSym`) expression 
> and a `C'` constant over the right-hand side of the original relational 
> operator.
> So, after the //simplification// process we get the `RootSym < C'` 
> inequality, which we check.

Just for the record, this is in `getSimplifiedOffsets`.

> //Note: I don't know what `RegionRawOffset` is, or what that does.//

Seems like that is just a pair of a `MemRegion` plus a concrete int `Offset`. 
And this is the return type for `ElementRegion::getAsArrayOffset()` where we 
handle only `nonloc::ConcreteInt`s. So, this is similar to RegionRawOffsetV2, 
but without the possibility of the symbolic index value.

> Calculation of the RegionRawOffsetV2
> ------------------------------------
>
> Let's see how does these subscript expressions used during the calculation of 
> the `RegionRawOffsetV2`:
> For multi-dimension arrays we //fold// the index expressions of the nested 
> `ElementRegion`s.
> We are doing that by simply calculating the byte-offset of the nested region, 
> and adding the current level's //index*sizeof(Type)//.
> So, the `ByteOffset` that we build up will contain mostly multiplications by 
> a constant OR additions of symbolic expressions (like the `x+1` in the 
> example).

We have these lines in the case of handling the ElementRegion:

  if (!index.getAs<NonLoc>())
    return RegionRawOffsetV2();

So, if the index is symbolic we give up, don't we? So, how could this work with 
`x+1`? What am I missing here?

>   The resulting `RegionRawOffsetV2` for `p`, `q` will be:
>   p: {BaseRegion: `buf`, ByteOffset: `20 S64b`}
>   q: {BaseRegion: `buf`, ByteOffset: `(((reg_$0<int x>) + 1) * 12) + 24`}
>                                         ^^^^^^^^^^^^^^^^^^^  ^^^^^^^^^^
>                                                      |           |
>   `@a` This is an //object-language// expression. --/            |
>                                                                 /
>   `@b` The rest should be //meta-language// expression. -------/
>
> SPOILER: This distinction is really important here.
>
> So, we made an expression, we should not evaluate in either of the worlds.
> We should not evaluate it using the semantics of the //object-language// 
> since only the `@a` is of that world.
> We should model overflow in `@a` if that expression is of //unsigned// type 
> (OR signed but `-fwrapv`...) etc. according to the //abstract machine//.
> BUT modeling the possibility of an overflow of the rest of the expression 
> (`@b`) would be a flaw.

Why? I'd expect that the value of the whole expression `@a@b` could overflow.

> Simplify step, again
> --------------------
>
>   Simplification of the `(((reg_$0<int x>) + 1) * 12) + 24` < `0` 
> inequality...
>                                                 ^^^^^^^^^^
>                                                    `@b`
>   Result:  `reg_$0<int x> < -3 S64b`
>             ^^^^^^^^^^^^^   ^^^^^^^
>   `RootSym` --/                |
>                               /
>   `C'` ----------------------/
>
> `#1`: This step supposed to fold **ONLY** the //meta-language// expression 
> parts (`@b`) of the previously aquired `ByteOffset`.
> `#2`: This step requires the expression under simplified to be of 
> //meta-language// to be able to reorder the constants. (aka. to fold into the 
> right hand side's constant).
> `#3`: This also requires the right-hand side to be of the //meta-language//.

Do I understand this correctly, that none of the binary operations can have a 
symbolic RHS, because that would mean we have a VLA?

> We treat the complete expression under //simplification// as an expression of 
> the //meta-language//.
> I'm not changing this, but I probably should though.

Again, I don't understand why you are sure that the value of //whole// 
expression cannot overflow.

> Ideally, after //simplification// we should have this inequality: `x+1 < -2`
> That way we would not fold any subexpression of the //object-language//, so 
> the `#1` requirement would be preserved.
> The right-hand side is of an expression of the //meta-language//.
> It makes sense, to evaluate the `operator<` as the semantics of the 
> //meta-language//.
> But the left-hand side is an expression of the //object-language//.
> We need some sort of //logical// conversion here.

What if the second index is also symbolic? E.g `buf[x+1][y+1]`?
This would result in `(((reg_$0<int x>) + 1) * 12) + (reg_$1<int y>) + 1)` < 
`0` . And after simplification, the RHS cannot be interpreted as //meta//.

> Check if the resulting //folded constant// (`C'`) is negative or not.

What happens if the symbolic index is not the most inner index? E.g. 
`buf[6][x+1]`. Then `C` is no longer constant, and no longer //meta// ...


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86874/new/

https://reviews.llvm.org/D86874

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to