On Do, Jun 9 2022 at 13:40:06 -0400, David Malcolm
wrote:
On Thu, 2022-06-09 at 16:49 +0200, Tim Lange wrote:
> On Mi, Jun 8 2022 at 11:12:52 -0400, David Malcolm
wrote:
> > On Wed, 2022-06-08 at 01:42 +0200, Tim Lange wrote:
> >
> > Hi Dave,
Hi Tim; various responses inline below...
> >
> > I did spent some time to think about the zero state machine. I
first
> > thought about distinguishing between "assigned zero" and "EQ 0
> > condition on the path" for cases where e.g. unreachable() is
used
to
> > say that some variable will never be zero according to the
programmer.
> > In that case the dev might not want zero warnings from
conditions
> > outside the if body itself for dev build where unreachable
expands
to
> > some panic exit. But as the condition constraint is not pruned
when the
> > state machine is distinguishing the states, I'm not sure how to
know
> > whether the analysis already left the if body?
>
> The analyzer works on the gimple-ssa representation, which uses
basic
> blocks in a CFG, rather than an AST, so the only remants we have
of
> scope is in "clobber" statements (a special kind of assignment
stmt),
> which the gimplify adds as variables go out of scope.
If the constraints only lived until the immediate dominator of `if
(cond)`, I could easily distinguish:
1. if (x == 0) && still inside the if => zero
2. if (x == 0) && outside if => maybe zero
but as this seems to be not the case, if I want to distinguish 1. &
2.,
I'd have to find another way.
>
> For pruning, the analyzer's state_machine class has a
"can_purge_p"
> virtual function:
>
> /* Return true if it safe to discard the given state (to help
> when simplifying state objects).
> States that need leak detection should return false. */
> virtual bool can_purge_p (state_t s) const = 0;
>
> which should return true for a "zeroness" state machine, in that
we
> always consider pruning states for svalues that aren't needed
anymore
> along a path.
Is implemented and returns true.
>
> Is there some other kind of state explosion you're running into?
It's
> hard to figure this out further without seeing code.
No, my code is by far not that mature to be tested. I just had in my
head that I wanted to find out if I can distinguish the two cases.
>
>
> > Also, while trying out different things, it seems simple
assignments on
> > phi like here
> > int x;
> > if (argc == 1) {
> > x = 1; // x_5
> > } else {
> > x = 0; // x_4
> > }
> > printf("%i", 5 / x); // x_2
> > automatically work such that x_2 already inherits the state
from
> > x_4/x_5 without me doing anything inside my sm's on_phi
function.
Same
> > for the simple y = 0; x = y; case. Where does this happen
inside
the
> > code?
>
> With the caveat that I'm seeing your code, what's probably
happening
is
> that we have either:
>
> BB (a):
> x_5 = 1;
> goto BB (c);
>
> BB (b):
> x_4 = 0;
> goto BB (c);
>
> BB (c):
> x_2 = PHI (x_5 from (a), x_4 from (b));
I compiled it with -g, so this one is like the dumped gimple.
>
> or (at higher optimization levels):
>
> BB (a):
> goto BB (c);
>
> BB (b):
> goto BB (c);
>
> BB (c):
> x_2 = PHI (1 from (a), 0 from (b));
>
> and I think that at the phi node we have
region_model::handle_phi,
> which is setting x_2 to either the constant 1 or the constant 0
in
the
> store, and is calling the on_phi vfunc, leading to on_phi being
called
> for all state machines.
Thanks, that is the case. The set_value inside handle_phi seems to
this
for me.
>
> BTW, are you implementing an override for this vfunc:
> virtual state_machine::state_t get_default_state (const svalue *)
> const;
>
> to capture the inherently known zeroness/nonzeroness of
constant_svalue
> instances? That would make those constants have that state.
Yeah, I saw that on your nullness check. I tried it on a small
example
with and without, but didn't noticed a difference in warnings
(except
for not having zero(x_4) inside the supergraph.dot). So if I
understood
this right, this is just to have one state less for that
variable/value[0]?
The states are stored in sm_state_map using a mapping from svalue to
state.
Given e.g. a parameter, this will be "initial_svalue (parm)", but in
the above case where x_4 either has value 1 or has value 0, it's not
storing a state for x_4; it's storing a state for 0 or a state for 1.
So without implementing the get_default_state vfunc, the sm_state_maps
will gradually aquire explicit "this is zero" or "this is nonzero" for
all of the constants that get used, leading to an explosion of states
for all the different combinations of constants that have been
encountered along an execution path, which is probably not going to be
useful. If you do implement the get_default_state vfunc for
constants,
then the constants will implicitly have