On Thu, 2021-01-21 at 20:09 +0100, Jan Hubicka wrote: > > On Thu, 2021-01-14 at 15:00 +0100, Jan Hubicka wrote: > > > > On Wed, Jan 13, 2021 at 11:04 PM David Malcolm via Gcc-patches > > > > <gcc-patches@gcc.gnu.org> wrote: > > > > > gimple.h has this comment for gimple's uid field: > > > > > > > > > > /* UID of this statement. This is used by passes that want > > > > > to > > > > > assign IDs to statements. It must be assigned and used > > > > > by > > > > > each > > > > > pass. By default it should be assumed to contain > > > > > garbage. */ > > > > > unsigned uid; > > > > > > > > > > and gimple_set_uid has: > > > > > > > > > > Please note that this UID property is supposed to be > > > > > undefined > > > > > at > > > > > pass boundaries. This means that a given pass should not > > > > > assume it > > > > > contains any useful value when the pass starts and thus > > > > > can > > > > > set it > > > > > to any value it sees fit. > > > > > > > > > > which suggests that any pass can use the uid field as an > > > > > arbitrary > > > > > scratch space. > > > > > > > > > > PR analyzer/98599 reports a case where this error occurs in > > > > > LTO > > > > > mode: > > > > > fatal error: Cgraph edge statement index out of range > > > > > on certain inputs with -fanalyzer. > > > > > > > > > > The error occurs in the LTRANS phase after -fanalyzer runs in > > > > > the > > > > > WPA phase. The analyzer pass writes to the uid fields of all > > > > > stmts. > > > > > > > > > > The error occurs when LTRANS is streaming callgraph edges > > > > > back > > > > > in. > > > > > If I'm reading things correctly, the LTO format uses stmt > > > > > uids to > > > > > associate call stmts with callgraph edges between WPA and > > > > > LTRANS. > > > > > For example, in lto-cgraph.c, lto_output_edge writes out the > > > > > gimple_uid, and input_edge reads it back in. > > > > > > > > > > Hence IPA passes that touch the uids in WPA need to restore > > > > > them, > > > > > or the stream-in at LTRANS will fail. > > > > > > > > > > Is it intended that the LTO machinery relies on the value of > > > > > the > > > > > uid > > > > > field being preserved during WPA (or, at least, needs to be > > > > > saved > > > > > and > > > > > restored by passes that touch it)? > > > > > > > > I belive this is solely at the cgraph stream out to stream in > > > > boundary but > > > > this may be a blurred area since while we materialize the whole > > > > cgraph > > > > at once the function bodies are streamed in on demand. > > > > > > > > Honza can probably clarify things. > > > > > > Well, the uids are used to associate cgraph edges with > > > statements. At > > > WPA stage you do not have function bodies and thus uids serves > > > role > > > of > > > pointers to the statement. If you load the body in (via > > > get_body) > > > the > > > uids are replaced by pointers and when you stream out uids are > > > recomputed again. > > > > > > When do you touch the uids? At WPA time or from small IPA pass in > > > ltrans? > > > > The analyzer is here in passes.def: > > INSERT_PASSES_AFTER (all_regular_ipa_passes) > > NEXT_PASS (pass_analyzer); > > > > and so in LTO runs as the first regular IPA pass at WPA time, > > when do_whole_program_analysis calls: > > execute_ipa_pass_list (g->get_passes ()->all_regular_ipa_passes); > > > > FWIW I hope to eventually have a way to summarize function bodies > > in > > the analyzer, but I don't yet, so I'm currently brute forcing > > things by > > loading all function bodies at the start of the analyzer (when > > -fanalyzer is enabled). > > > > I wonder if that's messing things up somehow? > > Actually I think it should work. If you do get_body or > get_untransformed_body (that will be equal at that time) you ought to > get ids in symtab datastructure rewritten to pointers and at stream > out > time we should assign new ids... > > Does the stream-out from WPA make any assumptions about the stmt > > uids? > > For example, > > #define STMT_UID_NOT_IN_RANGE(uid) \ > > (gimple_stmt_max_uid (fn) < uid || uid == 0) > > seems to assume that the UIDs are per-function ranges from > > [0-gimple_stmt_max_uid (fn)] > > which isn't the case for the uids set by the analyzer. Maybe > > that's > > the issue here? > > > > Sorry for not being more familiar with the IPA/LTO code > > There is lto_prepare_function_for_streaming which assigns uids to be > incremental. So I guess problem is that it is not called at WPA > time > if function is in memory (since at moment we do not really modify > bodies > at WPA time, however we do stream them in sometimes to icf compare > them > or to update profile). > > So i guess fix would be to arrange lto_prepare_function_for_streaming > to > be called on all functions with body defined before WPA stream-out?
Thanks; I'll have a go at implementing that. Dave > Honza > > Dave > > > > > > > hozna > > > > Note LTO uses this exactly because of this comment to avoid > > > > allocating > > > > extra memory for an 'index' but it could of course leave > > > > gimple_uid > > > > alone > > > > at some extra expense (eventually paid for in generic cgraph > > > > data > > > > structures > > > > and thus for not only the streaming time). > > > > > > > > > On the assumption that this is the case, this patch updates > > > > > the > > > > > comments > > > > > in gimple.h referring to passes being able to set uid to any > > > > > value to > > > > > note the caveat for IPA passes, and it updates the analyzer > > > > > to > > > > > save > > > > > and restore the UIDs, fixing the error. > > > > > > > > > > Successfully bootstrapped & regrtested on x86_64-pc-linux- > > > > > gnu. > > > > > OK for master? > > > > > > > > The analyzer bits are OK, let's see how Honza can clarify the > > > > situation. > > > > > > > > Thanks, > > > > Richard.