> -----Original Message-----
> From: Prathamesh Kulkarni <[email protected]>
> Sent: 13 November 2025 09:28
> To: [email protected]; Jan Hubicka <[email protected]>
> Subject: RE: [RFC] Enable time profile function reordering with
> AutoFDO
> 
> 
> > -----Original Message-----
> > From: Prathamesh Kulkarni <[email protected]>
> > Sent: 31 October 2025 00:44
> > To: Prathamesh Kulkarni <[email protected]>; gcc-
> > [email protected]; Jan Hubicka <[email protected]>
> > Subject: RE: [RFC] Enable time profile function reordering with
> > AutoFDO
> >
> >
> >
> > > -----Original Message-----
> > > From: Prathamesh Kulkarni <[email protected]>
> > > Sent: 23 October 2025 10:39
> > > To: [email protected]; Jan Hubicka <[email protected]>
> > > Subject: RE: [RFC] Enable time profile function reordering with
> > > AutoFDO
> > >
> > >
> > > > -----Original Message-----
> > > > From: Prathamesh Kulkarni <[email protected]>
> > > > Sent: 13 October 2025 20:25
> > > > To: Prathamesh Kulkarni <[email protected]>; gcc-
> > > > [email protected]; Jan Hubicka <[email protected]>
> > > > Subject: RE: [RFC] Enable time profile function reordering with
> > > > AutoFDO
> > > >
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Prathamesh Kulkarni <[email protected]>
> > > > > Sent: 06 October 2025 19:41
> > > > > To: [email protected]; Jan Hubicka <[email protected]>
> > > > > > Subject: [RFC] Enable time profile function reordering with
> > > > > > AutoFDO
> > > > >
> > > > >
> > > > > Hi Honza,
> > > > > > The attached patch enables time-profile-based function reordering with
> > > > > > AutoFDO (-fauto-profile -fprofile-reorder-functions) by mapping
> > > > > > timestamps obtained from perf into node->tp_first_run. It is based on
> > > > > > top of Dhruv's sourcefile tracking patch:
> > > > > > https://gcc.gnu.org/pipermail/gcc-patches/2025-September/694800.html
> > > > >
> > > > > > The rationale for doing this is:
> > > > > > (1) GCC already implements time-profile function reordering; the
> > > > > > patch enables it with AutoFDO.
> > > > > > (2) While time-profile ordering is primarily meant for optimizing
> > > > > > startup time, we've also observed good effects on code locality for
> > > > > > large internal workloads.
> > > > > > (3) It is possibly useful for function reordering when accurate
> > > > > > profile annotation is hard with AutoFDO, e.g. if branch samples are
> > > > > > missing (due to the absence of an LBR-like structure).
> > > > >
> > > > > > On the AutoFDO tools side, I have a patch that extends gcov to emit a
> > > > > > 64-bit perf timestamp that records the first execution of a function,
> > > > > > which loosely corresponds to PGO's time_profile counter.
> > > > > > The timestamp is stored adjacent to the head field in the toplevel
> > > > > > function info.
> > > > > > I will post a patch for this shortly on the AutoFDO tools upstream repo.
> > > > >
> > > > > On GCC side, the patch makes the following changes:
> > > > >
> > > > > (1) Changes to auto-profile pass:
> > > > > The patch adds a new field timestamp to function_instance, and
> > > > > populates it in read_function_instance.
> > > > >
> > > > > > It maintains a new timestamp_info_map from timestamp -> <name,
> > > > > > tp_first_run>, which maps timestamps sorted in ascending order to
> > > > > > (1..N), so the lowest timestamp is mapped to 1 and so on. The
> > > > > > rationale for this is that timestamps are 64-bit integers, and we
> > > > > > don't need the full 64-bit range for ordering by tp_first_run.
> > > > >
> > > > > > During annotation, the timestamp associated with a function_instance
> > > > > > is looked up in timestamp_info_map, and the corresponding mapped value
> > > > > > is assigned to node->tp_first_run.
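
To illustrate the timestamp-to-rank mapping described above, here is a minimal standalone sketch. It is not the patch's code: sampled_function, timestamp_info, build_timestamp_info_map and lookup_tp_first_run are hypothetical names, and treating a zero timestamp as "not sampled" is an assumption made only for this example.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for per-function data read from the gcov file.
struct sampled_function {
  std::string name;
  std::uint64_t timestamp;  // 64-bit perf timestamp of first execution
};

// Value stored per timestamp: function name and its compact rank.
struct timestamp_info {
  std::string name;
  int tp_first_run;
};

// std::map iterates keys in ascending order, so one pass over the map
// assigns ranks 1..N from the earliest timestamp to the latest.
std::map<std::uint64_t, timestamp_info>
build_timestamp_info_map(const std::vector<sampled_function> &funcs)
{
  std::map<std::uint64_t, timestamp_info> m;
  for (const auto &f : funcs)
    if (f.timestamp != 0)  // assumption: 0 means no timestamp was recorded
      m.emplace(f.timestamp, timestamp_info{f.name, 0});

  int rank = 0;
  for (auto &kv : m)
    kv.second.tp_first_run = ++rank;
  return m;
}

// Lookup used during annotation; returns 0 when the timestamp is unknown.
int lookup_tp_first_run(const std::map<std::uint64_t, timestamp_info> &m,
                        std::uint64_t timestamp)
{
  auto it = m.find(timestamp);
  return it == m.end() ? 0 : it->second.tp_first_run;
}
```

In this sketch two functions with an identical timestamp would collapse into one entry; how ties are handled in the actual patch is not covered here.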
> > > > >
> > > > > > (2) Handling clones:
> > > > > > Currently, for clones not registered in the call graph before the
> > > > > > auto-profile pass, the tp_first_run field is copied from the original
> > > > > > function when the clone is created.
> > > > > > However, that may not correspond to the actual order of functions.
> > > > >
> > > > > > For example, if we have two profiled clones of foo:
> > > > > > foo.constprop.1, foo.constprop.2
> > > > > >
> > > > > > both will get the same tp_first_run value as foo->tp_first_run, which
> > > > > > might not correspond to time profile order.
> > > > >
> > > > > > To address this, the patch introduces a new IPA pass,
> > > > > > ipa_adjust_tp_first_run, that streams <clone name, tp_first_run> from
> > > > > > timestamp_info_map during LGEN, reads it back during WPA, and sets
> > > > > > each clone's tp_first_run field accordingly.
> > > > > > The pass is placed pretty late (just before locality_cloning); by
> > > > > > that point clones would be registered in the call graph.
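
The effect of the clone adjustment can be sketched roughly as follows. This is only an illustration under stated assumptions: clone_node is a hypothetical, simplified stand-in for a call graph node, the unordered_map stands in for the <clone name, tp_first_run> pairs read back at WPA, and no LTO streaming is modelled.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified stand-in for a call graph node.
struct clone_node {
  std::string asm_name;
  int tp_first_run;  // initially copied from the original function
};

// Overwrite tp_first_run for every node whose name appears in the
// streamed <clone name, tp_first_run> table; other nodes are untouched.
// Returns the number of nodes that were updated.
int adjust_tp_first_run(std::vector<clone_node> &nodes,
                        const std::unordered_map<std::string, int> &streamed)
{
  int updated = 0;
  for (auto &n : nodes)
    {
      auto it = streamed.find(n.asm_name);
      if (it != streamed.end())
        {
          n.tp_first_run = it->second;
          ++updated;
        }
    }
  return updated;
}
```

With a table holding foo.constprop.1 and foo.constprop.2, both clones would get their own tp_first_run instead of the value inherited from foo.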
> > > > >
> > > > > > Dhruv's sourcefile tracking patch already handles LTO privatized
> > > > > > functions.
> > > > > > The patch adds a (temporary) workaround for functions with
> > > > > > mismatched/empty filenames from gcov, to keep them from being dropped
> > > > > > in afdo_annotate_cfg: if get_function_instance_by_decl fails to find
> > > > > > a function_instance with lbasename (DECL_SOURCE_FILE (decl)), it
> > > > > > iterates through all filenames in afdo_string_table.
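
The fallback lookup can be pictured with the sketch below. It is an assumption-laden illustration, not GCC's data structures: function_instance here is a toy struct, instance_key and find_instance are hypothetical names, and the real lookup in auto-profile.cc is keyed differently.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a profiled function record.
struct function_instance {
  std::string name;
};

// Hypothetical key: (function name, source file basename).
using instance_key = std::pair<std::string, std::string>;

// First try the file name derived from the declaration. If that fails,
// fall back to trying every file name known to the profile, so that a
// mismatched or empty file name in gcov does not drop the function.
const function_instance *
find_instance(const std::map<instance_key, function_instance> &instances,
              const std::vector<std::string> &all_filenames,
              const std::string &fn_name,
              const std::string &decl_basename)
{
  auto it = instances.find(instance_key(fn_name, decl_basename));
  if (it != instances.end())
    return &it->second;

  for (const auto &file : all_filenames)
    {
      it = instances.find(instance_key(fn_name, file));
      if (it != instances.end())
        return &it->second;
    }
  return nullptr;
}
```

One consequence of such a fallback is that two functions with the same name in different files can no longer be told apart once the exact match fails, which fits the workaround being described as temporary.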
> > > > >
> > > > > > (3) Grouping profiled functions together in as few partitions as
> > > > > > possible (preferably a single one).
> > > > > > The patch places profiled functions together, in time profile order,
> > > > > > in as few partitions as possible to take better advantage of code
> > > > > > locality.
> > > > > > Unlike PGO, where every instrumented function gets a time profile
> > > > > > counter, with AutoFDO the sampled functions are a fraction of the
> > > > > > total executed ones.
> > > > > > Similarly, in default_function_section, it overrides hot/cold
> > > > > > partitioning so that grouping of profiled functions isn't disrupted.
> > > > >
> > > > > > (4) Option to disable profile-driven opts.
> > > > > > The patch adds the option -fauto-profile-reorder-only, which only
> > > > > > enables time-profile reordering with AutoFDO (and disables
> > > > > > profile-driven opts):
> > > > > > (a) Useful as a debugging aid to isolate a regression to either
> > > > > > function reordering or profile-driven opts.
> > > > > > (b) For our use case, it also seems useful as a stopgap measure to
> > > > > > avoid regressions with AutoFDO profile-driven opts, due to issues
> > > > > > with the quality of profiles obtained by merging SPE and non-SPE
> > > > > > profiles.
> > > > > > We're actively working on resolving this.
> > > > > > (c) Possibly useful for architectures which do not support branch
> > > > > > sampling.
> > > > > > The option is disabled by default.
> > > > >
> > > > > > Ideally, I would like to make it a param (and not a user-facing
> > > > > > option), but I am not able to control enabling/disabling options in
> > > > > > opts.cc:common_handle_option based on a param value; I will
> > > > > > investigate this further.
> > > > >
> > > > > * Results
> > > > >
> > > > > > On one large internal workload, the patch (along with the sourcefile
> > > > > > tracking patch) gives an uplift of 32.63% compared to LTO and 8.07%
> > > > > > compared to LTO + AutoFDO trunk; for another workload it gives an
> > > > > > uplift of 15.31% compared to LTO and 7.76% compared to LTO + AutoFDO
> > > > > > trunk.
> > > > > > I will try benchmarking with SPEC2017.
> > > > >
> > > > > Will be grateful for suggestions on how to proceed further.
> > > > Hi,
> > > > ping: https://gcc.gnu.org/pipermail/gcc-patches/2025-October/696758.html
> > > Hi,
> > > ping * 2: https://gcc.gnu.org/pipermail/gcc-patches/2025-October/696758.html
> > Hi,
> > ping * 3: https://gcc.gnu.org/pipermail/gcc-patches/2025-October/696758.html
> Hi,
> ping * 4: https://gcc.gnu.org/pipermail/gcc-patches/2025-October/696758.html
Hi,
ping * 5: https://gcc.gnu.org/pipermail/gcc-patches/2025-October/696758.html

Thanks,
Prathamesh
> 
> Thanks,
> Prathamesh
> >
> > Thanks,
> > Prathamesh
> > >
> > > Thanks,
> > > Prathamesh
> > > >
> > > > Thanks,
> > > > Prathamesh
> > > > >
> > > > > Signed-off-by: Prathamesh Kulkarni <[email protected]>
> > > > >
> > > > > Thanks,
> > > > > Prathamesh
