Hello. Thank you for the idea. I would like to comment on what GCC can currently do, and I'm curious whether we need anything extra on top of that. Right now, GCC can do hot/cold partitioning at the level of both functions and basic blocks. With a PGO profile, the optimization is quite aggressive and can move a considerable amount of code into the cold partition, where it is optimized for size. Without a profile, we make a static profile guess (predict.c), where we also propagate information about cold blocks (determine_unlikely_bbs). Later, in RTL, we use that information to do the actual reordering (bb-reorder.c).
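To illustrate the no-profile case, here is a minimal sketch of source-level hints that feed the static profile guess. The function names (report_error, scale_input) are made up for this example; the cold attribute and __builtin_expect are existing GCC extensions. How much code actually ends up in the cold partition depends on the target, the GCC version and the optimization level, so take this as an illustration rather than a guaranteed outcome.

```c
#include <stdio.h>

/* Hypothetical error handler. The cold attribute marks the function as
   unlikely to be executed, so GCC optimizes it for size and treats paths
   leading to calls of it as unlikely. */
__attribute__((cold, noinline))
static int report_error(int value)
{
    fprintf(stderr, "unexpected negative input: %d\n", value);
    return -1;
}

/* Hypothetical hot function. __builtin_expect tells the static predictor
   that the negative-input branch is rarely taken, so the block calling
   report_error becomes a candidate for the cold partition. */
int scale_input(int value)
{
    if (__builtin_expect(value < 0, 0))
        return report_error(value);

    return value * 2;
}
```

Compiling this with something like `gcc -O2 -freorder-blocks-and-partition -S` and looking for a `.text.unlikely` section in the assembly output is one way to check what was moved; with -fprofile-use the same decisions are driven by measured counts instead of the hints.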
Martin