Hi!
On Tue, Aug 09, 2022 at 08:51:59PM +0800, Kewen.Lin wrote:
> on 2022/8/9 18:35, Segher Boessenkool wrote:
> >> + /* As ELFv2 ABI shows, the allowable bytes past the global entry
> >> + point are 0, 4, 8, 16, 32 and 64. Considering there are two
> >> + non-prefixed instructions for global entry (8 bytes), the count
> >> + for patchable NOPs before local entry would be 2, 6 and 14. */
> >
> > The other option is to allow other numbers of nops, but in that case not
> > have a local entry point (so, always use the global entry point).
>
> Good point, it's doable, but it means for the other counts of NOPs, the
> patched function has to pay the cost of TOC initialization all the time,
> IMHO it may not be what we want.

It isn't very expensive: the main benefit of the LEP is not that you
skip those two insns, but that the r2 setter comes earlier, allowing
loads via the TOC reg to execute earlier.

> > I don't know if that is useful for any users of this support (if there
> > even are such users :-P )
>
> Yeah, as the discussions in PR98125, powerpc linux kernel doesn't adopt
> this feature. :-P

Right, -mprofile-kernel is more efficient.

So maybe just say in the comment that it is possible to support those
other nop pad sizes, by not doing a LEP at all?  Instead of saying it
cannot be done :-)
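
For reference, the arithmetic behind the 2, 6 and 14 in the quoted
comment can be sketched standalone.  This is not the rs6000 code; the
function and macro names are made up for illustration, and it only
assumes what the comment says (allowed offsets 0, 4, 8, 16, 32, 64 and
an 8-byte global entry sequence):

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names, for illustration only.  */
#define GLOBAL_ENTRY_BYTES 8	/* two non-prefixed insns */
#define INSN_BYTES 4		/* one nop */

/* Return true if NOPS nops fit between the global and the local entry
   point, i.e. if the local entry then lands on an allowed offset.  */
bool
nops_before_local_entry_ok (unsigned nops)
{
  static const unsigned allowed[] = { 0, 4, 8, 16, 32, 64 };

  /* Anything above 14 nops is past the largest offset (64 bytes);
     rejecting it early also avoids unsigned wraparound below.  */
  if (nops > 14)
    return false;

  unsigned offset = GLOBAL_ENTRY_BYTES + nops * INSN_BYTES;
  for (size_t i = 0; i < sizeof allowed / sizeof allowed[0]; i++)
    if (allowed[i] == offset)
      return true;
  return false;
}
```

Any count this rejects could still be supported by not emitting a local
entry point at all, as said above.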
>
> >
> >> + if (patch_area_entry > 0)
> >> + {
> >> + if (patch_area_entry != 2
> >> + && patch_area_entry != 6
> >> + && patch_area_entry != 14)
> >> + error ("for %<-fpatchable-function-entry=%u,%u%>, patching "
> >> + "%u NOP(s) before function entry is invalid, it can "
> >> + "cause assembler error",
> >
> > I would not say "it can [etc.]" at all. Oh, and "NOP" (capitals) isn't
> > a thing, it is not an acronym or such ;-)
> >
>
> Poor at wording. :( Could you help to suggest some words here?

I'll try...

  "unsupported number of nops before function entry (%u)"

> >> +/* { dg-require-effective-target powerpc_elfv2 } */
> >> +/* Specify -mcpu=power9 to ensure global entry is needed. */
> >> +/* { dg-options "-mdejagnu-cpu=power9" } */
> >
> > Why would it be needed for p9, and not older, or newer?
> >
>
> It can be p8 or p9, but not p10 and later.
>
> It's meant to exclude pc-relative feature which can make the case not
> generate a global entry point prologue and the test point will become
> unavailable. I thought about adding -mno-pcrel, but guessed it's safer
> to use one cpu type which doesn't support pcrel at all, since it can
> exclude all possibilities that pcrel gets re-enabled.
>
> Do you think -mno-pcrel is more elegant and relatively safe?
> Or just update the comments to make it more meaningful?

Just use { ! powerpc_pcrel } ?  I don't think you can put that in a
dg-require-effective-target, but you can do for example
  dg-do compile { target { ! powerpc_pcrel } }
or similar.

Direct things are always much preferred.  There should always be a
comment saying what a non-obvious restriction is for, and it will be
simple and boring then (the code already says that pcrel is not okay,
just add a word or two, "no TOC etc. with pcrel" or whatever :-) )
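
Something along these lines for the test header, as a rough sketch only.
The -fpatchable-function-entry values and the function body are
assumptions for illustration, not taken from the patch:

```c
/* { dg-do compile { target { ! powerpc_pcrel } } } */
/* { dg-require-effective-target powerpc_elfv2 } */
/* With pcrel there is no TOC, so no global entry point prologue.  */
/* Assumed options, for illustration only.  */
/* { dg-options "-fpatchable-function-entry=3,2" } */

int counter;

/* Reading a global should go through the TOC when pcrel is not in use,
   so this function is expected to need a global entry point.  */
int
bump (int b)
{
  return counter + b;
}
```

That way the restriction is stated directly, and the comment says why
it is there.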
Segher