Re: Adding file descriptor attribute(s) to gcc and glibc
The 07/13/2022 12:55, David Malcolm wrote:
> On Wed, 2022-07-13 at 16:01 +0200, Florian Weimer wrote:
> > * David Malcolm:
>
> GCC trunk's -fanalyzer implements the new warnings via a state machine
> for file-descriptor values; it currently has rules for handling "open",
> "close", "read", and "write", and these functions are currently hard-
> coded inside the analyzer.
>
> Here are some examples on Compiler Explorer of what it can/cannot
> detect:
>   https://godbolt.org/z/nqPadvM4f
>
> Probably the most important one IMHO is the leak detection.

nice.

> Would it be helpful to have some kind of attribute for "returns a new
> open FD"?  Are there other ways to close a FD other than calling
> "close" on it?  (Would converting that to some kind of "closes"
> attribute be a good idea?)

dup2(oldfd, newfd)
dup3(oldfd, newfd, flags)

closes newfd (and also opens it to be a dup of oldfd)
unless the call fails.

close_range(first, last, flags)

fclose(fdopen(fd, mode))

but users can write all sorts of wrappers around close too.

> > Are there any other "magic" values for file-descriptors we should be
> > aware of?

mmap may require fd==-1 for anonymous maps.
Creating a wrapper around a function at compile time
Hello,

I'm looking for some help on how to create a new function at compile time /
link time. The idea is an alternative form of constant propagation.

The current implementation of ipa-cp may specialize functions for which
arguments may be known at compile time. Call graph edges from the caller to
the new specialized functions will replace the old call graph edges from
the caller to the original functions. Call graph edges which have no known
compile time constants will still point to the original unspecialized
function.

I would like to explore a different approach to function specialization.
Instead of only specializing functions which are guaranteed to have a
compile time constant, I would like to also attempt to specialize the edges
which do not have compile time constants with a parameter test. In other
words, for call graph edges with non-constant arguments at compile time,
create a wrapper function around the original function and do a switch
statement around parameters.

For example, let's say we have a function mul, which multiplies two
integers.

int
mul (int a, int b) {
  return a * b;
}

Function mul is called from three different callsites in the whole program:

A: mul (a, 2);
B: mul (b, 4);
C: mul (c, d);

At the moment, ipa-cp might specialize mul into 3 different versions:

// unoptimized original mul
int
mul (int a, int b) {
  return a * b;
}

// optimized for b = 2
int
mul.constprop1 (int a) {
  // DEBUG b => 2
  return a << 1;
}

// optimized for b = 4
int
mul.constprop2 (int a) {
  // DEBUG b => 4
  return a << 2;
}

and change the callsites to:

A: mul.constprop1 (a);
B: mul.constprop2 (b);
C: mul (c, d);

I would like instead to do the following:

Create a function mul_test_param:

int
mul_test_param (int a, int b) {
  switch (b)
    {
    case 2:
      return mul.constprop1 (a);
    case 4:
      return mul.constprop2 (a);
    default:
      return mul (a, b);
    }
}

The function mul_test_param will test each parameter and then call the
specialized function.
The callsites can either be changed to:

A: mul.constprop1 (a);
B: mul.constprop2 (b);
C: mul_test_param (c, d);

or

A: mul_test_param (a, 2);
B: mul_test_param (b, 4);
C: mul_test_param (c, d);

The idea is that there exists some class of functions for which the
parameter test plus the specialized version is less expensive than the
original function. And if, at runtime, d is a quasi-constant with a good
likelihood of being either 2 or 4, then it makes sense to have this
parameter test.

This is very similar to the tests used for making indirect functions
direct, and to what could be done in value profiling.

I already know how to achieve most of this, but I have never created a
function from scratch. That is the bit that is challenging to me at the
moment. Any help is appreciated.

Thanks!

-Erick
Re: Question about speculative make_edge_direct_to_target during LTO/IPA_PASS
Hi Martin, thanks a lot for your help! You were right! I am now able to call make_edge_direct_to_target during WPA. -Erick
Re: Creating a wrapper around a function at compile time
On Thu, Jul 14, 2022 at 2:35 PM Erick Ochoa via Gcc wrote: > > Hello, > > I'm looking for some help in how to create a new function at compile time / > link time. The idea is an alternative form of constant propagation. > > The current implementation of ipa-cp, may specialize functions for which > arguments may be known at compile time. Call graph edges from the caller to > the new specialized functions will replace the old call graph edges from > the caller to the original functions. Call graph edges which have no known > compile time constants will still point to the original unspecialized > function. > > I would like to explore a different approach to function specialization. > Instead of only specializing functions which are guaranteed to have a > compile time constant, I would like to also attempt to specialize the edges > which do not have compile time constants with a parameter test. In other > words, for call graph edges with non-constant arguments at compile time, > create a wrapper function around the original function and do a switch > statement around parameters. > > For example, let's say we have a function mul, which multiplies two > integers. 
> > int > mul (int a, int b) { > return a * b; > } > > Function mul is called from three different callsites in the whole program: > > A: mul (a, 2); > B: mul (b, 4); > C: mul (c, d); > > At the moment, ipa-cp might specialize mul into 3 different versions: > > // unoptimized original mul > int > mul (int a, int b) { > return a * b; > } > > // optimized for b = 2; > int > mul.constprop1 (int a) { > // DEBUG b => 2 > return a << 1; > } > > // optimized for b = 4; > int > mul.constprop2 (int a) { > // DEBUG b => 4 > return a << 2; > } > > and change the callsites to: > > A: mul.constprop1 (a); > B: mul.constprop2 (b); > C: mul (c, d); > > I would like instead to do the following: > > Create a function mul_test_param > > int > mul_test_param (int a, int b) { > switch (b) > { > case 2: > return mul.constprop1 (a); > break; > case 4: > return mul.constprop2 (a); > break; > default: > return mul (a, b); > break; > } > } > > The function mul_test_param will test each parameter and then call the > specialized function. The callsites can either be changed to: > > A: mul.constprop1 (a); > B: mul.constprop2 (b); > C: mul_test_param (c, d); > > or > > A: mul_test_param (a, 2); > B: mul_test_param (b, 4); > C: mul_test_param (c, d); > > The idea is that there exist some class of functions for which the > parameter test and the specialized version is less expensive than the > original function version. And if, at runtime, d might be a quasi-constant > with a good likelihood of being either 2 or 4, then it makes sense to have > this parameter test. > > This is very similar to function tests for making direct to indirect > functions and to what could be done in value profiling. > > I already know how to achieve most of this, but I have never created a > function from scratch. That is the bit that is challenging to me at the > moment. Any help is appreciated. 
So instead of wrapping the function, why not transform the original function
to have a prologue doing a runtime check for the compile-time specialized
versions and perform tail-calls to them?

What I'm missing is who would call mul_test_param in your case?

> Thanks!
>
> -Erick
Re: Creating a wrapper around a function at compile time
On Thu, Jul 14, 2022 at 3:27 PM Richard Biener wrote: > > On Thu, Jul 14, 2022 at 2:35 PM Erick Ochoa via Gcc wrote: > > > > Hello, > > > > I'm looking for some help in how to create a new function at compile time / > > link time. The idea is an alternative form of constant propagation. > > > > The current implementation of ipa-cp, may specialize functions for which > > arguments may be known at compile time. Call graph edges from the caller to > > the new specialized functions will replace the old call graph edges from > > the caller to the original functions. Call graph edges which have no known > > compile time constants will still point to the original unspecialized > > function. > > > > I would like to explore a different approach to function specialization. > > Instead of only specializing functions which are guaranteed to have a > > compile time constant, I would like to also attempt to specialize the edges > > which do not have compile time constants with a parameter test. In other > > words, for call graph edges with non-constant arguments at compile time, > > create a wrapper function around the original function and do a switch > > statement around parameters. > > > > For example, let's say we have a function mul, which multiplies two > > integers. 
> > > > int > > mul (int a, int b) { > > return a * b; > > } > > > > Function mul is called from three different callsites in the whole program: > > > > A: mul (a, 2); > > B: mul (b, 4); > > C: mul (c, d); > > > > At the moment, ipa-cp might specialize mul into 3 different versions: > > > > // unoptimized original mul > > int > > mul (int a, int b) { > > return a * b; > > } > > > > // optimized for b = 2; > > int > > mul.constprop1 (int a) { > > // DEBUG b => 2 > > return a << 1; > > } > > > > // optimized for b = 4; > > int > > mul.constprop2 (int a) { > > // DEBUG b => 4 > > return a << 2; > > } > > > > and change the callsites to: > > > > A: mul.constprop1 (a); > > B: mul.constprop2 (b); > > C: mul (c, d); > > > > I would like instead to do the following: > > > > Create a function mul_test_param > > > > int > > mul_test_param (int a, int b) { > > switch (b) > > { > > case 2: > > return mul.constprop1 (a); > > break; > > case 4: > > return mul.constprop2 (a); > > break; > > default: > > return mul (a, b); > > break; > > } > > } > > > > The function mul_test_param will test each parameter and then call the > > specialized function. The callsites can either be changed to: > > > > A: mul.constprop1 (a); > > B: mul.constprop2 (b); > > C: mul_test_param (c, d); > > > > or > > > > A: mul_test_param (a, 2); > > B: mul_test_param (b, 4); > > C: mul_test_param (c, d); > > > > The idea is that there exist some class of functions for which the > > parameter test and the specialized version is less expensive than the > > original function version. And if, at runtime, d might be a quasi-constant > > with a good likelihood of being either 2 or 4, then it makes sense to have > > this parameter test. > > > > This is very similar to function tests for making direct to indirect > > functions and to what could be done in value profiling. > > > > I already know how to achieve most of this, but I have never created a > > function from scratch. 
That is the bit that is challenging to me at the
> > moment. Any help is appreciated.
>
> So instead of wrapping the function why not transform the original function
> to have a prologue doing a runtime check for the compile-time specialized
> versions and perform tail-calls to them?
>
> What I'm missing is who would call mul_test_param in your case?

Following your variant more closely would be doing value profiling
of function parameters and then "speculative IPA-CP".

Richard.

> > Thanks!
> >
> > -Erick
Re: Creating a wrapper around a function at compile time
Hi Richard,

> So instead of wrapping the function why not transform the original function
> to have a prologue doing a runtime check for the compile-time specialized
> versions and perform tail-calls to them?
>
> What I'm missing is who would call mul_test_param in your case?

The idea is that all callsites to mul may call mul_test_param instead, or
only those for which there is no known compile time constant. If we had some
more information about the distribution of the parameter values at runtime,
then we could even choose not to use the wrapper.

> Following your variant more closely would be doing value profiling
> of function parameters and then "speculative IPA-CP".

Yes, the idea of doing value profiling on function parameters certainly fits
this very well. I refrained from calling it "speculative IPA-CP" and instead
called it specialization with a parameter test, but the idea I think is the
same. However, the main difference between "speculative IPA-CP" and this idea
(if I understand correctly how speculative IPA-CP works) is that instead of
changing the callsite C to the following version:

C: mul_test_param (c, d);

speculative IPA-CP will have the transformation

C: if (a == 2) mul.constprop1 (a)
   else if (a == 4) mul.constprop2 (a)
   else mul (a, b)

which, depending on how many non-compile-time-constant callsites there are,
will increase the size of the binary. That's why I prefer the wrapper around
the function. But this would essentially be inlining the wrapper.

As of now, the only "speculative IPA-CP" that I've seen is the speculation on
the target of the indirect edge. I could look into exactly how this mechanism
works to try to instead speculate on an arbitrary argument. Do you think that
would be a good first step instead of creating a wrapper and replacing the
edges to the wrapper?
Re: Creating a wrapper around a function at compile time
On Thu, Jul 14, 2022 at 3:42 PM Erick Ochoa wrote: > > Hi Richard, > > >> > >> > So instead of wrapping the function why not transform the original function >> > to have a prologue doing a runtime check for the compile-time specialized >> > versions and perform tail-calls to them? >> > >> > What I'm missing is who would call mul_test_param in your case? > > > The idea is that all callsites to mul may call instead mul_test_param or only > those for which there is no known compile time constant. If we had some more > information about the distribution of the parameter values at runtime, then > we could even choose not to use the wrapper. > >> >> Following your variant more closely would be doing value profiling >> of function parameters and then "speculative IPA-CP". > > > Yes, the idea of doing value profiling on function parameters certainly fits > this very well. I refrained from calling it "speculative IPA-CP" and instead > called it specialization with a parameter test but the idea I think is the > same. However, the main difference between "speculative IPA-CP" and this idea > is that (if I understand correctly how speculative IPA-CP works) is that > instead of changing the callsite C to the following version: > > C: mul_test_param (c, d); > > speculative IPA-CP will have the transformation > > C: if (a == 2) mul.constprop1 (a) > else if (a == 4) mul.constprop2 (a) > else mul (a, b) > > Which depending on how many non compile-time constant callsites there are, > will increase the size of the binary. That's why I prefer the wrapper around > the function. But this would be essentially inlining the wrapper. With a value-profile it would be per call site and IPA-CP can use that to direct the cloning. I'm not sure how many values a value histogram can track reliably here. > > As of now, the only "speculative IPA-CP" that I've seen is the speculation on > the target of the indirect edge. 
I could look into exactly how this mechanism > works to try to instead speculate on an arbitrary argument. Do you think that > would be a good first step instead of creating a wrapper and replacing the > edges to the wrapper?
Re: Creating a wrapper around a function at compile time
On Thu, 14 Jul 2022 at 15:51, Richard Biener wrote:

> With a value-profile it would be per call site and IPA-CP can use that
> to direct the cloning. I'm not sure how many
> values a value histogram can track reliably here.

Last time I checked, value profiling can only track a single value per
statement. So that would need to be augmented if we wanted to specialize for
more than one parameter. However, this can also be a bad idea: even if
parameters a and b are quasi-constants, perhaps the cross product a x b
follows a more uniform distribution. This means that optimizing on a or on b
might be a good idea while optimizing on a x b is a bad idea. (There is
still some work to be done defining when specializing is a good idea or
not.)

Also, I do not believe that value profiling profiles arguments? According to
the comments, only values regarding division and indirect/virtual call
specialization are tracked during value profiling.

However, even if value profiling would be a great addition to this, I would
like to explore the transformation independently of value profiling. There
are some heuristics that could be used, for example looking at the callsites
which do have constant arguments and which optimizations are enabled by
those constants.

You mention that "IPA-CP can use that to direct the cloning". Can you
elaborate a bit further? I guess the idea here is augmenting ipa-cp with a
list of "likely values" and extending the infrastructure which deals with
speculation to support arguments. Am I following correctly? Any other
suggestions?
Re: Creating a wrapper around a function at compile time
On 7/14/22 16:08, Erick Ochoa via Gcc wrote:
> Last time I checked, value profiling can only track a single value per
> statement.

Hi.

Take a look at HIST_TYPE_INDIR_CALL, which we use for tracking at maximum 32
values (#define GCOV_TOPN_MAXIMUM_TRACKED_VALUES 32) and which we use for
indirect call speculation; see for instance:

./gcc/testsuite/g++.dg/tree-prof/indir-call-prof.C

Cheers,
Martin
Re: Creating a wrapper around a function at compile time
On Thu, 14 Jul 2022 at 16:10, Martin Liška wrote:

> Take a look at HIST_TYPE_INDIR_CALL which we use for tracking at maximum 32
> (#define GCOV_TOPN_MAXIMUM_TRACKED_VALUES 32) and we use for indirect call
> speculative calls which you can see for instance here:
>
> ./gcc/testsuite/g++.dg/tree-prof/indir-call-prof.C

Thanks Martin, I'll give it a read. However, I have misspoken. If my
understanding is correct, multiple values are tracked, but only the values
of a single variable/expression per statement. That means that for a gcall
(which is a single statement, and which has n argument expressions), I
believe the naive way to track all argument expressions is not possible
without extending how histograms are associated with statements.

Perhaps canonicalizing how callsites work (i.e., only variables are allowed
as arguments in call sites, and then associating a histogram with the
definition of each variable used in a call site) would be enough, but I
haven't given much thought to the consequences that might follow from this.
Re: Adding file descriptor attribute(s) to gcc and glibc
On Thu, 2022-07-14 at 09:30 +0100, Szabolcs Nagy wrote:
> The 07/13/2022 12:55, David Malcolm wrote:
> > On Wed, 2022-07-13 at 16:01 +0200, Florian Weimer wrote:
> > > * David Malcolm:
> >
> > GCC trunk's -fanalyzer implements the new warnings via a state
> > machine
> > for file-descriptor values; it currently has rules for handling
> > "open",
> > "close", "read", and "write", and these functions are currently hard-
> > coded inside the analyzer.
> >
> > Here are some examples on Compiler Explorer of what it can/cannot
> > detect:
> >   https://godbolt.org/z/nqPadvM4f
> >
> > Probably the most important one IMHO is the leak detection.
>
> nice.
>
> > Would it be helpful to have some kind of attribute for "returns a new
> > open FD"?  Are there other ways to close a FD other than calling
> > "close" on it?  (Would converting that to some kind of "closes"
> > attribute be a good idea?)

Thanks, lots of good ideas here; I've filed various RFEs about these; I'm
posting the details below for reference.

> dup2(oldfd, newfd)
> dup3(oldfd, newfd, flags)
>
> closes newfd (and also opens it to be a dup of oldfd)
> unless the call fails.

dup and friends probably need special-casing; I've filed this as:
  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106298

> close_range(first, last, flags)

close_range and closefrom would need special-casing, already filed as:
  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106283

> fclose(fdopen(fd, mode))

The analyzer now attempts to track both file descriptors and stdio streams,
so we probably need to special-case fdopen to capture the various possible
interactions between these two leak detectors; I've filed this as:
  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106299
with an implementation idea there.

> but users can write all sorts of wrappers around close too.

Yeah. If the -fanalyzer leak detectors see a value "escape" into unknown
code, then they don't report leaks; see e.g.:
  https://godbolt.org/z/n8fMhGTP5
where we don't report in test_2 about fd leaking due to the call to:
  might_close (fd);
which is extern, and so we conservatively assume that fd doesn't leak.

> > > Are there any other "magic" values for file-descriptors we should be
> > > aware of?
>
> mmap may require fd==-1 for anonymous maps.

mmap is its own can of worms, which I've filed as:
  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106301

You also reminded me that we need to track other ways in which the user
could obtain an fd that could leak, which I've filed as:
  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106300
(covering creat, pipe and friends, dup and friends, fcntl, and socket).

I've added all of these to the top-level RFE for -fanalyzer tracking file
descriptors:
  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=analyzer-fd
which is now a tracker-bug:
  https://gcc.gnu.org/bugzilla/showdependencytree.cgi?id=analyzer-fd

Thanks again for the ideas,
Dave
Re: [PATCH] analyzer PR 106003
On Thu, 2022-06-23 at 19:20 -0400, David Malcolm wrote:
> On Fri, 2022-06-24 at 00:00 +0530, Mir Immad wrote:

[...snip...]

> > +
> > +enum access_mode
> > +fd_state_machine::get_access_mode_from_flag (int flag) const
> > +{
> > +  /* FIXME: this code assumes the access modes on the host and
> > +     target are the same, which in practice might not be the
> > +     case.  */
>
> Thanks for moving this into a subroutine.
>
> Joseph says we should introduce a target hook for this:
>   https://gcc.gnu.org/pipermail/gcc/2022-June/238961.html
>
> You can see an example of adding a target hook in commit
> 8cdcea51c0fd753e6a652c9b236e91b3a6e0911c.
>
> As noted above, I think we can leave adding the target hook until a
> followup patch, if this is only going to be an issue with cross-
> compilation between Hurd and non-Hurd systems.

FWIW, for reference, I've filed
  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106302
about this; it strikes me that there are other flags we might eventually
want to test, e.g. for mmap we'd want to look at MAP_ANONYMOUS and the
various PROT_* values.

Dave
RISC-V: Public review for Non-ISA Specification: psABI
We are delighted to announce the start of the public review period for
psABI. The review period begins today, 2022/07/14, and ends on 2022/08/29
(inclusive).

This Non-ISA specification is described in the PDF available at:
  https://github.com/riscv-non-isa/riscv-elf-psabi-doc/releases/tag/v1.0-rc3
which was generated from the source available in the following GitHub repo:
  https://github.com/riscv-non-isa/riscv-elf-psabi-doc

To respond to the public review, please either email comments to the public
isa-dev (isa-...@groups.riscv.org) mailing list or add issues and/or pull
requests (PRs) to the GitHub repo
(https://github.com/riscv-non-isa/riscv-elf-psabi-doc). We welcome all input
and appreciate your time and effort in helping us by reviewing the
specification.

During the public review period, corrections, comments, and suggestions will
be gathered for review by the psABI Task Group. Any minor corrections and/or
uncontroversial changes will be incorporated into the specification. Any
remaining issues or proposed changes will be addressed in the public review
summary report. If there are no issues that require incompatible changes to
the public review specification, the psABI Task Group will recommend the
updated specifications be approved and ratified by the RISC-V Technical
Steering Committee and the RISC-V Board of Directors.

Thanks to all the contributors for all their hard work.

Kito Cheng
Proposal: allow to extend C++ template argument deduction via plugins
Hi,

As far as I understand the currently available plugin extension points, it
is not possible to modify the template argument deduction algorithm (except
for the theoretical possibility of completely overriding the parsing step).
However, such an opportunity might be beneficial for projects like libpqxx:
for example, when the database schema and query text are available at
compile time, the return types of the query might be inferred by the plugin.

I propose to add something like a PLUGIN_FUNCTION_CALL plugin_event which
will allow plugins to modify function calls conditionally. Would a patch
adding such functionality be welcome?

Thanks,
Dan Klishch
Re: Adding file descriptor attribute(s) to gcc and glibc
On 7/14/22 08:22, David Malcolm via Libc-alpha wrote:
> The analyzer now attempts to track both file descriptors and stdio
> streams, so we probably need to special-case fdopen

You probably also need to special-case fileno, which goes the opposite
direction. If you're handling DIR *, fdopendir and dirfd would need similar
treatment.
gcc-10-20220714 is now available
Snapshot gcc-10-20220714 is now available on
  https://gcc.gnu.org/pub/gcc/snapshots/10-20220714/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 10 git branch
with the following options:
  git://gcc.gnu.org/git/gcc.git branch releases/gcc-10
  revision b0bb0a8d75295b7d36c5f0a979dea77adb307579

You'll find:

 gcc-10-20220714.tar.xz    Complete GCC

  SHA256=b0ceec19ecf32d1e2a31734904149a49ad5b4741ada904500a3f4cb3870c203c
  SHA1=16f3f007c23eb903b9066c4e5beca546eb5ca8c2

Diffs from 10-20220707 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-10
link is updated and a message is sent to the gcc list. Please do not use
a snapshot before it has been announced that way.
Re: [PATCH Linux] powerpc: add documentation for HWCAPs
Finally got some details about the icache snoop question so just coming back
to this now, sorry for the delay... (POWER10 does support the coherent
icache flush sequence as expected; there were some updates to the UM wording
but that will be fixed).

Excerpts from Segher Boessenkool's message of May 25, 2022 3:38 am:
> Hi!
>
> On Tue, May 24, 2022 at 07:38:28PM +1000, Nicholas Piggin wrote:
>> Thanks for all the comments and corrections. It should be nearing the
>> point where it is useful now. Yes I do think it would be useful to align
>> this more with OpenPOWER docs (and possibly eventually move it into the
>> ABI, given that's the allocator of these numbers) but that's not
>> done yet.
>
> The auxiliary vector is a Linux/glibc thing, it should not be described
> in more generic ABI documents.  It is fine where you have it now afaics.

It is already in the ABI document. In fact that (not the kernel) had been
the allocator of the feature numbers, at least in the past I think.

>> +Where software relies on a feature described by a HWCAP, it should check the
>> +relevant HWCAP flag to verify that the feature is present before attempting to
>> +make use of the feature.
>> +
>> +Features should not be probed through other means. When a feature is not
>> +available, attempting to use it may result in unpredictable behaviour, and
>> +may not be guaranteed to result in any reliable indication that the feature
>> +is unavailable.
>
> Traditionally VMX was tested for by simply executing an instruction and
> catching SIGILL.  This is portable even.  This has worked fine for over
> two decades, it's a bit weird to declare this a forbidden practice
> now :-)

The statement does not override architectural specification, so if an
encoding does not exist then it should cause a trap and SIGILL.
I suppose in theory we could work around performance or correctness issues
in an implementation by clearing HWCAP even if the hardware does execute the
instruction, so I would still say testing HWCAP is preferred.

> It certainly isn't recommended for more complex and/or newer things.
>
>> +verstions.
>
> (typo.  spellcheck maybe?)

Thanks,
Nick
Re: [PATCH Linux] powerpc: add documentation for HWCAPs
Excerpts from Segher Boessenkool's message of May 25, 2022 4:32 am:
> On Tue, May 24, 2022 at 11:52:00AM +0200, Florian Weimer wrote:
>> * Nicholas Piggin:
>>
>>> +2. Facilities
>>> +-------------
>>> +
>>> +The Power ISA uses the term "facility" to describe a class of instructions,
>>> +registers, interrupts, etc. The presence or absence of a facility indicates
>>> +whether this class is available to be used, but the specifics depend on the
>>> +ISA version. For example, if the VSX facility is available, the VSX
>>> +instructions that can be used differ between the v3.0B and v3.1B ISA
>>> +verstions.
>>
>> The 2.07 ISA manual also has categories. ISA 3.0 made a lot of things
>> mandatory. It may make sense to clarify that feature bits for mandatory
>> aspects of the ISA are still set, to help with backwards compatibility.
>
> Linux runs on ISA 1.xx and ISA 2.01 machines still. "Category" wasn't
> invented for either yet either, but similar concepts did exist of
> course. Not sure what to say about this.

It now also has "Compliancy Subset", although maybe that's more like a set
of features rather than incompatible features or modes such as some of the
category stuff seems to be. I'll try to add something.

Thanks,
Nick
[PATCH v2] powerpc: add documentation for HWCAPs
Take the arm64 HWCAP documentation file and adjust it for powerpc.

Signed-off-by: Nicholas Piggin
---
Thanks for the comments.

v2:
- Addition of "categories" paragraph.
- Change "features should not be probed via other means" to "HWCAP is
  preferred".
- Speling fix.

 Documentation/powerpc/elf_hwcaps.rst | 234 +++
 1 file changed, 234 insertions(+)
 create mode 100644 Documentation/powerpc/elf_hwcaps.rst

diff --git a/Documentation/powerpc/elf_hwcaps.rst b/Documentation/powerpc/elf_hwcaps.rst
new file mode 100644
index ..5f28687d54f3
--- /dev/null
+++ b/Documentation/powerpc/elf_hwcaps.rst
@@ -0,0 +1,234 @@
+.. _elf_hwcaps_index:
+
+==================
+POWERPC ELF HWCAPs
+==================
+
+This document describes the usage and semantics of the powerpc ELF HWCAPs.
+
+
+1. Introduction
+---------------
+
+Some hardware or software features are only available on some CPU
+implementations, and/or with certain kernel configurations, but have no other
+discovery mechanism available to userspace code. The kernel exposes the
+presence of these features to userspace through a set of flags called HWCAPs,
+exposed in the auxiliary vector.
+
+Userspace software can test for features by acquiring the AT_HWCAP or
+AT_HWCAP2 entry of the auxiliary vector, and testing whether the relevant
+flags are set, e.g.::
+
+    bool floating_point_is_present(void)
+    {
+        unsigned long HWCAPs = getauxval(AT_HWCAP);
+        if (HWCAPs & PPC_FEATURE_HAS_FPU)
+            return true;
+
+        return false;
+    }
+
+Where software relies on a feature described by a HWCAP, it should check the
+relevant HWCAP flag to verify that the feature is present before attempting to
+make use of the feature.
+
+HWCAP is the preferred method to test for the presence of a feature rather
+than probing through other means, which may not be reliable or may cause
+unpredictable behaviour.
+
+Software that targets a particular platform does not necessarily have to
+test for required or implied features. For example if the program requires
+FPU, VMX, VSX, it is not necessary to test those HWCAPs, and it may be
+impossible to do so if the compiler generates code requiring those features.
+
+2. Facilities
+-------------
+
+The Power ISA uses the term "facility" to describe a class of instructions,
+registers, interrupts, etc. The presence or absence of a facility indicates
+whether this class is available to be used, but the specifics depend on the
+ISA version. For example, if the VSX facility is available, the VSX
+instructions that can be used differ between the v3.0B and v3.1B ISA
+versions.
+
+3. Categories
+-------------
+
+The Power ISA before v3.0 uses the term "category" to describe certain
+classes of instructions and operating modes which may be optional or
+mutually exclusive, the exact meaning of the HWCAP flag may depend on
+context, e.g., the presence of the BOOKE feature implies that the server
+category is not implemented.
+
+4. HWCAP allocation
+-------------------
+
+HWCAPs are allocated as described in Power Architecture 64-Bit ELF V2 ABI
+Specification (which will be reflected in the kernel's uapi headers).
+
+5. The HWCAPs exposed in AT_HWCAP
+---------------------------------
+
+PPC_FEATURE_32
+    32-bit CPU
+
+PPC_FEATURE_64
+    64-bit CPU (userspace may be running in 32-bit mode).
+
+PPC_FEATURE_601_INSTR
+    The processor is PowerPC 601.
+    Unused in the kernel since:
+      f0ed73f3fa2c ("powerpc: Remove PowerPC 601")
+
+PPC_FEATURE_HAS_ALTIVEC
+    Vector (aka Altivec, VMX) facility is available.
+
+PPC_FEATURE_HAS_FPU
+    Floating point facility is available.
+
+PPC_FEATURE_HAS_MMU
+    Memory management unit is present and enabled.
+
+PPC_FEATURE_HAS_4xxMAC
+    The processor is 40x or 44x family.
+
+PPC_FEATURE_UNIFIED_CACHE
+    The processor has a unified L1 cache for instructions and data, as
+    found in NXP e200.
+    Unused in the kernel since:
+      39c8bf2b3cc1 ("powerpc: Retire e200 core (mpc555x processor)")
+
+PPC_FEATURE_HAS_SPE
+    Signal Processing Engine facility is available.
+
+PPC_FEATURE_HAS_EFP_SINGLE
+    Embedded Floating Point single precision operations are available.
+
+PPC_FEATURE_HAS_EFP_DOUBLE
+    Embedded Floating Point double precision operations are available.
+
+PPC_FEATURE_NO_TB
+    The timebase facility (mftb instruction) is not available.
+    This is a 601 specific HWCAP, so if it is known that the processor
+    running is not a 601, via other HWCAPs or other means, it is not
+    required to test this bit before using the timebase.
+    Unused in the kernel since:
+      f0ed73f3fa2c ("powerpc: Remove PowerPC 601")
+
+PPC_FEATURE_POWER4
+    The processor is POWER4 or PPC970/FX/MP.
+    POWER4 support dropped from the kernel since:
+      471d7ff8b51b ("powerpc/64s: Remove POWER4 support")
+
+PPC_FEATURE_POWER5
+    The processor is POWER5.
+
+PPC_FEATURE_POWER5_PLUS
+    The proc