Split-stack support for aarch64
Hi all,

Since BZ#67877 [1] does not have much information on it, I would like to ask what the requirements are for implementing split-stack on aarch64, besides 'feature parity'. I am asking because the PR states the main use is gccgo and, afaik, there is some usage in other programs, but how widespread that usage is is not really convincing (for instance, rust now allows targets to be built without it [2]).

I also noted that Go, since 1.4, does not use split-stack anymore, stating it suffers from a performance issue ("hot stack split"), and some searching on the internet suggests that for 64-bit targets split-stack is not really an efficient way to manage stack growth.

So is this only for gccgo support? Is gccgo stuck at a pre-1.4 version and/or not willing to remove split-stack usage (as Go itself did)?

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67877
[2] https://github.com/rust-lang/rust/issues/16980
Re: [EXTERNAL] Re: How to test aarch64 when building a cross-compiler?
On 25/11/2019 17:28, Andrew Dean via gcc wrote:
>>> This completes successfully. However, when I then try to run the gcc
>>> tests like so:
>>>
>>> runtest --outdir . --tool gcc --srcdir /path/to/gcc/gcc/testsuite
>>> aarch64.exp --target aarch64-linux-gnu --target_board aarch64-sim
>>> --tool_exec
>>> /path_to/build_dir/install/compilers/aarch64-linux-gnu/bin/aarch64-glibc-linux-gnu-gcc
>>> --verbose -v
>>>
>>> I get errors like this:
>>>
>>> aarch64-glibc-linux-gnu-gcc: fatal error: cannot read spec file
>>> 'rdimon.specs': No such file or directory
>>>
>>> I can see that the rdimon.specs flag is added based on this line in
>>> aarch64-sim.exp:
>>
>> Where does aarch64-sim.exp comes from?
>
> /usr/share/dejagnu/baseboards/aarch64-sim.exp
>
>>> set_board_info ldflags "[libgloss_link_flags] [newlib_link_flags] -specs=rdimon.specs"
>>
>> I think this is for baremetal/newlib targets, ie. aarch64-elf, not for
>> aarch64-linux-gnu.
>
> Hmm.. build-many-glibcs.py doesn't like either aarch64-elf or
> aarch64-linux-elf...
> I get a KeyError in build_compilers and build_glibcs when it tries to look up
> the config with either of those values.

Unfortunately, build-many-glibcs.py does not support baremetal builds yet (it is a tool created to build cross-compiling toolchains that use glibc).
[libgomp] Ask for help on an improvement for synchronization overhead
Hi all,

I would like to check if someone could help me figure out an issue I am chasing on a libgomp patch intended to partially address the issue described in BZ#79784 [1].

I have identified that one of the bottlenecks is the global barrier used by both the thread pool and the team, which causes a lot of cache ping-pong on high-core-count machines. And it does not seem to be an aarch64-specific issue, as hinted by the bugzilla.

So the optimization I am implementing, which is similar to what the LLVM openmp implementation does, is to use a per-OMP-thread barrier to synchronize team/task creation. The activation I have implemented so far is a simple linear one, where the master scans linearly over the children threads (LLVM openmp implements some fancier ones that I plan to take a look at as well).

The patch I have come up with so far is quite simple [2] and still requires some polish (documentation, code style, etc.); however, there is one regression that has me scratching my head: cancel-parallel-2. It exercises OpenMP cancellation in an 'omp parallel' construct, and I am failing to understand why the final team barrier (done in gomp_team_end, called by GOMP_parallel_end) is not synchronizing correctly with the team barrier in each OpenMP task.

So any help on the design is appreciated (even if it means I should rethink it for libgomp).

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79784
[2] https://github.com/zatrazz/gcc/tree/azanella/libgomp-scalability
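To make the idea of a per-thread barrier with linear activation concrete, here is a minimal standalone sketch in C11 with pthreads. It is not the libgomp patch and does not use any libgomp internals: the names `struct slot`, `spin_until`, `worker` and `run_rounds`, the constants, and the 64-byte cache line size are all assumptions made for illustration. Each worker spins only on its own padded flag, and the master scans linearly over the workers to release them and then to collect their arrival.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

#define NWORKERS 4      /* hypothetical team size (excluding the master) */
#define ROUNDS 1000     /* number of fork/join rounds to run */
#define CACHELINE 64    /* assumed cache line size */

/* One slot per worker; each flag sits on its own cache line so that a
   worker spinning on its flag does not share a line with the others.  */
struct slot {
  _Alignas(CACHELINE) atomic_uint go;       /* master -> worker */
  _Alignas(CACHELINE) atomic_uint arrived;  /* worker -> master */
};

static struct slot slots[NWORKERS];
static long work_done[NWORKERS];

/* Wait until *flag holds VALUE, yielding the CPU between polls.  */
static void spin_until(atomic_uint *flag, unsigned value)
{
  while (atomic_load_explicit(flag, memory_order_acquire) != value)
    sched_yield();
}

static void *worker(void *arg)
{
  int id = (int)(intptr_t)arg;
  for (unsigned round = 1; round <= ROUNDS; round++) {
    spin_until(&slots[id].go, round);       /* wait to be activated */
    work_done[id]++;                        /* stand-in for real work */
    atomic_store_explicit(&slots[id].arrived, round, memory_order_release);
  }
  return NULL;
}

/* Run ROUNDS fork/join rounds and return the total amount of work done.  */
static long run_rounds(void)
{
  pthread_t threads[NWORKERS];

  for (int i = 0; i < NWORKERS; i++) {
    atomic_store(&slots[i].go, 0);
    atomic_store(&slots[i].arrived, 0);
    work_done[i] = 0;
  }
  for (intptr_t i = 0; i < NWORKERS; i++)
    if (pthread_create(&threads[i], NULL, worker, (void *)i) != 0)
      abort();

  for (unsigned round = 1; round <= ROUNDS; round++) {
    /* Linear activation: the master releases each worker in turn.  */
    for (int i = 0; i < NWORKERS; i++)
      atomic_store_explicit(&slots[i].go, round, memory_order_release);
    /* Linear gather: the master waits for each worker in turn.  */
    for (int i = 0; i < NWORKERS; i++) {
      spin_until(&slots[i].arrived, round);
      assert(work_done[i] == (long)round);
    }
  }

  long total = 0;
  for (int i = 0; i < NWORKERS; i++) {
    pthread_join(threads[i], NULL);
    total += work_done[i];
  }
  return total;
}
```

Calling `run_rounds()` from a program built with `-pthread` should return `NWORKERS * ROUNDS`. The point of the layout is that the only cache lines written by more than one thread are the per-worker flags, each shared between the master and exactly one worker, instead of a single global counter touched by every thread.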
Re: [libgomp] Ask for help on an improvement for synchronization overhead
On 30/04/2020 18:12, Jakub Jelinek wrote:
> On Thu, Apr 30, 2020 at 05:37:26PM -0300, Adhemerval Zanella via Gcc wrote:
>> Hi all, I would like to check if someone could help me figure out
>> an issue I am chasing on a libgomp patch intended to partially
>> address the issue described at BZ#79784.
>>
>> I have identified that one of the bottlenecks is the global barrier
>> used on both thread pool and team which causes a lof of cache ping-pong
>> in high-core count machines. And it seems not be an aarch64 specific
>> issue as hinted by the bugzilla.
>
> This has been a topic of GSoC last year, but the student didn't deliver it
> in usable form and disappeared.
> See e.g. thread with "Work-stealing task scheduling" in subject from
> last year on gcc-patches and other mails on the topic.

In my understanding, what I am working on is not exactly related to OMP tasking, although I see that the global barrier is still an issue for omp task scheduling. What I am trying to optimize in this specific case is the barrier used by gomp_thread_pool in constructs like parallel for, and maybe a per-thread barrier could be extended to other places in libgomp.

> So if you'd have time and motivation to do it properly, it would be greatly
> appreciated.
Re: unnormal Intel 80-bit long doubles and isnanl
On 24/11/2020 10:59, Siddhesh Poyarekar wrote:
> On 11/24/20 7:11 PM, Szabolcs Nagy wrote:
>> ideally fpclassify (and other classification macros) would
>> handle all representations.
>>
>> architecturally invalid or trap representations can be a
>> non-standard class but i think classifying them as FP_NAN
>> would break the least amount of code.
>
> That's my impression too.
>
>>> glibc evaluates the bit pattern of the 80-bit long double and in the
>>> process, ignores the integer bit, i.e. bit 63. As a result, it considers
>>> the unnormal number as a valid long double and isnanl returns 0.
>>
>> i think m68k and x86 are different here.
>>
>>> gcc on the other hand, simply uses the number in a floating point comparison
>>> and uses the parity flag (which indicates an unordered compare, signalling a
>>> NaN) to decide if the number is a NaN. The unnormal numbers behave like
>>> NaNs in this respect, in that they set the parity flag and with
>>> -fsignalling-nans, would result in an invalid-operation exception. As a
>>> result, __builtin_isnanl returns 1 for an unnormal number.
>>
>> compiling isnanl to a quiet fp compare is wrong with
>> -fsignalling-nans: classification is not supposed to
>> signal exceptions for snan.
>
> I agree, but I think that issue with __builtin_isnanl is orthogonal to the
> question about unnormals. Once that is fixed in gcc, we could actually use
> __builtin_isnanl all the time in glibc for isnanl.
>
> Siddhesh

What is the current take from gcc developers on this semantic change of __builtin_isnanl? Do they consider the current __builtin_isnanl behavior for 'unnormal' numbers the expected one and are waiting for glibc to follow it, or are they willing to align with glibc's behavior?
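The two behaviours discussed above can be illustrated on the raw x87 80-bit encoding (15-bit biased exponent, explicit integer bit at bit 63, 63-bit fraction). The following is only a sketch of the bit-level logic as described in this thread, not the actual glibc or gcc code; `struct x87_bits` and all function names are hypothetical, and working on raw fields avoids depending on the host's long double format.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define X87_EXP_MASK  0x7fffu
#define X87_INT_BIT   (UINT64_C(1) << 63)
#define X87_FRAC_MASK (X87_INT_BIT - 1)

/* Hypothetical raw view of an x87 extended-precision value.  */
struct x87_bits {
  uint16_t sign_exp;     /* bit 15: sign, bits 14..0: biased exponent */
  uint64_t significand;  /* bit 63: integer bit, bits 62..0: fraction */
};

/* NaN test that ignores the integer bit, as the thread describes for
   glibc's isnanl: maximum exponent and a nonzero fraction.  */
static bool isnan_ignoring_integer_bit(struct x87_bits v)
{
  return (v.sign_exp & X87_EXP_MASK) == X87_EXP_MASK
         && (v.significand & X87_FRAC_MASK) != 0;
}

/* Encodings with a nonzero exponent but a clear integer bit: unnormals,
   pseudo-infinities and pseudo-NaNs.  */
static bool is_invalid_encoding(struct x87_bits v)
{
  return (v.sign_exp & X87_EXP_MASK) != 0
         && (v.significand & X87_INT_BIT) == 0;
}

/* NaN test that also reports invalid encodings as NaN, which matches
   the result the thread describes for __builtin_isnanl.  */
static bool isnan_treating_invalid_as_nan(struct x87_bits v)
{
  return is_invalid_encoding(v) || isnan_ignoring_integer_bit(v);
}

/* A few hand-built bit patterns showing where the two tests differ.  */
static void check_examples(void)
{
  struct x87_bits one      = { 0x3fff, UINT64_C(0x8000000000000000) };
  struct x87_bits unnormal = { 0x3fff, UINT64_C(0x4000000000000000) };
  struct x87_bits inf      = { 0x7fff, UINT64_C(0x8000000000000000) };
  struct x87_bits qnan     = { 0x7fff, UINT64_C(0xc000000000000000) };

  assert(!isnan_ignoring_integer_bit(one));
  assert(!isnan_treating_invalid_as_nan(one));
  assert(!isnan_ignoring_integer_bit(inf));
  assert(!isnan_treating_invalid_as_nan(inf));
  assert(isnan_ignoring_integer_bit(qnan));
  assert(isnan_treating_invalid_as_nan(qnan));

  /* The disputed case: only the second test reports the unnormal.  */
  assert(!isnan_ignoring_integer_bit(unnormal));
  assert(isnan_treating_invalid_as_nan(unnormal));
}
```

The unnormal is the only pattern here on which the two predicates disagree, which is the discrepancy between isnanl and __builtin_isnanl that the question is about.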
Re: GCC association with the FSF
> On 11 Apr 2021, at 17:45, Alexandre Oliva via Gcc wrote:
>
> Remember how much hate RMS got in glibc land for something I did? I
> said I did it out of my own volition, I explained my why I did it, but
> people wouldn't believe he had nothing to do with it!

It was clear to me and other glibc maintainers that it was *you* who bypassed the consensus to *not* reinstate the “joke”. And there was no hate (at least not from my side), only *disappointment* that you used your status to do it even though most of the senior developers and maintainers said explicitly you shouldn’t do it.
Re: GCC association with the FSF
On Sun, Apr 11, 2021 at 8:06 PM Alexandre Oliva wrote:
>
> On Apr 11, 2021, Adhemerval Zanella wrote:
>
> > It was clear to me and others glibc maintainers that it was *you* who
> > bypass the consensus to *not* reinstate the “joke”.
>
> I think you wrote it backwards: what I did was to revert the commit that
> the person who put it in agreed shouldn't have been made at that point,
> so that the debate about whether or not to install the patch could be
> carried out without the fait accompli. To my surprise, it stopped.
>
> Then, a year or so later, when most of the GNU policies that incided on
> that matter had already been discussed and approved, and they suggested
> (at least to me) that the conclusion was likely that the patch was in
> line with them, some other situation came up that reminded people of the
> patch, it was discussed under the heat of the unrelated situation (which
> I also found inappropriate), but it got applied AFAICT in accordance
> with GNU and GLIBC policies.

RMS briefly stated that he did not want the change to be applied. We considered his input back then, but we decided to remove the joke *regardless* of what he thought about the subject. And you used this to state the change had no consensus, and to reinstate it in a way that we haven't done in the project for a couple of years and which caused a lot of disarray.

The problem was not that you did it, but how you did it. You then spent a lot of days trying to convince other glibc maintainers about your actions, to the point that Torvald and Siddhesh were fed up with your rhetoric.

> > maintainers said explicitly you shouldn’t do it.
>
> I do not see nor recall any such responses or reactions to my offer to
> revert the patch in case the installer wouldn't do it, except the
> installer saying they wouldn't do the reversal. Eventually I did it.
> After the fact, some said I shouldn't have done it.
>
> That's my recollection of the events.

All the other active maintainers suggested you shouldn't have done that, but you ignored it anyway. And we did not want to start a potential contention of patch applying and reversion over that petty discussion.

But this is done and I don't want to dig into this. My point is that *we* glibc maintainers were fully aware that it was *you* who decided to act in that way, and my feeling is that the dominant response was not *hate*, but rather a lot of frustration and disappointment at how you acted.
Re: GCC association with the FSF
On Sun, Apr 11, 2021 at 10:43 PM Alexandre Oliva wrote:
>
> On Apr 11, 2021, Adhemerval Zanella wrote:
>
> > All the other active maintainers suggested you shouldn't have done that,
> > but you ignored it anyway.
>
> How could I possibly have ignored something that hadn't happened yet?
>
> > *we* glibc maintainers were fully aware that it was *you* that decided
> > to act in that way
>
> There have been plenty of insinuations that contradict that assumption
> and attempted to somehow blame it on RMS, but whether the record has
> been set straight on this point now, or if it was straight already, the
> point stands.

No, you are insinuating that the glibc community, both maintainers and contributors, acted in a hateful way regarding the 'joke' removal. Sorry, but this is not true; there were messages that might be characterized as such, but they did not come from any of the main glibc developers or maintainers.

> As recently as a couple of weeks ago someone referred, in this list, to
> RMS's voicing his objection to the removal of one of the many pieces he
> wrote for the glibc manual, and then setting out to propose and discuss
> policies that incided on the matter, as if those were horrible actions.
>
> That was almost as abhorrent as his asking a GNU developer a question
> that he could have answered by just downloading the subproject's source
> code and looking for the answer himself! Oh, the horror!
>
> If that's not hatred, I don't really wish to know what is :-/

The main idea, which I was vocal about and shared with some glibc developers and maintainers, was that the "joke" has no place in a technical manual. You might disagree ideologically and politically with this assessment, but it is not "hatred", and this very rhetoric of trying to characterize it as such is what made me see that discussion as frustrating and disappointing.
Re: GCC association with the FSF
On 12/04/2021 14:52, Alexandre Oliva wrote:
> On Apr 12, 2021, Adhemerval Zanella wrote:
>
>> No, you are insinuating that the glibc community both as maintainer
>> and contributors acted in a hateful way regarding the 'joke'
>> removal. Sorry, but this is not true;
>
> Easy to say for someone who hasn't been the target of hate, but it's
> just that it was there right then, it's *remains* there. Not exclusive
> among glibc maintainers, and certainly not unanimous among them, but
> there. I may even have earned it myself. But the one that Richard got
> over incorrect assumptions that he commanded the reversal, that's just
> another false piece of evidence often used to support the hate campaign.

There was no "hate" campaign from glibc developers and maintainers; repeatedly stating it does not make it true. Since libc-alpha is a non-moderated list, there were a lot of unfriendly messages from undisclosed or non-representative people.

What happened is that some glibc developers were *really* annoyed at the way *you* acted, not RMS, and they vocalized it. And you, instead of working to create consensus by making some concessions (which is how we currently try to run the glibc community), kept arguing to exhaustion that you acted for the benefit of the project.

So the aforementioned 'hate' is just that we did not agree with the way *you* acted, which caused a lot of distress.

>> The main idea, which I was vocal about and shared with some glibc
>> developers and maintainers, was that the "joke" has no place in a
>> technical manual.
>
> I understand there is consensus about that now, but back then there were
> too many unsettled policy issues to make that call consensually among
> all relevant parties.
>
> The main disagreement was not over the issue proper, though. It was
> about procedure, and then it was about whose opinions as much as
> counted.

No, the disagreement is over the way *you* did it. I have not seen such contention and disarray as the one you started since I began working on the project, about a decade ago. So, please stop putting the blame for that episode on the glibc community as a whole.

> It was a really trivial issue, but sufficiently hot-button and
> triggering enough underlying issues that it got to be exploited
> politically in several ugly ways.
>
> It can't really be understood without looking into broader contexts that
> had long been mounting, and that again quite explicit in this list too.
>
> But I hope we can all agree that it was a horrible mess.
Re: New TLS usage in libgcc_s.so.1, compatibility impact
On 15/01/24 09:46, Szabolcs Nagy wrote:
> The 01/13/2024 13:49, Florian Weimer wrote:
>> This commit
>>
>> commit 8abddb187b33480d8827f44ec655f45734a1749d
>> Author: Andrew Burgess
>> Date: Sat Aug 5 14:31:06 2023 +0200
>>
>>     libgcc: support heap-based trampolines
>>
>>     Add support for heap-based trampolines on x86_64-linux, aarch64-linux,
>>     and x86_64-darwin. Implement the __builtin_nested_func_ptr_created and
>>     __builtin_nested_func_ptr_deleted functions for these targets.
>>
>>     Co-Authored-By: Maxim Blinov
>>     Co-Authored-By: Iain Sandoe
>>     Co-Authored-By: Francois-Xavier Coudert
>>
>> added TLS usage to libgcc_s.so.1. The way that libgcc_s is currently
>> built, it ends up using a dynamic TLS variant on the Linux targets.
>> This means that there is no up-front TLS allocation with glibc (but
>> there would be one with musl).
>>
>> There is still a compatibility impact because glibc assigns a TLS module
>> ID upfront. This seems to be what causes the
>> ust/libc-wrapper/test_libc-wrapper test in lttng-tools to fail. We end
>> up with an infinite regress during process termination because
>> libgcc_s.so.1 has been loaded, resulting in a DTV update. When this
>> happens, the bottom of the stack looks like this:
>>
>> #4447 0x77f288f0 in free () from /lib64/liblttng-ust-libc-wrapper.so.1
>> #4448 0x77fdb142 in free (ptr=) at ../include/rtld-malloc.h:50
>> #4449 _dl_update_slotinfo (req_modid=3, new_gen=2) at ../elf/dl-tls.c:822
>> #4450 0x77fdb214 in update_get_addr (ti=0x77f2bfc0, gen=) at ../elf/dl-tls.c:916
>> #4451 0x77fddccc in __tls_get_addr () at ../sysdeps/x86_64/tls_get_addr.S:55
>> #4452 0x77f288f0 in free () from /lib64/liblttng-ust-libc-wrapper.so.1
>> #4453 0x77fdb142 in free (ptr=) at ../include/rtld-malloc.h:50
>> #4454 _dl_update_slotinfo (req_modid=2, new_gen=2) at ../elf/dl-tls.c:822
>> #4455 0x77fdb214 in update_get_addr (ti=0x77f39fa0, gen=) at ../elf/dl-tls.c:916
>> #4456 0x77fddccc in __tls_get_addr () at ../sysdeps/x86_64/tls_get_addr.S:55
>> #4457 0x77f36113 in lttng_ust_cancelstate_disable_push () from /lib64/liblttng-ust-common.so.1
>> #4458 0x77f4c2e8 in ust_lock_nocheck () from /lib64/liblttng-ust.so.1
>> #4459 0x77f5175a in lttng_ust_cleanup () from /lib64/liblttng-ust.so.1
>> #4460 0x77fca0f2 in _dl_call_fini (closure_map=closure_map@entry=0x77fbe000) at dl-call_fini.c:43
>> #4461 0x77fce06e in _dl_fini () at dl-fini.c:114
>> #4462 0x77d82fe6 in __run_exit_handlers () from /lib64/libc.so.6
>>
>> Cc:ing for awareness.
>>
>> The issue also requires a recent glibc with changes to DTV management:
>> commit d2123d68275acc0f061e73d5f86ca504e0d5a344 ("elf: Fix slow tls
>> access after dlopen [BZ #19924]"). If I understand things correctly,
>> before this glibc change, we didn't deallocate the old DTV, so there was
>> no call to the free function.
>
> with 19924 fixed, after a dlopen or dlclose every thread updates
> its dtv on the next dynamic tls access.
>
> before that, dtv was only updated up to the generation of the
> module being accessed for a particular tls access.
>
> so hitting the free in the dtv update path is now more likely
> but the free is not new, it was there before.
>
> also note that this is unlikely to happen on aarch64 since
> tlsdesc only does dynamic tls access after a 512byte static
> tls reservation runs out.
>
>> On the glibc side, we should recommend that intercepting mallocs and its
>> dependencies use initial-exec TLS because that kind of TLS does not use
>> malloc. If intercepting mallocs using dynamic TLS work at all, that's
>> totally by accident, and was in the past helped by glibc bug 19924. (I
>
> right.
>
>> don't think there is anything special about libgcc_s.so.1 that triggers
>> the test failure above, it is just an object with dynamic TLS that is
>> implicitly loaded via dlopen at the right stage of the test.) In this
>> particular case, we can also paper over the test failure in glibc by not
>> call free at all because the argument is a null pointer:
>>
>> diff --git a/elf/dl-tls.c b/elf/dl-tls.c
>> index 7b3dd9ab60..14c71cbd06 100644
>> --- a/elf/dl-tls.c
>> +++ b/elf/dl-tls.c
>> @@ -819,7 +819,8 @@ _dl_update_slotinfo (unsigned long int req_modid, size_t new_gen)
>>     dtv entry free it. Note: this is not AS-safe. */
>>   /* XXX Ideally we will at some point create a memory
>>      pool. */
>> - free (dtv[modid].pointer.to_free);
>> + if (dtv[modid].pointer.to_free != NULL)
>> +   free (dtv[modid].pointer.to_free);
>>   dtv[modid].pointer.val = TLS_DTV_UNALLOCATED;
>>   dtv[modid].pointer.to_free = NULL;
>
> can be done, but !=NULL is more likely since we do modid reuse
> after dlclose.
>
> there is also