[libgomp] Ask for help on an improvement for synchronization overhead

2020-04-30 Thread Adhemerval Zanella via Gcc
Hi all, I would like to check if someone could help me figure out
an issue I am chasing on a libgomp patch intended to partially
address the issue described in BZ#79784 [1].

I have identified that one of the bottlenecks is the global barrier
used on both thread pool and team, which causes a lot of cache ping-pong
on high-core-count machines. And it seems not to be an aarch64-specific
issue as hinted by the bugzilla.

So the optimization I am implementing, which is similar to what the LLVM
OpenMP implementation does, is to use a per-OMP-thread barrier to
synchronize team/task creation.  The activation I have implemented
so far is a simple linear one, where the master scans linearly over
the children threads (LLVM OpenMP implements some fancier ones that I
plan to take a look at as well).
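
A minimal sketch of the per-thread barrier idea with linear activation.
This is not the code from the patch [2] and does not use libgomp's actual
data structures: the names `struct slot`, `worker`, `run_demo`, `NWORKERS`
and `NROUNDS` are hypothetical, and it assumes C11 atomics, POSIX threads
and a 64-byte cache line. Each worker spins only on its own padded slot,
and the master scans the slots linearly, first to release the workers and
then to collect their arrival:

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

#define CACHE_LINE 64   /* assumed cache-line size */
#define NWORKERS   4    /* hypothetical team size (workers only) */
#define NROUNDS    100  /* hypothetical number of barrier rounds */

/* One slot per worker, aligned so that no two workers share a line. */
struct slot {
    _Alignas(CACHE_LINE) atomic_uint go; /* written by master only */
    atomic_uint arrived;                 /* written by this worker only */
    long work_done;                      /* stand-in for real work */
};

static void *worker(void *arg)
{
    struct slot *s = arg;

    for (unsigned round = 1; round <= NROUNDS; round++) {
        /* Wait on our own flag instead of a global barrier word. */
        while (atomic_load_explicit(&s->go, memory_order_acquire) != round)
            sched_yield();

        s->work_done++;

        /* Signal arrival on our own flag. */
        atomic_store_explicit(&s->arrived, round, memory_order_release);
    }
    return NULL;
}

/* Runs NROUNDS release/collect cycles; returns total work, or -1 on error. */
long run_demo(void)
{
    struct slot slots[NWORKERS];
    pthread_t tids[NWORKERS];
    long total = 0;

    for (int i = 0; i < NWORKERS; i++) {
        atomic_init(&slots[i].go, 0);
        atomic_init(&slots[i].arrived, 0);
        slots[i].work_done = 0;
    }
    for (int i = 0; i < NWORKERS; i++)
        if (pthread_create(&tids[i], NULL, worker, &slots[i]) != 0)
            return -1;

    for (unsigned round = 1; round <= NROUNDS; round++) {
        /* Linear activation: release each worker in turn. */
        for (int i = 0; i < NWORKERS; i++)
            atomic_store_explicit(&slots[i].go, round, memory_order_release);

        /* Linear gather: wait for each worker in turn. */
        for (int i = 0; i < NWORKERS; i++)
            while (atomic_load_explicit(&slots[i].arrived,
                                        memory_order_acquire) != round)
                sched_yield();
    }

    for (int i = 0; i < NWORKERS; i++) {
        pthread_join(tids[i], NULL);
        total += slots[i].work_done;
    }
    return total;
}
```

Calling `run_demo()` should return `NWORKERS * NROUNDS`, since every worker
does one unit of work per round. The trade-off in this sketch is that the
master performs a number of stores and loads proportional to the team size
per barrier, in exchange for no cache line being shared by all threads.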

The patch I have come up with so far is quite simple [2] and still
requires some polish (some documentation, code styling, etc.); however,
there is one regression that is making me scratch my head:
cancel-parallel-2.

What it does is exercise OpenMP cancellation in an 'omp parallel'
construct, and the issue is that I am failing to understand why
the final team barrier (done in gomp_team_end, called by GOMP_parallel_end)
is not synchronizing correctly with the team barrier in each OpenMP
task.

So any help on the design is appreciated (even if it means I should
rethink it for libgomp).

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79784
[2] https://github.com/zatrazz/gcc/tree/azanella/libgomp-scalability


Re: [libgomp] Ask for help on an improvement for synchronization overhead

2020-05-06 Thread Adhemerval Zanella via Gcc



On 30/04/2020 18:12, Jakub Jelinek wrote:
> On Thu, Apr 30, 2020 at 05:37:26PM -0300, Adhemerval Zanella via Gcc wrote:
>> Hi all, I would like to check if someone could help me figure out
>> an issue I am chasing on a libgomp patch intended to partially
>> address the issue described at BZ#79784. 
>>
>> I have identified that one of the bottlenecks is the global barrier 
>> used on both thread pool and team, which causes a lot of cache ping-pong 
>> on high-core-count machines. And it seems not to be an aarch64-specific
>> issue as hinted by the bugzilla.
> 
> This has been a topic of GSoC last year, but the student didn't deliver it
> in usable form and disappeared.
> See e.g. thread with "Work-stealing task scheduling" in subject from
> last year on gcc-patches and other mails on the topic.

In my understanding, what I am working on is not exactly related to OMP
tasking, although I see that the global barrier is still an issue for OMP
task scheduling. What I am trying to optimize in this specific case is the
barrier in gomp_thread_pool, which is used by constructs like 'parallel
for'; a per-thread barrier could maybe be extended to other places in
libgomp.

> 
> So if you'd have time and motivation to do it properly, it would be greatly
> appreciated.
> 



Re: unnormal Intel 80-bit long doubles and isnanl

2020-11-24 Thread Adhemerval Zanella via Gcc



On 24/11/2020 10:59, Siddhesh Poyarekar wrote:
> On 11/24/20 7:11 PM, Szabolcs Nagy wrote:
>> ideally fpclassify (and other classification macros) would
>> handle all representations.
>>
>> architecturally invalid or trap representations can be a
>> non-standard class but i think classifying them as FP_NAN
>> would break the least amount of code.
> 
> That's my impression too.
> 
>>> glibc evaluates the bit pattern of the 80-bit long double and in the
>>> process, ignores the integer bit, i.e. bit 63.  As a result, it considers
>>> the unnormal number as a valid long double and isnanl returns 0.
>>
>> i think m68k and x86 are different here.
>>
>>>
>>> gcc on the other hand, simply uses the number in a floating point comparison
>>> and uses the parity flag (which indicates an unordered compare, signalling a
>>> NaN) to decide if the number is a NaN.  The unnormal numbers behave like
>>> NaNs in this respect, in that they set the parity flag and with
>>> -fsignaling-nans, would result in an invalid-operation exception.  As a
>>> result, __builtin_isnanl returns 1 for an unnormal number.
>>
>> compiling isnanl to a quiet fp compare is wrong with
>> -fsignaling-nans: classification is not supposed to
>> signal exceptions for snan.
> 
> I agree, but I think that issue with __builtin_isnanl is orthogonal to the 
> question about unnormals.  Once that is fixed in gcc, we could actually use 
> __builtin_isnanl all the time in glibc for isnanl.
> 
> Siddhesh

What is the current take from the GCC developers on this semantic change
to __builtin_isnanl? Do they consider the current behavior of classifying
the 'unnormal' as NaN to be the expected behavior and are waiting for glibc
to follow it, or are they willing to align with glibc's behavior?
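
The two behaviors discussed in the quoted text can be sketched on the raw
bit pattern. This is only an illustration under stated assumptions, not the
glibc or GCC implementation: `struct x87_ext` and the three function names
are hypothetical, and an "unnormal" is taken to be an encoding with a
non-zero, non-maximum exponent whose explicit integer bit (bit 63) is
clear.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical view of an x87 80-bit extended value. */
struct x87_ext {
    uint64_t significand;   /* bit 63 is the explicit integer bit */
    uint16_t sign_exponent; /* bit 15: sign, bits 14..0: biased exponent */
};

#define X87_EXP_MASK 0x7fffu
#define X87_INT_BIT  (UINT64_C(1) << 63)

/* NaN test that ignores the integer bit, as the thread describes:
   maximum exponent and a non-zero fraction. */
static bool isnan_ignoring_integer_bit(struct x87_ext v)
{
    unsigned exp = v.sign_exponent & X87_EXP_MASK;
    uint64_t fraction = v.significand & ~X87_INT_BIT;

    return exp == X87_EXP_MASK && fraction != 0;
}

/* Unnormal (as assumed above): exponent neither zero nor maximum,
   with the integer bit clear. */
static bool is_unnormal(struct x87_ext v)
{
    unsigned exp = v.sign_exponent & X87_EXP_MASK;

    return exp != 0 && exp != X87_EXP_MASK
           && (v.significand & X87_INT_BIT) == 0;
}

/* The alternative suggested in the thread: report unnormals as NaN. */
static bool isnan_treating_unnormal_as_nan(struct x87_ext v)
{
    return is_unnormal(v) || isnan_ignoring_integer_bit(v);
}
```

For example, the pattern with `sign_exponent = 0x3fff` and
`significand = 0x4000000000000000` is an unnormal under this definition:
the first test reports it as not a NaN, while the last one reports it as a
NaN. Both tests work purely on integers, so neither can raise a
floating-point exception.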


Re: GCC association with the FSF

2021-04-11 Thread Adhemerval Zanella via Gcc



> On 11 Apr 2021, at 17:45, Alexandre Oliva via Gcc wrote:
> 
> Remember how much hate RMS got in glibc land for something I did?  I
> said I did it out of my own volition, I explained my why I did it, but
> people wouldn't believe he had nothing to do with it! 

It was clear to me and other glibc maintainers that it was *you* who bypassed
the consensus to *not* reinstate the “joke”. And there was no hate (at least
not from my side), only *disappointment* that you used your status to do it
even though most of the senior developers and maintainers said explicitly
you shouldn’t do it.

Re: GCC association with the FSF

2021-04-11 Thread Adhemerval Zanella via Gcc
On Sun, Apr 11, 2021 at 8:06 PM Alexandre Oliva  wrote:
>
> On Apr 11, 2021, Adhemerval Zanella  wrote:
>
> > It was clear to me and others glibc maintainers that it was *you* who
> > bypass the consensus to *not* reinstate the “joke”.
>
> I think you wrote it backwards: what I did was to revert the commit that
> the person who put it in agreed shouldn't have been made at that point,
> so that the debate about whether or not to install the patch could be
> carried out without the fait accompli.  To my surprise, it stopped.
>
> Then, a year or so later, when most of the GNU policies that incided on
> that matter had already been discussed and approved, and they suggested
> (at least to me) that the conclusion was likely that the patch was in
> line with them, some other situation came up that reminded people of the
> patch, it was discussed under the heat of the unrelated situation (which
> I also found inappropriate), but it got applied AFAICT in accordance
> with GNU and GLIBC policies.

RMS briefly stated that he did not want the change to be applied. We
considered his input back then, but we decided to remove the joke
*regardless* of what he thought about the subject. And you used this to
state that the change had no consensus, in order to reinstate it in a way
that we haven't done in the project for a couple of years and which caused
a lot of disarray. The problem was not that you did it, but how you did it.

You then spent a lot of days trying to convince other glibc maintainers
about your actions, to the point that Torvald and Siddhesh were fed up
with your rhetoric.

>
> > maintainers said explicitly you shouldn’t do it.
>
> I do not see nor recall any such responses or reactions to my offer to
> revert the patch in case the installer wouldn't do it, except the
> installer saying they wouldn't do the reversal.  Eventually I did it.
> After the fact, some said I shouldn't have done it.
>
>
> That's my recollection of the events.

All the other active maintainers suggested you shouldn't have done that,
but you ignored it anyway. And we did not want to start a potential
back-and-forth of applying and reverting the patch over that petty
discussion.

But this is done and I don't want to dig into this. My point is that *we*
glibc maintainers were fully aware that it was *you* who decided to act in
that way, and my feeling was not that *hate* was the dominant response, but
rather a lot of frustration and disappointment over how you acted.


Re: GCC association with the FSF

2021-04-11 Thread Adhemerval Zanella via Gcc
On Sun, Apr 11, 2021 at 10:43 PM Alexandre Oliva  wrote:
>
> On Apr 11, 2021, Adhemerval Zanella  wrote:
>
> > All the other active maintainers suggested you shouldn't have done that, 
> > but you
> > ignored it anyway.
>
> How could I possibly have ignored something that hadn't happened yet?
>
> > *we* glibc maintainers were fully aware that it was *you* that decided
> > to act in that way
>
> There have been plenty of insinuations that contradict that assumption
> and attempted to somehow blame it on RMS, but whether the record has
> been set straight on this point now, or if it was straight already, the
> point stands.

No, you are insinuating that the glibc community, both maintainers and
contributors, acted in a hateful way regarding the 'joke' removal. Sorry,
but this is not true; there were messages that might be characterized as
such, but they did not come from any of the main glibc developers or
maintainers.

>
> As recently as a couple of weeks ago someone referred, in this list, to
> RMS's voicing his objection to the removal of one of the many pieces he
> wrote for the glibc manual, and then setting out to propose and discuss
> policies that incided on the matter, as if those were horrible actions.
>
> That was almost as abhorrent as his asking a GNU developer a question
> that he could have answered by just downloading the subproject's source
> code and looking for the answer himself!  Oh, the horror!
>
>
> If that's not hatred, I don't really wish to know what is :-/

The main idea, which I was vocal about and shared with some glibc
developers and maintainers, was that the "joke" has no place in a technical
manual. You might disagree ideologically and politically with this
assessment, but it is not "hatred", and this very rhetoric of trying to
characterize it as such is what made me see that discussion as frustrating
and disappointing.


Re: GCC association with the FSF

2021-04-12 Thread Adhemerval Zanella via Gcc



On 12/04/2021 14:52, Alexandre Oliva wrote:
> On Apr 12, 2021, Adhemerval Zanella  wrote:
> 
>> No, you are insinuating that the glibc community both as maintainer
>> and contributors acted in a hateful way regarding the 'joke'
>> removal. Sorry, but this is not true;
> 
> Easy to say for someone who hasn't been the target of hate, but it's
> just that it was there right then, it's *remains* there.  Not exclusive
> among glibc maintainers, and certainly not unanimous among them, but
> there.  I may even have earned it myself.  But the one that Richard got
> over incorrect assumptions that he commanded the reversal, that's just
> another false piece of evidence often used to support the hate campaign.

There was no "hate" campaign from glibc developers and maintainers;
repeatedly stating it does not make it true.  Since libc-alpha is a
non-moderated list, there were a lot of unfriendly messages from
undisclosed or non-representative people.

What happened is that some glibc developers were *really* annoyed by the
way *you* acted, not RMS, and they vocalized it.  And you, instead of
working toward creating consensus by making some concessions (which is how
we currently try to run the glibc community), kept arguing to exhaustion
that you acted for the benefit of the project.

So the aforementioned 'hate' is just because we did not agree with the
way *you* acted, which caused a lot of distress.

> 
>> The main idea, which I was vocal about and shared with some glibc
>> developers and maintainers, was that the "joke" has no place in a
>> technical manual.
> 
> I understand there is consensus about that now, but back then there were
> too many unsettled policy issues to make that call consensually among
> all relevant parties.
> 
> The main disagreement was not over the issue proper, though.  It was
> about procedure, and then it was about whose opinions as much as
> counted.

No, the disagreement is about the way *you* did it. I haven't seen such
contention and disarray as the one you started since I began working on
the project, about a decade ago.

So, please stop putting the blame for that episode on the glibc community
as a whole.

> 
> 
> It was a really trivial issue, but sufficiently hot-button and
> triggering enough underlying issues that it got to be exploited
> politically in several ugly ways.
> 
> It can't really be understood without looking into broader contexts that
> had long been mounting, and that again quite explicit in this list too.
> 
> 
> But I hope we can all agree that it was a horrible mess.
>