Materials Handling Middle East 2021

2021-09-21 Thread Savannah Gonzalez
Hi,

Hope all is going well! I am writing to ask whether you would like to 
obtain the list of participants/visitors.
We have contact details for over 2,500 attendees of Materials Handling 
Middle East 2021. Let us know if you are interested and we will get back 
to you with discounted rates and other deliverables.
Does it make sense to explore ways we can help your team as well?
Thank you,
Savannah Gonzalez
Sales & Marketing Executive
email: savannah.gonza...@dealbig.live
Global E-Mail & Tele-Marketing
US, UK, Canada, Australia and other European & Asia Pacific countries



RE: GCC/OpenMP offloading for Intel GPUs?

2021-09-21 Thread Thomas Schwinge
Hi!

On 2021-09-16T01:40:40+, "Liu, Hongtao"  wrote:
> Reply from Xinmin, and adding him to this thread.

Thanks.  :-)

By the way: if you are registered for the Linux Plumbers Conference 2021,
we may also continue this discussion in the GCC "BoF: Offloading with
OpenMP & OpenACC".

> IGC is open sourced. It takes SPIR-V IR and LLVM IR.  We need "GCC IR to 
> SPIR-V translator"

Understood that we need a GCC back end producing SPIR-V, complementing
the existing support for Nvidia GPUs via nvptx back end (producing
textual PTX code), and for AMD GPUs via GCN back end (producing GCN
assembly).

Would you please explain what it means that "IGC [...] takes [...] LLVM
IR"?  Can LLVM IR be used to describe the OpenMP 'target' regions and
properly express GPU multi-level parallelism?  If that is possible in
pure LLVM IR, and given that:

> similar to "LLVM-IR to SPIR-V translator" we have for LLVM-IR.

..., this already exists, does it follow that GCC wouldn't actually need
a SPIR-V back end, but could instead "simply" generate LLVM IR from GCC
IR?
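
For concreteness, this is the kind of 'target' region I mean, with the
teams/parallel/SIMD levels of parallelism that the device IR has to
express (a minimal example of mine, not from this thread):

    void
    saxpy (int n, float a, float *x, float *y)
    {
    #pragma omp target teams distribute parallel for simd \
        map(tofrom: x[0:n]) map(to: y[0:n])
      for (int i = 0; i < n; ++i)
        x[i] = a * x[i] + y[i];
    }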

(I remember "DragonEgg - Using LLVM as a GCC backend", which at least to
me still has a certain appeal on its own grounds.  I understand not
everyone in the GCC community will agree...)

Would such an approach make any sense?

> How does GCC support the device library?

I'm not sure I'm correctly understanding the question.

For both nvptx and GCN offloading compilation, there is a device code
linking step, where offload target libraries may be linked in.  (The
results then get embedded into the host "FAT" binaries.)

Then, there is libgomp ("GNU Offloading and Multi Processing Runtime
Library"), which contains plugins for each offload target, for loading
offload code to the devices, memory management, kernel launches, etc.
For nvptx, this uses the CUDA Driver library, and for GCN it uses
'libhsa-runtime64.so'.  A similar plugin would need to be written for the
corresponding Intel GPU device-access library.
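
To make that concrete, a plugin is just a shared object exporting a set
of 'GOMP_OFFLOAD_*' entry points.  A minimal sketch (signatures
paraphrased from libgomp's plugin interface -- see
'libgomp/libgomp-plugin.h' for the authoritative list; every 'intel_*'
call is a hypothetical placeholder for the real device-access library):

    #include <stddef.h>
    #include <stdbool.h>

    /* Hypothetical Intel device-access library.  */
    extern int intel_device_count (void);
    extern bool intel_device_open (int device);
    extern void *intel_device_malloc (int device, size_t size);

    const char *
    GOMP_OFFLOAD_get_name (void)
    {
      return "intelgpu";  /* hypothetical offload target name */
    }

    int
    GOMP_OFFLOAD_get_num_devices (void)
    {
      return intel_device_count ();
    }

    bool
    GOMP_OFFLOAD_init_device (int device)
    {
      return intel_device_open (device);
    }

    void *
    GOMP_OFFLOAD_alloc (int device, size_t size)
    {
      return intel_device_malloc (device, size);
    }

(Plus entry points for host2dev/dev2host copies, 'GOMP_OFFLOAD_run' for
kernel launches, etc.; libgomp then routes operations for a device
through whichever plugin claimed it.)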


There still remains the question of who is going to do the work: are
Intel planning to do that work (themselves, like for Intel MIC offloading
back then), or interested in hiring someone to do it, or not (actively)
interested in helping GCC support Intel GPUs?


Grüße
 Thomas


>>-Original Message-
>>From: Thomas Schwinge 
>>Sent: Wednesday, September 15, 2021 7:20 PM
>>To: Liu, Hongtao 
>>Cc: gcc@gcc.gnu.org; Jakub Jelinek ; Tobias Burnus
>>; Kirill Yukhin ; Richard
>>Biener 
>>Subject: RE: GCC/OpenMP offloading for Intel GPUs?
>>
>>Hi!
>>
>>On 2021-09-15T02:00:33+, "Liu, Hongtao via Gcc" 
>>wrote:
>>> I got some feedback from my colleague
>>
>>Thanks for reaching out to them.
>>
>>> -
>>> What we need from GCC
>>>
>>> 1. generate SPIR-V
>>> 2. offload bundler to create FAT object
>>> --
>>>
>>> If the answer is yes for both, they can hook it up with libomptarget library
>>and our IGC back-end.
>>
>>OK, I didn't remember Intel's use of SPIR-V as intermediate representation
>>(but that's certainly good!), and leaving aside the technical/implementation
>>issues (regarding libomptarget etc. use, as brought up by Jakub), the question
>>then is: are Intel planning to do that work (themselves, like for Intel MIC
>>offloading back then), or interested in hiring someone to do it, or not?
>>
>>
>>Grüße
>> Thomas
>>
>>
-Original Message-
From: Thomas Schwinge 
Sent: Wednesday, September 15, 2021 12:57 AM
To: gcc@gcc.gnu.org
Cc: Jakub Jelinek ; Tobias Burnus
; Kirill Yukhin ;
Liu, Hongtao 
Subject: GCC/OpenMP offloading for Intel GPUs?

Hi!

I've had a person ask about GCC/OpenMP offloading for Intel GPUs (the
new ones, not MIC, obviously), to complement the existing support for
Nvidia and AMD GPUs.  Is there any statement other than "ought to be
doable; someone needs to contribute the work"?


Grüße
 Thomas
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955


replacing the VRP threader

2021-09-21 Thread Aldy Hernandez via Gcc
In order to replace VRP, we need an alternative for the jump threader 
embedded within it.  Currently, this threader is a forward threader 
client that uses ASSERT_EXPRs and the avail/const framework to resolve 
statements along a path.
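
As a quick reminder of the transformation itself, a minimal example:
along the path where the first test below is true, the second test is
known to be true as well, so the threader copies the blocks on that path
and rewires it to bypass the second conditional.

    extern void g1 (void), g2 (void);

    void
    f (int x)
    {
      if (x > 10)
        g1 ();
      if (x > 5)  /* redundant along the path where x > 10 */
        g2 ();
    }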


As I have mentioned in the past weeks, I am proposing a hybrid 
replacement, where we still use the forward threader, but with the path 
solver used by the backward threader.  This provides an incremental path 
to remove the forward threader with minimal disruption.


As a reminder, the reason we can't just use the backward threader for a 
VRP-threader replacement right now is because the forward and backward 
threaders have different cost models and block copiers that have not 
been merged.  I'd like to address these differences for the next 
release. This, coupled with support for floats in ranger, would allow us 
to nuke the forward threader entirely.  But for now, a more gradual 
approach is needed.


I have patches to the path solver that address all the shortcomings it 
had in my initial rewrite.  That is, the ability to use the ranger and 
the relation oracle to fully resolve ranges and relations not known 
within the path.  With this enhanced solver, we are able to get 
everything the forward threader can get (both VRP and DOM clients, with 
the exception of floats for DOM).  This path solver can then be used 
either with the backward threader (disabled by default), or with the 
forward threader in the hybrid approach I propose for VRP.


I'd like to discuss how I tested this, what is missing, and alternatives 
for going forward.


SUMMARY
===

The hybrid threader gets 14.5% more jump threads than the legacy code, 
but most of these threads come at the expense of other threading passes: 
the more it gets, the less DOM and the backward threader get.  That 
being said, there is a net improvement of 0.87% more jump threads over 
the whole compilation.


This comes with no measurable runtime penalty.  In our callgrind testing 
harness derived from the .ii files of a bootstrap, I measured a 0.62% 
performance drop, well within the noise of the tool.


However, there is a 1.5% performance penalty for splitting the VRP 
threader out into its own pass outside of VRP (for either hybrid or 
legacy).  I would prefer divorcing embedded jump threading passes from 
their carrier passes, but if others disagree, we could piggyback on the 
ranger used in the upcoming VRP replacement (evrp has a ranger we could 
share).


TESTING
===

The presence of ASSERT_EXPRs made it difficult to do a clean comparison, 
as ranger obviously doesn't use them.  What I did was move the VRP 
threader into its own pass (regenerating ASSERT_EXPRs and their 
environment), and then run my hybrid threader iteratively until there 
were no more changes (excluding ping-pongs).  Then I ran the legacy 
code, and anything it could still find was something worth investigating.


BTW, the reason for the iterative approach is that any threading pass 
creates opportunities for subsequent threading passes.


MISSING CASES
=

There were a few things we missed, which can be summarized in broad 
categories:


a) A few missing range-op entries for things like RROTATE_EXPR.  These 
are trivial to address.
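
For instance (illustrative), GCC canonicalizes the usual rotate idiom to
an RROTATE_EXPR, and without a range-op entry for that tree code the
solver cannot compute a range for the result even when the operand's
range is known:

    unsigned int
    rot3 (unsigned int x)
    {
      /* Recognized by GCC as a rotate right by 3 (RROTATE_EXPR).  */
      return (x >> 3) | (x << (sizeof (x) * 8 - 3));
    }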


b) Some paths which the solver could clearly solve, but it never got the 
chance, because of limitations in the path discovery code in the forward 
threader.


I fixed a few of these (upcoming patches), but I mostly avoided fixing 
the core forward threader code, as it would incur too much work for very 
little return.  Besides, my plan is to eventually nuke it.


One example is paths whose first block ends in a conditional, but whose 
second block is empty.  These are usually empty loop pre-headers, which 
are not empty in legacy mode because the ASSERT_EXPR mechanism has put 
asserts in there.  However, since the ranger uses the pristine IL, these 
blocks really are empty, which throws off the threader.  In practice, 
this didn't matter: since the CFG had been cleaned up, these empty 
blocks were predominantly loop pre-headers that the threader was trying 
to thread through, crossing loop boundaries in the process.  As has been 
discussed earlier, we shouldn't be threading through those anyway.


c) The path solver returns UNDEFINED for unreachable paths, so we avoid 
threading them altogether.  However, the legacy code is perfectly happy 
to give answers and thread through the paths.
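
For instance (illustrative):

    extern void g (void);

    void
    f (int x)
    {
      if (x > 10)
        if (x < 5)  /* contradicts x > 10: the path is unreachable */
          g ();
    }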


d) I saw an odd missed propagation/recalculation here and there, where 
ranger returned VARYING when it could have done better.  This was very 
rare.


e) Finally, conditionals where either operand is a MEM_REF are not 
handled.  Interestingly, this was uncommon, as almost everything had 
been simplified to an SSA name.
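
For instance (illustrative), the solver currently punts on:

    extern void g (void);

    void
    f (int *p)
    {
      if (*p > 0)  /* the operand is a MEM_REF, not an SSA name */
        g ();
    }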


As the numbers show, the above is probably noise.  The only thing worth 
addressing may be the MEM_REF business.  If it becomes a problem, it is 
precisely what the pointer equivalence tracking we use in evrp was 
designed for; it could easily be adapted.



Difference between MULTISUBDIR and "-print-multi-os-directory"

2021-09-21 Thread CHIGOT, CLEMENT via Gcc
Hi everyone, 

I'm wondering what the real difference is between the Makefile
variable MULTISUBDIR, which is set as the "installed subdirectory name",
and the result of "-print-multi-os-directory", which is described as
"the relative path to OS libraries".
In particular, the configure scripts of some GCC libraries set the
multilib installation directory "toolexeclibdir" from MULTISUBDIR when
"--enable-version-specific-runtime-libs" is enabled, and from
"-print-multi-os-directory" otherwise. This is the case for at least
libgfortran and libstdc++.

FYI, the background of this question is that we are trying to enable
a 64-bit-built gcc on AIX. However, we want the library search paths
generated by GCC to be similar to those of the 32-bit version. This
means we can't rely on the "ppc32" directory, which doesn't exist with
the 32-bit version.
Thus, we are setting "MULTILIB_MATCH= .=-maix32" to force the 64-bit gcc
with -maix32 to use the libraries under ".". They have been adapted to
contain both 32-bit and 64-bit shared objects. (That's the usual format
of libraries on AIX.)
However, this creates a difference between MULTISUBDIR and
"-print-multi-os-directory": MULTISUBDIR is set as if "MULTILIB_MATCH"
wasn't there, so for "-maix32" it is "ppc32", whereas
"-print-multi-os-directory" with "-maix32" shows ".".
Thus, depending on whether "--enable-version-specific-runtime-libs" is
set or not, the installation directory will be different for us.
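
To make the mismatch concrete, an illustrative session (values as
described above; "gcc64" stands for our 64-bit-built driver):

    $ gcc64 -maix32 -print-multi-os-directory
    .

...while the library Makefiles still see the unmatched multilib name:

    MULTISUBDIR = ppc32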

Thanks,
Clément



INVESTMENT GROUP

2021-09-21 Thread Julie Cruss via Gcc
partner,
I am a loan consultant working with Klunder Beheer Investment Group, a
company in the United Kingdom. I seek a reliable partner who has a worthy
pending project or investment to partner on. We shall proceed with
immediate effect once we find a reputable partner in need of loans to
finance his existing projects. So, do email me if you have a worthy
business projection with high returns.
Sincerely,
Dr Holger Schutkowski.
Klunder Beheer Investment Group
Regional Investment Director 4th Floor,
1 St James's Market London SW1Y AH United Kingdom
Mobile phone:+447031980083


RE: GCC/OpenMP offloading for Intel GPUs?

2021-09-21 Thread Tian, Xinmin via Gcc
> Can LLVM IR be used to describe the OpenMP 'target' regions and properly
> express GPU multi-level parallelism?

Yes, you can generate LLVM IR as well; we can take LLVM IR as well.

Xinmin 


Re: [libc-coord] Add new ABI '__memcmpeq()' to libc

2021-09-21 Thread Noah Goldstein via Gcc
On Fri, Sep 17, 2021 at 9:27 AM Florian Weimer via Libc-alpha <
libc-al...@sourceware.org> wrote:

> * Joseph Myers:
>
> > I was supposing a build-time decision (using GCC_GLIBC_VERSION_GTE_IFELSE
> > to know if the glibc version on the target definitely has this function).
> > But if we add a header declaration, you could check for __memcmpeq being
> > declared (and so cover arbitrary C libraries, not just glibc, and avoid
> > issues of needing to disable this logic for freestanding compilations,
> > which would otherwise be an issue if a glibc-target toolchain is used for
> > a freestanding kernel compilation).  The case of people calling
> > __builtin_memcmp (or declaring memcmp themselves) without string.h
> > included probably isn't one it's important to optimize.
>
> The header-less case looks relevant to C++ and other language front
> ends, though.  So a GCC_GLIBC_VERSION_GTE_IFELSE check could still make
> sense for them.
>
> (Dropping libc-coord.)
>
> Thanks,
> Florian
>
>
What are we going with?

Should I go forward with the proposal in glibc?

If so, should I include a declaration in string.h?
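
For reference, a sketch of what the string.h declaration being discussed
might look like (exact attributes and feature-test guards to be decided
in the glibc patch):

    /* A memcmp variant whose return value is only meaningful as
       zero (equal) / non-zero (not equal).  */
    extern int __memcmpeq (const void *__s1, const void *__s2, size_t __n);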