GSoC OpenACC

2025-04-04 Thread Carter Weidman via Gcc
Hello!

My name is Carter. I’m looking to become active in the GCC community. I would 
of course love to be funded through GSoC (and will most definitely be 
submitting a formal proposal), but I will contribute regardless. I’m 
interested in the OpenACC parts of the posted projects, so pointers to helpful 
resources for digging further would be much appreciated! If there is anything 
easy I can do in the meantime, that would be awesome as well. 

Looking forward to learning from everyone!
Carter. 

[GSoC] Initial Draft

2025-04-07 Thread Carter Weidman via Gcc

Here is the initial draft of my proposal. Please provide as much feedback as 
possible. After more research, I realize I will almost certainly have time to 
do more than just the bind and device_type clauses. I have prepared significant 
research for the cache directive as well, and could include it in my proposal. 
I will also expand the device_type section, but I need to tend to an obligation 
and wanted to post this ASAP. Rip it to shreds, please!

Thank you!
Carter

Implementing support for the bind clause:

The bind clause will dictate what name will be used when a procedure is called 
from an accelerator device such as a GPU or FPGA. Essentially, users can 
specify multiple versions of the same procedure, one to be run on a host call 
and others that accelerators can call. This is done by specifying either an 
identifier or a string after the bind clause.
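
As a hedged illustration of the two forms, here is how a user might write them 
in C, following the routine directive as I read it in the OpenACC spec. All 
function names are hypothetical, and the device versions are only declared, on 
the assumption that a device-side library would supply them. A compiler without 
OpenACC enabled ignores the pragmas, so on the host both calls simply run the 
host definitions.

```c
/* Hypothetical device-side implementation, assumed to come from a device
   library. It is declared only and never called on the host. */
#pragma acc routine seq
float scale_dev(float x);

/* Identifier form: device code that calls scale() would call scale_dev(). */
#pragma acc routine seq bind(scale_dev)
float scale(float x)
{
    return 2.0f * x;   /* host version */
}

/* String form: device code that calls offset() would reference the symbol
   named exactly "offset_dev_v2", with no language-level name mangling. */
#pragma acc routine seq bind("offset_dev_v2")
float offset(float x)
{
    return x + 1.0f;   /* host version */
}
```

On the host, scale(3.0f) and offset(1.5f) run the definitions above; the bind 
names only matter for calls made from device code.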

The primary benefit of implementing this functionality is that highly 
specialized implementations of procedures, such as custom FFT or BLAS routines, 
can be used instead of compiler-generated code. It also gives a clear 
separation between host and device functions and greater interoperability with 
external libraries.

Implementation Notes:
We will first need to extend the parser to handle the bind clause.
The files we’ll need for this are gcc/c-family/c-pragma.cc, 
gcc/c/c-parser.cc, and gcc/cp/parser.cc (for the C and C++ implementations 
only). A Fortran implementation may also be considered (likely), but to begin 
we’ll consider just these two. Some validation methods already present in GCC 
may be utilized, such as oacc_verify and oacc_finalize. Semantic analysis will 
of course also be performed to ensure proper bind-names are passed.
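
To make the parsing step concrete, here is a minimal standalone sketch of 
classifying the bind argument as an identifier or a string. It is a toy model, 
not GCC parser code: the real front ends work on tokens rather than raw text, 
and the names bind_kind, bind_arg, and parse_bind_arg are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

enum bind_kind { BIND_INVALID, BIND_IDENTIFIER, BIND_STRING };

struct bind_arg {
    enum bind_kind kind;
    char name[64];           /* bind-name without quotes, NUL-terminated */
};

/* Parse the text that follows the keyword `bind`, e.g. "(foo)" or
   "(\"foo\")". On malformed input the returned kind is BIND_INVALID. */
struct bind_arg parse_bind_arg(const char *s)
{
    struct bind_arg out = { BIND_INVALID, "" };
    enum bind_kind kind;
    const char *start;
    size_t len;

    while (isspace((unsigned char)*s)) s++;
    if (*s != '(') return out;
    s++;
    while (isspace((unsigned char)*s)) s++;

    if (*s == '"') {                          /* string form */
        start = ++s;
        while (*s != '\0' && *s != '"') s++;
        if (*s != '"') return out;            /* unterminated string */
        len = (size_t)(s - start);
        s++;                                  /* skip the closing quote */
        kind = BIND_STRING;
    } else if (isalpha((unsigned char)*s) || *s == '_') {   /* identifier form */
        start = s;
        while (isalnum((unsigned char)*s) || *s == '_') s++;
        len = (size_t)(s - start);
        kind = BIND_IDENTIFIER;
    } else {
        return out;
    }

    while (isspace((unsigned char)*s)) s++;
    if (*s != ')' || len == 0 || len >= sizeof out.name) return out;

    memcpy(out.name, start, len);             /* out.name is already zero-filled */
    out.kind = kind;
    return out;
}
```

Keeping the kind next to the name matters because the two forms are resolved 
differently later: an identifier goes through the language's normal naming 
rules, while a string is used verbatim.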

The parsed clause will then need to be represented in GIMPLE IR (from the AST). 
This will be done by adding a GIMPLE node to represent and store the specified 
procedure to bind to. We will need to make some additions to gcc/tree.h, 
gcc/gimple.def, and gcc/gimple.h such as adding a new bind clause definition 
{OMP_CLAUSE_BIND(NODE)} for example. Storing this symbol-name information in 
GIMPLE IR ensures that, during the linking stage, GCC correctly produces object 
files containing accelerator IR sections and metadata (.gnu.offload_lto_* 
sections).
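
As a rough mental model of what storing the clause means, the sketch below 
keeps the clauses of one directive as a chained list tagged with a clause code, 
which later stages search by code. This is a standalone toy under my own 
assumptions: the types and the functions find_clause and device_callee are 
hypothetical and are not GCC's tree API.

```c
#include <stddef.h>

/* Hypothetical clause codes for this toy model. */
enum clause_code { CLAUSE_SEQ, CLAUSE_BIND, CLAUSE_DEVICE_TYPE };

struct clause {
    enum clause_code code;
    const char *name;        /* bind target or device type name; NULL if unused */
    int name_is_string;      /* nonzero when bind was given a string literal */
    struct clause *chain;    /* next clause on the same directive */
};

/* Return the first clause with the given code, or NULL if none is present. */
const struct clause *find_clause(const struct clause *clauses,
                                 enum clause_code code)
{
    for (; clauses != NULL; clauses = clauses->chain)
        if (clauses->code == code)
            return clauses;
    return NULL;
}

/* Pick the symbol a device-side call should reference: the bind target if a
   bind clause is present, otherwise the routine's own name. */
const char *device_callee(const struct clause *clauses,
                          const char *routine_name)
{
    const struct clause *bind = find_clause(clauses, CLAUSE_BIND);
    return bind != NULL ? bind->name : routine_name;
}
```

The lookup is deliberately separate from the decision, so a later stage such as 
device lowering can ask for the callee without knowing how clauses are stored.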

This new information will also need to be lowered into a device-specific IR 
(PTX and GCN) so that the accelerator can pull the procedure named in the 
bind clause for use. The relevant directories here will be gcc/config/nvptx and 
gcc/config/gcn.

Timeline:
Week1: Finish research and general design details for implementing the bind and 
device_type clauses.
Week2: Implement parser support for the bind clause. This will include 
debugging and ensuring that all new code passes GCC’s existing testing 
infrastructure. Additional tests will be designed to verify proper 
functionality.
Week3: Implement the GIMPLE representation of the bind clause. Ensure debugging 
and testing are performed on both the parser and the GIMPLE representation.
Week4: Map the routines to external devices (PTX and GCN). We will be able to 
test whether the accelerators are using the correct routine by inspecting what 
is emitted from the backend. Another useful metric would be the compute time, 
as we expect better performance when offloading to an accelerator.
Week5: Finalize the implementation and test everything as a whole, ensuring 
that the existing GCC tests pass as well as any important new tests implemented 
along the way.

Implementing support for the device_type clause:
Similarly to the bind clause, the device_type clause can specify which device a 
specific procedure is used on. This would allow a user to write many different 
procedures with device-specific implementations. Bind and device_type are 
closely related thematically, hence my interest in both of them.

Thankfully, the general method for implementing this clause is very similar. 
The device_type clause will be parsed like any other OpenACC clause, then 
lowered into both GIMPLE and device-specific IR. If Fortran is included as well 
(which is very likely), it will go from the AST to GENERIC and then to GIMPLE. 
One major difference between device_type and bind is that device_type is a 
configuration setting, so we would ideally like to store this setting for any 
future compiler use by the user. A feature to change or disable a specific 
device for a procedure should be considered.
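
To show how the two clauses could combine, here is a hedged user-level sketch 
in which each device_type selects a different bind target for one routine. The 
function names are hypothetical, and the device type names nvidia and radeon 
are assumptions, since the accepted names are implementation-defined. A 
compiler without OpenACC enabled ignores the pragmas and runs the host 
definition.

```c
/* Hypothetical device-specific implementations, declared only. */
#pragma acc routine seq
float sum_squares_nvidia(const float *v, int n);
#pragma acc routine seq
float sum_squares_radeon(const float *v, int n);

/* On the host, sum_squares() runs as written. In device code, a call to
   sum_squares() would resolve to the implementation bound for the device
   type being compiled for. */
#pragma acc routine seq \
    device_type(nvidia) bind(sum_squares_nvidia) \
    device_type(radeon) bind(sum_squares_radeon)
float sum_squares(const float *v, int n)
{
    float sum = 0.0f;
    for (int i = 0; i < n; i++)
        sum += v[i] * v[i];
    return sum;
}
```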

The relevant files are also similar, though I will add some additional ones for 
configuration purposes.
gcc/common/config.cc
gcc/config/gcn/gcn.cc
gcc/config/nvptx/nvptx.cc
gcc/common/config/nvptx/nvptx-common.cc
gcc/lto-section-in.cc and gcc/lto-section-out.cc
gcc/lto-wrapper.cc

Week6: Begin implementing device_type in parsers. Testing + semantic 
analysis/validity
Week7: Implement relevant GIMPLE (GENERIC) nodes for device_type. Will include 
storing info 

[GSoC] OpenACC bind() and device_type

2025-04-04 Thread Carter Weidman via Gcc
Here is my (potentially naive) understanding of how we may implement bind() and 
device_type (probably in a follow-up email) in GCC. These are the primary parts 
of the project that I am interested in. I would be open to exploring the cache 
directive as well, though I would like to start with a limited scope and get 
feedback on how long just the first two could take. 

The bind() clause, part of the “routine directive” section of the OpenACC spec, 
defines a way to specify the name to use when calling a procedure from an 
accelerator device. This can be done in two ways: with an identifier, such as 
device_function, or with a string, such as “device_function”. If it is an 
identifier, it will be treated like any other function name in your source 
code. If a string is passed, the compiler will not apply any naming rules or 
mangling, and the accelerator (or other device) will target that exact name. 

Implementation:
The bind clause will need to be parsed, represented in GIMPLE IR, lowered and 
mapped to an external device, and have bindings generated for that device 
(nvptx/gcn). It will then be tested and validated using the existing GCC test 
infrastructure plus some tests of my own. 

These are the relevant files for each part:

gcc/c-family/c-pragma.cc
gcc/c/c-parser.cc
gcc/cp/parser.cc

gcc/gimple.h
gcc/gimple.def

gcc/omp-low.cc

gcc/config/nvptx/ and/or gcc/config/gcn/

gcc/testsuite/c-c++-common/goacc/

Potential Runtime integration: 
libgomp/oacc/… (did not do much research on this yet so I may not know what I’m 
talking about)

I also have significant research prepared for implementing the device_type 
clause, but would like feedback on whether or not I’m in the correct 
ballpark with bind() first. I have further implementation details prepared but 
definitely need more time to get comfortable with the codebase and its 
intricacies. I believe I could spend an entire summer on just these two, but I 
don’t have a clear enough vision of this project to make an accurate judgement. 

Be harsh! 

Thank you,
Carter.