[Bug target/118518] gcc 14.2.1 nvptx cross compiler complains about alias definitions in a struct with two constructors that are not aliases
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118518

Benjamin Schulz changed:

           What    |Removed |Added
  ----------------------------------
  Attachment #60176|0       |1
        is obsolete|        |
  Attachment #60177|0       |1
        is obsolete|        |
  Attachment #60178|0       |1
        is obsolete|        |

--- Comment #7 from Benjamin Schulz ---
Created attachment 60371 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60371&action=edit mdspan_acc
[Bug target/118518] gcc 14.2.1 nvptx cross compiler complains about alias definitions in a struct with two constructors that are not aliases
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118518 --- Comment #9 from Benjamin Schulz --- Created attachment 60373 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60373&action=edit cmakelists.txt
[Bug target/118518] gcc 14.2.1 nvptx cross compiler complains about alias definitions in a struct with two constructors that are not aliases
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118518 --- Comment #8 from Benjamin Schulz --- Created attachment 60372 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60372&action=edit main.cpp
[Bug target/118518] gcc 14.2.1 nvptx cross compiler complains about alias definitions in a struct with two constructors that are not aliases
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118518 --- Comment #10 from Benjamin Schulz --- The newly attached files compile with nvc++ from NVIDIA, but nvc++ has difficulties compiling the OpenMP part, so it would be good if I could compile this with gcc. What must I do to enable aliasing when building the offload compiler? Must I pass -malias somewhere in the build process? At crossdev? Via a patch?
[Bug target/118518] gcc 14.2.1 nvptx cross compiler complains about alias definitions in a struct with two constructors that are not aliases
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118518 --- Comment #11 from Benjamin Schulz --- If I write something like this:

    SET (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fopenacc -foffload=nvptx-none -foffload=-malias -fcf-protection=none -fno-stack-protector -U_FORTIFY_SOURCE -std=c++23 -no-pie")

it still complains that alias definitions are not supported.
[Bug target/118738] New: the following code compiles with nvc++ and even clang and raises an internal Compiler-error: in expand_UNIQUE, bei internal-fn.cc:3302
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118738

            Bug ID: 118738
           Summary: the following code compiles with nvc++ and even clang and raises an internal Compiler-error: in expand_UNIQUE, bei internal-fn.cc:3302
           Product: gcc
           Version: 14.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: schulz.benjamin at googlemail dot com
  Target Milestone: ---

Created attachment 60368 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60368&action=edit mdspan.h

    /home/benni/projects/arraylibrary/mdspan_acc.h:203:9: internal compiler error: in expand_UNIQUE, at internal-fn.cc:3302
      203 | #pragma acc loop vector independent reduction(+:offset)
          |         ^~~
    0x5641637d822f internal_error(char const*, ...)
            ???:0
    0x564161a3897d fancy_abort(char const*, int, char const*)
            ???:0
    Please submit a full bug report;

Note that the output of nvc++ for the same place is:

    204, Generated vector simd code for the loop containing reductions
[Bug target/118738] the following code compiles with nvc++ and even clang and raises an internal Compiler-error: in expand_UNIQUE, bei internal-fn.cc:3302
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118738 --- Comment #1 from Benjamin Schulz --- Created attachment 60369 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60369&action=edit main.cpp
[Bug target/118738] the following code compiles with nvc++ and even clang and raises an internal Compiler-error: in expand_UNIQUE, bei internal-fn.cc:3302
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118738 --- Comment #2 from Benjamin Schulz --- Created attachment 60370 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60370&action=edit cmakelists.txt
[Bug target/118794] The attached c++ openmp offload code fails, because the c sqrt function makes nonlocal gotos..
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118794 --- Comment #6 from Benjamin Schulz --- Hi, thanks for the fast reply. Unfortunately, none of these works. Adding the -fno-math-errno option still raises this error, even if I pass it to the offload compiler with -foffload= -fno-math-errno. The assert does not work either, and neither does the builtin unreachable option. One problem is that there is no unsigned double in C++. Strangely, not even this compiles:

    T norm=fabs(gpu_dot_product_w(v,v));
    T normc= sqrt(norm);
    // const T normc=norm;
    #pragma omp parallel for
    for (size_t i = 0; i < pext0; ++i)
    {
        v(i,pstrv0)= v(i,pstrv0)/normc;
    }

Is there another division-by-zero problem? I do not really know what is going on here.
[Bug target/118794] The attached c++ openmp offload code fails, because the c sqrt function makes nonlocal gotos..
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118794 --- Comment #8 from Benjamin Schulz --- With this here it is satisfied:

    // Normalize v
    T norm=0;
    // T norm=fabs(gpu_dot_product_w(v,v));
    T normc= sqrt(norm);
    // const T normc=norm;
    #pragma omp parallel for
    for (size_t i = 0; i < pext0; ++i)
    {
        v(i,pstrv0)= v(i,pstrv0)/normc;
    }

However, gpu_dot_product_w is called before, as

    T dot_pr=gpu_dot_product_w(u,v);

which does not throw any nonlocal gotos. It is defined like this:

    template <typename T>
    inline T gpu_dot_product_w( const datastruct<T>& vec1, const datastruct<T> &vec2)
    {
        const size_t n=vec1.pextents[0];
        const size_t strv1=vec1.pstrides[0];
        const size_t strv2=vec2.pstrides[0];
        T result=0;
        #pragma omp parallel for reduction(+:result)
        for (size_t i = 0; i < n; ++i)
        {
            result += vec1(i,strv1) * vec2(i,strv2);
        }
        return result;
    }

and the operators are these:

    #pragma omp begin declare target
    template <typename T>
    inline T& datastruct<T>::operator()(const size_t row, const size_t stride)
    {
        return pdata[row * stride];
    }
    #pragma omp end declare target

None of this has anything to do with gotos. pstrides are pointers to non-STL arrays. I do not know what is going on here.
[Bug target/118794] The attached c++ openmp offload code fails, because the c sqrt function makes nonlocal gotos..
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118794 --- Comment #7 from Benjamin Schulz --- With this here it is satisfied:

    // Normalize v
    T norm=0;
    // T norm=fabs(gpu_dot_product_w(v,v));
    T normc= sqrt(norm);
    // const T normc=norm;
    #pragma omp parallel for
    for (size_t i = 0; i < pext0; ++i)
    {
        v(i,pstrv0)= v(i,pstrv0)/normc;
    }

However, gpu_dot_product_w is called before, as

    T dot_pr=gpu_dot_product_w(u,v);

which does not throw any nonlocal gotos. It is defined like this:

    template <typename T>
    inline T gpu_dot_product_w( const datastruct<T>& vec1, const datastruct<T> &vec2)
    {
        const size_t n=vec1.pextents[0];
        const size_t strv1=vec1.pstrides[0];
        const size_t strv2=vec2.pstrides[0];
        T result=0;
        #pragma omp parallel for reduction(+:result)
        for (size_t i = 0; i < n; ++i)
        {
            result += vec1(i,strv1) * vec2(i,strv2);
        }
        return result;
    }

and the operators are these:

    #pragma omp begin declare target
    template <typename T>
    inline T& datastruct<T>::operator()(const size_t row, const size_t stride)
    {
        return pdata[row * stride];
    }
    #pragma omp end declare target

None of this has anything to do with gotos. pstrides are pointers to non-STL arrays. I do not know what is going on here.
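For readers following the normalization step discussed in this comment, here is a minimal host-side sketch of the same pattern. It is not the reporter's code: `dot_strided` and `normalize_strided` are hypothetical names, raw pointers stand in for `datastruct`, and a zero-norm guard is added as an assumption, since the `T norm=0;` variant divides by zero. It does not address the "nonlocal goto" diagnostic itself.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical helper: strided dot product with an OpenMP reduction,
// mirroring the shape of gpu_dot_product_w from the comment.
template <typename T>
T dot_strided(const T* a, std::size_t stride_a,
              const T* b, std::size_t stride_b, std::size_t n) {
    T result = 0;
    #pragma omp parallel for reduction(+:result)
    for (std::size_t i = 0; i < n; ++i) {
        result += a[i * stride_a] * b[i * stride_b];
    }
    return result;
}

// Hypothetical helper: scales v to unit length in place.
// Returns false and leaves v untouched when the squared norm is not
// strictly positive (zero vector or NaN input).
template <typename T>
bool normalize_strided(T* v, std::size_t stride, std::size_t n) {
    const T norm_sq = dot_strided(v, stride, v, stride, n);
    if (!(norm_sq > T(0))) {
        return false;
    }
    const T norm = std::sqrt(norm_sq);
    #pragma omp parallel for
    for (std::size_t i = 0; i < n; ++i) {
        v[i * stride] /= norm;
    }
    return true;
}
```

Without `-fopenmp` the pragmas are ignored and the code runs serially, which makes it easy to check the arithmetic on the host before trying an offload build.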
[Bug target/118794] The attached c++ openmp offload code fails, because the c sqrt function makes nonlocal gotos..
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118794 --- Comment #1 from Benjamin Schulz --- Created attachment 60423 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60423&action=edit main.cpp
[Bug target/118794] New: The attached c++ openmp offload code fails, because the c sqrt function makes nonlocal gotos..
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118794

            Bug ID: 118794
           Summary: The attached c++ openmp offload code fails, because the c sqrt function makes nonlocal gotos..
           Product: gcc
           Version: 14.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: schulz.benjamin at googlemail dot com
  Target Milestone: ---

Created attachment 60422 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60422&action=edit header

Hi there, the attached OpenMP offload code fails for two reasons. First, there is an overloaded constructor, and even though I pass -malias to the offload compiler, it will not support this. What is more embarrassing is the second problem, in the function qr_decomposition. gcc refuses to compile that function with the message "target can not support nonlocal goto". It does not state precisely which line is responsible, but by deleting code step by step one finds that it is line 1964 which causes the problem:

    const T normc=sqrt(norm);

Of course, a square root cannot work for values less than 0, but I still have to call it on the GPU, so what now?
[Bug target/118794] The attached c++ openmp offload code fails, because the c sqrt function makes nonlocal gotos..
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118794 --- Comment #2 from Benjamin Schulz --- Created attachment 60424 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60424&action=edit cmakelists.txt
[Bug target/118794] The attached c++ openmp offload code fails, because the c sqrt function makes nonlocal gotos..
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118794 --- Comment #3 from Benjamin Schulz --- gcc --version gcc (Gentoo 14.2.1_p20241221 p7) 14.2.1 20241221
[Bug target/118590] internal compiler error: in build_omp_array_section, bei cp/typeck.cc:4823
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118590 --- Comment #1 from Benjamin Schulz --- Created attachment 60225 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60225&action=edit main_acc
[Bug target/118590] New: internal compiler error: in build_omp_array_section, bei cp/typeck.cc:4823
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118590

            Bug ID: 118590
           Summary: internal compiler error: in build_omp_array_section, bei cp/typeck.cc:4823
           Product: gcc
           Version: 14.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: schulz.benjamin at googlemail dot com
  Target Milestone: ---

Created attachment 60224 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60224&action=edit mdspan_acc class

The following files use OpenACC and OpenMP (I ported parts of my earlier OpenMP code because there are more examples of uploading classes with the OpenACC device mapper). They compile with nvc++ and with (at least yesterday's) git version of clang. But gcc produces the following output:

    /home/benni/projects/arraylibrary/mdspan_acc.h: In function 'void cholesky_decomposition(mdspan&, mdspan&, matrix_multiplication_parameters, size_t, bool, bool)':
    /home/benni/projects/arraylibrary/mdspan_acc.h:2764:56: internal compiler error: in build_omp_array_section, at cp/typeck.cc:4823
     2764 | #pragma acc enter data copyin(dA.pdata[0:dA.pdatalength])
          |                                                        ^
    0x5631935b722f internal_error(char const*, ...)
            ???:0
    0x56319181797d fancy_abort(char const*, int, char const*)
            ???:0
    0x563191a0e348 c_parse_file()
            ???:0
    0x563191b23729 c_common_parse_file()
            ???:0
[Bug target/118590] internal compiler error: in build_omp_array_section, bei cp/typeck.cc:4823
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118590 --- Comment #2 from Benjamin Schulz --- Created attachment 60226 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60226&action=edit cmakelists.txt
[Bug target/118590] internal compiler error: in build_omp_array_section, bei cp/typeck.cc:4823
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118590 --- Comment #3 from Benjamin Schulz --- gcc --version gcc (Gentoo 14.2.1_p20241221 p7) 14.2.1 20241221
[Bug c++/118518] New: gcc 14.2.1 nvptx cross compiler complains about alias definitions in a struct with two constructors that are not aliases
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118518

            Bug ID: 118518
           Summary: gcc 14.2.1 nvptx cross compiler complains about alias definitions in a struct with two constructors that are not aliases
           Product: gcc
           Version: 14.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: schulz.benjamin at googlemail dot com
  Target Milestone: ---

Created attachment 60176 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60176&action=edit mdspan template classes

Hi there, I tried to write my own version of the new mdspan class, but one which works with non-compile-time extents (i.e. extents that can be set from the input) and with GPU offload support.

The following C++23 code contains interesting algorithms for matrix multiplication (the Strassen algorithm and the Winograd version of the Strassen algorithm) as well as some advanced and new algorithms for Cholesky, LU and QR decomposition. It has two interfaces: one that uses the STL, with functions that can work on the GPU and may offload parts of the code to the GPU, and another with functions that can reside entirely on the GPU.

It is certainly a bit more advanced than small OpenMP test cases. It uses target teams, parallel for and simd constructs, as well as declare target areas with functions on the target, memory allocation, and pointer arithmetic for sub-arrays. It thus represents a nice compiler test.

On clang, it at least compiles, but the function offload fails for some reason, and the Winograd version has a problem with the OpenMP runtime. On gcc, it does not even compile. The options

    -fopenmp -foffload=nvptx-none -fcf-protection=none -fno-stack-protector -std=c++23 -no-pie -lrt -lm -lc -lstdc++

make gcc claim that "alias definitions" are not supported in this configuration:
    [100%] Linking CXX executable arraytest
    /usr/bin/cmake -E cmake_link_script CMakeFiles/arraytest.dir/link.txt --verbose=1
    /home/benni/projects/arraylibrary/mdspan.h:356:22: error: alias definitions not supported in this configuration
      356 | template <typename T> datastruct<T>::datastruct(
          | ^
    /home/benni/projects/arraylibrary/mdspan.h:386:22: error: alias definitions not supported in this configuration
      386 | template <typename T> datastruct<T>::datastruct(
          | ^
    nvptx mkoffload: fatal error: /usr/bin/x86_64-pc-linux-gnu-accel-nvptx-none-gcc returned 1 exit status

But if we look at the source code, we get the following, which does not look like aliases. Perhaps gcc has difficulties with the offloaded functions, which may be aliases generated by the compiler?

    #pragma omp begin declare target
    template <typename T> datastruct<T>::datastruct(
        T* data, size_t pdatalength, bool rowm, size_t rank,
        size_t* extents, size_t* strides,
        bool compute_datalength, bool compute_strides_from_extents
    ) : pdata(data), pextents(extents), pstrides(strides),
        pdatalength(pdatalength), prank(rank), prowmayor(rowm)
    {
        if(compute_strides_from_extents==true && pextents!=nullptr && pstrides!=nullptr && rank !=0)
        {
            fill_strides(pextents,pstrides,rank,rowm);
        }
        if(compute_datalength==true && pextents!=nullptr && pstrides!=nullptr && rank !=0)
        {
            pdatalength=compute_data_length(pextents,pstrides,rank);
        }
    }
    #pragma omp end declare target

    #pragma omp begin declare target
    template <typename T> datastruct<T>::datastruct(
        T* data, size_t datalength, bool rowm, size_t rows, size_t cols,
        size_t* extents, size_t* strides,
        bool compute_datalength, bool compute_strides_from_extents
    ) : pdata(data), pextents(extents), pstrides(strides),
        pdatalength(datalength), prank(2), prowmayor(rowm)
    {
        if(extents!=nullptr)
        {
            pextents[0]=(rowm==true)?rows:cols;
            pextents[1]=(rowm==true)?cols:rows;
        }
        if(pstrides!=nullptr && compute_strides_from_extents)
        {
            pstrides[0]=(rowm==true)? cols:1;
            pstrides[1]=(rowm==true)?1: rows;
        }
        if(compute_datalength==true && extents!=nullptr && strides!=nullptr)
        {
            pdatalength=(rows-1) * strides[0]+(cols-1)*strides[1]+1;
        }
    }
    #pragma omp end declare target

    #pragma omp begin declare target
    template <typename T> datastruct<T>::datastruct(
        T* data, size_t datalength, bool rowm, bool rowvector, size_t noelements,
        size_t* extents, size_t* strides,
        bool compute_datalength, bool compute_strides_from_extents
    ) : pdata(data), pextents(extents), pstrides(strides),
        pdatalength(datalength), prank(1), prowmayor(true)
    {
        if(extents!=nullptr)
        {
            pextents[0]=noelements;
        }
        if(pstrides!=nullptr && compute_strides_from_extents)
        {
            if(rowvector) pstrides[0]=(rowm==true)? 1:noelements;
[Bug c++/118518] gcc 14.2.1 nvptx cross compiler complains about alias definitions in a struct with two constructors that are not aliases
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118518

Benjamin Schulz changed:

           What    |Removed |Added
  ----------------------------------
             Target|nvptx   |
          Component|target  |c++

--- Comment #3 from Benjamin Schulz ---
I use this configuration:

    ~/projects/arraylibrary $ eselect gcc list
     [1] nvptx-none-14 *
     [2] x86_64-pc-linux-gnu-13
     [3] x86_64-pc-linux-gnu-14 *

    gcc (Gentoo 14.2.1_p20241221 p7) 14.2.1 20241221
    Copyright (C) 2024 Free Software Foundation, Inc.
[Bug c++/118518] gcc 14.2.1 nvptx cross compiler complains about alias definitions in a struct with two constructors that are not aliases
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118518 --- Comment #2 from Benjamin Schulz --- Created attachment 60178 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60178&action=edit cmakelists.txt
[Bug c++/118518] gcc 14.2.1 nvptx cross compiler complains about alias definitions in a struct with two constructors that are not aliases
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118518 --- Comment #1 from Benjamin Schulz --- Created attachment 60177 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60177&action=edit main.cpp
[Bug c++/118518] gcc 14.2.1 nvptx cross compiler complains about alias definitions in a struct with two constructors that are not aliases
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118518

Benjamin Schulz changed:

           What    |Removed |Added
  ----------------------------------
             Target|nvptx   |
          Component|target  |c++

--- Comment #4 from Benjamin Schulz ---
By the way, feel free to use this file for testing purposes. I guess it exercises more involved things than a simple test file. Because of its complexity it may have bugs of its own, but since it uses many OpenMP features, it may be an interesting test case.

One thing that fails at runtime with clang is the function offload, although I think I do this according to the OpenMP standard. Another thing that fails at runtime with clang is the attempt to mimic the multiplication with "tiles", where only a subset of the matrix array data is uploaded to the GPU.

Once one can compile the file, one can test all these things. One can also test memory allocation: the algorithms make use of sub-arrays, pointer arithmetic, allocation and de-allocation of memory and so on. So feel free to test it. On the host, the computations work.
[Bug target/118520] New: compiles with clang on openmp target, but gcc fails to compile with unresolved symbol __cxa_throw_bad_array_new_length
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118520

            Bug ID: 118520
           Summary: compiles with clang on openmp target, but gcc fails to compile with unresolved symbol __cxa_throw_bad_array_new_length
           Product: gcc
           Version: 14.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: schulz.benjamin at googlemail dot com
  Target Milestone: ---

Created attachment 60179 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60179&action=edit openmp offload test for simple pointer arithmetic

The attached file compiles with clang and runs on the target, but

    gcc -O3 -fopenmp -foffload=nvptx-none -fcf-protection=none -fno-stack-protector -no-pie ./main.cpp -lm -lstdc++ -lrt

yields:

    unresolved symbol __cxa_throw_bad_array_new_length

If it ran, its results would be rather interesting, since with clang the program shows runtime problems when the lines that are commented out in the source code are activated. I have two devices, one GPU and one CPU, yet the OpenMP device number (at least on clang) is always 1. This confuses the target enter data and target exit data directives of OpenMP. If I set them to work on device(1), they silently work on the host with clang. If I run them on device(0), the map(alloc: ) clause erases data on the host.

But that is for clang. With gcc, this small test file does not even compile. I guess that is because there is a new expression in a target region, or for some other reason. But I would expect to have the C++ language available on the target?
[Bug target/101544] [OpenMP][AMDGCN][nvptx] C++ offloading: unresolved _Znwm = "operator new(unsigned long)"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101544 --- Comment #12 from Benjamin Schulz --- In my view, not only the new operator is important, but also things like printf, which apparently does not exist on the target either. The problem is that with OpenMP you can only check whether you are really on the target or on the host by doing:

    if(omp_is_initial_device()!=true)
    {
        printf("firsttest runs on target\n");
    }
    else
        printf("runs on host");

On clang, this works. On my system, with one CPU and one GPU, it turned out on clang that the OpenMP device number was always 1. The map clauses on device 0 would map to the GPU. However, target enter data and target exit data on device 0 got confused, with a map(alloc..) on device 0 erasing the host data, and a target enter data on device 1 made everything work on the host. After this experience with clang, I would use the test above rather often.

On gcc, this appears not to be available. Compilation of such a simple statement fails for the GPU target with "unresolved symbol __printf_chk". This is somewhat embarrassing. The GPU should be able to print text, as it can even render entire video games. Having functions that print text is simply necessary for debugging more complex code. The same could be said about exceptions: using the STL apparently fails because these classes use nonlocal gotos, aka exceptions.
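The host/target check described in this comment can also be done without calling printf inside the target region, by copying a flag back to the host. This is only a sketch: `target_region_ran_on_device` and `execution_place` are hypothetical names, and the stub in the `#else` branch is an assumption so that the file also builds without `-fopenmp`.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

#ifdef _OPENMP
#include <omp.h>
#else
// Assumed fallback for builds without -fopenmp: everything is host code.
static int omp_is_initial_device() { return 1; }
#endif

// Runs an empty-ish target region and reports whether it executed on an
// offload device. The flag is mapped back to the host, so no I/O is
// needed on the device.
bool target_region_ran_on_device() {
    int on_host = 1;
    #pragma omp target map(from: on_host)
    {
        on_host = omp_is_initial_device();
    }
    return on_host == 0;
}

// Hypothetical convenience wrapper returning a printable label.
const char* execution_place() {
    return target_region_ran_on_device() ? "target" : "host";
}

// Prints the result on the host side.
void report_execution_place() {
    std::printf("firsttest runs on %s\n", execution_place());
}
```

If the runtime falls back to host execution (no device available, or no offload compiler configured), the function simply returns false, so the same check can be left in place in both kinds of build.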
[Bug target/101544] [OpenMP][AMDGCN][nvptx] C++ offloading: unresolved _Znwm = "operator new(unsigned long)"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101544 --- Comment #17 from Benjamin Schulz --- -U_FORTIFY_SOURCE worked. Thanks Sam.
[Bug target/101544] [OpenMP][AMDGCN][nvptx] C++ offloading: unresolved _Znwm = "operator new(unsigned long)"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101544 --- Comment #18 from Benjamin Schulz --- Interesting. As with clang, after the printf issue was resolved, a simple test program shows:

    Number of available devices 1

but also:

    firsttest runs on target

Well, I have a GPU and a CPU. That would be two devices. On clang, target enter data and target exit data were confused by this, with #pragma omp target enter data(alloc: ) device(0) and target exit data map(release) deleting the host data at the beginning. Fortunately, #pragma omp target map(tofrom: was working. I guess I have to test whether this problem also exists on gcc. It is in any case a bit confusing if one has one target and one host, both of which can be mapped to, and the OpenMP device number is just 1.
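As background to the device count reported above: the OpenMP specification defines `omp_get_num_devices()` as the number of non-host devices, so a machine with one GPU reports 1 and that GPU is device number 0. The following sketch collects the relevant runtime queries in one place. `DeviceInfo` and `query_devices` are hypothetical names, and the stubs in the `#else` branch are assumptions for builds without `-fopenmp`; the value returned for the initial (host) device may differ between runtime versions.

```cpp
#include <cassert>
#include <cstdio>

#ifdef _OPENMP
#include <omp.h>
#else
// Assumed fallbacks for builds without -fopenmp: no offload devices.
static int omp_get_num_devices() { return 0; }
static int omp_get_default_device() { return 0; }
static int omp_get_initial_device() { return 0; }
#endif

// Hypothetical aggregate of the device-numbering queries.
struct DeviceInfo {
    int num_offload_devices;  // non-host devices only
    int default_device;       // device used when no device() clause is given
    int initial_device;       // device number that denotes the host
};

DeviceInfo query_devices() {
    DeviceInfo info;
    info.num_offload_devices = omp_get_num_devices();
    info.default_device = omp_get_default_device();
    info.initial_device = omp_get_initial_device();
    return info;
}

void print_devices(const DeviceInfo& info) {
    std::printf("offload devices: %d, default device: %d, host device number: %d\n",
                info.num_offload_devices, info.default_device, info.initial_device);
}
```

Printing these three values before any `target enter data` or `target exit data` directive makes it easier to see which number a `device(...)` clause should carry on a given system.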
[Bug target/101544] [OpenMP][AMDGCN][nvptx] C++ offloading: unresolved _Znwm = "operator new(unsigned long)"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101544 --- Comment #16 from Benjamin Schulz ---

"BTW, if you're calling "new" in the offload kernel then you're probably "doing it wrong","

I do not think so. For more complex mathematical algorithms, there are many situations where we need temporary buffers to store some data. Let us look, for example, at this QR decomposition: https://arxiv.org/pdf/1812.02056 You need a copy of the initial data, and you need to allocate free space for a temporary matrix C on p. 5.

The host can demand that the target allocates this with functions like omp_target_alloc, or with #pragma omp target enter data map(alloc: ), or with omp target map(alloc:). However, when you are in a #pragma omp declare target region, there is not much reason why the function should not be able to create the needed temporary buffer at its beginning with a call like double* x = new double[mysize] on the target.

When the buffer is created within the target, this has the benefit that the caller on the host may just wrap the function in a #pragma omp target area, map its arguments to the device, call it and give it its input; the function then allocates every temporary buffer it needs by itself at the beginning. I think this is better code style than the host advising the target to allocate a temporary buffer. That way, the same function can even be run on the host, and runs on the target only if wrapped in a #pragma omp target map(tofrom: data){myfunction(data)} area.

One can, however, note that omp_alloc and omp_target_alloc are better for this, since the memory allocators allow specifying whether one wants to put the data into fast memory or into memory for large data (provided the hardware has such a designation). This allows one, e.g., to put the strides of a matrix into fast memory and the matrix data into memory for large data. The #pragma omp target map clauses and the new[] operator of C++ do not have this flexibility.
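The code style argued for in this comment can be sketched as follows. The example is an illustration under assumptions, not code from the attachments: `sum_sq_dev` and `sum_sq_dev_offloaded` are hypothetical names, and whether `new[]` inside a target region links at all depends on the compiler and offload runtime, which is exactly what this bug is about.

```cpp
#include <cassert>
#include <cstddef>

#pragma omp declare target
// Hypothetical target-capable function that allocates its own temporary
// buffer: returns the sum of squared deviations from the mean.
double sum_sq_dev(const double* data, std::size_t n) {
    if (n == 0) {
        return 0.0;
    }
    double* tmp = new double[n];  // temporary owned by this function
    double mean = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        mean += data[i];
    }
    mean /= static_cast<double>(n);
    for (std::size_t i = 0; i < n; ++i) {
        tmp[i] = data[i] - mean;
    }
    double acc = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        acc += tmp[i] * tmp[i];
    }
    delete[] tmp;
    return acc;
}
#pragma omp end declare target

// Host-side wrapper: maps only the arguments and lets the function
// manage its temporary storage itself.
double sum_sq_dev_offloaded(const double* data, std::size_t n) {
    if (n == 0) {
        return 0.0;
    }
    double result = 0.0;
    #pragma omp target map(to: data[0:n]) map(from: result)
    {
        result = sum_sq_dev(data, n);
    }
    return result;
}
```

The same `sum_sq_dev` can be called directly on the host, which is the point made above: the caller decides where the function runs, and the function does not need to know.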
[Bug target/118518] gcc 14.2.1 nvptx cross compiler complains about alias definitions in a struct with two constructors that are not aliases
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118518 --- Comment #6 from Benjamin Schulz --- How do I do that? -malias is not a compiler option that is listed. What is strange: when I write a simple test program in C, i.e. without classes, I can define functions in a #pragma omp target region and they work. But apparently not these constructors.
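To narrow down the difference between the plain-C case and the constructor case, a reduced test along the following lines may help. It is a sketch, not a confirmed reproducer: `View` is a hypothetical stand-in for `datastruct`, and the suspected mechanism is an assumption, namely that GCC can emit the complete-object and base-object variants of a constructor as aliases of one another, which would explain an alias diagnostic on source that contains no explicit alias.

```cpp
#include <cassert>
#include <cstddef>

#pragma omp declare target
// Hypothetical stand-in for datastruct: a class template with two
// overloaded constructors that are usable on the target.
template <typename T>
struct View {
    T* data;
    std::size_t rows;
    std::size_t cols;

    // Vector-style constructor.
    View(T* d, std::size_t n) : data(d), rows(1), cols(n) {}
    // Matrix-style constructor.
    View(T* d, std::size_t r, std::size_t c) : data(d), rows(r), cols(c) {}

    T& operator()(std::size_t i, std::size_t j) { return data[i * cols + j]; }
};
#pragma omp end declare target

// Constructs a View inside a target region and sums its elements.
double sum_on_target(double* buf, std::size_t r, std::size_t c) {
    double s = 0.0;
    #pragma omp target map(to: buf[0:r * c]) map(tofrom: s)
    {
        View<double> m(buf, r, c);
        for (std::size_t i = 0; i < r; ++i) {
            for (std::size_t j = 0; j < c; ++j) {
                s += m(i, j);
            }
        }
    }
    return s;
}
```

Building this once with and once without the offload options used in this report would show whether the diagnostic depends on the constructors themselves or on something else in the larger header.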
[Bug target/118518] gcc 14.2.1 nvptx cross compiler complains about alias definitions in a struct with two constructors that are not aliases
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118518 --- Comment #13 from Benjamin Schulz --- Hi, many thanks for the effort, and sorry for the late reply.

In desperation, I have by now tried to switch my code to OpenMP entirely and compile it with clang. But OpenACC has more parallelization levels (gang, worker, vector) than OpenMP, since simd on the GPU does not work with clang. I also found the teams distribute construct of OpenMP very restrictive: once it appears, the code cannot have statements between #pragma omp target and #pragma omp teams distribute. This makes it difficult to do some work on the target and then write a teams distribute loop on the device. So I will try OpenACC again.

I use Gentoo. They already have gcc ebuilds of the 15.0 versions at https://packages.gentoo.org/packages/sys-devel/gcc (for example 15.0.1_pre20250323-r1 in slot 15). Since all of Gentoo is compiled with gcc every day, I suppose I will not switch to the 15.0 version for now, as it is development code. I do not know whether gcc version 15.0.1_pre20250323-r1 already contains the fixes.

If a sufficiently stable version of gcc appears that contains the fixes needed to compile my code, I will happily try it out and convert my code again to use OpenACC on the device and OpenMP on the host, if this brings some speed.
[Bug target/118518] gcc 14.2.1 nvptx cross compiler complains about alias definitions in a struct with two constructors that are not aliases
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118518 --- Comment #14 from Benjamin Schulz --- By the way, the numbers are for simple tests and are correct. As for the segfault in the last test, this may have been an error of my own, but I don't know yet. I guess I must look at my old code again to see why it appears. (The version that I currently have has changed code and no segfault, at least with clang.) I will test everything once I get my hands on gcc 15.