[Bug c++/11211] trivial static initializers of const objects should be done at compile time
--- Comment #7 from eric-bugs at omnifarious dot org 2008-12-28 11:21 --- I've been meaning to revisit this bug with a recent version of gcc. And, in fact it still happens with gcc 4.3.0 -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=11211
[Bug tree-optimization/40210] New: gcc needs byte swap builtins
gcc needs some built-in functions for byte swapping. I've been experimenting with the various versions of byte swapping functions out there, and they either result in code that's opaque to the optimizer (i.e. swapping something twice is not considered a null operation) or the optimizer doesn't recognize that a byte swap is what's happening and renders it as a complex series of shift, and, and or instructions.

I know very little about the internals of gcc, but my ignorant preference would be to make tree-ssa recognize that code like this:

inline uint64_t byteswap_64(const uint64_t x)
{
    return ((((x) & 0xff00000000000000ull) >> 56)
          | (((x) & 0x00ff000000000000ull) >> 40)
          | (((x) & 0x0000ff0000000000ull) >> 24)
          | (((x) & 0x000000ff00000000ull) >> 8)
          | (((x) & 0x00000000ff000000ull) << 8)
          | (((x) & 0x0000000000ff0000ull) << 24)
          | (((x) & 0x000000000000ff00ull) << 40)
          | (((x) & 0x00000000000000ffull) << 56));
}

is a byte swap and optimize appropriately. If this were being done to an entire array, it might even be possible to vectorize it efficiently. This would also mean that code to pull specific bits out of a pre or post swap value could be moved around and fiddled to get the value out of a different place if it made for more efficient register usage.

--
           Summary: gcc needs byte swap builtins
           Product: gcc
           Version: 4.3.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: eric-bugs at omnifarious dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40210
[Bug tree-optimization/40210] gcc needs byte swap builtins
--- Comment #4 from eric-bugs at omnifarious dot org 2009-05-20 19:17 --- Ahh, OK. I hunted a bit to find something like that, but didn't find it. Thank you. I now have a slightly different bug, which is a mild inadequate optimization bug. :-) I'll cut it down to size and paste it in here. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40210
[Bug tree-optimization/40210] gcc byte swap builtins inadequately optimized
--- Comment #5 from eric-bugs at omnifarious dot org 2009-05-20 19:39 ---
This code:

#include <stdint.h>
#include <stddef.h>

inline uint64_t byteswap_64(const uint64_t x)
{
    return __builtin_bswap64(x);
}

inline uint32_t byteswap_32(const uint32_t x)
{
    return __builtin_bswap32(x);
}

extern void random_function(uint32_t a, uint64_t b, uint32_t c, uint64_t d);

void swapping(const uint32_t x32, const uint64_t x64)
{
    random_function(byteswap_32(x32), byteswap_64(x64),
                    byteswap_32(byteswap_32(x32)),
                    byteswap_64(byteswap_64(x64)));
}

void swaparray(uint64_t outvals[], char outtop[], const uint64_t invals[],
               const size_t size)
{
    size_t i = 0;
    for (i = 0; i < size; ++i) {
        outvals[i] = byteswap_64(invals[i]);
        outtop[i] = (byteswap_64(invals[i]) >> 56) & 0xffull;
    }
}

results in this assembly:

.globl swaparray
        .type   swaparray, @function
swaparray:
.LFB5:
        testq   %rcx, %rcx
        je      .L8
        xorl    %r8d, %r8d
        .p2align 4,,7
        .p2align 3
.L7:
        movq    (%rdx,%r8,8), %rax
        bswap   %rax
        movq    %rax, (%rdi,%r8,8)
        movq    (%rdx,%r8,8), %rax
        bswap   %rax
        shrq    $56, %rax
        movb    %al, (%rsi,%r8)
        incq    %r8
        cmpq    %r8, %rcx
        ja      .L7
.L8:
        rep
        ret
.LFE5:
        .size   swaparray, .-swaparray
        .p2align 4,,15
.globl swapping
        .type   swapping, @function
swapping:
.LFB4:
        bswap   %rsi
        bswap   %edi
        movq    %rsi, %rcx
        movl    %edi, %edx
        bswap   %rcx
        bswap   %edx
        jmp     random_function
.LFE4:
        .size   swapping, .-swapping

when compiled with gcc -O3 -mtune=native -march=native on an Opteron system.

Notice that in swapping, bswap is used twice on each value rather than having two move instructions and two bswap instructions. The optimizer is apparently unaware that bswap is its own inverse.

In swaparray the bswap operation is not subject to an obvious CSE optimization, nor is it realized that the latter line might be more efficiently implemented by movb %al, (%rsi,%r8) before the bswap operation.

--
eric-bugs at omnifarious dot org changed:

           What    |Removed                     |Added
            Summary|gcc needs byte swap builtins|gcc byte swap builtins
                   |                            |inadequately optimized

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40210
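The two identities this comment relies on can be checked directly. The following is a minimal sketch, not code from the report: the helper names swap_twice, top_after_swap and low_before_swap are hypothetical, and only the __builtin_bswap64 builtin comes from the original. It shows that swapping twice returns the input, and that the top byte of a swapped value is the low byte of the unswapped one. It says nothing about what assembly any particular gcc version emits.

```cpp
#include <cstdint>

// Hypothetical helper: a byte swap applied twice is the identity.
inline std::uint64_t swap_twice(std::uint64_t x)
{
    return __builtin_bswap64(__builtin_bswap64(x));
}

// Hypothetical helper: top byte of the swapped value, written the way
// swaparray writes it.
inline unsigned char top_after_swap(std::uint64_t x)
{
    return static_cast<unsigned char>((__builtin_bswap64(x) >> 56) & 0xffull);
}

// Hypothetical helper: the same byte read straight from the unswapped
// value, with no swap needed.
inline unsigned char low_before_swap(std::uint64_t x)
{
    return static_cast<unsigned char>(x & 0xffull);
}
```

For any input, top_after_swap(x) and low_before_swap(x) agree, which is the rewrite the comment suggests the optimizer could perform.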
[Bug tree-optimization/40210] gcc byte swap builtins inadequately optimized
--- Comment #7 from eric-bugs at omnifarious dot org 2009-05-20 20:22 --- I've been playing around a bit more, and I've noticed that gcc in general does not do a spectacular job of optimizing bitwise operations of any kind. Some kind of general framework for tracking the movements of individual bits and details like "16 bit values only have 16 bits, so using & to ensure this in various ways is a null operation." might actually do a lot to speed up a lot of code. I distinctly remember a time long past when I and a co-worker fiddled some complex bit operations this way and that to get the assembly out we knew was close to optimal for a tight inner loop. The resulting expression was significantly less clear than the most obvious way of stating the same thing and I also knew that if DEC changed their compiler in certain ways we'd have to do it all over again. As an example, there is no reason that: (x << 8) | (x >> 8) should result in better code than ((x & 0xffu) << 8) | ((x & 0xff00u) >> 8) when x is of type uint16_t, but it does. And recognizing that either can be done in one instruction on an x86 would be even better. So, while I think you are likely correct that the byteswap builtins do not need a lot of extensive optimization, I do think that bit operations in general could be handled a lot better, and that would help out a whole lot of code. Once that framework was in place optimizing the byteswap builtin would be trivial. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40210
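To make the uint16_t example concrete, here is a small sketch of the two expressions from the comment next to the 16-bit builtin. The function names are hypothetical, and __builtin_bswap16 is an assumption in the sense that it exists in later gcc releases and in clang but is not mentioned in the comment. The sketch only demonstrates that all three compute the same value; whether a given compiler emits the same instructions for each is exactly the open question raised above.

```cpp
#include <cstdint>

// The short form from the comment. x is promoted to int, so the result is
// truncated back to 16 bits by the cast.
inline std::uint16_t swap16_plain(std::uint16_t x)
{
    return static_cast<std::uint16_t>((x << 8) | (x >> 8));
}

// The masked form from the comment. The masks are redundant for a 16-bit x.
inline std::uint16_t swap16_masked(std::uint16_t x)
{
    return static_cast<std::uint16_t>(((x & 0xffu) << 8) | ((x & 0xff00u) >> 8));
}

// The builtin (assumed available: gcc 4.8 or later, or clang).
inline std::uint16_t swap16_builtin(std::uint16_t x)
{
    return __builtin_bswap16(x);
}

// Exhaustively compare the three forms over every 16-bit value.
inline bool all_swaps_agree()
{
    for (unsigned v = 0; v <= 0xffffu; ++v) {
        const std::uint16_t x = static_cast<std::uint16_t>(v);
        if (swap16_plain(x) != swap16_masked(x)) return false;
        if (swap16_plain(x) != swap16_builtin(x)) return false;
    }
    return true;
}
```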
[Bug c++/50087] [C++0x] Weird optimization anomaly with constexpr
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50087 eric-bugs at omnifarious dot org changed: What|Removed |Added CC||eric-bugs at omnifarious ||dot org --- Comment #5 from eric-bugs at omnifarious dot org 2011-09-02 20:57:45 UTC --- I thought that perhaps it was expected behavior. I still think it's a missed optimization opportunity. A call of a constexpr function can clearly, in all cases, be reduced to a constant expression if the arguments are also constant expressions. So it seems like the optimizer should do this if it can. But it isn't a 'bug' exactly, just more of a 'it could do this better'.
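The distinction being discussed can be sketched as follows. The names square, forced and maybe_runtime are hypothetical and not taken from the bug. A constexpr variable requires compile-time evaluation, while an ordinary call with constant arguments only permits it, leaving the folding to the optimizer.

```cpp
// A trivially constexpr function (hypothetical example).
constexpr int square(int x)
{
    return x * x;
}

// Initializing a constexpr variable requires a constant expression, so the
// call must be evaluated at compile time here.
constexpr int forced = square(12);
static_assert(forced == 144, "evaluated during compilation");

// Here the language does not require compile-time evaluation, even though
// the argument is a constant. Whether the call is folded is up to the
// optimizer, which is the missed opportunity described above.
inline int maybe_runtime()
{
    return square(12);
}
```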
[Bug c++/87372] New: __PRETTY_FUNCTION__ not constexpr in gcc trunk on compiler explorer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87372

            Bug ID: 87372
           Summary: __PRETTY_FUNCTION__ not constexpr in gcc trunk on compiler explorer
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: eric-bugs at omnifarious dot org
  Target Milestone: ---

This code will not compile with gcc trunk on compiler explorer. But it works with gcc 8.2 on that same site. I'm worried this is a regression in current C++ development.

constexpr int zstrlen(char const *s)
{
    int i = 0;
    while (s[i]) ++i;
    return i;
}

int joe()
{
    constexpr char const * const foo = __PRETTY_FUNCTION__;
    constexpr int foolen = zstrlen(foo);
    return foolen;
}

It fails to work because __PRETTY_FUNCTION__ isn't constexpr in gcc trunk. I get this error message:

<source>: In function 'int joe()':
<source>:11:35:   in 'constexpr' expansion of 'zstrlen(((const char*)foo))'
<source>:11:39: error: the value of '__PRETTY_FUNCTION__' is not usable in a constant expression
   11 |     constexpr int foolen = zstrlen(foo);
      |                                       ^
<source>:10:40: note: '__PRETTY_FUNCTION__' was not declared 'constexpr'
   10 |     constexpr char const * const foo = __PRETTY_FUNCTION__;
      |                                        ^~~
Compiler returned: 1

Here is a link: https://godbolt.org/z/8IdAae
[Bug c++/87372] [9 Regression] __PRETTY_FUNCTION__ not constexpr in gcc trunk on compiler explorer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87372

--- Comment #2 from eric-bugs at omnifarious dot org ---
Also, this works in clang 6.0 (with --std=c++17), but not gcc 8.2:

#include <array>

constexpr int ce_strlen(char const *s)
{
    int i = 0;
    while (s[i]) ++i;
    return i;
}

template <int len>
constexpr auto as_array(char const *s)
{
    ::std::array<char, len + 1> output{};
    for (int i = 0; i < len; ++i) {
        output[i] = s[i];
    }
    output[output.size() - 1] = '\0';
    return output;
}

template <unsigned long s1, unsigned long s2>
constexpr auto paste_array(::std::array<char, s1> a, ::std::array<char, s2> b)
{
    constexpr unsigned long tlen = s1 + s2 - 1;
    ::std::array<char, tlen> output{};
    int o = 0;
    for (unsigned long i = 0; i < s1; ++i, ++o) {
        output[o] = a[i];
    }
    --o;
    for (unsigned long i = 0; i < s2; ++i, ++o) {
        output[o] = b[i];
    }
    return output;
}

#define stringify(x) #x
#define evstringify(x) stringify(x)

char const * joe()
{
    constexpr static auto mystr =
        paste_array(paste_array(as_array<ce_strlen(__PRETTY_FUNCTION__)>(__PRETTY_FUNCTION__),
                                as_array<ce_strlen(" at line ")>(" at line ")),
                    as_array<ce_strlen(evstringify(__LINE__))>(evstringify(__LINE__)));
    return mystr.data();
}
[Bug c++/87372] [9 Regression] __PRETTY_FUNCTION__ not constexpr in gcc trunk on compiler explorer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87372 --- Comment #4 from eric-bugs at omnifarious dot org --- Should I file a new bug with my new comment in it? I should probably test against a trunk with your change in it first.
[Bug c++/87381] New: clang 6.0 will compile this constexpr construct, but gcc 8.2.1 will not.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87381

            Bug ID: 87381
           Summary: clang 6.0 will compile this constexpr construct, but gcc 8.2.1 will not.
           Product: gcc
           Version: 8.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: eric-bugs at omnifarious dot org
  Target Milestone: ---

I'm trying to construct profile tags based on __PRETTY_FUNCTION__ at compile time while making minimal use of preprocessor macros.

I've had a lot of trouble with using integer arguments to constexpr functions, so that's why the integers are templated. It occurs to me now that this may be because there were implicit narrowing conversions involved, though the compiler didn't clearly state this as the reason. I'm going to try to make a new version of this that tries to be more careful with how the integers are manipulated to see if it can be made simpler.

#include <array>

constexpr int ce_strlen(char const *s)
{
    int i = 0;
    while (s[i]) ++i;
    return i;
}

template <int len>
constexpr auto as_array(char const *s)
{
    ::std::array<char, len + 1> output{};
    for (int i = 0; i < len; ++i) {
        output[i] = s[i];
    }
    output[output.size() - 1] = '\0';
    return output;
}

template <unsigned long s1, unsigned long s2>
constexpr auto paste_array(::std::array<char, s1> a, ::std::array<char, s2> b)
{
    constexpr unsigned long tlen = s1 + s2 - 1;
    ::std::array<char, tlen> output{};
    int o = 0;
    for (unsigned long i = 0; i < s1; ++i, ++o) {
        output[o] = a[i];
    }
    --o;
    for (unsigned long i = 0; i < s2; ++i, ++o) {
        output[o] = b[i];
    }
    return output;
}

#define stringify(x) #x
#define evstringify(x) stringify(x)

char const * joe()
{
    constexpr static auto mystr =
        paste_array(paste_array(as_array<ce_strlen(__PRETTY_FUNCTION__)>(__PRETTY_FUNCTION__),
                                as_array<ce_strlen(" at line ")>(" at line ")),
                    as_array<ce_strlen(evstringify(__LINE__))>(evstringify(__LINE__)));
    return mystr.data();
}
[Bug c++/87381] clang 6.0 will compile this constexpr construct, but gcc 8.2.1 will not.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87381 --- Comment #1 from eric-bugs at omnifarious dot org --- Godbolt link: https://godbolt.org/z/gHnb-G Also, my attempt to simplify this failed because clang will not consider arguments to constexpr functions to be constexpr. Which, IMHO, is wrong. Whether the fault is in the standard or clang, I don't know. I didn't test to see if my attempts to simplify it would work in gcc because even this non-simplified example fails.
[Bug c++/87381] clang 6.0 will compile this constexpr construct, but gcc 8.2.1 will not.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87381 --- Comment #3 from eric-bugs at omnifarious dot org --- Ahh, I guess that does make sense. Oh, well. I guess I'm stuck using template arguments in place of function arguments in some cases.
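The restriction conceded here can be illustrated with a short sketch. The names zeroed and literal_length are hypothetical and not from the bug. Inside a constexpr function a parameter is not itself a constant expression, so a size has to arrive as a template argument, or be deduced from an array reference.

```cpp
#include <array>
#include <cstddef>

// Not valid C++: n is a function parameter, so it cannot be used as a
// template argument even when the function is constexpr.
//
//   constexpr auto bad(std::size_t n) { return std::array<char, n>{}; }

// Passing the size as a template argument works, because a template
// argument is always a constant expression.
template <std::size_t n>
constexpr std::array<char, n> zeroed()
{
    return std::array<char, n>{};
}

// Taking a reference to an array lets the compiler deduce the size, so a
// string literal can carry its own length into the template.
template <std::size_t n>
constexpr std::size_t literal_length(const char (&)[n])
{
    return n - 1;  // n counts the terminating '\0'
}

static_assert(zeroed<4>().size() == 4, "size comes from the template argument");
static_assert(literal_length("abc") == 3, "length deduced from the literal");
```

The array-reference form is one way to avoid spelling the length twice at each call site.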
[Bug c++/87381] clang 6.0 will compile this constexpr construct, but gcc 8.2.1 will not.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87381 --- Comment #4 from eric-bugs at omnifarious dot org --- Given the new way of looking at things prompted by the correction of my erroneous idea, I've rethought how to simplify this, and the simplification does work in gcc 8.2, and I think is generally more correct: -- #include #include using ary_size_t = ::std::array::max()>::size_type; template <::std::size_t len> constexpr auto as_stdarray(const char (&s)[len]) { ::std::array output{}; for (::std::size_t i = 0; i < len; ++i) { output[i] = s[i]; } return output; } template constexpr auto paste_array(::std::array a, ::std::array b, ::std::array... remaining) { constexpr auto numarys = 1 + sizeof...(remaining); constexpr ary_size_t tlen = ((s1 + s2) + ... + sn) - numarys; ::std::array output{}; int o = 0; auto copy_into = [&o, &output](auto const &a) { for (ary_size_t i = 0; i < a.size(); ++i, ++o) { output[o] = a[i]; } --o; }; copy_into(a); (copy_into(b) , ... , copy_into(remaining)); return output; } #define stringify(x) #x #define evstringify(x) stringify(x) char const * joe() { constexpr static auto mystr = paste_array(as_stdarray(__FUNCTION__), as_stdarray(" at line "), as_stdarray(evstringify(__LINE__))); return mystr.data(); } -- Godbolt link: https://godbolt.org/z/jMJ94L Because ::std::array doesn't trivially decay into a pointer, I find it easier to work with in a sensible way, which is why I convert everything to an std::array first. I should probably rename paste_array to concat_zstrings.
[Bug c++/87381] clang 6.0 will compile this constexpr construct, but gcc 8.2.1 will not.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87381

--- Comment #5 from eric-bugs at omnifarious dot org ---
Here is the problem, reduced to the simplest expression I could make:

-
template <int x>
struct test_template {
    static int size() { return x; }
};

constexpr int ce_strlen(char const *s)
{
    int i = 0;
    while (s[i]) ++i;
    return i;
}

int joe()
{
    constexpr int plen = ce_strlen(__PRETTY_FUNCTION__); // This works
    test_template<plen> a; // This declaration is valid.
    test_template<ce_strlen(__PRETTY_FUNCTION__)> b; // But this doesn't work?!
    return a.size() + b.size();
}
--

Either the declaration/initialization of both plen and b should fail, or they should both succeed. It makes no sense for one to work and the other to not.
[Bug c++/87399] Inconsistent determination of what is usable in a constant expression with __PRETTY_FUNCTION__
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87399 eric-bugs at omnifarious dot org changed: What|Removed |Added CC||eric-bugs at omnifarious dot org --- Comment #2 from eric-bugs at omnifarious dot org --- This bug is far more succinct than the one I filed. I'll mark mine as the dupe even though this one was filed later.
[Bug c++/87381] clang 6.0 will compile this constexpr construct, but gcc 8.2.1 will not.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87381 eric-bugs at omnifarious dot org changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |DUPLICATE --- Comment #6 from eric-bugs at omnifarious dot org --- Someone filed a bug in response to a StackOverflow question I asked, and their bug is much more succinct and clear than this one is. Marking this one as a dupe. *** This bug has been marked as a duplicate of bug 87399 ***
[Bug c++/87399] Inconsistent determination of what is usable in a constant expression with __PRETTY_FUNCTION__
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87399 --- Comment #3 from eric-bugs at omnifarious dot org --- *** Bug 87381 has been marked as a duplicate of this bug. ***
[Bug c++/61414] enum class bitfield size-checking failure
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61414 eric-bugs at omnifarious dot org changed: What|Removed |Added CC||eric-bugs at omnifarious dot org --- Comment #9 from eric-bugs at omnifarious dot org --- This does not seem like correct behavior to me either. The warning should be based on the maximum declared enum value, not the maximum possible value held by the underlying type. After all as of C++17, the standard makes it undefined what happens if you try to stuff an integer into an enum value that doesn't correspond to one of the values listed in the enum declaration.
[Bug c++/108337] New: Misaligned memory access issues when inline assembly is used with optimization on x86_64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108337

            Bug ID: 108337
           Summary: Misaligned memory access issues when inline assembly is used with optimization on x86_64
           Product: gcc
           Version: 12.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: eric-bugs at omnifarious dot org
  Target Milestone: ---

The following code works fine when compiled this way:

g++ -std=c++20 -march=znver2 -static -O1 -nostartfiles -nostdlib -Wl,-emain foo.cpp

but when compiled this way generates a segfault before outputting anything:

g++ -std=c++20 -march=znver2 -static -O2 -nostartfiles -nostdlib -Wl,-emain foo.cpp

-- foo.cpp -
using val_t = unsigned long;
using call_id = unsigned short;

template <typename _Tp>
struct remove_reference { typedef _Tp type; };

template <typename _Tp>
struct remove_reference<_Tp&> { typedef _Tp type; };

template <typename _Tp>
struct remove_reference<_Tp&&> { typedef _Tp type; };

template <typename _Tp>
[[__nodiscard__]] constexpr _Tp&&
forward(typename remove_reference<_Tp>::type& __t) noexcept
{ return static_cast<_Tp&&>(__t); }

struct syscall_param {
    syscall_param(val_t v) noexcept : value(v) { }
    syscall_param(void *v) noexcept : value(reinterpret_cast<val_t>(v)) {
        static_assert(sizeof(void *) == sizeof(val_t));
    }
    syscall_param(void const *v) noexcept : value(reinterpret_cast<val_t>(v)) {
        static_assert(sizeof(void *) == sizeof(val_t));
    }

    val_t value;
};

inline val_t do_syscall(call_id callnum, syscall_param const &p1) noexcept
{
    val_t retval;
    asm volatile (
        "syscall\n\t"
        :"=a"(retval)
        :"a"(static_cast<val_t>(callnum)), "D"(p1.value)
        :"%rcx", "%r11", "memory"
    );
    return retval;
}

inline val_t do_syscall(call_id callnum,
                        syscall_param const &p1,
                        syscall_param const &p2,
                        syscall_param const &p3) noexcept
{
    val_t retval;
    asm volatile (
        "syscall\n\t"
        :"=a"(retval)
        :"a"(static_cast<val_t>(callnum)), "D"(p1.value), "S"(p2.value), "d"(p3.value)
        :"%rcx", "%r11", "memory"
    );
    return retval;
}

template <typename... T>
val_t syscall_expected(call_id callnum, T &&... args) noexcept
{
    val_t result = do_syscall(callnum, forward<T>(args)...);
    return result;
}

inline val_t write(int fd, char const *data, unsigned long size) noexcept
{
    return syscall_expected(1, fd, data, size);
}

inline void exit [[noreturn]](int status) noexcept
{
    syscall_expected(231, status);
    __builtin_unreachable();
}

int main(int argc, char *argv[])
{
    int i = 0;
    char msg[] = "Hello World 0!\n";
    auto result = write(1, msg, sizeof(msg) - 1);
    i = 1;
    while (result >= 0 && i < 10) {
        msg[12] = i++ + '0';
        result = write(1, msg, sizeof(msg) - 1);
    }
    exit(result >= 0 ? 0 : 1);
}
[Bug target/108337] Misaligned memory access issues when inline assembly is used with optimization on x86_64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108337 eric-bugs at omnifarious dot org changed: What|Removed |Added Resolution|INVALID |FIXED --- Comment #3 from eric-bugs at omnifarious dot org --- Oh, interesting! Does -mstackrealign affect every single function the compiler emits?
[Bug target/108337] Misaligned memory access issues when inline assembly is used with optimization on x86_64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108337

--- Comment #5 from eric-bugs at omnifarious dot org ---
(In reply to Andrew Pinski from comment #4)
> Yes it affects every function. There might be another way to use attribute
> to specific that only main needs this treatment. Or better yet add a _start
> function in assembly to that.

That's what I was thinking. I need to actually re-implement a bunch of the g++ runtime environment for what I want to achieve, and so that's likely what I'll be doing.

Thanks!
[Bug c++/108361] New: Assembly code that is never called emitted on x86_64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108361 Bug ID: 108361 Summary: Assembly code that is never called emitted on x86_64 Product: gcc Version: 12.2.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: eric-bugs at omnifarious dot org Target Milestone: --- Created attachment 54236 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54236&action=edit Preprocessed C++ code that generates unneeded assembly The nature of this bug makes a concise test case hard to whittle down to something minimal... The attached code generates a whole bunch of assembly that isn't needed. This assembly references external symbols as well, which creates unnecessary linker errors. Clang handles it just fine. :-) Attached is both the preprocessed C++ code, and the assembly it generates. The only needed assembly is the code for main and the static data main needs. _start is defined as a global in a separate file, which is also attached. The compile command I'm using: g++ -std=c++20 -march=znver2 -static -O3 -nostartfiles -nostdlib -I/usr/include/c++/12 -I/home/hopper/src/posixpp/pubincludes -Wl,-e_start preprocessed.s x86_64_start.s
[Bug c++/108361] Assembly code that is never called emitted on x86_64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108361 --- Comment #1 from eric-bugs at omnifarious dot org --- Created attachment 54237 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54237&action=edit Assembly file containing _start
[Bug c++/108361] Assembly code that is never called emitted on x86_64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108361 --- Comment #2 from eric-bugs at omnifarious dot org --- Created attachment 54238 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54238&action=edit Assembly output from compiler
[Bug c++/108361] Assembly code that is never called emitted on x86_64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108361 --- Comment #6 from eric-bugs at omnifarious dot org --- Technically, I suppose it is. I do reference those things in the original code. :-) But it is sort of annoying to get the error when I can just edit the assembly and clip out the offending code with no effect other than to remove the error. At any rate, yes, it's a duplicate. And thank you.
[Bug c++/2316] g++ fails to overload on language linkage
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=2316 --- Comment #55 from eric-bugs at omnifarious dot org --- C++ may get an ABI change one of these days. It needs one in order to have a properly efficient unique_ptr. Since that ABI change would involve who calls destructors and C doesn't have destructors, I'm not sure if it would be a huge deal for C/C++ ABI compatibility. But, still...
[Bug c++/113527] New: Missed optimization [[assume]] attribute
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113527 Bug ID: 113527 Summary: Missed optimization [[assume]] attribute Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: eric-bugs at omnifarious dot org Target Milestone: --- This code https://godbolt.org/z/jeYGcr5sv still generates all the exception handling code. It treats the assume attribute more like 'likely' than 'assume'. It's supposed to be undefined behavior for the assumption to be false. Here is the code replicated here: #include struct Potato { explicit Potato(int); Potato(const Potato & other) noexcept; }; using V = std::variant; template struct overloaded : Ts_... { using Ts_::operator()...; }; auto f(const V & v) -> int { [[assume(! v.valueless_by_exception())]]; return visit(overloaded{ [](int) { return 144; }, [](double) { return 27; }, [](const Potato &) { return 50; } }, v); } You can make the exception handling code go away by simply using "if (v.valueless_by_exception())" and returning some random value in the false case. But, if the spirit of the keyword is to be observed, you should be fine just doing a straight up table lookup and using the result without even checking the type tag at all.
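For comparison, the "if" workaround mentioned above can be sketched with __builtin_unreachable(), a gcc and clang builtin that also makes reaching a branch undefined behavior. This is an illustration under assumptions, not the code from the report: the Number alias and the classify function are hypothetical, the variant has only two alternatives, and nothing here claims which exception handling code a given gcc version keeps or removes.

```cpp
#include <type_traits>
#include <variant>

// Hypothetical two-alternative variant, simpler than the one in the report.
using Number = std::variant<int, double>;

inline int classify(const Number &v)
{
    // Pre-C++23 way of stating the same assumption: if the variant were
    // valueless, control would reach an unreachable point, which is
    // undefined behavior, so the compiler may assume it never happens.
    if (v.valueless_by_exception()) {
        __builtin_unreachable();
    }
    return std::visit([](auto value) {
        // decltype(value) is int or double, depending on the active alternative.
        return std::is_same_v<decltype(value), int> ? 144 : 27;
    }, v);
}
```

Calling classify(Number{1}) selects the int branch and classify(Number{2.5}) the double branch.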
[Bug c++/115728] New: Feature Request: inline assembly improvements for C++
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115728 Bug ID: 115728 Summary: Feature Request: inline assembly improvements for C++ Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: eric-bugs at omnifarious dot org Target Milestone: --- I would like a few things added to inline assembly support. The first is I would very much like to be able to use templates to generate inline assembly. The second is I want finer grain control over marking memory regions as needing to be updated before inline assembly code is executed, or invalidated after. My motivation is that I'm working on a library that allows system calls to be inlined. And so I can't rely on the normal "functional call boundary" to make sure that everything makes it to memory that should before the system call, or that the memory a system call may have updated is considered changed (and so has to be reloaded into registers) afterward. And being able to use templates makes it much easier to handle common patterns, with a fairly compact template function. I think these changes would actually have broad applicability when doing inline assembly in C++ in general though. C++ is a very different language than C, and the current inline assembly features support C very well, but do not support C++ very well.
[Bug c++/115728] Feature Request: inline assembly improvements for C++
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115728

--- Comment #2 from eric-bugs at omnifarious dot org ---
(In reply to Richard Biener from comment #1)
> I think there's (pending?) support to allow the asm text to be generated by
> constexpr evaluation. Not sure if that will help.

I would have to play with the feature to be sure, but I suspect that would be sufficient. It was my first go-to as an idea for how to implement it.
[Bug c++/115728] Feature Request: inline assembly improvements for C++
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115728

--- Comment #4 from eric-bugs at omnifarious dot org ---
(In reply to ak from comment #3)
> The constexpr asm support is in trunk. It supports templates.
>
> > The second is I want finer grain control over marking memory regions as
> > needing to be updated before inline assembly code is executed, or
> > invalidated after.
>
> You can do that by specifying the memory region to be updated in the
> input/output list

I know all about that feature, and it's not sufficient. I wouldn't have filed this if it was. It has a really hard time with pointers to pointers, which means handling readv and writev would be difficult.