[Bug target/77882] [Aarch64] Add 'naked' function attribute
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77882 --- Comment #10 from Christophe Monat --- (In reply to Elad Lahav from comment #9) Thanks for your patch proposal! > 1. GCC emits a warning: >/home/elahav/src/projects/aarch64_naked/aarch64_naked.c:15:1: warning: no > return statement in function returning non-void [-Wreturn-type] > } > This is true for ARMv7 as well. I would not expect such a warning in a > naked function. You may be interested in reading this : bug https://bugs.launchpad.net/gcc-arm-embedded/+bug/1836547
[Bug c/53769] [C11]: Macros __STDC_NO_THREADS__ / __STDC_NO_ATOMIC__ missing.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53769 Christophe Monat changed: What|Removed |Added CC||christophe.monat at st dot com --- Comment #10 from Christophe Monat --- Now that C11 is complete, would it be acceptable to have GCC define __STDC_NO_THREADS__ and __STDC_NO_ATOMIC__ when appropriate ? Beyond the standard that stipulates this behavior, IAR compilers do provide these definitions - at least according to their documentation (IAR C/C++ Development Guide Compiling and Linking - ARM version - 2019). I have not checked other commercial compilers. This would avoid this behavior of the ARM embedded toolchain: $ cat samples-threads.c #if !defined(__STDC_NO_THREADS__) #include #endif $ gcc-arm-none-eabi-8-2019-q3-update/bin/arm-none-eabi-gcc samples-threads.c In file included from samples-threads.c:2: /sw/FROM_sw_st_gnu_compil_comp/comp/arm/linaro/gcc-arm-none-eabi-8-2019-q3-update/arm-none-eabi/include/threads.h:30:10: fatal error: machine/_threads.h: No such file or directory #include ^~~~
[Bug tree-optimization/83661] sincos does not handle sin(2x)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83661 Christophe Monat changed: What|Removed |Added CC||christophe.monat at st dot com --- Comment #2 from Christophe Monat --- Hi Pratamesh, Just curious to understand how this can be beneficial performance-wise since most (all ?) implementations of sincos fail to reap real benefit from the unique call and finally compute separately sin and cos, after some extra control-flow. The non-optimized version costs : add (to compute *2.0), sin, sin, add The optimized version costs (I choose to account for sincos as being sin, then cos IRL) : sin, cos, mul, mul, add It looks to me that the situation is not favorable for the optimized version - at least one more operation. What do you think ?
[Bug tree-optimization/83661] sincos does not handle sin(2x)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83661 --- Comment #4 from Christophe Monat --- Hi Pratamesh, You're absolutely right - maybe it's more efficient when there is some hardware sincos available (Intel FSINCOS ?) but I would check also carefully the actual performance. Indeed, it looks to me that either you have to use two different polynomials or shift one argument and use either sin or cos, but anyway twice. We studied that in a slightly different context with Claude-Pierre Jeannerod from ENS Lyon and our PhD Jingyan Lu-Jourdan a while ago : "Simultaneous floating-point sine and cosine for VLIW integer processors" available here: https://hal.archives-ouvertes.fr/hal-00672327 and we were able to gain significant performance by exploiting the low-level parallelism of the processor. Agreed, this is not a full IEEE implementation but the important ideas are there.
[Bug tree-optimization/83661] sincos does not handle sin(2x)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83661 --- Comment #6 from Christophe Monat --- (In reply to Alexander Monakov from comment #5) > sincos performs range reduction for the argument just once, which is fairly > important. A well-optimized sincos also shares some computations for the > sin/cos parts, as done in > https://github.com/ARM-software/optimized-routines/blob/master/math/sincosf.c Thanks for the pointer, indeed this implementation looks great. @Pratamesh #Linaro: is there synchronization between the ARM optimized routines and the usuals libc (glibc, newlib, musl,...) ? (snip) > (fwiw I'm against adding such transforms to the compiler) I would favor upgrading the routines if it's not already done, and not do any such transformation in the compiler.
[Bug target/71607] [5/6/7/8 Regression] [ARM] ice due to forbidden enabled attribute dependency on instruction operands
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71607 --- Comment #13 from Christophe Monat --- (In reply to Prakhar Bahuguna from comment #12) Hi Prakar, > The patch has now been posted to the mailing list: > https://gcc.gnu.org/ml/gcc-patches/2017-04/msg00872.html Thanks for the work, and the kind notification. I really have high hopes that it will be accepted soon (Ramana, could you please...?), since I am getting high internal pressure to have it fixed, to move forward to deliver our own work. --C
[Bug libstdc++/68197] negative index to ios_base::iword lead to unpredictable result
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68197 --- Comment #4 from Christophe Monat --- (In reply to Jonathan Wakely from comment #3) > No, it seems underspecified. I have raised it with the C++ committee. Do you have feedback from the C++ committee ? I have only easily access to a 2011 standard, and reading 27.5.3.5 (3) : Returns : '(snip)... on failure, a valid long& initialized to 0'. I may be wrong, but I read this as an obligation to return 0 in the case described by Mickael.
[Bug target/71607] [5/6/7 Regression] [ARM] ice due to forbidden enabled attribute dependency on instruction operands
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71607 --- Comment #10 from Christophe Monat --- (In reply to Ramana Radhakrishnan from comment #9) Hello Ramana, Is there a plan to have this patch delivered upstream at some point in the near future ? Best regards, --C
[Bug target/71607] [5/6/7 Regression] [ARM] ice due to forbidden enabled attribute dependency on instruction operands
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71607 --- Comment #4 from Christophe Monat --- (In reply to Thomas Preud'homme from comment #3) Thomas, I am seeing that the assignee name has been reset : does it mean that you definitely disengage from looking at this problem ? If this is the case, and if nobody officially takes over, would you share your current patch ? Best regards, --C
[Bug c/77754] [5/6/7 Regression] internal compiler error : tree code 'call_expr' is not supported in LTO streams
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77754 --- Comment #4 from Christophe Monat --- (In reply to Richard Biener from comment #3) > I wonder what the standards say about side-effects in those "declarations". From my instance of ISO+IEC+9899-2011.pdf 6.7.6.2 Array declarators Constraints ... Semantics (5) If the size is an expression that is not an integer constant expression: if it occurs in a declaration at function prototype scope, it is treated as if it were replaced by *; otherwise, each time it is evaluated it shall have a value greater than zero. The size of each instance of a variable length array type does not change during its lifetime. Where a size expression is part of the operand of a sizeof operator and changing the value of the size expression would not affect the result of the operator, it is unspecified whether or not the size expression is evaluated. And 6.7.6.3 Function declarators (including prototypes) (12) If the function declarator is not part of a definition of that function, parameters may have incomplete type and may use the [*] notation in their sequences of declarator specifiers to specify variable length array types. So to me it looks, like the compiler could consider the original declaration as if it where: void fn2(int[][*]);
[Bug c/77882] New: [Aarch64, ARM64] Add 'naked' function attribute
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77882 Bug ID: 77882 Summary: [Aarch64, ARM64] Add 'naked' function attribute Product: gcc Version: 6.1.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: christophe.monat at st dot com Target Milestone: --- With the following code fragment aarch64-attribute-naked.c : void __attribute__((naked)) dealwithit() { // Stuff removed above, below we assume that the 'naked' attribute // does the job and does not generate the usual 'ret' instruction asm("eret"::); } $ aarch64-linux-gnu-gcc -S aarch64-attribute-naked.c aarch64-attribute-naked.c:4:1: warning: 'naked' attribute directive ignored [-Wattributes] I have tried with a gcc-6.1.1, but simply as an example, this feature has never been ported to Aarch64, I think. As this attribute is supported for the legacy ARM target, it would be nice it the Aarch64 compiler could support it.
[Bug target/77882] [Aarch64] Add 'naked' function attribute
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77882 --- Comment #3 from Christophe Monat --- Andrew, (In reply to Andrew Pinski from comment #2) > I really think the naked attribute as not useful at all. I think it was a > bad idea. Why not write a .s file which does what you want? Well, from the discussions I read (for instance in bug #25967) before posting this, I was more or less expecting this remark. I agree that some arguments there are not valid. In a nutshell, I wrote quite a lot of tricky testing code (dealing manually with functions prologues/epilogues, etc..) for the ARM target (Ramana : please pardon the misused 'legacy' adjective) using this feature, and I would like to have a similar installment for ARM64. I am also generally simply bothered at the idea of putting decoration at the beginning of an assembly file, and starting from .c enables me to have the appropriate tags automatically emitted when I switch from target to target. --C
[Bug target/78020] New: [Aarch64, ARM64] vuzp{1,2}q_f64 implementation identical to vzip{1,2}q_f64 in arm_neon.h and probably incorrect
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78020 Bug ID: 78020 Summary: [Aarch64, ARM64] vuzp{1,2}q_f64 implementation identical to vzip{1,2}q_f64 in arm_neon.h and probably incorrect Product: gcc Version: 6.1.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: christophe.monat at st dot com Target Milestone: --- Created attachment 39829 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39829&action=edit Suggested patch to fix the zip vs unzp intrinsic issue I think that vuzp{1,2}q_f64 in gcc/config/aarch64/arm_neon.h are not correctly implemented. For instance: vzip1q_f64 (float64x2_t __a, float64x2_t __b) (snippage) return __builtin_shuffle (__a, __b, (uint64x2_t) {0, 2}); and: vuzp1q_f64 (float64x2_t __a, float64x2_t __b) (snippage) return __builtin_shuffle (__a, __b, (uint64x2_t) {0, 2}); But then, according to the "ARM Architecture reference Manual, ARMv8 for ARMv8-A architecture profile" (I am reading ARM DDI 0487A.i (ID 012816)), the semantic of zip1 and uzp1 differ (C3.5.18 is a convenient starting point to browse the architectural descriptions). It looks to me that the correct implementation would look like: vuzp1q_f64 (float64x2_t __a, float64x2_t __b) (snippage) return __builtin_shuffle (__a, __b, (uint64x2_t) {2, 0}); that would generate zip1v0.2d, v1.2d, v0.2d instead of: zip1v0.2d, v0.2d, v1.2d and has the correct semantic, though it does not use the uzp1 mnemonic (I expected this uzp1 to appear more or less at the beginning, and scratched a little bit my head and draw some diagrams to convince me that I was hopefully correct). I have also noticed that vtrn{1,2}q_f64 are implemented in terms of zip{1,2}, but it seems ok semantically (though I had to check manually with the semantic description and more drawings to convince myself). I have attached a patch suggesting a change in arm_neon.h, in case this analysis is correct. I note that if https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70369 were completed the occurrence of such issue would be less likely.
[Bug target/78020] [AArch64] vuzp{1,2}q_f64 implementation identical to vzip{1,2}q_f64 in arm_neon.h and probably incorrect
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78020 --- Comment #6 from Christophe Monat --- James, (In reply to James Greenhalgh from comment #5) > This bug looks invalid to me. I think you're both failing to grasp the > intuition behind these intrinsics. Ignoring the descriptions in the > reference manual for a second, and thinking only in terms of vector > operations. If you start with: Thank you for your truly authoritative exposition of the working of the ASIMD zip/unzip pair that is a much better read than the manual (this does not excuse my error). At least, I'll keep this post as a future reference. Kyrill, Yvan, Thank you both for the time that you spent with my blunder, I took my lesson here. --C
[Bug libstdc++/78236] New: regex_iterator constructor is incomplete and creates uninitialized values that may be used
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78236 Bug ID: 78236 Summary: regex_iterator constructor is incomplete and creates uninitialized values that may be used Product: gcc Version: 7.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: libstdc++ Assignee: unassigned at gcc dot gnu.org Reporter: christophe.monat at st dot com Target Milestone: --- Created attachment 39982 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39982&action=edit Porposed patch to fix the regex_iterator constructor The following code snippet, when compiled on X8664 with gcc-trunk, dumps core with any optimization level (you might need to add -std=c++11 depending on your compiler version). #include #include int main(int argc, char *argv[]) { char const s[] = "afoo"; std::basic_regex r("(f+)"); { std::cregex_iterator i(s, s+sizeof(s), r); std::cregex_iterator j(s, s+sizeof(s), r); assert(i == j); } // The iterator manipulation code must be repeated in the same scope // to expose the undefined read during the execution of the == // operator (stack location reuse) { std::cregex_iterator i(s, s+sizeof(s), r); std::cregex_iterator j; assert(!(i == j)); } return 0; } This happens during the execution of the operator== that reads some of the private implementation fields, some of which appear not to be initialized. The issue is due to the fact that the regex_iterator (libstdc++v3/include/bits/regex.h) has for instance a pointer member (const regex_type* _M_pregex) and a user-defined constructor that fails to initialize it I have attached a trivial patch that initializes the two fields that are to be explicitly initialized to avoid the reported issue.
[Bug libstdc++/78236] regex_iterator constructor is incomplete and creates uninitialized values that may be used
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78236 --- Comment #1 from Christophe Monat --- Comment on attachment 39982 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39982 Proposed patch to fix the regex_iterator constructor >diff --git a/libstdc++-v3/include/bits/regex.h >b/libstdc++-v3/include/bits/regex.h >index a7d45e6..bb0a167 100644 >--- a/libstdc++-v3/include/bits/regex.h >+++ b/libstdc++-v3/include/bits/regex.h >@@ -2454,7 +2454,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11 >* one-past-the-end of a range. >*/ > regex_iterator() >- : _M_match() >+ : _M_pregex(), _M_flags(regex_constants::match_default), _M_match() > { } > > /**
[Bug libstdc++/78236] regex_iterator constructor is incomplete and creates uninitialized values that may be used
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78236 --- Comment #3 from Christophe Monat --- (In reply to Tim Shen from comment #2) > I proposed another way to fix this in the list: > https://gcc.gnu.org/ml/libstdc++/2016-11/msg8.html Looks perfect - I was somewhat annoyed by the _M_match() call anyway. The test case is a reconstruction from a much bigger one exhibiting an issue that we encountered at first on ARM/AArch64.