strict aliasing warning
I wrote some code (not released yet) that improves the accuracy of -Wstrict-aliasing using tree-ssa-alias information. The primary idea was to tell the programmer "go fix the types of variables x and y at lines ..." when -fstrict-aliasing breaks their code. It occurred to me that part of this code could be used as a preconditioner to aggressive optimization that would normally require -fstrict-aliasing, so this more aggressive optimization can then be performed selectively on individual functions even with -fno-strict-aliasing on the command line. I fear a little though that the functions which are provably free of cross-type aliasing might also not benefit much from -fstrict-aliasing, but I have very little experience with C compilers and GCC. Is this something worth pursuing? Thank you, Silvius
Re: strict aliasing warning
Joel Sherrill wrote:
> Silvius Rus wrote:
>> I wrote some code (not released yet) that improves the accuracy of
>> -Wstrict-aliasing using tree-ssa-alias information. The primary idea
>> was to tell the programmer "go fix the types of variables x and y at
>> lines ..." when -fstrict-aliasing breaks their code.
> How reliable is this detection and warning? Is it ready for testers yet?

The first take is almost ready, but it will need to pass peer review before release (even to testers), so it will be a few days.

> I ask because we have found a case where code in RTEMS breaks when
> strict-aliasing is enabled. We are having discussions on how to
> effectively perform an audit on RTEMS for other breakages. Right now,
> the best idea is Ralf's: diff the assembly generated for each file
> compiled with and without strict-aliasing. If there is a difference, we
> will have to review it. This eliminates a lot of the code base, but it
> still generates a lot of cases to examine by hand. I'm curious whether
> your patch will ease this process.

The balance (false positives vs. false negatives) can be controlled by a --param option. You can limit the number of false positives for each type pair, originating statement, function, or compilation unit using another --param option.

-- Silvius
Re: strict aliasing warning
Joe Buck wrote:
> If you first prove that there is no cross-type aliasing, then turn on
> -fstrict-aliasing, it seems to me that your alias sets won't change at
> all. The reason is that if there is a change, it means that you
> eliminated an aliasing possibility based on the fact that it's not
> allowed because of cross-type aliasing, which you said you proved
> there is none of.
>
> There have been concerns expressed in the past that if gcc checks for
> strict aliasing violations, users will be upset when it misses some
> violations but does optimization that breaks an ill-formed program. I
> personally am not that worried about that; I think that it will help
> train users if they see warning messages about blatant violations, to
> the point where they will be less likely to create subtle,
> hard-to-detect violations.

Yes, using the alias information produced by a complex algorithm may lead to errors, as false negatives might sneak in. That is acceptable when warning programmers, but will probably not be accepted as a preconditioner. I was thinking more of a simplified check. For instance, the most conservative and simplest would be to verify that there are no conversions to pointers to different types. That would produce many false positives, but no false negatives. However, if I understood you correctly, in such cases the complex algorithm will end up with the same alias sets with or without -fstrict-aliasing, so my conclusion is that it is probably not worth doing. Thank you, Silvius
Re: Tricky(?) aliasing question.
Sergei Organov wrote:
> Ian Lance Taylor <[EMAIL PROTECTED]> writes:
>> Sergei Organov <[EMAIL PROTECTED]> writes:
>>> Below are two example functions foo() and boo() that I think are
>>> both valid from the POV of the strict aliasing rules. GCC 4.2 either
>>> warns about both (with -Wstrict-aliasing=2) or doesn't warn about
>>> either (with -Wstrict-aliasing), and generates the assembly as if
>>> the functions don't violate the rules, i.e., both functions return 10.
>> -Wstrict-aliasing=2 is documented to return false positives.
>> Actually both current versions of -Wstrict-aliasing are pretty bad.
> Well, they are indeed bad, but on the other hand I fail to see how to
> make them pretty without analyzing the entire source of a program, and
> even then the "effective type of an object" could change at run-time :(
> Overall, I tend to refrain from blaming gcc too much for the weakness
> of these warnings.

The current implementation of -Wstrict-aliasing runs in the C front end and looks at a single expression at a time. Although I think it does OK given that limited scope, it has several drawbacks. First, it produces false positives, i.e., it warns when it should not. For instance, it warns about pointer conversions even when the pointers are not dereferenced. Second, it produces false negatives, i.e., it does not warn when it should. For instance, aliases to malloc-ed memory are not detected, among others. Third, it only checks C programs (and not C++).

I am about to submit a patch that implements -Wstrict-aliasing in the backend based on flow-sensitive points-to information, which is computed by analyzing the entire source of each function. It is not perfect (the problem is undecidable), but it improves in all three directions: it checks whether pointers actually get dereferenced, it detects cross-type aliasing in the heap (and in other multiple-statement situations), and it works on both C and C++. Silvius
Re: Tricky(?) aliasing question.
Andrew Pinski wrote:
>> Third, it only checks C programs (and not C++).
> This has not been true for some time now (at least development-wise).
> -- Pinski

Oops, I had not noticed that. Forget the third argument then. Silvius
how to avoid duplicate warnings
I am implementing -Wstrict-aliasing by catching simple cases in the frontend and more complex ones in the backend. The frontend mechanism is tree pattern matching. The backend one uses flow-sensitive points-to information. I want to avoid duplicate warnings. I thought of a few ways, but none seems perfect. Can you please advise which of the following I should choose, suggest alternatives, or let me know if a solution already exists?

1. Save location info for the warnings issued in the frontend and have the backend check it before issuing warnings. This would be simple, but it involves communication between frontend and backend.

2. Turn off frontend checking at -O2 and let the backend do all the work. This might affect the applicability of the warning, as the code may be simplified before it reaches the warning check (although the common case would still be caught).

3. Make the backend skip cases that are likely to be warned about by the frontend. This is finicky because complex expression trees in the frontend are distributed over multiple GIMPLE statements by the time they reach the check in the backend.

4. Just allow duplicates. The problem of duplicates is actually more general and should be solved (is it already?) in a generic way for all warnings, not just for -Wstrict-aliasing. Even the backend alone will issue duplicate warnings, e.g., when inlining a function from a header file causes type punning at or around several call sites. This may happen in the same .c file or in several .c files that may be processed together by the compiler/LTO. With -Wstrict-aliasing this may be more common than with some other warnings, because type punning sometimes involves set/get methods that are usually inlined.

Thank you, Silvius
Re: Where is gstdint.h
Tim Prince wrote:
> [EMAIL PROTECTED] wrote:
>> Where is gstdint.h? Does it actually exist? libdecnumber seems to use
>> it. decimal32|64|128.h include decNumber.h, which includes
>> decContext.h, which includes gstdint.h.
> When you configure libdecnumber (e.g. by running the top-level gcc
> configure), gstdint.h should be created automatically. Since you said
> nothing about the conditions where you had a problem, you can't expect
> anyone to fix it for you. If you do want it fixed, you should at least
> file a complete PR. As it is more likely to happen with a poorly
> supported target, you may have to look into it in more detail than
> that. When this happened to me, I simply made a copy of stdint.h to
> get over the hump.

This might happen when you run the top-level gcc configure in its own directory. You may want to try making a new directory elsewhere and running configure there:

  pwd
  .../my-gcc-source-tree
  mkdir ../build
  cd ../build
  ../my-gcc-source-tree/configure
  make
Re: MPC required in one week.
On Tue, Dec 1, 2009 at 2:25 AM, Paolo Bonzini wrote:
> On 11/30/2009 09:47 PM, Michael Witten wrote:
>> On Mon, Nov 30, 2009 at 12:04 AM, Kaveh R. GHAZI wrote:
>>> The patch which makes the MPC library a hard requirement for GCC
>>> bootstrapping has been approved today.
>>
>> Out of curiosity and ignorance: Why, specifically, is MPC going to be
>> a hard requirement?
>>
>> On the prerequisites page, MPC is currently described with: "Having
>> this library will enable additional optimizations on complex numbers."
>>
>> Does that mean that such optimizations are now an important
>> requirement? Or is MPC being used for something else?
>
> They are a requirement for Fortran, but it's (much) simpler to do them
> for all front ends.
>
> Paolo

On the flip side, it's not necessarily easy to get it to work. On my build system, apt-get doesn't find it. Downloading and installing the .deb manually triggers 3 missing dependencies. ftp://gcc.gnu.org/pub/gcc/infrastructure/ is unresponsive, so I had to look around for the source. Installing from source fails with a libgmp ABI message during configuration, so now I need to fiddle with it. Like many, I don't use Fortran much, so this is pure overhead at this point. It could be that my build system makes this change more difficult to bring in than in the average case. I'm not arguing against change, especially when it brings improved performance, but I think it's worth reinforcing that bringing in a library dependency is not free. (Looking at this issue oblivious of the maintenance and development burden, it would have been nice to have a transitional --no-mpc configure option.) Silvius
bitwise dataflow
I'm thinking about adding bitwise dataflow analysis support to RTL. Is this a good idea? Bad idea? Already done? Please review if interested. Thank you, Silvius

Motivation
==========

  int foo(int x)
  {
    int i = 100;
    do {
      if (x > 0)
        x = x & 1;  /* After this insn, all bits except 1 are 0. */
      else
        x = x & 2;  /* After this insn, all bits except 2 are 0. */
      i--;
    } while (i);
    return x & 4;   /* Before this insn, only bits 1 and 2 may be nonzero. */
  }

"foo" should simply return 0. This optimization is currently missed at -O2 and -O3 on x86_64. (Cases with simpler control flow are handled by the "combine" pass.)

Proposal
========

1. Dataflow framework to propagate bitwise register properties. (Integrated with the current dataflow framework.)
2. Forward bitwise dataflow analysis: constant bit propagation.
3. Backward bitwise dataflow analysis: dead bit propagation.
4. Target applications: improve dce and see. (Others?)

Preliminary Design
==================

1. I don't have enough understanding of GCC to decide whether it should be done at the RTL level or the tree level. After some brainstorming with Ian Taylor, Diego Novillo and others, we decided to go with RTL.

2. This problem could be solved using iterative dataflow with bitmaps. However, memory size would increase significantly compared to scalar analysis, as would processing time. For constant bit propagation, we need to keep 2 bits of state for each register bit. For 64-bit registers, that's a factor of 128x over the scalar reaching-definitions problem. Instead, I propose a sparse bitwise dataflow framework. We would still use the existing RTL dataflow framework to build scalar DU/UD chains. Once they are available, bitwise information is propagated only over these chains, analogous to the sparse constant propagation described by Wegman & Zadeck (TOPLAS 1991).

3. This might be too much detail at this point, but just in case, here is a brief description of a bit constant propagation algorithm.
  For each instruction I in the function body
    For each register R defined in instruction I
      def_constant_bits(I, R) = collect constants from AND/OR/... operations

  Iterate until the def_constant_bits don't change:
    For each instruction I in the function body
      For each register R used at I
        use_constant_bits(I, R) = merge(def_constant_bits(D, R))
            across all definitions D of R that reach I
      For each register R defined at I
        def_constant_bits(I, R) = transfer(use_constant_bits(I, RU))
            for all register uses RU, based on opcodes

The data structures and the routines "collect", "merge" and "transfer" depend on the problem being solved.

4. Complexity considerations. The solver visits every DU edge once for each iteration of the fixed-point convergence loop. The maximum number of iterations is given by the height of the state lattice multiplied by the number of bits. Although this can be as high as 128 for constant bit propagation on x86_64, in practice we expect far fewer iterations. Also, lower complexity guarantees can be given if less accurate information is acceptable, e.g., byte-level rather than bit-level. For byte constants, the upper-bound constant factor drops from 128 to 16.

Some of these ideas came from discussions with Preston Briggs, Sriraman Tallam and others.
Re: profile mode output analysis (call stacks to source code mapping)
On Fri, May 7, 2010 at 7:09 AM, Karel Gardas wrote:
> with recent fixes to profile mode I've succeeded even using it for
> MICO[1] on the OpenSolaris platform. Is there any tool which
> translates call stacks for humans, or is there any documentation/hint
> on how to use the generated call stack information to find the
> appropriate place in the source code?

Mapping code addresses to function names and line numbers is system dependent. That's one reason why line numbers are not produced by the profile mode directly.

If you are using binutils, this should give you precise line number information for binaries built with debug info:

  addr2line -e <binary> addr1 addr2 ...

If you don't have debug information, or the debug information is imprecise, you should still be able to map addresses to function names:

  addr2line -f -e <binary> addr1 addr2 ...

Other useful addr2line options are -i (print inlined stacks) and -C (demangle C++ names). Silvius
call for libstdc++ profile mode diagnostic ideas
Hello libstdc++ developers and users,

(I cc-ed gcc@gcc.gnu.org for a larger audience, as others may offer suggestions not as library developers but as C++ programmers. Sorry in advance if this is spam to you.)

I'm planning to add a set of new performance diagnostics to the libstdc++ profile mode (http://gcc.gnu.org/onlinedocs/libstdc++/manual/profile_mode.html) and am trying to come up with a list of which diagnostics are most meaningful and most wanted. The profile mode currently gives diagnostics such as "use unordered_map instead of map at context ...". The full list is at http://gcc.gnu.org/onlinedocs/libstdc++/manual/bk01pt03ch19s07.html.

At this (brainstorming) point I'm looking for any suggestions, including but not limited to:
- container selection (e.g., replace list with deque)
- container implementation selection (e.g., string implementation)
- algorithm selection (e.g., sort implementation)
- data structure or algorithm parameter selection (e.g., a fixed string reserved size value for a particular context)

Please reply to this message or email me privately (r...@google.com) if you have any suggestions for new diagnostics, or about the profile mode in general, or to express your support for a particular proposed diagnostic. For new diagnostics, it works best if you can provide:
- A simple description, e.g., "replace vector with list".
- (optional) The instrumentation points, even if vague, e.g., "instrument insertion, element access and iterators".
- (optional) How the decision is made, e.g., "there are many inserts in the middle, there is no random access and it is not iterated through many times".

Keep in mind that profile mode analysis is context sensitive (context = dynamic call stack), so it's OK to propose both "change list to vector" and "change vector to list" with different criteria, as they may both be right in different contexts.
To make sure the diagnostics are realistic, keep in mind that analysis is usually based on observing, at run time, container state before and after each operation on the container or on its iterators. Once I have a good pool of ideas and preferences, I will present a refined set to libstd...@gcc.gnu.org for discussion. Thank you, Silvius
New branch for STL Advisor
Lixia Liu ([EMAIL PROTECTED]) and I have started to work on a tool that recognizes STL usage patterns that can be corrected to increase application performance. We believe this tool should live in libstdc++-v3. We want to start a GCC branch to share what we have so far and invite contributors. libstdc++-v3 maintainers, could you please comment/advise? Thank you, Silvius

Overview
========

Goal: Give performance improvement advice based on analysis of dynamic STL usage.
Method: Instrumentation calls in libstdc++-v3/include/debug/*, runtime support library in libstdc++-v3/libstladvisor, trace analysis/report tool in libstdc++-v3/stladvisor.
Timeline: Create branch immediately. Target the next stage 1 for merge.

Motivation
==========

Consider the following example:

  std::unordered_set<int> s;
  for (int i = 0; i < 1000; ++i) {
    s.insert(i);
  }

Its execution time is reduced by 38% by changing the first line to:

  std::unordered_set<int> s(1000);

There are other STL usage patterns that can be recognized and reported to the application programmer.

Sample Report
=============

test.cc:187 advice: Changing initial size of unordered_map instance from 193 to 1572869 will reduce execution time by 31469325 rehash operations.
test.cc:192 advice: Changing initial size of unordered_map instance from 193 to 6151 will reduce execution time by 5978 rehash operations.
test.cc:253 advice: Changing initial size of unordered_map instance from 193 to 2 will reduce memory consumption by 15343 * sizeof(void*).

Design
======

The usage model is intended to be similar to -fprofile-generate and -fprofile-use.

1. Compiler driver
Build with instrumentation: gcc -fstl-advisor=instrument:all foo.cc. The effect on the compiler is simply "-D_GLIBCXX_DEBUG -D_GLIBCXX_ADVISOR". The effect on the linker is "-lstladvisor".
Run the program: ./a.out.
Produce the advisory report: gcc -fstl-advisor=use:advise.

2. Instrumentation phase
Use the existing libstdc++-v3 wrapper model documented at http://gcc.gnu.org/onlinedocs/libstdc++/manual/bk01pt12ch30s04.html. This model is currently used for functional verification of STL code. Calls to functions in the instrumentation library will be added in libstdc++-v3/include/debug/*, guarded by #ifdef _GLIBCXX_ADVISOR. The instrumentation library would live in libstdc++-v3/libstladvisor, at the same level as libsupc++.

3. Dump phase
The instrumentation library functions register an "atexit" report generator. The report is either a dump of the instrumentation trace or aggregated information. For the unordered_set resizing example above, the report would be a summary of the cost of rehashes, factored by the actual call stack at hashtable construction time. This report references only numeric instruction addresses.

4. Analysis & report phase
In our prototype, this is a Python script that performs two functions:
- Digest the instrumentation report and generate advice.
- Map instruction addresses to function names, file paths and line numbers.
Re: New branch for STL Advisor
Paolo Carlini wrote:
>> libstdc++-v3 maintainers, could you please comment/advise?
> Can you explain what the role of the debug-mode code is here? Because
> certainly the *performance* of the debug-mode library has *nothing* to
> do with the performance of the "real" library (whether that could be
> improved, tweaked, or whatever, is another issue, already raised in
> the past). Or are you planning to use the debug-mode code only for
> *counting* the calls, etc., with no actual timings?
> Paolo.

Right now we are planning to do only *counting* and such, no actual timing. This is something to consider, though, in case we or someone else want to use timers in the future. Our choice of debug mode was simply so that we would not touch the non-debug code. Silvius
Re: New branch for STL Advisor
Paolo Carlini wrote:
> Also, maybe it's just me, but the specific advantages over normal
> profiling / existing tools don't seem completely obvious. I'd like to
> see that point discussed in better detail...
> Paolo.

The effect of our instrumentation is a meaningful trace of the behavior of containers, algorithms and iterators, and their interaction. This abstraction level is above what normal profiling tools can offer. If we used normal profiling tools, we would have to reverse engineer the trace to get to this level of abstraction. Let me know if you want me to go into more detail. Silvius
Re: New branch for STL Advisor
Paolo Carlini wrote:
> ... finally (for now ;), an apparently pedantic issue, but really I'm
> going to strongly object to any use of "STL" together with our
> implementation of the ISO C++ Runtime Library: it's a *legacy* HP /
> SGI acronym which is not used anywhere in the ISO Standard in force
> (C++98 + TC1) or in the working paper of the next one.
> Paolo.

We'll think of another name that does not include "STL" and run it by you. Silvius
Re: New branch for STL Advisor
Doug Gregor wrote:
> On Mon, Jul 14, 2008 at 7:20 PM, Benjamin Kosnik <[EMAIL PROTECTED]> wrote:
>> In particular, design. The using bits seem pretty straightforward. It
>> would be nice if you could provide some detail in terms of scope
>> (what are the algorithms or data structures you intend to
>> instrument), and how this fits into the existing debug mode
>> functionality. Do you need the debug mode functionality, or are you
>> just piggy-backing off this existing structure?
> This ties in with the main question I had... typically, a profiling
> layer is used on larger inputs, where it is important that the
> profiling code itself have very low overhead. Piggybacking on the
> debug mode is a definite performance-killer, so I hope that the
> profiling version of the library will be in its own inline namespace
> alongside the parallel and debug modes.

Agree. We will create a new namespace at the same level as the parallel and debug modes.

> My vote for the command-line switch is -fprofile-stdlib.
> - Doug

Agree.

Thank you, Silvius
Re: New branch for STL Advisor
Benjamin Kosnik wrote:
> Hi Silvius Rus and Lixia Liu! Thanks for posting this, asking for
> advice, and being willing to help improve libstdc++!
>
>> Goal: Give performance improvement advice based on analysis of
>> dynamic STL usage.
>
> Your project sounds intriguing, and something that could potentially
> be useful for GNU C++ users. I look forward to seeing how your work
> progresses, and encourage you to start a gcc branch to house your
> work in progress. Progress can be evaluated at the next stage 1
> window, to see if things are solid enough to merge into the trunk.

Thank you for the review and the positive feedback. We will create the branch immediately.

>> Method: Instrumentation calls in libstdc++-v3/include/debug/*,
>> runtime support library in libstdc++-v3/libstladvisor, trace
>> analysis/report tool in libstdc++-v3/stladvisor.
>
> I would like to see a bit more detail on your methodology, and
> encourage you to immediately start working on user documentation and
> a test strategy before you get too far along with anything else. For
> documentation, you'll need something similar to:
> http://gcc.gnu.org/onlinedocs/libstdc++/manual/debug_mode.html
> http://gcc.gnu.org/onlinedocs/libstdc++/manual/parallel_mode.html
>
> In particular, design. The using bits seem pretty straightforward. It
> would be nice if you could provide some detail in terms of scope
> (what are the algorithms or data structures you intend to
> instrument), and how this fits into the existing debug mode
> functionality. Do you need the debug mode functionality, or are you
> just piggy-backing off this existing structure? It seems to me like
> you might want to provide a third level of debug mode macro, in
> addition to _GLIBCXX_DEBUG and _GLIBCXX_DEBUG_PEDANTIC, maybe
> something like _GLIBCXX_DEBUG_PROFILE. However, if this is not the
> way you wish to proceed, and the intention is to be separate from
> debug mode, I would suggest using a new directory hierarchy and
> naming convention, one that is more descriptive of what you are
> attempting than "advisor". Perhaps include/profile? _GLIBCXX_PROFILE?
> Etc. In any case, please explain a bit more as to how this relates to
> debug mode.

I would go for the cleaner option at this point, so we will develop a profile mode at the same level as the debug and parallel modes.

> How do you intend to test this functionality? Do you intend for this
> functionality to work on all, some, or most platforms? What are the
> portability concerns or issues?

As advised, we will start by documenting the design in more detail and send it out once it is ready for review.

>> There are other STL usage patterns that can be recognized and
>> reported to the application programmer.
>
> Again, scope. It would be great to provide an initial list so we know
> what is being attempted. Feel free to mark some of them as stretch
> goals, or things that will not be attempted immediately.

They will be included with the initial design document and will be sent for review shortly.

>> The effect of our instrumentation is a meaningful trace of the
>> behavior of containers, algorithms and iterators, and their
>> interaction. This abstraction level is above what normal profiling
>> tools can offer. If we used normal profiling tools, we would have to
>> reverse engineer the trace to get to this level of abstraction. Let
>> me know if you want me to go into more detail.
>
> More detail, please.

Will do.

>> Design
>> ======
>> The usage model is intended to be similar to -fprofile-generate and
>> -fprofile-use.
>> 1. Compiler driver
>> Build with instrumentation: gcc -fstl-advisor=instrument:all foo.cc.
>> The effect on the compiler is simply "-D_GLIBCXX_DEBUG
>> -D_GLIBCXX_ADVISOR". The effect on the linker is "-lstladvisor".
>
> Seems fine.
>
>> Run program: ./a.out.
>
> OK.
>
>> Produce advisory report: gcc -fstl-advisor=use:advise.
>
> What other valid input do you anticipate -fstl-advisor having?

I hope that some of the profiling information could eventually be picked up by the compiler. This is not our immediate goal though, so for now I just want to leave the door open for it.

> best, benjamin

To sum up: we will create the branch, write the design doc, and send it for review ASAP. Development will be separate from the debug mode. Thank you, Silvius
Re: Type-punning
This may have been fixed by a recent patch to -Wstrict-aliasing. Let me try to run the latest pre-4.3 version and I will get back to you.

Herman Geza wrote:
> Hi, gcc's docs state for -fstrict-aliasing: "In particular, an object
> of one type is assumed never to reside at the same address as an
> object of a different type, unless the types are almost the same."
> I have problems with this:
>
>   struct A { float x, y; };
>   struct B { float x, y; };
>
>   int main() {
>     A a;
>     B &b = reinterpret_cast<B&>(a);
>   }
>
> I get a type-punning warning for this code. However, A and B are
> exactly the same type. Is the warning appropriate here? Where can I
> find the definition of "almost the same [type]"?
>
> A little more complicated example:
>
>   struct A { float x, y; };
>   struct B : public A { };
>   struct C : public A { };
>
>   int main() {
>     B b;
>     C &c = reinterpret_cast<C&>(b);
>   }
>
> I get the same warning, and I even get miscompiled code with -O6 (for
> more complicated code, not for this). What is the correct way to do
> this:
>
>   void setNaN(float &v) {
>     reinterpret_cast<int&>(v) = 0x7f81;
>   }
>
> without a type-punning warning? I cannot use the union trick here
> (memcpy works, though, but it's not the most efficient solution, I
> suppose). Thanks for your help, Geza
>
> PS: gcc-4.1 and gcc-4.2 produce this. Earlier gcc versions don't
> produce warnings for these cases.
Re: Type-punning
Herman Geza wrote:
>   struct A { float x, y; };
>   struct B { float x, y; };
>
>   int main() {
>     A a;
>     B &b = reinterpret_cast<B&>(a);
>   }
>
> I get a type-punning warning for this code. However, A and B are
> exactly the same type. Is the warning appropriate here?

Unfortunately gcc 4.3 will likely warn on this as well. To avoid it, we would need a "prefix" analysis to distinguish between different structs that are compatible up to a point. As far as I know, GCC does not have this capability at the moment. If it existed, it would still need to be connected to -Wstrict-aliasing, which is generally not trivial, as it requires figuring out what field is referenced where. I guess it could be implemented in a simpler way for the special case where the structs are structurally identical.

> Where can I find the definition of "almost the same [type]"?

The C and C++ standards: ISO/IEC 9899:1999 (C), section 6.5, paragraph 7, and ISO/IEC 14882:1998 (C++), section 3.10, paragraph 15. There are a couple of other passages that touch on this as well.

> What is the correct way to do this:
>
>   void setNaN(float &v) {
>     reinterpret_cast<int&>(v) = 0x7f81;
>   }
>
> without a type-punning warning? I cannot use the union trick here
> (memcpy works, though, but it's not the most efficient solution, I
> suppose).

The correct and efficient solution is to use memcpy. GCC should recognize the memcpy call and transform it into a move or a load/store or such. You may want to try memcpy, generate the assembly code, and see if you are happy with it.
Re: Type-punning
Sergei Organov wrote:
> Herman Geza <[EMAIL PROTECTED]> writes:
> [...]
>> What is the correct way to do this:
>>
>>   void setNaN(float &v) {
>>     reinterpret_cast<int&>(v) = 0x7f81;
>>   }
>>
>> without a type-punning warning? I cannot use the union trick here
>
> Why? Won't the following work?
>
>   void setNaN(float &v) {
>     union { float f; int i; } t;
>     t.i = 0x7f81;
>     v = t.f;
>   }

As far as I know, this is guaranteed to work with GCC. But it is not kosher according to the language standards, so other compilers might dislike it. On the other hand, other compilers are not guaranteed to optimize the call to "memcpy" away either. Type punning has been disallowed regardless of disguise at least since Fortran 77, when compiler writers realized it had become evil. However, in my opinion, there's a big difference between a useful little trick like the union above and the horrible memory overlays that the Fortran 77 standard tried to help disambiguate. Silvius
Re: Type-punning
Herman Geza wrote:
>   void foo(float *a) {
>     int *b = (int*)a;  // type-punning warning
>     // Here, accesses to a and b shouldn't be optimized, as the
>     // compiler knows that a and b point to the same address.
>   }
>
> Is this reasonable?

Even if it were trivial to implement, I would vote against it, because it would encourage people to write non-compliant code. In terms of compilation time, it is not reasonable: the standard is clear about this precisely so that compilers can optimize based on declared types, without having to perform overly complex (and expensive) alias analysis. For the corner case where you have non-compliant code that breaks with -fstrict-aliasing and is much slower with -fno-strict-aliasing, and which you cannot modify, I think you are on your own. You could in principle write a pass that flags all the possible references (including possible aliases) to potentially type-punned addresses. Then you would have to make sure that this information is understood by the relevant optimization passes, and that it is preserved across the different representations, i.e., from GIMPLE to RTL. All this work just to fight the standard :).
Re: Type-punning
Herman Geza wrote:
> ... aliasing, when GCC (I think) incorrectly treats types as
> different, but they're the same. For example, I have:
>
>   struct Point {
>     float x, y, z;
>   };
>
>   struct Vector {
>     float x, y, z;
>     Point &asPoint() {
>       return reinterpret_cast<Point&>(*this);
>     }
>   };
>
> Point and Vector have the same layout, but GCC treats them as
> different when it does aliasing analysis. I have problems when I use
> Vector::asPoint.

I also think this case should not be flagged. I have seen similar usage in network programming. Did it actually result in bad code, or was it just the warning that bothered you? :)

> Thanks for your responses.

You're welcome. Silvius