strict aliasing benefit examples
I often need to convince people that gcc is not just defective for doing random nonsense to code which violates the C and C++ aliasing rules. Not that I'm real sure myself actually, given that gcc is able to generate warnings for all the normal cases, but anyway... I'm up against the idea that Visual Studio is correct and gcc is buggy crap. :-) Very few professional software developers can handle the aliasing issue.

So I could use some teaching examples. Think "PowerPoint". Heh, OK, I'll use OpenOffice.org Impress, but you get the idea I think.

Realistic code matters. Contrived examples won't convince anyone.

People care about 32-bit x86, not IA-64. AMD64 and PowerPC count for something, but not much.

The best examples would involve optimizations which could not be performed if gcc did what people normally expect from a simple pointer cast and wrong-type access. I doubt such examples exist, but I hope they do.
Aliasing: reliable code or not?
I have code that goes something like this:

    char *foo(char *buf){
        *buf++ = 42;
        *((short*)buf) = 0xfeed;
        buf += 2;
        *((int*)buf) = 0x12345678;
        buf += 4;
        *((int*)buf) = 0x12345678;
        buf += 4;
        return buf;
    }

The buffer is really of type char. The above comes from a pile of macros and inline functions (C99 code), or alternately from a pile of evil C++ templates.

Real C99-compliance would be cute, but then again the buffer will end up getting executed as x86 code. Intentionally executing data is surely a standards violation of the highest order.

I can't afford to just use "char" to do the writes. Currently gcc won't merge those into larger writes. Performance matters.

So, how likely is gcc to do what I obviously want? It seems to be working right now...

It would be nice to have this documented to work, just as was done with unions for type punning. The next best would be a big giant warning and a work-around that doesn't kill performance.
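One work-around that is often suggested for this pattern is to do the wider stores through memcpy with a constant size. The sketch below is only that, a sketch: the helper names put_u16 and put_u32 are made up for illustration, and it assumes the optimizer lowers small fixed-size memcpy calls to plain stores, which should be confirmed in the generated assembly for the gcc version and target in use.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers (not from the original post): store a value into a
 * char buffer without accessing the buffer through a short* or int*. */
static inline char *put_u16(char *buf, uint16_t v)
{
    memcpy(buf, &v, sizeof v);   /* constant size, candidate for a single store */
    return buf + sizeof v;
}

static inline char *put_u32(char *buf, uint32_t v)
{
    memcpy(buf, &v, sizeof v);
    return buf + sizeof v;
}

/* Same byte layout as the cast-based foo() above: 1 + 2 + 4 + 4 bytes,
 * written in the host's byte order. */
char *foo(char *buf)
{
    *buf++ = 42;
    buf = put_u16(buf, 0xfeed);
    buf = put_u32(buf, 0x12345678);
    buf = put_u32(buf, 0x12345678);
    return buf;
}
```

Because no object is ever accessed through a pointer of the wrong type, this form does not depend on how any particular compiler treats the casts, and it also sidesteps the unaligned short and int accesses.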
Re: strict aliasing benefit examples
On 11/28/06, Andrew Pinski <[EMAIL PROTECTED]> wrote:

> I often need to convince people that gcc is not just
> defective for doing random nonsense to code which
> violates the C and C++ aliasing rules. Not that I'm
> real sure myself actually, given that gcc is able to
> generate warnings for all the normal cases, but anyway...
> I'm up against the idea that Visual Studio is correct
> and gcc is buggy crap. :-) Very few professional
> software developers can handle the aliasing issue.

The aliasing rules are not hard to understand.

For a long-time gcc developer like you, certainly. It doesn't help that the standards are only available for $$$ or as contraband. Anyway, I've never seen a book or course that teaches any of this stuff.

Simplified rules for C: Only access a variable by its own type, its signed or unsigned variant, or by the character types (char, unsigned char, and signed char). For structs/unions, accessing it via an inner struct/union is also ok.

How about an outer struct? For example, Linux does that all the time to recover driver-specific data from a generic struct without needing an extra pointer.

> Realistic code matters. Contrived examples won't
> convince anyone.

Realistic code for aliasing questions is usually going to be big and hard to understand.

Bummer. I'm trying to resist the normal fix, which is to consider strict-aliasing as a benchmark cheat that you have to disable for real-world code.

> People care about 32-bit x86, not IA-64. AMD64 and
> PowerPC count for something, but not much.

Actually PowerPC code generation counts a lot for me, as I work for Sony.

It counts for me too, at home. My only computer is a Mac G4 Cube with the MPC7400. It doesn't count at work, where Win32 is the norm.

> The best examples would involve optimizations which
> could not be performed if gcc did what people normally
> expect from a simple pointer cast and wrong-type access.
> I doubt such examples exist, but I hope they do.
An easy quick example of what strict alias can do is the following:

    int f(int *a, float *b)
    {
        *a = 1;
        *b = 2.0;
        return *a == 2;
    }

Without the aliasing rules provided by the C/C++ standard, you would not know whether *a could alias *b, and therefore could not assume that the function always returns 0.

I have an example kind of like that, though for __restrict because there is a (char*) in it.

Problem: people don't write code that way. (well I hope not) People declare a few local variables, load them with data via the pointers, do stuff with the local variables, then save back the results via the pointers.

So that won't convince many Visual Studio 2005 fans. :-(

The reason why GCC gets the cast case "wrong" is because GCC does not do that much base+offset based aliasing but instead it implements type based aliasing. What most other compilers do is first base+offset aliasing and then type based aliasing if they cannot figure that out with the base+offset. We have found that we currently get better results with our current IR, with type based aliasing first.

I think there are 3 aliasing possibilities here:

1. known to alias
2. known to not alias
3. may alias

You could start with a base+offset pass that only distinguishes the known-to-alias cases from the others. That deals with typical casts. Then you follow that with type-based analysis and finally back to base+offset to find a few remaining known-to-not-alias cases.

The current situation really hurts. For example, any project using the wxWidgets library must use -fno-strict-aliasing. That means you get no benefit from __restrict if you use wxWidgets. (because __restrict is collateral damage when -fno-strict-aliasing gets used)

If you want performance, disallow aliasing in unions. :-)

Almost nobody was doing that until it got suggested as a way to make gcc cooperate. Even today it is very rare. People use unions to save space; this should not hurt performance.
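A slide-sized candidate that looks more like everyday code than f() is a loop whose bound is read through a pointer. The function name and signature below are invented for illustration. Under the type-based rules a store to a float cannot modify an int, so the compiler is allowed to load *count once before the loop; with -fno-strict-aliasing it generally has to allow for dst[i] overwriting *count and reload the bound on every iteration. Whether a given gcc version actually exploits this is worth checking in the assembly before presenting it.

```c
/* Hypothetical example: scale the first *count floats in place.
 * dst is float* and count is int*, so type-based alias analysis may
 * treat the stores to dst[i] as unable to change *count. */
void scale(float *dst, const int *count, float k)
{
    for (int i = 0; i < *count; i++)
        dst[i] *= k;
}
```

The point for an audience is that nobody has to write anything unusual to benefit: the bound is simply not copied into a local first.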
Re: strict aliasing benefit examples
On 11/29/06, Paolo Bonzini <[EMAIL PROTECTED]> wrote:

>> int f(int *a, float *b)
>> {
>>   *a = 1;
>>   *b = 2.0;
>>   return *a == 2;
>> }
>>
>
> Problem: people don't write code that way. (well I hope not)
> People declare a few local variables, load them with data via
> the pointers, do stuff with the local variables, then save back
> the results via the pointers.
>
> So that won't convince many Visual Studio 2005 fans. :-(

Then, the answer is that GCC's stronger aliasing allows you to use one line of code instead of three.

Consider that most people can only write at most ~50 SLOC/day (including debugging and documentation), with a stunning independence from the programming language and programming style. If you take it with the necessary grain of salt, this is quite an argument.

It's an argument to favor K+R style over GNU, Allman, and Whitesmith. This could be holding back gcc development. :-)

Since humans have to do a bit of alias analysis when maintaining or writing code, the extra clarity of pulling things into temporary variables isn't wasted.

I guess I can imagine that macro expansion might result in some cases where strict-aliasing is of benefit. Most people fail to use a temporary in a macro, probably because __typeof__ is gcc-only.

I can probably fit 20 lines of code on a readable slide. Ideas?

BTW, there are more normal programming habits that defeat type-based alias analysis. People pick data types by habit. Mostly, people will use the same type for nearly everything.
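The two coding habits being compared can be put side by side in a small sketch. All names here are made up for illustration. Both pointers involve float, which is the "same type for nearly everything" habit: type-based analysis cannot separate them, so in the direct version the compiler has to assume that writing v->x may change *len. The version with temporaries states the programmer's intent explicitly, and the two functions really do behave differently when the arguments overlap.

```c
struct vec2 { float x, y; };

/* Direct style: every use goes through the pointers. Since len may
 * point at v->x, *len has to be re-read after v->x is written. */
void normalize_direct(struct vec2 *v, const float *len)
{
    v->x /= *len;
    v->y /= *len;
}

/* Temporary style: load everything first, compute, then store back.
 * No alias analysis is needed to keep l in a register. */
void normalize_temps(struct vec2 *v, const float *len)
{
    float x = v->x, y = v->y, l = *len;
    v->x = x / l;
    v->y = y / l;
}
```

With distinct objects the two give the same result. If len happens to be &v->x, the direct version divides y by the already-updated x, while the temporary version divides both fields by the original value.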
old aliasing bug: fixed?
    int weird(float *fp){
        // access an int as an int (see caller),
        // so not an aliasing violation
        return *(int*)fp;
    }
    int main(int argc, char *argv[]){
        return weird((float*)&argc);
    }

I just tried this code with gcc 4.4.5 on 32-bit powerpc using -O2 -W -Wall. Assembly code for the weird function looks OK, both inlined and not, but that certainly isn't proof that gcc will always tolerate such code. I recall that there were problems handling this type of code. (never mind any non-conformant callers that actually pass a pointer to a float -- not that gcc would be able to see them in separately compiled files)

So, is it fixed now? (what gcc version?) If not, is it at least fixed if I change "float" to "void" and/or "unsigned char"?

BTW, oddly it looks like gcc tolerates a genuine aliasing violation as well now. (passing the value as a float) Of course, that may just be my luck with the optimizer.
Re: old aliasing bug: fixed?
On Thu, Sep 30, 2010 at 5:39 AM, Richard Guenther wrote:
> On Thu, Sep 30, 2010 at 9:54 AM, Albert Cahalan wrote:
>> int weird(float *fp){
>>   // access an int as an int (see caller),
>>   // so not an aliasing violation
>>   return *(int*)fp;
>> }
>> int main(int argc, char *argv[]){
>>   return weird((float*)&argc);
>> }
>>
>> I just tried this code with gcc 4.4.5 on 32-bit powerpc using -O2 -W -Wall.
>> Assembly code for the weird function looks OK, both inlined and not, but
>> that certainly isn't proof that gcc will always tolerate such code.
>> I recall that there were problems handling this type of code. (never mind
>> any non-conformant callers that actually pass a pointer to a float -- not
>> that gcc would be able to see them in separately compiled files)
>>
>> So, is it fixed now? (what gcc version?) If not, is it at least fixed
>> if I change "float" to "void" and/or "unsigned char"?
>>
>> BTW, oddly it looks like gcc tolerates a genuine aliasing violation
>> as well now. (passing the value as a float) Of course, that may just
>> be my luck with the optimizer.
>
> I indeed fixed the above problem at some point (4.1 may be still
> broken, 4.3 should be fixed I think).
>
> We're trying to tolerate genuine alias violations if we can see
> what the user intended (in compiler-speak, when we detect
> a must-alias relationship we do not try to disambiguate using
> type-based alias analysis). That's just being nice to users and
> not breaking their code just because we can.

I've been trying to come up with an example where either:

a. gcc gains optimization from type-based alias analysis
b. traditional assumptions result in breakage

I am no longer able to find either. Is it safe to consider the type-based aliasing to be essentially disabled now?
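For the genuine-violation case (reading the bits of an actual float as an integer), a form that does not rely on the compiler noticing a must-alias relationship is to copy the bytes. This is a minimal sketch; the function name float_bits is invented here, and it assumes float is a 32-bit type on the target.

```c
#include <stdint.h>
#include <string.h>

/* Assumption of this sketch: float and int32_t have the same size. */
_Static_assert(sizeof(float) == sizeof(int32_t), "float must be 32 bits");

/* Hypothetical helper: return the object representation of a float as
 * an integer by copying bytes, so no float is read through an int lvalue. */
int32_t float_bits(float f)
{
    int32_t i;
    memcpy(&i, &f, sizeof i);
    return i;
}
```

On an IEEE 754 target, float_bits(1.0f) yields 0x3f800000. Compilers commonly reduce the fixed-size memcpy to a register move or a single load, though that is worth confirming on the target in question.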
sparse overlapping structs for vectorization
I had a problem that got solved in an ugly way. I think gcc ought to provide a few ways to make a nicer solution.

There was an array of structs roughly like so:

    struct{int w;float x;char y[4];short z[2];}foo[512][4];

The types within the struct are 4 bytes each; I don't actually remember anything else and it doesn't matter except that they are distinct. I think it was bitfields actually, neatly grouped into groups of 32 bits. In other words, like 4 4-byte values but with more-or-less incompatible types.

Note that 4 of the structs neatly fill a 64-byte cache line. An alignment attribute was used to ensure 64-byte alignment.

The most common operation needed on this array is to compare the first struct member of 4 of the structs against a given value, looking to see if there is a match. SSE would be good. This would then be followed by using the matching entry if there is one, else picking one of the 4 to recycle and thus use.

First bad solution: One could load up 4 SSE registers, shuffle things around... NO.

Second bad solution: One could simply have 4 distinct arrays. This is bad because there are different cache lines for w, x, y, and z.

Third bad solution: The array can be viewed as "int foo[512][4][4]" instead, with the struct forming the third array index. Note that the last two array indexes are both 4, so you can kind of swap them around. This groups 4 fields of each type together, allowing SSE. The problem here is loss of type safety; one must use array indexes instead of struct field names. Like so:

    foo[idx][WHERE_W_IS][i]

Fourth bad solution: We lay things out as in the third solution, but we cast pointers to effectively lay sparse structs over each other like shingles.

    {
        int w;
        int pad_wx[3];
        float x;
        int pad_xy[3];
        char y[4];
        int pad_yz[3];
        short z[2];
    }

Performance is hurt by the need for __may_alias__ and of course the result is painful to look at. We went with this anyway, using SSE intrinsics, and performance was great. Maintainability... not so much.
BTW, an array of 512 structs containing 4-entry arrays was not used because we wanted to have a simple normal pointer to indicate the item being operated on. We didn't want to need a pointer,index pair.

Can something be done to help out here? The first thing that pops into mind is the ability to tell gcc that the struct-to-struct byte offset for array indexing is a user-specified value instead of simply the struct size.

It's possible we could have safely ignored the warning about aliasing. I don't know. Perhaps that would give even better performance, but the casting would still be very ugly.

Solutions that can be defined away for non-gcc compilers are better.
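To make the "shingle" layout concrete, here is a rough reconstruction of the fourth approach. It is a sketch under assumptions, not the original code: the names group, entry, entry_at and find_w are invented, int and float are taken to be 4 bytes and short 2 bytes, the GCC aligned and may_alias attributes are used, and a plain scalar loop stands in for the SSE intrinsics mentioned above.

```c
#include <stddef.h>

/* One 64-byte cache line holding 4 logical entries, stored field-major
 * (all four w values first, then all four x values, and so on). */
struct group {
    int   w[4];
    float x[4];
    char  y[4][4];
    short z[4][2];
} __attribute__((aligned(64)));

_Static_assert(sizeof(struct group) == 64, "one group per cache line");

/* Sparse view of a single logical entry. Laid over the group at byte
 * offset 4*i, its named fields land on w[i], x[i], y[i] and z[i].
 * may_alias tells gcc that accesses through this type can touch anything. */
struct entry {
    int   w; int pad_wx[3];
    float x; int pad_xy[3];
    char  y[4]; int pad_yz[3];
    short z[2];
} __attribute__((may_alias));

/* A single plain pointer identifies entry i (0..3) of a group. */
static inline struct entry *entry_at(struct group *g, int i)
{
    return (struct entry *)((char *)g + (size_t)i * sizeof(int));
}

/* Compare the w field of all 4 entries against key. The four ints are
 * contiguous, which is what makes a single SSE compare possible. */
static inline int find_w(const struct group *g, int key)
{
    for (int i = 0; i < 4; i++)
        if (g->w[i] == key)
            return i;
    return -1;
}
```

Code that works on one entry uses field names (entry_at(&g, 2)->x), while the search works on the contiguous w array. The pad_* members are the maintainability cost: they encode the layout by hand, which is what a user-specified array stride would avoid.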
ARM/Thumb function attribute
As far as I can tell, there is no way to declare that a particular function pointer will point at plain ARM code or at Thumb code. I'm more than a little surprised actually, so maybe I just missed something. How can I do this?

Some background:

The function is in ROM. I'm using a linker script to give it a symbol, like so:

    PROVIDE( tx_thread_create = 0x2718);

I'll be declaring it somewhat like this:

    int __cdecl thread_create(void *thread, char *name,
        void (__cdecl *fn)(int), int param, void *stack, int stack_size,
        int sched1, int sched2, int sched3, int sched4);

Note that the function itself takes a pointer. I might want the ability to enforce that the pointer goes to Thumb code or to non-Thumb code. Certainly I need to allow for a thread_create function that can handle either kind of code.
Re: ARM/Thumb function attribute
On Sat, Mar 22, 2008 at 8:24 PM, Paul Brook <[EMAIL PROTECTED]> wrote:

> This list is for development of gcc, not gcc users. In future gcc-help, or
> some other arm specific list is the correct place to ask such questions.

I guess it wasn't clear that I'm requesting a new attribute. I want to force a call to be Thumb, or to be ARM.

> The low bit of a function pointer value indicates thumbness. The caller
> doesn't know or care whether it is calling an Arm or Thumb function.
>
> However note that
>
> > int __cdecl thread_create(...)
>
> This isn't a function pointer, it's an actual function declaration. Expect
> this to break because (a) it's probably out of range of a branch instruction,
> and (b) your linker defined symbol won't have the correct type.

There is an existing attribute that I can use to deal with range. (longcall if I remember right) I will use this as required. I must do this anyway, since even my own code will reside in two separate chunks.

My linker-defined symbol probably won't have much of a type at all. I'm asking for a way to make gcc ignore the type, and just call the symbol. Another way to solve the problem would be to have some way to make gcc emit the symbol, perhaps by an attribute that declares the address. So one of these would do:

    __attribute__((at(0x2718)))
    __attribute__((thumb))

(ideally both, so that I don't need to mess with the bits of Thumb code addresses)
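Until something like the requested attributes exists, the "mess with the bits" approach can at least be confined to one place. The sketch below follows the low-bit convention described in the quoted reply. The helper names are invented, __cdecl is left out, the address 0x2718 is the one from the linker script above, and whether that ROM routine is actually Thumb code is an assumption the caller has to supply. It also presumes the compiler emits an interworking-capable indirect call, which depends on the target options.

```c
#include <stdint.h>

/* Function pointer type matching the thread_create declaration above. */
typedef int (*thread_create_fn)(void *thread, char *name,
                                void (*fn)(int), int param,
                                void *stack, int stack_size,
                                int sched1, int sched2,
                                int sched3, int sched4);

/* Hypothetical helpers: set or clear the low bit that marks a Thumb entry. */
static inline uintptr_t thumb_addr(uintptr_t addr) { return addr | 1u; }
static inline uintptr_t arm_addr(uintptr_t addr)   { return addr & ~(uintptr_t)1; }
static inline int is_thumb(uintptr_t addr)         { return (int)(addr & 1u); }

/* Build a callable pointer to a ROM routine at a fixed address.
 * The integer-to-function-pointer cast is implementation-defined. */
static inline thread_create_fn rom_entry(uintptr_t addr, int thumb)
{
    return (thread_create_fn)(thumb ? thumb_addr(addr) : arm_addr(addr));
}
```

A call site would then look like rom_entry(0x2718, 1)(thread, name, fn, ...), and the same helpers can be applied to the fn argument when the callee needs a particular instruction set.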