strict aliasing benefit examples

2006-11-28 Thread Albert Cahalan

I often need to convince people that gcc is not just
defective for doing random nonsense to code which
violates the C and C++ aliasing rules. Not that I'm
real sure myself actually, given that gcc is able to
generate warnings for all the normal cases, but anyway...
I'm up against the idea that Visual Studio is correct
and gcc is buggy crap. :-)  Very few professional
software developers can handle the aliasing issue.

So I could use some teaching examples.

Think "PowerPoint". Heh, OK, I'll use OpenOffice.org
Impress, but you get the idea I think.

Realistic code matters. Contrived examples won't
convince anyone.

People care about 32-bit x86, not IA-64. AMD64 and
PowerPC count for something, but not much.

The best examples would involve optimizations which
could not be performed if gcc did what people normally
expect from a simple pointer cast and wrong-type access.
I doubt such examples exist, but I hope they do.


Aliasing: reliable code or not?

2006-11-28 Thread Albert Cahalan

I have code that goes something like this:

char *foo(char *buf){
   *buf++ = 42;
   *((short*)buf) = 0xfeed;
   buf += 2;
   *((int*)buf) = 0x12345678;
   buf += 4;
   *((int*)buf) = 0x12345678;
   buf += 4;
   return buf;
}

The buffer is really of type char. The above comes
from a pile of macros and inline functions (C99 code),
or alternately from a pile of evil C++ templates.

Real C99-compliance would be cute, but then again
the buffer will end up getting executed as x86 code.
Intentionally executing data is surely a standards
violation of the highest order.

I can't afford to just use "char" to do the writes.
Currently gcc won't merge those into larger writes.
Performance matters.

So, how likely is gcc to do what I obviously want?
It seems to be working right now...

It would be nice to have this documented to work,
just as was done with unions for type punning.
The next best would be a big giant warning and a
work-around that doesn't kill performance.
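
One possible work-around, sketched under assumptions rather than taken from this thread: do each store with memcpy of a small constant size, which compilers commonly expand inline into a single store. The helper names put_u8, put_u16 and put_u32 are hypothetical, and the fixed-width types stand in for the short and int of the original. Whether a given gcc version turns these into plain 16-bit and 32-bit stores should be checked in the generated assembly.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers: copy a value into a char buffer at any
 * alignment and return the advanced pointer. memcpy is allowed to
 * write any object representation into a char buffer, so no aliasing
 * rule is involved. */
static inline char *put_u8(char *buf, uint8_t v)
{
    memcpy(buf, &v, sizeof v);
    return buf + sizeof v;
}

static inline char *put_u16(char *buf, uint16_t v)
{
    memcpy(buf, &v, sizeof v);
    return buf + sizeof v;
}

static inline char *put_u32(char *buf, uint32_t v)
{
    memcpy(buf, &v, sizeof v);
    return buf + sizeof v;
}

/* Same byte layout as the cast-based foo() above, in host byte order. */
char *foo(char *buf)
{
    buf = put_u8(buf, 42);
    buf = put_u16(buf, 0xfeed);
    buf = put_u32(buf, 0x12345678);
    buf = put_u32(buf, 0x12345678);
    return buf;
}
```

This version also avoids the misaligned short and int stores of the cast-based code, which matters on targets stricter than x86.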


Re: strict aliasing benefit examples

2006-11-28 Thread Albert Cahalan

On 11/28/06, Andrew Pinski <[EMAIL PROTECTED]> wrote:

> I often need to convince people that gcc is not just
> defective for doing random nonsense to code which
> violates the C and C++ aliasing rules. Not that I'm
> real sure myself actually, given that gcc is able to
> generate warnings for all the normal cases, but anyway...
> I'm up against the idea that Visual Studio is correct
> and gcc is buggy crap. :-)  Very few professional
> software developers can handle the aliasing issue.

The aliasing rules are not hard to understand.


For a long-time gcc developer like you, certainly.

It doesn't help that the standards are only available
for $$$ or as contraband. Anyway, I've never seen a
book or course that teaches any of this stuff.


Simplified rules for C:
Only access a variable by its own type, its signed or
unsigned variant, or by the character types (char, unsigned char,
and signed char).  For structs/unions, accessing it via
an inner struct/union is also ok.


How about an outer struct? For example, Linux does that
all the time to recover driver-specific data from a generic
struct without needing an extra pointer.
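
For readers who have not seen the outer-struct pattern, here is a minimal sketch of it. The struct names and the CONTAINER_OF macro are hypothetical simplifications, not the Linux definitions. The sketch assumes the generic struct passed in really is embedded in an object of the outer type; if it is not, the result is undefined.

```c
#include <stddef.h>

/* Generic struct handed around by the core code (hypothetical). */
struct device {
    int id;
};

/* Driver-specific outer struct that embeds the generic one. */
struct my_driver {
    int private_state;
    struct device dev;
};

/* Step back from a member pointer to the enclosing object. The
 * arithmetic is done on char *, so only the final cast changes type. */
#define CONTAINER_OF(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Recover the driver data from the generic pointer, with no extra
 * back-pointer stored in struct device. */
int driver_state(struct device *d)
{
    struct my_driver *drv = CONTAINER_OF(d, struct my_driver, dev);
    return drv->private_state;
}
```

Every access here uses the type the object was declared with, which is why the pattern is usually considered compatible with type-based alias analysis.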


> Realistic code matters. Contrived examples won't
> convince anyone.

Realistic code for aliasing questions is usually going
to be big and hard to understand.


Bummer. I'm trying to resist the normal fix, which is
to consider strict-aliasing as a benchmark cheat that
you have to disable for real-world code.


> People care about 32-bit x86, not IA-64. AMD64 and
> PowerPC count for something, but not much.

Actually PowerPC code generation counts a lot for me, as I
work for Sony.


It counts for me too, at home. My only computer is a
Mac G4 Cube with the MPC7400.

It doesn't count at work, where Win32 is the norm.


> The best examples would involve optimizations which
> could not be performed if gcc did what people normally
> expect from a simple pointer cast and wrong-type access.
> I doubt such examples exist, but I hope they do.

An easy quick example of what strict alias can do is the following:

int f(int *a, float *b)
{
  *a = 1;
  *b = 2.0;
  return *a == 2;
}

Without the aliasing rules provided by the C/C++ standard, the
compiler would not know whether *a could alias *b, and therefore
could not always return 0.


I have an example kind of like that, though for __restrict
because there is a (char*) in it.

Problem: people don't write code that way. (well I hope not)
People declare a few local variables, load them with data via
the pointers, do stuff with the local variables, then save back
the results via the pointers.

So that won't convince many Visual Studio 2005 fans. :-(
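
A sketch of the two coding styles being contrasted, with hypothetical function names. In the direct style the loop bound is read through an int pointer while the body stores through a float pointer, so a compiler can only keep the bound in a register if it may assume the float stores do not change it. In the local-variable style the programmer has done that by hand. Whether any particular compiler version hoists the load is not claimed here.

```c
/* Direct style: *n is formally re-read on every iteration unless the
 * compiler may assume a float store cannot modify an int. */
void fill_direct(float *out, const int *n)
{
    for (int i = 0; i < *n; i++)
        out[i] = 0.5f * (float)i;
}

/* Local-variable style: the load happens once by construction, so
 * alias analysis has nothing left to gain here. */
void fill_locals(float *out, const int *n)
{
    int count = *n;
    for (int i = 0; i < count; i++)
        out[i] = 0.5f * (float)i;
}
```

For conforming callers both functions produce the same result; the difference is only in what the optimizer has to prove.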


The reason why GCC gets the cast case "wrong" is because GCC does
not do that much base+offset based aliasing but instead it implements
type based aliasing.

What most other compilers do is first base+offset aliasing and then
type based aliasing if they cannot figure that out with the base+offset.
We have found that we currently get better results with our current IR,
with type based aliasing first.


I think there are 3 aliasing possibilities here:

1. known to alias
2. known to not alias
3. may alias

You could start with a base+offset pass that only distinguishes the
known-to-alias cases from the others. That deals with typical casts.
Then you follow that with type-based analysis and finally back to
base+offset to find a few remaining known-to-not-alias cases.
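
The pass ordering proposed above can be sketched as a toy classifier. Everything in it is hypothetical: the access descriptor, the type ids and the function names are illustration only and have nothing to do with gcc's internal representation. The type rule is also simplified, since it ignores signed/unsigned variants and struct members.

```c
#include <stddef.h>

enum alias_result { KNOWN_ALIAS, KNOWN_NO_ALIAS, MAY_ALIAS };
enum type_id { TYPE_CHAR, TYPE_INT, TYPE_FLOAT };

/* Hypothetical description of one memory access. */
struct access {
    int base_id;        /* object the address derives from, 0 = unknown */
    size_t offset;      /* byte offset from that base */
    size_t size;        /* bytes accessed */
    enum type_id type;  /* type of the lvalue used */
};

enum alias_result classify(const struct access *a, const struct access *b)
{
    int same_base = a->base_id != 0 && a->base_id == b->base_id;
    int overlap = a->offset < b->offset + b->size &&
                  b->offset < a->offset + a->size;

    /* Pass 1: base+offset, used only to find known-to-alias pairs.
     * This is what catches the typical pointer cast. */
    if (same_base && overlap)
        return KNOWN_ALIAS;

    /* Pass 2: type-based rule; character types may alias anything. */
    if (a->type != b->type && a->type != TYPE_CHAR && b->type != TYPE_CHAR)
        return KNOWN_NO_ALIAS;

    /* Pass 3: base+offset again; same base here implies disjoint ranges. */
    if (same_base)
        return KNOWN_NO_ALIAS;

    return MAY_ALIAS;
}
```

With this ordering an int access and a float access to the same base and offset are reported as known to alias, instead of being separated by the type rule.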

The current situation really hurts. For example, any project using the
wxWidgets library must use -fno-strict-aliasing. That means you get
no benefit from __restrict if you use wxWidgets. (because __restrict
is collateral damage when -fno-strict-aliasing gets used)

If you want performance, disallow aliasing in unions. :-) Almost nobody
was doing that until it got suggested as a way to make gcc cooperate.
Even today it is very rare. People use unions to save space; this should
not hurt performance.
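
For reference, the union idiom being discussed looks like the following minimal sketch. The function name is hypothetical, and the code assumes float is a 32-bit IEEE 754 type. gcc documents that reading a union member other than the one last written is permitted even with -fstrict-aliasing, provided the access goes through the union type.

```c
#include <stdint.h>

/* Reinterpret the bits of a float as a 32-bit integer by writing one
 * union member and reading the other. */
uint32_t float_bits(float f)
{
    union {
        float f;
        uint32_t u;
    } pun;

    _Static_assert(sizeof(float) == sizeof(uint32_t),
                   "sketch assumes a 32-bit float");
    pun.f = f;
    return pun.u;
}
```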


Re: strict aliasing benefit examples

2006-11-29 Thread Albert Cahalan

On 11/29/06, Paolo Bonzini <[EMAIL PROTECTED]> wrote:


>> int f(int *a, float *b)
>> {
>>   *a = 1;
>>   *b = 2.0;
>>   return *a == 2;
>> }
>>
>
> Problem: people don't write code that way. (well I hope not)
> People declare a few local variables, load them with data via
> the pointers, do stuff with the local variables, then save back
> the results via the pointers.
>
> So that won't convince many Visual Studio 2005 fans. :-(

Then, the answer is that GCC's stronger aliasing allows you to use one
line of code instead of three.

Consider that most people can only write at most ~50 SLOC/day (including
debugging and documentation), with a stunning independence from the
programming language and programming style.  If you take it with the
necessary grain of salt, this is quite an argument.


It's an argument to favor K+R style over GNU, Allman, and Whitesmith.
This could be holding back gcc development. :-)

Since humans have to do a bit of alias analysis when maintaining
or writing code, the extra clarity of pulling things into temporary
variables isn't wasted.

I guess I can imagine that macro expansion might result in some
cases where strict-aliasing is of benefit. Most people fail to use a
temporary in a macro, probably because __typeof__ is gcc-only.
I can probably fit 20 lines of code on a readable slide. Ideas?

BTW, there are more normal programming habits that defeat
type-based alias analysis. People pick data types by habit.
Mostly, people will use the same type for nearly everything.


old aliasing bug: fixed?

2010-09-30 Thread Albert Cahalan

int weird(float *fp){
        // access an int as an int (see caller),
        // so not an aliasing violation
        return *(int*)fp;
}
int main(int argc, char *argv[]){
        return weird((float*)&argc);
}

I just tried this code with gcc 4.4.5 on 32-bit powerpc using -O2 -W -Wall.
Assembly code for the weird function looks OK, both inlined and not, but
that certainly isn't proof that gcc will always tolerate such code.
I recall that there were problems handling this type of code. (never mind
any non-conformant callers that actually pass a pointer to a float -- not
that gcc would be able to see them in separately compiled files)

So, is it fixed now? (what gcc version?) If not, is it at least fixed
if I change "float" to "void" and/or "unsigned char"?

BTW, oddly it looks like gcc tolerates a genuine aliasing violation
as well now. (passing the value as a float) Of course, that may just
be my luck with the optimizer.
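
A sketch of the two variants asked about above, with hypothetical function names. The first takes void * and reads through an int lvalue, which is conforming when the object really is an int. The second copies the bytes out, which is well defined whatever the declared type of the object is, as long as sizeof(int) bytes are readable. Neither says anything about what a particular gcc version does with the float * version.

```c
#include <string.h>

/* Conforming when the pointed-to object really is an int; the void *
 * parameter also avoids the intermediate float * conversion. */
int read_int_via_void(const void *p)
{
    return *(const int *)p;
}

/* Byte copy: valid for an object of any type that is at least
 * sizeof(int) bytes long. */
int read_int_bytes(const void *p)
{
    int v;
    memcpy(&v, p, sizeof v);
    return v;
}
```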


Re: old aliasing bug: fixed?

2010-10-20 Thread Albert Cahalan

On Thu, Sep 30, 2010 at 5:39 AM, Richard Guenther wrote:
> On Thu, Sep 30, 2010 at 9:54 AM, Albert Cahalan wrote:
>> int weird(float *fp){
>>        // access an int as an int (see caller),
>>        // so not an aliasing violation
>>        return *(int*)fp;
>> }
>> int main(int argc, char *argv[]){
>>        return weird((float*)&argc);
>> }
>>
>> I just tried this code with gcc 4.4.5 on 32-bit powerpc using -O2 -W -Wall.
>> Assembly code for the weird function looks OK, both inlined and not, but
>> that certainly isn't proof that gcc will always tolerate such code.
>> I recall that there were problems handling this type of code. (never mind
>> any non-conformant callers that actually pass a pointer to a float -- not
>> that gcc would be able to see them in separately compiled files)
>>
>> So, is it fixed now? (what gcc version?) If not, is it at least fixed
>> if I change "float" to "void" and/or "unsigned char"?
>>
>> BTW, oddly it looks like gcc tolerates a genuine aliasing violation
>> as well now. (passing the value as a float) Of course, that may just
>> be my luck with the optimizer.
>
> I indeed fixed the above problem at some point (4.1 may be still
> broken, 4.3 should be fixed I think).
>
> We're trying to tolerate genuine alias violations if we can see
> what the user intended (in compiler-speak, when we detect
> a must-alias relationship we do not try to disambiguate using
> type-based alias analysis).  That's just being nice to users and
> not breaking their code just because we can.

I've been trying to come up with an example where either:

a. gcc gains optimization from type-based alias analysis
b. traditional assumptions result in breakage

I am no longer able to find either. Is it safe to consider the
type-based aliasing to be essentially disabled now?


sparse overlapping structs for vectorization

2014-02-11 Thread Albert Cahalan

I had a problem that got solved in an ugly way. I think gcc ought
to provide a few ways to make a nicer solution.

There was an array of structs roughly like so:

struct{int w;float x;char y[4];short z[2];}foo[512][4];

The types within the struct are 4 bytes each; I don't actually
remember anything else and it doesn't matter except that they
are distinct. I think it was bitfields actually, neatly grouped
into groups of 32 bits. In other words, like 4 4-byte values
but with more-or-less incompatible types.

Note that 4 of the structs neatly fill a 64-byte cache line.
An alignment attribute was used to ensure 64-byte alignment.

The most common operation needed on this array is to compare
the first struct member of 4 of the structs against a given
value, looking to see if there is a match. SSE would be good.
This would then be followed by using the matching entry if
there is one, else picking one of the 4 to recycle and thus use.

First bad solution:

One could load up 4 SSE registers, shuffle things around... NO.

Second bad solution:

One could simply have 4 distinct arrays. This is bad because
there are different cache lines for w, x, y, and z.

Third bad solution:

The array can be viewed as "int foo[512][4][4]" instead, with
the struct forming the third array index. Note that the last two
array indexes are both 4, so you can kind of swap them around.
This groups 4 fields of each type together, allowing SSE. The
problem here is loss of type safety; one must use array indexes
instead of struct field names. Like so: foo[idx][WHERE_W_IS][i]
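
A minimal sketch of this third layout, assuming 4-byte int and using the gcc aligned attribute. The WHERE_*_IS names for the other fields and the find_w function are hypothetical. The scalar loop stands in for the SSE compare; the point is only that the four w values are contiguous, so a single 16-byte load covers them.

```c
/* Field selectors used in place of struct member names. */
enum { WHERE_W_IS, WHERE_X_IS, WHERE_Y_IS, WHERE_Z_IS };

/* foo[idx] is one 64-byte cache line: 4 fields, each stored as a
 * group of 4 ints, one per entry. */
static int foo[512][4][4] __attribute__((aligned(64)));

/* Return which of the 4 entries in line idx has w == key, or -1. */
int find_w(int idx, int key)
{
    const int *w = foo[idx][WHERE_W_IS];   /* 4 contiguous ints */

    for (int i = 0; i < 4; i++)
        if (w[i] == key)
            return i;
    return -1;
}
```

The loss of type safety is visible immediately: every field is an int here, and nothing stops WHERE_X_IS data from being read as if it were w.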

Fourth bad solution:

We lay things out as in the third solution, but we cast pointers
to effectively lay sparse structs over each other like shingles.
{
int w;
int pad_wx[3];
float x;
int pad_xy[3];
char y[4];
int pad_yz[3];
short z[2];
}
Performance is hurt by the need for __may_alias__ and of course
the result is painful to look at. We went with this anyway, using
SSE intrinsics, and performance was great. Maintainability... not
so much.
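
The shingle layout can be sketched as follows, assuming 4-byte int and float, 2-byte short, and the gcc __may_alias__ attribute. The names entry, line_storage and entry_at are hypothetical. The sparse struct is 52 bytes, so entry number 3 in a line starts at byte 12 and ends exactly at byte 64. The pointer arithmetic deliberately reaches across the inner array bounds, which is the ugly part described above.

```c
#include <stddef.h>

/* Sparse view of one entry. The padding skips over the same field of
 * the other three entries in the cache line. */
typedef struct {
    int   w;    int pad_wx[3];
    float x;    int pad_xy[3];
    char  y[4]; int pad_yz[3];
    short z[2];
} __attribute__((__may_alias__)) entry;

_Static_assert(offsetof(entry, x) == 16, "x is one field group after w");
_Static_assert(offsetof(entry, y) == 32, "y is two field groups after w");
_Static_assert(offsetof(entry, z) == 48, "z is three field groups after w");

/* Underlying storage, laid out as in the third solution. */
static int line_storage[512][4][4] __attribute__((aligned(64)));

/* A plain pointer to entry number `way` of line `idx`; consecutive
 * entries overlap each other at a 4-byte stride. */
entry *entry_at(int idx, int way)
{
    return (entry *)&line_storage[idx][0][way];
}
```

Code using the result can write e->w or e->z[1] with the proper field names and types, while the four w values of a line stay contiguous for the SSE compare.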

BTW, an array of 512 structs containing 4-entry arrays was not used
because we wanted to have a simple normal pointer to indicate the
item being operated on. We didn't want to need a pointer,index pair.

Can something be done to help out here? The first thing that pops
into mind is the ability to tell gcc that the struct-to-struct
byte offset for array indexing is a user-specified value instead
of simply the struct size.

It's possible we could have safely ignored the warning about aliasing.
I don't know. Perhaps that would give even better performance, but
the casting would still be very ugly.

Solutions that can be defined away for non-gcc compilers are better.


ARM/Thumb function attribute

2008-03-22 Thread Albert Cahalan

As far as I can tell, there is no way to declare
that a particular function pointer will point at
plain ARM code or at Thumb code. I'm more
than a little surprised actually, so maybe I just
missed something. How can I do this?

Some background: The function is in ROM.
I'm using a linker script to give it a symbol,
like so:

PROVIDE( tx_thread_create = 0x2718);

I'll be declaring it somewhat like this:

int __cdecl thread_create(void *thread, char *name,
                          void (__cdecl *fn)(int), int param,
                          void *stack, int stack_size,
                          int sched1, int sched2, int sched3, int sched4);

Note that the function itself takes a pointer.
I might want the ability to enforce that the pointer
goes to Thumb code or to non-Thumb code.
Certainly I need to allow for a thread_create
function that can handle either kind of code.


Re: ARM/Thumb function attribute

2008-03-23 Thread Albert Cahalan

On Sat, Mar 22, 2008 at 8:24 PM, Paul Brook <[EMAIL PROTECTED]> wrote:
> This list is for development of gcc, not gcc users. In future gcc-help, or
>  some other arm specific list is the correct place to ask such questions.

I guess it wasn't clear that I'm requesting a new attribute.
I want to force a call to be Thumb, or to be ARM.

>  The low bit of a function pointer value indicates thumbness. The caller
>  doesn't know or care whether it is calling an Arm or Thumb function.
>
>  However note that
>
>  > int __cdecl thread_create(...)
>
>  This isn't a function pointer, it's an actual function declaration. Expect
>  this to break because (a) it's probably out of range of a branch instruction,
>  and (b) your linker defined symbol won't have the correct type.

There is an existing attribute that I can use to deal
with range. (longcall if I remember right) I will use this
as required. I must do this anyway, since even my
own code will reside in two separate chunks.

My linker-defined symbol probably won't have much
of a type at all. I'm asking for a way to make gcc ignore
the type, and just call the symbol.

Another way to solve the problem would be to have
some way to make gcc emit the symbol, perhaps by
an attribute that declares the address.

So one of these would do:

__attribute__((at(0x2718)))
__attribute__((thumb))

(ideally both, so that I don't need to mess with the bits
of Thumb code addresses)
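
Neither attribute above exists as written; they are the request. A sketch of the manual bit handling they would replace, using the low-bit convention Paul Brook describes. The rom_fn type and the helper names are hypothetical, the function type is simplified to int(int), and whether the routine at 0x2718 is Thumb or ARM code is not stated in this thread. Converting an integer to a function pointer is implementation-defined, so this is only meaningful on the target in question.

```c
#include <stdint.h>

/* Simplified, hypothetical type for a routine living in ROM. */
typedef int (*rom_fn)(int);

/* Address to call a Thumb routine through: bit 0 set. */
static inline uintptr_t thumb_addr(uintptr_t code_addr)
{
    return code_addr | (uintptr_t)1;
}

/* Address to call an ARM routine through: bit 0 clear. */
static inline uintptr_t arm_addr(uintptr_t code_addr)
{
    return code_addr & ~(uintptr_t)1;
}

/* Turn an interworking address into a callable pointer. */
static inline rom_fn rom_entry(uintptr_t addr)
{
    return (rom_fn)addr;
}
```

On the target, a call through rom_entry(thumb_addr(0x2718)) would then be an indirect call whose mode is selected by the low bit, with no linker-defined symbol involved.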