Re: Merging calls to `abort'

2005-03-16 Thread Ken Raeburn
On Mar 16, 2005, at 11:23, Richard Stallman wrote:
But what are you saying to those users who don't like it that GNU programs
abort silently when they discover bugs in themselves?  Aren't you saying
"tough" in a somewhat more polite way?

No, because nobody has complained about it.  The idea that Emacs
should not use plain abort to crash has only been raised here, not by
Emacs users.  The real complaint that I really got was about
cross-jumping.
As a user (and when I have time, which hasn't been the case in a while, 
an occasional contributor) of both Emacs and GCC, I prefer the 
fancy_abort/assert approaches.  I haven't complained, because I've just 
taken it as a foregone conclusion that that's the way you want Emacs to 
work, and I don't feel strongly enough about it to try to convince you 
otherwise.  That, and the (occasional) problems I run into tend to be 
bad addresses causing faults more than explicit calls to abort.

Ken


Re: Heads-up: volatile and C++

2005-04-18 Thread Ken Raeburn
On Apr 16, 2005, at 15:45, Nathan Sidwell wrote:
It's not clear to me which is the best approach.  (b) allows threads to
be supported via copious uses of volatile (but probably introduces
pessimizations), whereas (a) forces the thread interactions to be
compiler visible (but shows more promise for optimizations).
Is there anything in the language specifications (mainly C++ in this 
context, but is this an area where C and C++ are going to diverge, or 
is C likely to follow suit?) that prohibits spurious writes to a 
location?  E.g., translating:

  extern int x, y;
  x = 3;
  foo(); // may call pthread_*
  y = 4;
  bar(); // likewise
into:
  x <- 3
  call foo
  r1 <- x
  y <- 4
  x <- r1
  call bar
...  And does this change if x and y are members of the same struct?  
Certainly you can talk about quality of implementation issues, but 
would it be non-compliant?  It certainly would be unfriendly to 
multithreaded applications, if foo() released a lock and allowed it to 
be acquired by another thread, possibly running on another processor.

To make a more concrete example, consider two one-byte lvalues, either 
distinct variables or parts of a struct, and the early Alpha processors 
with no byte operations, where byte changes are done by loading, 
modifying, and storing word values.
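
A minimal C sketch of that hazard (the struct layout is made up for
illustration): if the compiler implements the one-byte store to s->a as a
word-wide load/modify/store, it also rewrites s->b, and can silently undo a
concurrent update of s->b by another thread.

  struct pair { char a; char b; };   /* two adjacent one-byte lvalues */

  void set_a (struct pair *s)
  {
    s->a = 1;   /* with no byte-store instruction this may become: load the
                   containing word, merge in the new byte, store the whole
                   word back -- which also stores a possibly stale value
                   of s->b */
  }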

My suspicion is that if the compiler doesn't need to know about threads 
per se, it at least needs to know about certain kinds of restrictions 
on behavior that would cause problems with threads.

Ken


Re: Heads-up: volatile and C++

2005-04-19 Thread Ken Raeburn
On Apr 18, 2005, at 18:17, Robert Dewar wrote:
Is there anything in the language specifications (mainly C++ in this 
context, but is this an area where C and C++ are going to diverge, or 
is C likely to follow suit?) that prohibits spurious writes to a 
location?
Surely the deal is that spurious writes are allowed unless the
location is volatile. What other interpretation is possible?
That's what I thought.  So, unless the compiler (or language spec) is 
going to become thread-aware, any data to be shared across threads 
needs to be declared volatile, even if some other mechanism (like a 
mutex) is in use to do some synchronization.  Which means performance 
would be poor for any such data.
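
A hedged sketch of that concern (my example, not from the thread): without
volatile, and without a thread-aware language definition, nothing prevents
the compiler from hoisting the load out of this loop and spinning on a
register copy, so the loop may never observe the other thread's store.

  extern int done;   /* set to 1 by another thread */

  void wait_for_done (void)
  {
    while (!done)    /* the load may legally be hoisted out of the loop */
      ;
  }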

Which takes me back to: I think the compiler needs to be thread-aware.  
"Enhancing" the meaning of volatile, with the attendant performance 
issues, still doesn't seem adequate to allow for multithreaded 
programming, unless it's used *everywhere*, and performance shoots 
through the floor.


Re: [PATCH]: Proof-of-concept for dynamic format checking

2005-08-28 Thread Ken Raeburn
Maybe I should avoid making suggestions that would make the project  
more complex, especially since I'm not implementing it, but...


If we can describe the argument types expected for a given format
string, then we can produce warnings for values used but not yet set
(%s with an uninitialized automatic char array, but not %n with an
uninitialized int), and let the compiler know what values are set by
the call for use in later warnings.  For additions like bfd's %A and
%B, though, we'd need a way of indicating what fields of the
pointed-to structure are read and/or written, because some of them may
be ignored, or only conditionally used.


Seems to me the best way to describe that is either calling out to  
user-supplied C code, or providing something very much like a C  
function or function fragment to show the compiler how the parameters  
are used -- off the top of my head, say, map 'A' to a static function  
format_asection which takes an asection* argument and reads the name  
field, which function can be analyzed for data usage patterns and  
whether it handles a null pointer, but which probably would be  
discarded by the compiler.  Mapping format specifiers to code  
fragments might also allow the compiler to transform

  bfd_print("%d:%A", num, sec)
to
  printf("%d:%s", num, sec->name)
if it had enough information.  But that requires expressing not just  
the data i/o pattern, but what the formatting actually will be for a  
specifier, which sometimes may be too complex to want to express.
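
As a rough sketch of what such a code fragment might look like (asection
and its name field are real bfd; the function itself is invented here):

  #include <stdio.h>
  #include "bfd.h"   /* for asection */

  /* Candidate handler for the 'A' specifier: it reads only sec->name and
     handles a null pointer -- exactly the facts an analysis pass could
     record about it, and a shape simple enough to fold into a printf
     transformation if the compiler wanted to. */
  static void
  format_asection (FILE *out, const asection *sec)
  {
    fputs (sec != NULL ? sec->name : "(null section)", out);
  }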


Just a thought...

Ken


Re: should sync builtins be full optimization barriers?

2011-09-12 Thread Ken Raeburn
On Sep 12, 2011, at 19:19, Andrew MacLeod wrote:
> lets say the order of the writes turns out to be  2,4...  is it possible for 
> both writes to be travelling around some bus and have thread 4 actually read 
> the second one first, followed by the first one?   It would imply a lack of 
> memory coherency in the system wouldn't it? My simple understanding is that 
> the hardware gives us this sort of minimum guarantee on all shared memory. 
> which means we should never see that happen.

According to section 8.2.3.5 "Intra-Processor Forwarding Is Allowed" of "Intel 
64 and IA-32 Architectures Software Developer's Manual" volume 3A, December 
2009, a processor can see its own store happening before another's, though
the example involves two different memory locations.  If at least one of the
threads reading the values was on the same processor as one of the writing 
threads, perhaps it could see the locally-issued store first, unless 
thread-switching is presumed to include a memory fence.  Consistency of order 
is guaranteed *from the point of view of other processors* (8.2.3.7), which is 
not necessarily the case here.  A total order across all processors is imposed 
for locked instructions (8.2.3.8), but I'm not sure whether their use is 
assumed here.  I'm still reading up on caching protocols, write-back memory, 
etc.  Still not sure either way whether the original example can work...
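
For reference, a rough C rendering of the example in that section (8.2.3.5),
with x and y both initially zero and each function running on its own
processor:

  int x = 0, y = 0;

  void processor_0 (void)
  {
    int r1, r2;
    x = 1;
    r1 = x;   /* may be satisfied from the local store buffer */
    r2 = y;
  }

  void processor_1 (void)
  {
    int r3, r4;
    y = 1;
    r3 = y;
    r4 = x;
  }

  /* The manual permits the outcome r2 == 0 && r4 == 0: each processor
     sees its own store before that store is visible to the other.   */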

Ken


Re: Endianess attribute

2009-07-02 Thread Ken Raeburn

On Jul 2, 2009, at 06:02, Paul Chavent wrote:

Hi.

I have already posted about the endianness attribute
(http://gcc.gnu.org/ml/gcc/2008-11/threads.html#00146).


For some years I have really needed this feature in C projects.

Today I would like to go inside the internals of GCC, and I would
like to implement this feature as an exercise.


You already warned me that it would be a hard task (aliasing,
etc.), but I would like to begin with basic specs.


As another gcc user (and, once upon a time, developer) who's had to  
deal with occasional byte ordering issues (mainly in network  
protocols), I can imagine some uses for something like this.  But...



The spec could be:

- add an attribute (this description could change to be compatible
with existing ones (diabdata, for example))


 __attribute__ ((endian("big")))
 __attribute__ ((endian("lil")))


I would use "little" spelled out, rather than trying to use some cute  
abbreviation.  Whether it should be a string vs a C token like little  
or __little__, I don't know, or particularly care.



- this attribute only applies to ints


It should at least be any integral type -- short to long long or  
whatever TImode is.  (Technically maybe char/QImode could be allowed  
but it wouldn't have any effect on code generation.)  I wouldn't jump  
to the conclusion that it would be useless for pointers or floating  
point values, but I don't know what the use cases for those would be  
like.  However, I think that's a case where you could limit the  
implementation initially, then expand the support later if needed,  
unlike the pointer issue below.



- this attribute only applies to variable declarations

- a pointer to this variable doesn't inherit the attribute (this
behavior could change later, I don't know...)


This seems like a poor idea -- for one thing, my use cases would  
probably involve something like pointers to unaligned big-endian  
integers in allocated buffers, or maybe integer fields in packed  
structures, again via pointers.  (It looks like you may be trying to  
handle the latter but not the former in the code you've got so far.)   
For another, a common refactoring takes a bunch of code accessing some
variable x (and presumably similar blocks of code elsewhere that may use
different variables) and pulls it out into a separate function that takes
the address of the thing to be modified, passed in at the call sites to
the new function.  If direct access to x and access via &x behave
differently under this attribute, that formerly reasonable transformation
suddenly becomes unsafe -- and, perhaps worst of all, the change in
behavior would be silent, since the compiler would have nothing to
complain about.


Also, changing the behavior later means changing the interpretation of  
some code after deploying a compiler using one interpretation.   
Consider this on a 32-bit little-endian machine:


  unsigned int x __attribute__((endian("big")));
  *&x = 0x12345678;

In normal C code without this attribute, reading and writing "*&x" is  
the same as reading and writing x.  In your proposed version, "*&x"  
would use the little-endian interpretation, and "x" would use the
big-endian interpretation, with nothing at the site of the executable code
to indicate that the two should be different.  But an expression like  
this can come up naturally when dealing with macro expansions.  Or,  
someone using this attribute may write code depending on that  
different handling of "*&x" to deal with a selected byte order in some  
cases and native byte order in other cases.  Then if you update the  
compiler so that the attribute is passed along to the pointer type, in  
the next release, suddenly the two cases behave the same -- breaking  
the user's code when it worked under the previous compiler release.   
If you support taking the address of specified-endianness variables at  
all, you need to get the pointer handling right the first time around.


I would suggest that if you implement something like this, the  
attribute should be associated with the data type, not the variable  
decl; so in the declaration above, x wouldn't be treated specially,  
but its type would be "big-endian unsigned int", a distinct type from  
"int" (even on a big-endian machine, probably).


The one advantage I see to associating the attribute with the decl  
rather than the type is that I could write:


  uint32_t thing __attribute__((endian("big")));

rather than needing to figure out what uint32_t is in fundamental C  
types and create a new typedef incorporating the underlying type plus  
the attribute, kind of like how you can't write a declaration using  
"signed size_t".  But that's a long-standing issue in C, and I don't  
think making the language inconsistent so you can fix the problem in  
some cases but not others is a very good idea.
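
To make the contrast concrete (purely hypothetical syntax, since no such
attribute exists today -- assuming the attribute were accepted on a typedef
and attached to the type):

  typedef unsigned int be_uint32 __attribute__ ((endian ("big")));

  be_uint32 x;         /* x has type "big-endian unsigned int" */
  be_uint32 *p = &x;   /* the byte order travels with the type,
                          so *p and x behave identically        */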



- the test case is

 uint32_t x __attribute__ ((endian("big")));
 uint32_t * pt

Re: Endianess attribute

2009-07-02 Thread Ken Raeburn

On Jul 2, 2009, at 16:44, Michael Meissner wrote:
Anyway I had some time during the summit, and I decided to see how
hard it would be to add explicit big/little endian support to the
powerpc port.  It only took a few hours to add the support for
__little and __big qualifier keywords, and in fact more time to get
the byte swap instructions nailed down


That sounds great!


(there are restrictions that named address space variables can only
be global/static or referenced through a pointer).


That sounds like a potential problem, depending on the use cases.  No  
structure field members with explicit byte order?  That could be  
annoying for dealing with network protocols or file formats with  
explicit byte ordering.


On the other hand, if we're talking about address spaces... I would  
guess you could apply it to a structure?  That would be good for  
memory-mapped devices accepting only one byte order that may not be  
that of the main CPU.  For that use case, it would be unfortunate to  
have to tag every integer field.
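
For instance (hypothetical syntax, assuming the __big qualifier could be
applied, through a pointer, to a whole structure of device registers; the
address and register layout are invented):

  struct dev_regs {
    unsigned int control;
    unsigned int status;
  };

  /* registers are always big-endian, whatever the CPU's native order */
  #define DEV ((volatile __big struct dev_regs *) 0xfffe0000)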


I don't think Paul indicated what his use case was...

Ken


Re: [PATCH][4.3] Deprecate -ftrapv

2008-03-05 Thread Ken Raeburn

On Feb 29, 2008, at 19:13, Richard Guenther wrote:

We wrap the libcalls inside libcall notes using REG_EQUAL notes
which indicate the libcalls compute non-trapping +-* (there's no
RTX code for the trappingness), so we combine and simplify the
operations making the libcall possibly dead and remove it again.


My patch from September
(http://gcc.gnu.org/ml/gcc-patches/2007-09/msg01351.html) should help
with the libcall issue a bit, by making the trapping libcalls not be
considered dead, even if optimizations make the results go unused.
(Was I supposed to re-submit the patch in non-unidiff format?  I've
had a couple of machines die on me recently, so I might have to
reconstruct the source tree.)  Of course, if the trapping math is
optimized away before you get to emitting libcalls, that's a
different bug.
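
A hedged illustration of the kind of case that patch targets (my example,
not taken from the patch or any PR): with -ftrapv the addition should go
through __addvsi3 and trap on overflow, but because the result is unused,
optimization can treat the libcall as dead and delete the trap with it.

  void check_sum (int a, int b)
  {
    int c = a + b;   /* -ftrapv: should call __addvsi3, trapping on overflow */
    (void) c;        /* result otherwise unused, so the call looks dead      */
  }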


Re: Thread safety annotations and analysis in GCC

2008-07-22 Thread Ken Raeburn
This looks like interesting work, and I hope something like this gets  
folded into a release soon.  A few things occurred to me when reading  
the writeup at google (sorry, I haven't started looking through the  
code much yet):


All the examples seem to be C++ oriented; is it, in fact, a goal for  
the annotations and analysis to be just as useful in C?


What are the scoping rules used for finding the mutex referenced in  
the GUARDED_BY macro within a C++ class definition?  Are they the same  
as for looking up identifiers in other contexts?  How is the lookup  
done for C structures?
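
A tiny sketch of the C-structure case (GUARDED_BY is stubbed out here; the
real macro presumably expands to an attribute naming the mutex):

  #include <pthread.h>
  #define GUARDED_BY(m)   /* placeholder for the real annotation */

  struct counter {
    pthread_mutex_t mu;
    int value GUARDED_BY(mu);   /* how is `mu' looked up here? */
  };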


Will the compiler get built-in knowledge of the OS library routines  
(e.g., pthread_mutex_lock) on various platforms?


You list separate annotations for "trylock" functions.  It appears  
that the difference is that "trylock" functions can fail.  However,  
pthread_mutex_lock can fail, if the mutex isn't properly initialized,  
if recursive locking of a non-recursive mutex is detected, or other  
reasons; the difference between pthread_mutex_lock and  
pthread_mutex_trylock is whether it will wait or immediately return  
EBUSY for a mutex locked by another thread.  So I think  
pthread_mutex_lock should be described as a "trylock" function too,  
under your semantics.  Conservatively written code will check for  
errors, and will have a path in which the lock is assumed *not* to  
have been acquired; if the analysis assumes pthread_mutex_lock always  
succeeds, that path may be analyzed incorrectly.  (I ran into a tool  
once before that complained about my locking code until I added an  
unlock call to the error handling path.  Since it's actively  
encouraging writing incorrect code, I'm not using it any more.)
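
A small sketch of the conservative pattern described above (not from the
proposal): if the analysis models pthread_mutex_lock as always succeeding,
the early-return path below looks like a leaked lock, when in fact no lock
was ever acquired on that path.

  #include <pthread.h>

  int increment (pthread_mutex_t *m, int *counter)
  {
    int err = pthread_mutex_lock (m);
    if (err != 0)
      return err;                /* lock NOT held here, so no unlock */
    ++*counter;                  /* lock held */
    pthread_mutex_unlock (m);
    return 0;
  }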


Ken



Re: machine learning for loop unrolling

2007-06-17 Thread Ken Raeburn

  - compile with the loop unrolled 1x, 2x, 4x, 8x, 16x, 32x and
measure the time the benchmark takes


The optimal unrolling factor may not be a power of two, depending on  
icache size (11 times the loop body size?), iteration count (13*n for  
some unknown n?), and whether there are actions performed inside the  
loop once or twice every N passes (for N not a power of two).
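
For example (a made-up loop, just to illustrate the last point): the extra
work here happens once every 3 iterations, so unrolling by 3 or 6 lets the
i % 3 test fold away completely, which no power-of-two factor can do.

  double partial_sums (const double *a, double *out, int n)
  {
    double sum = 0.0;
    for (int i = 0; i < n; i++)
      {
        sum += a[i];
        if (i % 3 == 2)          /* action performed once every 3 passes */
          out[i / 3] = sum;
      }
    return sum;
  }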


The powers of two would probably hit a lot of the common cases, but  
you might want to throw in some intermediate values too, if it's too  
costly to check all practical values.


Ken


Re: RFC: Rename Non-Autpoiesis maintainers category

2007-07-27 Thread Ken Raeburn

On Jul 27, 2007, at 07:54, Diego Novillo wrote:

+Note that individuals who maintain parts of the compiler as reviewers
+need approval changes outside of the parts of the compiler they
+maintain and also need approval for their own patches.


s/approval changes/approval for changes/ ?





Should -ftrapv check type conversion?

2007-09-16 Thread Ken Raeburn
I've been looking at -ftrapv and some simple cases it doesn't work  
right in.  (I've got a patch coming soon to address one case where  
__addvsi3 libcalls and the like get optimized away in RTL.)


I see a few reports in Bugzilla, many marked as duplicates of PR 19020
though they cover a few different cases, which have me wondering about
what the scope of -ftrapv ought to be.


(I'm assuming 32-bit int and 16-bit short below, for simplicity.)

1) What about conversions to signed types?

  unsigned int x = 0x80000000;
  int y = x;   /* trap? */

You get a negative number out of this if you define the conversion as  
wrapping twos-complement style, but I believe the spec says it's  
undefined.  It's not so much "overflow" from an arithmetic  
calculation as "out of range", but isn't that what the signed- 
overflow errors come down to, results that are out of range for the  
type used to represent them?


2) Conversions to narrower signed types?

  signed int x = 0xf0000;
  signed short y = x;  /* trap? */

It seems to me that a trap here would be desirable, though again it's  
an "out of range" issue.  However, a logical extension of this would  
be to possibly trap for "*charptr = x & 0xff" or "(char)(x & 0xff)"  
on a signed-char configuration, and that's probably pretty common code.


3) What about narrower-than-int types?

  signed short a = 0x7000, b = 0x7000, c = a+b;

Technically, I believe the addends are widened to signed int before  
doing the addition, so the result of the addition is 0xe000.  If the  
result is assigned to an int variable, there's no undefined  
behavior.  Converting that to signed short would be where the  
overflow question comes up, so this is actually a special case of #2.
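
Spelled out, under the 32-bit int / 16-bit short assumption above:

  signed short a = 0x7000, b = 0x7000;
  int wide = a + b;        /* a and b promote to int; 0xe000 fits, no overflow  */
  signed short c = a + b;  /* only the conversion back to short is out of range */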


4) Is Fortran 90 different?

PR 32153 shows tests in Fortran for 1-, 2-, 4-, and 8-byte types.  I  
know very little about the Fortran 90 spec, but if it doesn't say the  
narrower values get widened as in C, then -ftrapv should definitely  
cause traps for signed short or signed char arithmetic, even if we  
don't do it for the C type conversion cases above.


Ken