Re: Merging calls to `abort'
On Mar 16, 2005, at 11:23, Richard Stallman wrote:
>> But what are you saying to those users who don't like it that GNU
>> programs abort silently when they discover bugs in themselves?
>> Aren't you saying "tough" in a somewhat more polite way?
>
> No, because nobody has complained about it. The idea that Emacs
> should not use plain abort to crash has only been raised here, not
> by Emacs users. The real complaint that I really got was about
> cross-jumping.

As a user (and, when I have time, which hasn't been the case in a
while, an occasional contributor) of both Emacs and GCC, I prefer the
fancy_abort/assert approaches. I haven't complained, because I've just
taken it as a foregone conclusion that that's the way you want Emacs
to work, and I don't feel strongly enough about it to try to convince
you otherwise. That, and the (occasional) problems I run into tend to
be bad addresses causing faults more than explicit calls to abort.

Ken
Re: Heads-up: volatile and C++
On Apr 16, 2005, at 15:45, Nathan Sidwell wrote:
> It's not clear to me which is the best approach. (b) allows threads
> to be supported via copious uses of volatile (but probably
> introduces pessimizations), whereas (a) forces the thread
> interactions to be compiler visible (but shows more promise for
> optimizations).

Is there anything in the language specifications (mainly C++ in this
context, but is this an area where C and C++ are going to diverge, or
is C likely to follow suit?) that prohibits spurious writes to a
location? E.g., translating:

    extern int x, y;
    x = 3;
    foo();  /* may call pthread_* */
    y = 4;
    bar();  /* likewise */

into:

    x  <- 3
    call foo
    r1 <- x
    y  <- 4
    x  <- r1
    call bar
    ...

And does this change if x and y are members of the same struct?

Certainly you can talk about quality of implementation issues, but
would it be non-compliant? It certainly would be unfriendly to
multithreaded applications, if foo() released a lock and allowed it to
be acquired by another thread, possibly running on another processor.

To make a more concrete example, consider two one-byte lvalues, either
distinct variables or parts of a struct, and the early Alpha
processors with no byte operations, where byte changes are done by
loading, modifying, and storing word values.

My suspicion is that if the compiler doesn't need to know about
threads per se, it at least needs to know about certain kinds of
restrictions on behavior that would cause problems with threads.

Ken
Re: Heads-up: volatile and C++
On Apr 18, 2005, at 18:17, Robert Dewar wrote:
>> Is there anything in the language specifications (mainly C++ in
>> this context, but is this an area where C and C++ are going to
>> diverge, or is C likely to follow suit?) that prohibits spurious
>> writes to a location?
>
> Surely the deal is that spurious writes are allowed unless the
> location is volatile. What other interpretation is possible?

That's what I thought. So, unless the compiler (or language spec) is
going to become thread-aware, any data to be shared across threads
needs to be declared volatile, even if some other mechanism (like a
mutex) is in use to do some synchronization. Which means performance
would be poor for any such data.

Which takes me back to: I think the compiler needs to be thread-aware.
"Enhancing" the meaning of volatile, with the attendant performance
issues, still doesn't seem adequate to allow for multithreaded
programming, unless it's used *everywhere*, and performance shoots
through the floor.
Re: [PATCH]: Proof-of-concept for dynamic format checking
Maybe I should avoid making suggestions that would make the project
more complex, especially since I'm not implementing it, but...

If we can describe the argument types expected for a given format
string, then we can produce warnings for values used but not yet set
(%s with an uninitialized automatic char array, but not %n with an
uninitialized int), and let the compiler know what values are set by
the call for use in later warnings.

For additions like bfd's %A and %B, though, we'd need a way of
indicating what fields of the pointed-to structure are read and/or
written, because some of them may be ignored, or only conditionally
used. Seems to me the best way to describe that is either calling out
to user-supplied C code, or providing something very much like a C
function or function fragment to show the compiler how the parameters
are used -- off the top of my head, say, map 'A' to a static function
format_asection which takes an asection* argument and reads the name
field, which function can be analyzed for data usage patterns and
whether it handles a null pointer, but which probably would be
discarded by the compiler.

Mapping format specifiers to code fragments might also allow the
compiler to transform bfd_print("%d:%A", num, sec) to
printf("%d:%s", num, sec->name) if it had enough information. But that
requires expressing not just the data i/o pattern, but what the
formatting actually will be for a specifier, which sometimes may be
too complex to want to express.

Just a thought...

Ken
Re: should sync builtins be full optimization barriers?
On Sep 12, 2011, at 19:19, Andrew MacLeod wrote:
> lets say the order of the writes turns out to be 2,4... is it
> possible for both writes to be travelling around some bus and have
> thread 4 actually read the second one first, followed by the first
> one? It would imply a lack of memory coherency in the system
> wouldn't it? My simple understanding is that the hardware gives us
> this sort of minimum guarantee on all shared memory. which means we
> should never see that happen.

According to section 8.2.3.5 "Intra-Processor Forwarding Is Allowed"
of "Intel 64 and IA-32 Architectures Software Developer's Manual"
volume 3A, December 2009, a processor can see its own store happening
before another's, though the example works on two different memory
locations. If at least one of the threads reading the values was on
the same processor as one of the writing threads, perhaps it could see
the locally-issued store first, unless thread-switching is presumed to
include a memory fence.

Consistency of order is guaranteed *from the point of view of other
processors* (8.2.3.7), which is not necessarily the case here. A total
order across all processors is imposed for locked instructions
(8.2.3.8), but I'm not sure whether their use is assumed here.

I'm still reading up on caching protocols, write-back memory, etc.
Still not sure either way whether the original example can work...

Ken
Re: Endianess attribute
On Jul 2, 2009, at 06:02, Paul Chavent wrote:
> Hi. I already have posted about the endianess attribute
> (http://gcc.gnu.org/ml/gcc/2008-11/threads.html#00146). For some
> year, i really need this feature on c projects. Today i would like
> to go inside the internals of gcc, and i would like to implement
> this feature as an exercise. You already prevent me that it would be
> a hard task (aliasing, etc.), but i would like to begin with basic
> specs.

As another gcc user (and, once upon a time, developer) who's had to
deal with occasional byte ordering issues (mainly in network
protocols), I can imagine some uses for something like this. But...

> The spec could be :
> - add an attribute (this description could change to be compatible
>   with existing ones (diabdata for example))
>     __attribute__ ((endian("big")))
>     __attribute__ ((endian("lil")))

I would use "little" spelled out, rather than trying to use some cute
abbreviation. Whether it should be a string vs a C token like little
or __little__, I don't know, or particularly care.

> - this attribute only apply to ints

It should at least be any integral type -- short to long long or
whatever TImode is. (Technically maybe char/QImode could be allowed
but it wouldn't have any effect on code generation.) I wouldn't jump
to the conclusion that it would be useless for pointers or floating
point values, but I don't know what the use cases for those would be
like. However, I think that's a case where you could limit the
implementation initially, then expand the support later if needed,
unlike the pointer issue below.

> - this attribute only apply to variables declaration
> - a pointer to this variable don't inherit the attribute (this
>   behavior could change later, i don't know...)

This seems like a poor idea -- for one thing, my use cases would
probably involve something like pointers to unaligned big-endian
integers in allocated buffers, or maybe integer fields in packed
structures, again via pointers.
(It looks like you may be trying to handle the latter but not the
former in the code you've got so far.)

For another, one operation that may be used in code refactoring
involves taking a bunch of code accessing some variable x (and
presumably similar blocks of code elsewhere that may use different
variables), and pulling it out into a separate function that takes the
address of the thing to be modified, passed in at the call sites to
the new function; if direct access to x and access via &x behave
differently under this attribute, suddenly this formerly reasonable
transformation is unsafe -- and perhaps worst of all, the behavior
change would be silent, since the compiler would have nothing to
complain about.

Also, changing the behavior later means changing the interpretation of
some code after deploying a compiler using one interpretation.
Consider this on a 32-bit little-endian machine:

    unsigned int x __attribute__((endian("big")));
    *&x = 0x12345678;

In normal C code without this attribute, reading and writing "*&x" is
the same as reading and writing x. In your proposed version, "*&x"
would use the little-endian interpretation, and "x" would use the
big-endian interpretation, with nothing at the site of the executable
code to indicate that the two should be different. But an expression
like this can come up naturally when dealing with macro expansions.

Or, someone using this attribute may write code depending on that
different handling of "*&x" to deal with a selected byte order in some
cases and native byte order in other cases. Then if you update the
compiler so that the attribute is passed along to the pointer type, in
the next release, suddenly the two cases behave the same -- breaking
the user's code when it worked under the previous compiler release.
If you support taking the address of specified-endianness variables at
all, you need to get the pointer handling right the first time around.
I would suggest that if you implement something like this, the
attribute should be associated with the data type, not the variable
decl; so in the declaration above, x wouldn't be treated specially,
but its type would be "big-endian unsigned int", a distinct type from
"int" (even on a big-endian machine, probably).

The one advantage I see to associating the attribute with the decl
rather than the type is that I could write:

    uint32_t thing __attribute__((endian("big")));

rather than needing to figure out what uint32_t is in fundamental C
types and create a new typedef incorporating the underlying type plus
the attribute, kind of like how you can't write a declaration using
"signed size_t". But that's a long-standing issue in C, and I don't
think making the language inconsistent so you can fix the problem in
some cases but not others is a very good idea.

> - the test case is
>     uint32_t x __attribute__ ((endian("big")));
>     uint32_t * pt
Re: Endianess attribute
On Jul 2, 2009, at 16:44, Michael Meissner wrote:
> Anyway I had some time during the summit, and I decided to see how
> hard it would be to add explicit big/little endian support to the
> powerpc port. It only took a few hours to add the support for
> __little and __big qualifier keywords, and in fact more time to get
> the byte swap instructions nailed down

That sounds great!

> (there are restrictions that named address space variables can only
> be global/static or referenced through a pointer).

That sounds like a potential problem, depending on the use cases. No
structure field members with explicit byte order? That could be
annoying for dealing with network protocols or file formats with
explicit byte ordering.

On the other hand, if we're talking about address spaces... I would
guess you could apply it to a structure? That would be good for
memory-mapped devices accepting only one byte order that may not be
that of the main CPU. For that use case, it would be unfortunate to
have to tag every integer field.

I don't think Paul indicated what his use case was...

Ken
Re: [PATCH][4.3] Deprecate -ftrapv
On Feb 29, 2008, at 19:13, Richard Guenther wrote:
> We wrap the libcalls inside libcall notes using REG_EQUAL notes
> which indicate the libcalls compute non-trapping +-* (there's no RTX
> code for the trappingness), so we combine and simplify the
> operations making the libcall possibly dead and remove it again.

My patch from September
(http://gcc.gnu.org/ml/gcc-patches/2007-09/msg01351.html) should help
with the libcall issue a bit, by making the trapping libcalls not be
considered dead, even if optimizations make the results not get used.
(Was I supposed to re-submit the patch in non-unidiff format? I've had
a couple of machines die on me recently; I might have to reconstruct
the source tree.)

Of course, if the trapping math is optimized away before you get to
emitting libcalls, that's a different bug.
Re: Thread safety annotations and analysis in GCC
This looks like interesting work, and I hope something like this gets
folded into a release soon. A few things occurred to me when reading
the writeup at Google (sorry, I haven't started looking through the
code much yet):

All the examples seem to be C++ oriented; is it, in fact, a goal for
the annotations and analysis to be just as useful in C?

What are the scoping rules used for finding the mutex referenced in
the GUARDED_BY macro within a C++ class definition? Are they the same
as for looking up identifiers in other contexts? How is the lookup
done for C structures?

Will the compiler get built-in knowledge of the OS library routines
(e.g., pthread_mutex_lock) on various platforms?

You list separate annotations for "trylock" functions. It appears
that the difference is that "trylock" functions can fail. However,
pthread_mutex_lock can fail, if the mutex isn't properly initialized,
if recursive locking of a non-recursive mutex is detected, or other
reasons; the difference between pthread_mutex_lock and
pthread_mutex_trylock is whether it will wait or immediately return
EBUSY for a mutex locked by another thread. So I think
pthread_mutex_lock should be described as a "trylock" function too,
under your semantics. Conservatively written code will check for
errors, and will have a path in which the lock is assumed *not* to
have been acquired; if the analysis assumes pthread_mutex_lock always
succeeds, that path may be analyzed incorrectly.

(I ran into a tool once before that complained about my locking code
until I added an unlock call to the error handling path. Since it's
actively encouraging writing incorrect code, I'm not using it any
more.)

Ken
Re: machine learning for loop unrolling
> - compile with the loop unrolled 1x, 2x, 4x, 8x, 16x, 32x and
>   measure the time the benchmark takes

The optimal unrolling factor may not be a power of two, depending on
icache size (11 times the loop body size?), iteration count (13*n for
some unknown n?), and whether there are actions performed inside the
loop once or twice every N passes (for N not a power of two). The
powers of two would probably hit a lot of the common cases, but you
might want to throw in some intermediate values too, if it's too
costly to check all practical values.

Ken
Re: RFC: Rename Non-Autpoiesis maintainers category
On Jul 27, 2007, at 07:54, Diego Novillo wrote:
> +Note that individuals who maintain parts of the compiler as reviewers
> +need approval changes outside of the parts of the compiler they
> +maintain and also need approval for their own patches.

s/approval changes/approval for changes/ ?
Should -ftrapv check type conversion?
I've been looking at -ftrapv and some simple cases it doesn't work
right in. (I've got a patch coming soon to address one case where
__addvsi3 libcalls and the like get optimized away in RTL.) I see a
few reports in Bugzilla, many marked as duplicates of PR 19020 though
they cover a few different cases, which have me wondering about what
the scope of -ftrapv ought to be. (I'm assuming 32-bit int and 16-bit
short below, for simplicity.)

1) What about conversions to signed types?

    unsigned int x = 0x80000000;
    int y = x; /* trap? */

You get a negative number out of this if you define the conversion as
wrapping twos-complement style, but I believe the spec says it's
undefined. It's not so much "overflow" from an arithmetic calculation
as "out of range", but isn't that what the signed-overflow errors come
down to, results that are out of range for the type used to represent
them?

2) Conversions to narrower signed types?

    signed int x = 0xffff;
    signed short y = x; /* trap? */

It seems to me that a trap here would be desirable, though again it's
an "out of range" issue. However, a logical extension of this would be
to possibly trap for "*charptr = x & 0xff" or "(char)(x & 0xff)" on a
signed-char configuration, and that's probably pretty common code.

3) What about narrower-than-int types?

    signed short a = 0x7000, b = 0x7000, c = a+b;

Technically, I believe the addends are widened to signed int before
doing the addition, so the result of the addition is 0xe000. If the
result is assigned to an int variable, there's no undefined behavior.
Converting that to signed short would be where the overflow question
comes up, so this is actually a special case of #2.

4) Is Fortran 90 different? PR 32153 shows tests in Fortran for 1-,
2-, 4-, and 8-byte types. I know very little about the Fortran 90
spec, but if it doesn't say the narrower values get widened as in C,
then -ftrapv should definitely cause traps for signed short or signed
char arithmetic, even if we don't do it for the C type conversion
cases above.

Ken