Re: GCC aliasing rules: more aggressive than C99?
On 01/03/2010 10:14 PM, Joshua Haberman wrote: > Andrew Haley redhat.com> writes: >> On 01/03/2010 10:53 AM, Richard Guenther wrote: >>> GCC follows its own documentation here, not some random >>> websites and maybe not the strict reading of the standard. >> >> GCC is compatible with C99 in this regard. > > I do not believe this is true. Your argument that GCC complies with C99 > (which you moved to gcc-help@) It's not appropriate here. However, since we've started... > is based on the argument that these are > not compatible types: > > union u { int x; } > int x; > > However, I did not claim that they are compatible types, nor does my > argument rely on them being compatible types. Rather, my argument is > based on section 6.5, paragraph 7 of C99, which I quoted, which > specifies the circumstances under which an object may or may not be > aliased. The case of compatible types is one case, but not the only > case, in which values may be aliased according to the standard. "An object shall have its stored value accessed only by an lvalue expression that has one of the following types: ... an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or ..." doesn't mean that you can get such an aggregate or union lvalue by union u { int x; } *pu = (union u*)&i; because the rules about pointer conversions only allow the result of (union u*)&i to be converted back to an (int*). They do not allow you to dereference that pointer as a (union u*): "6.3.2.3 "A pointer to an object or incomplete type may be converted to a pointer to a different object or incomplete type. If the resulting pointer is not correctly aligned for the pointed-to type, the behavior is undefined. Otherwise, when converted back again, the result shall compare equal to the original pointer." This is *all* you are allowed to do with the converted pointer. You may not dereference it. 
This is the core rule that governs C's aliasing. Andrew.
Re: GCC aliasing rules: more aggressive than C99?
On 01/03/2010 11:25 PM, Richard Guenther wrote: > char charray[sizeof(long)] = {...}; > long l = *(long*)charray; // ok Not correct ;) (the lvalue has to be of character type, yours is of type 'long'; the type of the actual object does not matter). What would be correct instead is memcpy ((char *) &l, charray, sizeof(long)); Paolo
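A minimal sketch of the memcpy approach suggested above. The helper name `long_from_bytes` is hypothetical and does not come from the thread; the point is only that the buffer is copied into a `long` object instead of being read through an lvalue of type `long`.

```c
#include <string.h>

/* Hypothetical helper: build a long from a character buffer without
   reading the buffer through an lvalue of type long. */
long long_from_bytes(const char bytes[sizeof(long)])
{
    long l;
    memcpy(&l, bytes, sizeof l);  /* copies the object representation */
    return l;
}
```

Compilers commonly turn a fixed-size memcpy like this into a plain load, so the copy usually costs nothing extra, though that is an optimization and not a guarantee.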
PowerPC : GCC2 optimises better than GCC4???
This sounds like a dumb question I know. However the following code snippet results in many more machine instructions under 4.4.2 than under 2.9.5 (I am running a cygwin->PowerPC cross): typedef unsigned int U32; typedef union { U32 R; struct { U32 BF1:2; U32 :8; U32 BF2:2; U32 BF3:2; U32 :18; } B; } TEST_t; U32 testFunc(void) { TEST_t t; t.R=0; t.B.BF1=2; t.B.BF2=3; t.B.BF3=1; return t.R; } Output under 4.4.2 (powerpc-eabi-gcc-4-4-2 -O3 -S gcc-test.cpp -o gcc-test-442.s): li 0,2 li 3,0 rlwimi 3,0,30,0,1 li 0,3 rlwimi 3,0,20,10,11 li 0,1 rlwimi 3,0,18,12,13 blr Output under 2.9.5 (powerpc-eabi-gcc-2-9-5 -O3 -S gcc-test.cpp -o gcc-test-295.s): lis 3,0x8034 blr Is there any way to improve this behaviour? I have been using 2.9.5 very successfully for years and am now looking at 4.4.2, but have many such examples in my code (for clarity of commenting and maintainability). I have also noticed that 4.4.2 seems to use significantly larger stack frames, and consequently more register-stacking instructions than 2.9.5 for the same functions. Am I missing something? Many thanks if you can shed any light on this. Mark * This email has been checked by the altohiway Mailcontroller Service *
Re: PowerPC : GCC2 optimises better than GCC4???
On 01/04/2010 10:51 AM, Mark Colby wrote: > This sounds like a dumb question I know. However the following code > snippet results in many more machine instructions under 4.4.2 than under > 2.9.5 (I am running a cygwin->PowerPC cross): > > typedef unsigned int U32; > typedef union > { > U32 R; > struct > { > U32 BF1:2; > U32 :8; > U32 BF2:2; > U32 BF3:2; > U32 :18; > } B; > } TEST_t; > U32 testFunc(void) > { > TEST_t t; > t.R=0; > t.B.BF1=2; > t.B.BF2=3; > t.B.BF3=1; > return t.R; > } > > Output under 4.4.2 (powerpc-eabi-gcc-4-4-2 -O3 -S gcc-test.cpp -o > gcc-test-442.s): > > li 0,2 > li 3,0 > rlwimi 3,0,30,0,1 > li 0,3 > rlwimi 3,0,20,10,11 > li 0,1 > rlwimi 3,0,18,12,13 > blr > > Output under 2.9.5 (powerpc-eabi-gcc-2-9-5 -O3 -S gcc-test.cpp -o > gcc-test-295.s): > > lis 3,0x8034 > blr > > Is there any way to improve this behaviour? I have been using 2.9.5 very > successfully for years and am now looking at 4.4.2, but have many such > examples in my code (for clarity of commenting and maintainability). This is very strange. On x86_64, gcc 4.4.1 generates movl $7170, %eax ret This optimization is done by the first RTL cse pass. I can't understand why it's not being done for your target. I guess this will need a powerpc expert. Andrew.
Re: PowerPC : GCC2 optimises better than GCC4???
On Mon, Jan 4, 2010 at 12:02 PM, Andrew Haley wrote: > On 01/04/2010 10:51 AM, Mark Colby wrote: >> This sounds like a dumb question I know. However the following code >> snippet results in many more machine instructions under 4.4.2 than under >> 2.9.5 (I am running a cygwin->PowerPC cross): >> >> typedef unsigned int U32; >> typedef union >> { >> U32 R; >> struct >> { >> U32 BF1:2; >> U32 :8; >> U32 BF2:2; >> U32 BF3:2; >> U32 :18; >> } B; >> } TEST_t; >> U32 testFunc(void) >> { >> TEST_t t; >> t.R=0; >> t.B.BF1=2; >> t.B.BF2=3; >> t.B.BF3=1; >> return t.R; >> } >> >> Output under 4.4.2 (powerpc-eabi-gcc-4-4-2 -O3 -S gcc-test.cpp -o >> gcc-test-442.s): >> >> li 0,2 >> li 3,0 >> rlwimi 3,0,30,0,1 >> li 0,3 >> rlwimi 3,0,20,10,11 >> li 0,1 >> rlwimi 3,0,18,12,13 >> blr >> >> Output under 2.9.5 (powerpc-eabi-gcc-2-9-5 -O3 -S gcc-test.cpp -o >> gcc-test-295.s): >> >> lis 3,0x8034 >> blr >> >> Is there any way to improve this behaviour? I have been using 2.9.5 very >> successfully for years and am now looking at 4.4.2, but have many such >> examples in my code (for clarity of commenting and maintainability). > > This is very strange. On x86_64, gcc 4.4.1 generates > > movl $7170, %eax > ret > > This optimization is done by the first RTL cse pass. I can't understand > why it's not being done for your target. I guess this will need a > powerpc expert. Known bug, see http://gcc.gnu.org/PR22141 I hope Jakub will finish this work for gcc 4.5. Ciao! Steven
RE: PowerPC : GCC2 optimises better than GCC4???
> >> Is there any way to improve this behaviour? I have been using 2.9.5 > very > >> successfully for years and am now looking at 4.4.2, but have many > such > >> examples in my code (for clarity of commenting and maintainability). > > > > This is very strange. On x86_64, gcc 4.4.1 generates > > > > movl $7170, %eax > > ret > > > > This optimization is done by the first RTL cse pass. I can't > understand > > why it's not being done for your target. I guess this will need a > > powerpc expert. Thanks Andrew for checking this on your system. > Known bug, see http://gcc.gnu.org/PR22141 > > I hope Jakub will finish this work for gcc 4.5. > > Ciao! > Steven Thanks Steven. At least I have a handle on it now. Fingers crossed for 4.5 :-)
Re: PowerPC : GCC2 optimises better than GCC4???
On Mon, Jan 04, 2010 at 12:18:50PM +0100, Steven Bosscher wrote: > > This optimization is done by the first RTL cse pass. I can't understand > > why it's not being done for your target. I guess this will need a > > powerpc expert. > > Known bug, see http://gcc.gnu.org/PR22141 That's unrelated. PR22141 is about (lack of) merging of adjacent stores of constant values into memory, but there are no memory stores involved here, everything is in registers, so PR22141 patch will make zero difference here. IMHO we really should have some late tree pass that converts adjacent bitfield operations into integral operations on non-bitfields (likely with alias set of the whole containing aggregate), as at the RTL level many cases are simply too many instructions for combine etc. to optimize them properly, while at the tree level it could be simpler. Regarding PR22141, the patch works for the memory store merging, but has performance regressions (mainly on PowerPC). I guess I could enable it at least for -Os and in that case check the sizes of all insns that are going to be DCEd because of it against the size of the new sequence. For -O2 perhaps I could limit it to only aligned stores with rtx_cost of the constant being 0 or something similar. Jakub
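To illustrate what "integral operations on non-bitfields" could look like for Mark's test case, here is a hand-written sketch. The function names are hypothetical, and the two bit layouts are assumptions inferred from the constants in the thread (0x8034 in the upper half on PowerPC, 7170 on x86_64). This is not the output of any GCC pass.

```c
#include <stdint.h>

/* Layout that allocates bitfields starting at the most significant bit,
   as the PowerPC output in the thread suggests. */
uint32_t test_msb_first(void)
{
    uint32_t r = 0;
    r |= (uint32_t)2 << 30;  /* BF1: top 2 bits */
    r |= (uint32_t)3 << 20;  /* BF2: after 8 padding bits */
    r |= (uint32_t)1 << 18;  /* BF3: next 2 bits */
    return r;                /* 0x80340000 */
}

/* Layout that allocates bitfields starting at the least significant bit,
   as the x86_64 output in the thread suggests. */
uint32_t test_lsb_first(void)
{
    uint32_t r = 0;
    r |= (uint32_t)2 << 0;   /* BF1: low 2 bits */
    r |= (uint32_t)3 << 10;  /* BF2: after 8 padding bits */
    r |= (uint32_t)1 << 12;  /* BF3: next 2 bits */
    return r;                /* 7170 */
}
```

Once the field stores are written as ORs of shifted constants, ordinary constant folding collapses each function to a single constant, which matches the `lis 3,0x8034` and `movl $7170, %eax` results quoted earlier.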
RE: PowerPC : GCC2 optimises better than GCC4???
> On Mon, Jan 04, 2010 at 12:18:50PM +0100, Steven Bosscher wrote: > > > This optimization is done by the first RTL cse pass. I can't > understand > > > why it's not being done for your target. I guess this will need a > > > powerpc expert. > > > > Known bug, see http://gcc.gnu.org/PR22141 > > That's unrelated. PR22141 is about (lack of) merging of adjacent stores > of > constant values into memory, but there are no memory stores involved > here, > everything is in registers, so PR22141 patch will make zero difference > here. > > IMHO we really should have some late tree pass that converts adjacent > bitfield operations into integral operations on non-bitfields (likely > with > alias set of the whole containing aggregate), as at the RTL level many > cases > are simply too many instructions for combine etc. to optimize them > properly, > while at the tree level it could be simpler. Ah. I take it that v2's optimisation was structured differently, as it does spot and take care of this case?
where can find source snapshots of first GCC 4.5.0 ?
Hi, I am looking into this regression: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41311 The problem occurs on m68k-elf too, but not with any GCC older than 4.5.0. I want to check whether the first GCC 4.5.0 snapshot has this problem or not. The first GCC 4.5.0 I compiled was from August, and it has the bug. On the mirror sites, however, the earliest snapshots I can find now are from October. Could somebody post a link to older GCC 4.5.0 snapshots? Bye
Re: where can find source snapshots of first GCC 4.5.0 ?
On Mon, Jan 4, 2010 at 8:04 PM, Bernd Roesch wrote: > Hi, > > Because of this regression, > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41311 > > Problem is in m68k-elf too, but happen not with any older GCC as 4.5.0 > > i want try out if the first GCC 4.5.0 snapshot > have this Problem or not. > > The first GCC 4.5.0 i compile was in month 08.this have the Bug. > But i find on the mirror sites > only first snapshots now that are from month 10. > > So maybe somebody can post me a link to older versions of GCC 4.5.0 > I would recommend using GCC git mirror and bisect to locate the source of regression. It's very fast to switch between different revisions. Jie
RE: Possible IRA improvements for irregular register architectures
Happy New Year! I was hoping for some kind of response to this, but maybe I didn't give enough info? I'd appreciate some pointers on what I could do to prompt some discussion because I have some promising new ideas that lead on from what I've described below. Cheers, Ian > -----Original Message----- > From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf Of > Ian Bolton > Sent: 18 December 2009 15:34 > To: gcc@gcc.gnu.org > Subject: Possible IRA improvements for irregular register architectures > > Let's assume I have two sub-classes of ALL_REGS: BOTTOM_REGS (c0-c15) > and TOP_CREGS (c16-c31). > > Let's also assume I have two main types of instruction: A-type > Instructions, which can use ALL 32 registers, and B-type Instructions, > which can only use the 16 BOTTOM_REGS. > > IRA will correctly calculate costs (in ira-costs.c) for allocnos > appearing in B-type instructions, such that TOP_CREGS has a higher > cost than BOTTOM_REGS. It will also calculate costs for the A-type > instructions such that TOP_CREGS and BOTTOM_REGS have the same cost. > > The order of coloring will be determined by the algorithm chosen: > Priority or Chaitin-Briggs. As each allocno is colored, the costs > will be inspected and the best available hard register will be chosen, > mainly based on the register class costs mentioned above, so allocnos > in B-type Instructions will usually be assigned a BOTTOM_REG if one is > free. If two or more hard registers share the same cost then > whichever one appears first in the REG_ALLOC_ORDER will be assigned. > (If no hard register can be found, the allocno is assigned memory and > will require a "reload" in a later pass to get a hard register.) > > I do not wish to alter the coloring algorithms or the coloring order. > I believe they are good at determining the order to color allocnos, > which dictates the likelihood of being assigned a hard register. 
What > I wish to change is the hard register that is assigned, given that the > coloring order has determined that this allocno should get one next. > > Why do I need to do this? Since the BOTTOM_REGS can be used by all > instructions, it makes sense to put them first in the REG_ALLOC_ORDER, > so we minimise the number of registers consumed by a low-pressure > function. But it also makes sense, in high-pressure functions, to > steer A-type Instructions away from using BOTTOM_REGS so that they are > free for B-type Instructions to use. > > To achieve this, I tried altering the costs calculated in ira-costs.c, > either explicitly with various hacks or by altering operand > constraints. The problem with this approach was that it is static and > independent, occurring before any coloring order has been determined > and without any awareness of the needs of other allocnos. I believe I > require a dynamic way to alter the costs, based on which allocnos > conflict with the allocno currently being colored and which hard > registers are still available at this point. > > The patch I have attached here is my first reasonably successful > attempt at this dynamic approach, which has led to performance > improvements on some of our benchmarks and no significant > regressions. > > I am hoping it will be useful to others, but I post it more as a > talking point or perhaps to inspire others to come up with better > solutions and share them with me :-)
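The selection rule described in the quoted message (lowest cost wins, ties go to the register that comes first in the allocation order, memory if nothing is free) can be sketched as a toy model. This is not IRA code: `pick_hard_reg` and its parameters are hypothetical names, and real IRA costs are computed per allocno and per register class.

```c
#include <limits.h>
#include <stdbool.h>

/* Toy model, not IRA code: return the available hard register with the
   lowest cost; on a tie, the register seen first in alloc_order wins.
   Returns -1 if no register is available (the allocno would be
   assigned memory). */
int pick_hard_reg(const int *cost, const bool *available,
                  const int *alloc_order, int n)
{
    int best = -1;
    int best_cost = INT_MAX;
    for (int i = 0; i < n; i++) {
        int reg = alloc_order[i];
        /* Strict '<' keeps the earlier register when costs are equal. */
        if (available[reg] && (best < 0 || cost[reg] < best_cost)) {
            best = reg;
            best_cost = cost[reg];
        }
    }
    return best;
}
```

In this model, the dynamic approach the message argues for would amount to adjusting `cost[]` just before each call, based on what the conflicting allocnos still need, instead of fixing the costs once up front.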
Call for participation: GROW'10 - 2nd Workshop on GCC Research Opportunities
Apologies if you receive multiple copies of this call. CALL FOR PARTICIPATION 2nd Workshop on GCC Research Opportunities (GROW'10) http://ctuning.org/workshop-grow10 January 23, 2010, Pisa, Italy (co-located with HiPEAC 2010 Conference) EARLY REGISTRATION DEADLINE: JAN. 6th, 2010 We invite you to participate in GROW 2010, the Workshop on GCC Research opportunities, to be held in Pisa, Italy in January 23, 2010, along with the conference on High-Performance Embedded Architectures and Compilers (HiPEAC). The Workshop Program includes: * Presentations of 8 selected papers * A Keynote talk by Diego Novillo, Google, Canada, on: "Using GCC as a toolbox for research: GCC plugins and whole-program compilation" * A panel on plugins and the future of GCC The Workshop Program is now available: http://cTuning.org/wiki/index.php/Dissemination:Workshops:GROW10:Program GROW workshop focuses on current challenges in research and development of compiler analyses and optimizations based on the free GNU Compiler Collection (GCC). The goal of this workshop is to bring together people from industry and academia that are interested in conducting research based on GCC and enhancing this compiler suite for research needs. The workshop will promote and disseminate compiler research (recent, ongoing or planned) with GCC, as a robust industrial-strength vehicle that supports free and collaborative research. The program will include an invited talk and a discussion panel on future research and development directions of GCC. 
Topics of interest Any issue related to innovative program analysis, optimizations and run-time adaptation with GCC including but not limited to: * Classical compiler analyses, transformations and optimizations * Power-aware analyses and optimizations * Language/Compiler/HW cooperation * Optimizing compilation tools for heterogeneous/reconfigurable/ multicore systems * Tools to improve compiler configurability and retargetability * Profiling, program instrumentation and dynamic analysis * Iterative and collective feedback-directed optimization * Case studies and performance evaluations * Techniques and tools to improve usability and quality of GCC * Plugins to enhance research capabilities of GCC Organizers Dorit Nuzman, IBM, Israel Grigori Fursin, INRIA, France Program Committee Arutyun I. Avetisyan, ISP RAS, Russia Zbigniew Chamski, Infrasoft IT Solutions, Poland Albert Cohen, INRIA, France David Edelsohn, IBM, USA Bjorn Franke, University of Edinburgh, UK Grigori Fursin, INRIA, France Benedict Gaster, AMD, USA Jan Hubicka, SUSE Paul H.J. Kelly, Imperial College of London, UK Ondrej Lhotak, University of Waterloo, Canada Hans-Peter Nilsson, Axis Communications, Sweden Diego Novillo, Google, Canada Dorit Nuzman, IBM, Israel Sebastian Pop, AMD, USA Ian Lance Taylor, Google, USA Chengyong Wu, ICT, China Kenneth Zadeck, NaturalBridge, USA Ayal Zaks, IBM, Israel Previous Workshops GROW'09: http://www.doc.ic.ac.uk/~phjk/GROW09 GREPS'07: http://sysrun.haifa.il.ibm.com/hrl/greps2007
Re: PowerPC : GCC2 optimises better than GCC4???
On 01/04/2010 12:07 PM, Jakub Jelinek wrote: > On Mon, Jan 04, 2010 at 12:18:50PM +0100, Steven Bosscher wrote: >>On Mon, Jan 4, 2010 at 12:02 PM, Andrew Haley wrote: >>> This optimization is done by the first RTL cse pass. I can't understand >>> why it's not being done for your target. I guess this will need a >>> powerpc expert. >> >> Known bug, see http://gcc.gnu.org/PR22141 > > That's unrelated. PR22141 is about (lack of) merging of adjacent stores of > constant values into memory, but there are no memory stores involved here, > everything is in registers, so PR22141 patch will make zero difference here. > > IMHO we really should have some late tree pass that converts adjacent > bitfield operations into integral operations on non-bitfields (likely with > alias set of the whole containing aggregate), as at the RTL level many cases > are simply too many instructions for combine etc. to optimize them properly, > while at the tree level it could be simpler. Yabbut, how come RTL cse can handle it in x86_64, but PPC not? Andrew.
RE: PowerPC : GCC2 optimises better than GCC4???
I can confirm that our target also generates GOOD code for this case. Maybe this is an EABI or target-specific thing, where struct/union is forced to memory. Bingfeng Broadcom UK > -----Original Message----- > From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On > Behalf Of Andrew Haley > Sent: 04 January 2010 16:08 > To: gcc@gcc.gnu.org > Subject: Re: PowerPC : GCC2 optimises better than GCC4??? > > On 01/04/2010 12:07 PM, Jakub Jelinek wrote: > > On Mon, Jan 04, 2010 at 12:18:50PM +0100, Steven Bosscher wrote: > >>On Mon, Jan 4, 2010 at 12:02 PM, Andrew Haley > wrote: > >>> This optimization is done by the first RTL cse pass. I > can't understand > >>> why it's not being done for your target. I guess this will need a > >>> powerpc expert. > >> > >> Known bug, see http://gcc.gnu.org/PR22141 > > > > That's unrelated. PR22141 is about (lack of) merging of > adjacent stores of > > constant values into memory, but there are no memory stores > involved here, > > everything is in registers, so PR22141 patch will make zero > difference here. > > > > IMHO we really should have some late tree pass that > converts adjacent > > bitfield operations into integral operations on > non-bitfields (likely with > > alias set of the whole containing aggregate), as at the RTL > level many cases > > are simply too many instructions for combine etc. to > optimize them properly, > > while at the tree level it could be simpler. > > Yabbut, how come RTL cse can handle it in x86_64, but PPC not? > > Andrew. > >
Re: PowerPC : GCC2 optimises better than GCC4???
On Mon, Jan 04, 2010 at 04:08:17PM +, Andrew Haley wrote: > On 01/04/2010 12:07 PM, Jakub Jelinek wrote: > > IMHO we really should have some late tree pass that converts adjacent > > bitfield operations into integral operations on non-bitfields (likely with > > alias set of the whole containing aggregate), as at the RTL level many cases > > are simply too many instructions for combine etc. to optimize them properly, > > while at the tree level it could be simpler. > > Yabbut, how come RTL cse can handle it in x86_64, but PPC not? Probably because the RTL on x86_64 uses and's and ior's, but PPC uses set's of zero_extract's (insvsi). -Nathan
Re: PowerPC : GCC2 optimises better than GCC4???
On 01/04/2010 04:17 PM, Nathan Froyd wrote: > On Mon, Jan 04, 2010 at 04:08:17PM +, Andrew Haley wrote: >> On 01/04/2010 12:07 PM, Jakub Jelinek wrote: >>> IMHO we really should have some late tree pass that converts adjacent >>> bitfield operations into integral operations on non-bitfields (likely with >>> alias set of the whole containing aggregate), as at the RTL level many cases >>> are simply too many instructions for combine etc. to optimize them properly, >>> while at the tree level it could be simpler. >> >> Yabbut, how come RTL cse can handle it in x86_64, but PPC not? > > Probably because the RTL on x86_64 uses and's and ior's, but PPC uses > set's of zero_extract's (insvsi). Aha! Yes, that'll probably be it. It should be easy to fix cse to recognize those too. Andrew.
entry point of gimplification
Hi, I have been trying to understand the gcc source code and am interested in customizing gcc to support speculative parallelization of conditional branching and loop instructions. I am considering GIMPLE as input. I want to know: what is the entry point to the gimplification pass? And, given a function body, which functions in the gcc source convert the body into equivalent GIMPLE statements? Also, is there a way in which I can selectively step through the execution of the source related to this? Any other help on the main aim of speculative parallelization would also be most welcome. Apologies if the question sounds vague. -- cheers sandy
Re: GCC aliasing rules: more aggressive than C99?
Andrew Haley redhat.com> writes: > On 01/03/2010 10:14 PM, Joshua Haberman wrote: > > Andrew Haley redhat.com> writes: > "6.3.2.3 > > "A pointer to an object or incomplete type may be converted to a > pointer to a different object or incomplete type. If the resulting > pointer is not correctly aligned for the pointed-to type, the > behavior is undefined. Otherwise, when converted back again, the > result shall compare equal to the original pointer." > > This is *all* you are allowed to do with the converted pointer. You > may not dereference it. The text you quoted does not contain any "shall not" language about dereferencing, so this conclusion does not follow. > [Section 6.3.2.3] is the core rule that governs C's aliasing. Section 6.5 paragraph 7 contains this footnote: The intent of this list is to specify those circumstances in which an object may or may not be aliased. I am not sure why you discard the significance of this section. Also under your interpretation memcpy(&some_int, ..., ...) is illegal, because memcpy() will write to the int's storage with a pointer type other than int. Josh
df_changeable_flags use in combine.c
Hi, I'm fixing some compiler errors when configuring with --enable-build-with-cxx, and ran into a curious line of code that may indicate a bug: static unsigned int rest_of_handle_combine (void) { int rebuild_jump_labels_after_combine; df_set_flags (DF_LR_RUN_DCE + DF_DEFER_INSN_RESCAN); // ... } The DF_* values are from the df_changeable_flags enum, whose values are typically used in logical and/or operations for masking purposes. As such, I'm guessing the author may have meant to do: df_set_flags (DF_LR_RUN_DCE & DF_DEFER_INSN_RESCAN); I could have just added the explicit cast necessary to silence the gcc-as-cxx warning I was running into, but I wanted to be a good citizen :) Any pointers are appreciated, Thanks! -- tangled strands of DNA explain the way that I behave. http://www.clock.org/~matt
[gcc-as-cxx] enum conversion to int
Hi, I'm trying to fix some errors/warnings to make sure that gcc-as-cxx doesn't bitrot too much. I ran into this issue, and am unsure how to fix it without really ugly casting: enum df_changeable_flags df_set_flags (enum df_changeable_flags changeable_flags) { enum df_changeable_flags old_flags = df->changeable_flags; df->changeable_flags |= changeable_flags; return old_flags; } I'm getting this warning on the second line of the function: ./../gcc-trunk/gcc/df-core.c: In function df_changeable_flags df_set_flags(df_changeable_flags): ../../gcc-trunk/gcc/df-core.c:474: error: invalid conversion from int to df_changeable_flags At first blush, it seems like df_changeable_flags should be a typedef to byte (or int, which is what it was being implicitly converted to everywhere), and the enum should be disbanded into individual #defines. I wanted to make sure that this wasn't a warning false positive first, though. -- tangled strands of DNA explain the way that I behave. http://www.clock.org/~matt
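One way to confine the cast to a single place, sketched with hypothetical names (`demo_flags`, `demo_flags_or`); none of this is GCC code. The OR is computed on ints and converted back explicitly, which C++ requires and C accepts, so the same source builds with either compiler.

```c
/* Hypothetical flag enum; names and values are illustrative only. */
enum demo_flags {
    DEMO_FLAG_A = 1 << 0,
    DEMO_FLAG_B = 1 << 1
};

/* OR two flag values. The operands promote to int, so the result is
   cast back to the enum type explicitly. */
enum demo_flags demo_flags_or(enum demo_flags a, enum demo_flags b)
{
    return (enum demo_flags)((int)a | (int)b);
}
```

With such a helper, a statement like `flags |= extra;` would become `flags = demo_flags_or(flags, extra);`, which keeps the call sites free of casts.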
Re: GCC aliasing rules: more aggressive than C99?
On Mon, Jan 04, 2010 at 08:17:00PM +, Joshua Haberman wrote: > Andrew Haley redhat.com> writes: > > On 01/03/2010 10:14 PM, Joshua Haberman wrote: > > > Andrew Haley redhat.com> writes: > > "6.3.2.3 > > > > "A pointer to an object or incomplete type may be converted to a > > pointer to a different object or incomplete type. If the resulting > > pointer is not correctly aligned for the pointed-to type, the > > behavior is undefined. Otherwise, when converted back again, the > > result shall compare equal to the original pointer." > > > > This is *all* you are allowed to do with the converted pointer. You > > may not dereference it. > > The text you quoted does not contain any "shall not" language about > dereferencing, so this conclusion does not follow. It doesn't have to use any "shall not" language. If the standard does not say that any particular action is allowed or otherwise defines what it does, then that action implicitly has undefined behaviour. > > > [Section 6.3.2.3] is the core rule that governs C's aliasing. > > Section 6.5 paragraph 7 contains this footnote: > > The intent of this list is to specify those circumstances in which an > object may or may not be aliased. > > I am not sure why you discard the significance of this section. Also > under your interpretation memcpy(&some_int, ..., ...) is illegal, > because memcpy() will write to the int's storage with a pointer type > other than int. Your conclusion does not follow since the standard does not say what (if any) pointer type memcpy() will use internally. It is not even necessary that memcpy() is implemented in C. -- Erik Trulsson ertr1...@student.uu.se
Re: GCC aliasing rules: more aggressive than C99?
Erik Trulsson student.uu.se> writes: > On Mon, Jan 04, 2010 at 08:17:00PM +, Joshua Haberman wrote: > > The text you quoted does not contain any "shall not" language about > > dereferencing, so this conclusion does not follow. > > It doesn't have to use any "shall not" language. If the standard does not > say that any particular action is allowed or otherwise defines what it > does, then that action implicitly has undefined behaviour. Section 6.5 does define circumstances under which converted pointers may be dereferenced. Section 6.3.2.3 does not include any language prohibiting it, so it does not support the assertion it was quoted to support, and it is irrelevant in the context of this discussion. > > > [Section 6.3.2.3] is the core rule that governs C's aliasing. > > > > Section 6.5 paragraph 7 contains this footnote: > > > > The intent of this list is to specify those circumstances in which an > > object may or may not be aliased. > > > > I am not sure why you discard the significance of this section. Also > > under your interpretation memcpy(&some_int, ..., ...) is illegal, > > because memcpy() will write to the int's storage with a pointer type > > other than int. > > Your conclusion does not follow since the standard does not say what (if > any) pointer type memcpy() will use internally. It is not even necessary > that memcpy() is implemented in C. It says that it will copy characters. More importantly, you are still ignoring section 6.5 without saying why. Josh
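For readers following the memcpy() point: character types are among the lvalue types listed in 6.5 paragraph 7, so a byte-wise copy written in C can touch an int's storage without going through an int lvalue. The sketch below uses a hypothetical name, `copy_bytes`, and is only an illustration; as noted above, an actual memcpy() need not be written in C at all.

```c
#include <stddef.h>

/* Illustrative byte-wise copy, not how any particular memcpy is
   implemented: every access goes through an lvalue of character type. */
void *copy_bytes(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n--)
        *d++ = *s++;
    return dst;
}
```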
Re: df_changeable_flags use in combine.c
On 01/05/2010 07:12 AM, Matt wrote: Hi, I'm fixing some compiler errors when configuring with --enable-build-with-cxx, and ran into a curious line of code that may indicate a bug: static unsigned int rest_of_handle_combine (void) { int rebuild_jump_labels_after_combine; df_set_flags (DF_LR_RUN_DCE + DF_DEFER_INSN_RESCAN); // ... } The DF_* values are from the df_changeable_flags enum, whose values are typically used in logical and/or operations for masking purposes. As such, I'm guessing the author may have meant to do: df_set_flags (DF_LR_RUN_DCE & DF_DEFER_INSN_RESCAN); I think you meant "|". I think "+" is the same as "|" here. Also, I didn't see this error when configuring the current trunk head with --enable-build-with-cxx, but I do see other errors. Jie
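A small sketch of why "+" and "|" agree here while "&" would not. The flag names and values are hypothetical and are not the real df_changeable_flags definitions; the only assumption is that each flag occupies its own single bit.

```c
/* Hypothetical single-bit flags; not the real df_changeable_flags values. */
enum { FLAG_X = 1 << 0, FLAG_Y = 1 << 5 };

/* For flags that share no bits, addition and bitwise OR agree... */
int combine_with_plus(void) { return FLAG_X + FLAG_Y; }
int combine_with_or(void)   { return FLAG_X | FLAG_Y; }

/* ...while bitwise AND of two distinct single-bit flags is zero,
   so passing it to a "set flags" function would set nothing. */
int combine_with_and(void)  { return FLAG_X & FLAG_Y; }
```

The two spellings only diverge when the operands overlap: `FLAG_X + FLAG_X` carries into the next bit, whereas `FLAG_X | FLAG_X` is still `FLAG_X`, which is why "|" is the safer habit for flag sets.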
dwarf2 - multiple DW_TAG_variable for global variable
I installed the gcc-4.5-20091224 snapshot and noticed that for a simple variable declaration I get two DW_TAG_variable DIEs in the object file. For example, the following code int x; main() {x=1;} generates (with -g -gdwarf2 -O0 switches): <1><54>: Abbrev Number: 4 (DW_TAG_variable) <55> DW_AT_name: (indirect string, offset: 0x36): x <59> DW_AT_decl_file : 1 <5a> DW_AT_decl_line : 1 <5b> DW_AT_type: <0x4d> <5f> DW_AT_external: 1 <60> DW_AT_declaration : 1 <1><61>: Abbrev Number: 5 (DW_TAG_variable) <62> DW_AT_name: (indirect string, offset: 0x36): x <66> DW_AT_decl_file : 1 <67> DW_AT_decl_line : 1 <68> DW_AT_type: <0x4d> <6c> DW_AT_external: 1 <6d> DW_AT_location: 9 byte block: 3 0 0 0 0 0 0 0 0 (DW_OP_addr: 0) Is the above normal? The 4.3.2 compiler generates only one DIE, the second one with the DW_AT_location attribute, which is correct. I also noticed that this example (where the variable is not used): int x; main() {} generates only one DW_TAG_variable, the one with DW_AT_location, which again should be correct. I ran into this problem while porting some GDB code that uses DWARF2 and was surprised to see this change from the previous version of gcc (4.3). Thanks, Nenad
Why Thumb-2 only allows very limited access to the PC?
Hi In function arm_load_pic_register in file arm.c there is the following code: if (TARGET_ARM) { ... } else if (TARGET_THUMB2) { /* Thumb-2 only allows very limited access to the PC. Calculate the address in a temporary register. */ if (arm_pic_register != INVALID_REGNUM) { pic_tmp = gen_rtx_REG (SImode, thumb_find_work_register (saved_regs)); } else { gcc_assert (can_create_pseudo_p ()); pic_tmp = gen_reg_rtx (Pmode); } emit_insn (gen_pic_load_addr_thumb2 (pic_reg, pic_rtx)); emit_insn (gen_pic_load_dot_plus_four (pic_tmp, labelno)); emit_insn (gen_addsi3 (pic_reg, pic_reg, pic_tmp)); } else /* TARGET_THUMB1 */ { ... } The comment says "Thumb-2 only allows very limited access to the PC. Calculate the address in a temporary register.", so the generated code is a little more complex than for Thumb-1. Could anybody explain in more detail what limitation Thumb-2 has compared to Thumb-1? The instructions generated by this function for Thumb-1 are listed below; both instructions are available under Thumb-2. ldr r3, .L2 .LPIC0: add r3, pc thanks Guozhi