Re: GCC aliasing rules: more aggressive than C99?

2010-01-04 Thread Andrew Haley
On 01/03/2010 10:14 PM, Joshua Haberman wrote:
> Andrew Haley writes:
>> On 01/03/2010 10:53 AM, Richard Guenther wrote:
>>> GCC follows its own documentation here, not some random
>>> websites and maybe not the strict reading of the standard.
>>
>> GCC is compatible with C99 in this regard.
> 
> I do not believe this is true.  Your argument that GCC complies with C99
> (which you moved to gcc-help@)

It's not appropriate here.  However, since we've started...

> is based on the argument that these are 
> not compatible types:
> 
>   union u { int x; }
>   int x;
> 
> However, I did not claim that they are compatible types, nor does my
> argument rely on them being compatible types.  Rather, my argument is
> based on section 6.5, paragraph 7 of C99, which I quoted, which 
> specifies the circumstances under which an object may or may not be
> aliased.  The case of compatible types is one case, but not the only 
> case, in which values may be aliased according to the standard.

"An object shall have its stored value accessed only by an lvalue
expression that has one of the following types: ...  an aggregate
or union type that includes one of the aforementioned types among its
members (including, recursively, a member of a subaggregate or
contained union), or ..."

doesn't mean that you can get such an aggregate or union lvalue by

  union u { int x; } *pu = (union u*)&i;

because the rules about pointer conversions only allow the result of

  (union u*)&i

to be converted back to an (int*).  They do not allow you to dereference
that pointer as a (union u*):

"6.3.2.3

"A pointer to an object or incomplete type may be converted to a
pointer to a different object or incomplete type. If the resulting
pointer is not correctly aligned for the pointed-to type, the
behavior is undefined. Otherwise, when converted back again, the
result shall compare equal to the original pointer."

This is *all* you are allowed to do with the converted pointer.  You
may not dereference it.

This is the core rule that governs C's aliasing.

Andrew.
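
For contrast, here is a minimal sketch of an access pattern that neither side of this thread disputes: the object is declared with the union type from the start, so no pointer cast is involved. The names `bump` and `demo` are made up for the illustration.

```c
union u { int x; };

/* Takes a pointer to int; the int it points at may live inside a union. */
void bump(int *p)
{
    *p += 1;
}

int demo(void)
{
    union u obj;          /* the object's declared type is the union */
    union u *pu = &obj;   /* no cast needed: this really is a union u */

    pu->x = 41;           /* access through the union lvalue */
    bump(&obj.x);         /* access through an int lvalue naming the member */
    return pu->x;
}
```

Calling `demo()` stores through the union and then increments through the `int *`. Both accesses name an object whose declared type is `union u`, which is the situation the cast in the disputed example tries to create after the fact.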


Re: GCC aliasing rules: more aggressive than C99?

2010-01-04 Thread Paolo Bonzini

On 01/03/2010 11:25 PM, Richard Guenther wrote:

> char charray[sizeof(long)] = {...};
> long l = *(long*)charray;  // ok


not correct;)   (the lvalue has to be of character type, yours is of
type 'long' - the type of the actual object does not matter)


What would be correct instead is

   memcpy ((char *) &l, charray, sizeof(long));

Paolo
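
A minimal sketch of the memcpy approach in both directions, assuming nothing beyond the standard library. The helper names `load_long` and `store_long` are hypothetical.

```c
#include <string.h>

/* Reassemble a long from raw bytes without ever forming a long lvalue
   that points into the byte array. */
long load_long(const unsigned char *bytes)
{
    long l;
    memcpy(&l, bytes, sizeof l);
    return l;
}

/* The reverse direction: copy the object representation of a long out. */
void store_long(unsigned char *bytes, long l)
{
    memcpy(bytes, &l, sizeof l);
}
```

A round trip through a `sizeof(long)`-byte buffer with `store_long` followed by `load_long` gives back the original value, and the buffer needs no particular alignment.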


PowerPC : GCC2 optimises better than GCC4???

2010-01-04 Thread Mark Colby
This sounds like a dumb question I know. However the following code
snippet results in many more machine instructions under 4.4.2 than under
2.9.5 (I am running a cygwin->PowerPC cross):

  typedef unsigned int U32;
  typedef union
  {
    U32 R;
    struct
    {
      U32 BF1:2;
      U32 :8;
      U32 BF2:2;
      U32 BF3:2;
      U32 :18;
    } B;
  } TEST_t;
  U32 testFunc(void)
  {
    TEST_t t;
    t.R=0;
    t.B.BF1=2;
    t.B.BF2=3;
    t.B.BF3=1;
    return t.R;
  }

Output under 4.4.2 (powerpc-eabi-gcc-4-4-2 -O3 -S gcc-test.cpp -o
gcc-test-442.s):

  li 0,2
  li 3,0
  rlwimi 3,0,30,0,1
  li 0,3
  rlwimi 3,0,20,10,11
  li 0,1
  rlwimi 3,0,18,12,13
  blr

Output under 2.9.5 (powerpc-eabi-gcc-2-9-5 -O3 -S gcc-test.cpp -o
gcc-test-295.s):

  lis 3,0x8034
  blr

Is there any way to improve this behaviour? I have been using 2.9.5 very
successfully for years and am now looking at 4.4.2, but have many such
examples in my code (for clarity of commenting and maintainability).

I have also noticed that 4.4.2 seems to use significantly larger stack
frames, and consequently more register-stacking instructions than 2.9.5
for the same functions. Am I missing something? Many thanks if you can
shed any light on this.

Mark
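
To check that the two assembly listings compute the same value, the 4.4.2 sequence can be replayed in C. This is a sketch under stated assumptions: `rlwimi` is modelled as rotate-left-then-insert-under-mask with IBM bit numbering (bit 0 is the most significant bit), only non-wrapping masks (MB <= ME) are handled, and all function names are invented for the illustration.

```c
#include <stdint.h>

/* Rotate a 32-bit value left by n bits (0 <= n <= 31). */
uint32_t rotl32(uint32_t x, unsigned n)
{
    return (x << n) | (x >> ((32u - n) & 31u));
}

/* Mask with ones in IBM bit positions mb..me, where bit 0 is the most
   significant bit.  Only the non-wrapping case mb <= me is handled. */
uint32_t ppc_mask(unsigned mb, unsigned me)
{
    return (UINT32_MAX >> mb) & (UINT32_MAX << (31u - me));
}

/* Model of rlwimi ra,rs,sh,mb,me: rotate rs left by sh, then replace
   the masked bits of ra with the rotated value. */
uint32_t rlwimi(uint32_t ra, uint32_t rs, unsigned sh, unsigned mb, unsigned me)
{
    uint32_t m = ppc_mask(mb, me);
    return (rotl32(rs, sh) & m) | (ra & ~m);
}

/* Replays the 4.4.2 instruction sequence from the post. */
uint32_t replay_gcc442(void)
{
    uint32_t r3 = 0;                  /* li 3,0 */
    r3 = rlwimi(r3, 2, 30, 0, 1);     /* li 0,2 ; rlwimi 3,0,30,0,1   */
    r3 = rlwimi(r3, 3, 20, 10, 11);   /* li 0,3 ; rlwimi 3,0,20,10,11 */
    r3 = rlwimi(r3, 1, 18, 12, 13);   /* li 0,1 ; rlwimi 3,0,18,12,13 */
    return r3;
}
```

Under this model `replay_gcc442()` yields 0x80340000, the same value that `lis 3,0x8034` loads, so the longer sequence is a missed constant fold rather than a different result.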



Re: PowerPC : GCC2 optimises better than GCC4???

2010-01-04 Thread Andrew Haley
On 01/04/2010 10:51 AM, Mark Colby wrote:
> This sounds like a dumb question I know. However the following code
> snippet results in many more machine instructions under 4.4.2 than under
> 2.9.5 (I am running a cygwin->PowerPC cross):
> 
>   typedef unsigned int U32;
>   typedef union
>   {
>     U32 R;
>     struct
>     {
>       U32 BF1:2;
>       U32 :8;
>       U32 BF2:2;
>       U32 BF3:2;
>       U32 :18;
>     } B;
>   } TEST_t;
>   U32 testFunc(void)
>   {
>     TEST_t t;
>     t.R=0;
>     t.B.BF1=2;
>     t.B.BF2=3;
>     t.B.BF3=1;
>     return t.R;
>   }
> 
> Output under 4.4.2 (powerpc-eabi-gcc-4-4-2 -O3 -S gcc-test.cpp -o
> gcc-test-442.s):
> 
>   li 0,2
>   li 3,0
>   rlwimi 3,0,30,0,1
>   li 0,3
>   rlwimi 3,0,20,10,11
>   li 0,1
>   rlwimi 3,0,18,12,13
>   blr
> 
> Output under 2.9.5 (powerpc-eabi-gcc-2-9-5 -O3 -S gcc-test.cpp -o
> gcc-test-295.s):
> 
>   lis 3,0x8034
>   blr
> 
> Is there any way to improve this behaviour? I have been using 2.9.5 very
> successfully for years and am now looking at 4.4.2, but have many such
> examples in my code (for clarity of commenting and maintainability).

This is very strange.  On x86_64, gcc 4.4.1 generates

        movl    $7170, %eax
        ret

This optimization is done by the first RTL cse pass.  I can't understand
why it's not being done for your target.  I guess this will need a
powerpc expert.

Andrew.
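
The two constants in this thread (7170 on x86_64, 0x8034 in the upper halfword on PowerPC) differ only because the two ABIs allocate bit-fields from opposite ends of the word. Bit-field layout is implementation-defined, so the sketch below avoids bit-fields and uses explicit shifts instead. The shift amounts are inferred from the two outputs, not taken from any ABI document, and the function names are hypothetical.

```c
#include <stdint.h>

/* Field widths in declaration order: BF1:2, pad:8, BF2:2, BF3:2, pad:18. */

/* Fields allocated starting from the least significant bit
   (the layout the x86_64 result implies). */
uint32_t pack_lsb_first(uint32_t bf1, uint32_t bf2, uint32_t bf3)
{
    return (bf1 & 3u) | ((bf2 & 3u) << 10) | ((bf3 & 3u) << 12);
}

/* Fields allocated starting from the most significant bit
   (the layout the PowerPC result implies). */
uint32_t pack_msb_first(uint32_t bf1, uint32_t bf2, uint32_t bf3)
{
    return ((bf1 & 3u) << 30) | ((bf2 & 3u) << 20) | ((bf3 & 3u) << 18);
}
```

With the values from testFunc (BF1=2, BF2=3, BF3=1), the first function gives 7170 and the second gives 0x80340000, matching `movl $7170, %eax` and `lis 3,0x8034` respectively.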


Re: PowerPC : GCC2 optimises better than GCC4???

2010-01-04 Thread Steven Bosscher
On Mon, Jan 4, 2010 at 12:02 PM, Andrew Haley wrote:
> On 01/04/2010 10:51 AM, Mark Colby wrote:
>> This sounds like a dumb question I know. However the following code
>> snippet results in many more machine instructions under 4.4.2 than under
>> 2.9.5 (I am running a cygwin->PowerPC cross):
>>
>>   typedef unsigned int U32;
>>   typedef union
>>   {
>>     U32 R;
>>     struct
>>     {
>>       U32 BF1:2;
>>       U32 :8;
>>       U32 BF2:2;
>>       U32 BF3:2;
>>       U32 :18;
>>     } B;
>>   } TEST_t;
>>   U32 testFunc(void)
>>   {
>>     TEST_t t;
>>     t.R=0;
>>     t.B.BF1=2;
>>     t.B.BF2=3;
>>     t.B.BF3=1;
>>     return t.R;
>>   }
>>
>> Output under 4.4.2 (powerpc-eabi-gcc-4-4-2 -O3 -S gcc-test.cpp -o
>> gcc-test-442.s):
>>
>>   li 0,2
>>   li 3,0
>>   rlwimi 3,0,30,0,1
>>   li 0,3
>>   rlwimi 3,0,20,10,11
>>   li 0,1
>>   rlwimi 3,0,18,12,13
>>   blr
>>
>> Output under 2.9.5 (powerpc-eabi-gcc-2-9-5 -O3 -S gcc-test.cpp -o
>> gcc-test-295.s):
>>
>>   lis 3,0x8034
>>   blr
>>
>> Is there any way to improve this behaviour? I have been using 2.9.5 very
>> successfully for years and am now looking at 4.4.2, but have many such
>> examples in my code (for clarity of commenting and maintainability).
>
> This is very strange.  On x86_64, gcc 4.4.1 generates
>
>        movl    $7170, %eax
>        ret
>
> This optimization is done by the first RTL cse pass.  I can't understand
> why it's not being done for your target.  I guess this will need a
> powerpc expert.

Known bug, see http://gcc.gnu.org/PR22141

I hope Jakub will finish this work for gcc 4.5.

Ciao!
Steven


RE: PowerPC : GCC2 optimises better than GCC4???

2010-01-04 Thread Mark Colby
> >> Is there any way to improve this behaviour? I have been using 2.9.5 very
> >> successfully for years and am now looking at 4.4.2, but have many such
> >> examples in my code (for clarity of commenting and maintainability).
> >
> > This is very strange.  On x86_64, gcc 4.4.1 generates
> >
> >        movl    $7170, %eax
> >        ret
> >
> > This optimization is done by the first RTL cse pass.  I can't understand
> > why it's not being done for your target.  I guess this will need a
> > powerpc expert.

Thanks Andrew for checking this on your system.

> Known bug, see http://gcc.gnu.org/PR22141
> 
> I hope Jakub will finish this work for gcc 4.5.
> 
> Ciao!
> Steven

Thanks Steven. At least I have a handle on it now. Fingers crossed for 4.5 :-)



Re: PowerPC : GCC2 optimises better than GCC4???

2010-01-04 Thread Jakub Jelinek
On Mon, Jan 04, 2010 at 12:18:50PM +0100, Steven Bosscher wrote:
> > This optimization is done by the first RTL cse pass.  I can't understand
> > why it's not being done for your target.  I guess this will need a
> > powerpc expert.
> 
> Known bug, see http://gcc.gnu.org/PR22141

That's unrelated.  PR22141 is about (lack of) merging of adjacent stores of
constant values into memory, but there are no memory stores involved here,
everything is in registers, so PR22141 patch will make zero difference here.

IMHO we really should have some late tree pass that converts adjacent
bitfield operations into integral operations on non-bitfields (likely with
alias set of the whole containing aggregate), as at the RTL level many cases
are simply too many instructions for combine etc. to optimize them properly,
while at the tree level it could be simpler.

Regarding PR22141, the patch works for the memory store merging, but has
performance regressions (mainly on PowerPC).  I guess I could enable it at
least for -Os and in that case check the sizes of all insns that are going
to be DCEd because of it against the size of the new sequence.
For -O2 perhaps I could limit it to only aligned stores with rtx_cost of the
constant being 0 or something similar.

Jakub
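
As a source-level illustration of the idea of turning adjacent bit-field operations into integral operations, the sketch below folds several constant field stores into one (mask, bits) pair so that the whole group becomes a single and/or on the containing word. It only shows the arithmetic; it is not GCC code, and `field_store`, `field_mask`, and `apply_merged` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* One constant store into a bit-field of a 32-bit word. */
struct field_store {
    unsigned shift;   /* position of the field's lowest bit */
    unsigned width;   /* field width in bits, 1..32 */
    uint32_t value;   /* constant being stored */
};

/* Ones covering the field, already shifted into place. */
uint32_t field_mask(unsigned shift, unsigned width)
{
    uint32_t ones = width >= 32 ? UINT32_MAX : (UINT32_C(1) << width) - 1u;
    return ones << shift;
}

/* Fold n field stores into one mask and one bit pattern, then apply
   them to the word with a single and/or. */
uint32_t apply_merged(uint32_t word, const struct field_store *s, size_t n)
{
    uint32_t mask = 0, bits = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t m = field_mask(s[i].shift, s[i].width);
        /* A later store to the same bits overrides an earlier one. */
        bits = (bits & ~m) | ((s[i].value << s[i].shift) & m);
        mask |= m;
    }
    return (word & ~mask) | bits;
}
```

Feeding it the three stores from the test case at the positions the PowerPC output implies (shifts 30, 20, and 18, each two bits wide) on a zero word gives 0x80340000 in one step, where the 4.4.2 output needs three inserts.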


RE: PowerPC : GCC2 optimises better than GCC4???

2010-01-04 Thread Mark Colby
> On Mon, Jan 04, 2010 at 12:18:50PM +0100, Steven Bosscher wrote:
> > > This optimization is done by the first RTL cse pass.  I can't understand
> > > why it's not being done for your target.  I guess this will need a
> > > powerpc expert.
> >
> > Known bug, see http://gcc.gnu.org/PR22141
>
> That's unrelated.  PR22141 is about (lack of) merging of adjacent stores of
> constant values into memory, but there are no memory stores involved here,
> everything is in registers, so PR22141 patch will make zero difference here.
>
> IMHO we really should have some late tree pass that converts adjacent
> bitfield operations into integral operations on non-bitfields (likely with
> alias set of the whole containing aggregate), as at the RTL level many cases
> are simply too many instructions for combine etc. to optimize them properly,
> while at the tree level it could be simpler.

Ah. I take it that v2's optimisation was structured differently, as it does 
spot and take care of this case?



where can find source snapshots of first GCC 4.5.0 ?

2010-01-04 Thread Bernd Roesch
Hi, 

I am looking into this regression:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41311

The problem occurs on m68k-elf too, but not with any GCC older than 4.5.0.

I want to find out whether the first GCC 4.5.0 snapshot
has this problem or not.

The first GCC 4.5.0 I compiled was from August, and it has the bug.
But on the mirror sites the oldest snapshots I can find now
are from October.

So maybe somebody can post me a link to older snapshots of GCC 4.5.0.

Bye



Re: where can find source snapshots of first GCC 4.5.0 ?

2010-01-04 Thread Jie Zhang
On Mon, Jan 4, 2010 at 8:04 PM, Bernd Roesch wrote:
> Hi,
>
> Because of this regression,
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41311
>
> Problem is in m68k-elf too, but happen not with any older GCC as 4.5.0
>
> i want try out if the first GCC 4.5.0 snapshot
> have this Problem or not.
>
> The first GCC 4.5.0 i compile was in month 08.this have the Bug.
> But i find on the mirror sites
> only first snapshots now that are from month 10.
>
> So maybe somebody can post me a link to older versions of GCC 4.5.0
>
I would recommend using the GCC git mirror and git bisect to locate the
source of the regression. It's very fast to switch between different
revisions.


Jie


RE: Possible IRA improvements for irregular register architectures

2010-01-04 Thread Ian Bolton
Happy New Year!

I was hoping for some kind of response to this, but maybe I didn't give
enough info?  I'd appreciate some pointers on what I could do to prompt
some discussion because I have some promising new ideas that lead on
from what I've described below.

Cheers,
Ian

> -Original Message-
> From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf Of
> Ian Bolton
> Sent: 18 December 2009 15:34
> To: gcc@gcc.gnu.org
> Subject: Possible IRA improvements for irregular register architectures
> 
> Let's assume I have two sub-classes of ALL_REGS: BOTTOM_REGS (c0-c15)
> and TOP_CREGS (c16-c31).
> 
> Let's also assume I have two main types of instruction: A-type
> Instructions, which can use ALL 32 registers, and B-type Instructions,
> which can only use the 16 BOTTOM_REGS.
> 
> IRA will correctly calculate costs (in ira-costs.c) for allocnos
> appearing in B-type instructions, such that TOP_CREGS has a higher
> cost than BOTTOM_REGS.  It will also calculate costs for the A-type
> instructions such that TOP_CREGS and BOTTOM_REGS have the same cost.
> 
> The order of coloring will be determined by the algorithm chosen:
> Priority or Chaitin-Briggs.  As each allocno is colored, the costs
> will be inspected and the best available hard register will be chosen,
> mainly based on the register class costs mentioned above, so allocnos
> in B-type Instructions will usually be assigned a BOTTOM_REG if one is
> free.  If two or more hard registers share the same cost then
> whichever one appears first in the REG_ALLOC_ORDER will be assigned.
> (If no hard register can be found, the allocno is assigned memory and
> will require a "reload" in a later pass to get a hard register.)
> 
> I do not wish to alter the coloring algorithms or the coloring order.
> I believe they are good at determining the order to color allocnos,
> which dictates the likelihood of being assigned a hard register.  What
> I wish to change is the hard register that is assigned, given that the
> coloring order has determined that this allocno should get one next.
> 
> Why do I need to do this?  Since the BOTTOM_REGS can be used by all
> instructions, it makes sense to put them first in the REG_ALLOC_ORDER,
> so we minimise the number of registers consumed by a low-pressure
> function.  But it also makes sense, in high-pressure functions, to
> steer A-type Instructions away from using BOTTOM_REGS so that they are
> free for B-type Instructions to use.
> 
> To achieve this, I tried altering the costs calculated in ira-costs.c,
> either explicitly with various hacks or by altering operand
> constraints.  The problem with this approach was that it is static and
> independent, occurring before any coloring order has been determined
> and without any awareness of the needs of other allocnos.  I believe I
> require a dynamic way to alter the costs, based on which allocnos
> conflict with the allocno currently being colored and which hard
> registers are still available at this point.
> 
> The patch I have attached here is my first reasonably successful
> attempt at this dynamic approach, which has led to performance
> improvements on some of our benchmarks and no significant
> regressions.
> 
> I am hoping it will be useful to others, but I post it more as a
> talking point or perhaps to inspire others to come up with better
> solutions and share them with me :-)
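
The attached patch is not reproduced in this archive. Purely as a toy model of the dynamic idea described above (not IRA code, and not the patch), the sketch below picks a hard register for one allocno and adds a penalty to BOTTOM_REGS only when the conflicting allocnos that still need them could use up every free one. Every name, the penalty parameter, and the contention test are assumptions made for the illustration.

```c
#include <limits.h>

#define NUM_REGS    32
#define BOTTOM_LAST 15   /* c0-c15 are BOTTOM_REGS, c16-c31 are TOP_CREGS */

/* Toy model: choose a hard register for one allocno.
   cost[r]       - static cost of register r for this allocno
   is_free[r]    - nonzero if r is not taken by a conflicting allocno
   needs_bottom  - nonzero if this allocno appears in a B-type instruction
   bottom_demand - conflicting, still-uncolored allocnos needing BOTTOM_REGS
   penalty       - extra cost for BOTTOM_REGS when they are contended
   Returns the chosen register, or -1 if none is free (spill).  */
int choose_hard_reg(const int cost[NUM_REGS], const int is_free[NUM_REGS],
                    int needs_bottom, int bottom_demand, int penalty)
{
    int free_bottom = 0;
    for (int r = 0; r <= BOTTOM_LAST; r++)
        free_bottom += is_free[r] != 0;

    /* Only steer A-type allocnos away when BOTTOM_REGS are contended. */
    int contended = !needs_bottom && bottom_demand > 0
                    && bottom_demand >= free_bottom;

    int best = -1, best_cost = INT_MAX;
    /* Index order stands in for REG_ALLOC_ORDER. */
    for (int r = 0; r < NUM_REGS; r++) {
        if (!is_free[r])
            continue;
        int c = cost[r] + ((contended && r <= BOTTOM_LAST) ? penalty : 0);
        if (c < best_cost) {   /* strict '<' keeps the earliest register on ties */
            best = r;
            best_cost = c;
        }
    }
    return best;
}
```

With equal costs and all registers free, an A-type allocno gets c0 in a low-pressure function (`bottom_demand` of 0) but c16 once sixteen conflicting allocnos need BOTTOM_REGS, which is the behaviour the post asks for. The static costs and the coloring order are left untouched.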


Call for participation: GROW'10 - 2nd Workshop on GCC Research Opportunities

2010-01-04 Thread Grigori Fursin
Apologies if you receive multiple copies of this call.



 CALL FOR PARTICIPATION

 2nd Workshop on
GCC Research Opportunities
   (GROW'10)

http://ctuning.org/workshop-grow10

  January 23, 2010, Pisa, Italy

 (co-located with HiPEAC 2010 Conference)


EARLY REGISTRATION DEADLINE: JAN. 6th, 2010


We invite you to participate in GROW 2010, the Workshop on GCC Research
Opportunities, to be held in Pisa, Italy on January 23, 2010, along with
the conference on High-Performance Embedded Architectures and Compilers
(HiPEAC).

The Workshop Program includes:
  * Presentations of 8 selected papers
  * A Keynote talk by Diego Novillo, Google, Canada, on:
  "Using GCC as a toolbox for research: GCC plugins and whole-program 
   compilation"
  * A panel on plugins and the future of GCC

The Workshop Program is now available:

http://cTuning.org/wiki/index.php/Dissemination:Workshops:GROW10:Program

The GROW workshop focuses on current challenges in research and development of
compiler analyses and optimizations based on the free GNU Compiler
Collection (GCC). The goal of this workshop is to bring together people
from industry and academia that are interested in conducting research based
on GCC and enhancing this compiler suite for research needs. The workshop
will promote and disseminate compiler research (recent, ongoing or planned)
with GCC, as a robust industrial-strength vehicle that supports free and
collaborative research. The program will include an invited talk and a
discussion panel on future research and development directions of GCC.

 Topics of interest 

Any issue related to innovative program analysis, optimizations and
run-time adaptation with GCC including but not limited to:

 * Classical compiler analyses, transformations and optimizations
 * Power-aware analyses and optimizations
 * Language/Compiler/HW cooperation
 * Optimizing compilation tools for heterogeneous/reconfigurable/
   multicore systems
 * Tools to improve compiler configurability and retargetability
 * Profiling, program instrumentation and dynamic analysis
 * Iterative and collective feedback-directed optimization
 * Case studies and performance evaluations
 * Techniques and tools to improve usability and quality of GCC
 * Plugins to enhance research capabilities of GCC

 Organizers 

 Dorit Nuzman, IBM, Israel
 Grigori Fursin, INRIA, France

 Program Committee 

 Arutyun I. Avetisyan, ISP RAS, Russia
 Zbigniew Chamski, Infrasoft IT Solutions, Poland
 Albert Cohen, INRIA, France
 David Edelsohn, IBM, USA
 Bjorn Franke, University of Edinburgh, UK
 Grigori Fursin, INRIA, France
 Benedict Gaster, AMD, USA
 Jan Hubicka, SUSE
 Paul H.J. Kelly, Imperial College of London, UK
 Ondrej Lhotak, University of Waterloo, Canada
 Hans-Peter Nilsson, Axis Communications, Sweden
 Diego Novillo, Google, Canada
 Dorit Nuzman, IBM, Israel
 Sebastian Pop, AMD, USA
 Ian Lance Taylor, Google, USA
 Chengyong Wu, ICT, China
 Kenneth Zadeck, NaturalBridge, USA
 Ayal Zaks, IBM, Israel

 Previous Workshops 

 GROW'09: http://www.doc.ic.ac.uk/~phjk/GROW09
 GREPS'07: http://sysrun.haifa.il.ibm.com/hrl/greps2007




Re: PowerPC : GCC2 optimises better than GCC4???

2010-01-04 Thread Andrew Haley
On 01/04/2010 12:07 PM, Jakub Jelinek wrote:
> On Mon, Jan 04, 2010 at 12:18:50PM +0100, Steven Bosscher wrote:
>> On Mon, Jan 4, 2010 at 12:02 PM, Andrew Haley wrote:
>>> This optimization is done by the first RTL cse pass.  I can't understand
>>> why it's not being done for your target.  I guess this will need a
>>> powerpc expert.
>>
>> Known bug, see http://gcc.gnu.org/PR22141
> 
> That's unrelated.  PR22141 is about (lack of) merging of adjacent stores of
> constant values into memory, but there are no memory stores involved here,
> everything is in registers, so PR22141 patch will make zero difference here.
> 
> IMHO we really should have some late tree pass that converts adjacent
> bitfield operations into integral operations on non-bitfields (likely with
> alias set of the whole containing aggregate), as at the RTL level many cases
> are simply too many instructions for combine etc. to optimize them properly,
> while at the tree level it could be simpler.

Yabbut, how come RTL cse can handle it in x86_64, but PPC not?

Andrew.


RE: PowerPC : GCC2 optimises better than GCC4???

2010-01-04 Thread Bingfeng Mei
I can confirm that our target also generates GOOD code for this case.
Maybe this is an EABI or target-specific thing, where a struct/union is
forced to memory.

Bingfeng
Broadcom UK

> -Original Message-
> From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On
> Behalf Of Andrew Haley
> Sent: 04 January 2010 16:08
> To: gcc@gcc.gnu.org
> Subject: Re: PowerPC : GCC2 optimises better than GCC4???
>
> On 01/04/2010 12:07 PM, Jakub Jelinek wrote:
> > On Mon, Jan 04, 2010 at 12:18:50PM +0100, Steven Bosscher wrote:
> >> On Mon, Jan 4, 2010 at 12:02 PM, Andrew Haley wrote:
> >>> This optimization is done by the first RTL cse pass.  I can't understand
> >>> why it's not being done for your target.  I guess this will need a
> >>> powerpc expert.
> >>
> >> Known bug, see http://gcc.gnu.org/PR22141
> >
> > That's unrelated.  PR22141 is about (lack of) merging of adjacent stores of
> > constant values into memory, but there are no memory stores involved here,
> > everything is in registers, so PR22141 patch will make zero difference here.
> >
> > IMHO we really should have some late tree pass that converts adjacent
> > bitfield operations into integral operations on non-bitfields (likely with
> > alias set of the whole containing aggregate), as at the RTL level many cases
> > are simply too many instructions for combine etc. to optimize them properly,
> > while at the tree level it could be simpler.
>
> Yabbut, how come RTL cse can handle it in x86_64, but PPC not?
>
> Andrew.


Re: PowerPC : GCC2 optimises better than GCC4???

2010-01-04 Thread Nathan Froyd
On Mon, Jan 04, 2010 at 04:08:17PM +, Andrew Haley wrote:
> On 01/04/2010 12:07 PM, Jakub Jelinek wrote:
> > IMHO we really should have some late tree pass that converts adjacent
> > bitfield operations into integral operations on non-bitfields (likely with
> > alias set of the whole containing aggregate), as at the RTL level many cases
> > are simply too many instructions for combine etc. to optimize them properly,
> > while at the tree level it could be simpler.
> 
> Yabbut, how come RTL cse can handle it in x86_64, but PPC not?

Probably because the RTL on x86_64 uses and's and ior's, but PPC uses
set's of zero_extract's (insvsi).

-Nathan


Re: PowerPC : GCC2 optimises better than GCC4???

2010-01-04 Thread Andrew Haley
On 01/04/2010 04:17 PM, Nathan Froyd wrote:
> On Mon, Jan 04, 2010 at 04:08:17PM +, Andrew Haley wrote:
>> On 01/04/2010 12:07 PM, Jakub Jelinek wrote:
>>> IMHO we really should have some late tree pass that converts adjacent
>>> bitfield operations into integral operations on non-bitfields (likely with
>>> alias set of the whole containing aggregate), as at the RTL level many cases
>>> are simply too many instructions for combine etc. to optimize them properly,
>>> while at the tree level it could be simpler.
>>
>> Yabbut, how come RTL cse can handle it in x86_64, but PPC not?
> 
> Probably because the RTL on x86_64 uses and's and ior's, but PPC uses
> set's of zero_extract's (insvsi).

Aha!  Yes, that'll probably be it.  It should be easy to fix cse to
recognize those too.

Andrew.


entry point of gimplification

2010-01-04 Thread sandeep soni
Hi,

I have been trying to understand the gcc source code and am interested
in customizing gcc to support speculative parallelization of
conditional branching and loop instructions. I am considering gimple
as input.

What is the entry point to the gimplification pass? And, given a
function body, which functions in the gcc source convert the body into
equivalent gimple statements?

Also, is there a way in which I can selectively step through the
execution of the source related to this?

Any other help on the main aim of speculative parallelization will
also be most helpful.

Apologies if the question sounds vague.

-- 
cheers
sandy


Re: GCC aliasing rules: more aggressive than C99?

2010-01-04 Thread Joshua Haberman
Andrew Haley writes:
> On 01/03/2010 10:14 PM, Joshua Haberman wrote:
> > Andrew Haley writes:
> "6.3.2.3
>
> "A pointer to an object or incomplete type may be converted to a
> pointer to a different object or incomplete type. If the resulting
> pointer is not correctly aligned for the pointed-to type, the
> behavior is undefined. Otherwise, when converted back again, the
> result shall compare equal to the original pointer."
>
> This is *all* you are allowed to do with the converted pointer.  You
> may not dereference it.

The text you quoted does not contain any "shall not" language about
dereferencing, so this conclusion does not follow.

> [Section 6.3.2.3] is the core rule that governs C's aliasing.

Section 6.5 paragraph 7 contains this footnote:

  The intent of this list is to specify those circumstances in which an
  object may or may not be aliased.

I am not sure why you discard the significance of this section.  Also
under your interpretation memcpy(&some_int, ..., ...) is illegal,
because memcpy() will write to the int's storage with a pointer type
other than int.

Josh



df_changeable_flags use in combine.c

2010-01-04 Thread Matt

Hi,

I'm fixing some compiler errors when configuring with 
--enable-build-with-cxx, and ran into a curious line of code that may 
indicate a bug:


static unsigned int
rest_of_handle_combine (void)
{
  int rebuild_jump_labels_after_combine;

  df_set_flags (DF_LR_RUN_DCE + DF_DEFER_INSN_RESCAN);
 // ...
}

The DF_* values are from the df_changeable_flags enum, whose values are 
typically used in logical and/or operations for masking purposes. As such, 
I'm guessing the author may have meant to do:

  df_set_flags (DF_LR_RUN_DCE & DF_DEFER_INSN_RESCAN);

I could have just added the explicit cast necessary to silence the
gcc-as-cxx warning I was running into, but I wanted to be a good citizen :)


Any pointers are appreciated,
Thanks!




--
tangled strands of DNA explain the way that I behave.
http://www.clock.org/~matt


[gcc-as-cxx] enum conversion to int

2010-01-04 Thread Matt

Hi,

I'm trying to fix some errors/warnings to make sure that gcc-as-cxx
doesn't bitrot too much. I ran into this issue, and am unsure how to fix
it without really ugly casting:


enum df_changeable_flags
df_set_flags (enum df_changeable_flags changeable_flags)
{
  enum df_changeable_flags old_flags = df->changeable_flags;
  df->changeable_flags |= changeable_flags;
  return old_flags;
}

I'm getting this warning on the second line of the function:
./../gcc-trunk/gcc/df-core.c: In function df_changeable_flags df_set_flags(df_changeable_flags):
../../gcc-trunk/gcc/df-core.c:474: error: invalid conversion from int to df_changeable_flags


At first blush, it seems like df_changeable_flags should be a typedef to
byte (or int, which is what it was being implicitly converted to
everywhere), and the enum should be disbanded into individual #defines.


I wanted to make sure that this wasn't a warning false positive first, 
though.


--
tangled strands of DNA explain the way that I behave.
http://www.clock.org/~matt
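
One way to keep the enum type and still satisfy a C++ compiler is to spell out the conversion back to the enum once, at the point of the update. The sketch below uses hypothetical stand-ins (`cx_flags`, `demo_df`, `demo_set_flags`) rather than the real df structures, and whether this cast counts as "really ugly" is a judgment call.

```c
/* Hypothetical stand-ins for the df structures discussed above. */
enum cx_flags {
    CX_A = 1 << 0,
    CX_B = 1 << 1
};

struct demo_df {
    enum cx_flags changeable_flags;
};

enum cx_flags demo_set_flags(struct demo_df *df, enum cx_flags add)
{
    enum cx_flags old_flags = df->changeable_flags;
    /* '|' on two enum values yields an int, so convert back explicitly.
       C accepts the implicit int-to-enum conversion; C++ does not. */
    df->changeable_flags = (enum cx_flags) (df->changeable_flags | add);
    return old_flags;
}
```

The function still returns the previous flags and ORs in the new ones; only the assignment changes from `|=` to an explicit `|` plus cast, which compiles as both C and C++.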


Re: GCC aliasing rules: more aggressive than C99?

2010-01-04 Thread Erik Trulsson
On Mon, Jan 04, 2010 at 08:17:00PM +, Joshua Haberman wrote:
> Andrew Haley writes:
> > On 01/03/2010 10:14 PM, Joshua Haberman wrote:
> > > Andrew Haley writes:
> > "6.3.2.3
> >
> > "A pointer to an object or incomplete type may be converted to a
> > pointer to a different object or incomplete type. If the resulting
> > pointer is not correctly aligned for the pointed-to type, the
> > behavior is undefined. Otherwise, when converted back again, the
> > result shall compare equal to the original pointer."
> >
> > This is *all* you are allowed to do with the converted pointer.  You
> > may not dereference it.
> 
> The text you quoted does not contain any "shall not" language about
> dereferencing, so this conclusion does not follow.

It doesn't have to use any "shall not" language.  If the standard does not
say that any particular action is allowed or otherwise defines what it
does, then that action implicitly has undefined behaviour.


> 
> > [Section 6.3.2.3] is the core rule that governs C's aliasing.
> 
> Section 6.5 paragraph 7 contains this footnote:
> 
>   The intent of this list is to specify those circumstances in which an
>   object may or may not be aliased.
> 
> I am not sure why you discard the significance of this section.  Also
> under your interpretation memcpy(&some_int, ..., ...) is illegal,
> because memcpy() will write to the int's storage with a pointer type
> other than int.

Your conclusion does not follow since the standard does not say what (if
any) pointer type memcpy() will use internally.  It is not even necessary
that memcpy() is implemented in C.




-- 

Erik Trulsson
ertr1...@student.uu.se


Re: GCC aliasing rules: more aggressive than C99?

2010-01-04 Thread Joshua Haberman
Erik Trulsson writes:
> On Mon, Jan 04, 2010 at 08:17:00PM +, Joshua Haberman wrote:
> > The text you quoted does not contain any "shall not" language about
> > dereferencing, so this conclusion does not follow.
>
> It doesn't have to use any "shall not" language.  If the standard does not
> say that any particular action is allowed or otherwise defines what it
> does, then that action implicitly has undefined behaviour.

Section 6.5 does define circumstances under which converted pointers may
be dereferenced.  Section 6.3.2.3 does not include any language
prohibiting it, so it does not support the assertion it was quoted to
support, and it is irrelevant in the context of this discussion.

> > > [Section 6.3.2.3] is the core rule that governs C's aliasing.
> >
> > Section 6.5 paragraph 7 contains this footnote:
> >
> >   The intent of this list is to specify those circumstances in which an
> >   object may or may not be aliased.
> >
> > I am not sure why you discard the significance of this section.  Also
> > under your interpretation memcpy(&some_int, ..., ...) is illegal,
> > because memcpy() will write to the int's storage with a pointer type
> > other than int.
>
> Your conclusion does not follow since the standard does not say what (if
> any) pointer type memcpy() will use internally.  It is not even necessary
> that memcpy() is implemented in C.

It says that it will copy characters.  More importantly, you are still 
ignoring section 6.5 without saying why.

Josh
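
Whatever memcpy() does internally, a byte-wise copy can be written in plain C without touching the disputed cases, because the list in section 6.5 paragraph 7 ends with "a character type". A minimal sketch, with the hypothetical name `copy_bytes`:

```c
#include <stddef.h>

/* Copy n bytes from src to dst using only character-type lvalues. */
void *copy_bytes(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n--)
        *d++ = *s++;
    return dst;
}
```

Copying one int onto another with `copy_bytes(&b, &a, sizeof a)` reads and writes both objects only through `unsigned char` lvalues.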



Re: df_changeable_flags use in combine.c

2010-01-04 Thread Jie Zhang

On 01/05/2010 07:12 AM, Matt wrote:

> Hi,
>
> I'm fixing some compiler errors when configuring with
> --enable-build-with-cxx, and ran into a curious line of code that may
> indicate a bug:
>
> static unsigned int
> rest_of_handle_combine (void)
> {
>   int rebuild_jump_labels_after_combine;
>
>   df_set_flags (DF_LR_RUN_DCE + DF_DEFER_INSN_RESCAN);
>   // ...
> }
>
> The DF_* values are from the df_changeable_flags enum, whose values are
> typically used in logical and/or operations for masking purposes. As
> such, I'm guessing the author may have meant to do:
>
>   df_set_flags (DF_LR_RUN_DCE & DF_DEFER_INSN_RESCAN);


I think you meant "|". I think "+" is the same as "|" here.

Also, I didn't see this error when building current trunk head with
--enable-build-with-cxx, but I do see other errors.



Jie
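
The point about "+" and "|" can be checked with two single-bit flags. The enum below is a hypothetical stand-in: the actual DF_* values are not shown in this thread, and the sketch only assumes they are distinct single bits.

```c
/* Hypothetical single-bit flags standing in for the DF_* values. */
enum demo_flags {
    DEMO_RUN_DCE      = 1 << 0,
    DEMO_DEFER_RESCAN = 1 << 5
};

/* For flags with no bits in common, '+' and '|' give the same result. */
int combine_with_plus(void) { return DEMO_RUN_DCE + DEMO_DEFER_RESCAN; }
int combine_with_or(void)   { return DEMO_RUN_DCE | DEMO_DEFER_RESCAN; }

/* '&' of two different single-bit flags is always zero, so it would
   set no flags at all. */
int combine_with_and(void)  { return DEMO_RUN_DCE & DEMO_DEFER_RESCAN; }
```

The two forms only diverge when operands share a bit: adding a flag to itself carries into the next bit, while OR-ing it with itself leaves it unchanged. That is why "|" is the more robust spelling.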


dwarf2 - multiple DW_TAG_variable for global variable

2010-01-04 Thread Nenad Vukicevic
I installed the gcc-4.5-20091224 snapshot and noticed that for a simple
variable declaration I get two DW_TAG_variable DIEs in the object file.
For example, the following code

int x;
main()
{x=1;}

generates (with -g -gdwarf2 -O0 switches):

<1><54>: Abbrev Number: 4 (DW_TAG_variable)
<55>   DW_AT_name: (indirect string, offset: 0x36): x
<59>   DW_AT_decl_file   : 1
<5a>   DW_AT_decl_line   : 1
<5b>   DW_AT_type: <0x4d>
<5f>   DW_AT_external: 1
<60>   DW_AT_declaration : 1
<1><61>: Abbrev Number: 5 (DW_TAG_variable)
<62>   DW_AT_name: (indirect string, offset: 0x36): x
<66>   DW_AT_decl_file   : 1
<67>   DW_AT_decl_line   : 1
<68>   DW_AT_type: <0x4d>
<6c>   DW_AT_external: 1
<6d>   DW_AT_location: 9 byte block: 3 0 0 0 0 0 0 0 0  (DW_OP_addr: 0)

Is the above normal? The 4.3.2 compiler generates only one DIE, the second
one with the DW_AT_location attribute, which is correct.

I also noticed that this example (where the variable is not used):

int x;
main()
{}

generates only one DW_TAG_variable, the one with DW_AT_location, which
again should be correct.

I ran into this problem while porting some GDB code that uses DWARF2 and
was surprised to see this change from the previous version of gcc (4.3).

Thanks,
Nenad




Why Thumb-2 only allows very limited access to the PC?

2010-01-04 Thread Carrot Wei
Hi

In function arm_load_pic_register in file arm.c there is the following code:

  if (TARGET_ARM)
    {
      ...
    }
  else if (TARGET_THUMB2)
    {
      /* Thumb-2 only allows very limited access to the PC.  Calculate the
         address in a temporary register.  */
      if (arm_pic_register != INVALID_REGNUM)
        {
          pic_tmp = gen_rtx_REG (SImode,
                                 thumb_find_work_register (saved_regs));
        }
      else
        {
          gcc_assert (can_create_pseudo_p ());
          pic_tmp = gen_reg_rtx (Pmode);
        }

      emit_insn (gen_pic_load_addr_thumb2 (pic_reg, pic_rtx));
      emit_insn (gen_pic_load_dot_plus_four (pic_tmp, labelno));
      emit_insn (gen_addsi3 (pic_reg, pic_reg, pic_tmp));
    }
  else /* TARGET_THUMB1 */
    {
      ...
    }

The comment says "Thumb-2 only allows very limited access to the PC.
Calculate the address in a temporary register." So the generated code
is a little more complex than for Thumb-1. Could anybody explain what
limitation Thumb-2 has compared to Thumb-1?

The instructions this function generates for Thumb-1 are listed
below; both instructions are also available in Thumb-2.

ldr r3, .L2
.LPIC0:
add r3, pc


thanks
Guozhi