How to make an array aligned to 16 bytes

2009-12-11 Thread Jianzhang Peng
Can I make an array 16-byte aligned during an RTL pass?

Thanks!

-- 
Jianzhang Peng


Re: identifying indirect references in a loop

2009-12-11 Thread Richard Guenther
On Fri, Dec 11, 2009 at 5:16 AM, Aravinda wrote:
> Hi,
>
> I'm trying to identify all indirect references in a loop so that, after
> this analysis, I have a list of tree_nodes of pointer_type that are
> dereferenced in a loop along with their step size, if any.
>
> E.g.
> while(i++ < n)
> {
>   *(p+i);
> }
>
> I want to get the pointer_type_node for 'p' and identify the step size
> as '1', since 'i' has a step size of 1.
>
> I am able to identify 'INDIRECT_REF' nodes in the loop. But since
> these are generally the expression_temporaries, I will not get the
> tree_node for 'p'. But I believe INDIRECT_REF is an expression whose
> arg0 is an SSA_NAME node from which I will be able to use the
> SSA_NAME_DEF_STMT to ultimately reach the tree_node for 'p'.
>
> But I don't know how to get the SSA_NAME node from the given
> INDIRECT_REF. Could someone please point out how to do this?
>
> Also, I find it very difficult to know how the tree_nodes and types
> are contained one within the other. Is there a general technique by
> which I can know when a tree node will be nested within another and
> how to retrieve them?

Look into the tree.def file.  Operands can be retrieved with the
TREE_OPERAND macro (see tree.h).  So if you have an
INDIRECT_REF expression tree node you can get the
variable or SSA_NAME that is dereferenced using TREE_OPERAND (e, 0)
if e is the INDIRECT_REF expression tree.  The pointer type
is then simply TREE_TYPE of that operand.
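
As a rough illustration of that access pattern, here is a self-contained
toy model in C. It is not GCC code: the toy_ types and TOY_ macros are
hypothetical stand-ins that only mimic how TREE_CODE, TREE_TYPE and
TREE_OPERAND are used, so that the sketch compiles outside the compiler.
Inside GCC the equivalent would be TREE_OPERAND (e, 0) and TREE_TYPE of
that operand.

```c
#include <stddef.h>

/* Toy stand-ins for tree codes; hypothetical, not GCC's tree.def. */
enum toy_code {
    TOY_INTEGER_TYPE,
    TOY_POINTER_TYPE,
    TOY_SSA_NAME,
    TOY_INDIRECT_REF
};

/* Toy node: a code, a type link and a small operand array. */
struct toy_node {
    enum toy_code code;
    struct toy_node *type;        /* plays the role of TREE_TYPE */
    struct toy_node *operands[2]; /* plays the role of TREE_OPERAND */
};

#define TOY_CODE(n)       ((n)->code)
#define TOY_TYPE(n)       ((n)->type)
#define TOY_OPERAND(n, i) ((n)->operands[(i)])

/* Return the node being dereferenced by an indirect reference,
   or NULL if E is not an indirect reference.  */
struct toy_node *toy_dereferenced_pointer(struct toy_node *e)
{
    if (e == NULL || TOY_CODE(e) != TOY_INDIRECT_REF)
        return NULL;
    return TOY_OPERAND(e, 0);
}

/* Return the pointer type of the dereferenced operand, or NULL.  */
struct toy_node *toy_pointer_type_of_deref(struct toy_node *e)
{
    struct toy_node *ptr = toy_dereferenced_pointer(e);
    return ptr ? TOY_TYPE(ptr) : NULL;
}
```

Given a toy INDIRECT_REF node whose operand 0 is an SSA_NAME-like node
of pointer type, the first helper returns that node and the second
returns its pointer type.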

Btw, I think you want to use the existing data dependence analysis
which provides you with a list of data references in a loop.
See tree-data-ref.[ch].

Richard.

> Thanks,
> Aravinda
>


[RFC] LTO and debug information

2009-12-11 Thread Richard Guenther

The following draft patch stops debuginfo from being disabled when using
-flto or -fwhopr and fixes things up so that debugging of C code (mostly)
works.

The main question I have is how to proceed further here (with the
goal that simple debugging should be possible in 4.5).  If we
apply this patch then we expose ICEs when -flto is used in
conjunction with -g because the patch doesn't fix all clashes
between free-lang-data and dwarf2out.  One option I was thinking of
is to call sorry () instead of ICEing when we would ICE with debuginfo
enabled after free-lang-data has run.  Another is to keep -g
non-operational for LTO and add a -glto or -fi-really-want-to-debug option.

Or of course hope I can reasonably fix the ICEs I run into and
deal with the remaining cases as bugs?

The patch has proven useful debugging miscompiles in its current
state already.

Thanks,
Richard.

2009-12-11  Richard Guenther  

* tree.c (free_lang_data_in_binfo): Do not free BINFO_OFFSET
and BINFO_VPTR_FIELD.
(free_lang_data_in_decl): Do not free DECL_SIZE_UNIT,
DECL_SIZE, DECL_FIELD_OFFSET and DECL_FCONTEXT.
(free_lang_data): Do not disable debuginfo.
* lto-streamer-out.c (write_symbol_vec): Deal with
non-constant DECL_SIZE.
* dwarf2out.c (add_pure_or_virtual_attribute): Check for
DECL_CONTEXT.
(gen_type_die_for_member): Test for TYPE_STUB_DECL.
* opts.c (decode_options): Do not disable var-tracking for lto.

lto/
* lto.c (lto_fixup_field_decl): Fixup DECL_FIELD_OFFSET.
(lto_post_options): Do not disable debuginfo.

Index: gcc/tree.c
===
*** gcc/tree.c  (revision 155164)
--- gcc/tree.c  (working copy)
*** free_lang_data_in_binfo (tree binfo)
*** 4152,4164 
  
gcc_assert (TREE_CODE (binfo) == TREE_BINFO);
  
-   BINFO_OFFSET (binfo) = NULL_TREE;
BINFO_VTABLE (binfo) = NULL_TREE;
-   BINFO_VPTR_FIELD (binfo) = NULL_TREE;
BINFO_BASE_ACCESSES (binfo) = NULL;
BINFO_INHERITANCE_CHAIN (binfo) = NULL_TREE;
BINFO_SUBVTT_INDEX (binfo) = NULL_TREE;
-   BINFO_VPTR_FIELD (binfo) = NULL_TREE;
  
for (i = 0; VEC_iterate (tree, BINFO_BASE_BINFOS (binfo), i, t); i++)
  free_lang_data_in_binfo (t);
--- 4152,4161 
*** free_lang_data_in_decl (tree decl)
*** 4376,4404 
 }
 }
  
!   if (TREE_CODE (decl) == PARM_DECL
!   || TREE_CODE (decl) == FIELD_DECL
!   || TREE_CODE (decl) == RESULT_DECL)
! {
!   tree unit_size = DECL_SIZE_UNIT (decl);
!   tree size = DECL_SIZE (decl);
!   if ((unit_size && TREE_CODE (unit_size) != INTEGER_CST)
! || (size && TREE_CODE (size) != INTEGER_CST))
!   {
! DECL_SIZE_UNIT (decl) = NULL_TREE;
! DECL_SIZE (decl) = NULL_TREE;
!   }
! 
!   if (TREE_CODE (decl) == FIELD_DECL
! && DECL_FIELD_OFFSET (decl)
! && TREE_CODE (DECL_FIELD_OFFSET (decl)) != INTEGER_CST)
!   DECL_FIELD_OFFSET (decl) = NULL_TREE;
! 
!   /* DECL_FCONTEXT is only used for debug info generation.  */
!   if (TREE_CODE (decl) == FIELD_DECL)
!   DECL_FCONTEXT (decl) = NULL_TREE;
! }
!   else if (TREE_CODE (decl) == FUNCTION_DECL)
  {
if (gimple_has_body_p (decl))
{
--- 4373,4379 
 }
 }
  
!  if (TREE_CODE (decl) == FUNCTION_DECL)
  {
if (gimple_has_body_p (decl))
{
*** free_lang_data (void)
*** 4973,4985 
diagnostic_finalizer (global_dc) = default_diagnostic_finalizer;
diagnostic_format_decoder (global_dc) = default_tree_printer;
  
-   /* FIXME. We remove sufficient language data that the debug
-  info writer gets completely confused.  Disable debug information
-  for now.  */
-   debug_info_level = DINFO_LEVEL_NONE;
-   write_symbols = NO_DEBUG;
-   debug_hooks = &do_nothing_debug_hooks;
- 
return 0;
  }
  
--- 4948,4953 
Index: gcc/lto-streamer-out.c
===
*** gcc/lto-streamer-out.c  (revision 155164)
--- gcc/lto-streamer-out.c  (working copy)
*** write_symbol_vec (struct lto_streamer_ca
*** 2350,2356 
  break;
}
  
!   if (kind == GCCPK_COMMON && DECL_SIZE (t))
size = (((uint64_t) TREE_INT_CST_HIGH (DECL_SIZE (t))) << 32)
  | TREE_INT_CST_LOW (DECL_SIZE (t));
else
--- 2349,2357 
  break;
}
  
!   if (kind == GCCPK_COMMON
! && DECL_SIZE (t)
! && TREE_CODE (DECL_SIZE (t)) == INTEGER_CST)
size = (((uint64_t) TREE_INT_CST_HIGH (DECL_SIZE (t))) << 32)
  | TREE_INT_CST_LOW (DECL_SIZE (t));
else
Index: gcc/dwarf2out.c
===
*** gcc/dwarf2out.c (revision 155164)
--- gcc/dwarf2out.c (working copy)
*** add_pure_or_virtual_attribute (dw_die_re
*** 16476,16482 
   0

Re: generate RTL sequence

2009-12-11 Thread Ian Lance Taylor
Joern Rennecke writes:

> If you need more rigid scheduling, you can use CC0.

No, please don't.  I accept that CC0 is necessary today for a few
processors, but I really don't think we should encourage any new uses
of it.

Ian


Bitfields problem

2009-12-11 Thread Jean Christophe Beyler
As I continue my work on the machine description file, I have been
working on bitfields again to try to get good code generation. For now,
I've followed what the ia64 port does for signed extractions:

(define_insn "extv"
  [(set (match_operand:DI 0 "gr_register_operand" "=r")
(sign_extract:DI (match_operand:DI 1 "gr_register_operand" "r")
 (match_operand:DI 2 "extr_len_operand" "n")
 (match_operand:DI 3 "shift_count_operand" "M")))]
  ""
  "extr %0 = %1, %3, %2"
  [(set_attr "itanium_class" "ishf")])


Now this works for me, except for what I get with this code:

typedef struct sTest {
int64_t a:1;
int64_t b:5;
int64_t c:7;
int64_t d:15;
}STest;

int64_t bar2 (STest a)
{
int64_t res = a.d;
return res;
}

Here is what I get at the final cleanup:

;; Function bar2 (bar2)

bar2 (a)
{
  short unsigned int SR.44;
  short unsigned int SR.43;
  short unsigned int SR.41;
  short unsigned int SR.40;
  short unsigned int SR.22;
  short unsigned int SR.3;

:
  SR.22 = (short unsigned int) () ((short unsigned
int) a.d & 32767);
  SR.43 = SR.22 & 32767;
  SR.44 = SR.43 ^ 16384;
  SR.3 = (short unsigned int) () ((short unsigned
int) () (SR.44 + 49152) & 32767);
  SR.40 = SR.3 & 32767;
  SR.41 = SR.40 ^ 16384;
  return (int64_t) () (SR.41 + 49152);

}

I don't understand why I get all these instructions. I know that
it is more complicated because the field is signed, but I would prefer
to get an unsigned extract and then a shift left/shift right. Thus 3
instructions.
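
For illustration, the three-operation sequence asked for here can be
sketched in plain C. This only models the arithmetic and is not output
from any port. The function names are hypothetical. The second
sign-extension helper mirrors the "^ 16384" / "+ 49152" pair in the dump
above, which is the xor-and-subtract form of the same operation for a
15-bit field held in 16 bits (49152 is 65536 - 16384).

```c
#include <stdint.h>

/* Unsigned extract: WIDTH bits (1..64) starting at bit POS, bit 0 = LSB. */
uint64_t extract_unsigned(uint64_t word, unsigned pos, unsigned width)
{
    uint64_t mask = (width >= 64) ? ~UINT64_C(0)
                                  : ((UINT64_C(1) << width) - 1);
    return (word >> pos) & mask;
}

/* Sign-extend a WIDTH-bit field (1..64) with a shift left followed by
   an arithmetic shift right.  Converting to int64_t and right-shifting
   a negative value are implementation-defined in C; this assumes the
   two's-complement, arithmetic-shift behaviour that GCC documents.  */
int64_t sign_extend_shifts(uint64_t field, unsigned width)
{
    unsigned shift = 64 - width;
    return (int64_t)(field << shift) >> shift;
}

/* Same result via xor and subtract of the field's sign bit.
   FIELD must already be zero-extended (no bits set above WIDTH).  */
int64_t sign_extend_xor(uint64_t field, unsigned width)
{
    uint64_t sign = UINT64_C(1) << (width - 1);
    return (int64_t)((field ^ sign) - sign);
}
```

Assuming the bitfields are allocated from the least significant bit
(which is ABI-dependent), d would start at bit 13, and
sign_extend_shifts (extract_unsigned (word, 13, 15), 15) would read it.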

Right now, I get many more instructions than that, corresponding to
what I showed from the final cleanup.

Any reason for all these instructions, or ideas on how to get to my 3
instructions?

Thank you for your help and time,
Jean Christophe Beyler


Dwarf announcements mailing list

2009-12-11 Thread Michael Eager

The public comment draft of the DWARF Version 4 Standard
should be available some time next month.  It will be
on the DWARF website: http://dwarfstd.org.

If you want to receive a notification when this is available,
please sign up on the DWARF announcements mailing list:

http://lists.dwarfstd.org/listinfo.cgi/dwarf-announce-dwarfstd.org

--
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077




Vectorizing 16bit signed integers

2009-12-11 Thread Allan Sandfeld Jensen
Hi

I hope someone can help me. I've been trying to write some tight integer loops
in a way that could be auto-vectorized, saving me from writing assembler or using
specific vectorization extensions. Unfortunately I've not yet managed to make
gcc vectorize any of them.

I've simplified the case to just perform the very first operation in the loop; 
converting from two's complement to sign-and-magnitude.

I've then used -ftree-vectorizer-verbose to examine whether the loops were
vectorized and, if not, why not, but I am afraid I don't understand the output.

The simplest version of the loop is here (it appears the branch is not a 
problem, but I have another version without).

inline uint16_t transsign(int16_t v) {
if (v<0) {
return 0x8000U | (1-v);
} else {
return v;
}
}

It very simply converts in a fashion that maintains the full effective
bit-width.
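
As a small cross-check of the mapping, the sketch below compares a
branching and a branch-free form over all 65536 inputs. It is modelled on
the ~v and shift-based variants (transsign2 and transsign1) from the
attached test case. The function names here are hypothetical, and the
1-v form shown above is a different mapping and is left out. The
branch-free version relies on right-shifting a negative value being an
arithmetic shift, which is implementation-defined in C and is what GCC
does.

```c
#include <stdint.h>

/* Branching form: non-negative values pass through; negative values
   keep the sign bit and get their low 15 bits inverted.  */
uint16_t signflip_branch(int16_t v)
{
    if (v < 0)
        return (uint16_t)(0x8000U | (unsigned)~v);
    return (uint16_t)v;
}

/* Branch-free form: build a 0x7FFF-or-0 mask from the sign and xor.  */
uint16_t signflip_shift(int16_t v)
{
    uint16_t mask = (uint16_t)(v >> 15); /* 0xFFFF if v < 0, else 0 */
    mask &= 0x7FFFU;                     /* never invert the sign bit */
    return (uint16_t)((uint16_t)v ^ mask);
}

/* Count inputs on which the two forms disagree.  */
unsigned count_mismatches(void)
{
    unsigned bad = 0;
    for (int32_t v = INT16_MIN; v <= INT16_MAX; v++)
        if (signflip_branch((int16_t)v) != signflip_shift((int16_t)v))
            bad++;
    return bad;
}
```

If the two forms agree everywhere, count_mismatches () returns 0, so the
branch in the scalar function is not needed to express the conversion.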

The error from the vectorizer is:
vectorizesign.cpp:42: note: not vectorized: relevant stmt not supported: 
v.1_16 = (uint16_t) D.2157_11;

It appears the unsupported operation in vectorization is the typecast from
int16_t to uint16_t. Can this really be the case, or is the output misleading?

If it is the case, then is there a good reason for it, or can I fix it myself by
adding additional vectorizable operations?

I've attached both the test case and the full output of -ftree-vectorizer-verbose=9.

Best regards
Allan

#include <stdint.h>

inline uint16_t transsign1(int16_t v) {
// written with no control-flow to facilitate auto-vectorization
uint16_t sv = v >> 15; // arithmetic right shift of a signed value gives a classic sign selector: -1 or 0
sv = sv & 0x7FFFU; // never invert the sign-bit
return v ^ sv; // conditional inversion by xor
}

inline uint16_t transsign2(int16_t v) {
if (v<0) {
return 0x8000U | ~v;
} else {
return v;
}
}

inline uint16_t transsign3(int16_t v) {
if (v<0) {
return 0x8000U | (1-v);
} else {
return v;
}
}

// candidate for vectorization
void convertts1(uint16_t* out, int16_t* in, uint32_t len) {
for(unsigned int i=0;i
gcc: 2: No such file or directory

vectorizesign.cpp:28: note: = analyze_loop_nest =
vectorizesign.cpp:28: note: === vect_analyze_loop_form ===
vectorizesign.cpp:28: note: split exit edge.
vectorizesign.cpp:28: note: === get_loop_niters ===
vectorizesign.cpp:28: note: ==> get_loop_niters:len_3(D)
vectorizesign.cpp:28: note: Symbolic number of iterations is len_3(D)
vectorizesign.cpp:28: note: === vect_analyze_data_refs ===

vectorizesign.cpp:28: note: get vectype with 8 units of type short int
vectorizesign.cpp:28: note: vectype: vector short int
vectorizesign.cpp:28: note: get vectype with 8 units of type short unsigned int
vectorizesign.cpp:28: note: vectype: vector short unsigned int
vectorizesign.cpp:28: note: === vect_analyze_scalar_cycles ===
vectorizesign.cpp:28: note: Analyze phi: i_16 = PHI 

vectorizesign.cpp:28: note: Access function of PHI: {0, +, 1}_1
vectorizesign.cpp:28: note: step: 1,  init: 0
vectorizesign.cpp:28: note: Detected induction.
vectorizesign.cpp:28: note: Analyze phi: SMT.12_27 = PHI 

vectorizesign.cpp:28: note: === vect_pattern_recog ===
vectorizesign.cpp:28: note: vect_is_simple_use: operand i_16
vectorizesign.cpp:28: note: def_stmt: i_16 = PHI 

vectorizesign.cpp:28: note: type of def: 4.
vectorizesign.cpp:28: note: === vect_mark_stmts_to_be_vectorized ===
vectorizesign.cpp:28: note: init: phi relevant? i_16 = PHI 

vectorizesign.cpp:28: note: init: phi relevant? SMT.12_27 = PHI 

vectorizesign.cpp:28: note: init: stmt relevant? D.2120_5 = i_16 * 2;

vectorizesign.cpp:28: note: init: stmt relevant? D.2121_7 = out_6(D) + D.2120_5;

vectorizesign.cpp:28: note: init: stmt relevant? D.2122_10 = in_9(D) + D.2120_5;

vectorizesign.cpp:28: note: init: stmt relevant? D.2123_11 = *D.2122_10;

vectorizesign.cpp:28: note: init: stmt relevant? D.2124_12 = (int) D.2123_11;

vectorizesign.cpp:28: note: init: stmt relevant? D.2170_17 = D.2124_12 >> 15;

vectorizesign.cpp:28: note: init: stmt relevant? sv_18 = (uint16_t) D.2170_17;

vectorizesign.cpp:28: note: init: stmt relevant? sv_19 = sv_18 & 32767;

vectorizesign.cpp:28: note: init: stmt relevant? sv.0_20 = (short int) sv_19;

vectorizesign.cpp:28: note: init: stmt relevant? D.2167_21 = sv.0_20 ^ 
D.2123_11;

vectorizesign.cpp:28: note: init: stmt relevant? D.2166_22 = (uint16_t) 
D.2167_21;

vectorizesign.cpp:28: note: init: stmt relevant? *D.2121_7 = D.2166_22;

vectorizesign.cpp:28: note: vec_stmt_relevant_p: stmt has vdefs.
vectorizesign.cpp:28: note: mark relevant 4, live 0.
vectorizesign.cpp:28: note: init: stmt relevant? i_14 = i_16 + 1;

vectorizesign.cpp:28: note: init: stmt relevant? if (len_3(D) > i_14)

vectorizesign.cpp:28: note: worklist: examine stmt: *D.2121_7 = D.2166_22;

vectorizesign.cpp:28: note: vect_is_simple_use: operand D.2166_22
vectorizesign.cpp:28: note: def_stmt: D.2166_22 = (uint16_t) D.2167_21;

vectorizesign.cpp:28: note: type of def: 3.
vectorizes

Re: Bitfields problem

2009-12-11 Thread Jean Christophe Beyler
Interestingly enough, if I do this instead:

typedef struct sTest {
int a:12;
int b:20;
int c:7;
int d:15;
}STest;


int64_t bar2 (STest *a)
{
int64_t res = a->b;
return res;
}

I get at the expand pass :

(insn 6 5 7 3 struct3.c:27 (set (reg:SI 75)
(mem/s:SI (reg/v/f:DI 73 [ a ]) [0 S4 A32])) -1 (nil)) ->
Actually get the data

(insn 7 6 8 3 struct3.c:27 (set (reg:DI 77)
(zero_extract:DI (subreg:DI (reg:SI 75) 0)
(const_int 20 [0x14])
(const_int 12 [0xc]))) -1 (nil))   -> Extract the
bits we want but this is zero_extracted

(insn 8 7 9 3 struct3.c:27 (set (reg:DI 78)
(ashift:DI (reg:DI 77)
(const_int 43 [0x2b]))) -1 (nil))

(insn 9 8 10 3 struct3.c:27 (set (subreg:DI (reg:SI 76) 0)
(ashiftrt:DI (reg:DI 78)
(const_int 43 [0x2b]))) -1 (nil))   -> These two
instructions actually sign extend it

(insn 10 9 11 3 struct3.c:27 (set (reg:DI 79)
(ashift:DI (reg:SI 76)
(const_int 32 [0x20]))) -1 (nil))

(insn 11 10 12 3 struct3.c:27 (set (reg:DI 74)
(ashiftrt:DI (reg:DI 79)
(const_int 32 [0x20]))) -1 (expr_list:REG_EQUAL
(sign_extend:DI (reg:SI 76))
(nil)))   -> Because it's seen as a SI, these last two sign
extend it again...


And I get later on in the passes (the instructions are removed by the
combine pass):

(insn 6 3 7 2 struct3.c:27 (set (reg:SI 75)
(mem/s:SI (reg:DI 8 r8 [ a ]) [0 S4 A32])) 74
{movsi_internal2} (expr_list:REG_DEAD (reg:DI 8 r8 [ a ])
(nil)))

(note 7 6 8 2 NOTE_INSN_DELETED)

(note 8 7 9 2 NOTE_INSN_DELETED)

(note 9 8 10 2 NOTE_INSN_DELETED)

(note 10 9 11 2 NOTE_INSN_DELETED)

(note 11 10 16 2 NOTE_INSN_DELETED)

(insn 16 11 22 2 struct3.c:30 (set (reg/i:DI 6 r6)
(zero_extract:DI (subreg:DI (reg:SI 75) 0)
(const_int 20 [0x14])
(const_int 12 [0xc]))) 63 {extzvdi} (expr_list:REG_DEAD (reg:SI 75)
(nil)))

So now I have two issues that I can't seem to figure out :

- Why can combine remove these 4 instructions?

- Why is there such a difference between a local variable that is not
a pointer, a pointer, and a global variable?

I remember seeing different behavior depending on whether the variable
was a global variable or a parameter. It seems that this is also the
case here. However, this is worse, since it transforms my signed
extract into a simple zero_extract.

Thanks for your help,
Jc

PS: here is the combine pass debug information:

;; Function bar2 (bar2)

starting the processing of deferred insns
ending the processing of deferred insns
df_analyze called
insn_cost 2: 4
insn_cost 6: 4
insn_cost 7: 36
insn_cost 8: 4
insn_cost 9: 4
insn_cost 10: 4
insn_cost 11: 4
insn_cost 16: 4
insn_cost 22: 0
deferring deletion of insn with uid = 2.
modifying insn i3 6 r75:SI=[r8:DI]
  REG_DEAD: r8:DI
deferring rescan insn with uid = 6.
deferring deletion of insn with uid = 8.
modifying insn i3 9 r76:SI#0=r77:DI
  REG_DEAD: r77:DI
deferring rescan insn with uid = 9.
deferring deletion of insn with uid = 7.
modifying insn i3 9 r76:SI#0=zero_extract(r75:SI#0,0x14,0xc)
  REG_DEAD: r75:SI
deferring rescan insn with uid = 9.
deferring deletion of insn with uid = 10.
modifying insn i3 11 r74:DI=r76:SI#0&0xf
  REG_DEAD: r76:SI
deferring rescan insn with uid = 11.
deferring deletion of insn with uid = 9.
modifying insn i3 11 r74:DI=zero_extract(r75:SI#0,0x14,0xc)
  REG_DEAD: r75:SI
deferring rescan insn with uid = 11.
deferring deletion of insn with uid = 11.
modifying insn i3 16 r6:DI=zero_extract(r75:SI#0,0x14,0xc)
  REG_DEAD: r75:SI
deferring rescan insn with uid = 16.
(note 1 0 4 NOTE_INSN_DELETED)

(note 4 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)

(note 2 4 3 2 NOTE_INSN_DELETED)

(note 3 2 6 2 NOTE_INSN_FUNCTION_BEG)

(insn 6 3 7 2 struct3.c:27 (set (reg:SI 75)
(mem/s:SI (reg:DI 8 r8 [ a ]) [0 S4 A32])) 74
{movsi_internal2} (expr_list:REG_DEAD (reg:DI 8 r8 [ a ])
(nil)))

(note 7 6 8 2 NOTE_INSN_DELETED)

(note 8 7 9 2 NOTE_INSN_DELETED)

(note 9 8 10 2 NOTE_INSN_DELETED)

(note 10 9 11 2 NOTE_INSN_DELETED)

(note 11 10 16 2 NOTE_INSN_DELETED)

(insn 16 11 22 2 struct3.c:30 (set (reg/i:DI 6 r6)
(zero_extract:DI (subreg:DI (reg:SI 75) 0)
(const_int 20 [0x14])
(const_int 12 [0xc]))) 63 {extzvdi} (expr_list:REG_DEAD (reg:SI 75)
(nil)))

(insn 22 16 0 2 struct3.c:30 (use (reg/i:DI 6 r6)) -1 (nil))
starting the processing of deferred insns
deleting insn with uid = 2.
deleting insn with uid = 7.
deleting insn with uid = 8.
deleting insn with uid = 9.
deleting insn with uid = 10.
deleting insn with uid = 11.
rescanning insn with uid = 6.
deleting insn with uid = 6.
rescanning insn with uid = 16.
deleting insn with uid = 16.
ending the processing of deferred insns

;; Combiner totals: 16 attempts, 16 substitutions (2 requiring new space),
;; 6 successes.


On Fri, Dec 11, 2009 at 11:57 AM, Jean Christ

Re: Bad mailing list index?

2009-12-11 Thread Nicholas Sherlock

On 10/12/2009 7:43 a.m., H.J. Lu wrote:

> Hi,
>
> When I visit:
>
> http://gcc.gnu.org/ml/gcc-bugs/
> http://gcc.gnu.org/ml/gcc-cvs/
>
> at Wed Dec  9 10:41:43 PST 2009, I didn't see "December, 2009".
> It was there yesterday. Has anyone else seen it? You may need to
> clear browser cache first.


The page sends a Last-Modified time but no Expires header, so some 
aggressive proxies, browsers and other caches could end up caching it 
for a long time. Adding an Expires header set for the 1st of the next 
month would be a good idea.
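
For what it's worth, here is a minimal sketch in C of how such a header
value could be computed. It only formats the date string in the usual
HTTP-date form. The function names are hypothetical and nothing here is
taken from the actual list-archive software. The weekday is computed
with a standard table-based Gregorian formula so the code does not
depend on the local time zone or locale.

```c
#include <stddef.h>
#include <stdio.h>

/* Day of week for a Gregorian date, 0 = Sunday (table-based formula). */
static int day_of_week(int y, int m, int d)
{
    static const int t[] = { 0, 3, 2, 5, 0, 3, 5, 1, 4, 6, 2, 4 };
    if (m < 3)
        y -= 1;
    return (y + y / 4 - y / 100 + y / 400 + t[m - 1] + d) % 7;
}

/* Write an "Expires:" header for 00:00:00 GMT on the first day of the
   month following (YEAR, MONTH), with MONTH in 1..12.  Returns the
   snprintf result, or -1 if MONTH is out of range.  */
int format_expires_first_of_next_month(int year, int month,
                                       char *buf, size_t len)
{
    static const char *const days[] = {
        "Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"
    };
    static const char *const months[] = {
        "Jan", "Feb", "Mar", "Apr", "May", "Jun",
        "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
    };
    if (month < 1 || month > 12)
        return -1;

    int next_month = month + 1;
    int next_year = year;
    if (next_month > 12) {
        next_month = 1;
        next_year += 1;
    }

    return snprintf(buf, len, "Expires: %s, 01 %s %04d 00:00:00 GMT",
                    days[day_of_week(next_year, next_month, 1)],
                    months[next_month - 1], next_year);
}
```

For an index page generated in December 2009, this would produce a
header expiring at the start of January 2010, so caches would refetch
the index once a new month can appear.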


Cheers,
Nicholas Sherlock