Re: [RFC PATCH, i386]: Allow zero_extended addresses (+ problems with reload and offsetable address, "o" constraint)

2011-08-07 Thread Uros Bizjak
On Fri, Aug 5, 2011 at 8:51 PM, Uros Bizjak  wrote:

> As I read this sentence, the RTX is forced into a temporary register,
> and reload tries to satisfy "o" constraint with plus ((reg ...)
> (const_int ...)), as said at the introduction of "o" constraint a
> couple of pages earlier. Unfortunately, this does not seem to be the
> case.
>
> Is there anything wrong with my approach, or is there something wrong in 
> reload?

To answer my own question, the problem was in *add3_doubleword
pattern, defined as:

(define_insn_and_split "*add3_doubleword"
  [(set (match_operand: 0 "nonimmediate_operand" "=r,o")
        (plus:
          (match_operand: 1 "nonimmediate_operand" "%0,0")
          (match_operand: 2 "" "ro,r")))
   (clobber (reg:CC FLAGS_REG))]
  "ix86_binary_operator_ok (PLUS, mode, operands)"

When reload tried to satisfy alternative 1 (the "o" and matching "0")
with a non-offsettable (in this particular case, zero-extended)
address, it CSE'd operand 0 and operand 1 to a temporary TImode
register. Unfortunately, a TImode move has its own constraints:

(define_insn "*movti_internal_rex64"
  [(set (match_operand:TI 0 "nonimmediate_operand" "=!r,o,x,x,xm")
        (match_operand:TI 1 "general_operand" "riFo,riF,C,xm,x"))]
  "TARGET_64BIT && !(MEM_P (operands[0]) && MEM_P (operands[1]))"

where a move between a general register and non-offsettable memory
is not valid.
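
To illustrate why double-word patterns want an offsettable address, here
is a minimal C sketch of what a split 128-bit add does to memory on a
little-endian 64-bit target: it touches the word at the address and the
word at address + 8. The helper name add_doubleword is hypothetical and
the code is only a model, not what the splitter emits.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: add the 128-bit value (lo, hi) to the 128-bit
   value stored at P, one 64-bit word at a time.  Assumes a
   little-endian layout, as on x86.  */
static void
add_doubleword (unsigned char *p, uint64_t lo, uint64_t hi)
{
  uint64_t mem_lo, mem_hi;

  memcpy (&mem_lo, p, 8);       /* low word at offset 0 */
  memcpy (&mem_hi, p + 8, 8);   /* high word at offset 8 */

  uint64_t sum_lo = mem_lo + lo;
  uint64_t carry = sum_lo < mem_lo;        /* carry out of the low word */
  uint64_t sum_hi = mem_hi + hi + carry;

  memcpy (p, &sum_lo, 8);
  memcpy (p + 8, &sum_hi, 8);
}
```

The second access needs "address plus a small constant" to be a valid
address, which is what the "o" constraint asks for.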

Although it would be nice for reload to subsequently fix up the CSE'd
non-offsettable memory by copying the address to a temporary register
(*as said in the documentation*), we can simply require an XMM
temporary for TImode reloads to/from integer registers. This fixes the
ICE for x32.

The testcase to play with (gcc -O2 -mx32):

--cut here--
void test (__int128 *array, int idx, int off)
{
  __int128 *dest = &array [idx];

  dest[0] += 1;
  dest[off] = 0;
}
--cut here--
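
The testcase only shows the ICE at compile time. For a runtime sanity
check of the generated code, a small checker along these lines could be
used. The name check_test and the chosen values are my own assumptions;
keeping test () in a separate translation unit avoids having -O2 fold
the whole thing away.

```c
/* The testcase from above, unchanged.  */
void test (__int128 *array, int idx, int off)
{
  __int128 *dest = &array [idx];

  dest[0] += 1;
  dest[off] = 0;
}

/* Hypothetical checker: returns 0 if test () stored the expected
   values, nonzero otherwise.  */
int check_test (void)
{
  /* array[1] has all 64 low bits set, so the increment must carry
     into the high word.  */
  __int128 array[4] = { 10, ((__int128) 1 << 64) - 1, 30, 40 };

  test (array, 1, 2);   /* array[1] += 1; array[3] = 0 */

  return !(array[0] == 10
           && array[1] == ((__int128) 1 << 64)
           && array[2] == 30
           && array[3] == 0);
}
```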

So, the following additional patch saves the day:

Index: i386/i386.c
===
--- i386/i386.c (revision 177536)
+++ i386/i386.c (working copy)
@@ -28233,6 +28248,15 @@ ix86_secondary_reload (bool in_p, rtx x, reg_class
   enum machine_mode mode,
   secondary_reload_info *sri ATTRIBUTE_UNUSED)
 {
+  /* Double-word spills from general registers to non-offsettable
+     memory references go through XMM register.  Following code
+     handles zero-extended addresses on x32 target.  */
+  if (TARGET_64BIT
+      && GET_MODE_SIZE (mode) > UNITS_PER_WORD
+      && rclass == GENERAL_REGS
+      && !offsettable_memref_p (x))
+    return SSE_REGS;
+
   /* QImode spills from non-QI registers require
      intermediate register on 32bit targets.  */
   if (!TARGET_64BIT
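
The condition added above can be modelled outside of GCC as a plain
predicate. This is only a sketch: every type and name ending in _M or
_model below is a stand-in invented for illustration, and the word size
of 8 bytes is an assumption for a 64-bit target.

```c
#include <stdbool.h>

/* Stand-ins for GCC's register classes; only the names are borrowed.  */
enum reg_class_model { NO_REGS_M, GENERAL_REGS_M, SSE_REGS_M };

/* Hypothetical description of one reload: the mode size in bytes, the
   register class involved, and whether the memory operand is
   offsettable.  */
struct reload_model
{
  unsigned mode_size;
  enum reg_class_model rclass;
  bool offsettable;
};

#define UNITS_PER_WORD_M 8   /* assumed word size on a 64-bit target */

/* Return the intermediate class required, or NO_REGS_M if none.  */
static enum reg_class_model
secondary_reload_model (bool target_64bit, const struct reload_model *r)
{
  if (target_64bit
      && r->mode_size > UNITS_PER_WORD_M
      && r->rclass == GENERAL_REGS_M
      && !r->offsettable)
    return SSE_REGS_M;

  return NO_REGS_M;
}
```

Only a double-word value, in general registers, with a non-offsettable
memory operand takes the XMM detour; every other combination falls
through to the existing checks.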

Uros.


Re: PING: PATCH: Use int64 for x86 options

2011-08-07 Thread H.J. Lu
On Sat, Aug 6, 2011 at 9:05 AM, H.J. Lu  wrote:
> Ping.  AVX2 support depends on this patch.
>
> Thanks.
>
> On Thu, Aug 4, 2011 at 5:49 PM, H.J. Lu  wrote:
>> On Thu, Aug 4, 2011 at 4:44 PM, H.J. Lu  wrote:
>>> On Thu, Aug 4, 2011 at 3:46 PM, Joseph S. Myers wrote:
>>>> On Thu, 4 Aug 2011, H.J. Lu wrote:
>>>>
>>>>> Here is the updated patch to get proper HOST_WIDE_INT bits and 1
>>>>> through a new file, opt-gen.c.  OK for trunk?
>>>>
>>>> Using another generator program like this can't be the best approach
>>>> (apart from anything else, when built for the build system hwint.h should
>>>> reflect the build system types not the host system types; cf.
>>>>  where I suspected that
>>>> sort of host/build confusion of causing a reported build failure).
>>>>
>>>> You want opth-gen.awk to know the number of bits to give errors.  Note
>>>> that the errors are given by generating #error into the output file.  It's
>>>> easy enough to generate #if conditions into the file that compare with
>>>> HOST_BITS_PER_WIDE_INT.
>>>>
>>>> You want opth-gen.awk to know whether to use 1LL as the shifted constant.
>>>> You can easily enough make hwint.h contain a HOST_WIDE_INT_1 macro,
>>>> defined to 1L or 1LL as appropriate.

>>>
>>>
>>> Here is the updated patch.  OK for trunk?
>>>
>>
>> Small update.  Replace 1LL with HOST_WIDE_INT_1 in  PTA_XXX.
>> OK for trunk?
>>
>> Thanks.
>>
>> --
>> H.J.
>> ---
>> 2011-08-04  H.J. Lu  
>>            Igor Zamyatin 
>>
>>        * hwint.h (HOST_WIDE_INT_1): New.
>>
>>        * opt-functions.awk (switch_bit_fields): Initialize the
>>        host_wide_int field.
>>        (host_wide_int_var_name): New.
>>        (var_type_struct): Check and return HOST_WIDE_INT.
>>
>>        * opt-read.awk: Handle HOST_WIDE_INT for "Variable".
>>
>>        * optc-save-gen.awk: Support HOST_WIDE_INT on var_target_other.
>>
>>        * opth-gen.awk: Use HOST_WIDE_INT_1 on HOST_WIDE_INT.  Properly
>>        check masks for HOST_WIDE_INT.
>>
>>        * opts-common.c (set_option): Support HOST_WIDE_INT Flag_var.
>>
>>        * opts.h (cl_option): Add cl_host_wide_int.  Change var_value
>>        to HOST_WIDE_INT.
>>
>>        * config/i386/i386-c.c (ix86_target_macros_internal): Replace int
>>        with HOST_WIDE_INT for isa_flag.
>>        (ix86_pragma_target_parse): Replace int with HOST_WIDE_INT for
>>        isa variables.
>>
>>        * config/i386/i386.c (ix86_target_string): Replace int with
>>        HOST_WIDE_INT for isa.  Use HOST_WIDE_INT_PRINT to print isa.
>>        (ix86_target_opts): Replace int with HOST_WIDE_INT on mask.
>>        (pta_flags): Removed.
>>        (PTA_XXX): Redefined as (HOST_WIDE_INT_1 << X).
>>        (pta): Use HOST_WIDE_INT on flags.
>>        (builtin_isa): Use HOST_WIDE_INT on isa.
>>        (ix86_add_new_builtins): Likewise.
>>        (def_builtin): Use HOST_WIDE_INT on mask.
>>        (def_builtin_const): Likewise.
>>        (builtin_description): Likewise.
>>
>>        * config/i386/i386.opt (ix86_isa_flags): Replace int with
>>        HOST_WIDE_INT.
>>        (ix86_isa_flags_explicit): Likewise.
>>        (x_ix86_isa_flags_explicit): Likewise.
>>
>
>
>
> --
> H.J.
>

HOST_BITS_PER_WIDE_INT isn't defined when building the target
libraries, so I need to check whether it is defined first.  Here is
the updated patch.
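
As a rough illustration of the idea (not the generated header itself),
the sketch below shows masks built from HOST_WIDE_INT_1 and a guard
that only fires where HOST_BITS_PER_WIDE_INT exists. The definitions at
the top are stand-ins for hwint.h, the OPTION_MASK_* names and bit
numbers are made up, and the exact form of the guard is my assumption.

```c
#include <limits.h>

/* Stand-ins for what hwint.h would provide on a host whose widest
   integer is a 64-bit long long (an assumption, checked below).  */
#define HOST_WIDE_INT long long
#define HOST_BITS_PER_WIDE_INT 64
#define HOST_WIDE_INT_1 1LL

_Static_assert (sizeof (HOST_WIDE_INT) * CHAR_BIT == HOST_BITS_PER_WIDE_INT,
                "stand-in width does not match this host");

/* Hypothetical masks in the style of (HOST_WIDE_INT_1 << X).  */
#define OPTION_MASK_LOW  (HOST_WIDE_INT_1 << 3)
#define OPTION_MASK_HIGH (HOST_WIDE_INT_1 << 40)  /* does not fit in 32 bits */

/* Only complain where the macro is available; code that sees the
   header without hwint.h skips the check.  */
#if defined(HOST_BITS_PER_WIDE_INT) && HOST_BITS_PER_WIDE_INT < 64
# error "too many masks for HOST_WIDE_INT"
#endif

/* Test one mask bit in a HOST_WIDE_INT flag word.  */
static int
mask_is_set (HOST_WIDE_INT flags, HOST_WIDE_INT mask)
{
  return (flags & mask) != 0;
}
```

With a plain int constant, the shift by 40 would overflow, which is why
the shifted constant has to be HOST_WIDE_INT_1.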



-- 
H.J.
2011-08-07  H.J. Lu
            Igor Zamyatin

        * hwint.h (HOST_WIDE_INT_1): New.

        * opt-functions.awk (switch_bit_fields): Initialize the
        host_wide_int field.
        (host_wide_int_var_name): New.
        (var_type_struct): Check and return HOST_WIDE_INT.

        * opt-read.awk: Handle HOST_WIDE_INT for "Variable".

        * optc-save-gen.awk: Support HOST_WIDE_INT on var_target_other.

        * opth-gen.awk: Use HOST_WIDE_INT_1 on HOST_WIDE_INT.  Properly
        check masks for HOST_WIDE_INT.

        * opts-common.c (set_option): Support HOST_WIDE_INT Flag_var.

        * opts.h (cl_option): Add cl_host_wide_int.  Change var_value
        to HOST_WIDE_INT.

        * config/i386/i386-c.c (ix86_target_macros_internal): Replace int
        with HOST_WIDE_INT for isa_flag.
        (ix86_pragma_target_parse): Replace int with HOST_WIDE_INT for
        isa variables.

        * config/i386/i386.c (ix86_target_string): Replace int with
        HOST_WIDE_INT for isa.  Use HOST_WIDE_INT_PRINT to print isa.
        (ix86_target_opts): Replace int with HOST_WIDE_INT on mask.
        (pta_flags): Removed.
        (PTA_XXX): Redefined as (HOST_WIDE_INT_1 << X).
        (pta): Use HOST_WIDE_INT on flags.
        (builtin_isa): Use HOST_WIDE_INT on isa.
        (ix86_add_new_builtins): Likewise.
        (def_builtin): Use HOST_WIDE_INT on mask.
        (def_builtin_const): Likewise.
        (builtin_description): Likewise.

        * config/i386/i386.opt (ix86_isa_flags): Replace int with
        HOST_WIDE_INT.
        (ix86_isa_flags_explicit): Likewise.
(x_ix86