On Thu, 20 Jan 2011 17:27:14 -0800
Richard Henderson <r...@redhat.com> wrote:

> Depending on how Haskell programs are built, it may be better
> to avoid the GOT entirely.  E.g.
> 
>   -mcmodel=large
> 
> a-la the x86_64 port.  This generates full 64-bit absolute
> relocations.  For ia64 code this would look like
> 
>       movl    r32 = foo#
> 
>   Offset          Info           Type           Sym. Value    Sym. Name + Addend
> 000000000002  000400000023 R_IA64_IMM64      0000000000000000 foo + 0
> 
> Of course, you wouldn't put this code into a shared library.
> For that you really would want a 64-bit GPREL offset.  E.g.
> 
>       movl    r32 = @gprel(foo)
>       add     r32 = r32, r1
> 
>   Offset          Info           Type           Sym. Value    Sym. Name + Addend
> 000000000002  00040000002b R_IA64_GPREL64I   0000000000000000 foo + 0
> 
> Since both of these assemble now, really doubt there's any 
> binutils work that needs to be done.
> 
> What you'd have to do is add some command-line switches (and perhaps
> clean up the ones that are there wrt code models), and adjust the
> code in ia64_expand_load_address to handle your new options.  It really
> shouldn't be very difficult.

Aha. Sorry, I'm not familiar with the ia64 instruction set or Linux ABI yet, so
forgive my possibly silly questions.

In ia64_expand_load_address I found the -mauto-pic option, which looks
magical. It seems to do _almost_ exactly what I need.

I've added command-line switch handling (actually stolen from the sparc port)
and tried to reuse the existing TARGET_AUTO_PIC code.

I've attached a rough patch. The comments and tabs-vs-spaces still need cleanup.

As I understand it, TARGET_AUTO_PIC, TARGET_CONSTANT_GP and TARGET_NO_PIC
should somehow map onto different code models, but I don't yet see the exact
difference between them.

i386 distinguishes code models in a more fine-grained manner: CMODEL_* and
CMODEL_*_PIC. Maybe ia64 should make a similar distinction? (A rough C sketch
of the idea follows the list below.)

Something like:
  MEDIUM
  LARGE_CODE
  LARGE_DATA
  LARGE (CODE+DATA)
and
  *_NO_PIC variants
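
Purely as an illustration of what I mean (the names and comments below are
invented by me, not taken from the i386 port or from the attached patch):

    enum ia64_cmodel_kind
    {
      CMODEL_MEDIUM,      /* code and data reached through gp/GOT offsets */
      CMODEL_LARGE_CODE,  /* calls may target arbitrarily far functions */
      CMODEL_LARGE_DATA,  /* data may live arbitrarily far from gp */
      CMODEL_LARGE        /* both of the above */
      /* plus *_NO_PIC variants that would use absolute addresses
         (R_IA64_IMM64) instead of gp-relative ones (R_IA64_GPREL64I).  */
    };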


The patch is tested on the following code sample:

    extern void foo_fun();
    extern int foo_var;
    void bar()
    {
        foo_fun();
        foo_var = 4;
    }

Result:
    a.o:
      Offset          Info           Type           Sym. Value    Sym. Name + Addend
    000000000022  000900000049 R_IA64_PCREL21B   0000000000000000 foo_fun + 0
    000000000031  000a00000086 R_IA64_LTOFF22X   0000000000000000 foo_var + 0
    000000000040  000a00000087 R_IA64_LDXMOV     0000000000000000 foo_var + 0

    a.o.large:
      Offset          Info           Type           Sym. Value    Sym. Name + Addend
    000000000022  000900000049 R_IA64_PCREL21B   0000000000000000 foo_fun + 0
    000000000042  000a0000002b R_IA64_GPREL64I   0000000000000000 foo_var + 0

There is one more theoretical problem.
Or not so theoretical, as GHC's binary is about 16 MB (and still growing).

To actually live up to the meaning of the word 'cmodel' (code model), we
should also implement function calls to arbitrarily far offsets.

AFAIU the 21-bit signed bundle displacement of R_IA64_PCREL21B won't let us
make calls more than ~16 MB away in either direction.
What kind of calls should be emitted in this case? call_gp/call_value_gp?
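
Just to illustrate what I have in mind (my own sketch, not something the
attached patch emits yet): in the non-PIC case, where caller and callee share
the same gp, a far call could simply load the full 64-bit entry address and
branch indirectly:

        movl    r14 = foo_fun#          // R_IA64_IMM64: full 64-bit entry address
        ;;
        mov     b6 = r14
        ;;
        br.call.sptk.many b0 = b6       // indirect call, no +/-16 MB branch limit

A PIC flavour would presumably load @gprel(foo_fun#) and add r1 instead, and a
call through a function descriptor (as call_gp/call_value_gp do, if I read
ia64.md correctly) would additionally have to save and restore gp around the
call.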

Thanks!

-- 

  Sergei
diff --git a/gcc/config/ia64/ia64.c b/gcc/config/ia64/ia64.c
index 1842555..41bd287 100644
--- a/gcc/config/ia64/ia64.c
+++ b/gcc/config/ia64/ia64.c
@@ -45,40 +45,43 @@ along with GCC; see the file COPYING3.  If not see
 #include "diagnostic-core.h"
 #include "sched-int.h"
 #include "timevar.h"
 #include "target.h"
 #include "target-def.h"
 #include "tm_p.h"
 #include "hashtab.h"
 #include "langhooks.h"
 #include "cfglayout.h"
 #include "gimple.h"
 #include "intl.h"
 #include "df.h"
 #include "debug.h"
 #include "params.h"
 #include "dbgcnt.h"
 #include "tm-constrs.h"
 #include "sel-sched.h"
 #include "reload.h"
 #include "dwarf2out.h"
 
+/* The currently selected code model.  */
+enum cmodel ia64_cmodel;
+
 /* This is used for communication between ASM_OUTPUT_LABEL and
    ASM_OUTPUT_LABELREF.  */
 int ia64_asm_output_label = 0;
 
 /* Register names for ia64_expand_prologue.  */
 static const char * const ia64_reg_numbers[96] =
 { "r32", "r33", "r34", "r35", "r36", "r37", "r38", "r39",
   "r40", "r41", "r42", "r43", "r44", "r45", "r46", "r47",
   "r48", "r49", "r50", "r51", "r52", "r53", "r54", "r55",
   "r56", "r57", "r58", "r59", "r60", "r61", "r62", "r63",
   "r64", "r65", "r66", "r67", "r68", "r69", "r70", "r71",
   "r72", "r73", "r74", "r75", "r76", "r77", "r78", "r79",
   "r80", "r81", "r82", "r83", "r84", "r85", "r86", "r87",
   "r88", "r89", "r90", "r91", "r92", "r93", "r94", "r95",
   "r96", "r97", "r98", "r99", "r100","r101","r102","r103",
   "r104","r105","r106","r107","r108","r109","r110","r111",
   "r112","r113","r114","r115","r116","r117","r118","r119",
   "r120","r121","r122","r123","r124","r125","r126","r127"};
 
 /* ??? These strings could be shared with REGISTER_NAMES.  */
@@ -1022,41 +1025,45 @@ ia64_cannot_force_const_mem (rtx x)
 /* Expand a symbolic constant load.  */
 
 bool
 ia64_expand_load_address (rtx dest, rtx src)
 {
   gcc_assert (GET_CODE (dest) == REG);
 
   /* ILP32 mode still loads 64-bits of data from the GOT.  This avoids
      having to pointer-extend the value afterward.  Other forms of address
      computation below are also more natural to compute as 64-bit quantities.
      If we've been given an SImode destination register, change it.  */
   if (GET_MODE (dest) != Pmode)
     dest = gen_rtx_REG_offset (dest, Pmode, REGNO (dest),
 			       byte_lowpart_offset (Pmode, GET_MODE (dest)));
 
   if (TARGET_NO_PIC)
     return false;
   if (small_addr_symbolic_operand (src, VOIDmode))
     return false;
 
-  if (TARGET_AUTO_PIC)
+  /* TODO:
+     With CMODEL_LARGE && TARGET_NO_PIC we should use
+     'movl r32 = foo#' (R_IA64_IMM64) instead of the
+     gp-relative load.  */
+  if (TARGET_AUTO_PIC || ia64_cmodel == CMODEL_LARGE)
     emit_insn (gen_load_gprel64 (dest, src));
   else if (GET_CODE (src) == SYMBOL_REF && SYMBOL_REF_FUNCTION_P (src))
     emit_insn (gen_load_fptr (dest, src));
   else if (sdata_symbolic_operand (src, VOIDmode))
     emit_insn (gen_load_gprel (dest, src));
   else
     {
       HOST_WIDE_INT addend = 0;
       rtx tmp;
 
       /* We did split constant offsets in ia64_expand_move, and we did try
 	 to keep them split in move_operand, but we also allowed reload to
 	 rematerialize arbitrary constants rather than spill the value to
 	 the stack and reload it.  So we have to be prepared here to split
 	 them apart again.  */
       if (GET_CODE (src) == CONST)
 	{
 	  HOST_WIDE_INT hi, lo;
 
 	  hi = INTVAL (XEXP (XEXP (src, 0), 1));
@@ -5758,40 +5765,64 @@ ia64_handle_option (size_t code, const char *arg, int value)
 	  if (!strcmp (arg, processor_alias_table[i].name))
 	    {
 	      ia64_tune = processor_alias_table[i].processor;
 	      break;
 	    }
 	if (i == pta_size)
 	  error ("bad value %<%s%> for -mtune= switch", arg);
 	return true;
       }
 
     default:
       return true;
     }
 }
 
 /* Implement TARGET_OPTION_OVERRIDE.  */
 
 static void
 ia64_option_override (void)
 {
+  static struct code_model {
+    const char *const name;
+    const enum cmodel value;
+  } const cmodels[] = {
+    { "medium", CMODEL_MEDIUM },
+    { "large",  CMODEL_LARGE },
+    { NULL, (enum cmodel) 0 }
+  };
+  const struct code_model *cmodel;
+
+  /* Code model selection.  */
+  ia64_cmodel = CMODEL_MEDIUM;
+
+  if (ia64_cmodel_string != NULL)
+    {
+      for (cmodel = &cmodels[0]; cmodel->name; cmodel++)
+        if (strcmp (ia64_cmodel_string, cmodel->name) == 0)
+          break;
+      if (cmodel->name == NULL)
+        error ("bad value (%s) for -mcmodel= switch", ia64_cmodel_string);
+      else
+        ia64_cmodel = cmodel->value;
+    }
+
   if (TARGET_AUTO_PIC)
     target_flags |= MASK_CONST_GP;
 
   /* Numerous experiment shows that IRA based loop pressure
      calculation works better for RTL loop invariant motion on targets
      with enough (>= 32) registers.  It is an expensive optimization.
      So it is on only for peak performance.  */
   if (optimize >= 3)
     flag_ira_loop_pressure = 1;
 
 
   ia64_section_threshold = (global_options_set.x_g_switch_value
 			    ? g_switch_value
 			    : IA64_DEFAULT_GVALUE);
 
   init_machine_status = ia64_init_machine_status;
 
   if (align_functions <= 0)
     align_functions = 64;
   if (align_loops <= 0)
@@ -10855,46 +10886,50 @@ ia64_output_function_profiler (FILE *file, int labelno)
     {
       gcc_assert (STATIC_CHAIN_REGNUM == 15);
       indirect_call = true;
     }
   else
     indirect_call = false;
 
   if (TARGET_GNU_AS)
     fputs ("\t.prologue 4, r40\n", file);
   else
     fputs ("\t.prologue\n\t.save ar.pfs, r40\n", file);
   fputs ("\talloc out0 = ar.pfs, 8, 0, 4, 0\n", file);
 
   if (NO_PROFILE_COUNTERS)
     fputs ("\tmov out3 = r0\n", file);
   else
     {
       char buf[20];
       ASM_GENERATE_INTERNAL_LABEL (buf, "LP", labelno);
 
-      if (TARGET_AUTO_PIC)
-	fputs ("\tmovl out3 = @gprel(", file);
+      /* TODO:
+         With CMODEL_LARGE && TARGET_NO_PIC we should use
+         'movl r32 = foo#' (R_IA64_IMM64) instead of the
+         @gprel form.  */
+      if (TARGET_AUTO_PIC || ia64_cmodel == CMODEL_LARGE)
+	fputs ("\tmovl out3 = @gprel(", file);
       else
 	fputs ("\taddl out3 = @ltoff(", file);
       assemble_name (file, buf);
-      if (TARGET_AUTO_PIC)
+      if (TARGET_AUTO_PIC || ia64_cmodel == CMODEL_LARGE)
 	fputs (")\n", file);
       else
 	fputs ("), r1\n", file);
     }
 
   if (indirect_call)
     fputs ("\taddl r14 = @ltoff(@fptr(_mcount)), r1\n", file);
   fputs ("\t;;\n", file);
 
   fputs ("\t.save rp, r42\n", file);
   fputs ("\tmov out2 = b0\n", file);
   if (indirect_call)
     fputs ("\tld8 r14 = [r14]\n\t;;\n", file);
   fputs ("\t.body\n", file);
   fputs ("\tmov out1 = r1\n", file);
   if (indirect_call)
     {
       fputs ("\tld8 r16 = [r14], 8\n\t;;\n", file);
       fputs ("\tmov b6 = r16\n", file);
       fputs ("\tld8 r1 = [r14]\n", file);
diff --git a/gcc/config/ia64/ia64.h b/gcc/config/ia64/ia64.h
index 8e6d298..23869fb 100644
--- a/gcc/config/ia64/ia64.h
+++ b/gcc/config/ia64/ia64.h
@@ -66,40 +66,48 @@ extern unsigned int ia64_section_threshold;
 #define TARGET_HAVE_TLS true
 #endif
 
 #define TARGET_TLS14		(ia64_tls_size == 14)
 #define TARGET_TLS22		(ia64_tls_size == 22)
 #define TARGET_TLS64		(ia64_tls_size == 64)
 
 #define TARGET_HPUX		0
 #define TARGET_HPUX_LD		0
 
 #define TARGET_ABI_OPEN_VMS 0
 
 #ifndef TARGET_ILP32
 #define TARGET_ILP32 0
 #endif
 
 #ifndef HAVE_AS_LTOFFX_LDXMOV_RELOCS
 #define HAVE_AS_LTOFFX_LDXMOV_RELOCS 0
 #endif
 
+enum cmodel {
+  CMODEL_MEDIUM, /* GOT offsets are limited to 22 bits (4MB).  */
+  CMODEL_LARGE   /* No assumptions about code and data segment sizes.  */
+};
+
+/* The currently selected code model.  */
+extern enum cmodel ia64_cmodel;
+
 /* Values for TARGET_INLINE_FLOAT_DIV, TARGET_INLINE_INT_DIV, and
    TARGET_INLINE_SQRT.  */
 
 enum ia64_inline_type
 {
   INL_NO = 0,
   INL_MIN_LAT = 1,
   INL_MAX_THR = 2
 };
 
 /* Default target_flags if no switches are specified  */
 
 #ifndef TARGET_DEFAULT
 #define TARGET_DEFAULT (MASK_DWARF2_ASM)
 #endif
 
 #ifndef TARGET_CPU_DEFAULT
 #define TARGET_CPU_DEFAULT 0
 #endif
 
diff --git a/gcc/config/ia64/ia64.opt b/gcc/config/ia64/ia64.opt
index 49d099a..1535556 100644
--- a/gcc/config/ia64/ia64.opt
+++ b/gcc/config/ia64/ia64.opt
@@ -42,40 +42,44 @@ Use in/loc/out register names
 
 mno-sdata
 Target Report RejectNegative Mask(NO_SDATA)
 
 msdata
 Target Report RejectNegative InverseMask(NO_SDATA)
 Enable use of sdata/scommon/sbss
 
 mno-pic
 Target Report RejectNegative Mask(NO_PIC)
 Generate code without GP reg
 
 mconstant-gp
 Target Report RejectNegative Mask(CONST_GP)
 gp is constant (but save/restore gp on indirect calls)
 
 mauto-pic
 Target Report RejectNegative Mask(AUTO_PIC)
 Generate self-relocatable code
 
+mcmodel=
+Target RejectNegative Joined Var(ia64_cmodel_string)
+Use given ia64 code model
+
 minline-float-divide-min-latency
 Target Report RejectNegative Var(TARGET_INLINE_FLOAT_DIV, 1)
 Generate inline floating point division, optimize for latency
 
 minline-float-divide-max-throughput
 Target Report RejectNegative Var(TARGET_INLINE_FLOAT_DIV, 2) Init(2)
 Generate inline floating point division, optimize for throughput
 
 mno-inline-float-divide
 Target Report RejectNegative Var(TARGET_INLINE_FLOAT_DIV, 0)
 
 minline-int-divide-min-latency
 Target Report RejectNegative Var(TARGET_INLINE_INT_DIV, 1)
 Generate inline integer division, optimize for latency
 
 minline-int-divide-max-throughput
 Target Report RejectNegative Var(TARGET_INLINE_INT_DIV, 2)
 Generate inline integer division, optimize for throughput
 
 mno-inline-int-divide
diff --git a/gcc/config/ia64/predicates.md b/gcc/config/ia64/predicates.md
index e06c521..bf227b7 100644
--- a/gcc/config/ia64/predicates.md
+++ b/gcc/config/ia64/predicates.md
@@ -102,41 +102,41 @@
       op = XEXP (op, 0);
       if (GET_CODE (op) != PLUS
 	  || GET_CODE (XEXP (op, 0)) != SYMBOL_REF
 	  || GET_CODE (XEXP (op, 1)) != CONST_INT)
 	return false;
       op = XEXP (op, 0);
       /* FALLTHRU */
 
     case SYMBOL_REF:
       return SYMBOL_REF_SMALL_ADDR_P (op);
 
     default:
       gcc_unreachable ();
     }
 })
 
 ;; True if OP refers to a symbol with which we may use any offset.
 (define_predicate "any_offset_symbol_operand"
   (match_code "symbol_ref")
 {
-  if (TARGET_NO_PIC || TARGET_AUTO_PIC)
+  if (TARGET_NO_PIC || TARGET_AUTO_PIC || ia64_cmodel == CMODEL_LARGE)
     return true;
   if (SYMBOL_REF_SMALL_ADDR_P (op))
     return true;
   if (SYMBOL_REF_FUNCTION_P (op))
     return false;
   if (sdata_symbolic_operand (op, mode))
     return true;
   return false;
 })
 
 ;; True if OP refers to a symbol with which we may use 14-bit aligned offsets.
 ;; False if OP refers to a symbol with which we may not use any offset at any
 ;; time.
 (define_predicate "aligned_offset_symbol_operand"
   (and (match_code "symbol_ref")
        (match_test "! SYMBOL_REF_FUNCTION_P (op)")))
 
 ;; True if OP refers to a symbol, and is appropriate for a GOT load.
 (define_predicate "got_symbolic_operand" 
   (match_operand 0 "symbolic_operand" "")
