On Wed, Oct 24, 2012 at 3:42 AM, Joern Rennecke
<[email protected]> wrote:
> Quoting Richard Biener <[email protected]>:
>
>> On Tue, Oct 16, 2012 at 9:35 PM, Joern Rennecke
>> <[email protected]> wrote:
>
> ..
>>>
>>> Well, we could split it anyway, and give ports without the need for
>>> multiple length attributes the benefit of the optimistic algorithm.
>>>
>>> I have attached a patch that implements this.
>>
>>
>> Looks reasonable to me, though I'm not familiar enough with the code
>> to approve it.
>
>
> Now that Richard Sandiford has reviewed that split-off part and it's in
> the source tree, we can return to the remaining functionality needed
> by for the ARC port.
>
>> I'd strongly suggest to try harder to make things work for you without
>> the new attribute even though I wasn't really able to follow your
>> reasoning
>> on why that wouldn't work. It may be easier to motivate this change
>> once the port is in without that attribute so one can actually look at
>> the machine description and port details.
>
>
> Well, it doesn't simply drop in with the existing branch shortening -
> libgcc won't compile because of out-of-range branches.
> I tried to lump the length and lock_length atribute together, and that
> just gives genattrtab indigestion. It sits there looping forever.
> I could start debugging this, but that would take an unknown amount of
> time, and then the review of the fix would take an unknown amount of time,
> and then the ARC port probably needs fixing up again because it just
> doesn't work right with these fudged lengths. And even if we could get
> everything required in before the close of phase 1, the branch shortening
> would be substandard.
> It seems more productive to get the branch shortening working now.
> The lock_length atrtibute is completely optional, so no port maintainer
> would be forced to use the functionality if it's not desired.
>
> The issue is that the some instructions need to be aligned or unaligned
> for performance or in a few cases even for correctness. Just inserting
> random nops would be a waste of code space and cycles, since most of the
> time, the desired (mis)alignment can be archived by selectively making
> a short instruction long. If an instruction that is long once was forced
> to stay long henceforth, that would actually defeat the purpose of getting
> the desired alignment. Then another short instruction - if it can be found
> -
> would need to be upsized. So a size increase could ripple through as
> alignments are distorted. The natural thing to do is really when the
> alignemnt changes is really to let the upsized instruction be short again.
> Only length-determined branch sizes have to be locked to avoid cycles.
Just to add some extra information, can you quote your ports
ADJUST_INSN_LENGTH and one example instruction with length/lock_length
attribute the above applies to?
> This is the documentation for the new role of the lock_length attribute
> (reduced from my previous attempt):
>
> @cindex lock_length
> Usually, when doing optimizing branch shortening, the instruction length
> is calculated by avaluating the @code{length} attribute, applying
> @code{ADJUST_INSN_LENGTH}, and taking the maximum of the resultant
> value and the length of the instruction in the previous iteration.
Which sounds straight-forward. The docs of ADJUST_INSN_LENGTH
are not entirely clear, but it sounds like it may only increase length, correct?
I see that ADJUST_INSN_LENGTH is really not needed as everything
should be expressable in the length attribute of an instruction?
> If you define the @code{lock_length} attribute, the @code{lock_length}
> attribute will be evaluated, and then the maximum with of @code{lock_length}
with of? I read it as 'of the'
> value from the previous iteration will be formed and saved.
So lock_length will only increase during iteration.
> Then the maximum of that value with the @code{length} attribute will
> be formed, and @code{ADJUST_INSN_LENGTH} will be applied.
ADJUST_INSN_LENGTH will be applied to the maximum? What will
be the 'instruction length' equivalent to the simple non-lock_length case?
Is it the above, max (ADJUST_INSN_LENGTH (lock-length-max, length))?
> Thus, you can allow the length to vary downwards as well as upwards
> across iterations with suitable definitions of the @code{length} attribute
> and/or @code{ADJUST_INSN_LENGTH}. Care has to be taken that this does not
> lead to infinite loops.
I don't see that you can shrink length with just suitable lock_length and
length attributes. What seems to be the cruical difference is that you
apply ADJUST_INSN_LENGTH after combining lock-length-max and length.
But then you _decrease_ length with ADJUST_INSN_LENGHT ...
Maybe what you really want is ADJUST_INSN_LENGTH_AFTER which is
applied afterwards? Thus,
> Usually, when doing optimizing branch shortening, the instruction length
> is calculated by avaluating the @code{length} attribute, applying
> @code{ADJUST_INSN_LENGTH}, and taking the maximum of the resultant
> value and the length of the instruction in the previous iteration.
> If @code{ADJUST_INSN_LENGTH_AFTER} is defined it is applied to the
> resulting instruction length.
?
Note that
> Care has to be taken that this does not
> lead to infinite loops.
doesn't convince me that is properly designed ;)
Thanks,
Richard.
>
> The new patch builds on this patch:
> http://gcc.gnu.org/ml/gcc-patches/2012-10/msg01890.html
> as a prerequisite.
>
> build tested with libraries in revision 192654 for i686-pc-linux-gnu X
> mipsel-elf .
> bootstrapped in revision 192703 on i686-pc-linux-gnu;
> I've also successfully run config-list.mk with the patch applied to this
> revision. The following ports had pre-existing failures, which are
> documented in the sub-PRs or PR47093/PR44756:
>
> alpha64-dec-vms alpha-dec-vms am33_2.0-linux arm-netbsdelf arm-wrs-vxworks
> avr-elf avr-rtems c6x-elf c6x-uclinux cr16-elf fr30-elf
> i686-interix3 --enable-obsolete i686-openbsd3.0 i686-pc-msdosdjgpp
> ia64-hp-vms iq2000-elf lm32-elf lm32-rtems lm32-uclinux mep-elf
> microblaze-elf microblaze-linux mn10300-elf moxie-elf moxie-rtems
> moxie-uclinux pdp11-aout picochip-elf --enable-obsolete rl78-elf
> rx-elf score-elf --enable-obsolete sh5el-netbsd sh64-elf --with-newlib
> sh64-linux sh64-netbsd sh-elf shle-linux sh-netbsdelf sh-rtems sh-superh-elf
> sh-wrs-vxworks tilegx-linux-gnu tilepro-linux-gnu vax-linux-gnu
> vax-netbsdelf
> vax-openbsd x86_64-knetbsd-gnu
>
>
> I'll be posting the ARC port shortly; it does not fit into a single 100 KB
> posting, so I'm thinking of splitting it in a configury patch and zx
> compressed files/tarballs for arc.c, arc.md, libgcc, and the rest of the
> port.
>
> 2012-10-22 Joern Rennecke <[email protected]>
>
> * doc/md.texi (node Defining Attributes): Add lock_length to
> table of special attributes.
> (node Insn Lengths): Document lock_length attribute.
> * final.c (uid_lock_length): New variable.
> (insn_max_length, get_attr_lock_length): New functions.
> (get_attr_length): Use insn_max_length.
> (get_attr_length_1): Use direct recursion rather than
> calling get_attr_length.
> (INSN_VARIABLE_LENGTH_P): Define.
> (shorten_branches): Implement lock_length attribute functionality.
> * genattrtab.c (lock_length_str): New variable.
> (make_length_attrs): New parameter base.
> (main): Initialize lock_length_str.
> Generate lock_lengths attributes.
> * genattr.c (gen_attr): Emit declarations for lock_length attribute
> related functions.
> (main): Emit stub defines for lock_length attribute related
> functions.
>
> Index: doc/md.texi
> ===================================================================
> --- doc/md.texi (revision 2660)
> +++ doc/md.texi (working copy)
> @@ -7515,6 +7515,9 @@ extern enum attr_type get_attr_type ();
> code chunks. This is especially important when verifying branch
> distances. @xref{Insn Lengths}.
>
> +@item lock_length
> +The @code{lock_length} attribute.
> +
> @item enabled
> The @code{enabled} attribute can be defined to prevent certain
> alternatives of an insn definition from being used during code
> @@ -8038,6 +8041,22 @@ (define_insn "jump"
> (const_int 6)))])
> @end smallexample
>
> +@cindex lock_length
> +Usually, when doing optimizing branch shortening, the instruction length
> +is calculated by avaluating the @code{length} attribute, applying
> +@code{ADJUST_INSN_LENGTH}, and taking the maximum of the resultant
> +value and the length of the instruction in the previous iteration.
> +
> +If you define the @code{lock_length} attribute, the @code{lock_length}
> +attribute will be evaluated, and then the maximum with of
> @code{lock_length}
> +value from the previous iteration will be formed and saved.
> +Then the maximum of that value with the @code{length} attribute will
> +be formed, and @code{ADJUST_INSN_LENGTH} will be applied.
> +Thus, you can allow the length to vary downwards as well as upwards
> +across iterations with suitable definitions of the @code{length} attribute
> +and/or @code{ADJUST_INSN_LENGTH}. Care has to be taken that this does not
> +lead to infinite loops.
> +
> @end ifset
> @ifset INTERNALS
> @node Constant Attributes
> Index: final.c
> ===================================================================
> --- final.c (revision 2660)
> +++ final.c (working copy)
> @@ -308,6 +308,7 @@ dbr_sequence_length (void)
> `insn_current_length'. */
>
> static int *insn_lengths;
> +static int *uid_lock_length;
>
> VEC(int,heap) *insn_addresses_;
>
> @@ -430,14 +431,41 @@ get_attr_length_1 (rtx insn, int (*fallb
> return length;
> }
>
> +/* Calculate the maximum length of INSN. */
> +static int
> +insn_max_length (rtx insn)
> +{
> + int length, lock_length;
> +
> + length = insn_default_length (insn);
> + if (HAVE_ATTR_lock_length)
> + {
> + lock_length = insn_default_lock_length (insn);
> + if (length < lock_length)
> + length = lock_length;
> + }
> + return length;
> +}
> +
> /* Obtain the current length of an insn. If branch shortening has been
> done,
> get its actual length. Otherwise, get its maximum length. */
> int
> get_attr_length (rtx insn)
> {
> - return get_attr_length_1 (insn, insn_default_length);
> + return get_attr_length_1 (insn, insn_max_length);
> }
>
> +int
> +get_attr_lock_length (rtx insn)
> +{
> + if (uid_lock_length && insn_lengths_max_uid > INSN_UID (insn))
> + return uid_lock_length[INSN_UID (insn)];
> + return get_attr_length_1 (insn, insn_min_lock_length);
> +}
> +
> +#define INSN_VARIABLE_LENGTH_P(INSN) \
> + (insn_variable_length_p (INSN) || insn_variable_lock_length_p (INSN))
> +
> /* Obtain the current length of an insn. If branch shortening has been
> done,
> get its actual length. Otherwise, get its minimum length. */
> int
> @@ -966,6 +994,11 @@ shorten_branches (rtx first)
> /* Allocate the rest of the arrays. */
> insn_lengths = XNEWVEC (int, max_uid);
> insn_lengths_max_uid = max_uid;
> + if (HAVE_ATTR_lock_length)
> + uid_lock_length = XCNEWVEC (int, max_uid);
> + else
> + uid_lock_length = insn_lengths;
> +
> /* Syntax errors can lead to labels being outside of the main insn
> stream.
> Initialize insn_addresses, so that we get reproducible results. */
> INSN_ADDRESSES_ALLOC (max_uid);
> @@ -1064,7 +1097,7 @@ shorten_branches (rtx first)
> #endif /* CASE_VECTOR_SHORTEN_MODE */
>
> /* Compute initial lengths, addresses, and varying flags for each insn.
> */
> - int (*length_fun) (rtx) = increasing ? insn_min_length :
> insn_default_length;
> + int (*length_fun) (rtx) = increasing ? insn_min_length : insn_max_length;
>
> for (insn_current_address = 0, insn = first;
> insn != 0;
> @@ -1106,7 +1139,7 @@ shorten_branches (rtx first)
> /* Alignment is handled by ADDR_VEC_ALIGN. */
> }
> else if (GET_CODE (body) == ASM_INPUT || asm_noperands (body) >= 0)
> - insn_lengths[uid] = asm_insn_count (body) * insn_default_length
> (insn);
> + insn_lengths[uid] = asm_insn_count (body) * insn_max_length (insn);
> else if (GET_CODE (body) == SEQUENCE)
> {
> int i;
> @@ -1117,7 +1150,7 @@ shorten_branches (rtx first)
> const_delay_slots = 0;
> #endif
> int (*inner_length_fun) (rtx)
> - = const_delay_slots ? length_fun : insn_default_length;
> + = const_delay_slots ? length_fun : insn_max_length;
> /* Inside a delay slot sequence, we do not do any branch
> shortening
> if the shortening could change the number of delay slots
> of the branch. */
> @@ -1130,7 +1163,7 @@ shorten_branches (rtx first)
> if (GET_CODE (body) == ASM_INPUT
> || asm_noperands (PATTERN (XVECEXP (body, 0, i))) >= 0)
> inner_length = (asm_insn_count (PATTERN (inner_insn))
> - * insn_default_length (inner_insn));
> + * insn_max_length (inner_insn));
> else
> inner_length = inner_length_fun (inner_insn);
>
> @@ -1138,7 +1171,7 @@ shorten_branches (rtx first)
> if (const_delay_slots)
> {
> if ((varying_length[inner_uid]
> - = insn_variable_length_p (inner_insn)) != 0)
> + = INSN_VARIABLE_LENGTH_P (inner_insn)) != 0)
> varying_length[uid] = 1;
> INSN_ADDRESSES (inner_uid) = (insn_current_address
> + insn_lengths[uid]);
> @@ -1151,7 +1184,7 @@ shorten_branches (rtx first)
> else if (GET_CODE (body) != USE && GET_CODE (body) != CLOBBER)
> {
> insn_lengths[uid] = length_fun (insn);
> - varying_length[uid] = insn_variable_length_p (insn);
> + varying_length[uid] = INSN_VARIABLE_LENGTH_P (insn);
> }
>
> /* If needed, do any adjustment. */
> @@ -1354,7 +1387,7 @@ shorten_branches (rtx first)
> {
> rtx inner_insn = XVECEXP (body, 0, i);
> int inner_uid = INSN_UID (inner_insn);
> - int inner_length;
> + int inner_length, lock_length;
>
> INSN_ADDRESSES (inner_uid) = insn_current_address;
>
> @@ -1365,16 +1398,23 @@ shorten_branches (rtx first)
> else
> inner_length = insn_current_length (inner_insn);
>
> - if (inner_length != insn_lengths[inner_uid])
> + lock_length
> + = (HAVE_ATTR_lock_length
> + ? insn_current_lock_length (inner_insn) :
> inner_length);
> + if (lock_length != uid_lock_length[inner_uid])
> {
> - if (!increasing || inner_length >
> insn_lengths[inner_uid])
> + if (!increasing
> + || lock_length > uid_lock_length[inner_uid])
> {
> - insn_lengths[inner_uid] = inner_length;
> + uid_lock_length[inner_uid] = lock_length;
> something_changed = 1;
> }
> else
> - inner_length = insn_lengths[inner_uid];
> + lock_length = uid_lock_length[inner_uid];
> }
> + if (inner_length < lock_length)
> + inner_length = lock_length;
> + insn_lengths[inner_uid] = inner_length;
> insn_current_address += inner_length;
> new_length += inner_length;
> }
> @@ -1382,6 +1422,17 @@ shorten_branches (rtx first)
> else
> {
> new_length = insn_current_length (insn);
> + if (HAVE_ATTR_lock_length)
> + {
> + int lock_length = insn_current_lock_length (insn);
> +
> + if (!increasing || lock_length > uid_lock_length[uid])
> + uid_lock_length[uid] = lock_length;
> + else
> + lock_length = uid_lock_length[uid];
> + if (new_length < lock_length)
> + new_length = lock_length;
> + }
> insn_current_address += new_length;
> }
>
> @@ -1393,7 +1444,8 @@ shorten_branches (rtx first)
> #endif
>
> if (new_length != insn_lengths[uid]
> - && (!increasing || new_length > insn_lengths[uid]))
> + && (!increasing || HAVE_ATTR_lock_length
> + || new_length > insn_lengths[uid]))
> {
> insn_lengths[uid] = new_length;
> something_changed = 1;
> @@ -1407,6 +1459,11 @@ shorten_branches (rtx first)
> }
>
> free (varying_length);
> + if (HAVE_ATTR_lock_length)
> + {
> + free (uid_lock_length);
> + uid_lock_length = 0;
> + }
> }
>
> /* Given the body of an INSN known to be generated by an ASM statement,
> return
> Index: genattr.c
> ===================================================================
> --- genattr.c (revision 2660)
> +++ genattr.c (working copy)
> @@ -71,6 +71,10 @@ extern int insn_default_length (rtx);\n\
> extern int insn_min_length (rtx);\n\
> extern int insn_variable_length_p (rtx);\n\
> extern int insn_current_length (rtx);\n\n\
> +extern int insn_default_lock_length (rtx);\n\
> +extern int insn_min_lock_length (rtx);\n\
> +extern int insn_variable_lock_length_p (rtx);\n\
> +extern int insn_current_lock_length (rtx);\n\n\
> #include \"insn-addr.h\"\n");
> }
> }
> @@ -337,7 +341,8 @@ main (int argc, char **argv)
> }
>
> /* Special-purpose atributes should be tested with if, not #ifdef. */
> - const char * const special_attrs[] = { "length", "enabled", 0 };
> + const char * const special_attrs[]
> + = { "length", "lock_length", "enabled", 0 };
> for (const char * const *p = special_attrs; *p; p++)
> {
> printf ("#ifndef HAVE_ATTR_%s\n"
> @@ -346,13 +351,19 @@ main (int argc, char **argv)
> }
> /* We make an exception here to provide stub definitions for
> insn_*_length* functions. */
> - puts ("#if !HAVE_ATTR_length\n"
> - "extern int hook_int_rtx_0 (rtx);\n"
> + puts ("extern int hook_int_rtx_0 (rtx);\n"
> + "#if !HAVE_ATTR_length\n"
> "#define insn_default_length hook_int_rtx_0\n"
> "#define insn_min_length hook_int_rtx_0\n"
> "#define insn_variable_length_p hook_int_rtx_0\n"
> "#define insn_current_length hook_int_rtx_0\n"
> "#include \"insn-addr.h\"\n"
> + "#endif\n"
> + "#if !HAVE_ATTR_lock_length\n"
> + "#define insn_default_lock_length hook_int_rtx_0\n"
> + "#define insn_min_lock_length hook_int_rtx_0\n"
> + "#define insn_variable_lock_length_p hook_int_rtx_0\n"
> + "#define insn_current_lock_length hook_int_rtx_0\n"
> "#endif\n");
>
> /* Output flag masks for use by reorg.
> Index: genattrtab.c
> ===================================================================
> --- genattrtab.c (revision 2660)
> +++ genattrtab.c (working copy)
> @@ -242,6 +242,7 @@ struct attr_value_list **insn_code_value
>
> static const char *alternative_name;
> static const char *length_str;
> +static const char *lock_length_str;
> static const char *delay_type_str;
> static const char *delay_1_0_str;
> static const char *num_delay_slots_str;
> @@ -1541,14 +1542,14 @@ substitute_address (rtx exp, rtx (*no_ad
> */
>
> static void
> -make_length_attrs (void)
> +make_length_attrs (const char **base)
> {
> static const char *new_names[] =
> {
> - "*insn_default_length",
> - "*insn_min_length",
> - "*insn_variable_length_p",
> - "*insn_current_length"
> + "*insn_default_%s",
> + "*insn_min_%s",
> + "*insn_variable_%s_p",
> + "*insn_current_%s"
> };
> static rtx (*const no_address_fn[]) (rtx)
> = {identity_fn,identity_fn, zero_fn, zero_fn};
> @@ -1561,7 +1562,7 @@ make_length_attrs (void)
>
> /* See if length attribute is defined. If so, it must be numeric. Make
> it special so we don't output anything for it. */
> - length_attr = find_attr (&length_str, 0);
> + length_attr = find_attr (base, 0);
> if (length_attr == 0)
> return;
>
> @@ -1574,11 +1575,14 @@ make_length_attrs (void)
> /* Make each new attribute, in turn. */
> for (i = 0; i < ARRAY_SIZE (new_names); i++)
> {
> - make_internal_attr (new_names[i],
> + const char *p = attr_printf (strlen (new_names[i]) - 2 + strlen
> (*base),
> + new_names[i], *base);
> +
> + make_internal_attr (p,
> substitute_address
> (length_attr->default_val->value,
> no_address_fn[i],
> address_fn[i]),
> ATTR_NONE);
> - new_attr = find_attr (&new_names[i], 0);
> + new_attr = find_attr (&p, 0);
> for (av = length_attr->first_value; av; av = av->next)
> for (ie = av->first_insn; ie; ie = ie->next)
> {
> @@ -5179,6 +5183,7 @@ main (int argc, char **argv)
>
> alternative_name = DEF_ATTR_STRING ("alternative");
> length_str = DEF_ATTR_STRING ("length");
> + lock_length_str = DEF_ATTR_STRING ("lock_length");
> delay_type_str = DEF_ATTR_STRING ("*delay_type");
> delay_1_0_str = DEF_ATTR_STRING ("*delay_1_0");
> num_delay_slots_str = DEF_ATTR_STRING ("*num_delay_slots");
> @@ -5275,7 +5280,8 @@ main (int argc, char **argv)
> fill_attr (attr);
>
> /* Construct extra attributes for `length'. */
> - make_length_attrs ();
> + make_length_attrs (&length_str);
> + make_length_attrs (&lock_length_str);
>
> /* Perform any possible optimizations to speed up compilation. */
> optimize_attrs ();
>