missing symbols in libstdc++.so.6 built from the 4.9 branch
On some Linux architectures, some symbols are missing in libstdc++.so.6 built from the 4.9 branch. I didn't notice before due to a packaging bug. Affected are ARM32, HPPA, and SPARC.

 - ARM32 (build log [1], both soft and hard float) is missing
     __aeabi_atexit@CXXABI_ARM_1.3.3
     __aeabi_vec_*
   Can these be ignored?

 - HPPA (build log [2]) is missing all the future_base symbols and
   exception_ptr13exception symbols, current_exception and
   rethrow_exception.

 - SPARC (build log [3]) configured for sparc64-linux-gnu is missing
   symbols in the 32-bit multilib build, although these are present in a
   sparc-linux-gnu build. Missing are the same ones as in the HPPA build,
   long double 128 related symbols, numeric_limits, and some math symbols.
   It looks like more than one issue is involved; I remember that the math
   symbols were already dropped in earlier versions for other
   architectures. The build is configured --with-long-double-128.

  Matthias

[1] https://buildd.debian.org/status/fetch.php?pkg=gcc-4.9&arch=armhf&ver=4.9.0-8&stamp=1403809654
[2] http://buildd.debian-ports.org/status/fetch.php?pkg=gcc-4.9&arch=hppa&ver=4.9.0-9&stamp=1404018503
[3] http://buildd.debian-ports.org/status/fetch.php?pkg=gcc-4.9&arch=sparc64&ver=4.9.0-9&stamp=1404033854
Re: missing symbols in libstdc++.so.6 built from the 4.9 branch
On 1 July 2014 09:40, Matthias Klose wrote:
> - HPPA (build log [2]), is missing all the future_base symbols and
>   exception_ptr13exception symbols, current_exception and
>   rethrow_exception.

This implies ATOMIC_INT_LOCK_FREE <= 1 for that target. Our future and exception_ptr implementations rely on usable atomics.

I don't know about the other missing symbols.
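The value of ATOMIC_INT_LOCK_FREE mentioned in this reply can be inspected on any given target. The following is only a minimal sketch, assuming a C++11 compiler; the helper names describe_lock_free and atomic_int_status are hypothetical and are not part of libstdc++. The C++ standard defines the macro to be 0 (never lock-free), 1 (sometimes lock-free) or 2 (always lock-free).

```cpp
#include <atomic>
#include <cassert>
#include <string>

// Hypothetical helper: maps the value of an ATOMIC_*_LOCK_FREE macro
// (0, 1 or 2 per the C++11 standard) to a readable description.
std::string describe_lock_free(int value)
{
  assert(value >= 0 && value <= 2);
  switch (value)
    {
    case 0:
      return "never lock-free";
    case 1:
      return "sometimes lock-free";
    default:
      return "always lock-free";
    }
}

// Hypothetical helper: reports what the current target claims for
// atomic operations on int.
std::string atomic_int_status()
{
  return describe_lock_free(ATOMIC_INT_LOCK_FREE);
}
```

Printing atomic_int_status() from a small program built with the compiler under test would show whether a target falls into the "<= 1" case described above.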
Re: Question about GCC's standard dependent optimization
On Mon, Jun 30, 2014 at 5:42 PM, Jeff Law wrote:
> On 06/26/14 14:13, Jeff Law wrote:
>>
>> On 06/26/14 02:44, Bin.Cheng wrote:
>>>
>>> Hi,
>>> I ran into PR60947, in which GCC understands the return value of
>>> memset is the first argument passed in, according to standard, then
>>> does optimization like below:
>>>         mov     ip, sp
>>>         stmfd   sp!, {r4, r5, r6, r7, r8, r9, r10, fp, ip, lr, pc}
>>>         sub     fp, ip, #4
>>>         sub     sp, sp, #20
>>>         ldr     r8, [r0, #112]
>>>         add     r3, r8, #232
>>>         add     r4, r8, #328
>>> .L1064:
>>>         mov     r0, r3
>>>         mov     r1, #255
>>>         mov     r2, #8
>>>         bl      memset
>>>         add     r3, r0, #32     <X>
>>>         cmp     r3, r4
>>>         bne     .L1064
>>>
>>> For X insn, GCC takes advantage of standard by using the returned r0
>>> directly.
>>>
>>> My question is, is it always safe for GCC to do such optimization? Do
>>> we have an option to disable such standard dependent optimization?
>>
>> Others have already answered this question.
>>
>> FWIW, I just locally added the capability to track equivalences between
>> the destination argument to the various mem* str* functions and their
>> return value in DOM. It triggers, but not terribly often. I'll be
>> looking to see if the additional equivalences actually enable any
>> optimizations before going through the full bootstrap and test.
>
> Just as a follow-up. This turns out to be a relatively bad idea as it gets
> in the way of tail call optimizations.
>
> Probably the only place where this is going to really be useful is in the
> allocators to allow us to cheaply rematerialize values and/or tie together
> two values that normally wouldn't be seen as related to each other.

It also restricts the inlining of string operation functions at expand time. Once we reuse the return value, the inlined code needs to compute that return value as well. I don't know whether this would break expand/inlining on some targets now, but it surely increases the cost of the inlined code.

Thanks,
bin
>
> Jeff
>
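The guarantee that the optimization in the quoted assembly relies on is that memset returns its first argument. A minimal sketch of that property, assuming a hosted C++ implementation; the wrapper name fill_ff is hypothetical and only serves the illustration:

```cpp
#include <cassert>
#include <cstring>

// Hypothetical wrapper: fills LEN bytes at BUF with 0xff and returns the
// pointer that memset hands back.  The C standard specifies that memset
// returns its first argument, so the returned pointer equals BUF and a
// compiler may use either value interchangeably after the call.
unsigned char *fill_ff(unsigned char *buf, std::size_t len)
{
  void *ret = std::memset(buf, 0xff, len);
  assert(ret == buf);
  return static_cast<unsigned char *>(ret);
}
```

In the loop from PR60947 this is what allows the add after the call to read r0 (the return value) instead of keeping the original pointer live in another register across the call.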
Re: missing symbols in libstdc++.so.6 built from the 4.9 branch
On 01.07.2014 11:32, Jonathan Wakely wrote:
> On 1 July 2014 09:40, Matthias Klose wrote:
>> - HPPA (build log [2]), is missing all the future_base symbols and
>>   exception_ptr13exception symbols, current_exception and
>>   rethrow_exception.
>
> This implies ATOMIC_INT_LOCK_FREE <= 1 for that target. Our future and
> exception_ptr implementations rely on usable atomics.

Thanks for the reminder. Then the same missing symbols on sparc are due to a missing --with-cpu-32=ultrasparc.

  Matthias
Re: [GSoC] generation of GCC expression trees from isl ast expressions
Hi Tobias,

could you please advise me how to verify the results of gimple code generation?

I've written a first draft of the generation of loops with empty bodies and tried to verify the gimple code using the representation that is dumped to the dump_file at the end of the generation. In the following example, cloog and the isl code generator produce similar representations (as expected, the representation generated by the isl code generator doesn't have the body of the loop).

int
main (int n, int *a)
{
  int i;
  for (i = 0; i < 100; i++)
    a[i] = i;
  return 0;
}

gcc/graphite-isl-ast-to-gimple.c

loop_0 (header = 0, latch = 1, niter = )
{
  bb_2 (preds = {bb_0 }, succs = {bb_3 })
  {
    :
  }
  bb_5 (preds = {bb_3 }, succs = {bb_1 })
  {
    :
    # .MEM_10 = PHI <.MEM_3(D)(3)>
    # VUSE <.MEM_10>
    return 0;
  }
  loop_2 (header = 3, latch = 4, niter = )
  {
    bb_3 (preds = {bb_2 bb_4 }, succs = {bb_4 bb_5 })
    {
      :
      # graphite_IV.3_1 = PHI <0(2), graphite_IV.3_14(4)>
      graphite_IV.3_14 = graphite_IV.3_1 + 1;
      if (graphite_IV.3_1 < 99)
        goto ;
      else
        goto ;
    }
    bb_4 (preds = {bb_3 }, succs = {bb_3 })
    {
      :
      goto ;
    }
  }
}

graphite-clast-to-gimple.c

loop_0 (header = 0, latch = 1, niter = )
{
  bb_2 (preds = {bb_0 }, succs = {bb_3 })
  {
    :
  }
  bb_5 (preds = {bb_3 }, succs = {bb_1 })
  {
    :
    # .MEM_18 = PHI <.MEM_11(3)>
    # VUSE <.MEM_18>
    return 0;
  }
  loop_2 (header = 3, latch = 4, niter = )
  {
    bb_3 (preds = {bb_2 bb_4 }, succs = {bb_4 bb_5 })
    {
      :
      # graphite_IV.3_1 = PHI <0(2), graphite_IV.3_14(4)>
      # .MEM_19 = PHI <.MEM_3(D)(2), .MEM_11(4)>
      _2 = (sizetype) graphite_IV.3_1;
      _15 = _2 * 4;
      _16 = a_6(D) + _15;
      _17 = (int) graphite_IV.3_1;
      # .MEM_11 = VDEF <.MEM_19>
      *_16 = _17;
      graphite_IV.3_14 = graphite_IV.3_1 + 1;
      if (graphite_IV.3_1 < 99)
        goto ;
      else
        goto ;
    }
    bb_4 (preds = {bb_3 }, succs = {bb_3 })
    {
      :
      goto ;
    }
  }
}

However, this form doesn't have the loop guards which are generated by graphite_create_new_loop_guard in gcc/graphite-isl-ast-to-gimple.c and by graphite_create_new_loop_guard in graphite-clast-to-gimple.c.
Below is the code of this generation. (It still uses isl_int for the generation of isl_expr_int, because the error related to isl/val_gmp.h still arises. I've tried to use isl 0.12.2 and 0.13, but got the same error.)

--
                                   Cheers, Roman Gareev

Index: gcc/graphite-isl-ast-to-gimple.c
===
--- gcc/graphite-isl-ast-to-gimple.c	(revision 212194)
+++ gcc/graphite-isl-ast-to-gimple.c	(working copy)
@@ -42,16 +42,620 @@
 #include "cfgloop.h"
 #include "tree-data-ref.h"
 #include "sese.h"
+#include "tree-ssa-loop-manip.h"
+#include "tree-scalar-evolution.h"
 
 #ifdef HAVE_cloog
 
 #include "graphite-poly.h"
 #include "graphite-isl-ast-to-gimple.h"
+#include "graphite-htab.h"
 
 /* This flag is set when an error occurred during the translation of
    ISL AST to Gimple.  */
 
 static bool graphite_regenerate_error;
 
+/* Converts a GMP constant VAL to a tree and returns it.  */
+
+static tree
+gmp_cst_to_tree (tree type, mpz_t val)
+{
+  tree t = type ? type : integer_type_node;
+  mpz_t tmp;
+
+  mpz_init (tmp);
+  mpz_set (tmp, val);
+  wide_int wi = wi::from_mpz (t, tmp, true);
+  mpz_clear (tmp);
+
+  return wide_int_to_tree (t, wi);
+}
+
+/* Verifies properties that GRAPHITE should maintain during translation.  */
+
+static inline void
+graphite_verify (void)
+{
+#ifdef ENABLE_CHECKING
+  verify_loop_structure ();
+  verify_loop_closed_ssa (true);
+#endif
+}
+
+/* Stores the INDEX in a vector and the loop nesting LEVEL for a given
+   isl_id NAME.  BOUND_ONE and BOUND_TWO represent the exact lower and
+   upper bounds that can be inferred from the polyhedral representation.  */
+
+typedef struct ast_isl_name_index {
+  int index;
+  int level;
+  const char *name;
+  /* If free_name is set, the content of name was allocated by us and needs
+     to be freed.  */
+  char *free_name;
+} *ast_isl_name_index_p;
+
+/* Helper for hashing ast_isl_name_index.  */
+
+struct ast_isl_index_hasher
+{
+  typedef ast_isl_name_index value_type;
+  typedef ast_isl_name_index compare_type;
+  static inline hashval_t hash (const value_type *);
+  static inline bool equal (const value_type *, const compare_type *);
+  static inline void remove (value_type *);
+};
+
+/* Computes a hash function for database element E.  */
+
+inline hashval_t
+ast_isl_index_hasher::hash (const value_type *e)
+{
+  hashval_t hash = 0;
+
+  int length = strlen (e->name);
+  int i;
+
+  for (i = 0; i < length; ++i)
+    hash = hash | (e->name[i] << (i % 4));
+
+  return hash;
+}
+
+/* Compares database elements ELT1 and ELT2.  */
+
+inline bool
+ast_isl_index_hasher::equal (const value_type *elt1, co
Re: [GSoC] Question about unit tests
Thank you for the answer! -- Cheers, Roman Gareev
Re: [GSoC] generation of GCC expression trees from isl ast expressions
On 01/07/2014 14:53, Roman Gareev wrote:
> Hi Tobias,
>
> could you please advise me how to verify the results of gimple code
> generation?

More comments inline, but here is something on a very high level.

I personally like testing already on the GIMPLE level and could see us matching for certain expressions in the dumped gimple output. Unfortunately, this kind of testing may be a little fragile, depending on how often gcc changes its internal dumping (hopefully not too often).

On the other hand, in gcc testing is commonly done by compiling and executing files. For this to work, we would need at least a simple implementation of body statements before we can get anything tested and checked in.

> I've written the first draft of the generation of loops with empty
> bodies and tried to verify gimple code using the representation, which
> is dumped at the end of the generation of the dump_file. If we
> consider the following example, we'll see that cloog and isl code
> generator generate similar representation (representation generated by
> isl code generator doesn't have body of the loop, as was expected).
> int
> main (int n, int *a)
> {
>   int i;
>   for (i = 0; i < 100; i++)
>     a[i] = i;
>   return 0;
> }
>
> gcc/graphite-isl-ast-to-gimple.c
>
> loop_0 (header = 0, latch = 1, niter = )
> {
>   bb_2 (preds = {bb_0 }, succs = {bb_3 })
>   {
>     :
>   }
>   bb_5 (preds = {bb_3 }, succs = {bb_1 })
>   {
>     :
>     # .MEM_10 = PHI <.MEM_3(D)(3)>
>     # VUSE <.MEM_10>
>     return 0;
>   }
>   loop_2 (header = 3, latch = 4, niter = )
>   {
>     bb_3 (preds = {bb_2 bb_4 }, succs = {bb_4 bb_5 })
>     {
>       :
>       # graphite_IV.3_1 = PHI <0(2), graphite_IV.3_14(4)>
>       graphite_IV.3_14 = graphite_IV.3_1 + 1;
>       if (graphite_IV.3_1 < 99)
>         goto ;
>       else
>         goto ;
>     }
>     bb_4 (preds = {bb_3 }, succs = {bb_3 })
>     {
>       :
>       goto ;
>     }
>   }
> }
>
> graphite-clast-to-gimple.c
>
> loop_0 (header = 0, latch = 1, niter = )
> {
>   bb_2 (preds = {bb_0 }, succs = {bb_3 })
>   {
>     :
>   }
>   bb_5 (preds = {bb_3 }, succs = {bb_1 })
>   {
>     :
>     # .MEM_18 = PHI <.MEM_11(3)>
>     # VUSE <.MEM_18>
>     return 0;
>   }
>   loop_2 (header = 3, latch = 4, niter = )
>   {
>     bb_3 (preds = {bb_2 bb_4 }, succs = {bb_4 bb_5 })
>     {
>       :
>       # graphite_IV.3_1 = PHI <0(2), graphite_IV.3_14(4)>
>       # .MEM_19 = PHI <.MEM_3(D)(2), .MEM_11(4)>
>       _2 = (sizetype) graphite_IV.3_1;
>       _15 = _2 * 4;
>       _16 = a_6(D) + _15;
>       _17 = (int) graphite_IV.3_1;
>       # .MEM_11 = VDEF <.MEM_19>
>       *_16 = _17;
>       graphite_IV.3_14 = graphite_IV.3_1 + 1;
>       if (graphite_IV.3_1 < 99)
>         goto ;
>       else
>         goto ;
>     }
>     bb_4 (preds = {bb_3 }, succs = {bb_3 })
>     {
>       :
>       goto ;
>     }
>   }
> }
>
> However, this form doesn't have loop guards which are generated by
> graphite_create_new_loop_guard in gcc/graphite-isl-ast-to-gimple.c and
> by graphite_create_new_loop_guard in graphite-clast-to-gimple.c.

Maybe the guards are directly constant folded? Can you try with:

int
main (int n, int *a)
{
  int i;
  for (i = 0; i < b; i++)
    a[i] = i;
  return 0;
}

> Below is the code of this generation (It still uses isl_int for
> generation of isl_expr_int, because the error related to isl/val_gmp.h
> still arises. I've tried to use isl 0.12.2 and 0.13, but gotten the
> same error).

Did using 'extern "C"' around the include statement not help?
> +/* Stores the INDEX in a vector and the loop nesting LEVEL for a given
> +   isl_id NAME.  BOUND_ONE and BOUND_TWO represent the exact lower and
> +   upper bounds that can be inferred from the polyhedral representation.  */

Why do you mention BOUND_ONE & BOUND_TWO? I do not see any use of them?

> +typedef struct ast_isl_name_index {
> +  int index;
> +  int level;
> +  const char *name;
> +  /* If free_name is set, the content of name was allocated by us and needs
> +     to be freed.  */
> +  char *free_name;
> +} *ast_isl_name_index_p;
> +
> +/* Helper for hashing ast_isl_name_index.  */
> +
> +struct ast_isl_index_hasher
> +{
> +  typedef ast_isl_name_index value_type;
> +  typedef ast_isl_name_index compare_type;
> +  static inline hashval_t hash (const value_type *);
> +  static inline bool equal (const value_type *, const compare_type *);
> +  static inline void remove (value_type *);
> +};
> +
> +/* Computes a hash function for database element E.  */
> +
> +inline hashval_t
> +ast_isl_index_hasher::hash (const value_type *e)
> +{
> +  hashval_t hash = 0;
> +
> +  int length = strlen (e->name);
> +  int i;
> +
> +  for (i = 0; i < length; ++i)
> +    hash = hash | (e->name[i] << (i % 4));
> +
> +  return hash;
> +}
> +
> +/* Compares database elements ELT1 and ELT2.  */
> +
> +inline bool
> +ast_isl_index_hasher::equal (const value_type *elt1, const compare_type *elt2)
> +{
> +  return strcmp (elt1->name, elt2->name) == 0;
> +}
> +
> +/* Free the memory taken by a ast_isl_name_index struct.  */
> +
> +inline void
> +ast_isl_index_hasher::remove (value_type *c)
> +{
> +  if (c->free_name)
> +    free (c->free_name);
> +  free (c);
> +}
> +
> +typed
combination of read/write and earlyclobber constraint modifier
Vladimir,

There are a few patterns which use both the read/write constraint modifier (+) and the earlyclobber constraint modifier (&):
...
$ grep -c 'match_operand.*+.*&' gcc/config/*/* | grep -v :0
gcc/config/aarch64/aarch64-simd.md:1
gcc/config/arc/arc.md:1
gcc/config/arm/ldmstm.md:30
gcc/config/rs6000/spe.md:8
...

For instance, this one in gcc/config/aarch64/aarch64-simd.md:
...
(define_insn "vec_pack_trunc_"
 [(set (match_operand: 0 "register_operand" "+&w")
       (vec_concat:
         (truncate: (match_operand:VQN 1 "register_operand" "w"))
         (truncate: (match_operand:VQN 2 "register_operand" "w"]
...

The documentation ( https://gcc.gnu.org/onlinedocs/gccint/Modifiers.html#Modifiers ) states:
...
‘&’ does not obviate the need to write ‘=’.
...
which seems to state that '&' implies '='.

An earlyclobber operand is defined as 'modified before the instruction is finished using the input operands'. AFAIU that would indeed exclude the possibility that the earlyclobber operand is itself an input/output operand, but perhaps I misunderstand.

So my question is: is the combination of '&' and '+' supported? If so, what are the exact semantics? If not, should we warn or give an error?

Thanks,
- Tom
Re: combination of read/write and earlyclobber constraint modifier
On 07/01/14 13:27, Tom de Vries wrote:
> Vladimir,
>
> There are a few patterns which use both the read/write constraint
> modifier (+) and the earlyclobber constraint modifier (&):
> ...
> $ grep -c 'match_operand.*+.*&' gcc/config/*/* | grep -v :0
> gcc/config/aarch64/aarch64-simd.md:1
> gcc/config/arc/arc.md:1
> gcc/config/arm/ldmstm.md:30
> gcc/config/rs6000/spe.md:8
> ...
>
> F.i., this one in gcc/config/aarch64/aarch64-simd.md:
> ...
> (define_insn "vec_pack_trunc_"
>  [(set (match_operand: 0 "register_operand" "+&w")
>        (vec_concat:
>          (truncate: (match_operand:VQN 1 "register_operand" "w"))
>          (truncate: (match_operand:VQN 2 "register_operand" "w"]
> ...
>
> The documentation (
> https://gcc.gnu.org/onlinedocs/gccint/Modifiers.html#Modifiers ) states:
> ...
> '‘&’ does not obviate the need to write ‘=’.
> ...
> which seems to state that '&' implies '='.
>
> An earlyclobber operand is defined as 'modified before the instruction
> is finished using the input operands'. AFAIU that would indeed exclude
> the possibility that the earlyclobber operand is an input/output
> operand it self, but perhaps I misunderstand.
>
> So my question is: is the combination of '&' and '+' supported ? If so,
> what is the exact semantics ? If not, should we warn or give an error ?

I don't think we can define any reasonable semantics for &+. My recommendation would be for this to be considered a hard error.

Jeff
Re: combination of read/write and earlyclobber constraint modifier
On Tue, 1 Jul 2014, Jeff Law wrote:

> On 07/01/14 13:27, Tom de Vries wrote:
>> Vladimir,
>>
>> There are a few patterns which use both the read/write constraint
>> modifier (+) and the earlyclobber constraint modifier (&):
>> ...
>> $ grep -c 'match_operand.*+.*&' gcc/config/*/* | grep -v :0
>> gcc/config/aarch64/aarch64-simd.md:1
>> gcc/config/arc/arc.md:1
>> gcc/config/arm/ldmstm.md:30
>> gcc/config/rs6000/spe.md:8
>> ...
>>
>> F.i., this one in gcc/config/aarch64/aarch64-simd.md:
>> ...
>> (define_insn "vec_pack_trunc_"
>>  [(set (match_operand: 0 "register_operand" "+&w")
>>        (vec_concat:
>>          (truncate: (match_operand:VQN 1 "register_operand" "w"))
>>          (truncate: (match_operand:VQN 2 "register_operand" "w"]
>> ...
>>
>> The documentation (
>> https://gcc.gnu.org/onlinedocs/gccint/Modifiers.html#Modifiers ) states:
>> ...
>> '‘&’ does not obviate the need to write ‘=’.
>> ...
>> which seems to state that '&' implies '='.
>>
>> An earlyclobber operand is defined as 'modified before the instruction
>> is finished using the input operands'. AFAIU that would indeed exclude
>> the possibility that the earlyclobber operand is an input/output
>> operand it self, but perhaps I misunderstand.
>>
>> So my question is: is the combination of '&' and '+' supported ? If so,
>> what is the exact semantics ? If not, should we warn or give an error ?
>
> I don't think we can define any reasonable semantics for &+. My
> recommendation would be for this to be considered a hard error.

Uh? The doc explicitly says "An input operand can be tied to an earlyclobber operand" and goes on to explain why that is useful. It avoids using the same register for other input when they are identical.

-- 
Marc Glisse
Re: missing symbols in libstdc++.so.6 built from the 4.9 branch
On 1-Jul-14, at 5:32 AM, Jonathan Wakely wrote:

> On 1 July 2014 09:40, Matthias Klose wrote:
>> - HPPA (build log [2]), is missing all the future_base symbols and
>>   exception_ptr13exception symbols, current_exception and
>>   rethrow_exception.
>
> This implies ATOMIC_INT_LOCK_FREE <= 1 for that target. Our future and
> exception_ptr implementations rely on usable atomics.

ARM and HPPA use kernel assisted libraries for atomic support. Not exactly lock free, but possibly good enough...

Currently, c-cppbuiltin.c doesn't provide proper defines for this support. We currently define __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4, etc, in pa-linux.h. I'll experiment with defining ATOMIC_INT_LOCK_FREE there.

Thanks,
Dave
--
John David Anglin	dave.ang...@bell.net
Re: missing symbols in libstdc++.so.6 built from the 4.9 branch
On 1 July 2014 20:58, John David Anglin wrote:
> On 1-Jul-14, at 5:32 AM, Jonathan Wakely wrote:
>
>> On 1 July 2014 09:40, Matthias Klose wrote:
>>>
>>> - HPPA (build log [2]), is missing all the future_base symbols and
>>>   exception_ptr13exception symbols, current_exception and
>>>   rethrow_exception.
>>
>>
>> This implies ATOMIC_INT_LOCK_FREE <= 1 for that target. Our future and
>> exception_ptr implementations rely on usable atomics.
>
>
> ARM and HPPA use kernel assisted libraries for atomic support. Not exactly
> lock free, but possibly good enough...
>
> Currently, c-cppbuiltin.c doesn't provide proper defines for this support.
> We currently define __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4, etc, in
> pa-linux.h. I'll experiment with defining ATOMIC_INT_LOCK_FREE there.

It should already be defined, but its value is what matters for libstdc++'s purposes.

To be honest, I'm not sure we really need the value to be greater than one; a value equal to one might work. We'd need to check though.
Re: combination of read/write and earlyclobber constraint modifier
On 01-07-14 21:58, Marc Glisse wrote:
>>> So my question is: is the combination of '&' and '+' supported ? If so,
>>> what is the exact semantics ? If not, should we warn or give an error ?
>>
>> I don't think we can define any reasonable semantics for &+. My
>> recommendation would be for this to be considered a hard error.
>
> Uh? The doc explicitly says "An input operand can be tied to an
> earlyclobber operand" and goes on to explain why that is useful. It
> avoids using the same register for other input when they are identical.

Hi Marc,

That part of the doc refers to the mulsi3 insn for ARM as an example:
...
;; Use `&' and then `0' to prevent the operands 0 and 1 being the same
(define_insn "*arm_mulsi3"
  [(set (match_operand:SI          0 "s_register_operand" "=&r,&r")
        (mult:SI (match_operand:SI 2 "s_register_operand" "r,r")
                 (match_operand:SI 1 "s_register_operand" "%0,r")))]
  "TARGET_32BIT && !arm_arch6"
  "mul%?\\t%0, %2, %1"
  [(set_attr "type" "mul")
   (set_attr "predicable" "yes")]
)
...

Note that there's no combination of & and + here.

AFAIU, the 'tie' established here is from input operand 1 to an earlyclobber output operand 0 using the '0' matching constraint.

Having said that, I don't understand the comment. AFAIU it should be: 'Use '0' to make sure operands 0 and 1 are the same, and use '&' to make sure operands 0 and 2 are not the same'.

Thanks,
- Tom
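The distinction drawn in this message (operand 1 may be tied to the output, operand 2 must not share its register) can be illustrated with a toy model. This is only a sketch under stated assumptions: registers are modelled as ints reached through pointers, the function names are hypothetical, and the two-step "instruction" is an illustration of an early write, not a description of how the ARM mul instruction is implemented.

```cpp
#include <cassert>

// Toy model of an instruction that writes its destination before it has
// finished reading its inputs: dst = a, then dst = dst * b.
void early_write_mul(int *dst, const int *a, const int *b)
{
  assert(dst != nullptr && a != nullptr && b != nullptr);
  *dst = *a;          // the destination is written early...
  *dst = *dst * *b;   // ...while input b is still needed
}

// All three operands live in distinct "registers".
int mul_distinct(int x, int y)
{
  int d = 0;
  early_write_mul(&d, &x, &y);
  return d;
}

// Input a is tied to the destination (like a '0' matching constraint).
int mul_tied(int x, int y)
{
  int r = x;
  early_write_mul(&r, &r, &y);
  return r;
}

// Input b shares the destination "register"; this is the allocation
// that an earlyclobber ('&') on the output is meant to rule out.
int mul_aliased(int x, int y)
{
  int a = x;
  int r = y;
  early_write_mul(&r, &a, &r);
  return r;
}
```

With inputs 3 and 4, the distinct and tied variants both produce 12, while the aliased variant produces 9 because the second input was overwritten before it was read.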
Re: reverse bitfield patch
Revisiting an old thread, as I still want to get this feature in...

https://gcc.gnu.org/ml/gcc/2012-10/msg00099.html

> >> Why do you need to change varasm.c at all? The hunks seem to be
> >> completely separate of the attribute.
> >
> > Because static constructors have fields in the original order, not the
> > reversed order. Otherwise code like this is miscompiled:
>
> Err - the struct also has fields in the original order - only the bit
> positions of the fields are different because of the layouting option.

The order of the field decls in the type (stor-layout.c) is not changed, only the bit position information. The order here *can't* be changed, because the C language assumes that parameters, initializers, etc. are presented in the same order as the original declaration, regardless of the target-specific layout.

When the program includes an initializer:

> > struct foo a = { 1, 2, 3 };

the order of 1, 2, and 3 needs to correspond to the order of the bitfields in 'a', so we can change neither the order of the bitfields in 'a' nor the order of the constructor fields.

However, when we stream the initializer out to the .S file, we need to pack the bitfields in the right sequence to generate the right bit patterns in the final output image. The code in varasm.c exists to make sure that the initializers for bitfields are written/packed in the correct order, to correspond to the bitfield positions. I.e. the 1,2,3 initializer needs to be written to the .S file as either 0x0123 or 0x3210 depending on the bit positions. In neither case do we change the order of the fields in the type itself, i.e. the array/chain order.

> And you expect no other code looks at fields of a structure and its
> initializer? It's bad to keep this not in-sync. Thus I don't think it's
> viable to re-order fields just because bit allocation is reversed.

The fields are in sync. The varasm.c change sorts the elements as they're being output into the byte stream in the .S; it doesn't sort the field definitions themselves.

> > +      /* If the bitfield-order attribute has been used on this
> > +         structure, the fields might not be in bit-order.  In that
> > +         case, we need a separate representative for each
> > +         field.  */
> >
> > The typical use-case for this feature is memory-mapped hardware, where
> > pessimum access is preferred anyway.
>
> I doubt that, looking at constraints for strict volatile bitfields.

The code that handles representatives requires (via an assert, IIRC) that the bit offsets within a representative be in ascending order. I.e. gcc ICEs if I don't bypass this. In the case of volatile bitfields, which would be the typical use case for a reversed bitfield, the access mode is going to match the type size regardless, so performance is not changed by this patch.
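The packing step described in this message (same initializer order, different bit patterns in the output) can be sketched as follows. This is a minimal illustration and not the varasm.c code: the function name pack_nibbles is hypothetical, and it assumes four 4-bit fields in a 16-bit word, with a leading zero-valued field added so that the results line up with the 0x0123 and 0x3210 patterns mentioned above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: packs four 4-bit field values, given in
// declaration order, into one 16-bit word.  When MSB_FIRST is true the
// first field occupies the most significant nibble; otherwise it
// occupies the least significant one.  The order of FIELDS itself is
// never changed, only the bit position each value is written to.
std::uint16_t pack_nibbles(const unsigned (&fields)[4], bool msb_first)
{
  std::uint16_t word = 0;
  for (int i = 0; i < 4; ++i)
    {
      assert(fields[i] <= 0xf);
      int shift = msb_first ? 12 - 4 * i : 4 * i;
      word = static_cast<std::uint16_t>(word | (fields[i] << shift));
    }
  return word;
}
```

For the field values {0, 1, 2, 3}, packing with msb_first set gives 0x0123 and packing without it gives 0x3210, while the initializer list is read in the same order both times.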
Re: combination of read/write and earlyclobber constraint modifier
On Tue, 1 Jul 2014, Tom de Vries wrote:

> On 01-07-14 21:58, Marc Glisse wrote:
>>>> So my question is: is the combination of '&' and '+' supported ? If so,
>>>> what is the exact semantics ? If not, should we warn or give an error ?
>>>
>>> I don't think we can define any reasonable semantics for &+. My
>>> recommendation would be for this to be considered a hard error.
>>
>> Uh? The doc explicitly says "An input operand can be tied to an
>> earlyclobber operand" and goes on to explain why that is useful. It
>> avoids using the same register for other input when they are identical.
>
> Hi Marc,
>
> That part of the doc refers to the mulsi3 insn for ARM as example:
> ...
> ;; Use `&' and then `0' to prevent the operands 0 and 1 being the same
> (define_insn "*arm_mulsi3"
>   [(set (match_operand:SI          0 "s_register_operand" "=&r,&r")
>         (mult:SI (match_operand:SI 2 "s_register_operand" "r,r")
>                  (match_operand:SI 1 "s_register_operand" "%0,r")))]
>   "TARGET_32BIT && !arm_arch6"
>   "mul%?\\t%0, %2, %1"
>   [(set_attr "type" "mul")
>    (set_attr "predicable" "yes")]
> )
> ...
>
> Note that there's no combination of & and + here.

I think it could have used (match_dup 0) instead of operand 1, if there had been only the first alternative. And then the constraint would have been +&.

> AFAIU, the 'tie' established here is from input operand 1 to an
> earlyclobber output operand 0 using the '0' matching constraint.
>
> Having said that, I don't understand the comment, AFAIU it should be:
> 'Use '0' to make sure operands 0 and 1 are the same, and use '&' to make
> sure operands 0 and 2 are not the same.

Well, yeah, the comment doesn't seem completely in sync with the code.

In the first example you gave, looking at the pattern (no match_dup, setting the full register), it seems that it may have wanted "=&" instead of "+&".

(By the way, in the same aarch64-simd.md file, I noticed some define_expand with constraints, which looks strange.)

-- 
Marc Glisse
Re: combination of read/write and earlyclobber constraint modifier
On 02-07-14 08:23, Marc Glisse wrote:
> On Tue, 1 Jul 2014, Tom de Vries wrote:
>
>> On 01-07-14 21:58, Marc Glisse wrote:
>>>>> So my question is: is the combination of '&' and '+' supported ? If
>>>>> so, what is the exact semantics ? If not, should we warn or give an
>>>>> error ?
>>>>
>>>> I don't think we can define any reasonable semantics for &+. My
>>>> recommendation would be for this to be considered a hard error.
>>>
>>> Uh? The doc explicitly says "An input operand can be tied to an
>>> earlyclobber operand" and goes on to explain why that is useful. It
>>> avoids using the same register for other input when they are identical.
>>
>> Hi Marc,
>>
>> That part of the doc refers to the mulsi3 insn for ARM as example:
>> ...
>> ;; Use `&' and then `0' to prevent the operands 0 and 1 being the same
>> (define_insn "*arm_mulsi3"
>>   [(set (match_operand:SI          0 "s_register_operand" "=&r,&r")
>>         (mult:SI (match_operand:SI 2 "s_register_operand" "r,r")
>>                  (match_operand:SI 1 "s_register_operand" "%0,r")))]
>>   "TARGET_32BIT && !arm_arch6"
>>   "mul%?\\t%0, %2, %1"
>>   [(set_attr "type" "mul")
>>    (set_attr "predicable" "yes")]
>> )
>> ...
>>
>> Note that there's no combination of & and + here.
>
> I think it could have used (match_dup 0) instead of operand 1, if there
> had been only the first alternative. And then the constraint would have
> been +&.

Marc,

isn't that explicitly listed as unsupported here ( https://gcc.gnu.org/onlinedocs/gccint/RTL-Template.html#index-match_005fdup-3244 ):
...
Note that match_dup should not be used to tell the compiler that a particular register is being used for two operands (example: add that adds one register to another; the second register is both an input operand and the output operand). Use a matching constraint (see Simple Constraints) for those. match_dup is for the cases where one operand is used in two places in the template, such as an instruction that computes both a quotient and a remainder, where the opcode takes two input operands but the RTL template has to refer to each of those twice; once for the quotient pattern and once for the remainder pattern.
...
?

Thanks,
- Tom