Re: Question regarding usage of vec.h
On 2020/8/3 12:24 AM, Thomas Koenig via Gcc wrote:
> Hi,
>
> in a patch I am developing, I have the following code:
>
> typedef struct {
>   gfc_code *c;
>   int branch_level;
>   bool seen_goto;
>   vec extrema;
> } do_t;
>
> static vec doloop_list;
>
> [...]
>
>   do_t loop;
>   doloop_list.safe_push (loop);
>   doloop_list.last().extrema.reserve (1 << doloop_level);
>
> where the last statement segfaults. It also segfaults if I replace the
> last line with
>
>   doloop_list[0].extrema.reserve (1 << doloop_level);
>
> Is there something glaringly obvious I am doing wrong, and could this
> be easily fixed?

I think you need to initialize the 'vec extrema' field with vNULL, or maybe some other constructor.

Chung-Lin

> If not, I guess I'll just implement this using standard C techniques
> (i.e. malloc/realloc). It will look a bit strange, but, as a wise man
> once said, a patch that works beats an elegant idea every time...
>
> Best regards
>
> Thomas
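
The vNULL suggestion in the reply above can be illustrated with a minimal, self-contained sketch. It does not use GCC's vec.h: `toy_vec`, `toy_vNULL`, `toy_do_t` and `reserve_extrema` are hypothetical stand-ins. The assumption is that the vector handle is a plain struct with no constructor that holds a pointer, so a local `do_t loop;` leaves that pointer indeterminate until it is set explicitly.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for a constructor-less vector handle (NOT GCC's vec).
struct toy_vec {
  int *data;
  std::size_t allocated;

  // Grow storage so that at least n elements fit. This is only valid if
  // data is null or was previously returned by realloc.
  bool reserve (std::size_t n) {
    if (n <= allocated)
      return true;
    int *p = static_cast<int *> (std::realloc (data, n * sizeof (int)));
    if (!p)
      return false;
    data = p;
    allocated = n;
    return true;
  }

  void release () {
    std::free (data);
    data = nullptr;
    allocated = 0;
  }
};

// Hypothetical analogue of vNULL: a well-defined empty handle.
const toy_vec toy_vNULL = { nullptr, 0 };

struct toy_do_t {
  int branch_level;
  bool seen_goto;
  toy_vec extrema;
};

// Returns true if reserving n elements on a properly initialized field works.
bool reserve_extrema (std::size_t n) {
  toy_do_t loop;               // members are indeterminate at this point
  loop.branch_level = 0;
  loop.seen_goto = false;
  loop.extrema = toy_vNULL;    // without this, reserve would act on garbage
  bool ok = loop.extrema.reserve (n) && loop.extrema.allocated == n;
  loop.extrema.release ();
  return ok;
}
```

Copying such a struct into a container, as `safe_push (loop)` does in the original code, copies whatever the field holds, which is why the field would have to be initialized before the push under this assumption.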
Implementing OpenMP 5.0 requires directive
Hi Jakub,

We've been going over how we should implement the requires directive more completely than the current state (i.e. only atomic_default_mem_order working).

There are three clauses that, per the specification, "must appear in all compilation units of a program that contain device constructs or device routines or in none of them":

- reverse_offload
- unified_address
- unified_shared_memory

The design we're currently contemplating is to generate a mask variable of these 3 clauses for each compilation unit built with -fopenmp, and to tag it with an attribute so that the variables are collected into a special section (e.g. ".gnu.gomp.requires"). Later, at runtime device startup, the runtime checks them against the capabilities of the libgomp offload target. Cross-checking the words against each other (assuming we generate one word per compilation unit) can also implement the consistency requirement.

(Actually, as a first-stage implementation, we were hoping to implement just the special section, which allows compilation of OpenMP programs using the requires directive, and to implement any runtime checking at a later stage.)

We hope to check with you first on any design issues. Have you given any thought to this directive?

Thanks,
Chung-Lin
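
The checks described in the proposal can be sketched as below. This is only an illustration under stated assumptions, not libgomp code: the bit assignments, the enum `requires_bit` and the functions `requires_consistent` and `device_satisfies` are all hypothetical, and the input is assumed to be one mask word per compilation unit that contains device constructs, already collected from the special section.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical bit assignments; the proposal does not fix an encoding.
enum requires_bit : std::uint32_t {
  REQ_REVERSE_OFFLOAD       = 1u << 0,
  REQ_UNIFIED_ADDRESS       = 1u << 1,
  REQ_UNIFIED_SHARED_MEMORY = 1u << 2
};

// Consistency requirement: every collected word must carry the same clauses.
bool requires_consistent (const std::vector<std::uint32_t> &words) {
  for (std::uint32_t w : words)
    if (w != words.front ())
      return false;
  return true;
}

// Capability check: every required clause must be supported by the device.
bool device_satisfies (std::uint32_t required, std::uint32_t device_caps) {
  return (required & ~device_caps) == 0;
}
```

Comparing whole words is the simplest form of the cross-check; a real implementation might instead mask out bits that are not subject to the "all or none" rule.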
Re: Question abt getting the number of available registers
On 2011/10/27 04:33 PM, Revital Eres wrote:
> Hello,
>
> I'm working on estimating register pressure in SMS and using
> ira_available_class_regs for getting the number of available
> registers.
> However I encounter a case where ira_available_class_regs showed 64
> available regs for a certain class while ira_class_hard_regs_num
> showed 61 so I am not sure which one I should use. (I'm also not sure
> if ira_class_hard_regs_num can be used outside of ira-- if not, I
> would need to do that calculation myself)
>
> Thanks,
> Revital

Looking at the calculations of both values, I would say they should be the same, unless maybe... REG_ALLOC_ORDER for your target is missing some registers?

Chung-Lin
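
The suspected cause in the reply above can be shown with a toy model. This is not IRA's code: `count_available` and `count_ordered` are hypothetical functions, and the assumption is that one count walks every hard register while the other walks only the entries of the target's allocation order, visiting each register at most once.

```cpp
#include <cstddef>
#include <vector>

// Count registers that are in the class and not fixed, over all registers.
std::size_t count_available (const std::vector<bool> &in_class,
                             const std::vector<bool> &fixed) {
  std::size_t n = 0;
  for (std::size_t r = 0; r < in_class.size (); ++r)
    if (in_class[r] && !fixed[r])
      ++n;
  return n;
}

// Count the same registers, but only those named in the allocation order.
// Out-of-range and duplicate entries are skipped.
std::size_t count_ordered (const std::vector<int> &order,
                           const std::vector<bool> &in_class,
                           const std::vector<bool> &fixed) {
  std::vector<bool> seen (in_class.size (), false);
  std::size_t n = 0;
  for (int r : order) {
    if (r < 0 || static_cast<std::size_t> (r) >= in_class.size ())
      continue;
    if (seen[r])
      continue;
    seen[r] = true;
    if (in_class[r] && !fixed[r])
      ++n;
  }
  return n;
}
```

With 8 registers in the class, none fixed, and an order that lists only registers 0 to 4, the first count is 8 and the second is 5, which is the kind of mismatch the question describes.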
Re: [MIPS] Test case dspr2-MULT is failed
On 2010/12/31 09:38 PM, Richard Sandiford wrote:
> Mingjie Xing writes:
>> There are two test cases failed when run 'make check-gcc
>> RUNTESTFLAGS="mips.exp"'. The log is,
>>
>> Executing on host: /home/xmj/tools/build-test-trunk-mips/gcc/xgcc
>> -B/home/xmj/tools/build-test-trunk-mips/gcc/
>> /home/xmj/tools/test-trunk/gcc/testsuite/gcc.target/mips/dspr2-MULT.c
>> -DNOMIPS16=__attribute__((nomips16)) -mabi=32 -mips32r2 -mgp32 -O2
>> -mdspr2 -mtune=74kc -ffixed-hi -ffixed-lo -S -o dspr2-MULT.s
>> (timeout = 300)
>> PASS: gcc.target/mips/dspr2-MULT.c (test for excess errors)
>> PASS: gcc.target/mips/dspr2-MULT.c scan-assembler \tmult\t
>> PASS: gcc.target/mips/dspr2-MULT.c scan-assembler ac1
>> FAIL: gcc.target/mips/dspr2-MULT.c scan-assembler ac2
>> Executing on host: /home/xmj/tools/build-test-trunk-mips/gcc/xgcc
>> -B/home/xmj/tools/build-test-trunk-mips/gcc/
>> /home/xmj/tools/test-trunk/gcc/testsuite/gcc.target/mips/dspr2-MULTU.c
>> -DNOMIPS16=__attribute__((nomips16)) -mabi=32 -mips32r2 -mgp32 -O2
>> -mdspr2 -mtune=74kc -ffixed-hi -ffixed-lo -S -o dspr2-MULTU.s
>> (timeout = 300)
>> PASS: gcc.target/mips/dspr2-MULTU.c (test for excess errors)
>> PASS: gcc.target/mips/dspr2-MULTU.c scan-assembler \tmultu\t
>> PASS: gcc.target/mips/dspr2-MULTU.c scan-assembler ac1
>> FAIL: gcc.target/mips/dspr2-MULTU.c scan-assembler ac2
>>
>> Is it a bug?
>
> It's a register-allocation optimisation regression that's been around
> for quite a long time now (probably over a year). We're not making as
> much use of the 4 accumulator registers as we should.
>
> In truth, I don't think we ever really made good use of them anyway.
> ISTR that trivial modifications of the testcase failed even before
> the regression.

I analyzed this testcase regression a while ago; the direct cause is mips_order_regs_for_local_alloc(), which now serves as MIPS' ADJUST_REG_ALLOC_ORDER macro.

The mips_order_regs_for_local_alloc() function seems to have been written for the old local-alloc.c. It was left in place as the deprecated ORDER_REGS_FOR_LOCAL_ALLOC macro after the transition to IRA (and was actually not called at all during that time), and was relatively recently 'revived' when a patch by Bernd that created the ADJUST_REG_ALLOC_ORDER macro went in. So you have a local-alloc.c heuristic working in IRA, which seems to have caused these regressions.

Removing mips_order_regs_for_local_alloc() will let this testcase pass; of course the real fix should be to review the MIPS reg-ordering logic, which is left for you MIPS people...

Chung-Lin
Re: Question about conds attribute for *thumb2_alusi3_short
On 13/6/24 11:43 PM, Tom de Vries wrote:
> Richard,
>
> I've noticed that f.i. *thumb2_alusi3_short has no explicit setting of the
> conds attribute, which means the value of the conds attribute for this
> insn is nocond:
> ...
> (define_insn "*thumb2_alusi3_short"
>   [(set (match_operand:SI 0 "s_register_operand" "=l")
>         (match_operator:SI 3 "thumb_16bit_operator"
>          [(match_operand:SI 1 "s_register_operand" "0")
>           (match_operand:SI 2 "s_register_operand" "l")]))
>    (clobber (reg:CC CC_REGNUM))]
>   "TARGET_THUMB2 && reload_completed
>    && GET_CODE(operands[3]) != PLUS
>    && GET_CODE(operands[3]) != MINUS"
>   "%I3%!\\t%0, %1, %2"
>   [(set_attr "predicable" "yes")
>    (set_attr "length" "2")]
> )
> ...
>
> AFAIU, this insn is either:
> - conditional, and does not modify cc, or
> - unconditional, and sets cc.
> So the clobber of CC in the RTL conservatively describes both cases.
>
> It seems to me the logical conds setting for the conditional case is
> nocond, set (or perhaps clob) for the unconditional case. So, is this a
> more accurate value of conds for this insn:
> ...
>    (set (attr "conds")
>         (if_then_else
>           (match_test "GET_CODE (PATTERN (insn)) == COND_EXEC")
>           (const_string "nocond")
>           (const_string "set")))]
> ...
> ?
>
> Is there a generic need to have this attribute accurate for all insns?

Following this thread, which Tom pointed out to me earlier in an internal discussion:
http://gcc.gnu.org/ml/gcc-patches/2012-02/msg00723.html

If the short-CC-clobbered form is now selected very late, I think this pattern simply is not (or should not be) used for the conditional (within IT-block) case. The conds attribute should simply be set to "clob". Predicable might be set to "no" as well...

Chung-Lin
Altera Nios II port submission
To the GCC Steering Committee,

Mentor Graphics has submitted, and recently re-submitted in an updated version, a GCC backend port for the Altera Nios II architecture, which is currently on gcc-patches awaiting technical review [1].

Upon port approval and commit to trunk, we propose Sandra Loosemore and myself (Chung-Lin Tang), both of Mentor Graphics, as target maintainers.

Thank you,
Chung-Lin

[1] http://gcc.gnu.org/ml/gcc-patches/2013-07/msg00526.html
    http://gcc.gnu.org/ml/gcc-patches/2013-07/msg00527.html
    http://gcc.gnu.org/ml/gcc-patches/2013-07/msg00528.html
Re: Question about simple_return pattern for the GCC ARM backend.
On 2013/12/28 09:31 AM, Yangfei (Felix) wrote:
> Hi,
>
> I think that simple_return standard pattern is useful for the ARM. I mean
> it should be good for target code performance.
> But seems this pattern is not there for the GCC ARM backend. Can anyone
> explain the reason why we don’t need this?
>
> Cheers,
> Fei
>

It does use it. Search for the "return" expand pattern, and the "returns" code iterator in config/arm/iterators.md.

Chung-Lin
Re: Register allocation cost question
On 2023/10/10 11:11 PM, Andrew Stubbs wrote:
> Hi all,
>
> I'm trying to add a new register set to the GCN port, but I've hit a
> problem I don't understand.
>
> There are 256 new registers (each 2048 bit vector register) but the
> register file has to be divided between all the running hardware
> threads; if you can use fewer registers you can get more parallelism,
> which means that it's important that they're allocated in order.
>
> The problem is that they're not allocated in order. Somehow the IRA pass
> is calculating different costs for the registers within the class. It
> seems to prefer registers a32, a96, a160, and a224.
>
> The internal regno are 448, 512, 576, 640. These are not random numbers!
> They all have zero for the 6 LSB.
>
> What could cause this? Did I overrun some magic limit? What target hook
> might I have miscoded?
>
> I'm also seeing wrong-code bugs when I allow more than 32 new registers,
> but that might be an unrelated problem. Or the allocation is broken? I'm
> still analyzing this.
>
> If it matters, ... the new registers can't be used for general purposes,
> so I'm trying to set them up as a temporary spill destination. This
> means they're typically not busy. It feels like it shouldn't be this
> hard... :(

Have you tried experimenting with REG_ALLOC_ORDER? I see that the GCN port currently isn't using this target macro.

Chung-Lin