Re: I have questions regarding the 4.3 codebase...
On Wed, 22 Mar 2023 18:27:28 -0400
Sid Maxwell via Gcc wrote:

> Is there anyone on the list with experience with the gcc 4.3
> codebase? I'm currently maintaining a fork of it, with a PDP10 code
> generator.
>
> I've run into an issue involving the transformation of a movmemhi to a
> single PDP10 instruction (an xblt, if you're curious). The
> transformation appears to 'lose' its knowledge of being a store,
> resulting in certain following stores being declared dead, and code
> motion that shouldn't happen (e.g. a load moved before the xblt that
> depends on the result of the xblt).
>
> I'm hoping to find someone who can help me diagnose the problem. We
> want to use this instruction rather than the copy-word-loop currently
> generated for struct assignments.

I think we'd need a bit more info than that... what does the movmemhi
instruction pattern look like, for example?

Julian
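For readers following along, here is a minimal sketch of the kind of source that exercises the problem described above: a struct assignment (which a movmem expansion would handle), then a store and loads that depend on the copied data. The struct layout and the function name are hypothetical, chosen only for illustration; this is not taken from the PDP10 port or its test cases.

```c
#include <assert.h>

/* Hypothetical struct, large enough that a compiler may prefer a
   block move over field-by-field copies. */
struct packet {
    int header;
    int payload[16];
    int checksum;
};

/* Copy *src into *dst, update one field, then read two fields back.
   If the block move is not modeled as a store to *dst, an optimizer
   could wrongly drop the following store or hoist the loads above
   the copy. */
int copy_then_update(struct packet *dst, const struct packet *src)
{
    assert(dst && src);
    *dst = *src;          /* struct assignment: the block move */
    dst->header = 7;      /* following store: must not be treated as dead */
    return dst->header + dst->checksum;  /* loads that depend on both */
}
```

Comparing the RTL dumps for a function of this shape with and without the block-move pattern enabled may help show which pass first treats the copy as something other than a store.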
Re: Libgcc divide vectorization question
On Wed, Mar 22, 2023 at 4:57 PM Andrew Stubbs wrote:
>
> On 22/03/2023 13:56, Richard Biener wrote:
> >> Basically, the -ffast-math instructions will always be the fastest way,
> >> but the goal is that the default optimization shouldn't just disable
> >> vectorization entirely for any loop that has a divide in it.
> >
> > We try to express division as multiplication, but yes, I think there's
> > currently no way to tell the vectorizer that vectorized division is
> > available as libcall (nor for any other arithmetic operator that is not
> > a call in the first place).
>
> I have considered creating a new builtin code, similar to the libm
> functions, that would be enabled by a backend hook, or maybe just if
> TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION doesn't return NULL. The
> vectorizer would then use that, somehow. To treat it just like any other
> builtin it would have to be set before the vectorizer pass encounters
> it, which is probably not ideal for all the other passes that want to
> handle divide operators. Alternatively, the vectorizable_operation
> function could detect and introduce the builtin where appropriate.
>
> Would this be acceptable, or am I wasting my time planning something
> that would get rejected?

So why not make it possible for the target to specify there's a libcall
for a specific optab so the vectorizer would simply use vectorized
{TRUNC_DIV,RDIV}_EXPR but the RTL expander would emit a libcall
(in libgcc ways, thus divv2df3 or so)?  It feels wrong to add some
secondary machinery here (like for example having .RDIV internal
function calls instead of a / operator).

I think that for standard unops and binops that would be the default
behavior already. The only piece missing is the vectorizer looking for a
CODE_FOR_* optab handler, and there's currently no way to say "yes I
have a libcall fallback" or "no, no libcall fallback available", or for
a target to specify those (maybe add a (define_libcall ...) alongside
(define_expand ...)?)

A short-circuit would be to use a new target hook to specify libcall
availability, iff the libcall emission works.  There's the remaining
question of whether the libcall emission code works well enough for
vector types, in case the ABI for libcalls doesn't match the ABI for
regular calls.

Richard.

> Thanks
>
> Andrew
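To make the libcall idea above concrete, here is a rough user-level sketch, assuming the GCC/Clang vector_size extension, of what a two-lane double-precision divide helper and a loop using it could look like. The name example_divv2df3 is hypothetical and only echoes the "divv2df3 or so" naming mentioned above; nothing here is an existing libgcc entry point, and the real expansion would happen inside the compiler rather than in source code.

```c
#include <assert.h>

/* GCC/Clang vector extension: two doubles in one 16-byte vector. */
typedef double v2df __attribute__((vector_size(16)));

/* Hypothetical stand-in for a library routine that divides two
   vectors lane by lane. */
v2df example_divv2df3(v2df a, v2df b)
{
    v2df r;
    for (int i = 0; i < 2; i++)
        r[i] = a[i] / b[i];
    return r;
}

/* A loop with a divide in it, written the way a vectorized version
   might look if each vector divide became a call to the helper. */
void divide_arrays(double *out, const double *x, const double *y, int n)
{
    assert(n >= 0);
    int i = 0;
    for (; i + 2 <= n; i += 2) {          /* two elements per step */
        v2df a = { x[i], x[i + 1] };
        v2df b = { y[i], y[i + 1] };
        v2df q = example_divv2df3(a, b);
        out[i] = q[0];
        out[i + 1] = q[1];
    }
    for (; i < n; i++)                    /* scalar tail */
        out[i] = x[i] / y[i];
}
```

The point of the sketch is only the shape of the result: the rest of the loop stays vectorized and the divide alone turns into a call, which is where the question of the libcall ABI for vector arguments comes in.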
Re: [Static Analyzer] Loop handling - False positive for malloc-sm
On 22/03/2023 19:19, David Malcolm wrote:
> On Tue, 2023-03-21 at 09:21 +0100, Pierrick Philippe wrote:

[stripping]

>> In fact, this could be done directly by the analyzer, and only calling
>> state machine APIs for loop handling which still has not reached such a
>> fixed point in their program state for the analyzed loop, with a maximum
>> number of execution fixed by the analyzer to limit execution time.
>>
>> Does what I'm saying make sense?
>
> I think so, though I'm not sure how it would work in practice.
> Consider e.g.
>
>   for (int i = 0; i < n; i++)
>     head = prepend_node (head, i);
>
> which builds a chain of N dynamically-allocated nodes in a linked list.

Well, that would be a case where the loop's analysis will depend on the
state machine. If we consider the malloc-sm, it would allow it to track
each allocated pointer as a different pointer, until the limit of
symbolic execution imposed by the analyzer is reached, or the svalue of
N if it is a known integer at the current analysis point.

For others, such as the file-sm, it would only be needed to symbolically
execute it once, assuming prepend_node() is not opening any files. So
this state machine would not have to be executed more than once on the
loop at this program point by the analyzer.

I think the different steps for such different cases of loop analysis
somehow use the second point of the RFE you shared above.

The "algorithm" I came up with when thinking about it looks like this.
Of course, I'm definitely not an expert on the analyzer, so it is
possibly not feasible.

* Detect loop, and try to get the termination constraint (possibly
  reduced if possible).
* Iterate on the loop's nodes N:
    * If N is the loop's first node:
        * Check if the current program state is in a sufficient state
          to satisfy the loop's termination constraint. If so, stop
          analyzing the loop.
        * Otherwise, check if the maximum number of symbolic executions
          fixed by the analyzer is reached. If so, stop analyzing the
          loop.
        * Otherwise, keep going.
    * Call every sm still impacting their program state map on node N.

This should work for loops iterating on integers; for other kinds of
loops, it might be trickier though.

>> In terms of implementation, loop detection can be done by looking for
>> strongly connected components (SCCs) in a function graph having more
>> than one node.
>> I don't know if this is how it is already done within the analyzer or
>> not?
>
> It isn't yet done in the analyzer, but as noted above there is code in
> GCC that already does that (in cfgloop.{h,cc}).

I definitely have to look at these files.

Thank you for your time,

Pierrick
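A toy model of the bounded iteration outlined above might look like the following. Everything in it (toy_sm, step_saturating, step_unaffected, analyze_loop) is hypothetical and has no connection to the analyzer's real classes; it also leaves out the termination-constraint check and only models "stop when every state machine is stable, or when the iteration budget runs out".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy state machine: step() symbolically executes the
   loop body once and returns true if the machine's state changed. */
typedef struct {
    int state;
    bool (*step)(int *state);
    bool active;   /* false once this machine has reached a fixed point */
} toy_sm;

/* Example machine whose state keeps changing until it saturates at 3
   (loosely, a malloc-like machine tracking a few allocations). */
bool step_saturating(int *state)
{
    if (*state >= 3)
        return false;
    (*state)++;
    return true;
}

/* Example machine the loop body never affects (loosely, a file-like
   machine when the body opens no files). */
bool step_unaffected(int *state)
{
    (void)state;
    return false;
}

/* Run the loop body symbolically until every machine is stable or the
   iteration budget is exhausted.  Returns the number of passes made. */
int analyze_loop(toy_sm *sms, size_t n_sms, int max_iters)
{
    assert(max_iters >= 0);
    for (size_t i = 0; i < n_sms; i++)
        sms[i].active = true;

    int iters = 0;
    while (iters < max_iters) {
        bool any_active = false;
        for (size_t i = 0; i < n_sms; i++) {
            if (!sms[i].active)
                continue;            /* already stable: skip it */
            sms[i].active = sms[i].step(&sms[i].state);
            any_active = any_active || sms[i].active;
        }
        iters++;
        if (!any_active)
            break;                   /* fixed point for all machines */
    }
    return iters;
}
```

In this model the unaffected machine is stepped once and then skipped on later passes, which matches the file-sm case described above, while the saturating machine keeps being stepped until it stops changing or the budget is hit.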
Re: I have questions regarding the 4.3 codebase...
Thanks for reaching out, Julian, I greatly appreciate your help. Please
forgive any over- or under-sharing. If I've left something out, please let
me know.

From my pdp10.md:

    ;; JIRA sw_gcc-68.  gcc recognizes the "movmemhi" 'instruction' for
    ;; doing block moves, as in struct assignment.  This pattern wasn't
    ;; present, however.  I've added it, and its companion function
    ;; pdp10_expand_movmemhi().

    (define_expand "movmemhi"
      [(match_operand:BLK 0 "general_operand" "=r")
       (match_operand:BLK 1 "general_operand" "=r")
       (match_operand:SI 2 "general_operand" "=r")
       (match_operand:SI 3 "const_int_operand" "")
      ]
      ""
      {
        if (pdp10_expand_movmemhi(operands)) {
          DONE;
        }
        FAIL;
        /*
         * if (!pdp10_expand_movmemhi(operands)) {
         *   FAIL;
         * }
         */
      }
    )

And the definition for pdp10_expand_movmemhi:

    // JIRA sw_gcc-68: Emit instructions to copy a memory block.  Return
    // nonzero on success.  This is really a duplicate of the following
    // function, but used specifically for the movmemhi operation.
    // Duplicated to allow for any customization.
    int pdp10_expand_movmemhi(rtx *operands) {
      // operands[0] is the block destination address
      // operands[1] is the block source address
      // operands[2] is the block length
      // operands[3] is the known alignment.

      if (flag_enable_fix_sw_gcc_0068) {
        operands[0] = XEXP(operands[0], 0);
        operands[1] = XEXP(operands[1], 0);

        if (GET_CODE(operands[3]) != CONST_INT) {
          return 0;
        }
        if (INTVAL(operands[3]) != UNITS_PER_WORD) {
          return 0;
        }

        if (GET_CODE(operands[1]) == CONST
            && GET_CODE(XEXP(operands[1], 0)) == PLUS
            && GET_CODE(XEXP(XEXP(operands[1], 0), 1)) == CONST_INT
           ) {
          rtx x = XEXP(operands[1], 0);
          HOST_WIDE_INT offset = INTVAL(XEXP(x, 1));
          operands[1] = plus_constant(XEXP(x, 0), offset & ADDRESS_MASK);
        }

        if (use_xblt(operands[0], operands[1], operands[2])) {
          return expand_xblt(operands[0], operands[1], operands[2]);
        }
        return expand_blt(operands[0], operands[1], operands[2]);
      }

      return 0;
    }

Here are use_xblt() and expand_xblt():

    // Return nonzero if XBLT should be used instead of BLT.
    static int use_xblt(rtx destination, rtx source, rtx length) {
      if (!HAVE_XBLT || !TARGET_EXTENDED) {
        return 0;
      }
      if (GET_CODE(destination) != CONST_INT) {
        return 1;
      }
      if (source && GET_CODE(source) != CONST_INT) {
        return 1;
      }
      if (GET_CODE(length) != CONST_INT) {
        return 1;
      }
      if (INTVAL(length) >> 32 > 0) {
        return 1;
      }
      if (source) {
        HOST_WIDE_INT destination_section = INTVAL(destination) >> 32;
        HOST_WIDE_INT source_section = INTVAL(source) >> 32;
        if (destination_section != source_section) {
          return 1;
        }
      }
      return 0;
    }

    // Try to emit instructions to XBLT a memory block of LENGTH storage
    // units from SOURCE to DESTINATION.  Return nonzero on success.
    static int expand_xblt(rtx destination, rtx source, rtx length) {
      rtx temp, mem, acs;
      int i, n;

      if (GET_CODE(length) != CONST_INT) {
        return 0;
      }

      n = INTVAL(length);
      switch (n / UNITS_PER_WORD) {
      case 0:
        break;
      case 2:
      case 3:
        temp = gen_reg_rtx(DImode);
        for (i = 0; i < n / 8; i++) {
          emit_move_insn(temp, gen_rtx_MEM(DImode, plus_constant(source, i)));
          emit_move_insn(gen_rtx_MEM(DImode, plus_constant(destination, i)), temp);
        }
        // Fall through.
      case 1:
        if (n % 8 >= 4) {
          int m = 2 * (n / 8);
          temp = gen_reg_rtx(SImode);
          emit_move_insn(temp, gen_rtx_MEM(SImode, plus_constant(source, m)));
          emit_move_insn(gen_rtx_MEM(SImode, plus_constant(destination, m)), temp);
        }
        break;
      default:
        acs = gen_reg_rtx(TImode);
        emit_move_insn(gen_rtx_SUBREG(SImode, acs, 0), GEN_INT(INTVAL(length) / UNITS_PER_WORD));
        emit_move_insn(gen_rtx_SUBREG(Pmode, acs, 4), source);
        emit_move_insn(gen_rtx_SUBREG(Pmode, acs, 8), destination);
        emit_insn(gen_XBLT(acs));
        break;
      }

      switch (n % UNITS_PER_WORD) {
      case 0:
        break;
      case 1:
        temp = gen_reg_rtx(QImode);
        mem = gen_rtx_MEM(QImode, plus_constant(source, n / 4));
        set_mem_align(mem, BITS_PER_WORD);
        emit_move_insn(temp, mem);
        mem = gen_rtx_MEM(QImode, plus_constant(destination, n / 4));
        set_mem_align(mem, BITS_PER_WORD);
        emit_move_insn(mem, temp);
        break;
      case 2:
        temp = gen_reg_rtx(HImode);
        mem = gen_rtx_MEM(HImode, plus_constant(source, n / 4));
        set_mem_align(mem, BITS_PER_WORD);
        emit_move_insn(temp, mem);
        mem = gen_rtx_MEM(HImode, plus_constant(destination, n / 4));
        set_mem_align(mem, BITS_PER_WORD);
        emit_move_insn(mem, temp);
        break;
      case 3:
        temp = gen_reg_rtx(SImode);
        emit_insn(gen_extzv(temp, gen_rtx_MEM(SImode, plus_constant(source, n / 4)), GEN_INT(27), GEN_INT(0)));
Re: I have questions regarding the 4.3 codebase...
> On Mar 23, 2023, at 10:13 AM, Sid Maxwell via Gcc wrote:
>
> Thanks for reaching out, Julian, I greatly appreciate your help. Please
> forgive any over- or under-sharing. If I've left something out, please let
> me know.
>
> From my pdp10.md:
>
> ;; JIRA sw_gcc-68.  gcc recognizes the "movmemhi" 'instruction' for
> ;; doing block moves, as in struct assignment.  This pattern wasn't
> ;; present, however.  I've added it, and its companion function
> ;; pdp10_expand_movmemhi().
>
> (define_expand "movmemhi"
>   [(match_operand:BLK 0 "general_operand" "=r")
>    (match_operand:BLK 1 "general_operand" "=r")
>    (match_operand:SI 2 "general_operand" "=r")
>    (match_operand:SI 3 "const_int_operand" "")
>   ]...

I don't remember that far back, but looking at current examples (like
vax.md) that seems like an odd pattern. vax.md has a three-operand
pattern with the first two marked as "memory_operand", and only the
first one has the = modifier on it, showing it's an output operand.

What does the 4.3 version of gccint say about it? Or the 4.3 version of
vax.md?

paul
Re: I have questions regarding the 4.3 codebase...
I'll take a look, Paul, thanks. It hadn't occurred to me to compare
different machines' uses.

-+- Sid

On Thu, Mar 23, 2023 at 10:29 AM Paul Koning wrote:

> > On Mar 23, 2023, at 10:13 AM, Sid Maxwell via Gcc wrote:
> >
> > Thanks for reaching out, Julian, I greatly appreciate your help. Please
> > forgive any over- or under-sharing. If I've left something out, please
> > let me know.
> >
> > From my pdp10.md:
> >
> > ;; JIRA sw_gcc-68.  gcc recognizes the "movmemhi" 'instruction' for
> > ;; doing block moves, as in struct assignment.  This pattern wasn't
> > ;; present, however.  I've added it, and its companion function
> > ;; pdp10_expand_movmemhi().
> >
> > (define_expand "movmemhi"
> >   [(match_operand:BLK 0 "general_operand" "=r")
> >    (match_operand:BLK 1 "general_operand" "=r")
> >    (match_operand:SI 2 "general_operand" "=r")
> >    (match_operand:SI 3 "const_int_operand" "")
> >   ]...
>
> I don't remember that far back, but looking at current examples (like
> vax.md) that seems like an odd pattern. vax.md has a three-operand
> pattern with the first two marked as "memory_operand", and only the
> first one has the = modifier on it, showing it's an output operand.
>
> What does the 4.3 version of gccint say about it? Or the 4.3 version of
> vax.md?
>
> paul
gcc-10-20230323 is now available
Snapshot gcc-10-20230323 is now available on
  https://gcc.gnu.org/pub/gcc/snapshots/10-20230323/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 10 git branch
with the following options:
  git://gcc.gnu.org/git/gcc.git branch releases/gcc-10
  revision e6623d1dc48526e1a9be40661a0ad5adc67a4e07

You'll find:

  gcc-10-20230323.tar.xz    Complete GCC

  SHA256=f57b6cce78eb6427ca34538459ab604d7ba4e0b498e3798567e8a06a1b59763b
  SHA1=539f78a341f00d762b1ee4471b98be048459fdb5

Diffs from 10-20230316 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-10
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.