Re: s390: SImode pointers vs LR
On 06/02/2015 07:13 PM, Jeff Law wrote:
> But isn't that 3 registers used in the address computation if the
> (const_int 1) gets reloaded? one of the value shifted, two for the
> shift count? I'm not familiar with the s390, so if you can handle that
> kind of insn, then, umm, cool.

The address style operand is only the shift count. Our instructions support base + displacement here. E.g.

sll %r2,%r3(3)

is r2 << (r3 + 3)

> The only other thing that comes immediately to mind would be secondary
> reloads. But I always hate suggesting them.

I don't see how this would help here. It is not really that reload needs help moving something to/from a register. In fact the INSN is good as is and we are trying to prevent reload from doing anything.

Bye,

-Andreas-
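For readers unfamiliar with the s390, the base + displacement shift count described above can be modelled in a few lines of C. This is only a sketch: the function name `s390_sll_model` is hypothetical, and the masking to the low six bits (mentioned elsewhere in this thread) and the zero result for counts of 32 or more are assumptions about the 32-bit logical left shift, not text taken from the port.

```c
#include <stdint.h>

/* Hypothetical model of a 32-bit logical left shift whose count is
   given as base register + displacement.  Assumption: only the low
   six bits of (base + disp) are used as the shift count, and a count
   of 32 or more shifts every bit out of a 32-bit value.  */
uint32_t s390_sll_model(uint32_t value, uint64_t base, uint32_t disp)
{
    unsigned count = (unsigned)((base + disp) & 0x3f);

    /* Shifting a 32-bit value by >= 32 is undefined in C, so handle
       that case explicitly instead of relying on the << operator.  */
    return count >= 32 ? 0u : value << count;
}
```

With base = 2 and displacement = 3 this computes value << 5, which matches the "r2 << (r3 + 3)" reading given in the message.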
Re: s390: SImode pointers vs LR
On 06/03/2015 12:53 AM, Richard Henderson wrote:
> On 06/02/2015 08:32 AM, Andreas Krebbel wrote:
>> -(define_insn "*3"
>> +(define_insn "*3_reg"
>>    [(set (match_operand:GPR 0 "register_operand" "=d")
>>          (SHIFT:GPR (match_operand:GPR 1 "register_operand" "")
>> -                   (match_operand:SI 2 "shift_count_or_setmem_operand" "Y")))]
>> +                   (match_operand:SI 2 "register_operand" "a")))]
>>    ""
>> -  "sl\t%0,<1>%Y2"
>> +  "sl\t%0,<1>%2"
>> +  [(set_attr "op_type" "RS")
>> +   (set_attr "atype" "reg")])
>> +
>> +(define_insn "*3_imm"
>> +  [(set (match_operand:GPR 0 "register_operand" "=d")
>> +        (SHIFT:GPR (match_operand:GPR 1 "register_operand" "")
>> +                   (match_operand 2 "immediate_operand" "J")))]
>> +  ""
>> +  "sl\t%0,<1>%2"
>> +  [(set_attr "op_type" "RS")
>> +   (set_attr "atype" "reg")])
>
> These two ought not be split apart. They're simple alternatives.

Right. That was just a quick copy and paste hack to check if it works.

> And why SImode?

Other modes would work as well since the instruction only uses the lower 6 bits anyway. But what's wrong with SImode?

Bye,

-Andreas-
RE: Question about find modifiable mems
-----Original Message-----
From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf Of shmeel gutl
Sent: Wednesday, June 03, 2015 12:10 PM
To: GCC Development
Subject: Question about find modifiable mems

>>find_modifiable_mems was introduced to gcc 4.8 in september 2012. Is there
>>any documentation as to how it is supposed to help the haifa scheduler?

>>In my private port of gcc it make the following type of transformations
>>from
>>a= *(b+20)
>>b+=30
>>to
>> b+=30
>>a=*(b-10)

>>Although this is functionally correct, it has changed an ANTI_DEP into a
>>TRUE_DEP and thus introduced stalls. If it went the other way, that would be
>>good.

>>Any pointers?

Breaking Anti-Dependencies is an important optimization for transformation like Vectorization.

Thanks & Regards
Ajit

Thanks,
Shmeel
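The two instruction orders from the question can be written out in C to show that they compute the same result while differing in how the load relates to the pointer update. This is a sketch only: the function names are hypothetical, and the offsets are taken as element counts on an `int` pointer, which is an assumption about the units in the original example.

```c
/* Original order: the load reads b before the increment overwrites
   it, so the increment only has an anti dependence on the load.  */
int load_then_bump(const int **b)
{
    int a = *(*b + 20);
    *b += 30;
    return a;
}

/* Order after the transformation: the load now needs the updated
   value of b, so it has a true dependence on the increment.  The
   offset is adjusted by -30 (20 - 30 = -10) to reach the same
   element.  */
int bump_then_load(const int **b)
{
    *b += 30;
    int a = *(*b - 10);
    return a;
}
```

Both functions return the same element and leave the pointer at the same position; only the ordering constraint between the load and the update changes, which is the effect on the scheduler that the question describes.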
parameters to _mm_mwait intrinsic
Hi,

I was going through the "monitor" and "mwait" builtin implementation. I need clarification on the parameters passed to the _mm_mwait intrinsic.

We have the following defined in "pmmintrin.h":

extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_monitor (void const * __P, unsigned int __E, unsigned int __H)
{
  __builtin_ia32_monitor (__P, __E, __H);
}

extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_mwait (unsigned int __E, unsigned int __H)
{
  __builtin_ia32_mwait (__E, __H);
}

I assume the parameter names indicate:
P -> Address
E -> Extensions
H -> Hints

Mwait as per the AMD ISA manual.
Ref: http://amd-dev.wpengine.netdna-cdn.com/wordpress/media/2008/10/24594_APM_v3.pdf

(---Snip---)
EAX specifies optional hints for the MWAIT instruction. There are currently no hints defined and all bits should be 0. Setting a reserved bit in EAX is ignored by the processor.

ECX specifies optional extensions for the MWAIT instruction. The only extension currently defined is ECX bit 0, which allows interrupts to wake MWAIT, even when eFLAGS.IF = 0. Support for this extension is indicated by a feature flag returned by the CPUID instruction. Setting any unsupported bit in ECX results in a #GP exception.
(---Snip---)

Mwait as defined in the Intel ISA manual.
Ref: http://www.intel.in/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-instruction-set-reference-manual-325383.pdf

(---Snip---)
This instruction's operation is the same in non-64-bit modes and 64-bit mode. ECX specifies optional extensions for the MWAIT instruction. EAX may contain hints such as the preferred optimized state the processor should enter. The first processors to implement MWAIT supported only the zero value for EAX and ECX. Later processors allowed setting ECX[0] to enable masked interrupts as break events for MWAIT (see below). Software can use the CPUID instruction to determine the extensions and hints supported by the processor
(---Snip---)

So if a user calls _mm_mwait (__E, __H), __E should go into ECX and __H should go into EAX.

However, I see the following implementation in GCC:

(---snip---)
    case IX86_BUILTIN_MWAIT:
      arg0 = CALL_EXPR_ARG (exp, 0);
      arg1 = CALL_EXPR_ARG (exp, 1);
      op0 = expand_normal (arg0);
      op1 = expand_normal (arg1);
      if (!REG_P (op0))
        op0 = copy_to_mode_reg (SImode, op0);
      if (!REG_P (op1))
        op1 = copy_to_mode_reg (SImode, op1);
      emit_insn (gen_sse3_mwait (op0, op1));
      return 0;

(define_insn "sse3_mwait"
  [(unspec_volatile [(match_operand:SI 0 "register_operand" "a")
                     (match_operand:SI 1 "register_operand" "c")]
                    UNSPECV_MWAIT)]
  "TARGET_SSE3"
;; 64bit version is "mwait %rax,%rcx". But only lower 32bits are used.
;; Since 32bit register operands are implicitly zero extended to 64bit,
;; we only need to set up 32bit registers.
  "mwait"
  [(set_attr "length" "3")])
(---snip---)

Here the first argument __E is moved to "EAX" and __H is moved to "ECX".

Should the constraints be swapped for the operands in the pattern? Or is my understanding wrong?

Regards,
Venkat.
Re: parameters to _mm_mwait intrinsic
On Wed, Jun 3, 2015 at 2:47 PM, Kumar, Venkataramanan wrote:
> Hi,
>
> I was going through the "monitor" and "mwait" builtin implementation.
> I need clarification on the parameters passed to _mm_mwait intrinsic.
>
> Should the constraint be swapped for the operands in the pattern?

Please swap the constraints in the pattern.

Patch is pre-approved for mainline and release branches.

Thanks,
Uros.
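The register assignment that the manuals quoted in this thread call for can be captured in a small C model. It is a sketch only: `struct mwait_regs` and `mwait_regs_intended` are hypothetical names used for illustration, and nothing here executes the (privileged) MWAIT instruction or reflects the actual text of the approved patch.

```c
#include <stdint.h>

/* Hypothetical snapshot of the two registers MWAIT reads.  */
struct mwait_regs {
    uint32_t eax;   /* hints */
    uint32_t ecx;   /* extensions */
};

/* Model of the intended mapping for _mm_mwait (E, H): the first
   argument (extensions) belongs in ECX and the second (hints) in
   EAX, as described by the AMD and Intel manuals quoted above.  */
struct mwait_regs mwait_regs_intended(uint32_t extensions, uint32_t hints)
{
    struct mwait_regs r;
    r.ecx = extensions;
    r.eax = hints;
    return r;
}
```

For example, a call that sets extension bit 0 (interrupts as break events) with no hints should end up with ECX = 1 and EAX = 0, which is what swapping the "a" and "c" constraints in the pattern is meant to achieve.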
Re: i386: does gcc work with CS ≠ DS?
On 06/02/2015 10:44 PM, H. Peter Anvin wrote:
> Hi guys, another low level question:
>
> Obviously gcc for i386 requires DS = ES = SS (with FS and GS don't
> care), but does gcc also require CS = DS?

I don't believe so. In these modern times we don't place switch statement tables, or other constant data, in the .text section. Just map the correct sections to the correct segments and you should be fine.

What advantage are you looking for?

r~
gcc-4.9-20150603 is now available
Snapshot gcc-4.9-20150603 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.9-20150603/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.9 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_9-branch revision 224106

You'll find:

 gcc-4.9-20150603.tar.bz2             Complete GCC

  MD5=15a5364ce3de48e8708f366b13803168
  SHA1=72f655df6f472a38966ab556b3c3060d6a2dad2e

Diffs from 4.9-20150527 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.9
link is updated and a message is sent to the gcc list. Please do not use
a snapshot before it has been announced that way.
Static Chain Register on iOS AArch64
Hello,

I noticed the following comment in the GCC source ( https://github.com/gcc-mirror/gcc/blob/7c62dfbbcd3699efcbbadc9fb3aa14f23a123add/libffi/src/aarch64/ffitarget.h#L66 ):

/* iOS reserves x18 for the system. Disable Go closures until a new static chain is chosen. */

Based on this comment, it sounds as if GCC hasn't yet decided which register to use for the static chain pointer on iOS AArch64. Is this correct?

As I understand it, x18 (the platform register) is not reserved on Linux and hence can be used by GCC. I couldn't find anything stating this explicitly, so could you confirm it?

As for the register to choose on iOS AArch64, it seems like either x16 or x17 (the intra-procedure-call scratch registers) would be a good choice, in the same way that r12 is used for 32-bit ARM. Does this seem sensible, or is there some reason for rejecting these registers?

I'd appreciate anything anyone can tell me about the above. In case you're interested, the context for this is: http://comments.gmane.org/gmane.comp.compilers.llvm.devel/86370

Thanks,
Stephen Cross
Re: Question about find modifiable mems
On 06/02/2015 11:39 PM, shmeel gutl wrote:
> find_modifiable_mems was introduced to gcc 4.8 in september 2012. Is
> there any documentation as to how it is supposed to help the haifa
> scheduler?

The patch was submitted here
https://gcc.gnu.org/ml/gcc-patches/2012-08/msg00155.html
and this message contains a brief explanation of what it is supposed to do. The explanation looks like a useful optimization, but perhaps it is triggering in cases when it shouldn't.

Jim
Re: [i386] Scalar DImode instructions on XMM registers
On 05/27/2015 07:20 AM, Ilya Enkovich wrote:
> I looked into assign_stack_local_1 call for this spill. LRA correctly
> requests 16 bytes size with 16 bytes alignment. But
> assign_stack_local_1 reduces alignment to 8 because estimated stack
> alignment before RA is 8 and requested mode's (DI) alignment fits it.
> Probably LRA should pass biggest_mode of the reg when requesting a
> stack slot?

It's hard to say for sure. Within the lra_reg structure, biggest_mode refers to the largest mode in which a pseudo is referenced. So for a pseudo it might make sense. Presumably the biggest_mode for the pseudo in question is larger than DImode, right?

> I handled it by increasing stack_alignment_estimated when transforming
> some instructions to vector mode.

I haven't looked deeply, but if your pass runs after stack_alignment_estimated is initially computed, then this seems like a desirable way to fix the problem.

jeff
Announcing James Bowman as FT32 port maintainer
I'm pleased to announce James Bowman has been appointed as the maintainer for the FT32 port. James, can you please add yourself to the MAINTAINERS file. Thanks, Jeff