Re: AArch64 and -moutline-atomics

2020-05-19 Thread Richard Henderson
On 5/19/20 3:38 AM, Florian Weimer via Gcc wrote: > One minor improvement would be to document __aarch64_have_lse_atomics as > interposable on the GCC side and define that directly in glibc, so that > lse-init.o is not linked in anymore and __aarch64_have_lse_atomics can > be initialized as soon as

Re: Cortex M0 Floating Point Library

2018-11-11 Thread Richard Henderson
On 11/9/18 9:58 PM, Daniel Engel wrote: > Is the linker aware of section hierarchy, such that using a common section > prefix (e.g. ".text.m0fp.*") would gather the appropriate sections together > from multiple object files? The linker script is not written like that. But we could reasonably re

Re: Cortex M0 Floating Point Library

2018-11-08 Thread Richard Henderson
On 11/7/18 6:10 PM, Daniel Engel wrote: > Also, loss of control of linking order would require all short branches in > the libm section to be replaced with long branches. This particularly > impacts the exception handling in almost every function. You could partially remedy this by placing all

Re: RISC-V vector extension cauldron discussion

2018-09-11 Thread Richard Henderson
On 09/11/2018 09:28 AM, Palmer Dabbelt wrote: >> The RISC-V vector extension described something other than what is >> present in the currently released 2.2 standard.  To clarify the >> language within this message, based on what I remember: > > Yes.  The current RISC-V ISA standard contains no ve

RISC-V vector extension cauldron discussion

2018-09-08 Thread Richard Henderson
This attempts to capture (most of) the two hour discussion that we had in the Entrance Hall at GNU Cauldron yesterday. Please correct any faulty memory on my part and forward or cc this to the appropriate RISC-V forum. The RISC-V vector extension described something other than what is present in

Re: Redundant sign-extension instructions on RISC-V

2017-09-06 Thread Richard Henderson
On 09/06/2017 10:17 AM, Richard Henderson wrote: >> Yea. I'd also expect zero/nonzero bits tracking in combine to catch >> this. Shouldn't the sign bit be known to be zero after the shift which >> makes the extension redundant regardless of the SUBREG_PROMOTED flag?

Re: Redundant sign-extension instructions on RISC-V

2017-09-06 Thread Richard Henderson
On 09/06/2017 09:53 AM, Jeff Law wrote: >> I think the easiest solution to this is for combine to notice when IOR has >> operands with non-zero-bits that do not overlap, convert the operation to >> ADD. >> That allows the final two insns to fold to "addw" and the compiler need do no >> further ana
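
A minimal C sketch of the identity behind the IOR-to-ADD suggestion quoted above. It is not combine's code, and `ior_as_add` is a hypothetical helper name. When two operands have no set bit in common, no bit position can generate a carry, so OR and ADD produce the same value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns a | b computed as a + b.
   Only valid when the operands share no set bits, which is the
   condition the quoted text wants combine to detect. */
static uint32_t ior_as_add(uint32_t a, uint32_t b)
{
    assert((a & b) == 0);   /* non-overlapping nonzero bits */
    return a + b;           /* same result as a | b under that condition */
}
```

For example, `ior_as_add(0xff00, 0x00ff)` yields the same value as `0xff00 | 0x00ff`.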

Re: Redundant sign-extension instructions on RISC-V

2017-09-06 Thread Richard Henderson
On 08/30/2017 02:43 AM, Michael Clark wrote: > POINTERS_EXTEND_UNSIGNED -1 (which is true) is defined on some targets. I > assume they sign-extend but the meaning has been overloaded. Just for your edification, this is for e.g. ia64's "addp4" instruction and it is not a normal extension. A 2-bit

Re: Redundant sign-extension instructions on RISC-V

2017-09-06 Thread Richard Henderson
On 08/29/2017 05:36 PM, Michael Clark wrote: > We’re investigating an issue with redundant sign-extension instructions being > emitted with the riscv backend. Firstly I would like to state that riscv is > possibly a unique backend with respect to its canonical sign-extended > register form due t

Re: Bit-field struct member sign extension pattern results in redundant

2017-08-18 Thread Richard Henderson
On 08/17/2017 03:29 PM, Michael Clark wrote: > hand coded x86 asm (no worse because the sar depends on the lea) > > sx5(int): > shl edi, 27 > sar edi, 27 > movsx eax, dl Typo in the register, but I know what you mean. More interestingly, edi already has the sign-ext

Re: help with PR78809 - inline strcmp for small constant strings

2017-08-04 Thread Richard Henderson
On 08/04/2017 01:38 PM, Wilco Dijkstra wrote: >> For constant strings of small length (up to 3?), I was wondering if it'd be a >> good idea to manually unroll strcmp loop, similar to __strcmp_* macros in >> bits/string.h? >> For example, in gimple-fold, transform >> x = __builtin_strcmp(s, "ab") >> to >>
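
The transform itself is cut off in the snippet above, so the following is only a sketch of what a hand-unrolled comparison against the constant "ab" could look like, not the form proposed in the thread. `strcmp_ab` is a hypothetical name. The result has the same sign as `strcmp(s, "ab")`, and the code never reads past the terminator of `s`.

```c
/* Hypothetical unrolled equivalent of strcmp(s, "ab").
   Bytes are compared as unsigned char, as strcmp requires. */
static int strcmp_ab(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;
    int d;

    d = p[0] - 'a';
    if (d != 0)
        return d;       /* also taken when s is empty, so p[1] is not read */
    d = p[1] - 'b';
    if (d != 0)
        return d;       /* also taken when s is "a" */
    return p[2];        /* compare against the terminating '\0' of "ab" */
}
```

Only the sign of the result is meaningful to callers, which matches the guarantee `strcmp` gives.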

Re: help with PR78809 - inline strcmp for small constant strings

2017-08-04 Thread Richard Henderson
On 08/04/2017 05:59 AM, Prathamesh Kulkarni wrote: > Hi, > I was having a look at PR78809. > For the test-case: > int t1(const char *s) { return __builtin_strcmp (s, "a"); } > > for aarch64, trunk with -O2 generates: > t1: > adrp x1, .LANCHOR0 > add x1, x1, :lo12:.LANCHOR0 >

Re: [patch] RFC: Hook for insn costs?

2017-08-02 Thread Richard Henderson
On 08/02/2017 12:34 PM, Richard Earnshaw wrote: > I'm not sure if that's a good or a bad thing. Currently the mid-end > depends on some rtx constructs having sensible costs even if there's no > rtl pattern to match them (IIRC plus:QI is one such construct - RISC > type machines usually lack such a

Re: [patch] RFC: Hook for insn costs?

2017-07-17 Thread Richard Henderson
On 07/17/2017 12:20 AM, Richard Biener wrote: On Sun, Jul 16, 2017 at 12:51 AM, Segher Boessenkool Now what should it take as input? An rtx_insn, or just the pattern (as insn_rtx_cost does)? Is there any useful info on the other operands of an rtx_insn? If not then passing in the pattern (a

Re: libatomic IFUNC question (arm & libat_have_strexbhd)

2017-06-07 Thread Richard Henderson
On 06/07/2017 01:31 PM, Florian Weimer wrote: With lazy binding, the constructors of libraries should run in graph dependency order, which means this constructor should run before any users. Except when another shared object uses the function from its own ELF constructor, and the libatomic cons

Re: libatomic IFUNC question (arm & libat_have_strexbhd)

2017-06-07 Thread Richard Henderson
On 06/05/2017 10:50 PM, Florian Weimer wrote: * Steve Ellcey: I have a question about the use of IFUNCs in libatomic. I was looking at the arm implementation and in gcc/libatomic/config/linux/arm/host-config.h I see: extern bool libat_have_strexbhd HIDDEN; # define IFUNC_COND_

Re: -mcx16 vs. not using CAS for atomic loads

2017-01-24 Thread Richard Henderson
On 01/24/2017 01:08 AM, Torvald Riegel wrote: > Unless HW transactions are guaranteed to succeed for scenarios that are > sufficient for the atomics, HTM won't help because we'd have to consider > the worst-case, which would mean some non-HTM fallback. We're talking about a 16 byte aligned load he

Re: -mcx16 vs. not using CAS for atomic loads

2017-01-20 Thread Richard Henderson
On 01/19/2017 10:23 AM, Torvald Riegel wrote: * Option 3a: -mcx16 continues to only mean that cmpxchg16b is available, and we keep __sync builtins unchanged. This doesn't break valid uses of __sync* (eg, if they didn't need atomic loads at all). We change __atomic for 16-byte to not use cmpxchg1

Re: GCC libatomic ABI specification draft

2017-01-20 Thread Richard Henderson
On 01/20/2017 05:41 AM, Michael Matz wrote: Hi, On Wed, 18 Jan 2017, Richard Henderson wrote: Section 3 Rationale, alternative 1: I'm wondering if the example is correct. For a 4-byte-aligned type of size 3, the implementation cannot simply use 4-byte hardware-backed atomics because

Re: GCC libatomic ABI specification draft

2017-01-18 Thread Richard Henderson
On 01/17/2017 09:00 AM, Torvald Riegel wrote: I think the ABI should set a baseline for each architecture, and the baseline decides whether something is inlinable or not. Thus, the x86_64 ABI would make __int128 operations not inlinable (because of the issues with cmpxchg16b, see above). If use

Re: Fwd: Re: GCC libatomic questions

2016-07-06 Thread Richard Henderson
CMPXCHG16B is not always available on 64-bit x86 platforms, so 16-byte naturally aligned atomics are not inlineable. The support functions for such atomics are free to use lock-free implementation if the instruction is available on specific platforms. Except that it is available on almost all 64

Fwd: Re: GCC libatomic questions

2016-07-06 Thread Richard Henderson
Redirecting to the gcc list for discussion. I'll follow up on that thread directly. r~ Forwarded Message Subject: Re: GCC libatomic questions Date: Wed, 6 Jul 2016 10:27:20 -0700 From: Bin Fan Organization: Oracle Corporation To: Richard Henderson

Re: (R5900) Implementing Vector Support

2016-06-03 Thread Richard Henderson
On 06/03/2016 05:54 AM, Woon yung Liu wrote: The problem is that gen_lowpart() doesn't seem to support casting to other modes of the same size. It certainly does. The only place you get into trouble with gen_lowpart is with CONST_INT, which is mode-less. But I am already doubting that I w

Re: Implementing atomic load as compare-and-swap for read-only memory

2016-06-03 Thread Richard Henderson
On 06/03/2016 05:32 AM, Jakub Jelinek wrote: A change from wide CAS to locking would be an ABI change I suppose, but it could also be considered a necessary bugfix if we don't want to write to read-only memory. Does this affect anything but i686? Also x86_64 (for 128-bit atomics), clearly also

Re: (R5900) Implementing Vector Support

2016-05-31 Thread Richard Henderson
On 05/29/2016 12:59 AM, Woon yung Liu wrote: Hi Richard, I have solved the problems with the mulv8hi3 pattern; I needed to adjust the code within mips.c to allow the double-sized vector modes and to allow vector modes into the LO+HI accumulators. Yes, I should have mentioned that you would n

Re: (R5900) Implementing Vector Support

2016-05-18 Thread Richard Henderson
On 05/18/2016 05:16 AM, Woon yung Liu wrote: I didn't know that, thanks. I've re-done the instructions and expands, mostly based off the stuff that you shared earlier. Unfortunately, the test function wouldn't compile: testv.c: In function 'testv8mult': testv.c:87:1: error: unrecognizable ins

Re: (R5900) Implementing Vector Support

2016-05-16 Thread Richard Henderson
On 05/14/2016 03:21 AM, Woon yung Liu wrote: The current constraints allow GCC to access the 64-bit LO+HI register pair as a single 128-bit register, so I am cheating by using both the x and wr (new constraint for LO1+HI1) constraints. That doesn't seem right. The x constraint is for the hi/lo

Re: (R5900) Implementing Vector Support

2016-05-16 Thread Richard Henderson
On 05/15/2016 03:43 AM, Woon yung Liu wrote: testv.c:70:2: note: ==> examining statement: _5 = (int) _4; You need to implement the vec_unpack* patterns. But how can I tell what operations are required by autovectorization, that are currently not supported? Well, the dumps you're looking

Re: (R5900) Implementing Vector Support

2016-05-11 Thread Richard Henderson
On 05/11/2016 04:54 AM, Woon yung Liu wrote: I saw that the EE has the PMFHL.LH instruction, which loads the HI/LO register pairs (containing the multiplication result) into a single destination (i.e. truncates the multiplication result in the process), with the right order too. I suppose that i

Re: (R5900) Implementing Vector Support

2016-05-09 Thread Richard Henderson
On 05/06/2016 09:28 PM, Woon yung Liu wrote: Regarding multiplication of vectors, is there a way to work with a multiplication operation that results in something like this (the result is spread across these 3 registers), without re-ordering any elements: RD: A6xB6, A4xB4, A2xB2, A0xB0 LO: A7

Re: (R5900) Implementing Vector Support

2016-05-02 Thread Richard Henderson
On 04/29/2016 07:54 AM, Liu Woon Yung wrote: I've done something like that, but GCC still doesn't select the pattern to use: (define_insn "vec_cmp" Because you've used the wrong name. The patterns are: OPTAB_CD(vec_cmp_optab, "vec_cmp$a$b") OPTAB_CD(vec_cmpu_optab, "vec_cmpu$a$b") I see

Re: Some aliasing questions

2016-04-12 Thread Richard Henderson
On 04/11/2016 05:30 PM, Alan Modra wrote: Either way, when we split to set (reg tmp) (high (const (minus ((symbol_ref) (reg 2) .. mem (lo_sum (reg tmp) (const (minus ((symbol_ref) (reg 2) both high and lo_sum reference r2 and the linker could happily replace rtmp in the lo

Re: Some aliasing questions

2016-04-08 Thread Richard Henderson
On 04/08/2016 11:10 AM, Bill Schmidt wrote: > The first is an issue with TOC-relative addresses on PowerPC. These are > symbolic addresses that are to be loaded from a fixed slot in the table > of contents, as addressed by the TOC pointer (r2). In the RTL phases > prior to register allocation, th

Re: (R5900) Implementing Vector Support

2016-04-04 Thread Richard Henderson
On 04/03/2016 09:12 PM, Woon yung Liu wrote: I can't figure out how to implement comparison operations (specifically, equals and the greater than operators). The GCC documentation mentions that the pattern for comparison (==) should be vec_cmp, but I don't understand why it has 4 operands and wha

Re: [RFC] When adding an offset to a lo_sum containing a symbol, check it is in range of the symbol's alignment

2016-04-04 Thread Richard Henderson
On 04/04/2016 05:36 AM, Andrew Bennett wrote: Hi, In MIPS (and similarly for other RISC architectures) to load an absolute address of an object requires a two instruction sequence: one instruction to load the high part of the object's address, and one instruction to load the low part of the ob
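
As an illustration of the two-part address load described in the quoted text, the sketch below splits a 32-bit address into a high and a low half and recombines them. It assumes the usual MIPS convention that the low 16 bits are added as a signed value, so the high half is biased to compensate. The helper names are hypothetical and the code is not taken from the thread.

```c
#include <stdint.h>

/* High half, as a "load upper" instruction would materialize it.
   The 0x8000 bias compensates for the low half being signed. */
static uint32_t addr_hi(uint32_t addr)
{
    return ((addr + 0x8000u) >> 16) & 0xffffu;
}

/* Low half, interpreted as a signed 16-bit displacement. */
static int32_t addr_lo(uint32_t addr)
{
    int32_t lo = (int32_t)(addr & 0xffffu);
    return lo >= 0x8000 ? lo - 0x10000 : lo;
}

/* Recombine the halves the way the two-instruction sequence would. */
static uint32_t addr_join(uint32_t hi, int32_t lo)
{
    return (hi << 16) + (uint32_t)lo;
}
```

For the address 0x12348000 the low half is -32768, so the high half becomes 0x1235 rather than 0x1234. This coupling between the two halves is why an offset cannot always be folded into the low part alone.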

Re: Mysterious decision in combine

2016-03-22 Thread Richard Henderson
On 03/21/2016 06:31 AM, Dominik Vogt wrote: > Why does it drop the "parallel" and "clobber" in the combination; > is there a way to force combine to keep that? > > Trying 6 -> 7: > Failed to match this instruction: > (set (reg:DI 65) > (and:DI (subreg:DI (mem:SI (reg:DI 2 %r2 [ a ]) [1

Re: Mysterious decision in combine

2016-03-19 Thread Richard Henderson
On 03/16/2016 11:35 PM, Dominik Vogt wrote: > How does combine get this idea (it's the only match in the > function)? > > Trying 7 -> 12: > Successfully matched this instruction: > (set (reg/i:DI 2 %r2) > (and:DI (subreg:DI (reg:SI 64) 0) > (const_int 4294901775 [0xffff000f])

Re: Implementing TI mode (128-bit) and the 2nd pipeline for the MIPS R5900

2016-03-09 Thread Richard Henderson
On 03/09/2016 08:45 AM, Woon yung Liu wrote: 3. due to the current register size (UNITS_PER_WORD) definition, allocating a TI mode register will cause two consecutive registers to be allocated instead (like the HILO pseudo register) of one (other than just being wrong, it is probably wasteful).

Re: Validity of SUBREG+AND-imm transformations

2016-03-08 Thread Richard Henderson
On 03/07/2016 02:49 PM, Jeff Law wrote: On 03/07/2016 03:44 AM, Kyrill Tkachov wrote: The RTL documentation for ASHIFT and friends says that the shift amount must be: "a fixed-point mode or be a constant with mode @code{VOIDmode}; which mode is determined by the mode called for in the machine

Re: Implementing TI mode (128-bit) and the 2nd pipeline for the MIPS R5900

2016-03-05 Thread Richard Henderson
On 02/27/2016 01:38 AM, Woon yung Liu wrote: I've given up on trying to implement MMI support for this target because I couldn't get the larger-than-normal GPR sizes to work nicely with the GCC internals (registers sometimes get split due to the defined word size, or the stuff in expr.c will jus

Re: "cc" clobber

2016-02-01 Thread Richard Henderson
On 02/02/2016 01:58 AM, Ulrich Weigand wrote: I think on many targets a clobber "cc" works because the backend actually defines a register named "cc" to correspond to the flags. Therefore the normal handling of clobbering named hard registers catches this case as well. Yes. C.f. Sparc ADDITION

Re: [RFC PR43721] Optimize a/b and a%b to single divmod call

2016-01-31 Thread Richard Henderson
On 01/29/2016 12:37 AM, Richard Biener wrote: To work around this, I defined a new hook expand_divmod_libfunc, which targets must override for expanding a call to target-specific divmod. The "default" hook default_expand_divmod_libfunc() expands a call to libgcc2.c:__udivmoddi4() since that's the only
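
A rough C sketch of the idea in the thread title: when both a/b and a%b are needed, one combined routine can produce them together. `udivmod64` is a hypothetical stand-in, not the libgcc routine. Returning the quotient and storing the remainder through a pointer is an assumption made for illustration.

```c
#include <stdint.h>

/* Hypothetical combined divide/modulo helper.
   Returns the quotient and stores the remainder in *rem.
   The divisor d must be nonzero. */
static uint64_t udivmod64(uint64_t n, uint64_t d, uint64_t *rem)
{
    uint64_t q = n / d;
    *rem = n - q * d;   /* remainder derived from the quotient, no second division */
    return q;
}
```

A caller that previously wrote `q = a / b; r = a % b;` would instead write `q = udivmod64(a, b, &r);`.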

Re: Reorder/combine insns on superscalar arch

2016-01-15 Thread Richard Henderson
On 01/15/2016 06:06 AM, Bernd Schmidt wrote: > On 01/15/2016 07:05 AM, Jeff Law wrote: > >> Well, you have to write the pattern and a splitter. But these days >> there's define_insn_and_split to help with that. Reusing Bernd's work >> may ultimately be easier though. > > Maybe, but maybe also n

Re: basic asm and memory clobbers

2015-11-20 Thread Richard Henderson
On 11/20/2015 04:34 PM, Jakub Jelinek wrote: Isn't that going to break too much code though? I mean, e.g. including libgcc... I don't know. My suspicion is very little. But that's actually what I'd like to know before we start adjusting code in other ways wrt basic asms. r~

Re: basic asm and memory clobbers

2015-11-20 Thread Richard Henderson
On 11/20/2015 04:20 PM, Segher Boessenkool wrote: On Fri, Nov 20, 2015 at 02:05:01PM +0100, Richard Henderson wrote: I'd be perfectly happy to deprecate and later completely remove basic asm within functions. Because IMO it's essentially useless. It has no inputs, no outputs, and

Re: basic asm and memory clobbers

2015-11-20 Thread Richard Henderson
On 11/20/2015 01:38 PM, David Wohlferd wrote: On 11/20/2015 3:14 AM, Andrew Haley wrote: On 20/11/15 10:37, David Wohlferd wrote: The intent for 24414 is to change basic asm such that it will become (quoting jeff) "an opaque blob that read/write/clobber any register or memory location." Such b

Re: _Fract types and conversion routines

2015-10-30 Thread Richard Henderson
On 10/30/2015 02:05 AM, Richard Biener wrote: > On Thu, Oct 29, 2015 at 6:49 PM, Steve Ellcey wrote: >> So should __satfractqiuhq be dealing with the fact that the argument 'a' >> may not have been sign extended in the correct way? > > No. GCC should ensure libcalls (yes, they are special for some

Re: inline asm and multi-alternative constraints

2015-10-29 Thread Richard Henderson
On 10/27/2015 02:05 PM, Jeff Law wrote: > On 10/25/2015 09:41 PM, David Wohlferd wrote: >> Does gcc's inline asm support multi-alternative constraints? Or are >> they only supported for md? >> >> The fact that it is doc'ed with the other constraints >> (https://gcc.gnu.org/onlinedocs/gcc/Constrain

Re: [cfe-dev] RFC: Support x86 interrupt and exception handlers

2015-09-22 Thread Richard Henderson
On 09/21/2015 04:03 PM, Hal Finkel wrote: > - Original Message - >> From: "H.J. Lu" >> To: "Hal Finkel" >> Cc: "GCC Development" , cfe-...@lists.llvm.org >> Sent: Monday, September 21, 2015 5:57:36 PM >> Subject: Re: [cfe-dev] RFC: Support x86 interrupt and exception handlers >> >> On Mon

Re: Question about DRAP register and reserving hard registers

2015-06-29 Thread Richard Henderson
On 06/22/2015 11:14 PM, Steve Ellcey wrote: On Fri, 2015-06-19 at 09:09 -0400, Richard Henderson wrote: On 06/16/2015 07:05 PM, Steve Ellcey wrote: I have a question about the DRAP register (used for dynamic stack alignment) and about reserving/using hard registers in general. I am trying

Re: ifcvt limitations?

2015-06-29 Thread Richard Henderson
On 06/10/2015 02:36 PM, Kyrill Tkachov wrote: On 02/06/15 17:50, Jeff Law wrote: On 06/02/2015 09:57 AM, Kyrill Tkachov wrote: I'm stuck on noce_process_if_block (in ifcvt.c) and what I think is a restriction that the THEN-block contents have to be only a single set insn. This fails on aarch64

Re: Question about DRAP register and reserving hard registers

2015-06-19 Thread Richard Henderson
On 06/16/2015 07:05 PM, Steve Ellcey wrote: I have a question about the DRAP register (used for dynamic stack alignment) and about reserving/using hard registers in general. I am trying to understand where, if a drap register is allocated, GCC is told not to use it during general register allo

Re: Builtin/headers: Constant arguments and adding extra entry points.

2015-06-08 Thread Richard Henderson
On 06/04/2015 12:35 PM, Ondřej Bílka wrote: char *strchr_c(char *x, unsigned long u); #define strchr(x,c) \ (__builtin_constant_p(c) ? strchr_c (x, c * (~0ULL / 255)) : strchr (x,c)) Certainly not a universal win, especially for 64-bit RISC. This constant can be just as expensive to construc
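
To make the constant in the quoted macro concrete: `~0ULL / 255` is 0x0101010101010101 when `unsigned long long` is 64 bits wide, so multiplying a byte by it copies that byte into every byte lane. The sketch below shows only that step. `broadcast_byte` is a hypothetical name, and the 64-bit width is an assumption.

```c
#include <stdint.h>

/* Replicate one byte into all eight byte lanes of a 64-bit word.
   Assumes unsigned long long is 64 bits, so ~0ULL / 255 is
   0x0101010101010101. */
static uint64_t broadcast_byte(unsigned char c)
{
    return (uint64_t)c * (~0ULL / 255);
}
```

The resulting 64-bit constant is the value the reply describes as potentially expensive to construct on a 64-bit RISC target.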

Re: Static Chain Register on iOS AArch64

2015-06-08 Thread Richard Henderson
On 06/08/2015 10:00 AM, Richard Earnshaw wrote: r12 can *also* be clobbered by interworking calls or calls that span more than the branch range of a call instruction. Rare, but possible. I can only presume from this that nested functions are not reliable now, for very large programs. Unless

Re: Static Chain Register on iOS AArch64

2015-06-08 Thread Richard Henderson
On 06/06/2015 06:24 AM, Richard Earnshaw wrote: That's going to make it impossible to implement Go closures on AArch32, then, since the only call-clobbered register not used for parameter passing is r12 (ip) and that can be clobbered by function calls. No, because r12 is only clobbered by plt s

Re: Static Chain Register on iOS AArch64

2015-06-05 Thread Richard Henderson
On 06/04/2015 03:40 AM, Richard Earnshaw wrote: The static chain register is pretty much private to a translation unit... That was true when the static chain was restricted to trampolines. Since Go has started using it for cross-translation-unit closures, that makes it part of the ABI. I d

Re: i386: does gcc work with CS ≠ DS?

2015-06-03 Thread Richard Henderson
On 06/02/2015 10:44 PM, H. Peter Anvin wrote: > Hi guys, another low level question: > > Obviously gcc for i386 requires DS = ES = SS (with FS and GS don't > care), but does gcc also require CS = DS? I don't believe so. In these modern times we don't place switch statement tables, or other const

Re: Better info for combine results in worse code generated

2015-06-02 Thread Richard Henderson
On 05/31/2015 07:03 PM, Alan Modra wrote: I agree. Do you intend to get rid of WORD_REGISTER_OPERATIONS, POINTERS_EXTEND_UNSIGNED, PUSH_ROUNDING, SHORT_IMMEDIATES_SIGN_EXTEND, and LOAD_EXTEND_OP? ;-) Oh yes, please. ;-) Although for the specific case of this thread, W_R_O, S_I_S_E, and L_E_

Re: Is it safe to use _Bool as asm statement outputs on x86?

2015-06-02 Thread Richard Henderson
On 06/02/2015 04:46 PM, H. Peter Anvin wrote: For the x86 backend explicitly, is doing something like: _Bool x; asm("blah ; setc %0" : "=qm" (x)); ... guaranteed to be safe for older versions of gcc? I believe so, for the restricted set of conditions I expect you're asking. I

Re: s390: SImode pointers vs LR

2015-06-02 Thread Richard Henderson
On 06/02/2015 08:32 AM, Andreas Krebbel wrote: -(define_insn "*3" +(define_insn "*3_reg" [(set (match_operand:GPR 0 "register_operand" "=d") (SHIFT:GPR (match_operand:GPR 1 "register_operand" "") - (match_operand:SI 2 "shift_count_or_setmem_operand" "Y")))] +

Re: Relocations to use when eliding plts

2015-05-29 Thread Richard Henderson
On 05/28/2015 01:36 PM, Rich Felker wrote: > On Thu, May 28, 2015 at 09:40:57PM +0200, Jakub Jelinek wrote: >> On Thu, May 28, 2015 at 03:29:02PM -0400, Rich Felker wrote: You're not missing anything. But do you want the performance of a library to depend on how the main executable is co

Re: Relocations to use when eliding plts

2015-05-28 Thread Richard Henderson
On 05/28/2015 10:59 AM, Rich Felker wrote: Am I missing something? You're not missing anything. But do you want the performance of a library to depend on how the main executable is compiled? r~

Re: Relocations to use when eliding plts

2015-05-28 Thread Richard Henderson
On 05/28/2015 08:42 AM, H.J. Lu wrote: > On Thu, May 28, 2015 at 8:29 AM, Richard Henderson wrote: >> On 05/28/2015 04:27 AM, H.J. Lu wrote: >>> You get consecutive jmpq's because x86 PLT entry is used as the >>> canonical function address. If you compile main

Re: Relocations to use when eliding plts

2015-05-28 Thread Richard Henderson
On 05/28/2015 04:27 AM, H.J. Lu wrote: > You get consecutive jmpq's because x86 PLT entry is used as the > canonical function address. If you compile main with -fno-plt -fPIE, you > get: Well, duh. If the main executable has no PLTs, they aren't used as the canonical function address. Surely yo

Relocations to use when eliding plts

2015-05-27 Thread Richard Henderson
There's one problem with the couple of patches that I've seen go by wrt eliding PLTs with -z now, and relaxing inlined PLTs (aka -fno-plt): They're currently using the same relocations used by data, and thus the linker and dynamic linker must ensure that pointer equality is maintained. Which resu

Re: [RFC] Combine related fail of gcc.target/powerpc/ti_math1.c

2015-05-21 Thread Richard Henderson
On 05/21/2015 11:44 AM, Segher Boessenkool wrote: > On Thu, May 21, 2015 at 11:34:14AM -0700, Richard Henderson wrote: >> Actually, I believe that the way CA is modeled at the moment is dangerous. >> It's not a 64-bit value, but a 1-bit value. > > It's a fixed regi

Re: [RFC] Combine related fail of gcc.target/powerpc/ti_math1.c

2015-05-21 Thread Richard Henderson
On 05/21/2015 05:39 AM, Segher Boessenkool wrote: >> > Trying 18, 9 -> 24: >> > Failed to match this instruction: >> > (set (reg:DI 4 4 [+8 ]) >> > (plus:DI (plus:DI (reg:DI 5 5 [ val+8 ]) >> > (reg:DI 76 ca)) >> > (reg:DI 169 [+8 ]))) > For some reason it has the CA reg not

Re: Fwd: xtensa PR65730

2015-05-13 Thread Richard Henderson
On 04/10/2015 06:38 AM, Max Filippov wrote: > OTOH calling helper function to do multiplication by a constant 8 looks > rather stupid. I guess we're not going to have non-8-bit bytes on xtensa > anytime soon, maybe this multiplication can be replaced with shift? Yes, that's what I'd do. r~

Re: [i386] Scalar DImode instructions on XMM registers

2015-05-07 Thread Richard Henderson
On 05/07/2015 09:24 AM, Richard Henderson wrote: > I was wondering this morning about the possibility of a kind of constraint > that > would allow RA to generate pairs of registers via CONCAT. That is, the two > hard registers within the CONCAT are collectively the double-word alloc

Re: [i386] Scalar DImode instructions on XMM registers

2015-05-07 Thread Richard Henderson
On 05/07/2015 10:59 AM, Uros Bizjak wrote: > If we consider SSE operations as DImode operations, we will lose the > ability to precisely specify which operation (SSE vs. general reg) we > want. I'm afraid that in DImode case, combine will choose FLAG-less > pattern that will mandate moves from gen

Re: [i386] Scalar DImode instructions on XMM registers

2015-05-07 Thread Richard Henderson
On 04/24/2015 06:32 PM, Jan Hubicka wrote: > Also I believe it was kind of Richard's design decision to avoid use of > (paradoxical) subregs for vector conversions because these have funny > implications. Yes indeed. > The code for handling upper parts of paradoxical subregs is controlled by > ma

Re: [RFC] Design for flag bit outputs from asms

2015-05-04 Thread Richard Henderson
On 05/04/2015 01:45 PM, Linus Torvalds wrote: > On Mon, May 4, 2015 at 1:33 PM, Richard Henderson wrote: >> >> A fair point. Though honestly, I was hoping that this feature would mostly >> be >> used for conditions that are "weird" -- that is, not normall

Re: [RFC] Design for flag bit outputs from asms

2015-05-04 Thread Richard Henderson
On 05/04/2015 01:14 PM, H. Peter Anvin wrote: > On 05/04/2015 12:33 PM, Richard Henderson wrote: >> >> (0) The C level output variable should be an integral type, from bool on up. >> >> The flags are a scarce resource, easily clobbered. We cannot allow user code &

[RFC] Design for flag bit outputs from asms

2015-05-04 Thread Richard Henderson
On 05/02/2015 05:39 AM, Peter Zijlstra wrote: > static inline bool __test_and_clear_bit(long nr, volatile unsigned long *addr) > { > bool oldbit; > > asm volatile ("btr %2, %1" > : "CF" (oldbit), "+m" (*addr) > : "Ir" (nr)); > > return old

Re: limiting call clobbered registers for library functions

2015-01-29 Thread Richard Henderson
On 01/29/2015 02:08 AM, Paul Shortis wrote: > I've ported GCC to a small 16 bit CPU that has single bit shifts. So I've > handled variable / multi-bit shifts using a mix of inline shifts and calls to > assembler support functions. > > The calls to the asm library functions clobber only one (by con

Re: pointer math vs named address spaces

2014-12-10 Thread Richard Henderson
On 12/10/2014 05:36 AM, Richard Biener wrote: > On Wed, Dec 10, 2014 at 2:24 AM, Richard Henderson wrote: >> On 12/04/2014 01:54 AM, Richard Biener wrote: >>> Apart from what Joseph already said using 'sizetype' in the middle-end >>> for sizes and offsets

Re: pointer math vs named address spaces

2014-12-09 Thread Richard Henderson
On 12/04/2014 01:54 AM, Richard Biener wrote: > Apart from what Joseph already said using 'sizetype' in the middle-end > for sizes and offsets is really really deep-rooted into the compiler. > What you see above is one aspect - POINTER_PLUS_EXPR offsets > are forced to have sizetype type. But you'

[CFT, Darwin] libffi merge from upstream

2014-11-18 Thread Richard Henderson
Dominik Vogt and I are trying to get upstream libffi merged back to gcc, so that we can leverage better support for libgo. I've done a heavy handed merge into a branch (likely not the final form), and I'd like it to see wider testing. Especially Darwin, which is known to be broken upstream. But I

Re: What is R_X86_64_GOTPLT64 used for?

2014-11-13 Thread Richard Henderson
On 11/13/2014 03:55 PM, H.J. Lu wrote: > x86-64 psABI has > > name@GOT: specifies the offset to the GOT entry for the symbol name > from the base of the GOT. > > name@GOTPLT: specifies the offset to the GOT entry for the symbol name > from the base of the GOT, implying that there is a correspondi

Re: Compare Elimination problems

2014-09-04 Thread Richard Henderson
On 09/03/2014 03:14 PM, Paul Shortis wrote: > (insn 33 84 85 6 (parallel [ > (set (reg:HI 1 r1) > (ashift:HI (reg:HI 1 r1) > (const_int 1 [0x1]))) > (clobber (reg:CC_NOOV 7 flags)) > ]) ../gcc/testsuite/gcc.c-torture/execute/960311

Re: Frame pointer optimization issues

2014-08-20 Thread Richard Henderson
On 08/20/2014 08:22 AM, Wilco Dijkstra wrote: > 2. Change the mid-end to call _frame_pointer_required even when > !flag_omit_frame_pointer. Um, it does that already. At least as far as I can see from ira_setup_eliminable_regset and update_eliminables. It turns out to be much easier to re-enable

Re: Zero/Sign extension elimination using value ranges

2014-05-22 Thread Richard Henderson
On 05/22/2014 03:12 AM, Jakub Jelinek wrote: > No way. SUBREG_PROMOTED_UNSIGNED_P right now resides in two separate bits, > volatil and unchanging. Right now volatile != 0, unchanging ignored > is -1, volatile == 0, then the value is unchanging. > What I meant is change this representation, e.g.

aarch64 ada rpms

2014-04-30 Thread Richard Henderson
On 04/30/2014 12:57 AM, Matthias Klose wrote: > On 16.04.2014 09:02, r...@redhat.com wrote: >> I'll see about putting some rpms somewhere public so that no one else >> has to do the whole canadian-cross compile dance. > > are these already online? a tarball would be fine too. And is there a > b

Re: [buildrobot] sparc64-linux broken

2014-04-22 Thread Richard Henderson
On 04/22/2014 12:26 AM, Jakub Jelinek wrote: > I've committed following fix as obvious after testing it with a > x86_64->sparc64-linux cross-compiler. > > 2014-04-22 Jakub Jelinek > > PR target/60910 > * config/sparc/sparc.c (sparc_init_modes): Pass enum machine_mode > value

Re: [buildrobot] sparc64-linux broken

2014-04-21 Thread Richard Henderson
On 04/21/2014 11:02 AM, Jakub Jelinek wrote: > but I'd say for GCC codebase it is better if we fix > the few users of these macros that pass an int rather than enum machine_mode > to these macros. I agree. In the aarch64 backend it determined that we were passing a reg_class_t and not a mode at a

Re: [buildrobot] sparc64-linux broken

2014-04-21 Thread Richard Henderson
On 04/21/2014 09:53 AM, Jan-Benedict Glaw wrote: > /home/jbglaw/repos/gcc/gcc/config/sparc/sparc.c:4858: error: invalid > conversion from ‘int’ to ‘machine_mode’ Yes, something has changed recently in the build flags to (I believe) remove -fpermissive. Quite a few backends are affected by this

Re: stack-protection vs alloca vs dwarf2

2014-04-18 Thread Richard Henderson
On 04/18/2014 11:31 PM, Richard Henderson wrote: > On 04/17/2014 10:14 AM, DJ Delorie wrote: >> _medium_frame: >> pushm r6-r12 >> add #-4, r0, r6 ; marked frame-related (fp = sp - 4) >

Re: stack-protection vs alloca vs dwarf2

2014-04-18 Thread Richard Henderson
On 04/17/2014 10:14 AM, DJ Delorie wrote: > _medium_frame: > pushm r6-r12 > add #-4, r0, r6 ; marked frame-related (fp = sp - 4) > mov.L r6, r0 ; marked frame-related (sp = fp) There's your

Re: LRA Stuck in a loop until aborting

2014-04-16 Thread Richard Henderson
On 04/16/2014 03:05 PM, Paul Shortis wrote: > Solved... kind of. > > *ldsi is one of the patterns movsi is expanded to and as the name suggests it > only handles register loads. I know that at some stages memory references will > pass the register_operand predicate so I changed the predicate for o

Re: Rename unwind.h to unwind-gcc.h

2014-04-16 Thread Richard Henderson
On 04/16/2014 12:48 PM, Douglas B Rupp wrote: > On 04/16/2014 12:38 PM, Eric Botcazou wrote: >>> Because GCC would then be already incompatible with the Intel compiler from >>> which this interface was drawn, way back when the ia64 support was added to >>> GCC and we redesigned GCC's exception hand

Re: Rename unwind.h to unwind-gcc.h

2014-04-16 Thread Richard Henderson
On 04/16/2014 12:01 AM, John Marino wrote: > On 4/16/2014 03:22, Ian Lance Taylor wrote: >> On Tue, Apr 15, 2014 at 4:45 AM, Douglas B Rupp wrote: >>> On 04/14/2014 02:01 PM, Ian Lance Taylor wrote: >>> >>> No I considered that but I think that number will be very small. Will you >>> concede, in h

Re: linux says it is a bug

2014-03-05 Thread Richard Henderson
On 03/04/2014 10:12 PM, Yury Gribov wrote: >>> Asms without outputs are automatically volatile. So there ought be zero >>> change >>> with and without the explicit use of the __volatile__ keyword. >> >> That’s what the documentation says but it wasn’t actually true >> as of a couple of releases a

Re: linux says it is a bug

2014-03-04 Thread Richard Henderson
On 03/04/2014 01:23 AM, Richard Biener wrote: > Doesn't sound like a bug but a feature. We can move > asm ("" : : : "memory") around freely up to the next/previous > instruction involving memory. Asms without outputs are automatically volatile. So there ought be zero change with and without the
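A minimal sketch of the point about output-less asms, assuming GCC (or Clang) extended asm syntax. The function names are hypothetical and do not come from the thread. The GCC documentation states that an extended asm with no output operands is implicitly volatile, so the two barriers below should be treated alike by the compiler:

```c
/* Hypothetical helper: compiler-only barrier written without the
   volatile keyword.  It has no outputs, so GCC documents it as
   implicitly volatile. */
static inline void barrier_plain(void)
{
    __asm__("" : : : "memory");
}

/* The same barrier with the keyword spelled out. */
static inline void barrier_volatile(void)
{
    __asm__ __volatile__("" : : : "memory");
}

static int shared;

/* Both stores must be emitted: the "memory" clobber stops the compiler
   from merging or dropping stores across either barrier. */
int store_twice(int a, int b)
{
    shared = a;
    barrier_plain();
    shared = b;
    barrier_volatile();
    return shared;
}
```

Comparing the generated assembly for the two variants (for example with -O2 -S) is one way to check whether a given compiler release honors the documented behavior.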

Re: Handling error conditions in libgomp

2014-02-28 Thread Richard Henderson
On 02/28/2014 03:37 AM, Thomas Schwinge wrote: > The process cannot recover from this, trying to continue despite the > error. (It is of course questionable what exactly to do in this case, as > libgomp's internal state may now be corrupt.) So far, such errors may > have been rare (aside from rea

Re: [RFC] Offloading Support in libgomp

2014-02-14 Thread Richard Henderson
On 02/14/2014 07:43 AM, Jakub Jelinek wrote: > So, perhaps we should just stop for now oring the copyfrom in and just use > the copyfrom from the very first mapping only, and wait for what the committee > actually agrees on. > > Richard, your thoughts on this? I think stopping the or'ing until th

Re: libatomic Makefile unconditionally sets -march=armv7-a when configuring with ifunc support

2014-01-16 Thread Richard Henderson
On 01/16/2014 05:51 AM, Kyrill Tkachov wrote: > Hi Richard, > > I noticed that Makefile.in in libatomic sets -march=armv7-a when compiling for > arm linux targets with ifunc support: > > @ARCH_ARM_LINUX_TRUE@@HAVE_IFUNC_TRUE@IFUNC_OPTIONS = -march=armv7-a > -DHAVE_KERNEL64 > > Is there any parti

Multiple local register variables w/ same register

2013-11-19 Thread Richard Henderson
On 11/20/2013 03:33 AM, Peter Zijlstra wrote: > On Tue, Nov 19, 2013 at 05:02:20PM +, Mathieu Desnoyers wrote: >> Unfortunately I don't have an ARM cross-compiler setup ready. Nathan could >> test >> it for us though. >> >> It might shuffle things around enough to work around the issue, but wit

Re: Vectorizer/alignment

2013-11-11 Thread Richard Henderson
On 11/11/2013 11:57 PM, Richard Biener wrote: > On Mon, Nov 11, 2013 at 2:39 PM, Jakub Jelinek wrote: >> On Mon, Nov 11, 2013 at 02:13:24PM +0100, Richard Biener wrote: >>> On Mon, Nov 11, 2013 at 12:39 PM, Jakub Jelinek wrote: On Mon, Nov 11, 2013 at 12:29:29PM +0100, Richard Biener wrote:

Re: New __atomic builtins generating an unwanted/unneeded stack slot

2013-10-29 Thread Richard Henderson
On 10/29/2013 03:06 AM, Richard Biener wrote: > On Mon, Oct 28, 2013 at 7:33 PM, Richard Henderson wrote: >> On 10/28/2013 02:25 AM, Frederic Riss wrote: >>> Is there a clean way to have the compiler discard the unneeded stack slot? >> >> Not yet. There is a rewrit

Re: New __atomic builtins generating an unwanted/unneeded stack slot

2013-10-28 Thread Richard Henderson
On 10/28/2013 02:25 AM, Frederic Riss wrote: > Is there a clean way to have the compiler discard the unneeded stack slot? Not yet. There is a rewrite of the atomic support in gcc to move away from using builtins, which will allow two outputs to be ssa allocated. But this will not be complete fo
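A minimal sketch of where such a stack slot can come from, assuming the thread concerns the compare-exchange builtin (the preview does not say which builtin is involved). `__atomic_compare_exchange_n` takes the expected value by address and writes the observed value back through that pointer on failure, so the local variable has its address taken. The function name `try_increment` is hypothetical:

```c
#include <stdbool.h>

/* Hypothetical example: attempt one atomic increment of *counter.
   The builtin has two results: the success flag it returns and the
   observed value it stores into 'expected' on failure.  Because
   'expected' is passed by address, the compiler may keep it in memory. */
bool try_increment(int *counter, int *seen)
{
    int expected = __atomic_load_n(counter, __ATOMIC_RELAXED);
    bool ok = __atomic_compare_exchange_n(counter, &expected, expected + 1,
                                          false, /* strong variant */
                                          __ATOMIC_SEQ_CST, __ATOMIC_RELAXED);
    *seen = expected; /* value observed in *counter by the operation */
    return ok;
}
```

Whether the slot for `expected` survives optimization depends on the compiler version and target.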
