> On 16 Nov 2017, at 19:32, Andrew Pinski wrote:
>
> On Thu, Nov 16, 2017 at 4:35 AM, Alan Hayward wrote:
>> This final patch adds the clobber high expressions to tls_desc for aarch64.
>> It also adds three tests.
>>
>> In addition I also tested by taking the gcc torture test suite and making
On Fri, Nov 17, 2017 at 12:21 AM, Alan Hayward wrote:
>
>> On 16 Nov 2017, at 19:32, Andrew Pinski wrote:
>>
>> On Thu, Nov 16, 2017 at 4:35 AM, Alan Hayward wrote:
>>> This final patch adds the clobber high expressions to tls_desc for aarch64.
>>> It also adds three tests.
>>>
>>> In addition I
Hi,
GOACC_enter_exit_data has this prototype:
...
void
GOACC_enter_exit_data (int device, size_t mapnum,
                       void **hostaddrs, size_t *sizes,
                       unsigned short *kinds,
                       int async, int num_waits, ...)
...
And GOACC_declare calls GOACC_e
VN already sees whether an expression is fully constant, so there's no reason
to duplicate that work during PHI translation. I've verified with
an assert that the paths are indeed unreachable.
Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.
Richard.
2017-11-17 Richard Biener
* tr
This adds a new ANNOTATE_EXPR kind, annot_expr_parallel_kind, which
is stronger than ivdep: ivdep maps semantically to safelen=INT_MAX,
which alone doesn't tell us enough to auto-parallelize anything.
annot_expr_parallel_kind can map to the already existing loop
flag can_be_parallel which can be us
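As a rough user-level illustration (not from the patch): the existing #pragma
GCC ivdep only asserts that iterations don't depend on each other for
vectorisation purposes, which is weaker than saying the loop can run in
parallel.
...
// Sketch: what ivdep promises today.  safelen=INT_MAX allows any number of
// iterations to be executed concurrently for vectorisation, but says nothing
// about reductions, shared state or ordering needed for parallel execution.
void scale (float *a, const float *b, int n)
{
  #pragma GCC ivdep
  for (int i = 0; i < n; i++)
    a[i] = b[i] * 2.0f;
}
...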
The vectoriser uses vectorizable_mask_load_store to handle conditional
loads and stores (IFN_MASK_LOAD and IFN_MASK_STORE) but uses
vectorizable_load and vectorizable_store for unconditional loads
and stores. vectorizable_mask_load_store shares a lot of code
with the other two routines, and this s
vectorizable_mask_load_store replaces scalar IFN_MASK_LOAD calls with
dummy assignments, so that they never survive vectorisation. This patch
moves the code to vect_transform_loop instead, so that we only change
the scalar statements once all of them have been vectorised.
This makes it easier to
This patch makes vect_model_store_cost take a vec_load_store_type
instead of a vect_def_type. It's a wash on its own, but it helps
with later patches.
Richard
2017-11-17 Richard Sandiford
gcc/
* tree-vectorizer.h (vec_load_store_type): Moved from tree-vect-stmts.c
(vect_model
This patch splits the mask argument checking out of
vectorizable_mask_load_store, so that a later patch can use it in both
vectorizable_load and vectorizable_store. It also adds dump messages
for false returns. This is mostly useful for the TYPE_VECTOR_SUBPARTS
check, which can fail if pattern re
Hi,
On 17/11/2017 06:29, Jeff Law wrote:
OK. I'll go ahead and commit for you.
Beautiful. Thanks Jeff.
I think this patch is small enough to not require a copyright
assignment. However further work likely will. I don't offhand know if
Oracle has a blanket assignment in place. Can you work w
Hi, this patch introduces a separate cost model for skylake-avx512. Ok for trunk?
gcc/
* config/i386/i386.c (processor_target_table): Add skylake_cost for
skylake-avx512.
* config/i386/x86-tune-costs.h (skylake_memcpy, skylake_memset,
skylake_cost): New.
Thanks,
Julia
This patch splits out the rhs checking code that's common to both
vectorizable_mask_load_store and vectorizable_store.
Richard
2017-11-17 Richard Sandiford
gcc/
* tree-vect-stmts.c (vect_check_store_rhs): New function,
split out from...
(vectorizable_mask_load_store):
This patch splits out the code to build an all-bits-one or all-bits-zero
input to a gather load. The catch is that both masks can have
floating-point type, in which case they are implicitly treated in
the same way as an integer bitmask.
Richard
2017-11-17 Richard Sandiford
gcc/
* tr
vectorizable_mask_load_store and vectorizable_load used the same
code to build a gather load call, except that the former also
vectorised a mask argument and used it for both the src and mask
inputs. The latter instead used a src input of zero and a mask
input of all-ones.
This patch splits the c
After the previous patches, it's easier to see that the remaining
inlined transform code in vectorizable_mask_load_store is just a
cut-down version of the VMAT_CONTIGUOUS handling in vectorizable_load
and vectorizable_store. This patch therefore makes those functions
handle masked loads and stores
Hi Luis,
[cc'ing aarch64 maintainers, it's quicker to get review that way]
On 15/11/17 03:00, Luis Machado wrote:
> I think the best thing is to leave this tuning structure in place and
> just change default_opt_level to -1 to disable it at -O3.
>
> Thanks,
> Andrew
>
Indeed that seems to be
Hi,
I wrote a patch that called some function in the common libgomp code
from GOMP_OFFLOAD_fini_device, and found that it hung due to the fact that:
- gomp_target_fini locks devices[*].lock while calling
GOMP_OFFLOAD_fini_device, and
- the function call that I added also locked that same lock
On Fri, Nov 17, 2017 at 10:18 AM, Koval, Julia wrote:
> Hi, this patch introduces separate cost model for skylake-avx512. Ok for
> trunk?
>
> gcc/
> * config/i386/i386.c (processor_target_table): Add skylake_cost for
> skylake-avx512.
> * config/i386/x86-tune-costs.h (skyl
Richard Sandiford writes:
> This patch adds support for vectorising groups of IFN_MASK_LOADs
> and IFN_MASK_STOREs using conditional load/store-lanes instructions.
> This requires new internal functions to represent the result
> (IFN_MASK_{LOAD,STORE}_LANES), as well as associated optabs.
>
> The
Hi Jakub,
On 16/11/17 17:06, Jakub Jelinek wrote:
Hi!
This patch uses the bswap pass framework inside the store merging
pass to handle adjacent stores which together produce a 16/32/64-bit
store of a bswapped value (loaded or from an SSA_NAME) or identity (usually
only from an SSA_NAME, the code pre
Two things stopped us using SLP reductions with variable-length vectors:
(1) We didn't have a way of constructing the initial vector.
This patch does it by creating a vector full of the neutral
identity value and then using a shift-and-insert function
to insert any non-identity inputs
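A rough scalar sketch of the idea (names made up, not the vectoriser code):
start from a vector full of the operation's neutral value and insert the real
initial values into the leading lanes.
...
#include <vector>

// Hypothetical illustration: build the initial accumulator for a sum
// reduction when the vector length 'vl' is only known at run time.
std::vector<int> initial_accumulator (unsigned vl,
                                      const std::vector<int> &inits)
{
  std::vector<int> acc (vl, 0);          // every lane holds the '+' identity
  for (unsigned i = 0; i < inits.size () && i < vl; ++i)
    acc[i] = inits[i];                   // insert the non-identity inputs
  return acc;
}
...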
This patch adds support for the SVE bitwise reduction instructions
(ANDV, ORV and EORV). It's a fairly mechanical extension of existing
REDUC_* operators.
Tested on aarch64-linux-gnu (with and without SVE), x86_64-linux-gnu
and powerpc64le-linux-gnu.
Richard
2017-11-17 Richard Sandiford
On Thu, 16 Nov 2017, Tamar Christina wrote:
> Hi Richard,
>
> >
> > I'd have made it
> >
> > if { ([is-effective-target non_strict_align]
> > && ! ( [istarget ...] || ))
> >
> > thus default it to 1 for non-strict-align targets.
> >
>
> Fair, I've switched it to a black list an
The below patch adds the -fmacro-prefix-map option that allows remapping
of file names in __FILE__, __BASE_FILE__, and __builtin_FILE(), similar
to how -fdebug-prefix-map allows the same to be done for debug information.
Additionally, the patch adds -ffile-prefix-map which can be used to
specify both m
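A minimal usage sketch (paths and file names here are made up), assuming the
same old=new syntax as -fdebug-prefix-map:
...
// /home/user/src/hello.cc
#include <cstdio>

int main ()
{
  // Compiled as:
  //   g++ -fmacro-prefix-map=/home/user/src=. -c /home/user/src/hello.cc
  // __FILE__ then expands to "./hello.cc" instead of the full build path.
  std::printf ("%s\n", __FILE__);
  return 0;
}
...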
Hi,
this is a cleaned up and updated revision of Mike's latest posted patch
implementing #pragma GCC unroll in the C and C++ compilers. To be honest,
we're not so much interested in the front-end bits as in the middle-end bits,
because the latter would at last make the Ada version of the pragm
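For reference, at the user level the pragma looks roughly like this (the
unroll factor is an arbitrary example):
...
// Hypothetical use: ask the compiler to unroll the following loop 4 times.
void saxpy (float *x, const float *y, float a, int n)
{
  #pragma GCC unroll 4
  for (int i = 0; i < n; i++)
    x[i] += a * y[i];
}
...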
Hi Kyrill
Thanks I have made the change.
Sudi
From: Kyrill Tkachov
Sent: Thursday, November 16, 2017 5:03 PM
To: Sudi Das; gcc-patches@gcc.gnu.org
Cc: nd; Ramana Radhakrishnan; Richard Earnshaw
Subject: Re: [PATCH][ARM] Fix test armv8_2-fp16-move-1.c
Hi Sudi,
On 16/11/17 16:37, Sudi Das
On 17/11/17 10:45, Sudi Das wrote:
Hi Kyrill
Thanks I have made the change.
Thanks Sudi, I've committed this on your behalf with r254863.
Kyrill
Sudi
From: Kyrill Tkachov
Sent: Thursday, November 16, 2017 5:03 PM
To: Sudi Das; gcc-patches@gcc.gnu.org
Cc: nd; Ramana Radhakrishnan; Richa
On Thu, Nov 16, 2017 at 5:21 PM, Nathan Froyd wrote:
> Default-initialization of scalar arrays in C++ member initialization
> lists produced rather slow code, laboriously setting each element of the
> array to zero. It would be much faster to block-initialize the array,
> and that's what this pat
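For context, the kind of code in question is a member initializer that
value-initializes an array member (a made-up example, not from the patch):
...
struct Buffer
{
  int counts[256];
  // 'counts ()' value-initializes the array, i.e. zeroes every element;
  // the patch is about emitting that as a block clear rather than 256
  // separate stores.
  Buffer () : counts () {}
};
...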
On Fri, Nov 17, 2017 at 12:10 AM, Marc Glisse wrote:
> On Thu, 16 Nov 2017, Richard Biener wrote:
>
>> On Thu, Nov 16, 2017 at 3:33 PM, Wilco Dijkstra
>> wrote:
>>>
>>> GCC currently defaults to -ftrapping-math. This is supposed to generate
>>> code for correct user-visible traps and FP status f
Hi,
> gcc/
> * config/arc/linux.h: GLIBC_DYNAMIC_LINKER update per glibc
> upstreaming review comments
>
Accepted and committed. Thank you for your contribution,
Claudiu
Hi Pekka,
> Instead of always representing HSAIL's untyped registers as
> unsigned int, gccbrig now pre-analyzes the BRIG code and
> builds the register variables with the type used most often when storing
> or reading data to/from each register. This reduces the total number of
> conversions which cannot
On 16/11/17 10:59 -0700, Jeff Law wrote:
On 11/16/2017 03:49 AM, Jonathan Wakely wrote:
On 15/11/17 20:28 -0700, Martin Sebor wrote:
On 11/15/2017 07:31 AM, Jonathan Wakely wrote:
The docs for -Wmaybe-uninitialized have some issues:
- That first sentence is looong.
- Apparently some C++ p
On 16/11/17 09:18 -0700, Martin Sebor wrote:
On 11/16/2017 03:49 AM, Jonathan Wakely wrote:
On 15/11/17 20:28 -0700, Martin Sebor wrote:
On 11/15/2017 07:31 AM, Jonathan Wakely wrote:
The docs for -Wmaybe-uninitialized have some issues:
- That first sentence is looong.
- Apparently some C
On Fri, Nov 17, 2017 at 11:13 AM, Richard Biener wrote:
> This patch changes the Fortran frontend to annotate DO CONCURRENT
> with parallel instead of ivdep.
>
> The patch is not enough to enable a runtime benefit because of
> some autopar costing issues but for other cases it should enable
> auto
On Sat, Nov 11, 2017 at 12:44 AM, Marc Glisse wrote:
> Adding some random cc: to people who might be affected. Hopefully I am not
> breaking any of your stuff...
>
> Ulrich Weigand (address space)
> Ilya Enkovich (pointer bound check)
> DJ Delorie (target with 24-bit partial mode pointer)
>
> If y
On Fri, 17 Nov 2017, Janne Blomqvist wrote:
> On Fri, Nov 17, 2017 at 11:13 AM, Richard Biener wrote:
> > This patch changes the Fortran frontend to annotate DO CONCURRENT
> > with parallel instead of ivdep.
> >
> > The patch is not enough to enable a runtime benefit because of
> > some autopar c
This makes the minimum number of iterations per thread a --param instead
of a magic define and handles loop->can_be_parallel independently of
whether flag_loop_parallelize_all was enabled (and thus also handles
loops our own dependence analysis can analyze but graphite could not).
It also adjusts th
Hi Richard,
On 8 November 2017 at 20:11, Jeff Law wrote:
> On 11/03/2017 10:18 AM, Richard Sandiford wrote:
>> This patch adds a routine that lists the available vector sizes
>> for a target and uses it for some existing target conditions.
>> Later patches add more uses.
>>
>> The cases are take
Hi,
I've factored out 3 new functions to test properties of enum acc_async_t:
...
typedef enum acc_async_t {
  /* Keep in sync with include/gomp-constants.h. */
  acc_async_noval = -1,
  acc_async_sync = -2
} acc_async_t;
...
In order to understand what this means:
...
if (async < acc_async
On Fri, Nov 17, 2017 at 3:03 PM, Richard Biener wrote:
> On Fri, 17 Nov 2017, Janne Blomqvist wrote:
>
>> On Fri, Nov 17, 2017 at 11:13 AM, Richard Biener wrote:
>> > This patch changes the Fortran frontend to annotate DO CONCURRENT
>> > with parallel instead of ivdep.
>> >
>> > The patch is not
On Fri, Nov 17, 2017 at 11:23 AM, Eric Botcazou wrote:
> Hi,
>
> this is a cleaned up and updated revision of Mike's latest posted patch
> implementing #pragma GCC unroll in the C and C++ compilers. To be honest,
> we're not so much interested in the front-end bits as in the middle-end bits,
> be
On Fri, 17 Nov 2017, Janne Blomqvist wrote:
> On Fri, Nov 17, 2017 at 3:03 PM, Richard Biener wrote:
> > On Fri, 17 Nov 2017, Janne Blomqvist wrote:
> >
> >> On Fri, Nov 17, 2017 at 11:13 AM, Richard Biener wrote:
> >> > This patch changes the Fortran frontend to annotate DO CONCURRENT
> >> > wi
On Fri, Nov 17, 2017 at 5:17 AM, Jeff Law wrote:
> No nyquil tonight, so the proper patch is attached this time...
>
> --
>
>
>
> So the next group of changes is focused on breaking down evrp into an
> analysis engine and the actual optimization pass. The analysis engine
> can be embedded into ot
On Fri, Nov 17, 2017 at 5:49 AM, Jeff Law wrote:
> So with the major reorganization bits in place it's time to start
> cleaning up.
>
> This patch is primarily concerned with cleanups to the evrp_dom_walker
> class.
>
> We pull a blob of code from execute_early_vrp into a new member
> function. T
On Fri, Nov 17, 2017 at 8:18 AM, Jeff Law wrote:
>
> As I've stated several times one of the goals here is to provide a
> little range analysis module that we can embed & reuse.
>
> To accomplish that I need to break down the evrp class.
>
> This patch does the bulk of the real work.
>
> evrp_dom_
On Fri, Nov 17, 2017 at 8:41 AM, Jeff Law wrote:
> This patch introduces the evrp_range_analyzer class. This is the class
> we're going to be able to embed into existing dominator walkers to
> provide them with context sensitive range analysis.
>
> The bulk of the class is extracted from the evrp
Hi,
This is an obvious patch removing a redundant check on component distance in
tree-predcom.c. Bootstrapped and tested along with the next patch. Is it OK?
Thanks,
bin
2017-11-15 Bin Cheng
* tree-predcom.c (add_ref_to_chain): Remove check on distance.
Hi,
I previously introduced CT_STORE_STORE chains in predcom. This patch further
supports a load reference in a CT_STORE_STORE chain if the load is dominated
by a store reference in the same loop iteration. For example, as in the added
test case:
for (i = 0; i < len; i++)
  {
    a[i] = t1;
On Thu, Nov 16, 2017 at 11:21 AM, Nathan Froyd wrote:
> Default-initialization of scalar arrays in C++ member initialization
> lists produced rather slow code, laboriously setting each element of the
> array to zero. It would be much faster to block-initialize the array,
> and that's what this pa
Hi,
so I've dusted off and improved the implementation of unroll-and-jam from
last year. The changes relative to last submission are:
* corrected feasibility of the transform (i.e. that dependency directions
are correctly retained, the last submission was wrong).
* added profitability analys
On Fri, Nov 17, 2017 at 7:56 AM, Richard Biener
wrote:
> On Sat, Nov 11, 2017 at 12:44 AM, Marc Glisse wrote:
>> Adding some random cc: to people who might be affected. Hopefully I am not
>> breaking any of your stuff...
>>
>> Ulrich Weigand (address space)
>> Ilya Enkovich (pointer bound check)
Hi Rainer,
On Fri, Nov 17, 2017 at 1:32 PM, Rainer Orth
wrote:
> Please fix.
Fixed in r254870.
BR,
Pekka
Hi!
On Fri, Nov 17, 2017 at 12:04:45AM -0500, Michael Meissner wrote:
> This patch is an enhancement of a previous patch that never got approved.
> https://gcc.gnu.org/ml/gcc-patches/2017-10/threads.html#02124
>
> In the original patch, I added support to the machine independent
> infrastructure t
Hi again,
I managed to spend much more time on the issue and I'm starting a new
thread with a mature - IMHO - proposal: the big thing is the use of the
existing check_array_designated_initializer in
process_init_constructor_array, which calls maybe_constant_value, as we
want, and covers all
This fixes the altivec-macros.c testcase; we now need to explicitly
say "no column number" for messages without one.
Tested on powerpc64-linux {-m32,-m64}; committing to trunk.
Segher
2017-11-17 Segher Boessenkool
gcc/testsuite/
* gcc.target/powerpc/altivec-macros.c: Include "-:" i
If we have a PARALLEL of two SETs, and one half is unused, we currently
happily split that into two instructions (albeit the unused one is
useless). Worse, as PR82621 shows, combine will happily merge this
insn into I3 even if some intervening insn sets the same register
again, which is wrong.
Th
This patch adds support for using a single fully-predicated loop instead
of a vector loop and a scalar tail. An SVE WHILELO instruction generates
the predicate for each iteration of the loop, given the current scalar
iv value and the loop bound. This operation is wrapped up in a new internal
func
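A scalar sketch of what one fully-predicated vector iteration does (purely
illustrative; VF stands in for the number of lanes):
...
// Conceptual model: every vector iteration covers VF lanes, but a lane only
// takes effect while its index is below the loop bound, which is what the
// per-iteration WHILELO predicate encodes.
void copy (int *dst, const int *src, int n)
{
  const int VF = 8;                   // stand-in for the vector length
  for (int i = 0; i < n; i += VF)
    for (int lane = 0; lane < VF; lane++)
      if (i + lane < n)               // predicate bit for this lane
        dst[i + lane] = src[i + lane];
}
...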
This patch removes the restriction that fully-masked loops cannot
have reductions. The key thing here is to make sure that the
reduction accumulator doesn't include any values associated with
inactive lanes; the patch adds a bunch of conditional binary
operations for doing that.
Tested on aarch64
ivopts previously treated pointer arguments to internal functions
like IFN_MASK_LOAD and IFN_MASK_STORE as normal gimple values.
This patch makes it treat them as addresses instead. This makes
a significant difference to the code quality for SVE loops,
since we can then use loads and stores with s
On 11/17/2017 04:49 AM, Jeff Law wrote:
> + /* We do not allow copying this object or initializing one from another. */
> + evrp_dom_walker (const evrp_dom_walker &);
> + evrp_dom_walker& operator= (const evrp_dom_walker &);
> +
Note you can use include/ansidecl.h's DISABLE_COPY_AND_ASSIGN
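In plain C++11 terms the suggestion boils down to deleting the copy
operations, e.g. (a sketch, not the actual macro expansion):
...
class dom_walker_example
{
public:
  dom_walker_example () {}

private:
  // Same intent as DISABLE_COPY_AND_ASSIGN (dom_walker_example):
  dom_walker_example (const dom_walker_example &) = delete;
  dom_walker_example &operator= (const dom_walker_example &) = delete;
};
...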
Thanks Jeff and Paolo.
I really appreciate all the help so far.
Qing
> On Nov 17, 2017, at 3:17 AM, Paolo Carlini wrote:
>
> Hi,
>
> On 17/11/2017 06:29, Jeff Law wrote:
>> OK. I'll go ahead and commit for you.
> Beautiful. Thanks Jeff.
>> I think this patch is small enough to not require
Fully-masked loops can be profitable even if the iteration
count is smaller than the vectorisation factor. In this case
we're effectively doing a complete unroll followed by SLP.
The documentation for min-vect-loop-bound says that the
default value is 0, but actually the default and minimum
were
This patch adds support for aligning vectors by using a partial
first iteration. E.g. if the start pointer is 3 elements beyond
an aligned address, the first iteration will have a mask in which
the first three elements are false.
On SVE, the optimisation is only useful for vector-length-specific
This patch adds a hook to control whether we avoid executing masked
(predicated) stores when the mask is all false. We don't want to do
that by default for SVE.
Tested on aarch64-linux-gnu (with and without SVE), x86_64-linux-gnu
and powerpc64le-linux-gnu. OK to install?
Richard
2017-11-17 R
Remove the remaining uses of '*' from aarch64.md.
Using '*' in alternatives is typically incorrect as it tells the register
allocator to ignore those alternatives. Also add a missing '?' so we
prefer a floating point register for same-size int<->fp conversions.
Passes regress & bootstrap, OK for
Contrary to all documentation, SLOW_BYTE_ACCESS simply means accessing
bitfields by their declared type, which results in better code generation
on practically any target.
I'm thinking we should completely remove all trace of SLOW_BYTE_ACCESS
from GCC as it's confusing and useless.
OK for commit
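A small example of the bit-field case being described (illustrative only):
...
// With SLOW_BYTE_ACCESS nonzero, reading 'mode' below is done with an access
// of the declared type (a 32-bit 'unsigned' load); with it zero, the access
// may be narrowed to a single byte.
struct flags
{
  unsigned active : 1;
  unsigned mode   : 3;
};

unsigned get_mode (const flags *f)
{
  return f->mode;
}
...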
This patch uses the SVE LASTB instruction to optimise cases in which
a value produced by the final scalar iteration of a vectorised loop is
live outside the loop. Previously this situation would stop us from
using a fully-masked loop.
Tested on aarch64-linux-gnu (with and without SVE), x86_64-lin
On 15/11/17 15:59, Tamar Christina wrote:
-Original Message-
From: Kyrill Tkachov [mailto:kyrylo.tkac...@foss.arm.com]
Sent: Wednesday, November 15, 2017 10:11
To: Tamar Christina ; Sandra Loosemore
; gcc-patches@gcc.gnu.org
Cc: nd ; Ramana Radhakrishnan
; Richard Earnshaw
; ni...@redh
Hi Thomas,
On 15/11/17 17:14, Thomas Preudhomme wrote:
Hi,
Expanders for Armv8-M nonsecure call unnecessarily clobber r4 despite
the libcall they perform not writing to r4. Furthermore, the
requirement for the branch target address to be in r4 as expected by
the libcall is modeled in a convolu
This patch uses SVE CLASTB to optimise conditional reductions. It means
that we no longer need to maintain a separate index vector to record
the most recent valid value, and no longer need to worry about overflow
cases.
Tested on aarch64-linux-gnu (with and without SVE), x86_64-linux-gnu
and powe
This allows LD3 to be used for isolated a[i * 3] accesses, in a similar
way to the current a[i * 2] and a[i * 4] for LD2 and LD4 respectively.
Given the problems with the cost model underestimating the cost of
elementwise accesses, the patch continues to reject the VMAT_ELEMENTWISE
cases that are c
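An example of the access pattern in question, where only one element of each
group of three is used:
...
// With this change the vectoriser can use LD3 to load the interleaved data
// and keep just the lane corresponding to a[i * 3].
int sum_every_third (const int *a, int n)
{
  int s = 0;
  for (int i = 0; i < n; i++)
    s += a[i * 3];
  return s;
}
...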
This patch adds support for fully-masking loops that require peeling
for gaps. It peels exactly one scalar iteration and uses the masked
loop to handle the rest. Previously we would fall back on using a
standard unmasked loop instead.
Tested on aarch64-linux-gnu (with and without SVE), x86_64-li
On 17/11/17 08:42, Andrew Pinski wrote:
> On Fri, Nov 17, 2017 at 12:21 AM, Alan Hayward wrote:
>>
>>> On 16 Nov 2017, at 19:32, Andrew Pinski wrote:
>>>
>>> On Thu, Nov 16, 2017 at 4:35 AM, Alan Hayward wrote:
This final patch adds the clobber high expressions to tls_desc for aarch64.
On 11/17/2017 07:25 AM, Kyrill Tkachov wrote:
Hi Luis,
[cc'ing aarch64 maintainers, it's quicker to get review that way]
On 15/11/17 03:00, Luis Machado wrote:
> I think the best thing is to leave this tuning structure in place and
> just change default_opt_level to -1 to disable it at -O3.
On Wed, Nov 15, 2017 at 03:00:53AM +, Luis Machado wrote:
> > I think the best thing is to leave this tuning structure in place and
> > just change default_opt_level to -1 to disable it at -O3.
> >
> > Thanks,
> > Andrew
> >
>
> Indeed that seems to be more appropriate if re-enabling prefe
On Fri, Nov 17, 2017 at 03:21:31PM +, Wilco Dijkstra wrote:
> Contrary to all documentation, SLOW_BYTE_ACCESS simply means accessing
> bitfields by their declared type, which results in better code generation
> on practically any target.
>
> I'm thinking we should completely remove all trace
On 11/17/2017 02:17 AM, Paolo Carlini wrote:
> Hi,
>
> On 17/11/2017 06:29, Jeff Law wrote:
>> OK. I'll go ahead and commit for you.
> Beautiful. Thanks Jeff.
>> I think this patch is small enough to not require a copyright
>> assignment. However further work likely will. I don't offhand know if
This patch looks for pseudo registers that are live across a call
and for which no call-preserved hard registers exist. It then
recomputes the pseudos as necessary to ensure that they are no
longer live across a call. The comment at the head of the file
describes the approach.
A new target hook
This patch:
- tweaks the handling of legitimize_address_displacement
so that it gets called before rather than after the address has
been expanded. This means that we're no longer at the mercy
of LRA being able to interpret the expanded instructions.
- passes the original offset to legitim
On 11/17/2017 01:48 PM, James Greenhalgh wrote:
On Wed, Nov 15, 2017 at 03:00:53AM +, Luis Machado wrote:
I think the best thing is to leave this tuning structure in place and
just change default_opt_level to -1 to disable it at -O3.
Thanks,
Andrew
Indeed that seems to be more appropri
The call to ifc_temp_var in predicate_mem_writes became redundant
in r230099. Before that point the mask was calculated using
fold_build_*s, but now it's calculated by gimple_build and so
is already a valid gimple value.
As it stands, the call forces an SSA_NAME-to-SSA_NAME copy
to be created, wh
We currently optimize a malloc/memset pair into a calloc call (when the
values match, of course). This turns out to be a pessimization for
mysql 5.6, where the allocator looks like:
void *ptr = malloc (size);
if (ptr && other_condition)
  memset (ptr, 0, size);
other_condition is false suffic
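For comparison, a sketch of what the transformed code effectively looks like
(the zeroing now happens unconditionally, which is the pessimisation when
other_condition is usually false):
...
#include <cstdlib>

void *alloc_maybe_zeroed (std::size_t size, bool other_condition)
{
  void *ptr = std::calloc (1, size);   // replaces malloc (size)
  if (ptr && other_condition)
    {
      // memset (ptr, 0, size) removed: calloc already zeroed the block.
    }
  return ptr;
}
...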
On 24 October 2017 at 19:40, Andrew Pinski wrote:
> On Tue, Oct 24, 2017 at 11:27 AM, Charles Baylis
> wrote:
>> In ILP32, GCC fails to merge pointer arithmetic into the addressing
>> mode of a load instruction, as
>> add w0, w0, w1, lsl 2
>> ldr w0, [x0]
>> is not equival
On 11/17/2017 08:04 AM, Pedro Alves wrote:
> On 11/17/2017 04:49 AM, Jeff Law wrote:
>
>> + /* We do not allow copying this object or initializing one from another. */
>> + evrp_dom_walker (const evrp_dom_walker &);
>> + evrp_dom_walker& operator= (const evrp_dom_walker &);
>> +
>
> Note
On 16 November 2017 at 13:25, Maxim Ostapenko wrote:
> Hi Christophe,
>
> On 13/11/17 15:47, Christophe Lyon wrote:
>>
>> On 30 October 2017 at 16:21, Maxim Ostapenko
>> wrote:
>>>
>>> On 30/10/17 17:08, Christophe Lyon wrote:
On 30/10/2017 11:12, Maxim Ostapenko wrote:
>
> Hi,
Hi,
as discussed on IRC, the vectorizer cost model currently ignores the fact that
not all vector operations are supported. In particular, when vectorizing byte
and 64-bit integer loops we quite often end up producing a slower vector
sequence by believing that we can use vector operations which do not exi
On 17 November 2017 14:31:45 CET, Richard Biener
wrote:
>On Fri, Nov 17, 2017 at 11:23 AM, Eric Botcazou
>wrote:
>> Hi,
>>
>> this is a cleaned up and updated revision of Mike's latest posted patch
>> implementing #pragma GCC unroll in the C and C++ compilers. To be honest,
>> we're not so mu
The 82836 testcase fell out of creduce. In c++17 mode it fails horribly
with missing return errors.
Applying this fix, so it's valid in c++17. It still ICEs (in both 14
and 17 modes) with the 82836 fix removed.
nathan
--
Nathan Sidwell
2017-11-17 Nathan Sidwell
* g++.dg/pr82836.C: Fix
This patch adds support for in-order floating-point addition reductions,
which are suitable even in strict IEEE mode.
Previously vect_is_simple_reduction would reject any cases that forbid
reassociation. The idea is instead to tentatively accept them as
"FOLD_LEFT_REDUCTIONs" and only fail later
On 11/17/2017 05:40 AM, Jonathan Wakely wrote:
> On 16/11/17 09:18 -0700, Martin Sebor wrote:
>> On 11/16/2017 03:49 AM, Jonathan Wakely wrote:
>>> On 15/11/17 20:28 -0700, Martin Sebor wrote:
On 11/15/2017 07:31 AM, Jonathan Wakely wrote:
> The docs for -Wmaybe-uninitialized have some iss
On Fri, 17 Nov 2017, Richard Biener wrote:
> Joseph may have an idea about the address-space issue.
I'm not clear what the question is. The TR 18037 rule on subtractions
with address spaces is "For subtraction, if the two operands are pointers
into different address spaces, the address spaces
Hi
Assorted testcase updates to reflect codegen differences on
Power9 versus Power8 and earlier systems.
Tested on P9, this is expected to clean up the majority of the
currently failing tests on that system.
OK for trunk?
Thanks
-Will
2017-11-17 Will Schmidt
[testsuite]
* f
On 10/23/2017 11:04 AM, Richard Sandiford wrote:
> This patch adds support for DWARF location expressions
> that involve polynomial offsets. It adds a target hook that
> says how the runtime invariants used in the offsets should be
> represented in DWARF. SVE vectors have to be a multiple of
> 12
Hello world,
the attached patch fixes the PR by looking at the function interface if
one exists.
Regression-tested. OK for trunk?
Regards
Thomas
2017-11-17 Thomas Koenig
PR fortran/83012
* expr.c (gfc_is_simply_contiguous): If a function call through a
clas
On Fri, Nov 3, 2017 at 2:15 PM, David Malcolm wrote:
> += get_cp_stdlib_header_for_name (IDENTIFIER_POINTER (name));
> + new suggest_missing_header (loc,
> + IDENTIFIER_POINTER (name),
Maybe add overloads that take identifie
On November 17, 2017 6:20:22 PM GMT+01:00, Joseph Myers
wrote:
>On Fri, 17 Nov 2017, Richard Biener wrote:
>
>> Joseph may have an idea about the address-space issue.
>
>I'm not clear what the question is. The TR 18037 rule on subtractions
>with address spaces is "For subtraction, if the two op
Richard Biener writes:
> The question is what ptrdiff_t is for a specific address space. Or
> rather if that type may be dependent on the address space or if we can
> always use that of the default address space.
Some targets have a "far" address space that's bigger than the default.
rl78 for ex
On Fri, 17 Nov 2017, Richard Biener wrote:
> The question is what ptrdiff_t is for a specific address space. Or
> rather if that type may be dependent on the address space or if we can
> always use that of the default address space.
ptrdiff_t is a fixed type which does not depend on the address
On Fri, 17 Nov 2017, DJ Delorie wrote:
> Richard Biener writes:
> > The question is what ptrdiff_t is for a specific address space. Or
> > rather if that type may be dependent on the address space or if we can
> > always use that of the default address space.
>
> Some targets have a "far" addres
On 17 November 2017 14:45:29 CET, Richard Biener
wrote:
>On Fri, Nov 17, 2017 at 8:41 AM, Jeff Law wrote:
>> This patch introduces the evrp_range_analyzer class. This is the class
>> we're going to be able to embed into existing dominator walkers to
>> provide them with context sensitive range