Re: vector _M_start and 0 offset
Hello, here is a clang-friendly version of the patch (same changelog), tested a while ago. Is it ok or do you prefer something like the + if(this->_M_impl._M_start._M_offset != 0) __builtin_unreachable(); version suggested by François?
-- Marc Glisse
Index: include/bits/stl_bvector.h === --- include/bits/stl_bvector.h (revision 264371) +++ include/bits/stl_bvector.h (working copy) @@ -802,25 +802,25 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER #endif #if __cplusplus >= 201103L void assign(initializer_list __l) { _M_assign_aux(__l.begin(), __l.end(), random_access_iterator_tag()); } #endif iterator begin() _GLIBCXX_NOEXCEPT - { return this->_M_impl._M_start; } + { return iterator(this->_M_impl._M_start._M_p, 0); } const_iterator begin() const _GLIBCXX_NOEXCEPT - { return this->_M_impl._M_start; } + { return const_iterator(this->_M_impl._M_start._M_p, 0); } iterator end() _GLIBCXX_NOEXCEPT { return this->_M_impl._M_finish; } const_iterator end() const _GLIBCXX_NOEXCEPT { return this->_M_impl._M_finish; } reverse_iterator @@ -835,21 +835,21 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER rend() _GLIBCXX_NOEXCEPT { return reverse_iterator(begin()); } const_reverse_iterator rend() const _GLIBCXX_NOEXCEPT { return const_reverse_iterator(begin()); } #if __cplusplus >= 201103L const_iterator cbegin() const noexcept - { return this->_M_impl._M_start; } + { return const_iterator(this->_M_impl._M_start._M_p, 0); } const_iterator cend() const noexcept { return this->_M_impl._M_finish; } const_reverse_iterator crbegin() const noexcept { return const_reverse_iterator(end()); } const_reverse_iterator
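The two options discussed in this message can be compared in a small C sketch. It is only an illustration: struct bit_iter and the three functions are hypothetical stand-ins for the libstdc++ bit iterator, not the library's actual types. The only real API used is __builtin_unreachable, which both GCC and clang provide.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for a bit iterator: a word pointer plus a bit
   offset within that word.  */
struct bit_iter
{
  unsigned long *p;
  unsigned int offset;
};

/* Variant 1 (the shape of the posted patch): rebuild the iterator with a
   literal 0, so the optimizer sees a constant offset without any hint.  */
static inline struct bit_iter
begin_rebuilt (const struct bit_iter *start)
{
  struct bit_iter it = { start->p, 0 };
  return it;
}

/* Variant 2 (the alternative raised in the thread): return the stored
   iterator and promise the compiler that its offset is zero.  Calling
   this with a non-zero offset is undefined behavior.  */
static inline struct bit_iter
begin_hinted (const struct bit_iter *start)
{
  if (start->offset != 0)
    __builtin_unreachable ();
  return *start;
}

/* Read the bit the iterator points at.  */
static inline int
bit_at (struct bit_iter it)
{
  assert (it.offset < sizeof (unsigned long) * CHAR_BIT);
  return (int) ((*it.p >> it.offset) & 1UL);
}
```

Both variants return the same iterator whenever the stored offset really is zero. They differ in how the information reaches the optimizer: the first encodes it in the value that is returned, the second states it as an assumption.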
No a*x+b*x factorization for signed vectors
Hello, this is a simple patch to remove the wrong-code part of PR 87319. I didn't spend much time polishing that code, since it is meant to disappear anyway. We could probably remove the inner == inner2 test in signed_or_unsigned_type_for, I hadn't noticed when copy-pasting the code. bootstrap+regtest on powerpc64le-unknown-linux-gnu.
2018-09-30 Marc Glisse
PR middle-end/87319
* fold-const.c (fold_plusminus_mult_expr): Handle complex and vectors.
* tree.c (signed_or_unsigned_type_for): Handle complex.
-- Marc Glisse
Index: gcc/fold-const.c === --- gcc/fold-const.c (revision 264371) +++ gcc/fold-const.c (working copy) @@ -7136,21 +7136,21 @@ fold_plusminus_mult_expr (location_t loc alt1 = arg10; same = maybe_same; if (swap) maybe_same = alt0, alt0 = alt1, alt1 = maybe_same; } } if (!same) return NULL_TREE; - if (! INTEGRAL_TYPE_P (type) + if (! ANY_INTEGRAL_TYPE_P (type) || TYPE_OVERFLOW_WRAPS (type) /* We are neither factoring zero nor minus one. */ || TREE_CODE (same) == INTEGER_CST) return fold_build2_loc (loc, MULT_EXPR, type, fold_build2_loc (loc, code, type, fold_convert_loc (loc, type, alt0), fold_convert_loc (loc, type, alt1)), fold_convert_loc (loc, type, same)); /* Same may be zero and thus the operation 'code' may overflow. Likewise Index: gcc/tree.c === --- gcc/tree.c (revision 264371) +++ gcc/tree.c (working copy) @@ -11202,34 +11202,45 @@ int_cst_value (const_tree x) return val; } /* If TYPE is an integral or pointer type, return an integer type with the same precision which is unsigned iff UNSIGNEDP is true, or itself if TYPE is already an integer type of signedness UNSIGNEDP. 
*/ tree signed_or_unsigned_type_for (int unsignedp, tree type) { - if (TREE_CODE (type) == INTEGER_TYPE && TYPE_UNSIGNED (type) == unsignedp) + if (ANY_INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type) == unsignedp) return type; if (TREE_CODE (type) == VECTOR_TYPE) { tree inner = TREE_TYPE (type); tree inner2 = signed_or_unsigned_type_for (unsignedp, inner); if (!inner2) return NULL_TREE; if (inner == inner2) return type; return build_vector_type (inner2, TYPE_VECTOR_SUBPARTS (type)); } + if (TREE_CODE (type) == COMPLEX_TYPE) +{ + tree inner = TREE_TYPE (type); + tree inner2 = signed_or_unsigned_type_for (unsignedp, inner); + if (!inner2) + return NULL_TREE; + if (inner == inner2) + return type; + return build_complex_type (inner2); +} + if (!INTEGRAL_TYPE_P (type) && !POINTER_TYPE_P (type) && TREE_CODE (type) != OFFSET_TYPE) return NULL_TREE; return build_nonstandard_integer_type (TYPE_PRECISION (type), unsignedp); } /* If TYPE is an integral or pointer type, return an integer type with the same precision which is unsigned, or itself if TYPE is already an
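To illustrate why a*x + b*x cannot simply be factored for signed types, here is a scalar C sketch. It follows the context comment in the diff, which notes that the shared factor may be zero and the new addition may then overflow. The function names are made up for illustration, and doing the arithmetic in an unsigned type is shown as one well-known way to keep the factoring, not as a description of what fold-const.c emits in every case. The sketch assumes long long is wider than int.

```c
#include <assert.h>
#include <limits.h>

/* Unfactored form a*x + b*x.  The caller must pick values for which no
   individual signed operation overflows.  */
static int
unfactored (int a, int b, int x)
{
  return a * x + b * x;
}

/* Report whether the addition a + b, which only exists in the factored
   form (a + b) * x, would overflow int.  */
static int
sum_overflows (int a, int b)
{
  assert (sizeof (long long) > sizeof (int));
  long long wide = (long long) a + (long long) b;
  return wide > INT_MAX || wide < INT_MIN;
}

/* Factored form computed in unsigned arithmetic, where wrap-around is
   well defined.  The final conversion back to int is
   implementation-defined for results above INT_MAX (GCC reduces modulo
   2^N).  */
static int
factored_unsigned (int a, int b, int x)
{
  unsigned int r = ((unsigned int) a + (unsigned int) b) * (unsigned int) x;
  return (int) r;
}
```

With a = INT_MAX, b = 1 and x = 0, the unfactored form is a plain 0 with no overflow anywhere, while the signed sum a + b would overflow. The unsigned version returns the same 0 without relying on undefined behavior.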
[libstdc++,doc] Tweak links to FSF web site
Committed. Gerald 2018-09-29 Gerald Pfeifer * doc/xml/gnu/fdl-1.3.xml: The Free Software Foundation web site now uses https. Also omit the unnecessary trailing slash. * doc/xml/gnu/gpl-3.0.xml: Ditto. Index: doc/xml/gnu/fdl-1.3.xml === --- doc/xml/gnu/fdl-1.3.xml (revision 264709) +++ doc/xml/gnu/fdl-1.3.xml (working copy) @@ -6,7 +6,7 @@ Version 1.3, 3 November 2008 Copyright © 2000, 2001, 2002, 2007, 2008 -http://www.w3.org/1999/xlink"; xlink:href="http://www.fsf.org/";>Free Software Foundation, Inc. +http://www.w3.org/1999/xlink"; xlink:href="https://www.fsf.org";>Free Software Foundation, Inc. Everyone is permitted to copy and distribute verbatim copies of this Index: doc/xml/gnu/gpl-3.0.xml === --- doc/xml/gnu/gpl-3.0.xml (revision 264709) +++ doc/xml/gnu/gpl-3.0.xml (working copy) @@ -9,7 +9,7 @@ Copyright © 2007 Free Software Foundation, Inc. -http://www.w3.org/1999/xlink"; xlink:href="http://www.fsf.org/";>http://www.fsf.org/ +http://www.w3.org/1999/xlink"; xlink:href="https://www.fsf.org";>https://www.fsf.org Everyone is permitted to copy and distribute verbatim copies of this license
((X /[ex] A) +- B) * A --> X +- A * B
Hello, I noticed quite ugly code from both testcases. This transformation does not fix either, but it helps a bit. bootstrap+regtest on powerpc64le-unknown-linux-gnu.
2018-09-30 Marc Glisse
gcc/
* match.pd (((X /[ex] A) +- B) * A): New transformation.
gcc/testsuite/
* gcc.dg/tree-ssa/muldiv-1.c: New file.
* gcc.dg/tree-ssa/muldiv-2.c: Likewise.
-- Marc Glisse
Index: gcc/match.pd === --- gcc/match.pd (revision 264371) +++ gcc/match.pd (working copy) @@ -2637,20 +2637,39 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) && operand_equal_p (@1, build_low_bits_mask (TREE_TYPE (@1), TYPE_PRECISION (type)), 0)) (convert @0))) /* (X /[ex] A) * A -> X. */ (simplify (mult (convert1? (exact_div @0 @@1)) (convert2? @1)) (convert @0)) +/* ((X /[ex] A) +- B) * A --> X +- A * B. */ +(for op (plus minus) + (simplify + (mult (convert1? (op (convert2? (exact_div @0 INTEGER_CST@@1)) INTEGER_CST@2)) @1) + (if (tree_nop_conversion_p (type, TREE_TYPE (@2)) + && tree_nop_conversion_p (TREE_TYPE (@0), TREE_TYPE (@2))) + (with + { + wi::overflow_type overflow; + wide_int mul = wi::mul (wi::to_wide (@1), wi::to_wide (@2), + TYPE_SIGN (type), &overflow); + } + (if (types_match (type, TREE_TYPE (@2)) + && types_match (TREE_TYPE (@0), TREE_TYPE (@2)) && !overflow) + (op @0 { wide_int_to_tree (type, mul); }) + (with { tree utype = unsigned_type_for (type); } + (convert (op (convert:utype @0) + (mult (convert:utype @1) (convert:utype @2)) + /* Canonicalization of binary operations. */ /* Convert X + -C into X - C. 
*/ (simplify (plus @0 REAL_CST@1) (if (REAL_VALUE_NEGATIVE (TREE_REAL_CST (@1))) (with { tree tem = const_unop (NEGATE_EXPR, type, @1); } (if (!TREE_OVERFLOW (tem) || !flag_trapping_math) (minus @0 { tem; }) Index: gcc/testsuite/gcc.dg/tree-ssa/muldiv-1.c === --- gcc/testsuite/gcc.dg/tree-ssa/muldiv-1.c (nonexistent) +++ gcc/testsuite/gcc.dg/tree-ssa/muldiv-1.c (working copy) @@ -0,0 +1,13 @@ +/* { dg-do compile } */ +/* { dg-options "-O3 -fdump-tree-optimized-raw" } */ + +// ldist produces (((q-p-4)/4)&...+1)*4 +// Make sure we remove at least the division +// Eventually this should just be n*4 + +void foo(int*p, __SIZE_TYPE__ n){ + for(int*q=p+n;p!=q;++p)*p=0; +} + +/* { dg-final { scan-tree-dump "builtin_memset" "optimized" } } */ +/* { dg-final { scan-tree-dump-not "div" "optimized" } } */ Index: gcc/testsuite/gcc.dg/tree-ssa/muldiv-2.c === --- gcc/testsuite/gcc.dg/tree-ssa/muldiv-2.c (nonexistent) +++ gcc/testsuite/gcc.dg/tree-ssa/muldiv-2.c (working copy) @@ -0,0 +1,12 @@ +/* { dg-do compile } */ +/* { dg-options "-O1 -fdump-tree-optimized-raw" } */ + +// 'a' should disappear, but we are not there yet + +int* f(int* a, int* b, int* c){ +__PTRDIFF_TYPE__ d = b - a; +d += 1; +return a + d; +} + +/* { dg-final { scan-tree-dump-not "div" "optimized" } } */
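The identity behind the new match.pd rule can be checked with plain integers. The C sketch below is only an arithmetic illustration with made-up function names. In the pattern itself A and B are integer constants and the division is an exact_div, which the sketch models with an assertion that x is a multiple of a. Overflow is ignored here, so the values are assumed to be small.

```c
#include <assert.h>

/* Left-hand side with '+': ((x / a) + b) * a, for x an exact multiple
   of a.  */
static long
lhs_plus (long x, long a, long b)
{
  assert (a != 0 && x % a == 0);
  return (x / a + b) * a;
}

/* Right-hand side with '+': x + a * b, with no division left.  */
static long
rhs_plus (long x, long a, long b)
{
  return x + a * b;
}

/* Left-hand side with '-': ((x / a) - b) * a.  */
static long
lhs_minus (long x, long a, long b)
{
  assert (a != 0 && x % a == 0);
  return (x / a - b) * a;
}

/* Right-hand side with '-': x - a * b.  */
static long
rhs_minus (long x, long a, long b)
{
  return x - a * b;
}
```

For example, with x = 20, a = 4 and b = 1 the left-hand side is (5 + 1) * 4 = 24 and the right-hand side is 20 + 4 = 24. That is the shape of the pointer-difference expression mentioned in the muldiv-1.c comment.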
[wwwdocs,fortran] readings.html -- Fix link to Michel Olagnon's Fortran 90 List
The old link did not work any longer, without any redirect. This seems to be a valid replacement. Committed. Gerald Index: readings.html === RCS file: /cvs/gcc/wwwdocs/htdocs/readings.html,v retrieving revision 1.302 diff -u -r1.302 readings.html --- readings.html 24 Sep 2018 21:51:59 - 1.302 +++ readings.html 29 Sep 2018 12:26:56 - @@ -466,7 +466,7 @@ Other resources: - http://www.ifremer.fr/ditigo/molagnon/fortran90/engfaq.html";> + http://www.fortran-2000.com/MichelList/";> Michel Olagnon's Fortran 90 List contains a "Tests and Benchmarks" section mentioning commercial testsuites.
[PATCH] i386: Replace __m512 with __m512d in _mm512_abs_pd
_mm512_abs_pd takes __m512d, not __m512. OK for trunk and release branches? Thanks. H.J. -- gcc/ PR target/87467 * config/i386/avx512fintrin.h (_mm512_abs_pd): Replace __m512 with __m512d. gcc/testsuite/ * gcc.target/i386/pr87467.c: New test. --- gcc/config/i386/avx512fintrin.h | 2 +- gcc/testsuite/gcc.target/i386/pr87467.c | 11 +++ 2 files changed, 12 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/gcc.target/i386/pr87467.c diff --git a/gcc/config/i386/avx512fintrin.h b/gcc/config/i386/avx512fintrin.h index 80525f9fb4d..599701e10b3 100644 --- a/gcc/config/i386/avx512fintrin.h +++ b/gcc/config/i386/avx512fintrin.h @@ -7798,7 +7798,7 @@ _mm512_mask_abs_ps (__m512 __W, __mmask16 __U, __m512 __A) extern __inline __m512d __attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm512_abs_pd (__m512 __A) +_mm512_abs_pd (__m512d __A) { return (__m512d) _mm512_and_epi64 ((__m512i) __A, _mm512_set1_epi64 (0x7fffLL)); diff --git a/gcc/testsuite/gcc.target/i386/pr87467.c b/gcc/testsuite/gcc.target/i386/pr87467.c new file mode 100644 index 000..6a298d1746e --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr87467.c @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx512f -O2" } */ +/* { dg-final { scan-assembler-times "vpandq\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */ + +#include + +__m512d +avx512f_test (__m512d x) +{ + return _mm512_abs_pd (x); +} -- 2.17.1
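For readers unfamiliar with the trick used by this intrinsic, a scalar C sketch of "absolute value by masking" follows. It is an illustration under stated assumptions, not the header's code: abs_by_mask is a hypothetical name, double is assumed to be IEEE 754 binary64, and the mask is the usual all-ones-except-the-sign-bit constant for a 64-bit lane.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Clear the sign bit of a double by AND-ing its bit pattern with a mask,
   which is what a per-lane 64-bit AND does for each element of a vector
   of doubles.  */
static double
abs_by_mask (double x)
{
  uint64_t bits;

  assert (sizeof (double) == sizeof (uint64_t));
  memcpy (&bits, &x, sizeof bits);          /* Reinterpret as an integer.  */
  bits &= UINT64_C (0x7fffffffffffffff);    /* Drop bit 63, the sign.  */
  memcpy (&x, &bits, sizeof x);             /* Reinterpret as a double.  */
  return x;
}
```

The lane width is what makes the element type matter: the operation is a 64-bit AND per lane, so it only makes sense for a vector of doubles, which is what the corrected parameter type says.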
[committed] Use structure to bubble up information about unterminated strings from c_strlen
This patch changes the NONSTR argument to c_strlen to instead be a little data structure c_strlen can populate with nuggets of information about the string. There's clearly a need for the decl related to the non-string argument. I see an immediate need for the length of a non-terminated string (c_strlen returns NULL for non-terminated strings). I also see a need for the offset within the non-terminated string. We only populate the structure when c_strlen encounters a non-terminated string. One could argue we should always fill in the members. Right now I think filling it in for unterminated cases makes the most sense, but I could be convinced otherwise. I won't be surprised if subsequent warnings from Martin need additional information about the string. The idea here is we can add more elements to the structure without continually adding arguments to c_strlen. Bootstrapped in isolation as well as with Martin's patches for strnlen and sprintf checking. Installing on the trunk. Jeff * builtins.c (unterminated_array): Pass in c_strlen_data * to c_strlen rather than just a tree *. (c_strlen): Change NONSTR argument to a c_strlen_data pointer. Update recursive calls appropriately. If caller did not provide a suitable data pointer, create a local one. When a non-terminated string is discovered, bubble up information about the string via the c_strlen_data object. * builtins.h (c_strlen): Update prototype. (c_strlen_data): New structure. * gimple-fold.c (get_range_strlen): Update calls to c_strlen. For a type 2 call, if c_strlen indicates a non-terminated string use the length of the non-terminated string. (gimple_fold_builtin_stpcpy): Update calls to c_strlen. diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 7be6ceb3d62..23e0ec7b34d 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,19 @@ +2018-09-29 Jeff Law + + * builtins.c (unterminated_array): Pass in c_strlen_data * to + c_strlen rather than just a tree *. 
+ (c_strlen): Change NONSTR argument to a c_strlen_data pointer. + Update recursive calls appropriately. If caller did not provide a + suitable data pointer, create a local one. When a non-terminated + string is discovered, bubble up information about the string via the + c_strlen_data object. + * builtins.h (c_strlen): Update prototype. + (c_strlen_data): New structure. + * gimple-fold.c (get_range_strlen): Update calls to c_strlen. + For a type 2 call, if c_strlen indicates a non-terminated string + use the length of the non-terminated string. + (gimple_fold_builtin_stpcpy): Update calls to c_strlen. + 2018-09-28 John David Anglin * match.pd (simple_comparison): Don't optimize if either operand is diff --git a/gcc/builtins.c b/gcc/builtins.c index e655623febd..fe411efd9a9 100644 --- a/gcc/builtins.c +++ b/gcc/builtins.c @@ -570,9 +570,10 @@ warn_string_no_nul (location_t loc, const char *fn, tree arg, tree decl) tree unterminated_array (tree exp) { - tree nonstr = NULL; - c_strlen (exp, 1, &nonstr); - return nonstr; + c_strlen_data data; + memset (&data, 0, sizeof (c_strlen_data)); + c_strlen (exp, 1, &data); + return data.decl; } /* Compute the length of a null-terminated character string or wide @@ -592,10 +593,12 @@ unterminated_array (tree exp) accesses. Note that this implies the result is not going to be emitted into the instruction stream. - If a not zero-terminated string value is encountered and NONSTR is - non-zero, the declaration of the string value is assigned to *NONSTR. - *NONSTR is accumulating, thus not cleared on success, therefore it has - to be initialized to NULL_TREE by the caller. + Additional information about the string accessed may be recorded + in DATA. For example, if SRC references an unterminated string, + then the declaration will be stored in the DECL field. If the + length of the unterminated string can be determined, it'll be + stored in the LEN field. 
Note this length could well be different + than what a C strlen call would return. ELTSIZE is 1 for normal single byte character strings, and 2 or 4 for wide characer strings. ELTSIZE is by default 1. @@ -603,8 +606,16 @@ unterminated_array (tree exp) The value returned is of type `ssizetype'. */ tree -c_strlen (tree src, int only_value, tree *nonstr, unsigned eltsize) +c_strlen (tree src, int only_value, c_strlen_data *data, unsigned eltsize) { + /* If we were not passed a DATA pointer, then get one to a local + structure. That avoids having to check DATA for NULL before + each time we want to use it. */ + c_strlen_data local_strlen_data; + memset (&local_strlen_data, 0, sizeof (c_strlen_data)); + if (!data) +data = &local_strlen_data; + gcc_checking_asse
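The interface pattern described in this message (an optional out-structure, with a local fallback so the callee never has to test the pointer for NULL) can be sketched outside GCC in ordinary C. Everything here is hypothetical: struct scan_info and bounded_strlen are illustrative names that operate on a byte array, whereas the real c_strlen works on trees. As in the message, the structure is only filled in when no terminator is found.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical analogue of the data structure: details about an
   unterminated array, left untouched when a terminator is found.  */
struct scan_info
{
  const char *decl;   /* The unterminated array, or NULL.  */
  size_t len;         /* Bytes available from the starting offset.  */
  size_t off;         /* Offset at which the scan started.  */
};

/* Return the length of the NUL-terminated string starting at BUF + OFF
   inside an array of SIZE bytes, or -1 if there is no terminator.  INFO
   may be NULL; a local structure is substituted so the rest of the
   function can use it unconditionally.  */
static long
bounded_strlen (const char *buf, size_t size, size_t off,
                struct scan_info *info)
{
  struct scan_info local;
  memset (&local, 0, sizeof local);
  if (!info)
    info = &local;

  assert (off <= size);
  const char *nul = memchr (buf + off, '\0', size - off);
  if (nul)
    return (long) (nul - (buf + off));

  /* No terminator: bubble up what is known about the array.  */
  info->decl = buf;
  info->len = size - off;
  info->off = off;
  return -1;
}
```

A caller that only wants the length passes NULL. A caller that wants to warn about the unterminated array passes a zeroed structure and inspects decl afterwards. New fields can later be added to the structure without changing the function's signature, which is the motivation given in the message.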
[committed] Fix _mm512_{,mask_}abs_pd (PR target/87467)
Hi! These two functions were copied from their _mm512_*abs_ps counterparts and weren't fully adjusted, plus avx512f-abspd-1.c testcase was identical to avx512f-absps-1.c and thus nothing caught this up in the testsuite. Sorry for that. Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk, 8 and 7 branches as obvious. 2018-09-29 Jakub Jelinek PR target/87467 * config/i386/avx512fintrin.h (_mm512_abs_pd, _mm512_mask_abs_pd): Use __m512d type for __A argument rather than __m512. * gcc.target/i386/avx512f-abspd-1.c (SIZE): Divide by two. (CALC): Use double instead of float. (TEST): Adjust to test _mm512_abs_pd and _mm512_mask_abs_pd rather than _mm512_abs_ps and _mm512_mask_abs_ps. --- gcc/config/i386/avx512fintrin.h.jj 2018-07-11 22:55:44.660456510 +0200 +++ gcc/config/i386/avx512fintrin.h 2018-09-29 10:29:46.731170222 +0200 @@ -7798,7 +7798,7 @@ _mm512_mask_abs_ps (__m512 __W, __mmask1 extern __inline __m512d __attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm512_abs_pd (__m512 __A) +_mm512_abs_pd (__m512d __A) { return (__m512d) _mm512_and_epi64 ((__m512i) __A, _mm512_set1_epi64 (0x7fffLL)); @@ -7806,7 +7806,7 @@ _mm512_abs_pd (__m512 __A) extern __inline __m512d __attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) -_mm512_mask_abs_pd (__m512d __W, __mmask8 __U, __m512 __A) +_mm512_mask_abs_pd (__m512d __W, __mmask8 __U, __m512d __A) { return (__m512d) _mm512_mask_and_epi64 ((__m512i) __W, __U, (__m512i) __A, --- gcc/testsuite/gcc.target/i386/avx512f-abspd-1.c.jj 2017-04-07 21:20:58.092845868 +0200 +++ gcc/testsuite/gcc.target/i386/avx512f-abspd-1.c 2018-09-29 10:25:59.062006644 +0200 @@ -6,11 +6,11 @@ #include "avx512f-helper.h" -#define SIZE (AVX512F_LEN / 32) +#define SIZE (AVX512F_LEN / 64) #include "avx512f-mask-type.h" static void -CALC (float *i1, float *r) +CALC (double *i1, double *r) { int i; @@ -24,27 +24,27 @@ CALC (float *i1, float *r) void TEST (void) { - float ck[SIZE]; + double 
ck[SIZE]; int i; - UNION_TYPE (AVX512F_LEN, ) s, d, dm; + UNION_TYPE (AVX512F_LEN, d) s, d, dm; MASK_TYPE mask = MASK_VALUE; for (i = 0; i < SIZE; i++) { - s.a[i] = i * ((i & 1) ? 3.5f : -7.5f); + s.a[i] = i * ((i & 1) ? 3.5 : -7.5); d.a[i] = DEFAULT_VALUE; dm.a[i] = DEFAULT_VALUE; } CALC (s.a, ck); - d.x = INTRINSIC (_abs_ps) (s.x); - dm.x = INTRINSIC (_mask_abs_ps) (dm.x, mask, s.x); + d.x = INTRINSIC (_abs_pd) (s.x); + dm.x = INTRINSIC (_mask_abs_pd) (dm.x, mask, s.x); - if (UNION_CHECK (AVX512F_LEN, ) (d, ck)) + if (UNION_CHECK (AVX512F_LEN, d) (d, ck)) abort (); - MASK_MERGE () (ck, mask, SIZE); - if (UNION_CHECK (AVX512F_LEN, ) (dm, ck)) + MASK_MERGE (d) (ck, mask, SIZE); + if (UNION_CHECK (AVX512F_LEN, d) (dm, ck)) abort (); } Jakub
[PATCH] i386: Use TImode for BLKmode values in 2 integer registers
When passing and returning BLKmode values in 2 integer registers, use 1 TImode register instead of 2 DImode registers. Otherwise, V1TImode may be used to move and store such BLKmode values, which prevents RTL optimizations. Tested on x86-64. OK for trunk? Thanks. H.J. --- gcc/ PR target/87370 * config/i386/i386.c (construct_container): Use TImode for BLKmode values in 2 integer registers. gcc/testsuite/ PR target/87370 * gcc.target/i386/pr87370.c: New test. --- gcc/config/i386/i386.c | 17 +-- gcc/testsuite/gcc.target/i386/pr87370.c | 39 + 2 files changed, 54 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/gcc.target/i386/pr87370.c diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index 176cce521b7..54752513076 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -7914,9 +7914,22 @@ construct_container (machine_mode mode, machine_mode orig_mode, if (n == 2 && regclass[0] == X86_64_INTEGER_CLASS && regclass[1] == X86_64_INTEGER_CLASS - && (mode == CDImode || mode == TImode) + && (mode == CDImode || mode == TImode || mode == BLKmode) && intreg[0] + 1 == intreg[1]) -return gen_rtx_REG (mode, intreg[0]); +{ + if (mode == BLKmode) + { + /* Use TImode for BLKmode values in 2 integer registers. */ + exp[0] = gen_rtx_EXPR_LIST (VOIDmode, + gen_rtx_REG (TImode, intreg[0]), + GEN_INT (0)); + ret = gen_rtx_PARALLEL (mode, rtvec_alloc (1)); + XVECEXP (ret, 0, 0) = exp[0]; + return ret; + } + else + return gen_rtx_REG (mode, intreg[0]); +} /* Otherwise figure out the entries of the PARALLEL. */ for (i = 0; i < n; i++) diff --git a/gcc/testsuite/gcc.target/i386/pr87370.c b/gcc/testsuite/gcc.target/i386/pr87370.c new file mode 100644 index 000..c7b6295a33b --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr87370.c @@ -0,0 +1,39 @@ +/* { dg-do compile { target { ! 
ia32 } } } */ +/* { dg-options "-O2" } */ + +struct A +{ + int b[4]; +}; +struct B +{ + char a[12]; + int b; +}; +struct C +{ + char a[16]; +}; + +struct A +f1 (void) +{ + struct A x = {}; + return x; +} + +struct B +f2 (void) +{ + struct B x = {}; + return x; +} + +struct C +f3 (void) +{ + struct C x = {}; + return x; +} + +/* { dg-final { scan-assembler-not "xmm" } } */ -- 2.17.1
[wwwdocs] gcc-6/changes.html - openmp.org has moved to https
Somehow this one apparently fell through when I made updates in the last weeks. Committed. Gerald Index: gcc-6/changes.html === RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-6/changes.html,v retrieving revision 1.107 diff -u -r1.107 changes.html --- gcc-6/changes.html 2 Sep 2018 08:14:17 - 1.107 +++ gcc-6/changes.html 29 Sep 2018 16:48:41 - @@ -186,7 +186,7 @@ C family -Version 4.5 of the http://www.openmp.org/specifications/"; +Version 4.5 of the https://www.openmp.org/specifications/"; >OpenMP specification is now supported in the C and C++ compilers. The C and C++ compilers now support attributes on enumerators. For instance,
[wwwdocs] Move our main page to HTML 5
After my changes in the past weeks (and months) this is the last of our regular, not automatically generated, pages that moves to HTML 5. There are some further simplifications and cleanups on this page in particular and our style sheet(s), but the majority of work is behind us, and for those of you making changes things are pretty much as easy as it gets. :-) Committed. Gerald Index: index.html === RCS file: /cvs/gcc/wwwdocs/htdocs/index.html,v retrieving revision 1.1100 diff -u -r1.1100 index.html --- index.html 23 Sep 2018 18:02:39 - 1.1100 +++ index.html 29 Sep 2018 16:51:24 - @@ -1,9 +1,6 @@ - -http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd";> - + + GCC, the GNU Compiler Collection
Re: [PATCH] i386: Replace __m512 with __m512d in _mm512_abs_pd
On Sat, Sep 29, 2018 at 3:49 PM H.J. Lu wrote: > > _mm512_abs_pd takes __m512d, not __m512. > > OK for trunk and release branches? > > Thanks. > > > H.J. > -- > gcc/ > > PR target/87467 > * config/i386/avx512fintrin.h (_mm512_abs_pd): Replace __m512 > with __m512d. > > gcc/testsuite/ > > * gcc.target/i386/pr87467.c: New test. OK everywhere. Thanks, Uros. > --- > gcc/config/i386/avx512fintrin.h | 2 +- > gcc/testsuite/gcc.target/i386/pr87467.c | 11 +++ > 2 files changed, 12 insertions(+), 1 deletion(-) > create mode 100644 gcc/testsuite/gcc.target/i386/pr87467.c > > diff --git a/gcc/config/i386/avx512fintrin.h b/gcc/config/i386/avx512fintrin.h > index 80525f9fb4d..599701e10b3 100644 > --- a/gcc/config/i386/avx512fintrin.h > +++ b/gcc/config/i386/avx512fintrin.h > @@ -7798,7 +7798,7 @@ _mm512_mask_abs_ps (__m512 __W, __mmask16 __U, __m512 > __A) > > extern __inline __m512d > __attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) > -_mm512_abs_pd (__m512 __A) > +_mm512_abs_pd (__m512d __A) > { >return (__m512d) _mm512_and_epi64 ((__m512i) __A, > _mm512_set1_epi64 > (0x7fffLL)); > diff --git a/gcc/testsuite/gcc.target/i386/pr87467.c > b/gcc/testsuite/gcc.target/i386/pr87467.c > new file mode 100644 > index 000..6a298d1746e > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/pr87467.c > @@ -0,0 +1,11 @@ > +/* { dg-do compile } */ > +/* { dg-options "-mavx512f -O2" } */ > +/* { dg-final { scan-assembler-times "vpandq\[ > \\t\]+\[^\{\n\]*%zmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */ > + > +#include > + > +__m512d > +avx512f_test (__m512d x) > +{ > + return _mm512_abs_pd (x); > +} > -- > 2.17.1 >
Re: [PATCH] i386: Use TImode for BLKmode values in 2 integer registers
On Sat, Sep 29, 2018 at 6:36 PM H.J. Lu wrote: > > When passing and returning BLKmode values in 2 integer registers, use > 1 TImode register instead of 2 DImode registers. Otherwise, V1TImode > may be used to move and store such BLKmode values, which prevent RTL > optimizations. > > Tested on x86-64. OK for trunk? > > Thanks. > > H.J. > --- > gcc/ > > PR target/87370 > * config/i386/i386.c (construct_container): Use TImode for > BLKmode values in 2 integer registers. > > gcc/testsuite/ > > PR target/87370 > * gcc.target/i386/pr87370.c: New test. OK. Thanks, Uros. > --- > gcc/config/i386/i386.c | 17 +-- > gcc/testsuite/gcc.target/i386/pr87370.c | 39 + > 2 files changed, 54 insertions(+), 2 deletions(-) > create mode 100644 gcc/testsuite/gcc.target/i386/pr87370.c > > diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c > index 176cce521b7..54752513076 100644 > --- a/gcc/config/i386/i386.c > +++ b/gcc/config/i386/i386.c > @@ -7914,9 +7914,22 @@ construct_container (machine_mode mode, machine_mode > orig_mode, >if (n == 2 >&& regclass[0] == X86_64_INTEGER_CLASS >&& regclass[1] == X86_64_INTEGER_CLASS > - && (mode == CDImode || mode == TImode) > + && (mode == CDImode || mode == TImode || mode == BLKmode) >&& intreg[0] + 1 == intreg[1]) > -return gen_rtx_REG (mode, intreg[0]); > +{ > + if (mode == BLKmode) > + { > + /* Use TImode for BLKmode values in 2 integer registers. */ > + exp[0] = gen_rtx_EXPR_LIST (VOIDmode, > + gen_rtx_REG (TImode, intreg[0]), > + GEN_INT (0)); > + ret = gen_rtx_PARALLEL (mode, rtvec_alloc (1)); > + XVECEXP (ret, 0, 0) = exp[0]; > + return ret; > + } > + else > + return gen_rtx_REG (mode, intreg[0]); > +} > >/* Otherwise figure out the entries of the PARALLEL. 
*/ >for (i = 0; i < n; i++) > diff --git a/gcc/testsuite/gcc.target/i386/pr87370.c > b/gcc/testsuite/gcc.target/i386/pr87370.c > new file mode 100644 > index 000..c7b6295a33b > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/pr87370.c > @@ -0,0 +1,39 @@ > +/* { dg-do compile { target { ! ia32 } } } */ > +/* { dg-options "-O2" } */ > + > +struct A > +{ > + int b[4]; > +}; > +struct B > +{ > + char a[12]; > + int b; > +}; > +struct C > +{ > + char a[16]; > +}; > + > +struct A > +f1 (void) > +{ > + struct A x = {}; > + return x; > +} > + > +struct B > +f2 (void) > +{ > + struct B x = {}; > + return x; > +} > + > +struct C > +f3 (void) > +{ > + struct C x = {}; > + return x; > +} > + > +/* { dg-final { scan-assembler-not "xmm" } } */ > -- > 2.17.1 >
Re: [C++ Patch] PR 84423 ("[6/7/8/9 Regression] [concepts] ICE with invalid using declaration")
Hi again, On 9/28/18 9:15 PM, Paolo Carlini wrote: Thanks. About the location, you are certainly right, but doesn't seem trivial. Something we can do *now* is using declspecs->locations[ds_typedef] and declspecs->locations[ds_alias], but that gives us the location of the keyword 'typedef' and 'using', respectively, whereas I think that we would like to have the location of 'auto' itself. I could look into that as a follow-up piece work In fact, completing the work turned out to be easy: ensure that cp_parser_alias_declaration saves the location of the defining-type-id too and then consistently use locations[ds_type_spec] in the error messages. Tested x86_64-linux. Still Ok? ;) Thanks, Paolo. /// /cp 2018-09-29 Paolo Carlini PR c++/84423 * pt.c (convert_template_argument): Immediately return error_mark_node if the second argument is erroneous. * parser.c (cp_parser_alias_declaration): Save the location of the type-id too. * decl.c (grokdeclarator): Improve error message for 'auto' in alias declaration. /testsuite 2018-09-29 Paolo Carlini PR c++/84423 * g++.dg/concepts/pr84423.C: New. Index: cp/decl.c === --- cp/decl.c (revision 264687) +++ cp/decl.c (working copy) @@ -11879,6 +11879,7 @@ grokdeclarator (const cp_declarator *declarator, /* If this is declaring a typedef name, return a TYPE_DECL. 
*/ if (typedef_p && decl_context != TYPENAME) { + bool alias_p = decl_spec_seq_has_spec_p (declspecs, ds_alias); tree decl; /* This declaration: @@ -11901,7 +11902,12 @@ grokdeclarator (const cp_declarator *declarator, if (type_uses_auto (type)) { - error ("typedef declared %"); + if (alias_p) + error_at (declspecs->locations[ds_type_spec], + "% not allowed in alias declaration"); + else + error_at (declspecs->locations[ds_type_spec], + "typedef declared %"); type = error_mark_node; } @@ -11961,7 +11967,7 @@ grokdeclarator (const cp_declarator *declarator, inlinep, friendp, raises != NULL_TREE, declspecs->locations); - if (decl_spec_seq_has_spec_p (declspecs, ds_alias)) + if (alias_p) /* Acknowledge that this was written: `using analias = atype;'. */ TYPE_DECL_ALIAS_P (decl) = 1; Index: cp/parser.c === --- cp/parser.c (revision 264687) +++ cp/parser.c (working copy) @@ -19073,6 +19073,7 @@ cp_parser_alias_declaration (cp_parser* parser) G_("types may not be defined in alias template declarations"); } + cp_token *type_token = cp_lexer_peek_token (parser->lexer); type = cp_parser_type_id (parser); /* Restore the error message if need be. */ @@ -19107,6 +19108,9 @@ cp_parser_alias_declaration (cp_parser* parser) set_and_check_decl_spec_loc (&decl_specs, ds_alias, using_token); + set_and_check_decl_spec_loc (&decl_specs, + ds_type_spec, + type_token); if (parser->num_template_parameter_lists && !cp_parser_check_template_parameters (parser, Index: cp/pt.c === --- cp/pt.c (revision 264687) +++ cp/pt.c (working copy) @@ -7776,7 +7776,7 @@ convert_template_argument (tree parm, tree val; int is_type, requires_type, is_tmpl_type, requires_tmpl_type; - if (parm == error_mark_node) + if (parm == error_mark_node || error_operand_p (arg)) return error_mark_node; /* Trivially convert placeholders. 
*/ Index: testsuite/g++.dg/concepts/pr84423.C === --- testsuite/g++.dg/concepts/pr84423.C (nonexistent) +++ testsuite/g++.dg/concepts/pr84423.C (working copy) @@ -0,0 +1,10 @@ +// { dg-do compile { target c++11 } } +// { dg-additional-options "-fconcepts" } + +template using A = auto; // { dg-error "30:.auto. not allowed in alias declaration" } + +template class> struct B {}; + +B b; + +typedef auto C; // { dg-error "9:typedef declared .auto." }
[wwwdocs] style.mhtml - Remove the last traces of setting a DOCTYPE.
This removes the last traces of setting a DOCTYPE. All regular pages now have this included directly and for install/ the conversion from texinfo now does that. David, this should be yet another change that'll help make your script easier. Committed. Gerald Index: style.mhtml === RCS file: /cvs/gcc/wwwdocs/htdocs/style.mhtml,v retrieving revision 1.156 diff -u -r1.156 style.mhtml --- style.mhtml 15 Sep 2018 23:03:24 - 1.156 +++ style.mhtml 29 Sep 2018 17:17:18 - @@ -4,20 +4,12 @@ -;;; The "install/" pages are HTML, not XHTML. +;;; The pages under install/ are generated from texinfo sources. "install/.*"> > - - - - > -> - ;;; Redefine the tag so that we can put XHTML attributes inside.
[PATCH] x86: Add pmovzx/pmovsx patterns with SI/DI operands
Add pmovzx/pmovsx patterns with SI and DI operands for pmovzx/pmovsx
instructions which only read the low 4 or 8 bytes from the source.

gcc/

	PR target/87317
	* config/i386/sse.md (*sse4_1_v8qiv8hi2): New pattern.
	(*sse4_1_v4qiv4si2): Likewise.
	(*sse4_1_v4hiv4si2): Likewise.
	(*sse4_1_v2hiv2di2): Likewise.
	(*sse4_1_v2siv2di2): Likewise.

gcc/testsuite/

	PR target/87317
	* gcc.target/i386/pr87317-1.c: New file.
	* gcc.target/i386/pr87317-2.c: Likewise.
	* gcc.target/i386/pr87317-3.c: Likewise.
	* gcc.target/i386/pr87317-4.c: Likewise.
	* gcc.target/i386/pr87317-5.c: Likewise.
	* gcc.target/i386/pr87317-6.c: Likewise.
	* gcc.target/i386/pr87317-7.c: Likewise.
	* gcc.target/i386/pr87317-8.c: Likewise.
	* gcc.target/i386/pr87317-9.c: Likewise.
	* gcc.target/i386/pr87317-10.c: Likewise.
---
 gcc/config/i386/sse.md                     | 98 ++
 gcc/testsuite/gcc.target/i386/pr87317-1.c  | 13 +++
 gcc/testsuite/gcc.target/i386/pr87317-10.c | 13 +++
 gcc/testsuite/gcc.target/i386/pr87317-2.c  | 13 +++
 gcc/testsuite/gcc.target/i386/pr87317-3.c  | 13 +++
 gcc/testsuite/gcc.target/i386/pr87317-4.c  | 13 +++
 gcc/testsuite/gcc.target/i386/pr87317-5.c  | 13 +++
 gcc/testsuite/gcc.target/i386/pr87317-6.c  | 13 +++
 gcc/testsuite/gcc.target/i386/pr87317-7.c  | 13 +++
 gcc/testsuite/gcc.target/i386/pr87317-8.c  | 13 +++
 gcc/testsuite/gcc.target/i386/pr87317-9.c  | 13 +++
 11 files changed, 228 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr87317-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr87317-10.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr87317-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr87317-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr87317-4.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr87317-5.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr87317-6.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr87317-7.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr87317-8.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr87317-9.c

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index d2722fdfcd0..c8ff35b125c 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -15521,6 +15521,26 @@
    (set_attr "prefix" "orig,orig,maybe_evex")
    (set_attr "mode" "TI")])
 
+(define_insn "*sse4_1_v8qiv8hi2"
+  [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
+        (any_extend:V8HI
+          (vec_select:V8QI
+            (subreg:V16QI
+              (vec_concat:V2DI
+                (match_operand:DI 1 "nonimmediate_operand" "Yrm,*xm,vm")
+                (const_int 0)) 0)
+            (parallel [(const_int 0) (const_int 1)
+                       (const_int 2) (const_int 3)
+                       (const_int 4) (const_int 5)
+                       (const_int 6) (const_int 7)]]
+  "TARGET_SSE4_1 && && "
+  "%vpmovbw\t{%1, %0|%0, %q1}"
+  [(set_attr "isa" "noavx,noavx,avx")
+   (set_attr "type" "ssemov")
+   (set_attr "prefix_extra" "1")
+   (set_attr "prefix" "orig,orig,maybe_evex")
+   (set_attr "mode" "TI")])
+
 (define_insn "avx512f_v16qiv16si2"
   [(set (match_operand:V16SI 0 "register_operand" "=v")
         (any_extend:V16SI
@@ -15562,6 +15582,28 @@
    (set_attr "prefix" "orig,orig,maybe_evex")
    (set_attr "mode" "TI")])
 
+(define_insn "*sse4_1_v4qiv4si2"
+  [(set (match_operand:V4SI 0 "register_operand" "=Yr,*x,v")
+        (any_extend:V4SI
+          (vec_select:V4QI
+            (subreg:V16QI
+              (vec_merge:V4SI
+                (vec_duplicate:V4SI
+                  (match_operand:SI 1 "nonimmediate_operand" "m,*m,m"))
+                (const_vector:V4SI
+                  [(const_int 0) (const_int 0)
+                   (const_int 0) (const_int 0)])
+                (const_int 1)) 0)
+            (parallel [(const_int 0) (const_int 1)
+                       (const_int 2) (const_int 3)]]
+  "TARGET_SSE4_1 && "
+  "%vpmovbd\t{%1, %0|%0, %k1}"
+  [(set_attr "isa" "noavx,noavx,avx")
+   (set_attr "type" "ssemov")
+   (set_attr "prefix_extra" "1")
+   (set_attr "prefix" "orig,orig,maybe_evex")
+   (set_attr "mode" "TI")])
+
 (define_insn "avx512f_v16hiv16si2"
   [(set (match_operand:V16SI 0 "register_operand" "=v")
         (any_extend:V16SI
@@ -15598,6 +15640,24 @@
    (set_attr "prefix" "orig,orig,maybe_evex")
    (set_attr "mode" "TI")])
 
+(define_insn "*sse4_1_v4hiv4si2"
+  [(set (match_operand:V4SI 0 "register_operand" "=Yr,*x,v")
+        (any_extend:V4SI
+          (vec_select:V4HI
+            (subreg:V8HI
+              (vec_concat:V2DI
+                (match_operand:DI 1 "nonimmediate_operand" "Yrm,*xm,vm")
+                (const_int 0)) 0)
+            (parallel [(const_int 0) (const_int 1)
+                       (const_int 2) (const_int 3)]]
+  "TARGET_SSE4_1 && "
+  "%vpmovwd\t{%1, %0|%0, %q1}"
+  [(set_attr "isa" "noavx,noavx,avx")
+
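To illustrate what "only read the low 4 or 8 bytes" means for these patterns, here is a minimal scalar sketch in C of the byte-to-word case (a pmovzxbw/pmovsxbw with an 8-byte source). It is only a model of the instruction semantics under that assumption, not code from the patch or its testcases; the function names model_pmovzxbw and model_pmovsxbw are hypothetical.

```c
#include <stdint.h>

/* Hypothetical scalar model of pmovzxbw: only the low 8 bytes of the
   source are read, and each byte is zero-extended into a 16-bit lane.  */
static void
model_pmovzxbw (const uint8_t src[8], uint16_t dst[8])
{
  for (int i = 0; i < 8; i++)
    dst[i] = (uint16_t) src[i];
}

/* Hypothetical scalar model of pmovsxbw: same 8-byte read, but each byte
   is sign-extended.  The cast to int8_t of a value above 127 is
   implementation-defined before C23; GCC wraps modulo 2^8.  */
static void
model_pmovsxbw (const uint8_t src[8], int16_t dst[8])
{
  for (int i = 0; i < 8; i++)
    dst[i] = (int16_t) (int8_t) src[i];
}
```

Since nothing above byte 7 of the source is ever consulted, an 8-byte (DI) operand is enough to describe the input, which is the case the new patterns are written for.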
[GCC 8] [PATCH] i386: Remove _Unwind_Frames_Increment
Hi,

I'd like to backport r263030:

https://gcc.gnu.org/ml/gcc-cvs/2018-07/msg00756.html

to GCC 8 for CET. Is it OK?

Thanks.

-- 
H.J.
Re: [PATCH] dumpfile.c: use prefixes other than 'note: ' for MSG_{OPTIMIZED_LOCATIONS|MISSED_OPTIMIZATION}
That produces extra output that breaks a few tests.

g++.dg/vect/pr33426-ivdep-2.cc -std=c++11 (test for excess errors)
g++.dg/vect/pr33426-ivdep-2.cc -std=c++14 (test for excess errors)
g++.dg/vect/pr33426-ivdep-2.cc -std=c++98 (test for excess errors)
g++.dg/vect/pr33426-ivdep-3.cc (test for excess errors)
g++.dg/vect/pr33426-ivdep-4.cc (test for excess errors)
g++.dg/vect/pr33426-ivdep.cc -std=c++11 (test for excess errors)
g++.dg/vect/pr33426-ivdep.cc -std=c++14 (test for excess errors)
g++.dg/vect/pr33426-ivdep.cc -std=c++98 (test for excess errors)
gcc.dg/vect/nodump-vect-opt-info-1.c (test for excess errors)
gcc.dg/vect/vect-ivdep-1.c (test for excess errors)
gcc.dg/vect/vect-ivdep-1.c -flto -ffat-lto-objects (test for excess errors)
gcc.dg/vect/vect-ivdep-2.c (test for excess errors)
gcc.dg/vect/vect-ivdep-2.c -flto -ffat-lto-objects (test for excess errors)

FAIL: gcc.dg/vect/vect-ivdep-1.c (test for excess errors)
Excess errors:
/usr/local/gcc/gcc-20180929/gcc/testsuite/gcc.dg/vect/vect-ivdep-1.c:11:3: optimized: loop versioned for vectorization to enhance alignment

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1
"And now for something completely different."
PING^4 [PATCH] i386: Add pass_remove_partial_avx_dependency
On Tue, Sep 18, 2018 at 9:44 AM H.J. Lu wrote:
>
> On Tue, Sep 11, 2018 at 9:01 AM, H.J. Lu wrote:
> > On Tue, Sep 4, 2018 at 9:01 AM, H.J. Lu wrote:
> >> On Tue, Aug 28, 2018 at 11:04 AM, H.J. Lu wrote:
> >>> With -mavx, for
> >>>
> >>> [hjl@gnu-cfl-1 skx-2]$ cat foo.i
> >>> extern float f;
> >>> extern double d;
> >>> extern int i;
> >>>
> >>> void
> >>> foo (void)
> >>> {
> >>>   d = f;
> >>>   f = i;
> >>> }
> >>>
> >>> we need to generate
> >>>
> >>>         vxorp[ds]       %xmmN, %xmmN, %xmmN
> >>>         ...
> >>>         vcvtss2sd       f(%rip), %xmmN, %xmmX
> >>>         ...
> >>>         vcvtsi2ss       i(%rip), %xmmN, %xmmY
> >>>
> >>> to avoid partial XMM register stall.  This patch adds a pass to generate
> >>> a single
> >>>
> >>>         vxorps          %xmmN, %xmmN, %xmmN
> >>>
> >>> at function entry, which is shared by all SF and DF conversions, instead
> >>> of generating one
> >>>
> >>>         vxorp[ds]       %xmmN, %xmmN, %xmmN
> >>>
> >>> for each SF/DF conversion.
> >>>
> >>> Performance impacts on SPEC CPU 2017 rate with 1 copy using
> >>>
> >>> -Ofast -march=native -mfpmath=sse -fno-associative-math -funroll-loops
> >>>
> >>> are
> >>>
> >>> 1. On Broadwell server:
> >>>
> >>> 500.perlbench_r (-0.82%)
> >>> 502.gcc_r (0.73%)
> >>> 505.mcf_r (-0.24%)
> >>> 520.omnetpp_r (-2.22%)
> >>> 523.xalancbmk_r (-1.47%)
> >>> 525.x264_r (0.31%)
> >>> 531.deepsjeng_r (0.27%)
> >>> 541.leela_r (0.85%)
> >>> 548.exchange2_r (-0.11%)
> >>> 557.xz_r (-0.34%)
> >>> Geomean: (-0.23%)
> >>>
> >>> 503.bwaves_r (0.00%)
> >>> 507.cactuBSSN_r (-1.88%)
> >>> 508.namd_r (0.00%)
> >>> 510.parest_r (-0.56%)
> >>> 511.povray_r (0.49%)
> >>> 519.lbm_r (-1.28%)
> >>> 521.wrf_r (-0.28%)
> >>> 526.blender_r (0.55%)
> >>> 527.cam4_r (-0.20%)
> >>> 538.imagick_r (2.52%)
> >>> 544.nab_r (-0.18%)
> >>> 549.fotonik3d_r (-0.51%)
> >>> 554.roms_r (-0.22%)
> >>> Geomean: (0.00%)
> >>>
> >>> 2. On Skylake client:
> >>>
> >>> 500.perlbench_r (-0.29%)
> >>> 502.gcc_r (-0.36%)
> >>> 505.mcf_r (1.77%)
> >>> 520.omnetpp_r (-0.26%)
> >>> 523.xalancbmk_r (-3.69%)
> >>> 525.x264_r (-0.32%)
> >>> 531.deepsjeng_r (0.00%)
> >>> 541.leela_r (-0.46%)
> >>> 548.exchange2_r (0.00%)
> >>> 557.xz_r (0.00%)
> >>> Geomean: (-0.34%)
> >>>
> >>> 503.bwaves_r (0.00%)
> >>> 507.cactuBSSN_r (-0.56%)
> >>> 508.namd_r (0.87%)
> >>> 510.parest_r (0.00%)
> >>> 511.povray_r (-0.73%)
> >>> 519.lbm_r (0.84%)
> >>> 521.wrf_r (0.00%)
> >>> 526.blender_r (-0.81%)
> >>> 527.cam4_r (-0.43%)
> >>> 538.imagick_r (2.55%)
> >>> 544.nab_r (0.28%)
> >>> 549.fotonik3d_r (0.00%)
> >>> 554.roms_r (0.32%)
> >>> Geomean: (0.12%)
> >>>
> >>> 3. On Skylake server:
> >>>
> >>> 500.perlbench_r (-0.55%)
> >>> 502.gcc_r (0.69%)
> >>> 505.mcf_r (0.00%)
> >>> 520.omnetpp_r (-0.33%)
> >>> 523.xalancbmk_r (-0.21%)
> >>> 525.x264_r (-0.27%)
> >>> 531.deepsjeng_r (0.00%)
> >>> 541.leela_r (0.00%)
> >>> 548.exchange2_r (-0.11%)
> >>> 557.xz_r (0.00%)
> >>> Geomean: (0.00%)
> >>>
> >>> 503.bwaves_r (0.58%)
> >>> 507.cactuBSSN_r (0.00%)
> >>> 508.namd_r (0.00%)
> >>> 510.parest_r (0.18%)
> >>> 511.povray_r (-0.58%)
> >>> 519.lbm_r (0.25%)
> >>> 521.wrf_r (0.40%)
> >>> 526.blender_r (0.34%)
> >>> 527.cam4_r (0.19%)
> >>> 538.imagick_r (5.87%)
> >>> 544.nab_r (0.17%)
> >>> 549.fotonik3d_r (0.00%)
> >>> 554.roms_r (0.00%)
> >>> Geomean: (0.62%)
> >>>
> >>> On Skylake client, impacts on 538.imagick_r are
> >>>
> >>> size before:
> >>>
> >>>    text    data     bss     dec     hex filename
> >>> 2555577   10876    5576 2572029  273efd imagick_r.exe
> >>>
> >>> size after:
> >>>
> >>>    text    data     bss     dec     hex filename
> >>> 2511825   10876    5576 2528277  269415 imagick_r.exe
> >>>
> >>> number of vxorp[ds]:
> >>>
> >>> before    after    difference
> >>> 14570     4515     -69%
> >>>
> >>> OK for trunk?
> >>>
> >>> Thanks.
> >>>
> >>>
> >>> H.J.
> >>> ---
> >>> gcc/
> >>>
> >>> 2018-08-28  H.J. Lu
> >>>             Sunil K Pandey
> >>>
> >>>         PR target/87007
> >>>         * config/i386/i386-passes.def: Add
> >>>         pass_remove_partial_avx_dependency.
> >>>         * config/i386/i386-protos.h
> >>>         (make_pass_remove_partial_avx_dependency): New.
> >>>         * config/i386/i386.c (make_pass_remove_partial_avx_dependency):
> >>>         New function.
> >>>         (pass_data_remove_partial_avx_dependency): New.
> >>>         (pass_remove_partial_avx_dependency): Likewise.
> >>>         (make_pass_remove_partial_avx_dependency): Likewise.
> >>>         * config/i386/i386.md (SF/DF conversion splitters): Disabled
> >>>         for TARGET_AVX.
> >>>
> >>> gcc/testsuite/
> >>>
> >>> 2018-08-28  H.J. Lu
> >>>             Sunil K Pandey
> >>>
> >>>         PR target/87007
> >>>         * gcc.target/i386/pr87007.c: New file.
> >>
> >>
> >> PING:
> >>
> >> https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01781.html
> >>
> >
> > PING.
> >
>
> Hi Kirill, Jakub, Jan,
>
> Can you take a look?
>

PING.

-- 
H.J.
[PATCH] libgo: Don't assume sys.GoarchAmd64 == 64-bit pointer
On x86-64, sys.GoarchAmd64 == 1 for -mx32.  But -mx32 has 32-bit
pointer, not 64-bit.  There is

	// _64bit = 1 on 64-bit systems, 0 on 32-bit systems
	_64bit = 1 << (^uintptr(0) >> 63) / 2

We should check both _64bit and sys.GoarchAmd64.

	PR go/87470
	* go/runtime/malloc.go (arenaBaseOffset): Also check _64bit.
---
 libgo/go/runtime/malloc.go | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libgo/go/runtime/malloc.go b/libgo/go/runtime/malloc.go
index ac4759ffbf1..c3445387057 100644
--- a/libgo/go/runtime/malloc.go
+++ b/libgo/go/runtime/malloc.go
@@ -306,7 +306,7 @@ const (
 	//
 	// On other platforms, the user address space is contiguous
 	// and starts at 0, so no offset is necessary.
-	arenaBaseOffset uintptr = sys.GoarchAmd64 * (1 << 47)
+	arenaBaseOffset uintptr = _64bit * sys.GoarchAmd64 * (1 << 47)
 
 	// Max number of threads to run garbage collection.
 	// 2, 3, and 4 are all plausible maximums depending
-- 
2.17.1
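For readers who want to check the arithmetic, the _64bit expression can be worked through for both pointer widths. In Go, << and / have the same precedence and associate left to right, so the expression reads (1 << (^uintptr(0) >> 63)) / 2, and shifting a 32-bit all-ones value right by 63 gives 0. The C sketch below models this with 64-bit arithmetic; it is an illustration only, not runtime code, and the names model_64bit and model_arena_base_offset are hypothetical.

```c
#include <stdint.h>

/* Hypothetical model of Go's `1 << (^uintptr(0) >> 63) / 2` for a uintptr
   that is BITS wide (32 or 64 only).  Everything is computed in uint64_t,
   so the 32-bit all-ones value shifted right by 63 is 0, matching Go.  */
static uint64_t
model_64bit (unsigned bits)
{
  uint64_t all_ones = (bits == 64) ? UINT64_MAX
                                   : (UINT64_C (1) << bits) - 1;
  uint64_t shifted = all_ones >> 63;     /* 1 for 64-bit, 0 for 32-bit */
  return (UINT64_C (1) << shifted) / 2;  /* 2/2 = 1, or 1/2 = 0 */
}

/* Hypothetical model of the patched constant:
   _64bit * sys.GoarchAmd64 * (1 << 47).  */
static uint64_t
model_arena_base_offset (unsigned goarch_amd64, unsigned bits)
{
  return model_64bit (bits) * goarch_amd64 * (UINT64_C (1) << 47);
}
```

With goarch_amd64 = 1, the model gives 1 << 47 for a 64-bit uintptr and 0 for a 32-bit one, the latter being the -mx32 combination described above.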
Re: [PATCH] libgo: Don't assume sys.GoarchAmd64 == 64-bit pointer
"H.J. Lu" writes: > On x86-64, sys.GoarchAmd64 == 1 for -mx32. But -mx32 has 32-bit > pointer, not 64-bit. There is > > // _64bit = 1 on 64-bit systems, 0 on 32-bit systems > _64bit = 1 << (^uintptr(0) >> 63) / 2 > > We should check both _64bit and sys.GoarchAmd64. Thanks, but I think the correct fix is to set GOARCH to amd64p32 when using x32. I'm trying that to see if it will work out. Ian