date:20170529

[PATCH 3/4 v3][PR 67328] Added bool conversion for wide_ints

2017-05-29 Thread Yuri Gribov




0003-Added-bool-conversion-for-wide_ints.patch
Description: Binary data

[PATCH 4/4 v3][PR 67328] Optimize some masked comparisons to efficient bittest

2017-05-29 Thread Yuri Gribov

This no longer fixes the PR but still works in some cases as
demonstrated by the test. So I decided to keep it.

-I


0004-Optimize-some-masked-comparisons-to-efficient-bittes.patch
Description: Binary data

[PATCH] Add no_tail_call attribute

2017-05-29 Thread Yuri Gribov

Hi all,

As discussed in
https://sourceware.org/ml/libc-alpha/2017-01/msg00455.html , some
libdl functions rely on the return address to figure out the calling
DSO and then use this information in computation (e.g. the output of
dlsym depends on which library called it).

As reported in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66826 this
may break under tailcall optimization, i.e. in cases like

  return dlsym(...);

Carlos confirmed that they would prefer to have a GCC attribute to
prevent tailcalls
(https://sourceware.org/ml/libc-alpha/2017-01/msg00502.html) so there
you go.

This was bootstrapped on x86_64. Given that this is a minor addition,
I only ran the newly added regtests. I hope that's enough (the full
testsuite would take a week on my notebook...).

-I


0001-Added-no_tail_call-attribute.patch
Description: Binary data

Re: [RFC, PATCH][ASAN] Implement dynamic allocas/VLAs sanitization.

2017-05-29 Thread Yuri Gribov

On Wed, May 17, 2017 at 1:24 PM, Maxim Ostapenko wrote:
> Hi,
>
> this patch implements dynamic allocas/VLAs sanitization in ASan. Basically,
> this is implemented in the compiler in the following way:
>
> 1) For each __builtin_alloca{_with_align} increase its size and alignment to
> contain ASan redzones.
> 2) Poison redzones by calling the __asan_alloca_poison(alloc_addr, size)
> ASan runtime library function.
> 3) Remember the last allocated address in a separate variable called
> 'last_alloca_addr'. This will be used to implement unpoisoning.
> 4) On each stackrestore/return perform dynamic stack unpoisoning by calling
> the __asan_allocas_unpoison(last_alloca_addr, restored_sp) library function.
>
> With this patch I was able to find two bugs in GCC itself [1], [2] as well
> as catch a bug in Radare2 [3] initially found by Clang + LibFuzzer.
> I've also managed to build Chromium but didn't find any errors there.
>
> Does this patch look sensible for GCC? Any feedback/suggestions would be
> greatly appreciated.
>
> Thanks,
> -Maxim
>
> [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72765
> [2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80798
> [3] https://github.com/radare/radare2/issues/6918

Cc-ed sanitizer maintainers.

[PATCH, AArch64] Add x86 intrinsic headers to GCC AArch64 target

2017-05-29 Thread Hurugalawadi, Naveen

Hi,

Please find attached the patch that adds the first set of x86 intrinsic
headers to the AArch64 target.
The implementation is based on similar work targeted at PPC64LE:
https://gcc.gnu.org/ml/gcc-patches/2017-05/msg00550.html

As in the PowerPC work, we reuse the corresponding DejaGnu tests from
gcc/testsuite/gcc.target/i386/ in gcc/testsuite/gcc.target/aarch64, as the
source remains the same. The only modifications are target-related.

Bootstrapped and regression tested on aarch64-thunder-linux.

Please review the patch and let us know if you have any comments or
suggestions.

Thanks,
Naveen

2017-05-29  Naveen H.S  

[gcc]
* config.gcc (aarch64*-*-*): Add bmi2intrin.h, bmiintrin.h,
and x86intrin.h.
* config/aarch64/bmi2intrin.h: New file.
* config/aarch64/bmiintrin.h: New file.
* config/aarch64/x86intrin.h: New file.

[gcc/testsuite]

* gcc.target/aarch64/bmi-andn-1.c: New file.
* gcc.target/aarch64/bmi-andn-2.c: New file.
* gcc.target/aarch64/bmi-bextr-1.c: New file.
* gcc.target/aarch64/bmi-bextr-2.c: New file.
* gcc.target/aarch64/bmi-bextr-4.c: New file.
* gcc.target/aarch64/bmi-bextr-5.c: New file.
* gcc.target/aarch64/bmi-blsi-1.c: New file.
* gcc.target/aarch64/bmi-blsi-2.c: New file.
* gcc.target/aarch64/bmi-blsmsk-1.c: New file.
* gcc.target/aarch64/bmi-blsmsk-2.c: New file.
* gcc.target/aarch64/bmi-blsr-1.c: New file.
* gcc.target/aarch64/bmi-blsr-2.c: New file.
* gcc.target/aarch64/bmi-check.h: New file.
* gcc.target/aarch64/bmi-tzcnt-1.c: New file.
* gcc.target/aarch64/bmi-tzcnt-2.c: New file.
* gcc.target/aarch64/bmi2-bzhi32-1.c: New file.
* gcc.target/aarch64/bmi2-bzhi64-1.c: New file.
* gcc.target/aarch64/bmi2-bzhi64-1a.c: New file.
* gcc.target/aarch64/bmi2-check.h: New file.
* gcc.target/aarch64/bmi2-mulx32-1.c: New file.
* gcc.target/aarch64/bmi2-mulx32-2.c: New file.
* gcc.target/aarch64/bmi2-mulx64-1.c: New file.
* gcc.target/aarch64/bmi2-mulx64-2.c: New file.
* gcc.target/aarch64/bmi2-pdep32-1.c: New file.
* gcc.target/aarch64/bmi2-pdep64-1.c: New file.
* gcc.target/aarch64/bmi2-pext32-1.c: New file.
* gcc.target/aarch64/bmi2-pext64-1.c: New file.
* gcc.target/aarch64/bmi2-pext64-1a.c: New file.

diff --git a/gcc/config.gcc b/gcc/config.gcc
index f55dcaa..9eac70e 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -301,6 +301,7 @@ m32c*-*-*)
 aarch64*-*-*)
 	cpu_type=aarch64
 	extra_headers="arm_fp16.h arm_neon.h arm_acle.h"
+	extra_headers="${extra_headers} bmi2intrin.h bmiintrin.h x86intrin.h"
 	c_target_objs="aarch64-c.o"
 	cxx_target_objs="aarch64-c.o"
 	extra_objs="aarch64-builtins.o aarch-common.o cortex-a57-fma-steering.o"
diff --git a/gcc/config/aarch64/bmi2intrin.h b/gcc/config/aarch64/bmi2intrin.h
new file mode 100644
index 000..c797f22
--- /dev/null
+++ b/gcc/config/aarch64/bmi2intrin.h
@@ -0,0 +1,148 @@
+/* Copyright (C) 2011-2017 Free Software Foundation, Inc.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   GCC is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+/* This header is distributed to simplify porting x86_64 code that
+   makes explicit use of Intel intrinsics to Aarch64.
+   It is the user's responsibility to determine if the results are
+   acceptable and make additional changes as necessary.
+   Note that much code that uses Intel intrinsics can be rewritten in
+   standard C or GNU C extensions, which are more portable and better
+   optimized across multiple targets.  */
+
+#if !defined _X86INTRIN_H_INCLUDED
+# error "Never use <bmi2intrin.h> directly; include <x86intrin.h> instead."
+#endif
+
+#ifndef _BMI2INTRIN_H_INCLUDED
+#define _BMI2INTRIN_H_INCLUDED
+
+extern __inline unsigned int
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_bzhi_u32 (unsigned int __X, unsigned int __Y)
+{
+  return ((__X << (32 - __Y)) >> (32 - __Y));
+}
+
+extern __inline unsigned int
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mulx_u32 (unsigned int __X,

[contrib, committed] check_GNU_style_lib.py: Suggest to install all missing pip3 packages at once

2017-05-29 Thread Tom de Vries


Hi,

this patch simplifies the pip3 dependency setup for check_GNU_style_lib.py.

Instead of:
...
$ ./contrib/check_GNU_style.py
termcolor module is missing (run: pip3 install termcolor)
$ pip3 install termcolor
$ ./contrib/check_GNU_style.py
unidiff module is missing (run: pip3 install unidiff)
$ pip3 install unidiff
$
...

we now do:
...
$ ./contrib/check_GNU_style.py
termcolor and unidiff modules are missing (run: pip3 install termcolor unidiff)

$ pip3 install termcolor unidiff
$
...

Committed.

Thanks,
- Tom
check_GNU_style_lib.py: Suggest to install all missing pip3 packages at once

Instead of:
...
$ ./contrib/check_GNU_style.py
termcolor module is missing (run: pip3 install termcolor)
$ pip3 install termcolor
$ ./contrib/check_GNU_style.py
unidiff module is missing (run: pip3 install unidiff)
$ pip3 install unidiff
$
...

Do:
...
$ ./contrib/check_GNU_style.py
termcolor and unidiff modules are missing (run: pip3 install termcolor unidiff)
$ pip3 install termcolor unidiff
$
...

2017-05-28  Tom de Vries  

	* check_GNU_style_lib.py: Use import_pip3 to import pip3 packages.
	(import_pip3): New function.

---
 contrib/check_GNU_style_lib.py | 34 +++---
 1 file changed, 23 insertions(+), 11 deletions(-)

diff --git a/contrib/check_GNU_style_lib.py b/contrib/check_GNU_style_lib.py
index a1224c1..d924e68 100755
--- a/contrib/check_GNU_style_lib.py
+++ b/contrib/check_GNU_style_lib.py
@@ -28,17 +28,29 @@ import sys
 import re
 import unittest
 
-try:
-    from termcolor import colored
-except ImportError:
-    print('termcolor module is missing (run: pip3 install termcolor)')
-    exit(3)
-
-try:
-    from unidiff import PatchSet
-except ImportError:
-    print('unidiff module is missing (run: pip3 install unidiff)')
-    exit(3)
+def import_pip3(*args):
+    missing=[]
+    for (module, names) in args:
+        try:
+            lib = __import__(module)
+        except ImportError:
+            missing.append(module)
+            continue
+        if not isinstance(names, list):
+            names=[names]
+        for name in names:
+            globals()[name]=getattr(lib, name)
+    if len(missing) > 0:
+        missing_and_sep = ' and '.join(missing)
+        missing_space_sep = ' '.join(missing)
+        print('%s %s missing (run: pip3 install %s)'
+              % (missing_and_sep,
+                 ("module is" if len(missing) == 1 else "modules are"),
+                 missing_space_sep))
+        exit(3)
+
+import_pip3(('termcolor', 'colored'),
+            ('unidiff', 'PatchSet'))
 
 from itertools import *
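
For readers who want to try the approach outside the script, here is a
minimal standalone sketch, not the committed code.  The names find_missing
and missing_message are hypothetical.  Unlike import_pip3, the sketch
returns the missing modules instead of calling exit(3), does not write into
globals(), and uses importlib.import_module where the patch uses __import__.

```python
import importlib


def find_missing(specs):
    """Try to import each (module, names) pair.

    Returns (found, missing): a dict mapping each requested name to the
    imported object, and the list of modules that could not be imported.
    """
    found = {}
    missing = []
    for module, names in specs:
        try:
            lib = importlib.import_module(module)
        except ImportError:
            # Keep going so that every missing module is reported at once.
            missing.append(module)
            continue
        if not isinstance(names, list):
            names = [names]
        for name in names:
            found[name] = getattr(lib, name)
    return found, missing


def missing_message(missing):
    """Build the single hint line covering all missing modules."""
    noun = "module is" if len(missing) == 1 else "modules are"
    return "%s %s missing (run: pip3 install %s)" % (
        " and ".join(missing), noun, " ".join(missing))
```

Collecting every failure before reporting is what lets the script print a
single pip3 command instead of one per run.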

[contrib, committed] check_GNU_style.py: Read stdin if file argument is '-'

2017-05-29 Thread Tom de Vries


Hi,

this patch makes check_GNU_style.py read from stdin if the file argument 
is '-', similar to what check_GNU_style.sh does.


Committed.

Thanks,
- Tom
check_GNU_style.py: Read stdin if file argument is '-'

2017-05-28  Tom de Vries  

	* check_GNU_style_lib.py (check_GNU_style_file): Treat file argument as
	file handle.  Add and handle file_encoding argument.
	* check_GNU_style.py (main): Handle '-' file argument.  Call
	check_GNU_style_file with file handle as argument.

---
 contrib/check_GNU_style.py | 10 +-
 contrib/check_GNU_style_lib.py |  5 ++---
 2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/contrib/check_GNU_style.py b/contrib/check_GNU_style.py
index 6970ddf..61faa29 100755
--- a/contrib/check_GNU_style.py
+++ b/contrib/check_GNU_style.py
@@ -21,6 +21,7 @@
 # .  */
 
 import argparse
+import sys
 from check_GNU_style_lib import check_GNU_style_file
 
 def main():
@@ -30,6 +31,13 @@ def main():
         help = 'Display format',
         choices = ['stdio', 'quickfix'])
     args = parser.parse_args()
-    check_GNU_style_file(args.file, args.format)
+    filename = args.file
+    format = args.format
+
+    if filename == '-':
+        check_GNU_style_file(sys.stdin, None, format)
+    else:
+        with open(filename, 'rb') as diff_file:
+            check_GNU_style_file(diff_file, 'utf-8', format)
 
 main()
diff --git a/contrib/check_GNU_style_lib.py b/contrib/check_GNU_style_lib.py
index d924e68..e1031df 100755
--- a/contrib/check_GNU_style_lib.py
+++ b/contrib/check_GNU_style_lib.py
@@ -223,7 +223,7 @@ class LineLengthTest(unittest.TestCase):
         self.assertEqual(r.console_error,
             self.check.limit * 'a' + error_string(' = 123;'))
 
-def check_GNU_style_file(file, format):
+def check_GNU_style_file(file, file_encoding, format):
     checks = [LineLengthCheck(), SpacesCheck(), TrailingWhitespaceCheck(),
         SentenceSeparatorCheck(), SentenceEndOfCommentCheck(),
         SentenceDotEndCheck(), FunctionParenthesisCheck(),
@@ -231,8 +231,7 @@ def check_GNU_style_file(file, format):
         BracesOnSeparateLineCheck(), TrailinigOperatorCheck()]
     errors = []
 
-    with open(file, 'rb') as diff_file:
-        patch = PatchSet(diff_file, encoding = 'utf-8')
+    patch = PatchSet(file, encoding=file_encoding)
 
     for pfile in patch.added_files + patch.modified_files:
         t = pfile.target_file.lstrip('b/')
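
A small sketch of why the two call sites pass different encodings: sys.stdin
is already a text stream, whereas a file opened with 'rb' yields bytes that
still need decoding.  read_patch_lines is a hypothetical helper written for
illustration only; it is not the unidiff API, and it merely assumes that the
consumer decodes when an encoding is given.

```python
import io


def read_patch_lines(handle, encoding=None):
    """Return the lines of a patch, decoding bytes only when asked to.

    With encoding=None the lines are passed through unchanged, which is
    what a text stream such as sys.stdin needs.
    """
    lines = []
    for raw in handle:
        if encoding is not None and isinstance(raw, bytes):
            raw = raw.decode(encoding)
        lines.append(raw)
    return lines


# io.StringIO stands in for sys.stdin, io.BytesIO for open(name, 'rb').
from_stdin = read_patch_lines(io.StringIO("+int a;\n"), None)
from_file = read_patch_lines(io.BytesIO(b"+int a;\n"), "utf-8")
```

Both calls end up with the same list of str lines, so the checks that run
afterwards do not need to know where the patch came from.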

[contrib, committed] check_GNU_style_lib.py: Fix trailing whitespace check

2017-05-29 Thread Tom de Vries


Hi,

this patch fixes the trailing whitespace check in check_GNU_style_lib.py.

At the moment, the lines passed to the checks contain the EOL character,
so the trailing whitespace regexp '(\s+)$' matches a line such as '123\n',
which in fact has no trailing whitespace.


Fixed by removing the eol char.

Committed.

Thanks,
- Tom
check_GNU_style_lib.py: Fix trailing whitespace check

2017-05-28  Tom de Vries  

	* check_GNU_style_lib.py (TrailingWhitespaceCheck.check): Assert no
	trailing eol.
	(TrailingWhitespaceTest): New unit test.
	(check_GNU_style_file): Remove eol before checking.

---
 contrib/check_GNU_style_lib.py | 16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/contrib/check_GNU_style_lib.py b/contrib/check_GNU_style_lib.py
index e1031df..63d0538 100755
--- a/contrib/check_GNU_style_lib.py
+++ b/contrib/check_GNU_style_lib.py
@@ -104,6 +104,7 @@ class TrailingWhitespaceCheck:
         self.re = re.compile('(\s+)$')
 
     def check(self, filename, lineno, line):
+        assert(len(line) == 0 or line[-1] != '\n')
         m = self.re.search(line)
         if m != None:
             return CheckError(filename, lineno,
@@ -223,6 +224,18 @@ class LineLengthTest(unittest.TestCase):
         self.assertEqual(r.console_error,
             self.check.limit * 'a' + error_string(' = 123;'))
 
+class TrailingWhitespaceTest(unittest.TestCase):
+    def setUp(self):
+        self.check = TrailingWhitespaceCheck()
+
+    def test_trailing_whitespace_check_basic(self):
+        r = self.check.check('foo', 123, 'a = 123;')
+        self.assertIsNone(r)
+        r = self.check.check('foo', 123, 'a = 123; ')
+        self.assertIsNotNone(r)
+        r = self.check.check('foo', 123, 'a = 123;\t')
+        self.assertIsNotNone(r)
+
 def check_GNU_style_file(file, file_encoding, format):
     checks = [LineLengthCheck(), SpacesCheck(), TrailingWhitespaceCheck(),
         SentenceSeparatorCheck(), SentenceEndOfCommentCheck(),
@@ -244,7 +257,8 @@ def check_GNU_style_file(file, file_encoding, format):
             for line in hunk:
                 if line.is_added and line.target_line_no != None:
                     for check in checks:
-                        e = check.check(t, line.target_line_no, line.value)
+                        line_chomp = line.value.replace('\n', '')
+                        e = check.check(t, line.target_line_no, line_chomp)
                         if e != None:
                             errors.append(e)
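
The false positive described above is easy to reproduce in isolation.  The
sketch below uses the same regexp as the check; has_trailing_whitespace and
chomp are hypothetical helper names, not functions from the script.

```python
import re

# Same pattern as TrailingWhitespaceCheck, written as a raw string.
TRAILING_WS = re.compile(r'(\s+)$')


def has_trailing_whitespace(line):
    """Return True if the regexp finds whitespace at the end of line."""
    return TRAILING_WS.search(line) is not None


def chomp(line):
    """Drop the newline, as the patch does before running the checks."""
    return line.replace('\n', '')


clean = '123\n'        # no trailing whitespace apart from the newline
dirty = 'a = 123; \n'  # a real trailing space before the newline

false_positive = has_trailing_whitespace(clean)    # '\n' matches \s
fixed = has_trailing_whitespace(chomp(clean))      # nothing left to match
still_caught = has_trailing_whitespace(chomp(dirty))
```

Because '\n' is itself matched by \s, every line ending in a newline used to
be flagged; removing it first leaves only real trailing blanks.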

[PATCH 2/2] DWARF: make it possible to emit debug info for declarations only

2017-05-29 Thread Pierre-Marie de Rodat

Hello,

The DWARF back-end used to systematically ignore file-scope function and
variable declarations.  While this is justified in languages like C/C++,
where such declarations can appear in several translation units and thus
uselessly bloat the debug info, this behavior is counter-productive in
languages with a well-defined module system.  Specifically, it prevents
the description of imported entities that belong to foreign languages,
making them unavailable from debuggers.

Take for instance:

package C_Binding is
   function My_C_Function (I : Integer) return Integer;
   pragma Import (C, My_C_Function, "my_c_function");
end C_Binding;

This makes the C function "my_c_function" available to Ada programs
under the following name: C_Binding.My_C_Function.  When GCC compiles
it, though, it is represented as a FUNCTION_DECL node with DECL_EXTERNAL
set and a null DECL_INITIAL, which used to be discarded unconditionally
in the DWARF back-end.

This patch introduces a new DECL language hook:
emit_debug_info_for_decl_p, which the DWARF back-end uses to determine
whether it should emit debug info for some declaration.  This makes it
possible for front-ends to decide the appropriate behavior.

This patch also updates the Ada front-end to override this hook, so that
declarations such as the above do generate debugging information.

Bootstrapped and reg-tested on x86_64-linux.  Ok to commit? Thank you in
advance!

gcc/
* langhooks.h
(lang_hooks_for_decls::emit_debug_info_for_decl_p): New field.
* langhooks-def.h
(LANG_HOOKS_EMIT_DEBUG_INFO_FOR_DECL_P): New macro.
* dwarf2out.c (gen_decl_die): Use the new hook to determine
whether to ignore file-scope declarations.
(dwarf2out_early_global_decl): Likewise.
(dwarf2out_decl): Likewise.

gcc/ada
* gcc-interface/ada-tree.h (DECL_FUNCTION_IS_DEF): New macro.
* gcc-interface/misc.c (gnat_emit_debug_info_for_decl_p): New
function.
(LANG_HOOKS_EMIT_DEBUG_INFO_FOR_DECL_P): Override macro.
* gcc-interface/trans.c (Compilation_Unit_to_gnu): Tag the
elaboration procedure as a definition.
(Subprogram_Body_to_gnu): Tag the subprogram as a definition.
* gcc-interface/decl.c (gnat_to_gnu_entity): Tag declarations of
imported subprograms for the current compilation unit as
definitions.  Disable debug info for references to variables.
* gcc-interface/utils.c (create_subprog_decl): Add a DEFINITION
parameter.  If it is true, tag the function as a definition.
Update all callers.
(gnat_pushdecl): Add external DECLs that are not built-in
functions to their binding scope.
(gnat_write_global_declarations): Emit debug info for imported
functions.  Filter out external variables for which debug info
is disabled.
* gcc-interface/gigi.h (create_subprog_decl): Update
declaration.

git-svn-id: svn+ssh://svn.us.adacore.com/Dev/trunk/gcc-interfaces@328405 f8352e7e-cb20-0410-8ce7-b5d9e71c585c
---
 gcc/ada/gcc-interface/ada-tree.h |  7 +-
 gcc/ada/gcc-interface/decl.c | 19 +--
 gcc/ada/gcc-interface/gigi.h |  5 +++-
 gcc/ada/gcc-interface/misc.c | 11 +
 gcc/ada/gcc-interface/trans.c| 52 
 gcc/ada/gcc-interface/utils.c| 34 +++---
 gcc/dwarf2out.c  | 13 ++
 gcc/langhooks-def.h  |  2 ++
 gcc/langhooks.h  |  9 +++
 9 files changed, 109 insertions(+), 43 deletions(-)

diff --git a/gcc/ada/gcc-interface/ada-tree.h b/gcc/ada/gcc-interface/ada-tree.h
index a3d38b1b22e..511a0bd8173 100644
--- a/gcc/ada/gcc-interface/ada-tree.h
+++ b/gcc/ada/gcc-interface/ada-tree.h
@@ -6,7 +6,7 @@
  *  *
  *  C Header File   *
  *  *
- *  Copyright (C) 1992-2016, Free Software Foundation, Inc. *
+ *  Copyright (C) 1992-2017, Free Software Foundation, Inc. *
  *  *
  * GNAT is free software;  you can  redistribute it  and/or modify it under *
  * terms of the  GNU General Public License as published  by the Free Soft- *
@@ -463,6 +463,11 @@ do {  \
a discriminant of a discriminated type without default expression.  */
 #define DECL_INVARIANT_P(NODE) DECL_LANG_FLAG_4 (FIELD_DECL_CHECK (NODE))
 
+/* Nonzero in a FUNCTION_DECL if this is a definition, i.e. if it was created
+   by a call to gnat_to_gnu_entity with definition set to True.  */
+#define DECL_FUNCTION_IS_DEF(NODE) \
+  DECL_LANG_FLAG_4 (FUNCTION_DECL_CHECK (NODE))
+
 /* Nonzero in a VAR_DECL if it is a temporary created to

[PATCH 1/2] gimplify_modify_expr: avoid DECL_DEBUG_EXPR links across functions

2017-05-29 Thread Pierre-Marie de Rodat

Hello,

An upcoming patch exposes a bug in gimplify_modify_expr.  There, we try
not to create DECL_DEBUG_EXPR links across functions, however we don't
check that *FROM_P actually belongs to the current function before
modifying it.  This patch fixes this oversight.

Bootstrapped and regtested on x86_64-linux.  Ok to commit?  Thank you in
advance!

gcc/

* gimplify.c (gimplify_modify_expr): Don't create a
DECL_DEBUG_EXPR link if *FROM_P does not belong to the current
function.
---
 gcc/gimplify.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 455a6993e15..2c7fc9fabd1 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -5580,7 +5580,8 @@ gimplify_modify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
   && DECL_IGNORED_P (*from_p)
   && DECL_P (*to_p)
   && !DECL_IGNORED_P (*to_p)
-  && decl_function_context (*to_p) == current_function_decl)
+  && decl_function_context (*to_p) == current_function_decl
+  && decl_function_context (*from_p) == current_function_decl)
 {
   if (!DECL_NAME (*from_p) && DECL_NAME (*to_p))
DECL_NAME (*from_p)
-- 
2.13.0

Re: Optimisation of std::binary_search of the <algorithm> header

2017-05-29 Thread jay pokarna

Respected Sir,
I am sorry for the poor wording in my previous mail. I wanted to convey
that C++ has generalised the algorithm to work on various data
structures, and that this generalisation is not worth its performance
cost.

Could you give me a contact for the standards committee?

Regards,
Jay Pokarna

On Mon, May 29, 2017 at 1:13 PM, Tim Song  wrote:
> I'm not sure if you forgot to CC the lists or intended to direct the
> email to me alone.
>
> On Mon, May 29, 2017 at 2:41 AM, jay pokarna  wrote:
>> I know that cpp wants to generalise its methods so that they can be
>> used with various data structures. But the cost of generalisation is
>> that we have to compromise a lot on performance.
>
> That's neither here nor there. binary_search's performance as applied
> to forward iterators has nothing to do with its performance as to
> random access iterators.
>
>> I would like to recommend cpp to allow the use of binary_search only
>> on data structures that use random access models.
>
> C++ has an international standard and GCC/libstdc++ is an
> implementation of that standard. Proposal for changes to the standard
> should be directed to the standards committee, not GCC's mailing
> lists.
>
>> The technique that I have used is square root decomposition. I think
>> that it will be better than the one that is implemented.
>
> And here's the problem: you *think* it will be better. Just thinking
> is not enough. You need to *prove* it with benchmarks that show that
> your technique is in fact faster than the current one.
>
> If there is in fact substantial improvement, *then* it's the time to
> consider generalization: when is this technique faster than the stock
> one? Always? Only random access? Only pointers? Only built-in types?
> Again, this needs to be shown with appropriate benchmark numbers.
>
> Then, finally, you can write a patch proposing that libstdc++'s
> binary_search be modified to use this technique in situations when
> it's shown to be faster.
>
> For a recent example of how something like this should look, see
> https://gcc.gnu.org/ml/libstdc++/2016-12/msg00051.html and its
> LLVM/libc++ counterpart, https://reviews.llvm.org/D27068. Note the
> copious benchmark numbers showing that the proposed change was indeed
> (much) better than the previous.



-- 
Regards,
Jay Pokarna
CS Sophomore
Birla Institute of Technology and Science, Pilani
Pilani Campus
Rajasthan - 333031.

Re: [PATCH] Add no_tail_call attribute

2017-05-29 Thread Alexander Monakov

Hi,

On Mon, 29 May 2017, Yuri Gribov wrote:

> Hi all,
> 
> As discussed in
> https://sourceware.org/ml/libc-alpha/2017-01/msg00455.html , some
> libdl functions rely on the return address to figure out the calling
> DSO and then use this information in computation (e.g. the output of
> dlsym depends on which library called it).
>
> As reported in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66826 this
> may break under tailcall optimization, i.e. in cases like
>
>   return dlsym(...);
>
> Carlos confirmed that they would prefer to have a GCC attribute to
> prevent tailcalls
> (https://sourceware.org/ml/libc-alpha/2017-01/msg00502.html) so there
> you go.

A few comments:

- the new attribute will need documentation
- as mentioned earlier, calls to dlsym via a function pointer may still lead to
  the same issue (so the documentation should mention that)
- this suppresses tailcalls for all dlsym calls, although only those with
  RTLD_NEXT are magic and need such suppression

Are there any other possible uses for this attribute?  Given the issue of
calls-via-pointers, I don't understand why Glibc needs it, because for direct
calls Jakub pointed out a simpler solution that works with existing compilers:

#define dlsym(h, s) \
  ({ \
  void *__r = dlsym (h, s); \
  asm ("" : "+r" (__r)); \
  __r; })

(I think life would be easier for everyone if instead of making RTLD_NEXT magic,
there was simply a way to look up a handle of the "next" dso...)

Alexander

Re: [PATCH] Add no_tail_call attribute

2017-05-29 Thread Yuri Gribov

On Mon, May 29, 2017 at 9:14 AM, Alexander Monakov  wrote:
> Hi,
>
> On Mon, 29 May 2017, Yuri Gribov wrote:
>
>> Hi all,
>>
>> As discussed in
>> https://sourceware.org/ml/libc-alpha/2017-01/msg00455.html , some
>> libdl functions rely on the return address to figure out the calling
>> DSO and then use this information in computation (e.g. the output of
>> dlsym depends on which library called it).
>>
>> As reported in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66826 this
>> may break under tailcall optimization, i.e. in cases like
>>
>>   return dlsym(...);
>>
>> Carlos confirmed that they would prefer to have a GCC attribute to
>> prevent tailcalls
>> (https://sourceware.org/ml/libc-alpha/2017-01/msg00502.html) so there
>> you go.
>
> A few comments:
>
> - the new attribute will need documentation

Right, completely forgot...

> - as mentioned earlier, calls to dlsym via a function pointer may still lead
>   to the same issue (so the documentation should mention that)

Yes, but the compiler will emit an error on a cast to a function pointer
that lacks the attribute, so hopefully we can catch situations like this.

> - this suppresses tailcalls for all dlsym calls, although only those with
>   RTLD_NEXT are magic and need such suppression

Note that other Glibc functions need no_tail_call as well, e.g. dlinfo
and dlmopen (grep for RETURN_ADDRESS in dlfcn/ for the full list).

> Are there any other possible uses for this attribute?  Given the issue of
> calls-via-pointers, I don't understand why Glibc needs it, because for direct
> calls Jakub pointed out a simpler solution that works with existing compilers:
>
> #define dlsym(h, s) \
>   ({ \
>   void *__r = dlsym (h, s); \
>   asm ("" : "+r" (__r)); \
>   __r; })

True, perhaps they were worried that inline asm may have performance
implications.

-Y

Re: [PATCH, rs6000] Fold vector absolutes in GIMPLE

2017-05-29 Thread Richard Biener

On Fri, May 26, 2017 at 7:19 PM, Will Schmidt  wrote:
> Hi,
>
> Add support for early expansion of vector absolute built-ins.
>
> Bootstraps currently running (p7,p8le,p8be).
>
> OK for trunk?

What's the documented behavior for vec_abs with respect to an argument
of value INT_MIN?

Richard.

> Thanks,
> -Will
>
>
> [gcc]
>
> 2017-05-26  Will Schmidt  
>
> * config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): Add handling
> for early expansion of vector absolute builtins.
>
> [gcc/testsuite]
>
> 2017-05-15  Will Schmidt  
>
> * gcc.target/powerpc/fold-vec-abs-char.c: New.
> * gcc.target/powerpc/fold-vec-abs-floatdouble.c: New.
> * gcc.target/powerpc/fold-vec-abs-int.c: New.
> * gcc.target/powerpc/fold-vec-abs-longlong.c: New.
> * gcc.target/powerpc/fold-vec-abs-short.c: New.
>
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index dac673c..104a052 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -17333,6 +17333,21 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
> gsi_replace (gsi, g, true);
> return true;
>}
> +/* flavors of vec_abs. */
> +case ALTIVEC_BUILTIN_ABS_V16QI:
> +case ALTIVEC_BUILTIN_ABS_V8HI:
> +case ALTIVEC_BUILTIN_ABS_V4SI:
> +case ALTIVEC_BUILTIN_ABS_V4SF:
> +case P8V_BUILTIN_ABS_V2DI:
> +case VSX_BUILTIN_XVABSDP:
> +  {
> +   arg0 = gimple_call_arg (stmt, 0);
> +   lhs = gimple_call_lhs (stmt);
> +   gimple *g = gimple_build_assign (lhs, ABS_EXPR, arg0);
> +   gimple_set_location (g, gimple_location (stmt));
> +   gsi_replace (gsi, g, true);
> +   return true;
> +  }
>  default:
>break;
>  }
> diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-char.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-char.c
> new file mode 100644
> index 000..239c919
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-char.c
> @@ -0,0 +1,18 @@
> +/* Verify that overloaded built-ins for vec_abs with char
> +   inputs produce the right results.  */
> +
> +/* { dg-do compile } */
> +/* { dg-require-effective-target powerpc_altivec_ok } */
> +/* { dg-options "-maltivec -O2" } */
> +
> +#include <altivec.h>
> +
> +vector signed char
> +test2 (vector signed char x)
> +{
> +  return vec_abs (x);
> +}
> +
> +/* { dg-final { scan-assembler-times "vspltisw|vxor" 1 } } */
> +/* { dg-final { scan-assembler-times "vsububm" 1 } } */
> +/* { dg-final { scan-assembler-times "vmaxsb" 1 } } */
> diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-floatdouble.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-floatdouble.c
> new file mode 100644
> index 000..1a08618
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-floatdouble.c
> @@ -0,0 +1,23 @@
> +/* Verify that overloaded built-ins for vec_abs with float and
> +   double inputs for VSX produce the right results.  */
> +
> +/* { dg-do compile } */
> +/* { dg-require-effective-target powerpc_vsx_ok } */
> +/* { dg-options "-mvsx -O2" } */
> +
> +#include <altivec.h>
> +
> +vector float
> +test1 (vector float x)
> +{
> +  return vec_abs (x);
> +}
> +
> +vector double
> +test2 (vector double x)
> +{
> +  return vec_abs (x);
> +}
> +
> +/* { dg-final { scan-assembler-times "xvabssp" 1 } } */
> +/* { dg-final { scan-assembler-times "xvabsdp" 1 } } */
> diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-int.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-int.c
> new file mode 100644
> index 000..caf8861
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-int.c
> @@ -0,0 +1,18 @@
> +/* Verify that overloaded built-ins for vec_abs with int
> +   inputs produce the right results.  */
> +
> +/* { dg-do compile } */
> +/* { dg-require-effective-target powerpc_altivec_ok } */
> +/* { dg-options "-maltivec -O2 " } */
> +
> +#include <altivec.h>
> +
> +vector signed int
> +test1 (vector signed int x)
> +{
> +  return vec_abs (x);
> +}
> +
> +/* { dg-final { scan-assembler-times "vspltisw|vxor" 1 } } */
> +/* { dg-final { scan-assembler-times "vsubuwm" 1 } } */
> +/* { dg-final { scan-assembler-times "vmaxsw" 1 } } */
> diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-longlong.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-longlong.c
> new file mode 100644
> index 000..5b59d19
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-longlong.c
> @@ -0,0 +1,18 @@
> +/* Verify that overloaded built-ins for vec_abs with long long
> +   inputs produce the right results.  */
> +
> +/* { dg-do compile } */
> +/* { dg-require-effective-target powerpc_p8vector_ok } */
> +/* { dg-options "-mpower8-vector -O2" } */
> +
> +#include <altivec.h>
> +
> +vector signed long long
> +test3 (vector signed long long x)
> +{
> +  return vec_abs (x);
> +}
> +
> +/* { dg-final { scan-assembler-times "vspltisw|vxor" 1 } } */
> +/* { dg-final { scan-assembler-times "vsubudm" 1 } } */
> +/* { dg-final { scan-assembler-times "vmaxsd"

Re: builtin fenv functions

2017-05-29 Thread Richard Biener

On Mon, May 29, 2017 at 12:09 AM, Marc Glisse  wrote:
> On Fri, 26 May 2017, Richard Biener wrote:
>
> Similarly, I don't see div as a builtin in that file, only FILE* has
> special code, but that doesn't seem worth the trouble here. So I am
> only declaring the 5 "simple" functions, with minimal properties:
> leaf, nothrow, and for fegetround pure (glibc already declares it
> that way). We can then discuss the safety of future optimizations on
> a case by case basis.



 +DEF_C99_BUILTIN(BUILT_IN_FERAISEEXCEPT, "feraiseexcept",
 BT_FN_INT_INT, ATTR_NOTHROW_LEAF_LIST)

 I think feraiseexcept shouldn't be nothrow?
>>>
>>>
>>>
>>> glibc marks it as nothrow. I can remove the nothrow flag for now, for
>>> safety. It may trap, but it does not throw a C++ exception AFAIU.
>>
>>
>> Also with -fnon-call-exceptions?
>
>
> Hmm, maybe on windows where trap handlers turn into system exceptions which
> are handled like C++ exceptions... I am happy to remove nothrow.

Ok.  I suppose as glibc has it nothrow, differing is somewhat pointless:
as soon as someone includes the fenv.h header it'll get overridden.

>>>> But it may be pure.
>>>
>>>
>>> It writes to the exception register (aka memory for now), so I would
>>> hardly
>>> call it pure.
>>
>>
>> But it doesn't have to be ordered with control word writes/reads, no?
>
>
> Not sure what you mean here. feraiseexcept(FE_DIVBYZERO) is equivalent to
> 1./0., it writes to the exception status flag. Its order with respect to
> fetestexcept must be preserved.

I see.

>>>> Likewise fetestexcept may be pure?
>>>
>>>
>>>
>>> Too unsafe for now, since any FP operation can write to the memory that
>>> fetestexcept reads.
>>
>>
>> Ah...  but then FP operations are not ordered with the builtins anyway,
>> only FP loads/stores would be.
>
>
> Since gcc doesn't handle fenv properly, people have been using a number of
> workarounds, in particular with pass-through asm, sometimes volatile,
> occasionally with the "memory" clobber.
>
> Some of those versions would still work with pure, but the attribute
> increases the likelihood of breaking some of those uses, and I don't know if
> it would ever help in practice, so I would rather not add it for now.
> fegetround is very different since it can safely swap position with an
> adjacent float operation.
>
>> After all having builtins is only the first easiest step of properly
>> modeling
>> dependences between FP ops and the FP control/exception registers.
>
>
> Yes, I didn't expect adding those 5 builtins (modulo the nothrow flag) to be
> controversial...

Surely not.  I'm fine with erring on the conservative side for now.

Thanks,
Richard.

> --
> Marc Glisse
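
To illustrate the ordering concern discussed above, here is a minimal sketch that uses only the standard <cfenv> functions. The helper name raise_then_test is made up for this note and is not part of the patch. It shows why feraiseexcept and fetestexcept have to stay in program order: the second call reads the status flag that the first one writes.

```cpp
#include <cassert>
#include <cfenv>

// Hypothetical helper, not part of the patch: clears the exception flags,
// raises FE_DIVBYZERO, then reads the flag back.  If fetestexcept were
// hoisted above feraiseexcept (something a "pure" attribute could permit),
// the function would return 0 instead of FE_DIVBYZERO.
int raise_then_test()
{
  int rc = std::feclearexcept(FE_ALL_EXCEPT);
  assert(rc == 0);                          // clearing is expected to succeed
  std::feraiseexcept(FE_DIVBYZERO);         // writes the status flag
  return std::fetestexcept(FE_DIVBYZERO);   // reads the status flag
}
```

This assumes a target that defines FE_DIVBYZERO, which the C standard leaves optional.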

Re: [PATCH] Add header implementation of std::to_string for integers (PR libstdc++/71108)

2017-05-29 Thread Adrian Wielgosik

> Assuming that the locale issue isn't a problem, can that be reused?

The to_chars patch uses C++14 features, while to_string is C++11. If
that was solved, it probably could be used.
However, as far as I know, simply using to_chars in to_string would
technically be suboptimal, because it needs three loops:
- in to_chars, to determine length of the string
- in to_chars, to format the number
- in to_string, to copy the formatted string to std::string

Meanwhile, ideally to_string can use only two loops:
- to determine length of the std::string
- to format the number in constructed std::string

OR (this is what I am doing)
- to format the number in temporary buffer
- to copy the formatted string to std::string

That said, the proposed not-yet-committed to_chars implementation is
more optimized than my code (it does fewer integer divisions), so it
may perform just as well as mine or better.
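
For reference, the first "two loops" variant listed above (determine the length, then format directly into the constructed std::string) could look roughly like the sketch below. It is only an illustration for unsigned values; the function name to_string_two_loops is made up here, and this is neither the proposed libstdc++ code nor the to_chars patch.

```cpp
#include <cassert>
#include <string>

// Hypothetical sketch, not the proposed libstdc++ code.
std::string to_string_two_loops(unsigned long long value)
{
  // Loop 1: determine the number of decimal digits.
  std::string::size_type len = 1;
  for (unsigned long long v = value; v >= 10; v /= 10)
    ++len;

  // Construct the string at its final size, then
  // loop 2: write the digits in place, least significant digit first.
  std::string result(len, '0');
  std::string::size_type pos = len;
  do
    {
      result[--pos] = static_cast<char>('0' + value % 10);
      value /= 10;
    }
  while (value != 0);

  assert(pos == 0);  // both loops must agree on the digit count
  return result;
}
```

Compared with the temporary-buffer variant, this avoids the copy at the cost of one division per digit in each of the two loops.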

Backport to GCC5

2017-05-29 Thread Martin Liška

Hello.

There's a series of patches that I installed on GCC 6, and the majority of them
are also relevant to the GCC 5 branch.

I'm going to install the patches.
Martin

From 9cbe2ada95218219ca1de6e5d9c839509f8cd6ab Mon Sep 17 00:00:00 2001
From: marxin 
Date: Tue, 28 Mar 2017 09:01:57 +
Subject: [PATCH 01/12] Backport r246525

gcc/ChangeLog:

2017-03-28  Martin Liska  

	PR ipa/80104
	* cgraphunit.c (cgraph_node::expand_thunk): Mark argument of a
	thunk call as DECL_GIMPLE_REG_P when vector or complex type.

gcc/testsuite/ChangeLog:

2017-03-28  Martin Liska  

	PR ipa/80104
	* gcc.dg/ipa/pr80104.c: New test.
---
 gcc/cgraphunit.c   |  4 
 gcc/testsuite/gcc.dg/ipa/pr80104.c | 15 +++
 2 files changed, 19 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/ipa/pr80104.c

diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
index d4db126cbd5..5f0b06ebec0 100644
--- a/gcc/cgraphunit.c
+++ b/gcc/cgraphunit.c
@@ -1673,6 +1673,10 @@ cgraph_node::expand_thunk (bool output_asm_thunks, bool force_gimple_thunk)
 	for (; i < nargs; i++, arg = DECL_CHAIN (arg))
 	  {
 	tree tmp = arg;
+	if (VECTOR_TYPE_P (TREE_TYPE (arg))
+		|| TREE_CODE (TREE_TYPE (arg)) == COMPLEX_TYPE)
+	  DECL_GIMPLE_REG_P (arg) = 1;
+
 	if (!is_gimple_val (arg))
 	  {
 		tmp = create_tmp_reg (TYPE_MAIN_VARIANT
diff --git a/gcc/testsuite/gcc.dg/ipa/pr80104.c b/gcc/testsuite/gcc.dg/ipa/pr80104.c
new file mode 100644
index 000..7e75c9907e7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/pr80104.c
@@ -0,0 +1,15 @@
+/* PR ipa/80104 */
+/* { dg-do compile } */
+/* { dg-options "-fipa-icf" } */
+
+float
+a (_Complex float b)
+{
+  return *&b;
+}
+
+float
+c (_Complex float b)
+{
+  return (&b)[0];
+}
-- 
2.12.2

From 097cdfb997ec6059947cab918197d9462897191e Mon Sep 17 00:00:00 2001
From: marxin 
Date: Tue, 28 Mar 2017 11:37:22 +
Subject: [PATCH 02/12] Backport r246530

gcc/ChangeLog:

2017-03-28  Richard Biener  

	PR ipa/80205
	* tree-inline.c (copy_phis_for_bb): Do not create PHI node
	without arguments, generate default definition of a SSA name.

gcc/testsuite/ChangeLog:

2017-03-28  Martin Liska  

	PR ipa/80205
	* g++.dg/ipa/pr80205.C: New test.
---
 gcc/testsuite/g++.dg/ipa/pr80205.C | 34 ++
 gcc/tree-inline.c  | 94 --
 2 files changed, 84 insertions(+), 44 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/ipa/pr80205.C

diff --git a/gcc/testsuite/g++.dg/ipa/pr80205.C b/gcc/testsuite/g++.dg/ipa/pr80205.C
new file mode 100644
index 000..460bdcb02ca
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ipa/pr80205.C
@@ -0,0 +1,34 @@
+// PR ipa/80205
+// { dg-options "-fnon-call-exceptions --param early-inlining-insns=100 -O2" }
+
+class a
+{
+public:
+  virtual ~a ();
+};
+class b
+{
+public:
+  template  b (c);
+  ~b () { delete d; }
+  void
+  operator= (b e)
+  {
+b (e).f (*this);
+  }
+  void
+  f (b &e)
+  {
+a g;
+d = e.d;
+e.d = &g;
+  }
+  a *d;
+};
+void
+h ()
+{
+  b i = int();
+  void j ();
+  i = j;
+}
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index e8c066015f5..60f79336cd7 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -2347,53 +2347,59 @@ copy_phis_for_bb (basic_block bb, copy_body_data *id)
 	{
 	  walk_tree (&new_res, copy_tree_body_r, id, NULL);
 	  new_phi = create_phi_node (new_res, new_bb);
-	  FOR_EACH_EDGE (new_edge, ei, new_bb->preds)
+	  if (EDGE_COUNT (new_bb->preds) == 0)
 	{
-	  edge old_edge = find_edge ((basic_block) new_edge->src->aux, bb);
-	  tree arg;
-	  tree new_arg;
-	  edge_iterator ei2;
-	  location_t locus;
-
-	  /* When doing partial cloning, we allow PHIs on the entry block
-		 as long as all the arguments are the same.  Find any input
-		 edge to see argument to copy.  */
-	  if (!old_edge)
-		FOR_EACH_EDGE (old_edge, ei2, bb->preds)
-		  if (!old_edge->src->aux)
-		break;
+	  /* Technically we'd want a SSA_DEFAULT_DEF here... */
+	  SSA_NAME_DEF_STMT (new_res) = gimple_build_nop ();
+	}
+	  else
+	FOR_EACH_EDGE (new_edge, ei, new_bb->preds)
+	  {
+		edge old_edge = find_edge ((basic_block) new_edge->src->aux, bb);
+		tree arg;
+		tree new_arg;
+		edge_iterator ei2;
+		location_t locus;
+
+		/* When doing partial cloning, we allow PHIs on the entry block
+		   as long as all the arguments are the same.  Find any input
+		   edge to see argument to copy.  */
+		if (!old_edge)
+		  FOR_EACH_EDGE (old_edge, ei2, bb->preds)
+		if (!old_edge->src->aux)
+		  break;
 
-	  arg = PHI_ARG_DEF_FROM_EDGE (phi, old_edge);
-	  new_arg = arg;
-	  walk_tree (&new_arg, copy_tree_body_r, id, NULL);
-	  gcc_assert (new_arg);
-	  /* With return slot optimization we can end up with
-	 non-gimple (foo *)&this->m, fix that here.  */
-	  if (TREE_CODE (new_arg) != SSA_NAME
-		  && TREE_CODE (new_arg) != FUNCTION_DECL
-		  && !is_gimple_val (new_arg))
-		{
-		  gimple_seq stmts = NULL;

Re: [PATCH 4/6] Port prefetch configuration from aarch32 to aarch64

2017-05-29 Thread Maxim Kuvyrkov

> On Jan 30, 2017, at 2:48 PM, Maxim Kuvyrkov  wrote:
> 
> This patch ports the prefetch configuration from the aarch32 backend to aarch64.
> There is no code-generation change from this patch.
> 
> This patch also happens to address Kyrill's comment on Andrew's prefetching 
> patch at https://gcc.gnu.org/ml/gcc-patches/2017-01/msg02133.html .
> 
> This patch also fixes a minor bug in aarch64_override_options_internal(), 
> which used "selected_cpu->tune" instead of "aarch64_tune_params".
> 
> Bootstrapped and regtested on x86_64-linux-gnu and aarch64-linux-gnu.


AArch64 maintainers, ping?  Here is a patch rebased against current trunk.
OK, assuming bootstrap and reg-test pass?


--
Maxim Kuvyrkov
www.linaro.org




0003-Port-prefetch-configuration-from-aarch32-to-aarch64-.patch
Description: Binary data

Re: [PATCH 5/6][AArch64] Enable -fprefetch-loop-arrays at -O3 for cores that benefit from prefetching.

2017-05-29 Thread Maxim Kuvyrkov

On 3 February 2017 at 14:58, Maxim Kuvyrkov  wrote:
>> On Jan 30, 2017, at 5:50 PM, Maxim Kuvyrkov  
>> wrote:
>>
>>> On Jan 30, 2017, at 3:23 PM, Kyrill Tkachov  
>>> wrote:
>>>
>>> Hi Maxim,
>>>
>>> On 30/01/17 12:06, Maxim Kuvyrkov wrote:
>>>> This patch enables prefetching at -O3 for aarch64 cores that set
>>>> "simultaneous prefetches" parameter above 0.  There are currently no such
>>>> settings, so this patch doesn't change default code generation.
>>>>
>>>> I'm now working on improvements to -fprefetch-loop-arrays pass to make it
>>>> suitable for -O2. I'll post this work in the next month.
>>>>
>>>> Bootstrapped and regtested on x86_64-linux-gnu and aarch64-linux-gnu.

>>>
>>> Are you aiming to get this in for GCC 8?
>>> I have one small comment on this patch:
>>>
>>> +  /* Enable sw prefetching at -O3 for CPUS that have prefetch, and we
>>> + have deemed it beneficial (signified by setting
>>> + prefetch.num_slots to 1 or more).  */
>>> +  if (flag_prefetch_loop_arrays < 0
>>> +  && HAVE_prefetch
>>>
>>> HAVE_prefetch will always be true on aarch64.
>>> I imagine midend code that had logic like this would need this check, but 
>>> aarch64-specific code shouldn't need it.
>>
>> Agree, I'll remove HAVE_prefetch.
>>
>> This pattern was copied from other backends, and HAVE_prefetch is most 
>> likely a historical artifact.
>
> Andrew raised a good point in the review of his patch that it is a bad idea 
> to use one of prefetching parameters (simultaneous_prefetches) as indicator 
> for whether to enable prefetching pass by default.  Indeed there are cases 
> when we want to set simultaneous_prefetch according to HW documentation (or 
> experimental results), but not enable prefetching pass by default.
>
> This update to the patch addresses it.  The patch adds a new explicit field 
> to prefetch tuning structure "default_opt_level" that sets optimization level 
> from which prefetching should be enabled by default.  The current value is to 
> enable prefetching at -O3; additionally, this parameter will come in handy for
> enabling prefetching at -O2 [when it is ready].
>
> --
> Maxim Kuvyrkov
> www.linaro.org
>
>

AArch64 maintainers, ping?  Here is a rebased patch against current
trunk.  OK to commit, assuming bootstrap and reg-test pass?

-- 
Maxim Kuvyrkov
www.linaro.org


0004-Enable-fprefetch-loop-arrays-at-O3-for-cores-that-be.patch
Description: Binary data
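
As a rough illustration of the decision described in the quoted update, the sketch below models a per-core "default_opt_level" field and the check that consults it. Only the field name default_opt_level and the "flag < 0 means not set by the user" convention come from the messages above; the struct name, the helper name, and the use of -1 for "never enable by default" are assumptions made for this note, not the contents of the attached patch.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for the prefetch tuning structure.
struct prefetch_tune_sketch
{
  int num_slots;          // simultaneous prefetches the core can track
  int default_opt_level;  // -O level from which prefetching is on by
                          // default; -1 (assumed here) means never
};

// flag_value < 0: neither -fprefetch-loop-arrays nor
// -fno-prefetch-loop-arrays was given, so the per-core default applies.
bool prefetch_enabled_sketch(int flag_value, int optimize_level,
                             const prefetch_tune_sketch &tune)
{
  assert(optimize_level >= 0);
  if (flag_value >= 0)
    return flag_value != 0;  // an explicit user choice always wins
  return tune.default_opt_level >= 0
         && optimize_level >= tune.default_opt_level;
}
```

With this split, num_slots can follow the hardware documentation without implicitly turning the pass on.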

Re: [PATCH 0/6] Improve -fprefetch-loop-arrays in general and for AArch64 in particular

2017-05-29 Thread Maxim Kuvyrkov

Hi Andrew,

Thanks for pinging this.  I've re-started the submission.

On 28 May 2017 at 08:01, Andrew Pinski  wrote:
> On Tue, Feb 28, 2017 at 1:53 AM, Maxim Kuvyrkov
>  wrote:
>>> On Feb 20, 2017, at 5:38 PM, Kyrill Tkachov  
>>> wrote:
>>>
>>> Hi Maxim,
>>>
>>> On 30/01/17 11:24, Maxim Kuvyrkov wrote:
>>>> This patch series improves -fprefetch-loop-arrays pass through small fixes
>>>> and tweaks, and then enables it for several AArch64 cores.
>>>>
>>>> My tunings were done on and for Qualcomm hardware, with results varying
>>>> between +0.5-1.9% for SPEC2006 INT and +0.25%-1.0% for SPEC2006 FP at -O3,
>>>> depending on hardware revision.
>>>>
>>>> This patch series enables restricted -fprefetch-loop-arrays at -O2, which
>>>> also improves SPEC2006 numbers
>>>>
>>>> Biggest progressions are on 419.mcf and 437.leslie3d, with no serious
>>>> regressions on other benchmarks.
>>>>
>>>> I'm now investigating making -fprefetch-loop-arrays more aggressive for
>>>> Qualcomm hardware, which improves performance on most benchmarks, but also
>>>> causes big regressions on 454.calculix and 462.libquantum.  If I can fix
>>>> these two regressions, prefetching will give another boost to AArch64.
>>>>
>>>> Andrew just posted similar prefetching tunings for Cavium's cores, and the
>>>> two patches have trivial conflicts.  I'll post mine as-is, since it
>>>> addresses one of the comments on Andrew's review (adding a stand-alone
>>>> struct for tuning parameters).
>>>>
>>>> Andrew, feel free to just copy-paste it to your patch, since it is just a
>>>> mechanical change.
>>>>
>>>> All patches were bootstrapped and regtested on x86_64-linux-gnu and
>>>> aarch64-linux-gnu.

>>>
>>> I've tried these patches out on Cortex-A72 and Cortex-A53, with the tuning 
>>> structs entries appropriately
>>> modified to enable the changes on those cores.
>>> I'm seeing the mcf and leslie3d improvements as well on Cortex-A72 and 
>>> Cortex-A53 and no noticeable regressions.
>>> I've also verified that the improvements are due to the prefetch 
>>> instructions rather than just the unrolling that
>>> the pass does.
>>> So I'm in favor of enabling this for the cores that benefit from it.
>>>
>>> Do you plan to get this in for GCC 8?
>>
>> Hi Kyrill,
>>
>> My hope was to push them in time for GCC 7, but it seems too late now.  I'll
>> return to these patches at the beginning of Stage 1.
>
> Ping on this patch set as I really want to get in the prefetching side
> for ThunderX 1 and 2.  Or should I resubmit my patch set?
>
> Thanks,
> Andrew
>
>>
>> --
>> Maxim Kuvyrkov
>> www.linaro.org
>>

-- 
Maxim Kuvyrkov
www.linaro.org

Re: [PATCH, rs6000] Fold vector absolutes in GIMPLE

2017-05-29 Thread Segher Boessenkool

On Mon, May 29, 2017 at 10:32:18AM +0200, Richard Biener wrote:
> On Fri, May 26, 2017 at 7:19 PM, Will Schmidt  
> wrote:
> > Add support for early expansion of vector absolute built-ins.
> >
> > Bootstraps currently running (p7,p8le,p8be).
> >
> > OK for trunk?
> 
> What's the documented behavior for vec_abs with respect to an argument
> of value INT_MIN?

The documentation says:

"For integer vectors, the arithmetic is modular."

http://openpowerfoundation.org/wp-content/uploads/resources/leabi-prd/
(appendix A; the PDF is easier to read).


Segher

Re: [PATCH v8] add -fpatchable-function-entry=N,M option

2017-05-29 Thread Maxim Kuvyrkov

Ping?

Richard E., this feature will immediately benefit AArch64 backend and
Linux kernel.  Do you plan to review it?

Thank you.

On Wed, May 3, 2017 at 5:48 PM, Torsten Duwe  wrote:
> On Wed, Mar 01, 2017 at 01:35:52PM +, Richard Earnshaw (lists) wrote:
>> >>
>> >> How about --fpatchable-function-entry=?
>> >
>> I haven't reviewed it yet.  I'm not really planning to spend any more
>> time on this until stage1 re-opens.
>
> So I guess this is about now? Here is version 8, which is functionally
> identical to v6 (v7 tried to guard the gen_nop call, which you wrote isn't
> necessary).
> The longer names required some reformatting.
>
> Torsten
>
> gcc/c-family/ChangeLog
> 2017-05-03  Torsten Duwe  
>
> * c-attribs.c (c_common_attribute_table): Add entry for
> "patchable_function_entry".
>
> gcc/lto/ChangeLog
> 2017-05-03  Torsten Duwe  
>
> * lto-lang.c (lto_attribute_table): Add entry for
> "patchable_function_entry".
>
> gcc/ChangeLog
> 2017-05-03  Torsten Duwe  
>
> * common.opt: Introduce -fpatchable-function-entry
> command line option, and its variables function_entry_patch_area_size
> and function_entry_patch_area_start.
> * opts.c (common_handle_option): Add -fpatchable_function_entry_ case,
> including a two-value parser.
> * target.def (print_patchable_function_entry): New target hook.
> * targhooks.h (default_print_patchable_function_entry): New function.
> * targhooks.c (default_print_patchable_function_entry): Likewise.
> * toplev.c (process_options): Switch off IPA-RA if
> patchable function entries are being generated.
> * varasm.c (assemble_start_function): Look at the
> patchable-function-entry command line switch and current
> function attributes and maybe generate NOP instructions by
> calling the print_patchable_function_entry hook.
> * doc/extend.texi: Document patchable_function_entry attribute.
> * doc/invoke.texi: Document -fpatchable_function_entry
> command line option.
> * doc/tm.texi.in (TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY):
> New target hook.
> * doc/tm.texi: Likewise.
>
> gcc/testsuite/ChangeLog
> 2017-05-03  Torsten Duwe  
>
> * c-c++-common/attribute-patchable_function_entry-1.c: New test.
>
> diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
> index f2a88e147ba..31137ce0433 100644
> --- a/gcc/c-family/c-attribs.c
> +++ b/gcc/c-family/c-attribs.c
> @@ -139,6 +139,8 @@ static tree handle_bnd_variable_size_attribute (tree *, 
> tree, tree, int, bool *)
>  static tree handle_bnd_legacy (tree *, tree, tree, int, bool *);
>  static tree handle_bnd_instrument (tree *, tree, tree, int, bool *);
>  static tree handle_fallthrough_attribute (tree *, tree, tree, int, bool *);
> +static tree handle_patchable_function_entry_attribute (tree *, tree, tree,
> +  int, bool *);
>
>  /* Table of machine-independent attributes common to all C-like languages.
>
> @@ -345,6 +347,9 @@ const struct attribute_spec c_common_attribute_table[] =
>   handle_bnd_instrument, false },
>{ "fallthrough",   0, 0, false, false, false,
>   handle_fallthrough_attribute, false },
> +  { "patchable_function_entry",1, 2, true, false, false,
> + handle_patchable_function_entry_attribute,
> + false },
>{ NULL, 0, 0, false, false, false, NULL, false }
>  };
>
> @@ -3173,3 +3178,10 @@ handle_fallthrough_attribute (tree *, tree name, tree, 
> int,
>*no_add_attrs = true;
>return NULL_TREE;
>  }
> +
> +static tree
> +handle_patchable_function_entry_attribute (tree *, tree, tree, int, bool *)
> +{
> +  /* Nothing to be done here.  */
> +  return NULL_TREE;
> +}
> diff --git a/gcc/common.opt b/gcc/common.opt
> index b7ece0c73e1..1b698ef4fc5 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -163,6 +163,13 @@ bool flag_stack_usage_info = false
>  Variable
>  int flag_debug_asm
>
> +; How many NOP insns to place at each function entry by default
> +Variable
> +HOST_WIDE_INT function_entry_patch_area_size
> +
> +; And how far the real asm entry point is into this area
> +Variable
> +HOST_WIDE_INT function_entry_patch_area_start
>
>  ; Balance between GNAT encodings and standard DWARF to emit.
>  Variable
> @@ -2022,6 +2029,10 @@ fprofile-reorder-functions
>  Common Report Var(flag_profile_reorder_functions)
>  Enable function reordering that improves code placement.
>
> +fpatchable-function-entry=
> +Common Joined Optimization
> +Insert NOP instructions at each function entry.
> +
>  frandom-seed
>  Common Var(common_deferred_options) Defer
>
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index 1255995eb78..d09ccd90c42 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/do
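
The ChangeLog above mentions a two-value parser for the option argument. A stand-alone sketch of such a parser is shown below, assuming the argument has the form "N[,M]" where N is the size of the patch area and M is how far the entry point sits inside it. The function name parse_patch_area_sketch is made up for this note, and rejecting M > N is an assumption; this is not the code from opts.c.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>

// Hypothetical parser for an "N[,M]" argument; M defaults to 0.
// Returns true and stores both values on success.
bool parse_patch_area_sketch(const char *arg, long *size, long *start)
{
  assert(arg && size && start);
  char *end = nullptr;

  errno = 0;
  long n = std::strtol(arg, &end, 10);
  if (end == arg || errno != 0 || n < 0)
    return false;

  long m = 0;
  if (*end == ',')
    {
      const char *second = end + 1;
      errno = 0;
      m = std::strtol(second, &end, 10);
      if (end == second || errno != 0 || m < 0)
        return false;
    }

  // Reject trailing junk and (by assumption) an entry point that lies
  // outside the patch area.
  if (*end != '\0' || m > n)
    return false;

  *size = n;
  *start = m;
  return true;
}
```

The two outputs correspond to function_entry_patch_area_size and function_entry_patch_area_start in the common.opt hunk above.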

Re: Optimisation of std::binary_search of the header

2017-05-29 Thread jay pokarna

Respected Sir,

Could you give me a contact for the standards committee that handles
changes to the C++ standard?

Regards,
Jay Pokarna

On Mon, May 29, 2017 at 2:17 PM, Tim Shen  wrote:
> On Mon, May 29, 2017 at 1:05 AM, jay pokarna  wrote:
>>>> The technique that I have used is square root decomposition . I think
>>>> that it will be better than the one that is implemented.
>>>
>>> And here's the problem: you *think* it will be better. Just thinking
>>> is not enough. You need to *prove* it with benchmarks that show that
>>> your technique is in fact faster than the current one.
>
> Agreed.
>
> Jay, specifically, your algorithm has roughly the same running time:
>
>   T(n) = log (sqrt(n) + 1) + log sqrt(n)
>  > 2 log (n ^ 0.5)
>  = 2 * 0.5 * log n
>  = log n
>
> It's unclear to me whether it's better than the normal binary search
> or not. Detailed and representative benchmarks may convince more
> people.
>
>
> --
> Regards,
> Tim Shen



-- 
Regards,
Jay Pokarna
CS Sophomore
Wordpress | Linkedin
Birla Institute of Technology and Science, Pilani
Pilani Campus
Rajasthan - 333031.
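
For readers following the running-time estimate quoted above, here is one way a two-level "square root decomposition" search might be written. The original proposal's code is not shown in this thread, so the sketch is an assumption: it binary-searches over roughly sqrt(n) block boundaries and then inside one block of roughly sqrt(n) elements, which matches the log(sqrt(n) + 1) + log sqrt(n) shape of the estimate. The function name is made up.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical sketch; not the code from the original proposal.
bool sqrt_decomposition_search(const std::vector<int> &sorted, int key)
{
  assert(std::is_sorted(sorted.begin(), sorted.end()));
  const std::size_t n = sorted.size();
  if (n == 0)
    return false;

  // Block length is about sqrt(n), and at least 1.
  const std::size_t root
    = static_cast<std::size_t>(std::sqrt(static_cast<double>(n)));
  const std::size_t block = std::max<std::size_t>(1, root);
  const std::size_t nblocks = (n + block - 1) / block;

  // Step 1: binary search over the last element of each block for the
  // first block whose last element is >= key.
  std::size_t lo = 0, hi = nblocks;
  while (lo < hi)
    {
      const std::size_t mid = lo + (hi - lo) / 2;
      const std::size_t last = std::min(n, (mid + 1) * block) - 1;
      if (sorted[last] < key)
        lo = mid + 1;
      else
        hi = mid;
    }
  if (lo == nblocks)
    return false;  // key is larger than every element

  // Step 2: ordinary binary search inside that one block.
  const auto first
    = sorted.begin() + static_cast<std::ptrdiff_t>(lo * block);
  const auto last
    = sorted.begin() + static_cast<std::ptrdiff_t>(std::min(n, (lo + 1) * block));
  return std::binary_search(first, last, key);
}
```

Both steps are logarithmic, so the total number of comparisons stays close to that of a single std::binary_search over the whole range; only benchmarks would show whether the different access pattern helps.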

[PATCH] gcc: xtensa: fix fprintf format specifiers

2017-05-29 Thread Max Filippov

HOST_WIDE_INT may not be long as assumed in print_operand and
xtensa_emit_call. Use HOST_WIDE_INT_PRINT_DEC/HOST_WIDE_INT_PRINT_HEX
format strings instead of %ld/0x%lx. This fixes incorrect assembly code
generation by the compiler running on armhf host.

2017-05-28  Max Filippov  
gcc/
* config/xtensa/xtensa.c (xtensa_emit_call): Use
HOST_WIDE_INT_PRINT_HEX instead of 0x%lx format string.
(print_operand): Use HOST_WIDE_INT_PRINT_DEC instead of %ld
format string.
---
 gcc/config/xtensa/xtensa.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/gcc/config/xtensa/xtensa.c b/gcc/config/xtensa/xtensa.c
index 015dd10..df80ad9 100644
--- a/gcc/config/xtensa/xtensa.c
+++ b/gcc/config/xtensa/xtensa.c
@@ -1780,7 +1780,8 @@ xtensa_emit_call (int callop, rtx *operands)
   rtx tgt = operands[callop];
 
   if (GET_CODE (tgt) == CONST_INT)
-sprintf (result, "call%d\t0x%lx", WINDOW_SIZE, INTVAL (tgt));
+sprintf (result, "call%d\t" HOST_WIDE_INT_PRINT_HEX,
+WINDOW_SIZE, INTVAL (tgt));
   else if (register_operand (tgt, VOIDmode))
 sprintf (result, "callx%d\t%%%d", WINDOW_SIZE, callop);
   else
@@ -2351,14 +2352,14 @@ print_operand (FILE *file, rtx x, int letter)
 
 case 'L':
   if (GET_CODE (x) == CONST_INT)
-   fprintf (file, "%ld", (32 - INTVAL (x)) & 0x1f);
+   fprintf (file, HOST_WIDE_INT_PRINT_DEC, (32 - INTVAL (x)) & 0x1f);
   else
output_operand_lossage ("invalid %%L value");
   break;
 
 case 'R':
   if (GET_CODE (x) == CONST_INT)
-   fprintf (file, "%ld", INTVAL (x) & 0x1f);
+   fprintf (file, HOST_WIDE_INT_PRINT_DEC, INTVAL (x) & 0x1f);
   else
output_operand_lossage ("invalid %%R value");
   break;
@@ -2372,7 +2373,7 @@ print_operand (FILE *file, rtx x, int letter)
 
 case 'd':
   if (GET_CODE (x) == CONST_INT)
-   fprintf (file, "%ld", INTVAL (x));
+   fprintf (file, HOST_WIDE_INT_PRINT_DEC, INTVAL (x));
   else
output_operand_lossage ("invalid %%d value");
   break;
@@ -2437,7 +2438,7 @@ print_operand (FILE *file, rtx x, int letter)
   else if (GET_CODE (x) == MEM)
output_address (GET_MODE (x), XEXP (x, 0));
   else if (GET_CODE (x) == CONST_INT)
-   fprintf (file, "%ld", INTVAL (x));
+   fprintf (file, HOST_WIDE_INT_PRINT_DEC, INTVAL (x));
   else
output_addr_const (file, x);
 }
-- 
2.1.4
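
The problem fixed by the patch above can be reproduced outside GCC with standard facilities. The sketch below uses std::int64_t and the PRId64 macro from <cinttypes> as stand-ins for HOST_WIDE_INT and HOST_WIDE_INT_PRINT_DEC; the names wide_int_sketch, WIDE_INT_SKETCH_PRINT_DEC and format_wide_sketch are made up for this note and do not exist in GCC.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdio>

// Hypothetical stand-ins for HOST_WIDE_INT and HOST_WIDE_INT_PRINT_DEC.
typedef std::int64_t wide_int_sketch;
#define WIDE_INT_SKETCH_PRINT_DEC "%" PRId64

// Formats VALUE into BUF and returns the snprintf result.  Passing a
// 64-bit value to "%ld" is undefined behaviour on hosts where long is
// 32 bits (such as armhf); the macro always expands to the conversion
// that matches the typedef.
int format_wide_sketch(char *buf, std::size_t size, wide_int_sketch value)
{
  assert(buf && size > 0);
  return std::snprintf(buf, size, WIDE_INT_SKETCH_PRINT_DEC, value);
}
```

The string-literal concatenation is the same trick the patch relies on in "call%d\t" HOST_WIDE_INT_PRINT_HEX.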

[PATCH] gcc: xtensa: fix unused parameter warning

2017-05-29 Thread Max Filippov

2017-05-28  Max Filippov  
gcc/
* config/xtensa/xtensa.c (xtensa_initial_elimination_offset):
Mark 'to' argument with ATTRIBUTE_UNUSED.
---
 gcc/config/xtensa/xtensa.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/xtensa/xtensa.c b/gcc/config/xtensa/xtensa.c
index df80ad9..16f8311 100644
--- a/gcc/config/xtensa/xtensa.c
+++ b/gcc/config/xtensa/xtensa.c
@@ -2678,7 +2678,7 @@ xtensa_frame_pointer_required (void)
 }
 
 HOST_WIDE_INT
-xtensa_initial_elimination_offset (int from, int to)
+xtensa_initial_elimination_offset (int from, int to ATTRIBUTE_UNUSED)
 {
   long frame_size = compute_frame_size (get_frame_size ());
   HOST_WIDE_INT offset;
-- 
2.1.4

Re: [RFC] [PATCH] Introduce configure flag --with-stage1-cflags.

2017-05-29 Thread Eric Botcazou

> After a discussion with Richi, adding "-O2" to STAGE1 cflags with a
> recent enough compiler can significantly speed up bootstrap. Thus I'm
> suggesting to introduce --with-stage1-cflags where one can provide such
> options.

-O1 is sufficient in my experience and far less risky than -O2 in this case.

-- 
Eric Botcazou

Re: [PATCH, rs6000] Fold vector absolutes in GIMPLE

2017-05-29 Thread Richard Biener

On May 29, 2017 12:24:44 PM GMT+02:00, Segher Boessenkool 
 wrote:
>On Mon, May 29, 2017 at 10:32:18AM +0200, Richard Biener wrote:
>> On Fri, May 26, 2017 at 7:19 PM, Will Schmidt
> wrote:
>> > Add support for early expansion of vector absolute built-ins.
>> >
>> > Bootstraps currently running (p7,p8le,p8be).
>> >
>> > OK for trunk?
>> 
>> What's the documented behavior for vec_abs with respect to an argument
>> of value INT_MIN?
>
>The documentation says:
>
>   "For integer vectors, the arithmetic is modular."

This means that folding as ABS_EXPR is not safe for !TYPE_OVERFLOW_WRAPS
integral vector types.

Richard.

>http://openpowerfoundation.org/wp-content/uploads/resources/leabi-prd/
>(appendix A; the PDF is easier to read).
>
>
>Segher
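
A scalar model of the point made above: with modular arithmetic, the absolute value of INT_MIN is INT_MIN again, whereas for a signed type whose overflow does not wrap the same operation has no defined result. The helper below models a single lane and is only an illustration; its name is made up and it is not code from the patch.

```cpp
#include <cassert>
#include <climits>

// Hypothetical scalar model of one lane of an integer vec_abs with
// modular arithmetic.  The negation is done on unsigned values so that
// it wraps instead of invoking signed-overflow undefined behaviour.
int modular_abs_sketch(int x)
{
  const unsigned int ux = static_cast<unsigned int>(x);
  const unsigned int r = x < 0 ? 0u - ux : ux;
  // Converting back to int wraps on the two's-complement hosts GCC
  // supports (and is required to since C++20).
  const int result = static_cast<int>(r);
  assert(x == INT_MIN || result >= 0);  // INT_MIN is the only odd case
  return result;
}
```

So a fold to ABS_EXPR has to preserve the wrapped result for the INT_MIN lane rather than treat it as undefined.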

[C++ PATCH] PR 80891#2

2017-05-29 Thread Nathan Sidwell

This patch fixes the second case reported in 80891.  We were asserting 
the wrong thing -- the node could well be a lookup, but it must be 
marked USED (otherwise there's no need to copy it).  I also noticed that 
in marking a lookup, we were not marking the underlying overloads as 
used -- which is part of the point so that future pushdecls know the 
overload is immutable.


I thought I had a testcase for that, so I'm not sure how it slipped 
through.  Will try and create a more robust testcase later.


nathan
--
Nathan Sidwell
 2017-05-29  Nathan Sidwell  
 
	PR c++/80891 (#2)
	* tree.c (ovl_copy): Adjust assert, copy OVL_LOOKUP.
	(ovl_used): New.
	(lookup_keep): Call it.
 
	PR c++/80891 (#2)
	* g++.dg/lookup/pr80891-2.C: New.

Index: cp/tree.c
===
--- cp/tree.c	(revision 248520)
+++ cp/tree.c	(working copy)
@@ -2139,12 +2139,13 @@ ovl_copy (tree ovl)
   else
 result = make_node (OVERLOAD);
 
-  gcc_assert (!OVL_NESTED_P (ovl) && !OVL_LOOKUP_P (ovl));
+  gcc_checking_assert (!OVL_NESTED_P (ovl) && OVL_USED_P (ovl));
   TREE_TYPE (result) = TREE_TYPE (ovl);
   OVL_FUNCTION (result) = OVL_FUNCTION (ovl);
   OVL_CHAIN (result) = OVL_CHAIN (ovl);
   OVL_HIDDEN_P (result) = OVL_HIDDEN_P (ovl);
   OVL_USING_P (result) = OVL_USING_P (ovl);
+  OVL_LOOKUP_P (result) = OVL_LOOKUP_P (ovl);
 
   return result;
 }
@@ -2395,6 +2396,22 @@ lookup_maybe_add (tree fns, tree lookup)
   return lookup_add (fns, lookup);
 }
 
+/* Regular overload OVL is part of a kept lookup.  Mark the nodes on
+   it as immutable.  */
+
+static void
+ovl_used (tree ovl)
+{
+  for (;
+   ovl && TREE_CODE (ovl) == OVERLOAD
+	 && !OVL_USED_P (ovl);
+   ovl = OVL_CHAIN (ovl))
+{
+  gcc_checking_assert (!OVL_LOOKUP_P (ovl));
+  OVL_USED_P (ovl) = true;
+}
+}
+
 /* If KEEP is true, preserve the contents of a lookup so that it is
available for a later instantiation.  Otherwise release the LOOKUP
nodes for reuse.  */
@@ -2407,12 +2424,18 @@ lookup_keep (tree lookup, bool keep)
 	 && OVL_LOOKUP_P (lookup) && !OVL_USED_P (lookup);
lookup = OVL_CHAIN (lookup))
 if (keep)
-  OVL_USED_P (lookup) = true;
+  {
+	OVL_USED_P (lookup) = true;
+	ovl_used (OVL_FUNCTION (lookup));
+  }
 else
   {
 	OVL_FUNCTION (lookup) = ovl_cache;
 	ovl_cache = lookup;
   }
+
+  if (keep)
+ovl_used (lookup);
 }
 
 /* Returns nonzero if X is an expression for a (possibly overloaded)
Index: testsuite/g++.dg/lookup/pr80891-2.C
===
--- testsuite/g++.dg/lookup/pr80891-2.C	(revision 0)
+++ testsuite/g++.dg/lookup/pr80891-2.C	(working copy)
@@ -0,0 +1,29 @@
+// PR c++/80891 part 1
+// instantiation-time ADL for swap needs to copy a previous lookup
+// node, but gets confused.
+
+void swap();
+
+namespace boost {
+  void swap();
+}
+
+using namespace boost;
+
+template 
+void reversible_container_test ()
+{
+  using namespace boost;
+  T a;
+  swap (a);
+}
+
+namespace boost {
+  struct A {};
+  template  void swap(T);
+}
+
+void test_ptr_vector()
+{
+  reversible_container_test;
+}

Re: [PATCH, rs6000] Fold vector absolutes in GIMPLE

2017-05-29 Thread Segher Boessenkool

On Mon, May 29, 2017 at 01:35:22PM +0200, Richard Biener wrote:
> >> What's the documented behavior for vec_abs with respect to an argument
> >> of value INT_MIN?
> >
> >The documentation says:
> >
> > "For integer vectors, the arithmetic is modular."
> 
> This means that folding as ABS_EXPR is not safe for !TYPE_OVERFLOW_WRAPS
> integral vector types.

Is it still fine if TYPE_OVERFLOW_UNDEFINED?  So essentially always
except with -ftrapv?


Segher

Re: [PING][PATCH, GCC/ARM] Only test tls-disable-literal-pool.c if target supports native TLS

2017-05-29 Thread Christophe Lyon

On 19 May 2017 at 14:29, Prakhar Bahuguna  wrote:
> On 11/05/2017 14:54:37, Prakhar Bahuguna wrote:
>> tls-disable-literal-pool.c should only be run if the toolchain and target
>> support native thread-local storage rather than emulated TLS. This patch also
>> improves the matching of the error message.
>>
>> testsuite/ChangeLog:
>>
>> 2017-05-11  Prakhar Bahuguna  
>>
>>   * gcc.target/arm/tls-disable-literal-pool.c: Change
>>   require-effective-target to tls_native.
>>   Move dg-error to return statement line and change to dg-message.
>>
>> Testing done: Regression testing for ARMv7-M with a TLS-enabled toolchain 
>> and a
>> TLS-disabled toolchain.
>>

Hi,
Can you share more details on the configuration you used?
In my testing, the only cortex-M config I have is arm-none-eabi
--with-cpu=cortex-m3.
Since arm-none-eabi means native-tls is disabled, this test is skipped.
A constraint for me is that m3 was the only cortex-m cpu supported by qemu the
last time I checked.

Thanks,

Christophe

>> Okay for stage1?
>>
>> --
>>
>> Prakhar Bahuguna
>
>> From 84837978d480a1abcebe7b4d2ac21af0eb6645b4 Mon Sep 17 00:00:00 2001
>> From: Prakhar Bahuguna 
>> Date: Thu, 11 May 2017 13:24:39 +0100
>> Subject: [PATCH] Only test tls-disable-literal-pool.c if target supports
>>  native TLS
>>
>> This test should only be run if the toolchain and target support native
>> thread-local storage rather than emulated TLS.
>> ---
>>  gcc/testsuite/gcc.target/arm/tls-disable-literal-pool.c | 5 ++---
>>  1 file changed, 2 insertions(+), 3 deletions(-)
>>
>> diff --git a/gcc/testsuite/gcc.target/arm/tls-disable-literal-pool.c 
>> b/gcc/testsuite/gcc.target/arm/tls-disable-literal-pool.c
>> index fe14a6b132c..283201fdd97 100644
>> --- a/gcc/testsuite/gcc.target/arm/tls-disable-literal-pool.c
>> +++ b/gcc/testsuite/gcc.target/arm/tls-disable-literal-pool.c
>> @@ -1,5 +1,5 @@
>>  /* { dg-do compile } */
>> -/* { dg-require-effective-target tls } */
>> +/* { dg-require-effective-target tls_native } */
>>  /* { dg-require-effective-target arm_cortex_m } */
>>  /* { dg-require-effective-target arm_thumb2_ok } */
>>  /* { dg-options "-mslow-flash-data" } */
>> @@ -9,7 +9,6 @@ __thread int x = 0;
>>  int
>>  bar ()
>>  {
>> -  return x;
>> +  return x; /* { dg-message "sorry, unimplemented: accessing thread-local 
>> storage is not currently supported with -mpure-code or -mslow-flash-data" } 
>> */
>>  }
>>
>> -/* { dg-error "accessing thread-local storage is not currently supported 
>> with -mpure-code or -mslow-flash-data" "" { target *-*-* } 12 } */
>> --
>> 2.11.0
>>
>
> Ping.
>
> --
>
> Prakhar Bahuguna

[C++ PATCH] PR c++/80891 #3

2017-05-29 Thread Nathan Sidwell

This patch fixes the 3rd testcase in 80891.  Here we were squirrelling
away overload lookup results in template definitions, but failing to 
mark them as used.  tsubst rightly asserted.


Fixed by adding calls to lookup_mark in the relevant expr build functions.

build_nt_call_vec is in the common core, but only ever called from the 
c++ FE.  This patch replaces those calls with a C++-specific builder, 
that does the marking.  I left the common one in gcc/tree.c alone, but 
perhaps it should be removed as unneeded?


nathan
--
Nathan Sidwell
2017-05-29  Nathan Sidwell  

	PR c++/80891 (#3)
	* cp-tree.h (build_min_nt_call_vec): Declare.
	* decl2.c (build_offset_ref_call_from_tree): Call it.
	* parser.c (cp_parser_postfix_expression): Likewise.
	* pt.c (tsubst_copy_and_build): Likewise.
	* semantics.c (finish_call_expr): Likewise.
	* tree.c (build_min_nt_loc): Keep unresolved lookups.
	(build_min): Likewise.
	(build_min_non_dep): Likewise.
	(build_min_non_dep_call_vec): Likewise.
	(build_min_nt_call_vec): New.

	PR c++/80891 (#3)
	* g++.dg/lookup/pr80891-3.C: New.

Index: cp/cp-tree.h
===
--- cp/cp-tree.h	(revision 248569)
+++ cp/cp-tree.h	(working copy)
@@ -6891,6 +6891,7 @@ extern tree build_min_nt_loc			(location
 		 ...);
 extern tree build_min_non_dep			(enum tree_code, tree, ...);
 extern tree build_min_non_dep_op_overload	(enum tree_code, tree, tree, ...);
+extern tree build_min_nt_call_vec (tree, vec *);
 extern tree build_min_non_dep_call_vec		(tree, tree, vec *);
 extern vec* vec_copy_and_insert(vec*, tree, unsigned);
 extern tree build_cplus_new			(tree, tree, tsubst_flags_t);
Index: cp/decl2.c
===
--- cp/decl2.c	(revision 248569)
+++ cp/decl2.c	(working copy)
@@ -4891,7 +4891,7 @@ build_offset_ref_call_from_tree (tree fn
 		  || TREE_CODE (fn) == MEMBER_REF);
   if (type_dependent_expression_p (fn)
 	  || any_type_dependent_arguments_p (*args))
-	return build_nt_call_vec (fn, *args);
+	return build_min_nt_call_vec (fn, *args);
 
   orig_args = make_tree_vector_copy (*args);
 
Index: cp/parser.c
===
--- cp/parser.c	(revision 248569)
+++ cp/parser.c	(working copy)
@@ -6952,7 +6952,7 @@ cp_parser_postfix_expression (cp_parser
 		  {
 		maybe_generic_this_capture (instance, fn);
 		postfix_expression
-		  = build_nt_call_vec (postfix_expression, args);
+		  = build_min_nt_call_vec (postfix_expression, args);
 		release_tree_vector (args);
 		break;
 		  }
Index: cp/pt.c
===
--- cp/pt.c	(revision 248569)
+++ cp/pt.c	(working copy)
@@ -17389,7 +17389,7 @@ tsubst_copy_and_build (tree t,
 			&& TREE_CODE (fn) != FIELD_DECL)
 		|| type_dependent_expression_p (fn)
 		|| any_type_dependent_arguments_p (call_args)))
-	  ret = build_nt_call_vec (function, call_args);
+	  ret = build_min_nt_call_vec (function, call_args);
 	else if (!BASELINK_P (fn))
 	  ret = finish_call_expr (function, &call_args,
    /*disallow_virtual=*/false,
Index: cp/semantics.c
===
--- cp/semantics.c	(revision 248569)
+++ cp/semantics.c	(working copy)
@@ -2322,7 +2322,7 @@ finish_call_expr (tree fn, vec *args)
+{
+  tree ret, t;
+  unsigned int ix;
+
+  ret = build_vl_exp (CALL_EXPR, vec_safe_length (args) + 3);
+  CALL_EXPR_FN (ret) = fn;
+  CALL_EXPR_STATIC_CHAIN (ret) = NULL_TREE;
+  FOR_EACH_VEC_SAFE_ELT (args, ix, t)
+{
+  CALL_EXPR_ARG (ret, ix) = t;
+  if (TREE_CODE (t) == OVERLOAD)
+	lookup_keep (t, true);
+}
+  return ret;
+}
+
+/* Similar to `build_min_nt_call_vec', but for template definitions of
non-dependent expressions. NON_DEP is the non-dependent expression
that has been built.  */
 
 tree
 build_min_non_dep_call_vec (tree non_dep, tree fn, vec *argvec)
 {
-  tree t = build_nt_call_vec (fn, argvec);
+  tree t = build_min_nt_call_vec (fn, argvec);
   if (REFERENCE_REF_P (non_dep))
 non_dep = TREE_OPERAND (non_dep, 0);
   TREE_TYPE (t) = TREE_TYPE (non_dep);
Index: testsuite/g++.dg/lookup/pr80891-3.C
===
--- testsuite/g++.dg/lookup/pr80891-3.C	(revision 0)
+++ testsuite/g++.dg/lookup/pr80891-3.C	(working copy)
@@ -0,0 +1,26 @@
+// PR c++/80891 part 3
+// We were failing to mark OVERLOADS held in template definitions as
+// immutable in non-call contexts.
+
+namespace std {
+  int endl();
+}
+
+using std::endl;
+
+template  void test_spots(RealType)
+{
+  using namespace std;
+  RealType a;
+  a << endl;
+}
+
+template 
+void operator<< (T, int (&)());
+
+struct Q {};
+void test_maintest_method()
+{
+  Q q;
+  test_spots(q);
+}

Re: [PATCH 2/2] DWARF: make it possible to emit debug info for declarations only

2017-05-29 Thread Pierre-Marie de Rodat


On 05/29/2017 09:50 AM, Pierre-Marie de Rodat wrote:

Bootstrapped and reg-tested on x86_64-linux.  Ok to commit? Thank you in
advance!


I just realized that I forgot to add the testcase: here it is, sorry!

--
Pierre-Marie de Rodat
--  { dg-options "-cargs -g -dA -margs" }
--  { dg-final { scan-assembler "local_imported_func" } }
--  { dg-final { scan-assembler "local_imported_var" } }
--  { dg-final { scan-assembler "global_imported_func" } }
--  { dg-final { scan-assembler "global_imported_var" } }
--  { dg-final { scan-assembler-not "foreign_imported_func" } }
--  { dg-final { scan-assembler-not "foreign_imported_var" } }

with Debug11_Pkg2;

package body Debug11_Pkg is

   procedure Dummy is
  Local_Imported_Var : Integer;
  pragma Import (C, Local_Imported_Var, "imported_var");

  function Local_Imported_Func return Integer;
  pragma Import (C, Local_Imported_Func, "imported_func");
   begin
  Local_Imported_Var := Local_Imported_Func;
  Global_Imported_Var := Global_Imported_Func;
  Debug11_Pkg2.Foreign_Imported_Var :=
 Debug11_Pkg2.Foreign_Imported_Func;
   end Dummy;

end Debug11_Pkg;
package Debug11_Pkg is

   Global_Imported_Var : Integer;
   pragma Import (C, Global_Imported_Var, "imported_var");

   function Global_Imported_Func return Integer;
   pragma Import (C, Global_Imported_Func, "imported_func");

   procedure Dummy;

end Debug11_Pkg;
package Debug11_Pkg2 is

   Foreign_Imported_Var : Integer;
   pragma Import (C, Foreign_Imported_Var, "imported_var");

   function Foreign_Imported_Func return Integer;
   pragma Import (C, Foreign_Imported_Func, "imported_func");

end Debug11_Pkg2;

Re: MAINTAINERS update

2017-05-29 Thread Bernd Schmidt


On 05/27/2017 12:52 PM, Bernd Schmidt wrote:

I am no longer working for Red Hat, so I've updated my email address.
Also, I don't expect to be around very much in the near future, so I've
removed myself as maintainer for some areas.


Judging by a reply I got, I may have been too terse. No need to worry, 
I'm just choosing to do something else for a while and reading gcc 
mailing lists will not be a priority in the near term.



Bernd

Re: [PATCH] add more detail to -Wconversion and -Woverflow (PR 80731)

2017-05-29 Thread Christophe Lyon

On 25 May 2017 at 00:16, Martin Sebor  wrote:
> On 05/24/2017 11:08 AM, Joseph Myers wrote:
>>
>> On Wed, 17 May 2017, Martin Sebor wrote:
>>
>>> @@ -1036,31 +1079,76 @@ warnings_for_convert_and_check (location_t loc,
>>> tree type, tree expr,
>>>   /* This detects cases like converting -129 or 256 to
>>>  unsigned char.  */
>>>   if (!int_fits_type_p (expr, c_common_signed_type (type)))
>>> -   warning_at (loc, OPT_Woverflow,
>>> -   "large integer implicitly truncated to unsigned
>>> type");
>>> +   {
>>> + if (cst)
>>> +   warning_at (loc, OPT_Woverflow,
>>> +   (TYPE_UNSIGNED (exprtype)
>>> +? "conversion from %qT to %qT "
>>> +"changes value from %qE to %qE"
>>> +: "unsigned conversion from %qT to %qT "
>>> +"changes value from %qE to %qE"),
>>> +   exprtype, type, expr, result);
>>> + else
>>> +   warning_at (loc, OPT_Woverflow,
>>> +   (TYPE_UNSIGNED (exprtype)
>>> +? "conversion from %qT to %qT "
>>> +"changes the value of %qE"
>>> +: "unsigned conversion from %qT to %qT "
>>> +"changes the value of %qE"),
>>> +   exprtype, type, expr);
>>> +   }
>>
>>
>> You need to use G_() around both arguments to ?:, otherwise only one will
>> get extracted for translation.
>>
>>> diff --git a/gcc/testsuite/c-c++-common/pr68657-1.c
>>> b/gcc/testsuite/c-c++-common/pr68657-1.c
>>> index 84f3e54..33fdf86 100644
>>> --- a/gcc/testsuite/c-c++-common/pr68657-1.c
>>> +++ b/gcc/testsuite/c-c++-common/pr68657-1.c
>>> @@ -5,14 +5,14 @@
>>>  void
>>>  f1 (void)
>>>  {
>>> -  unsigned int a = -5; /* { dg-error "negative integer implicitly
>>> converted to unsigned type" } */
>>> +  unsigned int a = -5; /* { dg-error "unsigned conversion from .int. to
>>> .unsigned int. changes value from .-5. to .4294967291." } */
>>
>>
>> The more specific match would fail for targets with 16-bit int.  You need
>> to keep it less specific in this test (if you want to test the more
>> specific text as well, another test could be added for that, restricted to
>> the int32 effective-target).
>>
>> (The changes to Wconversion-real-integer-3.C and
>> Wconversion-real-integer2.C are OK in that those tests are restricted to
>> int32plus, although in theory 64-bit int would be an issue there.)
>>
>>> +  /* According to 6.3.1.3 of C11:
>>> + -3-  Otherwise, the new type is signed and the value cannot be
>>> +  represented in it; either the result is implementation-defined
>>> + or an implementation-defined signal is raised.
>>> +
>>> + In GCC such conversios wrap and diagnosed by mentioning "overflow"
>>> + if the absolut value of the operand is in excess of the maximum of
>>> + the destination of type, and "conversion" otherwise, as follows:
>>> */
>>
>>
>> s/conversios/conversions/; s/absolut/absolute/
>>
>> OK with those changes.
>
>
> Thanks for the careful review!  Done and committed in r248431.
>

Hi,

I have noticed failures on arm*:
  Executed from: gcc.dg/fixed-point/fixed-point.exp
gcc.dg/fixed-point/int-warning.c  (test for warnings, line 12)
gcc.dg/fixed-point/int-warning.c  (test for warnings, line 13)
gcc.dg/fixed-point/int-warning.c  (test for warnings, line 14)
gcc.dg/fixed-point/int-warning.c  (test for warnings, line 15)
gcc.dg/fixed-point/int-warning.c  (test for warnings, line 16)
gcc.dg/fixed-point/int-warning.c  (test for warnings, line 17)
gcc.dg/fixed-point/int-warning.c  (test for warnings, line 18)
gcc.dg/fixed-point/int-warning.c  (test for warnings, line 19)
gcc.dg/fixed-point/int-warning.c  (test for warnings, line 20)
gcc.dg/fixed-point/int-warning.c  (test for warnings, line 21)
gcc.dg/fixed-point/int-warning.c  (test for warnings, line 22)
gcc.dg/fixed-point/int-warning.c  (test for warnings, line 23)
gcc.dg/fixed-point/int-warning.c (test for excess errors)

Excess errors:
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/testsuite/gcc.dg/fixed-point/int-warning.c:12:8:
warning: overflow in conversion from '_Accum' to 'signed char' chages
value from '5.0e+2' to '-12' [-Woverflow]
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/testsuite/gcc.dg/fixed-point/int-warning.c:13:8:
warning: overflow in conversion from '_Accum' to 'signed char' chages
value from '-5.0e+2' to '12' [-Woverflow]
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/testsuite/gcc.dg/fixed-point/int-warning.c:14:8:
warning: overflow in conversion from 'long _Accum' to 'signed char'
chages value from '5.0e+2' to '-12' [-Woverflow]
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/testsuite/gcc.dg/fixed-point/int-warning.c:15:8:
warning: overflow in conversion from 'long _

Re: [patch, libgfortran] PR53029 missed optimization in internal read (without implied-do-loop)

2017-05-29 Thread Manfred Schwarb

Am 29.05.2017 um 01:16 schrieb Jerry DeLisle:
> Hi all,
> 
> The problem here is that we never set the err return to LIBERROR_END in all 
> cases. For the example case we are detecting the EOF condition inside the 
> read_integer procedure and it gets acted on correctly at higher levels in the 
> code. Consequently in the big loop over the array where we call 
> list_formatted_read_scalar, we never returned an error code so we never 
> exited the loop early.
> 
> The patch tests for the EOF first locally as before, but then returns the 
> error flags set in dtp->common.flags which are set globally in the individual 
> read procedures when hit_eof is called.
> 
> Regression tested on x86_64. I have added a test case which will check the 
> execution time of the loop. The previous results of the READ were correct, 
> just took a long time on large arrays.
> 
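
The early-exit behaviour described above can be sketched outside libgfortran. This is only a toy model under stated assumptions: `Reader`, `read_scalar`, `read_array`, `READ_OK` and `READ_END` are hypothetical names, standing in loosely for dtp->common.flags, list_formatted_read_scalar, the array loop, and LIBERROR_END. It shows only that the array loop stops once the scalar read reports end of file.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical status codes; READ_END stands in for LIBERROR_END.
enum ReadStatus { READ_OK = 0, READ_END = -1 };

// Hypothetical reader state; hit_eof plays the role of a globally set flag.
struct Reader {
  std::vector<int> input;  // values available before end of file
  std::size_t pos = 0;     // next value to hand out
  std::size_t calls = 0;   // number of scalar reads attempted
  bool hit_eof = false;    // set once the input is exhausted
};

// Read one value; report end of file through the return value as well
// as through the flag, so the caller can act on it.
ReadStatus read_scalar(Reader &r, int &out) {
  ++r.calls;
  if (r.pos >= r.input.size()) {
    r.hit_eof = true;
    return READ_END;
  }
  out = r.input[r.pos++];
  return READ_OK;
}

// Fill dest element by element, leaving the loop on the first error
// instead of visiting every remaining element.
std::size_t read_array(Reader &r, std::vector<int> &dest) {
  std::size_t filled = 0;
  for (int &slot : dest) {
    if (read_scalar(r, slot) != READ_OK)
      break;
    ++filled;
  }
  return filled;
}

// Usage: three values of input, a 1000-element destination.
void self_check() {
  Reader r;
  r.input = {10, 20, 30};
  std::vector<int> dest(1000, 0);
  std::size_t filled = read_array(r, dest);
  assert(filled == 3);
  assert(r.calls == 4);  // three successful reads plus the one that hit EOF
  assert(r.hit_eof);
}
```

In this model the number of scalar reads depends on the amount of input, not on the size of the destination array.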

Seems to work as advertised.
With this small patch, I see a tremendous speedup for array reads.

The implied-do variant gets slightly slower (~10%), but the
array variant now takes 0.002s independent of the size of "m",
compared to some dozens of seconds without this patch!

Concerning your test case:
Your timeout of 2s is dangerously close to the timings of really fast
boxes without this patch, so I would lower this value.
I guess even on really slow ARM boxes or the like, this test case finishes
in a few tenths of a second at worst.

Or, as the new behavior seems to be independent of the m setting,
just bump the constant m by a factor 10 or 100. So you are sure no big iron
can pass this test without your patch being applied.

Thanks a bunch!
Manfred


> OK for trunk?
> 
> Regards,
> 
> Jerry
> 
> 2017-05-28  Jerry DeLisle  
> 
> PR libgfortran/35339
> * list_read.c (list_formatted_read_scalar): Set the err return
> value to the common.flags error values.

[C++ PATCH] PR 80891 #1

2017-05-29 Thread Nathan Sidwell

This patch fixes the first testcase in 80891.  That was a case of a 
using declaration and a using directive bringing in the same function 
during lookup.  The template specialization machinery wasn't prepared 
for that, and reasonably thought neither instance was more specialized.


This patch teaches most_specialized_template to ignore duplicates.

However, I think my decision to permit lookup to return duplicates was 
wrong.  It should try harder not to -- as (a) it appears to be more 
common than I'd guessed and (b) the cost of returning duplicates is 
high, as both go through the instantiation machinery when there are 
templates involved.


Now that using declarations are ordered wrt regular functions, we can do 
better than before -- and even if we couldn't we can do no worse.


I've committed this patch, to unbreak things, but I'm going to leave the 
defect open and fixup lookup shortly, along with turning this patch into 
an appropriate assert.
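
The duplicate-skipping scan can be sketched outside the compiler. This is a toy model, not the GCC code: `Candidate`, its `id` and `rank` fields, `more_specialized` and `most_specialized` are hypothetical stand-ins for the TREE_LIST entries, TREE_VALUE identity, more_specialized_inst and most_specialized_instantiation. It assumes two entries are duplicates exactly when their `id` matches.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical candidate: id identifies the function, rank models how
// specialized it is (a higher rank is more specialized).
struct Candidate {
  int id;
  int rank;
};

// Returns 1 if a is more specialized than b, -1 if less, 0 if equal.
int more_specialized(const Candidate &a, const Candidate &b) {
  if (a.rank > b.rank) return 1;
  if (a.rank < b.rank) return -1;
  return 0;
}

// Returns the index of the most specialized candidate, or -1 if none is.
int most_specialized(const std::vector<Candidate> &c) {
  const std::size_t n = c.size();
  if (n == 0) return -1;

  std::size_t champ = 0;
  for (std::size_t fn = 1; fn < n; ++fn) {
    if (c[champ].id == c[fn].id)
      continue;  // same function found via two paths: ignore it
    int fate = more_specialized(c[champ], c[fn]);
    if (fate == -1) {
      champ = fn;
    } else if (fate == 0) {
      // Equally specialized: move on to the next candidate.  If there
      // is none, nothing is most specialized.
      ++fn;
      if (fn >= n) return -1;
      champ = fn;
    }
  }

  // Verify the champion beats every earlier, distinct candidate.
  for (std::size_t fn = 0; fn < champ; ++fn)
    if (c[champ].id != c[fn].id && more_specialized(c[champ], c[fn]) != 1)
      return -1;
  return static_cast<int>(champ);
}

// Usage: the same function listed twice still yields a champion.
void self_check() {
  std::vector<Candidate> dup = {{1, 1}, {1, 1}};
  assert(most_specialized(dup) == 0);
}
```

Without the `id` comparison, the two identical entries in `self_check` would compare as equally specialized and the scan would report no champion, which is the confusion the testcase triggers.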


nathan
--
Nathan Sidwell
2017-05-29  Nathan Sidwell  

	PR c++/80891 (#1)
	* pt.c (most_specialized_instantiation): Cope with duplicate
	instantiations.

	PR c++/80891 (#1)
	* g++.dg/lookup/pr80891-1.C: New.

Index: cp/pt.c
===
--- cp/pt.c	(revision 248571)
+++ cp/pt.c	(working copy)
@@ -21728,31 +21728,32 @@ most_specialized_instantiation (tree tem
 
   champ = templates;
   for (fn = TREE_CHAIN (templates); fn; fn = TREE_CHAIN (fn))
-{
-  int fate = more_specialized_inst (TREE_VALUE (champ), TREE_VALUE (fn));
-  if (fate == -1)
-	champ = fn;
-  else if (!fate)
-	{
-	  /* Equally specialized, move to next function.  If there
-	 is no next function, nothing's most specialized.  */
-	  fn = TREE_CHAIN (fn);
+if (TREE_VALUE (champ) != TREE_VALUE (fn))
+  {
+	int fate = more_specialized_inst (TREE_VALUE (champ), TREE_VALUE (fn));
+	if (fate == -1)
 	  champ = fn;
-	  if (!fn)
-	break;
-	}
-}
+	else if (!fate)
+	  {
+	/* Equally specialized, move to next function.  If there
+	   is no next function, nothing's most specialized.  */
+	fn = TREE_CHAIN (fn);
+	champ = fn;
+	if (!fn)
+	  break;
+	  }
+  }
 
   if (champ)
 /* Now verify that champ is better than everything earlier in the
instantiation list.  */
-for (fn = templates; fn != champ; fn = TREE_CHAIN (fn)) {
-  if (more_specialized_inst (TREE_VALUE (champ), TREE_VALUE (fn)) != 1)
-  {
-champ = NULL_TREE;
-break;
-  }
-}
+for (fn = templates; fn != champ; fn = TREE_CHAIN (fn))
+  if (TREE_VALUE (champ) != TREE_VALUE (fn)
+	  && more_specialized_inst (TREE_VALUE (champ), TREE_VALUE (fn)) != 1)
+	{
+	  champ = NULL_TREE;
+	  break;
+	}
 
   processing_template_decl--;
 
Index: testsuite/g++.dg/lookup/pr80891-1.C
===
--- testsuite/g++.dg/lookup/pr80891-1.C	(revision 0)
+++ testsuite/g++.dg/lookup/pr80891-1.C	(working copy)
@@ -0,0 +1,19 @@
+// PR c++/80891 part 1
+// std::endl is found via two paths and most_specialized_instantiation
+// gets confused.
+
+namespace std {
+  struct A {
+void operator<<(A(A));
+  };
+  template  _CharT endl(_Traits);
+  A a;
+}
+
+using std::endl;
+
+void chi_squared_sample_sized()
+{
+  using namespace std;
+  a << endl;
+}

[PATCH v7 4/4] Add gdb for or1k build

2017-05-29 Thread Stafford Horne

* ChangeLog:

2017-02-12  Stafford Horne  

* configure.ac: Remove logic adding gdb to noconfigsdirs for or1k.
* configure: Regenerate.

Cc: gcc-patches@gcc.gnu.org
---
 configure| 7 ---
 configure.ac | 7 ---
 2 files changed, 14 deletions(-)

diff --git a/configure b/configure
index be9dd89..0bf47fa 100755
--- a/configure
+++ b/configure
@@ -3632,10 +3632,6 @@ case "${target}" in
 ;;
   *-*-rtems*)
 noconfigdirs="$noconfigdirs target-libgloss"
-# this is not caught below because this stanza matches earlier
-case $target in
-  or1k*-*-*) noconfigdirs="$noconfigdirs gdb" ;;
-esac
 ;;
 # The tpf target doesn't support gdb yet.
   *-*-tpf*)
@@ -3841,9 +3837,6 @@ case "${target}" in
   nvptx*-*-*)
 noconfigdirs="$noconfigdirs target-libssp target-libstdc++-v3 
target-libobjc"
 ;;
-  or1k*-*-*)
-noconfigdirs="$noconfigdirs gdb"
-;;
   sh-*-*)
 case "${target}" in
   sh*-*-elf)
diff --git a/configure.ac b/configure.ac
index 532c5c2..9d16792 100644
--- a/configure.ac
+++ b/configure.ac
@@ -966,10 +966,6 @@ case "${target}" in
 ;;
   *-*-rtems*)
 noconfigdirs="$noconfigdirs target-libgloss"
-# this is not caught below because this stanza matches earlier
-case $target in
-  or1k*-*-*) noconfigdirs="$noconfigdirs gdb" ;;
-esac
 ;;
 # The tpf target doesn't support gdb yet.
   *-*-tpf*)
@@ -1175,9 +1171,6 @@ case "${target}" in
   nvptx*-*-*)
 noconfigdirs="$noconfigdirs target-libssp target-libstdc++-v3 
target-libobjc"
 ;;
-  or1k*-*-*)
-noconfigdirs="$noconfigdirs gdb"
-;;
   sh-*-*)
 case "${target}" in
   sh*-*-elf)
-- 
2.9.4

Re: [PATCH] Introduce 4-stages profiledbootstrap to get a better profile.

2017-05-29 Thread Jan Hubicka

> On 05/25/2017 01:22 PM, Markus Trippelsdorf wrote:
> > On 2017.05.25 at 11:55 +0200, Martin Liška wrote:
> >> Hi.
> >>
> >> As I spoke about the PGO with Honza and Richi, current 3-stage is not 
> >> ideal for following
> >> 2 reasons:
> >>
> >> 1) stageprofile compiler is trained just on libraries that are built during 
> >> stage2
> >> 2) apart from that, as the compiler is also used to build the final 
> >> compiler, profile
> >> is being updated during the build. So the stage2 compiler is making 
> >> different decisions.
> >>
> >> Both problems can be resolved by adding another step in between current 
> >> stage2 and stage3
> >> where we train stage2 compiler by building compiler with default options.
> >>
> >> I'm going to do some measurements.
> > 
> > I did some measurements on gcc67 (trunk with --enable-checking=release).
> > The apparent speedup is in the noise.
> 
> Hello.
> 
> Thanks for measurements:
> 
> I can see difference for GCC 7.1:
> 
> g++-7 tramp3d-v4.ii -O2 && time for i in `seq 1 10` ; do g++-7 tramp3d-v4.ii 
> -O2 ; done
> 
> before: 2m25.133s
> after: real   2m25.133s
> 
> which is 99.09124426480228%. It's probably within a noise level.
> 
> And apparently the file size of the binary is bigger:
> 
> before (using bloaty):
> 
>  VM SIZE FILE SIZE
>  --   --
>   59.0%  15.1Mi .text  15.1Mi  62.3%
>   21.3%  5.45Mi .rodata5.45Mi  22.5%
>6.6%  1.69Mi .eh_frame  1.69Mi   6.9%
>5.4%  1.38Mi .bss0   0.0%
>3.3%   874Ki .dynstr 874Ki   3.5%
>1.8%   480Ki .dynsym 480Ki   1.9%
>1.1%   285Ki .eh_frame_hdr   285Ki   1.1%
>0.6%   158Ki .gnu.hash   158Ki   0.6%
>0.5%   144Ki .hash   144Ki   0.6%
>0.2%  44.4Ki .data  44.4Ki   0.2%
>0.2%  40.0Ki .gnu.version   40.0Ki   0.2%
>0.0%  11.1Ki .rela.plt  11.1Ki   0.0%
>0.0%  7.44Ki .plt   7.44Ki   0.0%
>0.0%  4.56Ki .data.rel.ro   4.56Ki   0.0%
>0.0%  3.73Ki .got.plt   3.73Ki   0.0%
>0.0%  38 [Unmapped] 2.75Ki   0.0%
>0.0% 624 [ELF Headers]  2.55Ki   0.0%
>0.0% 848 [Other]1.13Ki   0.0%
>0.0% 917 .gcc_except_table 917   0.0%
>0.0% 608 .dynamic  608   0.0%
>0.0%  16 [None]  0   0.0%
>  100.0%  25.7Mi TOTAL  24.3Mi 100.0%
> 
> after:
> 
>  VM SIZE FILE SIZE
>  --   --
>   58.3%  14.6Mi .text  14.6Mi  54.2%
>   21.6%  5.41Mi .rodata5.41Mi  20.1%
>0.0%   0 .strtab2.13Mi   7.9%
>6.7%  1.67Mi .eh_frame  1.67Mi   6.2%
>5.5%  1.38Mi .bss0   0.0%
>0.0%   0 .symtab1.11Mi   4.1%
>3.4%   876Ki .dynstr 876Ki   3.2%
>1.9%   480Ki .dynsym 480Ki   1.7%
>1.1%   280Ki .eh_frame_hdr   280Ki   1.0%
>0.6%   158Ki .gnu.hash   158Ki   0.6%
>0.6%   144Ki .hash   144Ki   0.5%
>0.2%  44.4Ki .data  44.4Ki   0.2%
>0.2%  40.1Ki .gnu.version   40.1Ki   0.1%
>0.0%  11.1Ki .rela.plt  11.1Ki   0.0%
>0.0%  7.44Ki .plt   7.44Ki   0.0%
>0.0%  4.56Ki .data.rel.ro   4.56Ki   0.0%
>0.0%  3.73Ki .got.plt   3.73Ki   0.0%
>0.0%  58 [Unmapped] 3.11Ki   0.0%
>0.0% 624 [ELF Headers]  2.61Ki   0.0%
>0.0%  2.32Ki [Other]2.60Ki   0.0%
>0.0%  16 [None]  0   0.0%
>  100.0%  25.1Mi TOTAL  26.9Mi 100.0%
> 
> As I discussed with Honza, we still have a problem in GCC: with the current
> working sets, get_hot_bb_threshold () is very close to the number of runs,
> which is effectively 1 for a single run. That's a mistake and should be fixed.

Yep, with LTO+PGO bootstrap I think we also hit the problem that the PGO
inliner was never seriously tuned (we basically use the very first badness
metric I introduced and we never experimented with parameters). The reason is
that hot/cold partitioning, even when it is very coarse, does work reasonably
well for the per-file compilation model.  With LTO we are facing very many
inline decisions and there is probably a lot of low-hanging fruit.

GCC is currently in transition to new profile counter code.  I will push out
the initial patch retiring gcov_type soon (once I finish updating it to the
current tree - it is very annoying) and that will let us track hotness more
conservatively and fix the old problem that a count becomes unrealistically
low through broken profile updates and thus becomes cold.  This should make it
possible to increase the threshold and start re-tuning (hopefully this or next
week).

Honza
> 
> Martin

[C++ PATCH] namespace stat hack representation

2017-05-29 Thread Nathan Sidwell

Currently bindings have two slots, a 'value' slot for the regular 
binding, and a 'type' slot for the struct name binding, which is only 
used when the value slot is holding something else.  For instance:


struct foo {...} foo;

The value slot will be a VAR_DECL, and the type slot an artificial 
TYPE_DECL.


The type slot is very rarely non-null, because such code use is terribly 
confusing.  But as the name suggests, it's needed because of the C 
library's definition:


  struct stat {...};
  int stat (const char *, struct stat *);

This patch changes the representation for namespace bindings, so we only 
use one slot, and if the stat hack is needed, it contains an OVERLOAD 
that is marked with LOOKUP_P (such overloads cannot otherwise appear in 
a binding).  In that case the TYPE holds the TYPE_DECL and the FUNCTION 
holds the value binding.


This patch doesn't change the use of cxx_binding, so the underlying 
accessor find_namespace_slot simply returns the address of the value 
field.  (The next patch will remove cxx_binding for namespaces.)
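
The language rule behind the stat hack can be checked with a small standalone example. This is a sketch under assumptions: `probe` is a hypothetical identifier used in place of `stat` so that no system header is needed. It shows only that a type and a function may share one name in the same scope, with the plain name finding the function and the elaborated form `struct probe` finding the type.

```cpp
#include <cassert>

// Hypothetical type, playing the role of 'struct stat'.
struct probe {
  int size;
};

// A function with the same name in the same scope.  From here on the
// plain name 'probe' refers to the function; the type is still
// reachable as 'struct probe'.
int probe(const char *path, struct probe *out) {
  out->size = path ? 1 : 0;
  return 0;
}

// Usage: both bindings are visible at once.
void self_check() {
  struct probe p = {0};            // elaborated name finds the type
  assert(probe("file", &p) == 0);  // plain name finds the function
  assert(p.size == 1);
}
```

A binding for such a name has to carry both entities at once, which is the case the OVERLOAD-based representation described above is meant to cover.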


nathan

--
Nathan Sidwell
2017-05-29  Nathan Sidwell  

	Stat hack representation
	* name-lookup.c (STAT_HACK_P, STAT_TYPE, STAT_DECL,
	MAYBE_STAT_DECL, MAYBE_STAT_TYPE): New.
	(stat_hack): New.
	(find_namespace_binding): Replace with ...
	(find_namespace_slot): ... this.
	(find_namespace_value): New.
	(name_lookup::search_namespace_only,
	name_lookup::adl_namespace_only): Adjust.
	(update_binding): Add SLOT parameter, adjust.
	(check_local_shadow): Use find_namespace_value.
	(set_local_extern_decl_linkage): Likewise.
	(do_pushdecl): Adjust for namespace slot.
	(push_local_binding): Assert not a namespace binding.
	(check_for_out_of_scope_variable): Use find_namespace_value.
	(set_identifier_type_value_with_scope): Likewise.
	(get_namespace_binding): Likewise.
	(set_namespace_binding): Delete.
	(set_global_binding): Directly update the binding.
	(finish_namespace_using_decl): Likewise.
	(lookup_type_scope_1): Use find_namespace_slot and update.
	(do_push_nested_namespace): Use find_namespace_value.

Index: name-lookup.c
===
--- name-lookup.c	(revision 248573)
+++ name-lookup.c	(working copy)
@@ -38,6 +38,27 @@ static cp_binding_level *innermost_noncl
 static void set_identifier_type_value_with_scope (tree id, tree decl,
 		  cp_binding_level *b);
 
+/* Create an overload suitable for recording an artificial TYPE_DECL
+   and another decl.  We use this machanism to implement the struct
+   stat hack within a namespace.  It'd be nice to use it everywhere.  */
+
+#define STAT_HACK_P(N) ((N) && TREE_CODE (N) == OVERLOAD && OVL_LOOKUP_P (N))
+#define STAT_TYPE(N) TREE_TYPE (N)
+#define STAT_DECL(N) OVL_FUNCTION (N)
+#define MAYBE_STAT_DECL(N) (STAT_HACK_P (N) ? STAT_DECL (N) : N)
+#define MAYBE_STAT_TYPE(N) (STAT_HACK_P (N) ? STAT_TYPE (N) : NULL_TREE)
+
+static tree stat_hack (tree decl = NULL_TREE, tree type = NULL_TREE)
+{
+  tree result = make_node (OVERLOAD);
+
+  /* Mark this as a lookup, so we can tell this is a stat hack.  */
+  OVL_LOOKUP_P (result) = true;
+  STAT_DECL (result) = decl;
+  STAT_TYPE (result) = type;
+  return result;
+}
+
 /* Create a local binding level for NAME.  */
 
 static cxx_binding *
@@ -58,15 +79,15 @@ create_local_binding (cp_binding_level *
 /* Find the binding for NAME in namespace NS.  If CREATE_P is true,
make an empty binding if there wasn't one.  */
 
-static cxx_binding *
-find_namespace_binding (tree ns, tree name, bool create_p = false)
+static tree *
+find_namespace_slot (tree ns, tree name, bool create_p = false)
 {
   cp_binding_level *level = NAMESPACE_LEVEL (ns);
   cxx_binding *binding = IDENTIFIER_NAMESPACE_BINDINGS (name);
 
   for (;binding; binding = binding->previous)
 if (binding->scope == level)
-  return binding;
+  return &binding->value;
 
   if (create_p)
 {
@@ -76,9 +97,18 @@ find_namespace_binding (tree ns, tree na
   binding->is_local = false;
   binding->value_is_inherited = false;
   IDENTIFIER_NAMESPACE_BINDINGS (name) = binding;
+  return &binding->value;
 }
 
-  return binding;
+  return NULL;
+}
+
+static tree
+find_namespace_value (tree ns, tree name)
+{
+  tree *b = find_namespace_slot (ns, name);
+
+  return b ? MAYBE_STAT_DECL (*b) : NULL_TREE;
 }
 
 /* Add DECL to the list of things declared in B.  */
@@ -480,8 +510,9 @@ name_lookup::search_namespace_only (tree
 {
   bool found = false;
 
-  if (cxx_binding *binding = find_namespace_binding (scope, name))
-found |= process_binding (binding->value, binding->type);
+  if (tree *binding = find_namespace_slot (scope, name))
+found |= process_binding (MAYBE_STAT_DECL (*binding),
+			  MAYBE_STAT_TYPE (*binding));
 
   return found;
 }
@@ -688,8 +719,8 @@ name_lookup::adl_namespace_only (tree sc
 for (unsigned ix = inlinees->length (); ix--;)
   adl_namespace_only ((*inlinees)[ix]);
 
-  if (cxx_binding *binding = find_nam

Re: [C++ PATCH] namespace stat hack representation

2017-05-29 Thread Marek Polacek

On Mon, May 29, 2017 at 11:11:12AM -0400, Nathan Sidwell wrote:
> Currently bindings have two slots, a 'value' slot for the regular binding,
> and a 'type' slot for the struct name binding, which is only used when the
> value slot is holding something else.  For instance:
> 
> struct foo {...} foo;
> 
> The value slot will be a VAR_DECL, and the type slot an artificial
> TYPE_DECL.
> 
> The type slot is very rarely non-null, because such code use is terribly
> confusing.  But as the name suggests, it's needed because of the C library's
> definition:
> 
>   struct stat {...};
>   int stat (const char *, struct stat *);
> 
> This patch changes the representation for namespace bindings, so we only use
> one slot, and if the stat hack is needed, it contains an OVERLOAD that is
> marked with LOOKUP_P (such overloads cannot otherwise appear in a binding).
> In that case the TYPE holds the TYPE_DECL and the FUNCTION holds the value
> binding.
> 
> This patch doesn't change the use of cxx_binding, so the underlying accessor
> find_namespace_slot simply returns the address of the value field.  (The
> next patch will remove cxx_binding for namespaces.)
> 
> nathan
> 
> -- 
> Nathan Sidwell

> 2017-05-29  Nathan Sidwell  
> 
>   Stat hack representation
>   * name-lookup.c (STAT_HACK_P, STAT_TYPE, STAT_DECL,
>   MAYBE_STAT_DECL, MAYBE_STAT_TYPE): New.
>   (stat_hack): New.
>   (find_namespace_binding): Replace with ...
>   (find_namespace_slot): ... this.
>   (find_namespace_value): New.
>   (name_lookup::search_namespace_only,
>   name_lookup::adl_namespace_only): Adjust.
>   (update_binding): Add SLOT parameter, adjust.
>   (check_local_shadow): Use find_namespace_value.
>   (set_local_extern_decl_linkage): Likewise.
>   (do_pushdecl): Adjust for namespace slot.
>   (push_local_binding): Assert not a namespace binding.
>   (check_for_out_of_scope_variable): Use find_namespace_value.
>   (set_identifier_type_value_with_scope): Likewise.
>   (get_namespace_binding): Likewise.
>   (set_namespace_binding): Delete.
>   (set_global_binding): Directly update the binding.
>   (finish_namespace_using_decl): Likewise.
>   (lookup_type_scope_1): Use find_namespace_slot and update.
>   (do_push_nested_namespace): Use find_namespace_value.
> 
> Index: name-lookup.c
> ===
> --- name-lookup.c (revision 248573)
> +++ name-lookup.c (working copy)
> @@ -38,6 +38,27 @@ static cp_binding_level *innermost_noncl
>  static void set_identifier_type_value_with_scope (tree id, tree decl,
> cp_binding_level *b);
>  
> +/* Create an overload suitable for recording an artificial TYPE_DECL
> +   and another decl.  We use this machanism to implement the struct
> +   stat hack within a namespace.  It'd be nice to use it everywhere.  */
> +
> +#define STAT_HACK_P(N) ((N) && TREE_CODE (N) == OVERLOAD && OVL_LOOKUP_P (N))
> +#define STAT_TYPE(N) TREE_TYPE (N)
> +#define STAT_DECL(N) OVL_FUNCTION (N)
> +#define MAYBE_STAT_DECL(N) (STAT_HACK_P (N) ? STAT_DECL (N) : N)
> +#define MAYBE_STAT_TYPE(N) (STAT_HACK_P (N) ? STAT_TYPE (N) : NULL_TREE)
> +
> +static tree stat_hack (tree decl = NULL_TREE, tree type = NULL_TREE)

Should be
static tree
stat_hack (tree decl = NULL_TREE, tree type = NULL_TREE)
to make it easier to grep for the function with ^stat_hack.

(Didn't check the rest of the patch.)

Marek

Re: [C++ PATCH] namespace stat hack representation

2017-05-29 Thread Nathan Sidwell


On 05/29/2017 11:17 AM, Marek Polacek wrote:

On Mon, May 29, 2017 at 11:11:12AM -0400, Nathan Sidwell wrote:



+static tree stat_hack (tree decl = NULL_TREE, tree type = NULL_TREE)


Should be
static tree
stat_hack (tree decl = NULL_TREE, tree type = NULL_TREE)
to make it easier to grep for the function with ^stat_hack.


Thanks, will fix

nathan

--
Nathan Sidwell

Re: [Patch, fortran] PR35339 Optimize implied do loops in io statements

2017-05-29 Thread Dominique d'Humières

Hi Nicolas,

Updating gfortran with your patch fails with

../../work/gcc/fortran/frontend-passes.c: In function 'bool 
traverse_io_block(gfc_code*, bool*, gfc_code*)':
../../work/gcc/fortran/frontend-passes.c:1067:20: error: expected 
unqualified-id before '(' token
 #define swap(x, y) (x) ^= (y), (y) ^= (x), (x) ^= (y);
^
../../work/gcc/fortran/frontend-passes.c:1180:15: note: in expansion of macro 
'swap'
  std::swap(start->value.op.op1, start->value.op.op2);
   ^~~~
../../work/gcc/fortran/frontend-passes.c:1067:36: error: invalid operands of 
types 'gfc_expr*' and 'gfc_expr*' to binary 'operator^'
 #define swap(x, y) (x) ^= (y), (y) ^= (x), (x) ^= (y);
^~
../../work/gcc/fortran/frontend-passes.c:1180:15: note: in expansion of macro 
'swap'
  std::swap(start->value.op.op1, start->value.op.op2);
   ^~~~
../../work/gcc/fortran/frontend-passes.c:1067:41: error:   in evaluation of 
'operator^=(struct gfc_expr*, struct gfc_expr*)'
 #define swap(x, y) (x) ^= (y), (y) ^= (x), (x) ^= (y);
 ^
../../work/gcc/fortran/frontend-passes.c:1180:15: note: in expansion of macro 
'swap'
  std::swap(start->value.op.op1, start->value.op.op2);
   ^~~~
../../work/gcc/fortran/frontend-passes.c:1067:48: error: invalid operands of 
types 'gfc_expr*' and 'gfc_expr*' to binary 'operator^'
 #define swap(x, y) (x) ^= (y), (y) ^= (x), (x) ^= (y);
^~
../../work/gcc/fortran/frontend-passes.c:1180:15: note: in expansion of macro 
'swap'
  std::swap(start->value.op.op1, start->value.op.op2);
   ^~~~
../../work/gcc/fortran/frontend-passes.c:1067:53: error:   in evaluation of 
'operator^=(struct gfc_expr*, struct gfc_expr*)'
 #define swap(x, y) (x) ^= (y), (y) ^= (x), (x) ^= (y);
 ^
../../work/gcc/fortran/frontend-passes.c:1180:15: note: in expansion of macro 
'swap'
  std::swap(start->value.op.op1, start->value.op.op2);
   ^~~~

TIA

Dominique
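
For readers following the log above: the failure comes from a function-like macro named swap being expanded inside the call std::swap (...). A minimal sketch of the working form, with hypothetical names (Expr, swap_operands) that are not taken from the gfortran sources:

```cpp
#include <utility>

// Hypothetical stand-in for gfc_expr; only pointer identity matters here.
struct Expr { int id; };

// A function-like macro such as
//   #define swap(x, y) (x) ^= (y), (y) ^= (x), (x) ^= (y);
// is expanded wherever the token `swap` is followed by `(`, including in
// `std::swap (a, b)`, which then reads `std::(a) ^= (b), ...`.  That gives
// the "expected unqualified-id before '(' token" error, and operator^ is
// not defined for pointer operands in any case.  Without the macro,
// std::swap handles any movable type, pointers included.
inline void swap_operands (Expr *&op1, Expr *&op2)
{
  std::swap (op1, op2);
}
```

Removing the macro, as the follow-up patch does, is enough; no replacement for the XOR trick is needed.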

Re: [Patch, fortran] PR35339 Optimize implied do loops in io statements

2017-05-29 Thread Nicolas Koenig


Hello Dominique,

mea culpa, there was a bit of confusion with the file being open in emacs
and vi at the same time. Attached is the new patch with the #define removed.

Nicolas


On 05/29/2017 05:32 PM, Dominique d'Humières wrote:

Hi Nicolas,

Updating gfortran with your patch fails with

../../work/gcc/fortran/frontend-passes.c: In function 'bool 
traverse_io_block(gfc_code*, bool*, gfc_code*)':
../../work/gcc/fortran/frontend-passes.c:1067:20: error: expected 
unqualified-id before '(' token
  #define swap(x, y) (x) ^= (y), (y) ^= (x), (x) ^= (y);
 ^
../../work/gcc/fortran/frontend-passes.c:1180:15: note: in expansion of macro 
'swap'
   std::swap(start->value.op.op1, start->value.op.op2);
^~~~
../../work/gcc/fortran/frontend-passes.c:1067:36: error: invalid operands of 
types 'gfc_expr*' and 'gfc_expr*' to binary 'operator^'
  #define swap(x, y) (x) ^= (y), (y) ^= (x), (x) ^= (y);
 ^~
../../work/gcc/fortran/frontend-passes.c:1180:15: note: in expansion of macro 
'swap'
   std::swap(start->value.op.op1, start->value.op.op2);
^~~~
../../work/gcc/fortran/frontend-passes.c:1067:41: error:   in evaluation of 
'operator^=(struct gfc_expr*, struct gfc_expr*)'
  #define swap(x, y) (x) ^= (y), (y) ^= (x), (x) ^= (y);
  ^
../../work/gcc/fortran/frontend-passes.c:1180:15: note: in expansion of macro 
'swap'
   std::swap(start->value.op.op1, start->value.op.op2);
^~~~
../../work/gcc/fortran/frontend-passes.c:1067:48: error: invalid operands of 
types 'gfc_expr*' and 'gfc_expr*' to binary 'operator^'
  #define swap(x, y) (x) ^= (y), (y) ^= (x), (x) ^= (y);
 ^~
../../work/gcc/fortran/frontend-passes.c:1180:15: note: in expansion of macro 
'swap'
   std::swap(start->value.op.op1, start->value.op.op2);
^~~~
../../work/gcc/fortran/frontend-passes.c:1067:53: error:   in evaluation of 
'operator^=(struct gfc_expr*, struct gfc_expr*)'
  #define swap(x, y) (x) ^= (y), (y) ^= (x), (x) ^= (y);
  ^
../../work/gcc/fortran/frontend-passes.c:1180:15: note: in expansion of macro 
'swap'
   std::swap(start->value.op.op1, start->value.op.op2);
^~~~

TIA

Dominique



Index: frontend-passes.c
===
--- frontend-passes.c	(revision 248539)
+++ frontend-passes.c	(working copy)
@@ -1060,6 +1060,256 @@ convert_elseif (gfc_code **c, int *walk_subtrees A
   return 0;
 }
 
+struct do_stack
+{
+  struct do_stack *prev;
+  gfc_iterator *iter;
+  gfc_code *code;
+} *stack_top;
+
+/* Recursively traverse the block of a WRITE or READ statement and, if it can
+   be optimized, do so.  It optimizes it by replacing do loops with their analog
+   array slices. For example:
+   
+ write (*,*) (a(i), i=1,4)
+ 
+   is replaced with
+ 
+ write (*,*) a(1:4:1) .  */
+
+static bool 
+traverse_io_block(gfc_code *code, bool *has_reached, gfc_code *prev)
+{
+  gfc_code *curr; 
+  gfc_expr *new_e, *expr, *start;
+  gfc_ref *ref;
+  struct do_stack ds_push;
+  int i, future_rank = 0;
+  gfc_iterator *iters[GFC_MAX_DIMENSIONS];
+
+  /* Find the first transfer/do statement.  */
+  for (curr = code; curr; curr = curr->next)
+{
+  if (curr->op == EXEC_DO || curr->op == EXEC_TRANSFER)
+break;
+}
+
+  /* Ensure it is the only transfer/do statement because cases like
+   
+   write (*,*) (a(i), b(i), i=1,4)
+
+ cannot be optimized.  */
+
+  if (!curr || curr->next)
+return false;
+
+  if (curr->op == EXEC_DO)
+{
+  if (curr->ext.iterator->var->ref)
+return false;
+  ds_push.prev = stack_top;
+  ds_push.iter = curr->ext.iterator;
+  ds_push.code = curr;
+  stack_top = &ds_push;
+  if (traverse_io_block(curr->block->next, has_reached, prev))
+{
+	  if (curr != stack_top->code && !*has_reached)
+	{
+  curr->block->next = NULL;
+  gfc_free_statements(curr);
+	}
+	  else
+	*has_reached = true;
+	  return true;
+}
+  return false;
+}
+
+  gcc_assert(curr->op == EXEC_TRANSFER);
+
+  if (curr->expr1->symtree->n.sym->attr.allocatable)
+return false;
+
+  ref = curr->expr1->ref;
+  if (!ref || ref->type != REF_ARRAY || ref->u.ar.codimen != 0)
+return false;
+
+  /* Find the iterators belonging to each variable and check conditions.  */
+  for (i = 0; i < ref->u.ar.dimen; i++)
+{
+  if (!ref->u.ar.start[i] || ref->u.ar.start[i]->ref
+  || ref->u.ar.dimen_type[i] != DIMEN_ELEMENT)
+return false;
+  
+  start = ref->u.ar.start[i];
+  gfc_simplify_expr(start, 0);
+  switch (start->expr_type)
+{
+	case EXPR_VARIABLE:
+
+	  /* write (*,*) (a(i), i=a%b,1) not handled yet.  */
+	  if (start->ref)
+	return fa

Re: [PATCH] gcc: xtensa: fix unused parameter warning

2017-05-29 Thread augustine.sterl...@gmail.com

On Mon, May 29, 2017 at 4:11 AM, Max Filippov  wrote:
> 2017-05-28  Max Filippov  
> gcc/
> * config/xtensa/xtensa.c (xtensa_initial_elimination_offset):
> Mark 'to' argument with ATTRIBUTE_UNUSED.

This is ok.

Re: [PATCH] gcc: xtensa: fix fprintf format specifiers

2017-05-29 Thread augustine.sterl...@gmail.com

On Mon, May 29, 2017 at 4:11 AM, Max Filippov  wrote:
> HOST_WIDE_INT may not be long as assumed in print_operand and
> xtensa_emit_call. Use HOST_WIDE_INT_PRINT_DEC/HOST_WIDE_INT_PRINT_HEX
> format strings instead of %ld/0x%lx. This fixes incorrect assembly code
> generation by the compiler running on armhf host.

This is ok.
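
The approved change swaps %ld/0x%lx for format macros that follow the real width of HOST_WIDE_INT. The same idea can be sketched outside GCC with the standard <cinttypes> macros; wide_int_t and format_wide are hypothetical names used only for this illustration, not GCC identifiers:

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical stand-in for a host-dependent wide integer type.
typedef std::int64_t wide_int_t;

// Format VALUE in decimal and in hex.  PRId64/PRIx64 expand to the right
// length modifier for 64-bit integers on every host, whereas "%ld" is
// only correct where long happens to be 64 bits wide.
inline std::string format_wide (wide_int_t value)
{
  char buf[64];
  std::snprintf (buf, sizeof buf, "%" PRId64 " 0x%" PRIx64,
                 value, static_cast<std::uint64_t> (value));
  return buf;
}
```

On a host where long is 32 bits, passing a 64-bit value to %ld is undefined behavior and typically prints garbage, which is consistent with the wrong assembly reported for the armhf host.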

Re: [PATCH] Dump function on internal errors

2017-05-29 Thread Alexander Monakov

Hi,

On Wed, 24 May 2017, Richard Biener wrote:
> current_pass might be NULL so you better do set_internal_error_hook when
> we start executing passes (I detest global singletons to do such stuff 
> anyway).

I think there are other problems in this patch: dump_function_to_file won't work
after transition to RTL (it only handles GIMPLE).  It's much better to just use
existing dump routine in passes.c and use the existing diagnostic callbacks.

Here's an alternative, imho a bit more streamlined patch.  Here's how the new
notice is placed:

ice.c: In function ‘fn1.part.0’:
ice.c:24:1: error: size of loop 1 should be 1, not 2
 }
 ^
function dumped to file ice.c.227r.expand
ice.c:24:1: internal compiler error: in verify_loop_structure, at cfgloop.c:1644
0x92aeb8 verify_loop_structure()
/home/am/pr80640/gcc/gcc/cfgloop.c:1644


Bootstrapped on x86_64, OK for trunk?

Alexander

* passes.c (emergency_dump_function): New.
* tree-pass.h (emergency_dump_function): Declare.
* plugin.c (plugins_internal_error_function): Remove.
* plugin.h (plugins_internal_error_function): Remove declaration.
* toplev.c (internal_error_function): New static function.  Use it...
(general_init): ...here.
---
 gcc/passes.c| 14 ++
 gcc/plugin.c| 10 --
 gcc/plugin.h|  2 --
 gcc/toplev.c| 14 +-
 gcc/tree-pass.h |  1 +
 5 files changed, 28 insertions(+), 13 deletions(-)

diff --git a/gcc/passes.c b/gcc/passes.c
index 98e05e4..e8e0322 100644
--- a/gcc/passes.c
+++ b/gcc/passes.c
@@ -60,6 +60,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-ssa-live.h"  /* For remove_unused_locals.  */
 #include "tree-cfgcleanup.h"
 #include "insn-addr.h" /* for INSN_ADDRESSES_ALLOC.  */
+#include "diagnostic-core.h" /* for fnotice */
 
 using namespace gcc;
 
@@ -1779,6 +1780,19 @@ execute_function_dump (function *fn, void *data)
 }
 }
 
+/* This helper function is invoked from diagnostic routines prior to aborting
+   due to internal compiler error.  If a dump file is set up, dump the
+   current function.  */
+
+void
+emergency_dump_function ()
+{
+  if (!dump_file || !current_pass || !cfun)
+return;
+  fnotice (stderr, "function dumped to file %s\n", dump_file_name);
+  execute_function_dump (cfun, current_pass);
+}
+
 static struct profile_record *profile_record;
 
 /* Do profile consistency book-keeping for the pass with static number INDEX.
diff --git a/gcc/plugin.c b/gcc/plugin.c
index c6d3cdd..9892748 100644
--- a/gcc/plugin.c
+++ b/gcc/plugin.c
@@ -858,16 +858,6 @@ warn_if_plugins (void)
 
 }
 
-/* Likewise, as a callback from the diagnostics code.  */
-
-void
-plugins_internal_error_function (diagnostic_context *context ATTRIBUTE_UNUSED,
-const char *msgid ATTRIBUTE_UNUSED,
-va_list *ap ATTRIBUTE_UNUSED)
-{
-  warn_if_plugins ();
-}
-
 /* The default version check. Compares every field in VERSION. */
 
 bool
diff --git a/gcc/plugin.h b/gcc/plugin.h
index 68a673b..b96445d 100644
--- a/gcc/plugin.h
+++ b/gcc/plugin.h
@@ -167,8 +167,6 @@ extern bool plugins_active_p (void);
 extern void dump_active_plugins (FILE *);
 extern void debug_active_plugins (void);
 extern void warn_if_plugins (void);
-extern void plugins_internal_error_function (diagnostic_context *,
-const char *, va_list *);
 extern void print_plugins_versions (FILE *file, const char *indent);
 extern void print_plugins_help (FILE *file, const char *indent);
 extern void finalize_plugins (void);
diff --git a/gcc/toplev.c b/gcc/toplev.c
index 425315c..23b884a 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -79,6 +79,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "omp-offload.h"
 #include "hsa-common.h"
 #include "edit-context.h"
+#include "tree-pass.h"
 
 #if defined(DBX_DEBUGGING_INFO) || defined(XCOFF_DEBUGGING_INFO)
 #include "dbxout.h"
@@ -1063,6 +1064,17 @@ open_auxiliary_file (const char *ext)
   return file;
 }
 
+/* Auxiliary callback for the diagnostics code.  */
+
+static void
+internal_error_function (diagnostic_context *context ATTRIBUTE_UNUSED,
+const char *msgid ATTRIBUTE_UNUSED,
+va_list *ap ATTRIBUTE_UNUSED)
+{
+  warn_if_plugins ();
+  emergency_dump_function ();
+}
+
 /* Initialization of the front end environment, before command line
options are parsed.  Signal handlers, internationalization etc.
ARGV0 is main's argv[0].  */
@@ -1101,7 +1113,7 @@ general_init (const char *argv0, bool init_signals)
 = global_options_init.x_flag_diagnostics_show_option;
   global_dc->show_column
 = global_options_init.x_flag_show_column;
-  global_dc->internal_error = plugins_internal_error_function;
+  global_dc->internal_error = internal_error_function;
   global_dc->option_enabled = option_enabled;
   global_dc->option_state = &global_options;
   global_dc->option_name

Re: [PATCH] Dump function on internal errors

2017-05-29 Thread Jakub Jelinek

On Mon, May 29, 2017 at 07:15:33PM +0300, Alexander Monakov wrote:
> @@ -1063,6 +1064,17 @@ open_auxiliary_file (const char *ext)
>return file;
>  }
>  
> +/* Auxiliary callback for the diagnostics code.  */
> +
> +static void
> +internal_error_function (diagnostic_context *context ATTRIBUTE_UNUSED,
> +  const char *msgid ATTRIBUTE_UNUSED,
> +  va_list *ap ATTRIBUTE_UNUSED)
> +{
> +  warn_if_plugins ();
> +  emergency_dump_function ();

What if there is another ICE during the dumping?  Won't we then
end in endless recursion?  Perhaps global_dc->internal_error should
be cleared here first?
Also, as none of the arguments are used and we are in C++,
perhaps it should be
static void
internal_error_function (diagnostic_context *, const char *, va_list *)
{
?

Jakub

Re: [Patch, fortran] PR35339 Optimize implied do loops in io statements

2017-05-29 Thread Dominique d'Humières


> On 29 May 2017, at 17:49, Nicolas Koenig wrote:
> 
> Hello Dominique,
> 
> mea culpa, there was a bit of confusion with the file being open in emacs
> and vi at the same time. Attached is the new patch with the #define removed.
> 
> Nicolas
> 

Thanks for the quick fix!

Testing in progress

Dominique

Re: [PATCH] Dump function on internal errors

2017-05-29 Thread Alexander Monakov

On Mon, 29 May 2017, Jakub Jelinek wrote:
> What if there is another ICE during the dumping?  Won't we then
> end in endless recursion?  Perhaps global_dc->internal_error should
> be cleared here first?

Hm, no, as far as I can see existing diagnostic machinery is supposed to fully
handle that.  It detects recursion; see e.g. diagnostic.c: error_recursion.

> Also, as none of the arguments are used and we are in C++,
> perhaps it should be
> static void
> internal_error_function (diagnostic_context *, const char *, va_list *)
> {
> ?

Ah, it seems GCC tends to use either the long-winded form I've copy-pasted in my
patch, or the slightly shorter variant with 'type_name ARG_UNUSED (arg_name)'.
The shorthand form you're pointing out seems to be used only once, in
vmsdbgout.c, as far as I can tell.  I'll be happy to change my patch as desired
by the reviewer, of course :)

Alexander
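
The recursion detection mentioned above can be pictured as a lock counter around the reporting routine. This is a generic sketch with hypothetical names (report, failing_hook, report_lock); it is not the code in diagnostic.c, only an illustration of why a second ICE raised from inside the hook does not recurse without bound:

```cpp
#include <string>
#include <vector>

static std::vector<std::string> log_lines;  // Collected output for the demo.
static int report_lock = 0;                 // Nesting depth of report ().

static void report (const std::string &msg, void (*hook) ());

// Hypothetical hook that itself fails and tries to report again.
static void failing_hook ()
{
  report ("error inside hook", failing_hook);
}

// Report MSG, then run HOOK.  A nested call is detected through the lock
// counter and cut short instead of re-entering the hook.
static void report (const std::string &msg, void (*hook) ())
{
  if (report_lock > 0)
    {
      log_lines.push_back ("recursion detected: " + msg);
      return;
    }
  ++report_lock;
  log_lines.push_back (msg);
  if (hook)
    hook ();
  --report_lock;
}
```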

Re: [PATCH] Dump function on internal errors

2017-05-29 Thread Andi Kleen

On Mon, May 29, 2017 at 07:15:33PM +0300, Alexander Monakov wrote:
> Hi,
> 
> On Wed, 24 May 2017, Richard Biener wrote:
> > current_pass might be NULL so you better do set_internal_error_hook when
> > we start executing passes (I detest global singletons to do such stuff 
> > anyway).
> 
> I think there are other problems in this patch, dump_function_to_file won't 
> work
> after transition to RTL (it only handles GIMPLE).  It's much better to just 
> use
> existing dump routine in passes.c and use the existing diagnostic callbacks.

Your patch looks good to me.

Thanks,
-Andi

Re: [PATCH] Dump function on internal errors

2017-05-29 Thread Alexander Monakov

On Mon, 29 May 2017, Alexander Monakov wrote:
> On Mon, 29 May 2017, Jakub Jelinek wrote:
> > Also, as none of the arguments are used and we are in C++,
> > perhaps it should be
> > static void
> > internal_error_function (diagnostic_context *, const char *, va_list *)
> > {
> > ?
> 
> Ah, it seems GCC tends to use either the long-winded form I've copy-pasted in 
> my
> patch, or the slightly shorter variant with 'type_name ARG_UNUSED (arg_name)'.
> The shorthand form you're pointing out seems to be used only once, in
> vmsdbgout.c, as far as I can tell.

Scratch this, I totally botched my grep invocation.  There's plenty of instances
where the shorthand form is used, and I'll be happy to use it here as well.

Alexander

Re: [patch, libgfortran] PR53029 missed optimization in internal read (without implied-do-loop)

2017-05-29 Thread Thomas Koenig


Hi Jerry,

Regression tested on x86_64. I have added a test case which will check 
the execution time of the loop. The previous results of the READ were 
correct; they just took a long time on large arrays.


OK for trunk?


OK.

It might be good if you followed Manfred's suggestion and turned
down the timeout to something like 0.5 seconds.

Thanks for the patch!

I would also consider backporting, the speedup is just so
large.  What do others think?

Regards

Thomas

Re: [PATCH] Dump function on internal errors

2017-05-29 Thread Jakub Jelinek

On Mon, May 29, 2017 at 07:46:22PM +0300, Alexander Monakov wrote:
> On Mon, 29 May 2017, Alexander Monakov wrote:
> > On Mon, 29 May 2017, Jakub Jelinek wrote:
> > > Also, as none of the arguments are used and we are in C++,
> > > perhaps it should be
> > > static void
> > > internal_error_function (diagnostic_context *, const char *, va_list *)
> > > {
> > > ?
> > 
> > Ah, it seems GCC tends to use either the long-winded form I've copy-pasted 
> > in my
> > patch, or the slightly shorter variant with 'type_name ARG_UNUSED 
> > (arg_name)'.
> > The shorthand form you're pointing out seems to be used only once, in
> > vmsdbgout.c, as far as I can tell.
> 
> Scratch this, I totally botched my grep invocation.  There's plenty of 
> instances
> where the shorthand form is used, and I'll be happy to use it here as well.

Generally, ARG_UNUSED or ATTRIBUTE_UNUSED needs to be used if the argument
is used conditionally (e.g. in macro that on some target uses its argument
and on another target ignores it).  Otherwise, sometimes somebody wants to
make clear what the names of the ignored arguments are, in that case one
can use those 2 forms too.  If the arguments are not used and it isn't
important how they are called, C++ omitted argument names are fine.

Jakub
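
The spellings discussed above can be put side by side. This is a sketch with hypothetical names (context, hook_long, hook_short, UNUSED_ATTR); UNUSED_ATTR stands in for ATTRIBUTE_UNUSED and assumes a GCC-compatible compiler:

```cpp
#include <cstdarg>

struct context;  // Hypothetical stand-in for diagnostic_context.

#define UNUSED_ATTR __attribute__ ((__unused__))

static int calls = 0;

// Long form: names are kept and each one is marked unused.  This is the
// form to use when a parameter is only used under some configurations.
static void hook_long (context *ctx UNUSED_ATTR,
                       const char *msgid UNUSED_ATTR,
                       std::va_list *ap UNUSED_ATTR)
{
  ++calls;
}

// C++ short form: unnamed parameters cannot trigger unused warnings.
static void hook_short (context *, const char *, std::va_list *)
{
  ++calls;
}

// Both functions have the same type and fit the same callback slot.
typedef void (*hook_fn) (context *, const char *, std::va_list *);
static hook_fn hooks[] = { hook_long, hook_short };
```

The choice is purely about readability at the definition; callers cannot tell the two apart.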

[C++ PATCH] PR c++/80891 case #4

2017-05-29 Thread Nathan Sidwell


this fixes the 4th testcase.  I'd flubbed the sense of an assert.

nathan

--
Nathan Sidwell
2017-05-29  Nathan Sidwell  

	PR c++/80891 (#4)
	* ptree.c (cxx_print_xnode): Show internal OVERLOAD structure.
	* tree.c (ovl_insert, ovl_iterator_remove_node): Fix copying assert.

	PR c++/80891 (#4)
	* g++.dg/lookup/pr80891-4.C: New.

Index: cp/ptree.c
===
--- cp/ptree.c	(revision 248574)
+++ cp/ptree.c	(working copy)
@@ -236,11 +236,8 @@ cxx_print_xnode (FILE *file, tree node,
 		  indent + 4);
   break;
 case OVERLOAD:
-  print_node (file, "name", OVL_NAME (node), indent+4);
-  for (ovl_iterator iter (node, true); iter; ++iter)
-	print_node (file,
-		TREE_CODE (*iter) == OVERLOAD ? "inner" : "function",
-		*iter, indent+4);
+  print_node (file, "function", OVL_FUNCTION (node), indent+4);
+  print_node (file, "next", OVL_CHAIN (node), indent+4);
   break;
 case TEMPLATE_PARM_INDEX:
   print_node (file, "decl", TEMPLATE_PARM_DECL (node), indent+4);
Index: cp/tree.c
===
--- cp/tree.c	(revision 248574)
+++ cp/tree.c	(working copy)
@@ -2170,7 +2170,7 @@ ovl_insert (tree fn, tree maybe_ovl, boo
 		   | (OVL_USING_P (maybe_ovl) << 0
 {
   gcc_checking_assert (!OVL_LOOKUP_P (maybe_ovl)
-			   && (!OVL_USED_P (maybe_ovl) || !copying));
+			   && (!copying || OVL_USED_P (maybe_ovl)));
   if (OVL_USED_P (maybe_ovl))
 	{
 	  copying = true;
@@ -2264,7 +2264,7 @@ ovl_iterator::remove_node (tree overload
 {
   tree probe = *slot;
   gcc_checking_assert (!OVL_LOOKUP_P (probe)
-			   && (!OVL_USED_P (probe) || !copying));
+			   && (!copying || OVL_USED_P (probe)));
   if (OVL_USED_P (probe))
 	{
 	  copying = true;
Index: testsuite/g++.dg/lookup/pr80891-4.C
===
--- testsuite/g++.dg/lookup/pr80891-4.C	(revision 0)
+++ testsuite/g++.dg/lookup/pr80891-4.C	(working copy)
@@ -0,0 +1,13 @@
+// PR c++/80891 part 4
+// Inserting into an immutable overload set
+
+namespace tuples {
+template  void get();
+template  void get();
+}
+using tuples::get;
+template  void make_iterator_vertex_map() {
+  RandomAccessIterator a;
+  a, get;
+}
+template  void get();
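
For reference, the two assert conditions in the tree.c hunks above differ in exactly two input combinations. A small truth-table sketch, using plain bools in place of OVL_USED_P and the copying flag (old_check and new_check are hypothetical names):

```cpp
// Old condition: rejects only the combination (used && copying).
constexpr bool old_check (bool used, bool copying)
{
  return !used || !copying;
}

// New condition, "copying implies used": rejects only (copying && !used).
constexpr bool new_check (bool used, bool copying)
{
  return !copying || used;
}

static_assert (!old_check (true, true) && new_check (true, true),
               "the conditions disagree on used && copying");
static_assert (old_check (false, true) && !new_check (false, true),
               "and on !used && copying");
```

When copying is false both forms accept either value of used, so the change only matters once copying has been switched on.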

Re: [gcn][patch] Add -mgpu option and plumb in assembler/linker

2017-05-29 Thread Martin Jambor

Hello Andrew,

I apologize for taking so long to reply. I was traveling for the past two
weeks, and just before that we suffered some local infrastructure
issues that prevented me from working on this too.

On Fri, Apr 28, 2017 at 06:06:39PM +0100, Andrew Stubbs wrote:
> This patch, for the "gcn" branch, does three things:
> 
> 1. Add specs to drive the LLVM assembler and linker. It requires them to be
> installed as "as" and "ld", under $target/bin, but then the compiler Just
> Works with these specs.

At the moment I prefer to use --with-as and --with-ld configure
options which are better suited for my setup.  The invocation of
assembler works well, the invocation of ld.lld works too, but with the
added caveat that collect2 afterwards attempts to do non-plugin LTO
and calls maybe_run_lto_and_relink, which wants to run nm, which is
not available and so it fails with a fatal_error.  It took me a while
to figure out what was going on and that the result was actually fine,
despite the error message.  I guess we are fine with passing -fno-lto
or rather disabling lto at configure time for the time being.

> 
> 2. Switch to HSACO format version 2, and have the assembler auto-set the
> architecture flags from -mcpu. This means the amdphdr utility is no longer
> required.

This is the one thing that was difficult for me to get working.
I had to upgrade my kernel and both run-time libraries to the newest
ROCm 1.5, re-compile llvm and lld from ROCm github branches, and
rewrite our testing kernel invoker to use non-deprecated HSA 1.1
functions (we had been using hsa_code_object_deserialize and friends
from HSA 1.0).  But finally, my kernels get loaded, started and work.

> 
> 3. Add -mgpu option and corresponding --with-gpu. I've deliberately used
> "gpu" instead of "cpu" because I want offloading compilers to be able to say
> "-mcpu=foo -foffload=-mgpu=bar", or even have the host compiler just
> understand -mgpu and DTRT.

As far as I am concerned, this seems like a good idea. 

Anyhow, thanks for submitting your patch, I apologize once again for
taking so long to test it.  Please commit the changes, I will wait
with (a bit overdue) merge from trunk until after you do.

Thanks,

Martin

> 
> The patch also removes the unused and unwritten "arch" and "tune" settings.
> They can be added back when useful, but the assembler requires a GPU name, I
> think, so we need that as input.
> 
> OK to commit to GCN branch?
> 
> Andrew
> 

> commit 5058457b0fa07865b366832828e74a53e5bd2964
> Author: Andrew Stubbs 
> Date:   Fri Apr 28 14:37:25 2017 +0100
> 
> Add -mgpu
> 
> 2017-04-28  Andrew Stubbs  
> 
>   gcc/
>   * config.gcc (amdgcn): Remove --with-arch and --with-tune.
>   Add --with-gpu, and set default to "carrizo"
>   (add_defaults): Add "gpu".
>   * config/gcn/gcn-opts.h: New file.
>   * config/gcn/gcn.c (output_file_start): Switch to HSACO version
>   2 and auto-detection of GPU type (from -mcpu).
>   (gcn_arch, gcn_tune): Remove.
>   * config/gcn/gcn.h: Include gcn-opts.h.
>   (enum processor_type): Move to gcn-opts.h.
>   (LIBGCC_SPEC, ASM_SPEC, LINK_SPEC): Define.
>   (gcn_arch, gcn_tune): Remove.
>   (OPTION_DEFAULT_SPECS): Remove "arch" and "tune"; add "gpu".
>   * config/gcn/gcn.opt: Include gcn-opts.h.
>   (gpu_type): New Enum.
>   (mgpu): New option.
>

Re: SSA range class and removal of VR_ANTI_RANGEs

2017-05-29 Thread Martin Jambor

Hi,

On Wed, May 24, 2017 at 10:25:40AM +0200, Richard Biener wrote:
> Well, anti-ranges are "evil" for actual working with ranges.  They are nice
> for optimizing the storage requirements though.
> 
> As I'm replying late I'll add that yes, it does make a difference in memory
> use.  We've seen this with IPA VRP info eating up 1 GB extra memory
> for firefox so we optimized it to use trailing wide-ints.

Actually, the way we got most of that memory under control was to use
ggc_cache_remove hasher to store each unique value_range only once and
share it among all jump functions that referred to it.  See
ipa_vr_ggc_hash_traits and ipa_bit_ggc_hash_traits, it was
surprisingly easy to set up and helped a lot, just sharing the
non-NULL VR collapsed 706245 separate value_range structures into one.
Of course, one has to be careful not to change VR for all those
sharing it when intending to update it for just one of the parameters
it describes, but since you are devising an API, it can be taken care
of easily.

Hope this helps,

Martin
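
The sharing scheme described above is a form of hash-consing. A rough sketch using the standard library follows; it assumes a simplified range of two longs and hypothetical names (range, range_hash, intern_range), and it ignores the garbage-collection side that ggc_cache_remove takes care of in GCC:

```cpp
#include <cstddef>
#include <functional>
#include <unordered_set>

// Hypothetical, simplified range; the real value_range is richer.
struct range
{
  long min, max;
  bool operator== (const range &o) const
  {
    return min == o.min && max == o.max;
  }
};

struct range_hash
{
  std::size_t operator() (const range &r) const
  {
    return std::hash<long> () (r.min) * 31 + std::hash<long> () (r.max);
  }
};

// Table of unique ranges.  Elements of an unordered_set keep their
// address across rehashing, so the returned pointers stay valid.
static std::unordered_set<range, range_hash> range_table;

// Return the shared copy of R, inserting it on first use.  The result is
// const: a user that needs a different range must intern a new one
// instead of modifying the shared object.
static const range *intern_range (const range &r)
{
  return &*range_table.insert (r).first;
}
```

Returning a pointer to const is one way to address the caveat about not changing a range for everybody who shares it.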

Re: [PATCH] Dump function on internal errors

2017-05-29 Thread Alexander Monakov

On Mon, 29 May 2017, Alexander Monakov wrote:
> +/* This helper function is invoked from diagnostic routines prior to aborting
> +   due to internal compiler error.  If a dump file is set up, dump the
> +   current function.  */
> +
> +void
> +emergency_dump_function ()
> +{
> +  if (!dump_file || !current_pass || !cfun)
> +return;
> +  fnotice (stderr, "function dumped to file %s\n", dump_file_name);
> +  execute_function_dump (cfun, current_pass);
> +}

I've noticed that the notice is not terribly useful.  Perhaps it's better to
mention the failing pass when not producing the dump (untested):

void
emergency_dump_function ()
{
  if (!current_pass || !cfun)
return;
  if (dump_file)
{
  fnotice (stderr, "dump file: %s\n", dump_file_name);
  execute_function_dump (cfun, current_pass);
}
  else if (current_pass->name[0] != '*')
{
  enum opt_pass_type pt = current_pass->type;
  fnotice (stderr, "during %s pass: %s\n", 
   pt == GIMPLE_PASS ? "GIMPLE" : pt == RTL_PASS ? "RTL" : "IPA",
   current_pass->name);
}
}

Alexander

Re: [patch, libgfortran] PR53029 missed optimization in internal read (without implied-do-loop)

2017-05-29 Thread Jerry DeLisle


On 05/29/2017 09:51 AM, Thomas Koenig wrote:

Hi Jerry,

Regression tested on x86_64. I have added a test case which will check the 
execution time of the loop. The previous results of the READ were correct; 
they just took a long time on large arrays.


OK for trunk?


OK.

It might be good if you followed Manfred's suggestion and turned
down the timeout to something like 0.5 seconds.

Thanks for the patch!

I would also consider backporting, the speedup is just so
large.  What do others think?

Regards

 Thomas


Committed.

A   gcc/testsuite/gfortran.dg/read_5.f90
M   gcc/testsuite/ChangeLog
M   libgfortran/ChangeLog
M   libgfortran/io/list_read.c
Committed r248577

Thanks,

Jerry

[C++ PATCH] PR c++/80891 #1 & #5

2017-05-29 Thread Nathan Sidwell

This patch addresses testcase #1 by stopping name lookup returning 
duplicates.  We use TREE_VISITED (via LOOKUP_SEEN_P) on the underlying 
decls of an overload.   This is better than what we used to do, which 
was either an O(N^2) search, or use of a hash table.  Furthermore, we 
take advantage of the sorted nature of overload bindings, and only enter 
this deduplicating mode when we see an overload containing a USING 
declaration.
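
The marking approach described above can be sketched in isolation. The names below (decl, seen, add_unique, unmark) are hypothetical; seen plays the role of TREE_VISITED, and this shows only the general technique, not the name-lookup.c implementation:

```cpp
#include <vector>

// Hypothetical declaration node with a scratch mark bit.
struct decl
{
  int id;
  bool seen;
};

// Append to RESULT each decl of FOUND that is not already there.  Every
// membership test is a single flag check, so the pass is linear in the
// number of decls, with no hash table and no nested search.
static void add_unique (std::vector<decl *> &result,
                        const std::vector<decl *> &found)
{
  for (decl *d : found)
    if (!d->seen)
      {
        d->seen = true;
        result.push_back (d);
      }
}

// Clear the marks so that the next lookup starts clean.
static void unmark (const std::vector<decl *> &result)
{
  for (decl *d : result)
    d->seen = false;
}
```

The cost of the scheme is that marks must be cleared, and saved and restored around nested lookups, which is what the preserve_state and restore_state changes in the patch deal with.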


Thus I revert the earlier change to most_specialized_instantiation, 
putting an assert there.  Coincidentally this patch fixes Markus' 5th 
testcase.


nathan
--
Nathan Sidwell
2017-05-29  Nathan Sidwell  

	PR c++/80891 (#1,#5)
	* cp-tree.h (lookup_maybe_add): Add DEDUPING argument.
	* name-lookup.c (name_lookup): Add deduping field.
	(name_lookup::preserve_state, name_lookup::restore_state): Deal
	with deduping.
	(name_lookup::add_overload): New.
	(name_lookup::add_value, name_lookup::add_fns): Call add_overload.
	(name_lookup::search_adl): Set deduping.  Don't unmark here.
	* pt.c (most_specialized_instantiation): Revert previous change,
	Assert not given duplicates.
	* tree.c (lookup_mark): Just mark the underlying decls.
	(lookup_maybe_add): Dedup using marked decls.

	PR c++/80891 (#5)
	* g++.dg/lookup/pr80891-5.C: New.

Index: cp/cp-tree.h
===
--- cp/cp-tree.h	(revision 248576)
+++ cp/cp-tree.h	(working copy)
@@ -6916,7 +6916,8 @@ extern tree ovl_insert(tree fn, tree
 extern tree ovl_skip_hidden			(tree) ATTRIBUTE_PURE;
 extern void lookup_mark(tree lookup, bool val);
 extern tree lookup_add(tree fns, tree lookup);
-extern tree lookup_maybe_add			(tree fns, tree lookup);
+extern tree lookup_maybe_add			(tree fns, tree lookup,
+		 bool deduping);
 extern void lookup_keep(tree lookup, bool keep);
 extern int is_overloaded_fn			(tree) ATTRIBUTE_PURE;
 extern bool really_overloaded_fn		(tree) ATTRIBUTE_PURE;
Index: cp/name-lookup.c
===
--- cp/name-lookup.c	(revision 248576)
+++ cp/name-lookup.c	(working copy)
@@ -48,7 +48,11 @@ static void set_identifier_type_value_wi
 #define MAYBE_STAT_DECL(N) (STAT_HACK_P (N) ? STAT_DECL (N) : N)
 #define MAYBE_STAT_TYPE(N) (STAT_HACK_P (N) ? STAT_TYPE (N) : NULL_TREE)
 
-static tree stat_hack (tree decl = NULL_TREE, tree type = NULL_TREE)
+/* Create a STAT_HACK node with DECL as the value binding and TYPE as
+   the type binding.  */
+
+static tree
+stat_hack (tree decl = NULL_TREE, tree type = NULL_TREE)
 {
   tree result = make_node (OVERLOAD);
 
@@ -179,6 +183,8 @@ public:
   tree value;	/* A (possibly ambiguous) set of things found.  */
   tree type;	/* A type that has been found.  */
   int flags;	/* Lookup flags.  */
+  bool deduping; /* Full deduping is needed because using declarations
+		are in play.  */
   vec *scopes;
   name_lookup *previous; /* Previously active lookup.  */
 
@@ -191,7 +197,7 @@ protected:
 public:
   name_lookup (tree n, int f = 0)
   : name (n), value (NULL_TREE), type (NULL_TREE), flags (f),
-scopes (NULL), previous (NULL)
+deduping (false), scopes (NULL), previous (NULL)
   {
 preserve_state ();
   }
@@ -235,6 +241,7 @@ private:
 
 private:
   static tree ambiguous (tree thing, tree current);
+  void add_overload (tree fns);
   void add_value (tree new_val);
   void add_type (tree new_type);
   bool process_binding (tree val_bind, tree type_bind);
@@ -321,7 +328,8 @@ name_lookup::preserve_state ()
 	}
 
   /* Unmark the outer partial lookup.  */
-  lookup_mark (previous->value, false);
+  if (previous->deduping)
+	lookup_mark (previous->value, false);
 }
   else
 scopes = shared_scopes;
@@ -333,6 +341,9 @@ name_lookup::preserve_state ()
 void
 name_lookup::restore_state ()
 {
+  if (deduping)
+lookup_mark (value, false);
+
   /* Unmark and empty this lookup's scope stack.  */
   for (unsigned ix = vec_safe_length (scopes); ix--;)
 {
@@ -371,7 +382,8 @@ name_lookup::restore_state ()
 	}
 
   /* Remark the outer partial lookup.  */
-  lookup_mark (previous->value, true);
+  if (previous->deduping)
+	lookup_mark (previous->value, true);
 }
   else
 shared_scopes = scopes;
@@ -415,12 +427,36 @@ name_lookup::ambiguous (tree thing, tree
   return current;
 }
 
+/* FNS is a new overload set to add to the existing set.  */
+
+void
+name_lookup::add_overload (tree fns)
+{
+  if (!deduping && TREE_CODE (fns) == OVERLOAD)
+{
+  tree probe = fns;
+  if (flags & LOOKUP_HIDDEN)
+	probe = ovl_skip_hidden (probe);
+  if (probe && TREE_CODE (probe) == OVERLOAD && OVL_USING_P (probe))
+	{
+	  /* We're about to add something found by a using
+	 declaration, so need to engage deduping mode.  */
+	  lookup_mark (value, true);
+	  deduping = true;
+	}
+}
+
+  value = lookup_maybe_add (fns, value, deduping);
+}
+
 /* Add a NEW_VAL, a found value binding into the current value binding.  */
 
 vo

Re: [libcc1] add support for C++

2017-05-29 Thread Alexandre Oliva

On May 28, 2017, Alexandre Oliva  wrote:

> Oh, no!  I put it there temporarily, very early in the project, because
> I couldn't find a better place (I looked for available bits elsewhere,
> and I recall I couldn't find any); at the end we moved to a hash_set
> (see query_oracle below), that makes a lot more sense since the bit is
> only used when libcc1 is in use.  But I accidentally left in place the
> data member I'd added before, completely unused :-(  Ouch!

> I'll test and install the obvious fix, trunk and branch

Here's what I'm checking in, trunk and gcc-7-branch.  Regstrapped on
x86_64-linux-gnu and i686-linux-gnu.

[libcc1] drop unused field from C++ lang_identifier

for  gcc/cp/ChangeLog

* cp-tree.h (lang_identifier): Drop oracle_looked_up, unused.
---
 gcc/cp/cp-tree.h |1 -
 1 file changed, 1 deletion(-)

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 100f85c..360e13f 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -519,7 +519,6 @@ struct GTY(()) lang_identifier {
   cxx_binding *bindings;
   tree class_template_info;
   tree label_value;
-  bool oracle_looked_up;
 };
 
 /* Return a typed pointer version of T if it designates a


-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

Re: Default std::vector default and move constructor

2017-05-29 Thread François Dumont


Hi

It wasn't such a big deal to restore value-init of the allocator. 
So here is the updated patch.


I used:
  _Bvector_impl() _GLIBCXX_NOEXCEPT_IF( noexcept(_Bit_alloc_type()) )

rather than using is_nothrow_default_constructible. Any advantage 
in one approach or the other ?


I'll complete testing and add a test on this value-initialization 
before commit if you agree.


Tests still running but I'm pretty sure it will work the same.

François


On 28/05/2017 22:13, François Dumont wrote:

On 27/05/2017 13:14, Jonathan Wakely wrote:

On 26/05/17 23:13 +0200, François Dumont wrote:

On 25/05/2017 18:28, Jonathan Wakely wrote:

On 15/05/17 19:57 +0200, François Dumont wrote:

Hi

  Following what I have started on RbTree here is a patch to 
default implementation of default and move constructors on 
std::vector.


  As in _Rb_tree_impl the default allocator is not value 
initialized anymore. We could add a small helper type around the 
allocator to do this value initialization per default. Should I do 
so ?


It's required to be value-initialized, so if your patch changes that
then it's a problem.

Did we decide it's OK to do that for RB-trees? Did we actually discuss
that part of the r243379 changes?


I remember a message pointing out this issue, but after the commit AFAIR. 
I thought it was from Tim but I can't find it in the archive.


What is the rationale for this requirement? I started working on a 
type to do the allocator value initialization if there is no default 
constructor but it seems quite complicated to do so. It is quite sad 
that we can't fully benefit from this nice C++11 feature just 
because of this requirement. If there is any initialization needed 
it doesn't sound complicated to provide a default constructor.


The standard says that the default constructor is:

 vector() : vector(Allocator()) { }

That value-initializes the allocator. If the allocator type behaves
differently for value-init and default-init (e.g. it has data members
that are left uninitialized by default-init) then the difference
matters. If you change the code so it only does default-init of the
allocator then you will introduce an observable difference.

I don't see any requirement that a DefaultConstructible allocator
cannot leave members uninitialized, so that means the standard
requires default construction of vector to value-init the
allocator. Not default-init.
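
The distinction can be sketched with a toy type. Everything here (stateful_alloc, holder_value, holder_default) is a hypothetical name invented for illustration, not libstdc++ code. The type has no user-provided default constructor, so value-initialization zeroes its member while default-initialization leaves it indeterminate; the sketch deliberately never reads the indeterminate case, since doing so would be undefined behaviour.

```cpp
#include <cassert>

// Hypothetical allocator-like type: implicit default ctor, no initializer.
struct stateful_alloc { int state; };

// Mirrors "vector() : vector(Allocator()) { }": alloc() value-initializes,
// which zero-initializes state because the ctor is not user-provided.
struct holder_value
{
  stateful_alloc alloc;
  holder_value() : alloc() {}
};

// Default-initializes alloc: alloc.state is indeterminate here,
// so it must not be read before being assigned.
struct holder_default
{
  stateful_alloc alloc;
  holder_default() {}
};

int value_initialized_state()
{
  holder_value h;
  return h.alloc.state;   // well-defined: zero
}
```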


Sure, but just as one person's freedom stops where another's starts, so do 
those requirements :-). Because the Standard says that the allocator 
is value-initialized rather than default-initialized, it makes use of the 
C++11 defaulted default constructor more complicated.


But as it is unavoidable here is a type I tried to work on to keep 
current implementations as long as we inherit from 
__alloc_value_initializer.


I don't like it myself but I propose it just in case you are interested.

Otherwise I am also going to rework my patch to keep this initialization.

François



diff --git a/libstdc++-v3/include/bits/stl_bvector.h b/libstdc++-v3/include/bits/stl_bvector.h
index 37e000a..6509ac5 100644
--- a/libstdc++-v3/include/bits/stl_bvector.h
+++ b/libstdc++-v3/include/bits/stl_bvector.h
@@ -388,10 +388,17 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   { return __x + __n; }
 
   inline void
-  __fill_bvector(_Bit_iterator __first, _Bit_iterator __last, bool __x)
+  __fill_bvector(_Bit_type * __v,
+		 unsigned int __first, unsigned int __last, bool __x)
   {
-for (; __first != __last; ++__first)
-  *__first = __x;
+const _Bit_type __fmask = ~0ul << __first;
+const _Bit_type __lmask = ~0ul >> (_S_word_bit - __last);
+const _Bit_type __mask = __fmask & __lmask;
+
+if (__x)
+  *__v |= __mask;
+else
+  *__v &= ~__mask;
   }
 
   inline void
@@ -399,12 +406,18 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   {
 if (__first._M_p != __last._M_p)
   {
-	std::fill(__first._M_p + 1, __last._M_p, __x ? ~0 : 0);
-	__fill_bvector(__first, _Bit_iterator(__first._M_p + 1, 0), __x);
-	__fill_bvector(_Bit_iterator(__last._M_p, 0), __last, __x);
+	_Bit_type *__first_p = __first._M_p;
+	if (__first._M_offset != 0)
+	  __fill_bvector(__first_p++, __first._M_offset, _S_word_bit, __x);
+
+	__builtin_memset(__first_p, __x ? ~0 : 0,
+			 (__last._M_p - __first_p) * sizeof(_Bit_type));
+
+	if (__last._M_offset != 0)
+	  __fill_bvector(__last._M_p, 0, __last._M_offset, __x);
   }
 else
-  __fill_bvector(__first, __last, __x);
+  __fill_bvector(__first._M_p, __first._M_offset, __last._M_offset, __x);
   }
 
   template
@@ -416,33 +429,61 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 	_Bit_alloc_traits;
   typedef typename _Bit_alloc_traits::pointer _Bit_pointer;
 
-  struct _Bvector_impl
-  : public _Bit_alloc_type
+  struct _Bvector_impl_data
   {
 	_Bit_iterator 	_M_start;
 	_Bit_iterator 	_M_finish;
 	_Bit_pointer 	_M_end_of_storage;
 
-	_Bvector_impl()
-	: _Bit_alloc_type(), _M_start(), _M_finish(), _M_end_of_

Re: [PATCH 0/13] D: Submission of D Front End

2017-05-29 Thread Eric Botcazou

> The upstream DMD compiler that comprises all components of the
> standalone part is now implemented in D programming language itself.
> However here GDC is still using the C++ implementation, it is a future
> goal to switch to being a self-hosted compiler minus the GCC binding
> interface (similar to Ada?), however extended platform support is
> something I wish to address first before I make this a consideration.

Yes, the Ada compiler is written in Ada and the glue code (called gigi) lives 
in ada/gcc-interface and is written in C++.

-- 
Eric Botcazou

Re: [libcc1] add support for C++

2017-05-29 Thread Nathan Sidwell


On 05/29/2017 04:21 PM, Alexandre Oliva wrote:


Here's what I'm checking in, trunk and gcc-7-branch.  Regstrapped on
x86_64-linux-gnu and i686-linux-gnu.

[libcc1] drop unused field from C++ lang_identifier

for  gcc/cp/ChangeLog

* cp-tree.h (lang_identifier): Drop oracle_looked_up, unused.



thanks!


--
Nathan Sidwell

Restore documentation of --enable-sjlj-exceptions

2017-05-29 Thread Eric Botcazou

It had wrongly been put in the Java-specific section of doc/install.texi so 
was removed when the entire section was removed.  Fixed thusly on the mainline 
and 7 branch (and entry moved on the 6 and 5 branches for good measure).

Tested on x86_64-suse-linux, applied on all active branches.


2017-05-29  Eric Botcazou  

* doc/install.texi (Options specification): Restore entry of
--enable-sjlj-exceptions.

-- 
Eric Botcazou

Index: doc/install.texi
===
--- doc/install.texi	(revision 248552)
+++ doc/install.texi	(working copy)
@@ -1047,6 +1047,11 @@ and for cross builds configured with @op
 More documentation about multiarch can be found at
 @uref{https://wiki.debian.org/Multiarch}.
 
+@item --enable-sjlj-exceptions
+Force use of the @code{setjmp}/@code{longjmp}-based scheme for exceptions.
+@samp{configure} ordinarily picks the correct value based on the platform.
+Only use this option if you are sure you need a different setting.
+
 @item --enable-vtable-verify
 Specify whether to enable or disable the vtable verification feature.
 Enabling this feature causes libstdc++ to be built with its virtual calls

Re: [PATCH] gcc: xtensa: fix unused parameter warning

2017-05-29 Thread Max Filippov

On Mon, May 29, 2017 at 9:08 AM, augustine.sterl...@gmail.com
 wrote:
> On Mon, May 29, 2017 at 4:11 AM, Max Filippov  wrote:
>> 2017-05-28  Max Filippov  
>> gcc/
>> * config/xtensa/xtensa.c (xtensa_initial_elimination_offset):
>> Mark 'to' argument with ATTRIBUTE_UNUSED.
>
> This is ok.

Applied to trunk. Thank you!

-- Max

Re: [PATCH] gcc: xtensa: fix fprintf format specifiers

2017-05-29 Thread Max Filippov

On Mon, May 29, 2017 at 9:08 AM, augustine.sterl...@gmail.com
 wrote:
> On Mon, May 29, 2017 at 4:11 AM, Max Filippov  wrote:
>> HOST_WIDE_INT may not be long as assumed in print_operand and
>> xtensa_emit_call. Use HOST_WIDE_INT_PRINT_DEC/HOST_WIDE_INT_PRINT_HEX
>> format strings instead of %ld/0x%lx. This fixes incorrect assembly code
>> generation by the compiler running on armhf host.
>
> This is ok.

Applied to trunk. Thank you!
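
As a general illustration of the problem class (not the xtensa code itself), the sketch below formats a 64-bit value with the standard <cinttypes> macros instead of hard-coding %ld. HOST_WIDE_INT_PRINT_DEC and HOST_WIDE_INT_PRINT_HEX are GCC-internal macros; PRId64 and PRIx64 are used here only as a standard analogue, and format_dec and format_hex are hypothetical helper names.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdio>
#include <string>

// Format a 64-bit signed value in decimal. "%ld" would be wrong on hosts
// where long is 32 bits; PRId64 expands to the right conversion.
std::string format_dec(std::int64_t v)
{
  char buf[32];
  std::snprintf(buf, sizeof buf, "%" PRId64, v);
  return buf;
}

// Format the same bits in hexadecimal with a 0x prefix.
std::string format_hex(std::int64_t v)
{
  char buf[32];
  std::snprintf(buf, sizeof buf, "0x%" PRIx64,
                static_cast<std::uint64_t>(v));
  return buf;
}
```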

-- Max

Re: [C++ PATCH] namespace stat hack representation

2017-05-29 Thread H.J. Lu

On Mon, May 29, 2017 at 8:11 AM, Nathan Sidwell  wrote:
> Currently bindings have two slots, a 'value' slot for the regular binding,
> and a 'type' slot for the struct name binding, which is only used when the
> value slot is holding something else.  for instance:
>
> struct foo {...} foo;
>
> The value slot will be a VAR_DECL, and the type slot an artificial
> TYPE_DECL.
>
> The type slot is very rarely non-null, because such code use is terribly
> confusing.  But as the name suggests, it's needed because of the C library's
> definition:
>
>   struct stat {...};
>   int stat (const char *, struct stat *);
>
> This patch changes the representation for namespace bindings, so we only use
> one slot, and if the stat hack is needed, it contains an OVERLOAD that is
> marked with LOOKUP_P (such overloads cannot otherwise appear in a binding).
> In that case the TYPE holds the TYPE_DECL and the FUNCTION holds the value
> binding.
>
> This patch doesn't change the use of cxx_binding, so the underlying accessor
> find_namespace_slot simply returns the address of the value field.  (The
> next patch will remove cxx_binding for namespaces.)
>
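
For readers unfamiliar with the language rule behind the "stat hack", here is a small sketch. It uses a made-up namespace demo so that it does not clash with the real <sys/stat.h>; the member size and the helper call_both are hypothetical. It shows only the source-level behaviour (a class name and a function sharing one name in one scope), not the compiler's internal binding representation.

```cpp
#include <cassert>
#include <cstring>

namespace demo {
  // A class and a function share one name in one scope.
  struct stat { int size; };

  // Once this function is declared, plain "stat" names the function; the
  // class is still reachable through the elaborated form "struct stat".
  int stat(const char* path, struct stat* out)
  {
    out->size = static_cast<int>(std::strlen(path));
    return 0;
  }
}

int call_both()
{
  struct demo::stat st;        // elaborated type specifier: the class
  demo::stat("abc", &st);      // unelaborated name: the function
  return st.size;
}
```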

This caused:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80915


-- 
H.J.

[PATCH v2] Add no_tail_call attribute

2017-05-29 Thread Yuri Gribov

On Mon, May 29, 2017 at 8:14 AM, Yuri Gribov  wrote:
> Hi all,
>
> As discussed in
> https://sourceware.org/ml/libc-alpha/2017-01/msg00455.html , some
> libdl functions rely on return address to figure out the calling
> DSO and then use this information in computation (e.g. output of dlsym
> depends on which library called it).
>
> As reported in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66826 this
> may break under tailcall optimization i.e. in cases like
>
>   return dlsym(...);
>
> Carlos confirmed that they would prefer to have GCC attribute to
> prevent tailcalls
> (https://sourceware.org/ml/libc-alpha/2017-01/msg00502.html) so there
> you go.
>
> This was bootstrapped on x86_64. Given that this is a minor addition,
> I only ran newly added regtests. I hope that's enough (full testsuite
> would take a week on my notebook...).

Added docs, per Alex's suggestion.

-Y


0001-Added-no_tail_call-attribute.patch
Description: Binary data

[PATCH][x86]Fix for false-positives results of runtime tests on machines not supporting AVX512F

2017-05-29 Thread Peryt, Sebastian

Hi,

The attached patch fixes false-positive results from runtime tests on 
machines that do not support the AVX512F feature. Currently, when a runtime 
test intended for AVX512F is run on a non-AVX512F machine, the most it does 
to report this is print SKIPPED, and only if debug is enabled. In every case 
the return value is 0, exactly as if the test had passed, which can be 
misleading when looking at the gcc.sum summary values. With this patch such 
tests are reported during make check as unexpected failures.

gcc/testsuite/
* gcc.target/i386/avx512f-check.h: Return value modified for skipped 
test.



Please let me know if such fix can be accepted.

Thanks,
Sebastian


AVX512F_TESTS_VERIFICATION_PATCH.patch
Description: AVX512F_TESTS_VERIFICATION_PATCH.patch

Re: {PATCH] New C++ warning -Wcatch-value

2017-05-29 Thread Volker Reichelt

On 24 May, Jason Merrill wrote:
> On Mon, May 15, 2017 at 3:58 PM, Martin Sebor  wrote:
>>> So how about the following then? I stayed with the catch part and added
>>> a parameter to the warning to let the user decide on the warnings she/he
>>> wants to get: -Wcatch-value=n.
>>> -Wcatch-value=1 only warns for polymorphic classes that are caught by
>>> value (to avoid slicing), -Wcatch-value=2 warns for all classes that
>>> are caught by value (to avoid copies). And finally -Wcatch-value=3
>>> warns for everything not caught by reference to find typos (like pointer
>>> instead of reference) and bad coding practices.
>>
>> It seems reasonable to me.  I'm not too fond of multi-level
>> warnings since few users take advantage of anything but the
>> default, but this case is simple and innocuous enough that
>> I don't think it can do harm.
> 
>>> Bootstrapped and regtested on x86_64-pc-linux-gnu.
>>> OK for trunk?
> 
> OK.

Committed.
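
To make the level-1 case concrete, here is a minimal sketch of the slicing that catching a polymorphic class by value causes. The class names base_error and derived_error and both helper functions are hypothetical. Note that compiling the by-value handler with the warning enabled is expected to produce the diagnostic under discussion.

```cpp
#include <cassert>
#include <string>

struct base_error
{
  virtual ~base_error() {}
  virtual std::string name() const { return "base_error"; }
};

struct derived_error : base_error
{
  std::string name() const override { return "derived_error"; }
};

// Catching by value copies only the base_error part (slicing).
std::string caught_by_value()
{
  std::string result;
  try { throw derived_error(); }
  catch (base_error e) { result = e.name(); }
  return result;
}

// Catching by reference keeps the dynamic type.
std::string caught_by_reference()
{
  std::string result;
  try { throw derived_error(); }
  catch (const base_error& e) { result = e.name(); }
  return result;
}
```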

>>> If so, would it make sense to add -Wcatch-value=1 to -Wextra or even -Wall?
>>> I would do this in a separate patch, because I haven't checked what that
>>> would mean for the testsuite.
>>
>> I can't think of a use case for polymorphic slicing that's not
>> harmful so unless there is a common one that escapes me, I'd say
>> -Wall.
> 
> Agreed.  But then you'll probably want to allow -Wno-catch-value to turn it 
> off.
> 
> Jason

So how about the following then?
Bootstrapped and regtested on x86_64-pc-linux-gnu.
OK for trunk?

Regards,
Volker

2017-05-30  Volker Reichelt  

* doc/invoke.texi (-Wcatch-value): Document new shortcut.
Add to -Wall section.

Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi (revision 248585)
+++ gcc/doc/invoke.texi (working copy)
@@ -265,8 +265,8 @@
 -Wno-builtin-declaration-mismatch @gol
 -Wno-builtin-macro-redefined  -Wc90-c99-compat  -Wc99-c11-compat @gol
 -Wc++-compat  -Wc++11-compat  -Wc++14-compat  -Wcast-align  -Wcast-qual  @gol
--Wchar-subscripts  -Wchkp  -Wcatch-value=@var{n}  -Wclobbered  -Wcomment  @gol
--Wconditionally-supported  @gol
+-Wchar-subscripts  -Wchkp  -Wcatch-value  -Wcatch-value=@var{n} @gol
+-Wclobbered  -Wcomment  -Wconditionally-supported @gol
 -Wconversion  -Wcoverage-mismatch  -Wno-cpp  -Wdangling-else  -Wdate-time @gol
 -Wdelete-incomplete @gol
 -Wno-deprecated  -Wno-deprecated-declarations  -Wno-designated-init @gol
@@ -3757,6 +3757,7 @@
 -Wbool-compare  @gol
 -Wbool-operation  @gol
 -Wc++11-compat  -Wc++14-compat  @gol
+-Wcatch-value @r{(C++ and Objective-C++ only)}  @gol
 -Wchar-subscripts  @gol
 -Wcomment  @gol
 -Wduplicate-decl-specifier @r{(C and Objective-C only)} @gol
@@ -5834,13 +5835,16 @@
 literals to @code{char *}.  This warning is enabled by default for C++
 programs.
 
-@item -Wcatch-value=@var{n} @r{(C++ and Objective-C++ only)}
+@item -Wcatch-value
+@itemx -Wcatch-value=@var{n} @r{(C++ and Objective-C++ only)}
 @opindex Wcatch-value
+@opindex Wno-catch-value
 Warn about catch handlers that do not catch via reference.
-With @option{-Wcatch-value=1} warn about polymorphic class types that
-are caught by value. With @option{-Wcatch-value=2} warn about all class
-types that are caught by value. With @option{-Wcatch-value=3} warn about
-all types that are not caught by reference.
+With @option{-Wcatch-value=1} (or @option{-Wcatch-value} for short)
+warn about polymorphic class types that are caught by value.
+With @option{-Wcatch-value=2} warn about all class types that are caught
+by value. With @option{-Wcatch-value=3} warn about all types that are
+not caught by reference. @option{-Wcatch-value} is enabled by @option{-Wall}.
 
 @item -Wclobbered
 @opindex Wclobbered
===

2017-05-30  Volker Reichelt  

* c.opt (Wcatch-value): New shortcut for Wcatch-value=1.
(Wcatch-value=1): Enable by -Wall.

Index: gcc/c-family/c.opt
===
--- gcc/c-family/c.opt  (revision 248585)
+++ gcc/c-family/c.opt  (working copy)
@@ -388,8 +388,12 @@
 C ObjC C++ ObjC++ Var(warn_cast_qual) Warning
 Warn about casts which discard qualifiers.
 
+Wcatch-value
+C++ ObjC++ Warning Alias(Wcatch-value=, 1, 0)
+Warn about catch handlers of non-reference type.
+
 Wcatch-value=
-C++ ObjC++ Var(warn_catch_value) Warning Joined RejectNegative UInteger
+C++ ObjC++ Var(warn_catch_value) Warning Joined RejectNegative UInteger 
LangEnabledBy(C++ ObjC++,Wall, 1, 0)
 Warn about catch handlers of non-reference type.
 
 Wchar-subscripts
===

Re: [PATCH][x86]Fix for false-positives results of runtime tests on machines not supporting AVX512F

2017-05-29 Thread Uros Bizjak

On Tue, May 30, 2017 at 7:59 AM, Peryt, Sebastian
 wrote:
> Hi,
>
> The attached patch fixes false-positive results from runtime tests on 
> machines that do not support the AVX512F feature. Currently, when a runtime 
> test intended for AVX512F is run on a non-AVX512F machine, the most it does 
> to report this is print SKIPPED, and only if debug is enabled. In every case 
> the return value is 0, exactly as if the test had passed, which can be 
> misleading when looking at the gcc.sum summary values. With this patch such 
> tests are reported during make check as unexpected failures.
>
> gcc/testsuite/
> * gcc.target/i386/avx512f-check.h: Return value modified for skipped 
> test.
>
>
>
> Please let me know if such fix can be accepted.

No, this is by design. It is not a failure if the target doesn't
support the requested runtime feature. The test should be marked
UNSUPPORTED in this case, but I don't think the DejaGnu infrastructure
allows that.

Uros.

Re: [PATCH 1/4 v3][PR 67328] Generate bittests in range checks if possible

2017-05-29 Thread Richard Sandiford

Yuri Gribov  writes:
> Added special case to build_range_check. Fixed couple of existing
> tests where it changed codegen.
>
> -I
>
> From b7819f341e2ffa0437be497024f61d0a4e1be588 Mon Sep 17 00:00:00 2001
> From: Yury Gribov 
> Date: Fri, 26 May 2017 07:49:46 +0100
> Subject: [PATCH 1/4] Generate bittests in range checks if possible.
>
> gcc/testsuite/
> 2017-05-26  Yury Gribov  
>
>   * c-c++-common/fold-masked-cmp-1.c: New test.
>   * c-c++-common/fold-masked-cmp-2.c: New test.
>   * gcc.dg/pr46309.c: Fix pattern.
>   * gcc.dg/pr46309-2.c: Fix pattern.
>
> gcc/
> 2017-05-26  Yury Gribov  
>
>   * fold-const.c (maskable_range_p): New function.
>   (build_range_check): Generate bittests if possible.
> ---
>  gcc/fold-const.c   | 41 -
>  gcc/testsuite/c-c++-common/fold-masked-cmp-1.c | 41 +
>  gcc/testsuite/c-c++-common/fold-masked-cmp-2.c | 42 
> ++
>  gcc/testsuite/gcc.dg/pr46309-2.c   |  2 +-
>  gcc/testsuite/gcc.dg/pr46309.c |  2 +-
>  5 files changed, 125 insertions(+), 3 deletions(-)
>  create mode 100644 gcc/testsuite/c-c++-common/fold-masked-cmp-1.c
>  create mode 100644 gcc/testsuite/c-c++-common/fold-masked-cmp-2.c
>
> diff --git a/gcc/fold-const.c b/gcc/fold-const.c
> index efc0b10..c334dc6 100644
> --- a/gcc/fold-const.c
> +++ b/gcc/fold-const.c
> @@ -4745,6 +4745,38 @@ make_range (tree exp, int *pin_p, tree *plow, tree 
> *phigh,
>*pin_p = in_p, *plow = low, *phigh = high;
>return exp;
>  }
> +
> +/* Returns TRUE is [LOW, HIGH] range check can be optimized to

s/is/if/

> +   a bitwise check i.e. when
> + LOW  == 0xXX...X00...0
> + HIGH == 0xXX...X11...1
> +   Return corresponding mask in MASK and stem in VALUE.  */
> +
> +static bool
> +maskable_range_p (const_tree low, const_tree high, tree type, tree *mask, 
> tree *value)
> +{
> +  if (TREE_CODE (low) != INTEGER_CST
> +  || TREE_CODE (high) != INTEGER_CST)
> +return false;
> +
> +  widest_int lo = wi::to_widest (low),
> +hi = wi::to_widest (high);

It shouldn't be necessary to go to widest here, since AIUI both
values have the same type.

> +
> +  widest_int end_mask = lo ^ hi,
> +stem_mask = ~end_mask;
> +  if ((end_mask & (end_mask + 1)) != 0
> +  || (lo & end_mask) != 0)
> +return false;
> +
> +  widest_int stem = lo & stem_mask;
> +  if (stem != (hi & stem_mask))
> +return false;

FWIW, I think this is equivalent to:

  if (lo + (lo & -lo) != hi + 1)
return false;

but I guess it's a matter of taste whether that's clearer or not.

> +  *mask = wide_int_to_tree (type, stem_mask);
> +  *value = wide_int_to_tree (type, stem);
> +
> +  return true;
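
The idea in the quoted function can be sketched on plain 32-bit unsigned integers. This is a toy model, not the fold-const.c code: it mirrors only the two mask conditions from the patch, and the names maskable_range, in_range_by_compare and in_range_by_mask are hypothetical. When LOW ends in k zero bits and HIGH has the same leading bits followed by k one bits, the two-sided comparison collapses to a single mask-and-compare.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if [lo, hi] has the form lo = X...X0...0, hi = X...X1...1,
// and stores the mask and the stem to compare against.
bool maskable_range(std::uint32_t lo, std::uint32_t hi,
                    std::uint32_t* mask, std::uint32_t* value)
{
  std::uint32_t end_mask = lo ^ hi;          // bits where lo and hi differ
  if ((end_mask & (end_mask + 1)) != 0       // must look like 0...01...1
      || (lo & end_mask) != 0)               // lo must be 0 in those bits
    return false;
  *mask = ~end_mask;
  *value = lo & ~end_mask;
  return true;
}

// The original two-sided range check.
bool in_range_by_compare(std::uint32_t x, std::uint32_t lo, std::uint32_t hi)
{ return lo <= x && x <= hi; }

// The bit test that replaces it.
bool in_range_by_mask(std::uint32_t x, std::uint32_t mask, std::uint32_t value)
{ return (x & mask) == value; }
```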

Thanks,
Richard

[Patch] Forward triviality in variant

2017-05-29 Thread Tim Shen via gcc-patches

This patch implements
, but with more
changes than the proposal's. It
1) Creates __detail::__variant::_Traits as a centralized place to hold
common (but not all yet) compile-time conditions.
2) Changes the noexcept conditions for the (copy|move) (ctor|assign)
SMFs, so that when one is trivial, it is also noexcept. This is no
longer the same as p0088r3, nor as p0088r3 + D0602R1.
3) Creates 4 structs, namely (_Copy|_Move)_(ctor|assign)_(base|alias),
for dispatch on triviality. The code that was originally in
_Variant_base is moved into these four structs. There are no
functional changes except for more triviality.
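
The dispatch-on-triviality idea in item 3 can be sketched with a much smaller toy than variant. All names below (copy_ctor_base, holder) are hypothetical and the layout is simplified to a single member; the real patch has to deal with a union of alternatives. A bool template parameter selects either a base whose copy constructor is implicit (and therefore trivial when the member's is) or a base with a user-provided one.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Primary template: selected by whether copying can stay trivial.
template<bool Trivial, typename T>
struct copy_ctor_base;

// Trivial case: no user-provided copy ctor, so triviality is forwarded.
template<typename T>
struct copy_ctor_base<true, T>
{
  T value;
};

// Non-trivial case: a user-provided copy ctor does the real work.
template<typename T>
struct copy_ctor_base<false, T>
{
  T value;
  copy_ctor_base() = default;
  copy_ctor_base(const copy_ctor_base& other) : value(other.value) {}
};

// The user-facing type just picks the right layer.
template<typename T>
struct holder
: copy_ctor_base<std::is_trivially_copy_constructible<T>::value, T>
{ };
```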

Sorry for having a large patch. Do tell me if you want me to split it.

Tested on x86_64-linux-gnu.

Thanks!


-- 
Regards,
Tim Shen
commit a4db7d21c6e4223300861114931eb0ef78bef1a6
Author: Tim Shen 
Date:   Mon May 29 22:44:42 2017 -0700

2017-05-30  Tim Shen  

PR libstdc++/80187
* include/std/variant (variant::variant, variant::~variant,
variant::operator=): Implement triviality forwarding for four
special member functions.
* testsuite/20_util/variant/compile.cc: Tests.

diff --git a/libstdc++-v3/include/std/variant b/libstdc++-v3/include/std/variant
index b9824a5182c..8736fcc75bc 100644
--- a/libstdc++-v3/include/std/variant
+++ b/libstdc++-v3/include/std/variant
@@ -290,6 +290,49 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	  __ref_cast<_Tp>(__t));
 }
 
+  template
+struct _Traits
+{
+  static constexpr bool is_default_constructible_v =
+  is_default_constructible_v::type>;
+  static constexpr bool is_copy_constructible_v =
+  __and_...>::value;
+  static constexpr bool is_move_constructible_v =
+  __and_...>::value;
+  static constexpr bool is_copy_assignable_v =
+  is_copy_constructible_v && is_move_constructible_v
+  && __and_...>::value;
+  static constexpr bool is_move_assignable_v =
+  is_move_constructible_v
+  && __and_...>::value;
+
+  static constexpr bool is_copy_ctor_trivial =
+  __and_...>::value;
+  static constexpr bool is_move_ctor_trivial =
+  __and_...>::value;
+  static constexpr bool is_copy_assign_trivial =
+  __and_...>::value;
+  static constexpr bool is_move_assign_trivial =
+  __and_...>::value;
+  static constexpr bool is_dtor_trivial =
+  __and_...>::value;
+
+  static constexpr bool is_default_ctor_noexcept =
+  is_nothrow_default_constructible_v<
+  typename _Nth_type<0, _Types...>::type>;
+  static constexpr bool is_copy_ctor_noexcept =
+  is_copy_ctor_trivial;
+  static constexpr bool is_move_ctor_noexcept =
+  is_move_ctor_trivial
+  || __and_...>::value;
+  static constexpr bool is_copy_assign_noexcept =
+  is_copy_assign_trivial;
+  static constexpr bool is_move_assign_noexcept =
+  is_move_assign_trivial ||
+  (is_move_ctor_noexcept
+   && __and_...>::value);
+};
+
   // Defines members and ctors.
   template
 union _Variadic_union { };
@@ -355,6 +398,19 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   ~_Variant_storage()
   { _M_reset(); }
 
+  void*
+  _M_storage() const
+  {
+	return const_cast(static_cast(
+	std::addressof(_M_u)));
+  }
+
+  constexpr bool
+  _M_valid() const noexcept
+  {
+	return this->_M_index != __index_type(variant_npos);
+  }
+
   _Variadic_union<_Types...> _M_u;
   using __index_type = __select_index<_Types...>;
   __index_type _M_index;
@@ -374,59 +430,114 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   void _M_reset()
   { _M_index = variant_npos; }
 
+  void*
+  _M_storage() const
+  {
+	return const_cast(static_cast(
+	std::addressof(_M_u)));
+  }
+
+  constexpr bool
+  _M_valid() const noexcept
+  {
+	return this->_M_index != __index_type(variant_npos);
+  }
+
   _Variadic_union<_Types...> _M_u;
   using __index_type = __select_index<_Types...>;
   __index_type _M_index;
 };
 
-  // Helps SFINAE on special member functions. Otherwise it can live in variant
-  // class.
   template
-struct _Variant_base :
-  _Variant_storage<(std::is_trivially_destructible_v<_Types> && ...),
-			_Types...>
-{
-  using _Storage =
-	  _Variant_storage<(std::is_trivially_destructible_v<_Types> && ...),
-			_Types...>;
+using _Variant_storage_alias =
+_Variant_storage<_Traits<_Types...>::is_dtor_trivial, _Types...>;
 
-  constexpr
-  _Variant_base()
-  noexcept(is_nothrow_default_constructible_v<
-		 variant_alternative_t<0, variant<_Types...>>>)
-  : _Variant_base(in_place_index<0>) { }
+  // The following are (Copy|Move) (ctor|assign) layers for forwarding
+  // triviality and handling non-trivial SMF behaviors.
 
-  _Variant_base(const _Variant_

Re: [PATCH 3/4 v3][PR 67328] Added bool conversion for wide_ints

2017-05-29 Thread Richard Sandiford

Yuri Gribov  writes:
> From 330209f721a598ec393dcb5d62de3457ee282153 Mon Sep 17 00:00:00 2001
> From: Yury Gribov 
> Date: Fri, 26 May 2017 07:53:10 +0100
> Subject: [PATCH 3/4] Added bool conversion for wide_ints.
>
> gcc/
> 2017-05-26  Yury Gribov  
>
>   * wide-int.cc (wi::zero_p_large): New method.
>   * wide-int.h (wi::zero_p): New method.

Do you still need this bit?  It looks like it isn't used by the other
parts of the series.

The idea was that wi::eq_p (x, 0) (or just x == 0, if x is a
wide-int-based type) is supposed to be as fast as a dedicated zero check.
It'd be OK to have a helper function anyway, but it should probably be
defined using wi::eq_p.

The zero_p_large fallback can never return true, since a zero of
any precision will have a length of 1.

Thanks,
Richard

> ---
>  gcc/wide-int.cc | 10 ++
>  gcc/wide-int.h  | 17 +
>  2 files changed, 27 insertions(+)
>
> diff --git a/gcc/wide-int.cc b/gcc/wide-int.cc
> index dab4c19..f1be89b 100644
> --- a/gcc/wide-int.cc
> +++ b/gcc/wide-int.cc
> @@ -433,6 +433,16 @@ top_bit_of (const HOST_WIDE_INT *a, unsigned int len, 
> unsigned int prec)
>   * unsigned and C++ has no such operators.
>   */
>  
> +/* Return true if OP == 0.  */
> +bool
> +wi::zero_p_large (const HOST_WIDE_INT *op, unsigned int len)
> +{
> +  for (unsigned i = 0; i < len; ++i)
> +if (op[i])
> +  return false;
> +  return true;
> +}
> +
>  /* Return true if OP0 == OP1.  */
>  bool
>  wi::eq_p_large (const HOST_WIDE_INT *op0, unsigned int op0len,
> diff --git a/gcc/wide-int.h b/gcc/wide-int.h
> index 2115b61..af63ffe 100644
> --- a/gcc/wide-int.h
> +++ b/gcc/wide-int.h
> @@ -462,6 +462,7 @@ namespace wi
>UNARY_PREDICATE fits_shwi_p (const T &);
>UNARY_PREDICATE fits_uhwi_p (const T &);
>UNARY_PREDICATE neg_p (const T &, signop = SIGNED);
> +  UNARY_PREDICATE zero_p (const T &);
>  
>template 
>HOST_WIDE_INT sign_mask (const T &);
> @@ -675,6 +676,9 @@ public:
>template 
>generic_wide_int &operator = (const T &);
>  
> +#define UNARY_PREDICATE(OP, F) \
> +  bool OP () const { return wi::F (*this); }
> +
>  #define BINARY_PREDICATE(OP, F) \
>template  \
>bool OP (const T &c) const { return wi::F (*this, c); }
> @@ -699,6 +703,7 @@ public:
>  #define INCDEC_OPERATOR(OP, DELTA) \
>generic_wide_int &OP () { *this += DELTA; return *this; }
>  
> +  UNARY_PREDICATE (operator !, zero_p)
>UNARY_OPERATOR (operator ~, bit_not)
>UNARY_OPERATOR (operator -, neg)
>BINARY_PREDICATE (operator ==, eq_p)
> @@ -1605,6 +1610,7 @@ decompose (HOST_WIDE_INT *scratch, unsigned int 
> precision,
> we generally want those to be removed by SRA.)  */
>  namespace wi
>  {
> +  bool zero_p_large (const HOST_WIDE_INT *, unsigned int);
>bool eq_p_large (const HOST_WIDE_INT *, unsigned int,
>  const HOST_WIDE_INT *, unsigned int, unsigned int);
>bool lts_p_large (const HOST_WIDE_INT *, unsigned int, unsigned int,
> @@ -1729,6 +1735,17 @@ wi::neg_p (const T &x, signop sgn)
>return xi.sign_mask () < 0;
>  }
>  
> +/* Return true if X is zero.  */
> +template 
> +inline bool
> +wi::zero_p (const T &x)
> +{
> +  WIDE_INT_REF_FOR (T) xi (x);
> +  if (__builtin_expect (xi.len == 1, true))
> +return !xi.val[0];
> +  return zero_p_large (xi.val, xi.len);
> +}
> +
>  /* Return -1 if the top bit of X is set and 0 if the top bit is clear.  */
>  template 
>  inline HOST_WIDE_INT
