[PATCH] Remove out-of-date comment

2017-12-29 Thread A. Skrobov
As of currently, `-fmessy-debugging` is not a valid option, nor is
`REG_USERVAR` looked at by the splitting logic.

2017-12-29  Artyom Skrobov tyomi...@gmail.com

* web.c: Remove out-of-date comment
---
gcc/web.c | 4 
1 file changed, 4 deletions(-)

diff --git a/gcc/web.c b/gcc/web.c
index a642f219865c..217e799ef732 100644
--- a/gcc/web.c
+++ b/gcc/web.c
@@ -22,10 +22,6 @@ along with GCC; see the file COPYING3.  If not see
increasing effectiveness of other optimizations.  The optimization can
serve as an example of use for the dataflow module.

-   We don't split registers with REG_USERVAR set unless -fmessy-debugging
-   is specified, because debugging information about such split variables
-   is almost unusable.
-
TODO
 - We may use profile information and ignore infrequent use for the
   purpose of web unifying, inserting the compensation code later to


Re: [nvptx, committed] Disable -gstatement-frontiers for nvptx

2017-12-29 Thread Tom de Vries

On 12/29/2017 05:12 AM, Alexandre Oliva wrote:

On Dec 27, 2017, Tom de Vries  wrote:


.loc file_index line_number column_position



so this causes ptxas errors when compiling something for nvptx with
-g, which breaks the nvptx build.


What do the errors look like?



The nvptx-none build breaks in libgcc:
...
configure:3706: error: in 
`/home/vries/nvptx/mainkernel-2/build-gcc/nvptx-none/libgcc':

configure:3709: error: cannot compute suffix of object files: cannot compile
...

For this conftest.c:
...
#define PACKAGE_NAME "GNU C Runtime Library"
#define PACKAGE_TARNAME "libgcc"
#define PACKAGE_VERSION "1.0"
#define PACKAGE_STRING "GNU C Runtime Library 1.0"
#define PACKAGE_BUGREPORT ""
#define PACKAGE_URL "http://www.gnu.org/software/libgcc/";
/* end confdefs.h.  */

int
main ()
{

  ;
  return 0;
}
...

We get this error:
...
$ build-gcc/gcc/xgcc -Bbuild-gcc/gcc -c -g -O2 conftest.c
ptxas conftest.o, line 17; fatal   : Parsing error near 'st': syntax error
ptxas fatal   : Ptx assembly aborted due to errors
nvptx-as: ptxas returned 255 exit status
...

By using -Wa,--no-verify, we can see the ptxas input:
...
$ cat -n conftest.o
...
15  mov.u32 %value,0;
16  .loc 1 15 1 is_stmt 0
17  st.param.u32 [%value_out],%value;
18  ret;
19  }
...

With -gno-statement-frontiers, we have this instead:
...
...
13  mov.u32 %value,0;
14  .loc 1 15 1
15  st.param.u32 [%value_out],%value;
16  ret;
17  }
...
and ptxas is able to parse the input.


I ask because the patches that actually change the generated debug info,
adding view to .loc lines when the assembler supports them, are yet to
be installed, in the patches that introduces LVUs.  No significant
changes have been made to dwarf2out in the SFN patchset so far.

Furthermore, even with the LVU patch, .loc directives with view numbers
would only be used if the assembler is detected as supporting them at
compiler build time.



If you point me to the location of a configure test that is supposed to 
file, I can try to find out why it passes.


Thanks,
- Tom


Re: [libgomp, openacc, openmp, PR83046] Prune removed funcs from offload table

2017-12-29 Thread Tom de Vries

On 12/28/2017 05:14 PM, Jakub Jelinek wrote:

On Thu, Dec 28, 2017 at 05:06:57PM +0100, Jakub Jelinek wrote:

This has O(n^2) complexity for n == vec_safe_length (offload_funcs).
Can't you instead just have 2 IVs, one for where we read the vector elt and
one for where we write it if the 2 are different, then truncate the vector
if needed at the end?



Done.


Another thing, I think you can safely remove elts from the vector (== from
the host and offloading target arrays) only when !flag_lto, because we rely
on the two arrays being the same.


I now mark the offload_funcs with DECL_PRESERVE_P in expand_omp_target 
if flag_lto, so AFAIU the removal should not happen anymore for flag_lto.



 So you can't remove elts only on the host
and not on the device, or vice versa.  The output_offload_tables function
has:
   /* In WHOPR mode during the WPA stage the joint offload tables need to be
  streamed to one partition only.  That's why we free offload_funcs and
  offload_vars after the first call of output_offload_tables.  */
   if (flag_wpa)
 {
   vec_free (offload_funcs);
   vec_free (offload_vars);
 }
so at least with flag_wpa, if we remove anything in there, it won't be
reflected by the other tables.  So, can we do something different in case
we can't easily remove stuff from the vector anymore?  Either store some
placeholder in the tables (dunno if NULL would work or what), 


I've tried NULL, that didn't work.


or instead
ensure corresponding functions can't be removed?




That's the approach I've chosen, as described above.


Maybe this removal if (!flag_lto) could be done earlier, e.g. at the
beginning of lto_output, and for nodes we keep around in the table
past that point set DECL_PRESERVE_P to 1 on the fndecl, so that we then
stream that flag.


Done.

Bootstrapped and reg-tested on x86_64.
Build and reg-tested for x86_64 with nvptx accelerator.

OK for trunk?

Thanks,
- Tom
Prune removed funcs from offload table

2017-12-27  Tom de Vries  

	PR libgomp/83046
	* omp-expand.c (expand_omp_target): If flag_lto, mark offload_funcs with
	DECL_PRESERVE_P.
	* lto-streamer-out.c (lto_output): Remove offload_funcs entries
	that no longer have a corresponding cgraph_node.  If !flag_lto, mark the
	remaining ones as DECL_PRESERVE_P.

	* testsuite/libgomp.oacc-c-c++-common/pr83046.c: New test.
	* testsuite/libgomp.c-c++-common/pr83046.c: New test.

---
 gcc/lto-streamer-out.c | 26 ++
 gcc/omp-expand.c   |  6 -
 libgomp/testsuite/libgomp.c-c++-common/pr83046.c   | 25 +
 .../testsuite/libgomp.oacc-c-c++-common/pr83046.c  | 25 +
 4 files changed, 81 insertions(+), 1 deletion(-)

diff --git a/gcc/lto-streamer-out.c b/gcc/lto-streamer-out.c
index ba29bd0..c38e389 100644
--- a/gcc/lto-streamer-out.c
+++ b/gcc/lto-streamer-out.c
@@ -41,6 +41,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "builtins.h"
 #include "gomp-constants.h"
 #include "debug.h"
+#include "omp-offload.h"
 
 
 static void lto_write_tree (struct output_block*, tree, bool);
@@ -2355,6 +2356,31 @@ lto_output (void)
   int i, n_nodes;
   lto_symtab_encoder_t encoder = lto_get_out_decl_state ()->symtab_node_encoder;
 
+  bool truncated_p = false;
+  unsigned int write_index = 0;
+  for (unsigned read_index = 0; read_index < vec_safe_length (offload_funcs);
+   read_index++)
+{
+  tree fn_decl = (*offload_funcs)[read_index];
+  bool remove_p = cgraph_node::get (fn_decl) == NULL;
+  if (remove_p)
+	{
+	  truncated_p = true;
+	  continue;
+	}
+
+  if (write_index != read_index)
+	(*offload_funcs)[write_index] = (*offload_funcs)[read_index];
+
+  write_index++;
+}
+  if (truncated_p)
+offload_funcs->truncate (write_index);
+
+  if (!flag_lto)
+for (unsigned i = 0; i < vec_safe_length (offload_funcs); i++)
+  DECL_PRESERVE_P ((*offload_funcs)[i]) = 1;
+
   if (flag_checking)
 output = lto_bitmap_alloc ();
 
diff --git a/gcc/omp-expand.c b/gcc/omp-expand.c
index 0248833..59237ff 100644
--- a/gcc/omp-expand.c
+++ b/gcc/omp-expand.c
@@ -7058,7 +7058,11 @@ expand_omp_target (struct omp_region *region)
 
   /* Add the new function to the offload table.  */
   if (ENABLE_OFFLOADING)
-	vec_safe_push (offload_funcs, child_fn);
+	{
+	  if (flag_lto)
+	DECL_PRESERVE_P (child_fn) = 1;
+	  vec_safe_push (offload_funcs, child_fn);
+	}
 
   bool need_asm = DECL_ASSEMBLER_NAME_SET_P (current_function_decl)
 		  && !DECL_ASSEMBLER_NAME_SET_P (child_fn);
diff --git a/libgomp/testsuite/libgomp.c-c++-common/pr83046.c b/libgomp/testsuite/libgomp.c-c++-common/pr83046.c
new file mode 100644
index 000..90dcb70
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c-c++-common/pr83046.c
@@ -0,0 +1,25 @@
+/* { dg-do link } */
+
+#define N 100
+
+int
+main ()
+{
+  int a[N];
+  int i, x;
+  int c;
+
+  c = 1;
+#pragma omp target
+  for (i = 0; i < 100; i++)

[RFC] Statistics of new ASAN tests: -fsanitize=pointer-{compare,subtract}

2017-12-29 Thread Martin Liška
Hello.

AS you've probably already noted, me and Jakub installed patches for the new 
ASAN feature.
It does Address sanitization of pointer comparison (or subtraction) and finds 
situations where
there pointers point to a different object. And as memory layout is a 
randomness, that can
lead to a potential problem.

I'm attaching a new patch that introduces a new configuration. For now, it logs 
all errors
to a tmp folder. It's not expected this will finish without errors (at least 
for now).
Using the config, bootstrap finishes, but takes couple of hours. It's mainly 
caused by
slow comparison of pointers to global variables. It's quite known limitation 
and I have
an idea how to fix that.

Statistics are attached. I'm going to return to that and identify if it's really
a potential issue.

Martin
$ grep SUMMARY /tmp/gcc-asan-logs.* | sed 's/.*SUMMARY//' | sort | uniq -c | 
sort -n
  1 : AddressSanitizer: invalid-pointer-pair 
/home/marxin/gcc/gcc/cp/call.c:4574 in build_op_call_1
  2 : AddressSanitizer: invalid-pointer-pair 
/home/marxin/gcc/gcc/dse.c:2800 in dse_step2_init
  2 : AddressSanitizer: invalid-pointer-pair 
/home/marxin/gcc/gcc/genattrtab.c:351 in attr_hash_add_rtx
  3 : AddressSanitizer: invalid-pointer-pair 
/dev/shm/objdir/libiberty/conftest.c:118 in find_stack_direction
  3 : AddressSanitizer: invalid-pointer-pair 
/home/marxin/gcc/gcc/cp/call.c:10877 in initialize_reference(tree_node*, 
tree_node*, int, int)
  3 : AddressSanitizer: invalid-pointer-pair 
/home/marxin/gcc/gcc/dse.c:2797 in dse_step2_init
  4 : AddressSanitizer: invalid-pointer-pair 
/home/marxin/gcc/libcpp/directives.c:2669 in _cpp_pop_buffer
  7 : AddressSanitizer: invalid-pointer-pair 
/home/marxin/gcc/gcc/cp/call.c:10624 in 
perform_direct_initialization_if_possible(tree_node*, tree_node*, bool, int)
  8 : AddressSanitizer: invalid-pointer-pair 
/home/marxin/gcc/gcc/cp/parser.c:12435 in cp_parser_declaration_statement
  9 : AddressSanitizer: invalid-pointer-pair 
/home/marxin/gcc/gcc/cp/call.c:10471 in can_convert_arg(tree_node*, tree_node*, 
tree_node*, int, int)
 28 : AddressSanitizer: invalid-pointer-pair 
/home/marxin/gcc/libcpp/line-map.c:234 in get_combined_adhoc_loc(line_maps*, 
unsigned int, source_range, void*)
 38 : AddressSanitizer: invalid-pointer-pair 
/home/marxin/gcc/libiberty/regex.c:3109 in byte_regex_compile
 65 : AddressSanitizer: invalid-pointer-pair 
/home/marxin/gcc/gcc/c/c-parser.c:1525 in c_parser_translation_unit
 71 : AddressSanitizer: invalid-pointer-pair 
/home/marxin/gcc/gcc/cp/call.c:10491 in can_convert_arg_bad(tree_node*, 
tree_node*, tree_node*, int, int)
 78 : AddressSanitizer: invalid-pointer-pair 
/home/marxin/gcc/gcc/cp/call.c:4115 in 
build_converted_constant_expr(tree_node*, tree_node*, int)
125 : AddressSanitizer: invalid-pointer-pair 
/home/marxin/gcc/gcc/cp/call.c:10551 in 
perform_implicit_conversion_flags(tree_node*, tree_node*, int, int)
134 : AddressSanitizer: invalid-pointer-pair /home/marxin/gcc/gcc/cfg.c:707 
in free_aux_for_edges()
162 : AddressSanitizer: invalid-pointer-pair /usr/include/bits/stdio.h:100 
in putc_unlocked
177 : AddressSanitizer: invalid-pointer-pair 
/home/marxin/gcc/gcc/cp/call.c:5954 in build_new_op_1
250 : AddressSanitizer: invalid-pointer-pair 
/home/marxin/gcc/gcc/cp/call.c:4311 in build_new_function_call(tree_node*, 
vec**, int)
261 : AddressSanitizer: invalid-pointer-pair 
/home/marxin/gcc/libcpp/directives.c:2149 in do_endif
288 : AddressSanitizer: invalid-pointer-pair /home/marxin/gcc/gcc/lcm.c:572 
in compute_available(simple_bitmap_def**, simple_bitmap_def**, 
simple_bitmap_def**, simple_bitmap_def**)
645 : AddressSanitizer: invalid-pointer-pair 
/home/marxin/gcc/gcc/cp/call.c:9305 in build_new_method_call_1
832 : AddressSanitizer: invalid-pointer-pair 
/home/marxin/gcc/gcc/genautomata.c:5076 in store_alt_unit_usage
895 : AddressSanitizer: invalid-pointer-pair /home/marxin/gcc/gcc/lcm.c:542 
in compute_available(simple_bitmap_def**, simple_bitmap_def**, 
simple_bitmap_def**, simple_bitmap_def**)
   1067 : AddressSanitizer: invalid-pointer-pair 
/home/marxin/gcc/gcc/cp/parser.c:12739 in cp_parser_declaration
   2724 : AddressSanitizer: invalid-pointer-pair 
/home/marxin/gcc/gcc/gcc.c:8431 in lookup_compiler
   2724 : AddressSanitizer: invalid-pointer-pair 
/home/marxin/gcc/gcc/gcc.c:8462 in lookup_compiler
  16369 : AddressSanitizer: invalid-pointer-pair /home/marxin/gcc/gcc/cfg.c:631 
in free_aux_for_blocks()
  69260 : AddressSanitizer: invalid-pointer-pair 
/home/marxin/gcc/libiberty/obstack.c:277 in _obstack_free

# 1
SUMMARY: AddressSanitizer: invalid-pointer-pair /home/marxin/gcc/gcc/gcc.c:8431 
in lookup_compiler
=
==100701==ERROR: AddressSanitizer: invalid-pointer-pair: 0x61d00060 
0x61d00080
#0 0x511fe9 in lookup_compiler /home/marxin/gcc/gcc/gcc.c:8462
#1 0x5

[PATCH] PR 78534 Change character length from int to size_t

2017-12-29 Thread Janne Blomqvist
In order to handle large character lengths on (L)LP64 targets, switch
the GFortran character length from an int to a size_t.

This is an ABI change, as procedures with character arguments take
hidden arguments with the character length.

I also changed the _size member in vtables from int to size_t, as
there were some cases where character lengths and sizes were
apparently mixed up and caused regressions otherwise. Although I
haven't tested, this might enable very large derived types as well.

Also, as there are some places in the frontend were negative character
lengths are used as special flag values, in the frontend the character
length is handled as a signed variable of the same size as a size_t,
although in the runtime library it really is size_t.

I haven't changed the character length variables for the co-array
intrinsics, as this is something that may need to be synchronized with
OpenCoarrays.

This is v5 of the patch. v4 was applied but caused breakage on big
endian targets. These have been fixed and tested, thanks to access to
the GCC compile farm.

Overview of v4 of the patch: v3 was applied but had to reverted due to
breaking bootstrap. The fix is in resolve.c:resolve_charlen, where
it's necessary to check that an expression is constant before using
mpz_sgn.

Overview of v3 of the patch: All the issues pointed out by FX's review
of v2 have been fixed. In particular, there are now new functions
gfc_mpz_get_hwi and gfc_mpz_set_hwi, similar to the GMP functions
mpz_get_si and mpz_set_si, except that they get/set a HOST_WIDE_INT
instead of a long value. Similarly, gfc_get_int_expr now takes a
HOST_WIDE_INT instead of a long, gfc_extract_long is replaced by
gfc_extract_hwi. Also, the preliminary work to handle
gfc_charlen_type_node being unsigned has been removed.

Regtested on x86_64-pc-linux-gnu, i686-pc-linux-gnu and
powerpc64-unknown-linux-gnu. Also regtested all three targets by
modifying gfortran-dg.exp to also test with "-g -flto", no new
failures observed.

frontend:

2017-12-29  Janne Blomqvist  

PR fortran/78534
PR fortran/66310
* array.c (got_charlen): Use gfc_charlen_int_kind.
* class.c (gfc_find_derived_vtab): Use gfc_size_kind instead of
hardcoded kind.
(find_intrinsic_vtab): Likewise.
* decl.c (match_char_length): Use gfc_charlen_int_kind.
(add_init_expr_to_sym): Use gfc_charlen_t and gfc_charlen_int_kind.
(gfc_match_implicit): Use gfc_charlen_int_kind.
* dump-parse-tree.c (show_char_const): Use gfc_charlen_t and size_t.
(show_expr): Use HOST_WIDE_INT_PRINT_DEC.
* expr.c (gfc_get_character_expr): Length parameter of type
gfc_charlen_t.
(gfc_get_int_expr): Value argument of type HOST_WIDE_INT.
(gfc_extract_hwi): New function.
(simplify_const_ref): Make string_len of type gfc_charlen_t.
(gfc_simplify_expr): Use HOST_WIDE_INT for substring refs.
* frontend-passes.c (optimize_trim): Use gfc_charlen_int_kind.
* gfortran.h (gfc_mpz_get_hwi): New prototype.
(gfc_mpz_set_hwi): Likewise.
(gfc_charlen_t): New typedef.
(gfc_expr): Use gfc_charlen_t for character lengths.
(gfc_size_kind): New extern variable.
(gfc_extract_hwi): New prototype.
(gfc_get_character_expr): Use gfc_charlen_t for character length.
(gfc_get_int_expr): Use HOST_WIDE_INT type for value argument.
* gfortran.texi: Update description of hidden string length argument.
* iresolve.c (check_charlen_present): Use gfc_charlen_int_kind.
(gfc_resolve_char_achar): Likewise.
(gfc_resolve_repeat): Pass string length directly without
temporary, use gfc_charlen_int_kind.
(gfc_resolve_transfer): Use gfc_charlen_int_kind.
* match.c (select_intrinsic_set_tmp): Use HOST_WIDE_INT for charlen.
* misc.c (gfc_mpz_get_hwi): New function.
(gfc_mpz_set_hwi): New function.
* module.c (atom_int): Change type from int to HOST_WIDE_INT.
(parse_integer): Don't complain about large integers.
(write_atom): Use HOST_WIDE_INT for integers.
(mio_integer): Handle integer type mismatch.
(mio_hwi): New function.
(mio_intrinsic_op): Use HOST_WIDE_INT.
(mio_array_ref): Likewise.
(mio_expr): Likewise.
* primary.c (match_substring): Use gfc_charlen_int_kind.
* resolve.c (resolve_substring_charlen): Use gfc_charlen_int_kind.
(resolve_character_operator): Likewise.
(resolve_assoc_var): Likewise.
(resolve_select_type): Use HOST_WIDE_INT for charlen, use snprintf.
(resolve_charlen): Use mpz_sgn to determine sign.
* simplify.c (gfc_simplify_repeat): Use HOST_WIDE_INT/gfc_charlen_t
instead of long.
* symbol.c (generate_isocbinding_symbol): Use gfc_charlen_int_kind.
* target-memory.c (size_character): Length argument of type
gfc_charlen_t.

[RFC] Add vec::ordered_remove_if

2017-12-29 Thread Tom de Vries
[ was: Re: [libgomp, openacc, openmp, PR83046] Prune removed funcs from 
offload table ]


On 12/28/2017 05:06 PM, Jakub Jelinek wrote:

On Thu, Dec 28, 2017 at 04:53:29PM +0100, Tom de Vries wrote:

--- a/gcc/lto-cgraph.c
+++ b/gcc/lto-cgraph.c
@@ -,6 +,16 @@ output_offload_tables (void)
struct lto_simple_output_block *ob
  = lto_create_simple_output_block (LTO_section_offload_table);
  
+  for (unsigned i = 0; i < vec_safe_length (offload_funcs);)

+{
+  if (!cgraph_node::get ((*offload_funcs)[i]))
+   {
+ offload_funcs->ordered_remove (i);
+ continue;
+   }
+  i++;
+}



This has O(n^2) complexity for n == vec_safe_length (offload_funcs).
Can't you instead just have 2 IVs, one for where we read the vector elt and
one for where we write it if the 2 are different, then truncate the vector
if needed at the end?


I wonder if it makes sense to add this function to the vec interface.

Any comments?

Thanks,
- Tom
Add vec::ordered_remove_if

---
 gcc/vec.c | 21 +
 gcc/vec.h | 39 +++
 2 files changed, 60 insertions(+)

diff --git a/gcc/vec.c b/gcc/vec.c
index 9a80f34..91e2464 100644
--- a/gcc/vec.c
+++ b/gcc/vec.c
@@ -403,6 +403,26 @@ test_ordered_remove ()
   ASSERT_EQ (9, v.length ());
 }
 
+static bool
+remove_5_7_p (const int &a)
+{
+  return a == 5 || a == 7;
+}
+
+/* Verify that vec::ordered_remove_if works correctly.  */
+
+static void
+test_ordered_remove_if (void)
+{
+  auto_vec  v;
+  safe_push_range (v, 0, 10);
+  v.ordered_remove_if (remove_5_7_p);
+  ASSERT_EQ (4, v[4]);
+  ASSERT_EQ (6, v[5]);
+  ASSERT_EQ (8, v[6]);
+  ASSERT_EQ (8, v.length ());
+}
+
 /* Verify that vec::unordered_remove works correctly.  */
 
 static void
@@ -464,6 +484,7 @@ vec_c_tests ()
   test_pop ();
   test_safe_insert ();
   test_ordered_remove ();
+  test_ordered_remove_if ();
   test_unordered_remove ();
   test_block_remove ();
   test_qsort ();
diff --git a/gcc/vec.h b/gcc/vec.h
index f55a41f..857e4f4 100644
--- a/gcc/vec.h
+++ b/gcc/vec.h
@@ -571,6 +571,7 @@ public:
   void truncate (unsigned);
   void quick_insert (unsigned, const T &);
   void ordered_remove (unsigned);
+  void ordered_remove_if (bool (*) (const T &));
   void unordered_remove (unsigned);
   void block_remove (unsigned, unsigned);
   void qsort (int (*) (const void *, const void *));
@@ -1013,6 +1014,31 @@ vec::ordered_remove (unsigned ix)
 }
 
 
+/* Remove all elements from this vector for which REMOVE_ELEM_P (elem) holds.
+   Ordering of remaining elements is preserved.  This is an an O(N)
+   operation.  */
+
+template
+inline void
+vec::ordered_remove_if (bool (*remove_elem_p) (const T &))
+{
+  unsigned int n = length ();
+  unsigned int write_index = 0;
+  for (unsigned int read_index = 0; read_index < n; ++read_index)
+{
+  bool remove_p = remove_elem_p (read_index);
+  if (remove_p)
+	continue;
+
+  if (read_index != write_index)
+	m_vecdata[write_index] = m_vecdata[read_index];
+
+  write_index++;
+}
+  truncate (write_index);
+}
+
+
 /* Remove an element from the IXth position of this vector.  Ordering of
remaining elements is destroyed.  This is an O(1) operation.  */
 
@@ -1334,6 +1360,7 @@ public:
   void quick_insert (unsigned, const T &);
   void safe_insert (unsigned, const T & CXX_MEM_STAT_INFO);
   void ordered_remove (unsigned);
+  void ordered_remove_if (bool (*) (const T &));
   void unordered_remove (unsigned);
   void block_remove (unsigned, unsigned);
   void qsort (int (*) (const void *, const void *));
@@ -1779,6 +1806,18 @@ vec::ordered_remove (unsigned ix)
 }
 
 
+/* Remove all elements from this vector for which REMOVE_ELEM_P (elem) holds.
+   Ordering of remaining elements is preserved.  This is an an O(N)
+   operation.  */
+
+template
+inline void
+vec::ordered_remove_if (bool (*remove_elem_p) (const T &))
+{
+  m_vec->ordered_remove_if (remove_elem_p);
+}
+
+
 /* Remove an element from the IXth position of this vector.  Ordering
of remaining elements is destroyed.  This is an O(1) operation.  */
 


Re: a new libgcov interface: __gcov_dump_all

2017-12-29 Thread Martin Liška
On 10/26/2017 10:47 AM, Martin Liška wrote:
> On 07/22/2014 06:04 PM, Xinliang David Li wrote:
>> Please take a look the updated patch. It addresses the issue of using
>> dlclose before dump, and potential races (between a thread closing a
>> library and the dumper call).
>>
>> David
>>
>> On Sun, Jul 20, 2014 at 11:12 PM, Nathan Sidwell  wrote:
>>> On 07/20/14 21:38, Xinliang David Li wrote:

 The gcov_info chain is not duplicated -- there is already one chain
 (linking only modules of the library) per shared library in current
 implementation.  My change does not affect underlying behavior at all
 -- it merely introduces a new interface to access private dumper
 methods associated with shared libs.
>>>
>>>
>>> ah, got it.  thanks for clarifying.  Can't help thinking gcov_init should be
>>> doing this, and wondering about dlload/dlclose.  Let me think
>>>
>>> nathan
> 
> Hi.
> 
> Unfortunately, it looks the patch hasn't been installed to trunk. Some folks 
> from Firefox
> are interested in that.
> 
> Are you Nathan OK with the patch? I guess a rebase will be needed.
> 
> Martin
> 

Hi.

I would like to suspend the patch review as there's actually no call for the 
feature.
The person which sent me the link actually wanted to dump all counters. That 
can be
easily done via __gcov_dump interface we already have.

Martin


Re: [PATCH] Implement smart multiple switch expansion algorithms.

2017-12-29 Thread Martin Liška
On 10/06/2017 07:24 PM, David Malcolm wrote:
> On Fri, 2017-10-06 at 14:25 +0200, Martin Liška wrote:
>> Hello.
>>
>> As presented at this year's Cauldron, I rewrote current switch
>> expansion to support
>> multiple algorithms (jump table and bit-test) that can be used just
>> for a fraction
>> of cases. Balanced decision tree is built on top of that. I decided
>> to do a bigger
>> refactoring and put all there 3 mentioned algorithm to its own class.
>>
>> There's a bigger change in jump_table_cluster::can_be_handled where
>> the constant 10 (3 respectively)
>> is compared to number of handled values, and not number of cases.
>> Later one is wrong in my opinion.
>>
>> There are some numbers for cc1plus:
>>
>> $ bloaty ./objdir2/gcc/cc1plus -- ./objdir/gcc/cc1plus
>>  VM SIZE  FILE SIZE
>>  ++ GROWING++
>>+19% +1.32Mi .rodata+1.32Mi   +19%
>>   [ = ]   0 .symtab+7.38Ki  +0.5%
>>   [ = ]   0 .strtab+5.18Ki  +0.2%
>>   [ = ]   0 .debug_info   +743  +0.0%
>>   +0.0%+712 .eh_frame +712  +0.0%
>>   +0.1%+440 .eh_frame_hdr +440  +0.1%
>>   [ = ]   0 .debug_aranges +80  +0.1%
>>   +0.0% +67 .dynstr+67  +0.0%
>>   [ = ]   0 .debug_str  +6  +0.0%
>>
>>  -- SHRINKING  --
>>   -1.2%  -214Ki .text   -214Ki  -1.2%
>>   [ = ]   0 .debug_loc -66.7Ki  -0.1%
>>   [ = ]   0 .debug_line-14.0Ki  -0.2%
>>   [ = ]   0 .debug_ranges  -9.56Ki  -0.1%
>>   -6.8%  -3 [Unmapped]-375 -14.4%
>>   [ = ]   0 .debug_abbrev  -46  -0.0%
>>
>>   +3.8% +1.11Mi TOTAL  +1.03Mi  +0.5%
>>
>> So it growth and can be easily explained:
>>
>> insn-attrtab.o:
>>  VM SIZE  FILE SIZE
>>  ++ GROWING++
>>   [ = ]   0 .rela.rodata   +2.00Mi  +215%
>>   +214%  +682Ki .rodata +682Ki  +214%
>>   [ = ]   0 .rela.debug_loc+29.3Ki   +36%
>>+32% +1.91Ki .eh_frame  +1.91Ki   +32%
>>   [ = ]   0 .debug_loc +37  +5.6%
>>   [ = ]   0 .debug_str  +2  +0.0%
>>
>>  -- SHRINKING  --
>>  -50.1% -63.3Ki .text  -63.3Ki -50.1%
>>   [ = ]   0 .debug_line-1.71Ki -10.4%
>>   [ = ]   0 .rela.debug_info  -768  -0.3%
>>   [ = ]   0 .rela.text-624  -0.8%
>>   [ = ]   0 .rela.debug_ranges-384  -2.7%
>>   [ = ]   0 .debug_info-87  -0.2%
>>   [ = ]   0 [Unmapped]  -2  -8.7%
>>
>>   +137%  +621Ki TOTAL  +2.63Mi  +139%
>>
>> It's caused by:
>> ...
>> ;; GIMPLE switch case clusters: JT(2710):-1-5090 
>> ;; GIMPLE switch case clusters: JT(2710):-1-5090 
>> ;; GIMPLE switch case clusters: JT(2967):-1-5090 
>> ;; GIMPLE switch case clusters: JT(1033):-1-5017 
>> ...
>>
>> so there are many switch statements with very reasonable density.
>>
>> and
>> insn-dfatab.o:+13% +99.4Ki TOTAL   +455Ki   +14%
>> insn-latencytab.o:   +19%  +136Ki TOTAL   +609Ki   +20%
>>
>> There shouldn't be any fallout other from what I mentioned in
>> previous email that is
>> a prerequisite for this patch.
>>
>> Patch can bootstrap on ppc64le-redhat-linux and survives regression
>> tests.
>>
>> I'm still planning to come with some numbers and stats. Will do that
>> next week.
>>
>> Martin
> 
> 
>> gcc/ChangeLog:
>>
>> 2017-09-29  Martin Liska  
>>
>>  * tree-switch-conversion.c (MAX_CASE_BIT_TESTS): Remove and move
>>  to header file.
>>  (hoist_edge_and_branch_if_true): Move to ...
>>  (bit_test_cluster::hoist_edge_and_branch_if_true): ... this.
> 
> I'm not a reviewer, but there's a lot of "churn" in the patch due to
> things moving around; there seems to be a mixture of things simply
> moving around, functions becoming methods of classes, and some
> things changing.
> 
> Would it be helpful to split the patch into two: a patch that moves
> the functions to where they'll be needed, and then a patch to do the
> "actual" (or functional) changes?

Hello.

Sorry for late answer. I fully agree what you wrote David. It would make
the patch more easily readable and reviewable. Let me postpone it to next
stage1, patch separation into 2 pieces will need some effort.

> 
> [...snip...]
> 
>> diff --git a/gcc/tree-switch-conversion.h b/gcc/tree-switch-conversion.h
>> new file mode 100644
>> index 000..45ae11f408d
>> --- /dev/null
>> +++ b/gcc/tree-switch-conversion.h
>> @@ -0,0 +1,806 @@
>> +/* Tree switch conversion for GNU compiler.
>> +   Copyright (C) 2017 Free Software Foundation, Inc.
> 
> Should this be "gimple switch conversion" and
> gimple-ssa-switch-conversion.h?
> (but presumably this is to mirror tree-switch-conversion.c, and
> I don't want to advocate for unnecessary churn)
> 
>> +This file is part of GCC.
>> +
>> +GCC is free software; you can redistribute it a

Re: [PATCH] Add selftest for vec::reverse

2017-12-29 Thread Martin Liška
On 10/06/2017 06:20 PM, David Malcolm wrote:
> Martin: I noticed that your switch expansion patch added a
>   vec::reverse ()
> method.  Here's a proposed selftest for it, mostly to verify
> that it handles even vs odd lengths (which it does).
> 
> Only lightly tested; hope this is useful.
> Dave

Hi.

I like it! Do you want to commit it before there will be
a consumer of the function?

Martin

> 
> gcc/ChangeLog:
>   * vec.c (selftest::test_reverse): New function.
>   (selftest::vec_c_tests): Call it.
> ---
>  gcc/vec.c | 38 ++
>  1 file changed, 38 insertions(+)
> 
> diff --git a/gcc/vec.c b/gcc/vec.c
> index d612703..5d70973 100644
> --- a/gcc/vec.c
> +++ b/gcc/vec.c
> @@ -359,6 +359,43 @@ test_qsort ()
>ASSERT_EQ (10, v.length ());
>  }
>  
> +/* Verify that vec::reverse works correctly.  */
> +
> +static void
> +test_reverse ()
> +{
> +  /* Reversing an empty vec ought to be a no-op.  */
> +  {
> +auto_vec  v;
> +ASSERT_EQ (0, v.length ());
> +v.reverse ();
> +ASSERT_EQ (0, v.length ());
> +  }
> +
> +  /* Verify reversing a vec with even length.  */
> +  {
> +auto_vec  v;
> +safe_push_range (v, 0, 4);
> +v.reverse ();
> +ASSERT_EQ (3, v[0]);
> +ASSERT_EQ (2, v[1]);
> +ASSERT_EQ (1, v[2]);
> +ASSERT_EQ (0, v[3]);
> +ASSERT_EQ (4, v.length ());
> +  }
> +
> +  /* Verify reversing a vec with odd length.  */
> +  {
> +auto_vec  v;
> +safe_push_range (v, 0, 3);
> +v.reverse ();
> +ASSERT_EQ (2, v[0]);
> +ASSERT_EQ (1, v[1]);
> +ASSERT_EQ (0, v[2]);
> +ASSERT_EQ (3, v.length ());
> +  }
> +}
> +
>  /* Run all of the selftests within this file.  */
>  
>  void
> @@ -374,6 +411,7 @@ vec_c_tests ()
>test_unordered_remove ();
>test_block_remove ();
>test_qsort ();
> +  test_reverse ();
>  }
>  
>  } // namespace selftest
> 



Re: [PATCH] Implement smart multiple switch expansion algorithms.

2017-12-29 Thread Martin Liška
On 10/06/2017 03:46 PM, Wilco Dijkstra wrote:
> Martin Liska wrote:
> 
>> There are some numbers for cc1plus:
>>
>> $ bloaty ./objdir2/gcc/cc1plus -- ./objdir/gcc/cc1plus
>> VM SIZE  FILE SIZE
>>   +3.8% +1.11Mi TOTAL  +1.03Mi  +0.5%
> 
>> insn-attrtab.o:
>> VM SIZE  FILE SIZE
>>   +214%  +682Ki .rodata +682Ki  +214%
>>  -50.1% -63.3Ki .text  -63.3Ki -50.1%
> 
> So is that a 3.8% codesize increase or decrease? If an increase,

Hi.

Yes, increase.

> I can't see how replacing 63KB of instructions with 682KB of data
> is a good tradeoff... There should be an accurate calculation
> of the density, taking the switch table width into account (really small
> tables can use 1-byte offsets, large tables are typically forced to
> use 4-byte offsets). This may need new target callbacks - I changed
> PARAM_CASE_VALUES_THRESHOLD on AArch64 to get smaller
> code and better performance since the current density calculations
> are hardcoded and quite wrong for big tables...

Let me return to that in next stage1. Can you please provide more details about
how it affects AArach64 target?

> 
> Also what is the codesize difference on SPEC2006/2017? I don't see
> any mention of performance impact either...

Will prepare that.

Thank you,
Martin

> 
> Wilco
> 



Re: debug container mutex association

2017-12-29 Thread Andreas Schwab
On Sep 19 2016, François Dumont  wrote:

> diff --git 
> a/libstdc++-v3/testsuite/23_containers/vector/debug/mutex_association.cc 
> b/libstdc++-v3/testsuite/23_containers/vector/debug/mutex_association.cc
> new file mode 100644
> index 000..a3c56e2
> --- /dev/null
> +++ b/libstdc++-v3/testsuite/23_containers/vector/debug/mutex_association.cc
> @@ -0,0 +1,42 @@
> +// Copyright (C) 2016 Free Software Foundation, Inc.
> +//
> +// This file is part of the GNU ISO C++ Library.  This library is free
> +// software; you can redistribute it and/or modify it under the
> +// terms of the GNU General Public License as published by the
> +// Free Software Foundation; either version 3, or (at your option)
> +// any later version.
> +//
> +// This library is distributed in the hope that it will be useful,
> +// but WITHOUT ANY WARRANTY; without even the implied warranty of
> +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +// GNU General Public License for more details.
> +//
> +// You should have received a copy of the GNU General Public License along
> +// with this library; see the file COPYING3.  If not see
> +// .
> +//
> +
> +#include 
> +#include 
> +
> +#include 
> +
> +class container : public __gnu_debug::_Safe_sequence
> +{
> +public:
> +  __gnu_cxx::__mutex&
> +  get_mutex()
> +  { return this->_M_get_mutex(); }
> +};
> +
> +int
> +main()
> +{
> +  std::set<__gnu_cxx::__mutex*> mutexes;
> +  container conts[17];
> +
> +  for (int i = 0; i != 16; ++i)
> +VERIFY( mutexes.insert(&conts[i].get_mutex()).second );

There will be less than 16 unique mutexes, if sizeof(container) has more
trailing zero bits than alignof(__gnu_debug::vector).

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


[PATCH] Use pointer sized array indices.

2017-12-29 Thread Janne Blomqvist
Using pointer sized variables (e.g. size_t / ptrdiff_t) when the
variables are used as array indices allows accessing larger arrays,
and can be a slight performance improvement due to no need for sign or
zero extending, or masking.

Regtested on x86_64-pc-linux-gnu, Ok for trunk?

libgfortran/ChangeLog:

2017-12-29  Janne Blomqvist  

* generated/cshift1_16.c (cshift1): Regenerated.
* generated/cshift1_4.c (cshift1): Regenerated.
* generated/cshift1_8.c (cshift1): Regenerated.
* generated/eoshift1_16.c (eoshift1): Regenerated.
* generated/eoshift1_4.c (eoshift1): Regenerated.
* generated/eoshift1_8.c (eoshift1): Regenerated.
* generated/eoshift3_16.c (eoshift3): Regenerated.
* generated/eoshift3_4.c (eoshift3): Regenerated.
* generated/eoshift3_8.c (eoshift3): Regenerated.
* generated/in_pack_c10.c (internal_pack_c10): Regenerated.
* generated/in_pack_c16.c (internal_pack_c16): Regenerated.
* generated/in_pack_c4.c (internal_pack_c4): Regenerated.
* generated/in_pack_c8.c (internal_pack_c8): Regenerated.
* generated/in_pack_i1.c (internal_pack_1): Regenerated.
* generated/in_pack_i16.c (internal_pack_16): Regenerated.
* generated/in_pack_i2.c (internal_pack_2): Regenerated.
* generated/in_pack_i4.c (internal_pack_4): Regenerated.
* generated/in_pack_i8.c (internal_pack_8): Regenerated.
* generated/in_pack_r10.c (internal_pack_r10): Regenerated.
* generated/in_pack_r16.c (internal_pack_r16): Regenerated.
* generated/in_pack_r4.c (internal_pack_r4): Regenerated.
* generated/in_pack_r8.c (internal_pack_r8): Regenerated.
* generated/in_unpack_c10.c (internal_unpack_c10): Regenerated.
* generated/in_unpack_c16.c (internal_unpack_c16): Regenerated.
* generated/in_unpack_c4.c (internal_unpack_c4): Regenerated.
* generated/in_unpack_c8.c (internal_unpack_c8): Regenerated.
* generated/in_unpack_i1.c (internal_unpack_1): Regenerated.
* generated/in_unpack_i16.c (internal_unpack_16): Regenerated.
* generated/in_unpack_i2.c (internal_unpack_2): Regenerated.
* generated/in_unpack_i4.c (internal_unpack_4): Regenerated.
* generated/in_unpack_i8.c (internal_unpack_8): Regenerated.
* generated/in_unpack_r10.c (internal_unpack_r10): Regenerated.
* generated/in_unpack_r16.c (internal_unpack_r16): Regenerated.
* generated/in_unpack_r4.c (internal_unpack_r4): Regenerated.
* generated/in_unpack_r8.c (internal_unpack_r8): Regenerated.
* generated/reshape_c10.c (reshape_c10): Regenerated.
* generated/reshape_c16.c (reshape_c16): Regenerated.
* generated/reshape_c4.c (reshape_c4): Regenerated.
* generated/reshape_c8.c (reshape_c8): Regenerated.
* generated/reshape_i16.c (reshape_16): Regenerated.
* generated/reshape_i4.c (reshape_4): Regenerated.
* generated/reshape_i8.c (reshape_8): Regenerated.
* generated/reshape_r10.c (reshape_r10): Regenerated.
* generated/reshape_r16.c (reshape_r16): Regenerated.
* generated/reshape_r4.c (reshape_r4): Regenerated.
* generated/reshape_r8.c (reshape_r8): Regenerated.
* generated/shape_i1.c (shape_1): Regenerated.
* generated/shape_i16.c (shape_16): Regenerated.
* generated/shape_i2.c (shape_2): Regenerated.
* generated/shape_i4.c (shape_4): Regenerated.
* generated/shape_i8.c (shape_8): Regenerated.
* generated/spread_c10.c (spread_scalar_c10): Regenerated.
* generated/spread_c16.c (spread_scalar_c16): Regenerated.
* generated/spread_c4.c (spread_scalar_c4): Regenerated.
* generated/spread_c8.c (spread_scalar_c8): Regenerated.
* generated/spread_i1.c (spread_scalar_i1): Regenerated.
* generated/spread_i16.c (spread_scalar_i16): Regenerated.
* generated/spread_i2.c (spread_scalar_i2): Regenerated.
* generated/spread_i4.c (spread_scalar_i4): Regenerated.
* generated/spread_i8.c (spread_scalar_i8): Regenerated.
* generated/spread_r10.c (spread_scalar_r10): Regenerated.
* generated/spread_r16.c (spread_scalar_r16): Regenerated.
* generated/spread_r4.c (spread_scalar_r4): Regenerated.
* generated/spread_r8.c (spread_scalar_r8): Regenerated.
* intrinsics/random.c (jump): Use size_t for array index in loop.
(getosrandom): Likewise.
(arandom_r4): Make n an index_type.
(arandom_r8): Likewise.
(arandom_r10): Likewise.
(arandom_r16): Likewise.
(scramble_seed): Use size_t for array index in loop.
* m4/cshift1.m4: Make i an index_type.
* m4/eoshift1.m4: Likewise.
* m4/eoshift3.m4: Likewise.
* m4/in_pack.m4: Make n an index_type.
* m4/in_unpack.m4: Likewise.
* m4/reshape.m4: Make n and dim index_type's.

[v2 of PATCH 03/14] C++: add location_t wrapper nodes during parsing (minimal impl)

2017-12-29 Thread David Malcolm
On Fri, 2017-12-15 at 13:58 -0500, Jason Merrill wrote:
> On Fri, Dec 15, 2017 at 11:35 AM, David Malcolm 
> wrote:
> > On Fri, 2017-12-15 at 10:01 -0500, Jason Merrill wrote:
> > > On Thu, Dec 14, 2017 at 2:25 PM, David Malcolm  > > om>
> > > wrote:
> > > > On Mon, 2017-12-11 at 21:10 -0500, Jason Merrill wrote:
> > > > > On 11/10/2017 04:45 PM, David Malcolm wrote:
> > > > > > The initial version of the patch kit added location wrapper
> > > > > > nodes
> > > > > > around constants and uses-of-declarations, along with some
> > > > > > other
> > > > > > places in the parser (typeid, alignof, sizeof, offsetof).
> > > > > > 
> > > > > > This version takes a much more minimal approach: it only
> > > > > > adds
> > > > > > location wrapper nodes around the arguments at callsites,
> > > > > > thus
> > > > > > not adding wrapper nodes around uses of constants and decls
> > > > > > in
> > > > > > other
> > > > > > locations.
> > > > > > 
> > > > > > It keeps them for the other places in the parser (typeid,
> > > > > > alignof,
> > > > > > sizeof, offsetof).
> > > > > > 
> > > > > > In addition, for now, each site that adds wrapper nodes is
> > > > > > guarded
> > > > > > with !processing_template_decl, suppressing the creation of
> > > > > > wrapper
> > > > > > nodes when processing template declarations.  This is to
> > > > > > simplify
> > > > > > the patch kit so that we don't have to support wrapper
> > > > > > nodes
> > > > > > during
> > > > > > template expansion.
> > > > > 
> > > > > Hmm, it should be easy to support them, since NON_LVALUE_EXPR
> > > > > and
> > > > > VIEW_CONVERT_EXPR don't otherwise appear in template trees.
> > > > > 
> > > > > Jason
> > > > 
> > > > I don't know if it's "easy"; it's at least non-trivial.
> > > > 
> > > > I attempted to support them in the obvious way by adding the
> > > > two
> > > > codes
> > > > to the switch statement tsubst_copy, reusing the case used by
> > > > NOP_EXPR
> > > > and others, but ran into a issue when dealing with template
> > > > parameter
> > > > packs.
> > > > Attached is the reproducer I've been testing with (minimized
> > > > using
> > > > "delta" from a stdlib reproducer); my code was failing with:
> > > > 
> > > > ../../src/cp-stdlib.ii: In instantiation of ‘struct
> > > > allocator_traits >’:
> > > > ../../src/cp-stdlib.ii:31:8:   required from ‘struct
> > > > __alloc_traits, char>’
> > > > ../../src/cp-stdlib.ii:43:75:   required from ‘class
> > > > basic_string >’
> > > > ../../src/cp-stdlib.ii:47:58:   required from here
> > > > ../../src/cp-stdlib.ii:27:55: sorry, unimplemented: use of
> > > > ‘type_pack_expansion’ in template
> > > >  -> decltype(_S_construct(__a, __p,
> > > > forward<_Args>(__args)...))  {   }
> > > >^~
> > > > 
> > > > The issue is that normally "__args" would be a PARM_DECL of
> > > > type
> > > > TYPE_PACK_EXPANSION, and that's handled by tsubst_decl, but on
> > > > adding a
> > > > wrapper node we now have a VIEW_CONVERT_EXPR of the same type
> > > > i.e.
> > > > TYPE_PACK_EXPANSION wrapping the PARM_DECL.
> > > > 
> > > > When tsubst traverses the tree, the VIEW_CONVERT_EXPR is
> > > > reached
> > > > first,
> > > > and it attempts to substitute the type TYPE_PACK_EXPANSION,
> > > > which
> > > > leads
> > > > to the "sorry".
> > > > 
> > > > If I understand things right, during substitution, only
> > > > tsubst_decl
> > > > on
> > > > PARM_DECL can handle nodes with type with code
> > > > TYPE_PACK_EXPANSION.
> > > > 
> > > > The simplest approach seems to be to not create wrapper nodes
> > > > for
> > > > decls
> > > > of type TYPE_PACK_EXPANSION, and that seems to fix the issue.
> > > 
> > > That does seem simplest.
> > > 
> > > > Alternatively I can handle TYPE_PACK_EXPANSION for
> > > > VIEW_CONVERT_EXPR in
> > > > tsubst by remapping the type to that of what they wrap after
> > > > substitution; doing so also fixes the issue.
> > > 
> > > This will be more correct.  For the wrappers you don't need all
> > > the
> > > handling that we currently have for NOP_EXPR and such; since we
> > > know
> > > they don't change the type, we can substitute what they wrap, and
> > > then
> > > rewrap the result.
> > 
> > (nods; I have this working)
> > 
> > I've been debugging the other issues that I ran into when removing
> > the
> > "!processing_template_decl" filter on making wrapper nodes (ICEs
> > and
> > other errors on valid code).  They turn out to relate to wrappers
> > around decls of type TEMPLATE_TYPE_PARM; having these wrappers
> > leads to
> > such VIEW_CONVERT_EXPRs turning up in unexpected places.
> 
> Hmm, that's odd.  What kind of decls?  A variable which happens to
> have a template parameter for a type shouldn't be a problem.

Sorry.  This turned out to be a bug in my implementation of tsubst_*.

> > I could try to track all those places down, but it seems much
> > simpler
> > to just add an exclusion to adding wrapper nodes around decls of

Re: [PATCH] Add selftest for vec::reverse

2017-12-29 Thread David Malcolm
On Fri, 2017-12-29 at 15:39 +0100, Martin Liška wrote:
> On 10/06/2017 06:20 PM, David Malcolm wrote:
> > Martin: I noticed that your switch expansion patch added a
> >   vec::reverse ()
> > method.  Here's a proposed selftest for it, mostly to verify
> > that it handles even vs odd lengths (which it does).
> > 
> > Only lightly tested; hope this is useful.
> > Dave
> 
> Hi.
> 
> I like it! Do you want to commit it before there will be
> a consumer of the function?
> 
> Martin

Looks like vec::reverse isn't in trunk yet.  I don't have a strong
opinion on whether vec::reverse should be added without a "real"
consumer, or added when the consumer is added, but feel free to add my
selftest patch into whichever patch adds vec::reverse.


Re: [PATCH] Use pointer sized array indices.

2017-12-29 Thread Thomas Koenig

Hi Janne,


Using pointer sized variables (e.g. size_t / ptrdiff_t) when the
variables are used as array indices allows accessing larger arrays,
and can be a slight performance improvement due to no need for sign or
zero extending, or masking.


Unless I have missed something, all the examples are for cases where
the array is of maximum size GFC_MAX_DIMENSIONS.  This is why they
were left as int in the first place (because it is unlikely we will have
arrays of more than 2^31-1 dimensions soon :-).

Do you really think this change is necessary? If not, I'd rather avoid
such a change.

Regards

Thomas


Re: [patch, lingfortran] Bug 83560 - list-directed formatting of INTEGER is missing plus on output

2017-12-29 Thread Thomas Koenig

Hi Jerry,


OK for trunk?


Looks good, thanks for the patch!

Looking at the number of test cases, namelist has to be one of the
most complicated things in all of Fortran...

Regards

Thomas


Re: [PATCH] Use pointer sized array indices.

2017-12-29 Thread Janne Blomqvist
On Fri, Dec 29, 2017 at 7:17 PM, Thomas Koenig  wrote:
> Hi Janne,
>
>> Using pointer sized variables (e.g. size_t / ptrdiff_t) when the
>> variables are used as array indices allows accessing larger arrays,
>> and can be a slight performance improvement due to no need for sign or
>> zero extending, or masking.
>
>
> Unless I have missed something, all the examples are for cases where
> the array is of maximum size GFC_MAX_DIMENSIONS.

Many, but not all.

> This is why they
> were left as int in the first place (because it is unlikely we will have
> arrays of more than 2^31-1 dimensions soon :-).
>
> Do you really think this change is necessary? If not, I'd rather avoid
> such a change.

I'm not planning on supporting > 2**31-1 dimensions, no. :)

But even if we know that the maximum value is always going to be
smaller, by using pointer-sized variables the compiler can generate
slightly more efficient code.

See e.g. https://godbolt.org/g/oAvw5L ; in the functions with a loop,
the ones which use pointer-sized indices have shorter preambles as
well as loop bodies. And for the simple functions that just index an
array, by using pointer-sized indices there is no need to zero the
upper half of the registers.

I mean, it's not a huge improvement, but it might be a tiny one in some cases.

Also, by moving the induction variable from the beginning of the
function into the loop header, it makes it easier for both readers and
the compiler to see the scope of the variable.

-- 
Janne Blomqvist


[patch, committed] Bug 83613 - [8 Regression] Executing gfortran.dg/inquire_internal.f90 hangs on darwin

2017-12-29 Thread Jerry DeLisle
Committed as obvious after regression testing on x85_64-pc-linux-gnu.

Dominiq: Please confirm is fixed.

Regards,

Jerry

Author: jvdelisle
Date: Fri Dec 29 22:36:25 2017
New Revision: 256035

URL: https://gcc.gnu.org/viewcvs?rev=256035&root=gcc&view=rev
Log:
2017-12-29  Jerry DeLisle  

PR libgfortran/83613
* io/unit.c (init_units): Don't forget to unlock the unit locks
after being inserted.

Modified:
trunk/libgfortran/ChangeLog
trunk/libgfortran/io/unit.c