[PATCH,committed] [MAINTAINERS] Update email address

2018-08-06 Thread Hurugalawadi, Naveen
Hi,

Updating my email address in the MAINTAINERS file.

Thanks,
Naveen
Index: ChangeLog
===
--- ChangeLog	(revision 263324)
+++ ChangeLog	(working copy)
@@ -1,3 +1,7 @@
+2018-08-06  Naveen H.S  
+
+	* MAINTAINERS: Update my email address.
+
 2018-07-19  DJ Delorie  
 
 	* MAINTAINERS (m32c, msp43, rl78, libiberty, build): Remove myself
Index: MAINTAINERS
===
--- MAINTAINERS	(revision 263324)
+++ MAINTAINERS	(working copy)
@@ -414,7 +414,7 @@
 Andrew John Hughes
 Dominique d'Humieres
 Andy Hutchinson	
-Naveen H.S	
+Naveen H.S	
 Meador Inge	
 Bernardo Innocenti
 Alexander Ivchenko


Re: [patch] improve internals documentation for nested function descriptors

2018-08-06 Thread Eric Botcazou
> Here's another attempt at documenting what's there now, this time
> avoiding using the ambiguous term "custom" at all in the text.  I've
> also tried to address the other comments.  Is this any better?

Yes, it's fine with me, thanks.

-- 
Eric Botcazou


[testsuite] Fix gcc.dg/vect/no-section-anchors-vect-69.c on SPARC etc. (PR tree-optimization/80925)

2018-08-06 Thread Rainer Orth
Since this patch

2017-07-31  Steve Ellcey  

PR tree-optimization/80925
* gcc.dg/vect/no-section-anchors-vect-69.c: Add
--param vect-max-peeling-for-alignment=0 option.
Remove unaligned access and peeling checks.
* gcc.dg/vect/section-anchors-vect-69.c: Ditto.

gcc.dg/vect/no-section-anchors-vect-69.c FAILs on a couple of targets,
including ia64, aarch64, powerpc64lc, and sparc:

FAIL: gcc.dg/vect/no-section-anchors-vect-69.c scan-tree-dump-times vect 
"vectorized 4 loops" 1

Steve suggested the fix implemented below in the PR, which I've now
finally tested on sparc-sun-solaris2.11 and i386-pc-solaris2.11, both
trunk and gcc-8 branch.

Ok for both?

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


2018-08-02  Steve Ellcey  
Rainer Orth  

PR tree-optimization/80925
* gcc.dg/vect/no-section-anchors-vect-69.c: Expect 3 loops
vectorized on !vect_hw_misalign targets.

# HG changeset patch
# Parent  171d5cc88d38a18885744478b2e811ef8f59ee28
Fix gcc.dg/vect/no-section-anchors-vect-69.c on SPARC etc. (PR tree-optimization/80925)

diff --git a/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-69.c b/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-69.c
--- a/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-69.c
+++ b/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-69.c
@@ -119,5 +119,6 @@ int main (void)
   return main1 ();
 } 
 
-/* { dg-final { scan-tree-dump-times "vectorized 4 loops" 1 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 4 loops" 1 "vect" { target vect_hw_misalign } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" { target { ! vect_hw_misalign } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { { ! vector_alignment_reachable } && { ! vect_hw_misalign } } } } } */


[PATCH] AMD perf PMU events for AMD Family 17h.

2018-08-06 Thread Martin Liška
Hello.

Following patch adds PMC events for AMD Family 17 CPUs as defined in [1].
It covers events described in section: 2.1.13. Regex pattern in mapfile.csv
covers all CPUs of the family.

Thanks,
Martin

[1] https://support.amd.com/TechDocs/54945_PPR_Family_17h_Models_00h-0Fh.pdf

Signed-off-by: Martin Liška 

---
 .../pmu-events/arch/x86/amdfam17h/cache.json  | 332 ++
 .../pmu-events/arch/x86/amdfam17h/core.json   | 124 +++
 .../arch/x86/amdfam17h/floating-point.json| 196 +++
 .../pmu-events/arch/x86/amdfam17h/memory.json | 225 
 .../pmu-events/arch/x86/amdfam17h/other.json  |  51 +++
 tools/perf/pmu-events/arch/x86/mapfile.csv|   1 +
 6 files changed, 929 insertions(+)
 create mode 100644 tools/perf/pmu-events/arch/x86/amdfam17h/cache.json
 create mode 100644 tools/perf/pmu-events/arch/x86/amdfam17h/core.json
 create mode 100644 tools/perf/pmu-events/arch/x86/amdfam17h/floating-point.json
 create mode 100644 tools/perf/pmu-events/arch/x86/amdfam17h/memory.json
 create mode 100644 tools/perf/pmu-events/arch/x86/amdfam17h/other.json


diff --git a/tools/perf/pmu-events/arch/x86/amdfam17h/cache.json b/tools/perf/pmu-events/arch/x86/amdfam17h/cache.json
new file mode 100644
index ..6a41cc9d1d5e
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/amdfam17h/cache.json
@@ -0,0 +1,332 @@
+[
+  {
+"EventName": "ic_fw32",
+"EventCode": "0x80",
+"BriefDescription": "The number of 32B fetch windows transferred from IC pipe to DE instruction decoder (includes non-cacheable and cacheable fill responses)."
+  },
+  {
+"EventName": "ic_fw32_miss",
+"EventCode": "0x81",
+"BriefDescription": "The number of 32B fetch windows tried to read the L1 IC and missed in the full tag."
+  },
+  {
+"EventName": "ic_cache_fill_l2",
+"EventCode": "0x82",
+"BriefDescription": "The number of 64 byte instruction cache line was fulfilled from the L2 cache."
+  },
+  {
+"EventName": "ic_cache_fill_sys",
+"EventCode": "0x83",
+"BriefDescription": "The number of 64 byte instruction cache line fulfilled from system memory or another cache."
+  },
+  {
+"EventName": "bp_l1_tlb_miss_l2_hit",
+"EventCode": "0x84",
+"BriefDescription": "The number of instruction fetches that miss in the L1 ITLB but hit in the L2 ITLB."
+  },
+  {
+"EventName": "bp_l1_tlb_miss_l2_miss",
+"EventCode": "0x85",
+"BriefDescription": "The number of instruction fetches that miss in both the L1 and L2 TLBs."
+  },
+  {
+"EventName": "bp_snp_re_sync",
+"EventCode": "0x86",
+"BriefDescription": "The number of pipeline restarts caused by invalidating probes that hit on the instruction stream currently being executed. This would happen if the active instruction stream was being modified by another processor in an MP system - typically a highly unlikely event."
+  },
+  {
+"EventName": "ic_fetch_stall.ic_stall_any",
+"EventCode": "0x87",
+"BriefDescription": "IC pipe was stalled during this clock cycle for any reason (nothing valid in pipe ICM1).",
+"PublicDescription": "Instruction Pipe Stall. IC pipe was stalled during this clock cycle for any reason (nothing valid in pipe ICM1).",
+"UMask": "0x4"
+  },
+  {
+"EventName": "ic_fetch_stall.ic_stall_dq_empty",
+"EventCode": "0x87",
+"BriefDescription": "IC pipe was stalled during this clock cycle (including IC to OC fetches) due to DQ empty.",
+"PublicDescription": "Instruction Pipe Stall. IC pipe was stalled during this clock cycle (including IC to OC fetches) due to DQ empty.",
+"UMask": "0x2"
+  },
+  {
+"EventName": "ic_fetch_stall.ic_stall_back_pressure",
+"EventCode": "0x87",
+"BriefDescription": "IC pipe was stalled during this clock cycle (including IC to OC fetches) due to back-pressure.",
+"PublicDescription": "Instruction Pipe Stall. IC pipe was stalled during this clock cycle (including IC to OC fetches) due to back-pressure.",
+"UMask": "0x1"
+  },
+  {
+"EventName": "bp_l1_btb_correct",
+"EventCode": "0x8a",
+"BriefDescription": "L1 BTB Correction."
+  },
+  {
+"EventName": "bp_l2_btb_correct",
+"EventCode": "0x8b",
+"BriefDescription": "L2 BTB Correction."
+  },
+  {
+"EventName": "ic_cache_inval.l2_invalidating_probe",
+"EventCode": "0x8c",
+"BriefDescription": "IC line invalidated due to L2 invalidating probe (external or LS).",
+"PublicDescription": "The number of instruction cache lines invalidated. A non-SMC event is CMC (cross modifying code), either from the other thread of the core or another core. IC line invalidated due to L2 invalidating probe (external or LS).",
+"UMask": "0x2"
+  },
+  {
+"EventName": "ic_cache_inval.fill_invalidated",
+"EventCode": "0x8c",
+"BriefDescription": "IC line invalidated due to overwriting fill response.",
+"PublicDescription": "The number of instruction cache lines invalidated. A non-SMC event is CMC (cross mod

Re: [PATCH] AMD perf PMU events for AMD Family 17h.

2018-08-06 Thread Martin Liška
Oup, I have gcc-patches as default mailing list for git format-patch.
Please ignore this email thread.

Martin


Re: [Patch, avr, PR85624] - Fix ICE when initializing 128-byte aligned array

2018-08-06 Thread Senthil Kumar Selvaraj
Ping!

Regards
Senthil

Senthil Kumar Selvaraj writes:

> Hi,
>
> The below patch fixes an ICE for the avr target when the setmemhi
> expander is involved.
>
> The setmemhi expander generated RTL ends up as an unrecognized insn
> if the alignment of the destination exceeds that of a QI
> mode const_int (127), AND the number of bytes to set fits in a QI
> mode const_int. The second condition prevents *clrmemhi from matching,
> and *clrmemqi does not match because it expects operand 3 (the alignment
> const_int rtx) to be QI mode, and a value of 128 or greater does not fit.
>   
> The patch fixes this by changing the *clrmemqi pattern to match a HI
> mode const_int, and also adds a testcase.
>
> Regression test showed no new failures, ok to commit to trunk?
>
> Regards
> Senthil
>
> gcc/ChangeLog:
> 
> 2018-07-18  Senthil Kumar Selvaraj  
> 
> PR target/85624
> * config/avr/avr.md (*clrmemqi): Change mode of operands[2]
> from QI to HI.
> 
> gcc/testsuite/ChangeLog:
> 
> 2018-07-18  Senthil Kumar Selvaraj  
> 
> PR target/85624
> * gcc.target/avr/pr85624.c: New test.
>
> diff --git gcc/config/avr/avr.md gcc/config/avr/avr.md
> index e619e695418..644e3cfabc5 100644
> --- gcc/config/avr/avr.md
> +++ gcc/config/avr/avr.md
> @@ -1095,7 +1095,7 @@
>[(set (mem:BLK (match_operand:HI 0 "register_operand" "e"))
>  (const_int 0))
> (use (match_operand:QI 1 "register_operand" "r"))
> -   (use (match_operand:QI 2 "const_int_operand" "n"))
> +   (use (match_operand:HI 2 "const_int_operand" "n"))
> (clobber (match_scratch:HI 3 "=0"))
> (clobber (match_scratch:QI 4 "=&1"))]
>""
> diff --git gcc/testsuite/gcc.target/avr/pr85624.c 
> gcc/testsuite/gcc.target/avr/pr85624.c
> new file mode 100644
> index 000..ede2e80216a
> --- /dev/null
> +++ gcc/testsuite/gcc.target/avr/pr85624.c
> @@ -0,0 +1,13 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3" } */
> +
> +/* This testcase exposes PR85624. An alignment directive with
> +   a value greater than 127 on an array with dimensions that fit
> +   QImode causes an 'unrecognizable insn' ICE. Turns out clrmemqi
> +   did not match the pattern expanded by setmemhi, because it
> +   assumed the alignment val will fit in a QI. */
> +
> +int foo() {
> +  volatile int arr[3] __attribute__((aligned(128))) = {0};
> +  return arr[2];
> +}



Re: Async I/O patch with compilation fix

2018-08-06 Thread Christophe Lyon
On Sat, 4 Aug 2018 at 00:42, Thomas König  wrote:
>
> Hi Cristophe,
>
> this is seriously weird - there is not even an I/O statement in that test 
> case.
>
> One question: Is this real hardware or an emulator?
I'm using QEMU

> Also, Could you try a few things?
>
> Run the test case manually. Do you still fail?
Yes.

> Is there an error if the executable is run under valgrind?
I don't know how to do that with qemu, nor if valgrind supports armeb?

> If you have two compilers, one with the patch and one without: Is there a 
> difference in the generated files for
>
> -dump-tree-original, -fdump-tree-optimized and -S?

I posted a few comments in the associated PR:
- the .s files are the same with /without the patch, so I suppose the
problem comes from the runtime libraries
- I've attached both execution traces and output from objdump on the
statically linked executable, so as to hopefully include all the code
executed

> Regards, Thomas


Re: [PATCH] Make strlen range computations more conservative

2018-08-06 Thread Bernd Edlinger
On 08/05/18 17:58, Jeff Law wrote:
> On 08/05/2018 12:08 AM, Bernd Edlinger wrote:
>> I see this from a software engineering POV.
>>
>> If we have code like this:
>>
>> void test (const char *x)
>> {
>> assert (strlen (x) < 10);
>> }
>>
>> One would usually expect the program to abort (or at least abort with
>> a near 100% likelihood) if x is not a valid string with length less
>> than 10.
> I would not expect the program to abort if this function were called
> with an invalid (ie unterminated) string.  In that scenario I know
> enough to not expect anything because my program is broken, plain and
> simple.
> 
>>
>> But if lto and other optimizations show that this code is invoked with
>> an invalid, non-zero terminated string the assertion is suddenly gone.
> And the program is invalid as it exhibits undefined behavior.  One
> undefined behavior is exhibited I have no expectations.  I *like* when
> we do something like trap as soon as undefined behavior is discovered,
> but I do not expect it.
> 
>>
>> Martin, why do you insist that GCC has to do this, and that it must
>> be a good idea to do so, just based on the language definition?
> We do this all the time in all kinds of situations.
> 
> The language definition provides the contract that the programmer and
> compiler must adhere to.  When the contract is broken we can not be
> responsible for the resulting code as the input source is simply broken.
> 
>>
>> Why do we need assertions at all, when all programs have to be completely
>> correct before we may compile them?
> That's what compilers do.  When they see code that is pointless it gets
> removed.  As engineers we sometimes say "while this is undefined
> behavior we're not going to exploit the undefined-ness".  That is based
> on our experience as developers.  And sometimes we may not even agree
> where the line between optimize this vs do not because it's going to
> surprise a developer.
> 

Sorry, to re-iterate on this topic, but maybe this can give some insights.

I'd say, my concern would be resolved if strlen range information
would not be more aggressive than the well-known loop-niter optimization.

What I mean, if strlen would be defined like an inline macro, the assertion
does not get optimized, even if the function is later in-lined in a totally
different context:


#define strlen(x) ({ int _l; for (_l=0; (x)[_l]; _l++); _l;})
void test (const char *x)
{
 assert (strlen (x) < 10);
}


But it would get optimized, when x is declared locally.
The rationale might be, that the type of object where x points
to is unknown at where the assertion is vs. the type
of object is statically known at where the assertion is.

What is the reason for this, and what would be necessary to get the exact
same behavior with the string range info?


Thanks
Bernd.


Re: [PATCH][Fortran][v2] Use MIN/MAX_EXPR for min/max intrinsics

2018-08-06 Thread Janne Blomqvist
On Wed, Jul 18, 2018 at 6:10 PM, Janne Blomqvist 
wrote:

> On Wed, Jul 18, 2018 at 4:26 PM, Thomas König  wrote:
>
>> Hi Kyrlll,
>>
>> > Am 18.07.2018 um 13:17 schrieb Kyrill Tkachov <
>> kyrylo.tkac...@foss.arm.com>:
>> >
>> > Thomas, Janne, would this relaxation of NaN handling be acceptable
>> given the benefits
>> > mentioned above? If so, what would be the recommended adjustment to the
>> nan_1.f90 test?
>>
>> I would be a bit careful about changing behavior in such a major way.
>> What would the results with NaN and infinity then be, with or without
>> optimization? Would the results be consistent with min(nan,num) vs
>> min(num,nan)? Would they be consistent with the new IEEE standard?
>>
>
> AFAIU, MIN/MAX_EXPR do the right thing when comparing a normal number with
> Inf. For NaN the result is undefined, and you might indeed have
>
> min(a, NaN) = a
> min(NaN, a) = NaN
>
> where "a" is a normal number.
>
> (I think that happens at least on x86 if MIN_EXPR is expanded to
> minsd/minpd.
>
> Apparently what the proper result for min(a, NaN) should be is contentious
> enough that minnum was removed from the upcoming IEEE 754 revision, and new
> operations AFAICS have the semantics
>
> minimum(a, NaN) = minimum(NaN, a) = NaN
> minimumNumber(a, NaN) = minimumNumber(NaN, a) = a
>
> That is minimumNumber corresponds to minnum in IEEE 754-2008 and fmin* in
> C, and to the current behavior of gfortran.
>
>
>> In general, I think that min(nan,num) should be nan and that our current
>> behavior is not the best.
>>
>
> There was some extensive discussion of that in the Julia bug report I
> linked to in an earlier message, and they came to the same conclusion and
> changed their behavior.
>
>
>> Does anybody have dats points on how this is handled by other compilers?
>>
>
> The only other compiler I have access to at the moment is ifort (and not
> the latest version), but maybe somebody has access to a wider variety?
>
>
>> Oh, and if anything is changed, then compile and runtime behavior should
>> always be the same.
>>
>
> Well, IFF we place some weight on the runtime behavior being particularly
> sensible wrt NaN's, which it wouldn't be if we just use a plain
> MIN/MAX_EXPR. Is it worth taking a performance hit for, though? In
> particular, if other compilers are inconsistent, we might as well do
> whatever is fastest.
>
>
> --
> Janne Blomqvist
>


The testcase below (the functions in a separate file to prevent
inter-procedural and constant propagation optimizations):

program main
  implicit none
  real :: a, b = 1., mymax, mydiv
  external mymax, mydiv
  a = mydiv(0., 0.)
  print *, 'Verify that the following value is a NaN: ', a
  print *, 'max(', a, ',', b, ') = ', mymax(a, b)
  print *, 'max(', b, ',', a, ') = ', mymax(b, a)

  a = mydiv(1., 0.)
  print *, 'Verify that the following is a Inf: ', a
  print *, 'max(', a, ',', b, ') = ', mymax(a, b)
  print *, 'max(', b, ',', a, ') = ', mymax(b, a)
end program main

real function mymax(a, b)
  implicit none
  real :: a, b
  mymax = max(a, b)
end function mymax

real function mydiv(a, b)
  implicit none
  real :: a, b
  mydiv = a/b
end function mydiv


With gfortran 6.2 (didn't bother to check other versions as it shouldn't
have changed lately) and Intel Fortran 17.0.1 I get the following:

% gfortran main.f90 my.f90 && ./a.out
 Verify that the following value is a NaN:   NaN
 max(  NaN ,   1. ) =1.
 max(   1. ,  NaN ) =1.
 Verify that the following is a Inf:  Infinity
 max( Infinity ,   1. ) =  Infinity
 max(   1. , Infinity ) =  Infinity

% gfortran -ffast-math main.f90 my.f90 && ./a.out
 Verify that the following value is a NaN:   NaN
 max(  NaN ,   1. ) =   NaN
 max(   1. ,  NaN ) =1.
 Verify that the following is a Inf:  Infinity
 max( Infinity ,   1. ) =  Infinity
 max(   1. , Infinity ) =  Infinity


% ifort main.f90 my.f90 && ./a.out
 Verify that the following value is a NaN: NaN
 max(NaN ,   1.00 ) =1.00
 max(   1.00 ,NaN ) = NaN
 Verify that the following is a Inf:Infinity
 max(   Infinity ,   1.00 ) =Infinity
 max(   1.00 ,   Infinity ) =Infinity


% ifort -fp-model strict main.f90 my.f90 && ./a.out
 Verify that the following value is a NaN: NaN
 max(NaN ,   1.00 ) =1.00
 max(   1.00 ,NaN ) = NaN
 Verify that the following is a Inf:Infinity
 max(   Infinity ,   1.00 ) =Infinity
 max(   1.00 ,   Infinity ) =Infinity


For brevity I have omitted tests with various -O[N] optimization levels,
which didn't affect the results

Re: [PATCH v2 1/4] vxworks: add target/h/wrn/coreip to the set of system include paths

2018-08-06 Thread Olivier Hainque
Hi Rasmus,

> On 28 Jun 2018, at 10:43, Rasmus Villemoes  wrote:
> 
> 2018-06-04  Rasmus Villemoes  
> 
>* config/vxworks.h: Add $(WIND_BASE)/target/h/wrn/coreip to
>  default search path.

Ok, thanks.

Olivier




Re: [PATCH v2 1/4] vxworks: add target/h/wrn/coreip to the set of system include paths

2018-08-06 Thread Olivier Hainque



> On 6 Aug 2018, at 14:12, Olivier Hainque  wrote:
>>   * config/vxworks.h: Add $(WIND_BASE)/target/h/wrn/coreip to
>> default search path.
> 
> Ok, thanks.

With a nit on the ChangeLog formatting:

* config/vxworks.h: Add $(WIND_BASE)/target/h/wrn/coreip to
default search path.

^
no extra space here

and an indication that this is for VxWorks pre vx7.

Thanks.

Olivier



Re: [PATCH v2 7/7] Enable clobber high for tls descs on Aarch64

2018-08-06 Thread Richard Sandiford
Alan Hayward  writes:
> Add the clobber high expressions to tls_desc for aarch64.
> It also adds three tests.
>
> In addition I also tested by taking the gcc torture test suite and making
> all global variables __thread. Then emended the suite to compile with -fpic,
> save the .s file and only for one given O level.
> I ran this before and after the patch and compared the resulting .s files,
> ensuring that there were no ASM changes.
> I discarded the 10% of tests that failed to compile (due to the code in
> the test now being invalid C).
> I did this for O0,O2,O3 on both x86 and aarch64 and observed no difference
> between ASM files before and after the patch.
>
> Alan.
>
> 2018-07-25  Alan Hayward  
>
> gcc/
>   * config/aarch64/aarch64.md: Add clobber highs to tls_desc.
>
> gcc/testsuite/
>   * gcc.target/aarch64/sve_tls_preserve_1.c: New test.
>   * gcc.target/aarch64/sve_tls_preserve_2.c: New test.
>   * gcc.target/aarch64/sve_tls_preserve_3.c: New test.

Like you were saying off-line, these should be in the aarch64/sve/
directory.  Files there don't specify an -march* option, since the
harness adds one if necessary.

OK with that change, thanks.

Richard


[committed] Fix reload assert due to CLOBBER_HIGH support

2018-08-06 Thread Jeff Law



forget_old_reloads_1 can be called with NULL_RTX for SETTER which causes
a segfault in the assert that SETTER is not a CLOBBER_HIGH expression
(see the call from within reload_as_needed)


 forget_old_reloads_1 (XEXP (in_reg, 0), NULL_RTX, NULL);


Clearly this only happens on reload targets.  Verified this fixed the
segfault my tester hit on c6x.

Installing as obvious.

JEff


Re: [committed] Fix reload assert due to CLOBBER_HIGH support

2018-08-06 Thread Jeff Law
On 08/06/2018 07:48 AM, Jeff Law wrote:
> 
> 
> forget_old_reloads_1 can be called with NULL_RTX for SETTER which causes
> a segfault in the assert that SETTER is not a CLOBBER_HIGH expression
> (see the call from within reload_as_needed)
> 
> 
>  forget_old_reloads_1 (XEXP (in_reg, 0), NULL_RTX, NULL);
> 
> 
> Clearly this only happens on reload targets.  Verified this fixed the
> segfault my tester hit on c6x.
> 
> Installing as obvious.
> 
> JEff
> 
Opps.  With patch this time...

Jeff


diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 78e51d29f73..c17a55ce837 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2018-08-06  Jeff Law  
+
+   * reload1.c (forget_old_reloads_1): Adjust CLOBBER_HIGH
+   assert.
+
 2018-08-06  Jozef Lawrynowicz  
 
PR target/86662
diff --git a/gcc/reload1.c b/gcc/reload1.c
index 8e15160c6ad..3c0c9ff982f 100644
--- a/gcc/reload1.c
+++ b/gcc/reload1.c
@@ -4931,7 +4931,7 @@ forget_old_reloads_1 (rtx x, const_rtx setter,
 return;
 
   /* CLOBBER_HIGH is only supported for LRA.  */
-  gcc_assert (GET_CODE (setter) != CLOBBER_HIGH);
+  gcc_assert (setter == NULL_RTX || GET_CODE (setter) != CLOBBER_HIGH);
 
   regno = REGNO (x);
 


Re: [C++ PATCH] Implement P0595R1 - so far as __builtin_is_constant_evaluated rather than std::is_constant_evaluated magic builtin

2018-08-06 Thread Jason Merrill
On Wed, Aug 1, 2018 at 7:07 PM, Jakub Jelinek  wrote:
> On Wed, Aug 01, 2018 at 12:35:09AM +1000, Jason Merrill wrote:
>> On Mon, Jul 23, 2018 at 8:50 PM, Richard Biener
>>  wrote:
>> > On Mon, Jul 23, 2018 at 12:28 PM Jakub Jelinek  wrote:
>> >>
>> >> On Mon, Jul 23, 2018 at 12:17:42PM +0200, Richard Biener wrote:
>> >> > > Bootstrapped/regtested on x86_64-linux.
>> >> >
>> >> > Thanks for working on this.  I wonder if we can completely hide this
>> >> > from the middle-end, without requiring defining of c_dialect_cxx.
>> >> > There is the BUILT_IN_FRONTEND class so you could somewhere
>> >> > manually inject a decl in that class for C++?
>> >>
>> >> But then I couldn't handle folding of the builtin in the middle-end to 
>> >> false,
>> >> which is what I need (because in the FE it needs to be either folded to
>> >> true, or its folding deferred until later).
>> >> Or maybe in the C++ gimplification langhook?
>> >
>> > Yes, I was thinking the C++ langhook or its fully_fold routine.
>>
>> fully_fold is too soon (until constexpr evaluation uses the
>> pre-genericize form), but the gimplification hook should work.
>>
>> >> Seems we have a single BUILT_IN_FRONTEND builtin in the whole compiler,
>> >> __integer_pack, but it doesn't act as a normal builtin, given it is a
>> >> templatish magic.
>> >
>> > Yeah, I think at some point we considered removing BUILT_IN_FRONTEND ...
>> >
>> > Nowadays internal-use builtins can easily be internal-functions but of 
>> > couse
>> > this one will eventually be used from libstdc++.
>>
>> Immediately, I'd think.
>
> Here is an updated patch that uses BUILT_IN_FRONTEND for this rather than
> BUILT_IN_NORMAL builtin.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux.
>
> 2018-08-01  Jakub Jelinek  
>
> P0595R1 - is_constant_evaluated
> cp/
> * cp-tree.h (enum cp_built_in_function): New.
> (maybe_constant_init): Add const_evaluated argument.

We should document that this new parameter (here and in
cxx_eval_outermost_constant_expr) means "required to be
constant-evaluated" as per P0595 even though allow_non_constant is
true.  And the parameter names should make its magical nature more
obvious; perhaps "pretend_const_required".  OK with those changes.

Jason


Re: [PATCH 02/11] Arm - add speculation_barrier pattern

2018-08-06 Thread Christophe Lyon
On Fri, 27 Jul 2018 at 11:38, Richard Earnshaw  wrote:
>
>
> This patch defines a speculation barrier for AArch32.
>
> * config/arm/unspecs.md (unspecv): Add VUNSPEC_SPECULATION_BARRIER.
> * config/arm/arm.md (speculation_barrier): New expand.
> (speculation_barrier_insn): New pattern.
> ---

Hi Richard,

This patch causes compilation errors for c-c++-common/spec-barrier-2.c
when compiled for old architectures (eg -march=armv5t):
/ccrf2LoR.s:49: Error: selected processor does not support `isb' in ARM mode
/ccrf2LoR.s:50: Error: selected processor does not support `dsb sy' in ARM mode

Does this belong to the kind of failures you still expect from this
patch series?


Re: [PATCH, testsuite]: Fix PR 86153, test case g++.dg/pr83239.C fails

2018-08-06 Thread Christophe Lyon
On Wed, 1 Aug 2018 at 12:34, Uros Bizjak  wrote:
>
> Hello!
>
> The testcase fails with:
>
> FAIL: g++.dg/pr83239.C  -std=gnu++11  scan-tree-dump-not optimized
> "_ZNSt6vectorIiSaIiEE17_M_default_appendEm"
> FAIL: g++.dg/pr83239.C  -std=gnu++14  scan-tree-dump-not optimized
> "_ZNSt6vectorIiSaIiEE17_M_default_appendEm"
>
> the test depends on _M_default_append to be inlined, so it verifies
> the inlining with:
>
> // Verify that std::vector::_M_default_append() has been inlined
> // (the absence of warnings depends on it).
> // { dg-final { scan-tree-dump-not
> "_ZNSt6vectorIiSaIiEE17_M_default_appendEm"  optimized } }
> // { dg-final { scan-tree-dump-not
> "_ZNSt6vectorIPvSaIS0_EE17_M_default_appendEm" optimized } }
>
> However, this is not the case with the default -finline-limit, so
> raise it to 500 (the same approach is taken in g++.dg/
> tree-ssa/copyprop.C).
>
> Unfortunately, the fixed testcase exposes some issue with -std=gnu++98:
>
> FAIL: g++.dg/pr83239.C  -std=gnu++98 (test for excess errors)
>
> In function 'void test_loop() [with T = int]':
> cc1plus: warning: 'void* __builtin_memset(void*, int, long unsigned
> int)' specified size 18446744073709551608 exceeds maximum object size
> 9223372036854775807 [-Wstringop-overflow=]
> In function 'void test_if(std::vector&, int) [with T = long int]':
> cc1plus: warning: 'void* __builtin_memset(void*, int, long unsigned
> int)' specified size 18446744073709551600 exceeds maximum object size
> 9223372036854775807 [-Wstringop-overflow=]
>
> 2018-08-01  Uros Bizjak  
>
> PR testsuite/86153
> * g++.dg/pr83239.C (dg-options): Add -finline-limit=500.
>
> OK for mainline and gcc-8 branch?
>

On aarch64 and arm I'm seeing a regression like:
FAIL: g++.dg/pr83239.C  -std=gnu++98 (test for excess errors)
Excess errors:
cc1plus: warning: 'void* __builtin_memset(void*, int, long unsigned
int)' specified size 18446744073709551608 exceeds maximum object size
9223372036854775807 [-Wstringop-overflow=]
cc1plus: warning: 'void* __builtin_memset(void*, int, long unsigned
int)' specified size 18446744073709551600 exceeds maximum object size
9223372036854775807 [-Wstringop-overflow=]

> Uros.


Re: dejagnu version update?

2018-08-06 Thread Jonathan Wakely
On Sat, 4 Aug 2018 at 17:32, Bernhard Reutner-Fischer wrote:
> debian-stable (i think 9 ATM), Ubuntu LTS ship versions recent enough
> to contain both fixes. Commercial distros seem to ship fixed versions,
> too.

The CentOS 7.4.1708 version on gcc112 doesn't seem to be fixed.


[spu, commit] Define TARGET_HAVE_SPECULATION_SAFE_VALUE

2018-08-06 Thread Ulrich Weigand
Hello,

the SPU processor is not affected by speculation, so this macro can
safely be defined as speculation_safe_value_not_needed.

Committed to mainline.

Bye,
Ulrich


gcc/ChangeLog:

PR target/86807
* config/spu/spu.c (TARGET_HAVE_SPECULATION_SAFE_VALUE):
Define to speculation_safe_value_not_needed.

Index: config/spu/spu.c
===
--- config/spu/spu.c(revision 263334)
+++ config/spu/spu.c(working copy)
@@ -7463,6 +7463,9 @@ static const struct attribute_spec spu_a
 #undef TARGET_CONSTANT_ALIGNMENT
 #define TARGET_CONSTANT_ALIGNMENT spu_constant_alignment
 
+#undef  TARGET_HAVE_SPECULATION_SAFE_VALUE
+#define TARGET_HAVE_SPECULATION_SAFE_VALUE speculation_safe_value_not_needed
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 #include "gt-spu.h"
-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  ulrich.weig...@de.ibm.com



Re: [PATCH] Optimize logarithm addition and subtraction

2018-08-06 Thread Jeff Law
On 08/05/2018 12:35 PM, Marc Glisse wrote:
> On Sun, 5 Aug 2018, Bernd Edlinger wrote:
> 
>>> merging log a + log b => log a*b and
>>
>> Maybe a*b could overflow, while adding the logarithms would not?
> 
> Well, that's a discussion that happens every time a new transformation
> is added to -funsafe-math-optimizations (I assume this one is under that
> umbrella?). We already may get extra overflow with -fassociative-math
> for instance. Sure, the overflow (or underflow!) is more likely for log
> than for addition, but this still seems like the kind of transformation
> that -ffast-math should enable, in my opinion. Now we can see if I am in
> the minority...
> 
One of the requirements for adding something to -ffast-math is that it
doesn't break spec (we've never been specific about the version, though
I think spec2017 and spec2006 are the only ones that matter anymore and
the latter's importance diminishes daily).

I don't know if the original submitter has access to the spec suite, so
we'll likely need to do that testing.  I can't commit to any timeframe
for that right now, but I think that's the biggest hurdle we need to jump.

jeff


Re: [PATCH] Make strlen range computations more conservative

2018-08-06 Thread Jeff Law
On 08/03/2018 01:48 AM, Jakub Jelinek wrote:
> On Fri, Aug 03, 2018 at 01:19:14AM -0600, Jeff Law wrote:
>> I'm leaning towards a similar conclusion, namely that we can only rely
>> on type information for the pointer that actually gets passed to strlen,
>> which 99.9% of the time is (char *), potentially with const qualifiers.
> 
> You can't derive anything from the pointer type of the strlen argument,
> because pointer conversions are useless in the middle-end, so if there was
> some conversion at some point, it might be gone, or you might get there a
> completely different pointer type of something that happened to have the
> same value.  That is why the information, if it matters, needs to be stored
> elsewhere, on the memory access (MEM_REF has such info, TARGET_MEM_REF too,
> handled_component_p do too).
So ISTM that you, Richi and I are in broad agreement that we can't walk
backwards through casts to refine the potential range for string lengths
because of GIMPLE semantics.  That also matches which Bernd wants to see
happen.  With that in mind I propose we either revert that aspect of
Martin's patch or move forward with Bernd's patch for that issue,
whichever makes the most sense.

However, I want to give Martin a chance to chime in explicitly on the
subject of looking through casts before making that change.

Jeff




[Committed] S/390: Don't unroll memory blk op loops

2018-08-06 Thread Andreas Krebbel
From: Andreas Krebbel 

gcc/ChangeLog:

2018-08-06  Andreas Krebbel  

* config/s390/s390.c (s390_loop_unroll_adjust): Prevent small
loops with memory block operations from getting unrolled.

gcc/testsuite/ChangeLog:

2018-08-06  Andreas Krebbel  

* gcc.target/s390/nomemloopunroll-1.c: New test.
---
 gcc/config/s390/s390.c| 31 ---
 gcc/testsuite/gcc.target/s390/nomemloopunroll-1.c | 27 
 2 files changed, 55 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/nomemloopunroll-1.c

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index ec588a2..aa34f56 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -390,6 +390,11 @@ static unsigned vfu_longrunning[NUM_SIDES];
base and index are registers of the class ADDR_REGS,
displacement is an unsigned 12-bit immediate constant.  */
 
+/* The max number of insns of backend generated memset/memcpy/memcmp
+   loops.  This value is used in the unroll adjust hook to detect such
+   loops.  Current max is 9 coming from the memcmp loop.  */
+#define BLOCK_MEM_OPS_LOOP_INSNS 9
+
 struct s390_address
 {
   rtx base;
@@ -15385,9 +15390,29 @@ s390_loop_unroll_adjust (unsigned nunroll, struct loop 
*loop)
   for (i = 0; i < loop->num_nodes; i++)
 FOR_BB_INSNS (bbs[i], insn)
   if (INSN_P (insn) && INSN_CODE (insn) != -1)
-   FOR_EACH_SUBRTX (iter, array, PATTERN (insn), NONCONST)
- if (MEM_P (*iter))
-   mem_count += 1;
+   {
+ rtx set;
+
+ /* The runtime of small loops with memory block operations
+will be determined by the memory operation.  Doing
+unrolling doesn't help here.  Measurements to confirm
+this where only done on recent CPU levels.  So better do
+not change anything for older CPUs.  */
+ if (s390_tune >= PROCESSOR_2964_Z13
+ && loop->ninsns <= BLOCK_MEM_OPS_LOOP_INSNS
+ && ((set = single_set (insn)) != NULL_RTX)
+ && ((GET_MODE (SET_DEST (set)) == BLKmode
+  && (GET_MODE (SET_SRC (set)) == BLKmode
+  || SET_SRC (set) == const0_rtx))
+ || (GET_CODE (SET_SRC (set)) == COMPARE
+ && GET_MODE (XEXP (SET_SRC (set), 0)) == BLKmode
+ && GET_MODE (XEXP (SET_SRC (set), 1)) == BLKmode)))
+   return 1;
+
+ FOR_EACH_SUBRTX (iter, array, PATTERN (insn), NONCONST)
+   if (MEM_P (*iter))
+ mem_count += 1;
+   }
   free (bbs);
 
   /* Prevent division by zero, and we do not need to adjust nunroll in this 
case.  */
diff --git a/gcc/testsuite/gcc.target/s390/nomemloopunroll-1.c 
b/gcc/testsuite/gcc.target/s390/nomemloopunroll-1.c
new file mode 100644
index 000..695d925
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/nomemloopunroll-1.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -funroll-loops -march=z13" } */
+
+/* 2x mvc */
+void *
+foo (char *a, int c, long len)
+{
+  return __builtin_memset (a, c, len);
+}
+
+/* 2x mvc */
+void
+bar (char *a, char *b)
+{
+  __builtin_memcpy (a, b, 3);
+}
+
+/* 2x clc */
+
+int
+baz (char *a, char *b)
+{
+  return __builtin_memcmp (a, b, 3);
+}
+
+/* { dg-final { scan-assembler-times "\\\smvc\\\s" 4 } } */
+/* { dg-final { scan-assembler-times "\\\sclc\\\s" 2 } } */
-- 
2.9.1



[PATCH, testsuite]: Fix g++.dg/torture/pr86763.C link failure for glibc < 2.17

2018-08-06 Thread Uros Bizjak
2018-08-06  Uros Bizjak  

* g++.dg/torture/pr86763.C (dg-additional-options): Add -lrt.

Tested on CentOS 5.10 and Fedora 28.

OK for mainline?

Uros.
Index: g++.dg/torture/pr86763.C
===
--- g++.dg/torture/pr86763.C(revision 263317)
+++ g++.dg/torture/pr86763.C(working copy)
@@ -1,5 +1,5 @@
 // { dg-do run }
-// { dg-additional-options "-fschedule-insns2 -fstrict-aliasing" }
+// { dg-additional-options "-fschedule-insns2 -fstrict-aliasing -lrt" }
 
 #include 
 #include 


Re: [PATCH] Make strlen range computations more conservative

2018-08-06 Thread Jeff Law
On 08/04/2018 03:56 PM, Martin Sebor wrote:
> On 08/03/2018 01:00 AM, Jeff Law wrote:
>> On 07/24/2018 05:18 PM, Bernd Edlinger wrote:
>>> On 07/24/18 23:46, Jeff Law wrote:
 On 07/24/2018 01:59 AM, Bernd Edlinger wrote:
> Hi!
>
> This patch makes strlen range computations more conservative.
>
> Firstly if there is a visible type cast from type A to B before
> passing
> then value to strlen, don't expect the type layout of B to restrict
> the
> possible return value range of strlen.
 Why do you think this is the right thing to do?  ie, is there language
 in the standards that makes you think the code as it stands today is
 incorrect from a conformance standpoint?  Is there a significant
 body of
 code that is affected in an adverse way by the current code?  If so,
 what code?


>>>
>>> I think if you have an object, of an effective type A say char[100],
>>> then
>>> you can cast the address of A to B, say typedef char (*B)[2] for
>>> instance
>>> and then to const char *, say for use in strlen.  I may be wrong, but
>>> I think
>>> that we should at least try not to pick up char[2] from B, but instead
>>> use A for strlen ranges, or leave this range open.  Currently the range
>>> info for strlen is [0..1] in this case, even if we see the type cast
>>> in the generic tree.
>> ISTM that you're essentially saying that the cast to const char *
>> destroys any type information we can exploit here.  But if that's the
>> case, then I don't think we can even derive a range of [0,99].  What's
>> to say that "A" didn't result from a similar cast of some object that
>> was char[200] that happened out of the scope of what we could see during
>> the strlen range computation?
>>
>> If that is what you're arguing, then I think there's a re-evaluation
>> that needs to happen WRT strlen range computation/
>>
>> And just to be clear, I do see this as a significant correctness
>> question.
>>
>> Martin, thoughts?
> 
> The argument is that given:
> 
>   struct S { char a[4], b; };
> 
>   char a[8] = "1234567";
> 
> this is valid and should pass:
> 
>   __attribute__ ((noipa))
>   void f (struct S *p)
>   {
>     assert (7 == strlen (p->a));
>   }
> 
>   int main (void)
>   {
>     f ((struct S*)a);
>   }
> 
> (This is the basic premise behind pr86259.)
> 
> This argument is wrong and the code is invalid.  For the access
> to p->a to be valid p must point to an object of struct S (it
> doesn't) and the p->a array must hold a nul-terminated string
> (it also doesn't).
I agree with you for C/C++, but I think it's been shown elsewhere in
this thread that GIMPLE semantics to not respect the subobject
boundaries.  That's a sad reality.

[ ... ]

> 
> I care less about the optimization than I do about the basic
> premise that it's essential to respect subobject boundaries(*).
I understand, but the semantics of GIMPLE do not respect them.  We can
argue about whether or not those should change and what it would take to
fix that. But right now the existing semantics do not respect those
boundaries.


> It would make little sense to undo the strlen optimization
> without also undoing the optimization for the direct array
> access case.  Undoing either would raise the question about
> the validity of the _FORRTIFY_SOURCE=2 behavior.  That would
> be a huge step backwards in terms of code security.  If we
> did some of these but not others, it would make the behavior
> inconsistent and surprising, all to accommodate one instance
> of invalid code.
In the direct array access case I think (and I'm sure Jakub, Richi and
others will correct me if I'm wrong), we can use the object's type
because the dereferences are actually using the array's type.


> 
> If we had a valid test case where the strlen optimization
> leads to invalid code, or even if there were a significant
> number of bug reports showing that it breaks an invalid
> but common idiom, I would certainly feel compelled to
> make it right somehow.  But there has been just one bug
> report with clearly invalid code that should be easily
> corrected.
Again, I think you're too narrowly focused on C/C++ semantics here.
What matters are the semantics in GIMPLE.


> 
> Martin
> 
> [*] I also care deeply about all the warnings that depend
> on it to avoid false positives: that's pretty much all those
> I have implemented in the middle-end: -Wformat-{overflow,
> truncation}, -Wstringop-{overflow,truncation}, and likely
> even -Wrestrict.
I know.  And I care deeply about your work to improve the preciseness of
the warnings and ultimately improve the overall quality of code compiled
with GCC.  I just think we can't currently rely on the semantics you
want to exploit to improve the precision of string lengths.

Jeff





Re: [PATCH] Fix memory leak in selftest::test_expansion_to_rtl

2018-08-06 Thread Jeff Law
On 08/02/2018 07:14 PM, David Malcolm wrote:
> "make selftest-valgrind" shows:
> 
> 187 bytes in 1 blocks are definitely lost in loss record 567 of 669
> at 0x4A081D4: calloc (in 
> /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
> by 0x1F08260: xcalloc (xmalloc.c:162)
> by 0xB24F32: init_emit() (emit-rtl.c:5843)
> by 0xC10080: prepare_function_start() (function.c:4803)
> by 0xC10254: init_function_start(tree_node*) (function.c:4877)
> by 0x1CDF92A: selftest::test_expansion_to_rtl() (function-tests.c:595)
> by 0x1CE007C: selftest::function_tests_c_tests() (function-tests.c:676)
> by 0x1E010E7: selftest::run_tests() (selftest-run-tests.c:98)
> by 0x1062D1E: toplev::run_self_tests() (toplev.c:2225)
> by 0x1062F40: toplev::main(int, char**) (toplev.c:2303)
> by 0x1E5B90A: main (main.c:39)
> 
> The allocation in question is:
> 
>   crtl->emit.regno_pointer_align
> = XCNEWVEC (unsigned char, crtl->emit.regno_pointer_align_length);
> 
> This patch fixes this leak (and makes the output of
> "make selftest-valgrind" clean) by calling free_after_compilation at the
> end of the selftest in question.
> 
> Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
> 
> OK for trunk?
> 
> gcc/ChangeLog:
>   * function-tests.c (selftest::test_expansion_to_rtl): Call
>   free_after_compilation.
OK.
jeff


Re: [PATCH, testsuite]: Fix g++.dg/torture/pr86763.C link failure for glibc < 2.17

2018-08-06 Thread Jeff Law
On 08/06/2018 09:10 AM, Uros Bizjak wrote:
> 2018-08-06  Uros Bizjak  
> 
> * g++.dg/torture/pr86763.C (dg-additional-options): Add -lrt.
> 
> Tested on CentOS 5.10 and Fedora 28.
> 
> OK for mainline?
But what about systems without librt?  I'm thinking primarily of the
embedded *-elf targets.

Jeff


Re: dejagnu version update?

2018-08-06 Thread Mike Stump
On Aug 4, 2018, at 9:32 AM, Bernhard Reutner-Fischer  
wrote:
> On Tue, 16 May 2017 at 21:08, Mike Stump  wrote:
>> 
>> On May 16, 2017, at 5:16 AM, Jonathan Wakely  wrote:
>>> The change I care about in 1.5.3
>> 
>> So, we haven't talked much about the version people want most.  If we 
>> update, might as well get something that more people care about.  1.5.3 is 
>> in ubuntu LTS 16.04 and Fedora 24, so it's been around awhile.  SUSU is said 
>> to be using 1.6, in the post 1.4.4 systems.  People stated they want 1.5.2 
>> and 1.5.3, so, I'm inclined to say, let's shoot for 1.5.3 when we do update.
>> 
>> As for the machines in the FSF compile farm, nah, tail wagging the dog.  I'd 
>> rather just update the requirement, and the owners or users of those 
>> machines can install a new dejagnu, if they are using one that is too old 
>> and they want to support testing gcc.
> 
> So.. let me ping that, again, now that another year has passed :)

Putting on my random engineer hat, does Centos 7 have a patch in it?  My system 
says 1.5.1.

Since g++ already requires 1.5.3, it make no sense to bump to anything older 
that 1.5.3, so let's bump to 1.5.3.  Those packaging systems and OSes that 
wanted to update by now, have had their chance to update.  Those that punt 
until we bump the requirement, well, they will now have to bump.  :-)

Ok to update to 1.5.3.

I'll pre-approve the patches to simplify and remove work arounds from the 
testsuite that cater to older versions.

If an RM wants to push the approval to sometime later (post a release branch 
creation point for example), let's give them a few days to request deferral.  I 
don't want to impact any next release in a way an RM doesn't want.  RM approval 
for back ports, I think we don't want to back port to a previous release, but 
I'm happy to defer to RM; if they want to do it.

[PATCH] Diagnostic included-from loop

2018-08-06 Thread Nathan Sidwell
We currently emit the 'included from' information in a peeled-once loop. 
 This patch rerolls the loop.  It'll make some future changes simpler, 
and we're not at all worried about speed here -- it's a diagnostic!


In the long term, the include-at path will include modules (imported 
at).  Having single logic to figure that out is simpler.


In the near future, I have a patch to change the representation of the 
included-from information, which I'll get to shortly.


ok?

nathan

--
Nathan Sidwell
2018-08-06  Nathan Sidwell  

	* diagnostic.c (diagnostic_report_current_module): Reroll
	included-at loop.

Index: diagnostic.c
===
--- diagnostic.c	(revision 263332)
+++ diagnostic.c	(working copy)
@@ -587,22 +587,26 @@ diagnostic_report_current_module (diagno
   set_last_module (context, map);
   if (! MAIN_FILE_P (map))
 	{
-	  map = INCLUDED_FROM (line_table, map);
-	  const char *line_col
-	= maybe_line_and_column (LAST_SOURCE_LINE (map),
- context->show_column
- ? LAST_SOURCE_COLUMN (map) : 0);
-	  pp_verbatim (context->printer,
-		   "In file included from %r%s%s%R", "locus",
-		   LINEMAP_FILE (map), line_col);
-	  while (! MAIN_FILE_P (map))
+	  bool first = true;
+	  do
 	{
 	  map = INCLUDED_FROM (line_table, map);
-	  line_col = maybe_line_and_column (LAST_SOURCE_LINE (map), 0);
-	  pp_verbatim (context->printer,
-			   ",\n from %r%s%s%R", "locus",
-			   LINEMAP_FILE (map), line_col);
+	  const char *line_col
+		= maybe_line_and_column (LAST_SOURCE_LINE (map),
+	 first && context->show_column
+	 ? SOURCE_COLUMN (map, where) : 0);
+	  static const char *const msgs[] =
+		{
+		 N_("In file included from"),
+		 N_(" from"),
+		};
+	  unsigned index = !first;
+	  pp_verbatim (context->printer, "%s%s %r%s%s%R",
+			   first ? "" : ",\n", _(msgs[index]),
+			   "locus", LINEMAP_FILE (map), line_col);
+	  first = false;
 	}
+	  while (! MAIN_FILE_P (map));
 	  pp_verbatim (context->printer, ":");
 	  pp_newline (context->printer);
 	}


Re: [PATCH, testsuite]: Fix g++.dg/torture/pr86763.C link failure for glibc < 2.17

2018-08-06 Thread Uros Bizjak
On Mon, Aug 6, 2018 at 5:23 PM, Jeff Law  wrote:
> On 08/06/2018 09:10 AM, Uros Bizjak wrote:
>> 2018-08-06  Uros Bizjak  
>>
>> * g++.dg/torture/pr86763.C (dg-additional-options): Add -lrt.
>>
>> Tested on CentOS 5.10 and Fedora 28.
>>
>> OK for mainline?
> But what about systems without librt?  I'm thinking primarily of the
> embedded *-elf targets.

Perhaps the following patch is better:

--cut here--
Index: g++.dg/torture/pr86763.C
===
--- g++.dg/torture/pr86763.C(revision 263317)
+++ g++.dg/torture/pr86763.C(working copy)
@@ -1,5 +1,6 @@
 // { dg-do run }
 // { dg-additional-options "-fschedule-insns2 -fstrict-aliasing" }
+// { dg-additional-options "-lrt" { target *-*-linux-gnu } }

 #include 
 #include 
--cut here--

Uros.


Re: [PATCH] Make strlen range computations more conservative

2018-08-06 Thread Jeff Law
On 07/27/2018 12:48 AM, Bernd Edlinger wrote:
> I have one more example similar to PR86259, that resembles IMHO real world 
> code:
> 
> Consider the following:
> 
> 
> int fun (char *p)
> {
>   char buf[16];
> 
>   assert(strlen(p) < 4); //here: security relevant check
> 
>   sprintf(buf, "echo %s - %s", p, p); //here: security relevant code
>   return system(buf);
> }
> 
> 
> What is wrong with the assertion?
> 
> Nothing, except it is removed, when this function is called from untrusted 
> code:
> 
> untrused_fun ()
> {
>char b[2] = "ab";
>fun(b);
> }
> 
>  don't try to execute that: after "ab" there can be "; rm -rF / ;" on 
> your stack
But this code is fundamentally broken and catering to this kind of crap
is well, dumb.  At the point where we call strlen we've invoked
undefined behavior.

These aren't security checks in my mind, they're bandaids for idiot code
and are not suitable justification for making any changes for how we
generate code in GCC.

You could use them as an argument for improving warnings though.

Jeff


C++ PATCH to fix infinite constexpr evaluation (PR c++/86767)

2018-08-06 Thread Marek Polacek
This PR points out infinite looping in constexpr evaluation for e.g.

  for (...)
{
  continue;
  while (1);
}

where we tried to evaluate the while despite the continue.  The problem
started with r240591 and its cxx_eval_statement_list changes.  This
patch fixes it by handling continue in such a way that we skip stmts that
are not meant to be evaluated.

Bootstrapped/regtested on x86_64-linux, ok for trunk/8.3?

2018-08-06  Marek Polacek  

PR c++/86767
* constexpr.c (cxx_eval_statement_list): Handle continue.

* g++.dg/cpp1y/constexpr-86767.C: New test.

diff --git gcc/cp/constexpr.c gcc/cp/constexpr.c
index 365296d6e3b..79039ff2b79 100644
--- gcc/cp/constexpr.c
+++ gcc/cp/constexpr.c
@@ -3950,6 +3950,16 @@ cxx_eval_statement_list (const constexpr_ctx *ctx, tree 
t,
   for (i = tsi_start (t); !tsi_end_p (i); tsi_next (&i))
 {
   tree stmt = tsi_stmt (i);
+  /* We've found a continue, so skip everything until we reach
+the label its jumping to.  */
+  if (continues (jump_target))
+   {
+ if (label_matches (ctx, jump_target, stmt))
+   /* Found it.  */
+   *jump_target = NULL_TREE;
+ else
+   continue;
+   }
   if (TREE_CODE (stmt) == DEBUG_BEGIN_STMT)
continue;
   r = cxx_eval_constant_expression (ctx, stmt, false,
diff --git gcc/testsuite/g++.dg/cpp1y/constexpr-86767.C 
gcc/testsuite/g++.dg/cpp1y/constexpr-86767.C
index e69de29bb2d..2ad71d507d1 100644
--- gcc/testsuite/g++.dg/cpp1y/constexpr-86767.C
+++ gcc/testsuite/g++.dg/cpp1y/constexpr-86767.C
@@ -0,0 +1,119 @@
+// PR c++/86767
+// { dg-do compile { target c++14 } }
+
+constexpr int
+fn0 () noexcept
+{
+  int r = 0;
+  for (int i = 0; i < 10; ++i)
+{
+  continue;
+  r++;
+  for (int j = 0; j < 10; ++j )
+   {
+   }
+}
+  return r;
+}
+static_assert (fn0 () == 0, "");
+
+constexpr int
+fn1 () noexcept
+{
+  int r = 0;
+  for (int i = 0; i < 10; ++i)
+for (int j = 0; j < 10; ++j)
+  {
+   continue;
+   r++;
+  }
+  return r;
+}
+static_assert (fn1 () == 0, "");
+
+constexpr int
+fn2 () noexcept
+{
+  int r = 0;
+  for (int i = 0; i < 10; ++i)
+{
+  continue;
+  r++;
+}
+  return r;
+}
+static_assert (fn2 () == 0, "");
+
+constexpr int
+fn3 () noexcept
+{
+  int r = 0;
+  for (int i = 0; i < 10; ++i)
+{
+  continue;
+  r++;
+  while (1)
+   {
+   }
+}
+  return r;
+}
+static_assert (fn3 () == 0, "");
+
+constexpr int
+fn4 () noexcept
+{
+  for (int i = 0; i < 10; ++i)
+{
+  switch (i)
+   {
+   case 5:
+ return i;
+   default:
+ continue;
+   }
+  while (1)
+   {
+   }
+}
+  return 0;
+}
+static_assert (fn4 () == 5, "");
+
+constexpr int
+fn5 () noexcept
+{
+  for (int i = 0; i < 10; ++i)
+{
+  switch (i)
+   {
+   case 0:
+   case 1:
+   case 2:
+   case 3:
+   case 4:
+ continue;
+   default:
+ return i;
+   }
+  while (1)
+   {
+   }
+}
+  return 0;
+}
+static_assert (fn5 () == 5, "");
+
+constexpr int
+fn6 () noexcept
+{
+  int r = 0;
+  for (int i = 0; i < 10; ++i)
+{
+  continue;
+  for (int j = 0; j < 10; ++j )
+   r++;
+}
+  return r;
+}
+static_assert (fn6 () == 0, "");

Marek


Re: [PATCH] Make strlen range computations more conservative

2018-08-06 Thread Martin Sebor

On 08/05/2018 11:27 AM, Richard Biener wrote:

On August 4, 2018 10:52:02 PM GMT+02:00, Martin Sebor  wrote:

On 08/03/2018 01:43 AM, Jakub Jelinek wrote:

On Thu, Aug 02, 2018 at 09:59:13PM -0600, Martin Sebor wrote:

If I call this with foo (2, 1), do you still claim it is not valid

C?


String functions like strlen operate on character strings stored
in character arrays.  Calling strlen (&s[1]) is invalid because
&s[1] is not the address of a character array.  The fact that
objects can be represented as arrays of bytes doesn't change
that.  The standard may be somewhat loose with words on this
distinction but the intent certainly isn't for strlen to traverse
arbitrary sequences of bytes that cross subobject boundaries.
(That is the intent behind the raw memory functions, but
the current text doesn't make the distinction clear.)


But the standard doesn't say that right now.


It does, in the restriction on multi-dimensional array accesses.
Given the array 'char a[2][2];' it's only valid to access a[0][0]
and a[0][1], and a[1][0], and a[1][1].  It's not valid to access
a[2][0] or a[2][1], even though they happen to be located at
the same addresses as a[1][0] and a[1][1].

There is no exception for distinct struct members.  So in
a struct { char a[2], b[2]; }, even though a and b and laid
out the same way as char[2][2] would be, it's not valid to
treat a as such.  There is no distinction between array
subscripting and pointer arithmetic, so it doesn't matter
what form the access takes.


What does the standard say to comparing & s. a[2] and & s. b[0] and what does 
that mean when you consider converting those to uintptr_t and back and then access the 
data pointed to?
Points-to analysis considers the first pointer to point to both subobjects 
while the second only to the second. (just pointing out other maybe 
inconsistent itself within GIMPLE handling of subobjects in points-to analysis)


The text (since C99) says that such pointers compare equal.

This doesn't imply that it's intended to be valid to access
the adjacent object using the past-the-end pointer.  Making
this clear is one of the main goals of the (evolving)
provenance proposal.  Converting to uintptr_t isn't meant
to change that either (the provenance is preserved through
such conversions).

Martin



Re: [PATCH] Make strlen range computations more conservative

2018-08-06 Thread Jeff Law
On 08/02/2018 10:19 PM, Martin Sebor wrote:
> On 08/01/2018 12:55 AM, Bernd Edlinger wrote:
>>> Certainly not every "strlen" has these semantics.  For example,
>>> this open-coded one doesn't:
>>>
>>>    int len = 0;
>>>    for (int i = 0; s.a[i]; ++i)
>>>  ++len;
>>>
>>> It computes 2 (with no warning for the out-of-bounds access).
>>>
>>
>> yes, which is questionable as well, but that happens only
>> if the source code accesses the array via s.a[i]
>> not if it happens to use char *, as this experiment shows:
> 
> Yes, that just happens to be the case with GCC in some
> situations, and not in others.  That's why it shouldn't
> be relied on.
Right.  An access via an ARRAY_REF carries some semantic information
that is not carried by a INDIRECT_REF.   However, we may at times lower
an ARRAY_REF to an INDIRECT_REF -- and I believe we've had code in the
front-ends to go the other way.

> 
>> The point I make is that it is impossible to know where the function
>> is inlined, and if the original code can be broken in surprising ways.
>> And most importantly strlen is often used in security relevant ways.
> 
> Code that's concerned with security or safety (which should
> be all of it) needs to follow the basic rules of the language.
> Calling strlen() on a char[4] argument expecting it to return
> a value larger than 3 as an indication that the array isn't
> nul-terminated is not a secure coding practice -- it's a plain
> old bug.  Don't take my word for it -- read any of the secure
> coding standards: CEERT STR32-C. Do not pass a non-null-terminated
> character sequence to a library function that expects a string,
> CWE-170: Improper Null Termination, OWASP String Termination
> Error.  This is elementary material that shouldn't need
> explaining.
I couldn't agree more with this.

jeff


Re: C++ PATCH to fix infinite constexpr evaluation (PR c++/86767)

2018-08-06 Thread Nathan Sidwell

On 08/06/2018 11:35 AM, Marek Polacek wrote:

This PR points out infinite looping in constexpr evaluation for e.g.



Bootstrapped/regtested on x86_64-linux, ok for trunk/8.3?

2018-08-06  Marek Polacek  

PR c++/86767
* constexpr.c (cxx_eval_statement_list): Handle continue.


ok.  I like the use of 'continue' in the processing of 'continue' :)

nathan

--
Nathan Sidwell


Re: [PATCH, testsuite]: Fix g++.dg/torture/pr86763.C link failure for glibc < 2.17

2018-08-06 Thread Jeff Law
On 08/06/2018 09:33 AM, Uros Bizjak wrote:
> On Mon, Aug 6, 2018 at 5:23 PM, Jeff Law  wrote:
>> On 08/06/2018 09:10 AM, Uros Bizjak wrote:
>>> 2018-08-06  Uros Bizjak  
>>>
>>> * g++.dg/torture/pr86763.C (dg-additional-options): Add -lrt.
>>>
>>> Tested on CentOS 5.10 and Fedora 28.
>>>
>>> OK for mainline?
>> But what about systems without librt?  I'm thinking primarily of the
>> embedded *-elf targets.
> 
> Perhaps the following patch is better:
> 
> --cut here--
> Index: g++.dg/torture/pr86763.C
> ===
> --- g++.dg/torture/pr86763.C(revision 263317)
> +++ g++.dg/torture/pr86763.C(working copy)
> @@ -1,5 +1,6 @@
>  // { dg-do run }
>  // { dg-additional-options "-fschedule-insns2 -fstrict-aliasing" }
> +// { dg-additional-options "-lrt" { target *-*-linux-gnu } }
That's be fine with me.

If someone wants the test to be run on solaris or some other target with
librt, they can extend the selector to include the proper target/versions.

Jeff


Re: [PATCH, testsuite]: Fix g++.dg/torture/pr86763.C link failure for glibc < 2.17

2018-08-06 Thread Uros Bizjak
On Mon, Aug 6, 2018 at 5:44 PM, Jeff Law  wrote:
> On 08/06/2018 09:33 AM, Uros Bizjak wrote:
>> On Mon, Aug 6, 2018 at 5:23 PM, Jeff Law  wrote:
>>> On 08/06/2018 09:10 AM, Uros Bizjak wrote:
 2018-08-06  Uros Bizjak  

 * g++.dg/torture/pr86763.C (dg-additional-options): Add -lrt.

 Tested on CentOS 5.10 and Fedora 28.

 OK for mainline?
>>> But what about systems without librt?  I'm thinking primarily of the
>>> embedded *-elf targets.
>>
>> Perhaps the following patch is better:
>>
>> --cut here--
>> Index: g++.dg/torture/pr86763.C
>> ===
>> --- g++.dg/torture/pr86763.C(revision 263317)
>> +++ g++.dg/torture/pr86763.C(working copy)
>> @@ -1,5 +1,6 @@
>>  // { dg-do run }
>>  // { dg-additional-options "-fschedule-insns2 -fstrict-aliasing" }
>> +// { dg-additional-options "-lrt" { target *-*-linux-gnu } }
> That's be fine with me.
>
> If someone wants the test to be run on solaris or some other target with
> librt, they can extend the selector to include the proper target/versions.

Thanks, I'll commit the patch to mainline with updated ChangeLog entry:

2018-08-06  Uros Bizjak  

* g++.dg/torture/pr86763.C (dg-additional-options): Add -lrt
for target *-*-linux-gnu.

Uros.


Re: [PATCH, testsuite]: Fix g++.dg/torture/pr86763.C link failure for glibc < 2.17

2018-08-06 Thread Andreas Schwab
How about replacing clock_gettime with something else?  It's not needed
for the particular test.

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: [Patch, avr, PR85624] - Fix ICE when initializing 128-byte aligned array

2018-08-06 Thread Jeff Law
On 08/06/2018 04:53 AM, Senthil Kumar Selvaraj wrote:
> Ping!
> 
> Regards
> Senthil
> 
> Senthil Kumar Selvaraj writes:
> 
>> Hi,
>>
>> The below patch fixes an ICE for the avr target when the setmemhi
>> expander is involved.
>>
>> The setmemhi expander generated RTL ends up as an unrecognized insn
>> if the alignment of the destination exceeds that of a QI
>> mode const_int (127), AND the number of bytes to set fits in a QI
>> mode const_int. The second condition prevents *clrmemhi from matching,
>> and *clrmemqi does not match because it expects operand 3 (the alignment
>> const_int rtx) to be QI mode, and a value of 128 or greater does not fit.
>>   
>> The patch fixes this by changing the *clrmemqi pattern to match a HI
>> mode const_int, and also adds a testcase.
>>
>> Regression test showed no new failures, ok to commit to trunk?
>>
>> Regards
>> Senthil
>>
>> gcc/ChangeLog:
>> 
>> 2018-07-18  Senthil Kumar Selvaraj  
>> 
>> PR target/85624
>> * config/avr/avr.md (*clrmemqi): Change mode of operands[2]
>> from QI to HI.
>> 
>> gcc/testsuite/ChangeLog:
>> 
>> 2018-07-18  Senthil Kumar Selvaraj  
>> 
>> PR target/85624
>> * gcc.target/avr/pr85624.c: New test.
Given there's also pattern clrmemhi it feels like you're papering over a
bug elsewhere, possibly in the expanders.

Ultimately I'll leave the decision here to the AVR maintainers through.

jeff


Re: [PATCH] Diagnostic included-from loop

2018-08-06 Thread David Malcolm
On Mon, 2018-08-06 at 11:32 -0400, Nathan Sidwell wrote:
> We currently emit the 'included from' information in a peeled-once
> loop. 
>   This patch rerolls the loop.  It'll make some future changes
> simpler, 
> and we're not at all worried about speed here -- it's a diagnostic!
> 
> In the long term, the include-at path will include modules (imported 
> at).  Having single logic to figure that out is simpler.
> 
> In the near future, I have a patch to change the representation of
> the 
> included-from information, which I'll get to shortly.
> 
> ok?
> 
> nathan

I believe the only functional change here is that the strings are now
marked for translation, and translated.

OK for trunk, assuming you performed the usual testing on this patch.

Dave


Re: [PATCH 02/11] Arm - add speculation_barrier pattern

2018-08-06 Thread Richard Earnshaw (lists)
On 06/08/18 15:00, Christophe Lyon wrote:
> On Fri, 27 Jul 2018 at 11:38, Richard Earnshaw  
> wrote:
>>
>>
>> This patch defines a speculation barrier for AArch32.
>>
>> * config/arm/unspecs.md (unspecv): Add VUNSPEC_SPECULATION_BARRIER.
>> * config/arm/arm.md (speculation_barrier): New expand.
>> (speculation_barrier_insn): New pattern.
>> ---
> 
> Hi Richard,
> 
> This patch causes compilation errors for c-c++-common/spec-barrier-2.c
> when compiled for old architectures (eg -march=armv5t):
> /ccrf2LoR.s:49: Error: selected processor does not support `isb' in ARM mode
> /ccrf2LoR.s:50: Error: selected processor does not support `dsb sy' in ARM 
> mode
> 
> Does this belong to the kind of failures you still expect from this
> patch series?
> 

Nope :-(

I'll look into it.  We may need to just disable these patterns for
architectures that old.  Thumb1 code on pre-v6t2 similarly can't be
protected.

Thanks for reporting.

R.


Re: [testsuite] Fix gcc.dg/vect/no-section-anchors-vect-69.c on SPARC etc. (PR tree-optimization/80925)

2018-08-06 Thread Jeff Law
On 08/06/2018 02:31 AM, Rainer Orth wrote:
> Since this patch
> 
> 2017-07-31  Steve Ellcey  
> 
> PR tree-optimization/80925
> * gcc.dg/vect/no-section-anchors-vect-69.c: Add
> --param vect-max-peeling-for-alignment=0 option.
> Remove unaligned access and peeling checks.
> * gcc.dg/vect/section-anchors-vect-69.c: Ditto.
> 
> gcc.dg/vect/no-section-anchors-vect-69.c FAILs on a couple of targets,
> including ia64, aarch64, powerpc64lc, and sparc:
> 
> FAIL: gcc.dg/vect/no-section-anchors-vect-69.c scan-tree-dump-times vect 
> "vectorized 4 loops" 1
> 
> Steve suggested the fix implemented below in the PR, which I've now
> finally tested on sparc-sun-solaris2.11 and i386-pc-solaris2.11, both
> trunk and gcc-8 branch.
> 
> Ok for both?
OK.
jeff


[PATCH][AAarch64][v3] Add support for TARGET_COMPUTE_FRAME_LAYOUT

2018-08-06 Thread Vlad Lazar

Hi,

The patch adds support for the TARGET_COMPUTE_FRAME_LAYOUT hook on AArch64
and removes unneeded frame layout recalculation.

The removed aarch64_layout_frame calls are unnecessary because the functions in 
which
they appear will be called during or after the reload pass in which the 
TARGET_COMPUTE_FRAME_LAYOUT
hook is called. The if statement in aarch64_layout_frame had the purpose of 
avoiding
the extra work from the calls which have been removed and is now redundant.

Bootstrapped and regtested on aarch64-none-linux-gnu and there are no 
regressions.

Thanks,
Vlad

gcc/
2018-08-06  Vlad Lazar  

* config/aarch64/aarch64.h (TARGET_COMPUTE_FRAME_LAYOUT): Define.
* config/aarch64/aarch64.c (aarch64_expand_prologue): Remove 
aarch64_layout_frame call.
(aarch64_expand_epilogue): Likewise.
(aarch64_initial_elimination_offset): Likewise.
(aarch64_get_separate_components): Likewise.
(aarch64_use_return_insn_p): Likewise.
(aarch64_layout_frame): Remove unneeded check.

---

diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 
976f9afae54c1c98c22a4d5002d8d94e33b3190a..02aaa333fb57eff918049681173403f004a8a8e3
 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -478,6 +478,9 @@ extern unsigned aarch64_architecture_version;
 #undef DONT_USE_BUILTIN_SETJMP
 #define DONT_USE_BUILTIN_SETJMP 1
 
+#undef TARGET_COMPUTE_FRAME_LAYOUT

+#define TARGET_COMPUTE_FRAME_LAYOUT aarch64_layout_frame
+
 /* Register in which the structure value is to be returned.  */
 #define AARCH64_STRUCT_VALUE_REGNUM R8_REGNUM
 
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c

index 
b88e7cac27ab76e01b9769563ec9077d2a81bd7b..6a52eecf94011f5a7ee787c1295ca24732af2ff4
 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -4021,9 +4021,6 @@ aarch64_layout_frame (void)
   HOST_WIDE_INT offset = 0;
   int regno, last_fp_reg = INVALID_REGNUM;
 
-  if (reload_completed && cfun->machine->frame.laid_out)

-return;
-
   cfun->machine->frame.emit_frame_chain = aarch64_needs_frame_chain ();
 
 #define SLOT_NOT_REQUIRED (-2)

@@ -4567,8 +4564,6 @@ offset_12bit_unsigned_scaled_p (machine_mode mode, 
poly_int64 offset)
 static sbitmap
 aarch64_get_separate_components (void)
 {
-  aarch64_layout_frame ();
-
   sbitmap components = sbitmap_alloc (LAST_SAVED_REGNUM + 1);
   bitmap_clear (components);
 
@@ -4592,7 +4587,7 @@ aarch64_get_separate_components (void)
 
   unsigned reg1 = cfun->machine->frame.wb_candidate1;

   unsigned reg2 = cfun->machine->frame.wb_candidate2;
-  /* If aarch64_layout_frame has chosen registers to store/restore with
+  /* If registers have been chosen to be stored/restored with
  writeback don't interfere with them to avoid having to output explicit
  stack adjustment instructions.  */
   if (reg2 != INVALID_REGNUM)
@@ -4850,8 +4845,6 @@ aarch64_add_cfa_expression (rtx_insn *insn, unsigned int 
reg,
 void
 aarch64_expand_prologue (void)
 {
-  aarch64_layout_frame ();
-
   poly_int64 frame_size = cfun->machine->frame.frame_size;
   poly_int64 initial_adjust = cfun->machine->frame.initial_adjust;
   HOST_WIDE_INT callee_adjust = cfun->machine->frame.callee_adjust;
@@ -4964,8 +4957,6 @@ aarch64_use_return_insn_p (void)
   if (crtl->profile)
 return false;
 
-  aarch64_layout_frame ();

-
   return known_eq (cfun->machine->frame.frame_size, 0);
 }
 
@@ -4977,8 +4968,6 @@ aarch64_use_return_insn_p (void)

 void
 aarch64_expand_epilogue (bool for_sibcall)
 {
-  aarch64_layout_frame ();
-
   poly_int64 initial_adjust = cfun->machine->frame.initial_adjust;
   HOST_WIDE_INT callee_adjust = cfun->machine->frame.callee_adjust;
   poly_int64 final_adjust = cfun->machine->frame.final_adjust;
@@ -7525,8 +7514,6 @@ aarch64_can_eliminate (const int from ATTRIBUTE_UNUSED, 
const int to)
 poly_int64
 aarch64_initial_elimination_offset (unsigned from, unsigned to)
 {
-  aarch64_layout_frame ();
-
   if (to == HARD_FRAME_POINTER_REGNUM)
 {
   if (from == ARG_POINTER_REGNUM)


Re: [PATCH] Make strlen range computations more conservative

2018-08-06 Thread Martin Sebor

On 08/06/2018 09:12 AM, Jeff Law wrote:

On 08/04/2018 03:56 PM, Martin Sebor wrote:

On 08/03/2018 01:00 AM, Jeff Law wrote:

On 07/24/2018 05:18 PM, Bernd Edlinger wrote:

On 07/24/18 23:46, Jeff Law wrote:

On 07/24/2018 01:59 AM, Bernd Edlinger wrote:

Hi!

This patch makes strlen range computations more conservative.

Firstly if there is a visible type cast from type A to B before
passing
then value to strlen, don't expect the type layout of B to restrict
the
possible return value range of strlen.

Why do you think this is the right thing to do?  ie, is there language
in the standards that makes you think the code as it stands today is
incorrect from a conformance standpoint?  Is there a significant
body of
code that is affected in an adverse way by the current code?  If so,
what code?




I think if you have an object, of an effective type A say char[100],
then
you can cast the address of A to B, say typedef char (*B)[2] for
instance
and then to const char *, say for use in strlen.  I may be wrong, but
I think
that we should at least try not to pick up char[2] from B, but instead
use A for strlen ranges, or leave this range open.  Currently the range
info for strlen is [0..1] in this case, even if we see the type cast
in the generic tree.

ISTM that you're essentially saying that the cast to const char *
destroys any type information we can exploit here.  But if that's the
case, then I don't think we can even derive a range of [0,99].  What's
to say that "A" didn't result from a similar cast of some object that
was char[200] that happened out of the scope of what we could see during
the strlen range computation?

If that is what you're arguing, then I think there's a re-evaluation
that needs to happen WRT strlen range computation/

And just to be clear, I do see this as a significant correctness
question.

Martin, thoughts?


The argument is that given:

  struct S { char a[4], b; };

  char a[8] = "1234567";

this is valid and should pass:

  __attribute__ ((noipa))
  void f (struct S *p)
  {
assert (7 == strlen (p->a));
  }

  int main (void)
  {
f ((struct S*)a);
  }

(This is the basic premise behind pr86259.)

This argument is wrong and the code is invalid.  For the access
to p->a to be valid p must point to an object of struct S (it
doesn't) and the p->a array must hold a nul-terminated string
(it also doesn't).

I agree with you for C/C++, but I think it's been shown elsewhere in
this thread that GIMPLE semantics to not respect the subobject
boundaries.  That's a sad reality.

[ ... ]



I care less about the optimization than I do about the basic
premise that it's essential to respect subobject boundaries(*).

I understand, but the semantics of GIMPLE do not respect them.  We can
argue about whether or not those should change and what it would take to
fix that. But right now the existing semantics do not respect those
boundaries.


They don't respect them in all cases (i.e., when MEM_REF loses
information about the structure of an access) but in a good
number of them GCC can still derive useful information from
the access.  It's relied on to great a effect by _FORTIFTY_SOURCE.
I think it would be a welcome enhancement if besides out-of-
bounds writes _FORTIFTY_SOURCE also prevented out-of-bounds
reads.


It would make little sense to undo the strlen optimization
without also undoing the optimization for the direct array
access case.  Undoing either would raise the question about
the validity of the _FORRTIFY_SOURCE=2 behavior.  That would
be a huge step backwards in terms of code security.  If we
did some of these but not others, it would make the behavior
inconsistent and surprising, all to accommodate one instance
of invalid code.

In the direct array access case I think (and I'm sure Jakub, Richi and
others will correct me if I'm wrong), we can use the object's type
because the dereferences are actually using the array's type.


Subscripting and pointer access are identical both in C/C++
and in GCC's _FORTIFY_SOURCE.  The absence of a distinction
between the two is essential for preventing writes past
the end by string functions like strcpy (_FORTIFY_SOURCE).


If we had a valid test case where the strlen optimization
leads to invalid code, or even if there were a significant
number of bug reports showing that it breaks an invalid
but common idiom, I would certainly feel compelled to
make it right somehow.  But there has been just one bug
report with clearly invalid code that should be easily
corrected.

Again, I think you're too narrowly focused on C/C++ semantics here.
What matters are the semantics in GIMPLE.


I don't get that.  GCC is a C/C++ compiler (besides other
languages), but not a GIMPLE compiler.   The only reason this
came up at all is a bug report with an invalid C test case that
reads past the end.  The only reason in my mind to consider
relaxing an assumption/restriction would be a valid test case
(in any supported language) that the optimization invalidates.

But

Re: [PATCH][GCC][AARCH64] Use STLUR for atomic_store

2018-08-06 Thread Matthew Malcomson

On 03/08/18 13:18, Sudakshina Das wrote:


Thank you for doing this. I am not a maintainer but I have a few nits on
this patch:

Thanks for the suggestions Sudi: nits have been removed, and new patch 
attached.


Have also removed the reference to "stlur" in names as Richard suggested.

New changelog:

gcc/
2018-08-06  Matthew Malcomson  

    * config/aarch64/aarch64-protos.h
    (aarch64_offset_9bit_signed_unscaled_p): New declaration.
    * config/aarch64/aarch64.c
    (aarch64_offset_9bit_signed_unscaled_p): Rename from
    offset_9bit_signed_unscaled_p.
    * config/aarch64/aarch64.h (TARGET_ARMV8_4): Add feature macro.
    * config/aarch64/atomics.md (atomic_store): Allow offset
    and use stlur.
    * config/aarch64/constraints.md (Ust): New constraint.
    * config/aarch64/predicates.md.
    (aarch64_sync_or_offset_memory_operand): New predicate.
    (aarch64_armv84_offset_memory_operand): New predicate.

gcc/testsuite/
2018-08-06  Matthew Malcomson  

    * gcc.target/aarch64/atomic-store.c: New.


Regards,
Matthew
diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index ef95fc829b83886e2ff00e4664e31af916e99b0c..7a6254e46893fb36dc2ae57e7cfe78af67fb0e49 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -393,6 +393,7 @@ void aarch64_split_add_offset (scalar_int_mode, rtx, rtx, rtx, rtx, rtx);
 bool aarch64_mov_operand_p (rtx, machine_mode);
 rtx aarch64_reverse_mask (machine_mode, unsigned int);
 bool aarch64_offset_7bit_signed_scaled_p (machine_mode, poly_int64);
+bool aarch64_offset_9bit_signed_unscaled_p (machine_mode, poly_int64);
 char *aarch64_output_sve_cnt_immediate (const char *, const char *, rtx);
 char *aarch64_output_sve_addvl_addpl (rtx, rtx, rtx);
 char *aarch64_output_sve_inc_dec_immediate (const char *, rtx);
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index c1218503bab19323eee1cca8b7e4bea8fbfcf573..328512e11f4230e24223bc51e55bdca8b31f6a20 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -237,6 +237,9 @@ extern unsigned aarch64_architecture_version;
 /* ARMv8.3-A features.  */
 #define TARGET_ARMV8_3	(AARCH64_ISA_V8_3)
 
+/* ARMv8.4-A features.  */
+#define TARGET_ARMV8_4	(AARCH64_ISA_V8_4)
+
 /* Make sure this is always defined so we don't have to check for ifdefs
but rather use normal ifs.  */
 #ifndef TARGET_FIX_ERR_A53_835769_DEFAULT
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 13b5448aca88555222481f0955237b6fdcbb38b9..b9f3759ee67bbcf8a536cb5c334162922ae4cfd0 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -4470,9 +4470,9 @@ aarch64_offset_7bit_signed_scaled_p (machine_mode mode, poly_int64 offset)
 
 /* Return true if OFFSET is a signed 9-bit value.  */
 
-static inline bool
-offset_9bit_signed_unscaled_p (machine_mode mode ATTRIBUTE_UNUSED,
-			   poly_int64 offset)
+bool
+aarch64_offset_9bit_signed_unscaled_p (machine_mode mode ATTRIBUTE_UNUSED,
+   poly_int64 offset)
 {
   HOST_WIDE_INT const_offset;
   return (offset.is_constant (&const_offset)
@@ -5747,7 +5747,7 @@ aarch64_classify_address (struct aarch64_address_info *info,
 	 instruction memory accesses.  */
 	  if (mode == TImode || mode == TFmode)
 	return (aarch64_offset_7bit_signed_scaled_p (DImode, offset)
-		&& (offset_9bit_signed_unscaled_p (mode, offset)
+		&& (aarch64_offset_9bit_signed_unscaled_p (mode, offset)
 			|| offset_12bit_unsigned_scaled_p (mode, offset)));
 
 	  /* A 7bit offset check because OImode will emit a ldp/stp
@@ -5761,7 +5761,8 @@ aarch64_classify_address (struct aarch64_address_info *info,
 	 ldr/str instructions (only big endian will get here).  */
 	  if (mode == CImode)
 	return (aarch64_offset_7bit_signed_scaled_p (TImode, offset)
-		&& (offset_9bit_signed_unscaled_p (V16QImode, offset + 32)
+		&& (aarch64_offset_9bit_signed_unscaled_p (V16QImode,
+			   offset + 32)
 			|| offset_12bit_unsigned_scaled_p (V16QImode,
 			   offset + 32)));
 
@@ -5801,7 +5802,7 @@ aarch64_classify_address (struct aarch64_address_info *info,
 		 || known_eq (GET_MODE_SIZE (mode), 16))
 		&& aarch64_offset_7bit_signed_scaled_p (mode, offset));
 	  else
-	return (offset_9bit_signed_unscaled_p (mode, offset)
+	return (aarch64_offset_9bit_signed_unscaled_p (mode, offset)
 		|| offset_12bit_unsigned_scaled_p (mode, offset));
 	}
 
@@ -5854,7 +5855,7 @@ aarch64_classify_address (struct aarch64_address_info *info,
 	   */
 	  if (mode == TImode || mode == TFmode)
 	return (aarch64_offset_7bit_signed_scaled_p (mode, offset)
-		&& offset_9bit_signed_unscaled_p (mode, offset));
+		&& aarch64_offset_9bit_signed_unscaled_p (mode, offset));
 
 	  if (load_store_pair_p)
 	return ((known_eq (GET_MODE_SIZE (mode), 4)
@@ -5862,7 +5863,7 @@ aarch64_classify_address (struct aarch64_address_info *info,
 		 || known_eq (GET_

Re: [PATCH] convert braced initializers to strings (PR 71625)

2018-08-06 Thread Martin Sebor

Ping: https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01884.html

On 07/30/2018 05:51 PM, Martin Sebor wrote:

The middle-end contains code to determine the lengths of constant
character arrays initialized by string literals.  The code is used
in a number of optimizations and warnings.

However, the code is unable to deal with constant arrays initialized
using the braced initializer syntax, as in

  const char a[] = { '1', '2', '\0' };

The attached patch extends the C and C++ front-ends to convert such
initializers into a STRING_CST form.

The goal of this work is to both enable existing optimizations for
such arrays, and to help detect bugs due to using non-nul terminated
arrays where nul-terminated strings are expected.  The latter is
an extension of the GCC 8 _Wstringop-overflow and
-Wstringop-truncation warnings that help detect or prevent reading
past the end of dynamically created character arrays.  Future work
includes detecting potential past-the-end reads from uninitialized
local character arrays.

Tested on x86_64-linux.

Martin




[PATCH] Line map table allocation

2018-08-06 Thread Nathan Sidwell
Here's a line-map patch to make the new_linemap logic simpler.  On the 
modules branch I need to allocate blocks of linemaps, and found the 
current allocation scheme a little confusing to adjust.  This'll make 
that subsequent change simpler.


While there, I set the default allocator (xmalloc) in the init routine, 
rather than check it for each allocation.  I doubt the loss of a 
devirtualization possibility is significant (we're doing allocation 
wrong if it is).


booted & tested on x86_64-linux

ok?

nathan
--
Nathan Sidwell
2018-08-06  Nathan Sidwell  

	* line-map.c: (linemap_init): Set default allocator here.
	(new_linemap): Rather than here.  Refactor allocation logic.

Index: line-map.c
===
--- line-map.c	(revision 263332)
+++ line-map.c	(working copy)
@@ -346,6 +346,8 @@ linemap_init (struct line_maps *set,
 #else
   new (set) line_maps();
 #endif
+  /* Set default allocator.  */
+  set->reallocator = (line_map_realloc) xrealloc;
   set->highest_location = RESERVED_LOCATION_COUNT - 1;
   set->highest_line = RESERVED_LOCATION_COUNT - 1;
   set->location_adhoc_data_map.htab =
@@ -376,81 +378,58 @@ linemap_check_files_exited (struct line_
 static struct line_map *
 new_linemap (struct line_maps *set,  source_location start_location)
 {
-  struct line_map *result;
-  bool macro_map_p = start_location >= LINE_MAP_MAX_LOCATION;
+  bool macro_p = start_location >= LINE_MAP_MAX_LOCATION;
+  unsigned alloc = LINEMAPS_ALLOCATED (set, macro_p);
+  unsigned used = LINEMAPS_USED (set, macro_p);
 
-  if (LINEMAPS_USED (set, macro_map_p) == LINEMAPS_ALLOCATED (set, macro_map_p))
+  if (used == alloc)
 {
-  /* We ran out of allocated line maps. Let's allocate more.  */
-  size_t alloc_size;
-
-  /* Cast away extern "C" from the type of xrealloc.  */
-  line_map_realloc reallocator = (set->reallocator
-  ? set->reallocator
-  : (line_map_realloc) xrealloc);
-  line_map_round_alloc_size_func round_alloc_size =
-	set->round_alloc_size;
-
-  size_t map_size = (macro_map_p
-			 ? sizeof (line_map_macro)
-			 : sizeof (line_map_ordinary));
+  /* We need more space!  */
+  if (!alloc)
+	alloc = 128;
+  alloc *= 2;
+
+  size_t map_size;
+  void *buffer;
+  if (macro_p)
+	{
+	  map_size = sizeof (line_map_macro);
+	  buffer = set->info_macro.maps;
+	}
+  else
+	{
+	  map_size = sizeof (line_map_ordinary);
+	  buffer = set->info_ordinary.maps;
+	}
 
   /* We are going to execute some dance to try to reduce the
 	 overhead of the memory allocator, in case we are using the
 	 ggc-page.c one.
 	 
 	 The actual size of memory we are going to get back from the
-	 allocator is the smallest power of 2 that is greater than the
-	 size we requested.  So let's consider that size then.  */
-
-  alloc_size =
-	(2 * LINEMAPS_ALLOCATED (set, macro_map_p) +  256)
-	* map_size;
-
-  /* Get the actual size of memory that is going to be allocated
-	 by the allocator.  */
-  alloc_size = round_alloc_size (alloc_size);
+	 allocator may well be larger than what we ask for.  Use this
+	 hook to find what that size is.  */
+  size_t alloc_size = set->round_alloc_size (alloc * map_size);
 
   /* Now alloc_size contains the exact memory size we would get if
 	 we have asked for the initial alloc_size amount of memory.
 	 Let's get back to the number of macro map that amounts
 	 to.  */
-  LINEMAPS_ALLOCATED (set, macro_map_p) =
-	alloc_size / map_size;
-
-  /* And now let's really do the re-allocation.  */
-  if (macro_map_p)
-	{
-	  set->info_macro.maps
-	= (line_map_macro *) (*reallocator) (set->info_macro.maps,
-		 (LINEMAPS_ALLOCATED (set, macro_map_p)
-		  * map_size));
-	  result = &set->info_macro.maps[LINEMAPS_USED (set, macro_map_p)];
-	}
-  else
-	{
-	  set->info_ordinary.maps =
-	(line_map_ordinary *) (*reallocator) (set->info_ordinary.maps,
-		  (LINEMAPS_ALLOCATED (set, macro_map_p)
-		   * map_size));
-	  result = &set->info_ordinary.maps[LINEMAPS_USED (set, macro_map_p)];
-	}
-  memset (result, 0,
-	  ((LINEMAPS_ALLOCATED (set, macro_map_p)
-		- LINEMAPS_USED (set, macro_map_p))
-	   * map_size));
-}
-  else
-{
-  if (macro_map_p)
-	result = &set->info_macro.maps[LINEMAPS_USED (set, macro_map_p)];
+  unsigned num_maps = alloc_size / map_size;
+  buffer = set->reallocator (buffer, num_maps * map_size);
+  memset ((char *)buffer + used * map_size, 0, (num_maps - used) * map_size);
+  if (macro_p)
+	set->info_macro.maps = (line_map_macro *)buffer;
   else
-	result = &set->info_ordinary.maps[LINEMAPS_USED (set, macro_map_p)];
+	set->info_ordinary.maps = (line_map_ordinary *)buffer;
+  LINEMAPS_ALLOCATED (set, macro_p) = num_maps;
 }
 
-  result->start_location = start_location;
+  line_map *result = (macro_p ? (line_map *)&set->info_macro.maps[used]
+		  : (line_map *)&set->info_ordi

Re: [PATCH] convert braced initializers to strings (PR 71625)

2018-08-06 Thread Joseph Myers
On Mon, 6 Aug 2018, Martin Sebor wrote:

> Ping: https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01884.html

I'd expect testcases with signed char and unsigned char as well, if those 
work for C, including tests for signed char where some of the initializers 
are negative.  (Tests that actual array contents are still correct after 
this conversion, as well as that the optimizations occur, would also be a 
good idea.)

The c-parser.c patch adds a comment that ends in the middle of a word.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] Make strlen range computations more conservative

2018-08-06 Thread Martin Sebor

These examples do not aim to be valid C, they just point out limitations
of the middle-end design, and a good deal of the problems are due
to trying to do things that are not safe within the boundaries given
by the middle-end design.

I really think this is important -- and as such I think we need to move
away from trying to describe scenarios in C because doing so keeps
bringing us back to the "C doesn't allow XYZ" kinds of arguments when
what we're really discussing are GIMPLE semantic issues.

So examples should be GIMPLE.  You might start with (possibly invalid) C
code to generate the GIMPLE, but the actual discussion needs to be
looking at GIMPLE.  We might include the C code in case someone wants to
look at things in a debugger, but bringing the focus to GIMPLE is really
important here.


I don't understand the goal of this exercise.  Unless the GIMPLE
code is the result of a valid test case (in some language GCC
supports), what does it matter what it looks like?  The basis of
every single transformation done by a compiler is that the source
code is correct.  If it isn't then all bets are off.  I'm no GIMPLE
expert but even I can come up with any number of GIMPLE expressions
that have undefined behavior.  What would that prove?

But let me try anyway.  Here's a simplified (and gimplified) version
of the test case that started this debate:

  struct S { char a[4], b; };

  f (struct S * p)
  {
int D.1908;

_1 = &p->a;
_2 = __builtin_strlen (_1);   // strlen (p->a);
D.1908 = (int) _2;
return D.1908;
  }

and one involving a pointer:

  g (struct S * p)
  {
int D.1910;
char * q;

q = &p->a;
_1 = __builtin_strlen (q);
D.1910 = (int) _1;
return D.1910;
  }

and another one involving a pointer and strcpy and
_FORTIFY_SOURCE:

  h (struct S * p)
  {
int D.2208;
char * q;

q = &p->a;
_1 = strcpy (q, "1234");
_2 = (long int) _1;
D.2208 = (int) _2;
return D.2208;
  }

with strcpy defined as:

  __attribute__((artificial, gnu_inline, always_inline, leaf, nothrow))
  strcpy (char * restrict __dest, const char * restrict __src)
  {
char * D.2210;

_1 = __builtin_object_size (__dest, 1);
D.2210 = __builtin___strcpy_chk (__dest, __src, _1);
return D.2210;
  }

What does this show?

AFAICS, all three functions are equivalent GIMPLE, yet I'm being
told that the first one is different in some important detail from
the second, and that even though it's the same as the third and
even though it's good to have __strcpy_chk() abort in the third
case it's bad for the strlen() call to return a value constrained
to [0, 3].  Would defining strlen like so

  __attribute__((artificial, gnu_inline, always_inline, leaf, nothrow))
  strlen (const char * __src)
  {
char * D.2210;

_1 = __builtin_object_size (__src, 1);
D.2210 = __builtin___strlen_chk (__src, _1);
return D.2210;
  }

and having __strlen_chk() abort if __strc were longer than _1
be also bad?  (If not -- I sincerely hope that's the answer --
then I'll be happy to put together a patch for that.  In fact,
I think it would be useful to extend this to all string
functions (i.e., have them all abort on reads past the end,
just as they abort on writes).

Martin


Re: [PATCH] Make strlen range computations more conservative

2018-08-06 Thread Jeff Law
On 08/06/2018 11:15 AM, Martin Sebor wrote:
>>> These examples do not aim to be valid C, they just point out limitations
>>> of the middle-end design, and a good deal of the problems are due
>>> to trying to do things that are not safe within the boundaries given
>>> by the middle-end design.
>> I really think this is important -- and as such I think we need to move
>> away from trying to describe scenarios in C because doing so keeps
>> bringing us back to the "C doesn't allow XYZ" kinds of arguments when
>> what we're really discussing are GIMPLE semantic issues.
>>
>> So examples should be GIMPLE.  You might start with (possibly invalid) C
>> code to generate the GIMPLE, but the actual discussion needs to be
>> looking at GIMPLE.  We might include the C code in case someone wants to
>> look at things in a debugger, but bringing the focus to GIMPLE is really
>> important here.
> 
> I don't understand the goal of this exercise.  Unless the GIMPLE
> code is the result of a valid test case (in some language GCC
> supports), what does it matter what it looks like?  The basis of
> every single transformation done by a compiler is that the source
> code is correct.  If it isn't then all bets are off.  I'm no GIMPLE
> expert but even I can come up with any number of GIMPLE expressions
> that have undefined behavior.  What would that prove?
The GIMPLE IL is less restrictive than the original source language.
The process of translation into GIMPLE and optimization can create
situations in the GIMPLE IL that can't be validly represented in the
original source language.  Subobject crossing being one such case, there
are certainly others.  We have to handle these scenarios correctly.

My favorite from a long time ago was the RTL loop optimizer creating a
pointer well outside the bounds of an object.  That pointer was then
used in a reg+d addressing mode where the displacement brought the final
effective address back into the bounds of the object.   You can't
validly do that in C/C++, but it was certainly valid RTL and it was
useful to allow creation of such pointers which were outside the bounds
of the object.



Jeff


Re: [PATCH] Make strlen range computations more conservative

2018-08-06 Thread Richard Biener
On August 6, 2018 6:32:41 PM GMT+02:00, Martin Sebor  wrote:
>On 08/06/2018 09:12 AM, Jeff Law wrote:
>> On 08/04/2018 03:56 PM, Martin Sebor wrote:
>>> On 08/03/2018 01:00 AM, Jeff Law wrote:
 On 07/24/2018 05:18 PM, Bernd Edlinger wrote:
> On 07/24/18 23:46, Jeff Law wrote:
>> On 07/24/2018 01:59 AM, Bernd Edlinger wrote:
>>> Hi!
>>>
>>> This patch makes strlen range computations more conservative.
>>>
>>> Firstly if there is a visible type cast from type A to B before
>>> passing
>>> then value to strlen, don't expect the type layout of B to
>restrict
>>> the
>>> possible return value range of strlen.
>> Why do you think this is the right thing to do?  ie, is there
>language
>> in the standards that makes you think the code as it stands today
>is
>> incorrect from a conformance standpoint?  Is there a significant
>> body of
>> code that is affected in an adverse way by the current code?  If
>so,
>> what code?
>>
>>
>
> I think if you have an object, of an effective type A say
>char[100],
> then
> you can cast the address of A to B, say typedef char (*B)[2] for
> instance
> and then to const char *, say for use in strlen.  I may be wrong,
>but
> I think
> that we should at least try not to pick up char[2] from B, but
>instead
> use A for strlen ranges, or leave this range open.  Currently the
>range
> info for strlen is [0..1] in this case, even if we see the type
>cast
> in the generic tree.
 ISTM that you're essentially saying that the cast to const char *
 destroys any type information we can exploit here.  But if that's
>the
 case, then I don't think we can even derive a range of [0,99]. 
>What's
 to say that "A" didn't result from a similar cast of some object
>that
 was char[200] that happened out of the scope of what we could see
>during
 the strlen range computation?

 If that is what you're arguing, then I think there's a
>re-evaluation
 that needs to happen WRT strlen range computation/

 And just to be clear, I do see this as a significant correctness
 question.

 Martin, thoughts?
>>>
>>> The argument is that given:
>>>
>>>   struct S { char a[4], b; };
>>>
>>>   char a[8] = "1234567";
>>>
>>> this is valid and should pass:
>>>
>>>   __attribute__ ((noipa))
>>>   void f (struct S *p)
>>>   {
>>> assert (7 == strlen (p->a));
>>>   }
>>>
>>>   int main (void)
>>>   {
>>> f ((struct S*)a);
>>>   }
>>>
>>> (This is the basic premise behind pr86259.)
>>>
>>> This argument is wrong and the code is invalid.  For the access
>>> to p->a to be valid p must point to an object of struct S (it
>>> doesn't) and the p->a array must hold a nul-terminated string
>>> (it also doesn't).
>> I agree with you for C/C++, but I think it's been shown elsewhere in
>> this thread that GIMPLE semantics to not respect the subobject
>> boundaries.  That's a sad reality.
>>
>> [ ... ]
>>
>>>
>>> I care less about the optimization than I do about the basic
>>> premise that it's essential to respect subobject boundaries(*).
>> I understand, but the semantics of GIMPLE do not respect them.  We
>can
>> argue about whether or not those should change and what it would take
>to
>> fix that. But right now the existing semantics do not respect those
>> boundaries.
>
>They don't respect them in all cases (i.e., when MEM_REF loses
>information about the structure of an access) but in a good
>number of them GCC can still derive useful information from
>the access.  It's relied on to great a effect by _FORTIFTY_SOURCE.
>I think it would be a welcome enhancement if besides out-of-
>bounds writes _FORTIFTY_SOURCE also prevented out-of-bounds
>reads.
>
>>> It would make little sense to undo the strlen optimization
>>> without also undoing the optimization for the direct array
>>> access case.  Undoing either would raise the question about
>>> the validity of the _FORRTIFY_SOURCE=2 behavior.  That would
>>> be a huge step backwards in terms of code security.  If we
>>> did some of these but not others, it would make the behavior
>>> inconsistent and surprising, all to accommodate one instance
>>> of invalid code.
>> In the direct array access case I think (and I'm sure Jakub, Richi
>and
>> others will correct me if I'm wrong), we can use the object's type
>> because the dereferences are actually using the array's type.
>
>Subscripting and pointer access are identical both in C/C++
>and in GCC's _FORTIFY_SOURCE.  The absence of a distinction
>between the two is essential for preventing writes past
>the end by string functions like strcpy (_FORTIFY_SOURCE).
>
>>> If we had a valid test case where the strlen optimization
>>> leads to invalid code, or even if there were a significant
>>> number of bug reports showing that it breaks an invalid
>>> but common idiom, I would certainly feel compelled to
>>> make it right somehow.  But there has been just one bug

[SVE ACLE] Tidy up function names in testcases

2018-08-06 Thread Richard Sandiford
While looking through the SVE ACLE tests again, I realised the
naming of the test functions was a bit inconsistent.  The patch
below adopts the following scheme:

  test_[mxz]:
a test that isn't focusing on the register allocation.  More
specifically, this means that there are no other functions in
the test file that differ from this one only in their register
allocation.

  test_[mxz]_tied:
a test in which the output operand is tied to the th input
operand (counting from 1).

  test_[mxz]_untied:
a test in which the output operand is not tied to any input operand.

The patch also adds some missing z_untied tests.

Tested on aarch64-linux-gnu and applied to aarch64/sve-acle-branch.

Richard




sve-acle-test-tweaks.patch.bz2
Description: Binary data


Re: [PATCH] [Ada] Make middle-end string literals NUL terminated

2018-08-06 Thread Olivier Hainque
Hi Bernd,

Things are moving fast so I'll answer your
messages incrementally :-)

> On 3 Aug 2018, at 20:37, Bernd Edlinger  wrote:
[...]
> Oh, how interesting.
> 
> My patch only intends to add a dummy byte that is stripped again
> in the assembly.  The purpose is to make it trivial to find out if
> a string constant is NUL-terminated in  the middle-end by comparing
> TYPE_SIZE_UNIT (TREE_TYPE (decl)) with TREE_STRING_LENGTH (decl).

Ah, Ok. I hadn't quite realized that yet.

> TYPE_SIZE_UNIT >= TREE_STRING_LENGTH means zero terminated
> TYPE_SIZE_UNIT < TREE_STRING_LENGTH means not necessarily zero terminated,

That is opposite to the situation we were having
for Ada before the patch series:

> Such strings shall never chop off more than one nul character.
> Ada strings are generally not zero terminated, right?

Right. Strings have explicit bounds in Ada, and a NUL character
is like any other. It can appear anywhere. For example, the
excerpt below constructs a string of 5 characters, bounds 1 to 5,
with a nul character in second position:

 X : String (1 .. 5) := "1" & Ascii.Nul & "345";

('&' is the catenation operator)

The problem we had was that string literals not explicitly
NUL terminated (as in X : String := "123" & Ascii.NUL) would
never go in mergeable sections. For example:

Ada.Text_IO.Put_Line ("Hello World!");

Ada has so called "generic" units that are sometimes
instanciated many many times so there is real interest
in improving the situation there.

> So in your search loop you would not have to look after
> type_len, because that byte would be guaranteed to be zero.
> 
> That is if we agree that we want to restrict the string constants
> in that way as I proposed.
> 
> In the case str_len > type_len I do not understand if that is right.

In your new model, I don't know what sense it would make, indeed.

In the previous model, it was clearly expected to be a possibility.

> Because varasm will only output type_size bytes,
> this seems only to select a string section
> but the last nul byte will not be assembled.
> Something that is not contained in the type_size
> should not be assembled.
> 
> What exactly do you want to accomplish?

Based on the previous (say, gcc-7 or gcc-8) code base, arrange
for most Ada string literals to end up in a mergeable section
on ELF targets.

In at least gcc-7 and gcc-8, our gigi adjustment + the
patch I posted achieve this effect.

For example, for the following source:

procedure Hello is
  procedure Process (S : String) with Import, Convention => C;
begin
 Process ("1234");
end;

Without the gigi change, I get (x86_64-linux,
-O2 -fmerge-all-constants):

  .section .rodata
  .LC0:
  .ascii "1234"

Regular section, no terminating zero.

The gigi change alone gets us the terminating nul (*),
but not yet a mergeable section, because the extra nul
isn't covered by the type bounds:

  .section .rodata
  .LC0:
  .string "1234"

With the varasm change on top, checking beyond the
type bounds when the string constant isn't an initializer,
we get:

  .section .rodata.str1.1,"aMS",@progbits,1
  .LC0:
  .string "1234"

(*) The terminating zero, part of the string value,
not spanned by the type bounds, gets assembled through:

assemble_constant_contents (...)
...
size = get_constant_size (exp);

then:

get_constant_size (tree exp)

  size = int_size_in_bytes (TREE_TYPE (exp));
  if (TREE_CODE (exp) == STRING_CST)
size = MAX (TREE_STRING_LENGTH (exp), size);

Again, this is just providing more context on the
original patch I posted, based on your questions.

I understand there are proposals to change things
pretty significantly in this area, so ...

now on to your next messages and ideas :-) 


Olivier



Re: [PATCH][Middle-end] disable strcmp/strncmp inlining with O2 below and Os

2018-08-06 Thread Qing Zhao
thanks for reporting this issue.

I will take a look.

Qing
> On Jul 30, 2018, at 8:45 AM, Christophe Lyon  
> wrote:
> 
> On Wed, 25 Jul 2018 at 19:08, Qing Zhao  > wrote:
>> 
>> Hi,
>> 
>> As Wilco suggested, the new added strcmp/strncmp inlining should be only 
>> enabled with O2 and above.
>> 
>> this is the simple patch for this change.
>> 
>> tested on both X86 and aarch64.
>> 
>> Okay for thunk?
>> 
>> Qing
>> 
>> gcc/ChangeLog:
>> 
>> +2018-07-25  Qing Zhao  
>> +
>> +   * builtins.c (inline_expand_builtin_string_cmp): Disable inlining
>> +   when optimization level is lower than 2 or optimize for size.
>> +
>> 
>> gcc/testsuite/ChangeLog:
>> 
>> +2018-07-25  Qing Zhao  
>> +
>> +   * gcc.dg/strcmpopt_5.c: Change to O2 to enable the transformation.
>> +   * gcc.dg/strcmpopt_6.c: Likewise.
>> +
>> 
> 
> Hi,
> 
> After this change, I've noticed that:
> FAIL: gcc.dg/strcmpopt_6.c scan-rtl-dump-times expand "__builtin_memcmp" 6
> on arm-none-linux-gnueabi
> --with-mode thumb
> --with-cpu cortex-a9
> and forcing -march=armv5t via RUNTESTFLAGS
> 
> The log says:
> gcc.dg/strcmpopt_6.c: pattern found 4 times
> FAIL: gcc.dg/strcmpopt_6.c scan-rtl-dump-times expand "__builtin_memcmp" 6
> 
> Christophe



Re: [C++ PATCH] PR c++/79133

2018-08-06 Thread Ville Voutilainen
On 8 July 2018 at 02:08, Ville Voutilainen  wrote:
> On 8 July 2018 at 01:54, Paolo Carlini  wrote:
>>> That would make this more consistent with such a shadow warning, but I
>>> don't want
>>> to use the shadowing wording (which would be easy to do; just set
>>> 'shadowed' and do
>>> a 'goto inform'), because this isn't shadowing in the precise sense;
>>> the shadowing cases
>>> are warnings, whereas this is more like the redeclaration errors in
>>> the same function.
>>
>> ... indeed and that annoys me a bit. Not having studied at all c++/79133 so
>> far (sorry) it seems a little weird to me that according to the standard we
>> have to handle the two types of "shadowing" in different ways, one more
>> strict, one less. Thus I would suggest double checking the details of that,
>> eventually with Jason too in terms of the actual patch you would like to
>> apply.
>
> Well. The PR is about DR 2211 which, in simple terms, says that lambda
> parameters
> and captures cannot have the same name. See
> http://open-std.org/JTC1/SC22/WG21/docs/cwg_defects.html#2211
>
> That's stricter than -Wshadow, but otherwise equally strict as the
> other error cases already handled
> in check_local_shadow. So I'll make this error case more consistent
> with the others. We already
> handle redeclaration errors slightly differently from shadowing
> warnings in that function.

Here's an updated patch. Tested on Linux-PPC64, OK for trunk?

Backports?

2018-08-06  Ville Voutilainen  

gcc/cp/

PR c++/79133
* name-lookup.c (check_local_shadow): Reject captures and parameters
with the same name.

testsuite/

PR c++/79133
* g++.dg/cpp0x/lambda/lambda-shadow3.C: New.
* g++.dg/cpp1y/lambda-generic-variadic18.C: Likewise.
diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c
index 3aafb0f..72d87b3 100644
--- a/gcc/cp/name-lookup.c
+++ b/gcc/cp/name-lookup.c
@@ -2640,6 +2640,7 @@ check_local_shadow (tree decl)
 		  || TREE_CODE (decl) == TYPE_DECL)))
   && DECL_FUNCTION_SCOPE_P (old)
   && (!DECL_ARTIFICIAL (decl)
+	  || is_capture_proxy (decl)
 	  || DECL_IMPLICIT_TYPEDEF_P (decl)
 	  || (VAR_P (decl) && DECL_ANON_UNION_VAR_P (decl
 {
@@ -2648,7 +2649,8 @@ check_local_shadow (tree decl)
   /* Don't complain if it's from an enclosing function.  */
   if (DECL_CONTEXT (old) == current_function_decl
 	  && TREE_CODE (decl) != PARM_DECL
-	  && TREE_CODE (old) == PARM_DECL)
+	  && TREE_CODE (old) == PARM_DECL
+	  && !is_capture_proxy (decl))
 	{
 	  /* Go to where the parms should be and see if we find
 	 them there.  */
@@ -2665,6 +2667,21 @@ check_local_shadow (tree decl)
 	  return;
 	}
 	}
+  /* DR 2211: check that captures and parameters
+	 do not have the same name. */
+  else if (is_capture_proxy (decl))
+	{
+	  if (current_lambda_expr ()
+	  && DECL_CONTEXT (old) == lambda_function (current_lambda_expr ())
+	  && TREE_CODE (old) == PARM_DECL
+	  && DECL_NAME (decl) != this_identifier)
+	{
+	  error_at (DECL_SOURCE_LOCATION (old),
+			"lambda parameter %qD "
+			"previously declared as a capture", old);
+	}
+	  return;
+	}
 
   /* The local structure or class can't use parameters of
 	 the containing function anyway.  */
diff --git a/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-shadow3.C b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-shadow3.C
new file mode 100644
index 000..8364321
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-shadow3.C
@@ -0,0 +1,6 @@
+// { dg-do compile { target c++11 } }
+
+int main() {
+  int x = 42;
+  auto lambda = [x](int x) {}; // { dg-error "previously declared as a capture" }
+}
diff --git a/gcc/testsuite/g++.dg/cpp1y/lambda-generic-variadic18.C b/gcc/testsuite/g++.dg/cpp1y/lambda-generic-variadic18.C
new file mode 100644
index 000..1eb9cce
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/lambda-generic-variadic18.C
@@ -0,0 +1,11 @@
+// { dg-do compile { target c++14 } }
+
+int main() {
+  int x = 42;
+  auto lambda2 = [x=x](int x) {}; // { dg-error "previously declared as a capture" }
+  auto lambda3 = [x](auto... x) {}; // { dg-error "previously declared as a capture" }
+  auto lambda4 = [](auto... x) {
+auto lambda5 = [x...](auto... x) {};  // { dg-error "previously declared as a capture" }
+auto lambda6 = [x...](int x) {};  // { dg-error "previously declared as a capture" }
+  };
+}


Use EVRP within DOM to discover more constants

2018-08-06 Thread Jeff Law

Now that we have EVRP analysis available within DOM, we can query it to
see if an SSA_NAME has a singleton constant range.  If so, we can
substitute the constant for the SSA_NAME.

This is most helpful in assignments (for GIMPLE_COND we'd simplify it
later anyway)...  Consider this:

  _930 = _631 > 5;
  _952 = _638 != 0B;
  _933 = _930 & _952;
  _929 = _631 <= 10;
  _1658 = _929 & _933;
  if (_1658 != 0)

We already have some info about _631 before we enter this block via the
EVRP analysis:


pushing new range for _631: [1, 10]  EQUIVALENCES: { _631 } (1 elements)

Knowing the range for _631 results in _929 having a constant value 1.

That in turn allows us to turn the assignment for _1658 into a simple
copy (we're dealing with booleans here):


Optimizing statement _1658 = _929 & _933;
  Replaced '_929' with constant '1'
gimple_simplified to _1658 = _933;
  Folded to: _1658 = _933;


That in turn allows us to propagate _933 into the use of _1658:

Optimizing statement if (_1658 != 0)
  Replaced '_1658' with variable '_933'
gimple_simplified to if (_933 != 0)
  Folded to: if (_933 != 0)

It triggers a few thousand times during a bootstrap.  Many of those
would have been caught later anyway, so it's not as impressive as it
might sound.  BUt it does seem to generate consistent small improvements
in the resulting code.

Bootstrapped and regression tested on x86_64-linux-gnu.  Installing on
the trunk.

Jeff
* tree-ssa-dom.c (dom_opt_dom_walker::optimize_stmt): Pass down
the vr_values instance to cprop_into_stmt.
(cprop_into_stmt): Pass vr_values instance down to cprop_operand.
(cprop_operand): Also query EVRP to determine if OP is a constant.

diff --git a/gcc/tree-ssa-dom.c b/gcc/tree-ssa-dom.c
index a6f176c..b936fd3 100644
--- a/gcc/tree-ssa-dom.c
+++ b/gcc/tree-ssa-dom.c
@@ -1700,7 +1712,7 @@ record_equivalences_from_stmt (gimple *stmt, int 
may_optimize_p,
CONST_AND_COPIES.  */
 
 static void
-cprop_operand (gimple *stmt, use_operand_p op_p)
+cprop_operand (gimple *stmt, use_operand_p op_p, vr_values *vr_values)
 {
   tree val;
   tree op = USE_FROM_PTR (op_p);
@@ -1709,6 +1721,9 @@ cprop_operand (gimple *stmt, use_operand_p op_p)
  copy of some other variable, use the value or copy stored in
  CONST_AND_COPIES.  */
   val = SSA_NAME_VALUE (op);
+  if (!val)
+val = vr_values->op_with_constant_singleton_value_range (op);
+
   if (val && val != op)
 {
   /* Do not replace hard register operands in asm statements.  */
@@ -1765,7 +1780,7 @@ cprop_operand (gimple *stmt, use_operand_p op_p)
vdef_ops of STMT.  */
 
 static void
-cprop_into_stmt (gimple *stmt)
+cprop_into_stmt (gimple *stmt, vr_values *vr_values)
 {
   use_operand_p op_p;
   ssa_op_iter iter;
@@ -1782,7 +1797,7 @@ cprop_into_stmt (gimple *stmt)
 operands.  */
   if (old_op != last_copy_propagated_op)
{
- cprop_operand (stmt, op_p);
+ cprop_operand (stmt, op_p, vr_values);
 
  tree new_op = USE_FROM_PTR (op_p);
  if (new_op != old_op && TREE_CODE (new_op) == SSA_NAME)
@@ -1925,7 +1940,7 @@ dom_opt_dom_walker::optimize_stmt (basic_block bb, 
gimple_stmt_iterator si)
   opt_stats.num_stmts++;
 
   /* Const/copy propagate into USES, VUSES and the RHS of VDEFs.  */
-  cprop_into_stmt (stmt);
+  cprop_into_stmt (stmt, evrp_range_analyzer.get_vr_values ());
 
   /* If the statement has been modified with constant replacements,
  fold its RHS before checking for redundant computations.  */


Re: [PATCH 00/11] (v2) Mitigation against unsafe data speculation (CVE-2017-5753)

2018-08-06 Thread John David Anglin

On 2018-08-03 5:06 AM, Richard Earnshaw (lists) wrote:

I don't think there's a suitable barrier.  The sync instruction seems
like overkill.

So, I'm going to install attached change after testing is complete.


It's your call as port maintainers.

I committed the attached change after testing on hppa-unknown-linux-gnu.

Dave

--
John David Anglin  dave.ang...@bell.net

2018-08-06  John David Anglin  

PR target/86807
* config/pa/pa.c (TARGET_HAVE_SPECULATION_SAFE_VALUE):
Define to speculation_safe_value_not_needed.

Index: config/pa/pa.c
===
--- config/pa/pa.c  (revision 263228)
+++ config/pa/pa.c  (working copy)
@@ -428,6 +428,9 @@
 #undef TARGET_STARTING_FRAME_OFFSET
 #define TARGET_STARTING_FRAME_OFFSET pa_starting_frame_offset
 
+#undef TARGET_HAVE_SPECULATION_SAFE_VALUE
+#define TARGET_HAVE_SPECULATION_SAFE_VALUE speculation_safe_value_not_needed
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 /* Parse the -mfixed-range= option string.  */


Re: [Patch][Aarch64] Implement Aarch64 SIMD ABI and aarch64_vector_pcs attribute

2018-08-06 Thread Steve Ellcey
Thanks for the feedback Kyrill.  I have updated my patch and attached
the new version to this email.  The one change I did not make was to
remove load_pair_dw_tftf and store_pair_dw_tftf and use the
load_pair and vec_store_pair
patterns.  Having to add two new iterators to remove two instructions
didn't seem like much of an advantage and I liked having the names
for tf to match the other load_pair_dw/store_pair_dw instructions.
If you feel strongly about this I can go ahead and make that change
though.

With respect to the new tests iterating over the various options, I am
not sure how that works but it does.  I copied the aarch64-torture.exp
file from one of the other targets and verified that it ran the
tests with -O0, -O1, -O2, '-O3 -g', -Os, '-O2 -flto -fno-use-linker-
plugin -flto-partition=none', and '-O2 -flto -fuse-linker-plugin -fno-
fat-lto-objects'.

Steve Ellcey
sell...@cavium.com


2018-08-06  Steve Ellcey  

* config/aarch64/aarch64-protos.h (aarch64_use_simple_return_insn_p):
New prototype.
(aarch64_epilogue_uses): Ditto.
* config/aarch64/aarch64.c (aarch64_attribute_table): New array.
(aarch64_simd_decl_p): New function.
(aarch64_reg_save_mode): New function.
(aarch64_is_simd_call_p): New function.
(aarch64_function_ok_for_sibcall): Check for simd calls.
(aarch64_layout_frame): Check for simd function.
(aarch64_gen_storewb_pair): Handle E_TFmode.
(aarch64_push_regs): Use aarch64_reg_save_mode to get mode.
(aarch64_gen_loadwb_pair): Handle E_TFmode.
(aarch64_pop_regs): Use aarch64_reg_save_mode to get mode.
(aarch64_gen_store_pair): Handle E_TFmode.
(aarch64_gen_load_pair): Ditto.
(aarch64_save_callee_saves): Handle different mode sizes.
(aarch64_restore_callee_saves): Ditto.
(aarch64_components_for_bb): Check for simd function.
(aarch64_epilogue_uses): New function.
(aarch64_process_components): Ditto.
(aarch64_expand_prologue): Ditto.
(aarch64_expand_epilogue): Ditto.
(aarch64_expand_call): Ditto.
(TARGET_ATTRIBUTE_TABLE): New define.
* config/aarch64/aarch64.h (EPILOGUE_USES): Redefine.
(FP_SIMD_SAVED_REGNUM_P): New macro.
* config/aarch64/aarch64.md (V23_REGNUM) New constant.
(simple_return): New define_expand.
(load_pair_dw_tftf): New instruction.
(store_pair_dw_tftf): Ditto.
(loadwb_pair_): Ditto.
("storewb_pair_): Ditto.


2018-08-06  Steve Ellcey  

* gcc.target/aarch64/torture/aarch64-torture.exp: New file.
* gcc.target/aarch64/torture/simd-abi-1.c: New test.
* gcc.target/aarch64/torture/simd-abi-2.c: Ditto.
* gcc.target/aarch64/torture/simd-abi-3.c: Ditto.
* gcc.target/aarch64/torture/simd-abi-4.c: Ditto.diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index af5db9c..99c962f 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -423,6 +423,7 @@ bool aarch64_split_dimode_const_store (rtx, rtx);
 bool aarch64_symbolic_address_p (rtx);
 bool aarch64_uimm12_shift (HOST_WIDE_INT);
 bool aarch64_use_return_insn_p (void);
+bool aarch64_use_simple_return_insn_p (void);
 const char *aarch64_mangle_builtin_type (const_tree);
 const char *aarch64_output_casesi (rtx *);
 
@@ -507,6 +508,8 @@ void aarch64_split_simd_move (rtx, rtx);
 /* Check for a legitimate floating point constant for FMOV.  */
 bool aarch64_float_const_representable_p (rtx);
 
+extern int aarch64_epilogue_uses (int);
+
 #if defined (RTX_CODE)
 void aarch64_gen_unlikely_cbranch (enum rtx_code, machine_mode cc_mode,
    rtx label_ref);
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index fa01475..7ab2a60 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1027,6 +1027,15 @@ static const struct processor *selected_tune;
 /* The current tuning set.  */
 struct tune_params aarch64_tune_params = generic_tunings;
 
+/* Table of machine attributes.  */
+static const struct attribute_spec aarch64_attribute_table[] =
+{
+  /* { name, min_len, max_len, decl_req, type_req, fn_type_req,
+   affects_type_identity, handler, exclude } */
+  { "aarch64_vector_pcs", 0, 0, true,  false, false, false, NULL, NULL },
+  { NULL, 0, 0, false, false, false, false, NULL, NULL }
+};
+
 #define AARCH64_CPU_DEFAULT_FLAGS ((selected_cpu) ? selected_cpu->flags : 0)
 
 /* An ISA extension in the co-processor and main instruction set space.  */
@@ -1405,6 +1414,31 @@ aarch64_hard_regno_mode_ok (unsigned regno, machine_mode mode)
   return false;
 }
 
+/* Return true if this is a definition of a vectorized simd function.  */
+
+static bool
+aarch64_simd_decl_p (tree fndecl)
+{
+  if (lookup_attribute ("aarch64_vector_pcs", DECL_ATTRIBUTES (fndecl)) != NULL)
+return true;
+  if (lookup_attribute ("simd", DECL_A

[committed][libgomp, nvptx] Minimize lifetime of CUDA_ONE_CALL defines

2018-08-06 Thread Tom de Vries
Hi,

This patch makes sure that the lifetimes of the CUDA_ONE_CALL macro (which is
defined twice in plugin-nvptx.c) are minimized, to make it obvious that the
definitions are used only in the lib-cuda.def include.

Build on x86_64 with nvptx accelerator and reg-tested libgomp.

Committed to trunk.

Thanks,
- Tom

[libgomp, nvptx] Minimize lifetime of CUDA_ONE_CALL defines

2018-08-06  Tom de Vries  

* plugin/plugin-nvptx.c (struct cuda_lib_s, init_cuda_lib): Put
CUDA_ONE_CALL defines right before the cuda-lib.def include, and the
corresponding undefs right after.

---
 libgomp/plugin/plugin-nvptx.c | 14 +-
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index cc465b4addb2..2e72a6379eb9 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -52,10 +52,13 @@
 #if PLUGIN_NVPTX_DYNAMIC
 # include 
 
-# define CUDA_ONE_CALL(call) \
-  __typeof (call) *call;
 struct cuda_lib_s {
+
+# define CUDA_ONE_CALL(call)   \
+  __typeof (call) *call;
 #include "cuda-lib.def"
+# undef CUDA_ONE_CALL
+
 } cuda_lib;
 
 /* -1 if init_cuda_lib has not been called yet, false
@@ -74,18 +77,19 @@ init_cuda_lib (void)
   cuda_lib_inited = false;
   if (h == NULL)
 return false;
-# undef CUDA_ONE_CALL
+
 # define CUDA_ONE_CALL(call) CUDA_ONE_CALL_1 (call)
 # define CUDA_ONE_CALL_1(call) \
   cuda_lib.call = dlsym (h, #call);\
   if (cuda_lib.call == NULL)   \
 return false;
 #include "cuda-lib.def"
+# undef CUDA_ONE_CALL
+# undef CUDA_ONE_CALL_1
+
   cuda_lib_inited = true;
   return true;
 }
-# undef CUDA_ONE_CALL
-# undef CUDA_ONE_CALL_1
 # define CUDA_CALL_PREFIX cuda_lib.
 #else
 # define CUDA_CALL_PREFIX


[committed][libgomp, nvptx] Handle CUDA_ONE_CALL_MAYBE_NULL

2018-08-06 Thread Tom de Vries
Hi,

This patch adds handling of functions that may not be present in the cuda
driver.

Such a function can be declared using CUDA_ONE_CALL_MAYBE_NULL in cuda-lib.def,
it can be called with the usual convenience macros, but before calling its
presence needs to be tested using new macro CUDA_CALL_EXISTS.

When using the dlopen interface (PLUGIN_NVPTX_DYNAMIC == 1), we allow
non-present functions by allowing dlsym to return NULL.  Otherwise
(PLUGIN_NVPTX_DYNAMIC == 0) we declare the non-present function to be weak.

Build and reg-tested libgomp on x86_64 with nvidia accelerator, with and without
--disable-cuda-driver, in combination with a trigger patch that adds a
non-existing function foo to cuda-lib.def:
...
CUDA_ONE_CALL_MAYBE_NULL (foo)
...
and declares it in plugin-nvptx.c:
...
CUresult foo (void);
...
and then uses it in nvptx_init after the init_cuda_lib call:
...
  if (CUDA_CALL_EXISTS (foo))
CUDA_CALL (foo);
...

Also build and reg-tested on x86_64 with nvidia accelerator, with and without
--disable-cuda-driver, in combination with a trigger patch that replaces all
CUDA_ONE_CALLs in cuda-lib.def with CUDA_ONE_CALL_MAYBE_NULL, and guards two
CUDA_CALLs with CUDA_CALL_EXISTS, one for a regular fn, and one for a fn that is
a define in cuda/cuda.h.

Committed to trunk.

Thanks,
- Tom

[libgomp, nvptx] Handle CUDA_ONE_CALL_MAYBE_NULL

2018-08-06  Tom de Vries  

* plugin/plugin-nvptx.c (DO_PRAGMA): Define.
(struct cuda_lib_s): Add def/undef of CUDA_ONE_CALL_MAYBE_NULL.
(init_cuda_lib): Add new param to CUDA_ONE_CALL_1.  Add arg to
corresponding call in CUDA_ONE_CALL.  Add def/undef of
CUDA_ONE_CALL_MAYBE_NULL.
(CUDA_CALL_EXISTS): Define.

---
 libgomp/plugin/plugin-nvptx.c | 23 ---
 1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index 2e72a6379eb9..825470adce3e 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -49,6 +49,8 @@
 #include 
 #include 
 
+#define DO_PRAGMA(x) _Pragma (#x)
+
 #if PLUGIN_NVPTX_DYNAMIC
 # include 
 
@@ -56,8 +58,11 @@ struct cuda_lib_s {
 
 # define CUDA_ONE_CALL(call)   \
   __typeof (call) *call;
+# define CUDA_ONE_CALL_MAYBE_NULL(call)\
+  CUDA_ONE_CALL (call)
 #include "cuda-lib.def"
 # undef CUDA_ONE_CALL
+# undef CUDA_ONE_CALL_MAYBE_NULL
 
 } cuda_lib;
 
@@ -78,20 +83,29 @@ init_cuda_lib (void)
   if (h == NULL)
 return false;
 
-# define CUDA_ONE_CALL(call) CUDA_ONE_CALL_1 (call)
-# define CUDA_ONE_CALL_1(call) \
+# define CUDA_ONE_CALL(call) CUDA_ONE_CALL_1 (call, false)
+# define CUDA_ONE_CALL_MAYBE_NULL(call) CUDA_ONE_CALL_1 (call, true)
+# define CUDA_ONE_CALL_1(call, allow_null) \
   cuda_lib.call = dlsym (h, #call);\
-  if (cuda_lib.call == NULL)   \
+  if (!allow_null && cuda_lib.call == NULL)\
 return false;
 #include "cuda-lib.def"
 # undef CUDA_ONE_CALL
 # undef CUDA_ONE_CALL_1
+# undef CUDA_ONE_CALL_MAYBE_NULL
 
   cuda_lib_inited = true;
   return true;
 }
 # define CUDA_CALL_PREFIX cuda_lib.
 #else
+
+# define CUDA_ONE_CALL(call)
+# define CUDA_ONE_CALL_MAYBE_NULL(call) DO_PRAGMA (weak call)
+#include "cuda-lib.def"
+#undef CUDA_ONE_CALL_MAYBE_NULL
+#undef CUDA_ONE_CALL
+
 # define CUDA_CALL_PREFIX
 # define init_cuda_lib() true
 #endif
@@ -136,6 +150,9 @@ init_cuda_lib (void)
 #define CUDA_CALL_NOCHECK(FN, ...) \
   CUDA_CALL_PREFIX FN (__VA_ARGS__)
 
+#define CUDA_CALL_EXISTS(FN)   \
+  CUDA_CALL_PREFIX FN
+
 static const char *
 cuda_error (CUresult r)
 {


Re: [PATCH] Make strlen range computations more conservative

2018-08-06 Thread Jeff Law
On 08/06/2018 11:15 AM, Martin Sebor wrote:
>>> These examples do not aim to be valid C, they just point out limitations
>>> of the middle-end design, and a good deal of the problems are due
>>> to trying to do things that are not safe within the boundaries given
>>> by the middle-end design.
>> I really think this is important -- and as such I think we need to move
>> away from trying to describe scenarios in C because doing so keeps
>> bringing us back to the "C doesn't allow XYZ" kinds of arguments when
>> what we're really discussing are GIMPLE semantic issues.
>>
>> So examples should be GIMPLE.  You might start with (possibly invalid) C
>> code to generate the GIMPLE, but the actual discussion needs to be
>> looking at GIMPLE.  We might include the C code in case someone wants to
>> look at things in a debugger, but bringing the focus to GIMPLE is really
>> important here.
> 
> I don't understand the goal of this exercise.  Unless the GIMPLE
> code is the result of a valid test case (in some language GCC
> supports), what does it matter what it looks like?  The basis of
> every single transformation done by a compiler is that the source
> code is correct.  If it isn't then all bets are off.  I'm no GIMPLE
> expert but even I can come up with any number of GIMPLE expressions
> that have undefined behavior.  What would that prove?
> 
> But let me try anyway.  Here's a simplified (and gimplified) version
> of the test case that started this debate:
> 
>   struct S { char a[4], b; };
> 
>   f (struct S * p)
>   {
>     int D.1908;
> 
>     _1 = &p->a;
>     _2 = __builtin_strlen (_1);   // strlen (p->a);
>     D.1908 = (int) _2;
>     return D.1908;
>   }
You need to include the declaration of _1 so that we can see its type.
Assuming you're just calling strlen (p->a), it's type will be:


  char[4] * _1;

Which provides you with useful type information.  If I'm wrong on this
one, I'm sure Jakub & Richi will chime in.


> 
> and one involving a pointer:
> 
>   g (struct S * p)
>   {
>     int D.1910;
>     char * q;
> 
>     q = &p->a;
>     _1 = __builtin_strlen (q);
>     D.1910 = (int) _1;
>     return D.1910;
>   }
In this case the pointer passed is a char *, so you know nothing.

It seems a bit silly since you can see that q = &p->a, but that's how it
works.

It all comes down to how information is lost as we go through the
pipeline.  It may not matter in this specific example, but it matters in
the general case.


> 
> and another one involving a pointer and strcpy and
> _FORTIFY_SOURCE:
> 
>   h (struct S * p)
>   {
>     int D.2208;
>     char * q;
> 
>     q = &p->a;
>     _1 = strcpy (q, "1234");
>     _2 = (long int) _1;
>     D.2208 = (int) _2;
>     return D.2208;
>   }
> 
> with strcpy defined as:
> 
>   __attribute__((artificial, gnu_inline, always_inline, leaf, nothrow))
>   strcpy (char * restrict __dest, const char * restrict __src)
>   {
>     char * D.2210;
> 
>     _1 = __builtin_object_size (__dest, 1);
>     D.2210 = __builtin___strcpy_chk (__dest, __src, _1);
>     return D.2210;
>   }
> 
> What does this show?
> 
> AFAICS, all three functions are equivalent GIMPLE, yet I'm being
> told that the first one is different in some important detail from
> the second, and that even though it's the same as the third and
> even though it's good to have __strcpy_chk() abort in the third
> case it's bad for the strlen() call to return a value constrained
> to [0, 3].  Would defining strlen like so
Note that _b_o_s is supposed to honor the somewhat wonky GIMPLE
semantics in this space.  If it doesn't that'd be a bug.  Assuming it
does honor those semantics, then the right things will happen.

Note that the loss of information as we go through the pipeline may mean
that _b_o_s could return "don't know" which is -1.  It may also return
the largest of multiple objects that the pointer points to.  So you
won't get a error from the fortification system in those cases where
semantics between C and GIMPLE differ.


> 
>   __attribute__((artificial, gnu_inline, always_inline, leaf, nothrow))
>   strlen (const char * __src)
>   {
>     char * D.2210;
> 
>     _1 = __builtin_object_size (__src, 1);
>     D.2210 = __builtin___strlen_chk (__src, _1);
>     return D.2210;
>   }
> 
> and having __strlen_chk() abort if __strc were longer than _1
> be also bad?  (If not -- I sincerely hope that's the answer --
> then I'll be happy to put together a patch for that.  In fact,
> I think it would be useful to extend this to all string
> functions (i.e., have them all abort on reads past the end,
> just as they abort on writes).
So the key here is _b_o_s should be honoring the semantics of GIMPLE.
It can/will return "don't know" or sizes potentially larger than the
object in some cases.

Jeff
> 
> Martin



Re: [PATCH] Make strlen range computations more conservative

2018-08-06 Thread Jeff Law
On 08/06/2018 10:32 AM, Martin Sebor wrote:
> On 08/06/2018 09:12 AM, Jeff Law wrote:
>> On 08/04/2018 03:56 PM, Martin Sebor wrote:
>>> On 08/03/2018 01:00 AM, Jeff Law wrote:
 On 07/24/2018 05:18 PM, Bernd Edlinger wrote:
> On 07/24/18 23:46, Jeff Law wrote:
>> On 07/24/2018 01:59 AM, Bernd Edlinger wrote:
>>> Hi!
>>>
>>> This patch makes strlen range computations more conservative.
>>>
>>> Firstly if there is a visible type cast from type A to B before
>>> passing
>>> then value to strlen, don't expect the type layout of B to restrict
>>> the
>>> possible return value range of strlen.
>> Why do you think this is the right thing to do?  ie, is there
>> language
>> in the standards that makes you think the code as it stands today is
>> incorrect from a conformance standpoint?  Is there a significant
>> body of
>> code that is affected in an adverse way by the current code?  If so,
>> what code?
>>
>>
>
> I think if you have an object, of an effective type A say char[100],
> then
> you can cast the address of A to B, say typedef char (*B)[2] for
> instance
> and then to const char *, say for use in strlen.  I may be wrong, but
> I think
> that we should at least try not to pick up char[2] from B, but instead
> use A for strlen ranges, or leave this range open.  Currently the
> range
> info for strlen is [0..1] in this case, even if we see the type cast
> in the generic tree.
 ISTM that you're essentially saying that the cast to const char *
 destroys any type information we can exploit here.  But if that's the
 case, then I don't think we can even derive a range of [0,99].  What's
 to say that "A" didn't result from a similar cast of some object that
 was char[200] that happened out of the scope of what we could see
 during
 the strlen range computation?

 If that is what you're arguing, then I think there's a re-evaluation
 that needs to happen WRT strlen range computation/

 And just to be clear, I do see this as a significant correctness
 question.

 Martin, thoughts?
>>>
>>> The argument is that given:
>>>
>>>   struct S { char a[4], b; };
>>>
>>>   char a[8] = "1234567";
>>>
>>> this is valid and should pass:
>>>
>>>   __attribute__ ((noipa))
>>>   void f (struct S *p)
>>>   {
>>>     assert (7 == strlen (p->a));
>>>   }
>>>
>>>   int main (void)
>>>   {
>>>     f ((struct S*)a);
>>>   }
>>>
>>> (This is the basic premise behind pr86259.)
>>>
>>> This argument is wrong and the code is invalid.  For the access
>>> to p->a to be valid p must point to an object of struct S (it
>>> doesn't) and the p->a array must hold a nul-terminated string
>>> (it also doesn't).
>> I agree with you for C/C++, but I think it's been shown elsewhere in
>> this thread that GIMPLE semantics to not respect the subobject
>> boundaries.  That's a sad reality.
>>
>> [ ... ]
>>
>>>
>>> I care less about the optimization than I do about the basic
>>> premise that it's essential to respect subobject boundaries(*).
>> I understand, but the semantics of GIMPLE do not respect them.  We can
>> argue about whether or not those should change and what it would take to
>> fix that. But right now the existing semantics do not respect those
>> boundaries.
> 
> They don't respect them in all cases (i.e., when MEM_REF loses
> information about the structure of an access) but in a good
> number of them GCC can still derive useful information from
> the access.  It's relied on to great a effect by _FORTIFTY_SOURCE.
> I think it would be a welcome enhancement if besides out-of-
> bounds writes _FORTIFTY_SOURCE also prevented out-of-bounds
> reads.
> 
>>> It would make little sense to undo the strlen optimization
>>> without also undoing the optimization for the direct array
>>> access case.  Undoing either would raise the question about
>>> the validity of the _FORRTIFY_SOURCE=2 behavior.  That would
>>> be a huge step backwards in terms of code security.  If we
>>> did some of these but not others, it would make the behavior
>>> inconsistent and surprising, all to accommodate one instance
>>> of invalid code.
>> In the direct array access case I think (and I'm sure Jakub, Richi and
>> others will correct me if I'm wrong), we can use the object's type
>> because the dereferences are actually using the array's type.
> 
> Subscripting and pointer access are identical both in C/C++
> and in GCC's _FORTIFY_SOURCE.  The absence of a distinction
> between the two is essential for preventing writes past
> the end by string functions like strcpy (_FORTIFY_SOURCE).
> 
>>> If we had a valid test case where the strlen optimization
>>> leads to invalid code, or even if there were a significant
>>> number of bug reports showing that it breaks an invalid
>>> but common idiom, I would certainly feel compelled to
>>> make it right somehow.  But there has been just one bug

Re: [PATCH] Make strlen range computations more conservative

2018-08-06 Thread Martin Sebor

But as I said, far more essential than the optimization is
the ability to detect these invalid access (both reads and
writes), such as in:


The essential thing is to not introduce latent wrong code issues because you 
exploit C language constraints that are not preserved by GIMPLE transforms 
because they are not constraints in the GIMPLE IL _WHICH_ _IS_ _NOT_ _C_.


You misunderstood my point: I'm saying "if you must, disable
the strlen optimization but please don't compromise the bug
detection."

Martin


Re: [PATCH] Add fix-it hint for missing return statement in assignment operators (PR c++/85523)

2018-08-06 Thread H.J. Lu
On Fri, Aug 3, 2018 at 11:41 AM, David Malcolm  wrote:
> On Tue, 2018-05-01 at 07:18 -0400, Nathan Sidwell wrote:
>> On 04/30/2018 08:29 PM, David Malcolm wrote:
>> > Following on from the thread on the "gcc" list here:
>> >
>> >https://gcc.gnu.org/ml/gcc/2018-04/msg00172.html
>> >
>> > here's an updated version of Jonathan's patch, which:
>> > +   {
>> > + tree valtype = TREE_TYPE (DECL_RESULT (fndecl));
>> > + if (TREE_CODE (valtype) == REFERENCE_TYPE
>> > + && same_type_ignoring_top_level_qualifiers_p
>> > + (TREE_TYPE (valtype), TREE_TYPE
>> > (current_class_ref)))
>>
>> While this is probably close enough, you could
>> *) use convert_to_base to include cases where the return type's a
>> base
>> of the current object.
>> *) convert_to_base would also allow dropping the REFERENCE_TYPE
>> check
>> here, so icky code returning by-value could also be caught.
>>
>> Up to you.  But otherwise ok.
>
> Sorry about the belated response; this fell off my radar for some
> reason.
>
> I looked at updating it to support the cases you suggest, but I wasn't
> happy with issuing the fix-it hint for them, so I've gone with the
> patch as-is.
>
> Committed to trunk as r263298 (after rebasing and re-testing)
>

This caused:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86872


-- 
H.J.


Re: [PATCH] convert braced initializers to strings (PR 71625)

2018-08-06 Thread Martin Sebor

On 08/06/2018 11:04 AM, Joseph Myers wrote:

On Mon, 6 Aug 2018, Martin Sebor wrote:


Ping: https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01884.html


I'd expect testcases with signed char and unsigned char as well, if those
work for C, including tests for signed char where some of the initializers
are negative.  (Tests that actual array contents are still correct after
this conversion, as well as that the optimizations occur, would also be a
good idea.)

The c-parser.c patch adds a comment that ends in the middle of a word.


Thanks.  Adding more tests revealed a couple of oversights:
1) using tree_fits_uhwi_p excluded initializers with negative
values,
2) skipping embedded nuls made it possible to create a string
with fewer elements than the initializer array, which caused
arrays with unspecified bound to be smaller than they would
have been otherwise

The attached update fixes both of these and makes the C/C++
front-end code simpler and more alike.

Martin

PR tree-optimization/71625 - missing strlen optimization on different array initialization style

gcc/c/ChangeLog:

	PR tree-optimization/71625
	* c-parser.c (c_parser_declaration_or_fndef): Call
	braced_list_to_string.

gcc/c-family/ChangeLog:

	PR tree-optimization/71625
	* c-common.c (braced_list_to_string): New function.
	* c-common.h (braced_list_to_string): Declare it.

gcc/cp/ChangeLog:

	PR tree-optimization/71625
	* parser.c (cp_parser_init_declarator):  Call braced_list_to_string.

gcc/testsuite/ChangeLog:

	PR tree-optimization/71625
	* g++.dg/init/string2.C: New test.
	* g++.dg/init/string3.C: New test.
	* gcc.dg/strlenopt-55.c: New test.

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
Index: gcc/c/c-parser.c
===
--- gcc/c/c-parser.c	(revision 263341)
+++ gcc/c/c-parser.c	(working copy)
@@ -2126,6 +2126,24 @@ c_parser_declaration_or_fndef (c_parser *parser, b
 	  if (d != error_mark_node)
 		{
 		  maybe_warn_string_init (init_loc, TREE_TYPE (d), init);
+
+		  /* Convert a string CONSTRUCTOR into a STRING_CST.  */
+		  tree valtype = TREE_TYPE (init.value);
+		  if (TREE_CODE (init.value) == CONSTRUCTOR
+		  && TREE_CODE (valtype) == ARRAY_TYPE)
+		{
+		  if (TYPE_STRING_FLAG (TREE_TYPE (valtype)))
+			{
+			  if (tree str = braced_list_to_string (valtype,
+init.value))
+			{
+			  /* Replace the initializer with the string
+ constant.  */
+			  init.value = str;
+			}
+			}
+		}
+
 		  finish_decl (d, init_loc, init.value,
 			   init.original_type, asm_name);
 		}
Index: gcc/c-family/c-common.c
===
--- gcc/c-family/c-common.c	(revision 263341)
+++ gcc/c-family/c-common.c	(working copy)
@@ -8509,4 +8509,83 @@ maybe_add_include_fixit (rich_location *richloc, c
   free (text);
 }
 
+/* Attempt to convert a braced array initializer list CTOR for array
+   TYPE into a STRING_CST for convenience and efficiency.  When non-null,
+   use EVAL to attempt to evalue constants (used by C++).  Return
+   the converted string on success or null on failure.  */
+
+tree
+braced_list_to_string (tree type, tree ctor, tree (*eval)(tree))
+{
+  /* If the array has an explicit bound, use it to constrain the size
+ of the string.  If it doesn't, be sure to create a string that's
+ as long as implied by the index of the last zero specified via
+ a designator, as in:
+   const char a[] = { [7] = 0 };  */
+  unsigned HOST_WIDE_INT maxelts = HOST_WIDE_INT_M1U;
+  if (tree nelts = TYPE_SIZE_UNIT (type))
+if (tree_fits_uhwi_p (nelts))
+  {
+	maxelts = tree_to_uhwi (nelts);
+	maxelts /= tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (type)));
+  }
+
+  unsigned HOST_WIDE_INT nelts = CONSTRUCTOR_NELTS (ctor);
+
+  auto_vec str;
+  str.reserve (nelts + 1);
+
+  unsigned HOST_WIDE_INT i;
+  tree index, value;
+
+  FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (ctor), i, index, value)
+{
+  unsigned HOST_WIDE_INT idx = index ? tree_to_uhwi (index) : i;
+
+  /* auto_vec is limited to UINT_MAX elements.  */
+  if (idx > UINT_MAX)
+	return NULL_TREE;
+
+  /* Attempt to evaluate constants.  */
+  if (eval)
+	value = eval (value);
+
+  /* Avoid non-constant initializers.  */
+ if (!tree_fits_shwi_p (value))
+	return NULL_TREE;
+
+  /* Skip over embedded nuls.  */
+  unsigned val = tree_to_shwi (value);
+  if (!val && i + 1 < nelts)
+	continue;
+
+  /* Bail if the CTOR has a block of more than 256 embedded nuls
+	 due to implicitly initialized elements.  */
+  unsigned nelts = (idx - str.length ()) + 1;
+  if (nelts > 256)
+	return NULL_TREE;
+
+  if (nelts > 1)
+	{
+	  str.reserve (idx);
+	  str.quick_grow_cleared (idx);
+	}
+
+  if (idx > maxelts)
+	return NULL_TREE;
+
+  str.safe_insert (idx, val);
+}
+
+  if (!nelts || str.length () < i)
+/* Append a nul for the empty initializer { } a

Re: [PATCH] Make strlen range computations more conservative

2018-08-06 Thread Martin Sebor

On 07/26/2018 02:55 AM, Richard Biener wrote:

On Wed, 25 Jul 2018, Martin Sebor wrote:


BUT - for the string_constant and c_strlen functions we are,
in all cases we return something interesting, able to look
at an initializer which then determines that type.  Hopefully.
I think the strlen() folding code when it sets SSA ranges
now looks at types ...?

Consider

struct X { int i; char c[4]; int j;};
struct Y { char c[16]; };

void foo (struct X *p, struct Y *q)
{
  memcpy (p, q, sizeof (struct Y));
  if (strlen ((char *)(struct Y *)p + 4) < 7)
abort ();
}

here the GIMPLE IL looks like

  const char * _1;

   [local count: 1073741825]:
  _5 = MEM[(char * {ref-all})q_4(D)];
  MEM[(char * {ref-all})p_6(D)] = _5;
  _1 = p_6(D) + 4;
  _2 = __builtin_strlen (_1);

and I guess Martin would argue that since p is of type struct X
+ 4 gets you to c[4] and thus strlen of that cannot be larger
than 3.  But of course the middle-end doesn't work like that
and luckily we do not try to draw such conclusions or we
are somehow lucky that for the testcase as written above we do not
(I'm not sure whether Martins changes in this area would derive
such conclusions in principle).


Only if the strlen argument were p->c.


NOTE - we do not know the dynamic type here since we do not know
the dynamic type of the memory pointed-to by q!  We can only
derive that at q+4 there must be some object that we can
validly call strlen on (where Martin again thinks strlen
imposes constrains that memchr does not - sth I do not agree
with from a QOI perspective)


The dynamic type is a murky area.


It's well-specified in the middle-end.  A store changes the
dynamic type of the stored-to object.  If that type is
compatible with the surrounding objects dynamic type that one
is not affected, if not then the surrounding objects dynamic
type becomes unspecified.  There is TYPE_TYPELESS_STORAGE
to somewhat control "compatibility" of subobjects.


I never responded to this.  Using a dynamic (effective) type as
you describe it would invalidate the aggressive loop optimization
in the following:

  void foo (struct X *p)
  {
  struct Y y = { "12345678" };
  memcpy (p, &y, sizeof (struct Y));

  // *p effective type is now struct Y

  int n = 0;
  while (p->c[n])
++n;

  if (n < 7)
abort ();
  }

GCC unconditionally aborts, just as it does with strlen(p->c).
Why is that not wrong (in either case)?

Because the code is invalid either way, for two reasons:

1) it accesses an object of (effective) type struct Y via
   an lvalue of type struct X (specifically via (*p).c)
2) it relies on p->c

The loop optimization relies on the exact same requirement
as the strlen one.  Either they are both valid or neither is.

Martin


Re: [PATCH] Make strlen range computations more conservative

2018-08-06 Thread Martin Sebor

On 08/06/2018 11:40 AM, Jeff Law wrote:

On 08/06/2018 11:15 AM, Martin Sebor wrote:

These examples do not aim to be valid C, they just point out limitations
of the middle-end design, and a good deal of the problems are due
to trying to do things that are not safe within the boundaries given
by the middle-end design.

I really think this is important -- and as such I think we need to move
away from trying to describe scenarios in C because doing so keeps
bringing us back to the "C doesn't allow XYZ" kinds of arguments when
what we're really discussing are GIMPLE semantic issues.

So examples should be GIMPLE.  You might start with (possibly invalid) C
code to generate the GIMPLE, but the actual discussion needs to be
looking at GIMPLE.  We might include the C code in case someone wants to
look at things in a debugger, but bringing the focus to GIMPLE is really
important here.


I don't understand the goal of this exercise.  Unless the GIMPLE
code is the result of a valid test case (in some language GCC
supports), what does it matter what it looks like?  The basis of
every single transformation done by a compiler is that the source
code is correct.  If it isn't then all bets are off.  I'm no GIMPLE
expert but even I can come up with any number of GIMPLE expressions
that have undefined behavior.  What would that prove?

The GIMPLE IL is less restrictive than the original source language.
The process of translation into GIMPLE and optimization can create
situations in the GIMPLE IL that can't be validly represented in the
original source language.  Subobject crossing being one such case, there
are certainly others.  We have to handle these scenarios correctly.


Sure, but a valid C test case still needs to exist to show that
such a transformation is possible.  Until someone comes up with
one it's all speculation.

Under normal circumstances the burden of proof that there is
a problem is on the reporter.  In this case, the requirement
has turned into one to prove a negative.  Effectively, you
are asking for a proof that there is no bug, either in
the assumptions behind the strlen optimization, or somewhere
else in GCC that would lead the optimization to invalidate
a valid piece of code.  That's impossible.

Martin


[RFC,PATCH] Introduce -msdata=explicit for powerpc

2018-08-06 Thread Alexandre Oliva
This option allows users to manually select what goes in the limited
small data range, and still get smaller and faster small data access
sequences for the selected data.


We've considered adding a new attribute, say "sdata", to let the
compiler pick the sdata/sbss section name itself, but that's not
strictly necessary since attribute section already does what we need.
I'm not sure how the semantics of conflicting attributes should be
implemented, but if others think it's a good idea, I could give it a
shot.  Like, if attribute((sdata, section("data"))) is given, should the
variable be placed in .data but be accessed using sdata addressing
modes?  Should we reject that with an error?  Should we warn and ignore
one of the attributes?  Something else?


I saw comments, docs and init code that suggested the possibility of
using r2/.sdata2 for small data, but I couldn't get code to be generated
for such access, even with very old toolchains.  I'm not sure I'm just
missing how this magic comes about, or whether it hasn't been available
for a very long time but nobody removed the comments/docs.  Assuming I'm
missing something, I put in the possibility of using r2 in the test in
the patch, but I'm sure I have not exercised it to make sure I got it
right.  Help?


I have not YET given this much testing.  I'm posting it so as to give
ppc maintainers an opportunity to comment on the proposed approach, in
hopes of getting buy-in for the idea, if not necessarily for the patch,
but I welcome alternate suggestions to enable users to choose what goes
in faster sdata when there's too much data for size-based assignment to
place interesting variables in sdata without overflowing its range.


for  gcc/ChangeLog

* config/rs6000/rs6000-opts.h (SDATA_EXPLICIT): New enumerator.
* config/rs6000/rs6000.c (rs6000_debug_reg_global): Handle it.
(rs6000_file_start): Likewise.
(rs6000_elf_in_small_data_p): Likewise.
(SMALL_DATA_EABI_P): New, likewise.
(SMALL_DATA_RELOC, SMALL_DATA_REG): Use it.
* config/rs6000/sysv4.h (SUBTARGET_OVERRIDE_OPTIONS): Check for
and set up SDATA_EXPLICIT.
* config/rs6000/sysv4.opt: Add explicit to -msdata.
* doc/invoke.texi (-msdata=explicit): Document it.

for  gcc/testsuite/ChangeLog

* gcc.target/powerpc/ppc-sdata-3.c: New.

diff --git a/gcc/config/rs6000/rs6000-opts.h b/gcc/config/rs6000/rs6000-opts.h
index a8194783e018..cc780f569308 100644
--- a/gcc/config/rs6000/rs6000-opts.h
+++ b/gcc/config/rs6000/rs6000-opts.h
@@ -120,7 +120,8 @@ enum rs6000_sdata_type {
   SDATA_NONE,  /* No small data support.  */
   SDATA_DATA,  /* Just put data in .sbss/.sdata, don't use 
relocs.  */
   SDATA_SYSV,  /* Use r13 to point to .sdata/.sbss.  */
-  SDATA_EABI   /* Use r13 like above, r2 points to 
.sdata2/.sbss2.  */
+  SDATA_EABI,  /* Use r13 like above, r2 points to 
.sdata2/.sbss2.  */
+  SDATA_EXPLICIT   /* Use r13 (or r2?) for explicit sdata.  */
 };
 
 /* Type of traceback to use.  */
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index ddc61bdaffe7..a679709a055f 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -2819,6 +2819,10 @@ rs6000_debug_reg_global (void)
   fprintf (stderr, DEBUG_FMT_S, "sdata", "eabi");
   break;
 
+case SDATA_EXPLICIT:
+  fprintf (stderr, DEBUG_FMT_S, "sdata", "explicit");
+  break;
+
 }
 
   switch (rs6000_traceback)
@@ -6098,6 +6102,8 @@ rs6000_file_start (void)
case SDATA_DATA: fprintf (file, "%s -msdata=data", start); start = ""; 
break;
case SDATA_SYSV: fprintf (file, "%s -msdata=sysv", start); start = ""; 
break;
case SDATA_EABI: fprintf (file, "%s -msdata=eabi", start); start = ""; 
break;
+   case SDATA_EXPLICIT:
+ fprintf (file, "%s -msdata=explicit", start); start = ""; break;
}
 
   if (rs6000_sdata && g_switch_value)
@@ -21240,8 +21246,10 @@ rs6000_output_function_entry (FILE *file, const char 
*fname)
 /* Print an operand.  Recognize special options, documented below.  */
 
 #if TARGET_ELF
-#define SMALL_DATA_RELOC ((rs6000_sdata == SDATA_EABI) ? "sda21" : "sdarel")
-#define SMALL_DATA_REG ((rs6000_sdata == SDATA_EABI) ? 0 : 13)
+#define SMALL_DATA_EABI_P (rs6000_sdata == SDATA_EABI \
+  || (rs6000_sdata == SDATA_EXPLICIT && TARGET_EABI))
+#define SMALL_DATA_RELOC (SMALL_DATA_EABI_P ? "sda21" : "sdarel")
+#define SMALL_DATA_REG (SMALL_DATA_EABI_P ? 0 : 13)
 #else
 #define SMALL_DATA_RELOC "sda21"
 #define SMALL_DATA_REG 0
@@ -33221,6 +33229,11 @@ rs6000_elf_in_small_data_p (const_tree decl)
 }
   else
 {
+  /* Explicit mode means no implicit assignment to small data
+sections.  */
+  if (rs6000_sdata == SDATA_EXPLICIT)
+   return false;
+
   /* If we are told not to put readonly data in sdata, then don't.  */
   if 

Re: [PATCH] Make strlen range computations more conservative

2018-08-06 Thread Richard Biener
On August 7, 2018 5:38:59 AM GMT+02:00, Martin Sebor  wrote:
>On 08/06/2018 11:40 AM, Jeff Law wrote:
>> On 08/06/2018 11:15 AM, Martin Sebor wrote:
> These examples do not aim to be valid C, they just point out
>limitations
> of the middle-end design, and a good deal of the problems are due
> to trying to do things that are not safe within the boundaries
>given
> by the middle-end design.
 I really think this is important -- and as such I think we need to
>move
 away from trying to describe scenarios in C because doing so keeps
 bringing us back to the "C doesn't allow XYZ" kinds of arguments
>when
 what we're really discussing are GIMPLE semantic issues.

 So examples should be GIMPLE.  You might start with (possibly
>invalid) C
 code to generate the GIMPLE, but the actual discussion needs to be
 looking at GIMPLE.  We might include the C code in case someone
>wants to
 look at things in a debugger, but bringing the focus to GIMPLE is
>really
 important here.
>>>
>>> I don't understand the goal of this exercise.  Unless the GIMPLE
>>> code is the result of a valid test case (in some language GCC
>>> supports), what does it matter what it looks like?  The basis of
>>> every single transformation done by a compiler is that the source
>>> code is correct.  If it isn't then all bets are off.  I'm no GIMPLE
>>> expert but even I can come up with any number of GIMPLE expressions
>>> that have undefined behavior.  What would that prove?
>> The GIMPLE IL is less restrictive than the original source language.
>> The process of translation into GIMPLE and optimization can create
>> situations in the GIMPLE IL that can't be validly represented in the
>> original source language.  Subobject crossing being one such case,
>there
>> are certainly others.  We have to handle these scenarios correctly.
>
>Sure, but a valid C test case still needs to exist to show that
>such a transformation is possible.  Until someone comes up with
>one it's all speculation.

Jakub showed you one wrt CSE of addresses. 

Richard. 

>Under normal circumstances the burden of proof that there is
>a problem is on the reporter.  In this case, the requirement
>has turned into one to prove a negative.  Effectively, you
>are asking for a proof that there is no bug, either in
>the assumptions behind the strlen optimization, or somewhere
>else in GCC that would lead the optimization to invalidate
>a valid piece of code.  That's impossible.
>
>Martin



Re: [PATCH] Introduce instance discriminators

2018-08-06 Thread Alexandre Oliva
On Jul 30, 2018, Alexandre Oliva  wrote:

>> for  gcc/ada

>> * trans.c: Include debug.h.
>> (file_map): New static variable.
>> (gigi): Set it.  Create decl_to_instance_map when needed.
>> (Subprogram_Body_to_gnu): Pass gnu_subprog_decl to...
>> (Sloc_to_locus): ... this.  Add decl parm, map it to instance.
>> * gigi.h (Sloc_to_locus): Adjust declaration.

I noticed I missed the gcc-interface dir in these entries.  I'm
installing this patch to gcc/ada/ChangeLog to fix it.

diff --git a/gcc/ada/ChangeLog b/gcc/ada/ChangeLog
index 1095165f99fb..792811fb989c 100644
--- a/gcc/ada/ChangeLog
+++ b/gcc/ada/ChangeLog
@@ -10,12 +10,12 @@
 2018-07-31  Alexandre Oliva  
 Olivier Hainque  
 
-   * trans.c: Include debug.h.
+   * gcc-interface/trans.c: Include debug.h.
(file_map): New static variable.
(gigi): Set it.  Create decl_to_instance_map when needed.
(Subprogram_Body_to_gnu): Pass gnu_subprog_decl to...
(Sloc_to_locus): ... this.  Add decl parm, map it to instance.
-   * gigi.h (Sloc_to_locus): Adjust declaration.
+   * gcc-interface/gigi.h (Sloc_to_locus): Adjust declaration.
 
 2018-07-31  Arnaud Charlet  
 

-- 
Alexandre Oliva, freedom fighter   https://FSFLA.org/blogs/lxo
Be the change, be Free! FSF Latin America board member
GNU Toolchain EngineerFree Software Evangelist


Re: [PATCH,nvptx] Use CUDA driver API to select default runtime launch, geometry

2018-08-06 Thread Tom de Vries
On 08/01/2018 12:18 PM, Tom de Vries wrote:

> I think we need to add and handle:
> ...
>   CUDA_ONE_CALL_MAYBE_NULL (cuOccupancyMaxPotentialBlockSize)
> ...
> 

I realized that the patch I posted introducing CUDA_ONE_CALL_MAYBE_NULL
was incomplete, and needed to use the weak attribute in case of linking
against a concrete libcuda.so.

So, I've now committed a patch implementing just CUDA_ONE_CALL_MAYBE_NULL:
"[libgomp, nvptx] Handle CUDA_ONE_CALL_MAYBE_NULL" @
https://gcc.gnu.org/ml/gcc-patches/2018-08/msg00447.html . You can use
"CUDA_CALL_EXISTS (cuOccupancyMaxPotentialBlockSize)" to test for
existence of the function in the cuda driver API.

> The patch doesn't build in a setup with
> --enable-offload-targets=nvptx-none and without cuda, that enables usage
> of plugin/cuda/cuda.h:
> ...
> /data/offload-nvptx/src/libgomp/plugin/plugin-nvptx.c:98:16: error:
> ‘cuOccupancyMaxPotentialBlockSize’ undeclared here (not in a function);
> did you mean ‘cuOccupancyMaxPotentialBlockSizeWithFlags’?
>  CUDA_ONE_CALL (cuOccupancyMaxPotentialBlockSize) \
> ...
> 

I've committed a patch "[libgomp, nvptx, --without-cuda-driver] Don't
use system cuda driver" @
https://gcc.gnu.org/ml/gcc-patches/2018-08/msg00348.html .

Using --without-cuda-driver should make it easy to build using the
dlopen interface without having to de-install the system libcuda.so.

Thanks,
- Tom


Re: [Patch, avr, PR85624] - Fix ICE when initializing 128-byte aligned array

2018-08-06 Thread Senthil Kumar Selvaraj


Jeff Law writes:

> On 08/06/2018 04:53 AM, Senthil Kumar Selvaraj wrote:
>> Ping!
>> 
>> Regards
>> Senthil
>> 
>> Senthil Kumar Selvaraj writes:
>> 
>>> Hi,
>>>
>>> The below patch fixes an ICE for the avr target when the setmemhi
>>> expander is involved.
>>>
>>> The setmemhi expander generated RTL ends up as an unrecognized insn
>>> if the alignment of the destination exceeds that of a QI
>>> mode const_int (127), AND the number of bytes to set fits in a QI
>>> mode const_int. The second condition prevents *clrmemhi from matching,
>>> and *clrmemqi does not match because it expects operand 3 (the alignment
>>> const_int rtx) to be QI mode, and a value of 128 or greater does not fit.
>>>   
>>> The patch fixes this by changing the *clrmemqi pattern to match a HI
>>> mode const_int, and also adds a testcase.
>>>
>>> Regression test showed no new failures, ok to commit to trunk?
>>>
>>> Regards
>>> Senthil
>>>
>>> gcc/ChangeLog:
>>> 
>>> 2018-07-18  Senthil Kumar Selvaraj  
>>> 
>>> PR target/85624
>>> * config/avr/avr.md (*clrmemqi): Change mode of operands[2]
>>> from QI to HI.
>>> 
>>> gcc/testsuite/ChangeLog:
>>> 
>>> 2018-07-18  Senthil Kumar Selvaraj  
>>> 
>>> PR target/85624
>>> * gcc.target/avr/pr85624.c: New test.
> Given there's also pattern clrmemhi it feels like you're papering over a
> bug elsewhere, possibly in the expanders.

clrmemhi and clrmemqi differ on the mode of the register used to hold
the number of bytes to clear. The setmemhi expander picks a QI or HI
mode reg depending on the width of the size operand, and the
clrmem{qi,hi} match on that.

The operand whose mode I modified represents the alignment of the
destination, and isn't actually used in the assembler template.

Regards
Senthil
>
> Ultimately I'll leave the decision here to the AVR maintainers through.
>
> jeff