Re: [PING]: [PATCH]: Conditionally include target specific files while building TSAN
Hi Rainer, Please find the corrected patch attached. I removed some eval statements I added for debugging. regards, Venkat, On 24 January 2015 at 13:23, Venkataramanan Kumar wrote: > Hi Rainer, > > I reused libgcc's "host_address" test and the patch passed normal > bootstrap in x86_64. > > Can you please check if this is fine ? > > regards, > Venkat. > > On 24 January 2015 at 12:53, Rainer Orth > wrote: >> Hi Venkat, >> >>> Yes thanks I will work on fixing this. Let me know if I need to revert >>> the patch meanwhile. >> >> I don't think this is urgent enough to justify reversion. >> >> Thanks. >> Rainer >> >> -- >> - >> Rainer Orth, Center for Biotechnology, Bielefeld University Index: libsanitizer/ChangeLog === --- libsanitizer/ChangeLog (revision 220077) +++ libsanitizer/ChangeLog (working copy) @@ -1,5 +1,10 @@ 2015-01-25 Venkataramanan Kumar + * configure.ac: Set host_address to 64 or 32. + * configure: Regenerate. + +2015-01-25 Venkataramanan Kumar + * configure.ac (TSAN_TARGET_DEPENDENT_OBJECTS): Define. * configure: Regenerate. * tsan/Makefile.am (EXTRA_libtsan_la_SOURCES): Define. Index: libsanitizer/configure === --- libsanitizer/configure (revision 220077) +++ libsanitizer/configure (working copy) @@ -16363,12 +16363,29 @@ fi -case "${target}" in - x86_64-*-linux-*) TSAN_TARGET_DEPENDENT_OBJECTS='tsan_rtl_amd64.lo' ;; - *) TSAN_TARGET_DEPENDENT_OBJECTS='' ;; -esac +# Check 32bit or 64bit. In the case of MIPS, this really determines the +# word size rather than the address size. +cat > conftest.cconftest.c <
Re: [PING]: [PATCH]: Conditionally include target specific files while building TSAN
On Sat, Jan 24, 2015 at 01:23:22PM +0530, Venkataramanan Kumar wrote:
> I reused libgcc's "host_address" test and the patch passed normal
> bootstrap in x86_64.
>
> Can you please check if this is fine ?

Can't you just use what configure.tgt already uses?

  x86_64-*-linux* | i?86-*-linux*)
        if test x$ac_cv_sizeof_void_p = x8; then
                TSAN_SUPPORTED=yes
                LSAN_SUPPORTED=yes
        fi
        ;;

Just make sure AC_CHECK_SIZEOF([void *]) is above this (seems it is).

So

TSAN_TARGET_DEPENDENT_OBJECTS=
case "${target}" in
  x86_64-*-linux* | i?86-*-linux*)
    if test x$ac_cv_sizeof_void_p = x8; then
      TSAN_TARGET_DEPENDENT_OBJECTS=tsan_rtl_amd64.lo
    fi;;
esac
AC_SUBST([TSAN_TARGET_DEPENDENT_OBJECTS])

?
Or even better move the TSAN_TARGET_DEPENDENT_OBJECTS initialization
to configure.tgt and just keep AC_SUBST([TSAN_TARGET_DEPENDENT_OBJECTS])
in configure.ac.

        Jakub
Re: PATCH: Update host_detect_local_cpu for new Intel processors
On Sat, Jan 24, 2015 at 3:50 AM, H.J. Lu wrote: > On Fri, Jan 23, 2015 at 06:37:01PM -0800, H.J. Lu wrote: >> The new Silvermont, Haswell and Broadwell model numbers are in >> >> http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual.pdf >> >> This patch updates host_detect_local_cpu to check new Silvermont, Haswell >> and Broadwell model numbers. OK for trunk and 4.9 branch? >> >> Thanks. >> > > There are more model numbers in CHAPTER 35 MODEL-SPECIFIC REGISTERS (MSRS): > > http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-system-programming-manual-325384.pdf > > OK for trunk and 4.9 branch? > > Thanks. > > > H.J. > --- > 2015-01-23 H.J. Lu > > * config/i386/driver-i386.c (host_detect_local_cpu): Check new > Silvermont, Haswell, Broadwell and Knights Landing model numbers. Please also update /libgcc/config/i386/cpuinfo.c and relevant testsuite files. Uros.
Re: [Patch, Fortran, OOP] PR 64230: [4.9/5 Regression] Invalid memory reference in a compiler-generated finalizer for allocatable component
ping! 2015-01-19 15:41 GMT+01:00 Janus Weil : > Hi, > > this is a second patch dealing with finalization-related regressions, > after the one I submitted yesterday > (https://gcc.gnu.org/ml/fortran/2015-01/msg00109.html), which btw is > also still waiting for review ... > > This patch fixes an invalid memory reference inside the finalizer > routine (at runtime), which apparently was caused by dereferencing a > pointer without checking if it's NULL. I simply insert a call to > ASSOCIATED. > > I also rename two different runtime variables, which were both called > 'ptr', to 'ptr1' and 'ptr2', just to make it easier to distinguish > them in the dump. > > I also have the feeling that a lot of what is being done in > generate_finalization_wrapper and finalize_component (including my > changes) is a bit laborious. Some helper functions might be useful to > make all that code generation a bit more readable and less verbose. I > may attack this in a follow-up patch. > > This one regtests cleanly on x86_64-unknown-linux-gnu. Ok for trunk and 4.9? > > Cheers, > Janus > > > > 2015-01-19 Janus Weil > > PR fortran/64230 > * class.c (finalize_component): New argument 'sub_ns'. Insert code to > check if 'expr' is associated. > (generate_finalization_wrapper): Rename 'ptr' symbols to 'ptr1' and > 'ptr2'. Pass 'sub_ns' to finalize_component. > > 2015-01-19 Janus Weil > > PR fortran/64230 > * gfortran.dg/class_allocate_18.f90: Extended.
Re: PATCH: Update host_detect_local_cpu for new Intel processors
On Sat, Jan 24, 2015 at 11:13 AM, Uros Bizjak wrote: > On Sat, Jan 24, 2015 at 3:50 AM, H.J. Lu wrote: >> On Fri, Jan 23, 2015 at 06:37:01PM -0800, H.J. Lu wrote: >>> The new Silvermont, Haswell and Broadwell model numbers are in >>> >>> http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual.pdf >>> >>> This patch updates host_detect_local_cpu to check new Silvermont, Haswell >>> and Broadwell model numbers. OK for trunk and 4.9 branch? >>> >>> Thanks. >>> >> >> There are more model numbers in CHAPTER 35 MODEL-SPECIFIC REGISTERS (MSRS): >> >> http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-system-programming-manual-325384.pdf >> >> OK for trunk and 4.9 branch? >> >> Thanks. >> >> >> H.J. >> --- >> 2015-01-23 H.J. Lu >> >> * config/i386/driver-i386.c (host_detect_local_cpu): Check new >> Silvermont, Haswell, Broadwell and Knights Landing model numbers. > > Please also update /libgcc/config/i386/cpuinfo.c and relevant testsuite files. Huh, complete AVX512 handling is missing in the above file. Adding Kirill to CC: Uros.
Re: [Patch, i386] Support BMI and BMI2 targets in multiversioning
On Mon, Jan 12, 2015 at 6:02 PM, Uros Bizjak wrote: > Hello! > >>> On Wed, Dec 31, 2014 at 01:28:47PM +0100, Allan Sandfeld Jensen wrote: >>> > I recently wanted to use multiversioning for BMI2 specific extensions >>> > PDEP/PEXT, and noticed it wasn't there. So I wrote this patch to add it, >>> > and also added AES, F16C and BMI1 for completeness. >>> >>> AES nor F16C doesn't make any sense IMHO for multiversioning, you need >>> special intrinsics for that anyway and when you use them, the function will >>> fail to compile without those features. >>> Multiversioning only makes sense for ISA features the compiler uses for >>> normal C/C++ code without any intrinsics. >>> >> Patch reduced to just adding BMI and BMI2 multiversioning: > > +2014-12-29 Allan Sandfeld Jensen > + > + * config/i386/i386.c (get_builtin_code_for_version): Add > + support for BMI and BMI2 multiversion functions. > > +2014-12-29 Allan Sandfeld Jensen > + > + * gcc.target/i386/funcspec-5.c: Test new multiversion targets. > + * g++.dg/ext/mv17.C: Test BMI/BMI2 multiversion dispatcher. > > +2014-12-29 Allan Sandfeld Jensen > + > + * config/i386/cpuinfo.c (enum processor_features): Add FEATURE_BMI and > + FEATURE_BMI2. > + (get_available_features): Detect FEATURE_BMI and FEATURE_BMI2. > > OK for mainline Allan, did you commit the patch to mainline? I don't see it in SVN logs. (If you don't have SVN commit access, please mention it in the patch submission, so someone will commit the patch for you). Uros.
Re: [PATCH x86] Add march/mtune=knl
On 10-12-14 17:35, Uros Bizjak wrote: On Wed, Dec 10, 2014 at 5:20 PM, Ilya Tocar wrote: gcc/testsuite/ * gcc.target/i386/funcspec-5.c: Test avx512f and knl. --- a/gcc/testsuite/gcc.target/i386/funcspec-5.c +++ b/gcc/testsuite/gcc.target/i386/funcspec-5.c +extern void test_avx512 (void) __attribute__((__target__("avx512"))); +extern void test_no_avx512 (void) __attribute__((__target__("no-avx512"))); funcspec-5.c is currently failing (mentioned in PR64342) with: ... Excess errors: src/gcc/testsuite/gcc.target/i386/funcspec-5.c:27:1: error: attribute(target("avx512")) is unknown src/gcc/testsuite/gcc.target/i386/funcspec-5.c:50:1: error: attribute(target("no-avx512")) is unknown ... Given the use of avx512f in the ChangeLog entry, I assume avx512f was meant in the attributes instead of avx512? Attached patch ok for stage4 trunk? Thanks, - Tom 2015-01-24 Tom de Vries * gcc.target/i386/funcspec-5.c: Replace avx512 with avx512f. --- gcc/testsuite/gcc.target/i386/funcspec-5.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/gcc/testsuite/gcc.target/i386/funcspec-5.c b/gcc/testsuite/gcc.target/i386/funcspec-5.c index 269e610..d796484 100644 --- a/gcc/testsuite/gcc.target/i386/funcspec-5.c +++ b/gcc/testsuite/gcc.target/i386/funcspec-5.c @@ -24,7 +24,7 @@ extern void test_ssse3 (void) __attribute__((__target__("ssse3"))); extern void test_tbm (void) __attribute__((__target__("tbm"))); extern void test_avx (void) __attribute__((__target__("avx"))); extern void test_avx2 (void) __attribute__((__target__("avx2"))); -extern void test_avx512 (void) __attribute__((__target__("avx512"))); +extern void test_avx512f (void) __attribute__((__target__("avx512f"))); extern void test_no_abm (void) __attribute__((__target__("no-abm"))); extern void test_no_aes (void) __attribute__((__target__("no-aes"))); @@ -47,7 +47,7 @@ extern void test_no_ssse3 (void) __attribute__((__target__("no-ssse3"))); extern void test_no_tbm (void) __attribute__((__target__("no-tbm"))); 
extern void test_no_avx (void) __attribute__((__target__("no-avx"))); extern void test_no_avx2 (void) __attribute__((__target__("no-avx2"))); -extern void test_no_avx512 (void) __attribute__((__target__("no-avx512"))); +extern void test_no_avx512f (void) __attribute__((__target__("no-avx512f"))); extern void test_arch_i386 (void) __attribute__((__target__("arch=i386"))); extern void test_arch_i486 (void) __attribute__((__target__("arch=i486"))); -- 1.9.1
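As a small illustration of the spelling issue fixed by the patch above: the quoted errors show that "avx512" is not a known target name, while "avx512f" is the name used in the fix. The sketch below is not part of the testsuite; the macro ISA_TARGET and the two function names are hypothetical, and the attribute is only applied when building for x86 with a GNU-compatible compiler.

```c
/* Apply the target attribute only where it is meaningful; elsewhere the
   macro expands to nothing so the file still compiles.  */
#if defined (__GNUC__) && (defined (__x86_64__) || defined (__i386__))
#define ISA_TARGET(name) __attribute__ ((__target__ (name)))
#else
#define ISA_TARGET(name)
#endif

/* "avx512f" is the name used in the fixed test; per the errors quoted in
   the mail, "avx512" and "no-avx512" are reported as unknown.  */
ISA_TARGET ("avx512f") int
add_with_avx512f (int a, int b)
{
  return a + b;
}

ISA_TARGET ("no-avx512f") int
add_without_avx512f (int a, int b)
{
  return a + b;
}
```

The bodies are deliberately trivial scalar code; the point is only which attribute strings are accepted.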
Re: [PATCH x86] Add march/mtune=knl
On Sat, Jan 24, 2015 at 11:40 AM, Tom de Vries wrote: > On 10-12-14 17:35, Uros Bizjak wrote: >> >> On Wed, Dec 10, 2014 at 5:20 PM, Ilya Tocar >> wrote: > > >>> gcc/testsuite/ >>> * gcc.target/i386/funcspec-5.c: Test avx512f and knl. >> >> > >>> --- a/gcc/testsuite/gcc.target/i386/funcspec-5.c >>> +++ b/gcc/testsuite/gcc.target/i386/funcspec-5.c > > >>> +extern void test_avx512 (void) >>> __attribute__((__target__("avx512"))); > > >>> +extern void test_no_avx512 (void) >>> __attribute__((__target__("no-avx512"))); > > > funcspec-5.c is currently failing (mentioned in PR64342) with: > ... > Excess errors: > src/gcc/testsuite/gcc.target/i386/funcspec-5.c:27:1: error: > attribute(target("avx512")) is unknown > src/gcc/testsuite/gcc.target/i386/funcspec-5.c:50:1: error: > attribute(target("no-avx512")) is unknown > ... > > Given the use of avx512f in the ChangeLog entry, I assume avx512f was meant > in the attributes instead of avx512? > > Attached patch ok for stage4 trunk? OK. Thanks, Uros.
Re: [Patch, i386] Support BMI and BMI2 targets in multiversioning
On Saturday 24 January 2015, Uros Bizjak wrote: > On Mon, Jan 12, 2015 at 6:02 PM, Uros Bizjak wrote: > > Hello! > > > >>> On Wed, Dec 31, 2014 at 01:28:47PM +0100, Allan Sandfeld Jensen wrote: > >>> > I recently wanted to use multiversioning for BMI2 specific extensions > >>> > PDEP/PEXT, and noticed it wasn't there. So I wrote this patch to add > >>> > it, and also added AES, F16C and BMI1 for completeness. > >>> > >>> AES nor F16C doesn't make any sense IMHO for multiversioning, you need > >>> special intrinsics for that anyway and when you use them, the function > >>> will fail to compile without those features. > >>> Multiversioning only makes sense for ISA features the compiler uses for > >>> normal C/C++ code without any intrinsics. > >> > >> Patch reduced to just adding BMI and BMI2 multiversioning: > > +2014-12-29 Allan Sandfeld Jensen > > + > > + * config/i386/i386.c (get_builtin_code_for_version): Add > > + support for BMI and BMI2 multiversion functions. > > > > +2014-12-29 Allan Sandfeld Jensen > > + > > + * gcc.target/i386/funcspec-5.c: Test new multiversion targets. > > + * g++.dg/ext/mv17.C: Test BMI/BMI2 multiversion dispatcher. > > > > +2014-12-29 Allan Sandfeld Jensen > > + > > + * config/i386/cpuinfo.c (enum processor_features): Add FEATURE_BMI and > > + FEATURE_BMI2. > > + (get_available_features): Detect FEATURE_BMI and FEATURE_BMI2. > > > > OK for mainline > > Allan, did you commit the patch to mainline? I don't see it in SVN logs. > > (If you don't have SVN commit access, please mention it in the patch > submission, so someone will commit the patch for you). > Sorry. I don't have SVN commit access. Allan
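The remark quoted in this thread, that multiversioning is useful for ISA features the compiler uses on ordinary C/C++ code, can be illustrated with a hand-written dispatcher in plain C. This is only a sketch, not the dispatcher the patch generates: the function names and the HAVE_BMI_DISPATCH macro are hypothetical, and it assumes a compiler that accepts __builtin_cpu_supports ("bmi"), which is the kind of query the FEATURE_BMI addition to cpuinfo.c is meant to back. On other targets it falls back to the generic version.

```c
/* Attempt run-time dispatch only on x86 with a compiler assumed to
   provide __builtin_cpu_supports ("bmi").  */
#if (defined (__x86_64__) || defined (__i386__)) \
    && (defined (__clang__) || (defined (__GNUC__) && __GNUC__ >= 5))
#define HAVE_BMI_DISPATCH 1
#else
#define HAVE_BMI_DISPATCH 0
#endif

/* Baseline version: clear the lowest set bit of X.  */
static unsigned int
clear_lowest_bit_generic (unsigned int x)
{
  return x & (x - 1);
}

#if HAVE_BMI_DISPATCH
/* Same source, no intrinsics; with BMI enabled the compiler is free to
   pick a BMI instruction for this expression.  */
static __attribute__ ((__target__ ("bmi"))) unsigned int
clear_lowest_bit_bmi (unsigned int x)
{
  return x & (x - 1);
}
#endif

/* Hand-written dispatcher: choose a version at run time.  */
unsigned int
clear_lowest_bit (unsigned int x)
{
#if HAVE_BMI_DISPATCH
  if (__builtin_cpu_supports ("bmi"))
    return clear_lowest_bit_bmi (x);
#endif
  return clear_lowest_bit_generic (x);
}
```

Both versions compute the same value; only the instructions the compiler may select differ, which is why no intrinsics are needed here.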
Re: [PATCH, RFC] LRA subreg handling
Jeff Law writes: > On 01/15/15 03:13, Robert Suchanek wrote: >>> Robert, can you look at reload.c::reload_inner_reg_of_subreg and verify >>> that the comment just before its return statement is effectively the >>> situation you're in. >>> >>> There are certainly cases where a SUBREG needs to be treated as an >>> in-out operand. We walked through them eons ago when we were poking at >>> SSA for RTL. But the details have long since faded from memory. >> >> The comment pretty much applies to my situation. The only difference I can >> see is that reload would have had hard registers at this point. In >> the testcase, >> LRA does not have hard registers assigned to the concerned pseudo(s), thus, >> it can't rely on the information in hard_regno_nregs to check if the number of >> registers in INNER is different to the number of words in INNER. But the code you quote is: if (GET_CODE (*loc) == SUBREG) { reg = SUBREG_REG (*loc); byte = SUBREG_BYTE (*loc); if (REG_P (reg) /* Strict_low_part requires reload the register not the sub-register. */ && (curr_static_id->operand[i].strict_low || (GET_MODE_SIZE (mode) <= GET_MODE_SIZE (GET_MODE (reg)) && (hard_regno = get_try_hard_regno (REGNO (reg))) >= 0 && (simplify_subreg_regno (hard_regno, GET_MODE (reg), byte, mode) < 0) && (goal_alt[i] == NO_REGS || (simplify_subreg_regno (ira_class_hard_regs[goal_alt[i]][0], GET_MODE (reg), byte, mode) >= 0 Here we do have a hard register, but it isn't valid to form the subreg on that hard register. Reload had to cope with that case too. Since the subreg on the original hard register is invalid, we can't use it to decide whether the intention was to write to only a part of the inner register. But I don't think we need to use the hard register here. The original register was a pseudo and I'm pretty sure... > The differences (hard vs pseudo regs) are primarily an implementation > detail. 
I was really looking to see if there was existing code which > would turn an output reload into an in-out reload for these subregs. > > The in-out nature of certain subregs is something I've personally > stumbled over in various contexts (for example, this also came up during > RTL-SSA investigations years ago). ...the rule for pseudos is that words of a multiword pseudo can be accessed independently but subword pieces of an individual word can't. This obviously isn't ideal if a mode is intended for wider-than-word registers, since not all words will be independently addressable when allocated to those registers. The code above is partly dealing with the fallout from that. It's also why we have strict_lowpart for cases like al on i386. So IMO the patch is too broad. I think it should only use INOUT reloads for !strict_low if the inner mode is wider than a word and the outer mode is strictly narrower than the inner mode. That's on top of Vlad's comment about only converting OP_OUTs, of course. Thanks, Richard
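The narrower condition proposed at the end of this mail can be written down as a tiny predicate. This is a hypothetical sketch, not LRA code: the function name and the byte-size parameters are assumptions, it covers only the !strict_low case, and it ignores the separate point about only converting OP_OUT operands.

```c
#include <stdbool.h>

/* Hypothetical predicate: should a write to (subreg:OUTER (reg:INNER))
   be reloaded as in-out?  All sizes are in bytes.  Following the
   proposal in the mail, this is true only when the inner mode is wider
   than a word and the outer mode is strictly narrower than the inner
   mode.  */
bool
subreg_write_needs_inout (unsigned int inner_size,
                          unsigned int outer_size,
                          unsigned int word_size)
{
  return inner_size > word_size && outer_size < inner_size;
}
```

With a 4-byte word, an 8-byte inner register written through a 4-byte subreg satisfies the condition, while a 1-byte subreg of a 4-byte register does not, since that inner mode is not wider than a word.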
Re: RFA: patch to fix a bad code generation for PR64110 -- new constraints addition
Jeff Law writes: > On 01/14/15 16:52, Vladimir Makarov wrote: >>The problem of unexpected code generation is discussed on >> >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64110 >> >>The following patch introduces 2 new constraints '^' and '$' which >> are analogous to '?' and '!' but disfavor given alternative when *the >> operand with the new constraint* needs a reload ('?' and '!' disfavor >> the alternative if *any* operand needs a reload). I hope the new >> constraints will be useful for other insns and targets. > Right. This gives us finer grained control over when to disparage an > alternative. > > Reloading some of the operands in an alternative may not be a big deal, > but there may be other operands in the alternative that if a reload was > needed for that operand would be so bad that we'd want to reject the > entire alternative. > > The example I had in mind when I read Vlad's analysis in the BZ were the > old movb and addb patterns on the PA. Basically we have some side > effect like addition/subtraction/register copy along with a conditional > jump. > > (define_insn "" >[(set (pc) > (if_then_else >(match_operator 2 "movb_comparison_operator" > [(match_operand:SI 1 "register_operand" "r,r,r,r") > (const_int 0)]) >(label_ref (match_operand 3 "" "")) >(pc))) > (set (match_operand:SI 0 "reg_before_reload_operand" "=!r,!*f,*Q,!*q") > (match_dup 1))] > > Needing a reload for operand 1 really isn't a big deal here, but > reloading operand 0 is a disaster. This would be a good place to use > the new constraint modifiers. > > I can distinctly recall running into similar issues on other ports > through the years. I wouldn't be at all surprised if a notable > percentage of the "!" and "?"s that appear in our machine descriptions > would be better off as "^" and "$". Yeah. I expect in practice most people who used "?" and "!" attached them to a particular operand for a reason. 
From a quick scan through 386.exp it looked like almost all uses would either want this behaviour or wouldn't care. An interesting exception is: (define_insn "extendsidi2_1" [(set (match_operand:DI 0 "nonimmediate_operand" "=*A,r,?r,?*o") (sign_extend:DI (match_operand:SI 1 "register_operand" "0,0,r,r"))) (clobber (reg:CC FLAGS_REG)) (clobber (match_scratch:SI 2 "=X,X,X,&r"))] "!TARGET_64BIT" "#") I don't know how effective the third alternative is with LRA. Surely a "r<-0" alternative is by definition a case where "r<-r" is possible only with a "?"-cost reload? Seems to me we could just delete it. But assuming it does some good, I suppose the "?" really does apply to the alternative as a whole. If we had to reload operand 1 or operand 0, there's an extra cost if it can't use the same register as the other operand. Wouldn't it be better to make "?" and "!" behave the new way and only add new constraints if it turns out that the old behaviour really is useful in some cases? Maybe stage 4 isn't the time to be making that kind of change. Still, it'd be great if someone who's set up to do x86_64 benchmarking could measure the effect of making "?" and "!" behave like the new constraints. Thanks, Richard
Re: [PATCH] Reenable CSE of non-volatile inline asm (PR rtl-optimization/63637)
Segher Boessenkool writes:
> On Fri, Jan 23, 2015 at 10:48:50PM +0100, Jakub Jelinek wrote:
>> On Fri, Jan 23, 2015 at 03:39:40PM -0600, Segher Boessenkool wrote:
>> > I understand that argument.  But it is not what GCC actually does, nor
>> > what I think it should do.  Consider this program:
>> >
>> > --- 8< ---
>> > int main(void)
>> > {
>> >     int x[100], y[100];
>> >
>> >     x[31] = 42;
>> >
>> >     asm("# eww %0" : "=m"(y[4]) : : "memory");
>> >
>> >     return 0;
>> > }
>> > --- 8< ---
>>
>> Here x isn't addressable, so it is certainly fine to DSE it.
>> x shouldn't be considered memory.
>> If the address of x escaped, either to the assembly or to some global var
>> etc., then it probably shouldn't be removed.
>
> But GCC does consider it memory.  If you look at the (tree) dump files
> you see both arrays are clobbered after the asm.  Tree DCE removes the
> store to x[31] nevertheless.
>
> If the address of x escapes then of course the store to x[31] should
> not be removed, irrespective of whether the clobber implies a read
> or not.

Just tried some other examples out of curiosity.  In:

int main(void)
{
  int x[100], y[100];

  asm volatile("# foo" :: "r"(x));
  x[31] = 42;
  asm("# eww %0" : "=m"(y[4]) : : "memory");

  return 0;
}

"x[31]" can only validly escape to the second asm.  In this case the
assignment is kept, as it is with:

int main(void)
{
  int x[100], y;

  asm volatile("# foo" :: "r"(x));
  x[31] = 42;
  asm("# eww %0" : "=r"(y) : : "memory");

  return y;
}

But remove the clobber and it goes away:

int main(void)
{
  int x[100], y;

  asm volatile("# foo" :: "r"(x));
  x[31] = 42;
  asm("# eww %0" : "=r"(y));

  return y;
}

So it looks like these four cases (including yours) are handled correctly.

Thanks,
Richard
Re: [PATCH RFA MIPS] Prohibit vector modes in accumulators
Matthew Fortune writes: >> 2015-01-23 Robert Suchanek >> >> * config/mips/mips.c (mips_hard_regno_mode_ok_p): Prohibit >> accumulators >> for all vector modes. > > This seems like a genuine bug and although it can only be triggered by > loongson or paired-single support it probably qualifies for fixing. Agreed FWIW. We shouldn't mark something as valid for a mode if even the mode's move pattern can't handle it. I think this kind of thing should go in regardless of development stage. Thanks, Richard
Re: PATCH: Update host_detect_local_cpu for new Intel processors
On Sat, Jan 24, 2015 at 2:16 AM, Uros Bizjak wrote: > On Sat, Jan 24, 2015 at 11:13 AM, Uros Bizjak wrote: >> On Sat, Jan 24, 2015 at 3:50 AM, H.J. Lu wrote: >>> On Fri, Jan 23, 2015 at 06:37:01PM -0800, H.J. Lu wrote: The new Silvermont, Haswell and Broadwell model numbers are in http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual.pdf This patch updates host_detect_local_cpu to check new Silvermont, Haswell and Broadwell model numbers. OK for trunk and 4.9 branch? Thanks. >>> >>> There are more model numbers in CHAPTER 35 MODEL-SPECIFIC REGISTERS (MSRS): >>> >>> http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-system-programming-manual-325384.pdf >>> >>> OK for trunk and 4.9 branch? >>> >>> Thanks. >>> >>> >>> H.J. >>> --- >>> 2015-01-23 H.J. Lu >>> >>> * config/i386/driver-i386.c (host_detect_local_cpu): Check new >>> Silvermont, Haswell, Broadwell and Knights Landing model numbers. >> >> Please also update /libgcc/config/i386/cpuinfo.c and relevant testsuite >> files. > > Huh, complete AVX512 handling is missing in the above file. Adding Kirill to > CC: > > Uros. Yes, model 0x4d is Sky Lake. Kirill, I don't think we want to tune it like KNL. I think it is closer to Broadwell than KNL. Before we have -mtune=skylake, we should make -mtune=native behave as -mtune=broadwell on Sky Lake. -- H.J.
Re: RFA: patch to fix a bad code generation for PR64110 -- new constraints addition
On Sat, Jan 24, 2015 at 3:29 AM, Richard Sandiford wrote: > Jeff Law writes: >> On 01/14/15 16:52, Vladimir Makarov wrote: >>>The problem of unexpected code generation is discussed on >>> >>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64110 >>> >>>The following patch introduces 2 new constraints '^' and '$' which >>> are analogous to '?' and '!' but disfavor given alternative when *the >>> operand with the new constraint* needs a reload ('?' and '!' disfavor >>> the alternative if *any* operand needs a reload). I hope the new >>> constraints will be useful for other insns and targets. >> Right. This gives us finer grained control over when to disparage an >> alternative. >> >> Reloading some of the operands in an alternative may not be a big deal, >> but there may be other operands in the alternative that if a reload was >> needed for that operand would be so bad that we'd want to reject the >> entire alternative. >> >> The example I had in mind when I read Vlad's analysis in the BZ were the >> old movb and addb patterns on the PA. Basically we have some side >> effect like addition/subtraction/register copy along with a conditional >> jump. >> >> (define_insn "" >>[(set (pc) >> (if_then_else >>(match_operator 2 "movb_comparison_operator" >> [(match_operand:SI 1 "register_operand" "r,r,r,r") >> (const_int 0)]) >>(label_ref (match_operand 3 "" "")) >>(pc))) >> (set (match_operand:SI 0 "reg_before_reload_operand" "=!r,!*f,*Q,!*q") >> (match_dup 1))] >> >> Needing a reload for operand 1 really isn't a big deal here, but >> reloading operand 0 is a disaster. This would be a good place to use >> the new constraint modifiers. >> >> I can distinctly recall running into similar issues on other ports >> through the years. I wouldn't be at all surprised if a notable >> percentage of the "!" and "?"s that appear in our machine descriptions >> would be better off as "^" and "$". > > Yeah. I expect in practice most people who used "?" and "!" 
attached > them to a particular operand for a reason. From a quick scan through > 386.exp it looked like almost all uses would either want this behaviour > or wouldn't care. An interesting exception is: > > (define_insn "extendsidi2_1" > [(set (match_operand:DI 0 "nonimmediate_operand" "=*A,r,?r,?*o") > (sign_extend:DI (match_operand:SI 1 "register_operand" "0,0,r,r"))) >(clobber (reg:CC FLAGS_REG)) >(clobber (match_scratch:SI 2 "=X,X,X,&r"))] > "!TARGET_64BIT" > "#") > > I don't know how effective the third alternative is with LRA. Surely > a "r<-0" alternative is by definition a case where "r<-r" is possible > only with a "?"-cost reload? Seems to me we could just delete it. > But assuming it does some good, I suppose the "?" really does apply to > the alternative as a whole. If we had to reload operand 1 or operand 0, > there's an extra cost if it can't use the same register as the other > operand. > > Wouldn't it be better to make "?" and "!" behave the new way and only > add new constraints if it turns out that the old behaviour really is > useful in some cases? > > Maybe stage 4 isn't the time to be making that kind of change. > Still, it'd be great if someone who's set up to do x86_64 benchmarking > could measure the effect of making "?" and "!" behave like the > new constraints. Areg, can we run some benchmarks? -- H.J.
Re: PATCH: Update host_detect_local_cpu for new Intel processors
On Sat, Jan 24, 2015 at 11:16:42AM +0100, Uros Bizjak wrote: > On Sat, Jan 24, 2015 at 11:13 AM, Uros Bizjak wrote: > > On Sat, Jan 24, 2015 at 3:50 AM, H.J. Lu wrote: > >> On Fri, Jan 23, 2015 at 06:37:01PM -0800, H.J. Lu wrote: > >>> The new Silvermont, Haswell and Broadwell model numbers are in > >>> > >>> http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual.pdf > >>> > >>> This patch updates host_detect_local_cpu to check new Silvermont, Haswell > >>> and Broadwell model numbers. OK for trunk and 4.9 branch? > >>> > >>> Thanks. > >>> > >> > >> There are more model numbers in CHAPTER 35 MODEL-SPECIFIC REGISTERS (MSRS): > >> > >> http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-system-programming-manual-325384.pdf > >> > >> OK for trunk and 4.9 branch? > >> > >> Thanks. > >> > >> > >> H.J. > >> --- > >> 2015-01-23 H.J. Lu > >> > >> * config/i386/driver-i386.c (host_detect_local_cpu): Check new > >> Silvermont, Haswell, Broadwell and Knights Landing model numbers. > > > > Please also update /libgcc/config/i386/cpuinfo.c and relevant testsuite > > files. > > Huh, complete AVX512 handling is missing in the above file. Adding Kirill to > CC: > Here is the updated patch. Tested on Ivy Bridge. OK for trunk and 4.9 branches? I will leave AVX512 to Kirill. Thanks. H.J. --- gcc/ 2015-01-24 H.J. Lu * config/i386/driver-i386.c (host_detect_local_cpu): Check new Silvermont, Haswell, Broadwell and Knights Landing model numbers. * config/i386/i386.c (processor_model): Add M_INTEL_COREI7_BROADWELL. (arch_names_table): Add "broadwell". gcc/testsuite/ 2015-01-24 H.J. Lu * gcc.target/i386/builtin_target.c (check_intel_cpu_model): Add Silvermont, Ivy Bridge, Haswell and Broadwell tests. Update Sandy Bridge test. libgcc/ 2015-01-24 H.J. Lu * config/i386/cpuinfo.c (processor_subtypes): Add INTEL_COREI7_BROADWELL. 
(get_intel_cpu): Support new Silvermont, Haswell and Broadwell model numbers. diff --git a/gcc/config/i386/driver-i386.c b/gcc/config/i386/driver-i386.c index c731c50..c69149d 100644 --- a/gcc/config/i386/driver-i386.c +++ b/gcc/config/i386/driver-i386.c @@ -703,7 +703,10 @@ const char *host_detect_local_cpu (int argc, const char **argv) cpu = "bonnell"; break; case 0x37: + case 0x4a: case 0x4d: + case 0x5a: + case 0x5d: /* Silvermont. */ cpu = "silvermont"; break; @@ -738,11 +741,22 @@ const char *host_detect_local_cpu (int argc, const char **argv) cpu = "ivybridge"; break; case 0x3c: + case 0x3f: case 0x45: case 0x46: /* Haswell. */ cpu = "haswell"; break; + case 0x3d: + case 0x4f: + case 0x56: + /* Broadwell. */ + cpu = "broadwell"; + break; + case 0x57: + /* Knights Landing. */ + cpu = "knl"; + break; default: if (arch) { diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index d10d3ff..b3ae575 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -35333,7 +35333,8 @@ fold_builtin_cpu (tree fndecl, tree *args) M_AMDFAM15H_BDVER3, M_AMDFAM15H_BDVER4, M_INTEL_COREI7_IVYBRIDGE, -M_INTEL_COREI7_HASWELL +M_INTEL_COREI7_HASWELL, +M_INTEL_COREI7_BROADWELL }; static struct _arch_names_table @@ -35354,6 +35355,7 @@ fold_builtin_cpu (tree fndecl, tree *args) {"sandybridge", M_INTEL_COREI7_SANDYBRIDGE}, {"ivybridge", M_INTEL_COREI7_IVYBRIDGE}, {"haswell", M_INTEL_COREI7_HASWELL}, + {"broadwell", M_INTEL_COREI7_BROADWELL}, {"bonnell", M_INTEL_BONNELL}, {"silvermont", M_INTEL_SILVERMONT}, {"knl", M_INTEL_KNL}, diff --git a/gcc/testsuite/gcc.target/i386/builtin_target.c b/gcc/testsuite/gcc.target/i386/builtin_target.c index b6a3eee..10c0568 100644 --- a/gcc/testsuite/gcc.target/i386/builtin_target.c +++ b/gcc/testsuite/gcc.target/i386/builtin_target.c @@ -30,6 +30,14 @@ check_intel_cpu_model (unsigned int family, unsigned int model, /* Atom. 
*/ assert (__builtin_cpu_is ("atom")); break; + case 0x37: + case 0x4a: + case 0x4d: + case 0x5a: + case 0x5d: + /* Silvermont. */ + assert (__builtin_cpu_is ("silvermont")); + break; case 0x1a: case 0x1e: case 0x1f: @@ -46,10 +54,32 @@ check_intel_cpu_model (unsigned int family, unsigned int model, assert (__builtin_cpu_is ("westmere")); break; case 0x2a: + case 0x2d: /* Sandy Bridge. */ assert (__builtin_cpu_is (
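The model-number checks added to host_detect_local_cpu in the driver-i386.c hunk above can be summarised as a small lookup. This is an illustrative sketch only, not the driver code: both function names are hypothetical, the table lists just the Silvermont, Haswell, Broadwell and Knights Landing cases visible in the posted diff, and the decoding helper assumes the usual CPUID leaf 1 layout in which the extended model field supplies the upper four bits for families 6 and 15.

```c
#include <stddef.h>

/* Decode the display model from CPUID leaf 1 EAX.  For family 6 and
   family 15 the extended model field (bits 16-19) forms the upper four
   bits of the model number.  */
unsigned int
cpuid_display_model (unsigned int eax)
{
  unsigned int family = (eax >> 8) & 0x0f;
  unsigned int model = (eax >> 4) & 0x0f;
  unsigned int ext_model = (eax >> 16) & 0x0f;

  if (family == 0x06 || family == 0x0f)
    model |= ext_model << 4;
  return model;
}

/* Hypothetical helper: map a family-6 model number to the cpu name the
   posted diff assigns to it.  NULL means "not covered by this sketch".  */
const char *
intel_model_to_cpu_name (unsigned int model)
{
  switch (model)
    {
    case 0x37: case 0x4a: case 0x4d: case 0x5a: case 0x5d:
      return "silvermont";
    case 0x3c: case 0x3f: case 0x45: case 0x46:
      return "haswell";
    case 0x3d: case 0x4f: case 0x56:
      return "broadwell";
    case 0x57:
      return "knl";
    default:
      return NULL;
    }
}
```

For example, an EAX value of 0x000306c3 decodes to model 0x3c, which the table maps to "haswell".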
Re: [PING]: [PATCH]: Conditionally include target specific files while building TSAN
Hi Jakub, On 24 January 2015 at 14:40, Jakub Jelinek wrote: > On Sat, Jan 24, 2015 at 01:23:22PM +0530, Venkataramanan Kumar wrote: >> I reused libgcc's "host_address" test and the patch passed normal >> bootstrap in x86_64. >> >> Can you please check if this is fine ? > > Can't you just use what configure.tgt already uses? > > x86_64-*-linux* | i?86-*-linux*) > if test x$ac_cv_sizeof_void_p = x8; then > TSAN_SUPPORTED=yes > LSAN_SUPPORTED=yes > fi > ;; > > Just make sure AC_CHECK_SIZEOF([void *]) is above this (seems it is). > > So > > TSAN_TARGET_DEPENDENT_OBJECTS= > case "${target}" in > x86_64-*-linux* | i?86-*-linux*) > if test x$ac_cv_sizeof_void_p = x8; then > TSAN_TARGET_DEPENDENT_OBJECTS=tsan_rtl_amd64.lo > fi;; > esac > AC_SUBST([TSAN_TARGET_DEPENDENT_OBJECTS]) > > ? > Or even better move the TSAN_TARGET_DEPENDENT_OBJECTS initialization > to configure.tgt and just keep AC_SUBST([TSAN_TARGET_DEPENDENT_OBJECTS]) > in configure.ac. > > Jakub As per your suggestion, I moved TSAN_TARGET_DEPENDENT_OBJECTS to "configure.tgt"; it now also covers i?86 targets. Bootstrapped on x86_64 and AArch64. regards, Venkat. Index: libsanitizer/ChangeLog === --- libsanitizer/ChangeLog (revision 220079) +++ libsanitizer/ChangeLog (working copy) @@ -1,5 +1,11 @@ 2015-01-25 Venkataramanan Kumar + * configure.ac (TSAN_TARGET_DEPENDENT_OBJECTS): Undefine. + * configure: Regenerate. + * configure.tgt (TSAN_TARGET_DEPENDENT_OBJECTS): Define. + +2015-01-25 Venkataramanan Kumar + * configure.ac (TSAN_TARGET_DEPENDENT_OBJECTS): Define. * configure: Regenerate. * tsan/Makefile.am (EXTRA_libtsan_la_SOURCES): Define. 
Index: libsanitizer/configure === --- libsanitizer/configure (revision 220079) +++ libsanitizer/configure (working copy) @@ -16363,10 +16363,6 @@ fi -case "${target}" in - x86_64-*-linux-*) TSAN_TARGET_DEPENDENT_OBJECTS='tsan_rtl_amd64.lo' ;; - *) TSAN_TARGET_DEPENDENT_OBJECTS='' ;; -esac cat >confcache <<\_ACEOF Index: libsanitizer/configure.ac === --- libsanitizer/configure.ac (revision 220079) +++ libsanitizer/configure.ac (working copy) @@ -346,10 +346,6 @@ ]) fi -case "${target}" in - x86_64-*-linux-*) TSAN_TARGET_DEPENDENT_OBJECTS='tsan_rtl_amd64.lo' ;; - *) TSAN_TARGET_DEPENDENT_OBJECTS='' ;; -esac AC_SUBST([TSAN_TARGET_DEPENDENT_OBJECTS]) AC_OUTPUT Index: libsanitizer/configure.tgt === --- libsanitizer/configure.tgt (revision 220079) +++ libsanitizer/configure.tgt (working copy) @@ -19,11 +19,13 @@ # lets us skip running autoconf when modifying target specific information. # Filter out unsupported systems. +TSAN_TARGET_DEPENDENT_OBJECTS= case "${target}" in x86_64-*-linux* | i?86-*-linux*) if test x$ac_cv_sizeof_void_p = x8; then TSAN_SUPPORTED=yes LSAN_SUPPORTED=yes + TSAN_TARGET_DEPENDENT_OBJECTS=tsan_rtl_amd64.lo fi ;; powerpc*le-*-linux*)
Re: [PING]: [PATCH]: Conditionally include target specific files while building TSAN
On Sat, Jan 24, 2015 at 08:09:24PM +0530, Venkataramanan Kumar wrote: > Index: libsanitizer/ChangeLog > === > --- libsanitizer/ChangeLog(revision 220079) > +++ libsanitizer/ChangeLog(working copy) > @@ -1,5 +1,11 @@ > 2015-01-25 Venkataramanan Kumar > > + * configure.ac (TSAN_TARGET_DEPENDENT_OBJECTS): Undefine. > + * configure: Regenerate. > + * configure.tgt (TSAN_TARGET_DEPENDENT_OBJECTS): Define. Ok. Jakub
Re: [PING]: [PATCH]: Conditionally include target specific files while building TSAN
Hi Jakub, Thank you and I committed the patch https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=220083. regards, Venkat. On 24 January 2015 at 20:38, Jakub Jelinek wrote: > On Sat, Jan 24, 2015 at 08:09:24PM +0530, Venkataramanan Kumar wrote: >> Index: libsanitizer/ChangeLog >> === >> --- libsanitizer/ChangeLog(revision 220079) >> +++ libsanitizer/ChangeLog(working copy) >> @@ -1,5 +1,11 @@ >> 2015-01-25 Venkataramanan Kumar >> >> + * configure.ac (TSAN_TARGET_DEPENDENT_OBJECTS): Undefine. >> + * configure: Regenerate. >> + * configure.tgt (TSAN_TARGET_DEPENDENT_OBJECTS): Define. > > Ok. > > Jakub
[Patch, Fortran, committed] Fix gfc_error call
Seemingly, we missed a gfc_error call, which takes two locations, which is not yet supported. Hence, the old version (gfc_error_1) has to be used.

Committed as Rev. 220084 as obvious.

Tobias

Index: gcc/fortran/ChangeLog
===
--- gcc/fortran/ChangeLog (Revision 220083)
+++ gcc/fortran/ChangeLog (Arbeitskopie)
@@ -1,3 +1,7 @@
+2015-01-24 Tobias Burnus
+
+ * parse.c (gfc_parse_file): Fix two-location gfc_error call.
+
 2015-01-23 Martin Liska

  * decl.c (attr_decl1): Workaround -Wmaybe-uninitialized
Index: gcc/fortran/parse.c
===
--- gcc/fortran/parse.c (Revision 220083)
+++ gcc/fortran/parse.c (Arbeitskopie)
@@ -5544,7 +5544,7 @@ duplicate_main:
   /* If we see a duplicate main program, shut down. If the second
      instance is an implied main program, i.e. data decls or executable
      statements, we're in for lots of errors. */
-  gfc_error ("Two main PROGRAMs at %L and %C", &prog_locus);
+  gfc_error_1 ("Two main PROGRAMs at %L and %C", &prog_locus);
   reject_statement ();
   gfc_done_2 ();
   return true;
[Patch, Fortran] PR64771 - Fix coarray ICE
Build and regtested on x86-64-gnu-linux. OK for the trunk and 4.9? (It's a regression.) Tobias 2015-01-24 Tobias Burnus PR fortran/64771 gcc/fortran/ * interface.c (check_dummy_characteristics): Fix coarray handling. testsuite/ * gfortran.dg/coarray_36.f: New. * gfortran.dg/coarray_37.f90: New. diff --git a/gcc/fortran/interface.c b/gcc/fortran/interface.c index dd3ad2a..5de416b 100644 --- a/gcc/fortran/interface.c +++ b/gcc/fortran/interface.c @@ -1205,8 +1205,15 @@ check_dummy_characteristics (gfc_symbol *s1, gfc_symbol *s2, return false; } + if (s1->as->corank != s2->as->corank) + { + snprintf (errmsg, err_len, "Corank mismatch in argument '%s' (%i/%i)", + s1->name, s1->as->corank, s2->as->corank); + return false; + } + if (s1->as->type == AS_EXPLICIT) - for (i = 0; i < s1->as->rank + s1->as->corank; i++) + for (i = 0; i < s1->as->rank + std::max(0, s1->as->corank-1); i++) { shape1 = gfc_subtract (gfc_copy_expr (s1->as->upper[i]), gfc_copy_expr (s1->as->lower[i])); @@ -1220,8 +1227,12 @@ check_dummy_characteristics (gfc_symbol *s1, gfc_symbol *s2, case -1: case 1: case -3: - snprintf (errmsg, err_len, "Shape mismatch in dimension %i of " - "argument '%s'", i + 1, s1->name); + if (i < s1->as->rank) + snprintf (errmsg, err_len, "Shape mismatch in dimension %i of" + " argument '%s'", i + 1, s1->name); + else + snprintf (errmsg, err_len, "Shape mismatch in codimension %i " + "of argument '%s'", i - s1->as->rank + 1, s1->name); return false; case -2: diff --git a/gcc/testsuite/gfortran.dg/coarray_36.f b/gcc/testsuite/gfortran.dg/coarray_36.f new file mode 100644 index 000..d06a01e --- /dev/null +++ b/gcc/testsuite/gfortran.dg/coarray_36.f @@ -0,0 +1,347 @@ +! { dg-do compile } +! { dg-options "-fcoarray=lib" } +! +! PR fortran/64771 +! +! Contributed by Alessandro Fanfarill +! +! Reduced version of the full NAS CG benchmark +! + +!-! +! ! +!N A S P A R A L L E L B E N C H M A R K S 3.3 ! +! ! +! C G ! +! ! +!-! +! ! 
+!This benchmark is part of the NAS Parallel Benchmark 3.3 suite. ! +!It is described in NAS Technical Reports 95-020 and 02-007 ! +! ! +!Permission to use, copy, distribute and modify this software ! +!for any purpose with or without fee is hereby granted. We ! +!request, however, that all derived work reference the NAS! +!Parallel Benchmarks 3.3. This software is provided "as is" ! +!without express or implied warranty. ! +! ! +!Information on NPB 3.3, including the technical report, the ! +!original specifications, source code, results and information! +!on how to submit new results, is available at: ! +! ! +! http://www.nas.nasa.gov/Software/NPB/ ! +! ! +!Send comments or suggestions to n...@nas.nasa.gov! +! ! +! NAS Parallel Benchmarks Group ! +! NASA Ames Research Center ! +! Mail Stop: T27A-1 ! +! Moffett Field, CA 94035-1000 ! +! ! +! E-mail: n...@nas.nasa.gov ! +! Fax: (650) 604-3957! +! ! +!-! + + +c- +c +c Authors: M. Yarrow +c C. Kuszmaul +c R. F. Van der Wijngaart +c H. Jin +c +c- + + +c- +c--
Re: [Patch, Fortran, OOP] PR 64230: [4.9/5 Regression] Invalid memory reference in a compiler-generated finalizer for allocatable component
Janus Weil wrote:
> this is a second patch dealing with finalization-related regressions, [...]
> This patch fixes an invalid memory reference inside the finalizer
> routine (at runtime), which apparently was caused by dereferencing a
> pointer without checking if it's NULL. I simply insert a call to
> ASSOCIATED. [...]
> This one regtests cleanly on x86_64-unknown-linux-gnu. Ok for trunk and 4.9?

OK. Thanks for the patch!

Tobias

2015-01-19 Janus Weil

 PR fortran/64230
 * class.c (finalize_component): New argument 'sub_ns'. Insert code to
 check if 'expr' is associated.
 (generate_finalization_wrapper): Rename 'ptr' symbols to 'ptr1' and
 'ptr2'. Pass 'sub_ns' to finalize_component.

2015-01-19 Janus Weil

 PR fortran/64230
 * gfortran.dg/class_allocate_18.f90: Extended.
[Patch, Fortran] PR63861 - fix OpenMP/ACC's gfc_has_alloc_comps
gfortran's scalar coarrays are special: The descriptorless variant is a normal variable with some language-specific additional information (corank, bounds). The descriptor variant has a descriptor but the _data component is just a pointer to the scalar variable.

As the element type of a descriptorless coarray is the type itself, we need to break the while loop.

Built and regtested on x86-64-gnu-linux.
OK for the trunk?

Tobias

PS: I believe coarrays are fine in OpenMP and OpenACC constructs as long as the variable is not coindexed ("variable[remove_index]", gfc_is_coindexed()). Issues like synchronization are in my opinion purely the responsibility of the user.

2015-01-24 Tobias Burnus

 PR fortran/63861
gcc/fortran/
 * trans-openmp.c (gfc_has_alloc_comps): Fix handling for scalar coarrays.

gcc/testsuite/
 * gfortran.dg/goacc/coarray_2.f90: New.

diff --git a/gcc/fortran/trans-openmp.c b/gcc/fortran/trans-openmp.c
index cdd1885..4c7d82d 100644
--- a/gcc/fortran/trans-openmp.c
+++ b/gcc/fortran/trans-openmp.c
@@ -189,7 +189,8 @@ gfc_has_alloc_comps (tree type, tree decl)
       return false;
     }

-  while (GFC_DESCRIPTOR_TYPE_P (type) || GFC_ARRAY_TYPE_P (type))
+  if (GFC_DESCRIPTOR_TYPE_P (type)
+      || (GFC_ARRAY_TYPE_P (type) && GFC_TYPE_ARRAY_RANK (type) == 0))
     type = gfc_get_element_type (type);

   if (TREE_CODE (type) != RECORD_TYPE)
diff --git a/gcc/testsuite/gfortran.dg/goacc/coarray_2.f90 b/gcc/testsuite/gfortran.dg/goacc/coarray_2.f90
new file mode 100644
index 000..7fbd928
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/coarray_2.f90
@@ -0,0 +1,112 @@
+! { dg-do compile }
+! { dg-additional-options "-fcoarray=lib" }
+!
+! 
PR fortran/63861 + +module test +contains + subroutine oacc1(a) +implicit none +integer :: i +integer, codimension[*] :: a +!$acc declare device_resident (a) +!$acc data copy (a) +!$acc end data +!$acc data deviceptr (a) +!$acc end data +!$acc parallel private (a) +!$acc end parallel +!$acc host_data use_device (a) +!$acc end host_data +!$acc parallel loop reduction(+:a) +do i = 1,5 +enddo +!$acc end parallel loop +!$acc parallel loop +do i = 1,5 +enddo +!$acc end parallel loop +!$acc update device (a) +!$acc update host (a) +!$acc update self (a) + end subroutine oacc1 + + subroutine oacc2(a) +implicit none +integer :: i +integer, allocatable, codimension[*] :: a +!$acc declare device_resident (a) +!$acc data copy (a) +!$acc end data +!$acc data deviceptr (a) +!$acc end data +!$acc parallel private (a) +!$acc end parallel +!$acc host_data use_device (a) +!$acc end host_data +!$acc parallel loop reduction(+:a) +do i = 1,5 +enddo +!$acc end parallel loop +!$acc parallel loop +do i = 1,5 +enddo +!$acc end parallel loop +!$acc update device (a) +!$acc update host (a) +!$acc update self (a) + end subroutine oacc2 + + subroutine oacc3(a) +implicit none +integer :: i +integer, codimension[*] :: a(:) +!$acc declare device_resident (a) +!$acc data copy (a) +!$acc end data +!$acc data deviceptr (a) +!$acc end data +!$acc parallel private (a) +!$acc end parallel +!$acc host_data use_device (a) +!$acc end host_data +!$acc parallel loop reduction(+:a) +do i = 1,5 +enddo +!$acc end parallel loop +!$acc parallel loop +do i = 1,5 +enddo +!$acc end parallel loop +!$acc update device (a) +!$acc update host (a) +!$acc update self (a) + end subroutine oacc2 + + subroutine oacc2(a) +implicit none +integer :: i +integer, allocatable, codimension[*] :: a(:) +!$acc declare device_resident (a) +!$acc data copy (a) +!$acc end data +!$acc data deviceptr (a) +!$acc end data +!$acc parallel private (a) +!$acc end parallel +!$acc host_data use_device (a) +!$acc end host_data +!$acc parallel 
loop reduction(+:a) +do i = 1,5 +enddo +!$acc end parallel loop +!$acc parallel loop +do i = 1,5 +enddo +!$acc end parallel loop +!$acc update device (a) +!$acc update host (a) +!$acc update self (a) + end subroutine oacc2 +end module test +! { dg-excess-errors "sorry, unimplemented: directive not yet implemented" }
Re: PATCH: Update host_detect_local_cpu for new Intel processors
On Sat, Jan 24, 2015 at 2:53 PM, H.J. Lu wrote: >> >>> The new Silvermont aswell and Broadwell model numbers are in >> >>> >> >>> http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual.pdf >> >>> >> >>> This patch updates host_detect_local_cpu to check new Silvermont, Haswell >> >>> and Broadwell model numbers. OK for trunk and 4.9 branch? >> >>> >> >>> Thanks. >> >>> >> >> >> >> There are more model numbers in CHAPTER 35 MODEL-SPECIFIC REGISTERS >> >> (MSRS): >> >> >> >> http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-system-programming-manual-325384.pdf >> >> >> >> OK for trunk and 4.9 branch? >> >> >> >> Thanks. >> >> >> >> >> >> H.J. >> >> --- >> >> 2015-01-23 H.J. Lu >> >> >> >> * config/i386/driver-i386.c (host_detect_local_cpu): Check new >> >> Silvermont, Haswell, Broadwell and Knights Landing model numbers. >> > >> > Please also update /libgcc/config/i386/cpuinfo.c and relevant testsuite >> > files. >> >> Huh, complete AVX512 handling is missing in the above file. Adding Kirill to >> CC: >> > > Here is the updated patch. Tested on Ivy Bridge. OK for trunk and 4.9 > branches? > > I will leave AVX512 to Kirill. > > Thanks. > > H.J. > --- > gcc/ > > 2015-01-24 H.J. Lu > > * config/i386/driver-i386.c (host_detect_local_cpu): Check new > Silvermont, Haswell, Broadwell and Knights Landing model numbers. > * config/i386/i386.c (processor_model): Add > M_INTEL_COREI7_BROADWELL. > (arch_names_table): Add "broadwell". > > gcc/testsuite/ > > 2015-01-24 H.J. Lu > > * gcc.target/i386/builtin_target.c (check_intel_cpu_model): Add > Silvermont, Ivy Bridge, Haswell and Broadwell tests. Update Sandy > Bridge test. > > libgcc/ > > 2015-01-24 H.J. Lu > > * config/i386/cpuinfo.c (processor_subtypes): Add > INTEL_COREI7_BROADWELL. > (get_intel_cpu): Support new Silvermont, Haswell and Broadwell > model numbers. OK for trunk and 4.9 after a couple of days. 
Thanks, Uros. > diff --git a/gcc/config/i386/driver-i386.c b/gcc/config/i386/driver-i386.c > index c731c50..c69149d 100644 > --- a/gcc/config/i386/driver-i386.c > +++ b/gcc/config/i386/driver-i386.c > @@ -703,7 +703,10 @@ const char *host_detect_local_cpu (int argc, const char > **argv) > cpu = "bonnell"; > break; > case 0x37: > + case 0x4a: > case 0x4d: > + case 0x5a: > + case 0x5d: > /* Silvermont. */ > cpu = "silvermont"; > break; > @@ -738,11 +741,22 @@ const char *host_detect_local_cpu (int argc, const char > **argv) > cpu = "ivybridge"; > break; > case 0x3c: > + case 0x3f: > case 0x45: > case 0x46: > /* Haswell. */ > cpu = "haswell"; > break; > + case 0x3d: > + case 0x4f: > + case 0x56: > + /* Broadwell. */ > + cpu = "broadwell"; > + break; > + case 0x57: > + /* Knights Landing. */ > + cpu = "knl"; > + break; > default: > if (arch) > { > diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c > index d10d3ff..b3ae575 100644 > --- a/gcc/config/i386/i386.c > +++ b/gcc/config/i386/i386.c > @@ -35333,7 +35333,8 @@ fold_builtin_cpu (tree fndecl, tree *args) > M_AMDFAM15H_BDVER3, > M_AMDFAM15H_BDVER4, > M_INTEL_COREI7_IVYBRIDGE, > -M_INTEL_COREI7_HASWELL > +M_INTEL_COREI7_HASWELL, > +M_INTEL_COREI7_BROADWELL >}; > >static struct _arch_names_table > @@ -35354,6 +35355,7 @@ fold_builtin_cpu (tree fndecl, tree *args) >{"sandybridge", M_INTEL_COREI7_SANDYBRIDGE}, >{"ivybridge", M_INTEL_COREI7_IVYBRIDGE}, >{"haswell", M_INTEL_COREI7_HASWELL}, > + {"broadwell", M_INTEL_COREI7_BROADWELL}, >{"bonnell", M_INTEL_BONNELL}, >{"silvermont", M_INTEL_SILVERMONT}, >{"knl", M_INTEL_KNL}, > diff --git a/gcc/testsuite/gcc.target/i386/builtin_target.c > b/gcc/testsuite/gcc.target/i386/builtin_target.c > index b6a3eee..10c0568 100644 > --- a/gcc/testsuite/gcc.target/i386/builtin_target.c > +++ b/gcc/testsuite/gcc.target/i386/builtin_target.c > @@ -30,6 +30,14 @@ check_intel_cpu_model (unsigned int family, unsigned int > model, > /* Atom. 
*/ > assert (__builtin_cpu_is ("atom")); > break; > + case 0x37: > + case 0x4a: > + case 0x4d: > + case 0x5a: > + case 0x5d: > + /* Silvermont. */ > + assert (__builtin_cpu_is ("silvermont")); > + break; > case 0x1a: > case 0x1e: > case 0x1f: > @@ -46,10 +54,32 @@ check_intel_cpu_model (unsigned int family, unsigned int > model, > assert (__builtin_cpu_is ("westmere"
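For readers following the model-number changes, the mapping that the quoted driver-i386.c hunks add can be summarised in a small standalone sketch. This is not GCC code: the function name intel_model_to_cpu is hypothetical, and only the family-6 model numbers visible in the hunks above are listed, so anything else falls through to a null result.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper, not part of GCC: maps an Intel family-6 model number
// to the CPU name chosen in the quoted host_detect_local_cpu hunks.
// Only the model numbers that appear in those hunks are covered.
const char *intel_model_to_cpu(unsigned int model) {
  switch (model) {
    case 0x37:
    case 0x4a:
    case 0x4d:
    case 0x5a:
    case 0x5d:
      return "silvermont";
    case 0x3c:
    case 0x3f:
    case 0x45:
    case 0x46:
      return "haswell";
    case 0x3d:
    case 0x4f:
    case 0x56:
      return "broadwell";
    case 0x57:
      return "knl";  // Knights Landing
    default:
      return nullptr;  // model not covered by this sketch
  }
}
```

A caller would treat a null result as "unknown model" and fall back to feature-based detection, which is roughly what the default: branch in the real driver does.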
Re: [COMMITTED] Merge libffi with upstream
Bootstrap on FreeBSD 10.x/i386 is now broken:

libtool: compile: /scratch2/tmp/gerald/OBJ-0124-0939/./gcc/xgcc -B/scratch2/tmp/gerald/OBJ-0124-0939/./gcc/ -B/home/gerald/gcc-ref10-i386/i386-unknown-freebsd10.1/bin/ -B/home/gerald/gcc-ref10-i386/i386-unknown-freebsd10.1/lib/ -isystem /home/gerald/gcc-ref10-i386/i386-unknown-freebsd10.1/include -isystem /home/gerald/gcc-ref10-i386/i386-unknown-freebsd10.1/sys-include -DHAVE_CONFIG_H -I. -I/scratch2/tmp/gerald/gcc-HEAD/libffi -I. -I/scratch2/tmp/gerald/gcc-HEAD/libffi/include -Iinclude -I/scratch2/tmp/gerald/gcc-HEAD/libffi/src -I. -I/scratch2/tmp/gerald/gcc-HEAD/libffi/include -Iinclude -I/scratch2/tmp/gerald/gcc-HEAD/libffi/src -g -O2 -MT src/x86/sysv.lo -MD -MP -MF src/x86/.deps/sysv.Tpo -c /scratch2/tmp/gerald/gcc-HEAD/libffi/src/x86/sysv.S -fPIC -DPIC -o src/x86/.libs/sysv.o
/scratch2/tmp/gerald/gcc-HEAD/libffi/src/x86/sysv.S: Assembler messages:
/scratch2/tmp/gerald/gcc-HEAD/libffi/src/x86/sysv.S:864: Error: junk at end of line, first unrecognized character is `@'
/scratch2/tmp/gerald/gcc-HEAD/libffi/src/x86/sysv.S:886: Error: junk at end of line, first unrecognized character is `@'
/scratch2/tmp/gerald/gcc-HEAD/libffi/src/x86/sysv.S:898: Error: junk at end of line, first unrecognized character is `@'
/scratch2/tmp/gerald/gcc-HEAD/libffi/src/x86/sysv.S:910: Error: junk at end of line, first unrecognized character is `@'
/scratch2/tmp/gerald/gcc-HEAD/libffi/src/x86/sysv.S:938: Error: junk at end of line, first unrecognized character is `@'
/scratch2/tmp/gerald/gcc-HEAD/libffi/src/x86/sysv.S:950: Error: junk at end of line, first unrecognized character is `@'
/scratch2/tmp/gerald/gcc-HEAD/libffi/src/x86/sysv.S:964: Error: junk at end of line, first unrecognized character is `@'
/scratch2/tmp/gerald/gcc-HEAD/libffi/src/x86/sysv.S:983: Error: junk at end of line, first unrecognized character is `@'
/scratch2/tmp/gerald/gcc-HEAD/libffi/src/x86/sysv.S:1007: Error: junk at end of line, first unrecognized character is `@'
Makefile:1177: recipe for target 'src/x86/sysv.lo' failed
gmake[4]: *** [src/x86/sysv.lo] Error 1
gmake[4]: Leaving directory '/scratch2/tmp/gerald/OBJ-0124-0939/i386-unknown-freebsd10.1/libffi'
Makefile:1433: recipe for target 'all-recursive' failed
gmake[3]: *** [all-recursive] Error 1

I also filed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64779 so that this can be tracked for the GCC 5.0 release.

Gerald
Re: Add to maintainers list.
Hi Alex,

On Friday 2014-11-21 10:07, Alex Velenko wrote:
> Can someone, please, approve?

We tried to document this in https://gcc.gnu.org/svnwrite.html . Can you perhaps suggest a way for us to improve this to make it more clear or easier to find?

Gerald
Re: [Patch, Fortran] PR64771 - Fix coarray ICE
On Sat, Jan 24, 2015 at 06:13:04PM +0100, Tobias Burnus wrote:
>    if (s1->as->type == AS_EXPLICIT)
> -    for (i = 0; i < s1->as->rank + s1->as->corank; i++)
> +    for (i = 0; i < s1->as->rank + std::max(0, s1->as->corank-1); i++)

Doesn't this require '#include <algorithm>'? I suspect that you are depending on namespace pollution via some other header (coretypes.h?).

--
Steve
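The header question can be illustrated with a minimal standalone sketch. std::max is declared in the standard header <algorithm>, so including that header directly avoids relying on whatever another header happens to pull in. The ArraySpec struct and the dims_to_compare function below are hypothetical stand-ins for the rank and corank fields of the array spec used in the patch, not GCC code.

```cpp
#include <algorithm>  // declares std::max; included directly on purpose
#include <cassert>

// Hypothetical stand-in for the two fields of the array spec used in the loop.
struct ArraySpec {
  int rank;    // number of array dimensions
  int corank;  // number of codimensions (0 for a non-coarray)
};

// Loop bound from the patch: every array dimension plus all codimensions
// except the last one. Presumably the last codimension is skipped because it
// has no explicit upper bound to compare; that reading is an assumption.
int dims_to_compare(const ArraySpec &as) {
  return as.rank + std::max(0, as.corank - 1);
}
```

With corank 0 the std::max call keeps the bound at rank instead of rank - 1, which is the case the plain subtraction would get wrong.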
[PATCH, libgfortran, committed] PR 64770 Segfault when trying to open existing file with status="new"
Hi,

the attached patch fixes PR 64770, by checking whether a string is a null pointer before calling strdup() on it.

Committed r220086 as obvious.

libgfortran ChangeLog:

2015-01-24 Janne Blomqvist

 PR libfortran/64770
 * io/unit.c (filename_from_unit): Check that u->filename != NULL
 before calling strdup.

testsuite ChangeLog:

2015-01-24 Janne Blomqvist

 PR libfortran/64770
 * gfortran.dg/open_new_segv.f90: New test.

diff --git a/libgfortran/io/unit.c b/libgfortran/io/unit.c
index e168d32..687f507 100644
--- a/libgfortran/io/unit.c
+++ b/libgfortran/io/unit.c
@@ -829,7 +829,7 @@ filename_from_unit (int n)
     }

   /* Get the filename. */
-  if (u != NULL)
+  if (u != NULL && u->filename != NULL)
     return strdup (u->filename);
   else
     return (char *) NULL;

--
Janne Blomqvist
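The guard in this fix follows a common pattern: strdup() has undefined behaviour when handed a null pointer, so both the object and its string member need checking first. A minimal standalone sketch of the same pattern is shown below. The Unit struct and the filename_copy function are hypothetical simplifications, not the libgfortran definitions.

```cpp
#include <cassert>
#include <cstdlib>
#include <string.h>  // strdup (POSIX), strcmp

// Hypothetical, much simplified stand-in for a libgfortran unit.
struct Unit {
  char *filename;  // may be null, e.g. when opening the file failed
};

// Returns a malloc'd copy of the unit's filename, or nullptr when there is
// no unit or the unit has no filename. The caller frees the result.
char *filename_copy(const Unit *u) {
  if (u != nullptr && u->filename != nullptr)
    return strdup(u->filename);
  return nullptr;
}
```

Checking only the unit pointer, as the code did before the fix, still passes a null filename to strdup() for a unit whose name was never set.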
Re: [patch] [C++14] Implement N3657: heterogeneous lookup in associative containers.
Sorry, I hadn't notice the condition to expose the new methods. It was hidden within the _Rb_tree type that I hadn't check (and I do not often check the Standard directly for my limited patches). On my side I am surprised you didn't reuse your code to detect member types. I am also surprised that it is not using enable_if, IMHO it makes the code clearer. Here is a proposal to use both extended to the debug mode too. François On 22/01/2015 03:07, Jonathan Wakely wrote: On 21/01/15 23:30 +0100, François Dumont wrote: +#if __cplusplus > 201103L + template +std::pair +equal_range(const _Kt& __x) +{ + std::pair<_Base_iterator, _Base_iterator> __res = +_Base::equal_range(__x); + return std::make_pair(iterator(__res.first, this), +iterator(__res.second, this)); +} BTW, this is C++14 code, what's wrong with: template std::pair equal_range(const _Kt& __x) const { auto __res = _Base::equal_range(__x); return { iterator(__res.first, this), iterator(__res.second, this) }; } Or even: template std::pair equal_range(const _Kt& __x) const { auto __res = _Base::equal_range(__x); return { { __res.first, this }, {__res.second, this} }; } Index: include/bits/stl_tree.h === --- include/bits/stl_tree.h (revision 220078) +++ include/bits/stl_tree.h (working copy) @@ -342,6 +342,9 @@ _Rb_tree_rebalance_for_erase(_Rb_tree_node_base* const __z, _Rb_tree_node_base& __header) throw (); +#if __cplusplus > 201103L + _GLIBCXX_HAS_NESTED_TYPE(is_transparent) +#endif template > @@ -1119,14 +1122,6 @@ equal_range(const key_type& __k) const; #if __cplusplus > 201103L - template> - struct __is_transparent { }; - - template - struct - __is_transparent<_Cmp, _Kt, __void_t> - { typedef void type; }; - static auto _S_iter(_Link_type __x) { return iterator(__x); } static auto _S_iter(_Const_Link_type __x) { return const_iterator(__x); } @@ -1155,9 +1150,8 @@ return _S_iter(__y); } - template::type> - iterator + template + enable_if_t<__has_is_transparent<_Compare>::value, iterator> _M_find_tr(const _Kt& 
__k) { auto& __cmp = _M_impl._M_key_compare; @@ -1166,9 +1160,8 @@ ? end() : __j; } - template::type> - const_iterator + template + enable_if_t<__has_is_transparent<_Compare>::value, const_iterator> _M_find_tr(const _Kt& __k) const { auto& __cmp = _M_impl._M_key_compare; @@ -1177,9 +1170,8 @@ ? end() : __j; } - template::type> - size_type + template + enable_if_t<__has_is_transparent<_Compare>::value, size_type> _M_count_tr(const _Kt& __k) const { auto __p = _M_equal_range_tr(__k); @@ -1186,9 +1178,8 @@ return std::distance(__p.first, __p.second); } - template::type> - iterator + template + enable_if_t<__has_is_transparent<_Compare>::value, iterator> _M_lower_bound_tr(const _Kt& __k) { auto& __cmp = _M_impl._M_key_compare; @@ -1195,9 +1186,8 @@ return _S_lower_bound_tr(__cmp, _M_begin(), _M_end(), __k); } - template::type> - const_iterator + template + enable_if_t<__has_is_transparent<_Compare>::value, const_iterator> _M_lower_bound_tr(const _Kt& __k) const { auto& __cmp = _M_impl._M_key_compare; @@ -1204,9 +1194,8 @@ return _S_lower_bound_tr(__cmp, _M_begin(), _M_end(), __k); } - template::type> - iterator + template + enable_if_t<__has_is_transparent<_Compare>::value, iterator> _M_upper_bound_tr(const _Kt& __k) { auto& __cmp = _M_impl._M_key_compare; @@ -1213,9 +1202,8 @@ return _S_upper_bound_tr(__cmp, _M_begin(), _M_end(), __k); } - template::type> - const_iterator + template + enable_if_t<__has_is_transparent<_Compare>::value, const_iterator> _M_upper_bound_tr(const _Kt& __k) const { auto& __cmp = _M_impl._M_key_compare; @@ -1222,9 +1210,9 @@ return _S_upper_bound_tr(__cmp, _M_begin(), _M_end(), __k); } - template::type> - pair + template + enable_if_t<__has_is_transparent<_Compare>::value, + pair> _M_equal_range_tr(const _Kt& __k) { auto __low = _M_lower_bound_tr(__k); @@ -1235,9 +1223,9 @@ return { __low, __high }; } - template::type> - pair + template + enable_if_t<__has_is_transparent<_Compare>::value, + pair> _M_equal_range_tr(const _Kt& __k) const { auto 
__low = _M_lower_bound_tr(__k); Index: include/debug/map.h === --- include/debug/map.h (revision 220078) +++ include/debug/map.h (working copy) @@ -412,10 +412,24 @@ find(const key_type& __x) { return iterator(_Base::find(__x), this); } +#if __cplusplus > 201103L + template + enable_if_t<__has_is_transparent<_Compare>::value, iterator> + find(const _Kt& __x) + { return { _Base::find(__x), this }; } +#endif + cons
Re: [patch] [C++14] Implement N3657: heterogeneous lookup in associative containers.
On 24/01/15 23:03 +0100, François Dumont wrote:
> Sorry, I hadn't noticed the condition to expose the new methods. It was
> hidden within the _Rb_tree type that I hadn't checked (and I do not often
> check the Standard directly for my limited patches).
>
> On my side I am surprised you didn't reuse your code to detect member

That adds another class template to the global namespace. I did that initially, but (because I forgot about Debug + Profile modes) decided against defining a new global type that is only needed in one place.

(N.B. _GLIBCXX_HAS_NESTED_TYPE is not mine, I only modified it recently to use __void_t when I added that.) Personally I find directly using void_t is simpler than defining a new type using void_t and then using that. I expect that to be the idiomatic solution in C++17 when std::void_t is added.

> types. I am also surprised that it is not using enable_if, IMHO it makes
> the code clearer.

It doesn't work though.

> @@ -1155,9 +1150,8 @@
>  return _S_iter(__y);
>  }
>
> - template::type>
> - iterator
> + template
> + enable_if_t<__has_is_transparent<_Compare>::value, iterator>
>  _M_find_tr(const _Kt& __k)

This doesn't work. Consider:

    #include <set>

    struct I
    {
      int i;
      operator int() const { return i; }
    };

    int main()
    {
      std::set<int> s;
      I i = { };
      s.find(i);
    }

(I will add something like this to the testsuite.)

This program is valid according to any C++ standard, but fails to compile with your patch applied because overload resolution instantiates std::_Rb_tree<>::_M_find_tr which instantiates enable_if<false, iterator>::type, which is an error.

SFINAE does not apply, because the invalid type enable_if<false, iterator>::type is not found during *substitution*. It's just invalid, so when _Compare is not transparent, instantiating the function template is simply an error.

Observe that my __is_transparent alias template takes two template arguments, so that it depends on the template parameter of the function, not only on _Compare.
That means whether if the type is invalid that will be found during template argument substitution, so SFINAE applies.
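The SFINAE point can be sketched outside libstdc++. The names below (make_void, void_t_, is_transparent_for, Lookup) are hypothetical and only mirror the shape of the technique described above: the constraint names the function's own template parameter K, so a missing nested type is discovered during template argument substitution and the overload is dropped rather than causing a hard error. This is an illustration under those assumptions, not the code from the patch.

```cpp
#include <cassert>
#include <functional>

// Hypothetical C++14 spelling of void_t, written via a helper struct.
template <typename... Ts> struct make_void { using type = void; };
template <typename... Ts> using void_t_ = typename make_void<Ts...>::type;

// Primary template: no nested 'type'. It takes K as a second parameter only
// so that any use inside a function template depends on that function's K.
template <typename Cmp, typename K, typename = void>
struct is_transparent_for {};

// Chosen only when Cmp has a nested is_transparent type.
template <typename Cmp, typename K>
struct is_transparent_for<Cmp, K, void_t_<typename Cmp::is_transparent>> {
  using type = void;
};

template <typename Cmp>
struct Lookup {
  // Ordinary overload taking the key type; returns 1 so calls can be told apart.
  int find(const int &) const { return 1; }

  // Heterogeneous overload; returns 2. The default template argument is only
  // formed once K is deduced, so for a non-transparent Cmp the missing
  // '::type' is a substitution failure and this overload is ignored.
  template <typename K,
            typename = typename is_transparent_for<Cmp, K>::type>
  int find(const K &) const { return 2; }
};

// Same shape as the type in the example above: convertible to int.
struct I {
  int i;
  operator int() const { return i; }
};
```

With Lookup<std::less<int>> a call find(I{}) falls back to the key-type overload through the conversion operator, while with the transparent std::less<> the template overload is viable and is the better match.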
Re: [patch] [C++14] Implement N3657: heterogeneous lookup in associative containers.
On 24/01/15 22:46 +, Jonathan Wakely wrote: Observe that my __is_transparent alias template takes two template arguments, so that it depends on the template parameter of the function, not only on _Compare. That means whether if the type is s/whether // invalid that will be found during template argument substitution, so SFINAE applies.