Re: Repository for the conversion machinery

2015-08-27 Thread James Greenhalgh
On Thu, Aug 27, 2015 at 03:38:10PM +0100, Eric S. Raymond wrote:
> I've made it available at:
> 
> http://thyrsus.com/gitweb/?p=gcc-conversion.git
> 
> The interesting content is gcc.map (the contributor map) and gcc.lift.
> 
> Presently the only command in gcc.lift expunges the hooks directory.

This appears to be missing some contributors whose usernames I recognise
(starting with me :-) )...

  jgreenhalgh = James Greenhalgh  (e.g. revision 227028)
  ktkachov = Kyrylo Tkachov  (e.g. r227012)
  jiwang = Jiong Wang  (e.g. r227220)
  renlin = Renlin Li  (e.g. r222635)
  alalaw01 = Alan Lawrence  (e.g. r217927)

Thanks,
James
 


Re: [AArch64] A question about Cortex-A57 pipeline description

2015-09-11 Thread James Greenhalgh
On Fri, Sep 11, 2015 at 04:31:37PM +0100, Nikolai Bozhenov wrote:
> Hi!
> 
> Recently I got somewhat confused by Cortex-A57 pipeline description in 
> GCC and
> I would be grateful if you could help me understand a few unclear points.

Sure,

> Particularly I am interested in how memory operations (loads/stores) are
> scheduled. It seems that according to the cortex-a57.md file, firstly, two
> memory operations may never be scheduled at the same cycle and, 
> secondly, two loads may never be scheduled at two consecutive cycles:
> 
>  ;; 5.  Two pipelines for load and store operations: LS1, LS2. The most
>  ;; valuable thing we can do is force a structural hazard to split
>  ;; up loads/stores.
> 
>  (define_cpu_unit "ca57_ls_issue" "cortex_a57")
>  (define_cpu_unit "ca57_ldr, ca57_str" "cortex_a57")
>  (define_reservation "ca57_load_model" "ca57_ls_issue,ca57_ldr*2")
>  (define_reservation "ca57_store_model" "ca57_ls_issue,ca57_str")
> 
> However, the Cortex-A57 Software Optimization Guide states that the core is
> able to execute one load operation and one store operation every cycle. And
> that agrees with my experiments. Indeed, a loop consisting of 10 loads, 10
> stores and several arithmetic operations takes on average about 10 cycles per
> iteration, provided that the instructions are intermixed properly.
> 
> So, what is the purpose of additional restrictions imposed on the scheduler
> in cortex-a57.md file? It doesn't look like an error. Rather, it looks like a
> deliberate decision.

When designing the model for the Cortex-A57 processor, I was primarily
trying to build a model which would increase the blend of utilized
pipelines on each cycle across a range of benchmarks, rather than to
accurately reflect the constraints listed in the Cortex-A57 Software
Optimisation Guide [1].

My reasoning here is that the Cortex-A57 is a high-performance processor,
and an accurate model of it would be infeasible to build. Because of this,
the model in GCC is unlikely to be representative of the true state of the
processor, and consequently GCC may make decisions which bias the
instruction stream towards one execution pipeline. In particular, given a
less restrictive model, GCC will try to hoist more loads earlier in the
basic block, which can result in poorer utilization of the other execution
pipelines.

In my experiments, I found this model to be more beneficial across a range
of benchmarks than one with those additional restrictions relaxed. I'd be
happy to consider counter-examples where this modeling produces
suboptimal results - and where the changes you suggest are sufficient to
resolve the issue.

Thanks,
James

---
[1]: Cortex-A57 Software Optimisation Guide
 
http://infocenter.arm.com/help/topic/com.arm.doc.uan0015a/cortex_a57_software_optimisation_guide_external.pdf


Re: distro test rebuild using GCC 6

2016-01-14 Thread James Greenhalgh
On Wed, Jan 13, 2016 at 02:17:16PM +0100, Matthias Klose wrote:
> Here are some first results from a distro test rebuild using GCC 6.
> A snapshot of the current Ubuntu development series was taken on
> 20151218 for all architectures (amd64, arm64, armhf, i386/i686,
> powerpc, ppc64el, s390x), and rebuilt unmodified using the current
> GCC 5 branch, and using GCC 6 20160101 (then updated to 20160109).
> 
> I haven't yet looked into the build failures except for the ICEs.
> If somebody wants to help please let me know so that work isn't
> duplicated.

Hi,

I've flicked through the 42 unique arm64 failures and given them a
first-step triage. The majority of issues look to be source-based, and
more than a few can be blamed on the move to C++14. Two of these I don't
understand (qutecom_2.2.1+dfsg1-5.2ubuntu2, sitplus_1.0.3-4.1build1). The
VLC one is strange, and I don't know how it ever managed to build!

I generated my set of arm64 failures as follows:

  grep xenial-arm64 00list | cut -f 1 -d " " | \
    sed "s@^@http://people.canonical.com/~doko/tmp/gcc6-regr/@" | xargs wget
  gunzip *.gz

Triage notes follow...

Hope this helps; if it is useless, let me know what would be a better way
for me to help out with the AArch64 stuff.

Thanks,
James

---
I used "grep -v GPG" to remove error lines that looked like:

  W: GPG error: http://ppa.launchpad.net xenial InRelease: The following 
signatures couldn't be verified because the public key is not available: 
NO_PUBKEY 1E9377A2BA9EF27F

This also removed two packages from my list that had system/package
failures:

  gnuais_0.3.3-3

[ Package dependency not met ]

sbuild-build-depends-gnuais-dummy : Depends: libosmgpsmap-1.0-0-dev (>= 
1.0.2) but it is not installable

  libnih_1.0.3-4

[ Interrupted build ]

Session terminated, terminating shell...

---
I used 'grep -v "debian\/rules"' to remove error lines that looked like:

  dpkg-buildpackage: error: debian/rules build-arch gave error exit status 2

This removed eight packages from my list which had system/environment
issues, ran out of memory, or failed a symbol match check.

  ardour_1:4.4~dfsg-1
  dune-grid_2.4.0-1
  mysql-workbench_6.3.4+dfsg-3build1

 [ Memory exhausted ]

 virtual memory exhausted: Cannot allocate memory

  marble_4:15.08.2-0ubuntu3

[ Expected symbol mismatch ]

dpkg-gensymbols: warning: debian/libmarblewidget-qt5-22/DEBIAN/symbols 
doesn't match completely debian/libmarblewidget-qt5-22.symbols
- (optional=templinst)_ZN12QtConcurrent19RunFunctionTaskBaseIvE3runEv@Base 
4:15.08.0
- (optional=templinst|arch=!amd64 
!i386)_ZN14QSharedPointerIN6Marble22MarbleQuickItemPrivateEE5derefEPN15QtSharedPointer20ExternalRefCountDataE@Base
 4:15.08.0
- 
(optional=templinst)_ZN17QtMetaTypePrivate23QMetaTypeFunctionHelperI5QListIdELb1EE6CreateEPKv@Base
 4:15.08.0
- 
(optional=templinst)_ZN17QtMetaTypePrivate23QMetaTypeFunctionHelperI5QListIdELb1EE6DeleteEPv@Base
 4:15.08.0
- 
(optional=templinst)_ZN17QtMetaTypePrivate23QMetaTypeFunctionHelperI7QVectorIPN6Marble16GeoDataPlacemarkEELb1EE6CreateEPKv@Base
 4:15.08.0
< etc... >

  shellcheck_0.3.7-1

[ Memory Exhausted ]

ghc: out of memory (requested 1048576 bytes)

  telepathy-qt5_0.9.6.1-0ubuntu3

[ Expected symbol mismatch ]

dpkg-gensymbols: warning: debian/libtelepathy-qt5-0/DEBIAN/symbols doesn't 
match completely debian/libtelepathy-qt5-0.symbols
- _ZN10QByteArray7reserveEi@Base 0.9.6.1-0ubuntu2~gcc5.1

  xapian1.3-core_1.3.3-0ubuntu2

[ Network testsuite failures (environment issue?) ]

Running test: keepalive1... FAILED
./apitest backend remoteprog_glass: 248 tests passed, 1 failed, 4 skipped.
Running test: keepalive1... NetworkTimeoutError: REMOTE:Timeout expired 
while trying to read; context was: .glass/db=apitest_simpledata (context: 
remote:tcp(127.0.0.1:1239))
Running test: bigoaddvalue1... std::bad_alloc: std::bad_alloc
./apitest backend remotetcp_glass: 247 tests passed, 2 failed, 4 skipped.
./apitest total: 2338 tests passed, 3 failed, 42 skipped.
FAIL: apitest

  xapian-core_1.2.21-1.2

[ Network testsuite failures (environment issue?) ]

Running test: keepalive1... NetworkError: write failed (context: 
remote:prog(../bin/xapian-progsrv -t5000 .chert/db=apitest_simpledata) (Broken 
pipe)
./apitest backend remoteprog_chert: 224 tests passed, 1 failed, 3 skipped.
./apitest total: 2992 tests passed, 1 failed, 45 skipped.
FAIL: apitest

---
After filtering the above with:

  grep "error:" *BUILDING* | grep -v GPG | grep -v "debian\/rules" >& all-failures.txt

I had 32 unique failures remaining...

---
-Wnarrowing

This is a mismatch between signed initializer values and arm64's unsigned
plain char. It hits 21 packages. A typical failure looks like:

  s3m.cpp:29:90: error: narrowing conversion of '-1' from 'int' to 'char' 
inside { } [-Wnarrowing]
 
{-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,0,1,2,3,4,5,6,7,8,-1,-1,

Re: distro test rebuild using GCC 6

2016-01-15 Thread James Greenhalgh
On Thu, Jan 14, 2016 at 05:15:29PM +, James Greenhalgh wrote:
> On Wed, Jan 13, 2016 at 02:17:16PM +0100, Matthias Klose wrote:
> > Here are some first results from a distro test rebuild using GCC 6.
> > A snapshot of the current Ubuntu development series was taken on
> > 20151218 for all architectures (amd64, arm64, armhf, i386/i686,
> > powerpc, ppc64el, s390x), and rebuilt unmodified using the current
> > GCC 5 branch, and using GCC 6 20160101 (then updated to 20160109).
> > 
> > I haven't yet looked into the build failures except for the ICEs.
> > If somebody wants to help please let me know so that work isn't
> > duplicated.
> 
> I've flicked through the 42 unique arm64 failures and given them a
> first-step triage. The majority of issues look to be source based and
> more than a few can be blamed on the move to C++14. Two of these I don't
> understand (qutecom_2.2.1+dfsg1-5.2ubuntu2, sitplus_1.0.3-4.1build1). The
> VLC one is strange, and I don't know how it has ever managed to build!

Hi,

Today I've looked at the 30 unique armhf failures and given them the same
treatment as arm64. Some of the testsuite failures I can't find details or
reports of online, so they may be indicative of wrong-code bugs. Other than
that, there is a larger number of ICEs for the ARM port, but these are
logged and actively being worked on. A number of packages have not made a
clean transition to C++14, and for some build failures I just don't
understand how they could ever have worked.

As before, I generated the list of failures with:

  grep xenial-armhf 00list | cut -f 1 -d " " | \
    sed "s@^@http://people.canonical.com/~doko/tmp/gcc6-regr/@" | xargs wget

Triage notes follow...

Hope this helps. Please do let me know if there is something more useful I
could do instead.

Thanks,
James


Eliminating first-order junk with "grep -v GPG", which removes error lines
that look like:

  W: GPG error: http://ppa.launchpad.net xenial InRelease: The following
  signatures couldn't be verified because the public key is not available:
  NO_PUBKEY 1E9377A2BA9EF27F

has no effect this time.


Looking for testsuite failures and environment issues, I see:

  guile-1.8_1.8.8+1-10ubuntu1

[ Testsuite failures... ]

ERROR: Value out of range -9223372036854775808 to 9223372036854775807: 
-9223372036854775808
FAIL: test-num2integral
fail: scm_is_signed_integer ((- (expt 2 63)), -9223372036854775808, 
9223372036854775807) == 1
FAIL: test-conversion
==
2 of 16 tests failed

  haskell-cipher-aes_0.2.11-1

[ Testsuite failures, look bad... ]

 AE1: [Failed]
expected: AuthTag "(Pd/\168\139\171!*g\SUB\151Hi\165l"
 but got: AuthTag "u:\252\SYN\141\165\&0\186S\191\GS\151\SYN\198E{"
 AD1: [Failed]
expected: AuthTag "(Pd/\168\139\171!*g\SUB\151Hi\165l"
 but got: AuthTag "r\205\252\225Sz0e\EM\203\GS\227\228lE\209"

  { ... etc ... }

 Properties   Test CasesTotal
 Passed  34   146   180  
 Failed  02626   
 Total   34   172   206  
Test suite test-cipher-aes: FAIL
Test suite logged to: dist-ghc/test/cipher-aes-0.2.11-test-cipher-aes.log
0 of 1 test suites (0 of 1 test cases) passed.

  haskell-cryptohash_0.11.6-4build1

[ Testsuite failures, look bad... ]

SHA1
  0 one-pass:  FAIL
expected: "da39a3ee5e6b4b0d3255bfef95601890afd80709"
 but got: "5a600060e4e200e24faa00aa71e400e456c400c4"
  0 inc 1: FAIL
expected: "da39a3ee5e6b4b0d3255bfef95601890afd80709"
 but got: "5a600060e4e200e24faa00aa71e400e456c400c4"
  0 inc 2: FAIL
expected: "da39a3ee5e6b4b0d3255bfef95601890afd80709"
 but got: "5a600060e4e200e24faa00aa71e400e456c400c4"

  { ... etc ... }

176 out of 592 tests failed
Test suite test-kat: FAIL
0 of 1 test suites (0 of 1 test cases) passed.

  kdelibs4support_5.15.0-0ubuntu2

[ Symbol mismatch ]

dpkg-gensymbols: warning: debian/libkf5kdelibs4support5/DEBIAN/symbols 
doesn't match completely debian/libkf5kdelibs4support5.symbols
+#MISSING: 5.15.0-0ubuntu2# (arch=armhf 
ppc64el)_ZN3KDE4statERK7QStringP4stat@Base 5.13.0

  kjsembed_5.15.0-0ubuntu2

[ Symbol mismatch ]

dpkg-gensymbols: warning: debian/libkf5jsembed5/DEBIAN/symbols doesn't 
match completely debian/libkf5jsembed5.symbols
+#MISSING: 5.15.0-0ubuntu2# _ZN3KJS7JSValueD0Ev@Base 4.96.0
+#MISSING: 5.15.0-0ubuntu2# _ZN3KJS7JSValueD1Ev@Base 4.96.0
+#MISSING: 5.15.0-0ubuntu2# _ZN3KJS7JSValueD2Ev@Base 4.96.0

  linux-flo_3.4.0-5.19
  linux-hammerhead_3.4.0-1.9
  linux-mako_3.4.0-7.41

[ Mystery bui

Re: distro test rebuild using GCC 6

2016-01-18 Thread James Greenhalgh
On Fri, Jan 15, 2016 at 05:52:49PM +, James Greenhalgh wrote:
>
>   libbpp-qt_2.1.0-1ubuntu2
> 
> [ ICE: Looks like: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68068
>   but reproduces on current trunk. Testcase reducer is in progress. ]

This turned out to be https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68148 ,
now fixed for trunk with r232410.

Author: hubicka 
Date:   Fri Jan 15 11:00:24 2016 +

PR ipa/68148
* ipa-icf.c (sem_function::merge): Virtual functions may become
reachable even if they address is not taken and there are no
idrect calls.
* g++.dg/ipa/devirt-49.C: New testcase.

Thanks,
James



Re: Deprecating basic asm in a function - What now?

2016-06-22 Thread James Greenhalgh
On Mon, Jun 20, 2016 at 08:01:24PM +0200, Michael Matz wrote:

> You see, the experiment shows that there's a gazillion uses of basic asms 
> out there.

I applied the proposed patch and built myself an allyesconfig AArch64 linux
kernel, the results were not pretty...

  make | grep "warning: Deprecated: asm" | wc -l
  8911

You could say I'm cheating as the asm in question appears in header files,
but this would be more than enough noise to guarantee the kernel will be
built with your warning disabled (and when the Linux kernel switches a
warning off, history suggests it won't be coming back on!). Remember that
while you might convince people to apply the proper fix for kernel version
N, you might struggle to get that fix backported across stable releases,
vendor releases, private releases etc. particularly when the alternative
fix of silencing the warning will appear to work fine.

There are 11 unique warnings for an arm64 Linux kernel build, 7 unique
warnings for an arm kernel build and 369 unique warnings for an x86_64
kernel build. Just for fun, I counted how many warnings I get with x86_64
allyesconfig without filtering for unique: 14,697.

I haven't audited them to see which would be safe with the minimal changes
given in https://gcc.gnu.org/wiki/ConvertBasicAsmToExtended (that document
looks buggy; should it not recommend that you always add a volatile when
moving from basic to extended?). The sorts of things I see in the arm/arm64
port are sleep loops for CPUs guarded by a conditional, asm guarded by other
asm barriers, text added to a mergeable read-only section for common strings,
and other things that are not obviously broken uses of basic asm.

On the other hand, some of the x86_64 examples in lib/raid6/sse2.c are
super scary and have back-to-back instructions with hard-coded registers...

asm volatile("pcmpgtb %xmm4,%xmm5");
asm volatile("paddb %xmm4,%xmm4");
asm volatile("pand %xmm0,%xmm5");

I can't defend them, and they look very fragile! Fixing this code would
require more substantial work.

I'm only looking at one consumer of our compiler here, but I think anything
that would cause the kernel community this much pain is a risky move, and is
indicative of how frequently we'd see this warning fire in the wild.
Permanently removing basic asm in a future GCC version is certainly not a
workable solution. So what is the value of a noisy warning?

Jeff and others have already come out against this patch, I'd like to add
my vote against it too.

Thanks,
James



Re: [GCC Steering Committee attention] [PING] [PING] [PING] libgomp: In OpenACC testing, cycle though $offload_targets, and by default only build for the offload target that we're actually going to te

2016-08-05 Thread James Greenhalgh
On Thu, Aug 04, 2016 at 09:12:36PM +0100, Manuel López-Ibáñez wrote:
> On 04/08/16 15:49, Thomas Schwinge wrote:
> >I suppose, if I weren't paid for paid for this, I would have run away
> >long ago, and would have looked for another project to contribute to.
> >:-(
> 
> You are a *paid* developer for one of the most active companies in
> the GCC community. Imagine how it feels for someone who just
> convinced their company to let them contribute for the first time or
> a volunteer:
> 
> https://gcc.gnu.org/ml/gcc/2010-04/msg00667.html
> https://gcc.gnu.org/ml/gcc-patches/2016-08/msg00093.html
> 
> They will never bother to send an email like this and just silently
> go away. People do give up and move somewhere else based on this
> problem only. (And nowadays, this decision is quite easy to make or
> sell to one's boss)
> 
> >OpenACC/OpenMP/offloading patches.  I'm certainly not going in any way to
> >disapprove Jakub's help, skills and experience, but I'm more and more
> >worried about this "bus factor" of one single person
> >().
> 
> This is a problem throughout GCC. We have a single C++ maintainer, a
> single part-time C maintainer, none? for libiberty, no regular
> maintainer for build machinery, and so on and so forth.

I'd object to this being "throughout" GCC. There are a number of very good,
very responsive, very helpful GCC reviewers in our community. However, as
you'd expect, the most active areas for contribution are the most active
areas for review.

> This has been a problem for a very long time, but it seems to be
> getting worse now that fewer contributors are doing most of the
> work.

Rumours of GCC's death have been greatly exaggerated!

At times it might feel this way, but the data
(BlackDuck at www.openhub.net and my own analysis of the Git mirror)
suggests that the GCC community is actually near enough as strong as it
has been for most of the past 20 years, in terms of volume of
contributions, number of contributors, and average number of contributions
per contributor. Some rudimentary stats gathering on the git mirror suggests
no substantial change in the makeup of the community over the years. If
anything, there are more commits from more people, while the average number
of commits per committer is decreasing.

GCC releases come at uneven intervals, so I've stuck to calendar years and
statistics for those years. The tables give the number of contributors in
each "bucket" of commits. Note that the earlier years are skewed a little,
as Jeff used to do the "Daily bump" that is now done by gccadmin, and
I haven't tried to strip this from the early numbers. As soon as it became
practical to, I dropped it from the commit count.

--
Year| 1998 | 2001 | 2004 | 2007 | 2010 | 2013 | 2015 |
--
Total Commits   | 4996 | 6864 | 9139 | 6632 | 7589 | 5972 | 7742 |
Total Commitors |   44 |  116 |  153 |  167 |  171 |  176 |  190 |
Average commits |  114 |   59 |   60 |   40 |   44 |   34 |   41 |
--
Number of committers with... |
0-19 commits|   16 |   55 |   71 |   91 |  103 |  110 |  115 |
20-39 commits   |8 |   19 |   26 |   26 |   28 |   28 |   32 |
40-59 commits   |4 |9 |   10 |   19 |   13 |   16 |   12 |
60-79 commits   |3 |7 |   10 |9 |3 |4 |5 |
80-99 commits   |2 |6 |   10 |5 |4 |4 |4 |
100+ commits|   11 |   21 |   27 |   18 |   21 |   15 |   23 |
--

So far for 2016 we're on:

  Total commits: 4096
  Total Contributors: 152
  Average commit count per contributor: 26.9474
  0-19  101
  20-39 22
  40-59 13
  60-79 3
  80-99 5
  100+  9

We shouldn't let the external perception of the GCC community influence
how we talk about it. Statistics like this are easy enough to generate
from git for any time period you like, and counter most of the prevailing
commentary on the direction and quality of the GCC community.

The script I used to generate the above was nothing special...

RANGE=
git shortlog -s -n $RANGE | awk '{
  if ($2 != "gccadmin") { sum += $1; count += 1 };
  if ($1 < 20) { bucket1 += 1 }
  else if ($1 < 40) { bucket2 += 1 }
  else if ($1 < 60) { bucket3 += 1 }
  else if ($1 < 80) { bucket4 += 1 }
  else if ($1 < 100) { bucket5 += 1 }
  else { bucket6 += 1 }
} END {
  print "  Total commits: " sum
  print "  Total Contributors: " count
  print "  Average commit count per contributor: " sum / count
  print "  0-19\t" bucket1
  print "  20-39\t" bucket2
  print "  40-59\t" bucket3
  print "  60-79\t" bucket4
  print "  80-99\t" bucket5
  print "  100+\t" bucket6
}'

> >... if I remember correctly), was that in addition to him, all
> >Global Reviewers are wel

Re: GCC Commit Stats [was: [GCC Steering Committee attention] [PING] [PING] [PING] libgomp [...]]

2016-08-05 Thread James Greenhalgh
On Fri, Aug 05, 2016 at 04:38:30PM +0100, Manuel López-Ibáñez wrote:
 
> I think those conclusions are debatable:

I won't respond to all your points (I'm busy this evening), but I can
regenerate my table with some of your suggestions.

> * GCC has also grown over the years, there is a lot more code and
> areas, specially more targets, which attract their own temporary
> developers who do not contribute to the rest of the compiler (much
> less review patches for the rest of the compiler).
> 
> * Your analysis includes Ada, Go and Fortran. I'd suggest to exclude
> them, since in terms of developers and reviewing, they seem to be
> doing fine. They also tend to organise themselves mostly independently
> of the rest of the compiler. This is also mostly true for targets.

Excluding those is tricky, but in principle it is just a matter of tweaking
the git shortlog command. If that's something you want to do, I'd be
interested to see the results. I didn't get reasonable results in time to
present these numbers for the whole history back to 1998; in a few rough
tests they didn't look vastly different (when filtering on gcc/*.[ch]).

I've given the 2012-2015 numbers below, just to show that (for the files
in gcc/*.[ch]) your hypothesis doesn't hold. The vast majority of
committers make <20 commits in a year.

Year| 2012 | 2013 | 2014 | 2015
Commits | 1816 | 1632 | 2148 | 2362
Committers  |   98 |  110 |  109 |  114 
Average commits |   19 |   15 |   20 |   21
Number of committers achieving N commits by bucket...
1-19|   78 |   96 |   92 |   94
20-39   |   12 |4 |5 |7
40-59   |2 |3 |7 |3
60-79   |1 |3 |0 |0
80-100  |1 |1 |0 |2
100-199 |2 |1 |2 |6
200+|2 |2 |3 |2
Percentage of committers achieving N commits by bucket...
1-19|   80 |   87 |   84 |   82
20-39   |   12 |4 |5 |6
40-59   |2 |3 |6 |3
60-79   |1 |3 |0 |0
80-99   |1 |1 |0 |2
100-199 |2 |1 |2 |5
200+|2 |2 |3 |2


> * 100 commits is less than 2%. Quite a low threshold. Perhaps 1%, 25%,
> 50%, 75%, 90% are more informative.

Again, this was just down to time. I've changed the last two buckets to
100-199 and 200+ in this run. If you'd like to try other thresholds, I'd
be happy to see the results.

> * https://www.openhub.net/p/taezaza/contributors/summary shows that
> more than 25% of the commits in the last 12 months were made by 6
> people. Note that those people are also the most active reviewers.

True, but as you point out below, a few data samples tell us little.

> * If I adjust the numbers by the total number of contributors, then we
> get a different picture:

I've added that to my table.

> that is, most of the commits are done by smaller fraction of the
> total.

For 2015 I found the 4 "25%" marks to be:

  26%1-4
  25%5-13
  25%14-39
  23%40+

So 75% of the work is being done by people who commit fewer than 40
patches in a year. Encouragingly 50% of the people who committed in
2015 committed at least one patch per month (on average).

> * Numbers for other years might shed more light. 2010, 2013 and 2015
> might have been especial in one sense or another.

I compressed this for space. The full table is below (and attached - just
in case your mail client gets zealous with the text and re-wraps it).

Year| 1998 | 1999 | 2000 | 2001 | 2002 | 2003 | 2004 | 2005 | 2006 | 2007 | 2008 | 2009 | 2010 | 2011 | 2012 | 2013 | 2014 | 2015 |
Commits | 4997 | 5531 | 7031 | 6850 | 6704 | 7961 | 9137 | 7646 | 5039 | 6633 | 5667 | 6244 | 7582 | 8181 | 6463 | 5970 | 7497 | 7742 |
Committers  |   44 |   65 |   89 |  116 |  128 |  153 |  153 |  167 |  163 |  167 |  172 |  174 |  171 |  176 |  181 |  176 |  204 |  190 |
Average commits |  114 |   85 |   79 |   59 |   52 |   52 |   60 |   46 |   31 |   40 |   33 |   36 |   44 |   46 |   36 |   34 |   37 |   41 |
Number of committers achieving N commits by bucket...
1-19|   16 |   29 |   38 |   55 |   64 |   80 |   71 |   91 |   97 |   91 |  111 |  107 |  103 |  114 |  116 |  110 |  131 |  116 |
20-39   |8 |   12 |   19 |   19 |   18 |   21 |   26 |   18 |   28 |   26 |   25 |   29 |   28 |   18 |   26 |   28 |   32 |   31 |
40-59   |4 |5 |4 |9 |8 |   13 |   10 |   15 |   13 |   19 |   15 |   15 |   13 |9 |   15 |   16 |   16 |   12 |
60-79   |3 |4 |4 |7 |   11 |8 |   10 |   13 |4 |9 |6 |6 |3 |   12 |7 |4 |6 |5 |
80-100  |2 |3 |4 |6 |7 |   12 |   10 |7 |7 |5 |4 |3 |4 |4 |0 |4 |3 |4 |
100-199 |6 |6 |8 |   11 |   13 |8 |   13 |   16 |   12 |   10 |6 |6 |9 |8 |

Re: GCC 4.9.1 Status Report (2014-07-10)

2014-07-12 Thread James Greenhalgh
On Fri, Jul 11, 2014 at 07:52:06PM +0100, Franzi Edo. wrote:
> Hi All,
> Thank you for your suggestions.
> Unfortunately, no way!
> 
> 4. I can generate my cross compiler based on the "gcc 4.8.3” without problem
> (using both the apple-gcc4.2 or the XCode llvm) So, what has changed of
> fundamental between the 4.8.3 and the 4.9 versions?
(Sorry for any duplicate mails, my mailer is having difficulty with
certain symbols in the body of the mail).

The fundamental change was a patch I put in to 4.9 in October of 2013 [1].

In this patch, we use a large define_attr to group together all of the Neon
types used for (set_attr "type") expressions in the ARM backend in the
"is_neon_type" attribute, which we use later to decide if an instruction is
predicable. We end up with a define_attr with around 290 elements for the
"yes" case.

As you can see in the preprocessed source on the LLVM bug [2], each element
in the define_attr is expanded as

  ((cached_type == TYPE_NEON_FOO) || ...   )

So after three elements we have three levels of nesting; after six, six.
After around 290 elements we have 290 levels of bracket nesting, and Clang
errors out.

If there is a better way for me to write the "is_neon_type" attribute, I'll
happily spin a patch for it. Certainly, I don't like the size of that
generated "if" statement.

Alternatively, perhaps the code which generates the if statement could be
rewritten to build a switch rather than the large "||" expression. I don't
know anything about the gen* programs and how define_attr can be used in
the general case to say how feasible that change would be.

None of this solves your problem in the interim, for that I think your best
bet is to set -fbracket-depth=1024 in your BUILD_CFLAGS, as suggested by
Chris.

Thanks,
James

[1] 
https://gcc.gnu.org/viewcvs/gcc/branches/gcc-4_9-branch/gcc/config/arm/arm.md?r1=203059&r2=203613
[2] http://llvm.org/bugs/show_bug.cgi?id=19650

> 
> So, I am a bit without ideas
> Cheers,
>Edo
> 
> 
> On 11 Jul 2014, at 00:29, Joel Sherrill  wrote:
> 
> > 
> > On 7/10/2014 5:14 PM, pins...@gmail.com wrote:
> >> 
> >>> On Jul 10, 2014, at 3:13 PM, Ian Lance Taylor  wrote:
> >>> 
>  On Thu, Jul 10, 2014 at 11:40 AM, Franzi Edo.  wrote:
>  
>  As for the version 4.9.0, on OSX stil remain a problem.
>  I cannot build an ARM a cross compiler!
>  Here is the message (same as for the 4.9.0)

> >>> You did not include enough context to be sure, but I don't think that
> >>> error message is coming from GCC.  At least, I don't see that error
> >>> message in the GCC sources.
> >>> 
> >>> I think that error message is coming from the host compiler you are
> >>> using, in which case, based on the error message, the solution would
> >>> seem to be

> >> Also i thought we did not support cross building with anything besides 
> >> gcc. 
> > The RTEMS community sees this when using clang/llvm on FreeBSD.
> > 
> > Franzi.. did the suggestion from Chris Johns to increase the limit
> > to 1024, not work?
> > 
> > https://gcc.gnu.org/ml/gcc/2014-05/msg00018.html
> > 
> > This ended up being reported at http://llvm.org/bugs/show_bug.cgi?id=19650
> > 
> > --joel
> >> Thanks,
> >> Andrew
> >> 
> >>> Ian
> > 



Re: [AArch64] Missed vectorization opportunity in cactusADM

2015-04-08 Thread James Greenhalgh
On Thu, Apr 02, 2015 at 04:20:06AM +0100, Ekanathan, Saravanan wrote:
> (I had sent this mail to gcc-help a week ago. Not sure, all GCC developers
> are subscribed to gcc-help, so re-sending to GCC development mailing list)
>
> Hi,
>
> This looks like a missed vectorization opportunity for one of the 'Fortran'
> hot loops in cactusADM (CPU2006 benchmark) when compiled with
> "-mcpu=cortex-a57 -Ofast".  Interestingly, the 'generic' model (compiled with
> plain "-Ofast or -O3" and without -mcpu option) vectorizes this hot loop,
> hence there is good runtime performance improvement noticed on native Aarch64
> platform.
> 
> I don't have a small reproducible testcase, hence quoting cactusADM benchmark
> here.  The hot loop is present in Bench_StaggeredLeapfrog2() in
> StaggeredLeapfrog2.F file.
>
> For cortex-a57, vectorization report clearly mentions that scalar cost <
> vector_cost/vectorization_factor, hence didn't vectorize.
>
> For generic case, due to un-tuned vector cost model, the scalar cost >
> vector_cost/vectorization_factor  (since scalar_cost = vector_cost), so the
> loop got vectorized
>
><< Output of  generic vectorized case>>
>  
> StaggeredLeapfrog2.fppized.f.130t.vect:StaggeredLeapfrog2.fppized.f:362:0: 
> note: LOOP VECTORIZED
>
> I have also played around with cortexa57_vector_cost table(esp.,
> scalar_stmt_cost, vector_stmt_cost, vec_unaligned_cost  etc..,), which
> influences the vectorization decision in this case.  The
> cortexa57_vector_cost table directly maps to the cost mentioned in
> "Cortex-A57 Software Optimisation Guide".  But, it looks like there is
> further scope of tuning the cortexa57 vector cost to vectorize such cases.
>
> Any comments on this missed opportunity ?
 
When I added the vector costs for Cortex-A57, I followed the Cortex-A57
Software Optimisation Guide [1] you mentioned above. I took a lower-bound
estimate for each cost, which will certainly underestimate the
floating-point scalar costs.

So, I can believe that the costs will not be optimal for all test code
you can give them, and I'm happy to look at patches which improve the
vector costs. If you are planning to look at this, please feel free to
raise a bugzilla issue and assign it to yourself so we can track things.

Please be sure to test any changes across a range of workloads - from
time to time I've seen issues with the Cortex-A57 vector costs where
we have been too eager to vectorize and would have been better keeping
to scalar code.

Thanks,
James

---
[1]: Cortex-A57 Software Optimisation Guide
  
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.uan0015a/index.html



Re: missing explanation of Stage 4 in GCC Development Plan document

2015-04-27 Thread James Greenhalgh
On Mon, Apr 27, 2015 at 09:37:36AM +0100, Richard Biener wrote:
> On Sun, Apr 26, 2015 at 9:56 AM, Honggyu Kim  wrote:
> > Hi all,
> >
> > I would like to know about the stages of development plan so I checked the 
> > following article:
> > https://gcc.gnu.org/develop.html

[Just Bike-shedding...]

The stages, timings, and exact rules for which patches are acceptable
and when, seem to have drifted quite substantially from that page.
Stage 2 has been missing for 7 years now, Stages 3 and 4 seem to blur
together, the "regression only" rule is more like "non-invasive fixes
only" (likewise for the support branches).

In which case, we seem only to be keeping the process listed as an
aspiration, rather than a document of reality.

So, why not try to reflect practice and form a two-stage model (and
name the stages in a descriptive fashion)?


  Development:

  Expected to last for around 70% of a release cycle. During this
  period, changes of any nature may be made to the compiler. In particular,
  major changes may be merged from branches. In order to avoid chaos,
  the Release Managers will ask for a list of major projects proposed for
  the coming release cycle before the start of this stage. They will
  attempt to sequence the projects in such a way as to cause minimal
  disruption. The Release Managers will not reject projects that will be
  ready for inclusion before the end of the development phase. Similarly,
  the Release Managers have no special power to accept a particular
  patch or branch beyond what their status as maintainers affords.
  The role of the Release Managers is merely to attempt to order the
  inclusion of major features in an organized manner.

  Stabilization:

  Expected to last for around 30% of a release cycle. New functionality
  may not be introduced during this period. Changes during this phase
  of the release cycle should focus on preparing the trunk for a high
  quality release, free of major regression and code generation issues.
  As we near the end of a release cycle, changes will only be accepted
  where they fix a regression, or are sufficiently non-intrusive as to
  not introduce a risk of affecting the quality of the release.

Thanks,
James



Remove my name from AArch64 port maintainers

2019-11-20 Thread James Greenhalgh

Hi,

After personal reflection on my current day-to-day involvement with the
GCC project and the expected behaviours and responsibilities delegated to
GNU project maintainers, I have come to the conclusion that the AArch64
port maintenance role is not one I am able to continue to commit to.

This patch therefore removes my name from the AArch64 maintainers list.
I've left my name under write-after-approval, just in case I need it in
future.

Thanks to the steering committee for the opportunity to contribute to GCC
as a maintainer, I've very much enjoyed seeing the many contributions to
the AArch64 port over the past two years.

Kyrill Tkachov, Richard Earnshaw, Richard Sandiford and Marcus Shawcroft
make for a great team of maintainers - I fully expect the AArch64 port to
continue to thrive under their watch.

Best Regards,
James

2019-11-19  James Greenhalgh  

* MAINTAINERS (aarch64 port): Remove my name, move to...
(Write After Approval): ...Here.

diff --git a/MAINTAINERS b/MAINTAINERS
index 1385214f789..54edab3f177 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -44,7 +44,6 @@ docs, and the testsuite related to that.
 			CPU Port Maintainers	(CPU alphabetical order)
 
 aarch64 port		Richard Earnshaw	
-aarch64 port		James Greenhalgh	
 aarch64 port		Richard Sandiford	
 aarch64 port		Marcus Shawcroft	
 aarch64 port		Kyrylo Tkachov		
@@ -399,6 +398,7 @@ Jan-Benedict Glaw
 Marc Glisse	
 Prachi Godbole	
 Torbjorn Granlund				
+James Greenhalgh
 Doug Gregor	
 Matthew Gretton-Dann
 Yury Gribov	


Re: [Aarch64] Vector Function Application Binary Interface Specification for OpenMP

2017-03-17 Thread James Greenhalgh
On Wed, Mar 15, 2017 at 09:50:18AM +, Sekhar, Ashwin wrote:
> Hi GCC Team, Aarch64 Maintainers,
> 
> 
> The rules in Vector Function Application Binary Interface Specification  for
> OpenMP
> (https://sourceware.org/glibc/wiki/libmvec?action=AttachFile&do=view&target=VectorABI.txt)
> is used in x86 for generating the simd clones of a function.
> 
> Is there a similar one defined for Aarch64?
> 
> If not, would like to start a discussion on the same for Aarch64. To  kick
> start the same, a draft proposal for Aarch64 (on the same lines as  x86 ABI)
> is included below. The only change from x86 ABI is in the  function name
> mangling. Here the letter 'b' is used for indicating the  ASIMD isa.

Hi Ashwin,

Thanks for the question. ARM has defined a vector function ABI, based
on the Vector Function ABI Specification you linked below, which
is designed to be suitable for both the Advanced SIMD and Scalable
Vector Extensions. There has not yet been a release of this document
which I can point you at, nor can I give you an estimate of when the
document will be published.

However, Francesco Petrogalli has recently made a proposal to the
LLVM mailing list ( https://reviews.llvm.org/D30739 ) which I would
note conflicts with your proposal in one way. You choose 'b' for name
mangling for a vector function using Advanced SIMD, while Francesco
uses 'n', which is the agreed character in the Vector Function ABI
Specification we have been working on.

I'd encourage you to wait for formal publication of the ARM Vector
Function ABI to prevent any unexpected divergence between
implementations.

Thanks,
James



Re: Overwhelmed by GCC frustration

2017-08-01 Thread James Greenhalgh
On Tue, Aug 01, 2017 at 11:12:12AM -0400, Eric Gallager wrote:
> On 8/1/17, Jakub Jelinek  wrote:
> > On Tue, Aug 01, 2017 at 07:08:41AM -0400, Eric Gallager wrote:
> >> > Heh.  I suspect -Os would benefit from a separate compilation pipeline
> >> > such as -Og.  Nowadays the early optimization pipeline is what you
> >> > want (mostly simple CSE & jump optimizations, focused on code
> >> > size improvements).  That doesn't get you any loop optimizations but
> >> > loop optimizations always have the chance to increase code size
> >> > or register pressure.
> >> >
> >>
> >> Maybe in addition to the -Os optimization level, GCC mainline could
> >> also add the -Oz optimization level like Apple's GCC had, and clang
> >> still has? Basically -Os is -O2 with additional code size focus,
> >> whereas -Oz is -O0 with the same code size focus. Adding it to the
> >> FSF's GCC, too, could help reduce code size even further than -Os
> >> currently does.
> >
> > No, lack of optimizations certainly doesn't reduce the code size.
> > For small code, you need lots of optimizations, but preferably code-size
> > aware ones.  For RTL that is usually easier, because you can often compare
> > the sizes of the old and new sequences and choose smaller, for GIMPLE
> > optimizations it is often just a wild guess on what optimizations generally
> > result in smaller and what optimizations generally result in larger code.
> > There are too many following passes to know for sure, and finding the right
> > heuristics is hard.
> >
> > Jakub
> >
> 
> Upon rereading of the relevant docs, I guess it was a mistake to
> compare -Oz to -O0. Let me quote from the apple-gcc "Optimize Options"
> page:
> 
> -Oz
> (APPLE ONLY) Optimize for size, regardless of performance. -Oz
> enables the same optimization flags that -Os uses, but -Oz also
> enables other optimizations intended solely to reduce code size.
> In particular, instructions that encode into fewer bytes are
> preferred over longer instructions that execute in fewer cycles.
> -Oz on Darwin is very similar to -Os in FSF distributions of GCC.
> -Oz employs the same inlining limits and avoids string instructions
> just like -Os.
> 
> Meanwhile, their description of -Os as contrasted to -Oz reads:
> 
> -Os
> Optimize for size, but not at the expense of speed. -Os enables all
> -O2 optimizations that do not typically increase code size.
> However, instructions are chosen for best performance, regardless
> of size. To optimize solely for size on Darwin, use -Oz (APPLE
> ONLY).
> 
> And the clang docs for -Oz say:
> 
> -Oz Like -Os (and thus -O2), but reduces code size further.
> 
> So -Oz does actually still optimize, so it's more like -O2 than -O0
> after all, just even more size-focused than -Os.

The relationship between -Os and -Oz is like the relationship between -O2
and -O3.

If -O3 says, try everything you can to increase performance even at the
expense of code-size and compile time, then -Oz says, try everything you
can to reduce the code size, even at the expense of performance and
compile time.

Thanks,
James



Re: [patch] RFC: Hook for insn costs?

2017-08-03 Thread James Greenhalgh
On Wed, Aug 02, 2017 at 12:56:58PM -0700, Richard Henderson wrote:
> On 08/02/2017 12:34 PM, Richard Earnshaw wrote:
> > I'm not sure if that's a good or a bad thing.  Currently the mid-end
> > depends on some rtx constructs having sensible costs even if there's no
> > rtl pattern to match them (IIRC plus:QI is one such construct - RISC
> > type machines usually lack such an instruction). 
> 
> I hadn't considered this... but there are several possible workarounds.
> 
> The simplest of which is to fall back to using rtx_cost if the insn_cost hook
> returns a failure indication, e.g. -1.
> 
> > Also, costs tend to be
> > micro-architecture specific so attaching costs directly to patterns
> > would be extremely painful, adding support would require touching the
> > entirety of the MD files.  The best bet would be a level of indirection
> > from the patterns to cost tables, much like scheduler attributes.
> 
> I was never thinking of adding costs directly to the md files, but rather
> structuring the insn_cost hook like
> 
>   if (recog_memoized (insn) < 0)
> return -1;
>   switch (get_attr_type (insn))
> {
> case TYPE_iadd:
> case TYPE_ilog:
> case TYPE_mvi:
>   return COSTS_N_INSNS (1);
> 
> case TYPE_fadd:
>   return cost_data->fp_add;
> }
> 
> etc.  This would be especially important when it comes costing for simd-type
> insns.  Matching many of those any other way would be fraught with peril.

I tried prototyping something like this for AArch64 a while back - it
feels like the only sensible, scalable and maintainable way to rationalise
the costs code. Anything else drifts from the md file over time, and is
a tangled mess of spaghetti code to implement a poor-quality recog clone.

I ran in to exactly the problem Richard Earnshaw mentions above - too
many partial rtx fragments came in that I couldn't recognise. The points
in the compilation pipeline where I could receive an RTX to cost meant that
I could be asked for costs before recog was correctly initialised.

I think a clean hook which operated in this way would be a great step
forward for the RTX costs.

Thanks,
James



Re: Announcing ARM and AArch64 port maintainers.

2017-09-09 Thread James Greenhalgh

On Sat, Sep 09, 2017 at 12:44:14PM +0100, Ramana Radhakrishnan wrote:
> I'm pleased to announce that the steering committee has appointed
>
> -  James Greenhalgh as a full maintainer for the AArch64 port
>
> and
>
> -  Kyrylo Tkachov as a full maintainer for the ARM port.
>
> James & Kyrylo, if you could update your entries in the MAINTAINERS
> file to reflect these roles, it would be appreciated.

Thanks! I'm looking forward to continuing my contributions to the AArch64
port under my new role as port maintainer.

This is what I committed as revision 251940, it moves my name to the AArch64
maintainers and puts the other names there back in alphabetical order,
as elsewhere in the file.

Thanks,
James

---
2017-09-09  James Greenhalgh  

* MAINTAINERS (James Greenhalgh): Move to AArch64 maintainers.

diff --git a/MAINTAINERS b/MAINTAINERS
index 7effec1..2ed1ef9 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -39,8 +39,9 @@ their own patches from other maintainers or reviewers.
 
 			CPU Port Maintainers	(CPU alphabetical order)
 
-aarch64 port		Marcus Shawcroft	
 aarch64 port		Richard Earnshaw	
+aarch64 port		James Greenhalgh	
+aarch64 port		Marcus Shawcroft	
 alpha port		Richard Henderson	
 arc port		Joern Rennecke		
 arm port		Nick Clifton		
@@ -249,7 +250,6 @@ check in changes outside of the parts of the compiler they maintain.
 
 			Reviewers
 
-aarch64 port		James Greenhalgh	
 arc port		Andrew Burgess		
 arc port		Claudiu Zissulescu	
 arm port		Kyrylo Tkachov		


Re: GCC Buildbot Update

2017-12-20 Thread James Greenhalgh
On Wed, Dec 20, 2017 at 10:02:45AM +, Paulo Matos wrote:
> 
> 
> On 20/12/17 10:51, Christophe Lyon wrote:
> > 
> > The recent fix changed the Makefile and configure script in libatomic.
> > I guess that if your incremental builds does not run configure, it's
> > still using old Makefiles, and old options.
> > 
> > 
> You're right. I guess incremental builds should always call configure,
> just in case.

For my personal bisect scripts I try an incremental build, with a
full rebuild as a fallback on failure.

That gives me the benefits of an incremental build most of the time (I
don't have stats on how often) with an automated approach to keeping things
going where there are issues.

Note that there are rare cases where dependencies are missed in the toolchain
and an incremental build will give you a toolchain with undefined
behaviour, as one compilation unit takes a new definition of a
struct/interface and the other sits on an outdated compile from the
previous build.

I don't have a good way to detect these.
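
In sketch form, the incremental-with-fallback strategy looks like the
helper below (command strings are placeholders for whatever your
bisect script actually runs):

```shell
# Sketch of an incremental build with a full-rebuild fallback.  The
# two arguments are whole commands, so the same helper works for any
# build system.
try_then_rebuild ()
{
  incremental=$1
  full=$2
  if eval "$incremental"; then
    echo "incremental build ok"
  else
    echo "incremental build failed; doing a full rebuild" >&2
    eval "$full"
  fi
}

# Hypothetical wiring for a GCC tree (paths are placeholders):
#   try_then_rebuild \
#     "make -C build -j8" \
#     "rm -rf build && mkdir build && cd build && ../gcc/configure && make -j8"
```

The fallback path re-runs configure, which also sidesteps the stale-
Makefile problem discussed above.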

Thanks,
James



Re: Aarch64 / simd builtin question

2018-06-08 Thread James Greenhalgh
On Fri, Jun 08, 2018 at 04:01:14PM -0500, Steve Ellcey wrote:
> I have a question about the Aarch64 simd instructions and builtins.
> 
> I want to unpack a __Float32x4 (V4SF) variable into two __Float64x2
> variables.  I can get the upper part with:
> 
> __Float64x2_t a = __builtin_aarch64_vec_unpacks_hi_v4sf (x);
> 
> But I can't seem to find a builtin that would get me the lower half.
> I assume this is due to the issue in aarch64-simd.md around the
> vec_unpacks_lo_ instruction:
> 
> ;; ??? Note that the vectorizer usage of the vec_unpacks_[lo/hi] patterns
> ;; is inconsistent with vector ordering elsewhere in the compiler, in that
> ;; the meaning of HI and LO changes depending on the target endianness.
> ;; While elsewhere we map the higher numbered elements of a vector to
> ;; the lower architectural lanes of the vector, for these patterns we want
> ;; to always treat "hi" as referring to the higher architectural lanes.
> ;; Consequently, while the patterns below look inconsistent with our
> ;; other big-endian patterns their behavior is as required.
> 
> Does this mean we can't have a __builtin_aarch64_vec_unpacks_lo_v4sf
> builtin that will work in big endian and little endian modes?
> It seems like it should be possible but I don't really understand 
> the details of the implementation enough to follow the comment and
> all its implications.
> 
> Right now, as a workaround, I use:
> 
> static inline __Float64x2_t __vec_unpacks_lo_v4sf (__Float32x4_t x)
> {
>   __Float64x2_t result;
>   __asm__ ("fcvtl %0.2d,%1.2s" : "=w"(result) : "w"(x) : /* No clobbers */);
>   return result;
> }
> 
> But a builtin would be cleaner.

Hi Steve,

Are you in an environment where you can use arm_neon.h ? If so, that would
be the best approach:

  float32x4_t in;
  float64x2_t low = vcvt_f64_f32 (vget_low_f32 (in));
  float64x2_t high = vcvt_high_f64_f32 (in);

If you can't use arm_neon.h for some reason, you can look there for
inspiration of how to write your own versions of these intrinsics.

Thanks,
James



Re: Jump threading in tree dom pass prevents if-conversion & following vectorization

2013-11-22 Thread James Greenhalgh
On Fri, Nov 22, 2013 at 11:03:22AM +, Bingfeng Mei wrote:
> Well, in your modified example, it is still due to jump threading that produce
> code of bad control flow that cannot be if-converted and vectorized, though in
> tree-vrp pass this time. 
> 
> Try this 
> ~/install-4.8/bin/gcc vect-ifconv-2.c  -O2 -fdump-tree-ifcvt-details 
> -ftree-vectorize  -save-temps -fno-tree-vrp
> 
> The code can be vectorized. 
> 
> Grep "threading" in gcc, it seems that dom and vrp passes are two places that 
> apply
> jump threading. Any other place? I think we need an target hook to control 
> it. 
> 

You can effectively disable jump-threading using:
  --param max-jump-thread-duplication-stmts=0

(grep dump files for "Jumps threaded")

I don't see Andrew's code vectorized even with jump-threading disabled
so I think Andrew is correct and this is some other missed optimization.

James



Re: Request for discussion: Rewrite of inline assembler docs

2014-03-21 Thread James Greenhalgh
On Thu, Feb 27, 2014 at 11:07:21AM +, Andrew Haley wrote:
> Over the years there has been a great deal of traffic on these lists
> caused by misunderstandings of GCC's inline assembler.  That's partly
> because it's inherently tricky, but the existing documentation needs
> to be improved.
> 
> dw  has done a fairly thorough reworking of
> the documentation.  I've helped a bit.
> 
> Section 6.41 of the GCC manual has been rewritten.  It has become:
> 
> 6.41 How to Use Inline Assembly Language in C Code
> 6.41.1 Basic Asm - Assembler Instructions with No Operands
> 6.41.2 Extended Asm - Assembler Instructions with C Expression Operands
> 
> We could simply post the patch to GCC-patches and have at it, but I
> think it's better to discuss the document here first.  You can read it
> at

This documentation looks like a huge improvement.

As the discussion here seems to have stalled, perhaps it is time to propose
the patch to gcc-patches?

I'm certainly keen to see this make it to trunk, the increase in clarity
is substantial.

Thanks,
James