Improvements of the haifa scheduler

2007-03-04 Thread Maxim Kuvyrkov

Hi.

I want to share some of my thoughts and plans for improving / cleaning
up the current GCC instruction scheduler (Haifa) - most of them are just
small, obvious improvements.

I have semi-ready patches for about half of them and would appreciate
any early suggestions or comments on the following draft plan:

1. Remove compute_forward_dependencies ().  [Done]

Since the new representation of dependency lists was checked in, we no
longer need to compute forward dependencies separately.  It is natural
to add forward links at the same time as we generate backward ones.

2. Use alloc_pools instead of obstacks for dep_nodes and deps_lists.
[In progress]

As Jan Hubicka pointed out, the scheduler's peak memory usage is about
100MB higher on a testcase for PR28071 after my patch for the new
dependency lists was checked in.  Though alloc_pools should have been my
first choice while writing that patch, I decided to mimic as closely as
possible the original rtx instruction lists with their scheme of
deallocation at the very end.  So the next step is to define a proper
lifetime for dependency lists and use alloc_pools to reuse nodes and
lists from previous regions.

Which brings us to ...

3. Define clear interface for manipulating dependencies.  [In progress]

This one popped up when I began to debug <2> and realized that the
scheduler uses and modifies dependency lists in ways it shouldn't.
Lists are copied, edited and deleted directly, without going through
sched-deps.c.  What the scheduler really needs is the following
set of primitives:
set of primitives:
  o FOR_EACH_DEP (insn, which_list, iterator, dep) - walk through
insn's which_list (one of {backward, resolved_backward, forward,
resolved_forward}) and provide the user with each dep.  Ayal Zaks
suggested this type of macro weeks ago, but at that time I didn't agree
with him.
  o dep_t find_dep_between (producer, consumer) - find the dependency
between two instructions.  Currently we walk through a list looking
for what we need.  A better way is to first check the dependency
caches and then, if we can't prove there is no dependency, walk
through the shorter of the two candidate lists: the producer's forward
list and the consumer's backward list.
  o void add_dep (dep_t) - Add a dependency.
  o void remove_dep (iterator) - Remove the dependency pointed to by the
iterator.
  o void resolve_dep (iterator) - Resolve the dependency pointed to by
the iterator.
  o int get_list_size (insn, which_list) - Get the size of the insn's
which_list.
  o bool list_has_only_speculative_deps (insn, which_list) - Return
true if all of insn's dependencies can be overcome with some sort of
speculation.
  o void {create, delete}_dependency_lists (insn) - Create / delete
dependency lists for insn.

As you can see, the scheduler doesn't need to know the internal
representation of the deps_list / dep_node.

4. Support speculative loads into subregs.  [Planned]

As Jim Wilson noted, the current support for ia64 speculation doesn't
handle subregs, though that would be easy to fix.

5. Make sched-deps.c mark as speculative only those dependencies which
can actually be overcome with the speculation types currently in use.  [Planned]

At the moment we first generate speculative dependencies, and only when
adding an instruction to the ready list do we check whether we can (or
whether it is worthwhile to) overcome each of its dependencies.

6. Make ds_t a structure.  [Planned]

ds_t is the type representing the status of a dependency.  It contains
information about the types of the dependency (true, output, anti) and
the probabilities of speculation success (begin_data, be_in_data,
begin_control, be_in_control) - that makes three bits and four integers
encoded in a single int.  Historical reasons forced this inelegant
approach, but those reasons are now gone and the problem can be
solved in a natural way.

7. Use cse_lib in sched-rgn.c.  [In progress]

At the moment cse_lib improves alias analysis only during sched-ebb
scheduling.  We can trivially enable it also when scheduling
single-block regions in sched-rgn.  The patch for this is a one-liner
which was tested with a bootstrap but not on SPEC.

It is also possible to use cse_lib on sequential basic blocks of a
region, thus handling them as an extended basic block.

If it is possible to save cse_lib states, then we'll also be able to
process trees; merging capabilities would be required for DAGs.  I don't
know if this can be done.

8. Don't generate a memory barrier on simplejump.  [Done]

sched-deps.c handles every jump in the scheduling region as a memory
barrier - i.e., almost no memory operation can be moved past it.  But
unconditional jumps don't really need such restrictions.  A one-liner
patch for this was tested with a bootstrap but not on SPEC.

9. Use sched-ebb on other architectures.  [Done]

After the patches for ia64 speculation and the follow-up fixes to them,
sched-ebb no longer corrupts the CFG and can safely be used as a
non-final pass on platforms other than ia64.  I successfully
bootstrapped (and, pro

Re: Improvements of the haifa scheduler

2007-03-04 Thread Maxim Kuvyrkov

Vladimir N. Makarov wrote:

Maxim Kuvyrkov wrote:


Hi.

I want to share some of my thoughts and doings on improving / cleaning
up current GCC instruction scheduler (Haifa) - most of them are just
small obvious improvements.

I have semi-ready patches for about a half of them and would appreciate
any early suggestion or comments on the following draft plan:



...


Any comments, suggestions or 'please don't do that' will be greatly
appreciated.



...



Good aliasing is very important for the scheduler.  But I'd look at this 
more widely: we need good aliasing for many RTL optimizations.  What 
happened to the ISP RAS aliasing patch propagating SSA info to RTL?  Why 
is it stalled?


As for Sanjiv Gupta's aliasing work, that was interesting, but as I 
remember the patch made the compiler too slow (about 40% slower).  You 
would need to make this approach faster for it to be accepted and used 
by default.


I understand that good aliasing is important for several RTL passes, and 
I hope that the general aliasing support will improve in time.  I must 
admit that I haven't investigated Gupta's patch yet, but I believe it is 
so slow because it needs to rescan the function many times in order to 
get correct aliasing information.  On the other hand, alias analysis for 
the scheduler's data speculation has the luxury of being incorrect at 
times.  So it looks like low-hanging fruit to me to try a fast but 
unsafe variant of Gupta's work and see if the magic happens.




Another important thing to do is to make the 1st scheduler 
register-pressure sensitive.  It would improve performance and solve the 
1st insn scheduling problem for x86 and x86_64.  Right now it is off by 
default because the scheduler moves insns containing hard regs too 
freely, and this results in reload failing to find a hard register of a 
small class.


If you need benchmarking for machines (like ppc) you have no access to, 
I can provide the benchmarking.



I should also mention that I do all this work in my spare time, which is
not a lot, so the above is my plan for about half a year.

I really appreciate that.  Maybe if you or ISP RAS could find students 
(e.g. from Moscow University) to do this as Google Summer of Code 
projects, it could help you.  I think it is not too late.  You should 
ask Ian Taylor or Daniel Berlin if you want to do this.


The projects I've described are too small to qualify as 3-month 
projects.  But your suggestion to solve the 'scheduler -> ra' problem 
strikes me as just the right one.


The other GSoC project might be to investigate and fix the places in the 
compiler where aliasing information exported from tree-ssa is being 
invalidated.


So basically here are three Google Summer of Code projects:

  o Scheduler -> RA
  o Fix passes that invalidate tree-ssa alias export.
  o { Fast but unsafe Gupta's aliasing patch, Unsafe tree-ssa alias 
export } in scheduler's data speculation.


Ian, Daniel, what do you think?


Thanks,

Maxim




Re: Improvements of the haifa scheduler

2007-03-05 Thread Maxim Kuvyrkov

Diego Novillo wrote:

Maxim Kuvyrkov wrote on 03/05/07 02:14:


   o Fix passes that invalidate tree-ssa alias export.


Yes, this should be good and shouldn't need a lot of work.

   o { Fast but unsafe Gupta's aliasing patch, Unsafe tree-ssa alias 
export } in scheduler's data speculation.


"unsafe" alias export?  I would definitely like to see the tree->rtl
alias information transfer fixed once and for all.  Finishing RAS's
tree->rtl work would probably make a good SoC project.


"Unsafe" doesn't mean not fixed.  My thought is that it would be nice to 
have a switch in aliasing that will turn such operations as


join (pt_anything, points_to) -> pt_anything

into

join (pt_anything, points_to) -> points_to

This transformation sacrifices correctness for the sake of additional 
information.



Thanks,

Maxim


Re: GCC 4.2.0 Status Report (2007-04-15)

2007-04-16 Thread Maxim Kuvyrkov

Steven Bosscher wrote:

On 4/16/07, Mark Mitchell <[EMAIL PROTECTED]> wrote:

29841  [4.2/4.3 regression] ICE with scheduling and __builtin_trap


Honza, PING!


There is a patch for PR29841 at
http://gcc.gnu.org/ml/gcc-patches/2007-02/msg01134.html .  The problem 
is that I don't really know which maintainer to ask to review it :(


--
Maxim


Re: [RFC] Kernel livepatching support in GCC

2015-10-13 Thread Maxim Kuvyrkov
Hi,

The feedback in this thread was overall positive, with good suggestions
on implementation details.  I'm starting to work on the first draft
and plan to post something in 2-4 weeks.

Thanks.

On 28 May 2015 at 11:39, Maxim Kuvyrkov  wrote:
> Hi,
>
> Akashi-san and I have been discussing required GCC changes to make kernel's 
> livepatching work for AArch64 and other architectures.  At the moment 
> livepatching is supported for x86[_64] using the following options: "-pg 
> -mfentry -mrecord-mcount -mnop-mcount" which is geek-speak for "please add 
> several NOPs at the very beginning of each function, and make a section with 
> addresses of all those NOP pads".
>
> The above long-ish list of options is a historical artifact of how 
> livepatching support evolved for x86.  The end result is that for 
> livepatching (or ftrace, or possible future kernel features) to work compiler 
> needs to generate a little bit of empty code space at the beginning of each 
> function.  Kernel can later use that space to insert call sequences for 
> various hooks.
>
> Our proposal is that instead of adding -mfentry/-mnop-mcount/-mrecord-mcount 
> options to other architectures, we should implement a target-independent 
> option -fprolog-pad=N, which will generate a pad of N nops at the beginning 
> of each function and add a section entry describing the pad similar to 
> -mrecord-mcount [1].
>
> Since adding NOPs is much less architecture-specific than outputting call 
> instruction sequences, this option can be handled in a target-independent way 
> at least for some/most architectures.
>
> Comments?
>
> As I found out today, the team from Huawei has implemented [2], which follows 
> x86 example of -mfentry option generating a hard-coded call sequence.  I hope 
> that this proposal can be easily incorporated into their work since most of 
> the livepatching changes are in the kernel.
>
> [1] Technically, generating a NOP pad and adding a section entry in 
> .__mcount_loc are two separate actions, so we may want to have a 
> -fprolog-pad-record option.  My instinct is to stick with a single option for 
> now, since we can always add more later.
>
> [2] http://lists.infradead.org/pipermail/linux-arm-kernel/2015-May/346905.html
>
> --
> Maxim Kuvyrkov
> www.linaro.org
>
>
>



-- 
Maxim Kuvyrkov
www.linaro.org


Re: [RFC, Fortran] Avoid race on testsuite temporary files

2015-12-10 Thread Maxim Kuvyrkov
> On Dec 9, 2015, at 5:27 PM, Yvan Roux  wrote:
> 
> Hi,
> 
> as it was raised in
> https://gcc.gnu.org/ml/gcc-patches/2015-10/msg01540.html we experience
> random failures in gfortran validation when it is run in parallel (-j
> 8).  The issues occur because of concurrent access to the same file;
> the first two patches fixed some of them by changing the file names
> used, but there are still remaining conflicts (6 usages of foo, 8 of
> test.dat).  This is easy to fix and I have a patch for that, but there
> is another issue I'd like to have your thoughts on.
> 
> There are a little over 1000 testcases which use IO without explicit
> file names: ~150 use scratches (a temporary file named gfortrantmp
> plus 6 extra chars is created) and the others, which only specify the
> IO unit number, use a file named fort.NN, with NN the number of the
> IO unit.  We see conflicts on these generated files, as many testcases
> use the same number; the most used are:
> 
> 10 => 150
> 99 => 70
> 6  => 31
> 11 => 27
> 1  => 23
> 
> I started changing the testcases to use scratches where possible,
> before finding out how many there are to fix, and I also had
> conflicts on the generated scratch names.  The options I see to fix
> this are:
> 
> 1- Move all these testcases into an IO subdir and change the testsuite
> to avoid parallelism in that directory.
> 2- Use scratches when possible and improve libgfortran file name
> generation, I don't know well fortran but is it possible to change the
> file name patterns for scratches and io unit files ?
> 3- Change the io unit numbers used, as it was suggested on irc, but I
> find it a bit painful to maintain.
> 
> Any comments are welcome.

I have also investigated several races on I/O in the gfortran testsuite, and my 
preference is to go with [1].  Specifically, if a Fortran test does I/O with 
filenames that can clash with some other test, then the test should be located 
in a sub-directory of the gfortran.dg testsuite that runs its tests in order.

--
Maxim Kuvyrkov
www.linaro.org



Re: Live range shrinkage in pre-reload scheduling

2014-05-13 Thread Maxim Kuvyrkov
On May 13, 2014, at 10:27 PM, Kyrill Tkachov  wrote:

> Hi all,
> 
> In haifa-sched.c (in rank_for_schedule) I notice that live range shrinkage is 
> not performed when SCHED_PRESSURE_MODEL is used and the comment mentions that 
> it results in much worse code.
> 
> Could anyone elaborate on this? Was it just empirically noticed on x86_64?

+ Richard Sandiford who wrote SCHED_PRESSURE_MODEL

--
Maxim Kuvyrkov
www.linaro.org



Re: Live range shrinkage in pre-reload scheduling

2014-05-15 Thread Maxim Kuvyrkov
On May 15, 2014, at 6:46 PM, Ramana Radhakrishnan  
wrote:
> 
>> 
>> I'm not claiming it's a great heuristic or anything.  There's bound to
>> be room for improvement.  But it was based on "reality" and real results.
>> 
>> Of course, if it turns out not be a win for ARM or s390x any more then it
>> should be disabled.
> 
> The current situation that Kyrill is investigating is one where we
> notice the first scheduler pass being a bit too aggressive in
> creating ILP opportunities with the A15 scheduler, which causes
> performance differences between not turning on the first scheduler
> pass and using the defaults.

Charles has a work-in-progress patch that fixes a bug in SCHED_PRESSURE_MODEL 
which causes the above symptoms.  The bug makes the 1st scheduler unnecessarily 
increase live ranges of pseudo registers when there are a lot of instructions 
in the ready list.

Charles, can you finish your patch in the next several days and post it for 
review?

Thank you,

--
Maxim Kuvyrkov
www.linaro.org




[GSoC] Status - 20140516

2014-05-16 Thread Maxim Kuvyrkov
Hi Community,

The community bonding period is coming to a close; students can officially 
start coding on Monday, May 19th.

In the past month the students should have applied for FSF copyright 
assignment and, hopefully, executed a couple of test tasks to get a feel for 
GCC development.

The GSoC Reunion (an unconference to discuss the results of the concluded 
GSoC) will be held in San Jose, CA, on 23-26 October 2014.  GCC gets to send 2 
delegates on Google's dime (airfare, hotel, food), but more can attend via a 
registration lottery while covering their own expenses.  If you are interested 
in going to the GSoC Reunion, please let me know.

Thank you,

--
Maxim Kuvyrkov
www.linaro.org





Re: [GSoC] Status - 20140516

2014-05-16 Thread Maxim Kuvyrkov
On May 17, 2014, at 10:41 AM, Tobias Grosser  wrote:

> 
> 
> On 17/05/2014 00:27, Maxim Kuvyrkov wrote:
>> Hi Community,
>> 
>> The community bonding period is coming to a close, students can officially 
>> start coding on Monday, May 19th.
>> 
>> In the past month the student should have applied for FSF copyright 
>> assignment and, hopefully, executed on a couple of test tasks to get a feel 
>> for GCC development.
> 
> In the last mail, I got the impression that you will keep track of the 
> copyright assignments. Is this the case?

Yes.  Two of the students already have copyright assignment in place, and I 
have asked the other 3 about their assignment progress today.

Thank you,

--
Maxim Kuvyrkov
www.linaro.org



[GSoC] Status - 20140616

2014-06-16 Thread Maxim Kuvyrkov
Hi Community,

We are 1 week away from midterm evaluations of students' work.  Mentors, please 
start looking closely into your student's progress and draft up evaluation 
notes.

Midterm evaluations are very important in GSoC.  Students who fail this 
evaluation are immediately kicked out of the program.  Students who pass -- get 
their midterm payment ($2250).

Both mentors and students will need to submit midterm evaluations between June 
23-27.  There is no excuse for not submitting your evaluations.  Please let me 
know if you have any problems submitting your evaluation in the period June 
23-27.

For evaluations, you might find this guide helpful: 
http://en.flossmanuals.net/GSoCMentoring/evaluations/ .

On another note, copyright assignments are now completed for 4 out of 5 
students.  I have pinged the last student to get his assignment in order.

--
Maxim Kuvyrkov
www.linaro.org





[GSoC] Status - 20140707

2014-07-07 Thread Maxim Kuvyrkov
Hi Community,

All GCC GSoC students have successfully passed mid-term evaluations, and are 
continuing to work on their projects.  Congratulations to all the students!

Furthermore, Linaro has generously provided sponsorship to pay for 1 GCC GSoC 
student to travel to the GNU Tools Cauldron this year.  Based on the results 
of mid-term evaluations and mentor comments, Prathamesh Kulkarni was selected.  
As always, thank you to Google for hosting the Cauldron and to Diego for 
procuring an extra registration spot.

Our plan is to continue bringing the top 1-3 GSoC students to GCC conferences 
each year.  Hopefully, we will get more sponsorship slots from companies doing 
GCC development next year.  We also plan to earmark the funds that the GCC 
project will receive for mentoring the students ($500 per student) towards 
sponsoring one of next year's students.

Thank you, and see you at the Cauldron!

--
Maxim Kuvyrkov
www.linaro.org





Re: mn10300, invariants on DEP_PRO/DEP_CON and on TARGET_SCHED_ADJUST_COST params

2014-07-10 Thread Maxim Kuvyrkov
On Jul 9, 2014, at 8:21 AM, David Malcolm  wrote:

> [CCing nickc, who wrote the mn10300 hook in question]
> 
> I'm experimenting with separating out instructions from expressions in
> RTL; see [1] for more info on that.
> 
> I noticed that mn10300 has this implementation of a target hook:
>  #define TARGET_SCHED_ADJUST_COST mn10300_adjust_sched_cost
> 
> Within mn10300_adjust_sched_cost (where "insn" and "dep" are the first
> and third parameters respectively), there's this code:
> 
>  if (GET_CODE (insn) == PARALLEL)
>insn = XVECEXP (insn, 0, 0);
> 
>  if (GET_CODE (dep) == PARALLEL)
>dep = XVECEXP (dep, 0, 0);
> 
> However, I believe that these params of this hook ("insn") always
> satisfy INSN_CHAIN_CODE_P, and so can't have code PARALLEL.  [Nick: did
> those conditionals ever get triggered, or was this defensive coding?]

From what I can tell these are remnants from the early days of haifa-sched 
(10+ years ago).  I would be very surprised if the scheduler didn't ICE on a 
PARALLEL of INSNs (not to be confused with a PARALLEL as an INSN_PATTERN).

> 
> Specifically, the hook is called from haifa-sched.c:dep_cost_1 on the
> DEP_CON and DEP_PRO of a dep_t.
> 
> It's my belief that DEP_CON and DEP_PRO always satisfy INSN_CHAIN_CODE_P
> - and on every other config so far that seems to be the case.
> 
> Is my belief about DEP_CON/DEP_PRO correct?  (or, at least, consistent
> with other gcc developers' views on the matter :))  My patch kit [2] has
> this expressed in the type system as of [3], so if I'm incorrect about
> this I'd prefer to know ASAP.

Yes, it is correct.

> 
> Similarly, do the first and third params of TARGET_SCHED_ADJUST_COST
> also satisfy INSN_CHAIN_CODE_P?
> 

Yes, since they are always derived from DEP_CON / DEP_PRO.

--
Maxim Kuvyrkov
www.linaro.org



[GSoC] Status - 20140804

2014-08-03 Thread Maxim Kuvyrkov
Hi Community,

The Google Summer of Code is in its final month, and our students have made 
good progress on their projects.  I very much hope that the talented developers 
who worked on GCC in this year's program will continue to hack on and 
contribute to GCC outside of the GSoC program!

It has been a great weekend at the Cauldron, and we all had fun and some good 
discussions too!  Prathamesh will be posting notes this week on the sessions 
he attended.  Big shoutout to all the sponsors and organizers of the Cauldron 
-- great job!

Alessandro and Braden will be GCC's delegates to the GSoC Reunion un-conference 
this year -- sponsored by Google.

--
Maxim Kuvyrkov
www.linaro.org





[GSoC] Status - 20140819

2014-08-19 Thread Maxim Kuvyrkov
GSoC Mentors and Students,

Please remember that the deadline for final evaluations is August 22, 19:00 
UTC.  Both mentors and students should submit their evaluations on the GSoC 
website [*] by that time.

So far we have only 2 evaluations (out of 10) submitted.

Thank you,

--
Maxim Kuvyrkov
www.linaro.org





[GSoC] Status - 20140901 FINAL

2014-09-01 Thread Maxim Kuvyrkov
Hi Community!

Google Summer of Code 2014 has come to an end.  We've got some very good 
results this year, with code from 4 out of 5 projects checked in to either 
GCC trunk or a topic branch.  Congratulations to the students and mentors for 
their great work!

Even more impressive is the fact that [according to student self-evaluations] 
most of the students intend to continue GCC development outside of the program.

I encourage both mentors and students to echo their feedback about GCC's GSoC 
in this thread.  The evaluations you posted on the GSoC website are visible to 
only a few people, and there are good comments and thoughts there that deserve 
a wider audience.

Thank you, [your friendly GSoC admin signing off]

--
Maxim Kuvyrkov
www.linaro.org





Re: Maxim Kuvyrkov appointed Android sub-port reviewer

2014-11-19 Thread Maxim Kuvyrkov
On Nov 14, 2014, at 9:00 PM, H.J. Lu  wrote:

> On Sun, Apr 15, 2012 at 5:08 PM, David Edelsohn  wrote:
>>I am pleased to announce that the GCC Steering Committee has
>> appointed Maxim Kuvyrkov as reviewer for the Android sub-port.
>> 
>>Please join me in congratulating Maxim on his new role.
>> Maxim, please update your listing in the MAINTAINERS file.
>> 
>> Happy hacking!
>> David
>> 
> 
> Hi Maxim,
> 
> Have you added your name to MAINTAINERS?

Will do.

Thank you,

--
Maxim Kuvyrkov
www.linaro.org





Re: [GSoC] Google Summer of Code 2015?

2015-02-19 Thread Maxim Kuvyrkov
Hi Thomas,

Tobias will be GSoC admin for GCC this year.  He has submitted GSoC application 
today.

Tobias, would you please CC gcc@ for future GSoC-related news and updates?

Thank you,

--
Maxim Kuvyrkov
www.linaro.org



> On Feb 19, 2015, at 11:11 AM, Thomas Schwinge  wrote:
> 
> Hi!
> 
> I can't remember this being discussed: if the GCC community would like to
> participate in this year's Google Summer of Code -- the organization
> application period will end tomorrow,
> <http://groups.google.com/group/google-summer-of-code-announce>,
> <http://www.google-melange.com/gsoc/homepage/google/gsoc2015>.  If I
> remember correctly, Maxim handled the organizational bits last year;
> CCing him, just in case.  ;-)
> 
> 
> Grüße,
> Thomas



Git-only namespace for vendor branches

2015-04-30 Thread Maxim Kuvyrkov
Hi Jason,

We at Linaro are moving fully to git and will be using git-only branches in 
GCC's git mirror for Linaro's branches, starting with gcc-5-branch.  Everything 
would have been simple if we didn't have the linaro/* namespace in the SVN 
repo.  I want to double-check with you (and anyone else skilled in GCC's git 
mirror) the changes we plan to make to linaro branches in the git mirror.

At the moment we have:
1. SVN repository has linaro/gcc-4_8-branch and linaro/gcc-4_9-branch.  These 
will continue to live in SVN repo.  It's fine to not have access to these 
branches from the git mirror.
2. Git repository has linaro-dev/* namespace with a couple of project branches 
in it: linaro-dev/sched-model-prefetch and linaro-dev/type-promotion-pass.

We want to get to:
1. Git repository has linaro/* namespace that hosts linaro/gcc-5-branch, 
linaro-dev/sched-model-prefetch and linaro-dev/type-promotion-pass.
2. Ideally, the linaro/* namespace would also have branches 
linaro/gcc-4_8-branch and linaro/gcc-4_9-branch mirrored from SVN.  My 
understanding is that git-svn will not cooperate on this one, so the absence 
of these branches from the git mirror is OK.

My main question is: is it OK to overlay a git-only namespace on the 
same-named SVN namespace?

I want to avoid having linaro/* namespace in SVN, and linaro-something 
namespace in git, especially since linaro/* SVN namespace does not appear in 
git mirror by default (you have to tell git to fetch non-default SVN branches 
to see it).

Thank you,

--
Maxim Kuvyrkov
www.linaro.org





[RFC] Kernel livepatching support in GCC

2015-05-28 Thread Maxim Kuvyrkov
Hi,

Akashi-san and I have been discussing the GCC changes required to make the 
kernel's livepatching work for AArch64 and other architectures.  At the moment 
livepatching is supported for x86[_64] using the following options: "-pg 
-mfentry -mrecord-mcount -mnop-mcount", which is geek-speak for "please add 
several NOPs at the very beginning of each function, and make a section with 
the addresses of all those NOP pads".

The above long-ish list of options is a historical artifact of how livepatching 
support evolved for x86.  The end result is that for livepatching (or ftrace, 
or possible future kernel features) to work, the compiler needs to generate a 
little bit of empty code space at the beginning of each function.  The kernel 
can later use that space to insert call sequences for various hooks.

Our proposal is that instead of adding -mfentry/-mnop-mcount/-mrecord-mcount 
options to other architectures, we should implement a target-independent option 
-fprolog-pad=N, which will generate a pad of N NOPs at the beginning of each 
function and add a section entry describing the pad, similar to -mrecord-mcount 
[1].

Since adding NOPs is much less architecture-specific than outputting call 
instruction sequences, this option can be handled in a target-independent way, 
at least for some/most architectures.

Comments?

As I found out today, the team from Huawei has implemented [2], which follows 
the x86 example of the -mfentry option generating a hard-coded call sequence.  
I hope that this proposal can easily be incorporated into their work, since 
most of the livepatching changes are in the kernel.

[1] Technically, generating a NOP pad and adding a section entry in 
.__mcount_loc are two separate actions, so we may want to have a 
-fprolog-pad-record option.  My instinct is to stick with a single option for 
now, since we can always add more later.

[2] http://lists.infradead.org/pipermail/linux-arm-kernel/2015-May/346905.html
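To make the proposal concrete, a hypothetical -fprolog-pad=2 compilation of a 
function might emit something like the following AArch64 assembly.  The 
section name __prolog_pad_loc is an invented placeholder for illustration 
(the existing x86 machinery records pads in .__mcount_loc); nothing here is an 
agreed-upon ABI:

```asm
	.text
	.globl	foo
	.type	foo, %function
foo:
	nop				// -fprolog-pad=2: patchable pad the
	nop				// kernel can overwrite with a call
	// ... normal function body follows ...
	ret

	// Hypothetical analogue of -mrecord-mcount's section: one entry
	// per function, pointing at its NOP pad.
	.pushsection	__prolog_pad_loc, "a"
	.quad	foo
	.popsection
```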

--
Maxim Kuvyrkov
www.linaro.org





Re: [RFC] Kernel livepatching support in GCC

2015-05-28 Thread Maxim Kuvyrkov
> On May 28, 2015, at 11:59 AM, Richard Biener  
> wrote:
> 
> On May 28, 2015 10:39:27 AM GMT+02:00, Maxim Kuvyrkov 
>  wrote:
>> Hi,
>> 
>> Akashi-san and I have been discussing required GCC changes to make
>> kernel's livepatching work for AArch64 and other architectures.  At the
>> moment livepatching is supported for x86[_64] using the following
>> options: "-pg -mfentry -mrecord-mcount -mnop-mcount" which is
>> geek-speak for "please add several NOPs at the very beginning of each
>> function, and make a section with addresses of all those NOP pads".
>> 
>> The above long-ish list of options is a historical artifact of how
>> livepatching support evolved for x86.  The end result is that for
>> livepatching (or ftrace, or possible future kernel features) to work
>> compiler needs to generate a little bit of empty code space at the
>> beginning of each function.  Kernel can later use that space to insert
>> call sequences for various hooks.
>> 
>> Our proposal is that instead of adding
>> -mfentry/-mnop-mcount/-mrecord-mcount options to other architectures, we
>> should implement a target-independent option -fprolog-pad=N, which will
>> generate a pad of N nops at the beginning of each function and add a
>> section entry describing the pad similar to -mrecord-mcount [1].
>> 
>> Since adding NOPs is much less architecture-specific than outputting
>> call instruction sequences, this option can be handled in a
>> target-independent way at least for some/most architectures.
>> 
>> Comments?
> 
> Maybe follow s390 -mhotpatch instead?

Regarding implementation, the option will follow what s390 does with function 
attributes to mark which functions to apply the NOP treatment to (using 
attributes will avoid problems with [coming] LTO builds of the kernel).  The 
new option will set the value of the attribute on all functions in the current 
compilation unit, and the NOPs will then be generated from the attribute 
specification.

On the other hand, s390 does not generate a section of descriptor entries for 
the NOP pads, which seems like a useful (or necessary) feature.  A more-or-less 
generic implementation should therefore combine s390's attribute approach to 
annotating functions with x86's approach of providing information about NOP 
entries in an ELF section.  Or can we record the value of a function attribute 
in ELF in a generic way?

Whatever the specifics, the implementation of livepatch support should be 
decoupled from the -pg/mcount dependency, as I don't see any real need to 
overload mcount with livepatching stuff.

--
Maxim Kuvyrkov
www.linaro.org




Re: Proposal for the transition timetable for the move to GIT

2019-09-19 Thread Maxim Kuvyrkov
> On Sep 17, 2019, at 3:02 PM, Richard Earnshaw (lists) 
>  wrote:
> 
> At the Cauldron this weekend the overwhelming view for the move to GIT soon 
> was finally expressed.
> 
...
> 
> So in summary my proposed timetable would be:
> 
> Monday 16th December 2019 - cut off date for picking which git conversion to 
> use
> 
> Tuesday 31st December 2019 - SVN repo becomes read-only at end of stage 3.
> 
> Thursday 2nd January 2020 - (ie read-only + 2 days) new git repo comes on 
> line for live commits.
> 
> Doing this over the new year holiday period has both advantages and 
> disadvantages.  On the one hand the traffic is light, so the impact to most 
> developers will be quite low; on the other, it is a holiday period, so 
> getting the right key folk to help might be difficult.  I won't object 
> strongly if others feel that slipping a few days (but not weeks) would make 
> things significantly easier.

The timetable looks entirely reasonable to me.

I have regenerated my primary version this week, and it's up at 
https://git.linaro.org/people/maxim-kuvyrkov/gcc-pretty.git/ .  So far I have 
received only minor issue reports about it, and all known problems have been 
fixed.  I could use a bit more scrutiny :-).

Regards,

--
Maxim Kuvyrkov
www.linaro.org




Re: Branch and tag deletions

2019-11-27 Thread Maxim Kuvyrkov


> On Nov 25, 2019, at 7:07 PM, Joseph Myers  wrote:
> 
> I'm looking at the sets of branches and tags resulting from a GCC 
> repository conversion with reposurgeon.
> 
> 1. I see 227 branches (and one tag) with names like 
> cxx0x-concepts-branch-deleted-r131428-1 (this is out of 780 branches in 
> total in a conversion of GCC history as of a few days ago).  Can we tell 
> reposurgeon not to create such branches (and tags)?  I can't simply do 
> "branch /-deleted-r/ delete" because that command doesn't take a regular 
> expression.
> 
> 2. gcc.lift has a series of "tag  delete" commands, generally 
> deleting tags that aren't official GCC releases or prereleases (many of 
> which were artifacts of how creating such tags was necessary to track 
> merges in the CVS and older SVN era).  But some such commands are 
> mysteriously failing to work.  For example I see
> 
> tag /ZLIB_/ delete
> reposurgeon: no tags matching /ZLIB_/
> 
> but there are tags ZLIB_1_1_3, ZLIB_1_1_4, ZLIB_1_2_1, ZLIB_1_2_3 left 
> after the conversion.  This isn't just an issue with regular expressions; 
> I also see e.g.
> 
> tag apple/ppc-import-20040330 delete
> reposurgeon: no tags matching apple/ppc-import-20040330
> 
> and again that tag exists after the conversion.

IMO, we should aim to convert the complete SVN history frozen at a specific 
point.  If we don't want some branches or tags converted to git, we should 
delete them from the SVN repository before the conversion.

Otherwise it will (a) complicate comparison of repos converted by different 
tools, and (b) require us to remember why parts of the SVN history were not 
converted to git.

My conversion at https://git.linaro.org/people/maxim-kuvyrkov/gcc-pretty.git/ 
contains all branches and tags present in the current SVN repo.

--
Maxim Kuvyrkov
https://www.linaro.org




Re: Branch and tag deletions

2019-12-05 Thread Maxim Kuvyrkov
of their branches.  
Therefore, it may make for a smoother git experience to put user branches out 
of sight.  Vendors, otoh, tend to keep their branches very clean.

> 
>> 
>>> d) releases should go into refs/{heads/tags}/releases (makes it clearer 
>>> to casual users of the repo that these are 'official')
>> 
>> What are releases?  Release branches?
> 
> branches in the heads space, tags in the tags space.
> 
>> 
>> It would be very inconvenient to not have the recent releases immediately
>> accessible, fwiw, but those could be just a copy.  And then delete that
>> one after a branch is closed?
>> 
>>> e) other general development branches in refs/{heads/tags}/devt
>> 
>> What does this mean?  "other", "general"?
> 
> Anything that's not vendor/user specific and not a release - a topic
> branch most likely
>> 
>>> That probably means the top-level heads/tags spaces are empty; but I 
>>> have no problem with that.
>> 
>> It is good when people get the most often used things immediately.
> 
> git branch -a will show anything in refs/remotes, and the default pull
> spec is to pull refs/heads/* (and anything under that), so all release
> and topic branches would be pulled by default, but not anything else.
> 
> According to the git fetch manual page, tags are fetched if an object
> they point to is fetched.  I presume this only applies to tags under
> refs/tags.  But this is getting into details of git that I've not used
> before.  I need to experiment a bit more here.
> 
> R.
> 
> PS.  Just seen https://git-scm.com/docs/gitnamespaces, that might be
> exactly what we want for users, vendors and legacy stuff.  I'll
> investigate some more...

--
Maxim Kuvyrkov
https://www.linaro.org






Re: Proposal for the transition timetable for the move to GIT

2019-12-06 Thread Maxim Kuvyrkov
> On Sep 19, 2019, at 6:34 PM, Maxim Kuvyrkov  wrote:
> 
>> On Sep 17, 2019, at 3:02 PM, Richard Earnshaw (lists) 
>>  wrote:
>> 
>> At the Cauldron this weekend the overwhelming view for the move to GIT soon 
>> was finally expressed.
>> 
> ...
>> 
>> So in summary my proposed timetable would be:
>> 
>> Monday 16th December 2019 - cut off date for picking which git conversion to 
>> use
>> 
>> Tuesday 31st December 2019 - SVN repo becomes read-only at end of stage 3.
>> 
>> Thursday 2nd January 2020 - (ie read-only + 2 days) new git repo comes on 
>> line for live commits.
>> 
>> Doing this over the new year holiday period has both advantages and 
>> disadvantages.  On the one hand the traffic is light, so the impact to most 
>> developers will be quite low; on the other, it is a holiday period, so 
>> getting the right key folk to help might be difficult.  I won't object 
>> strongly if others feel that slipping a few days (but not weeks) would make 
>> things significantly easier.
> 
> The timetable looks entirely reasonable to me.
> 
> I have regenerated my primary version this week, and it's up at 
> https://git.linaro.org/people/maxim-kuvyrkov/gcc-pretty.git/ .  So far I have 
> received only minor issue reports about it, and all known problems have been 
> fixed.  I could use a bit more scrutiny :-).

I think now is a good time to give a status update on the svn->git conversion I 
maintain.  See https://git.linaro.org/people/maxim-kuvyrkov/gcc-pretty.git/ .

1. The conversion has all SVN live branches converted as branches under 
refs/heads/* .

2. The conversion has all SVN live tags converted as annotated tags under 
refs/tags/* .

3. If desired, it would be trivial to add all deleted / leaf SVN branches and 
tags.  They would be named like branches/my-deleted-branch@12345, where @12345 
is the revision at which the branch was deleted.  Branches created and deleted 
multiple times would have separate entries corresponding to their delete 
revisions.

4. Git committer and author entries are very accurate (imo, better than 
reposurgeon's, but I'm biased).  Developers' names and email addresses are 
mined from commit logs, ChangeLogs and source code, and have historically 
accurate attributions to employer email addresses.

5. Since there is interest in reparenting branches to fix cvs2svn merge issues, 
I've added this feature to my scripts as well (turned out to be trivial).  I'll 
keep the original gcc-pretty.git repo intact and will upload the new one at 
https://git.linaro.org/people/maxim-kuvyrkov/gcc-reparent.git/  -- should be 
live by Monday.

Finally, there seem to be quite a few misunderstandings about the scripts I've 
developed and their limitations.  Most of these misunderstandings stem from the 
assumption that all git-svn limitations must apply to my scripts.  That's not 
the case.  SVN merges, branch/tag reparenting, and commit-log adjustments are 
all handled correctly in my scripts.  I welcome criticism with pointers to 
revisions that have been incorrectly converted.

The general conversion workflow (really a poor man's translator of one DAG into 
another) is:

1. Parse the SVN history of the entire SVN root (svn log -qv file:///svnrepo/) 
and build a list of branch points.
2. From the branch points, build a DAG of "basic blocks" of revision history.  
Each basic block is a consecutive run of commits in which only the last commit 
can be a branch point.
3. Walk the DAG and ...
4. ... use git-svn to individually convert these basic blocks.
4a. Optionally, post-process the git result of each basic-block conversion 
using "git filter-branch" and similar tools.
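The basic-block splitting in steps 1-2 above can be sketched roughly as 
follows; the revision numbers and branch-point set are illustrative test data, 
not real SVN history.

```python
def basic_blocks(revs, branch_points):
    """Step 2 of the workflow above: group consecutive revisions into
    blocks in which only the last revision may be a branch point."""
    blocks, current = [], []
    for r in revs:
        current.append(r)
        if r in branch_points:
            blocks.append(current)
            current = []
    if current:
        blocks.append(current)
    return blocks

# Toy history: revisions r1..r10, with r3 and r7 copied from by branches.
print(basic_blocks(list(range(1, 11)), {3, 7}))
```

Each resulting block can then be handed to git-svn independently (step 4), 
keeping the branch-point commits aligned across branches.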

Git-svn is used in a limited role, and it does its job very well in this role.

Regards,

--
Maxim Kuvyrkov
https://www.linaro.org




Re: Proposal for the transition timetable for the move to GIT

2019-12-11 Thread Maxim Kuvyrkov
> On Dec 9, 2019, at 9:19 PM, Joseph Myers  wrote:
> 
> On Fri, 6 Dec 2019, Eric S. Raymond wrote:
> 
>> Reposurgeon has been used for several major conversions, including groff 
>> and Emacs.  I don't mean to be nasty to Maxim, but I have not yet seen 
>> *anybody* who thought they could get the job done with ad-hoc scripts 
>> turn out to be correct.  Unfortunately, the costs of failure are often 
>> well-hidden problems in the converted history that people trip over 
>> months and years later.
> 
> I think the ad hoc script is the risk factor here as much as the fact that 
> the ad hoc script makes limited use of git-svn.
> 
> For any conversion we're clearly going to need to run various validation 
> (comparing properties of the converted repository, such as contents at 
> branch tips, with expected values of those properties based on the SVN 
> repository) and fix issues shown up by that validation.  reposurgeon has 
> its own tools for such validation; I also intend to write some validation 
> scripts myself.  And clearly we need to fix issues shown up by such 
> validation - that's what various recent reposurgeon issues Richard and I 
> have reported are about, fixing the most obvious issues that show up, 
> which in turn will enable running more detailed validation.
> 
> The main risks are about issues that are less obvious in validation and so 
> don't get fixed in that process.  There, if you're using an ad hoc script, 
> the risks are essentially unknown.  But using a known conversion tool with 
> an extensive testsuite, such as reposurgeon, gives confidence based on 
> reposurgeon passing its own testsuite (once the SVN dump reader rewrite 
> does so) that a wide range of potential conversion bugs, that might appear 
> without showing up in the kinds of validation people try, are less likely 
> because of all the regression tests for conversion issues seen in past 
> conversions.  When using an ad hoc script specific to one conversion you 
> lose that confidence that comes from a conversion tool having been used in 
> previous conversions and having tests to ensure bugs found in those 
> conversions don't come back.
> 
> I think we should fix whatever the remaining relevant bugs are in 
> reposurgeon and do the conversion with reposurgeon being used to read and 
> convert the SVN history and do any desired surgical operations on it.
> 
> Ad hoc scripts identifying specific proposed local changes to the 
> repository content, such as the proposed commit message improvements from 
> Richard or my branch parent fixes, to be performed with reposurgeon, seem 
> a lot safer than ad hoc code doing the conversion itself.  And for 
> validation, the more validation scripts people come up with the better.  
> If anyone has or wishes to write custom scripts to analyze the SVN 
> repository branch structure and turn that into verifiable assertions about 
> what a git conversion should look like, rather than into directly 
> generating a git repository or doing surgery on history, that helps us 
> check a reposurgeon-converted repository in areas that might be 
> problematic - and in that case it's OK for the custom script to have 
> unknown bugs because issues it shows up are just pointing out places where 
> the converted repository needs checking more carefully to decide whether 
> there is a conversion bug or not.

Firstly, I am not going to defend my svn-git-* scripts or the git-svn tool they 
use.  They are likely to have bugs and problems.  I am, though, going to defend 
the conversion that these tools produced.  No matter the conversion tool, all 
that matters is the final result.  I have asked many times for others to 
scrutinize the git repository that I uploaded several months ago and to point 
out any artifacts or mistakes.  Surely, it can't be hard to find a mistake or 
two in my converted repository by comparing it against any other /better/ 
repository that one has.

[FWIW, I am going to privately compare the reposurgeon-generated repo that 
Richard E. uploaded against my repo.  The results of such a comparison can 
appear biased, so I'm not planning to publish them.]

Secondly, the GCC community has overwhelmingly supported the move to git, and 
in private conversations many developers have expressed the same view:

1. all we care about is the history of trunk and recent release branches
2. the current gcc-mirror is really all we need
3. having vendor branches and author info would be nice, but not so nice as to 
delay the switch any longer

Granted, the above is not the /official/ consensus of the GCC community, and I 
don't want to represent it as such.  However, it is equally not the consensus 
of the GCC community to delay the switch to git until we have a 
confirmed-perfect repo.

--
Maxim Kuvyrkov
https://www.linaro.org



Re: Test GCC conversion with reposurgeon available

2019-12-24 Thread Maxim Kuvyrkov
> On Dec 22, 2019, at 4:56 PM, Joseph Myers  wrote:
> 
> On Thu, 19 Dec 2019, Joseph Myers wrote:
> 
>> And two more.
>> 
>> git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-4a.git
>> git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-4b.git
> 
> Two more.
> 
> git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-5a.git
> git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-5b.git
> 
> The main changes are:
> 
> * The case of both svnmerge-integrated and svn:mergeinfo being set is now 
> handled properly, so the commit Bernd found is interpreted as a merge from 
> trunk to named-addr-spaces-branch and has exactly two parents as expected, 
> with the parents corresponding to the merges from other branches to trunk 
> being optimized away.
> 
> * The author map used now avoids timezone-only entries also remapping 
> email addresses, so the email addresses from the ChangeLogs are used 
> whenever a commit adds ChangeLog entries from exactly one author.
> 
> * When commits add ChangeLog entries from more than one author (e.g. 
> merges done in CVS), the committer is now used as the author rather than 
> selecting one of the authors from the ChangeLog entries.
> 
> * The latest whitelisting / PR corrections are used with Richard's script 
> (430 checkme: entries remain).
> 
> * One fix to the ref renaming in gcc-reposurgeon-5b.git so that the tag 
> gcc-3_2-rhl8-3_2-7 properly ends up in vendors rather than prereleases.

I'll spend the next couple of days comparing Joseph's gcc-reposurgeon-5a.git 
conversion against my gcc-pretty.git and gcc-reparent.git conversions, and will 
post the results along with the scripts to this mailing list.

Regarding gcc-pretty.git and gcc-reparent.git conversions, I have the following 
comments so far:

Q1: Why are there missing branches for stuff that didn't originate at trunk@1?
A1: Indeed, that's by design / configuration.  The scripts start with trunk@1 
and build a parent DAG from that node.  If desired, it is trivial to add more 
initial "root" commits to include these missing branches.

Q2: Why are entries from branches/st/tags treated as branches, not as tags?
A2: Because I opted not to special-case these, to simplify comparison of 
different conversions.  Tags/* entries are converted to git annotated tags in a 
separate pass, and it is trivial to add handling for branches/st/tags there.

Q3: Why do reparented branches in gcc-reparent.git repo have merge commits at 
the point of reparenting?
A3: That's an artifact of svn-git machinery my scripts are using.  I haven't 
looked at this in depth.

Q4: Is it possible to integrate Richard E.'s script to rewrite commit log 
messages?
A4: Yes, absolutely.  The scripts have a pass to rewrite commit 
author/committer entries, and log rewriting fits in there easily.  It would be 
very helpful to have a version of Richard's script that runs on a per-commit 
basis, suitable for "git filter-branch" consumption.
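A per-commit rewriter of the kind described would plug into "git filter-branch 
--msg-filter".  The sketch below is a hypothetical stand-in, not Richard's 
actual script; the PR-link rewrite rule is made up for illustration.

```python
import re

def rewrite(msg):
    """Rewrite one commit message; here: append a bug-tracker link
    when the log mentions a PR number (illustrative rule only)."""
    m = re.search(r"\bPR (?:[\w-]+/)?(\d+)\b", msg)
    if m and "gcc.gnu.org/PR" not in msg:
        msg = msg.rstrip("\n") + "\n\nSee https://gcc.gnu.org/PR%s\n" % m.group(1)
    return msg

# Under filter-branch the script would read the message on stdin and
# write the result to stdout, e.g.:
#   git filter-branch --msg-filter 'python3 rewrite-msg.py' -- --all
print(rewrite("Fix PR middle-end/28071"), end="")
```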

Regards,

--
Maxim Kuvyrkov
https://www.linaro.org





Re: Proposal for the transition timetable for the move to GIT

2019-12-26 Thread Maxim Kuvyrkov


> On Dec 26, 2019, at 2:16 PM, Jakub Jelinek  wrote:
> 
> On Thu, Dec 26, 2019 at 11:04:29AM +, Joseph Myers wrote:
> Is there some easy way (e.g. file in the conversion scripts) to correct
> spelling and other mistakes in the commit authors?
> E.g. there are misspelled surnames, etc. (e.g. looking at my name, I see
> Jakub Jakub Jelinek (1):
> Jakub Jeilnek (1):
> Jelinek (1):
> entries next to the expected one with most of the commits.
> For the misspellings, wonder if e.g. we couldn't compute edit distances from
> other names and if we have one with many commits and then one with very few
> with small edit distance from those, flag it for human review.

This is close to what the svn-git-author.sh script does in the gcc-pretty and 
gcc-reparent conversions.  It ignores 1-3 character differences in 
author/committer names and email addresses.  I've audited the results for all 
branches and didn't spot any mistakes.
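Jakub's edit-distance idea and the 1-3 character tolerance above can be 
sketched like this; the names and commit counts are illustrative, not mined 
from the repository.

```python
def edit_distance(a, b):
    """Classic dynamic-programming Levenshtein distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

# Flag rare names within 3 edits of the most frequent spelling.
counts = {"Jakub Jelinek": 2500, "Jakub Jeilnek": 1, "Jelinek": 1}
canonical = max(counts, key=counts.get)
suspects = sorted(n for n in counts
                  if n != canonical and edit_distance(n, canonical) <= 3)
print(suspects)
```

Note that "Jelinek" alone is 6 edits away from the canonical spelling, so this 
heuristic would not catch it; such entries still need human review.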

In other news, I'm working on a comparison of the gcc-pretty, gcc-reparent and 
gcc-reposurgeon-5a repos among themselves.  Below are current notes from the 
comparison of gcc-pretty/trunk and gcc-reposurgeon-5a/trunk.

== Merges on trunk ==

Reposurgeon creates merge entries on trunk when changes from a branch are 
merged into trunk.  This brings the entire development history from the branch 
into trunk, which is both good and bad.  The good part is that we get more 
visibility into how the code evolved.  The bad part is that we get many "noisy" 
commits from the merged branch (e.g., "Merge in trunk" every few revisions) and 
that our SVN branches are of work-in-progress quality, not ready-for-review 
quality.  It's common for files to be rewritten in large chunks on branches.

Also, reposurgeon's commit logs don't carry information on the SVN path from 
which a change came, so there is no easy way to determine that a given commit 
is from a merged branch rather than an original trunk commit.  Git-svn, on the 
other hand, provides "git-svn-id: @" tags in its commit logs.

My conversion follows current GCC development policy that trunk history should 
be linear.  Branch merges to trunk are squashed.  Merges between non-trunk 
branches are handled as specified by svn:mergeinfo SVN properties.
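The linear-trunk policy can be pictured on a toy commit graph (commit mapped to 
its parent list); this is a model for illustration, not code from the 
conversion scripts.

```python
def apply_merge(graph, trunk_tip, branch_tip, new_sha, squash):
    """Record a merge into trunk.  A true merge keeps two parents;
    under the squash policy the branch parent is dropped, so trunk
    history stays linear."""
    graph[new_sha] = [trunk_tip] if squash else [trunk_tip, branch_tip]
    return new_sha

graph = {"t1": [], "t2": ["t1"], "b1": ["t1"]}
apply_merge(graph, "t2", "b1", "t3", squash=True)
print(graph["t3"])   # single parent: branch commits are squashed away
```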

== Differences in trees ==

Git trees (aka filesystem content) match between pretty/trunk and 
reposurgeon-5a/trunk from the current tip back to SVN's r130805.
Here is the SVN log of that revision (restoration of the deleted trunk):

r130805 | dberlin | 2007-12-13 01:53:37 + (Thu, 13 Dec 2007)
Changed paths:
   A /trunk (from /trunk:130802)


Reposurgeon conversion has:
-
commit 7e6f2a96e89d96c2418482788f94155d87791f0a
Author: Daniel Berlin 
Date:   Thu Dec 13 01:53:37 2007 +

Readd trunk

Legacy-ID: 130805

 .gitignore | 17 -
 1 file changed, 17 deletions(-)
-
and my conversion has:
-
commit fb128f3970789ce094c798945b4fa20eceb84cc7
Author: Daniel Berlin 
Date:   Thu Dec 13 01:53:37 2007 +

Readd trunk


git-svn-id: https://gcc.gnu.org/svn/gcc/trunk@130805 
138bc75d-0d04-0410-961f-82ee72b054a4
-

It appears that .gitignore was added at r1 by reposurgeon and then deleted 
at r130805.  In the SVN repository .gitignore was added in r195087.  I 
speculate that the addition of .gitignore at r1 is expected, but its deletion 
at r130805 is highly suspicious.

== Committer entries ==

Reposurgeon uses $u...@gcc.gnu.org for committer email addresses even when it 
correctly detects the author name from the ChangeLog.

reposurgeon-5a:
r278995 Martin Liska  Martin Liska 
r278994 Jozef Lawrynowicz  Jozef Lawrynowicz 

r278993 Frederik Harwath  Frederik Harwath 

r278992 Georg-Johann Lay  Georg-Johann Lay 
r278991 Richard Biener  Richard Biener 

pretty:
r278995 Martin Liska  Martin Liska 
r278994 Jozef Lawrynowicz  Jozef Lawrynowicz 

r278993 Frederik Harwath  Frederik Harwath 

r278992 Georg-Johann Lay  Georg-Johann Lay 
r278991 Richard Biener  Richard Biener 

== Bad summary line ==

While looking around r138087, the commit below caught my eye.  Are the contents 
of the summary line as expected?

commit cc2726884d56995c514d8171cc4a03657851657e
Author: Chris Fairles 
Date:   Wed Jul 23 14:49:00 2008 +

acinclude.m4 ([GLIBCXX_CHECK_CLOCK_GETTIME]): Define GLIBCXX_LIBS.

2008-07-23  Chris Fairles 

* acinclude.m4 ([GLIBCXX_CHECK_CLOCK_GETTIME]): Define GLIBCXX_LIBS.
Holds the lib that defines clock_gettime (-lrt or -lposix4).
* src/Makefile.am: Use it.
* configure: Regenerate.
* configure.in: Likewise.
* Makefile.in: Likewise.
* src/Makefile.in: Likewise.
* libsup++/Makefile.in: Likewise.
    * po/Ma

Re: Proposal for the transition timetable for the move to GIT

2019-12-27 Thread Maxim Kuvyrkov
> On Dec 27, 2019, at 4:32 AM, Joseph Myers  wrote:
> 
> On Thu, 26 Dec 2019, Joseph Myers wrote:
> 
>>> It appears that .gitignore has been added in r1 by reposurgeon and then 
>>> deleted at r130805.  In SVN repository .gitignore was added in r195087.  
>>> I speculate that addition of .gitignore at r1 is expected, but it's 
>>> deletion at r130805 is highly suspicious.
>> 
>> I suspect this is one of the known issues related to reposurgeon-generated 
>> .gitignore files.  Since such files are not really part of the GCC 
>> history, and the .gitignore files checked into SVN are properly preserved 
>> as far as I can see, I don't think it's a particularly important issue for 
>> the GCC conversion (since auto-generated .gitignore files are only 
>> nice-to-have, not required).  I've filed 
>> https://gitlab.com/esr/reposurgeon/issues/219 anyway with a reduced test 
>> for this oddity.
> 
> This has now been fixed, so future conversion runs with reposurgeon should 
> have the automatically-generated .gitignore present until replaced by the 
> one checked into SVN.  (If people don't want automatically-generated 
> .gitignore files at all, we could always add an option to reposurgeon not 
> to generate them.)

Removing auto-generated .gitignore files from the reposurgeon conversion would 
allow comparison of git trees against gcc-pretty and gcc-reparent beyond 
r195087.  While we are evaluating the conversion candidates, it is best to 
disable conversion features that cause hard-to-work-around differences.

> 
> I'll do another GCC conversion run to pick up all the accumulated fixes 
> and improvements (including many more PR whitelist entries / fixes in 
> Richard's script), once another ChangeLog-related fix is in.


--
Maxim Kuvyrkov
https://www.linaro.org



Re: Proposal for the transition timetable for the move to GIT

2019-12-29 Thread Maxim Kuvyrkov
Below are several more issues I found in the reposurgeon-6a conversion while 
comparing it against the gcc-reparent conversion.

I am sure these, and whatever other problems I may find in the reposurgeon 
conversion, can be fixed in time.  However, I don't see why we should bother.  
My conversion has been available since summer 2019, I made it ready in time for 
GCC Cauldron 2019, and it hasn't changed in any significant way since then.

With the "Missed merges" problem (see below) I don't see how the reposurgeon 
conversion can be considered "ready".  Also, I would expect a diligent 
developer to compare the new conversion (reposurgeon's) against the existing 
one (gcc-pretty / gcc-reparent) before declaring the new conversion "better" or 
even "ready".  The differences I'm seeing between my conversions and 
reposurgeon's show that the gcc-reparent conversion is /better/.

I suggest that the GCC community adopt either the gcc-pretty or the 
gcc-reparent conversion.  I welcome Richard E. to modify his summary scripts to 
work with the svn-git scripts, which should be straightforward, and I'm ready 
to help.

Meanwhile, I'm going to add additional root commits to my gcc-reparent 
conversion to bring in the "missing" branches (the ones that don't share 
history with trunk@1) and restart daily updates of the gcc-reparent conversion.

Finally, with the comparison data I have, I consider statements about git-svn's 
poor quality to be very misleading.  Git-svn may have had serious bugs years 
ago when Eric R. evaluated it and started his work on reposurgeon.  But a lot 
of development has happened and many problems have been fixed since then.  At 
the moment it is reposurgeon that is producing conversions with obscure 
mistakes in repository metadata.


=== Missed merges ===

Reposurgeon misses merges from trunk on 130+ branches.  I've spot-checked 
ARM/hard_vfp_branch and redhat/gcc-9-branch and, indeed, rather mundane merges 
were omitted.  Below is the analysis for ARM/hard_vfp_branch.

$ git log --stat refs/remotes/gcc-reposurgeon-6a/ARM/hard_vfp_branch~4

commit ef92c24b042965dfef982349cd5994a2e0ff5fde
Author: Richard Earnshaw 
Date:   Mon Jul 20 08:15:51 2009 +

Merge trunk through to r149768

Legacy-ID: 149804

 COPYING.RUNTIME |73 +
 ChangeLog   |   270 +-
 MAINTAINERS |19 +-



At the same time, for the svn-git scripts we have:

$ git log --stat refs/remotes/gcc-reparent/ARM/hard_vfp_branch~4

commit ce7d5c8df673a7a561c29f095869f20567a7c598
Merge: 4970119c20da 3a69b1e566a7
Author: Richard Earnshaw 
Date:   Mon Jul 20 08:15:51 2009 +

Merge trunk through to r149768

git-svn-id: https://gcc.gnu.org/svn/gcc/branches/ARM/hard_vfp_branch@149804 
138bc75d-0d04-0410-961f-82ee72b054a4


... which agrees with
$ svn propget svn:mergeinfo 
file:///home/maxim.kuvyrkov/tmpfs-stuff/svnrepo/branches/ARM/hard_vfp_branch@149804
/trunk:142588-149768
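The svn:mergeinfo check above can be mechanized: parse the property before and 
after the suspected merge revision, and expect a two-parent git commit whenever 
the delta is non-empty.  A sketch with property values in the shape shown above 
(the "before" value is made up for illustration):

```python
def parse_mergeinfo(prop):
    """Parse an svn:mergeinfo value ('path:start-end,rev,...' per line)
    into {path: set of revisions}."""
    info = {}
    for line in prop.strip().splitlines():
        path, ranges = line.rsplit(":", 1)
        revs = set()
        for r in ranges.split(","):
            if "-" in r:
                lo, hi = map(int, r.split("-"))
                revs.update(range(lo, hi + 1))
            else:
                revs.add(int(r))
        info[path] = revs
    return info

before = parse_mergeinfo("/trunk:142588-149000")   # state before r149804
after = parse_mergeinfo("/trunk:142588-149768")    # state at r149804
newly_merged = after["/trunk"] - before["/trunk"]
# Non-empty delta => r149804 should have a second (trunk) parent in git.
print(min(newly_merged), max(newly_merged))
```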

=== Bad author entries ===

The reposurgeon-6a conversion has authors "12:46:56 1998 Jim Wilson" and 
"2005-03-18 Kazu Hirata".  It is rather obvious that a person's name is 
unlikely to start with a digit.

=== Missed authors ===

The reposurgeon-6a conversion misses many authors; below is a list of people 
with names starting with "A".

Akos Kiss
Anders Bertelrud
Andrew Pochinsky
Anton Hartl
Arthur Norman
Aymeric Vincent

=== Conservative author entries ===

The reposurgeon-6a conversion uses default "@gcc.gnu.org" emails for many 
commits where the svn-git conversion manages to extract a valid email from 
commit data.  This happens for hundreds of author entries.

Regards,

--
Maxim Kuvyrkov
https://www.linaro.org


> On Dec 26, 2019, at 7:11 PM, Maxim Kuvyrkov  wrote:
> 
> 
>> On Dec 26, 2019, at 2:16 PM, Jakub Jelinek  wrote:
>> 
>> On Thu, Dec 26, 2019 at 11:04:29AM +, Joseph Myers wrote:
>> Is there some easy way (e.g. file in the conversion scripts) to correct
>> spelling and other mistakes in the commit authors?
>> E.g. there are misspelled surnames, etc. (e.g. looking at my name, I see
>> Jakub Jakub Jelinek (1):
>> Jakub Jeilnek (1):
>> Jelinek (1):
>> entries next to the expected one with most of the commits.
>> For the misspellings, wonder if e.g. we couldn't compute edit distances from
>> other names and if we have one with many commits and then one with very few
>> with small edit distance from those, flag it for human review.
> 
> This is close to what svn-git-author.sh script is doing in gcc-pretty and 
> gcc-reparent conversions.  It ignores 1-3 character differences in 
> author/committer names and email addresses.  I've audited results for all 
> branches and didn't spot any mistakes

Re: Proposal for the transition timetable for the move to GIT

2019-12-30 Thread Maxim Kuvyrkov
> On Dec 30, 2019, at 3:18 AM, Joseph Myers  wrote:
> 
> On Sun, 29 Dec 2019, Richard Earnshaw (lists) wrote:
> 
>> gcc-reparent is better, but many (most?) of the release tags are shown
>> as merge commits with a fake parent back to the gcc-3 branch point,
>> which is certainly not what happened when the tagging was done at that
>> time.
> 
> And looking at the history of gcc-reparent as part of preparing to compare 
> authors to identify commits needing manual attention to author 
> identification, I see other oddities.
> 
> Do "git log egcs_1_1_2_prerelease_2" in gcc-reparent, for example.  The 
> history ends up containing two different versions of SVN r5 and of many 
> other commits.  One of them looks normal:
> 
> commit c01d37f1690de9ea83b341780fad458f506b80c7
> Author: Charles Hannum 
> Date:   Mon Nov 27 21:22:14 1989 +
> 
>entered into RCS
> 
> 
>git-svn-id: https://gcc.gnu.org/svn/gcc/trunk@5 
> 138bc75d-0d04-0410-961f-82ee72b054a4
> 
> The other looks strange:
> 
> commit 09c5a0fa5ed76e58cc67f3d72bf397277fdd
> Author: Charles Hannum 
> Date:   Mon Nov 27 21:22:14 1989 +
> 
>entered into RCS
> 
> 
>git-svn-id: https://gcc.gnu.org/svn/gcc/trunk@5 
> 138bc75d-0d04-0410-961f-82ee72b054a4
>Updated tag 'egcs_1_1_2_prerelease_2@279090' (was bc80be265a0)
>Updated tag 'egcs_1_1_2_prerelease_2@279154' (was f7cee65b219)
>Updated tag 'egcs_1_1_2_prerelease_2@279213' (was 74dcba9b414)
>Updated tag 'egcs_1_1_2_prerelease_2@279270' (was 7e63c9b344d)
>Updated tag 'egcs_1_1_2_prerelease_2@279336' (was 47894371e3c)
>Updated tag 'egcs_1_1_2_prerelease_2@279392' (was 3c3f6932316)
>Updated tag 'egcs_1_1_2_prerelease_2@279402' (was 29d9998f523b)
> 
> (and in fact it seems there are *four* commits corresponding to SVN r5 and 
> reachable from refs in the gcc-reparent repository).  So we don't just 
> have stray merge commits, they actually end up leading back to strange 
> alternative versions of history (which I think is clearly worse than 
> conservatively not having a merge commit in some case where a commit might 
> or might not be unambiguously a merge - if a merge was missed on an active 
> branch, the branch maintainer can easily correct that afterwards with "git 
> merge -s ours" to avoid problems with future merges).
> 
> My expectation is that there are only multiple git commits corresponding 
> to an SVN commit when the SVN commit touched more than one SVN branch or 
> tag and so has to be split to represent it in git (there are about 1500 
> such SVN commits, most of them automatic datestamp updates in the CVS era 
> that cvs2svn turned into mixed-branch commits).

Thanks for catching this.  This is fallout from incremental rebuilds (rather 
than fresh rebuilds) of the gcc-reparent repository.  Incremental builds take 
about 1 hour and full rebuilds take about 30 hours.  I'll switch to doing full 
rebuilds.

--
Maxim Kuvyrkov
https://www.linaro.org



Re: Proposal for the transition timetable for the move to GIT

2019-12-30 Thread Maxim Kuvyrkov
> On Dec 30, 2019, at 1:24 AM, Richard Earnshaw (lists) 
>  wrote:
> 
> On 29/12/2019 18:30, Maxim Kuvyrkov wrote:
>> Below are several more issues I found in reposurgeon-6a conversion comparing 
>> it against gcc-reparent conversion.
>> 
>> I am sure, these and whatever other problems I may find in the reposurgeon 
>> conversion can be fixed in time.  However, I don't see why should bother.  
>> My conversion has been available since summer 2019, I made it ready in time 
>> for GCC Cauldron 2019, and it didn't change in any significant way since 
>> then.
>> 
>> With the "Missed merges" problem (see below) I don't see how reposurgeon 
>> conversion can be considered "ready".  Also, I expected a diligent developer 
>> to compare new conversion (aka reposurgeon's) against existing conversion 
>> (aka gcc-pretty / gcc-reparent) before declaring the new conversion "better" 
>> or even "ready".  The data I'm seeing in differences between my and 
>> reposurgeon conversions shows that gcc-reparent conversion is /better/.
>> 
>> I suggest that GCC community adopts either gcc-pretty or gcc-reparent 
>> conversion.  I welcome Richard E. to modify his summary scripts to work with 
>> svn-git scripts, which should be straightforward, and I'm ready to help.
>> 
> 
> I don't think either of these conversions are any more ready to use than
> the reposurgeon one, possibly less so.  In fact, there are still some
> major issues to resolve first before they can be considered.
> 
> gcc-pretty has completely wrong parent information for the gcc-3 era
> release tags, showing the tags as being made directly from trunk with
> massive deltas representing the roll-up of all the commits that were
> made on the gcc-3 release branch.

Let me clarify the above statement, and please correct me where you think I'm 
wrong.  The gcc-pretty conversion has exactly the parent information for the 
gcc-3-era release tags that is recorded in SVN version history.  The gcc-pretty 
conversion aims to produce an exact copy of SVN history in git; IMO, it manages 
to do so just fine.

It is a separate matter that the SVN history itself has a screwed-up record of 
the gcc-3-era tags.

> 
> gcc-reparent is better, but many (most?) of the release tags are shown
> as merge commits with a fake parent back to the gcc-3 branch point,
> which is certainly not what happened when the tagging was done at that
> time.

I agree with you here.

> 
> Both of these factually misrepresent the history at the time of the
> release tag being made.

Yes and no.  The gcc-pretty repository mirrors SVN history.  And regarding the need 
for reparenting -- we lived with the current history for gcc-3 release tags for a 
long time.  I would argue their continued brokenness is not a show-stopper.

Looking at this from a different perspective, when I posted the initial svn-git 
scripts back in Summer, the community roughly agreed on a plan to
1. Convert entire SVN history to git.
2. Use the stock git history rewrite tools (git filter-branch) to fix up what we 
want, e.g., reparent tags and branches or set better author/committer entries.

Gcc-pretty does (1) in entirety.

For reparenting, I tried a 15-minute fix to my scripts to enable reparenting, which 
worked, but produced artifacts like merge commits from both the old and new parents.  I 
will drop this and instead use the tried-and-true "git filter-branch" to reparent 
those tags and branches, thus producing gcc-reparent from gcc-pretty.

> 
> As for converting my script to work with your tools, I'm afraid I don't
> have time to work on that right now.  I'm still bogged down validating
> the incorrect bug ids that the script has identified for some commits.
> I'm making good progress (we're down to 160 unreviewed commits now), but
> it is still going to take what time I have over the next week to
> complete that task.
> 
> Furthermore, there is no documentation on how your conversion scripts
> work, so it is not possible for me to test any work I might do in order
> to validate such changes.  Not being able to run the script locally to
> test changes would be a non-starter.
> 
> You are welcome, of course, to clone the script I have and attempt to
> modify it yourself, it's reasonably well documented.  The sources can be
> found in esr's gcc-conversion repository here:
> https://gitlab.com/esr/gcc-conversion.git

--
Maxim Kuvyrkov
https://www.linaro.org

> 
> 
>> Meanwhile, I'm going to add additional root commits to my gcc-reparent 
>> conversion to bring in "missing" branches (the ones, which don't share 
>> history with trunk@1) and restart daily updates of gcc-

Re: Proposal for the transition timetable for the move to GIT

2019-12-30 Thread Maxim Kuvyrkov
> On Dec 30, 2019, at 6:31 PM, Richard Earnshaw (lists) 
>  wrote:
> 
> On 30/12/2019 13:00, Maxim Kuvyrkov wrote:
>>> On Dec 30, 2019, at 1:24 AM, Richard Earnshaw (lists) 
>>>  wrote:
>>> 
>>> On 29/12/2019 18:30, Maxim Kuvyrkov wrote:
>>>> Below are several more issues I found in reposurgeon-6a conversion 
>>>> comparing it against gcc-reparent conversion.
>>>> 
>>>> I am sure, these and whatever other problems I may find in the reposurgeon 
>>>> conversion can be fixed in time.  However, I don't see why I should bother.  
>>>> My conversion has been available since summer 2019, I made it ready in 
>>>> time for GCC Cauldron 2019, and it didn't change in any significant way 
>>>> since then.
>>>> 
>>>> With the "Missed merges" problem (see below) I don't see how reposurgeon 
>>>> conversion can be considered "ready".  Also, I expected a diligent 
>>>> developer to compare new conversion (aka reposurgeon's) against existing 
>>>> conversion (aka gcc-pretty / gcc-reparent) before declaring the new 
>>>> conversion "better" or even "ready".  The data I'm seeing in differences 
>>>> between my and reposurgeon conversions shows that gcc-reparent conversion 
>>>> is /better/.
>>>> 
>>>> I suggest that GCC community adopts either gcc-pretty or gcc-reparent 
>>>> conversion.  I welcome Richard E. to modify his summary scripts to work 
>>>> with svn-git scripts, which should be straightforward, and I'm ready to 
>>>> help.
>>>> 
>>> 
>>> I don't think either of these conversions is any more ready to use than
>>> the reposurgeon one, possibly less so.  In fact, there are still some
>>> major issues to resolve first before they can be considered.
>>> 
>>> gcc-pretty has completely wrong parent information for the gcc-3 era
>>> release tags, showing the tags as being made directly from trunk with
>>> massive deltas representing the roll-up of all the commits that were
>>> made on the gcc-3 release branch.
>> 
>> I will clarify the above statement, and please correct me where you think 
>> I'm wrong.  Gcc-pretty conversion has the exact right parent information for 
>> the gcc-3 era
>> release tags as recorded in SVN version history.  Gcc-pretty conversion aims 
>> to produce an exact copy of SVN history in git.  IMO, it manages to do so 
>> just fine.
>> 
>> It is a different thing that SVN history has a screwed up record of gcc-3 
>> era tags.
> 
> It's not screwed up in svn.  Svn shows the correct history information for 
> the gcc-3 era release tags, but the git-svn conversion in gcc-pretty does not.
> 
> For example, looking at gcc_3_0_release in expr.c with git blame and svn 
> blame shows

In SVN history tags/gcc_3_0_release has been copied off /trunk:39596, and in the 
same commit a bunch of files were replaced from /branches/gcc-3_0-branch/ (and 
from different revisions of that branch!).

$ svn log -qv --stop-on-copy file://$(pwd)/tags/gcc_3_0_release | grep 
"/tags/gcc_3_0_release \|/tags/gcc_3_0_release/gcc/expr.c 
\|/tags/gcc_3_0_release/gcc/reload.c "
   A /tags/gcc_3_0_release (from /trunk:39596)
   R /tags/gcc_3_0_release/gcc/expr.c (from 
/branches/gcc-3_0-branch/gcc/expr.c:43255)
   R /tags/gcc_3_0_release/gcc/reload.c (from 
/branches/gcc-3_0-branch/gcc/reload.c:42007)

IMO, from such history (absent external knowledge about better reparenting 
options) the best choice for the parent branch is /trunk@39596, not 
/branches/gcc-3_0-branch at a random revision taken from the replaced files.

Still, I see your point, and I will fix reparenting support.  Whether the GCC 
community opts to reparent or not is a different topic.

--
Maxim Kuvyrkov
https://www.linaro.org


> git blame expr.c:
> 
> ba0a9cb85431 (Richard Kenner 1992-03-03 23:34:57 +0000   396)     return temp;
> ba0a9cb85431 (Richard Kenner 1992-03-03 23:34:57 +0000   397)   }
> 5fbf0b0d5828 (no-author      2001-06-17 19:44:25 +0000   398)     /* Copy the address into a pseudo, so that the returned value
> 5fbf0b0d5828 (no-author      2001-06-17 19:44:25 +0000   399)        remains correct across calls to emit_queue.  */
> 5fbf0b0d5828 (no-author      2001-06-17 19:44:25 +0000   400)     XEXP (new, 0) = copy_to_reg (XEXP (new, 0));
> 59f26b7caad9 (Richard Kenner 1994-01-11 00:23:47 +0000   401)     return new;
> 
> git log 5fbf0b0d5828
> commit 5fbf0b0d5828687914c1c18a83ff

Re: Proposal for the transition timetable for the move to GIT

2020-01-08 Thread Maxim Kuvyrkov
> On Dec 30, 2019, at 7:08 PM, Richard Earnshaw (lists) 
>  wrote:
> 
> On 30/12/2019 15:49, Maxim Kuvyrkov wrote:
>>> On Dec 30, 2019, at 6:31 PM, Richard Earnshaw (lists) 
>>>  wrote:
>>> 
>>> On 30/12/2019 13:00, Maxim Kuvyrkov wrote:
>>>>> On Dec 30, 2019, at 1:24 AM, Richard Earnshaw (lists) 
>>>>>  wrote:
>>>>> 
>>>>> On 29/12/2019 18:30, Maxim Kuvyrkov wrote:
>>>>>> Below are several more issues I found in reposurgeon-6a conversion 
>>>>>> comparing it against gcc-reparent conversion.
>>>>>> 
>>>>>> I am sure, these and whatever other problems I may find in the 
>>>>>> reposurgeon conversion can be fixed in time.  However, I don't see why I 
>>>>>> should bother.  My conversion has been available since summer 2019, I 
>>>>>> made it ready in time for GCC Cauldron 2019, and it didn't change in any 
>>>>>> significant way since then.
>>>>>> 
>>>>>> With the "Missed merges" problem (see below) I don't see how reposurgeon 
>>>>>> conversion can be considered "ready".  Also, I expected a diligent 
>>>>>> developer to compare new conversion (aka reposurgeon's) against existing 
>>>>>> conversion (aka gcc-pretty / gcc-reparent) before declaring the new 
>>>>>> conversion "better" or even "ready".  The data I'm seeing in differences 
>>>>>> between my and reposurgeon conversions shows that gcc-reparent 
>>>>>> conversion is /better/.
>>>>>> 
>>>>>> I suggest that GCC community adopts either gcc-pretty or gcc-reparent 
>>>>>> conversion.  I welcome Richard E. to modify his summary scripts to work 
>>>>>> with svn-git scripts, which should be straightforward, and I'm ready to 
>>>>>> help.
>>>>>> 
>>>>> 
>>>>> I don't think either of these conversions is any more ready to use than
>>>>> the reposurgeon one, possibly less so.  In fact, there are still some
>>>>> major issues to resolve first before they can be considered.
>>>>> 
>>>>> gcc-pretty has completely wrong parent information for the gcc-3 era
>>>>> release tags, showing the tags as being made directly from trunk with
>>>>> massive deltas representing the roll-up of all the commits that were
>>>>> made on the gcc-3 release branch.
>>>> 
>>>> I will clarify the above statement, and please correct me where you think 
>>>> I'm wrong.  Gcc-pretty conversion has the exact right parent information 
>>>> for the gcc-3 era
>>>> release tags as recorded in SVN version history.  Gcc-pretty conversion 
>>>> aims to produce an exact copy of SVN history in git.  IMO, it manages to 
>>>> do so just fine.
>>>> 
>>>> It is a different thing that SVN history has a screwed up record of gcc-3 
>>>> era tags.
>>> 
>>> It's not screwed up in svn.  Svn shows the correct history information for 
>>> the gcc-3 era release tags, but the git-svn conversion in gcc-pretty does 
>>> not.
>>> 
>>> For example, looking at gcc_3_0_release in expr.c with git blame and svn 
>>> blame shows
>> 
>> In SVN history tags/gcc_3_0_release has been copied off /trunk:39596, and in 
>> the same commit a bunch of files were replaced from /branches/gcc-3_0-branch/ 
>> (and from different revisions of that branch!).
>> 
>> $ svn log -qv --stop-on-copy file://$(pwd)/tags/gcc_3_0_release | grep 
>> "/tags/gcc_3_0_release \|/tags/gcc_3_0_release/gcc/expr.c 
>> \|/tags/gcc_3_0_release/gcc/reload.c "
>>   A /tags/gcc_3_0_release (from /trunk:39596)
>>   R /tags/gcc_3_0_release/gcc/expr.c (from 
>> /branches/gcc-3_0-branch/gcc/expr.c:43255)
>>   R /tags/gcc_3_0_release/gcc/reload.c (from 
>> /branches/gcc-3_0-branch/gcc/reload.c:42007)
>> 
> 
> Right, (and wrong).  You have to understand how the release branches and
> tags are represented in CVS to understand why the SVN conversion is done
> this way.  When a branch was created in CVS a tag was added to each
> commit which would then be used in any future revisions along that
> branch.  But until a commit is made on that branch, the release branch
> is just a placeholder.
> 
> When a CVS release tag is created, the tag labels the relevant commit
&

Re: Proposal for the transition timetable for the move to GIT

2020-01-09 Thread Maxim Kuvyrkov
> On Jan 9, 2020, at 5:38 AM, Segher Boessenkool  
> wrote:
> 
> On Wed, Jan 08, 2020 at 11:34:32PM +, Joseph Myers wrote:
>> As noted on overseers, once Saturday's DATESTAMP update has run at 00:16 
>> UTC on Saturday, I intend to add a README.MOVED_TO_GIT file on SVN trunk 
>> and change the SVN hooks to make SVN readonly, then disable gccadmin's 
>> cron jobs that build snapshots and update online documentation until they 
>> are ready to run with the git repository.  Once the existing git mirror 
>> has picked up the last changes I'll make that read-only and disable that 
>> cron job as well, and start the conversion process with a view to having 
>> the converted repository in place this weekend (it could either be made 
>> writable as soon as I think it's ready, or left read-only until people 
>> have had time to do any final checks on Monday).  Before then, I'll work 
>> on hooks, documentation and maintainer-scripts updates.
> 
> Where and when and by who was it decided to use this conversion?

Joseph, please point to a message on the gcc@ mailing list that expresses the 
consensus of the GCC community to use the reposurgeon conversion.  Otherwise, it 
is not appropriate to substitute one's opinion for community consensus.

I want the GCC community to get the best possible conversion, be it mine or 
reposurgeon's.  To this end I'm comparing the two conversions and will post my 
results later today.

Unfortunately, the comparison is complicated by the fact that you uploaded only 
the "b" version of the gcc-reposurgeon-8 repository, which uses a modified 
branch layout.  (Or confirm that there are no substantial differences between 
the "7" and "8" reposurgeon conversions.)

--
Maxim Kuvyrkov
https://www.linaro.org



Re: Proposal for the transition timetable for the move to GIT

2020-01-10 Thread Maxim Kuvyrkov


> On Jan 10, 2020, at 10:33 AM, Maxim Kuvyrkov  
> wrote:
> 
>> On Jan 9, 2020, at 5:38 AM, Segher Boessenkool  
>> wrote:
>> 
>> On Wed, Jan 08, 2020 at 11:34:32PM +, Joseph Myers wrote:
>>> As noted on overseers, once Saturday's DATESTAMP update has run at 00:16 
>>> UTC on Saturday, I intend to add a README.MOVED_TO_GIT file on SVN trunk 
>>> and change the SVN hooks to make SVN readonly, then disable gccadmin's 
>>> cron jobs that build snapshots and update online documentation until they 
>>> are ready to run with the git repository.  Once the existing git mirror 
>>> has picked up the last changes I'll make that read-only and disable that 
>>> cron job as well, and start the conversion process with a view to having 
>>> the converted repository in place this weekend (it could either be made 
>>> writable as soon as I think it's ready, or left read-only until people 
>>> have had time to do any final checks on Monday).  Before then, I'll work 
>>> on hooks, documentation and maintainer-scripts updates.
>> 
>> Where and when and by who was it decided to use this conversion?
> 
> Joseph, please point to a message on the gcc@ mailing list that expresses the 
> consensus of the GCC community to use the reposurgeon conversion.  Otherwise, it 
> is not appropriate to substitute one's opinion for community consensus.
> 
> I want the GCC community to get the best possible conversion, be it mine or 
> reposurgeon's.  To this end I'm comparing the two conversions and will post 
> my results later today.
> 
> Unfortunately, the comparison is complicated by the fact that you uploaded 
> only the "b" version of the gcc-reposurgeon-8 repository, which uses a modified 
> branch layout.  (Or confirm that there are no substantial differences between 
> the "7" and "8" reposurgeon conversions.)

There are plenty of differences between the reposurgeon and svn-git conversions; 
today I've ignored subjective differences like author and committer entries and 
focused on comparing histories of branches.

Redhat's branches are among the most complicated, and the analysis below is 
difficult to follow.  It took me most of today to untangle it.  Let's look at 
redhat/gcc-9-branch.

TL;DR:
1. Reposurgeon conversion has extra history (more commits than intended) of 
redhat/gcc-4_7-branch@182541 merged into redhat/gcc-4_8-branch, which is then 
propagated into all following branches including redhat/gcc-9-branch.
2. Svn-git conversion has redhat/gcc-4_8-branch with history corresponding to 
SVN history, with no fewer and no more commits.
3. Other branches are likely to have similar issues; I didn't check.
4. I consider history of reposurgeon conversion to be incorrect.
5. The only history artifact (extra merges in reparented branches/tags) of 
svn-git conversion has been fixed.
6. I can appreciate that the GCC community is tired of this discussion and wants it 
to go away.

Analysis:
Commit histories for redhat/gcc-9-branch match up to history inherited from 
redhat/gcc-4_8-branch (yes, redhat's branch history goes into ancient 
branches).  So now we are looking at redhat/gcc-4_8-branch, and the two 
conversions have different commit histories for redhat/gcc-4_8-branch.  This is 
relevant because it shows up in the current development branch.  The histories 
diverge at r194477:

r194477 | jakub | 2012-12-13 13:34:44 + (Thu, 13 Dec 2012) | 3 lines

svn merge -r182540:182541 
svn+ssh://gcc.gnu.org/svn/gcc/branches/redhat/gcc-4_7-branch
svn merge -r182546:182547 
svn+ssh://gcc.gnu.org/svn/gcc/branches/redhat/gcc-4_7-branch

Added: svn:mergeinfo
## -0,0 +0,4 ##
   Merged /branches/redhat/gcc-4_4-branch:r143377,143388,144574,144578,155228
   Merged /branches/redhat/gcc-4_5-branch:r161595
   Merged /branches/redhat/gcc-4_6-branch:r168425
   Merged /branches/redhat/gcc-4_7-branch:r182541,182547


To me this looks like cherry-picks of r182541 and r182547 from 
redhat/gcc-4_7-branch into redhat/gcc-4_8-branch.

[1] Note that commit r182541 is itself a merge of redhat/gcc-4_6-branch@168425 
into redhat/gcc-4_7-branch plus cherry-picks from the other branches.  It is an 
actual merge (not a cherry-pick) from redhat/gcc-4_6-branch@168425 because 
r168425 is the only commit on redhat/gcc-4_6-branch not present on 
trunk.  The other branches had more commits in their histories, so they can't 
be represented as git merges.

The reposurgeon commit for r194477 (e601ffdd860b0deed6d7ce78da61e8964c287b0b) 
merges in the commit for r182541 from redhat/gcc-4_7-branch, bringing in the *full* 
history of redhat/gcc-

Re: Proposal for the transition timetable for the move to GIT

2020-01-10 Thread Maxim Kuvyrkov
> On Jan 10, 2020, at 6:15 PM, Joseph Myers  wrote:
> 
> On Fri, 10 Jan 2020, Maxim Kuvyrkov wrote:
> 
>> To me this looks like cherry-picks of r182541 and r182547 from 
>> redhat/gcc-4_7-branch into redhat/gcc-4_8-branch.
> 
> r182541 is the first commit on /branches/redhat/gcc-4_7-branch after it 
> was created as a copy of trunk.  I.e., merging and cherry-picking it are 
> indistinguishable, and it's entirely correct for reposurgeon to consider a 
> commit merging it as a merge from r182541 (together with a cherry-pick of 
> r182547).

I was wrong re. r182541; I didn't notice that it is the first commit on the branch. 
 This renders the analysis in favor of the reposurgeon conversion, not svn-git.

--
Maxim Kuvyrkov
https://www.linaro.org



Re: Stack protector: leak of guard's address on stack

2018-04-29 Thread Maxim Kuvyrkov

> On Apr 28, 2018, at 9:22 PM, Florian Weimer  wrote:
> 
> * Thomas Preudhomme:
> 
>> Yes absolutely, CSE needs to be avoided. I made memory access volatile
>> because the change was easier to do. Also on Arm Thumb-1 computing the
>> guard's address itself takes several loads so had to modify some more
>> patterns. Anyway, regardless of the proper fix, do you have any objection
>> to raising a CVE for that issue?
> 
> Please file a bug in Bugzilla first and use that in the submission to
> MITRE.

Thomas filed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85434 a couple of weeks 
ago.

--
Maxim Kuvyrkov
www.linaro.org


Re: Stack protector: leak of guard's address on stack

2018-05-01 Thread Maxim Kuvyrkov
> On Apr 29, 2018, at 2:11 PM, Florian Weimer  wrote:
> 
> * Maxim Kuvyrkov:
> 
>>> On Apr 28, 2018, at 9:22 PM, Florian Weimer  wrote:
>>> 
>>> * Thomas Preudhomme:
>>> 
>>>> Yes absolutely, CSE needs to be avoided. I made memory access volatile
>>>> because the change was easier to do. Also on Arm Thumb-1 computing the
>>>> guard's address itself takes several loads so had to modify some more
>>>> patterns. Anyway, regardless of the proper fix, do you have any objection
>>>> to raising a CVE for that issue?
>>> 
>>> Please file a bug in Bugzilla first and use that in the submission to
>>> MITRE.
>> 
>> Thomas filed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85434 a couple
>> of weeks ago.
> 
> Is there a generic way to find other affected targets?
> 
> If we only plan to fix 32-bit Arm, we should make the CVE identifier
> specific to that, to avoid confusion.

The problem is fairly target-dependent, so architecture maintainers need to 
look at how stack-guard canaries and their addresses are handled and whether 
they can be spilled onto the stack.

It appears we need to poll architecture maintainers before filing the CVE.

--
Maxim Kuvyrkov
www.linaro.org



Re: Devirtualization in gcc

2011-02-01 Thread Maxim Kuvyrkov
On Jan 26, 2011, at 3:27 AM, Ian Lance Taylor wrote:

> Black Bit  writes:
> 
>> Could someone tell me if the work described in this paper 
>> http://www.linuxsymposium.org/archives/GCC/Reprints-2006/namolaru-reprint.pdf
>>  was completed and is part of gcc?  Thanks 
>>
> 
> To the best of my knowledge the work has not yet become part of mainline
> gcc.  Perhaps the Haifa folks can correct me if I am wrong.

The approach described in the paper resembles the devirtualization optimizations 
Martin Jambor implemented as part of the IPA-CP pass.  AFAIK, the two 
implementations were separate efforts.

The implementation in current mainline does not define the lattice to track 
types as clearly as the paper does, but functionally it is very similar.  We 
(CodeSourcery) have patches that refactor the type propagation code in ipa-cp.c to 
clearly describe the type information lattice [*].  Having the information 
represented as a lattice is advantageous, as it makes it easier to reuse the 
devirtualization analysis in other optimization passes.

[*] http://gcc.gnu.org/ml/gcc/2010-12/msg00461.html

--
Maxim Kuvyrkov
CodeSourcery
+7-812-677-6839



Simplification of relational operations (was [patch for PR18942])

2011-12-01 Thread Maxim Kuvyrkov
Zdenek,

I'm looking at a missed optimization in combine that is similar to the one 
you fixed in PR18942 (http://thread.gmane.org/gmane.comp.gcc.patches/81504).

I'm trying to make GCC optimize
(leu:SI
  (plus:SI (reg:SI) (const_int -1))
  (const_int 1))

into

(leu:SI
  (reg:SI)
  (const_int 2))
.

Your patch for PR18942 handles only EQ/NE comparisons, and I wonder if there is 
a reason not to handle LEU/GEU and LTU/GTU comparisons as well.  I'm a bit fuzzy 
on whether signed comparisons can be optimized here as well, but I can't see a 
problem with unsigned comparisons.

Any reason why this optimization would be unsafe?

Regarding the testcase, the general pattern

(set (tmp1) (plus:SI (reg:SI) (const_int A))
(set (tmp2) (leu:SI (tmp1) (const_int B))

is generated from switch statement

switch (reg) {
  case A:
  case B:
  ...
}

Combine tries to merge the two instructions into one, but fails.  This causes an 
extra 'add' instruction per switch statement in the final assembly.  The target 
I'm working with is MIPS, but, I imagine, other architectures are affected as 
well.

Thank you,

--
Maxim Kuvyrkov
CodeSourcery / Mentor Graphics





Re: Simplification of relational operations (was [patch for PR18942])

2011-12-02 Thread Maxim Kuvyrkov
On 2/12/2011, at 9:45 PM, Jakub Jelinek wrote:

> On Fri, Dec 02, 2011 at 03:33:06PM +1300, Maxim Kuvyrkov wrote:
>> I'm looking at a missed optimization in combine that is similar to the
>> one you fixed in PR18942
>> (http://thread.gmane.org/gmane.comp.gcc.patches/81504).
>> 
>> I'm trying to make GCC optimize
>> (leu:SI
>>  (plus:SI (reg:SI) (const_int -1))
>>  (const_int 1))
>> 
>> into
>> 
>> (leu:SI
>>  (reg:SI)
>>  (const_int 2))
>> .
>> 
>> Your patch for PR18942 handles only EQ/NE comparisons, and I wonder if
>> there is a reason not to handle LEU/GEU, LTU/GTU comparisons as well.  I'm
>> a bit fuzzy whether signed comparisons can be optimized here as well, but
>> I can't see the problem with unsigned comparisons.
> 
> Consider reg:SI being 0?  Then (leu:SI (plus:SI (reg:SI) (const_int -1)) 
> (const_int 1))
> is 0, but (leu:SI (reg:SI) (const_int 2)) is 1.
> You could transform this if you have a guarantee that reg:SI will not be 0
> (and, in your general
> 
>> Regarding the testcase, the general pattern
>> 
>> (set (tmp1) (plus:SI (reg:SI) (const_int A))
>> (set (tmp2) (leu:SI (tmp1) (const_int B))
> 
> case that reg:SI isn't 0 .. A-1).

Jakub,
Zdenek,

Thank you for explaining the overflow in comparisons.  In fact, the unsigned 
overflow is intentionally used in the expansion of switch statements to catch the 
'default:' case of certain switch statements with just one conditional branch.

--
Maxim Kuvyrkov
CodeSourcery / Mentor Graphics



Re: [SH] ICE compiling pr34330 testcase for sh-linux-gnu

2009-07-14 Thread Maxim Kuvyrkov

Andrew Stubbs wrote:

I'm having trouble with an ICE, and I'm hoping somebody can enlighten me.

Given the following command:

cc1 -fpreprocessed ../pr34330.i -quiet -dumpbase pr34330.c -da -mb 
-auxbase-strip pr34330.c -Os -version -ftree-parallelize-loops=4 
-ftree-vectorize -o pr34330.s -fschedule-insns


I get an internal compiler error:

GNU C (GCC) version 4.5.0 20090702 (experimental) (sh-linux-gnu)
compiled by GNU C version 4.3.2, GMP version 4.3.1, MPFR version 
2.4.1-p5, MPC version 0.6

GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
GNU C (GCC) version 4.5.0 20090702 (experimental) (sh-linux-gnu)
compiled by GNU C version 4.3.2, GMP version 4.3.1, MPFR version 
2.4.1-p5, MPC version 0.6

GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
Compiler executable checksum: c91a929a0209c0670a3ae8b8067b9f9a
/scratch/ams/4.4-sh-linux-gnu-lite/src/gcc-trunk-4.4/gcc/testsuite/gcc.dg/torture/pr34330.c: 
In function 'foo':
/scratch/ams/4.4-sh-linux-gnu-lite/src/gcc-trunk-4.4/gcc/testsuite/gcc.dg/torture/pr34330.c:22:1: 
error: insn does not satisfy its constraints:
(insn 171 170 172 4 /scratch/ams/4.4-sh-linux-gnu-lite/src/gcc-trunk-4.4/gcc/testsuite/gcc.dg/torture/pr34330.c:17
     (set (reg:SI 9 r9)
          (plus:SI (reg:SI 8 r8)
                   (reg:SI 0 r0 [orig:243 ivtmp.11 ] [243]))) 35 {*addsi3_compact} (nil))
/scratch/ams/4.4-sh-linux-gnu-lite/src/gcc-trunk-4.4/gcc/testsuite/gcc.dg/torture/pr34330.c:22:1: 
internal compiler error: in reload_cse_simplify_operands, at 
postreload.c:396


This looks much like PR37053 on m68k/ColdFire; the easiest way to check 
if this ICE was caused by the same error is to revert the hunk in rtlanal.c: 
commutative_operand_precedence() -- see the PR.


As to the fix, there are several patches being discussed here 
(http://gcc.gnu.org/ml/gcc-patches/2009-07/msg00816.html) and here 
(http://gcc.gnu.org/ml/gcc-patches/2009-07/msg00823.html).


My $0.02.

--
Maxim


[RFA] dwarf2out.c:eliminate_regs() bug

2009-09-20 Thread Maxim Kuvyrkov
I'm investigating an ICE on the m68k architecture.  I'm not quite sure what 
the right way to fix the bug is, so I welcome any feedback on the 
analysis below.


Compilation fails on the assert in dwarf2out.c:based_loc_descr():

  /* We only use "frame base" when we're sure we're talking about the
 post-prologue local stack frame.  We do this by *not* running
 register elimination until this point, and recognizing the special
 argument pointer and soft frame pointer rtx's.  */
  if (reg == arg_pointer_rtx || reg == frame_pointer_rtx)
{
  rtx elim = eliminate_regs (reg, VOIDmode, NULL_RTX);

  if (elim != reg)
{
  if (GET_CODE (elim) == PLUS)
{
  offset += INTVAL (XEXP (elim, 1));
  elim = XEXP (elim, 0);
}
  gcc_assert ((SUPPORTS_STACK_ALIGNMENT
   && (elim == hard_frame_pointer_rtx
   || elim == stack_pointer_rtx))
  || elim == (frame_pointer_needed
  ? hard_frame_pointer_rtx
  : stack_pointer_rtx));


This code uses eliminate_regs(), which implicitly assumes 
reload_completed, since it uses reg_eliminate[], which in turn assumes that 
frame_pointer_needed is properly set (this happens in ira.c).  However, 
in some cases this piece of based_loc_descr() can be reached during the 
inlining pass (see backtrace below).  When called before reload, 
eliminate_regs() may return an inconsistent result, which is why the 
assert in based_loc_descr() fails.  In the particular testcase I'm 
investigating, frame_pointer_needed is 0 (the initial value), but 
eliminate_regs returns stack_pointer_rtx because it is guided by 
reg_eliminate information from the previous function, which had 
frame_pointer_needed set to 1.


Now, how do we fix this?  For starters, it seems to be a good idea to 
assert (reload_in_progress || reload_completed) in eliminate_regs.  Then, 
there are users of eliminate_regs in dbxout.c, dwarf2out.c, and sdbout.c, 
not counting reload and machine-specific parts.  Of the three *out.c 
files, only dwarf2out.c handles abstract functions, which is what 
causes it to be called before reload AFAIK, so the task seems to be 
fixing the dwarf2out code.


There are two references to eliminate_regs in dwarf2out.c.  The first 
-- in based_loc_descr -- can *probably* be handled by adding 
reload_completed to the 'if' condition.  The second is in 
compute_frame_pointer_to_fb_displacement.  I'm no expert in dwarf2out.c 
code, but from the looks of it, compute_..._displacement 
is only called after reload, so a simple gcc_assert (reload_completed) 
may be enough there.


One last note, I'm investigating this bug against 4.4 branch as it 
doesn't trigger on the mainline.  Progression search on the mainline 
showed that failure became latent after this patch 
(http://gcc.gnu.org/viewcvs?view=revision&revision=147436) to inlining 
heuristics.


--
Maxim K.
CodeSourcery

The backtrace:
#0  eliminate_regs_1 (x=0xf7d60280, mem_mode=VOIDmode, insn=0x0, 
may_use_invariant=0 '\0') at gcc/reload1.c:2481
#1  0x0839e9b1 in eliminate_regs (x=0xf7d60280, mem_mode=VOIDmode, 
insn=0x0) at gcc/reload1.c:2870
#2  0x0821cf66 in based_loc_descr (reg=0xf7d60280, offset=8, 
initialized=VAR_INIT_STATUS_INITIALIZED) at gcc/dwarf2out.c:9868
#3  0x0821d7a7 in mem_loc_descriptor (rtl=0xf700bd98, mode=SImode, 
initialized=VAR_INIT_STATUS_INITIALIZED) at gcc/dwarf2out.c:10158
#4  0x0821dd55 in loc_descriptor (rtl=0xf700bc90, 
initialized=VAR_INIT_STATUS_INITIALIZED) at gcc/dwarf2out.c:10330
#5  0x0821ddde in loc_descriptor (rtl=0xf700d7a0, 
initialized=VAR_INIT_STATUS_INITIALIZED) at gcc/dwarf2out.c:10349
#6  0x082205d6 in add_location_or_const_value_attribute (die=0xf702ad20, 
decl=0xf73922d0, attr=DW_AT_location) at gcc/dwarf2out.c:11841
#7  0x08223412 in gen_formal_parameter_die (node=0x0, origin=0xf73922d0, 
context_die=0xf702ace8) at gcc/dwarf2out.c:13349
#8  0x082273c6 in gen_decl_die (decl=0x0, origin=0xf73922d0, 
context_die=0xf702ace8) at gcc/dwarf2out.c:15388
#9  0x082268aa in process_scope_var (stmt=0xf7163620, decl=0x0, 
origin=0xf73922d0, context_die=0xf702ace8) at gcc/dwarf2out.c:14969
#10 0x0822698d in decls_for_scope (stmt=0xf7163620, 
context_die=0xf702ace8, depth=5) at gcc/dwarf2out.c:14993
#11 0x08225192 in gen_lexical_block_die (stmt=0xf7163620, 
context_die=0xf702a498, depth=5) at gcc/dwarf2out.c:14266
#12 0x082253b5 in gen_inlined_subroutine_die (stmt=0xf7163620, 
context_die=0xf702a498, depth=5) at gcc/dwarf2out.c:14308
#13 0x08226711 in gen_block_die (stmt=0xf7163620, 
context_die=0xf702a498, depth=5) at gcc/dwarf2out.c:14935
#14 0x082269ee in decls_for_scope (stmt=0xf7163038, 
context_die=0xf702a498, depth=4) at gcc/dwarf2out.c:15005
#15 0x08225192 in gen_lexical_block_die (stmt=0xf7163038, 
context_die=0xf7026f18, depth=4) at gcc/dwarf2out.c:14266
#16 0x0822672c in gen_block_die (stmt=0xf7163038, 
c

Re: question about speculative scheduling in gcc

2009-09-20 Thread Maxim Kuvyrkov

Amker.Cheng wrote:

Hi :
I'm puzzled when looking into speculative scheduling in gcc, the 4.2.4 version.

First, I noticed the document describing the IBM Haifa instruction
scheduler (the PowerPC Reference Compiler Optimization Project).

It states that the instruction motion from bb s (dominated by t)
to t is speculative when split_blocks(s, t) is not empty.

Second, there are SCHED_FLAGS like DO_SPECULATION in the code.


These are two different types of speculative optimizations.



Here go the questions.
1. Does the DO_SPECULATION flag control whether the mentioned
speculative motion is done or not?


The DO_SPECULATION flag controls generation of IA64 data- and control-speculative 
instructions.  It is not used on other architectures.


Speculative instruction moves from the split blocks are controlled by 
flag_schedule_speculative.


--
Maxim


Re: [RFA] dwarf2out.c:eliminate_regs() bug

2009-09-20 Thread Maxim Kuvyrkov

Richard Guenther wrote:

On Sun, Sep 20, 2009 at 9:38 AM, Maxim Kuvyrkov  wrote:

...

This code uses eliminate_regs(), which implicitly assumes reload_completed
as it uses reg_eliminate[], which assumes that frame_pointer_needed is
properly set, which happens in ira.c.  However, in some cases this piece of
based_loc_descr() can be reached during the inlining pass (see backtrace below).
 When called before reload, eliminate_regs() may return an inconsistent
result, which is why the assert in based_loc_descr() fails.  In the
particular testcase I'm investigating, frame_pointer_needed is 0 (initial
value), but eliminate_regs returns stack_pointer_rtx because it is guided by
reg_eliminate information from the previous function which had
frame_pointer_needed set to 1.

...

I think you should avoid calling eliminate_regs for DECL_ABSTRACT
current_function_decl.  That should cover the inliner path.


Thanks for the insight.  Do you mean something like the attached patch?

--
Maxim
Index: gcc/dwarf2out.c
===
--- gcc/dwarf2out.c (revision 261914)
+++ gcc/dwarf2out.c (working copy)
@@ -9862,8 +9862,11 @@ based_loc_descr (rtx reg, HOST_WIDE_INT 
   /* We only use "frame base" when we're sure we're talking about the
  post-prologue local stack frame.  We do this by *not* running
  register elimination until this point, and recognizing the special
- argument pointer and soft frame pointer rtx's.  */
-  if (reg == arg_pointer_rtx || reg == frame_pointer_rtx)
+ argument pointer and soft frame pointer rtx's.
+ We might get here during the inlining pass (DECL_ABSTRACT is true then),
+ so don't try eliminating registers in such a case.  */
+  if (!DECL_ABSTRACT (current_function_decl)
+  && (reg == arg_pointer_rtx || reg == frame_pointer_rtx))
 {
   rtx elim = eliminate_regs (reg, VOIDmode, NULL_RTX);
 
@@ -12224,6 +12227,9 @@ compute_frame_pointer_to_fb_displacement
   offset += ARG_POINTER_CFA_OFFSET (current_function_decl);
 #endif
 
+  /* Make sure we don't try eliminating registers in abstract function.  */
+  gcc_assert (!DECL_ABSTRACT (current_function_decl));
+
   elim = eliminate_regs (reg, VOIDmode, NULL_RTX);
   if (GET_CODE (elim) == PLUS)
 {
Index: gcc/reload1.c
===
--- gcc/reload1.c   (revision 261914)
+++ gcc/reload1.c   (working copy)
@@ -2867,6 +2867,7 @@ eliminate_regs_1 (rtx x, enum machine_mo
 rtx
 eliminate_regs (rtx x, enum machine_mode mem_mode, rtx insn)
 {
+  gcc_assert (reload_in_progress || reload_completed);
   return eliminate_regs_1 (x, mem_mode, insn, false);
 }
 


Re: [RFA] dwarf2out.c:eliminate_regs() bug

2009-10-08 Thread Maxim Kuvyrkov

Richard Guenther wrote:
...

Yes, though we should probably try to catch the DECL_ABSTRACT case
further up the call chain - there shouldn't be any location lists for an
abstract function.  Thus, see why

static dw_die_ref
gen_formal_parameter_die (tree node, tree origin, dw_die_ref context_die)
...
  if (! DECL_ABSTRACT (node_or_origin))
add_location_or_const_value_attribute (parm_die, node_or_origin,
   DW_AT_location);

the node_or_origin of the param isn't DECL_ABSTRACT.  In the end the
above check should have avoided the situation you run into.


The node_or_origin (== origin) turned out to be the 'this' pointer.  It 
came from BLOCK_NONLOCALIZED_VARs in decls_for_scope():


static void
decls_for_scope (tree stmt, dw_die_ref context_die, int depth)
{
...
   for (i = 0; i < BLOCK_NUM_NONLOCALIZED_VARS (stmt); i++)
 process_scope_var (stmt, NULL, BLOCK_NONLOCALIZED_VAR (stmt, i),
context_die);
...
}

set_decl_abstract_flags() doesn't seem to process 
BLOCK_NONLOCALIZED_VARs.  From what I gather, this is correct behavior.


At this point I got the feeling that something is clobbering the 
information.  There is this patch by Honza 
(http://gcc.gnu.org/viewcvs/trunk/gcc/dwarf2out.c?r1=151901&r2=151917) 
that fixes a clobbering issue with abstract functions.  Backporting it 
to my sources fixed the problem, yay!


Honza, does the bug you've fixed with the above patch resemble the 
problem I've stumbled into?


Regards,

--
Maxim K.


Re: How to define 2 bypasses for a single pair of insn_reservation

2009-01-05 Thread Maxim Kuvyrkov

Vladimir Makarov wrote:

Ye, Joey wrote:


...


Anyone can help me through this please?
  
It was supposed to have two latency definitions at most (one in 
define_insn_reservation and another one in define_bypass).  At the time that 
seemed enough for all processors supported by GCC.  It also simplified 
the semantics definition when two bypass conditions return true for the 
same insn pair.


If you really need more than one bypass for an insn pair, I could implement 
this.  Please, let me know.  In this case the semantics of choosing the latency 
time could be


o the time in the first bypass occurring in the pipeline description whose 
condition returns true

o the time given in define_insn_reservation


I had a similar problem with the ColdFire V4 scheduler model, and the 
solution for me was using the adjust_cost() target hook; it is a bit 
complicated, but it works fine.  Search m68k.c for 'bypass' for more 
information; the comments there describe the thing in sufficient detail.


--
Maxim


Re: How to define 2 bypasses for a single pair of insn_reservation

2009-01-06 Thread Maxim Kuvyrkov

Ye, Joey wrote:

Maxim and Vladimir Wrote:

Anyone can help me through this please?
  
It was supposed to have two latency definitions at most (one in 
define_insn_reservation and another one in define_bypass).  At the time that 
seemed enough for all processors supported by GCC.  It also simplified 
the semantics definition when two bypass conditions return true for the 
same insn pair.


If you really need more than one bypass for an insn pair, I could implement 
this.  Please, let me know.  In this case the semantics of choosing the latency 
time could be


o the time in the first bypass occurring in the pipeline description whose 
condition returns true

o the time given in define_insn_reservation

I had a similar problem with the ColdFire V4 scheduler model, and the 
solution for me was using the adjust_cost() target hook; it is a bit 
complicated, but it works fine.  Search m68k.c for 'bypass' for more 
information; the comments there describe the thing in sufficient detail.



Maxim, I read your implementation in m68k.c. IMHO it is a smart but
tricky solution. For example, it depends on the assumption that
targetm.sched.adjust_cost () is immediately called after bypass_p().


Yes, it does depend on this assumption and the comment states exactly that.


Also, the redundant check and the calls to min_insn_conflict_delay look
inefficient.


Which check[s] do you have in mind, the gcc_assert's?  Also, out of 
curiosity, what is inefficient about the use of min_insn_conflict_delay?


For the record, min_insn_conflict_delay has nothing to do with emulating 
two bypasses; this tweak makes the scheduler faster by not adding 
instructions to the ready list, which lets haifa-sched.c:max_issue() do 
its exhaustive-like search on a smaller set.



I'd prefer to extend semantics to support more than one
bypass.


Don't get me wrong, I'm not against adding support for N>1 bypasses; it 
is just not that easy. ;)


--
Maxim


Re: About Hazard Recognizer:DFA

2009-01-07 Thread Maxim Kuvyrkov

daniel tian wrote:

Hi Dr. Uday Khedker:
   Happy New Year!
   I met a hazard problem and have debugged this error for a few
days. I wrote a DFA to avoid the load hazard, but it still exists. I wonder
whether, by default, the command './cc1 hazard.c' doesn't compile the
file with the DFA.


By default the scheduler is enabled starting at -O2, so this should at 
least be './cc1 -O2 hazard.c'.  Of course, you should also add generation 
of nops, as Vladimir said, either in the machine-dependent reorg or in the 
assembler.  Also, scheduler dumps may be helpful for you; they can be 
enabled via the -fsched-verbose=N switch.


--
Maxim



Re: scheduling question

2009-05-07 Thread Maxim Kuvyrkov

Alex Turjan wrote:

Hi,
During scheduling I'm confronted with the fact that an instruction is moved
from the ready list to the queue with cost 2, while according to my
expectations the insn should have been moved to the queue with cost 1.

Did anybody experience a similar problem?


From what you described it's not clear what the problem is.  When the 
scheduler infers that an instruction cannot be scheduled in the next N 
cycles (due to a DFA hazard, insn_cost/dep_cost hook considerations, or 
something else), it queues the instruction on the (N+1)-th cycle.



In case an insn is ready but cannot be scheduled in the current
cycle, is it correct (i.e. is the generated code correct) to move the insn
to the queue list with cost 1, no matter what value >= 1 is
returned by state_transition?


Yes, that would be correct from a code-correctness point of view, but 
state_transition() *will* make the scheduler requeue the instruction on 
the next cycle, so you will just lose compile time.


It seems to me that moving from the ready list to the queue with cost >= 1 
is an optimization for compilation time.


Correct, the scheduler would work unnecessarily long otherwise.

--
Maxim


Re: sched2, ret, use, and VLIW bundling

2009-06-08 Thread Maxim Kuvyrkov

DJ Delorie wrote:

I'm working on a VLIW coprocessor for MeP.  One thing I noticed is
that sched2 won't bundle the function's RET with the insn that sets
the return value register, apparently because there's an intervening
USE of that register (insn 30 in the example below).

Is there any way around this?  The return value obviously isn't
actually used there, nor does the return insn need it - that USE is
just to keep the return value live until the function exits.


The problem may be that the dependency cost between the SET (insn 27) and 
the USE (insn 30) is >= 1.  Have you tried using the 
targetm.sched.adjust_cost() hook to set the cost of the USE to 0?


Anyway, this seems strange; the scheduler should just output the USEs as 
soon as they are ready.  One of the few places where this can be forced 
untrue is the targetm.sched.dfa_new_cycle() hook; does your port define it?


--
Maxim


Performance optimizations for Intel Core 2 and Core i7 processors

2010-05-16 Thread Maxim Kuvyrkov
CodeSourcery is working on improving performance for Intel's Core 2 and 
Core i7 families of processors.


CodeSourcery plans to add support for unaligned vector instructions, to 
provide fine-tuned scheduling support and to update instruction 
selection and instruction cost models for Core i7 and Core 2 families of 
processors.


As usual, CodeSourcery will be contributing its work to GCC.  Currently, 
our target is the end of GCC 4.6 Stage1.


If your favorite benchmark significantly under-performs on Core 2 or 
Core i7 CPUs, don't hesitate to ask us to take a look at it.


We appreciate Intel sponsoring this project.


Thank you,

--
Maxim Kuvyrkov
CodeSourcery
ma...@codesourcery.com
(650) 331-3385 x724


Re: Performance optimizations for Intel Core 2 and Core i7 processors

2010-05-20 Thread Maxim Kuvyrkov

On 5/20/10 4:04 PM, Steven Bosscher wrote:

On Mon, May 17, 2010 at 8:44 AM, Maxim Kuvyrkov  wrote:

CodeSourcery is working on improving performance for Intel's Core 2 and Core
i7 families of processors.

CodeSourcery plans to add support for unaligned vector instructions, to
provide fine-tuned scheduling support and to update instruction selection
and instruction cost models for Core i7 and Core 2 families of processors.

As usual, CodeSourcery will be contributing its work to GCC.  Currently, our
target is the end of GCC 4.6 Stage1.

If your favorite benchmark significantly under-performs on Core 2 or Core i7
CPUs, don't hesitate asking us to take a look at it.


I'd like to ask you to look at ffmpeg (missed core2 vectorization
opportunities), polyhedron (PR34501, like, duh! :-), and Apache
benchmark (-mtune=core2 results in lower scores).

You could check overall effects on an openly available benchmark suite
such as http://www.phoronix-test-suite.com/


Thank you for the pointers!

--
Maxim Kuvyrkov
CodeSourcery
ma...@codesourcery.com
(650) 331-3385 x724


Re: Performance optimizations for Intel Core 2 and Core i7 processors

2010-05-26 Thread Maxim Kuvyrkov

On 5/21/10 9:06 PM, Vladimir N. Makarov wrote:

On 05/17/2010 02:44 AM, Maxim Kuvyrkov wrote:

...

If your favorite benchmark significantly under-performs on Core 2 or
Core i7 CPUs, don't hesitate asking us to take a look at it.

What I saw is people complaining about -mtune=core2 for polyhedron

http://gcc.gnu.org/ml/gcc-patches/2010-02/msg01272.html

The biggest complaint was on mdbx (about 16%).


Thank you for the pointers and analysis!

...

Also I think it is important to have a pipeline description for
Core2/Core i7 or at least to use it from generic.


Right.  We will be adding a pipeline description for Core 2/i7.

Thank you,

--
Maxim Kuvyrkov
CodeSourcery
ma...@codesourcery.com
(650) 331-3385 x724


Re: Problem configuring uclinux toolchain

2010-07-09 Thread Maxim Kuvyrkov

On 7/9/10 3:22 PM, Anthony Green wrote:

Hi Maxim,

Recent changes to config.gcc are preventing me from building a
moxie-uclinux toolchain.


Anthony,

What is the error the build process is failing on?

Thanks,

--
Maxim Kuvyrkov
CodeSourcery
ma...@codesourcery.com
(650) 331-3385 x724


Re: Revisiting the use of cselib in alias.c for scheduling

2010-07-21 Thread Maxim Kuvyrkov

On 7/21/10 6:44 PM, Bernd Schmidt wrote:

On 07/21/2010 03:06 PM, Steven Bosscher wrote:

3. GCC now has better alias analysis than it used to, especially with
the alias-exporting stuff that exports the GIMPLE points-to analysis
results, but also just all the other little things that were
contributed over the last 10 years (little things like tree-ssa :)

[...]

It looks like ~9% extra !true_dependence cases are found with cselib,
which is not insignificant:

...

If that can't be improved, I think that rather than remove cselib from
the scheduler, the question should be: if it's useful, why don't we use
it for other schedulers rather than only sched-ebb?


Cselib can /always/ be used during the second scheduling pass and on 
single-block regions during the first scheduling pass (after RA, 
sched-rgn operates on single-block regions).


Modulo the bugs enabling cselib might surface, the only reason not to 
enable cselib for single-block regions in sched-rgn may be increased 
compile time.  That requires some benchmarking, but my gut feeling is 
that the benefits would outweigh the compile-time cost.


--
Maxim Kuvyrkov
CodeSourcery
ma...@codesourcery.com
(650) 331-3385 x724


Re: Revisiting the use of cselib in alias.c for scheduling

2010-07-22 Thread Maxim Kuvyrkov

On 7/22/10 3:34 AM, Steven Bosscher wrote:

On Wed, Jul 21, 2010 at 10:09 PM, Maxim Kuvyrkov  wrote:

Cselib can /always/ be used during second scheduling pass


Except with the selective scheduler when it works on regions that are
not extended basic blocks, I suppose?


Right, I was considering sched-rgn scheduler, not sel-sched.




and on
single-block regions during the first scheduling pass (after RA sched-rgn
operates on single-block regions).

Modulo the bugs enabling cselib might surface, the only reason not to enable
cselib for single-block regions in sched-rgn may be increased compile time.
  That requires some benchmarking, but my gut feeling is that the benefits
would outweigh the compile-time cost.


So something like the following _should_ work? If so, I'll give it a
try on x86*.

Ciao!
Steven

Index: sched-rgn.c
===
--- sched-rgn.c (revision 162355)
+++ sched-rgn.c (working copy)
@@ -3285,8 +3285,11 @@
  rgn_setup_sched_infos (void)
  {
if (!sel_sched_p ())
-memcpy (&rgn_sched_deps_info,&rgn_const_sched_deps_info,
-   sizeof (rgn_sched_deps_info));
+{
+  memcpy (&rgn_sched_deps_info,&rgn_const_sched_deps_info,
+ sizeof (rgn_sched_deps_info));
+  rgn_sched_deps_info.use_cselib = reload_completed;



Yes, this should work.  You can also enable cselib for single-block 
regions for first scheduling pass too.  I.e.,


index 89743c3..047b717 100644
--- a/gcc/sched-rgn.c
+++ b/gcc/sched-rgn.c
@@ -2935,6 +2935,9 @@ schedule_region (int rgn)
   if (sched_is_disabled_for_current_region_p ())
 return;

+  gcc_assert (!reload_completed || current_nr_blocks == 1);
+  rgn_sched_deps_info.use_cselib = (current_nr_blocks == 1);
+
   sched_rgn_compute_dependencies (rgn);

   sched_rgn_local_init (rgn);

Thanks,

--
Maxim Kuvyrkov
CodeSourcery
ma...@codesourcery.com
(650) 331-3385 x724


Re: Performance optimizations for Intel Core 2 and Core i7 processors

2010-08-17 Thread Maxim Kuvyrkov

On 8/13/10 11:40 PM, Jack Howarth wrote:

On Mon, May 17, 2010 at 10:44:57AM +0400, Maxim Kuvyrkov wrote:

CodeSourcery is working on improving performance for Intel's Core 2 and
Core i7 families of processors.

CodeSourcery plans to add support for unaligned vector instructions, to
provide fine-tuned scheduling support and to update instruction
selection and instruction cost models for Core i7 and Core 2 families of
processors.

As usual, CodeSourcery will be contributing its work to GCC.  Currently,
our target is the end of GCC 4.6 Stage1.

If your favorite benchmark significantly under-performs on Core 2 or
Core i7 CPUs, don't hesitate asking us to take a look at it.

We appreciate Intel sponsoring this project.


Maxim,
 Do you have any updates on the progress of this project? Since
it has been proposed to default intel darwin to -mtune=core2, it
would be very helpful to be able to test (benchmark) any proposed
changes on x86_64-apple-darwin10 with gcc trunk. Thanks in advance.


Jack,

We will start posting patches very soon.  Bernd Schmidt has almost 
finished the pipeline model for Core 2/i7, so that will be the first piece 
of work we'll post for upstream review.


Regards,

--
Maxim Kuvyrkov
CodeSourcery
ma...@codesourcery.com
(650) 331-3385 x724


Re: Questions about selective scheduler and PowerPC

2010-10-19 Thread Maxim Kuvyrkov

On 10/19/10 6:16 PM, Andrey Belevantsev wrote:
...

I agree that ISSUE_POINTS can be removed, as it was not used (maybe
Maxim can comment more on this).


I too agree with removing ISSUE_POINTS; it never found any use.

Regards,

--
Maxim Kuvyrkov
CodeSourcery
ma...@codesourcery.com
(650) 331-3385 x724


Loop-iv.c ICEs on subregs

2010-11-23 Thread Maxim Kuvyrkov
Zdenek,

I'm investigating an ICE in loop-iv.c:get_biv_step().  I hope you can shed some 
light on what the correct fix would be.

The ICE happens when processing:
==
(insn 111 (set (reg:SI 304)
   (plus (subreg:SI (reg:DI 251) 4)
 (const_int 1))))

(insn 177 (set (subreg:SI (reg:DI 251))
   (reg:SI 304)))
==

Code like the above does not occur on current mainline early enough for 
loop-iv.c to catch it.  The subregs above are produced by Tom's (CC'ed) 
extension elimination pass (scheduled before fwprop1) which is not in mainline 
yet [*].

The failure is due to assert in loop-iv.c:get_biv_step():
==
gcc_assert ((*inner_mode == *outer_mode) != (*extend != UNKNOWN));
==
i.e., inner and outer modes can differ iff there's an extend in the chain. 

Get_biv_step_1() starts with insn 177, then gets to insn 111, then loops back 
to insn 177, at which point it stops, returns GRD_MAYBE_BIV, and sets:

* outer_mode == DImode == natural mode of (reg A);

* inner_mode == SImode == mode of (subreg (reg A)), set in get_biv_step_1:
==
  if (GET_CODE (next) == SUBREG)
{
  enum machine_mode amode = GET_MODE (next);

  if (GET_MODE_SIZE (amode) > GET_MODE_SIZE (*inner_mode))
return false;

  *inner_mode = amode;
  *inner_step = simplify_gen_binary (PLUS, outer_mode,
 *inner_step, *outer_step);
  *outer_step = const0_rtx;
  *extend = UNKNOWN;
}
==

* extend == UNKNOWN as there are no extensions in the chain.

It seems to me that the computations of outer_mode and extend are correct; 
I'm not sure about inner_mode.

Zdenek, what do you think is the right way to handle the above case in loop 
analysis?

[*] http://gcc.gnu.org/ml/gcc-patches/2010-10/msg01529.html

Thanks,

--
Maxim Kuvyrkov
CodeSourcery
+1-650-331-3385 x724



Re: Loop-iv.c ICEs on subregs

2010-11-25 Thread Maxim Kuvyrkov
On Nov 26, 2010, at 3:51 AM, Zdenek Dvorak wrote:

> Hi,
> 
>> I'm investigating an ICE in loop-iv.c:get_biv_step().  I hope you can shed 
>> some light on what the correct fix would be.
>> 
>> The ICE happens when processing:
>> ==
>> (insn 111 (set (reg:SI 304)
>>   (plus (subreg:SI (reg:DI 251) 4)
>> (const_int 1))))
>> 
>> (insn 177 (set (subreg:SI (reg:DI 251))
>>   (reg:SI 304)))
>> ==
...
> 
> loop iv analysis currently does not handle assignments to subregs
...
> So, if such a code
> gets produced consistently for a large fraction of the loops, it would be
> necessary to teach loop-iv to analyze induction variables represented in
> subregs.  This would mean a more involved rewrite of loop-iv, though,

I see.  In that case a simpler way to fix the problem may be to move Tom's 
extension elimination pass /after/ loop optimizers.  Do you (or anyone reading 
this thread) have suggestions on what would be a good spot in the optimization 
pipeline for a sign- and zero-extension elimination pass?

Thanks,

--
Maxim Kuvyrkov
CodeSourcery
+1-650-331-3385 x724



Account for devirtualization opportunities in inlining heuristics

2010-12-27 Thread Maxim Kuvyrkov
Jan,

Here are the testcases for inlining improvements we've discussed on IRC a 
couple of days ago.

Current mainline handles inline-devirt-1.C and inline-devirt-5.C cases.  With 
my w-i-p patches to teach inlining heuristics about devirtualization 
opportunities (also attached) inline-devirt-2.C, inline-devirt-3.C are also 
fully optimized.

Let me know if you have suggestions for tackling the other cases.

Do you think committing the testcases to mainline, XFAIL'ed as necessary, 
would be useful?

Thanks,

--
Maxim Kuvyrkov
CodeSourcery
+1-650-331-3385 x724



0005-Testcases.patch
Description: Binary data


0002-Refactor-ipa-cp.c-to-operate-on-type-lattices.ChangeLog
Description: Binary data


0002-Refactor-ipa-cp.c-to-operate-on-type-lattices.patch
Description: Binary data


0003-Fix-memory-leak.ChangeLog
Description: Binary data


0003-Fix-memory-leak.patch
Description: Binary data


0004-Account-for-devirtualization-in-inlining-heuristics.ChangeLog
Description: Binary data


0004-Account-for-devirtualization-in-inlining-heuristics.patch
Description: Binary data


Re: Generalize ready list sorting via heuristics in rank_for_schedule.

2006-02-01 Thread Maxim Kuvyrkov

Peter Steinmetz wrote:

Currently, within the ready_sort macro in haifa-sched.c, the call to qsort
is passed "rank_for_schedule" to help it decide which of two instructions
should be placed further towards the front of the ready list.
Rank_for_schedule uses a set of ordered heuristics (rank, priority, etc.)
to make this decision.  The set of heuristics is fixed for all target
machines.
There already are two target hooks specifically for this purpose: 
targetm.sched.{reorder, reorder2}.  They both have higher priority than 
ready_sort ().



There can be cases, however, where a target machine may want to define
heuristics driven by specific characteristics of that machine.  Those
heuristics may be meaningless on other targets.
In rank_for_schedule () only machine-independent heuristics are 
gathered; the rest of the Haifa scheduler is no more machine-dependent 
than these heuristics are.  Machine-dependent things are separated into 
the reorder hooks (which, btw, are defined on only 3 targets).


--
Maxim



Re: [4.2 Regression]: Gcc generates unaligned access on IA64

2006-03-16 Thread Maxim Kuvyrkov

H. J. Lu wrote:

FYI, today's gcc 4.2 generates many unaligned accesses on IA64:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26721

It may be related to

http://gcc.gnu.org/ml/gcc-patches/2006-03/msg01001.html
http://gcc.gnu.org/ml/gcc-patches/2006-03/msg01000.html
http://gcc.gnu.org/ml/gcc-patches/2006-03/msg00999.html
http://gcc.gnu.org/ml/gcc-patches/2006-03/msg00997.html
http://gcc.gnu.org/ml/gcc-patches/2006-03/msg00998.html

Has anyone else seen it? If it is confirmed, it looks pretty bad.


H.J.


I've got these unaligned access bugs too.  I'm now working on it.

--
Maxim




Re: alias time explosion

2006-03-22 Thread Maxim Kuvyrkov

Daniel Berlin wrote:

...


If i don't turn off scheduling entirely, this testcase now takes >10
minutes to compile (I gave up after that).

With scheduling turned off, it takes 315 seconds, checking enabled.

It looks like the scheduler is now trying to schedule some single region
with 51,000 instructions in it.

Everytime i broke into the debugger, it was busy in ready_sort re-doing
qsort on the ready-list (which probably had a ton of instructions), over
and over and over again.


I imagine the 51k instructions comes from the recent scheduling changes.
Maxim, can you please take the testcase Andrew attached earlier in the
thread, and make it so the scheduler can deal with it in a reasonable
amount of time again?  It used to take <20 seconds.


I've checked the trunk and everything appears OK to me.  Both the trunk 
and the trunk with my patches reverted compile the testcase in 5m30s 
(they were configured with CFLAGS=-g).  My best guess as to where the >10 
minutes came from is that you tried to compile the testcase with a 
compiler built with profile information - in that case the compilation 
will last for ~15 minutes.


--
Maxim



Re: IA-64 speculation patches have bad impact on ARM

2006-05-30 Thread Maxim Kuvyrkov
ip_length (int);
 static bool arm_function_ok_for_sibcall (tree, tree);
@@ -245,6 +249,12 @@ static bool arm_tls_symbol_p (rtx x);
 #undef  TARGET_SCHED_ADJUST_COST
 #define TARGET_SCHED_ADJUST_COST arm_adjust_cost
 
+#undef TARGET_SCHED_REORDER
+#define TARGET_SCHED_REORDER arm_reorder1
+
+#undef TARGET_SCHED_REORDER2
+#define TARGET_SCHED_REORDER2 arm_reorder2
+
 #undef TARGET_ENCODE_SECTION_INFO
 #ifdef ARM_PE
 #define TARGET_ENCODE_SECTION_INFO  arm_pe_encode_section_info
@@ -5229,6 +5239,50 @@ arm_adjust_cost (rtx insn, rtx link, rtx
   return cost;
 }
 
+static void
+arm_reorder (rtx *ready, int n_ready)
+{
+  if (n_ready > 1)
+{
+  /* This is correct for sched-rgn.c only.  */
+  basic_block bb = BLOCK_FOR_INSN (current_sched_info->prev_head);
+
+  if (BLOCK_FOR_INSN (ready[n_ready - 1]) != bb)
+{
+  int i;
+
+  for (i = n_ready - 1; i >= 0; i--)
+{
+  rtx insn = ready[i];
+
+  if (BLOCK_FOR_INSN (insn) != bb)
+continue;
+
+  memcpy (ready + i, ready + i + 1,
+  (n_ready - i - 1) * sizeof (*ready));
+  ready[n_ready - 1] = insn;
+  break;
+}
+}
+}
+}
+
+static int
+arm_reorder1 (FILE *dump ATTRIBUTE_UNUSED, int sched_verbose ATTRIBUTE_UNUSED,
+  rtx *ready, int *pn_ready, int clock_var ATTRIBUTE_UNUSED)
+{
+  arm_reorder (ready, *pn_ready);
+  return 1;
+}
+
+static int
+arm_reorder2 (FILE *dump ATTRIBUTE_UNUSED, int sched_verbose ATTRIBUTE_UNUSED,
+  rtx *ready, int *pn_ready, int clock_var ATTRIBUTE_UNUSED)
+{
+  arm_reorder (ready, *pn_ready);
+  return 0;
+}
+
 static int fp_consts_inited = 0;
 
 /* Only zero is valid for VFP.  Other values are also valid for FPA.  */
2006-05-30  Maxim Kuvyrkov  <[EMAIL PROTECTED]>

* haifa-sched.c (priority): Set INSN_PRIORITY to INSN_COST if all
dependencies of the insn are being ignored.
(adjust_priority): Don't adjust priority twice.
* sched-int.h (haifa_insn_data.priority_adjusted): New bitfield.
(INSN_PRIORITY_ADJUSTED): New access-macro.
--- haifa-sched.c   (/gcc-local/trunk/gcc)  (revision 19877)
+++ haifa-sched.c   (/gcc-local/arm-bug/gcc)(revision 19877)
@@ -751,6 +751,8 @@ priority (rtx insn)
 
  do
{
+  bool priority_inited_p = false;
+
  for (link = INSN_DEPEND (twin); link; link = XEXP (link, 1))
{
  rtx next;
@@ -785,9 +787,14 @@ priority (rtx insn)
 
  if (next_priority > this_priority)
this_priority = next_priority;
+
+  priority_inited_p = true;
}
}
  
+  if (!priority_inited_p)
+this_priority = insn_cost (insn, 0, 0);
+
  twin = PREV_INSN (twin);
}
  while (twin != prev_first);
@@ -1110,9 +1117,13 @@ adjust_priority (rtx prev)
 
  Revisit when we have a machine model to work with and not before.  */
 
-  if (targetm.sched.adjust_priority)
-INSN_PRIORITY (prev) =
-  targetm.sched.adjust_priority (prev, INSN_PRIORITY (prev));
+  if (targetm.sched.adjust_priority
+  && !INSN_PRIORITY_ADJUSTED (prev))
+{
+  INSN_PRIORITY (prev) =
+targetm.sched.adjust_priority (prev, INSN_PRIORITY (prev));
+  INSN_PRIORITY_ADJUSTED (prev) = 1;
+}
 }
 
 /* Advance time on one cycle.  */
@@ -4478,6 +4489,7 @@ clear_priorities (rtx insn)
   if (INSN_PRIORITY_KNOWN (pro))
{
  INSN_PRIORITY_KNOWN (pro) = 0;
+  INSN_PRIORITY_ADJUSTED (pro) = 0;
  clear_priorities (pro);
}
 }
--- sched-int.h (/gcc-local/trunk/gcc)  (revision 19877)
+++ sched-int.h (/gcc-local/arm-bug/gcc)(revision 19877)
@@ -317,6 +317,9 @@ struct haifa_insn_data
   /* Nonzero if priority has been computed already.  */
   unsigned int priority_known : 1;
 
+  /* Nonzero if priority has been adjusted already.  */
+  unsigned int priority_adjusted : 1;
+
   /* Nonzero if instruction has internal dependence
  (e.g. add_dependence was invoked with (insn == elem)).  */
   unsigned int has_internal_dep : 1;
@@ -350,6 +353,7 @@ extern regset *glat_start, *glat_end;
 #define INSN_DEP_COUNT(INSN)   (h_i_d[INSN_UID (INSN)].dep_count)
 #define INSN_PRIORITY(INSN)(h_i_d[INSN_UID (INSN)].priority)
 #define INSN_PRIORITY_KNOWN(INSN) (h_i_d[INSN_UID (INSN)].priority_known)
+#define INSN_PRIORITY_ADJUSTED(INSN) (h_i_d[INSN_UID (INSN)].priority_adjusted)
 #define INSN_COST(INSN)(h_i_d[INSN_UID (INSN)].cost)
 #define INSN_REG_WEIGHT(INSN)  (h_i_d[INSN_UID (INSN)].reg_weight)
 #define HAS_INTERNAL_DEP(INSN)  (h_i_d[INSN_UID (INSN)].has_internal_dep)


Re: IA-64 speculation patches have bad impact on ARM

2006-06-01 Thread Maxim Kuvyrkov

Vladimir Makarov wrote:

...

I agree with this.  Two months ago Maxim submitted patches which 
affect only ia64, except for one thing affecting all targets - the patch 
which builds more scheduling regions and as a consequence permits more 
aggressive interblock scheduling.


Insn scheduling before register allocation, even without Maxim's 
patches, is not safe when hard registers are used in RTL.  It is a 
known bug (e.g. for x86_64) and it is in bugzilla.  Jim Wilson wrote 
several possible solutions for this; none of them is easy to implement 
except switching off insn scheduling before RA (which is what is done 
for x86_64).


But we can restore the state (probably safe for most programs) that 
existed before Maxim's patch.  So, Maxim, could you do this (of course 
you can keep the max-sched-extend-regions-iters value for ia64, because 
it is probably safe for targets with many registers)?


Vlad


Considering that bug, I agree that by default there should be no 
additional regions.  The patch will be posted in a few minutes.


--
Maxim



Re: IA-64 speculation patches have bad impact on ARM

2006-06-01 Thread Maxim Kuvyrkov

Daniel Jacobowitz wrote:

...


Not even a single comment - shame on you both! :-)  If this is the
solution we choose, can we make sure that there's at least a comment
explaining what's going on?


Totally agree.  That was an *example patch*.  Here is a slightly updated 
version - but still just an example - of how we can arrange instructions 
on ARM or some other platform with few execution units.


--
Maxim

--- config/arm/arm.c	(/gcc-local/trunk/gcc)	(revision 19935)
+++ config/arm/arm.c	(/gcc-local/arm-bug/gcc)	(revision 19935)
@@ -52,6 +52,7 @@
 #include "target-def.h"
 #include "debug.h"
 #include "langhooks.h"
+#include "sched-int.h"
 
 /* Forward definitions of types.  */
 typedef struct minipool_nodeMnode;
@@ -118,6 +119,9 @@ static void thumb_output_function_prolog
 static int arm_comp_type_attributes (tree, tree);
 static void arm_set_default_type_attributes (tree);
 static int arm_adjust_cost (rtx, rtx, rtx, int);
+static void arm_reorder (rtx *, int);
+static int arm_reorder1 (FILE *, int, rtx *, int *, int);
+static int arm_reorder2 (FILE *, int, rtx *, int *, int);
 static int count_insns_for_constant (HOST_WIDE_INT, int);
 static int arm_get_strip_length (int);
 static bool arm_function_ok_for_sibcall (tree, tree);
@@ -245,6 +249,12 @@ static bool arm_tls_symbol_p (rtx x);
 #undef  TARGET_SCHED_ADJUST_COST
 #define TARGET_SCHED_ADJUST_COST arm_adjust_cost
 
+#undef TARGET_SCHED_REORDER
+#define TARGET_SCHED_REORDER arm_reorder1
+
+#undef TARGET_SCHED_REORDER2
+#define TARGET_SCHED_REORDER2 arm_reorder2
+
 #undef TARGET_ENCODE_SECTION_INFO
 #ifdef ARM_PE
 #define TARGET_ENCODE_SECTION_INFO  arm_pe_encode_section_info
@@ -5229,6 +5239,68 @@ arm_adjust_cost (rtx insn, rtx link, rtx
   return cost;
 }
 
+/* Reorder insns in the ready list, so that instructions from the target block
+   will be scheduled ahead of instructions from the source blocks.  */
+static void
+arm_reorder (rtx *ready, int n_ready)
+{
+  if (n_ready > 1)
+    {
+      /* Find out what the target block is.
+
+         !!! It is better to use TARGET_BB itself from
+         haifa-sched.c: schedule_block (), but it is unavailable due to its
+         local scope.  */
+      basic_block target_bb = BLOCK_FOR_INSN (current_sched_info->prev_head);
+
+      if (/* If the insn that will be scheduled next doesn't belong to
+             TARGET_BB.
+
+             !!! Actually, we want here another condition:
+             'if (IS_SPECULATIVE_INSN (ready[n_ready - 1]))', but it is
+             unavailable due to local scope in sched-rgn.c.  */
+          BLOCK_FOR_INSN (ready[n_ready - 1]) != target_bb)
+        /* Search the ready list for the highest-priority insn from
+           TARGET_BB and, if found, move it to the head of the list.  */
+        {
+          int i;
+
+          for (i = n_ready - 1; i >= 0; i--)
+            {
+              rtx insn = ready[i];
+
+              if (BLOCK_FOR_INSN (insn) != target_bb)
+                continue;
+
+              memcpy (ready + i, ready + i + 1,
+                      (n_ready - i - 1) * sizeof (*ready));
+              ready[n_ready - 1] = insn;
+              break;
+            }
+        }
+    }
+}
+
+/* Override default sorting algorithm to reduce number of interblock
+   motions.  */
+static int
+arm_reorder1 (FILE *dump ATTRIBUTE_UNUSED, int sched_verbose ATTRIBUTE_UNUSED,
+  rtx *ready, int *pn_ready, int clock_var ATTRIBUTE_UNUSED)
+{
+  arm_reorder (ready, *pn_ready);
+  return 1;
+}
+
+/* Override default sorting algorithm to reduce number of interblock
+   motions.  */
+static int
+arm_reorder2 (FILE *dump ATTRIBUTE_UNUSED, int sched_verbose ATTRIBUTE_UNUSED,
+  rtx *ready, int *pn_ready, int clock_var ATTRIBUTE_UNUSED)
+{
+  arm_reorder (ready, *pn_ready);
+  return 0;
+}
+
 static int fp_consts_inited = 0;
 
 /* Only zero is valid for VFP.  Other values are also valid for FPA.  */


Re: GCC trunk build failed on ia64: ICE in __gcov_init

2006-06-13 Thread Maxim Kuvyrkov

Grigory Zagorodnev wrote:

Hi!

Build of mainline GCC on ia64-redhat-linux failed since Thu Jun 8 
16:23:09 UTC 2006 (revision 114488). Last successfully built revision is 
114468.


I wonder if somebody sees the same.


...


- Grigory



This was fixed in revision 114604.


--
Maxim



[Job] GNU toolchain developer

2013-04-03 Thread Maxim Kuvyrkov
Hi,

We are looking for developers passionate about open-source and toolchain 
development. You will be working on a variety of open-source projects, 
primarily on GCC, LLVM, glibc, GDB and Binutils.

You should have ... 

- Experience with open-source projects and upstream communities; 
- Experience with open-source toolchain projects is a plus (GCC, LLVM, glibc, 
Binutils, GDB, Newlib, uClibc, OProfile, QEMU, Valgrind, etc); 
- Knowledge of compiler technology; 
- Knowledge of low-level computer architecture; 
- Proficiency in C. Proficiency in C++ and Python is a plus; 
- Knowledge of Linux development environment; 
- Time management and self-organizing skills, desire to work from your home 
office (KugelWorks is a distributed company); 
- Professional ambitions; 
- Fluent English; 
- BSc in computer science (or a rationale for why you do not need one). 

At KugelWorks you will have the opportunity to ... 

- Hack on the toolchain;
- Develop your engineering, managerial, and communication skills; 
- Gain experience in product development; 
- Get public recognition for your open-source work; 
- Become an open-source maintainer. 

Contact: 

- Maxim Kuvyrkov 
- Email: ma...@kugelworks.com 
- Phone: +1 831 295 8595 
- Website: www.kugelworks.com

--
Maxim Kuvyrkov
KugelWorks





Re: [Android] The reason why -Bsymbolic is turned on by default

2013-04-03 Thread Maxim Kuvyrkov
On 30/03/2013, at 7:55 AM, Alexander Ivchenko wrote:

> Hi,
> 
> When compiling a shared library with "-mandroid -shared" the option
> -Bsymbolic for linker is turned on by default. What was the reason
> behind that default?  Isn't using of -Bsymbolic somehow dangerous and
> should be avoided..? (as e.g. is explained in the mail from Richard
> Henderson http://gcc.gnu.org/ml/gcc/2001-05/msg01551.html).
> 
> Since there is no (AFAIK) option like -Bno-symbolic we cannot use
> -fno-pic binary with COPY relocations in it (android dynamic loader
> will throw an error when there is COPY relocation against DT_SYMBOLIC
> library..)

I don't know the exact reason behind -Bsymbolic (it came as a requirement from 
Google's Android team), but I believe it produces slightly faster code (and 
fancy symbol preemption is not required on phones and TVs).  Also, it might be 
that the kernel can share more memory pages of libraries compiled with -Bsymbolic, 
but I'm not sure.

Now, it appears the problem is that an application cannot use a COPY relocation 
to fetch a symbol out of a shared -Bsymbolic library.  I don't quite understand 
why this is forbidden by Bionic's linker.  I understand why COPY relocations 
shouldn't be applied to the inside of a DT_SYMBOLIC library.  However, I don't 
immediately see the problem with applying a COPY relocation against a symbol 
from a DT_SYMBOLIC library to the inside of an executable.

Ard, you committed 5ae44f302b7d1d19f25c4c6f125e32dc369961d9 to Bionic that adds 
handling of ARM COPY relocations.  Can you comment on why COPY relocations from 
executables to DT_SYMBOLIC libraries are forbidden?

Thank you,

--
Maxim Kuvyrkov
KugelWorks



Re: Need a copyright assignment and a copyright disclaimer form

2013-04-30 Thread Maxim Kuvyrkov
On 25/04/2013, at 8:51 AM, dw wrote:

> I am attempting to submit a patch for the gcc documentation (see 
> http://gcc.gnu.org/ml/gcc-help/2013-04/msg00193.html).  I am told that I need 
> to submit one of these two forms.  Please send me copies so I can select one 
> and submit it.

You need to forward your email to .  It is the FSF, not the GCC 
project, that handles copyright assignment paperwork.

You need to specify your name, the name of your employer, and the name and 
title of the person who will be signing on behalf of the company.  You also 
need to list which FSF projects (e.g., GCC, Binutils, GDB, glibc -- or just 
blanket ALL) you wish to contribute to.

Usually the FSF copyright office replies within 1-2 days, and feel free to ping 
us back here at gcc@ if FSF legal stalls.

Thank you,

--
Maxim Kuvyrkov
KugelWorks



Re: Delay scheduling due to possible future multiple issue in VLIW

2013-07-15 Thread Maxim Kuvyrkov
Paulo,

The GCC scheduler is not specifically designed for VLIW architectures, but it 
handles them reasonably well.  For your example code, both schedules take the 
same time to execute:

38: 0: r1 = e[r0]
40: 4: [r0] = r1
41: 5: r0 = r0+4
43: 5: p0 = r1!=0
44: 6: jump p0

and

38: 0: r1 = e[r0]
41: 1: r0 = r0+4
40: 4: [r0] = r1
43: 5: p0 = r1!=0
44: 6: jump p0

[It is true that the first schedule takes less space due to fortunate VLIW 
packing.]

You are correct that the GCC scheduler is greedy and that it tries to issue 
instructions as soon as possible (i.e., it is better to issue something on the 
current cycle than nothing at all), which is a sensible strategy.  For small 
basic blocks the greedy algorithm may cause artifacts like the one you describe.

You could try increasing the size of the regions on which the scheduler 
operates by switching your port to the sched-ebb scheduler, which was 
originally developed for ia64.

Regards,

--
Maxim Kuvyrkov
KugelWorks



On 27/06/2013, at 8:35 PM, Paulo Matos wrote:

> Let me add to my own post saying that it seems that the problem is that the 
> list scheduler is greedy in the sense that it will take an instruction from 
> the ready list no matter what when waiting and trying to pair it with later 
> on with another instruction might be more beneficial. In a sense it seems 
> that the idea is that 'issuing instructions as soon as possible is better' 
> which might be true for a single issue chip but a VLIW with multiple issue 
> has to contend with other problems.
> 
> Any thoughts on this?
> 
> Paulo Matos
> 
> 
>> -Original Message-
>> From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf Of Paulo
>> Matos
>> Sent: 26 June 2013 15:08
>> To: gcc@gcc.gnu.org
>> Subject: Delay scheduling due to possible future multiple issue in VLIW
>> 
>> Hello,
>> 
>> We have a port for a VLIW machine using gcc head 4.8 with an maximum issue of
>> 2 per clock cycle (sometimes only 1 due to machine constraints).
>> We are seeing the following situation in sched2:
>> 
>> ;;   --- forward dependences: 
>> 
>> ;;   --- Region Dependences --- b 3 bb 0
>> ;;  insn  codebb   dep  prio  cost   reservation
>> ;;    --   ---       ---
>> ;;   38  1395 3 0 6 4
>> (p0+long_imm+ldst0+lock0),nothing*3 : 44m 43 41 40
>> ;;   40   491 3 1 2 2   (p0+long_imm+ldst0+lock0),nothing
>> : 44m 41
>> ;;   41   536 3 2 1 1   (p0+no_stl2)|(p1+no_dual)   : 44
>> ;;   43  1340 3 1 2 1   (p0+no_stl2)|(p1+no_dual)   : 44m
>> ;;   44  1440 3 4 1 1   (p0+long_imm)   :
>> 
>> ;;  dependencies resolved: insn 38
>> ;;  tick updated: insn 38 into ready
>> ;;  dependencies resolved: insn 41
>> ;;  tick updated: insn 41 into ready
>> ;;  Advanced a state.
>> ;;  Ready list after queue_to_ready:41:4  38:2
>> ;;  Ready list after ready_sort:41:4  38:2
>> ;;  Ready list (t =   0):41:4  38:2
>> ;;  Chosen insn : 38
>> ;;0--> b  0: i  38r1=zxn([r0+`b'])
>> :(p0+long_imm+ldst0+lock0),nothing*3
>> ;;  dependencies resolved: insn 43
>> ;;  Ready-->Q: insn 43: queued for 4 cycles (change queue index).
>> ;;  tick updated: insn 43 into queue with cost=4
>> ;;  dependencies resolved: insn 40
>> ;;  Ready-->Q: insn 40: queued for 4 cycles (change queue index).
>> ;;  tick updated: insn 40 into queue with cost=4
>> ;;  Ready-->Q: insn 41: queued for 1 cycles (resource conflict).
>> ;;  Ready list (t =   0):
>> ;;  Advanced a state.
>> ;;  Q-->Ready: insn 41: moving to ready without stalls
>> ;;  Ready list after queue_to_ready:41:4
>> ;;  Ready list after ready_sort:41:4
>> ;;  Ready list (t =   1):41:4
>> ;;  Chosen insn : 41
>> ;;1--> b  0: i  41r0=r0+0x4
>> :(p0+no_stl2)|(p1+no_dual)
>> 
>> So, it is scheduling first insn 38 followed by 41.
>> The insn chain for bb3 before sched2 looks like:
>> (insn 38 36 40 3 (set (reg:DI 1 r1)
>>(zero_extend:DI (mem:SI (plus:SI (reg:SI 0 r0 [orig:119 ivtmp.13 ]
>> [119])
>>(symbol_ref:SI ("b") [flags 0x80]  > 0x2b9c011f75a0 b>)) [2 MEM[symbol: b, index: ivtmp.13_7, offset: 0B]+0 S4
>> A32]))) pr3115b.c:13 1395 {zero_extendsidi2}
>> (nil))
>> (insn 40 38 41 

Re: [PATCH] GOMP_CPU_AFFINITY fails with >1024 cores

2013-07-16 Thread Maxim Kuvyrkov
On 17/07/2013, at 2:29 AM, Daniel J Blueman wrote:

> Jakub et al,
> 
> Steffen has developed a nice fix [1] for GOMP_CPU_AFFINITY failing with >1024 
> cores.
> 
> What steps are needed to get this into GCC 4.8.2?
> 
> Thanks,
>  Daniel
> 
> [1] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57298

It's easy!  Just follow the steps:

1. You test the patch on one of the primary architectures and make sure there 
are no regressions in the testsuites.

2. Ideally you add a test that fails before your patch, but passes with it.

3. You post your final patch to the gcc-patches@ mailing list (this is the gcc@ 
mailing list); CC one of the maintainers.  If you CC both, each will think that 
the other will review the patch.

4. You include a full description and analysis of the problem in the body of 
the message (people are too lazy to click on links).  You describe how your 
patch fixes the problem.  You state how and on which architectures your patch 
was tested.

5. You ping your submission every 2 weeks to one of the maintainers until they 
review your patch.

Good luck!

--
Maxim Kuvyrkov
KugelWorks




Re: toolchain build error with eglibc on OpenWrt

2013-07-18 Thread Maxim Kuvyrkov
On 17/07/2013, at 6:26 PM, lingw...@altaitechnologies.com wrote:

> Hi developers,
> 
> I encountered a problem when building OpenWrt's toolchain for Cavium
> Octeon, which is a MIPS64 R2 architecture.  The error message is as follows:
> My toolchain units version:
>   gcc: 4.7.x
>   binutils: 2.22
>   eglibc: 2.17 (svn version 22243)

It will be difficult to figure out how to fix it, sorry.

The problem seems to be in libgcc (which is part of GCC) not providing helpers 
for floating-point arithmetic operations.  The likely cause of this is that the 
compiler was configured for a hard-float target, but eglibc is being compiled 
for a soft-float target.  [For hard-float targets there is no need for FP 
helpers in libgcc, since the processor is assumed to handle that in silicon.]

Good luck,

--
Maxim Kuvyrkov
KugelWorks



Re: [ping] [buildrobot] gcc/config/linux-android.c:40:7: error: ‘OPTION_BIONIC’ was not declared in this scope

2013-09-09 Thread Maxim Kuvyrkov
On 7/09/2013, at 1:31 AM, Jan-Benedict Glaw wrote:

> On Mon, 2013-08-26 12:51:53 +0200, Jan-Benedict Glaw  
> wrote:
>> On Tue, 2013-08-20 11:24:31 +0400, Alexander Ivchenko  
>> wrote:
>>> Hi, thanks for cathing this.
>>> 
>>> I certainly missed that OPTION_BIONIC is not defined for linux targets
>>> that do not include config/linux.h in their tm.h.
>>> 
>>> This patch fixed build for powerpc64le-linux and mn10300-linux.
>>> linux_libc, LIBC_GLIBC, LIBC_BIONIC should be defined for all targets.
>> [...]
> 
> Seems the commit at Thu Sep 5 13:01:35 2013 (CEST) fixed most of the
> fallout.  Thanks!
> 
>> mn10300-linux: 
>> http://toolchain.lug-owl.de/buildbot/showlog.php?id=9657&mode=view
> 
> This however still seems to have issues in a current build:
> 
>   http://toolchain.lug-owl.de/buildbot/showlog.php?id=10520&mode=view

Jan-Benedict,

Mn10300-linux does not appear to support Linux.  The mn10300-linux target 
specifier expands into mn10300-unknown-linux-gnu, where *-gnu implies using 
the glibc library, which doesn't have an mn10300 port.

Jeff,

You are mn10300 maintainer, is building GCC for mn10300-unknown-linux-gnu 
supposed to work?  

Thanks,

--
Maxim Kuvyrkov
www.kugelworks.com



[RFC] Apple Blocks extension

2013-11-03 Thread Maxim Kuvyrkov
Hi,

I am considering a project to add Apple's blocks [*] extension to GCC.  I am 
looking at adding blocks support to C, C++ and Obj-C/C++ front-ends.

There are many challenges (both technical and copyright) that require work 
before any patches are ready for review, and I would appreciate indication from 
front-end maintainers on whether a technically sound implementation of Blocks 
extension would be a welcome addition to GCC front-ends.

Joseph, Richard, as C front-end maintainers, would you be supportive of Blocks 
extension implemented for C front-end?

Jason, Mark, Nathan, as C++ front-end maintainers, would you be supportive of 
Blocks extension implemented for C++ front-end?

Mike, Stan, as Obj-C/C++ front-end maintainers, would you be supportive of 
Blocks extension implemented for Obj-C/C++ front-ends?

[*] http://en.wikipedia.org/wiki/Blocks_(C_language_extension)

Thank you!

--
Maxim Kuvyrkov
www.kugelworks.com





Re: Dependency confusion in sched-deps

2013-12-04 Thread Maxim Kuvyrkov
On 8/11/2013, at 1:48 am, Paulo Matos  wrote:

> Hello,
> 
> I am slightly unsure if the confusion is in the dependencies or it's my 
> confusion.
> 
> I have tracked this strange behaviour which only occurs when we need to flush 
> pending instructions due to the pending list becoming too large (gcc 4.8, 
> haven't tried with trunk).
> 
> I have two stores: 
> 85: st zr, [r12] # zr is the zero register
> 90: st zr, [r18]
> 
> While analysing dependencies for `st zr, [r12]`, we notice that pending list 
> is too large in sched_analyze_1 and call flush_pending_lists (deps, insn, 
> false, true).
> 
> This in turn causes the last_pending_memory_flush to be set to:
> (insn_list:REG_DEP_TRUE 85 (nil))
> 
> When insn 90 is analyzed next, it skips the flushing bit since the pending 
> lists had just been flushed and enters the else bit where it does:
> add_dependence_list (insn, deps->last_pending_memory_flush, 1,
>  REG_DEP_ANTI, true);
> 
> This adds the dependency: 90 has an anti-dependency to 85.
> I think this should be a true dependency (write after write). It even says so 
> in the list of last_pending_memory_flush, however add_dependence_list 
> function ignored this and uses the dep_type passed: REG_DEP_ANTI.
> 
> Is anti the correct dependence? Why?

Output dependency is the right type (write after write).  Anti dependency is 
write after read, and true dependency is read after write.

Dependency type plays a role in estimating costs and latencies between 
instructions (which affects performance), but using a wrong or imprecise 
dependency type does not affect correctness.  A dependency flush is a 
force-majeure occurrence during compilation, and developers tend not to spend 
too much time coding the best possible handling for these [hopefully] rare 
occurrences.

Anti dependency is a good guess for the dependency type between two memory 
instructions.  In this particular case it is wrong, and, I imagine, this 
causes a performance problem for you.  You can add better handling of this 
situation by remembering whether last_pending_memory_flush is a memory read or 
a memory write, and then use that to select the correct dependency type for 
insn 90: output, anti or true.

Let me know whether you want to pursue this and I can help with advice and 
patch review.

Thanks, 

--
Maxim Kuvyrkov
www.kugelworks.com




Re: m68k optimisations?

2013-12-04 Thread Maxim Kuvyrkov
On 9/11/2013, at 12:08 am, Fredrik Olsson  wrote:

> I have this simple functions:
> int sum_vec(int c, ...) {
>va_list argptr;
>va_start(argptr, c);
>int sum = 0;
>while (c--) {
>int x = va_arg(argptr, int);
>sum += x;
>}
>va_end(argptr);
>return sum;
> }
> 
> 
> When compiling with "-fomit-frame-pointer -Os -march=68000 -c -S
> -mshort" I get this assembly (I have manually added comments with
> clock cycles per instruction and a total for a count of 0, 8 and n>0):
>.even
>.globl _sum_vec
> _sum_vec:
>lea (6,%sp),%a0 | 8
>move.w 4(%sp),%d1   | 12
>clr.w %d0   | 4
>jra .L1 | 12
> .L2:
>add.w (%a0)+,%d0| 8
> .L1:
>dbra %d1,.L2| 16,12
>rts | 16
> | c==0: 8+12+4+12+12+16=64
> | c==8: 8+12+4+12+(16+8)*8+12+16=256
> | c==n: =64+24n
> 
> When instead compiling with "-fomit-frame-pointer -O3 -march=68000 -c
> -S -mshort" I expect to get more aggressive optimisation than -Os, or
> at least just as performant, but instead I get this:
>.even
>.globl _sum_vec
> _sum_vec:
>move.w 4(%sp),%d0   | 12
>jeq .L2 | 12,8
>lea (6,%sp),%a0 | 8
>subq.w #1,%d0   | 4
>and.l #65535,%d0| 16
>add.l %d0,%d0   | 8
>lea 8(%sp,%d0.l),%a1| 16
>clr.w %d0   | 4
> .L1:
>add.w (%a0)+,%d0| 8
>cmp.l %a0,%a1   | 8
>jne .L1 | 12|8
>rts | 16
> .L2:
>clr.w %d0   | 4
>rts | 16
> | c==0: 12+12+4+16=44
> | c==8: 12+8+8+4+16+8+16+4+(8+8+12)*4-4+16=316
> | c==n: =88+28n
> 
> The count==0 case is better. I can see what optimisation has been
> tried for the loop, but it just not working since both the ini for the
> loop and the loop itself becomes more costly.
> 
> Being a GCC beginner I would like a few pointers as to how I should go
> about to fix this?

You investigate such problems by comparing intermediate debug dumps of two 
compilation scenarios; by the assembly time it is almost impossible to guess 
where the problem is coming from.  Add -fdump-tree-all and -fdump-rtl-all to 
the compilation flags and find which optimization pass makes the wrong 
decision.  Then you trace that optimization pass or file a bug report in hopes 
that someone (optimization maintainer) will look at it.

Read through GCC wiki for information on debugging and troubleshooting GCC:
- http://gcc.gnu.org/wiki/GettingStarted
- http://gcc.gnu.org/wiki/FAQ
- http://gcc.gnu.org/wiki/

Thanks,

--
Maxim Kuvyrkov
www.kugelworks.com





Re: Dependency confusion in sched-deps

2013-12-05 Thread Maxim Kuvyrkov
On 6/12/2013, at 4:25 am, Michael Matz  wrote:

> Hi,
> 
> On Thu, 5 Dec 2013, Maxim Kuvyrkov wrote:
> 
>> Output dependency is the right type (write after write).  Anti 
>> dependency is write after read, and true dependency is read after write.
>> 
>> Dependency type plays a role for estimating costs and latencies between 
>> instructions (which affects performance), but using wrong or imprecise 
>> dependency type does not affect correctness.
> 
> In the context of GCC and the middle ends memory model this statement is 
> not correct.  For some dependency types we're using type based aliasing to 
> disambiguate, i.e ignore that dependency, which for others we don't.  In 
> particular a read-after-write memory-access dependency can be ignored if 
> type info says they can't alias (because a program where both _would_ 
> access the same memory would be invalid according to our mem model), but 
> for write-after-read or write-after-write we cannot do that disambiguation 
> (because the last write overrides the dynamic type of the memory cell even 
> if it was incompatible with the one before).

Yes, this is correct for dependencies between memory locations in the general 
context of GCC.  [The clarifications below are for Paulo's benefit and for 
anyone else who wants to find out how GCC scheduling works.]

Scheduler dependency analysis is a user of the aforementioned alias analysis, 
and it simply won't create a dependency between instructions if alias analysis 
tells it that the accesses cannot conflict.  In the context of the scheduler, 
the dependencies (and their types) are between instructions, not individual 
registers or memory locations.  The mere fact of two instructions having a 
dependency of any kind will make the scheduler produce correct code.  The 
difference between two instructions having a true vs anti vs output dependency 
will manifest itself in how close to the 1st instruction the 2nd one can be 
issued.

Furthermore, when two instructions have dependencies on several items (e.g., 
both on register and on memory location), the resulting dependency type is set 
to the greater of dependency types of all dependent items: true-dependency 
having most weight, followed by anti-dependency, followed by output-dependency.

Consider instructions

[r1] = r2
r1 = [r2]

The scheduler dependency analysis will find an anti-dependency on r1 and 
true-dependency on memory locations (assuming [r1] and [r2] may alias).  The 
resulting dependency between instructions will be true-dependency and the 
instructions will be scheduled several cycles apart.  However, one might argue 
that [r1] and [r2] are unlikely to alias and scheduling these instructions 
back-to-back (downgrading dependency type from true to anti) would produce 
better code on average.  This is one of countless improvements that could be 
made to GCC scheduler.

--
Maxim Kuvyrkov
www.kugelworks.com




Re: Dependency confusion in sched-deps

2013-12-05 Thread Maxim Kuvyrkov
On 6/12/2013, at 8:44 am, shmeel gutl  wrote:

> On 05-Dec-13 02:39 AM, Maxim Kuvyrkov wrote:
>> Dependency type plays a role for estimating costs and latencies between 
>> instructions (which affects performance), but using wrong or imprecise 
>> dependency type does not affect correctness.
> On multi-issue architectures it does make a difference. Anti dependence 
> permits the two instructions to be issued during the same cycle whereas true 
> dependency and output dependency would forbid this.
> 
> Or am I misinterpreting your comment?

On VLIW-flavoured machines without resource conflict checking -- "yes", it is 
critical not to use an anti dependency where an output or true dependency 
exists.  This is the case, though, only because these machines do not follow 
sequential semantics for instruction execution (i.e., effects from previous 
instructions are not necessarily observed by subsequent instructions on the 
same or close cycles).

On machines with internal resource conflict checking, having a wrong type on 
the dependency should not cause wrong behavior, but "only" suboptimal 
performance.

Thank you,

--
Maxim Kuvyrkov
www.kugelworks.com




Re: [buildrobot] mips64-linux broken

2013-12-08 Thread Maxim Kuvyrkov
On 9/12/2013, at 3:24 am, Jan-Benedict Glaw  wrote:

> Hi Maxim!
> 
> One of your recent libc<->android clean-up patches broke the
> mips64-linux target as a side-effect, see eg.
> http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=53806:
> 
> g++ -c  -DIN_GCC_FRONTEND -DIN_GCC_FRONTEND -g -O2 -DIN_GCC  
> -DCROSS_DIRECTORY_STRUCTURE  -fno-exceptions -fno-rtti 
> -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings 
> -Wcast-qual -Wmissing-format-attribute -pedantic -Wno-long-long 
> -Wno-variadic-macros -Wno-overlength-strings -fno-common  -DHAVE_CONFIG_H -I. 
> -Ic-family -I/home/jbglaw/repos/gcc/gcc -I/home/jbglaw/repos/gcc/gcc/c-family 
> -I/home/jbglaw/repos/gcc/gcc/../include 
> -I/home/jbglaw/repos/gcc/gcc/../libcpp/include  
> -I/home/jbglaw/repos/gcc/gcc/../libdecnumber 
> -I/home/jbglaw/repos/gcc/gcc/../libdecnumber/dpd -I../libdecnumber 
> -I/home/jbglaw/repos/gcc/gcc/../libbacktrace-o c-family/c-cppbuiltin.o 
> -MT c-family/c-cppbuiltin.o -MMD -MP -MF c-family/.deps/c-cppbuiltin.TPo 
> /home/jbglaw/repos/gcc/gcc/c-family/c-cppbuiltin.c
> /home/jbglaw/repos/gcc/gcc/c-family/c-cppbuiltin.c: In function ‘void 
> c_cpp_builtins(cpp_reader*)’:
> /home/jbglaw/repos/gcc/gcc/c-family/c-cppbuiltin.c:1014:370: error: 
> ‘ANDROID_TARGET_OS_CPP_BUILTINS’ was not declared in this scope
> make[1]: *** [c-family/c-cppbuiltin.o] Error 1

I'm looking into this.

Thanks,

--
Maxim Kuvyrkov
www.kugelworks.com






Re: [buildrobot] mips64-linux broken

2013-12-09 Thread Maxim Kuvyrkov
On 10/12/2013, at 7:28 am, Steve Ellcey  wrote:

> On Mon, 2013-12-09 at 08:21 +1300, Maxim Kuvyrkov wrote:
> 
>> I'm looking into this.
>> 
>> Thanks,
>> 
>> --
>> Maxim Kuvyrkov
>> www.kugelworks.com
> 
> 
> My mips-mti-linux-gnu build is working after I applied this patch
> locally.  I didn't do a test build of mips64-linux-gnu.
> 
> Steve Ellcey
> sell...@mips.com
> 
> 
> diff --git a/gcc/config.gcc b/gcc/config.gcc
> index 93743d8..ee17071 100644
> --- a/gcc/config.gcc
> +++ b/gcc/config.gcc
> @@ -1918,16 +1918,18 @@ mips*-*-netbsd*)  # NetBSD/mips, 
> either endian.
>   extra_options="${extra_options} netbsd.opt netbsd-elf.opt"
>   ;;
> mips*-mti-linux*)
> - tm_file="dbxelf.h elfos.h gnu-user.h linux.h glibc-stdint.h ${tm_file} 
> mips/gnu-user.h mips/gnu-user64.h mips/linux64.h mips/linux-common.h 
> mips/mti-linux.h"
> + tm_file="dbxelf.h elfos.h gnu-user.h linux.h linux-android.h 
> glibc-stdint.h ${tm_file} mips/gnu-user.h mips/gnu-user64.h mips/linux64.h 
> mips/linux-common.h mips/mti-linux.h"
>   tmake_file="${tmake_file} mips/t-mti-linux"
>   tm_defines="${tm_defines} MIPS_ISA_DEFAULT=33 MIPS_ABI_DEFAULT=ABI_32"
> + extra_options="${extra_options} linux-android.opt"
>   gnu_ld=yes
>   gas=yes
>   ;;
> mips64*-*-linux* | mipsisa64*-*-linux*)
> - tm_file="dbxelf.h elfos.h gnu-user.h linux.h glibc-stdint.h ${tm_file} 
> mips/gnu-user.h mips/gnu-user64.h mips/linux64.h mips/linux-common.h"
> + tm_file="dbxelf.h elfos.h gnu-user.h linux.h linux-android.h 
> glibc-stdint.h ${tm_file} mips/gnu-user.h mips/gnu-user64.h mips/linux64.h 
> mips/linux-common.h"
>   tmake_file="${tmake_file} mips/t-linux64"
>   tm_defines="${tm_defines} MIPS_ABI_DEFAULT=ABI_N32"
> + extra_options="${extra_options} linux-android.opt"
>   case ${target} in
>   mips64el-st-linux-gnu)
>   tm_file="${tm_file} mips/st.h"

Hi Steve,

I've come up with the same patch, and Richard S. already approved it.  I'll 
check it in once the s390x-linux part of that patch is approved (hopefully 
later today).

Thank you,

--
Maxim Kuvyrkov
www.kugelworks.com




Re: DONT_BREAK_DEPENDENCIES bitmask for scheduling

2013-12-10 Thread Maxim Kuvyrkov
On 11/12/2013, at 5:17 am, Ramana Radhakrishnan  
wrote:

> On Mon, Jul 1, 2013 at 5:31 PM, Paulo Matos  wrote:
>> Hi,
>> 
>> Near the start of schedule_block, find_modifiable_mems is called if 
>> DONT_BREAK_DEPENDENCIES is not enabled for this scheduling pass. It seems on 
>> c6x backend currently uses this.
>> However, it's quite strange that this is not a requirement for all backends 
>> since find_modifiable_mems, moves all my dependencies in SD_LIST_HARD_BACK 
>> to SD_LIST_SPEC_BACK even though I don't have DO_SPECULATION enabled.
>> 
>> Since dependencies are accessed later on from try_ready (for example), I 
>> would have thought that it would be always good not to call 
>> find_modifiable_mems,  given that it seems to 'literally' break dependencies.
>> 
>> Is the behaviour of find_modifiable_mems a bug or somehow expected?

"Breaking" a dependency in scheduler involves modification of instructions that 
would allow scheduler to move one instruction past the other.  The most common 
case of breaking a dependency is "r2 = r1 + 4; r3 = [r2];" which can be 
transformed into "r3 = [r1 + 4]; r2 = r1 + 4;".  Breaking a dependency is not 
ignoring it, speculatively or otherwise; it is an equivalent code 
transformation to allow scheduler more freedom to fill up CPU cycles.

> 
> 
> It's funny how I've been trying to track down a glitch and ended up
> asking the same question today. Additionally if I use
> TARGET_SCHED_SET_SCHED_FLAGS on a port that doesn't use the selective
> scheduler, this does nothing. Does anyone know why is this the default
> for ports where we don't turn on selective scheduling and might need a
> hook to turn this off ?

SCHED_FLAGS is used to enable or disable various parts of the GCC scheduler.  
On an architecture that supports speculative scheduling with recovery (IA64) 
it can turn this feature on or off.  The documentation for the various 
features of sched-rgn, sched-ebb and sel-sched is not the best, and one will 
likely get weird artefacts by trying out non-default settings.

I believe that only the IA64 backend supports selective scheduling reliably.  
I've seen other ports try out selective scheduling, but I don't know whether 
those efforts got positive results.

--
Maxim Kuvyrkov
www.kugelworks.com




Re: Dependency confusion in sched-deps

2013-12-10 Thread Maxim Kuvyrkov
On 6/12/2013, at 9:44 pm, shmeel gutl  wrote:

> On 06-Dec-13 01:34 AM, Maxim Kuvyrkov wrote:
>> On 6/12/2013, at 8:44 am, shmeel gutl  wrote:
>> 
>>> On 05-Dec-13 02:39 AM, Maxim Kuvyrkov wrote:
>>>> Dependency type plays a role for estimating costs and latencies between 
>>>> instructions (which affects performance), but using wrong or imprecise 
>>>> dependency type does not affect correctness.
>>> On multi-issue architectures it does make a difference. Anti dependence 
>>> permits the two instructions to be issued during the same cycle whereas 
>>> true dependency and output dependency would forbid this.
>>> 
>>> Or am I misinterpreting your comment?
>> On VLIW-flavoured machines without resource conflict checking -- "yes", it 
>> is critical not to use anti dependency where an output or true dependency 
>> exist.  This is the case though, only because these machines do not follow 
>> sequential semantics for instruction execution (i.e., effects from previous 
>> instructions are not necessarily observed by subsequent instructions on the 
>> same/close cycles.
>> 
>> On machines with internal resource conflict checking having a wrong type on 
>> the dependency should not cause wrong behavior, but "only" suboptimal 
>> performance.
>> 
>> 
...
> Earlier in the thread you wrote
>> Output dependency is the right type (write after write).  Anti dependency is 
>> write after read, and true dependency is read after write.
> Should the code be changed to accommodate VLIW machines?  It has been there 
> since the module was originally checked into trunk.

The usual solution for VLIW machines is to have the assembler split VLIW bundles 
that have internal dependencies and execute them on different cycles.  The idea 
is for the compiler to do its best to produce code without any internal 
dependencies, but it is up to the assembler to do the final check and fix any 
occasional problems.  [A good assembler has to do this work anyway to 
accommodate mistakes in hand-written assembly.]

The scheduler is expected to produce code with no internal dependencies for 
VLIW machines 99% of the time.  This 99% effectiveness is good enough, since 
the scheduler is often not the last pass that touches code, and subsequent 
transformations can screw up VLIW bundles anyway.

--
Maxim Kuvyrkov
www.kugelworks.com




Re: DONT_BREAK_DEPENDENCIES bitmask for scheduling

2013-12-10 Thread Maxim Kuvyrkov
On 11/12/2013, at 11:14 am, Ramana Radhakrishnan  
wrote:

> On Tue, Dec 10, 2013 at 9:44 PM, Maxim Kuvyrkov  wrote:
>> On 11/12/2013, at 5:17 am, Ramana Radhakrishnan  
>> wrote:
>> 
>>> On Mon, Jul 1, 2013 at 5:31 PM, Paulo Matos  wrote:
>>>> Hi,
>>>> 
>>>> Near the start of schedule_block, find_modifiable_mems is called if 
>>>> DONT_BREAK_DEPENDENCIES is not enabled for this scheduling pass.  It seems 
>>>> only the c6x backend currently uses this.
>>>> However, it's quite strange that this is not a requirement for all 
>>>> backends, since find_modifiable_mems moves all my dependencies in 
>>>> SD_LIST_HARD_BACK to SD_LIST_SPEC_BACK even though I don't have 
>>>> DO_SPECULATION enabled.
>>>> 
>>>> Since dependencies are accessed later on from try_ready (for example), I 
>>>> would have thought that it would always be good not to call 
>>>> find_modifiable_mems, given that it seems to 'literally' break 
>>>> dependencies.
>>>> 
>>>> Is the behaviour of find_modifiable_mems a bug or somehow expected?
>> 
>> "Breaking" a dependency in scheduler involves modification of instructions 
>> that would allow scheduler to move one instruction past the other.  The most 
>> common case of breaking a dependency is "r2 = r1 + 4; r3 = [r2];" which can 
>> be transformed into "r3 = [r1 + 4]; r2 = r1 + 4;".  Breaking a dependency is 
>> not ignoring it, speculatively or otherwise; it is an equivalent code 
>> transformation to allow scheduler more freedom to fill up CPU cycles.
> 
> 
> Yes, but there are times when it does this a bit too aggressively and
> this looks like the cause for a performance regression that I'm
> investigating on ARM. I was looking for a way of preventing this
> transformation and there doesn't seem to be an easy one other than the
> obvious hack.

If you want to prevent a particular transformation from occurring, then you need to 
investigate why the scheduler thinks that there is nothing better to do than to 
schedule an instruction which requires breaking a dependency.  "Breaking" a 
dependency only increases the pool of instructions available to schedule, and your 
problem seems to lie in "why" the wrong instruction is selected from that 
pool.

Are you sure that the problem is introduced by dependency breaking, rather than 
dependency breaking exposing a latent bug?

> 
> Additionally there appears to be no way to control "flags" in a
> backend hook for sched-rgn for DONT_BREAK_DEPENDENCIES . Again if the
> DONT_BREAK_DEPENDENCIES is meant to be disabled with these flags, then
> it looks like we should allow for these to also be handled or describe
> TARGET_SCHED_SET_SCHED_FLAGS as only a hook valid with the selective
> scheduler.

I'm not sure I follow you here.  Any port can define 
TARGET_SCHED_SET_SCHED_FLAGS and set current_sched_info->flags to whatever it 
thinks is appropriate.  E.g., c6x does this to disable dependency breaking for 
a particular kind of loop.

> 
>> 
>>> 
>>> 
>>> It's funny how I've been trying to track down a glitch and ended up
>>> asking the same question today. Additionally if I use
>>> TARGET_SCHED_SET_SCHED_FLAGS on a port that doesn't use the selective
>>> scheduler, this does nothing. Does anyone know why is this the default
>>> for ports where we don't turn on selective scheduling and might need a
>>> hook to turn this off ?
>> 
>> SCHED_FLAGS is used to enable or disable various parts of the GCC scheduler.  On 
>> an architecture that supports speculative scheduling with recovery (IA64) 
>> it can turn this feature on or off.  The documentation for various features 
>> of sched-rgn, sched-ebb and sel-sched is not the best, and one will likely 
>> get weird artefacts by trying out non-default settings.
> 
> 
> Well, it appears as though TARGET_SCHED_SET_SCHED_FLAGS is only valid
> with the selective scheduler on as above and is a no-op as far as
> sched-rgn goes. This whole area could do with some improved
> documentation - I'll follow up with some patches to see if I can
> improve the situation.

I don't think this is the case.  TARGET_SCHED_SET_SCHED_FLAGS has two outputs: 
one is SPEC_INFO structure (which is used for IA64 only, both for sel-sched and 
sched-rgn), and the other one is modification of current_sched_info->flags, 
which affects all schedulers (sched-rgn, sched-ebb and sel-sched) and all ports.

--
Maxim Kuvyrkov
www.kugelworks.com






Re: DONT_BREAK_DEPENDENCIES bitmask for scheduling

2013-12-10 Thread Maxim Kuvyrkov
On 11/12/2013, at 3:45 pm, Ramana Radhakrishnan  
wrote:

> On Wed, Dec 11, 2013 at 12:02 AM, Maxim Kuvyrkov  wrote:
>> On 11/12/2013, at 11:14 am, Ramana Radhakrishnan  
>> wrote:
>> 
>>> On Tue, Dec 10, 2013 at 9:44 PM, Maxim Kuvyrkov  
>>> wrote:
>>>> On 11/12/2013, at 5:17 am, Ramana Radhakrishnan 
>>>>  wrote:
>>>> 
>>>>> On Mon, Jul 1, 2013 at 5:31 PM, Paulo Matos  wrote:
>>>>>> Hi,
>>>>>> 
>>>>>> Near the start of schedule_block, find_modifiable_mems is called if 
>>>>>> DONT_BREAK_DEPENDENCIES is not enabled for this scheduling pass.  It 
>>>>>> seems only the c6x backend currently uses this.
>>>>>> However, it's quite strange that this is not a requirement for all 
>>>>>> backends, since find_modifiable_mems moves all my dependencies in 
>>>>>> SD_LIST_HARD_BACK to SD_LIST_SPEC_BACK even though I don't have 
>>>>>> DO_SPECULATION enabled.
>>>>>> 
>>>>>> Since dependencies are accessed later on from try_ready (for example), I 
>>>>>> would have thought that it would always be good not to call 
>>>>>> find_modifiable_mems, given that it seems to 'literally' break 
>>>>>> dependencies.
>>>>>> 
>>>>>> Is the behaviour of find_modifiable_mems a bug or somehow expected?
>>>> 
>>>> "Breaking" a dependency in scheduler involves modification of instructions 
>>>> that would allow scheduler to move one instruction past the other.  The 
>>>> most common case of breaking a dependency is "r2 = r1 + 4; r3 = [r2];" 
>>>> which can be transformed into "r3 = [r1 + 4]; r2 = r1 + 4;".  Breaking a 
>>>> dependency is not ignoring it, speculatively or otherwise; it is an 
>>>> equivalent code transformation to allow scheduler more freedom to fill up 
>>>> CPU cycles.
>>> 
>>> 
>>> Yes, but there are times when it does this a bit too aggressively and
>>> this looks like the cause for a performance regression that I'm
>>> investigating on ARM. I was looking for a way of preventing this
>>> transformation and there doesn't seem to be an easy one other than the
>>> obvious hack.
>> 
>> If you want to prevent a particular transformation from occurring, then you 
>> need to investigate why the scheduler thinks that there is nothing better to 
>> do than to schedule an instruction which requires breaking a dependency.  
>> "Breaking" a dependency only increases the pool of instructions available to 
>> schedule, and your problem seems to lie in "why" the wrong instruction is 
>> selected from that pool.
>> 
>> Are you sure that the problem is introduced by dependency breaking, rather 
>> than dependency breaking exposing a latent bug?
> 
> From my reading, the dependency breaking applies to addresses in a
> memcpy-type loop which is unrolled.  The original expectation is that by
> switching this to an add and a negative offset one can get more ILP in
> theory, but in practice the effects appear to be worse because of
> secondary issues that I'm still investigating.

Is this happening in the 1st or 2nd scheduling pass?  From your comments I get 
a feeling that dependency breaking is introducing an additional instruction, 
rather than adding an offset to a memory reference.  Ideally, dependency 
breaking during the 1st scheduling pass should be more conservative and avoid too 
many new instructions (e.g., by breaking a dependency only if nothing 
whatsoever can be scheduled on the current cycle).  Dependency breaking during 
the 2nd scheduling pass can be more aggressive, as it can make sure that adding an 
offset to a memory instruction will not cause it to be split.

> 
>> 
>>> 
>>> Additionally there appears to be no way to control "flags" in a
>>> backend hook for sched-rgn for DONT_BREAK_DEPENDENCIES . Again if the
>>> DONT_BREAK_DEPENDENCIES is meant to be disabled with these flags, then
>>> it looks like we should allow for these to also be handled or describe
>>> TARGET_SCHED_SET_SCHED_FLAGS as only a hook valid with the selective
>>> scheduler.
>> 
>> I'm not sure I follow you here.  Any port can define 
>> TARGET_SCHED_SET_SCHED_FLAGS and set current_sched_info->flags to whatever 
>> it thinks is appropriate.  E.g., c6x does this to disable dependency 
>> breaking for a particular kind of loops.
> 
> Ah, that will probably work and that's probably what I was missing.  I
> don't like the idea of the same interface setting global state in a
> backend; that's probably not the best approach in the long term.
> Expecting to set global state in this form from an interface is
> something I wasn't expecting, especially when it takes a parameter.

Originally TARGET_SCHED_SET_SCHED_FLAGS was setting current_sched_info->flags 
and nothing else, hence the name.  The parameter spec_info appeared later to 
hold flags related to IA64-specific speculative scheduling.


--
Maxim Kuvyrkov
www.kugelworks.com





Re: Google Summer of Code -- Admin needed

2014-02-10 Thread Maxim Kuvyrkov
On 6/02/2014, at 7:45 am, Moore, Catherine  wrote:

> Hi All,
> 
> I acted as the Google Summer of Code Administrator in 2013 and I do not wish 
> to continue.
> 
> There is an upcoming deadline (February 14th) for an organization to submit 
> their applications to the Google Summer of Code.  Is there anyone who would 
> like to act as the GCC admin for 2014?
> I assume that folks would like to have the gcc project continue to 
> participate;  we need to find someone to submit the application and commit to 
> the admin duties.
> 
> The bulk of the work is organizational.  There are some web forms to fill 
> out, evaluations need to be completed, an IRC meeting is required, plus 
> finding projects and mentors for the projects.
> 
> I hope someone will pick this up.

I want to admin GCC's GSoC this year.

In the next several days I will be bugging past GCC GSoC admins and mentors to 
get an idea of what I'm getting myself into.  Please send me a note if you 
haven't been a GSoC mentor in past years, but want to try this year.

Thank you,

--
Maxim Kuvyrkov
www.linaro.org



GSoC project ideas

2014-02-16 Thread Maxim Kuvyrkov
Hi,

GCC has applied as a mentoring organization to GSoC 2014, and we need to update 
the Project Ideas page: http://gcc.gnu.org/wiki/SummerOfCode .  Ideas are where GSoC 
starts, and they are what capture the attention and imagination of prospective 
students (and future developers!) of GCC.

If you have an idea for a student project -- post it at 
http://gcc.gnu.org/wiki/SummerOfCode .  If you can't easily edit the wiki 
directly, feel free to send your ideas to me directly or as a reply to this 
thread, and I will add them to the wiki.

You don't have to commit to be a mentor for an idea that you post.  We will 
worry about finding mentors once a student expresses interest in a particular 
idea.

You don't have to be an active GCC developer to post an idea.  If you are an 
experienced GCC user and you have wanted feature X in GCC all your life -- post an 
idea about it.

If you are a prospective GSoC student -- then we definitely want to hear your 
ideas.

We need the ideas page fully updated and ready by the end of February (a couple of 
weeks left).  The student application period opens on March 10th, and keep in mind 
that students will need to meditate on the various projects/ideas/choices for 
a week or so.

For GSoC 2014 timeline see 
https://www.google-melange.com/gsoc/events/google/gsoc2014 .

Thank you,

--
Maxim Kuvyrkov
www.linaro.org





[GSoC] GCC has been accepted to GSoC 2014

2014-02-25 Thread Maxim Kuvyrkov
Hi All,

GCC has been accepted as mentoring organization to Google Summer of Code 2014, 
and we are off to the races!

If you want to be a GCC GSoC student check out the project idea page at 
http://gcc.gnu.org/wiki/SummerOfCode .  Feel free to ask questions on IRC [1] 
and get in touch with your potential mentors.  If you are not sure who to 
contact -- send me an email at maxim.kuvyr...@linaro.org.

If you are a GCC developer, then create a profile at 
http://www.google-melange.com/gsoc/homepage/google/gsoc2014 to be able to rank 
student applications.  Once registered, connect with the "GCC - GNU Compiler 
Collection" organization.

If you actively want to mentor a student project, then note so in your GSoC 
connection request.

If you have any questions or comments please contact your friendly GSoC admin 
via IRC (maximk), email (maxim.kuvyr...@linaro.org) or Skype/Hangouts.

Thank you,

[1] irc://irc.oftc.net/#gcc

--
Maxim Kuvyrkov
www.linaro.org





Re: [gsoc 2014] moving fold-const patterns to gimple

2014-03-18 Thread Maxim Kuvyrkov
On Mar 18, 2014, at 9:13 PM, Prathamesh Kulkarni  
wrote:

> On Mon, Mar 17, 2014 at 2:22 PM, Richard Biener
>  wrote:
>> On Sun, Mar 16, 2014 at 1:21 PM, Prathamesh Kulkarni
>>  wrote:
>>> In c_expr::c_expr, shouldn't OP_C_EXPR be passed to operand
>>> constructor instead of OP_EXPR ?
>> 
>> Indeed - I have committed the fix.
>> 
> My earlier mail got rejected (maybe because I attached pdf ?),
> by mailer daemon, sorry for the double post.
> I have uploaded the proposal here:
> https://drive.google.com/file/d/0B7zFk-y3DFiHa1Nkdzh6TFZpVFE/edit?usp=sharing
> I would be grateful to receive your feedback.

Prathamesh,

I will let Richard comment on the proposal contents, but make sure you have 
formally applied on the GSoC website and uploaded some version of your proposal 
by the end of Thursday (only 2 days left!).  You will be able to update details of 
the proposal later.

Thank you,

--
Maxim Kuvyrkov
www.linaro.org



Re: GSoC Concepts - separate checking

2014-03-18 Thread Maxim Kuvyrkov
On Mar 12, 2014, at 12:19 PM, Braden Obrzut  wrote:

> My name is Braden Obrzut and I am a student from the University of Akron
> interested in contributing to GCC for GSoC.  I am interested in working on a
> project related to the c++-concepts branch.
> 
> In particular, I am interested in implementing mechanisms for checking the
> safety of constrained templates (separate checking). I have discussed the
> project with Andrew Sutton (who maintains the c++-concepts branch and happens
> to be a professor at Akron) and believe that some aspects of the work would be
> feasible within the three month time span. I also hope to continue working on
> the project as my honors thesis project.
> 
> As a hobby I usually design and implement declarative languages for content
> definition in old video games.  While I currently may have limited experience
> with GCC internals, I think this would be a great opportunity for me to learn
> how real compilers works and help with the development of the C++ programming
> language.

Braden,

Do you have a proposal for a GSoC GCC project?  If you do want to apply, please 
make sure you are registered at the GSoC website and have an application filed 
by the end of Thursday (only 2 days left!).

Thank you,

--
Maxim Kuvyrkov
www.linaro.org


Re: GSoC 2014 C++ Concepts project

2014-03-18 Thread Maxim Kuvyrkov
On Mar 12, 2014, at 11:42 AM, Thomas Wynn  wrote:

> Hello, my name is Thomas Wynn. I am a junior in pursuit of a B.S. in
> Computer Science at The University of Akron. I am interested in
> working on a project with GCC for this year's Google Summer of Code.
> More specifically, I would like to work on support for concept
> variables and shorthand notation of concepts for C++ Concepts Lite.
> 
> I am currently doing an independent study with Andrew Sutton in which
> I have been porting and creating various tests for concepts used in
> the DejaGNU test suite of an experimental branch of GCC 4.9, and will
> soon be helping with the development of features in branch. I would
> greatly appreciate any suggestions or feedback for this project so
> that I may write a more detailed, relevant, and accurate proposal.

Hi Thomas,

Do you have a proposal for a GSoC GCC project?  If you do want to apply, please 
make sure you are registered at the GSoC website and have an application filed 
by the end of Thursday (only 2 days left!).

--
Maxim Kuvyrkov
www.linaro.org




Re: About gsoc 2014 OpenMP 4.0 Projects

2014-03-18 Thread Maxim Kuvyrkov
On Feb 26, 2014, at 12:27 AM, guray ozen  wrote:

> Hello,
> 
> I'm a master's student in high-performance computing at the Barcelona
> Supercomputing Center, and I'm working on my thesis regarding the OpenMP
> accelerator model implementation in our compiler (OmpSs).  I have almost
> finished implementing all the new directives to generate CUDA code, and
> according to my design the same implementation for OpenCL won't take
> much more work.  But I haven't yet tried Intel MIC, APU or other
> hardware accelerators :)  Now I'm benchmarking the output kernel codes
> generated by my compiler.  Although the output kernels are generally
> naive, the speedup is not very bad.  When I compare results with the
> HMPP OpenACC 3.2.x compiler, speedups are almost the same, or in some
> cases my results are slightly better.  That's why this term I am going
> to work on compiler-level or runtime-level optimizations for GPUs.
> 
> When I looked at the GCC OpenMP 4.0 project, I couldn't see anything
> about code generation.  Are you going to announce it later?  Or should I
> apply to GSoC with my idea about code generation and device code
> optimizations?

Guray,

Do you have a proposal for a GSoC GCC project?  If you do want to apply, please 
make sure you are registered at the GSoC website and have an application filed 
by the end of Thursday (only 2 days left!).

Thank you,

--
Maxim Kuvyrkov
www.linaro.org


Re: Google Summer of Code

2014-03-18 Thread Maxim Kuvyrkov
On Mar 17, 2014, at 2:39 AM, Mihai Mandrescu  wrote:

> Hello,
> 
> I just enrolled in Google Summer of Code and would like to contribute
> to GCC. I'm not very familiar with the process of getting a project
> for GSoC nor with free software development in general, but I would
> like to learn. Can someone give me some hints please?
> 

Hi Mihai,

There is very little time left for student applications -- only 2 days.

In general, by now you should have a specific idea that you want to work on.  It 
doesn't have to be your own; there are many ideas for potential GSoC projects 
at http://gcc.gnu.org/wiki/SummerOfCode .

You need to be realistic about your experience in compiler development and GCC 
development.  It is better to apply for an easier/smaller project and 
successfully finish it, than to work on a complicated project and not get it 
done.

Finally, please don't cross-post to several lists, gcc@gcc.gnu.org is the 
correct list for development discussions (with gcc-patc...@gcc.gnu.org being 
the list for discussion of specific patches).

Thank you,

--
Maxim Kuvyrkov
www.linaro.org




Re: [GSoC 2014] Proposal: OpenCL Code Generator

2014-03-19 Thread Maxim Kuvyrkov
On Mar 20, 2014, at 4:02 AM, Ilmir Usmanov  wrote:

> Hi all!
> 
> My name is Ilmir Usmanov and I'm a student of Moscow Institute of Physics and 
> Technology.
> Also I'm implementing OpenACC 1.0 in gomp4 branch as an employee of Samsung 
> R&D Institute Russia (SRR). My research interests are connected with creating 
> OpenCL Code Generator. So I'd like to participate GSoC 2014 with project 
> called "OpenCL Code Generator" as an independent student. I will do the 
> project during my free time, my employer will not pay for this.

I do not think you will qualify as a student for the GSoC program.  Students 
are expected to work close to full-time during summer break and have GSoC as 
their main priority.  With your employment obligation I don't think you will be 
able to commit to your GSoC project at that level.

If I am wrong in my assumptions above, and you can commit to the GSoC project 
being your first priority for the summer months, please apply with your 
proposal on the GSoC website.  There is very little time left, so move fast.

Thank you,

--
Maxim Kuvyrkov
www.linaro.org




Re: Register at the GSoC website to rate projects

2014-03-25 Thread Maxim Kuvyrkov
[Moving to gcc@ from gcc-patches@]

Community,

We've got 11 student proposals (good job, students!), and only the N top-rated 
ones will be accepted into the program.  Therefore, we as a community need to 
make sure that the ratings are representative of our goals -- making GCC the 
best compiler there is.

Go rate the proposals!  Make your voice heard!

Here is a list of proposals (and, "yes" 'GCC Go escape analysis' is submitted 
by two different students).

Generating folding patterns from meta description
Concepts Separate Checking
Integration of ISL code generator into Graphite
GCC Go escape analysis
Dynamically add headers to code
C++11 Support in GCC and libstdc++
GCC: Diagnostics
GCC Go escape analysis
Converting representation levels of GCC back to the source codes
Separate front-end folder from middle-end folder
interested in Minimal support for garbage collection

Thank you,

--
Maxim Kuvyrkov
www.linaro.org



On Mar 15, 2014, at 6:50 PM, Maxim Kuvyrkov  wrote:

> Hi,
> 
> You are receiving this message because you are in the top 50 contributors to GCC 
> [1].  Congratulations!
> 
> Since you are a top contributor to the GCC project, it is important for you to 
> rate the incoming student GSoC applications.  Go and register at 
> https://www.google-melange.com/gsoc/homepage/google/gsoc2014 and connect with 
> "GCC - GNU Compiler Collection" organization.  Pretty.  Please.  It will take 
> 3-5 minutes of your time.
> 
> Furthermore, if you work at a college or university (or otherwise interact 
> with talented computer science students), encourage them to look at GCC's 
> ideas page [2] and run with it for a summer project (or, indeed, propose 
> their own idea).  They should hurry, only one week is left!
> 
> So far we've got several good proposals from students, but we want to see 
> more.
> 
> Thank you,
> 
> [1] As determined by number of checked in patches over the last 2 years (and, 
> "yes", I know this is not the fairest metric).  Script used:
> $ git log "--pretty=format:%an" | head -n 12000 | awk '{ a[$1]++; } END { for 
> (i in a) print a[i] " " i;  }' | sort -g | tail -n 50
> 
> [2] http://gcc.gnu.org/wiki/SummerOfCode
> 
> --
> Maxim Kuvyrkov
> www.linaro.org
> 
> 
> 



Re: add_branch_dependences in sched-rgn.c

2014-04-09 Thread Maxim Kuvyrkov
On Apr 9, 2014, at 4:15 AM, Kyrill Tkachov  wrote:

> Hi all,
> 
> I'm looking at some curious pre-reload scheduling behaviour and I noticed 
> this:
> 
> At the add_branch_dependences function sched-rgn.c there is a comment that 
> says "branches, calls, uses, clobbers, cc0 setters, and instructions that can 
> throw exceptions" should be scheduled at the end of the basic block.
> 
> However right below it the code that detects this kind of insns seems to only 
> look for these insns that are directly adjacent to the end of the block 
> (implemented with a while loop that ends as soon as the current insn is not 
> one of the aforementioned).
> 
> Shouldn't the code look through the whole basic block, gather all of the 
> branches, clobbers etc. and schedule them at the end?
> 

Not really.  The instruction sequences mentioned in the comment end a basic block 
by definition -- if there is a jump or another "special" sequence, then the basic 
block can't continue beyond it, as control may be transferred to something 
other than the next instruction.  add_branch_dependences() makes sure that the 
scheduler does not "accidentally" place something after those "special" 
sequences, thus creating a corrupted basic block.
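The backward walk Kyrill describes, absorbing the trailing run of "block-ending" instructions and stopping at the first ordinary one, can be modeled in a few lines. This is a toy model under assumed names; the instruction kinds and the helper are illustrative, not GCC's actual API:

```python
# Toy model of the while-loop in add_branch_dependences: walk backward
# from the end of a basic block, collecting instructions that must stay
# at the block end, and stop at the first ordinary instruction.
END_KINDS = {"jump", "call", "use", "clobber", "cc0_setter", "can_throw"}

def tail_that_must_stay_last(block):
    """block: list of (kind, text) in program order.
    Returns the trailing run of block-ending instructions."""
    tail = []
    for kind, text in reversed(block):
        if kind not in END_KINDS:
            break            # first ordinary insn ends the walk
        tail.append((kind, text))
    tail.reverse()
    return tail

bb = [("set", "r1 = r2 + 1"), ("set", "r3 = [r1]"),
      ("use", "use r3"), ("jump", "jump L2")]
print(tail_that_must_stay_last(bb))   # the use and the jump
```

This also shows why scanning only adjacent instructions is enough: anything of these kinds found deeper in the block would itself have ended the basic block there.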

--
Maxim Kuvyrkov
www.linaro.org




[GSoC] Status - 20140410

2014-04-09 Thread Maxim Kuvyrkov
Community, [and BCC'ed mentors]

Google Summer of Code is panning out nicely for GCC.

We have received 5 slots for GSoC projects this year.  The plan is to accept the 5 
top-rated student proposals.  If you haven't rated the projects yet, you have 2 
days to go to the GSoC website [1] and rate the proposals.

I will mark the top-5 proposals "Accepted" this Friday/Saturday.

We already have mentors volunteering for the 5 currently leading projects, which 
is great.  We also need a couple of backup mentors in case one of the primary 
mentors becomes temporarily unavailable.  Your main job as a backup mentor will 
be to follow 2-5 of the student projects and be ready to step in, should the need 
arise.  Any volunteers for the role of backup mentor?

I will send the next GSoC update early next week [when student projects are 
accepted].

Thank you,

[1] https://www.google-melange.com/gsoc/homepage/google/gsoc2014

--
Maxim Kuvyrkov
www.linaro.org




