Improvements of the haifa scheduler
Hi. I want to share some of my thoughts and doings on improving / cleaning up the current GCC instruction scheduler (Haifa) - most of them are just small obvious improvements. I have semi-ready patches for about half of them and would appreciate any early suggestions or comments on the following draft plan:

1. Remove compute_forward_dependencies (). [Done]
Since the new representation of dependency lists was checked in, we no longer need to compute forward dependencies separately. It is natural to add forward links at the same time as we generate backward ones.

2. Use alloc_pools instead of obstacks for dep_nodes and deps_lists. [In progress]
As pointed out by Jan Hubicka, the scheduler peaks at +100M on a testcase for PR28071 after my patch for the new dependency lists was checked in. Though alloc_pools should have been my first choice while writing that patch, I decided to mimic as closely as possible the original rtx instruction lists with their scheme of deallocation at the very end. So the next step is to define a proper lifetime for dependency lists and use alloc_pools to reuse nodes and lists from previous regions. Which brings us to ...

3. Define a clear interface for manipulating dependencies. [In progress]
This one popped up when I began to debug <2> and understood that the scheduler uses and changes dependency lists in ways it shouldn't. Lists are being copied, edited and deleted directly without interaction with sched-deps.c. What the scheduler really needs is the following set of primitives:

o FOR_EACH_DEP (insn, which_list, iterator, dep) - walk through insn's which_list (one of {backward, resolved_backward, forward, resolved_forward}) and provide the user with the dep. Ayal Zaks suggested this type of macro weeks ago, but at that time I didn't agree with him.

o dep_t find_dep_between (producer, consumer) - find the dependency between two instructions. Currently we walk through the list looking for what we need.
A better way would be to first check the dependency caches and then, if we can't determine that there is no dependency, walk through the shorter of the two candidate lists: the producer's forward list and the consumer's backward list.

o void add_dep (dep_t) - add a dependency.
o void remove_dep (iterator) - remove the dependency pointed to by the iterator.
o void resolve_dep (iterator) - resolve the dependency pointed to by the iterator.
o int get_list_size (insn, which_list) - get the size of insn's which_list.
o bool list_has_only_speculative_deps (insn, which_list) - return true if all of insn's dependencies can be overcome with some sort of speculation.
o void {create, delete}_dependency_lists (insn) - create / delete dependency lists for insn.

As you can see, the scheduler doesn't need to know the internal representation of the deps_list / dep_node.

4. Support speculative loads into subregs. [Planned]
As noted by Jim Wilson, the current support for ia64 speculation doesn't handle subregs, though that would be easy to fix.

5. Make sched-deps.c mark as speculative only those dependencies which can actually be overcome with the speculation types currently in use. [Planned]
At the moment we first generate speculative dependencies, and only at the moment of adding an instruction to the ready list do we check whether we can (or whether it is worth it to) overcome each of its dependencies.

6. Make ds_t a structure. [Planned]
ds_t is the type for representing the status of a dependency. It contains information about the types of the dependency (true, output, anti) and the probabilities of speculation success (begin_data, be_in_data, begin_control, be_in_control) - that makes three bits and four integers coded in a single int. Historical reasons forced this inelegant approach, but now those reasons are gone and the problem can be solved in a natural way.

7. Use cse_lib in sched-rgn.c. [In progress]
At the moment cse_lib works to improve alias analysis only during sched-ebb scheduling.
It is trivial to also enable it when scheduling single-block regions in sched-rgn. The patch for this is a one-liner which was tested with a bootstrap but not on SPECs. It is also possible to use cse_lib on sequential basic blocks of the region, thus handling them as an extended basic block. If it is possible to save cse_lib states, then we'll be able to process trees; merging capabilities would be required for DAGs. I don't know if this can be done.

8. Don't generate a memory barrier on a simplejump. [Done]
sched-deps.c handles every jump in the scheduling region as a memory barrier - i.e. almost no memory operation can be moved through it. But unconditional jumps don't really need such restrictions. A one-liner patch for this was tested with a bootstrap but not on SPECs.

9. Use sched-ebb on other architectures. [Done]
After the patches for ia64 speculation and follow-up fixes to them, sched-ebb no longer corrupts the CFG and can safely be used as a non-final pass on platforms other than ia64. I successfully bootstrapped (and, pro
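To make item 3 more concrete, here is a rough C sketch of what the proposed primitives could look like. The primitive names follow the plan above; the struct layouts, the uid fields, and the malloc-based allocation are my own illustrative guesses, not actual sched-deps.c code (which would use alloc_pools per item 2):

```c
#include <stdlib.h>
#include <stddef.h>

/* Hypothetical dep_node / deps_list layout -- internal to sched-deps.c,
   opaque to the rest of the scheduler.  */
typedef struct dep_node
{
  int producer_uid;          /* uid of the producing insn */
  int consumer_uid;          /* uid of the consuming insn */
  struct dep_node *next;
} dep_node;

typedef struct deps_list
{
  dep_node *first;
  int size;
} deps_list;

/* o void add_dep (dep_t) */
static void
add_dep (deps_list *list, int producer_uid, int consumer_uid)
{
  dep_node *n = (dep_node *) malloc (sizeof (dep_node));
  n->producer_uid = producer_uid;
  n->consumer_uid = consumer_uid;
  n->next = list->first;
  list->first = n;
  list->size++;
}

/* o int get_list_size (insn, which_list) */
static int
get_list_size (const deps_list *list)
{
  return list->size;
}

/* o dep_t find_dep_between (producer, consumer) -- here a plain list
   walk; the plan would first consult the dependency caches and pick
   the shorter of the producer's forward / consumer's backward lists.  */
static dep_node *
find_dep_between (deps_list *list, int producer_uid, int consumer_uid)
{
  dep_node *n;
  for (n = list->first; n != NULL; n = n->next)
    if (n->producer_uid == producer_uid && n->consumer_uid == consumer_uid)
      return n;
  return NULL;
}

/* o FOR_EACH_DEP (insn, which_list, iterator, dep) -- simplified here
   to iterate one list directly.  */
#define FOR_EACH_DEP(list, iter) \
  for ((iter) = (list)->first; (iter) != NULL; (iter) = (iter)->next)
```

The point of the sketch is only the shape of the interface: callers manipulate dependencies exclusively through these entry points and never touch the node links themselves.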
Re: Improvements of the haifa scheduler
Vladimir N. Makarov wrote:
> Maxim Kuvyrkov wrote:
>> Hi. I want to share some of my thoughts and doings on improving / cleaning up the current GCC instruction scheduler (Haifa) - most of them are just small obvious improvements. I have semi-ready patches for about half of them and would appreciate any early suggestions or comments on the following draft plan: ... Any comments, suggestions or 'please don't do that' will be greatly appreciated. ...
>
> Good aliasing is very important for the scheduler. But I'd look at this more widely. We need good aliasing for many RTL optimizations. What happened to the ISP RAS aliasing patch propagating SSA info to RTL? Why is it stalled? As for Sanjiv Gupta's aliasing work, that was interesting, but as I remember the patch made the compiler too slow (like 40% slower). You should make this approach faster to get it accepted and used by default.

I understand that good aliasing is important for several RTL passes, and I hope that the general aliasing support will improve in time. I must admit that I haven't investigated Gupta's patch yet, but I believe that it is so slow because it needs to rescan the function many times in order to get correct aliasing information. On the other hand, alias analysis for the scheduler's data speculation has the luxury of being incorrect at times. So it looks like low-hanging fruit to me to try a fast but unsafe variant of Gupta's work and see if the magic happens.

> Another important thing to do is to make the 1st scheduler register-pressure sensitive. It would improve performance and solve the 1st insn scheduling problem for x86, x86_64. Now it is off by default because the scheduler moves insns containing hard regs too freely, and this results in the reload failing to find a hard register of a small class. If you need benchmarking for machines (like ppc) you have no access to, I can provide the benchmarking.
>> I should also mention that I do all these works in my spare time, which is not a lot, so the above is my plan for about half a year.
>
> I really appreciate it. Maybe if you or ISP RAS could find students (e.g. from Moscow University) to do this as Google Summer of Code, it could help you. I think it is not too late. You should ask Ian Taylor or Daniel Berlin if you want to do this.

The projects I've described are too small to qualify as 3-month works. But your suggestion to solve the 'scheduler -> ra' problem I would estimate as just the right one. Another GSC project might be to investigate and fix the places in the compiler where aliasing information exported from tree-ssa is being invalidated.

So basically here are three Google Summer of Code projects:
o Scheduler -> RA
o Fix passes that invalidate tree-ssa alias export.
o { Fast but unsafe Gupta's aliasing patch, Unsafe tree-ssa alias export } in the scheduler's data speculation.

Ian, Daniel, what do you think?

Thanks, Maxim
Re: Improvements of the haifa scheduler
Diego Novillo wrote:
> Maxim Kuvyrkov wrote on 03/05/07 02:14:
>> o Fix passes that invalidate tree-ssa alias export.
>
> Yes, this should be good and shouldn't need a lot of work.
>
>> o { Fast but unsafe Gupta's aliasing patch, Unsafe tree-ssa alias export } in scheduler's data speculation.
>
> "unsafe" alias export? I would definitely like to see the tree->rtl alias information transfer fixed once and for all. Finishing RAS's tree->rtl work would probably make a good SoC project.

"Unsafe" doesn't mean not fixed. My thought is that it would be nice to have a switch in aliasing that turns such operations as

  join (pt_anything, points_to) -> pt_anything

into

  join (pt_anything, points_to) -> points_to

This transformation sacrifices correctness for the sake of additional information.

Thanks, Maxim
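To make the suggested switch concrete, here is a toy model of the points-to join; the enum, the flag, and the join function are made-up illustrations for this thread, not GCC's actual points-to code:

```c
#include <stdbool.h>

/* Toy points-to lattice: PT_ANYTHING means "may point anywhere".  */
typedef enum { PT_NOTHING, PT_SPECIFIC, PT_ANYTHING } pt_kind;

/* When true, joining PT_ANYTHING with a specific set keeps the
   specific set.  This is unsafe -- it drops the "anything" possibility
   -- but more precise, which is acceptable for the scheduler's data
   speculation, where a wrong "no dependence" answer is recoverable.  */
static bool unsafe_alias_mode = false;

static pt_kind
join (pt_kind a, pt_kind b)
{
  if (a == PT_ANYTHING || b == PT_ANYTHING)
    {
      if (unsafe_alias_mode)
        {
          /* join (pt_anything, points_to) -> points_to */
          pt_kind other = (a == PT_ANYTHING) ? b : a;
          if (other != PT_ANYTHING)
            return other;
        }
      /* join (pt_anything, points_to) -> pt_anything (safe default) */
      return PT_ANYTHING;
    }
  return (a == PT_SPECIFIC || b == PT_SPECIFIC) ? PT_SPECIFIC : PT_NOTHING;
}
```

With the flag off the join is conservative as today; with it on, a consumer such as the data-speculation pass gets the more informative (but possibly wrong) answer.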
Re: GCC 4.2.0 Status Report (2007-04-15)
Steven Bosscher wrote:
> On 4/16/07, Mark Mitchell <[EMAIL PROTECTED]> wrote:
>> 29841 [4.2/4.3 regression] ICE with scheduling and __builtin_trap
>
> Honza, PING!

There is a patch for this PR29841 at http://gcc.gnu.org/ml/gcc-patches/2007-02/msg01134.html . The problem is that I don't really know which maintainer to ask to review it :(

-- Maxim
Re: [RFC] Kernel livepatching support in GCC
Hi, The feedback in this thread was overall positive, with good suggestions on implementation details. I'm starting to work on the first draft, and plan to post something in 2-4 weeks. Thanks.

On 28 May 2015 at 11:39, Maxim Kuvyrkov wrote:
> Hi,
>
> Akashi-san and I have been discussing required GCC changes to make kernel's livepatching work for AArch64 and other architectures. At the moment livepatching is supported for x86[_64] using the following options: "-pg -mfentry -mrecord-mcount -mnop-mcount", which is geek-speak for "please add several NOPs at the very beginning of each function, and make a section with addresses of all those NOP pads".
>
> The above long-ish list of options is a historical artifact of how livepatching support evolved for x86. The end result is that for livepatching (or ftrace, or possible future kernel features) to work, the compiler needs to generate a little bit of empty code space at the beginning of each function. The kernel can later use that space to insert call sequences for various hooks.
>
> Our proposal is that instead of adding -mfentry/-mnop-mcount/-mrecord-mcount options to other architectures, we should implement a target-independent option -fprolog-pad=N, which will generate a pad of N nops at the beginning of each function and add a section entry describing the pad, similar to -mrecord-mcount [1].
>
> Since adding NOPs is much less architecture-specific than outputting call instruction sequences, this option can be handled in a target-independent way, at least for some/most architectures.
>
> Comments?
>
> As I found out today, the team from Huawei has implemented [2], which follows the x86 example of the -mfentry option generating a hard-coded call sequence. I hope that this proposal can be easily incorporated into their work, since most of the livepatching changes are in the kernel.
> [1] Technically, generating a NOP pad and adding a section entry in .__mcount_loc are two separate actions, so we may want to have a -fprolog-pad-record option. My instinct is to stick with a single option for now, since we can always add more later.
>
> [2] http://lists.infradead.org/pipermail/linux-arm-kernel/2015-May/346905.html
>
> -- Maxim Kuvyrkov
> www.linaro.org

-- Maxim Kuvyrkov www.linaro.org
Re: [RFC, Fortran] Avoid race on testsuite temporary files
> On Dec 9, 2015, at 5:27 PM, Yvan Roux wrote:
>
> Hi,
>
> as it was raised in https://gcc.gnu.org/ml/gcc-patches/2015-10/msg01540.html we experience random failures in gfortran validation when it is run in parallel (-j 8). The issues occur because of concurrent access to the same file; the first two patches fixed some of them by changing the file names used, but there are still remaining conflicts (6 usages of foo, 8 of test.dat). This is easy to fix and I have a patch for that, but there is another issue and I'd like to have your thoughts on it.
>
> There are a little more than 1000 testcases which use IO without explicit file names, ~150 use scratches (a temporary file named gfortrantmp + 6 extra chars is created) and the others, which only specify the io unit number, use a file named fort.NN with NN the number of the io unit. We see conflicts on these generated files, as a lot of testcases use the same number; the most used are:
>
> 10 => 150
> 99 => 70
> 6 => 31
> 11 => 27
> 1 => 23
>
> I started to change the testcases to use scratches when possible, before finding that there are that many to fix, and I also had conflicts on the generated scratch names. The options I see to fix that are:
>
> 1- Move all these testcases into an IO subdir and change the testsuite to avoid parallelism in that directory.
> 2- Use scratches when possible and improve libgfortran file name generation. I don't know fortran well, but is it possible to change the file name patterns for scratches and io unit files?
> 3- Change the io unit numbers used, as it was suggested on irc, but I find it a bit painful to maintain.
>
> Any comments are welcome.

I have also investigated several races on I/O in the gfortran testsuite, and my preference is to go with [1].
Specifically, if a fortran test does I/O with filenames that can clash with some other test, then the test should be located in a sub-directory of the gfortran.dg testsuite that runs its tests in order. -- Maxim Kuvyrkov www.linaro.org
Re: Live range shrinkage in pre-reload scheduling
On May 13, 2014, at 10:27 PM, Kyrill Tkachov wrote: > Hi all, > > In haifa-sched.c (in rank_for_schedule) I notice that live range shrinkage is > not performed when SCHED_PRESSURE_MODEL is used and the comment mentions that > it results in much worse code. > > Could anyone elaborate on this? Was it just empirically noticed on x86_64? + Richard Sandiford who wrote SCHED_PRESSURE_MODEL -- Maxim Kuvyrkov www.linaro.org
Re: Live range shrinkage in pre-reload scheduling
On May 15, 2014, at 6:46 PM, Ramana Radhakrishnan wrote: > >> >> I'm not claiming it's a great heuristic or anything. There's bound to >> be room for improvement. But it was based on "reality" and real results. >> >> Of course, if it turns out not be a win for ARM or s390x any more then it >> should be disabled. > > The current situation that Kyrill is investigating is a case where we > notice the first scheduler pass being a bit too aggressive with > creating ILP opportunities with the A15 scheduler that causes > performance differences with not turning on the first scheduler pass > vs using the defaults. Charles has a work-in-progress patch that fixes a bug in SCHED_PRESSURE_MODEL that causes the above symptoms. The bug causes 1st scheduler to unnecessarily increase live ranges of pseudo registers when there are a lot of instructions in the ready list. Charles, can you finish your patch in the next several days and post it for review? Thank you, -- Maxim Kuvyrkov www.linaro.org
[GSoC] Status - 20140516
Hi Community,

The community bonding period is coming to a close; students can officially start coding on Monday, May 19th. In the past month the students should have applied for FSF copyright assignment and, hopefully, executed a couple of test tasks to get a feel for GCC development.

The GSoC Reunion (an unconference to discuss results of concluded GSoCs) will be held in San Jose, CA, on 23-26 October 2014. GCC gets to send 2 delegates on Google's dime (airfare, hotel, food), but more can attend via a registration lottery and covering their own expenses. If you are interested in going to the GSoC Reunion, please let me know.

Thank you, -- Maxim Kuvyrkov www.linaro.org
Re: [GSoC] Status - 20140516
On May 17, 2014, at 10:41 AM, Tobias Grosser wrote: > > > On 17/05/2014 00:27, Maxim Kuvyrkov wrote: >> Hi Community, >> >> The community bonding period is coming to a close, students can officially >> start coding on Monday, May 19th. >> >> In the past month the student should have applied for FSF copyright >> assignment and, hopefully, executed on a couple of test tasks to get a feel >> for GCC development. > > In the last mail, I got the impression that you will keep track of the > copyright assignments. Is this the case? Yes. Two of the students already have copyright assignment in place, and I have asked the other 3 about their assignment progress today. Thank you, -- Maxim Kuvyrkov www.linaro.org
[GSoC] Status - 20140616
Hi Community, We are 1 week away from midterm evaluations of students' work. Mentors, please start looking closely into your student's progress and draft up evaluation notes. Midterm evaluations are very important in GSoC. Students who fail this evaluation are immediately kicked out of the program. Students who pass -- get their midterm payment ($2250). Both mentors and students will need to submit midterm evaluations between June 23-27. There is no excuse for not submitting your evaluations. Please let me know if you have any problems submitting your evaluation in the period June 23-27. For evaluations, you might find this guide helpful: http://en.flossmanuals.net/GSoCMentoring/evaluations/ . On another note, copyright assignments are now completed for 4 out of 5 students. I have pinged the last student to get his assignment in order. -- Maxim Kuvyrkov www.linaro.org
[GSoC] Status - 20140707
Hi Community,

All GCC GSoC students have successfully passed mid-term evaluations, and are continuing to work on their projects. Congratulations to all the students!

Furthermore, Linaro has generously provided sponsorship to pay for 1 GCC GSoC student to travel to GNU Tools Cauldron this year. By the results of mid-term evaluations and mentor comments, Prathamesh Kulkarni was selected. As always, thank you to Google for hosting the Cauldron and to Diego for procuring an extra registration spot.

Our plan is to continue bringing the top 1-3 GSoC students to GCC conferences each year. Hopefully, we will get more sponsorship slots from companies doing GCC development next year. We also plan to earmark the funds that the GCC project will receive for mentoring the students ($500 per student) towards sponsoring one of next year's students.

Thank you, and see you at the Cauldron! -- Maxim Kuvyrkov www.linaro.org
Re: mn10300, invariants on DEP_PRO/DEP_CON and on TARGET_SCHED_ADJUST_COST params
On Jul 9, 2014, at 8:21 AM, David Malcolm wrote:
> [CCing nickc, who wrote the mn10300 hook in question]
>
> I'm experimenting with separating out instructions from expressions in RTL; see [1] for more info on that.
>
> I noticed that mn10300 has this implementation of a target hook:
> #define TARGET_SCHED_ADJUST_COST mn10300_adjust_sched_cost
>
> Within mn10300_adjust_sched_cost (where "insn" and "dep" are the first and third parameters respectively), there's this code:
>
> if (GET_CODE (insn) == PARALLEL)
>   insn = XVECEXP (insn, 0, 0);
>
> if (GET_CODE (dep) == PARALLEL)
>   dep = XVECEXP (dep, 0, 0);
>
> However, I believe that these params of this hook ("insn") always satisfy INSN_CHAIN_CODE_P, and so can't have code PARALLEL. [Nick: did those conditionals ever get triggered, or was this defensive coding?]

From what I can tell these are remnants from the early days of haifa-sched (10+ years ago). I would be very surprised if the scheduler didn't ICE on a PARALLEL of INSNs (not to be confused with a PARALLEL as INSN_PATTERN).

> Specifically, the hook is called from haifa-sched.c:dep_cost_1 on the DEP_CON and DEP_PRO of a dep_t.
>
> It's my belief that DEP_CON and DEP_PRO always satisfy INSN_CHAIN_CODE_P - and on every other config so far that seems to be the case.
>
> Is my belief about DEP_CON/DEP_PRO correct? (or, at least, consistent with other gcc developers' views on the matter :)) My patch kit [2] has this expressed in the type system as of [3], so if I'm incorrect about this I'd prefer to know ASAP.

Yes, it is correct.

> Similarly, do the first and third params of TARGET_SCHED_ADJUST_COST also satisfy INSN_CHAIN_CODE_P?

Yes, since they are always derived from DEP_CON / DEP_PRO.

-- Maxim Kuvyrkov www.linaro.org
[GSoC] Status - 20140804
Hi Community,

The Google Summer of Code is in its final month, and our students have made good progress on their projects. I very much hope that the talented developers who worked on GCC in this year's program will continue to hack on and contribute to GCC outside of the GSoC program!

It has been a great weekend at the Cauldron, and we all had fun and some good discussions too! Prathamesh will be posting notes on the sessions that he attended this week. Big shoutout to all the sponsors and organizers of the Cauldron -- great job!

Alessandro and Braden will be GCC's delegates to the GSoC Reunion un-conference this year -- sponsored by Google.

-- Maxim Kuvyrkov www.linaro.org
[GSoC] Status - 20140819
GSoC Mentors and Students, Please remember that the deadline for final evaluations is August 22 19:00UTC. Both mentors and students should submit their evaluations on GSoC website [*] by that time. So far we have only 2 evaluations (out of 10) submitted. Thank you, -- Maxim Kuvyrkov www.linaro.org
[GSoC] Status - 20140901 FINAL
Hi Community!

Google Summer of Code 2014 has come to an end. We've got some very good results this year -- with code from 4 out of 5 projects checked in to either GCC trunk or a topic branch. Congratulations to the students and mentors for their great work! Even more impressive is the fact that [according to student self-evaluations] most of the students intend to continue GCC development outside of the program.

I encourage both mentors and students to echo their feedback about GCC's GSoC in this thread. The evaluations you posted on the GSoC website are visible to only a few people, and there are good comments and thoughts there that deserve a wider audience.

Thank you, [your friendly GSoC admin signing off] -- Maxim Kuvyrkov www.linaro.org
Re: Maxim Kuvyrkov appointed Android sub-port reviewer
On Nov 14, 2014, at 9:00 PM, H.J. Lu wrote: > On Sun, Apr 15, 2012 at 5:08 PM, David Edelsohn wrote: >>I am pleased to announce that the GCC Steering Committee has >> appointed Maxim Kuvyrkov as reviewer for the Android sub-port. >> >>Please join me in congratulating Maxim on his new role. >> Maxim, please update your listing in the MAINTAINERS file. >> >> Happy hacking! >> David >> > > Hi Maxim, > > Have you added your name to MAINTAINERS? Will do. Thank you, -- Maxim Kuvyrkov www.linaro.org
Re: [GSoC] Google Summer of Code 2015?
Hi Thomas, Tobias will be GSoC admin for GCC this year. He has submitted GSoC application today. Tobias, would you please CC gcc@ for future GSoC-related news and updates? Thank you, -- Maxim Kuvyrkov www.linaro.org > On Feb 19, 2015, at 11:11 AM, Thomas Schwinge wrote: > > Hi! > > I can't remember this being discussed: if the GCC community would like to > participate in this year's Google Summer of Code -- the organization > application period will end tomorrow, > <http://groups.google.com/group/google-summer-of-code-announce>, > <http://www.google-melange.com/gsoc/homepage/google/gsoc2015>. If I > remember correctly, Maxim handled the organizational bits last year; > CCing him, just in case. ;-) > > > Grüße, > Thomas
Git-only namespace for vendor branches
Hi Jason,

We at Linaro are moving fully to git, and will be using git-only branches in GCC's git mirror for Linaro's branches starting with gcc-5-branch. Everything would have been simple if we didn't have the linaro/* namespace in the SVN repo. I want to double-check with you (and anyone else skilled in GCC's git mirror) on the changes we plan to make to linaro branches in the git mirror.

At the moment we have:
1. The SVN repository has linaro/gcc-4_8-branch and linaro/gcc-4_9-branch. These will continue to live in the SVN repo. It's fine to not have access to these branches from the git mirror.
2. The git repository has a linaro-dev/* namespace with a couple of project branches in it: linaro-dev/sched-model-prefetch and linaro-dev/type-promotion-pass.

We want to get to:
1. The git repository has a linaro/* namespace that hosts linaro/gcc-5-branch, linaro-dev/sched-model-prefetch and linaro-dev/type-promotion-pass.
2. Ideally, the linaro/* namespace would also have branches linaro/gcc-4_8-branch and linaro/gcc-4_9-branch mirrored from SVN. My understanding is that git-svn will not cooperate on this one, so the absence of these branches from the git mirror is OK.

My main question is whether it is OK to overlay a git-only namespace on the same-named SVN namespace. I want to avoid having a linaro/* namespace in SVN and a linaro-something namespace in git, especially since the linaro/* SVN namespace does not appear in the git mirror by default (you have to tell git to fetch non-default SVN branches to see it).

Thank you, -- Maxim Kuvyrkov www.linaro.org
[RFC] Kernel livepatching support in GCC
Hi,

Akashi-san and I have been discussing required GCC changes to make kernel's livepatching work for AArch64 and other architectures. At the moment livepatching is supported for x86[_64] using the following options: "-pg -mfentry -mrecord-mcount -mnop-mcount", which is geek-speak for "please add several NOPs at the very beginning of each function, and make a section with addresses of all those NOP pads".

The above long-ish list of options is a historical artifact of how livepatching support evolved for x86. The end result is that for livepatching (or ftrace, or possible future kernel features) to work, the compiler needs to generate a little bit of empty code space at the beginning of each function. The kernel can later use that space to insert call sequences for various hooks.

Our proposal is that instead of adding -mfentry/-mnop-mcount/-mrecord-mcount options to other architectures, we should implement a target-independent option -fprolog-pad=N, which will generate a pad of N nops at the beginning of each function and add a section entry describing the pad, similar to -mrecord-mcount [1].

Since adding NOPs is much less architecture-specific than outputting call instruction sequences, this option can be handled in a target-independent way, at least for some/most architectures.

Comments?

As I found out today, the team from Huawei has implemented [2], which follows the x86 example of the -mfentry option generating a hard-coded call sequence. I hope that this proposal can be easily incorporated into their work, since most of the livepatching changes are in the kernel.

[1] Technically, generating a NOP pad and adding a section entry in .__mcount_loc are two separate actions, so we may want to have a -fprolog-pad-record option. My instinct is to stick with a single option for now, since we can always add more later.

[2] http://lists.infradead.org/pipermail/linux-arm-kernel/2015-May/346905.html

-- Maxim Kuvyrkov www.linaro.org
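To illustrate what a -fprolog-pad descriptor section could look like from the kernel's point of view, here is a hand-written C sketch. A real -fprolog-pad=N implementation would emit this from the compiler; the section name "__prolog_pad_loc", the pad size, and the table layout are all assumptions for illustration, not the proposal's ABI:

```c
#include <stdint.h>

/* Illustrative pad size: what -fprolog-pad=4 would request.  */
#define PROLOG_PAD_NOPS 4

void patched_function (void);

/* Descriptor section: one entry per padded function, analogous to the
   __mcount_loc section that -mrecord-mcount emits on x86.  The kernel
   would walk this table to find the NOP pads it may overwrite with
   call sequences for its hooks.  */
static void (*const prolog_pad_table[]) (void)
  __attribute__ ((section ("__prolog_pad_loc"), used))
  = { patched_function };

void
patched_function (void)
{
  /* With -fprolog-pad=4, four NOPs would precede this body:
       nop; nop; nop; nop;   <- patchable space for kernel hooks  */
}
```

The table is data the compiler emits alongside the pads; the pads themselves are just dead space until the kernel live-patches them.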
Re: [RFC] Kernel livepatching support in GCC
> On May 28, 2015, at 11:59 AM, Richard Biener > wrote: > > On May 28, 2015 10:39:27 AM GMT+02:00, Maxim Kuvyrkov > wrote: >> Hi, >> >> Akashi-san and I have been discussing required GCC changes to make >> kernel's livepatching work for AArch64 and other architectures. At the >> moment livepatching is supported for x86[_64] using the following >> options: "-pg -mfentry -mrecord-mcount -mnop-mcount" which is >> geek-speak for "please add several NOPs at the very beginning of each >> function, and make a section with addresses of all those NOP pads". >> >> The above long-ish list of options is a historical artifact of how >> livepatching support evolved for x86. The end result is that for >> livepatching (or ftrace, or possible future kernel features) to work >> compiler needs to generate a little bit of empty code space at the >> beginning of each function. Kernel can later use that space to insert >> call sequences for various hooks. >> >> Our proposal is that instead of adding >> -mfentry/-mnop-count/-mrecord-mcount options to other architectures, we >> should implement a target-independent option -fprolog-pad=N, which will >> generate a pad of N nops at the beginning of each function and add a >> section entry describing the pad similar to -mrecord-mcount [1]. >> >> Since adding NOPs is much less architecture-specific then outputting >> call instruction sequences, this option can be handled in a >> target-independent way at least for some/most architectures. >> >> Comments? > > Maybe follow s390 -mhotpatch instead? Regarding implementation of the option, it will follow what s390 is doing with function attributes to mark which functions to apply nop-treatment to (using attributes will avoid problems with [coming] LTO builds of the kernel). The new option will set value of the attribute on all functions in current compilation unit, and then nops will be generated from the attribute specification. 
On the other hand, s390 does not generate a section of descriptor entries for the NOP pads, which seems like a useful (or necessary) option. A more-or-less generic implementation should, therefore, combine s390's attribute approach to annotating functions with x86's approach of providing information about NOP entries in an ELF section. Or can we record the value of a function attribute in ELF in a generic way?

Whatever the specifics, the implementation of livepatch support should be decoupled from the -pg/mcount dependency, as I don't see any real need to overload mcount with livepatching stuff.

-- Maxim Kuvyrkov www.linaro.org
Re: Proposal for the transition timetable for the move to GIT
> On Sep 17, 2019, at 3:02 PM, Richard Earnshaw (lists) > wrote: > > At the Cauldron this weekend the overwhelming view for the move to GIT soon > was finally expressed. > ... > > So in summary my proposed timetable would be: > > Monday 16th December 2019 - cut off date for picking which git conversion to > use > > Tuesday 31st December 2019 - SVN repo becomes read-only at end of stage 3. > > Thursday 2nd January 2020 - (ie read-only + 2 days) new git repo comes on > line for live commits. > > Doing this over the new year holiday period has both advantages and > disadvantages. On the one hand the traffic is light, so the impact to most > developers will be quite low; on the other, it is a holiday period, so > getting the right key folk to help might be difficult. I won't object > strongly if others feel that slipping a few days (but not weeks) would make > things significantly easier. The timetable looks entirely reasonable to me. I have regenerated my primary version this week, and it's up at https://git.linaro.org/people/maxim-kuvyrkov/gcc-pretty.git/ . So far I have received only minor issue reports about it, and all known problems have been fixed. I could use a bit more scrutiny :-). Regards, -- Maxim Kuvyrkov www.linaro.org
Re: Branch and tag deletions
> On Nov 25, 2019, at 7:07 PM, Joseph Myers wrote:
>
> I'm looking at the sets of branches and tags resulting from a GCC repository conversion with reposurgeon.
>
> 1. I see 227 branches (and one tag) with names like cxx0x-concepts-branch-deleted-r131428-1 (this is out of 780 branches in total in a conversion of GCC history as of a few days ago). Can we tell reposurgeon not to create such branches (and tags)? I can't simply do "branch /-deleted-r/ delete" because that command doesn't take a regular expression.
>
> 2. gcc.lift has a series of "tag delete" commands, generally deleting tags that aren't official GCC releases or prereleases (many of which were artifacts of how creating such tags was necessary to track merges in the CVS and older SVN era). But some such commands are mysteriously failing to work. For example I see
>
> tag /ZLIB_/ delete
> reposurgeon: no tags matching /ZLIB_/
>
> but there are tags ZLIB_1_1_3, ZLIB_1_1_4, ZLIB_1_2_1, ZLIB_1_2_3 left after the conversion. This isn't just an issue with regular expressions; I also see e.g.
>
> tag apple/ppc-import-20040330 delete
> reposurgeon: no tags matching apple/ppc-import-20040330
>
> and again that tag exists after the conversion.

IMO, we should aim to convert the complete SVN history frozen at a specific point. So if we don't want to convert some of the branches or tags to git, then we should delete them from the SVN repository before conversion. Otherwise it will (a) complicate comparison of repos converted by different tools, and (b) require us to remember why parts of the SVN history were not converted to git.

My conversion at https://git.linaro.org/people/maxim-kuvyrkov/gcc-pretty.git/ contains all branches and tags present in the current SVN repo.

-- Maxim Kuvyrkov https://www.linaro.org
Re: Branch and tag deletions
of their branches. Therefore, it may make for a smoother git experience to put user branches out of sight. Vendors, otoh, tend to keep their branches very clean. > >> >>> d) releases should go into refs/{heads/tags}/releases (makes it clearer >>> to casual users of the repo that these are 'official') >> >> What are releases? Release branches? > > branches in the heads space, tags in the tags space. > >> >> It would be very inconvenient to not have the recent releases immediately >> accessible, fwiw, but those could be just a copy. And then delete that >> one after a branch is closed? >> >>> e) other general development branches in refs/{heads/tags}/devt >> >> What does this mean? "other", "general"? > > Anything that's not vendor/user specific and not a release - a topic > branch most likely >> >>> That probably means the top-level heads/tags spaces are empty; but I >>> have no problem with that. >> >> It is good when people get the most often used things immediately. > > git branch -a will show anything in refs/remotes, and the default pull > spec is to pull refs/heads/* (and anything under that), so all release > and topic branches would be pulled by default, but not anything else. > > According to the git fetch manual page, tags are fetched if an object > they point to is fetched. I presume this only applies to tags under > refs/tags. But this is getting into details of git that I've not used > before. I need to experiment a bit more here. > > R. > > PS. Just seen https://git-scm.com/docs/gitnamespaces, that might be > exactly what we want for users, vendors and legacy stuff. I'll > investigate some more... -- Maxim Kuvyrkov https://www.linaro.org
Re: Proposal for the transition timetable for the move to GIT
> On Sep 19, 2019, at 6:34 PM, Maxim Kuvyrkov wrote: > >> On Sep 17, 2019, at 3:02 PM, Richard Earnshaw (lists) >> wrote: >> >> At the Cauldron this weekend the overwhelming view for the move to GIT soon >> was finally expressed. >> > ... >> >> So in summary my proposed timetable would be: >> >> Monday 16th December 2019 - cut off date for picking which git conversion to >> use >> >> Tuesday 31st December 2019 - SVN repo becomes read-only at end of stage 3. >> >> Thursday 2nd January 2020 - (ie read-only + 2 days) new git repo comes on >> line for live commits. >> >> Doing this over the new year holiday period has both advantages and >> disadvantages. On the one hand the traffic is light, so the impact to most >> developers will be quite low; on the other, it is a holiday period, so >> getting the right key folk to help might be difficult. I won't object >> strongly if others feel that slipping a few days (but not weeks) would make >> things significantly easier. > > The timetable looks entirely reasonable to me. > > I have regenerated my primary version this week, and it's up at > https://git.linaro.org/people/maxim-kuvyrkov/gcc-pretty.git/ . So far I have > received only minor issue reports about it, and all known problems have been > fixed. I could use a bit more scrutiny :-). I think now is a good time to give status update on the svn->git conversion I maintain. See https://git.linaro.org/people/maxim-kuvyrkov/gcc-pretty.git/ . 1. The conversion has all SVN live branches converted as branches under refs/heads/* . 2. The conversion has all SVN live tags converted as annotated tags under refs/tags/* . 3. If desired, it would be trivial to add all deleted / leaf SVN branches and tags. They would be named as branches/my-deleted-branch@12345, where @12345 is the revision at which the branch was deleted. Branches created and deleted multiple times would have separate entries corresponding to delete revisions. 4. 
Git committer and git author entries are very accurate (imo, better than reposurgeon's, but I'm biased). Developers' names and email addresses are mined from commit logs, changelogs and source code and have historically accurate attributions to employers' email addresses. 5. Since there is interest in reparenting branches to fix cvs2svn merge issues, I've added this feature to my scripts as well (turned out to be trivial). I'll keep the original gcc-pretty.git repo intact and will upload the new one at https://git.linaro.org/people/maxim-kuvyrkov/gcc-reparent.git/ -- should be live by Monday. Finally, there seem to be quite a few misunderstandings about the scripts I've developed and their limitations. Most of these misunderstandings stem from the assumption that all git-svn limitations must apply to my scripts. That's not the case. SVN merges, branch/tag reparenting, adjusting of commit logs are all handled correctly in my scripts. I welcome criticism with pointers to revisions which have been incorrectly converted. The general conversion workflow is (this really is a poor-man's translator of one DAG into another): 1. Parse SVN history of the entire SVN root (svn log -qv file:///svnrepo/) and build a list of branch points. 2. From the branch points build a DAG of "basic blocks" of revision history. Each basic block is a consecutive set of commits where only the last commit can be a branchpoint. 3. Walk the DAG and ... 4. ... use git-svn to individually convert these basic blocks. 4a. Optionally, post-process the git result of basic-block conversion using "git filter-branch" and similar tools. Git-svn is used in a limited role, and it does its job very well in this role. Regards, -- Maxim Kuvyrkov https://www.linaro.org
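The workflow above begins with mining branch points from `svn log -qv` output. A minimal sketch of that first step, assuming only the standard changed-path format that `svn log -qv` prints (this is an illustration of the idea, not the actual svn-git script):

```python
import re
from collections import defaultdict

# Changed-path lines in `svn log -qv` output look like:
#     A /tags/gcc_3_0_release (from /trunk:39596)
COPY_RE = re.compile(r'^\s+[ARM] (\S+) \(from (\S+):(\d+)\)')

def find_branch_points(svn_log_lines):
    """Collect copy operations: map each source revision to the list of
    (new path, source path) pairs created from it. These copies are the
    branch points from which the basic-block DAG is built."""
    branch_points = defaultdict(list)
    for line in svn_log_lines:
        m = COPY_RE.match(line)
        if m:
            new_path, src_path, src_rev = m.group(1), m.group(2), int(m.group(3))
            branch_points[src_rev].append((new_path, src_path))
    return branch_points
```

Each run of revisions between two such copy points then becomes a "basic block" that git-svn can convert in isolation.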
Re: Proposal for the transition timetable for the move to GIT
> On Dec 9, 2019, at 9:19 PM, Joseph Myers wrote: > > On Fri, 6 Dec 2019, Eric S. Raymond wrote: > >> Reposurgeon has been used for several major conversions, including groff >> and Emacs. I don't mean to be nasty to Maxim, but I have not yet seen >> *anybody* who thought they could get the job done with ad-hoc scripts >> turn out to be correct. Unfortunately, the costs of failure are often >> well-hidden problems in the converted history that people trip over >> months and years later. > > I think the ad hoc script is the risk factor here as much as the fact that > the ad hoc script makes limited use of git-svn. > > For any conversion we're clearly going to need to run various validation > (comparing properties of the converted repository, such as contents at > branch tips, with expected values of those properties based on the SVN > repository) and fix issues shown up by that validation. reposurgeon has > its own tools for such validation; I also intend to write some validation > scripts myself. And clearly we need to fix issues shown up by such > validation - that's what various recent reposurgeon issues Richard and I > have reported are about, fixing the most obvious issues that show up, > which in turn will enable running more detailed validation. > > The main risks are about issues that are less obvious in validation and so > don't get fixed in that process. There, if you're using an ad hoc script, > the risks are essentially unknown. But using a known conversion tool with > an extensive testsuite, such as reposurgeon, gives confidence based on > reposurgeon passing its own testsuite (once the SVN dump reader rewrite > does so) that a wide range of potential conversion bugs, that might appear > without showing up in the kinds of validation people try, are less likely > because of all the regression tests for conversion issues seen in past > conversions. 
When using an ad hoc script specific to one conversion you > lose that confidence that comes from a conversion tool having been used in > previous conversions and having tests to ensure bugs found in those > conversions don't come back. > > I think we should fix whatever the remaining relevant bugs are in > reposurgeon and do the conversion with reposurgeon being used to read and > convert the SVN history and do any desired surgical operations on it. > > Ad hoc scripts identifying specific proposed local changes to the > repository content, such as the proposed commit message improvements from > Richard or my branch parent fixes, to be performed with reposurgeon, seem > a lot safer than ad hoc code doing the conversion itself. And for > validation, the more validation scripts people come up with the better. > If anyone has or wishes to write custom scripts to analyze the SVN > repository branch structure and turn that into verifiable assertions about > what a git conversion should look like, rather than into directly > generating a git repository or doing surgery on history, that helps us > check a reposurgeon-converted repository in areas that might be > problematic - and in that case it's OK for the custom script to have > unknown bugs because issues it shows up are just pointing out places where > the converted repository needs checking more carefully to decide whether > there is a conversion bug or not. Firstly, I am not going to defend my svn-git-* scripts or the git-svn tool they are using. They are likely to have bugs and problems. I am, though, going to defend the conversion that these tools produced. No matter the conversion tool, all that matters is the final result. I have asked many times to scrutinize the git repository that I have uploaded several months ago and to point out any artifacts or mistakes. Surely, it can't be hard for one to find a mistake or two in my converted repository by comparing it against any other /better/ repository that one has. 
[FWIW, I am going to privately compare the reposurgeon-generated repo that Richard E. uploaded against my repo. The results of such a comparison can appear biased, so I'm not planning to publish them.] Secondly, the GCC community has overwhelmingly supported the move to git, and in private conversations many developers have expressed the same view: 1. all we care about is history of trunk and recent release branches 2. current gcc-mirror is really all we need 3. having vendor branches and author info would be nice, but not so nice as to delay the switch any longer Granted, the above is not the /official/ consensus of the GCC community, and I don't want to represent it as such. However, it is equally not the consensus of the GCC community to delay the switch to git until we have a confirmed perfect repo. -- Maxim Kuvyrkov https://www.linaro.org
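The validation Joseph describes, comparing properties such as contents at branch tips between two conversions, reduces to diffing per-branch tree hashes. A hypothetical checker, assuming the hashes were collected beforehand with something like `git rev-parse <branch>^{tree}` in each repo (this helper is illustrative and belongs to neither tool):

```python
def compare_branch_tips(repo_a, repo_b):
    """Given {branch: tree-sha} maps from two conversions, report
    branches present on only one side and branches whose tip trees
    differ. Identical tree hashes mean identical checked-out content."""
    only_a = sorted(set(repo_a) - set(repo_b))
    only_b = sorted(set(repo_b) - set(repo_a))
    differ = sorted(b for b in set(repo_a) & set(repo_b)
                    if repo_a[b] != repo_b[b])
    return {"only_a": only_a, "only_b": only_b, "differ": differ}
```

An empty report for all three lists is the kind of verifiable assertion about two conversions that either side of this debate could publish.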
Re: Test GCC conversion with reposurgeon available
> On Dec 22, 2019, at 4:56 PM, Joseph Myers wrote: > > On Thu, 19 Dec 2019, Joseph Myers wrote: > >> And two more. >> >> git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-4a.git >> git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-4b.git > > Two more. > > git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-5a.git > git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-5b.git > > The main changes are: > > * The case of both svnmerge-integrated and svn:mergeinfo being set is now > handled properly, so the commit Bernd found is interpreted as a merge from > trunk to named-addr-spaces-branch and has exactly two parents as expected, > with the parents corresponding to the merges from other branches to trunk > being optimized away. > > * The author map used now avoids timezone-only entries also remapping > email addresses, so the email addresses from the ChangeLogs are used > whenever a commit adds ChangeLog entries from exactly one author. > > * When commits add ChangeLog entries from more than one author (e.g. > merges done in CVS), the committer is now used as the author rather than > selecting one of the authors from the ChangeLog entries. > > * The latest whitelisting / PR corrections are used with Richard's script > (430 checkme: entries remain). > > * One fix to the ref renaming in gcc-reposurgeon-5b.git so that the tag > gcc-3_2-rhl8-3_2-7 properly ends up in vendors rather than prereleases. I'll spend next couple of days comparing Joseph's gcc-reposurgeon-5a.git conversion against my gcc-pretty.git and gcc-reparent.git conversions, and will post results along with the scripts to this mailing list. Regarding gcc-pretty.git and gcc-reparent.git conversions, I have the following comments so far: Q1: Why are there missing branches for stuff that didn't originate at trunk@1? A1: Indeed, that's by design / configuration. The scripts start with trunk@1 and build a parent DAG from that node. 
If desired, it is trivial to add more initial "root" commits to include these missing branches. Q2: Why are entries from branches/st/tags treated as branches, not as tags? A2: Because I opted to not special-case these to simplify comparison of different conversions. Tags/* entries are converted to git annotated tags in a separate pass, and it is trivial to add handling for branches/st/tags there. Q3: Why do reparented branches in the gcc-reparent.git repo have merge commits at the point of reparenting? A3: That's an artifact of the svn-git machinery my scripts are using. I haven't looked at this in depth. Q4: Is it possible to integrate Richard E.'s script to rewrite commit log messages? A4: Yes, absolutely. The scripts have a pass to rewrite commit author/committer entries, and log rewriting easily fits in there. It would be very helpful to have a version of Richard's script that runs on a per-commit basis, suitable for "git filter-branch" consumption. Regards, -- Maxim Kuvyrkov https://www.linaro.org
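A script suitable for "git filter-branch" consumption on a per-commit basis reads one commit message on stdin and writes the rewritten message to stdout. A sketch of that contract (a hypothetical filter, not Richard's script; as an example transformation it merely drops the `git-svn-id:` trailer that git-svn appends):

```python
def msg_filter(message: str) -> str:
    """Remove the trailing 'git-svn-id:' line from a commit log,
    keeping the rest of the message intact."""
    kept = [line for line in message.splitlines()
            if not line.startswith("git-svn-id:")]
    # Trim blank lines left dangling at the end after the removal.
    while kept and not kept[-1].strip():
        kept.pop()
    return "\n".join(kept) + "\n"

# Driven by filter-branch roughly as:
#   git filter-branch --msg-filter 'python3 msg_filter.py' -- --all
# where the script body is: sys.stdout.write(msg_filter(sys.stdin.read()))
```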
Re: Proposal for the transition timetable for the move to GIT
> On Dec 26, 2019, at 2:16 PM, Jakub Jelinek wrote: > > On Thu, Dec 26, 2019 at 11:04:29AM +, Joseph Myers wrote: > Is there some easy way (e.g. file in the conversion scripts) to correct > spelling and other mistakes in the commit authors? > E.g. there are misspelled surnames, etc. (e.g. looking at my name, I see > Jakub Jakub Jelinek (1): > Jakub Jeilnek (1): > Jelinek (1): > entries next to the expected one with most of the commits. > For the misspellings, wonder if e.g. we couldn't compute edit distances from > other names and if we have one with many commits and then one with very few > with small edit distance from those, flag it for human review. This is close to what svn-git-author.sh script is doing in gcc-pretty and gcc-reparent conversions. It ignores 1-3 character differences in author/committer names and email addresses. I've audited results for all branches and didn't spot any mistakes. In other news, I'm working on comparison of gcc-pretty, gcc-reparent and gcc-reposurgeon-5a repos among themselves. Below are current notes for comparison of gcc-pretty/trunk and gcc-reposurgeon-5a/trunk. == Merges on trunk == Reposurgeon creates merge entries on trunk when changes from a branch are merged into trunk. This brings entire development history from the branch to trunk, which is both good and bad. The good part is that we get more visibility into how the code evolved. The bad part is that we get many "noisy" commits from merged branch (e.g., "Merge in trunk" every few revisions) and that our SVN branches are work-in-progress quality, not ready for review/commit quality. It's common for files to be re-written in large chunks on branches. Also, reposurgeon's commit logs don't have information on SVN path from which the change came, so there is no easy way to determine that a given commit is from a merged branch, not an original trunk commit. Git-svn, on the other hand, provides "git-svn-id: @" tags in its commit logs. 
My conversion follows the current GCC development policy that trunk history should be linear. Branch merges to trunk are squashed. Merges between non-trunk branches are handled as specified by svn:mergeinfo SVN properties. == Differences in trees == Git trees (aka filesystem content) match between pretty/trunk and reposurgeon-5a/trunk from the current tip up to SVN r130805. Here is the SVN log of that revision (restoration of deleted trunk): r130805 | dberlin | 2007-12-13 01:53:37 + (Thu, 13 Dec 2007) Changed paths: A /trunk (from /trunk:130802) Reposurgeon conversion has: - commit 7e6f2a96e89d96c2418482788f94155d87791f0a Author: Daniel Berlin Date: Thu Dec 13 01:53:37 2007 + Readd trunk Legacy-ID: 130805 .gitignore | 17 - 1 file changed, 17 deletions(-) - and my conversion has: - commit fb128f3970789ce094c798945b4fa20eceb84cc7 Author: Daniel Berlin Date: Thu Dec 13 01:53:37 2007 + Readd trunk git-svn-id: https://gcc.gnu.org/svn/gcc/trunk@130805 138bc75d-0d04-0410-961f-82ee72b054a4 - It appears that .gitignore has been added in r1 by reposurgeon and then deleted at r130805. In the SVN repository .gitignore was added in r195087. I speculate that the addition of .gitignore at r1 is expected, but its deletion at r130805 is highly suspicious. == Committer entries == Reposurgeon uses $u...@gcc.gnu.org for committer email addresses even when it correctly detects the author name from the ChangeLog. reposurgeon-5a: r278995 Martin Liska Martin Liska r278994 Jozef Lawrynowicz Jozef Lawrynowicz r278993 Frederik Harwath Frederik Harwath r278992 Georg-Johann Lay Georg-Johann Lay r278991 Richard Biener Richard Biener pretty: r278995 Martin Liska Martin Liska r278994 Jozef Lawrynowicz Jozef Lawrynowicz r278993 Frederik Harwath Frederik Harwath r278992 Georg-Johann Lay Georg-Johann Lay r278991 Richard Biener Richard Biener == Bad summary line == While looking around r138087, the below caught my eye. Is the content of the summary line as expected?
commit cc2726884d56995c514d8171cc4a03657851657e Author: Chris Fairles Date: Wed Jul 23 14:49:00 2008 + acinclude.m4 ([GLIBCXX_CHECK_CLOCK_GETTIME]): Define GLIBCXX_LIBS. 2008-07-23 Chris Fairles * acinclude.m4 ([GLIBCXX_CHECK_CLOCK_GETTIME]): Define GLIBCXX_LIBS. Holds the lib that defines clock_gettime (-lrt or -lposix4). * src/Makefile.am: Use it. * configure: Regenerate. * configure.in: Likewise. * Makefile.in: Likewise. * src/Makefile.in: Likewise. * libsup++/Makefile.in: Likewise. * po/Ma
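The policy stated above, that trunk history should stay linear, is mechanically checkable in any candidate conversion. A small sketch (a hypothetical helper, not part of the comparison scripts) over `git log --format='%H %P'` output, where each line is a commit sha followed by its parent shas:

```python
def first_merge(log_lines):
    """Return the sha of the first commit that has two or more parents,
    or None if the given history is linear."""
    for line in log_lines:
        sha, *parents = line.split()
        if len(parents) > 1:
            return sha
    return None
```

Running this over a trunk with squashed branch merges should return None, while a conversion that records branch merges as merge commits would report the first such commit.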
Re: Proposal for the transition timetable for the move to GIT
> On Dec 27, 2019, at 4:32 AM, Joseph Myers wrote: > > On Thu, 26 Dec 2019, Joseph Myers wrote: > >>> It appears that .gitignore has been added in r1 by reposurgeon and then >>> deleted at r130805. In SVN repository .gitignore was added in r195087. >>> I speculate that addition of .gitignore at r1 is expected, but it's >>> deletion at r130805 is highly suspicious. >> >> I suspect this is one of the known issues related to reposurgeon-generated >> .gitignore files. Since such files are not really part of the GCC >> history, and the .gitignore files checked into SVN are properly preserved >> as far as I can see, I don't think it's a particularly important issue for >> the GCC conversion (since auto-generated .gitignore files are only >> nice-to-have, not required). I've filed >> https://gitlab.com/esr/reposurgeon/issues/219 anyway with a reduced test >> for this oddity. > > This has now been fixed, so future conversion runs with reposurgeon should > have the automatically-generated .gitignore present until replaced by the > one checked into SVN. (If people don't want automatically-generated > .gitignore files at all, we could always add an option to reposurgeon not > to generate them.) Removing auto-generated .gitignore files from reposurgeon conversion would allow comparison of git trees vs gcc-pretty and gcc-reparent beyond r195087. So, while we are evaluating the conversion candidates, it is best to disable conversion features that cause hard-to-workaround differences. > > I'll do another GCC conversion run to pick up all the accumulated fixes > and improvements (including many more PR whitelist entries / fixes in > Richard's script), once another ChangeLog-related fix is in. -- Maxim Kuvyrkov https://www.linaro.org
Re: Proposal for the transition timetable for the move to GIT
Below are several more issues I found in the reposurgeon-6a conversion, comparing it against the gcc-reparent conversion. I am sure these and whatever other problems I may find in the reposurgeon conversion can be fixed in time. However, I don't see why we should bother. My conversion has been available since summer 2019, I made it ready in time for GCC Cauldron 2019, and it didn't change in any significant way since then. With the "Missed merges" problem (see below) I don't see how the reposurgeon conversion can be considered "ready". Also, I expected a diligent developer to compare the new conversion (aka reposurgeon's) against the existing conversion (aka gcc-pretty / gcc-reparent) before declaring the new conversion "better" or even "ready". The data I'm seeing in differences between my and reposurgeon conversions shows that the gcc-reparent conversion is /better/. I suggest that the GCC community adopt either the gcc-pretty or the gcc-reparent conversion. I welcome Richard E. to modify his summary scripts to work with the svn-git scripts, which should be straightforward, and I'm ready to help. Meanwhile, I'm going to add additional root commits to my gcc-reparent conversion to bring in "missing" branches (the ones which don't share history with trunk@1) and restart daily updates of the gcc-reparent conversion. Finally, with the comparison data I have, I consider statements about git-svn's poor quality to be very misleading. Git-svn may have had serious bugs years ago when Eric R. evaluated it and started his work on reposurgeon. But a lot of development has happened and many problems have been fixed since then. At the moment it is reposurgeon that is producing conversions with obscure mistakes in repository metadata. === Missed merges === Reposurgeon misses merges from trunk on 130+ branches. I've spot-checked ARM/hard_vfp_branch and redhat/gcc-9-branch and, indeed, rather mundane merges were omitted. Below is the analysis for ARM/hard_vfp_branch.
$ git log --stat refs/remotes/gcc-reposurgeon-6a/ARM/hard_vfp_branch~4 commit ef92c24b042965dfef982349cd5994a2e0ff5fde Author: Richard Earnshaw Date: Mon Jul 20 08:15:51 2009 + Merge trunk through to r149768 Legacy-ID: 149804 COPYING.RUNTIME |73 + ChangeLog | 270 +- MAINTAINERS |19 +- at the same time for svn-git scripts we have: $ git log --stat refs/remotes/gcc-reparent/ARM/hard_vfp_branch~4 commit ce7d5c8df673a7a561c29f095869f20567a7c598 Merge: 4970119c20da 3a69b1e566a7 Author: Richard Earnshaw Date: Mon Jul 20 08:15:51 2009 + Merge trunk through to r149768 git-svn-id: https://gcc.gnu.org/svn/gcc/branches/ARM/hard_vfp_branch@149804 138bc75d-0d04-0410-961f-82ee72b054a4 ... which agrees with $ svn propget svn:mergeinfo file:///home/maxim.kuvyrkov/tmpfs-stuff/svnrepo/branches/ARM/hard_vfp_branch@149804 /trunk:142588-149768 === Bad author entries === Reposurgeon-6a conversion has authors "12:46:56 1998 Jim Wilson" and "2005-03-18 Kazu Hirata". It is rather obvious that person's name is unlikely to start with a digit. === Missed authors === Reposurgeon-6a conversion misses many authors, below is a list of people with names starting with "A". Akos Kiss Anders Bertelrud Andrew Pochinsky Anton Hartl Arthur Norman Aymeric Vincent === Conservative author entries === Reposurgeon-6a conversion uses default "@gcc.gnu.org" emails for many commits where svn-git conversion manages to extract valid email from commit data. This happens for hundreds of author entries. Regards, -- Maxim Kuvyrkov https://www.linaro.org > On Dec 26, 2019, at 7:11 PM, Maxim Kuvyrkov wrote: > > >> On Dec 26, 2019, at 2:16 PM, Jakub Jelinek wrote: >> >> On Thu, Dec 26, 2019 at 11:04:29AM +, Joseph Myers wrote: >> Is there some easy way (e.g. file in the conversion scripts) to correct >> spelling and other mistakes in the commit authors? >> E.g. there are misspelled surnames, etc. (e.g. 
looking at my name, I see >> Jakub Jakub Jelinek (1): >> Jakub Jeilnek (1): >> Jelinek (1): >> entries next to the expected one with most of the commits. >> For the misspellings, wonder if e.g. we couldn't compute edit distances from >> other names and if we have one with many commits and then one with very few >> with small edit distance from those, flag it for human review. > > This is close to what svn-git-author.sh script is doing in gcc-pretty and > gcc-reparent conversions. It ignores 1-3 character differences in > author/committer names and email addresses. I've audited results for all > branches and didn't spot any mistakes
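Jakub's edit-distance heuristic, which the svn-git-author.sh pass approximates by ignoring 1-3 character differences, can be sketched as follows. The thresholds here are illustrative, not the ones used by either tool:

```python
def edit_distance(a, b):
    """Classic Levenshtein distance via row-by-row dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def flag_suspect_authors(commit_counts, max_dist=3, min_ratio=10):
    """Flag author names with few commits that sit within a small edit
    distance of a name with many commits (min_ratio times more):
    likely misspellings that deserve human review."""
    names = sorted(commit_counts, key=commit_counts.get, reverse=True)
    suspects = []
    for rare in names:
        for common in names:
            if (commit_counts[common] >= min_ratio * commit_counts[rare]
                    and 0 < edit_distance(rare, common) <= max_dist):
                suspects.append((rare, common))
                break
    return suspects
```

On Jakub's example, "Jakub Jeilnek" with one commit sits at distance 2 from "Jakub Jelinek" with thousands, so it gets flagged against the common spelling.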
Re: Proposal for the transition timetable for the move to GIT
> On Dec 30, 2019, at 3:18 AM, Joseph Myers wrote: > > On Sun, 29 Dec 2019, Richard Earnshaw (lists) wrote: > >> gcc-reparent is better, but many (most?) of the release tags are shown >> as merge commits with a fake parent back to the gcc-3 branch point, >> which is certainly not what happened when the tagging was done at that >> time. > > And looking at the history of gcc-reparent as part of preparing to compare > authors to identify commits needing manual attention to author > identification, I see other oddities. > > Do "git log egcs_1_1_2_prerelease_2" in gcc-reparent, for example. The > history ends up containing two different versions of SVN r5 and of many > other commits. One of them looks normal: > > commit c01d37f1690de9ea83b341780fad458f506b80c7 > Author: Charles Hannum > Date: Mon Nov 27 21:22:14 1989 + > >entered into RCS > > >git-svn-id: https://gcc.gnu.org/svn/gcc/trunk@5 > 138bc75d-0d04-0410-961f-82ee72b054a4 > > The other looks strange: > > commit 09c5a0fa5ed76e58cc67f3d72bf397277fdd > Author: Charles Hannum > Date: Mon Nov 27 21:22:14 1989 + > >entered into RCS > > >git-svn-id: https://gcc.gnu.org/svn/gcc/trunk@5 > 138bc75d-0d04-0410-961f-82ee72b054a4 >Updated tag 'egcs_1_1_2_prerelease_2@279090' (was bc80be265a0) >Updated tag 'egcs_1_1_2_prerelease_2@279154' (was f7cee65b219) >Updated tag 'egcs_1_1_2_prerelease_2@279213' (was 74dcba9b414) >Updated tag 'egcs_1_1_2_prerelease_2@279270' (was 7e63c9b344d) >Updated tag 'egcs_1_1_2_prerelease_2@279336' (was 47894371e3c) >Updated tag 'egcs_1_1_2_prerelease_2@279392' (was 3c3f6932316) >Updated tag 'egcs_1_1_2_prerelease_2@279402' (was 29d9998f523b) > > (and in fact it seems there are *four* commits corresponding to SVN r5 and > reachable from refs in the gcc-reparent repository). 
So we don't just > have stray merge commits, they actually end up leading back to strange > alternative versions of history (which I think is clearly worse than > conservatively not having a merge commit in some case where a commit might > or might not be unambiguously a merge - if a merge was missed on an active > branch, the branch maintainer can easily correct that afterwards with "git > merge -s ours" to avoid problems with future merges). > > My expectation is that there are only multiple git commits corresponding > to an SVN commit when the SVN commit touched more than one SVN branch or > tag and so has to be split to represent it in git (there are about 1500 > such SVN commits, most of them automatic datestamp updates in the CVS era > that cvs2svn turned into mixed-branch commits). Thanks for catching this. This is fallout from incremental rebuilds (rather than fresh builds) of gcc-reparent repository. Incremental builds take about 1h and full rebuilds take about 30h. I'll switch to doing full rebuilds. -- Maxim Kuvyrkov https://www.linaro.org
Re: Proposal for the transition timetable for the move to GIT
> On Dec 30, 2019, at 1:24 AM, Richard Earnshaw (lists) > wrote: > > On 29/12/2019 18:30, Maxim Kuvyrkov wrote: >> Below are several more issues I found in reposurgeon-6a conversion comparing >> it against gcc-reparent conversion. >> >> I am sure, these and whatever other problems I may find in the reposurgeon >> conversion can be fixed in time. However, I don't see why should bother. >> My conversion has been available since summer 2019, I made it ready in time >> for GCC Cauldron 2019, and it didn't change in any significant way since >> then. >> >> With the "Missed merges" problem (see below) I don't see how reposurgeon >> conversion can be considered "ready". Also, I expected a diligent developer >> to compare new conversion (aka reposurgeon's) against existing conversion >> (aka gcc-pretty / gcc-reparent) before declaring the new conversion "better" >> or even "ready". The data I'm seeing in differences between my and >> reposurgeon conversions shows that gcc-reparent conversion is /better/. >> >> I suggest that GCC community adopts either gcc-pretty or gcc-reparent >> conversion. I welcome Richard E. to modify his summary scripts to work with >> svn-git scripts, which should be straightforward, and I'm ready to help. >> > > I don't think either of these conversions are any more ready to use than > the reposurgeon one, possibly less so. In fact, there are still some > major issues to resolve first before they can be considered. > > gcc-pretty has completely wrong parent information for the gcc-3 era > release tags, showing the tags as being made directly from trunk with > massive deltas representing the roll-up of all the commits that were > made on the gcc-3 release branch. I will clarify the above statement, and please correct me where you think I'm wrong. Gcc-pretty conversion has the exact right parent information for the gcc-3 era release tags as recorded in SVN version history. Gcc-pretty conversion aims to produce an exact copy of SVN history in git. 
IMO, it manages to do so just fine. It is a different thing that SVN history has a screwed up record of gcc-3 era tags. > > gcc-reparent is better, but many (most?) of the release tags are shown > as merge commits with a fake parent back to the gcc-3 branch point, > which is certainly not what happened when the tagging was done at that > time. I agree with you here. > > Both of these factually misrepresent the history at the time of the > release tag being made. Yes and no. Gcc-pretty repository mirrors SVN history. And regarding the need for reparenting -- we lived with current history for gcc-3 release tags for a long time. I would argue their continued brokenness is not a show-stopper. Looking at this from a different perspective, when I posted the initial svn-git scripts back in Summer, the community roughly agreed on a plan to 1. Convert entire SVN history to git. 2. Use the stock git history rewrite tools (git filter-branch) to fixup what we want, e.g., reparent tags and branches or set better author/committer entries. Gcc-pretty does (1) in entirety. For reparenting, I tried a 15min fix to my scripts to enable reparenting, which worked, but with artifacts like the merge commit from old and new parents. I will drop this and instead use tried-and-true "git filter-branch" to reparent those tags and branches, thus producing gcc-reparent from gcc-pretty. > > As for converting my script to work with your tools, I'm afraid I don't > have time to work on that right now. I'm still bogged down validating > the incorrect bug ids that the script has identified for some commits. > I'm making good progress (we're down to 160 unreviewed commits now), but > it is still going to take what time I have over the next week to > complete that task. > > Furthermore, there is no documentation on how your conversion scripts > work, so it is not possible for me to test any work I might do in order > to validate such changes. 
Not being able to run the script locally to > test change would be a non-starter. > > You are welcome, of course, to clone the script I have and attempt to > modify it yourself, it's reasonably well documented. The sources can be > found in esr's gcc-conversion repository here: > https://gitlab.com/esr/gcc-conversion.git -- Maxim Kuvyrkov https://www.linaro.org > > >> Meanwhile, I'm going to add additional root commits to my gcc-reparent >> conversion to bring in "missing" branches (the ones, which don't share >> history with trunk@1) and restart daily updates of gcc-
Re: Proposal for the transition timetable for the move to GIT
> On Dec 30, 2019, at 6:31 PM, Richard Earnshaw (lists) > wrote: > > On 30/12/2019 13:00, Maxim Kuvyrkov wrote: >>> On Dec 30, 2019, at 1:24 AM, Richard Earnshaw (lists) >>> wrote: >>> >>> On 29/12/2019 18:30, Maxim Kuvyrkov wrote: >>>> Below are several more issues I found in reposurgeon-6a conversion >>>> comparing it against gcc-reparent conversion. >>>> >>>> I am sure, these and whatever other problems I may find in the reposurgeon >>>> conversion can be fixed in time. However, I don't see why should bother. >>>> My conversion has been available since summer 2019, I made it ready in >>>> time for GCC Cauldron 2019, and it didn't change in any significant way >>>> since then. >>>> >>>> With the "Missed merges" problem (see below) I don't see how reposurgeon >>>> conversion can be considered "ready". Also, I expected a diligent >>>> developer to compare new conversion (aka reposurgeon's) against existing >>>> conversion (aka gcc-pretty / gcc-reparent) before declaring the new >>>> conversion "better" or even "ready". The data I'm seeing in differences >>>> between my and reposurgeon conversions shows that gcc-reparent conversion >>>> is /better/. >>>> >>>> I suggest that GCC community adopts either gcc-pretty or gcc-reparent >>>> conversion. I welcome Richard E. to modify his summary scripts to work >>>> with svn-git scripts, which should be straightforward, and I'm ready to >>>> help. >>>> >>> >>> I don't think either of these conversions are any more ready to use than >>> the reposurgeon one, possibly less so. In fact, there are still some >>> major issues to resolve first before they can be considered. >>> >>> gcc-pretty has completely wrong parent information for the gcc-3 era >>> release tags, showing the tags as being made directly from trunk with >>> massive deltas representing the roll-up of all the commits that were >>> made on the gcc-3 release branch. >> >> I will clarify the above statement, and please correct me where you think >> I'm wrong. 
>> Gcc-pretty conversion has the exact right parent information for the gcc-3 era
>> release tags as recorded in SVN version history. Gcc-pretty conversion aims
>> to produce an exact copy of SVN history in git. IMO, it manages to do so
>> just fine.
>>
>> It is a different thing that SVN history has a screwed up record of gcc-3
>> era tags.
>
> It's not screwed up in svn. Svn shows the correct history information for
> the gcc-3 era release tags, but the git-svn conversion in gcc-pretty does not.
>
> For example, looking at gcc_3_0_release in expr.c with git blame and svn
> blame shows

In SVN history tags/gcc_3_0_release has been copied off /trunk:39596, and in the same commit a bunch of files were replaced from /branches/gcc-3_0-branch/ (and from different revisions of this branch!).

$ svn log -qv --stop-on-copy file://$(pwd)/tags/gcc_3_0_release | grep "/tags/gcc_3_0_release \|/tags/gcc_3_0_release/gcc/expr.c \|/tags/gcc_3_0_release/gcc/reload.c "
A /tags/gcc_3_0_release (from /trunk:39596)
R /tags/gcc_3_0_release/gcc/expr.c (from /branches/gcc-3_0-branch/gcc/expr.c:43255)
R /tags/gcc_3_0_release/gcc/reload.c (from /branches/gcc-3_0-branch/gcc/reload.c:42007)

IMO, from such history (absent external knowledge about better reparenting options) the best choice for the parent branch is /trunk@39596, not /branches/gcc-3_0-branch at a random revision from the replaced files. Still, I see your point, and I will fix reparenting support. Whether the GCC community opts to reparent or not is a different topic.

-- Maxim Kuvyrkov https://www.linaro.org

> git blame expr.c:
>
> ba0a9cb85431 (Richard Kenner 1992-03-03 23:34:57 + 396) return temp;
> ba0a9cb85431 (Richard Kenner 1992-03-03 23:34:57 + 397) }
> 5fbf0b0d5828 (no-author 2001-06-17 19:44:25 + 398) /* Copy the address into a pseudo, so that the returned value
> 5fbf0b0d5828 (no-author 2001-06-17 19:44:25 + 399) remains correct across calls to emit_queue. */
> 5fbf0b0d5828 (no-author 2001-06-17 19:44:25 + 400) XEXP (new, 0) = copy_to_reg (XEXP (new, 0));
> 59f26b7caad9 (Richard Kenner 1994-01-11 00:23:47 + 401) return new;
>
> git log 5fbf0b0d5828
> commit 5fbf0b0d5828687914c1c18a83ff
Re: Proposal for the transition timetable for the move to GIT
> On Dec 30, 2019, at 7:08 PM, Richard Earnshaw (lists) > wrote: > > On 30/12/2019 15:49, Maxim Kuvyrkov wrote: >>> On Dec 30, 2019, at 6:31 PM, Richard Earnshaw (lists) >>> wrote: >>> >>> On 30/12/2019 13:00, Maxim Kuvyrkov wrote: >>>>> On Dec 30, 2019, at 1:24 AM, Richard Earnshaw (lists) >>>>> wrote: >>>>> >>>>> On 29/12/2019 18:30, Maxim Kuvyrkov wrote: >>>>>> Below are several more issues I found in reposurgeon-6a conversion >>>>>> comparing it against gcc-reparent conversion. >>>>>> >>>>>> I am sure, these and whatever other problems I may find in the >>>>>> reposurgeon conversion can be fixed in time. However, I don't see why >>>>>> should bother. My conversion has been available since summer 2019, I >>>>>> made it ready in time for GCC Cauldron 2019, and it didn't change in any >>>>>> significant way since then. >>>>>> >>>>>> With the "Missed merges" problem (see below) I don't see how reposurgeon >>>>>> conversion can be considered "ready". Also, I expected a diligent >>>>>> developer to compare new conversion (aka reposurgeon's) against existing >>>>>> conversion (aka gcc-pretty / gcc-reparent) before declaring the new >>>>>> conversion "better" or even "ready". The data I'm seeing in differences >>>>>> between my and reposurgeon conversions shows that gcc-reparent >>>>>> conversion is /better/. >>>>>> >>>>>> I suggest that GCC community adopts either gcc-pretty or gcc-reparent >>>>>> conversion. I welcome Richard E. to modify his summary scripts to work >>>>>> with svn-git scripts, which should be straightforward, and I'm ready to >>>>>> help. >>>>>> >>>>> >>>>> I don't think either of these conversions are any more ready to use than >>>>> the reposurgeon one, possibly less so. In fact, there are still some >>>>> major issues to resolve first before they can be considered. 
>>>>> >>>>> gcc-pretty has completely wrong parent information for the gcc-3 era >>>>> release tags, showing the tags as being made directly from trunk with >>>>> massive deltas representing the roll-up of all the commits that were >>>>> made on the gcc-3 release branch. >>>> >>>> I will clarify the above statement, and please correct me where you think >>>> I'm wrong. Gcc-pretty conversion has the exact right parent information >>>> for the gcc-3 era >>>> release tags as recorded in SVN version history. Gcc-pretty conversion >>>> aims to produce an exact copy of SVN history in git. IMO, it manages to >>>> do so just fine. >>>> >>>> It is a different thing that SVN history has a screwed up record of gcc-3 >>>> era tags. >>> >>> It's not screwed up in svn. Svn shows the correct history information for >>> the gcc-3 era release tags, but the git-svn conversion in gcc-pretty does >>> not. >>> >>> For example, looking at gcc_3_0_release in expr.c with git blame and svn >>> blame shows >> >> In SVN history tags/gcc_3_0_release has been copied off /trunk:39596 and in >> the same commit bunch of files were replaced from /branches/gcc-3_0-branch/ >> (and from different revisions of this branch!). >> >> $ svn log -qv --stop-on-copy file://$(pwd)/tags/gcc_3_0_release | grep >> "/tags/gcc_3_0_release \|/tags/gcc_3_0_release/gcc/expr.c >> \|/tags/gcc_3_0_release/gcc/reload.c " >> A /tags/gcc_3_0_release (from /trunk:39596) >> R /tags/gcc_3_0_release/gcc/expr.c (from >> /branches/gcc-3_0-branch/gcc/expr.c:43255) >> R /tags/gcc_3_0_release/gcc/reload.c (from >> /branches/gcc-3_0-branch/gcc/reload.c:42007) >> > > Right, (and wrong). You have to understand how the release branches and > tags are represented in CVS to understand why the SVN conversion is done > this way. When a branch was created in CVS a tag was added to each > commit which would then be used in any future revisions along that > branch. 
But until a commit is made on that branch, the release branch > is just a placeholder. > > When a CVS release tag is created, the tag labels the relevant commit &
Re: Proposal for the transition timetable for the move to GIT
> On Jan 9, 2020, at 5:38 AM, Segher Boessenkool > wrote: > > On Wed, Jan 08, 2020 at 11:34:32PM +, Joseph Myers wrote: >> As noted on overseers, once Saturday's DATESTAMP update has run at 00:16 >> UTC on Saturday, I intend to add a README.MOVED_TO_GIT file on SVN trunk >> and change the SVN hooks to make SVN readonly, then disable gccadmin's >> cron jobs that build snapshots and update online documentation until they >> are ready to run with the git repository. Once the existing git mirror >> has picked up the last changes I'll make that read-only and disable that >> cron job as well, and start the conversion process with a view to having >> the converted repository in place this weekend (it could either be made >> writable as soon as I think it's ready, or left read-only until people >> have had time to do any final checks on Monday). Before then, I'll work >> on hooks, documentation and maintainer-scripts updates. > > Where and when and by who was it decided to use this conversion? Joseph, please point to a message on the gcc@ mailing list that expresses a consensus of the GCC community to use the reposurgeon conversion. Otherwise, it is not appropriate to substitute one's opinion for community consensus. I want the GCC community to get the best possible conversion, be it mine or reposurgeon's. To this end I'm comparing the two conversions and will post my results later today. Unfortunately, the comparison is complicated by the fact that you uploaded only the "b" version of the gcc-reposurgeon-8 repository, which uses a modified branch layout (please confirm that there are no substantial differences between the "7" and "8" reposurgeon conversions). -- Maxim Kuvyrkov https://www.linaro.org
Re: Proposal for the transition timetable for the move to GIT
> On Jan 10, 2020, at 10:33 AM, Maxim Kuvyrkov > wrote: > >> On Jan 9, 2020, at 5:38 AM, Segher Boessenkool >> wrote: >> >> On Wed, Jan 08, 2020 at 11:34:32PM +, Joseph Myers wrote: >>> As noted on overseers, once Saturday's DATESTAMP update has run at 00:16 >>> UTC on Saturday, I intend to add a README.MOVED_TO_GIT file on SVN trunk >>> and change the SVN hooks to make SVN readonly, then disable gccadmin's >>> cron jobs that build snapshots and update online documentation until they >>> are ready to run with the git repository. Once the existing git mirror >>> has picked up the last changes I'll make that read-only and disable that >>> cron job as well, and start the conversion process with a view to having >>> the converted repository in place this weekend (it could either be made >>> writable as soon as I think it's ready, or left read-only until people >>> have had time to do any final checks on Monday). Before then, I'll work >>> on hooks, documentation and maintainer-scripts updates. >> >> Where and when and by who was it decided to use this conversion? > > Joseph, please point to message on gcc@ mailing list that expresses consensus > of GCC community to use reposurgeon conversion. Otherwise, it is not > appropriate to substitute one's opinion for community consensus. > > I want GCC community to get the best possible conversion, being it mine or > reposurgeon's. To this end I'm comparing the two conversions and will post > my results later today. > > Unfortunately, the comparison is complicated by the fact that you uploaded > only "b" version of gcc-reposurgeon-8 repository, which uses modified branch > layout (or confirm that there are no substantial differences between "7" and > "8" reposurgeon conversions). There are plenty of differences between the reposurgeon and svn-git conversions; today I've ignored subjective differences like author and committer entries and focused on comparing the histories of branches.
Redhat's branches are among the most complicated, and the analysis below is difficult to follow. It took me most of today to untangle it. Let's look at redhat/gcc-9-branch.

TL;DR:
1. The reposurgeon conversion has extra history (more commits than intended) of redhat/gcc-4_7-branch@182541 merged into redhat/gcc-4_8-branch, which is then propagated into all following branches including redhat/gcc-9-branch.
2. The svn-git conversion has redhat/gcc-4_8-branch with history corresponding to SVN history, with no fewer and no more commits.
3. Other branches are likely to have similar issues; I didn't check.
4. I consider the history of the reposurgeon conversion to be incorrect.
5. The only history artifact (extra merges in reparented branches/tags) of the svn-git conversion has been fixed.
6. I can appreciate that the GCC community is tired of this discussion and wants it to go away.

Analysis:

Commit histories for redhat/gcc-9-branch match up to the history inherited from redhat/gcc-4_8-branch (yes, redhat's branch history goes into ancient branches). So now we are looking at redhat/gcc-4_8-branch, and the two conversions have different commit histories for it. This is relevant because it shows up in the current development branch. The histories diverge at r194477:

r194477 | jakub | 2012-12-13 13:34:44 + (Thu, 13 Dec 2012) | 3 lines
svn merge -r182540:182541 svn+ssh://gcc.gnu.org/svn/gcc/branches/redhat/gcc-4_7-branch
svn merge -r182546:182547 svn+ssh://gcc.gnu.org/svn/gcc/branches/redhat/gcc-4_7-branch
Added: svn:mergeinfo ## -0,0 +0,4 ##
Merged /branches/redhat/gcc-4_4-branch:r143377,143388,144574,144578,155228
Merged /branches/redhat/gcc-4_5-branch:r161595
Merged /branches/redhat/gcc-4_6-branch:r168425
Merged /branches/redhat/gcc-4_7-branch:r182541,182547

To me this looks like cherry-picks of r182541 and r182547 from redhat/gcc-4_7-branch into redhat/gcc-4_8-branch.
[1] Note that commit r182541 is itself a merge of redhat/gcc-4_6-branch@168425 into redhat/gcc-4_7-branch and cherry-picks from the other branches. It is an actual merge (not a cherry-pick) from redhat/gcc-4_6-branch@168425 because r168425 is the only commit to redhat/gcc-4_6-branch not present on trunk. The other branches had more commits in their histories, so they can't be represented as git merges. The reposurgeon commit for r194477 (e601ffdd860b0deed6d7ce78da61e8964c287b0b) merges in the commit for r182541 from redhat/gcc-4_7-branch, bringing the *full* history of redhat/gcc-
Re: Proposal for the transition timetable for the move to GIT
> On Jan 10, 2020, at 6:15 PM, Joseph Myers wrote: > > On Fri, 10 Jan 2020, Maxim Kuvyrkov wrote: > >> To me this looks like cherry-picks of r182541 and r182547 from >> redhat/gcc-4_7-branch into redhat/gcc-4_8-branch. > > r182541 is the first commit on /branches/redhat/gcc-4_7-branch after it > was created as a copy of trunk. I.e., merging and cherry-picking it are > indistinguishable, and it's entirely correct for reposurgeon to consider a > commit merging it as a merge from r182541 (together with a cherry-pick of > r182547). I was wrong about r182541; I didn't notice that it is the first commit on the branch. This tips the analysis in favor of the reposurgeon conversion, not svn-git. -- Maxim Kuvyrkov https://www.linaro.org
Re: Stack protector: leak of guard's address on stack
> On Apr 28, 2018, at 9:22 PM, Florian Weimer wrote: > > * Thomas Preudhomme: > >> Yes absolutely, CSE needs to be avoided. I made memory access volatile >> because the change was easier to do. Also on Arm Thumb-1 computing the >> guard's address itself takes several loads so had to modify some more >> patterns. Anyway, regardless of the proper fix, do you have any objection >> to raising a CVE for that issue? > > Please file a bug in Bugzilla first and use that in the submission to > MITRE. Thomas filed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85434 a couple of weeks ago. -- Maxim Kuvyrkov www.linaro.org
Re: Stack protector: leak of guard's address on stack
> On Apr 29, 2018, at 2:11 PM, Florian Weimer wrote: > > * Maxim Kuvyrkov: > >>> On Apr 28, 2018, at 9:22 PM, Florian Weimer wrote: >>> >>> * Thomas Preudhomme: >>> >>>> Yes absolutely, CSE needs to be avoided. I made memory access volatile >>>> because the change was easier to do. Also on Arm Thumb-1 computing the >>>> guard's address itself takes several loads so had to modify some more >>>> patterns. Anyway, regardless of the proper fix, do you have any objection >>>> to raising a CVE for that issue? >>> >>> Please file a bug in Bugzilla first and use that in the submission to >>> MITRE. >> >> Thomas filed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85434 couple >> of weeks ago. > > Is there a generic way to find other affected targets? > > If we only plan to fix 32-bit Arm, we should make the CVE identifier > specific to that, to avoid confusion. The problem is fairly target-dependent, so architecture maintainers need to look at how stack-guard canaries and their addresses are handled and whether they can be spilled onto the stack. It appears we need to poll the architecture maintainers before filing the CVE. -- Maxim Kuvyrkov www.linaro.org
Re: Devirtualization in gcc
On Jan 26, 2011, at 3:27 AM, Ian Lance Taylor wrote: > Black Bit writes: > >> Could someone tell me if the work described in this paper >> http://www.linuxsymposium.org/archives/GCC/Reprints-2006/namolaru-reprint.pdf >> was completed and is part of gcc? Thanks >> > > To the best of my knowledge the work has not yet become part of mainline > gcc. Perhaps the Haifa folks can correct me if I am wrong. The approach described in the paper resembles the devirtualization optimizations Martin Jambor implemented as part of the IPA CP pass. AFAIK, the two implementations were different efforts. The implementation in current mainline does not define the lattice to track types as clearly as the paper does, but functionally it is very similar. We (CodeSourcery) have patches that refactor the type propagation code in ipa-cp.c to clearly describe the type information lattice [*]. Having the information represented as a lattice is advantageous, as it makes it easier to reuse the devirtualization analysis in other optimization passes. [*] http://gcc.gnu.org/ml/gcc/2010-12/msg00461.html -- Maxim Kuvyrkov CodeSourcery +7-812-677-6839
Simplification of relational operations (was [patch for PR18942])
Zdenek, I'm looking at a missed optimization in combine, and it is similar to the one you fixed in PR18942 (http://thread.gmane.org/gmane.comp.gcc.patches/81504). I'm trying to make GCC optimize

(leu:SI
  (plus:SI (reg:SI) (const_int -1))
  (const_int 1))

into

(leu:SI
  (reg:SI)
  (const_int 2))

Your patch for PR18942 handles only EQ/NE comparisons, and I wonder if there is a reason not to handle LEU/GEU and LTU/GTU comparisons as well. I'm a bit fuzzy on whether signed comparisons can be optimized here as well, but I can't see the problem with unsigned comparisons. Any reason why this optimization would be unsafe?

Regarding the testcase, the general pattern

(set (tmp1) (plus:SI (reg:SI) (const_int A)))
(set (tmp2) (leu:SI (tmp1) (const_int B)))

is generated from the switch statement

switch (reg) { case A: case B: ... }

Combine tries to merge the two instructions into one, but fails. This causes an extra 'add' instruction per switch statement in the final assembly. The target I'm working with is MIPS but, I imagine, other architectures are affected as well. Thank you, -- Maxim Kuvyrkov CodeSourcery / Mentor Graphics
Re: Simplification of relational operations (was [patch for PR18942])
On 2/12/2011, at 9:45 PM, Jakub Jelinek wrote: > On Fri, Dec 02, 2011 at 03:33:06PM +1300, Maxim Kuvyrkov wrote: >> I'm looking at a missed optimizations in combine and it is similar to the >> one you've fixed in PR18942 >> (http://thread.gmane.org/gmane.comp.gcc.patches/81504). >> >> I'm trying to make GCC optimize >> (leu:SI >> (plus:SI (reg:SI) (const_int -1)) >> (const_int 1)) >> >> into >> >> (leu:SI >> (reg:SI) >> (const_int 2)) >> . >> >> Your patch for PR18942 handles only EQ/NE comparisons, and I wonder if >> there is a reason not to handle LEU/GEU, LTU/GTU comparisons as well. I'm >> a bit fuzzy whether signed comparisons can be optimized here as well, but >> I can't see the problem with unsigned comparisons. > > Consider reg:SI being 0? Then (leu:SI (plus:SI (reg:SI) (const_int -1)) > (const_int 1)) > is 0, but (leu:SI (reg:SI) (const_int 2)) is 1. > You could transform this if you have a guarantee that reg:SI will not be 0 > (and, in your general > >> Regarding the testcase, the general pattern >> >> (set (tmp1) (plus:SI (reg:SI) (const_int A)) >> (set (tmp2) (leu:SI (tmp1) (const_int B)) > > case that reg:SI isn't 0 .. A-1). Jakub, Zdenek, Thank you for explaining the overflow in comparisons. In fact, the unsigned overflow is intentionally used when expanding switch statements to catch the 'default:' case in certain switch statements with just one conditional branch. -- Maxim Kuvyrkov CodeSourcery / Mentor Graphics
Re: [SH] ICE compiling pr34330 testcase for sh-linux-gnu
Andrew Stubbs wrote: I'm having trouble with an ICE, and I'm hoping somebody can enlighten me. Given the following command:

cc1 -fpreprocessed ../pr34330.i -quiet -dumpbase pr34330.c -da -mb -auxbase-strip pr34330.c -Os -version -ftree-parallelize-loops=4 -ftree-vectorize -o pr34330.s -fschedule-insns

I get an internal compiler error:

GNU C (GCC) version 4.5.0 20090702 (experimental) (sh-linux-gnu) compiled by GNU C version 4.3.2, GMP version 4.3.1, MPFR version 2.4.1-p5, MPC version 0.6
GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
GNU C (GCC) version 4.5.0 20090702 (experimental) (sh-linux-gnu) compiled by GNU C version 4.3.2, GMP version 4.3.1, MPFR version 2.4.1-p5, MPC version 0.6
GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
Compiler executable checksum: c91a929a0209c0670a3ae8b8067b9f9a

/scratch/ams/4.4-sh-linux-gnu-lite/src/gcc-trunk-4.4/gcc/testsuite/gcc.dg/torture/pr34330.c: In function 'foo':
/scratch/ams/4.4-sh-linux-gnu-lite/src/gcc-trunk-4.4/gcc/testsuite/gcc.dg/torture/pr34330.c:22:1: error: insn does not satisfy its constraints:
(insn 171 170 172 4 /scratch/ams/4.4-sh-linux-gnu-lite/src/gcc-trunk-4.4/gcc/testsuite/gcc.dg/torture/pr34330.c:17
   (set (reg:SI 9 r9)
        (plus:SI (reg:SI 8 r8)
                 (reg:SI 0 r0 [orig:243 ivtmp.11 ] [243]))) 35 {*addsi3_compact} (nil))
/scratch/ams/4.4-sh-linux-gnu-lite/src/gcc-trunk-4.4/gcc/testsuite/gcc.dg/torture/pr34330.c:22:1: internal compiler error: in reload_cse_simplify_operands, at postreload.c:396

This looks much like PR37053 on m68k/ColdFire; the easiest way to check whether this ICE was caused by the same error is to revert the hunk in rtlanal.c:commutative_operand_precedence() -- see the PR. As to the fix, there are several patches being discussed here (http://gcc.gnu.org/ml/gcc-patches/2009-07/msg00816.html) and here (http://gcc.gnu.org/ml/gcc-patches/2009-07/msg00823.html). My $0.02. -- Maxim
[RFA] dwarf2out.c:eliminate_regs() bug
I'm investigating an ICE on the m68k architecture. I'm not quite sure what the right way to fix the bug is, so I welcome any feedback on the analysis below. Compilation fails on the assert in dwarf2out.c:based_loc_descr():

  /* We only use "frame base" when we're sure we're talking about the
     post-prologue local stack frame.  We do this by *not* running
     register elimination until this point, and recognizing the special
     argument pointer and soft frame pointer rtx's.  */
  if (reg == arg_pointer_rtx || reg == frame_pointer_rtx)
    {
      rtx elim = eliminate_regs (reg, VOIDmode, NULL_RTX);

      if (elim != reg)
        {
          if (GET_CODE (elim) == PLUS)
            {
              offset += INTVAL (XEXP (elim, 1));
              elim = XEXP (elim, 0);
            }
          gcc_assert ((SUPPORTS_STACK_ALIGNMENT
                       && (elim == hard_frame_pointer_rtx
                           || elim == stack_pointer_rtx))
                      || elim == (frame_pointer_needed
                                  ? hard_frame_pointer_rtx
                                  : stack_pointer_rtx));

This code uses eliminate_regs(), which implicitly assumes reload_completed: it uses reg_eliminate[], which assumes that frame_pointer_needed is properly set, which happens in ira.c. However, in some cases this piece of based_loc_descr() can be reached during the inlining pass (see backtrace below). When called before reload, eliminate_regs() may return an inconsistent result, which is why the assert in based_loc_descr() fails. In the particular testcase I'm investigating, frame_pointer_needed is 0 (the initial value), but eliminate_regs returns stack_pointer_rtx because it is guided by reg_eliminate information from the previous function, which had frame_pointer_needed set to 1. Now, how do we fix this? For starters, it seems to be a good idea to assert (reload_in_progress || reload_completed) in eliminate_regs. Then, there are users of eliminate_regs in dbxout.c, dwarf2out.c, and sdbout.c, not counting reload and machine-specific parts. Of the three *out.c files, only dwarf2out.c handles abstract functions, which is what causes it to be called before reload AFAIK, so the task seems to be in fixing the dwarf2out code.
There are two references to eliminate_regs in dwarf2out. The first -- in based_loc_descr -- can *probably* be handled by adding reload_completed to the 'if' condition. The second is in compute_frame_pointer_to_fb_displacement. I'm no expert in the dwarf2out.c code, but from the looks of it compute_..._displacement is only called after reload, so a simple gcc_assert (reload_completed) may be enough there. One last note: I'm investigating this bug against the 4.4 branch, as it doesn't trigger on mainline. A progression search on mainline showed that the failure became latent after this patch (http://gcc.gnu.org/viewcvs?view=revision&revision=147436) to the inlining heuristics. -- Maxim K. CodeSourcery

The backtrace:
#0 eliminate_regs_1 (x=0xf7d60280, mem_mode=VOIDmode, insn=0x0, may_use_invariant=0 '\0') at gcc/reload1.c:2481
#1 0x0839e9b1 in eliminate_regs (x=0xf7d60280, mem_mode=VOIDmode, insn=0x0) at gcc/reload1.c:2870
#2 0x0821cf66 in based_loc_descr (reg=0xf7d60280, offset=8, initialized=VAR_INIT_STATUS_INITIALIZED) at gcc/dwarf2out.c:9868
#3 0x0821d7a7 in mem_loc_descriptor (rtl=0xf700bd98, mode=SImode, initialized=VAR_INIT_STATUS_INITIALIZED) at gcc/dwarf2out.c:10158
#4 0x0821dd55 in loc_descriptor (rtl=0xf700bc90, initialized=VAR_INIT_STATUS_INITIALIZED) at gcc/dwarf2out.c:10330
#5 0x0821ddde in loc_descriptor (rtl=0xf700d7a0, initialized=VAR_INIT_STATUS_INITIALIZED) at gcc/dwarf2out.c:10349
#6 0x082205d6 in add_location_or_const_value_attribute (die=0xf702ad20, decl=0xf73922d0, attr=DW_AT_location) at gcc/dwarf2out.c:11841
#7 0x08223412 in gen_formal_parameter_die (node=0x0, origin=0xf73922d0, context_die=0xf702ace8) at gcc/dwarf2out.c:13349
#8 0x082273c6 in gen_decl_die (decl=0x0, origin=0xf73922d0, context_die=0xf702ace8) at gcc/dwarf2out.c:15388
#9 0x082268aa in process_scope_var (stmt=0xf7163620, decl=0x0, origin=0xf73922d0, context_die=0xf702ace8) at gcc/dwarf2out.c:14969
#10 0x0822698d in decls_for_scope (stmt=0xf7163620, context_die=0xf702ace8, depth=5) at gcc/dwarf2out.c:14993
#11 0x08225192 in gen_lexical_block_die (stmt=0xf7163620, context_die=0xf702a498, depth=5) at gcc/dwarf2out.c:14266
#12 0x082253b5 in gen_inlined_subroutine_die (stmt=0xf7163620, context_die=0xf702a498, depth=5) at gcc/dwarf2out.c:14308
#13 0x08226711 in gen_block_die (stmt=0xf7163620, context_die=0xf702a498, depth=5) at gcc/dwarf2out.c:14935
#14 0x082269ee in decls_for_scope (stmt=0xf7163038, context_die=0xf702a498, depth=4) at gcc/dwarf2out.c:15005
#15 0x08225192 in gen_lexical_block_die (stmt=0xf7163038, context_die=0xf7026f18, depth=4) at gcc/dwarf2out.c:14266
#16 0x0822672c in gen_block_die (stmt=0xf7163038, c
Re: question about speculative scheduling in gcc
Amker.Cheng wrote: Hi: I'm puzzled when looking into speculative scheduling in gcc, the 4.2.4 version. First, I noticed the document describing the IBM Haifa instruction scheduler (the PowerPC Reference Compiler Optimization Project). It presents that the instruction motion from bb s (dominated by t) to t is speculative when split_blocks(s, t) is not empty. Second, there are SCHED_FLAGS like DO_SPECULATION in the code. These are two different types of speculative optimizations. Here go the questions. 1. Does the DO_SPECULATION flag control whether the mentioned speculative motion is done or not? The DO_SPECULATION flag controls generation of IA64 data- and control-speculative instructions. It is not used on other architectures. Speculative instruction moves from the split blocks are controlled by flag_schedule_speculative. -- Maxim
Re: [RFA] dwarf2out.c:eliminate_regs() bug
Richard Guenther wrote: On Sun, Sep 20, 2009 at 9:38 AM, Maxim Kuvyrkov wrote: ... This code uses eliminate_regs(), which implicitly assumes reload_completed as it uses reg_eliminate[], which assumes that frame_pointer_needed is properly set, which happens in ira.c. However, in some cases this piece of based_loc_descr() can be reached during inlining pass (see backtrace below). When called before reload, eliminate_regs() may return an inconsistent result which is why the assert in based_loc_descr() fails. In the particular testcase I'm investigating, frame_pointer_needed is 0 (initial value), but eliminate_regs returns stack_pointer_rtx because it is guided by reg_eliminate information from the previous function which had frame_pointer_needed set to 1. ... I think you should avoid calling eliminate_regs for DECL_ABSTRACT current_function_decl. That should cover the inliner path.

Thanks for the insight. Do you mean something like the attached patch?

-- Maxim

Index: gcc/dwarf2out.c
===================================================================
--- gcc/dwarf2out.c	(revision 261914)
+++ gcc/dwarf2out.c	(working copy)
@@ -9862,8 +9862,11 @@ based_loc_descr (rtx reg, HOST_WIDE_INT
   /* We only use "frame base" when we're sure we're talking about the
      post-prologue local stack frame.  We do this by *not* running
      register elimination until this point, and recognizing the special
-     argument pointer and soft frame pointer rtx's.  */
-  if (reg == arg_pointer_rtx || reg == frame_pointer_rtx)
+     argument pointer and soft frame pointer rtx's.
+     We might get here during the inlining pass (DECL_ABSTRACT is true then),
+     so don't try eliminating registers in such a case.  */
+  if (!DECL_ABSTRACT (current_function_decl)
+      && (reg == arg_pointer_rtx || reg == frame_pointer_rtx))
     {
       rtx elim = eliminate_regs (reg, VOIDmode, NULL_RTX);
@@ -12224,6 +12227,9 @@ compute_frame_pointer_to_fb_displacement
   offset += ARG_POINTER_CFA_OFFSET (current_function_decl);
 #endif
 
+  /* Make sure we don't try eliminating registers in abstract function.  */
+  gcc_assert (!DECL_ABSTRACT (current_function_decl));
+
   elim = eliminate_regs (reg, VOIDmode, NULL_RTX);
   if (GET_CODE (elim) == PLUS)
     {
Index: gcc/reload1.c
===================================================================
--- gcc/reload1.c	(revision 261914)
+++ gcc/reload1.c	(working copy)
@@ -2867,6 +2867,7 @@ eliminate_regs_1 (rtx x, enum machine_mo
 rtx
 eliminate_regs (rtx x, enum machine_mode mem_mode, rtx insn)
 {
+  gcc_assert (reload_in_progress || reload_completed);
   return eliminate_regs_1 (x, mem_mode, insn, false);
 }
Re: [RFA] dwarf2out.c:eliminate_regs() bug
Richard Guenther wrote: ... Yes, though we should probably try to catch the DECL_ABSTRACT case further up the call chain - there shouldn't be any location lists for abstract functions. Thus, see why

static dw_die_ref
gen_formal_parameter_die (tree node, tree origin, dw_die_ref context_die)
...
  if (! DECL_ABSTRACT (node_or_origin))
    add_location_or_const_value_attribute (parm_die, node_or_origin,
                                           DW_AT_location);

the node_or_origin of the param isn't DECL_ABSTRACT. In the end the above check should have avoided the situation you run into.

The node_or_origin (== origin) turned out to be the 'this' pointer. It came from BLOCK_NONLOCALIZED_VARs in decls_for_scope():

static void
decls_for_scope (tree stmt, dw_die_ref context_die, int depth)
{
  ...
  for (i = 0; i < BLOCK_NUM_NONLOCALIZED_VARS (stmt); i++)
    process_scope_var (stmt, NULL, BLOCK_NONLOCALIZED_VAR (stmt, i),
                       context_die);
  ...
}

set_decl_abstract_flags() doesn't seem to process BLOCK_NONLOCALIZED_VARs. From what I gather, this is correct behavior. At this point I got the feeling that something is clobbering the information. There is this patch by Honza (http://gcc.gnu.org/viewcvs/trunk/gcc/dwarf2out.c?r1=151901&r2=151917) that fixes a clobbering issue with abstract functions. Backporting it to my sources fixed the problem, yay! Honza, does the bug you've fixed with the above patch resemble the problem I've stumbled into? Regards, -- Maxim K.
Re: How to define 2 bypasses for a single pair of insn_reservation
Vladimir Makarov wrote: Ye, Joey wrote: ... Anyone can help me through this please? It was supposed to have two latency definitions at most (one in define_insn_reservation and another one in define_bypass). At the time that seemed enough for all processors supported by GCC. It also simplified the semantics definition when two bypass conditions return true for the same insn pair. If you really need more than one bypass for an insn pair, I could implement this. Please, let me know. In this case the semantics of choosing the latency time could be:

o the time in the first bypass occurring in the pipeline description whose condition returns true
o the time given in define_insn_reservation

I had a similar problem with the ColdFire V4 scheduler model, and the solution for me was using the adjust_cost() target hook; it is a bit complicated, but it works fine. Search m68k.c for 'bypass' for more information; the comments there describe the thing in sufficient detail. -- Maxim
Re: How to define 2 bypasses for a single pair of insn_reservation
Ye, Joey wrote: Maxim and Vladimir wrote: Anyone can help me through this please? It was supposed to have two latency definitions at most (one in define_insn_reservation and another one in define_bypass). That time it seemed enough for all processors supported by GCC. It also simplified semantics definition when two bypass conditions returns true for the same insn pair. If you really need more one bypass for insn pair, I could implement this. Please, let me know. In this case semantics of choosing latency time could be o time in first bypass occurred in pipeline description whose condition returns true o time given in define_insn_reservation I had a similar problem with ColdFire V4 scheduler model and the solution for me was using adjust_cost() target hook; it is a bit complicated, but it works fine. Search m68k.c for 'bypass' for more information, comments there describe the thing in sufficient detail. Maxim, I read your implementation in m68k.c. IMHO it is a smart but tricky solution. For example, it depends on the assumption that targetm.sched.adjust_cost() is immediately called after bypass_p(). Yes, it does depend on this assumption, and the comment states exactly that. Also the redundant check and calls to min_insn_conflict_delay look inefficient. Which check[s] do you have in mind -- the gcc_assert's? Also, out of curiosity, what is inefficient about the use of min_insn_conflict_delay? For the record, min_insn_conflict_delay has nothing to do with emulating two bypasses; this tweak makes the scheduler faster by not adding instructions to the ready list, which makes haifa-sched.c:max_issue() do its exhaustive-like search on a smaller set. I'd prefer to extend semantics to support more than one bypass. Don't get me wrong, I'm not against adding support for N>1 bypasses; it is not that easy though ;) . -- Maxim
Re: About Hazard Recognizer:DFA
daniel tian wrote: Hi Dr. Uday Khedker: Happy New Year! I met a hazard problem, and I have been debugging this error for a few days. I wrote a DFA description to avoid the load hazard, but it still exists. I wonder whether by default the command './cc1 hazard.c' doesn't compile the file with the DFA. By default the scheduler is enabled starting at -O2, so this should at least be './cc1 -O2 hazard.c'. Of course, you should also add generation of nops, as Vladimir said, either in machine dependent reorg or in the assembler. Also, scheduler dumps may be helpful for you; they can be enabled via the -fsched-verbose=N switch. -- Maxim
Re: scheduling question
Alex Turjan wrote: Hi, During scheduling I'm confronted with the fact that an instruction is moved from the ready list to the queue with cost 2, while according to my expectations the insn should have been moved to the queue with cost 1. Did anybody experience a similar problem? From what you described it's not clear what the problem is. When the scheduler infers that an instruction cannot be scheduled in the next N cycles (due to a DFA hazard, to insn_cost/dep_cost hook considerations, or to something else) the scheduler queues the instruction for the (N+1)th cycle. In case an insn is ready but cannot be scheduled in the current cycle, is it correct (i.e., is the generated code correct) to move the insn to the queue with cost 1, no matter what value >= 1 state_transition returned? Yes, that would be correct from the code correctness point of view, but state_transition() *will* make the scheduler requeue the instruction on the next cycle, so you will just lose compile time. It seems to me that moving from the ready list to the queue with cost >= 1 is an optimization for compilation time. Correct, the scheduler would be working unnecessarily long otherwise. -- Maxim
Re: sched2, ret, use, and VLIW bundling
DJ Delorie wrote: I'm working on a VLIW coprocessor for MeP. One thing I noticed is that sched2 won't bundle the function's RET with the insn that sets the return value register, apparently because there's an intervening USE of that register (insn 30 in the example below). Is there any way around this? The return value obviously isn't actually used there, nor does the return insn need it - that USE is just to keep the return value live until the function exits. The problem may be that the dependency cost between the SET (insn 27) and the USE (insn 30) is >= 1. Have you tried using the targetm.sched.adjust_cost() hook to set the cost of the USE to 0? Anyway, this seems strange; the scheduler should just output the USEs as soon as they are ready. One of the few places where this can be forced untrue is the targetm.sched.dfa_new_cycle() hook; does your port define it? -- Maxim
Performance optimizations for Intel Core 2 and Core i7 processors
CodeSourcery is working on improving performance for Intel's Core 2 and Core i7 families of processors. CodeSourcery plans to add support for unaligned vector instructions, to provide fine-tuned scheduling support, and to update instruction selection and instruction cost models for the Core i7 and Core 2 families of processors. As usual, CodeSourcery will be contributing its work to GCC. Currently, our target is the end of GCC 4.6 Stage1. If your favorite benchmark significantly under-performs on Core 2 or Core i7 CPUs, don't hesitate to ask us to take a look at it. We appreciate Intel sponsoring this project. Thank you, -- Maxim Kuvyrkov CodeSourcery ma...@codesourcery.com (650) 331-3385 x724
Re: Performance optimizations for Intel Core 2 and Core i7 processors
On 5/20/10 4:04 PM, Steven Bosscher wrote: On Mon, May 17, 2010 at 8:44 AM, Maxim Kuvyrkov wrote: CodeSourcery is working on improving performance for Intel's Core 2 and Core i7 families of processors. CodeSourcery plans to add support for unaligned vector instructions, to provide fine-tuned scheduling support and to update instruction selection and instruction cost models for Core i7 and Core 2 families of processors. As usual, CodeSourcery will be contributing its work to GCC. Currently, our target is the end of GCC 4.6 Stage1. If your favorite benchmark significantly under-performs on Core 2 or Core i7 CPUs, don't hesitate asking us to take a look at it. I'd like to ask you to look at ffmpeg (missed core2 vectorization opportunities), polyhedron (PR34501, like, duh! :-), and Apache benchmark (-mtune=core2 results in lower scores). You could check overall effects on an openly available benchmark suite such as http://www.phoronix-test-suite.com/ Thank you for the pointers! -- Maxim Kuvyrkov CodeSourcery ma...@codesourcery.com (650) 331-3385 x724
Re: Performance optimizations for Intel Core 2 and Core i7 processors
On 5/21/10 9:06 PM, Vladimir N. Makarov wrote: On 05/17/2010 02:44 AM, Maxim Kuvyrkov wrote: ... If your favorite benchmark significantly under-performs on Core 2 or Core i7 CPUs, don't hesitate asking us to take a look at it. What I saw is people complaining about -mtune=core2 for polyhedron http://gcc.gnu.org/ml/gcc-patches/2010-02/msg01272.html The biggest complaint was on mdbx (about 16%). Thank you for the pointers and analysis! ... Also I think it is important to have a pipeline description for Core2/Core i7 or at least to use it from generic. Right. We will be adding a pipeline description for Core 2/i7. Thank you, -- Maxim Kuvyrkov CodeSourcery ma...@codesourcery.com (650) 331-3385 x724
Re: Problem configuring uclinux toolchain
On 7/9/10 3:22 PM, Anthony Green wrote: Hi Maxim, Recent changes to config.gcc are preventing me from building a moxie-uclinux toolchain. Anthony, What is the error the build process is failing on? Thanks, -- Maxim Kuvyrkov CodeSourcery ma...@codesourcery.com (650) 331-3385 x724
Re: Revisiting the use of cselib in alias.c for scheduling
On 7/21/10 6:44 PM, Bernd Schmidt wrote: On 07/21/2010 03:06 PM, Steven Bosscher wrote: 3. GCC now has better alias analysis than it used to, especially with the alias-exporting stuff that exports the GIMPLE points-to analysis results, but also just all the other little things that were contributed over the last 10 years (little things like tree-ssa :) [...] It looks like ~9% extra !true_dependence cases are found with cselib, which is not insignificant: ... If that can't be improved, I think that rather than remove cselib from the scheduler, the question should be: if it's useful, why don't we use it for other schedulers rather than only sched-ebb? Cselib can /always/ be used during the second scheduling pass and on single-block regions during the first scheduling pass (after RA, sched-rgn operates on single-block regions). Modulo the bugs enabling cselib might surface, the only reason not to enable cselib for single-block regions in sched-rgn may be increased compile time. That requires some benchmarking, but my gut feeling is that the benefits would outweigh the compile-time cost. -- Maxim Kuvyrkov CodeSourcery ma...@codesourcery.com (650) 331-3385 x724
Re: Revisiting the use of cselib in alias.c for scheduling
On 7/22/10 3:34 AM, Steven Bosscher wrote: On Wed, Jul 21, 2010 at 10:09 PM, Maxim Kuvyrkov wrote: Cselib can /always/ be used during the second scheduling pass Except with the selective scheduler when it works on regions that are not extended basic blocks, I suppose? Right, I was considering the sched-rgn scheduler, not sel-sched. and on single-block regions during the first scheduling pass (after RA sched-rgn operates on single-block regions). Modulo the bugs enabling cselib might surface, the only reason not to enable cselib for single-block regions in sched-rgn may be increased compile time. That requires some benchmarking, but my gut feeling is that the benefits would outweigh the compile-time cost. So something like the following _should_ work? If so, I'll give it a try on x86*. Ciao! Steven

Index: sched-rgn.c
===================================================================
--- sched-rgn.c	(revision 162355)
+++ sched-rgn.c	(working copy)
@@ -3285,8 +3285,11 @@
 rgn_setup_sched_infos (void)
 {
   if (!sel_sched_p ())
-    memcpy (&rgn_sched_deps_info, &rgn_const_sched_deps_info,
-	    sizeof (rgn_sched_deps_info));
+    {
+      memcpy (&rgn_sched_deps_info, &rgn_const_sched_deps_info,
+	      sizeof (rgn_sched_deps_info));
+      rgn_sched_deps_info.use_cselib = reload_completed;

Yes, this should work. You can also enable cselib for single-block regions for the first scheduling pass too. I.e.,

index 89743c3..047b717 100644
--- a/gcc/sched-rgn.c
+++ b/gcc/sched-rgn.c
@@ -2935,6 +2935,9 @@ schedule_region (int rgn)
   if (sched_is_disabled_for_current_region_p ())
     return;

+  gcc_assert (!reload_completed || current_nr_blocks == 1);
+  rgn_sched_deps_info.use_cselib = (current_nr_blocks == 1);
+
   sched_rgn_compute_dependencies (rgn);
   sched_rgn_local_init (rgn);

Thanks, -- Maxim Kuvyrkov CodeSourcery ma...@codesourcery.com (650) 331-3385 x724
Re: Performance optimizations for Intel Core 2 and Core i7 processors
On 8/13/10 11:40 PM, Jack Howarth wrote: On Mon, May 17, 2010 at 10:44:57AM +0400, Maxim Kuvyrkov wrote: CodeSourcery is working on improving performance for Intel's Core 2 and Core i7 families of processors. CodeSourcery plans to add support for unaligned vector instructions, to provide fine-tuned scheduling support and to update instruction selection and instruction cost models for Core i7 and Core 2 families of processors. As usual, CodeSourcery will be contributing its work to GCC. Currently, our target is the end of GCC 4.6 Stage1. If your favorite benchmark significantly under-performs on Core 2 or Core i7 CPUs, don't hesitate to ask us to take a look at it. We appreciate Intel sponsoring this project. Maxim, Do you have any updates on the progress of this project? Since it has been proposed to default intel darwin to -mtune=core2, it would be very helpful to be able to test (benchmark) any proposed changes on x86_64-apple-darwin10 with gcc trunk. Thanks in advance. Jack, We will start posting patches very soon. Bernd Schmidt has almost finished the pipeline model for Core 2/i7, so that will be the first piece of work we'll post for upstream review. Regards, -- Maxim Kuvyrkov CodeSourcery ma...@codesourcery.com (650) 331-3385 x724
Re: Questions about selective scheduler and PowerPC
On 10/19/10 6:16 PM, Andrey Belevantsev wrote: ... I agree that ISSUE_POINTS can be removed, as it was not used (maybe Maxim can comment more on this). I too agree with removing ISSUE_POINTS, it never found any use. Regards, -- Maxim Kuvyrkov CodeSourcery ma...@codesourcery.com (650) 331-3385 x724
Loop-iv.c ICEs on subregs
Zdenek, I'm investigating an ICE in loop-iv.c:get_biv_step(). I hope you can shed some light on what the correct fix would be. The ICE happens when processing:
==
(insn 111 (set (reg:SI 304)
               (plus (subreg:SI (reg:DI 251) 4)
                     (const_int 1))))

(insn 177 (set (subreg:SI (reg:DI 251))
               (reg:SI 304)))
==
Code like the above does not occur on current mainline early enough for loop-iv.c to catch it. The subregs above are produced by Tom's (CC'ed) extension elimination pass (scheduled before fwprop1), which is not in mainline yet [*]. The failure is due to an assert in loop-iv.c:get_biv_step():
==
gcc_assert ((*inner_mode == *outer_mode) != (*extend != UNKNOWN));
==
i.e., inner and outer modes can differ iff there's an extend in the chain. Get_biv_step_1() starts with insn 177, then gets to insn 111, then loops back to insn 177, at which point it stops, returns GRD_MAYBE_BIV and sets:
* outer_mode == DImode == natural mode of (reg A);
* inner_mode == SImode == mode of (subreg (reg A)), set in get_biv_step_1:
==
if (GET_CODE (next) == SUBREG)
  {
    enum machine_mode amode = GET_MODE (next);

    if (GET_MODE_SIZE (amode) > GET_MODE_SIZE (*inner_mode))
      return false;

    *inner_mode = amode;
    *inner_step = simplify_gen_binary (PLUS, outer_mode,
				       *inner_step, *outer_step);
    *outer_step = const0_rtx;
    *extend = UNKNOWN;
  }
==
* extend == UNKNOWN, as there are no extensions in the chain.
It seems to me that the computations of outer_mode and extend are correct; I'm not sure about inner_mode. Zdenek, what do you think is the right way to handle the above case in loop analysis? [*] http://gcc.gnu.org/ml/gcc-patches/2010-10/msg01529.html Thanks, -- Maxim Kuvyrkov CodeSourcery +1-650-331-3385 x724
Re: Loop-iv.c ICEs on subregs
On Nov 26, 2010, at 3:51 AM, Zdenek Dvorak wrote: > Hi, > >> I'm investigating an ICE in loop-iv.c:get_biv_step(). I hope you can shed >> some light on what the correct fix would be. >> >> The ICE happens when processing: >> == >> (insn 111 (set (reg:SI 304) >> (plus (subreg:SI (reg:DI 251) 4) >> (const_int 1 >> >> (insn 177 (set (subreg:SI (reg:DI 251)) >> (reg:SI 304))) >> == ... > > loop iv analysis currently does not handle assignments to subregs ... > So, if such a code > gets produced consistently for a large fraction of the loops, it would be > necessary to teach loop-iv to analyze induction variables represented in > subregs. This would mean a more involved rewrite of loop-iv, though, I see. In that case a simpler way to fix the problem may be to move Tom's extension elimination pass /after/ loop optimizers. Do you (or anyone reading this thread) have suggestions on what would be a good spot in the optimization pipeline for sign- and zero-extension elimination pass? Thanks, -- Maxim Kuvyrkov CodeSourcery +1-650-331-3385 x724
Account for devirtualization opportunities in inlining heuristics
Jan, Here are the testcases for the inlining improvements we've discussed on IRC a couple of days ago. Current mainline handles the inline-devirt-1.C and inline-devirt-5.C cases. With my w-i-p patches to teach inlining heuristics about devirtualization opportunities (also attached), inline-devirt-2.C and inline-devirt-3.C are also fully optimized. Let me know if you have suggestions for tackling the other cases. Do you think committing the testcases to mainline, XFAIL'ed as necessary, would be useful? Thanks, -- Maxim Kuvyrkov CodeSourcery +1-650-331-3385 x724
Attachments:
- 0005-Testcases.patch
- 0002-Refactor-ipa-cp.c-to-operate-on-type-lattices.ChangeLog
- 0002-Refactor-ipa-cp.c-to-operate-on-type-lattices.patch
- 0003-Fix-memory-leak.ChangeLog
- 0003-Fix-memory-leak.patch
- 0004-Account-for-devirtualization-in-inlining-heuristics.ChangeLog
- 0004-Account-for-devirtualization-in-inlining-heuristics.patch
Re: Generalize ready list sorting via heuristics in rank_for_schedule.
Peter Steinmetz wrote: Currently, within the ready_sort macro in haifa-sched.c, the call to qsort is passed "rank_for_schedule" to help it decide which of two instructions should be placed further towards the front of the ready list. Rank_for_schedule uses a set of ordered heuristics (rank, priority, etc.) to make this decision. The set of heuristics is fixed for all target machines. There already are two target hooks specifically for this purpose: targetm.sched.{reorder, reorder2}. They both have higher priority than ready_sort(). There can be cases, however, where a target machine may want to define heuristics driven by specific characteristics of that machine. Those heuristics may be meaningless on other targets. Rank_for_schedule () gathers only machine-independent heuristics; the rest of the Haifa scheduler is no more machine dependent than these heuristics are. Machine dependent things are separated into the reorder hooks (which, by the way, are defined on only 3 targets). -- Maxim
Re: [4.2 Regression]: Gcc generates unaligned access on IA64
H. J. Lu wrote: FYI, today's gcc 4.2 generates many unaligned accesses on IA64: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26721 It may be related to http://gcc.gnu.org/ml/gcc-patches/2006-03/msg01001.html http://gcc.gnu.org/ml/gcc-patches/2006-03/msg01000.html http://gcc.gnu.org/ml/gcc-patches/2006-03/msg00999.html http://gcc.gnu.org/ml/gcc-patches/2006-03/msg00997.html http://gcc.gnu.org/ml/gcc-patches/2006-03/msg00998.html Has anyone else seen it? If it is confirmed, it looks pretty bad. H.J. I've got these unaligned access bugs too. I'm now working on it. -- Maxim
Re: alias time explosion
Daniel Berlin wrote: ... If I don't turn off scheduling entirely, this testcase now takes >10 minutes to compile (I gave up after that). With scheduling turned off, it takes 315 seconds, checking enabled. It looks like the scheduler is now trying to schedule some single region with 51,000 instructions in it. Every time I broke into the debugger, it was busy in ready_sort re-doing qsort on the ready list (which probably had a ton of instructions), over and over and over again. I imagine the 51k instructions come from the recent scheduling changes. Maxim, can you please take the testcase Andrew attached earlier in the thread, and make it so the scheduler can deal with it in a reasonable amount of time again? It used to take <20 seconds. I've checked the trunk and everything appears OK to me. Both the trunk and the trunk with my patches reverted compile the testcase in 5m30s (they were configured with CFLAGS=-g). My best guess where the >10 minutes came from is that you tried to compile the testcase with a compiler built with profile information - in that case the compilation lasts for ~15 minutes. -- Maxim
Re: IA-64 speculation patches have bad impact on ARM
ip_length (int);
 static bool arm_function_ok_for_sibcall (tree, tree);
@@ -245,6 +249,12 @@ static bool arm_tls_symbol_p (rtx x);
 #undef TARGET_SCHED_ADJUST_COST
 #define TARGET_SCHED_ADJUST_COST arm_adjust_cost

+#undef TARGET_SCHED_REORDER
+#define TARGET_SCHED_REORDER arm_reorder1
+
+#undef TARGET_SCHED_REORDER2
+#define TARGET_SCHED_REORDER2 arm_reorder2
+
 #undef TARGET_ENCODE_SECTION_INFO
 #ifdef ARM_PE
 #define TARGET_ENCODE_SECTION_INFO arm_pe_encode_section_info
@@ -5229,6 +5239,50 @@ arm_adjust_cost (rtx insn, rtx link, rtx
   return cost;
 }

+static void
+arm_reorder (rtx *ready, int n_ready)
+{
+  if (n_ready > 1)
+    {
+      /* This is correct for sched-rgn.c only.  */
+      basic_block bb = BLOCK_FOR_INSN (current_sched_info->prev_head);
+
+      if (BLOCK_FOR_INSN (ready[n_ready - 1]) != bb)
+	{
+	  int i;
+
+	  for (i = n_ready - 1; i >= 0; i--)
+	    {
+	      rtx insn = ready[i];
+
+	      if (BLOCK_FOR_INSN (insn) != bb)
+		continue;
+
+	      memcpy (ready + i, ready + i + 1,
+		      (n_ready - i - 1) * sizeof (*ready));
+	      ready[n_ready - 1] = insn;
+	      break;
+	    }
+	}
+    }
+}
+
+static int
+arm_reorder1 (FILE *dump ATTRIBUTE_UNUSED, int sched_verbose ATTRIBUTE_UNUSED,
+	      rtx *ready, int *pn_ready, int clock_var ATTRIBUTE_UNUSED)
+{
+  arm_reorder (ready, *pn_ready);
+  return 1;
+}
+
+static int
+arm_reorder2 (FILE *dump ATTRIBUTE_UNUSED, int sched_verbose ATTRIBUTE_UNUSED,
+	      rtx *ready, int *pn_ready, int clock_var ATTRIBUTE_UNUSED)
+{
+  arm_reorder (ready, *pn_ready);
+  return 0;
+}
+
 static int fp_consts_inited = 0;

 /* Only zero is valid for VFP.  Other values are also valid for FPA.  */

2006-05-30  Maxim Kuvyrkov  <[EMAIL PROTECTED]>

	* haifa-sched.c (priority): Set INSN_PRIORITY to INSN_COST if all
	dependencies of the insn are being ignored.
	(adjust_priority): Don't adjust priority twice.
	* sched-int.h (haifa_insn_data.priority_adjusted): New bitfield.
	(INSN_PRIORITY_ADJUSTED): New access-macro.
--- haifa-sched.c	(/gcc-local/trunk/gcc)	(revision 19877)
+++ haifa-sched.c	(/gcc-local/arm-bug/gcc)	(revision 19877)
@@ -751,6 +751,8 @@ priority (rtx insn)

       do
 	{
+	  bool priority_inited_p = false;
+
 	  for (link = INSN_DEPEND (twin); link; link = XEXP (link, 1))
 	    {
 	      rtx next;
@@ -785,9 +787,14 @@ priority (rtx insn)

 	      if (next_priority > this_priority)
 		this_priority = next_priority;
+
+	      priority_inited_p = true;
 	    }
 	}

+      if (!priority_inited_p)
+	this_priority = insn_cost (insn, 0, 0);
+
       twin = PREV_INSN (twin);
     }
   while (twin != prev_first);
@@ -1110,9 +1117,13 @@ adjust_priority (rtx prev)
      Revisit when we have a machine model to work with and not before.  */

-  if (targetm.sched.adjust_priority)
-    INSN_PRIORITY (prev) =
-      targetm.sched.adjust_priority (prev, INSN_PRIORITY (prev));
+  if (targetm.sched.adjust_priority
+      && !INSN_PRIORITY_ADJUSTED (prev))
+    {
+      INSN_PRIORITY (prev) =
+	targetm.sched.adjust_priority (prev, INSN_PRIORITY (prev));
+      INSN_PRIORITY_ADJUSTED (prev) = 1;
+    }
 }

 /* Advance time on one cycle.  */
@@ -4478,6 +4489,7 @@ clear_priorities (rtx insn)
       if (INSN_PRIORITY_KNOWN (pro))
 	{
 	  INSN_PRIORITY_KNOWN (pro) = 0;
+	  INSN_PRIORITY_ADJUSTED (pro) = 0;
 	  clear_priorities (pro);
 	}
     }
--- sched-int.h	(/gcc-local/trunk/gcc)	(revision 19877)
+++ sched-int.h	(/gcc-local/arm-bug/gcc)	(revision 19877)
@@ -317,6 +317,9 @@ struct haifa_insn_data
   /* Nonzero if priority has been computed already.  */
   unsigned int priority_known : 1;

+  /* Nonzero if priority has been adjusted already.  */
+  unsigned int priority_adjusted : 1;
+
   /* Nonzero if instruction has internal dependence
      (e.g. add_dependence was invoked with (insn == elem)).  */
   unsigned int has_internal_dep : 1;
@@ -350,6 +353,7 @@ extern regset *glat_start, *glat_end;
 #define INSN_DEP_COUNT(INSN)	(h_i_d[INSN_UID (INSN)].dep_count)
 #define INSN_PRIORITY(INSN)	(h_i_d[INSN_UID (INSN)].priority)
 #define INSN_PRIORITY_KNOWN(INSN) (h_i_d[INSN_UID (INSN)].priority_known)
+#define INSN_PRIORITY_ADJUSTED(INSN) (h_i_d[INSN_UID (INSN)].priority_adjusted)
 #define INSN_COST(INSN)		(h_i_d[INSN_UID (INSN)].cost)
 #define INSN_REG_WEIGHT(INSN)	(h_i_d[INSN_UID (INSN)].reg_weight)
 #define HAS_INTERNAL_DEP(INSN)	(h_i_d[INSN_UID (INSN)].has_internal_dep)
Re: IA-64 speculation patches have bad impact on ARM
Vladimir Makarov wrote: ... I agree with this. Two months ago Maxim submitted patches which affect only ia64, except for one thing affecting all targets - the patch which builds more scheduling regions and as a consequence permits more aggressive interblock scheduling. Insn scheduling before register allocation, even without Maxim's patches, is not safe when hard registers are used in RTL. It is a known bug (e.g. for x86_64) and it is in bugzilla. Jim Wilson wrote several possible solutions for this; none of them is easy to implement, except for switching off insn scheduling before RA (which is what is done for x86_64). But we can restore the state (probably safe for most programs) that existed before Maxim's patch. So, Maxim, could you do this (of course you can keep the max-sched-extend-regions-iters value for ia64 because it is probably safe for targets with many registers). Vlad Considering that bug, I agree that by default there should be no additional regions. The patch will be posted in a few minutes. -- Maxim
Re: IA-64 speculation patches have bad impact on ARM
Daniel Jacobowitz wrote: ... Not even a single comment - shame on you both! :-) If this is the solution we choose, can we make sure that there's at least a comment explaining what's going on? Totally agree. That was an *example patch*. Here is a slightly updated one, but still just an example of how we can arrange instructions on ARM or some other platform with not many execution units. -- Maxim

--- config/arm/arm.c	(/gcc-local/trunk/gcc)	(revision 19935)
+++ config/arm/arm.c	(/gcc-local/arm-bug/gcc)	(revision 19935)
@@ -52,6 +52,7 @@
 #include "target-def.h"
 #include "debug.h"
 #include "langhooks.h"
+#include "sched-int.h"

 /* Forward definitions of types.  */
 typedef struct minipool_node	Mnode;
@@ -118,6 +119,9 @@ static void thumb_output_function_prolog
 static int arm_comp_type_attributes (tree, tree);
 static void arm_set_default_type_attributes (tree);
 static int arm_adjust_cost (rtx, rtx, rtx, int);
+static void arm_reorder (rtx *, int);
+static int arm_reorder1 (FILE *, int, rtx *, int *, int);
+static int arm_reorder2 (FILE *, int, rtx *, int *, int);
 static int count_insns_for_constant (HOST_WIDE_INT, int);
 static int arm_get_strip_length (int);
 static bool arm_function_ok_for_sibcall (tree, tree);
@@ -245,6 +249,12 @@ static bool arm_tls_symbol_p (rtx x);
 #undef TARGET_SCHED_ADJUST_COST
 #define TARGET_SCHED_ADJUST_COST arm_adjust_cost

+#undef TARGET_SCHED_REORDER
+#define TARGET_SCHED_REORDER arm_reorder1
+
+#undef TARGET_SCHED_REORDER2
+#define TARGET_SCHED_REORDER2 arm_reorder2
+
 #undef TARGET_ENCODE_SECTION_INFO
 #ifdef ARM_PE
 #define TARGET_ENCODE_SECTION_INFO arm_pe_encode_section_info
@@ -5229,6 +5239,68 @@ arm_adjust_cost (rtx insn, rtx link, rtx
   return cost;
 }

+/* Reorder insns in the ready list, so that instructions from the target block
+   will be scheduled ahead of instructions from the source blocks.  */
+static void
+arm_reorder (rtx *ready, int n_ready)
+{
+  if (n_ready > 1)
+    {
+      /* Find out what target block is.
+
+	 !!! It is better to use TARGET_BB itself from
+	 haifa-sched.c: schedule_block (), but it is unavailable due to its
+	 local scope.  */
+      basic_block target_bb = BLOCK_FOR_INSN (current_sched_info->prev_head);
+
+      if (/* If insn, that will be scheduled next don't belong to
+	     TARGET_BB.
+
+	     !!! Actually, we want here another condition:
+	     'if (IS_SPECULATIVE_INSN (ready[n_ready - 1]))', but it is
+	     unavailable due to local scope in sched-rgn.c .  */
+	  BLOCK_FOR_INSN (ready[n_ready - 1]) != target_bb)
+	/* Search the ready list for the most prioritized insn from the
+	   TARGET_BB, and, if found, move it to the head of the list.  */
+	{
+	  int i;
+
+	  for (i = n_ready - 1; i >= 0; i--)
+	    {
+	      rtx insn = ready[i];
+
+	      if (BLOCK_FOR_INSN (insn) != target_bb)
+		continue;
+
+	      memcpy (ready + i, ready + i + 1,
+		      (n_ready - i - 1) * sizeof (*ready));
+	      ready[n_ready - 1] = insn;
+	      break;
+	    }
+	}
+    }
+}
+
+/* Override default sorting algorithm to reduce number of interblock
+   motions.  */
+static int
+arm_reorder1 (FILE *dump ATTRIBUTE_UNUSED, int sched_verbose ATTRIBUTE_UNUSED,
+	      rtx *ready, int *pn_ready, int clock_var ATTRIBUTE_UNUSED)
+{
+  arm_reorder (ready, *pn_ready);
+  return 1;
+}
+
+/* Override default sorting algorithm to reduce number of interblock
+   motions.  */
+static int
+arm_reorder2 (FILE *dump ATTRIBUTE_UNUSED, int sched_verbose ATTRIBUTE_UNUSED,
+	      rtx *ready, int *pn_ready, int clock_var ATTRIBUTE_UNUSED)
+{
+  arm_reorder (ready, *pn_ready);
+  return 0;
+}
+
 static int fp_consts_inited = 0;

 /* Only zero is valid for VFP.  Other values are also valid for FPA.  */
Re: GCC trunk build failed on ia64: ICE in __gcov_init
Grigory Zagorodnev wrote: Hi! Build of mainline GCC on ia64-redhat-linux failed since Thu Jun 8 16:23:09 UTC 2006 (revision 114488). Last successfully built revision is 114468. I wonder if somebody sees the same. ... - Grigory This was fixed in revision 114604. -- Maxim
[Job] GNU toolchain developer
Hi, We are looking for developers passionate about open source and toolchain development. You will be working on a variety of open-source projects, primarily on GCC, LLVM, glibc, GDB and Binutils. You should have ... - Experience with open-source projects and upstream communities; - Experience with open-source toolchain projects is a plus (GCC, LLVM, glibc, Binutils, GDB, Newlib, uClibc, OProfile, QEMU, Valgrind, etc); - Knowledge of compiler technology; - Knowledge of low-level computer architecture; - Proficiency in C. Proficiency in C++ and Python is a plus; - Knowledge of the Linux development environment; - Time management and self-organizing skills, desire to work from your home office (KugelWorks is a distributed company); - Professional ambitions; - Fluent English; - BSc in computer science (or a rationale why you do not need one). At KugelWorks you will have the opportunity to ... - Hack on the toolchain; - Develop your engineering, managerial, and communication skills; - Gain experience in product development; - Get public recognition for your open-source work; - Become an open-source maintainer. Contact: - Maxim Kuvyrkov - Email: ma...@kugelworks.com - Phone: +1 831 295 8595 - Website: www.kugelworks.com -- Maxim Kuvyrkov KugelWorks
Re: [Android] The reason why -Bsymbolic is turned on by default
On 30/03/2013, at 7:55 AM, Alexander Ivchenko wrote: > Hi, > > When compiling a shared library with "-mandroid -shared" the option > -Bsymbolic for the linker is turned on by default. What was the reason > behind that default? Isn't using -Bsymbolic somehow dangerous and > shouldn't it be avoided..? (as e.g. is explained in the mail from Richard > Henderson http://gcc.gnu.org/ml/gcc/2001-05/msg01551.html). > > Since there is no (AFAIK) option like -Bno-symbolic, we cannot use > a -fno-pic binary with COPY relocations in it (the android dynamic loader > will throw an error when there is a COPY relocation against a DT_SYMBOLIC > library..) I don't know the exact reason behind -Bsymbolic (it came as a requirement from Google's Android team), but I believe it produces slightly faster code (and fancy symbol preemption is not required on phones and TVs). Also, it might be that the kernel can share more memory pages of libraries compiled with -Bsymbolic, but I'm not sure. Now, it appears the problem is that an application cannot use a COPY relocation to fetch a symbol out of a shared -Bsymbolic library. I don't quite understand why this is forbidden by Bionic's linker. I understand why COPY relocations shouldn't be applied to the inside of a DT_SYMBOLIC library. However, I don't immediately see the problem of applying a COPY relocation against a symbol from a DT_SYMBOLIC library to the inside of an executable. Ard, you committed 5ae44f302b7d1d19f25c4c6f125e32dc369961d9 to Bionic that adds handling of ARM COPY relocations. Can you comment on why COPY relocations from executables to DT_SYMBOLIC libraries are forbidden? Thank you, -- Maxim Kuvyrkov KugelWorks
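The effect of -Bsymbolic can be observed directly on a Linux box with gcc and binutils installed. The file and library names below are made up for the demonstration; only the -Wl,-Bsymbolic flag and the DT_SYMBOLIC/FLAGS entry it produces are the point:

```shell
# Build the same tiny library with and without -Bsymbolic.
cat > libfoo.c <<'EOF'
int foo (void) { return 42; }
int bar (void) { return foo () + 1; }   /* intra-library call */
EOF

gcc -fPIC -shared libfoo.c -o libfoo-default.so
gcc -fPIC -shared -Wl,-Bsymbolic libfoo.c -o libfoo-symbolic.so

# Only the second library carries the SYMBOLIC dynamic tag/flag, which
# binds bar()'s call to foo() to the local definition at link time and
# makes foo() non-preemptible by definitions in the executable.
readelf -d libfoo-symbolic.so | grep -i symbolic
```

This non-preemptibility is exactly what lets the dynamic loader skip symbol lookups inside the library (the speed argument above), and also why COPY relocations against such a library are problematic: the executable's copy and the library's internal references can end up pointing at different objects.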
Re: Need a copyright assignment and a copyright disclaimer form
On 25/04/2013, at 8:51 AM, dw wrote: > I am attempting to submit a patch for the gcc documentation (see > http://gcc.gnu.org/ml/gcc-help/2013-04/msg00193.html). I am told that I need > to submit one of these two forms. Please send me copies so I can select one > and submit it. You need to forward your email to . It is the FSF, not the GCC project, that handles copyright assignment paperwork. You need to specify your name, the name of your employer, and the name and title of the person who will be signing on behalf of the company. You also need to list which FSF projects (e.g., GCC, Binutils, GDB, glibc -- or just blanket ALL) you wish to contribute to. Usually the FSF copyright office replies within 1-2 days, and feel free to ping us back here at gcc@ if FSF legal stalls. Thank you, -- Maxim Kuvyrkov KugelWorks
Re: Delay scheduling due to possible future multiple issue in VLIW
Paulo, The GCC scheduler is not particularly designed for VLIW architectures, but it handles them reasonably well. For the example of your code both schedules take the same time to execute:

38: 0: r1 = e[r0]
40: 4: [r0] = r1
41: 5: r0 = r0+4
43: 5: p0 = r1!=0
44: 6: jump p0

and

38: 0: r1 = e[r0]
41: 1: r0 = r0+4
40: 4: [r0] = r1
43: 5: p0 = r1!=0
44: 6: jump p0

[It is true that the first schedule takes less space due to fortunate VLIW packing.] You are correct that the GCC scheduler is greedy and that it tries to issue instructions as soon as possible (i.e., it is better to issue something on the current cycle than nothing at all), which is a sensible strategy. For small basic blocks the greedy algorithm may cause artifacts like the one you describe. You could try increasing the size of the regions on which the scheduler operates by switching your port to the sched-ebb scheduler, which was originally developed for ia64. Regards, -- Maxim Kuvyrkov KugelWorks On 27/06/2013, at 8:35 PM, Paulo Matos wrote: > Let me add to my own post saying that it seems that the problem is that the > list scheduler is greedy in the sense that it will take an instruction from > the ready list no matter what, when waiting and trying to pair it later on > with another instruction might be more beneficial. In a sense it seems > that the idea is that 'issuing instructions as soon as possible is better', > which might be true for a single-issue chip, but a VLIW with multiple issue > has to contend with other problems. > > Any thoughts on this? > > Paulo Matos > > >> -----Original Message----- >> From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf Of Paulo >> Matos >> Sent: 26 June 2013 15:08 >> To: gcc@gcc.gnu.org >> Subject: Delay scheduling due to possible future multiple issue in VLIW >> >> Hello, >> >> We have a port for a VLIW machine using gcc head 4.8 with a maximum issue of >> 2 per clock cycle (sometimes only 1 due to machine constraints).
>> We are seeing the following situation in sched2:
>>
>> ;; --- forward dependences: ---
>>
>> ;; --- Region Dependences --- b 3 bb 0
>> ;;   insn  code  bb  dep  prio  cost  reservation
>> ;;   ----  ----  --  ---  ----  ----  -----------
>> ;;     38  1395   3    0     6     4  (p0+long_imm+ldst0+lock0),nothing*3 : 44m 43 41 40
>> ;;     40   491   3    1     2     2  (p0+long_imm+ldst0+lock0),nothing : 44m 41
>> ;;     41   536   3    2     1     1  (p0+no_stl2)|(p1+no_dual) : 44
>> ;;     43  1340   3    1     2     1  (p0+no_stl2)|(p1+no_dual) : 44m
>> ;;     44  1440   3    4     1     1  (p0+long_imm) :
>>
>> ;; dependencies resolved: insn 38
>> ;; tick updated: insn 38 into ready
>> ;; dependencies resolved: insn 41
>> ;; tick updated: insn 41 into ready
>> ;; Advanced a state.
>> ;; Ready list after queue_to_ready:  41:4  38:2
>> ;; Ready list after ready_sort:      41:4  38:2
>> ;; Ready list (t = 0):               41:4  38:2
>> ;; Chosen insn : 38
>> ;;   0--> b 0: i 38  r1=zxn([r0+`b'])  :(p0+long_imm+ldst0+lock0),nothing*3
>> ;; dependencies resolved: insn 43
>> ;; Ready-->Q: insn 43: queued for 4 cycles (change queue index).
>> ;; tick updated: insn 43 into queue with cost=4
>> ;; dependencies resolved: insn 40
>> ;; Ready-->Q: insn 40: queued for 4 cycles (change queue index).
>> ;; tick updated: insn 40 into queue with cost=4
>> ;; Ready-->Q: insn 41: queued for 1 cycles (resource conflict).
>> ;; Ready list (t = 0):
>> ;; Advanced a state.
>> ;; Q-->Ready: insn 41: moving to ready without stalls
>> ;; Ready list after queue_to_ready:  41:4
>> ;; Ready list after ready_sort:      41:4
>> ;; Ready list (t = 1):               41:4
>> ;; Chosen insn : 41
>> ;;   1--> b 0: i 41  r0=r0+0x4  :(p0+no_stl2)|(p1+no_dual)
>>
>> So, it is scheduling first insn 38 followed by 41.
>> The insn chain for bb3 before sched2 looks like:
>> (insn 38 36 40 3 (set (reg:DI 1 r1)
>>     (zero_extend:DI (mem:SI (plus:SI (reg:SI 0 r0 [orig:119 ivtmp.13 ] [119])
>>         (symbol_ref:SI ("b") [flags 0x80] <var_decl 0x2b9c011f75a0 b>)) [2 MEM[symbol: b, index: ivtmp.13_7, offset: 0B]+0 S4 A32]))) pr3115b.c:13 1395 {zero_extendsidi2}
>>     (nil))
>> (insn 40 38 41
Re: [PATCH] GOMP_CPU_AFFINITY fails with >1024 cores
On 17/07/2013, at 2:29 AM, Daniel J Blueman wrote:
> Jakub et al,
>
> Steffen has developed a nice fix [1] for GOMP_CPU_AFFINITY failing with >1024
> cores.
>
> What steps are needed to get this into GCC 4.8.2?
>
> Thanks,
>   Daniel
>
> [1] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57298

It's easy! Just follow the steps:

1. You test the patch on one of the primary architectures and make sure there are no regressions in the testsuites.
2. Ideally, you add a test that fails before your patch, but passes with it.
3. You post your final patch to the gcc-patches@ mailing list (this is the gcc@ mailing list); CC one of the maintainers. If you CC both -- each will think that the other will review the patch.
4. You include a full description and analysis of the problem in the body of the message (people are too lazy to click on links). You describe how your patch fixes the problem. You write how and on which architectures your patch was tested.
5. You ping your submission every 2 weeks to one of the maintainers until they review your patch.

Good luck!

--
Maxim Kuvyrkov
KugelWorks
Re: toolchain build error with eglibc on OpenWrt
On 17/07/2013, at 6:26 PM, lingw...@altaitechnologies.com wrote:
> Hi developers,
>
> I've encountered a problem when I build OpenWrt's toolchain for Cavium
> Octeon, which is a MIPS64 r2 architecture. The error message is below.
> My toolchain unit versions:
> gcc: 4.7.x
> binutils: 2.22
> eglibc: 2.17 (svn version 22243)

It will be difficult to figure out how to fix it, sorry. The problem seems to be in libgcc (which is part of GCC) not providing helpers for floating-point arithmetic operations. The likely cause of this is that the compiler was configured for a hard-float target, but eglibc is being compiled for a soft-float target. [For hard-float targets there is no need for FP helpers in libgcc, since the processor is assumed to handle that in silicon.]

Good luck,

--
Maxim Kuvyrkov
KugelWorks
Re: [ping] [buildrobot] gcc/config/linux-android.c:40:7: error: ‘OPTION_BIONIC’ was not declared in this scope
On 7/09/2013, at 1:31 AM, Jan-Benedict Glaw wrote:
> On Mon, 2013-08-26 12:51:53 +0200, Jan-Benedict Glaw wrote:
>> On Tue, 2013-08-20 11:24:31 +0400, Alexander Ivchenko wrote:
>>> Hi, thanks for catching this.
>>>
>>> I certainly missed that OPTION_BIONIC is not defined for linux targets
>>> that do not include config/linux.h in their tm.h.
>>>
>>> This patch fixed the build for powerpc64le-linux and mn10300-linux.
>>> linux_libc, LIBC_GLIBC, LIBC_BIONIC should be defined for all targets.
>> [...]
>
> Seems the commit at Thu Sep 5 13:01:35 2013 (CEST) fixed most of the
> fallout. Thanks!
>
>> mn10300-linux:
>> http://toolchain.lug-owl.de/buildbot/showlog.php?id=9657&mode=view
>
> This however still seems to have issues in a current build:
>
> http://toolchain.lug-owl.de/buildbot/showlog.php?id=10520&mode=view

Jan-Benedict,

mn10300-linux does not appear to support Linux. The mn10300-linux target specifier expands into mn10300-unknown-linux-gnu, where *-gnu implies using the glibc library, which doesn't have an mn10300 port.

Jeff,

You are the mn10300 maintainer; is building GCC for mn10300-unknown-linux-gnu supposed to work?

Thanks,

--
Maxim Kuvyrkov
www.kugelworks.com
[RFC] Apple Blocks extension
Hi,

I am considering a project to add Apple's Blocks [*] extension to GCC. I am looking at adding Blocks support to the C, C++ and Obj-C/C++ front-ends. There are many challenges (both technical and copyright) that require work before any patches are ready for review, and I would appreciate an indication from front-end maintainers on whether a technically sound implementation of the Blocks extension would be a welcome addition to GCC front-ends.

Joseph, Richard, as C front-end maintainers, would you be supportive of the Blocks extension implemented for the C front-end?

Jason, Mark, Nathan, as C++ front-end maintainers, would you be supportive of the Blocks extension implemented for the C++ front-end?

Mike, Stan, as Obj-C/C++ front-end maintainers, would you be supportive of the Blocks extension implemented for the Obj-C/C++ front-ends?

[*] http://en.wikipedia.org/wiki/Blocks_(C_language_extension)

Thank you!

--
Maxim Kuvyrkov
www.kugelworks.com
Re: Dependency confusion in sched-deps
On 8/11/2013, at 1:48 am, Paulo Matos wrote:
> Hello,
>
> I am slightly unsure whether the confusion is in the dependencies or it's my
> own confusion.
>
> I have tracked this strange behaviour, which only occurs when we need to flush
> pending instructions due to the pending list becoming too large (gcc 4.8,
> haven't tried with trunk).
>
> I have two stores:
> 85: st zr, [r12]  # zr is the zero register
> 90: st zr, [r18]
>
> While analysing dependencies for `st zr, [r12]`, we notice that the pending
> list is too large in sched_analyze_1 and call
> flush_pending_lists (deps, insn, false, true).
>
> This in turn causes last_pending_memory_flush to be set to:
> (insn_list:REG_DEP_TRUE 85 (nil))
>
> When insn 90 is analyzed next, it skips the flushing bit since the pending
> lists had just been flushed, and enters the else branch where it does:
> add_dependence_list (insn, deps->last_pending_memory_flush, 1,
> REG_DEP_ANTI, true);
>
> This adds the dependency: 90 has an anti-dependency on 85.
> I think this should be a true dependency (write after write). It even says so
> in the list of last_pending_memory_flush; however, the add_dependence_list
> function ignores this and uses the dep_type passed: REG_DEP_ANTI.
>
> Is anti the correct dependence? Why?

Output dependency is the right type (write after write). Anti dependency is write after read, and true dependency is read after write.

Dependency type plays a role in estimating costs and latencies between instructions (which affects performance), but using a wrong or imprecise dependency type does not affect correctness. A dependency flush is a force-majeure occurrence during compilation, and developers tend not to spend too much time coding the best possible handling for these [hopefully] rare occurrences. Anti dependency is a good guess for the dependency type between two memory instructions. In the above particular case it is wrong, and, I imagine, this causes a performance problem for you.
You can add better handling of this situation by remembering whether last_pending_memory_flush is a memory read or a memory write, and then using that to select the correct dependency type for insn 90: output, anti or true. Let me know whether you want to pursue this and I can help with advice and patch review.

Thanks,

--
Maxim Kuvyrkov
www.kugelworks.com
Re: m68k optimisations?
On 9/11/2013, at 12:08 am, Fredrik Olsson wrote:
> I have this simple function:
> int sum_vec(int c, ...) {
>     va_list argptr;
>     va_start(argptr, c);
>     int sum = 0;
>     while (c--) {
>         int x = va_arg(argptr, int);
>         sum += x;
>     }
>     va_end(argptr);
>     return sum;
> }
>
> When compiling with "-fomit-frame-pointer -Os -march=68000 -c -S
> -mshort" I get this assembly (I have manually added comments with
> clock cycles per instruction and a total for a count of 0, 8 and n>0):
>     .even
>     .globl _sum_vec
> _sum_vec:
>     lea (6,%sp),%a0      | 8
>     move.w 4(%sp),%d1    | 12
>     clr.w %d0            | 4
>     jra .L1              | 12
> .L2:
>     add.w (%a0)+,%d0     | 8
> .L1:
>     dbra %d1,.L2         | 16,12
>     rts                  | 16
> | c==0: 8+12+4+12+12+16=64
> | c==8: 8+12+4+12+(16+8)*8+12+16=256
> | c==n: =64+24n
>
> When instead compiling with "-fomit-frame-pointer -O3 -march=68000 -c
> -S -mshort" I expect to get more aggressive optimisation than -Os, or
> at least just as performant, but instead I get this:
>     .even
>     .globl _sum_vec
> _sum_vec:
>     move.w 4(%sp),%d0    | 12
>     jeq .L2              | 12,8
>     lea (6,%sp),%a0      | 8
>     subq.w #1,%d0        | 4
>     and.l #65535,%d0     | 16
>     add.l %d0,%d0        | 8
>     lea 8(%sp,%d0.l),%a1 | 16
>     clr.w %d0            | 4
> .L1:
>     add.w (%a0)+,%d0     | 8
>     cmp.l %a0,%a1        | 8
>     jne .L1              | 12,8
>     rts                  | 16
> .L2:
>     clr.w %d0            | 4
>     rts                  | 16
> | c==0: 12+12+4+16=44
> | c==8: 12+8+8+4+16+8+16+4+(8+8+12)*4-4+16=316
> | c==n: =88+28n
>
> The count==0 case is better. I can see what optimisation has been
> tried for the loop, but it is just not working, since both the init for the
> loop and the loop itself become more costly.
>
> Being a GCC beginner I would like a few pointers as to how I should go
> about fixing this?

You investigate such problems by comparing intermediate debug dumps of the two compilation scenarios; at the assembly level it is almost impossible to guess where the problem is coming from. Add -fdump-tree-all and -fdump-rtl-all to the compilation flags and find which optimization pass makes the wrong decision.
Then you trace that optimization pass, or file a bug report in the hope that someone (the optimization's maintainer) will look at it. Read through the GCC wiki for information on debugging and troubleshooting GCC:

- http://gcc.gnu.org/wiki/GettingStarted
- http://gcc.gnu.org/wiki/FAQ
- http://gcc.gnu.org/wiki/

Thanks,

--
Maxim Kuvyrkov
www.kugelworks.com
Re: Dependency confusion in sched-deps
On 6/12/2013, at 4:25 am, Michael Matz wrote:
> Hi,
>
> On Thu, 5 Dec 2013, Maxim Kuvyrkov wrote:
>
>> Output dependency is the right type (write after write). Anti
>> dependency is write after read, and true dependency is read after write.
>>
>> Dependency type plays a role for estimating costs and latencies between
>> instructions (which affects performance), but using wrong or imprecise
>> dependency type does not affect correctness.
>
> In the context of GCC and the middle end's memory model this statement is
> not correct. For some dependency types we're using type-based aliasing to
> disambiguate, i.e. ignore that dependency, while for others we don't. In
> particular a read-after-write memory-access dependency can be ignored if
> type info says they can't alias (because a program where both _would_
> access the same memory would be invalid according to our memory model), but
> for write-after-read or write-after-write we cannot do that disambiguation
> (because the last write overrides the dynamic type of the memory cell even
> if it was incompatible with the one before).

Yes, this is correct for dependencies between memory locations in the general context of GCC. [The clarifications below are for Paulo's benefit and anyone else's who wants to find out how GCC scheduling works.]

The scheduler dependency analysis is a user of the aforementioned alias analysis, and it simply won't create a dependency between instructions if alias analysis tells it that it is OK to do so. In the context of the scheduler, the dependencies (and their types) are between instructions, not individual registers or memory locations. The mere fact of two instructions having a dependency of any kind will make the scheduler produce correct code. The difference between two instructions having a true vs anti vs output dependency will manifest itself in how close the 2nd instruction will be issued to the 1st one.
Furthermore, when two instructions have dependencies on several items (e.g., both on a register and on a memory location), the resulting dependency type is set to the greatest of the dependency types of all dependent items: true dependency having the most weight, followed by anti dependency, followed by output dependency. Consider the instructions:

    [r1] = r2
    r1 = [r2]

The scheduler dependency analysis will find an anti dependency on r1 and a true dependency on the memory locations (assuming [r1] and [r2] may alias). The resulting dependency between the instructions will be a true dependency, and the instructions will be scheduled several cycles apart. However, one might argue that [r1] and [r2] are unlikely to alias, and scheduling these instructions back-to-back (downgrading the dependency type from true to anti) would produce better code on average. This is one of countless improvements that could be made to the GCC scheduler.

--
Maxim Kuvyrkov
www.kugelworks.com
Re: Dependency confusion in sched-deps
On 6/12/2013, at 8:44 am, shmeel gutl wrote:
> On 05-Dec-13 02:39 AM, Maxim Kuvyrkov wrote:
>> Dependency type plays a role for estimating costs and latencies between
>> instructions (which affects performance), but using wrong or imprecise
>> dependency type does not affect correctness.
> On multi-issue architectures it does make a difference. Anti dependence
> permits the two instructions to be issued during the same cycle, whereas true
> dependency and output dependency would forbid this.
>
> Or am I misinterpreting your comment?

On VLIW-flavoured machines without resource conflict checking -- "yes", it is critical not to use an anti dependency where an output or true dependency exists. This is the case, though, only because these machines do not follow sequential semantics for instruction execution (i.e., effects from previous instructions are not necessarily observed by subsequent instructions on the same/close cycles).

On machines with internal resource conflict checking, having a wrong type on the dependency should not cause wrong behavior, but "only" suboptimal performance.

Thank you,

--
Maxim Kuvyrkov
www.kugelworks.com
Re: [buildrobot] mips64-linux broken
On 9/12/2013, at 3:24 am, Jan-Benedict Glaw wrote:
> Hi Maxim!
>
> One of your recent libc<->android clean-up patches broke the
> mips64-linux target as a side-effect, see e.g.
> http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=53806:
>
> g++ -c -DIN_GCC_FRONTEND -DIN_GCC_FRONTEND -g -O2 -DIN_GCC
> -DCROSS_DIRECTORY_STRUCTURE -fno-exceptions -fno-rtti
> -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings
> -Wcast-qual -Wmissing-format-attribute -pedantic -Wno-long-long
> -Wno-variadic-macros -Wno-overlength-strings -fno-common -DHAVE_CONFIG_H -I.
> -Ic-family -I/home/jbglaw/repos/gcc/gcc -I/home/jbglaw/repos/gcc/gcc/c-family
> -I/home/jbglaw/repos/gcc/gcc/../include
> -I/home/jbglaw/repos/gcc/gcc/../libcpp/include
> -I/home/jbglaw/repos/gcc/gcc/../libdecnumber
> -I/home/jbglaw/repos/gcc/gcc/../libdecnumber/dpd -I../libdecnumber
> -I/home/jbglaw/repos/gcc/gcc/../libbacktrace -o c-family/c-cppbuiltin.o
> -MT c-family/c-cppbuiltin.o -MMD -MP -MF c-family/.deps/c-cppbuiltin.TPo
> /home/jbglaw/repos/gcc/gcc/c-family/c-cppbuiltin.c
> /home/jbglaw/repos/gcc/gcc/c-family/c-cppbuiltin.c: In function ‘void c_cpp_builtins(cpp_reader*)’:
> /home/jbglaw/repos/gcc/gcc/c-family/c-cppbuiltin.c:1014:370: error:
> ‘ANDROID_TARGET_OS_CPP_BUILTINS’ was not declared in this scope
> make[1]: *** [c-family/c-cppbuiltin.o] Error 1

I'm looking into this.

Thanks,

--
Maxim Kuvyrkov
www.kugelworks.com
Re: [buildrobot] mips64-linux broken
On 10/12/2013, at 7:28 am, Steve Ellcey wrote:
> On Mon, 2013-12-09 at 08:21 +1300, Maxim Kuvyrkov wrote:
>
>> I'm looking into this.
>>
>> Thanks,
>>
>> --
>> Maxim Kuvyrkov
>> www.kugelworks.com
>
> My mips-mti-linux-gnu build is working after I applied this patch
> locally. I didn't do a test build of mips64-linux-gnu.
>
> Steve Ellcey
> sell...@mips.com
>
> diff --git a/gcc/config.gcc b/gcc/config.gcc
> index 93743d8..ee17071 100644
> --- a/gcc/config.gcc
> +++ b/gcc/config.gcc
> @@ -1918,16 +1918,18 @@ mips*-*-netbsd*)	# NetBSD/mips, either endian.
> 	extra_options="${extra_options} netbsd.opt netbsd-elf.opt"
> 	;;
> mips*-mti-linux*)
> -	tm_file="dbxelf.h elfos.h gnu-user.h linux.h glibc-stdint.h ${tm_file} mips/gnu-user.h mips/gnu-user64.h mips/linux64.h mips/linux-common.h mips/mti-linux.h"
> +	tm_file="dbxelf.h elfos.h gnu-user.h linux.h linux-android.h glibc-stdint.h ${tm_file} mips/gnu-user.h mips/gnu-user64.h mips/linux64.h mips/linux-common.h mips/mti-linux.h"
> 	tmake_file="${tmake_file} mips/t-mti-linux"
> 	tm_defines="${tm_defines} MIPS_ISA_DEFAULT=33 MIPS_ABI_DEFAULT=ABI_32"
> +	extra_options="${extra_options} linux-android.opt"
> 	gnu_ld=yes
> 	gas=yes
> 	;;
> mips64*-*-linux* | mipsisa64*-*-linux*)
> -	tm_file="dbxelf.h elfos.h gnu-user.h linux.h glibc-stdint.h ${tm_file} mips/gnu-user.h mips/gnu-user64.h mips/linux64.h mips/linux-common.h"
> +	tm_file="dbxelf.h elfos.h gnu-user.h linux.h linux-android.h glibc-stdint.h ${tm_file} mips/gnu-user.h mips/gnu-user64.h mips/linux64.h mips/linux-common.h"
> 	tmake_file="${tmake_file} mips/t-linux64"
> 	tm_defines="${tm_defines} MIPS_ABI_DEFAULT=ABI_N32"
> +	extra_options="${extra_options} linux-android.opt"
> 	case ${target} in
> 	mips64el-st-linux-gnu)
> 		tm_file="${tm_file} mips/st.h"

Hi Steve,

I've come up with the same patch, and Richard S. already approved it. I'll check it in once the s390x-linux part of that patch is approved (hopefully, later today).
Thank you, -- Maxim Kuvyrkov www.kugelworks.com
Re: DONT_BREAK_DEPENDENCIES bitmask for scheduling
On 11/12/2013, at 5:17 am, Ramana Radhakrishnan wrote:
> On Mon, Jul 1, 2013 at 5:31 PM, Paulo Matos wrote:
>> Hi,
>>
>> Near the start of schedule_block, find_modifiable_mems is called if
>> DONT_BREAK_DEPENDENCIES is not enabled for this scheduling pass. It seems
>> on c6x backend currently uses this.
>> However, it's quite strange that this is not a requirement for all backends,
>> since find_modifiable_mems moves all my dependencies in SD_LIST_HARD_BACK
>> to SD_LIST_SPEC_BACK even though I don't have DO_SPECULATION enabled.
>>
>> Since dependencies are accessed later on from try_ready (for example), I
>> would have thought that it would be always good not to call
>> find_modifiable_mems, given that it seems to 'literally' break dependencies.
>>
>> Is the behaviour of find_modifiable_mems a bug or somehow expected?

"Breaking" a dependency in the scheduler involves modification of instructions that would allow the scheduler to move one instruction past the other. The most common case of breaking a dependency is "r2 = r1 + 4; r3 = [r2];", which can be transformed into "r3 = [r1 + 4]; r2 = r1 + 4;". Breaking a dependency is not ignoring it, speculatively or otherwise; it is an equivalent code transformation that allows the scheduler more freedom to fill up CPU cycles.

> It's funny how I've been trying to track down a glitch and ended up
> asking the same question today. Additionally, if I use
> TARGET_SCHED_SET_SCHED_FLAGS on a port that doesn't use the selective
> scheduler, this does nothing. Does anyone know why this is the default
> for ports where we don't turn on selective scheduling, and might we need
> a hook to turn this off?

SCHED_FLAGS is used to enable or disable various parts of the GCC scheduler. On an architecture that supports speculative scheduling with recovery (IA64) it can turn this feature on or off.
The documentation for the various features of sched-rgn, sched-ebb and sel-sched is not the best, and one will likely get weird artefacts by trying out non-default settings. I believe that only the IA64 backend supports selective scheduling reliably. I've seen other ports trying out selective scheduling, but I don't know whether those efforts got positive results.

--
Maxim Kuvyrkov
www.kugelworks.com
Re: Dependency confusion in sched-deps
On 6/12/2013, at 9:44 pm, shmeel gutl wrote:
> On 06-Dec-13 01:34 AM, Maxim Kuvyrkov wrote:
>> On 6/12/2013, at 8:44 am, shmeel gutl wrote:
>>
>>> On 05-Dec-13 02:39 AM, Maxim Kuvyrkov wrote:
>>>> Dependency type plays a role for estimating costs and latencies between
>>>> instructions (which affects performance), but using wrong or imprecise
>>>> dependency type does not affect correctness.
>>> On multi-issue architectures it does make a difference. Anti dependence
>>> permits the two instructions to be issued during the same cycle whereas
>>> true dependency and output dependency would forbid this.
>>>
>>> Or am I misinterpreting your comment?
>> On VLIW-flavoured machines without resource conflict checking -- "yes", it
>> is critical not to use anti dependency where an output or true dependency
>> exists. This is the case though, only because these machines do not follow
>> sequential semantics for instruction execution (i.e., effects from previous
>> instructions are not necessarily observed by subsequent instructions on the
>> same/close cycles).
>>
>> On machines with internal resource conflict checking having a wrong type on
>> the dependency should not cause wrong behavior, but "only" suboptimal
>> performance.
>>
>> ...
> Earlier in the thread you wrote:
>> Output dependency is the right type (write after write). Anti dependency is
>> write after read, and true dependency is read after write.
> Should the code be changed to accommodate VLIW machines? It has been there
> since the module was originally checked into trunk.

The usual solution for VLIW machines is to have the assembler split VLIW bundles that have internal dependencies and execute them on different cycles. The idea is for the compiler to strive to do its best to produce code without any internal dependencies, but it is up to the assembler to do the final check and fix any occasional problems. [A good assembler has to do this work anyway, to accommodate mistakes in hand-written assembly.]
The scheduler is expected to produce code with no internal dependencies for VLIW machines 99% of the time. This 99% effectiveness is good enough, since the scheduler is often not the last pass that touches code, and subsequent transformations can screw up VLIW bundles anyway.

--
Maxim Kuvyrkov
www.kugelworks.com
Re: DONT_BREAK_DEPENDENCIES bitmask for scheduling
On 11/12/2013, at 11:14 am, Ramana Radhakrishnan wrote:
> On Tue, Dec 10, 2013 at 9:44 PM, Maxim Kuvyrkov wrote:
>> On 11/12/2013, at 5:17 am, Ramana Radhakrishnan wrote:
>>
>>> On Mon, Jul 1, 2013 at 5:31 PM, Paulo Matos wrote:
>>>> Hi,
>>>>
>>>> Near the start of schedule_block, find_modifiable_mems is called if
>>>> DONT_BREAK_DEPENDENCIES is not enabled for this scheduling pass. It
>>>> seems on c6x backend currently uses this.
>>>> However, it's quite strange that this is not a requirement for all
>>>> backends, since find_modifiable_mems moves all my dependencies in
>>>> SD_LIST_HARD_BACK to SD_LIST_SPEC_BACK even though I don't have
>>>> DO_SPECULATION enabled.
>>>>
>>>> Since dependencies are accessed later on from try_ready (for example), I
>>>> would have thought that it would be always good not to call
>>>> find_modifiable_mems, given that it seems to 'literally' break
>>>> dependencies.
>>>>
>>>> Is the behaviour of find_modifiable_mems a bug or somehow expected?
>>
>> "Breaking" a dependency in the scheduler involves modification of instructions
>> that would allow the scheduler to move one instruction past the other. The
>> most common case of breaking a dependency is "r2 = r1 + 4; r3 = [r2];",
>> which can be transformed into "r3 = [r1 + 4]; r2 = r1 + 4;". Breaking a
>> dependency is not ignoring it, speculatively or otherwise; it is an
>> equivalent code transformation that allows the scheduler more freedom to
>> fill up CPU cycles.
>
> Yes, but there are times when it does this a bit too aggressively, and
> this looks like the cause for a performance regression that I'm
> investigating on ARM. I was looking for a way of preventing this
> transformation and there doesn't seem to be an easy one other than the
> obvious hack.

If you want to prevent a particular transformation from occurring, then you need to investigate why the scheduler thinks that there is nothing better to do than to schedule an instruction which requires breaking a dependency.
"Breaking" a dependency only increases the pool of instructions available to schedule, and your problem seems to lie in "why" the wrong instruction is selected from that pool. Are you sure that the problem is introduced by dependency breaking, rather than dependency breaking exposing a latent bug?

> Additionally, there appears to be no way to control "flags" in a
> backend hook for sched-rgn for DONT_BREAK_DEPENDENCIES. Again, if
> DONT_BREAK_DEPENDENCIES is meant to be disabled with these flags, then
> it looks like we should allow for these to also be handled, or describe
> TARGET_SCHED_SET_SCHED_FLAGS as only a hook valid with the selective
> scheduler.

I'm not sure I follow you here. Any port can define TARGET_SCHED_SET_SCHED_FLAGS and set current_sched_info->flags to whatever it thinks is appropriate. E.g., c6x does this to disable dependency breaking for a particular kind of loops.

>>> It's funny how I've been trying to track down a glitch and ended up
>>> asking the same question today. Additionally, if I use
>>> TARGET_SCHED_SET_SCHED_FLAGS on a port that doesn't use the selective
>>> scheduler, this does nothing. Does anyone know why this is the default
>>> for ports where we don't turn on selective scheduling, and might we need
>>> a hook to turn this off?
>>
>> SCHED_FLAGS is used to enable or disable various parts of the GCC scheduler.
>> On an architecture that supports speculative scheduling with recovery (IA64)
>> it can turn this feature on or off. The documentation for the various
>> features of sched-rgn, sched-ebb and sel-sched is not the best, and one will
>> likely get weird artefacts by trying out non-default settings.
>
> Well, it appears as though TARGET_SCHED_SET_SCHED_FLAGS is only valid
> with the selective scheduler on, as above, and is a no-op as far as
> sched-rgn goes. This whole area could do with some improved
> documentation - I'll follow up with some patches to see if I can
> improve the situation.
I don't think this is the case. TARGET_SCHED_SET_SCHED_FLAGS has two outputs: one is the SPEC_INFO structure (which is used for IA64 only, both for sel-sched and sched-rgn), and the other one is modification of current_sched_info->flags, which affects all schedulers (sched-rgn, sched-ebb and sel-sched) and all ports.

--
Maxim Kuvyrkov
www.kugelworks.com
Re: DONT_BREAK_DEPENDENCIES bitmask for scheduling
On 11/12/2013, at 3:45 pm, Ramana Radhakrishnan wrote: > On Wed, Dec 11, 2013 at 12:02 AM, Maxim Kuvyrkov wrote: >> On 11/12/2013, at 11:14 am, Ramana Radhakrishnan >> wrote: >> >>> On Tue, Dec 10, 2013 at 9:44 PM, Maxim Kuvyrkov >>> wrote: >>>> On 11/12/2013, at 5:17 am, Ramana Radhakrishnan >>>> wrote: >>>> >>>>> On Mon, Jul 1, 2013 at 5:31 PM, Paulo Matos wrote: >>>>>> Hi, >>>>>> >>>>>> Near the start of schedule_block, find_modifiable_mems is called if >>>>>> DONT_BREAK_DEPENDENCIES is not enabled for this scheduling pass. It >>>>>> seems on c6x backend currently uses this. >>>>>> However, it's quite strange that this is not a requirement for all >>>>>> backends since find_modifiable_mems, moves all my dependencies in >>>>>> SD_LIST_HARD_BACK to SD_LIST_SPEC_BACK even though I don't have >>>>>> DO_SPECULATION enabled. >>>>>> >>>>>> Since dependencies are accessed later on from try_ready (for example), I >>>>>> would have thought that it would be always good not to call >>>>>> find_modifiable_mems, given that it seems to 'literally' break >>>>>> dependencies. >>>>>> >>>>>> Is the behaviour of find_modifiable_mems a bug or somehow expected? >>>> >>>> "Breaking" a dependency in scheduler involves modification of instructions >>>> that would allow scheduler to move one instruction past the other. The >>>> most common case of breaking a dependency is "r2 = r1 + 4; r3 = [r2];" >>>> which can be transformed into "r3 = [r1 + 4]; r2 = r1 + 4;". Breaking a >>>> dependency is not ignoring it, speculatively or otherwise; it is an >>>> equivalent code transformation to allow scheduler more freedom to fill up >>>> CPU cycles. >>> >>> >>> Yes, but there are times when it does this a bit too aggressively and >>> this looks like the cause for a performance regression that I'm >>> investigating on ARM. I was looking for a way of preventing this >>> transformation and there doesn't seem to be an easy one other than the >>> obvious hack. 
>> If you want to prevent a particular transformation from occurring, then you
>> need to investigate why the scheduler thinks that there is nothing better to
>> do than to schedule an instruction which requires breaking a dependency.
>> "Breaking" a dependency only increases the pool of instructions available to
>> schedule, and your problem seems to lie in "why" the wrong instruction is
>> selected from that pool.
>>
>> Are you sure that the problem is introduced by dependency breaking, rather
>> than dependency breaking exposing a latent bug?
>
> From my reading, because the dependency breaking is of addresses that
> are in a memcpy-type loop which is unrolled, and the original
> expectation is that by switching this to an add and a negative offset
> one can get more ILP in theory, but in practice the effects appear to
> be worse because of secondary issues that I'm still investigating.

Is this happening in the 1st or 2nd scheduling pass? From your comments I get the feeling that dependency breaking is introducing an additional instruction, rather than adding an offset to a memory reference. Ideally, dependency breaking during the 1st scheduling pass should be more conservative and avoid too many new instructions (e.g., by breaking a dependency only if nothing whatsoever can be scheduled on the current cycle). Dependency breaking during the 2nd scheduling pass can be more aggressive, as it can make sure that adding an offset to a memory instruction will not cause it to be split.

>>> Additionally, there appears to be no way to control "flags" in a
>>> backend hook for sched-rgn for DONT_BREAK_DEPENDENCIES. Again, if
>>> DONT_BREAK_DEPENDENCIES is meant to be disabled with these flags, then
>>> it looks like we should allow for these to also be handled, or describe
>>> TARGET_SCHED_SET_SCHED_FLAGS as only a hook valid with the selective
>>> scheduler.
>>
>> I'm not sure I follow you here.
>> Any port can define TARGET_SCHED_SET_SCHED_FLAGS and set
>> current_sched_info->flags to whatever it thinks is appropriate. E.g., c6x
>> does this to disable dependency breaking for a particular kind of loops.
>
> Ah, that will probably work and that's probably what I was missing. I
> don't like the idea in general; the same interface setting global state
> randomly in a backend is probably not the best approach in the long term.
> Expecting to set global state in this form from an interface is something
> I wasn't expecting, especially when it takes a parameter.

Originally, TARGET_SCHED_SET_SCHED_FLAGS was setting current_sched_info->flags and nothing else, hence the name. The parameter spec_info appeared later, to hold flags related to IA64-specific speculative scheduling.

--
Maxim Kuvyrkov
www.kugelworks.com
Re: Google Summer of Code -- Admin needed
On 6/02/2014, at 7:45 am, Moore, Catherine wrote: > Hi All, > > I acted as the Google Summer of Code Administrator in 2013 and I do not wish > to continue. > > There is an upcoming deadline (February 14th) for an organization to submit > their applications to the Google Summer of Code. Is there anyone who would > like to act as the gcc admin for 2014? > I assume that folks would like to have the gcc project continue to > participate; we need to find someone to submit the application and commit to > the admin duties. > > The bulk of the work is organizational. There are some web forms to fill > out, evaluations need to be completed, an IRC meeting is required, plus > finding projects and mentors for the projects. > > I hope someone will pick this up. I want to admin GCC's GSoC this year. In the next several days I will be bugging past GCC GSoC admins and mentors to get an idea of what I'm getting myself into. Please send me a note if you haven't been a GSoC mentor in past years but want to try this year. Thank you, -- Maxim Kuvyrkov www.linaro.org
GSoC project ideas
Hi, GCC has applied as a mentoring organization to GSoC 2014, and we need to update the Project Ideas page: http://gcc.gnu.org/wiki/SummerOfCode . Ideas are where GSoC starts, and they are what captures the attention and imagination of prospective students (and future developers!) of GCC. If you have an idea for a student project -- post it at http://gcc.gnu.org/wiki/SummerOfCode . If you can't easily edit the wiki directly, feel free to send your ideas to me directly or as a reply to this thread, and I will add them to the wiki. You don't have to commit to being a mentor for an idea that you post. We will worry about finding mentors once a student expresses interest in a particular idea. You don't have to be an active GCC developer to post an idea. If you are an experienced GCC user and you have always wanted feature X in GCC -- post an idea about it. If you are a prospective GSoC student -- then we definitely want to hear your ideas. We need the ideas page updated and ready by the end of February (a couple of weeks left). The student application period opens on March 10th, and keep in mind that students will need to meditate on the various projects/ideas/choices for a week or so. For the GSoC 2014 timeline see https://www.google-melange.com/gsoc/events/google/gsoc2014 . Thank you, -- Maxim Kuvyrkov www.linaro.org
[GSoC] GCC has been accepted to GSoC 2014
Hi All, GCC has been accepted as a mentoring organization to Google Summer of Code 2014, and we are off to the races! If you want to be a GCC GSoC student, check out the project ideas page at http://gcc.gnu.org/wiki/SummerOfCode . Feel free to ask questions on IRC [1] and get in touch with your potential mentors. If you are not sure who to contact -- send me an email at maxim.kuvyr...@linaro.org. If you are a GCC developer, then create a profile at http://www.google-melange.com/gsoc/homepage/google/gsoc2014 to be able to rank student applications. Once registered, connect with the "GCC - GNU Compiler Collection" organization. If you actively want to mentor a student project, then say so in your GSoC connection request. If you have any questions or comments, please contact your friendly GSoC admin via IRC (maximk), email (maxim.kuvyr...@linaro.org) or Skype/Hangouts. Thank you, [1] irc://irc.oftc.net/#gcc -- Maxim Kuvyrkov www.linaro.org
Re: [gsoc 2014] moving fold-const patterns to gimple
On Mar 18, 2014, at 9:13 PM, Prathamesh Kulkarni wrote: > On Mon, Mar 17, 2014 at 2:22 PM, Richard Biener > wrote: >> On Sun, Mar 16, 2014 at 1:21 PM, Prathamesh Kulkarni >> wrote: >>> In c_expr::c_expr, shouldn't OP_C_EXPR be passed to the operand >>> constructor instead of OP_EXPR? >> >> Indeed - I have committed the fix. >> > My earlier mail got rejected (maybe because I attached a pdf?) > by the mailer daemon, sorry for the double post. > I have uploaded the proposal here: > https://drive.google.com/file/d/0B7zFk-y3DFiHa1Nkdzh6TFZpVFE/edit?usp=sharing > I would be grateful to receive your feedback. Prathamesh, I will let Richard comment on the proposal contents, but make sure you have formally applied on the GSoC website and uploaded some version of your proposal by the end of Thursday (only 2 days left!). You will be able to update the details of the proposal later. Thank you, -- Maxim Kuvyrkov www.linaro.org
Re: GSoC Concepts - separate checking
On Mar 12, 2014, at 12:19 PM, Braden Obrzut wrote: > My name is Braden Obrzut and I am a student from the University of Akron > interested in contributing to GCC for GSoC. I am interested in working on a > project related to the c++-concepts branch. > > In particular, I am interested in implementing mechanisms for checking the > safety of constrained templates (separate checking). I have discussed the > project with Andrew Sutton (who maintains the c++-concepts branch and happens > to be a professor at Akron) and believe that some aspects of the work would be > feasible within the three month time span. I also hope to continue working on > the project as my honors thesis project. > > As a hobby I usually design and implement declarative languages for content > definition in old video games. While I currently may have limited experience > with GCC internals, I think this would be a great opportunity for me to learn > how real compilers work and help with the development of the C++ programming > language. Braden, Do you have a proposal for a GSoC GCC project? If you do want to apply, please make sure you are registered at the GSoC website and have an application filed by the end of Thursday (only 2 days left!). Thank you, -- Maxim Kuvyrkov www.linaro.org
Re: GSoC 2014 C++ Concepts project
On Mar 12, 2014, at 11:42 AM, Thomas Wynn wrote: > Hello, my name is Thomas Wynn. I am a junior in pursuit of a B.S. in > Computer Science at The University of Akron. I am interested in > working on a project with GCC for this year's Google Summer of Code. > More specifically, I would like to work on support for concept > variables and shorthand notation of concepts for C++ Concepts Lite. > > I am currently doing an independent study with Andrew Sutton in which > I have been porting and creating various tests for concepts used in > the DejaGNU test suite of an experimental branch of GCC 4.9, and will > soon be helping with the development of features in the branch. I would > greatly appreciate any suggestions or feedback for this project so > that I may write a more detailed, relevant, and accurate proposal. Hi Thomas, Do you have a proposal for a GSoC GCC project? If you do want to apply, please make sure you are registered at the GSoC website and have an application filed by the end of Thursday (only 2 days left!). -- Maxim Kuvyrkov www.linaro.org
Re: About gsoc 2014 OpenMP 4.0 Projects
On Feb 26, 2014, at 12:27 AM, guray ozen wrote: > Hello, > > I'm a master's student in high-performance computing at Barcelona > Supercomputing Center, and I'm working on my thesis regarding OpenMP > accelerator model implementation in our compiler (OmpSs). Actually I > have almost finished implementation of all new directives to generate CUDA > code, and by my design the same implementation for OpenCL doesn't take > much more work. But I haven't yet tried Intel MIC, APUs or other > hardware accelerators :) Now I'm benchmarking the output kernel codes > which are generated by my compiler. Although the output kernels are > generally naive, the speedup is not very bad. When I compare results > with the HMPP OpenACC 3.2.x compiler, speedups are almost the same, or in some > cases my results are slightly better. That's why this term I > am going to work on compiler-level or runtime-level optimizations for > GPUs. > > When I looked at the GCC OpenMP 4.0 project, I couldn't see anything about > code generation. Are you going to announce it later? Or should I apply to > GSoC with my own idea about code generation and device code > optimizations? Guray, Do you have a proposal for a GSoC GCC project? If you do want to apply, please make sure you are registered at the GSoC website and have an application filed by the end of Thursday (only 2 days left!). Thank you, -- Maxim Kuvyrkov www.linaro.org
Re: Google Summer of Code
On Mar 17, 2014, at 2:39 AM, Mihai Mandrescu wrote: > Hello, > > I just enrolled in Google Summer of Code and would like to contribute > to GCC. I'm not very familiar with the process of getting a project > for GSoC nor with free software development in general, but I would > like to learn. Can someone give me some hints please? > Hi Mihai, There is very little time left for student applications -- only 2 days. In general, by now you should have a specific idea that you want to work on. It doesn't have to be your own; there are many ideas for potential GSoC projects at http://gcc.gnu.org/wiki/SummerOfCode . You need to be realistic about your experience in compiler development and GCC development. It is better to apply for an easier/smaller project and successfully finish it than to work on a complicated project and not get it done. Finally, please don't cross-post to several lists; gcc@gcc.gnu.org is the correct list for development discussions (with gcc-patc...@gcc.gnu.org being the list for discussion of specific patches). Thank you, -- Maxim Kuvyrkov www.linaro.org
Re: [GSoC 2014] Proposal: OpenCL Code Generator
On Mar 20, 2014, at 4:02 AM, Ilmir Usmanov wrote: > Hi all! > > My name is Ilmir Usmanov and I'm a student of Moscow Institute of Physics and > Technology. > Also I'm implementing OpenACC 1.0 in the gomp4 branch as an employee of Samsung > R&D Institute Russia (SRR). My research interests are connected with creating > an OpenCL Code Generator. So I'd like to participate in GSoC 2014 with a project > called "OpenCL Code Generator" as an independent student. I will do the > project during my free time; my employer will not pay for this. I do not think you will qualify as a student for the GSoC program. Students are expected to work close to full-time during the summer break and have GSoC as their main priority. With your employment obligations I don't think you will be able to commit to your GSoC project at that level. If I am wrong in my assumptions above, and you can commit to the GSoC project being your first priority for the summer months, please apply with your proposal on the GSoC website. There is very little time left, so move fast. Thank you, -- Maxim Kuvyrkov www.linaro.org
Re: Register at the GSoC website to rate projects
[Moving to gcc@ from gcc-patches@] Community, We've got 11 student proposals (good job, students!), and only the N top-rated ones will be accepted into the program. Therefore, we as a community need to make sure that the ratings are representative of our goals -- making GCC the best compiler there is. Go rate the proposals! Make your voice heard! Here is the list of proposals (and, yes, 'GCC Go escape analysis' is submitted by two different students):

- Generating folding patterns from meta description
- Concepts Separate Checking
- Integration of ISL code generator into Graphite
- GCC Go escape analysis
- Dynamically add headers to code
- C++11 Support in GCC and libstdc++
- GCC: Diagnostics
- GCC Go escape analysis
- Converting representation levels of GCC back to the source codes
- Separate front-end folder from middle-end folder
- Minimal support for garbage collection

Thank you, -- Maxim Kuvyrkov www.linaro.org On Mar 15, 2014, at 6:50 PM, Maxim Kuvyrkov wrote: > Hi, > > You are receiving this message because you are in the top 50 contributors to GCC > [1]. Congratulations! > > Since you are a top contributor to the GCC project, it is important for you to > rate the incoming student GSoC applications. Go and register at > https://www.google-melange.com/gsoc/homepage/google/gsoc2014 and connect with > the "GCC - GNU Compiler Collection" organization. Pretty. Please. It will take > 3-5 minutes of your time. > > Furthermore, if you work at a college or university (or otherwise interact > with talented computer science students), encourage them to look at GCC's > ideas page [2] and run with it for a summer project (or, indeed, propose > their own idea). They should hurry, only one week is left! > > So far we've got several good proposals from students, but we want to see > more. > > Thank you, > > [1] As determined by the number of checked-in patches over the last 2 years (and, > "yes", I know this is not the fairest metric). 
Script used: > $ git log "--pretty=format:%an" | head -n 12000 | awk '{ a[$1]++; } END { for > (i in a) print a[i] " " i; }' | sort -g | tail -n 50 > > [2] http://gcc.gnu.org/wiki/SummerOfCode > > -- > Maxim Kuvyrkov > www.linaro.org
Re: add_branch_dependences in sched-rgn.c
On Apr 9, 2014, at 4:15 AM, Kyrill Tkachov wrote: > Hi all, > > I'm looking at some curious pre-reload scheduling behaviour and I noticed > this: > > In the add_branch_dependences function in sched-rgn.c there is a comment that > says "branches, calls, uses, clobbers, cc0 setters, and instructions that can > throw exceptions" should be scheduled at the end of the basic block. > > However, right below it the code that detects this kind of insn seems to only > look for insns that are directly adjacent to the end of the block > (implemented with a while loop that ends as soon as the current insn is not > one of the aforementioned). > > Shouldn't the code look through the whole basic block, gather all of the > branches, clobbers etc. and schedule them at the end? > Not really. The instruction sequences mentioned in the comment end the basic block by definition -- if there is a jump or other "special" sequence, then the basic block can't continue beyond it, as control may be transferred to something other than the next instruction. add_branch_dependences() makes sure that the scheduler does not "accidentally" place something after those "special" sequences, thus creating a corrupted basic block. -- Maxim Kuvyrkov www.linaro.org
[GSoC] Status - 20140410
Community, [and BCC'ed mentors] Google Summer of Code is panning out nicely for GCC. We have received 5 slots for GSoC projects this year. The plan is to accept the 5 top-rated student proposals. If you haven't rated the projects yet, you have 2 days to go to the GSoC website [1] and rate the proposals. I will mark the top-5 proposals "Accepted" this Friday/Saturday. We already have mentors volunteered for the 5 currently leading projects, which is great. We also need a couple of backup mentors in case one of the primary mentors becomes temporarily unavailable. Your main job as a backup mentor will be to follow 2-5 of the student projects and be ready to step in should a need arise. Any volunteers for the role of backup mentor? I will send the next GSoC update early next week [when student projects are accepted]. Thank you, [1] https://www.google-melange.com/gsoc/homepage/google/gsoc2014 -- Maxim Kuvyrkov www.linaro.org