Re: Good news, bad news on the repository conversion
2018-07-09 2:27 GMT+02:00 Eric S. Raymond:
> There is good news and bad news on the GCC repository conversion.
>
> The good news is that I have solved the only known remaining technical
> problem in reposurgeon blocking the conversion. I've fixed the bug
> that prevented execute permissions from being carried by branch
> copies.

Great to hear that there is progress on that front!

> The bad news is that my last test run overran the memory capacity of
> the 64GB Great Beast. I shall have to find some way of reducing the
> working set, as 128GB DDR4 memory is hideously expensive.

Or maybe you could use a machine from the GCC compile farm? According
to https://gcc.gnu.org/wiki/CompileFarm, there are three machines with
at least 128GB available (gcc111, gcc112, gcc119).

Cheers,
Janus
Re: [GSOC] LTO dump tool project
On 07/09/2018 09:50 AM, Hrishikesh Kulkarni wrote:
> Hi,
>
> The command line option -gimple-stats will dump the statistics of
> gimple statements.
>
> For example:
>
> $ ../stage1-build/gcc/lto-dump test_hello.o -gimple-stats
>
> will dump:
>
> GIMPLE statements
> Kind                   Stmts      Bytes
> ---------------------------------------
> assignments                3        264
> phi nodes                  1        248
> conditionals               1         80
> everything else           12        704
> ---------------------------------------
> Total                     17       1296
> ---------------------------------------
>
> I have pushed the changes to the repo. Please find the diff file
> attached herewith.
>
> Regards,
>
> Hrishikesh

Hi.

Thanks for the work. I briefly took a look at the code and I would
focus now directly on refactoring:

- please make a new branch, first copy the lto-dump.c file and all the
  needed Makefile stuff, and commit that.

- next please rebase (squash) all the patches you have on top of it;
  you did some formatting corrections along the way and the result is
  very hard to read

- please fix the coding style; it's described here:
  https://gcc.gnu.org/codingconventions.html
  The most problematic issue is the use of horizontal white space. We
  use 2 spaces per indentation level, and 8 spaces are replaced with a
  tab; without that, reading your code is very hard for me

- then please start refactoring the functionality that is copied from
  lto.c and put the shared stuff into a header file that will be used
  by both lto.c and lto-dump.c.

Other observations:

- you use "\t\t%s\t\t%s\t\t%s" formats for prints; I think it would be
  better to use fixed-width strings with spaces, try %20s (a short
  sketch of this style appears after this message). It's probably also
  what tools like nm or objdump use

- come up with more specific names for 'entry' and 'compare'

- 'entry' should have functions that will print names, ... according to
  options (flag_lto_dump_demangle, ...); you can have overrides for
  functions and variables

- I would first put all symbols into a vector and then print them in a
  single place

- consider using vec from vec.h and hash_map from hash-map.h instead of
  the std:: variants

- exit after functions like dump_list, dump_symbol, ...

- remove the dummy 'dump' function

Martin

> On Thu, Jul 5, 2018 at 10:41 PM, Hrishikesh Kulkarni wrote:
>> Hi,
>>
>> I have added a new command line option:
>> -objects
>> which will dump the size, offset and name of each section for all
>> LTO objects
>>
>> for example:
>> $ ../stage1-build/gcc/lto-dump test_hello.o test.o -objects
>>
>> gives output:
>>
>> LTO object name: test_hello.o
>>
>> NO.    OFFSET    SIZE    SECTION NAME
>>
>>  1         64      15    .gnu.lto_.profile.a7add72ac123628
>>  2         79      55    .gnu.lto_.icf.a7add72ac123628
>>  3        134     134    .gnu.lto_.jmpfuncs.a7add72ac123628
>>  4        268     116    .gnu.lto_.inline.a7add72ac123628
>>  5        384      24    .gnu.lto_.pureconst.a7add72ac123628
>>  6        408     306    .gnu.lto_foo.a7add72ac123628
>>  7        714     469    .gnu.lto_bar.a7add72ac123628
>>  8       1183     345    .gnu.lto_main.a7add72ac123628
>>  9       1528      88    .gnu.lto_.symbol_nodes.a7add72ac123628
>> 10       1616      15    .gnu.lto_.refs.a7add72ac123628
>> 11       1631    1205    .gnu.lto_.decls.a7add72ac123628
>> 12       2836     109    .gnu.lto_.symtab.a7add72ac123628
>> 13       2945      76    .gnu.lto_.opts
>>
>> LTO object name: test.o
>>
>> NO.    OFFSET    SIZE    SECTION NAME
>>
>>  1         64      15    .gnu.lto_.profile.ffab9cb8eb84fc03
>>  2         79      30    .gnu.lto_.icf.ffab9cb8eb84fc03
>>  3        109     108    .gnu.lto_.jmpfuncs.ffab9cb8eb84fc03
>>  4        217      62    .gnu.lto_.inline.ffab9cb8eb84fc03
>>  5        279      21    .gnu.lto_.pureconst.ffab9cb8eb84fc03
>>  6        300     194    .gnu.lto_koo.ffab9cb8eb84fc03
>>  7        494     389    .gnu.lto_gain.ffab9cb8eb84fc03
>>  8        883      77    .gnu.lto_.symbol_nodes.ffab9cb8eb84fc03
>>  9        960      15    .gnu.lto_.refs.ffab9cb8eb84fc03
>> 10        975     966    .gnu.lto_.decls.ffab9cb8eb84fc03
>> 11       1941      58    .gnu.lto_.symtab.ffab9cb8eb84fc03
>> 12       1999      76    .gnu.lto_.opts
>>
>> I have pushed the changes to the repo. Please find the diff file
>> attached herewith.
>>
>> Regards,
>>
>> Hrishikesh
>>
>> On Thu, Jul 5, 2018 at 12:24 AM, Hrishikesh Kulkarni wrote:
>>> Hi,
>>>
>>> I have:
>>> tried to do all the formatting and style corrections according to
>>> the output given by check_GNU_style.py
>>> removed the '-fdump-lto' prefix from the command line options
>>> added a few necessary comments in the code
>>> added the command line option -type-stats from the previous branch
>>> (added a new percentage column to it)
>>> for e.g.
>>> integer_type325.0
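A rough sketch of the fixed-width, collect-then-print style suggested
above. All names, widths, and types below are illustrative assumptions
rather than code from the actual lto-dump patch, and std::vector is
used only to keep the sketch self-contained; in-tree code would
presumably use vec from vec.h as Martin suggests.

/* Hypothetical sketch only: collect section entries and print them in
   one place with fixed-width columns instead of "\t\t%s" formats.
   Field and function names are made up for illustration.  */
#include <cstdio>
#include <cstdint>
#include <cinttypes>
#include <vector>

struct section_entry
{
  unsigned no;
  int64_t offset;
  int64_t size;
  const char *name;
};

static void
dump_sections (const std::vector<section_entry> &sections)
{
  /* Fixed-width columns line up the way nm/objdump output does.  */
  printf ("%4s %10s %10s  %s\n", "NO.", "OFFSET", "SIZE", "SECTION NAME");
  for (const section_entry &e : sections)
    printf ("%4u %10" PRId64 " %10" PRId64 "  %s\n",
            e.no, e.offset, e.size, e.name);
}

int
main ()
{
  std::vector<section_entry> sections
    = { { 1, 64, 15, ".gnu.lto_.profile.a7add72ac123628" },
        { 2, 79, 55, ".gnu.lto_.icf.a7add72ac123628" } };
  dump_sections (sections);
  return 0;
}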
Re: Good news, bad news on the repository conversion
On 07/09/2018 02:27 AM, Eric S. Raymond wrote:
> The bad news is that my last test run overran the memory capacity of
> the 64GB Great Beast. I shall have to find some way of reducing the
> working set, as 128GB DDR4 memory is hideously expensive.

Hello.

I can help by running a conversion on our server machine. Feel free to
contact me privately by mail.

Martin
Re: Good news, bad news on the repository conversion
Janus Weil:
> > The bad news is that my last test run overran the memory capacity of
> > the 64GB Great Beast. I shall have to find some way of reducing the
> > working set, as 128GB DDR4 memory is hideously expensive.
>
> Or maybe you could use a machine from the GCC compile farm?
>
> According to https://gcc.gnu.org/wiki/CompileFarm, there are three
> machines with at least 128GB available (gcc111, gcc112, gcc119).

The Great Beast is a semi-custom PC optimized for doing graph theory
on working sets gigabytes wide - its design emphasis is on the best
possible memory caching. If I dropped back to a conventional machine
the test times would go up by 50% (benchmarked, that's not a guess),
and they're already bad enough to make test cycles very painful. I
just saw elapsed time 8h30m36.292s for the current test - I had it
down to 6h at one point, but the runtimes scale badly with increasing
repo size; there is intrinsically O(n**2) stuff going on.

My first evasive maneuver is therefore to run tests with my browser
shut down. That's working. I used to do that before I switched from
CPython to PyPy, which runs faster and has a lower per-object
footprint. Now it's mandatory again. That tells me I need to get the
conversion finished before the number of commits gets much higher.

More memory would avoid OOM but not run the tests faster. More cores
wouldn't help due to Python's GIL problem - many of reposurgeon's
central algorithms are intrinsically serial, anyway. Higher
single-processor speed could help a lot, but there plain isn't
anything in COTS hardware that beats a Xeon 3 cranking 3.5GHz by
much. (The hardware wizard who built the Beast thinks he might be able
to crank me up to 3.7GHz later this year, but that hardware hasn't
shipped yet.)

The one technical change that might help is moving reposurgeon from
Python to Go - I might hope for as much as a 10x drop in runtimes from
that and a somewhat smaller decrease in working set. Unfortunately,
while the move is theoretically possible (I've scoped the job), that
too would be very hard and take a long time. It's 14KLOC of the most
algorithmically dense Python you are ever likely to encounter, with
dependencies on Python libraries sans Go equivalents that might double
the LOC; only the fact that I built a *really good* regression- and
unit-test suite in self-defense keeps it anywhere near practical.

(Before you ask: at the time I started reposurgeon in 2010 there
wasn't any really production-ready language that might have been a
better fit than Python. I did look. OO languages with GC and compiled
speed are still pretty thin on the ground.)

The truth is we're near the bleeding edge of what conventional tools
and hardware can handle gracefully. Most jobs with working sets as big
as this one's do only comparatively dumb operations that can be
parallelized and thrown on a GPU or supercomputer. Most jobs with the
algorithmic complexity of repository surgery have *much* smaller
working sets. The combination of both extrema is hard.
--
Eric S. Raymond <http://www.catb.org/~esr/>

My work is funded by the Internet Civil Engineering Institute: https://icei.org
Please visit their site and donate: the civilization you save might be your own.
Re: Good news, bad news on the repository conversion
On Mon, Jul 9, 2018 at 6:16 AM Eric S. Raymond wrote: > > Janus Weil : > > > The bad news is that my last test run overran the memnory capacity of > > > the 64GB Great Beast. I shall have to find some way of reducing the > > > working set, as 128GB DD4 memory is hideously expensive. > > > > Or maybe you could use a machine from the GCC compile farm? > > > > According to https://gcc.gnu.org/wiki/CompileFarm, there are three > > machines with at least 128GB available (gcc111, gcc112, gcc119). > > The Great Beast is a semi-custom PC optimized for doing graph theory > on working sets gigabytes wide - its design emphasis is on the best > possible memory caching. If I dropped back to a conventional machine > the test times would go up by 50% (benchmarked, that's not a guess), > and they're already bad enough to make test cycles very painful. > I just saw elapsed time 8h30m36.292s for the current test - I had it > down to 6h at one point but the runtimes scale badly with increasing > repo size, there is intrinsically O(n**2) stuff going on. > > My first evasive maneuver is therefore to run tests with my browser > shut down. That's working. I used to do that before I switched from > C-Python to PyPy, which runs faster and has a lower per-object > footprint. Now it's mandatory again. Tells me I need to get the > conversion finished before the number of commits gets much higher. > > More memory would avoid OOM but not run the tests faster. More cores > wouldn't help due to Python's GIL problem - many of reposurgeon's > central algorithms are intrinsically serial, anyway. Higher > single-processor speed could help a lot, but there plain isn't > anything in COTS hardware that beats a Xeon 3 cranking 3.5Ghz by > much. (The hardware wizard who built the Beast thinks he might be able > to crank me up to 3.7GHz later this year but that hardware hasn't > shipped yet.) > > The one technical change that might help is moving reposurgeon from > Python to Go - I might hope for as much as a 10x drop in runtimes from > that and a somewhat smaller decrease in working set. Unfortunately > while the move is theoretically possible (I've scoped the job) that > too would be very hard and take a long time. It's 14KLOC of the most > algorithmically dense Python you are ever likely to encounter, with > dependencies on Python libraries sans Go equivalents that might > double the LOC; only the fact that I built a *really good* regression- > and unit-test suite in self-defense keeps it anywhere near to > practical. > > (Before you ask, at the time I started reposurgeon in 2010 there > wasn't any really production-ready language that might have been a > better fit than Python. I did look. OO languages with GC and compiled > speed are still pretty thin on the ground.) > > The truth is we're near the bleeding edge of what conventional tools > and hardware can handle gracefully. Most jobs with working sets as > big as this one's do only comparatively dumb operations that can be > parallellized and thrown on a GPU or supercomputer. Most jobs with > the algorithmic complexity of repository surgery have *much* smaller > working sets. The combination of both extrema is hard. If you come to the conclusion that the GCC Community could help with resources, such as the GNU Compile Farm or paying for more RAM, let us know. Thanks, David
Re: Good news, bad news on the repository conversion
David Edelsohn:
> > The truth is we're near the bleeding edge of what conventional tools
> > and hardware can handle gracefully. Most jobs with working sets as
> > big as this one's do only comparatively dumb operations that can be
> > parallelized and thrown on a GPU or supercomputer. Most jobs with
> > the algorithmic complexity of repository surgery have *much* smaller
> > working sets. The combination of both extrema is hard.
>
> If you come to the conclusion that the GCC Community could help with
> resources, such as the GNU Compile Farm or paying for more RAM, let us
> know.

128GB of DDR4 registered RAM would allow me to run conversions with my
browser up, but would be eye-wateringly expensive. Thanks, but I'm not
going to yell for that help unless the working set gets so large that
it blows out 64GB even with nothing but i3 and some xterms running.
Unfortunately that is a contingency that no longer seems impossible.

(If you're not familiar, i3 is a minimalist tiling window manager with
a really small working set. I like it and would use it even if I didn't
have a memory-crowding problem. Since I do, it is extra helpful.)
--
Eric S. Raymond <http://www.catb.org/~esr/>

My work is funded by the Internet Civil Engineering Institute: https://icei.org
Please visit their site and donate: the civilization you save might be your own.
Re: Good news, bad news on the repository conversion
2018-07-09 18:35 GMT+02:00 Eric S. Raymond : > David Edelsohn : >> > The truth is we're near the bleeding edge of what conventional tools >> > and hardware can handle gracefully. Most jobs with working sets as >> > big as this one's do only comparatively dumb operations that can be >> > parallellized and thrown on a GPU or supercomputer. Most jobs with >> > the algorithmic complexity of repository surgery have *much* smaller >> > working sets. The combination of both extrema is hard. >> >> If you come to the conclusion that the GCC Community could help with >> resources, such as the GNU Compile Farm or paying for more RAM, let us >> know. > > 128GB of DDR4 registered RAM would allow me to run conversions with my > browser up, but be eye-wateringly expensive. Thanks, but I'm not > going to yell for that help I for one would certainly be happy to donate some spare bucks towards beastie RAM if it helps to get the GCC repo converted to git in a timely manner, and I'm sure there are other GCC developers/users/sympathizers who'd be willing to join in. So, where do we throw those bucks? Cheers, Janus
Re: Good news, bad news on the repository conversion
On 07/09/2018 10:53 AM, Janus Weil wrote: > 2018-07-09 18:35 GMT+02:00 Eric S. Raymond : >> David Edelsohn : The truth is we're near the bleeding edge of what conventional tools and hardware can handle gracefully. Most jobs with working sets as big as this one's do only comparatively dumb operations that can be parallellized and thrown on a GPU or supercomputer. Most jobs with the algorithmic complexity of repository surgery have *much* smaller working sets. The combination of both extrema is hard. >>> >>> If you come to the conclusion that the GCC Community could help with >>> resources, such as the GNU Compile Farm or paying for more RAM, let us >>> know. >> >> 128GB of DDR4 registered RAM would allow me to run conversions with my >> browser up, but be eye-wateringly expensive. Thanks, but I'm not >> going to yell for that help > > I for one would certainly be happy to donate some spare bucks towards > beastie RAM if it helps to get the GCC repo converted to git in a > timely manner, and I'm sure there are other GCC > developers/users/sympathizers who'd be willing to join in. So, where > do we throw those bucks? I'd be willing to throw some $$$ at this as well. Jeff
Re: Good news, bad news on the repository conversion
On Mon, Jul 9, 2018 at 12:35 PM Eric S. Raymond wrote: > > David Edelsohn : > > > The truth is we're near the bleeding edge of what conventional tools > > > and hardware can handle gracefully. Most jobs with working sets as > > > big as this one's do only comparatively dumb operations that can be > > > parallellized and thrown on a GPU or supercomputer. Most jobs with > > > the algorithmic complexity of repository surgery have *much* smaller > > > working sets. The combination of both extrema is hard. > > > > If you come to the conclusion that the GCC Community could help with > > resources, such as the GNU Compile Farm or paying for more RAM, let us > > know. > > 128GB of DDR4 registered RAM would allow me to run conversions with my > browser up, but be eye-wateringly expensive. Thanks, but I'm not > going to yell for that help unless the working set gets so large that > it blows out 64GB even with nothing but i4 and some xterms running. Funds in the FSF GNU Toolchain Fund probably can be allocated to purchase additional RAM, if that proves necessary. Also, IBM Power Systems have excellent memory subsystems. The ones in the GNU Compile Farm have more than 128GB of memory available. Thanks, David
Re: Good news, bad news on the repository conversion
On Mon, 2018-07-09 at 10:57 -0600, Jeff Law wrote: > On 07/09/2018 10:53 AM, Janus Weil wrote: > > 2018-07-09 18:35 GMT+02:00 Eric S. Raymond : > > > David Edelsohn : > > > > > The truth is we're near the bleeding edge of what conventional tools > > > > > and hardware can handle gracefully. Most jobs with working sets as > > > > > big as this one's do only comparatively dumb operations that can be > > > > > parallellized and thrown on a GPU or supercomputer. Most jobs with > > > > > the algorithmic complexity of repository surgery have *much* smaller > > > > > working sets. The combination of both extrema is hard. > > > > > > > > If you come to the conclusion that the GCC Community could help with > > > > resources, such as the GNU Compile Farm or paying for more RAM, let us > > > > know. > > > > > > 128GB of DDR4 registered RAM would allow me to run conversions with my > > > browser up, but be eye-wateringly expensive. Thanks, but I'm not > > > going to yell for that help > > > > I for one would certainly be happy to donate some spare bucks towards > > beastie RAM if it helps to get the GCC repo converted to git in a > > timely manner, and I'm sure there are other GCC > > developers/users/sympathizers who'd be willing to join in. So, where > > do we throw those bucks? > > I'd be willing to throw some $$$ at this as well. I may be misreading between the lines but I suspect Eric is more hoping to get everyone to focus on moving this through before the GCC commit count gets even more out of control, than he is asking for a hardware handout :). Maybe the question should rather be, what does the dev community need to do to help push this conversion through soonest?
Re: Good news, bad news on the repository conversion
On Mon, 2018-07-09 at 06:16 -0400, Eric S. Raymond wrote: > Janus Weil : > > > The bad news is that my last test run overran the memnory > > > capacity of > > > the 64GB Great Beast. I shall have to find some way of reducing > > > the > > > working set, as 128GB DD4 memory is hideously expensive. > > > > Or maybe you could use a machine from the GCC compile farm? > > > > According to https://gcc.gnu.org/wiki/CompileFarm, there are three > > machines with at least 128GB available (gcc111, gcc112, gcc119). > > The Great Beast is a semi-custom PC optimized for doing graph theory > on working sets gigabytes wide - its design emphasis is on the best > possible memory caching. If I dropped back to a conventional machine > the test times would go up by 50% (benchmarked, that's not a guess), > and they're already bad enough to make test cycles very painful. > I just saw elapsed time 8h30m36.292s for the current test - I had it > down to 6h at one point but the runtimes scale badly with increasing > repo size, there is intrinsically O(n**2) stuff going on. > > My first evasive maneuver is therefore to run tests with my browser > shut down. That's working. I used to do that before I switched from > C-Python to PyPy, which runs faster and has a lower per-object > footprint. Now it's mandatory again. Tells me I need to get the > conversion finished before the number of commits gets much higher. I wonder if one approach would be to tune PyPy for the problem? I was going to check that you've read: https://pypy.org/performance.html but I see you've already contributed text to it :) For CPU, does PyPy's JIT get a chance to kick in and turn the hot loops into machine code, or is it stuck interpreting bytecode for the most part? For RAM, is there a way to make PyPy make more efficient use of the RAM to store the objects? (PyPy already has a number of tricks it uses to store things more efficiently, and it's possible, though hard, to teach it new ones) This is possibly self-serving, as I vaguely know them from my days in the Python community, but note that the PyPy lead developers have a consulting gig where they offer paid consulting on dealing with Python and PyPy performance issues: https://baroquesoftware.com/ (though I don't know who would pay for that for the GCC repo conversion) Hope this is constructive. Dave > More memory would avoid OOM but not run the tests faster. More cores > wouldn't help due to Python's GIL problem - many of reposurgeon's > central algorithms are intrinsically serial, anyway. Higher > single-processor speed could help a lot, but there plain isn't > anything in COTS hardware that beats a Xeon 3 cranking 3.5Ghz by > much. (The hardware wizard who built the Beast thinks he might be > able > to crank me up to 3.7GHz later this year but that hardware hasn't > shipped yet.) > The one technical change that might help is moving reposurgeon from > Python to Go - I might hope for as much as a 10x drop in runtimes > from > that and a somewhat smaller decrease in working set. Unfortunately > while the move is theoretically possible (I've scoped the job) that > too would be very hard and take a long time. It's 14KLOC of the most > algorithmically dense Python you are ever likely to encounter, with > dependencies on Python libraries sans Go equivalents that might > double the LOC; only the fact that I built a *really good* > regression- > and unit-test suite in self-defense keeps it anywhere near to > practical. 
> > (Before you ask, at the time I started reposurgeon in 2010 there > wasn't any really production-ready language that might have been a > better fit than Python. I did look. OO languages with GC and compiled > speed are still pretty thin on the ground.) > > The truth is we're near the bleeding edge of what conventional tools > and hardware can handle gracefully. Most jobs with working sets as > big as this one's do only comparatively dumb operations that can be > parallellized and thrown on a GPU or supercomputer. Most jobs with > the algorithmic complexity of repository surgery have *much* smaller > working sets. The combination of both extrema is hard.
Repo conversion troubles.
Last time I did a comparison between SVN head and the git conversion
tip they matched exactly. This time I have mismatches in the following
files:

libtool.m4
libvtv/ChangeLog
libvtv/configure
libvtv/testsuite/lib/libvtv.exp
ltmain.sh
lto-plugin/ChangeLog
lto-plugin/configure
lto-plugin/lto-plugin.c
MAINTAINERS
maintainer-scripts/ChangeLog
maintainer-scripts/crontab
maintainer-scripts/gcc_release
Makefile.def
Makefile.in
Makefile.tpl
zlib/configure
zlib/configure.ac

Now I'll explain what this means and why it's a serious problem.

Reposurgeon is never confused by linear history, branching, or
tagging; I have lots of regression tests for those cases. When it
screws up it is invariably around branch copy operations, because
there are cases near those where the data model of Subversion stream
files is underspecified. That model was in fact entirely undocumented
before I reverse-engineered it and wrote the description that now
lives in the Subversion source tree. But that description is not
complete; nobody, not even Subversion's designers, knows how to fill
in all the corner cases.

Thus, a content mismatch like this means there was some recent branch
merge to trunk in the gcc history that reposurgeon is not interpreting
as intended, or more likely an operator error such as a non-Subversion
directory copy followed by a commit - my analyzer can recover from
most such cases, but not all.

There are brute-force ways to pin down such malformations, but none of
them are practical at the huge scale of this repository. The main
problem here wouldn't be reposurgeon itself but the fact that
Subversion checkouts on a repo this large are very slow. I've seen a
single one take 12 hours; an attempt at a whole bisection run to pin
down the divergence point on trunk would therefore probably cost log2
of the commit count times that, or about 18 days.

So... does that list of changed files look familiar to anyone? If we
can identify the revision number of the bad commit, the odds of being
able to unscramble this mess go way up. They still aren't good, not
when merely loading the repository for examination takes over four
hours, but they would be way better than if I were starting from zero.

This is serious. I have produced demonstrably correct history
conversions of the gcc repo in the past. We may now be in a situation
where I will never again be able to do that.
--
Eric S. Raymond <http://www.catb.org/~esr/>

The real point of audits is to instill fear, not to extract revenue;
the IRS aims at winning through intimidation and (thereby) getting
maximum voluntary compliance.
	-- Paul Strassel, former IRS Headquarters Agent, Wall St. Journal 1980
Re: Repo conversion troubles.
On 07/09/2018 01:19 PM, Eric S. Raymond wrote: > Last time I did a comparison between SVN head and the git conversion > tip they matched exactly. This time I have mismatches in the following > files. > > libtool.m4 > libvtv/ChangeLog > libvtv/configure > libvtv/testsuite/lib/libvtv.exp > ltmain.sh > lto-plugin/ChangeLog > lto-plugin/configure > lto-plugin/lto-plugin.c > MAINTAINERS > maintainer-scripts/ChangeLog > maintainer-scripts/crontab > maintainer-scripts/gcc_release > Makefile.def > Makefile.in > Makefile.tpl > zlib/configure > zlib/configure.ac > > Now I'll explain what this means and why it's a serious problem. [ ... ] That's weird -- let's take maintainer-scripts/crontab as our victim. That file (according to the git mirror) has only changed on the trunk 3 times in the last year. They're all changes from Jakub and none look unusual at all. Just trivial looking updates. libvtv.exp is another interesting file. It changed twice in early May of this year. Prior to that it hadn't changed since 2015. [ ... ] > > There are brute-force ways to pin down such malformations, but none of > them are practical at the huge scale of this repository. The main > problem here wouldn't reposurgeon itself but the fact that Subversion > checkouts on a repo this large are very slow. I've seen a single one > take 12 hours; an attempt at a whole bisection run to pin down the > divergence point on trunk would therefore probably cost log2 of the > commit length times that, or about 18 days. I'm not aware of any such merges, but any that occurred most likely happened after mid-April when the trunk was re-opened for development. I'm assuming that it's only work that merges onto the trunk that's potentially problematical here. > > So...does that list of changed files look familar to anyone? If we can > identify the revision number of the bad commit, the odds of being able > to unscramble this mess go way up. They still aren't good, not when > merely loading the repository for examination takes over four hours, > but they would way better than if I were starting from zero. They're familiar only in the sense that I know what those files are :-) Jeff
Re: Repo conversion troubles.
On 07/09/2018 09:19 PM, Eric S. Raymond wrote: > Last time I did a comparison between SVN head and the git conversion > tip they matched exactly. This time I have mismatches in the following > files. So what are the diffs? Are we talking about small differences (like one change missing) or large-scale mismatches? Bernd
Re: Good news, bad news on the repository conversion
* Eric S. Raymond:
> The bad news is that my last test run overran the memory capacity of
> the 64GB Great Beast. I shall have to find some way of reducing the
> working set, as 128GB DDR4 memory is hideously expensive.

Do you need interactive access to the machine, or can we run the job
for you?

If your application is not NUMA-aware, we probably need something that
has 128 GiB per NUMA node, which might be a bit harder to find, but I'm
sure many of us have suitable lab machines which could be temporarily
allocated for that purpose.
Re: Repo conversion troubles.
Jeff Law:
> > There are brute-force ways to pin down such malformations, but none of
> > them are practical at the huge scale of this repository. The main
> > problem here wouldn't be reposurgeon itself but the fact that
> > Subversion checkouts on a repo this large are very slow. I've seen a
> > single one take 12 hours; an attempt at a whole bisection run to pin
> > down the divergence point on trunk would therefore probably cost log2
> > of the commit count times that, or about 18 days.
>
> I'm not aware of any such merges, but any that occurred most likely
> happened after mid-April when the trunk was re-opened for development.

I agree it can't have been earlier than that, or I'd have hit this rock
sooner. I'd bet on the problem having arisen within the last six weeks.

> I'm assuming that it's only work that merges onto the trunk that's
> potentially problematical here.

Yes. It is possible there are also content mismatches on branches - I
haven't run that check yet, as it takes an absurd amount of time to
complete - but there's not much point in worrying about that if we
can't get trunk right.

I'm pretty certain things were still good at r256000. I've started that
check running. Not expecting results in less than twelve hours.
--
Eric S. Raymond <http://www.catb.org/~esr/>

My work is funded by the Internet Civil Engineering Institute: https://icei.org
Please visit their site and donate: the civilization you save might be your own.
Re: Repo conversion troubles.
Bernd Schmidt:
> On 07/09/2018 09:19 PM, Eric S. Raymond wrote:
> > Last time I did a comparison between SVN head and the git conversion
> > tip they matched exactly. This time I have mismatches in the following
> > files.
>
> So what are the diffs? Are we talking about small differences (like one
> change missing) or large-scale mismatches?

Large-scale, I'm afraid. The context diff is about a GLOC.
--
Eric S. Raymond <http://www.catb.org/~esr/>

My work is funded by the Internet Civil Engineering Institute: https://icei.org
Please visit their site and donate: the civilization you save might be your own.
Re: Repo conversion troubles.
On 07/09/2018 01:57 PM, Eric S. Raymond wrote: > Jeff Law : >>> There are brute-force ways to pin down such malformations, but none of >>> them are practical at the huge scale of this repository. The main >>> problem here wouldn't reposurgeon itself but the fact that Subversion >>> checkouts on a repo this large are very slow. I've seen a single one >>> take 12 hours; an attempt at a whole bisection run to pin down the >>> divergence point on trunk would therefore probably cost log2 of the >>> commit length times that, or about 18 days. >> >> I'm not aware of any such merges, but any that occurred most likely >> happened after mid-April when the trunk was re-opened for development. > > I agree it can't have been earlier than that, or I'd have hit this rock > sooner. I'd bet on the problem having arisen within the last six weeks. > >> I'm assuming that it's only work that merges onto the trunk that's >> potentially problematical here. > > Yes. It is possible there are also content mismatches on branches - I > haven't run that check yet, it takes an absurd amount of time to complete - > - but not much point in worrying about that if we can't get trunk right. > > I'm pretty certain things were still good at r256000. I've started that > check running. Not expecting results in less than twelve hours. r256000 would be roughly Christmas 2017. I'd be very surprised if any merges to the trunk happened between that point and early April. We're essentially in regression bugfixes only during that timeframe. Not a time for branch->trunk merging :-) jeff
Re: Good news, bad news on the repository conversion
Florian Weimer:
> * Eric S. Raymond:
> > The bad news is that my last test run overran the memory capacity of
> > the 64GB Great Beast. I shall have to find some way of reducing the
> > working set, as 128GB DDR4 memory is hideously expensive.
>
> Do you need interactive access to the machine, or can we run the job
> for you?
>
> If your application is not NUMA-aware, we probably need something that
> has 128 GiB per NUMA node, which might be a bit harder to find, but I'm
> sure many of us have suitable lab machines which could be temporarily
> allocated for that purpose.

I would need interactive access. But that's now one level away from the
principal problem; there is some kind of recent metadata damage - or
maybe some "correct" but weird and undocumented stream semantics that
reposurgeon doesn't know how to emulate - that is blocking correct
conversion.
--
Eric S. Raymond <http://www.catb.org/~esr/>

My work is funded by the Internet Civil Engineering Institute: https://icei.org
Please visit their site and donate: the civilization you save might be your own.
Re: Repo conversion troubles.
On July 9, 2018 9:19:11 PM GMT+02:00, e...@thyrsus.com wrote: >Last time I did a comparison between SVN head and the git conversion >tip they matched exactly. This time I have mismatches in the following >files. > >libtool.m4 >libvtv/ChangeLog >libvtv/configure >libvtv/testsuite/lib/libvtv.exp >ltmain.sh >lto-plugin/ChangeLog >lto-plugin/configure >lto-plugin/lto-plugin.c >MAINTAINERS >maintainer-scripts/ChangeLog >maintainer-scripts/crontab >maintainer-scripts/gcc_release >Makefile.def >Makefile.in >Makefile.tpl >zlib/configure >zlib/configure.ac > >Now I'll explain what this means and why it's a serious problem. > >Reposurgeon is never confused by linear history, branching, or >tagging; I have lots of regression tests for those cases. When it >screws up it is invariably around branch copy operations, because >there are cases near those where the data model of Subversion stream >files is underspecified. That model was in fact entirely undocumented >before I reverse-engineered it and wrote the description that now >lives in the Subversion source tree. But that description is not >complete; nobody, not even Subversion's designers, knows how to fill >in all the corner cases. > >Thus, a content mismatch like this means there was some recent branch >merge to trunk in the gcc history that reposurgeon is not interpreting >as intended, or more likely an operator error such as a non-Subversion >directory copy followed by a commit - my analyzer can recover from >most such cases but not all. > >There are brute-force ways to pin down such malformations, but none of >them are practical at the huge scale of this repository. The main >problem here wouldn't reposurgeon itself but the fact that Subversion >checkouts on a repo this large are very slow. I've seen a single one >take 12 hours; an attempt at a whole bisection run to pin down the >divergence point on trunk would therefore probably cost log2 of the >commit length times that, or about 18 days. 12 hours from remote I guess? The subversion repository is available through rsync so you can create a local mirror to work from (we've been doing that at suse for years) Richard. > >So...does that list of changed files look familar to anyone? If we can >identify the revision number of the bad commit, the odds of being able >to unscramble this mess go way up. They still aren't good, not when >merely loading the repository for examination takes over four hours, >but they would way better than if I were starting from zero. > >This is serious. I have preduced demonstrably correct history >conversions of the gcc repo in the past. We may now be in a situation >where I will never again be able to do that.
Re: Repo conversion troubles.
Jeff Law:
> > I'm pretty certain things were still good at r256000. I've started that
> > check running. Not expecting results in less than twelve hours.
>
> r256000 would be roughly Christmas 2017. I'd be very surprised if any
> merges to the trunk happened between that point and early April. We're
> essentially in regression bugfixes only during that timeframe. Not a
> time for branch->trunk merging :-)

Thanks, that's useful to know. That means if the r256000 check passes I
can jump forward to 1 Apr, reasonably expecting that one to pass too.
--
Eric S. Raymond <http://www.catb.org/~esr/>

My work is funded by the Internet Civil Engineering Institute: https://icei.org
Please visit their site and donate: the civilization you save might be your own.
Re: Repo conversion troubles.
Richard Biener:
> 12 hours from remote I guess? The subversion repository is available
> through rsync so you can create a local mirror to work from (we've
> been doing that at SUSE for years)

I'm saying I see rsync plus local checkout take 10-12 hours. I asked
Jason about this and his response was basically "Well... we don't do
that often."

You probably never see this case. Update from a remote is much faster.

I'm trying to do a manual correctness check via update to commit 256000
now.
--
Eric S. Raymond <http://www.catb.org/~esr/>

My work is funded by the Internet Civil Engineering Institute: https://icei.org
Please visit their site and donate: the civilization you save might be your own.
Re: Repo conversion troubles.
On Jul 9, 2018, Jeff Law wrote:
> On 07/09/2018 01:57 PM, Eric S. Raymond wrote:
>> Jeff Law:
>>> I'm not aware of any such merges, but any that occurred most likely
>>> happened after mid-April when the trunk was re-opened for development.
>> I'm pretty certain things were still good at r256000. I've started that
>> check running. Not expecting results in less than twelve hours.
> r256000 would be roughly Christmas 2017.

When was the RAID/LVM disk corruption incident? Could it possibly have
left any of our svn repo metadata in a corrupted way that confuses
reposurgeon, and that leads to such huge differences?

On Jul 9, 2018, "Eric S. Raymond" wrote:
> Bernd Schmidt:
>> So what are the diffs? Are we talking about small differences (like one
>> change missing) or large-scale mismatches?
> Large-scale, I'm afraid. The context diff is about a GLOC.

--
Alexandre Oliva, freedom fighter    https://FSFLA.org/blogs/lxo
Be the change, be Free!             FSF Latin America board member
GNU Toolchain Engineer              Free Software Evangelist
Re: -Wclass-memaccess warning should be in -Wextra, not -Wall
> On 07/05/2018 05:14 PM, Soul Studios wrote:
>> Simply because a struct has a constructor does not mean it isn't a
>> viable target/source for use with memcpy/memmove/memset.
>
> As the documentation that Segher quoted explains, it does mean exactly
> that.
>
> Some classes have user-defined copy and default ctors with the same
> effect as memcpy/memset. In modern C++ those ctors should be defaulted
> (= default) and GCC should emit optimal code for them. In fact, in
> loops they can result in more efficient code than the equivalent
> memset/memcpy calls.
>
> In any case, "native" operations lend themselves more readily to code
> analysis than raw memory accesses and as a result allow all compilers
> (not just GCC) to do a better job of detecting bugs or performing
> interesting transformations that they may not be able to do otherwise.
>
>> Having benchmarked the alternatives, memcpy/memmove/memset definitely
>> make a difference in various scenarios.
>
> Please open bugs with small test cases showing the inefficiencies so
> the optimizers can be improved.
>
> Martin

My point to all of this (and I'm annoyed that I'm having to repeat it
again, as if my first post wasn't clear enough - which it was) was that
any programmer using memcpy/memmove/memset is going to know what
they're getting into. Therefore it makes no sense to penalize them by
getting them to write ugly, needless code - regardless of the
surrounding politics/codewars.

-Wextra seems an amiable place to put this, -Wall doesn't.

As for test cases, well, this is something I've benchmarked over a
range of scenarios in various projects over the past 3 years (mainly
plf::colony and plf::list). If I have time I'll submit a sample.
Re: -Wclass-memaccess warning should be in -Wextra, not -Wall
On 07/09/2018 07:22 PM, Soul Studios wrote:
>> On 07/05/2018 05:14 PM, Soul Studios wrote:
>>> Simply because a struct has a constructor does not mean it isn't a
>>> viable target/source for use with memcpy/memmove/memset.
>>
>> As the documentation that Segher quoted explains, it does mean exactly
>> that.
>>
>> Some classes have user-defined copy and default ctors with the same
>> effect as memcpy/memset. In modern C++ those ctors should be defaulted
>> (= default) and GCC should emit optimal code for them. In fact, in
>> loops they can result in more efficient code than the equivalent
>> memset/memcpy calls.
>>
>> In any case, "native" operations lend themselves more readily to code
>> analysis than raw memory accesses and as a result allow all compilers
>> (not just GCC) to do a better job of detecting bugs or performing
>> interesting transformations that they may not be able to do otherwise.
>>
>>> Having benchmarked the alternatives, memcpy/memmove/memset definitely
>>> make a difference in various scenarios.
>>
>> Please open bugs with small test cases showing the inefficiencies so
>> the optimizers can be improved.
>>
>> Martin
>
> My point to all of this (and I'm annoyed that I'm having to repeat it
> again, as if my first post wasn't clear enough - which it was) was that
> any programmer using memcpy/memmove/memset is going to know what
> they're getting into.

No, programmers don't always know that. In fact, it's easy even for an
expert programmer to make the mistake of assuming that what looks like
a POD struct can safely be cleared by memset or copied by memcpy, when
doing so is undefined because one of the struct members is of a
non-trivial type (such as a container like string).

> Therefore it makes no sense to penalize them by getting them to write
> ugly, needless code - regardless of the surrounding politics/codewars.

Quite a lot of thought and discussion went into the design and
implementation of the warning, so venting your frustrations or
insulting those of us involved in the process is unlikely to help you
effect a change. To make a compelling argument you need to provide
convincing evidence that we have missed an important use case. The
best way to do that in this forum is with test cases and/or real-world
designs that are hampered by our choice. That's a high bar to meet for
warnings whose documented purpose is to diagnose "constructions that
some users consider questionable, and that are easy to avoid (or modify
to prevent the warning)."

Martin
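An illustration of the mistake Martin describes (this example is
editorial, not taken from the thread): a struct that looks like plain
old data but contains a std::string is not trivially copyable, so
clearing it with memset is undefined behavior, and GCC 8's
-Wclass-memaccess (enabled by -Wall) diagnoses it.

/* Editorial example, not from the thread.  'Entry' looks like a plain
   struct, but the std::string member makes it non-trivial, so the
   memset is undefined behavior; GCC 8 warns via -Wclass-memaccess.  */
#include <cstring>
#include <string>

struct Entry
{
  int id;
  std::string name;   /* non-trivial member: memset/memcpy are unsafe */
};

void
reset_bad (Entry &e)
{
  std::memset (&e, 0, sizeof e);   /* -Wclass-memaccess fires here */
}

/* Well-defined alternative: let the (implicitly defaulted) special
   member functions do the work instead of raw memory writes.  */
void
reset_good (Entry &e)
{
  e = Entry ();   /* value-initializes id, properly clears name */
}

/* For genuinely trivial types, spelling the special members
   "= default" keeps the type trivially copyable, so memcpy remains
   valid and the compiler can still generate optimal copies.  */
struct Point
{
  int x;
  int y;
  Point () = default;
  Point (const Point &) = default;
  Point &operator= (const Point &) = default;
};

The fix the warning suggests is roughly what reset_good does: use
assignment or value-initialization rather than raw memory writes, and
reserve memset/memcpy for types that really are trivially copyable.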
Re: Repo conversion troubles.
On July 9, 2018 10:20:39 PM GMT+02:00, "Eric S. Raymond" wrote:
> Richard Biener:
>> 12 hours from remote I guess? The subversion repository is available
>> through rsync so you can create a local mirror to work from (we've
>> been doing that at SUSE for years)
>
> I'm saying I see rsync plus local checkout take 10-12 hours.

For a fresh rsync I can guess that's true. But it works incrementally,
just fine and quick, for me...

> I asked Jason about this and his response was basically "Well... we
> don't do that often."
>
> You probably never see this case. Update from a remote is much faster.
>
> I'm trying to do a manual correctness check via update to commit
> 256000 now.