Optimizing for code size: new PR about issues with code hoisting
Hello,

Earlier this year, I had a chat with Richard Earnshaw about code size optimizations in GCC. I am of the opinion that GCC should be able to do better if a small number of obvious shortcomings were fixed. Many passes that matter for code size are currently rather impaired (see e.g. the cross-jumping bug, PR30905). One typical code-size optimization that is not implemented very well (IMHO) is code hoisting. I don't know if Richard E. remembers, but I promised to list the issues I knew about. Well then, here we go...

I think the most important issue is that code hoisting could be done very efficiently at the tree level, as a pass before the insert phase of GVN-PRE. Dan Berlin had a rough outline of how this would work [1], but nobody has implemented it so far (my bank account number is ... ;-). (Actually, I believe the people interested in code size optimizations are generally not making the most of the opportunities that tree-ssa has opened up, but that's a different issue...)

GCC does implement code hoisting for RTL. I compiled a list of the issues I know of in the code hoisting implementation in gcse.c, and I have filed them in a new bug report; see http://gcc.gnu.org/PR33828.

An issue I didn't list is that the implementation apparently lacks good test coverage. For instance, Richard Sandiford fixed an obvious error in the implementation two years ago (see PR22167) that must have caused wrong-code issues ever since the code hoisting pass was contributed in the late '90s. This apparent lack of test coverage is not really surprising, I suppose, because code hoisting is only enabled at -Os for some reason. I guess that some of the issues on my list may have kept bugs like Richard S.'s hidden.

I hope the list is useful for someone who wishes to improve GCC's optimizations for code size.

Gr.
Steven

[1] http://gcc.gnu.org/ml/gcc-patches/2005-12/msg00077.html
one question: tree-ssa vs no tree-ssa? no such global optimization exists.
* Was it worth waiting so many years for the complicated tree-ssa implementation?
* Would optimization without the tree-ssa code have been better?

If no method exists for global optimization using tree-ssa, then:

* Why not implement the simplest trial-and-error method for local optimization (e.g. finding local minima/maxima), following the K.I.S.S. principle, without the tree-ssa code?
* Why was it so complicated to add optimization features such as instruction scheduling using the tree-ssa code?
* Would it not have been easier to add such optimization features with trial-and-error, without the tree-ssa code?
* Why would the performance measurements differ, if both approaches follow the same principle of local optimization and neither uses global optimization?

IMHO:
A) with tree-ssa => many KLOCs.
B) without tree-ssa, with trial-and-error => few KLOCs + more optimization features + potentially better results.

There are other local minima/maxima search methods, such as hill climbing, beam search, genetic algorithms, simulated annealing, tabu search, A*, alpha-beta, min-max, branch-and-bound, greedy, etc. Extending the compiler with more optimization features is easier without the tree-ssa code.

Sincerely, J.C. Pizarro ;)
(libgcc) what does _Unwind_Resume do ?
Hi,

I have noticed that glibc makes extensive use of the _Unwind_Resume procedure (from libgcc_eh.a), but I have failed to understand it (no luck with the onlinedocs: http://gcc.gnu.org/onlinedocs/gccint/Exception-handling-routines.html#Exception-handling-routines). Can anyone please explain what it is, and why it is needed for C programs? (I understand it is related to exception handling?)

Thank you,
Sunzir.
Re: Optimizing for code size: new PR about issues with code hoisting
On Oct 20, 2007, at 4:48 AM, Steven Bosscher wrote: I hope the list is useful for someone who wishes to improve GCC's optimizations for code size. Is there a meta bug for code size optimizations? If not, that would be a good way to centralize the various ideas you have. -Chris
Re: gomp slowness
I'm not sure what the OpenMP spec says about default data scope (too lazy to read through), but it seems that the examples from http://kallipolis.com/openmp/2.html assume default(private), while GCC GOMP defaults to shared. In your case,

#pragma omp parallel for shared(A, row, col)
for (i = k+1; i < ...) {
  for (j = k+1; j < ...) {
    A[i][j] = A[i][j] - row[i] * col[j];
  }
}

'#pragma omp for' makes 'i' private implicitly (it couldn't be otherwise), but 'j' is still shared.
Re: gcc mirror
On Wed, 10 Oct 2007, [EMAIL PROTECTED] wrote:
> I established an http mirror via rsync, updated four times a day with a
> cron job, located in Utah (US), available at the URL
> "http://gcc.bigsearcher.com/", from my server bigsearcher.com.
> Please add this mirror to the list of available mirror locations.

I performed some checks against your server, and there seem to be some issues with the mirroring process. One, it seems you only mirror our releases, not our snapshots. Is this intentional? Two, nearly all releases that you mirror appear directly at the root of the server, except for GCC 3.3.3, which is in a subdirectory "releases" that contains only that single release.

Gerald
--
Gerald (Jerry) Pfeifer [EMAIL PROTECTED] http://www.pfeifer.com/gerald/
Re: Optimizing for code size: new PR about issues with code hoisting
On 10/20/07, Chris Lattner <[EMAIL PROTECTED]> wrote:
> On Oct 20, 2007, at 4:48 AM, Steven Bosscher wrote:
>> I hope the list is useful for someone who wishes to improve GCC's
>> optimizations for code size.
>
> Is there a meta bug for code size optimizations?

Yes there is. I made it depend on my new PR when I filed it.

Gr.
Steven
Re: gomp slowness
On Sat, 2007-10-20 at 22:32 +0400, Tomash Brechko wrote:
> I'm not sure what the OpenMP spec says about default data scope (too
> lazy to read through), but it seems that the examples from
> http://kallipolis.com/openmp/2.html assume default(private), while GCC
> GOMP defaults to shared. In your case,
>
> #pragma omp parallel for shared(A, row, col)
> for (i = k+1; i < ...) {
>   for (j = k+1; j < ...) {
>     A[i][j] = A[i][j] - row[i] * col[j];
>   }
> }
>
> '#pragma omp for' makes 'i' private implicitly (it couldn't be
> otherwise), but 'j' is still shared.

Good job!! Dang, so used to C++ and other languages where the control variable is localised. Haha .. but not in my own language Felix.

> I just tried your original case: not only is it slow, but it also
> produces different results with and without OpenMP (just try to print
> any elem of 'A'). Adding 'private(j)' (or defining 'j' inside the
> outer loop) will fix the case.
>
> It would be nice if someone would post the measurements for the fixed
> case; my machine has only HT, and I experience slowdown for this
> example (but still it runs much faster than before the fix).

Now I get:

#threads  Real   User   Sys
1         1.052  1.043  0.009
2         0.866  1.582  0.026

This is a much better result, 50% speedup (30% less time used). I only have a dual core at the moment (without HT), be nice to see the result for a quad!

BTW: I also tried this variation in C++:

#pragma omp parallel for shared(A, row, col)
for (i = k+1; i < ...

--
Felix, successor to C++: http://felix.sf.net
Re: indirect memory op in SSA
On 10/18/07, Fran Baena <[EMAIL PROTECTED]> wrote:
> My questions are: are the nodes STRUCT_FIELD_TAG, NAME_MEMORY_TAG and
> SYMBOL_MEMORY_TAG used for that purpose? And where can I find
> documentation about their use within the translation process to SSA?

Yes and no. S_F_T is used to implement field-sensitive aliasing, N_M_T is used to implement flow-sensitive aliasing, and S_M_T is used to group subsets of alias sets under a single name, to prevent compile-time and memory-consumption problems when alias sets become too large.

I would start by reading the articles and tutorials that we have in the Getting Started section of the wiki: http://gcc.gnu.org/wiki/GettingStarted

If you have more questions after reading that material, you can ask here or join the #gcc IRC channel on irc.oftc.net.