Karol,

If I understood correctly, functions like "foo" are automatically
generated by gEcon's model parser. For such a long function, and depending
on how many times you need to call it, it may make more sense to generate
C++ code instead (including the 'for' loop). Then you can use
Rcpp::sourceCpp or Rcpp::cppFunction to compile it and run it from R.
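
A minimal sketch of that suggestion, assuming the Rcpp package and a working C++ toolchain are installed. The name `foo_cpp`, its arguments, and the index vectors are hypothetical; they only mimic the shape of the generated equations and are not gEcon's actual output.

```r
# Hypothetical stand-in for a generated residual function, written in C++.
# Each residual has the shape
#   r[i] = -vv[i_lhs[i]] + vv[i_rhs[i]] * (1 + ppff[i_par[i]])
# where the 1-based index vectors would be emitted by the code generator.
library(Rcpp)

cppFunction('
NumericVector foo_cpp(NumericVector vv, NumericVector ppff,
                      IntegerVector i_lhs, IntegerVector i_rhs,
                      IntegerVector i_par) {
  int n = i_lhs.size();
  NumericVector r(n);
  for (int i = 0; i < n; ++i) {
    // Convert the 1-based R indices to 0-based C++ indices.
    r[i] = -vv[i_lhs[i] - 1] + vv[i_rhs[i] - 1] * (1.0 + ppff[i_par[i] - 1]);
  }
  return r;
}')

# Toy data; values are made up for illustration.
vv   <- c(1, 2, 3, 4)
ppff <- c(0.5, 0.25)
res  <- foo_cpp(vv, ppff,
                i_lhs = c(3L, 4L), i_rhs = c(1L, 2L), i_par = c(1L, 2L))
print(res)
```

The compilation cost is paid once per generated model, so this probably only pays off when the function is called many times, as noted above.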
Iñaki

On Fri, 17 Aug 2018 at 0:47, Karol Podemski (<gecon.maintena...@gmail.com>) wrote:
>
> Dear Tomas,
>
> thank you for the prompt response and for taking an interest in this issue.
> I really appreciate your compiler project and the efficiency gains in the
> usual case. I am aware of the limitations of interpreted languages too, and
> because of that, even when writing my first mail, I had a hunch that this
> problem would not be easy to address. As you mentioned, optimising the
> compiler for non-standard code may be tricky and may harm usual code. The
> question is whether gEcon is the only package that may face this issue
> because of compilation.
>
> The functions generated by gEcon are systems of non-linear equations
> defining the equilibrium of an economy (see
> http://gecon.r-forge.r-project.org/files/gEcon-users-guide.pdf if you want
> to learn a bit about how we obtain them). The rows you suggested
> vectorising are indeed vectorisable because they define equilibrium for
> similar markets (e.g. production and sale of beverages and food), but they
> do not have to be vectorisable in the general case. So as not to delve into
> too much detail, I will stop here with the description of how the equations
> originate. However, I would like to point out that similar large systems of
> linear equations may arise in other fields
> ( https://en.wikipedia.org/wiki/Steady_state ) and there may be other
> packages that generate similar large systems (e.g. network problems like
> hydraulic networks). In that case, reports such as mine may help you to
> assess the scale of the problem.
>
> Thank you for the suggestions for improving our approach; I am going to
> discuss them with the other package developers.
>
> Regards,
> Karol Podemski
>
> On Mon, 13 Aug 2018 at 18:02, Tomas Kalibera <tomas.kalib...@gmail.com> wrote:
>
> > Dear Karol,
> >
> > thank you for the report. I can reproduce that the function from your
> > example takes very long to compile and I can see where most time is spent.
> >
> > The compiler is itself written in R and requires a lot of resources for
> > large functions (foo() has over 16,000 lines of code, nearly 1 million
> > instructions/operands, 45,000 constants). In particular a lot of time is
> > spent in garbage collection and in finding a unique set of constants. Some
> > optimizations of the compiler may be possible, but it is unlikely that
> > functions this large will compile fast any time soon. For non-generated
> > code, we now have byte-compilation on installation by default, which at
> > least removes the compile overhead from runtime. Even though the compiler
> > is slow, it is important to keep in mind that in principle, with any
> > compiler there will be functions where compilation would not improve
> > performance (whether or not the compile time is included).
> >
> > I think it is not a good idea to generate code for functions like foo() in
> > R (or any interpreted language). You say that R's byte-code compiler
> > produces code that runs 5-10x faster than when the function is interpreted
> > by the AST interpreter (uncompiled), which sounds like a good result, but I
> > believe that avoiding code generation would be much faster than that, apart
> > from drastically reducing code size and therefore compile time. The
> > generator of these functions has much more information than the compiler -
> > it could be turned into an interpreter of these functions and compute their
> > values on the fly.
> >
> > A significant source of inefficiency of the generated code is
> > element-wise operations, such as
> >
> >     r[12] <- -vv[88] + vv[16] * (1 + ppff[1307])
> >     ...
> >     r[139] <- -vv[215] + vv[47] * (1 + ppff[1434])
> >
> > (these could be vectorized, which would reduce code size and improve
> > interpretation speed, and make it somewhat readable). Most of the code
> > lines in the generated functions seem to be easily vectorizable.
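
A small sketch of what such vectorization could look like in R. The sizes, values, and index vectors are hypothetical and far smaller than in the real generated code; the assumption is that the generator knows, for each block of similar equations, which elements of `vv` and `ppff` every row uses. Explicit index vectors are used because, in the two lines quoted above, the second `vv` index advances by 31 while the other indices advance by 127, so a single contiguous range may not fit every term.

```r
# Toy inputs; sizes and values are made up for illustration.
vv   <- c(2, 4, 6, 8, 10, 12)
ppff <- c(0.1, 0.2, 0.3)

# Element-wise form: one generated line per equation.
r_elem <- numeric(3)
r_elem[1] <- -vv[4] + vv[1] * (1 + ppff[1])
r_elem[2] <- -vv[5] + vv[2] * (1 + ppff[2])
r_elem[3] <- -vv[6] + vv[3] * (1 + ppff[3])

# Vectorized form: the generator emits the index vectors once and a single
# expression covers the whole block. The vectors need not be contiguous.
i_lhs <- c(4L, 5L, 6L)
i_rhs <- c(1L, 2L, 3L)
i_par <- c(1L, 2L, 3L)
r_vec <- numeric(3)
r_vec[1:3] <- -vv[i_lhs] + vv[i_rhs] * (1 + ppff[i_par])

# Both forms give the same residuals.
stopifnot(isTRUE(all.equal(r_elem, r_vec)))
```

Equations that do not fit any such block would still have to be emitted individually, which matches the caveat that not all rows are vectorizable in the general case.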
> >
> > Compilers and interpreters necessarily use some heuristics or optimize for
> > some code patterns. Optimizing for generated code may be tricky as it could
> > even harm performance of usual code. And, I would much rather optimize the
> > compiler for the usual code.
> >
> > Indeed, a pragmatic solution requiring the least amount of work would be
> > to disable compilation of these generated functions. There is not a
> > documented way to do that and maybe we could add it (and technically it is
> > trivial), but I have been reluctant so far - in some cases, compilation
> > even of these functions may be beneficial - if the speedup is 5-10x and we
> > run very many times. But once the generated code included some pragma
> > preventing compilation, it won't ever be compiled. Also, the trade-offs may
> > change as the compiler evolves, perhaps not in this case, but in others
> > where such a pragma may be used.
> >
> > Well, so the short answer would be that these functions should not be
> > generated in the first place. If it were too much work rewriting, perhaps
> > the generator could just be improved to produce vectorized operations.
> >
> > Best
> > Tomas
> >
> > On 12.8.2018 21:31, Karol Podemski wrote:
> >
> > Dear R team,
> >
> > I am a co-author and maintainer of one of the R packages distributed by
> > R-forge (gEcon). One of the gEcon package users found strange behaviour in
> > the package (R froze for a couple of minutes) and reported it to me. I
> > traced the strange behaviour to the compiler package. I attach a short
> > demonstration of the problem to this mail (the demonstration makes use of
> > the compiler and tictoc packages only).
> >
> > In short, the compiler package has problems in compiling large functions -
> > their compilation and execution may take much longer than direct execution
> > of an uncompiled function. Such functions are generated by the gEcon
> > package as they describe the steady state for an economy.
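
The behaviour described above can be reproduced roughly as follows. This is only a sketch and not the attached demonstration: the helper `make_big_fun` is hypothetical, it uses `system.time` rather than tictoc, and the function it builds is far smaller than the 16,000-line `foo()`, so the timings only hint at the trend.

```r
library(compiler)

# Hypothetical generator: builds a function with n element-wise assignments
# of the same shape as the generated code.
make_big_fun <- function(n) {
  idx <- seq_len(n)
  body_lines <- sprintf("r[%d] <- -vv[%d] + vv[%d] * (1 + ppff[%d])",
                        idx, idx + n, idx, idx)
  src <- c("function(vv, ppff) {",
           sprintf("r <- numeric(%d)", n),
           body_lines,
           "r",
           "}")
  eval(parse(text = src))
}

# Switch the JIT off so that nothing is compiled implicitly while timing.
# enableJIT() returns the previous level, which is restored afterwards.
old_jit <- enableJIT(0)

n    <- 1000L
foo  <- make_big_fun(n)
vv   <- runif(2 * n)
ppff <- runif(n)

t_compile <- system.time(foo_c <- cmpfun(foo))[["elapsed"]]
t_plain   <- system.time(res_plain <- foo(vv, ppff))[["elapsed"]]
t_byte    <- system.time(res_byte  <- foo_c(vv, ppff))[["elapsed"]]

enableJIT(old_jit)

print(c(compile = t_compile, plain = t_plain, bytecode = t_byte))
```

Note that `enableJIT(0)` affects the whole R session rather than only the generated functions, so it is a blunter tool than a per-function switch would be.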
> >
> > I am curious if you are aware of such problems and plan to handle the
> > efficiency issues. On one of the boards I saw that there were efficiency
> > issues in the rpart package but they have been resolved. Or would you
> > advise turning off JIT on package load (the package heavily uses such long
> > functions, generated whenever a new model is created)?
> >
> > Best regards,
> > Karol Podemski
> >
> > ______________________________________________
> > r-de...@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel

--
Iñaki Ucar