[Cython] [ENH] Compile-time code execution (macros / templates)
Hello.

I was going to post this as an issue on github, but since the idea isn't complete, I guess this mailing list is better suited.

I read (not in too much detail, though) the related CEPs, and I feel they either do too little or too much.

In my idea, there is some syntax (or decorator) and some propagation rules to decide what to execute during the compilation. It would merge the need for templates and for constant pre-evaluation.

The uses would be many.
- It could generate new cdef classes based on an argument (possibly a C type).
- It could make sure all the references to the cython module are caught during the compilation.
- It could replace the DEF keyword with a pure python syntax.
- It could easily allow any decorator on cdef functions.
- It could allow a compile-time decision of a function implementation (common in C using the preprocessor).
- And probably many more.

As far as I understand, this would be strictly more general and elegant than these two proposals:
https://github.com/cython/cython/wiki/enhancements-inlining
https://github.com/cython/cython/wiki/enhancements-methodtemplates

In my idea, the only metaprogramming entry points would be cdef functions, cdef classes (or cppclasses) and C/C++ types. No access to the AST, as it's unusual in python to do so. But nonetheless, it would be quite easy to have a CdefFunction object that exposes an `ast` attribute. As such, my proposal would provide a cleaner entry point to metaprogramming than these two proposals, together with a pure python syntax.
https://github.com/cython/cython/wiki/enhancements-metaprogramming
https://github.com/cython/cython/wiki/enhancements-uneval

The compile-time execution doesn't have to support everything. In particular, the evaluation of `cdef` code could be heavily restricted. Here are the restrictions that apply to CTFE (Compile-Time Function Evaluation) in the language D:
https://dlang.org/spec/function.html#interpretation

As far as I can tell, it's quite common in python to write load-time code in modules to achieve a higher level of genericity. For instance to support PY2 and PY3, Windows and Linux, etc. Or to generate classes using closures. I think bringing back the habits and culture of python to cython would be a good thing.

On the syntax side, I'm not sure how to distinguish the cases where only the right hand side of an assignment should be evaluated from the cases where every use of the variable should be evaluated.

    # Need only to evaluate the right hand side
    cdef str pi = compute_pi(100)

    # Need to evaluate everywhere the variable appears
    os = "Linux"
    if os == "Windows":
        cdef dostuff():
            return 1
    else:
        cdef dostuff():
            return 0

What do you think? Did I miss something that would make this feature mostly useless?

Best regards,
Celelibi
___
cython-devel mailing list
cython-devel@python.org
https://mail.python.org/mailman/listinfo/cython-devel
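For readers less familiar with the load-time pattern mentioned in this message, here is a minimal plain-Python sketch of it. It is not Cython and not part of the proposal; the names `TARGET_OS` and `dostuff` are illustrative assumptions. The only point is that the `if` runs once, when the module is loaded, and that exactly one definition of `dostuff` survives.

```python
import sys

# Hypothetical configuration value, decided once when the module is loaded.
TARGET_OS = "Windows" if sys.platform.startswith("win") else "POSIX"

if TARGET_OS == "Windows":
    def dostuff():
        """Windows-specific variant."""
        return 1
else:
    def dostuff():
        """Fallback variant for every other platform."""
        return 0

if __name__ == "__main__":
    print(TARGET_OS, dostuff())
```

The proposal amounts to letting this same decision happen while Cython compiles the module, so that only the selected `cdef` definition reaches the generated C code.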
Re: [Cython] [ENH] Compile-time code execution (macros / templates)
On Tue, Feb 23, 2021 at 11:24:42AM -0500, Prakhar Goel wrote:
>I had similar ideas in the past (assuming I understand your proposal). I'd love it if a decorator could link to some python function. This python function would get something that represents the cdef function (with the ast). It would then return a transformed ast that could be spliced into the original program.

As I said, I'm not sure allowing AST manipulation would be that useful. I mean, it's rarely ever needed in python. Why would it be needed with cython? However, it would surely make the code much less readable, since the code that's executed is not the code that's written.

The use-cases I have in mind are more like the following, which are common patterns in python.

Compile-time implementation choice:

    if os == "Windows":
        cdef thread_something():
            # Handle Windows-specific behavior
    else:
        cdef thread_something():
            # Assume POSIX behavior

Decorators on cdef functions:

    def twice(f):
        cdef fun(x):
            return f(2*x)
        return fun

    @twice
    cdef foo(x):
        # Do something math-y

Creating cdef classes on demand:

    def make_ptr_class(T):
        cdef class C:
            cdef T *ptr
        return C

    IntPtr = make_ptr_class(int)
    FloatPtr = make_ptr_class(float)

My current use-case is actually exactly this last one: creating wrapper classes for pointer types, which I currently cannot make both generic and type safe.

The python code run at compile-time would basically manipulate python objects representing cdef functions, classes and C types.

Internally, the way it could be implemented is that cython could generate a python program that would produce the AST. But it doesn't mean the AST itself has to be visible.

Best regards,
Celelibi
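The run-time counterpart of the `make_ptr_class` pattern already works in plain Python, which may help to see what the proposal tries to move to compile time. The sketch below is only an analogy under stated assumptions, not Cython: `make_box_class`, `IntBox` and `FloatBox` are hypothetical names, and an `isinstance` check stands in for the static typing that `cdef T *ptr` would give.

```python
def make_box_class(item_type):
    """Return a new wrapper class that only accepts `item_type` values."""

    class Box:
        __slots__ = ("value",)

        def __init__(self, value):
            # The check uses `item_type` captured from the enclosing call.
            if not isinstance(value, item_type):
                raise TypeError(
                    f"expected {item_type.__name__}, got {type(value).__name__}"
                )
            self.value = value

    # Give each generated class a readable, distinct name.
    Box.__name__ = Box.__qualname__ = f"{item_type.__name__.capitalize()}Box"
    return Box


# One distinct class per element type, generated when the module is loaded.
IntBox = make_box_class(int)
FloatBox = make_box_class(float)
```

Each call returns a distinct class, so `IntBox` and `FloatBox` cannot be confused with each other. What is missing, and what the message asks for, is the same genericity with the element type checked by the compiler rather than at run time.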
Re: [Cython] [ENH] Compile-time code execution (macros / templates)
1) My suggestion would only run some standard python code during the compilation. Running the cdef functions at compile-time doesn't have to be supported. At least not for a first version. Even if it gets supported, calling external functions (written in C) doesn't have to be supported either. If only running the cdef functions is supported, then it means cython has the full code and would "only" have to implement the C semantics in python. And actually, not the whole C semantics has to be supported. The language D has quite some restrictions on the code that can run at compile-time, especially when it comes to pointers.

But first, running only python code seems quite doable.

2) That's something I didn't elaborate on in my initial message to keep it short (kinda). But indeed, some compile-time functions just wouldn't make sense at runtime. In particular, `twice` and `make_ptr_class` wouldn't as far as I can tell, because returning a cdef function or class at run time doesn't really make sense right now. (Maybe in the future if cython supports compiling closures to C?)

I guess some kind of annotation would be necessary to mark a function (or maybe even a class?) as "compile-time only". Of course, other functions and classes might make sense both at compile-time and at run-time.

The only thing I'm not sure about is how to determine what should get evaluated at compile-time. A few places in the code should probably get automatically marked for CTFE, like decorators on cdef functions. IIRC, D uses "enum" to declare a compile-time constant. But this only solves half the issue. What about top-level code evaluation, like the `if os == "windows"` example? Maybe this could be wrapped in a function evaluated at compile-time. And thus only function evaluation would be required.
Something like this:

    @cython.ctfe_only
    def choice_thread_something(os):
        if os == "windows":
            cdef thread_something():
                # Windows behavior
        else:
            cdef thread_something():
                # POSIX behavior
        return thread_something

    ctfe thread_something = choice_thread_something(os)

Where the "ctfe" keyword would play a similar role to "enum" in D and force the evaluation of the right hand side during the compilation.

As a technical detail, the "cdef" lines that define the thread_something functions could be replaced during the compilation by some python code like:

    thread_something = CdefFunction(...)

And the calling line `ctfe thread_something = ...` would just insert the returned object into the AST. All the remaining (non-CTFE) code would be converted to code that builds the AST.

I hope that makes it more clear.

Celelibi

On Wed, Feb 24, 2021 at 10:20:39PM +, da-woods wrote:
> A few things to point out:
> 1) Because Cython is mainly a code translation tool it doesn't currently have an overview of how things would eventually be evaluated - it leaves a lot for the C compiler to sort out. Your suggestion would presumably add a dependency on a C compiler at the .pyx -> .c stage (currently the only dependencies are Python+Cython itself).
> 2) Presumably `twice` and `make_ptr_class` would also have to work at run-time? They are regular `def` functions after all? That seems like a big change to now have to dynamically create `cdef` functions and classes at both Cythonization-time and runtime. And also a Python-runtime representation of every C type (so that you can call `make_ptr_class` at runtime)
>
> I totally get the appeal of being able to wrap template-types quickly and easily. However, to me this idea actually seems a lot harder than something which creates new syntax, because you don't distinguish these special functions and so they have to work universally as Python functions too.
>
> In summary: if it worked it'd be very nice, but I personally don't have any idea how you'd implement it within how Cython currently works.
>
> David
>
> On 24/02/2021 14:29, Celelibi wrote:
> > Le Tue, Feb 23, 2021 at 11:24:42AM -0500, Prakhar Goel a écrit :
> > > I had similar ideas in the past (assuming I understand your proposal). I'd love it if a decorator could link to some python function. This python function would get something that represents the cdef function (with the ast). It would then return a transformed ast that could be spliced into the original program.
> > As I said, I'm not sure allowing AST manipulation would be that useful.
> > I mean, it's rarely ever needed in python. Why would it be needed wi
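The mechanism described in this message can be modelled in ordinary Python. The following is a sketch under loud assumptions: `CdefFunction`, `ctfe_only`, `ctfe_assign` and `module_defs` are hypothetical names invented for illustration, none of them exist in Cython, and a dictionary stands in for the module's AST.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CdefFunction:
    """Hypothetical stand-in for a cdef function seen at compile time."""
    name: str
    body: str  # source text of the body, kept opaque in this sketch


def ctfe_only(func):
    """Hypothetical marker: the function exists only at compile time."""
    func.ctfe_only = True
    return func


@ctfe_only
def choice_thread_something(os):
    # Each `cdef thread_something(): ...` becomes an object creation.
    if os == "windows":
        thread_something = CdefFunction("thread_something", "# Windows behavior")
    else:
        thread_something = CdefFunction("thread_something", "# POSIX behavior")
    return thread_something


def ctfe_assign(module_defs, name, value):
    """Model of `ctfe name = value`: record the result as a definition."""
    if not isinstance(value, CdefFunction):
        raise TypeError("only CdefFunction results are handled in this sketch")
    module_defs[name] = value


module_defs = {}  # stands in for the AST of the module being compiled
ctfe_assign(module_defs, "thread_something", choice_thread_something("windows"))
```

Only `thread_something` ends up in `module_defs`; the helper marked with `ctfe_only` is never recorded, which mirrors the idea that such functions would not appear in the compiled module.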
Re: [Cython] [ENH] Compile-time code execution (macros / templates)
On Fri, Feb 26, 2021 at 01:32:34PM +1300, Greg Ewing wrote:
> On 26/02/21 9:29 am, Celelibi wrote:
> > Maybe in the future if cython support compiling closures to C?
>
> Your "twice" example already needs some closure functionality.

Yes, at the python level only. That's precisely the point.

> It relies on being able to manufacture a cdef function inside a Python function, with the expectation that it will have access to arguments of the Python function. With all or some of that happening at compile time rather than run time. I'm having trouble imagining how it would be implemented.

It relies on creating a python object representing the inner cdef function during the compile-time execution. Which is, like, the trivial part.

Just picture the `twice` example being converted like this for the compile-time execution.

    def twice(f):
        fun = CdefFunction(... use f ...)
        return fun

    foo = CdefFunction(...)
    ast.append(AstCdefFunction("foo", twice(foo)))

Where CdefFunction is not necessarily callable. It's just a class representing a cdef function. And AstCdefFunction is an AST node representing a cdef function definition.

Does that seem far-fetched to you?

I think this might even be doable without having to detect the closure in cython at all. We'd just have to let python perform the name lookup.

Celelibi
Re: [Cython] [ENH] Compile-time code execution (macros / templates)
Prakhar Goel,
> Doesn't this just punt on how CdefFunction works? Feels like we're back to AST re-writing.

Indeed, internally it rewrites the AST, of course. I think that's the simplest way to implement it. But it doesn't mean the user who writes the compile-time code would have to be aware of the AST at all.

I mean, if you have a good use-case for it, then I'm ok with allowing the compile-time code to manipulate it. But I'd really prefer this not be the main use of the compile-time execution feature.

Greg Ewing,
> On 26/02/21 3:21 pm, Celelibi wrote:
> > def twice(f):
> >     fun = CdefFunction(... use f ...)
> >     return fun
> >
> > foo = CdefFunction(...)
> > ast.append(AstCdefFunction("foo", twice(foo)))
> >
> > I think this might even be doable without having to even detect the closure in cython. We'd just have to let python perform the name lookup.
>
> The cdef function created inside twice() is going to have to be some kind of closure, because it needs access at run time to the particular f that was passed to twice() at compile time.

Yes, conceptually it's a closure. But it's not implemented as one. Since the object `fun` (whose type is CdefFunction) is *not* a function, it is not a closure in the usual sense. It's just some python object having a reference to another object.

The only new feature might be to declare at the top-level the functions that are referenced by the CdefFunction objects that reach the top-level.

In short, the apparent closures are resolved during the compile-time execution. As Prakhar Goel pointed out, the AST of `fun` would hold a reference to the AST of `foo`.

The compile-time execution of the `twice` example would produce an AST similar to the one we would get by parsing the following code.

    cdef _unique_name_for_original_foo(x):
        # Something math-y

    cdef foo(x):
        return _unique_name_for_original_foo(2*x)

Honestly, I would love to make a prototype, as I think there's really no major obstacle, yet many seem to think there is.
Unfortunately, I don't think I'll have enough time for at least several weeks.

Best regards,
Celelibi
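The "outlining" step described in this message, where a referenced function gets a unique top-level name, can be sketched in plain Python. Everything here is an assumption made for illustration: the `CdefFunction` fields, the `{f}` placeholder convention, the `emit` helper and the `_outlined_N` naming scheme are hypothetical, and source text is emitted only because it is easy to read, not because the proposal works on strings.

```python
import itertools
from dataclasses import dataclass, field

_counter = itertools.count()


@dataclass(eq=False)
class CdefFunction:
    """Hypothetical compile-time object describing one cdef function."""
    params: str
    body: str                                  # may contain {placeholders}
    refs: dict = field(default_factory=dict)   # placeholder -> CdefFunction


def twice(f):
    # Not a closure: just an object holding a reference to another object.
    return CdefFunction("x", "return {f}(2*x)", {"f": f})


def emit(name, func, out, names):
    """Append source for `func` (and what it references) to `out`."""
    names[id(func)] = name
    resolved = {}
    for placeholder, ref in func.refs.items():
        if id(ref) not in names:
            # Referenced function has no top-level name yet: outline it.
            emit(f"_outlined_{next(_counter)}", ref, out, names)
        resolved[placeholder] = names[id(ref)]
    out.append(f"cdef {name}({func.params}):\n    {func.body.format(**resolved)}\n")


original_foo = CdefFunction("x", "return x + 1")  # stands in for the math-y body
lines = []
emit("foo", twice(original_foo), lines, {})
generated = "\n".join(lines)
```

With these assumptions, `generated` contains two definitions: the outlined original under a generated name, followed by `foo`, which calls it with `2*x`. No closure is needed at run time because the reference was resolved to a name while the Python code ran.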
Re: [Cython] [ENH] Compile-time code execution (macros / templates)
On Fri, Feb 26, 2021 at 09:44:51PM -0500, Prakhar Goel wrote:
> > Indeed, internally it rewrites the AST, of course. I think that's the simplest way to implement it. But it doesn't mean the user that write the compile-time code would have to be aware of the AST at all.
> > I mean, if you have a good use-case for it, then I'm ok with allowing the compile-time code manipulating it. But I'd really prefer this not be the main use of the compile-time execution feature.
>
> Ah. I see where you are coming from. Wouldn't this be a ridiculous amount of work? At least if we insisted on something with a clean mental model that wouldn't just shred the user with a thousand corner cases? It would need a whole new sub-syntax in Cython with detailed semantics for all the weird things code lets people do (e.g. what if the input function f gets used like a first-class object and placed in a Python list that then gets passed to another function... at compile time...). You're basically asking for scheme's syntax-case but harder because Python/Cython are nowhere near as regular! And scheme's syntax-case is a beast (read up on phase separation if you don't believe me: https://docs.racket-lang.org/guide/phases.html).

Clearly, I think what I'm proposing has one of the simplest mental models one could dream of. And if I'm not mistaken, it's not much work (I mean, no fancy algorithms).

I mean, how could the mental model be any simpler than bringing in that of python itself?

I don't know what example you had in mind, but I don't plan to have any kind of static analysis in cython at all. All the burden is offloaded to the python interpreter by running nothing but Pure Python Code. I'll expand on that in a minute.

I mean, if you think you've got an example that would be hard to analyze, with your list containing functions, then please expand on it. I'll try my best to answer how my proposal would handle it.
That's one of the reasons I started this thread: to challenge the idea.

Here's how I see it working. After some basic analysis, cython produces a python program that basically rebuilds the AST at that point and runs the code itself at the same time, with a few modifications.
1) cdef definitions aren't run (of course, they're not python code); they're replaced by the creation of CdefFunction or CdefClass objects.
2) Functions marked ctfe_only aren't added to the AST.
3) Assignments marked with the `ctfe` keyword are handled specially to turn whatever was returned into a new definition. (I can expand on that if you want, especially on the "outlining" of functions.)

Running the code mostly means that all the defs and assignments and everything else are evaluated.

This has some obvious drawbacks, though. In particular, if it's a standalone program, it would just... run. But this is unavoidable. For instance the program could have a conditional function definition (a def in an if) that's used at compile time. But I guess it's not much of an issue, since a module would not have much to run during load time, and a standalone program would be protected by `if __name__ == "__main__":`.

> AST manipulation isn't _that_ hard (esp. with good helper libraries) and on the plus side, the basic idea is dead-simple to implement: just call some random Python function that the user points you towards. There's going to be some scaffolding but of an entirely manageable amount.

I'm not saying AST manipulation is hard. I'm rather saying it's unusual in python. And it's so low level that it's rarely easy to understand what's going on.

As someone who reads a lot more code than I write, I use grep and other such tools a lot. And I have to say I'm not really a big fan of that kind of trickery. If a coding pattern would fool grep + a brain, other analysis tools were lost eons ago.

But if you have an actual use case for AST manipulation that would belong to a preprocessing phase, then please explain it to me.
The only time I had to deal with the python AST (besides cython itself, obviously) was to write an external analysis tool to help me migrate a program from threads to asyncio. Not a valid use-case for my proposal.

Best Regards,
Celelibi
Re: [Cython] [ENH] Compile-time code execution (macros / templates)
Hm.
If I understand correctly, what you want is to generate attributes at compile-time and then get all the benefits of having statically declared attributes at run-time.

This looks like a very interesting use-case. I guess it's a kind of named tuple with custom methods. The syntax you propose is quite pythonic and I would love for cython to support something like that.

However, the code you proposed, in its current state, has some weird stuff going on.

First and foremost, it would be quite challenging to disentangle what should be executed at compile-time and what should be left for the run-time.

Second, you'd actually generate a different Base class for each of its subclasses. Which is pretty weird from a typing perspective. Your Foo and Bar wouldn't have a common base class besides `object`.

I guess what you're trying to do is kind of a compile-time generated mixin. A better outline for this could be something like this:

```
cdef class Base:
    pass

def generate_mixin(attrs):
    cdef class C:
        # Generate content based on attrs
    return C

ctfe Foo = generate_mixin(["foo", "bar"])
ctfe Bar = generate_mixin(["bar", "baz"])
```

These Foo and Bar classes could then be subclassed to add whatever specific method you want.

On the other hand, the syntax I proposed makes it very clear what's executed at compile-time or run-time, but cannot support your use-case. In particular, it cannot declare a varying number of cdef attributes.

I know D supports features like this. But it does so with a vastly different set of concepts. Like the `mixin` keyword that inlines a compile-time string into the code.

I guess I'll think about it and see if I can come up with a small set of concepts and syntaxes that support this use-case without having to resort to full blown AST manipulation.

On Fri, Feb 26, 2021 at 07:07:23PM -0800, Brock Mendel wrote:
>Would the following be a good use case for what you're discussing?
>```
>cdef class Base:
>    def __cinit__(self, **kwargs):
>        for name in self.list_defined_on_subclass:
>            value = kwargs[name]
>            setattr(self, name, value)
>
>    cpdef copy(self):
>        kwargs = {name: getattr(self, name) for name in self.list_defined_on_subclass}
>        return type(self)(**kwargs)
>
>cdef class Foo(Base):
>    list_defined_on_subclass = ["foo", "bar"]
>
>cdef class Bar(Base):
>    list_defined_on_subclass = ["bar", "baz"]
>```
>and then at compile-time have the macro expand Foo to something like
>
>```
>cdef class Foo:
>    cdef:
>        something foo
>        something_else bar
>
>    def __cinit__(self, *, foo, bar):
>        self.foo = foo
>        self.bar = bar
>
>    cpdef copy(self):
>        return type(self)(foo=self.foo, bar=self.bar)
>```
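To make the requested expansion concrete, here is a small Python sketch that produces the expanded class text from a list of attribute names. It is closer in spirit to D's string `mixin` than to the object-based approach discussed in this thread, and it rests on assumptions: `generate_class_source` is a hypothetical helper, and every attribute is given the placeholder type `object` because the quoted message leaves the real types unspecified (`something`, `something_else`).

```python
def generate_class_source(class_name, attrs, attr_type="object"):
    """Return Cython-like source text for a class with the given attributes."""
    if not attrs:
        raise ValueError("at least one attribute is required")
    # One declaration per attribute inside the `cdef:` block.
    decls = "\n".join(f"        {attr_type} {a}" for a in attrs)
    # Keyword-only parameters and the matching assignments for __cinit__.
    params = ", ".join(attrs)
    inits = "\n".join(f"        self.{a} = {a}" for a in attrs)
    # Keyword arguments passed back to the constructor by copy().
    kwargs = ", ".join(f"{a}=self.{a}" for a in attrs)
    return (
        f"cdef class {class_name}:\n"
        f"    cdef:\n{decls}\n\n"
        f"    def __cinit__(self, *, {params}):\n{inits}\n\n"
        f"    cpdef copy(self):\n"
        f"        return type(self)({kwargs})\n"
    )


foo_source = generate_class_source("Foo", ["foo", "bar"])
bar_source = generate_class_source("Bar", ["bar", "baz"])
```

A generator like this could be run as a separate build step today, at the cost of keeping templates as strings. The open question in the thread is how to get the same result from compile-time Python objects instead.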