[Cython] [ENH] Compile-time code execution (macros / templates)

2021-02-23 Thread Celelibi
Hello. I was going to post this as an issue on github, but since the
idea isn't complete, I guess this mailing list is better suited.

I read (not in too much detail, though) the related CEPs, and I feel
they either do too little or too much.

In my idea, there is some syntax (or decorator) and some propagation
rules to decide what to execute during the compilation. It would merge
the need for templates and for constants pre-evaluation.

The uses would be many.
- It could generate new cdef classes based on an argument (possibly a C
  type).
- It could make sure all the references to the cython module are caught
  during the compilation.
- It could replace the DEF keyword with a pure python syntax.
- It could easily allow any decorator on cdef functions.
- Compile-time decision of a function implementation (common in C using
  the preprocessor).
- And probably many more.


As far as I understand, this would be strictly more general and elegant
than these two proposals:
https://github.com/cython/cython/wiki/enhancements-inlining
https://github.com/cython/cython/wiki/enhancements-methodtemplates

In my idea, the only metaprogramming entry points would be cdef functions,
cdef classes (or cppclasses) and C/C++ types. No access to the AST as
it's unusual in python to do so. But nonetheless, it would be quite easy
to have a CdefFunction object that exposes an `ast` attribute.

As such, my proposal would provide a cleaner entry point to
metaprogramming than these two proposals, together with a pure python
syntax.
https://github.com/cython/cython/wiki/enhancements-metaprogramming
https://github.com/cython/cython/wiki/enhancements-uneval


The compile-time execution doesn't have to support everything.
In particular, the evaluation of `cdef` code could be heavily restricted.
Here are the restrictions that apply to CTFE (Compile-Time Function
Evaluation) in the D language.
https://dlang.org/spec/function.html#interpretation


As far as I can tell, it's quite common in python to write load-time
code in modules to achieve a higher level of genericity. Like to support
PY2 and PY3, Windows and Linux, etc. Or to generate classes using
closures.
I think bringing back the habits and culture of python to cython would
be a good thing.
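
To make the load-time pattern concrete, here is a minimal pure-python
sketch. The names `path_separator` and `make_counter_class` are made up
for this example; only the pattern itself matters.

```python
import sys

# The implementation is chosen once, when the module is loaded.
if sys.platform.startswith("win"):
    def path_separator():
        return "\\"
else:
    def path_separator():
        return "/"


def make_counter_class(step):
    # A class generated by a closure: `step` is captured at creation time.
    class Counter:
        def __init__(self):
            self.value = 0

        def bump(self):
            self.value += step
            return self.value

    return Counter


ByTwo = make_counter_class(2)
ByTen = make_counter_class(10)
```

This is the habit that currently stops working as soon as `def` and
`class` become `cdef` and `cdef class`.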


On the syntax side, I'm not sure how to handle cases where the right
hand side of an assignment should be evaluated and the cases where every
use of the variable should be evaluated.

# Only the right hand side needs to be evaluated
cdef str pi = compute_pi(100)

# Every use of the variable needs to be evaluated
os = "Linux"
if os == "Windows":
    cdef dostuff():
        return 1
else:
    cdef dostuff():
        return 0
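
A rough sketch of the second case, using plain `def` in place of `cdef`
so that it runs as ordinary python. It assumes, for illustration only,
that every top-level assignment is evaluated at compile time and that a
top-level `if` is replaced by whichever branch is taken. The helper
`fold_toplevel` is hypothetical and is not how cython works today.

```python
import ast

SOURCE = '''
os = "Linux"
if os == "Windows":
    def dostuff():
        return 1
else:
    def dostuff():
        return 0
'''


def fold_toplevel(source):
    """Run top-level assignments now and keep only the taken `if` branch."""
    tree = ast.parse(source)
    env = {}    # compile-time namespace
    kept = []   # statements left over for run time
    for node in tree.body:
        if isinstance(node, ast.Assign):
            # Evaluate the assignment at "compile time" and drop it.
            stmt = ast.Module(body=[node], type_ignores=[])
            exec(compile(stmt, "<ctfe>", "exec"), env)
        elif isinstance(node, ast.If):
            # Evaluate the condition now; only one branch survives.
            test = ast.Expression(body=node.test)
            taken = eval(compile(test, "<ctfe>", "eval"), env)
            kept.extend(node.body if taken else node.orelse)
        else:
            kept.append(node)
    tree.body = kept
    return tree


module = {}
exec(compile(fold_toplevel(SOURCE), "<folded>", "exec"), module)
```

After folding, the module contains a single `dostuff` definition and no
trace of `os`. What the sketch leaves open is exactly the question
above: how to tell which assignments belong to compile time.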


What do you think?
Did I miss something that would make this feature mostly useless?


Best regards,
Celelibi
___
cython-devel mailing list
cython-devel@python.org
https://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] [ENH] Compile-time code execution (macros / templates)

2021-02-24 Thread Celelibi
On Tue, Feb 23, 2021 at 11:24:42AM -0500, Prakhar Goel wrote:
>I had similar ideas in the past (assuming I understand your proposal). I'd
>love it if a decorator could link to some python function. This python
>function would get something that represents the cdef function (with the
>ast). It would then return a transformed ast that could be spliced into
>the original program.

As I said, I'm not sure allowing AST manipulation would be that useful.
I mean, it's rarely ever needed in python. Why would it be needed with
cython?
However, it would surely make the code much less readable, since the
code that's executed is not the code that's written.

The use-cases I have in mind are more like these, which are common
patterns in python.

Compile-time implementation choice:

if os == "Windows":
    cdef thread_something():
        # Handle Windows-specific behavior
else:
    cdef thread_something():
        # Assume POSIX behavior


Decorators on cdef functions:

def twice(f):
    cdef fun(x):
        return f(2*x)
    return fun

@twice
cdef foo(x):
    # Do something math-y



Creating cdef classes on demand:

def make_ptr_class(T):
    cdef class C:
        cdef T *ptr
    return C

IntPtr = make_ptr_class(int)
FloatPtr = make_ptr_class(float)

My current use-case is actually exactly this one: creating wrapper
classes for pointer types, which I currently cannot make both generic
and type safe.
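
For context, the closest thing available without such a feature is
plain text generation. A minimal sketch, assuming a hypothetical helper
`make_ptr_class_source` (not part of cython) that builds the source of
one wrapper class per C type:

```python
def make_ptr_class_source(class_name, c_type):
    """Return Cython source for a wrapper class holding a `c_type *`."""
    return (
        f"cdef class {class_name}:\n"
        f"    cdef {c_type} *ptr\n"
    )


# One class per C type; the result would be written to a .pxi/.pyx file.
generated = "\n".join([
    make_ptr_class_source("IntPtr", "int"),
    make_ptr_class_source("FloatPtr", "float"),
])
print(generated)
```

It is generic and type safe, but the template lives in a string outside
the language, which is what the proposal is trying to avoid.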


The python code run at compile-time would basically manipulate python
objects representing cdef functions, classes and C types.

Internally, the way it could be implemented is that cython could
generate a python program that would produce the AST. But it doesn't
mean the AST itself has to be visible.

Best regards,
Celelibi


Re: [Cython] [ENH] Compile-time code execution (macros / templates)

2021-02-25 Thread Celelibi
1) My suggestion would only run some standard python code during the
compilation. Running the cdef functions at compile-time doesn't have to
be supported. At least not for a first version.

Even if it gets supported, calling external functions (written in C)
doesn't have to be supported either. If only running the cdef functions
is supported, then it means cython has the full code and would "only"
have to implement the C semantics in python. And actually, not the whole
C semantics has to be supported. The language D has quite some
restrictions on the code that can run at compile-time. Especially when
it comes to pointers.

But first, running only python code seems quite doable.


2) That's something I didn't elaborate on in my initial message to keep
it short (kinda). But indeed, some compile-time functions just wouldn't
make sense at runtime. In particular, `twice` and `make_ptr_class`
wouldn't, as far as I can tell, because returning a cdef function or
class at run time doesn't really make sense right now. (Maybe in the
future if cython supports compiling closures to C?)

I guess some kind of annotation would be necessary to mark a function (or
maybe even a class?) as "compile-time only". Of course, other functions
and classes might make sense to use both at compile-time and at run-time.


The only thing I'm not sure about is how to determine what should get
evaluated at compile-time. A few places in the code should probably get
automatically marked for CTFE, like decorators on cdef functions.

IIRC, D uses "enum" to declare a compile-time constant. But this only
solves half the issue. What about top-level code evaluation, like the
`if os == "windows"` example?
Maybe this could be wrapped in a function evaluated at compile-time, and
thus only function evaluation would be required.
Something like this:


@cython.ctfe_only
def choice_thread_something(os):
    if os == "windows":
        cdef thread_something():
            # Windows behavior
    else:
        cdef thread_something():
            # POSIX behavior
    return thread_something

ctfe thread_something = choice_thread_something(os)


Where the "ctfe" keyword would play a similar role to "enum" in D and
force the evaluation of the right hand side during the compilation.

As a technical detail, the "cdef" lines that define the thread_something
functions could be replaced during the compilation by some python code
like:
thread_something = CdefFunction(...)

And the calling line `ctfe thread_something = ...` would just insert the
returned object into the AST. All the remaining (non-CTFE) code would
be converted to code that builds the AST.
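
To illustrate, here is a sketch of what the generated python code might
look like for this example. Everything in it is hypothetical: the
`CdefFunction` class, the `module_ast` list standing in for the real
AST, and the `ctfe_assign` helper standing in for the `ctfe` keyword.

```python
class CdefFunction:
    """Compile-time stand-in for a cdef function. It is not callable."""

    def __init__(self, name, body):
        self.name = name
        self.body = body  # the untouched cdef body, kept as opaque text here


def choice_thread_something(os):
    # Each `cdef thread_something(): ...` became a CdefFunction creation.
    if os == "windows":
        thread_something = CdefFunction("thread_something", "# Windows behavior")
    else:
        thread_something = CdefFunction("thread_something", "# POSIX behavior")
    return thread_something


module_ast = []  # stand-in for the AST of the module being compiled


def ctfe_assign(name, value):
    """Possible meaning of `ctfe name = value`: splice the result into the AST."""
    if not isinstance(value, CdefFunction):
        raise TypeError("expected a compile-time definition")
    module_ast.append((name, value))


ctfe_assign("thread_something", choice_thread_something("linux"))
```

Only the POSIX variant ends up in `module_ast`, and
`choice_thread_something` itself never does.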



I hope that makes it more clear.


Celelibi


On Wed, Feb 24, 2021 at 10:20:39PM +, da-woods wrote:
> A few things to point out:
> 1) Because Cython is mainly a code translation tool it doesn't currently
> have an overview of how things would eventually be evaluated - it leaves a
> lot for the C compiler to sort out. Your suggestion would presumably add a
> dependency on a C compiler at the .pyx -> .c stage (currently the only
> dependencies are Python+Cython itself).
> 2) Presumably `twice` and `make_ptr_class` would also have to work at
> run-time? They are regular `def` functions after all? That seems like a big
> change to now have to dynamically create `cdef` functions and classes at
> both Cythonization-time and runtime. And also a Python-runtime
> representation of every C type (so that you can call `make_ptr_class` at
> runtime)
> 
> I totally get the appeal of being able to wrap template-types quickly and
> easily. However, to me this idea actually seems a lot harder than something
> which creates new syntax, because you don't distinguish these special
> functions and so they have to work universally as Python functions too.
> 
> In summary: if it worked it'd be very nice, but I personally don't have any
> idea how you'd implement it within how Cython currently works.
> 
> David

Re: [Cython] [ENH] Compile-time code execution (macros / templates)

2021-02-25 Thread Celelibi
On Fri, Feb 26, 2021 at 01:32:34PM +1300, Greg Ewing wrote:
> On 26/02/21 9:29 am, Celelibi wrote:
> > Maybe in the future if cython
> > supports compiling closures to C?
> 
> Your "twice" example already needs some closure functionality.
Yes, at the python level only. That's precisely the point.

> It relies on being able to manufacture a cdef function inside
> a Python function, with the expectation that it will have
> access to arguments of the Python function. With all or some
> of that happening at compile time rather than run time. I'm
> having trouble imagining how it would be implemented.

It relies on creating a python object representing the inner cdef
function during the compile-time execution. Which is, like, the trivial
part.

Just picture the `twice` example being converted like this for the
compile-time execution.


def twice(f):
    fun = CdefFunction(... use f ...)
    return fun

foo = CdefFunction(...)
ast.append(AstCdefFunction("foo", twice(foo)))


Where CdefFunction is not necessarily callable. It's just a class
representing a cdef function. And AstCdefFunction is an AST node
representing a cdef function definition.

Does that seem far-fetched to you?
I think this might even be doable without having to detect the closure
in cython at all. We'd just have to let python perform the name lookup.


Celelibi


Re: [Cython] [ENH] Compile-time code execution (macros / templates)

2021-02-26 Thread Celelibi
Prakhar Goel,

> Doesn't this just punt on how CdefFunction works? Feels like we're
> back to AST re-writing.
Indeed, internally it rewrites the AST, of course. I think that's the
simplest way to implement it. But it doesn't mean that users who write
the compile-time code would have to be aware of the AST at all.
I mean, if you have a good use-case for it, then I'm ok with allowing
the compile-time code to manipulate it. But I'd really prefer this not be
the main use of the compile-time execution feature.


Greg Ewing,
> On 26/02/21 3:21 pm, Celelibi wrote:
> > def twice(f):
> >  fun = CdefFunction(... use f ...)
> >  return fun
> > 
> > foo = CdefFunction(...)
> > ast.append(AstCdefFunction("foo", twice(foo)))
> > 
> > I think this might even be doable without having to even detect the
> > closure in cython. We'd just have to let python perform the name lookup.
> 
> The cdef function created inside twice() is going to have
> to be some kind of closure, because it needs access at run
> time to the particular f that was passed to twice() at
> compile time.

Yes, conceptually it's a closure. But not implemented as one.
Since the object `fun` (whose type is CdefFunction) is *not* a function,
it is not a closure in the usual sense. It's just some python object
having a reference to another object.
The only new feature might be to declare at the top-level the functions
that are referenced by the CdefFunction object that reaches the top-level.

In short, the apparent closures are resolved during the compile-time
execution.

As Prakhar Goel pointed out, the AST of `fun` would hold a reference to
the AST of `foo`. The compile-time execution of the `twice` example
would produce an AST similar to the one we would get by parsing the
following code.

cdef _unique_name_for_original_foo(x):
    # Something math-y

cdef foo(x):
    return _unique_name_for_original_foo(2*x)
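
A small sketch of how that "outlining" could work, in plain python. It
is only an illustration under simplifying assumptions: function bodies
are kept as text templates rather than AST nodes, `return x + 1` is a
placeholder for the math-y body, and `CdefFunction`, `emit` and the
`_outlined_N` naming scheme are all made up here.

```python
import itertools

_counter = itertools.count()


class CdefFunction:
    """Compile-time stand-in for a cdef function. It is not callable."""

    def __init__(self, params, body, refs=None):
        self.params = params    # parameter list as text, e.g. "x"
        self.body = body        # body text; {placeholders} name other functions
        self.refs = refs or {}  # placeholder -> referenced CdefFunction


def twice(f):
    # The "closure" is just an object holding a reference to `f`.
    return CdefFunction("x", "return {f}(2*x)", {"f": f})


def emit(name, func, out):
    """Emit `func` as `name`, first outlining every function it references."""
    names = {}
    for placeholder, ref in func.refs.items():
        unique = f"_outlined_{next(_counter)}"
        emit(unique, ref, out)
        names[placeholder] = unique
    out.append(f"cdef {name}({func.params}):\n    {func.body.format(**names)}")


foo = CdefFunction("x", "return x + 1")
out = []
emit("foo", twice(foo), out)
print("\n\n".join(out))
```

The original `foo` is emitted first under a generated name, then the
wrapper is emitted as `foo` and calls it by that name. No closure
exists at run time.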


Honestly, I would love to make a prototype, as I think there's really no
major obstacle, yet many seem to think there is.
Unfortunately, I don't think I'll have enough time for at least
several weeks.


Best regards,
Celelibi


Re: [Cython] [ENH] Compile-time code execution (macros / templates)

2021-02-27 Thread Celelibi
On Fri, Feb 26, 2021 at 09:44:51PM -0500, Prakhar Goel wrote:
> > Indeed, internally it rewrites the AST, of course. I think that's the
> > simplest way to implement it. But it doesn't mean that users who write
> > the compile-time code would have to be aware of the AST at all.
> > I mean, if you have a good use-case for it, then I'm ok with allowing
> > the compile-time code to manipulate it. But I'd really prefer this not be
> > the main use of the compile-time execution feature.
> 
> Ah. I see where you are coming from. Wouldn't this be a ridiculous
> amount of work? At least if we insisted on something with a clean
> mental model that wouldn't just shred the user with a thousand corner
> cases? It would need a whole new sub-syntax in Cython with detailed
> semantics for all the weird things code lets people do (e.g. what if
> the input function f gets used like a first-class object and placed in
> a Python list that then gets passed to another function... at compile
> time...). You're basically asking for scheme's syntax-case but harder
> because Python/Cython are nowhere near as regular! And scheme's
> syntax-case is a beast (read up on phase separation if you don't
> believe me: https://docs.racket-lang.org/guide/phases.html).

Clearly, I think what I'm proposing has one of the simplest mental models
one could dream of. And if I'm not mistaken, it's not much work (I mean,
no fancy algorithms).
I mean, how could the mental model be any simpler than bringing in that
of python itself?

I don't know what example you had in mind, but I don't plan to have any
kind of static analysis in cython at all. All the burden is offloaded to
the python interpreter by running nothing but Pure Python Code. I'll
expand on that in a minute.

I mean, if you think you got an example that would be hard to analyze
with your list containing functions, then please expand on it. I'll
try my best to answer how my proposal would handle it. That's one of the
reasons I started this thread: to challenge the idea.

Here's how I see it working.
After some basic analysis, cython produces a python program that
basically rebuilds the AST at that point and runs the code itself at the
same time, with a few modifications.
1) cdef definitions aren't run (of course, they're not python code);
they're replaced by the creation of CdefFunction or CdefClass objects.
2) Functions marked ctfe_only aren't added to the AST.
3) Assignments marked with the `ctfe` keyword are handled specially to
turn whatever was returned into a new definition. (I can expand on that if
you want, especially on the "outlining" of functions.)

Running the code mostly means that every def, assignment and other
top-level statement is evaluated. This has some obvious drawbacks, though.
In particular, if it's a standalone program, it would just... run. But this
is unavoidable. For instance, the program could have a conditional function
definition (a def in an if) that's used at compile time.
But I guess it's not much of an issue since a module would not have much
to run during load time, and a standalone program would be protected by
`if __name__ == "__main__":`.
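
A quick way to see that last point, as a sketch. It assumes the
compile-time run would execute the module under a name other than
"__main__"; the name "ctfe_module" and the `events` list are only there
for the demonstration.

```python
SOURCE = '''
events.append("top-level ran")

def main():
    events.append("main ran")

if __name__ == "__main__":
    main()
'''

events = []
# Run the module body the way an import would, not the way a script would.
namespace = {"__name__": "ctfe_module", "events": events}
exec(compile(SOURCE, "<ctfe>", "exec"), namespace)
print(events)
```

The unguarded top-level statement runs and the guarded call to `main()`
does not, which is the same behaviour one already gets when importing a
module.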

> AST manipulation isn't _that_ hard (esp. with good helper libraries)
> and on the plus side, the basic idea is dead-simple to implement: just
> call some random Python function that the user points you towards.
> There's going to be some scaffolding but of an entirely manageable
> amount.
I'm not saying AST manipulation is hard. I'm rather saying it's unusual
in python. And it's so low level it's rarely easy to understand what's
going on.
As someone who reads a lot more code than I write, I use grep and other
such tools a lot. And I have to say I'm not really a big fan of that
kind of trickery. If a coding pattern would fool grep + a brain, other
analysis tools were lost eons ago.

But if you have an actual use case for AST manipulation that would
belong to a preprocessing phase, then please explain it to me.
The only time I had to deal with the python AST (besides cython itself,
obviously) was to write an external analysis tool to help me migrate a
program from threads to asyncio. Not a valid use-case for my proposal.


Best Regards,
Celelibi


Re: [Cython] [ENH] Compile-time code execution (macros / templates)

2021-02-28 Thread Celelibi
Hm.
If I understand correctly, what you want is to generate attributes at
compile-time and then get all the benefits of having statically declared
attributes at run-time.

This looks like a very interesting use-case. I guess it's a kind of
named tuple with custom methods. The syntax you propose is quite
pythonic and I would love for cython to support something like that.
However, in its current state, the code you proposed has some weird
stuff going on.
First and foremost, it would be quite challenging to disentangle what
should be executed at compile-time and what should be left for the
run-time.
Second, you'd actually generate a different Base class for each of its
subclasses. Which is pretty weird from a typing perspective. Your Foo
and Bar wouldn't have a common base class besides object. I guess what
you're trying to do is kind of a compile-time generated mixin.
I guess a better outline for this could be something like this:
```
cdef class Base:
    pass

def generate_mixin(attrs):
    cdef class C:
        # Generate content based on attrs
    return C

ctfe Foo = generate_mixin(["foo", "bar"])
ctfe Bar = generate_mixin(["bar", "baz"])
```
These Foo and Bar classes could then be subclassed to add whatever
specific method you want.

On the other hand, the syntax I proposed makes it very clear what's
executed at compile-time or run-time, but cannot support your use-case.
Especially declaring a varying number of cdef attributes.

I know D supports features like this. But it does so with a vastly
different set of concepts. Like the `mixin` keyword that inlines a
compile-time string into the code.
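
For comparison, a rough python analogue of that string-based approach.
It builds the source of the expanded class from a list of attributes.
The helper `expand_class` is hypothetical, and the type names
`something` and `something_else` are the placeholders from your message,
not real C types.

```python
def expand_class(name, attrs):
    """Build Cython source for a class with one cdef attribute per entry.

    `attrs` is a list of (c_type, attribute_name) pairs.
    """
    names = [attr for _, attr in attrs]
    params = ", ".join(names)
    kwargs = ", ".join(f"{attr}=self.{attr}" for attr in names)

    lines = [f"cdef class {name}:", "    cdef:"]
    lines += [f"        {c_type} {attr}" for c_type, attr in attrs]
    lines.append(f"    def __cinit__(self, *, {params}):")
    lines += [f"        self.{attr} = {attr}" for attr in names]
    lines.append("    cpdef copy(self):")
    lines.append(f"        return type(self)({kwargs})")
    return "\n".join(lines)


foo_source = expand_class("Foo", [("something", "foo"), ("something_else", "bar")])
print(foo_source)
```

It handles a varying number of attributes easily, but only because the
class body is a string and no longer code that the compiler, or grep,
can see.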

I guess I'll think about it and see if I can come up with a small set of
concepts and syntaxes that support this use-case without having to
resort to full-blown AST manipulation.


On Fri, Feb 26, 2021 at 07:07:23PM -0800, Brock Mendel wrote:
>Would the following be a good use case for what you're discussing?
>```
>cdef class Base:
>    def __cinit__(self, **kwargs):
>        for name in self.list_defined_on_subclass:
>            value = kwargs[name]
>            setattr(self, name, value)
>    cpdef copy(self):
>        kwargs = {name: getattr(self, name)
>                  for name in self.list_defined_on_subclass}
>        return type(self)(**kwargs)
>cdef class Foo(Base):
>    list_defined_on_subclass = ["foo", "bar"]
>cdef class Bar(Base):
>    list_defined_on_subclass = ["bar", "baz"]
>```
>and then at compile-time have the macro expand Foo to something like
> 
>```
>cdef class Foo:
>    cdef:
>        something foo
>        something_else bar
>    def __cinit__(self, *, foo, bar):
>        self.foo = foo
>        self.bar = bar
>    cpdef copy(self):
>        return type(self)(foo=self.foo, bar=self.bar)
>```