Re: [Cython] CEP1000: Native dispatch through callables
On Sat, Apr 14, 2012 at 11:39 PM, Stefan Behnel wrote: > Robert Bradshaw, 15.04.2012 08:32: >> On Sat, Apr 14, 2012 at 11:16 PM, Stefan Behnel wrote: >>> Robert Bradshaw, 15.04.2012 07:59: On Sat, Apr 14, 2012 at 2:00 PM, mark florisson wrote: > There may be a lot of promotion/demotion (you likely only want the > former) combinations, especially for multiple arguments, so perhaps it > makes sense to limit ourselves a bit. For instance for numeric scalar > argument types we could limit to long (and the unsigned counterparts), > double and double complex. > > So char, short and int scalars will be > promoted to long, float to double and float complex to double complex. > Anything bigger, like long long etc will be matched specifically. > Promotions and associated demotions if necessary in the callee should > be fairly cheap compared to checking all combinations or going through > the python layer. True, though this could be a convention rather than a requirement of the spec. Long vs. < long seems natural, but are there any systems where (scalar) float still has an advantage over double? Of course pointers like float* vs double* can't be promoted, so we would still need this kind of type declaration. >>> >>> Yes, passing data sets as C arrays requires proper knowledge about their >>> memory layout on both sides. >>> >>> OTOH, we are talking about functions that would otherwise be called through >>> Python, so this could only apply for buffers anyway. So why not require a >>> Py_buffer* as argument for them? >> >> That's certainly our (initial?) usecase, but there's no need to limit >> the protocol to this. > > I think the question here is: is this supposed to be a best effort protocol > for bypassing Python calls, or would it be an error in some situations if > no matching signature can be found? It may be an error in some cases. This isn't just about avoiding Python calls; Dag just barely summed this up quite nicely. 
- Robert ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
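Editorial sketch: the promotion convention mark florisson suggests in the quoted text above (char, short and int widen to long, float to double, float complex to double complex, everything else matched specifically) could look roughly like the following. The table, the use of C type names as keys, and the helper names `promote` and `find_match` are assumptions made for illustration; nothing here is part of CEP1000.

```python
# Hypothetical promotion table following the convention suggested in the thread.
PROMOTIONS = {
    "char": "long",
    "short": "long",
    "int": "long",
    "unsigned char": "unsigned long",
    "unsigned short": "unsigned long",
    "unsigned int": "unsigned long",
    "float": "double",
    "float complex": "double complex",
}


def promote(arg_types):
    """Widen small scalar types; anything not listed (long long, pointers) is kept as-is."""
    return tuple(PROMOTIONS.get(t, t) for t in arg_types)


def find_match(arg_types, available):
    """Try the exact signature first, then fall back to the promoted one.

    `available` maps argument-type tuples to callables (or any payload).
    Returns None when neither lookup succeeds.
    """
    exact = tuple(arg_types)
    if exact in available:
        return available[exact]
    return available.get(promote(arg_types))
```

Pointer types such as float* are deliberately absent from the table, matching Robert's remark that float* vs double* cannot be promoted.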
Re: [Cython] CEP1000: Native dispatch through callables
Dag Sverre Seljebotn, 15.04.2012 08:58: > Ah, Cython objects. Didn't think of that. More below. > > On 04/14/2012 11:02 PM, Stefan Behnel wrote: >> thanks for writing this up. Comments inline as I read through it. >> >> Dag Sverre Seljebotn, 14.04.2012 21:08: >>> each described by a function pointer and a signature specification >>> string, such as "id)i" for {{{int f(int, double)}}}. >> >> How do we deal with object argument types? Do we care on the caller side? >> Functions might have alternative signatures that differ in the type of >> their object parameters. Or should we handle this inside of the caller and >> expect that it's something like a fused function with internal dispatch in >> that case? >> >> Personally, I think there is not enough to gain from object parameters that >> we should handle it on the caller side. The callee can dispatch those if >> necessary. >> >> What about signatures that require an object when we have a C typed value? >> >> What about signatures that require a C typed argument when we have an >> arbitrary object value in our call parameters? >> >> We should also strip the "self" argument from the parameter list of >> methods. That's handled by the attribute lookup before even getting at the >> callable. > > On 04/15/2012 07:59 AM, Robert Bradshaw wrote: >> It would certainly be useful to have special syntax for memory views >> (after nailing down a well-defined ABI for them) and builtin types. >> Being able to declare something as taking a >> "sage.rings.integer.Integer" could also prove useful, but could result >> in long (and prefix-sharing) signatures, favoring the >> runtime-allocated ids. > > I do think describing Cython objects in this cross-tool CEP would work > nicely, this is for standardized ABIs only (we can't do memoryviews either > until their ABI is standard). It just occurred to me that an object's type can safely be represented at runtime as a pointer, i.e. an integer. 
Even if the type is heap allocated and replaced by another one later, a signature that uses that pointer value in its encoding would only ever match if both sides talk about the same type at call time (because at least one of them would hold a live reference to the type in order to actually use it).

That would mean that IDs for signatures with object arguments would have to be generated at setup time, e.g. during module init, after importing the respective type. But I think that's acceptable.

> I think I prefer to a) exclude it now, and b) down the line we need another
> cross-tool ABI to communicate vtables, and then we could put that into this
> CEP now.
>
> I strongly believe we should go with the Go "duck-typing" approach for
> interfaces, i.e. it is not the declared name that should be compared but
> the method names and signatures.
>
> The only question that needs answering for CEP1000 is: Would this blow up
> the signature string enough that interning is the only viable option?

That sounds excessive to me. Why would you want to test interfaces of arguments as part of the signature matching? Isn't that something that the callee should do when it actually needs a specific interface internally?

Is there an important use case for passing objects with different interfaces as the same argument into the same callable? At least, it doesn't sound like such a use case would be performance critical in terms of the call overhead.

Stefan
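Editorial sketch: Stefan's idea of encoding an object argument's type as its pointer value, computed at setup time, can be mimicked in Python, where `id()` returns the object's address in CPython (an implementation detail). The class `Integer`, the helper `make_signature_id` and the `Entry` class are hypothetical names used only for illustration.

```python
class Integer:
    """Stand-in for an extension type such as sage.rings.integer.Integer."""


def make_signature_id(arg_types, return_code):
    # In CPython, id(t) is the address of the type object, so it plays the
    # role of the PyTypeObject* pointer value discussed in the thread.
    return (tuple(id(t) for t in arg_types), return_code)


class Entry:
    """One dispatch entry whose signature ID is built at setup time."""

    def __init__(self, arg_types, return_code, func):
        # Holding the types keeps them alive, so their addresses cannot be
        # reused by a different type while this entry exists.
        self.types = tuple(arg_types)
        self.sig_id = make_signature_id(arg_types, return_code)
        self.func = func
```

A caller that imports the same type and builds its ID the same way gets an equal value; a caller holding a different live type cannot collide with it.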
Re: [Cython] CEP1000: Native dispatch through callables
On Sat, Apr 14, 2012 at 11:58 PM, Dag Sverre Seljebotn wrote: > Ah, Cython objects. Didn't think of that. More below. > > > On 04/14/2012 11:02 PM, Stefan Behnel wrote: >> >> Hi, >> >> thanks for writing this up. Comments inline as I read through it. >> >> Dag Sverre Seljebotn, 14.04.2012 21:08: >>> >>> each described by a function pointer and a signature specification >>> >>> string, such as "id)i" for {{{int f(int, double)}}}. >> >> >> How do we deal with object argument types? Do we care on the caller side? >> Functions might have alternative signatures that differ in the type of >> their object parameters. Or should we handle this inside of the caller and >> expect that it's something like a fused function with internal dispatch in >> that case? > >> >> Personally, I think there is not enough to gain from object parameters >> that >> we should handle it on the caller side. The callee can dispatch those if >> necessary. >> >> What about signatures that require an object when we have a C typed value? >> >> What about signatures that require a C typed argument when we have an >> arbitrary object value in our call parameters? >> >> We should also strip the "self" argument from the parameter list of >> methods. That's handled by the attribute lookup before even getting at the >> callable. > > On 04/15/2012 07:59 AM, Robert Bradshaw wrote: >> It would certainly be useful to have special syntax for memory views >> (after nailing down a well-defined ABI for them) and builtin types. >> Being able to declare something as taking a >> "sage.rings.integer.Integer" could also prove useful, but could result >> in long (and prefix-sharing) signatures, favoring the >> runtime-allocated ids. > > > I do think describing Cython objects in this cross-tool CEP would work > nicely, this is for standardized ABIs only (we can't do memoryviews either > until their ABI is standard). 
> I think I prefer to a) exclude it now, and b) down the line we need another
> cross-tool ABI to communicate vtables, and then we could put that into this
> CEP now.
>
> I strongly believe we should go with the Go "duck-typing" approach for
> interfaces, i.e. it is not the declared name that should be compared but the
> method names and signatures.
>
> The only question that needs answering for CEP1000 is: Would this blow up
> the signature string enough that interning is the only viable option?

Exactly.

> Some strcmp solutions:
>
> a) Hash each vtable descriptor to 160-bits, and assume the hash is unique.
> Still, a couple of interfaces would blow up the signature string a lot.
>
> b) Modify approach B in CEP 1000 to this: If it is longer than 160 bits,
> take a full cryptographic hash, and just assume there won't be hash
> collisions (like git does). This still saves for short signature strings,
> and avoids interning at the cost of doing 160-bit comparisons.
>
> Both of these require other ways at getting at the actual string data. But I
> still like b) above better than interning.

Requiring an implementation of (or at least access to) a cryptographic hash greatly complicates the spec. (On another note, even a simple hash as a prefix might be useful to prevent a lot of false partial matches, e.g. "sage.rings...") 160 * n bits starts to get large too (and we'd have to twiddle them to insert/avoid a "dash" every 16 bytes).

Here's a crazy thought: we could assume signatures like this are "application specific." We can partition up portions of the signature space to individual projects to compute however they want. Cython can do this via interning for those signatures containing Cython types (which is not an undue burden for anyone attempting to interoperate with Cython types). For (some superset of) the basic C types we agree on a common encoding and inline it.
- Robert
Re: [Cython] [cython-users] Cython 0.16 RC 1
On Sat, Apr 14, 2012 at 9:14 PM, Al Danial wrote:
> On Thu, Apr 12, 2012 at 7:38 AM, mark florisson wrote:
>>
>> Yet another release candidate, this will hopefully be the last before
>> the 0.16 release. You can grab it from here:
>> http://wiki.cython.org/ReleaseNotes-0.16
>>
>> If there are any problems, please let us know.
>
> I'm having the same problem ("Cannot convert 'PyObject *' to Python object",
> ref my posts at
> http://groups.google.com/group/cython-users/browse_thread/thread/d1a727e9d61f93b6#)
> on my code as with the release candidate 0. The code builds and runs
> cleanly with 0.15.1. To duplicate:
>
> svn co http://pynastran.googlecode.com/svn/trunk/pyNastran/op4
> cd op4
> make clean ; make

Including the problematic line would have been helpful.

    ndarray.base = array_wrapper_RS

This is due to the Numpy 1.7 fix. I think we need to pull these commits out for now: https://github.com/cython/cython/pull/112

- Robert
Re: [Cython] CEP1000: Native dispatch through callables
On 04/15/2012 09:30 AM, Stefan Behnel wrote: Dag Sverre Seljebotn, 15.04.2012 08:58: Ah, Cython objects. Didn't think of that. More below. On 04/14/2012 11:02 PM, Stefan Behnel wrote: thanks for writing this up. Comments inline as I read through it. Dag Sverre Seljebotn, 14.04.2012 21:08: each described by a function pointer and a signature specification string, such as "id)i" for {{{int f(int, double)}}}. How do we deal with object argument types? Do we care on the caller side? Functions might have alternative signatures that differ in the type of their object parameters. Or should we handle this inside of the caller and expect that it's something like a fused function with internal dispatch in that case? Personally, I think there is not enough to gain from object parameters that we should handle it on the caller side. The callee can dispatch those if necessary. What about signatures that require an object when we have a C typed value? What about signatures that require a C typed argument when we have an arbitrary object value in our call parameters? We should also strip the "self" argument from the parameter list of methods. That's handled by the attribute lookup before even getting at the callable. On 04/15/2012 07:59 AM, Robert Bradshaw wrote: It would certainly be useful to have special syntax for memory views (after nailing down a well-defined ABI for them) and builtin types. Being able to declare something as taking a "sage.rings.integer.Integer" could also prove useful, but could result in long (and prefix-sharing) signatures, favoring the runtime-allocated ids. I do think describing Cython objects in this cross-tool CEP would work nicely, this is for standardized ABIs only (we can't do memoryviews either until their ABI is standard). It just occurred to me that an object's type can safely be represented at runtime as a pointer, i.e. an integer. 
Even if the type is heap allocated and replaced by another one later, a signature that uses that pointer value in its encoding would only ever match if both sides talk about the same type at call time (because at least one of them would hold a live reference to the type in order to actually use it).

The missing piece here is that both me and Robert are huge fans of Go-style polymorphism. If you haven't read up on that I highly recommend it; the basic idea is that if you agree on method names and their signatures, you don't have to have access to the same interface declaration (you don't have to call the interface the same thing).

Guess we should let this rest for a few days and get back to it with some benchmarks, since all we need to solve in CEP1000 is interned vs. strcmp. I'll try to do that.

Dag
Re: [Cython] CEP1000: Native dispatch through callables
On 04/15/2012 09:39 AM, Robert Bradshaw wrote: On Sat, Apr 14, 2012 at 11:58 PM, Dag Sverre Seljebotn wrote: Ah, Cython objects. Didn't think of that. More below. On 04/14/2012 11:02 PM, Stefan Behnel wrote: Hi, thanks for writing this up. Comments inline as I read through it. Dag Sverre Seljebotn, 14.04.2012 21:08: each described by a function pointer and a signature specification string, such as "id)i" for {{{int f(int, double)}}}. How do we deal with object argument types? Do we care on the caller side? Functions might have alternative signatures that differ in the type of their object parameters. Or should we handle this inside of the caller and expect that it's something like a fused function with internal dispatch in that case? Personally, I think there is not enough to gain from object parameters that we should handle it on the caller side. The callee can dispatch those if necessary. What about signatures that require an object when we have a C typed value? What about signatures that require a C typed argument when we have an arbitrary object value in our call parameters? We should also strip the "self" argument from the parameter list of methods. That's handled by the attribute lookup before even getting at the callable. On 04/15/2012 07:59 AM, Robert Bradshaw wrote: It would certainly be useful to have special syntax for memory views (after nailing down a well-defined ABI for them) and builtin types. Being able to declare something as taking a "sage.rings.integer.Integer" could also prove useful, but could result in long (and prefix-sharing) signatures, favoring the runtime-allocated ids. I do think describing Cython objects in this cross-tool CEP would work nicely, this is for standardized ABIs only (we can't do memoryviews either until their ABI is standard). I think I prefer to a) exclude it now, and b) down the line we need another cross-tool ABI to communicate vtables, and then we could put that into this CEP now. 
I strongly believe we should go with the Go "duck-typing" approach for interfaces, i.e. it is not the declared name that should be compared but the method names and signatures. The only question that needs answering for CEP1000 is: Would this blow up the signature string enough that interning is the only viable option?

Exactly.

Some strcmp solutions:

a) Hash each vtable descriptor to 160-bits, and assume the hash is unique. Still, a couple of interfaces would blow up the signature string a lot.

b) Modify approach B in CEP 1000 to this: If it is longer than 160 bits, take a full cryptographic hash, and just assume there won't be hash collisions (like git does). This still saves for short signature strings, and avoids interning at the cost of doing 160-bit comparisons.

Both of these require other ways at getting at the actual string data. But I still like b) above better than interning.

Requiring an implementation of (or at least access to) a cryptographic hash greatly complicates the spec. (On another note, even a simple hash as a prefix might be useful to prevent a lot of false partial matches, e.g. "sage.rings...") 160 * n bits starts to get large too (and we'd have to twiddle them to insert/avoid a "dash" every 16 bytes).

Do you really think it complicates the spec? SHA-1 is pretty standard, and Python ships with hashlib (the hashing part isn't performance critical).

I prefer hashing to string-interning as it can still be done compile-time etc. 160 bits isn't worse than the second-to-best strcmp case of a 256-bit function entry.

Shortening the hash to 120 bits (truncation) we could have a spec like this:

- Short signature: [64 bit encoded signature, 64 bit funcptr]
- Long signature: [64 bit hash, 64 bit pointer to full signature, 8 bit guard byte, 56 bits remaining hash, 64 bit funcptr]

Anyway: Looks like it's about time to do some benchmarks. I'll try to get around to it next week.
Dag
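Editorial sketch: the two entry layouts Dag proposes (a 128-bit short entry and a 256-bit long entry carrying a SHA-1 hash truncated to 120 bits) can be packed with `hashlib` and `struct` as below. The guard byte value, the little-endian byte order, the "shorter than 8 bytes" threshold (chosen so a 0 terminator fits in the first word) and the `sig_addr` parameter are all assumptions for illustration; the thread does not fix any of them.

```python
import hashlib
import struct

GUARD = 0x01  # hypothetical guard byte value


def encode_entry(signature: bytes, funcptr: int, sig_addr: int = 0) -> bytes:
    """Pack one table entry as 16 bytes (short) or 32 bytes (long)."""
    if len(signature) < 8:
        # Short form: signature padded with 0 bytes to 64 bits, then funcptr.
        return signature.ljust(8, b"\0") + struct.pack("<Q", funcptr)
    # Long form: SHA-1 truncated to 120 bits, split into 64 + 56 bits.
    digest = hashlib.sha1(signature).digest()[:15]
    return (
        digest[:8]                      # 64 bit hash
        + struct.pack("<Q", sig_addr)   # 64 bit pointer to full signature
        + bytes([GUARD])                # 8 bit guard byte
        + digest[8:15]                  # 56 bits remaining hash
        + struct.pack("<Q", funcptr)    # 64 bit funcptr
    )
```

Because the hash is computed from the signature bytes alone, it could be produced at compile time, which is the property Dag prefers over run-time interning.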
Re: [Cython] CEP1000: Native dispatch through callables
On 04/15/2012 10:07 AM, Dag Sverre Seljebotn wrote: On 04/15/2012 09:30 AM, Stefan Behnel wrote: Dag Sverre Seljebotn, 15.04.2012 08:58: Ah, Cython objects. Didn't think of that. More below. On 04/14/2012 11:02 PM, Stefan Behnel wrote: thanks for writing this up. Comments inline as I read through it. Dag Sverre Seljebotn, 14.04.2012 21:08: each described by a function pointer and a signature specification string, such as "id)i" for {{{int f(int, double)}}}. How do we deal with object argument types? Do we care on the caller side? Functions might have alternative signatures that differ in the type of their object parameters. Or should we handle this inside of the caller and expect that it's something like a fused function with internal dispatch in that case? Personally, I think there is not enough to gain from object parameters that we should handle it on the caller side. The callee can dispatch those if necessary. What about signatures that require an object when we have a C typed value? What about signatures that require a C typed argument when we have an arbitrary object value in our call parameters? We should also strip the "self" argument from the parameter list of methods. That's handled by the attribute lookup before even getting at the callable. On 04/15/2012 07:59 AM, Robert Bradshaw wrote: It would certainly be useful to have special syntax for memory views (after nailing down a well-defined ABI for them) and builtin types. Being able to declare something as taking a "sage.rings.integer.Integer" could also prove useful, but could result in long (and prefix-sharing) signatures, favoring the runtime-allocated ids. I do think describing Cython objects in this cross-tool CEP would work nicely, this is for standardized ABIs only (we can't do memoryviews either until their ABI is standard). It just occurred to me that an object's type can safely be represented at runtime as a pointer, i.e. an integer. 
Even if the type is heap allocated and replaced by another one later, a signature that uses that pointer value in its encoding would only ever match if both sides talk about the same type at call time (because at least one of them would hold a live reference to the type in order to actually use it).

The missing piece here is that both me and Robert are huge fans of Go-style polymorphism. If you haven't read up on that I highly recommend it; the basic idea is that if you agree on method names and their signatures, you don't have to have access to the same interface declaration (you don't have to call the interface the same thing).

Guess we should let this rest for a few days and get back to it with some benchmarks, since all we need to solve in CEP1000 is interned vs. strcmp. I'll try to do that.

Actually, Stefan's idea above is valid for Go-style interfaces too, just replace pointer with an interned string. Which is what Robert proposed too.

Dag
Re: [Cython] CEP1000: Native dispatch through callables
On Sun, Apr 15, 2012 at 9:15 AM, Dag Sverre Seljebotn wrote:
> Do you really think it complicates the spec? SHA-1 is pretty standard, and
> Python ships with hashlib (the hashing part isn't performance critical).
>
> I prefer hashing to string-interning as it can still be done compile-time
> etc. 160 bits isn't worse than the second-to-best strcmp case of a 256-bit
> function entry.

If you're *so* set on compile-time calculation, one could also accommodate these within the intern framework pretty easily. Any PyString/PyBytes * will be aligned, which means the low bit will not be set, which means there are at least 2**31 bit-patterns that will never be used by a run-time interned string. So we could write down a lookup table in the spec that assigns arbitrary, well-known numbers to every common signature. "dd->d" is 1, "ii->i" is 2, etc. If you have 15 standard types, then you can assign such an id to every 0, 1, 2, 3, 4, 5, and 6 argument function with space left over.

And this could all be abstracted away inside the intern() function. The only thing is that if you wanted to look at the characters in the interned string, you'd have to call a disintern() function instead of just following the pointer.

I still think all this stuff would be complexity for its own sake, though.

> Shortening the hash to 120 bits (truncation) we could have a spec like this:
>
> - Short signature: [64 bit encoded signature. 64 bit funcptr]
> - Long signature: [64 bit hash, 64 bit pointer to full signature,
>                    8 bit guard byte, 56 bits remaining hash,
>                    64 bit funcptr]

This is a fixed length encoding, so why does it need a guard byte?

BTW, the guard byte design in the last version of the CEP looks buggy to me -- there's no guarantee that a valid pointer might not contain the guard byte by accident. A solution would be to move the to-be-continued byte (or bit) to the first word. This would also mean that if you're looking for a one-word signature via switch(), you won't hit signatures which have your signature as a prefix. In the variable-length encoding with the lookup rule you suggested you'd also want a second bit to mark the actual beginning of each structure, so you don't get hits on the middle of structures.

> Anyway: Looks like it's about time to do some benchmarks. I'll try to get
> around to it next week.

Agreed :-).

- N
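Editorial sketch: Nathaniel's scheme of reserving odd values for well-known signatures, because an aligned string pointer never has its low bit set, might be modelled as follows. The addresses are simulated, the table contents beyond the two examples in the email are not specified anywhere, and the names `intern_sig` and `disintern` are hypothetical stand-ins for the intern()/disintern() functions he mentions.

```python
# Position + 1 is the well-known number: "dd->d" is 1, "ii->i" is 2.
_WELL_KNOWN = ["dd->d", "ii->i"]

_runtime = {}   # signature -> simulated aligned address (always even)
_by_addr = {}   # reverse mapping, used by disintern()


def intern_sig(sig):
    """Return a key that is odd for well-known signatures, even otherwise."""
    if sig in _WELL_KNOWN:
        number = _WELL_KNOWN.index(sig) + 1
        return (number << 1) | 1          # low bit set: never a real pointer
    if sig not in _runtime:
        addr = 0x1000 + 16 * len(_runtime)  # stand-in for an aligned pointer
        _runtime[sig] = addr
        _by_addr[addr] = sig
    return _runtime[sig]


def disintern(key):
    """Recover the signature text from a key produced by intern_sig()."""
    if key & 1:
        return _WELL_KNOWN[(key >> 1) - 1]
    return _by_addr[key]
```

Counting 15 argument types for 0 through 6 arguments, and additionally assuming 15 possible return types, gives `15 * sum(15**k for k in range(7))` signatures, which stays below 2**31, consistent with the "space left over" remark.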
Re: [Cython] CEP1000: Native dispatch through callables
On Sun, Apr 15, 2012 at 9:07 AM, Dag Sverre Seljebotn wrote: > On 04/15/2012 09:30 AM, Stefan Behnel wrote: >> >> Dag Sverre Seljebotn, 15.04.2012 08:58: >>> >>> Ah, Cython objects. Didn't think of that. More below. >>> >>> On 04/14/2012 11:02 PM, Stefan Behnel wrote: thanks for writing this up. Comments inline as I read through it. Dag Sverre Seljebotn, 14.04.2012 21:08: > > each described by a function pointer and a signature specification > string, such as "id)i" for {{{int f(int, double)}}}. How do we deal with object argument types? Do we care on the caller side? Functions might have alternative signatures that differ in the type of their object parameters. Or should we handle this inside of the caller and expect that it's something like a fused function with internal dispatch in that case? Personally, I think there is not enough to gain from object parameters that we should handle it on the caller side. The callee can dispatch those if necessary. What about signatures that require an object when we have a C typed value? What about signatures that require a C typed argument when we have an arbitrary object value in our call parameters? We should also strip the "self" argument from the parameter list of methods. That's handled by the attribute lookup before even getting at the callable. >>> >>> >>> On 04/15/2012 07:59 AM, Robert Bradshaw wrote: It would certainly be useful to have special syntax for memory views (after nailing down a well-defined ABI for them) and builtin types. Being able to declare something as taking a "sage.rings.integer.Integer" could also prove useful, but could result in long (and prefix-sharing) signatures, favoring the runtime-allocated ids. >>> >>> >>> I do think describing Cython objects in this cross-tool CEP would work >>> nicely, this is for standardized ABIs only (we can't do memoryviews >>> either >>> until their ABI is standard). 
>> It just occurred to me that an object's type can safely be represented at
>> runtime as a pointer, i.e. an integer. Even if the type is heap allocated
>> and replaced by another one later, a signature that uses that pointer value
>> in its encoding would only ever match if both sides talk about the same
>> type at call time (because at least one of them would hold a live reference
>> to the type in order to actually use it).
>
> The missing piece here is that both me and Robert are huge fans of Go-style
> polymorphism. If you haven't read up on that I highly recommend it, basic
> idea is if you agree on method names and their signatures, you don't have to
> have access to the same interface declaration (you don't have to call the
> interface the same thing).

Go style polymorphism is certainly a neat idea, but two points:

- You can't do this kind of matching via signature comparison. If I have a type with methods "foo", "bar" and "baz", then that should match the interface {"foo", "bar", "baz"}, but also {"foo", "bar"}, {"foo", "baz"}, {"bar"}, {}, etc. To find the right function for such a type, you need to decode each function signature and check them in some structured way. Unless your plan is to precompute the hash of all 2**n interfaces that each object fulfills.

- Adding a whole new type system with polymorphic dispatch is a heck of a thing to do in a spec for boxing and unboxing pointers. Honestly at this level I'm even leery of describing Python objects via their type, as opposed to just "PyObject *". Just let the callee do the type checking if they need to, and if it later turns out that there are actually enough cases where Cython knows the exact type at compile time and is dispatching through a boxed pointer and the callee type checking is significant overhead, then extend the spec then.

-- Nathaniel
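Editorial sketch: Nathaniel's first point, that structural (Go-style) matching is a subset test and not an equality test, can be shown in a few lines. Methods are reduced to bare names here; a real check would also compare each method's signature. The function names are hypothetical.

```python
from itertools import combinations


def satisfies(type_methods, interface):
    """A type satisfies an interface if it provides every required method."""
    return set(interface) <= set(type_methods)


def all_satisfied_interfaces(type_methods):
    """Enumerate every interface a type fulfils: all subsets of its methods."""
    methods = sorted(type_methods)
    return [
        frozenset(combo)
        for size in range(len(methods) + 1)
        for combo in combinations(methods, size)
    ]
```

A type with the three methods foo, bar and baz fulfils 2**3 = 8 interfaces, so matching by comparing a single hash or interned string would need all eight precomputed, which is the cost the email points out.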
Re: [Cython] CEP1000: Native dispatch through callables
Nathaniel Smith wrote:
>On Sun, Apr 15, 2012 at 9:15 AM, Dag Sverre Seljebotn wrote:
>> Do you really think it complicates the spec? SHA-1 is pretty standard, and
>> Python ships with hashlib (the hashing part isn't performance critical).
>>
>> I prefer hashing to string-interning as it can still be done compile-time
>> etc. 160 bits isn't worse than the second-to-best strcmp case of a 256-bit
>> function entry.
>
>If you're *so* set on compile-time calculation, one could also
>accommodate these within the intern framework pretty easily. Any
>PyString/PyBytes * will be aligned, which means the low bit will not
>be set, which means there are at least 2**31 bit-patterns that will
>never be used by a run-time interned string. So we could write down a
>lookup table in the spec that assigns arbitrary, well-known numbers to
>every common signature. "dd->d" is 1, "ii->i" is 2, etc. If you have
>15 standard types, then you can assign such an id to every 0, 1, 2, 3,
>4, 5, and 6 argument function with space left over.
>
>And this could all be abstracted away inside the intern() function.
>The only thing is that if you wanted to look at the characters in the
>interned string, you'd have to call a disintern() function instead of
>just following the pointer.
>
>I still think all this stuff would be complexity for its own sake,
>though.
>
>> Shortening the hash to 120 bits (truncation) we could have a spec like this:
>>
>> - Short signature: [64 bit encoded signature. 64 bit funcptr]
>> - Long signature: [64 bit hash, 64 bit pointer to full signature,
>>                    8 bit guard byte, 56 bits remaining hash,
>>                    64 bit funcptr]
>
>This is a fixed length encoding, so why does it need a guard byte?

No, there are two cases, one 128 bit and one 256 bit.

>BTW, the guard byte design in the last version of the CEP looks buggy
>to me -- there's no guarantee that a valid pointer might not contain
>the guard byte by accident. A solution would be to move the

In the CEP text some posts ago? I am pretty sure I made sure that pointers would never be looked at -- you are supposed to scan in 128 bit jumps and will never look at the beginning of a pointer. Read it again and see if you can make a counterexample...

That is the reason the above works, and why I split the hash in two segments.

>to-be-continued byte (or bit) to the first word. This would also mean
>that if you're looking for a one-word signature via switch(), you
>won't hit signatures which have your signature as a prefix. In the

You need 0-termination to be part of the signature (and if the 0 spills over, you spill over). I should have said that, good catch.

Dag

>variable-length encoding with the lookup rule you suggested you'd also
>want a second bit to mark the actual beginning of each structure, so
>you don't get hits on the middle of structures.
>
>> Anyway: Looks like it's about time to do some benchmarks. I'll try to get
>> around to it next week.
>
>Agreed :-).
>
>- N
Re: [Cython] CEP1000: Native dispatch through callables
Nathaniel Smith wrote: >On Sun, Apr 15, 2012 at 9:07 AM, Dag Sverre Seljebotn > wrote: >> On 04/15/2012 09:30 AM, Stefan Behnel wrote: >>> >>> Dag Sverre Seljebotn, 15.04.2012 08:58: Ah, Cython objects. Didn't think of that. More below. On 04/14/2012 11:02 PM, Stefan Behnel wrote: > > thanks for writing this up. Comments inline as I read through it. > > Dag Sverre Seljebotn, 14.04.2012 21:08: >> >> each described by a function pointer and a signature >specification >> string, such as "id)i" for {{{int f(int, double)}}}. > > > How do we deal with object argument types? Do we care on the >caller > side? > Functions might have alternative signatures that differ in the >type of > their object parameters. Or should we handle this inside of the >caller > and > expect that it's something like a fused function with internal >dispatch > in > that case? > > Personally, I think there is not enough to gain from object >parameters > that > we should handle it on the caller side. The callee can dispatch >those if > necessary. > > What about signatures that require an object when we have a C >typed > value? > > What about signatures that require a C typed argument when we have >an > arbitrary object value in our call parameters? > > We should also strip the "self" argument from the parameter list >of > methods. That's handled by the attribute lookup before even >getting at > the > callable. On 04/15/2012 07:59 AM, Robert Bradshaw wrote: > > It would certainly be useful to have special syntax for memory >views > (after nailing down a well-defined ABI for them) and builtin >types. > Being able to declare something as taking a > "sage.rings.integer.Integer" could also prove useful, but could >result > in long (and prefix-sharing) signatures, favoring the > runtime-allocated ids. I do think describing Cython objects in this cross-tool CEP would >work nicely, this is for standardized ABIs only (we can't do memoryviews either until their ABI is standard). 
>>> It just occurred to me that an object's type can safely be represented at
>>> runtime as a pointer, i.e. an integer. Even if the type is heap allocated
>>> and replaced by another one later, a signature that uses that pointer
>>> value in its encoding would only ever match if both sides talk about the
>>> same type at call time (because at least one of them would hold a live
>>> reference to the type in order to actually use it).
>>
>> The missing piece here is that both me and Robert are huge fans of Go-style
>> polymorphism. If you haven't read up on that I highly recommend it, basic
>> idea is if you agree on method names and their signatures, you don't have
>> to have access to the same interface declaration (you don't have to call
>> the interface the same thing).
>
> Go style polymorphism is certainly a neat idea, but two points:
>
> - You can't do this kind of matching via signature comparison. If I
> have a type with methods "foo", "bar" and "baz", then that should
> match the interface {"foo", "bar", "baz"}, but also {"foo", "bar"},
> {"foo", "baz"}, {"bar"}, {}, etc. To find the right function for such
> a type, you need to decode each function signature and check them in
> some structured way. Unless your plan is to precompute the hash of all
> 2**n interfaces that each object fulfills.

You are of course right; this needs a lot more thought.

> - Adding a whole new type system with polymorphic dispatch is a heck
> of a thing to do in a spec for boxing and unboxing pointers. Honestly
> at this level I'm even leery of describing Python objects via their
> type, as opposed to just "PyObject *". Just let the callee do the type
> checking if they need to, and if it later turns out that there are
> actually enough cases where Cython knows the exact type at compile
> time and is dispatching through a boxed pointer and the callee type
> checking is significant overhead, then extend the spec then.
We are not insane; it's been said several times that this goes in a later spec. We're just trying to guess whether future developments would seriously impact intern vs. strcmp, i.e. what a likely signature length is in the future. We make CEP1000 a simple spec, but spend some time trying to guess how it could be extended.

Dag

___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel
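To make Nathaniel's first point concrete, here is a small Python sketch (not part of the CEP). The names `exact_key`, `satisfies` and `fulfilled_interface_count` are made up for illustration, and the signature strings are only placeholders in the "id)i" style. It shows that a single comparable key (whether interned or compared with strcmp) only answers "same method table?", while a Go-style match is a subset check over the decoded table.

```python
def exact_key(methods):
    """Build one comparable key for a whole method table.

    This stands in for what interning or strcmp would compare:
    two tables match only if they are identical.
    """
    return ";".join(f"{name}:{sig}" for name, sig in sorted(methods.items()))


def satisfies(type_methods, interface):
    """Go-style structural match: every method the interface names must
    exist on the type with the same signature. Extra methods are fine."""
    return all(type_methods.get(name) == sig for name, sig in interface.items())


def fulfilled_interface_count(type_methods):
    """Number of method subsets, i.e. interfaces, the type fulfills (2**n)."""
    return 2 ** len(type_methods)


# Hypothetical type with three methods and placeholder signatures.
obj_methods = {"foo": "i)i", "bar": "d)d", "baz": "O)O"}
wanted = {"foo": "i)i", "bar": "d)d"}

print(exact_key(obj_methods) == exact_key(wanted))  # exact comparison
print(satisfies(obj_methods, wanted))               # structural comparison
print(fulfilled_interface_count(obj_methods))       # subsets to precompute
```

The exact comparison rejects the pair even though the type fulfills the interface, which is why key comparison alone would need one precomputed key per subset.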
Re: [Cython] CEP1000: Native dispatch through callables
On 15 April 2012 07:26, Stefan Behnel wrote:
> mark florisson, 14.04.2012 23:15:
>> On 14 April 2012 22:02, Stefan Behnel wrote:
>>> Dag Sverre Seljebotn, 14.04.2012 21:08:
>>>> * TBD: Support for Cython-specific constructs like memoryview slices (so
>>>>   that arrays with strides and shape can be passed faster than passing an
>>>>   {{{"O"}}}).
>>>
>>> Is this really Cython specific or would a generic Py_buffer struct work?
>>
>> That could work through simple unboxing wrapper functions, but it
>> would add some overhead, specifically because it would have to check
>> the buffer's object, and if it didn't exist or was not a memoryview
>> object, it would have to create one (checking whether something is a
>> memoryview object would also be a pain, as each module has a different
>> memoryview type). That could still be feasible for interaction with
>> Cython functions from non-Cython code.
>
> Hmm, I don't get it. Isn't the overhead always there when a memory view is
> requested in the signature? You'd have to create one for each call and that
> seriously hurts the efficiency. Is that a common use case? Why would you
> want to do more than passing unboxed buffers?
>
> Stefan

So, if you're going to accept Py_buffer *buf (which is useful in itself), then to use memoryviews you have to copy over some shape/strides/suboffsets and the data pointer, which is not a big deal. But you also want a memoryview object associated with the memoryview slice, one that keeps things around like the format string, function pointers to convert the dtype to and from Python objects, and a reference (acquisition) count or a lock in case atomics are not supported by the compiler (or Cython doesn't know about the compiler). So if buf->obj is not a memoryview object, the callee will have to create one, and the caller will have to convert a slice to a new Py_buffer struct.
Arguably, the memoryview implementation is not optimal; it should have a memoryview struct with that data, making it somewhat less expensive.

Finally, what are the semantics for Py_buffer? Will the callee own the buffer, or will it borrow it? If they will borrow, then the compiler will have to figure out whether it will need to own it (or be slower and always own it), and acquire the buffer through buf->obj. At least it won't have to validate the buffer, which is the most expensive part.

I think in many cases you want to borrow though, but if you want to always own, the caller could do something more efficient if releasebuffer is not implemented, like simply incref buf->obj and pass in a pointer to a copy of the Py_buffer. I think borrowing is probably the easiest and most sane way though.
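As a rough Python-level analogy of the conversion described above (not Cython's actual implementation), the sketch below copies the cheap layout fields out of a buffer and creates the associated memoryview object only when the argument is not one already. `SliceSketch` and `slice_from_buffer` are hypothetical names; the real slice struct lives in C and also carries a data pointer and an acquisition count.

```python
from dataclasses import dataclass


@dataclass
class SliceSketch:
    """Hypothetical stand-in for a memoryview slice."""
    shape: tuple
    strides: tuple
    suboffsets: tuple
    owner: memoryview  # keeps the buffer alive and carries the format string


def slice_from_buffer(obj):
    """Build a slice from any buffer-supporting object.

    If the caller already passes a memoryview, reuse it; otherwise a new
    one has to be created, which is the extra per-call cost discussed above.
    """
    view = obj if isinstance(obj, memoryview) else memoryview(obj)
    # Copying shape/strides/suboffsets is the cheap part.
    return SliceSketch(view.shape, view.strides, view.suboffsets, view)


raw = bytearray(12)
fresh = slice_from_buffer(raw)       # has to create a memoryview
existing = memoryview(raw)
reused = slice_from_buffer(existing)  # reuses the one it was given
print(fresh.shape, fresh.strides, fresh.owner.format)
print(reused.owner is existing)
```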
Re: [Cython] CEP1000: Native dispatch through callables
mark florisson, 15.04.2012 13:30:
> Finally, what are the semantics for Py_buffer? Will the callee own the
> buffer, or will it borrow it? If they will borrow, then the compiler
> will have to figure out whether it will need to own it (or be slower
> and always own it), and acquire the buffer through buf->obj. At least
> it won't have to validate the buffer, which is the most expensive
> part.
> I think in many cases you want to borrow though, but if you want to
> always own, the caller could do something more efficient if
> releasebuffer is not implemented, like simply incref buf->obj and pass
> in a pointer to a copy of the Py_buffer. I think borrowing is probably
> the easiest and most sane way though.

I think that's easy. If you request and unpack a buffer yourself, you own
it. If you receive an unpacked buffer from someone else as a call argument,
you borrow it, and you know that your caller (or the caller of your caller,
etc.) owns it and keeps it alive until you return. If you receive it as
return value of a function call, it's less clear, but my intuition tells me
that you'd normally either receive an owned Python object or a borrowed
unpacked buffer.

In the case at hand, you'd always receive a borrowed buffer from the caller
as argument.

Stefan
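The own/borrow rule above can be illustrated at the Python level with the built-in memoryview, as a loose analogy to the C-level Py_buffer case. The function names `checksum` and `process` are invented for this sketch: the caller requests the buffer and releases it, and the callee only uses it for the duration of the call.

```python
def checksum(view):
    """Callee: receives an already acquired buffer as a call argument,
    so it only borrows it and must not release it."""
    return sum(view)


def process(obj):
    """Caller: requests the buffer itself, so it owns it and is
    responsible for releasing it once the call has returned."""
    with memoryview(obj) as view:   # acquire; released when the block exits
        return checksum(view)       # buffer stays valid for the whole call


data = bytearray(b"\x01\x02\x03")
total = process(data)
print(total)

# While a view is held the exporter is pinned: resizing is refused.
held = memoryview(data)
try:
    data.append(4)
    resized_while_held = True
except BufferError:
    resized_while_held = False
held.release()

data.append(4)  # fine again once the owner has released the buffer
```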
Re: [Cython] CEP1000: Native dispatch through callables
On 15 April 2012 12:40, Stefan Behnel wrote:
> mark florisson, 15.04.2012 13:30:
>> Finally, what are the semantics for Py_buffer? Will the callee own the
>> buffer, or will it borrow it? If they will borrow, then the compiler
>> will have to figure out whether it will need to own it (or be slower
>> and always own it), and acquire the buffer through buf->obj. At least
>> it won't have to validate the buffer, which is the most expensive
>> part.
>> I think in many cases you want to borrow though, but if you want to
>> always own, the caller could do something more efficient if
>> releasebuffer is not implemented, like simply incref buf->obj and pass
>> in a pointer to a copy of the Py_buffer. I think borrowing is probably
>> the easiest and most sane way though.
>
> I think that's easy. If you request and unpack a buffer yourself, you own
> it. If you receive an unpacked buffer from someone else as a call argument,
> you borrow it, and you know that your caller (or the caller of your caller,
> etc.) owns it and keeps it alive until you return. If you receive it as
> return value of a function call, it's less clear, but my intuition tells me
> that you'd normally either receive an owned Python object or a borrowed
> unpacked buffer.
>
> In the case at hand, you'd always receive a borrowed buffer from the caller
> as argument.
>
> Stefan

That makes sense, but it means a lot of overhead for memoryview slices,
which I think justifies syntax for custom types in general.
[Cython] Cython 0.16 RC 2
Hopefully a final release candidate for the 0.16 release can be found here: http://wiki.cython.org/ReleaseNotes-0.16 . This corresponds to the 'release' branch of the cython repository on github.
Re: [Cython] CEP1000: Native dispatch through callables
Stefan Behnel wrote:
> It wasn't really a proposed syntax, I guess, more of a way to write down an
> example.

That's okay, although you might want to mention in the PEP that the actual
syntax is yet to be determined. Being a PEP, anything it says tends to come
across as being a specification otherwise.

--
Greg
Re: [Cython] CEP1000: Native dispatch through callables
Robert Bradshaw wrote:
> Brevity, especially if the signature is inlined. (Encoding could take care
> of this by, e.g. ignoring the redundant opening, or we could just write
> di=d.)

Yes, I was thinking in terms of replacing the paren with some other
character, rather than inserting more parens.

--
Greg
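For readers following the syntax question, here is a minimal Python sketch of splitting such a signature string at a single separator character. It assumes one character per type code and that the argument codes come before the separator, as in the "id)i" example for int f(int, double). Reading "di=d" with "=" as the separator is only one possible interpretation, since the actual syntax was still undecided. `split_signature` is a hypothetical helper, not something defined by the CEP.

```python
def split_signature(sig, sep=")"):
    """Split 'ARGS<sep>RETURN' into (argument codes, return codes).

    Assumes single-character type codes; raises if the separator is missing.
    """
    args, found, ret = sig.partition(sep)
    if not found:
        raise ValueError(f"no {sep!r} separator in {sig!r}")
    return tuple(args), tuple(ret)


print(split_signature("id)i"))           # the paren-separated form
print(split_signature("di=d", sep="="))  # same idea with another character
```

Swapping the separator changes nothing in the decoding logic, which is consistent with treating the choice of character as a question of brevity and readability only.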