On 04/15/2012 09:39 AM, Robert Bradshaw wrote:
On Sat, Apr 14, 2012 at 11:58 PM, Dag Sverre Seljebotn
<d.s.seljeb...@astro.uio.no> wrote:
Ah, Cython objects. Didn't think of that. More below.
On 04/14/2012 11:02 PM, Stefan Behnel wrote:
Hi,
thanks for writing this up. Comments inline as I read through it.
Dag Sverre Seljebotn, 14.04.2012 21:08:
each described by a function pointer and a signature specification
string, such as "id)i" for {{{int f(int, double)}}}.
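
For illustration, here is a minimal Python sketch of how such a signature string could be built. Only the codes "i" (int) and "d" (double) and the ")" separator come from the example above; the other table entries and the name encode_signature are assumptions made up for this sketch, not part of the CEP.

```python
# Hypothetical type-code table: only "i" and "d" are taken from the
# example in the text; "f" and "l" are invented for illustration.
TYPE_CODES = {"int": "i", "double": "d", "float": "f", "long": "l"}


def encode_signature(arg_types, return_type):
    """Concatenate the argument codes, then ")", then the return code."""
    args = "".join(TYPE_CODES[t] for t in arg_types)
    return args + ")" + TYPE_CODES[return_type]


# int f(int, double)
print(encode_signature(["int", "double"], "int"))  # id)i
```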
How do we deal with object argument types? Do we care on the caller side?
Functions might have alternative signatures that differ in the type of
their object parameters. Or should we handle this inside of the caller and
expect that it's something like a fused function with internal dispatch in
that case?
Personally, I don't think there is enough to gain from object parameters
to justify handling them on the caller side. The callee can dispatch
those if necessary.
What about signatures that require an object when we have a C typed value?
What about signatures that require a C typed argument when we have an
arbitrary object value in our call parameters?
We should also strip the "self" argument from the parameter list of
methods. That's handled by the attribute lookup before even getting at the
callable.
On 04/15/2012 07:59 AM, Robert Bradshaw wrote:
It would certainly be useful to have special syntax for memory views
(after nailing down a well-defined ABI for them) and builtin types.
Being able to declare something as taking a
"sage.rings.integer.Integer" could also prove useful, but could result
in long (and prefix-sharing) signatures, favoring the
runtime-allocated ids.
I don't think describing Cython objects in this cross-tool CEP would work
nicely; this is for standardized ABIs only (we can't do memoryviews either
until their ABI is standard).
I think I prefer to a) exclude it now, and b) down the line we need another
cross-tool ABI to communicate vtables, and then we could put that into this
CEP.
I strongly believe we should go with the Go "duck-typing" approach for
interfaces, i.e. it is not the declared name that should be compared but the
method names and signatures.
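
As a rough sketch of what comparing by structure rather than by declared name could look like: the interface key below is built only from method names and their signature strings. The function name interface_key, the method names, and the signature strings are all hypothetical examples, not something the CEP defines.

```python
def interface_key(methods):
    """Build an order-independent key from a {method name: signature} map.

    The declared type name is deliberately not part of the key.
    """
    return tuple(sorted(methods.items()))


# Two types declared under different (hypothetical) names but exposing
# the same methods with the same signatures.
vector_a = {"add": "ii)i", "norm": ")d"}
vector_b = {"norm": ")d", "add": "ii)i"}

print(interface_key(vector_a) == interface_key(vector_b))  # True
```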
The only question that needs answering for CEP1000 is: Would this blow up
the signature string enough that interning is the only viable option?
Exactly.
Some strcmp solutions:
a) Hash each vtable descriptor to 160 bits, and assume the hash is unique.
Still, a couple of interfaces would blow up the signature string a lot.
b) Modify approach B in CEP 1000 to this: If the signature string is longer
than 160 bits, take a full cryptographic hash, and just assume there won't
be hash collisions (like git does). This still saves for short signature
strings, and avoids interning at the cost of doing 160-bit comparisons.
Both of these require other ways of getting at the actual string data. But I
still like b) above better than interning.
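
A minimal sketch of option b), using hashlib from the standard library: signatures of at most 160 bits are kept verbatim, longer ones are replaced by their SHA-1 digest. Padding short signatures with NUL bytes to a fixed width is an assumption of this sketch, as is the name signature_key.

```python
import hashlib

MAX_INLINE_BYTES = 20  # 160 bits, the size of a SHA-1 digest


def signature_key(signature: bytes) -> bytes:
    """Return a fixed-width 160-bit comparison key for a signature string."""
    if len(signature) <= MAX_INLINE_BYTES:
        # Short signature: keep it verbatim, NUL-padded (assumed padding).
        return signature.ljust(MAX_INLINE_BYTES, b"\0")
    # Long signature: hash it and assume no collisions, as git does.
    return hashlib.sha1(signature).digest()


short_key = signature_key(b"id)i")
long_key = signature_key(b"O" * 64 + b")O")  # made-up long signature
print(len(short_key), len(long_key))  # 20 20
```

The caller then only ever compares 160-bit keys; recovering the full string for a hashed signature needs a separate lookup path.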
Requiring an implementation of (or at least access to) a cryptographic
hash greatly complicates the spec. (On another note, even a simple
hash as a prefix might be useful to prevent a lot of false partial
matches, e.g. "sage.rings...") 160 * n bits starts to get large too
(and we'd have to twiddle them to insert/avoid a "dash" every 16
bytes).
Do you really think it complicates the spec? SHA-1 is pretty standard,
and Python ships with hashlib (the hashing part isn't performance critical).
I prefer hashing to string-interning as it can still be done
compile-time etc. 160 bits isn't worse than the second-to-best strcmp
case of a 256-bit function entry.
Shortening the hash to 120 bits (truncation), we could have a spec like this:
- Short signature: [64 bit encoded signature, 64 bit funcptr]
- Long signature: [64 bit hash, 64 bit pointer to full signature,
8 bit guard byte, 56 bits remaining hash,
64 bit funcptr]
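
To make the two layouts concrete, here is a hedged sketch that packs them with Python's struct module. The field order follows the list above; the little-endian byte order, the guard byte value, and the names pack_short and pack_long are assumptions of this sketch. The pointers are plain integers standing in for real addresses.

```python
import hashlib
import struct

GUARD_BYTE = 0x00  # assumed value; the proposal above does not fix one


def pack_short(encoded_sig: int, funcptr: int) -> bytes:
    """[64 bit encoded signature, 64 bit funcptr] -> 16 bytes."""
    return struct.pack("<QQ", encoded_sig, funcptr)


def pack_long(signature: bytes, sig_ptr: int, funcptr: int) -> bytes:
    """[64 bit hash, 64 bit pointer, guard byte, 56 bit hash, funcptr]."""
    digest = hashlib.sha1(signature).digest()[:15]  # truncate to 120 bits
    head, tail = digest[:8], digest[8:]             # 64 bits + 56 bits
    return (head
            + struct.pack("<Q", sig_ptr)
            + bytes([GUARD_BYTE])
            + tail
            + struct.pack("<Q", funcptr))


short_entry = pack_short(0x6964296900000000, 0x1000)
long_entry = pack_long(b"O" * 64 + b")O", 0x2000, 0x3000)
print(len(short_entry) * 8, len(long_entry) * 8)  # 128 256
```

The long entry comes out at 256 bits, which matches the "256-bit function entry" mentioned above.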
Anyway: Looks like it's about time to do some benchmarks. I'll try to
get around to it next week.
Dag
_______________________________________________
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel