Re: [Cython] Bug report: enumerate does not accept the "start" argument
Russell Warren, 08.05.2012 08:25:
> Python's built-in function 'enumerate' has a lesser-known 2nd argument
> that allows the start value of the enumeration to be set. See the python
> docs here:
>
> http://docs.python.org/library/functions.html#enumerate
>
> Cython 0.16 doesn't like it, and only allows one argument.
>
> Here is a simple file to reproduce the failure:
>
>     for i in enumerate("abc", 1):
>         print i
>
> And the resulting output complaint:
>
>     Error compiling Cython file:
>     ...
>     for i in enumerate("abc", 1):
>                       ^
>
>     deploy/_working/_cython_test.pyx:1:18: enumerate() takes at most 1 argument

Thanks for the report, here is a fix:

https://github.com/cython/cython/commit/2e3a306d0b624993d41a02f790725d8b2100e57d

> I have requested a trac login to file bugs like this, but the request is
> pending (just sent).

Please file it anyway (when you get your account) so that we can document
in the tracker that it's fixed.

Stefan

___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel
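For reference, the `start` argument behaves as follows in plain CPython, which is the behaviour the fix restores in Cython (Python 3 syntax shown here; the bug report above uses Python 2's `print`):

```python
from itertools import count

# enumerate(iterable, start) numbers items from `start` instead of 0.
items = list(enumerate("abc", 1))
print(items)  # [(1, 'a'), (2, 'b'), (3, 'c')]

# It is equivalent to zipping with a counter that begins at `start`:
assert items == list(zip(count(1), "abc"))
```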
Re: [Cython] buffer syntax vs. memory view syntax
On 05/07/2012 11:21 PM, mark florisson wrote:
> On 7 May 2012 19:40, Dag Sverre Seljebotn wrote:
>> mark florisson wrote:
>>> On 7 May 2012 17:00, Dag Sverre Seljebotn wrote:
>>>> On 05/07/2012 04:16 PM, Stefan Behnel wrote:
>>>>> Stefan Behnel, 07.05.2012 15:04:
>>>>>> Dag Sverre Seljebotn, 07.05.2012 13:48:
>>>>>>> BTW, with the coming of memoryviews, me and Mark talked about just
>>>>>>> deprecating the "mytype[...]" meaning buffers, and rather treat it
>>>>>>> as np.ndarray, array.array etc. being some sort of "template
>>>>>>> types". That is, we disallow "object[int]" and require some
>>>>>>> special declarations in the relevant pxd files.
>>>>>>
>>>>>> Hmm, yes, it's unfortunate that we have two different types of
>>>>>> syntax now, one that declares the item type before the brackets and
>>>>>> one that declares it afterwards.
>>>>>
>>>>> I actually think this merits some more discussion. Should we
>>>>> consider the buffer interface syntax deprecated and focus on the
>>>>> memory view syntax?
>>>>
>>>> I think that's the very-long-term intention. Then again, it may be
>>>> too early to really tell yet, we just need to see how the memory
>>>> views play out in real life and whether they'll be able to replace
>>>> np.ndarray[double] among real users. We don't want to shove things
>>>> down users' throats.
>>>>
>>>> But the use of the trailing-[] syntax needs some cleaning up. Me and
>>>> Mark agreed we'd put this proposal forward when we got around to it:
>>>>
>>>>  - Deprecate the "object[double]" form, where [dtype] can be stuck
>>>>    on any extension type
>>>>
>>>>  - But, do NOT (for the next year at least) deprecate
>>>>    np.ndarray[double], array.array[double], etc. Basically, there
>>>>    should be a magic flag in extension type declarations saying
>>>>    "I can be a buffer".
>>>>
>>>> For one thing, that is sort of needed to open up things for
>>>> templated cdef classes/fused types cdef classes, if that is ever
>>>> implemented.
>>>
>>> Deprecating is definitely a good start. I think at least if you only
>>> allow two types as buffers it will be at least reasonably clear when
>>> one is dealing with fused types or buffers.
>>>
>>> Basically, I think memoryviews should live up to demands of the
>>> users, which would mean there would be no reason to keep the buffer
>>> syntax.
>>
>> But they are different approaches -- use a different type/API, or just
>> try to speed up parts of NumPy..
>>
>>> One thing to do is make memoryviews coerce cheaply back to the
>>> original objects if wanted (which is likely). Writing
>>> np.asarray(mymemview) is kind of annoying.
>>
>> It is going to be very confusing to have type(mymemview),
>> repr(mymemview), and so on come out as NumPy arrays, but not have the
>> full API of NumPy. Unless you auto-convert on getattr too...
>
> Yeah, the idea is very simple, as you mention: just keep the object
> around cached, and when you slice, construct one lazily.
>
>> If you want to eradicate the distinction between the backing array and
>> the memory view and make it transparent, I really suggest you kick
>> np.ndarray back alive (it can exist in some 'unrealized' state with
>> delayed construction after slicing, and so on). Implementation is much
>> the same either way; it is all about how it is presented to the user.
>
> You mean the buffer syntax?
>
>> Something like mymemview.asobject() could work though, and while not
>> much shorter, it would have some polymorphism that np.asarray does not
>> have (based probably on some custom PEP 3118 extension)
>
> I was thinking you could allow the user to register a callback, and use
> that to coerce from a memoryview back to an object (given a memoryview
> object). For numpy this would be np.asarray, and the implementation is
> allowed to cache the result (which it will). It may be too magicky
> though... but it will be convenient. The memoryview will act as a
> subclass, meaning that any of its methods will override methods of the
> converted object.

My point was that this seems *way* too magicky.

Beyond "confusing users" and so on, which is sort of subjective, here's a
fundamental problem for you: we're making it very difficult to type-infer
memoryviews. Consider:

    cdef double[:] x = ...
    y = x
    print y.shape

Now, because y is not typed, you're semantically throwing in a conversion
on line 2, so that line 3 says that you want the attribute access to be
invoked on "whatever object x coerced back to". And we have no idea what
kind of object that is.

If you don't transparently convert to object, it'd be safe to
automatically infer y as a double[:].

On a related note, I've said before that I dislike the notion of

    cdef double[:] mview = obj

I'd rather like

    cdef double[:] mview = double[:](obj)

I support Robert in that "np.ndarray[double]" is the syntax to use when
you want this kind of transparent "be an object when I need to and a
memory view when I need to".

Proposal:

 1) We NEVER deprecate "np.ndarray[double]"; we commit to keeping it in
    the language. It means exactly what you would like double[:] to mean,
    i.e. a variable that is a memoryview when you need it to be and an
    object otherwise. When you use this type, you bear the consequences
    of early-binding things that
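The cheap round-trip mark asks for has a rough analogue in plain CPython, where a `memoryview` keeps a reference to its backing object. This is only an illustration of the caching idea, not the Cython memoryview implementation:

```python
import array

a = array.array('d', [1.0, 2.0, 3.0])
mv = memoryview(a)   # cheap view on the buffer, no copy

# The view does not grow the owner's API...
assert not hasattr(mv, 'append')

# ...but the backing object stays reachable, so coercing back is free:
assert mv.obj is a

# Slicing creates a new view; the original exporter is still reachable
# through the slice as well.
assert mv[1:].obj is a
```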
Re: [Cython] buffer syntax vs. memory view syntax
Dag Sverre Seljebotn, 08.05.2012 09:57:
> [...]
>
>     cdef double[:] x = ...
>     y = x
>     print y.shape
>
> Now, because y is not typed, you're semantically throwing in a
> conversion on line 2, so that line 3 says that you want the attribute
> access to be invoked on "whatever object x coerced back to". And we
> have no idea what kind of object that is.
>
> If you don't transparently convert to object, it'd be safe to
> automatically infer y as a double[:].

Why can't y be inferred as the type of x due to the assignment?

> On a related note, I've said before that I dislike the notion of
>
>     cdef double[:] mview = obj
>
> I'd rather like
>
>     cdef double[:] mview = double[:](obj)

Why? We currently allow

    cdef char* s = some_py_bytes_string

Auto-coercion is a serious part of the language, and I don't see the
advantage of requiring the redundancy in the case above. It's clear
enough to me what the typed assignment is intended to mean: get me a
buffer view on the object, regardless of what it is.
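Stefan's inference question mirrors how plain Python assignment already behaves: binding never converts, so the view's own attributes remain available on the new name. An illustration in pure Python, not Cython's type inference:

```python
import array

x = memoryview(array.array('d', [1.0, 2.0, 3.0]))
y = x   # assignment binds the same object; no coercion happens

# y is still the same view, so view attributes like .shape work:
assert type(y) is memoryview
assert y.shape == (3,)
```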
Re: [Cython] buffer syntax vs. memory view syntax
On 05/08/2012 10:18 AM, Stefan Behnel wrote:
> [...]
>
> Why can't y be inferred as the type of x due to the assignment?
>
> [...]
>
> Why? We currently allow
>
>     cdef char* s = some_py_bytes_string
>
> Auto-coercion is a serious part of the language, and I don't see the
> advantage of requiring the redundancy in the case above. It's clear
> enough to me what the typed assignment is intended to mean: get me a
> buffer view on the object, regardless of what it is.

Good point. I admit defeat.

There's a slight difference in that there's more of a 1:1 between a bytes
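Dag's (truncated) point about bytes and char* being closer to 1:1 can be seen in plain Python: many distinct owner types export the very same buffer, so a view alone does not pin down which object type to coerce back to. An illustrative sketch only:

```python
import array

# Three different owner types expose an identical buffer...
owners = [b"abc", bytearray(b"abc"), array.array('b', [97, 98, 99])]
views = [memoryview(o) for o in owners]
assert all(v.tobytes() == b"abc" for v in views)

# ...so the buffer contents alone cannot identify the owner's type.
assert len({type(o) for o in owners}) == 3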
Re: [Cython] buffer syntax vs. memory view syntax
Dag Sverre Seljebotn, 08.05.2012 10:36:
> [...]
Re: [Cython] buffer syntax vs. memory view syntax
On 8 May 2012 09:49, Stefan Behnel wrote:
> [...]
Re: [Cython] buffer syntax vs. memory view syntax
On 8 May 2012 09:36, Dag Sverre Seljebotn wrote: > On 05/08/2012 10:18 AM, Stefan Behnel wrote: >> >> Dag Sverre Seljebotn, 08.05.2012 09:57: >>> >>> On 05/07/2012 11:21 PM, mark florisson wrote: On 7 May 2012 19:40, Dag Sverre Seljebotn wrote: > > mark florisson wrote: >> >> On 7 May 2012 17:00, Dag Sverre Seljebotn wrote: >>> >>> On 05/07/2012 04:16 PM, Stefan Behnel wrote: Stefan Behnel, 07.05.2012 15:04: > > Dag Sverre Seljebotn, 07.05.2012 13:48: >> >> BTW, with the coming of memoryviews, me and Mark talked about just >> deprecating the "mytype[...]" meaning buffers, and rather treat it >> as np.ndarray, array.array etc. being some sort of "template >> types". >> That is, >> we disallow "object[int]" and require some special declarations in >> the relevant pxd files. > > > Hmm, yes, it's unfortunate that we have two different types of > syntax now, > one that declares the item type before the brackets and one that > declares it afterwards. Should we consider the buffer interface syntax deprecated and focus on the memory view syntax? >>> >>> >>> I think that's the very-long-term intention. Then again, it may be >>> too early >>> to really tell yet, we just need to see how the memory views play out >>> in >>> real life and whether they'll be able to replace np.ndarray[double] >>> among real users. We don't want to shove things down users throats. >>> >>> But the use of the trailing-[] syntax needs some cleaning up. Me and >>> Mark agreed we'd put this proposal forward when we got around to it: >>> >>> - Deprecate the "object[double]" form, where [dtype] can be stuck >>> on >>> any extension type >>> >>> - But, do NOT (for the next year at least) deprecate >>> np.ndarray[double], >>> array.array[double], etc. Basically, there should be a magic flag >>> in >>> extension type declarations saying "I can be a buffer". >>> >>> For one thing, that is sort of needed to open up things for templated >>> cdef classes/fused types cdef classes, if that is ever implemented. 
>> >> >> Deprecating is definitely a good start. I think at least if you only >> allow two types as buffers it will be at least reasonably clear when >> one is dealing with fused types or buffers. >> >> Basically, I think memoryviews should live up to demands of the users, >> which would mean there would be no reason to keep the buffer syntax. > > > But they are different approaches -- use a different type/API, or just > try to speed up parts of NumPy.. > >> One thing to do is make memoryviews coerce cheaply back to the >> original objects if wanted (which is likely). Writting >> np.asarray(mymemview) is kind of annoying. > > > It is going to be very confusing to have type(mymemview), > repr(mymemview), and so on come out as NumPy arrays, but not have the > full API of NumPy. Unless you auto-convert on getattr to... Yeah, the idea is as very simple, as you mention, just keep the object around cached, and when you slice construct one lazily. > If you want to eradicate the distinction between the backing array and > the memory view and make it transparent, I really suggest you kick back > alive np.ndarray (it can exist in some 'unrealized' state with delayed > construction after slicing, and so on). Implementation much the same > either way, it is all about how it is presented to the user. You mean the buffer syntax? > Something like mymemview.asobject() could work though, and while not > much shorter, it would have some polymorphism that np.asarray does not > have (based probably on some custom PEP 3118 extension) I was thinking you could allow the user to register a callback, and use that to coerce from a memoryview back to an object (given a memoryview object). For numpy this would be np.asarray, and the implementation is allowed to cache the result (which it will). It may be too magicky though... but it will be convenient. The memoryview will act as a subclass, meaning that any of its methods will override methods of the converted object. 
>>> >>> >>> My point was that this seems *way* too magicky. >>> >>> Beyond "confusing users" and so on that are sort of subjective, here's a >>> fundamental problem for you: We're making it very difficult to type-infer >>> memoryviews. Consider: >>> >>> cdef double[:] x = ... >>> y = x >>> print y.shape >>> >>> Now, because y is not typed, you're semantically throwing in a conversion >>> on line 2, so that line 3 says that you want the attribute access to be >>> invoked on "whatever object x >>> coerced back to". And we have no idea what kind of object that is.
Re: [Cython] buffer syntax vs. memory view syntax
On 8 May 2012 10:22, mark florisson wrote: > On 8 May 2012 09:36, Dag Sverre Seljebotn wrote: >> On 05/08/2012 10:18 AM, Stefan Behnel wrote: >>> >>> Dag Sverre Seljebotn, 08.05.2012 09:57: On 05/07/2012 11:21 PM, mark florisson wrote: > > On 7 May 2012 19:40, Dag Sverre Seljebotn wrote: >> >> mark florisson wrote: >>> >>> On 7 May 2012 17:00, Dag Sverre Seljebotn wrote: On 05/07/2012 04:16 PM, Stefan Behnel wrote: > > Stefan Behnel, 07.05.2012 15:04: >> >> Dag Sverre Seljebotn, 07.05.2012 13:48: >>> >>> BTW, with the coming of memoryviews, me and Mark talked about just >>> deprecating the "mytype[...]" meaning buffers, and rather treat it >>> as np.ndarray, array.array etc. being some sort of "template >>> types". >>> That is, >>> we disallow "object[int]" and require some special declarations in >>> the relevant pxd files. >> >> >> Hmm, yes, it's unfortunate that we have two different types of >> syntax now, >> one that declares the item type before the brackets and one that >> declares it afterwards. > > Should we consider the > buffer interface syntax deprecated and focus on the memory view > syntax? I think that's the very-long-term intention. Then again, it may be too early to really tell yet, we just need to see how the memory views play out in real life and whether they'll be able to replace np.ndarray[double] among real users. We don't want to shove things down users throats. But the use of the trailing-[] syntax needs some cleaning up. Me and Mark agreed we'd put this proposal forward when we got around to it: - Deprecate the "object[double]" form, where [dtype] can be stuck on any extension type - But, do NOT (for the next year at least) deprecate np.ndarray[double], array.array[double], etc. Basically, there should be a magic flag in extension type declarations saying "I can be a buffer". For one thing, that is sort of needed to open up things for templated cdef classes/fused types cdef classes, if that is ever implemented. 
>>> >>> >>> Deprecating is definitely a good start. I think at least if you only >>> allow two types as buffers it will be at least reasonably clear when >>> one is dealing with fused types or buffers. >>> >>> Basically, I think memoryviews should live up to demands of the users, >>> which would mean there would be no reason to keep the buffer syntax. >> >> >> But they are different approaches -- use a different type/API, or just >> try to speed up parts of NumPy.. >> >>> One thing to do is make memoryviews coerce cheaply back to the >>> original objects if wanted (which is likely). Writting >>> np.asarray(mymemview) is kind of annoying. >> >> >> It is going to be very confusing to have type(mymemview), >> repr(mymemview), and so on come out as NumPy arrays, but not have the >> full API of NumPy. Unless you auto-convert on getattr to... > > > Yeah, the idea is as very simple, as you mention, just keep the object > around cached, and when you slice construct one lazily. > >> If you want to eradicate the distinction between the backing array and >> the memory view and make it transparent, I really suggest you kick back >> alive np.ndarray (it can exist in some 'unrealized' state with delayed >> construction after slicing, and so on). Implementation much the same >> either way, it is all about how it is presented to the user. > > > You mean the buffer syntax? > >> Something like mymemview.asobject() could work though, and while not >> much shorter, it would have some polymorphism that np.asarray does not >> have (based probably on some custom PEP 3118 extension) > > > I was thinking you could allow the user to register a callback, and > use that to coerce from a memoryview back to an object (given a > memoryview object). For numpy this would be np.asarray, and the > implementation is allowed to cache the result (which it will). > It may be too magicky though... but it will be convenient. 
The > memoryview will act as a subclass, meaning that any of its methods > will override methods of the converted object. My point was that this seems *way* too magicky. Beyond "confusing users" and so on that are sort of subjective, here's a fundamental problem for you: We're making it very difficult to type-infer memoryviews. Consider: cdef double[:] x = ... y = x print y.shape Now, because y is not typed, you're semantically throwing in a conversion on line 2, so that line 3 says that you want the attribute access to be invoked on "whatever object x coerced back to". And we have no idea what kind of object that is.
Re: [Cython] buffer syntax vs. memory view syntax
On 8 May 2012 09:49, Stefan Behnel wrote: > Dag Sverre Seljebotn, 08.05.2012 10:36: >> On 05/08/2012 10:18 AM, Stefan Behnel wrote: >>> Dag Sverre Seljebotn, 08.05.2012 09:57: On 05/07/2012 11:21 PM, mark florisson wrote: > On 7 May 2012 19:40, Dag Sverre Seljebotn wrote: >> mark florisson wrote: >>> On 7 May 2012 17:00, Dag Sverre Seljebotn wrote: On 05/07/2012 04:16 PM, Stefan Behnel wrote: > Stefan Behnel, 07.05.2012 15:04: >> Dag Sverre Seljebotn, 07.05.2012 13:48: >>> BTW, with the coming of memoryviews, me and Mark talked about just >>> deprecating the "mytype[...]" meaning buffers, and rather treat it >>> as np.ndarray, array.array etc. being some sort of "template types". >>> That is, >>> we disallow "object[int]" and require some special declarations in >>> the relevant pxd files. >> >> Hmm, yes, it's unfortunate that we have two different types of >> syntax now, >> one that declares the item type before the brackets and one that >> declares it afterwards. > Should we consider the > buffer interface syntax deprecated and focus on the memory view > syntax? I think that's the very-long-term intention. Then again, it may be too early to really tell yet, we just need to see how the memory views play out in real life and whether they'll be able to replace np.ndarray[double] among real users. We don't want to shove things down users throats. But the use of the trailing-[] syntax needs some cleaning up. Me and Mark agreed we'd put this proposal forward when we got around to it: - Deprecate the "object[double]" form, where [dtype] can be stuck on any extension type - But, do NOT (for the next year at least) deprecate np.ndarray[double], array.array[double], etc. Basically, there should be a magic flag in extension type declarations saying "I can be a buffer". For one thing, that is sort of needed to open up things for templated cdef classes/fused types cdef classes, if that is ever implemented. >>> >>> Deprecating is definitely a good start. 
I think at least if you only >>> allow two types as buffers it will be at least reasonably clear when >>> one is dealing with fused types or buffers. >>> >>> Basically, I think memoryviews should live up to demands of the users, >>> which would mean there would be no reason to keep the buffer syntax. >> >> But they are different approaches -- use a different type/API, or just >> try to speed up parts of NumPy.. >> >>> One thing to do is make memoryviews coerce cheaply back to the >>> original objects if wanted (which is likely). Writting >>> np.asarray(mymemview) is kind of annoying. >> >> It is going to be very confusing to have type(mymemview), >> repr(mymemview), and so on come out as NumPy arrays, but not have the >> full API of NumPy. Unless you auto-convert on getattr to... > > Yeah, the idea is as very simple, as you mention, just keep the object > around cached, and when you slice construct one lazily. > >> If you want to eradicate the distinction between the backing array and >> the memory view and make it transparent, I really suggest you kick back >> alive np.ndarray (it can exist in some 'unrealized' state with delayed >> construction after slicing, and so on). Implementation much the same >> either way, it is all about how it is presented to the user. > > You mean the buffer syntax? > >> Something like mymemview.asobject() could work though, and while not >> much shorter, it would have some polymorphism that np.asarray does not >> have (based probably on some custom PEP 3118 extension) > > I was thinking you could allow the user to register a callback, and > use that to coerce from a memoryview back to an object (given a > memoryview object). For numpy this would be np.asarray, and the > implementation is allowed to cache the result (which it will). > It may be too magicky though... but it will be convenient. The > memoryview will act as a subclass, meaning that any of its methods > will override methods of the converted object. 
My point was that this seems *way* too magicky. Beyond "confusing users" and so on that are sort of subjective, here's a fundamental problem for you: We're making it very difficult to type-infer memoryviews. Consider: cdef double[:] x = ... y = x print y.shape Now, because y is not typed, you're semantically throwing in a conversion on line 2, so that line 3 says that you want the attribute access to be invoked on "whatever object x coerced back to". And we have no idea what kind of object that is.
Re: [Cython] buffer syntax vs. memory view syntax
On 05/08/2012 11:22 AM, mark florisson wrote: On 8 May 2012 09:36, Dag Sverre Seljebotn wrote: On 05/08/2012 10:18 AM, Stefan Behnel wrote: Dag Sverre Seljebotn, 08.05.2012 09:57: On 05/07/2012 11:21 PM, mark florisson wrote: On 7 May 2012 19:40, Dag Sverre Seljebotn wrote: mark florisson wrote: On 7 May 2012 17:00, Dag Sverre Seljebotn wrote: On 05/07/2012 04:16 PM, Stefan Behnel wrote: Stefan Behnel, 07.05.2012 15:04: Dag Sverre Seljebotn, 07.05.2012 13:48: BTW, with the coming of memoryviews, me and Mark talked about just deprecating the "mytype[...]" meaning buffers, and rather treat it as np.ndarray, array.array etc. being some sort of "template types". That is, we disallow "object[int]" and require some special declarations in the relevant pxd files. Hmm, yes, it's unfortunate that we have two different types of syntax now, one that declares the item type before the brackets and one that declares it afterwards. Should we consider the buffer interface syntax deprecated and focus on the memory view syntax? I think that's the very-long-term intention. Then again, it may be too early to really tell yet, we just need to see how the memory views play out in real life and whether they'll be able to replace np.ndarray[double] among real users. We don't want to shove things down users throats. But the use of the trailing-[] syntax needs some cleaning up. Me and Mark agreed we'd put this proposal forward when we got around to it: - Deprecate the "object[double]" form, where [dtype] can be stuck on any extension type - But, do NOT (for the next year at least) deprecate np.ndarray[double], array.array[double], etc. Basically, there should be a magic flag in extension type declarations saying "I can be a buffer". For one thing, that is sort of needed to open up things for templated cdef classes/fused types cdef classes, if that is ever implemented. Deprecating is definitely a good start. 
I think at least if you only allow two types as buffers it will be at least reasonably clear when one is dealing with fused types or buffers. Basically, I think memoryviews should live up to demands of the users, which would mean there would be no reason to keep the buffer syntax. But they are different approaches -- use a different type/API, or just try to speed up parts of NumPy.. One thing to do is make memoryviews coerce cheaply back to the original objects if wanted (which is likely). Writting np.asarray(mymemview) is kind of annoying. It is going to be very confusing to have type(mymemview), repr(mymemview), and so on come out as NumPy arrays, but not have the full API of NumPy. Unless you auto-convert on getattr to... Yeah, the idea is as very simple, as you mention, just keep the object around cached, and when you slice construct one lazily. If you want to eradicate the distinction between the backing array and the memory view and make it transparent, I really suggest you kick back alive np.ndarray (it can exist in some 'unrealized' state with delayed construction after slicing, and so on). Implementation much the same either way, it is all about how it is presented to the user. You mean the buffer syntax? Something like mymemview.asobject() could work though, and while not much shorter, it would have some polymorphism that np.asarray does not have (based probably on some custom PEP 3118 extension) I was thinking you could allow the user to register a callback, and use that to coerce from a memoryview back to an object (given a memoryview object). For numpy this would be np.asarray, and the implementation is allowed to cache the result (which it will). It may be too magicky though... but it will be convenient. The memoryview will act as a subclass, meaning that any of its methods will override methods of the converted object. My point was that this seems *way* to magicky. 
Beyond "confusing users" and so on that are sort of subjective, here's a fundamental problem for you: We're making it very difficult to type-infer memoryviews. Consider: cdef double[:] x = ... y = x print y.shape Now, because y is not typed, you're semantically throwing in a conversion on line 2, so that line 3 says that you want the attribute access to be invoked on "whatever object x coerced back to". And we have no idea what kind of object that is. If you don't transparently convert to object, it'd be safe to automatically infer y as a double[:]. Why can't y be inferred as the type of x due to the assignment? On a related note, I've said before that I dislike the notion of cdef double[:] mview = obj I'd rather like cdef double[:] mview = double[:](obj) Why? We currently allow cdef char* s = some_py_bytes_string Auto-coercion is a serious part of the language, and I don't see the advantage of requiring the redundancy in the case above. It's clear enough to me what the typed assignment is intended to mean: get me a buffer view on the object, regardless
Re: [Cython] buffer syntax vs. memory view syntax
On 05/08/2012 11:30 AM, Dag Sverre Seljebotn wrote: On 05/08/2012 11:22 AM, mark florisson wrote: On 8 May 2012 09:36, Dag Sverre Seljebotn wrote: On 05/08/2012 10:18 AM, Stefan Behnel wrote: Dag Sverre Seljebotn, 08.05.2012 09:57: On 05/07/2012 11:21 PM, mark florisson wrote: On 7 May 2012 19:40, Dag Sverre Seljebotn wrote: mark florisson wrote: On 7 May 2012 17:00, Dag Sverre Seljebotn wrote: On 05/07/2012 04:16 PM, Stefan Behnel wrote: Stefan Behnel, 07.05.2012 15:04: Dag Sverre Seljebotn, 07.05.2012 13:48: BTW, with the coming of memoryviews, me and Mark talked about just deprecating the "mytype[...]" meaning buffers, and rather treat it as np.ndarray, array.array etc. being some sort of "template types". That is, we disallow "object[int]" and require some special declarations in the relevant pxd files. Hmm, yes, it's unfortunate that we have two different types of syntax now, one that declares the item type before the brackets and one that declares it afterwards. Should we consider the buffer interface syntax deprecated and focus on the memory view syntax? I think that's the very-long-term intention. Then again, it may be too early to really tell yet, we just need to see how the memory views play out in real life and whether they'll be able to replace np.ndarray[double] among real users. We don't want to shove things down users throats. But the use of the trailing-[] syntax needs some cleaning up. Me and Mark agreed we'd put this proposal forward when we got around to it: - Deprecate the "object[double]" form, where [dtype] can be stuck on any extension type - But, do NOT (for the next year at least) deprecate np.ndarray[double], array.array[double], etc. Basically, there should be a magic flag in extension type declarations saying "I can be a buffer". For one thing, that is sort of needed to open up things for templated cdef classes/fused types cdef classes, if that is ever implemented. Deprecating is definitely a good start. 
I think at least if you only allow two types as buffers it will be at least reasonably clear when one is dealing with fused types or buffers. Basically, I think memoryviews should live up to demands of the users, which would mean there would be no reason to keep the buffer syntax. But they are different approaches -- use a different type/API, or just try to speed up parts of NumPy.. One thing to do is make memoryviews coerce cheaply back to the original objects if wanted (which is likely). Writting np.asarray(mymemview) is kind of annoying. It is going to be very confusing to have type(mymemview), repr(mymemview), and so on come out as NumPy arrays, but not have the full API of NumPy. Unless you auto-convert on getattr to... Yeah, the idea is as very simple, as you mention, just keep the object around cached, and when you slice construct one lazily. If you want to eradicate the distinction between the backing array and the memory view and make it transparent, I really suggest you kick back alive np.ndarray (it can exist in some 'unrealized' state with delayed construction after slicing, and so on). Implementation much the same either way, it is all about how it is presented to the user. You mean the buffer syntax? Something like mymemview.asobject() could work though, and while not much shorter, it would have some polymorphism that np.asarray does not have (based probably on some custom PEP 3118 extension) I was thinking you could allow the user to register a callback, and use that to coerce from a memoryview back to an object (given a memoryview object). For numpy this would be np.asarray, and the implementation is allowed to cache the result (which it will). It may be too magicky though... but it will be convenient. The memoryview will act as a subclass, meaning that any of its methods will override methods of the converted object. My point was that this seems *way* to magicky. 
Beyond "confusing users" and so on that are sort of subjective, here's a fundamental problem for you: We're making it very difficult to type-infer memoryviews. Consider: cdef double[:] x = ... y = x print y.shape Now, because y is not typed, you're semantically throwing in a conversion on line 2, so that line 3 says that you want the attribute access to be invoked on "whatever object x coerced back to". And we have no idea what kind of object that is. If you don't transparently convert to object, it'd be safe to automatically infer y as a double[:]. Why can't y be inferred as the type of x due to the assignment? On a related note, I've said before that I dislike the notion of cdef double[:] mview = obj I'd rather like cdef double[:] mview = double[:](obj) Why? We currently allow cdef char* s = some_py_bytes_string Auto-coercion is a serious part of the language, and I don't see the advantage of requiring the redundancy in the case above. It's clear enough to me what the typed assignment is intended to mean: get me a buffer v
Re: [Cython] buffer syntax vs. memory view syntax
mark florisson, 08.05.2012 11:24: Dag Sverre Seljebotn, 08.05.2012 09:57: > 1) We NEVER deprecate "np.ndarray[double]", we commit to keeping that in > the language. It means exactly what you would like double[:] to mean, > i.e. > a variable that is memoryview when you need to and an object otherwise. > When you use this type, you bear the consequences of early-binding things > that could in theory be overridden. > > 2) double[:] is for when you want to access data of *any* Python object > in a generic way. Raw PEP 3118. In those situations, access to the > underlying object is much less useful. > > 2a) Therefore we require that you do "mview.asobject()" manually; doing > "mview.foo()" is a compile-time error >> [...] >> Character pointers coerce to strings. Hell, even structs coerce to and >> from python dicts, so disallowing the same for memoryviews would just >> be inconsistent and inconvenient. Two separate things to discuss here: the original exporter and a Python level wrapper. As long as wrapping the memoryview in a new object can easily be done by users, I don't see a reason to provide compiler support for getting at the exporter. After all, a user may have a memory view that is backed by a NumPy array but wants to reinterpret it as a PIL image. Just because the underlying object has a specific object type doesn't mean that's the one to use for a given use case. If a user requires a specific object *instead* of a bare memory view, we have the object type buffer syntax for that. It's also not necessarily more efficient to access the underlying object than to create a new one if the underlying exporter has to learn about the mapped layout first. Regarding the coercion to Python, I do not see a problem with providing a general Python view object for memory views that arbitrary Cython memory views can coerce to. In fact, I consider that a useful feature. 
The builtin memoryview type in Python (at least the one in CPython 3.3) should be quite capable of providing this, although I don't mind what exactly this becomes. > Also, if you don't allow coercion from python, then it means they also > cannot be used as 'def' function arguments and be called from python. Coercion *from* Python is not being questioned. We have syntax for that, and a Python memory view wrapper can easily be unboxed (even transitively) through the buffer interface when entering back into Cython. Stefan
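For reference, CPython's builtin memoryview already keeps a handle on its exporter via the read-only `obj` attribute, and slices keep pointing at the same base object; this is roughly the bookkeeping an `asobject()`-style method would need:

```python
data = bytearray(b'abcdefgh')
mv = memoryview(data)

# The original exporter stays reachable from the view...
assert mv.obj is data

# ...and slicing produces a new view over the *same* base object.
assert mv[2:5].obj is data
assert bytes(mv[2:5]) == b'cde'
```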
Re: [Cython] buffer syntax vs. memory view syntax
On 8 May 2012 10:47, Dag Sverre Seljebotn wrote: > On 05/08/2012 11:30 AM, Dag Sverre Seljebotn wrote: >> >> On 05/08/2012 11:22 AM, mark florisson wrote: >>> >>> On 8 May 2012 09:36, Dag Sverre Seljebotn >>> wrote: On 05/08/2012 10:18 AM, Stefan Behnel wrote: > > > Dag Sverre Seljebotn, 08.05.2012 09:57: >> >> >> On 05/07/2012 11:21 PM, mark florisson wrote: >>> >>> >>> On 7 May 2012 19:40, Dag Sverre Seljebotn wrote: mark florisson wrote: > > > On 7 May 2012 17:00, Dag Sverre Seljebotn wrote: >> >> >> On 05/07/2012 04:16 PM, Stefan Behnel wrote: >>> >>> >>> Stefan Behnel, 07.05.2012 15:04: Dag Sverre Seljebotn, 07.05.2012 13:48: > > > BTW, with the coming of memoryviews, me and Mark talked > about just > deprecating the "mytype[...]" meaning buffers, and rather > treat it > as np.ndarray, array.array etc. being some sort of "template > types". > That is, > we disallow "object[int]" and require some special > declarations in > the relevant pxd files. Hmm, yes, it's unfortunate that we have two different types of syntax now, one that declares the item type before the brackets and one that declares it afterwards. >>> >>> >>> Should we consider the >>> buffer interface syntax deprecated and focus on the memory view >>> syntax? >> >> >> >> I think that's the very-long-term intention. Then again, it may be >> too early >> to really tell yet, we just need to see how the memory views >> play out >> in >> real life and whether they'll be able to replace >> np.ndarray[double] >> among real users. We don't want to shove things down users >> throats. >> >> But the use of the trailing-[] syntax needs some cleaning up. >> Me and >> Mark agreed we'd put this proposal forward when we got around >> to it: >> >> - Deprecate the "object[double]" form, where [dtype] can be stuck >> on >> any extension type >> >> - But, do NOT (for the next year at least) deprecate >> np.ndarray[double], >> array.array[double], etc. 
Basically, there should be a magic flag >> in >> extension type declarations saying "I can be a buffer". >> >> For one thing, that is sort of needed to open up things for >> templated >> cdef classes/fused types cdef classes, if that is ever >> implemented. > > > > Deprecating is definitely a good start. I think at least if you > only > allow two types as buffers it will be at least reasonably clear > when > one is dealing with fused types or buffers. > > Basically, I think memoryviews should live up to demands of the > users, > which would mean there would be no reason to keep the buffer > syntax. But they are different approaches -- use a different type/API, or just try to speed up parts of NumPy.. > One thing to do is make memoryviews coerce cheaply back to the > original objects if wanted (which is likely). Writting > np.asarray(mymemview) is kind of annoying. It is going to be very confusing to have type(mymemview), repr(mymemview), and so on come out as NumPy arrays, but not have the full API of NumPy. Unless you auto-convert on getattr to... >>> >>> >>> >>> Yeah, the idea is as very simple, as you mention, just keep the >>> object >>> around cached, and when you slice construct one lazily. >>> If you want to eradicate the distinction between the backing array and the memory view and make it transparent, I really suggest you kick back alive np.ndarray (it can exist in some 'unrealized' state with delayed construction after slicing, and so on). Implementation much the same either way, it is all about how it is presented to the user. >>> >>> >>> >>> You mean the buffer syntax? >>> Something like mymemview.asobject() could work though, and while not much shorter, it would have some polymorphism that np.asarray does not have (based probably on some custom PEP 3118 extension) >>> >>> >>> >>> I was thinking you could allow the user to register a callback, and >>> use that to coerce from a
Re: [Cython] buffer syntax vs. memory view syntax
On 8 May 2012 10:48, Stefan Behnel wrote: > mark florisson, 08.05.2012 11:24: > Dag Sverre Seljebotn, 08.05.2012 09:57: >> 1) We NEVER deprecate "np.ndarray[double]", we commit to keeping that in >> the language. It means exactly what you would like double[:] to mean, >> i.e. >> a variable that is memoryview when you need to and an object otherwise. >> When you use this type, you bear the consequences of early-binding things >> that could in theory be overridden. >> >> 2) double[:] is for when you want to access data of *any* Python object >> in a generic way. Raw PEP 3118. In those situations, access to the >> underlying object is much less useful. >> >> 2a) Therefore we require that you do "mview.asobject()" manually; doing >> "mview.foo()" is a compile-time error >>> [...] >>> Character pointers coerce to strings. Hell, even structs coerce to and >>> from python dicts, so disallowing the same for memoryviews would just >>> be inconsistent and inconvenient. > > Two separate things to discuss here: the original exporter and a Python > level wrapper. > > As long as wrapping the memoryview in a new object is can easily be done by > users, I don't see a reason to provide compiler support for getting at the > exporter. Well, the support is already there :) It's basically to be consistent with numpy's attributes. > After all, a user may have a memory view that is backed by a > NumPy array but wants to reinterpret it as a PIL image. Just because the > underlying object has a specific object type doesn't mean that's the one to > use for a given use case. If a user requires a specific object *instead* of > a bare memory view, we have the object type buffer syntax for that. Which is better deprecated to allow only one way to do things, and to make fused extension types less confusing. > It's also not necessarily more efficient to access the underlying object > than to create a new one if the underlying exporter has to learn about the > mapped layout first. 
> > Regarding the coercion to Python, I do not see a problem with providing a > general Python view object for memory views that arbitrary Cython memory > views can coerce to. In fact, I consider that a useful feature. The builtin > memoryview type in Python (at least the one in CPython 3.3) should be quite > capable of providing this, although I don't mind what exactly this becomes. > There are two ways to argue this entire problem, one is from a theoretical standpoint, and one from a pragmatic. Theoretically your points are sound, but in practice 99% of the uses will be numpy arrays, and in 99% of those uses people will want one back. If one does not allow easy, compiler-supported, conversion, then any numpy operation will go from typed memoryview slice -> memoryview object -> buffer interface -> some computation in numpy -> buffer interface -> typed memoryview. The compiler can help here by maintaining cached views aided by a user callback. In the case you're not slicing, you can just return the original object. I'm not sure how to register those callbacks though, as making them global may interfere between projects. Maybe it should be a module level thing? >> Also, if you don't allow coercion from python, then it means they also >> cannot be used as 'def' function arguments and be called from python. > > Coercion *from* Python is not being questioned. We have syntax for that, > and a Python memory view wrapper can easily be unboxed (even transitively) > through the buffer interface when entering back into Cython. > > Stefan
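The callback registration being discussed could look something like the following pure-Python sketch. All names here (`register_coercion`, `coerce_back`, the registry dict) are hypothetical illustrations of the idea, not an existing or proposed Cython API:

```python
# Hypothetical per-exporter-type coercion registry (illustrative only).
_coercions = {}

def register_coercion(exporter_type, callback):
    """Register a callback that turns a memoryview back into a rich object."""
    _coercions[exporter_type] = callback

def coerce_back(mview):
    """Coerce a view to its 'natural' object, falling back to the view itself."""
    callback = _coercions.get(type(mview.obj))
    return callback(mview) if callback else mview

# For bytearray-backed views, just hand the exporter back.
register_coercion(bytearray, lambda mv: mv.obj)

mv = memoryview(bytearray(b'spam'))
print(coerce_back(mv))  # bytearray(b'spam')
```

Whether such a registry lives at module level or globally is exactly the open question Mark raises above.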
[Cython] callable() optimization
I've noticed a regression related to the callable() optimization. https://github.com/cython/cython/commit/a40112b0461eae5ab22fbdd07ae798d4a72ff523 class C: pass print callable(C()) It prints True. The optimized version checks the condition ((obj)->ob_type->tp_call != NULL), which is True for both the class and its instances. >>> help(callable) callable(...) callable(object) -> bool Return whether the object is callable (i.e., some kind of function). Note that classes are callable, as are instances with a __call__() method. -- vitja.
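For comparison, these are the Python-level semantics the optimization has to preserve (shown here under CPython 3; the report above uses a Python 2 old-style class, where the naive tp_call check likewise misreports plain instances as callable):

```python
class C:
    pass

class D:
    def __call__(self):
        return 42

print(callable(C))    # True  -- classes themselves are callable
print(callable(C()))  # False -- a plain instance is not
print(callable(D()))  # True  -- instances with __call__ are
```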
[Cython] CF based type inference
Hi, Vitja has rebased the type inference on the control flow, so I wonder if this will enable us to properly infer this: def partial_validity(): """ >>> partial_validity() ('Python object', 'double', 'str object') """ a = 1.0 b = a + 2 # definitely double a = 'test' c = a + 'toast' # definitely str return typeof(a), typeof(b), typeof(c) I think, what is mainly needed for this is that a NameNode with an undeclared type should not report its own entry as dependency but that of its own cf_assignments. Would this work? (Haven't got the time to try it out right now, so I'm dumping it here.) Stefan
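At runtime, the plain Python version of that example has exactly these types at each use point, which is what a control-flow based inference should recover statically (using `type(...).__name__` in place of Cython's `typeof`):

```python
def partial_validity():
    a = 1.0
    b = a + 2         # a is a float at this use point
    a = 'test'        # a is rebound to a str
    c = a + 'toast'   # a is a str at this use point
    return type(a).__name__, type(b).__name__, type(c).__name__

print(partial_validity())  # ('str', 'float', 'str')
```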
Re: [Cython] [cython-users] Confusing beheavior of memoryview (assigning value to slice fails without any indication)
mark florisson, 08.05.2012 11:11: > On 7 May 2012 13:09, Maxim wrote: >> Consider the following code: >> >>> # a.pyx: >>> cdef class Base(object): >>> cdef public double[:,:] arr >>> # b.py: >>> from a import Base >>> import numpy as np >>> class MyClass(Base): >>> def __init__(self): >>> self.arr = np.zeros((10, 10), dtype=np.float64) >>> self.arr[1, :] = 10 # this line will execute correctly, but won't >>> have any effect >>> print self.arr[1,5] >>> >> >> Is it possible to somehow warn the user here that assigning value to >> memoryview slice is not supported? Finding this out after some debugging was >> a little annoying. > > Thanks for the report, that's a silly bug. It works with typed > memoryviews, but with objects it passes in the wrong ndim, and the > second dimension is 0, which means it does nothing. It is fixed in the > cython master branch. > > BTW, using 'self.arr' in the python subclass means you don't get your > numpy array back, but rather a cython memoryview that is far less > capable. I think it would be good to cherry pick this kind of fix directly over into the release branch so that we can start building up our pile of fixes for 0.16.1 there. Stefan
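For reference, this is what slice assignment through a view is supposed to do, shown with CPython's builtin memoryview over a bytearray (pure Python, no Cython or NumPy involved):

```python
data = bytearray(10)      # ten zero bytes
mv = memoryview(data)

mv[1:5] = b'\x0a' * 4     # assign through the view...
print(data[1], data[5])   # 10 0 -- ...and the backing buffer changes
```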
Re: [Cython] CF based type inference
2012/5/8 Stefan Behnel :
> Hi,
>
> Vitja has rebased the type inference on the control flow, so I wonder if
> this will enable us to properly infer this:
>
> def partial_validity():
>     """
>     >>> partial_validity()
>     ('Python object', 'double', 'str object')
>     """
>     a = 1.0
>     b = a + 2       # definitely double
>     a = 'test'
>     c = a + 'toast' # definitely str
>     return typeof(a), typeof(b), typeof(c)
>
> I think, what is mainly needed for this is that a NameNode with an
> undeclared type should not report its own entry as dependency but that of
> its own cf_assignments. Would this work?
>
> (Haven't got the time to try it out right now, so I'm dumping it here.)

Yeah, that might work. The other way to go is to split entries:

  def partial_validity():
      """
      >>> partial_validity()
      ('str object', 'double', 'str object')
      """
      a_1 = 1.0
      b = a_1 + 2       # definitely double
      a_2 = 'test'
      c = a_2 + 'toast' # definitely str
      return typeof(a_2), typeof(b), typeof(c)

And this should work better because it allows to infer a_1 as a double
and a_2 as a string.

--
vitja.
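The entry-splitting idea Vitja describes can be sketched as a small renaming pass over the AST, in the spirit of SSA construction: every assignment to a local creates a fresh "entry" so each definition can be typed on its own. This is a hypothetical pure-Python illustration (the `EntrySplitter` class is made up for this sketch and is not Cython's actual implementation; `ast.unparse` needs Python 3.9+):

```python
import ast

class EntrySplitter(ast.NodeTransformer):
    """Rename each assignment target so every definition gets its own
    entry (a_1, a_2, ...), letting each one keep a precise type."""
    def __init__(self):
        self.counts = {}   # name -> number of definitions seen so far
        self.current = {}  # name -> currently visible renamed entry

    def visit_Assign(self, node):
        # Rewrite loads on the right-hand side first, using the
        # entries that are visible *before* this assignment.
        self.generic_visit(node)
        for target in node.targets:
            if isinstance(target, ast.Name):
                n = self.counts.get(target.id, 0) + 1
                self.counts[target.id] = n
                self.current[target.id] = f"{target.id}_{n}"
                target.id = self.current[target.id]
        return node

    def visit_Name(self, node):
        if isinstance(node.ctx, ast.Load) and node.id in self.current:
            node.id = self.current[node.id]
        return node

src = "a = 1.0\nb = a + 2\na = 'test'\nc = a + 'toast'\n"
tree = ast.fix_missing_locations(EntrySplitter().visit(ast.parse(src)))
print(ast.unparse(tree))
# a_1 = 1.0
# b_1 = a_1 + 2
# a_2 = 'test'
# c_1 = a_2 + 'toast'
```

After the pass, `a_1` only ever holds a float and `a_2` only a str, so a type inferencer can assign each entry its exact type instead of joining both into a generic object.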
Re: [Cython] CF based type inference
Vitja Makarov wrote:
> 2012/5/8 Stefan Behnel :
>> Hi,
>>
>> Vitja has rebased the type inference on the control flow, so I wonder if
>> this will enable us to properly infer this:
>>
>> def partial_validity():
>>     """
>>     >>> partial_validity()
>>     ('Python object', 'double', 'str object')
>>     """
>>     a = 1.0
>>     b = a + 2       # definitely double
>>     a = 'test'
>>     c = a + 'toast' # definitely str
>>     return typeof(a), typeof(b), typeof(c)
>>
>> I think, what is mainly needed for this is that a NameNode with an
>> undeclared type should not report its own entry as dependency but that of
>> its own cf_assignments. Would this work?
>>
>> (Haven't got the time to try it out right now, so I'm dumping it here.)
>
> Yeah, that might work. The other way to go is to split entries:
>
>  def partial_validity():
>      """
>      >>> partial_validity()
>      ('str object', 'double', 'str object')
>      """
>      a_1 = 1.0
>      b = a_1 + 2       # definitely double
>      a_2 = 'test'
>      c = a_2 + 'toast' # definitely str
>      return typeof(a_2), typeof(b), typeof(c)
>
> And this should work better because it allows to infer a_1 as a double
> and a_2 as a string.

+1 (as Mark has also hinted several times). I also happen to like that
typeof returns str rather than object... I don't think type-inferred code
has to restrict itself to what you could do using *only* declarations.

To go out on a hyperbole: reinventing compiler theory to make things fit
better with our current tree and the Pyrex legacy isn't sustainable
forever; at some point we should do things the standard way and refactor
some code if necessary.

Dag

--
Sent from my Android phone with K-9 Mail. Please excuse my brevity.
[Cython] 0.16.1
Ok, so for the bugfix release 0.16.1 I propose that everyone cherry-picks
their own fixes over into the release branch (at least Stefan, since your
fixes pertain to your newly merged branches and sometimes to the master
branch itself). This branch should not be merged back into master, and
any additional fixes should go into master and be picked over to release.

Some things that should still be fixed:
- nonechecks for memoryviews
- memoryview documentation
- more?

We can then shortly-ish afterwards release 0.17 with actual features (and
new bugs, let's call those features too), depending on how many bugs are
still found in 0.16.1.
Re: [Cython] 0.16.1
On 8 May 2012 19:36, mark florisson wrote:
> Ok, so for the bugfix release 0.16.1 I propose that everyone cherry
> picks over its own fixes into the release branch (at least Stefan,
> since your fixes pertain to your newly merged branches and sometimes
> to the master branch itself). This branch should not be merged back
> into master, and any additional fixes should go into master and be
> picked over to release.
>
> Some things that should still be fixed:
> - nonechecks for memoryviews
> - memoryview documentation
> - more?
>
> We can then shortly-ish after release 0.17 with actual features (and
> new bugs, lets call those features too), depending on how many bugs
> are still found in 0.16.1.

TBH, if we're actually close to a major release, the usefulness of a
bugfix release is imho not that great.
Re: [Cython] 0.16.1
2012/5/8 mark florisson :
> On 8 May 2012 19:36, mark florisson wrote:
>> Ok, so for the bugfix release 0.16.1 I propose that everyone cherry
>> picks over its own fixes into the release branch (at least Stefan,
>> since your fixes pertain to your newly merged branches and sometimes
>> to the master branch itself). This branch should not be merged back
>> into master, and any additional fixes should go into master and be
>> picked over to release.
>>
>> Some things that should still be fixed:
>> - nonechecks for memoryviews
>> - memoryview documentation
>> - more?
>>
>> We can then shortly-ish after release 0.17 with actual features (and
>> new bugs, lets call those features too), depending on how many bugs
>> are still found in 0.16.1.
>
> TBH, if we're actually close to a major release, the usefulness of a
> bugfix release is imho not that great.

There are some fixes to the generators implementation that depend on
"yield from" and can't be easily cherry-picked. So I think you're right
about the 0.17 release. But new features may introduce new bugs and
we'll have to release 0.17.1 soon.

--
vitja.
Re: [Cython] CF based type inference
On Tue, May 8, 2012 at 6:47 AM, Vitja Makarov wrote:
> 2012/5/8 Stefan Behnel :
>> Hi,
>>
>> Vitja has rebased the type inference on the control flow, so I wonder if
>> this will enable us to properly infer this:
>>
>> def partial_validity():
>>     """
>>     >>> partial_validity()
>>     ('Python object', 'double', 'str object')
>>     """
>>     a = 1.0
>>     b = a + 2       # definitely double
>>     a = 'test'
>>     c = a + 'toast' # definitely str
>>     return typeof(a), typeof(b), typeof(c)
>>
>> I think, what is mainly needed for this is that a NameNode with an
>> undeclared type should not report its own entry as dependency but that of
>> its own cf_assignments. Would this work?
>>
>> (Haven't got the time to try it out right now, so I'm dumping it here.)
>
> Yeah, that might work. The other way to go is to split entries:
>
>  def partial_validity():
>      """
>      >>> partial_validity()
>      ('str object', 'double', 'str object')
>      """
>      a_1 = 1.0
>      b = a_1 + 2       # definitely double
>      a_2 = 'test'
>      c = a_2 + 'toast' # definitely str
>      return typeof(a_2), typeof(b), typeof(c)
>
> And this should work better because it allows to infer a_1 as a double
> and a_2 as a string.

This already works, right? I agree it's nicer in general to split things
up, but not being able to optimize a loop variable because it was used
earlier or later in a different context is a disadvantage of the current
system.

- Robert
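Robert's point about loop variables can be made concrete with a toy model of the two strategies. With one shared entry per name, the inferred type is the join over all assignments, so mixed uses degrade everything to a boxed object; with split entries, each definition keeps its own precise type. This is a hypothetical sketch, not Cython's actual spanning-type machinery:

```python
def spanning_type(types):
    """Toy join: identical types keep their type, mixed assignments
    fall back to a generic Python object."""
    return types[0] if len(set(types)) == 1 else "object"

# One shared entry per name: 'i' is assigned a str before a loop and
# C ints inside it, so every use of 'i' degrades to a boxed object.
shared = {"i": ["str", "int"]}
print(spanning_type(shared["i"]))  # -> object

# Split entries: each definition keeps its own precise type, so the
# loop variable can still become a fast C int inside the loop.
split = {"i_1": ["str"], "i_2": ["int"]}
print([spanning_type(t) for t in split.values()])  # -> ['str', 'int']
```

This is exactly the trade-off in the thread: without splitting, a single stray string assignment to `i` anywhere in the function forces the loop counter to stay a Python object.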
Re: [Cython] 0.16.1
On Tue, May 8, 2012 at 12:04 PM, Vitja Makarov wrote:
> 2012/5/8 mark florisson :
>> On 8 May 2012 19:36, mark florisson wrote:
>>> Ok, so for the bugfix release 0.16.1 I propose that everyone cherry
>>> picks over its own fixes into the release branch (at least Stefan,
>>> since your fixes pertain to your newly merged branches and sometimes
>>> to the master branch itself). This branch should not be merged back
>>> into master, and any additional fixes should go into master and be
>>> picked over to release.
>>>
>>> Some things that should still be fixed:
>>> - nonechecks for memoryviews
>>> - memoryview documentation
>>> - more?
>>>
>>> We can then shortly-ish after release 0.17 with actual features (and
>>> new bugs, lets call those features too), depending on how many bugs
>>> are still found in 0.16.1.
>>
>> TBH, if we're actually close to a major release, the usefulness of a
>> bugfix release is imho not that great.
>
> There are some fixes to generators implementation that depend on
> "yield from" that can't be easily cherry-picked.
> So I think you're right about 0.17 release. But new features may
> introduce new bugs and we'll have to release 0.17.1 soon.

If we're looking at doing 0.17 soon, let's just do that. In the future,
we could have a bugfix branch that all bugfixes get checked into,
regularly merged into master, which we could release more often as
x.y.z releases.

- Robert
Re: [Cython] CF based type inference
Robert Bradshaw, 09.05.2012 00:12:
> On Tue, May 8, 2012 at 6:47 AM, Vitja Makarov wrote:
>> 2012/5/8 Stefan Behnel:
>>> Vitja has rebased the type inference on the control flow, so I wonder if
>>> this will enable us to properly infer this:
>>>
>>> def partial_validity():
>>>     """
>>>     >>> partial_validity()
>>>     ('Python object', 'double', 'str object')
>>>     """
>>>     a = 1.0
>>>     b = a + 2       # definitely double
>>>     a = 'test'
>>>     c = a + 'toast' # definitely str
>>>     return typeof(a), typeof(b), typeof(c)
>>>
>>> I think, what is mainly needed for this is that a NameNode with an
>>> undeclared type should not report its own entry as dependency but that of
>>> its own cf_assignments. Would this work?
>>>
>>> (Haven't got the time to try it out right now, so I'm dumping it here.)
>>
>> Yeah, that might work. The other way to go is to split entries:
>>
>>  def partial_validity():
>>      """
>>      >>> partial_validity()
>>      ('str object', 'double', 'str object')
>>      """
>>      a_1 = 1.0
>>      b = a_1 + 2       # definitely double
>>      a_2 = 'test'
>>      c = a_2 + 'toast' # definitely str
>>      return typeof(a_2), typeof(b), typeof(c)
>>
>> And this should work better because it allows to infer a_1 as a double
>> and a_2 as a string.
>
> This already works, right?

It would work if it was implemented. *wink*

> I agree it's nicer in general to split
> things up, but not being able to optimize a loop variable because it
> was used earlier or later in a different context is a disadvantage of
> the current system.

Absolutely. I was considering entry splitting more of a "soon, maybe not
now" type of thing because it isn't entirely clear to me what needs to be
done. It may not even be all that hard to implement, but I think it's
more than just a local change in the scope implementation because the
current lookup_here() doesn't know which node is asking.

Stefan
Re: [Cython] 0.16.1
Robert Bradshaw, 09.05.2012 00:16:
> On Tue, May 8, 2012 at 12:04 PM, Vitja Makarov wrote:
>> 2012/5/8 mark florisson:
>>> On 8 May 2012 19:36, mark florisson wrote:
>>>> Ok, so for the bugfix release 0.16.1 I propose that everyone cherry
>>>> picks over its own fixes into the release branch (at least Stefan,
>>>> since your fixes pertain to your newly merged branches and sometimes
>>>> to the master branch itself). This branch should not be merged back
>>>> into master, and any additional fixes should go into master and be
>>>> picked over to release.
>>>>
>>>> Some things that should still be fixed:
>>>> - nonechecks for memoryviews
>>>> - memoryview documentation
>>>> - more?
>>>>
>>>> We can then shortly-ish after release 0.17 with actual features (and
>>>> new bugs, lets call those features too), depending on how many bugs
>>>> are still found in 0.16.1.
>>>
>>> TBH, if we're actually close to a major release, the usefulness of a
>>> bugfix release is imho not that great.
>>
>> There are some fixes to generators implementation that depend on
>> "yield from" that can't be easily cherry-picked.
>> So I think you're right about 0.17 release. But new features may
>> introduce new bugs and we'll have to release 0.17.1 soon.
>
> If we're looking at doing 0.17 soon, lets just do that.

I think it's close enough to be released. I'll try to get around to
listing the changes in the release notes (and maybe even add a note about
alpha-quality PyPy support to the docs), but I wouldn't mind if someone
else was quicker, at least for a start. ;)

> In the future,
> we could have a bugfix branch that all bugfixes get checked into,
> regularly merged into master, which we could release more often as
> x.y.z releases.

+1. We have the release branch for that, it just hasn't been used much
since the last release.

I also don't mind releasing a 0.16.1 shortly before (or even after) a
0.17. Distributors (e.g. Debian) often try to stick to a given release
series during their support time frame (usually more than a year), so
unless we release fixes, they'll end up cherry-picking or porting their
own fixes, each on their own. Applying at least the obvious fixes to the
release branch and then merging it into master from there would make it
easier for them.

Stefan