[Cython] Two minor bugs
Hi,

I'm new to this list and to Cython internals. Reporting two recently found bugs:

1. Explicit cast fails unexpectedly:

    ctypedef char* LPSTR
    cdef LPSTR c_str = b"ascii"
    c_str  # Failure: Python objects cannot be cast from pointers of primitive types

The problem is CTypedefType not delegating can_coerce_to_pyobject() to the original type (because BaseType.can_coerce_to_pyobject takes precedence over __getattr__). Patch and test case are attached.

Interestingly, implicit casts use a different code path and are not affected.

There is potential for similar bugs in the future, because __getattr__ delegation is inherently brittle in the presence of the base class (BaseType).

2. This recently added code does not compile with MSVC:
https://github.com/cython/cython/blob/master/Cython/Utility/TypeConversion.c#L140-142
Interleaving declarations and statements is not allowed in C90...

Best Regards,
Nikita Nemkin

0001-Fixed-explicit-coercion-of-ctypedef-ed-C-types.patch
Description: Binary data
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel
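A minimal Python sketch of the delegation pitfall behind bug 1. The class names are simplified stand-ins invented for illustration, not Cython's real BaseType or CTypedefType. It assumes only standard Python semantics: __getattr__ is consulted only when normal attribute lookup fails, so a default method inherited from the base class shadows the wrapped type's answer.

```python
class BaseTypeSketch:
    """Hypothetical base class that provides a default answer."""

    def can_coerce_to_pyobject(self, env):
        return False


class CharPtrSketch(BaseTypeSketch):
    """Hypothetical concrete type that can coerce to a Python object."""

    is_ptr = True

    def can_coerce_to_pyobject(self, env):
        return True


class TypedefSketch(BaseTypeSketch):
    """Hypothetical typedef wrapper that forwards unknown attributes."""

    def __init__(self, base):
        self.typedef_base = base

    def __getattr__(self, name):
        # Reached only when normal lookup fails, so methods defined on
        # BaseTypeSketch are never forwarded to the wrapped type.
        return getattr(self.typedef_base, name)


class FixedTypedefSketch(TypedefSketch):
    """Same wrapper with the explicit delegation that the fix adds."""

    def can_coerce_to_pyobject(self, env):
        return self.typedef_base.can_coerce_to_pyobject(env)
```

Attributes that exist only on the wrapped type (is_ptr here) are forwarded as expected, which is why this kind of bug is easy to miss until a method also has a default on the base class.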
[Cython] Py_UNICODE* string support
Hi,

Please review my feature proposal to add Py_UNICODE* string support for better Windows interoperability:
https://github.com/cython/cython/pull/191

This is motivated by my current work that involves calling lots of Windows APIs.

If people are interested I can elaborate on some important points, like the choice of base type (Py_UNICODE vs wchar_t), the nature of Py_UNICODE* literals, or why this feature is necessary at all.

Best regards,
Nikita Nemkin
Re: [Cython] Py_UNICODE* string support
On Sun, 03 Mar 2013 13:52:49 +0600, Stefan Behnel wrote:

> Are you aware that Py_UNICODE is deprecated as of Py3.3?
> http://docs.python.org/3.4/c-api/unicode.html
>
> Your changes look a bit excessive for supporting something that's inefficient in recent Python versions and basically "dead".

Yes, I'm well aware of Py3.3 changes, but consider this:

1. _All_ system APIs on Windows, old, new and in-between, use UTF-16 in the form of zero-terminated 2-byte wchar_t* strings (on Windows Py_UNICODE is _always_ aliased to wchar_t specifically for this reason). Whatever happens to Python internals, the need to interoperate with UTF-16 based platforms won't go away.

2. The Py_UNICODE family of APIs remains the recommended way to interoperate with Windows. (So said the author of PEP 393 himself; I could find the relevant discussion in python-dev.)

3. It is not _that_ inefficient. Actually, it has the same efficiency as the UTF8-related APIs (which have to be used on UTF-8 platforms like most *nix systems). UTF8 allows sharing of the ASCII buffer and has to convert UCS2/UCS4; Py_UNICODE shares the UCS2 buffer (assuming a narrow build) and has to convert ASCII.

One alternative to Py_UNICODE that I have rejected is using Python's wchar_t support. It's practically useless for these reasons:

1) wchar_t APIs do not exist in Py2 and have to be implemented for compatibility.
2) Implementing them brings in all the pain of the nonportable wchar_t type (on *nix systems in general), whereas its primary users would target Windows, where (pretty horrible) wchar_t portability workarounds would be dead code.
3) wchar_t APIs do not offer a zero-copy option and do not manage the memory for us.

The changes are some 50 lines of code, not counting the tests. I wouldn't call that excessive. And they mostly mirror existing code, no trickery of any kind.

Inbuilt Py_UNICODE* support also means that the users would be shielded from 3.3 changes and Cython is free to optimize string handling in the future.

Believe me, nobody calls Py_UNICODE APIs because they want to, they just have to.

Best regards,
Nikita Nemkin
Re: [Cython] Py_UNICODE* string support
On Sun, 03 Mar 2013 15:32:36 +0600, Stefan Behnel wrote:

> 1) I would like to get rid of UnicodeConst. A Py_UNICODE* is not different from any other C array, except that it can coerce to and from Unicode strings. So the representation of a literal should be a (properly reference counted) Python Unicode object, and users would be allowed to cast them to Py_UNICODE*, just as we support it for char* and bytes.

I understand the idea. Since Python unicode literals are implicitly coercible to Py_UNICODE*, there appears to be no need for C-level Py_UNICODE[] literals. Indeed, client code will look exactly (!) the same whether they are supported or not.

Except when it comes to nogil. (For example, native callbacks are almost guaranteed to be nogil.) Hiding Python operations in what appears to be pure C-level code will break users' assumptions. This is the #1 reason why I went for C-level literals.

The #2 reason is efficiency on Py3.3. C-level literals don't need conversions and don't call any conversion APIs.

> 2) non-BMP literals should be supported by representing them as normal Unicode strings and creating the Py_UNICODE representation at need (i.e. explicitly through a cast, at runtime). Py_UNICODE[] literals are simply not portable.

Py_UNICODE[] literals can be made fully portable if non-BMP ones are wrapped like this:

    #ifdef Py_UNICODE_WIDE
    static const k_xxx[] = { /* UTF-32 code units */, 0 };
    #else
    static const k_xxx[] = { /* UTF-16 code units */, 0 };
    #endif

Literals containing only BMP chars are already portable and don't need this wrapping.

> 3) __Pyx_Py_UNICODE_strlen() is ok, but only for the special case that all we have is a Py_UNICODE*. As long as we are dealing with Unicode string objects, that won't be needed, so len() should be constant time in the normal case instead of linear time.

len(Py_UNICODE*) simply mirrors len(char*). Its purpose is to provide a platform-independent Py_UNICODE_strlen (which is Py3 only and deprecated in 3.3).

> So, the basic idea would be to use Unicode strings and their (optional) internal representation as Py_UNICODE[] instead of making Py_UNICODE[] a first class data type. And then go from there and optimise certain things to use the unpacked array directly, so that users won't need to put explicit C-API calls into their code.

Please reconsider your decision wrt C-level literals. I believe that nogil code and a bit of efficiency (on 3.3) justify their existence. (char* strings do have C-level literals, and Py_UNICODE* is in the same basket when it comes to Windows code.) The code to support them is also small and well-contained.

I've updated my pull request to fully support non-BMP Py_UNICODE[] literals.

If you are still not convinced, so be it, I'll drop C-level literal support.

Best regards,
Nikita Nemkin

PS. I made a false claim in the previous mail. (Some of) Python's wchar_t APIs do exist in Py2. But they won't manage the memory automatically anyway.
Re: [Cython] Py_UNICODE* string support
On Mon, 04 Mar 2013 01:56:59 +0600, Stefan Behnel wrote:

> As one little nit-pick, may I ask you to rename the new name references to "unicode" into "py_unicode" in your code? For example, "is_unicode", "get_unicode_const", "unicode_const_index", etc. Given that Py_UNICODE is no longer the native equivalent of Python's unicode type in Py3.3, I'd like to avoid confusion in the code. The name "unicode" is much more likely to refer to the builtin Python type than to a native C type when it appears in Cython's sources.

Actually, "py_unicode" is even more likely to be mistaken for Python-level unicode. There are already pairs of methods like get_string_const (C-level) + get_py_string_const (Py-level).

I suggest one of "py_unicode_ptr", "py_unicode_str", "wstring", "wide_string", "ustring", "unicode_string" to unambiguously refer to Py_UNICODE* variables and constants. Take your pick.

> Oh, and yet another thing: could you write up some documentation for this in docs/src/tutorial/strings.rst ? Basically a Windows/wchar_t related section, that also warns about the inefficiency in Py3.3, so that users don't accidentally assume it's efficient for anything that needs to be portable.

Sure, I'm writing the docs now.

Best regards,
Nikita Nemkin
Re: [Cython] Cython syntax to pre-allocate lists for performance
On Thu, 07 Mar 2013 17:16:10 +0600, Yury V. Zaytsev wrote:

> Hi,
>
> Is there any syntax that I can use to do something like this in Cython:
>
>     py_object_ = PyList_New(123); ?
>
> If not, do you think that this can be added in one way or another?
>
> Unfortunately, I can't think of a non-disruptive way of doing it. For instance, if this
>
>     [None] * N
>
> is given a completely new meaning, like make an empty list (of NULLs), instead of making a real list of Nones, it will certainly break Python code. Besides, it would probably be still faster than no pre-allocation, but slower than an empty list with pre-allocation...
>
> Maybe
>
>     [NULL] * N ?
>
> Any ideas?

I really like the "[NULL] * N" thing. Efficient empty list allocation and filling is something I stumble upon quite often, especially in binding code.

I doubt Cython will be able to automatically use PyList_SET_ITEM for assignment to such a NULL list (it would require induction variable analysis), but eliminating one extra pass over the list is already helpful.

Implementation note (if this gets implemented): Cython's optimized list assignment routine uses Py_DECREF; this will have to be changed to Py_XDECREF, otherwise NULL list items won't be directly assignable from Cython. (PyList_SetItem always uses Py_XDECREF on the old element.)

> What do you need it for? Won't list comprehensions work for you? They could potentially be adapted to presize the list.

List comprehensions do not preallocate the list. If they did, the need for the above would be somewhat diminished.

> And why won't [None]*N help you out? It should be pretty cheap.

[None] * N makes
Re: [Cython] Cython syntax to pre-allocate lists for performance
Sorry, accidental early send. Previous mail continued...

[None] * N makes an extra pass over the list to assign None to each item (and also increfs None N times). This is useless extra work. The larger the list, the worse it gets.

Best regards,
Nikita Nemkin
[Cython] Minor bug: emitted junk line prevents compilation
Hi,

I believe I have found a bit of broken/junk code. This line produces an unpaired and unnecessary #if directive:
https://github.com/cython/cython/blob/master/Cython/Compiler/ModuleNode.py#L2423

The fix is to simply remove it.

In case you are interested in how to hit this line, declare in some .pxd:

    cdef extern from "Python.h":
        ctypedef class __builtin__.BaseException [object PyBaseExceptionObject]:
            pass

and cimport it in another .pyx.

Best regards,
Nikita Nemkin
Re: [Cython] Minor bug: emitted junk line prevents compilation
On Sun, 17 Mar 2013 02:59:12 +0600, Stefan Behnel wrote:

>> In case you are interested in how to hit this line, declare in some .pxd:
>>
>>     cdef extern from "Python.h":
>>         ctypedef class __builtin__.BaseException [object PyBaseExceptionObject]:
>>             pass
>
> Why would you need to do that in your code?

It makes Cython treat BaseException as an extension type (Exception is declared similarly) and allows for things like:

* Using "Exception" as a type for parameters, attributes, casts. All this with Cython-generated and optimized typechecking.
* Creating Exception subclasses as cdef classes.

It's a hack, but a very useful one.

Best regards,
Nikita Nemkin
[Cython] Minor bug in compile time constant handling
Hi,

Here:
https://github.com/cython/cython/blob/master/Cython/Compiler/Parsing.py#L708-L711
compile-time unicode and bytes values should be wrapped with EncodedString and BytesLiteral respectively:

    elif isinstance(value, _unicode):
        return ExprNodes.UnicodeNode(pos, value=EncodedString(value))
    elif isinstance(value, _bytes):
        return ExprNodes.BytesNode(pos, value=BytesLiteral(value))

Otherwise attempts to use compile-time strings in Python context result in errors like "AttributeError: 'unicode' object has no attribute 'is_unicode'".

Best regards,
Nikita Nemkin
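The failure mode above can be sketched in plain Python. This is an illustration under stated assumptions: EncodedStringSketch, describe() and wrap_constant() are hypothetical stand-ins, not the real classes or functions from Cython's compiler. The point is only that downstream code relies on an attribute that a bare string does not have, so constants must be wrapped before they reach it.

```python
class EncodedStringSketch(str):
    """Hypothetical stand-in for a compiler string wrapper."""

    encoding = None

    @property
    def is_unicode(self):
        # In this sketch, a string with no byte encoding counts as unicode.
        return self.encoding is None


def describe(value):
    # Downstream code assumes the wrapper-only attribute exists.
    return "unicode" if value.is_unicode else "bytes"


def wrap_constant(value):
    # Wrap plain strings before handing them to code that expects the wrapper.
    if isinstance(value, str) and not isinstance(value, EncodedStringSketch):
        return EncodedStringSketch(value)
    return value
```

Calling describe() on a plain str raises AttributeError, which mirrors the reported error; wrapping first avoids it.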
Re: [Cython] Bug: Returning real value crashes the code, complex value does not
Hi,

For your information, I was not able to reproduce the crash in the following environment:

* Python 2.7.3 (x86, download from python.org)
* gcc (GCC) 4.7.2 (download from mingw.org)
* Windows 7 x64

The immediate fix for your problem seems to be upgrading gcc.

Note: using gcc 4.7 on Windows requires a patch to distutils, see http://stackoverflow.com/a/6035864/204882 for details. (You can probably monkey-patch it for production use.)

Best regards,
Nikita Nemkin

On Tue, 26 Mar 2013 15:52:02 +0600, Martin Fiers wrote:

> Dear Cython developers,
>
> I stumbled upon a strange error when using Cython. I made a minimal working example, see attachment for the two necessary files.
>
> (btw I didn't find the e-mail address of Robert Bradshaw so I could not request him for an account on the issue tracker. Is it possible to put the bug on there?)
>
> To reproduce the bug:
>
> 1) Reboot to Windows :) (the bug only appears on Windows)
> 2) Run compile_bug.py to generate the Cython extension
> 3) Try to run the my_func_exposed function:
>
> python
> >>> import complex_double
> (does not crash) >>> complex_double.my_func_exposed(1,1j)
> (crashes) >>> complex_double.my_func_exposed(1,1)
Re: [Cython] Add support for the offsetof() C macro
Hi,

> offsetof() is not supported by current Cython, and I have not found any workaround (except hardcoding offsets for a specific architecture and compiler, but this is obviously wrong).

offsetof() would certainly be very useful, but in the meantime offsetof(Struct, field) can be replaced with:

    <Py_ssize_t>&(<Struct*>NULL).field

It's not ANSI C, but is portable enough.

Another option (for extern or public structs only) is to abuse renaming:

    enum: Struct_offsetof_field1 "offsetof(Struct, field1)"

This way a symbolic name can be given to any expression to be literally pasted into the generated code.

Best regards,
Nikita Nemkin
Re: [Cython] Add support for the offsetof() C macro
On Fri, 05 Apr 2013 23:31:44 +0600, Sturla Molden wrote:

>>> offsetof() is not supported by current Cython, and I have not found any workaround (except hardcoding offsets for a specific architecture and compiler, but this is obviously wrong).
>>
>> offsetof() would certainly be very useful, but in the meantime offsetof(Struct, field) can be replaced with:
>>
>>     <Py_ssize_t>&(<Struct*>NULL).field
>>
>> It's not ANSI C, but is portable enough.
>
> This will dereference a NULL pointer.

Nope, it won't. It is purely an address calculation. Works fine in both msvc and gcc. offsetof is a builtin in gcc, but msvc implements it as a macro similar to the above.

> Also, Py_ssize_t is not guaranteed to be long enough to store a pointer (but Py_intptr_t is). Use Py_intptr_t when you cast pointers to integers.

Thanks, Py_intptr_t is indeed more appropriate. I didn't know it existed.

>> Another option (for extern or public structs only) is to abuse renaming:
>>
>>     enum: Struct_offsetof_field1 "offsetof(Struct, field1)"
>
> This will fail if "Struct" is name mangled by Cython. Basically it requires that it is defined outside of the Cython code, e.g. in a header file.

Please note "(for extern or public structs only)". These are not mangled.

Best regards,
Nikita Nemkin
Re: [Cython] Shared Cython runtime (was: Upcoming cython/numpy breakage with stride checking)
On Tue, 09 Apr 2013 19:33:20 +0600, mark florisson wrote:

> On 9 April 2013 14:24, Stefan Behnel wrote:
>> mark florisson, 09.04.2013 15:13:
>>> On 9 April 2013 14:11, Stefan Behnel wrote:
>>>> Nathaniel Smith, 09.04.2013 15:00:
>>>>> On 9 Apr 2013 13:50, "Stefan Behnel" wrote:
>>>>>> There's also the problem of dependency hell and getting rid of old
>>>>>> modules once they are no longer used on the user side. And also,
>>>>>> how to get them there in the first place. Having one package
>>>>>> overwrite the files of another during its installation is just
>>>>>> asking for trouble.
>>>>>
>>>>> The system I described does not involve the addition of any new
>>>>> files to any package.
>>>>
>>>> I take it then that you were envisaging a separate "cython-runtime"
>>>> package on PyPI that Cython compiled modules would have to depend on?
>>>>
>>>> As long as people install their stuff using pip, that could work for
>>>> them mostly ok, although with the regrettable Cython user impact of
>>>> having to set that dependency for their packages in the first place.
>>>>
>>>> If people want to install stuff manually, dependency hell gets close.
>>>>
>>>> Or did you see any other ways of getting these things installed
>>>> automatically, with a smaller user impact?
>>>
>>> For reference, here's a CEP about this written last year:
>>> http://wiki.cython.org/enhancements/libcython
>>
>> Ok, but that CEP excludes the rather vital problem of distribution and installation.
>>
>> I also fail to see a reference to the problem of how multiple modules will interact that use different Cython runtime versions. That's a substantially bigger problem once symbols start becoming externally visible.
>>
>> Stefan
>
> I didn't say it was complete :) But the way I see it is basically what Nathaniel said, i.e. Cython modules dependent on the runtime import it at import time. It simply imports 'cython.runtime', which has been made available by the first module to initialize the runtime (compiled with --include-runtime), or otherwise must be present on the filesystem. So user packages can depend on a cython-runtime-x.y package (where each x.y is a different package), so pip will install all the runtime versions users need (or maybe we can otherwise improve upon this scheme).

An extra dependency may not matter much for scientific code, but there are other users of Cython: library bindings, speedups for general purpose modules etc. Another (versioned!) binary dependency will become a liability to them.

If memoryviews rely on autogenerated (module-specific) code, how is a common runtime supposed to help? The bulk of Cython utility code is also either inline or module-specific.

One alternative for code reuse in large Cython projects could be packaging multiple modules into one shared library.

Best regards,
Nikita Nemkin
Re: [Cython] Shared Cython runtime
On Tue, 09 Apr 2013 21:21:25 +0600, Stefan Behnel wrote:

> Oh, and even simpler, you can just stick multiple Cython modules with their module init functions into one big shared library and have the "main" init function of that module call the others. That will stick them into sys.modules as a side effect, so it's not really different from going through Python's import machinery for each module separately (just way more efficient - at least if you really need all of the modules...).

That's exactly what I meant.

> There may be glitches with stuff like "__file__" and friends, but that should not be any worse than the current way of workings.

I know of two "glitches":

1) Py_InitModule4 expects the qualified module name to be provided via the _Py_PackageContext global, in order to initialize the new module's __name__. Of course __name__ can also be set manually afterwards.

2) The "top level" init function does not have access to its own __file__, but it has to initialize the submodules' __file__ somehow. My solution is to query the current shared library name directly from the OS (GetModuleFileName() on Windows, dladdr() on everything else).

I'm interested in implementing this feature someday. For now, doing it manually is good enough.

Best regards,
Nikita Nemkin
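The bundling idea can be simulated in pure Python, as a rough sketch only. Everything here is hypothetical: init_bundle(), the module names and the file path are made up for illustration, and real extension modules would do this from C init functions. The sketch assumes just that the import statement consults sys.modules first, so modules registered by a "main" init function become importable without any file lookup, and that each of them can be given the shared library path as __file__.

```python
import sys
import types


def init_bundle(bundle_name, bundle_file, sub_inits):
    """Simulate a "main" init function that also initializes bundled modules."""
    main = types.ModuleType(bundle_name)
    main.__file__ = bundle_file
    sys.modules[bundle_name] = main
    for qualified_name, init in sub_inits.items():
        sub = types.ModuleType(qualified_name)  # __name__ is the qualified name
        sub.__file__ = bundle_file              # all modules share one library path
        init(sub)                               # run the bundled module's init code
        sys.modules[qualified_name] = sub
    return main


def _init_fast(module):
    # Hypothetical module body.
    module.answer = 42


init_bundle("bundle_demo", "/hypothetical/bundle_demo.so",
            {"bundle_demo_fast": _init_fast})

import bundle_demo_fast  # served from sys.modules, no file access
```

In this simulation both glitches are handled by hand: the qualified name is passed in explicitly and __file__ is copied from the bundle.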
Re: [Cython] Shared Cython runtime
Robert Bradshaw, 09.04.2013 20:16:

> Yep. We could even create stub .so files for the included ones that did nothing but import the big one to avoid having to worry about importing things in the right order.

The stubs don't have to be shared libraries, pure python would work just fine (and reduce the complexity).

> The big question is what API to specify that things should be linked together. An option to cythonize(...) could be used, but for some projects (e.g. Sage) one would want more control over which modules get bundled together rather than all-or-nothing.

> No-one forces you to use a plain cythonize("**/*.pyx"). Just be more specific about what files to include in each pattern that you pass in. And you can always call cythonize() more than once if you need to. Once for each meta-module should usually be acceptable, given that the bundled source modules would tend to have a common configuration anyway.

> That would still be painful. Ideally, users should never have to modify the setup.py file.

Users have to plan for what to bundle, what the package structure will be, etc. It is not an "enable and forget" type of thing. (Unless you have an all-Cython package tree.)

I prefer explicitly creating "bundle" extensions with distutils.core.Extension, passing multiple pyx files as sources, then passing the result to cythonize.

If you really want to push this into cythonize (it already does more than it should IMO), one option is to add a new comment directive, for example:

    # cython: bundle = mymodule._speedups

similar to how distutils options are passed. All .pyx files with the same "bundle" value will be put in that bundle.

> There may be glitches with stuff like "__file__" and friends, but that should not be any worse than the current way of workings.

> This can probably be manipulated by Cython, though it's unclear what the best value would be.

The best value is the shared library pathname. All extension modules have it like that. The fact that multiple modules share the same .so is (luckily) irrelevant to the Python import system.

Best regards,
Nikita Nemkin
Re: [Cython] Shared Cython runtime
On Wed, 10 Apr 2013 11:24:33 +0600, Stefan Behnel wrote:

> Nikita Nemkin, 10.04.2013 06:22:
>> The stubs don't have to be shared libraries, pure python would work just fine (and reduce the complexity).
>
> Sure. That's actually the most common way to do it. Most C modules in CPython's stdlib have a Python wrapper script, for example. It's also not uncommon to add a certain amount of functionality in that script. That makes even just "having it there" a feature for future development.

Cython's ability to create self-contained binaries is an important feature. For bundles of unrelated modules, stubs seem to be the only option, but whole package bundles can do without them.

> Maybe compiling whole packages isn't all that a good idea after all, unless at least most of the modules in that package are actually required for the core functionality.

In-binary import is quite fast. No extra file access, no parsing. Capsule indirection can be eliminated, some other things optimized...

As an option, PEP 302 (Import hooks) provides an elegant way to implement delayed imports for whole package bundles.

> The problem is not which value to use. The problem is that the shared library pathname is not known at init time, except inside of the loader, and it doesn't tell us about it. In the specific case of a meta-module, it won't even be set before *all* init functions of the submodules were called and the main init function terminates. Which is a really bad situation, because it means that even though the main module will eventually have its __file__ set by the loader, it has no way to propagate it to the submodules anymore. So they won't have their __file__ property set at all.
>
> Just in case someone missed it, here's the bug URL once again:
> http://bugs.python.org/issue13429
>
> Cython can fix up stuff related to the FQMN and sys.modules (and it does that already), but can't do anything about the file path. That makes things like resource loading rather annoying.
>
> BTW, here's also the python-dev discussion on the topic:
> http://thread.gmane.org/gmane.comp.python.devel/135764
>
> Maybe that discussion should be revived and this use case added to it.

Well, here is the workaround Cython can use:
https://gist.github.com/nnemkin/5352088
(Warning: I haven't fully tested it yet.)

Best regards,
Nikita Nemkin
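A rough pure-Python sketch of the delayed-import idea mentioned above. It uses the find_spec/exec_module import hook protocol from current Python 3, which is the successor of the PEP 302 protocol available at the time of this thread. BundleFinder, the module names and the path are hypothetical, and the "init functions" are ordinary Python callables standing in for the C init functions of a bundle.

```python
import importlib
import importlib.abc
import importlib.machinery
import sys


class BundleFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Hypothetical hook: initialize a bundled module on first import only."""

    def __init__(self, bundle_file, init_funcs):
        self.bundle_file = bundle_file
        self.init_funcs = init_funcs   # module name -> init callable
        self.initialized = []          # record of modules actually initialized

    def find_spec(self, fullname, path=None, target=None):
        if fullname not in self.init_funcs:
            return None                # not ours, let other finders try
        return importlib.machinery.ModuleSpec(
            fullname, self, origin=self.bundle_file)

    def create_module(self, spec):
        return None                    # use the default module object

    def exec_module(self, module):
        module.__file__ = self.bundle_file        # shared library path
        self.init_funcs[module.__name__](module)  # delayed init runs here
        self.initialized.append(module.__name__)


def _init_a(module):
    module.value = "a"


def _init_b(module):
    module.value = "b"


finder = BundleFinder("/hypothetical/lazy_bundle.so",
                      {"lazy_bundle_a": _init_a, "lazy_bundle_b": _init_b})
sys.meta_path.insert(0, finder)

mod_a = importlib.import_module("lazy_bundle_a")  # only this module is initialized
```

The second bundled module stays uninitialized until something imports it, which is the delayed behaviour the hook is meant to provide.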
Re: [Cython] Shared Cython runtime
On Wed, 10 Apr 2013 12:57:48 +0600, Stefan Behnel wrote:

> Nikita Nemkin, 10.04.2013 08:44:
>> Well, here is the workaround Cython can use:
>> https://gist.github.com/nnemkin/5352088
>> (Warning: I haven't fully tested it yet).
>
> Interesting. Looks like that could help here, yes. Do you know if it also works for statically linked binaries? (i.e. would it return the name of the executable in that case?)

It will return the executable name, but the correct behavior for statically linked modules is to not have a __file__ attribute at all. There is probably a #define that tells when the module is compiled statically, I just haven't looked into it yet.

Best regards,
Nikita Nemkin
[Cython] bint and autodoc signatures
Hi,

Arguments, return values and properties of type 'bint' appear as 'int' in autogenerated signatures. This is very confusing for the end user and logically wrong. (For one thing, bint and int handle input arguments differently: bint accepts pretty much any Python object, while int requires something numeric.)

Would it be acceptable to change the bint display presentation to 'bool'?

Note: other primitive types (short, int, float, long long etc) don't have this problem, because they are all numeric and coerce to/from Python numerics in an obvious way.

Best regards,
Nikita Nemkin
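The difference in argument handling can be approximated in plain Python. This is only an analogy: accept_bint() and accept_int() are hypothetical helpers, bool() stands in for bint coercion and operator.index() stands in for int coercion, and Cython's actual conversion rules differ in detail.

```python
import operator


def accept_bint(value):
    # Truth testing works for almost any Python object.
    return bool(value)


def accept_int(value):
    # operator.index() accepts only integer-like objects and raises
    # TypeError for everything else.
    return operator.index(value)
```

A caller who reads 'int' in a signature would not expect a string or a list to be accepted, which is why 'bool' describes a bint parameter better.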
Re: [Cython] [cython] Autodoc improvements (#216)
On Sat, 20 Apr 2013 02:11:21 +0600, Stefan Behnel wrote:

> Nikita Nemkin, 18.04.2013 07:07:
>> 2) Public/readonly cdef attributes can have docstrings attached, using the [PEP 258 attribute docstring syntax](http://www.python.org/dev/peps/pep-0258/#attribute-docstrings). Example:
>>
>>     cdef class X:
>>         cdef public attr
>>         """attr docstring"""
>>
>> When ``embedsignature=True`` Cython already adds a signature docstring to the generated property (assuming the example above). This patch allows to add an actual docstring too. Docstrings on private cdef attributes will give a warning ("docstring ignored"). Other forms and uses of attribute docstrings, as described in the PEP 258, are not supported at all (SyntaxError), because there is no standard introspection mechanism designed for them. ([PEP 224](http://www.python.org/dev/peps/pep-0224/) describing such mechanism had been rejected.)
>
> Worth noting that PEP 258 was rejected, although definitely for other reasons than this feature.
>
> I remember discussing this topic before on the mailing list, but don't remember if we reached a conclusion. I faintly recall that we preferred making the property explicit as soon as there's more to it than a plain 1:1 mapping to the cdef attribute.

Separating the property from the attribute requires mass renaming of the attribute (as was recently discussed here) and bloating the code x4. All this just to attach a docstring?

PEP 258 was rejected, but Sphinx supports it, and so does epydoc. The need to autodocument attributes remains.

> The above syntax is not wrong, but it also doesn't feel right to me. For example, it's impossible to draw a distinction between the above and something like
>
>     cdef class X:
>         cdef public attr
>
>         """
>         quotes used for commenting out code here.
>         """
>
> And users are unlikely to notice that they just dropped some internals into the docstring of a property they didn't even know they were implementing. IMHO, the user intention is way too unclear in this syntactic construct.

An attribute docstring must appear immediately after the attribute. The extra blank line in your example makes the string standalone, just as you intended.

Also, internal comments in the middle of the class are not usually put into strings; actual comment syntax is more appropriate here.

Documenting attributes manually, for example, with ".. attribute" directives in the class docstring or .rst file, creates the following problems:

1) In embedsignature mode (which is almost mandatory for a decent autodoc) Cython adds signature docstrings to public attributes, making them appear twice: one instance with a manually written doc, another instance with a signature docstring.
2) Manually documented attributes don't follow autodoc settings (ordering, formatting). They are also invisible to autosummary and other extensions.
3) (Obviously) non-uniformity between property, function and attribute documentation practice.

Attribute docstrings are the cleanest solution to all of these problems. And the patch is just a few lines (if you discount the refactoring of the "XXX This should go to AutoDocTransforms" block).

Best regards,
Nikita Nemkin
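For comparison, here is a pure-Python sketch of what a documented public attribute amounts to once the property is spelled out by hand. The class layout, the accessor names and the "attr: int" signature line are illustrative assumptions, not the exact output of Cython's embedsignature mode. It shows both the boilerplate of an explicit property and the fact that the docstring simply lands in the property descriptor's __doc__.

```python
class X:
    """Hypothetical class with one documented attribute."""

    def __init__(self):
        self._attr = 0  # renamed storage, needed once the property is explicit

    def _get_attr(self):
        return self._attr

    def _set_attr(self, value):
        self._attr = value

    # Signature line plus docstring, stored on the property descriptor.
    attr = property(_get_attr, _set_attr,
                    doc="attr: int\n\nattr docstring")
```

Documentation tools that read X.attr.__doc__ then pick up the text without any manual ".. attribute" directive.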
Re: [Cython] Autodoc improvements (#216)
On Sat, 20 Apr 2013 14:27:12 +0600, Stefan Behnel wrote: Nikita Nemkin, 20.04.2013 09:11: On Sat, 20 Apr 2013 02:11:21 +0600, Stefan Behnel wrote: Separating property from the attribute requires mass renaming of the attribute (as was recently descussed here) and bloating the code x4. All this just to attach a docstring? Well, once there is a docstring, that factor goes down to at most 3x. It's not entirely overkill, although I agree that it's definitely way more complexity than just adding a docstring. PEP 258 was rejected, but Sphinx supports it, an so does epydoc. That makes it more reasonable then, but see below. The above syntax is not wrong, but it also doesn't feel right to me. For example, it's impossible to draw a distinction between the above and something like cdef class X: cdef public attr """ quotes used for commenting out code here. """ And users are unlikely to notice that they just dropped some internals into the docstring of a property they didn't even know they were implementing. IMHO, the user intention is way too unclear in this syntactic construct. Attribute docstring must appear immediately after the attribute. The extra blank line in your example makes the string standalone, just as you intended. I can't see that from the tests you wrote. I honestly didn't expect random string literals in the middle of the class. Adding an extra test is never a problem. Also, internal comments in the middle of the class are not usually put into strings, actual comment syntax is more appropriate here. The less common it is, the more likely it is to be a source of confusion when it happens. And the proposed syntax is definitely prone to errors that go undetected. Very unlikely and very minor documentation(!) errors. Wrong docstring on a property descriptor, couldn't be any more harmless. I don't think "bad syntax leads to errors" argument holds here. Documenting attributes manually, for example, with ".. 
attribute" directives in the class docstring or .rst file, creates the following problems: 1) In embedsignature mode (which is almost mandatory for a decent autodoc) It's not more than a clumsy work-around for the CPython quirk that C implemented functions couldn't expose their signature. And that problem is much better tackled by completing the support for PEP 362. http://www.python.org/dev/peps/pep-0362/ PEP362 is irrelevant for properties and attributes. Anyway, I was just listing technical problems biting autodoc users. 1) In embedsignature mode (which is almost mandatory for a decent autodoc) Cython adds signature docstrings to public attributes, making them appear twice: one instance with a manually written doc, another instance with a signature docstring. 2) Manually documented attributes don't follow autodoc settings (ordering, formatting). They are also invisible to autosummary and other extensions. 3) (Obviously) non-uniformity between property, function and attribute documentation practice. The way I see it, if the embedsignature feature actually died because of PEP 362, you could just document attributes yourself without having Cython interfere with it at all. And AFAIU, that would solve most of your above problems. I did document them manually at first. It is ugly (function doc in one place, attribute doc in another) and it does not work well with autodoc and autosummary (no summaries for attributes, attribute section appears in the wrong place and other "minor" problems that require heavy Sphinx hacks to get around.) All in all, it completely defeats the purpose of AUTOdoc. The practice of autodocumentation is widespread in Python (docstrings are a part of the language semantics) and I'd rather it support properties and attributes without messy workarounds. Sphinx has some way to go in this area, but putting strings into __doc__ slots is Cython's responsibility. Attribute docstrings are the cleanest solution to all of these problems. 
And the patch is just a few lines (if you discount the refactoring of the "XXX This should go to AutoDocTransforms" block.) It's not about the implementation. The question is if this feature should become part of the language because it's "special enough to break the rules" of OOTDI and if so, how we should expose it. And yes, that's a wonderful field for bike shedding. It does not change the language syntax one bit, it just adds a semantic on top, like all docstrings do. Think of it this way: cdef public/readonly is a shorthand for a complete property declaration. Similarly, attribute docstring is a shortcut for providing a __doc__ for this property. PEP258/Sphinx/epydoc already have an established convention for such docstrings, we just put these pieces together. A menti
Re: [Cython] Autodoc improvements (#216)
On Sat, 20 Apr 2013 17:41:56 +0600, Stefan Behnel wrote: Nikita Nemkin, 20.04.2013 12:43: On Sat, 20 Apr 2013 14:27:12 +0600, Stefan Behnel wrote: I did document them manually at first. It is ugly (function doc in one place, attribute doc in another) As it should be. The attributes are part of the class documentation, whereas the function/method documentation belongs to the functions. I can't see why you consider this totally normal distinction "ugly". Why should attributes and especially properties (class members) be treated differently from methods (class members) for documentation purposes? From the OOP standpoint they certainly should not. Smalltalk and Java are good examples. (Smalltalk even has a documentation convention extremely similar to Python docstrings.) All I'm saying is that docstrings for cdef attributes are a very special case. If you had a Python class in a Cython module, you couldn't add a docstring to its attributes at all. And there already is a dedicated syntax for non-trivial properties in cdef classes. So the question is: is a property with a docstring still a trivial property or not? And is the use case common enough to complicate the simple special case that already exists, even though the completely general case can already be handled? "cdef public" is one of the two forms of property declaration in Cython. (The other being "property: ...") Both use nonstandard Python syntax and neither seems a "special case" to me. The property is trivial if and only if its code consists of one __get__ function with a "return self.value" body. What does a docstring have to do with it? "property:" is not completely general. I won't rename every single public attribute for the privilege of attaching a docstring to it (and a ton of boilerplate code). It makes no sense from a technical and maintenance standpoint. Considering use cases. Currently I have 83 cdef public/readonly attributes across 46 classes that represent ~20% of the source codebase. 
All of them have docstrings (since they are public and part of the API.) This feature is very low impact to worry about added complexity. Attribute docstrings are the cleanest solution to all of these problems. And the patch is just a few lines (if you discount the refactoring of the "XXX This should go to AutoDocTransforms" block.) It's not about the implementation. The question is if this feature should become part of the language because it's "special enough to break the rules" of OOTDI and if so, how we should expose it. And yes, that's a wonderful field for bike shedding. It does not change the language syntax one bit I didn't say it would change the syntax. It changes the semantics of a string that follows an attribute declaration (for whatever reason) and makes strings that appear after such declarations the official One Right Way to document cdef attributes, i.e. a Cython language feature. Since there are currently Zero Right Ways to document these attributes, having at least one is a good thing. "property:" form is not diminished by that, in cases _when it is warranted_. BTW, the pure Python mode doesn't currently have this feature either, it seems. (I'm not even sure it has public/readonly attributes...) Pure python mode project can be processed by Sphinx like any ordinary Python project, with complete support for all autodoc features. Since Python level attributes have no __doc__ slots, there is nothing Cython can do with their docstrings anyway (except parsing them without syntax errors). Similarly, attribute docstring is a shortcut for providing a __doc__ for this property. PEP258/Sphinx/epydoc already have an established convention for such docstrings, we just put these pieces together. ISTM that there is more than one convention: http://sphinx-doc.org/ext/autodoc.html#directive-autoattribute Other conventions? Do you mean "#:" comments? 
These forms are not part of PEP258, can't be paralleled to the "property:" syntax and require too many changes to Cython lexer and parser for no extra benefit. Therefore I have not implemented them. But the newer one apparently uses the docstring syntax (which IMHO makes way more sense than special casing comments). Exactly. A mention that a docstring can be provided for a cdef public attribute is all "exposure" it needs. Those who don't need/don't know about it won't be affected. Except by accident, as I said. Users of Sphinx *may* be aware of this, others most likely won't be. In the worst case, probability of which is near zero, the unsuspecting victim will waste a few bytes of memory (the size of the comment). This is a non-issue. Best regards, Nikita Nemkin ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
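The adjacency rule argued for in this thread ("attribute docstring must appear immediately after the attribute") can be modelled in plain Python. This is only a minimal sketch built on the standard `ast` module, not Cython's parser or the patch under discussion: the helper name `attribute_docstrings` is hypothetical, and the "next line" test is an assumed reading of the rule that a blank line makes the string standalone.

```python
import ast


def attribute_docstrings(source, class_name):
    """Map attribute names to the string literal directly below them.

    Hypothetical helper: a string counts as an attribute docstring only
    when it starts on the line right after the assignment it follows.
    """
    docs = {}
    for node in ast.walk(ast.parse(source)):
        if not (isinstance(node, ast.ClassDef) and node.name == class_name):
            continue
        body = node.body
        for stmt, following in zip(body, body[1:]):
            if not isinstance(stmt, ast.Assign):
                continue
            is_string = (
                isinstance(following, ast.Expr)
                and isinstance(following.value, ast.Constant)
                and isinstance(following.value.value, str)
            )
            adjacent = following.lineno == stmt.end_lineno + 1
            target = stmt.targets[0]
            if is_string and adjacent and isinstance(target, ast.Name):
                docs[target.id] = following.value.value
    return docs


SOURCE = '''
class X:
    width = 0
    """Width in pixels."""

    height = 0

    """Standalone string, separated by a blank line."""
'''

found = attribute_docstrings(SOURCE, "X")
```

Under these assumptions only `width` picks up a docstring; the string after `height` is treated as a standalone literal because of the blank line.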
Re: [Cython] Cython code producing different, incorrect results under Python 2.7 (not 3.x) under Cython 0.19
On Wed, 24 Apr 2013 12:19:36 +0600, Stefan Behnel wrote: The bug is here (at least one of them): https://github.com/scikit-image/scikit-image/blob/master/skimage/graph/_mcp.pyx#L179 "shape[:-1]" returns incorrect result: input (8, 8), expected output (8,), actual output (). I guess that means something is wrong with __Pyx_PyObject_GetSlice utility, although I wasn't able to create a simple repro yet. Best regards, Nikita Nemkin Hi, Josh Warner, 24.04.2013 08:06: Over in scikit-image we have traced an odd problem with a particular Cython file to the 0.19 update of Cython. From at least Cython 0.15.1 (probably earlier) through 0.18, `_mcp.pyx` in `skimage.graph` compiled and executed correctly, passing all package tests on both Python 2.7 and Python 3. After 0.19 was released and the Travis builds began using it, we began getting 100% repeatable errors from the previously clean master branch (example of an otherwise clean Python 2.7<https://travis-ci.org/scikit-image/scikit-image/jobs/6545505>Travis build; the Python 3 build passed<https://travis-ci.org/scikit-image/scikit-image/builds/6545504>all tests). All of these errors/failures trace back to this Cython file. Oddly, the errors only happen on Python 2.7; our Python 3 Travis build passes. We are discussing this issue in scikit-image Github Issue #534<https://github.com/scikit-image/scikit-image/issues/534>; feel free to join the discussion there. The .pyx Cython file is located here<https://github.com/scikit-image/scikit-image/blob/master/skimage/graph/_mcp.pyx>and it has an associated .pxd file here<https://github.com/scikit-image/scikit-image/blob/master/skimage/graph/_mcp.pxd>. It should be noted the file compiles and executes without errors, but its output is now incorrect in Python 2.x. 
In case the compiled results might be relevant, for your diffing pleasure here is the compiled .c file from Cython 0.18<https://gist.github.com/JDWarner/af4f8ea85dce356ce95c>which passes all tests on both Python 2.7 and Python 3.x, while here is the compiled .c file from Cython 0.19<https://gist.github.com/JDWarner/56d15b7a7527b8d4314e>which produces different, incorrect results in Python 2.7. In the short term we are temporarily forcing Travis to use the 0.18 release of Cython, but that isn't a viable long term solution. It's possible the error is on our end, but seeing as it worked with prior Cython releases we'd appreciate you taking a look. Thanks for bringing this up. You could make it a little easier for us by pointing us at the code that produces the incorrect results you are experiencing. The set of failing tests seems to be quite small, but before we start digging through your code, I'm sure you can provide pointers to the relevant code snippets for a couple of these tests (i.e. the test code itself and the major code parts that produce the results) much more quickly. Stefan ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] Cython code producing different, incorrect results under Python 2.7 (not 3.x) under Cython 0.19
On Wed, 24 Apr 2013 13:33:46 +0600, Nikita Nemkin wrote: Update: this is not a Cython bug. _mcp.pyx declares #cython: wraparound=False, any negative index is expected to fail. It worked previously because Cython was using PySequence_GetSlice which of course is not sensitive to Cython directives. The only "fix" would be to produce a warning for constant negative indexes in wraparound=False mode. Best regards, Nikita Nemkin ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
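The effect of wraparound=False on "shape[:-1]" reported in this thread can be modelled in plain Python. This is a minimal sketch, not Cython's actual slicing code: the helpers `clamp_stop` and `take_prefix` are hypothetical, and the sketch assumes that with wraparound disabled a negative stop index is clamped to zero, which is consistent with the empty tuple that was observed.

```python
def clamp_stop(length, stop, wraparound):
    """Return the effective stop index for seq[:stop] (hypothetical helper)."""
    if stop < 0:
        if wraparound:
            stop += length  # Python semantics: count from the end
        if stop < 0:
            stop = 0  # assumed: negative values are clamped, not wrapped
    return min(stop, length)


def take_prefix(seq, stop, wraparound=True):
    """Model of seq[:stop] with or without negative-index wraparound."""
    return seq[:clamp_stop(len(seq), stop, wraparound)]


shape = (8, 8)
with_wraparound = take_prefix(shape, -1, wraparound=True)
without_wraparound = take_prefix(shape, -1, wraparound=False)
```

With wraparound the stop index becomes 1 and the result is `(8,)`; without it the stop index is clamped to 0 and the result is `()`, matching the input and output quoted in the report.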
Re: [Cython] Autodoc improvements (#216)
On Tue, 23 Apr 2013 10:19:15 +0600, Robert Bradshaw wrote: Jumping into this thread late, improvements (1) and (3) certainly seem beneficial. As far as documenting "attributes," I can't see much of a use, but the downsides seem low enough (accidentally assigning a stray string to the previously declared attribute) and the implementation non-invasive enough that I'm +0.5 for this change. It doesn't introduce new syntax or behavior, just additional semantic meaning for those (otherwise unlikely to occur) strings and sure beats writing them out as properties (though the fact that cdef attributes are properties should be considered more of an implementation detail IMHO). I've just found a 4 year old ticket requesting the same feature (2): http://trac.cython.org/cython_trac/ticket/206 Best regards, Nikita Nemkin ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
[Cython] A little bugtracker cleanup
Hi, While browsing Cython's bugtracker I've found a few issues that should be closed as fixed: http://trac.cython.org/cython_trac/ticket/42 http://trac.cython.org/cython_trac/ticket/94 http://trac.cython.org/cython_trac/ticket/113 http://trac.cython.org/cython_trac/ticket/246 http://trac.cython.org/cython_trac/ticket/358 Best regards, Nikita Nemkin ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] A little bugtracker cleanup
On Sat, 27 Apr 2013 04:07:35 +0600, Robert Bradshaw wrote: Thanks. I closed some of these, for the rest I'd like to verify we at least have a regression test. http://trac.cython.org/cython_trac/ticket/42 is about const support. I'm not sure it's necessary to test it separately, but here is the test anyway https://github.com/cython/cython/pull/219 http://trac.cython.org/cython_trac/ticket/113 was fixed here https://github.com/cython/cython/pull/200, tests included. Best regards, Nikita Nemkin ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] any more fixes for 0.19.1 ?
On Fri, 03 May 2013 19:49:19 +0600, Stefan Behnel wrote: I also think that Ticket 775 ("memoryview" name leaks into module namespace) should eventually be fixed as it's annoying and unexpected, although not necessarily right in this release (it's been like this forever). http://trac.cython.org/cython_trac/ticket/775 I'm working on this one and I need an opinion: should utility classes be made internal by default or should they use @cython.internal explicitly? Best regards, Nikita Nemkin ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] [cython] Hide Cython utility classes (like memoryview) from Python level module scope. (#222)
On Sun, 05 May 2013 17:03:54 +0600, Stefan Behnel wrote: Nikita Nemkin, 04.05.2013 19:52: Two changes included: 1) cdef classes in utility code can have compiler directives attached to them. This is not used anywhere ATM, but memoryviews may benefit from ``@cython.final``. 2) All utility classes are excluded from module dictionary by *implicitly* marking them with ``@cython.internal`` . This fixes [#775](http://trac.cython.org/cython_trac/ticket/775), test is included. I don't quite understand what CythonScope is and how utility classes are *supposed* to be hidden, but as it is now, utility code scope is merged into main module scope and there is nothing special about its classes. BTW if a user declares his own class with the same name as utility class (for example, ``memoryview``), everything breaks down. I wonder why utility classes actually need a Python name. Can't they just live with None as "name"? All they should really need is their cname and a properly analysed entry stored in the right places, so deleting their Python visible name when merging them into the main module should fix this. entry.name serves for general identification and bookkeeping, not just to provide a python level name. Non-null entry name is a very useful invariant, I'd rather not break it for something trivial like name hiding. All codegen algorithms will have to worry about (class) entries with null names afterwards. Even if it works currently, it may break in the future. Anyway, just setting entry.name to None does not work, because it is not the only place to get a python name (and of course it's never checked for None). For example, module init code uses ClassScope.class_name. Some other code may use entry.type.name etc... Best regards, Nikita Nemkin ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
[Cython] Compiler crash fix
Hi,

Someone please apply this patch (too simple for pull request):

diff --git a/Cython/Compiler/ExprNodes.py b/Cython/Compiler/ExprNodes.py
index 0406fad..640d002 100755
--- a/Cython/Compiler/ExprNodes.py
+++ b/Cython/Compiler/ExprNodes.py
@@ -4993,7 +4993,7 @@ class AttributeNode(ExprNode):
         # creates a corresponding NameNode and returns it, otherwise
         # returns None.
         type = self.obj.analyse_as_extension_type(env)
-        if type:
+        if type and type.scope:
             entry = type.scope.lookup_here(self.attribute)
             if entry and entry.is_cmethod:
                 if type.is_builtin_type:

It fixes CompilerCrash (None does not have "lookup_here" method) that I have observed on two occasions:

1) cdef class Name; cimport Name; Name.attr
2) from X cimport Name; Name.attr  # cimport_from_pyx is active, Name is a class with errors

Makes me wonder if ErrorScope should be introduced to avoid None scope checks.

Best regards, Nikita Nemkin ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
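The two approaches mentioned in this message, guarding against a None scope versus introducing an ErrorScope, can be compared in a small Python sketch. Everything here is hypothetical and heavily simplified: `Scope`, `ErrorScope` and `ExtType` are stand-ins written for illustration, not the classes in Cython's compiler, and only `lookup_here` is modelled.

```python
class Scope:
    """Simplified stand-in for a compiler scope (hypothetical)."""

    def __init__(self, entries=None):
        self.entries = dict(entries or {})

    def lookup_here(self, name):
        return self.entries.get(name)


class ErrorScope(Scope):
    """Null-object scope: every lookup finds nothing instead of crashing."""

    def __init__(self):
        super().__init__()


class ExtType:
    """Simplified stand-in for an extension type with an optional scope."""

    def __init__(self, scope=None):
        self.scope = scope


def lookup_guarded(ext_type, attribute):
    # Mirrors the shape of the patch: check the scope before using it.
    if ext_type and ext_type.scope:
        return ext_type.scope.lookup_here(attribute)
    return None


def lookup_unguarded(ext_type, attribute):
    # Safe only if broken types carry an ErrorScope rather than None.
    return ext_type.scope.lookup_here(attribute)


broken = ExtType(scope=None)
recovered = ExtType(scope=ErrorScope())
healthy = ExtType(scope=Scope({"attr": "entry"}))
```

The trade-off is where the check lives: the guard has to be repeated at every call site, while a null-object scope pushes the "nothing found" behaviour into one class.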
Re: [Cython] How should C-global deletion be handled?
On Sat, 25 May 2013 23:29:29 +0600, Vitja Makarov wrote:

2013/5/25 Stefan Behnel
On 25.05.2013 08:34, Robert Bradshaw wrote:
> On Thu, May 23, 2013 at 10:02 PM, Vitja Makarov wrote:
>> Recently I've found that the following code causes segmentation fault:
>>
>> cdef object f
>> del f
>> print f
>>
>> So the question is: how should that work?
>>
>> global objects are implicitly initialized to None and no CF and no cf
>> analysis is performed for it.
>>
>> So I see three options here:
>>
>> 1. prohibit cglobal deletion
>> 2. set it back to None
>> 3. check for a null value at every reference and assignment
>
> I'd go for 1, with 2 as a backup option.

+1 for 1. Stefan

I tried to disable it and found that it's already used for global C++ objects deletion. It would be strange to have global deletion for C++ objects and not for ordinary python objects.

There is nothing strange about it. C++ pointers behave differently from Python objects and users are well aware of that. Moreover, C++ deletion is a totally different operation that just happens to reuse the same keyword.

Best regards, Nikita Nemkin ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
[Cython] Exception check optimization
Hi, I wonder why is __pyx_filename (in exception check blocks) tracked dynamically? AFAIK it's impossible to split function body between multiple files (include only works at the top level), which makes filename a compile time constant for any given function. If the above is correct, __pyx_filename variable can be eliminated, saving at least 5 bytes per exception check (more on x64). On a related note, why is c_lineno enabled by default? So far I have only found one pretty rare case when it is useful - debugging autogenerated code in module init. I suspect that most people don't bother turning it off and get suboptimal code. Best regards, Nikita Nemkin ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
[Cython] Old style dependency tracking
Hi, I wonder if it's time to remove the old dependency tracking code? I mean this: http://trac.cython.org/cython_trac/ticket/379, CompilationOptions.recursive, .dep file generation etc. Since it has been broken for a long time already, there is no problem with backwards (in)compatibility. And cythonize already provides equivalent functionality. Best regards, Nikita Nemkin ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] Core dump when Python is built without debug symbols
On Thu, 30 May 2013 17:23:39 +0600, Marin Atanasov Nikolov wrote: Hello, First, apologies if this is not exactly the right place to post this, but I have tried already cython-users@ and my post is still pending somewhere for approval, so I've decided to give it a shot here. I'm working on Cython wrappers for a C library and everything was going smooth on my dev machine, until I've decided to install the wrappers on another machine. The only difference between these machines is that my dev machine has Python built with debug symbols and the other machine does not have debug symbols in Python. The issue I'm having and was trying to solve for the past few days is that if I run my Cython wrappers on a machine where Python does not have debug symbols it core dumps ugly. If I run the same code on another machine with Python + debug symbols then everything is running perfect. I've done some debugging and tracing as well, and noticed that the core dump happens at PyTuple_Size() when I have Python *without* debug symbols. Running Python + debug symbols shows that the interpreter takes a different path and does not core dump (does not even call PyTuple_Size() for some reason) Anyway, I've uploaded the backtrace and some gdb tracing in here: * http://users.unix-heaven.org/~dnaeon/cython-wrappers/wrappers-trace.txt The code that I have you can find in here: * http://users.unix-heaven.org/~dnaeon/cython-wrappers/src/ What really puzzles me is why it core dumps when Python does not have debug symbols and why it takes a different path (and not core dumping) when Python is built with debug symbols. I'd say that if it core dumps because of a NULL pointer or something similar, then having Python with or without debug symbols should not make a difference, but it does for some reason. I've been working on this for the past few days and I'm out of ideas now. Any help/hints/feedback is much appreciated. 
In pkg-db.pxi, this line:

    return Pkg(<object>pkg)

and in pkg-pkg.pxi this line:

    def __cinit__(self, pkg):
        self._pkg = <c_pkg.pkg *>pkg

are the problem. You are trying to pass a C pointer to __cinit__. What actually happens is that, because of the <object> cast, Cython treats pkg as a Python object _itself_ and tries to do some object operations on it.

Two ways to solve this (really common) problem:

1) Pass your pointer as a real Python object containing the integer value of the pointer:

    def __cinit__(self, pkg):
        self._pkg = <c_pkg.pkg *><Py_ssize_t>pkg  # or Py_intptr_t

    return Pkg(<Py_ssize_t>pkg)

Note the <Py_ssize_t> cast. When used on a C pointer, it makes a C integer that Cython converts to a Python int object. And when <Py_ssize_t> is used on a Python int object, Cython extracts its integer value as a C-level value, so it is castable to a C-level pointer. Using Py_ssize_t (or Py_intptr_t or your platform's intptr_t) guarantees that the pointer will fit.

2) Don't use __cinit__. Declare your own cdef init method and call that. Since it's a cdef method, it can take any C-level arguments, pointers, whatever, without the need to convert to Python objects and back.

    @cython.final  # small optimization
    cdef _init(self, c_pkg.pkg * pkg):
        self._pkg = pkg

    # cdef __cinit__ not needed. Maybe use it to set default
    # attribute values.

    cdef Pkg obj = Pkg.__new__(Pkg)  # fast object creation. Pkg() also works.
    obj._init(pkg)

(When you use the __new__ trick, __cinit__ is called without arguments and __init__ is not called at all. The benefit of this ugly syntax is that __new__ is faster than creating the object with traditional Python syntax.)

Also, if all your initialization is pointer assignment, you can just do it directly:

    cdef Pkg obj = Pkg.__new__(Pkg)  # fast object creation
    obj._pkg = pkg

There is a problem here that is relevant to cython-devel. I think the code you wrote should never have compiled in the first place. An explicit cast must be required if you want to reinterpret a C pointer as an object or vice versa.
Best regards, Nikita Nemkin ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
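The first workaround in the reply above, passing the pointer as a Python integer and casting it back on the other side, can be tried out in pure Python with `ctypes`. This is a minimal sketch of the idea only, not the Cython code from the thread: the class `PkgHandle` is hypothetical, and a `ctypes.c_int` stands in for the C library's structure.

```python
import ctypes


class PkgHandle:
    """Hypothetical wrapper that receives a pointer as a plain Python int."""

    def __init__(self, address):
        # 'address' is an ordinary Python integer; cast it back to a
        # typed pointer, as the wrapper would do on the C side.
        self._pkg = ctypes.cast(address, ctypes.POINTER(ctypes.c_int))

    def value(self):
        return self._pkg.contents.value


native = ctypes.c_int(42)            # stand-in for the C-side structure
address = ctypes.addressof(native)   # the pointer value as a Python int

handle = PkgHandle(address)
first_read = handle.value()

native.value = 7                     # change the underlying memory
second_read = handle.value()         # the wrapper sees the same memory
```

The integer carries only the address, so the wrapper does not own the memory: the original object has to stay alive for as long as the handle is used.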
Re: [Cython] Exception check optimization
On Thu, 30 May 2013 08:25:55 +0600, Robert Bradshaw wrote:

> On Sun, May 26, 2013 at 10:04 AM, Nikita Nemkin wrote:
>> Hi, I wonder why is __pyx_filename (in exception check blocks) tracked dynamically? AFAIK it's impossible to split function body between multiple files (include only works at the top level), which makes filename a compile time constant for any given function. If the above is correct, __pyx_filename variable can be eliminated, saving at least 5 bytes per exception check (more on x64).
>
> Probably because exception checks are heavy-weight enough that we haven't bothered optimizing them at this level yet.

By "exception checks" I mean these things:

    if (unlikely(!__pyx_t_2)) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 15; __pyx_clineno = __LINE__; goto __pyx_L1_error;}

They are fast, but the problem is the extra code: neither __pyx_filename nor __pyx_clineno assignments are necessary. Removing them reduces binary size by 2-3%.

If you're OK with it, I'll make a patch that removes __pyx_filename and disables __pyx_clineno by default.

Best regards, Nikita Nemkin ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] Core dump when Python is built without debug symbols
On Thu, 30 May 2013 19:51:16 +0600, Marin Atanasov Nikolov wrote: Any idea why having Python with debug symbols actually overcomes this issue? In debug builds, Python object structure is prepended with a couple of extra debug fields, which means a different piece of memory was wrongly interpreted and manipulated as a Python object. It is pure luck that whatever values were there didn't cause a crash. Best regards, Nikita Nemkin ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] C data type and a C function() sharing the same name
On Sat, 01 Jun 2013 19:29:17 +0600, Marin Atanasov Nikolov wrote:

> Hello, Working on creating Cython wrappers for a C library I came across a strange problem. I have in the C library a struct like this:
>
>     struct my_jobs;
>
> I also have this function, which returns the next job in the queue:
>
>     int my_jobs(struct my_jobs *jobs);
>
> Translating this into Cython and putting this in the .pxd file it looks like this:
>
>     cdef struct my_jobs
>     int my_jobs(my_jobs *jobs)
>
> During build I'm having issues because it seems that the function my_jobs() is translated in a way that it should return a "int struct my_jobs". The real problem I see is that I cannot have a data type and a function sharing the same name. How can I overcome this issue? Suppose that I wrote the C API I could change that, but how would you really solve this if you cannot touch what's in upstream? Any ways to solve this?

This question would be more appropriate on the cython-users mailing list.

Use renaming: http://docs.cython.org/src/userguide/external_C_code.html#resolving-naming-conflicts-c-name-specifications

For example, rename the function:

    int my_jobs_func "my_jobs" (my_jobs *jobs)

or the struct:

    cdef struct my_jobs_t "my_jobs"

or both.

Best regards, Nikita Nemkin ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
[Cython] array.array member renaming
Hi, I just wanted to say that this https://github.com/cython/cython/commit/a3ace265e68ad97c24ce2b52d99d45b60b26eda2#L1L73 renaming seems totally unnecessary as it makes any array code verbose and ugly. I often have to create extra local variables just to avoid endless something.data.as_ints repetition. What was the reason for renaming? It would be really nice to reintroduce old names (_i, _d etc). Best regards, Nikita Nemkin ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] array.array member renaming
On Tue, 04 Jun 2013 14:47:47 +0600, Stefan Behnel wrote: Nikita Nemkin, 04.06.2013 10:29: I just wanted to say that this https://github.com/cython/cython/commit/a3ace265e68ad97c24ce2b52d99d45b60b26eda2#L1L73 renaming seems totally unnecessary as it makes any array code verbose and ugly. I often have to create extra local variables just to avoid endless something.data.as_ints repetition. Are one-shot operations on arrays really so common for you that the explicit "unpacking" step matters for your code? I use array in most places where you would normally see bare pointer and malloc/PyMem_Malloc. Automatic memory management FTW. Many people would do the same if they knew about arrays and a special support for them that Cython provides. (Personally, I had discovered it by browsing standard include .pxd files) Array class members also have "self." prepended which does not help brevity. So, yeah, it matters. Sure I can live with overly verbose names, but there is certainly room for improvement. ATM I have 96 cases of ".data.as_XXX" in my codebase and that's after folding some of them using local variables (like "cdef int* segments = self.segments.data.as_ints"). What was the reason for renaming? It would be really nice to reintroduce old names (_i, _d etc). IMHO, the explicit names read better and make it clear what happens. Indexing makes it clear enough that, well, indexing happens. Direct array access is sort of magic anyway. Here is an example of unnecessary verbosity: while width + piDx.data.as_ints[start] < maxWidth: width += piDx.data.as_ints[start] start += 1 Also, I think the original idea was that most people shouldn't access the field directly and use memory views and the buffer interface instead, at least for user provided data. It might be a little different for arrays that are only used internally. When using the buffer interface, it really doesn't matter whether the user has passed an array or ndarray or whatever. 
Buffer interface covers everything, array-specific declarations are irrelevant. But when I know that the variable is an array, buffer declaration, acquisition and release code is dead weight (especially for class members which can't have buffer declaration attached to themselves, necessitating an extra local variable to declare a fast access view). Best regards, Nikita Nemkin ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] array.array member renaming
On Tue, 04 Jun 2013 18:27:15 +0600, Stefan Behnel wrote:

> Nikita Nemkin, 04.06.2013 12:17:
>> On Tue, 04 Jun 2013 14:47:47 +0600, Stefan Behnel wrote:
>>> Nikita Nemkin, 04.06.2013 10:29:
>>>> I just wanted to say that this
>>>> https://github.com/cython/cython/commit/a3ace265e68ad97c24ce2b52d99d45b60b26eda2#L1L73
>>>> renaming seems totally unnecessary as it makes any array code verbose and ugly. I often have to create extra local variables just to avoid endless something.data.as_ints repetition.
>>>
>>> Are one-shot operations on arrays really so common for you that the explicit "unpacking" step matters for your code?
>>
>> I use array in most places where you would normally see bare pointer and malloc/PyMem_Malloc. Automatic memory management FTW. Many people would do the same if they knew about arrays and a special support for them that Cython provides. (Personally, I had discovered it by browsing standard include .pxd files)
>>
>> Array class members also have "self." prepended which does not help brevity. So, yeah, it matters. Sure I can live with overly verbose names, but there is certainly room for improvement. ATM I have 96 cases of ".data.as_XXX" in my codebase and that's after folding some of them using local variables (like "cdef int* segments = self.segments.data.as_ints").
>
> And the local assignment also resolves the pointer indirection for "self" here, which the C compiler can't really reason about otherwise.
>
>>>> What was the reason for renaming? It would be really nice to reintroduce old names (_i, _d etc).
>>>
>>> IMHO, the explicit names read better and make it clear what happens.
>>
>> Indexing makes it clear enough that, well, indexing happens. Direct array access is sort of magic anyway. Here is an example of unnecessary verbosity:
>>
>>     while width + piDx.data.as_ints[start] < maxWidth:
>>         width += piDx.data.as_ints[start]
>>         start += 1
>
> Agreed that it's more verbose than necessary, but my gut feeling is still: if it's worth shortening, it's worth assigning. If it's not worth assigning, it's likely not worth shortening either.

Shortening is about readability. The extra CPU time to dereference self is not my concern. (I'm pretty sure the L1 cache hides the cost.)

> So, I do see your problem, but it's not obvious to me that it's worth doing something about it. Especially not something as broad as duplicating the direct access interface.

I guess I'll just copy array.pxd and modify it to suit my needs. (Long member names are not my only grievance.) A modified include path should do the trick.

Best regards,
Nikita Nemkin
[Cython] Conditional cast and builtin types
Hi,

I have just discovered that the following operations do not perform any type checking ("?" is ignored): obj, obj, obj, obj, obj.

Can someone please confirm this as a bug?

Best regards,
Nikita Nemkin
[Cython] Feature proposal: conditional cast with None
Hi,

Currently, conditional casts like <Type?>obj do not allow None values. I have found it useful to allow None sometimes. The syntax could be one of:

* obj
* obj
* obj

Use case (without this feature):

    tn_obj = self.any.non_trivial.nullable.expression()
    cdef TypeName tn
    if tn_obj is not None:
        tn = tn_obj
        ...

Use case (with this feature):

    cdef TypeName tn = None?>self.any.non_trivial.nullable.expression()
    if tn is not None:
        ...

As you can see, without this feature, two local variables pointing to the same object are required. This creates unnecessary confusion.

Implementation is trivial: all that is necessary is to pass the "notnone=False" flag from the parser to TypecastNode to PyTypeTestNode.

What do you think?

Best regards,
Nikita Nemkin
Re: [Cython] Feature proposal: conditional cast with None
On Fri, 07 Jun 2013 19:10:34 +0600, Stefan Behnel wrote:

> Nikita Nemkin, 07.06.2013 13:24:
>> Currently, conditional casts like <Type?>obj do not allow None values.
>
> This might seem like an inconsistency, because <Type>obj does allow None values, just like a normal assignment. However, it's a special case that asks explicitly for the given type, so I think it's ok to be stricter.
>
>> I have found it useful to allow None sometimes. The syntax could be one of:
>>
>> * obj
>> * obj
>> * obj
>>
>> Use case (without this feature):
>>
>>     tn_obj = self.any.non_trivial.nullable.expression()
>>     cdef TypeName tn
>>     if tn_obj is not None:
>>         tn = tn_obj
>>         ...
>>
>> Use case (with this feature):
>>
>>     cdef TypeName tn = self.any.non_trivial.nullable.expression()
>>     if tn is not None:
>>         ...
>
> Why not just
>
>     cdef TypeName tn = self.any.non_trivial.nullable.expression()
>     if tn is not None:
>         ...
>
> ? I.e. why do you need that cast in the first place?

You are right. The behavior I want is actually the default assignment behavior. Writing almost fully typed code, I started to perceive Cython as a statically typed language...

Please disregard my feature request, and thank you.

Best regards,
Nikita Nemkin
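The difference between the two behaviors discussed in this thread can be modeled in plain Python. This is only a rough sketch of the semantics, not Cython's generated code: `TypeName`, `checked_cast` and `typed_assign` are hypothetical names introduced for illustration, and `isinstance` stands in for the C-level type test.

```python
class TypeName:
    """Hypothetical stand-in for a cdef class."""


def checked_cast(obj, cls):
    """Rough model of the conditional cast <cls?>obj: None is rejected."""
    # isinstance(None, cls) is False, so None fails the test like any
    # other object of the wrong type.
    if not isinstance(obj, cls):
        raise TypeError("cannot convert %s to %s"
                        % (type(obj).__name__, cls.__name__))
    return obj


def typed_assign(obj, cls):
    """Rough model of 'cdef cls x = obj': None passes, other types are tested."""
    if obj is not None and not isinstance(obj, cls):
        raise TypeError("cannot convert %s to %s"
                        % (type(obj).__name__, cls.__name__))
    return obj


instance = TypeName()
same = checked_cast(instance, TypeName)   # passes the type test
nothing = typed_assign(None, TypeName)    # None is accepted on assignment
```

Under this model, the "nullable expression" use case needs no cast at all: the typed assignment already accepts None and type-checks everything else.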
[Cython] Funny idea: interpreted def functions
Hi,

Pure Python functions rarely benefit from compilation. I thought it would be interesting to add an "interpreted" directive (global, module, class, function level + automatic heuristic) that will instruct Cython to compile def functions into _bytecode_ and store that bytecode in the binary.

Together with module bundling and embed/freeze it could make a neat deployment solution.

(I have no plans to implement this.)

Best regards,
Nikita Nemkin
Re: [Cython] Funny idea: interpreted def functions
On Tue, 11 Jun 2013 20:03:45 +0600, Stefan Behnel wrote:

> Nikita Nemkin, 11.06.2013 13:51:
>> Pure Python functions rarely benefit from compilation. I thought it would be interesting to add an "interpreted" directive (global, module, class, function level + automatic heuristic) that will instruct Cython to compile def functions into _bytecode_ and store that bytecode in the binary.
>>
>> Together with module bundling and embed/freeze it could make a neat deployment solution.
>
> Well, it shouldn't be all that hard to implement. Basically, we'd send a part of the source file through the Python parser after having parsed and processed it in the compiler.
>
> However, I fail to see the advantage of this feature that would make it worth providing to users. There usually *is* a visible performance advantage of compiled code over pure Python code, and the advantages of interpreted Python code in terms of semantics or compatibility are quite limited (debugging, maybe, or introspection).

Let's just say there are legitimate reasons to stay interpreted, binary size and compatibility among them.

> Could you describe how/why you came up with this?

Well, I was wondering why CPython isn't written in Cython (actually I know why) and how awesome it would be to have a system with the CPython runtime and a unified Cython/Python compiler front-end targeting both bytecode and native code. In such a system, a per-function compiled/interpreted switch would feel natural to me... That's how, if it answers your question.

And from a different angle: many people praise Go(lang) for its "single fat binary" deployment approach. First-class bytecode support in Cython could provide the same for Python. (Maybe not quite the same, but a step in this direction.)

Best regards,
Nikita Nemkin
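For readers wondering what "compile def functions into bytecode and store that bytecode in the binary" could amount to, here is a minimal pure-Python sketch of the general mechanism, using only the standard `compile()` builtin and the `marshal` module. It is not a Cython feature and not a proposed implementation; `SOURCE`, `to_bytecode_blob` and `load_namespace` are hypothetical names, and the byte string stands in for data that would be embedded in the extension module.

```python
import marshal

# Source of a function that would stay interpreted (illustrative only).
SOURCE = '''
def greet(name):
    return "Hello, " + name
'''


def to_bytecode_blob(source, filename="<embedded>"):
    """Compile source to a code object and serialize it to bytes."""
    code = compile(source, filename, "exec")
    return marshal.dumps(code)


def load_namespace(blob):
    """Deserialize the code object and execute it to define its functions."""
    code = marshal.loads(blob)
    namespace = {}
    exec(code, namespace)
    return namespace


blob = to_bytecode_blob(SOURCE)      # what a build step would embed
namespace = load_namespace(blob)     # what module init would run
result = namespace["greet"]("Cython")
```

One caveat with this approach: the marshal format and the bytecode itself are specific to a Python version, so an embedded blob would have to be produced per target interpreter, much like the compiled module.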
Re: [Cython] Windows Debug build improvement
On Thu, 18 Jul 2013 19:24:21 +0600, Wolfgang wrote:

> Hi,
>
> I tried to submit an improvement for the Windows build but the tracker is not accessible without a login.
>
> On Windows if someone does a Debug build of an extension the flag _DEBUG is set and so the Python Interpreter sets Py_DEBUG and for all extension modules "_d" is appended to load the debug version of a module. This is not really practical because then all modules and the Python Interpreter must be built in Debug mode. For some modules this is even not possible for Windows. :-(

To debug my extensions on Windows (in Visual Studio), I just add the appropriate compiler flags:

    extension = Extension(
        ...
        extra_compile_args=['/Zi', '/Od'],  # generate PDB, disable optimization
        extra_link_args=['/DEBUG'])         # preserve debug info

Add to that the symbol files for the Python release you are using (the "program database" links on this page: http://www.python.org/getit/releases/2.7.5/) and you will have a comfortable debugging environment.

Best regards,
Nikita Nemkin
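One way to keep such flags out of non-Windows builds is to compute them per platform. The helper below is a small sketch under assumptions: `debug_build_kwargs` is a hypothetical name, the MSVC flags are the ones from the message above, and the `-g`/`-O0` fallback assumes a GCC- or Clang-style compiler.

```python
import sys


def debug_build_kwargs(platform=sys.platform):
    """Return Extension() keyword arguments for a debuggable release build."""
    if platform == "win32":
        return {
            # /Zi: generate PDB, /Od: disable optimization (MSVC)
            "extra_compile_args": ["/Zi", "/Od"],
            # /DEBUG: keep debug info in the linked binary
            "extra_link_args": ["/DEBUG"],
        }
    # Assumption: a GCC/Clang-style compiler everywhere else.
    return {
        "extra_compile_args": ["-g", "-O0"],
        "extra_link_args": [],
    }


windows_kwargs = debug_build_kwargs("win32")
```

In a setup script the result would be unpacked into the extension definition, e.g. `Extension("mymodule", ["mymodule.pyx"], **debug_build_kwargs())`, where "mymodule" is a placeholder name.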
Re: [Cython] Problem with final cdef methods
On Thu, 29 Aug 2013 11:03:01 +0600, Stefan Behnel wrote:

> I noticed two problems with final cdef methods. When overriding a normal cdef method with a final cdef method, you currently get a C compiler warning about a call with the wrong 'self' type, and when you call them before you declare them in your code, you get an error because the method is not forward declared.
>
> http://trac.cython.org/cython_trac/ticket/819
>
> I think this suggests that we should try to come up with a cleaner way to integrate this feature into the function overloading architecture rather than just remembering their cname and call that.

As your ticket mentions, there is a fundamental problem with method inheritance (and with the way Cython generates forward declarations...).

Any inherited method assumes two types simultaneously: the type of the base method, whose first parameter is a base class pointer, and the "actual" type of the method, whose first parameter is a derived class pointer.

The first type, stored in the method's Entry.type, is essentially the vtable slot type. It is used when virtual method calls are made. It is also used for generating forward declarations and for casting in the vtable init code.

The second type (stored in the CFuncDef node) is used ONLY to generate the function definition.

The mismatch between these two types is the root of the problem. Before final methods were introduced, the second ("actual") type was unimportant: the method was cast to the vtable slot type in the vtable init code and forgotten. But final functions, bypassing the vtable mechanism, rely solely on the second type.

I have solved it for myself by storing BOTH types in the method entry (Entry.type for the actual CFuncDef type and Entry.prev_type for the vtable slot type). By using the correct types in generate_exttype_final_methods_declaration() and generate_exttype_vtable_init_code() the problem is avoided. You can see the patch here:
https://github.com/nnemkin/cython/compare/final_subtypes

Notes for the patch:

* I removed a bit of wtf code from CFuncType.declaration_code; these changes are incidental to the problem you describe. (The strictly necessary parts are only the lines involving "prev_type".)
* There is a disabled bug test named "inherited_final_method", you may want to remove/merge it with the test you modified.

I hope this helps.

Best regards,
Nikita Nemkin
Re: [Cython] Problem with final cdef methods
On Thu, 29 Aug 2013 12:57:19 +0600, Stefan Behnel wrote:

> Nikita Nemkin, 29.08.2013 08:24:
>> I have solved it for myself by storing BOTH types in the method entry (Entry.type for the actual CFuncDef type and Entry.prev_type for the vtable slot type). By using correct types in generate_exttype_final_methods_declaration() and generate_exttype_vtable_init_code() the problem is avoided. You can see the patch here
>> https://github.com/nnemkin/cython/compare/final_subtypes
>
> Interesting. Your change dates back a while already. Were you planning to clean it up in some way before you submit it as pull request?

That was the plan. My solution is kind of a stopgap (even the name "prev_type" makes me cringe), but a "proper" solution would require some serious redesign and rewrite work, which may be unwise in itself. So I got stuck and left it behind.

If you want the problem fixed with minimal changes to the Cython codebase, you can use my approach.

BTW, I haven't thoroughly tested that patch. It works for me in my environment and passes the test suite, but that's it.

>> Notes for the patch:
>> * There is a disabled bug test named "inherited_final_method",
>
> And you had to disable it, because ... ?
>
>> you may want to remove/merge it with the test you modified.
>
> I already looked at it before I changed the main test. It appears to be a regression test for a specific bug that's different from the problem at hand, that's why I didn't touch it.

The test was already disabled, because it hit the open bug. It's the same bug we are talking about: the mismatch between Entry.type and the actual method type. Please take a closer look at it. My patch ENABLES the test because it fixes the bug.

Best regards,
Nikita Nemkin