Re: [Cython] GSoC Proposal - Reimplement C modules in CPython's standard library in Cython.

2011-04-12 Thread Arthur de Souza Ribeiro
Hi Stefan, yes, I'm working on this. In fact, I'm trying to recompile the json
module (http://docs.python.org/library/json.html), adding some type
definitions and Cython constructs to make the code faster.

I'm running into trouble with a few things too, so I'll enumerate them here so
that you can give me some tips on how to solve them.

1 - Compile package modules - the json module is inside a package (files:
__init__.py, decoder.py, encoder.py, scanner.py). Is there a way to generate
the Cython modules for a whole package, just as Cython generates them for a
single module?

2 - Because I'm having trouble with issue #1, I'm running the tests
manually: I go to %Python-dir%/Lib/tests/json_tests, take the test files
that Python provides, and run them by hand.

3 - To measure the performance of the module, I'm thinking about using the
timeit function in the unit tests for the project. I think a good number of
executions could be made, so that the timings of each run can be compared.

4 - I didn't create the .pxd files because some problems came up: I get
errors saying that methods are not defined, although they are defined. I will
try to investigate this further.

The code is in this repository:
https://github.com/arthursribeiro/JSON-module
Your feedback would be very important, so that I can improve my skills and
become able to work on the project sooner.

I think some things implemented in this rewriting process are going to be
useful when doing this with C modules...

Thank you very much.

Best Regards.

[]s

Arthur

2011/4/12 Stefan Behnel 

> Arthur de Souza Ribeiro, 08.04.2011 02:43:
>
>> 2011/4/7 Robert Bradshaw
>>
>>> What I'd like to see is an implementation of a single simple but not
>>> entirely trivial (e.g. not math) module, passing regression tests with
>>> comparable if not better speed than the current C version (though I
>>> think it'd probably make sense to start out with the Python version
>>> and optimize that). E.g. http://docs.python.org/library/json.html
>>> looks like a good candidate. That should only take 8 hours or so,
>>> maybe two days at most, given your background. I'm not expecting
>>> anything before the application deadline, but if you could whip
>>> something like this out in the next week to point to that would help
>>> your application out immensely. In fact, one of the Python
>>> foundation's requirements is that students submit a patch before being
>>> accepted, and this would knock out that requirement and give you a
>>> chance to prove yourself. Create an account on https://github.com and
>>> commit your code into a new repository there.
>>>
>>
>> I will start the implementation of the json module right now. I created my
>> github account and as soon as I have code implemented I will send the
>> repository link.
>>
>
> Any news on this? We're currently discussing which of the Cython-related
> projects to mentor. It's likely that not all of our projects will get
> accepted, so if you could give us a good initial idea about your work, we'd
> have a stronger incentive to value yours over the others.
>
> Stefan
>
> ___
> cython-devel mailing list
> cython-devel@python.org
> http://mail.python.org/mailman/listinfo/cython-devel
>
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] Test runner

2011-04-12 Thread Robert Bradshaw
On Mon, Apr 11, 2011 at 3:56 AM, mark florisson
 wrote:
> On 11 April 2011 12:53, mark florisson  wrote:
>> On 11 April 2011 12:45, Stefan Behnel  wrote:
>>> mark florisson, 11.04.2011 12:26:

>>>> Can we select tests in the tests directory selectively? I see the -T
>>>> or --ticket option, but it doesn't seem to find the test tagged with
>>>> # ticket:.
>>>>
>>>> I can select unit tests using python runtests.py
>>>> Cython.SubPackage.Tests.SomeTest, but I can't seem to do the same
>>>> thing for tests in the tests directory. Running the entire suite takes
>>>> rather long.
>>>
>>> You can still select them by name using a regex, e.g.
>>>
>>>   runtests.py 'run\.empty_builtin_constructors'
>>>
>>> Stefan
>>
>> Great, thanks! I'll update the hackerguide wiki.
>>
> I see now that it is briefly mentioned there, apologies.

I've added a note there about tags as well, and fixed the -T to look
at the ticket tag. Note that "mode:run" is the default, so you don't
need to explicitly tag mode except for compile/error tests.
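
The two selection mechanisms discussed in this thread can be sketched roughly as follows. This is only an illustration, not code taken from runtests.py: it assumes that test names look like `run.empty_builtin_constructors`, that the selector is applied as a regex search, and that tags are header comments of the form `# tag: value`. The function names `select_tests` and `parse_tags` are hypothetical.

```python
import re


def select_tests(names, pattern):
    """Return the test names matched by the regex selector (assumed: re.search)."""
    selector = re.compile(pattern)
    return [name for name in names if selector.search(name)]


def parse_tags(source, default_mode="run"):
    """Collect '# tag: value' comments from the leading comment block.

    The mode falls back to 'run' when no explicit mode tag is present.
    """
    tags = {"mode": [default_mode]}
    for line in source.splitlines():
        line = line.strip()
        if not line.startswith("#"):
            break  # tags are only read from the leading comment block
        match = re.match(r"#\s*(\w+)\s*:\s*(.+)", line)
        if match is None:
            continue
        tag, value = match.group(1), match.group(2).strip()
        if tag == "mode":
            tags["mode"] = [value]  # an explicit mode replaces the default
        else:
            tags.setdefault(tag, []).append(value)
    return tags
```

Escaping the dot in the selector, as in 'run\.empty_builtin_constructors', keeps it from matching any character between the directory and the test name.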

- Robert


Re: [Cython] cython-docs repository

2011-04-12 Thread Robert Bradshaw
On Sat, Apr 9, 2011 at 10:13 AM, Jason Grout
 wrote:
> On 4/9/11 12:02 PM, Robert Bradshaw wrote:
>>
>> Yep, we did that during the workshop. I thought I had sent out an
>> announcement, but I guess not.
>
> Is there a summary anywhere of the exciting things that happened in the
> workshop?

Not yet, but I'll post as soon as I have a writeup.

> For example, it seems that generators are finally in, if I read
> the commit logs correctly.  Is that true?  If so, fantastic!

Yep!

> Any idea of a timeline for that to make it into an official release?

There's still some fallout and more testing to do, but hopefully it
won't be too long before a release. The biggest step back seems to be
that inline generators are disabled: they are now full-fledged
generators, which is an optimization regression but not a feature
regression.

- Robert


Re: [Cython] "Cython's Users Guide"

2011-04-12 Thread Robert Bradshaw
On Mon, Apr 11, 2011 at 2:35 PM, Francesc Alted  wrote:
> 2011/4/11 William Stein 
>>
>> Hi,
>>
>> I'm teaching Cython in my Sage course yet again, and noticed that
>> again there are some very confusing aspects of the Cython
>> documentation organization, which could probably be improved by a few
>> simple changes.
>>
>>  1. At http://cython.org/ there is a big link in the middle of the
>> page labeled "Cython Users Guide" which goes to
>> http://docs.cython.org/.   However, http://docs.cython.org/ is *not*
>> the users guide -- it is "Cython’s Documentation".     In fact, the
>> Users Guide is Chapter 3 of the documentation.
>>
>>  2. Looking at http://docs.cython.org, we see that Chapter 2 is
>> "Tutorials".  But then looking down to Chapter 3 we see that it is
>> "Cython Users Guide".  Of course, that's what one is after having just
>> clicked a link called "Cython Users Guide".  So we click on "Cython
>> Users Guide" again.
>>
>>  3. We arrive at a page that again has "Tutorial" as Chapter 2.   For
>> some reason this makes me feel even more confused.
>>
>> Recommended changes:
>>
>>  1. Change the link on the main page from "Cython Users Guide" to
>> "Documentation"  or put a direct link into the Users Guide, or have
>> two links.
>>
>>  2. At http://docs.cython.org/ rename the "Cython Users Guide" to
>> "Users Guide", since it is obviously the Cython Users Guide at this
>> point and "Cython documentation" is in the upper left of the page
>> everywhere.
>>
>>  3. Possibly rename the tutorial in chapter 2 of the users guide to
>> something like "First Steps" or "Basic Tutorial" or something.

Thanks for the suggestions. Done.

> Yeah, that's something that we discussed in the past workshop in Munich
> (BTW, many thanks for providing the means for making this happen!).  The
> basic idea is to completely remove Chapter 3 (Cython Users Guide) by
> moving its parts either to Chapter 2 (Tutorials) or to Chapter 4
> (Reference Guide).  During the meeting we agreed that the doc repository
> should be moved (and it has indeed been moved) into the source repo, so that
> modifications that affect both code and docs can be put in the same
> commit/branch. Also, the wiki has a lot of information that can be better
> consolidated and integrated into the User's Guide.
> In fact, I already started some work in this direction and created a couple
> of pull requests during the workshop (which have already been
> integrated).  I plan to continue this work, but unfortunately I'm pretty busy
> lately, so I don't think I can make much progress in the next few weeks. If
> anybody is interested in joining the effort to improve Cython's
> documentation, she will be very welcome indeed!

+1, and thanks, Francesc, for the work you've started in moving us in
this direction.

- Robert


[Cython] Code examples missing in Cython User's Guide

2011-04-12 Thread Chris Lasher
My apologies for cross-posting this from cython-users, but I realized I
should have sent this bug report to cython-devel.

Several code examples are missing from the User's Guide due to the source
code files being moved or deleted. See for example the Tutorial page on the
User's Guide http://docs.cython.org/src/userguide/tutorial.html The code for
both fib.pyx and primes.pyx (and their setup.py files) is absent from the
document.

I looked at the ReST files and they are trying to source the files from an
"examples" directory. Looking through the git repository, I wasn't able to
locate this directory. Has it been moved or deleted?


Re: [Cython] GSoC Proposal - Reimplement C modules in CPython's standard library in Cython.

2011-04-12 Thread Sturla Molden

Den 12.04.2011 14:59, skrev Arthur de Souza Ribeiro:


> 1 - Compile package modules - the json module is inside a package (files:
> __init__.py, decoder.py, encoder.py, scanner.py). Is there a way to
> generate the Cython modules for a whole package, just as Cython generates
> them for a single module?





I'll propose these 10 guidelines:

1. The major concern is to replace the manual use of Python C API with 
Cython.  We aim to improve correctness and readability, not speed.


2. Replacing plain C with Cython for readability is less important,
sometimes even discouraged. If you do, it's ok to leverage Python
container types if it makes the code concise and readable, even if it
will sacrifice some speed.


3. Use exceptions instead of C style error checks: It's better to ask 
forgiveness than permission.


4. Use exceptions correctly. All C resource allocation belongs in
__cinit__. All C resource deallocation belongs in __dealloc__. Remember
that exceptions can cause resource leaks if you don't. Wrap all resource
allocation in an extension type. Never use functions like malloc or
fopen directly in your Cython code, except in a __cinit__ method.


5. We should keep as much of the code in Python as we can. Replacing 
Python with Cython for speed is less important. Only the parts that will 
really benefit from static typing should be changed to Cython.


6. Leave the __init__.py file as it is. A Python package is allowed to
contain a mix of Python source files and Cython extension libraries.


7. Be careful to release the GIL whenever appropriate, and never release 
it otherwise. Don't yield the GIL just because you can, it does not come 
for free, even with a single thread.


8. Use the Python and C standard libraries whenever you can.  Don't 
re-invent the wheel. Don't use system dependent APIs when the standard 
libraries declare a common interface. Callbacks to Python are ok.


9. Write code that will work correctly on 32 and 64 bit systems, big- or 
little-endian. Know your C: Py_intptr_t can contain a pointer. 
Py_ssize_t can represent the largest array size allowed. Py_intptr_t and 
Py_ssize_t can have different sizes. The native array offset can be
different from Py_ssize_t, for which a common example is AMD64.


10. Don't clutter the namespace, use pxd includes. Short source files 
are preferred to long. Simple is better than complex. Keep the source 
nice and tidy.



Sturla


Re: [Cython] Code examples missing in Cython User's Guide

2011-04-12 Thread Arthur de Souza Ribeiro
Hey Chris, the code for the primes and fib examples is in the 'Demos'
directory of the repository...

Best Regards.

[]s

Arthur

2011/4/12 Chris Lasher 

> My apologies for cross-posting this from cython-users, but I realized I
> should have sent this bug report to cython-devel.
>
> Several code examples are missing from the User's Guide due to the source
> code files being moved or deleted. See for example the Tutorial page on the
> User's Guide http://docs.cython.org/src/userguide/tutorial.html The code
> for both fib.pyx and primes.pyx (and their setup.py files) is absent from
> the document.
>
> I looked at the ReST files and they are trying to source the files from an
> "examples" directory. Looking through the git repository, I wasn't able to
> locate this directory. Has it been moved or deleted?


Re: [Cython] GSoC Proposal - Reimplement C modules in CPython's standard library in Cython.

2011-04-12 Thread Stefan Behnel

Arthur de Souza Ribeiro, 12.04.2011 14:59:

> Hi Stefan, yes, I'm working on this. In fact, I'm trying to recompile the json
> module (http://docs.python.org/library/json.html), adding some type
> definitions and Cython constructs to make the code faster.


Cool.



> I'm running into trouble with a few things too, so I'll enumerate them here so
> that you can give me some tips on how to solve them.
>
> 1 - Compile package modules - the json module is inside a package (files:
> __init__.py, decoder.py, encoder.py, scanner.py). Is there a way to generate
> the Cython modules for a whole package, just as Cython generates them for a
> single module?


The __init__.py doesn't really look performance critical. It's better to
leave that module in plain Python; that improves readability by reducing
surprises and simplifies reuse by other implementations.


That being said, you can compile each module separately, just use the 
"cython" command line tool for that, or write a little distutils script as in


http://docs.cython.org/src/quickstart/build.html#building-a-cython-module-using-distutils

Don't worry too much about a build integration for now.
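
A minimal sketch of such a build script, under stated assumptions: it uses setuptools and `cythonize` from `Cython.Build` rather than the plain distutils setup from the linked page, the package directory name `json` and the project name are placeholders, and the helper names `modules_to_compile` and `build` are hypothetical. It compiles every module of the package separately and leaves `__init__.py` in plain Python.

```python
from pathlib import Path


def modules_to_compile(package_dir, keep_in_python=("__init__.py",)):
    """List the .py files of a package that should be compiled by Cython."""
    return sorted(
        str(path)
        for path in Path(package_dir).glob("*.py")
        if path.name not in keep_in_python
    )


def build(package_dir="json"):
    """Compile the selected modules; requires setuptools and Cython."""
    # Imported lazily so that the helper above works without Cython installed.
    from setuptools import setup
    from Cython.Build import cythonize

    setup(
        name="json-cython-sketch",  # placeholder project name
        ext_modules=cythonize(modules_to_compile(package_dir)),
    )
```

With a call to `build()` as the last line of a setup.py, running `python setup.py build_ext --inplace` should place the compiled extension modules next to the Python sources of the package.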



> 2 - Because I'm having trouble with issue #1, I'm running the tests
> manually: I go to %Python-dir%/Lib/tests/json_tests, take the test files
> that Python provides, and run them by hand.


That's fine.



> 3 - To measure the performance of the module, I'm thinking about using the
> timeit function in the unit tests for the project. I think a good number of
> executions could be made, so that the timings of each run can be compared.


That's ok for a start, artificial benchmarks are good to test specific 
functionality. However, unit tests tend to be short running with a lot of 
overhead, so later on, you will need to use real code to benchmark the 
modules. I would expect that there are benchmarks for JSON implementations 
around, and you can just generate a large JSON file and run loads and dumps 
on it.
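
A small sketch of that kind of benchmark, using only the standard library. The document layout, the record count, and the helper names `make_document` and `benchmark` are assumptions made for illustration; a real comparison would run the same script once against the C-accelerated json module and once against the Cython-compiled one.

```python
import json
import random
import string
import timeit


def make_document(n_records=2000, seed=0):
    """Build a reproducible, JSON-serialisable list of records."""
    rng = random.Random(seed)

    def word():
        return "".join(rng.choice(string.ascii_letters) for _ in range(8))

    return [
        {
            "id": i,
            "name": word(),
            "score": rng.random(),
            "tags": [word() for _ in range(3)],
            "active": i % 2 == 0,
        }
        for i in range(n_records)
    ]


def benchmark(doc, repeat=3, number=5):
    """Return the best per-call time in seconds for dumps and loads."""
    text = json.dumps(doc)
    dumps_best = min(timeit.repeat(lambda: json.dumps(doc), repeat=repeat, number=number))
    loads_best = min(timeit.repeat(lambda: json.loads(text), repeat=repeat, number=number))
    return {
        "dumps": dumps_best / number,
        "loads": loads_best / number,
        "bytes": len(text),
    }


if __name__ == "__main__":
    for key, value in benchmark(make_document()).items():
        print(key, value)
```

Taking the minimum over several repeats reduces the influence of unrelated system load on the reported time.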




> 4 - I didn't create the .pxd files because some problems came up: I get
> errors saying that methods are not defined, although they are defined. I will
> try to investigate this further.


When reporting usage related problems (preferably on the cython-users 
mailing list), it's best to present the exact error messages and the 
relevant code snippets, so that others can quickly understand what's going 
on and/or reproduce the problem.




> The code is in this repository:
> https://github.com/arthursribeiro/JSON-module
> Your feedback would be very important, so that I can improve my skills and
> become able to work on the project sooner.


I'd strongly suggest implementing this in pure Python (.py files instead of 
.pyx files), with externally provided static types for performance. A 
single code base is very advantageous for a large project like CPython, 
much more than the ultimate 5% better performance.
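
To illustrate the single-code-base idea, here is a sketch of a function that stays valid Python while carrying type hints that Cython can use when compiling it. Assumptions: the types are given inline with the `cython.locals` decorator so that the example fits in one file, whereas the externally provided form would put the same declarations in a separate .pxd file next to the .py file. The `_NoCython` fallback class and the `count_escapes` function are hypothetical and not part of the json module.

```python
try:
    import cython  # Cython's shadow module; also importable when not compiled
except ImportError:
    class _NoCython:
        """Hypothetical stand-in so the file still runs without Cython."""
        compiled = False
        Py_ssize_t = int

        @staticmethod
        def locals(**_types):
            # Ignore the type hints and return the function unchanged.
            return lambda func: func

    cython = _NoCython()


@cython.locals(count=cython.Py_ssize_t, code=cython.Py_ssize_t)
def count_escapes(text):
    """Count the characters a JSON encoder has to escape in a string."""
    count = 0
    for char in text:
        code = ord(char)
        # Control characters, the double quote and the backslash need escaping.
        if code < 0x20 or char == '"' or char == "\\":
            count += 1
    return count
```

Run under plain CPython, the decorator has no effect and the function behaves like any other Python function, so other Python implementations can reuse the same file.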




> I think some things implemented in this rewriting process are going to be
> useful when doing this with C modules...


Well, if you can get the existing Python implementation up to a speed mostly
comparable to that of the C implementation, then there is no need to care
about the C module anymore. Even if you can get only 90% of a module to run
at comparable speed, and need to keep 10% in plain C, that's already a huge
improvement in terms of maintainability.


Stefan


Re: [Cython] GSoC Proposal - Reimplement C modules in CPython's standard library in Cython.

2011-04-12 Thread Robert Bradshaw
On Tue, Apr 12, 2011 at 11:22 AM, Stefan Behnel  wrote:
> Arthur de Souza Ribeiro, 12.04.2011 14:59:
>>
>> Hi Stefan, yes, I'm working on this. In fact, I'm trying to recompile the json
>> module (http://docs.python.org/library/json.html), adding some type
>> definitions and Cython constructs to make the code faster.
>
> Cool.
>
>
>> I'm running into trouble with a few things too, so I'll enumerate them here
>> so that you can give me some tips on how to solve them.
>>
>> 1 - Compile package modules - the json module is inside a package (files:
>> __init__.py, decoder.py, encoder.py, scanner.py). Is there a way to
>> generate the Cython modules for a whole package, just as Cython generates
>> them for a single module?
>
> The __init__.py doesn't really look performance critical. It's better to
> leave that module in plain Python; that improves readability by reducing
> surprises and simplifies reuse by other implementations.
>
> That being said, you can compile each module separately, just use the
> "cython" command line tool for that, or write a little distutils script as
> in
>
> http://docs.cython.org/src/quickstart/build.html#building-a-cython-module-using-distutils
>
> Don't worry too much about a build integration for now.
>
>
>> 2 - Because I'm having trouble with issue #1, I'm running the tests
>> manually: I go to %Python-dir%/Lib/tests/json_tests, take the test files
>> that Python provides, and run them by hand.
>
> That's fine.
>
>
>> 3 - To measure the performance of the module, I'm thinking about using the
>> timeit function in the unit tests for the project. I think a good number of
>> executions could be made, so that the timings of each run can be compared.
>
> That's ok for a start, artificial benchmarks are good to test specific
> functionality. However, unit tests tend to be short running with a lot of
> overhead, so later on, you will need to use real code to benchmark the
> modules. I would expect that there are benchmarks for JSON implementations
> around, and you can just generate a large JSON file and run loads and dumps
> on it.
>
>
>> 4 - I didn't create the .pxd files because some problems came up: I get
>> errors saying that methods are not defined, although they are defined. I will
>> try to investigate this further.
>
> When reporting usage related problems (preferably on the cython-users
> mailing list), it's best to present the exact error messages and the
> relevant code snippets, so that others can quickly understand what's going
> on and/or reproduce the problem.
>
>
>> The code is in this repository:
>> https://github.com/arthursribeiro/JSON-module
>> Your feedback would be very important, so that I can improve my skills and
>> become able to work on the project sooner.
>
> I'd strongly suggest implementing this in pure Python (.py files instead of
> .pyx files), with externally provided static types for performance. A single
> code base is very advantageous for a large project like CPython, much more
> than the ultimate 5% better performance.

While this is advantageous for the final product, it may not be the
easiest to get up and running with.

>> I think some things implemented in this rewriting process are going to be
>> useful when doing this with C modules...
>
> Well, if you can get the existing Python implementation up to a speed mostly
> comparable to that of the C implementation, then there is no need to care
> about the C module anymore. Even if you can get only 90% of a module to run
> at comparable speed, and need to keep 10% in plain C, that's already a huge
> improvement in terms of maintainability.
>
> Stefan


Re: [Cython] GSoC Proposal - Reimplement C modules in CPython's standard library in Cython.

2011-04-12 Thread Robert Bradshaw
On Tue, Apr 12, 2011 at 8:33 AM, Sturla Molden  wrote:
> Den 12.04.2011 14:59, skrev Arthur de Souza Ribeiro:
>>
>> 1 - Compile package modules - the json module is inside a package (files:
>> __init__.py, decoder.py, encoder.py, scanner.py). Is there a way to generate
>> the Cython modules for a whole package, just as Cython generates them for a
>> single module?
>>
>
>
> I'll propose these 10 guidelines:
>
> 1. The major concern is to replace the manual use of Python C API with
> Cython.  We aim to improve correctness and readability, not speed.

Speed is a concern, otherwise many of these modules wouldn't have been
written in C in the first place (at least, not the ones with a pure
Python counterpart). Of course some of them are just wrapping C
libraries where speed doesn't matter as much.

> 2. Replacing plain C with Cython for readability is less important,
> sometimes even discouraged.

Huh? I'd say this is a big point of the project. Maybe less so than
manual dependence on the C API, but certainly not discouraged.

> If you do, it's ok to leverage Python
> container types if it makes the code concise and readable, even if it will
> sacrifice some speed.

That's true.

> 3. Use exceptions instead of C style error checks: It's better to ask
> forgiveness than permission.

Yep, this is natural in Cython.

> 4. Use exceptions correctly. All C resource allocation belongs in __cinit__.
> All C resource deallocation belongs in __dealloc__. Remember that exceptions
> can cause resource leaks if you don't. Wrap all resource allocation in an
> extension type. Never use functions like malloc or fopen directly in your
> Cython code, except in a __cinit__ method.

This can be useful advice, but is not strictly necessary. Try..finally
can fill this need as well.
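
The try..finally alternative can be sketched in plain Python as follows. The `Buffer` class is a hypothetical stand-in for a C resource such as malloc'ed memory, and `checksum` is an invented example function; in Cython the same shape would wrap a malloc/free pair.

```python
class Buffer:
    """Hypothetical stand-in for a C resource that must be released."""

    live = 0  # number of buffers currently allocated

    def __init__(self, size):
        self.data = bytearray(size)
        Buffer.live += 1

    def release(self):
        Buffer.live -= 1


def checksum(payload):
    """Copy payload into a temporary buffer and return its byte sum mod 256."""
    buf = Buffer(len(payload))
    try:
        if not payload:
            raise ValueError("empty payload")
        buf.data[:] = payload
        return sum(buf.data) % 256
    finally:
        # Runs on normal return and when an exception propagates,
        # so the buffer cannot leak.
        buf.release()
```

The allocation happens before the try block, so the finally clause only runs once there is something to release.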

> 5. We should keep as much of the code in Python as we can. Replacing Python
> with Cython for speed is less important. Only the parts that will really
> benefit from static typing should be changed to Cython.

True. Of course, compiling the (unchanged) pure Python files with
Cython could also yield interesting results, but that's not part of
the project.

> 6. Leave the __init__.py file as it is. A Python package is allowed to contain
> a mix of Python source files and Cython extension libraries.
>
> 7. Be careful to release the GIL whenever appropriate, and never release it
> otherwise. Don't yield the GIL just because you can, it does not come for
> free, even with a single thread.
>
> 8. Use the Python and C standard libraries whenever you can.  Don't
> re-invent the wheel. Don't use system dependent APIs when the standard
> libraries declare a common interface. Callbacks to Python are ok.
>
> 9. Write code that will work correctly on 32 and 64 bit systems, big- or
> little-endian. Know your C: Py_intptr_t can contain a pointer. Py_ssize_t
> can represent the largest array size allowed. Py_intptr_t and Py_ssize_t can
> have different sizes. The native array offset can be different from
> Py_ssize_t, for which a common example is AMD64.

It's rare to have to do pointer arithmetic in Cython, and rarer still
to have to store the pointer as an integer.
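
For readers who want to see the sizes that guideline 9 refers to on their own machine, a small ctypes sketch follows. It assumes that `ctypes.c_void_p` has the width Py_intptr_t must be able to hold and that `ctypes.c_ssize_t` matches Py_ssize_t; the function name `describe_platform` is hypothetical.

```python
import ctypes
import sys


def describe_platform():
    """Report the sizes (in bytes) that guideline 9 asks you to keep apart."""
    return {
        "pointer": ctypes.sizeof(ctypes.c_void_p),   # width Py_intptr_t must hold
        "ssize_t": ctypes.sizeof(ctypes.c_ssize_t),  # C type behind Py_ssize_t
        "long": ctypes.sizeof(ctypes.c_long),        # 4 on 64-bit Windows, 8 on 64-bit Linux
        "byteorder": sys.byteorder,                  # 'little' or 'big'
    }


if __name__ == "__main__":
    for name, value in describe_platform().items():
        print(name, value)
```

The `long` entry is the one that most often differs between 64-bit platforms, which is a reason to avoid plain C long for sizes and offsets.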

> 10. Don't clutter the namespace, use pxd includes. Short source files are
> preferred to long. Simple is better than complex. Keep the source nice and
> tidy.

Not sure what you mean by "pxd includes," but yes, you should use pxd
files and cimport just as you would in Python to keep things
manageable and modular.

- Robert