[Python-Dev] Improving the Python Memory Allocator

2005-01-23 Thread Evan Jones
This message is a follow up to a thread I started on python-dev back in 
October, archived here:

http://mail.python.org/pipermail/python-dev/2004-October/049480.html
Basically, the problem I am trying to solve is that the Python memory 
allocator never frees memory back to the operating system. I have 
attached a patch against obmalloc.c for discussion. The patch still has 
some rough edges and possibly some bugs, so I don't think it should be 
merged as is. However, I would appreciate any feedback on the chances 
for getting this implementation into the core. The rest of this message 
lists some disadvantages to this implementation, a description of the 
important changes, a benchmark, and my future plans if this change gets 
accepted.

The patch works for any version of Python that uses obmalloc.c (which 
includes Python 2.3 and 2.4), but I did my testing with Python 2.5 from 
CVS under Linux and Mac OS X. This version of the allocator will 
actually free memory. It has two disadvantages:

First, there is slightly more overhead with programs that allocate a 
lot of memory, release it, then reallocate it. The original allocator 
simply holds on to all the memory, allowing it to be efficiently 
reused. This allocator will call free(), so it also must call malloc() 
again when the memory is needed. I have a "worst case" benchmark which 
shows that this cost isn't too significant, but it could be a problem 
for some workloads. If it is, I have an idea for how to work around it.

Second, the previous allocator went out of its way to permit a module 
to call PyObject_Free while another thread is executing 
PyObject_Malloc. Apparently, this was a backwards compatibility hack 
for old Python modules which erroneously call these functions without 
holding the GIL. These modules will have to be fixed if this 
implementation is accepted into the core.

Summary of the changes:
- Add an "arena_object" structure for tracking pages that belong to 
each 256kB arena.
- Change the "arenas" array from an array of pointers to an array of 
arena_object structures.
- When freeing a page (a pool), it is placed on a free pool list for 
the arena it belongs to, instead of a global free pool list.
- When freeing a page, if the arena is completely unused, the arena is 
deallocated.
- When allocating a page, it is taken from the arena that is the most 
full. This gives arenas that are almost completely unused a chance to 
be freed.
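
To make this concrete, here is an illustrative Python model of the new
bookkeeping (the real implementation is C code in obmalloc.c; the class
names and the pools-per-arena constant are mine, for illustration only):

# Illustrative model only: a 256kB arena holds 64 4kB pools.
class Arena:
    POOLS_PER_ARENA = 64

    def __init__( self ):
        self.free_pools = Arena.POOLS_PER_ARENA

class Allocator:
    def __init__( self ):
        self.arenas = []

    def alloc_pool( self ):
        # Take a pool from the most-full arena (fewest free pools), so
        # that nearly-empty arenas get a chance to drain and be freed.
        best = None
        for arena in self.arenas:
            if arena.free_pools > 0:
                if best is None or arena.free_pools < best.free_pools:
                    best = arena
        if best is None:
            best = Arena()
            self.arenas.append( best )
        best.free_pools -= 1
        return best

    def free_pool( self, arena ):
        # A freed pool goes back to its own arena, not to a global list.
        arena.free_pools += 1
        if arena.free_pools == Arena.POOLS_PER_ARENA:
            # The arena is completely unused: return it to the OS.
            self.arenas.remove( arena )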

Benchmark:
The only benchmark I have performed at the moment is the worst case for 
this allocator: A program that allocates 1 000 000 Python objects which 
occupy nearly 200MB, frees them, reallocates them, then quits. I ran 
the program four times, and discarded the initial time. Here is the 
object:

class Obj:
    def __init__( self ):
        self.dumb = "hello"
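For reference, the workload looks something like this (a sketch; the
actual driver script is not reproduced here):

# Assumed driver: allocate the objects, free them, then reallocate.
objs = [ Obj() for i in xrange( 1000000 ) ]   # ~200MB of small objects
del objs                                      # release everything
objs = [ Obj() for i in xrange( 1000000 ) ]   # reallocate
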
And here are the average execution times for this program:

Python 2.5:
    real time: 16.304
    user time: 16.016
    system time: 0.257

Python 2.5 + patch:
    real time: 16.062
    user time: 15.593
    system time: 0.450
As expected, the patched version spends nearly twice as much system
time as the original version. This is because it calls free() and
malloc() twice as many times. However, this difference is offset by the
fact that the user space execution time is actually *less* than the
original version's. How is this possible? The likely cause is that the
original version declared the arenas pointer to be "volatile" in order
to work when Free and Malloc were called simultaneously. Since this
version breaks that, the pointer no longer needs to be volatile, which
allows the value to be stored in a register instead of being read from
memory on each operation.

Here are some graphs of the memory allocator behaviour running this 
benchmark.

Original: http://www.eng.uwaterloo.ca/~ejones/original.png
New: http://www.eng.uwaterloo.ca/~ejones/new.png
Future Plans:
- More detailed benchmarking.
- The "specialized" allocators for the basic types, such as ints, also 
need to free memory back to the system.
- Potentially the allocator should keep some amount of free memory 
around to improve the performance of programs that cyclically allocate 
and free large amounts of memory. This amount should be "self-tuned" to 
the application.

Thank you for your feedback,
Evan Jones




Re: [Python-Dev] Improving the Python Memory Allocator

2005-01-24 Thread Evan Jones
On Jan 24, 2005, at 7:00, Rodrigo Dias Arruda Senra wrote:
> Depending on the cost of arena allocation, it might help to define a
> lower threshold keeping a minimum of empty arena_objects permanently
> available. Do you think this can bring any speedup ?
Yes, I think it might. I have to do some more benchmarking first, to 
try and figure out how expensive the allocations are. This is one of my 
"future work" items to work on if this change gets accepted. I have not 
implemented it yet, because I don't want to have to merge one *massive* 
patch. My rough idea is to do something like this:

1. Keep track of the largest number of pages in use at one time.
2. Every N memory operations (or some other measurement of "time"), 
reset this value and calculate a moving average of the number of pages. 
This estimates the current memory requirements of the application.

3. If (used + free) > average, free arenas until freeing one more arena 
would make (used + free) < average.
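
A sketch of what that heuristic might look like (nothing like this is
implemented yet; the names, the interval, and the averaging weight are
all placeholders):

# Sketch only: the interval and the weight are placeholder values.
class ArenaHousekeeper:
    INTERVAL = 100000   # memory operations between housekeeping passes
    WEIGHT = 0.5        # weight of the newest peak in the average

    def __init__( self ):
        self.ops = 0
        self.peak_pages = 0
        self.average = 0.0

    def count_op( self, pages_in_use ):
        self.ops += 1
        self.peak_pages = max( self.peak_pages, pages_in_use )
        if self.ops >= ArenaHousekeeper.INTERVAL:
            # Step 2: fold the recent peak into a moving average, reset.
            self.average = ( ArenaHousekeeper.WEIGHT * self.peak_pages +
                    ( 1.0 - ArenaHousekeeper.WEIGHT ) * self.average )
            self.peak_pages = 0
            self.ops = 0

    def may_free_arena( self, used, free, pages_per_arena ):
        # Step 3: free an arena only if (used + free) stays >= average.
        return used + free - pages_per_arena >= self.average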

This is better than a static scheme which says "keep X MB of free
memory around" because it will self-tune to the application's
requirements. If you have an application that needs lots of RAM, it
will keep lots of RAM. If it has very low RAM usage, it will be more
aggressive in reclaiming free space. The challenge is how to determine
a good measurement of "time." Ideally, if the application was idle for
a while, you would perform some housekeeping like this. Does Python's
cyclic garbage collector currently do this? If so, I could hook this
"management" stuff onto its calls to gc.collect().

Evan Jones


Re: [Python-Dev] Improving the Python Memory Allocator

2005-01-24 Thread Evan Jones
On Jan 24, 2005, at 18:16, Martin v. Löwis wrote:
> - please don't post patches here; post them to SF
>   You may ask for comments here after you posted them to SF.
Sure. This should be done even for patches which should absolutely not 
be committed?

> - please follow Python coding style. In particular, don't write
>     if ( available_arenas == NULL ) {
>   but write
>     if (available_arenas == NULL) {
Yikes! This is a "bad" habit of mine that puts me in the minority on
coding style. Thank you for catching it.

>> Second, the previous allocator went out of its way to permit a module
>> to call PyObject_Free while another thread is executing
>> PyObject_Malloc. Apparently, this was a backwards compatibility hack
>> for old Python modules which erroneously call these functions without
>> holding the GIL. These modules will have to be fixed if this
>> implementation is accepted into the core.
> I'm not certain it is acceptable to make this assumption. Why is it
> not possible to use the same approach that was previously used (i.e.
> leak the arenas array)?
This is definitely a very important point of discussion. The main
problem is that leaking the "arenas" array is not sufficient to make
the memory allocator thread safe. Back in October, Tim Peters suggested
that it might be possible to make the breakage easily detectable:

http://mail.python.org/pipermail/python-dev/2004-October/049502.html
> If we changed PyMem_{Free, FREE, Del, DEL} to map to the system
> free(), all would be golden (except for broken old code mixing
> PyObject_ with PyMem_ calls).  If any such broken code still exists,
> that remapping would lead to dramatic failures, easy to reproduce; and
> old code broken in the other, infinitely more subtle way (calling
> PyMem_{Free, FREE, Del, DEL} when not holding the GIL) would continue
> to work fine.
I'll be honest, I only have a theoretical understanding of why this
support is necessary, or why it is currently correct. For example, is
it possible to call PyMem_Free from two threads simultaneously? Since
the problem is that threads could call PyMem_Free without holding the
GIL, it seems to me that it is possible. Shouldn't it also be
supported? In the current memory allocator, I believe that situation
can lead to inconsistent state. For example, see obmalloc.c:746, where
it has been determined that a block needs to be put on the list of free
blocks:

*(block **)p = lastfree = pool->freeblock;
pool->freeblock = (block *)p;
Imagine two threads are simultaneously freeing blocks that belong to 
the same pool. They both read the same value for pool->freeblock, and 
assign that same value to p. The changes to pool->freeblock will have 
some arbitrary ordering. The result? You have just leaked a block of 
memory.
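
Here is a little Python model of that interleaving (the C code above
manipulates raw pointers, but the lost-update effect is the same):

class Block:
    def __init__( self ):
        self.next = None

class Pool:
    def __init__( self ):
        self.freeblock = None

pool = Pool()
pool.freeblock = Block()    # the existing head of the free list
p_a = Block()               # block freed by thread A
p_b = Block()               # block freed by thread B

p_a.next = pool.freeblock   # thread A reads the head...
p_b.next = pool.freeblock   # ...thread B reads the SAME head
pool.freeblock = p_a        # thread A pushes its block
pool.freeblock = p_b        # thread B overwrites A's push

# Walk the free list: p_a never appears. That block has been leaked.
node = pool.freeblock
while node is not None:
    assert node is not p_a
    node = node.next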

Basically, if a concurrent memory allocator is the requirement, then I 
think some other approach is necessary.

>> - When allocating a page, it is taken from the arena that is the most
>> full. This gives arenas that are almost completely unused a chance to
>> be freed.
> It would be helpful if that was documented in the data structures
> somewhere. The fact that the nextarena list is sorted by nfreepools
> is only mentioned in the place where this property is preserved;
> it should be mentioned in the introductory comments as well.
This is one of those rough edges I mentioned before. If there is some
consensus that these changes should be accepted, then I will need to
severely edit the comments at the beginning of obmalloc.c.

Thanks for your feedback,
Evan Jones


[Python-Dev] Python Interpreter Thread Safety?

2005-01-28 Thread Evan Jones
Due to the issue of thread safety in the Python memory allocator, I
have been wondering about thread safety in the rest of the Python
interpreter. I understand that the interpreter is not thread safe, but
I'm not sure that I have seen a discussion of all the areas where this
is an issue. Here are the areas I know of:

1. The memory allocator.
2. Reference counts.
3. The cyclic garbage collector.
4. Current interpreter state is pointed to by a single shared pointer.
5. Many modules may not be thread safe (?).
Ignoring the issue of #5 for the moment, are there any other areas 
where this is a problem? I'm curious about how much work it would be to 
allow concurrent execution of Python code.

Evan Jones
Note: One of the reasons I am asking is that my memory allocator patch
changes the current allocator from "sort of" thread safe to obviously
unsafe. One way to eliminate this issue is to make the allocator
completely thread safe, but that would require some fairly significant
changes to avoid a major performance penalty. However, if it was one of
the components that permitted the interpreter to go multi-threaded,
then it would be worth it.



Re: [Python-Dev] Python Interpreter Thread Safety?

2005-01-28 Thread Evan Jones
On Jan 28, 2005, at 18:24, Martin v. Löwis wrote:
>> 5. Many modules may not be thread safe (?).
> Modules often release the GIL through BEGIN_ALLOW_THREADS, if they know
> that would be safe if another thread would enter the Python
> interpreter.
Right, I guess that the modules already have to deal with being 
reentrant and thread-safe, since Python threads could already cause 
issues.

>> Ignoring the issue of #5 for the moment, are there any other areas
>> where this is a problem? I'm curious about how much work it would be
>> to allow concurrent execution of Python code.
> Define "concurrent". Webster's offers
Sorry, I really meant *parallel* execution of Python code: Multiple 
threads simultaneously executing a Python program, potentially on 
different CPUs.

> There have been attempts to implement free threading, and
> they have failed.
What I was trying to ask with my last email was: what are the trouble
areas? There are probably many that I am unaware of, due to my
unfamiliarity with the Python internals.

> Due to some
> unfortunate historical reasons, there is code which enters free()
> without holding the GIL - and that is what the allocator specifically
> deals with.
Right, but as said in a previous post, I'm not convinced that the 
current implementation is completely correct anyway.

> Again, the interpreter supports multi-threading today. Removing
> the GIL is more difficult, though - nearly any container object
> (list, dictionary, etc) would have to change, plus the reference
> counting (which would have to grow atomic increment/decrement).
Wouldn't it be up to the programmer to ensure that accesses to shared 
objects, like containers, are serialized? For example, with Java's 
collections, there are both synchronized and unsynchronized versions.

Evan Jones


Re: [Python-Dev] Python Interpreter Thread Safety?

2005-01-28 Thread Evan Jones
On Jan 28, 2005, at 20:27, Tim Peters wrote:
> The hacks in PyObject_Free (== PyMem_Free)
> are there solely so that question can be answered correctly in the
> absence of holding the GIL.  "That question" == "does pymalloc
> control the pointer passed to me, or does the system malloc?".

Ah! *Now* I get it. And yes, it will be possible to still support this 
in my patched version of the allocator. It just means that I have to 
leak the "arenas" array just like it did before, and then do some hard 
thinking about memory models and consistency to decide if the "arenas" 
pointer needs to be volatile.

> When pymalloc was new, we went to insane lengths to
> avoid breaking that stuff, but enough is enough.
So you don't think we need to bother supporting that any more?
> Back to the present:
>> Wouldn't it be up to the programmer to ensure that accesses to shared
>> objects, like containers, are serialized? For example, with Java's
>> collections, there are both synchronized and unsynchronized versions.
> Enormous mounds of existing threaded Python code freely manipulates
> lists and dicts without explicit locking now.  We can't break that --
> and wouldn't want to.  Writing threaded code is especially easy (a
> relative stmt, not absolute) in Python because of it.
Right, because currently Python switches threads at the granularity of
opcodes, which gives you this serialization at the cost of never having
parallel execution.

Evan Jones


Re: [Python-Dev] Python Interpreter Thread Safety?

2005-01-28 Thread Evan Jones
On Jan 28, 2005, at 19:44, Martin v. Löwis wrote:
> Python threads can run truly parallel, as long as one of them
> invoke BEGIN_ALLOW_THREADS.
Except that they are really executing C code, not Python code.
> I think nobody really remembers - ask Google for "Python free
> threading". Greg Stein did the patch, and the main problem apparently
> was that the performance became unacceptable - apparently primarily
> because of dictionary locking.
Thanks, I found the threads discussing it.
>> Right, but as said in a previous post, I'm not convinced that the
>> current implementation is completely correct anyway.
> Why do you think so? (I see in your previous post that you claim
> it is not completely correct, but I don't see any proof).
There are a number of issues actually, but as Tim points out, they only
arise if the blocks are managed by PyMalloc. I had written a
description of three of them here, but they are not relevant. If the
issue is calling PyMem_Free with a pointer that was allocated with
malloc() while PyMalloc is doing other stuff, then no problem: that is
possible to support, but I'll have to think rather hard about some of
the issues.

> For example, for lists, the C API allows direct access to the pointers
> in the list. If the elements of the list could change in-between, an
> object in the list might go away after you got the pointer, but before
> you had a chance to INCREF it. This would cause a crash shortly
> afterwards. Even if that was changed to always return a new reference,
> lots of code would break, as it would create large memory leaks
> (code would have needed to decref the list items, but currently
> doesn't - nor is it currently necessary).
Ah! Right. In Java, the collections are all actually written in Java, 
and run on the VM. Thus, when some concurrent weirdness happens, it 
just corrupts the application, not the VM. However, in Python, this 
could actually corrupt the interpreter itself, crashing the entire 
thing with a very ungraceful Segmentation Fault or something similar.

Evan Jones


Re: Moving towards Python 3.0 (was Re: [Python-Dev] Speed up function calls)

2005-01-31 Thread Evan Jones
On Jan 31, 2005, at 0:17, Guido van Rossum wrote:
> The "just kidding" applies to the whole list, right? None of these
> strike me as good ideas, except for improvements to function argument
> passing.
Really? You see no advantage to moving to garbage collection, nor 
allowing Python to leverage multiple processor environments? I'd be 
curious to hear your reasons why not.

My knowledge about garbage collection is weak, but I have read a little 
bit of Hans Boehm's work on garbage collection. For example, his 
"Memory Allocation Myths and Half Truths" presentation 
(http://www.hpl.hp.com/personal/Hans_Boehm/gc/myths.ps) is quite 
interesting. On page 25 he examines reference counting. The biggest 
disadvantage mentioned is that simple pointer assignments end up 
becoming "increment ref count" operations as well, which can "involve 
at least 4 potential memory references." The next page has a 
micro-benchmark that shows reference counting performing very poorly. 
Not to mention that Python has a garbage collector *anyway,* so 
wouldn't it make sense to get rid of the reference counting?

My only argument for making Python capable of leveraging multiple 
processor environments is that multithreading seems to be where the big 
performance increases will be in the next few years. I am currently 
using Python for some relatively large simulations, so performance is 
important to me.

Evan Jones


[Python-Dev] Memory Allocator Part 2: Did I get it right?

2005-02-15 Thread Evan Jones
After I finally understood what thread-safety guarantees the Python  
memory allocator needs to provide, I went and did some hard thinking  
about the code this afternoon. I believe that my modifications provide  
the same guarantees that the original version did. I do need to declare  
the arenas array to be volatile, and leak the array when resizing it.  
Please correct me if I am wrong, but the situation that needs to be  
supported is this:

While one thread holds the GIL, any other thread can call PyObject_Free  
with a pointer that was returned by the system malloc.

The following situation is *not* supported:
While one thread holds the GIL, another thread calls PyObject_Free with  
a pointer that was returned by PyObject_Malloc.

I'm hoping that I got things a little better this time around. I've  
submitted my updated patch to the patch tracker. For reference, I've  
included links to SourceForge and the previous thread.

Thank you,
Evan Jones

Previous thread:
http://mail.python.org/pipermail/python-dev/2005-January/051255.html
Patch location:
http://sourceforge.net/tracker/index.php?func=detail&aid=1123430&group_id=5470&atid=305470



Re: [Python-Dev] Memory Allocator Part 2: Did I get it right?

2005-02-15 Thread Evan Jones
On Feb 15, 2005, at 17:52, Tim Peters wrote:
> As I said before, I don't think we need to support this any more.
> More, I think we should not -- the support code is excruciatingly
> subtle, it wasted plenty of your time trying to keep it working, and
> if we keep it in it's going to continue to waste time over the coming
> years (for example, in the short term, it will waste my time reviewing
> it).
I do not have nearly enough experience in the Python world to evaluate 
this decision. I've only been programming in Python for about two years 
now, and as I am sure you are aware, this is my first patch that I have 
submitted to Python. I don't really know my way around the Python 
internals, beyond writing basic extensions in C. Martin's opinion is 
clearly the opposite of yours.

Basically, the debate seems to boil down to maintaining backwards
compatibility at the cost of making the code in obmalloc.c harder to
understand. The particular case that is being supported could
definitely be viewed as a "bug" in the code that uses obmalloc. It is
also likely quite rare. However, until now it has been supported, so it
is hard to judge exactly how much code would be affected. It would
definitely be a minor barrier to moving to Python 2.5. Is there some
sort of consensus that is possible on this issue?

>> While one thread holds the GIL, any other thread can call
>> PyObject_Free with a pointer that was returned by the system malloc.
> What _was_ supported was more generally that
>     any number of threads could call PyObject_Free with pointers that
>     were returned by the system malloc/realloc
> at the same time as
>     a single thread, holding the GIL, was doing anything whatsoever
>     (including executing any code inside obmalloc.c)
Okay, good, that is what I have assumed.
> Although that's a misleading way of expressing the actual intent; more
> on that below.
That's fine. It may be a misleading description of the intent, but it 
is an accurate description of the required behaviour. At least I hope 
it is.

> I expect it would be easier if you
> ripped out the horrid support for PyObject_Free abuse; in a sane
> world, the release-build PyMem_FREE, PyMem_Del, and PyMem_DEL would
> expand to "free" instead of to "PyObject_FREE" (via changes to
> pymem.h).
It turns out that basically the only thing that would change would be 
removing the "volatile" specifiers from two of the global variables, 
plus it would remove about 100 lines of comments. :) The "work" was 
basically just hurting my brain trying to reason about the concurrency 
issues, not changing code.

> It was never legit to do #a without holding the GIL.  It was clear as
> mud whether it was legit to do #b without holding the GIL.  If
> PyMem_Del (etc) change to expand to "free" in a release build, then #b
> can remain clear as mud without harming anyone.  Nobody should be
> doing #a anymore.  If someone still is, "tough luck -- fix it, you've
> had years of warning" is easy for me to live with at this stage.
Hmm... The issue is that case #a may not be an easy problem to 
diagnose: Some implementations of free() will happily do nothing if 
they are passed a pointer they know nothing about. This would just 
result in a memory leak. Other implementations of free() can output a 
warning or crash in this case, which would make it trivial to locate.

> I suppose the other consideration is that already-compiled extension
> modules on non-Windows(*) systems will, if they're not recompiled,
> continue to call PyObject_Free everywhere they had a
> PyMem_Del/DEL/FREE call.
Is it guaranteed that extension modules will be binary compatible with 
future Python releases? I didn't think this was the case.

Thanks for the feedback,
Evan Jones


Re: [Python-Dev] Windows Low Fragementation Heap yields speedup of ~15%

2005-02-16 Thread Evan Jones
On Feb 16, 2005, at 18:42, Martin v. Löwis wrote:
> I must admit that I'm surprised. I would have expected
> that most allocations in Python go through obmalloc, so
> the heap would only see "large" allocations.
> It would be interesting to find out, in your application,
> why it is still an improvement to use the low-fragmentation
> heaps.
Hmm... This is an excellent point. A grep through the Python source 
code shows that the following files call the native system malloc (I've 
excluded a few obviously platform specific files). A quick visual 
inspection shows that most of these are using it to allocate some sort 
of array or string, so it likely *should* go through the system malloc. 
Gfeller, any idea if you are using any of the modules on this list? If 
so, it would be pretty easy to try converting them to call the obmalloc 
functions instead, and see how that affects the performance.

Evan Jones
Demo/pysvr/pysvr.c
Modules/_bsddb.c
Modules/_curses_panel.c
Modules/_cursesmodule.c
Modules/_hotshot.c
Modules/_sre.c
Modules/audioop.c
Modules/bsddbmodule.c
Modules/cPickle.c
Modules/cStringIO.c
Modules/getaddrinfo.c
Modules/main.c
Modules/pyexpat.c
Modules/readline.c
Modules/regexpr.c
Modules/rgbimgmodule.c
Modules/svmodule.c
Modules/timemodule.c
Modules/zlibmodule.c
PC/getpathp.c
Python/strdup.c
Python/thread.c


Re: [Python-Dev] Windows Low Fragementation Heap yields speedup of ~15%

2005-02-18 Thread Evan Jones
On Thu, 2005-02-17 at 22:38, Tim Peters wrote:
> Then you allocate a small object, marked 's':
> bbbsfff
Isn't the whole point of obmalloc that we don't want to allocate "s"
on the heap, since it is small? I guess "s" could be an object that
might potentially grow.

> One thing to take from that is that LFH can't be helping list-growing
> in a direct way either, if LFH (as seems likely) also needs to copy
> objects that grow in order to keep its internal memory segregated by
> size.  The indirect benefit is still available, though:  LFH may be
> helping simply by keeping smaller objects out of the general heap's
> hair.
So then wouldn't this mean that there would have to be some sort of 
small object being allocated via the system malloc that is causing the 
poor behaviour? As you mention, I wouldn't think it would be list 
objects, since resizing lists using LFH should be *worse*. That would 
actually be something that is worth verifying, however. It could be 
that the Windows LFH is extra clever?

> I'm afraid the only way you can know for sure is by obtaining detailed
> memory maps and analyzing them.
Well, it would also be useful to find out what code is calling the 
system malloc. This would make it easy to examine the code and see if 
it should be calling obmalloc or the system malloc. Any good ideas for 
easily obtaining this information? I imagine that some profilers must 
be able to produce a complete call graph?

Evan Jones


Re: [Python-Dev] Memory Allocator Part 2: Did I get it right?

2005-02-18 Thread Evan Jones
Sorry for taking so long to get back to this thread, it has been one of 
those weeks for me.

On Feb 16, 2005, at 2:50, Martin v. Löwis wrote:
> Evan then understood the feature, and made it possible.
This is very true: it was a very useful exercise.
> I can personally accept breaking the code that still relies on the
> invalid APIs. The only problem is that it is really hard to determine
> whether some code *does* violate the API usage.
Great. Please ignore the patch on SourceForge for a little while. I'll 
produce a "revision 3" this weekend, without the compatibility hack.

Evan Jones


Re: [Python-Dev] Windows Low Fragementation Heap yields speedup of ~15%

2005-02-18 Thread Evan Jones
On Feb 18, 2005, at 17:51, Tim Peters wrote:
> grow the list to its final size once, at the start (overestimating if
> you don't know for sure).  Then instead of appending, keep an index to
> the next free slot, same as you'd do in C.  Then the list guts never
> move, so if that doesn't yield the same kind of speedup without using
> LFH, list copying wasn't actually the culprit to begin with.
If this *does* improve the performance of his application by 15%, that  
would strongly argue for an addition to the list API similar to Java's  
ArrayList.ensureCapacity or the STL's vector::reserve. Since the  
list implementation already maintains separate ints for the list array  
size and the list occupied size, this would really just expose this  
implementation detail to Python. I don't like revealing the  
implementation in this fashion, but if it does make a significant  
performance difference, it could be worth it.

http://java.sun.com/j2se/1.5.0/docs/api/java/util/ArrayList.html#ensureCapacity(int)
http://www.sgi.com/tech/stl/Vector.html#4
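
For comparison, Tim's suggested workaround needs no new API at all (a
sketch):

def build( n, make_item ):
    # Grow the list to its final size once, then fill slots by index,
    # so the list guts never move.
    items = [ None ] * n
    for i in xrange( n ):
        items[ i ] = make_item( i )
    return items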

Evan Jones


Re: [Python-Dev] Memory Allocator Part 2: Did I get it right?

2005-03-11 Thread Evan Jones
On Mar 4, 2005, at 23:55, Brett C. wrote:
> Evan uploaded the newest version (@ http://www.python.org/sf/1123430)
> and he says it is "complete".
I should point out (very late! It's been a busy couple of weeks) that I 
fully expect that there may be some issues with the patch that will 
arise when it is reviewed. I am still lurking on Python-Dev, and I will 
strive to correct any defects ASAP. A few users have contacted me 
privately and have tested the patch, so it works for a few people at 
least.

Evan Jones


[Python-Dev] Unicode byte order mark decoding

2005-04-01 Thread Evan Jones
I recently rediscovered this strange behaviour in Python's Unicode 
handling. I *think* it is a bug, but before I go and try to hack 
together a patch, I figure I should run it by the experts here on 
Python-Dev. If you understand Unicode, please let me know if there are 
problems with making these minor changes.

>>> import codecs
>>> codecs.BOM_UTF8.decode( "utf8" )
u'\ufeff'
>>> codecs.BOM_UTF16.decode( "utf16" )
u''
Why does the UTF-16 decoder discard the BOM, while the UTF-8 decoder 
turns it into a character? The UTF-16 decoder contains logic to 
correctly handle the BOM. It even handles byte swapping, if necessary. 
I propose that  the UTF-8 decoder should have the same logic: it should 
remove the BOM if it is detected at the beginning of a string. This 
will remove a bit of manual work for Python programs that deal with 
UTF-8 files created on Windows, which frequently have the BOM at the 
beginning. The Unicode standard is unclear about how it should be 
handled (version 4, section 15.9):

Although there are never any questions of byte order with UTF-8 text, 
this sequence can serve as signature for UTF-8 encoded text where the 
character set is unmarked. [...] Systems that use the byte order mark 
must recognize when an initial U+FEFF signals the byte order. In those 
cases, it is not part of the textual content and should be removed 
before processing, because otherwise it may be mistaken for a 
legitimate zero width no-break space.
At the very least, it would be nice to add a note about this to the 
documentation, and possibly add this example function that implements 
the "UTF-8 or ASCII?" logic:

def autodecode( s ):
    if s.startswith( codecs.BOM_UTF8 ):
        # The byte string s is UTF-8
        out = s.decode( "utf8" )
        return out[1:]
    else:
        return s.decode( "ascii" )
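
For example (a hypothetical file name; this simple version reads the
whole file into memory first):

data = open( "notes.txt", "rb" ).read()
text = autodecode( data )
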
As a second issue, the UTF-16LE and UTF-16BE decoders almost do the
right thing: they turn the BOM into a character, just like the Unicode
specification says they should.

>>> codecs.BOM_UTF16_LE.decode( "utf-16le" )
u'\ufeff'
>>> codecs.BOM_UTF16_BE.decode( "utf-16be" )
u'\ufeff'
However, they also *incorrectly* handle the reversed byte order mark:
>>> codecs.BOM_UTF16_BE.decode( "utf-16le" )
u'\ufffe'
This is *not* a valid Unicode character. The Unicode specification 
(version 4, section 15.8) says the following about non-characters:

Applications are free to use any of these noncharacter code points 
internally but should never attempt to exchange them. If a 
noncharacter is received in open interchange, an application is not 
required to interpret it in any way. It is good practice, however, to 
recognize it as a noncharacter and to take appropriate action, such as 
removing it from the text. Note that Unicode conformance freely allows 
the removal of these characters. (See C10 in Section 3.2, Conformance 
Requirements.)
My interpretation of the specification means that Python should 
silently remove the character, resulting in a zero length Unicode 
string. Similarly, both of the following lines should also result in a 
zero length Unicode string:

>>> '\xff\xfe\xfe\xff'.decode( "utf16" )
u'\ufffe'
>>> '\xff\xfe\xff\xff'.decode( "utf16" )
u'\uffff'
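
In the meantime, a caller can get the proposed behaviour by stripping
the noncharacters after decoding (a sketch of a workaround, not of the
patch itself):

def decode_utf16( s ):
    # Remove the noncharacters U+FFFE and U+FFFF after decoding, which
    # is what I am proposing the codec do internally.
    out = s.decode( "utf16" )
    return out.replace( u'\ufffe', u'' ).replace( u'\uffff', u'' )
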
Thanks for your feedback,
Evan Jones


Re: [Python-Dev] Unicode byte order mark decoding

2005-04-01 Thread Evan Jones
On Apr 1, 2005, at 15:19, M.-A. Lemburg wrote:
> The BOM (byte order mark) was a non-standard Microsoft invention
> to detect Unicode text data as such (MS always uses UTF-16-LE for
> Unicode text files).
Well, its origins do not really matter, since at this point the BOM is
firmly encoded in the Unicode standard. It seems to me that it is in
everyone's best interest to support it.

> It is not needed for the UTF-8 because that format doesn't rely on
> the byte order and the BOM character at the beginning of a stream is
> a legitimate ZWNBSP (zero width non breakable space) code point.
You are correct: it is a legitimate character. However, its use as a 
ZWNBSP character has been deprecated:

The overloading of semantics for this code point has caused problems 
for programs and protocols. The new character U+2060 WORD JOINER has 
the same semantics in all cases as U+FEFF, except that it cannot be 
used as a signature. Implementers are strongly encouraged to use word 
joiner in those circumstances whenever word joining semantics is 
intended.
Also, the Unicode specification is ambiguous on what an implementation 
should do about a leading ZWNBSP that is encoded in UTF-16. Like I 
mentioned, if you look at the Unicode standard, version 4, section 
15.9, it says:

2. Unmarked Character Set. In some circumstances, the character set 
information for a stream of coded characters (such as a file) is not 
available. The only information available is that the stream contains 
text, but the precise character set is not known.
This seems to indicate that it is permitted to strip the BOM from the 
beginning of UTF-8 text.

> -1; there's no standard for UTF-8 BOMs - adding it to the
> codecs module was probably a mistake to begin with. You usually
> only get UTF-8 files with BOM marks as the result of recoding
> UTF-16 files into UTF-8.
This is clearly incorrect. The UTF-8 BOM is specified in the Unicode
standard, version 4, section 15.9:

In UTF-8, the BOM corresponds to the byte sequence <EF BB BF>.
I normally find files with UTF-8 BOMs from many Windows applications 
when you save a text file as UTF8. I think that Notepad or WordPad does 
this, for example. I think UltraEdit also does the same thing. I know 
that Scintilla definitely does.

>> At the very least, it would be nice to add a note about this to the
>> documentation, and possibly add this example function that implements
>> the "UTF-8 or ASCII?" logic.
> Well, I'd say that's a very English way of dealing with encoded
> text ;-)
Please note I am saying only that something like this may merit
consideration for addition to the documentation, and not to the Python
standard library. This example function more closely replicates the
logic that is used by those Windows applications when opening ".txt"
files. It uses the default locale if there is no BOM:

def autodecode( s ):
    if s.startswith( codecs.BOM_UTF8 ):
        # The byte string s is UTF-8
        out = s.decode( "utf8" )
        return out[1:]
    else:
        return s.decode()
> BTW, how do you know that s came from the start of a file
> and not from slicing some already loaded file somewhere
> in the middle ?
Well, the same argument could be applied to the UTF-16 decoder: how
does it know that the string came from the start of a file, and not
from slicing some already loaded file? The standard states that:

In the UTF-16 encoding scheme, U+FEFF at the very beginning of a file 
or stream explicitly signals the byte order.
So it is perfectly permissible to perform this type of processing if 
you consider a string to be equivalent to a stream.

>> My interpretation of the specification means that Python should
>> silently remove the character, resulting in a zero length Unicode
>> string.
> Hmm, wouldn't it be better to raise an error ? After all,
> a reversed BOM mark in the stream looks a lot like you're
> trying to decode a UTF-16 stream assuming the wrong
> byte order ?!
Well, either one is possible, however the Unicode standard suggests, 
but does not require, silently removing them:

It is good practice, however, to recognize it as a noncharacter and to 
take appropriate action, such as removing it from the text. Note that 
Unicode conformance freely allows the removal of these characters.
I would prefer silently ignoring them in the str.decode() function,
since I believe in "be strict in what you emit, but liberal in what you
accept." I think that this only applies to str.decode(). Any other
attempt to create non-characters, such as unichr( 0xFFFF ), *should*
raise an exception, because clearly the programmer is making a mistake.

> Other than that: +1 on fixing this case.
Cool!
Evan Jones


Re: [Python-Dev] Unicode byte order mark decoding

2005-04-05 Thread Evan Jones
On Apr 5, 2005, at 15:33, Walter Dörwald wrote:
> The stateful decoder has a little problem: At least three bytes
> have to be available from the stream until the StreamReader
> decides whether these bytes are a BOM that has to be skipped.
> This means that if the file only contains "ab", the user will
> never see these two characters.
Shouldn't the decoder be capable of doing a partial match and quitting
early? After all, "ab" is encoded in UTF-8 as <61> <62> but the BOM is
<EF> <BB> <BF>. If it did this type of partial matching, this issue
would be avoided except in rare situations.
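
Here is a sketch of that partial matching (a standalone illustration,
not the actual codecs machinery; it ignores the separate problem of
multi-byte characters split across chunks):

import codecs

class BOMSkippingUTF8:
    def __init__( self ):
        self.buffer = ""
        self.decided = False

    def decode( self, data ):
        if not self.decided:
            data = self.buffer + data
            if len( data ) < 3 and codecs.BOM_UTF8.startswith( data ):
                # Could still be a BOM: hold these bytes back. (A prefix
                # still buffered at end-of-stream is exactly the problem
                # described below.)
                self.buffer = data
                return u""
            self.decided = True
            self.buffer = ""
            if data.startswith( codecs.BOM_UTF8 ):
                data = data[ 3: ]    # it was a BOM: skip it
        return data.decode( "utf8" )

d = BOMSkippingUTF8()
assert d.decode( "a" ) + d.decode( "b" ) == u"ab"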

> A solution for this would be to add an argument named final to
> the decode and read methods that tells the decoder that the
> stream has ended and the remaining buffered bytes have to be
> handled now.
This functionality is provided by a flush() method on similar objects, 
such as the zlib compression objects.

Evan Jones


[Python-Dev] [PATCH][BUG] Segmentation Fault in xml.dom.minidom.parse

2005-09-29 Thread Evan Jones
The following Python script causes Python 2.3, 2.4 and the latest CVS  
to crash with a Segmentation Fault:

import xml.dom.minidom
x = u'\n<fran\xe7ais>Comment \xe7a va ? Tr\xe8s bien ?</fran\xe7ais>'
dom = xml.dom.minidom.parseString( x.encode( 'latin_1' ) )
print repr( dom.childNodes[0].localName )


The problem is that this XML document does not specify an encoding. In
this case, minidom assumes that it is encoded in UTF-8. However, in
fact it is encoded in Latin-1. My two line patch, in the SourceForge
tracker at the URL below, causes this to raise a UnicodeDecodeError
instead.

http://sourceforge.net/tracker/index.php?func=detail&aid=1309009&group_id=5470&atid=305470

Any chance that someone wants to commit this tiny two line fix? This
might be the kind of fix that is eligible to be backported to Python
2.4 as well. It passes "make test" on both my Linux system and my Mac.
I've also attached a patch that adds this test case to
test_minidom.py.

Thanks,

Evan Jones

--
Evan Jones
http://evanjones.ca/



[Python-Dev] Parser and Runtime: Divorced!

2005-10-26 Thread Evan Jones
After a few hours of tedious and frustrating hacking I've managed to 
separate the Python abstract syntax tree parser from the rest of Python 
itself. This could be useful for people who may wish to build Python 
tools without Python, or tools in C/C++.

In the process of doing this, I came across a comment mentioning that 
it would be desirable to separate the parser. Is there any interest in 
doing this? I now have a vague idea about how to do this. Of course, 
there is no point in making changes like this unless there is some 
tangible benefit.

I will make my ugly hack available once I have polished it a little bit 
more. It involved hacking header files to provide a "re-implementation" 
of the pieces of Python that the parser needs (PyObject, PyString, and 
PyInt). It likely is a bit buggy, and it doesn't support all the types 
(notably, it is missing support for Unicode, Longs, Floats, and 
Complex), but it works well enough to get the AST from a few simple 
strings, which is what I wanted.

Evan Jones

--
Evan Jones
http://evanjones.ca/



Re: [Python-Dev] Parser and Runtime: Divorced!

2005-10-30 Thread Evan Jones
On Oct 26, 2005, at 20:02, Evan Jones wrote:
> In the process of doing this, I came across a comment mentioning that
> it would be desirable to separate the parser. Is there any interest in
> doing this? I now have a vague idea about how to do this. Of course,
> there is no point in making changes like this unless there is some
> tangible benefit.

I am going to assume that since no one was excited about my post, the
answer is: no, there is no interest in separating the parser from the
rest of the Python run time.

At any rate, if anyone is looking for a standalone C Python parser 
library, you can get it at the following URL. It includes a "print the 
tree" example that displays the AST for a specified file. It only 
supports a subset of the parse tree (assignment, functions, print, 
return), but it should be obvious how it could be extended.

http://evanjones.ca/software/pyparser.html

Evan Jones

--
Evan Jones
http://evanjones.ca/
