[issue16392] [doc] import crashes on circular imports in ext modules

2021-12-13 Thread Stefan Behnel


Stefan Behnel  added the comment:

Given that PEP-489 has landed in Py3.5, which is already retired and has been 
for more than a year, I think we can just close this issue as outdated.

--
resolution:  -> out of date
stage: needs patch -> resolved
status: open -> closed

___
Python tracker 
<https://bugs.python.org/issue16392>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue45711] Simplify the interpreter's (type, val, tb) exception representation

2021-12-17 Thread Stefan Behnel


Change by Stefan Behnel :


--
nosy: +scoder

___
Python tracker 
<https://bugs.python.org/issue45711>
___



[issue45711] Simplify the interpreter's (type, val, tb) exception representation

2021-12-17 Thread Stefan Behnel


Stefan Behnel  added the comment:

FYI, we track the Cython side of this in
https://github.com/cython/cython/issues/4500

--

___
Python tracker 
<https://bugs.python.org/issue45711>
___



[issue45321] Module xml.parsers.expat.errors misses error code constants of libexpat >=2.0

2021-12-31 Thread Stefan Behnel


Stefan Behnel  added the comment:


New changeset e18d81569fa0564f3bc7bcfd2fce26ec91ba0a6e by Sebastian Pipping in 
branch 'main':
bpo-45321: Add missing error codes to module `xml.parsers.expat.errors` 
(GH-30188)
https://github.com/python/cpython/commit/e18d81569fa0564f3bc7bcfd2fce26ec91ba0a6e


--

___
Python tracker 
<https://bugs.python.org/issue45321>
___



[issue45321] Module xml.parsers.expat.errors misses error code constants of libexpat >=2.0

2021-12-31 Thread Stefan Behnel


Change by Stefan Behnel :


--
components: +XML
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed
type:  -> enhancement
versions:  -Python 3.10, Python 3.6, Python 3.7, Python 3.8, Python 3.9

___
Python tracker 
<https://bugs.python.org/issue45321>
___



[issue44394] [security] CVE-2013-0340 "Billion Laughs" fixed in Expat >=2.4.0: Update vendored copy to expat 2.4.1

2022-01-01 Thread Stefan Behnel

Stefan Behnel  added the comment:

I'd like to ask for clarification regarding issue 45321, which adds the missing 
error constants to the `expat` module. I consider those new features – it seems 
inappropriate to add new module constants in the middle of a release series. 
However, in this ticket here, the libexpat version was updated all the way back 
to Py3.6, to solve a security issue.

Should we also backport the error constants then?

--
nosy: +scoder

___
Python tracker 
<https://bugs.python.org/issue44394>
___



[issue45569] Drop support for 15-bit PyLong digits?

2022-01-12 Thread Stefan Behnel

Stefan Behnel  added the comment:

Cython should be happy with whatever CPython uses (as long as CPython's header 
files agree with CPython's build ;-) ).

I saw the RasPi benchmarks on the ML. That would have been my suggested trial 
platform as well.
https://mail.python.org/archives/list/python-...@python.org/message/5RJGI6THWCDYTTEPXMWXU7CK66RQUTD4/

The results look ok. Maybe the slowdown for pickling is really the increased 
data size of integers. And it's visible that some compute-heavy benchmarks 
like pyaes did get a little slower. I doubt that they represent a real use case 
on such a platform, though. Doing any kind of number crunching on a RasPi 
without NumPy would appear like a rather strange adventure.

That said, if we decide to keep 15-bit digits in the end, I wonder if 
"SIZEOF_VOID_P" is the right decision point. It seems more of a "has reasonably 
fast 64-bit multiply or not" kind of decision – however that translates into 
code. I'm sure there are 32-bit platforms that would actually benefit from 
30-bit digits today.

If we find a platform that would be fine with 30-bits but lacks a fast 64-bit 
multiply, then we could still try to add a platform specific value size check 
for smaller numbers. Since those are the common case, branch prediction might help 
us more often than not.

But then, I wonder how much complexity this is even worth, given that the goal 
is to reduce the complexity. Platform maintainers can still decide to configure 
the digit size externally for the time being, if it makes a difference for 
them. Maybe switching off 15-bits by default is just good enough for the next 
couple of years to come. :)
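
For anyone who wants to check which digit size their own build uses, here is a minimal sketch. `sys.int_info` is the real interpreter attribute; the `to_digits` helper is hypothetical and only illustrates how one value splits into 15-bit or 30-bit digits. It is not how CPython's C code is written.

```python
import sys

# Digit size chosen at build time (15 or 30 bits at the time of this thread).
print(sys.int_info.bits_per_digit, sys.int_info.sizeof_digit)


def to_digits(n, bits):
    """Split a non-negative int into little-endian digits of `bits` bits each.

    Hypothetical helper, for illustration only.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    mask = (1 << bits) - 1
    digits = []
    while True:
        digits.append(n & mask)
        n >>= bits
        if n == 0:
            return digits


value = 2**30
# The same value needs more digits (and thus more memory) with 15-bit digits.
print(len(to_digits(value, 15)), len(to_digits(value, 30)))
```

The digit count is what drives the data size difference mentioned above for pickling.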

--

___
Python tracker 
<https://bugs.python.org/issue45569>
___



[issue45948] Unexpected instantiation behavior for xml.etree.ElementTree.XMLParser(target=None)

2022-02-08 Thread Stefan Behnel


Stefan Behnel  added the comment:

This is a backwards incompatible change, but unlikely to have a wide impact.

I wondered for a second whether this changes things in the right direction, 
because it's not unreasonable to pass "None" to say "I want no target". But 
it's documented this way and lxml does it the same, so I agree that this should 
be changed to make "None" behave the same as no argument.
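
A small sketch of the "None means no argument" convention, for illustration. The `make_parser` wrapper is hypothetical and not part of ElementTree; it only assumes that `XMLParser` accepts a `target` keyword and that `TreeBuilder` is the default target, as documented.

```python
import xml.etree.ElementTree as ET


def make_parser(target=None):
    """Hypothetical wrapper: treat target=None exactly like a missing target."""
    if target is None:
        target = ET.TreeBuilder()  # the documented default target
    return ET.XMLParser(target=target)


parser = make_parser(target=None)
parser.feed("<root><child/></root>")
root = parser.close()  # with a TreeBuilder target, close() returns the root Element
print(root.tag, len(root))
```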

--

___
Python tracker 
<https://bugs.python.org/issue45948>
___



[issue24053] Define EXIT_SUCCESS and EXIT_FAILURE constants in sys

2022-02-18 Thread Stefan Behnel


Stefan Behnel  added the comment:

> Any reasons the PR still not merged?

There was dissent about whether these constants should be added or not. It 
doesn't help to merge a PR that is not expected to provide a benefit.

--

___
Python tracker 
<https://bugs.python.org/issue24053>
___



[issue46798] xml.etree.ElementTree: get() doesn't return default value, always ATTLIST value

2022-02-22 Thread Stefan Behnel


Stefan Behnel  added the comment:

The question here is simply, which is considered more important: the default 
provided by the document, or the default provided by Python. I don't think it's 
a clear choice, but the way it is now does not seem unreasonable. Changing it 
would mean deliberate breakage of existing code that relies on the existing 
behaviour, and I do not see a reason to do that.

--
resolution:  -> not a bug
stage:  -> resolved
status: open -> closed

___
Python tracker 
<https://bugs.python.org/issue46798>
___



[issue46786] embed, source, track, wbr HTML elements not considered empty

2022-02-22 Thread Stefan Behnel


Stefan Behnel  added the comment:

Makes sense. That list hasn't been updated in 10 years.
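
For context, a short sketch of what "considered empty" means for the HTML serialiser. It only uses `ET.tostring(..., method="html")`; the result for `wbr` depends on whether the Python version in use already contains the change discussed in this ticket, so no output is claimed for it here.

```python
import xml.etree.ElementTree as ET

for tag in ("br", "p", "wbr"):
    elem = ET.Element(tag)
    # method="html" omits the end tag for elements in ElementTree's empty-tag list
    print(tag, ET.tostring(elem, encoding="unicode", method="html"))
```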

--
versions:  -Python 3.10, Python 3.7, Python 3.8, Python 3.9

___
Python tracker 
<https://bugs.python.org/issue46786>
___



[issue46798] xml.etree.ElementTree: get() doesn't return default value, always ATTLIST value

2022-02-23 Thread Stefan Behnel


Stefan Behnel  added the comment:

> IMHO if the developer doesn't manage the XML itself it is VERY unreasonable 
> to use the document value and not the developer one.

I disagree. If the document says "this is the default if no explicit value is 
given", then I consider that just as good as providing a value each time. 
Meaning, the attribute *is* in fact present, just not explicitly spelled out on 
the element.

I would specifically like to avoid adding a new option just to override the way 
the document distributes its attribute value spelling across DTD and document 
structure. In particular, the .get() method is the wrong place to deal with 
this.

You can probably configure the parser to ignore the internal DTD subset, if 
that's what you want.
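
A minimal reproduction sketch of the situation under discussion, assuming a document whose internal DTD subset declares a default for an attribute. The attribute name `kind` and both default strings are made up for illustration. Which value comes back for `kind` is exactly the point of this ticket (the reporter observed the ATTLIST value), so the snippet prints it rather than asserting it.

```python
import xml.etree.ElementTree as ET

DOC = """<!DOCTYPE root [
  <!ATTLIST root kind CDATA "from-dtd">
]>
<root/>"""

root = ET.fromstring(DOC)

# Attribute declared with a default in the DTD, not spelled out on the element.
print(root.get("kind", "from-python"))

# Attribute that neither the DTD nor the element mentions: the Python default applies.
print(root.get("other", "from-python"))
```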

--

___
Python tracker 
<https://bugs.python.org/issue46798>
___



[issue46836] [C API] Move PyFrameObject to the internal C API

2022-02-23 Thread Stefan Behnel


Stefan Behnel  added the comment:

I haven't looked fully into this yet, but I *think* that Cython can get rid of 
most of the direct usages of PyFrameObject by switching to the new 
InterpreterFrame struct instead. It looks like the important fields have now 
been moved over to that.

That won't improve the situation regarding the usage of CPython internals, but 
it's probably worth keeping in mind before we start adding new API functions 
that work on frame objects.

--

___
Python tracker 
<https://bugs.python.org/issue46836>
___



[issue46389] 3.11: unused generator comprehensions cause f_lineno==None

2022-02-25 Thread Stefan Behnel


Stefan Behnel  added the comment:

Possibly also related, so I thought I'd mention it here (sorry if this is 
hijacking the ticket, seems difficult to tell). We're also seeing None values 
in f_lineno in Cython's test suite with 3.11a5:

  File "", line 1, in
    run_trace(py_add, 1, 2)
    ^^^
  File "tests/run/line_trace.pyx", line 231, in line_trace.run_trace (line_trace.c:7000)
    func(*args)
  File "tests/run/line_trace.pyx", line 60, in line_trace.trace_trampoline (line_trace.c:3460)
    raise
  File "tests/run/line_trace.pyx", line 54, in line_trace.trace_trampoline (line_trace.c:3359)
    result = callback(frame, what, arg)
  File "tests/run/line_trace.pyx", line 81, in line_trace._create_trace_func._trace_func (line_trace.c:3927)
    trace.append((map_trace_types(event, event), frame.f_lineno - frame.f_code.co_firstlineno))
TypeError: unsupported operand type(s) for -: 'NoneType' and 'int'

https://github.com/cython/cython/blob/7ab11ec473a604792bae454305adece55cd8ab37/tests/run/line_trace.pyx

No generator expressions involved, though. (Much of that test was written while 
trying to get the debugger in PyCharm to work with Cython compiled modules.)

There is a chance that Cython is doing something wrong in its own line tracing 
code, obviously.
(I also remember seeing other tracing issues before, where the line reported 
was actually in the trace function itself rather than the code to be traced. We 
haven't caught up with the frame-internal changes yet.)
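
As an illustration of the failing pattern, here is a pure-Python sketch of a trace function that computes line offsets the way the test does, but tolerates a `None` value in `f_lineno`. The names `make_tracer` and `py_add` are chosen for this sketch and the code is not taken from Cython's test suite.

```python
import sys


def make_tracer(trace):
    """Return a trace function that records (event, relative line) pairs."""
    def tracer(frame, event, arg):
        lineno = frame.f_lineno
        # Guard against f_lineno being None instead of subtracting blindly.
        rel = None if lineno is None else lineno - frame.f_code.co_firstlineno
        trace.append((event, rel))
        return tracer  # keep tracing line events in this frame
    return tracer


def py_add(a, b):
    x = a
    return x + b


events = []
sys.settrace(make_tracer(events))
try:
    py_add(1, 2)
finally:
    sys.settrace(None)

print(events)
```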

--
nosy: +scoder

___
Python tracker 
<https://bugs.python.org/issue46389>
___



[issue46786] embed, source, track, wbr HTML elements not considered empty

2022-02-27 Thread Stefan Behnel


Stefan Behnel  added the comment:


New changeset 345572a1a0263076081020524016eae867677cac by Jannis Vajen in 
branch 'main':
bpo-46786: Make ElementTree write the HTML tags embed, source, track, wbr as 
empty tags (GH-31406)
https://github.com/python/cpython/commit/345572a1a0263076081020524016eae867677cac


--

___
Python tracker 
<https://bugs.python.org/issue46786>
___



[issue46786] embed, source, track, wbr HTML elements not considered empty

2022-02-27 Thread Stefan Behnel


Change by Stefan Behnel :


--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed

___
Python tracker 
<https://bugs.python.org/issue46786>
___



[issue46798] xml.etree.ElementTree: get() doesn't return default value, always ATTLIST value

2022-03-05 Thread Stefan Behnel


Change by Stefan Behnel :


--
status: open -> closed

___
Python tracker 
<https://bugs.python.org/issue46798>
___



[issue12946] PyModule_GetDict() claims it can never fail, but it can

2011-09-13 Thread Stefan Behnel

Stefan Behnel  added the comment:

I gave two reasons why this function can fail, and one turns out to be 
assumed-to-be-dead code. So, no, there are two issues now, one with the 
documentation, one with the code.

--

___
Python tracker 
<http://bugs.python.org/issue12946>
___



[issue13186] instance_ass_item() broken in classobject.c (Py2.7)

2011-10-15 Thread Stefan Behnel

New submission from Stefan Behnel :

Starting at line 1223 in classobject.c, you can find this code:

    if (item == NULL)
        arg = PyInt_FromSsize_t(i);
    else
        arg = Py_BuildValue("(nO)", i, item);
    if (arg == NULL) {
        Py_DECREF(func);
        return -1;
    }
    res = PyEval_CallObject(func, arg);

If item is NULL, arg will be assigned an int object. Otherwise, it will receive 
a tuple. Only the second case works in the subsequent call to 
PyEval_CallObject(), i.e. arg must always be assigned an argument tuple.

A quick fix would be to call Py_BuildValue("(n)", i) in the first case. The 
code just did a getattr(), so this is not performance critical anymore.
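
The tuple requirement can be pictured in Python terms. The sketch below is only an analogy: `call_object` is a hypothetical stand-in that mimics a "the argument object must be a tuple" check, and `delitem`/`setitem` are made-up callables. It is not CPython's C code.

```python
def call_object(func, arg):
    """Hypothetical Python analogue of calling func with an argument tuple."""
    if not isinstance(arg, tuple):
        raise TypeError("argument list must be a tuple")
    return func(*arg)


def delitem(index):
    return ("del", index)


def setitem(index, item):
    return ("set", index, item)


i = 3
print(call_object(setitem, (i, "x")))  # analogous to Py_BuildValue("(nO)", i, item)
print(call_object(delitem, (i,)))      # analogous to Py_BuildValue("(n)", i)
try:
    call_object(delitem, i)            # analogous to passing the bare int object
except TypeError as exc:
    print("rejected:", exc)
```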

I found this bug because the test_class.py test suite was failing in Cython.

--
components: Interpreter Core
messages: 145590
nosy: scoder
priority: normal
severity: normal
status: open
title: instance_ass_item() broken in classobject.c (Py2.7)
type: behavior
versions: Python 2.7

___
Python tracker 
<http://bugs.python.org/issue13186>
___



[issue13378] Change the variable "nsmap" from global to instance (xml.etree.ElementTree)

2011-11-10 Thread Stefan Behnel

Stefan Behnel  added the comment:

Florent, thanks for the notification.

Nekmo, note that you are misusing this feature. The _namespace_map is meant to 
provide "well known namespace prefixes" only, so that common namespaces end up 
using the "expected" prefix. This is also the reason why it maps namespaces to 
prefixes and not the other way round. It is not meant to temporarily assign 
arbitrary prefixes to namespaces. That is the reason for it being a global option.
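
To illustrate the intended use, here is a short sketch that registers a preferred prefix globally through the public `ET.register_namespace()` function. The namespace URI and the prefix `ex` are hypothetical values chosen for this example.

```python
import xml.etree.ElementTree as ET

NS = "http://example.com/ns"  # hypothetical namespace URI

# Global registration: maps a namespace URI to its preferred prefix.
ET.register_namespace("ex", NS)

root = ET.Element("{%s}root" % NS)
ET.SubElement(root, "{%s}item" % NS)

# The serialiser picks up the registered prefix instead of inventing one.
print(ET.tostring(root, encoding="unicode"))
```

Because the registration is process-wide, it suits stable, well-known namespaces rather than per-document choices.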

That being said, lxml.etree's Element factory takes an "nsmap" parameter that 
implements the feature you want. It's documented here:

http://lxml.de/tutorial.html#namespaces

Note that it maps prefixes to namespaces and not the other way round. This is 
because there is a corresponding "nsmap" property on Elements that provides the 
currently defined prefixes in the context of an Element. ElementTree itself 
does not (and cannot) support this property because it drops the prefixes 
during parsing. However, I would still request that an implementation of the 
parameter to the Element() factory should be compatible for both libraries.

Also look for "nsmap" in the compatibility docs (appears in two sections):

http://lxml.de/compatibility.html

--

___
Python tracker 
<http://bugs.python.org/issue13378>
___



[issue13378] Change the variable "nsmap" from global to instance (xml.etree.ElementTree)

2011-11-11 Thread Stefan Behnel

Stefan Behnel  added the comment:

Reading the proposed patch, I must agree that it makes more sense in 
ElementTree to support this as a serialiser feature. ET's tree model doesn't 
have a notion of prefixes, whereas it's native to lxml.etree.

Two major advantages of putting this into the serialiser are: 1) cET doesn't 
have to be modified, and 2) it does not require additional memory to store the 
nsmap reference on each Element. The latter by itself is a very valuable 
property, given that cET aims specifically at a low memory overhead.

I see a couple of drawbacks:

1) it only supports the case that namespaces are globally defined. The 
implementation cannot handle the case that local namespaces should only be 
defined in subtrees, or that prefixes are being reused. This is no real 
restriction because globally defined namespaces are usually just fine. It's 
more of an inconvenience in some cases, such as multi-namespace languages like 
SOAP or WSDL+XSD, where namespaces are commonly declared on the subtree where 
they start being used.

2) lxml.etree cannot support this because it keeps the prefixes in the tree 
nodes and uses them on serialisation. This cannot easily be overridden because 
the serialiser is part of libxml2.

I didn't see in the patch how (or if?) the prefix redefinition case is handled. 
Given that prefixes are always defined globally, it would be nice if this only 
resulted in an error if two namespaces that are really used in the document map 
to the same prefix, not always when the namespace dict is redundant by itself.

Also note that it's good to be explicit about the keyword arguments that a 
function accepts. It aids when help(tostring) tells you directly what you can 
pass in, instead of just printing "**kw".
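
A quick sketch of the difference this makes for introspection. Both functions are hypothetical stand-ins for a serialiser entry point, and the option names and defaults are chosen only for illustration.

```python
import inspect


def tostring_opaque(element, **kw):
    """Hypothetical serialiser entry point that hides its options."""
    return kw


def tostring_explicit(element, encoding="us-ascii", method="xml"):
    """Hypothetical serialiser entry point that spells its options out."""
    return {"encoding": encoding, "method": method}


# help() shows these signatures, so the explicit form documents itself.
print(inspect.signature(tostring_opaque))
print(inspect.signature(tostring_explicit))
```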

--

___
Python tracker 
<http://bugs.python.org/issue13378>
___



[issue13429] provide __file__ to extension init function

2011-11-18 Thread Stefan Behnel

New submission from Stefan Behnel :

In Python modules, the top-level module code sees the __file__ variable and can 
use it to refer to resources in package subdirectories, for example. This is 
not currently possible in extension modules, because __file__ is only set after 
running the module init function, and the module has no way to find out its 
runtime location.
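
For comparison, this is the kind of thing a plain Python module can do at import time. The `resource_path` helper and the file names are hypothetical; in a real module the first argument would simply be `__file__`.

```python
import os


def resource_path(module_file, *parts):
    """Hypothetical helper: build a path relative to the module's directory."""
    return os.path.join(os.path.dirname(os.path.abspath(module_file)), *parts)


# In a real Python module this would be called as resource_path(__file__, ...).
print(resource_path(os.path.join("pkg", "mod.py"), "data", "table.txt"))
```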

CPython should set __file__ directly in PyModule_Create2(), based on 
information provided by the shared library loader. This would let 
PyModule_GetFilenameObject() work immediately with the newly created module 
object.

The relevant python-dev thread is here:

http://mail.python.org/pipermail/python-dev/2011-November/114476.html

A patch will follow soon.

--
components: Extension Modules, Interpreter Core
messages: 147881
nosy: scoder
priority: normal
severity: normal
status: open
title: provide __file__ to extension init function
type: feature request
versions: Python 3.3

___
Python tracker 
<http://bugs.python.org/issue13429>
___



[issue13429] provide __file__ to extension init function

2011-11-18 Thread Stefan Behnel

Changes by Stefan Behnel :


--
keywords: +patch
nosy: +loewis
Added file: http://bugs.python.org/file23725/ext_module_init_file_path.patch

___
Python tracker 
<http://bugs.python.org/issue13429>
___




[issue13429] provide __file__ to extension init function

2011-11-18 Thread Stefan Behnel

Stefan Behnel  added the comment:

Here is an extension to the patch that implements the protocol also for 
extension module reinitialisation, so that the module creation can also set 
__file__ and the proper package in that case. Currently without tests (and 
users, I guess).

--
Added file: http://bugs.python.org/file23726/ext_module_reinit_context.patch

___
Python tracker 
<http://bugs.python.org/issue13429>
___



[issue10227] Improve performance of MemoryView slicing

2011-11-18 Thread Stefan Behnel

Stefan Behnel  added the comment:

Updated single slice caching patch for latest Py3.3 hg tip.

--
Added file: http://bugs.python.org/file23727/slice-object-cache.patch

___
Python tracker 
<http://bugs.python.org/issue10227>
___



[issue13429] provide __file__ to extension init function

2011-11-18 Thread Stefan Behnel

Stefan Behnel  added the comment:

I don't know how the import lock applies here. Would it have to be protected by 
it? The lifetime is restricted to the call of the extension module init 
function, and its value is saved recursively if the init function triggers 
further imports.

It works exactly like the existing _Py_PackageContext variable.

--

___
Python tracker 
<http://bugs.python.org/issue13429>
___



[issue13429] provide __file__ to extension init function

2011-11-18 Thread Stefan Behnel

Stefan Behnel  added the comment:

... and the module init function could create and register a different module 
first, and ...

Well, yes, it's a best effort thing. It's rather unlikely that the GIL would 
get released in between the call to the init function and the creation of the 
module within that function, but sure, I don't see a reason why it could not 
happen.

However, it can't happen in moduleobject.c between the NULL check and the 
setting of the __file__ attribute, so that is safe enough to not trigger 
crashes.

And even if the wrong __file__ value is set during the run of the init 
function, it will still get overwritten (and thus fixed) afterwards. So my 
intuition is that code that relies on this new feature will simply have to make 
sure the module object creation is the first thing it does, and code that does 
not rely on it, well, does not rely on it.

--

___
Python tracker 
<http://bugs.python.org/issue13429>
___



[issue13429] provide __file__ to extension init function

2011-11-19 Thread Stefan Behnel

Stefan Behnel  added the comment:

I'm aware that these things happen, that's why I said it. Actually, wouldn't it 
rather be *correct* for __file__ to be set to the same file path for all 
modules that an extension module creates in its init function? That would 
suggest that _Py_ModuleImportContext shouldn't be set to NULL after the first 
assignment, but instead stay alive until it gets reset by the dynlib loader. If 
the loader gets invoked recursively later on, it will do the right thing by 
storing away the old value during the import and restoring it afterwards. So 
_Py_ModuleImportContext would always point to the path that contains the init 
function that is currently being executed.

Regarding the lock (which, I assume, is simply reentrant), it's being acquired 
far up when the import mechanism starts, so the dynlib loader and the init 
function call are protected.

Note that this does not apply to the reinit case. 
_PyImport_FindExtensionObject() does not acquire the lock itself (which seems 
correct), and it can be called directly from imp.init_builtin(), i.e. from user 
code. Maybe that's why the _Py_PackageContext protocol was not implemented 
there. That's rather unfortunate, though. I guess the reasoning is that new 
code that uses this new feature is expected to actually be reentrant, also in 
parallel, because the module it creates and works on is local to the current 
thread until the init function terminates. So the import lock is not strictly 
required here. This does complicate the __file__ feature, though, so the second 
("reinit") patch won't work as is. I think the right fix for Python 4 would be 
to simply pass a context struct into the module init function.

On a related note, I just stumbled over this code in 
_PyImport_FindExtensionObject():

    else {
        if (def->m_base.m_init == NULL)
            return NULL;
        mod = def->m_base.m_init();
        if (mod == NULL)
            return NULL;
        PyDict_SetItem(PyImport_GetModuleDict(), name, mod);
        Py_DECREF(mod);
    }
    if (_PyState_AddModule(mod, def) < 0) {
        PyDict_DelItem(PyImport_GetModuleDict(), name);
        Py_DECREF(mod);
        return NULL;
    }

If PyDict_SetItem() fails, this is bound to crash. I think it would be worth 
looking into this mechanism a bit more.

--

___
Python tracker 
<http://bugs.python.org/issue13429>
___



[issue13429] provide __file__ to extension init function

2011-11-19 Thread Stefan Behnel

Stefan Behnel  added the comment:

Updated patch that does not reset _Py_ModuleImportContext after use.

--
Added file: http://bugs.python.org/file23728/ext_module_init_file_path_2.patch

___
Python tracker 
<http://bugs.python.org/issue13429>
___



[issue13431] Pass context information into the extension module init function

2011-11-19 Thread Stefan Behnel

New submission from Stefan Behnel :

This is a follow-up to issue 13429, which deals with setting __file__ on newly 
created extension modules earlier than it is currently the case.

Currently, the module init function of extension modules lacks a way to find 
out the context in which it is being imported, e.g. its package path or its 
location in the file system. This makes it tricky for extension modules to do 
things like loading package resources or using relative imports at init time.

This can be fixed by allowing the init function to take a context struct as 
argument, which would contain object pointers to the FQ package name and file 
path, and potentially other information.

I think this would be backwards compatible to existing code, because C should 
allow the caller of the init function to pack additional arguments on the stack 
that the called function simply doesn't care about. From CPython 3.3 on, 
however, new and updated code could benefit from this feature.

--
components: Extension Modules, Interpreter Core
messages: 147931
nosy: scoder
priority: normal
severity: normal
status: open
title: Pass context information into the extension module init function
type: feature request
versions: Python 3.3

___
Python tracker 
<http://bugs.python.org/issue13431>
___



[issue13429] provide __file__ to extension init function

2011-11-19 Thread Stefan Behnel

Stefan Behnel  added the comment:

Replying to myself:
> I think the right fix for Python 4 would be to simply pass
> a context struct into the module init function.

Actually, this doesn't have to wait for Python 4. Changing the module init 
function to take a parameter should be backwards compatible in C. Existing code 
simply wouldn't read the value from the stack, and new (or updated) code could 
benefit from the feature, Cython code in particular.

Here is a follow-up ticket for this more general feature:

http://bugs.python.org/issue13431

--

___
Python tracker 
<http://bugs.python.org/issue13429>
___



[issue13431] Pass context information into the extension module init function

2011-11-19 Thread Stefan Behnel

Stefan Behnel  added the comment:

Yes, that's unfortunate. I found the same paragraph in section 6.5.2.2p6 of the 
C99 standard now, so it seems that this idea isn't suitable for the Py3.x 
series.

There's no Python 4 target version in the bug tracker, BTW. :)

--

___
Python tracker 
<http://bugs.python.org/issue13431>
___



[issue13431] Pass context information into the extension module init function

2011-11-20 Thread Stefan Behnel

Stefan Behnel  added the comment:

The problem with having that information "internally" is that it's currently 
stored in local variables in the call chain from the dynamic library loader. 
Passing that information on into a callable function, without passing it as an 
argument into the init function, means that it needs to get stored away in 
some global place, with all the drawbacks that this induces. That's what Martin 
was referring to.

I agree with Martin that the idea of adding a parameter to the module init 
function is not worth pursuing before Python 4, so I'm closing this bug.

--
resolution:  -> postponed
status: open -> closed
title: Pass context information into the extension module init  function -> 
Pass context information into the extension module init function

___
Python tracker 
<http://bugs.python.org/issue13431>
___



[issue13429] provide __file__ to extension init function

2011-11-20 Thread Stefan Behnel

Stefan Behnel  added the comment:

As MvL noted in his response to issue 13431, simply adding a parameter to the 
module init function cannot safely be done before Python 4. So we are back to 
the idea of passing the information through to the module creation function, 
i.e. this very issue.

A variant of the implementation would be to store the context information in 
thread local storage instead of a global variable. That would work around any 
threading issues. However, this would not be required in the normal import 
case, only in the reinit case, as the import case is protected by the import 
lock, as we have seen. Personally, I do not consider this a good idea for the 
time being, since I doubt that the number of users for the reinitialisation API 
is currently worth caring about.
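
For illustration, the thread-local variant could be modelled like this in Python. All names (`_context`, `current_path`, `run_init`) and the `.so` paths are hypothetical; the actual implementation would live in C.

```python
import threading

_context = threading.local()  # hypothetical per-thread import context


def current_path():
    """Return the context path visible to the calling thread, or None."""
    return getattr(_context, "path", None)


def run_init(path, init_func):
    """Set the per-thread context around the init call and restore it afterwards."""
    saved = current_path()
    _context.path = path
    try:
        return init_func()
    finally:
        _context.path = saved


results = {}


def worker(name, path):
    # Each thread only ever sees the path it set itself.
    results[name] = run_init(path, current_path)


threads = [threading.Thread(target=worker, args=(n, n + ".so")) for n in ("a", "b")]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results, current_path())
```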

In any case, the semantics of __file__ for extension modules would basically 
become that __file__ refers to the last library that was loaded before calling 
the module init function. So all extension modules that this init function 
creates will inherit the same __file__. My guess is that they currently end up 
with no __file__ attribute at all, as the loader only sets it on the module 
that the init function returns. So I consider that an improvement already.

--

___
Python tracker 
<http://bugs.python.org/issue13429>
___



[issue11379] Remove "lightweight" from minidom description

2011-11-28 Thread Stefan Behnel

Stefan Behnel  added the comment:

Ok, so, what do we make of this? I proposed improvements to the wording in the 
documentation, which make it much clearer for users what they are buying into 
when they start using minidom. I still think that "factually correct" but 
clearly misleading documentation is not helpful and that it needs fixing. Here 
is an updated phrasing that I hope we can settle on:

"""
:mod:`xml.dom.minidom` --- Pure Python DOM implementation

[...]

:mod:`xml.dom.minidom` is a pure Python implementation of the Document Object 
Model interface, as known from other programming languages. It is intended to 
provide a smaller and simpler API than the full W3C DOM.

Note that MiniDOM has a memory footprint several times larger than that of 
:mod:`xml.etree.ElementTree`, the light-weight Python XML library in the 
standard library. If you do not need a (mostly) compliant W3C DOM 
implementation, but rather a fast and memory-friendly XML tree implementation 
with an easy-to-learn API, use that instead.
"""

--

___
Python tracker 
<http://bugs.python.org/issue11379>
___



[issue11379] Remove "lightweight" from minidom description

2011-11-29 Thread Stefan Behnel

Stefan Behnel  added the comment:

I find a factor of an order of magnitude worth mentioning, because it prevents 
certain kinds of usages.

--

___
Python tracker 
<http://bugs.python.org/issue11379>
___



[issue11379] Remove "lightweight" from minidom description

2011-11-29 Thread Stefan Behnel

Stefan Behnel  added the comment:

I don't think "FUD" is a suitable term for the rather minidom-friendly wording 
in my last proposal. Seriously, minidom is widely known for being extremely 
slow and extremely memory hungry. And that is backed by basically any benchmark 
that has ever been done on the subject. If 4DOM, which Martin cites, is really 
worse in terms of performance (I never used it), it must truly be the only 
existing species of that kind.

Still, here's a cleaned up version of Fred's proposal that I could live with:

"""
:mod:`xml.dom.minidom` --- Pure Python DOM implementation

:mod:`xml.dom.minidom` is an implementation of the Document Object Model 
interface.  The API is (intentionally) slightly simpler than the full W3C DOM, 
but the implementation has a significantly higher memory footprint than the XML 
tree library in :mod:`xml.etree.ElementTree`.
"""

--

___
Python tracker 
<http://bugs.python.org/issue11379>
___



[issue11379] Remove "lightweight" from minidom description

2011-11-29 Thread Stefan Behnel

Stefan Behnel  added the comment:

Ezio Melotti, 29.11.2011 16:26:
>> Seriously, minidom is widely known for being extremely slow and
>> extremely memory hungry. And that is backed by basically any benchmark
>> that has ever been done on the subject.
>
> Do you have any link?

I just did a quick Google search for "python minidom benchmark" and found 
these:

http://www.opensourcetutorials.com/tutorials/Server-Side-Coding/Python/xml-matters/page2.html

http://effbot.org/zone/celementtree.htm#benchmarks

http://blog.ianbicking.org/2008/03/30/python-html-parser-performance/

Note that all three authors risk being biased, but given how similar the 
results are, I tend to believe them.

Stefan

--

___
Python tracker 
<http://bugs.python.org/issue11379>
___



[issue11379] Remove "lightweight" from minidom description

2011-11-29 Thread Stefan Behnel

Stefan Behnel  added the comment:

Given that the links were generally somewhat dated and used Py2.x instead of 
the post-PEP393 Py3.3, here is another little benchmark, comparing the parser 
performance of minidom to lxml.etree (latest), ElementTree and cElementTree 
(stdlib) in a recent Py3.3 build (e66b7c62eec0), everything properly optimised 
for my platform (Linux 64bit). I used os.fork() to start a new process after 
importing everything and reading the file a couple of times, and before 
parsing. The memory usage is measured inside of the forked child using the 
resource module's ru_maxrss value, so it correlates with the growth of 
CPython's memory heap after parsing, thus giving an estimate of the maximum 
amount of memory used during parsing and tree building.
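
For readers who want to repeat this kind of measurement, the approach described above can be sketched roughly as follows. This is not the script that produced the numbers below; the helper name measure_in_child() and the pipe-based reporting are assumptions made for illustration, and the sketch only works on platforms that provide os.fork() and the resource module.

```python
import io
import os
import resource
import time
import xml.etree.ElementTree as ET
from xml.dom import minidom


def measure_in_child(parse, xml_bytes):
    """Run parse(file_like) in a forked child process.

    Returns (seconds, maxrss_before, maxrss_after) as reported by the child.
    ru_maxrss is in kilobytes on Linux (other platforms may use other units).
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child process: measure, report through the pipe, exit immediately.
        status = 1
        try:
            os.close(read_fd)
            before = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
            start = time.perf_counter()
            tree = parse(io.BytesIO(xml_bytes))  # keep the tree alive
            elapsed = time.perf_counter() - start
            after = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
            os.write(write_fd, f"{elapsed} {before} {after}".encode())
            status = 0
        finally:
            os._exit(status)
    # Parent process: collect the child's report.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as pipe:
        payload = pipe.read()
    os.waitpid(pid, 0)
    elapsed, before, after = payload.split()
    return float(elapsed), int(before), int(after)
```

Calling measure_in_child(ET.parse, data) and measure_in_child(minidom.parse, data) with the same bytes gives one comparable pair of readings. Forking a fresh child per parser keeps one parser's heap growth from leaking into the next reading.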

Parsing hamlet.xml in English, 274KB:

Memory usage: 7284
xml.etree.ElementTree.parse done in 0.104 seconds
Memory usage: 14240 (+6956)
xml.etree.cElementTree.parse done in 0.022 seconds
Memory usage: 9736 (+2452)
lxml.etree.parse done in 0.014 seconds
Memory usage: 11028 (+3744)
minidom tree read in 0.152 seconds
Memory usage: 30360 (+23076)

Parsing the old testament in English (ot.xml, 3.4MB) into memory:

Memory usage: 20444
xml.etree.ElementTree.parse done in 0.385 seconds
Memory usage: 46088 (+25644)
xml.etree.cElementTree.parse done in 0.056 seconds
Memory usage: 32628 (+12184)
lxml.etree.parse done in 0.041 seconds
Memory usage: 37500 (+17056)
minidom tree read in 0.672 seconds
Memory usage: 110428 (+89984)

A 25MB XML file with Slavic Unicode text content:

Memory usage: 57368
xml.etree.ElementTree.parse done in 3.274 seconds
Memory usage: 223720 (+166352)
xml.etree.cElementTree.parse done in 0.459 seconds
Memory usage: 154012 (+96644)
lxml.etree.parse done in 0.454 seconds
Memory usage: 135720 (+78352)
minidom tree read in 6.193 seconds
Memory usage: 604860 (+547492)

And a contrived 4.5MB XML file with a lot more structure than data:

Memory usage: 13308
xml.etree.ElementTree.parse done in 4.178 seconds
Memory usage: 222088 (+208780)
xml.etree.cElementTree.parse done in 0.478 seconds
Memory usage: 103056 (+89748)
lxml.etree.parse done in 0.199 seconds
Memory usage: 101860 (+88552)
minidom tree read in 8.705 seconds
Memory usage: 810964 (+797656)

Things to note: The factor of 5-10 for the memory overhead compared to cET 
depends heavily on the data. Also, minidom is consistently slower by more than 
a factor of 10 compared to the fastest parser (apparently the one in 
libxml2/lxml.etree, both of which surely can't be said to provide fewer 
features than the DOM that minidom implements).

--

___
Python tracker 
<http://bugs.python.org/issue11379>
___



[issue11379] Remove "lightweight" from minidom description

2011-11-29 Thread Stefan Behnel

Stefan Behnel  added the comment:

Hmm, looks like I messed up the last example. I accidentally left in the 
formatting whitespace, thus growing the file to 6.2 MB. Removing that, I get 
this for the (now really) 4.5 MB XML file with lots of structure and very 
little data:

Memory usage: 11600
xml.etree.ElementTree.parse done in 3.374 seconds
Memory usage: 203420 (+191820)
xml.etree.cElementTree.parse done in 0.192 seconds
Memory usage: 36444 (+24844)
lxml.etree.parse done in 0.131 seconds
Memory usage: 62648 (+51048)
minidom tree read in 5.935 seconds
Memory usage: 527684 (+516084)

It's actually surprising how much of a difference trailing whitespace content 
makes in minidom (from 2MB on disk to 300MB in memory???), most likely due to 
the usage of dedicated DOM text nodes in the tree.

PS: I think the "XML/performance" tags on this bug would hint at a separate 
ticket. This is really meant as a documentation bug.

--

___
Python tracker 
<http://bugs.python.org/issue11379>
___



[issue13378] Change the variable "nsmap" from global to instance (xml.etree.ElementTree)

2011-12-09 Thread Stefan Behnel

Changes by Stefan Behnel :


--
nosy: +effbot

___
Python tracker 
<http://bugs.python.org/issue13378>
___



[issue13378] Change the variable "nsmap" from global to instance (xml.etree.ElementTree)

2011-12-09 Thread Stefan Behnel

Stefan Behnel  added the comment:

Given that this is a major new feature for the serialiser in ElementTree, I 
think it's worth asking Fredrik for any comments.

--

___
Python tracker 
<http://bugs.python.org/issue13378>
___



[issue13611] Integrate ElementC14N module into xml.etree package

2011-12-16 Thread Stefan Behnel

New submission from Stefan Behnel :

The ElementC14N.py module by Fredrik Lundh implements XML canonicalisation for 
the ElementTree serialiser. Given that the required API hooks to use it are 
already in xml.etree.ElementTree, this would make a nice, simple and straight 
forward addition to the existing xml.etree package.

The source can be found here (unchanged since at least 2009):

https://bitbucket.org/effbot/et-2009-provolone/src/tip/elementtree/elementtree/ElementC14N.py

Note that the source needs some minor modifications to use relative imports at 
the top. Also, the "2.3 compatibility" code section can be dropped.
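
As a point of comparison, and not as part of this proposal: CPython 3.8 and newer ship a canonicalisation helper, xml.etree.ElementTree.canonicalize(), which implements C14N 2.0. It is a different implementation from the ElementC14N module discussed here. A minimal sketch of what canonical output looks like, assuming such a Python version is available:

```python
import xml.etree.ElementTree as ET

# Attribute order and the spelling of empty elements differ between
# equivalent documents; canonicalisation normalises both.
document = '<root  b="2"   a="1"><child/></root>'

# canonicalize() returns the canonical form as a string when no output
# file is given (requires Python 3.8 or newer).
canonical = ET.canonicalize(document)
print(canonical)
```

Two documents that differ only in such details canonicalise to the same text, which is what makes the result usable for comparisons and signatures.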

--
components: Library (Lib), XML
messages: 149598
nosy: scoder
priority: normal
severity: normal
status: open
title: Integrate ElementC14N module into xml.etree package
type: enhancement
versions: Python 3.3

___
Python tracker 
<http://bugs.python.org/issue13611>
___



[issue11379] Remove "lightweight" from minidom description

2011-12-16 Thread Stefan Behnel

Stefan Behnel  added the comment:

I started a mailing list thread on the same topic:

http://thread.gmane.org/gmane.comp.python.devel/127963

Especially see

http://thread.gmane.org/gmane.comp.python.devel/127963/focus=128162

where I extract a proposal from the discussion. Basically, there should be a 
note at the top of the xml.dom documentation as follows:

"""
[[Note: The xml.dom.minidom module provides an implementation of the W3C-DOM 
whose API is similar to that in other programming languages. Users who are 
unfamiliar with the W3C-DOM interface or who would like to write less code for 
processing XML files should consider using the xml.etree.ElementTree module 
instead.]]
"""

I think this should go on the xml.dom.minidom page as well as the xml.dom 
package page. Hand-wavingly, users who are new to the DOM are more likely to 
hit the package page first, whereas those who know it already will likely find 
the MiniDOM page directly.
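
To illustrate the "write less code" part of the proposed note, here is a small side-by-side sketch. The sample document and the two helper function names are made up for this example; only the minidom and ElementTree calls themselves are real stdlib API.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

# Hypothetical sample document, used only for this comparison.
XML = (
    "<library>"
    "<book id='1'><title>Hamlet</title></book>"
    "<book id='2'><title>Macbeth</title></book>"
    "</library>"
)


def titles_with_minidom(text):
    """Collect all <title> texts through the W3C DOM style API."""
    doc = minidom.parseString(text)
    titles = []
    for node in doc.getElementsByTagName("title"):
        # Text lives in separate child text nodes that are joined by hand.
        parts = [
            child.data
            for child in node.childNodes
            if child.nodeType == child.TEXT_NODE
        ]
        titles.append("".join(parts))
    return titles


def titles_with_elementtree(text):
    """Collect all <title> texts through the ElementTree API."""
    root = ET.fromstring(text)
    return [title.text for title in root.iter("title")]


print(titles_with_minidom(XML))
print(titles_with_elementtree(XML))
```

Both functions return the same list; the difference is in how much tree plumbing the caller has to know about.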

Note that I'd still encourage the removal of the misleading word "lightweight" 
until it makes sense to put it back in a meaningful way. I therefore propose 
the following minimalistic changes to the first paragraph on the minidom page:

"""
xml.dom.minidom is a [-XXX: light-weight] implementation of the Document Object 
Model interface. It is intended to be simpler than the full DOM and also [+XXX: 
provide a] significantly smaller [+XXX: API].
"""

Additionally, the documentation on the xml.sax page would benefit from the 
following paragraph:

"""
[[Note: The xml.sax package provides an implementation of the SAX interface 
whose API is similar to that in other programming languages. Users who are 
unfamiliar with the SAX interface or who would like to write less code for 
efficient stream processing of XML files should consider using the iterparse() 
function in the xml.etree.ElementTree module instead.]]
"""

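For reference, stream processing with iterparse() as mentioned in the proposed note can look roughly like this. The count_tags() helper and the sample data are assumptions for illustration only.

```python
import io
import xml.etree.ElementTree as ET


def count_tags(source):
    """Count element tags while parsing incrementally."""
    counts = {}
    # "end" events fire once an element has been read completely.
    for event, elem in ET.iterparse(source, events=("end",)):
        counts[elem.tag] = counts.get(elem.tag, 0) + 1
        # Drop the element's content once it has been processed, so that
        # finished subtrees do not pile up in memory.
        elem.clear()
    return counts


sample = io.BytesIO(
    b"<play><act><line>a</line><line>b</line></act>"
    b"<act><line>c</line></act></play>"
)
tag_counts = count_tags(sample)
print(tag_counts)
```

Compared to a SAX handler class, the processing logic stays in one ordinary loop, and each element is handed over as a normal Element object.
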
--

___
Python tracker 
<http://bugs.python.org/issue11379>
___



[issue13429] provide __file__ to extension init function

2011-12-25 Thread Stefan Behnel

Stefan Behnel  added the comment:

Any comments on the last patch?

--

___
Python tracker 
<http://bugs.python.org/issue13429>
___



[issue8583] Hardcoded namespace_separator in the cElementTree.XMLParser

2011-05-28 Thread Stefan Behnel

Stefan Behnel  added the comment:

I don't see this having much to do with the DRY principle. It's "explicit is 
better than implicit" and "better safe than sorry" that apply here.

--

___
Python tracker 
<http://bugs.python.org/issue8583>
___



[issue12946] PyModule_GetDict() claims it can never fail, but it can

2011-09-09 Thread Stefan Behnel

New submission from Stefan Behnel :

As is obvious from the code, PyModule_GetDict() can fail if it is passed a 
non-module object, or if the dict creation at the end fails (which is 
unlikely). The documentation of the C-API function should be fixed to reflect 
that, i.e. it should state that NULL is returned in the case of an error.

PyObject *
PyModule_GetDict(PyObject *m)
{
    PyObject *d;
    if (!PyModule_Check(m)) {
        PyErr_BadInternalCall();
        return NULL;
    }
    d = ((PyModuleObject *)m) -> md_dict;
    if (d == NULL)
        ((PyModuleObject *)m) -> md_dict = d = PyDict_New();
    return d;
}
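
The first failure path can also be observed from Python without writing a C extension, by calling the function through ctypes. This is only a sketch: it assumes a CPython interpreter in which ctypes.pythonapi is available, and the variable names are made up for the example.

```python
import ctypes
import sys

# Bind the C-API function; py_object makes ctypes pass and return PyObject*.
get_dict = ctypes.pythonapi.PyModule_GetDict
get_dict.argtypes = [ctypes.py_object]
get_dict.restype = ctypes.py_object

# Normal case: a real module returns its namespace dict.
module_dict = get_dict(sys)

# Error case: a non-module argument makes the function set an exception
# and return NULL, which ctypes turns into a raised Python exception.
try:
    get_dict(42)
except SystemError as exc:
    error = exc
else:
    error = None

print(module_dict is sys.__dict__, repr(error))
```

C callers do not get this translation for free, so they have to check for the NULL return themselves, which is why the documentation should mention it.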

--
assignee: docs@python
components: Documentation
messages: 143764
nosy: docs@python, scoder
priority: normal
severity: normal
status: open
title: PyModule_GetDict() claims it can never fail, but it can
type: behavior
versions: Python 2.6, Python 2.7, Python 3.1, Python 3.2, Python 3.3

___
Python tracker 
<http://bugs.python.org/issue12946>
___



[issue12949] Documentation of PyCode_New() lacks kwonlyargcount argument

2011-09-09 Thread Stefan Behnel

New submission from Stefan Behnel :

In Py3, PyCode_New() takes a new argument "kwonlyargcount". The signature 
change is not currently in the Py3 C-API documentation.

http://docs.python.org/dev/c-api/code.html

PyCodeObject *
PyCode_New(int argcount, int kwonlyargcount,
   int nlocals, int stacksize, int flags,
   PyObject *code, PyObject *consts, PyObject *names,
   PyObject *varnames, PyObject *freevars, PyObject *cellvars,
   PyObject *filename, PyObject *name, int firstlineno,
   PyObject *lnotab)

--
assignee: docs@python
components: Documentation
messages: 143784
nosy: docs@python, scoder
priority: normal
severity: normal
status: open
title: Documentation of PyCode_New() lacks kwonlyargcount argument
type: behavior
versions: Python 3.1, Python 3.2, Python 3.3

___
Python tracker 
<http://bugs.python.org/issue12949>
___



[issue9522] xml.etree.ElementTree forgets the encoding

2010-08-12 Thread Stefan Behnel

Stefan Behnel  added the comment:

lxml.etree has encapsulated this in a 'docinfo' property which also holds the 
XML 'version', the 'standalone' state and the DOCTYPE (if available).

Note that this information is readily available in lxml.etree for any parsed 
Element (by wrapping it in a new ElementTree), but not in ET where it can only 
be associated to the ElementTree instance that did the parsing, not one that 
just wraps a parsed tree of Element objects. I would expect that this is still 
enough to handle this use case, though.

Stefan

--

___
Python tracker 
<http://bugs.python.org/issue9522>
___



[issue9522] xml.etree.ElementTree forgets the encoding

2010-08-12 Thread Stefan Behnel

Stefan Behnel  added the comment:

That's why I mention it here to prevent future incompatibilities between the 
two libraries.

--

___
Python tracker 
<http://bugs.python.org/issue9522>
___



[issue7415] PyUnicode_FromEncodedObject() uses PyObject_AsCharBuffer()

2010-08-20 Thread Stefan Behnel

Stefan Behnel  added the comment:

Here's a patch against the latest py3k. The following will call the new code, 
for example:

  str(memoryview(b'abc'), 'ASCII')

whereas bytes and bytearray continue to use their own special casing code 
(which has also changed a bit since I wanted to avoid code duplication).

For testing, I wrote a short Cython module that implements the buffer protocol 
in an extension type and freshly allocates a new bytes object as buffer on each 
access:

  from cpython.ref cimport Py_INCREF, Py_DECREF, PyObject

  cdef class Test:
      def __getbuffer__(self, Py_buffer* buffer, int flags):
          s = b'abcdefg' * 10
          buffer.buf = <char*> s
          buffer.obj = self
          buffer.len = len(s)
          Py_INCREF(s)
          buffer.internal = <void*> s

      def __releasebuffer__(self, Py_buffer* buffer):
          Py_DECREF(<object> buffer.internal)

Put it into a file "buftest.pyx", build it, start up Python 3.x and call

>>> import buftest
>>> print(len( str(buftest.Test(), "ASCII") ))

Under the unpatched Py3, this raises a decoding exception for me when it tries 
to decode data from the deallocated bytes object. Other systems may happily 
crash here. The patched Python runtime prints '70' as expected.
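
Independent of the Cython test module, the Python-level behaviour that is at stake can be checked with stdlib types alone. The following sketch decodes the same three bytes from several objects that export a buffer; the choice of objects is just an example.

```python
import array

# Different objects that expose their data through the buffer protocol.
sources = [
    b"abc",
    bytearray(b"abc"),
    memoryview(b"abc"),
    array.array("B", b"abc"),
]

# str(obj, encoding) decodes the bytes of any such object.
decoded = [str(src, "ascii") for src in sources]
print(decoded)
```

None of these stdlib exporters allocates a fresh buffer on each access, so this does not reproduce the bug; it only shows the code path that the patch touches.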

--
keywords: +patch
Added file: 
http://bugs.python.org/file18585/unicodeobject-PyUnicode_FromEncodedObject-buffer.patch

___
Python tracker 
<http://bugs.python.org/issue7415>
___



[issue7415] PyUnicode_FromEncodedObject() uses PyObject_AsCharBuffer()

2010-08-20 Thread Stefan Behnel

Stefan Behnel  added the comment:

Doesn't the GIL protect the bytearray buffer? Or does decoding free the GIL?

--
versions: +Python 2.7

___
Python tracker 
<http://bugs.python.org/issue7415>
___



[issue7415] PyUnicode_FromEncodedObject() uses PyObject_AsCharBuffer()

2010-08-20 Thread Stefan Behnel

Stefan Behnel  added the comment:

Regardless of the answer, I think Antoine is right, special cases aren't 
special enough to break the rules, and this is a special case that's more 
safely handled as part of the normal buffer case.

Updated patch uploaded.

--
Added file: 
http://bugs.python.org/file18587/unicodeobject-PyUnicode_FromEncodedObject-buffer2.patch

___
Python tracker 
<http://bugs.python.org/issue7415>
___



[issue7415] PyUnicode_FromEncodedObject() uses PyObject_AsCharBuffer()

2010-08-20 Thread Stefan Behnel

Stefan Behnel  added the comment:

... and another complete patch that refactors the whole function to make it 
clearer what happens. Includes a small code duplication for the bytes object 
case, which I think is acceptable.

--
Added file: 
http://bugs.python.org/file18588/unicodeobject-PyUnicode_FromEncodedObject-buffer-refactored.patch

___
Python tracker 
<http://bugs.python.org/issue7415>
___



[issue7415] PyUnicode_FromEncodedObject() uses PyObject_AsCharBuffer()

2010-08-20 Thread Stefan Behnel

Changes by Stefan Behnel :


Removed file: 
http://bugs.python.org/file18588/unicodeobject-PyUnicode_FromEncodedObject-buffer-refactored.patch

___
Python tracker 
<http://bugs.python.org/issue7415>
___



[issue7415] PyUnicode_FromEncodedObject() uses PyObject_AsCharBuffer()

2010-08-20 Thread Stefan Behnel

Stefan Behnel  added the comment:

Another updated patch with a readability fix (replacing the last one).

--
Added file: 
http://bugs.python.org/file18589/unicodeobject-PyUnicode_FromEncodedObject-buffer-refactored.patch

___
Python tracker 
<http://bugs.python.org/issue7415>
___



[issue7415] PyUnicode_FromEncodedObject() uses PyObject_AsCharBuffer()

2010-08-20 Thread Stefan Behnel

Stefan Behnel  added the comment:

When I read the comments and exception texts in the function, it didn't occur 
to me that "char buffer" could have been used as a name for the old Py2 buffer 
interface. From the context, it totally makes sense to me that the function 
(which decodes a byte sequence into a unicode string) complains about not 
getting a "bytes object or char buffer" as input. Admittedly, this might sound 
slightly different when read in Python space.

--

___
Python tracker 
<http://bugs.python.org/issue7415>
___



[issue9834] PySequence_GetSlice() lacks a NULL check

2010-09-11 Thread Stefan Behnel

New submission from Stefan Behnel :

PySequence_GetSlice() in Objects/abstract.c contains the following code:

mp = s->ob_type->tp_as_mapping;
if (mp->mp_subscript) {

This crashes when the type's "tp_as_mapping" is NULL. The obvious fix is to 
simply write

if (mp && mp->mp_subscript)

as basically everywhere else around that function. The problem seems to have 
occurred during a rewrite for Python 3, it's ok in the 2.x series.
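
On an interpreter that contains the NULL check, the intended behaviour can be verified from Python through ctypes. This is a sketch that assumes CPython with ctypes.pythonapi available; on an affected 3.1/3.2 build the second call would crash instead of raising.

```python
import ctypes

# Bind the C-API function with its (PyObject*, Py_ssize_t, Py_ssize_t)
# signature.
get_slice = ctypes.pythonapi.PySequence_GetSlice
get_slice.argtypes = [ctypes.py_object, ctypes.c_ssize_t, ctypes.c_ssize_t]
get_slice.restype = ctypes.py_object

# A list supports subscripting, so slicing works as usual.
sliced = get_slice([10, 20, 30, 40], 1, 3)

# A plain object() has no mapping slots at all (tp_as_mapping is NULL),
# so a correct implementation reports an error instead of dereferencing it.
try:
    get_slice(object(), 0, 1)
except TypeError as exc:
    failure = exc
else:
    failure = None

print(sliced, repr(failure))
```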

--
components: Interpreter Core
messages: 116092
nosy: scoder
priority: normal
severity: normal
status: open
title: PySequence_GetSlice() lacks a NULL check
type: crash
versions: Python 3.1, Python 3.2

___
Python tracker 
<http://bugs.python.org/issue9834>
___



[issue10227] Improve performance of MemoryView slicing

2010-11-01 Thread Stefan Behnel

Stefan Behnel  added the comment:

I find it a lot easier to appreciate patches that implement a single change 
than those that mix different changes. There are three different things in your 
patch, which I would like to see in at least three different commits. I'd be 
happy if you could separate the changes into more readable feature patches. 
That makes it easier to accept them.

I'm generally happy about the slice changes, but you will have to benchmark the 
equivalent changes in Py3.2 to prove that they are similarly worth applying 
there.

--
nosy: +scoder

___
Python tracker 
<http://bugs.python.org/issue10227>
___



[issue10294] Lib/test/test_unicode_file.py contains dead code

2010-11-02 Thread Stefan Behnel

New submission from Stefan Behnel :

Lib/test/test_unicode_file.py contains dead code:

    def _test_equivalent(self, filename1, filename2):
        remove_if_exists(filename1)
        self.assertTrue(not os.path.exists(filename2))
        f = file(filename1, "w")
        f.close()
        try:
            self._do_equivalent(filename1, filename2)
        finally:
            os.unlink(filename1)

Note how this refers to the now-gone "file()". The method is never used in the 
test code. Similarly, the "_do_equivalent()" method that it calls appears 
otherwise unused.

--
components: Tests
messages: 120236
nosy: scoder
priority: normal
severity: normal
status: open
title: Lib/test/test_unicode_file.py contains dead code
versions: Python 3.2

___
Python tracker 
<http://bugs.python.org/issue10294>
___



[issue9518] PyModuleDef_HEAD_INIT does not explicitly initialize all fields of m_base

2010-11-17 Thread Stefan Behnel

Stefan Behnel  added the comment:

I agree that this is annoying, we get the same thing in Cython's test suite all 
over the place. Any foreign warning that doesn't get triggered helps in 
debugging your own code. And this one is easy to avoid.

--
nosy: +scoder

___
Python tracker 
<http://bugs.python.org/issue9518>
___



[issue11379] Remove "lightweight" from minidom description

2011-03-02 Thread Stefan Behnel

New submission from Stefan Behnel :

http://docs.python.org/library/xml.dom.minidom.html

presents MiniDOM as a "Lightweight DOM implementation". The word "lightweight" 
is easily misunderstood as meaning "efficient" or "memory friendly". MiniDOM is 
well known to be neither of the two.

The first paragraph then continues:

"""
xml.dom.minidom is a light-weight implementation of the Document Object Model 
interface. It is intended to be simpler than the full DOM and also 
significantly smaller.
"""

Again, "smaller" can be misread as "low memory footprint", whereas it is 
actually supposed to refer to an incomplete DOM API implementation. And 
"simpler" is also clearly exaggerated when compared to the alternative 
ElementTree package.

I would like to see this changed and combined with a clear and visible comment 
that MiniDOM has a very high resource profile, e.g.

"""
19.7. xml.dom.minidom — Pure Python DOM implementation

xml.dom.minidom is a pure Python implementation of the Document Object Model 
interface, as known from other programming languages. It is intended to provide 
a smaller API than the full DOM.

Note, however, that MiniDOM has a very large memory footprint compared to other 
Python XML libraries. If you need a fast and memory friendly XML tree 
implementation with a vastly simpler API, use the xml.etree package instead.
"""

--
assignee: docs@python
components: Documentation
messages: 129914
nosy: docs@python, scoder
priority: normal
severity: normal
status: open
title: Remove "lightweight" from minidom description
versions: Python 2.7, Python 3.2, Python 3.3

___
Python tracker 
<http://bugs.python.org/issue11379>
___



[issue11379] Remove "lightweight" from minidom description

2011-03-02 Thread Stefan Behnel

Stefan Behnel  added the comment:

Well, I'm not aware of many people who use 4DOM these days, and if that's what 
it's meant to refer to, maybe that should be made more obvious, because it 
currently is not at all. Even cDomlette uses only half of the memory according 
to

http://effbot.org/zone/celementtree.htm

When you say that the description is "factually correct", that does by no means 
imply that the average reader will understand how it's meant. My point is that 
almost everyone who reads this will draw the wrong conclusions.

Also, when you say "lower footprint", that does not yet make it "light weight" 
in any way. It still uses something like ten times as much memory as 
cElementTree or lxml in Python 2 (and likely much more than even that in Python 
3), and still something like 4-5 times as much as plain Python ElementTree. 
That's a huge difference.

What about this phrasing then:

"""
MiniDOM has a smaller memory footprint than some of the other DOM compliant 
implementations for Python (such as 4DOM), but uses about 10x more memory than 
the faster and simpler xml.etree.cElementTree module.
"""

--

___
Python tracker 
<http://bugs.python.org/issue11379>
___



[issue11379] Remove "lightweight" from minidom description

2011-03-03 Thread Stefan Behnel

Stefan Behnel  added the comment:

It's the tree based API most Python users are parsing XML with, though. So I do 
not agree that it's comparing apples and oranges, not at all. It's comparing 
tree based XML libraries, only one of which is worth being called "light 
weight", and that's not the one that is currently carrying that name.

I think it's worth telling new users what they are committing to when they 
write code that uses MiniDOM. The documentation should allow them to understand 
that.

--

___
Python tracker 
<http://bugs.python.org/issue11379>
___



[issue11379] Remove "lightweight" from minidom description

2011-03-03 Thread Stefan Behnel

Stefan Behnel  added the comment:

> If that is a real concern, I'd rather reduce the memory footprint of
> minidom than put actual performance figures into the documentation
> that will likely outdate over time.

Personally, I do not think it's worth putting much work into MiniDOM. I'd 
rather deprecate it to prevent new code from being written for it, but that's 
just my personal opinion, and this is the wrong place to discuss that. Given 
the current performance characteristics, I wouldn't be surprised if there was 
quite some room for improvements left in the xml.dom package.

If you dislike the "10x", feel free to use "several times". I doubt that 
MiniDOM will ever get so much closer to cET and lxml to prove that phrasing 
wrong.


> Notice that the documentation doesn't claim that it is a lightweight
> XML library, only that it's a lightweight DOM implementation.

I imagine that you are as aware as I am that this nuance is easy to miss, 
especially for a new user. From my experience, it is very common for users, 
especially those with a Java-ish background, to confuse the terms "DOM" and 
"XML tree API/library". Hence my push to change the documentation.


> SAX is, of course, even lighter-weight.

Not so much more light weight than cET's iterparse(), but that's getting OT 
here.

Stefan

--

___
Python tracker 
<http://bugs.python.org/issue11379>
___



[issue11903] Incorrect test code in test_logging.py

2011-04-21 Thread Stefan Behnel

New submission from Stefan Behnel :

In test file test_logging.py, around line 2359, list.append() is called with 
two arguments instead of one. I suppose it is meant to be called with a tuple.

class ModuleLevelMiscTest(BaseTest):
    [...]
    def _test_log(self, method, level=None):
        called = []
        patch(self, logging, 'basicConfig',
              lambda *a, **kw: called.append(a, kw))  # <
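
The problem is easy to reproduce outside the test suite. In the following sketch, the two recorder functions are hypothetical stand-ins for the lambda above, and the tuple form is my assumption about what the test intends.

```python
def record_call_buggy(called, *a, **kw):
    # list.append() takes exactly one argument, so this raises TypeError
    # as soon as the function is actually called.
    called.append(a, kw)


def record_call_fixed(called, *a, **kw):
    # Wrapping both values in a tuple records one entry per call.
    called.append((a, kw))


calls = []
try:
    record_call_buggy(calls, 1, x=2)
except TypeError as exc:
    print("buggy version failed:", exc)

record_call_fixed(calls, 1, x=2)
print(calls)
```

Since the lambda is only a replacement for basicConfig, the error stays hidden for as long as the patched function is never invoked during the test.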

--
components: Tests
messages: 134239
nosy: scoder
priority: normal
severity: normal
status: open
title: Incorrect test code in test_logging.py
versions: Python 3.3

___
Python tracker 
<http://bugs.python.org/issue11903>
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue12131] python built with --prefix fails in site.py with no section 'posix_prefix'

2011-05-20 Thread Stefan Behnel

Changes by Stefan Behnel :


--
nosy: +scoder

___
Python tracker 
<http://bugs.python.org/issue12131>
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10500] Palevo.DZ worm msix86 installer 3.x installer

2010-11-28 Thread Stefan Behnel

Stefan Behnel  added the comment:

File checked with ClamAV under Linux, no findings.

--
nosy: +scoder

___
Python tracker 
<http://bugs.python.org/issue10500>
___



[issue10227] Improve performance of MemoryView slicing

2011-02-01 Thread Stefan Behnel

Stefan Behnel  added the comment:

I've extracted and fixed the part of this patch that implements the slice 
object cache. In particular, PySlice_Fini() was incorrectly implemented. This 
patch applies cleanly for me against the latest py3k branch.

--
Added file: http://bugs.python.org/file20639/slice-object-cache.patch

___
Python tracker 
<http://bugs.python.org/issue10227>
___



[issue10227] Improve performance of MemoryView slicing

2011-02-02 Thread Stefan Behnel

Stefan Behnel  added the comment:

> Any benchmark numbers for the slice cache?

I ran the list tests in pybench and got this:

Test            minimum run-time         average run-time
                this    other   diff     this    other   diff

ListSlicing:    66ms    67ms    -2.2%    67ms    68ms    -2.7%
 SmallLists:    61ms    64ms    -4.5%    61ms    65ms    -5.6%

Totals:         127ms   131ms   -3.3%    128ms   133ms   -4.1%

Repeating this gave me anything between 1.5% and 3.5% in total, with >2% for 
the small lists benchmark (which is the expected best case as slicing large 
lists obviously dominates the slice object creation).

IMHO, even 2% would be pretty good for such a small change.


> Also, is the call to PyObject_INIT necessary?

In any case, the ref-count needs to be re-initialised to 1. A call to 
_Py_NewReference() would be enough, though, following the example in 
listobject.c. So you can replace

 PyObject_INIT(obj, &PySlice_Type);

by

 _Py_NewReference((PyObject *)obj);

in the patch. New patch attached.

--
Added file: http://bugs.python.org/file20650/slice-object-cache.patch

___
Python tracker 
<http://bugs.python.org/issue10227>
___



[issue10227] Improve performance of MemoryView slicing

2011-02-02 Thread Stefan Behnel

Stefan Behnel  added the comment:

There's a "PyObject_Del(obj)" in all code paths.

--

___
Python tracker 
<http://bugs.python.org/issue10227>
___



[issue10227] Improve performance of MemoryView slicing

2011-02-03 Thread Stefan Behnel

Stefan Behnel  added the comment:

Here are some real micro benchmarks (note that the pybench benchmarks actually 
do lots of other stuff besides slicing):

base line:

$ ./python -m timeit -s 'l = list(range(100)); s=slice(None)' 'l[s]'
100 loops, best of 3: 0.464 usec per loop
$ ./python -m timeit -s 'l = list(range(10)); s=slice(None)' 'l[s]'
1000 loops, best of 3: 0.149 usec per loop
$ ./python -m timeit -s 'l = list(range(10)); s=slice(None,1)' 'l[s]'
1000 loops, best of 3: 0.135 usec per loop


patched:

$ ./python -m timeit -s 'l = list(range(100))' 'l[:1]'
1000 loops, best of 3: 0.158 usec per loop
$ ./python -m timeit -s 'l = list(range(100))' 'l[:]'
100 loops, best of 3: 0.49 usec per loop
$ ./python -m timeit -s 'l = list(range(100))' 'l[1:]'
100 loops, best of 3: 0.487 usec per loop
$ ./python -m timeit -s 'l = list(range(100))' 'l[1:3]'
1000 loops, best of 3: 0.184 usec per loop

$ ./python -m timeit -s 'l = list(range(10))' 'l[:]'
1000 loops, best of 3: 0.185 usec per loop
$ ./python -m timeit -s 'l = list(range(10))' 'l[1:]'
1000 loops, best of 3: 0.181 usec per loop


original:

$ ./python -m timeit -s 'l = list(range(100))' 'l[:1]'
1000 loops, best of 3: 0.171 usec per loop
$ ./python -m timeit -s 'l = list(range(100))' 'l[:]'
100 loops, best of 3: 0.499 usec per loop
$ ./python -m timeit -s 'l = list(range(100))' 'l[1:]'
100 loops, best of 3: 0.509 usec per loop
$ ./python -m timeit -s 'l = list(range(100))' 'l[1:3]'
1000 loops, best of 3: 0.198 usec per loop

$ ./python -m timeit -s 'l = list(range(10))' 'l[:]'
1000 loops, best of 3: 0.188 usec per loop
$ ./python -m timeit -s 'l = list(range(10))' 'l[1:]'
100 loops, best of 3: 0.196 usec per loop


So the maximum impact seems to be 8% for very short slices (<10) and it quickly 
goes down for longer slices where the copy impact clearly dominates. There's 
still some 2% for 100 items, though.

I find it interesting that the base line is way below the other timings. That 
makes me think it's actually worth caching constant slice instances, as CPython 
already does for tuples. Cython also caches both now. I would expect that 
constant slices like [:], [1:] or [:-1] are extremely common. As you can see 
above, caching them could speed up slicing by up to 30% for short lists, and 
still some 7% for a list of length 100.

Stefan

--

___
Python tracker 
<http://bugs.python.org/issue10227>
___



[issue10227] Improve performance of MemoryView slicing

2011-02-03 Thread Stefan Behnel

Stefan Behnel  added the comment:

Here's another base line test: slicing an empty list

patched:

$ ./python -m timeit -s 'l = []' 'l[:]'
1000 loops, best of 3: 0.0847 usec per loop

original:

$ ./python -m timeit -s 'l = []' 'l[:]'
1000 loops, best of 3: 0.0977 usec per loop

That's about 13% less overhead.
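
The percentage follows directly from the two timings above. A small check, with variable names chosen only for this sketch:

```python
patched = 0.0847   # usec per loop, with the slice cache patch
original = 0.0977  # usec per loop, unpatched

# Relative reduction in per-slice overhead.
saving = (original - patched) / original
print(f"{saving:.1%}")
```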

--

___
Python tracker 
<http://bugs.python.org/issue10227>
___



[issue10227] Improve performance of MemoryView slicing

2011-02-03 Thread Stefan Behnel

Stefan Behnel  added the comment:

> of course, this will not help for other common cases such as l[x:x+2]

... which is exactly what this slice caching patch is there for. ;-)

--

___
Python tracker 
<http://bugs.python.org/issue10227>
___



[issue10227] Improve performance of MemoryView slicing

2011-02-03 Thread Stefan Behnel

Stefan Behnel  added the comment:

A quick test against the py3k stdlib:

find -name "*.py" | while read file; do egrep '\[[-0-9]*:[-0-9]*\]' "$file"; 
done | wc -l

This finds 2096 lines in 393 files.
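
For readers who want to see what the pattern does and does not count, here is a rough Python equivalent of the egrep expression. It only matches slices whose bounds are empty or made of digits and minus signs; the sample lines are invented for this sketch:

```python
import re

# Same pattern as the egrep call: '[', optional digits/minus, ':',
# optional digits/minus, ']'.
CONSTANT_SLICE = re.compile(r'\[[-0-9]*:[-0-9]*\]')

samples = ["x = l[:]", "y = l[1:]", "z = l[:-1]", "w = l[a:b]", "v = l[x:x+2]"]
matches = [s for s in samples if CONSTANT_SLICE.search(s)]
```

Slices with variable bounds such as l[a:b] or l[x:x+2] are not counted.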

--

___
Python tracker 
<http://bugs.python.org/issue10227>
___



[issue11107] Cache constant "slice" instances

2011-02-03 Thread Stefan Behnel

New submission from Stefan Behnel :

Follow-up to ticket 10227. The following facts seem to indicate that it would 
be worth caching constant instances of the slice type, such as in [:] or [:-1].

with cached slice instance:

$ ./python -m timeit -s 'l = list(range(100)); s=slice(None)' 'l[s]'
100 loops, best of 3: 0.464 usec per loop
$ ./python -m timeit -s 'l = list(range(10)); s=slice(None)' 'l[s]'
1000 loops, best of 3: 0.149 usec per loop
$ ./python -m timeit -s 'l = list(range(10)); s=slice(None,1)' 'l[s]'
1000 loops, best of 3: 0.135 usec per loop

uncached normal usage:

$ ./python -m timeit -s 'l = list(range(100))' 'l[:]'
100 loops, best of 3: 0.499 usec per loop
$ ./python -m timeit -s 'l = list(range(100))' 'l[:1]'
1000 loops, best of 3: 0.171 usec per loop

Timings based on Python 3.2 rc2.

A quick grep against the py3k stdlib finds 2096 lines in 393 files that use 
constant slices.
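
The comparison can be repeated from within Python using the timeit module. This is only a sketch: the `number` and `repeat` values are arbitrary, and the results depend heavily on the interpreter version and machine, so no particular ratio should be expected.

```python
import timeit

setup = "l = list(range(10)); s = slice(None)"

# Slice object built once in the setup code and reused.
cached = min(timeit.repeat("l[s]", setup=setup, number=100_000, repeat=3))
# Slice written literally in the timed statement.
literal = min(timeit.repeat("l[:]", setup=setup, number=100_000, repeat=3))

print(f"pre-built slice: {cached:.4f}s  literal slice: {literal:.4f}s")
```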

--
components: Interpreter Core
messages: 127804
nosy: scoder
priority: normal
severity: normal
status: open
title: Cache constant "slice" instances
type: performance
versions: Python 3.3

___
Python tracker 
<http://bugs.python.org/issue11107>
___



[issue10227] Improve performance of MemoryView slicing

2011-02-03 Thread Stefan Behnel

Stefan Behnel  added the comment:

Created follow-up issue 11107 for caching constant slice objects.

--

___
Python tracker 
<http://bugs.python.org/issue10227>
___



[issue11107] Cache constant "slice" instances

2011-02-03 Thread Stefan Behnel

Stefan Behnel  added the comment:

Erm, issue 10227.

--

___
Python tracker 
<http://bugs.python.org/issue11107>
___



[issue11107] Cache constant "slice" instances

2011-02-03 Thread Stefan Behnel

Stefan Behnel  added the comment:

Hmm, ok, but AFAICT, your patch was rejected rather because of the way it 
approached the problem, not so much because of the issue itself.

Plus, the fact that Python 3 requires slices in more places than Python 2 
(which had the lower level getslice protocol) makes this a bigger issue now 
than it was three years ago.

--

___
Python tracker 
<http://bugs.python.org/issue11107>
___



[issue11254] distutils doesn't byte-compile .py files to __pycache__ during installation

2011-02-20 Thread Stefan Behnel

New submission from Stefan Behnel :

During installation of Python packages (setup.py install or bdist), distutils 
puts .pyc files into the installed source directory, instead of moving them 
into __pycache__. This may mean that they are not getting used after 
installation (with potentially no way of getting updated due to lack of write 
access by users), and that source files that get imported during installation 
may end up with .pyc files in both the source directory and the __pycache__ 
directory in the installed package.

The relevant python-dev thread is here:

http://thread.gmane.org/gmane.comp.python.devel/121248/

--
assignee: tarek
components: Distutils
messages: 128897
nosy: eric.araujo, scoder, tarek
priority: normal
severity: normal
status: open
title: distutils doesn't byte-compile .py files to __pycache__ during 
installation
type: behavior
versions: Python 3.2

___
Python tracker 
<http://bugs.python.org/issue11254>
___



[issue11254] distutils doesn't byte-compile .py files to __pycache__ during installation

2011-02-20 Thread Stefan Behnel

Stefan Behnel  added the comment:

Here's a patch. I basically copied over the way py_compile determines the .pyc 
file name.

It works for me for a "normal" installation. However, I couldn't test it with 
"-O", as 2to3 crashes for me when I enable it during installation. I guess 
that's a separate issue.
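
For reference, the expected cache file name can be queried on current Python releases with importlib.util.cache_from_source(). This is not what the attached patch calls (it predates that API); it is only a sketch of the naming scheme, and the path `pkg/mod.py` is hypothetical:

```python
import importlib.util
import os

source = os.path.join("pkg", "mod.py")  # hypothetical module path

# Where a current interpreter expects the byte-compiled file to live.
normal = importlib.util.cache_from_source(source)
# Name used for bytecode compiled with -O (optimization level 1).
optimized = importlib.util.cache_from_source(source, optimization=1)

print(normal)
print(optimized)
```

The exact file name includes an interpreter-specific tag, so it differs between Python versions and implementations.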

--
keywords: +patch
Added file: http://bugs.python.org/file20802/issue11254.patch

___
Python tracker 
<http://bugs.python.org/issue11254>
___



[issue11255] 2to3 throws AttributeError during distutils installation with -O

2011-02-20 Thread Stefan Behnel

New submission from Stefan Behnel :

When running a distutils installation of Cython (which uses lib2to3) as 
"python3.2 -O setup.py bdist", I get this:

Skipping implicit fixer: buffer
Skipping implicit fixer: idioms
Skipping implicit fixer: set_literal
Skipping implicit fixer: ws_comma
Traceback (most recent call last):
  File "setup.py", line 319, in <module>
    **setup_args
  File "/opt/python3.2-opt/lib/python3.2/distutils/core.py", line 149, in setup
    dist.run_commands()
  File "/opt/python3.2-opt/lib/python3.2/distutils/dist.py", line 919, in run_commands
    self.run_command(cmd)
  File "/opt/python3.2-opt/lib/python3.2/distutils/dist.py", line 938, in run_command
    cmd_obj.run()
  File "/opt/python3.2-opt/lib/python3.2/distutils/command/bdist.py", line 132, in run
    self.run_command(cmd_name)
  File "/opt/python3.2-opt/lib/python3.2/distutils/cmd.py", line 315, in run_command
    self.distribution.run_command(command)
  File "/opt/python3.2-opt/lib/python3.2/distutils/dist.py", line 938, in run_command
    cmd_obj.run()
  File "/opt/python3.2-opt/lib/python3.2/distutils/command/bdist_dumb.py", line 74, in run
    self.run_command('build')
  File "/opt/python3.2-opt/lib/python3.2/distutils/cmd.py", line 315, in run_command
    self.distribution.run_command(command)
  File "/opt/python3.2-opt/lib/python3.2/distutils/dist.py", line 938, in run_command
    cmd_obj.run()
  File "/opt/python3.2-opt/lib/python3.2/distutils/command/build.py", line 128, in run
    self.run_command(cmd_name)
  File "/opt/python3.2-opt/lib/python3.2/distutils/cmd.py", line 315, in run_command
    self.distribution.run_command(command)
  File "/opt/python3.2-opt/lib/python3.2/distutils/dist.py", line 938, in run_command
    cmd_obj.run()
  File "/opt/python3.2-opt/lib/python3.2/distutils/command/build_py.py", line 404, in run
    self.run_2to3(self.updated_files)
  File "/opt/python3.2-opt/lib/python3.2/distutils/util.py", line 649, in run_2to3
    return run_2to3(files, self.fixer_names, self.options, self.explicit)
  File "/opt/python3.2-opt/lib/python3.2/distutils/util.py", line 597, in run_2to3
    r.refactor(files, write=True)
  File "/opt/python3.2-opt/lib/python3.2/lib2to3/refactor.py", line 296, in refactor
    self.refactor_file(dir_or_file, write, doctests_only)
  File "/opt/python3.2-opt/lib/python3.2/lib2to3/refactor.py", line 349, in refactor_file
    tree = self.refactor_string(input, filename)
  File "/opt/python3.2-opt/lib/python3.2/lib2to3/refactor.py", line 381, in refactor_string
    self.refactor_tree(tree, name)
  File "/opt/python3.2-opt/lib/python3.2/lib2to3/refactor.py", line 442, in refactor_tree
    find_root(node)
  File "/opt/python3.2-opt/lib/python3.2/lib2to3/fixer_util.py", line 276, in find_root
    while node.type != syms.file_input:
AttributeError: 'NoneType' object has no attribute 'type'

--
components: 2to3 (2.x to 3.0 conversion tool)
messages: 128900
nosy: scoder
priority: normal
severity: normal
status: open
title: 2to3 throws AttributeError during distutils installation with -O
type: crash
versions: Python 3.2

___
Python tracker 
<http://bugs.python.org/issue11255>
___



[issue2860] re module fails to handle digits in byte strings

2008-05-15 Thread Stefan Behnel

New submission from Stefan Behnel <[EMAIL PROTECTED]>:

The following fails in Py3.0a5:

>>> import re
>>> re.search(b'(\d+)', b'-2.80 98\n')

I get a TypeError: "Can't convert 'int' object to str implicitly" in
line 204 of file "sre_parse.py", code being "char = char + c".

--
components: Library (Lib)
messages: 66848
nosy: scoder
severity: normal
status: open
title: re module fails to handle digits in byte strings
type: behavior
versions: Python 3.0

__
Tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue2860>
__



[issue2895] Crash in ParseTupleAndKeywords when passing byte string keywords

2008-05-16 Thread Stefan Behnel

New submission from Stefan Behnel <[EMAIL PROTECTED]>:

Using 3.0a5, the following code crashes in vgetargskeywords (getargs.c:1542)

  >>> d = {b"encoding": "abc"}
  >>> str(b"abc", **d)

It should raise a TypeError instead, i.e. line 1535 should read

  if (!PyUnicode_Check(key)) {

instead of

  if (!PyString_Check(key) && !PyUnicode_Check(key)) {
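
For comparison, this is the behaviour the report asks for, written as a check that can be run on a current Python 3 interpreter, where a bytes key in a keyword dict is rejected with a TypeError instead of crashing. The variable name `message` exists only for this sketch:

```python
d = {b"encoding": "abc"}

message = None
try:
    str(b"abc", **d)
except TypeError as exc:
    # Keyword names must be str objects; bytes keys are refused.
    message = str(exc)
```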

--
components: Interpreter Core
messages: 66958
nosy: scoder
severity: normal
status: open
title: Crash in ParseTupleAndKeywords when passing byte string keywords
type: crash
versions: Python 3.0

__
Tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue2895>
__



[issue2915] PyObject_IsInstance() doesn't find bases named in type(name, bases, dict)

2008-05-19 Thread Stefan Behnel

New submission from Stefan Behnel <[EMAIL PROTECTED]>:

While porting the code that Cython generates to Py3a5 (almost completed,
BTW), I noticed a problem with class creation. We are currently using
this call to create a new class in Py3:

PyObject_CallFunctionObjArgs((PyObject *)&PyType_Type,
 name, bases, dict, NULL);

As an example, I subtype the built-in "list" type like this (Cython code!):

class B(list):
    def append(self, *args):
        for arg in args:
            list.append(self, arg)

which calls type() as shown above with name="B" and bases=(PyList_Type,).

Surprisingly to me, the call to .append() then fails in the method
descriptor code with a type error on "self". I tried calling super(...)
instead, and it gives a similar error. I read through the descriptor
code and the thing that fails here is

PyObject_IsInstance(self, (PyObject *)(descr->d_type))

in line 229 of descrobject.c, which internally calls

PyObject_TypeCheck(inst, (PyTypeObject *)cls)

in line 2543 of abstract.c. The problem here is that this checks the
ob_type, which holds a "type" and not a "B", so it doesn't find the base
type "list" of the "B" type and instead looks through the base types of
"type". The result is that PyObject_IsInstance() does not consider the
result of the above call to type(name, bases, dict) an instance of the
types that were named in "bases".

As this works in Python 2.5.1 and also for equivalent Python code in the
interpreter of Python 3.0a5, I assume that this is a bug in the alpha
version.
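
The equivalent Python code mentioned above can be sketched as follows. It builds the class through the three-argument form of type(), mirroring the C-level call, and behaves as expected on the interpreter side:

```python
def append(self, *args):
    for arg in args:
        list.append(self, arg)

# Three-argument type() creates the class dynamically: name, bases, dict.
B = type("B", (list,), {"append": append})

b = B()
b.append(1, 2, 3)
```

Here `b` is an instance of both B and list, so the descriptor check on `self` passes.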

--
components: Interpreter Core
messages: 67065
nosy: scoder
severity: normal
status: open
title: PyObject_IsInstance() doesn't find bases named in type(name, bases, dict)
type: behavior
versions: Python 3.0

__
Tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue2915>
__



[issue2860] re module fails to handle byte strings

2008-05-20 Thread Stefan Behnel

Stefan Behnel <[EMAIL PROTECTED]> added the comment:

Ah, I now see what the actual problem is. Byte strings return the byte
value on indexing.

Changing the title accordingly.
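
A short sketch of the indexing behaviour described here, run against a current Python 3 release (where the search from the original report works); the variable names are illustrative only:

```python
import re

data = b'-2.80 98\n'

# Indexing a bytes object yields an int, not a length-1 bytes object.
first = data[0]
# Slicing keeps the bytes type.
first_as_bytes = data[0:1]

# With a bytes pattern, the search from the report succeeds.
match = re.search(rb'(\d+)', data)
```

Code that builds up a string character by character, as in "char = char + c", therefore ends up adding an int to a string when it iterates over bytes.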

--
title: re module fails to handle digits in byte strings -> re module fails to 
handle byte strings

__
Tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue2860>
__



[issue2915] PyObject_IsInstance() doesn't find bases named in type(name, bases, dict)

2008-05-21 Thread Stefan Behnel

Stefan Behnel <[EMAIL PROTECTED]> added the comment:

Sorry, the bug was in Cython, which didn't call InstanceMethod().

Please ignore.

__
Tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue2915>
__



[issue2963] Method cache is broken in Py3

2008-05-25 Thread Stefan Behnel

New submission from Stefan Behnel <[EMAIL PROTECTED]>:

The method cache is deactivated in current Py3. As Lisandro Dalcín
noticed, the following code in typeobject.c always returns false for a
Py3 identifier:

#define MCACHE_CACHEABLE_NAME(name)                     \
        PyString_CheckExact(name) &&                    \
        PyString_GET_SIZE(name) <= MCACHE_MAX_ATTR_SIZE

--
components: Interpreter Core
messages: 67328
nosy: scoder
severity: normal
status: open
title: Method cache is broken in Py3
type: behavior
versions: Python 3.0

__
Tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue2963>
__



[issue2964] instancemethod_descr_get() lacks an INCREF

2008-05-25 Thread Stefan Behnel

New submission from Stefan Behnel <[EMAIL PROTECTED]>:

Here is a fix for Objects/classobject.c in Py3.0a5 that fixes a ref
count crash for classmethods.

--
components: Interpreter Core
files: instancemethod-fix.patch
keywords: patch
messages: 67334
nosy: scoder
severity: normal
status: open
title: instancemethod_descr_get() lacks an INCREF
type: crash
versions: Python 3.0
Added file: http://bugs.python.org/file10434/instancemethod-fix.patch

__
Tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue2964>
__



[issue2963] Method cache is broken in Py3

2008-05-25 Thread Stefan Behnel

Stefan Behnel <[EMAIL PROTECTED]> added the comment:

Here is a patch that fixes this.

--
keywords: +patch
Added file: http://bugs.python.org/file10435/py3k-method-cache-fix.patch

__
Tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue2963>
__



[issue2989] type_modified() in typeobject.c should be public

2008-05-28 Thread Stefan Behnel

New submission from Stefan Behnel <[EMAIL PROTECTED]>:

Here is a patch that makes this function public. This allows C code to
correctly taint a type after updating its attributes or base classes.

--
components: Interpreter Core
files: pytype_modified.patch
keywords: patch
messages: 67444
nosy: scoder
severity: normal
status: open
title: type_modified() in typeobject.c should be public
type: feature request
versions: Python 2.6, Python 3.0
Added file: http://bugs.python.org/file10457/pytype_modified.patch

___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue2989>
___



[issue2990] type cache updates might run cleanup code in an inconsistent state

2008-05-28 Thread Stefan Behnel

New submission from Stefan Behnel <[EMAIL PROTECTED]>:

Similar to the "decref before set" issue solved by Py_CLEAR(), the code
in typeobject.c calls DECREF in the middle of a cache update. This
leaves one cache entry in an invalid state during the DECREF call, which
might result in running cleanup code in this state. If this code depends
on an attribute lookup, this might lead to a cache lookup, which in turn
can access the infected part of the cache. In the worst case, such a
scenario can lead to a crash as it accesses an already cleaned-up object.

Here is a patch that fixes this.

--
components: Interpreter Core
files: possible-decref-before-set-fix.patch
keywords: patch
messages: 67445
nosy: scoder
severity: normal
status: open
title: type cache updates might run cleanup code in an inconsistent state
type: behavior
versions: Python 2.6, Python 3.0
Added file: 
http://bugs.python.org/file10458/possible-decref-before-set-fix.patch

___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue2990>
___



[issue2997] PyNumberMethods has left-over fields in Py3

2008-05-29 Thread Stefan Behnel

New submission from Stefan Behnel <[EMAIL PROTECTED]>:

Here is a patch that removes three unused fields from the
PyNumberMethods struct in Py3. Since two fields were already removed
(one even before the ones this patch removes), there is no way existing
Py2 C code that uses this struct can work in Py3 without changes, so it
doesn't add any problems.

--
components: Interpreter Core
files: pynumbermethods-cleanup.patch
keywords: patch
messages: 67477
nosy: scoder
severity: normal
status: open
title: PyNumberMethods has left-over fields in Py3
versions: Python 3.0
Added file: http://bugs.python.org/file10461/pynumbermethods-cleanup.patch

___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue2997>
___



[issue2990] type cache updates might run cleanup code in an inconsistent state

2008-05-29 Thread Stefan Behnel

Stefan Behnel <[EMAIL PROTECTED]> added the comment:

Ok, I buy that argument. The patch may be considered a code uglification
then.

___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue2990>
___



[issue2997] PyNumberMethods has left-over fields in Py3

2008-06-01 Thread Stefan Behnel

Stefan Behnel <[EMAIL PROTECTED]> added the comment:

This seems to have been applied in current SVN.

___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue2997>
___



[issue3020] doctest should have lib2to3 integration

2008-06-01 Thread Stefan Behnel

New submission from Stefan Behnel <[EMAIL PROTECTED]>:

Running a doctest with Py2 syntax in Py3 currently involves either
running the 2to3 tool by hand or writing code to convert the doctest
using lib2to3, and then running the modified version. This basically
pushes the burden of automating this step in any test runner script in
the world onto the authors or users of these scripts.

Writing portable code is hard enough, but writing portable doctests that
remain user readable should not remain as hard as it currently is. The
doctest module in Py3 should have a simple option to run a Py2 doctest
(in a file or doc string) without requiring users to write the glue code
for it.

On a related note, if a 3to2 tool becomes available, this should be
directly supported by doctest in Py2.6.

--
assignee: collinwinter
components: 2to3 (2.x to 3.0 conversion tool), Library (Lib)
messages: 67594
nosy: collinwinter, scoder
severity: normal
status: open
title: doctest should have lib2to3 integration
type: feature request
versions: Python 3.0

___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue3020>
___



[issue3046] Locking should be removed from the new buffer protocol

2008-06-05 Thread Stefan Behnel

New submission from Stefan Behnel <[EMAIL PROTECTED]>:

Here is a patch against the current PEP 3118 that removes the LOCK flag.
It follows this discussion on the Py3k mailing list:

http://comments.gmane.org/gmane.comp.python.python-3000.devel/13409?set_lines=10

It has not yet been approved by the PEP owners and requires
implementation, preferably before beta1.

--
components: Interpreter Core
files: pep-3118-no-locking.patch
keywords: patch
messages: 67747
nosy: scoder
severity: normal
status: open
title: Locking should be removed from the new buffer protocol
type: behavior
versions: Python 2.6, Python 3.0
Added file: http://bugs.python.org/file10533/pep-3118-no-locking.patch

___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue3046>
___



[issue3046] Locking should be removed from the new buffer protocol

2008-06-06 Thread Stefan Behnel

Stefan Behnel <[EMAIL PROTECTED]> added the comment:

As a quick summary of the problems with the current PEP:

1) Many use cases will not require any locking at all, either because
they run single-threaded with a short-read/short-write pattern, or
because they do not write at all.

2) Write locks require exclusive access rights, but there isn't
currently a way to change an existing read lock into a write lock. This
means that to acquire a write lock, all consumers (including the
requester) must first release all read locks before a write lock can be
granted. Therefore, it is not necessary to have such a thing as a read
lock in the first place, as any read request essentially becomes a
read-lock from the POV of a write lock request. And for data integrity
reasons, some kind of write lock must always be applied when writing is
requested, regardless of requesting a lock or not.

3) The requirement in point 2) for releasing all locks before granting a
write lock necessitates short-read/short-write access, in which case
locking is of limited usefulness already.

4) More complex locking scenarios may also require special locking
semantics that are not currently handled by the proposed locking protocol.

The proposal is therefore to

a) remove the locking protocol altogether
b) leave it to application space how read/write locking should be
handled (if required at all).
c) leave it to providers how a request for a writable buffer should be
handled: by just granting it (thus jeopardising data integrity), by
applying a lock internally, or by copying buffers.
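
As an illustration of option c), this is how one provider behaves in current CPython: a bytearray refuses to be resized while a buffer is exported, but allows in-place writes through the view. It is a sketch of provider-side protection in general, not a description of what the attached patch implements:

```python
buf = bytearray(b"abc")
view = memoryview(buf)      # consumer acquires the buffer

try:
    buf.extend(b"def")      # resizing would invalidate the exported memory
    resized_while_exported = True
except BufferError:
    resized_while_exported = False

view[0] = ord("x")          # in-place writes through the view are still allowed
view.release()              # consumer gives the buffer back
buf.extend(b"def")          # now resizing succeeds
```

No lock flag is requested by the consumer here; the provider alone decides how to protect its data.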

___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue3046>
___



[issue2997] PyNumberMethods has left-over fields in Py3

2008-06-06 Thread Stefan Behnel

Stefan Behnel <[EMAIL PROTECTED]> added the comment:

:) sorry, that's the problem when you don't have commit rights and leave
the changes in your local copy.

So this is still an open issue that should be fixed before beta1, thanks.

___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue2997>
___



[issue3046] Locking should be removed from the new buffer protocol

2008-06-06 Thread Stefan Behnel

Stefan Behnel <[EMAIL PROTECTED]> added the comment:

Here is a patch that removes all occurrences of the locking protocol
from the current py3k branch code base.

There are still some issues in memoryobject.c:

- there was an occurrence of PyBUF_SHADOW that might have to be handled

- memory_getbuf and memory_releasebuf must be reimplemented as it is no
longer allowed to call getbuffer/releasebuffer with a NULL pointer

Added file: http://bugs.python.org/file10534/buffer-no-locking.patch

___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue3046>
___


