Stefan Behnel added the comment:
New changeset a3697db0102b9b6747fe36009e42f9b08f0c1ea8 by Stefan Behnel in
branch 'master':
bpo-34160: explain how to deal with attribute order in ElementTree (GH-14867)
https://github.com/python/cpython/commit/a3697db0102b9b6747fe36009e42f9
Change by Stefan Behnel :
--
pull_requests: +14709
pull_request: https://github.com/python/cpython/pull/14936
___
Python tracker
<https://bugs.python.org/issue37
Stefan Behnel added the comment:
New changeset 63673916464bace8e2147357395fdf3497967ecb by Stefan Behnel (Miss
Islington (bot)) in branch '3.8':
[3.8] bpo-34160: explain how to deal with attribute order in ElementTree
(GH-14867) (GH-14935)
https://github.com/python/cpyt
Stefan Behnel added the comment:
New changeset bb697899aa65d90488af1950ac7cceeb3877d409 by Stefan Behnel in
branch '3.8':
[3.8] bpo-37399: Correctly attach tail text to the last element/comment/pi
(GH-14856) (GH-14936)
https://github.com/python/cpyt
Change by Stefan Behnel :
--
resolution: -> fixed
stage: patch review -> resolved
status: open -> closed
___
Python tracker
<https://bugs.python.or
Change by Stefan Behnel :
--
pull_requests: +14929
pull_request: https://github.com/python/cpython/pull/15200
___
Python tracker
<https://bugs.python.org/issue14
Stefan Behnel added the comment:
The spec section that Raymond quoted makes it clear that pretty printing is not
for everyone. But there are many use cases where it is 1) helpful, 2) leads to
correct results, and 3) does not grow the file size excessively. Whoever wants
to make use of it is
Stefan Behnel added the comment:
FWIW, deep traversing an XML tree on an operation as simple as "==" seems
excessive. To me, object identity comparison seems the most sensible behaviour
of "==" on Element objects.
(It's not "complicated to implement",
Stefan Behnel added the comment:
Right. If you want to compare XML trees for equality, either write your own
deep-tree comparison function, or use something like doctestcompare, or use a
C14N serialisation (which is now part of Py3.8). Whichever suits your needs.
https://github.com/lxml
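As a rough illustration of the first and last option, here is a minimal sketch. It assumes Python 3.8 or later for xml.etree.ElementTree.canonicalize(). The helper name trees_equal is hypothetical, and what counts as "equal" (whitespace, tail text, attribute order) is a decision left to the caller:

```python
import xml.etree.ElementTree as ET


def trees_equal(a, b):
    """Hypothetical deep comparison: tag, attributes, text, tail, children."""
    if a.tag != b.tag or a.attrib != b.attrib:
        return False
    if (a.text or "") != (b.text or "") or (a.tail or "") != (b.tail or ""):
        return False
    if len(a) != len(b):
        return False
    return all(trees_equal(ca, cb) for ca, cb in zip(a, b))


doc1 = '<root a="1" b="2"><child>text</child></root>'
doc2 = '<root b="2" a="1"><child>text</child></root>'

tree1 = ET.fromstring(doc1)
tree2 = ET.fromstring(doc2)

# "==" on Element objects falls back to object identity.
identity_equal = tree1 == tree2

# Explicit deep comparison; comparing the attrib dicts ignores attribute order.
deep_equal = trees_equal(tree1, tree2)

# C14N serialisation normalises attribute order, so the strings can be compared.
c14n_equal = ET.canonicalize(doc1) == ET.canonicalize(doc2)
```

Which of the two fits better depends on whether the comparison should be strict about details such as insignificant whitespace.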
Stefan Behnel added the comment:
FWIW, makes total sense to me to have it there. Question is more if we can
still get it into Py3.8, since it's a new feature for fractions.
--
nosy: +lukasz.langa, scoder
___
Python tracker
Stefan Behnel added the comment:
Can the PR be applied then? It looks good to me.
--
___
Python tracker
<http://bugs.python.org/issue29464>
Stefan Behnel added the comment:
Sorry for not responding, missed the message, it seems.
Cython has to support old-style relative imports also in Py3 because that's how
the user code was originally written, using Py2-style syntax and semantics.
Most Cython code has not been converted t
Stefan Behnel added the comment:
For future reference, this change is supported by Cython 0.26 (which is
currently close to release).
--
___
Python tracker
<http://bugs.python.org/issue29
Stefan Behnel added the comment:
> Are all uses of internal CPython details optional?
Well, what classifies as a "CPython detail" sometimes just becomes clear when
other implementations don't have it. ;-)
But yes, the C code that Cython generates selects alternative impleme
Stefan Behnel added the comment:
FYI, I've finally managed to find the time for implementing PEP 489 style
module initialisation in Cython. It was so easy that I'm sorry it took me so
long to get started. Cython 0.26 is fresh out, so the feature should go into
0.27.
https://github.
Stefan Behnel added the comment:
This has been resolved by PEP 489, issue 24268.
The module initialisation process receives the complete ModuleSpec now,
starting with CPython 3.5, and can do with it whatever it likes, before
executing any user code.
--
resolution: -> duplicate
st
New submission from Stefan Behnel:
PyObject *exception, *value, *tb;
PyErr_Fetch(&exception, &value, &tb);
/* PyObject_IsSubclass() can recurse and therefore is
not safe (see test_bad_getattr in test.pickletester). */
res = PyType_IsSubtype((PyType
Stefan Behnel added the comment:
Looks like the switch from PyObject_IsSubclass() to PyType_IsSubtype() was made
during the original Py3 development cycle. It should thus be safe to assume
that the semantics are "as designed". :)
What about applying the patch also to 3.6
Stefan Behnel added the comment:
1) Is this reproducible?
2) Getting a crash in compile.c indicates that this is happening at
parse/compile time and not when your Python code is executing. Can you confirm
that? Does it generate a .pyc file on import that would indicate a successful
byte code
Stefan Behnel added the comment:
Wouldn't this be a typical case where we'd expect a module to evolve and gain
usage on PyPI first, before adding it to the stdlib?
Searching for "grapheme" in PyPI gives some results for me. Even if they do not
cover what this ticket asks
Stefan Behnel added the comment:
I've looked at the file and it contains a huge amount of deeply nested
if-statements. Given that parsers and compilers are typically recursive, I can
well imagine that this is a problem, and my guess is that it's most likely just
the different C l
Stefan Behnel added the comment:
Regarding the user side of the problem, you might(!) be able to work around the
crash by merging nested if-conditions into and-expressions if they have no
elif/else. That's also why the split into multiple files doesn't help, it's the
depth of
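To make the suggested workaround concrete, the sketch below compares the statement nesting of both spellings using the ast module. The helper max_if_depth and the sample sources are made up for illustration, and the measured depth is only a proxy for what the compiler actually recurses over:

```python
import ast

# Nested if-statements without elif/else ...
NESTED = """
if a:
    if b:
        if c:
            result = 1
"""

# ... can be merged into a single and-expression.
MERGED = """
if a and b and c:
    result = 1
"""


def max_if_depth(node, depth=0):
    """Return the deepest nesting level of ``if`` statements below ``node``."""
    if isinstance(node, ast.If):
        depth += 1
    deepest = depth
    for child in ast.iter_child_nodes(node):
        deepest = max(deepest, max_if_depth(child, depth))
    return deepest


nested_depth = max_if_depth(ast.parse(NESTED))
merged_depth = max_if_depth(ast.parse(MERGED))
```

The merged form keeps the block nesting flat, no matter how many conditions are chained.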
Changes by Stefan Behnel :
--
pull_requests: +3322
___
Python tracker
<http://bugs.python.org/issue31336>
___
___
Python-bugs-list mailing list
Unsubscribe:
New submission from Stefan Behnel:
The method lookup fast path in _PyType_Lookup() does not apply during type
creation, which is highly dominated by the performance of the dict lookups
along the mro chain. Pre-calculating the name hash speeds up the creation of an
empty class (i.e. "
Stefan Behnel added the comment:
I literally just ran timeit on "class Test: pass", but I'll see if I can
provide proper numbers.
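For reference, roughly the same measurement can be taken from within Python. This is only a sketch of the idea; the repeat and loop counts are arbitrary choices, not the ones used for the numbers in this ticket:

```python
import timeit

LOOPS = 2000

# Time the creation of an empty class; each repetition runs the statement
# LOOPS times and reports the total time in seconds.
timings = timeit.repeat("class Test: pass", repeat=5, number=LOOPS)

# Convert the best repetition to microseconds per loop.
best_usec = min(timings) / LOOPS * 1e6
print(f"best of 5: {best_usec:.2f} usec per loop")
```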
--
___
Python tracker
<http://bugs.pyt
Stefan Behnel added the comment:
Comparing against CPython master as of 122e88a8354e3f75aeaf6211232dac88ac296d54
I rebuilt my CPython to get clean results, and that still gave me almost 15%
overall speedup.
Original:
$ ./python -m timeit 'class Test: pass'
2 loops, best of 5:
Stefan Behnel added the comment:
It's the slot names in "slotdefs". See "update_one_slot()".
The time that is saved is mostly the overhead of calling PyDict_GetItem(). I
actually tried PyDict_GetItemWithError() first, which is faster due to the
lower error handling o
Stefan Behnel added the comment:
> I would prefer to use the _Py_IDENTIFIER API rather than using
> _PyDict_GetItem_KnownHash().
Do you mean for the table of slot descriptions? I'm not sure that the effect
would be comparable.
> Maybe there are other opportunities for optimiza
Stefan Behnel added the comment:
Since the number of applications that get along without any file access is
probably close to irrelevant, "os" and "io" feel like sufficiently widely used
modules to merit being part of a "usual Python startup" benchmark. M
Stefan Behnel added the comment:
I updated the pull request with a split version of _PyType_Lookup() that
bypasses the method cache during slot updates. I also ran the benchmarks with
PGO enabled now to get realistic results. The overall gain is around 15%.
Original:
$ ./python -m timeit
Stefan Behnel added the comment:
Since I'm getting highly reproducible results on re-runs, I tend to trust these
numbers.
--
___
Python tracker
<http://bugs.python.org/is
Stefan Behnel added the comment:
BTW, it seems that Yury's dict copy optimisation would also help here. When I
use a benchmark scenario with a simple non-empty method/attribute dict (from
Cython this time), almost 10% of the creation time is spent copying that dict,
which should essent
Stefan Behnel added the comment:
OTOH, if the created "module" is not a module object, then we could argue that
the extension implementation is on its own with that case, and has to do its
own re-execution safety checks.
--
___
Pyth
Stefan Behnel added the comment:
Marcel proposed to disallow main-execution if the extension *might* return
anything but a real object (not only if it actually does), but that seems
excessive to me. The actual problem is that we consider it unsafe if the module
is executed more than once
Stefan Behnel added the comment:
I was kinda guessing that modifying the slot list wasn't a good idea. ;)
My current use case is that I implement the "create" slot because it makes it
very easy to intercept the spec and its configuration. It is not passed into
"exec"
Stefan Behnel added the comment:
I'm a bit torn on this. On the one hand, it's basically saying, "Cython is
probably going to do it right anyway, so let's just assume it does". That's
nice, and might be applicable to other cases as well. But that also feels li
Stefan Behnel added the comment:
Any more comments on the proposed implementation? 13-15% seem worth it to me.
@Victor, or are you saying "PyId, or no change at all"?
--
___
Python tracker
<https://bugs.python.o
Stefan Behnel added the comment:
No, that one was addressed. I think only Victor's comment is still open, that's
why I asked back.
--
___
Python tracker
<https://bugs.python.o
New submission from Stefan Behnel:
The "XMLParser.__init__()" method in "_elementtree.c" contains this code:
self->handle_start = PyObject_GetAttrString(target, "start");
self->handle_data = PyObject_GetAttrString(target, "data");
se
Changes by Stefan Behnel :
--
keywords: +patch
pull_requests: +3542
stage: -> patch review
___
Python tracker
<https://bugs.python.org/issue31455>
Stefan Behnel added the comment:
Feel free to provide a separate pull request. These issues seem independent of
the exception handling problem that I wrote a fix for.
--
___
Python tracker
<https://bugs.python.org/issue31
Stefan Behnel added the comment:
> Is it correct to call _PyType_Lookup() with an exception set?
The general rule of thumb is that it's not safe to call any user code with a
live exception set, and lookups can call into user code.
I quickly looked through all occurrences (there are
New submission from Stefan Behnel:
Follow-up to issue 31336:
The fact that _PyType_Lookup() does not propagate exceptions leaves some space
for ambiguity. If, in a chain of MRO lookups, one would fail and a later one
would succeed, is it correct to keep trying? What if the first failure
Stefan Behnel added the comment:
One more thing: the fact that the lookup does not propagate exceptions leaves
some space for ambiguity. If, in a chain of MRO lookups, one would fail and a
later one would succeed, is it correct to keep trying? What if the first
failure actually failed to see
Stefan Behnel added the comment:
I'm working on a PR for this, but after changing all usages and fixing up some
error handling here and there, it results in an interpreter crash for me. I'll
try to debug it during the next days.
--
nosy: +pitrou, serhiy
Changes by Stefan Behnel :
--
keywords: +patch
pull_requests: +3607
stage: -> patch review
___
Python tracker
<https://bugs.python.org/issue31465>
Stefan Behnel added the comment:
Test suite passes now. The crash was due to an uninitialised error flag in one
case, which led the C compiler to do incorrect optimisations on undefined
behaviour.
--
___
Python tracker
<https://bugs.python.
New submission from Stefan Behnel:
I'm seeing crashes in the latest Py3.7 when I run this test (taken from lxml's
compatibility test suite):
etree = xml.etree.ElementTree
def test_feed_parser_error_position(self):
    ParseError = etree.ParseError
    parser =
Stefan Behnel added the comment:
Sorry, wrong line number. Was using an installed Py3.7, not a fresh build.
However, my crashing installed version is from September 1st, *before* the
expat update, which was apparently on September 5th.
With a clean debug build, I get a reproducible crash
Stefan Behnel added the comment:
Minimal reproducer seems to be this:
--
import xml.etree.ElementTree as etree
def test():
    parser = etree.XMLParser()
    try:
        parser.close()
    except etree.ParseError as exc:
        e = exc  # must keep local reference!
test
Stefan Behnel added the comment:
Thanks for confirming, Victor.
I hadn't realised that the first update of expat was already back in June. That
means it's not ruled out yet as a source of this crash. Bisecting is probably a
good idea.
--
Stefan Behnel added the comment:
Question: Do you think it's ok to change the signature of _PyType_Lookup() in
this way by adding an error flag, or should I add a new function instead?
There is no performance difference to PR 3279 since gcc should optimise this
flag properly away in
Changes by Stefan Behnel :
--
pull_requests: +3632
___
Python tracker
<https://bugs.python.org/issue31465>
Stefan Behnel added the comment:
> The question is more why/how the code didn't crash before? :-)
Typical case of a Schroedinbug.
--
___
Python tracker
<https://bugs.python.org
Stefan Behnel added the comment:
What exactly is the problem here? encoding='unicode' will never appear in the
XML declaration, and thus will never be "presented to XML processors". It is up
to the user to deal with encodings in this case, which I think is fine. It'
Stefan Behnel added the comment:
If this is really just about debugging, then I would suggest to not break
existing code at all.
--
nosy: +scoder
___
Python tracker
<https://bugs.python.org/issue35
Stefan Behnel added the comment:
One regex related code pattern that I generally like is to assign bound methods
to good names and use those. In this case, I would write
_has_non_base16_digits = re.compile(b'[^0-9A-F]').search
...
if _has_non_base16_digits(s):
    raise ...
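A self-contained version of that pattern might look as follows. The wrapper function check_base16 and the exception message are assumptions made for the example, not taken from the patch under discussion:

```python
import re

# Bind the compiled pattern's search method to a descriptive name once.
_has_non_base16_digits = re.compile(b'[^0-9A-F]').search


def check_base16(s):
    """Return ``s`` unchanged, or raise if it contains non-base16 bytes."""
    if _has_non_base16_digits(s):
        raise ValueError('Non-base16 digit found')
    return s
```

The call site then reads like a plain predicate, and the attribute lookup on the pattern object happens only once at import time.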
-
New submission from Stefan Behnel :
Spelling out the numerator/denominator calculation in the __mod__ special
method, and actually implementing __divmod__, speeds up both operations by
2-3x. This is due to avoiding repeated Fraction instantiation and
normalisation, as well as less arithmetic
Change by Stefan Behnel :
--
keywords: +patch, patch
pull_requests: +10582, 10583
stage: -> patch review
___
Python tracker
<https://bugs.python.org/issu
Change by Stefan Behnel :
--
keywords: +patch
pull_requests: +10582
stage: -> patch review
___
Python tracker
<https://bugs.python.org/issue35588>
Change by Stefan Behnel :
--
keywords: +patch, patch, patch
pull_requests: +10582, 10583, 10584
stage: -> patch review
___
Python tracker
<https://bugs.python.org/issu
Change by Stefan Behnel :
--
nosy: +mark.dickinson, serhiy.storchaka
___
Python tracker
<https://bugs.python.org/issue35588>
Stefan Behnel added the comment:
Similarly, I think "//" (__floordiv__) should be implemented using integer
operations rather than math.floor():
(a.numerator * b.denominator) // (b.numerator * a.denominator)
Thoughts?
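A quick sketch to check that the integer formula agrees with the math.floor() approach; floordiv_int is a hypothetical stand-alone helper, not the actual method in fractions.py:

```python
import math
from fractions import Fraction as F


def floordiv_int(a, b):
    """Floor division of two fractions using integer arithmetic only."""
    return (a.numerator * b.denominator) // (b.numerator * a.denominator)


a, b = F(-7, 3), F(3, 2)

via_integers = floordiv_int(a, b)

# Reference result: builds an intermediate Fraction before flooring it.
via_floor = math.floor(a / b)
```

Since Fraction keeps its denominator positive, Python's floor semantics for "//" on integers give the correct result for negative operands as well.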
--
___
Pyth
Stefan Behnel added the comment:
Motivation for the latter:
$ ./python -m timeit -s 'from fractions import Fraction as F; a = F(-7, 3); b =
F(3, 2)' 'a // b'
10 loops, best of 5: 3.7 usec per loop
$ ./python -m timeit -s 'from fractions import Fraction as F;
Stefan Behnel added the comment:
Sure, I can add tests, but I wonder what kind of regression you expect. The
algorithm is still the same as before, it's just implemented more efficiently.
It does trade a bit of memory for the speed, though, since there is no longer
an interme
Stefan Behnel added the comment:
Thanks for your review and ideas, Serhiy. I added a couple of test cases, but
failed to find any case where the new implementation is not much faster.
I also tried "divmod(n_div, d_div)" for implementing __divmod__(), and the
results are
Stefan Behnel added the comment:
I agree with Antoine. After all, we are optimising a safety check here that
runs in linear time. If people want speed, they should consider methods that do
not do this check in the first place.
--
___
Python
Stefan Behnel added the comment:
I have no doubts that the code is right. However, your expectations might not
be.
Try to print the values inside of the loop, for each iteration, as well as
their type. You'll likely be surprised what that gives.
(In any case, this is not a bug. If you
Stefan Behnel added the comment:
Unlikely to get changed in Py3.4/5 anymore, since this is not even a bug fix. I
wouldn't even fight for backporting, although 3.7 seems ok for it.
I agree that this code duplication is worth removing. I don't consider hashing
the empty string
Stefan Behnel added the comment:
> why bytes and str generates the same hash value for ASCII sequence
Probably mostly for historical Py2 reasons. These days, both are somewhat
unlikely to appear in the same dict. But still, I'd advise against changing the
hash function without a v
Stefan Behnel added the comment:
> maybe this can be changed in Python 4.0
Well, if you find a *very* good reason for changing it, as I said. Py4 won't be
special in that regard, I suppose.
--
___
Python tracker
<https://bugs
Change by Stefan Behnel :
--
nosy: +scoder -scode
___
Python tracker
<https://bugs.python.org/issue35582>
Stefan Behnel added the comment:
It might be worth inlining a fast path of "_PyArg_CheckPositional()" that only
tests "nargs < min || nargs > max" (even via a macro), and then branches to the
full error checking and reporting code only if that fails. Determining
Stefan Behnel added the comment:
Nice! Well done, Serhiy!
--
___
Python tracker
<https://bugs.python.org/issue35582>
Stefan Behnel added the comment:
This is not a bug, it's normal, documented behaviour. The children are not
guaranteed to be available during the "start" event. Only the tag itself is
guaranteed to be there. The guarantee that the subtree is complete is only
given for the "
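A small sketch of the usual way to deal with this, following the ElementTree documentation's advice to look for "end" events when a fully populated element is needed. The sample document and the list name are made up for the example:

```python
import io
import xml.etree.ElementTree as ET

xml_data = b"<root><item><name>a</name></item><item><name>b</name></item></root>"

names_seen_at_end = []
for event, elem in ET.iterparse(io.BytesIO(xml_data), events=("start", "end")):
    if event == "start":
        # Only the tag is reliable here; children and text may be missing.
        continue
    if elem.tag == "item":
        # On "end", the subtree below this element has been parsed completely.
        names_seen_at_end.append(elem.findtext("name"))
```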
Change by Stefan Behnel :
--
title: XML.etree bug -> iterparse does not return the full subtree on "start"
events
type: performance -> behavior
___
Python tracker
<https://bugs.py
Stefan Behnel added the comment:
It seems right that a heap allocated object owns a reference to its (non-static)
type. But the mere fact that you had to adapt stdlib code makes it obvious that
this will also break existing user code out there. And such breakage is very
likely to remain
Stefan Behnel added the comment:
I think the REPL could, when it formats a stack trace for printing, check every
referenced source file if it's newer than its compiled .pyc (bytecode) file,
and insert a warning into the stack trace if that is the case.
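A rough sketch of such a check, assuming a plain modification time comparison is good enough. The function name is hypothetical, and a more careful version could read the source mtime recorded in the .pyc header instead:

```python
import importlib.util
import os


def source_is_newer_than_bytecode(source_path):
    """Return True if ``source_path`` was modified after its cached .pyc.

    Returns False when the source or the cached bytecode file is missing.
    """
    pyc_path = importlib.util.cache_from_source(source_path)
    try:
        return os.path.getmtime(source_path) > os.path.getmtime(pyc_path)
    except OSError:
        return False
```

A traceback formatter could call this for each referenced file name and add a warning line when it returns True.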
I don't see any use in doin
Stefan Behnel added the comment:
The overhead of calling PyErr_Occurred() is definitely negligible in something
as involved as PyDict_GetItemStringWithError(), where a mere key lookup first
has to fire up the string decoder on a C character buffer to create a new
string object and then
Stefan Behnel added the comment:
Victor asked me for a review, so, well, what should I say? The intention seems
right, and the patch also looks good to me.
From the top of my head, I wouldn't know any problems this would produce with
Cython specifically, although it's wort
Stefan Behnel added the comment:
Adding Christian Tismer to the nosy list since he might be able to elaborate on
the impact on PySide (which IIRC uses the stable ABI, and thus, heap types).
--
components: +Extension Modules, Interpreter Core -Library (Lib)
nosy: +Christian.Tismer
Stefan Behnel added the comment:
From Cython's point of view, the important fields in PyThreadState are the
tracing/profiling and exception related ones. We're not using anything else.
Users can explicitly opt out of the access to the exception fields by defining
Stefan Behnel added the comment:
Oh, and I forgot the new trashcan support. Cython will also start to use that
in its next release, so that adds the trashcan related attributes to the list.
https://github.com/cython/cython/pull/2842/files
Change by Stefan Behnel :
--
pull_requests: +11931
___
Python tracker
<https://bugs.python.org/issue35884>
Change by Stefan Behnel :
--
keywords: +patch
pull_requests: +11932
stage: -> patch review
___
Python tracker
<https://bugs.python.org/issue36012>
Stefan Behnel added the comment:
It turns out that "update_slot()" is always called, even when we are not
updating a slot name (which is always a special dunder-name). The linear search
for names in "update_slots()" is a huge waste of time here, and
short-circuiting out
Stefan Behnel added the comment:
Well … yes.
The exception fields are performance critical, and we try hard to make them
visible to the C compiler so that swapping around exception state eats up as
little CPU time as possible.
You could argue that profiling and tracing are less critical
Stefan Behnel added the comment:
Looks good to me (didn't test it).
Note that getchildren() is not deprecated in lxml because it's actually the
fastest way to build a list of the children. It's faster than list(element)
because it avoids the Python (C-level) iteration overhead
Stefan Behnel added the comment:
Thanks for bringing me in. The PoC implementation looks nice. Whether I'd like
to support this in Cython? Absolutely. Requires some work, though, since Cython
still doesn't implement PEP 489. But it shouldn't be hard, if I remember the
discus
Stefan Behnel added the comment:
Agreed that this should be added. I think the key should be None, though, not
the empty string. I attached a quick patch for lxml's corresponding file. It's
mostly the same for ET.
--
keywords: +patch
Added file: http://bugs.python.org
Changes by Stefan Behnel :
Removed file: http://bugs.python.org/file46906/lxml_elpath_empty_prefix.patch
___
Python tracker
<http://bugs.python.org/issue30485>
Stefan Behnel added the comment:
Patch replaced by pull request.
https://github.com/python/cpython/pull/1823
--
___
Python tracker
<http://bugs.python.org/issue30
New submission from Stefan Behnel:
I'm seeing doctest failures in Cython's test suite with Py3.7 due to the change
of an error message:
Failed example:
func1(arg=None)
Expected:
Traceback (most recent call last):
...
TypeError: func1() takes no keyword arg
Stefan Behnel added the comment:
I looked up this change again and was surprised that it still wasn't applied.
It feels to me that it makes sense already for reasons of consistency. Any time
frame for changing it? I'd like to use METH_FASTCALL in Cython in a
future
Stefan Behnel added the comment:
I do not see this as a matter of performance but as a matter of usability.
Basically, CPython could do just fine with just a single catch-all calling
convention that packs all pos/kw arguments into C arguments and passes them
over, leaving it entirely to the
Stefan Behnel added the comment:
> We already have a syntax for dict merging: {**d1, **d2}.
Which doesn't mean that "d1 + d2" isn't much more intuitive than this
special-character heavy version. It takes me a while to see the dict merge
under that heap of stars.
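For comparison, this is what the existing syntax does, next to the spelled-out copy-and-update form. The sample dicts are made up; "d1 + d2" itself is only the spelling discussed here and is not valid for dicts:

```python
d1 = {"a": 1, "b": 2}
d2 = {"b": 20, "c": 3}

# Unpack both dicts into a new dict literal; on duplicate keys, the
# right-hand dict wins.
merged = {**d1, **d2}

# Equivalent spelled-out form: copy the first dict, then update it.
merged_via_update = d1.copy()
merged_via_update.update(d2)
```

Neither form modifies d1 or d2.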
Stefan Behnel added the comment:
> should we also implement +/+= for sets?
The question is: what would that do? The same as '|=' ? That would be rather
confusing, I think. "|" (meaning: "or") seems a very natural operation for
sets, in the same way that
Stefan Behnel added the comment:
The feature seems reasonable to me and the patch looks good.
--
___
Python tracker
<https://bugs.python.org/issue36227>
Stefan Behnel added the comment:
lxml does not support the "default_namespace" option specifically (because its
tree model preserves namespace prefixes), but it generally makes all
(justifiable) serialisation options available to both tostring() and ET.write().
I think the same sh
Stefan Behnel added the comment:
While I do understand the interest in a bit more visual consistency (and,
lacking further input, I assume that this is the OP's "problem"), it really is
at best a purely visual improvement with the potential to break code and/or
tests out th