[issue29710] Incorrect representation caveat on bitwise operation docs

2017-12-02 Thread Tim Peters
Tim Peters added the comment: To answer the old accusation ;-), no, this isn't my wording. I _always_ explain that Python's integer bit operations act as if the integers were stored in 2's-complement representation but with an infinite number of sign bits. That's
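
A small illustration of the "two's complement with an infinite number of sign bits" model described above. The helper name `low_bits` is made up for this sketch; the operators themselves are standard Python.

```python
def low_bits(n, width):
    """Return the lowest `width` bits of n as a non-negative int."""
    return n & ((1 << width) - 1)

# -1 behaves like an endless string of 1 bits, so masking keeps eight of them.
minus_one_byte = low_bits(-1, 8)            # 255
# -5 is ...11111011 in this model.
minus_five_bits = format(low_bits(-5, 8), "08b")
# ~x flips every bit, including the infinitely many sign bits: ~x == -x - 1.
inverted = ~5
# Right shift copies the sign bit, so it rounds toward negative infinity.
shifted = -5 >> 1
# ...11111000 | ...00000011 == ...11111011
ored = -8 | 3
```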

[issue32382] Python mulitiprocessing.Queue fail to get according to correct sequence

2017-12-20 Thread Tim Peters
Tim Peters added the comment: First thing: the code uses the global name `outputer` for two different things, as the name of a module function and as the global name given to the Process object running that function. At least on Windows under Python 3.6.4 that confusion prevents the

[issue32475] Add ability to query number of buffered bytes available on buffered I/O

2018-01-01 Thread Tim Savannah
New submission from Tim Savannah : Hello! This is my first time submitting to Python bug tracker, so please bear with me if I miss/mess something. So a little bit of relevant background, I'm an avid python developer with many open-source projects. One of the projects I wrote and mainta

[issue32509] doctest syntax ambiguity between continuation line and ellipsis

2018-01-06 Thread Tim Peters
Tim Peters added the comment: Right, "..." immediately after a ">>>" line is taken to indicate a code continuation line, and there's no way to stop that short of rewriting the parser. The workaround you already found could be made more palatable if

[issue32509] doctest syntax ambiguity between continuation line and ellipsis

2018-01-06 Thread Tim Peters
Tim Peters added the comment: And I somehow managed to unsubscribe Steven :-( -- nosy: +steven.daprano ___ Python tracker <https://bugs.python.org/issue32

[issue32509] doctest syntax ambiguity between continuation line and ellipsis

2018-01-07 Thread Tim Peters
Tim Peters added the comment: Jason, an ellipsis will match an empty string. But if your expected output is: """ x... abcd ... """ you're asking for output that: - starts with "x" - followed by 0 or more of anything - FOLLOWED BY A NEWLINE (I t
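
The point about the newline can be checked directly with doctest's own output checker. This is only a sketch using `doctest.OutputChecker.check_output`; the strings are invented for illustration and are not taken from the original report.

```python
import doctest

checker = doctest.OutputChecker()
want = "x...\nabcd\n"

# "..." may match the empty string, so this output is accepted.
empty_ellipsis_ok = checker.check_output(want, "x\nabcd\n", doctest.ELLIPSIS)

# "..." may match any run of text as well.
long_ellipsis_ok = checker.check_output(want, "x and more\nabcd\n", doctest.ELLIPSIS)

# The newline after the ellipsis is literal: without it there is no match.
missing_newline_ok = checker.check_output(want, "xabcd\n", doctest.ELLIPSIS)
```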

[issue32509] doctest syntax ambiguity between continuation line and ellipsis

2018-01-07 Thread Tim Peters
Tim Peters added the comment: By the way, going back to your original problem, "the usual" solution to the problem that different platforms can list directories in different orders is simply to sort the listing yourself. That's pretty easy in Python ;-) Then your test can verify the h
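
A minimal sketch of the "sort the listing yourself" advice. The function name `sorted_listing` is hypothetical; `os.listdir` makes no promise about order, so the sort is what makes the result stable across platforms.

```python
import os

def sorted_listing(path):
    """Return directory entries in a platform-independent order."""
    return sorted(os.listdir(path))
```

A doctest can then print `sorted_listing(some_dir)` and compare against a fixed expected list.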

[issue33566] re.findall() dead locked whent the expected ending char not occur until end of string

2018-05-18 Thread Tim Peters
Tim Peters added the comment: Min, you need to give a complete example other people can actually run for themselves. Offhand, this part of the regexp (.|\s)* all by itself _can_ cause exponential-time behavior. You can run this for yourself: >>> import re >>> p = r"
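
A hedged sketch of why `(.|\s)*` is risky. Every space can be matched by either alternative, so when the overall match fails a backtracking engine may try on the order of 2**n combinations. The trailing `x` and the test string are assumptions made for this example, not the regexp from the report, and the input is kept short on purpose.

```python
import re

# Both alternatives match a space, so failure forces heavy backtracking.
ambiguous = re.compile(r"(.|\s)*x")
# A character class matches each character in exactly one way.
unambiguous = re.compile(r"[\s\S]*x")

text = " " * 14   # small n; each extra character roughly doubles the work for `ambiguous`

ambiguous_result = ambiguous.match(text)
unambiguous_result = unambiguous.match(text)
```

Both patterns fail to match here; only the first is expected to slow down sharply as the string grows.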

[issue33572] False/True as dictionary keys treated as integers

2018-05-18 Thread Tim Peters
Tim Peters added the comment: I expect these docs date back to when ints, longs, and floats were the only hashable language-supplied types for which mixed-type comparison could ever return True. They could stand some updates ;-) `fractions.Fraction` and `decimal.Decimal` are more language
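
To make the rule concrete: objects that compare equal must hash equal, so they collapse into one dictionary key regardless of type. A short sketch:

```python
from decimal import Decimal
from fractions import Fraction

# True == 1 == 1.0, so all three land on the same key.
# The first key object is kept; the last value wins.
d = {True: "bool", 1: "int", 1.0: "float"}

# The same holds for the other numeric types.
hashes = {hash(1), hash(True), hash(1.0), hash(Fraction(1)), hash(Decimal(1))}
d[Fraction(1)] = "fraction"
```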

[issue33579] calendar.timegm not always an inverse of time.gmtime

2018-05-19 Thread Tim Peters
Tim Peters added the comment: They both look wrong to me. Under 3.6.5 on Win10, `one` and `three` are the same. Python 3.6.5 (v3.6.5:f59c0932b4, Mar 28 2018, 17:00:18) [MSC v.1900 64 bit (AMD64)] on win32 time.struct_time(tm_year=2009, tm_mon=2, tm_mday=13, tm_hour=23, tm_min=31, tm_sec

[issue32832] doctest should support custom ps1/ps2 prompts

2018-05-27 Thread Tim Peters
Tim Peters added the comment: doctest was intended to deal with the standard CPython terminal shell. I'd like to keep it that way, but recognize that everyone wants to change everything into "a framework" ;-) How many other shells are there? As Sergey linked to, IPython alre

[issue32832] doctest should support custom ps1/ps2 prompts

2018-05-27 Thread Tim Peters
Tim Peters added the comment: Sergey, I understand that, but I don't care. The only people I've ever seen _use_ this are people writing an entirely different shell interface. They're rare. There's no value in complicating doctest to cater to theoretical use cases that

[issue32832] doctest should support custom ps1/ps2 prompts

2018-05-28 Thread Tim Peters
Tim Peters added the comment: You missed my point about IPython: forget "In/Out arrays, etc". What you suggest is inadequate for _just_ changing PS1/PS2 for IPython. Again, read their `parse()` function. They support _more than one_ set of PS1/PS2 conventions. So the code c

[issue21196] Name mangling example in Python tutorial

2018-06-06 Thread Tim Peters
Tim Peters added the comment: Berker Peksag's change (PR 5667) is very simple and, I think, helpful. -- nosy: +tim.peters ___ Python tracker <https://bugs.python.org/is

[issue33795] Memory leak in X509StoreContext class.

2018-06-07 Thread Tim Boddy
New submission from Tim Boddy : I noticed a memory leak /usr/lib/python3.5/site-packages/OpenSSL/crypto.py in the definition of the class X509StoreContext. The problem is that the __init__ function calls self._init() then later the function verify_certificate calls _init() again. In spite of

[issue33795] Memory leak in X509StoreContext class.

2018-06-07 Thread Tim Boddy
Tim Boddy added the comment: Thank you for helping me figure out the correct place to file this. Is there a quick way for me to evaluate in the future whether a particular file belongs to the standard library? -- resolution: not a bug -> status: closed ->

[issue33795] Memory leak in X509StoreContext class.

2018-06-07 Thread Tim Boddy
Tim Boddy added the comment: It looks as if the issue has been fixed here: https://github.com/pyca/pyopenssl/blob/179eb1d0917ddc1067d056127e08e952206e0e91/src/OpenSSL/crypto.py#L1790 Thanks again for pointing me to the correct place! I'm sorry that I accidentally changed the status

[issue33795] Memory leak in X509StoreContext class.

2018-06-07 Thread Tim Boddy
Tim Boddy added the comment: Would a leak associated with this stack trace fall within the domain of bugs.python.org? I do see site-packages on the stack in frames 1 and 2 but frame 3 is in /lib/libpython3.5m.so.1.0: 55904900 #0 __GI___libc_malloc (bytes=8) at malloc.c:2910 #1

[issue33795] Memory leak in X509StoreContext class.

2018-06-07 Thread Tim Boddy
Tim Boddy added the comment: I'm sorry that I changed the resolution by accident. -- resolution: not a bug -> third party

[issue33812] Different behavior between datetime.py and its C accelerator

2018-06-08 Thread Tim Peters
Tim Peters added the comment: The message isn't confusing - the definition of "aware" is confusing ;-) """ A datetime object d is aware if d.tzinfo is not None and d.tzinfo.utcoffset(d) does not return None. If d.tzinfo is None, or if d.tzinfo is not None but
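
The quoted definition translates almost word for word into code. The names `is_aware` and `NoOffset` are invented for this sketch.

```python
from datetime import datetime, timezone, tzinfo

def is_aware(d):
    """True if d is aware according to the documented definition."""
    return d.tzinfo is not None and d.tzinfo.utcoffset(d) is not None

class NoOffset(tzinfo):
    """A tzinfo whose utcoffset() returns None, so datetimes using it stay naive."""
    def utcoffset(self, dt):
        return None
    def dst(self, dt):
        return None
    def tzname(self, dt):
        return "no-offset"

plain = datetime(2000, 1, 1)
utc = datetime(2000, 1, 1, tzinfo=timezone.utc)
odd = datetime(2000, 1, 1, tzinfo=NoOffset())
```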

[issue33812] Different behavior between datetime.py and its C accelerator

2018-06-08 Thread Tim Peters
Tim Peters added the comment: I copy/pasted the definitions of "aware" and "naive" from the docs. Your TZ's .utcoffset() returns None, so, yes, any datetime using an instance of that for its tzinfo is naive. In print(datetime(2000,1,1).astimezone(timezone.utc))

[issue33814] exec() maybe has a memory leak

2018-06-09 Thread Tim Peters
Tim Peters added the comment: Dan, your bug report is pretty much incoherent ;-) This standard Stack Overflow advice applies here too: https://stackoverflow.com/help/mcve Guessing your complaint is that: sys.getrefcount(itertools.repeat) keeps increasing by 1 across calls to `leaks

[issue33812] Different behavior between datetime.py and its C accelerator

2018-06-09 Thread Tim Peters
Tim Peters added the comment: I'd call it a bug fix, but I'm really not anal about what people call things ;-)

[issue33089] Add multi-dimensional Euclidean distance function to the math module

2018-06-24 Thread Tim Peters
Tim Peters added the comment: Raymond, I'd say scaling is vital (to prevent spurious infinities), but complications beyond that are questionable, slowing things down for an improvement in accuracy that may be of no actual benefit. Note that your original "simple homework problem
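
A rough sketch of what "scaling" means here, assuming the simplest possible scheme (divide by the largest magnitude before squaring). It is not the algorithm that was eventually adopted for the math module, and both function names are hypothetical.

```python
import math

def naive_hypot(*coords):
    # Squaring large inputs overflows to infinity before the square root is taken.
    return math.sqrt(sum(c * c for c in coords))

def scaled_hypot(*coords):
    # Scale by the largest magnitude so every squared ratio is at most 1.0.
    biggest = max(map(abs, coords), default=0.0)
    if biggest == 0.0 or math.isinf(biggest):
        return float(biggest)
    ratios = (c / biggest for c in coords)
    return biggest * math.sqrt(sum(r * r for r in ratios))
```

With inputs such as `3e200` and `4e200` the naive version returns a spurious infinity, while the scaled one stays finite.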

[issue24567] random.choice IndexError due to double-rounding

2018-06-24 Thread Tim Peters
Tim Peters added the comment: There are a couple bug reports here that have been open for years, and it's about time we closed them. My stance: if any platform still exists on which "double rounding" is still a potential problem, Python _configuration_ should be changed to

[issue24567] random.choice IndexError due to double-rounding

2018-06-25 Thread Tim Peters
Tim Peters added the comment: Mark, do you believe that 32-bit Linux uses a different libm? One that fails if, e.g., SSE2 were used instead? I don't know, but I'd sure be surprised if it did. Very surprised - compilers have been notoriously unpredictable in exactly when

[issue24567] random.choice IndexError due to double-rounding

2018-06-26 Thread Tim Peters
Tim Peters added the comment: Mark, ya, I agree it's most prudent to let sleeping dogs lie. In the one "real" complaint we got (issue 24546) the cause was never determined - but double rounding was ruled out in that specific case, and no _plausible_ cause was identified (sho

[issue33973] HTTP request-line parsing splits on Unicode whitespace

2018-06-26 Thread Tim Burke
New submission from Tim Burke : This causes (admittedly, buggy) clients that would work with a Python 2 server to stop working when the server upgrades to Python 3. To demonstrate, run `python2.7 -m SimpleHTTPServer 8027` in one terminal and `curl -v http://127.0.0.1:8027/你好` in another

[issue33973] HTTP request-line parsing splits on Unicode whitespace

2018-06-26 Thread Tim Burke
Change by Tim Burke : -- keywords: +patch pull_requests: +7539 stage: -> patch review

[issue24567] random.choice IndexError due to double-rounding

2018-06-26 Thread Tim Peters
Tim Peters added the comment: [Mark] > If we do this, can we also persuade Guido to Pronounce that > Python implementations assume IEEE 754 format and semantics > for floating-point? On its own, I don't think a change to force 53-bit precision _on_ 754 boxes would justify that

[issue24567] random.choice IndexError due to double-rounding

2018-06-26 Thread Tim Peters
Tim Peters added the comment: Victor, look at Raymond's patch. In Python 3, `randrange()` and friends already use the all-integer `getrandbits()`. He's changing three other lines, where some variant of `int(random() * someinteger)` is being used in an inner loop for speed. Pres
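
For contrast with `int(random() * someinteger)`, here is a sketch of an all-integer approach built on `getrandbits()`. It is written from the description above, not copied from Raymond's patch or from the random module, and the name `randbelow` is used only for illustration.

```python
import random

def randbelow(n, getrandbits=random.getrandbits):
    """Return a random int in range(n) for n > 0, using only integer arithmetic."""
    k = n.bit_length()          # enough bits to cover every value below n
    r = getrandbits(k)
    while r >= n:               # reject out-of-range draws instead of rounding
        r = getrandbits(k)
    return r
```

Because no floating-point multiplication is involved, double rounding cannot push the result up to `n`.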

[issue24567] random.choice IndexError due to double-rounding

2018-06-26 Thread Tim Peters
Tim Peters added the comment: [Victor] > This method [shuffle()] has a weird API. What is > the point of passing a random function, > ... I proposed to deprecate this argument and remove it later. I don't care here. This is a bug report. Making backward-incompatible API

[issue34016] Bug in sort()

2018-07-01 Thread Tim Peters
Tim Peters added the comment: Lucas, as Mark said you're sorting _strings_ here, not sorting integers. Please study his reply. As strings, "10" is less than "9", because "1" is less than "9". >>> "10
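
The difference is easy to see side by side. The sample list is made up for this sketch:

```python
items = ["10", "9", "2"]

# Strings compare character by character, so "10" sorts before "2" and "9".
as_strings = sorted(items)

# Converting with key=int compares the numeric values instead.
as_numbers = sorted(items, key=int)
```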

[issue34100] Same integers in a tuple of constant literals are not merged

2018-07-11 Thread Tim Peters
Tim Peters added the comment: The language doesn't define anything about this - any program relying on accidental identity is in error itself. Still, it's nice if a code object's co_consts vector is as short as reasonably possible. That's a matter of pragmatics

[issue34100] Same integers in a tuple of constant literals are not merged

2018-07-11 Thread Tim Peters
Tim Peters added the comment: Fine, Serhiy, so reword it a tiny bit: it's nice if a code object's co_consts vector references as few distinct objects as possible. Still a matter of pragmatics, not of correctness.

[issue34109] Accumulator bug

2018-07-13 Thread Tim Peters
Tim Peters added the comment: ? I expect your code to return -1 about once per 7**4 = 2401 times, which would be about 400 times per million tries, which is what your output shows. If you start with -5, and randint(1, 7) returns 1 four times in a row, r5 is left at -5 + 4 = -1
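
The arithmetic behind that estimate, spelled out with exact fractions (variable names are for illustration only):

```python
from fractions import Fraction

# randint(1, 7) returns 1 with probability 1/7; four times in a row is (1/7)**4.
p_four_ones = Fraction(1, 7) ** 4

# Expected number of such outcomes in a million independent tries.
expected_per_million = float(p_four_ones * 1_000_000)
```

That works out to roughly 416 per million, consistent with the "about 400" quoted above.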

[issue29710] Incorrect representation caveat on bitwise operation docs

2018-07-14 Thread Tim Peters
Tim Peters added the comment: Nick, that seems a decent compromise. "Infinite string of sign bits" is how Guido & I both thought of it when the semantics of longs were first defined, and others in this report apparently find it natural enough too. It also applies to all 6

[issue29710] Incorrect representation caveat on bitwise operation docs

2018-07-15 Thread Tim Peters
Tim Peters added the comment: Well, all 6 operations "are calculated as though carried out in two's complement with an infinite number of sign bits", so I'd float that part out of the footnote and into the main text. When, e.g., you're thinking of ints _as_ bit

[issue29710] Incorrect representation caveat on bitwise operation docs

2018-07-16 Thread Tim Peters
Tim Peters added the comment: Ya, Mark's got a point there. Perhaps s/the internal/a finite two's complement/ ?

[issue34168] RAM consumption too high using concurrent.futures (Python 3.7 / 3.6 )

2018-07-20 Thread Tim Peters
Tim Peters added the comment: If your `bucket` has 30 million items, then for element in bucket: executor.submit(kwargs['function']['name'], element, **kwargs) is going to create 30 million Future objects (and all the under-the-covers objects needed to mana
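
One common way to avoid holding millions of Future objects at once is to cap how many are outstanding. The sketch below is an assumption about how that could look, not advice quoted from the thread; `run_bounded`, `max_workers`, and `max_pending` are hypothetical names.

```python
import concurrent.futures as cf
import itertools

def run_bounded(func, items, max_workers=4, max_pending=100):
    """Yield results of func(item) while keeping at most max_pending futures alive."""
    it = iter(items)
    with cf.ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Prime the pool with a limited number of tasks.
        pending = {executor.submit(func, x) for x in itertools.islice(it, max_pending)}
        while pending:
            done, pending = cf.wait(pending, return_when=cf.FIRST_COMPLETED)
            for fut in done:
                yield fut.result()
            # Top the pool back up: one new task per finished one.
            for x in itertools.islice(it, len(done)):
                pending.add(executor.submit(func, x))
```

Results arrive in completion order, not submission order, so callers that need ordering have to carry an index along.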

[issue34168] RAM consumption too high using concurrent.futures (Python 3.7 / 3.6 )

2018-07-20 Thread Tim Peters
Tim Peters added the comment: Note that you can consume multiple gigabytes of RAM with this simpler program too, and for the same reasons: """ import concurrent.futures as cf bucket = range(30_000_000) def _dns_query(target): from time import sleep sleep(0.1) def

[issue34180] bool(Q) always return True for a priority queue Q

2018-07-22 Thread Tim Peters
Tim Peters added the comment: I'm sure Guido designed the API to discourage subtly bug-ridden code relying on the mistaken belief that it _can_ know the queue's current size. In the general multi-threaded context Queue is intended to be used, the only thing `.qsize()`'s cal
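
A sketch of the usual alternative to testing a queue's truth value or size: attempt the operation and handle `queue.Empty`. The helper name `drain` is invented here.

```python
import queue

def drain(q):
    """Remove and return everything currently retrievable from q."""
    items = []
    while True:
        try:
            items.append(q.get_nowait())
        except queue.Empty:
            return items

pq = queue.PriorityQueue()
# Queue objects define neither __bool__ nor __len__, so they are always truthy.
empty_queue_is_truthy = bool(pq)
for value in (3, 1, 2):
    pq.put(value)
drained = drain(pq)
```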

[issue34195] test_nt_helpers fails with case difference in drive letter

2018-07-23 Thread Tim Golden
New submission from Tim Golden : From a fresh build on Win10 with VS2017: python -munittest -v test.test_ntpath.TestNtpath.test_nt_helpers gives the following error: == FAIL: test_nt_helpers (test.test_ntpath.TestNtp

[issue34195] test_nt_helpers fails with case difference in drive letter

2018-07-23 Thread Tim Golden
Tim Golden added the comment: import nt, sys; assert sys.executable.startswith(nt._getvolumepathname(sys.executable)) This code fails only when run from the python.bat as created by pcbuild\build.bat. The obvious difference is that the batch file sets PYTHONHOME which, presumably, is used

[issue34195] test_nt_helpers fails with case difference in drive letter

2018-07-23 Thread Tim Golden
Tim Golden added the comment: Thanks, @eryksun. Whatever the reason, it's consistently failing in the way I describe. A case-insensitive test is obviously good for that and for the other reasons you give, so I'll patch the t

[issue34195] test_nt_helpers fails with case difference in drive letter

2018-07-23 Thread Tim Golden
Tim Golden added the comment: @eryksun almost idly I ran your ctypes code in the built interpreter. As written, it produces a lower-case c:\\ as yours did. But... Running Debug|Win32 interpreter... Python 3.8.0a0 (heads/master:7a3056f, Jul 23 2018, 08:23:33) [MSC v.1912 32 bit (Intel)] on

[issue34195] test_nt_helpers fails with case difference in drive letter

2018-07-23 Thread Tim Golden
Tim Golden added the comment: I think I've got down to the determining factor. For info: PYTHONHOME has nothing to do with it: the same thing happens if I cd into PCBuild\win32 and run python_d.exe directly For historical reasons the directory in which I'm building (c:\work-in-pr

[issue29710] Incorrect representation caveat on bitwise operation docs

2018-07-23 Thread Tim Peters
Tim Peters added the comment: @CuriousLearner, does the PR also include Nick's first suggested change? Here: """ 1. Replace the opening paragraph of https://docs.python.org/3/library/stdtypes.html#bitwise-operations-on-integer-types (the one I originally quoted whe

[issue29710] Incorrect representation caveat on bitwise operation docs

2018-07-23 Thread Tim Peters
Tim Peters added the comment: Nick suggested two changes on 2018-07-15 (look above). Mark & I agreed about the first change, so it wasn't mentioned again after that. All the rest has been refining the second change.

[issue34195] test_nt_helpers fails with case difference in drive letter

2018-07-24 Thread Tim Golden
Change by Tim Golden : -- keywords: +patch pull_requests: +7970 stage: needs patch -> patch review

[issue34195] test_nt_helpers fails with case difference in drive letter

2018-07-25 Thread Tim Golden
Tim Golden added the comment: Test fixed to ignore case and volume differences between paths -- resolution: -> fixed stage: patch review -> resolved status: open -> closed

[issue34195] test_nt_helpers fails with case difference in drive letter

2018-07-25 Thread Tim Golden
Tim Golden added the comment: New changeset ff64add8d4be2e37c552ba702f629b0b6639cd33 by Tim Golden in branch 'master': bpo-34195: Fix case-sensitive comparison in test_nt_helpers (GH-8448) https://github.com/python/cpython/commit/ff64add8d4be2e37c552ba702f629b

[issue34239] Convert test_bz2 to use tempfile

2018-07-26 Thread Tim Golden
New submission from Tim Golden : test_bz2 currently uses the test.support.TESTFN functionality which creates a temporary file local to the test directory named around the pid. This can give rise to race conditions where tests are competing with each other to delete and recreate the file
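
A sketch of the general idea, assuming a per-test temporary directory from the `tempfile` module. This is not the code from the actual patch; `roundtrip` and the file name are made up.

```python
import os
import tempfile

def roundtrip(data):
    """Write data to a private temporary file and read it back."""
    # Each call gets its own uniquely named directory, so concurrent
    # tests never compete for the same path.
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "payload.bin")
        with open(path, "wb") as f:
            f.write(data)
        with open(path, "rb") as f:
            return f.read()
```

The directory and its contents are removed when the `with` block exits, which takes the place of the manual unlink step.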

[issue34239] Convert test_bz2 to use tempfile

2018-07-26 Thread Tim Golden
Change by Tim Golden : -- keywords: +patch pull_requests: +8007 stage: -> patch review

[issue34240] Convert test_mmap to use tempfile

2018-07-26 Thread Tim Golden
New submission from Tim Golden : test_mmap currently uses the test.support.TESTFN functionality which creates a temporary file local to the test directory named around the pid. This can give rise to race conditions where tests are competing with each other to delete and recreate the file

[issue34240] Convert test_mmap to use tempfile

2018-07-26 Thread Tim Golden
Change by Tim Golden : -- keywords: +patch pull_requests: +8008 stage: -> patch review

[issue34239] Convert test_bz2 to use tempfile

2018-07-26 Thread Tim Golden
Tim Golden added the comment: New changeset 6a62e1d365934de82ff7c634981b3fbf218b4d5f by Tim Golden in branch 'master': bpo-34239: Convert test_bz2 to use tempfile (#8485) https://github.com/python/cpython/commit/6a62e1d365934de82ff7c634981b3f

[issue34240] Convert test_mmap to use tempfile

2018-07-27 Thread Tim Golden
Tim Golden added the comment: Thanks for the information, eryksun. For the moment, I can only say with a fair degree of certainty that using the tempfile functions as I have in test_bz2 & test_mmap appears to solve the issue which is repeatably if intermittently present without that ch

[issue34256] Python treats ASCII record seperator ('\x1e as a newline

2018-07-28 Thread Tim McNamara
New submission from Tim McNamara : Hello, I apologize if this is expected behavior, however it doesn't appear to be documented. >>> "single\x1eline\x1estring".splitlines() ['single', 'line', 'string'] -- message

[issue34256] Python treats ASCII record seperator ('\x1e') as a newline

2018-07-28 Thread Tim McNamara
Tim McNamara added the comment: Hello, I apologize if this is expected behavior, however it doesn't appear to be documented. >>> "single\x1eline\x1estring".splitlines() ['single', 'line', 'string'] The glossary refers to the universa
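
For comparison, `str.splitlines()` treats the record separator as a line boundary, while splitting on `"\n"` explicitly does not:

```python
text = "single\x1eline\x1estring"

# splitlines() recognizes several boundaries besides \n and \r, including \x1e.
by_splitlines = text.splitlines()

# An explicit separator only splits where that separator occurs.
by_newline = text.split("\n")
```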

[issue34259] Improve docstring of list.sort

2018-07-28 Thread Tim Hoffmann
New submission from Tim Hoffmann : The current docstring of list.sort is just "Stable sort *IN PLACE*." This is missing a description of the arguments key and reverse. Also a short explanation of stable and in-place would be helpful for less experienced users. -- messag

[issue34261] Add description to clinic.py

2018-07-28 Thread Tim Hoffmann
New submission from Tim Hoffmann : When trying to update a docstring of a CPython builtin, I had problems finding out what Argument Clinic actually does. First, I looked at the devguide, which does only mention that the clinic exists, but not what it does or how it's used. Next, I

[issue20177] Derby #8: Convert 28 sites to Argument Clinic across 2 files

2018-07-28 Thread Tim Hoffmann
Change by Tim Hoffmann : -- pull_requests: +8052 stage: needs patch -> patch review

[issue34239] Convert test_bz2 to use tempfile

2018-07-28 Thread Tim Golden
Change by Tim Golden : -- resolution: -> fixed stage: patch review -> resolved status: open -> closed

[issue33113] Query performance is very low and can even lead to denial of service

2018-07-28 Thread Tim Peters
Tim Peters added the comment: Note: if you found a regexp like this _in_ the Python distribution, then a bug report would be appropriate. It's certainly possible to write regexps that can suffer catastrophic backtracking, and we've repaired a few of those, over the years, th

[issue33566] re.findall() dead locked whent the expected ending char not occur until end of string

2018-07-28 Thread Tim Peters
Tim Peters added the comment: Closing as not-a-bug - not enough info to reproduce, but the regexp looked prone to exponential-time backtracking to both MRAB and me, and there's been no response to requests for more info. -- components: +Regular Expressions nosy: +ezio.me

[issue34291] UnboundLocalError raised on call to global

2018-07-31 Thread Tim Peters
Tim Peters added the comment: Yes, the assignment does "hide the global definition of g". But this determination is made at compile time, not at run time: an assignment to `g` _anywhere_ inside `f()` makes _every_ appearance of `g` within `f()` local to `f`. -- nosy: +
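
A minimal reproduction of the compile-time rule. The function bodies are invented for this sketch and are not the reporter's code.

```python
g = 1

def f():
    value = g      # raises UnboundLocalError: g is local throughout f()
    g = 2          # this assignment is what makes g local everywhere in f()
    return value

def f_fixed():
    global g       # declare that g refers to the module-level name
    value = g
    g = 2
    return value
```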

[issue34376] Improve accuracy of math.hypot() and math.dist()

2018-08-10 Thread Tim Peters
Tim Peters added the comment: Not that it matters: "ulp" is a measure of absolute error, but the script is computing some notion of relative error and _calling_ that "ulp". It can understate the true ulp error by up to a factor of 2 (the "wobble" of base 2 f

[issue34376] Improve accuracy of math.hypot() and math.dist()

2018-08-11 Thread Tim Peters
Tim Peters added the comment: Thanks for doing the "real ulp" calc, Raymond! It was intended to make the Kahan gimmick look better, and it succeeded ;-) I don't personally care whether adding 10K things ends up with 50 ulp error, but to each their own. Division can be most

[issue34376] Improve accuracy of math.hypot() and math.dist()

2018-08-12 Thread Tim Peters
Tim Peters added the comment: Sure, if we make more assumptions. For 754 doubles, e.g., scaling isn't needed if `1e-100 < absmax < 1e100` unless there are a truly ludicrous number of points. Because, if that holds, the true sum is between 1e-200 and number_of_points*1e200, bo

[issue34397] remove redundant overflow checks in tuple and list implementations

2018-08-14 Thread Tim Peters
Tim Peters added the comment: I agree there's pointless code now, but don't understand why the patch replaces it with mysterious asserts. For example, what's the point of this? assert(Py_SIZE(a) <= PY_SSIZE_T_MAX / sizeof(PyObject*)); assert(Py_SIZE(b) <= PY_SSIZE_T_

[issue34397] remove redundant overflow checks in tuple and list implementations

2018-08-14 Thread Tim Peters
Tim Peters added the comment: Bah - the relevant thing to assert is really assert((size_t)Py_SIZE(a) + (size_t)Py_SIZE(b) <= (size_t)PY_SSIZE_T_MAX); C sucks ;-)

[issue34561] Replace list sorting merge_collapse()?

2018-08-31 Thread Tim Peters
New submission from Tim Peters : The invariants on the run-length stack are uncomfortably subtle. There was a flap a while back when an attempt at a formal correctness proof uncovered that the _intended_ invariants weren't always maintained. That was easily repaired (as the resear

[issue34561] Replace list sorting merge_collapse()?

2018-09-01 Thread Tim Peters
Tim Peters added the comment: The attached runstack.py models the relevant parts of timsort's current merge_collapse and the proposed 2-merge. Barring conceptual or coding errors, they appear to behave much the same with respect to "total cost", with no clear overall win

[issue34572] C unpickling bypasses import thread safety

2018-09-03 Thread Tim Burgess
New submission from Tim Burgess : Retrieving and using a module directly from sys.modules (from C in this case) leads to a race condition where the module may be importing on another thread but has not yet been initialised. For slow filesystems or large modules (e.g. numpy) this seems to

[issue34572] C unpickling bypasses import thread safety

2018-09-03 Thread Tim Burgess
Change by Tim Burgess : -- keywords: +patch pull_requests: +8509 stage: -> patch review

[issue34561] Replace list sorting merge_collapse()?

2018-09-03 Thread Tim Peters
Tim Peters added the comment: Looks like all sorts of academics are exercised over the run-merging order now. Here's a paper that's unhappy because timsort's strategy, and 2-merge too, aren't always near-optimal with respect to the entropy of the distribution of

[issue34561] Replace list sorting merge_collapse()?

2018-09-04 Thread Tim Peters
Tim Peters added the comment: "Galloping" is the heart & soul of Python's sorting algorithm. It's explained in detail here: https://github.com/python/cpython/blob/master/Objects/listsort.txt The Java fork of the sorting code has had repeated bugs due to reducing

[issue34561] Replace list sorting merge_collapse()?

2018-09-04 Thread Tim Peters
Tim Peters added the comment: A new version of the file models a version of the `powersort` merge ordering too. It clearly dominates timsort and 2-merge in all cases tried, for this notion of "cost". Against it, its code is much more complex, and the algorithm is very far fro

[issue34590] "Logging HOWTO" should share an example of best practices for using logging in a library

2018-09-06 Thread Tim Lesher
Change by Tim Lesher : -- nosy: +tlesher

[issue34561] Replace list sorting merge_collapse()?

2018-09-06 Thread Tim Peters
Tim Peters added the comment: The notion of cost is that merging runs of lengths A and B has "cost" A+B, period. Nothing to do with logarithms. Merge runs of lengths 1 and 1000, and it has cost 1001. They don't care about galloping, only about how the order in which merges
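
A toy model of that cost measure, to show how merge order changes the total. It is not taken from runstack.py; `total_cost` and its `picks` argument (the index of the left run at each step) are hypothetical.

```python
def total_cost(run_lengths, picks):
    """Sum of A+B over a sequence of merges of adjacent runs."""
    runs = list(run_lengths)
    cost = 0
    for i in picks:
        merged = runs[i] + runs[i + 1]   # merging lengths A and B costs A+B
        cost += merged
        runs[i:i + 2] = [merged]         # the merged run replaces both inputs
    return cost

one_merge = total_cost([1, 1000], [0])            # 1001, as in the example above
big_first = total_cost([1000, 1, 1], [0, 0])      # (1000+1) + (1001+1)
small_first = total_cost([1000, 1, 1], [1, 0])    # (1+1) + (1000+2)
```

Merging the two short runs first roughly halves the total here, which is the kind of difference the merge-ordering strategies are competing over.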

[issue34561] Replace list sorting merge_collapse()?

2018-09-06 Thread Tim Peters
Tim Peters added the comment: No, there's no requirement that run lengths on the stack be ordered in any way by magnitude. That's simply one rule timsort uses, as well as 2-merge and various other schemes discussed in papers. powersort has no such rule, and that's fine. Re

[issue34691] _contextvars missing in x64 master branch Windows build?

2018-09-14 Thread Tim Peters
New submission from Tim Peters : Using Visual Studio 2017 to build the current master branch of Python (something I'm trying for the first time in about two years - maybe I'm missing something obvious!), with the x64 target, under both the Release and Debug builds I get a Python

[issue34698] urllib.request.Request.set_proxy doesn't (necessarily) replace type

2018-09-15 Thread Tim Burke
New submission from Tim Burke : Not sure if this is a documentation or behavior bug, but... the docs for urllib.request.Request.set_proxy (https://docs.python.org/3/library/urllib.request.html#urllib.request.Request.set_proxy) say > Prepare the request by connecting to a proxy server. *

[issue34561] Replace list sorting merge_collapse()?

2018-09-15 Thread Tim Peters
Tim Peters added the comment: New version of runstack.py. - Reworked code to reflect that Python's sort uses (start_offset, run_length) pairs to record runs. - Two unbounded-integer power implementations, one using a loop and the other division. The loop version implies that, in Pyt

[issue34561] Replace list sorting merge_collapse()?

2018-09-16 Thread Tim Peters
Tim Peters added the comment: Another runstack.py adds a bad case for 2-merge, and an even worse (percentage-wise) bad case for timsort. powersort happens to be optimal for both. So they all have contrived bad cases now. powersort's bad cases are the least bad. So far ;-) But I e

[issue34691] _contextvars missing in x64 master branch Windows build?

2018-09-17 Thread Tim Peters
Tim Peters added the comment: FYI, I bet I didn't see a problem with the Win32 target because I followed instructions ;-) and did my first build using build.bat. Using that for the x64 target too makes the problem go away.

[issue34659] Inconsistency between functools.reduce & itertools.accumulate

2018-09-17 Thread Tim Peters
Tim Peters added the comment: Ya, I care: `None` was always intended to be an explicit way to say "nothing here", and using unique non-None sentinels instead for that purpose is needlessly convoluted. `initial=None` is perfect. But then I'm old & in the way ;
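
For reference, this is how the `initial` keyword behaves in `itertools.accumulate` as shipped in Python 3.8 and later, where `None` means "no initial value":

```python
from itertools import accumulate

data = [1, 2, 3]

no_initial = list(accumulate(data))
explicit_none = list(accumulate(data, initial=None))   # same as omitting it
with_initial = list(accumulate(data, initial=100))     # one extra leading element
```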

[issue34751] Hash collisions for tuples

2018-09-20 Thread Tim Peters
Tim Peters added the comment: @jdemeyer, please define exactly what you mean by "Bernstein hash". Bernstein has authored many hashes, and none on his current hash page could possibly be called "simple": https://cr.yp.to/hash.html If you're talking about the

[issue34751] Hash collisions for tuples

2018-09-20 Thread Tim Peters
Tim Peters added the comment: Ah! I see that the original SourceForge bug report got duplicated on this tracker, as PR #942952. So clicking on that is a lot easier than digging thru the mail archive. One message there noted that replacing xor with addition made collision statistics much

[issue34751] Hash collisions for tuples

2018-09-20 Thread Tim Peters
Change by Tim Peters : -- nosy: +ned.deily

[issue34751] Hash collisions for tuples

2018-09-20 Thread Tim Peters
Tim Peters added the comment: @jdemeyer, you didn't submit a patch, or give any hint that you _might_. It _looked_ like you wanted other people to do all the work, based on a contrived example and a vague suggestion. And we already knew from history that "a simple Bernstein has

[issue34751] Hash collisions for tuples

2018-09-20 Thread Tim Peters
Tim Peters added the comment: You said it yourself: "It's not hard to come up with ...". That's not what "real life" means. Here: >>> len(set(hash(1 << i) for i in range(100_000))) 61 Wow! Only 61 hash codes across 100 thousand distinct int
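
Where the 61 comes from, as a sketch: CPython reduces int hashes modulo the prime `sys.hash_info.modulus`, which is 2**61 - 1 on typical 64-bit builds, so powers of two cycle through only 61 distinct hash values. On a build with a different modulus the count differs accordingly.

```python
import sys

modulus = sys.hash_info.modulus          # 2**61 - 1 on typical 64-bit builds

# hash(2**i) == 2**i % modulus for non-negative ints.
hashes_match_formula = all(hash(1 << i) == pow(2, i, modulus) for i in range(500))

# Since 2**bit_length is congruent to 1 modulo the modulus, values repeat with that period.
distinct = len({hash(1 << i) for i in range(1000)})
```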

[issue34751] Hash collisions for tuples

2018-09-21 Thread Tim Peters
Tim Peters added the comment: For me, it's largely because you make raw assertions with extreme confidence that the first thing you think of off the top of your head can't possibly make anything else worse. When it turns out it does make some things worse, you're equally con

[issue34751] Hash collisions for tuples

2018-09-21 Thread Tim Peters
Tim Peters added the comment: Oops! """ "j odd implies j^(-2) == -j, so that m*(j^(-2)) == -m" """ The tail end should say "m*(j^(-2)) == -m*j" instead.
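
The corrected identity can be checked mechanically. Here `^` is Python's xor, and the multiplier `m` is an arbitrary value chosen for this sketch:

```python
odd_values = range(-99, 100, 2)

# Xor with -2 (...11110) flips every bit except the lowest; for odd j that gives -j.
xor_negates = all((j ^ -2) == -j for j in odd_values)

m = 1000003   # arbitrary multiplier, for illustration only
product_identity = all(m * (j ^ -2) == -m * j for j in odd_values)
```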

[issue34561] Replace list sorting merge_collapse()?

2018-09-21 Thread Tim Peters
Tim Peters added the comment: Thank you, Vincent! I very much enjoyed - and appreciated - your paper I referenced at the start. Way back when, I thought I had a proof of O(N log N), but never wrote it up because some details weren't convincing - even to me ;-) . Then I had to move

[issue34751] Hash collisions for tuples

2018-09-21 Thread Tim Peters
Tim Peters added the comment: >> Why do you claim the original was "too small"? Too small for >> what purpose? > If the multiplier is too small, then the resulting hash values are > small too. This causes collisions to appear for smaller numbers: All right! An

[issue34397] remove redundant overflow checks in tuple and list implementations

2018-09-21 Thread Tim Peters
Tim Peters added the comment: Because the behavior of signed integer overflow isn't defined in C. Picture a 3-bit integer type, where the maximum value of the signed integer type is 3. 3+3 has no defined result. Cast them to the unsigned flavor of the integer type, though, and the r
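
The 3-bit picture can be modelled in Python by masking to simulate unsigned wraparound. This is only a model of the C assert, with hypothetical names; it is not C and not the patch itself.

```python
BITS = 3
SIGNED_MAX = (1 << (BITS - 1)) - 1     # 3: largest value of the signed 3-bit type
UNSIGNED_MASK = (1 << BITS) - 1        # 7: unsigned arithmetic wraps modulo 2**BITS

def sum_fits_signed(a, b):
    """Model of: (unsigned)a + (unsigned)b <= (unsigned)SIGNED_MAX."""
    assert 0 <= a <= SIGNED_MAX and 0 <= b <= SIGNED_MAX
    # Both operands are at most 3, so the unsigned sum is at most 6 and never wraps.
    unsigned_sum = (a + b) & UNSIGNED_MASK
    return unsigned_sum <= SIGNED_MAX
```

Because the unsigned sum of two non-negative signed values always fits, the comparison is well defined even when the signed addition would not be.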

[issue34751] Hash collisions for tuples

2018-09-22 Thread Tim Peters
Tim Peters added the comment: So you don't know of any directly relevant research either. "Offhand I can't see anything wrong" is better than nothing, but very far from "and we know it will be OK because [see references 1 and 2]". That Bernstein's DJBX3

[issue34751] Hash collisions for tuples

2018-09-22 Thread Tim Peters
Tim Peters added the comment: I strive not to believe anything in the absence of evidence ;-) FNV-1a supplanted Bernstein's scheme in many projects because it works better. Indeed, Python itself used FNV for string hashing before the security wonks got exercised over collision attacks
