[issue45054] json module should issue warning about duplicate keys

2021-08-30 Thread Kevin Mills


New submission from Kevin Mills:

The json module will allow the following without complaint:

import json
d1 = {1: "fromstring", "1": "fromnumber"}
string = json.dumps(d1)
print(string)
d2 = json.loads(string)
print(d2)

And it prints:

{"1": "fromstring", "1": "fromnumber"}
{'1': 'fromnumber'}

This would be extremely confusing to anyone who doesn't already know that JSON 
keys have to be strings. Not only does `d1 != d2` (which the documentation does 
mention as a possibility after a round trip through JSON), but `len(d1) != 
len(d2)` and `d1['1'] != d2['1']`, even though '1' is in both.

I suggest that if json.dump or json.dumps notices that it is producing a JSON 
document with duplicate keys, it should issue a warning. Similarly, if 
json.load or json.loads notices that it is reading a JSON document with 
duplicate keys, it should also issue a warning.
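
For the reading side, something close to this can already be approximated with the existing `object_pairs_hook` parameter of json.loads. The following is only a rough sketch of the idea, not a proposed implementation; the helper name `warn_on_duplicate_keys` is made up for illustration.

```python
import json
import warnings


def warn_on_duplicate_keys(pairs):
    """Build a dict from decoded (key, value) pairs, warning on repeats.

    Hypothetical helper: the json module calls it once per JSON object.
    """
    result = {}
    for key, value in pairs:
        if key in result:
            warnings.warn(f"duplicate JSON key: {key!r}")
        result[key] = value  # last value wins, as with plain json.loads
    return result


d2 = json.loads('{"1": "fromstring", "1": "fromnumber"}',
                object_pairs_hook=warn_on_duplicate_keys)
print(d2)
```

The hook is invoked for nested objects as well, so duplicates at any depth would be reported.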

--
components: Library (Lib)
messages: 400678
nosy: Zeturic
priority: normal
severity: normal
status: open
title: json module should issue warning about duplicate keys
type: enhancement
versions: Python 3.11

___
Python tracker 
<https://bugs.python.org/issue45054>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue45054] json module should issue warning about duplicate keys

2021-08-31 Thread Kevin Mills


Kevin Mills added the comment:

Sorry to the people I'm pinging, but I just noticed the initial dictionary in 
my example code is wrong. I figured I should fix it before anybody tested it 
and got confused about it not matching up with my description of the results.

It should've been:

import json
d1 = {"1": "fromstring", 1: "fromnumber"}
string = json.dumps(d1)
print(string)
d2 = json.loads(string)
print(d2)
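
For the writing side, a check along these lines might work as a stopgap. This is a sketch under assumptions: `dumps_checked` is a hypothetical wrapper, it only inspects the top-level dictionary, and it assumes default json.dumps options.

```python
import json
import warnings


def dumps_checked(obj, **kwargs):
    """Hypothetical wrapper: warn when top-level keys collide once serialized."""
    if isinstance(obj, dict):
        seen = set()
        for key in obj:
            try:
                # Round-trip a one-item dict to learn how json spells this key.
                text = next(iter(json.loads(json.dumps({key: None}))))
            except TypeError:
                continue  # key type json cannot serialize; let dumps report it
            if text in seen:
                warnings.warn(f"several keys serialize to JSON key {text!r}")
            seen.add(text)
    return json.dumps(obj, **kwargs)


d1 = {"1": "fromstring", 1: "fromnumber"}
print(dumps_checked(d1))
```

Nested dictionaries are not checked here; a fuller version would need to walk the whole structure.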

--

___
Python tracker 
<https://bugs.python.org/issue45054>
___



[issue43574] Regression in overallocation for literal list initialization in v3.9+

2021-09-14 Thread Kevin Mills


Change by Kevin Mills:


--
nosy: +Zeturic

___
Python tracker 
<https://bugs.python.org/issue43574>
___



[issue41972] bytes.find consistently hangs in a particular scenario

2020-10-07 Thread Kevin Mills


New submission from Kevin Mills:

Sorry for the vague title. I'm not sure how to succinctly describe this issue.

The following code:

```
with open("data.bin", "rb") as f:
    data = f.read()

base = 15403807 * b'\xff'
longer = base + b'\xff'

print(data.find(base))
print(data.find(longer))
```

Always hangs on the second call to find.

It might complete eventually, but I've left it running and never seen it do so. 
Because of the structure of data.bin, it should find the same position as the 
first call to find.

The first call to find completes and prints near instantly, which makes the 
pathological performance of the second (which is only searching for one b"\xff" 
more than the first) even more mystifying.

I attempted to upload the data.bin file I was working with as an attachment 
here, but it failed multiple times. I assume it's too large for an attachment; 
it's a 32MiB file consisting only of 00 bytes and FF bytes.

Since I couldn't attach it, I uploaded it to a gist. I hope that's okay.

https://gist.github.com/Zeturic/7d0480a94352968c1fe92aa62e8adeaf

I wasn't able to trigger the pathological runtime behavior with other sequences 
of bytes, which is why I uploaded the file in the first place. For example, 
randomly generated data does not trigger it.

I've verified that this happens on multiple versions of CPython (as well as 
PyPy) and on multiple computers / operating systems.
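
For anyone who wants to compare the two calls, a small timing helper along these lines may be useful. It is only a sketch: `timed_find` is a made-up name, and the synthetic buffer below is a small stand-in that is not expected to reproduce the slowdown seen with data.bin.

```python
import time


def timed_find(haystack: bytes, needle: bytes):
    """Return (index, elapsed_seconds) for haystack.find(needle)."""
    start = time.perf_counter()
    index = haystack.find(needle)
    return index, time.perf_counter() - start


# Small synthetic stand-in for data.bin (assumption: the real file's layout
# differs; this only shows the measurement, not the hang).
data = b"\x00" * 1024 + b"\xff" * 4096 + b"\x00" * 1024
base = 2048 * b"\xff"
longer = base + b"\xff"

for name, needle in (("base", base), ("longer", longer)):
    index, seconds = timed_find(data, needle)
    print(f"{name}: index={index} in {seconds:.6f}s")
```

To test against the real file, `data` would be replaced by the contents of data.bin and the needles by the ones from the snippet above.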

--
messages: 378197
nosy: Zeturic
priority: normal
severity: normal
status: open
title: bytes.find consistently hangs in a particular scenario
type: performance
versions: Python 3.8, Python 3.9

___
Python tracker 
<https://bugs.python.org/issue41972>
___



[issue29671] Add function to gc module to check if any reference cycles have been reclaimed.

2017-02-27 Thread Kevin Mills

New submission from Kevin Mills:

The intro paragraph for the gc module's documentation says:

> Since the collector supplements the reference counting already used in 
> Python, you can disable the collector if you are sure your program does not 
> create reference cycles.

How would you ever be sure of that?

While you can prevent reference cycles in your own code, what about your 
dependencies? You'd have to look through the code of all of your dependencies 
and transitive dependencies (including the standard library) to verify that 
none introduce reference cycles. And then you'd have to redo that work when 
upgrading any of them.

I propose adding a function to the gc module that returns True if the gc has 
reclaimed at least one reference cycle in the course of the current program's 
execution.

With that, a program could, before it exits, force a collection and then check 
whether any reference cycles were found over the program's lifetime. The 
programmer could then use that information to decide whether they can safely 
turn off the gc.

Obviously that wouldn't guarantee that you could safely turn off the gc (it's 
possible that garbage with reference cycles is created on other runs of the 
program), but it would at least be some information.

--
messages: 288668
nosy: Kevin Mills
priority: normal
severity: normal
status: open
title: Add function to gc module to check if any reference cycles have been 
reclaimed.
type: enhancement
versions: Python 3.7

___
Python tracker 
<http://bugs.python.org/issue29671>
___



[issue29671] Add function to gc module to check if any reference cycles have been reclaimed.

2017-02-27 Thread Kevin Mills

Kevin Mills added the comment:

gc.disable() at the beginning and then analyzing the results of gc.collect() 
actually does do what I was wanting, thank you.

Reference cycles in and of themselves aren't the problem. It's only a problem 
if garbage contains reference cycles. In a normal program, a class wouldn't 
generally ever become garbage, so it wouldn't be a problem.
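
For the record, the approach looks roughly like this. It is a minimal sketch; `run_program` is a hypothetical stand-in for the real workload.

```python
import gc

gc.disable()  # stop automatic collections so cyclic garbage accumulates


def run_program():
    """Stand-in for the real workload (hypothetical)."""
    node = {}
    node["self"] = node  # a reference cycle that becomes garbage on return


run_program()

# gc.collect() returns the number of unreachable objects it found.
found = gc.collect()
print(f"unreachable objects found: {found}")
```

A result of zero on a given run suggests, but does not prove, that the program produces no cyclic garbage.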

--

___
Python tracker 
<http://bugs.python.org/issue29671>
___