[issue43468] functools.cached_property locking is plain wrong.

2021-03-12 Thread Antti Haapala


Antti Haapala  added the comment:

I've been giving thought to implementing the locking on the instance, or per 
instance, instead. There are bad and worse ideas, like inserting a per 
(instance, descriptor) lock into the instance `__dict__`, guarded by the 
per-descriptor lock, or using a per-descriptor `WeakKeyDictionary` to map the 
instance to locks (which would of course not work - is there any way to map 
unhashable instances weakly?).
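
For illustration only, a minimal sketch of the first idea: a per-instance lock 
stored in the instance `__dict__`, with the per-descriptor lock guarding only 
the creation of that lock. The class name `per_instance_cached_property` and 
the `_lock_<name>` key are hypothetical, not part of functools, and the sketch 
assumes the instance has a writable `__dict__`.

```python
import threading

_NOT_FOUND = object()


class per_instance_cached_property:
    """Hypothetical sketch, not the functools implementation."""

    def __init__(self, func):
        self.func = func
        self.attrname = None
        # Held only briefly, while creating or fetching a per-instance lock.
        self.lock = threading.Lock()

    def __set_name__(self, owner, name):
        self.attrname = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        cache = instance.__dict__
        val = cache.get(self.attrname, _NOT_FOUND)
        if val is not _NOT_FOUND:
            return val
        lock_key = f"_lock_{self.attrname}"  # hypothetical key name
        with self.lock:
            instance_lock = cache.setdefault(lock_key, threading.Lock())
        with instance_lock:
            # Re-check: another thread may have filled the cache meanwhile.
            val = cache.get(self.attrname, _NOT_FOUND)
            if val is _NOT_FOUND:
                val = self.func(instance)
                cache[self.attrname] = val
        return val
```

The slow `self.func(instance)` call runs while holding only the lock of that 
one instance, so distinct instances no longer wait for each other; the cost is 
one extra lock object per (instance, property) pair.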

So far the best ideas that I have heard from others or discovered myself are 
along the lines of "remove locking altogether" (breaks compatibility); "add a 
`thread_unsafe` keyword argument" with documentation saying that this is what 
you want to use if you're actually running threads; "implement Java-style 
object monitors and synchronized methods in CPython and use those instead"; or 
"create yet another method".

--

___
Python tracker 
<https://bugs.python.org/issue43468>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue43468] functools.cached_property incorrectly locks the entire descriptor on class instead of per-instance locking

2021-03-16 Thread Antti Haapala


Change by Antti Haapala :


--
title: functools.cached_property locking is plain wrong. -> 
functools.cached_property incorrectly locks the entire descriptor on class 
instead of per-instance locking

___
Python tracker 
<https://bugs.python.org/issue43468>
___



[issue21081] missing vietnamese codec TCVN 5712:1993 in Python

2020-04-28 Thread Antti Haapala

Antti Haapala  added the comment:

The messages above seem to be a (quite likely machine) translation of André's 
comment with a spam link to a paint ad site, so no need to bother to translate 
them.

Also, I invited Hiếu to the nosy list in case this patch needs some info that 
requires a native Vietnamese reader, to push this forward ;)

--

___
Python tracker 
<https://bugs.python.org/issue21081>
___



[issue43468] functools.cached_property locking is plain wrong.

2021-03-10 Thread Antti Haapala


New submission from Antti Haapala :

The locking on functools.cached_property 
(https://github.com/python/cpython/blob/87f649a409da9d99682e78a55a83fc43225a8729/Lib/functools.py#L934) 
as it was written is completely undesirable for I/O-bound values and parallel 
processing. Instead of protecting the calculation of a cached property on the 
same instance in two threads, it completely blocks parallel calculations of 
cached values on *distinct instances* of the same class.

Here's the code of __get__ in cached_property:

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        if self.attrname is None:
            raise TypeError(
                "Cannot use cached_property instance without calling __set_name__ on it.")
        try:
            cache = instance.__dict__
        except AttributeError:  # not all objects have __dict__ (e.g. class defines slots)
            msg = (
                f"No '__dict__' attribute on {type(instance).__name__!r} "
                f"instance to cache {self.attrname!r} property."
            )
            raise TypeError(msg) from None
        val = cache.get(self.attrname, _NOT_FOUND)
        if val is _NOT_FOUND:
            with self.lock:
                # check if another thread filled cache while we awaited lock
                val = cache.get(self.attrname, _NOT_FOUND)
                if val is _NOT_FOUND:
                    val = self.func(instance)
                    try:
                        cache[self.attrname] = val
                    except TypeError:
                        msg = (
                            f"The '__dict__' attribute on {type(instance).__name__!r} instance "
                            f"does not support item assignment for caching {self.attrname!r} property."
                        )
                        raise TypeError(msg) from None
        return val


I noticed this because I was recommending that the Pyramid web framework 
deprecate its much simpler 
[`reify`](https://docs.pylonsproject.org/projects/pyramid/en/latest/_modules/pyramid/decorator.html#reify) 
decorator in favour of using `cached_property`, and then noticed why it won't 
do.


Here is the test case for cached_property:

from functools import cached_property
from threading import Thread
from random import randint
import time


class Spam:
    @cached_property
    def ham(self):
        print(f'Calculating amount of ham in {self}')
        time.sleep(10)
        return randint(0, 100)


def bacon():
    spam = Spam()
    print(f'The amount of ham in {spam} is {spam.ham}')


start = time.time()
threads = []
for _ in range(3):
    t = Thread(target=bacon)
    threads.append(t)
    t.start()

for t in threads:
    t.join()

print(f'Total running time was {time.time() - start}')


Calculating amount of ham in <__main__.Spam object at 0x7fa50bcaa220>
The amount of ham in <__main__.Spam object at 0x7fa50bcaa220> is 97
Calculating amount of ham in <__main__.Spam object at 0x7fa50bcaa4f0>
The amount of ham in <__main__.Spam object at 0x7fa50bcaa4f0> is 8
Calculating amount of ham in <__main__.Spam object at 0x7fa50bcaa7c0>
The amount of ham in <__main__.Spam object at 0x7fa50bcaa7c0> is 53
Total running time was 30.02147102355957


The runtime is 30 seconds; for `pyramid.decorator.reify` the runtime would be 
10 seconds:

Calculating amount of ham in <__main__.Spam object at 0x7fc4d8272430>
Calculating amount of ham in <__main__.Spam object at 0x7fc4d82726d0>
Calculating amount of ham in <__main__.Spam object at 0x7fc4d8272970>
The amount of ham in <__main__.Spam object at 0x7fc4d82726d0> is 94
The amount of ham in <__main__.Spam object at 0x7fc4d8272970> is 29
The amount of ham in <__main__.Spam object at 0x7fc4d8272430> is 93
Total running time was 10.010624170303345

`reify` in Pyramid is used heavily to add properties to incoming HTTP request 
objects - using `functools.cached_property` instead would mean that each 
independent request thread blocks others, because most of them would always get 
the value for the same lazy property using the same descriptor instance and 
locking the same lock.
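
For comparison, a lock-free caching descriptor in the spirit of `reify` can be 
sketched as below. This is an illustration, not Pyramid's source; the name 
`lazy_attribute` is made up. Being a non-data descriptor, it is bypassed once 
the value is stored in the instance `__dict__`; two threads racing on the same 
instance may both compute the value, but distinct instances never block each 
other.

```python
class lazy_attribute:
    """Hypothetical lock-free caching descriptor (illustration only)."""

    def __init__(self, func):
        self.func = func
        self.name = func.__name__
        self.__doc__ = func.__doc__

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        value = self.func(instance)
        # Shadow the descriptor: later lookups hit the instance __dict__.
        instance.__dict__[self.name] = value
        return value
```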

--
components: Library (Lib)
messages: 388480
nosy: ztane
priority: normal
severity: normal
status: open
title: functools.cached_property locking is plain wrong.
type: resource usage
versions: Python 3.10, Python 3.8, Python 3.9

___
Python tracker 
<https://bugs.python.org/issue43468>
___



[issue43468] functools.cached_property locking is plain wrong.

2021-03-11 Thread Antti Haapala


Antti Haapala  added the comment:

Django was going to replace their cached_property by the standard library one 
https://code.djangoproject.com/ticket/30949

--

___
Python tracker 
<https://bugs.python.org/issue43468>
___



[issue26175] Fully implement IOBase abstract on SpooledTemporaryFile

2020-10-30 Thread Antti Haapala


Antti Haapala  added the comment:

Another test case:

import tempfile
import io
import json


with tempfile.SpooledTemporaryFile(max_size=2**20) as f:
    tf = io.TextIOWrapper(f, encoding='utf-8')
    json.dump({}, fp=tf)

I was writing JSON to a file-like object that I need to read in as binary (to 
upload to S3). Originally the code used BytesIO and I thought it would be wise 
to actually spool this to disk, as I was operating with possibly limited RAM... 
except that of course it didn't work.
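
One way to sidestep the wrapper, sketched here under the assumption that the 
serialized JSON fits in memory once as a string: encode it manually and write 
bytes to the spooled file. The helper name `dump_json_spooled` is hypothetical.

```python
import json
import tempfile


def dump_json_spooled(obj, max_size=2**20):
    """Serialize obj as UTF-8 JSON into a SpooledTemporaryFile and return it."""
    f = tempfile.SpooledTemporaryFile(max_size=max_size)
    # Encode by hand instead of wrapping f in io.TextIOWrapper.
    f.write(json.dumps(obj).encode('utf-8'))
    f.seek(0)  # rewind so the caller can read the bytes back
    return f
```

The caller is responsible for closing the returned file, for example with a 
`with` block around the upload.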

--
nosy: +ztane

___
Python tracker 
<https://bugs.python.org/issue26175>
___



[issue26175] Fully implement IOBase abstract on SpooledTemporaryFile

2020-10-30 Thread Antti Haapala


Antti Haapala  added the comment:

... to further clarify, it is disappointing that either BytesIO or 
TemporaryFile would work alone, but the one that merges these two doesn't.

--

___
Python tracker 
<https://bugs.python.org/issue26175>
___



[issue42343] threading.local documentation should be on the net...

2020-11-13 Thread Antti Haapala

New submission from Antti Haapala :

The current documentation of `threading.local` is 



Thread-Local Data

Thread-local data is data whose values are thread specific. To manage 
thread-local data, just create an instance of local (or a subclass) and store 
attributes on it:

mydata = threading.local()
mydata.x = 1

The instance’s values will be different for separate threads.

class threading.local

A class that represents thread-local data.

For more details and extensive examples, see the documentation string of 
the _threading_local module.


There is no link to the `_threading_local` module docs in the documentation and 
none of the content from the module's docstring appears anywhere on the 
docs.python.org website. This is rather annoying because the docstring contains 
completely non-trivial information, including that threading.local can be 
subclassed and that the __init__ will be run once for each thread for each 
instance where attributes are accessed.
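
The behaviour described in that docstring can be seen with a small sketch; the 
class `Settings` and the list `init_calls` are made up for the illustration.

```python
import threading

init_calls = []


class Settings(threading.local):
    def __init__(self, default):
        # Runs again, with the same arguments, in each thread that
        # touches the instance for the first time.
        init_calls.append(threading.current_thread().name)
        self.value = default


settings = Settings(10)
settings.value = 11  # visible only in the main thread

seen = []
worker = threading.Thread(target=lambda: seen.append(settings.value))
worker.start()
worker.join()
```

After this runs, `seen` holds the default passed to `__init__` rather than the 
main thread's modified value, and `init_calls` has one entry per thread.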

--
assignee: docs@python
components: Documentation
messages: 380875
nosy: docs@python, ztane
priority: normal
severity: normal
status: open
title: threading.local documentation should be on the net...
type: enhancement

___
Python tracker 
<https://bugs.python.org/issue42343>
___



[issue37170] Wrong return value from PyLong_AsUnsignedLongLongMask on PyErr_BadInternalCall

2019-06-05 Thread Antti Haapala


New submission from Antti Haapala :

Hi, while checking the longobject implementation for a Stack Overflow answer, I 
noticed that the functions `_PyLong_AsUnsignedLongLongMask` and 
`PyLong_AsUnsignedLongLongMask` erroneously return `(unsigned long)-1` on error 
when a bad internal call error is raised.

First case: 
https://github.com/python/cpython/blob/cb65202520e7959196a2df8215692de155bf0cc8/Objects/longobject.c#L1379

static unsigned long long
_PyLong_AsUnsignedLongLongMask(PyObject *vv)
{
    PyLongObject *v;
    unsigned long long x;
    Py_ssize_t i;
    int sign;

    if (vv == NULL || !PyLong_Check(vv)) {
        PyErr_BadInternalCall();
        return (unsigned long) -1; <<<<
    }

Second case: 
https://github.com/python/cpython/blob/cb65202520e7959196a2df8215692de155bf0cc8/Objects/longobject.c#L1407

They seem to have been incorrect for quite some time; the other one blames back 
to the SVN era. The bug seems to be in 2.7 as well: 
https://github.com/python/cpython/blob/20093b3adf6b06930fe994527670dfb3aee40cc7/Objects/longobject.c#L1025

The correct return value should of course be `(unsigned long long)-1`.

--
components: Interpreter Core
messages: 344789
nosy: ztane
priority: normal
severity: normal
status: open
title: Wrong return value from PyLong_AsUnsignedLongLongMask on 
PyErr_BadInternalCall
type: behavior
versions: Python 2.7, Python 3.5, Python 3.6, Python 3.7, Python 3.8, Python 3.9

___
Python tracker 
<https://bugs.python.org/issue37170>
___



[issue37170] Wrong return value from PyLong_AsUnsignedLongLongMask on PyErr_BadInternalCall

2019-06-06 Thread Antti Haapala


Antti Haapala  added the comment:

Victor, as a friendly reminder, (unsigned long)-1 is not necessarily the same 
number as (unsigned long long)-1. The documentation means the latter.
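
The difference can be checked from Python with ctypes, which mirrors the 
platform's C integer widths. This is only a sketch for inspecting a given 
platform; the results depend on the ABI.

```python
import ctypes

# Converting -1 to an unsigned type wraps around to that type's maximum.
ulong_minus_one = ctypes.c_ulong(-1).value
ulonglong_minus_one = ctypes.c_ulonglong(-1).value

print(f"sizeof(unsigned long)      = {ctypes.sizeof(ctypes.c_ulong)}")
print(f"sizeof(unsigned long long) = {ctypes.sizeof(ctypes.c_ulonglong)}")
print(f"(unsigned long)-1          = {ulong_minus_one:#x}")
print(f"(unsigned long long)-1     = {ulonglong_minus_one:#x}")
# The two values agree only where both types have the same width.
print("same value:", ulong_minus_one == ulonglong_minus_one)
```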

--

___
Python tracker 
<https://bugs.python.org/issue37170>
___



[issue37170] Wrong return value from PyLong_AsUnsignedLongLongMask on PyErr_BadInternalCall

2019-06-06 Thread Antti Haapala


Antti Haapala  added the comment:

Unsigned long long needs to be at least 64 bits wide, so this probably affects 
all 32-bit platforms and 64-bit Windows at least. These functions are used only 
in a few places within the CPython code, and where they are, they're guarded 
with `PyLong_Check`s or similar, as they probably should be, but the other is 
part of the public API.

--

___
Python tracker 
<https://bugs.python.org/issue37170>
___



[issue30969] Docs should say that `x is z or x == z` is used for `x in y` in containers that do not implement `__contains__`

2017-07-19 Thread Antti Haapala

New submission from Antti Haapala:

The doc reference/expressions.rst says that

> For user-defined classes which do not define __contains__() but do 
> define __iter__(), x in y is True if some value z with x == z is 
> produced while iterating over y. If an exception is raised during the 
> iteration, it is as if in raised that exception.

and

> Lastly, the old-style iteration protocol is tried: if a class defines 
> __getitem__(), x in y is True if and only if there is a non-negative 
> integer index i such that x == y[i], and all lower integer indices do 
> not raise IndexError exception. (If any other exception is raised, it 
> is as if in raised that exception).

The documentation doesn't match the implementation, which clearly does `x is y 
or x == y` to check if `x` is the element `y` from a container. Both the 
`__iter__` and the index-iteration method test the elements using `is` first. 
While the document says that `x is x` means that `x == x` should be true, it is 
not true, for example, in the case of `nan`.
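
A short sketch of the `nan` case, using two made-up container classes that 
define only `__iter__` or only `__getitem__`:

```python
class IterOnly:
    def __init__(self, items):
        self._items = list(items)

    def __iter__(self):
        return iter(self._items)


class GetItemOnly:
    def __init__(self, items):
        self._items = list(items)

    def __getitem__(self, index):
        return self._items[index]  # IndexError ends the old-style iteration


nan = float('nan')

equal_to_itself = (nan == nan)                       # False
same_object_found = (nan in IterOnly([nan]))         # True: `is` is tried first
same_object_found_old = (nan in GetItemOnly([nan]))  # True as well
other_nan_found = (float('nan') in IterOnly([nan]))  # False: a different object
```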

--
assignee: docs@python
components: Documentation
messages: 298671
nosy: docs@python, ztane
priority: normal
severity: normal
status: open
title: Docs should say that `x is z or x == z` is used for `x in y` in 
containers that do not implement `__contains__`
type: enhancement

___
Python tracker 
<http://bugs.python.org/issue30969>
___



[issue30969] Docs should say that `x is z or x == z` is used for `x in y` in containers that do not implement `__contains__`

2017-07-19 Thread Antti Haapala

Changes by Antti Haapala :


--
pull_requests: +2820

___
Python tracker 
<http://bugs.python.org/issue30969>
___



[issue29753] Ctypes Packing Bitfields Incorrectly - Linux

2017-08-30 Thread Antti Haapala

Antti Haapala added the comment:

To Charles first: "Gives back a sizeof of 8 on Windows and 10 on Linux. The 
inconsistency makes it difficult to have code work cross-platform." 

The bitfields in particular and ctypes in general have *never* been meant to be 
cross-platform - instead they just need to match the particular C compiler 
behaviour of the platform, thus the use of these for cross-platform work is 
ill-advised - perhaps you should just use the struct module instead.
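
As a sketch of the struct-module approach: pick an explicit byte order and pack 
the bit-fields by hand with shifts and masks, so the layout no longer depends 
on a compiler. The field widths (4, 12 and 16 bits) and the function names are 
invented for the example.

```python
import struct

LAYOUT = '<I'  # one little-endian 32-bit word, same size on every platform


def pack_fields(a, b, c):
    """Pack a (4 bits), b (12 bits) and c (16 bits) into 4 bytes."""
    word = (a & 0xF) | ((b & 0xFFF) << 4) | ((c & 0xFFFF) << 16)
    return struct.pack(LAYOUT, word)


def unpack_fields(data):
    """Inverse of pack_fields; returns the tuple (a, b, c)."""
    (word,) = struct.unpack(LAYOUT, data)
    return word & 0xF, (word >> 4) & 0xFFF, (word >> 16) & 0xFFFF
```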

However, that said, on Linux, the sizeof of these structures - packed or not - 
does not match the output from GCC; the unpacked one has sizeof 12 and the 
packed one 10 on my Python 3.5, but they're both 8 bytes on GCC. This is a real 
bug.

GCC says that the bitfield behaviour is: 
https://gcc.gnu.org/onlinedocs/gcc-4.9.1/gcc/Structures-unions-enumerations-and-bit-fields-implementation.html

> Whether a bit-field can straddle a storage-unit boundary (C90 6.5.2.1, C99 
> and C11 6.7.2.1).
> Determined by ABI.
>
> The order of allocation of bit-fields within a unit (C90 6.5.2.1, C99 and 
> C11 6.7.2.1).
> Determined by ABI.
>
> The alignment of non-bit-field members of structures (C90 6.5.2.1, C99 and 
> C11 6.7.2.1).
> Determined by ABI.

Thus, the actual behaviour needs to be checked from the ABI documentation of 
the relevant platform. However - at least for unpacked structs - the x86-64 
behaviour is that a bitfield may not cross an addressable unit.

--
nosy: +ztane
title: Ctypes Packing Incorrectly - Linux -> Ctypes Packing Bitfields 
Incorrectly - Linux

___
Python tracker 
<http://bugs.python.org/issue29753>
___



[issue35469] [2.7] time.asctime() regression

2018-12-21 Thread Antti Haapala


Antti Haapala  added the comment:

C11 specifies the format used by asctime as being exactly 

"%.3s %.3s%3d %.2d:%.2d:%.2d %d\n",

which matches the *new* output with space padding, less the newline.
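
The C11 format string can be tried directly with Python's `%` formatting, which 
accepts the same conversions. The helper `c11_asctime` and its name tables are 
written for this sketch; they are not part of the time module.

```python
import time

WDAY = ['Sun', 'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat']  # C order: tm_wday 0 is Sunday
MON = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
       'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']


def c11_asctime(t):
    """Format a time.struct_time the way C11 specifies for asctime."""
    return "%.3s %.3s%3d %.2d:%.2d:%.2d %d\n" % (
        WDAY[(t.tm_wday + 1) % 7],  # Python counts Monday as 0
        MON[t.tm_mon - 1],          # Python months are 1-12
        t.tm_mday, t.tm_hour, t.tm_min, t.tm_sec, t.tm_year)


first_of_december = time.struct_time((2018, 12, 1, 0, 0, 0, 5, 335, 0))
formatted = c11_asctime(first_of_december)  # 'Sat Dec  1 00:00:00 2018\n'
```

Note the two spaces before the single-digit day, produced by `%3d`.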

As always, Microsoft got it wrong: 
https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/asctime-wasctime?view=vs-2017
 - even if deliberately saying 1-31 instead of 01-31 in the table.

--
nosy: +ztane

___
Python tracker 
<https://bugs.python.org/issue35469>
___



[issue13927] Extra spaces in the output of time.ctime

2018-12-22 Thread Antti Haapala


Antti Haapala  added the comment:

This should be added to `asctime` too.

The space-padded behaviour complies with the C standard, which was the intent - 
after all, before it was using C `asctime` directly, and the documentation says 
that unlike C asctime, it doesn't have the newline character, meaning that as a 
rule it should otherwise behave like it, with only the exception being marked.

Unfortunately MSVC asctime has been incorrectly using leading zeros 
(https://stackoverflow.com/q/53894148/918959).

--
nosy: +ztane

___
Python tracker 
<https://bugs.python.org/issue13927>
___



[issue33053] Running a module with `-m` will add empty directory to sys.path

2018-03-12 Thread Antti Haapala

New submission from Antti Haapala :

I think this is a really stupid security bug. Running a module with `-mmodule` 
seems to add '' as a path in sys.path, and in front. This is doubly wrong, 
because '' will stand for whatever the current working directory might happen 
to be at the time of the *subsequent import statements*, i.e. it is far worse 
than https://bugs.python.org/issue16202

I.e. whereas python3 /usr/lib/module.py wouldn't do that, python3 -mmodule 
would make it so that, following any chdir in code, imports would be executed 
from arbitrary locations. Verified on MacOS X and Ubuntu 17.10, using a variety 
of Python versions up to 3.7.
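
A hedged sketch for checking whether a running process is exposed: list the 
`sys.path` entries that are empty or resolve to the current working directory. 
The function name `cwd_path_entries` is made up, and newer Python versions may 
store an absolute path instead of `''`, which the sketch catches only as long 
as the working directory has not changed.

```python
import os
import sys


def cwd_path_entries(path=None):
    """Return the entries of path that point at the current working directory."""
    entries = sys.path if path is None else path
    cwd = os.path.realpath(os.getcwd())
    return [
        entry for entry in entries
        # '' is resolved against the working directory at import time.
        if entry == '' or os.path.realpath(entry) == cwd
    ]
```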

--
components: Interpreter Core
messages: 313641
nosy: ztane
priority: normal
severity: normal
status: open
title: Running a module with `-m` will add empty directory to sys.path
type: security

___
Python tracker 
<https://bugs.python.org/issue33053>
___



[issue33012] Invalid function cast warnings with gcc 8 for METH_NOARGS

2018-03-15 Thread Antti Haapala

Antti Haapala  added the comment:

I don't have GCC 8 so I cannot verify this bug, but *function pointer casts* 
are fine - any function pointer can be cast to any other function pointer - it 
is only that they must *not* be called unless cast back again to be compatible 
with the function definition. Any fix to the contrary might well *cause* 
undefined behaviour!

Could you provide a sample of the *actual warnings* so that they could be 
studied?

--
nosy: +ztane

___
Python tracker 
<https://bugs.python.org/issue33012>
___



[issue33012] Invalid function cast warnings with gcc 8 for METH_NOARGS

2018-03-15 Thread Antti Haapala

Antti Haapala  added the comment:

Yea, I looked into `ceval.c` and the function is *called incorrectly*, so there 
is undefined behaviour there - it has been wrong all along, in 3.5 all the way 
down to 2-something

if (flags & (METH_NOARGS | METH_O)) {
    PyCFunction meth = PyCFunction_GET_FUNCTION(func);
    PyObject *self = PyCFunction_GET_SELF(func);
    if (flags & METH_NOARGS && na == 0) {
        C_TRACE(x, (*meth)(self,NULL));

        x = _Py_CheckFunctionResult(func, x, NULL);
    }

The warning in GCC probably shouldn't have been enabled at all in `-Wall 
-Wextra` because the cast is explicit. However, it is somewhat true.

However, the correct way to fix this would be to have the METH_NOARGS case cast 
the function to the right prototype. There is lots of existing code that *is* 
going to break too. 

Perhaps PyCFunction should declare no prototype, i.e. empty parentheses, for 
backwards compatibility:

typedef PyObject *(*PyCFunction)();

and deprecate it; start using a new typedef for it - and then add proper casts 
in every place that calls a function.

--

___
Python tracker 
<https://bugs.python.org/issue33012>
___



[issue33053] Avoid adding an empty directory to sys.path when running a module with `-m`

2018-03-19 Thread Antti Haapala

Antti Haapala  added the comment:

Took 2 seconds.

% sudo python3 -mpip --version
hello world
Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 183, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "/usr/lib/python3.6/runpy.py", line 142, in _get_module_details
    return _get_module_details(pkg_main_name, error)
  File "/usr/lib/python3.6/runpy.py", line 109, in _get_module_details
    __import__(pkg_name)
  File "/usr/lib/python3/dist-packages/pip/__init__.py", line 4, in <module>
    import locale
  File "/usr/lib/python3.6/locale.py", line 180, in <module>
    _percent_re = re.compile(r'%(?:\((?P<key>.*?)\))?'
AttributeError: module 're' has no attribute 'compile'
Error in sys.excepthook:
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/apport_python_hook.py", line 53, in apport_excepthook
    if not enabled():
  File "/usr/lib/python3/dist-packages/apport_python_hook.py", line 28, in enabled
    return re.search(r'^\s*enabled\s*=\s*0\s*$', conf, re.M) is None
AttributeError: module 're' has no attribute 'search'

Original exception was:
Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 183, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "/usr/lib/python3.6/runpy.py", line 142, in _get_module_details
    return _get_module_details(pkg_main_name, error)
  File "/usr/lib/python3.6/runpy.py", line 109, in _get_module_details
    __import__(pkg_name)
  File "/usr/lib/python3/dist-packages/pip/__init__.py", line 4, in <module>
    import locale
  File "/usr/lib/python3.6/locale.py", line 180, in <module>
    _percent_re = re.compile(r'%(?:\((?P<key>.*?)\))?'
AttributeError: module 're' has no attribute 'compile'

Same for `python -mhttp.server`, say. 



I'd prefer a change that makes the default always safe from some version on, so 
that the REPL can do whatever it does, but `-m` etc. probably should have 
neither the *initial* current directory *nor* the current current directory in 
the path unless an interactive session is requested. I am not worried about the 
garbage that the user would have installed in their own directories breaking 
things.

--

___
Python tracker 
<https://bugs.python.org/issue33053>
___



[issue29753] Ctypes Packing Bitfields Incorrectly - Linux

2017-11-17 Thread Antti Haapala

Antti Haapala  added the comment:

"Antti, is there a place in the ctypes documentation that explicitly says 
ctypes is not meant to be used cross-platform? If not, shouldn't that be 
mentioned?"

I don't know about that, but the thing is nowhere does it say that it is meant 
to be used cross-platform. It just says it allows defining C types. It is 
somewhat implied that C types are not cross-platform at binary level, at all.

--

___
Python tracker 
<https://bugs.python.org/issue29753>
___



[issue32112] Should uuid.UUID() accept another UUID() instance?

2017-11-22 Thread Antti Haapala

Antti Haapala  added the comment:

I've been hit by this too, in similar contexts, and several times. It is really 
annoying that it is easier to coerce a UUID or UUID-string to a string than to 
coerce to a UUID. Usually when the copy semantics are clear and the class is 
plain old data, Python lets you execute the constructor with an instance of the 
same class:

>>> bytes(bytes())
b''
>>> bytearray(bytearray())
bytearray(b'')
>>> int(int())
0
>>> complex(complex())
0j
>>> tuple(tuple())
()

I don't see why this shouldn't be true with UUID as well.
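
In the meantime, a small helper can do the coercion; `as_uuid` is a 
hypothetical name, not part of the uuid module.

```python
import uuid


def as_uuid(value):
    """Return value unchanged if it is already a UUID, else parse it as a string."""
    if isinstance(value, uuid.UUID):
        return value
    return uuid.UUID(value)
```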

--
nosy: +ztane

___
Python tracker 
<https://bugs.python.org/issue32112>
___



[issue33012] Invalid function cast warnings with gcc 8 for METH_NOARGS

2018-05-27 Thread Antti Haapala

Antti Haapala  added the comment:

Well, there's only one problem with casting to void *: while converting the 
function pointer to another *is* standard-compliant, and GCC is being just 
hypersensitive here, casting a function pointer to void * isn't, though it is a 
common extension (http://port70.net/~nsz/c/c11/n1570.html#J.5.7).

Pedantically the correct way is to cast to a function pointer with no prototype 
(empty parentheses) and from that to the target type. See for example 
https://godbolt.org/g/FdPdUj

--

___
Python tracker 
<https://bugs.python.org/issue33012>
___



[issue33239] tempfile module: functions with the 'buffering' option are incorrectly documented

2018-09-13 Thread Antti Haapala


Antti Haapala  added the comment:

This week we were bitten by this in production. I foolishly thought that the 
docs would give me correct default values... It is worse that it didn't 
actually occur until we went over the limit.

--
nosy: +ztane

___
Python tracker 
<https://bugs.python.org/issue33239>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue18290] json encoder does not support JSONP/JavaScript safe escaping

2013-06-23 Thread Antti Haapala

New submission from Antti Haapala:

JSON is not a strict subset of JavaScript 
(http://timelessrepo.com/json-isnt-a-javascript-subset). However, certain web 
technologies use JSON values as a part of JavaScript code (JSONP, inline 
<script> tags)... The Python json module, however, by default does not escape 
\u2028 or \u2029 when ensure_ascii is false. Furthermore, the / -> \/ escape is 
not supported by any switch.

Strictly speaking, the JSON specification only requires that " be escaped to \" 
and \ to \\ - all other escaping is optional. The whitespace escapes only exist 
to aid handwriting and embedding values in HTML/code. Thus it can be argued 
that the choice of escapes used by the json encoder is ill-advised.

In an inline HTML <script> tag, < cannot be escaped; however, only the string 
'</script>' (or sometimes just '</') ends the script, so a value such as 
{"a": "</script>"} cannot be safely embedded in inline javascript. The only 
correct way to escape such content in inline html is to escape all / into \/.

The \u2028, \u2029 problem is more subtle and can break not only inline 
javascript but also JSONP. Thus an incorrect value injected into the database 
by a malicious or unwitting user might break the entire protocol.

The current solution is to re-escape everything that comes out of JSON encoder. 
The best solution for python would be to make these 3 escapes default in the 
python json module (notice again that the current set of default escapes when 
ensure_ascii=False is chosen arbitrarily), or if not default, then at least 
they could be enabled by a switch. Furthermore, documentation should be updated 
appropriately, to explain why such escape is needed.
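
The re-escaping workaround can be sketched as a thin wrapper around 
`json.dumps`. The name `dumps_js_safe` is hypothetical. It relies on the fact 
that in serialized JSON a `/` can only occur inside a string, so replacing 
every `/` with `\/` does not change the decoded value.

```python
import json


def dumps_js_safe(obj, **kwargs):
    """json.dumps plus the three extra escapes discussed above."""
    text = json.dumps(obj, **kwargs)
    return (text
            .replace('/', '\\/')            # so '</script>' never appears
            .replace('\u2028', '\\u2028')   # LINE SEPARATOR
            .replace('\u2029', '\\u2029'))  # PARAGRAPH SEPARATOR
```

Decoding the result with `json.loads` gives back the original value, since 
`\/` and `\uXXXX` are both valid JSON string escapes.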

--
components: Library (Lib)
messages: 191742
nosy: Ztane
priority: normal
severity: normal
status: open
title: json encoder does not support JSONP/JavaScript safe escaping
type: enhancement

___
Python tracker 
<http://bugs.python.org/issue18290>
___



[issue18290] json encoder does not support JSONP/JavaScript safe escaping

2013-06-23 Thread Antti Haapala

Antti Haapala added the comment:

My mistake in writing: JSON of course does specify that "control characters" be 
escaped. Then, it needs to be pointed out that the json module does NOT 
currently escape \u007f-\u009f, as it maybe strictly should:

>>> unicodedata.category('\u007f')
'Cc'
>>> json.dumps({'a': '\u007f'}, ensure_ascii=False)
'{"a": "\x7f"}'

--

___
Python tracker 
<http://bugs.python.org/issue18290>
___



[issue21385] Compiling modified AST crashes on debug build unless linenumbering discarded

2014-04-29 Thread Antti Haapala

New submission from Antti Haapala:

We had had problems with our web service occasionally hanging and performing 
poorly, and as we didn't have much clue about the cause of these, we decided to 
continuously run our staging build under a debug-enabled Python 3.4, attaching 
gdb as needed. Much to our dismay we found out that our code-generating code, 
which builds AST trees and then compiles them to modules, is dumping cores on 
the debug version. 

The assertion is the much discussed "linenumbers must grow monotonically" at 
http://hg.python.org/cpython/file/04f714765c13/Python/compile.c#l3969

In our case, the AST is generated from a HTML template with embedded python 
parts; as we could approximately point out much of the corresponding code in 
the source template, we decided to reuse the linenumbers in AST, and things 
seemed to work quite nicely and usually we could get usable tracebacks too.

Under a debug build, however, as the ordering of some constructs in the source 
language is different from Python, we need to discard *all* linenumbers and 
only then use fix_missing_locations, and thus get completely unusable traces 
from these parts of code, all happening on line 1. Just using 
fix_missing_locations does not work. Likewise, the rules for which parts of the 
tree should come in which order in the lnotab are quite hard to deduce.

It seems to me that when the lnotab was created, no one even had in mind that 
there would be an actually useful AST module that would be used for code 
generation. Considering that there have been other calls for breaking the 
correspondence of bytecode addresses to monotonically growing linenumbers, I 
want to reopen the discussion about changing the lnotab structures now to allow 
arbitrary mapping of source code locations to bytecode, and especially about 
the need for this assertion in the debug builds at all.

Attached is an example of code that appends a function to an existing module 
syntax tree, run under python*-dbg it dumps the core with 
"Python/compile.c:: assemble_lnotab: Assertion `d_lineno >= 0' failed." Ofc 
in this simple case it is easy to just modify the linenumbers so that function 
"bar" would come after "foo", however in some cases it is hard to know the 
actual rules; fix_missing_locations does not do this right at all.
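
For the simple case just described, a sketch (not the attached 
astlinenotest.py, whose contents are not shown here) can shift the appended 
function past the last line of the existing module with `ast.increment_lineno` 
before compiling. The source snippets and variable names are made up.

```python
import ast

module = ast.parse("def foo():\n    return 1\n")
extra = ast.parse("def bar():\n    return 2\n").body[0]

# Highest line number used by the existing tree (nodes without one count as 0).
last_line = max((getattr(node, 'lineno', None) or 0) for node in ast.walk(module))

# Move every node of the new function below the existing code.
ast.increment_lineno(extra, last_line)
module.body.append(extra)
ast.fix_missing_locations(module)

code = compile(module, "<generated>", "exec")
namespace = {}
exec(code, namespace)
```

This keeps line numbers growing in source order, but only because the appended 
code is placed last; it does not help when the template's ordering differs from 
Python's.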

I am also pretty sure that most of the existing code that combines parsed and 
generated ASTs and then compiles the resulting trees would also fail that 
assert, but no one ever runs their code under debug builds.

--
components: Interpreter Core
files: astlinenotest.py
messages: 217502
nosy: Ztane
priority: normal
severity: normal
status: open
title: Compiling modified AST crashes on debug build unless linenumbering 
discarded
type: crash
versions: Python 2.7, Python 3.1, Python 3.2, Python 3.3, Python 3.4, Python 3.5
Added file: http://bugs.python.org/file35090/astlinenotest.py

___
Python tracker 
<http://bugs.python.org/issue21385>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22234] urllib.parse.urlparse accepts any falsy value as an url

2014-08-20 Thread Antti Haapala

New submission from Antti Haapala:

Because of "if x else ''" in _decode_args 
(http://hg.python.org/cpython/file/3.4/Lib/urllib/parse.py#l96), 
urllib.parse.urlparse accepts any falsy value as an url, returning a 
ParseResultBytes with all members set to empty bytestrings.

Thus you get:

>>> urllib.parse.urlparse({})
ParseResultBytes(scheme=b'', netloc=b'', path=b'', params=b'', query=b'', 
fragment=b'')

which may result in some very confusing exceptions later on: I had a list of 
URLs that accidentally contained some Nones and got very confusing TypeErrors 
while processing the results expecting them to be strings.

If the `if x else ''` part were removed, such invalid falsy values would fail 
with `AttributeError: 'foo' object has no attribute 'decode'`, as happens with 
any truthy invalid value.
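
Until the library itself rejects such input, a caller can guard against it. A minimal sketch, assuming a wrapper named `strict_urlparse` (a made-up name, not part of urllib):

```python
from urllib.parse import urlparse


def strict_urlparse(url, *args, **kwargs):
    """Reject anything that is not text or bytes before parsing.

    Hypothetical wrapper for illustration only.
    """
    if not isinstance(url, (str, bytes, bytearray)):
        raise TypeError(
            "url must be str or bytes, not %s" % type(url).__name__
        )
    return urlparse(url, *args, **kwargs)


result = strict_urlparse("http://example.com/path?q=1")
```

With this in place a stray None in a list of URLs fails at the call site instead of producing confusing errors later.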

--
components: Library (Lib)
messages: 225566
nosy: Ztane
priority: normal
severity: normal
status: open
title: urllib.parse.urlparse accepts any falsy value as an url
type: behavior
versions: Python 3.4

___
Python tracker 
<http://bugs.python.org/issue22234>
___



[issue22234] urllib.parse.urlparse accepts any falsy value as an url

2014-08-24 Thread Antti Haapala

Antti Haapala added the comment:

On Python 2.7 urlparse.urlparse, parsing None, () or 0 will throw 
AttributeError because these classes do not have any 'find' method. [] has the 
find method, but will fail with TypeError, because the built-in caching 
requires that the input be hashable.

--

___
Python tracker 
<http://bugs.python.org/issue22234>
___



[issue28682] Bytes support in os.fwalk()

2016-11-26 Thread Antti Haapala

Antti Haapala added the comment:

shouldn't this get in sooner, as the 3.5.2 documentation says that it behaves 
exactly like `os.walk`, with some additions, none of which says "bytes paths 
are not supported". This looks like a bug to me.

--
nosy: +ztane

___
Python tracker 
<http://bugs.python.org/issue28682>
___



[issue11726] clarify that linecache only works on files that can be decoded successfully

2017-02-09 Thread Antti Haapala

Antti Haapala added the comment:

Every now and then there are new questions and answers regarding the use of 
`linecache` module on Stack Overflow for doing random access to text files, 
even though the documentation states that it is meant for Python source code 
files.

One problem is that the title still states: "11.9. linecache — Random access to 
text lines"; the title should really be changed to "Random access to Python 
source code lines" so that the title wouldn't imply that this is a 
general-purpose random access library for text files.

--
nosy: +ztane

___
Python tracker 
<http://bugs.python.org/issue11726>
___



[issue26839] Python 3.5 running on Linux kernel 3.17+ can block at startup or on importing the random module on getrandom()

2016-06-06 Thread Antti Haapala

Antti Haapala added the comment:

I don't think setting environment variables is a solution, as it is not always 
clear which script runs early in the boot process, or even which programs 
have components written in Python. However, I'd want to be notified of failure as 
well; perhaps a warning should be emitted.

--
nosy: +ztane

___
Python tracker 
<http://bugs.python.org/issue26839>
___



[issue27364] Deprecate invalid unicode escape sequences

2016-06-23 Thread Antti Haapala

Antti Haapala added the comment:

it is handy to be able to use `\w` and `\d` in non-raw-string *regular 
expressions*, without too much backslashitis. Seems to be in use in Python 
standard library as well, for example in csv.py
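
For comparison, here are the two spellings that do not rely on unknown escapes being passed through. This is just a sketch; the pattern and the sample text are made up.

```python
import re

# Raw string: the backslashes reach the regex engine untouched.
raw_pattern = r"\w+-\d+"

# Non-raw spelling with every backslash doubled ("backslashitis").
escaped_pattern = "\\w+-\\d+"

same_pattern = raw_pattern == escaped_pattern
match = re.fullmatch(raw_pattern, "issue-27364")
```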

--
nosy: +ztane

___
Python tracker 
<http://bugs.python.org/issue27364>
___



[issue27473] bytes_concat seems to check overflow using undefined behaviour

2016-07-10 Thread Antti Haapala

Antti Haapala added the comment:

The previous code was perfectly fine with `-fwrapv` since it makes signed 
overflow behaviour defined. And AFAIK the BDFL's stance is that signed integer 
overflow should be defined to wrap anyhow.



In my opinion the `-fwrapv` itself makes one proliferate all these insane 
wrap-checks; indeed I'd rather have them defined in a macro, something like

    if (PYSSIZE_OVERFLOWS_ON_ADD(va.len, vb.len)) {
        PyErr_NoMemory();
        goto done;
    }

    size = va.len + vb.len;

even though `-fwrapv` is defined; that way it'd be obvious what is supposed to 
happen there.

--
nosy: +ztane

___
Python tracker 
<http://bugs.python.org/issue27473>
___



[issue27078] Make f'' strings faster than .format: BUILD_STRING opcode?

2016-07-10 Thread Antti Haapala

Antti Haapala added the comment:

I am not an expert on writing new opcodes to CPython (having never done it, 
don't know where to change the disassembler and such, how to make compiler 
generate them properly and such), but I'd be glad to help with testing, timing 
and writing the possible joiner algorithm for it, to help it make into Python 
3.6.

--

___
Python tracker 
<http://bugs.python.org/issue27078>
___



[issue1621] Do not assume signed integer overflow behavior

2016-07-10 Thread Antti Haapala

Antti Haapala added the comment:

One common case where signed integer overflow has been assumed has been the 
wraparound/overflow checks like in http://bugs.python.org/issue27473 

I propose that commonly error-prone tasks such as overflow checks be 
implemented as common macros in CPython, as getting them right is not quite easy 
(http://c-faq.com/misc/sd26.html); it would also make the C code more 
self-documenting.

Thus instead of writing

    if (va.len > PY_SSIZE_T_MAX - vb.len) {

one would write something like

    if (PY_SSIZE_T_SUM_OVERFLOWS(va.len, vb.len)) {

and the mere fact that such a macro *wasn't* used there would signal about 
possible problems with the comparison.
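
The logic of such a check can be modelled in Python, where integers never overflow and the result can be compared against the true sum. This is only a model of the idea, not the C macro; `sum_overflows` is a made-up name and it assumes non-negative operands.

```python
import sys

# CPython exposes the Py_ssize_t maximum as sys.maxsize.
PY_SSIZE_T_MAX = sys.maxsize


def sum_overflows(a, b):
    """Model of the proposed check for non-negative a and b.

    The comparison is rearranged so that no intermediate value exceeds
    PY_SSIZE_T_MAX, which is what matters in C, where a + b itself
    could overflow.
    """
    return a > PY_SSIZE_T_MAX - b


cases = [
    (0, 0),
    (1, PY_SSIZE_T_MAX - 1),
    (1, PY_SSIZE_T_MAX),
    (PY_SSIZE_T_MAX, PY_SSIZE_T_MAX),
]
agrees = all(sum_overflows(a, b) == (a + b > PY_SSIZE_T_MAX) for a, b in cases)
```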

--
nosy: +ztane

___
Python tracker 
<http://bugs.python.org/issue1621>
___



[issue27078] Make f'' strings faster than .format: BUILD_STRING opcode?

2016-07-10 Thread Antti Haapala

Antti Haapala added the comment:

And the expected performance for optimal `f'X is {x}'` code would be *faster* 
than `"'X is %s' % (x,)"` which still needs to interpret the string at runtime, 
and build a proper tuple object on stack.

--

___
Python tracker 
<http://bugs.python.org/issue27078>
___



[issue27078] Make f'' strings faster than .format: BUILD_STRING opcode?

2016-07-11 Thread Antti Haapala

Antti Haapala added the comment:

Ah so it seems. Somehow I thought __format__ was slotted, but that is not the 
case and it needs to be looked up, and what is worse, of course a tuple needs 
to be built as well. 

Oh well, at least it should be quite minimal to make it be faster than `f(x)` 
there, which necessarily has one extra bound method call and interpretation of 
format string as the overhead, so there's minimally at least 30 % performance 
boost to achieve.

--

___
Python tracker 
<http://bugs.python.org/issue27078>
___



[issue27078] Make f'' strings faster than .format: BUILD_STRING opcode?

2016-07-13 Thread Antti Haapala

Antti Haapala added the comment:

Yet the test cases just prove what is so expensive there: name lookups (global 
name `str`; looking up `join` on a string instance); building a tuple (for 
function arguments) is expensive as well. Of course `__format__` will be costly 
as well as it is not a slot-method, needs to build a new string etc. 

However for strings, 'foo'.format() already returns the instance itself, so if 
you were formatting other strings into strings there are cheap shortcuts 
available to even overtake 

a = 'Hello'
b = 'World'
'%s %s' % (a, b)

for fast string templates, namely, make FORMAT_VALUE without args return the 
original if `PyUnicode_CheckExact` and no arguments, don't need to build a 
tuple to join it.

--

___
Python tracker 
<http://bugs.python.org/issue27078>
___



[issue27078] Make f'' strings faster than .format: BUILD_STRING opcode?

2016-07-13 Thread Antti Haapala

Antti Haapala added the comment:

It seems Eric has done some special casing for strings already in FORMAT_VALUE. 
Here are the results from my computer after applying Demur's patch for 
concatenating *strings*.

python3.6 -m timeit -s "x = 'a'" -- '"X is %s" % x'
100 loops, best of 3: 0.187 usec per loop

python3.6 -m timeit -s "x = 'a'" -- 'f"X is {x}"' 
1000 loops, best of 3: 0.0972 usec per loop

But then as more components are added, it starts to slow down rapidly:

python3.6 -m timeit -s "x = 'a'" -- 'f"X is {x}a"'   
100 loops, best of 3: 0.191 usec per loop

python3.6 -m timeit -s "x = 'a'" -- '"X is %sa" % x'
100 loops, best of 3: 0.216 usec per loop

Of course there is also the matter of string conversion vs "look up __format__":

python3.6 -m timeit -s "x = 1" -- 'f"X is {x}"'
100 loops, best of 3: 0.349 usec per loop

python3.6 -m timeit -s "x = 1" -- 'f"X is {x!s}"'
1000 loops, best of 3: 0.168 usec per loop

The FORMAT_VALUE opcode already has a special case for the latter. 
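
The same kind of comparison can be scripted with the timeit module instead of the command line. A rough sketch; the statement labels and the loop count are arbitrary choices, and the numbers will differ per machine and Python version.

```python
import timeit

setup = "x = 'a'"
statements = {
    "percent": '"X is %s" % x',
    "fstring": 'f"X is {x}"',
    "fstring_str": 'f"X is {x!s}"',
}

loops = 100_000
timings = {}
for name, stmt in statements.items():
    # Best of 3 repeats, converted to microseconds per loop.
    best = min(timeit.repeat(stmt, setup=setup, repeat=3, number=loops))
    timings[name] = best / loops * 1e6

for name, usec in timings.items():
    print(f"{name}: {usec:.4f} usec per loop")
```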

However, I too would say that a switch/case for some of the fastest builtin types 
(str, int, float) in `PyObject_Format`, as Eric has intended to do with Unicode, 
would be worthwhile, as those are commonly used to build strings such as text 
templates, text-like protocols like emails, HTTP protocol headers and such.

For others the speed-up wouldn't really matter either way: either 
PyObject_Format would fall back to object.__format__ (for example functions) - 
no one really cares about efficiency when doing such debug dumps - if you 
cared, you'd not do them at all; or they'd have complex representations (say, 
lists, dictionaries) - and their use again would mostly be that of debug dumps; 
or they'd have `__format__` written in Python and that'd be dwarfed by anything 
so far.

--

___
Python tracker 
<http://bugs.python.org/issue27078>
___



[issue27078] Make f'' strings faster than .format: BUILD_STRING opcode?

2016-07-13 Thread Antti Haapala

Antti Haapala added the comment:

Thanks Serhiy, I was writing my comment for a long time, and only now noticed 
that you'd already posted the patch.

Indeed, it seems that not only is this the fastest method, it might also be the 
fastest string concatenation method in the history of Python. I did some 
comparison with 8-bit strings in Python 2, doing the equivalent of 

f'http://{domain}/{lang}/{path}'

with

domain = 'some_really_long_example.com'
lang = 'en'
path = 'some/really/long/path/'


and the results look quite favourable: 0.151 µs with your patch; 0.250ish for 
the second fastest method in Python 3.6 (''.join(tuple))

And the fastest method in Python 2 is a tie between concatenating with + or 
''.join with bound method reference; both of them take 0.203 µs on Python 2.7 
with 8-bit strings and about 0.245 - 0.255 µs if everything is Unicode.
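
On a CPython where the opcode landed (3.6 and later, as far as I know), one can check that such an f-string is compiled to a single BUILD_STRING. A small sketch; the other opcode names around it vary between versions.

```python
import dis

source = "f'http://{domain}/{lang}/{path}'"
code = compile(source, "<fstring>", "eval")

# Collect the opcode names the compiler generated for the expression.
opnames = [instr.opname for instr in dis.get_instructions(code)]
uses_build_string = "BUILD_STRING" in opnames

variables = {
    "domain": "some_really_long_example.com",
    "lang": "en",
    "path": "some/really/long/path/",
}
url = eval(code, variables)
```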

--

___
Python tracker 
<http://bugs.python.org/issue27078>
___



[issue27507] bytearray.extend lacks overflow check when increasing buffer

2016-07-14 Thread Antti Haapala

Antti Haapala added the comment:

if (len == PY_SSIZE_T_MAX) is necessary for the case that the iterable is 
already PY_SSIZE_T_MAX items. However it could be moved inside the *other* if 
because if (len == PY_SSIZE_T_MAX) should also fail the overflow check.

However, I believe it is theoretical at most with stuff that Python supports 
that it would even be possible to allocate an array of PY_SSIZE_T_MAX 
*pointers*. The reason is that the maximum size of object can be only that of 
`size_t`, and Py_ssize_t should be a signed type of that size; and it would 
thus be possible only to allocate an array of PY_SSIZE_T_MAX pointers only if 
they're 16 bits wide.

In any case, this would be another place where my proposed macro 
"SUM_OVERFLOWS_PY_SSIZE_T" or something would be in order to make it read

if (SUM_OVERFLOWS_PY_SSIZE_T(len, (len >> 1) + 1))

which would make it easier to spot mistakes in the sign preceding 1.

--
nosy: +ztane

___
Python tracker 
<http://bugs.python.org/issue27507>
___



[issue27512] os.fspath is certain to crash when exception raised in __fspath__

2016-07-14 Thread Antti Haapala

Antti Haapala added the comment:

My belief about tests is that they should *especially* be in place for any previously 
found "careless omissions". If it has been done before, who is to say that it 
wouldn't happen again?

--
nosy: +ztane

___
Python tracker 
<http://bugs.python.org/issue27512>
___



[issue27078] Make f'' strings faster than .format: BUILD_STRING opcode?

2016-07-15 Thread Antti Haapala

Antti Haapala added the comment:

Serhiy suggested this in Rietveld:

> For additional optimization we can pack all constant strings, parsed formats 
> and
> flags in one constant object and use the single LOAD_CONST. But this requires
> much larger changes (perhaps including changing the marshal format), and the
> benefit may be small. Maybe we'll get to it eventually, if this approach 
> proves
> efficient enough.

I was thinking about this and got an idea on how to do this too, without 
changes to marshal. Essentially, let TOS be a tuple of 

(flags, str1, str2, str3, str4, str5, str6, str7, str8, str9...)

flags would be n bytes for n-part format string; each byte would tell whether:

- the next component is a constant string (bit 0 = 0) from the tuple
- the next component is an interpolated value (bit 0 = 1)
   - and whether it has !s, !r, !a or default conversions (bits 1-2)
   - and whether it has extra argument to format() or not (bit 3) (argument is 
the next string from the tuple)

thus that tuple for

a, b = 'Hello', 'World!'
f'{a!s} {b:10}!'

would be

(b'\x03\x00\x05\x00', ' ', '10', '!')

and the opcodes would be

LOAD_FAST (b)
LOAD_FAST (a)
LOAD_CONST (0) (the tuple)
BUILD_FORMAT_STRING 3
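
What such an opcode would have to do can be modelled in pure Python. This is only a sketch of the proposal, not an implementation: `build_format_string` is a made-up name, and the bit layout is my reading of the list above (bit 0 = interpolated, bits 1-2 = conversion with 1 for !s, 2 for !r, 3 for !a, bit 3 = format spec follows). Under that reading the flag for `{b:10}` is 0x09, so the sketch uses that value.

```python
def build_format_string(template, values):
    """Interpret a (flags, str1, str2, ...) tuple against a sequence of values."""
    flags, *strings = template
    strings = iter(strings)
    values = iter(values)
    parts = []
    for flag in flags:
        if not flag & 1:
            # Constant component: take the next string from the tuple.
            parts.append(next(strings))
            continue
        value = next(values)
        conversion = (flag >> 1) & 3
        if conversion == 1:
            value = str(value)
        elif conversion == 2:
            value = repr(value)
        elif conversion == 3:
            value = ascii(value)
        # Bit 3 says the next string in the tuple is the format spec.
        spec = next(strings) if flag & 8 else ""
        parts.append(format(value, spec))
    return "".join(parts)


a, b = "Hello", "World!"
template = (b"\x03\x00\x09\x00", " ", "10", "!")
result = build_format_string(template, (a, b))
```

The values are passed in source order here; a real opcode would pop them from the stack instead.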

--

___
Python tracker 
<http://bugs.python.org/issue27078>
___



[issue24954] No way to generate or parse timezone as produced by datetime.isoformat()

2016-07-15 Thread Antti Haapala

Antti Haapala added the comment:

"Be conservative in what you do, be liberal in what you accept from others" 
they say. Also, Z as a timezone designator is widely used in ISO 8601 
timestamps. I believe the effort should be made to *parse* *any/all* of the ISO 
8601 supported time-zone codes with one conversion; the list is not that long, 
just 'Z', HH, HH:MM, HHMM, longest match. Literal 'Z' really does not need to 
be supported for *output* at all, but for input, please.

Otherwise this will still go down the road of iso8601 library, which just tries 
to support all the YYmmddTHHMMSS.FF variants. It uses regular 
expressions to parse the dates as it is faster than trying N different formats 
with `strptime`
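
A lenient parser for just the time-zone designator could look roughly like this. It is a sketch, not a proposal for the actual `strptime` code; `parse_iso_tz` is a made-up name and it assumes the numeric forms carry a leading sign.

```python
import re
from datetime import timedelta, timezone

# 'Z', or a sign followed by HH, HHMM or HH:MM.
_TZ_RE = re.compile(r"Z|([+-])(\d{2})(?::?(\d{2}))?")


def parse_iso_tz(text):
    """Return a datetime.timezone for an ISO 8601 time-zone designator."""
    match = _TZ_RE.fullmatch(text)
    if match is None:
        raise ValueError("not an ISO 8601 time-zone designator: %r" % (text,))
    if text == "Z":
        return timezone.utc
    sign = -1 if match.group(1) == "-" else 1
    hours = int(match.group(2))
    minutes = int(match.group(3) or 0)
    return timezone(sign * timedelta(hours=hours, minutes=minutes))


offsets = {text: parse_iso_tz(text) for text in ("Z", "+02", "+0530", "-05:30")}
```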

--
nosy: +ztane

___
Python tracker 
<http://bugs.python.org/issue24954>
___



[issue24954] No way to generate or parse timezone as produced by datetime.isoformat()

2016-07-16 Thread Antti Haapala

Antti Haapala added the comment:

Alexander: that is true, because they are *separate* conversion flags. 

However, even the POSIX standard strptime has some leniency: `%m` and `%d` 
accept the numbers *without* leading zeroes. This actually also means that one 
cannot use `%Y%m%d` to detect an invalid ISO timestamp:

>>> datetime.datetime.strptime('22', '%Y%m%d')
datetime.datetime(, 2, 2, 0, 0)

The `arrow` library depends on the supposed "strict" behaviour of strptime that 
has never been guaranteed, which often results in very buggy behaviour under 
some conditions.
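
The leniency is easy to see with values of one's own choosing. The dates below are made up for illustration, and the second call relies on how CPython's `strptime` matches the fields, as far as I can tell.

```python
from datetime import datetime

# %m and %d accept values without a leading zero.
lenient = datetime.strptime("2016-7-5", "%Y-%m-%d")

# As a result, a format without separators also accepts a shortened string.
ambiguous = datetime.strptime("201675", "%Y%m%d")
```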



(Also, it must be noted that GNU date program doesn't use these formats to 
*parse* dates, and POSIX strptime in *C* library outright ignores any timezone 
information)

--

___
Python tracker 
<http://bugs.python.org/issue24954>
___



[issue1621] Do not assume signed integer overflow behavior

2016-07-16 Thread Antti Haapala

Antti Haapala added the comment:

Gnulib portability library has 
https://www.gnu.org/software/gnulib/manual/html_node/Integer-Range-Overflow.html
 and 
https://www.gnu.org/software/gnulib/manual/html_node/Integer-Type-Overflow.html
 and even macros for producing well-defined integer wraparound for signed 
integers: 
https://www.gnu.org/software/gnulib/manual/html_node/Wraparound-Arithmetic.html

That code is under GPL but I believe there is no problem if someone just looks 
into that for ideas on how to write similar macros.

--

___
Python tracker 
<http://bugs.python.org/issue1621>
___



[issue27507] bytearray.extend lacks overflow check when increasing buffer

2016-07-17 Thread Antti Haapala

Antti Haapala added the comment:

Ah indeed, this is a bytearray and it is indeed possible to theoretically 
allocate PY_SSIZE_T_MAX bytes, if on an architecture that does segmented memory.

As for 

if (addition > PY_SSIZE_T_MAX - len - 1) {

it is very clear to *us* but it is not quite self-documenting on why to do it 
this way to someone who doesn't know undefined behaviours in C (hint: next to 
no one knows, judging from the amount of complaints that the GCC "bug" 
received), instead of say

if (INT_ADD_OVERFLOW(len, addition))

Where the INT_ADD_OVERFLOW would have a comment above explaining why it has to 
be done that way. But more discussion about it at 
https://bugs.python.org/issue1621

--

___
Python tracker 
<http://bugs.python.org/issue27507>
___



[issue27556] Integer overflow on hex()

2016-07-17 Thread Antti Haapala

Antti Haapala added the comment:

Note that this has nothing to do with the `hex()` function. The part that is 
the problem here is 10**80, which takes ages to compute. You can 
interrupt it with Ctrl-C.

--
nosy: +ztane

___
Python tracker 
<http://bugs.python.org/issue27556>
___



[issue27558] SystemError inside multiprocessing.dummy Pool.map

2016-07-18 Thread Antti Haapala

Antti Haapala added the comment:

Reproducible on Python 3.6a4ish on Ubuntu. I believe this needs forking 
multiprocessing.

do_raise is called with 2 NULLs as arguments, it should raise

PyErr_SetString(PyExc_RuntimeError,
"No active exception to reraise");

What happens is that PyThreadState is initialized to *all* NULL pointers on the 
new thread on multiprocessing, however `type` is expected to point to `Py_None` 
when no exception has been raised:

    PyThreadState *tstate = PyThreadState_GET();
    PyObject *tb;
    type = tstate->exc_type;
    value = tstate->exc_value;
    tb = tstate->exc_traceback;
    if (type == Py_None) {
        PyErr_SetString(PyExc_RuntimeError,
                        "No active exception to reraise");
        return 0;
    }

I am not sure where the thread state should have been initialized though

--
nosy: +ztane

___
Python tracker 
<http://bugs.python.org/issue27558>
___



[issue27558] SystemError inside multiprocessing.dummy Pool.map

2016-07-18 Thread Antti Haapala

Antti Haapala added the comment:

OTOH, if you put sys.exc_info() in place of raise there, it correctly returns (None, 
None, None), because it does (sysmodule.c:sys_exc_info)

tstate->exc_type != NULL ? tstate->exc_type : Py_None,

Easiest fix would be to make do_raise test for both NULL and None, but I'd 
really consider fixing the new thread initialization if possible.
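
The behaviour one would expect from a correctly initialized thread can be written down as a small check. This is a sketch of the expected outcome, not of the fix; on the affected build the bare raise produced a SystemError instead. The `results` dict and `worker` function are names made up for illustration.

```python
import sys
import threading

results = {}


def worker():
    # No exception is being handled in a freshly started thread.
    results["exc_info"] = sys.exc_info()
    try:
        raise  # bare raise with nothing to re-raise
    except RuntimeError as exc:
        results["error"] = str(exc)


thread = threading.Thread(target=worker)
thread.start()
thread.join()
```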

--

___
Python tracker 
<http://bugs.python.org/issue27558>
___



[issue27558] SystemError inside multiprocessing.dummy Pool.map

2016-07-18 Thread Antti Haapala

Antti Haapala added the comment:

more easily reproducible by

import threading
def foo():
    raise
threading.Thread(target=foo).start()

--

___
Python tracker 
<http://bugs.python.org/issue27558>
___



[issue27558] SystemError inside multiprocessing.dummy Pool.map

2016-07-18 Thread Antti Haapala

Antti Haapala added the comment:

I was thinking that perhaps an exception is always raised somewhere before? I 
tried skipping site, but it still works, so I am not too sure.

--

___
Python tracker 
<http://bugs.python.org/issue27558>
___



[issue27560] zlib.compress() crash and keyboard interrupt stops working

2016-07-18 Thread Antti Haapala

Antti Haapala added the comment:

I am pretty sure **it never calls zlib.compress**. I get memory error from that 
argument alone, on Linux with overcommit memory enabled, 16G ram, swap.

--
nosy: +ztane

___
Python tracker 
<http://bugs.python.org/issue27560>
___



[issue1621] Do not assume signed integer overflow behavior

2016-07-19 Thread Antti Haapala

Antti Haapala added the comment:

About shifts, according to C standard, right shifts >> of negative numbers are 
implementation-defined:

   "[in E1 >> E2], If E1 has a signed type and a negative value, the
   resulting value is implementation-defined."

In K&R this meant that the number can be either zero-extended or sign-extended. 
In any case it cannot lead to undefined behaviour, but the implementation must 
document what is happening there. Now, GCC says that >> is always 
arithmetic/sign-extended. This is the implementation-defined behaviour, now GCC 
has defined what its implementation will do, but some implementation can choose 
zero-extension. (unlikely)

As for the other part as it says "GCC does not use the latitude given in C99 
only to treat certain aspects of signed ‘<<’ as undefined". But GCC6 will 
diagnose that now with -Wextra, and thus it changed already, as with -Werror 
-Wextra the code doesn't necessarily compile any longer, which is fine. Note 
that this "certain -- only" refers to that section where the C99 and C11 
explicitly say that the behaviour is undefined and C89 doesn't say anything. It 
could as well be argued that in C89 it is undefined by omission.

Additionally all shifts that shift by more than or equal to the width *still* 
have undefined behaviour (as do shifts by negative amount). IIRC they work 
differently on ARM vs x86: in x86 the shift can be mod 32 on 386, and in ARM, 
mod 256.

--

___
Python tracker 
<http://bugs.python.org/issue1621>
___



[issue27578] inspect.findsource raises exception with empty __init__.py

2016-07-21 Thread Antti Haapala

Antti Haapala added the comment:

Or perhaps getlines should return [''] for empty regular files?

--
nosy: +ztane

___
Python tracker 
<http://bugs.python.org/issue27578>
___



[issue27578] inspect.findsource raises exception with empty __init__.py

2016-07-21 Thread Antti Haapala

Antti Haapala added the comment:

It must be noted that `getlines` itself is not documented, and thus there is no 
backwards-compatibility to preserve really. `getline` returns '' for *any* 
erroneous line, so it wouldn't affect it.
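
The `getline` behaviour is easy to check against an empty file. A small sketch using a temporary directory; what `getlines` returns for the same file is left unasserted here, since that is the part under discussion.

```python
import linecache
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "__init__.py")
    open(path, "w").close()  # an empty regular file

    first_line = linecache.getline(path, 1)    # no such line in an empty file
    far_line = linecache.getline(path, 1000)   # far past the end
    all_lines = linecache.getlines(path)

linecache.clearcache()
```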

--

___
Python tracker 
<http://bugs.python.org/issue27578>
___



[issue27078] Make f'' strings faster than .format: BUILD_STRING opcode?

2016-07-31 Thread Antti Haapala

Antti Haapala added the comment:

I would very much like to see this in 3.6. Who could review it?

--

___
Python tracker 
<http://bugs.python.org/issue27078>
___



[issue1621] Do not assume signed integer overflow behavior

2016-07-31 Thread Antti Haapala

Antti Haapala added the comment:

I don't believe Python would really ever work on a platform with 
non-8-bit bytes; I believe there are way too many assumptions *everywhere*. You 
can program in C on such a platform, yes, but I am not that sure about Python.

And on 8-bit-byte platforms, there is no large model with 16-bit pointers 
anywhere. There just are not enough bits for multiple 64k 
byte-addressable segments that are addressed with 16-bit pointers. 

It might be that some obscure platform in the past would have had 128k memory, 
with large pointers, 2 allocatable 64k segments, >16 bit char pointer and 
16-bit object pointers pointing to even bytes, but I doubt you'd be really 
porting *Python 3* to such a platform, basically we're talking about something 
like Commodore 128 here.

--

___
Python tracker 
<http://bugs.python.org/issue1621>
___



[issue27703] Replace two Py_XDECREFs with Py_DECREFs in do_raise

2016-08-07 Thread Antti Haapala

Antti Haapala added the comment:

Normally I wouldn't recommend changing working code. However those asserts 
would be OK; if either of them is NULL, then the previous if would have had 
undefined behaviour already. Thus the `XDECREF` wrongly signals that it'd be OK 
if they were NULLs until this point, which is not true.

I'd rather see more asserts in the code; would be a big aid in possible 
refactoring; now for example `PyErr_SetObject` checks twice and thrice if 
either of the arguments is NULL; would be nice to go see the call site and see 
asserts in place there, showing that the arguments never were NULL to begin 
with.

--
nosy: +ztane

___
Python tracker 
<http://bugs.python.org/issue27703>
___



[issue27703] Replace two Py_XDECREFs with Py_DECREFs in do_raise

2016-08-08 Thread Antti Haapala

Antti Haapala added the comment:

No, I was just trying to explain why your change could be considered beneficial.

--

___
Python tracker 
<http://bugs.python.org/issue27703>
___



[issue27720] decimal.Context.to_eng_string wrong docstring

2016-08-09 Thread Antti Haapala

New submission from Antti Haapala:

https://docs.python.org/3/library/decimal.html#decimal.Context.to_eng_string

The docstring for `Context.to_eng_string` says "Converts a number to a string, 
using scientific notation.", which is, less extra comma, exactly the docstring 
for `Context.to_sci_string`. It should probably say "using engineering 
notation".

Additionally, docstring for Decimal.to_eng_string uses the term "an 
engineering-type string", which no one uses outside the said docstring. It 
should probably also say "Convert to a string using engineering notation."

--
assignee: docs@python
components: Documentation
messages: 272259
nosy: docs@python, ztane
priority: normal
severity: normal
status: open
title: decimal.Context.to_eng_string wrong docstring

___
Python tracker 
<http://bugs.python.org/issue27720>
___



[issue27687] Linux shutil.move between mountpoints as root does not retain ownership

2016-08-09 Thread Antti Haapala

Antti Haapala added the comment:

And as it is documented, it would be a change against documentation.
However as a stop-gap it is rather trivial to make your own copy function to 
fix this. copy2 returns the actual destination, so you could do

    def copy_with_ownership(src, dest, *, follow_symlinks=True):
        actual_dest = copy2(src, dest, follow_symlinks=follow_symlinks)
        fix_ownership(src, actual_dest)
        return actual_dest

implement fix_ownership to do what it needs to do, and pass copy_with_ownership 
as the copy_function argument to move.
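
Put together, it could look something like the sketch below. The `fix_ownership` body is only one possible implementation (Unix only, and changing the owner to another user needs sufficient privileges); the temporary files are there just to make the example runnable.

```python
import os
import shutil
import tempfile
from shutil import copy2


def fix_ownership(src, dest):
    """Hypothetical helper: give dest the same owner and group as src."""
    st = os.stat(src)
    os.chown(dest, st.st_uid, st.st_gid)


def copy_with_ownership(src, dest, *, follow_symlinks=True):
    actual_dest = copy2(src, dest, follow_symlinks=follow_symlinks)
    fix_ownership(src, actual_dest)
    return actual_dest


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "src.txt")
    dst = os.path.join(tmp, "moved.txt")
    with open(src, "w") as f:
        f.write("data")
    owner_before = os.stat(src).st_uid

    # move() only calls copy_function when a plain rename is not possible,
    # for example across mount points.
    shutil.move(src, dst, copy_function=copy_with_ownership)

    owner_after = os.stat(dst).st_uid
    moved = os.path.exists(dst) and not os.path.exists(src)
```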

--
nosy: +ztane

___
Python tracker 
<http://bugs.python.org/issue27687>
___



[issue27720] decimal.Context.to_eng_string wrong docstring

2016-08-11 Thread Antti Haapala

Antti Haapala added the comment:

Raymond: your doc patch is not quite right. Decimal('123e1') is converted to 
Decimal('1.23e3') internally already; so that str(d) will print 1.23e3, 
scientific notation of that number is '1.23e3' and engineering notation is 
'1.23e3', thus not a good example. A better example would be  Also, the 
engineering notation is a string, not a Decimal instance.

Also, now that I test it, the whole `to_eng_string` seems to be utterly broken, 
same applies to "to_sci_string". They do not print in scientific notation if 
the exponent in the original number was 0:

decimal.Decimal('123456789101214161820222426.0e0').to_eng_string()

And all operations with decimal will now generate numbers with exponent of 0 if 
it is within their precision, so no engineering notation is ever printed, duh.

--

___
Python tracker 
<http://bugs.python.org/issue27720>
___



[issue26223] decimal.to_eng_string() does not implement engineering notation in all cases.

2016-08-11 Thread Antti Haapala

Antti Haapala added the comment:

Indeed engineering notation is now utterly broken, the engineering notation is 
not printed for pretty much _any *engineering* numbers at all_ in 3.6. 
Engineering numbers mean numbers that could be met in an *engineering* context, 
not cosmological!

--
nosy: +ztane

___
Python tracker 
<http://bugs.python.org/issue26223>
___



[issue26223] decimal.to_eng_string() does not implement engineering notation in all cases.

2016-08-11 Thread Antti Haapala

Antti Haapala added the comment:

Ok, after reading the "spec" it seems that the engineering exponent is indeed 
printed for positive exponents *if* the precision of the number is less than 
the digits of the exponent, which I didn't realize that I should be testing. 

However the *precision* of decimals is meaningless anyhow. Add a very precisely 
measured '0e0' to any number and the sum also has exponent of 0, and is thus 
never displayed in exponential notation.
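
The effect described above can be reproduced with a short sketch. The value `1.23E+5` is an illustrative choice, not taken from the thread.

```python
from decimal import Decimal

measured = Decimal('1.23E+5')   # exponent 3, three significant digits
zero = Decimal('0e0')           # exponent 0

# The sum takes the smaller of the two exponents, which is 0 here.
total = measured + zero

print(measured.to_eng_string())
print(total.to_eng_string())
```

After the addition the value is numerically unchanged, but its exponent is 0 and it is therefore printed as plain digits.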

--

___
Python tracker 
<http://bugs.python.org/issue26223>
___



[issue27720] decimal.Context.to_eng_string wrong docstring

2016-08-11 Thread Antti Haapala

Antti Haapala added the comment:

@Stefan after reading the bad standard I agree that it follows the standard, as 
unfortunate as it is.

However, that part is then also wrong in Raymond's documentation patch. It 
should be something like: the exponent is adjusted to a multiple of 3 if *any* 
exponent is to be shown, and exponent is shown only if the exponent is larger 
than there are significant figures in the number, or if it is less than or 
equal to -6, or something alike.
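
As a rough sketch of that rule for finite values: an exponent is shown unless the stored exponent is at most 0 and the adjusted exponent is at least -6. The helper name `uses_exponent` and the sample values are assumptions made for illustration.

```python
from decimal import Decimal


def uses_exponent(d):
    """Predict whether to_eng_string() shows an exponent (finite values only)."""
    exponent = d.as_tuple().exponent
    return not (exponent <= 0 and d.adjusted() >= -6)


samples = ['123456789101214161820222426.0e0', '1.23E+5', '0.000001', '1E-7']
for text in samples:
    d = Decimal(text)
    print(text, uses_exponent(d), d.to_eng_string())
```

When an exponent is shown, `to_eng_string` picks one that is a multiple of 3, so `1E-7` comes out as `100E-9`.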

Or perhaps it should say "This is not the notation you are looking for."

--

___
Python tracker 
<http://bugs.python.org/issue27720>
___



[issue23545] Turn on extra warnings on GCC

2016-08-11 Thread Antti Haapala

Antti Haapala added the comment:

I don't think adding -Wno-type-limits is a good idea.

The real question is how that can happen at all: how can PY_SSIZE_T_MAX divided 
by sizeof anything be *more* than max(size_t)? Now that I stare at the code, 
*that* warning should be impossible if everything is correct. It means either 
that the RHS is negative or that size_t is defined to be 32-bit in this 
compilation unit whereas PY_SSIZE_T is 64-bit. Neither sounds like a good idea.

--
nosy: +ztane

___
Python tracker 
<http://bugs.python.org/issue23545>
___



[issue23545] Turn on extra warnings on GCC

2016-08-12 Thread Antti Haapala

Antti Haapala added the comment:

Ah, indeed, I somehow missed that. Though, there is no good reason for it being 
unsigned either, as the actual type in the SSL APIs is int. Another 
argument of type int is cast to unsigned just for the comparison on line 4419, 
and unsigned int counters i and j are used in function _setup_ssl_threads.

The variable could be safely changed to `size_t` (along with those index 
variables) without it affecting anything at all, as it is a static variable 
within that module and only used to hold a size of an array, and never passed 
back to another function.

--

___
Python tracker 
<http://bugs.python.org/issue23545>
___



[issue27128] Add _PyObject_FastCall()

2016-08-12 Thread Antti Haapala

Antti Haapala added the comment:

About "I hesitate between the C types "int" and "Py_ssize_t" for nargs. I read 
once that using "int" can cause performance issues on a loop using "i++" and 
"data[i]" because the compiler has to handle integer overflow of the int type."

This is true because of -fwrapv, but I believe it is true also for Py_ssize_t 
which is also of signed type. However, there would be a speed-up achievable by 
disabling -fwrapv, because only then the i++; data[i] can be safely optimized 
into *(++data)

--
nosy: +ztane

___
Python tracker 
<http://bugs.python.org/issue27128>
___



[issue27742] Random.seed(5, version=1) generates different values in PYthon2 and Python3

2016-08-12 Thread Antti Haapala

Antti Haapala added the comment:

It is this change in 3.2:

randrange is more sophisticated about producing equally distributed
values.  Formerly it used a style like ``int(random()*n)`` which
could produce slightly uneven distributions.

-            return self._randbelow(istart)
+            if istart >= maxwidth:
+                return self._randbelow(istart)
+            return int(self.random() * istart)

by rhettinger. Since there have not been any regression tests checking that the 
seeded numbers stay compatible, they don't. Perhaps it would be a good idea to 
*add* such tests.

--
nosy: +ztane

___
Python tracker 
<http://bugs.python.org/issue27742>
___



[issue27742] Random.seed(5, version=1) generates different values in PYthon2 and Python3

2016-08-12 Thread Antti Haapala

Antti Haapala added the comment:

Sorry + and - are backwards there (I did the delta in wrong direction); + is 
before, and - after Raymond's commit. The `if` was retained there for 
backward-compatibility.

--

___
Python tracker 
<http://bugs.python.org/issue27742>
___



[issue27742] Random.seed(5, version=1) generates different values in PYthon2 and Python3

2016-08-12 Thread Antti Haapala

Antti Haapala added the comment:

but yes, now that I read the documentation, the 3.5 docs say very explicitly 
that:

  Two aspects are guaranteed not to change:

- If a new seeding method is added, then a backward compatible seeder will 
be offered.
- The generator’s random() method will continue to produce the same 
sequence when the compatible seeder is given the same seed.

thus no guarantee is given about any other method at all, including randrange 
and randint.
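
A minimal sketch of what that guarantee covers: with the compatible seeder and the same seed, `random()` repeats its sequence. The helper `first_values` is a name invented for this illustration. A real regression test would compare against literal values recorded from an earlier release rather than against a second run in the same process.

```python
import random


def first_values(seed, count=3):
    """Return the first few random() outputs for a version-1 seed."""
    rng = random.Random()
    rng.seed(seed, version=1)
    return [rng.random() for _ in range(count)]


a = first_values(5)
b = first_values(5)
print(a == b)
```

Nothing comparable is promised for `randrange` or `randint`, which is why code that needs stable integers across versions has to build them from `random()` itself.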

--

___
Python tracker 
<http://bugs.python.org/issue27742>
___



[issue27742] Random.randint generates different values in Python2 and Python3

2016-08-12 Thread Antti Haapala

Antti Haapala added the comment:

Anyhow, in this case it is easy to simulate the Python 2 randint behaviour (add 
checks for hi >= lo if needed):

>>> random.seed(5, version=1)
>>> randint_compat = lambda lo, hi: lo + int(random.random() * (hi + 1 - lo))
>>> randint_compat(0, 9999999)
6229016
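
The same idea written as a function, with the range check mentioned above added. The `rng` parameter and the error message are assumptions of this sketch, not part of the original one-liner.

```python
import random


def randint_compat(lo, hi, rng=random):
    """Python 2 style randint: scale random() instead of drawing random bits."""
    if hi < lo:
        raise ValueError("empty range for randint_compat(%r, %r)" % (lo, hi))
    return lo + int(rng.random() * (hi + 1 - lo))


random.seed(5, version=1)
value = randint_compat(0, 999)
print(value)
```

Because the result depends only on `random()`, it falls under the documented reproducibility guarantee, at the cost of the slightly uneven distribution that the newer `randrange` avoids.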

--

___
Python tracker 
<http://bugs.python.org/issue27742>
___



[issue27752] CSV DictReader default dialect name 'excel' is misleading, as MS Excel doesn't actually use ', ' as a separator.

2016-08-13 Thread Antti Haapala

Antti Haapala added the comment:

Excel's behaviour has always been locale-dependent. If the user's locale uses , 
as the decimal mark, then ; has been used as the column separator in "C"SV. 
However, even if you use autodetection with sniff, it is impossible to detect 
with 100% accuracy; e.g., is the following csv row comma or semicolon separated:

1,2;3;4,5;6,7;8;9

The dialect could be documented better though, as currently it simply says:

The excel class defines the usual properties of an Excel-generated CSV 
file. It is registered with the dialect name 'excel'.

And there really should be a separate dialect for Excel-semicolon separated 
values, as a couple billion people would see ; in their CSV.
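
Until such a dialect exists in the standard library, one can be registered locally. In this sketch the class name `ExcelSemicolon` and the dialect name `'excel-semicolon'` are hypothetical; only `csv.excel` and `csv.register_dialect` are real library names.

```python
import csv
import io


class ExcelSemicolon(csv.excel):
    """Hypothetical dialect: Excel conventions with ';' as the delimiter."""
    delimiter = ';'


csv.register_dialect('excel-semicolon', ExcelSemicolon)

row_text = '1,2;3;4,5;6,7;8;9'
as_semicolon = next(csv.reader(io.StringIO(row_text), dialect='excel-semicolon'))
as_comma = next(csv.reader(io.StringIO(row_text), dialect='excel'))

print(as_semicolon)
print(as_comma)
```

Both readings of the ambiguous row parse without error, which is why the delimiter has to be chosen by the caller rather than guessed.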

--
nosy: +ztane

___
Python tracker 
<http://bugs.python.org/issue27752>
___



[issue27794] setattr a read-only property; the AttributeError should show the attribute that failed

2016-08-18 Thread Antti Haapala

New submission from Antti Haapala:

Today we had an internal server error in production. I went to see the sentry 
logs for the error, and was dismayed: the error was `AttributeError: can't set 
attribute`, and the faulting line was `setattr(obj, attr, value)`, which happens 
in a for-loop that uses computed properties coming from who knows where.

Well, I quickly worked out that it was because the code was trying to set a 
value to a read-only property descriptor; the only problem was to find out 
*which of all these read-only properties* it was trying to set:

Python 3.6.0a3+ (default, Aug 11 2016, 11:45:31) 
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> class Foo:
...     @property
...     def bar(self): pass
... 
>>> setattr(Foo(), 'bar', 42)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: can't set attribute

Could we change this for Python 3.6 so that the message for this could include 
the name of the property just like `AttributeError: has no attribute 'baz'` 
does?
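
Until the interpreter does this itself, a wrapper at the call site can add the missing name. This is only a sketch; `setattr_named` is a name invented here, and the check for an already-present name is an assumption meant to cover interpreter versions whose message mentions the attribute.

```python
def setattr_named(obj, name, value):
    """Like setattr(), but make sure a failure mentions the attribute name."""
    try:
        setattr(obj, name, value)
    except AttributeError as exc:
        if name in str(exc):
            raise  # the message already names the attribute
        raise AttributeError(
            "%s (attribute %r of %s object)" % (exc, name, type(obj).__name__)
        ) from exc


class Foo:
    @property
    def bar(self):
        pass


try:
    setattr_named(Foo(), 'bar', 42)
except AttributeError as exc:
    message = str(exc)
    print(message)
```

In a loop over computed attribute names, this turns an anonymous failure into one that points at the offending property.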

--
messages: 273027
nosy: ztane
priority: normal
severity: normal
status: open
title: setattr a read-only property; the AttributeError should show the 
attribute that failed
type: enhancement

___
Python tracker 
<http://bugs.python.org/issue27794>
___



[issue27794] setattr a read-only property; the AttributeError should show the attribute that failed

2016-08-18 Thread Antti Haapala

Antti Haapala added the comment:

Unfortunately it seems that it is not that straightforward. The descriptor 
object doesn't know the name of the property. The error is raised in 
`property_descr_set`. However the error itself can be propagated from setting 
another property.

--

___
Python tracker 
<http://bugs.python.org/issue27794>
___



[issue27805] In Python 3, open('/dev/stdout', 'a') raises OSError with errno=ESPIPE

2016-08-20 Thread Antti Haapala

Antti Haapala added the comment:

Presumably the case was that a *named* log file is opened with 'a' mode, and 
one could pass '/dev/stdout' just like any other name of a file, and it did 
work, but not in Python 3.5.

--
nosy: +ztane

___
Python tracker 
<http://bugs.python.org/issue27805>
___



[issue27805] In Python 3, open('/dev/stdout', 'a') raises OSError with errno=ESPIPE

2016-08-21 Thread Antti Haapala

Antti Haapala added the comment:

Yeah, it definitely is a bug in CPython. open(mode='a') should always append to 
the end of the given file. 

If you're writing an append-only text log to some file-like object, that's the 
mode you use, not some version/platform/filesystem specific voodoo to find out 
what's the least incorrect way to work around Python implementation 
deficiencies.

--

___
Python tracker 
<http://bugs.python.org/issue27805>
___



[issue27794] setattr a read-only property; the AttributeError should show the attribute that failed

2016-08-22 Thread Antti Haapala

Antti Haapala added the comment:

I've got one idea about how to implement this, but it would require adding a 
new flag field to PyExc_AttributeError type.

This flag, if set, would tell that the AttributeError in question was raised in 
C descriptor code or under similar circumstances, and that the attribute name 
was not known, and thus it is OK for setattr/delattr and attribute lookups to 
append ": attributename" to the end of the message, then clear the flag; then 
all those places that raise AttributeError in __get__, __set__, __delete__ would 
just need to set this flag.

--

___
Python tracker 
<http://bugs.python.org/issue27794>
___



[issue27078] Make f'' strings faster than .format: BUILD_STRING opcode?

2016-08-27 Thread Antti Haapala

Antti Haapala added the comment:

So does this (new opcode) count as a new feature? It would be great to give f'' 
strings a flying start, saying that not only they're cool, they're also faster 
than anything that you've used before.

Here are some more mini-benchmarks with serhiy's patch2 applied; the times are 
pretty stable:

% python3.6 -mtimeit -s 'x = 42' '"%s-" % x'
10000000 loops, best of 3: 0.184 usec per loop

% python3.6 -mtimeit -s 'x = 42' 'f"{x}-"' 
10000000 loops, best of 3: 0.142 usec per loop

and

% python3.6 -mtimeit -s 'x = "42"' 'f"{x}{x}"'
10000000 loops, best of 3: 0.0709 usec per loop

% python3.6 -mtimeit -s 'x = "42"' '"%s%s" % (x,x)'
1000000 loops, best of 3: 0.213 usec per loop

% python3.6 -mtimeit -s 'x = "42"' '"".join((x, x))'
10000000 loops, best of 3: 0.155 usec per loop

This is only really achievable with some kind of bytecode support.
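
The comparison can be repeated from a script with the `timeit` module. This is a sketch under stated assumptions: the loop count of 100,000 is arbitrary, the lambdas add call overhead that the command-line runs above do not have, and the numbers will differ by interpreter version and machine.

```python
import timeit

x = "42"
candidates = {
    "f-string": lambda: f"{x}{x}",
    "%-format": lambda: "%s%s" % (x, x),
    "str.join": lambda: "".join((x, x)),
}

# All three spellings must build the same string for the timing to be fair.
results = {name: fn() for name, fn in candidates.items()}

loops = 100_000
timings = {
    name: min(timeit.repeat(fn, number=loops, repeat=3))
    for name, fn in candidates.items()
}
for name, seconds in timings.items():
    print(f"{name}: {seconds / loops * 1e6:.4f} usec per loop")
```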

--

___
Python tracker 
<https://bugs.python.org/issue27078>
___



[issue24651] Mock.assert* API is in user namespace

2015-07-21 Thread Antti Haapala

Changes by Antti Haapala :


--
nosy: +ztane

___
Python tracker 
<http://bugs.python.org/issue24651>
___



[issue24653] Mock.assert_has_calls([]) is surprising for users

2015-07-21 Thread Antti Haapala

Changes by Antti Haapala :


--
nosy: +ztane

___
Python tracker 
<http://bugs.python.org/issue24653>
___



[issue25070] Python 2.6 - Python 3.4 allows unparenthesized generator with *args, **kw, forbidden in 3.5

2015-09-11 Thread Antti Haapala

New submission from Antti Haapala:

User DeTeReR asked a question 
(http://stackoverflow.com/questions/32521140/generator-as-function-argument) on 
Stack Overflow about a special case of code that seemed to work in Python 3.4:

f(1 for x in [1], *[2])

and

f(*[2], 1 for x in [1])

I found out that when Python 2.6 introduced the "keyword arguments after 
*args", the checks in ast.c did not follow:

for (i = 0; i < NCH(n); i++) {
    node *ch = CHILD(n, i);
    if (TYPE(ch) == argument) {
        if (NCH(ch) == 1)
            nargs++;
        else if (TYPE(CHILD(ch, 1)) == gen_for)
            ngens++;
        else
            nkeywords++;
    }
}
if (ngens > 1 || (ngens && (nargs || nkeywords))) {
    ast_error(n, "Generator expression must be parenthesized "
              "if not sole argument");
    return NULL;
}

the *args, **kwargs were not considered to be of type "argument" by the 
Grammar, and thus the error was not generated in this case.

Further down, the error "non-keyword arg after keyword arg" was not triggered 
in the case of sole unparenthesized generator expression.

Now, the parsing changes in 3.5 have disallowed all of these constructs:

f(1 for i in [42], **kw)
f(1 for i in [42], *args)
f(*args, 1 for i in [42])

which were (erroneously) allowed in previous versions.

I believe at least 3.5 release notes should mention this change.
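
Whether a given interpreter accepts these calls can be checked without running them, by compiling the source text. The helper name `is_valid_call` is invented for this sketch, and the results in the usage note assume Python 3.5 or later.

```python
def is_valid_call(source):
    """Return True if `source` compiles as an expression on this interpreter."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True


calls = [
    "f(1 for x in [1])",          # sole argument: allowed
    "f(1 for x in [1], *[2])",    # unparenthesized next to *args
    "f((1 for x in [1]), *[2])",  # parenthesized: always allowed
]
verdicts = {call: is_valid_call(call) for call in calls}
for call, ok in verdicts.items():
    print(call, ok)
```

On 3.5 and later only the second form is rejected; adding the parentheses, as in the third form, is the portable spelling.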

--
components: Interpreter Core
messages: 250468
nosy: ztane
priority: normal
severity: normal
status: open
title: Python 2.6 - Python 3.4 allows unparenthesized generator with *args, 
**kw, forbidden in 3.5
type: behavior
versions: Python 2.7, Python 3.2, Python 3.3, Python 3.4, Python 3.5

___
Python tracker 
<http://bugs.python.org/issue25070>
___



[issue25070] Python 2.6 - Python 3.4 allows unparenthesized generator with *args, **kw, forbidden in 3.5

2015-09-11 Thread Antti Haapala

Antti Haapala added the comment:

@haypo yes.

I must add that I found out that Python 2.5 also allows 

f(1 for x in [1], *a)

and 

f(1 for x in [1], **kw)

but not

f(*a, 1 for x in [1])

So I do not know if the first and second cases were intentional or not.
Also, in Python 2.6 - 3.4, f(*a, 1 for x in [1]) provides the generator as the 
*first* positional argument, in 3.5 it'd be the last one.

--

___
Python tracker 
<http://bugs.python.org/issue25070>
___



[issue25070] Python 2.6 - Python 3.4 allows unparenthesized generator with *args, **kw, forbidden in 3.5

2015-09-11 Thread Antti Haapala

Antti Haapala added the comment:

Yeah, it is a bug in 2.5 too; https://docs.python.org/2.5/ref/calls.html

call ::= primary "(" [argument_list [","]
 | expression genexpr_for] ")"

--
assignee:  -> docs@python
components: +Documentation
nosy: +docs@python

___
Python tracker 
<http://bugs.python.org/issue25070>
___



[issue26107] PEP 511: code.co_lnotab: use signed line number delta to support moving instructions in an optimizer

2016-01-19 Thread Antti Haapala

Antti Haapala added the comment:

Nice work, my issue21385 is also related. Basically, transforming non-Python 
code into Python meant that all line number information, which otherwise would 
have been useful for debugging, had to be discarded, or debug builds of Python 
would dump cores.

So, bye "assert(d_lineno >= 0);", you won't be missed.

--
nosy: +ztane

___
Python tracker 
<http://bugs.python.org/issue26107>
___



[issue26261] NamedTemporaryFile documentation is vague about the `name` attribute

2016-02-01 Thread Antti Haapala

New submission from Antti Haapala:

The documentation for NamedTemporaryFile is a bit vague. It says

[--] That name can be retrieved from the name attribute of the file object. 
[--] The returned object is always a file-like object whose file
attribute is the underlying true file object. This file-like object can be used 
in a with statement, just like a normal file.

That `file-like object` vs `true file object` made me assume that I need to do

f = NamedTemporaryFile()
f.file.name

to get the filename, which sort of worked, but I only later realized that 
`f.file.name` is actually the file descriptor number on Linux, i.e. an 
integer. Thus I suggest that the one sentence be changed to "That name can be 
retrieved from the name attribute of the returned file-like object."
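
A short sketch of the intended usage: the path is read from the returned wrapper object. The variable names are illustrative, and what `f.file.name` contains should be treated as an implementation detail that may differ between Python versions and platforms.

```python
import os
import tempfile

with tempfile.NamedTemporaryFile() as f:
    path = f.name                      # the documented way to get the path
    is_text_path = isinstance(path, str)
    exists_while_open = os.path.exists(path)
    inner_name = f.file.name           # reported as an int fd on Linux; do not rely on it

exists_after_close = os.path.exists(path)
print(path, type(inner_name).__name__)
```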

--
assignee: docs@python
components: Documentation
messages: 259334
nosy: docs@python, ztane
priority: normal
severity: normal
status: open
title: NamedTemporaryFile documentation is vague about the `name` attribute

___
Python tracker 
<http://bugs.python.org/issue26261>
___



[issue26261] NamedTemporaryFile documentation is vague about the `name` attribute

2016-02-01 Thread Antti Haapala

Changes by Antti Haapala :


--
type:  -> enhancement

___
Python tracker 
<http://bugs.python.org/issue26261>
___



[issue26358] mmap.mmap.__iter__ is broken (yields bytes instead of ints)

2016-02-13 Thread Antti Haapala

New submission from Antti Haapala:

Just noticed when answering a question on StackOverflow 
(http://stackoverflow.com/q/35387843/918959) that on Python 3 iterating over a 
mmap object yields individual bytes as bytes objects, even though iterating 
over slices, indexing and so on gives ints

Example:

import mmap

with open('test.dat', 'rb') as f:
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    for b in mm:
        print(b)
        # prints for example b'A' instead of 65
    mm.close()

I believe this should be fixed for the sake of completeness - the documentation 
says that "Memory-mapped file objects behave like both bytearray and like file 
objects." - however the current behaviour is neither like a bytearray nor like 
a file object, and quite confusing.

Similarly the `in` operator seems to be broken; one could search for space 
using `32 in bytesobj`, which would work for slices but not for the whole mmap 
object.
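
The inconsistency, and one way around it, can be shown in a self-contained sketch. A temporary file stands in for `test.dat`, and iterating over a slice is an assumed workaround rather than anything proposed in the report; it copies the mapped data into a bytes object first.

```python
import mmap
import tempfile

with tempfile.TemporaryFile() as f:
    f.write(b'ABC')
    f.flush()
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        indexed = mm[0]          # indexing gives an int
        iterated = list(mm)      # the report observed 1-byte bytes objects here
        via_slice = list(mm[:])  # a slice is a bytes object, whose items are ints

print(indexed, iterated, via_slice)
```

For large mappings the copy made by `mm[:]` may be unacceptable; indexing with `range(len(mm))` avoids it.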

--
messages: 260261
nosy: ztane
priority: normal
severity: normal
status: open
title: mmap.mmap.__iter__ is broken (yields bytes instead of ints)
type: behavior
versions: Python 3.2, Python 3.3, Python 3.4, Python 3.5, Python 3.6

___
Python tracker 
<http://bugs.python.org/issue26358>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22234] urllib.parse.urlparse accepts any falsy value as an url

2016-02-22 Thread Antti Haapala

Antti Haapala added the comment:

I believe `urlparse` should throw a `TypeError` if not isinstance(url, (str, 
bytes))
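
In the meantime, callers who want that behaviour can wrap the function. The name `strict_urlparse` and the error text are assumptions of this sketch.

```python
from urllib.parse import urlparse


def strict_urlparse(url, *args, **kwargs):
    """Call urlparse(), but reject anything that is not str or bytes."""
    if not isinstance(url, (str, bytes)):
        raise TypeError(
            "url must be str or bytes, not %s" % type(url).__name__
        )
    return urlparse(url, *args, **kwargs)


parsed = strict_urlparse('http://example.com/a')
print(parsed.netloc)
```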

--

___
Python tracker 
<http://bugs.python.org/issue22234>
___



[issue26477] typing forward references and module attributes

2016-03-03 Thread Antti Haapala

Antti Haapala added the comment:

Indeed, the assumption would be that if a string is used, it is used there because 
the actual thing cannot be referenced by name at that point. Then trying to 
evaluate it at all would be an optimization in only those cases where it is 
used incorrectly / needlessly.

--
nosy: +ztane

___
Python tracker 
<http://bugs.python.org/issue26477>
___



[issue25973] Segmentation fault with nonlocal and two underscores

2016-03-04 Thread Antti Haapala

Antti Haapala added the comment:

So no fix for 3.4 for an obvious SIGSEGV?

% python3  
Python 3.4.3 (default, Mar 26 2015, 22:03:40) 
[GCC 4.9.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> class A:
...     def f(self):
...         nonlocal __x
... 
[4]    19173 segmentation fault (core dumped)  python3

--
nosy: +ztane

___
Python tracker 
<http://bugs.python.org/issue25973>
___



[issue26495] super() does not work nested

2016-03-06 Thread Antti Haapala

New submission from Antti Haapala:

super() without arguments is

--
messages: 261264
nosy: ztane
priority: normal
severity: normal
status: open
title: super() does not work nested

___
Python tracker 
<http://bugs.python.org/issue26495>
___



[issue26495] super() does not work in nested functions, genexps, listcomps, and gives misleading exceptions

2016-03-06 Thread Antti Haapala

Antti Haapala added the comment:

super() without arguments gives proper "super() without arguments" in 
functions, generator functions nested in methods, if *those* do not have 
arguments. But if you use super() in a nested function that takes an argument, 
or in a generator expression or a comprehension, you'd get 

Got exception: TypeError super(type, obj): obj must be an instance or 
subtype of type

which is really annoying. Furthermore, if a nested function took another 
instance of type(self) as the first argument, then super() could refer 
unexpectedly to wrong instance:

class Bar(Foo):
    def calculate(self, other_foos):
        def complicated_calculation(other):
            super().some_method(other)

        for item in other_foos:
            complicated_calculation(item)

now the `super()` call would not have implied `self` of `calculate` as the 
first argument, but the `other` argument of the nested function, all without 
warnings.
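
A runnable sketch of the pitfall and of the explicit spelling that avoids it. The method names `calculate_implicit` and `calculate_explicit`, the bodies of `some_method`, and the integer items are all assumptions made for this illustration.

```python
class Foo:
    def some_method(self, other):
        return ("Foo.some_method", self, other)


class Bar(Foo):
    def some_method(self, other):
        return ("Bar.some_method", self, other)

    def calculate_implicit(self, items):
        def helper(other):
            # Zero-argument super() uses `other`, the helper's own first
            # argument, in place of the enclosing method's `self`.
            return super().some_method(other)
        return [helper(item) for item in items]

    def calculate_explicit(self, items):
        def helper(other):
            # Naming the class and instance removes the ambiguity.
            return super(Bar, self).some_method(other)
        return [helper(item) for item in items]


bar = Bar()
explicit = bar.calculate_explicit([1, 2])
try:
    bar.calculate_implicit([1, 2])
    implicit_error = None
except TypeError as exc:
    implicit_error = exc
print(explicit, implicit_error)
```

With integer items the implicit form fails loudly. With items that are themselves `Bar` instances it would run and silently dispatch on the wrong object, which is the more worrying case.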

I believe it is a mistake that these nested functions can see `__class__` at 
all, since it would mostly just lead to them misbehaving unexpectedly.

--
components: +Interpreter Core
title: super() does not work nested -> super() does not work in nested 
functions, genexps, listcomps, and gives misleading exceptions

___
Python tracker 
<http://bugs.python.org/issue26495>
___



[issue26549] co_stacksize is calculated from unoptimized code

2016-03-12 Thread Antti Haapala

New submission from Antti Haapala:

When answering a question on StackOverflow, I noticed that a function that only 
loads a constant tuple to a local variable still has a large `co_stacksize` as 
if it was built with BUILD_TUPLE.

e.g.

>>> def foo():
...     a = (1,2,3,4,5,6,7,8,9,10)
...
>>> foo.__code__.co_stacksize
10
>>> dis.dis(foo)
  2           0 LOAD_CONST              11 ((1, 2, 3, 4, 5, 6, 7, 8, 9, 10))
              3 STORE_FAST               0 (a)
              6 LOAD_CONST               0 (None)
              9 RETURN_VALUE

I suspect it is because in the `makecode` the stack usage is calculated from 
the unoptimized assembler output instead of the actual optimized bytecode. I do 
not know if there is any optimization that would increase the stack usage, but 
perhaps it should be calculated from the resulting output.
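
The observation can be re-checked on any interpreter with a short script. This is only a sketch; the printed stack size and the disassembly depend on the Python version, so no particular value is assumed here.

```python
import dis


def foo():
    a = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)


stacksize = foo.__code__.co_stacksize
# True when the compiler folded the tuple into a single constant.
constant_folded = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) in foo.__code__.co_consts

print(stacksize, constant_folded)
dis.dis(foo)
```

If the tuple is loaded as one constant but the reported stack size is still 10, the size was computed before the optimization ran.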

--
components: Interpreter Core
messages: 261668
nosy: ztane
priority: normal
severity: normal
status: open
title: co_stacksize is calculated from unoptimized code
versions: Python 3.6

___
Python tracker 
<http://bugs.python.org/issue26549>
___



[issue26601] Use new madvise()'s MADV_FREE on the private heap

2016-04-11 Thread Antti Haapala

Antti Haapala added the comment:

... and it turns out that munmapping is not always the smart thing to do: 
http://stackoverflow.com/questions/36548518/variable-assignment-faster-than-one-liner

--
nosy: +ztane

___
Python tracker 
<http://bugs.python.org/issue26601>
___



[issue26601] Use new madvise()'s MADV_FREE on the private heap

2016-04-11 Thread Antti Haapala

Antti Haapala added the comment:

> Maybe we need an heuristic to release the free arena after N calls to object 
> allocator functions which don't need this free arena.

That'd be my thought; again I believe that `madvise` could be useful there; now 
`mmap`/`munmap` I believe is particularly slow because it actually needs to 
supply 256kbytes of *zeroed* pages.

--

___
Python tracker 
<http://bugs.python.org/issue26601>
___



[issue26601] Use new madvise()'s MADV_FREE on the private heap

2016-04-11 Thread Antti Haapala

Antti Haapala added the comment:

I said that *munmapping* is not the smart thing to do: and it is not, if you're 
going to *mmap* soon again.

--

___
Python tracker 
<http://bugs.python.org/issue26601>
___



[issue26601] Use new madvise()'s MADV_FREE on the private heap

2016-04-11 Thread Antti Haapala

Antti Haapala added the comment:

Also, what is important to notice is that the behaviour occurs *exactly* because 
the current heuristics *work*; the allocations were successfully organized so 
that one arena could be freed as soon as possible. The question is whether it is 
sane to try to free the few bits of free memory asap. Say you're now holding 
100M of memory: it often does not matter much if you hold the 100M of memory 
for *one second longer* than you actually ended up needing.

--

___
Python tracker 
<http://bugs.python.org/issue26601>
___


