[issue19087] bytearray front-slicing not optimized

2015-04-17 Thread Martin Panter
Martin Panter added the comment: I think the changes for this issue are causing the crash and unexpected buffer expansion described in Issue 23985. Appending to a bytearray() can overstep the memory buffer because it doesn’t account for ob_start when checking for resizing. And “del” can expand

[issue19087] bytearray front-slicing not optimized

2013-10-05 Thread Antoine Pitrou
Antoine Pitrou added the comment: > Side effect of this change is that bytearray's data now can be > non-aligned. We should examine all places which relies on this. The C API makes no guarantees as to alignment of private data areas, so any external code relying on it would be incorrect. The re

[issue19087] bytearray front-slicing not optimized

2013-10-05 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Side effect of this change is that bytearray's data now can be non-aligned. We should examine all places which relies on this. -- ___ Python tracker

[issue19087] bytearray front-slicing not optimized

2013-10-05 Thread Antoine Pitrou
Antoine Pitrou added the comment: The commit produced compiled errors on Windows, but I've since fixed them. -- resolution: -> fixed stage: patch review -> committed/rejected status: open -> closed ___ Python tracker

[issue19087] bytearray front-slicing not optimized

2013-10-05 Thread Roundup Robot
Roundup Robot added the comment: New changeset 499a96611baa by Antoine Pitrou in branch 'default': Issue #19087: Improve bytearray allocation in order to allow cheap popping of data at the front (slice deletion). http://hg.python.org/cpython/rev/499a96611baa -- nosy: +python-dev __

[issue19087] bytearray front-slicing not optimized

2013-10-04 Thread Antoine Pitrou
Antoine Pitrou added the comment: > I meant that you can continue use self->ob_bytes instead of > PyByteArray_AS_STRING(self) if self->ob_bytes points not to the > start of physical buffer, but to the start of logical byte array. > *This* will simplify the patch a lot. It will make the diff smal

[issue19087] bytearray front-slicing not optimized

2013-10-04 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: I don't see much sense in differences between bytea_slice2.patch and bytea_slice3.patch, because bytea_slice3.patch is not smaller and simpler than bytea_slice2.patch. I meant that you can continue use self->ob_bytes instead of PyByteArray_AS_STRING(self) i

[issue19087] bytearray front-slicing not optimized

2013-10-03 Thread STINNER Victor
STINNER Victor added the comment: bytea_slice3.patch looks simpler than bytea_slice2.patch, I prefer it. -- ___ Python tracker ___ __

[issue19087] bytearray front-slicing not optimized

2013-10-03 Thread Antoine Pitrou
Antoine Pitrou added the comment: Here is a slightly modified patch implementing Serhiy's suggestion. -- Added file: http://bugs.python.org/file31954/bytea_slice3.patch ___ Python tracker __

[issue19087] bytearray front-slicing not optimized

2013-09-30 Thread STINNER Victor
STINNER Victor added the comment: I adapted my micro-benchmark to measure the speedup: bench_bytearray2.py. Result on bytea_slice2.patch: Common platform: CFLAGS: -Wno-unused-result -Werror=declaration-after-statement -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes CPU model: Intel(R) Core(

[issue19087] bytearray front-slicing not optimized

2013-09-30 Thread STINNER Victor
STINNER Victor added the comment: I took me some time, but Antoine explained me the use case on IRC :-) The patch is useful is the bytearray is used as a FIFO: remove front, append tail. It can be seen as an implementation for BufferedReader. Consumer/producer is a common pattern, especially c

[issue19087] bytearray front-slicing not optimized

2013-09-30 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: > A 4x improvement on a micro-benchmark is very likely to make a difference in at least some real-world code (while a 10% improvement wouldn't). If there is a code that uses the deleting from the beginning of a bytearray. I just pray to demonstrate this code.

[issue19087] bytearray front-slicing not optimized

2013-09-30 Thread Antoine Pitrou
Antoine Pitrou added the comment: Other benchmarks for the new patch (exercising FIFO-like behaviour: some data is appended at one end, and popped at the other): timeit -s "b=bytearray(10);s=b'x'*100" "b[:100] = b''; b.extend(s)" -> before: 4.07 usec per loop -> after: 0.812 usec per loop

[issue19087] bytearray front-slicing not optimized

2013-09-30 Thread Antoine Pitrou
Antoine Pitrou added the comment: However, the patch had a bug in the resizing logic. Here is a new patch fixing that (+ an additional test). -- Added file: http://bugs.python.org/file31926/bytea_slice2.patch ___ Python tracker

[issue19087] bytearray front-slicing not optimized

2013-09-30 Thread Antoine Pitrou
Antoine Pitrou added the comment: > I don't understand why you avoid to show any examples which benefit. > Shouldn't optimizing patches prove their efficient? Many micro-optimizations get committed without proving themselves on a high-level benchmark suite, as long as they produce a big enough d

[issue19087] bytearray front-slicing not optimized

2013-09-30 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: I don't understand why you avoid to show any examples which benefit. Shouldn't optimizing patches prove their efficient? -- ___ Python tracker __

[issue19087] bytearray front-slicing not optimized

2013-09-30 Thread Antoine Pitrou
Antoine Pitrou added the comment: > > The problem is the "suboptimal code" is also the natural way to > write such code. If you know a simple and idiomatic way to write an > optimal bytes FIFO, then please share it with us. > > Please share this written in the "natural way" real code with us. I

[issue19087] bytearray front-slicing not optimized

2013-09-30 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: > I'm not sure that it is what you expected: bytearray() is only initialized > once ("setup" of timeit). You probably want to reinitialize at each loop. There is no "setup" of timeit here. And you forgot bytes(b) after accumulating loop. bench_bytearray.py s

[issue19087] bytearray front-slicing not optimized

2013-09-29 Thread STINNER Victor
STINNER Victor added the comment: Oh, by the way: > $ ./python -m timeit "b = bytearray(); a = b'x'" "for i in range(1): b > += a" "bytes(b)" I'm not sure that it is what you expected: bytearray() is only initialized once ("setup" of timeit). You probably want to reinitialize at each l

[issue19087] bytearray front-slicing not optimized

2013-09-29 Thread STINNER Victor
STINNER Victor added the comment: > One of most used cases for bytearrays is accumulating. And the patch slow > down this case. Please don't use the raw timeit command for micro-benchmarks, it is not reliable. For example, I'm unable to reproduce your "slow down" (7% on a microbenchmark is no

[issue19087] bytearray front-slicing not optimized

2013-09-29 Thread Antoine Pitrou
Antoine Pitrou added the comment: > One of most used cases for bytearrays is accumulating. > And the patch slow down this case. I see no difference here. You are seeing a 10% slowdown, which is possibly a measurement glitch. The bottom line is that the performance remains approximately the sam

[issue19087] bytearray front-slicing not optimized

2013-09-29 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: > Is there a problem with that? No more than with msg198657. > Sorry, I don't get your point. It's not become Python is inefficient that > developers must develop workarounds. I'm not sure that "workarounds" are much worst than using this optimization. At

[issue19087] bytearray front-slicing not optimized

2013-09-29 Thread Antoine Pitrou
Antoine Pitrou added the comment: > > No, because PyByteArray_Resize() is always called afterwards to ensure > > that the buffer is resized when it gets below 50% usage. > > I.e. the bytearray is compacted from time to time. Is there a problem with that? -- ___

[issue19087] bytearray front-slicing not optimized

2013-09-29 Thread STINNER Victor
STINNER Victor added the comment: @Serhiy: "I doubt than any performance critical code do it instead of increasing an index in constant time." Sorry, I don't get your point. It's not become Python is inefficient that developers must develop workarounds. Antoine's patch is simple, elegant, and

[issue19087] bytearray front-slicing not optimized

2013-09-29 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: > No, because PyByteArray_Resize() is always called afterwards to ensure > that the buffer is resized when it gets below 50% usage. > I.e. the bytearray is compacted from time to time. -- ___ Python tracker

[issue19087] bytearray front-slicing not optimized

2013-09-29 Thread Antoine Pitrou
Antoine Pitrou added the comment: > You increase internal index in a bytearray. A bytearray with small > visible length can consume much hidden memory. No, because PyByteArray_Resize() is always called afterwards to ensure that the buffer is resized when it gets below 50% usage. -- ___

[issue19087] bytearray front-slicing not optimized

2013-09-29 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: You increase internal index in a bytearray. A bytearray with small visible length can consume much hidden memory. -- ___ Python tracker ___ _

[issue19087] bytearray front-slicing not optimized

2013-09-29 Thread Antoine Pitrou
Antoine Pitrou added the comment: > The same is true with your patch. I don't understand. What is "true with my patch"? -- ___ Python tracker ___ ___

[issue19087] bytearray front-slicing not optimized

2013-09-29 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: The same is true with your patch. -- ___ Python tracker ___ ___ Python-bugs-list mailing list Unsu

[issue19087] bytearray front-slicing not optimized

2013-09-29 Thread Antoine Pitrou
Antoine Pitrou added the comment: > Deleting a slice at the front of a bytearray have linear complexity > from the size of a bytearray (in any case del b[:1] is a little faster > than b[:1] = b''). I doubt than any performance critical code do it > instead of increasing an index in constant time.

[issue19087] bytearray front-slicing not optimized

2013-09-29 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Deleting a slice at the front of a bytearray have linear complexity from the size of a bytearray (in any case del b[:1] is a little faster than b[:1] = b''). I doubt than any performance critical code do it instead of increasing an index in constant time. -

[issue19087] bytearray front-slicing not optimized

2013-09-29 Thread Antoine Pitrou
Antoine Pitrou added the comment: > Could you please show concrete code in which you are going to use this > optimization? There's no need to "use" this optimization. Any networking code that has to find message boundaries and split on them will benefit. If you don't understand that, I'm not

[issue19087] bytearray front-slicing not optimized

2013-09-29 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: > A generic example is to parse messages out of a TCP stream. Basically any protocol transported on TCP needs such a facility, or has to find workarounds (which are either suboptimal or complicated). Could you please show concrete code in which you are going t

[issue19087] bytearray front-slicing not optimized

2013-09-29 Thread Antoine Pitrou
Antoine Pitrou added the comment: Results under Windows: - before: PCbuild\amd64\python.exe -m timeit "b=bytearray(10)" "while b: b[-1:] = b''" 10 loops, best of 3: 74.8 msec per loop PCbuild\amd64\python.exe -m timeit "b=bytearray(10)" "while b: b[:1] = b''" 10 loops, best of 3: 330 m

[issue19087] bytearray front-slicing not optimized

2013-09-26 Thread Antoine Pitrou
Antoine Pitrou added the comment: > Could you please add unit tests for check that ob_start is used > instead of memmove()? How would I do that? Most of the time we don't unit-test performance improvements (i.e. there are no tests that list.append() is O(1), for example). -- __

[issue19087] bytearray front-slicing not optimized

2013-09-26 Thread STINNER Victor
STINNER Victor added the comment: Could you please add unit tests for check that ob_start is used instead of memmove()? I didn't find a function for that in _testcapi. I tried to test it using sys.getsizeof(), but the size is not reliable (the bytearray buffer is not always shrinked, it depen

[issue19087] bytearray front-slicing not optimized

2013-09-26 Thread Antoine Pitrou
Antoine Pitrou added the comment: > @Antoine: Do you know if your patch may reduce the memory > fragmentation on "bytearray front-slicing"? It reduces the number of allocations so, yes, it can reduce memory fragmentation. We cannot really use a list of chunks for bytearray since it is supposed t

[issue19087] bytearray front-slicing not optimized

2013-09-26 Thread STINNER Victor
STINNER Victor added the comment: > Mercurial has another implementation strategy for a similar thing: > http://selenic.com/repo/hg/file/50d721553198/mercurial/util.py#l935 I found an interesting comment in the following issue: "I think the trouble we get into is chunkbuffer() creates new large

[issue19087] bytearray front-slicing not optimized

2013-09-25 Thread Antoine Pitrou
Antoine Pitrou added the comment: > Could you please provide an example which uses this feature? A generic example is to parse messages out of a TCP stream. Basically any protocol transported on TCP needs such a facility, or has to find workarounds (which are either suboptimal or complicated).

[issue19087] bytearray front-slicing not optimized

2013-09-25 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Could you please provide an example which uses this feature? -- ___ Python tracker ___ ___ Python-

[issue19087] bytearray front-slicing not optimized

2013-09-25 Thread Antoine Pitrou
Changes by Antoine Pitrou : -- stage: needs patch -> patch review ___ Python tracker ___ ___ Python-bugs-list mailing list Unsubscribe

[issue19087] bytearray front-slicing not optimized

2013-09-25 Thread Antoine Pitrou
Antoine Pitrou added the comment: Here is a patch. Benchmarks (under Linux where realloc is fast; the gap may be wider under Windows): $ ./python -m timeit "b=bytearray(10)" "while b: b[:1] = b''" -> before: 225 msec per loop -> after: 60.4 msec per loop $ ./python -m timeit "b=bytearray(1

[issue19087] bytearray front-slicing not optimized

2013-09-25 Thread Antoine Pitrou
Antoine Pitrou added the comment: > And the same is for a list. List and bytearray are wrong types for > front deleting. There is no bytedeque(). > I don't think we should increase the size of > bytearray, and complicate and slowdown it for such special purpose. I don't think it would really s

[issue19087] bytearray front-slicing not optimized

2013-09-25 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: And the same is for a list. List and bytearray are wrong types for front deleting. I don't think we should increase the size of bytearray, and complicate and slowdown it for such special purpose. If you want to implement a fifo using bytearray more optimal,

[issue19087] bytearray front-slicing not optimized

2013-09-25 Thread Antoine Pitrou
New submission from Antoine Pitrou: If you delete a slice at the end of a bytearray, it is naturally optimized (thanks to the resizing strategy). However, if you delete a slice at the front of a bytearray, it is not: a memmove() gets done every time. $ ./python -m timeit "b=bytearray(1)" "