[issue15814] memoryview: equality-hash invariant

2012-11-02 Thread Stefan Krah
Changes by Stefan Krah:
resolution: -> fixed
stage: needs patch -> committed/rejected
status: open -> closed

[issue15814] memoryview: equality-hash invariant

2012-11-02 Thread Roundup Robot
Roundup Robot added the comment: New changeset 969069d464bc by Stefan Krah in branch '3.3': Issue #15814: Use hash function that is compatible with the equality http://hg.python.org/cpython/rev/969069d464bc

[issue15814] memoryview: equality-hash invariant

2012-10-17 Thread Martin v. Löwis
Martin v. Löwis added the comment: 3.3.1 is correct; you apparently missed msg169425.

[issue15814] memoryview: equality-hash invariant

2012-10-17 Thread Mark Lawrence
Mark Lawrence added the comment: The 3.3.0 docs now state: "Note: Hashing of memoryviews with formats other than ‘B’, ‘b’ or ‘c’ as well as hashing of multi-dimensional memoryviews is possible in version 3.3.0, but will raise an error in 3.3.1 in order to be compatible with the new memoryview eq

[issue15814] memoryview: equality-hash invariant

2012-09-09 Thread Roundup Robot
Roundup Robot added the comment:
New changeset 7734eb2707a1 by Stefan Krah in branch 'default': Issue #15814: Document planned restrictions for memoryview hashes in 3.3.1. http://hg.python.org/cpython/rev/7734eb2707a1
New changeset 71f4d80400f2 by Stefan Krah in branch 'default': Issue #15814: D

[issue15814] memoryview: equality-hash invariant

2012-09-08 Thread Stefan Krah
Stefan Krah added the comment: Georg, thanks for including all changes that I've asked for! If you still have the patience for the constant stream of memoryview doc changes, there are three new ones that might be worth including: 3b2597c1fe35, c9c9d890400c, ca81b9a3a015

[issue15814] memoryview: equality-hash invariant

2012-09-08 Thread Roundup Robot
Roundup Robot added the comment: New changeset ca81b9a3a015 by Stefan Krah in branch 'default': Issue #15814: Update whatsnew to the current state of hashing memoryviews. http://hg.python.org/cpython/rev/ca81b9a3a015

[issue15814] memoryview: equality-hash invariant

2012-09-03 Thread Alexander Belopolsky
Alexander Belopolsky added the comment:
On Mon, Sep 3, 2012 at 3:59 PM, Martin v. Löwis wrote:
> if hashing was restricted to contiguous bytes, then the implementation would certainly be simplified quite a bit: currently, if it's not contiguous, it needs to make a separate copy and hash th

[issue15814] memoryview: equality-hash invariant

2012-09-03 Thread Martin v. Löwis
Martin v. Löwis added the comment:
On 02.09.2012 at 16:21, Alexander Belopolsky wrote:
> I have refrained from voting because in my line of work buffers or memoryviews deal with large objects that rarely serve as dictionary keys. As a result, I have zero experience with hashing of buffers. T

[issue15814] memoryview: equality-hash invariant

2012-09-03 Thread Stefan Behnel
Changes by Stefan Behnel: nosy: -scoder

[issue15814] memoryview: equality-hash invariant

2012-09-03 Thread Alexander Belopolsky
Alexander Belopolsky added the comment:
On Mon, Sep 3, 2012 at 8:38 AM, Stefan Krah wrote:
> I don't see what could possibly be ill-defined about using the tobytes() definition for ND-arrays. In all places memoryview now uses the logical array, which is displayed by tolist().
+1. The key r

[issue15814] memoryview: equality-hash invariant

2012-09-03 Thread Stefan Krah
Stefan Krah added the comment: Small nitpick: multi-dimensional hashing wasn't really accidental, it was perfectly aligned with the previous statically typed equality definition. When I suggested PyBuffer_Hash = hash(obj.tobytes()) on python-dev for non-contiguous and multi-dimensional arrays, I

[issue15814] memoryview: equality-hash invariant

2012-09-03 Thread Roundup Robot
Roundup Robot added the comment: New changeset c9c9d890400c by Nick Coghlan in branch 'default': Issue #15814: Add NEWS entry regarding intended memoryview hashing restrictions http://hg.python.org/cpython/rev/c9c9d890400c

[issue15814] memoryview: equality-hash invariant

2012-09-03 Thread Nick Coghlan
Nick Coghlan added the comment: The main issue is that it's not quite clear how to deal with problems like C-style vs FORTRAN-style memory layouts and strides vs suboffsets in defining multidimensional hash equality. Without a use case, it's easier to just punt on the question and declare it i

[issue15814] memoryview: equality-hash invariant

2012-09-02 Thread Alexander Belopolsky
Alexander Belopolsky added the comment:
On Sep 2, 2012, at 8:44 AM, Stefan Krah wrote:
> The totals are +11.5 :) for hashing, +1 for allowing non-contiguous and -2 for multi-dimensional
I have refrained from voting because in my line of work buffers or memoryviews deal with large objects th

[issue15814] memoryview: equality-hash invariant

2012-09-02 Thread Roundup Robot
Roundup Robot added the comment: New changeset 3b2597c1fe35 by Stefan Krah in branch 'default': Issue #15814: Documentation: disallow hashing of multi-dimensional memoryviews. http://hg.python.org/cpython/rev/3b2597c1fe35

[issue15814] memoryview: equality-hash invariant

2012-09-02 Thread Stefan Krah
Stefan Krah added the comment: The totals are +11.5 :) for hashing, +1 for allowing non-contiguous and -2 for multi-dimensional. I'll update the docs soon.

[issue15814] memoryview: equality-hash invariant

2012-09-01 Thread Nick Coghlan
Nick Coghlan added the comment: +1 for allowing bytes hashing. As Antoine noted, the 1D bytes variant of memoryview() fills the role previously handled by buffer(). +1 for allowing 1D non-contiguous hashing. This is from a simplicity perspective, as I don't want to have to explain to people wh

[issue15814] memoryview: equality-hash invariant

2012-09-01 Thread Martin v. Löwis
Martin v. Löwis added the comment:
On 01.09.12 at 20:06, Stefan Krah wrote:
> - b'abc'[::-1] hashes, but memoryview(b'abc')[::-1] does not
I find that memoryview(b'abc')[::-1] is a strange thing to have anyway, so I'm not bothered by it behaving differently. I can accept that it needs to be sup

[issue15814] memoryview: equality-hash invariant

2012-09-01 Thread Stefan Krah
Stefan Krah added the comment:
> py> x = memoryview(array.array('B',b'cba'))
I find the array example is different. The user has to remember one thing: memoryviews based on arrays don't hash. For memoryviews based on bytes one would have to remember:
- 'B', 'c' and 'b' hash
- only C-conti

[issue15814] memoryview: equality-hash invariant

2012-09-01 Thread Antoine Pitrou
Antoine Pitrou added the comment:
> Could we perhaps take a small poll? My own vote is:
>
> 1) Allow bytes hashing at all: +0.5
+10. The buffer() object in 2.x was hashable, and a very important use case of memoryview is replacing buffer().
> 2) If 1) is allowed, then also non-contiguous hashi

[issue15814] memoryview: equality-hash invariant

2012-09-01 Thread Martin v. Löwis
Martin v. Löwis added the comment:
On 01.09.12 at 19:20, Stefan Krah wrote:
> Disallowing non-contiguous arrays leads to very strange situations though.
I don't find that strange. That two objects compare equal doesn't imply that they both hash - only that *if* they hash, they should hash equal.

[issue15814] memoryview: equality-hash invariant

2012-09-01 Thread Stefan Krah
Stefan Krah added the comment: Disallowing non-contiguous arrays leads to very strange situations though. I'm positive that there will be a bug report about this:
>>> x = memoryview(b'abc')[::-1]
>>> b = b'cba'
>>> d = {b'cba': 101}
>>>
>>> b in d
True
>>> x == b
True
>>> x in d
Traceback (most
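Editorial note: the scenario above can be replayed as a short script. This is a minimal sketch that assumes an interpreter where the outcome recorded later in this thread applies (hashing allowed for read-only, byte-format views, including non-contiguous ones); it is not taken from any of the patches. On such an interpreter the lookup succeeds instead of raising.

```python
# Reversed (hence non-contiguous) read-only view over a bytes object.
x = memoryview(b"abc")[::-1]
b = b"cba"
d = {b"cba": 101}

print(x.c_contiguous)      # the reversed view is not C-contiguous
print(x == b)              # element-wise equality with the bytes object
print(hash(x) == hash(b))  # equal objects hash equal, so dict lookup works
print(x in d, d[x])
```

The point of the sketch is only that the view and the equal bytes object land in the same dictionary slot, which is what the equality-hash invariant requires.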

[issue15814] memoryview: equality-hash invariant

2012-09-01 Thread Martin v. Löwis
Martin v. Löwis added the comment:
On 01.09.12 at 16:24, Stefan Krah wrote:
> Does "byte arrays" include 'b' and 'c' or just 'B'? I don't see a reason to allow 'B' but not the others.
Either type is fine with me. It's the multi-dimensional aspect I'd like to ban.
> My reasoning was: If non-con

[issue15814] memoryview: equality-hash invariant

2012-09-01 Thread Stefan Krah
Stefan Krah added the comment:
> why is it desirable to have deliberate hash collisions between views with different shapes?
Since we're now restricting everything to bytes, the multi-dimensional case is probably not useful at all. As I said above, I would leave it in because it actually save

[issue15814] memoryview: equality-hash invariant

2012-09-01 Thread Nick Coghlan
Nick Coghlan added the comment: Keep in mind that it's OK if hash(m) == hash(m.tobytes()) in some cases where "m != m.tobytes()". The only cases we really need to kill are those that break the hash invariant. I don't like the idea of making the definition of hash more complicated just to rule
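Editorial note: one concrete case of "equal hash, unequal objects" is a signed-byte view. The sketch below assumes the element-wise equality from #15573 and the tobytes()-based hash discussed in this thread; the choice of the byte 0xff is only for illustration.

```python
# The same single byte, viewed as a signed char ('b') instead of unsigned ('B').
m = memoryview(b"\xff").cast("b")

print(m.tolist())                        # the element is read as a signed value
print(m == m.tobytes())                  # compared element-wise against an unsigned byte
print(hash(m) == hash(m.tobytes()))      # the hash only looks at the raw bytes
```

This is harmless for dictionaries and sets: the invariant only demands that equal objects hash equal, not that unequal objects hash differently.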

[issue15814] memoryview: equality-hash invariant

2012-09-01 Thread Alexander Belopolsky
Alexander Belopolsky added the comment:
On Sep 1, 2012, at 11:06 AM, Stefan Krah wrote:
> tobytes() is the same as the flattened multi-dimensional list representation with all elements converted to bytes.
This is correct, but why is it desirable to have deliberate hash collisions between vi

[issue15814] memoryview: equality-hash invariant

2012-09-01 Thread Stefan Krah
Stefan Krah added the comment: tobytes() is the same as the flattened multi-dimensional list representation with all elements converted to bytes. If I'm not mistaken, that's how NumPy's tostring() behaves.

[issue15814] memoryview: equality-hash invariant

2012-09-01 Thread Alexander Belopolsky
Alexander Belopolsky added the comment:
On Sep 1, 2012, at 10:24 AM, Stefan Krah wrote:
> The definition hash(m) == hash(m.tobytes()) is pretty straightforward.
I probably missed something from the early discussion, but doesn't this definition only work for 1d (or 0d) views? Shouldn't shap

[issue15814] memoryview: equality-hash invariant

2012-09-01 Thread Stefan Krah
Stefan Krah added the comment:
Martin v. Löwis wrote:
> Why be more permissive than necessary? -0 on the committed version; it should IMO further restrict it to 1D contiguous byte arrays.
Does "byte arrays" include 'b' and 'c' or just 'B'? I don't see a reason to allow 'B' but not the other

[issue15814] memoryview: equality-hash invariant

2012-09-01 Thread Martin v. Löwis
Martin v. Löwis added the comment:
On 01.09.12 at 13:22, Stefan Krah wrote:
> Does everyone agree on (or tolerate at least) allowing 'B', 'b' and 'c'?
Why be more permissive than necessary? -0 on the committed version; it should IMO further restrict it to 1D contiguous byte arrays.

[issue15814] memoryview: equality-hash invariant

2012-09-01 Thread Roundup Robot
Roundup Robot added the comment: New changeset 895e123d9476 by Stefan Krah in branch 'default': Issue #15814: Document planned restrictions for memoryview hashes in 3.3.1. http://hg.python.org/cpython/rev/895e123d9476
nosy: +python-dev

[issue15814] memoryview: equality-hash invariant

2012-09-01 Thread Nick Coghlan
Nick Coghlan added the comment: +1 for docs patch for 3.3.0 and then enforcing the format restriction for 3.3.1.

[issue15814] memoryview: equality-hash invariant

2012-09-01 Thread Stefan Krah
Stefan Krah added the comment: Here's a patch that enforces byte formats. Does everyone agree on (or tolerate at least) allowing 'B', 'b' and 'c'? I think we should at least commit the doc patch for 3.3.0. Otherwise implementors of exporting objects might waste time on a feature that's deprecat
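Editorial note: the format restriction can be observed from Python. The sketch below assumes an interpreter that enforces it (3.3.1 or later according to this thread); `try_hash` is a hypothetical helper introduced here for illustration and is not part of the patch.

```python
def try_hash(view):
    """Return hash(view), or the exception instance if hashing is refused."""
    try:
        return hash(view)
    except (TypeError, ValueError) as exc:
        return exc


raw = bytes(8)  # read-only, hashable exporter

# Byte formats 'B', 'b' and 'c': each view hashes like the underlying bytes.
byte_results = {fmt: try_hash(memoryview(raw).cast(fmt)) for fmt in ("B", "b", "c")}

# A non-byte format over the same memory: hashing is refused.
float_result = try_hash(memoryview(raw).cast("f"))

for fmt, result in byte_results.items():
    print(fmt, result == hash(raw))
print("f", type(float_result).__name__)
```

Returning the exception instead of letting it propagate keeps the accepted and refused formats comparable side by side.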

[issue15814] memoryview: equality-hash invariant

2012-08-31 Thread Martin v. Löwis
Martin v. Löwis added the comment:
On 31.08.12 at 06:14, Stefan Behnel wrote:
> To add on Dag's comments, this essentially means that any caching of the hash value is dangerous
Stefan Krah is right here: the proper test (which is already implemented) is whether the underlying object is hashabl

[issue15814] memoryview: equality-hash invariant

2012-08-31 Thread Stefan Krah
Stefan Krah added the comment:
Dag Sverre Seljebotn wrote:
> OK, I can understand the desire to make memoryviews be bytes-like objects (though to my mind, bytes is "frozen" in a very different way...)
We have two desires that sometimes conflict: People who want to use memoryview as a *buffer*

[issue15814] memoryview: equality-hash invariant

2012-08-31 Thread Stefan Krah
Stefan Krah added the comment:
Dag Sverre Seljebotn wrote:
> It is perfectly possible for an object to export memory in a read-only way that may still change. Another object may have a writeable view:
>
> x = obj.readonly_view
> y = obj.writable_view
> obj.move_to_next_image() # changes memor

[issue15814] memoryview: equality-hash invariant

2012-08-30 Thread Stefan Behnel
Stefan Behnel added the comment: To add on Dag's comments, this essentially means that any caching of the hash value is dangerous, unless it can be assured that the underlying buffer definitely has not changed in the meantime. There is no way for users to explicitly tell a memoryview to rehash

[issue15814] memoryview: equality-hash invariant

2012-08-30 Thread Dag Sverre Seljebotn
Dag Sverre Seljebotn added the comment: OK, I can understand the desire to make memoryviews be bytes-like objects (though to my mind, bytes is "frozen" in a very different way...) If so, and it is deemed worth it, my suggestion is to add a new PyBUF_CONST flag to the buffer acquisition in that

[issue15814] memoryview: equality-hash invariant

2012-08-30 Thread Dag Sverre Seljebotn
Dag Sverre Seljebotn added the comment: It is perfectly possible for an object to export memory in a read-only way that may still change. Another object may have a writeable view:
x = obj.readonly_view
y = obj.writable_view
obj.move_to_next_image() # changes memory under x, y
So, hashing using
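Editorial note: the simplest mutable case is covered by refusing to hash writable views at all. The sketch below uses a bytearray as a stand-in exporter (an assumption made here; the `obj.readonly_view` object above is hypothetical and not reproduced) and shows both the refusal and why a cached hash would be unsafe.

```python
buf = bytearray(b"abc")
view = memoryview(buf)   # writable view over mutable memory

try:
    hash(view)
    error = None
except ValueError as exc:
    error = exc          # hashing a writable view is refused

# The memory can change under the view, so any cached hash could go stale.
before = view.tobytes()
buf[0] = ord("x")
after = view.tobytes()

print(view.readonly, type(error).__name__)
print(before, after)
```

This does not address the harder case raised above, where an exporter hands out a view flagged read-only while still changing the memory itself.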

[issue15814] memoryview: equality-hash invariant

2012-08-30 Thread Martin v. Löwis
Martin v. Löwis added the comment:
On 30.08.12 at 11:24, Stefan Krah wrote:
>>> The new equality definition and any possible new hash definition should probably also be part of the buffer API documentation, since they aren't memoryview specific.
>>
>> That's not true: they *are* memoryview

[issue15814] memoryview: equality-hash invariant

2012-08-30 Thread Stefan Krah
Changes by Stefan Krah:
keywords: +patch
Added file: http://bugs.python.org/file27058/issue15814-doc.diff

[issue15814] memoryview: equality-hash invariant

2012-08-30 Thread Stefan Krah
Stefan Krah added the comment: Here's a doc patch restricting the hash to formats 'B', 'b' and 'c'. I think non-contiguous views are fine: both __eq__() and tobytes() handle these, so the equality-hash invariant is preserved.
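Editorial note: the definition hash(m) == hash(m.tobytes()) can be spot-checked over contiguous, sliced, strided and re-cast views. This is a sketch assuming an interpreter with the restriction described in this comment; `reference_hash` is a hypothetical helper that just spells out the definition.

```python
def reference_hash(view):
    """The definition under discussion: hash the logical bytes of the view."""
    return hash(view.tobytes())


data = b"abcefg"
v = memoryview(data)

views = {
    "whole": v,
    "slice": v[2:4],
    "strided": v[::-2],        # non-contiguous
    "signed": v.cast("b"),
    "char": v.cast("c"),
}

# For every view, the built-in hash should agree with the reference definition.
agrees = {name: hash(view) == reference_hash(view) for name, view in views.items()}
print(agrees)
```

Because tobytes() already linearises non-contiguous views, no separate rule is needed for them.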

[issue15814] memoryview: equality-hash invariant

2012-08-30 Thread Stefan Krah
Changes by Stefan Krah: nosy: +scoder

[issue15814] memoryview: equality-hash invariant

2012-08-30 Thread Stefan Krah
Stefan Krah added the comment:
Martin v. Löwis wrote:
> > hash(x) == hash(x.tobytes())
> In the light of this requirement, it's even more difficult to ask people that they change their hashing, since some exporters may already comply with that original request.
I don't think so. memory

[issue15814] memoryview: equality-hash invariant

2012-08-29 Thread Nick Coghlan
Nick Coghlan added the comment: My perspective is that hashing a memoryview only makes sense when the memoryview is read-only and "m == m.tobytes()" (i.e. it's a C contiguous 1D view of bytes, either because that's what the original object exported as a buffer or because the view has been cast

[issue15814] memoryview: equality-hash invariant

2012-08-29 Thread Alexander Belopolsky
Changes by Alexander Belopolsky: nosy: +belopolsky

[issue15814] memoryview: equality-hash invariant

2012-08-29 Thread Martin v. Löwis
Martin v. Löwis added the comment:
On 29.08.12 at 22:04, Stefan Krah wrote:
> In the memoryview-hash thread on python-dev [1] this objection was addressed by demanding from exporters that they all use:
>
> hash(x) == hash(x.tobytes())
>
> Since the previous equality concept was also based on

[issue15814] memoryview: equality-hash invariant

2012-08-29 Thread Stefan Krah
Stefan Krah added the comment: And since memory_richcompare() and any potentially compatible hash function are quite tricky to implement by now, perhaps the generally useful parts could be exposed as PyBuffer_RichCompareBool() and PyBuffer_Hash().

[issue15814] memoryview: equality-hash invariant

2012-08-29 Thread Stefan Krah
Stefan Krah added the comment:
Martin v. Löwis wrote:
> In general, since memoryview(obj)==obj, it would be necessary that hash(memoryview(obj))==hash(obj). However, since memoryview cannot know what hashing algorithm obj uses, it cannot compute the hash value with the same algorithm.
In
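Editorial note: the objection quoted above can be made concrete with an exporter that does not hash like bytes. `OddBytes` is a hypothetical class invented for this sketch; the behaviour shown assumes the tobytes()-based hash discussed in this thread.

```python
class OddBytes(bytes):
    """Hypothetical exporter: bytes-compatible buffer, but its own hash."""

    def __hash__(self):
        return 42


obj = OddBytes(b"abc")
view = memoryview(obj)

print(view == obj)                  # the view compares equal to its exporter
print(hash(obj))                    # the exporter's own hash
print(hash(view) == hash(b"abc"))   # the view hashes like the raw bytes
```

The view cannot know about the exporter's custom hash, which is why the python-dev discussion asked exporters to use hash(x) == hash(x.tobytes()).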

[issue15814] memoryview: equality-hash invariant

2012-08-29 Thread Martin v. Löwis
Martin v. Löwis added the comment:
On 29.08.12 at 21:06, Antoine Pitrou wrote:
>> So what specific hash algorithm do you propose?
>
> The current algorithm works well in conjunction with bytes objects.
That's about the only type it works for.
>> My claim is that any hash definition for memoryvie

[issue15814] memoryview: equality-hash invariant

2012-08-29 Thread Antoine Pitrou
Antoine Pitrou added the comment:
> On 29.08.12 at 20:01, Antoine Pitrou wrote:
> >> I think the proper solution is to make memoryview objects unhashable.
> >
> > Disagreed. If memoryviews are to be bytes-like objects they should be hashable (at least when readonly).
>
> So what specific hash

[issue15814] memoryview: equality-hash invariant

2012-08-29 Thread Martin v. Löwis
Martin v. Löwis added the comment:
On 29.08.12 at 20:01, Antoine Pitrou wrote:
>> I think the proper solution is to make memoryview objects unhashable.
>
> Disagreed. If memoryviews are to be bytes-like objects they should be hashable (at least when readonly).
So what specific hash algorithm do

[issue15814] memoryview: equality-hash invariant

2012-08-29 Thread Antoine Pitrou
Antoine Pitrou added the comment:
> I think the proper solution is to make memoryview objects unhashable.
Disagreed. If memoryviews are to be bytes-like objects they should be hashable (at least when readonly).
> Any other approach will have flaws of some kind.
Not more so than equality betwee

[issue15814] memoryview: equality-hash invariant

2012-08-29 Thread Martin v. Löwis
Martin v. Löwis added the comment: I think the proper solution is to make memoryview objects unhashable. Any other approach will have flaws of some kind.

[issue15814] memoryview: equality-hash invariant

2012-08-29 Thread Stefan Krah
New submission from Stefan Krah: The new PEP-3118 equality definition from #15573 that is based on element-wise comparisons breaks the equality-hash invariant:
>>> from _testbuffer import ndarray
>>> x = ndarray([1,2,3], shape=[3], format='f')
>>> y = ndarray([1,2,3], shape=[3], format='B')
>>>
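Editorial note: the report uses `_testbuffer.ndarray`, a CPython-internal test helper. The sketch below swaps in `array.array` from the standard library to show the same situation; this substitution is an assumption made here, not the reporter's code. Two views compare equal element-wise while their raw bytes differ, so any hash based on tobytes() could not satisfy the invariant for them.

```python
import array

# Same logical values, different element formats (float vs. unsigned byte).
x = memoryview(array.array("f", [1, 2, 3]))
y = memoryview(array.array("B", [1, 2, 3]))

print(x == y)                        # element-wise comparison across formats
print(x.tobytes() == y.tobytes())    # the underlying bytes are not the same
print(len(x.tobytes()), len(y.tobytes()))
```

This is the mismatch that the rest of the thread resolves by limiting hashing to byte formats, where equal views also have equal bytes.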